Perl 5 version 18.2 documentation
AnyDBM_File

NAME

AnyDBM_File - provide framework for multiple DBMs

NDBM_File, DB_File, GDBM_File, SDBM_File, ODBM_File - various DBM implementations

SYNOPSIS

    use AnyDBM_File;

DESCRIPTION

This module is a "pure virtual base class"--it has nothing of its own. It's just there to inherit from one of the various DBM packages. It prefers ndbm for compatibility reasons with Perl 4, then Berkeley DB (See DB_File), GDBM, SDBM (which is always there--it comes with Perl), and finally ODBM. This way old programs that used to use NDBM via dbmopen() can still do so, but new ones can reorder @ISA:

    BEGIN { @AnyDBM_File::ISA = qw(DB_File GDBM_File NDBM_File) }
    use AnyDBM_File;
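As a minimal sketch of the reordering idiom above, the following pins AnyDBM_File to SDBM_File (the one backend the text says always comes with Perl) and round-trips a value through a tied hash. The file name demo, the key colour, and the helper store_and_fetch are illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT, O_RDONLY
use File::Temp qw(tempdir);

# Choose the backend before AnyDBM_File is loaded.
BEGIN { @AnyDBM_File::ISA = qw(SDBM_File) }
use AnyDBM_File;

# Hypothetical helper: write one pair, reopen read-only, read it back.
sub store_and_fetch {
    my ($path, $key, $value) = @_;
    my %db;
    tie(%db, 'AnyDBM_File', $path, O_RDWR | O_CREAT, 0666)
        or die "Cannot open $path: $!";
    $db{$key} = $value;
    untie %db;

    tie(%db, 'AnyDBM_File', $path, O_RDONLY, 0666)
        or die "Cannot reopen $path: $!";
    my $got = $db{$key};
    untie %db;
    return $got;
}

my $dir = tempdir(CLEANUP => 1);
print "backend: $AnyDBM_File::ISA[0]\n";
print store_and_fetch("$dir/demo", 'colour', 'blue'), "\n";
```

After loading, @AnyDBM_File::ISA holds only the backend that was actually found, so printing its first element is a quick way to see which implementation is in use.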

Having multiple DBM implementations makes it trivial to copy database formats:

    use Fcntl; use NDBM_File; use DB_File;
    tie %newhash, 'DB_File', $new_filename, O_CREAT|O_RDWR;
    tie %oldhash, 'NDBM_File', $old_filename, 1, 0;
    %newhash = %oldhash;

DBM Comparisons

Here's a partial table of features the different packages offer:

                             odbm    ndbm    sdbm    gdbm    bsd-db
                             ----    ----    ----    ----    ------
    Linkage comes w/ perl    yes     yes     yes     yes     yes
    Src comes w/ perl        no      no      yes     no      no
    Comes w/ many unix os    yes     yes[0]  no      no      no
    Builds ok on !unix       ?       ?       yes     yes     ?
    Code Size                ?       ?       small   big     big
    Database Size            ?       ?       small   big?    ok[1]
    Speed                    ?       ?       slow    ok      fast
    FTPable                  no      no      yes     yes     yes
    Easy to build            N/A     N/A     yes     yes     ok[2]
    Size limits              1k      4k      1k[3]   none    none
    Byte-order independent   no      no      no      no      yes
    Licensing restrictions   ?       ?       no      yes     no

  • [0]

    On mixed-universe machines, may be in the BSD compat library, which is often shunned.

  • [1]

    Can be trimmed if you compile for one access method.

  • [2]

    See DB_File. Requires symbolic links.

  • [3]

    By default, but can be redefined.

SEE ALSO

dbm(3), ndbm(3), DB_File(3), perldbmfilter

 
AutoLoader

NAME

AutoLoader - load subroutines only on demand

SYNOPSIS

    package Foo;
    use AutoLoader 'AUTOLOAD';   # import the default AUTOLOAD subroutine

    package Bar;
    use AutoLoader;              # don't import AUTOLOAD, define our own
    sub AUTOLOAD {
        ...
        $AutoLoader::AUTOLOAD = "...";
        goto &AutoLoader::AUTOLOAD;
    }

DESCRIPTION

The AutoLoader module works with the AutoSplit module and the __END__ token to defer the loading of some subroutines until they are used rather than loading them all at once.

To use AutoLoader, the author of a module has to place the definitions of subroutines to be autoloaded after an __END__ token. (See perldata.) The AutoSplit module can then be run manually to extract the definitions into individual files auto/funcname.al.

AutoLoader implements an AUTOLOAD subroutine. When an undefined subroutine is called in a client module of AutoLoader, AutoLoader's AUTOLOAD subroutine attempts to locate the subroutine in a file with a name related to the location of the file from which the client module was read. As an example, if POSIX.pm is located in /usr/local/lib/perl5/POSIX.pm, AutoLoader will look for POSIX's Perl subroutines in /usr/local/lib/perl5/auto/POSIX/*.al, where each .al file has the same name as the subroutine, sans package. If such a file exists, AUTOLOAD will read and evaluate it, thus (presumably) defining the needed subroutine. AUTOLOAD will then goto the newly defined subroutine.

Once this process completes for a given function, it is defined, so future calls to the subroutine will bypass the AUTOLOAD mechanism.
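The define-then-goto cycle described above can be sketched without any .al files. This is a hand-rolled stand-in, not AutoLoader's own code: the package Demo, the %source table (standing in for the files under auto/), and the $autoload_calls counter are all assumptions made for illustration.

```perl
use strict;
use warnings;

package Demo;

our $AUTOLOAD;
our $autoload_calls = 0;

# Stand-in for auto/Demo/greet.al: source text keyed by subroutine name.
my %source = ( greet => 'sub { "hello, $_[0]" }' );

sub AUTOLOAD {
    my $name = $AUTOLOAD;                  # fully qualified, e.g. Demo::greet
    (my $short = $name) =~ s/.*:://;
    return if $short eq 'DESTROY';
    $autoload_calls++;

    my $text = $source{$short} or die "Undefined subroutine $name";
    my $code = eval $text or die $@;       # "read and evaluate" the definition
    no strict 'refs';
    *$name = $code;                        # install it under the real name
    goto &$name;                           # re-dispatch with the original @_
}

package main;

print Demo::greet('world'), "\n";
print Demo::greet('again'), "\n";
print "AUTOLOAD ran $Demo::autoload_calls time(s)\n";
```

The second call never reaches AUTOLOAD, because by then Demo::greet is an ordinary defined subroutine.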

Subroutine Stubs

In order for object method lookup and/or prototype checking to operate correctly even when methods have not yet been defined, it is necessary to "forward declare" each subroutine (as in sub NAME; ). See SYNOPSIS in perlsub. Such forward declarations create "subroutine stubs", which are placeholders with no code.

The AutoSplit and AutoLoader modules automate the creation of forward declarations. The AutoSplit module creates an 'index' file containing forward declarations of all the AutoSplit subroutines. When the AutoLoader module is 'use'd, it loads these declarations into its caller's package.

Because of this mechanism, it is important that AutoLoader is always loaded with use and never with require.
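A short sketch of what a stub buys you, under the assumption of a hypothetical package Stubbed with a hand-written AUTOLOAD (AutoLoader itself is not involved): the forward-declared name exists and is visible to can() before any body has been defined.

```perl
use strict;
use warnings;

package Stubbed;

sub later;                       # forward declaration: a stub with no code

our $AUTOLOAD;
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    return if $name =~ /::DESTROY\z/;
    no strict 'refs';
    *$name = sub { 'defined on first call' };   # give the stub a body
    goto &$name;
}

package main;

print 'exists:  ', (exists  &Stubbed::later ? 'yes' : 'no'), "\n";
print 'defined: ', (defined &Stubbed::later ? 'yes' : 'no'), "\n";
print 'can:     ', (Stubbed->can('later')   ? 'yes' : 'no'), "\n";
print 'no stub: ', (Stubbed->can('missing') ? 'yes' : 'no'), "\n";

print Stubbed->later, "\n";      # the stub sends the call through AUTOLOAD
print 'defined: ', (defined &Stubbed::later ? 'yes' : 'no'), "\n";
```

Without the sub later; line, can('later') would report false even though AUTOLOAD could have satisfied the call.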

Using AutoLoader's AUTOLOAD Subroutine

In order to use AutoLoader's AUTOLOAD subroutine you must explicitly import it:

    use AutoLoader 'AUTOLOAD';

Overriding AutoLoader's AUTOLOAD Subroutine

Some modules, mainly extensions, provide their own AUTOLOAD subroutines. They typically need to check for some special cases (such as constants) and then fall back to AutoLoader's AUTOLOAD for the rest.

Such modules should not import AutoLoader's AUTOLOAD subroutine. Instead, they should define their own AUTOLOAD subroutines along these lines:

    use AutoLoader;
    use Carp;

    sub AUTOLOAD {
        my $sub = $AUTOLOAD;
        (my $constname = $sub) =~ s/.*:://;
        my $val = constant($constname, @_ ? $_[0] : 0);
        if ($! != 0) {
            if ($! =~ /Invalid/ || $!{EINVAL}) {
                $AutoLoader::AUTOLOAD = $sub;
                goto &AutoLoader::AUTOLOAD;
            }
            else {
                croak "Your vendor has not defined constant $constname";
            }
        }
        *$sub = sub { $val }; # same as: eval "sub $sub { $val }";
        goto &$sub;
    }

If a module's own AUTOLOAD subroutine has no need to fall back to AutoLoader's AUTOLOAD subroutine (because it doesn't have any AutoSplit subroutines), then that module should not use AutoLoader at all.

Package Lexicals

Package lexicals declared with my in the main block of a package using AutoLoader will not be visible to auto-loaded subroutines, due to the fact that the given scope ends at the __END__ marker. A module using such variables as package globals will not work properly under the AutoLoader.

The vars pragma (see vars in perlmod) may be used in such situations as an alternative to explicitly qualifying all globals with the package namespace. Variables pre-declared with this pragma will be visible to any autoloaded routines (but, unfortunately, they will also be visible outside the package).

Not Using AutoLoader

You can stop using AutoLoader by simply writing:

    no AutoLoader;

AutoLoader vs. SelfLoader

The AutoLoader is similar in purpose to SelfLoader: both delay the loading of subroutines.

SelfLoader uses the __DATA__ marker rather than __END__ . While this avoids the use of a hierarchy of disk files and the associated open/close for each routine loaded, SelfLoader suffers a startup speed disadvantage in the one-time parsing of the lines after __DATA__ , after which routines are cached. SelfLoader can also handle multiple packages in a file.

AutoLoader only reads code as it is requested, and in many cases should be faster, but requires a mechanism like AutoSplit be used to create the individual files. ExtUtils::MakeMaker will invoke AutoSplit automatically if AutoLoader is used in a module source file.

Forcing AutoLoader to Load a Function

Sometimes, it can be necessary or useful to make sure that a certain function is fully loaded by AutoLoader. This is the case, for example, when you need to wrap a function to inject debugging code. It is also helpful to force early loading of code before forking to make use of copy-on-write as much as possible.

Starting with AutoLoader 5.73, you can call the AutoLoader::autoload_sub function with the fully-qualified name of the function to load from its .al file. The behaviour is exactly the same as if you called the function, triggering the regular AUTOLOAD mechanism, but it does not actually execute the autoloaded function.

CAVEATS

AutoLoaders prior to Perl 5.002 had a slightly different interface. Any old modules which use AutoLoader should be changed to the new calling style. Typically this just means changing a require to a use, adding the explicit 'AUTOLOAD' import if needed, and removing AutoLoader from @ISA .

On systems with restrictions on file name length, the file corresponding to a subroutine may have a shorter name than the routine itself. This can lead to conflicting file names. The AutoSplit package warns of these potential conflicts when used to split a module.

AutoLoader may fail to find the autosplit files (or even find the wrong ones) in cases where @INC contains relative paths, and the program does chdir.

SEE ALSO

SelfLoader - an autoloader that doesn't use external files.

AUTHOR

AutoLoader is maintained by the perl5-porters. Please direct any questions to the canonical mailing list. Anything that is applicable to the CPAN release can be sent to its maintainer, though.

Author and Maintainer: The Perl5-Porters <perl5-porters@perl.org>

Maintainer of the CPAN release: Steffen Mueller <smueller@cpan.org>

COPYRIGHT AND LICENSE

This package has been part of the perl core since the first release of perl5. It has been released separately to CPAN so older installations can benefit from bug fixes.

This package has the same copyright and license as the perl core:

    Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999,
    2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
    2011, 2012
    by Larry Wall and others

    All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of either:

    a) the GNU General Public License as published by the Free
       Software Foundation; either version 1, or (at your option) any
       later version, or

    b) the "Artistic License" which comes with this Kit.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See either
    the GNU General Public License or the Artistic License for more details.

    You should have received a copy of the Artistic License with this
    Kit, in the file named "Artistic". If not, I'll be glad to provide one.

    You should also have received a copy of the GNU General Public License
    along with this program in the file named "Copying". If not, write to the
    Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
    MA 02110-1301, USA or visit their web page on the internet at
    http://www.gnu.org/copyleft/gpl.html.

    For those of you that choose to use the GNU General Public License,
    my interpretation of the GNU General Public License is that no Perl
    script falls under the terms of the GPL unless you explicitly put
    said script under the terms of the GPL yourself. Furthermore, any
    object code linked with perl does not automatically fall under the
    terms of the GPL, provided such object code only adds definitions
    of subroutines and variables, and does not otherwise impair the
    resulting interpreter from executing any standard Perl script. I
    consider linking in C subroutines in this manner to be the moral
    equivalent of defining subroutines in the Perl language itself. You
    may sell such an object file as proprietary provided that you provide
    or offer to provide the Perl source, as specified by the GNU General
    Public License. (This is merely an alternate way of specifying input
    to the program.) You may also sell a binary produced by the dumping of
    a running Perl script that belongs to you, provided that you provide or
    offer to provide the Perl source as specified by the GPL. (The
    fact that a Perl interpreter and your code are in the same binary file
    is, in this case, a form of mere aggregation.) This is my interpretation
    of the GPL. If you still have concerns or difficulties understanding
    my intent, feel free to contact me. Of course, the Artistic License
    spells all this out for your protection, so you may prefer to use that.
 
AutoSplit

NAME

AutoSplit - split a package for autoloading

SYNOPSIS

    autosplit($file, $dir, $keep, $check, $modtime);
    autosplit_lib_modules(@modules);

DESCRIPTION

This function will split up your program into files that the AutoLoader module can handle. It is used by both the standard perl libraries and by the MakeMaker utility, to automatically configure libraries for autoloading.

The autosplit interface splits the specified file into a hierarchy rooted at the directory $dir . It creates directories as needed to reflect class hierarchy, and creates the file autosplit.ix. This file acts as both forward declaration of all package routines, and as timestamp for the last update of the hierarchy.

The remaining three arguments to autosplit govern other options to the autosplitter.

  • $keep

    If the third argument, $keep, is false, then any pre-existing *.al files in the autoload directory are removed if they are no longer part of the module (obsoleted functions). $keep defaults to 0.

  • $check

    The fourth argument, $check, instructs autosplit to check the module currently being split to ensure that it includes a use specification for the AutoLoader module, and skips the module if AutoLoader is not detected. $check defaults to 1.

  • $modtime

    Lastly, the $modtime argument specifies that autosplit is to check the modification time of the module against that of the autosplit.ix file, and only split the module if it is newer. $modtime defaults to 1.

Typical use of AutoSplit in the perl MakeMaker utility is via the command-line with:

    perl -e 'use AutoSplit; autosplit($ARGV[0], $ARGV[1], 0, 1, 1)'

Defined as a Make macro, it is invoked with file and directory arguments; autosplit will split the specified file into the specified directory and delete obsolete .al files, after first checking that the module does use the AutoLoader and that it has not already been split in its current form (the modtime test).

The autosplit_lib_modules form is used in the building of perl. It takes as input a list of files (modules) that are assumed to reside in a directory lib relative to the current directory. Each file is sent to the autosplitter one at a time, to be split into the directory lib/auto.

In both usages of the autosplitter, only subroutines defined following the perl __END__ token are split out into separate files. Some routines may be placed prior to this marker to force their immediate loading and parsing.
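The following is a rough end-to-end sketch of the workflow described above, run against a throwaway directory. The module Demo and its subroutines eager and lazy are hypothetical; the autosplit call uses the same argument values as the MakeMaker one-liner shown earlier.

```perl
use strict;
use warnings;
use AutoSplit;
use File::Spec;
use File::Temp qw(tempdir);

# Build a temporary library directory holding a hypothetical Demo.pm.
my $lib = tempdir(CLEANUP => 1);
my $pm  = File::Spec->catfile($lib, 'Demo.pm');
my $dir = File::Spec->catdir($lib, 'auto');
mkdir $dir or die "mkdir $dir: $!";

open my $out, '>', $pm or die "open $pm: $!";
print {$out} <<'END_MODULE';
package Demo;
use AutoLoader 'AUTOLOAD';
sub eager { 'compiled when Demo.pm is loaded' }
1;
__END__
sub lazy { 'compiled on first call' }
END_MODULE
close $out or die "close $pm: $!";

# $keep = 0, $check = 1, $modtime = 1
autosplit($pm, $dir, 0, 1, 1);

# Only the subroutine placed after __END__ should have been split out.
my $al = File::Spec->catfile($dir, 'Demo', 'lazy.al');
print "lazy.al was created\n" if -e $al;

unshift @INC, $lib;
require Demo;
print Demo::eager(), "\n";
print Demo::lazy(), "\n";        # resolved through AutoLoader's AUTOLOAD
```

Creating the auto directory beforehand avoids the diagnostic AutoSplit issues when it has to create the top-level directory itself.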

Multiple packages

As of version 1.01 of the AutoSplit module it is possible to have multiple packages within a single file. Both of the following cases are supported:

    package NAME;
    __END__
    sub AAA { ... }
    package NAME::option1;
    sub BBB { ... }
    package NAME::option2;
    sub BBB { ... }

    package NAME;
    __END__
    sub AAA { ... }
    sub NAME::option1::BBB { ... }
    sub NAME::option2::BBB { ... }

DIAGNOSTICS

AutoSplit will inform the user if it is necessary to create the top-level directory specified in the invocation. It is preferred that the script or installation process that invokes AutoSplit have created the full directory path ahead of time. This warning may indicate that the module is being split into an incorrect path.

AutoSplit will warn the user of all subroutines whose names cause potential file naming conflicts on machines with drastically limited (8 characters or fewer) file name length. Since the subroutine name is used as the file name, these warnings can aid in portability to such systems.

Warnings are issued and the file skipped if AutoSplit cannot locate either the __END__ marker or a "package Name;"-style specification.

AutoSplit will also emit general diagnostics for inability to create directories or files.

AUTHOR

AutoSplit is maintained by the perl5-porters. Please direct any questions to the canonical mailing list. Anything that is applicable to the CPAN release can be sent to its maintainer, though.

Author and Maintainer: The Perl5-Porters <perl5-porters@perl.org>

Maintainer of the CPAN release: Steffen Mueller <smueller@cpan.org>

COPYRIGHT AND LICENSE

This package has been part of the perl core since the first release of perl5. It has been released separately to CPAN so older installations can benefit from bug fixes.

This package has the same copyright and license as the perl core:

    Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999,
    2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008
    by Larry Wall and others

    All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of either:

    a) the GNU General Public License as published by the Free
       Software Foundation; either version 1, or (at your option) any
       later version, or

    b) the "Artistic License" which comes with this Kit.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See either
    the GNU General Public License or the Artistic License for more details.

    You should have received a copy of the Artistic License with this
    Kit, in the file named "Artistic". If not, I'll be glad to provide one.

    You should also have received a copy of the GNU General Public License
    along with this program in the file named "Copying". If not, write to the
    Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
    02111-1307, USA or visit their web page on the internet at
    http://www.gnu.org/copyleft/gpl.html.

    For those of you that choose to use the GNU General Public License,
    my interpretation of the GNU General Public License is that no Perl
    script falls under the terms of the GPL unless you explicitly put
    said script under the terms of the GPL yourself. Furthermore, any
    object code linked with perl does not automatically fall under the
    terms of the GPL, provided such object code only adds definitions
    of subroutines and variables, and does not otherwise impair the
    resulting interpreter from executing any standard Perl script. I
    consider linking in C subroutines in this manner to be the moral
    equivalent of defining subroutines in the Perl language itself. You
    may sell such an object file as proprietary provided that you provide
    or offer to provide the Perl source, as specified by the GNU General
    Public License. (This is merely an alternate way of specifying input
    to the program.) You may also sell a binary produced by the dumping of
    a running Perl script that belongs to you, provided that you provide or
    offer to provide the Perl source as specified by the GPL. (The
    fact that a Perl interpreter and your code are in the same binary file
    is, in this case, a form of mere aggregation.) This is my interpretation
    of the GPL. If you still have concerns or difficulties understanding
    my intent, feel free to contact me. Of course, the Artistic License
    spells all this out for your protection, so you may prefer to use that.
 
B

NAME

B - The Perl Compiler Backend

SYNOPSIS

    use B;

DESCRIPTION

The B module supplies classes which allow a Perl program to delve into its own innards. It is the module used to implement the "backends" of the Perl compiler. Usage of the compiler does not require knowledge of this module: see the O module for the user-visible part. The B module is of use to those who want to write new compiler backends. This documentation assumes that the reader knows a fair amount about perl's internals including such things as SVs, OPs and the internal symbol table and syntax tree of a program.

OVERVIEW

The B module contains a set of utility functions for querying the current state of the Perl interpreter; typically these functions return objects from the B::SV and B::OP classes, or their derived classes. These classes in turn define methods for querying the resulting objects about their own internal state.

Utility Functions

The B module exports a variety of functions: some are simple utility functions, others provide a Perl program with a way to get an initial "handle" on an internal object.

Functions Returning B::SV , B::AV , B::HV , and B::CV objects

For descriptions of the class hierarchy of these objects and the methods that can be called on them, see below, OVERVIEW OF CLASSES and SV-RELATED CLASSES.

  • sv_undef

    Returns the SV object corresponding to the C variable sv_undef .

  • sv_yes

    Returns the SV object corresponding to the C variable sv_yes .

  • sv_no

    Returns the SV object corresponding to the C variable sv_no .

  • svref_2object(SVREF)

    Takes a reference to any Perl value, and turns the referred-to value into an object in the appropriate B::OP-derived or B::SV-derived class. Apart from functions such as main_root , this is the primary way to get an initial "handle" on an internal perl data structure which can then be followed with the other access methods.

    The returned object will only be valid as long as the underlying OPs and SVs continue to exist. Do not attempt to use the object after the underlying structures are freed.

  • amagic_generation

    Returns the SV object corresponding to the C variable amagic_generation . As of Perl 5.18, this is just an alias to PL_na , so its value is meaningless.

  • init_av

    Returns the AV object (i.e. in class B::AV) representing INIT blocks.

  • check_av

    Returns the AV object (i.e. in class B::AV) representing CHECK blocks.

  • unitcheck_av

    Returns the AV object (i.e. in class B::AV) representing UNITCHECK blocks.

  • begin_av

    Returns the AV object (i.e. in class B::AV) representing BEGIN blocks.

  • end_av

    Returns the AV object (i.e. in class B::AV) representing END blocks.

  • comppadlist

    Returns the AV object (i.e. in class B::AV) of the global comppadlist.

  • regex_padav

    Only when perl was compiled with ithreads.

  • main_cv

    Return the (faked) CV corresponding to the main part of the Perl program.
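As a small sketch of svref_2object from the list above, the loop below turns references to a few ordinary Perl values into B objects and prints each object's class and reference count. The variable and subroutine names are illustrative only, and the reference counts printed will depend on the surrounding program.

```perl
use strict;
use warnings;
use B qw(svref_2object class);

my $count = 42;
my $name  = 'camel';
my @list  = (1, 2, 3);
sub noop { }

# Each reference is mapped to an object in a B::SV-derived class.
for my $ref (\$count, \$name, \@list, \&noop) {
    my $obj = svref_2object($ref);
    printf "%-4s REFCNT=%d\n", class($obj), $obj->REFCNT;
}

# Methods on the returned object read the underlying structure.
my $iv = svref_2object(\$count);
print 'integer value: ', $iv->int_value, "\n";
```

The objects stay usable here because $count, $name, @list and noop all outlive the loop; holding a B object past the lifetime of the value it describes is what the warning above is about.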

Functions for Examining the Symbol Table

  • walksymtable(SYMREF, METHOD, RECURSE, PREFIX)

    Walk the symbol table starting at SYMREF and call METHOD on each symbol (a B::GV object) visited. When the walk reaches package symbols (such as "Foo::") it invokes RECURSE, passing in the symbol name, and only recurses into the package if that sub returns true.

    PREFIX is the name of the SYMREF you're walking.

    For example:

        # Walk CGI's symbol table calling print_subs on each symbol.
        # Recurse only into CGI::Util::
        walksymtable(\%CGI::, 'print_subs',
                     sub { $_[0] eq 'CGI::Util::' }, 'CGI::');

    print_subs() is a B::GV method you have declared. Also see B::GV Methods, below.
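A self-contained variant of the example above, assuming a hypothetical package Demo::Pkg and a hypothetical B::GV method named collect_subs, might look like this. It records the name of every glob that has a subroutine in its code slot.

```perl
use strict;
use warnings;
use B qw(walksymtable);

package Demo::Pkg;
sub alpha { 1 }
sub beta  { 2 }

package main;

my @found;

# walksymtable calls this method on the B::GV object for each symbol.
sub B::GV::collect_subs {
    my $gv = shift;
    my $cv = $gv->CV;
    push @found, $gv->NAME if $$cv;    # $$cv is 0 when the code slot is empty
}

# The RECURSE callback always returns false, so nested packages are skipped.
walksymtable(\%Demo::Pkg::, 'collect_subs', sub { 0 }, 'Demo::Pkg::');

print "subs in Demo::Pkg: @found\n";
```

Because the method is installed in B::GV itself, a distinctive name such as collect_subs helps avoid colliding with the accessor methods B already provides.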

Functions Returning B::OP objects or for walking op trees

For descriptions of the class hierarchy of these objects and the methods that can be called on them, see below, OVERVIEW OF CLASSES and OP-RELATED CLASSES.

  • main_root

    Returns the root op (i.e. an object in the appropriate B::OP-derived class) of the main part of the Perl program.

  • main_start

    Returns the starting op of the main part of the Perl program.

  • walkoptree(OP, METHOD)

    Does a tree-walk of the syntax tree based at OP and calls METHOD on each op it visits. Each node is visited before its children. If walkoptree_debug (see below) has been called to turn debugging on then the method walkoptree_debug is called on each op before METHOD is called.

  • walkoptree_debug(DEBUG)

    Returns the current debugging flag for walkoptree . If the optional DEBUG argument is non-zero, it sets the debugging flag to that. See the description of walkoptree above for what the debugging flag does.
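To illustrate walkoptree, this sketch tallies op names in the syntax tree of one small subroutine. The method name tally and the subroutine sample are assumptions for the example; the exact set of ops reported varies between Perl versions.

```perl
use strict;
use warnings;
use B qw(svref_2object walkoptree);

my %seen;

# Installed in B::OP so that every op class inherits it.
sub B::OP::tally {
    my $op = shift;
    $seen{ $op->name }++;
}

sub sample {
    my $x = shift;
    return $x + 1;
}

# ROOT is the top of the subroutine's op tree.
walkoptree(svref_2object(\&sample)->ROOT, 'tally');

printf "%-10s %d\n", $_, $seen{$_} for sort keys %seen;
```

Passing main_root instead of a subroutine's ROOT would walk the main program's tree in the same way.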

Miscellaneous Utility Functions

  • ppname(OPNUM)

    Return the PP function name (e.g. "pp_add") of op number OPNUM.

  • hash(STR)

    Returns a string in the form "0x..." representing the value of the internal hash function used by perl on string STR.

  • cast_I32(I)

    Casts I to the internal I32 type used by that perl.

  • minus_c

    Does the equivalent of the -c command-line option. Obviously, this is only useful in a BEGIN block or else the flag is set too late.

  • cstring(STR)

    Returns a double-quote-surrounded escaped version of STR which can be used as a string in C source code.

  • perlstring(STR)

    Returns a double-quote-surrounded escaped version of STR which can be used as a string in Perl source code.

  • class(OBJ)

    Returns the class of an object without the part of the classname preceding the first "::" . This is used to turn "B::UNOP" into "UNOP" for example.

  • threadsv_names

    In a perl compiled for threads, this returns a list of the special per-thread threadsv variables.
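A brief tour of a few of the utility functions listed above, as a sketch; the sample strings are arbitrary, and the hash value printed will differ between perl builds and runs.

```perl
use strict;
use warnings;
use B qw(ppname cstring perlstring class hash svref_2object);

# Op number 0 is the null op.
print ppname(0), "\n";

# Quote the same text for C source and for Perl source.
my $text = qq{say "hi" for \$5\n};
print cstring($text),    "\n";
print perlstring($text), "\n";

# class() strips the leading package part from the object's class name.
my @items;
my $obj = svref_2object(\@items);
print ref($obj), ' -> ', class($obj), "\n";

# The internal hash of a string, rendered as "0x...".
print hash('perl'), "\n";
```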

Exported utility variables

  • @optype
        my $op_type = $optype[$op_type_num];

    A simple mapping of the op type number to its type (like 'COP' or 'BINOP').

  • @specialsv_name
        my $sv_name = $specialsv_name[$sv_index];

    Certain SV types are considered 'special'. They're represented by B::SPECIAL and are referred to by a number from the specialsv_list. This array maps that number back to the name of the SV (like 'Nullsv' or '&PL_sv_undef').

OVERVIEW OF CLASSES

The C structures used by Perl's internals to hold SV and OP information (PVIV, AV, HV, ..., OP, SVOP, UNOP, ...) are modelled on a class hierarchy and the B module gives access to them via a true object hierarchy. Structure fields which point to other objects (whether types of SV or types of OP) are represented by the B module as Perl objects of the appropriate class.

The bulk of the B module is the methods for accessing fields of these structures.

Note that all access is read-only. You cannot modify the internals by using this module. Also, note that the B::OP and B::SV objects created by this module are only valid for as long as the underlying objects exist; their creation doesn't increase the reference counts of the underlying objects. Trying to access the fields of a freed object will give incomprehensible results, or worse.

SV-RELATED CLASSES

B::IV, B::NV, B::RV, B::PV, B::PVIV, B::PVNV, B::PVMG, B::BM (5.9.5 and earlier), B::PVLV, B::AV, B::HV, B::CV, B::GV, B::FM, B::IO. These classes correspond in the obvious way to the underlying C structures of similar names. The inheritance hierarchy mimics the underlying C "inheritance". For the 5.10.x branch (i.e. 5.10.0, 5.10.1, etc.), this is:

                       B::SV
                         |
            +------------+------------+------------+
            |            |            |            |
          B::PV        B::IV        B::NV        B::RV
              \         /           /
               \       /           /
                B::PVIV           /
                     \           /
                      \         /
                       \       /
                        B::PVNV
                           |
                           |
                        B::PVMG
                           |
               +-----+-----+-----+-----+
               |     |     |     |     |
             B::AV B::GV B::HV B::CV B::IO
                     |           |
                     |           |
                  B::PVLV      B::FM

For 5.9.0 and earlier, PVLV is a direct subclass of PVMG, and BM is still present as a distinct type, so the base of this diagram is

                           |
                           |
                        B::PVMG
                           |
        +------+-----+-----+-----+-----+-----+
        |      |     |     |     |     |     |
     B::PVLV B::BM B::AV B::GV B::HV B::CV B::IO
                                       |
                                       |
                                     B::FM

For 5.11.0 and later, B::RV is abolished, and IVs can be used to store references, and a new type B::REGEXP is introduced, giving this structure:

                       B::SV
                         |
            +------------+------------+
            |            |            |
          B::PV        B::IV        B::NV
              \         /           /
               \       /           /
                B::PVIV           /
                     \           /
                      \         /
                       \       /
                        B::PVNV
                           |
                           |
                        B::PVMG
                           |
       +-------+-------+---+---+-------+-------+
       |       |       |       |       |       |
     B::AV   B::GV   B::HV   B::CV   B::IO B::REGEXP
               |               |
               |               |
            B::PVLV          B::FM

Access methods correspond to the underlying C macros for field access, usually with the leading "class indication" prefix removed (Sv, Av, Hv, ...). The leading prefix is only left in cases where its removal would cause a clash in method name. For example, GvREFCNT stays as-is since its abbreviation would clash with the "superclass" method REFCNT (corresponding to the C function SvREFCNT ).

B::SV Methods

  • REFCNT
  • FLAGS
  • object_2svref

    Returns a reference to the regular scalar corresponding to this B::SV object. In other words, this method is the inverse operation to the svref_2object() subroutine. This scalar and other data it points at should be considered read-only: modifying them is neither safe nor guaranteed to have a sensible effect.
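A minimal sketch of the round trip between svref_2object and object_2svref, using an arbitrary scalar $text:

```perl
use strict;
use warnings;
use B qw(svref_2object);

my $text = 'abc';

my $obj  = svref_2object(\$text);   # Perl reference -> B object
my $back = $obj->object_2svref;     # B object -> Perl reference

# Both references point at the very same scalar.
print "same scalar\n" if $back == \$text;
print "value: $$back\n";
```

In keeping with the caution above, the example only reads through the returned reference.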

B::IV Methods

  • IV

    Returns the value of the IV, interpreted as a signed integer. This will be misleading if FLAGS & SVf_IVisUV . Perhaps you want the int_value method instead?

  • IVX
  • UVX
  • int_value

    This method returns the value of the IV as an integer. It differs from IV in that it returns the correct value regardless of whether it's stored signed or unsigned.

  • needs64bits
  • packiv

B::NV Methods

  • NV
  • NVX

B::RV Methods

  • RV

B::PV Methods

  • PV

    This method is the one you usually want. It constructs a string using the length and offset information in the struct: for ordinary scalars it will return the string that you'd see from Perl, even if it contains null characters.

  • RV

    Same as B::RV::RV, except that it will die() if the PV isn't a reference.

  • PVX

    This method is less often useful. It assumes that the string stored in the struct is null-terminated, and disregards the length information.

    It is the appropriate method to use if you need to get the name of a lexical variable from a padname array. Lexical variable names are always stored with a null terminator, and the length field (CUR) is overloaded for other purposes and can't be relied on here.

  • CUR

    This method returns the internal length field, which consists of the number of internal bytes, not necessarily the number of logical characters.

  • LEN

    This method returns the number of bytes allocated (via malloc) for storing the string. This is 0 if the scalar does not "own" the string.
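The difference between logical characters and the internal byte count reported by CUR can be seen with a single non-ASCII character. This sketch assumes nothing beyond the methods listed above; the LEN value is printed but will vary with the perl build.

```perl
use strict;
use warnings;
use B qw(svref_2object);

# One character (U+263A) that perl stores internally as three UTF-8 bytes.
my $smiley = "\x{263A}";
my $pv     = svref_2object(\$smiley);

printf "length=%d CUR=%d LEN=%d\n",
    length($smiley), $pv->CUR, $pv->LEN;

# PV returns the string as Perl code would see it.
print "round trip ok\n" if $pv->PV eq $smiley;
```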

B::PVMG Methods

  • MAGIC
  • SvSTASH

B::MAGIC Methods

  • MOREMAGIC
  • precomp

    Only valid on r-magic, returns the string that generated the regexp.

  • PRIVATE
  • TYPE
  • FLAGS
  • OBJ

    Will die() if called on r-magic.

  • PTR
  • REGEX

    Only valid on r-magic, returns the integer value of the REGEX stored in the MAGIC.

B::PVLV Methods

  • TARGOFF
  • TARGLEN
  • TYPE
  • TARG

B::BM Methods

  • USEFUL
  • PREVIOUS
  • RARE
  • TABLE

B::GV Methods

  • is_empty

    This method returns TRUE if the GP field of the GV is NULL.

  • NAME
  • SAFENAME

    This method returns the name of the glob, but if the first character of the name is a control character, then it converts it to ^X first, so that *^G would return "^G" rather than "\cG".

    It's useful if you want to print out the name of a variable. If you restrict yourself to globs which exist at compile-time then the result ought to be unambiguous, because code like ${"^G"} = 1 is compiled as two ops - a constant string and a dereference (rv2gv) - so that the glob is created at runtime.

    If you're working with globs at runtime, and need to disambiguate *^G from *{"^G"}, then you should use the raw NAME method.

  • STASH
  • SV
  • IO
  • FORM
  • AV
  • HV
  • EGV
  • CV
  • CVGEN
  • LINE
  • FILE
  • FILEGV
  • GvREFCNT
  • FLAGS
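
As a rough illustration of the B::GV accessors listed above, the sketch below looks up a glob and reads its name, its stash and its scalar slot. The package variable example_counter is a hypothetical name chosen for this example.

```perl
use strict;
use warnings;
use B ();

# A package variable, so that a glob named "example_counter" exists in main.
$main::example_counter = 1;

my $gv = B::svref_2object(\*main::example_counter);

my $name     = $gv->NAME;            # glob name without the package
my $safename = $gv->SAFENAME;        # same here: no leading control character
my $package  = $gv->STASH->NAME;     # name of the stash the glob lives in
my $value    = $gv->SV->int_value;   # scalar slot, read through B::IV

print "*${package}::$name holds $value\n";
```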

B::IO Methods

B::IO objects derive from IO objects and you will get more information from the IO object itself.

For example:

  1. $gvio = B::svref_2object(\*main::stdin)->IO;
  2. $IO = $gvio->object_2svref();
  3. $fd = $IO->fileno();
  • LINES
  • PAGE
  • PAGE_LEN
  • LINES_LEFT
  • TOP_NAME
  • TOP_GV
  • FMT_NAME
  • FMT_GV
  • BOTTOM_NAME
  • BOTTOM_GV
  • SUBPROCESS
  • IoTYPE

    A character symbolizing the type of IO Handle.

      -     STDIN/OUT
      I     STDIN/OUT/ERR
      <     read-only
      >     write-only
      a     append
      +     read and write
      s     socket
      |     pipe
      I     IMPLICIT
      #     NUMERIC
      space closed handle
      \0    closed internal handle
  • IoFLAGS
  • IsSTD

    Takes one argument ( 'stdin' | 'stdout' | 'stderr' ) and returns true if the IoIFP of the object is equal to the handle whose name was passed as argument; i.e., $io->IsSTD('stderr') is true if IoIFP($io) == PerlIO_stderr().

B::AV Methods

  • FILL
  • MAX
  • ARRAY
  • ARRAYelt

    Like ARRAY, but takes an index as an argument to get only one element, rather than a list of all of them.

  • OFF

    This method is deprecated if running under Perl 5.8, and is no longer present if running under Perl 5.9.

  • AvFLAGS

    This method returns the AV specific flags. In Perl 5.9 these are now stored in with the main SV flags, so this method is no longer present.
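
A short sketch of FILL, ARRAY and ARRAYelt on an ordinary array follows. It assumes the elements are plain integers, so each element comes back as a B::IV object whose int_value can be read; the variable names are illustrative only.

```perl
use strict;
use warnings;
use B ();

my @list = (10, 20, 30);
my $av   = B::svref_2object(\@list);

my $fill   = $av->FILL;                             # highest index in use
my @values = map { $_->int_value } $av->ARRAY;      # one B object per element
my $second = $av->ARRAYelt(1)->int_value;           # a single element by index

print "last index $fill, values @values, second element $second\n";
```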

B::CV Methods

  • STASH
  • START
  • ROOT
  • GV
  • FILE
  • DEPTH
  • PADLIST
  • OUTSIDE
  • OUTSIDE_SEQ
  • XSUB
  • XSUBANY

    For constant subroutines, returns the constant SV returned by the subroutine.

  • CvFLAGS
  • const_sv
  • NAME_HEK

    Returns the name of a lexical sub, otherwise undef.

B::HV Methods

  • FILL
  • MAX
  • KEYS
  • RITER
  • NAME
  • ARRAY
  • PMROOT

    This method is not present if running under Perl 5.9, as the PMROOT information is no longer stored directly in the hash.

OP-RELATED CLASSES

B::OP, B::UNOP, B::BINOP, B::LOGOP, B::LISTOP, B::PMOP, B::SVOP, B::PADOP, B::PVOP, B::LOOP, B::COP.

These classes correspond in the obvious way to the underlying C structures of similar names. The inheritance hierarchy mimics the underlying C "inheritance":

                            B::OP
                              |
              +---------------+--------+--------+-------+
              |               |        |        |       |
           B::UNOP         B::SVOP B::PADOP  B::COP  B::PVOP
            ,'  `-.
           /       `--.
       B::BINOP     B::LOGOP
           |
           |
       B::LISTOP
         ,' `.
        /     \
    B::LOOP B::PMOP

Access methods correspond to the underlying C structure field names, with the leading "class indication" prefix ("op_") removed.

B::OP Methods

These methods get the values of similarly named fields within the OP data structure. See top of op.h for more info.

  • next
  • sibling
  • name

    This returns the op name as a string (e.g. "add", "rv2av").

  • ppaddr

    This returns the function name as a string (e.g. "PL_ppaddr[OP_ADD]", "PL_ppaddr[OP_RV2AV]").

  • desc

    This returns the op description from the global C PL_op_desc array (e.g. "addition", "array deref").

  • targ
  • type
  • opt
  • flags
  • private
  • spare
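
To show how the next and name methods fit together, here is a minimal sketch that follows the execution-order chain of a small subroutine and collects the op names. The subroutine add_one is a made-up example, and the exact list of ops varies between Perl versions.

```perl
use strict;
use warnings;
use B ();

sub add_one { my $n = shift; return $n + 1 }

my $cv = B::svref_2object(\&add_one);

# Follow op_next from the first op; the chain ends at a B::NULL object,
# whose underlying address ($$op) is 0.
my @op_names;
for (my $op = $cv->START; $$op; $op = $op->next) {
    push @op_names, $op->name;
}

print join(' -> ', @op_names), "\n";
```

Walking with next visits ops in execution order; the tree structure is reached through first, last, sibling and related methods instead.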

B::UNOP METHOD

  • first

B::BINOP METHOD

  • last

B::LOGOP METHOD

  • other

B::LISTOP METHOD

  • children

B::PMOP Methods

  • pmreplroot
  • pmreplstart
  • pmnext

    Only up to Perl 5.9.4

  • pmflags
  • extflags

    Since Perl 5.9.5

  • precomp
  • pmoffset

    Only when perl was compiled with ithreads.

  • code_list

    Since perl 5.17.1

B::SVOP METHOD

  • sv
  • gv

B::PADOP METHOD

  • padix

B::PVOP METHOD

  • pv

B::LOOP Methods

  • redoop
  • nextop
  • lastop

B::COP Methods

  • label
  • stash
  • stashpv
  • stashoff (threaded only)
  • file
  • cop_seq
  • arybase
  • line
  • warnings
  • io
  • hints
  • hints_hash

$B::overlay

Although the optree is read-only, there is an overlay facility that allows you to override what values the various B::*OP methods return for a particular op. $B::overlay should be set to reference a two-deep hash: indexed by OP address, then method name. Whenever an op method is called, the value in the hash is returned if it exists. This facility is used by B::Deparse to "undo" some optimisations. For example:

    local $B::overlay = {};
    ...
    if ($op->name eq "foo") {
        $B::overlay->{$$op} = {
            name => 'bar',
            next => $op->next->next,
        };
    }
    ...
    $op->name # returns "bar"
    $op->next # returns the next op but one

AUTHOR

Malcolm Beattie, mbeattie@sable.ox.ac.uk

 
Benchmark

NAME

Benchmark - benchmark running times of Perl code

SYNOPSIS

    use Benchmark qw(:all);

    timethis($count, "code");

    # Use Perl code in strings...
    timethese($count, {
        'Name1' => '...code1...',
        'Name2' => '...code2...',
    });

    # ... or use subroutine references.
    timethese($count, {
        'Name1' => sub { ...code1... },
        'Name2' => sub { ...code2... },
    });

    # cmpthese can be used both ways as well
    cmpthese($count, {
        'Name1' => '...code1...',
        'Name2' => '...code2...',
    });
    cmpthese($count, {
        'Name1' => sub { ...code1... },
        'Name2' => sub { ...code2... },
    });

    # ...or in two stages
    $results = timethese($count,
        {
            'Name1' => sub { ...code1... },
            'Name2' => sub { ...code2... },
        },
        'none'
    );
    cmpthese( $results );

    $t = timeit($count, '...other code...');
    print "$count loops of other code took:",timestr($t),"\n";

    $t = countit($time, '...other code...');
    $count = $t->iters;
    print "$count loops of other code took:",timestr($t),"\n";

    # enable hires wallclock timing if possible
    use Benchmark ':hireswallclock';

DESCRIPTION

The Benchmark module encapsulates a number of routines to help you figure out how long it takes to execute some code.

timethis - run a chunk of code several times

timethese - run several chunks of code several times

cmpthese - print results of timethese as a comparison chart

timeit - run a chunk of code and see how long it takes

countit - see how many times a chunk of code runs in a given time

Methods

  • new

    Returns the current time. Example:

    1. use Benchmark;
    2. $t0 = Benchmark->new;
    3. # ... your code here ...
    4. $t1 = Benchmark->new;
    5. $td = timediff($t1, $t0);
    6. print "the code took:",timestr($td),"\n";
  • debug

    Enables or disables debugging by setting the $Benchmark::Debug flag:

    1. Benchmark->debug(1);
    2. $t = timeit(10, ' 5 ** $Global ');
    3. Benchmark->debug(0);
  • iters

    Returns the number of iterations.

Standard Exports

The following routines will be exported into your namespace if you use the Benchmark module:

  • timeit(COUNT, CODE)

    Arguments: COUNT is the number of times to run the loop, and CODE is the code to run. CODE may be either a code reference or a string to be eval'd; either way it will be run in the caller's package.

    Returns: a Benchmark object.

  • timethis ( COUNT, CODE, [ TITLE, [ STYLE ]] )

    Time COUNT iterations of CODE. CODE may be a string to eval or a code reference; either way the CODE will run in the caller's package. Results will be printed to STDOUT as TITLE followed by the times. TITLE defaults to "timethis COUNT" if none is provided. STYLE determines the format of the output, as described for timestr() below.

    The COUNT can be zero or negative: this means the minimum number of CPU seconds to run. A zero signifies the default of 3 seconds. For example to run at least for 10 seconds:

    1. timethis(-10, $code)

    or to run two pieces of code for at least 3 seconds each:

    1. timethese(0, { test1 => '...', test2 => '...'})

    CPU seconds is, in UNIX terms, the user time plus the system time of the process itself, as opposed to the real (wallclock) time and the time spent by the child processes. Less than 0.1 seconds is not accepted (-0.01 as the count, for example, will cause a fatal runtime exception).

    Note that the CPU seconds value is a minimum: CPU scheduling and other operating system factors may complicate the attempt so that a little more time is spent. The benchmark output will, however, also report the number of $code runs per second, which should be a more interesting number than the seconds actually spent.

    Returns a Benchmark object.

  • timethese ( COUNT, CODEHASHREF, [ STYLE ] )

    The CODEHASHREF is a reference to a hash containing names as keys and either a string to eval or a code reference for each value. For each (KEY, VALUE) pair in the CODEHASHREF, this routine will call

    1. timethis(COUNT, VALUE, KEY, STYLE)

    The routines are called in string comparison order of KEY.

    The COUNT can be zero or negative, see timethis().

    Returns a hash reference of Benchmark objects, keyed by name.

  • timediff ( T1, T2 )

    Returns the difference between two Benchmark times as a Benchmark object suitable for passing to timestr().

  • timestr ( TIMEDIFF, [ STYLE, [ FORMAT ] ] )

    Returns a string that formats the times in the TIMEDIFF object in the requested STYLE. TIMEDIFF is expected to be a Benchmark object similar to that returned by timediff().

    STYLE can be any of 'all', 'none', 'noc', 'nop' or 'auto'. 'all' shows each of the 5 times available ('wallclock' time, user time, system time, user time of children, and system time of children). 'noc' shows all except the two children times. 'nop' shows only wallclock and the two children times. 'auto' (the default) will act as 'all' unless the children times are both zero, in which case it acts as 'noc'. 'none' prevents output.

    FORMAT is the printf(3)-style format specifier (without the leading '%') to use to print the times. It defaults to '5.2f'.
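
The following sketch ties timeit(), iters and timestr() together, passing an explicit STYLE and FORMAT. The timed code and the iteration count are arbitrary choices for illustration, and the reported times will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

my $count = 1000;

# Time a fixed number of iterations of a code reference.
my $t = timeit($count, sub { my $r = sqrt(12345) });

my $iters  = $t->iters;                     # iterations actually recorded
my $report = timestr($t, 'noc', '6.3f');    # no children times, 3 decimals

print "$iters loops took: $report\n";
```

Using 'noc' keeps the line short when no child processes are involved, and a wider FORMAT such as '6.3f' helps when the default two decimals round everything to zero.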

Optional Exports

The following routines will be exported into your namespace if you specifically ask that they be imported:

  • clearcache ( COUNT )

    Clear the cached time for COUNT rounds of the null loop.

  • clearallcache ( )

    Clear all cached times.

  • cmpthese ( COUNT, CODEHASHREF, [ STYLE ] )
  • cmpthese ( RESULTSHASHREF, [ STYLE ] )

    Optionally calls timethese(), then outputs comparison chart. This:

    1. cmpthese( -1, { a => "++\$i", b => "\$i *= 2" } ) ;

    outputs a chart like:

             Rate    b    a
      b 2831802/s   -- -61%
      a 7208959/s 155%   --

    This chart is sorted from slowest to fastest, and shows the percent speed difference between each pair of tests.

    cmpthese can also be passed the data structure that timethese() returns:

    1. $results = timethese( -1, { a => "++\$i", b => "\$i *= 2" } ) ;
    2. cmpthese( $results );

    in case you want to see both sets of results. If the first argument is an unblessed hash reference, that is RESULTSHASHREF; otherwise that is COUNT.

    Returns a reference to an ARRAY of rows, where each row is an ARRAY of cells from the above chart, including labels. This:

    1. my $rows = cmpthese( -1, { a => '++$i', b => '$i *= 2' }, "none" );

    returns a data structure like:

      [
          [ '',  'Rate',      'b',    'a'    ],
          [ 'b', '2885232/s', '--',   '-59%' ],
          [ 'a', '7099126/s', '146%', '--'   ],
      ]

    NOTE: This result value differs from previous versions, which returned the timethese() result structure. If you want that, just use the two statement timethese ...cmpthese idiom shown above.

    Incidentally, note the variance in the result values between the two examples; this is typical of benchmarking. If this were a real benchmark, you would probably want to run a lot more iterations.

  • countit(TIME, CODE)

    Arguments: TIME is the minimum length of time to run CODE for, and CODE is the code to run. CODE may be either a code reference or a string to be eval'd; either way it will be run in the caller's package.

    TIME is not negative (unlike the COUNT argument of timethis(), where a negative value stands for a time). countit() will run the loop many times to calculate the speed of CODE before running it for TIME. The actual run time will usually be greater than TIME due to system clock resolution, so it's best to look at the number of iterations divided by the times that you are concerned with, not just the iterations.

    Returns: a Benchmark object.

  • disablecache ( )

    Disable caching of timings for the null loop. This will force Benchmark to recalculate these timings for each new piece of code timed.

  • enablecache ( )

    Enable caching of timings for the null loop. The time taken for COUNT rounds of the null loop will be calculated only once for each different COUNT used.

  • timesum ( T1, T2 )

    Returns the sum of two Benchmark times as a Benchmark object suitable for passing to timestr().
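
The row structure that cmpthese() returns can be consumed directly instead of printing the chart. This is a small sketch under the assumption that a fixed, modest iteration count is acceptable; the test names multiply and power are invented for the example, and a "too few iterations" warning may still appear for a run this short.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $x = 3;

# The 'none' style suppresses the chart; the rows are still returned.
my $rows = cmpthese(20_000, {
    multiply => sub { $x * $x },
    power    => sub { $x ** 2 },
}, 'none');

# The first row is the header, the remaining rows are one per test.
for my $row (@$rows) {
    print join("\t", @$row), "\n";
}
```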

:hireswallclock

If the Time::HiRes module has been installed, you can specify the special tag :hireswallclock for Benchmark (if Time::HiRes is not available, the tag will be silently ignored). This tag will cause the wallclock time to be measured in microseconds, instead of integer seconds. Note though that the speed computations are still conducted in CPU time, not wallclock time.

NOTES

The data is stored as a list of values from the time and times functions:

  1. ($real, $user, $system, $children_user, $children_system, $iters)

in seconds for the whole loop (not divided by the number of rounds).

The timing is done using time(3) and times(3).

Code is executed in the caller's package.

The time of the null loop (a loop with the same number of rounds but empty loop body) is subtracted from the time of the real loop.

The null loop times can be cached, the key being the number of rounds. The caching can be controlled using calls like these:

  1. clearcache($key);
  2. clearallcache();
  3. disablecache();
  4. enablecache();

Caching is off by default, as it can (usually slightly) decrease accuracy and does not usually noticeably affect runtimes.

EXAMPLES

For example,

  1. use Benchmark qw( cmpthese ) ;
  2. $x = 3;
  3. cmpthese( -5, {
  4. a => sub{$x*$x},
  5. b => sub{$x**2},
  6. } );

outputs something like this:

    Benchmark: running a, b, each for at least 5 CPU seconds...
           Rate    b    a
    b 1559428/s   -- -62%
    a 4152037/s 166%   --

while

  1. use Benchmark qw( timethese cmpthese ) ;
  2. $x = 3;
  3. $r = timethese( -5, {
  4. a => sub{$x*$x},
  5. b => sub{$x**2},
  6. } );
  7. cmpthese $r;

outputs something like this:

    Benchmark: running a, b, each for at least 5 CPU seconds...
             a: 10 wallclock secs ( 5.14 usr +  0.13 sys =  5.27 CPU) @ 3835055.60/s (n=20210743)
             b:  5 wallclock secs ( 5.41 usr +  0.00 sys =  5.41 CPU) @ 1574944.92/s (n=8520452)
           Rate    b    a
    b 1574945/s   -- -59%
    a 3835056/s 144%   --

INHERITANCE

Benchmark inherits from no other class, except of course for Exporter.

CAVEATS

Comparing eval'd strings with code references will give you inaccurate results: a code reference will show a slightly slower execution time than the equivalent eval'd string.

The real time timing is done using time(2) and the granularity is therefore only one second.

Short tests may produce negative figures because perl can appear to take longer to execute the empty loop than a short test; try:

  1. timethis(100,'1');

The system time of the null loop might be slightly more than the system time of the loop with the actual code and therefore the difference might end up being < 0.

SEE ALSO

Devel::NYTProf - a Perl code profiler

AUTHORS

Jarkko Hietaniemi <jhi@iki.fi>, Tim Bunce <Tim.Bunce@ig.co.uk>

MODIFICATION HISTORY

September 8th, 1994; by Tim Bunce.

March 28th, 1997; by Hugo van der Sanden: added support for code references and the already documented 'debug' method; revamped documentation.

April 04-07th, 1997: by Jarkko Hietaniemi, added the run-for-some-time functionality.

September, 1999; by Barrie Slaymaker: math fixes and accuracy and efficiency tweaks. Added cmpthese(). A result is now returned from timethese(). Exposed countit() (was runfor()).

December, 2001; by Nicholas Clark: make timestr() recognise the style 'none' and return an empty string. If cmpthese is calling timethese, make it pass the style in. (so that 'none' will suppress output). Make sub new dump its debugging output to STDERR, to be consistent with everything else. All bugs found while writing a regression test.

September, 2002; by Jarkko Hietaniemi: add ':hireswallclock' special tag.

February, 2004; by Chia-liang Kao: make cmpthese and timestr use time statistics for children instead of parent when the style is 'nop'.

November, 2007; by Christophe Grosjean: make cmpthese and timestr compute time consistently with style argument, default is 'all' not 'noc' any more.

 
CGI

NAME

CGI - Handle Common Gateway Interface requests and responses

SYNOPSIS

    use CGI;

    my $q = CGI->new;

    # Process an HTTP request
    @values  = $q->param('form_field');
    $fh      = $q->upload('file_field');
    $riddle  = $q->cookie('riddle_name');
    %answers = $q->cookie('answers');

    # Prepare various HTTP responses
    print $q->header();
    print $q->header('application/json');

    $cookie1 = $q->cookie(-name=>'riddle_name', -value=>"The Sphynx's Question");
    $cookie2 = $q->cookie(-name=>'answers', -value=>\%answers);
    print $q->header(
        -type    => 'image/gif',
        -expires => '+3d',
        -cookie  => [$cookie1,$cookie2]
    );

    print $q->redirect('http://somewhere.else/in/movie/land');

DESCRIPTION

CGI.pm is a stable, complete and mature solution for processing and preparing HTTP requests and responses. Major features include processing form submissions, file uploads, reading and writing cookies, query string generation and manipulation, and processing and preparing HTTP headers. Some HTML generation utilities are included as well.

CGI.pm performs very well in a vanilla CGI.pm environment and also comes with built-in support for mod_perl and mod_perl2 as well as FastCGI.

It has the benefit of having been developed and refined over 10 years with input from dozens of contributors and of being deployed on thousands of websites. CGI.pm has been included in the Perl distribution since Perl 5.004, and has become a de facto standard.

PROGRAMMING STYLE

There are two styles of programming with CGI.pm, an object-oriented style and a function-oriented style. In the object-oriented style you create one or more CGI objects and then use object methods to create the various elements of the page. Each CGI object starts out with the list of named parameters that were passed to your CGI script by the server. You can modify the objects, save them to a file or database and recreate them. Because each object corresponds to the "state" of the CGI script, and because each object's parameter list is independent of the others, this allows you to save the state of the script and restore it later.

For example, using the object oriented style, here is how you create a simple "Hello World" HTML page:

    #!/usr/local/bin/perl -w
    use CGI;                             # load CGI routines
    $q = CGI->new;                       # create new CGI object
    print $q->header,                    # create the HTTP header
          $q->start_html('hello world'), # start the HTML
          $q->h1('hello world'),         # level 1 header
          $q->end_html;                  # end the HTML

In the function-oriented style, there is one default CGI object that you rarely deal with directly. Instead you just call functions to retrieve CGI parameters, create HTML tags, manage cookies, and so on. This provides you with a cleaner programming interface, but limits you to using one CGI object at a time. The following example prints the same page, but uses the function-oriented interface. The main differences are that we now need to import a set of functions into our name space (usually the "standard" functions), and we don't need to create the CGI object.

    #!/usr/local/bin/perl
    use CGI qw/:standard/;           # load standard CGI routines
    print header,                    # create the HTTP header
          start_html('hello world'), # start the HTML
          h1('hello world'),         # level 1 header
          end_html;                  # end the HTML

The examples in this document mainly use the object-oriented style. See HOW TO IMPORT FUNCTIONS for important information on function-oriented programming in CGI.pm.

CALLING CGI.PM ROUTINES

Most CGI.pm routines accept several arguments, sometimes as many as 20 optional ones! To simplify this interface, all routines use a named argument calling style that looks like this:

  1. print $q->header(-type=>'image/gif',-expires=>'+3d');

Each argument name is preceded by a dash. Neither case nor order matters in the argument list. -type, -Type, and -TYPE are all acceptable. In fact, only the first argument needs to begin with a dash. If a dash is present in the first argument, CGI.pm assumes dashes for the subsequent ones.

Several routines are commonly called with just one argument. In the case of these routines you can provide the single argument without an argument name. header() happens to be one of these routines. In this case, the single argument is the document type.

  1. print $q->header('text/html');

Other such routines are documented below.

Sometimes named arguments expect a scalar, sometimes a reference to an array, and sometimes a reference to a hash. Often, you can pass any type of argument and the routine will do whatever is most appropriate. For example, the param() routine is used to set a CGI parameter to a single or a multi-valued value. The two cases are shown below:

  1. $q->param(-name=>'veggie',-value=>'tomato');
  2. $q->param(-name=>'veggie',-value=>['tomato','tomahto','potato','potahto']);

A large number of routines in CGI.pm actually aren't specifically defined in the module, but are generated automatically as needed. These are the "HTML shortcuts," routines that generate HTML tags for use in dynamically-generated pages. HTML tags have both attributes (the attribute="value" pairs within the tag itself) and contents (the part between the opening and closing pairs.) To distinguish between attributes and contents, CGI.pm uses the convention of passing HTML attributes as a hash reference as the first argument, and the contents, if any, as any subsequent arguments. It works out like this:

    Code                           Generated HTML
    ----                           --------------
    h1()                           <h1>
    h1('some','contents');         <h1>some contents</h1>
    h1({-align=>left});            <h1 align="LEFT">
    h1({-align=>left},'contents'); <h1 align="LEFT">contents</h1>

HTML tags are described in more detail later.

Many newcomers to CGI.pm are puzzled by the difference between the calling conventions for the HTML shortcuts, which require curly braces around the HTML tag attributes, and the calling conventions for other routines, which manage to generate attributes without the curly brackets. Don't be confused. As a convenience the curly braces are optional in all but the HTML shortcuts. If you like, you can use curly braces when calling any routine that takes named arguments. For example:

  1. print $q->header( {-type=>'image/gif',-expires=>'+3d'} );

If you use the -w switch, you will be warned that some CGI.pm argument names conflict with built-in Perl functions. The most frequent of these is the -values argument, used to create multi-valued menus, radio button clusters and the like. To get around this warning, you have several choices:

  1. Use another name for the argument, if one is available. For example, -value is an alias for -values.

  2. Change the capitalization, e.g. -Values

  3. Put quotes around the argument name, e.g. '-values'

Many routines will do something useful with a named argument that they don't recognize. For example, you can produce non-standard HTTP header fields by providing them as named arguments:

  1. print $q->header(-type => 'text/html',
  2. -cost => 'Three smackers',
  3. -annoyance_level => 'high',
  4. -complaints_to => 'bit bucket');

This will produce the following nonstandard HTTP header:

  1. HTTP/1.0 200 OK
  2. Cost: Three smackers
  3. Annoyance-level: high
  4. Complaints-to: bit bucket
  5. Content-type: text/html

Notice the way that underscores are translated automatically into hyphens. HTML-generating routines perform a different type of translation.

This feature allows you to keep up with the rapidly changing HTTP and HTML "standards".

CREATING A NEW QUERY OBJECT (OBJECT-ORIENTED STYLE):

  1. $query = CGI->new;

This will parse the input (from POST, GET and DELETE methods) and store it into a perl5 object called $query.

Any filehandles from file uploads will have their position reset to the beginning of the file.

CREATING A NEW QUERY OBJECT FROM AN INPUT FILE

  1. $query = CGI->new(INPUTFILE);

If you provide a file handle to the new() method, it will read parameters from the file (or STDIN, or whatever). The file can be in any of the forms described below under debugging (i.e. a series of newline delimited TAG=VALUE pairs will work). Conveniently, this type of file is created by the save() method (see below). Multiple records can be saved and restored.

Perl purists will be pleased to know that this syntax accepts references to file handles, or even references to filehandle globs, which is the "official" way to pass a filehandle:

  1. $query = CGI->new(\*STDIN);

You can also initialize the CGI object with a FileHandle or IO::File object.

If you are using the function-oriented interface and want to initialize CGI state from a file handle, the way to do this is with restore_parameters(). This will (re)initialize the default CGI object from the indicated file handle.

  1. open (IN,"test.in") || die;
  2. restore_parameters(IN);
  3. close IN;

You can also initialize the query object from a hash reference:

  1. $query = CGI->new( {'dinosaur'=>'barney',
  2. 'song'=>'I love you',
  3. 'friends'=>[qw/Jessica George Nancy/]}
  4. );

or from a properly formatted, URL-escaped query string:

  1. $query = CGI->new('dinosaur=barney&color=purple');

or from a previously existing CGI object (currently this clones the parameter list, but none of the other object-specific fields, such as autoescaping):

  1. $old_query = CGI->new;
  2. $new_query = CGI->new($old_query);

To create an empty query, initialize it from an empty string or hash:

  1. $empty_query = CGI->new("");
  2. -or-
  3. $empty_query = CGI->new({});
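
The initializers described above can be tried outside a web server. The sketch below is only an illustration, assuming CGI.pm is installed (it shipped with Perl 5.18 but was dropped from the core distribution in later releases); the variable names are invented for the example.

```perl
use strict;
use warnings;
use CGI;

# From a URL-escaped query string.
my $from_string = CGI->new('dinosaur=barney&color=purple');
my $dinosaur    = $from_string->param('dinosaur');   # scalar context
my @names       = $from_string->param;               # parameter names

# From a hash reference; array references become multivalued parameters.
my $from_hash = CGI->new({
    song    => 'I love you',
    friends => [qw/Jessica George Nancy/],
});
my $friends = $from_hash->param_fetch('friends');    # array reference

# An empty query has no parameters at all.
my $empty = CGI->new('');
my @none  = $empty->param;

print "dinosaur=$dinosaur, names=@names, friends=@$friends\n";
```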

FETCHING A LIST OF KEYWORDS FROM THE QUERY:

  1. @keywords = $query->keywords

If the script was invoked as the result of an <ISINDEX> search, the parsed keywords can be obtained as an array using the keywords() method.

FETCHING THE NAMES OF ALL THE PARAMETERS PASSED TO YOUR SCRIPT:

  1. @names = $query->param

If the script was invoked with a parameter list (e.g. "name1=value1&name2=value2&name3=value3"), the param() method will return the parameter names as a list. If the script was invoked as an <ISINDEX> script and contains a string without ampersands (e.g. "value1+value2+value3"), there will be a single parameter named "keywords" containing the "+"-delimited keywords.

NOTE: As of version 1.5, the array of parameter names returned will be in the same order as they were submitted by the browser. Usually this order is the same as the order in which the parameters are defined in the form (however, this isn't part of the spec, and so isn't guaranteed).

FETCHING THE VALUE OR VALUES OF A SINGLE NAMED PARAMETER:

  1. @values = $query->param('foo');
  2. -or-
  3. $value = $query->param('foo');

Pass the param() method a single argument to fetch the value of the named parameter. If the parameter is multivalued (e.g. from multiple selections in a scrolling list), you can ask to receive an array. Otherwise the method will return a single value.

If a value is not given in the query string, as in the queries "name1=&name2=", it will be returned as an empty string.

If the parameter does not exist at all, then param() will return undef in a scalar context, and the empty list in a list context.

SETTING THE VALUE(S) OF A NAMED PARAMETER:

  1. $query->param('foo','an','array','of','values');

This sets the value for the named parameter 'foo' to an array of values. This is one way to change the value of a field AFTER the script has been invoked once before. (Another way is with the -override parameter accepted by all methods that generate form elements.)

param() also recognizes a named parameter style of calling described in more detail later:

  1. $query->param(-name=>'foo',-values=>['an','array','of','values']);
  2. -or-
  3. $query->param(-name=>'foo',-value=>'the value');

APPENDING ADDITIONAL VALUES TO A NAMED PARAMETER:

  1. $query->append(-name=>'foo',-values=>['yet','more','values']);

This adds a value or list of values to the named parameter. The values are appended to the end of the parameter if it already exists. Otherwise the parameter is created. Note that this method only recognizes the named argument calling syntax.
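
Setting, appending to and deleting a parameter can be combined as in the following sketch. It assumes CGI.pm is installed and starts from an empty query so that it runs without a web server; param_fetch() is used to read the values back as an array reference.

```perl
use strict;
use warnings;
use CGI;

my $q = CGI->new('');

# Set 'foo' to a list of values, then append two more.
$q->param('foo', 'an', 'array');
$q->append(-name => 'foo', -values => ['of', 'values']);

my $values = $q->param_fetch('foo');     # reference to the value list
my $joined = join ' ', @$values;

# Remove the parameter again; param() then yields undef in scalar context.
$q->delete('foo');
my $after = $q->param('foo');

print "before delete: $joined\n";
print "after delete: ", (defined $after ? $after : 'undef'), "\n";
```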

IMPORTING ALL PARAMETERS INTO A NAMESPACE:

  1. $query->import_names('R');

This creates a series of variables in the 'R' namespace. For example, $R::foo, @R::foo. For keyword lists, a variable @R::keywords will appear. If no namespace is given, this method will assume 'Q'. WARNING: don't import anything into 'main'; this is a major security risk!!!!

NOTE 1: Variable names are transformed as necessary into legal Perl variable names. All non-legal characters are transformed into underscores. If you need to keep the original names, you should use the param() method instead to access CGI variables by name.

NOTE 2: In older versions, this method was called import(). As of version 2.20, this name has been removed completely to avoid conflict with the built-in Perl module import operator.

DELETING A PARAMETER COMPLETELY:

  1. $query->delete('foo','bar','baz');

This completely clears a list of parameters. It is sometimes useful for resetting parameters that you don't want passed down between script invocations.

If you are using the function call interface, use "Delete()" instead to avoid conflicts with Perl's built-in delete operator.

DELETING ALL PARAMETERS:

  1. $query->delete_all();

This clears the CGI object completely. It might be useful to ensure that all the defaults are taken when you create a fill-out form.

Use Delete_all() instead if you are using the function call interface.

HANDLING NON-URLENCODED ARGUMENTS

If POSTed data is not of type application/x-www-form-urlencoded or multipart/form-data, then the POSTed data will not be processed, but instead be returned as-is in a parameter named POSTDATA. To retrieve it, use code like this:

  1. my $data = $query->param('POSTDATA');

Likewise, PUTed data can be retrieved with code like this:

  1. my $data = $query->param('PUTDATA');

(If you don't know what the preceding means, don't worry about it. It only affects people trying to use CGI for XML processing and other specialized tasks.)

DIRECT ACCESS TO THE PARAMETER LIST:

  $q->param_fetch('address')->[1] = '1313 Mockingbird Lane';
  unshift @{$q->param_fetch(-name=>'address')},'George Munster';

If you need access to the parameter list in a way that isn't covered by the methods given in the previous sections, you can obtain a direct reference to it by calling the param_fetch() method with the name of the parameter. This will return an array reference to the named parameter, which you then can manipulate in any way you like.

You can also use a named argument style using the -name argument.

FETCHING THE PARAMETER LIST AS A HASH:

  $params = $q->Vars;
  print $params->{'address'};
  @foo = split("\0",$params->{'foo'});
  %params = $q->Vars;

  use CGI ':cgi-lib';
  $params = Vars;

Many people want to fetch the entire parameter list as a hash in which the keys are the names of the CGI parameters, and the values are the parameters' values. The Vars() method does this. Called in a scalar context, it returns the parameter list as a tied hash reference. Changing a key changes the value of the parameter in the underlying CGI parameter list. Called in a list context, it returns the parameter list as an ordinary hash. This allows you to read the contents of the parameter list, but not to change it.

When using this, the thing you must watch out for is multivalued CGI parameters. Because a hash cannot distinguish between scalar and list context, multivalued parameters will be returned as a packed string, separated by the "\0" (null) character. You must split this packed string in order to get at the individual values. This is the convention introduced long ago by Steve Brenner in his cgi-lib.pl module for Perl version 4.

If you wish to use Vars() as a function, import the :cgi-lib set of function calls (also see the section on CGI-LIB compatibility).
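
As a rough illustration of the packed-string convention, the following sketch expands every value of a hash into an array reference. The hash here is a hand-built stand-in for what Vars() returns, and unpack_params is a hypothetical helper name, not a CGI.pm function.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a hash whose multivalued entries are packed
# with "\0" (the cgi-lib.pl convention described above) into a hash of
# array references.
sub unpack_params {
    my ($packed) = @_;
    my %expanded;
    for my $name (keys %$packed) {
        # A limit of -1 keeps trailing empty values instead of dropping them.
        $expanded{$name} = [ split /\0/, $packed->{$name}, -1 ];
    }
    return \%expanded;
}

# Stand-in data; in a real script this would come from Vars().
my %params = (
    address => '1313 Mockingbird Lane',
    foo     => join("\0", 'one', 'two', 'three'),
);

my $expanded = unpack_params(\%params);
print "foo has ", scalar @{ $expanded->{foo} }, " values\n";
print "address: $expanded->{address}[0]\n";
```

Treating every parameter as a list, even single-valued ones, avoids having to remember which fields may arrive more than once.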

SAVING THE STATE OF THE SCRIPT TO A FILE:

  $query->save(\*FILEHANDLE)

This will write the current state of the form to the provided filehandle. You can read it back in by providing a filehandle to the new() method. Note that the filehandle can be a file, a pipe, or whatever!

The format of the saved file is:

  NAME1=VALUE1
  NAME1=VALUE1'
  NAME2=VALUE2
  NAME3=VALUE3
  =

Both name and value are URL escaped. Multi-valued CGI parameters are represented as repeated names. A session record is delimited by a single = symbol. You can write out multiple records and read them back in with several calls to new. You can do this across several sessions by opening the file in append mode, allowing you to create primitive guest books, or to keep a history of users' queries. Here's a short example of creating multiple session records:

  use CGI;
  open (OUT,'>>','test.out') || die;
  $records = 5;
  for (0..$records) {
      my $q = CGI->new;
      $q->param(-name=>'counter',-value=>$_);
      $q->save(\*OUT);
  }
  close OUT;
  # reopen for reading
  open (IN,'<','test.out') || die;
  while (!eof(IN)) {
      my $q = CGI->new(\*IN);
      print $q->param('counter'),"\n";
  }
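
If you want to inspect a saved file without going through new(), the format is simple enough to read by hand. The sketch below is an illustration only, not the parser CGI.pm uses: read_records is a hypothetical helper, and it assumes one NAME=VALUE pair per line, %XX escapes in names and values, and a line holding only "=" at the end of each record.

```perl
use strict;
use warnings;

# Hypothetical reader for the save format described above.
sub read_records {
    my ($fh) = @_;
    my (@records, %current);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line eq '=') {                 # end of one session record
            push @records, +{ %current };
            %current = ();
            next;
        }
        my ($name, $value) = split /=/, $line, 2;
        next unless defined $value;         # skip lines without "="
        s/%([0-9A-Fa-f]{2})/chr hex $1/ge for $name, $value;
        push @{ $current{$name} }, $value;  # repeated names accumulate
    }
    return @records;
}

# An in-memory file stands in for test.out.
my $saved = "counter=0\nname=Fred%20Flintstone\n=\ncounter=1\ncounter=2\n=\n";
open my $in, '<', \$saved or die "cannot open in-memory file: $!";
my @records = read_records($in);
close $in;

print scalar(@records), " records\n";
print "first name: $records[0]{name}[0]\n";
print "second counters: @{ $records[1]{counter} }\n";
```

Each record comes back as a hash of array references, so multi-valued parameters keep all of their values in the order they were written.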

The file format used for save/restore is identical to that used by the Whitehead Genome Center's data exchange format "Boulderio", and can be manipulated and even databased using Boulderio utilities. See

  http://stein.cshl.org/boulder/

for further details.

If you wish to use this method from the function-oriented (non-OO) interface, the exported name for this method is save_parameters().

RETRIEVING CGI ERRORS

Errors can occur while processing user input, particularly when processing uploaded files. When these errors occur, CGI will stop processing and return an empty parameter list. You can test for the existence and nature of errors using the cgi_error() function. The error messages are formatted as HTTP status codes. You can either incorporate the error text into an HTML page, or use it as the value of the HTTP status:

  my $error = $q->cgi_error;
  if ($error) {
      print $q->header(-status=>$error),
            $q->start_html('Problems'),
            $q->h2('Request not processed'),
            $q->strong($error);
      exit 0;
  }

When using the function-oriented interface (see the next section), errors may only occur the first time you call param(). Be ready for this!

USING THE FUNCTION-ORIENTED INTERFACE

To use the function-oriented interface, you must specify which CGI.pm routines or sets of routines to import into your script's namespace. There is a small overhead associated with this importation, but it isn't much.

  use CGI <list of methods>;

The listed methods will be imported into the current package; you can call them directly without creating a CGI object first. This example shows how to import the param() and header() methods, and then use them directly:

  use CGI 'param','header';
  print header('text/plain');
  $zipcode = param('zipcode');

More frequently, you'll import common sets of functions by referring to the groups by name. All function sets are preceded with a ":" character as in ":html3" (for tags defined in the HTML 3 standard).

Here is a list of the function sets you can import:

  • :cgi

    Import all CGI-handling methods, such as param(), path_info() and the like.

  • :form

    Import all fill-out form generating methods, such as textfield().

  • :html2

    Import all methods that generate HTML 2.0 standard elements.

  • :html3

    Import all methods that generate HTML 3.0 elements (such as <table>, <sup> and <sub>).

  • :html4

    Import all methods that generate HTML 4 elements (such as <abbr>, <acronym> and <thead>).

  • :netscape

    Import the <blink>, <fontsize> and <center> tags.

  • :html

    Import all HTML-generating shortcuts (i.e. 'html2', 'html3', 'html4' and 'netscape')

  • :standard

    Import "standard" features, 'html2', 'html3', 'html4', 'form' and 'cgi'.

  • :all

    Import all the available methods. For the full list, see the CGI.pm code, where the variable %EXPORT_TAGS is defined.

If you import a function name that is not part of CGI.pm, the module will treat it as a new HTML tag and generate the appropriate subroutine. You can then use it like any other HTML tag. This is to provide for the rapidly-evolving HTML "standard." For example, say Microsoft comes out with a new tag called <gradient> (which causes the user's desktop to be flooded with a rotating gradient fill until his machine reboots). You don't need to wait for a new version of CGI.pm to start using it immediately:

  use CGI qw/:standard :html3 gradient/;
  print gradient({-start=>'red',-end=>'blue'});

Note that in the interests of execution speed CGI.pm does not use the standard Exporter syntax for specifying load symbols. This may change in the future.

If you import any of the state-maintaining CGI or form-generating methods, a default CGI object will be created and initialized automatically the first time you use any of the methods that require one to be present. This includes param(), textfield(), submit() and the like. (If you need direct access to the CGI object, you can find it in the global variable $CGI::Q.) By importing CGI.pm methods, you can create visually elegant scripts:

  use CGI qw/:standard/;
  print
      header,
      start_html('Simple Script'),
      h1('Simple Script'),
      start_form,
      "What's your name? ",textfield('name'),p,
      "What's the combination?",
      checkbox_group(-name=>'words',
                     -values=>['eenie','meenie','minie','moe'],
                     -defaults=>['eenie','moe']),p,
      "What's your favorite color?",
      popup_menu(-name=>'color',
                 -values=>['red','green','blue','chartreuse']),p,
      submit,
      end_form,
      hr,"\n";

  if (param) {
      print
          "Your name is ",em(param('name')),p,
          "The keywords are: ",em(join(", ",param('words'))),p,
          "Your favorite color is ",em(param('color')),".\n";
  }
  print end_html;

PRAGMAS

In addition to the function sets, there are a number of pragmas that you can import. Pragmas, which are always preceded by a hyphen, change the way that CGI.pm functions in various ways. Pragmas, function sets, and individual functions can all be imported in the same use() line. For example, the following use statement imports the standard set of functions and enables debugging mode (pragma -debug):

  use CGI qw/:standard -debug/;

The current list of pragmas is as follows:

  • -any

    When you use CGI -any, then any method that the query object doesn't recognize will be interpreted as a new HTML tag. This allows you to support the next ad hoc HTML extension, and lets you go wild with new and unsupported tags:

    use CGI qw(-any);
    $q=CGI->new;
    print $q->gradient({speed=>'fast',start=>'red',end=>'blue'});

    Since using -any causes any mistyped method name to be interpreted as an HTML tag, use it with care or not at all.

  • -compile

    This causes the indicated autoloaded methods to be compiled up front, rather than deferred to later. This is useful for scripts that run for an extended period of time under FastCGI or mod_perl, and for those destined to be crunched by Malcolm Beattie's Perl compiler. Use it in conjunction with the methods or method families you plan to use.

    use CGI qw(-compile :standard :html3);

    or even

    use CGI qw(-compile :all);

    Note that using the -compile pragma in this way will always have the effect of importing the compiled functions into the current namespace. If you want to compile without importing use the compile() method instead:

    use CGI();
    CGI->compile();

    This is particularly useful in a mod_perl environment, in which you might want to precompile all CGI routines in a startup script, and then import the functions individually in each mod_perl script.

  • -nosticky

    By default the CGI module implements a state-preserving behavior called "sticky" fields. The way this works is that if you are regenerating a form, the methods that generate the form field values will interrogate param() to see if similarly-named parameters are present in the query string. If they find a like-named parameter, they will use it to set their default values.

    Sometimes this isn't what you want. The -nosticky pragma prevents this behavior. You can also selectively change the sticky behavior in each element that you generate.

  • -tabindex

    Automatically add tab index attributes to each form field. With this option turned off, you can still add tab indexes manually by passing a -tabindex option to each field-generating method.

  • -no_undef_params

    This keeps CGI.pm from including undef params in the parameter list.

  • -no_xhtml

    By default, CGI.pm versions 2.69 and higher emit XHTML (http://www.w3.org/TR/xhtml1/). The -no_xhtml pragma disables this feature. Thanks to Michalis Kabrianis <kabrianis@hellug.gr> for this feature.

    If start_html()'s -dtd parameter specifies an HTML 2.0, 3.2, 4.0 or 4.01 DTD, XHTML will automatically be disabled without needing to use this pragma.

  • -utf8

    This makes CGI.pm treat all parameters as UTF-8 strings. Use this with care, as it will interfere with the processing of binary uploads. It is better to manually select which fields are expected to return utf-8 strings and convert them using code like this:

    use Encode;
    my $arg = decode utf8=>param('foo');

  • -nph

    This makes CGI.pm produce a header appropriate for an NPH (no parsed header) script. You may need to do other things as well to tell the server that the script is NPH. See the discussion of NPH scripts below.

  • -newstyle_urls

    Separate the name=value pairs in CGI parameter query strings with semicolons rather than ampersands. For example:

    ?name=fred;age=24;favorite_color=3

    Semicolon-delimited query strings are always accepted, and will be emitted by self_url() and query_string(). newstyle_urls became the default in version 2.64.

  • -oldstyle_urls

    Separate the name=value pairs in CGI parameter query strings with ampersands rather than semicolons. This is no longer the default.

  • -autoload

    This overrides the autoloader so that any function in your program that is not recognized is referred to CGI.pm for possible evaluation. This allows you to use all the CGI.pm functions without adding them to your symbol table, which is of concern for mod_perl users who are worried about memory consumption. Warning: when -autoload is in effect, you cannot use "poetry mode" (functions without the parentheses). Use hr() rather than hr, or add something like use subs qw/hr p header/ to the top of your script.

  • -no_debug

    This turns off the command-line processing features. If you want to run a CGI.pm script from the command line to produce HTML, and you don't want it to read CGI parameters from the command line or STDIN, then use this pragma:

    use CGI qw(-no_debug :standard);

  • -debug

    This turns on full debugging. In addition to reading CGI arguments from the command line, CGI.pm will pause and try to read arguments from STDIN, producing the message "(offline mode: enter name=value pairs on standard input)".

    See the section on debugging for more details.

  • -private_tempfiles

    CGI.pm can process uploaded files. Ordinarily it spools the uploaded file to a temporary directory, then deletes the file when done. However, this opens the risk of eavesdropping as described in the file upload section. Another CGI script author could peek at this data during the upload, even if it is confidential information. On Unix systems, the -private_tempfiles pragma will cause the temporary file to be unlinked as soon as it is opened and before any data is written into it, reducing, but not eliminating, the risk of eavesdropping (there is still a potential race condition). To make life harder for the attacker, the program chooses tempfile names by calculating a 32 bit checksum of the incoming HTTP headers.

    To ensure that the temporary file cannot be read by other CGI scripts, use suEXEC or a CGI wrapper program to run your script. The temporary file is created with mode 0600 (neither world nor group readable).

    The temporary directory is selected using the following algorithm:

    1. If $CGITempFile::TMPDIRECTORY is already set, use that.
    2. If the environment variable TMPDIR exists, use the location indicated.
    3. Otherwise try the locations /usr/tmp, /var/tmp, C:\temp, /tmp, /temp, ::Temporary Items, and \WWW_ROOT.

    Each of these locations is checked to confirm that it is a writable directory. If not, the algorithm tries the next choice.
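
The temporary directory search described for -private_tempfiles amounts to "take the first candidate that is a writable directory". A minimal sketch of that idea follows; first_writable_dir is a hypothetical helper, the candidate list is abbreviated, and the File::Spec fallback is an addition for this example rather than part of the algorithm above.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first candidate that exists as a
# directory and is writable, or undef if none qualifies.
sub first_writable_dir {
    my @candidates = @_;
    for my $dir (@candidates) {
        next unless defined $dir && length $dir;
        return $dir if -d $dir && -w _;    # "_" reuses the stat from -d
    }
    return;
}

# Candidate order loosely follows the list above; the final entry is a
# portable fallback added for this sketch only.
my @candidates = (
    $ENV{TMPDIR},
    '/usr/tmp', '/var/tmp', 'C:\temp', '/tmp', '/temp',
    File::Spec->tmpdir,
);

my $tmpdir = first_writable_dir(@candidates);
print defined $tmpdir ? "using $tmpdir\n" : "no writable directory found\n";
```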

SPECIAL FORMS FOR IMPORTING HTML-TAG FUNCTIONS

Many of the methods generate HTML tags. As described below, tag functions automatically generate both the opening and closing tags. For example:

  print h1('Level 1 Header');

produces

  <h1>Level 1 Header</h1>

There will be some times when you want to produce the start and end tags yourself. In this case, you can use the form start_tag_name and end_tag_name, as in:

  print start_h1,'Level 1 Header',end_h1;

With a few exceptions (described below), start_tag_name and end_tag_name functions are not generated automatically when you use CGI. However, you can specify the tags you want to generate start/end functions for by putting an asterisk in front of their name, or, alternatively, requesting either "start_tag_name" or "end_tag_name" in the import list.

Example:

  use CGI qw/:standard *table start_ul/;

In this example, the following functions are generated in addition to the standard ones:

  1. start_table() (generates a <table> tag)
  2. end_table() (generates a </table> tag)
  3. start_ul() (generates a <ul> tag)
  4. end_ul() (generates a </ul> tag)
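
To make the idea of paired start and end functions concrete, here is a small stand-alone sketch that builds such a pair with closures. It is not how CGI.pm generates its functions; make_tag_pair is a hypothetical name, and attribute values are not escaped.

```perl
use strict;
use warnings;

# Hypothetical generator: build a pair of closures that emit the start
# and end tags for one element. Attribute handling is deliberately
# minimal (no escaping), so this is only an illustration.
sub make_tag_pair {
    my ($tag) = @_;
    my $start = sub {
        my (%attr) = @_;
        my $attrs = '';
        $attrs .= qq( $_="$attr{$_}") for sort keys %attr;
        return "<$tag$attrs>";
    };
    my $end = sub { return "</$tag>" };
    return ($start, $end);
}

my ($start_table, $end_table) = make_tag_pair('table');
print $start_table->(border => 1), "\n";
print $end_table->(), "\n";
```

Separate start and end functions are convenient when the content between the tags is produced by a loop or by code that prints as it goes.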

GENERATING DYNAMIC DOCUMENTS

Most of CGI.pm's functions deal with creating documents on the fly. Generally you will produce the HTTP header first, followed by the document itself. CGI.pm provides functions for generating HTTP headers of various types as well as for generating HTML. For creating GIF images, see the GD.pm module.

Each of these functions produces a fragment of HTML or HTTP which you can print out directly so that it displays in the browser window, append to a string, or save to a file for later use.

CREATING A STANDARD HTTP HEADER:

Normally the first thing you will do in any CGI script is print out an HTTP header. This tells the browser what type of document to expect, and gives other optional information, such as the language, expiration date, and whether to cache the document. The header can also be manipulated for special purposes, such as server push and pay per view pages.

  print header;

     -or-

  print header('image/gif');

     -or-

  print header('text/html','204 No response');

     -or-

  print header(-type=>'image/gif',
               -nph=>1,
               -status=>'402 Payment required',
               -expires=>'+3d',
               -cookie=>$cookie,
               -charset=>'utf-7',
               -attachment=>'foo.gif',
               -Cost=>'$2.00');

header() returns the Content-type: header. You can provide your own MIME type if you choose, otherwise it defaults to text/html. An optional second parameter specifies the status code and a human-readable message. For example, you can specify 204, "No response" to create a script that tells the browser to do nothing at all. Note that RFC 2616 expects the human-readable phrase to be there as well as the numeric status code.

The last example shows the named argument style for passing arguments to the CGI methods using named parameters. Recognized parameters are -type, -status, -expires, and -cookie. Any other named parameters will be stripped of their initial hyphens and turned into header fields, allowing you to specify any HTTP header you desire. Internal underscores will be turned into hyphens:

  print header(-Content_length=>3002);
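
The naming rule for extra header fields can be sketched in a few lines. This is an illustration of the rule as stated above, not CGI.pm's own code; header_line is a hypothetical helper, and CGI.pm itself may normalize the field name further.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a named argument such as -Content_length
# into an HTTP header line by dropping the leading hyphen and replacing
# internal underscores with hyphens. Capitalization is left untouched.
sub header_line {
    my ($arg, $value) = @_;
    (my $field = $arg) =~ s/^-//;
    $field =~ tr/_/-/;
    return "$field: $value";
}

print header_line(-Content_length => 3002), "\n";
print header_line(-Cost => '$2.00'), "\n";
```

With this rule, -Content_length becomes the field name Content-length, and -Cost from the earlier example becomes Cost.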

Most browsers will not cache the output from CGI scripts. Every time the browser reloads the page, the script is invoked anew. You can change this behavior with the -expires parameter. When you specify an absolute or relative expiration interval with this parameter, some browsers and proxy servers will cache the script's output until the indicated expiration date. The following forms are all valid for the -expires field:

  +30s                                30 seconds from now
  +10m                                ten minutes from now
  +1h                                 one hour from now
  -1d                                 yesterday (i.e. "ASAP!")
  now                                 immediately
  +3M                                 in three months
  +10y                                in ten years time
  Thursday, 25-Apr-1999 00:40:33 GMT  at the indicated time & date
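
The relative forms follow a simple pattern: an optional sign, a number, and a one-letter unit. The sketch below converts such a string into an offset in seconds. It is not CGI.pm's implementation; expires_offset is a hypothetical helper, and treating a month as 30 days and a year as 365 days is an assumption of this example.

```perl
use strict;
use warnings;

# Seconds per unit. Months and years use fixed 30-day and 365-day
# lengths, which is an assumption of this sketch.
my %SECONDS = (
    s => 1,
    m => 60,
    h => 60 * 60,
    d => 24 * 60 * 60,
    M => 30 * 24 * 60 * 60,
    y => 365 * 24 * 60 * 60,
);

# Hypothetical helper: convert "+30s", "-1d", "now" and similar into an
# offset in seconds. Returns undef for anything else, such as a full date.
sub expires_offset {
    my ($spec) = @_;
    return 0 if lc($spec) eq 'now';
    return $2 * $SECONDS{$3} * ($1 eq '-' ? -1 : 1)
        if $spec =~ /^([+-]?)(\d+)([smhdMy])$/;
    return;
}

for my $spec ('+30s', '+10m', '+1h', '-1d', 'now', '+3M') {
    printf "%-5s %d seconds\n", $spec, expires_offset($spec);
}
```

Note that the unit letter is case-sensitive: "m" means minutes while "M" means months.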

The -cookie parameter generates a header that tells the browser to provide a "magic cookie" during all subsequent transactions with your script. Some cookies have a special format that includes interesting attributes such as expiration time. Use the cookie() method to create and retrieve session cookies.

The -nph parameter, if set to a true value, will issue the correct headers to work with a NPH (no-parse-header) script. This is important to use with certain servers that expect all their scripts to be NPH.

The -charset parameter can be used to control the character set sent to the browser. If not provided, defaults to ISO-8859-1. As a side effect, this sets the charset() method as well.

The -attachment parameter can be used to turn the page into an attachment. Instead of displaying the page, some browsers will prompt the user to save it to disk. The value of the argument is the suggested name for the saved file. In order for this to work, you may have to set the -type to "application/octet-stream".

The -p3p parameter will add a P3P tag to the outgoing header. The parameter can be an arrayref or a space-delimited string of P3P tags. For example:

  print header(-p3p=>[qw(CAO DSP LAW CURa)]);
  print header(-p3p=>'CAO DSP LAW CURa');

In either case, the outgoing header will be formatted as:

  P3P: policyref="/w3c/p3p.xml" cp="CAO DSP LAW CURa"

CGI.pm will accept valid multi-line headers when each line is separated with a CRLF value ("\r\n" on most platforms) followed by at least one space. For example:

  print header( -ingredients => "ham\r\n eggs\r\n bacon" );

Invalid multi-line header input will trigger an exception. When multi-line headers are received, CGI.pm will always output them back as a single line, according to the folding rules of RFC 2616: the newlines will be removed, while the white space remains.
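
The folding rule can be illustrated with a stand-alone sketch. It only demonstrates the rule as stated above and is not CGI.pm's code; unfold_header is a hypothetical helper, and "continuation whitespace" is taken to mean a space or a tab.

```perl
use strict;
use warnings;

# Hypothetical helper: unfold a multi-line header value by removing each
# CRLF that is followed by a space or tab, leaving that whitespace in
# place. A CR or LF that remains afterwards is treated as invalid input.
sub unfold_header {
    my ($value) = @_;
    $value =~ s/\r\n(?=[ \t])//g;
    die "invalid multi-line header value\n" if $value =~ /[\r\n]/;
    return $value;
}

print unfold_header("ham\r\n eggs\r\n bacon"), "\n";
```

Rejecting a bare line break matters because a value that smuggles in its own CRLF could otherwise start a new header line of the attacker's choosing.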

GENERATING A REDIRECTION HEADER

  print $q->redirect('http://somewhere.else/in/movie/land');

Sometimes you don't want to produce a document yourself, but simply redirect the browser elsewhere, perhaps choosing a URL based on the time of day or the identity of the user.

The redirect() method redirects the browser to a different URL. If you use redirection like this, you should not print out a header as well.

You should always use full URLs (including the http: or ftp: part) in redirection requests. Relative URLs will not work correctly.

You can also use named arguments:

  print $q->redirect(
      -uri=>'http://somewhere.else/in/movie/land',
      -nph=>1,
      -status=>'301 Moved Permanently');

All named arguments recognized by header() are also recognized by redirect(). However, most HTTP headers, including those generated by -cookie and -target, are ignored by the browser.

The -nph parameter, if set to a true value, will issue the correct headers to work with a NPH (no-parse-header) script. This is important to use with certain servers, such as Microsoft IIS, which expect all their scripts to be NPH.

The -status parameter will set the status of the redirect. HTTP defines three different possible redirection status codes:

  301 Moved Permanently
  302 Found
  303 See Other

The default if not specified is 302, which means "moved temporarily." You may change the status to another status code if you wish. Be advised that changing the status to anything other than 301, 302 or 303 will probably break redirection.

Note that the human-readable phrase is also expected to be present to conform with RFC 2616, section 6.1.

CREATING THE HTML DOCUMENT HEADER

  print start_html(-title=>'Secrets of the Pyramids',
                   -author=>'fred@capricorn.org',
                   -base=>'true',
                   -target=>'_blank',
                   -meta=>{'keywords'=>'pharaoh secret mummy',
                           'copyright'=>'copyright 1996 King Tut'},
                   -style=>{'src'=>'/styles/style1.css'},
                   -BGCOLOR=>'blue');

The start_html() routine creates the top of the page, along with a lot of optional information that controls the page's appearance and behavior.

This method returns a canned HTML header and the opening <body> tag. All parameters are optional. In the named parameter form, recognized parameters are -title, -author, -base, -xbase, -dtd, -lang and -target (see below for the explanation). Any additional parameters you provide, such as the unofficial BGCOLOR attribute, are added to the <body> tag. Additional parameters must be preceded by a hyphen.

The argument -xbase allows you to provide an HREF for the <base> tag different from the current location, as in

  -xbase=>"http://home.mcom.com/"

All relative links will be interpreted relative to this tag.

The argument -target allows you to provide a default target frame for all the links and fill-out forms on the page. This is a non-standard HTML feature which only works with some browsers!

  -target=>"answer_window"

You can add arbitrary meta information to the header with the -meta argument. This argument expects a reference to a hash containing name/value pairs of meta information. These will be turned into a series of header <meta> tags that look something like this:

  <meta name="keywords" content="pharaoh secret mummy">
  <meta name="copyright" content="copyright 1996 King Tut">

To create an HTTP-EQUIV type of <meta> tag, use -head, described below.

The -style argument is used to incorporate cascading stylesheets into your code. See the section on CASCADING STYLESHEETS for more information.

The -lang argument is used to incorporate a language attribute into the <html> tag. For example:

  print $q->start_html(-lang=>'fr-CA');

The default if not specified is "en-US" for US English, unless the -dtd parameter specifies an HTML 2.0 or 3.2 DTD, in which case the lang attribute is left off. You can force the lang attribute to be left off in other cases by passing an empty string (-lang=>'').

The -encoding argument can be used to specify the character set for XHTML. It defaults to iso-8859-1 if not specified.

The -dtd argument can be used to specify a public DTD identifier string. For example:

  -dtd => '-//W3C//DTD HTML 4.01 Transitional//EN'

Alternatively, it can take public and system DTD identifiers as an array:

  -dtd => [ '-//W3C//DTD HTML 4.01 Transitional//EN', 'http://www.w3.org/TR/html4/loose.dtd' ]

For the public DTD identifier to be considered, it must be valid. Otherwise it will be replaced by the default DTD. If the public DTD contains 'XHTML', CGI.pm will emit XML.

The -declare_xml argument, when used in conjunction with XHTML, will put a <?xml> declaration at the top of the HTML header. The sole purpose of this declaration is to declare the character set encoding. In the absence of -declare_xml, the output HTML will contain a <meta> tag that specifies the encoding, allowing the HTML to pass most validators. The default for -declare_xml is false.

You can place other arbitrary HTML elements in the <head> section with the -head argument. For example, to place a <link> element in the head section, use this:

  print start_html(-head=>Link({-rel=>'shortcut icon',
                                -href=>'favicon.ico'}));

To incorporate multiple HTML elements into the <head> section, just pass an array reference:

  print start_html(-head=>[
                      Link({-rel=>'next',
                            -href=>'http://www.capricorn.com/s2.html'}),
                      Link({-rel=>'previous',
                            -href=>'http://www.capricorn.com/s1.html'})
                   ]
                  );

And here's how to create an HTTP-EQUIV <meta> tag:

  print start_html(-head=>meta({-http_equiv => 'Content-Type',
                                -content    => 'text/html'}));

JAVASCRIPTING: The -script, -noScript, -onLoad, -onMouseOver, -onMouseOut and -onUnload parameters are used to add JavaScript calls to your pages. -script should point to a block of text containing JavaScript function definitions. This block will be placed within a <script> block inside the HTML (not HTTP) header. The block is placed in the header in order to give your page a fighting chance of having all its JavaScript functions in place even if the user presses the stop button before the page has loaded completely. CGI.pm attempts to format the script in such a way that JavaScript-naive browsers will not choke on the code: unfortunately there are some browsers, such as Chimera for Unix, that get confused by it nevertheless.

The -onLoad and -onUnload parameters point to fragments of JavaScript code to execute when the page is respectively opened and closed by the browser. Usually these parameters are calls to functions defined in the -script field:

  $query = CGI->new;
  print header;
  $JSCRIPT=<<END;
  // Ask a silly question
  function riddle_me_this() {
     var r = prompt("What walks on four legs in the morning, " +
                    "two legs in the afternoon, " +
                    "and three legs in the evening?");
     response(r);
  }
  // Get a silly answer
  function response(answer) {
     if (answer == "man")
        alert("Right you are!");
     else
        alert("Wrong! Guess again.");
  }
  END
  print start_html(-title=>'The Riddle of the Sphinx',
                   -script=>$JSCRIPT);

Use the -noScript parameter to pass some HTML text that will be displayed on browsers that do not have JavaScript (or browsers where JavaScript is turned off).

The <script> tag has several attributes, including "type", "charset" and "src". "src" allows you to keep JavaScript code in an external file. To use these attributes, pass a HASH reference in the -script parameter containing one or more of -type, -src, or -code:

  print $q->start_html(-title=>'The Riddle of the Sphinx',
                       -script=>{-type=>'JAVASCRIPT',
                                 -src=>'/javascript/sphinx.js'}
                       );

  print $q->start_html(-title=>'The Riddle of the Sphinx',
                       -script=>{-type=>'PERLSCRIPT',
                                 -code=>'print "hello world!\n";'}
                       );

A final feature allows you to incorporate multiple <script> sections into the header. Just pass the list of script sections as an array reference. This allows you to specify different source files for different dialects of JavaScript. Example:

  print $q->start_html(-title=>'The Riddle of the Sphinx',
                       -script=>[
                                 { -type => 'text/javascript',
                                   -src  => '/javascript/utilities10.js'
                                 },
                                 { -type => 'text/javascript',
                                   -src  => '/javascript/utilities11.js'
                                 },
                                 { -type => 'text/jscript',
                                   -src  => '/javascript/utilities12.js'
                                 },
                                 { -type => 'text/ecmascript',
                                   -src  => '/javascript/utilities219.js'
                                 }
                                ]
                       );

The option "-language" is a synonym for -type, and is supported for backwards compatibility.

The old-style positional parameters are as follows:

  • Parameters:
  • 1.

    The title

  • 2.

    The author's e-mail address (will create a <link rev="MADE"> tag if present)

  • 3.

    A 'true' flag if you want to include a <base> tag in the header. This helps resolve relative addresses to absolute ones when the document is moved, but makes the document hierarchy non-portable. Use with care!

  • 4, 5, 6...

    Any other parameters you want to include in the <body> tag. This is a good place to put HTML extensions, such as colors and wallpaper patterns.

ENDING THE HTML DOCUMENT:

  print $q->end_html;

This ends an HTML document by printing the </body></html> tags.

CREATING A SELF-REFERENCING URL THAT PRESERVES STATE INFORMATION:

  $myself = $q->self_url;
  print qq(<a href="$myself">I'm talking to myself.</a>);

self_url() will return a URL that, when selected, will reinvoke this script with all its state information intact. This is most useful when you want to jump around within the document using internal anchors but you don't want to disrupt the current contents of the form(s). Something like this will do the trick:

  $myself = $q->self_url;
  print "<a href=\"$myself#table1\">See table 1</a>";
  print "<a href=\"$myself#table2\">See table 2</a>";
  print "<a href=\"$myself#yourself\">See for yourself</a>";

If you want more control over what's returned, use the url() method instead.

You can also retrieve the unprocessed query string with query_string():

  $the_string = $q->query_string();

The behavior of calling query_string is currently undefined when the HTTP method is something other than GET.

OBTAINING THE SCRIPT'S URL

  $full_url      = url();
  $full_url      = url(-full=>1);  #alternative syntax
  $relative_url  = url(-relative=>1);
  $absolute_url  = url(-absolute=>1);
  $url_with_path = url(-path_info=>1);
  $url_with_path_and_query = url(-path_info=>1,-query=>1);
  $netloc        = url(-base => 1);

url() returns the script's URL in a variety of formats. Called without any arguments, it returns the full form of the URL, including host name and port number:

  http://your.host.com/path/to/script.cgi

You can modify this format with the following named arguments:

  • -absolute

    If true, produce an absolute URL, e.g.

    /path/to/script.cgi

  • -relative

    Produce a relative URL. This is useful if you want to reinvoke your script with different parameters. For example:

    script.cgi

  • -full

    Produce the full URL, exactly as if called without any arguments. This overrides the -relative and -absolute arguments.

  • -path (-path_info)

    Append the additional path information to the URL. This can be combined with -full, -absolute or -relative. -path_info is provided as a synonym.

  • -query (-query_string)

    Append the query string to the URL. This can be combined with -full, -absolute or -relative. -query_string is provided as a synonym.

  • -base

    Generate just the protocol and net location, as in http://www.foo.com:8000

  • -rewrite

    If Apache's mod_rewrite is turned on, then the script name and path info probably won't match the request that the user sent. Set -rewrite=>1 (default) to return URLs that match what the user sent (the original request URI). Set -rewrite=>0 to return URLs that match the URL after mod_rewrite's rules have run.

MIXING POST AND URL PARAMETERS

  $color = url_param('color');

It is possible for a script to receive CGI parameters in the URL as well as in the fill-out form by creating a form that POSTs to a URL containing a query string (a "?" mark followed by arguments). The param() method will always return the contents of the POSTed fill-out form, ignoring the URL's query string. To retrieve URL parameters, call the url_param() method. Use it in the same way as param(). The main difference is that it allows you to read the parameters, but not set them.

Under no circumstances will the contents of the URL query string interfere with similarly-named CGI parameters in POSTed forms. If you try to mix a URL query string with a form submitted with the GET method, the results will not be what you expect.
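The separation between the two parameter sources can be pictured with a small plain-Perl sketch. The parse_pairs() helper and the /cgi-bin/order.cgi path are hypothetical, invented for this illustration; CGI.pm does the real parsing for you, including URL decoding and multi-valued fields, which this sketch ignores.

```perl
use strict;
use warnings;

# Hypothetical parser for illustration only; it is not CGI.pm's code
# and skips URL decoding, multi-valued fields and ";" separators.
sub parse_pairs {
    my ($string) = @_;
    return map { my ($k, $v) = split /=/, $_, 2; ($k => $v // '') }
           split /&/, $string;
}

# A form POSTed to /cgi-bin/order.cgi?color=red carries two sources:
my %url_params  = parse_pairs('color=red');              # what url_param() reads
my %post_params = parse_pairs('color=blue&size=large');  # what param() reads

print "from the URL:  $url_params{color}\n";    # red
print "from the form: $post_params{color}\n";   # blue
```

Because the two sets are kept apart, a "color" field in the POSTed form never collides with a "color" argument in the URL.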

CREATING STANDARD HTML ELEMENTS:

CGI.pm defines general HTML shortcut methods for many HTML tags. HTML shortcuts are named after a single HTML element and return a fragment of HTML text. Example:

  print $q->blockquote(
                       "Many years ago on the island of",
                       $q->a({href=>"http://crete.org/"},"Crete"),
                       "there lived a minotaur named",
                       $q->strong("Fred."),
                      ),
        $q->hr;

This results in the following HTML code (extra newlines have been added for readability):

  <blockquote>
  Many years ago on the island of
  <a href="http://crete.org/">Crete</a> there lived
  a minotaur named <strong>Fred.</strong>
  </blockquote>
  <hr>

If you find the syntax for calling the HTML shortcuts awkward, you can import them into your namespace and dispense with the object syntax completely (see the next section for more details):

  use CGI ':standard';
  print blockquote(
                   "Many years ago on the island of",
                   a({href=>"http://crete.org/"},"Crete"),
                   "there lived a minotaur named",
                   strong("Fred."),
                  ),
        hr;

PROVIDING ARGUMENTS TO HTML SHORTCUTS

The HTML methods will accept zero, one or multiple arguments. If you provide no arguments, you get a single tag:

  print hr;   # <hr>

If you provide one or more string arguments, they are concatenated together with spaces and placed between opening and closing tags:

  print h1("Chapter","1");   # <h1>Chapter 1</h1>

If the first argument is a hash reference, then the keys and values of the hash become the HTML tag's attributes:

  print a({-href=>'fred.html',-target=>'_new'},
          "Open a new frame");

  # <a href="fred.html" target="_new">Open a new frame</a>

You may dispense with the dashes in front of the attribute names if you prefer:

  print img {src=>'fred.gif',align=>'LEFT'};

  # <img align="LEFT" src="fred.gif">

Sometimes an HTML tag attribute has no argument. For example, ordered lists can be marked as COMPACT. The syntax for this is to give the attribute a value of undef:

  print ol({compact=>undef},li('one'),li('two'),li('three'));

Prior to CGI.pm version 2.41, providing an empty ('') string as an attribute argument was the same as providing undef. However, this has changed in order to accommodate those who want to create tags of the form <img alt="">. The difference is shown in these two pieces of code:

  CODE                   RESULT
  img({alt=>undef})      <img alt>
  img({alt=>''})         <img alt="">

THE DISTRIBUTIVE PROPERTY OF HTML SHORTCUTS

One of the cool features of the HTML shortcuts is that they are distributive. If you give them an argument consisting of a reference to a list, the tag will be distributed across each element of the list. For example, here's one way to make an unordered list:

  print ul(
           li({-type=>'disc'},['Sneezy','Doc','Sleepy','Happy'])
          );

This example will result in HTML output that looks like this:

  <ul>
    <li type="disc">Sneezy</li>
    <li type="disc">Doc</li>
    <li type="disc">Sleepy</li>
    <li type="disc">Happy</li>
  </ul>

This is extremely useful for creating tables. For example:

  print table({-border=>undef},
              caption('When Should You Eat Your Vegetables?'),
              Tr({-align=>'CENTER',-valign=>'TOP'},
                 [
                  th(['Vegetable', 'Breakfast','Lunch','Dinner']),
                  td(['Tomatoes' , 'no', 'yes', 'yes']),
                  td(['Broccoli' , 'no', 'no',  'yes']),
                  td(['Onions'   , 'yes','yes', 'yes'])
                 ]
              )
        );
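The distribution rule can be sketched in a few lines of plain Perl. The make_tag() helper below is hypothetical and is not how CGI.pm is implemented; it only mimics the documented behavior, assuming that an optional leading hash reference supplies the attributes and that a lone array reference causes the tag to be repeated once per element.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only (not CGI.pm's code).
sub make_tag {
    my ($name, @args) = @_;

    # An optional leading hash reference holds the attributes.
    my $attr = '';
    if (@args && ref $args[0] eq 'HASH') {
        my %a = %{ shift @args };
        for my $key (sort keys %a) {
            (my $clean = $key) =~ s/^-//;    # the leading dash is optional
            $attr .= defined $a{$key} ? qq( $clean="$a{$key}") : " $clean";
        }
    }

    # A lone array reference distributes the tag over its elements.
    if (@args == 1 && ref $args[0] eq 'ARRAY') {
        return join ' ', map { "<$name$attr>$_</$name>" } @{ $args[0] };
    }

    # Otherwise the arguments are joined with spaces inside one tag.
    return "<$name$attr>@args</$name>";
}

my $items = make_tag('li', { -type => 'disc' }, [ 'Sneezy', 'Doc' ]);
print make_tag('ul', $items), "\n";
# prints <ul><li type="disc">Sneezy</li> <li type="disc">Doc</li></ul>
```

The table example works the same way: each td([...]) call expands into a run of cells, and Tr([...]) then wraps each run in its own row.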

HTML SHORTCUTS AND LIST INTERPOLATION

Consider this bit of code:

  print blockquote(em('Hi'),'mom!');

It will ordinarily return the string that you probably expect, namely:

  <blockquote><em>Hi</em> mom!</blockquote>

Note the space between the element "Hi" and the element "mom!". CGI.pm puts the extra space there using array interpolation, which is controlled by the magic $" variable. Sometimes this extra space is not what you want, for example, when you are trying to align a series of images. In this case, you can simply change the value of $" to an empty string.

  {
      local($") = '';
      print blockquote(em('Hi'),'mom!');
  }

I suggest you put the code in a block as shown here. Otherwise the change to $" will affect all subsequent code until you explicitly reset it.
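The effect of $" and of local can be seen without CGI.pm at all. The following sketch uses ordinary array interpolation, which is the mechanism the text describes; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my @parts = ('<em>Hi</em>', 'mom!');

# By default the list separator is a single space.
my $spaced = "<blockquote>@parts</blockquote>";

# Inside the block the separator is empty; the previous value is
# restored automatically when the block exits.
my $tight;
{
    local $" = '';
    $tight = "<blockquote>@parts</blockquote>";
}
my $after = "@parts";

print "$spaced\n$tight\n$after\n";
# prints:
#   <blockquote><em>Hi</em> mom!</blockquote>
#   <blockquote><em>Hi</em>mom!</blockquote>
#   <em>Hi</em> mom!
```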

NON-STANDARD HTML SHORTCUTS

A few HTML tags don't follow the standard pattern for various reasons.

comment() generates an HTML comment (<!-- comment -->). Call it like this:

  print comment('here is my comment');

Because of conflicts with built-in Perl functions, the following functions begin with initial caps:

  Select
  Tr
  Link
  Delete
  Accept
  Sub

In addition, start_html(), end_html(), start_form(), end_form(), start_multipart_form() and all the fill-out form tags are special. See their respective sections.

AUTOESCAPING HTML

By default, all HTML that is emitted by the form-generating functions is passed through a function called escapeHTML():

  • $escaped_string = escapeHTML("unescaped string");

    Escape HTML formatting characters in a string.

Provided that you have specified a character set of ISO-8859-1 (the default), the standard HTML escaping rules will be used. The "<" character becomes "&lt;", ">" becomes "&gt;", "&" becomes "&amp;", and the quote character becomes "&quot;". In addition, the hexadecimal 0x8b and 0x9b characters, which some browsers incorrectly interpret as the left and right angle-bracket characters, are replaced by their numeric character entities ("&#8249;" and "&#8250;"). If you manually change the charset, either by calling the charset() method explicitly or by passing a -charset argument to header(), then all characters will be replaced by their numeric entities, since CGI.pm has no lookup table for all the possible encodings.

escapeHTML() expects the supplied string to be a character string. This means you should Encode::decode data received from "outside" and Encode::encode your strings before sending them back outside. If your source code is UTF-8 encoded and you want to upgrade string literals in your source to character strings, you can use "use utf8". See perlunitut, perlunifaq and perlunicode for more information on how Perl handles the difference between bytes and characters.

The automatic escaping does not apply to other shortcuts, such as h1(). You should call escapeHTML() yourself on untrusted data in order to protect your pages against nasty tricks that people may enter into guestbooks, etc. To change the character set, use charset(). To turn autoescaping off completely, use autoEscape(0):

  • $charset = charset([$charset]);

    Get or set the current character set.

  • $flag = autoEscape([$flag]);

    Get or set the value of the autoescape flag.
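The four basic replacements described above are easy to picture in plain Perl. The escape_basic() function below is a hypothetical stand-in written for this illustration, not CGI.pm's implementation: it ignores the 0x8b/0x9b handling and the charset-dependent behavior, so real code should call escapeHTML() instead.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; use escapeHTML() in real code.
sub escape_basic {
    my ($text) = @_;
    return unless defined $text;
    $text =~ s/&/&amp;/g;     # must come first, or later entities get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Untrusted input, such as a guestbook entry.
my $entry = '<script>alert("gotcha")</script> & more';
print escape_basic($entry), "\n";
# prints &lt;script&gt;alert(&quot;gotcha&quot;)&lt;/script&gt; &amp; more
```

The same idea applies to shortcuts that are not autoescaped: with CGI.pm you would write something like h1(escapeHTML($entry)) rather than h1($entry).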

PRETTY-PRINTING HTML

By default, all the HTML produced by these functions comes out as one long line without carriage returns or indentation. This is yuck, but it does reduce the size of the documents by 10-20%. To get pretty-printed output, please use CGI::Pretty, a subclass contributed by Brian Paulsen.

CREATING FILL-OUT FORMS:

General note: The various form-creating methods all return strings to the caller, containing the tag or tags that will create the requested form element. You are responsible for actually printing out these strings. It's set up this way so that you can place formatting tags around the form elements.

Another note: The default values that you specify for the forms are only used the first time the script is invoked (when there is no query string). On subsequent invocations of the script (when there is a query string), the former values are used even if they are blank.

If you want to change the value of a field from its previous value, you have two choices:

(1) call the param() method to set it.

(2) use the -override (alias -force) parameter (a new feature in version 2.15). This forces the default value to be used, regardless of the previous value:

  print textfield(-name=>'field_name',
                  -default=>'starting value',
                  -override=>1,
                  -size=>50,
                  -maxlength=>80);
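The rule for choosing between the default and the previous value can be written down as a short decision function. This is a sketch of the documented behavior only; field_value() and the field name 'city' are hypothetical, and the hash of submitted values stands in for the parameters that CGI.pm would parse from the request.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; not part of CGI.pm.
# %$previous holds the values submitted with the last request.
sub field_value {
    my ($previous, $name, $default, $override) = @_;
    return $default if $override;              # -override=>1 always wins
    return $previous->{$name}                  # sticky, even when blank
        if exists $previous->{$name};
    return $default;                           # first invocation
}

my %submitted = (city => '');    # the user cleared the field

print "[", field_value({},          'city', 'Boston'),    "]\n";   # [Boston]
print "[", field_value(\%submitted, 'city', 'Boston'),    "]\n";   # []
print "[", field_value(\%submitted, 'city', 'Boston', 1), "]\n";   # [Boston]
```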

Yet another note: By default, the text and labels of form elements are escaped according to HTML rules. This means that you can safely use "<CLICK ME>" as the label for a button. However, it also interferes with your ability to incorporate special HTML character sequences, such as &Aacute;, into your fields. If you wish to turn off automatic escaping, call the autoEscape() method with a false value immediately after creating the CGI object:

  $query = CGI->new;
  $query->autoEscape(0);

Note that autoEscape() only affects how some CGI.pm HTML generation functions handle escaping. Calling escapeHTML() explicitly will always escape the HTML.

A Lurking Trap! Some of the form-element generating methods return multiple tags. In a scalar context, the tags will be concatenated together with spaces, or whatever is the current value of the $" global. In a list context, the methods will return a list of elements, allowing you to modify them if you wish. Usually you will not notice this behavior, but beware of this:

  printf("%s\n",end_form());

end_form() produces several tags, and only the first of them will be printed because the format only expects one value.
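The trap is a consequence of Perl's context rules rather than anything specific to forms, so it can be reproduced with a stand-in function. In this sketch two_tags() is hypothetical and simply returns two strings in list context; forcing scalar context is one way to get everything into a single %s.

```perl
use strict;

# Hypothetical stand-in for a method that returns several tags.
sub two_tags {
    my @tags = ('</div>', '</form>');
    return wantarray ? @tags : "@tags";
}

# List context: the format has one %s, so only the first tag is used.
# (With warnings enabled, recent perls also flag the unused argument.)
my $truncated = sprintf("%s\n", two_tags());

# Scalar context: the tags are joined into one string first.
my $complete = sprintf("%s\n", scalar two_tags());

print $truncated;   # </div>
print $complete;    # </div> </form>
```

A plain print two_tags() does not have this problem, because print outputs every element of the list it is given.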

CREATING AN ISINDEX TAG

  print isindex(-action=>$action);

       -or-

  print isindex($action);

Prints out an <isindex> tag. Not very exciting. The parameter -action specifies the URL of the script to process the query. The default is to process the query with the current script.

STARTING AND ENDING A FORM

  print start_form(-method=>$method,
                   -action=>$action,
                   -enctype=>$encoding);
    <... various form stuff ...>
  print end_form;

      -or-

  print start_form($method,$action,$encoding);
    <... various form stuff ...>
  print end_form;

start_form() will return a <form> tag with the optional method, action and form encoding that you specify. The defaults are:

  method:  POST
  action:  this script
  enctype: application/x-www-form-urlencoded for non-XHTML
           multipart/form-data for XHTML, see multipart/form-data below.

end_form() returns the closing </form> tag.

Note: These methods were previously named startform() and endform(). Those names are now DEPRECATED. Please use start_form() and end_form() instead.

start_form()'s enctype argument tells the browser how to package the various fields of the form before sending the form to the server. Two values are possible:

  • application/x-www-form-urlencoded

    This is the older type of encoding. It is compatible with many CGI scripts and is suitable for short fields containing text data. For your convenience, CGI.pm stores the name of this encoding type in &CGI::URL_ENCODED.

  • multipart/form-data

    This is the newer type of encoding. It is suitable for forms that contain very large fields or that are intended for transferring binary data. Most importantly, it enables the "file upload" feature. For your convenience, CGI.pm stores the name of this encoding type in &CGI::MULTIPART.

    Forms that use this type of encoding are not easily interpreted by CGI scripts unless they use CGI.pm or another library designed to handle them.

    If XHTML is activated (the default), then forms will be automatically created using this type of encoding.

The start_form() method uses the older form of encoding by default unless XHTML is requested. If you want to use the newer form of encoding by default, you can call start_multipart_form() instead of start_form(). The method end_multipart_form() is an alias to end_form().

JAVASCRIPTING: The -name and -onSubmit parameters are provided for use with JavaScript. The -name parameter gives the form a name so that it can be identified and manipulated by JavaScript functions. -onSubmit should point to a JavaScript function that will be executed just before the form is submitted to your server. You can use this opportunity to check the contents of the form for consistency and completeness. If you find something wrong, you can put up an alert box or maybe fix things up yourself. You can abort the submission by returning false from this function.

Usually the bulk of JavaScript functions are defined in a <script> block in the HTML header and -onSubmit calls one of these functions. See start_html() for details.

FORM ELEMENTS

After starting a form, you will typically create one or more textfields, popup menus, radio groups and other form elements. Each of these elements takes a standard set of named arguments. Some elements also have optional arguments. The standard arguments are as follows:

  • -name

    The name of the field. After submission this name can be used to retrieve the field's value using the param() method.

  • -value, -values

    The initial value of the field which will be returned to the script after form submission. Some form elements, such as text fields, take a single scalar -value argument. Others, such as popup menus, take a reference to an array of values. The two arguments are synonyms.

  • -tabindex

    A numeric value that sets the order in which the form element receives focus when the user presses the tab key. Elements with lower values receive focus first.

  • -id

    A string identifier that can be used to identify this element to JavaScript and DHTML.

  • -override

    A boolean, which, if true, forces the element to take on the value specified by -value, overriding the sticky behavior described earlier for the -nosticky pragma.

  • -onChange, -onFocus, -onBlur, -onMouseOver, -onMouseOut, -onSelect

    These are used to assign JavaScript event handlers. See the JavaScripting section for more details.

Other common arguments are described in the next section. In addition to these, all attributes described in the HTML specifications are supported.

CREATING A TEXT FIELD

  print textfield(-name=>'field_name',
                  -value=>'starting value',
                  -size=>50,
                  -maxlength=>80);

      -or-

  print textfield('field_name','starting value',50,80);

textfield() will return a text input field.

  • Parameters
  • 1.

    The first parameter is the required name for the field (-name).

  • 2.

    The optional second parameter is the default starting value for the field contents (-value, formerly known as -default).

  • 3.

    The optional third parameter is the size of the field in characters (-size).

  • 4.

    The optional fourth parameter is the maximum number of characters the field will accept (-maxlength).

As with all these methods, the field will be initialized with its previous contents from earlier invocations of the script. When the form is processed, the value of the text field can be retrieved with:

  $value = param('field_name');

If you want to reset it from its initial value after the script has been called once, you can do so like this:

  param('field_name',"I'm taking over this value!");

CREATING A BIG TEXT FIELD

  print textarea(-name=>'foo',
                 -default=>'starting value',
                 -rows=>10,
                 -columns=>50);

      -or-

  print textarea('foo','starting value',10,50);

textarea() is just like textfield(), but it allows you to specify rows and columns for a multiline text entry box. You can provide a starting value for the field, which can be long and contain multiple lines.

CREATING A PASSWORD FIELD

  print password_field(-name=>'secret',
                       -value=>'starting value',
                       -size=>50,
                       -maxlength=>80);

      -or-

  print password_field('secret','starting value',50,80);

password_field() is identical to textfield(), except that its contents will be starred out on the web page.

CREATING A FILE UPLOAD FIELD

  print filefield(-name=>'uploaded_file',
                  -default=>'starting value',
                  -size=>50,
                  -maxlength=>80);

      -or-

  print filefield('uploaded_file','starting value',50,80);

filefield() will return a file upload field. In order to take full advantage of this you must use the new multipart encoding scheme for the form. You can do this either by calling start_form() with an encoding type of &CGI::MULTIPART, or by calling the new method start_multipart_form() instead of vanilla start_form().

  • Parameters
  • 1.

    The first parameter is the required name for the field (-name).

  • 2.

    The optional second parameter is the starting value for the field contents to be used as the default file name (-default).

    For security reasons, browsers don't pay any attention to this field, and so the starting value will always be blank. Worse, the field loses its "sticky" behavior and forgets its previous contents. The starting value field is called for in the HTML specification, however, and possibly some browser will eventually provide support for it.

  • 3.

    The optional third parameter is the size of the field in characters (-size).

  • 4.

    The optional fourth parameter is the maximum number of characters the field will accept (-maxlength).

JAVASCRIPTING: The -onChange, -onFocus, -onBlur, -onMouseOver, -onMouseOut and -onSelect parameters are recognized. See textfield() for details.

PROCESSING A FILE UPLOAD FIELD

Basics

When the form is processed, you can retrieve an IO::Handle compatible handle for a file upload field like this:

  $lightweight_fh = $q->upload('field_name');

  # undef may be returned if it's not a valid file handle
  if (defined $lightweight_fh) {
      # Upgrade the handle to one compatible with IO::Handle:
      my $io_handle = $lightweight_fh->handle;

      open(my $out, '>>', '/usr/local/web/users/feedback')
          or die "Cannot open feedback file: $!";
      binmode $out;    # keep binary uploads intact on Windows
      while ($bytesread = $io_handle->read($buffer, 1024)) {
          print {$out} $buffer;
      }
      close($out) or die "Cannot close feedback file: $!";
  }

In a list context, upload() will return an array of filehandles. This makes it possible to process forms that use the same name for multiple upload fields.

If you want the entered file name for the file, you can just call param():

  $filename = $q->param('field_name');

Different browsers will return slightly different things for the name. Some browsers return the filename only. Others return the full path to the file, using the path conventions of the user's machine. Regardless, the name returned is always the name of the file on the user's machine, and is unrelated to the name of the temporary file that CGI.pm creates during upload spooling (see below).

When a file is uploaded, the browser usually sends some information along with it in the form of headers. The information usually includes the MIME content type. To retrieve this information, call uploadInfo(). It returns a reference to a hash containing all the document headers.

  $filename = $q->param('uploaded_file');
  $info     = $q->uploadInfo($filename);    # undef if nothing was uploaded
  $type     = $info ? $info->{'Content-Type'} : '';
  unless ($type eq 'text/html') {
      die "HTML FILES ONLY!";
  }

If you are using a machine that recognizes "text" and "binary" data modes, be sure to understand when and how to use them (see the Camel book). Otherwise you may find that binary files are corrupted during file uploads.

Accessing the temp files directly

When processing an uploaded file, CGI.pm creates a temporary file on your hard disk and passes you a file handle to that file. After you are finished with the file handle, CGI.pm unlinks (deletes) the temporary file. If you need to you can access the temporary file directly. You can access the temp file for a file upload by passing the file name to the tmpFileName() method:

  $filename    = $query->param('uploaded_file');
  $tmpfilename = $query->tmpFileName($filename);

The temporary file will be deleted automatically when your program exits unless you manually rename it. On some operating systems (such as Windows NT), you will need to close the temporary file's filehandle before your program exits. Otherwise the attempt to delete the temporary file will fail.

Handling interrupted file uploads

There are occasionally problems involving parsing the uploaded file. This usually happens when the user presses "Stop" before the upload is finished. In this case, CGI.pm will return undef for the name of the uploaded file and set cgi_error() to the string "400 Bad request (malformed multipart POST)". This error message is designed so that you can incorporate it into a status code to be sent to the browser. Example:

  $file = $q->upload('uploaded_file');
  if (!$file && $q->cgi_error) {
      print $q->header(-status=>$q->cgi_error);
      exit 0;
  }

You are free to create a custom HTML page to complain about the error, if you wish.

Progress bars for file uploads and avoiding temp files

CGI.pm gives you low-level access to file upload management through a file upload hook. You can use this feature to completely turn off the temp file storage of file uploads, or potentially write your own file upload progress meter.

This is much like the UPLOAD_HOOK facility available in Apache::Request, except that there the first argument to the callback is an Apache::Upload object, whereas here it is the remote filename.

  $q = CGI->new(\&hook [,$data [,$use_tempfile]]);

  sub hook {
      my ($filename, $buffer, $bytes_read, $data) = @_;
      print "Read $bytes_read bytes of $filename\n";
  }

The $data field is optional; it lets you pass configuration information (e.g. a database handle) to your hook callback.

The $use_tempfile field is a flag that lets you turn on and off CGI.pm's use of a temporary disk-based file during file upload. If you set this to a FALSE value (default true) then $q->param('uploaded_file') will no longer work, and the only way to get at the uploaded data is via the hook you provide.

If using the function-oriented interface, call the CGI::upload_hook() method before calling param() or any other CGI functions:

  CGI::upload_hook(\&hook [,$data [,$use_tempfile]]);

This method is not exported by default. You will have to import it explicitly if you wish to use it without the CGI:: prefix.
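As a rough sketch of how a hook might feed a progress meter, the callback below adds up the bytes seen for each file in the hash passed as $data. The %progress hash and the file name photo.jpg are made up for the illustration, and the hook is driven by hand here so that the sketch runs without a real upload; with CGI.pm it would be registered through CGI->new() or CGI::upload_hook() as shown above.

```perl
use strict;
use warnings;

# Running totals, keyed by remote file name. This hash plays the role of $data.
my %progress;

sub hook {
    my ($filename, $buffer, $bytes_read, $data) = @_;
    $data->{$filename} += $bytes_read;
}

# With CGI.pm the hook would be registered like this:
#   my $q = CGI->new(\&hook, \%progress);
# Here the calls are simulated by hand so the sketch runs on its own.
for my $chunk ('a' x 1024, 'b' x 1024, 'c' x 300) {
    hook('photo.jpg', $chunk, length $chunk, \%progress);
}

print "photo.jpg: $progress{'photo.jpg'} bytes so far\n";
# prints photo.jpg: 2348 bytes so far
```

A real progress meter would also need somewhere to publish the totals (a file, a database or shared memory) so that a second request can read them while the upload is still running.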

Troubleshooting file uploads on Windows

If you are using CGI.pm on a Windows platform and find that binary files get slightly larger when uploaded but that text files remain the same, then you have forgotten to activate binary mode on the output filehandle. Be sure to call binmode() on any handle that you create to write the uploaded file to disk.
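A minimal sketch of a binary-safe copy follows. To stay runnable on its own it reads from an in-memory handle instead of a real upload handle and writes to a temporary file created with File::Temp; both are assumptions made for the example. The part that matters is the binmode() call on the output handle.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for the handle returned by upload(): an in-memory file holding
# bytes that text-mode translation on Windows would alter.
my $payload = "GIF89a\x0A\x0D\x0A\x1A binary bytes";
open(my $in, '<', \$payload) or die "Cannot open in-memory handle: $!";
binmode $in;

my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                      # the step that is easy to forget

while (read($in, my $buffer, 1024)) {
    print {$out} $buffer;
}
close($out) or die "Cannot close $path: $!";

print "wrote ", (-s $path), " bytes, expected ", length($payload), "\n";
```

On Unix-like systems binmode() makes no difference to the bytes written, so the call is harmless there and keeps the script portable.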

Older ways to process file uploads

(This section is here for completeness. If you are building a new application with CGI.pm, you can skip it.)

The original way to process file uploads with CGI.pm was to use param(). The value it returns has a dual nature as both a file name and a lightweight filehandle. This dual nature is problematic if you follow the recommended practice of having use strict in your code. Perl will complain when you try to use a string as a filehandle. More seriously, it is possible for the remote user to type garbage into the upload field, in which case what you get from param() is not a filehandle at all, but a string.

To solve this problem the upload() method was added, which always returns a lightweight filehandle. This generally works well, but will have trouble interoperating with some other modules because the file handle is not derived from IO::Handle. So that brings us to the current recommendation given above, which is to call the handle() method on the file handle returned by upload(). That upgrades the handle to an IO::Handle. It's a big win for compatibility for a small penalty of loading IO::Handle the first time you call it.

CREATING A POPUP MENU

  print popup_menu('menu_name',
                   ['eenie','meenie','minie'],
                   'meenie');

      -or-

  %labels = ('eenie'=>'your first choice',
             'meenie'=>'your second choice',
             'minie'=>'your third choice');
  %attributes = ('eenie'=>{'class'=>'class of first choice'});
  print popup_menu('menu_name',
                   ['eenie','meenie','minie'],
                   'meenie',\%labels,\%attributes);

      -or (named parameter style)-

  print popup_menu(-name=>'menu_name',
                   -values=>['eenie','meenie','minie'],
                   -default=>['meenie','minie'],
                   -labels=>\%labels,
                   -attributes=>\%attributes);

popup_menu() creates a menu.

1.

The required first argument is the menu's name (-name).

2.

The required second argument (-values) is an array reference containing the list of menu items in the menu. You can pass the method an anonymous array, as shown in the example, or a reference to a named array, such as "\@foo".

3.

The optional third parameter (-default) is the name of the default menu choice. If not specified, the first item will be the default. The values of the previous choice will be maintained across queries. Pass an array reference to select multiple defaults.

4.

The optional fourth parameter (-labels) is provided for people who want to use different values for the user-visible label inside the popup menu and the value returned to your script. It's a pointer to a hash relating menu values to user-visible labels. If you leave this parameter blank, the menu values will be displayed by default. (You can also leave a label undefined if you want to.)

5.

The optional fifth parameter (-attributes) is provided to assign any of the common HTML attributes to an individual menu item. It's a pointer to a hash relating menu values to another hash with the attribute's name as the key and the attribute's value as the value.

When the form is processed, the selected value of the popup menu can be retrieved using:

  $popup_menu_value = param('menu_name');

CREATING AN OPTION GROUP

Named parameter style

  print popup_menu(-name=>'menu_name',
                   -values=>[qw/eenie meenie minie/,
                             optgroup(-name=>'optgroup_name',
                                      -values => ['moe','catch'],
                                      -attributes=>{'catch'=>{'class'=>'red'}})],
                   -labels=>{'eenie'=>'one',
                             'meenie'=>'two',
                             'minie'=>'three'},
                   -default=>'meenie');

Old style

  print popup_menu('menu_name',
                   ['eenie','meenie','minie',
                    optgroup('optgroup_name', ['moe', 'catch'],
                             {'catch'=>{'class'=>'red'}})],'meenie',
                   {'eenie'=>'one','meenie'=>'two','minie'=>'three'});

optgroup() creates an option group within a popup menu.

1.

The required first argument (-name) is the label attribute of the optgroup and is not inserted in the parameter list of the query.

2.

The required second argument (-values) is an array reference containing the list of menu items in the menu. You can pass the method an anonymous array, as shown in the example, or a reference to a named array, such as \@foo. If you pass a HASH reference, the keys will be used for the menu values, and the values will be used for the menu labels (see -labels below).

3.

The optional third parameter (-labels) allows you to pass a reference to a hash containing user-visible labels for one or more of the menu items. You can use this when you want the user to see one menu string, but have the browser return your program a different one. If you don't specify this, the value string will be used instead ("eenie", "meenie" and "minie" in this example). This is equivalent to using a hash reference for the -values parameter.

4.

An optional fourth parameter (-labeled) can be set to a true value and indicates that the values should be used as the label attribute for each option element within the optgroup.

5.

An optional fifth parameter (-novals) can be set to a true value to suppress the value attribute in each option element within the optgroup.

See the discussion on optgroup at W3C (http://www.w3.org/TR/REC-html40/interact/forms.html#edef-OPTGROUP) for details.

6.

An optional sixth parameter (-attributes) is provided to assign any of the common HTML attributes to an individual menu item. It's a pointer to a hash relating menu values to another hash with the attribute's name as the key and the attribute's value as the value.

CREATING A SCROLLING LIST

  print scrolling_list('list_name',
                       ['eenie','meenie','minie','moe'],
                       ['eenie','moe'],5,'true',{'moe'=>{'class'=>'red'}});

      -or-

  print scrolling_list('list_name',
                       ['eenie','meenie','minie','moe'],
                       ['eenie','moe'],5,'true',
                       \%labels,\%attributes);

      -or-

  print scrolling_list(-name=>'list_name',
                       -values=>['eenie','meenie','minie','moe'],
                       -default=>['eenie','moe'],
                       -size=>5,
                       -multiple=>'true',
                       -labels=>\%labels,
                       -attributes=>\%attributes);

scrolling_list() creates a scrolling list.

  • Parameters:
  • 1.

    The first and second arguments are the list name (-name) and values (-values). As in the popup menu, the second argument should be an array reference.

  • 2.

    The optional third argument (-default) can be either a reference to a list containing the values to be selected by default, or can be a single value to select. If this argument is missing or undefined, then nothing is selected when the list first appears. In the named parameter version, you can use the synonym "-defaults" for this parameter.

  • 3.

    The optional fourth argument is the size of the list (-size).

  • 4.

    The optional fifth argument can be set to true to allow multiple simultaneous selections (-multiple). Otherwise only one selection will be allowed at a time.

  • 5.

    The optional sixth argument is a pointer to a hash containing long user-visible labels for the list items (-labels). If not provided, the values will be displayed.

  • 6.

    The optional seventh parameter (-attributes) is provided to assign any of the common HTML attributes to an individual menu item. It's a pointer to a hash relating menu values to another hash with the attribute's name as the key and the attribute's value as the value.

    When this form is processed, all selected list items will be returned as a list under the parameter name 'list_name'. The values of the selected items can be retrieved with:

      @selected = param('list_name');

CREATING A GROUP OF RELATED CHECKBOXES

  print checkbox_group(-name=>'group_name',
                       -values=>['eenie','meenie','minie','moe'],
                       -default=>['eenie','moe'],
                       -linebreak=>'true',
                       -disabled => ['moe'],
                       -labels=>\%labels,
                       -attributes=>\%attributes);

  print checkbox_group('group_name',
                       ['eenie','meenie','minie','moe'],
                       ['eenie','moe'],'true',\%labels,
                       {'moe'=>{'class'=>'red'}});

  HTML3-COMPATIBLE BROWSERS ONLY:

  print checkbox_group(-name=>'group_name',
                       -values=>['eenie','meenie','minie','moe'],
                       -rows=>2,-columns=>2);

checkbox_group() creates a list of checkboxes that are related by the same name.

  • Parameters:
  • 1.

    The first and second arguments are the checkbox name and values, respectively (-name and -values). As in the popup menu, the second argument should be an array reference. These values are used for the user-readable labels printed next to the checkboxes as well as for the values passed to your script in the query string.

  • 2.

    The optional third argument (-default) can be either a reference to a list containing the values to be checked by default, or a single value to check. If this argument is missing or undefined, then nothing is selected when the list first appears.

  • 3.

    The optional fourth argument (-linebreak) can be set to true to place line breaks between the checkboxes so that they appear as a vertical list. Otherwise, they will be strung together on a horizontal line.

The optional -labels argument is a pointer to a hash relating the checkbox values to the user-visible labels that will be printed next to them. If not provided, the values will be used as the default.

The optional parameters -rows, and -columns cause checkbox_group() to return an HTML3 compatible table containing the checkbox group formatted with the specified number of rows and columns. You can provide just the -columns parameter if you wish; checkbox_group will calculate the correct number of rows for you.

The option -disabled takes an array of checkbox values and disables them by greying them out (this may not be supported by all browsers).

The optional -attributes argument is provided to assign any of the common HTML attributes to an individual menu item. It's a pointer to a hash relating menu values to another hash with the attribute's name as the key and the attribute's value as the value.

The optional -tabindex argument can be used to control the order in which checkboxes receive focus when the user presses the tab button. If passed a scalar numeric value, the first element in the group will receive this tab index and subsequent elements will be incremented by one. If given a reference to an array of checkbox values, then the indexes will be adjusted so that the order specified in the array will correspond to the tab order. You can also pass a reference to a hash in which the hash keys are the checkbox values and the values are the tab indexes of each checkbox. Examples:

  -tabindex => 100                                           # this group starts at index 100 and counts up
  -tabindex => ['moe','minie','eenie','meenie']              # tab in this order
  -tabindex => {meenie=>100,moe=>101,minie=>102,eenie=>200}  # tab in this order

The optional -labelattributes argument will contain attributes attached to the <label> element that surrounds each button.

When the form is processed, all checked boxes will be returned as a list under the parameter name 'group_name'. The values of the "on" checkboxes can be retrieved with:

  @turned_on = param('group_name');

The value returned by checkbox_group() is actually an array of button elements. You can capture them and use them within tables, lists, or in other creative ways:

  @h = checkbox_group(-name=>'group_name',-values=>\@values);
  &use_in_creative_way(@h);
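
As one possible "creative way", the sketch below puts each checkbox in its own table cell. It assumes the td() shortcut distributes over an array reference, as CGI.pm's HTML shortcuts generally do; the variable names are illustrative only.

```perl
use strict;
use warnings;
use CGI qw(:standard);

my @values = qw(eenie meenie minie moe);

# In list context checkbox_group() returns one HTML fragment per checkbox.
my @boxes = checkbox_group(-name => 'group_name', -values => \@values);

# td(\@boxes) wraps each fragment in its own <td>; Tr() and table()
# then wrap the cells in a single table row.
my $html = table(Tr(td(\@boxes)));
print $html, "\n";
```

The same pattern should apply to the list returned by radio_group().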

CREATING A STANDALONE CHECKBOX

  print checkbox(-name=>'checkbox_name',
                 -checked=>1,
                 -value=>'ON',
                 -label=>'CLICK ME');

  -or-

  print checkbox('checkbox_name','checked','ON','CLICK ME');

checkbox() is used to create an isolated checkbox that isn't logically related to any others.

  • Parameters:
  • 1.

    The first parameter is the required name for the checkbox (-name). It will also be used for the user-readable label printed next to the checkbox.

  • 2.

    The optional second parameter (-checked) specifies that the checkbox is turned on by default. Synonyms are -selected and -on.

  • 3.

    The optional third parameter (-value) specifies the value of the checkbox when it is checked. If not provided, the word "on" is assumed.

  • 4.

    The optional fourth parameter (-label) is the user-readable label to be attached to the checkbox. If not provided, the checkbox name is used.

The value of the checkbox can be retrieved using:

  $turned_on = param('checkbox_name');

CREATING A RADIO BUTTON GROUP

  print radio_group(-name=>'group_name',
                    -values=>['eenie','meenie','minie'],
                    -default=>'meenie',
                    -linebreak=>'true',
                    -labels=>\%labels,
                    -attributes=>\%attributes);

  -or-

  print radio_group('group_name',['eenie','meenie','minie'],
                    'meenie','true',\%labels,\%attributes);

  HTML3-COMPATIBLE BROWSERS ONLY:

  print radio_group(-name=>'group_name',
                    -values=>['eenie','meenie','minie','moe'],
                    -rows=>2,-columns=>2);

radio_group() creates a set of logically-related radio buttons (turning one member of the group on turns the others off).

  • Parameters:
  • 1.

    The first argument is the name of the group and is required (-name).

  • 2.

    The second argument (-values) is the list of values for the radio buttons. The values and the labels that appear on the page are identical. Pass an array reference in the second argument, either using an anonymous array, as shown, or by referencing a named array as in "\@foo".

  • 3.

    The optional third parameter (-default) is the name of the default button to turn on. If not specified, the first item will be the default. You can provide a nonexistent button name, such as "-" to start up with no buttons selected.

  • 4.

    The optional fourth parameter (-linebreak) can be set to 'true' to put line breaks between the buttons, creating a vertical list.

  • 5.

    The optional fifth parameter (-labels) is a pointer to an associative array relating the radio button values to user-visible labels to be used in the display. If not provided, the values themselves are displayed.

All modern browsers can take advantage of the optional parameters -rows, and -columns. These parameters cause radio_group() to return an HTML3 compatible table containing the radio group formatted with the specified number of rows and columns. You can provide just the -columns parameter if you wish; radio_group will calculate the correct number of rows for you.

To include row and column headings in the returned table, you can use the -rowheaders and -colheaders parameters. Both of these accept a pointer to an array of headings to use. The headings are just decorative. They don't reorganize the interpretation of the radio buttons -- they're still a single named unit.

The optional -tabindex argument can be used to control the order in which radio buttons receive focus when the user presses the tab button. If passed a scalar numeric value, the first element in the group will receive this tab index and subsequent elements will be incremented by one. If given a reference to an array of radio button values, then the indexes will be jiggered so that the order specified in the array will correspond to the tab order. You can also pass a reference to a hash in which the hash keys are the radio button values and the values are the tab indexes of each button. Examples:

  -tabindex => 100                                           # this group starts at index 100 and counts up
  -tabindex => ['moe','minie','eenie','meenie']              # tab in this order
  -tabindex => {meenie=>100,moe=>101,minie=>102,eenie=>200}  # tab in this order

The optional -attributes argument is provided to assign any of the common HTML attributes to an individual menu item. It's a pointer to a hash relating menu values to another hash with the attribute's name as the key and the attribute's value as the value.

The optional -labelattributes argument will contain attributes attached to the <label> element that surrounds each button.

When the form is processed, the selected radio button can be retrieved using:

  $which_radio_button = param('group_name');

The value returned by radio_group() is actually an array of button elements. You can capture them and use them within tables, lists, or in other creative ways:

  @h = radio_group(-name=>'group_name',-values=>\@values);
  &use_in_creative_way(@h);

CREATING A SUBMIT BUTTON

  print submit(-name=>'button_name',
               -value=>'value');

  -or-

  print submit('button_name','value');

submit() will create the query submission button. Every form should have one of these.

  • Parameters:
  • 1.

    The first argument (-name) is optional. You can give the button a name if you have several submission buttons in your form and you want to distinguish between them.

  • 2.

    The second argument (-value) is also optional. This gives the button a value that will be passed to your script in the query string. The value will also be used as the user-visible label.

  • 3.

    You can use -label as an alias for -value. I always get confused about which of -name and -value changes the user-visible label on the button.

You can figure out which button was pressed by using different values for each one:

  $which_one = param('button_name');
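
A minimal sketch of this technique follows. It simulates a submission by initializing the CGI object from a hash, and dispatches on the value of the pressed button; the button name "action" and the handler table are assumptions made for the example, not part of CGI.pm.

```perl
use strict;
use warnings;
use CGI;

# Simulate a submission in which the "Delete" button was pressed.
my $q = CGI->new({ action => 'Delete' });

# Two submit buttons share one name and differ only in value.
my $buttons = $q->submit(-name => 'action', -value => 'Save')
            . $q->submit(-name => 'action', -value => 'Delete');

# Hypothetical handlers, keyed by button value.
my %handler = (
    Save   => sub { 'record saved' },
    Delete => sub { 'record deleted' },
);

my $pressed = $q->param('action');
my $result  = (defined $pressed && $handler{$pressed})
            ? $handler{$pressed}->()
            : 'no button pressed';
print "$result\n";
```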

CREATING A RESET BUTTON

  print reset

reset() creates the "reset" button. Note that it restores the form to its value from the last time the script was called, NOT necessarily to the defaults.

Note that this conflicts with the Perl reset() built-in. Use CORE::reset() to get the original reset function.

CREATING A DEFAULT BUTTON

  print defaults('button_label')

defaults() creates a button that, when invoked, will cause the form to be completely reset to its defaults, wiping out all the changes the user ever made.

CREATING A HIDDEN FIELD

  print hidden(-name=>'hidden_name',
               -default=>['value1','value2'...]);

  -or-

  print hidden('hidden_name','value1','value2'...);

hidden() produces a text field that can't be seen by the user. It is useful for passing state variable information from one invocation of the script to the next.

  • Parameters:
  • 1.

    The first argument is required and specifies the name of this field (-name).

  • 2.

    The second argument is also required and specifies its value (-default). In the named parameter style of calling, you can provide a single value here or a reference to a whole list.

Fetch the value of a hidden field this way:

  $hidden_value = param('hidden_name');

Note that, just like all the other form elements, the value of a hidden field is "sticky". If you want to replace a hidden field with some other values after the script has been called once, you'll have to do it manually:

  param('hidden_name','new','values','here');
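
The following sketch illustrates the sticky behaviour. It simulates a request that already carries a value by initializing the CGI object from a hash; the field name "step" is a made-up example.

```perl
use strict;
use warnings;
use CGI;

# Simulate a request that already carries step=1.
my $q = CGI->new({ step => 1 });

# The submitted value wins over -default, so this field still holds 1.
my $sticky = $q->hidden(-name => 'step', -default => 2);

# Changing the parameter first makes the new value appear in the field.
$q->param('step', 2);
my $updated = $q->hidden(-name => 'step', -default => 2);

print "$sticky\n$updated\n";
```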

CREATING A CLICKABLE IMAGE BUTTON

  print image_button(-name=>'button_name',
                     -src=>'/source/URL',
                     -align=>'MIDDLE');

  -or-

  print image_button('button_name','/source/URL','MIDDLE');

image_button() produces a clickable image. When it's clicked on, the position of the click is returned to your script as "button_name.x" and "button_name.y", where "button_name" is the name you've assigned to it.

  • Parameters:
  • 1.

    The first argument (-name) is required and specifies the name of this field.

  • 2.

    The second argument (-src) is also required and specifies the URL.

  • 3.

    The third option (-align, optional) is an alignment type, and may be TOP, BOTTOM or MIDDLE.

Fetch the value of the button this way:

  $x = param('button_name.x');
  $y = param('button_name.y');

CREATING A JAVASCRIPT ACTION BUTTON

  print button(-name=>'button_name',
               -value=>'user visible label',
               -onClick=>"do_something()");

  -or-

  print button('button_name',"user visible value","do_something()");

button() produces an <input> tag with type="button". When it's pressed, the fragment of JavaScript code pointed to by the -onClick parameter will be executed.

HTTP COOKIES

Browsers support a so-called "cookie" designed to help maintain state within a browser session. CGI.pm has several methods that support cookies.

A cookie is a name=value pair much like the named parameters in a CGI query string. CGI scripts create one or more cookies and send them to the browser in the HTTP header. The browser maintains a list of cookies that belong to a particular Web server, and returns them to the CGI script during subsequent interactions.

In addition to the required name=value pair, each cookie has several optional attributes:

1.
an expiration time

This is a time/date string (in a special GMT format) that indicates when a cookie expires. The cookie will be saved and returned to your script until this expiration date is reached, even if the user exits the browser and restarts it. If an expiration date isn't specified, the cookie will remain active until the user quits the browser.

2.
a domain

This is a partial or complete domain name for which the cookie is valid. The browser will return the cookie to any host that matches the partial domain name. For example, if you specify a domain name of ".capricorn.com", then the browser will return the cookie to Web servers running on any of the machines "www.capricorn.com", "www2.capricorn.com", "feckless.capricorn.com", etc. Domain names must contain at least two periods to prevent attempts to match on top level domains like ".edu". If no domain is specified, then the browser will only return the cookie to servers on the host the cookie originated from.

3.
a path

If you provide a cookie path attribute, the browser will check it against your script's URL before returning the cookie. For example, if you specify the path "/cgi-bin", then the cookie will be returned to each of the scripts "/cgi-bin/tally.pl", "/cgi-bin/order.pl", and "/cgi-bin/customer_service/complain.pl", but not to the script "/cgi-private/site_admin.pl". By default, path is set to "/", which causes the cookie to be sent to any CGI script on your site.

4.
a "secure" flag

If the "secure" attribute is set, the cookie will only be sent to your script if the CGI request is occurring on a secure channel, such as SSL.
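
To make the domain and path rules concrete, here is a deliberately simplified model of them in plain Perl. It covers only the two checks described above (host name ends with the partial domain, request path begins with the cookie path); real browsers apply further rules, and the function and argument names are invented for this illustration.

```perl
use strict;
use warnings;

# Simplified model of the domain and path rules; not a browser algorithm.
sub cookie_is_returned {
    my (%arg) = @_;
    my $domain = $arg{cookie_domain};
    my $path   = defined $arg{cookie_path} ? $arg{cookie_path} : '/';

    # Domain rule: the host name must end with the partial domain.
    my $domain_ok = $arg{host} =~ /\Q$domain\E\z/i ? 1 : 0;

    # Path rule: the request path must begin with the cookie path
    # (a plain prefix test, which is cruder than what browsers do).
    my $path_ok = index($arg{url_path}, $path) == 0 ? 1 : 0;

    return $domain_ok && $path_ok;
}

for my $url_path ('/cgi-bin/order.pl', '/cgi-private/site_admin.pl') {
    my $sent = cookie_is_returned(
        cookie_domain => '.capricorn.com',
        cookie_path   => '/cgi-bin',
        host          => 'www2.capricorn.com',
        url_path      => $url_path,
    );
    printf "%-30s %s\n", $url_path, $sent ? 'cookie sent' : 'cookie withheld';
}
```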

The interface to HTTP cookies is the cookie() method:

  $cookie = cookie(-name=>'sessionID',
                   -value=>'xyzzy',
                   -expires=>'+1h',
                   -path=>'/cgi-bin/database',
                   -domain=>'.capricorn.org',
                   -secure=>1);
  print header(-cookie=>$cookie);

cookie() creates a new cookie. Its parameters include:

  • -name

    The name of the cookie (required). This can be any string at all. Although browsers limit their cookie names to non-whitespace alphanumeric characters, CGI.pm removes this restriction by escaping and unescaping cookies behind the scenes.

  • -value

    The value of the cookie. This can be any scalar value, array reference, or even hash reference. For example, you can store an entire hash into a cookie this way:

    $cookie=cookie(-name=>'family information',
                   -value=>\%childrens_ages);

  • -path

    The optional partial path for which this cookie will be valid, as described above.

  • -domain

    The optional partial domain for which this cookie will be valid, as described above.

  • -expires

    The optional expiration date for this cookie. The format is as described in the section on the header() method:

    "+1h"   one hour from now

  • -secure

    If set to true, this cookie will only be used within a secure SSL session.

The cookie created by cookie() must be incorporated into the HTTP header within the string returned by the header() method:

  use CGI ':standard';
  print header(-cookie=>$my_cookie);

To create multiple cookies, give header() an array reference:

  $cookie1 = cookie(-name=>'riddle_name',
                    -value=>"The Sphynx's Question");
  $cookie2 = cookie(-name=>'answers',
                    -value=>\%answers);
  print header(-cookie=>[$cookie1,$cookie2]);

To retrieve a cookie, request it by name by calling the cookie() method without the -value parameter. This example uses the object-oriented form:

  use CGI;
  $query = CGI->new;
  $riddle = $query->cookie('riddle_name');
  %answers = $query->cookie('answers');

Cookies created with a single scalar value, such as the "riddle_name" cookie, will be returned in that form. Cookies with array and hash values can also be retrieved.

The cookie and CGI namespaces are separate. If you have a parameter named 'answers' and a cookie named 'answers', the values retrieved by param() and cookie() are independent of each other. However, it's simple to turn a CGI parameter into a cookie, and vice-versa:

  # turn a CGI parameter into a cookie
  $c=cookie(-name=>'answers',-value=>[param('answers')]);
  # vice-versa
  param(-name=>'answers',-value=>[cookie('answers')]);

If you call cookie() without any parameters, it will return a list of the names of all cookies passed to your script:

  @cookies = cookie();

See the cookie.cgi example script for some ideas on how to use cookies effectively.
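
For experimenting outside a web server, the round trip can be simulated in a single script. The sketch below creates a hash-valued cookie, copies its name=value part into the HTTP_COOKIE environment variable (standing in for what a browser would send back), and reads it again. The hash contents are invented, and the simulation step is an assumption about how one might test this, not something CGI.pm requires.

```perl
use strict;
use warnings;
use CGI;

my %answers = (legs => 4, creature => 'man');

# Build the cookie as a script would before printing the header.
my $cookie = CGI->new({})->cookie(-name => 'answers', -value => \%answers);

# A browser sends back only the name=value part, without the attributes.
my ($pair) = split /;\s*/, "$cookie";

# Simulate the next request by placing that pair in the environment.
local $ENV{HTTP_COOKIE} = $pair;
my %returned = CGI->new({})->cookie('answers');

print "$_ => $returned{$_}\n" for sort keys %returned;
```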

WORKING WITH FRAMES

It's possible for CGI.pm scripts to write into several browser panels and windows using the HTML 4 frame mechanism. There are three techniques for defining new frames programmatically:

1.
Create a <Frameset> document

After writing out the HTTP header, instead of creating a standard HTML document using the start_html() call, create a <frameset> document that defines the frames on the page. Specify your script(s) (with appropriate parameters) as the SRC for each of the frames.

There is no specific support for creating <frameset> sections in CGI.pm, but the HTML is very simple to write.

2.
Specify the destination for the document in the HTTP header

You may provide a -target parameter to the header() method:

  print header(-target=>'ResultsWindow');

This will tell the browser to load the output of your script into the frame named "ResultsWindow". If a frame of that name doesn't already exist, the browser will pop up a new window and load your script's document into that. There are a number of magic names that you can use for targets. See the HTML <frame> documentation for details.

3.
Specify the destination for the document in the <form> tag

You can specify the frame to load in the FORM tag itself. With CGI.pm it looks like this:

  print start_form(-target=>'ResultsWindow');

When your script is reinvoked by the form, its output will be loaded into the frame named "ResultsWindow". If one doesn't already exist a new window will be created.

The script "frameset.cgi" in the examples directory shows one way to create pages in which the fill-out form and the response live in side-by-side frames.
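
As a rough illustration of the first technique, a <frameset> document can be written by hand after the HTTP header. In this sketch the script URL, the "panel" parameter, and the frame name "FormWindow" are hypothetical; only "ResultsWindow" comes from the examples above.

```perl
use strict;
use warnings;
use CGI qw(:standard);

# Build a two-frame page. $script is the URL of the CGI script that fills
# each frame; the "panel" parameter is a made-up convention.
sub frameset_page {
    my ($script) = @_;
    return <<"END";
<html>
<head><title>Side-by-side frames</title></head>
<frameset cols="50%,50%">
  <frame src="$script?panel=form" name="FormWindow">
  <frame src="$script?panel=results" name="ResultsWindow">
</frameset>
</html>
END
}

print header(), frameset_page('/cgi-bin/frames.cgi');
```

A form generated in the left frame could then use start_form(-target=>'ResultsWindow') to send its output to the right frame.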

SUPPORT FOR JAVASCRIPT

The usual way to use JavaScript is to define a set of functions in a <SCRIPT> block inside the HTML header and then to register event handlers in the various elements of the page. Events include such things as the mouse passing over a form element, a button being clicked, the contents of a text field changing, or a form being submitted. When an event occurs that involves an element that has registered an event handler, its associated JavaScript code gets called.

The elements that can register event handlers include the <BODY> of an HTML document, hypertext links, all the various elements of a fill-out form, and the form itself. There are a large number of events, and each applies only to the elements for which it is relevant. Here is a partial list:

  • onLoad

    The browser is loading the current document. Valid for:

    + The HTML <BODY> section only.

  • onUnload

    The browser is closing the current page or frame. Valid for:

    + The HTML <BODY> section only.

  • onSubmit

    The user has pressed the submit button of a form. This event happens just before the form is submitted, and your function can return a value of false in order to abort the submission. Valid for:

    + Forms only.

  • onClick

    The mouse has clicked on an item in a fill-out form. Valid for:

    + Buttons (including submit, reset, and image buttons)
    + Checkboxes
    + Radio buttons

  • onChange

    The user has changed the contents of a field. Valid for:

    + Text fields
    + Text areas
    + Password fields
    + File fields
    + Popup Menus
    + Scrolling lists

  • onFocus

    The user has selected a field to work with. Valid for:

    + Text fields
    + Text areas
    + Password fields
    + File fields
    + Popup Menus
    + Scrolling lists

  • onBlur

    The user has deselected a field (gone to work somewhere else). Valid for:

    + Text fields
    + Text areas
    + Password fields
    + File fields
    + Popup Menus
    + Scrolling lists

  • onSelect

    The user has changed the part of a text field that is selected. Valid for:

    + Text fields
    + Text areas
    + Password fields
    + File fields

  • onMouseOver

    The mouse has moved over an element. Valid for:

    + Text fields
    + Text areas
    + Password fields
    + File fields
    + Popup Menus
    + Scrolling lists

  • onMouseOut

    The mouse has moved off an element. Valid for:

    + Text fields
    + Text areas
    + Password fields
    + File fields
    + Popup Menus
    + Scrolling lists

In order to register a JavaScript event handler with an HTML element, just use the event name as a parameter when you call the corresponding CGI method. For example, to have your validateAge() JavaScript code executed every time the textfield named "age" changes, generate the field like this:

  print textfield(-name=>'age',-onChange=>"validateAge(this)");

This example assumes that you've already declared the validateAge() function by incorporating it into a <SCRIPT> block. The CGI.pm start_html() method provides a convenient way to create this section.

Similarly, you can create a form that checks itself over for consistency and alerts the user if some essential value is missing by creating it this way:

  print start_form(-onSubmit=>"validateMe(this)");

See the javascript.cgi script for a demonstration of how this all works.
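
Putting these pieces together, a page might look roughly like the sketch below. The bodies of validateAge() and validateMe(), the form action URL, and the button name are all invented for the example. The handler is written as "return validateMe(this)" on the assumption that the browser needs the return value passed through for the submission to be aborted.

```perl
use strict;
use warnings;
use CGI qw(:standard);

# Hypothetical validation functions, declared in the page header.
my $js = <<'END';
function validateAge(field) {
    if (isNaN(parseInt(field.value, 10))) {
        alert("Please enter your age as a number.");
        field.focus();
    }
}
function validateMe(form) {
    return form.age.value !== "";
}
END

# -script places the functions in a <script> block inside the header.
my $page = join "\n",
    start_html(-title => 'Age check', -script => $js),
    start_form(-action   => '/cgi-bin/age.cgi',
               -onSubmit => "return validateMe(this)"),
    textfield(-name => 'age', -onChange => "validateAge(this)"),
    submit(-name => 'go', -value => 'Send'),
    end_form(),
    end_html();

print header(), $page, "\n";
```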

LIMITED SUPPORT FOR CASCADING STYLE SHEETS

CGI.pm has limited support for HTML3's cascading style sheets (css). To incorporate a stylesheet into your document, pass the start_html() method a -style parameter. The value of this parameter may be a scalar, in which case it is treated as the source URL for the stylesheet, or it may be a hash reference. In the latter case you should provide the hash with one or more of -src or -code. -src points to a URL where an externally-defined stylesheet can be found. -code points to a scalar value to be incorporated into a <style> section. Style definitions in -code override similarly-named ones in -src, hence the name "cascading."

You may also specify the type of the stylesheet by adding the optional -type parameter to the hash pointed to by -style. If not specified, the style defaults to 'text/css'.

To refer to a style within the body of your document, add the -class parameter to any HTML element:

  print h1({-class=>'Fancy'},'Welcome to the Party');

Or define styles on the fly with the -style parameter:

  print h1({-style=>'Color: red;'},'Welcome to Hell');

You may also use the new span() element to apply a style to a section of text:

  print span({-style=>'Color: red;'},
             h1('Welcome to Hell'),
             "Where did that handbasket get to?"
            );

Note that you must import the ":html3" definitions to have the span() method available. Here's a quick and dirty example of using CSS. See the CSS specification at http://www.w3.org/Style/CSS/ for more information.

  use CGI qw/:standard :html3/;

  # here's a stylesheet incorporated directly into the page
  $newStyle=<<END;
  <!--
  P.Tip {
      margin-right: 50pt;
      margin-left: 50pt;
      color: red;
  }
  P.Alert {
      font-size: 30pt;
      font-family: sans-serif;
      color: red;
  }
  -->
  END
  print header();
  print start_html( -title=>'CGI with Style',
                    -style=>{-src=>'http://www.capricorn.com/style/st1.css',
                             -code=>$newStyle}
                  );
  print h1('CGI with Style'),
        p({-class=>'Tip'},
          "Better read the cascading style sheet spec before playing with this!"),
        span({-style=>'color: magenta'},
             "Look Mom, no hands!",
             p(),
             "Whooo wee!"
            );
  print end_html;

Pass an array reference to -code or -src in order to incorporate multiple stylesheets into your document.

Should you wish to incorporate a verbatim stylesheet that includes arbitrary formatting in the header, you may pass a -verbatim tag to the -style hash, as follows:

  print start_html (-style => {-verbatim => '@import url("/server-common/css/'.$cssFile.'");',
                               -src      => '/server-common/css/core.css'});

This will generate an HTML header that contains this:

  <link rel="stylesheet" type="text/css" href="/server-common/css/core.css">
  <style type="text/css">
  @import url("/server-common/css/main.css");
  </style>

Any additional arguments passed in the -style value will be incorporated into the <link> tag. For example:

  start_html(-style=>{-src=>['/styles/print.css','/styles/layout.css'],
                      -media => 'all'});

This will give:

  <link rel="stylesheet" type="text/css" href="/styles/print.css" media="all"/>
  <link rel="stylesheet" type="text/css" href="/styles/layout.css" media="all"/>

To make more complicated <link> tags, use the Link() function and pass it to start_html() in the -head argument, as in:

  @h = (Link({-rel=>'stylesheet',-type=>'text/css',-href=>'/ss/ss.css',-media=>'all'}),
        Link({-rel=>'stylesheet',-type=>'text/css',-href=>'/ss/fred.css',-media=>'paper'}));
  print start_html({-head=>\@h});

To create a primary and an "alternate" stylesheet, use the -alternate option:

  start_html(-style=>{-src=>[
                        {-src=>'/styles/print.css'},
                        {-src=>'/styles/alt.css',-alternate=>1}
                      ]
                     });

DEBUGGING

If you are running the script from the command line or in the perl debugger, you can pass the script a list of keywords or parameter=value pairs on the command line or from standard input (you don't have to worry about tricking your script into reading from environment variables). You can pass keywords like this:

  your_script.pl keyword1 keyword2 keyword3

or this:

  your_script.pl keyword1+keyword2+keyword3

or this:

  your_script.pl name1=value1 name2=value2

or this:

  your_script.pl name1=value1&name2=value2

To turn off this feature, use the -no_debug pragma.

To test the POST method, you may enable full debugging with the -debug pragma. This will allow you to feed newline-delimited name=value pairs to the script on standard input.

When debugging, you can use quotes and backslashes to escape characters in the familiar shell manner, letting you place spaces and other funny characters in your parameter=value pairs:

  your_script.pl "name1='I am a long value'" "name2=two\ words"

Finally, you can set the path info for the script by prefixing the first name/value parameter with the path followed by a question mark (?):

  your_script.pl /your/path/here?name1=value1&name2=value2
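
The command-line behaviour can also be imitated from inside a test script by setting @ARGV before the CGI object is created. This is only a sketch: it assumes the script is not running under a web server (so REQUEST_METHOD is unset) and that the -no_debug pragma has not been used.

```perl
use strict;
use warnings;
use CGI;

# Equivalent to running: your_script.pl name1=value1 name2=value2
local @ARGV = ('name1=value1', 'name2=value2');

my $q = CGI->new;

# Collect each parameter in scalar context.
my %value = map { $_ => scalar $q->param($_) } qw(name1 name2);

print "$_ = $value{$_}\n" for sort keys %value;
```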

DUMPING OUT ALL THE NAME/VALUE PAIRS

The Dump() method produces a string consisting of all the query's name/value pairs formatted nicely as a nested list. This is useful for debugging purposes:

  print Dump

Produces something that looks like:

  <ul>
  <li>name1
      <ul>
      <li>value1
      <li>value2
      </ul>
  <li>name2
      <ul>
      <li>value1
      </ul>
  </ul>

As a shortcut, you can interpolate the entire CGI object into a string and it will be replaced with the nice HTML dump shown above:

  $query=CGI->new;
  print "<h2>Current Values</h2> $query\n";

FETCHING ENVIRONMENT VARIABLES

Some of the more useful environment variables can be fetched through this interface. The methods are as follows:

  • Accept()

    Return a list of MIME types that the remote browser accepts. If you give this method a single argument corresponding to a MIME type, as in Accept('text/html'), it will return a floating point value corresponding to the browser's preference for this type from 0.0 (don't want) to 1.0. Glob types (e.g. text/*) in the browser's accept list are handled correctly.

    Note that the capitalization changed between version 2.43 and 2.44 in order to avoid conflict with Perl's accept() function.

  • raw_cookie()

    Returns the HTTP_COOKIE variable. Cookies have a special format, and this method call just returns the raw form (?cookie dough). See cookie() for ways of setting and retrieving cooked cookies.

    Called with no parameters, raw_cookie() returns the packed cookie structure. You can separate it into individual cookies by splitting on the character sequence "; ". Called with the name of a cookie, retrieves the unescaped form of the cookie. You can use the regular cookie() method to get the names, or use the raw_fetch() method from the CGI::Cookie module.

  • user_agent()

    Returns the HTTP_USER_AGENT variable. If you give this method a single argument, it will attempt to pattern match on it, allowing you to do something like user_agent('Mozilla');

  • path_info()

    Returns additional path information from the script URL. For example, fetching /cgi-bin/your_script/additional/stuff will result in path_info() returning "/additional/stuff".

    NOTE: The Microsoft Internet Information Server is broken with respect to additional path information. If you use the Perl DLL library, the IIS server will attempt to execute the additional path information as a Perl script. If you use the ordinary file associations mapping, the path information will be present in the environment, but incorrect. The best thing to do is to avoid using additional path information in CGI scripts destined for use with IIS.

  • path_translated()

    As per path_info() but returns the additional path information translated into a physical path, e.g. "/usr/local/etc/httpd/htdocs/additional/stuff".

    The Microsoft IIS is broken with respect to the translated path as well.

  • remote_host()

    Returns either the remote host name or, if the former is unavailable, the IP address.

  • remote_addr()

    Returns the remote host IP address, or 127.0.0.1 if the address is unavailable.

  • script_name()

    Return the script name as a partial URL, for self-referring scripts.

  • referer()

    Return the URL of the page the browser was viewing prior to fetching your script. Not available for all browsers.

  • auth_type ()

    Return the authorization/verification method in use for this script, if any.

  • server_name ()

    Returns the name of the server, usually the machine's host name.

  • virtual_host ()

    When using virtual hosts, returns the name of the host that the browser attempted to contact.

  • server_port ()

    Return the port that the server is listening on.

  • virtual_port ()

    Like server_port() except that it takes virtual hosts into account. Use this when running with virtual hosts.

  • server_software ()

    Returns the server software and version number.

  • remote_user ()

    Return the authorization/verification name used for user verification, if this script is protected.

  • user_name ()

    Attempt to obtain the remote user's name, using a variety of different techniques. This only works with older browsers such as Mosaic. Newer browsers do not report the user name for privacy reasons!

  • request_method()

    Returns the method used to access your script, usually one of 'POST', 'GET' or 'HEAD'.

  • content_type()

    Returns the content_type of data submitted in a POST, generally multipart/form-data or application/x-www-form-urlencoded.

  • http()

    Called with no arguments, returns the list of HTTP environment variables, including such things as HTTP_USER_AGENT, HTTP_ACCEPT_LANGUAGE, and HTTP_ACCEPT_CHARSET, corresponding to the like-named HTTP header fields in the request. Called with the name of an HTTP header field, returns its value. Capitalization and the use of hyphens versus underscores are not significant.

    For example, all three of these examples are equivalent:

    1. $requested_language = http('Accept-language');
    2. $requested_language = http('Accept_language');
    3. $requested_language = http('HTTP_ACCEPT_LANGUAGE');
  • https()

    The same as http(), but operates on the HTTPS environment variables present when the SSL protocol is in effect. Can be used to determine whether SSL is turned on.
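The name matching described for http() can be illustrated without CGI.pm. The following is a minimal sketch, assuming the usual CGI convention that a request header arrives in an environment variable named HTTP_ plus the upper-cased field name with hyphens turned into underscores. The helper header_env_name() is hypothetical and is not part of CGI.pm.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of CGI.pm: map a header field name to the
# environment variable a CGI server would conventionally set for it.
sub header_env_name {
    my ($field) = @_;
    (my $name = uc $field) =~ tr/-/_/;              # case and '-' vs '_' are not significant
    $name = "HTTP_$name" unless $name =~ /^HTTP_/;  # add the prefix only when it is missing
    return $name;
}

# All three spellings from the text map to the same variable name.
print header_env_name($_), "\n"
    for 'Accept-language', 'Accept_language', 'HTTP_ACCEPT_LANGUAGE';
```

Looking the resulting name up in %ENV would then give the header value, which is roughly what a call such as http('Accept-language') saves you from doing by hand.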

USING NPH SCRIPTS

NPH, or "no-parsed-header", scripts bypass the server completely by sending the complete HTTP header directly to the browser. This has slight performance benefits, but is of most use for taking advantage of HTTP extensions that are not directly supported by your server, such as server push and PICS headers.

Servers use a variety of conventions for designating CGI scripts as NPH. Many Unix servers look at the beginning of the script's name for the prefix "nph-". The Macintosh WebSTAR server and Microsoft's Internet Information Server, in contrast, try to decide whether a program is an NPH script by examining the first line of script output.

CGI.pm supports NPH scripts with a special NPH mode. When in this mode, CGI.pm will output the necessary extra header information when the header() and redirect() methods are called.

The Microsoft Internet Information Server requires NPH mode. As of version 2.30, CGI.pm will automatically detect when the script is running under IIS and put itself into this mode. You do not need to do this manually, although it won't hurt anything if you do. However, note that if you have applied Service Pack 6, much of the functionality of NPH scripts, including the ability to redirect while setting a cookie, does not work at all on IIS without a special patch from Microsoft. See the Microsoft knowledge base article "Non-Parsed Headers Stripped From CGI Applications That Have nph- Prefix in Name": http://web.archive.org/web/20010812012030/http://support.microsoft.com/support/kb/articles/Q280/3/41.ASP

CGI.pm can be put into NPH mode in any of the following ways:

  • In the use statement

    Simply add the "-nph" pragma to the list of symbols to be imported into your script:

        use CGI qw(:standard -nph);

  • By calling the nph() method

    Call nph() with a non-zero parameter at any point after using CGI.pm in your program:

        CGI->nph(1);

  • By using -nph parameters in the header() and redirect() statements

        print header(-nph=>1);

Server Push

CGI.pm provides four simple functions for producing multipart documents of the type needed to implement server push. These functions were graciously provided by Ed Jordan <ed@fidalgo.net>. To import these into your namespace, you must import the ":push" set. You are also advised to put the script into NPH mode and to set $| to 1 to avoid buffering problems.

Here is a simple script that demonstrates server push:

    #!/usr/local/bin/perl
    use CGI qw/:push -nph/;
    $| = 1;
    print multipart_init(-boundary=>'----here we go!');
    for (0 .. 4) {
        print multipart_start(-type=>'text/plain'),
              "The current time is ",scalar(localtime),"\n";
        if ($_ < 4) {
            print multipart_end;
        } else {
            print multipart_final;
        }
        sleep 1;
    }

This script initializes server push by calling multipart_init(). It then enters a loop in which it begins a new multipart section by calling multipart_start(), prints the current local time, and ends a multipart section with multipart_end(). It then sleeps a second, and begins again. On the final iteration, it ends the multipart section with multipart_final() rather than with multipart_end().

  • multipart_init()

        multipart_init(-boundary=>$boundary);

    Initialize the multipart system. The -boundary argument specifies what MIME boundary string to use to separate parts of the document. If not provided, CGI.pm chooses a reasonable boundary for you.

  • multipart_start()

        multipart_start(-type=>$type)

    Start a new part of the multipart document using the specified MIME type. If not specified, text/html is assumed.

  • multipart_end()

        multipart_end()

    End a part. You must remember to call multipart_end() once for each multipart_start(), except at the end of the last part of the multipart document when multipart_final() should be called instead of multipart_end().

  • multipart_final()

        multipart_final()

    End all parts. You should call multipart_final() rather than multipart_end() at the end of the last part of the multipart document.

Users interested in server push applications should also have a look at the CGI::Push module.
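To make the role of the boundary string more concrete, here is a minimal sketch of generic MIME multipart framing, written without CGI.pm. It assumes the multipart/x-mixed-replace content type commonly used for server push; the helper frame_parts() is hypothetical, and the bytes CGI.pm itself emits may differ in detail. An NPH script would also send an HTTP status line first.

```perl
use strict;
use warnings;

# Hypothetical helper: frame a list of text bodies as one
# multipart/x-mixed-replace document. Generic MIME framing only.
sub frame_parts {
    my ($boundary, @bodies) = @_;
    my $crlf = "\015\012";
    my $out  = qq{Content-Type: multipart/x-mixed-replace;boundary="$boundary"}
             . $crlf . $crlf;
    for my $body (@bodies) {
        $out .= "--$boundary$crlf"                     # delimiter opens each part
              . "Content-Type: text/plain$crlf$crlf"   # part header, then a blank line
              . $body . $crlf;
    }
    $out .= "--$boundary--$crlf";                      # closing delimiter ends the document
    return $out;
}

print frame_parts('frame', 'tick 1', 'tick 2');
```

In this picture each part opens with a delimiter line and the whole document ends with a closing delimiter, which is why the last part is finished with multipart_final() rather than multipart_end().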

Avoiding Denial of Service Attacks

A potential problem with CGI.pm is that, by default, it attempts to process form POSTings no matter how large they are. A wily hacker could attack your site by sending a CGI script a huge POST of many megabytes. CGI.pm will attempt to read the entire POST into a variable, growing hugely in size until it runs out of memory. While the script attempts to allocate the memory the system may slow down dramatically. This is a form of denial of service attack.

Another possible attack is for the remote user to force CGI.pm to accept a huge file upload. CGI.pm will accept the upload and store it in a temporary directory even if your script doesn't expect to receive an uploaded file. CGI.pm will delete the file automatically when it terminates, but in the meantime the remote user may have filled up the server's disk space, causing problems for other programs.

The best way to avoid denial of service attacks is to limit the amount of memory, CPU time and disk space that CGI scripts can use. Some Web servers come with built-in facilities to accomplish this. In other cases, you can use the shell limit or ulimit commands to put ceilings on CGI resource usage.

CGI.pm also has some simple built-in protections against denial of service attacks, but you must activate them before you can use them. These take the form of two global variables in the CGI name space:

  • $CGI::POST_MAX

    If set to a non-negative integer, this variable puts a ceiling on the size of POSTings, in bytes. If CGI.pm detects a POST that is greater than the ceiling, it will immediately exit with an error message. This value will affect both ordinary POSTs and multipart POSTs, meaning that it limits the maximum size of file uploads as well. You should set this to a reasonably high value, such as 1 megabyte.

  • $CGI::DISABLE_UPLOADS

    If set to a non-zero value, this will disable file uploads completely. Other fill-out form values will work as usual.

You can use these variables in either of two ways.

  • 1. On a script-by-script basis

    Set the variable at the top of the script, right after the "use" statement:

        use CGI qw/:standard/;
        use CGI::Carp 'fatalsToBrowser';
        $CGI::POST_MAX = 1024 * 100;  # max 100K posts
        $CGI::DISABLE_UPLOADS = 1;    # no uploads

  • 2. Globally for all scripts

    Open up CGI.pm, find the definitions for $POST_MAX and $DISABLE_UPLOADS, and set them to the desired values. You'll find them towards the top of the file in a subroutine named initialize_globals().

An attempt to send a POST larger than $POST_MAX bytes will cause param() to return an empty CGI parameter list. You can test for this event by checking cgi_error(), either after you create the CGI object or, if you are using the function-oriented interface, after you call param() for the first time. If the POST was intercepted, then cgi_error() will return the message "413 POST too large".

This error message is actually defined by the HTTP protocol, and is designed to be returned to the browser as the CGI script's status code. For example:

    $uploaded_file = param('upload');
    if (!$uploaded_file && cgi_error()) {
        print header(-status=>cgi_error());
        exit 0;
    }

However, it isn't clear that any browser currently knows what to do with this status code. It might be better just to create an HTML page that warns the user of the problem.
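A warning page of the kind suggested here could look like the following minimal sketch. The helper error_page() is hypothetical and uses plain string handling only; in a real script its argument would be whatever cgi_error() returned, and the page could equally be built with CGI.pm's own HTML functions.

```perl
use strict;
use warnings;

# Hypothetical helper: build a small HTML warning page from an error string
# such as "413 POST too large".
sub error_page {
    my ($error) = @_;
    # Escape HTML metacharacters before echoing the message back.
    (my $safe = $error) =~ s/([<>&"])/sprintf '&#%d;', ord $1/ge;
    return "Content-Type: text/html\r\n\r\n"
         . "<html><head><title>Upload problem</title></head><body>\n"
         . "<h1>Your submission could not be processed</h1>\n"
         . "<p>The server reported: <strong>$safe</strong></p>\n"
         . "</body></html>\n";
}

print error_page('413 POST too large');
```

Because the page is sent as an ordinary response, the user sees a readable explanation even if the browser does nothing special with the 413 status.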

COMPATIBILITY WITH CGI-LIB.PL

To make it easier to port existing programs that use cgi-lib.pl, the compatibility routine "ReadParse" is provided. Porting is simple:

OLD VERSION

    require "cgi-lib.pl";
    &ReadParse;
    print "The value of the antique is $in{antique}.\n";

NEW VERSION

    use CGI;
    CGI::ReadParse();
    print "The value of the antique is $in{antique}.\n";

CGI.pm's ReadParse() routine creates a tied variable named %in, which can be accessed to obtain the query variables. As with cgi-lib.pl's ReadParse, you can also provide your own variable. Infrequently used features of ReadParse, such as the creation of @in and $in variables, are not supported.

Once you use ReadParse, you can retrieve the query object itself this way:

    $q = $in{CGI};
    print $q->textfield(-name=>'wow',
                        -value=>'does this really work?');

This allows you to start using the more interesting features of CGI.pm without rewriting your old scripts from scratch.

An even simpler way to mix cgi-lib calls with CGI.pm calls is to import both the :cgi-lib and :standard method sets:

    use CGI qw(:cgi-lib :standard);
    &ReadParse;
    print "The price of your purchase is $in{price}.\n";
    print textfield(-name=>'price', -default=>'$1.99');

Cgi-lib functions that are available in CGI.pm

In compatibility mode, the following cgi-lib.pl functions are available for your use:

    ReadParse()
    PrintHeader()
    HtmlTop()
    HtmlBot()
    SplitParam()
    MethGet()
    MethPost()

Cgi-lib functions that are not available in CGI.pm

    * Extended form of ReadParse()
        The extended form of ReadParse() that provides for file upload
        spooling is not available.

    * MyBaseURL()
        This function is not available. Use CGI.pm's url() method instead.

    * MyFullURL()
        This function is not available. Use CGI.pm's self_url() method
        instead.

    * CgiError(), CgiDie()
        These functions are not supported. Look at CGI::Carp for the way I
        prefer to handle error messages.

    * PrintVariables()
        This function is not available. To achieve the same effect,
        just print out the CGI object:

            use CGI qw(:standard);
            $q = CGI->new;
            print h1("The Variables Are"),$q;

    * PrintEnv()
        This function is not available. You'll have to roll your own if you really need it.

AUTHOR INFORMATION

The CGI.pm distribution is copyright 1995-2007, Lincoln D. Stein. It is distributed under GPL and the Artistic License 2.0. It is currently maintained by Mark Stosberg with help from many contributors.

Address bug reports and comments to: https://rt.cpan.org/Public/Dist/Display.html?Queue=CGI.pm When sending bug reports, please provide the version of CGI.pm, the version of Perl, the name and version of your Web server, and the name and version of the operating system you are using. If the problem is even remotely browser dependent, please provide information about the affected browsers as well.

CREDITS

Thanks very much to:

  • Matt Heffron (heffron@falstaff.css.beckman.com)
  • James Taylor (james.taylor@srs.gov)
  • Scott Anguish <sanguish@digifix.com>
  • Mike Jewell (mlj3u@virginia.edu)
  • Timothy Shimmin (tes@kbs.citri.edu.au)
  • Joergen Haegg (jh@axis.se)
  • Laurent Delfosse (delfosse@delfosse.com)
  • Richard Resnick (applepi1@aol.com)
  • Craig Bishop (csb@barwonwater.vic.gov.au)
  • Tony Curtis (tc@vcpc.univie.ac.at)
  • Tim Bunce (Tim.Bunce@ig.co.uk)
  • Tom Christiansen (tchrist@convex.com)
  • Andreas Koenig (k@franz.ww.TU-Berlin.DE)
  • Tim MacKenzie (Tim.MacKenzie@fulcrum.com.au)
  • Kevin B. Hendricks (kbhend@dogwood.tyler.wm.edu)
  • Stephen Dahmen (joyfire@inxpress.net)
  • Ed Jordan (ed@fidalgo.net)
  • David Alan Pisoni (david@cnation.com)
  • Doug MacEachern (dougm@opengroup.org)
  • Robin Houston (robin@oneworld.org)
  • ...and many many more...

    for suggestions and bug fixes.

A COMPLETE EXAMPLE OF A SIMPLE FORM-BASED SCRIPT

    #!/usr/local/bin/perl

    use CGI ':standard';

    print header;
    print start_html("Example CGI.pm Form");
    print "<h1> Example CGI.pm Form</h1>\n";
    print_prompt();
    do_work();
    print_tail();
    print end_html;

    # Print the fill-out form.
    sub print_prompt {
        print start_form;
        print "<em>What's your name?</em><br>";
        print textfield('name');
        print checkbox('Not my real name');

        print "<p><em>Where can you find English Sparrows?</em><br>";
        print checkbox_group(
                  -name=>'Sparrow locations',
                  -values=>['England','France','Spain','Asia','Hoboken'],
                  -linebreak=>'yes',
                  -defaults=>['England','Asia']);

        print "<p><em>How far can they fly?</em><br>",
              radio_group(
                  -name=>'how far',
                  -values=>['10 ft','1 mile','10 miles','real far'],
                  -default=>'1 mile');

        print "<p><em>What's your favorite color?</em> ";
        print popup_menu(-name=>'Color',
                         -values=>['black','brown','red','yellow'],
                         -default=>'red');

        print hidden('Reference','Monty Python and the Holy Grail');

        print "<p><em>What have you got there?</em><br>";
        print scrolling_list(
                  -name=>'possessions',
                  -values=>['A Coconut','A Grail','An Icon',
                            'A Sword','A Ticket'],
                  -size=>5,
                  -multiple=>'true');

        print "<p><em>Any parting comments?</em><br>";
        print textarea(-name=>'Comments',
                       -rows=>10,
                       -columns=>50);

        print "<p>",reset;
        print submit('Action','Shout');
        print submit('Action','Scream');
        print end_form;
        print "<hr>\n";
    }

    # Echo every submitted parameter and its values.
    sub do_work {
        print "<h2>Here are the current settings in this form</h2>";

        for my $key (param) {
            print "<strong>$key</strong> -> ";
            my @values = param($key);
            print join(", ",@values),"<br>\n";
        }
    }

    # Print the page footer.
    sub print_tail {
        print <<END;
    <hr>
    <address>Lincoln D. Stein</address><br>
    <a href="/">Home Page</a>
    END
    }

BUGS

Please report them.

SEE ALSO

CGI::Carp - provides a Carp implementation tailored to the CGI environment.

CGI::Fast - supports running CGI applications under FastCGI

CGI::Pretty - pretty prints HTML generated by CGI.pm (with a performance penalty)

CORE

Perl 5 version 18.2 documentation

NAME

CORE - Namespace for Perl's core routines

SYNOPSIS

    BEGIN {
        *CORE::GLOBAL::hex = sub { 1; };
    }

    print hex("0x50"),"\n";                     # prints 1
    print CORE::hex("0x50"),"\n";               # prints 80
    CORE::say "yes";                            # prints yes

    BEGIN { *shove = \&CORE::push; }
    shove @array, 1,2,3;                        # pushes on to @array

DESCRIPTION

The CORE namespace gives access to the original built-in functions of Perl. The CORE package is built into Perl, and therefore you do not need to use or require a hypothetical "CORE" module prior to accessing routines in this namespace.

A list of the built-in functions in Perl can be found in perlfunc.

For all Perl keywords, a CORE:: prefix will force the built-in function to be used, even if it has been overridden or would normally require the feature pragma. Despite appearances, this has nothing to do with the CORE package, but is part of Perl's syntax.

For many Perl functions, the CORE package contains real subroutines. This feature is new in Perl 5.16. You can take references to these and make aliases. However, some can only be called as barewords; i.e., you cannot use ampersand syntax (&foo) or call them through references. See the shove example above. These subroutines exist for all keywords except the following:

__DATA__, __END__, and, cmp, default, do, dump, else, elsif, eq, eval, for, foreach, format, ge, given, goto, grep, gt, if, last, le, local, lt, m, map, my, ne, next, no, or, our, package, print, printf, q, qq, qr, qw, qx, redo, require, return, s, say, sort, state, sub, tr, unless, until, use, when, while, x, xor, y

Calling with ampersand syntax and through references does not work for the following functions, as they have special syntax that cannot always be translated into a simple list (e.g., eof vs eof()):

chdir, chomp, chop, defined, delete, each, eof, exec, exists, keys, lstat, pop, push, shift, splice, split, stat, system, truncate, unlink, unshift, values
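For a function that is not on either of the lists above, aliasing and calling through a reference might look like the following minimal sketch. It assumes Perl 5.16 or later; the names shout and $lower are made up for the illustration.

```perl
use strict;
use warnings;

BEGIN { *shout = \&CORE::uc; }      # alias: shout() now means uc()

my $lower = \&CORE::lc;             # plain code reference to a core sub

print shout("hello"), "\n";         # called like any user-defined sub
print $lower->("WORLD"), "\n";      # called through the reference
print join(",", map { $lower->($_) } qw(A B C)), "\n";
```

Functions on the second list, such as push, can still be aliased as in the shove example, but the alias has to be called as a bareword rather than through a reference.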

OVERRIDING CORE FUNCTIONS

To override a Perl built-in routine with your own version, you need to import it at compile-time. This can be conveniently achieved with the subs pragma. This will affect only the package in which you've imported the said subroutine:

    use subs 'chdir';
    sub chdir { ... }
    chdir $somewhere;

To override a built-in globally (that is, in all namespaces), you need to import your function into the CORE::GLOBAL pseudo-namespace at compile time:

    BEGIN {
        *CORE::GLOBAL::hex = sub {
            # ... your code here
        };
    }

The new routine will be called whenever a built-in function is called without a qualifying package:

    print hex("0x50"),"\n";                     # prints 1

In both cases, if you want access to the original, unaltered routine, use the CORE:: prefix:

    print CORE::hex("0x50"),"\n";               # prints 80
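The package-scoped form of overriding can be tried out in a few lines. This is a minimal sketch, assuming a made-up package LenientHex whose hex() accepts an optional leading '#'; the package name and the parse() helper are illustrative only.

```perl
use strict;
use warnings;

package LenientHex;
use subs 'hex';                     # predeclare, so our hex() overrides the built-in here

# Hypothetical override: accept an optional leading '#', as in "#ff".
sub hex {
    my ($text) = @_;
    $text =~ s/^#//;
    return CORE::hex($text);        # CORE:: reaches the original built-in
}

sub parse { return hex($_[0]) }     # resolves to LenientHex::hex

package main;

print LenientHex::parse("#ff"), "\n";   # goes through the override
print hex("ff"), "\n";                  # package main still uses the built-in
```

Calling CORE::hex inside the override is what keeps it from calling itself, and the override stays confined to the package that declared it.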

AUTHOR

This documentation provided by Tels <nospam-abuse@bloodgate.com> 2007.

SEE ALSO

perlsub, perlfunc.

 
CPAN

Perl 5 version 18.2 documentation

NAME

CPAN - query, download and build perl modules from CPAN sites

SYNOPSIS

Interactive mode:

    perl -MCPAN -e shell

--or--

    cpan

Basic commands:

    # Modules:

    cpan> install Acme::Meta                       # in the shell

    CPAN::Shell->install("Acme::Meta");            # in perl

    # Distributions:

    cpan> install NWCLARK/Acme-Meta-0.02.tar.gz    # in the shell

    CPAN::Shell->
      install("NWCLARK/Acme-Meta-0.02.tar.gz");    # in perl

    # module objects:

    $mo = CPAN::Shell->expandany($mod);
    $mo = CPAN::Shell->expand("Module",$mod);      # same thing

    # distribution objects:

    $do = CPAN::Shell->expand("Module",$mod)->distribution;
    $do = CPAN::Shell->expandany($distro);         # same thing
    $do = CPAN::Shell->expand("Distribution",
                              $distro);            # same thing

DESCRIPTION

The CPAN module automates or at least simplifies the make and install of perl modules and extensions. It includes some primitive searching capabilities and knows how to use LWP, HTTP::Tiny, Net::FTP and certain external download clients to fetch distributions from the net.

These are fetched from one or more mirrored CPAN (Comprehensive Perl Archive Network) sites and unpacked in a dedicated directory.

The CPAN module also supports named and versioned bundles of modules. Bundles simplify handling of sets of related modules. See Bundles below.

The package contains a session manager and a cache manager. The session manager keeps track of what has been fetched, built, and installed in the current session. The cache manager keeps track of the disk space occupied by the make processes and deletes excess space using a simple FIFO mechanism.

All methods provided are accessible in a programmer style and in an interactive shell style.

CPAN::shell([$prompt, $command]) Starting Interactive Mode

Enter interactive mode by running

    perl -MCPAN -e shell

or

    cpan

which puts you into a readline interface. If Term::ReadKey and either of Term::ReadLine::Perl or Term::ReadLine::Gnu are installed, history and command completion are supported.

Once at the command line, type h for a one-page help screen; the rest should be self-explanatory.

The function call shell takes two optional arguments: one the prompt, the second the default initial command line (the latter only works if a real ReadLine interface module is installed).

The most common uses of the interactive modes are

  • Searching for authors, bundles, distribution files and modules

    There are corresponding one-letter commands a, b, d, and m for each of the four categories and another, i, for any of the mentioned four. Each of the four entities is implemented as a class with slightly differing methods for displaying an object.

    Arguments to these commands are either strings exactly matching the identification string of an object, or regular expressions matched case-insensitively against various attributes of the objects. The parser only recognizes a regular expression when you enclose it with slashes.

    The principle is that the number of objects found influences how an item is displayed. If the search finds one item, the result is displayed with the rather verbose method as_string, but if more than one is found, each object is displayed with the terse method as_glimpse.

    Examples:

        cpan> m Acme::MetaSyntactic
        Module id = Acme::MetaSyntactic
            CPAN_USERID  BOOK (Philippe Bruhat (BooK) <[...]>)
            CPAN_VERSION 0.99
            CPAN_FILE    B/BO/BOOK/Acme-MetaSyntactic-0.99.tar.gz
            UPLOAD_DATE  2006-11-06
            MANPAGE      Acme::MetaSyntactic - Themed metasyntactic variables names
            INST_FILE    /usr/local/lib/perl/5.10.0/Acme/MetaSyntactic.pm
            INST_VERSION 0.99
        cpan> a BOOK
        Author id = BOOK
            EMAIL        [...]
            FULLNAME     Philippe Bruhat (BooK)
        cpan> d BOOK/Acme-MetaSyntactic-0.99.tar.gz
        Distribution id = B/BO/BOOK/Acme-MetaSyntactic-0.99.tar.gz
            CPAN_USERID  BOOK (Philippe Bruhat (BooK) <[...]>)
            CONTAINSMODS Acme::MetaSyntactic Acme::MetaSyntactic::Alias [...]
            UPLOAD_DATE  2006-11-06
        cpan> m /lorem/
        Module  = Acme::MetaSyntactic::loremipsum (BOOK/Acme-MetaSyntactic-0.99.tar.gz)
        Module    Text::Lorem            (ADEOLA/Text-Lorem-0.3.tar.gz)
        Module    Text::Lorem::More      (RKRIMEN/Text-Lorem-More-0.12.tar.gz)
        Module    Text::Lorem::More::Source (RKRIMEN/Text-Lorem-More-0.12.tar.gz)
        cpan> i /berlin/
        Distribution    BEATNIK/Filter-NumberLines-0.02.tar.gz
        Module  = DateTime::TimeZone::Europe::Berlin (DROLSKY/DateTime-TimeZone-0.7904.tar.gz)
        Module    Filter::NumberLines    (BEATNIK/Filter-NumberLines-0.02.tar.gz)
        Author          [...]

    The examples illustrate several aspects: the first three queries target modules, authors, or distros directly and yield exactly one result. The last two use regular expressions and yield several results. The last one targets all of bundles, modules, authors, and distros simultaneously. When more than one result is available, they are printed in one-line format.

  • get, make, test, install, clean modules or distributions

    These commands take any number of arguments and investigate what is necessary to perform the action. Argument processing is as follows:

        known module name in format Foo/Bar.pm   module
        other embedded slash                     distribution
          - with trailing slash dot              directory
        enclosing slashes                        regexp
        known module name in format Foo::Bar     module

    If the argument is a distribution file name (recognized by embedded slashes), it is processed. If it is a module, CPAN determines the distribution file in which this module is included and processes that, following any dependencies named in the module's META.yml or Makefile.PL (this behavior is controlled by the configuration parameter prerequisites_policy). If an argument is enclosed in slashes it is treated as a regular expression: it is expanded and if the result is a single object (distribution, bundle or module), this object is processed.

    Example:

        install Dummy::Perl                   # installs the module
        install AUXXX/Dummy-Perl-3.14.tar.gz  # installs that distribution
        install /Dummy-Perl-3.14/             # same if the regexp is unambiguous

    get downloads a distribution file and untars or unzips it, make builds it, test runs the test suite, and install installs it.

    Any make or test is run unconditionally. An

        install <distribution_file>

    is also run unconditionally. But for

        install <module>

    CPAN checks whether an install is needed and prints "module up to date" if the distribution file containing the module doesn't need updating.

    CPAN also keeps track of what it has done within the current session and doesn't try to build a package a second time regardless of whether it succeeded or not. It does not repeat a test run if the test has been run successfully before. Same for install runs.

    The force pragma may precede another command (currently: get, make, test, or install) to execute the command from scratch and attempt to continue past certain errors. See the section below on the force and the fforce pragma.

    The notest pragma skips the test part in the build process.

    Example:

        cpan> notest install Tk

    A clean command results in a

        make clean

    being executed within the distribution file's working directory.

  • readme, perldoc, look module or distribution

    readme displays the README file of the associated distribution. look gets and untars (if not yet done) the distribution file, changes to the appropriate directory and opens a subshell process in that directory. perldoc displays the module's pod documentation in HTML or plain text format.

  • ls author
  • ls globbing_expression

    The first form lists all distribution files in and below an author's CPAN directory as stored in the CHECKSUMS files distributed on CPAN. The listing recurses into subdirectories.

    The second form limits or expands the output with shell globbing as in the following examples:

        ls JV/make*
        ls GSAR/*make*
        ls */*make*

    The last example is very slow and outputs extra progress indicators that break the alignment of the result.

    Note that globbing only lists directories explicitly asked for, for example FOO/* will not list FOO/bar/Acme-Sthg-n.nn.tar.gz. This may be regarded as a bug that may be changed in some future version.

  • failed

    The failed command reports all distributions that failed on one of make, test or install for some reason in the currently running shell session.

  • Persistence between sessions

    If the YAML or the YAML::Syck module is installed a record of the internal state of all modules is written to disk after each step. The files contain a signature of the currently running perl version for later perusal.

    If the configuration variable build_dir_reuse is set to a true value, then CPAN.pm reads the collected YAML files. If the stored signature matches the currently running perl, the stored state is loaded into memory such that persistence between sessions is effectively established.

  • The force and the fforce pragma

    To speed things up in complex installation scenarios, CPAN.pm keeps track of what it has already done and refuses to do some things a second time. A get, a make, and an install are not repeated. A test is repeated only if the previous test was unsuccessful. The diagnostic message when CPAN.pm refuses to do something a second time is one of Has already been unwrapped|made|tested successfully or something similar. Another situation where CPAN refuses to act is an install if the corresponding test was not successful.

    In all these cases, the user can override this stubborn behaviour by prepending the command with the word force, for example:

        cpan> force get Foo
        cpan> force make AUTHOR/Bar-3.14.tar.gz
        cpan> force test Baz
        cpan> force install Acme::Meta

    Each forced command is executed with the corresponding part of its memory erased.

    The fforce pragma is a variant that emulates a force get which erases the entire memory followed by the action specified, effectively restarting the whole get/make/test/install procedure from scratch.

  • Lockfile

    Interactive sessions maintain a lockfile, by default ~/.cpan/.lock. Batch jobs can run without a lockfile and not disturb each other.

    The shell offers to run in downgraded mode when another process is holding the lockfile. This is an experimental feature that is not yet tested very well. This second shell then does not write the history file, does not use the metadata file, and has a different prompt.

  • Signals

    CPAN.pm installs signal handlers for SIGINT and SIGTERM. While you are in the cpan-shell, it is intended that you can press ^C anytime and return to the cpan-shell prompt. A SIGTERM will cause the cpan-shell to clean up and leave the shell loop. You can emulate the effect of a SIGTERM by sending two consecutive SIGINTs, which usually means by pressing ^C twice.

    CPAN.pm ignores SIGPIPE. If the user sets inactivity_timeout, a SIGALRM is used during the run of the perl Makefile.PL or perl Build.PL subprocess. A SIGALRM is also used during module version parsing, and is controlled by version_timeout.
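The argument-processing table given above for get, make, test, install and clean can be read as a sequence of pattern checks. The following is a minimal sketch of that reading; classify_argument() is a hypothetical helper, not CPAN.pm's own code, and it simplifies "known module name" to a purely syntactic test, whereas the table implies a lookup among known modules.

```perl
use strict;
use warnings;

# Hypothetical classifier following the argument-processing table.
sub classify_argument {
    my ($arg) = @_;
    return 'regexp'       if $arg =~ m{^/.*/$};               # enclosing slashes
    return 'directory'    if $arg =~ m{/\.$};                 # trailing slash dot
    return 'module'       if $arg =~ m{^\w+(?:/\w+)*\.pm$};   # Foo/Bar.pm form
    return 'distribution' if $arg =~ m{/};                    # any other embedded slash
    return 'module';                                          # Foo::Bar form
}

printf "%-32s %s\n", $_, classify_argument($_)
    for 'Dummy::Perl', 'Foo/Bar.pm',
        'AUXXX/Dummy-Perl-3.14.tar.gz', '/Dummy-Perl-3.14/';
```

The order of the checks matters in this sketch: the enclosing-slash test has to come before the generic embedded-slash test, or every regular expression would be taken for a distribution.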

CPAN::Shell

The commands available in the shell interface are methods in the package CPAN::Shell. When you enter a command at the shell prompt, your input is split by the Text::ParseWords::shellwords() routine, which acts like most shells do. The first word is interpreted as the method to be invoked, and the rest of the words are treated as the method's arguments. Continuation lines are supported by ending a line with a literal backslash.
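The splitting step can be tried on its own, since Text::ParseWords ships with Perl. This is a minimal sketch; parse_command() is a hypothetical wrapper that only separates the method name from its arguments and does not dispatch anything.

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Split one shell line as the text describes: the first word names the
# method, the remaining words become its arguments.
sub parse_command {
    my ($line) = @_;
    my ($method, @args) = shellwords($line);
    return ($method, @args);
}

my ($method, @args) =
    parse_command(q{install "Acme::Meta" NWCLARK/Acme-Meta-0.02.tar.gz});
print "method: $method\n";
print "arg:    $_\n" for @args;
```

Quoting works as it would in a shell, so the quotes around Acme::Meta are removed before the word is handed over as an argument.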

autobundle

autobundle writes a bundle file into the $CPAN::Config->{cpan_home}/Bundle directory. The file contains a list of all modules that are both available from CPAN and currently installed within @INC. Duplicates of each distribution are suppressed. The name of the bundle file is based on the current date and a counter, e.g. Bundle/Snapshot_2012_05_21_00.pm. The bundle can be installed again by running cpan Bundle::Snapshot_2012_05_21_00, or by installing Bundle::Snapshot_2012_05_21_00 from the CPAN shell.

Return value: path to the written file.

hosts

Note: this feature is still in alpha state and may change in future versions of CPAN.pm

This command provides a statistical overview of recent download activities. The data for this is collected in the YAML file FTPstats.yml in your cpan_home directory. If no YAML module is configured or YAML is not installed, no stats are provided.

  • install_tested

    Install all distributions that have been tested successfully but have not yet been installed. See also is_tested.

  • is_tested

    List all build directories of distributions that have been tested successfully but have not yet been installed. See also install_tested.

mkmyconfig

mkmyconfig() writes your own CPAN::MyConfig file into your ~/.cpan/ directory so that you can save your own preferences instead of the system-wide ones.

r [Module|/Regexp/]...

Scans the current perl installation for modules that have a newer version available on CPAN and provides a list of them. If called without arguments, all potential upgrades are listed; if called with arguments, the list is filtered to the modules and regexps given as arguments.

The listing looks something like this:

    Package namespace         installed    latest  in CPAN file
    CPAN                        1.94_64    1.9600  ANDK/CPAN-1.9600.tar.gz
    CPAN::Reporter               1.1801    1.1902  DAGOLDEN/CPAN-Reporter-1.1902.tar.gz
    YAML                           0.70      0.73  INGY/YAML-0.73.tar.gz
    YAML::Syck                     1.14      1.17  AVAR/YAML-Syck-1.17.tar.gz
    YAML::Tiny                     1.44      1.50  ADAMK/YAML-Tiny-1.50.tar.gz
    CGI                            3.43      3.55  MARKSTOS/CGI.pm-3.55.tar.gz
    Module::Build::YAML            1.40      1.41  DAGOLDEN/Module-Build-0.3800.tar.gz
    TAP::Parser::Result::YAML      3.22      3.23  ANDYA/Test-Harness-3.23.tar.gz
    YAML::XS                       0.34      0.35  INGY/YAML-LibYAML-0.35.tar.gz

It suppresses duplicates in the "in CPAN file" column such that distributions with many upgradeable modules are listed only once.

Note that the list is not sorted.
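The comparison behind each row (is the installed version older than the latest one?) can be sketched with CPAN::Version, the version-comparison helper that ships with CPAN.pm. The version pairs below are copied from the sample listing; whether the r command performs exactly this call internally is an assumption.

```perl
use strict;
use warnings;
use CPAN::Version;

# Installed and latest versions copied from the sample listing.
my %seen = (
    'CGI'  => [ '3.43', '3.55' ],
    'YAML' => [ '0.70', '0.73' ],
);

for my $module (sort keys %seen) {
    my ($installed, $latest) = @{ $seen{$module} };
    # vlt() is true when its first argument is the lower version.
    my $state = CPAN::Version->vlt($installed, $latest)
              ? 'upgrade available'
              : 'up to date';
    print "$module $installed -> $latest: $state\n";
}
```

A dedicated comparison routine is worth using here because version strings such as 1.94_64 cannot be compared reliably as plain numbers or plain strings.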

recent ***EXPERIMENTAL COMMAND***

The recent command downloads a list of recent uploads to CPAN and displays them slowly. While the command is running, a $SIG{INT} exits the loop after displaying the current item.

Note: This command requires XML::LibXML to be installed.

Note: This whole command currently is just a hack and will probably change in future versions of CPAN.pm, but the general approach will likely remain.

Note: See also smoke

recompile

recompile() is a special command that takes no argument and runs the make/test/install cycle with brute force over all installed dynamically loadable extensions (a.k.a. XS modules) with 'force' in effect. The primary purpose of this command is to finish a network installation. Imagine you have a common source tree for two different architectures. You decide to do a completely independent fresh installation. You start on one architecture with the help of a Bundle file produced earlier. CPAN installs the whole Bundle for you, but when you try to repeat the job on the second architecture, CPAN responds with a "Foo up to date" message for all modules. So you invoke CPAN's recompile on the second architecture and you're done.

Another popular use for recompile is to act as a rescue in case your perl breaks binary compatibility. If one of the modules that CPAN uses is in turn depending on binary compatibility (so you cannot run CPAN commands), then you should try the CPAN::Nox module for recovery.

report Bundle|Distribution|Module

The report command temporarily turns on the test_report config variable, then runs the force test command with the given arguments. The force pragma reruns the tests and repeats every step that might have failed before.

smoke ***EXPERIMENTAL COMMAND***

*** WARNING: this command downloads software of completely unknown status from CPAN and executes it on your computer. Never do this with your normal account; preferably use a dedicated, well-separated and secured machine. ***

The smoke command takes the list of recent uploads to CPAN as provided by the recent command and tests them all. While the command is running $SIG{INT} is defined to mean that the current item shall be skipped.

Note: This whole command currently is just a hack and will probably change in future versions of CPAN.pm, but the general approach will likely remain.

Note: See also recent

upgrade [Module|/Regexp/]...

The upgrade command first runs an r command with the given arguments and then installs the newest versions of all modules that were listed by that.

The four CPAN::* Classes: Author, Bundle, Module, Distribution

Although it may be considered internal, the class hierarchy does matter for both users and programmers. CPAN.pm deals with the four classes mentioned above, and those classes all share a set of methods. Classical single polymorphism is in effect. A metaclass object registers all objects of all kinds and indexes them with a string. The strings referencing objects have a separated namespace (well, not completely separated):

    Namespace                         Class

    words containing a "/" (slash)    Distribution
    words starting with Bundle::      Bundle
    everything else                   Module or Author

Modules know their associated Distribution objects. They always refer to the most recent official release. Developers may mark their releases as unstable development versions (by inserting an underbar into the module version number which will also be reflected in the distribution name when you run 'make dist'), so the really hottest and newest distribution is not always the default. If a module Foo circulates on CPAN in both version 1.23 and 1.23_90, CPAN.pm offers a convenient way to install version 1.23 by saying

    install Foo

This would install the complete distribution file (say BAR/Foo-1.23.tar.gz) with all accompanying material. But if you would like to install version 1.23_90, you need to know where the distribution file resides on CPAN relative to the authors/id/ directory. If the author is BAR, this might be BAR/Foo-1.23_90.tar.gz; so you would have to say

    install BAR/Foo-1.23_90.tar.gz

The first example will be driven by an object of the class CPAN::Module, the second by an object of class CPAN::Distribution.
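
The namespace table above can be read as a simple decision procedure. The following is only a minimal sketch of that reading; `guess_class` is a hypothetical helper written for illustration and is not part of CPAN.pm, whose real lookup also has to decide between Module and Author.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of CPAN.pm: guess the class behind a
# shell argument by applying the namespace table literally.
sub guess_class {
    my ($word) = @_;
    return 'Distribution' if $word =~ m{/};         # contains a slash
    return 'Bundle'       if $word =~ /^Bundle::/;  # starts with Bundle::
    return 'Module or Author';                      # everything else
}

for my $word ('Foo', 'BAR/Foo-1.23_90.tar.gz', 'Bundle::CPAN') {
    printf "%-24s %s\n", $word, guess_class($word);
}
```

This is why `install Foo` is handled by a CPAN::Module object while `install BAR/Foo-1.23_90.tar.gz` goes to a CPAN::Distribution object.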

Integrating local directories

Note: this feature is still in alpha state and may change in future versions of CPAN.pm

Distribution objects are normally distributions from the CPAN, but there is a slightly degenerate case for Distribution objects, too, of projects held on the local disk. These distribution objects have the same name as the local directory and end with a dot. A dot by itself is also allowed for the current directory at the time CPAN.pm was used. All actions such as make , test , and install are applied directly to that directory. This gives the command cpan . an interesting touch: while the normal mantra of installing a CPAN module without CPAN.pm is one of

    perl Makefile.PL                 perl Build.PL
         ( go and get prerequisites )
    make                             ./Build
    make test                        ./Build test
    make install                     ./Build install

the command cpan . does all of this at once. It figures out which of the two mantras is appropriate, fetches and installs all prerequisites, takes care of them recursively, and finally finishes the installation of the module in the current directory, be it a CPAN module or not.

The typical usage case is for private modules or working copies of projects from remote repositories on the local disk.

Redirection

The usual shell redirection symbols | and > are recognized by the cpan shell only when surrounded by whitespace. So piping to a pager or redirecting output into a file works somewhat as in a normal shell, with the stipulation that you must type extra spaces.

CONFIGURATION

When the CPAN module is used for the first time, a configuration dialogue tries to determine a couple of site specific options. The result of the dialog is stored in a hash reference $CPAN::Config in a file CPAN/Config.pm.

Default values defined in the CPAN/Config.pm file can be overridden in a user specific file: CPAN/MyConfig.pm. Such a file is best placed in $HOME/.cpan/CPAN/MyConfig.pm, because $HOME/.cpan is added to the search path of the CPAN module before the use() or require() statements. The mkmyconfig command writes this file for you.
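
As a rough illustration, a user-specific CPAN/MyConfig.pm is an ordinary Perl file that assigns a hash reference to $CPAN::Config. The sketch below shows only three keys, all taken from the list of config variables in this document; the values (including the mirror URL) are illustrative assumptions, and a file written by mkmyconfig contains many more entries.

```perl
# Sketch of $HOME/.cpan/CPAN/MyConfig.pm with illustrative values only.
$CPAN::Config = {
    'build_requires_install_policy' => q[yes],
    'prerequisites_policy'          => q[follow],
    'urllist'                       => [ q[http://cpan.dev.local/CPAN] ],
};
1;
```

The trailing `1;` makes the file return a true value when it is loaded with require().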

The o conf command has various bells and whistles:

  • completion support

    If you have a ReadLine module installed, you can hit TAB at any point of the commandline and o conf will offer you completion for the built-in subcommands and/or config variable names.

  • displaying some help: o conf help

    Displays a short help

  • displaying current values: o conf [KEY]

    Displays the current value(s) for this config variable. Without KEY, displays all subcommands and config variables.

    Example:

      o conf shell

    If KEY starts and ends with a slash, the string in between is treated as a regular expression and only keys matching this regexp are displayed.

    Example:

      o conf /color/
  • changing of scalar values: o conf KEY VALUE

    Sets the config variable KEY to VALUE. The empty string can be specified as usual in shells, with '' or "".

    Example:

      o conf wget /usr/bin/wget
  • changing of list values: o conf KEY SHIFT|UNSHIFT|PUSH|POP|SPLICE|LIST

    If a config variable name ends with list, it is a list. o conf KEY shift removes the first element of the list; o conf KEY pop removes the last element of the list. o conf KEY unshift LIST prepends a list of values to the list; o conf KEY push LIST appends a list of values to the list.

    Likewise, o conf KEY splice LIST passes the LIST to the corresponding splice command.

    Finally, any other list of arguments is taken as a new list value for the KEY variable discarding the previous value.

    Examples:

      o conf urllist unshift http://cpan.dev.local/CPAN
      o conf urllist splice 3 1
      o conf urllist http://cpan1.local http://cpan2.local ftp://ftp.perl.org
  • reverting to saved: o conf defaults

    Reverts all config variables to the state in the saved config file.

  • saving the config: o conf commit

    Saves all config variables to the current config file (CPAN/Config.pm or CPAN/MyConfig.pm that was loaded at start).

The configuration dialog can be started any time later again by issuing the command o conf init in the CPAN shell. A subset of the configuration dialog can be run by issuing o conf init WORD where WORD is any valid config variable or a regular expression.

Config Variables

The following keys in the hash reference $CPAN::Config are currently defined:

    applypatch          path to external prg
    auto_commit         commit all changes to config variables to disk
    build_cache         size of cache for directories to build modules
    build_dir           locally accessible directory to build modules
    build_dir_reuse     boolean if distros in build_dir are persistent
    build_requires_install_policy
                        to install or not to install when a module is
                        only needed for building. yes|no|ask/yes|ask/no
    bzip2               path to external prg
    cache_metadata      use serializer to cache metadata
    check_sigs          if signatures should be verified
    colorize_debug      Term::ANSIColor attributes for debugging output
    colorize_output     boolean if Term::ANSIColor should colorize output
    colorize_print      Term::ANSIColor attributes for normal output
    colorize_warn       Term::ANSIColor attributes for warnings
    commandnumber_in_prompt
                        boolean if you want to see current command number
    commands_quote      preferred character to use for quoting external
                        commands when running them. Defaults to double
                        quote on Windows, single tick everywhere else;
                        can be set to space to disable quoting
    connect_to_internet_ok
                        whether to ask if opening a connection is ok before
                        urllist is specified
    cpan_home           local directory reserved for this package
    curl                path to external prg
    dontload_hash       DEPRECATED
    dontload_list       arrayref: modules in the list will not be
                        loaded by the CPAN::has_inst() routine
    ftp                 path to external prg
    ftp_passive         if set, the environment variable FTP_PASSIVE is set
                        for downloads
    ftp_proxy           proxy host for ftp requests
    ftpstats_period     max number of days to keep download statistics
    ftpstats_size       max number of items to keep in the download statistics
    getcwd              see below
    gpg                 path to external prg
    gzip                location of external program gzip
    halt_on_failure     stop processing after the first failure of queued
                        items or dependencies
    histfile            file to maintain history between sessions
    histsize            maximum number of lines to keep in histfile
    http_proxy          proxy host for http requests
    inactivity_timeout  breaks interactive Makefile.PLs or Build.PLs
                        after this many seconds inactivity. Set to 0 to
                        disable timeouts.
    index_expire        refetch index files after this many days
    inhibit_startup_message
                        if true, suppress the startup message
    keep_source_where   directory in which to keep the source (if we do)
    load_module_verbosity
                        report loading of optional modules used by CPAN.pm
    lynx                path to external prg
    make                location of external make program
    make_arg            arguments that should always be passed to 'make'
    make_install_make_command
                        the make command for running 'make install', for
                        example 'sudo make'
    make_install_arg    same as make_arg for 'make install'
    makepl_arg          arguments passed to 'perl Makefile.PL'
    mbuild_arg          arguments passed to './Build'
    mbuild_install_arg  arguments passed to './Build install'
    mbuild_install_build_command
                        command to use instead of './Build' when we are
                        in the install stage, for example 'sudo ./Build'
    mbuildpl_arg        arguments passed to 'perl Build.PL'
    ncftp               path to external prg
    ncftpget            path to external prg
    no_proxy            don't proxy to these hosts/domains (comma separated list)
    pager               location of external program more (or any pager)
    password            your password if your CPAN server wants one
    patch               path to external prg
    patches_dir         local directory containing patch files
    perl5lib_verbosity  verbosity level for PERL5LIB additions
    prefer_external_tar
                        per default all untar operations are done with
                        Archive::Tar; by setting this variable to true
                        the external tar command is used if available
    prefer_installer    legal values are MB and EUMM: if a module comes
                        with both a Makefile.PL and a Build.PL, use the
                        former (EUMM) or the latter (MB); if the module
                        comes with only one of the two, that one will be
                        used no matter the setting
    prerequisites_policy
                        what to do if you are missing module prerequisites
                        ('follow' automatically, 'ask' me, or 'ignore')
                        For 'follow', also sets PERL_AUTOINSTALL and
                        PERL_EXTUTILS_AUTOINSTALL for "--defaultdeps" if
                        not already set
    prefs_dir           local directory to store per-distro build options
    proxy_user          username for accessing an authenticating proxy
    proxy_pass          password for accessing an authenticating proxy
    randomize_urllist   add some randomness to the sequence of the urllist
    scan_cache          controls scanning of cache ('atstart', 'atexit' or 'never')
    shell               your favorite shell
    show_unparsable_versions
                        boolean if r command tells which modules are versionless
    show_upload_date    boolean if commands should try to determine upload date
    show_zero_versions  boolean if r command tells for which modules $version==0
    tar                 location of external program tar
    tar_verbosity       verbosity level for the tar command
    term_is_latin       deprecated: if true Unicode is translated to ISO-8859-1
                        (and nonsense for characters outside latin range)
    term_ornaments      boolean to turn ReadLine ornamenting on/off
    test_report         email test reports (if CPAN::Reporter is installed)
    trust_test_report_history
                        skip testing when previously tested ok (according to
                        CPAN::Reporter history)
    unzip               location of external program unzip
    urllist             arrayref to nearby CPAN sites (or equivalent locations)
    use_sqlite          use CPAN::SQLite for metadata storage (fast and lean)
    username            your username if your CPAN server wants one
    version_timeout     stops version parsing after this many seconds.
                        Default is 15 secs. Set to 0 to disable.
    wait_list           arrayref to a wait server to try (See CPAN::WAIT)
    wget                path to external prg
    yaml_load_code      enable YAML code deserialisation via CPAN::DeferredCode
    yaml_module         which module to use to read/write YAML files

You can set and query each of these options interactively in the cpan shell with the o conf or the o conf init command as specified below.

  • o conf <scalar option>

    prints the current value of the scalar option

  • o conf <scalar option> <value>

    Sets the value of the scalar option to value

  • o conf <list option>

    prints the current value of the list option in MakeMaker's neatvalue format.

  • o conf <list option> [shift|pop]

    shifts or pops the array in the list option variable

  • o conf <list option> [unshift|push|splice] <list>

    works like the corresponding perl commands.

  • interactive editing: o conf init [MATCH|LIST]

    Runs an interactive configuration dialog for matching variables. Without argument runs the dialog over all supported config variables. To specify a MATCH the argument must be enclosed by slashes.

    Examples:

      o conf init ftp_passive ftp_proxy
      o conf init /color/

    Note: this method of setting config variables often provides more explanation about the functioning of a variable than the manpage.

CPAN::anycwd($path): Note on config variable getcwd

CPAN.pm changes the current working directory often and needs to determine its own current working directory. By default it uses Cwd::cwd, but if for some reason this doesn't work on your system, configure alternatives according to the following table:

  • cwd

    Calls Cwd::cwd

  • getcwd

    Calls Cwd::getcwd

  • fastcwd

    Calls Cwd::fastcwd

  • backtickcwd

    Calls the external command cwd.
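
The table above amounts to a dispatch on the value of getcwd. The following is a minimal sketch of that idea, not CPAN.pm's own code: `my_anycwd` and `%cwd_via` are hypothetical names, while Cwd::cwd, Cwd::getcwd and Cwd::fastcwd are the functions named in the table.

```perl
use strict;
use warnings;
use Cwd ();

# Hypothetical dispatch table mirroring the four documented values of
# the getcwd config variable; CPAN.pm's own code may differ.
my %cwd_via = (
    cwd         => sub { Cwd::cwd() },
    getcwd      => sub { Cwd::getcwd() },
    fastcwd     => sub { Cwd::fastcwd() },
    backtickcwd => sub {
        my $dir = `cwd`;                 # external command, may not exist
        chomp $dir if defined $dir;
        $dir;
    },
);

sub my_anycwd {
    my ($method) = @_;
    $method = 'cwd' unless defined $method;      # documented default
    my $code = $cwd_via{$method}
        or die "unknown getcwd method '$method'\n";
    return $code->();
}

print my_anycwd('getcwd'), "\n";
```

In the cpan shell itself, the equivalent choice would be made with a command such as o conf getcwd getcwd.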

Note on the format of the urllist parameter

urllist parameters are URLs according to RFC 1738. We do a little guessing if your URL is not compliant, but if you have problems with file URLs, please try the correct format. Either:

    file://localhost/whatever/ftp/pub/CPAN/

or

    file:///home/ftp/pub/CPAN/

The urllist parameter has CD-ROM support

The urllist parameter of the configuration table contains a list of URLs used for downloading. If the list contains any file URLs, CPAN always tries there first. This feature is disabled for index files. So the recommendation for the owner of a CD-ROM with CPAN contents is: include your local, possibly outdated CD-ROM as a file URL at the end of urllist, e.g.

    o conf urllist push file://localhost/CDROM/CPAN

CPAN.pm will then fetch the index files from one of the CPAN sites that come at the beginning of urllist. It will later check for each module to see whether there is a local copy of the most recent version.

Another peculiarity of urllist is that the site that we could successfully fetch the last file from automatically gets a preference token and is tried as the first site for the next request. So if you add a new site at runtime it may happen that the previously preferred site will be tried another time. This means that if you want to disallow a site for the next transfer, it must be explicitly removed from urllist.

Maintaining the urllist parameter

If you have YAML.pm (or some other YAML module configured in yaml_module) installed, CPAN.pm collects some statistical data about recent downloads. You can view the statistics with the hosts command or inspect them directly by looking into the FTPstats.yml file in your cpan_home directory.

To get some interesting statistics, it is recommended that randomize_urllist be set; this introduces some amount of randomness into the URL selection.

The requires and build_requires dependency declarations

Since CPAN.pm version 1.88_51, modules declared as build_requires by a distribution are treated differently depending on the config variable build_requires_install_policy. By setting build_requires_install_policy to no, such a module is not installed. It is only built and tested, and then kept in the list of tested but uninstalled modules. As such, it is available during the build of the dependent module by integrating the path to the blib/arch and blib/lib directories in the environment variable PERL5LIB. If build_requires_install_policy is set to yes, then both modules declared as requires and those declared as build_requires are treated alike. By setting it to ask/yes or ask/no, CPAN.pm asks the user and sets the default accordingly.

Configuration for individual distributions (Distroprefs)

(Note: This feature has been introduced in CPAN.pm 1.8854 and is still considered beta quality)

Distributions on CPAN usually behave according to what we call the CPAN mantra. Or since the advent of Module::Build we should talk about two mantras:

    perl Makefile.PL                 perl Build.PL
    make                             ./Build
    make test                        ./Build test
    make install                     ./Build install

But some modules cannot be built with this mantra. They try to get some extra data from the user via the environment, extra arguments, or interactively--thus disturbing the installation of large bundles like Phalanx100 or modules with many dependencies like Plagger.

The distroprefs system of CPAN.pm addresses this problem by allowing the user to specify extra information and recipes in YAML files to either

  • pass additional arguments to one of the four commands,

  • set environment variables

  • instantiate an Expect object that reads from the console, waits for some regular expressions and enters some answers

  • temporarily override assorted CPAN.pm configuration variables

  • specify dependencies the original maintainer forgot

  • disable the installation of an object altogether

See the YAML and Data::Dumper files that come with the CPAN.pm distribution in the distroprefs/ directory for examples.

Filenames

The YAML files themselves must have the .yml extension; all other files are ignored (for two exceptions see Fallback Data::Dumper and Storable below). The containing directory can be specified in CPAN.pm in the prefs_dir config variable. Try o conf init prefs_dir in the CPAN shell to set and activate the distroprefs system.

Every YAML file may contain arbitrary documents according to the YAML specification, and every document is treated as an entity that can specify the treatment of a single distribution.

Filenames can be picked arbitrarily; CPAN.pm always reads all files (in alphabetical order) and takes the key match (see below in Language Specs) as a hashref containing match criteria that determine if the current distribution matches the YAML document or not.

Fallback Data::Dumper and Storable

If neither your configured yaml_module nor YAML.pm is installed, CPAN.pm falls back to using Data::Dumper and Storable and looks for files with the extensions .dd or .st in the prefs_dir directory. These files are expected to contain one or more hashrefs. For Data::Dumper generated files, this is expected to be done by defining $VAR1, $VAR2, etc. The YAML shell would produce these with the command

    ysh < somefile.yml > somefile.dd

For Storable files the rule is that they must be constructed such that Storable::retrieve(file) returns an array reference and the array elements represent one distropref object each. The conversion from YAML would look like so:

    perl -MYAML=LoadFile -MStorable=nstore -e '
        @y=LoadFile(shift);
        nstore(\@y, shift)' somefile.yml somefile.st

In bootstrapping situations it is usually sufficient to translate only a few YAML files to Data::Dumper for crucial modules like YAML::Syck , YAML.pm and Expect.pm . If you prefer Storable over Data::Dumper, remember to pull out a Storable version that writes an older format than all the other Storable versions that will need to read them.
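
When no YAML module is available at all, a .dd file can also be produced directly from Perl data with the core Data::Dumper module. The sketch below is only an illustration under that assumption: the two preference hashes are made up (their keys follow the distroprefs keywords described in this document), and the output would have to be redirected into a file such as somefile.dd in the prefs_dir directory.

```perl
use strict;
use warnings;
use Data::Dumper;

# Two illustrative distropref documents; the contents are made-up examples.
my @prefs = (
    {
        comment  => "Demo one",
        match    => { distribution => '^CHACHACHA/Dancing-' },
        disabled => 1,
    },
    {
        comment => "Demo two",
        match   => { module => 'Dancing::Queen' },
        pl      => { args => ['--somearg=specialcase'] },
    },
);

local $Data::Dumper::Sortkeys = 1;   # stable key order in the output
my $dd = Dumper(@prefs);             # one $VARn assignment per hashref
print $dd;
```

Passing a list to Dumper is what yields the $VAR1, $VAR2, ... definitions mentioned above, one per hashref.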

Blueprint

The following example contains all supported keywords and structures with the exception of eexpect which can be used instead of expect .

    ---
    comment: "Demo"
    match:
      module: "Dancing::Queen"
      distribution: "^CHACHACHA/Dancing-"
      not_distribution: "\.zip$"
      perl: "/usr/local/cariba-perl/bin/perl"
      perlconfig:
        archname: "freebsd"
        not_cc: "gcc"
      env:
        DANCING_FLOOR: "Shubiduh"
    disabled: 1
    cpanconfig:
      make: gmake
    pl:
      args:
        - "--somearg=specialcase"
      env: {}
      expect:
        - "Which is your favorite fruit"
        - "apple\n"
    make:
      args:
        - all
        - extra-all
      env: {}
      expect: []
      commandline: "echo SKIPPING make"
    test:
      args: []
      env: {}
      expect: []
    install:
      args: []
      env:
        WANT_TO_INSTALL: YES
      expect:
        - "Do you really want to install"
        - "y\n"
    patches:
      - "ABCDE/Fedcba-3.14-ABCDE-01.patch"
    depends:
      configure_requires:
        LWP: 5.8
      build_requires:
        Test::Exception: 0.25
      requires:
        Spiffy: 0.30

Language Specs

Every YAML document represents a single hash reference. The valid keys in this hash are as follows:

  • comment [scalar]

    A comment

  • cpanconfig [hash]

    Temporarily override assorted CPAN.pm configuration variables.

    Supported are: build_requires_install_policy , check_sigs , make , make_install_make_command , prefer_installer , test_report . Please report as a bug when you need another one supported.

  • depends [hash] *** EXPERIMENTAL FEATURE ***

    All three types, namely configure_requires , build_requires , and requires are supported in the way specified in the META.yml specification. The current implementation merges the specified dependencies with those declared by the package maintainer. In a future implementation this may be changed to override the original declaration.

  • disabled [boolean]

    Specifies that this distribution shall not be processed at all.

  • features [array] *** EXPERIMENTAL FEATURE ***

    Experimental implementation to deal with optional_features from META.yml. Still needs coordination with installer software and currently works only for META.yml declaring dynamic_config=0 . Use with caution.

  • goto [string]

    The canonical name of a delegate distribution to install instead. Useful when a new version, although it tests OK itself, breaks something else or a developer release or a fork is already uploaded that is better than the last released version.

  • install [hash]

    Processing instructions for the make install or ./Build install phase of the CPAN mantra. See below under Processing Instructions.

  • make [hash]

    Processing instructions for the make or ./Build phase of the CPAN mantra. See below under Processing Instructions.

  • match [hash]

    A hashref with one or more of the keys distribution, module, perl, perlconfig, and env that specify whether a document is targeted at a specific CPAN distribution or installation. Keys prefixed with not_ negate the corresponding match.

    The corresponding values are interpreted as regular expressions. The distribution related one will be matched against the canonical distribution name, e.g. "AUTHOR/Foo-Bar-3.14.tar.gz".

    The module related one will be matched against all modules contained in the distribution until one module matches.

    The perl related one will be matched against $^X (but with the absolute path).

    The value associated with perlconfig is itself a hashref that is matched against corresponding values in the %Config::Config hash living in the Config.pm module. Keys prefixed with not_ negate the corresponding match.

    The value associated with env is itself a hashref that is matched against corresponding values in the %ENV hash. Keys prefixed with not_ negate the corresponding match.

    If more than one restriction of module , distribution , etc. is specified, the results of the separately computed match values must all match. If so, the hashref represented by the YAML document is returned as the preference structure for the current distribution.

  • patches [array]

    An array of patches on CPAN or on the local disk to be applied in order via an external patch program. Whether the value for the -p parameter is 0 or 1 is determined by reading the patch beforehand. The path to each patch is either an absolute path on the local filesystem, or relative to a patch directory specified in the patches_dir configuration variable, or in the format of a canonical distro name. For examples please consult the distroprefs/ directory in the CPAN.pm distribution (these examples are not installed by default).

    Note: if the applypatch program is installed and CPAN::Config knows about it and a patch is written by the makepatch program, then CPAN.pm lets applypatch apply the patch. Both makepatch and applypatch are available from CPAN in the JV/makepatch-* distribution.

  • pl [hash]

    Processing instructions for the perl Makefile.PL or perl Build.PL phase of the CPAN mantra. See below under Processing Instructions.

  • test [hash]

    Processing instructions for the make test or ./Build test phase of the CPAN mantra. See below under Processing Instructions.
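
The rules given for the match key (regular-expression values, negation through a not_ prefix, and all restrictions having to hold) can be modelled in a few lines of Perl. This is a simplified sketch and not CPAN.pm's implementation: `re_ok` and `prefs_match` are hypothetical helpers, and the description of the distribution being tested is passed in as a plain hash instead of being read from $^X, %Config::Config and %ENV.

```perl
use strict;
use warnings;

# Hypothetical helpers modelling the documented match rules.
# One regexp test; a key starting with not_ inverts the result.
sub re_ok {
    my ($key, $pattern, $value) = @_;
    my $hit = (defined $value ? $value : '') =~ /$pattern/ ? 1 : 0;
    return $key =~ /^not_/ ? !$hit : $hit;
}

sub prefs_match {
    my ($match, $dist) = @_;
    for my $key (keys %$match) {
        (my $name = $key) =~ s/^not_//;
        if ($name eq 'perlconfig' or $name eq 'env') {
            # nested hashref: every entry must hold
            my $have = $dist->{$name} || {};
            for my $sub (keys %{ $match->{$key} }) {
                (my $subname = $sub) =~ s/^not_//;
                return 0
                    unless re_ok($sub, $match->{$key}{$sub}, $have->{$subname});
            }
        }
        elsif ($name eq 'module') {
            # succeeds as soon as one contained module matches
            my $hit = grep { re_ok('module', $match->{$key}, $_) }
                      @{ $dist->{modules} || [] };
            $hit = !$hit if $key =~ /^not_/;
            return 0 unless $hit;
        }
        else {    # distribution, perl
            return 0 unless re_ok($key, $match->{$key}, $dist->{$name});
        }
    }
    return 1;     # all restrictions matched
}

my $match = {
    distribution     => '^CHACHACHA/Dancing-',
    not_distribution => '\.zip$',
    module           => 'Dancing::Queen',
    perlconfig       => { archname => 'freebsd', not_cc => 'gcc' },
};
my %dist = (
    distribution => 'CHACHACHA/Dancing-Queen-1.44.tar.gz',
    modules      => [ 'Dancing::Queen', 'Dancing::Floor' ],
    perlconfig   => { archname => 'amd64-freebsd', cc => 'cc' },
);
print prefs_match($match, \%dist) ? "match\n" : "no match\n";
```

The sample data reuses the names from the Blueprint; the version number and the perlconfig values are invented for the demonstration.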

Processing Instructions

  • args [array]

    Arguments to be added to the command line

  • commandline

    A full commandline to run via system(). During execution, the environment variable PERL is set to $^X (but with an absolute path). If commandline is specified, args is not used.

  • eexpect [hash]

    Extended expect . This is a hash reference with four allowed keys, mode , timeout , reuse , and talk .

    You must install the Expect module to use eexpect . CPAN.pm does not install it for you.

    mode may have the values deterministic for the case where all questions come in the order written down and anyorder for the case where the questions may come in any order. The default mode is deterministic .

    timeout denotes a timeout in seconds. Floating-point timeouts are OK. With mode=deterministic , the timeout denotes the timeout per question; with mode=anyorder it denotes the timeout per byte received from the stream or questions.

    talk is a reference to an array that contains alternating questions and answers. Questions are regular expressions and answers are literal strings. The Expect module watches the stream from the execution of the external program (perl Makefile.PL , perl Build.PL , make , etc.).

    For mode=deterministic, CPAN.pm injects the corresponding answer as soon as the stream matches the regular expression.

    For mode=anyorder CPAN.pm answers a question as soon as the timeout is reached for the next byte in the input stream. In this mode you can use the reuse parameter to decide what will happen with a question-answer pair after it has been used. In the default case (reuse=0) it is removed from the array, avoiding being used again accidentally. If you want to answer the question Do you really want to do that several times, then it must be included in the array at least as often as you want this answer to be given. Setting the parameter reuse to 1 makes this repetition unnecessary.

  • env [hash]

    Environment variables to be set during the command

  • expect [array]

    You must install the Expect module to use expect . CPAN.pm does not install it for you.

    expect: <array> is a short notation for this eexpect :

      eexpect:
          mode: deterministic
          timeout: 15
          talk: <array>
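
The effect of the reuse parameter described under eexpect can be illustrated with a toy model. The sketch below does not use the Expect module and ignores timeouts entirely; `simulate_anyorder` is a hypothetical function that only shows which answers would be available for a sequence of prompts when a question-answer pair is removed after use (reuse=0) or kept (reuse=1).

```perl
use strict;
use warnings;

# Toy model (hypothetical, no Expect involved): feed a list of prompts
# to a talk array and collect the answers that would be sent.
sub simulate_anyorder {
    my ($talk, $reuse, @prompts) = @_;
    my @queue = @$talk;
    my @pairs;
    push @pairs, [ splice @queue, 0, 2 ] while @queue;   # [question, answer]

    my @sent;
    PROMPT: for my $prompt (@prompts) {
        for my $i (0 .. $#pairs) {
            my ($question, $answer) = @{ $pairs[$i] };
            next unless $prompt =~ /$question/;
            push @sent, $answer;
            splice @pairs, $i, 1 unless $reuse;   # reuse=0: pair is consumed
            next PROMPT;
        }
        push @sent, undef;                        # nothing left to answer with
    }
    return @sent;
}

my @talk    = ('really want to do that', "y\n");
my @prompts = ('Do you really want to do that?') x 2;

my @once  = simulate_anyorder(\@talk, 0, @prompts);   # ("y\n", undef)
my @again = simulate_anyorder(\@talk, 1, @prompts);   # ("y\n", "y\n")
printf "reuse=0 answered %d of 2, reuse=1 answered %d of 2\n",
    scalar(grep { defined } @once), scalar(grep { defined } @again);
```

With reuse=0 the pair would have to appear twice in talk to answer both prompts, which is the repetition that reuse=1 makes unnecessary.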

Schema verification with Kwalify

If you have the Kwalify module installed (which is part of the Bundle::CPANxxl), then all your distroprefs files are checked for syntactic correctness.

Example Distroprefs Files

CPAN.pm comes with a collection of example YAML files. Note that these are really just examples and should not be used without care because they cannot fit everybody's purpose. After all, the authors of the packages that ask questions had a need to ask, so you should watch their questions and adjust the examples to your environment and your needs. You have been warned:-)

PROGRAMMER'S INTERFACE

If you do not enter the shell, shell commands are available both as methods (CPAN::Shell->install(...) ) and as functions in the calling package (install(...) ). Before calling low-level commands, it makes sense to initialize components of CPAN you need, e.g.:

    CPAN::HandleConfig->load;
    CPAN::Shell::setup_output;
    CPAN::Index->reload;

High-level commands do such initializations automatically.

There's currently only one class that has a stable interface - CPAN::Shell. All commands that are available in the CPAN shell are methods of the class CPAN::Shell. The arguments on the commandline are passed as arguments to the method.

So if you take for example the shell command

    notest install A B C

the actually executed command is

    CPAN::Shell->notest("install","A","B","C");

Each of the commands that produce listings of modules (r, autobundle, u) also returns a list of the IDs of all modules within the list.

  • expand($type,@things)

    The IDs of all objects available within a program are strings that can be expanded to the corresponding real objects with the CPAN::Shell->expand("Module",@things) method. Expand returns a list of CPAN::Module objects according to the @things arguments given. In scalar context, it returns only the first element of the list.

  • expandany(@things)

    Like expand, but returns objects of the appropriate type, i.e. CPAN::Bundle objects for bundles, CPAN::Module objects for modules, and CPAN::Distribution objects for distributions. Note: it does not expand to CPAN::Author objects.

  • Programming Examples

    This enables the programmer to do operations that combine functionalities that are available in the shell.

    # install everything that is outdated on my disk:
    perl -MCPAN -e 'CPAN::Shell->install(CPAN::Shell->r)'

    # install my favorite programs if necessary:
    for $mod (qw(Net::FTP Digest::SHA Data::Dumper)) {
        CPAN::Shell->install($mod);
    }

    # list all modules on my disk that have no VERSION number
    for $mod (CPAN::Shell->expand("Module","/./")) {
        next unless $mod->inst_file;
        # MakeMaker convention for undefined $VERSION:
        next unless $mod->inst_version eq "undef";
        print "No VERSION in ", $mod->id, "\n";
    }

    # find out which distribution on CPAN contains a module:
    print CPAN::Shell->expand("Module","Apache::Constants")->cpan_file

    Or if you want to schedule a cron job to watch CPAN, you could list all modules that need updating. First a quick and dirty way:

    perl -e 'use CPAN; CPAN::Shell->r;'

    If you don't want any output should all modules be up to date, parse the output of the above command for the regular expression /modules are up to date/ and decide to mail the output only if it doesn't match.

    If you prefer to do it more in a programmerish style in one single process, something like this may better suit you:

    # list all modules on my disk that have newer versions on CPAN
    for $mod (CPAN::Shell->expand("Module","/./")) {
        next unless $mod->inst_file;
        next if $mod->uptodate;
        printf "Module %s is installed as %s, could be updated to %s from CPAN\n",
            $mod->id, $mod->inst_version, $mod->cpan_version;
    }

    If that gives too much output every day, you may want to watch only for three modules. You can write

    for $mod (CPAN::Shell->expand("Module","/Apache|LWP|CGI/")) {

    as the first line instead. Or you can combine some of the above tricks:

    # watch only for a new mod_perl module
    $mod = CPAN::Shell->expand("Module","mod_perl");
    exit if $mod->uptodate;
    # new mod_perl arrived, let me know all update recommendations
    CPAN::Shell->r;
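
For the cron approach described above, the decision whether to mail the report can be sketched as a small filter. The helper name should_mail is an assumption made for illustration; only the regular expression comes from the text.

```perl
use strict;
use warnings;

# Hypothetical helper: a report is worth mailing only when it does NOT
# contain the "modules are up to date" message.
sub should_mail {
    my ($report) = @_;
    return $report !~ /modules are up to date/ ? 1 : 0;
}

# Usage idea (not run here, since it needs a configured CPAN.pm):
#   my $report = `perl -MCPAN -e 'CPAN::Shell->r'`;
#   print $report if should_mail($report);
```

Printing the report only when should_mail returns true lets cron's own mail mechanism stay silent on quiet days.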

Methods in the other Classes

  • CPAN::Author::as_glimpse()

    Returns a one-line description of the author

  • CPAN::Author::as_string()

    Returns a multi-line description of the author

  • CPAN::Author::email()

    Returns the author's email address

  • CPAN::Author::fullname()

    Returns the author's name

  • CPAN::Author::name()

    An alias for fullname

  • CPAN::Bundle::as_glimpse()

    Returns a one-line description of the bundle

  • CPAN::Bundle::as_string()

    Returns a multi-line description of the bundle

  • CPAN::Bundle::clean()

    Recursively runs the clean method on all items contained in the bundle.

  • CPAN::Bundle::contains()

    Returns a list of objects' IDs contained in a bundle. The associated objects may be bundles, modules or distributions.

  • CPAN::Bundle::force($method,@args)

    Forces CPAN to perform a task that it normally would have refused to do. Force takes as arguments a method name to be called and any number of additional arguments that should be passed to the called method. The internals of the object get the needed changes so that CPAN.pm does not refuse to take the action. The force is passed recursively to all contained objects. See also the section above on the force and the fforce pragma.

  • CPAN::Bundle::get()

    Recursively runs the get method on all items contained in the bundle

  • CPAN::Bundle::inst_file()

    Returns the highest installed version of the bundle in either @INC or $CPAN::Config->{cpan_home}. Note that this is different from CPAN::Module::inst_file.

  • CPAN::Bundle::inst_version()

    Like CPAN::Bundle::inst_file, but returns the $VERSION

  • CPAN::Bundle::uptodate()

    Returns 1 if the bundle itself and all its members are up-to-date.

  • CPAN::Bundle::install()

    Recursively runs the install method on all items contained in the bundle

  • CPAN::Bundle::make()

    Recursively runs the make method on all items contained in the bundle

  • CPAN::Bundle::readme()

    Recursively runs the readme method on all items contained in the bundle

  • CPAN::Bundle::test()

    Recursively runs the test method on all items contained in the bundle

  • CPAN::Distribution::as_glimpse()

    Returns a one-line description of the distribution

  • CPAN::Distribution::as_string()

    Returns a multi-line description of the distribution

  • CPAN::Distribution::author

    Returns the CPAN::Author object of the maintainer who uploaded this distribution

  • CPAN::Distribution::pretty_id()

    Returns a string of the form "AUTHORID/TARBALL", where AUTHORID is the author's PAUSE ID and TARBALL is the distribution filename.

  • CPAN::Distribution::base_id()

    Returns the distribution filename without any archive suffix, e.g. "Foo-Bar-0.01".

  • CPAN::Distribution::clean()

    Changes to the directory where the distribution has been unpacked and runs make clean there.

  • CPAN::Distribution::containsmods()

    Returns a list of IDs of modules contained in a distribution file. Works only for distributions listed in the 02packages.details.txt.gz file. This typically means that only the most recent version of a distribution is covered.

  • CPAN::Distribution::cvs_import()

    Changes to the directory where the distribution has been unpacked and runs something like

    cvs -d $cvs_root import -m $cvs_log $cvs_dir $userid v$version

    there.

  • CPAN::Distribution::dir()

    Returns the directory into which this distribution has been unpacked.

  • CPAN::Distribution::force($method,@args)

    Forces CPAN to perform a task that it normally would have refused to do. Force takes as arguments a method name to be called and any number of additional arguments that should be passed to the called method. The internals of the object get the needed changes so that CPAN.pm does not refuse to take the action. See also the section above on the force and the fforce pragma.

  • CPAN::Distribution::get()

    Downloads the distribution from CPAN and unpacks it. Does nothing if the distribution has already been downloaded and unpacked within the current session.

  • CPAN::Distribution::install()

    Changes to the directory where the distribution has been unpacked and runs the external command make install there. If make has not yet been run, it will be run first. A make test is issued in any case and if this fails, the install is cancelled. The cancellation can be avoided by letting force run the install for you.

    This install method only has the power to install the distribution if there are no dependencies in the way. To install an object along with all its dependencies, use CPAN::Shell->install.

    Note that install() gives no meaningful return value. See uptodate().

  • CPAN::Distribution::isa_perl()

    Returns 1 if this distribution file seems to be a perl distribution. Normally this is derived from the file name only, but the index from CPAN can contain a hint to achieve a return value of true for other filenames too.

  • CPAN::Distribution::look()

    Changes to the directory where the distribution has been unpacked and opens a subshell there. Exiting the subshell returns.

  • CPAN::Distribution::make()

    First runs the get method to make sure the distribution is downloaded and unpacked. Changes to the directory where the distribution has been unpacked and runs the external commands perl Makefile.PL or perl Build.PL and make there.

  • CPAN::Distribution::perldoc()

    Downloads the pod documentation of the file associated with a distribution (in HTML format) and runs it through the external command lynx specified in $CPAN::Config->{lynx}. If lynx isn't available, it converts it to plain text with the external command html2text and runs it through the pager specified in $CPAN::Config->{pager}.

  • CPAN::Distribution::prefs()

    Returns the hash reference from the first matching YAML file that the user has deposited in the prefs_dir/ directory. The first succeeding match wins. The files in the prefs_dir/ are processed alphabetically, and the canonical distro name (e.g. AUTHOR/Foo-Bar-3.14.tar.gz) is matched against the regular expressions stored in the $root->{match}{distribution} attribute value. Additionally all module names contained in a distribution are matched against the regular expressions in the $root->{match}{module} attribute value. The two match values are ANDed together. Each of the two attributes are optional.

  • CPAN::Distribution::prereq_pm()

    Returns the hash reference that has been announced by a distribution as the requires and build_requires elements. These can be declared either by the META.yml (if authoritative) or can be deposited after the run of Build.PL in the file ./_build/prereqs or after the run of Makefile.PL written as the PREREQ_PM hash in a comment in the produced Makefile. Note: this method only works after an attempt has been made to make the distribution. Returns undef otherwise.

  • CPAN::Distribution::readme()

    Downloads the README file associated with a distribution and runs it through the pager specified in $CPAN::Config->{pager}.

  • CPAN::Distribution::reports()

    Downloads report data for this distribution from www.cpantesters.org and displays a subset of them.

  • CPAN::Distribution::read_yaml()

    Returns the content of the META.yml of this distro as a hashref. Note: works only after an attempt has been made to make the distribution. Returns undef otherwise. Also returns undef if the content of META.yml is not authoritative. (The rules about what exactly makes the content authoritative are still in flux.)

  • CPAN::Distribution::test()

    Changes to the directory where the distribution has been unpacked and runs make test there.

  • CPAN::Distribution::uptodate()

    Returns 1 if all the modules contained in the distribution are up-to-date. Relies on containsmods.

  • CPAN::Index::force_reload()

    Forces a reload of all indices.

  • CPAN::Index::reload()

    Reloads all indices if they have not been read for more than $CPAN::Config->{index_expire} days.

  • CPAN::InfoObj::dump()

    CPAN::Author, CPAN::Bundle, CPAN::Module, and CPAN::Distribution inherit this method. It prints the data structure associated with an object. Useful for debugging. Note: the data structure is considered internal and thus subject to change without notice.

  • CPAN::Module::as_glimpse()

    Returns a one-line description of the module in four columns: The first column contains the word Module , the second column consists of one character: an equals sign if this module is already installed and up-to-date, a less-than sign if this module is installed but can be upgraded, and a space if the module is not installed. The third column is the name of the module and the fourth column gives maintainer or distribution information.

  • CPAN::Module::as_string()

    Returns a multi-line description of the module

  • CPAN::Module::clean()

    Runs a clean on the distribution associated with this module.

  • CPAN::Module::cpan_file()

    Returns the filename on CPAN that is associated with the module.

  • CPAN::Module::cpan_version()

    Returns the latest version of this module available on CPAN.

  • CPAN::Module::cvs_import()

    Runs a cvs_import on the distribution associated with this module.

  • CPAN::Module::description()

    Returns a 44 character description of this module. Only available for modules listed in The Module List (CPAN/modules/00modlist.long.html or 00modlist.long.txt.gz)

  • CPAN::Module::distribution()

    Returns the CPAN::Distribution object that contains the current version of this module.

  • CPAN::Module::dslip_status()

    Returns a hash reference. The keys of the hash are the letters D, S, L, I, and P, for development status, support level, language, interface and public licence respectively. The data for the DSLIP status are collected by pause.perl.org when authors register their namespaces. The values of the 5 hash elements are one-character words whose meaning is described in the table below. There are also 5 hash elements DV, SV, LV, IV, and PV that carry a more verbose value of the 5 status variables.

    Where the 'DSLIP' characters have the following meanings:

    D - Development Stage (Note: *NO IMPLIED TIMESCALES*):
      i   - Idea, listed to gain consensus or as a placeholder
      c   - under construction but pre-alpha (not yet released)
      a/b - Alpha/Beta testing
      R   - Released
      M   - Mature (no rigorous definition)
      S   - Standard, supplied with Perl 5

    S - Support Level:
      m   - Mailing-list
      d   - Developer
      u   - Usenet newsgroup comp.lang.perl.modules
      n   - None known, try comp.lang.perl.modules
      a   - abandoned; volunteers welcome to take over maintenance

    L - Language Used:
      p   - Perl-only, no compiler needed, should be platform independent
      c   - C and perl, a C compiler will be needed
      h   - Hybrid, written in perl with optional C code, no compiler needed
      +   - C++ and perl, a C++ compiler will be needed
      o   - perl and another language other than C or C++

    I - Interface Style
      f   - plain Functions, no references used
      h   - hybrid, object and function interfaces available
      n   - no interface at all (huh?)
      r   - some use of unblessed References or ties
      O   - Object oriented using blessed references and/or inheritance

    P - Public License
      p   - Standard-Perl: user may choose between GPL and Artistic
      g   - GPL: GNU General Public License
      l   - LGPL: "GNU Lesser General Public License" (previously known as
            "GNU Library General Public License")
      b   - BSD: The BSD License
      a   - Artistic license alone
      2   - Artistic license 2.0 or later
      o   - open source: approved by www.opensource.org
      d   - allows distribution without restrictions
      r   - restricted distribution
      n   - no license at all

  • CPAN::Module::force($method,@args)

    Forces CPAN to perform a task it would normally refuse to do. Force takes as arguments a method name to be invoked and any number of additional arguments to pass that method. The internals of the object get the needed changes so that CPAN.pm does not refuse to take the action. See also the section above on the force and the fforce pragma.

  • CPAN::Module::get()

    Runs a get on the distribution associated with this module.

  • CPAN::Module::inst_file()

    Returns the filename of the module found in @INC. The first file found is reported, just as perl itself stops searching @INC once it finds a module.

  • CPAN::Module::available_file()

    Returns the filename of the module found in PERL5LIB or @INC. The first file found is reported. The advantage of this method over inst_file is that modules that have been tested but not yet installed are included because PERL5LIB keeps track of tested modules.

  • CPAN::Module::inst_version()

    Returns the version number of the installed module in readable format.

  • CPAN::Module::available_version()

    Returns the version number of the available module in readable format.

  • CPAN::Module::install()

    Runs an install on the distribution associated with this module.

  • CPAN::Module::look()

    Changes to the directory where the distribution associated with this module has been unpacked and opens a subshell there. Exiting the subshell returns.

  • CPAN::Module::make()

    Runs a make on the distribution associated with this module.

  • CPAN::Module::manpage_headline()

    If module is installed, peeks into the module's manpage, reads the headline, and returns it. Moreover, if the module has been downloaded within this session, does the equivalent on the downloaded module even if it hasn't been installed yet.

  • CPAN::Module::perldoc()

    Runs a perldoc on this module.

  • CPAN::Module::readme()

    Runs a readme on the distribution associated with this module.

  • CPAN::Module::reports()

    Calls the reports() method on the associated distribution object.

  • CPAN::Module::test()

    Runs a test on the distribution associated with this module.

  • CPAN::Module::uptodate()

    Returns 1 if the module is installed and up-to-date.

  • CPAN::Module::userid()

    Returns the author's ID of the module.
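
To make the hash layout described under CPAN::Module::dslip_status() more concrete, here is a minimal sketch that builds a similar structure by hand. It assumes a five-letter code such as "RdpOp" as input, which is a hypothetical format chosen for illustration; the helper split_dslip and the lookup table are not part of CPAN.pm, and only the D column of the DSLIP table is reproduced.

```perl
use strict;
use warnings;

# Verbose meanings for the D (development stage) letters, taken from the
# DSLIP table in this document.
my %DEVELOPMENT_STAGE = (
    i => 'Idea',
    c => 'under construction',
    a => 'Alpha testing',
    b => 'Beta testing',
    R => 'Released',
    M => 'Mature',
    S => 'Standard',
);

# Hypothetical helper: split a five-letter code into a hash keyed
# D, S, L, I, P, and add a verbose DV entry when the stage is known.
sub split_dslip {
    my ($code) = @_;
    my %status;
    @status{qw(D S L I P)} = split //, $code;
    $status{DV} = $DEVELOPMENT_STAGE{ $status{D} }
        if exists $DEVELOPMENT_STAGE{ $status{D} };
    return \%status;
}

my $status = split_dslip("RdpOp");
print "stage: $status->{DV}, interface letter: $status->{I}\n";
```

With a real CPAN::Module object you would call dslip_status() instead and read the same keys from the returned hash reference.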

Cache Manager

Currently the cache manager only keeps track of the build directory ($CPAN::Config->{build_dir}). It is a simple FIFO mechanism that deletes complete directories below build_dir as soon as the size of all directories there gets bigger than $CPAN::Config->{build_cache} (in MB). The contents of this cache may be used for later re-installations that you intend to do manually, but will never be trusted by CPAN itself. This is due to the fact that the user might use these directories for building modules on different architectures.

There is another directory ($CPAN::Config->{keep_source_where}) where the original distribution files are kept. This directory is not covered by the cache manager and must be controlled by the user. If you use the same directory for both build_dir and keep_source_where, then your sources will be deleted by the same FIFO mechanism.
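
The FIFO policy can be pictured with a short sketch. This is not CPAN.pm's implementation: the function fifo_evictions, the directory names, and the sizes are all made up for illustration, and the sketch only computes which directories would go rather than deleting anything.

```perl
use strict;
use warnings;

# Hypothetical sketch: @dirs holds [name, size_in_mb] pairs ordered oldest
# first; return the names a FIFO policy would delete so that the total
# size no longer exceeds $limit_mb.
sub fifo_evictions {
    my ($limit_mb, @dirs) = @_;

    my $total = 0;
    $total += $_->[1] for @dirs;

    my @evicted;
    while ($total > $limit_mb && @dirs) {
        my $oldest = shift @dirs;      # first in, first out
        $total -= $oldest->[1];
        push @evicted, $oldest->[0];
    }
    return @evicted;
}

my @gone = fifo_evictions(100, ['Foo-1.0', 60], ['Bar-2.1', 50], ['Baz-0.3', 30]);
print "would delete: @gone\n";
```

Here the three directories total 140 MB against a 100 MB limit, so only the oldest one is selected.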

Bundles

A bundle is just a perl module in the namespace Bundle:: that does not define any functions or methods. It usually only contains documentation.

It starts like a perl module with a package declaration and a $VERSION variable. After that the pod section looks like any other pod with the only difference being that one special pod section exists starting with (verbatim):

  =head1 CONTENTS

In this pod section each line obeys the format

  Module_Name [Version_String] [- optional text]

The only required part is the first field, the name of a module (e.g. Foo::Bar, i.e. not the name of the distribution file). The rest of the line is optional. The comment part is delimited by a dash just as in the man page header.
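
As a sketch of how such a line breaks down, the following hypothetical parser splits one CONTENTS line into its three fields. It is not the parser CPAN.pm uses; the function name and the regular expression are assumptions, and it presumes that a version string never starts with a dash.

```perl
use strict;
use warnings;

# Hypothetical parser for one line of the form
#   Module_Name [Version_String] [- optional text]
# Returns a hash reference, or nothing for a blank line.
sub parse_contents_line {
    my ($line) = @_;
    return unless $line =~ /^\s*(\S+)(?:\s+([^\s-]\S*))?(?:\s+-\s*(.*?))?\s*$/;
    return { module => $1, version => $2, comment => $3 };
}

for my $line ('Foo::Bar 1.23 - does things', 'Foo::Bar - does things', 'Foo::Bar') {
    my $entry = parse_contents_line($line);
    printf "%-12s version=%-6s comment=%s\n",
        $entry->{module},
        defined $entry->{version} ? $entry->{version} : '(none)',
        defined $entry->{comment} ? $entry->{comment} : '(none)';
}
```

Fields that are absent come back as undef, which mirrors the fact that only the module name is required.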

The distribution of a bundle should follow the same convention as other distributions.

Bundles are treated specially in the CPAN package. If you say 'install Bundle::Tkkit' (assuming such a bundle exists), CPAN will install all the modules in the CONTENTS section of the pod. You can install your own Bundles locally by placing a conformant Bundle file somewhere into your @INC path. The autobundle() command which is available in the shell interface does that for you by including all currently installed modules in a snapshot bundle file.

PREREQUISITES

The CPAN program tries to depend on as little as possible so that the user can use it in a hostile environment. It works better the more goodies the environment provides. For example if you try in the CPAN shell

  install Bundle::CPAN

or

  install Bundle::CPANxxl

you will find the shell more convenient than the bare shell before.

If you have a local mirror of CPAN and can access all files with "file:" URLs, then you only need a perl later than perl5.003 to run this module. Otherwise Net::FTP is strongly recommended. LWP may be required for non-UNIX systems, or if your nearest CPAN site is associated with a URL that is not ftp: .

If you have neither Net::FTP nor LWP, there is a fallback mechanism implemented for an external ftp command or for an external lynx command.

UTILITIES

Finding packages and VERSION

This module presumes that all packages on CPAN

  • declare their $VERSION variable in an easy to parse manner. This prerequisite can hardly be relaxed because it consumes far too much memory to load all packages into the running program just to determine the $VERSION variable. Currently all programs that are dealing with version use something like this

    perl -MExtUtils::MakeMaker -le \
        'print MM->parse_version(shift)' filename

    If you are the author of a package and wonder if your $VERSION can be parsed, please try the above method.

  • come as compressed or gzipped tarfiles or as zip files and contain a Makefile.PL or Build.PL (well, we try to handle a bit more, but with little enthusiasm).
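
The parse_version check shown in the first item above can also be run from a script. The sketch below writes a throwaway module file and parses it; the package name My::Example and the version string are made up for the demonstration.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker;         # loads the MM class used in the one-liner
use File::Temp qw(tempfile);

# Create a temporary module file with a simple, easy-to-parse $VERSION line.
my ($fh, $filename) = tempfile(SUFFIX => '.pm', UNLINK => 1);
print {$fh} "package My::Example;\nour \$VERSION = '1.23';\n1;\n";
close $fh or die "cannot close $filename: $!";

# Same call as in the one-liner: the file is scanned, not loaded as a module.
my $version = MM->parse_version($filename);
print "parsed version: $version\n";
```

If parse_version cannot find a usable $VERSION line it returns the string "undef", the MakeMaker convention mentioned earlier in this document.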

Debugging

Debugging this module is more than a bit complex due to interference from the software producing the indices on CPAN, the mirroring process on CPAN, packaging, configuration, synchronicity, and even (gasp!) due to bugs within the CPAN.pm module itself.

For debugging the code of CPAN.pm itself in interactive mode, some debugging aid can be turned on for most packages within CPAN.pm with one of

  • o debug package...

    sets debug mode for packages.

  • o debug -package...

    unsets debug mode for packages.

  • o debug all

    turns debugging on for all packages.

  • o debug number

which sets the debugging packages directly. Note that o debug 0 turns debugging off.

What seems a successful strategy is the combination of reload cpan and the debugging switches. Add a new debug statement while running in the shell and then issue a reload cpan and see the new debugging messages immediately without losing the current context.

o debug without an argument lists the valid package names and the current set of packages in debugging mode. o debug has built-in completion support.

For debugging of CPAN data there is the dump command which takes the same arguments as make/test/install and outputs each object's Data::Dumper dump. If an argument looks like a perl variable and contains one of $, @ or %, it is eval()ed and fed to Data::Dumper directly.

Floppy, Zip, Offline Mode

CPAN.pm works nicely without network access, too. If you maintain machines that are not networked at all, you should consider working with file: URLs. You'll have to collect your modules somewhere first. So you might use CPAN.pm to put together all you need on a networked machine. Then copy the $CPAN::Config->{keep_source_where} (but not $CPAN::Config->{build_dir}) directory onto a floppy. This floppy is kind of a personal CPAN. CPAN.pm on the non-networked machines works nicely with this floppy. See also below the paragraph about CD-ROM support.

Basic Utilities for Programmers

  • has_inst($module)

    Returns true if the module is installed. Used to load all modules into the running CPAN.pm that are considered optional. The config variable dontload_list intercepts the has_inst() call such that an optional module is not loaded despite being available. For example, the following command will prevent YAML.pm from being loaded:

    cpan> o conf dontload_list push YAML

    See the source for details.

  • has_usable($module)

    Returns true if the module is installed and in a usable state. Only useful for a handful of modules that are used internally. See the source for details.

  • instance($module)

    The constructor for all the singletons used to represent modules, distributions, authors, and bundles. If the object already exists, this method returns the object; otherwise, it calls the constructor.

SECURITY

There's no strong security layer in CPAN.pm. CPAN.pm helps you to install foreign, unmasked, unsigned code on your machine. We compare to a checksum that comes from the net just as the distribution file itself. But we try to make it easy to add security on demand:

Cryptographically signed modules

Since release 1.77, CPAN.pm has been able to verify cryptographically signed module distributions using Module::Signature. The CPAN modules can be signed by their authors, thus giving more security. The simple unsigned MD5 checksums that were used before by CPAN protect mainly against accidental file corruption.

You will need to have Module::Signature installed, which in turn requires that you have at least one of Crypt::OpenPGP module or the command-line gpg tool installed.

You will also need to be able to connect over the Internet to the public key servers, like pgp.mit.edu, and their port 11371 (the HKP protocol).

The configuration parameter check_sigs is there to turn signature checking on or off.

EXPORT

Most functions in package CPAN are exported by default. The reason for this is that the primary use is intended for the cpan shell or for one-liners.

ENVIRONMENT

When the CPAN shell enters a subshell via the look command, it sets the environment variable CPAN_SHELL_LEVEL to 1, or increments that variable if it is already set.
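
The rule for CPAN_SHELL_LEVEL amounts to a one-line computation. The sketch below is an illustration only, not code from CPAN.pm; the helper next_shell_level is hypothetical and takes the environment as a hash reference so it can be tried without touching %ENV.

```perl
use strict;
use warnings;

# Hypothetical helper: compute the level a new subshell would see,
# 1 if the variable is unset, otherwise the current value plus one.
sub next_shell_level {
    my ($env) = @_;
    return ($env->{CPAN_SHELL_LEVEL} || 0) + 1;
}

print "fresh shell:  ", next_shell_level({}), "\n";
print "nested shell: ", next_shell_level({ CPAN_SHELL_LEVEL => 2 }), "\n";
```

A prompt or script running inside the subshell could read $ENV{CPAN_SHELL_LEVEL} to tell how deeply it is nested.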

When CPAN runs, it sets the environment variable PERL5_CPAN_IS_RUNNING to the ID of the running process. It also sets PERL5_CPANPLUS_IS_RUNNING to prevent runaway processes which could happen with older versions of Module::Install.

When running perl Makefile.PL, the environment variable PERL5_CPAN_IS_EXECUTING is set to the full path of the Makefile.PL that is being executed. This prevents runaway processes with newer versions of Module::Install.

When the config variable ftp_passive is set, all downloads will be run with the environment variable FTP_PASSIVE set to this value. This is in general a good idea as it influences both Net::FTP and LWP based connections. The same effect can be achieved by starting the cpan shell with this environment variable set. For Net::FTP alone, one can also always set passive mode by running libnetcfg.

POPULATE AN INSTALLATION WITH LOTS OF MODULES

Populating a freshly installed perl with one's favorite modules is pretty easy if you maintain a private bundle definition file. To get a useful blueprint of a bundle definition file, the command autobundle can be used on the CPAN shell command line. This command writes a bundle definition file for all modules installed for the current perl interpreter. It's recommended to run this command once only, and from then on maintain the file manually under a private name, say Bundle/my_bundle.pm. With a clever bundle file you can then simply say

  cpan> install Bundle::my_bundle

then answer a few questions and go out for coffee (possibly even in a different city).

Maintaining a bundle definition file means keeping track of two things: dependencies and interactivity. CPAN.pm sometimes fails on calculating dependencies because not all modules define all MakeMaker attributes correctly, so a bundle definition file should specify prerequisites as early as possible. On the other hand, it's annoying that so many distributions need some interactive configuring. So what you can try to accomplish in your private bundle file is to have the packages that need to be configured early in the file and the gentle ones later, so you can go out for coffee after a few minutes and leave CPAN.pm to churn away unattended.

WORKING WITH CPAN.pm BEHIND FIREWALLS

Thanks to Graham Barr for contributing the following paragraphs about the interaction between perl and various firewall configurations. For further information on firewalls, it is recommended to consult the documentation that comes with the ncftp program. If you are unable to go through the firewall with a simple Perl setup, it is likely that you can configure ncftp so that it works through your firewall.

Three basic types of firewalls

Firewalls can be categorized into three basic types.

  • http firewall

    This is when the firewall machine runs a web server, and to access the outside world, you must do so via that web server. If you set environment variables like http_proxy or ftp_proxy to values beginning with http://, or if you have proxy information set in your web browser, then you know you are running behind an http firewall.

    To access servers outside these types of firewalls with perl (even for ftp), you need LWP or HTTP::Tiny.

  • ftp firewall

    This is where the firewall machine runs an ftp server. This kind of firewall will only let you access ftp servers outside the firewall. This is usually done by connecting to the firewall with ftp, then entering a username like "user@outside.host.com".

    To access servers outside these types of firewalls with perl, you need Net::FTP.

  • One-way visibility

    One-way visibility means these firewalls try to make themselves invisible to users inside the firewall. An FTP data connection is normally created by sending your IP address to the remote server and then listening for the return connection. But the remote server will not be able to connect to you because of the firewall. For these types of firewall, FTP connections need to be done in a passive mode.

    There are two that I can think of.

    • SOCKS

      If you are using a SOCKS firewall, you will need to compile perl and link it with the SOCKS library. This is what is normally called a 'socksified' perl. With this executable you will be able to connect to servers outside the firewall as if it were not there.

    • IP Masquerade

      This is when the firewall is implemented in the kernel (via NAT, or network address translation); it allows you to hide a complete network behind one IP address. With this firewall no special compiling is needed as you can access hosts directly.

      For accessing ftp servers behind such firewalls you usually need to set the environment variable FTP_PASSIVE or the config variable ftp_passive to a true value.

Configuring lynx or ncftp for going through a firewall

If you can go through your firewall with e.g. lynx, presumably with a command such as

  /usr/local/bin/lynx -pscott:tiger

then you would configure CPAN.pm with the command

  o conf lynx "/usr/local/bin/lynx -pscott:tiger"

That's all. Similarly for ncftp or ftp, you would configure something like

  o conf ncftp "/usr/bin/ncftp -f /home/scott/ncftplogin.cfg"

Your mileage may vary...

FAQ

1)

I installed a new version of module X but CPAN keeps saying, I have the old version installed

Probably you do have the old version installed. This can happen if a module installs itself into a different directory in the @INC path than the one it was previously installed in. This is not really a CPAN.pm problem; you would have the same problem when installing the module manually. The easiest way to prevent this behaviour is to add the argument UNINST=1 to the make install call, and that is why many people add this argument permanently by configuring

  o conf make_install_arg UNINST=1

2)

So why is UNINST=1 not the default?

Because there are people who have their precise expectations about who may install where in the @INC path and who uses which @INC array. In fine tuned environments UNINST=1 can cause damage.

3)

I want to clean up my mess, and install a new perl along with all modules I have. How do I go about it?

Run the autobundle command for your old perl and optionally rename the resulting bundle file (e.g. Bundle/mybundle.pm), install the new perl with the Configure option prefix, e.g.

  ./Configure -Dprefix=/usr/local/perl-5.6.78.9

Install the bundle file you produced in the first step with something like

  cpan> install Bundle::mybundle

and you're done.

4)

When I install bundles or multiple modules with one command there is too much output to keep track of.

You may want to configure something like

  o conf make_arg "| tee -ai /root/.cpan/logs/make.out"
  o conf make_install_arg "| tee -ai /root/.cpan/logs/make_install.out"

so that STDOUT is captured in a file for later inspection.

5)

I am not root, how can I install a module in a personal directory?

As of CPAN 1.9463, if you do not have permission to write to the default perl library directories, CPAN's configuration process will ask you whether you want to bootstrap local::lib, which makes keeping a personal perl library directory easy.

Another thing you should bear in mind is that the UNINST parameter can be dangerous when you are installing into a private area because you might accidentally remove modules that other people depend on that are not using the private area.

6)

How to get a package, unwrap it, and make a change before building it?

Have a look at the look (!) command.

7)

I installed a Bundle and had a couple of failures. When I retried, everything resolved nicely. Can this be fixed to work on the first try?

The reason for this is that CPAN does not know the dependencies of all modules when it starts out. To decide about the additional items to install, it just uses data found in the META.yml file or the generated Makefile. An undetected missing piece breaks the process. But it may well be that your Bundle installs some prerequisite later than some depending item and thus your second try is able to resolve everything. Please note that CPAN.pm does not know the dependency tree in advance and cannot sort the queue of things to install in a topologically correct order. It resolves perfectly well if all modules declare the prerequisites correctly with the PREREQ_PM attribute to MakeMaker or the requires stanza of Module::Build. For bundles that fail and that you need to install often, it is recommended to sort the Bundle definition file manually.

8)

In our intranet, we have many modules for internal use. How can I integrate these modules with CPAN.pm but without uploading the modules to CPAN?

Have a look at the CPAN::Site module.

9)

When I run CPAN's shell, I get an error message about things in my /etc/inputrc (or ~/.inputrc) file.

These are readline issues and can only be fixed by studying readline configuration on your architecture and adjusting the referenced file accordingly. Please make a backup of /etc/inputrc or ~/.inputrc before editing it. Quite often harmless changes like uppercasing or lowercasing some arguments solve the problem.

10)

Some authors have strange characters in their names.

Internally CPAN.pm uses the UTF-8 charset. If your terminal is expecting ISO-8859-1 charset, a converter can be activated by setting term_is_latin to a true value in your config file. One way of doing so would be

    cpan> o conf term_is_latin 1

If other charset support is needed, please file a bug report against CPAN.pm at rt.cpan.org and describe your needs. Maybe we can extend the support or maybe UTF-8 terminals become widely available.

Note: this config variable is deprecated and will be removed in a future version of CPAN.pm. It will be replaced with the conventions around the family of $LANG and $LC_* environment variables.

11)

When an install fails for some reason and then I correct the error condition and retry, CPAN.pm refuses to install the module, saying Already tried without success.

Use the force pragma like so

    force install Foo::Bar

Or you can use

    look Foo::Bar

and then make install directly in the subshell.

12)

How do I install a "DEVELOPER RELEASE" of a module?

By default, CPAN will install the latest non-developer release of a module. If you want to install a dev release, you have to specify the partial path starting with the author id to the tarball you wish to install, like so:

    cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz

Note that you can use the ls command to get this path listed.

13)

How do I install a module and all its dependencies from the commandline, without being prompted for anything, despite my CPAN configuration (or lack thereof)?

CPAN uses ExtUtils::MakeMaker's prompt() function to ask its questions, so if you set the PERL_MM_USE_DEFAULT environment variable, you shouldn't be asked any questions at all (assuming the modules you are installing are nice about obeying that variable as well):

    % PERL_MM_USE_DEFAULT=1 perl -MCPAN -e 'install My::Module'
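
The effect of PERL_MM_USE_DEFAULT on prompt() can be seen in isolation. The following is only a minimal sketch: the question text and the answer_for() helper are made up for illustration, and only prompt() and the environment variable come from ExtUtils::MakeMaker.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

# answer_for() is a hypothetical helper: it asks one question the way a
# Makefile.PL might, with PERL_MM_USE_DEFAULT set for the duration of the call.
sub answer_for {
    my ($question, $default) = @_;
    local $ENV{PERL_MM_USE_DEFAULT} = 1;    # prompt() takes the default, STDIN is not read
    return prompt($question, $default);
}

my $answer = answer_for('Run the network tests?', 'n');
print "answer: $answer\n";
```

Because the variable is set with local, it affects only the questions asked inside answer_for(); a module that reads STDIN directly instead of calling prompt() would still block.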

14)

How do I create a Module::Build based Build.PL derived from an ExtUtils::MakeMaker focused Makefile.PL?

See the Module::Build::Convert distribution: http://search.cpan.org/dist/Module-Build-Convert/

15)

I'm frequently irritated with the CPAN shell's inability to help me select a good mirror.

CPAN can now help you select a "good" mirror, based on which ones have the lowest 'ping' round-trip times. From the shell, use the command 'o conf init urllist' and allow CPAN to automatically select mirrors for you.

Beyond that help, the urllist config parameter is yours. You can add and remove sites at will. You should find out which sites have the best up-to-dateness, bandwidth, reliability, etc. and are topologically close to you. Some people prefer fast downloads, others up-to-dateness, others reliability. You decide which to try in which order.

Henk P. Penning maintains a site that collects data about CPAN sites:

    http://www.cs.uu.nl/people/henkp/mirmon/cpan.html

Also, feel free to play with experimental features. Run

    o conf init randomize_urllist ftpstats_period ftpstats_size

and choose your favorite parameters. After a few downloads running the hosts command will probably assist you in choosing the best mirror sites.

16)

Why do I get asked the same questions every time I start the shell?

You can make your configuration changes permanent by calling the command o conf commit. Alternatively, set the auto_commit variable to true by running o conf init auto_commit and answering the following question with yes.

17)

Older versions of CPAN.pm had the original root directory of all tarballs in the build directory. Now there are always random characters appended to these directory names. Why was this done?

The random characters are provided by File::Temp and ensure that each module's individual build directory is unique. This makes it safe to run CPAN.pm in several concurrent processes.
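
How File::Temp produces such names can be sketched as follows. This is an illustration of the technique, not CPAN.pm's actual code: the make_build_dir() helper, the template and the distribution name are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A throwaway parent directory standing in for CPAN's build directory.
my $build_root = tempdir(CLEANUP => 1);

# make_build_dir() is a hypothetical helper; File::Temp replaces the
# trailing Xs in the template with random characters.
sub make_build_dir {
    my ($distname) = @_;
    return tempdir("$distname-XXXXXX", DIR => $build_root);
}

# Two builds of the same distribution get two different directories.
my $first  = make_build_dir('Foo-Bar-1.00');
my $second = make_build_dir('Foo-Bar-1.00');
print "$first\n$second\n";
```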

18)

Speaking of the build directory: do I have to clean it up myself?

You have the choice to set the config variable scan_cache to never. Then you must clean it up yourself. The other possible values, atstart and atexit, clean up the build directory when you start or exit the CPAN shell, respectively. If you never start up the CPAN shell, you probably also have to clean up the build directory yourself.

COMPATIBILITY

OLD PERL VERSIONS

CPAN.pm is regularly tested to run under 5.005 and assorted newer versions. It is getting more and more difficult to get the minimal prerequisites working on older perls. It is close to impossible to get the whole Bundle::CPAN working there. If you're in the position to have only these old versions, be advised that CPAN is designed to work fine without the Bundle::CPAN installed.

To get things going, note that GBARR/Scalar-List-Utils-1.18.tar.gz is compatible with ancient perls and that File::Temp is listed as a prerequisite but CPAN has reasonable workarounds if it is missing.

CPANPLUS

This module and its competitor, the CPANPLUS module, are both much cooler than the other. CPAN.pm is older. CPANPLUS was designed to be more modular, but it was never intended to be compatible with CPAN.pm.

CPANMINUS

In the year 2010 App::cpanminus was launched as a new approach to a cpan shell with a considerably smaller footprint. Very cool stuff.

SECURITY ADVICE

This software enables you to upgrade software on your computer and so is inherently dangerous because the newly installed software may contain bugs and may alter the way your computer works or even make it unusable. Please consider backing up your data before every upgrade.

BUGS

Please report bugs via http://rt.cpan.org/

Before submitting a bug, please make sure that the traditional method of building a Perl module package from a shell by following the installation instructions of that package still works in your environment.

AUTHOR

Andreas Koenig <andk@cpan.org>

LICENSE

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

See http://www.perl.com/perl/misc/Artistic.html

TRANSLATIONS

Kawai,Takanori provides a Japanese translation of a very old version of this manpage at http://homepage3.nifty.com/hippo2000/perltips/CPAN.htm

SEE ALSO

Many people enter the CPAN shell by running the cpan utility program which is installed in the same directory as perl itself. So if you have this directory in your PATH variable (or some equivalent in your operating system) then typing cpan in a console window will work for you as well. Beyond that, the utility provides several commandline shortcuts.

melezhik (Alexey) sent me a link where he published a chef recipe to work with CPAN.pm: http://community.opscode.com/cookbooks/cpan.

 

CPANPLUS

NAME

CPANPLUS - API & CLI access to the CPAN mirrors

SYNOPSIS

    ### standard invocation from the command line
    $ cpanp
    $ cpanp -i Some::Module
    $ perl -MCPANPLUS -eshell
    $ perl -MCPANPLUS -e'fetch Some::Module'

DESCRIPTION

The CPANPLUS library is an API to the CPAN mirrors and a collection of interactive shells, commandline programs, etc, that use this API.

GUIDE TO DOCUMENTATION

GENERAL USAGE

This is the document you are currently reading. It describes basic usage and background information. Its main purpose is to help users learn how to invoke CPANPLUS and install modules from the commandline, and to point them to more in-depth reading if required.

API REFERENCE

The CPANPLUS API is meant to let you programmatically interact with the CPAN mirrors. The documentation in CPANPLUS::Backend shows you how to create an object capable of interacting with those mirrors, letting you create & retrieve module objects. CPANPLUS::Module shows you how you can use these module objects to perform actions like installing and testing.

The default shell, documented in CPANPLUS::Shell::Default is also scriptable. You can use its API to dispatch calls from your script to the CPANPLUS Shell.

COMMANDLINE TOOLS

STARTING AN INTERACTIVE SHELL

You can start an interactive shell by running either of the two following commands:

    $ cpanp
    $ perl -MCPANPLUS -eshell

All available commands are listed in the interactive shell's help menu. See cpanp -h or CPANPLUS::Shell::Default for instructions on using the default shell.

CHOOSE A SHELL

By running cpanp without arguments, you will start up the shell specified in your config, which defaults to CPANPLUS::Shell::Default. There are more shells available. CPANPLUS itself ships with an emulation shell called CPANPLUS::Shell::Classic that looks and feels just like the old CPAN.pm shell.

You can start this shell by typing:

    $ perl -MCPANPLUS -e'shell Classic'

Even more shells may be available from CPAN.

Note that if you have changed your default shell in your configuration, that shell will be used instead. If for some reason there was an error with your specified shell, you will be given the default shell.

BUILDING PACKAGES

cpan2dist is a commandline tool to convert any distribution from CPAN into a package in the format of your choice, such as .deb or FreeBSD ports.

See cpan2dist -h for details.

FUNCTIONS

For quick access to common commands, you may use this module, CPANPLUS, rather than the full programmatic API situated in CPANPLUS::Backend. This module offers the following functions:

$bool = install( Module::Name | /A/AU/AUTHOR/Module-Name-1.tgz )

This function requires the full name of the module, which is case sensitive. The module name can also be provided as a fully qualified file name, beginning with a /, relative to the /authors/id directory on a CPAN mirror.

It will download, extract and install the module.

$where = fetch( Module::Name | /A/AU/AUTHOR/Module-Name-1.tgz )

Like install, fetch needs the full name of a module or the fully qualified file name, and is case sensitive.

It will download the specified module to the current directory.

$where = get( Module::Name | /A/AU/AUTHOR/Module-Name-1.tgz )

Get is provided as an alias for fetch for compatibility with CPAN.pm.

shell()

Shell starts the default CPAN shell. You can also start the shell by using the cpanp command, which will be installed in your perl bin.

FAQ

For frequently asked questions and answers, please consult the CPANPLUS::FAQ manual.

BUG REPORTS

Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

AUTHOR

This module by Jos Boumans <kane@cpan.org>.

COPYRIGHT

The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

SEE ALSO

CPANPLUS::Shell::Default, CPANPLUS::FAQ, CPANPLUS::Backend, CPANPLUS::Module, cpanp, cpan2dist

CONTACT INFORMATION

  • Bug reporting: bug-cpanplus@rt.cpan.org
  • Questions & suggestions: bug-cpanplus@rt.cpan.org
 

Carp

NAME

Carp - alternative warn and die for modules

SYNOPSIS

    use Carp;
    # warn user (from perspective of caller)
    carp "string trimmed to 80 chars";
    # die of errors (from perspective of caller)
    croak "We're outta here!";
    # die of errors with stack backtrace
    confess "not implemented";
    # cluck, longmess and shortmess not exported by default
    use Carp qw(cluck longmess shortmess);
    cluck "This is how we got here!";
    $long_message = longmess( "message from cluck() or confess()" );
    $short_message = shortmess( "message from carp() or croak()" );

DESCRIPTION

The Carp routines are useful in your own modules because they act like die() or warn(), but with a message which is more likely to be useful to a user of your module. In the case of cluck() and confess(), the message includes a summary of every call in the call stack; longmess() returns the contents of that error message.

For a shorter message you can use carp() or croak() which report the error as being from where your module was called. shortmess() returns the contents of this error message. There is no guarantee that that is where the error was, but it is a good educated guess.

You can also alter the way the output and logic of Carp works, by changing some global variables in the Carp namespace. See the section on GLOBAL VARIABLES below.

Here is a more complete description of how carp and croak work. They search the call stack for a function call where they have not been told that there shouldn't be an error. If every call is marked safe, they give up and give a full stack backtrace instead. In other words, they presume that the first likely-looking potential suspect is guilty. Their rules for telling whether a call shouldn't generate errors work as follows:

1.

Any call from a package to itself is safe.

2.

Packages claim that there won't be errors on calls to or from packages explicitly marked as safe by inclusion in @CARP_NOT, or (if that array is empty) @ISA. The ability to override what @ISA says is new in 5.8.

3.

The trust in item 2 is transitive. If A trusts B, and B trusts C, then A trusts C. So if you do not override @ISA with @CARP_NOT, then this trust relationship is identical to "inherits from".

4.

Any call from an internal Perl module is safe. (Nothing keeps user modules from marking themselves as internal to Perl, but this practice is discouraged.)

5.

Any call to Perl's warning system (e.g. Carp itself) is safe. (This rule is what keeps it from reporting the error at the point where you call carp or croak.)

6.

$Carp::CarpLevel can be set to skip a fixed number of additional call levels. Using this is not recommended because it is very difficult to get it to behave correctly.
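
The rules above can be watched in action with a small script. This is a minimal sketch: My::Module and check_positive() are hypothetical names, and the warning is captured through $SIG{__WARN__} only so that its text can be inspected.

```perl
use strict;
use warnings;

package My::Module;    # hypothetical module
use Carp;

sub check_positive {
    my ($n) = @_;
    carp "expected a positive number" unless $n > 0;
    return $n > 0;
}

package main;

# Collect warnings instead of printing them, so the text can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $call_line = __LINE__ + 1;      # line number of the call below
My::Module::check_positive(-1);

print $warnings[0];
print "the call was made on line $call_line\n";
```

The frame inside Carp is skipped by rule 5, and the call from main into My::Module is the first one that nobody has marked as safe, so the warning names the line of the call in main rather than the line containing carp.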

Forcing a Stack Trace

As a debugging aid, you can force Carp to treat a croak as a confess and a carp as a cluck across all modules. In other words, force a detailed stack trace to be given. This can be very helpful when trying to understand why, or from where, a warning or error is being generated.

This feature is enabled by 'importing' the non-existent symbol 'verbose'. You would typically enable it by saying

    perl -MCarp=verbose script.pl

or by including the string -MCarp=verbose in the PERL5OPT environment variable.

Alternately, you can set the global variable $Carp::Verbose to true. See the GLOBAL VARIABLES section below.
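
A scoped variant of the same switch might look like the following sketch. The package and sub names are hypothetical; the only Carp feature relied on is $Carp::Verbose.

```perl
use strict;
use warnings;

package My::Module;    # hypothetical module
use Carp;

sub risky { carp "something looks off" }

package main;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

My::Module::risky();              # short message
{
    local $Carp::Verbose = 1;     # verbose only inside this block
    My::Module::risky();          # message with a stack backtrace
}

print "--- short ---\n$warnings[0]--- verbose ---\n$warnings[1]";
```

Using local keeps the setting from leaking into unrelated code, which the command-line form cannot do.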

GLOBAL VARIABLES

$Carp::MaxEvalLen

This variable determines how many characters of a string-eval are to be shown in the output. Use a value of 0 to show all text.

Defaults to 0.

$Carp::MaxArgLen

This variable determines how many characters of each argument to a function to print. Use a value of 0 to show the full length of the argument.

Defaults to 64.

$Carp::MaxArgNums

This variable determines how many arguments to each function to show. Use a value of 0 to show all arguments to a function call.

Defaults to 8.
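
As a rough illustration of the two argument limits, the sketch below shortens a backtrace that would otherwise contain a 200-character argument. The package, the sub and the chosen limits are assumptions; the exact truncated text can differ between Carp versions, so no output is shown.

```perl
use strict;
use warnings;

package My::Module;    # hypothetical module
use Carp qw(cluck);

sub process { cluck "processing" }

package main;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

{
    local $Carp::MaxArgLen  = 16;   # shorten each argument shown in the trace
    local $Carp::MaxArgNums = 2;    # limit how many arguments are shown
    My::Module::process('x' x 200, 1, 2, 3, 4, 5);
}

print $warnings[0];
```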

$Carp::Verbose

This variable makes carp() and croak() generate stack backtraces just like cluck() and confess(). This is how use Carp 'verbose' is implemented internally.

Defaults to 0.

@CARP_NOT

This variable, in your package, says which packages are not to be considered as the location of an error. The carp() and cluck() functions will skip over callers in those packages when reporting where an error occurred.

NB: This variable must be in the package's symbol table, thus:

    # These work
    our @CARP_NOT; # file scope
    use vars qw(@CARP_NOT); # package scope
    @My::Package::CARP_NOT = ... ; # explicit package variable
    # These don't work
    sub xyz { ... @CARP_NOT = ... } # w/o declarations above
    my @CARP_NOT; # even at top-level

Example of use:

    package My::Carping::Package;
    use Carp;
    our @CARP_NOT;
    sub bar { .... or _error('Wrong input') }
    sub _error {
        # temporary control of where'ness, __PACKAGE__ is implicit
        local @CARP_NOT = qw(My::Friendly::Caller);
        carp(@_)
    }

This would make Carp report the error as coming from a caller not in My::Carping::Package, nor from My::Friendly::Caller.

Also read the DESCRIPTION section above, about how Carp decides where the error is reported from.

Use @CARP_NOT instead of $Carp::CarpLevel.

Overrides Carp's use of @ISA.
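
A self-contained variant of the example can be run to see where the warning lands. It is a sketch that reuses the package names from the example; the subs bar() and run() are filled in hypothetically, and @CARP_NOT is set once at file scope instead of with local.

```perl
use strict;
use warnings;

package My::Carping::Package;
use Carp;

# Calls to or from the listed package are treated as safe.
our @CARP_NOT = qw(My::Friendly::Caller);

sub bar { carp "Wrong input" }

package My::Friendly::Caller;

sub run { My::Carping::Package::bar() }

package main;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $call_line = __LINE__ + 1;      # line number of the call below
My::Friendly::Caller::run();

print $warnings[0];
```

Without the @CARP_NOT line the warning would point at the call inside run(); with it, that frame is skipped and the call in main is reported.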

%Carp::Internal

This says what packages are internal to Perl. Carp will never report an error as being from a line in a package that is internal to Perl. For example:

    $Carp::Internal{ (__PACKAGE__) }++;
    # time passes...
    sub foo { ... or confess("whatever") };

would give a full stack backtrace starting from the first caller outside of __PACKAGE__. (Unless that package was also internal to Perl.)

%Carp::CarpInternal

This says which packages are internal to Perl's warning system. For generating a full stack backtrace this is the same as being internal to Perl: the stack backtrace will not start inside packages that are listed in %Carp::CarpInternal. But it is slightly different for the summary message generated by carp or croak. There, errors will not be reported on any lines that call packages in %Carp::CarpInternal.

For example, Carp itself is listed in %Carp::CarpInternal. Therefore the full stack backtrace from confess will not start inside of Carp, and the short message from calling croak is not placed on the line where croak was called.

$Carp::CarpLevel

This variable determines how many additional call frames are to be skipped that would not otherwise be when reporting where an error occurred on a call to one of Carp's functions. It is fairly easy to count these call frames on calls that generate a full stack backtrace. However it is much harder to do this accounting for calls that generate a short message. Usually people skip too many call frames. If they are lucky they skip enough that Carp goes all of the way through the call stack, realizes that something is wrong, and then generates a full stack backtrace. If they are unlucky then the error is reported from somewhere misleading very high in the call stack.

Therefore it is best to avoid $Carp::CarpLevel. Instead use @CARP_NOT, %Carp::Internal and %Carp::CarpInternal.

Defaults to 0.

BUGS

The Carp routines don't handle exception objects currently. If called with a first argument that is a reference, they simply call die() or warn(), as appropriate.

SEE ALSO

Carp::Always, Carp::Clan

AUTHOR

The Carp module first appeared in Larry Wall's perl 5.000 distribution. Since then it has been modified by several of the perl 5 porters. Andrew Main (Zefram) <zefram@fysh.org> divested Carp into an independent distribution.

COPYRIGHT

Copyright (C) 1994-2012 Larry Wall

Copyright (C) 2011, 2012 Andrew Main (Zefram) <zefram@fysh.org>

LICENSE

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

 

Config

NAME

Config - access Perl configuration information

SYNOPSIS

    use Config;
    if ($Config{usethreads}) {
        print "has thread support\n"
    }
    use Config qw(myconfig config_sh config_vars config_re);
    print myconfig();
    print config_sh();
    print config_re();
    config_vars(qw(osname archname));

DESCRIPTION

The Config module contains all the information that was available to the Configure program at Perl build time (over 900 values).

Shell variables from the config.sh file (written by Configure) are stored in the read-only variable %Config, indexed by their names.

Values stored in config.sh as 'undef' are returned as undefined values. The perl exists function can be used to check if a named variable exists.
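
The difference between a variable that is missing and one that is present but undefined can be checked as in this sketch. The describe() helper is hypothetical, and no_such_variable is deliberately made up.

```perl
use strict;
use warnings;
use Config;

# describe() is a hypothetical helper that distinguishes the three cases.
sub describe {
    my ($name) = @_;
    return "$name: no such Config variable" unless exists $Config{$name};
    return "$name: present, but undef"      unless defined $Config{$name};
    return "$name = '$Config{$name}'";
}

print describe($_), "\n" for qw(osname d_fork no_such_variable);
```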

For a description of the variables, please have a look at the Glossary file, as written in the Porting folder, or use the url: http://perl5.git.perl.org/perl.git/blob/HEAD:/Porting/Glossary

  • myconfig()

    Returns a textual summary of the major perl configuration values. See also -V in Command Switches in perlrun.

  • config_sh()

    Returns the entire perl configuration information in the form of the original config.sh shell variable assignment script.

  • config_re($regex)

    Like config_sh() but returns, as a list, only the config entries whose names match the $regex.

  • config_vars(@names)

    Prints to STDOUT the values of the named configuration variables. Each is printed on a separate line in the form:

        name='value';

    Names which are unknown are output as name='UNKNOWN'; (see also -V:name in Command Switches in perlrun).

  • bincompat_options()

    Returns a list of C pre-processor options used when compiling this perl binary, which affect its binary compatibility with extensions. bincompat_options() and non_bincompat_options() are shown together in the output of perl -V as Compile-time options.

  • non_bincompat_options()

    Returns a list of C pre-processor options used when compiling this perl binary, which do not affect binary compatibility with extensions.

  • compile_date()

    Returns the compile date (as a string), equivalent to what is shown by perl -V.

  • local_patches()

    Returns a list of the names of locally applied patches, equivalent to what is shown by perl -V.

  • header_files()

    Returns a list of the header files that should be used as dependencies for XS code, for this version of Perl on this platform.
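
As a brief sketch of the two query functions: config_re() hands back a list that the caller can process, while config_vars() prints on its own. The choice of osname and archname is arbitrary, and the chomp is only a precaution in case the returned strings carry a trailing newline.

```perl
use strict;
use warnings;
use Config qw(config_re config_vars);

# config_re() returns the matching entries as a list of strings.
my @entries = config_re('osname');
chomp @entries;
print "$_\n" for @entries;

# config_vars() prints name='value'; lines directly to STDOUT.
config_vars(qw(osname archname));
```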

EXAMPLE

Here's a more sophisticated example of using %Config:

    use Config;
    use strict;
    my %sig_num;
    my @sig_name;
    unless($Config{sig_name} && $Config{sig_num}) {
        die "No sigs?";
    } else {
        my @names = split ' ', $Config{sig_name};
        @sig_num{@names} = split ' ', $Config{sig_num};
        foreach (@names) {
            $sig_name[$sig_num{$_}] ||= $_;
        }
    }
    print "signal #17 = $sig_name[17]\n";
    if ($sig_num{ALRM}) {
        print "SIGALRM is $sig_num{ALRM}\n";
    }

WARNING

Because this information is not stored within the perl executable itself it is possible (but unlikely) that the information does not relate to the actual perl binary which is being used to access it.

The Config module is installed into the architecture and version specific library directory ($Config{installarchlib}) and it checks the perl version number when loaded.

The values stored in config.sh may be either single-quoted or double-quoted. Double-quoted strings are handy for those cases where you need to include escape sequences in the strings. To avoid runtime variable interpolation, any $ and @ characters are replaced by \$ and \@, respectively. This isn't foolproof, of course, so don't embed \$ or \@ in double-quoted strings unless you're willing to deal with the consequences. (The slashes will end up escaped and the $ or @ will trigger variable interpolation.)

GLOSSARY

Most Config variables are determined by the Configure script on platforms supported by it (which is most UNIX platforms). Some platforms have custom-made Config variables, and may thus not have some of the variables described below, or may have extraneous variables specific to that particular port. See the port specific documentation in such cases.

_

  • _a

    From Unix.U:

    This variable defines the extension used for ordinary library files. For unix, it is .a. The . is included. Other possible values include .lib.

  • _exe

    From Unix.U:

    This variable defines the extension used for executable files. DJGPP, Cygwin and OS/2 use .exe. Stratus VOS uses .pm. On operating systems which do not require a specific extension for executable files, this variable is empty.

  • _o

    From Unix.U:

    This variable defines the extension used for object files. For unix, it is .o. The . is included. Other possible values include .obj.
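
These three suffixes are typically used to build platform-specific file names. A minimal sketch, in which the helper subs and the base name foo are hypothetical and only the suffixes come from %Config:

```perl
use strict;
use warnings;
use Config;

# Hypothetical helpers: append the platform's suffix to a base name.
sub object_file  { my ($base) = @_; return $base . $Config{_o}   }
sub library_file { my ($base) = @_; return $base . $Config{_a}   }
sub executable   { my ($base) = @_; return $base . $Config{_exe} }

print join(' ', object_file('foo'), library_file('foo'), executable('foo')), "\n";
```

Since the dot is part of each value, no separator is added between the base name and the suffix.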

a

  • afs

    From afs.U:

    This variable is set to true if AFS (Andrew File System) is used on the system, false otherwise. It is possible to override this with a hint value or command line option, but you'd better know what you are doing.

  • afsroot

    From afs.U:

    This variable is by default set to /afs. In the unlikely case this is not the correct root, it is possible to override this with a hint value or command line option. This will be used in subsequent tests for AFSness in the configure and test process.

  • alignbytes

    From alignbytes.U:

    This variable holds the number of bytes required to align a double, or a long double when applicable. Usual values are 2, 4 and 8. The default is eight, for safety.

  • ansi2knr

    From ansi2knr.U:

    This variable is set if the user needs to run ansi2knr. Currently, this is not supported, so we just abort.

  • aphostname

    From d_gethname.U:

    This variable contains the command which can be used to compute the host name. The command is fully qualified by its absolute path, to make it safe when used by a process with super-user privileges.

  • api_revision

    From patchlevel.U:

    The three variables, api_revision, api_version, and api_subversion, specify the version of the oldest perl binary compatible with the present perl. In a full version string such as 5.6.1, api_revision is the 5. Prior to 5.5.640, the format was a floating point number, like 5.00563.

    perl.c:incpush() and lib/lib.pm will automatically search in $sitelib/.. for older directories back to the limit specified by these api_ variables. This is only useful if you have a perl library directory tree structured like the default one. See INSTALL for how this works. The versioned site_perl directory was introduced in 5.005, so that is the lowest possible value. The version list appropriate for the current system is determined in inc_version_list.U.

    XXX To do: Since compatibility can depend on compile time options (such as bincompat, longlong, etc.) it should (perhaps) be set by Configure, but currently it isn't. Currently, we read a hard-wired value from patchlevel.h. Perhaps what we ought to do is take the hard-wired value from patchlevel.h but then modify it if the current Configure options warrant. patchlevel.h then would use an #ifdef guard.

  • api_subversion

    From patchlevel.U:

    The three variables, api_revision, api_version, and api_subversion, specify the version of the oldest perl binary compatible with the present perl. In a full version string such as 5.6.1, api_subversion is the 1. See api_revision for full details.

  • api_version

    From patchlevel.U:

    The three variables, api_revision, api_version, and api_subversion, specify the version of the oldest perl binary compatible with the present perl. In a full version string such as 5.6.1, api_version is the 6. See api_revision for full details. As a special case, 5.5.0 is rendered in the old-style as 5.005. (In the 5.005_0x maintenance series, this was the only versioned directory in $sitelib.)

  • api_versionstring

    From patchlevel.U:

    This variable combines api_revision, api_version, and api_subversion in a format such as 5.6.1 (or 5_6_1) suitable for use as a directory name. This is filesystem dependent.

  • ar

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the ar program. After Configure runs, the value is reset to a plain ar and is not useful.

  • archlib

    From archlib.U:

    This variable holds the name of the directory in which the user wants to put architecture-dependent public library files for $package. It is most often a local directory such as /usr/local/lib. Programs using this variable must be prepared to deal with filename expansion.

  • archlibexp

    From archlib.U:

    This variable is the same as the archlib variable, but is filename expanded at configuration time, for convenient use.

  • archname

    From archname.U:

    This variable is a short name to characterize the current architecture. It is used mainly to construct the default archlib.

  • archname64

    From use64bits.U:

    This variable is used for the 64-bitness part of $archname.

  • archobjs

    From Unix.U:

    This variable defines any additional objects that must be linked in with the program on this architecture. On unix, it is usually empty. It is typically used to include emulations of unix calls or other facilities. For perl on OS/2, for example, this would include os2/os2.obj.

  • asctime_r_proto

    From d_asctime_r.U:

    This variable encodes the prototype of asctime_r. It is zero if d_asctime_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_asctime_r is defined.

  • awk

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the awk program. After Configure runs, the value is reset to a plain awk and is not useful.

b

  • baserev

    From baserev.U:

    The base revision level of this package, from the .package file.

  • bash

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • bin

    From bin.U:

    This variable holds the name of the directory in which the user wants to put publicly executable images for the package in question. It is most often a local directory such as /usr/local/bin. Programs using this variable must be prepared to deal with ~name substitution.

  • bin_ELF

    From dlsrc.U:

    This variable saves the result from configure if generated binaries are in ELF format. Only set to defined when the test has actually been performed, and the result was positive.

  • binexp

    From bin.U:

    This is the same as the bin variable, but is filename expanded at configuration time, for use in your makefiles.

  • bison

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the bison program. After Configure runs, the value is reset to a plain bison and is not useful.

  • bootstrap_charset

    From ebcdic.U:

    This variable conditionally defines BOOTSTRAP_CHARSET if this system uses non-ASCII encoding.

  • byacc

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the byacc program. After Configure runs, the value is reset to a plain byacc and is not useful.

  • byteorder

    From byteorder.U:

    This variable holds the byte order in a UV. In the following, larger digits indicate more significance. The variable byteorder is 4321 on a big-endian machine, 1234 on a little-endian machine, 87654321 on a Cray, or 3412 on a machine with a weird order.
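
    As a rough illustration of the digit convention above, the following sketch classifies a byteorder string. The helper name describe_byteorder is hypothetical (it is not part of Config.pm), and the sketch assumes the value consists only of the digits 1 through 8.

    ```perl
    use strict;
    use warnings;
    use Config;

    # Classify a byteorder string such as "1234", "4321", "87654321" or "3412".
    # describe_byteorder() is a hypothetical helper, not part of Config.pm.
    sub describe_byteorder {
        my ($order) = @_;
        return 'unknown' unless defined $order && $order =~ /\A[1-8]+\z/;

        my $width  = length $order;
        my $little = join '', 1 .. $width;            # e.g. "1234"
        my $big    = join '', reverse 1 .. $width;    # e.g. "4321"

        return 'little-endian' if $order eq $little;
        return 'big-endian'    if $order eq $big;
        return 'mixed';
    }

    my $order = $Config{byteorder};
    printf "byteorder %s is %s\n", $order // '(undef)', describe_byteorder($order);
    ```

    The comparison is done on strings rather than numbers so that 8-digit values are handled the same way as 4-digit ones.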

c

  • c

    From n.U:

    This variable contains the \c string if that is what causes the echo command to suppress newline. Otherwise it is null. Correct usage is $echo $n "prompt for a question: $c".

  • castflags

    From d_castneg.U:

    This variable contains a flag that specifies the difficulties the compiler has casting odd floating values to unsigned long:

      0 = ok
      1 = couldn't cast < 0
      2 = couldn't cast >= 0x80000000
      4 = couldn't cast in argument expression list
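
    Because the values 1, 2 and 4 can be combined, the flag is best read as a bit mask. A minimal sketch of decoding it follows; castflags_problems is a hypothetical helper name, and treating the value as an OR of the listed bits is an assumption based on the power-of-two values above.

    ```perl
    use strict;
    use warnings;
    use Config;

    # Bit values and meanings as listed in the castflags entry.
    my @CAST_PROBLEMS = (
        [ 1, "couldn't cast < 0" ],
        [ 2, "couldn't cast >= 0x80000000" ],
        [ 4, "couldn't cast in argument expression list" ],
    );

    # castflags_problems() is a hypothetical helper, not part of Config.pm.
    sub castflags_problems {
        my ($flags) = @_;
        $flags = int($flags // 0);    # force a number so & is a numeric AND
        return map { $_->[1] } grep { $flags & $_->[0] } @CAST_PROBLEMS;
    }

    my @problems = castflags_problems($Config{castflags});
    print @problems ? join("\n", @problems) . "\n" : "ok\n";
    ```

    The explicit int() matters: Perl's & operator performs a string-wise AND when both operands are strings, which is not what is wanted here.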

  • cat

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the cat program. After Configure runs, the value is reset to a plain cat and is not useful.

  • cc

    From cc.U:

    This variable holds the name of a command to execute a C compiler which can resolve multiple global references that happen to have the same name. Usual values are cc and gcc. Fervent ANSI compilers may be called c89. AIX has xlc.

  • cccdlflags

    From dlsrc.U:

    This variable contains any special flags that might need to be passed with cc -c to compile modules to be used to create a shared library that will be used for dynamic loading. For hpux, this should be +z. It is up to the makefile to use it.

  • ccdlflags

    From dlsrc.U:

    This variable contains any special flags that might need to be passed to cc to link with a shared library for dynamic loading. It is up to the makefile to use it. For sunos 4.1, it should be empty.

  • ccflags

    From ccflags.U:

    This variable contains any additional C compiler flags desired by the user. It is up to the Makefile to use this.

  • ccflags_uselargefiles

    From uselfs.U:

    This variable contains the compiler flags needed by large file builds and added to ccflags by hints files.

  • ccname

    From Checkcc.U:

    This can be set either by hints files or by Configure. If using gcc, this is gcc, and if not, usually equal to cc, unimpressive, no? Some platforms, however, make good use of this by storing the flavor of the C compiler being used here. For example, if using the Sun WorkShop suite, ccname will be workshop.

  • ccsymbols

    From Cppsym.U:

    This variable contains the symbols defined by the C compiler alone. The symbols defined by cpp or by cc when it calls cpp are not in this list; see cppsymbols and cppccsymbols. The list is a space-separated list of symbol=value tokens.

  • ccversion

    From Checkcc.U:

    This can be set either by hints files or by Configure. If using a (non-gcc) vendor cc, this variable may contain a version for the compiler.

  • cf_by

    From cf_who.U:

    Login name of the person who ran the Configure script and answered the questions. This is used to tag both config.sh and config_h.SH.

  • cf_email

    From cf_email.U:

    Electronic mail address of the person who ran Configure. This can be used by units that require the user's e-mail, like MailList.U.

  • cf_time

    From cf_who.U:

    Holds the output of the date command when the configuration file was produced. This is used to tag both config.sh and config_h.SH.

  • charbits

    From charsize.U:

    This variable contains the value of the CHARBITS symbol, which indicates to the C program how many bits there are in a character.

  • charsize

    From charsize.U:

    This variable contains the value of the CHARSIZE symbol, which indicates to the C program how many bytes there are in a character.

  • chgrp

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • chmod

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the chmod program. After Configure runs, the value is reset to a plain chmod and is not useful.

  • chown

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • clocktype

    From d_times.U:

    This variable holds the type returned by times(). It can be long, or clock_t on BSD sites (in which case <sys/types.h> should be included).

  • comm

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the comm program. After Configure runs, the value is reset to a plain comm and is not useful.

  • compress

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • config_arg0

    From Options.U:

    This variable contains the string used to invoke the Configure command, as reported by the shell in the $0 variable.

  • config_argc

    From Options.U:

    This variable contains the number of command-line arguments passed to Configure, as reported by the shell in the $# variable. The individual arguments are stored as variables config_arg1, config_arg2, etc.

  • config_args

    From Options.U:

    This variable contains a single string giving the command-line arguments passed to Configure. Spaces within arguments, quotes, and escaped characters are not correctly preserved. To reconstruct the command line, you must assemble the individual command line pieces, given in config_arg[0-9]*.
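
    A minimal sketch of the reassembly described above, collecting config_arg0 through config_argN, where N is config_argc. The helper name configure_argv is hypothetical; it takes a hash reference so that it can be tried on sample data as well as on %Config.

    ```perl
    use strict;
    use warnings;
    use Config;

    # configure_argv() is a hypothetical helper, not part of Config.pm.
    # It returns config_arg0 .. config_argN as a list, where N is config_argc.
    sub configure_argv {
        my ($cfg) = @_;
        my $argc = $cfg->{config_argc} || 0;
        return map { $cfg->{"config_arg$_"} } 0 .. $argc;
    }

    my @argv = configure_argv(\%Config);
    print join(' ', map { defined $_ ? "[$_]" : '[?]' } @argv), "\n";
    ```

    Each argument is printed in brackets so that embedded spaces, which config_args does not preserve reliably, remain visible.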

  • contains

    From contains.U:

    This variable holds the command to do a grep with a proper return status. On most sane systems it is simply grep. On insane systems it is a grep followed by a cat followed by a test. This variable is primarily for the use of other Configure units.

  • cp

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the cp program. After Configure runs, the value is reset to a plain cp and is not useful.

  • cpio

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • cpp

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the cpp program. After Configure runs, the value is reset to a plain cpp and is not useful.

  • cpp_stuff

    From cpp_stuff.U:

    This variable contains an identification of the concatenation mechanism used by the C preprocessor.

  • cppccsymbols

    From Cppsym.U:

    This variable contains the symbols defined by the C compiler when it calls cpp. The symbols defined by cc alone or cpp alone are not in this list; see ccsymbols and cppsymbols. The list is a space-separated list of symbol=value tokens.

  • cppflags

    From ccflags.U:

    This variable holds the flags that will be passed to the C preprocessor. It is up to the Makefile to use it.

  • cpplast

    From cppstdin.U:

    This variable has the same functionality as cppminus, only it applies to cpprun and not cppstdin.

  • cppminus

    From cppstdin.U:

    This variable contains the second part of the string which will invoke the C preprocessor on the standard input and produce its result on standard output. This variable will have the value - if cppstdin needs a minus to specify standard input; otherwise the value is "".

  • cpprun

    From cppstdin.U:

    This variable contains the command which will invoke a C preprocessor on standard input and put the output to stdout. It is guaranteed not to be a wrapper and may be a null string if no preprocessor can be made directly available. This preprocessor might be different from the one used by the C compiler. Don't forget to append cpplast after the preprocessor options.

  • cppstdin

    From cppstdin.U:

    This variable contains the command which will invoke the C preprocessor on standard input and put the output to stdout. It is primarily used by other Configure units that ask about preprocessor symbols.
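
    The cppstdin/cppminus and cpprun/cpplast pairs are meant to be combined into a single command line, with any preprocessor options placed between the two halves. The following is a sketch only: preprocessor_command is a hypothetical helper, and placing cppflags in the middle is an assumption drawn from the cpprun entry's advice to append cpplast after the options.

    ```perl
    use strict;
    use warnings;
    use Config;

    # preprocessor_command() is a hypothetical helper, not part of Config.pm.
    # $which selects the pair to use: 'cppstdin' (default) or 'cpprun'.
    sub preprocessor_command {
        my ($cfg, $which) = @_;
        $which //= 'cppstdin';
        my ($cmd, $tail) = $which eq 'cpprun'
            ? ($cfg->{cpprun},   $cfg->{cpplast})
            : ($cfg->{cppstdin}, $cfg->{cppminus});
        # Skip undefined or empty parts so no stray spaces are produced.
        return join ' ', grep { defined($_) && length($_) }
            $cmd, $cfg->{cppflags}, $tail;
    }

    print preprocessor_command(\%Config), "\n";
    print preprocessor_command(\%Config, 'cpprun'), "\n";
    ```

    The resulting string is intended to be run through a shell with C source supplied on standard input; whether it works outside the build environment depends on the platform.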

  • cppsymbols

    From Cppsym.U:

    This variable contains the symbols defined by the C preprocessor alone. The symbols defined by cc alone or by cc when it calls cpp are not in this list; see ccsymbols and cppccsymbols. The list is a space-separated list of symbol=value tokens.
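
    Since ccsymbols, cppsymbols and cppccsymbols share the same space-separated symbol=value layout, one routine can parse all three. This is a minimal sketch: parse_symbols is a hypothetical helper, and it assumes that values contain no unescaped whitespace, which may not hold on every platform.

    ```perl
    use strict;
    use warnings;
    use Config;

    # parse_symbols() is a hypothetical helper, not part of Config.pm.
    # It turns "A=1 B=2" into { A => '1', B => '2' }.
    sub parse_symbols {
        my ($list) = @_;
        my %symbol;
        for my $token (split ' ', $list // '') {
            # Split on the first "=" only, so values may contain "=".
            my ($name, $value) = split /=/, $token, 2;
            $symbol{$name} = $value // '';
        }
        return \%symbol;
    }

    for my $key (qw(ccsymbols cppsymbols cppccsymbols)) {
        my $symbols = parse_symbols($Config{$key});
        printf "%-13s %d symbol(s)\n", $key, scalar keys %$symbols;
    }
    ```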

  • crypt_r_proto

    From d_crypt_r.U:

    This variable encodes the prototype of crypt_r. It is zero if d_crypt_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_crypt_r is defined.

  • cryptlib

    From d_crypt.U:

    This variable holds -lcrypt or the path to a libcrypt.a archive if the crypt() function is not defined in the standard C library. It is up to the Makefile to use this.

  • csh

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the csh program. After Configure runs, the value is reset to a plain csh and is not useful.

  • ctermid_r_proto

    From d_ctermid_r.U:

    This variable encodes the prototype of ctermid_r. It is zero if d_ctermid_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_ctermid_r is defined.

  • ctime_r_proto

    From d_ctime_r.U:

    This variable encodes the prototype of ctime_r. It is zero if d_ctime_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_ctime_r is defined.

d

  • d__fwalk

    From d__fwalk.U:

    This variable conditionally defines HAS__FWALK if _fwalk() is available to apply a function to all the file handles.

  • d_access

    From d_access.U:

    This variable conditionally defines HAS_ACCESS if the access() system call is available to check for access permissions using real IDs.

  • d_accessx

    From d_accessx.U:

    This variable conditionally defines the HAS_ACCESSX symbol, which indicates to the C program that the accessx() routine is available.

  • d_aintl

    From d_aintl.U:

    This variable conditionally defines the HAS_AINTL symbol, which indicates to the C program that the aintl() routine is available. If copysignl is also present we can emulate modfl.

  • d_alarm

    From d_alarm.U:

    This variable conditionally defines the HAS_ALARM symbol, which indicates to the C program that the alarm() routine is available.

  • d_archlib

    From archlib.U:

    This variable conditionally defines ARCHLIB to hold the pathname of architecture-dependent library files for $package. If $archlib is the same as $privlib, then this is set to undef.

  • d_asctime64

    From d_timefuncs64.U:

    This variable conditionally defines the HAS_ASCTIME64 symbol, which indicates to the C program that the asctime64() routine is available.

  • d_asctime_r

    From d_asctime_r.U:

    This variable conditionally defines the HAS_ASCTIME_R symbol, which indicates to the C program that the asctime_r() routine is available.

  • d_atolf

    From atolf.U:

    This variable conditionally defines the HAS_ATOLF symbol, which indicates to the C program that the atolf() routine is available.

  • d_atoll

    From atoll.U:

    This variable conditionally defines the HAS_ATOLL symbol, which indicates to the C program that the atoll() routine is available.

  • d_attribute_deprecated

    From d_attribut.U:

    This variable conditionally defines HASATTRIBUTE_DEPRECATED, which indicates that GCC can handle the attribute for marking deprecated APIs.

  • d_attribute_format

    From d_attribut.U:

    This variable conditionally defines HASATTRIBUTE_FORMAT, which indicates the C compiler can check for printf-like formats.

  • d_attribute_malloc

    From d_attribut.U:

    This variable conditionally defines HASATTRIBUTE_MALLOC, which indicates the C compiler can understand functions as having malloc-like semantics.

  • d_attribute_nonnull

    From d_attribut.U:

    This variable conditionally defines HASATTRIBUTE_NONNULL, which indicates that the C compiler can know that certain arguments must not be NULL, and will check accordingly at compile time.

  • d_attribute_noreturn

    From d_attribut.U:

    This variable conditionally defines HASATTRIBUTE_NORETURN, which indicates that the C compiler can know that certain functions are guaranteed never to return.

  • d_attribute_pure

    From d_attribut.U:

    This variable conditionally defines HASATTRIBUTE_PURE, which indicates that the C compiler can know that certain functions are pure functions, meaning that they have no side effects, and only rely on function input and/or global data for their results.

  • d_attribute_unused

    From d_attribut.U:

    This variable conditionally defines HASATTRIBUTE_UNUSED, which indicates that the C compiler can know that certain variables and arguments may not always be used, and will not throw warnings if they don't get used.

  • d_attribute_warn_unused_result

    From d_attribut.U:

    This variable conditionally defines HASATTRIBUTE_WARN_UNUSED_RESULT, which indicates that the C compiler can know that certain functions have return values that must not be ignored, such as malloc() or open().

  • d_bcmp

    From d_bcmp.U:

    This variable conditionally defines the HAS_BCMP symbol if the bcmp() routine is available to compare strings.

  • d_bcopy

    From d_bcopy.U:

    This variable conditionally defines the HAS_BCOPY symbol if the bcopy() routine is available to copy strings.

  • d_bsd

    From Guess.U:

    This variable conditionally defines the symbol BSD when running on a BSD system.

  • d_bsdgetpgrp

    From d_getpgrp.U:

    This variable conditionally defines USE_BSD_GETPGRP if getpgrp needs one argument whereas the USG one needs none.

  • d_bsdsetpgrp

    From d_setpgrp.U:

    This variable conditionally defines USE_BSD_SETPGRP if setpgrp needs two arguments whereas the USG one needs none. See also d_setpgid for a POSIX interface.

  • d_builtin_choose_expr

    From d_builtin.U:

    This conditionally defines HAS_BUILTIN_CHOOSE_EXPR, which indicates that the compiler supports __builtin_choose_expr(x,y,z). This built-in function is analogous to the x?y:z operator in C, except that the expression returned has its type unaltered by promotion rules. Also, the built-in function does not evaluate the expression that was not chosen.

  • d_builtin_expect

    From d_builtin.U:

    This conditionally defines HAS_BUILTIN_EXPECT, which indicates that the compiler supports __builtin_expect(exp,c). You may use __builtin_expect to provide the compiler with branch prediction information.

  • d_bzero

    From d_bzero.U:

    This variable conditionally defines the HAS_BZERO symbol if the bzero() routine is available to set memory to 0.

  • d_c99_variadic_macros

    From d_c99_variadic.U:

    This variable conditionally defines the HAS_C99_VARIADIC_MACROS symbol, which indicates to the C program that C99 variadic macros are available.

  • d_casti32

    From d_casti32.U:

    This variable conditionally defines CASTI32, which indicates whether the C compiler can cast large floats to 32-bit ints.

  • d_castneg

    From d_castneg.U:

    This variable conditionally defines CASTNEG, which indicates whether the C compiler can cast negative floats to unsigned.

  • d_charvspr

    From d_vprintf.U:

    This variable conditionally defines CHARVSPRINTF if this system has vsprintf returning type (char*). The trend seems to be to declare it as "int vsprintf()".

  • d_chown

    From d_chown.U:

    This variable conditionally defines the HAS_CHOWN symbol, which indicates to the C program that the chown() routine is available.

  • d_chroot

    From d_chroot.U:

    This variable conditionally defines the HAS_CHROOT symbol, which indicates to the C program that the chroot() routine is available.

  • d_chsize

    From d_chsize.U:

    This variable conditionally defines the CHSIZE symbol, which indicates to the C program that the chsize() routine is available to truncate files. You might need a -lx to get this routine.

  • d_class

    From d_class.U:

    This variable conditionally defines the HAS_CLASS symbol, which indicates to the C program that the class() routine is available.

  • d_clearenv

    From d_clearenv.U:

    This variable conditionally defines the HAS_CLEARENV symbol, which indicates to the C program that the clearenv() routine is available.

  • d_closedir

    From d_closedir.U:

    This variable conditionally defines HAS_CLOSEDIR if closedir() is available.

  • d_cmsghdr_s

    From d_cmsghdr_s.U:

    This variable conditionally defines the HAS_STRUCT_CMSGHDR symbol, which indicates that the struct cmsghdr is supported.

  • d_const

    From d_const.U:

    This variable conditionally defines the HASCONST symbol, which indicates to the C program that this C compiler knows about the const type.

  • d_copysignl

    From d_copysignl.U:

    This variable conditionally defines the HAS_COPYSIGNL symbol, which indicates to the C program that the copysignl() routine is available. If aintl is also present we can emulate modfl.

  • d_cplusplus

    From d_cplusplus.U:

    This variable conditionally defines the USE_CPLUSPLUS symbol, which indicates that a C++ compiler was used to compile Perl and will be used to compile extensions.

  • d_crypt

    From d_crypt.U:

    This variable conditionally defines the CRYPT symbol, which indicates to the C program that the crypt() routine is available to encrypt passwords and the like.

  • d_crypt_r

    From d_crypt_r.U:

    This variable conditionally defines the HAS_CRYPT_R symbol, which indicates to the C program that the crypt_r() routine is available.

  • d_csh

    From d_csh.U:

    This variable conditionally defines the CSH symbol, which indicates to the C program that the C-shell exists.

  • d_ctermid

    From d_ctermid.U:

    This variable conditionally defines CTERMID if ctermid() is available to generate the filename for the terminal.

  • d_ctermid_r

    From d_ctermid_r.U:

    This variable conditionally defines the HAS_CTERMID_R symbol, which indicates to the C program that the ctermid_r() routine is available.

  • d_ctime64

    From d_timefuncs64.U:

    This variable conditionally defines the HAS_CTIME64 symbol, which indicates to the C program that the ctime64() routine is available.

  • d_ctime_r

    From d_ctime_r.U:

    This variable conditionally defines the HAS_CTIME_R symbol, which indicates to the C program that the ctime_r() routine is available.

  • d_cuserid

    From d_cuserid.U:

    This variable conditionally defines the HAS_CUSERID symbol, which indicates to the C program that the cuserid() routine is available to get character login names.

  • d_dbl_dig

    From d_dbl_dig.U:

    This variable conditionally defines d_dbl_dig if this system's header files provide DBL_DIG, which is the number of significant digits in a double precision number.

  • d_dbminitproto

    From d_dbminitproto.U:

    This variable conditionally defines the HAS_DBMINIT_PROTO symbol, which indicates to the C program that the system provides a prototype for the dbminit() function. Otherwise, it is up to the program to supply one.

  • d_difftime

    From d_difftime.U:

    This variable conditionally defines the HAS_DIFFTIME symbol, which indicates to the C program that the difftime() routine is available.

  • d_difftime64

    From d_timefuncs64.U:

    This variable conditionally defines the HAS_DIFFTIME64 symbol, which indicates to the C program that the difftime64() routine is available.

  • d_dir_dd_fd

    From d_dir_dd_fd.U:

    This variable conditionally defines the HAS_DIR_DD_FD symbol, which indicates that the DIR directory stream type contains a member variable called dd_fd.

  • d_dirfd

    From d_dirfd.U:

    This variable conditionally defines the HAS_DIRFD constant, which indicates to the C program that dirfd() is available to return the file descriptor of a directory stream.

  • d_dirnamlen

    From i_dirent.U:

    This variable conditionally defines DIRNAMLEN, which indicates to the C program that the length of directory entry names is provided by a d_namelen field.

  • d_dlerror

    From d_dlerror.U:

    This variable conditionally defines the HAS_DLERROR symbol, which indicates to the C program that the dlerror() routine is available.

  • d_dlopen

    From d_dlopen.U:

    This variable conditionally defines the HAS_DLOPEN symbol, which indicates to the C program that the dlopen() routine is available.

  • d_dlsymun

    From d_dlsymun.U:

    This variable conditionally defines DLSYM_NEEDS_UNDERSCORE, which indicates that we need to prepend an underscore to the symbol name before calling dlsym().

  • d_dosuid

    From d_dosuid.U:

    This variable conditionally defines the symbol DOSUID, which tells the C program that it should insert setuid emulation code on hosts which have setuid #! scripts disabled.

  • d_drand48_r

    From d_drand48_r.U:

    This variable conditionally defines the HAS_DRAND48_R symbol, which indicates to the C program that the drand48_r() routine is available.

  • d_drand48proto

    From d_drand48proto.U:

    This variable conditionally defines the HAS_DRAND48_PROTO symbol, which indicates to the C program that the system provides a prototype for the drand48() function. Otherwise, it is up to the program to supply one.

  • d_dup2

    From d_dup2.U:

    This variable conditionally defines HAS_DUP2 if dup2() is available to duplicate file descriptors.

  • d_eaccess

    From d_eaccess.U:

    This variable conditionally defines the HAS_EACCESS symbol, which indicates to the C program that the eaccess() routine is available.

  • d_endgrent

    From d_endgrent.U:

    This variable conditionally defines the HAS_ENDGRENT symbol, which indicates to the C program that the endgrent() routine is available for sequential access of the group database.

  • d_endgrent_r

    From d_endgrent_r.U:

    This variable conditionally defines the HAS_ENDGRENT_R symbol, which indicates to the C program that the endgrent_r() routine is available.

  • d_endhent

    From d_endhent.U:

    This variable conditionally defines HAS_ENDHOSTENT if endhostent() is available to close whatever was being used for host queries.

  • d_endhostent_r

    From d_endhostent_r.U:

    This variable conditionally defines the HAS_ENDHOSTENT_R symbol, which indicates to the C program that the endhostent_r() routine is available.

  • d_endnent

    From d_endnent.U:

    This variable conditionally defines HAS_ENDNETENT if endnetent() is available to close whatever was being used for network queries.

  • d_endnetent_r

    From d_endnetent_r.U:

    This variable conditionally defines the HAS_ENDNETENT_R symbol, which indicates to the C program that the endnetent_r() routine is available.

  • d_endpent

    From d_endpent.U:

    This variable conditionally defines HAS_ENDPROTOENT if endprotoent() is available to close whatever was being used for protocol queries.

  • d_endprotoent_r

    From d_endprotoent_r.U:

    This variable conditionally defines the HAS_ENDPROTOENT_R symbol, which indicates to the C program that the endprotoent_r() routine is available.

  • d_endpwent

    From d_endpwent.U:

    This variable conditionally defines the HAS_ENDPWENT symbol, which indicates to the C program that the endpwent() routine is available for sequential access of the passwd database.

  • d_endpwent_r

    From d_endpwent_r.U:

    This variable conditionally defines the HAS_ENDPWENT_R symbol, which indicates to the C program that the endpwent_r() routine is available.

  • d_endsent

    From d_endsent.U:

    This variable conditionally defines HAS_ENDSERVENT if endservent() is available to close whatever was being used for service queries.

  • d_endservent_r

    From d_endservent_r.U:

    This variable conditionally defines the HAS_ENDSERVENT_R symbol, which indicates to the C program that the endservent_r() routine is available.

  • d_eofnblk

    From nblock_io.U:

    This variable conditionally defines EOF_NONBLOCK if EOF can be seen when reading from a non-blocking I/O source.

  • d_eunice

    From Guess.U:

    This variable conditionally defines the symbols EUNICE and VAX, which alert the C program that it must deal with idiosyncrasies of VMS.

  • d_faststdio

    From d_faststdio.U:

    This variable conditionally defines the HAS_FAST_STDIO symbol, which indicates to the C program that the "fast stdio" is available to manipulate the stdio buffers directly.

  • d_fchdir

    From d_fchdir.U:

    This variable conditionally defines the HAS_FCHDIR symbol, which indicates to the C program that the fchdir() routine is available.

  • d_fchmod

    From d_fchmod.U:

    This variable conditionally defines the HAS_FCHMOD symbol, which indicates to the C program that the fchmod() routine is available to change mode of opened files.

  • d_fchown

    From d_fchown.U:

    This variable conditionally defines the HAS_FCHOWN symbol, which indicates to the C program that the fchown() routine is available to change ownership of opened files.

  • d_fcntl

    From d_fcntl.U:

    This variable conditionally defines the HAS_FCNTL symbol, and indicates whether the fcntl() function exists.

  • d_fcntl_can_lock

    From d_fcntl_can_lock.U:

    This variable conditionally defines the FCNTL_CAN_LOCK symbol and indicates whether file locking with fcntl() works.

  • d_fd_macros

    From d_fd_set.U:

    This variable contains the eventual value of the HAS_FD_MACROS symbol, which indicates if your C compiler knows about the macros which manipulate an fd_set.

  • d_fd_set

    From d_fd_set.U:

    This variable contains the eventual value of the HAS_FD_SET symbol, which indicates if your C compiler knows about the fd_set typedef.

  • d_fds_bits

    From d_fd_set.U:

    This variable contains the eventual value of the HAS_FDS_BITS symbol, which indicates if your fd_set typedef contains the fds_bits member. If you have an fd_set typedef, but the dweebs who installed it did a half-fast job and neglected to provide the macros to manipulate an fd_set, HAS_FDS_BITS will let us know how to fix the gaffe.

  • d_fgetpos

    From d_fgetpos.U:

    This variable conditionally defines HAS_FGETPOS if fgetpos() is available to get the file position indicator.

  • d_finite

    From d_finite.U:

    This variable conditionally defines the HAS_FINITE symbol, which indicates to the C program that the finite() routine is available.

  • d_finitel

    From d_finitel.U:

    This variable conditionally defines the HAS_FINITEL symbol, which indicates to the C program that the finitel() routine is available.

  • d_flexfnam

    From d_flexfnam.U:

    This variable conditionally defines the FLEXFILENAMES symbol, which indicates that the system supports filenames longer than 14 characters.

  • d_flock

    From d_flock.U:

    This variable conditionally defines HAS_FLOCK if flock() is available to do file locking.

  • d_flockproto

    From d_flockproto.U:

    This variable conditionally defines the HAS_FLOCK_PROTO symbol, which indicates to the C program that the system provides a prototype for the flock() function. Otherwise, it is up to the program to supply one.

  • d_fork

    From d_fork.U:

    This variable conditionally defines the HAS_FORK symbol, which indicates to the C program that the fork() routine is available.
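
    From Perl code, the d_* variables in this glossary can be tested through %Config. A minimal sketch follows, assuming the usual convention that an available feature has the value define and an unavailable one is undefined; config_has is a hypothetical helper name.

    ```perl
    use strict;
    use warnings;
    use Config;

    # config_has() is a hypothetical helper, not part of Config.pm.
    # It returns 1 if the named d_* variable is set to "define", else 0.
    sub config_has {
        my ($cfg, $name) = @_;
        my $value = $cfg->{$name};
        return (defined($value) && $value eq 'define') ? 1 : 0;
    }

    for my $name (qw(d_fork d_flock d_fcntl_can_lock d_dlopen)) {
        printf "%-18s %s\n", $name, config_has(\%Config, $name) ? 'yes' : 'no';
    }
    ```

    Comparing against the literal string avoids treating other non-empty values as a positive result.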

  • d_fp_class

    From d_fp_class.U:

    This variable conditionally defines the HAS_FP_CLASS symbol, which indicates to the C program that the fp_class() routine is available.

  • d_fpathconf

    From d_pathconf.U:

    This variable conditionally defines the HAS_FPATHCONF symbol, which indicates to the C program that the fpathconf() routine is available to determine file-system related limits and options associated with a given open file descriptor.

  • d_fpclass

    From d_fpclass.U:

    This variable conditionally defines the HAS_FPCLASS symbol, which indicates to the C program that the fpclass() routine is available.

  • d_fpclassify

    From d_fpclassify.U:

    This variable conditionally defines the HAS_FPCLASSIFY symbol, which indicates to the C program that the fpclassify() routine is available.

  • d_fpclassl

    From d_fpclassl.U:

    This variable conditionally defines the HAS_FPCLASSL symbol, which indicates to the C program that the fpclassl() routine is available.

  • d_fpos64_t

    From d_fpos64_t.U:

    This symbol will be defined if the C compiler supports fpos64_t.

  • d_frexpl

    From d_frexpl.U:

    This variable conditionally defines the HAS_FREXPL symbol, which indicates to the C program that the frexpl() routine is available.

  • d_fs_data_s

    From d_fs_data_s.U:

    This variable conditionally defines the HAS_STRUCT_FS_DATA symbol, which indicates that the struct fs_data is supported.

  • d_fseeko

    From d_fseeko.U:

    This variable conditionally defines the HAS_FSEEKO symbol, which indicates to the C program that the fseeko() routine is available.

  • d_fsetpos

    From d_fsetpos.U:

    This variable conditionally defines HAS_FSETPOS if fsetpos() is available to set the file position indicator.

  • d_fstatfs

    From d_fstatfs.U:

    This variable conditionally defines the HAS_FSTATFS symbol, which indicates to the C program that the fstatfs() routine is available.

  • d_fstatvfs

    From d_statvfs.U:

    This variable conditionally defines the HAS_FSTATVFS symbol, which indicates to the C program that the fstatvfs() routine is available.

  • d_fsync

    From d_fsync.U:

    This variable conditionally defines the HAS_FSYNC symbol, which indicates to the C program that the fsync() routine is available.

  • d_ftello

    From d_ftello.U:

    This variable conditionally defines the HAS_FTELLO symbol, which indicates to the C program that the ftello() routine is available.

  • d_ftime

    From d_ftime.U:

    This variable conditionally defines the HAS_FTIME symbol, which indicates that the ftime() routine exists. The ftime() routine is basically a sub-second accuracy clock.

  • d_futimes

    From d_futimes.U:

    This variable conditionally defines the HAS_FUTIMES symbol, which indicates to the C program that the futimes() routine is available.

  • d_Gconvert

    From d_gconvert.U:

    This variable holds what Gconvert is defined as to convert floating point numbers into strings. By default, Configure sets this macro to use the first of gconvert, gcvt, or sprintf that pass sprintf-%g-like behavior tests. If perl is using long doubles, the macro uses the first of the following functions that pass Configure's tests: qgcvt, sprintf (if Configure knows how to make sprintf format long doubles--see sPRIgldbl), gconvert, gcvt, and sprintf (casting to double). The gconvert_preference and gconvert_ld_preference variables can be used to alter Configure's preferences, for doubles and long doubles, respectively. If present, they contain a space-separated list of one or more of the above function names in the order they should be tried.

    d_Gconvert may be set to override Configure with a platform-specific function. If this function expects a double, a different value may need to be set by the uselongdouble.cbu call-back unit so that long doubles can be formatted without loss of precision.

  • d_gdbm_ndbm_h_uses_prototypes

    From i_ndbm.U:

    This variable conditionally defines the NDBM_H_USES_PROTOTYPES symbol, which indicates that the gdbm-ndbm.h include file uses real ANSI C prototypes instead of K&R style function declarations. K&R style declarations are unsupported in C++, so the include file requires special handling when using a C++ compiler and this variable is undefined. Consult the different d_*ndbm_h_uses_prototypes variables to get the same information for alternative ndbm.h include files.

  • d_gdbmndbm_h_uses_prototypes

    From i_ndbm.U:

    This variable conditionally defines the NDBM_H_USES_PROTOTYPES symbol, which indicates that the gdbm/ndbm.h include file uses real ANSI C prototypes instead of K&R style function declarations. K&R style declarations are unsupported in C++, so the include file requires special handling when using a C++ compiler and this variable is undefined. Consult the different d_*ndbm_h_uses_prototypes variables to get the same information for alternative ndbm.h include files.

  • d_getaddrinfo

    From d_getaddrinfo.U:

    This variable conditionally defines the HAS_GETADDRINFO symbol, which indicates to the C program that the getaddrinfo() function is available.

  • d_getcwd

    From d_getcwd.U:

    This variable conditionally defines the HAS_GETCWD symbol, which indicates to the C program that the getcwd() routine is available to get the current working directory.

  • d_getespwnam

    From d_getespwnam.U:

    This variable conditionally defines HAS_GETESPWNAM if getespwnam() is available to retrieve enhanced (shadow) password entries by name.

  • d_getfsstat

    From d_getfsstat.U:

    This variable conditionally defines the HAS_GETFSSTAT symbol, which indicates to the C program that the getfsstat() routine is available.

  • d_getgrent

    From d_getgrent.U:

    This variable conditionally defines the HAS_GETGRENT symbol, which indicates to the C program that the getgrent() routine is available for sequential access of the group database.

  • d_getgrent_r

    From d_getgrent_r.U:

    This variable conditionally defines the HAS_GETGRENT_R symbol, which indicates to the C program that the getgrent_r() routine is available.

  • d_getgrgid_r

    From d_getgrgid_r.U:

    This variable conditionally defines the HAS_GETGRGID_R symbol, which indicates to the C program that the getgrgid_r() routine is available.

  • d_getgrnam_r

    From d_getgrnam_r.U:

    This variable conditionally defines the HAS_GETGRNAM_R symbol, which indicates to the C program that the getgrnam_r() routine is available.

  • d_getgrps

    From d_getgrps.U:

    This variable conditionally defines the HAS_GETGROUPS symbol, which indicates to the C program that the getgroups() routine is available to get the list of process groups.

  • d_gethbyaddr

    From d_gethbyad.U:

    This variable conditionally defines the HAS_GETHOSTBYADDR symbol, which indicates to the C program that the gethostbyaddr() routine is available to look up hosts by their IP addresses.

  • d_gethbyname

    From d_gethbynm.U:

    This variable conditionally defines the HAS_GETHOSTBYNAME symbol, which indicates to the C program that the gethostbyname() routine is available to look up host names in some data base or other.

  • d_gethent

    From d_gethent.U:

    This variable conditionally defines HAS_GETHOSTENT if gethostent() is available to look up host names in some data base or another.

  • d_gethname

    From d_gethname.U:

    This variable conditionally defines the HAS_GETHOSTNAME symbol, which indicates to the C program that the gethostname() routine may be used to derive the host name.

  • d_gethostbyaddr_r

    From d_gethostbyaddr_r.U:

    This variable conditionally defines the HAS_GETHOSTBYADDR_R symbol, which indicates to the C program that the gethostbyaddr_r() routine is available.

  • d_gethostbyname_r

    From d_gethostbyname_r.U:

    This variable conditionally defines the HAS_GETHOSTBYNAME_R symbol, which indicates to the C program that the gethostbyname_r() routine is available.

  • d_gethostent_r

    From d_gethostent_r.U:

    This variable conditionally defines the HAS_GETHOSTENT_R symbol, which indicates to the C program that the gethostent_r() routine is available.

  • d_gethostprotos

    From d_gethostprotos.U:

    This variable conditionally defines the HAS_GETHOST_PROTOS symbol, which indicates to the C program that <netdb.h> supplies prototypes for the various gethost*() functions. See also netdbtype.U for probing for various netdb types.

  • d_getitimer

    From d_getitimer.U:

    This variable conditionally defines the HAS_GETITIMER symbol, which indicates to the C program that the getitimer() routine is available.

  • d_getlogin

    From d_getlogin.U:

    This variable conditionally defines the HAS_GETLOGIN symbol, which indicates to the C program that the getlogin() routine is available to get the login name.

  • d_getlogin_r

    From d_getlogin_r.U:

    This variable conditionally defines the HAS_GETLOGIN_R symbol, which indicates to the C program that the getlogin_r() routine is available.

  • d_getmnt

    From d_getmnt.U:

    This variable conditionally defines the HAS_GETMNT symbol, which indicates to the C program that the getmnt() routine is available to retrieve one or more mount info blocks by filename.

  • d_getmntent

    From d_getmntent.U:

    This variable conditionally defines the HAS_GETMNTENT symbol, which indicates to the C program that the getmntent() routine is available to iterate through mounted files to get their mount info.

  • d_getnameinfo

    From d_getnameinfo.U:

    This variable conditionally defines the HAS_GETNAMEINFO symbol, which indicates to the C program that the getnameinfo() function is available.

  • d_getnbyaddr

    From d_getnbyad.U:

    This variable conditionally defines the HAS_GETNETBYADDR symbol, which indicates to the C program that the getnetbyaddr() routine is available to look up networks by their IP addresses.

  • d_getnbyname

    From d_getnbynm.U:

    This variable conditionally defines the HAS_GETNETBYNAME symbol, which indicates to the C program that the getnetbyname() routine is available to look up networks by their names.

  • d_getnent

    From d_getnent.U:

    This variable conditionally defines HAS_GETNETENT if getnetent() is available to look up network names in some data base or another.

  • d_getnetbyaddr_r

    From d_getnetbyaddr_r.U:

    This variable conditionally defines the HAS_GETNETBYADDR_R symbol, which indicates to the C program that the getnetbyaddr_r() routine is available.

  • d_getnetbyname_r

    From d_getnetbyname_r.U:

    This variable conditionally defines the HAS_GETNETBYNAME_R symbol, which indicates to the C program that the getnetbyname_r() routine is available.

  • d_getnetent_r

    From d_getnetent_r.U:

    This variable conditionally defines the HAS_GETNETENT_R symbol, which indicates to the C program that the getnetent_r() routine is available.

  • d_getnetprotos

    From d_getnetprotos.U:

    This variable conditionally defines the HAS_GETNET_PROTOS symbol, which indicates to the C program that <netdb.h> supplies prototypes for the various getnet*() functions. See also netdbtype.U for probing for various netdb types.

  • d_getpagsz

    From d_getpagsz.U:

    This variable conditionally defines HAS_GETPAGESIZE if getpagesize() is available to get the system page size.

  • d_getpbyname

    From d_getprotby.U:

    This variable conditionally defines the HAS_GETPROTOBYNAME symbol, which indicates to the C program that the getprotobyname() routine is available to look up protocols by their name.

  • d_getpbynumber

    From d_getprotby.U:

    This variable conditionally defines the HAS_GETPROTOBYNUMBER symbol, which indicates to the C program that the getprotobynumber() routine is available to look up protocols by their number.

  • d_getpent

    From d_getpent.U:

    This variable conditionally defines HAS_GETPROTOENT if getprotoent() is available to look up protocols in some data base or another.

  • d_getpgid

    From d_getpgid.U:

    This variable conditionally defines the HAS_GETPGID symbol, which indicates to the C program that the getpgid(pid) function is available to get the process group id.

  • d_getpgrp

    From d_getpgrp.U:

    This variable conditionally defines HAS_GETPGRP if getpgrp() is available to get the current process group.

  • d_getpgrp2

    From d_getpgrp2.U:

    This variable conditionally defines the HAS_GETPGRP2 symbol, which indicates to the C program that the getpgrp2() (as in DG/UX) routine is available to get the current process group.

  • d_getppid

    From d_getppid.U:

    This variable conditionally defines the HAS_GETPPID symbol, which indicates to the C program that the getppid() routine is available to get the parent process ID.

  • d_getprior

    From d_getprior.U:

    This variable conditionally defines HAS_GETPRIORITY if getpriority() is available to get a process's priority.

  • d_getprotobyname_r

    From d_getprotobyname_r.U:

    This variable conditionally defines the HAS_GETPROTOBYNAME_R symbol, which indicates to the C program that the getprotobyname_r() routine is available.

  • d_getprotobynumber_r

    From d_getprotobynumber_r.U:

    This variable conditionally defines the HAS_GETPROTOBYNUMBER_R symbol, which indicates to the C program that the getprotobynumber_r() routine is available.

  • d_getprotoent_r

    From d_getprotoent_r.U:

    This variable conditionally defines the HAS_GETPROTOENT_R symbol, which indicates to the C program that the getprotoent_r() routine is available.

  • d_getprotoprotos

    From d_getprotoprotos.U:

    This variable conditionally defines the HAS_GETPROTO_PROTOS symbol, which indicates to the C program that <netdb.h> supplies prototypes for the various getproto*() functions. See also netdbtype.U for probing for various netdb types.

  • d_getprpwnam

    From d_getprpwnam.U:

    This variable conditionally defines HAS_GETPRPWNAM if getprpwnam() is available to retrieve protected (shadow) password entries by name.

  • d_getpwent

    From d_getpwent.U:

    This variable conditionally defines the HAS_GETPWENT symbol, which indicates to the C program that the getpwent() routine is available for sequential access of the passwd database.

  • d_getpwent_r

    From d_getpwent_r.U:

    This variable conditionally defines the HAS_GETPWENT_R symbol, which indicates to the C program that the getpwent_r() routine is available.

  • d_getpwnam_r

    From d_getpwnam_r.U:

    This variable conditionally defines the HAS_GETPWNAM_R symbol, which indicates to the C program that the getpwnam_r() routine is available.

  • d_getpwuid_r

    From d_getpwuid_r.U:

    This variable conditionally defines the HAS_GETPWUID_R symbol, which indicates to the C program that the getpwuid_r() routine is available.

  • d_getsbyname

    From d_getsrvby.U:

    This variable conditionally defines the HAS_GETSERVBYNAME symbol, which indicates to the C program that the getservbyname() routine is available to look up services by their name.

  • d_getsbyport

    From d_getsrvby.U:

    This variable conditionally defines the HAS_GETSERVBYPORT symbol, which indicates to the C program that the getservbyport() routine is available to look up services by their port.

  • d_getsent

    From d_getsent.U:

    This variable conditionally defines HAS_GETSERVENT if getservent() is available to look up network services in some data base or another.

  • d_getservbyname_r

    From d_getservbyname_r.U:

    This variable conditionally defines the HAS_GETSERVBYNAME_R symbol, which indicates to the C program that the getservbyname_r() routine is available.

  • d_getservbyport_r

    From d_getservbyport_r.U:

    This variable conditionally defines the HAS_GETSERVBYPORT_R symbol, which indicates to the C program that the getservbyport_r() routine is available.

  • d_getservent_r

    From d_getservent_r.U:

    This variable conditionally defines the HAS_GETSERVENT_R symbol, which indicates to the C program that the getservent_r() routine is available.

  • d_getservprotos

    From d_getservprotos.U:

    This variable conditionally defines the HAS_GETSERV_PROTOS symbol, which indicates to the C program that <netdb.h> supplies prototypes for the various getserv*() functions. See also netdbtype.U for probing for various netdb types.

  • d_getspnam

    From d_getspnam.U:

    This variable conditionally defines HAS_GETSPNAM if getspnam() is available to retrieve SysV shadow password entries by name.

  • d_getspnam_r

    From d_getspnam_r.U:

    This variable conditionally defines the HAS_GETSPNAM_R symbol, which indicates to the C program that the getspnam_r() routine is available.

  • d_gettimeod

    From d_ftime.U:

    This variable conditionally defines the HAS_GETTIMEOFDAY symbol, which indicates that the gettimeofday() system call exists (to obtain a sub-second accuracy clock). You should probably include <sys/time.h>.

  • d_gmtime64

    From d_timefuncs64.U:

    This variable conditionally defines the HAS_GMTIME64 symbol, which indicates to the C program that the gmtime64() routine is available.

  • d_gmtime_r

    From d_gmtime_r.U:

    This variable conditionally defines the HAS_GMTIME_R symbol, which indicates to the C program that the gmtime_r() routine is available.

  • d_gnulibc

    From d_gnulibc.U:

    Defined if we're dealing with the GNU C Library.

  • d_grpasswd

    From i_grp.U:

    This variable conditionally defines GRPASSWD, which indicates that struct group in <grp.h> contains gr_passwd.

  • d_hasmntopt

    From d_hasmntopt.U:

    This variable conditionally defines the HAS_HASMNTOPT symbol, which indicates to the C program that the hasmntopt() routine is available to query the mount options of file systems.

  • d_htonl

    From d_htonl.U:

    This variable conditionally defines HAS_HTONL if htonl() and its friends are available to do network order byte swapping.

  • d_ilogbl

    From d_ilogbl.U:

    This variable conditionally defines the HAS_ILOGBL symbol, which indicates to the C program that the ilogbl() routine is available. If scalbnl is also present we can emulate frexpl.

  • d_inc_version_list

    From inc_version_list.U:

    This variable conditionally defines PERL_INC_VERSION_LIST. It is set to undef when PERL_INC_VERSION_LIST is empty.

  • d_index

    From d_strchr.U:

    This variable conditionally defines HAS_INDEX if index() and rindex() are available for string searching.

  • d_inetaton

    From d_inetaton.U:

    This variable conditionally defines the HAS_INET_ATON symbol, which indicates to the C program that the inet_aton() function is available to parse IP address dotted-quad strings.

  • d_inetntop

    From d_inetntop.U:

    This variable conditionally defines the HAS_INETNTOP symbol, which indicates to the C program that the inet_ntop() function is available.

  • d_inetpton

    From d_inetpton.U:

    This variable conditionally defines the HAS_INETPTON symbol, which indicates to the C program that the inet_pton() function is available.

  • d_int64_t

    From d_int64_t.U:

    This symbol will be defined if the C compiler supports int64_t.

  • d_ip_mreq

    From d_socket.U:

    This variable conditionally defines the HAS_IP_MREQ symbol, which indicates the availability of a struct ip_mreq.

  • d_ip_mreq_source

    From d_socket.U:

    This variable conditionally defines the HAS_IP_MREQ_SOURCE symbol, which indicates the availability of a struct ip_mreq_source.

  • d_ipv6_mreq

    From d_socket.U:

    This variable conditionally defines the HAS_IPV6_MREQ symbol, which indicates the availability of a struct ipv6_mreq.

  • d_ipv6_mreq_source

    From d_socket.U:

    This variable conditionally defines the HAS_IPV6_MREQ_SOURCE symbol, which indicates the availability of a struct ipv6_mreq_source.

  • d_isascii

    From d_isascii.U:

    This variable conditionally defines the HAS_ISASCII constant, which indicates to the C program that isascii() is available.

  • d_isblank

    From d_isblank.U:

    This variable conditionally defines the HAS_ISBLANK constant, which indicates to the C program that isblank() is available.

  • d_isfinite

    From d_isfinite.U:

    This variable conditionally defines the HAS_ISFINITE symbol, which indicates to the C program that the isfinite() routine is available.

  • d_isinf

    From d_isinf.U:

    This variable conditionally defines the HAS_ISINF symbol, which indicates to the C program that the isinf() routine is available.

  • d_isnan

    From d_isnan.U:

    This variable conditionally defines the HAS_ISNAN symbol, which indicates to the C program that the isnan() routine is available.

  • d_isnanl

    From d_isnanl.U:

    This variable conditionally defines the HAS_ISNANL symbol, which indicates to the C program that the isnanl() routine is available.

  • d_killpg

    From d_killpg.U:

    This variable conditionally defines the HAS_KILLPG symbol, which indicates to the C program that the killpg() routine is available to kill process groups.

  • d_lchown

    From d_lchown.U:

    This variable conditionally defines the HAS_LCHOWN symbol, which indicates to the C program that the lchown() routine is available to operate on a symbolic link (instead of following the link).

  • d_ldbl_dig

    From d_ldbl_dig.U:

    This variable conditionally defines d_ldbl_dig if this system's header files provide LDBL_DIG, which is the number of significant digits in a long double precision number.
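
    A minimal sketch of how a C program might consume this information, assuming it tests the LDBL_DIG macro from <float.h> directly rather than a symbol from Perl's config.h. The function name long_double_digits and the DBL_DIG fallback are choices made for this example.

```c
#include <float.h>

/* Hypothetical helper: digits of precision available for long double. */
int long_double_digits(void)
{
#ifdef LDBL_DIG
    return LDBL_DIG;   /* provided by the system headers */
#else
    return DBL_DIG;    /* assumed fallback: treat long double like double */
#endif
}
```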

  • d_libm_lib_version

    From d_libm_lib_version.U:

    This variable conditionally defines the LIBM_LIB_VERSION symbol, which indicates to the C program that math.h defines _LIB_VERSION and that it is available in libm.

  • d_link

    From d_link.U:

    This variable conditionally defines HAS_LINK if link() is available to create hard links.

  • d_localtime64

    From d_timefuncs64.U:

    This variable conditionally defines the HAS_LOCALTIME64 symbol, which indicates to the C program that the localtime64() routine is available.

  • d_localtime_r

    From d_localtime_r.U:

    This variable conditionally defines the HAS_LOCALTIME_R symbol, which indicates to the C program that the localtime_r() routine is available.

  • d_localtime_r_needs_tzset

    From d_localtime_r.U:

    This variable conditionally defines the LOCALTIME_R_NEEDS_TZSET symbol, which makes us call tzset() before calling localtime_r().

  • d_locconv

    From d_locconv.U:

    This variable conditionally defines HAS_LOCALECONV if localeconv() is available for numeric and monetary formatting conventions.

  • d_lockf

    From d_lockf.U:

    This variable conditionally defines HAS_LOCKF if lockf() is available to do file locking.

  • d_longdbl

    From d_longdbl.U:

    This variable conditionally defines HAS_LONG_DOUBLE if the long double type is supported.

  • d_longlong

    From d_longlong.U:

    This variable conditionally defines HAS_LONG_LONG if the long long type is supported.

  • d_lseekproto

    From d_lseekproto.U:

    This variable conditionally defines the HAS_LSEEK_PROTO symbol, which indicates to the C program that the system provides a prototype for the lseek() function. Otherwise, it is up to the program to supply one.

  • d_lstat

    From d_lstat.U:

    This variable conditionally defines HAS_LSTAT if lstat() is available to do file stats on symbolic links.

  • d_madvise

    From d_madvise.U:

    This variable conditionally defines HAS_MADVISE if madvise() is available to advise the system about the expected use of a memory region.

  • d_malloc_good_size

    From d_malloc_size.U:

    This symbol, if defined, indicates that the malloc_good_size routine is available for use.

  • d_malloc_size

    From d_malloc_size.U:

    This symbol, if defined, indicates that the malloc_size routine is available for use.

  • d_mblen

    From d_mblen.U:

    This variable conditionally defines the HAS_MBLEN symbol, which indicates to the C program that the mblen() routine is available to find the number of bytes in a multibyte character.

  • d_mbstowcs

    From d_mbstowcs.U:

    This variable conditionally defines the HAS_MBSTOWCS symbol, which indicates to the C program that the mbstowcs() routine is available to convert a multibyte string into a wide character string.

  • d_mbtowc

    From d_mbtowc.U:

    This variable conditionally defines the HAS_MBTOWC symbol, which indicates to the C program that the mbtowc() routine is available to convert multibyte to a wide character.

  • d_memchr

    From d_memchr.U:

    This variable conditionally defines the HAS_MEMCHR symbol, which indicates to the C program that the memchr() routine is available to locate characters within a C string.

  • d_memcmp

    From d_memcmp.U:

    This variable conditionally defines the HAS_MEMCMP symbol, which indicates to the C program that the memcmp() routine is available to compare blocks of memory.

  • d_memcpy

    From d_memcpy.U:

    This variable conditionally defines the HAS_MEMCPY symbol, which indicates to the C program that the memcpy() routine is available to copy blocks of memory.

  • d_memmove

    From d_memmove.U:

    This variable conditionally defines the HAS_MEMMOVE symbol, which indicates to the C program that the memmove() routine is available to copy potentially overlapping blocks of memory.
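
    The usual way a C program consumes such a symbol is a preprocessor guard with a fallback. The sketch below assumes HAS_MEMMOVE would come from Perl's config.h (it is not defined by this snippet itself); the wrapper name portable_memmove and the hand-written fallback are illustrative only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: use memmove() when HAS_MEMMOVE is defined,
 * otherwise copy byte by byte in a direction that is safe for overlap. */
void *portable_memmove(void *dst, const void *src, size_t n)
{
#ifdef HAS_MEMMOVE
    return memmove(dst, src, n);
#else
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (d < s) {
        /* Destination starts first: copy forwards. */
        while (n--)
            *d++ = *s++;
    } else if (d > s) {
        /* Destination starts later: copy backwards. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dst;
#endif
}
```

    Both branches give the same result for overlapping regions, so callers do not need to know which one was compiled in.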

  • d_memset

    From d_memset.U:

    This variable conditionally defines the HAS_MEMSET symbol, which indicates to the C program that the memset() routine is available to set blocks of memory.

  • d_mkdir

    From d_mkdir.U:

    This variable conditionally defines the HAS_MKDIR symbol, which indicates to the C program that the mkdir() routine is available to create directories.

  • d_mkdtemp

    From d_mkdtemp.U:

    This variable conditionally defines the HAS_MKDTEMP symbol, which indicates to the C program that the mkdtemp() routine is available to exclusively create a uniquely named temporary directory.

  • d_mkfifo

    From d_mkfifo.U:

    This variable conditionally defines the HAS_MKFIFO symbol, which indicates to the C program that the mkfifo() routine is available.

  • d_mkstemp

    From d_mkstemp.U:

    This variable conditionally defines the HAS_MKSTEMP symbol, which indicates to the C program that the mkstemp() routine is available to exclusively create and open a uniquely named temporary file.

  • d_mkstemps

    From d_mkstemps.U:

    This variable conditionally defines the HAS_MKSTEMPS symbol, which indicates to the C program that the mkstemps() routine is available to exclusively create and open a uniquely named (with a suffix) temporary file.

  • d_mktime

    From d_mktime.U:

    This variable conditionally defines the HAS_MKTIME symbol, which indicates to the C program that the mktime() routine is available.

  • d_mktime64

    From d_timefuncs64.U:

    This variable conditionally defines the HAS_MKTIME64 symbol, which indicates to the C program that the mktime64() routine is available.

  • d_mmap

    From d_mmap.U:

    This variable conditionally defines HAS_MMAP if mmap() is available to map a file into memory.

  • d_modfl

    From d_modfl.U:

    This variable conditionally defines the HAS_MODFL symbol, which indicates to the C program that the modfl() routine is available.

  • d_modfl_pow32_bug

    From d_modfl.U:

    This variable conditionally defines the HAS_MODFL_POW32_BUG symbol, which indicates that modfl() is broken for long doubles >= pow(2, 32). For example from 4294967303.150000 one would get 4294967302.000000 and 1.150000. The bug has been seen in certain versions of glibc, release 2.2.2 is known to be okay.

  • d_modflproto

    From d_modfl.U:

    This symbol, if defined, indicates that the system provides a prototype for the modfl() function. Otherwise, it is up to the program to supply one. C99 says it should be long double modfl(long double, long double *);

  • d_mprotect

    From d_mprotect.U:

    This variable conditionally defines HAS_MPROTECT if mprotect() is available to modify the access protection of a memory mapped file.

  • d_msg

    From d_msg.U:

    This variable conditionally defines the HAS_MSG symbol, which indicates that the entire msg*(2) library is present.

  • d_msg_ctrunc

    From d_socket.U:

    This variable conditionally defines the HAS_MSG_CTRUNC symbol, which indicates that MSG_CTRUNC is available. #ifdef is not enough because it may be an enum; glibc has been known to do this.

  • d_msg_dontroute

    From d_socket.U:

    This variable conditionally defines the HAS_MSG_DONTROUTE symbol, which indicates that MSG_DONTROUTE is available. #ifdef is not enough because it may be an enum; glibc has been known to do this.

  • d_msg_oob

    From d_socket.U:

    This variable conditionally defines the HAS_MSG_OOB symbol, which indicates that MSG_OOB is available. #ifdef is not enough because it may be an enum; glibc has been known to do this.

  • d_msg_peek

    From d_socket.U:

    This variable conditionally defines the HAS_MSG_PEEK symbol, which indicates that MSG_PEEK is available. #ifdef is not enough because it may be an enum; glibc has been known to do this.

  • d_msg_proxy

    From d_socket.U:

    This variable conditionally defines the HAS_MSG_PROXY symbol, which indicates that MSG_PROXY is available. #ifdef is not enough because it may be an enum; glibc has been known to do this.
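
    The remark about #ifdef in the d_msg_* entries can be illustrated with a small sketch. The preprocessor only knows about macros, so a constant declared as an enum member is invisible to #ifdef even though it compiles fine in ordinary code. The names DEMO_FLAG_ENUM and DEMO_FLAG_MACRO are made up for this example and stand in for a flag such as MSG_PEEK.

```c
/* Hypothetical flags: one declared as an enum member, one as a macro. */
enum { DEMO_FLAG_ENUM = 0x08 };
#define DEMO_FLAG_MACRO 0x08

/* Returns 1 if the preprocessor can see the enum constant, else 0. */
int enum_visible_to_ifdef(void)
{
#ifdef DEMO_FLAG_ENUM
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the preprocessor can see the macro, else 0. */
int macro_visible_to_ifdef(void)
{
#ifdef DEMO_FLAG_MACRO
    return 1;
#else
    return 0;
#endif
}
```

    This is why a separate Configure test, which actually compiles code using the constant, is needed instead of relying on #ifdef alone.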

  • d_msgctl

    From d_msgctl.U:

    This variable conditionally defines the HAS_MSGCTL symbol, which indicates to the C program that the msgctl() routine is available.

  • d_msgget

    From d_msgget.U:

    This variable conditionally defines the HAS_MSGGET symbol, which indicates to the C program that the msgget() routine is available.

  • d_msghdr_s

    From d_msghdr_s.U:

    This variable conditionally defines the HAS_STRUCT_MSGHDR symbol, which indicates that the struct msghdr is supported.

  • d_msgrcv

    From d_msgrcv.U:

    This variable conditionally defines the HAS_MSGRCV symbol, which indicates to the C program that the msgrcv() routine is available.

  • d_msgsnd

    From d_msgsnd.U:

    This variable conditionally defines the HAS_MSGSND symbol, which indicates to the C program that the msgsnd() routine is available.

  • d_msync

    From d_msync.U:

    This variable conditionally defines HAS_MSYNC if msync() is available to synchronize a mapped file.

  • d_munmap

    From d_munmap.U:

    This variable conditionally defines HAS_MUNMAP if munmap() is available to unmap a region mapped by mmap().

  • d_mymalloc

    From mallocsrc.U:

    This variable conditionally defines MYMALLOC in case other parts of the source want to take special action if MYMALLOC is used. This may include different sorts of profiling or error detection.

  • d_ndbm

    From i_ndbm.U:

    This variable conditionally defines the HAS_NDBM symbol, which indicates that both the ndbm.h include file and an appropriate ndbm library exist. Consult the different i_*ndbm variables to find out the actual include location. Sometimes, a system has the header file but not the library. This variable will only be set if the system has both.

  • d_ndbm_h_uses_prototypes

    From i_ndbm.U:

    This variable conditionally defines the NDBM_H_USES_PROTOTYPES symbol, which indicates that the ndbm.h include file uses real ANSI C prototypes instead of K&R style function declarations. K&R style declarations are unsupported in C++, so the include file requires special handling when using a C++ compiler and this variable is undefined. Consult the different d_*ndbm_h_uses_prototypes variables to get the same information for alternative ndbm.h include files.

  • d_nice

    From d_nice.U:

    This variable conditionally defines the HAS_NICE symbol, which indicates to the C program that the nice() routine is available.

  • d_nl_langinfo

    From d_nl_langinfo.U:

    This variable conditionally defines the HAS_NL_LANGINFO symbol, which indicates to the C program that the nl_langinfo() routine is available.

  • d_nv_preserves_uv

    From perlxv.U:

    This variable indicates whether a variable of type nvtype can preserve all the bits of a variable of type uvtype.
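
    To make the idea concrete, the sketch below checks whether a given unsigned 64-bit value survives a round trip through double. It assumes, purely for illustration, that nvtype is an IEEE 754 double and uvtype is a 64-bit unsigned integer; the function name double_round_trips is hypothetical and this is not the test Configure runs.

```c
#include <stdint.h>

/* Hypothetical check: does v survive conversion to double and back?
 * Only call this with values well below 2^64; a double that rounds up
 * to 2^64 cannot be converted back to uint64_t safely. */
int double_round_trips(uint64_t v)
{
    volatile double d = (double)v;   /* volatile forces a real store */
    return (uint64_t)d == v;
}
```

    Under these assumptions a double has a 53-bit significand, so 2^53 round-trips exactly while 2^53 + 1 does not, which is the situation in which the variable would be undefined.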

  • d_nv_zero_is_allbits_zero

    From perlxv.U:

    This variable indicates whether a variable of type nvtype stores 0.0 in memory as all bits zero.
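
    A check of this kind might be sketched as follows, assuming nvtype is double for the purpose of the example. The function name is hypothetical and this is not the probe Configure itself uses.

```c
#include <string.h>

/* Hypothetical check: is the in-memory form of 0.0 made of zero bytes? */
int double_zero_is_allbits_zero(void)
{
    double zero = 0.0;
    unsigned char zeros[sizeof(double)];

    memset(zeros, 0, sizeof zeros);
    return memcmp(&zero, zeros, sizeof zero) == 0;
}
```

    When this property holds, a block of floating point values can be zero-initialised with memset() instead of an explicit loop.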

  • d_off64_t

    From d_off64_t.U:

    This symbol will be defined if the C compiler supports off64_t.

  • d_old_pthread_create_joinable

    From d_pthrattrj.U:

    This variable conditionally defines pthread_create_joinable. It is undef if pthread.h defines PTHREAD_CREATE_JOINABLE.

  • d_oldpthreads

    From usethreads.U:

    This variable conditionally defines the OLD_PTHREADS_API symbol, and indicates that Perl should be built to use the old draft POSIX threads API. This is only potentially meaningful if usethreads is set.

  • d_oldsock

    From d_socket.U:

    This variable conditionally defines the OLDSOCKET symbol, which indicates that the BSD socket interface is based on 4.1c and not 4.2.

  • d_open3

    From d_open3.U:

    This variable conditionally defines the HAS_OPEN3 manifest constant, which indicates to the C program that the 3 argument version of the open(2) function is available.

  • d_pathconf

    From d_pathconf.U:

    This variable conditionally defines the HAS_PATHCONF symbol, which indicates to the C program that the pathconf() routine is available to determine file-system related limits and options associated with a given filename.

  • d_pause

    From d_pause.U:

    This variable conditionally defines the HAS_PAUSE symbol, which indicates to the C program that the pause() routine is available to suspend a process until a signal is received.

  • d_perl_otherlibdirs

    From otherlibdirs.U:

    This variable conditionally defines PERL_OTHERLIBDIRS, which contains a colon-separated set of paths for the perl binary to include in @INC. See also otherlibdirs.

  • d_phostname

    From d_gethname.U:

    This variable conditionally defines the HAS_PHOSTNAME symbol, which contains the shell command which, when fed to popen(), may be used to derive the host name.

  • d_pipe

    From d_pipe.U:

    This variable conditionally defines the HAS_PIPE symbol, which indicates to the C program that the pipe() routine is available to create an inter-process channel.

  • d_poll

    From d_poll.U:

    This variable conditionally defines the HAS_POLL symbol, which indicates to the C program that the poll() routine is available to poll active file descriptors.

  • d_portable

    From d_portable.U:

    This variable conditionally defines the PORTABLE symbol, which indicates to the C program that it should not assume that it is running on the machine it was compiled on.

  • d_prctl

    From d_prctl.U:

    This variable conditionally defines the HAS_PRCTL symbol, which indicates to the C program that the prctl() routine is available.

  • d_prctl_set_name

    From d_prctl.U:

    This variable conditionally defines the HAS_PRCTL_SET_NAME symbol, which indicates to the C program that the prctl() routine supports the PR_SET_NAME option.

  • d_PRId64

    From quadfio.U:

    This variable conditionally defines the PERL_PRId64 symbol, which indicates that stdio has a symbol to print 64-bit decimal numbers.
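
    For comparison, the sketch below prints a 64-bit signed integer with the standard C99 PRId64 macro from <inttypes.h>. It deliberately uses the standard macro rather than Perl's PERL_PRId64, whose definition is not shown here; the function name format_i64 is an assumption made for this example.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit signed integer in decimal.
 * Returns the number of characters snprintf() would have written. */
int format_i64(char *buf, size_t buflen, int64_t v)
{
    /* PRId64 expands to the right length modifier, e.g. "ld" or "lld". */
    return snprintf(buf, buflen, "%" PRId64, v);
}
```

    The same string-concatenation pattern applies to the octal, unsigned and hexadecimal variants listed in the neighbouring entries.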

  • d_PRIeldbl

    From longdblfio.U:

    This variable conditionally defines the PERL_PRIfldbl symbol, which indicates that stdio has a symbol to print long doubles.

  • d_PRIEUldbl

    From longdblfio.U:

    This variable conditionally defines the PERL_PRIfldbl symbol, which indicates that stdio has a symbol to print long doubles. The U in the name is to separate this from d_PRIeldbl so that even case-blind systems can see the difference.

  • d_PRIfldbl

    From longdblfio.U:

    This variable conditionally defines the PERL_PRIfldbl symbol, which indicates that stdio has a symbol to print long doubles.

  • d_PRIFUldbl

    From longdblfio.U:

    This variable conditionally defines the PERL_PRIfldbl symbol, which indicates that stdio has a symbol to print long doubles. The U in the name is to separate this from d_PRIfldbl so that even case-blind systems can see the difference.

  • d_PRIgldbl

    From longdblfio.U:

    This variable conditionally defines the PERL_PRIgldbl symbol, which indicates that stdio has a symbol to print long doubles.

  • d_PRIGUldbl

    From longdblfio.U:

    This variable conditionally defines the PERL_PRIfldbl symbol, which indicates that stdio has a symbol to print long doubles. The U in the name is to separate this from d_PRIgldbl so that even case-blind systems can see the difference.

  • d_PRIi64

    From quadfio.U:

    This variable conditionally defines the PERL_PRIi64 symbol, which indicates that stdio has a symbol to print 64-bit decimal numbers.

  • d_printf_format_null

    From d_attribut.U:

    This variable conditionally defines PRINTF_FORMAT_NULL_OK, which indicates the C compiler allows printf-like formats to be null.

  • d_PRIo64

    From quadfio.U:

    This variable conditionally defines the PERL_PRIo64 symbol, which indicates that stdio has a symbol to print 64-bit octal numbers.

  • d_PRIu64

    From quadfio.U:

    This variable conditionally defines the PERL_PRIu64 symbol, which indicates that stdio has a symbol to print 64-bit unsigned decimal numbers.

  • d_PRIx64

    From quadfio.U:

    This variable conditionally defines the PERL_PRIx64 symbol, which indicates that stdio has a symbol to print 64-bit hexadecimal numbers.

  • d_PRIXU64

    From quadfio.U:

    This variable conditionally defines the PERL_PRIXU64 symbol, which indicates that stdio has a symbol to print 64-bit hexadecimal numbers in uppercase. The U in the name is to separate this from d_PRIx64 so that even case-blind systems can see the difference.
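
    The 64-bit format symbols above are meant to be spliced into a printf format string. The sketch below shows the idea using the C99 <inttypes.h> macros as stand-ins; MY_PRId64, MY_PRIx64 and format_i64 are hypothetical names, and the real PERL_PRI* values are whatever Configure found for the platform.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical stand-ins for PERL_PRId64 and PERL_PRIx64. */
#define MY_PRId64 PRId64
#define MY_PRIx64 PRIx64

/* Format v as decimal followed by its hexadecimal form.
 * Returns the number of characters snprintf() would have written. */
int format_i64(char *buf, size_t len, int64_t v)
{
    assert(buf != NULL && len > 0);
    /* Adjacent string literals are concatenated by the compiler,
     * which yields a single platform-specific format string. */
    return snprintf(buf, len, "%" MY_PRId64 " (0x%" MY_PRIx64 ")",
                    v, (uint64_t)v);
}
```

    With a large enough buffer, format_i64(buf, sizeof buf, 255) writes "255 (0xff)".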

  • d_procselfexe

    From d_procselfexe.U:

    Defined if $procselfexe is a symlink to the absolute pathname of the executing program.

  • d_pseudofork

    From d_vfork.U:

    This variable conditionally defines the HAS_PSEUDOFORK symbol, which indicates that an emulation of the fork routine is available.

  • d_pthread_atfork

    From d_pthread_atfork.U:

    This variable conditionally defines the HAS_PTHREAD_ATFORK symbol, which indicates to the C program that the pthread_atfork() routine is available.

  • d_pthread_attr_setscope

    From d_pthread_attr_ss.U:

    This variable conditionally defines HAS_PTHREAD_ATTR_SETSCOPE if pthread_attr_setscope() is available to set the contention scope attribute of a thread attribute object.

  • d_pthread_yield

    From d_pthread_y.U:

    This variable conditionally defines the HAS_PTHREAD_YIELD symbol if the pthread_yield routine is available to yield the execution of the current thread.

  • d_pwage

    From i_pwd.U:

    This variable conditionally defines PWAGE, which indicates that struct passwd contains pw_age.

  • d_pwchange

    From i_pwd.U:

    This variable conditionally defines PWCHANGE, which indicates that struct passwd contains pw_change.

  • d_pwclass

    From i_pwd.U:

    This variable conditionally defines PWCLASS, which indicates that struct passwd contains pw_class.

  • d_pwcomment

    From i_pwd.U:

    This variable conditionally defines PWCOMMENT, which indicates that struct passwd contains pw_comment.

  • d_pwexpire

    From i_pwd.U:

    This variable conditionally defines PWEXPIRE, which indicates that struct passwd contains pw_expire.

  • d_pwgecos

    From i_pwd.U:

    This variable conditionally defines PWGECOS, which indicates that struct passwd contains pw_gecos.

  • d_pwpasswd

    From i_pwd.U:

    This variable conditionally defines PWPASSWD, which indicates that struct passwd contains pw_passwd.

  • d_pwquota

    From i_pwd.U:

    This variable conditionally defines PWQUOTA, which indicates that struct passwd contains pw_quota.

  • d_qgcvt

    From d_qgcvt.U:

    This variable conditionally defines the HAS_QGCVT symbol, which indicates to the C program that the qgcvt() routine is available.

  • d_quad

    From quadtype.U:

    This variable, if defined, indicates that a 64-bit integer type, quadtype, is available.

  • d_random_r

    From d_random_r.U:

    This variable conditionally defines the HAS_RANDOM_R symbol, which indicates to the C program that the random_r() routine is available.

  • d_readdir

    From d_readdir.U:

    This variable conditionally defines HAS_READDIR if readdir() is available to read directory entries.

  • d_readdir64_r

    From d_readdir64_r.U:

    This variable conditionally defines the HAS_READDIR64_R symbol, which indicates to the C program that the readdir64_r() routine is available.

  • d_readdir_r

    From d_readdir_r.U:

    This variable conditionally defines the HAS_READDIR_R symbol, which indicates to the C program that the readdir_r() routine is available.

  • d_readlink

    From d_readlink.U:

    This variable conditionally defines the HAS_READLINK symbol, which indicates to the C program that the readlink() routine is available to read the value of a symbolic link.

  • d_readv

    From d_readv.U:

    This variable conditionally defines the HAS_READV symbol, which indicates to the C program that the readv() routine is available.

  • d_recvmsg

    From d_recvmsg.U:

    This variable conditionally defines the HAS_RECVMSG symbol, which indicates to the C program that the recvmsg() routine is available.

  • d_rename

    From d_rename.U:

    This variable conditionally defines the HAS_RENAME symbol, which indicates to the C program that the rename() routine is available to rename files.

  • d_rewinddir

    From d_readdir.U:

    This variable conditionally defines HAS_REWINDDIR if rewinddir() is available.

  • d_rmdir

    From d_rmdir.U:

    This variable conditionally defines HAS_RMDIR if rmdir() is available to remove directories.

  • d_safebcpy

    From d_safebcpy.U:

    This variable conditionally defines the HAS_SAFE_BCOPY symbol if the bcopy() routine can do overlapping copies. Normally, you should probably use memmove().

  • d_safemcpy

    From d_safemcpy.U:

    This variable conditionally defines the HAS_SAFE_MEMCPY symbol if the memcpy() routine can do overlapping copies. For overlapping copies, memmove() should be used, if available.
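
    A minimal sketch of the overlapping-copy case these two entries are concerned with, using memmove() as recommended. The helper name shift_right is an assumption made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Move the first n bytes of buf to start at offset `by`.
 * Source and destination overlap whenever by < n, so memmove()
 * is used; memcpy() gives no guarantee for overlapping regions.
 * The caller must ensure buf has room for at least by + n bytes. */
void shift_right(char *buf, size_t n, size_t by)
{
    assert(buf != NULL);
    memmove(buf + by, buf, n);
}
```

    Given char b[8] = "abcd", the call shift_right(b, 4, 2) leaves b holding "ababcd".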

  • d_sanemcmp

    From d_sanemcmp.U:

    This variable conditionally defines the HAS_SANE_MEMCMP symbol if the memcmp() routine is available and can be used to compare relative magnitudes of chars with their high bits set.
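
    Where HAS_SANE_MEMCMP is not defined, a program might fall back on a byte-wise comparison that treats each byte as unsigned. The following is only a sketch of such a fallback; the name my_memcmp is hypothetical and this is not the routine Perl itself uses.

```c
#include <assert.h>
#include <stddef.h>

/* Compare n bytes as unsigned values, so that a byte with its high
 * bit set (0x80..0xff) sorts after every 7-bit byte.
 * Returns a negative, zero, or positive value like memcmp(). */
int my_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    assert(n == 0 || (a != NULL && b != NULL));
    for (; n > 0; n--, pa++, pb++) {
        if (*pa != *pb)
            return *pa < *pb ? -1 : 1;
    }
    return 0;
}
```

    A comparison through plain (possibly signed) char would order "\x80" before "\x7f"; the unsigned comparison above orders it after.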

  • d_sbrkproto

    From d_sbrkproto.U:

    This variable conditionally defines the HAS_SBRK_PROTO symbol, which indicates to the C program that the system provides a prototype for the sbrk() function. Otherwise, it is up to the program to supply one.

  • d_scalbnl

    From d_scalbnl.U:

    This variable conditionally defines the HAS_SCALBNL symbol, which indicates to the C program that the scalbnl() routine is available. If ilogbl is also present we can emulate frexpl.

  • d_sched_yield

    From d_pthread_y.U:

    This variable conditionally defines the HAS_SCHED_YIELD symbol if the sched_yield routine is available to yield the execution of the current thread.

  • d_scm_rights

    From d_socket.U:

    This variable conditionally defines the HAS_SCM_RIGHTS symbol, which indicates that SCM_RIGHTS is available. A plain #ifdef is not enough because SCM_RIGHTS may be an enum constant rather than a macro; glibc has been known to do this.
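
    The reason #ifdef cannot detect an enum constant can be shown with a small sketch. MY_SCM_RIGHTS is a made-up name standing in for a constant that a system header declares as an enum member.

```c
#include <assert.h>

/* A constant declared as an enum member, as some C libraries do. */
enum { MY_SCM_RIGHTS = 1 };

/* Returns 1 if the preprocessor can see the name, 0 otherwise.
 * Enum constants exist only for the compiler proper, so the
 * #else branch is the one that gets compiled. */
int seen_by_ifdef(void)
{
#ifdef MY_SCM_RIGHTS
    return 1;
#else
    return 0;
#endif
}

/* The compiler itself can use the constant without any trouble. */
int seen_by_compiler(void)
{
    assert(MY_SCM_RIGHTS == 1);
    return MY_SCM_RIGHTS;
}
```

    This is why a compile-time probe, rather than a preprocessor test, is needed to set such a symbol.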

  • d_SCNfldbl

    From longdblfio.U:

    This variable conditionally defines the PERL_SCNfldbl symbol, which indicates that stdio has a symbol to scan long doubles.

  • d_seekdir

    From d_readdir.U:

    This variable conditionally defines HAS_SEEKDIR if seekdir() is available.

  • d_select

    From d_select.U:

    This variable conditionally defines HAS_SELECT if select() is available to select active file descriptors. A <sys/time.h> inclusion may be necessary for the timeout field.

  • d_sem

    From d_sem.U:

    This variable conditionally defines the HAS_SEM symbol, which indicates that the entire sem*(2) library is present.

  • d_semctl

    From d_semctl.U:

    This variable conditionally defines the HAS_SEMCTL symbol, which indicates to the C program that the semctl() routine is available.

  • d_semctl_semid_ds

    From d_union_semun.U:

    This variable conditionally defines USE_SEMCTL_SEMID_DS, which indicates that struct semid_ds * is to be used for semctl IPC_STAT.

  • d_semctl_semun

    From d_union_semun.U:

    This variable conditionally defines USE_SEMCTL_SEMUN, which indicates that union semun is to be used for semctl IPC_STAT.

  • d_semget

    From d_semget.U:

    This variable conditionally defines the HAS_SEMGET symbol, which indicates to the C program that the semget() routine is available.

  • d_semop

    From d_semop.U:

    This variable conditionally defines the HAS_SEMOP symbol, which indicates to the C program that the semop() routine is available.

  • d_sendmsg

    From d_sendmsg.U:

    This variable conditionally defines the HAS_SENDMSG symbol, which indicates to the C program that the sendmsg() routine is available.

  • d_setegid

    From d_setegid.U:

    This variable conditionally defines the HAS_SETEGID symbol, which indicates to the C program that the setegid() routine is available to change the effective gid of the current program.

  • d_seteuid

    From d_seteuid.U:

    This variable conditionally defines the HAS_SETEUID symbol, which indicates to the C program that the seteuid() routine is available to change the effective uid of the current program.

  • d_setgrent

    From d_setgrent.U:

    This variable conditionally defines the HAS_SETGRENT symbol, which indicates to the C program that the setgrent() routine is available for initializing sequential access to the group database.

  • d_setgrent_r

    From d_setgrent_r.U:

    This variable conditionally defines the HAS_SETGRENT_R symbol, which indicates to the C program that the setgrent_r() routine is available.

  • d_setgrps

    From d_setgrps.U:

    This variable conditionally defines the HAS_SETGROUPS symbol, which indicates to the C program that the setgroups() routine is available to set the list of process groups.

  • d_sethent

    From d_sethent.U:

    This variable conditionally defines HAS_SETHOSTENT if sethostent() is available.

  • d_sethostent_r

    From d_sethostent_r.U:

    This variable conditionally defines the HAS_SETHOSTENT_R symbol, which indicates to the C program that the sethostent_r() routine is available.

  • d_setitimer

    From d_setitimer.U:

    This variable conditionally defines the HAS_SETITIMER symbol, which indicates to the C program that the setitimer() routine is available.

  • d_setlinebuf

    From d_setlnbuf.U:

    This variable conditionally defines the HAS_SETLINEBUF symbol, which indicates to the C program that the setlinebuf() routine is available to change stderr or stdout from block-buffered or unbuffered to a line-buffered mode.

  • d_setlocale

    From d_setlocale.U:

    This variable conditionally defines HAS_SETLOCALE if setlocale() is available to handle locale-specific ctype implementations.

  • d_setlocale_r

    From d_setlocale_r.U:

    This variable conditionally defines the HAS_SETLOCALE_R symbol, which indicates to the C program that the setlocale_r() routine is available.

  • d_setnent

    From d_setnent.U:

    This variable conditionally defines HAS_SETNETENT if setnetent() is available.

  • d_setnetent_r

    From d_setnetent_r.U:

    This variable conditionally defines the HAS_SETNETENT_R symbol, which indicates to the C program that the setnetent_r() routine is available.

  • d_setpent

    From d_setpent.U:

    This variable conditionally defines HAS_SETPROTOENT if setprotoent() is available.

  • d_setpgid

    From d_setpgid.U:

    This variable conditionally defines the HAS_SETPGID symbol if the setpgid(pid, gpid) function is available to set process group ID.

  • d_setpgrp

    From d_setpgrp.U:

    This variable conditionally defines HAS_SETPGRP if setpgrp() is available to set the current process group.

  • d_setpgrp2

    From d_setpgrp2.U:

    This variable conditionally defines the HAS_SETPGRP2 symbol, which indicates to the C program that the setpgrp2() (as in DG/UX) routine is available to set the current process group.

  • d_setprior

    From d_setprior.U:

    This variable conditionally defines HAS_SETPRIORITY if setpriority() is available to set a process's priority.

  • d_setproctitle

    From d_setproctitle.U:

    This variable conditionally defines the HAS_SETPROCTITLE symbol, which indicates to the C program that the setproctitle() routine is available.

  • d_setprotoent_r

    From d_setprotoent_r.U:

    This variable conditionally defines the HAS_SETPROTOENT_R symbol, which indicates to the C program that the setprotoent_r() routine is available.

  • d_setpwent

    From d_setpwent.U:

    This variable conditionally defines the HAS_SETPWENT symbol, which indicates to the C program that the setpwent() routine is available for initializing sequential access to the passwd database.

  • d_setpwent_r

    From d_setpwent_r.U:

    This variable conditionally defines the HAS_SETPWENT_R symbol, which indicates to the C program that the setpwent_r() routine is available.

  • d_setregid

    From d_setregid.U:

    This variable conditionally defines HAS_SETREGID if setregid() is available to change the real and effective gid of the current process.

  • d_setresgid

    From d_setregid.U:

    This variable conditionally defines HAS_SETRESGID if setresgid() is available to change the real, effective and saved gid of the current process.

  • d_setresuid

    From d_setreuid.U:

    This variable conditionally defines HAS_SETRESUID if setresuid() is available to change the real, effective and saved uid of the current process.

  • d_setreuid

    From d_setreuid.U:

    This variable conditionally defines HAS_SETREUID if setreuid() is available to change the real and effective uid of the current process.

  • d_setrgid

    From d_setrgid.U:

    This variable conditionally defines the HAS_SETRGID symbol, which indicates to the C program that the setrgid() routine is available to change the real gid of the current program.

  • d_setruid

    From d_setruid.U:

    This variable conditionally defines the HAS_SETRUID symbol, which indicates to the C program that the setruid() routine is available to change the real uid of the current program.

  • d_setsent

    From d_setsent.U:

    This variable conditionally defines HAS_SETSERVENT if setservent() is available.

  • d_setservent_r

    From d_setservent_r.U:

    This variable conditionally defines the HAS_SETSERVENT_R symbol, which indicates to the C program that the setservent_r() routine is available.

  • d_setsid

    From d_setsid.U:

    This variable conditionally defines HAS_SETSID if setsid() is available to set the process group ID.

  • d_setvbuf

    From d_setvbuf.U:

    This variable conditionally defines the HAS_SETVBUF symbol, which indicates to the C program that the setvbuf() routine is available to change buffering on an open stdio stream.
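
    As a small sketch of what setvbuf() is used for here, the helper below switches a stream to line buffering, which is the effect setlinebuf() provides where it exists. The name make_line_buffered is an assumption made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Switch an open stream to line-buffered mode.
 * setvbuf() must be called before any other operation on the stream.
 * Returns 0 on success and nonzero on failure. */
int make_line_buffered(FILE *fp)
{
    assert(fp != NULL);
    /* A NULL buffer asks the library to allocate one itself. */
    return setvbuf(fp, NULL, _IOLBF, BUFSIZ);
}
```

    A typical call would be make_line_buffered(stdout) at program start.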

  • d_sfio

    From d_sfio.U:

    This variable conditionally defines the USE_SFIO symbol, and indicates whether sfio is available (and should be used).

  • d_shm

    From d_shm.U:

    This variable conditionally defines the HAS_SHM symbol, which indicates that the entire shm*(2) library is present.

  • d_shmat

    From d_shmat.U:

    This variable conditionally defines the HAS_SHMAT symbol, which indicates to the C program that the shmat() routine is available.

  • d_shmatprototype

    From d_shmat.U:

    This variable conditionally defines the HAS_SHMAT_PROTOTYPE symbol, which indicates that sys/shm.h has a prototype for shmat.

  • d_shmctl

    From d_shmctl.U:

    This variable conditionally defines the HAS_SHMCTL symbol, which indicates to the C program that the shmctl() routine is available.

  • d_shmdt

    From d_shmdt.U:

    This variable conditionally defines the HAS_SHMDT symbol, which indicates to the C program that the shmdt() routine is available.

  • d_shmget

    From d_shmget.U:

    This variable conditionally defines the HAS_SHMGET symbol, which indicates to the C program that the shmget() routine is available.

  • d_sigaction

    From d_sigaction.U:

    This variable conditionally defines the HAS_SIGACTION symbol, which indicates that the Vr4 sigaction() routine is available.

  • d_signbit

    From d_signbit.U:

    This variable conditionally defines the HAS_SIGNBIT symbol, which indicates to the C program that the signbit() routine is available and safe to use with perl's internal NV type.

  • d_sigprocmask

    From d_sigprocmask.U:

    This variable conditionally defines HAS_SIGPROCMASK if sigprocmask() is available to examine or change the signal mask of the calling process.

  • d_sigsetjmp

    From d_sigsetjmp.U:

    This variable conditionally defines the HAS_SIGSETJMP symbol, which indicates that the sigsetjmp() routine is available to call setjmp() and optionally save the process's signal mask.
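
    A rough sketch of the sigsetjmp() and siglongjmp() pairing on a POSIX system follows. The names env, last_code, fail and run_guarded are hypothetical; a non-zero second argument to sigsetjmp() asks for the signal mask to be saved and later restored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>

static sigjmp_buf env;
static volatile int last_code;

/* Record a code and jump back to the sigsetjmp() point. */
static void fail(int code)
{
    last_code = code;
    siglongjmp(env, 1);
}

/* Run a step that bails out through siglongjmp() and report
 * the code it recorded. */
int run_guarded(int code)
{
    assert(code != 0);
    last_code = 0;
    if (sigsetjmp(env, 1) == 0) {   /* 1: save the signal mask too */
        fail(code);                 /* does not return */
    }
    /* Reached only via siglongjmp(); the saved mask is restored. */
    return last_code;
}
```

    Here run_guarded(7) returns 7 after the non-local jump.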

  • d_sin6_scope_id

    From d_socket.U:

    This variable conditionally defines the HAS_SIN6_SCOPE_ID symbol, which indicates that a struct sockaddr_in6 structure has the sin6_scope_id member.

  • d_sitearch

    From sitearch.U:

    This variable conditionally defines SITEARCH to hold the pathname of architecture-dependent library files for $package. If $sitearch is the same as $archlib, then this is set to undef.

  • d_snprintf

    From d_snprintf.U:

    This variable conditionally defines the HAS_SNPRINTF symbol, which indicates to the C program that the snprintf() library function is available.

  • d_sockaddr_in6

    From d_socket.U:

    This variable conditionally defines the HAS_SOCKADDR_IN6 symbol, which indicates the availability of a struct sockaddr_in6.

  • d_sockaddr_sa_len

    From d_socket.U:

    This variable conditionally defines the HAS_SOCKADDR_SA_LEN symbol, which indicates that a struct sockaddr structure has the sa_len member.

  • d_sockatmark

    From d_sockatmark.U:

    This variable conditionally defines the HAS_SOCKATMARK symbol, which indicates to the C program that the sockatmark() routine is available.

  • d_sockatmarkproto

    From d_sockatmarkproto.U:

    This variable conditionally defines the HAS_SOCKATMARK_PROTO symbol, which indicates to the C program that the system provides a prototype for the sockatmark() function. Otherwise, it is up to the program to supply one.

  • d_socket

    From d_socket.U:

    This variable conditionally defines HAS_SOCKET, which indicates that the BSD socket interface is supported.

  • d_socklen_t

    From d_socklen_t.U:

    This symbol will be defined if the C compiler supports socklen_t.

  • d_sockpair

    From d_socket.U:

    This variable conditionally defines the HAS_SOCKETPAIR symbol, which indicates that the BSD socketpair() is supported.

  • d_socks5_init

    From d_socks5_init.U:

    This variable conditionally defines the HAS_SOCKS5_INIT symbol, which indicates to the C program that the socks5_init() routine is available.

  • d_sprintf_returns_strlen

    From d_sprintf_len.U:

    This variable defines whether sprintf returns the length of the string (as per the ANSI spec). Some C libraries retain compatibility with pre-ANSI C and return a pointer to the passed in buffer; for these this variable will be undef.
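
    Code that must run on both kinds of C library can avoid depending on sprintf's return value altogether. The sketch below takes that approach; format_int_len is a hypothetical helper, not something Perl provides.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format v into buf and return the number of characters written.
 * The length is measured with strlen() instead of being taken from
 * sprintf(), so the result is the same whether sprintf() returns
 * a length or a pointer. buf must be large enough for any int. */
size_t format_int_len(char *buf, int v)
{
    assert(buf != NULL);
    sprintf(buf, "%d", v);
    return strlen(buf);
}
```

    For instance, format_int_len(buf, -1234) returns 5.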

  • d_sqrtl

    From d_sqrtl.U:

    This variable conditionally defines the HAS_SQRTL symbol, which indicates to the C program that the sqrtl() routine is available.

  • d_srand48_r

    From d_srand48_r.U:

    This variable conditionally defines the HAS_SRAND48_R symbol, which indicates to the C program that the srand48_r() routine is available.

  • d_srandom_r

    From d_srandom_r.U:

    This variable conditionally defines the HAS_SRANDOM_R symbol, which indicates to the C program that the srandom_r() routine is available.

  • d_sresgproto

    From d_sresgproto.U:

    This variable conditionally defines the HAS_SETRESGID_PROTO symbol, which indicates to the C program that the system provides a prototype for the setresgid() function. Otherwise, it is up to the program to supply one.

  • d_sresuproto

    From d_sresuproto.U:

    This variable conditionally defines the HAS_SETRESUID_PROTO symbol, which indicates to the C program that the system provides a prototype for the setresuid() function. Otherwise, it is up to the program to supply one.

  • d_statblks

    From d_statblks.U:

    This variable conditionally defines USE_STAT_BLOCKS if this system has a stat structure declaring st_blksize and st_blocks.

  • d_statfs_f_flags

    From d_statfs_f_flags.U:

    This variable conditionally defines the HAS_STRUCT_STATFS_F_FLAGS symbol, which indicates that struct statfs has an f_flags member. This kind of struct statfs comes from sys/mount.h (BSD), not from sys/statfs.h (SYSV).

  • d_statfs_s

    From d_statfs_s.U:

    This variable conditionally defines the HAS_STRUCT_STATFS symbol, which indicates that the struct statfs is supported.

  • d_static_inline

    From d_static_inline.U:

    This variable conditionally defines the HAS_STATIC_INLINE symbol, which indicates that the C compiler supports C99-style static inline. That is, the function can't be called from another translation unit.
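
    One way a program might act on this symbol is to wrap the storage-class keywords in a macro, as sketched below. MY_STATIC_INLINE, max_int, min_int and clamp are hypothetical names, and the #define of HAS_STATIC_INLINE is written by hand here only so the sketch is self-contained.

```c
#include <assert.h>

/* Assumption for this sketch: normally supplied by config.h. */
#define HAS_STATIC_INLINE

#ifdef HAS_STATIC_INLINE
#define MY_STATIC_INLINE static inline
#else
#define MY_STATIC_INLINE static
#endif

/* Small helpers visible only inside this translation unit. */
MY_STATIC_INLINE int max_int(int a, int b)
{
    return a > b ? a : b;
}

MY_STATIC_INLINE int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Externally visible function built from the helpers above. */
int clamp(int v, int lo, int hi)
{
    assert(lo <= hi);
    return min_int(max_int(v, lo), hi);
}
```

    Without inline support the helpers simply become ordinary static functions, so callers such as clamp() behave the same either way.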

  • d_statvfs

    From d_statvfs.U:

    This variable conditionally defines the HAS_STATVFS symbol, which indicates to the C program that the statvfs() routine is available.

  • d_stdio_cnt_lval

    From d_stdstdio.U:

    This variable conditionally defines STDIO_CNT_LVALUE if the FILE_cnt macro can be used as an lvalue.

  • d_stdio_ptr_lval

    From d_stdstdio.U:

    This variable conditionally defines STDIO_PTR_LVALUE if the FILE_ptr macro can be used as an lvalue.

  • d_stdio_ptr_lval_nochange_cnt

    From d_stdstdio.U:

    This symbol is defined if using the FILE_ptr macro as an lvalue to increase the pointer by n leaves File_cnt(fp) unchanged.

  • d_stdio_ptr_lval_sets_cnt

    From d_stdstdio.U:

    This symbol is defined if using the FILE_ptr macro as an lvalue to increase the pointer by n has the side effect of decreasing the value of File_cnt(fp) by n.

  • d_stdio_stream_array

    From stdio_streams.U:

    This variable tells whether there is an array holding the stdio streams.

  • d_stdiobase

    From d_stdstdio.U:

    This variable conditionally defines USE_STDIO_BASE if this system has a FILE structure declaring a usable _base field (or equivalent) in stdio.h.

  • d_stdstdio

    From d_stdstdio.U:

    This variable conditionally defines USE_STDIO_PTR if this system has a FILE structure declaring usable _ptr and _cnt fields (or equivalent) in stdio.h.

  • d_strchr

    From d_strchr.U:

    This variable conditionally defines HAS_STRCHR if strchr() and strrchr() are available for string searching.

  • d_strcoll

    From d_strcoll.U:

    This variable conditionally defines HAS_STRCOLL if strcoll() is available to compare strings using collating information.

  • d_strctcpy

    From d_strctcpy.U:

    This variable conditionally defines the USE_STRUCT_COPY symbol, which indicates to the C program that this C compiler knows how to copy structures.

  • d_strerrm

    From d_strerror.U:

    This variable holds what Strerror is defined as to translate an error code condition into an error message string. It could be strerror or a more complex macro emulating strerror with sys_errlist[], or the unknown string when both strerror and sys_errlist are missing.

  • d_strerror

    From d_strerror.U:

    This variable conditionally defines HAS_STRERROR if strerror() is available to translate error numbers to strings.

  • d_strerror_r

    From d_strerror_r.U:

    This variable conditionally defines the HAS_STRERROR_R symbol, which indicates to the C program that the strerror_r() routine is available.

  • d_strftime

    From d_strftime.U:

    This variable conditionally defines the HAS_STRFTIME symbol, which indicates to the C program that the strftime() routine is available.

  • d_strlcat

    From d_strlcat.U:

    This variable conditionally defines the HAS_STRLCAT symbol, which indicates to the C program that the strlcat() routine is available.

  • d_strlcpy

    From d_strlcpy.U:

    This variable conditionally defines the HAS_STRLCPY symbol, which indicates to the C program that the strlcpy() routine is available.
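
    When HAS_STRLCPY is not defined, a program needs its own replacement. The following is a minimal sketch of one, written to follow the usual BSD strlcpy() contract; the name my_strlcpy is hypothetical and this is not Perl's own fallback.

```c
#include <assert.h>
#include <string.h>

/* Copy src into dst, writing at most size - 1 bytes plus a
 * terminating NUL (when size > 0). Returns strlen(src), so a
 * result >= size tells the caller that truncation occurred. */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len;
    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

    Copying "hello" into a 4-byte buffer stores "hel" and returns 5, which signals the truncation.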

  • d_strtod

    From d_strtod.U:

    This variable conditionally defines the HAS_STRTOD symbol, which indicates to the C program that the strtod() routine is available to provide better numeric string conversion than atof().

  • d_strtol

    From d_strtol.U:

    This variable conditionally defines the HAS_STRTOL symbol, which indicates to the C program that the strtol() routine is available to provide better numeric string conversion than atoi() and friends.
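
    The advantage of strtol() over atoi() is that it reports where parsing stopped and whether the value overflowed. A short sketch, with parse_long as a hypothetical helper name:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a base-10 long from s.
 * Returns 1 and stores the value in *out on success. Returns 0 if
 * s has no digits, has trailing characters, or is out of range. */
int parse_long(const char *s, long *out)
{
    char *end;
    long v;
    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE)
        return 0;
    *out = v;
    return 1;
}
```

    atoi() cannot distinguish these failure cases from a valid result.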

  • d_strtold

    From d_strtold.U:

    This variable conditionally defines the HAS_STRTOLD symbol, which indicates to the C program that the strtold() routine is available.

  • d_strtoll

    From d_strtoll.U:

    This variable conditionally defines the HAS_STRTOLL symbol, which indicates to the C program that the strtoll() routine is available.

  • d_strtoq

    From d_strtoq.U:

    This variable conditionally defines the HAS_STRTOQ symbol, which indicates to the C program that the strtoq() routine is available.

  • d_strtoul

    From d_strtoul.U:

    This variable conditionally defines the HAS_STRTOUL symbol, which indicates to the C program that the strtoul() routine is available to provide conversion of strings to unsigned long.

  • d_strtoull

    From d_strtoull.U:

    This variable conditionally defines the HAS_STRTOULL symbol, which indicates to the C program that the strtoull() routine is available.

  • d_strtouq

    From d_strtouq.U:

    This variable conditionally defines the HAS_STRTOUQ symbol, which indicates to the C program that the strtouq() routine is available.

  • d_strxfrm

    From d_strxfrm.U:

    This variable conditionally defines HAS_STRXFRM if strxfrm() is available to transform strings.

  • d_suidsafe

    From d_dosuid.U:

    This variable conditionally defines SETUID_SCRIPTS_ARE_SECURE_NOW if setuid scripts can be secure. This test looks in /dev/fd/.

  • d_symlink

    From d_symlink.U:

    This variable conditionally defines the HAS_SYMLINK symbol, which indicates to the C program that the symlink() routine is available to create symbolic links.

  • d_syscall

    From d_syscall.U:

    This variable conditionally defines HAS_SYSCALL if syscall() is available to make arbitrary system calls.

  • d_syscallproto

    From d_syscallproto.U:

    This variable conditionally defines the HAS_SYSCALL_PROTO symbol, which indicates to the C program that the system provides a prototype for the syscall() function. Otherwise, it is up to the program to supply one.

  • d_sysconf

    From d_sysconf.U:

    This variable conditionally defines the HAS_SYSCONF symbol, which indicates to the C program that the sysconf() routine is available to determine system related limits and options.

  • d_sysernlst

    From d_strerror.U:

    This variable conditionally defines HAS_SYS_ERRNOLIST if sys_errnolist[] is available to translate error numbers to the symbolic name.

  • d_syserrlst

    From d_strerror.U:

    This variable conditionally defines HAS_SYS_ERRLIST if sys_errlist[] is available to translate error numbers to strings.

  • d_system

    From d_system.U:

    This variable conditionally defines HAS_SYSTEM if system() is available to issue a shell command.

  • d_tcgetpgrp

    From d_tcgtpgrp.U:

    This variable conditionally defines the HAS_TCGETPGRP symbol, which indicates to the C program that the tcgetpgrp() routine is available to get the foreground process group ID.

  • d_tcsetpgrp

    From d_tcstpgrp.U:

    This variable conditionally defines the HAS_TCSETPGRP symbol, which indicates to the C program that the tcsetpgrp() routine is available to set the foreground process group ID.

  • d_telldir

    From d_readdir.U:

    This variable conditionally defines HAS_TELLDIR if telldir() is available.

  • d_telldirproto

    From d_telldirproto.U:

    This variable conditionally defines the HAS_TELLDIR_PROTO symbol, which indicates to the C program that the system provides a prototype for the telldir() function. Otherwise, it is up to the program to supply one.

  • d_time

    From d_time.U:

    This variable conditionally defines the HAS_TIME symbol, which indicates that the time() routine exists. The time() routine is normally provided on UNIX systems.

  • d_timegm

    From d_timegm.U:

    This variable conditionally defines the HAS_TIMEGM symbol, which indicates to the C program that the timegm() routine is available.

  • d_times

    From d_times.U:

    This variable conditionally defines the HAS_TIMES symbol, which indicates that the times() routine exists. The times() routine is normally provided on UNIX systems. You may have to include <sys/times.h>.

  • d_tm_tm_gmtoff

    From i_time.U:

    This variable conditionally defines HAS_TM_TM_GMTOFF, which indicates to the C program that the struct tm has the tm_gmtoff field.

  • d_tm_tm_zone

    From i_time.U:

    This variable conditionally defines HAS_TM_TM_ZONE, which indicates to the C program that the struct tm has the tm_zone field.

  • d_tmpnam_r

    From d_tmpnam_r.U:

    This variable conditionally defines the HAS_TMPNAM_R symbol, which indicates to the C program that the tmpnam_r() routine is available.

  • d_truncate

    From d_truncate.U:

    This variable conditionally defines HAS_TRUNCATE if truncate() is available to truncate files.

  • d_ttyname_r

    From d_ttyname_r.U:

    This variable conditionally defines the HAS_TTYNAME_R symbol, which indicates to the C program that the ttyname_r() routine is available.

  • d_tzname

    From d_tzname.U:

    This variable conditionally defines HAS_TZNAME if tzname[] is available to access timezone names.

  • d_u32align

    From d_u32align.U:

    This variable tells whether you must access character data through U32-aligned pointers.

  • d_ualarm

    From d_ualarm.U:

    This variable conditionally defines the HAS_UALARM symbol, which indicates to the C program that the ualarm() routine is available.

  • d_umask

    From d_umask.U:

    This variable conditionally defines the HAS_UMASK symbol, which indicates to the C program that the umask() routine is available to set and get the value of the file creation mask.

  • d_uname

    From d_gethname.U:

    This variable conditionally defines the HAS_UNAME symbol, which indicates to the C program that the uname() routine may be used to derive the host name.

  • d_union_semun

    From d_union_semun.U:

    This variable conditionally defines HAS_UNION_SEMUN if the union semun is defined by including <sys/sem.h>.

  • d_unordered

    From d_unordered.U:

    This variable conditionally defines the HAS_UNORDERED symbol, which indicates to the C program that the unordered() routine is available.

  • d_unsetenv

    From d_unsetenv.U:

    This variable conditionally defines the HAS_UNSETENV symbol, which indicates to the C program that the unsetenv() routine is available.

  • d_usleep

    From d_usleep.U:

    This variable conditionally defines HAS_USLEEP if usleep() is available to do high granularity sleeps.

  • d_usleepproto

    From d_usleepproto.U:

    This variable conditionally defines the HAS_USLEEP_PROTO symbol, which indicates to the C program that the system provides a prototype for the usleep() function. Otherwise, it is up to the program to supply one.

  • d_ustat

    From d_ustat.U:

    This variable conditionally defines HAS_USTAT if ustat() is available to query file system statistics by dev_t.

  • d_vendorarch

    From vendorarch.U:

    This variable conditionally defines PERL_VENDORARCH.

  • d_vendorbin

    From vendorbin.U:

    This variable conditionally defines PERL_VENDORBIN.

  • d_vendorlib

    From vendorlib.U:

    This variable conditionally defines PERL_VENDORLIB.

  • d_vendorscript

    From vendorscript.U:

    This variable conditionally defines PERL_VENDORSCRIPT.

  • d_vfork

    From d_vfork.U:

    This variable conditionally defines the HAS_VFORK symbol, which indicates the vfork() routine is available.

  • d_void_closedir

    From d_closedir.U:

    This variable conditionally defines VOID_CLOSEDIR if closedir() does not return a value.

  • d_voidsig

    From d_voidsig.U:

    This variable conditionally defines VOIDSIG if this system declares "void (*signal(...))()" in signal.h. The old way was to declare it as "int (*signal(...))()".

  • d_voidtty

    From i_sysioctl.U:

    This variable conditionally defines USE_IOCNOTTY to indicate that the ioctl() call with TIOCNOTTY should be used to void tty association. Otherwise (on USG probably), it is enough to close the standard file descriptors and do a setpgrp().

  • d_volatile

    From d_volatile.U:

    This variable conditionally defines the HASVOLATILE symbol, which indicates to the C program that this C compiler knows about the volatile declaration.

  • d_vprintf

    From d_vprintf.U:

    This variable conditionally defines the HAS_VPRINTF symbol, which indicates to the C program that the vprintf() routine is available to printf with a pointer to an argument list.

  • d_vsnprintf

    From d_snprintf.U:

    This variable conditionally defines the HAS_VSNPRINTF symbol, which indicates to the C program that the vsnprintf() library function is available.

  • d_wait4

    From d_wait4.U:

    This variable conditionally defines the HAS_WAIT4 symbol, which indicates the wait4() routine is available.

  • d_waitpid

    From d_waitpid.U:

    This variable conditionally defines HAS_WAITPID if waitpid() is available to wait for a child process.

  • d_wcstombs

    From d_wcstombs.U:

    This variable conditionally defines the HAS_WCSTOMBS symbol, which indicates to the C program that the wcstombs() routine is available to convert wide character strings to multibyte strings.

  • d_wctomb

    From d_wctomb.U:

    This variable conditionally defines the HAS_WCTOMB symbol, which indicates to the C program that the wctomb() routine is available to convert a wide character to a multibyte character.

  • d_writev

    From d_writev.U:

    This variable conditionally defines the HAS_WRITEV symbol, which indicates to the C program that the writev() routine is available.

  • d_xenix

    From Guess.U:

    This variable conditionally defines the symbol XENIX, which alerts the C program that it runs under Xenix.

  • date

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the date program. After Configure runs, the value is reset to a plain date and is not useful.

  • db_hashtype

    From i_db.U:

    This variable contains the type of the hash structure element in the <db.h> header file. In older versions of DB, it was int, while in newer ones it is u_int32_t.

  • db_prefixtype

    From i_db.U:

    This variable contains the type of the prefix structure element in the <db.h> header file. In older versions of DB, it was int, while in newer ones it is size_t.

  • db_version_major

    From i_db.U:

    This variable contains the major version number of Berkeley DB found in the <db.h> header file.

  • db_version_minor

    From i_db.U:

    This variable contains the minor version number of Berkeley DB found in the <db.h> header file. For DB version 1 this is always 0.

  • db_version_patch

    From i_db.U:

    This variable contains the patch version number of Berkeley DB found in the <db.h> header file. For DB version 1 this is always 0.

  • defvoidused

    From voidflags.U:

    This variable contains the default value of the VOIDUSED symbol (15).

  • direntrytype

    From i_dirent.U:

    This symbol is set to struct direct or struct dirent depending on whether dirent is available or not. You should use this pseudo type to portably declare your directory entries.

  • dlext

    From dlext.U:

    This variable contains the extension that is to be used for the dynamically loaded modules that perl generates.

  • dlsrc

    From dlsrc.U:

    This variable contains the name of the dynamic loading file that will be used with the package.

  • doublesize

    From doublesize.U:

    This variable contains the value of the DOUBLESIZE symbol, which indicates to the C program how many bytes there are in a double.

  • drand01

    From randfunc.U:

    Indicates the macro to be used to generate normalized random numbers. Uses randfunc, often divided by (double) (((unsigned long) 1 << randbits)) in order to normalize the result. In C programs, the macro Drand01 is mapped to drand01.

  • drand48_r_proto

    From d_drand48_r.U:

    This variable encodes the prototype of drand48_r. It is zero if d_drand48_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_drand48_r is defined.

  • dtrace

    From usedtrace.U:

    This variable holds the location of the dtrace executable.

  • dynamic_ext

    From Extensions.U:

    This variable holds a list of XS extension files we want to link dynamically into the package. It is used by Makefile.
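
The d_ entries above are usually read from Perl through the %Config hash exported by the Config module. The following is a minimal sketch, not part of the original glossary: the helper name has_feature and the keys printed are illustrative choices. It assumes the convention that an available feature is stored as the string 'define', and it treats an undefined value or the string 'undef' as "not available".

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true if a d_* variable is set to 'define'.
# It takes the hash to inspect so it can be tried with sample data.
sub has_feature {
    my ($cfg, $key) = @_;
    my $value = $cfg->{$key};
    return (defined $value && $value eq 'define') ? 1 : 0;
}

# Report a few of the d_* variables described in this section.
for my $key (qw(d_usleep d_vfork d_waitpid d_writev)) {
    my $state = has_feature(\%Config, $key) ? 'available' : 'not available';
    printf "%-10s %s\n", $key, $state;
}
```

The output depends on how the running perl was configured, so no particular result should be expected.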

e

  • eagain

    From nblock_io.U:

    This variable bears the symbolic errno code set by read() when no data is present on the file and non-blocking I/O was enabled (otherwise, read() blocks naturally).

  • ebcdic

    From ebcdic.U:

    This variable conditionally defines EBCDIC if this system uses EBCDIC encoding.

  • echo

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the echo program. After Configure runs, the value is reset to a plain echo and is not useful.

  • egrep

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the egrep program. After Configure runs, the value is reset to a plain egrep and is not useful.

  • emacs

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • endgrent_r_proto

    From d_endgrent_r.U:

    This variable encodes the prototype of endgrent_r. It is zero if d_endgrent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_endgrent_r is defined.

  • endhostent_r_proto

    From d_endhostent_r.U:

    This variable encodes the prototype of endhostent_r. It is zero if d_endhostent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_endhostent_r is defined.

  • endnetent_r_proto

    From d_endnetent_r.U:

    This variable encodes the prototype of endnetent_r. It is zero if d_endnetent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_endnetent_r is defined.

  • endprotoent_r_proto

    From d_endprotoent_r.U:

    This variable encodes the prototype of endprotoent_r. It is zero if d_endprotoent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_endprotoent_r is defined.

  • endpwent_r_proto

    From d_endpwent_r.U:

    This variable encodes the prototype of endpwent_r. It is zero if d_endpwent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_endpwent_r is defined.

  • endservent_r_proto

    From d_endservent_r.U:

    This variable encodes the prototype of endservent_r. It is zero if d_endservent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_endservent_r is defined.

  • eunicefix

    From Init.U:

    When running under Eunice this variable contains a command which will convert a shell script to the proper form of text file for it to be executable by the shell. On other systems it is a no-op.

  • exe_ext

    From Unix.U:

    This is an old synonym for _exe.

  • expr

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the expr program. After Configure runs, the value is reset to a plain expr and is not useful.

  • extensions

    From Extensions.U:

    This variable holds a list of all extension files (both XS and non-XS) linked into the package. It is propagated to Config.pm and is typically used to test whether a particular extension is available.

  • extern_C

    From Csym.U:

    ANSI C requires extern where C++ requires extern "C". This variable can be used in Configure to do the right thing.

  • extras

    From Extras.U:

    This variable holds a list of extra modules to install.
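
The extensions entry above mentions testing whether a particular extension is available. One way this could be done is sketched below. The helper name extension_in_list is hypothetical, and the sketch assumes that the list is space-separated and that nested extensions are written with / separators (so Data::Dumper would appear as Data/Dumper).

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: look for a module in a space-separated
# extension list such as the one held in $Config{extensions}.
sub extension_in_list {
    my ($list, $module) = @_;
    (my $name = $module) =~ s{::}{/}g;    # Data::Dumper -> Data/Dumper
    my @found = grep { $_ eq $name } split ' ', (defined $list ? $list : '');
    return scalar @found;
}

# Check a few extensions against the running perl's configuration.
for my $module (qw(POSIX Fcntl Data::Dumper)) {
    my $state = extension_in_list($Config{extensions}, $module)
        ? 'listed' : 'not listed';
    printf "%-14s %s\n", $module, $state;
}
```

Matching whole list entries rather than substrings avoids false positives such as finding Dumper inside Data/Dumper.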

f

  • fflushall

    From fflushall.U:

    This symbol, if defined, tells that to flush all pending stdio output one must loop through all the stdio file handles stored in an array and fflush them. Note that if fflushNULL is defined, fflushall will not even be probed for and will be left undefined.

  • fflushNULL

    From fflushall.U:

    This symbol, if defined, tells that fflush(NULL) does flush all pending stdio output.

  • find

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • firstmakefile

    From Unix.U:

    This variable defines the first file searched by make. On unix, it is makefile (then Makefile). On case-insensitive systems, it might be something else. This is only used to deal with convoluted make depend tricks.

  • flex

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • fpossize

    From fpossize.U:

    This variable contains the size of a fpostype in bytes.

  • fpostype

    From fpostype.U:

    This variable defines Fpos_t to be something like fpos_t, long, uint, or whatever type is used to declare file positions in libc.

  • freetype

    From mallocsrc.U:

    This variable contains the return type of free(). It is usually void, but occasionally int.

  • from

    From Cross.U:

    This variable contains the command used by Configure to copy files from the target host. It is useful and available only during the Perl build. Its value is the string : when not cross-compiling.

  • full_ar

    From Loc_ar.U:

    This variable contains the full pathname to ar, whether or not the user has specified portability. This is only used in the Makefile.SH.

  • full_csh

    From d_csh.U:

    This variable contains the full pathname to csh, whether or not the user has specified portability. This is only used in the compiled C program, and we assume that all systems which can share this executable will have the same full pathname to csh.

  • full_sed

    From Loc_sed.U:

    This variable contains the full pathname to sed, whether or not the user has specified portability. This is only used in the compiled C program, and we assume that all systems which can share this executable will have the same full pathname to sed.
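
The relationship between fflushNULL and fflushall described above can be sketched as a small decision routine. This is only an illustration: the helper name flush_strategy and the strings it returns are invented here, and the sketch assumes that a set symbol is stored as 'define'.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: describe how pending stdio output would be
# flushed, preferring fflushNULL because fflushall is not probed
# when fflushNULL is defined.
sub flush_strategy {
    my ($cfg) = @_;
    my $is_set = sub {
        my $value = $cfg->{ $_[0] };
        return defined $value && $value eq 'define';
    };
    return 'fflush(NULL)'      if $is_set->('fflushNULL');
    return 'loop over handles' if $is_set->('fflushall');
    return 'unknown';
}

print "stdio flush strategy: ", flush_strategy(\%Config), "\n";
```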

g

  • gccansipedantic

    From gccvers.U:

    If GNU cc (gcc) is used, this variable will enable (if set) the -ansi and -pedantic ccflags for building core files (through cflags script). (See Porting/pumpkin.pod for full description).

  • gccosandvers

    From gccvers.U:

    If GNU cc (gcc) is used, this variable holds the operating system and version used to compile gcc. It is set to '' if not gcc, or if nothing useful can be parsed as the os version.

  • gccversion

    From gccvers.U:

    If GNU cc (gcc) is used, this variable holds 1 or 2 to indicate whether the compiler is version 1 or 2. This is used in setting some of the default cflags. It is set to '' if not gcc.

  • getgrent_r_proto

    From d_getgrent_r.U:

    This variable encodes the prototype of getgrent_r. It is zero if d_getgrent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getgrent_r is defined.

  • getgrgid_r_proto

    From d_getgrgid_r.U:

    This variable encodes the prototype of getgrgid_r. It is zero if d_getgrgid_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getgrgid_r is defined.

  • getgrnam_r_proto

    From d_getgrnam_r.U:

    This variable encodes the prototype of getgrnam_r. It is zero if d_getgrnam_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getgrnam_r is defined.

  • gethostbyaddr_r_proto

    From d_gethostbyaddr_r.U:

    This variable encodes the prototype of gethostbyaddr_r. It is zero if d_gethostbyaddr_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_gethostbyaddr_r is defined.

  • gethostbyname_r_proto

    From d_gethostbyname_r.U:

    This variable encodes the prototype of gethostbyname_r. It is zero if d_gethostbyname_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_gethostbyname_r is defined.

  • gethostent_r_proto

    From d_gethostent_r.U:

    This variable encodes the prototype of gethostent_r. It is zero if d_gethostent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_gethostent_r is defined.

  • getlogin_r_proto

    From d_getlogin_r.U:

    This variable encodes the prototype of getlogin_r. It is zero if d_getlogin_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getlogin_r is defined.

  • getnetbyaddr_r_proto

    From d_getnetbyaddr_r.U:

    This variable encodes the prototype of getnetbyaddr_r. It is zero if d_getnetbyaddr_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getnetbyaddr_r is defined.

  • getnetbyname_r_proto

    From d_getnetbyname_r.U:

    This variable encodes the prototype of getnetbyname_r. It is zero if d_getnetbyname_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getnetbyname_r is defined.

  • getnetent_r_proto

    From d_getnetent_r.U:

    This variable encodes the prototype of getnetent_r. It is zero if d_getnetent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getnetent_r is defined.

  • getprotobyname_r_proto

    From d_getprotobyname_r.U:

    This variable encodes the prototype of getprotobyname_r. It is zero if d_getprotobyname_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getprotobyname_r is defined.

  • getprotobynumber_r_proto

    From d_getprotobynumber_r.U:

    This variable encodes the prototype of getprotobynumber_r. It is zero if d_getprotobynumber_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getprotobynumber_r is defined.

  • getprotoent_r_proto

    From d_getprotoent_r.U:

    This variable encodes the prototype of getprotoent_r. It is zero if d_getprotoent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getprotoent_r is defined.

  • getpwent_r_proto

    From d_getpwent_r.U:

    This variable encodes the prototype of getpwent_r. It is zero if d_getpwent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getpwent_r is defined.

  • getpwnam_r_proto

    From d_getpwnam_r.U:

    This variable encodes the prototype of getpwnam_r. It is zero if d_getpwnam_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getpwnam_r is defined.

  • getpwuid_r_proto

    From d_getpwuid_r.U:

    This variable encodes the prototype of getpwuid_r. It is zero if d_getpwuid_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getpwuid_r is defined.

  • getservbyname_r_proto

    From d_getservbyname_r.U:

    This variable encodes the prototype of getservbyname_r. It is zero if d_getservbyname_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getservbyname_r is defined.

  • getservbyport_r_proto

    From d_getservbyport_r.U:

    This variable encodes the prototype of getservbyport_r. It is zero if d_getservbyport_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getservbyport_r is defined.

  • getservent_r_proto

    From d_getservent_r.U:

    This variable encodes the prototype of getservent_r. It is zero if d_getservent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getservent_r is defined.

  • getspnam_r_proto

    From d_getspnam_r.U:

    This variable encodes the prototype of getspnam_r. It is zero if d_getspnam_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_getspnam_r is defined.

  • gidformat

    From gidf.U:

    This variable contains the format string used for printing a Gid_t.

  • gidsign

    From gidsign.U:

    This variable contains the signedness of a gidtype. 1 for unsigned, -1 for signed.

  • gidsize

    From gidsize.U:

    This variable contains the size of a gidtype in bytes.

  • gidtype

    From gidtype.U:

    This variable defines Gid_t to be something like gid_t, int, ushort, or whatever type is used to declare the return type of getgid(). Typically, it is the type of group ids in the kernel.

  • glibpth

    From libpth.U:

    This variable holds the general path (space-separated) used to find libraries. It may contain directories that do not exist on this platform; libpth is the cleaned-up version.

  • gmake

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the gmake program. After Configure runs, the value is reset to a plain gmake and is not useful.

  • gmtime_r_proto

    From d_gmtime_r.U:

    This variable encodes the prototype of gmtime_r. It is zero if d_gmtime_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_gmtime_r is defined.

  • gnulibc_version

    From d_gnulibc.U:

    This variable contains the version number of the GNU C library. It is usually something like 2.2.5. It is a plain '' if this is not the GNU C library, or if the version is unknown.

  • grep

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the grep program. After Configure runs, the value is reset to a plain grep and is not useful.

  • groupcat

    From nis.U:

    This variable contains a command that produces the text of the /etc/group file. This is normally "cat /etc/group", but can be "ypcat group" when NIS is used. On some systems, such as os390, there may be no equivalent command, in which case this variable is unset.

  • groupstype

    From groupstype.U:

    This variable defines Groups_t to be something like gid_t, int, ushort, or whatever type is used for the second argument to getgroups() and setgroups(). Usually, this is the same as gidtype (gid_t), but sometimes it isn't.

  • gzip

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the gzip program. After Configure runs, the value is reset to a plain gzip and is not useful.
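
The many *_r_proto entries above share one convention: the value is zero when the matching d_*_r variable is undefined, and otherwise names a prototype macro from reentr.h. A minimal sketch of collecting the non-zero ones follows. The helper name reentrant_protos is hypothetical, and on a perl built without reentrant function support the result may simply be empty.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: return a hash reference mapping each
# *_r_proto variable to its value, skipping those set to zero.
sub reentrant_protos {
    my ($cfg) = @_;
    my %found;
    for my $key (grep { /_r_proto\z/ } keys %$cfg) {
        my $value = $cfg->{$key};
        next unless defined $value && length $value && $value ne '0';
        $found{$key} = $value;
    }
    return \%found;
}

my $protos = reentrant_protos(\%Config);
printf "%-28s %s\n", $_, $protos->{$_} for sort keys %$protos;
print "no reentrant prototypes recorded\n" unless %$protos;
```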

h

  • h_fcntl

    From h_fcntl.U:

    This variable gets set in various places to tell i_fcntl that <fcntl.h> should be included.

  • h_sysfile

    From h_sysfile.U:

    This variable gets set in various places to tell i_sys_file that <sys/file.h> should be included.

  • hint

    From Oldconfig.U:

    Gives the type of hints used for previous answers. May be one of default, recommended or previous.

  • hostcat

    From nis.U:

    This variable contains a command that produces the text of the /etc/hosts file. This is normally "cat /etc/hosts", but can be "ypcat hosts" when NIS is used. On some systems, such as os390, there may be no equivalent command, in which case this variable is unset.

  • html1dir

    From html1dir.U:

    This variable contains the name of the directory in which html source pages are to be put. This directory is for pages that describe whole programs, not libraries or modules. It is intended to correspond roughly to section 1 of the Unix manuals.

  • html1direxp

    From html1dir.U:

    This variable is the same as the html1dir variable, but is filename expanded at configuration time, for convenient use in makefiles.

  • html3dir

    From html3dir.U:

    This variable contains the name of the directory in which html source pages are to be put. This directory is for pages that describe libraries or modules. It is intended to correspond roughly to section 3 of the Unix manuals.

  • html3direxp

    From html3dir.U:

    This variable is the same as the html3dir variable, but is filename expanded at configuration time, for convenient use in makefiles.

i

  • i16size

    From perlxv.U:

    This variable is the size of an I16 in bytes.

  • i16type

    From perlxv.U:

    This variable contains the C type used for Perl's I16.

  • i32size

    From perlxv.U:

    This variable is the size of an I32 in bytes.

  • i32type

    From perlxv.U:

    This variable contains the C type used for Perl's I32.

  • i64size

    From perlxv.U:

    This variable is the size of an I64 in bytes.

  • i64type

    From perlxv.U:

    This variable contains the C type used for Perl's I64.

  • i8size

    From perlxv.U:

    This variable is the size of an I8 in bytes.

  • i8type

    From perlxv.U:

    This variable contains the C type used for Perl's I8.

  • i_arpainet

    From i_arpainet.U:

    This variable conditionally defines the I_ARPA_INET symbol, and indicates whether a C program should include <arpa/inet.h>.

  • i_assert

    From i_assert.U:

    This variable conditionally defines the I_ASSERT symbol, which indicates to the C program that <assert.h> exists and could be included.

  • i_bsdioctl

    From i_sysioctl.U:

    This variable conditionally defines the I_SYS_BSDIOCTL symbol, which indicates to the C program that <sys/bsdioctl.h> exists and should be included.

  • i_crypt

    From i_crypt.U:

    This variable conditionally defines the I_CRYPT symbol, and indicates whether a C program should include <crypt.h>.

  • i_db

    From i_db.U:

    This variable conditionally defines the I_DB symbol, and indicates whether a C program may include Berkeley's DB include file <db.h>.

  • i_dbm

    From i_dbm.U:

    This variable conditionally defines the I_DBM symbol, which indicates to the C program that <dbm.h> exists and should be included.

  • i_dirent

    From i_dirent.U:

    This variable conditionally defines I_DIRENT, which indicates to the C program that it should include <dirent.h>.

  • i_dld

    From i_dld.U:

    This variable conditionally defines the I_DLD symbol, which indicates to the C program that <dld.h> (GNU dynamic loading) exists and should be included.

  • i_dlfcn

    From i_dlfcn.U:

    This variable conditionally defines the I_DLFCN symbol, which indicates to the C program that <dlfcn.h> exists and should be included.

  • i_fcntl

    From i_fcntl.U:

    This variable controls the value of I_FCNTL (which tells the C program to include <fcntl.h>).

  • i_float

    From i_float.U:

    This variable conditionally defines the I_FLOAT symbol, and indicates whether a C program may include <float.h> to get symbols like DBL_MAX or DBL_MIN, i.e. machine-dependent floating-point values.

  • i_fp

    From i_fp.U:

    This variable conditionally defines the I_FP symbol, and indicates whether a C program should include <fp.h>.

  • i_fp_class

    From i_fp_class.U:

    This variable conditionally defines the I_FP_CLASS symbol, and indicates whether a C program should include <fp_class.h>.

  • i_gdbm

    From i_gdbm.U:

    This variable conditionally defines the I_GDBM symbol, which indicates to the C program that <gdbm.h> exists and should be included.

  • i_gdbm_ndbm

    From i_ndbm.U:

    This variable conditionally defines the I_GDBM_NDBM symbol, which indicates to the C program that <gdbm-ndbm.h> exists and should be included. This is the location of the ndbm.h compatibility file in Debian 4.0.

  • i_gdbmndbm

    From i_ndbm.U:

    This variable conditionally defines the I_GDBMNDBM symbol, which indicates to the C program that <gdbm/ndbm.h> exists and should be included. This was the location of the ndbm.h compatibility file in RedHat 7.1.

  • i_grp

    From i_grp.U:

    This variable conditionally defines the I_GRP symbol, and indicates whether a C program should include <grp.h>.

  • i_ieeefp

    From i_ieeefp.U:

    This variable conditionally defines the I_IEEEFP symbol, and indicates whether a C program should include <ieeefp.h>.

  • i_inttypes

    From i_inttypes.U:

    This variable conditionally defines the I_INTTYPES symbol, and indicates whether a C program should include <inttypes.h>.

  • i_langinfo

    From i_langinfo.U:

    This variable conditionally defines the I_LANGINFO symbol, and indicates whether a C program should include <langinfo.h>.

  • i_libutil

    From i_libutil.U:

    This variable conditionally defines the I_LIBUTIL symbol, and indicates whether a C program should include <libutil.h>.

  • i_limits

    From i_limits.U:

    This variable conditionally defines the I_LIMITS symbol, and indicates whether a C program may include <limits.h> to get symbols like WORD_BIT and friends.

  • i_locale

    From i_locale.U:

    This variable conditionally defines the I_LOCALE symbol, and indicates whether a C program should include <locale.h>.

  • i_machcthr

    From i_machcthr.U:

    This variable conditionally defines the I_MACH_CTHREADS symbol, and indicates whether a C program should include <mach/cthreads.h>.

  • i_malloc

    From i_malloc.U:

    This variable conditionally defines the I_MALLOC symbol, and indicates whether a C program should include <malloc.h>.

  • i_mallocmalloc

    From i_mallocmalloc.U:

    This variable conditionally defines the I_MALLOCMALLOC symbol, and indicates whether a C program should include <malloc/malloc.h>.

  • i_math

    From i_math.U:

    This variable conditionally defines the I_MATH symbol, and indicates whether a C program may include <math.h>.

  • i_memory

    From i_memory.U:

    This variable conditionally defines the I_MEMORY symbol, and indicates whether a C program should include <memory.h>.

  • i_mntent

    From i_mntent.U:

    This variable conditionally defines the I_MNTENT symbol, and indicates whether a C program should include <mntent.h>.

  • i_ndbm

    From i_ndbm.U:

    This variable conditionally defines the I_NDBM symbol, which indicates to the C program that <ndbm.h> exists and should be included.

  • i_netdb

    From i_netdb.U:

    This variable conditionally defines the I_NETDB symbol, and indicates whether a C program should include <netdb.h>.

  • i_neterrno

    From i_neterrno.U:

    This variable conditionally defines the I_NET_ERRNO symbol, which indicates to the C program that <net/errno.h> exists and should be included.

  • i_netinettcp

    From i_netinettcp.U:

    This variable conditionally defines the I_NETINET_TCP symbol, and indicates whether a C program should include <netinet/tcp.h>.

  • i_niin

    From i_niin.U:

    This variable conditionally defines I_NETINET_IN, which indicates to the C program that it should include <netinet/in.h>. Otherwise, you may try <sys/in.h>.

  • i_poll

    From i_poll.U:

    This variable conditionally defines the I_POLL symbol, and indicates whether a C program should include <poll.h>.

  • i_prot

    From i_prot.U:

    This variable conditionally defines the I_PROT symbol, and indicates whether a C program should include <prot.h>.

  • i_pthread

    From i_pthread.U:

    This variable conditionally defines the I_PTHREAD symbol, and indicates whether a C program should include <pthread.h>.

  • i_pwd

    From i_pwd.U:

    This variable conditionally defines I_PWD, which indicates to the C program that it should include <pwd.h>.

  • i_rpcsvcdbm

    From i_dbm.U:

    This variable conditionally defines the I_RPCSVC_DBM symbol, which indicates to the C program that <rpcsvc/dbm.h> exists and should be included. Some System V systems might need this instead of <dbm.h>.

  • i_sfio

    From i_sfio.U:

    This variable conditionally defines the I_SFIO symbol, and indicates whether a C program should include <sfio.h>.

  • i_sgtty

    From i_termio.U:

    This variable conditionally defines the I_SGTTY symbol, which indicates to the C program that it should include <sgtty.h> rather than <termio.h>.

  • i_shadow

    From i_shadow.U:

    This variable conditionally defines the I_SHADOW symbol, and indicates whether a C program should include <shadow.h>.

  • i_socks

    From i_socks.U:

    This variable conditionally defines the I_SOCKS symbol, and indicates whether a C program should include <socks.h>.

  • i_stdarg

    From i_varhdr.U:

    This variable conditionally defines the I_STDARG symbol, which indicates to the C program that <stdarg.h> exists and should be included.

  • i_stdbool

    From i_stdbool.U:

    This variable conditionally defines the I_STDBOOL symbol, which indicates to the C program that <stdbool.h> exists and should be included.

  • i_stddef

    From i_stddef.U:

    This variable conditionally defines the I_STDDEF symbol, which indicates to the C program that <stddef.h> exists and should be included.

  • i_stdlib

    From i_stdlib.U:

    This variable conditionally defines the I_STDLIB symbol, which indicates to the C program that <stdlib.h> exists and should be included.

  • i_string

    From i_string.U:

    This variable conditionally defines the I_STRING symbol, which indicates that <string.h> should be included rather than <strings.h>.

  • i_sunmath

    From i_sunmath.U:

    This variable conditionally defines the I_SUNMATH symbol, and indicates whether a C program should include <sunmath.h>.

  • i_sysaccess

    From i_sysaccess.U:

    This variable conditionally defines the I_SYS_ACCESS symbol, and indicates whether a C program should include <sys/access.h>.

  • i_sysdir

    From i_sysdir.U:

    This variable conditionally defines the I_SYS_DIR symbol, and indicates whether a C program should include <sys/dir.h>.

  • i_sysfile

    From i_sysfile.U:

    This variable conditionally defines the I_SYS_FILE symbol, and indicates whether a C program should include <sys/file.h> to get R_OK and friends.

  • i_sysfilio

    From i_sysioctl.U:

    This variable conditionally defines the I_SYS_FILIO symbol, which indicates to the C program that <sys/filio.h> exists and should be included in preference to <sys/ioctl.h>.

  • i_sysin

    From i_niin.U:

    This variable conditionally defines I_SYS_IN, which indicates to the C program that it should include <sys/in.h> instead of <netinet/in.h>.

  • i_sysioctl

    From i_sysioctl.U:

    This variable conditionally defines the I_SYS_IOCTL symbol, which indicates to the C program that <sys/ioctl.h> exists and should be included.

  • i_syslog

    From i_syslog.U:

    This variable conditionally defines the I_SYSLOG symbol, and indicates whether a C program should include <syslog.h>.

  • i_sysmman

    From i_sysmman.U:

    This variable conditionally defines the I_SYS_MMAN symbol, and indicates whether a C program should include <sys/mman.h>.

  • i_sysmode

    From i_sysmode.U:

    This variable conditionally defines the I_SYSMODE symbol, and indicates whether a C program should include <sys/mode.h>.

  • i_sysmount

    From i_sysmount.U:

    This variable conditionally defines the I_SYSMOUNT symbol, and indicates whether a C program should include <sys/mount.h>.

  • i_sysndir

    From i_sysndir.U:

    This variable conditionally defines the I_SYS_NDIR symbol, and indicates whether a C program should include <sys/ndir.h>.

  • i_sysparam

    From i_sysparam.U:

    This variable conditionally defines the I_SYS_PARAM symbol, and indicates whether a C program should include <sys/param.h>.

  • i_syspoll

    From i_syspoll.U:

    This variable conditionally defines the I_SYS_POLL symbol, which indicates to the C program that it should include <sys/poll.h>.

  • i_sysresrc

    From i_sysresrc.U:

    This variable conditionally defines the I_SYS_RESOURCE symbol, and indicates whether a C program should include <sys/resource.h>.

  • i_syssecrt

    From i_syssecrt.U:

    This variable conditionally defines the I_SYS_SECURITY symbol, and indicates whether a C program should include <sys/security.h>.

  • i_sysselct

    From i_sysselct.U:

    This variable conditionally defines I_SYS_SELECT, which indicates to the C program that it should include <sys/select.h> in order to get the definition of struct timeval.

  • i_syssockio

    From i_sysioctl.U:

    This variable conditionally defines I_SYS_SOCKIO to indicate to the C program that socket ioctl codes may be found in <sys/sockio.h> instead of <sys/ioctl.h>.

  • i_sysstat

    From i_sysstat.U:

    This variable conditionally defines the I_SYS_STAT symbol, and indicates whether a C program should include <sys/stat.h>.

  • i_sysstatfs

    From i_sysstatfs.U:

    This variable conditionally defines the I_SYSSTATFS symbol, and indicates whether a C program should include <sys/statfs.h>.

  • i_sysstatvfs

    From i_sysstatvfs.U:

    This variable conditionally defines the I_SYSSTATVFS symbol, and indicates whether a C program should include <sys/statvfs.h>.

  • i_systime

    From i_time.U:

    This variable conditionally defines I_SYS_TIME, which indicates to the C program that it should include <sys/time.h>.

  • i_systimek

    From i_time.U:

    This variable conditionally defines I_SYS_TIME_KERNEL, which indicates to the C program that it should include <sys/time.h> with KERNEL defined.

  • i_systimes

    From i_systimes.U:

    This variable conditionally defines the I_SYS_TIMES symbol, and indicates whether a C program should include <sys/times.h>.

  • i_systypes

    From i_systypes.U:

    This variable conditionally defines the I_SYS_TYPES symbol, and indicates whether a C program should include <sys/types.h>.

  • i_sysuio

    From i_sysuio.U:

    This variable conditionally defines the I_SYSUIO symbol, and indicates whether a C program should include <sys/uio.h>.

  • i_sysun

    From i_sysun.U:

    This variable conditionally defines I_SYS_UN, which indicates to the C program that it should include <sys/un.h> to get UNIX domain socket definitions.

  • i_sysutsname

    From i_sysutsname.U:

    This variable conditionally defines the I_SYSUTSNAME symbol, and indicates whether a C program should include <sys/utsname.h>.

  • i_sysvfs

    From i_sysvfs.U:

    This variable conditionally defines the I_SYSVFS symbol, and indicates whether a C program should include <sys/vfs.h>.

  • i_syswait

    From i_syswait.U:

    This variable conditionally defines I_SYS_WAIT, which indicates to the C program that it should include <sys/wait.h>.

  • i_termio

    From i_termio.U:

    This variable conditionally defines the I_TERMIO symbol, which indicates to the C program that it should include <termio.h> rather than <sgtty.h>.

  • i_termios

    From i_termio.U:

    This variable conditionally defines the I_TERMIOS symbol, which indicates to the C program that the POSIX <termios.h> file is to be included.

  • i_time

    From i_time.U:

    This variable conditionally defines I_TIME, which indicates to the C program that it should include <time.h>.

  • i_unistd

    From i_unistd.U:

    This variable conditionally defines the I_UNISTD symbol, and indicates whether a C program should include <unistd.h>.

  • i_ustat

    From i_ustat.U:

    This variable conditionally defines the I_USTAT symbol, and indicates whether a C program should include <ustat.h>.

  • i_utime

    From i_utime.U:

    This variable conditionally defines the I_UTIME symbol, and indicates whether a C program should include <utime.h>.

  • i_values

    From i_values.U:

    This variable conditionally defines the I_VALUES symbol, and indicates whether a C program may include <values.h> to get symbols like MAXLONG and friends.

  • i_varargs

    From i_varhdr.U:

    This variable conditionally defines I_VARARGS, which indicates to the C program that it should include <varargs.h>.

  • i_varhdr

    From i_varhdr.U:

    Contains the name of the header to be included to get va_dcl definition. Typically one of varargs.h or stdarg.h.

  • i_vfork

    From i_vfork.U:

    This variable conditionally defines the I_VFORK symbol, and indicates whether a C program should include vfork.h.

  • ignore_versioned_solibs

    From libs.U:

    This variable should be non-empty if versioned shared libraries (libfoo.so.x.y) are to be ignored (because they cannot be linked against).

  • inc_version_list

    From inc_version_list.U:

    This variable specifies the list of subdirectories over which perl.c:incpush() and lib/lib.pm will automatically search when adding directories to @INC. The elements in the list are separated by spaces. This is only useful if you have a perl library directory tree structured like the default one. See INSTALL for how this works. The versioned site_perl directory was introduced in 5.005, so that is the lowest possible value.

    This list includes architecture-dependent directories back to version $api_versionstring (e.g. 5.5.640) and architecture-independent directories all the way back to 5.005.

  • inc_version_list_init

    From inc_version_list.U:

    This variable holds the same list as inc_version_list, but each item is enclosed in double quotes and separated by commas, suitable for use in the PERL_INC_VERSION_LIST initialization.

  • incpath

    From usrinc.U:

    This variable must precede the normal include path to get the right one, as in $incpath/usr/include or $incpath/usr/lib. Value can be "" or /bsd43 on mips.

  • inews

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • initialinstalllocation

    From bin.U:

    When userelocatableinc is true, this variable holds the location that make install should copy the perl binary to, with all the run-time relocatable paths calculated from this at install time. When used, it is initialized to the original value of binexp, and then binexp is set to .../, as the other binaries are found relative to the perl binary.

  • installarchlib

    From archlib.U:

    This variable is really the same as archlibexp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installbin

    From bin.U:

    This variable is the same as binexp unless AFS is running in which case the user is explicitly prompted for it. This variable should always be used in your makefiles for maximum portability.

  • installhtml1dir

    From html1dir.U:

    This variable is really the same as html1direxp, unless you are using a different installprefix. For extra portability, you should only use this variable within your makefiles.

  • installhtml3dir

    From html3dir.U:

    This variable is really the same as html3direxp, unless you are using a different installprefix. For extra portability, you should only use this variable within your makefiles.

  • installman1dir

    From man1dir.U:

    This variable is really the same as man1direxp, unless you are using AFS in which case it points to the read/write location whereas man1direxp only points to the read-only access location. For extra portability, you should only use this variable within your makefiles.

  • installman3dir

    From man3dir.U:

    This variable is really the same as man3direxp, unless you are using AFS in which case it points to the read/write location whereas man3direxp only points to the read-only access location. For extra portability, you should only use this variable within your makefiles.

  • installprefix

    From installprefix.U:

    This variable holds the name of the directory below which "make install" will install the package. For most users, this is the same as prefix. However, it is useful for installing the software into a different (usually temporary) location after which it can be bundled up and moved somehow to the final location specified by prefix.

  • installprefixexp

    From installprefix.U:

    This variable holds the full absolute path of installprefix with all ~-expansion done.

  • installprivlib

    From privlib.U:

    This variable is really the same as privlibexp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installscript

    From scriptdir.U:

    This variable is usually the same as scriptdirexp, unless you are on a system running AFS, in which case they may differ slightly. You should always use this variable within your makefiles for portability.

  • installsitearch

    From sitearch.U:

    This variable is really the same as sitearchexp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installsitebin

    From sitebin.U:

    This variable is usually the same as sitebinexp, unless you are on a system running AFS, in which case they may differ slightly. You should always use this variable within your makefiles for portability.

  • installsitehtml1dir

    From sitehtml1dir.U:

    This variable is really the same as sitehtml1direxp, unless you are using AFS in which case it points to the read/write location whereas html1direxp only points to the read-only access location. For extra portability, you should only use this variable within your makefiles.

  • installsitehtml3dir

    From sitehtml3dir.U:

    This variable is really the same as sitehtml3direxp, unless you are using AFS in which case it points to the read/write location whereas html3direxp only points to the read-only access location. For extra portability, you should only use this variable within your makefiles.

  • installsitelib

    From sitelib.U:

    This variable is really the same as sitelibexp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installsiteman1dir

    From siteman1dir.U:

    This variable is really the same as siteman1direxp, unless you are using AFS in which case it points to the read/write location whereas man1direxp only points to the read-only access location. For extra portability, you should only use this variable within your makefiles.

  • installsiteman3dir

    From siteman3dir.U:

    This variable is really the same as siteman3direxp, unless you are using AFS in which case it points to the read/write location whereas man3direxp only points to the read-only access location. For extra portability, you should only use this variable within your makefiles.

  • installsitescript

    From sitescript.U:

    This variable is usually the same as sitescriptexp, unless you are on a system running AFS, in which case they may differ slightly. You should always use this variable within your makefiles for portability.

  • installstyle

    From installstyle.U:

    This variable describes the style of the perl installation. This is intended to be useful for tools that need to manipulate entire perl distributions. Perl itself doesn't use this to find its libraries -- the library directories are stored directly in Config.pm. Currently, there are only two styles: lib and lib/perl5. The default library locations (e.g. privlib, sitelib) are either $prefix/lib or $prefix/lib/perl5. The former is useful if $prefix is a directory dedicated to perl (e.g. /opt/perl), while the latter is useful if $prefix is shared by many packages, e.g. if $prefix=/usr/local.

    Unfortunately, while this style variable is used to set defaults for all three directory hierarchies (core, vendor, and site), there is no guarantee that the same style is actually appropriate for all those directories. For example, $prefix might be /opt/perl, but $siteprefix might be /usr/local. (Perhaps, in retrospect, the lib style should never have been supported, but it did seem like a nice idea at the time.)

    The situation is even less clear for tools such as MakeMaker that can be used to install additional modules into non-standard places. For example, if a user intends to install a module into a private directory (perhaps by setting PREFIX on the Makefile.PL command line), then there is no reason to assume that the Configure-time $installstyle setting will be relevant for that PREFIX.

    This may later be extended to include other information, so be careful with pattern-matching on the results.

    For compatibility with perl5.005 and earlier, the default setting is based on whether or not $prefix contains the string perl.

  • installusrbinperl

    From instubperl.U:

    This variable tells whether Perl should also be installed as /usr/bin/perl, in addition to $installbin/perl.

  • installvendorarch

    From vendorarch.U:

    This variable is really the same as vendorarchexp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installvendorbin

    From vendorbin.U:

    This variable is really the same as vendorbinexp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installvendorhtml1dir

    From vendorhtml1dir.U:

    This variable is really the same as vendorhtml1direxp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installvendorhtml3dir

    From vendorhtml3dir.U:

    This variable is really the same as vendorhtml3direxp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installvendorlib

    From vendorlib.U:

    This variable is really the same as vendorlibexp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installvendorman1dir

    From vendorman1dir.U:

    This variable is really the same as vendorman1direxp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installvendorman3dir

    From vendorman3dir.U:

    This variable is really the same as vendorman3direxp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • installvendorscript

    From vendorscript.U:

    This variable is really the same as vendorscriptexp but may differ on those systems using AFS. For extra portability, only this variable should be used in makefiles.

  • intsize

    From intsize.U:

    This variable contains the value of the INTSIZE symbol, which indicates to the C program how many bytes there are in an int.

  • issymlink

    From issymlink.U:

    This variable holds the test command to test for a symbolic link (if they are supported). Typical values include test -h and test -L.

  • ivdformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl IV as a signed decimal integer.

  • ivsize

    From perlxv.U:

    This variable is the size of an IV in bytes.

  • ivtype

    From perlxv.U:

    This variable contains the C type used for Perl's IV.
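
The i_* variables and the size variables above can be read from Perl through the %Config hash. The following is a minimal sketch, assuming that conditional variables hold the string define when set; the helper name config_defined is hypothetical and not part of the Config module.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true when a conditional Configure variable is set
# to the string 'define'. Unset variables are 'undef' or missing entirely.
sub config_defined {
    my ($name) = @_;
    my $value = $Config{$name};
    return defined $value && $value eq 'define';
}

# Report a few of the header variables described above.
for my $name (qw(i_systime i_systypes i_unistd i_termios)) {
    printf "%-12s %s\n", $name,
        config_defined($name) ? 'define' : 'not defined';
}

# The size variables are plain byte counts; ivtype is a C type name.
printf "int: %d bytes, IV: %d bytes (%s)\n",
    $Config{intsize}, $Config{ivsize}, $Config{ivtype};
```

The output depends on how the running perl was configured, so no particular values should be expected.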

k

  • known_extensions

    From Extensions.U:

    This variable holds a list of all XS extensions included in the package.

  • ksh

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.
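
As a small illustration, known_extensions can be turned into a Perl list. This sketch assumes the value is a space-separated string, as with other list-valued variables in this glossary; the wrapper sub known_extensions is a hypothetical name.

```perl
use strict;
use warnings;
use Config;

# Hypothetical wrapper: split the space-separated known_extensions string
# into a list. An unset value yields an empty list.
sub known_extensions {
    return split ' ', ($Config{known_extensions} // '');
}

my @known = known_extensions();
printf "%d known XS extensions\n", scalar @known;
print "  $_\n" for sort @known;
```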

l

  • ld

    From dlsrc.U:

    This variable indicates the program to be used to link libraries for dynamic loading. On some systems, it is ld. On ELF systems, it should be $cc. Mostly, we'll try to respect the hint file setting.

  • ld_can_script

    From dlsrc.U:

    This variable shows if the loader accepts scripts in the form of -Wl,--version-script=ld.script. This is currently only supported for GNU ld on ELF in dynamic loading builds.

  • lddlflags

    From dlsrc.U:

    This variable contains any special flags that might need to be passed to $ld to create a shared library suitable for dynamic loading. It is up to the makefile to use it. For hpux, it should be -b. For sunos 4.1, it is empty.

  • ldflags

    From ccflags.U:

    This variable contains any additional C loader flags desired by the user. It is up to the Makefile to use this.

  • ldflags_uselargefiles

    From uselfs.U:

    This variable contains the loader flags needed by large file builds and added to ldflags by hints files.

  • ldlibpthname

    From libperl.U:

    This variable holds the name of the shared library search path, often LD_LIBRARY_PATH. To get an empty string, the hints file must set this to none.

  • less

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the less program. After Configure runs, the value is reset to a plain less and is not useful.

  • lib_ext

    From Unix.U:

    This is an old synonym for _a.

  • libc

    From libc.U:

    This variable contains the location of the C library.

  • libperl

    From libperl.U:

    The perl executable is obtained by linking perlmain.c with libperl, any static extensions (usually just DynaLoader), and any other libraries needed on this system. libperl is usually libperl.a, but can also be libperl.so.xxx if the user wishes to build a perl executable with a shared library.

  • libpth

    From libpth.U:

    This variable holds the general path (space-separated) used to find libraries. It is intended to be used by other units.

  • libs

    From libs.U:

    This variable holds the additional libraries we want to use. It is up to the Makefile to deal with it. The list can be empty.

  • libsdirs

    From libs.U:

    This variable holds the directory names aka dirnames of the libraries we found and accepted; duplicates are removed.

  • libsfiles

    From libs.U:

    This variable holds the filenames aka basenames of the libraries we found and accepted.

  • libsfound

    From libs.U:

    This variable holds the full pathnames of the libraries we found and accepted.

  • libspath

    From libs.U:

    This variable holds the directory names probed for libraries.

  • libswanted

    From Myinit.U:

    This variable holds a list of all the libraries we want to search. The order is chosen to pick up the c library ahead of ucb or bsd libraries for SVR4.

  • libswanted_uselargefiles

    From uselfs.U:

    This variable contains the libraries needed by large file builds and added to ldflags by hints files. It is a space separated list of the library names without the lib prefix or any suffix, just like libswanted.

  • line

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • lint

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • lkflags

    From ccflags.U:

    This variable contains any additional C partial linker flags desired by the user. It is up to the Makefile to use this.

  • ln

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the ln program. After Configure runs, the value is reset to a plain ln and is not useful.

  • lns

    From lns.U:

    This variable holds the name of the command to make symbolic links (if they are supported). It can be used in the Makefile. It is either ln -s or ln.

  • localtime_r_proto

    From d_localtime_r.U:

    This variable encodes the prototype of localtime_r. It is zero if d_localtime_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_localtime_r is defined.

  • locincpth

    From ccflags.U:

    This variable contains a list of additional directories to be searched by the compiler. The appropriate -I directives will be added to ccflags. This is intended to simplify setting local directories from the Configure command line. It's not much, but it parallels the loclibpth stuff in libpth.U.

  • loclibpth

    From libpth.U:

    This variable holds the paths (space-separated) used to find local libraries. It is prepended to libpth, and is intended to be easily set from the command line.

  • longdblsize

    From d_longdbl.U:

    This variable contains the value of the LONG_DOUBLESIZE symbol, which indicates to the C program how many bytes there are in a long double, if this system supports long doubles.

  • longlongsize

    From d_longlong.U:

    This variable contains the value of the LONGLONGSIZE symbol, which indicates to the C program how many bytes there are in a long long, if this system supports long long.

  • longsize

    From intsize.U:

    This variable contains the value of the LONGSIZE symbol, which indicates to the C program how many bytes there are in a long.

  • lp

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • lpr

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • ls

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the ls program. After Configure runs, the value is reset to a plain ls and is not useful.

  • lseeksize

    From lseektype.U:

    This variable holds the size, in bytes, of the type named by lseektype.

  • lseektype

    From lseektype.U:

    This variable defines lseektype to be something like off_t, long, or whatever type is used to declare lseek offset's type in the kernel (which also appears to be lseek's return type).
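
Since ld and lddlflags are left for the Makefile to use, a build script has to combine them itself. The sketch below shows one way that might look; it only assembles and prints a command and never runs it. The sub name shared_link_command and the file names are hypothetical, and real build tools such as MakeMaker add further flags.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: build the argument list for linking object files
# into a dynamically loadable library, using ld and lddlflags from %Config.
sub shared_link_command {
    my ($output, @objects) = @_;
    return (
        (split ' ', ($Config{ld}        // '')),   # linker, may be several words
        (split ' ', ($Config{lddlflags} // '')),   # shared-library flags
        '-o', $output,
        @objects,
    );
}

# Show the command for illustration only.
print join(' ', shared_link_command('demo.so', 'demo.o')), "\n";
```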

m

  • mad

    From mad.U:

    This variable indicates that the Misc Attribute Decoration code is to be compiled.

  • madlyh

    From mad.U:

    If the Misc Attribute Decoration is to be compiled, this variable is set to the name of the extra header files to be used, else it is ''.

  • madlyobj

    From mad.U:

    If the Misc Attribute Decoration is to be compiled, this variable is set to the name of the extra object files to be used, else it is ''.

  • madlysrc

    From mad.U:

    If the Misc Attribute Decoration is to be compiled, this variable is set to the name of the extra C source files to be used, else it is ''.

  • mail

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • mailx

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • make

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the make program. After Configure runs, the value is reset to a plain make and is not useful.

  • make_set_make

    From make.U:

    Some versions of make set the variable MAKE. Others do not. This variable contains the string to be included in Makefile.SH so that MAKE is set if needed, and not if not needed. Possible values are:

    make_set_make='#' # If your make program handles this for you,

    make_set_make="MAKE=$make" # if it doesn't.

    This uses a comment character so that we can distinguish a set value (from a previous config.sh or Configure -D option) from an uncomputed value.

  • mallocobj

    From mallocsrc.U:

    This variable contains the name of the malloc.o that this package generates, if that malloc.o is preferred over the system malloc. Otherwise the value is null. This variable is intended for generating Makefiles. See mallocsrc.

  • mallocsrc

    From mallocsrc.U:

    This variable contains the name of the malloc.c that comes with the package, if that malloc.c is preferred over the system malloc. Otherwise the value is null. This variable is intended for generating Makefiles.

  • malloctype

    From mallocsrc.U:

    This variable contains the kind of ptr returned by malloc and realloc.

  • man1dir

    From man1dir.U:

    This variable contains the name of the directory in which manual source pages are to be put. It is the responsibility of the Makefile.SH to get the value of this into the proper command. You must be prepared to do the ~name expansion yourself.

  • man1direxp

    From man1dir.U:

    This variable is the same as the man1dir variable, but is filename expanded at configuration time, for convenient use in makefiles.

  • man1ext

    From man1dir.U:

    This variable contains the extension that the manual page should have: one of n, l, or 1. The Makefile must supply the dot (.). See man1dir.

  • man3dir

    From man3dir.U:

    This variable contains the name of the directory in which manual source pages are to be put. It is the responsibility of the Makefile.SH to get the value of this into the proper command. You must be prepared to do the ~name expansion yourself.

  • man3direxp

    From man3dir.U:

    This variable is the same as the man3dir variable, but is filename expanded at configuration time, for convenient use in makefiles.

  • man3ext

    From man3dir.U:

    This variable contains the extension that the manual page should have: one of n, l, or 3. The Makefile must supply the dot (.). See man3dir.

  • mips_type

    From usrinc.U:

    This variable holds the environment type for the mips system. Possible values are "BSD 4.3" and "System V".

  • mistrustnm

    From Csym.U:

    This variable can be used to establish a fallthrough for the cases where nm fails to find a symbol. If usenm is false or usenm is true and mistrustnm is false, this variable has no effect. If usenm is true and mistrustnm is compile, a test program will be compiled to try to find any symbol that can't be located via nm lookup. If mistrustnm is run, the test program will be run as well as being compiled.

  • mkdir

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the mkdir program. After Configure runs, the value is reset to a plain mkdir and is not useful.

  • mmaptype

    From d_mmap.U:

    This symbol contains the type of pointer returned by mmap() (and simultaneously the type of the first argument). It can be void * or caddr_t.

  • modetype

    From modetype.U:

    This variable defines modetype to be something like mode_t, int, unsigned short, or whatever type is used to declare file modes for system calls.

  • more

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the more program. After Configure runs, the value is reset to a plain more and is not useful.

  • multiarch

    From multiarch.U:

    This variable conditionally defines the MULTIARCH symbol which signifies the presence of multiplatform files. This is normally set by hints files.

  • mv

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • myarchname

    From archname.U:

    This variable holds the architecture name computed by Configure in a previous run. It is not intended to be perused by any user and should never be set in a hint file.

  • mydomain

    From myhostname.U:

    This variable contains the eventual value of the MYDOMAIN symbol, which is the domain of the host the program is going to run on. The domain must be appended to myhostname to form a complete host name. The dot comes with mydomain, and need not be supplied by the program.

  • myhostname

    From myhostname.U:

    This variable contains the eventual value of the MYHOSTNAME symbol, which is the name of the host the program is going to run on. The domain is not kept with hostname, but must be gotten from mydomain. The dot comes with mydomain, and need not be supplied by the program.

  • myuname

    From Oldconfig.U:

    The output of uname -a if available, otherwise the hostname. The whole thing is then lower-cased and slashes and single quotes are removed.
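
The man1ext and man3ext entries note that the dot is not part of the value, so whoever builds the file name must add it. A minimal sketch of that step follows, using the installman1dir and installman3dir variables as the glossary recommends for makefiles. The sub name man_page_path is hypothetical, and treating a blank directory as "man pages are not installed" is an assumption.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: build the install path for a manual page in
# section 1 or 3. Returns nothing if the directory or extension is unset.
sub man_page_path {
    my ($name, $section) = @_;
    my $dir = $Config{"installman${section}dir"};
    my $ext = $Config{"man${section}ext"};
    return unless defined $dir && $dir =~ /\S/;
    return unless defined $ext && length $ext;
    return "$dir/$name.$ext";    # the caller supplies the dot
}

for my $section (1, 3) {
    my $path = man_page_path('demo', $section);
    print "section $section: ", (defined $path ? $path : '(not configured)'), "\n";
}
```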

n

  • n

    From n.U:

    This variable contains the -n flag if that is what causes the echo command to suppress newline. Otherwise it is null. Correct usage is $echo $n "prompt for a question: $c".

  • need_va_copy

    From need_va_copy.U:

    This symbol, if defined, indicates that the system stores the variable argument list datatype, va_list, in a format that cannot be copied by simple assignment, so that some other means must be used when copying is required. As such systems vary in their provision (or non-provision) of copying mechanisms, handy.h defines a platform-independent macro, Perl_va_copy(src, dst), to do the job.

  • netdb_hlen_type

    From netdbtype.U:

    This variable holds the type used for the 2nd argument to gethostbyaddr(). Usually, this is int or size_t or unsigned. This is only useful if you have gethostbyaddr(), naturally.

  • netdb_host_type

    From netdbtype.U:

    This variable holds the type used for the 1st argument to gethostbyaddr(). Usually, this is char * or void *, possibly with or without a const prefix. This is only useful if you have gethostbyaddr(), naturally.

  • netdb_name_type

    From netdbtype.U:

    This variable holds the type used for the argument to gethostbyname(). Usually, this is char * or const char *. This is only useful if you have gethostbyname(), naturally.

  • netdb_net_type

    From netdbtype.U:

    This variable holds the type used for the 1st argument to getnetbyaddr(). Usually, this is int or long. This is only useful if you have getnetbyaddr(), naturally.

  • nm

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the nm program. After Configure runs, the value is reset to a plain nm and is not useful.

  • nm_opt

    From usenm.U:

    This variable holds the options that may be necessary for nm.

  • nm_so_opt

    From usenm.U:

    This variable holds the options that may be necessary for nm to work on a shared library but that can not be used on an archive library. Currently, this is only used by Linux, where nm --dynamic is *required* to get symbols from an ELF library which has been stripped, but nm --dynamic is *fatal* on an archive library. Maybe Linux should just always set usenm=false.

  • nonxs_ext

    From Extensions.U:

    This variable holds a list of all non-xs extensions included in the package. All of them will be built.

  • nroff

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the nroff program. After Configure runs, the value is reset to a plain nroff and is not useful.

  • nv_overflows_integers_at

    From perlxv.U:

    This variable gives the largest integer value that NVs can hold as a constant floating point expression. If it could not be determined, it holds the value 0.

  • nv_preserves_uv_bits

    From perlxv.U:

    This variable indicates how many bits of type uvtype a variable of type nvtype can preserve.

  • nveformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl NV using %e-ish floating point format.

  • nvEUformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl NV using %E-ish floating point format.

  • nvfformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl NV using %f-ish floating point format.

  • nvFUformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl NV using %F-ish floating point format.

  • nvgformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl NV using %g-ish floating point format.

  • nvGUformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl NV using %G-ish floating point format.

  • nvsize

    From perlxv.U:

    This variable is the size of an NV in bytes.

  • nvtype

    From perlxv.U:

    This variable contains the C type used for Perl's NV.
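
    The NV entries above can be read at run time through the %Config hash exported by the Config module. The following is a minimal sketch; the choice of keys and the output wording are illustrative, and it assumes that uvsize (documented elsewhere in this glossary) is available.

```perl
use strict;
use warnings;
use Config;

# Gather the NV-related entries described above.
# Keys that an older perl does not know come back as undef.
my @keys = qw(nvtype nvsize nv_preserves_uv_bits nv_overflows_integers_at);
my %nv   = map { $_ => $Config{$_} } @keys;

for my $key (@keys) {
    my $value = defined $nv{$key} ? $nv{$key} : '(not set)';
    print "$key = $value\n";
}

# Compare the preserved bit count with the full width of a UV.
my $uv_bits = 8 * $Config{uvsize};
if (defined $nv{nv_preserves_uv_bits}) {
    printf "NV preserves %d of %d UV bits\n",
        $nv{nv_preserves_uv_bits}, $uv_bits;
}
```

    If the preserved bit count is smaller than the UV width, large integers may lose precision when converted to floating point.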

o

  • o_nonblock

    From nblock_io.U:

    This variable bears the symbol value to be used during open() or fcntl() to turn on non-blocking I/O for a file descriptor. If you wish to switch between blocking and non-blocking, you may try ioctl(FIOSNBIO) instead, but that is only supported by some devices.

  • obj_ext

    From Unix.U:

    This is an old synonym for _o.

  • old_pthread_create_joinable

    From d_pthrattrj.U:

    This variable defines the constant to use for creating joinable (aka undetached) pthreads. Unused if pthread.h defines PTHREAD_CREATE_JOINABLE. If used, possible values are PTHREAD_CREATE_UNDETACHED and __UNDETACHED.

  • optimize

    From ccflags.U:

    This variable contains any optimizer/debugger flag that should be used. It is up to the Makefile to use it.

  • orderlib

    From orderlib.U:

    This variable is true if the components of libraries must be ordered (with `lorder $* | tsort`) before placing them in an archive. Set to false if ranlib or ar can generate random libraries.

  • osname

    From Oldconfig.U:

    This variable contains the operating system name (e.g. sunos, solaris, hpux, etc.). It can be useful later on for setting defaults. Any spaces are replaced with underscores. It is set to a null string if we can't figure it out.

  • osvers

    From Oldconfig.U:

    This variable contains the operating system version (e.g. 4.1.3, 5.2, etc.). It is primarily used for helping select an appropriate hints file, but might be useful elsewhere for setting defaults. It is set to '' if we can't figure it out. We try to be flexible about how much of the version number to keep, e.g. if 4.1.1, 4.1.2, and 4.1.3 are essentially the same for this package, hints files might just be os_4.0 or os_4.1, etc., not keeping separate files for each little release.
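
    As a small illustration, osname and osvers can be inspected from Perl code. The sketch below also prints the built-in $^O variable, which perlvar documents as identical to osname; the output wording is only illustrative.

```perl
use strict;
use warnings;
use Config;

my $osname = $Config{osname};
my $osvers = $Config{osvers};

# What Configure recorded at build time.
printf "Configure recorded: %s %s\n",
    $osname, defined $osvers ? $osvers : '';

# $^O reports the same operating system name without loading Config.
printf "Interpreter reports \$^O as: %s\n", $^O;
```

    Both values describe the system on which perl was built, which is not necessarily the exact version it is currently running on.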

  • otherlibdirs

    From otherlibdirs.U:

    This variable contains a colon-separated set of paths for the perl binary to search for additional library files or modules. These directories will be tacked onto the end of @INC. Perl will automatically search below each path for version- and architecture-specific directories. See inc_version_list for more details. A value of ' ' means 'none' and is used to preserve this value for the next run through Configure.

p

  • package

    From package.U:

    This variable contains the name of the package being constructed. It is primarily intended for the use of later Configure units.

  • pager

    From pager.U:

    This variable contains the name of the preferred pager on the system. Usual values are (the full pathnames of) more, less, pg, or cat.

  • passcat

    From nis.U:

    This variable contains a command that produces the text of the /etc/passwd file. This is normally "cat /etc/passwd", but can be "ypcat passwd" when NIS is used. On some systems, such as os390, there may be no equivalent command, in which case this variable is unset.

  • patchlevel

    From patchlevel.U:

    The patch level of this package. The value of patchlevel comes from the patchlevel.h file. In a version number such as 5.6.1, this is the 6. In patchlevel.h, this is referred to as PERL_VERSION.

  • path_sep

    From Unix.U:

    This is an old synonym for p_ in Head.U, the character used to separate elements in the command shell search PATH.
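
    A typical use of path_sep is splitting the PATH environment variable portably. This is a minimal sketch; the variable names are illustrative and an unset PATH is treated as empty.

```perl
use strict;
use warnings;
use Config;

my $sep  = $Config{path_sep};
my $path = defined $ENV{PATH} ? $ENV{PATH} : '';

# Quote the separator so it is matched literally, and drop empty entries.
my @dirs = grep { length } split /\Q$sep\E/, $path;

print "separator: '$sep'\n";
print "  $_\n" for @dirs;
```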

  • perl

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the perl program. After Configure runs, the value is reset to a plain perl and is not useful.

  • perl5

    From perl5.U:

    This variable contains the full path (if any) to a previously installed perl5.005 or later suitable for running the script to determine inc_version_list.

P

  • PERL_API_REVISION

    From patchlevel.h:

    This number describes the earliest compatible PERL_REVISION of Perl (compatibility here being defined as sufficient binary/API compatibility to run XS code built with the older version). Normally this does not change across maintenance releases. Please read the comment in patchlevel.h.

  • PERL_API_SUBVERSION

    From patchlevel.h:

    This number describes the earliest compatible PERL_SUBVERSION of Perl (compatibility here being defined as sufficient binary/API compatibility to run XS code built with the older version). Normally this does not change across maintenance releases. Please read the comment in patchlevel.h.

  • PERL_API_VERSION

    From patchlevel.h:

    This number describes the earliest compatible PERL_VERSION of Perl (compatibility here being defined as sufficient binary/API compatibility to run XS code built with the older version). Normally this does not change across maintenance releases. Please read the comment in patchlevel.h.

  • PERL_CONFIG_SH

    From Oldsyms.U:

    This is set to true in config.sh so that a shell script sourcing config.sh can tell if it has been sourced already.

  • PERL_PATCHLEVEL

    From Oldsyms.U:

    This symbol reflects the patchlevel, if available. Will usually come from the .patch file, which is available when the perl source tree was fetched with rsync.

  • perl_patchlevel

    From patchlevel.U:

    This is the Perl patch level, a numeric change identifier, as defined by whichever source code maintenance system is used to maintain the patches; currently Perforce. It does not correlate with the Perl version numbers or the maintenance versus development dichotomy, except that it also increases over time.

  • PERL_REVISION

    From Oldsyms.U:

    In a Perl version number such as 5.6.2, this is the 5. This value is manually set in patchlevel.h.

  • perl_static_inline

    From d_static_inline.U:

    This variable defines the PERL_STATIC_INLINE symbol to the best-guess incantation to use for static inline functions. Possibilities include:

        static inline       (c99)
        static __inline__   (gcc -ansi)
        static __inline     (MSVC)
        static _inline      (older MSVC)
        static              (c89 compilers)

  • PERL_SUBVERSION

    From Oldsyms.U:

    In a Perl version number such as 5.6.2, this is the 2. Values greater than 50 represent potentially unstable development subversions. This value is manually set in patchlevel.h.

  • PERL_VERSION

    From Oldsyms.U:

    In a Perl version number such as 5.6.2, this is the 6. This value is manually set in patchlevel.h.
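
    The three components PERL_REVISION, PERL_VERSION and PERL_SUBVERSION can be recombined into the familiar dotted version. The sketch below compares the result with the interpreter's own $^V; it assumes a standard build in which Config was generated for the running perl.

```perl
use strict;
use warnings;
use Config;

# Join revision, version and subversion, e.g. 5, 18, 2 becomes "5.18.2".
my $from_config = join '.',
    @Config{qw(PERL_REVISION PERL_VERSION PERL_SUBVERSION)};

# The running interpreter's version in the same dotted form.
my $from_interp = sprintf '%vd', $^V;

print "Config says:      $from_config\n";
print "Interpreter says: $from_interp\n";
```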

  • perladmin

    From perladmin.U:

    Electronic mail address of the perl5 administrator.

  • perllibs

    From End.U:

    The list of libraries needed by Perl only (any libraries needed by extensions only will be dropped, if using dynamic loading).

  • perlpath

    From perlpath.U:

    This variable contains the eventual value of the PERLPATH symbol, which contains the name of the perl interpreter to be used in shell scripts and in the "eval exec" idiom. This variable is not necessarily the pathname of the file containing the perl interpreter; you must append the executable extension (_exe) if it is not already present. Note that Perl code that runs during the Perl build process cannot reference this variable, as Perl may not have been installed, or even if installed, may be a different version of Perl.

  • pg

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the pg program. After Configure runs, the value is reset to a plain pg and is not useful.

  • phostname

    From myhostname.U:

    This variable contains the eventual value of the PHOSTNAME symbol, which is a command that can be fed to popen() to get the host name. The program should probably not presume that the domain is or isn't there already.

  • pidtype

    From pidtype.U:

    This variable defines PIDTYPE to be something like pid_t, int, ushort, or whatever type is used to declare process ids in the kernel.

  • plibpth

    From libpth.U:

    Holds the private path used by Configure to find the libraries. Its value is prepended to libpth. This variable takes care of special machines, like the mips. Usually, it should be empty.

  • pmake

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • pr

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • prefix

    From prefix.U:

    This variable holds the name of the directory below which the user will install the package. Usually, this is /usr/local, and executables go in /usr/local/bin, library stuff in /usr/local/lib, man pages in /usr/local/man, etc. It is only used to set defaults for things in bin.U, mansrc.U, privlib.U, or scriptdir.U.

  • prefixexp

    From prefix.U:

    This variable holds the full absolute path of the directory below which the user will install the package. Derived from prefix.

  • privlib

    From privlib.U:

    This variable contains the eventual value of the PRIVLIB symbol, which is the name of the private library for this package. It may have a ~ on the front. It is up to the makefile to eventually create this directory while performing installation (with ~ substitution).

  • privlibexp

    From privlib.U:

    This variable is the ~name expanded version of privlib, so that you may use it directly in Makefiles or shell scripts.

  • procselfexe

    From d_procselfexe.U:

    If d_procselfexe is defined, $procselfexe is the filename of the symbolic link pointing to the absolute pathname of the executing program.

  • prototype

    From prototype.U:

    This variable holds the eventual value of CAN_PROTOTYPE, which indicates the C compiler can handle function prototypes.

  • ptrsize

    From ptrsize.U:

    This variable contains the value of the PTRSIZE symbol, which indicates to the C program how many bytes there are in a pointer.

q

  • quadkind

    From quadtype.U:

    This variable, if defined, encodes the type of a quad: 1 = int, 2 = long, 3 = long long, 4 = int64_t.

  • quadtype

    From quadtype.U:

    This variable defines Quad_t to be something like long, int, long long, int64_t, or whatever type is used for 64-bit integers.
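
    The numeric quadkind code can be decoded with a small lookup table built from the description above. This is a minimal sketch; the table and variable names are illustrative, and both quad entries may be undefined on a perl built without 64-bit integer support.

```perl
use strict;
use warnings;
use Config;

# Mapping taken from the quadkind description above.
my %quad_kinds = (1 => 'int', 2 => 'long', 3 => 'long long', 4 => 'int64_t');

my $kind      = $Config{quadkind};
my $kind_name = (defined($kind) && exists($quad_kinds{$kind}))
    ? $quad_kinds{$kind}
    : 'unknown or not defined';
my $quadtype  = defined $Config{quadtype} ? $Config{quadtype} : '(not defined)';

printf "pointer size: %d bytes\n", $Config{ptrsize};
print  "quadkind:     $kind_name\n";
print  "quadtype:     $quadtype\n";
```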

r

  • randbits

    From randfunc.U:

    Indicates how many bits are produced by the function used to generate normalized random numbers.

  • randfunc

    From randfunc.U:

    Indicates the name of the random number function to use. Values include drand48, random, and rand. In C programs, the Drand01 macro is defined to generate uniformly distributed random numbers over the range [0., 1.[ (see drand01 and nrand).

  • random_r_proto

    From d_random_r.U:

    This variable encodes the prototype of random_r. It is zero if d_random_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_random_r is defined.

  • randseedtype

    From randfunc.U:

    Indicates the type of the argument of the seedfunc.

  • ranlib

    From orderlib.U:

    This variable is set to the pathname of the ranlib program, if it is needed to generate random libraries. Set to : if ar can generate random libraries or if random libraries are not supported.

  • rd_nodata

    From nblock_io.U:

    This variable holds the return code from read() when no data is present. It should be -1, but some systems return 0 when O_NDELAY is used, which is a shame because you cannot tell the difference between no data and an EOF. Sigh!

  • readdir64_r_proto

    From d_readdir64_r.U:

    This variable encodes the prototype of readdir64_r. It is zero if d_readdir64_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_readdir64_r is defined.

  • readdir_r_proto

    From d_readdir_r.U:

    This variable encodes the prototype of readdir_r. It is zero if d_readdir_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_readdir_r is defined.

  • revision

    From patchlevel.U:

    The value of revision comes from the patchlevel.h file. In a version number such as 5.6.1, this is the 5. In patchlevel.h, this is referred to as PERL_REVISION.

  • rm

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the rm program. After Configure runs, the value is reset to a plain rm and is not useful.

  • rm_try

    From Unix.U:

    This is a cleanup variable for try test programs. Internal Configure use only.

  • rmail

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • run

    From Cross.U:

    This variable contains the command used by Configure to copy and execute a cross-compiled executable on the target host. Useful and available only during Perl build. Empty string '' if not cross-compiling.

  • runnm

    From usenm.U:

    This variable contains true or false depending on whether the nm extraction should be performed or not, according to the value of usenm and the flags on the Configure command line.

s

  • sched_yield

    From d_pthread_y.U:

    This variable defines the way to yield the execution of the current thread.

  • scriptdir

    From scriptdir.U:

    This variable holds the name of the directory in which the user wants to put publicly executable scripts for the package in question. It is either the same directory as for binaries, or a special one that can be mounted across different architectures, like /usr/share. Programs must be prepared to deal with ~name expansion.

  • scriptdirexp

    From scriptdir.U:

    This variable is the same as scriptdir, but is filename expanded at configuration time, for programs not wanting to bother with it.

  • sed

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the sed program. After Configure runs, the value is reset to a plain sed and is not useful.

  • seedfunc

    From randfunc.U:

    Indicates the random number generating seed function. Values include srand48, srandom, and srand.

  • selectminbits

    From selectminbits.U:

    This variable holds the minimum number of bits operated on by select. That is, if you do select(n, ...), how many bits at least will be cleared in the masks if some activity is detected. Usually this is either n or 32*ceil(n/32); many little-endian systems in particular do the latter. This is only useful if you have select(), naturally.

  • selecttype

    From selecttype.U:

    This variable holds the type used for the 2nd, 3rd, and 4th arguments to select. Usually, this is fd_set *, if HAS_FD_SET is defined, and int * otherwise. This is only useful if you have select(), naturally.

  • sendmail

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • setgrent_r_proto

    From d_setgrent_r.U:

    This variable encodes the prototype of setgrent_r. It is zero if d_setgrent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_setgrent_r is defined.

  • sethostent_r_proto

    From d_sethostent_r.U:

    This variable encodes the prototype of sethostent_r. It is zero if d_sethostent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_sethostent_r is defined.

  • setlocale_r_proto

    From d_setlocale_r.U:

    This variable encodes the prototype of setlocale_r. It is zero if d_setlocale_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_setlocale_r is defined.

  • setnetent_r_proto

    From d_setnetent_r.U:

    This variable encodes the prototype of setnetent_r. It is zero if d_setnetent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_setnetent_r is defined.

  • setprotoent_r_proto

    From d_setprotoent_r.U:

    This variable encodes the prototype of setprotoent_r. It is zero if d_setprotoent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_setprotoent_r is defined.

  • setpwent_r_proto

    From d_setpwent_r.U:

    This variable encodes the prototype of setpwent_r. It is zero if d_setpwent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_setpwent_r is defined.

  • setservent_r_proto

    From d_setservent_r.U:

    This variable encodes the prototype of setservent_r. It is zero if d_setservent_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_setservent_r is defined.

  • sGMTIME_max

    From time_size.U:

    This variable defines the maximum value of the time_t offset that the system function gmtime() accepts.

  • sGMTIME_min

    From time_size.U:

    This variable defines the minimum value of the time_t offset that the system function gmtime() accepts.

  • sh

    From sh.U:

    This variable contains the full pathname of the shell used on this system to execute Bourne shell scripts. Usually, this will be /bin/sh, though it's possible that some systems will have /bin/ksh, /bin/pdksh, /bin/ash, /bin/bash, or even something such as D:/bin/sh.exe. This unit comes before Options.U, so you can't set sh with a -D option, though you can override this (and startsh) with -O -Dsh=/bin/whatever -Dstartsh=whatever

  • shar

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • sharpbang

    From spitshell.U:

    This variable contains the string #! if this system supports that construct.

  • shmattype

    From d_shmat.U:

    This symbol contains the type of pointer returned by shmat(). It can be void * or char *.

  • shortsize

    From intsize.U:

    This variable contains the value of the SHORTSIZE symbol which indicates to the C program how many bytes there are in a short.

  • shrpenv

    From libperl.U:

    If the user builds a shared libperl.so, then we need to tell the perl executable where it will be able to find the installed libperl.so. One way to do this on some systems is to set the environment variable LD_RUN_PATH to the directory that will be the final location of the shared libperl.so. The makefile can use this with something like

        $shrpenv $(CC) -o perl perlmain.o $libperl $libs

    Typical values are

        shrpenv="env LD_RUN_PATH=$archlibexp/CORE"

    or

        shrpenv=''

    See the main perl Makefile.SH for actual working usage. Alternatively, we might be able to use a command line option such as -R $archlibexp/CORE (Solaris) or -Wl,-rpath $archlibexp/CORE (Linux).

  • shsharp

    From spitshell.U:

    This variable tells further Configure units whether your sh can handle # comments.

  • sig_count

    From sig_name.U:

    This variable holds a number larger than the largest valid signal number. This is usually the same as the NSIG macro.

  • sig_name

    From sig_name.U:

    This variable holds the signal names, space separated. The leading SIG in signal names is removed. A ZERO is prepended to the list. This is currently not used, sig_name_init is used instead.

  • sig_name_init

    From sig_name.U:

    This variable holds the signal names, enclosed in double quotes and separated by commas, suitable for use in the SIG_NAME definition below. A ZERO is prepended to the list, and the list is terminated with a plain 0. The leading SIG in signal names is removed. See sig_num.

  • sig_num

    From sig_name.U:

    This variable holds the signal numbers, space separated. A ZERO is prepended to the list (corresponding to the fake SIGZERO). Those numbers correspond to the value of the signal listed in the same place within the sig_name list. This is currently not used, sig_num_init is used instead.

  • sig_num_init

    From sig_name.U:

    This variable holds the signal numbers, enclosed in double quotes and separated by commas, suitable for use in the SIG_NUM definition below. A ZERO is prepended to the list, and the list is terminated with a plain 0.

  • sig_size

    From sig_name.U:

    This variable contains the number of elements of the sig_name and sig_num arrays.
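
    Because sig_name and sig_num are parallel, space-separated lists, Perl code can pair them up to translate between signal names and numbers. The following is a minimal sketch of that idea; the hash and array names are illustrative.

```perl
use strict;
use warnings;
use Config;

defined $Config{sig_name} or die "No signal names recorded in Config\n";

my @names = split ' ', $Config{sig_name};
my @nums  = split ' ', $Config{sig_num};

my (%signo, @signame);
for my $i (0 .. $#names) {
    # Name to number, e.g. the fake ZERO entry maps to 0.
    $signo{ $names[$i] } = $nums[$i];

    # Number to name; several names may share one number, so keep the first.
    $signame[ $nums[$i] ] = $names[$i]
        unless defined $signame[ $nums[$i] ];
}

for my $name (sort { $signo{$a} <=> $signo{$b} || $a cmp $b } keys %signo) {
    printf "%-10s %d\n", $name, $signo{$name};
}
```

    Entries at the same position in the two lists belong together, which is why a single index loop is enough.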

  • signal_t

    From d_voidsig.U:

    This variable holds the type of the signal handler (void or int).

  • sitearch

    From sitearch.U:

    This variable contains the eventual value of the SITEARCH symbol, which is the name of the private library for this package. It may have a ~ on the front. It is up to the makefile to eventually create this directory while performing installation (with ~ substitution). The standard distribution will put nothing in this directory. After perl has been installed, users may install their own local architecture-dependent modules in this directory with MakeMaker Makefile.PL or equivalent. See INSTALL for details.

  • sitearchexp

    From sitearch.U:

    This variable is the ~name expanded version of sitearch, so that you may use it directly in Makefiles or shell scripts.

  • sitebin

    From sitebin.U:

    This variable holds the name of the directory in which the user wants to put add-on publicly executable files for the package in question. It is most often a local directory such as /usr/local/bin. Programs using this variable must be prepared to deal with ~name substitution. The standard distribution will put nothing in this directory. After perl has been installed, users may install their own local executables in this directory with MakeMaker Makefile.PL or equivalent. See INSTALL for details.

  • sitebinexp

    From sitebin.U:

    This is the same as the sitebin variable, but is filename expanded at configuration time, for use in your makefiles.

  • sitehtml1dir

    From sitehtml1dir.U:

    This variable contains the name of the directory in which site-specific html source pages are to be put. It is the responsibility of the Makefile.SH to get the value of this into the proper command. You must be prepared to do the ~name expansion yourself. The standard distribution will put nothing in this directory. After perl has been installed, users may install their own local html pages in this directory with MakeMaker Makefile.PL or equivalent. See INSTALL for details.

  • sitehtml1direxp

    From sitehtml1dir.U:

    This variable is the same as the sitehtml1dir variable, but is filename expanded at configuration time, for convenient use in makefiles.

  • sitehtml3dir

    From sitehtml3dir.U:

    This variable contains the name of the directory in which site-specific library html source pages are to be put. It is the responsibility of the Makefile.SH to get the value of this into the proper command. You must be prepared to do the ~name expansion yourself. The standard distribution will put nothing in this directory. After perl has been installed, users may install their own local library html pages in this directory with MakeMaker Makefile.PL or equivalent. See INSTALL for details.

  • sitehtml3direxp

    From sitehtml3dir.U:

    This variable is the same as the sitehtml3dir variable, but is filename expanded at configuration time, for convenient use in makefiles.

  • sitelib

    From sitelib.U:

    This variable contains the eventual value of the SITELIB symbol, which is the name of the private library for this package. It may have a ~ on the front. It is up to the makefile to eventually create this directory while performing installation (with ~ substitution). The standard distribution will put nothing in this directory. After perl has been installed, users may install their own local architecture-independent modules in this directory with MakeMaker Makefile.PL or equivalent. See INSTALL for details.

  • sitelib_stem

    From sitelib.U:

    This variable is $sitelibexp with any trailing version-specific component removed. The elements in inc_version_list (inc_version_list.U) can be tacked onto this variable to generate a list of directories to search.

  • sitelibexp

    From sitelib.U:

    This variable is the ~name expanded version of sitelib, so that you may use it directly in Makefiles or shell scripts.
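
    To see where site-specific modules are expected on a given installation, the expanded site directories can be compared against @INC. This is a minimal sketch; the report format is illustrative, and a directory may legitimately be absent from @INC on customized or relocatable builds.

```perl
use strict;
use warnings;
use Config;

# Index @INC for quick membership checks.
my %in_inc = map { $_ => 1 } @INC;

my @report;
for my $key (qw(sitearchexp sitelibexp)) {
    my $dir = $Config{$key};
    my $status;
    if (!defined($dir) || !length($dir)) {
        $status = 'not set';
    }
    elsif ($in_inc{$dir}) {
        $status = 'in @INC';
    }
    else {
        $status = 'not in @INC';
    }
    push @report, sprintf '%-12s %s (%s)',
        $key, defined $dir ? $dir : '', $status;
}

print "$_\n" for @report;
```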

  • siteman1dir

    From siteman1dir.U:

    This variable contains the name of the directory in which site-specific manual source pages are to be put. It is the responsibility of the Makefile.SH to get the value of this into the proper command. You must be prepared to do the ~name expansion yourself. The standard distribution will put nothing in this directory. After perl has been installed, users may install their own local man1 pages in this directory with MakeMaker Makefile.PL or equivalent. See INSTALL for details.

  • siteman1direxp

    From siteman1dir.U:

    This variable is the same as the siteman1dir variable, but is filename expanded at configuration time, for convenient use in makefiles.

  • siteman3dir

    From siteman3dir.U:

    This variable contains the name of the directory in which site-specific library man source pages are to be put. It is the responsibility of the Makefile.SH to get the value of this into the proper command. You must be prepared to do the ~name expansion yourself. The standard distribution will put nothing in this directory. After perl has been installed, users may install their own local man3 pages in this directory with MakeMaker Makefile.PL or equivalent. See INSTALL for details.

  • siteman3direxp

    From siteman3dir.U:

    This variable is the same as the siteman3dir variable, but is filename expanded at configuration time, for convenient use in makefiles.

  • siteprefix

    From siteprefix.U:

    This variable holds the full absolute path of the directory below which the user will install add-on packages. See INSTALL for usage and examples.

  • siteprefixexp

    From siteprefix.U:

    This variable holds the full absolute path of the directory below which the user will install add-on packages. Derived from siteprefix.

  • sitescript

    From sitescript.U:

    This variable holds the name of the directory in which the user wants to put add-on publicly executable files for the package in question. It is most often a local directory such as /usr/local/bin. Programs using this variable must be prepared to deal with ~name substitution. The standard distribution will put nothing in this directory. After perl has been installed, users may install their own local scripts in this directory with MakeMaker Makefile.PL or equivalent. See INSTALL for details.

  • sitescriptexp

    From sitescript.U:

    This is the same as the sitescript variable, but is filename expanded at configuration time, for use in your makefiles.

  • sizesize

    From sizesize.U:

    This variable contains the size of a sizetype in bytes.

  • sizetype

    From sizetype.U:

    This variable defines sizetype to be something like size_t, unsigned long, or whatever type is used to declare length parameters for string functions.

  • sleep

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • sLOCALTIME_max

    From time_size.U:

    This variable defines the maximum value of the time_t offset that the system function localtime() accepts.

  • sLOCALTIME_min

    From time_size.U:

    This variable defines the minimum value of the time_t offset that the system function localtime() accepts.

  • smail

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • so

    From so.U:

    This variable holds the extension used to identify shared libraries (also known as shared objects) on the system. Usually set to so.

  • sockethdr

    From d_socket.U:

    This variable has any cpp -I flags needed for socket support.

  • socketlib

    From d_socket.U:

    This variable has the names of any libraries needed for socket support.

  • socksizetype

    From socksizetype.U:

    This variable holds the type used for the size argument for various socket calls like accept. Usual values include socklen_t, size_t, and int.

  • sort

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the sort program. After Configure runs, the value is reset to a plain sort and is not useful.

  • spackage

    From package.U:

    This variable contains the name of the package being constructed, with the first letter uppercased, i.e. suitable for starting sentences.

  • spitshell

    From spitshell.U:

    This variable contains the command necessary to spit out a runnable shell on this system. It is either cat or a grep -v for # comments.

  • sPRId64

    From quadfio.U:

    This variable, if defined, contains the string used by stdio to format 64-bit decimal numbers (format d) for output.

  • sPRIeldbl

    From longdblfio.U:

    This variable, if defined, contains the string used by stdio to format long doubles (format e) for output.

  • sPRIEUldbl

    From longdblfio.U:

    This variable, if defined, contains the string used by stdio to format long doubles (format E) for output. The U in the name is to separate this from sPRIeldbl so that even case-blind systems can see the difference.

  • sPRIfldbl

    From longdblfio.U:

    This variable, if defined, contains the string used by stdio to format long doubles (format f) for output.

  • sPRIFUldbl

    From longdblfio.U:

    This variable, if defined, contains the string used by stdio to format long doubles (format F) for output. The U in the name is to separate this from sPRIfldbl so that even case-blind systems can see the difference.

  • sPRIgldbl

    From longdblfio.U:

    This variable, if defined, contains the string used by stdio to format long doubles (format g) for output.

  • sPRIGUldbl

    From longdblfio.U:

    This variable, if defined, contains the string used by stdio to format long doubles (format G) for output. The U in the name is to separate this from sPRIgldbl so that even case-blind systems can see the difference.

  • sPRIi64

    From quadfio.U:

    This variable, if defined, contains the string used by stdio to format 64-bit decimal numbers (format i) for output.

  • sPRIo64

    From quadfio.U:

    This variable, if defined, contains the string used by stdio to format 64-bit octal numbers (format o) for output.

  • sPRIu64

    From quadfio.U:

    This variable, if defined, contains the string used by stdio to format 64-bit unsigned decimal numbers (format u) for output.

  • sPRIx64

    From quadfio.U:

    This variable, if defined, contains the string used by stdio to format 64-bit hexadecimal numbers (format x) for output.

  • sPRIXU64

    From quadfio.U:

    This variable, if defined, contains the string used by stdio to format 64-bit hExADECimAl numbers (format X) for output. The U in the name is to separate this from sPRIx64 so that even case-blind systems can see the difference.

  • srand48_r_proto

    From d_srand48_r.U:

    This variable encodes the prototype of srand48_r. It is zero if d_srand48_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_srand48_r is defined.

  • srandom_r_proto

    From d_srandom_r.U:

    This variable encodes the prototype of srandom_r. It is zero if d_srandom_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_srandom_r is defined.

  • src

    From src.U:

    This variable holds the (possibly relative) path of the package source. It is up to the Makefile to use this variable and set VPATH accordingly to find the sources remotely. Use $pkgsrc to have an absolute path.

  • sSCNfldbl

    From longdblfio.U:

    This variable, if defined, contains the string used by stdio to format long doubles (format f) for input.

  • ssizetype

    From ssizetype.U:

    This variable defines ssizetype to be something like ssize_t, long or int. It is used by functions that return a count of bytes or an error condition. It must be a signed type. We will pick a type such that sizeof(SSize_t) == sizeof(Size_t).

  • st_ino_sign

    From st_ino_def.U:

    This variable contains the signedness of struct stat's st_ino. 1 for unsigned, -1 for signed.

  • st_ino_size

    From st_ino_def.U:

    This variable contains the size of struct stat's st_ino in bytes.

  • startperl

    From startperl.U:

    This variable contains the string to put on the front of a perl script to make sure (hopefully) that it runs with perl and not some shell. Of course, that leading line must be followed by the classical perl idiom: eval 'exec perl -S $0 ${1+"$@"}' if $running_under_some_shell; to guarantee perl startup should the shell execute the script. Note that this magic incantation is not understood by csh.

  • startsh

    From startsh.U:

    This variable contains the string to put on the front of a shell script to make sure (hopefully) that it runs with sh and not some other shell.

  • static_ext

    From Extensions.U:

    This variable holds a list of XS extension files we want to link statically into the package. It is used by Makefile.

  • stdchar

    From stdchar.U:

    This variable conditionally defines STDCHAR to be the type of char used in stdio.h. It has the values "unsigned char" or "char".

  • stdio_base

    From d_stdstdio.U:

    This variable defines how, given a FILE pointer, fp, to access the _base field (or equivalent) of stdio.h's FILE structure. This will be used to define the macro FILE_base(fp).

  • stdio_bufsiz

    From d_stdstdio.U:

    This variable defines how, given a FILE pointer, fp, to determine the number of bytes stored in the I/O buffer pointed to by the _base field (or equivalent) of stdio.h's FILE structure. This will be used to define the macro FILE_bufsiz(fp).

  • stdio_cnt

    From d_stdstdio.U:

    This variable defines how, given a FILE pointer, fp, to access the _cnt field (or equivalent) of stdio.h's FILE structure. This will be used to define the macro FILE_cnt(fp).

  • stdio_filbuf

    From d_stdstdio.U:

    This variable defines how, given a FILE pointer, fp, to tell stdio to refill its internal buffers (?). This will be used to define the macro FILE_filbuf(fp).

  • stdio_ptr

    From d_stdstdio.U:

    This variable defines how, given a FILE pointer, fp, to access the _ptr field (or equivalent) of stdio.h's FILE structure. This will be used to define the macro FILE_ptr(fp).

  • stdio_stream_array

    From stdio_streams.U:

    This variable tells the name of the array holding the stdio streams. Usual values include _iob, __iob, and __sF.

  • strerror_r_proto

    From d_strerror_r.U:

    This variable encodes the prototype of strerror_r. It is zero if d_strerror_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_strerror_r is defined.

  • strings

    From i_string.U:

    This variable holds the full path of the string header that will be used. Typically /usr/include/string.h or /usr/include/strings.h.

  • submit

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • subversion

    From patchlevel.U:

    The subversion level of this package. The value of subversion comes from the patchlevel.h file. In a version number such as 5.6.1, this is the 1. In patchlevel.h, this is referred to as PERL_SUBVERSION. This is unique to perl.

  • sysman

    From sysman.U:

    This variable holds the place where the manual is located on this system. It is not the place where the user wants to put his manual pages. Rather it is the place where Configure may look to find manual for unix commands (section 1 of the manual usually). See mansrc.
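
The entries above are read at run time through the %Config hash exported by the Config module. The following is a minimal sketch; config_or is a hypothetical helper introduced here for illustration, and any given entry may be undefined or empty on a particular build.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: return the Config entry, or a fallback when the
# entry is undefined or empty on this build.
sub config_or {
    my ($name, $default) = @_;
    my $value = $Config{$name};
    return (defined($value) && length($value)) ? $value : $default;
}

printf "startperl : %s\n", config_or('startperl', '(not set)');
printf "ssizetype : %s\n", config_or('ssizetype', '(not set)');
printf "sPRIu64   : %s\n", config_or('sPRIu64',   '(not set)');
```

Guarding each lookup matters because several of these variables are only set "if defined", as their descriptions say.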

t

  • tail

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • tar

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • targetarch

    From Cross.U:

    If cross-compiling, this variable contains the target architecture. If not, this will be empty.

  • tbl

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • tee

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • test

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the test program. After Configure runs, the value is reset to a plain test and is not useful.

  • timeincl

    From i_time.U:

    This variable holds the full path of the included time header(s).

  • timetype

    From d_time.U:

    This variable holds the type returned by time(). It can be long, or time_t on BSD sites (in which case <sys/types.h> should be included). Anyway, the type Time_t should be used.

  • tmpnam_r_proto

    From d_tmpnam_r.U:

    This variable encodes the prototype of tmpnam_r. It is zero if d_tmpnam_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_tmpnam_r is defined.

  • to

    From Cross.U:

    This variable contains the command used by Configure to copy files to the target host. It is useful and available only during the Perl build. It is the string : if not cross-compiling.

  • touch

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the touch program. After Configure runs, the value is reset to a plain touch and is not useful.

  • tr

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the tr program. After Configure runs, the value is reset to a plain tr and is not useful.

  • trnl

    From trnl.U:

    This variable contains the value to be passed to the tr(1) command to transliterate a newline. Typical values are \012 and \n. This is needed for EBCDIC systems where newline is not necessarily \012.

  • troff

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • ttyname_r_proto

    From d_ttyname_r.U:

    This variable encodes the prototype of ttyname_r. It is zero if d_ttyname_r is undef, and one of the REENTRANT_PROTO_T_ABC macros of reentr.h if d_ttyname_r is defined.

u

  • u16size

    From perlxv.U:

    This variable is the size of a U16 in bytes.

  • u16type

    From perlxv.U:

    This variable contains the C type used for Perl's U16.

  • u32size

    From perlxv.U:

    This variable is the size of a U32 in bytes.

  • u32type

    From perlxv.U:

    This variable contains the C type used for Perl's U32.

  • u64size

    From perlxv.U:

    This variable is the size of a U64 in bytes.

  • u64type

    From perlxv.U:

    This variable contains the C type used for Perl's U64.

  • u8size

    From perlxv.U:

    This variable is the size of a U8 in bytes.

  • u8type

    From perlxv.U:

    This variable contains the C type used for Perl's U8.

  • uidformat

    From uidf.U:

    This variable contains the format string used for printing a Uid_t.

  • uidsign

    From uidsign.U:

    This variable contains the signedness of a uidtype. 1 for unsigned, -1 for signed.

  • uidsize

    From uidsize.U:

    This variable contains the size of a uidtype in bytes.

  • uidtype

    From uidtype.U:

    This variable defines Uid_t to be something like uid_t, int, ushort, or whatever type is used to declare user ids in the kernel.

  • uname

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the uname program. After Configure runs, the value is reset to a plain uname and is not useful.

  • uniq

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the uniq program. After Configure runs, the value is reset to a plain uniq and is not useful.

  • uquadtype

    From quadtype.U:

    This variable defines Uquad_t to be something like unsigned long, unsigned int, unsigned long long, uint64_t, or whatever type is used for 64-bit integers.

  • use5005threads

    From usethreads.U:

    This variable conditionally defines the USE_5005THREADS symbol, and indicates that Perl should be built to use the 5.005-based threading implementation. Only valid up to 5.8.x.

  • use64bitall

    From use64bits.U:

    This variable conditionally defines the USE_64_BIT_ALL symbol, and indicates that 64-bit integer types should be used when available. The maximal possible 64-bitness is employed: LP64 or ILP64, meaning that you will be able to use more than 2 gigabytes of memory. This mode is even more binary incompatible than USE_64_BIT_INT. You may not be able to run the resulting executable in a 32-bit CPU at all or you may need at least to reboot your OS to 64-bit mode.

  • use64bitint

    From use64bits.U:

    This variable conditionally defines the USE_64_BIT_INT symbol, and indicates that 64-bit integer types should be used when available. The minimal possible 64-bitness is employed, just enough to get 64-bit integers into Perl. This may mean using for example "long longs", while your memory may still be limited to 2 gigabytes.

  • usecrosscompile

    From Cross.U:

    This variable conditionally defines the USE_CROSS_COMPILE symbol, and indicates that Perl has been cross-compiled.

  • usedevel

    From Devel.U:

    This variable indicates that Perl was configured with development features enabled. This should not be done for production builds.

  • usedl

    From dlsrc.U:

    This variable indicates if the system supports dynamic loading of some sort. See also dlsrc and dlobj.

  • usedtrace

    From usedtrace.U:

    This variable indicates whether we are compiling with dtrace support. See also dtrace.

  • usefaststdio

    From usefaststdio.U:

    This variable conditionally defines the USE_FAST_STDIO symbol, and indicates that Perl should be built to use fast stdio. Defaults to define in Perls 5.8 and earlier, and to undef later.

  • useithreads

    From usethreads.U:

    This variable conditionally defines the USE_ITHREADS symbol, and indicates that Perl should be built to use the interpreter-based threading implementation.

  • usekernprocpathname

    From usekernprocpathname.U:

    This variable indicates that we can use sysctl with KERN_PROC_PATHNAME to get a full path for the executable, and hence convert $^X to an absolute path.

  • uselargefiles

    From uselfs.U:

    This variable conditionally defines the USE_LARGE_FILES symbol, and indicates that large file interfaces should be used when available.

  • uselongdouble

    From uselongdbl.U:

    This variable conditionally defines the USE_LONG_DOUBLE symbol, and indicates that long doubles should be used when available.

  • usemallocwrap

    From mallocsrc.U:

    This variable contains y if we are wrapping malloc to prevent integer overflow during size calculations.

  • usemorebits

    From usemorebits.U:

    This variable conditionally defines the USE_MORE_BITS symbol, and indicates that explicit 64-bit interfaces and long doubles should be used when available.

  • usemultiplicity

    From usemultiplicity.U:

    This variable conditionally defines the MULTIPLICITY symbol, and indicates that Perl should be built to use multiplicity.

  • usemymalloc

    From mallocsrc.U:

    This variable contains y if the malloc that comes with this package is desired over the system's version of malloc. People often include special versions of malloc for efficiency, but such versions are often less portable. See also mallocsrc and mallocobj. If this is y, then -lmalloc is removed from $libs.

  • usenm

    From usenm.U:

    This variable contains true or false depending on whether the nm extraction is wanted or not.

  • usensgetexecutablepath

    From usensgetexecutablepath.U:

    This symbol, if defined, indicates that we can use _NSGetExecutablePath and realpath to get a full path for the executable, and hence convert $^X to an absolute path.

  • useopcode

    From Extensions.U:

    This variable holds either true or false to indicate whether the Opcode extension should be used. The sole use for this currently is to allow an easy mechanism for users to skip the Opcode extension from the Configure command line.

  • useperlio

    From useperlio.U:

    This variable conditionally defines the USE_PERLIO symbol, and indicates that the PerlIO abstraction should be used throughout.

  • useposix

    From Extensions.U:

    This variable holds either true or false to indicate whether the POSIX extension should be used. The sole use for this currently is to allow an easy mechanism for hints files to indicate that POSIX will not compile on a particular system.

  • usereentrant

    From usethreads.U:

    This variable conditionally defines the USE_REENTRANT_API symbol, which indicates that the thread code may try to use the various _r versions of library functions. This is only potentially meaningful if usethreads is set, and it is very experimental: Configure does not even prompt for it.

  • userelocatableinc

    From bin.U:

    This variable is set to true to indicate that perl should relocate @INC entries at runtime based on the path to the perl binary. Any @INC paths starting .../ are relocated relative to the directory containing the perl binary, and a logical cleanup of the path is then made around the join point (removing dir/../ pairs).

  • usesfio

    From d_sfio.U:

    This variable is set to true when the user agrees to use sfio. It is set to false when sfio is not available or when the user explicitly requests not to use sfio. It is here primarily so that command-line settings can override the auto-detection of d_sfio without running into a "WHOA THERE".

  • useshrplib

    From libperl.U:

    This variable is set to true if the user wishes to build a shared libperl, and false otherwise.

  • usesitecustomize

    From d_sitecustomize.U:

    This variable is set to true when the user requires a mechanism that allows the sysadmin to add entries to @INC at runtime. When it is set, perl runs $sitelib/sitecustomize.pl at startup.

  • usesocks

    From usesocks.U:

    This variable conditionally defines the USE_SOCKS symbol, and indicates that Perl should be built to use SOCKS.

  • usethreads

    From usethreads.U:

    This variable conditionally defines the USE_THREADS symbol, and indicates that Perl should be built to use threads.

  • usevendorprefix

    From vendorprefix.U:

    This variable tells whether the vendorprefix and consequently other vendor* paths are in use.

  • useversionedarchname

    From archname.U:

    This variable indicates whether to include the $api_versionstring as a component of the $archname.

  • usevfork

    From d_vfork.U:

    This variable is set to true when the user agrees to use vfork. It is set to false when no vfork is available or when the user explicitly requests not to use vfork.

  • usrinc

    From usrinc.U:

    This variable holds the path of the include files, which is usually /usr/include. It is mainly used by other Configure units.

  • uuname

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • uvoformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl UV as an unsigned octal integer.

  • uvsize

    From perlxv.U:

    This variable is the size of a UV in bytes.

  • uvtype

    From perlxv.U:

    This variable contains the C type used for Perl's UV.

  • uvuformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl UV as an unsigned decimal integer.

  • uvxformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl UV as an unsigned hexadecimal integer in lowercase abcdef.

  • uvXUformat

    From perlxvf.U:

    This variable contains the format string used for printing a Perl UV as an unsigned hexadecimal integer in uppercase ABCDEF.
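
Many of the use* entries above "conditionally define" a symbol, which in %Config shows up as the string define when the feature is enabled. The sketch below assumes that convention; is_defined is a hypothetical helper, and it does not apply to entries documented as holding y, true or false (such as usemymalloc or usenm).

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true only when the entry holds the string 'define'.
# Anything else (undef, '', 'undef') is treated as "not enabled".
sub is_defined {
    my ($name) = @_;
    my $value = $Config{$name};
    return (defined($value) && $value eq 'define') ? 1 : 0;
}

for my $flag (qw(use64bitint use64bitall useithreads uselargefiles)) {
    printf "%-14s %s\n", $flag, is_defined($flag) ? 'yes' : 'no';
}

# uvsize is given in bytes, so multiply by 8 for the width in bits.
printf "UV is %d bits wide (%s)\n", 8 * $Config{uvsize}, $Config{uvtype};
```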

v

  • vaproto

    From vaproto.U:

    This variable conditionally defines CAN_VAPROTO on systems supporting prototype declaration of functions with a variable number of arguments. See also prototype.

  • vendorarch

    From vendorarch.U:

    This variable contains the value of the PERL_VENDORARCH symbol. It may have a ~ on the front. The standard distribution will put nothing in this directory. Vendors who distribute perl may wish to place their own architecture-dependent modules and extensions in this directory with MakeMaker Makefile.PL INSTALLDIRS=vendor or equivalent. See INSTALL for details.

  • vendorarchexp

    From vendorarch.U:

    This variable is the ~name expanded version of vendorarch, so that you may use it directly in Makefiles or shell scripts.

  • vendorbin

    From vendorbin.U:

    This variable contains the eventual value of the VENDORBIN symbol. It may have a ~ on the front. The standard distribution will put nothing in this directory. Vendors who distribute perl may wish to place additional binaries in this directory with MakeMaker Makefile.PL INSTALLDIRS=vendor or equivalent. See INSTALL for details.

  • vendorbinexp

    From vendorbin.U:

    This variable is the ~name expanded version of vendorbin, so that you may use it directly in Makefiles or shell scripts.

  • vendorhtml1dir

    From vendorhtml1dir.U:

    This variable contains the name of the directory for html pages. It may have a ~ on the front. The standard distribution will put nothing in this directory. Vendors who distribute perl may wish to place their own html pages in this directory with MakeMaker Makefile.PL INSTALLDIRS=vendor or equivalent. See INSTALL for details.

  • vendorhtml1direxp

    From vendorhtml1dir.U:

    This variable is the ~name expanded version of vendorhtml1dir, so that you may use it directly in Makefiles or shell scripts.

  • vendorhtml3dir

    From vendorhtml3dir.U:

    This variable contains the name of the directory for html library pages. It may have a ~ on the front. The standard distribution will put nothing in this directory. Vendors who distribute perl may wish to place their own html pages for modules and extensions in this directory with MakeMaker Makefile.PL INSTALLDIRS=vendor or equivalent. See INSTALL for details.

  • vendorhtml3direxp

    From vendorhtml3dir.U:

    This variable is the ~name expanded version of vendorhtml3dir, so that you may use it directly in Makefiles or shell scripts.

  • vendorlib

    From vendorlib.U:

    This variable contains the eventual value of the VENDORLIB symbol, which is the name of the private library for this package. The standard distribution will put nothing in this directory. Vendors who distribute perl may wish to place their own modules in this directory with MakeMaker Makefile.PL INSTALLDIRS=vendor or equivalent. See INSTALL for details.

  • vendorlib_stem

    From vendorlib.U:

    This variable is $vendorlibexp with any trailing version-specific component removed. The elements in inc_version_list (inc_version_list.U) can be tacked onto this variable to generate a list of directories to search.

  • vendorlibexp

    From vendorlib.U:

    This variable is the ~name expanded version of vendorlib, so that you may use it directly in Makefiles or shell scripts.

  • vendorman1dir

    From vendorman1dir.U:

    This variable contains the name of the directory for man1 pages. It may have a ~ on the front. The standard distribution will put nothing in this directory. Vendors who distribute perl may wish to place their own man1 pages in this directory with MakeMaker Makefile.PL INSTALLDIRS=vendor or equivalent. See INSTALL for details.

  • vendorman1direxp

    From vendorman1dir.U:

    This variable is the ~name expanded version of vendorman1dir, so that you may use it directly in Makefiles or shell scripts.

  • vendorman3dir

    From vendorman3dir.U:

    This variable contains the name of the directory for man3 pages. It may have a ~ on the front. The standard distribution will put nothing in this directory. Vendors who distribute perl may wish to place their own man3 pages in this directory with MakeMaker Makefile.PL INSTALLDIRS=vendor or equivalent. See INSTALL for details.

  • vendorman3direxp

    From vendorman3dir.U:

    This variable is the ~name expanded version of vendorman3dir, so that you may use it directly in Makefiles or shell scripts.

  • vendorprefix

    From vendorprefix.U:

    This variable holds the full absolute path of the directory below which the vendor will install add-on packages. See INSTALL for usage and examples.

  • vendorprefixexp

    From vendorprefix.U:

    This variable holds the full absolute path of the directory below which the vendor will install add-on packages. Derived from vendorprefix.

  • vendorscript

    From vendorscript.U:

    This variable contains the eventual value of the VENDORSCRIPT symbol. It may have a ~ on the front. The standard distribution will put nothing in this directory. Vendors who distribute perl may wish to place additional executable scripts in this directory with MakeMaker Makefile.PL INSTALLDIRS=vendor or equivalent. See INSTALL for details.

  • vendorscriptexp

    From vendorscript.U:

    This variable is the ~name expanded version of vendorscript, so that you may use it directly in Makefiles or shell scripts.

  • version

    From patchlevel.U:

    The full version number of this package, such as 5.6.1 (or 5_6_1). This combines revision, patchlevel, and subversion to get the full version number, including any possible subversions. This is suitable for use as a directory name, and hence is filesystem dependent.

  • version_patchlevel_string

    From patchlevel.U:

    This is a string combining version, subversion and perl_patchlevel (if perl_patchlevel is non-zero). It is typically something like 'version 7 subversion 1' or 'version 7 subversion 1 patchlevel 11224'. It is computed here to avoid duplication of code in myconfig.SH and lib/Config.pm.

  • versiononly

    From versiononly.U:

    If set, this symbol indicates that only the version-specific components of a perl installation should be installed. This may be useful for making a test installation of a new version without disturbing the existing installation. Setting versiononly is equivalent to setting installperl's -v option. In particular, the non-versioned scripts and programs such as a2p, c2ph, h2xs, pod2*, and perldoc are not installed (see INSTALL for a more complete list). Nor are the man pages installed. Usually, this is undef.

  • vi

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • voidflags

    From voidflags.U:

    This variable contains the eventual value of the VOIDFLAGS symbol, which indicates how much support of the void type is given by this compiler. See VOIDFLAGS for more info.
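
As the version entry says, the full version number combines revision, patchlevel and subversion. The following sketch rebuilds it from those three %Config entries and also reports vendorlibexp, which may be empty when no vendor prefix was configured. The helper name version_from_parts is an assumption made for this example.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: rebuild the dotted version from its three parts.
sub version_from_parts {
    return join '.',
        map { defined($Config{$_}) ? $Config{$_} : '?' }
        qw(revision patchlevel subversion);
}

print "version            : $Config{version}\n";
print "from its components: ", version_from_parts(), "\n";

# vendorlibexp is the ~-expanded directory, ready for direct use.
my $vendorlib = $Config{vendorlibexp};
my $shown = (defined($vendorlib) && length($vendorlib)) ? $vendorlib : '(none)';
print "vendor library dir : $shown\n";
```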

x

  • xlibpth

    From libpth.U:

    This variable holds extra paths (space-separated) used to find libraries on this platform; for example, CPU-specific libraries (on multi-CPU platforms) may be listed here.

y

  • yacc

    From yacc.U:

    This variable holds the name of the compiler compiler we want to use in the Makefile. It can be yacc, byacc, or bison -y.

  • yaccflags

    From yacc.U:

    This variable contains any additional yacc flags desired by the user. It is up to the Makefile to use this.

z

  • zcat

    From Loc.U:

    This variable is defined but not used by Configure. The value is the empty string and is not useful.

  • zip

    From Loc.U:

    This variable is used internally by Configure to determine the full pathname (if any) of the zip program. After Configure runs, the value is reset to a plain zip and is not useful.

GIT DATA

Information on the git commit from which the current perl binary was compiled can be found in the variable $Config::Git_Data . The variable is a structured string that looks something like this:

  git_commit_id='ea0c2dbd5f5ac6845ecc7ec6696415bf8e27bd52'
  git_describe='GitLive-blead-1076-gea0c2db'
  git_branch='smartmatch'
  git_uncommitted_changes=''
  git_commit_id_title='Commit id:'
  git_commit_date='2009-05-09 17:47:31 +0200'

The format of this string is not guaranteed and may change over time.
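
A minimal sketch of splitting such a string into fields follows. It assumes the name='value' layout shown above, which may change, and it works on a sample string rather than on the live variable; parse_git_data is a hypothetical helper. In a real program one would load Config and pass $Config::Git_Data instead.

```perl
use strict;
use warnings;

# Hypothetical helper: turn lines of the form  name='value'  into a hash.
# Lines that do not match that shape are silently skipped.
sub parse_git_data {
    my ($text) = @_;
    my %field;
    for my $line (split /\n/, defined($text) ? $text : '') {
        $field{$1} = $2 if $line =~ /^\s*(\w+)='(.*)'\s*$/;
    }
    return \%field;
}

# Sample data copied from the listing above.
my $sample = <<'END';
git_commit_id='ea0c2dbd5f5ac6845ecc7ec6696415bf8e27bd52'
git_describe='GitLive-blead-1076-gea0c2db'
git_branch='smartmatch'
END

my $info = parse_git_data($sample);
print "$info->{git_branch} at $info->{git_describe}\n";
```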

NOTE

This module contains a good example of how to use tie to implement a cache and an example of how to make a tied variable readonly to those outside of it.
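
To give a flavour of the read-only technique, here is a small sketch of a tied hash that refuses writes. It is not the code used by Config itself; the package name ReadOnlyHash and the sample data are made up for illustration.

```perl
package ReadOnlyHash;    # hypothetical class, not part of Config
use strict;
use warnings;
use Carp ();

sub TIEHASH  { my ($class, %data) = @_; return bless {%data}, $class }
sub FETCH    { $_[0]{ $_[1] } }
sub EXISTS   { exists $_[0]{ $_[1] } }
# Reset the iterator, then hand back the first key.
sub FIRSTKEY { my $self = shift; my $reset = keys %$self; return scalar each %$self }
sub NEXTKEY  { return scalar each %{ $_[0] } }
# Every modifying operation is refused.
sub STORE    { Carp::croak("This hash is read-only") }
sub DELETE   { Carp::croak("This hash is read-only") }
sub CLEAR    { Carp::croak("This hash is read-only") }

package main;

tie my %settings, 'ReadOnlyHash', cc => 'gcc';
print "cc is $settings{cc}\n";

eval { $settings{cc} = 'clang' };
print "write refused: $@" if $@;
```

Code outside the class can read through the tie, but every attempt to store, delete or clear goes through a method that croaks.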

 
Cwd

NAME

Cwd - get pathname of current working directory

SYNOPSIS

  use Cwd;
  my $dir = getcwd;
  use Cwd 'abs_path';
  my $abs_path = abs_path($file);

DESCRIPTION

This module provides functions for determining the pathname of the current working directory. It is recommended that getcwd (or another *cwd() function) be used in all code to ensure portability.

By default, it exports the functions cwd(), getcwd(), fastcwd(), and fastgetcwd() (and, on Win32, getdcwd()) into the caller's namespace.

getcwd and friends

Each of these functions is called without arguments and returns the absolute path of the current working directory.

  • getcwd
    my $cwd = getcwd();

    Returns the current working directory.

    Exposes the POSIX function getcwd(3) or re-implements it if it's not available.

  • cwd
    my $cwd = cwd();

    cwd() is the most natural form for the current architecture. For most systems it is identical to `pwd` (but without the trailing line terminator).

  • fastcwd
    my $cwd = fastcwd();

    A more dangerous version of getcwd(), but potentially faster.

    It might conceivably chdir() you out of a directory that it can't chdir() you back into. If fastcwd encounters a problem it will return undef but will probably leave you in a different directory. For a measure of extra security, if everything appears to have worked, the fastcwd() function will check that it leaves you in the same directory that it started in. If it has changed it will die with the message "Unstable directory path, current directory changed unexpectedly". That should never happen.

  • fastgetcwd
    my $cwd = fastgetcwd();

    The fastgetcwd() function is provided as a synonym for cwd().

  • getdcwd
    my $cwd = getdcwd();
    my $cwd = getdcwd('C:');

    The getdcwd() function is also provided on Win32 to get the current working directory on the specified drive, since Windows maintains a separate current working directory for each drive. If no drive is specified then the current drive is assumed.

    This function simply calls the Microsoft C library _getdcwd() function.
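
A short sketch of the two most commonly used functions follows. The defined check is a defensive assumption on our part rather than something the text above requires.

```perl
use strict;
use warnings;
use Cwd qw(getcwd cwd);
use File::Spec;

# getcwd() is the portable choice recommended in the DESCRIPTION.
my $dir = getcwd();
die "cannot determine the current directory: $!" unless defined $dir;
print "getcwd: $dir\n";

# cwd() gives the platform's natural form, normally the same as `pwd`.
print "cwd   : ", cwd(), "\n";
```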

abs_path and friends

These functions are exported only on request. They each take a single argument and return the absolute pathname for it. If no argument is given they'll use the current working directory.

  • abs_path
    my $abs_path = abs_path($file);

    Uses the same algorithm as getcwd(). Symbolic links and relative-path components ("." and "..") are resolved to return the canonical pathname, just like realpath(3).

  • realpath
    my $abs_path = realpath($file);

    A synonym for abs_path().

  • fast_abs_path
    my $abs_path = fast_abs_path($file);

    A more dangerous, but potentially faster version of abs_path.
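
As a brief illustration, the sketch below resolves the current directory and its parent. File::Spec->updir is used for the parent so that the example does not hard-code "..".

```perl
use strict;
use warnings;
use Cwd qw(abs_path getcwd);
use File::Spec;

# With no argument, abs_path() resolves the current working directory.
my $here = abs_path();
print "current directory: $here\n";

# A relative path comes back absolute, with links and ".." resolved.
my $parent = abs_path(File::Spec->updir);
print "parent directory : $parent\n";
```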

$ENV{PWD}

If you ask to override your chdir() built-in function,

  use Cwd qw(chdir);

then your PWD environment variable will be kept up to date. Note that it will only be kept up to date if all packages which use chdir import it from Cwd.
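
A minimal sketch of the override in use, assuming the system temporary directory reported by File::Spec->tmpdir exists and is accessible:

```perl
use strict;
use warnings;
use Cwd qw(chdir getcwd);
use File::Spec;

my $start = getcwd();
my $tmp   = File::Spec->tmpdir;

# This calls Cwd's chdir, which also updates $ENV{PWD}.
chdir($tmp) or die "cannot chdir to $tmp: $!";
print "PWD is now $ENV{PWD}\n";

# Go back to where we started so later code is not surprised.
chdir($start) or die "cannot chdir back to $start: $!";
print "PWD restored to $ENV{PWD}\n";
```

Parentheses around the argument keep the "or die" test attached to the result of chdir rather than to its argument.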

NOTES

  • Since the path separators are different on some operating systems ('/' on Unix, ':' on MacPerl, etc...) we recommend you use the File::Spec modules wherever portability is a concern.

  • Actually, on Mac OS, the getcwd(), fastgetcwd() and fastcwd() functions are all aliases for the cwd() function, which, on Mac OS, calls `pwd`. Likewise, the abs_path() function is an alias for fast_abs_path().

AUTHOR

Originally by the perl5-porters.

Maintained by Ken Williams <KWILLIAMS@cpan.org>

COPYRIGHT

Copyright (c) 2004 by the Perl 5 Porters. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Portions of the C code in this library are copyright (c) 1994 by the Regents of the University of California. All rights reserved. The license on this code is compatible with the licensing of the rest of the distribution - please see the source code in Cwd.xs for the details.

SEE ALSO

File::chdir

 
DB

NAME

DB - programmatic interface to the Perl debugging API

SYNOPSIS

  package CLIENT;
  use DB;
  @ISA = qw(DB);

  # these (inherited) methods can be called by the client
  CLIENT->register()              # register a client package name
  CLIENT->done()                  # de-register from the debugging API
  CLIENT->skippkg('hide::hide')   # ask DB not to stop in this package
  CLIENT->cont([WHERE])           # run some more (until BREAK or another breakpt)
  CLIENT->step()                  # single step
  CLIENT->next()                  # step over
  CLIENT->ret()                   # return from current subroutine
  CLIENT->backtrace()             # return the call stack description
  CLIENT->ready()                 # call when client setup is done
  CLIENT->trace_toggle()          # toggle subroutine call trace mode
  CLIENT->subs([SUBS])            # return subroutine information
  CLIENT->files()                 # return list of all files known to DB
  CLIENT->lines()                 # return lines in currently loaded file
  CLIENT->loadfile(FILE,LINE)     # load a file and let other clients know
  CLIENT->lineevents()            # return info on lines with actions
  CLIENT->set_break([WHERE],[COND])
  CLIENT->set_tbreak([WHERE])
  CLIENT->clr_breaks([LIST])
  CLIENT->set_action(WHERE,ACTION)
  CLIENT->clr_actions([LIST])
  CLIENT->evalcode(STRING)        # eval STRING in executing code's context
  CLIENT->prestop([STRING])       # execute in code context before stopping
  CLIENT->poststop([STRING])      # execute in code context before resuming

  # These methods will be called at the appropriate times.
  # Stub versions provided do nothing.
  # None of these can block.
  CLIENT->init()                  # called when debug API inits itself
  CLIENT->stop(FILE,LINE)         # when execution stops
  CLIENT->idle()                  # while stopped (can be a client event loop)
  CLIENT->cleanup()               # just before exit
  CLIENT->output(LIST)            # called to print any output that API must show

DESCRIPTION

Perl debug information is frequently required not just by debuggers, but also by modules that need some "special" information to do their job properly, like profilers.

This module abstracts and provides all of the hooks into Perl internal debugging functionality, so that various implementations of Perl debuggers (or packages that want to simply get at the "privileged" debugging data) can all benefit from the development of this common code. Currently used by Swat, the perl/Tk GUI debugger.

Note that multiple "front-ends" can latch into this debugging API simultaneously. This is intended to facilitate things like debugging with a command line and GUI at the same time, debugging debuggers etc. [Sounds nice, but this needs some serious support -- GSAR]

In particular, this API does not provide the following functions:

  • data display

  • command processing

  • command alias management

  • user interface (tty or graphical)

These are intended to be services performed by the clients of this API.

This module attempts to be squeaky clean w.r.t use strict; and when warnings are enabled.

Global Variables

The following "public" global names can be read by clients of this API. Beware that these should be considered "readonly".

  • $DB::sub

    Name of current executing subroutine.

  • %DB::sub

    The keys of this hash are the names of all the known subroutines. Each value is an encoded string that has the sprintf(3) format ("%s:%d-%d", filename, fromline, toline) .

  • $DB::single

    Single-step flag. Will be true if the API will stop at the next statement.

  • $DB::signal

    Signal flag. Will be set to a true value if a signal was caught. Clients may check for this flag to abort time-consuming operations.

  • $DB::trace

    This flag is set to true if the API is tracing through subroutine calls.

  • @DB::args

    Contains the arguments of current subroutine, or the @ARGV array if in the toplevel context.

  • @DB::dbline

    List of lines in currently loaded file.

  • %DB::dbline

    Actions in current file (keys are line numbers). The values are strings that have the sprintf(3) format ("%s\000%s", breakcondition, actioncode) .

  • $DB::package

    Package namespace of currently executing code.

  • $DB::filename

    Currently loaded filename.

  • $DB::subname

    Fully qualified name of currently executing subroutine.

  • $DB::lineno

    Line number that will be executed next.
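
The two encoded string formats described above (for %DB::sub and %DB::dbline values) can be decoded with ordinary Perl. The following is a minimal sketch; the helper names parse_sub_location and parse_line_event are hypothetical and are not part of the DB module, and the sample values are built by hand rather than read from a live debugger session.

```perl
use strict;
use warnings;

# Decode a %DB::sub value of the form "filename:fromline-toline".
# (Hypothetical helper, not provided by DB.)
sub parse_sub_location {
    my ($encoded) = @_;
    # Greedy match on the filename so that colons inside a path survive.
    my ($file, $from, $to) = $encoded =~ /\A(.*):(\d+)-(\d+)\z/s
        or return;
    return { file => $file, from => $from, to => $to };
}

# Decode a %DB::dbline value of the form "breakcondition\000actioncode".
# (Hypothetical helper, not provided by DB.)
sub parse_line_event {
    my ($encoded) = @_;
    my ($condition, $action) = split /\0/, $encoded, 2;
    return { condition => $condition, action => $action };
}

# Hand-built sample values in the documented formats.
my $loc = parse_sub_location(sprintf "%s:%d-%d", "lib/Foo.pm", 10, 42);
print "$loc->{file} lines $loc->{from}..$loc->{to}\n";

my $event = parse_line_event(sprintf "%s\000%s", '$x > 3', 'print "hit\n"');
print "condition: $event->{condition}\n";
```

Since these globals should be treated as read-only, the helpers only read their argument and return a fresh hash reference.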

API Methods

The following are methods in the DB base class. A client must access these methods by inheritance (*not* by calling them directly), since the API keeps track of clients through the inheritance mechanism.

  • CLIENT->register()

    register a client object/package

  • CLIENT->evalcode(STRING)

    eval STRING in executing code context

  • CLIENT->skippkg('D::hide')

    ask DB not to stop in these packages

  • CLIENT->run()

    run some more (until a breakpt is reached)

  • CLIENT->step()

    single step

  • CLIENT->next()

    step over

  • CLIENT->done()

    de-register from the debugging API

Client Callback Methods

The following "virtual" methods can be defined by the client. They will be called by the API at appropriate points. Note that unless specified otherwise, the debug API only defines empty, non-functional default versions of these methods.

  • CLIENT->init()

    Called after debug API inits itself.

  • CLIENT->prestop([STRING])

    Usually inherited from DB package. If no arguments are passed, returns the prestop action string.

  • CLIENT->stop(FILE,LINE)

    Called when execution stops, with the file and line as arguments.

  • CLIENT->idle()

    Called while stopped (can be a client event loop).

  • CLIENT->poststop([STRING])

    Usually inherited from DB package. If no arguments are passed, returns the poststop action string.

  • CLIENT->evalcode(STRING)

    Usually inherited from DB package. Ask for a STRING to be eval-ed in executing code context.

  • CLIENT->cleanup()

    Called just before exit.

  • CLIENT->output(LIST)

    Called when API must show a message (warnings, errors etc.).
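
As an illustration of the inheritance requirement, here is a minimal sketch of a client package that overrides two of the callbacks listed above. The package name MyClient and its stops accessor are hypothetical. The callbacks are normally invoked by the API while a program runs under the debugger; here stop is called by hand only to show its calling convention.

```perl
package MyClient;    # hypothetical client package name
use strict;
use warnings;
use DB;
our @ISA = ('DB');   # inherit the API methods and the stub callbacks

my @stops;           # "file:line" for every stop this client has seen

# Called by the API when execution stops.
sub stop {
    my ($class, $file, $line) = @_;
    push @stops, "$file:$line";
    return;
}

# Called when the API has a message to show (warnings, errors etc.).
sub output {
    my ($class, @message) = @_;
    print STDERR @message;
    return;
}

# Hypothetical accessor, used only by this example.
sub stops { return @stops }

package main;

# Invoke the callback by hand to show the arguments it receives.
MyClient->stop('example.pl', 12);
print join(', ', MyClient->stops), "\n";
```

Callbacks that are not overridden (init, idle, cleanup) fall back to the empty default versions inherited from DB.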

BUGS

The interface defined by this module is missing some of the later additions to perl's debugging functionality. As such, this interface should be considered highly experimental and subject to change.

AUTHOR

Gurusamy Sarathy gsar@activestate.com

This code heavily adapted from an early version of perl5db.pl attributable to Larry Wall and the Perl Porters.

 
DBM_Filter

NAME

DBM_Filter -- Filter DBM keys/values

SYNOPSIS

    use DBM_Filter ;
    use SDBM_File; # or DB_File, or GDBM_File, or NDBM_File, or ODBM_File

    $db = tie %hash, ...

    $db->Filter_Push(Fetch => sub {...},
                     Store => sub {...});

    $db->Filter_Push('my_filter1');
    $db->Filter_Push('my_filter2', params...);

    $db->Filter_Key_Push(...) ;
    $db->Filter_Value_Push(...) ;

    $db->Filter_Pop();
    $db->Filtered();

    package DBM_Filter::my_filter1;

    sub Store { ... }
    sub Fetch { ... }

    1;

    package DBM_Filter::my_filter2;

    sub Filter
    {
        my @opts = @_;
        ...
        return {
            Store => sub { ... },
            Fetch => sub { ... } };
    }

    1;

DESCRIPTION

This module provides an interface that allows filters to be applied to tied hashes associated with DBM files. It builds on the DBM Filter hooks that are present in all the *DB*_File modules included with the standard Perl source distribution from version 5.6.1 onwards. In addition to the *DB*_File modules distributed with Perl, the BerkeleyDB module, available on CPAN, supports the DBM Filter hooks. See perldbmfilter for more details on the DBM Filter hooks.

What is a DBM Filter?

A DBM Filter allows the keys and/or values in a tied hash to be modified by some user-defined code just before it is written to the DBM file and just after it is read back from the DBM file. For example, this snippet of code

    $some_hash{"abc"} = 42;

could potentially trigger two filters, one for the writing of the key "abc" and another for writing the value 42. Similarly, this snippet

    my ($key, $value) = each %some_hash;

will trigger two filters, one for the reading of the key and one for the reading of the value.

Like the existing DBM Filter functionality, this module arranges for the $_ variable to be populated with the key or value that a filter will check. This usually means that most DBM filters tend to be very short.

So what's new?

The main enhancements over the standard DBM Filter hooks are:

  • A cleaner interface.

  • The ability to easily apply multiple filters to a single DBM file.

  • The ability to create "canned" filters. These allow commonly used filters to be packaged into a stand-alone module.

METHODS

This module will arrange for the following methods to be available via the object returned from the tie call.

$db->Filter_Push() / $db->Filter_Key_Push() / $db->Filter_Value_Push()

Adds a filter to the filter stack for the database, $db . The three forms differ only in whether they apply to the DBM key, the DBM value or both.

  • Filter_Push

    The filter is applied to both keys and values.

  • Filter_Key_Push

    The filter is applied to the key only.

  • Filter_Value_Push

    The filter is applied to the value only.

$db->Filter_Pop()

Removes the last filter that was applied to the DBM file associated with $db , if present.

$db->Filtered()

Returns TRUE if there are any filters applied to the DBM associated with $db . Otherwise returns FALSE.

Writing a Filter

Filters can be created in two main ways:

Immediate Filters

An immediate filter allows you to specify the filter code to be used at the point where the filter is applied to a dbm. In this mode the Filter_*_Push methods expect to receive exactly two named parameters, Store and Fetch.

    my $db = tie %hash, 'SDBM_File', ...

    $db->Filter_Push( Store => sub { },
                      Fetch => sub { });

The code reference associated with Store will be called before any key/value is written to the database and the code reference associated with Fetch will be called after any key/value is read from the database.

For example, here is a sample filter that adds a trailing NULL character to all strings before they are written to the DBM file, and removes the trailing NULL when they are read from the DBM file:

    my $db = tie %hash, 'SDBM_File', ...

    $db->Filter_Push( Store => sub { $_ .= "\x00" ; },
                      Fetch => sub { s/\x00$// ;    });

Points to note:

1.

Both the Store and Fetch filters manipulate $_ .
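
The null-terminating filter above can be exercised end to end. The following is a minimal sketch that assumes SDBM_File is available (it ships with Perl) and writes to a temporary directory; the file name demo is arbitrary. After the round trip it pops the filter and reads the raw record, to show what was actually written to the file.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use File::Temp qw(tempdir);
use SDBM_File;
use DBM_Filter;

my $dir = tempdir(CLEANUP => 1); # scratch directory, removed at exit
my %hash;
my $db = tie %hash, 'SDBM_File', "$dir/demo", O_RDWR|O_CREAT, 0666
    or die "Cannot tie SDBM file: $!\n";

# Applies to both keys and values because Filter_Push is used.
$db->Filter_Push( Store => sub { $_ .= "\x00" },
                  Fetch => sub { s/\x00$//    } );

$hash{"abc"} = 42;
my $round_trip = $hash{"abc"};   # filters hide the trailing NUL

# Remove the filter and look at the record as it is stored.
$db->Filter_Pop();
my $raw = $hash{"abc\x00"};

printf "filtered value: %s, stored length: %d\n", $round_trip, length $raw;

undef $db;                       # drop the object before untie
untie %hash;
```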

Canned Filters

Immediate filters are useful for one-off situations. For more generic problems it can be useful to package the filter up in its own module.

The usage for a canned filter is:

    $db->Filter_Push("name", params)

where

  • "name"

    is the name of the module to load. If the string specified does not contain the package separator characters "::", it is assumed to refer to the full module name "DBM_Filter::name". This means that the full names for the canned filters "null" and "utf8", included with this module, are:

        DBM_Filter::null
        DBM_Filter::utf8

  • params

    any optional parameters that need to be sent to the filter. See the encode filter for an example of a module that uses parameters.

The module that implements the canned filter can take one of two forms. Here is a template for the first

    package DBM_Filter::null ;

    use strict;
    use warnings;

    sub Store
    {
        # store code here
    }

    sub Fetch
    {
        # fetch code here
    }

    1;

Notes:

1.

The package name uses the DBM_Filter:: prefix.

2.

The module must have both a Store and a Fetch method. If only one is present, or neither is present, a fatal error will be thrown.

The second form allows the filter to hold state information using a closure, thus:

    package DBM_Filter::encoding ;

    use strict;
    use warnings;

    sub Filter
    {
        my @params = @_ ;

        ...
        return {
            Store   => sub { $_ = $encoding->encode($_) },
            Fetch   => sub { $_ = $encoding->decode($_) }
            } ;
    }

    1;

In this instance the "Store" and "Fetch" methods are encapsulated inside a "Filter" method.
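
To make the closure form concrete, here is a minimal sketch of a parameterised canned filter defined in the same file as the code that uses it. The filter name prefix_demo and its behaviour (adding a version tag to stored values) are hypothetical and chosen only for illustration; SDBM_File and a temporary directory are assumed for storage.

```perl
package DBM_Filter::prefix_demo;    # hypothetical canned filter
use strict;
use warnings;

# Receives the parameters given after the filter name and returns a
# hash reference whose closures remember them.
sub Filter {
    my ($prefix) = @_;
    return {
        Store => sub { $_ = $prefix . $_ },     # tag the data on writing
        Fetch => sub { s/^\Q$prefix\E// },      # strip the tag on reading
    };
}

package main;
use strict;
use warnings;
use Fcntl;
use File::Temp qw(tempdir);
use SDBM_File;
use DBM_Filter;

my $dir = tempdir(CLEANUP => 1);
my %hash;
my $db = tie %hash, 'SDBM_File', "$dir/demo", O_RDWR|O_CREAT, 0666
    or die "Cannot tie SDBM file: $!\n";

# "prefix_demo" has no "::", so it resolves to DBM_Filter::prefix_demo.
$db->Filter_Value_Push('prefix_demo', 'v1:');

$hash{colour} = 'red';
my $seen = $hash{colour};           # value as the application sees it

$db->Filter_Pop();
my $stored = $hash{colour};         # value as held in the file

print "application sees '$seen', file holds '$stored'\n";

undef $db;
untie %hash;
```

Because the package already exists when Filter_Value_Push is called, no separate module file is needed for this sketch; a reusable filter would normally live in its own file.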

Filters Included

A number of canned filters are provided with this module. They cover a number of the main areas where filters are needed when interfacing with DBM files. They also act as templates for your own filters.

The filters included are:

  • utf8

    This module will ensure that all data written to the DBM will be encoded in UTF-8.

    This module needs the Encode module.

  • encode

    Allows you to choose the character encoding that will be stored in the DBM file.

  • compress

    This filter will compress all data before it is written to the database and uncompress it on reading.

    This module needs Compress::Zlib.

  • int32

    This module is used when interoperating with a C/C++ application that uses a C int as either the key and/or value in the DBM file.

  • null

    This module ensures that all data written to the DBM file is null terminated. This is useful when you have a perl script that needs to interoperate with a DBM file that a C program also uses. A fairly common issue is for the C application to include the terminating null in a string when it writes to the DBM file. This filter will ensure that all data written to the DBM file can be read by the C application.

NOTES

Maintain Round Trip Integrity

When writing a DBM filter it is very important to ensure that it is possible to retrieve all data that you have written when the DBM filter is in place. In practice, this means that whatever transformation is applied to the data in the Store method, the exact inverse operation should be applied in the Fetch method.

If you don't provide an exact inverse transformation, you will find that code like this will not behave as you expect.

    while (my ($k, $v) = each %hash)
    {
        ...
    }

Depending on the transformation, you will find that one or more of the following will happen

1

The loop will never terminate.

2

Too few records will be retrieved.

3

Too many will be retrieved.

4

The loop will do the right thing for a while, but it will unexpectedly fail.

Don't mix filtered & non-filtered data in the same database file.

This is just a restatement of the previous section. Unless you are completely certain you know what you are doing, avoid mixing filtered & non-filtered data.
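
The effect of a non-invertible filter can be shown without a DBM file at all. The following sketch uses a plain hash as a stand-in for the file and a hypothetical helper, apply_filter, that runs a filter in the same way the hooks do (data in $_, modified in place). The lossy Store filter lower-cases keys, so three distinct keys collapse into one record.

```perl
use strict;
use warnings;

# Run a filter the way the DBM hooks do: the data is placed in $_,
# the code reference modifies it in place, and the result is read back.
# (Hypothetical helper for this illustration only.)
sub apply_filter {
    my ($filter, $data) = @_;
    local $_ = $data;
    $filter->();
    return $_;
}

my $store = sub { $_ = lc $_ };   # lossy: the original case is discarded
my $fetch = sub { };              # no inverse exists, so nothing to undo

my %file;                         # plain hash standing in for the DBM file
for my $key (qw(Wall WALL wall)) {
    $file{ apply_filter($store, $key) } = 1;
}

my @keys = map { apply_filter($fetch, $_) } sort keys %file;
print scalar(@keys), " key(s) survive: @keys\n";
```

Three keys went in and only one comes back, which is the "too few records" symptom from the list above.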

EXAMPLE

Say you need to interoperate with a legacy C application that stores keys as C ints and the values as null-terminated UTF-8 strings. Here is how you would set that up:

    my $db = tie %hash, 'SDBM_File', ...

    $db->Filter_Key_Push('int32') ;

    $db->Filter_Value_Push('utf8');
    $db->Filter_Value_Push('null');

SEE ALSO

DB_File, GDBM_File, NDBM_File, ODBM_File, SDBM_File, perldbmfilter

AUTHOR

Paul Marquess <pmqs@cpan.org>

 
DB_File

NAME

DB_File - Perl5 access to Berkeley DB version 1.x

SYNOPSIS

    use DB_File;

    [$X =] tie %hash,  'DB_File', [$filename, $flags, $mode, $DB_HASH] ;
    [$X =] tie %hash,  'DB_File', $filename, $flags, $mode, $DB_BTREE ;
    [$X =] tie @array, 'DB_File', $filename, $flags, $mode, $DB_RECNO ;

    $status = $X->del($key [, $flags]) ;
    $status = $X->put($key, $value [, $flags]) ;
    $status = $X->get($key, $value [, $flags]) ;
    $status = $X->seq($key, $value, $flags) ;
    $status = $X->sync([$flags]) ;
    $status = $X->fd ;

    # BTREE only
    $count = $X->get_dup($key) ;
    @list  = $X->get_dup($key) ;
    %list  = $X->get_dup($key, 1) ;
    $status = $X->find_dup($key, $value) ;
    $status = $X->del_dup($key, $value) ;

    # RECNO only
    $a = $X->length;
    $a = $X->pop ;
    $X->push(list);
    $a = $X->shift;
    $X->unshift(list);
    @r = $X->splice(offset, length, elements);

    # DBM Filters
    $old_filter = $db->filter_store_key  ( sub { ... } ) ;
    $old_filter = $db->filter_store_value( sub { ... } ) ;
    $old_filter = $db->filter_fetch_key  ( sub { ... } ) ;
    $old_filter = $db->filter_fetch_value( sub { ... } ) ;

    untie %hash ;
    untie @array ;

DESCRIPTION

DB_File is a module which allows Perl programs to make use of the facilities provided by Berkeley DB version 1.x (if you have a newer version of DB, see Using DB_File with Berkeley DB version 2 or greater). It is assumed that you have a copy of the Berkeley DB manual pages at hand when reading this documentation. The interface defined here mirrors the Berkeley DB interface closely.

Berkeley DB is a C library which provides a consistent interface to a number of database formats. DB_File provides an interface to all three of the database types currently supported by Berkeley DB.

The file types are:

  • DB_HASH

    This database type allows arbitrary key/value pairs to be stored in data files. This is equivalent to the functionality provided by other hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM. Remember though, the files created using DB_HASH are not compatible with any of the other packages mentioned.

    A default hashing algorithm, which will be adequate for most applications, is built into Berkeley DB. If you do need to use your own hashing algorithm it is possible to write your own in Perl and have DB_File use it instead.

  • DB_BTREE

    The btree format allows arbitrary key/value pairs to be stored in a sorted, balanced binary tree.

    As with the DB_HASH format, it is possible to provide a user defined Perl routine to perform the comparison of keys. By default, though, the keys are stored in lexical order.

  • DB_RECNO

    DB_RECNO allows both fixed-length and variable-length flat text files to be manipulated using the same key/value pair interface as in DB_HASH and DB_BTREE. In this case the key will consist of a record (line) number.

Using DB_File with Berkeley DB version 2 or greater

Although DB_File is intended to be used with Berkeley DB version 1, it can also be used with version 2, 3 or 4. In this case the interface is limited to the functionality provided by Berkeley DB 1.x. Anywhere the version 2 or greater interface differs, DB_File arranges for it to work like version 1. This feature allows DB_File scripts that were built with version 1 to be migrated to version 2 or greater without any changes.

If you want to make use of the new features available in Berkeley DB 2.x or greater, use the Perl module BerkeleyDB instead.

Note: The database file format has changed multiple times in Berkeley DB version 2, 3 and 4. If you cannot recreate your databases, you must dump any existing databases with either the db_dump or the db_dump185 utility that comes with Berkeley DB. Once you have rebuilt DB_File to use Berkeley DB version 2 or greater, your databases can be recreated using db_load . Refer to the Berkeley DB documentation for further details.

Please read COPYRIGHT before using version 2.x or greater of Berkeley DB with DB_File.

Interface to Berkeley DB

DB_File allows access to Berkeley DB files using the tie() mechanism in Perl 5 (for full details, see tie()). This facility allows DB_File to access Berkeley DB files using either an associative array (for DB_HASH & DB_BTREE file types) or an ordinary array (for the DB_RECNO file type).

In addition to the tie() interface, it is also possible to access most of the functions provided in the Berkeley DB API directly. See THE API INTERFACE.

Opening a Berkeley DB Database File

Berkeley DB uses the function dbopen() to open or create a database. Here is the C prototype for dbopen():

    DB*
    dbopen (const char * file, int flags, int mode,
            DBTYPE type, const void * openinfo)

The parameter type is an enumeration which specifies which of the 3 interface methods (DB_HASH, DB_BTREE or DB_RECNO) is to be used. Depending on which of these is actually chosen, the final parameter, openinfo points to a data structure which allows tailoring of the specific interface method.

This interface is handled slightly differently in DB_File. Here is an equivalent call using DB_File:

    tie %array, 'DB_File', $filename, $flags, $mode, $DB_HASH ;

The filename , flags and mode parameters are the direct equivalent of their dbopen() counterparts. The final parameter $DB_HASH performs the function of both the type and openinfo parameters in dbopen().

In the example above $DB_HASH is actually a pre-defined reference to a hash object. DB_File has three of these pre-defined references. Apart from $DB_HASH, there is also $DB_BTREE and $DB_RECNO.

The keys allowed in each of these pre-defined references are limited to the names used in the equivalent C structure. So, for example, the $DB_HASH reference will only allow keys called bsize , cachesize , ffactor , hash , lorder and nelem .

To change one of these elements, just assign to it like this:

    $DB_HASH->{'cachesize'} = 10000 ;

The three predefined variables $DB_HASH, $DB_BTREE and $DB_RECNO are usually adequate for most applications. If you do need to create extra instances of these objects, constructors are available for each file type.

Here are examples of the constructors and the valid options available for DB_HASH, DB_BTREE and DB_RECNO respectively.

    $a = new DB_File::HASHINFO ;
    $a->{'bsize'} ;
    $a->{'cachesize'} ;
    $a->{'ffactor'};
    $a->{'hash'} ;
    $a->{'lorder'} ;
    $a->{'nelem'} ;

    $b = new DB_File::BTREEINFO ;
    $b->{'flags'} ;
    $b->{'cachesize'} ;
    $b->{'maxkeypage'} ;
    $b->{'minkeypage'} ;
    $b->{'psize'} ;
    $b->{'compare'} ;
    $b->{'prefix'} ;
    $b->{'lorder'} ;

    $c = new DB_File::RECNOINFO ;
    $c->{'bval'} ;
    $c->{'cachesize'} ;
    $c->{'psize'} ;
    $c->{'flags'} ;
    $c->{'lorder'} ;
    $c->{'reclen'} ;
    $c->{'bfname'} ;

The values stored in the hashes above are mostly the direct equivalent of their C counterparts. Like their C counterparts, all are set to default values, which means you don't have to set all of the values when you only want to change one. Here is an example:

    $a = new DB_File::HASHINFO ;
    $a->{'cachesize'} = 12345 ;
    tie %y, 'DB_File', "filename", $flags, 0777, $a ;

A few of the options need extra discussion here. When used, the C equivalent of the keys hash , compare and prefix store pointers to C functions. In DB_File these keys are used to store references to Perl subs. Below are templates for each of the subs:

    sub hash
    {
        my ($data) = @_ ;
        ...
        # return the hash value for $data
        return $hash ;
    }

    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return (-1 , 0 or 1) ;
    }

    sub prefix
    {
        my ($key1, $key2) = @_ ;
        ...
        # return number of bytes of $key2 which are
        # necessary to determine that it is greater than $key1
        return $bytes ;
    }

See Changing the BTREE sort order for an example of using the compare template.
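
As a rough illustration of what filled-in compare and prefix templates might look like, here is a minimal sketch for plain byte-ordered keys. The function bodies are an assumption made for this example, not code taken from DB_File; in particular, returning the full key length when the keys are equal follows the convention described in the Berkeley DB btree manual page as recalled here, and should be checked against your copy of that page.

```perl
use strict;
use warnings;

# Byte-wise comparison: returns -1, 0 or 1.
sub compare {
    my ($key1, $key2) = @_;
    return $key1 cmp $key2;
}

# Number of leading bytes of $key2 needed to show that it sorts after
# $key1: the length of the common prefix plus one, capped at the length
# of $key2. Assumes $key2 ge $key1 in byte order.
sub prefix {
    my ($key1, $key2) = @_;
    my $limit = length($key1) < length($key2) ? length($key1) : length($key2);
    my $same  = 0;
    $same++ while $same < $limit
        && substr($key1, $same, 1) eq substr($key2, $same, 1);
    return $same < length($key2) ? $same + 1 : length($key2);
}

for my $pair ( [ 'abc', 'abd' ], [ 'ab', 'abcd' ], [ 'abc', 'abc' ] ) {
    my ($k1, $k2) = @$pair;
    printf "compare(%s, %s) = %2d, prefix = %d\n",
        $k1, $k2, compare($k1, $k2), prefix($k1, $k2);
}
```

References to subs like these would be stored in the compare and prefix elements of a BTREEINFO object before the database is opened.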

If you are using the DB_RECNO interface and you intend making use of bval , you should check out The 'bval' Option.

Default Parameters

It is possible to omit some or all of the final 4 parameters in the call to tie and let them take default values. As DB_HASH is the most common file format used, the call:

    tie %A, "DB_File", "filename" ;

is equivalent to:

    tie %A, "DB_File", "filename", O_CREAT|O_RDWR, 0666, $DB_HASH ;

It is also possible to omit the filename parameter, so the call:

    tie %A, "DB_File" ;

is equivalent to:

    tie %A, "DB_File", undef, O_CREAT|O_RDWR, 0666, $DB_HASH ;

See In Memory Databases for a discussion on the use of undef in place of a filename.

In Memory Databases

Berkeley DB allows the creation of in-memory databases by using NULL (that is, a (char *)0 in C) in place of the filename. DB_File uses undef instead of NULL to provide this functionality.

DB_HASH

The DB_HASH file format is probably the most commonly used of the three file formats that DB_File supports. It is also very straightforward to use.

A Simple Example

This example shows how to create a database, add key/value pairs to the database, delete keys/value pairs and finally how to enumerate the contents of the database.

    use warnings ;
    use strict ;
    use DB_File ;
    our (%h, $k, $v) ;

    unlink "fruit" ;
    tie %h, "DB_File", "fruit", O_RDWR|O_CREAT, 0666, $DB_HASH
        or die "Cannot open file 'fruit': $!\n";

    # Add a few key/value pairs to the file
    $h{"apple"} = "red" ;
    $h{"orange"} = "orange" ;
    $h{"banana"} = "yellow" ;
    $h{"tomato"} = "red" ;

    # Check for existence of a key
    print "Banana Exists\n\n" if $h{"banana"} ;

    # Delete a key/value pair.
    delete $h{"apple"} ;

    # print the contents of the file
    while (($k, $v) = each %h)
      { print "$k -> $v\n" }

    untie %h ;

Here is the output:

    Banana Exists

    orange -> orange
    tomato -> red
    banana -> yellow

Note that, as with ordinary associative arrays, the keys are retrieved in an apparently random order.

DB_BTREE

The DB_BTREE format is useful when you want to store data in a given order. By default the keys will be stored in lexical order, but as you will see from the example shown in the next section, it is very easy to define your own sorting function.

Changing the BTREE sort order

This script shows how to override the default sorting algorithm that BTREE uses. Instead of using the normal lexical ordering, a case insensitive compare function will be used.

    use warnings ;
    use strict ;
    use DB_File ;

    my %h ;

    sub Compare
    {
        my ($key1, $key2) = @_ ;
        "\L$key1" cmp "\L$key2" ;
    }

    # specify the Perl sub that will do the comparison
    $DB_BTREE->{'compare'} = \&Compare ;

    unlink "tree" ;
    tie %h, "DB_File", "tree", O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open file 'tree': $!\n" ;

    # Add a key/value pair to the file
    $h{'Wall'} = 'Larry' ;
    $h{'Smith'} = 'John' ;
    $h{'mouse'} = 'mickey' ;
    $h{'duck'}  = 'donald' ;

    # Delete
    delete $h{"duck"} ;

    # Cycle through the keys printing them in order.
    # Note it is not necessary to sort the keys as
    # the btree will have kept them in order automatically.
    foreach (keys %h)
      { print "$_\n" }

    untie %h ;

Here is the output from the code above.

    mouse
    Smith
    Wall

There are a few points to bear in mind if you want to change the ordering in a BTREE database:

1.

The new compare function must be specified when you create the database.

2.

You cannot change the ordering once the database has been created. Thus you must use the same compare function every time you access the database.

3.

Duplicate keys are entirely defined by the comparison function. In the case-insensitive example above, the keys: 'KEY' and 'key' would be considered duplicates, and assigning to the second one would overwrite the first. If duplicates are allowed for (with the R_DUP flag discussed below), only a single copy of duplicate keys is stored in the database --- so (again with example above) assigning three values to the keys: 'KEY', 'Key', and 'key' would leave just the first key: 'KEY' in the database with three values. For some situations this results in information loss, so care should be taken to provide fully qualified comparison functions when necessary. For example, the above comparison routine could be modified to additionally compare case-sensitively if two keys are equal in the case insensitive comparison:

    sub compare {
        my($key1, $key2) = @_;
        lc $key1 cmp lc $key2 ||
        $key1 cmp $key2;
    }

And now you will only have duplicates when the keys themselves are truly the same. (note: in versions of the db library prior to about November 1996, such duplicate keys were retained so it was possible to recover the original keys in sets of keys that compared as equal).
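
The difference between the two comparison routines can be checked in plain Perl, without a database. The following sketch sorts a small list of keys with each routine and counts how many keys the routine can tell apart; the helper distinct_keys and the sample keys are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Case-insensitive only: 'KEY', 'Key' and 'key' all compare as equal.
sub compare_ci {
    my ($key1, $key2) = @_;
    return lc($key1) cmp lc($key2);
}

# Case-insensitive first, then case-sensitive to break ties.
sub compare_ci_then_exact {
    my ($key1, $key2) = @_;
    return lc($key1) cmp lc($key2) || $key1 cmp $key2;
}

# Sort with the given routine and keep one key from each group of
# keys that the routine treats as equal. (Hypothetical helper.)
sub distinct_keys {
    my ($cmp, @list) = @_;
    my @sorted   = sort { $cmp->($a, $b) } @list;
    my @distinct = shift @sorted;
    for my $key (@sorted) {
        push @distinct, $key if $cmp->($distinct[-1], $key) != 0;
    }
    return @distinct;
}

my @keys = qw(key Smith KEY mouse Key);

my @loose  = distinct_keys(\&compare_ci, @keys);
my @strict = distinct_keys(\&compare_ci_then_exact, @keys);

print scalar(@loose),  " distinct keys with the case-insensitive routine\n";
print scalar(@strict), " distinct keys with the tie-breaking routine: @strict\n";
```

With the tie-breaking routine every key in the list is distinct, while the overall ordering still groups keys that differ only in case.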

Handling Duplicate Keys

The BTREE file type optionally allows a single key to be associated with an arbitrary number of values. This option is enabled by setting the flags element of $DB_BTREE to R_DUP when creating the database.

There are some difficulties in using the tied hash interface if you want to manipulate a BTREE database with duplicate keys. Consider this code:

    use warnings ;
    use strict ;
    use DB_File ;

    my ($filename, %h) ;

    $filename = "tree" ;
    unlink $filename ;

    # Enable duplicate records
    $DB_BTREE->{'flags'} = R_DUP ;

    tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";

    # Add some key/value pairs to the file
    $h{'Wall'} = 'Larry' ;
    $h{'Wall'} = 'Brick' ; # Note the duplicate key
    $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
    $h{'Smith'} = 'John' ;
    $h{'mouse'} = 'mickey' ;

    # iterate through the associative array
    # and print each key/value pair.
    foreach (sort keys %h)
      { print "$_  -> $h{$_}\n" }

    untie %h ;

Here is the output:

    Smith   -> John
    Wall    -> Larry
    Wall    -> Larry
    Wall    -> Larry
    mouse   -> mickey

As you can see 3 records have been successfully created with key Wall - the only thing is, when they are retrieved from the database they seem to have the same value, namely Larry . The problem is caused by the way that the associative array interface works. Basically, when the associative array interface is used to fetch the value associated with a given key, it will only ever retrieve the first value.

Although it may not be immediately obvious from the code above, the associative array interface can be used to write values with duplicate keys, but it cannot be used to read them back from the database.

The way to get around this problem is to use the Berkeley DB API method called seq . This method allows sequential access to key/value pairs. See THE API INTERFACE for details of both the seq method and the API in general.

Here is the script above rewritten using the seq API method.

    use warnings ;
    use strict ;
    use DB_File ;

    my ($filename, $x, %h, $status, $key, $value) ;

    $filename = "tree" ;
    unlink $filename ;

    # Enable duplicate records
    $DB_BTREE->{'flags'} = R_DUP ;

    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";

    # Add some key/value pairs to the file
    $h{'Wall'} = 'Larry' ;
    $h{'Wall'} = 'Brick' ; # Note the duplicate key
    $h{'Wall'} = 'Brick' ; # Note the duplicate key and value
    $h{'Smith'} = 'John' ;
    $h{'mouse'} = 'mickey' ;

    # iterate through the btree using seq
    # and print each key/value pair.
    $key = $value = 0 ;
    for ($status = $x->seq($key, $value, R_FIRST) ;
         $status == 0 ;
         $status = $x->seq($key, $value, R_NEXT) )
      {  print "$key -> $value\n" }

    undef $x ;
    untie %h ;

That prints:

    Smith   -> John
    Wall    -> Brick
    Wall    -> Brick
    Wall    -> Larry
    mouse   -> mickey

This time we have got all the key/value pairs, including the multiple values associated with the key Wall .

To make life easier when dealing with duplicate keys, DB_File comes with a few utility methods.

The get_dup() Method

The get_dup method assists in reading duplicate values from BTREE databases. The method can take the following forms:

    $count = $x->get_dup($key) ;
    @list  = $x->get_dup($key) ;
    %list  = $x->get_dup($key, 1) ;

In a scalar context the method returns the number of values associated with the key, $key .

In list context, it returns all the values which match $key . Note that the values will be returned in an apparently random order.

In list context, if the second parameter is present and evaluates TRUE, the method returns an associative array. The keys of the associative array correspond to the values that matched in the BTREE and the values of the array are a count of the number of times that particular value occurred in the BTREE.
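
The three calling forms differ only in context and in the optional second argument. The following pure-Perl sketch models that behaviour over an in-memory list of records; get_dup_model is a hypothetical stand-in written for this illustration and is not how DB_File implements get_dup.

```perl
use strict;
use warnings;

# Stand-in for a BTREE with duplicates: an ordered list of [key, value].
my @records = (
    [ Smith => 'John'   ],
    [ Wall  => 'Brick'  ],
    [ Wall  => 'Brick'  ],
    [ Wall  => 'Larry'  ],
    [ mouse => 'mickey' ],
);

# Model of the three documented calling forms (hypothetical helper).
sub get_dup_model {
    my ($key, $want_counts) = @_;
    my @values = map { $_->[1] } grep { $_->[0] eq $key } @records;

    return scalar @values unless wantarray;   # scalar context: a count
    return @values        unless $want_counts; # list context: the values

    my %count;                                # list context with flag:
    $count{$_}++ for @values;                 # value => occurrences
    return %count;
}

my $count = get_dup_model('Wall');
my @list  = get_dup_model('Wall');
my %seen  = get_dup_model('Wall', 1);

print "Wall occurred $count times\n";
print "Wall => [@list]\n";
print "There are $seen{Brick} Brick Walls\n";
```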

So assuming the database created above, we can use get_dup like this:

    use warnings ;
    use strict ;
    use DB_File ;

    my ($filename, $x, %h) ;

    $filename = "tree" ;

    # Enable duplicate records
    $DB_BTREE->{'flags'} = R_DUP ;

    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";

    my $cnt  = $x->get_dup("Wall") ;
    print "Wall occurred $cnt times\n" ;

    my %hash = $x->get_dup("Wall", 1) ;
    print "Larry is there\n" if $hash{'Larry'} ;
    print "There are $hash{'Brick'} Brick Walls\n" ;

    my @list = sort $x->get_dup("Wall") ;
    print "Wall =>  [@list]\n" ;

    @list = $x->get_dup("Smith") ;
    print "Smith => [@list]\n" ;

    @list = $x->get_dup("Dog") ;
    print "Dog =>   [@list]\n" ;

and it will print:

    Wall occurred 3 times
    Larry is there
    There are 2 Brick Walls
    Wall =>  [Brick Brick Larry]
    Smith => [John]
    Dog =>   []

The find_dup() Method

    $status = $X->find_dup($key, $value) ;

This method checks for the existence of a specific key/value pair. If the pair exists, the cursor is left pointing to the pair and the method returns 0. Otherwise the method returns a non-zero value.

Assuming the database from the previous example:

    use warnings ;
    use strict ;
    use DB_File ;

    my ($filename, $x, %h, $found) ;

    $filename = "tree" ;

    # Enable duplicate records
    $DB_BTREE->{'flags'} = R_DUP ;

    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";

    $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
    print "Larry Wall is $found there\n" ;

    $found = ( $x->find_dup("Wall", "Harry") == 0 ? "" : "not") ;
    print "Harry Wall is $found there\n" ;

    undef $x ;
    untie %h ;

prints this:

    Larry Wall is  there
    Harry Wall is not there

The del_dup() Method

    $status = $X->del_dup($key, $value) ;

This method deletes a specific key/value pair. It returns 0 if the pair exists and has been deleted successfully. Otherwise the method returns a non-zero value.

Again assuming the existence of the tree database:

    use warnings ;
    use strict ;
    use DB_File ;

    my ($filename, $x, %h, $found) ;

    $filename = "tree" ;

    # Enable duplicate records
    $DB_BTREE->{'flags'} = R_DUP ;

    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";

    $x->del_dup("Wall", "Larry") ;

    $found = ( $x->find_dup("Wall", "Larry") == 0 ? "" : "not") ;
    print "Larry Wall is $found there\n" ;

    undef $x ;
    untie %h ;

prints this:

    Larry Wall is not there
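
The status values of the three duplicate-key methods are easiest to compare side by side. The following is a minimal sketch rather than one of the original examples: it builds its own small BTREE in a temporary directory (the File::Temp usage, the private BTREEINFO object and the variable names are assumptions made for illustration) and records what get_dup, find_dup and del_dup return.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;
use File::Temp qw(tempdir) ;

# Scratch database for this sketch only; the path is an assumption.
my $dir      = tempdir(CLEANUP => 1) ;
my $filename = "$dir/tree" ;

# Use a private BTREEINFO object so the global $DB_BTREE is left untouched.
my $info = DB_File::BTREEINFO->new ;
$info->{'flags'} = R_DUP ;

my %h ;
my $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $info
    or die "Cannot open $filename: $!\n" ;

# With R_DUP set, the second assignment adds a duplicate instead of replacing.
$h{'Wall'}  = 'Larry' ;
$h{'Wall'}  = 'Brick' ;
$h{'Smith'} = 'John' ;

my $before  = $x->get_dup("Wall") ;            # scalar context: number of matches
my $found   = $x->find_dup("Wall", "Larry") ;  # 0 when the pair exists
my $deleted = $x->del_dup("Wall", "Larry") ;   # 0 when the pair was deleted
my $again   = $x->del_dup("Wall", "Larry") ;   # non-zero: the pair is already gone
my $after   = $x->get_dup("Wall") ;

print "Wall had $before values, now has $after\n" ;
print "second del_dup returned a ", ($again ? "non-zero" : "zero"), " status\n" ;

undef $x ;
untie %h ;
```

Testing the status of del_dup, as done here, distinguishes "deleted" from "was never there", which the earlier script cannot do because it ignores the return value.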

Matching Partial Keys

The BTREE interface has a feature which allows partial keys to be matched. This functionality is only available when the seq method is used along with the R_CURSOR flag.

    $x->seq($key, $value, R_CURSOR) ;

Here is the relevant quote from the dbopen man page where it defines the use of the R_CURSOR flag with seq:

    Note, for the DB_BTREE access method, the returned key is not
    necessarily an exact match for the specified key. The returned key
    is the smallest key greater than or equal to the specified key,
    permitting partial key matches and range searches.

In the example script below, the match sub uses this feature to find and print the first matching key/value pair given a partial key.

    use warnings ;
    use strict ;
    use DB_File ;
    use Fcntl ;

    my ($filename, $x, %h, $st, $key, $value) ;

    sub match
    {
        my $key = shift ;
        my $value = 0;
        my $orig_key = $key ;
        $x->seq($key, $value, R_CURSOR) ;
        print "$orig_key\t-> $key\t-> $value\n" ;
    }

    $filename = "tree" ;
    unlink $filename ;

    $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot open $filename: $!\n";

    # Add some key/value pairs to the file
    $h{'mouse'} = 'mickey' ;
    $h{'Wall'} = 'Larry' ;
    $h{'Walls'} = 'Brick' ;
    $h{'Smith'} = 'John' ;

    $key = $value = 0 ;
    print "IN ORDER\n" ;
    for ($st = $x->seq($key, $value, R_FIRST) ;
         $st == 0 ;
         $st = $x->seq($key, $value, R_NEXT) )
      { print "$key -> $value\n" }

    print "\nPARTIAL MATCH\n" ;

    match "Wa" ;
    match "A" ;
    match "a" ;

    undef $x ;
    untie %h ;

Here is the output:

    IN ORDER
    Smith -> John
    Wall  -> Larry
    Walls -> Brick
    mouse -> mickey

    PARTIAL MATCH
    Wa -> Wall  -> Larry
    A  -> Smith -> John
    a  -> mouse -> mickey
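
The dbopen quote also mentions range searches. As a sketch of that idea (not taken from the original documentation), the same R_CURSOR positioning can be followed by R_NEXT calls to collect every key that begins with a given prefix. The helper name keys_with_prefix and the temporary file are hypothetical and chosen only for this illustration; the default lexical key ordering is assumed.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;
my $filename = "$dir/tree" ;

my %h ;
my $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot open $filename: $!\n" ;

$h{'mouse'} = 'mickey' ;
$h{'Wall'}  = 'Larry' ;
$h{'Walls'} = 'Brick' ;
$h{'Smith'} = 'John' ;

# Hypothetical helper: return "key -> value" for every key starting with $prefix.
sub keys_with_prefix
{
    my ($db, $prefix) = @_ ;
    my ($key, $value) = ($prefix, 0) ;
    my @found ;

    # R_CURSOR lands on the smallest key >= $prefix; R_NEXT walks on from
    # there until a key no longer starts with the prefix or the data ends.
    for (my $st = $db->seq($key, $value, R_CURSOR) ;
         $st == 0 && index($key, $prefix) == 0 ;
         $st = $db->seq($key, $value, R_NEXT) )
      { push @found, "$key -> $value" }

    return @found ;
}

my @matches = keys_with_prefix($x, "Wa") ;
print "$_\n" for @matches ;

undef $x ;
untie %h ;
```

Because the keys are held in sorted order, the loop can stop at the first key that fails the prefix test instead of scanning the rest of the database.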

DB_RECNO

DB_RECNO provides an interface to flat text files. Both variable and fixed length records are supported.

In order to make RECNO more compatible with Perl, the array offset for all RECNO arrays begins at 0 rather than 1 as in Berkeley DB.

As with normal Perl arrays, a RECNO array can be accessed using negative indexes. The index -1 refers to the last element of the array, -2 the second last, and so on. Attempting to access an element before the start of the array will raise a fatal run-time error.

The 'bval' Option

The operation of the bval option warrants some discussion. Here is the definition of bval from the Berkeley DB 1.85 recno manual page:

    The delimiting byte to be used to mark the end of a
    record for variable-length records, and the pad character
    for fixed-length records. If no value is specified,
    newlines (``\n'') are used to mark the end of
    variable-length records and fixed-length records are
    padded with spaces.

The second sentence is wrong. In actual fact bval will only default to "\n" when the openinfo parameter in dbopen is NULL. If a non-NULL openinfo parameter is used at all, the value that happens to be in bval will be used. That means you always have to specify bval when making use of any of the options in the openinfo parameter. This documentation error will be fixed in the next release of Berkeley DB.

That clarifies the situation with regard to Berkeley DB itself. What about DB_File? Well, the behavior defined in the quote above is quite useful, so DB_File conforms to it.

That means that you can specify other options (e.g. cachesize) and still have bval default to "\n" for variable length records, and space for fixed length records.

Also note that the bval option only allows you to specify a single byte as a delimiter.
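
As an illustration of setting bval explicitly, here is a minimal sketch that stores variable-length records delimited by a single "|" byte and reads them back. The delimiter, the temporary file and the variable names are assumptions of the example; only the bval key of DB_File::RECNOINFO comes from the text above.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;
my $filename = "$dir/records" ;

# A private RECNOINFO object; bval must be a single byte.
my $info = DB_File::RECNOINFO->new ;
$info->{'bval'} = "|" ;

my @h ;
tie @h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $info
    or die "Cannot open $filename: $!\n" ;
push @h, "orange", "blue", "yellow" ;
untie @h ;

# Reopen with the same delimiter and read the records back.
my @again ;
tie @again, "DB_File", $filename, O_RDWR, 0666, $info
    or die "Cannot reopen $filename: $!\n" ;
my @records = @again ;
untie @again ;

print "records: @records\n" ;
```

The file has to be reopened with the same bval, otherwise the record boundaries will be read differently from how they were written.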

A Simple Example

Here is a simple example that uses RECNO (if you are using a version of Perl earlier than 5.004_57 this example won't work -- see Extra RECNO Methods for a workaround).

    use warnings ;
    use strict ;
    use DB_File ;

    my $filename = "text" ;
    unlink $filename ;

    my @h ;
    tie @h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_RECNO
        or die "Cannot open file 'text': $!\n" ;

    # Add a few key/value pairs to the file
    $h[0] = "orange" ;
    $h[1] = "blue" ;
    $h[2] = "yellow" ;

    push @h, "green", "black" ;

    my $elements = scalar @h ;
    print "The array contains $elements entries\n" ;

    my $last = pop @h ;
    print "popped $last\n" ;

    unshift @h, "white" ;
    my $first = shift @h ;
    print "shifted $first\n" ;

    # Check for existence of a key
    print "Element 1 Exists with value $h[1]\n" if $h[1] ;

    # use a negative index
    print "The last element is $h[-1]\n" ;
    print "The 2nd last element is $h[-2]\n" ;

    untie @h ;

Here is the output from the script:

    The array contains 5 entries
    popped black
    shifted white
    Element 1 Exists with value blue
    The last element is green
    The 2nd last element is yellow

Extra RECNO Methods

If you are using a version of Perl earlier than 5.004_57, the tied array interface is quite limited. In the example script above push, pop, shift, unshift or determining the array length will not work with a tied array.

To make the interface more useful for older versions of Perl, a number of methods are supplied with DB_File to simulate the missing array operations. All these methods are accessed via the object returned from the tie call.

Here are the methods:

  • $X->push(list) ;

    Pushes the elements of list to the end of the array.

  • $value = $X->pop ;

    Removes and returns the last element of the array.

  • $X->shift

    Removes and returns the first element of the array.

  • $X->unshift(list) ;

    Pushes the elements of list to the start of the array.

  • $X->length

    Returns the number of elements in the array.

  • $X->splice(offset, length, elements);

    Returns a splice of the array.
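
The splice method is the least obvious of these, so here is a small sketch of it alongside push and length. It assumes that the method behaves like Perl's built-in splice (removing length elements at offset, inserting the new elements there, and returning the removed ones in list context); the file name and data are made up for the example.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;
use File::Temp qw(tempdir) ;

my $dir  = tempdir(CLEANUP => 1) ;
my $file = "$dir/text" ;

my @h ;
my $H = tie @h, "DB_File", $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
    or die "Cannot open file $file: $!\n" ;

$H->push("zero", "one", "two", "three") ;
my $len = $H->length ;

# Remove two records starting at index 1 and put three new ones in their place.
my @removed = $H->splice(1, 2, "uno", "dos", "tres") ;

# Collect the records by index, using the length method for the upper bound.
my @now = map { $h[$_] } 0 .. $H->length - 1 ;

print "length before splice: $len\n" ;
print "removed: @removed\n" ;
print "now:     @now\n" ;

undef $H ;
untie @h ;
```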

Another Example

Here is a more complete example that makes use of some of the methods described above. It also makes use of the API interface directly (see THE API INTERFACE).

    use warnings ;
    use strict ;
    my (@h, $H, $file, $i) ;
    use DB_File ;
    use Fcntl ;

    $file = "text" ;

    unlink $file ;

    $H = tie @h, "DB_File", $file, O_RDWR|O_CREAT, 0666, $DB_RECNO
        or die "Cannot open file $file: $!\n" ;

    # first create a text file to play with
    $h[0] = "zero" ;
    $h[1] = "one" ;
    $h[2] = "two" ;
    $h[3] = "three" ;
    $h[4] = "four" ;

    # Print the records in order.
    #
    # The length method is needed here because evaluating a tied
    # array in a scalar context does not return the number of
    # elements in the array.

    print "\nORIGINAL\n" ;
    foreach $i (0 .. $H->length - 1) {
        print "$i: $h[$i]\n" ;
    }

    # use the push & pop methods
    $a = $H->pop ;
    $H->push("last") ;
    print "\nThe last record was [$a]\n" ;

    # and the shift & unshift methods
    $a = $H->shift ;
    $H->unshift("first") ;
    print "The first record was [$a]\n" ;

    # Use the API to add a new record after record 2.
    $i = 2 ;
    $H->put($i, "Newbie", R_IAFTER) ;

    # and a new record before record 1.
    $i = 1 ;
    $H->put($i, "New One", R_IBEFORE) ;

    # delete record 3
    $H->del(3) ;

    # now print the records in reverse order
    print "\nREVERSE\n" ;
    for ($i = $H->length - 1 ; $i >= 0 ; -- $i)
      { print "$i: $h[$i]\n" }

    # same again, but use the API functions instead
    print "\nREVERSE again\n" ;
    my ($s, $k, $v) = (0, 0, 0) ;
    for ($s = $H->seq($k, $v, R_LAST) ;
         $s == 0 ;
         $s = $H->seq($k, $v, R_PREV))
      { print "$k: $v\n" }

    undef $H ;
    untie @h ;

and this is what it outputs:

    ORIGINAL
    0: zero
    1: one
    2: two
    3: three
    4: four

    The last record was [four]
    The first record was [zero]

    REVERSE
    5: last
    4: three
    3: Newbie
    2: one
    1: New One
    0: first

    REVERSE again
    5: last
    4: three
    3: Newbie
    2: one
    1: New One
    0: first

Notes:

1. Rather than iterating through the array, @h, like this:

       foreach $i (@h)

   it is necessary to use either this:

       foreach $i (0 .. $H->length - 1)

   or this:

       for ($a = $H->seq($k, $v, R_FIRST) ;
            $a == 0 ;
            $a = $H->seq($k, $v, R_NEXT) )

2. Notice that both times the put method was used the record index was specified using a variable, $i, rather than the literal value itself. This is because put will return the record number of the inserted line via that parameter.

THE API INTERFACE

As well as accessing Berkeley DB using a tied hash or array, it is also possible to make direct use of most of the API functions defined in the Berkeley DB documentation.

To do this you need to store a copy of the object returned from the tie.

    $db = tie %hash, "DB_File", "filename" ;

Once you have done that, you can access the Berkeley DB API functions as DB_File methods directly like this:

    $db->put($key, $value, R_NOOVERWRITE) ;

Important: If you have saved a copy of the object returned from tie, the underlying database file will not be closed until both the tied variable is untied and all copies of the saved object are destroyed.

    use DB_File ;
    $db = tie %hash, "DB_File", "filename"
        or die "Cannot tie filename: $!" ;
    ...
    undef $db ;
    untie %hash ;

See The untie() Gotcha for more details.

All the functions defined in dbopen are available except for close() and dbopen() itself. The DB_File method interface to the supported functions has been implemented to mirror the way Berkeley DB works whenever possible. In particular note that:

  • The methods return a status value. All return 0 on success. All return -1 to signify an error and set $! to the exact error code. The return code 1 generally (but not always) means that the key specified did not exist in the database.

    Other return codes are defined. See below and in the Berkeley DB documentation for details. The Berkeley DB documentation should be used as the definitive source.

  • Whenever a Berkeley DB function returns data via one of its parameters, the equivalent DB_File method does exactly the same.

  • If you are careful, it is possible to mix API calls with the tied hash/array interface in the same piece of code. Although only a few of the methods used to implement the tied interface currently make use of the cursor, you should always assume that the cursor has been changed any time the tied hash/array interface is used. As an example, this code will probably not do what you expect:

        $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
            or die "Cannot tie $filename: $!" ;

        # Get the first key/value pair and set the cursor
        $X->seq($key, $value, R_FIRST) ;

        # this line will modify the cursor
        $count = scalar keys %x ;

        # Get the second key/value pair.
        # oops, it didn't, it got the last key/value pair!
        $X->seq($key, $value, R_NEXT) ;

    The code above can be rearranged to get around the problem, like this:

        $X = tie %x, 'DB_File', $filename, O_RDWR|O_CREAT, 0777, $DB_BTREE
            or die "Cannot tie $filename: $!" ;

        # this line will modify the cursor
        $count = scalar keys %x ;

        # Get the first key/value pair and set the cursor
        $X->seq($key, $value, R_FIRST) ;

        # Get the second key/value pair.
        # worked this time.
        $X->seq($key, $value, R_NEXT) ;

All the constants defined in dbopen for use in the flags parameters in the methods defined below are also available. Refer to the Berkeley DB documentation for the precise meaning of the flags values.

Below is a list of the methods available.

  • $status = $X->get($key, $value [, $flags]) ;

    Given a key ($key) this method reads the value associated with it from the database. The value read from the database is returned in the $value parameter.

    If the key does not exist the method returns 1.

    No flags are currently defined for this method.

  • $status = $X->put($key, $value [, $flags]) ;

    Stores the key/value pair in the database.

    If you use either the R_IAFTER or R_IBEFORE flags, the $key parameter will have the record number of the inserted key/value pair set.

    Valid flags are R_CURSOR, R_IAFTER, R_IBEFORE, R_NOOVERWRITE and R_SETCURSOR.

  • $status = $X->del($key [, $flags]) ;

    Removes all key/value pairs with key $key from the database.

    A return code of 1 means that the requested key was not in the database.

    R_CURSOR is the only valid flag at present.

  • $status = $X->fd ;

    Returns the file descriptor for the underlying database.

    See Locking: The Trouble with fd for an explanation of why you should not use fd to lock your database.

  • $status = $X->seq($key, $value, $flags) ;

    This interface allows sequential retrieval from the database. See dbopen for full details.

    Both the $key and $value parameters will be set to the key/value pair read from the database.

    The flags parameter is mandatory. The valid flag values are R_CURSOR, R_FIRST, R_LAST, R_NEXT and R_PREV.

  • $status = $X->sync([$flags]) ;

    Flushes any cached buffers to disk.

    R_RECNOSYNC is the only valid flag at present.
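
The status conventions described above can be checked directly. What follows is a small sketch rather than anything from the Berkeley DB documentation: it calls put, get and del on a scratch DB_HASH database and keeps each status value. The temporary file and the %status hash are assumptions made for illustration.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;
my $filename = "$dir/fruit" ;

my %hash ;
my $db = tie %hash, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot open $filename: $!\n" ;

my %status ;
my ($value, $unused) = ("", "") ;

# R_NOOVERWRITE refuses to replace an existing key and reports that with 1.
$status{first_put}   = $db->put("apple", "red",   R_NOOVERWRITE) ;
$status{second_put}  = $db->put("apple", "green", R_NOOVERWRITE) ;

# get hands the data back through its second parameter, not its return value.
$status{get_present} = $db->get("apple", $value) ;
$status{get_missing} = $db->get("pear", $unused) ;

# del likewise returns 1 when there was nothing to delete.
$status{del_missing} = $db->del("pear") ;
$status{del_present} = $db->del("apple") ;

print "$_ => $status{$_}\n" for sort keys %status ;
print "apple was [$value]\n" ;

undef $db ;
untie %hash ;
```

A status of -1 is not exercised here; when it does occur, $! holds the error code.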

DBM FILTERS

A DBM Filter is a piece of code that is used when you always want to make the same transformation to all keys and/or values in a DBM database.

There are four methods associated with DBM Filters. All work identically, and each is used to install (or uninstall) a single DBM Filter. Each expects a single parameter, namely a reference to a sub. The only difference between them is the place that the filter is installed.

To summarise:

  • filter_store_key

    If a filter has been installed with this method, it will be invoked every time you write a key to a DBM database.

  • filter_store_value

    If a filter has been installed with this method, it will be invoked every time you write a value to a DBM database.

  • filter_fetch_key

    If a filter has been installed with this method, it will be invoked every time you read a key from a DBM database.

  • filter_fetch_value

    If a filter has been installed with this method, it will be invoked every time you read a value from a DBM database.

You can use any combination of the methods, from none, to all four.

All filter methods return the existing filter, if present, or undef if not.

To delete a filter pass undef to it.

The Filter

When each filter is called by Perl, a local copy of $_ will contain the key or value to be filtered. Filtering is achieved by modifying the contents of $_. The return code from the filter is ignored.
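
The life cycle of a single filter can be sketched as follows. This is an illustration under stated assumptions (a scratch DB_HASH file and an upper-casing filter invented for the example), showing that installing a filter returns whatever was installed before, that the filter works by changing $_, and that passing undef removes it again.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;
my $filename = "$dir/filt" ;

my %h ;
my $db = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot open $filename: $!\n" ;

# No filter was installed before, so undef comes back.
# The filter changes $_; its own return value is ignored.
my $previous = $db->filter_store_value( sub { $_ = uc $_ ; return "ignored" } ) ;
$h{'a'} = "hello" ;

# Passing undef removes the filter and returns the one that was installed.
my $removed = $db->filter_store_value(undef) ;
$h{'b'} = "world" ;

my ($a_val, $b_val) = ($h{'a'}, $h{'b'}) ;
print "a => $a_val, b => $b_val\n" ;

undef $db ;
untie %h ;
```

Keeping the returned filter makes it possible to reinstall it later, for example around a block of code that needs the raw values.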

An Example -- the NULL termination problem.

Consider the following scenario. You have a DBM database that you need to share with a third-party C application. The C application assumes that all keys and values are NULL terminated. Unfortunately when Perl writes to DBM databases it doesn't use NULL termination, so your Perl application will have to manage NULL termination itself. When you write to the database you will have to use something like this:

    $hash{"$key\0"} = "$value\0" ;

Similarly the NULL needs to be taken into account when you are considering the length of existing keys/values.

It would be much better if you could ignore the NULL terminations issue in the main application code and have a mechanism that automatically added the terminating NULL to all keys and values whenever you write to the database and have them removed when you read from the database. As I'm sure you have already guessed, this is a problem that DBM Filters can fix very easily.

    use warnings ;
    use strict ;
    use DB_File ;

    my %hash ;
    my $filename = "filt" ;
    unlink $filename ;

    my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
        or die "Cannot open $filename: $!\n" ;

    # Install DBM Filters
    $db->filter_fetch_key  ( sub { s/\0$//    } ) ;
    $db->filter_store_key  ( sub { $_ .= "\0" } ) ;
    $db->filter_fetch_value( sub { s/\0$//    } ) ;
    $db->filter_store_value( sub { $_ .= "\0" } ) ;

    $hash{"abc"} = "def" ;
    my $a = $hash{"abc"} ;
    # ...
    undef $db ;
    untie %hash ;

Hopefully the contents of each of the filters should be self-explanatory. Both "fetch" filters remove the terminating NULL, and both "store" filters add a terminating NULL.

Another Example -- Key is a C int.

Here is another real-life example. By default, whenever Perl writes to a DBM database it always writes the key and value as strings. So when you use this:

    $hash{12345} = "something" ;

the key 12345 will get stored in the DBM database as the 5 byte string "12345". If you actually want the key to be stored in the DBM database as a C int, you will have to use pack when writing, and unpack when reading.

Here is a DBM Filter that does it:

    use warnings ;
    use strict ;
    use DB_File ;

    my %hash ;
    my $filename = "filt" ;
    unlink $filename ;

    my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
        or die "Cannot open $filename: $!\n" ;

    $db->filter_fetch_key  ( sub { $_ = unpack("i", $_) } ) ;
    $db->filter_store_key  ( sub { $_ = pack ("i", $_) } ) ;
    $hash{123} = "def" ;
    # ...
    undef $db ;
    untie %hash ;

This time only two filters have been used -- we only need to manipulate the contents of the key, so it wasn't necessary to install any value filters.
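
To see what the pack and unpack calls in those two filters do to a key, independently of any database, consider this short sketch. It uses only core Perl; the "N" variant at the end is an assumption about what a C application on another platform might prefer, not something the filters above use.

```perl
use strict ;
use warnings ;
use Config ;

# "i" is a signed integer in the native size and byte order of this machine.
my $packed   = pack("i", 12345) ;
my $unpacked = unpack("i", $packed) ;

printf "string key: %d bytes\n", length("12345") ;
printf "packed key: %d bytes (native int size is %d)\n",
       length($packed), $Config{intsize} ;
printf "round trip: %d\n", $unpacked ;

# A fixed 32-bit big-endian layout, should the file need to be shared
# between machines with different byte orders.
my $portable = pack("N", 12345) ;
printf "portable key: %d bytes\n", length($portable) ;
```

Because "i" follows the native int of the machine running Perl, the filters match a C program compiled for the same platform; the format has to change if the C side uses a different layout.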

HINTS AND TIPS

Locking: The Trouble with fd

Until version 1.72 of this module, the recommended technique for locking DB_File databases was to flock the filehandle returned from the "fd" function. Unfortunately this technique has been shown to be fundamentally flawed (Kudos to David Harris for tracking this down). Use it at your own peril!

The locking technique went like this.

    $db = tie(%db, 'DB_File', 'foo.db', O_CREAT|O_RDWR, 0644)
        || die "dbcreat foo.db $!";
    $fd = $db->fd;
    open(DB_FH, "+<&=$fd") || die "dup $!";
    flock (DB_FH, LOCK_EX) || die "flock: $!";
    ...
    $db{"Tom"} = "Jerry" ;
    ...
    flock(DB_FH, LOCK_UN);
    undef $db;
    untie %db;
    close(DB_FH);

In simple terms, this is what happens:

1. Use "tie" to open the database.

2. Lock the database with fd & flock.

3. Read & Write to the database.

4. Unlock and close the database.

Here is the crux of the problem. A side-effect of opening the DB_File database in step 1 is that an initial block from the database will get read from disk and cached in memory.

To see why this is a problem, consider what can happen when two processes, say "A" and "B", both want to update the same DB_File database using the locking steps outlined above. Assume process "A" has already opened the database and has a write lock, but it hasn't actually updated the database yet (it has finished step 2, but not started step 3 yet). Now process "B" tries to open the same database - step 1 will succeed, but it will block on step 2 until process "A" releases the lock. The important thing to notice here is that at this point in time both processes will have cached identical initial blocks from the database.

Now process "A" updates the database and happens to change some of the data held in the initial buffer. Process "A" terminates, flushing all cached data to disk and releasing the database lock. At this point the database on disk will correctly reflect the changes made by process "A".

With the lock released, process "B" can now continue. It also updates the database and unfortunately it too modifies the data that was in its initial buffer. Once that data gets flushed to disk it will overwrite some/all of the changes process "A" made to the database.

The result of this scenario is at best a database that doesn't contain what you expect. At worst the database will corrupt.

The above won't happen every time competing processes update the same DB_File database, but it does illustrate why the technique should not be used.

Safe ways to lock a database

Starting with version 2.x, Berkeley DB has internal support for locking. The companion module to this one, BerkeleyDB, provides an interface to this locking functionality. If you are serious about locking Berkeley DB databases, I strongly recommend using BerkeleyDB.

If using BerkeleyDB isn't an option, there are a number of modules available on CPAN that can be used to implement locking. Each one implements locking differently and has different goals in mind. It is therefore worth knowing the difference, so that you can pick the right one for your application. Here are the three locking wrappers:

  • Tie::DB_Lock

    A DB_File wrapper which creates copies of the database file for read access, so that you have a kind of a multiversioning concurrent read system. However, updates are still serial. Use for databases where reads may be lengthy and consistency problems may occur.

  • Tie::DB_LockFile

    A DB_File wrapper that has the ability to lock and unlock the database while it is being used. Avoids the tie-before-flock problem by simply re-tie-ing the database when you get or drop a lock. Because of the flexibility in dropping and re-acquiring the lock in the middle of a session, this can be massaged into a system that will work with long updates and/or reads if the application follows the hints in the POD documentation.

  • DB_File::Lock

    An extremely lightweight DB_File wrapper that simply flocks a lockfile before tie-ing the database and drops the lock after the untie. Allows one to use the same lockfile for multiple databases to avoid deadlock problems, if desired. Use for databases where updates and reads are quick and simple flock locking semantics are enough.
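
The lockfile idea described for DB_File::Lock can also be sketched by hand. The code below is not that module's implementation, just a minimal illustration of the ordering that avoids the problem described earlier: take the lock on a separate file first, tie afterwards, and untie before the lock is released. The helper name with_locked_db, the ".lock" suffix and the temporary directory are all hypothetical.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl qw(:DEFAULT :flock) ;
use File::Temp qw(tempdir) ;

# Hypothetical helper, not part of DB_File or DB_File::Lock.
sub with_locked_db
{
    my ($dbfile, $code) = @_ ;

    # 1. Lock a separate lockfile *before* the database is opened.
    open(my $lock, '>>', "$dbfile.lock") or die "Cannot open lockfile: $!\n" ;
    flock($lock, LOCK_EX)                or die "Cannot lock: $!\n" ;

    # 2. Only now tie, so nothing is cached from before the lock was held.
    my %db ;
    tie %db, 'DB_File', $dbfile, O_CREAT|O_RDWR, 0644, $DB_HASH
        or die "Cannot tie $dbfile: $!\n" ;

    # 3. Read and write, trapping errors so the cleanup below always runs.
    my $result = eval { $code->(\%db) } ;
    my $error  = $@ ;

    # 4. Untie (flushing to disk) and only then drop the lock.
    untie %db ;
    close($lock) ;

    die $error if $error ;
    return $result ;
}

my $dir  = tempdir(CLEANUP => 1) ;
my $file = "$dir/foo.db" ;

with_locked_db($file, sub { $_[0]{'count'}++ }) for 1 .. 3 ;
my $count = with_locked_db($file, sub { $_[0]{'count'} }) ;
print "count is $count\n" ;
```

Every process that touches the database has to go through the same lockfile for this to work; flock is advisory, so a process that ties the file directly is not held back.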

Sharing Databases With C Applications

There is no technical reason why a Berkeley DB database cannot be shared by both a Perl and a C application.

The vast majority of problems that are reported in this area boil down to the fact that C strings are NULL terminated, whilst Perl strings are not. See DBM FILTERS for a generic way to work around this problem.

Here is a real example. Netscape 2.0 keeps a record of the locations you visit along with the time you last visited them in a DB_HASH database. This is usually stored in the file ~/.netscape/history.db. The key field in the database is the location string and the value field is the time the location was last visited stored as a 4 byte binary value.

If you haven't already guessed, the location string is stored with a terminating NULL. This means you need to be careful when accessing the database.

Here is a snippet of code that is loosely based on Tom Christiansen's ggh script (available from your nearest CPAN archive in authors/id/TOMC/scripts/nshist.gz).

    use warnings ;
    use strict ;
    use DB_File ;
    use Fcntl ;

    my ($dotdir, $HISTORY, %hist_db, $href, $binary_time, $date) ;
    $dotdir = $ENV{HOME} || $ENV{LOGNAME};

    $HISTORY = "$dotdir/.netscape/history.db";

    tie %hist_db, 'DB_File', $HISTORY
        or die "Cannot open $HISTORY: $!\n" ;

    # Dump the complete database
    while ( ($href, $binary_time) = each %hist_db ) {

        # remove the terminating NULL
        $href =~ s/\x00$// ;

        # convert the binary time into a user friendly string
        $date = localtime unpack("V", $binary_time);
        print "$date $href\n" ;
    }

    # check for the existence of a specific key
    # remember to add the NULL
    if ( $binary_time = $hist_db{"http://mox.perl.com/\x00"} ) {
        $date = localtime unpack("V", $binary_time) ;
        print "Last visited mox.perl.com on $date\n" ;
    }
    else {
        print "Never visited mox.perl.com\n"
    }

    untie %hist_db ;

The untie() Gotcha

If you make use of the Berkeley DB API, it is very strongly recommended that you read The untie Gotcha in perltie.

Even if you don't currently make use of the API interface, it is still worth reading it.

Here is an example which illustrates the problem from a DB_File perspective:

    use DB_File ;
    use Fcntl ;

    my %x ;
    my $X ;

    $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_TRUNC
        or die "Cannot tie first time: $!" ;

    $x{123} = 456 ;

    untie %x ;

    tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
        or die "Cannot tie second time: $!" ;

    untie %x ;

When run, the script will produce this error message:

    Cannot tie second time: Invalid argument at bad.file line 14.

Although the error message above refers to the second tie() statement in the script, the source of the problem is really with the untie() statement that precedes it.

Having read perltie you will probably have already guessed that the error is caused by the extra copy of the tied object stored in $X. If you haven't, then the problem boils down to the fact that the DB_File destructor, DESTROY, will not be called until all references to the tied object are destroyed. Both the tied variable, %x, and $X above hold a reference to the object. The call to untie() will destroy the first, but $X still holds a valid reference, so the destructor will not get called and the database file tst.fil will remain open. The fact that Berkeley DB then reports the attempt to open a database that is already open via the catch-all "Invalid argument" doesn't help.

If you run the script with the -w flag the error message becomes:

    untie attempted while 1 inner references still exist at bad.file line 12.
    Cannot tie second time: Invalid argument at bad.file line 14.

which pinpoints the real problem. Finally the script can now be modified to fix the original problem by destroying the API object before the untie:

    ...
    $x{123} = 456 ;

    undef $X ;
    untie %x ;

    $X = tie %x, 'DB_File', 'tst.fil' , O_RDWR|O_CREAT
    ...

COMMON QUESTIONS

Why is there Perl source in my database?

If you look at the contents of a database file created by DB_File, there can sometimes be part of a Perl script included in it.

This happens because Berkeley DB uses dynamic memory to allocate buffers which will subsequently be written to the database file. Being dynamic, the memory could have been used for anything before DB malloced it. As Berkeley DB doesn't clear the memory once it has been allocated, the unused portions will contain random junk. In the case where a Perl script gets written to the database, the random junk will correspond to an area of dynamic memory that happened to be used during the compilation of the script.

Unless you don't like the possibility of there being part of your Perl scripts embedded in a database file, this is nothing to worry about.

How do I store complex data structures with DB_File?

Although DB_File cannot do this directly, there is a module which can layer transparently over DB_File to accomplish this feat.

Check out the MLDBM module, available on CPAN in the directory modules/by-module/MLDBM.
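
For simple cases the same effect can be had by hand. The sketch below does not use MLDBM; it assumes the core Storable module as the serializer and a scratch database file, and simply freezes a nested structure into a string before storing it and thaws it after fetching.

```perl
use strict ;
use warnings ;
use DB_File ;
use Fcntl ;
use Storable qw(nfreeze thaw) ;
use File::Temp qw(tempdir) ;

my $dir      = tempdir(CLEANUP => 1) ;
my $filename = "$dir/people" ;

my %db ;
tie %db, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "Cannot open $filename: $!\n" ;

my %record = ( name => 'Larry', langs => [ 'perl', 'C' ] ) ;

# DB_File can only store strings, so serialize the structure first.
$db{'larry'} = nfreeze(\%record) ;

# Fetching gives back the string; thaw rebuilds an independent copy.
my $copy = thaw($db{'larry'}) ;
print "$copy->{name} knows @{ $copy->{langs} }\n" ;

untie %db ;
```

Note that changing $copy does not change the database: the modified structure has to be frozen and stored again.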

What does "Invalid Argument" mean?

You will get this error message when one of the parameters in the tie call is wrong. Unfortunately there are quite a few parameters to get wrong, so it can be difficult to figure out which one it is.

Here are a couple of possibilities:

1. Attempting to reopen a database without closing it.

2. Using the O_WRONLY flag.

What does "Bareword 'DB_File' not allowed" mean?

You will encounter this particular error message when you have the strict 'subs' pragma (or the full strict pragma) in your script. Consider this script:

    use warnings ;
    use strict ;
    use DB_File ;
    my %x ;
    tie %x, DB_File, "filename" ;

Running it produces the error in question:

    Bareword "DB_File" not allowed while "strict subs" in use

To get around the error, place the word DB_File in either single or double quotes, like this:

    tie %x, "DB_File", "filename" ;

Although it might seem like a real pain, it is really worth the effort of having a use strict in all your scripts.

REFERENCES

Articles that are either about DB_File or make use of it.

1. Full-Text Searching in Perl, Tim Kientzle (tkientzle@ddj.com), Dr. Dobb's Journal, Issue 295, January 1999, pp 34-41

HISTORY

Moved to the Changes file.

BUGS

Some older versions of Berkeley DB had problems with fixed length records using the RECNO file format. This problem has been fixed since version 1.85 of Berkeley DB.

I am sure there are bugs in the code. If you do find any, or can suggest any enhancements, I would welcome your comments.

AVAILABILITY

DB_File comes with the standard Perl source distribution. Look in the directory ext/DB_File. Given the amount of time between releases of Perl the version that ships with Perl is quite likely to be out of date, so the most recent version can always be found on CPAN (see CPAN in perlmodlib for details), in the directory modules/by-module/DB_File.

This version of DB_File will work with either version 1.x, 2.x or 3.x of Berkeley DB, but is limited to the functionality provided by version 1.

The official web site for Berkeley DB is http://www.oracle.com/technology/products/berkeley-db/db/index.html. All versions of Berkeley DB are available there.

Alternatively, Berkeley DB version 1 is available at your nearest CPAN archive in src/misc/db.1.85.tar.gz.

COPYRIGHT

Copyright (c) 1995-2012 Paul Marquess. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Although DB_File is covered by the Perl license, the library it makes use of, namely Berkeley DB, is not. Berkeley DB has its own copyright and its own license. Please take the time to read it.

Here are a few words taken from the Berkeley DB FAQ (at http://www.oracle.com/technology/products/berkeley-db/db/index.html) regarding the license:

    Do I have to license DB to use it in Perl scripts?

    No. The Berkeley DB license requires that software that uses
    Berkeley DB be freely redistributable. In the case of Perl, that
    software is Perl, and not your scripts. Any Perl scripts that you
    write are your property, including scripts that make use of
    Berkeley DB. Neither the Perl license nor the Berkeley DB license
    place any restriction on what you may do with them.

If you are in any doubt about the license situation, contact either the Berkeley DB authors or the author of DB_File. See AUTHOR for details.

SEE ALSO

perl, dbopen(3), hash(3), recno(3), btree(3), perldbmfilter

AUTHOR

The DB_File interface was written by Paul Marquess <pmqs@cpan.org>.

 
Perl 5 version 18.2 documentation

Digest

NAME

Digest - Modules that calculate message digests

SYNOPSIS

    $md5    = Digest->new("MD5");
    $sha1   = Digest->new("SHA-1");
    $sha256 = Digest->new("SHA-256");
    $sha384 = Digest->new("SHA-384");
    $sha512 = Digest->new("SHA-512");

    $hmac = Digest->HMAC_MD5($key);

DESCRIPTION

The Digest:: modules calculate digests, also called "fingerprints" or "hashes", of some data, called a message. The digest is (usually) some small/fixed size string. The actual size of the digest depends on the algorithm used. The message is simply a sequence of arbitrary bytes or bits.

An important property of the digest algorithms is that the digest is likely to change if the message changes in some way. Another property is that digest functions are one-way functions; that is, it should be hard to find a message that corresponds to some given digest. Algorithms differ in how "likely" and how "hard", as well as how efficient they are to compute.

Note that the properties of the algorithms change over time, as the algorithms are analyzed and machines grow faster. If your application, for instance, depends on it being "impossible" to generate the same digest for a different message, it is wise to make it easy to plug in stronger algorithms as the one used grows weaker. Using the interface documented here should make it easy to change algorithms later.

All Digest:: modules provide the same programming interface: a functional interface for simple use, as well as an object-oriented interface that can handle messages of arbitrary length and can read files directly.

The digest can be delivered in three formats:

  • binary

    This is the most compact form, but it is not well suited for printing or embedding in places that can't handle arbitrary data.

  • hex

    A string of lowercase hexadecimal digits, twice as long as the binary form.

  • base64

    A string of portable printable characters. This is the base64 encoded representation of the digest with any trailing padding removed. The string will be about 30% longer than the binary version. MIME::Base64 tells you more about this encoding.

The functional interface is simply importable functions with the same name as the algorithm. The functions take the message as argument and return the digest. Example:

    use Digest::MD5 qw(md5);
    $digest = md5($message);

There are also versions of the functions with "_hex" or "_base64" appended to the name, which return the digest in the indicated form.
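As a minimal sketch of the functional interface, the following uses Digest::MD5 (bundled with Perl) to produce all three output forms for one message. The message "abc" is an arbitrary choice for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5 md5_hex md5_base64);

my $message = "abc";

my $binary = md5($message);          # raw bytes (16 for MD5)
my $hex    = md5_hex($message);      # lowercase hex, twice as long
my $base64 = md5_base64($message);   # base64 with trailing padding removed

printf "binary length: %d\n", length $binary;
print  "hex:    $hex\n";
print  "base64: $base64\n";
```

The hex form is simply the binary form run through unpack("H*", ...), which is a convenient way to check that two forms describe the same digest.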

OO INTERFACE

The following methods are available for all Digest:: modules:

  • $ctx = Digest->XXX($arg,...)
  • $ctx = Digest->new(XXX => $arg,...)
  • $ctx = Digest::XXX->new($arg,...)

    The constructor returns an object that encapsulates the state of the message-digest algorithm. You can add data to the object and finally ask for the digest. The "XXX" should of course be replaced by the proper name of the digest algorithm you want to use.

    The first two forms are simply syntactic sugar which automatically loads the right module on first use. The second form allows you to use algorithm names that contain characters which are not legal in Perl identifiers, e.g. "SHA-1". If no implementation for the given algorithm can be found, then an exception is raised.

    If new() is called as an instance method (i.e. $ctx->new) it will just reset the state of the object to the state of a newly created object. No new object is created in this case, and the return value is the reference to the object (i.e. $ctx).

  • $other_ctx = $ctx->clone

    The clone method creates a copy of the digest state object and returns a reference to the copy.

  • $ctx->reset

    This is just an alias for $ctx->new.

  • $ctx->add( $data )
  • $ctx->add( $chunk1, $chunk2, ... )

    The string value of the $data provided as argument is appended to the message we calculate the digest for. The return value is the $ctx object itself.

    If more arguments are provided then they are all appended to the message, thus all these lines will have the same effect on the state of the $ctx object:

        $ctx->add("a"); $ctx->add("b"); $ctx->add("c");
        $ctx->add("a")->add("b")->add("c");
        $ctx->add("a", "b", "c");
        $ctx->add("abc");

    Most algorithms are only defined for strings of bytes and this method might therefore croak if the provided arguments contain chars with ordinal number above 255.

  • $ctx->addfile( $io_handle )

    The $io_handle is read until EOF and the content is appended to the message we calculate the digest for. The return value is the $ctx object itself.

    The addfile() method will croak() if it fails reading data for some reason. If it croaks, the state of the $ctx object is unpredictable: the addfile() method might have been able to read the file partially before it failed. It is probably wise to discard or reset the $ctx object if this occurs.

    In most cases you want to make sure that the $io_handle is in "binmode" before you pass it as argument to the addfile() method.

  • $ctx->add_bits( $data, $nbits )
  • $ctx->add_bits( $bitstring )

    The add_bits() method is an alternative to add() that allows partial bytes to be appended to the message. Most users should just ignore this method, as partial bytes are very unlikely to be of any practical use.

    The two argument form of add_bits() will add the first $nbits bits from $data. For the last potentially partial byte only the high order $nbits % 8 bits are used. If $nbits is greater than length($data) * 8, then this method would do the same as $ctx->add($data).

    The one argument form of add_bits() takes a $bitstring of "1" and "0" chars as argument. It's a shorthand for $ctx->add_bits(pack("B*", $bitstring), length($bitstring)).

    The return value is the $ctx object itself.

    This example shows two calls that should have the same effect:

        $ctx->add_bits("111100001010");
        $ctx->add_bits("\xF0\xA0", 12);

    Most digest algorithms are byte based and for these it is not possible to add bits that are not a multiple of 8, and the add_bits() method will croak if you try.

  • $ctx->digest

    Return the binary digest for the message.

    Note that the digest operation is effectively a destructive, read-once operation. Once it has been performed, the $ctx object is automatically reset and can be used to calculate another digest value. Call $ctx->clone->digest if you want to calculate the digest without resetting the digest state.

  • $ctx->hexdigest

    Same as $ctx->digest, but will return the digest in hexadecimal form.

  • $ctx->b64digest

    Same as $ctx->digest, but will return the digest as a base64 encoded string.
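The methods above can be combined as in the following sketch. It assumes that Digest::SHA is installed as the backend for the "SHA-256" and "SHA-1" names (it ships with recent Perls) and that this backend accepts bit-oriented input for add_bits(). The temporary file and the message "abc" are only illustrative.

```perl
use strict;
use warnings;
use Digest;
use File::Temp qw(tempfile);

# Feed the message in pieces; add() returns the context, so calls chain.
my $ctx = Digest->new("SHA-256");
$ctx->add("a")->add("b", "c");

# clone() lets us look at the digest without resetting $ctx.
my $peek  = $ctx->clone->hexdigest;
my $final = $ctx->hexdigest;     # read-once: $ctx is reset afterwards
my $empty = $ctx->hexdigest;     # so this is the digest of the empty message

# addfile() reads a handle until EOF; put the handle in binmode first.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "abc";
close $out or die "close failed: $!";

open my $in, '<', $path or die "cannot open $path: $!";
binmode $in;
my $from_file = Digest->new("SHA-256")->addfile($in)->hexdigest;
close $in;

# The two add_bits() call forms should describe the same 12 bits.
my $bits_a = Digest->new("SHA-1")->add_bits("111100001010")->hexdigest;
my $bits_b = Digest->new("SHA-1")->add_bits("\xF0\xA0", 12)->hexdigest;

print "peek:      $peek\n";
print "final:     $final\n";
print "empty:     $empty\n";
print "from file: $from_file\n";
print "add_bits forms agree\n" if $bits_a eq $bits_b;
```

Because only the algorithm name is passed to Digest->new, switching to a stronger algorithm later should need no other change to the calling code.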

Digest speed

This table should give some indication of the relative speed of different algorithms. It is sorted by throughput, based on a benchmark done with some implementations of this API:

    Algorithm    Size    Implementation                  MB/s
    MD4          128     Digest::MD4 v1.3               165.0
    MD5          128     Digest::MD5 v2.33               98.8
    SHA-256      256     Digest::SHA2 v1.1.0             66.7
    SHA-1        160     Digest::SHA v4.3.1              58.9
    SHA-1        160     Digest::SHA1 v2.10              48.8
    SHA-256      256     Digest::SHA v4.3.1              41.3
    Haval-256    256     Digest::Haval256 v1.0.4         39.8
    SHA-384      384     Digest::SHA2 v1.1.0             19.6
    SHA-512      512     Digest::SHA2 v1.1.0             19.3
    SHA-384      384     Digest::SHA v4.3.1              19.2
    SHA-512      512     Digest::SHA v4.3.1              19.2
    Whirlpool    512     Digest::Whirlpool v1.0.2        13.0
    MD2          128     Digest::MD2 v2.03                9.5
    Adler-32      32     Digest::Adler32 v0.03            1.3
    CRC-16        16     Digest::CRC v0.05                1.1
    CRC-32        32     Digest::CRC v0.05                1.1
    MD5          128     Digest::Perl::MD5 v1.5           1.0
    CRC-CCITT     16     Digest::CRC v0.05                0.8

These numbers were achieved in April 2004 with ActivePerl-5.8.3 running under Linux on a P4 2.8 GHz CPU. The last 5 entries differ by being pure perl implementations of the algorithms, which explains why they are so slow.

SEE ALSO

Digest::Adler32, Digest::CRC, Digest::Haval256, Digest::HMAC, Digest::MD2, Digest::MD4, Digest::MD5, Digest::SHA, Digest::SHA1, Digest::SHA2, Digest::Whirlpool

New digest implementations should consider subclassing from Digest::base.

MIME::Base64

http://en.wikipedia.org/wiki/Cryptographic_hash_function

AUTHOR

Gisle Aas <gisle@aas.no>

The Digest:: interface is based on the interface originally developed by Neil Winton for his MD5 module.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Copyright 1998-2006 Gisle Aas.
    Copyright 1995,1996 Neil Winton.
 
DirHandle

NAME

DirHandle - supply object methods for directory handles

SYNOPSIS

    use DirHandle;
    $d = DirHandle->new(".");
    if (defined $d) {
        while (defined($_ = $d->read)) { something($_); }
        $d->rewind;
        while (defined($_ = $d->read)) { something_else($_); }
        undef $d;
    }

DESCRIPTION

The DirHandle methods provide an alternative interface to the opendir(), closedir(), readdir(), and rewinddir() functions.

The only objective benefit of using DirHandle is that it avoids the namespace pollution caused by creating globs to hold directory handles.
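A runnable variant of the synopsis might look like the sketch below. The temporary directory and the file names alpha.txt and beta.txt are made up for the example; the DirHandle calls (new, read, rewind, close) map onto opendir(), readdir(), rewinddir() and closedir().

```perl
use strict;
use warnings;
use DirHandle;
use File::Temp qw(tempdir);

# Build a throwaway directory holding two empty files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(alpha.txt beta.txt)) {
    open my $fh, '>', "$dir/$name" or die "cannot create $name: $!";
    close $fh;
}

my $d = DirHandle->new($dir);
defined $d or die "cannot open $dir: $!";

# First pass: collect every entry except "." and "..".
my @names;
while (defined(my $entry = $d->read)) {
    push @names, $entry unless $entry eq '.' or $entry eq '..';
}

# rewind() restarts the listing so the directory can be read again.
$d->rewind;
my $count = 0;
while (defined(my $entry = $d->read)) {
    $count++ unless $entry eq '.' or $entry eq '..';
}
$d->close;

print join(", ", sort @names), " ($count entries)\n";
```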

 
Dumpvalue

NAME

Dumpvalue - provides screen dump of Perl data.

SYNOPSIS

    use Dumpvalue;
    my $dumper = Dumpvalue->new;
    $dumper->set(globPrint => 1);
    $dumper->dumpValue(\*::);
    $dumper->dumpvars('main');
    my $dump = $dumper->stringify($some_value);

DESCRIPTION

Creation

A new dumper is created by a call

    $d = Dumpvalue->new(option1 => value1, option2 => value2)

Recognized options:

  • arrayDepth , hashDepth

    Print only first N elements of arrays and hashes. If false, prints all the elements.

  • compactDump , veryCompact

    Change style of array and hash dump. If true, short array may be printed on one line.

  • globPrint

    Whether to print contents of globs.

  • dumpDBFiles

    Dump arrays holding contents of debugged files.

  • dumpPackages

    Dump symbol tables of packages.

  • dumpReused

    Dump contents of "reused" addresses.

  • tick , quoteHighBit , printUndef

    Change style of string dump. The default value of tick is auto; one can enable either a double-quotish or a single-quotish dump by setting it to " or '. By default, characters with the high bit set are printed as is. If quoteHighBit is set, they will be quoted.

  • usageOnly

    Rudimentary per-package memory usage dump. If set, dumpvars calculates the total size of strings in variables in the package.

  • unctrl

    Changes the style of printout of strings. Possible values are unctrl and quote.

  • subdump

    Whether to try to find the subroutine name given the reference.

  • bareStringify

    Whether to write the non-overloaded form of the stringify-overloaded objects.

  • quoteHighBit

    Whether to print chars with high bit set in binary or "as is".

  • stopDbSignal

    Whether to abort printing if debugger signal flag is raised.

Later in the life of the object the options may be queried with the get() method and changed with the set() method (both accept multiple arguments).

Methods

  • dumpValue

        $dumper->dumpValue($value);
        $dumper->dumpValue([$value1, $value2]);

    Prints a dump to the currently selected filehandle.

  • dumpValues

        $dumper->dumpValues($value1, $value2);

    Same as $dumper->dumpValue([$value1, $value2]).

  • stringify

        my $dump = $dumper->stringify($value [,$noticks] );

    Returns the dump of a single scalar without printing. If the second argument is true, the return value does not contain enclosing ticks. Does not handle data structures.

  • dumpvars

        $dumper->dumpvars('my_package');
        $dumper->dumpvars('my_package', 'foo', '~bar$', '!......');

    The optional arguments are considered as literal strings unless they start with ~ or !, in which case they are interpreted as regular expressions (possibly negated).

    The second example prints entries named foo, and also entries whose names end in bar or have fewer than 6 chars.

  • set_quote

        $d->set_quote('"');

    Sets tick and unctrl options to suitable values for printout with the given quote char. Possible values are auto, ' and ".

  • set_unctrl

        $d->set_unctrl('unctrl');

    Sets unctrl option with checking for an invalid argument. Possible values are unctrl and quote.

  • compactDump

        $d->compactDump(1);

    Sets compactDump option. If the value is 1, sets it to a reasonably big number.

  • veryCompact

        $d->veryCompact(1);

    Sets compactDump and veryCompact options simultaneously.

  • set

        $d->set(option1 => value1, option2 => value2);

  • get

        @values = $d->get('option1', 'option2');
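
As a small sketch of how these methods fit together, the example below reads and changes an option, stringifies a few scalars, and captures the output of dumpValue() by selecting an in-memory filehandle. The sample values are arbitrary, and the exact layout of the dump is not shown because it depends on the options in effect.

```perl
use strict;
use warnings;
use Dumpvalue;

my $dumper = Dumpvalue->new(arrayDepth => 2);

# Options can be read back and changed after construction.
my ($depth) = $dumper->get('arrayDepth');
$dumper->set(arrayDepth => 0);                  # false: print all elements

# stringify() returns the dump of one scalar instead of printing it.
my $quoted   = $dumper->stringify("hello");     # with enclosing ticks
my $unquoted = $dumper->stringify("hello", 1);  # $noticks is true
my $undef    = $dumper->stringify(undef);

# dumpValue() prints to the currently selected filehandle, so select()
# an in-memory handle to capture what it writes.
my $captured = '';
open my $mem, '>', \$captured or die "cannot open in-memory handle: $!";
my $previous = select $mem;
$dumper->dumpValue([10, 20, 30]);
select $previous;
close $mem;

print "$quoted $unquoted $undef\n";
print $captured;
```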
 
DynaLoader

NAME

DynaLoader - Dynamically load C libraries into Perl code

SYNOPSIS

    package YourPackage;
    require DynaLoader;
    @ISA = qw(... DynaLoader ...);
    bootstrap YourPackage;

    # optional method for 'global' loading
    sub dl_load_flags { 0x01 }

DESCRIPTION

This document defines a standard generic interface to the dynamic linking mechanisms available on many platforms. Its primary purpose is to implement automatic dynamic loading of Perl modules.

This document serves as both a specification for anyone wishing to implement the DynaLoader for a new platform and as a guide for anyone wishing to use the DynaLoader directly in an application.

The DynaLoader is designed to be a very simple high-level interface that is sufficiently general to cover the requirements of SunOS, HP-UX, NeXT, Linux, VMS and other platforms.

It is also hoped that the interface will cover the needs of OS/2, NT etc and also allow pseudo-dynamic linking (using ld -A at runtime).

It must be stressed that the DynaLoader, by itself, is practically useless for accessing non-Perl libraries because it provides almost no Perl-to-C 'glue'. There is, for example, no mechanism for calling a C library function or supplying arguments. A C::DynaLib module is available from CPAN sites which performs that function for some common system types. And since the year 2000, there's also Inline::C, a module that allows you to write Perl subroutines in C; it is also available from your local CPAN site.

DynaLoader Interface Summary

    @dl_library_path
    @dl_resolve_using
    @dl_require_symbols
    $dl_debug
    @dl_librefs
    @dl_modules
    @dl_shared_objects
                                                      Implemented in:
    bootstrap($modulename)                               Perl
    @filepaths = dl_findfile(@names)                     Perl
    $flags = $modulename->dl_load_flags                  Perl
    $symref  = dl_find_symbol_anywhere($symbol)          Perl

    $libref  = dl_load_file($filename, $flags)           C
    $status  = dl_unload_file($libref)                   C
    $symref  = dl_find_symbol($libref, $symbol)          C
    @symbols = dl_undef_symbols()                        C
    dl_install_xsub($name, $symref [, $filename])        C
    $message = dl_error                                  C

  • @dl_library_path

    The standard/default list of directories in which dl_findfile() will search for libraries etc. Directories are searched in order: $dl_library_path[0], [1], ... etc.

    @dl_library_path is initialised to hold the list of 'normal' directories (/usr/lib, etc) determined by Configure ($Config{'libpth'}). This should ensure portability across a wide range of platforms.

    @dl_library_path should also be initialised with any other directories that can be determined from the environment at runtime (such as LD_LIBRARY_PATH for SunOS).

    After initialisation @dl_library_path can be manipulated by an application using push and unshift before calling dl_findfile(). Unshift can be used to add directories to the front of the search order either to save search time or to override libraries with the same name in the 'normal' directories.

    The load function that dl_load_file() calls may require an absolute pathname. The dl_findfile() function and @dl_library_path can be used to search for and return the absolute pathname for the library/object that you wish to load.

  • @dl_resolve_using

    A list of additional libraries or other shared objects which can be used to resolve any undefined symbols that might be generated by a later call to dl_load_file().

    This is only required on some platforms which do not handle dependent libraries automatically. For example the Socket Perl extension library (auto/Socket/Socket.so) contains references to many socket functions which need to be resolved when it's loaded. Most platforms will automatically know where to find the 'dependent' library (e.g., /usr/lib/libsocket.so). A few platforms need to be told the location of the dependent library explicitly. Use @dl_resolve_using for this.

    Example usage:

        @dl_resolve_using = dl_findfile('-lsocket');

  • @dl_require_symbols

    A list of one or more symbol names that are in the library/object file to be dynamically loaded. This is only required on some platforms.

  • @dl_librefs

    An array of the handles returned by successful calls to dl_load_file(), made by bootstrap, in the order in which they were loaded. Can be used with dl_find_symbol() to look for a symbol in any of the loaded files.

  • @dl_modules

    An array of module (package) names that have been bootstrap'ed.

  • @dl_shared_objects

    An array of file names for the shared objects that were loaded.

  • dl_error()

    Syntax:

        $message = dl_error();

    Error message text from the last failed DynaLoader function. Note that, similar to errno in unix, a successful function call does not reset this message.

    Implementations should detect the error as soon as it occurs in any of the other functions and save the corresponding message for later retrieval. This will avoid problems on some platforms (such as SunOS) where the error message is very temporary (e.g., dlerror()).

  • $dl_debug

    Internal debugging messages are enabled when $dl_debug is set true. Currently setting $dl_debug only affects the Perl side of the DynaLoader. These messages should help an application developer to resolve any DynaLoader usage problems.

    $dl_debug is set to $ENV{'PERL_DL_DEBUG'} if defined.

    For the DynaLoader developer/porter there is a similar debugging variable added to the C code (see dlutils.c) and enabled if Perl was built with the -DDEBUGGING flag. This can also be set via the PERL_DL_DEBUG environment variable. Set to 1 for minimal information or higher for more.

  • dl_findfile()

    Syntax:

        @filepaths = dl_findfile(@names)

    Determine the full paths (including file suffix) of one or more loadable files given their generic names and optionally one or more directories. Searches directories in @dl_library_path by default and returns an empty list if no files were found.

    Names can be specified in a variety of platform independent forms. Any names in the form -lname are converted into libname.*, where .* is an appropriate suffix for the platform.

    If a name does not already have a suitable prefix and/or suffix then the corresponding file will be searched for by trying combinations of prefix and suffix appropriate to the platform: "$name.o", "lib$name.*" and "$name".

    If any directories are included in @names they are searched before @dl_library_path. Directories may be specified as -Ldir. Any other names are treated as filenames to be searched for.

    Using arguments of the form -Ldir and -lname is recommended.

    Example:

        @dl_resolve_using = dl_findfile(qw(-L/usr/5lib -lposix));

  • dl_expandspec()

    Syntax:

        $filepath = dl_expandspec($spec)

    Some unusual systems, such as VMS, require special filename handling in order to deal with symbolic names for files (i.e., VMS's Logical Names).

    To support these systems a dl_expandspec() function can be implemented either in the dl_*.xs file or code can be added to the dl_expandspec() function in DynaLoader.pm. See DynaLoader_pm.PL for more information.

  • dl_load_file()

    Syntax:

        $libref = dl_load_file($filename, $flags)

    Dynamically load $filename, which must be the path to a shared object or library. An opaque 'library reference' is returned as a handle for the loaded object. Returns undef on error.

    The $flags argument alters dl_load_file() behaviour. Assigned bits:

        0x01  make symbols available for linking later dl_load_file's.
              (only known to work on Solaris 2 using dlopen(RTLD_GLOBAL))
              (ignored under VMS; this is a normal part of image linking)

    (On systems that provide a handle for the loaded object such as SunOS and HPUX, $libref will be that handle. On other systems $libref will typically be $filename or a pointer to a buffer containing $filename. The application should not examine or alter $libref in any way.)

    This is the function that does the real work. It should use the current values of @dl_require_symbols and @dl_resolve_using if required.

        SunOS: dlopen($filename)
        HP-UX: shl_load($filename)
        Linux: dld_create_reference(@dl_require_symbols); dld_link($filename)
        NeXT:  rld_load($filename, @dl_resolve_using)
        VMS:   lib$find_image_symbol($filename,$dl_require_symbols[0])

    (The dlopen() function is also used by Solaris and some versions of Linux, and is a common choice when providing a "wrapper" on other mechanisms as is done in the OS/2 port.)

  • dl_unload_file()

    Syntax:

        $status = dl_unload_file($libref)

    Dynamically unload $libref, which must be an opaque 'library reference' as returned from dl_load_file. Returns one on success and zero on failure.

    This function is optional and may not necessarily be provided on all platforms. If it is defined, it is called automatically when the interpreter exits for every shared object or library loaded by DynaLoader::bootstrap. All such library references are stored in @dl_librefs by DynaLoader::bootstrap as it loads the libraries. The files are unloaded in last-in, first-out order.

    This unloading is usually necessary when embedding a shared-object perl (e.g. one configured with -Duseshrplib) within a larger application, and the perl interpreter is created and destroyed several times within the lifetime of the application. In this case it is possible that the system dynamic linker will unload and then subsequently reload the shared libperl without relocating any references to it from any files DynaLoaded by the previous incarnation of the interpreter. As a result, any shared objects opened by DynaLoader may point to a now invalid 'ghost' of the libperl shared object, causing apparently random memory corruption and crashes. This behaviour is most commonly seen when using Apache and mod_perl built with the APXS mechanism.

        SunOS: dlclose($libref)
        HP-UX: ???
        Linux: ???
        NeXT:  ???
        VMS:   ???

    (The dlclose() function is also used by Solaris and some versions of Linux, and is a common choice when providing a "wrapper" on other mechanisms as is done in the OS/2 port.)

  • dl_load_flags()

    Syntax:

        $flags = dl_load_flags $modulename;

    Designed to be a method call, and to be overridden by a derived class (i.e. a class which has DynaLoader in its @ISA). The definition in DynaLoader itself returns 0, which produces standard behavior from dl_load_file().

  • dl_find_symbol()

    Syntax:

        $symref = dl_find_symbol($libref, $symbol)

    Return the address of the symbol $symbol or undef if not found. If the target system has separate functions to search for symbols of different types then dl_find_symbol() should search for function symbols first and then other types.

    The exact manner in which the address is returned in $symref is not currently defined. The only initial requirement is that $symref can be passed to, and understood by, dl_install_xsub().

        SunOS: dlsym($libref, $symbol)
        HP-UX: shl_findsym($libref, $symbol)
        Linux: dld_get_func($symbol) and/or dld_get_symbol($symbol)
        NeXT:  rld_lookup("_$symbol")
        VMS:   lib$find_image_symbol($libref,$symbol)

  • dl_find_symbol_anywhere()

    Syntax:

        $symref = dl_find_symbol_anywhere($symbol)

    Applies dl_find_symbol() to the members of @dl_librefs and returns the first match found.

  • dl_undef_symbols()

    Syntax:

        @symbols = dl_undef_symbols()

    Return a list of symbol names which remain undefined after dl_load_file(). Returns () if not known. Don't worry if your platform does not provide a mechanism for this. Most do not need it and hence do not provide it; they just return an empty list.

  • dl_install_xsub()

    Syntax:

        dl_install_xsub($perl_name, $symref [, $filename])

    Create a new Perl external subroutine named $perl_name using $symref as a pointer to the function which implements the routine. This is simply a direct call to newXSUB(). Returns a reference to the installed function.

    The $filename parameter is used by Perl to identify the source file for the function if required by die(), caller() or the debugger. If $filename is not defined then "DynaLoader" will be used.

  • bootstrap()

    Syntax:

        bootstrap($module [...])

    This is the normal entry point for automatic dynamic loading in Perl.

    It performs the following actions:

    • locates an auto/$module directory by searching @INC

    • uses dl_findfile() to determine the filename to load

    • sets @dl_require_symbols to ("boot_$module")

    • executes an auto/$module/$module.bs file if it exists (typically used to add to @dl_resolve_using any files which are required to load the module on the current platform)

    • calls dl_load_flags() to determine how to load the file.

    • calls dl_load_file() to load the file

    • calls dl_undef_symbols() and warns if any symbols are undefined

    • calls dl_find_symbol() for "boot_$module"

    • calls dl_install_xsub() to install it as "${module}::bootstrap"

    • calls &{"${module}::bootstrap"} to bootstrap the module (actually it uses the function reference returned by dl_install_xsub for speed)

    All arguments to bootstrap() are passed to the module's bootstrap function. The default code generated by xsubpp expects $module [, $version]. If the optional $version argument is not given, it defaults to $XS_VERSION // $VERSION in the module's symbol table. The default code compares the Perl-space version with the version of the compiled XS code, and croaks with an error if they do not match.
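
The Perl-level search helpers can be tried without loading anything. The sketch below puts a directory at the front of @dl_library_path and asks dl_findfile() for the maths library using the recommended -lname form. The directory /opt/example/lib is hypothetical, and whether libm is found at all depends on the platform, so the result may legitimately be empty.

```perl
use strict;
use warnings;
use DynaLoader ();

# Snapshot the default search path that Configure determined.
my @default_path = @DynaLoader::dl_library_path;

my @found;
{
    # Temporarily search a (hypothetical) directory first; directories
    # that do not exist are skipped during the search.
    local @DynaLoader::dl_library_path = ('/opt/example/lib', @default_path);
    @found = DynaLoader::dl_findfile('-lm');
}

print "search path: @default_path\n";
print @found ? "found: @found\n" : "libm not found via dl_findfile()\n";
```

Using local keeps the change to the search order confined to the block, so later lookups still see the default path.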

AUTHOR

Tim Bunce, 11 August 1994.

This interface is based on the work and comments of (in no particular order): Larry Wall, Robert Sanders, Dean Roehrich, Jeff Okamoto, Anno Siegel, Thomas Neumann, Paul Marquess, Charles Bailey, myself and others.

Larry Wall designed the elegant inherited bootstrap mechanism and implemented the first Perl 5 dynamic loader using it.

Solaris global loading added by Nick Ing-Simmons with design/coding assistance from Tim Bunce, January 1996.

 
Encode

NAME

Encode - character encodings in Perl

SYNOPSIS

    use Encode qw(decode encode);
    $characters = decode('UTF-8', $octets,     Encode::FB_CROAK);
    $octets     = encode('UTF-8', $characters, Encode::FB_CROAK);

Table of Contents

Encode consists of a collection of modules whose details are too extensive to fit in one document. This one itself explains the top-level APIs and general topics at a glance. For other topics and more details, see the documentation for these modules:

DESCRIPTION

The Encode module provides the interface between Perl strings and the rest of the system. Perl strings are sequences of characters.

The repertoire of characters that Perl can represent is a superset of those defined by the Unicode Consortium. On most platforms the ordinal value of a character as returned by ord(S) is the Unicode codepoint for that character. The exceptions are platforms where the legacy encoding is some variant of EBCDIC rather than a superset of ASCII; see perlebcdic.

During recent history, data is moved around a computer in 8-bit chunks, often called "bytes" but also known as "octets" in standards documents. Perl is widely used to manipulate data of many types: not only strings of characters representing human or computer languages, but also "binary" data, being the machine's representation of numbers, pixels in an image, or just about anything.

When Perl is processing "binary data", the programmer wants Perl to process "sequences of bytes". This is not a problem for Perl: because a byte has 256 possible values, it easily fits in Perl's much larger "logical character".

This document mostly explains the how. perlunitut and perlunifaq explain the why.

TERMINOLOGY

character

A character in the range 0 .. 2**32-1 (or more); what Perl's strings are made of.

byte

A character in the range 0..255; a special case of a Perl character.

octet

8 bits of data, with ordinal values 0..255; term for bytes passed to or from a non-Perl context, such as a disk file, standard I/O stream, database, command-line argument, environment variable, socket etc.

THE PERL ENCODING API

Basic methods

encode

    $octets = encode(ENCODING, STRING[, CHECK])

Encodes the scalar value STRING from Perl's internal form into ENCODING and returns a sequence of octets. ENCODING can be either a canonical name or an alias. For encoding names and aliases, see Defining Aliases. For CHECK, see Handling Malformed Data.

For example, to convert a string from Perl's internal format into ISO-8859-1, also known as Latin1:

    $octets = encode("iso-8859-1", $string);

CAVEAT: When you run $octets = encode("utf8", $string), then $octets might not be equal to $string. Though both contain the same data, the UTF8 flag for $octets is always off. When you encode anything, the UTF8 flag on the result is always off, even when it contains a completely valid utf8 string. See The UTF8 flag below.

If the $string is undef, then undef is returned.

decode

    $string = decode(ENCODING, OCTETS[, CHECK])

This function returns the string that results from decoding the scalar value OCTETS, assumed to be a sequence of octets in ENCODING, into Perl's internal form. As with encode(), ENCODING can be either a canonical name or an alias. For encoding names and aliases, see Defining Aliases; for CHECK, see Handling Malformed Data.

For example, to convert ISO-8859-1 data into a string in Perl's internal format:

    $string = decode("iso-8859-1", $octets);

CAVEAT: When you run $string = decode("utf8", $octets), then $string might not be equal to $octets. Though both contain the same data, the UTF8 flag for $string is on unless $octets consists entirely of ASCII data on ASCII machines or EBCDIC on EBCDIC machines. See The UTF8 flag below.

If $octets is undef, then undef is returned.
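
To make the character/octet distinction concrete, here is a minimal sketch that encodes one string two ways and decodes it back. The sample string "café" (written with an escape for U+00E9) is an arbitrary choice.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $string = "caf\x{E9}";                     # 4 characters, last is U+00E9

my $latin1 = encode("iso-8859-1", $string);   # one octet per character here
my $utf8   = encode("UTF-8",      $string);   # U+00E9 becomes two octets

# Decoding each octet sequence with the matching encoding gives the
# original characters back.
my $from_latin1 = decode("iso-8859-1", $latin1);
my $from_utf8   = decode("UTF-8",      $utf8);

printf "%d characters, %d Latin-1 octets, %d UTF-8 octets\n",
    length $string, length $latin1, length $utf8;
```

Decoding with the wrong encoding name would not fail for ISO-8859-1, since every octet is valid there; it would simply produce different characters, which is why the encoding of incoming data has to be known rather than guessed.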

find_encoding

  1. [$obj =] find_encoding(ENCODING)

Returns the encoding object corresponding to ENCODING. Returns undef if no matching ENCODING is find. The returned object is what does the actual encoding or decoding.

  1. $utf8 = decode($name, $bytes);

is in fact

  1. $utf8 = do {
  2. $obj = find_encoding($name);
  3. croak qq(encoding "$name" not found) unless ref $obj;
  4. $obj->decode($bytes);
  5. };

with more error checking.

You can therefore save time by reusing this object as follows:

    my $enc = find_encoding("iso-8859-1");
    while (<>) {
        my $utf8 = $enc->decode($_);
        ... # now do something with $utf8;
    }

Besides decode and encode, other methods are available as well. For instance, name() returns the canonical name of the encoding object.

    find_encoding("latin1")->name; # iso-8859-1

See Encode::Encoding for details.
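
A self-contained sketch of the lookup-once, reuse-many pattern follows. The sample records and the unknown encoding name are invented for illustration.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Look the encoding up once. find_encoding() returns undef for unknown
# names, so check the result before calling methods on it.
my $name = "latin1";
my $enc  = find_encoding($name)
    or die qq(encoding "$name" not found);

print $enc->name, "\n";    # canonical name behind the "latin1" alias

# Reuse the same object for every record instead of resolving the
# name each time. The records are made-up ISO-8859-1 octets.
my @records = ("caf\xE9", "na\xEFve");
my @strings = map { $enc->decode($_) } @records;

# An unknown name yields undef rather than an exception.
my $missing = find_encoding("no-such-encoding-xyz");
print "unknown encoding gives undef\n" unless defined $missing;
```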

from_to

    [$length =] from_to($octets, FROM_ENC, TO_ENC [, CHECK])

Converts in-place data between two encodings. The data in $octets must be encoded as octets and not as characters in Perl's internal format. For example, to convert ISO-8859-1 data into Microsoft's CP1250 encoding:

    from_to($octets, "iso-8859-1", "cp1250");

and to convert it back:

    from_to($octets, "cp1250", "iso-8859-1");

Because the conversion happens in place, the data to be converted cannot be a string constant: it must be a scalar variable.

from_to() returns the length of the converted string in octets on success, and undef on error.

CAVEAT: The following operations may look the same, but are not:

    from_to($data, "iso-8859-1", "utf8"); #1
    $data = decode("iso-8859-1", $data);  #2

Both #1 and #2 make $data consist of a completely valid UTF-8 string, but only #2 turns the UTF8 flag on. #1 is equivalent to:

    $data = encode("utf8", decode("iso-8859-1", $data));

See The UTF8 flag below.

Also note that:

    from_to($octets, $from, $to, $check);

is equivalent to:

    $octets = encode($to, decode($from, $octets), $check);

Yes, it does not respect the $check during decoding. It is deliberately done that way. If you need minute control, use decode followed by encode as follows:

    $octets = encode($to, decode($from, $octets, $check_from), $check_to);
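
The in-place behaviour and the return value can be seen in a short sketch like the following, where the sample data is made up:

```perl
use strict;
use warnings;
use Encode qw(from_to);

# from_to() needs a variable it can overwrite, not a string literal.
my $data = "caf\xE9";    # four ISO-8859-1 octets
my $len  = from_to($data, "iso-8859-1", "UTF-8");

# $data now holds UTF-8 octets and $len is their count.
printf "converted to %d octets\n", $len;
```

The result is still a string of octets, so its UTF8 flag is off.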

encode_utf8

    $octets = encode_utf8($string);

Equivalent to $octets = encode("utf8", $string). The characters in $string are encoded in Perl's internal format, and the result is returned as a sequence of octets. Because all possible characters in Perl have a (loose, not strict) UTF-8 representation, this function cannot fail.

decode_utf8

    $string = decode_utf8($octets [, CHECK]);

Equivalent to $string = decode("utf8", $octets [, CHECK]). The sequence of octets represented by $octets is decoded from UTF-8 into a sequence of logical characters. Because not all sequences of octets are valid UTF-8, it is quite possible for this function to fail. For CHECK, see Handling Malformed Data.

Listing available encodings

    use Encode;
    @list = Encode->encodings();

Returns a list of canonical names of available encodings that have already been loaded. To get a list of all available encodings including those that have not yet been loaded, say:

    @all_encodings = Encode->encodings(":all");

Or you can give the name of a specific module:

    @with_jp = Encode->encodings("Encode::JP");

When "::" is not in the name, "Encode::" is assumed.

    @ebcdic = Encode->encodings("EBCDIC");

To find out in detail which encodings are supported by this package, see Encode::Supported.

Defining Aliases

To add a new alias to a given encoding, use:

    use Encode;
    use Encode::Alias;
    define_alias(NEWNAME => ENCODING);

After that, NEWNAME can be used as an alias for ENCODING. ENCODING may be either the name of an encoding or an encoding object.

Before you do that, first make sure the alias is nonexistent using resolve_alias(), which returns the canonical name thereof. For example:

    Encode::resolve_alias("latin1") eq "iso-8859-1" # true
    Encode::resolve_alias("iso-8859-12")   # false; nonexistent
    Encode::resolve_alias($name) eq $name  # true if $name is canonical

resolve_alias() does not need use Encode::Alias; it can be imported via use Encode qw(resolve_alias).

See Encode::Alias for details.
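
Putting the check and the definition together might look like this sketch. The alias name "my-legacy-charset" is hypothetical and chosen only so that it does not clash with a real encoding.

```perl
use strict;
use warnings;
use Encode qw(resolve_alias);
use Encode::Alias;    # provides define_alias()

my $alias = "my-legacy-charset";    # hypothetical alias name

# resolve_alias() returns a false value for names Encode does not know,
# so only define the alias when it is still free.
unless (resolve_alias($alias)) {
    define_alias($alias => "iso-8859-1");
}

print resolve_alias($alias), "\n";
```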

Finding IANA Character Set Registry names

The canonical name of a given encoding does not necessarily agree with IANA Character Set Registry, commonly seen as Content-Type: text/plain; charset=WHATEVER. For most cases, the canonical name works, but sometimes it does not, most notably with "utf-8-strict".

As of Encode version 2.21, a new method mime_name() is therefore added.

    use Encode;
    my $enc = find_encoding("UTF-8");
    warn $enc->name;      # utf-8-strict
    warn $enc->mime_name; # UTF-8

See also: Encode::Encoding

Encoding via PerlIO

If your perl supports PerlIO (which is the default), you can use a PerlIO layer to decode and encode directly via a filehandle. The following two examples are fully identical in functionality:

    ### Version 1 via PerlIO
    open(INPUT,  "< :encoding(shiftjis)", $infile)
        || die "Can't open < $infile for reading: $!";
    open(OUTPUT, "> :encoding(euc-jp)",  $outfile)
        || die "Can't open > $outfile for writing: $!";
    while (<INPUT>) {   # auto decodes $_
        print OUTPUT;   # auto encodes $_
    }
    close(INPUT)   || die "can't close $infile: $!";
    close(OUTPUT)  || die "can't close $outfile: $!";

    ### Version 2 via from_to()
    open(INPUT,  "< :raw", $infile)
        || die "Can't open < $infile for reading: $!";
    open(OUTPUT, "> :raw",  $outfile)
        || die "Can't open > $outfile for writing: $!";
    while (<INPUT>) {
        from_to($_, "shiftjis", "euc-jp", 1);  # switch encoding
        print OUTPUT;   # emit raw (but properly encoded) data
    }
    close(INPUT)   || die "can't close $infile: $!";
    close(OUTPUT)  || die "can't close $outfile: $!";

In the first version above, you let the appropriate encoding layer handle the conversion. In the second, you explicitly translate from one encoding to the other.

Unfortunately, not every encoding is PerlIO-savvy. You can check whether an encoding is supported by PerlIO by invoking the perlio_ok method on it:

    Encode::perlio_ok("hz");             # false
    find_encoding("euc-cn")->perlio_ok;  # true wherever PerlIO is available

    use Encode qw(perlio_ok);            # imported upon request
    perlio_ok("euc-jp")

Fortunately, all encodings that come with Encode core are PerlIO-savvy except for hz and ISO-2022-kr. For the gory details, see Encode::Encoding and Encode::PerlIO.
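
The layer approach can be tried without touching the filesystem. The sketch below assumes in-memory filehandles are acceptable for the demonstration and uses made-up sample data; it checks perlio_ok() first and then converts ISO-8859-1 input to UTF-8 output.

```perl
use strict;
use warnings;
use Encode qw(perlio_ok);

my $name = "iso-8859-1";
die "$name cannot be used as a PerlIO layer\n" unless perlio_ok($name);

# Sample input: ISO-8859-1 octets, read through an in-memory file.
my $latin1 = "caf\xE9\n";
my $utf8   = "";

open(my $in,  "<:encoding($name)", \$latin1) or die "can't open input: $!";
open(my $out, ">:encoding(UTF-8)", \$utf8)   or die "can't open output: $!";
while (my $line = <$in>) {    # decoded on the way in
    print {$out} $line;       # encoded on the way out
}
close($in)  or die "can't close input: $!";
close($out) or die "can't close output: $!";
```

After the handles are closed, $utf8 holds the same text as UTF-8 octets.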

Handling Malformed Data

The optional CHECK argument tells Encode what to do when encountering malformed data. Without CHECK, Encode::FB_DEFAULT (== 0) is assumed.

As of version 2.12, Encode supports coderef values for CHECK; see below.

NOTE: Not all encodings support this feature. Some encodings ignore the CHECK argument. For example, Encode::Unicode ignores CHECK and it always croaks on error.

List of CHECK values

FB_DEFAULT

    CHECK = Encode::FB_DEFAULT ( == 0)

If CHECK is 0, encoding and decoding replace any malformed character with a substitution character. When you encode, SUBCHAR is used. When you decode, the Unicode REPLACEMENT CHARACTER, code point U+FFFD, is used. If the data is supposed to be UTF-8, an optional lexical warning of warning category "utf8" is given.

FB_CROAK

    CHECK = Encode::FB_CROAK ( == 1)

If CHECK is 1, methods immediately die with an error message. Therefore, when CHECK is 1, you should trap exceptions with eval{}, unless you really want to let it die.
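
A minimal sketch of trapping the exception with eval follows. The malformed input is invented, and LEAVE_SRC (described later in this section) is OR-ed in here only so that the source variable is left untouched.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $octets = "abc\xFFdef";    # the octet 0xFF never occurs in UTF-8

my $check  = Encode::FB_CROAK() | Encode::LEAVE_SRC();
my $string = eval { decode("UTF-8", $octets, $check) };

if (defined $string) {
    print "decoded ", length($string), " characters\n";
}
else {
    # $@ carries the message that decode() died with.
    print "malformed input: $@";
}
```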

FB_QUIET

    CHECK = Encode::FB_QUIET

If CHECK is set to Encode::FB_QUIET, encoding and decoding immediately return the portion of the data that has been processed so far when an error occurs. The data argument is overwritten with everything after that point; that is, the unprocessed portion of the data. This is handy when you have to call decode repeatedly in the case where your source data may contain partial multi-byte character sequences (that is, you are reading with a fixed-width buffer). Here's some sample code to do exactly that:

    my($buffer, $string) = ("", "");
    while (read($fh, $buffer, 256, length($buffer))) {
        $string .= decode($encoding, $buffer, Encode::FB_QUIET);
        # $buffer now contains the unprocessed partial character
    }
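
The same idea can be exercised without a filehandle. This sketch simulates two fixed-size reads by splitting a made-up UTF-8 string in the middle of a two-octet character.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# UTF-8 octets for a four-character string whose last character
# (U+00E9) takes two octets; the split falls between those two.
my $octets = encode("UTF-8", "caf\x{e9}");
my @chunks = (substr($octets, 0, 4), substr($octets, 4));

my ($buffer, $string) = ("", "");
for my $chunk (@chunks) {
    $buffer .= $chunk;
    # decode() returns what it could process and leaves the
    # incomplete tail in $buffer for the next round.
    $string .= decode("UTF-8", $buffer, Encode::FB_QUIET());
}
```

One caveat of this pattern: octets that are genuinely malformed, rather than merely incomplete, stay in the buffer indefinitely, so real code would want a limit on how much unprocessed data it tolerates.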

FB_WARN

    CHECK = Encode::FB_WARN

This is the same as FB_QUIET above, except that instead of being silent on errors, it issues a warning. This is handy for when you are debugging.

FB_PERLQQ FB_HTMLCREF FB_XMLCREF

  • perlqq mode (CHECK = Encode::FB_PERLQQ)
  • HTML charref mode (CHECK = Encode::FB_HTMLCREF)
  • XML charref mode (CHECK = Encode::FB_XMLCREF)

For encodings that are implemented by the Encode::XS module, CHECK == Encode::FB_PERLQQ puts encode and decode into perlqq fallback mode.

When you decode, \xHH is inserted for a malformed character, where HH is the hex representation of the octet that could not be decoded to utf8. When you encode, \x{HHHH} will be inserted, where HHHH is the Unicode code point (in any number of hex digits) of the character that cannot be found in the character repertoire of the encoding.

The HTML/XML character reference modes are about the same. In place of \x{HHHH}, HTML uses &#NNN; where NNN is a decimal number, and XML uses &#xHHHH; where HHHH is the hexadecimal number.

In Encode 2.10 or later, LEAVE_SRC is also implied.
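
To compare the three fallback modes side by side, one might run a sketch such as this one, which encodes a made-up string containing a character that ASCII cannot represent:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $string = "caf\x{e9}";    # U+00E9 is outside ASCII

my %check = (
    perlqq => Encode::FB_PERLQQ(),
    html   => Encode::FB_HTMLCREF(),
    xml    => Encode::FB_XMLCREF(),
);

# Encode the same string once per mode and show each result.
my %result;
for my $mode (sort keys %check) {
    $result{$mode} = encode("ascii", $string, $check{$mode});
    print "$mode: $result{$mode}\n";
}
```

Because these modes imply LEAVE_SRC, $string is unchanged after each call.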

The bitmask

These modes are all actually set via a bitmask. Here is how the FB_XXX constants are laid out. You can import the FB_XXX constants via use Encode qw(:fallbacks), and you can import the generic bitmask constants via use Encode qw(:fallback_all).

                         FB_DEFAULT FB_CROAK FB_QUIET FB_WARN FB_PERLQQ
    DIE_ON_ERR    0x0001            X
    WARN_ON_ERR   0x0002                              X
    RETURN_ON_ERR 0x0004                     X        X
    LEAVE_SRC     0x0008                                      X
    PERLQQ        0x0100                                      X
    HTMLCREF      0x0200
    XMLCREF       0x0400

LEAVE_SRC

    Encode::LEAVE_SRC

If the Encode::LEAVE_SRC bit is not set but CHECK is set, then the source string to encode() or decode() will be overwritten in place. If you're not interested in this, then bitwise-OR it with the bitmask.
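
The effect on the source variable is easy to miss, so here is a small sketch with made-up data that decodes the same octets with and without LEAVE_SRC:

```perl
use strict;
use warnings;
use Encode qw(decode);

my $consumed  = "abc";
my $preserved = "abc";

# With a non-zero CHECK and no LEAVE_SRC, decode() consumes its source.
decode("UTF-8", $consumed, Encode::FB_QUIET());

# OR-ing in LEAVE_SRC keeps the source as it was.
decode("UTF-8", $preserved, Encode::FB_QUIET() | Encode::LEAVE_SRC());

printf "consumed: '%s', preserved: '%s'\n", $consumed, $preserved;
```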

coderef for CHECK

As of Encode 2.12, CHECK can also be a code reference which takes the ordinal value of the unmapped character as an argument and returns a string that represents the fallback character. For instance:

    $ascii = encode("ascii", $utf8, sub{ sprintf "<U+%04X>", shift });

Acts like FB_PERLQQ but U+XXXX is used instead of \x{XXXX}.

Defining Encodings

To define a new encoding, use:

    use Encode qw(define_encoding);
    define_encoding($object, CANONICAL_NAME [, alias...]);

CANONICAL_NAME will be associated with $object. The object should provide the interface described in Encode::Encoding. If more than two arguments are provided, additional arguments are considered aliases for $object.

See Encode::Encoding for details.
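
As a rough sketch of what such an object can look like, the following defines a toy encoding. The class name My::Encoding::Rot13 and the encoding name "x-rot13" are hypothetical, and the class implements only the encode and decode methods; a production encoding would follow the full interface in Encode::Encoding.

```perl
use strict;
use warnings;
use Encode qw(define_encoding encode decode find_encoding);

{
    # Hypothetical toy encoding: ROT13 over ASCII letters.
    package My::Encoding::Rot13;
    use parent qw(Encode::Encoding);

    sub encode {
        my ($self, $string, $check) = @_;
        (my $octets = $string) =~ tr/A-Za-z/N-ZA-Mn-za-m/;
        return $octets;
    }

    # ROT13 is its own inverse, so decoding reuses encode().
    sub decode { my $self = shift; return $self->encode(@_) }
}

# The inherited name() method reads the Name slot of the object.
my $obj = bless { Name => "x-rot13" }, "My::Encoding::Rot13";
define_encoding($obj, "x-rot13");

print encode("x-rot13", "Hello"), "\n";
```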

The UTF8 flag

Before the introduction of Unicode support in Perl, the eq operator just compared the strings represented by two scalars. Beginning with Perl 5.8, eq compares two strings with simultaneous consideration of the UTF8 flag. To explain why we made it so, I quote from page 402 of Programming Perl, 3rd ed.

  • Goal #1:

    Old byte-oriented programs should not spontaneously break on the old byte-oriented data they used to work on.

  • Goal #2:

    Old byte-oriented programs should magically start working on the new character-oriented data when appropriate.

  • Goal #3:

    Programs should run just as fast in the new character-oriented mode as in the old byte-oriented mode.

  • Goal #4:

    Perl should remain one language, rather than forking into a byte-oriented Perl and a character-oriented Perl.

When Programming Perl, 3rd ed. was written, not even Perl 5.6.0 had been born yet, and many features documented in the book remained unimplemented for a long time. Perl 5.8 corrected much of this, and the introduction of the UTF8 flag is one of them. You can think of there being two fundamentally different kinds of strings and string-operations in Perl: one a byte-oriented mode for when the internal UTF8 flag is off, and the other a character-oriented mode for when the internal UTF8 flag is on.

Here is how Encode handles the UTF8 flag.

  • When you encode, the resulting UTF8 flag is always off.

  • When you decode, the resulting UTF8 flag is on--unless you can unambiguously represent data. Here is what we mean by "unambiguously". After $utf8 = decode("foo", $octet),

      When $octet is...               The UTF8 flag in $utf8 is
      ---------------------------------------------------------
      In ASCII only (or EBCDIC only)  OFF
      In ISO-8859-1                   ON
      In any other Encoding           ON
      ---------------------------------------------------------

    As you see, there is one exception: in ASCII. That way you can assume Goal #1. And with Encode, Goal #2 is assumed but you still have to be careful in the cases mentioned in the CAVEAT paragraphs above.

    This UTF8 flag is not visible in Perl scripts, exactly for the same reason you cannot (or rather, you don't have to) see whether a scalar contains a string, an integer, or a floating-point number. But you can still peek and poke these if you will. See the next section.

Messing with Perl's Internals

The following API uses parts of Perl's internals in the current implementation. As such, they are efficient but may change in a future release.

is_utf8

    is_utf8(STRING [, CHECK])

[INTERNAL] Tests whether the UTF8 flag is turned on in the STRING. If CHECK is true, also checks whether STRING contains well-formed UTF-8. Returns true if successful, false otherwise.

As of Perl 5.8.1, utf8 also has the utf8::is_utf8 function.
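
For debugging, the flag can be inspected as in the sketch below, which uses made-up data and looks only at non-ASCII text:

```perl
use strict;
use warnings;
use Encode qw(encode decode is_utf8);

my $octets = "caf\xC3\xA9";               # UTF-8 octets, flag off
my $string = decode("UTF-8", $octets);    # character string

printf "decoded: flag %s\n", is_utf8($string) ? "on" : "off";

my $again = encode("UTF-8", $string);     # back to octets
printf "encoded: flag %s\n", is_utf8($again) ? "on" : "off";

# The character string and the octet string are different strings.
print "string and octets differ\n" if $string ne $octets;
```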

_utf8_on

    _utf8_on(STRING)

[INTERNAL] Turns the STRING's internal UTF8 flag on. The STRING is not checked for containing only well-formed UTF-8. Do not use this unless you know with absolute certainty that the STRING holds only well-formed UTF-8. Returns the previous state of the UTF8 flag (so please don't treat the return value as indicating success or failure), or undef if STRING is not a string.

NOTE: For security reasons, this function does not work on tainted values.

_utf8_off

    _utf8_off(STRING)

[INTERNAL] Turns the STRING's internal UTF8 flag off. Do not use frivolously. Returns the previous state of the UTF8 flag, or undef if STRING is not a string. Do not treat the return value as indicative of success or failure, because that isn't what it means: it is only the previous setting.

NOTE: For security reasons, this function does not work on tainted values.

UTF-8 vs. utf8 vs. UTF8

    ....We now view strings not as sequences of bytes, but as sequences
    of numbers in the range 0 .. 2**32-1 (or in the case of 64-bit
    computers, 0 .. 2**64-1) -- Programming Perl, 3rd ed.

That has historically been Perl's notion of UTF-8, as that is how UTF-8 was first conceived by Ken Thompson when he invented it. However, thanks to later revisions to the applicable standards, official UTF-8 is now rather stricter than that. For example, its range is much narrower (0 .. 0x10_FFFF to cover only 21 bits instead of 32 or 64 bits) and some sequences are not allowed, like those used in surrogate pairs, the 32 non-character code points 0xFDD0 .. 0xFDEF, the last two code points in any plane (0xXX_FFFE and 0xXX_FFFF), all non-shortest encodings, etc.

The former default in which Perl would always use a loose interpretation of UTF-8 has now been overruled:

    From: Larry Wall <larry@wall.org>
    Date: December 04, 2004 11:51:58 JST
    To: perl-unicode@perl.org
    Subject: Re: Make Encode.pm support the real UTF-8
    Message-Id: <20041204025158.GA28754@wall.org>

    On Fri, Dec 03, 2004 at 10:12:12PM +0000, Tim Bunce wrote:
    : I've no problem with 'utf8' being perl's unrestricted uft8 encoding,
    : but "UTF-8" is the name of the standard and should give the
    : corresponding behaviour.

    For what it's worth, that's how I've always kept them straight in my
    head.

    Also for what it's worth, Perl 6 will mostly default to strict but
    make it easy to switch back to lax.

    Larry

Got that? As of Perl 5.8.7, "UTF-8" means UTF-8 in its current sense, which is conservative and strict and security-conscious, whereas "utf8" means UTF-8 in its former sense, which was liberal and loose and lax. Encode version 2.10 or later thus groks this subtle but critically important distinction between "UTF-8" and "utf8".

    encode("utf8",  "\x{FFFF_FFFF}", 1); # okay
    encode("UTF-8", "\x{FFFF_FFFF}", 1); # croaks

In the Encode module, "UTF-8" is actually an alias for the canonical name "utf-8-strict". That hyphen between the "UTF" and the "8" is critical; without it, Encode goes "liberal" and (perhaps overly-)permissive:

    find_encoding("UTF-8")->name # is 'utf-8-strict'
    find_encoding("utf-8")->name # ditto. names are case insensitive
    find_encoding("utf_8")->name # ditto. "_" are treated as "-"
    find_encoding("UTF8")->name  # is 'utf8'.

Perl's internal UTF8 flag is called "UTF8", without a hyphen. It indicates whether a string is internally encoded as "utf8", also without a hyphen.
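
One way to observe the difference is to feed both encodings input that strict UTF-8 forbids. The sketch below uses the three octets of an encoded lone surrogate as a made-up test case and reports what each encoding does with it, without assuming the lax result in advance.

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);

# Octets that would encode the lone surrogate U+D800.
my $octets = "\xED\xA0\x80";
my $check  = Encode::FB_CROAK() | Encode::LEAVE_SRC();

for my $name ("UTF-8", "utf8") {
    my $canonical = find_encoding($name)->name;
    my $ok = eval { decode($name, $octets, $check); 1 };
    printf "%-5s (%s): %s\n", $name, $canonical,
        $ok ? "accepted" : "rejected";
}
```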

SEE ALSO

Encode::Encoding, Encode::Supported, Encode::PerlIO, encoding, perlebcdic, open, perlunicode, perluniintro, perlunifaq, perlunitut, utf8, the Perl Unicode Mailing List http://lists.perl.org/list/perl-unicode.html

MAINTAINER

This project was originated by the late Nick Ing-Simmons and later maintained by Dan Kogai <dankogai@cpan.org>. See AUTHORS for a full list of people involved. For any questions, send mail to <perl-unicode@perl.org> so that we can all share.

While Dan Kogai retains the copyright as a maintainer, credit should go to all those involved. See AUTHORS for a list of those who submitted code to the project.

COPYRIGHT

Copyright 2002-2012 Dan Kogai <dankogai@cpan.org>.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

 
English

NAME

English - use nice English (or awk) names for ugly punctuation variables

SYNOPSIS

    use English;
    use English qw( -no_match_vars ) ;  # Avoids regex performance penalty
                                        # in perl 5.16 and earlier
    ...
    if ($ERRNO =~ /denied/) { ... }

DESCRIPTION

This module provides aliases for the built-in variables whose names no one seems to like to read. Variables with side-effects which get triggered just by accessing them (like $0) will still be affected.

For those variables that have an awk version, both long and short English alternatives are provided. For example, the $/ variable can be referred to as either $RS or $INPUT_RECORD_SEPARATOR if you are using the English module.

See perlvar for a complete list of these.
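
A short sketch of the long names in use follows. The sample text is made up, and an in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;
use English qw( -no_match_vars );

# $PROGRAM_NAME is $0 and $PROCESS_ID is $$.
print "running $PROGRAM_NAME as process $PROCESS_ID\n";

# Read ";"-terminated records by localizing the input record
# separator under its English name; $OS_ERROR is $!.
my $text = "alpha;beta;gamma";
open(my $fh, "<", \$text) or die "can't open in-memory file: $OS_ERROR";
my @fields;
{
    local $INPUT_RECORD_SEPARATOR = ";";
    chomp(@fields = <$fh>);
}
close($fh);

print join(",", @fields), "\n";
```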

PERFORMANCE

NOTE: This was fixed in perl 5.20. Mentioning these three variables no longer makes a speed difference. This section still applies if your code is to run on perl 5.18 or earlier.

This module can provoke sizeable inefficiencies for regular expressions, due to unfortunate implementation details. If performance matters in your application and you don't need $PREMATCH, $MATCH, or $POSTMATCH, try doing:

    use English qw( -no_match_vars ) ;

It is especially important to do this in modules to avoid penalizing all applications which use them.

 
Env

NAME

Env - perl module that imports environment variables as scalars or arrays

SYNOPSIS

    use Env;
    use Env qw(PATH HOME TERM);
    use Env qw($SHELL @LD_LIBRARY_PATH);

DESCRIPTION

Perl maintains environment variables in a special hash named %ENV. When this access method is inconvenient, the Perl module Env allows environment variables to be treated as scalar or array variables.

The Env::import() function ties environment variables with suitable names to global Perl variables with the same names. By default it ties all existing environment variables (keys %ENV ) to scalars. If the import function receives arguments, it takes them to be a list of variables to tie; it's okay if they don't yet exist. The scalar type prefix '$' is inferred for any element of this list not prefixed by '$' or '@'. Arrays are implemented in terms of split and join, using $Config::Config{path_sep} as the delimiter.

After an environment variable is tied, merely use it like a normal variable. You may access its value

    @path = split(/:/, $PATH);
    print join("\n", @LD_LIBRARY_PATH), "\n";

or modify it

    $PATH .= ":.";
    push @LD_LIBRARY_PATH, $dir;

however you'd like. Bear in mind, however, that each access to a tied array variable requires splitting the environment variable's string anew.

The code:

    use Env qw(@PATH);
    push @PATH, '.';

is equivalent to:

    use Env qw(PATH);
    $PATH .= ":.";

except that if $ENV{PATH} started out empty, the second approach leaves it with the (odd) value ":.", but the first approach leaves it with ".".

To remove a tied environment variable from the environment, assign it the undefined value:

    undef $PATH;
    undef @LD_LIBRARY_PATH;
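
The pieces above can be combined into one runnable sketch. MY_APP_MODE and MY_APP_LIBS are hypothetical variable names, seeded in a BEGIN block so the example does not depend on the real environment.

```perl
use strict;
use warnings;
use Config;    # only used here to show the separator Env joins with

# Hypothetical variables, given known starting values.
BEGIN {
    $ENV{MY_APP_MODE} = "test";
    $ENV{MY_APP_LIBS} = "/opt/a";
}
use Env qw($MY_APP_MODE @MY_APP_LIBS);

$MY_APP_MODE = "debug";         # writes through to $ENV{MY_APP_MODE}
push @MY_APP_LIBS, "/opt/b";    # re-joined with $Config{path_sep}

my $mode_seen = $ENV{MY_APP_MODE};
print "mode: $mode_seen\n";
print "libs: $ENV{MY_APP_LIBS}\n";

undef $MY_APP_MODE;             # removes MY_APP_MODE from %ENV
```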

LIMITATIONS

On VMS systems, arrays tied to environment variables are read-only. Attempting to change anything will cause a warning.

AUTHOR

Chip Salzenberg <chip@fin.uucp> and Gregor N. Purdy <gregor@focusresearch.com>

 
Errno

NAME

Errno - System errno constants

SYNOPSIS

    use Errno qw(EINTR EIO :POSIX);

DESCRIPTION

Errno defines and conditionally exports all the error constants defined in your system errno.h include file. It has a single export tag, :POSIX, which will export all POSIX defined error numbers.

Errno also makes %! magic such that each element of %! has a non-zero value only if $! is set to that value. For example:

    use Errno;

    unless (open(FH, "/fangorn/spouse")) {
        if ($!{ENOENT}) {
            warn "Get a wife!\n";
        } else {
            warn "This path is barred: $!";
        }
    }

If a specified constant EFOO does not exist on the system, $!{EFOO} returns "". You may use exists $!{EFOO} to check whether the constant is available on the system.
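
Both checks can be tried in a small sketch like the following. The path is a made-up one that is assumed not to exist on the machine running the example.

```perl
use strict;
use warnings;
use Errno;    # supplies the magic behind %!

my $path = "/no/such/directory/for/this/example";    # hypothetical path
my $reason;
if (open(my $fh, "<", $path)) {
    close($fh);
    $reason = "opened";
}
elsif ($!{ENOENT}) {
    $reason = "missing";
}
else {
    $reason = "other error: $!";
}
print "$path: $reason\n";

# exists tells you whether this platform defines a constant at all.
print "ENOENT is known here\n"   if exists $!{ENOENT};
print "EFOO is not known here\n" unless exists $!{EFOO};
```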

CAVEATS

Importing a particular constant may not be very portable, because the import will fail on platforms that do not have that constant. A more portable way to set $! to a valid value is to use:

    if (exists &Errno::EFOO) {
        $! = &Errno::EFOO;
    }

AUTHOR

Graham Barr <gbarr@pobox.com>

COPYRIGHT

Copyright (c) 1997-8 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

 
Exporter

NAME

Exporter - Implements default import method for modules

SYNOPSIS

In module YourModule.pm:

    package YourModule;
    require Exporter;
    @ISA = qw(Exporter);
    @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request

or

    package YourModule;
    use Exporter 'import'; # gives you Exporter's import() method directly
    @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request

In other files which wish to use YourModule:

    use YourModule qw(frobnicate);      # import listed symbols
    frobnicate ($left, $right)          # calls YourModule::frobnicate

Take a look at Good Practices for some variants you may like to use in modern Perl code.

DESCRIPTION

The Exporter module implements an import method which allows a module to export functions and variables to its users' namespaces. Many modules use Exporter rather than implementing their own import method because Exporter provides a highly flexible interface, with an implementation optimised for the common case.

Perl automatically calls the import method when processing a use statement for a module. Modules and use are documented in perlfunc and perlmod. Understanding the concept of modules and how the use statement operates is important to understanding the Exporter.

How to Export

The arrays @EXPORT and @EXPORT_OK in a module hold lists of symbols that are going to be exported into the user's namespace by default, or which they can request to be exported, respectively. The symbols can represent functions, scalars, arrays, hashes, or typeglobs. The symbols must be given by full name with the exception that the ampersand in front of a function is optional, e.g.

    @EXPORT    = qw(afunc $scalar @array);   # afunc is a function
    @EXPORT_OK = qw(&bfunc %hash *typeglob); # explicit prefix on &bfunc

If you are only exporting function names it is recommended to omit the ampersand, as the implementation is faster this way.

Selecting What to Export

Do not export method names!

Do not export anything else by default without a good reason!

Exports pollute the namespace of the module user. If you must export try to use @EXPORT_OK in preference to @EXPORT and avoid short or common symbol names to reduce the risk of name clashes.

Generally anything not exported is still accessible from outside the module using the YourModule::item_name (or $blessed_ref->method ) syntax. By convention you can use a leading underscore on names to informally indicate that they are 'internal' and not for public use.

(It is actually possible to get private functions by saying:

    my $subref = sub { ... };
    $subref->(@args);            # Call it as a function
    $obj->$subref(@args);        # Use it as a method

However if you use them for methods it is up to you to figure out how to make inheritance work.)

As a general rule, if the module is trying to be object oriented then export nothing. If it's just a collection of functions then @EXPORT_OK anything but use @EXPORT with caution. For function and method names use barewords in preference to names prefixed with ampersands for the export lists.

Other module design guidelines can be found in perlmod.

How to Import

In other files which wish to use your module there are three basic ways for them to load your module and import its symbols:

  • use YourModule;

    This imports all the symbols from YourModule's @EXPORT into the namespace of the use statement.

  • use YourModule ();

    This causes perl to load your module but does not import any symbols.

  • use YourModule qw(...);

    This imports only the symbols listed by the caller into their namespace. All listed symbols must be in your @EXPORT or @EXPORT_OK , else an error occurs. The advanced export features of Exporter are accessed like this, but with list entries that are syntactically distinct from symbol names.

Unless you want to use its advanced features, this is probably all you need to know to use Exporter.
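
To see the basic mechanism end to end in a single file, one can inline a module as in this sketch. My::Util and its functions are hypothetical; the %INC entry is only there so that use does not go looking for a file on disk.

```perl
use strict;
use warnings;

BEGIN {
    # Hypothetical module, inlined so the example fits in one file.
    package My::Util;
    use Exporter 'import';
    our @EXPORT_OK = qw(munge frobnicate);

    sub munge      { return uc shift }
    sub frobnicate { return join "-", @_ }

    $INC{"My/Util.pm"} = __FILE__;    # mark the module as already loaded
}

use My::Util qw(munge);               # import only what is needed

print munge("abc"), "\n";

# Not imported, but still reachable by its full name.
print My::Util::frobnicate("a", "b"), "\n";
```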

Advanced Features

Specialised Import Lists

If any of the entries in an import list begins with !, : or / then the list is treated as a series of specifications which either add to or delete from the list of names to import. They are processed left to right. Specifications are in the form:

    [!]name         This name only
    [!]:DEFAULT     All names in @EXPORT
    [!]:tag         All names in $EXPORT_TAGS{tag} anonymous list
    [!]/pattern/    All names in @EXPORT and @EXPORT_OK which match

A leading ! indicates that matching names should be deleted from the list of names to import. If the first specification is a deletion it is treated as though preceded by :DEFAULT. If you just want to import extra names in addition to the default set you will still need to include :DEFAULT explicitly.

e.g., Module.pm defines:

    @EXPORT      = qw(A1 A2 A3 A4 A5);
    @EXPORT_OK   = qw(B1 B2 B3 B4 B5);
    %EXPORT_TAGS = (T1 => [qw(A1 A2 B1 B2)], T2 => [qw(A1 A2 B3 B4)]);

Note that you cannot use tags in @EXPORT or @EXPORT_OK.

Names in EXPORT_TAGS must also appear in @EXPORT or @EXPORT_OK.

An application using Module can say something like:

    use Module qw(:DEFAULT :T2 !B3 A3);

Other examples include:

    use Socket qw(!/^[AP]F_/ !SOMAXCONN !SOL_SOCKET);
    use POSIX  qw(:errno_h :termios_h !TCSADRAIN !/^EXIT/);

Remember that most patterns (using //) will need to be anchored with a leading ^, e.g., /^EXIT/ rather than /EXIT/ .

You can say BEGIN { $Exporter::Verbose=1 } to see how the specifications are being processed and what is actually being imported into modules.
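
The Module.pm example above can be made runnable as follows. My::Module is a hypothetical stand-in whose exported subs simply return their own names, which makes it easy to list what ended up in the caller's namespace.

```perl
use strict;
use warnings;

BEGIN {
    # Hypothetical module mirroring the definitions shown above.
    package My::Module;
    use Exporter 'import';
    our @EXPORT      = qw(A1 A2 A3 A4 A5);
    our @EXPORT_OK   = qw(B1 B2 B3 B4 B5);
    our %EXPORT_TAGS = (T1 => [qw(A1 A2 B1 B2)], T2 => [qw(A1 A2 B3 B4)]);

    # Give every exportable name a trivial sub returning its own name.
    no strict 'refs';
    for my $name (@EXPORT, @EXPORT_OK) {
        *{$name} = sub { $name };
    }
    $INC{"My/Module.pm"} = __FILE__;
}

# Defaults, plus tag T2, minus B3, plus A3 (already among the defaults).
use My::Module qw(:DEFAULT :T2 !B3 A3);

# Collect the names that are now callable from package main.
my @imported = sort grep { main->can($_) } map { ("A$_", "B$_") } 1 .. 5;
print "@imported\n";
```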

Exporting Without Using Exporter's import Method

Exporter has a special method, 'export_to_level' which is used in situations where you can't directly call Exporter's import method. The export_to_level method looks like:

    MyPackage->export_to_level(
        $where_to_export, $package, @what_to_export
    );

where $where_to_export is an integer telling how far up the calling stack to export your symbols, and @what_to_export is an array telling what symbols *to* export (usually this is @_ ). The $package argument is currently unused.

For example, suppose that you have a module, A, which already has an import function:

    package A;

    @ISA = qw(Exporter);
    @EXPORT_OK = qw ($b);

    sub import
    {
        $A::b = 1;     # not a very useful import method
    }

and you want to export symbol $A::b back to the module that called package A. Since Exporter relies on the import method to work, via inheritance, as it stands Exporter::import() will never get called. Instead, say the following:

    package A;
    @ISA = qw(Exporter);
    @EXPORT_OK = qw ($b);

    sub import
    {
        $A::b = 1;
        A->export_to_level(1, @_);
    }

This will export the symbols one level 'above' the current package, i.e., to the program or module that used package A.

Note: Be careful not to modify @_ at all before you call export_to_level, or people using your package will get inexplicable results!
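
A runnable variation on the same idea is sketched below. My::Counter and its symbols are hypothetical; the custom import does some setup of its own and then hands @_ to export_to_level unchanged.

```perl
use strict;
use warnings;

BEGIN {
    # Hypothetical module with its own import method.
    package My::Counter;
    require Exporter;
    our @ISA       = qw(Exporter);
    our @EXPORT_OK = qw($count bump);
    our $count     = 0;

    sub bump { return ++$count }

    sub import {
        $count = 10;                            # custom setup work
        My::Counter->export_to_level(1, @_);    # then export to our caller
    }
    $INC{"My/Counter.pm"} = __FILE__;
}

use My::Counter qw($count bump);

my $after_bump = bump();
print "count is $count\n";
```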

Exporting Without Inheriting from Exporter

By including Exporter in your @ISA you inherit Exporter's import() method but you also inherit several other helper methods which you probably don't want. To avoid this you can do

    package YourModule;
    use Exporter qw( import );

which will export Exporter's own import() method into YourModule. Everything will work as before but you won't need to include Exporter in @YourModule::ISA.

Note: This feature was introduced in version 5.57 of Exporter, released with perl 5.8.3.

Module Version Checking

The Exporter module will convert an attempt to import a number from a module into a call to $module_name->VERSION($value) . This can be used to validate that the version of the module being used is greater than or equal to the required version.

For historical reasons, Exporter supplies a require_version method that simply delegates to VERSION . Originally, before UNIVERSAL::VERSION existed, Exporter would call require_version .

Since the UNIVERSAL::VERSION method treats the $VERSION number as a simple numeric value it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that you use numbers with at least two decimal places, e.g., 1.09.
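
The 1.10 versus 1.9 pitfall can be checked directly with the VERSION method. My::Versioned is a hypothetical package used only for this sketch.

```perl
use strict;
use warnings;

{
    package My::Versioned;    # hypothetical module
    our $VERSION = '1.10';
}

# VERSION() dies when the module's version is lower than requested.
# 1.10 is numerically 1.1, so it does not satisfy a request for 1.9.
my $need_19  = eval { My::Versioned->VERSION('1.9');  1 } ? "ok" : "too old";
my $need_109 = eval { My::Versioned->VERSION('1.09'); 1 } ? "ok" : "too old";

print "need 1.9: $need_19, need 1.09: $need_109\n";
```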

Managing Unknown Symbols

In some situations you may want to prevent certain symbols from being exported. Typically this applies to extensions which have functions or constants that may not exist on some systems.

The names of any symbols that cannot be exported should be listed in the @EXPORT_FAIL array.

If a module attempts to import any of these symbols the Exporter will give the module an opportunity to handle the situation before generating an error. The Exporter will call an export_fail method with a list of the failed symbols:

    @failed_symbols = $module_name->export_fail(@failed_symbols);

If the export_fail method returns an empty list then no error is recorded and all the requested symbols are exported. If the returned list is not empty then an error is generated for each symbol and the export fails. The Exporter provides a default export_fail method which simply returns the list unchanged.

Uses for the export_fail method include giving better error messages for some symbols and performing lazy architectural checks (put more symbols into @EXPORT_FAIL by default and then take them out if someone actually tries to use them and an expensive check shows that they are usable on that platform).
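As a rough sketch of the lazy-check idea, the hypothetical module My::Sys below lists fancy in @EXPORT_FAIL and overrides export_fail to decide at import time whether the symbol is usable. The _usable helper is an assumption standing in for a real platform check; here it always succeeds.

```perl
use strict;
use warnings;

BEGIN {
    package My::Sys;                        # hypothetical module
    require Exporter;
    our @ISA         = qw( Exporter );
    our @EXPORT_OK   = qw( portable fancy );
    our @EXPORT_FAIL = qw( fancy );         # checked lazily in export_fail
    our @checked;                           # records which symbols were checked

    sub portable { 'portable' }
    sub fancy    { 'fancy' }

    # Stand-in for an expensive platform check; here it always succeeds.
    sub _usable { 1 }

    # Return only the symbols that really cannot be exported.
    sub export_fail {
        my ( $class, @symbols ) = @_;
        push @checked, @symbols;
        return grep { !_usable($_) } @symbols;
    }

    $INC{'My/Sys.pm'} = __FILE__;           # mark the inline package as loaded
}

use My::Sys qw( portable fancy );

my $result = join ',', portable(), fancy();
print "$result (checked: @My::Sys::checked)\n";
```

Because export_fail returns an empty list, no error is recorded and both symbols are imported; only fancy is passed to export_fail.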

Tag Handling Utility Functions

Since the symbols listed within %EXPORT_TAGS must also appear in either @EXPORT or @EXPORT_OK , two utility functions are provided which allow you to easily add tagged sets of symbols to @EXPORT or @EXPORT_OK :

    %EXPORT_TAGS = (foo => [qw(aa bb cc)], bar => [qw(aa cc dd)]);

    Exporter::export_tags('foo');     # add aa, bb and cc to @EXPORT
    Exporter::export_ok_tags('bar');  # add aa, cc and dd to @EXPORT_OK

Any names which are not tags are added to @EXPORT or @EXPORT_OK unchanged but will trigger a warning (with -w) to avoid misspelt tag names being silently added to @EXPORT or @EXPORT_OK. Future versions may make this a fatal error.
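A small self-contained sketch, assuming a hypothetical package My::Tags, shows the effect of the two utility functions on @EXPORT and @EXPORT_OK:

```perl
use strict;
use warnings;

package My::Tags;                   # hypothetical module
require Exporter;
our @ISA = qw( Exporter );
our ( @EXPORT, @EXPORT_OK );
our %EXPORT_TAGS = ( foo => [qw( aa bb cc )], bar => [qw( aa cc dd )] );

Exporter::export_tags('foo');       # pushes aa, bb and cc onto @EXPORT
Exporter::export_ok_tags('bar');    # pushes aa, cc and dd onto @EXPORT_OK

print "EXPORT:    @EXPORT\n";
print "EXPORT_OK: @EXPORT_OK\n";
```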

Generating Combined Tags

If several symbol categories exist in %EXPORT_TAGS , it's usually useful to create the utility ":all" to simplify "use" statements.

The simplest way to do this is:

    %EXPORT_TAGS = (foo => [qw(aa bb cc)], bar => [qw(aa cc dd)]);

    # add all the other ":class" tags to the ":all" class,
    # deleting duplicates
    {
      my %seen;

      push @{$EXPORT_TAGS{all}},
        grep {!$seen{$_}++} @{$EXPORT_TAGS{$_}} foreach keys %EXPORT_TAGS;
    }

CGI.pm creates an ":all" tag which contains some (but not really all) of its categories. That could be done with one small change:

    # add some of the other ":class" tags to the ":all" class,
    # deleting duplicates
    {
      my %seen;

      push @{$EXPORT_TAGS{all}},
        grep {!$seen{$_}++} @{$EXPORT_TAGS{$_}}
          foreach qw/html2 html3 netscape form cgi internal/;
    }

Note that the tag names in %EXPORT_TAGS don't have the leading ':'.
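To check what the first snippet produces, here is a standalone version that builds the "all" tag and prints it. The final sort is only there to make the printed order predictable, since the order of keys %EXPORT_TAGS is not.

```perl
use strict;
use warnings;

our %EXPORT_TAGS = ( foo => [qw( aa bb cc )], bar => [qw( aa cc dd )] );

{
    my %seen;

    # The key list is computed before the loop starts, so the new
    # "all" entry is never visited by the loop itself.
    push @{ $EXPORT_TAGS{all} },
        grep { !$seen{$_}++ } @{ $EXPORT_TAGS{$_} }
        foreach keys %EXPORT_TAGS;
}

my @all = sort @{ $EXPORT_TAGS{all} };
print "all: @all\n";
```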

AUTOLOADed Constants

Many modules make use of AUTOLOADing for constant subroutines to avoid having to compile and waste memory on rarely used values (see perlsub for details on constant subroutines). Calls to such constant subroutines are not optimized away at compile time because they can't be checked at compile time for constancy.

Even if a prototype is available at compile time, the body of the subroutine is not (it hasn't been AUTOLOADed yet). perl needs to examine both the () prototype and the body of a subroutine at compile time to detect that it can safely replace calls to that subroutine with the constant value.

A workaround for this is to call the constants once in a BEGIN block:

    package My ;

    use Socket ;

    foo( SO_LINGER );  ## SO_LINGER NOT optimized away; called at runtime
    BEGIN { SO_LINGER }
    foo( SO_LINGER );  ## SO_LINGER optimized away at compile time.

This forces the AUTOLOAD for SO_LINGER to take place before SO_LINGER is encountered later in My package.

If you are writing a package that AUTOLOADs, consider forcing an AUTOLOAD for any constants explicitly imported by other packages or which are usually used when your package is used.

Good Practices

Declaring @EXPORT_OK and Friends

When using Exporter with the standard strict and warnings pragmas, the our keyword is needed to declare the package variables @EXPORT_OK , @EXPORT , @ISA , etc.

    our @ISA = qw(Exporter);
    our @EXPORT_OK = qw(munge frobnicate);

If backward compatibility for Perls under 5.6 is important, one must write instead a use vars statement.

    use vars qw(@ISA @EXPORT_OK);
    @ISA = qw(Exporter);
    @EXPORT_OK = qw(munge frobnicate);

Playing Safe

There are some caveats with the use of runtime statements like require Exporter and the assignment to package variables, which can be very subtle for the unaware programmer. This may happen for instance with mutually recursive modules, which are affected by the time the relevant constructions are executed.

The ideal (but a bit ugly) way to never have to think about that is to use BEGIN blocks. So the first part of the SYNOPSIS code could be rewritten as:

    package YourModule;

    use strict;
    use warnings;

    our (@ISA, @EXPORT_OK);
    BEGIN {
      require Exporter;
      @ISA = qw(Exporter);
      @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request
    }

The BEGIN block ensures that the loading of Exporter.pm and the assignments to @ISA and @EXPORT_OK happen immediately, leaving no room for something to go awry or just plain wrong.

With respect to loading Exporter and inheriting, there are alternatives with the use of modules like base and parent .

    use base qw( Exporter );
    # or
    use parent qw( Exporter );

Either of these statements is a nice replacement for BEGIN { require Exporter; @ISA = qw(Exporter); } with the same compile-time effect. The basic difference is that base code interacts with declared fields while parent is a streamlined version of the older base code to just establish the IS-A relationship.

For more details, see the documentation and code of base and parent.

Another thorough remedy to that runtime vs. compile-time trap is to use Exporter::Easy, which is a wrapper of Exporter that allows all boilerplate code at a single gulp in the use statement.

    use Exporter::Easy (
      OK => [ qw(munge frobnicate) ],
    );
    # @ISA setup is automatic
    # all assignments happen at compile time

What Not to Export

You have been warned already in Selecting What to Export to not export:

  • method names (because you don't need to and that's likely to not do what you want),

  • anything by default (because you don't want to surprise your users... badly)

  • anything you don't need to (because less is more)

There's one more item to add to this list. Do not export variable names. Just because Exporter lets you do that, it does not mean you should.

    @EXPORT_OK = qw( $svar @avar %hvar ); # DON'T!

Exporting variables is not a good idea. They can change under the hood, provoking horrible effects at-a-distance, that are too hard to track and to fix. Trust me: they are not worth it.

To provide the capability to set/get class-wide settings, it is best to provide accessors as subroutines or class methods instead.
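One way to follow this advice is sketched below, assuming a hypothetical package My::Settings with a single class-wide verbose flag. The setting lives in a lexical variable that callers can only reach through the accessor.

```perl
use strict;
use warnings;

{
    package My::Settings;           # hypothetical module
    my $verbose = 0;                # private to this block, never exported

    # Class method: returns the setting, updating it first if given a value.
    sub verbose {
        my $class = shift;
        $verbose = shift if @_;
        return $verbose;
    }
}

my $before = My::Settings->verbose;
My::Settings->verbose(1);
my $after = My::Settings->verbose;
print "verbose: $before -> $after\n";
```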

SEE ALSO

Exporter is definitely not the only module with symbol-exporting capabilities. On CPAN, you may find a bunch of them. Some are lighter. Some provide improved APIs and features. Pick the one that fits your needs. The following is a sample list of such modules.

    Exporter::Easy
    Exporter::Lite
    Exporter::Renaming
    Exporter::Tidy
    Sub::Exporter / Sub::Installer
    Perl6::Export / Perl6::Export::Attrs

LICENSE

This library is free software. You can redistribute it and/or modify it under the same terms as Perl itself.

 
Fatal

NAME

Fatal - Replace functions with equivalents which succeed or die

SYNOPSIS

    use Fatal qw(open close);

    open(my $fh, "<", $filename);  # No need to check errors!

    use File::Copy qw(move);
    use Fatal qw(move);

    move($file1, $file2); # No need to check errors!

    sub juggle { . . . }
    Fatal->import('juggle');

BEST PRACTICE

Fatal has been obsoleted by the new autodie pragma. Please use autodie in preference to Fatal . autodie supports lexical scoping, throws real exception objects, and provides much nicer error messages.

The use of :void with Fatal is discouraged.

DESCRIPTION

Fatal provides a way to conveniently replace functions which normally return a false value when they fail with equivalents which raise exceptions if they are not successful. This lets you use these functions without having to test their return values explicitly on each call. Exceptions can be caught using eval{}. See perlfunc and perlvar for details.

The do-or-die equivalents are set up simply by calling Fatal's import routine, passing it the names of the functions to be replaced. You may wrap both user-defined functions and overridable CORE operators (except exec, system, print, or any other built-in that cannot be expressed via prototypes) in this way.
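A minimal sketch of catching such an exception with eval, assuming the path below does not exist on the machine running it. The exact wording of the error message is not relied upon.

```perl
use strict;
use warnings;
use Fatal qw( open );

my $missing = '/no/such/dir/no-such-file';   # assumed not to exist

my $opened = eval {
    open( my $fh, '<', $missing );   # dies instead of returning false
    1;
};
my $error = $@;

print "open failed as expected: $error" unless $opened;
```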

If the symbol :void appears in the import list, then functions named later in that import list raise an exception only when these are called in void context--that is, when their return values are ignored. For example

    use Fatal qw/:void open close/;

    # properly checked, so no exception raised on error
    if (not open(my $fh, '<', '/bogotic')) {
        warn "Can't open /bogotic: $!";
    }

    # not checked, so error raises an exception
    close FH;

The use of :void is discouraged, as it can result in exceptions not being thrown if you accidentally call a method without void context. Use autodie instead if you need to be able to disable autodying/Fatal behaviour for a small block of code.

DIAGNOSTICS

  • Bad subroutine name for Fatal: %s

    You've called Fatal with an argument that doesn't look like a subroutine name, nor a switch that this version of Fatal understands.

  • %s is not a Perl subroutine

    You've asked Fatal to try and replace a subroutine which does not exist, or has not yet been defined.

  • %s is neither a builtin, nor a Perl subroutine

    You've asked Fatal to replace a subroutine, but it's not a Perl built-in, and Fatal couldn't find it as a regular subroutine. It either doesn't exist or has not yet been defined.

  • Cannot make the non-overridable %s fatal

    You've tried to use Fatal on a Perl built-in that can't be overridden, such as print or system, which means that Fatal can't help you, although some other modules might. See the SEE ALSO section of this documentation.

  • Internal error: %s

    You've found a bug in Fatal . Please report it using the perlbug command.

BUGS

Fatal clobbers the context in which a function is called and always makes it a scalar context, except when the :void tag is used. This problem does not exist in autodie.

"Used only once" warnings can be generated when autodie or Fatal is used with package filehandles (eg, FILE ). It's strongly recommended you use scalar filehandles instead.

AUTHOR

Original module by Lionel Cons (CERN).

Prototype updates by Ilya Zakharevich <ilya@math.ohio-state.edu>.

autodie support, bugfixes, extended diagnostics, system support, and major overhauling by Paul Fenwick <pjf@perltraining.com.au>

LICENSE

This module is free software; you may distribute it under the same terms as Perl itself.

SEE ALSO

autodie for a nicer way to use lexical Fatal.

IPC::System::Simple for a similar idea for calls to system() and backticks.

 
Fcntl

NAME

Fcntl - load the C Fcntl.h defines

SYNOPSIS

    use Fcntl;
    use Fcntl qw(:DEFAULT :flock);

DESCRIPTION

This module is just a translation of the C fcntl.h file. Unlike the old mechanism of requiring a translated fcntl.ph file, this uses the h2xs program (see the Perl source distribution) and your native C compiler. This means that it is far more likely to get the numbers right.

NOTE

Only #define symbols get translated; you must still correctly pack up your own arguments to pass as args for locking functions, etc.

EXPORTED SYMBOLS

By default your system's F_* and O_* constants (eg, F_DUPFD and O_CREAT) and the FD_CLOEXEC constant are exported into your namespace.

You can request that the flock() constants (LOCK_SH, LOCK_EX, LOCK_NB and LOCK_UN) be provided by using the tag :flock . See Exporter.

You can request that the old constants (FAPPEND, FASYNC, FCREAT, FDEFER, FEXCL, FNDELAY, FNONBLOCK, FSYNC, FTRUNC) be provided for compatibility reasons by using the tag :Fcompat . For new applications the newer versions of these constants are suggested (O_APPEND, O_ASYNC, O_CREAT, O_DEFER, O_EXCL, O_NDELAY, O_NONBLOCK, O_SYNC, O_TRUNC).

For ease of use, the SEEK_* constants (for seek() and sysseek(), e.g. SEEK_END) and the S_I* constants (for chmod() and stat()) are also available for import. They can be imported either separately or using the tags :seek and :mode.
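The export tags described above can be combined in one use statement. The sketch below is only an illustration: it creates a scratch file in a temporary directory (the file name demo.txt is arbitrary), then uses constants from :DEFAULT, :flock, :seek and :mode.

```perl
use strict;
use warnings;
use Fcntl qw( :DEFAULT :flock :seek :mode );
use File::Temp qw( tempdir );

my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/demo.txt";

# O_* flags from :DEFAULT: create the file, failing if it already exists.
sysopen( my $fh, $path, O_RDWR | O_CREAT | O_EXCL, 0600 )
    or die "sysopen $path: $!";

# LOCK_* flags from :flock: take an exclusive lock without blocking.
flock( $fh, LOCK_EX | LOCK_NB ) or die "flock $path: $!";

print {$fh} "hello\n";

# SEEK_* constants from :seek: rewind to the start of the file.
seek( $fh, 0, SEEK_SET ) or die "seek $path: $!";
my $line = <$fh>;

# S_I* helpers from :mode: inspect the mode field returned by stat.
my $mode       = ( stat $fh )[2];
my $is_regular = S_ISREG($mode) ? 1 : 0;

flock( $fh, LOCK_UN ) or die "unlock $path: $!";
close($fh) or die "close $path: $!";

print "read back: $line";
print "regular file: $is_regular\n";
```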

Please refer to your native fcntl(2), open(2), fseek(3), lseek(2) (equal to Perl's seek() and sysseek(), respectively), and chmod(2) documentation to see what constants are implemented in your system.

See perlopentut to learn about the uses of the O_* constants with sysopen().

See seek and sysseek about the SEEK_* constants.

See stat about the S_I* constants.

 
FileCache

NAME

FileCache - keep more files open than the system permits

SYNOPSIS

    no strict 'refs';

    use FileCache;
    # or
    use FileCache maxopen => 16;

    cacheout $mode, $path;
    # or
    cacheout $path;
    print $path @data;

    $fh = cacheout $mode, $path;
    # or
    $fh = cacheout $path;
    print $fh @data;

DESCRIPTION

The cacheout function will make sure that there's a filehandle open for reading or writing available as the pathname you give it. It automatically closes and re-opens files if you exceed your system's maximum number of file descriptors, or the suggested maximum maxopen.

  • cacheout EXPR

    The 1-argument form of cacheout will open a file for writing ('>') on its first use, and appending ('>>') thereafter.

    Returns EXPR on success for convenience. You may neglect the return value and manipulate EXPR as the filehandle directly if you prefer.

  • cacheout MODE, EXPR

    The 2-argument form of cacheout will use the supplied mode for the initial and subsequent openings. Most valid modes for 3-argument open are supported, namely: '>', '+>', '<', '+<', '>>', '|-' and '-|'.

    To pass supplemental arguments to a program opened with '|-' or '-|', append them to the command string as you would with system EXPR.

    Returns EXPR on success for convenience. You may neglect the return value and manipulate EXPR as the filehandle directly if you prefer.

CAVEATS

While it is permissible to close a FileCache managed file, do not do so if you are calling FileCache::cacheout from a package other than the one into which it was imported, or with another module which overrides close. If you must, use FileCache::cacheout_close.

Although FileCache can be used with piped opens ('-|' or '|-'), doing so is strongly discouraged. If FileCache finds it necessary to close and then reopen a pipe, the command at the far end of the pipe will be reexecuted - the results of performing IO on FileCache'd pipes are unlikely to be what you expect. The ability to use FileCache on pipes may be removed in a future release.

FileCache does not store the current file offset if it finds it necessary to close a file. When the file is reopened, the offset will be as specified by the original open file mode. This could be construed to be a bug.

The module functionality relies on symbolic references, so things will break under 'use strict' unless 'no strict "refs"' is also specified.

BUGS

sys/param.h lies with its NOFILE define on some systems, so you may have to set maxopen yourself.

 
FileHandle

NAME

FileHandle - supply object methods for filehandles

SYNOPSIS

    use FileHandle;

    $fh = FileHandle->new;
    if ($fh->open("< file")) {
        print <$fh>;
        $fh->close;
    }

    $fh = FileHandle->new("> FOO");
    if (defined $fh) {
        print $fh "bar\n";
        $fh->close;
    }

    $fh = FileHandle->new("file", "r");
    if (defined $fh) {
        print <$fh>;
        undef $fh;       # automatically closes the file
    }

    $fh = FileHandle->new("file", O_WRONLY|O_APPEND);
    if (defined $fh) {
        print $fh "corge\n";
        undef $fh;       # automatically closes the file
    }

    $pos = $fh->getpos;
    $fh->setpos($pos);

    $fh->setvbuf($buffer_var, _IOLBF, 1024);

    ($readfh, $writefh) = FileHandle::pipe;

    autoflush STDOUT 1;

DESCRIPTION

NOTE: This class is now a front-end to the IO::* classes.

FileHandle::new creates a FileHandle , which is a reference to a newly created symbol (see the Symbol package). If it receives any parameters, they are passed to FileHandle::open ; if the open fails, the FileHandle object is destroyed. Otherwise, it is returned to the caller.

FileHandle::new_from_fd creates a FileHandle like new does. It requires two parameters, which are passed to FileHandle::fdopen ; if the fdopen fails, the FileHandle object is destroyed. Otherwise, it is returned to the caller.

FileHandle::open accepts one parameter or two. With one parameter, it is just a front end for the built-in open function. With two parameters, the first parameter is a filename that may include whitespace or other special characters, and the second parameter is the open mode, optionally followed by a file permission value.

If FileHandle::open receives a Perl mode string (">", "+<", etc.) or a POSIX fopen() mode string ("w", "r+", etc.), it uses the basic Perl open operator.

If FileHandle::open is given a numeric mode, it passes that mode and the optional permissions value to the Perl sysopen operator. For convenience, FileHandle::import tries to import the O_XXX constants from the Fcntl module. If dynamic loading is not available, this may fail, but the rest of FileHandle will still work.
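The two kinds of mode argument can be tried side by side. This is a small sketch under the assumption that a scratch file in a temporary directory is acceptable; the file name demo.txt is arbitrary, and the O_* constants are imported explicitly from Fcntl rather than relying on FileHandle's import.

```perl
use strict;
use warnings;
use FileHandle;
use Fcntl qw( O_WRONLY O_APPEND );
use File::Temp qw( tempdir );

my $path = tempdir( CLEANUP => 1 ) . '/demo.txt';

# POSIX fopen()-style mode string: handled by Perl's open.
my $out = FileHandle->new( $path, 'w' )
    or die "Cannot create $path: $!";
$out->print("first\n");
$out->close;

# Numeric mode: handled by sysopen.
my $app = FileHandle->new( $path, O_WRONLY | O_APPEND )
    or die "Cannot append to $path: $!";
$app->print("second\n");
$app->close;

# Read everything back with the object methods.
my $in = FileHandle->new( $path, 'r' )
    or die "Cannot read $path: $!";
my @lines = $in->getlines;
$in->close;

print @lines;
```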

FileHandle::fdopen is like open except that its first parameter is not a filename but rather a file handle name, a FileHandle object, or a file descriptor number.

If the C functions fgetpos() and fsetpos() are available, then FileHandle::getpos returns an opaque value that represents the current position of the FileHandle, and FileHandle::setpos uses that value to return to a previously visited position.

If the C function setvbuf() is available, then FileHandle::setvbuf sets the buffering policy for the FileHandle. The calling sequence for the Perl function is the same as its C counterpart, including the macros _IOFBF , _IOLBF , and _IONBF , except that the buffer parameter specifies a scalar variable to use as a buffer. WARNING: A variable used as a buffer by FileHandle::setvbuf must not be modified in any way until the FileHandle is closed or until FileHandle::setvbuf is called again, or memory corruption may result!

See perlfunc for complete descriptions of each of the following supported FileHandle methods, which are just front ends for the corresponding built-in functions:

    close
    fileno
    getc
    gets
    eof
    clearerr
    seek
    tell

See perlvar for complete descriptions of each of the following supported FileHandle methods:

    autoflush
    output_field_separator
    output_record_separator
    input_record_separator
    input_line_number
    format_page_number
    format_lines_per_page
    format_lines_left
    format_name
    format_top_name
    format_line_break_characters
    format_formfeed

Furthermore, for doing normal I/O you might need these:

  • $fh->print

    See print.

  • $fh->printf

    See printf.

  • $fh->getline

    This works like <$fh> described in I/O Operators in perlop except that it's more readable and can be safely called in a list context but still returns just one line.

  • $fh->getlines

    This works like <$fh> when called in a list context to read all the remaining lines in a file, except that it's more readable. It will also croak() if accidentally called in a scalar context.

There are many other functions available since FileHandle is descended from IO::File, IO::Seekable, and IO::Handle. Please see those respective pages for documentation on more functions.

SEE ALSO

The IO extension, perlfunc, I/O Operators in perlop.

 
FindBin

NAME

FindBin - Locate directory of original perl script

SYNOPSIS

    use FindBin;
    use lib "$FindBin::Bin/../lib";

    # or

    use FindBin qw($Bin);
    use lib "$Bin/../lib";

DESCRIPTION

Locates the full path to the script bin directory to allow the use of paths relative to the bin directory.

This allows a user to set up a directory tree for some software with directories <root>/bin and <root>/lib, and then the above example will allow the use of modules in the lib directory without knowing where the software tree is installed.

If perl is invoked using the -e option or the perl script is read from STDIN then FindBin sets both $Bin and $RealBin to the current directory.

EXPORTABLE VARIABLES

    $Bin         - path to bin directory from where script was invoked
    $Script      - basename of script from which perl was invoked
    $RealBin     - $Bin with all links resolved
    $RealScript  - $Script with all links resolved
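As a short illustration of the exportable variables, the sketch below prints them for the running script and builds a path to a sibling lib directory. Using File::Spec for the path is a choice made here for portability, not something FindBin requires.

```perl
use strict;
use warnings;
use FindBin qw( $Bin $RealBin $Script );
use File::Spec;

# Build the path to a sibling lib directory without hard-coding the
# installation root.
my $lib = File::Spec->catdir( $Bin, File::Spec->updir, 'lib' );

print "script:   $Script\n";
print "bin dir:  $Bin\n";
print "real bin: $RealBin\n";
print "lib dir:  $lib\n";
```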

KNOWN ISSUES

If there are two modules using FindBin from different directories under the same interpreter, this won't work. Since FindBin uses a BEGIN block, it'll be executed only once, and only the first caller will get it right. This is a problem under mod_perl and other persistent Perl environments, where you shouldn't use this module. It also means that you should avoid using FindBin in modules that you plan to put on CPAN. The way to make sure that FindBin works is to call the again function:

    use FindBin;
    FindBin::again(); # or FindBin->again;

In former versions of FindBin there was no again function. The workaround was to force the BEGIN block to be executed again:

    delete $INC{'FindBin.pm'};
    require FindBin;

AUTHORS

FindBin is supported as part of the core perl distribution. Please send bug reports to <perlbug@perl.org> using the perlbug program included with perl.

Graham Barr <gbarr@pobox.com> Nick Ing-Simmons <nik@tiuk.ti.com>

COPYRIGHT

Copyright (c) 1995 Graham Barr & Nick Ing-Simmons. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

 
IO

NAME

IO - load various IO modules

SYNOPSIS

    use IO qw(Handle File);  # loads IO modules, here IO::Handle, IO::File
    use IO;                  # DEPRECATED

DESCRIPTION

IO provides a simple mechanism to load several of the IO modules in one go. The IO modules belonging to the core are:

    IO::Handle
    IO::Seekable
    IO::File
    IO::Pipe
    IO::Socket
    IO::Dir
    IO::Select
    IO::Poll

Some other IO modules don't belong to the perl core but can be loaded as well if they have been installed from CPAN. You can discover which ones exist by searching for "^IO::" on http://search.cpan.org.

For more information on any of these modules, please see its respective documentation.

DEPRECATED

    use IO;                # loads all the modules listed below

The loaded modules are IO::Handle, IO::Seekable, IO::File, IO::Pipe, IO::Socket, IO::Dir. You should instead explicitly import the IO modules you want.

 
Memoize

NAME

Memoize - Make functions faster by trading space for time

SYNOPSIS

    # This is the documentation for Memoize 1.03
    use Memoize;
    memoize('slow_function');
    slow_function(arguments);    # Is faster than it was before

This is normally all you need to know. However, many options are available:

    memoize(function, options...);

Options include:

    NORMALIZER => function
    INSTALL => new_name

    SCALAR_CACHE => 'MEMORY'
    SCALAR_CACHE => ['HASH', \%cache_hash ]
    SCALAR_CACHE => 'FAULT'
    SCALAR_CACHE => 'MERGE'

    LIST_CACHE => 'MEMORY'
    LIST_CACHE => ['HASH', \%cache_hash ]
    LIST_CACHE => 'FAULT'
    LIST_CACHE => 'MERGE'

DESCRIPTION

`Memoizing' a function makes it faster by trading space for time. It does this by caching the return values of the function in a table. If you call the function again with the same arguments, memoize jumps in and gives you the value out of the table, instead of letting the function compute the value all over again.

Here is an extreme example. Consider the Fibonacci sequence, defined by the following function:

    # Compute Fibonacci numbers
    sub fib {
      my $n = shift;
      return $n if $n < 2;
      fib($n-1) + fib($n-2);
    }

This function is very slow. Why? To compute fib(14), it first wants to compute fib(13) and fib(12), and add the results. But to compute fib(13), it first has to compute fib(12) and fib(11), and then it comes back and computes fib(12) all over again even though the answer is the same. And both of the times that it wants to compute fib(12), it has to compute fib(11) from scratch, and then it has to do it again each time it wants to compute fib(13). This function does so much recomputing of old results that it takes a really long time to run---fib(14) makes 1,200 extra recursive calls to itself, to compute and recompute things that it already computed.

This function is a good candidate for memoization. If you memoize the `fib' function above, it will compute fib(14) exactly once, the first time it needs to, and then save the result in a table. Then if you ask for fib(14) again, it gives you the result out of the table. While computing fib(14), instead of computing fib(12) twice, it does it once; the second time it needs the value it gets it from the table. It doesn't compute fib(11) four times; it computes it once, getting it from the table the next three times. Instead of making 1,200 recursive calls to `fib', it makes 15. This makes the function about 150 times faster.

You could do the memoization yourself, by rewriting the function, like this:

    # Compute Fibonacci numbers, memoized version
    { my @fib;
      sub fib {
        my $n = shift;
        return $fib[$n] if defined $fib[$n];
        return $fib[$n] = $n if $n < 2;
        $fib[$n] = fib($n-1) + fib($n-2);
      }
    }

Or you could use this module, like this:

    use Memoize;
    memoize('fib');

    # Rest of the fib function just like the original version.

This makes it easy to turn memoizing on and off.
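To see the effect concretely, the following sketch adds a call counter to fib (the $calls variable is an addition for illustration only) and compares the number of real invocations before and after memoizing. Calls are made in scalar context so that a single cache is involved.

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;

sub fib {
    my $n = shift;
    $calls++;                       # count every real invocation
    return $n if $n < 2;
    fib($n-1) + fib($n-2);
}

my $plain       = fib(14);
my $plain_calls = $calls;

$calls = 0;
memoize('fib');
my $cached       = fib(14);
my $cached_calls = $calls;

print "fib(14) = $plain: $plain_calls calls unmemoized, ",
      "$cached_calls calls memoized\n";
```

The recursive calls inside fib are looked up by name at run time, so after memoize('fib') they go through the memoized version as well.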

Here's an even simpler example: I wrote a simple ray tracer; the program would look in a certain direction, figure out what it was looking at, and then convert the `color' value (typically a string like `red') of that object to a red, green, and blue pixel value, like this:

    for ($direction = 0; $direction < 300; $direction++) {
      # Figure out which object is in direction $direction
      $color = $object->{color};
      ($r, $g, $b) = @{&ColorToRGB($color)};
      ...
    }

Since there are relatively few objects in a picture, there are only a few colors, which get looked up over and over again. Memoizing ColorToRGB sped up the program by several percent.

DETAILS

This module exports exactly one function, memoize . The rest of the functions in this package are None of Your Business.

You should say

    memoize(function)

where function is the name of the function you want to memoize, or a reference to it. memoize returns a reference to the new, memoized version of the function, or undef on a non-fatal error. At present, there are no non-fatal errors, but there might be some in the future.

If function was the name of a function, then memoize hides the old version and installs the new memoized version under the old name, so that &function(...) actually invokes the memoized version.

OPTIONS

There are some optional options you can pass to memoize to change the way it behaves a little. To supply options, invoke memoize like this:

    memoize(function, NORMALIZER => function,
                      INSTALL => newname,
                      SCALAR_CACHE => option,
                      LIST_CACHE => option
           );

Each of these options is optional; you can include some, all, or none of them.

INSTALL

If you supply a function name with INSTALL , memoize will install the new, memoized version of the function under the name you give. For example,

    memoize('fib', INSTALL => 'fastfib')

installs the memoized version of fib as fastfib ; without the INSTALL option it would have replaced the old fib with the memoized version.

To prevent memoize from installing the memoized version anywhere, use INSTALL => undef .
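A brief sketch of both forms of INSTALL, using a hypothetical non-recursive function square with a call counter added for illustration. It assumes, as the text suggests, that each call to memoize builds its own cache.

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;
sub square { $calls++; return $_[0] ** 2 }

# Install the memoized version under a new name; square stays unmemoized.
memoize( 'square', INSTALL => 'fast_square' );

my $first  = fast_square(4);    # computed
my $second = fast_square(4);    # served from the cache
my $direct = square(4);         # original function, computed again

# INSTALL => undef installs nothing; use the returned code reference.
my $anon  = memoize( 'square', INSTALL => undef );
my $third = $anon->(4);         # separate cache, so computed once more

print "results: $first $second $direct $third; real calls: $calls\n";
```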

NORMALIZER

Suppose your function looks like this:

    # Typical call: f('aha!', A => 11, B => 12);
    sub f {
      my $a = shift;
      my %hash = @_;
      $hash{B} ||= 2;  # B defaults to 2
      $hash{C} ||= 7;  # C defaults to 7

      # Do something with $a, %hash
    }

Now, the following calls to your function are all completely equivalent:

    f(OUCH);
    f(OUCH, B => 2);
    f(OUCH, C => 7);
    f(OUCH, B => 2, C => 7);
    f(OUCH, C => 7, B => 2);
    (etc.)

However, unless you tell Memoize that these calls are equivalent, it will not know that, and it will compute the values for these invocations of your function separately, and store them separately.

To prevent this, supply a NORMALIZER function that turns the program arguments into a string in a way that equivalent arguments turn into the same string. A NORMALIZER function for f above might look like this:

    sub normalize_f {
      my $a = shift;
      my %hash = @_;
      $hash{B} ||= 2;
      $hash{C} ||= 7;

      # map needs a block here; "map (...) sort ..." does not parse
      join(',', $a, map { ($_ => $hash{$_}) } sort keys %hash);
    }

Each of the argument lists above comes out of the normalize_f function looking exactly the same, like this:

    OUCH,B,2,C,7

You would tell Memoize to use this normalizer this way:

    memoize('f', NORMALIZER => 'normalize_f');

memoize knows that if the normalized version of the arguments is the same for two argument lists, then it can safely look up the value that it computed for one argument list and return it as the result of calling the function with the other argument list, even if the argument lists look different.
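
The effect of the normalizer can be checked with a call counter. This is a sketch under the assumptions stated in the comments: the body of f, its return value, and the $calls counter are made up for illustration, and the variables are renamed to avoid the special variable $a.

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;    # illustrative counter, not part of Memoize

sub f {
    my $first = shift;
    my %opt   = @_;
    $opt{B} ||= 2;    # B defaults to 2
    $opt{C} ||= 7;    # C defaults to 7
    $calls++;
    return "$first/$opt{B}/$opt{C}";    # stand-in for the real work
}

# Turn equivalent argument lists into the same string.
sub normalize_f {
    my $first = shift;
    my %opt   = @_;
    $opt{B} ||= 2;
    $opt{C} ||= 7;
    return join ',', $first, map { ($_ => $opt{$_}) } sort keys %opt;
}

memoize('f', NORMALIZER => 'normalize_f');

my @results = (
    scalar f('OUCH'),
    scalar f('OUCH', B => 2),
    scalar f('OUCH', C => 7, B => 2),
);

print "results: @results\n";
print "real calls: $calls\n";
```

All three calls normalize to the same key, so the function body should run only for the first of them.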

The default normalizer just concatenates the arguments with character 28 in between. (In ASCII, this is called FS or control-\.) This always works correctly for functions with only one string argument, and also when the arguments never contain character 28. However, it can confuse certain argument lists:

    normalizer("a\034", "b")
    normalizer("a", "\034b")
    normalizer("a\034\034b")

for example.
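
The collision is easy to reproduce without Memoize at all. The helper default_key below is a hypothetical stand-in that only mimics the behavior described above (joining the arguments with character 28); it is not Memoize's own code:

```perl
use strict;
use warnings;

# Assumption: imitate the default normalizer as described in the text,
# i.e. join the arguments with character 28 (octal 034).
sub default_key { return join chr(28), @_ }

my @arg_lists = (
    [ "a\034", "b" ],
    [ "a",     "\034b" ],
    ["a\034\034b"],
);

my %seen;
$seen{ default_key(@$_) }++ for @arg_lists;

printf "%d argument lists, %d distinct key(s)\n",
    scalar @arg_lists, scalar keys %seen;
```

Three different argument lists collapse to a single key, which is why a custom NORMALIZER is needed when arguments may contain character 28.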

Since hash keys are strings, the default normalizer will not distinguish between undef and the empty string. It also won't work when the function's arguments are references. For example, consider a function g which gets two arguments: A number, and a reference to an array of numbers:

    g(13, [1,2,3,4,5,6,7]);

The default normalizer will turn this into something like "13\034ARRAY(0x436c1f)". That would be all right, except that a subsequent array of numbers might be stored at a different location even though it contains the same data. If this happens, Memoize will think that the arguments are different, even though they are equivalent. In this case, a normalizer like this is appropriate:

    sub normalize { join ' ', $_[0], @{$_[1]} }

For the example above, this produces the key "13 1 2 3 4 5 6 7".
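
A short sketch of the same idea in a complete program. The body of g (a simple sum), the name normalize_g, and the $calls counter are assumptions made for illustration:

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;    # illustrative counter

# Hypothetical body: add the number to the sum of the list.
sub g {
    my ($n, $list) = @_;
    $calls++;
    my $sum = $n;
    $sum += $_ for @$list;
    return $sum;
}

# Build the key from the array contents, not from the reference address.
sub normalize_g { return join ' ', $_[0], @{ $_[1] } }

memoize('g', NORMALIZER => 'normalize_g');

my $x = g(13, [ 1, 2, 3 ]);    # computed
my $y = g(13, [ 1, 2, 3 ]);    # different array, same contents: cache hit

print "x=$x y=$y real calls=$calls\n";
```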

Another use for normalizers is when the function depends on data other than those in its arguments. Suppose you have a function which returns a value which depends on the current hour of the day:

    sub on_duty {
      my ($problem_type) = @_;
      my $hour = (localtime)[2];
      open my $fh, "$DIR/$problem_type" or die...;
      my $line;
      # Read one line per elapsed hour; the last one read is the answer
      while ($hour-- > 0) {
        $line = <$fh>;
      }
      return $line;
    }

At 10:23, this function generates the 10th line of a data file; at 3:45 PM it generates the 15th line instead. By default, Memoize will only see the $problem_type argument. To fix this, include the current hour in the normalizer:

    sub normalize { join ' ', (localtime)[2], @_ }

The calling context of the function (scalar or list context) is propagated to the normalizer. This means that if the memoized function will treat its arguments differently in list context than it would in scalar context, you can have the normalizer function select its behavior based on the results of wantarray. Even if called in a list context, a normalizer should still return a single string.

SCALAR_CACHE, LIST_CACHE

Normally, Memoize caches your function's return values into an ordinary Perl hash variable. However, you might like to have the values cached on the disk, so that they persist from one run of your program to the next, or you might like to associate some other interesting semantics with the cached values.

There's a slight complication under the hood of Memoize: there are actually two caches, one for scalar values and one for list values. When your function is called in scalar context, its return value is cached in one hash, and when your function is called in list context, its value is cached in the other hash. You can control the caching behavior of both contexts independently with these options.

The argument to LIST_CACHE or SCALAR_CACHE must either be one of the following four strings:

    MEMORY
    FAULT
    MERGE
    HASH

or else it must be a reference to an array whose first element is one of these four strings, such as [HASH, arguments...].

  • MEMORY

    MEMORY means that return values from the function will be cached in an ordinary Perl hash variable. The hash variable will not persist after the program exits. This is the default.

  • HASH

    HASH allows you to specify that a particular hash that you supply will be used as the cache. You can tie this hash beforehand to give it any behavior you want.

    A tied hash can have any semantics at all. It is typically tied to an on-disk database, so that cached values are stored in the database and retrieved from it again when needed, and the disk file typically persists after your program has exited. See perltie for more complete details about tie.

    A typical example is:

        use DB_File;
        tie my %cache => 'DB_File', $filename, O_RDWR|O_CREAT, 0666;
        memoize 'function', SCALAR_CACHE => [HASH => \%cache];

    This has the effect of storing the cache in a DB_File database whose name is in $filename. The cache will persist after the program has exited. Next time the program runs, it will find the cache already populated from the previous run of the program. Or you can forcibly populate the cache by constructing a batch program that runs in the background and populates the cache file. Then when you come to run your real program the memoized function will be fast because all its results have been precomputed.

    Another reason to use HASH is to provide your own hash variable. You can then inspect or modify the contents of the hash to gain finer control over the cache management.

  • TIE

    This option is no longer supported. It is still documented only to aid in the debugging of old programs that use it. Old programs should be converted to use the HASH option instead.

        memoize ... ['TIE', PACKAGE, ARGS...]

    is merely a shortcut for

        require PACKAGE;
        { tie my %cache, PACKAGE, ARGS...;
          memoize ... [HASH => \%cache];
        }

  • FAULT

    FAULT means that you never expect to call the function in scalar (or list) context, and that if Memoize detects such a call, it should abort the program. The error message is one of

        `foo' function called in forbidden list context at line ...
        `foo' function called in forbidden scalar context at line ...

  • MERGE

    MERGE normally means that the memoized function does not distinguish between list and scalar context, and that return values in both contexts should be stored together. Both LIST_CACHE => MERGE and SCALAR_CACHE => MERGE mean the same thing.

    Consider this function:

        sub complicated {
          # ... time-consuming calculation of $result
          return $result;
        }

    The complicated function will return the same numeric $result regardless of whether it is called in list or in scalar context.

    Normally, the following code will result in two calls to complicated, even if complicated is memoized:

        $x = complicated(142);
        ($y) = complicated(142);
        $z = complicated(142);

    The first call will cache the result, say 37, in the scalar cache; the second will cache the list (37) in the list cache. The third call doesn't call the real complicated function; it gets the value 37 from the scalar cache.

    Obviously, the second call to complicated is a waste of time, and storing its return value is a waste of space. Specifying LIST_CACHE => MERGE will make memoize use the same cache for scalar and list context return values, so that the second call uses the scalar cache that was populated by the first call. complicated ends up being called only once, and both subsequent calls return 37 from the cache, regardless of the calling context.
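
The MERGE behavior described in the last item can be observed with a counter. In this sketch the body of complicated is a placeholder that always returns 37, and $calls is an illustrative counter:

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;    # illustrative counter

# Placeholder for a time-consuming calculation.
sub complicated {
    my ($n) = @_;
    $calls++;
    return 37;
}

# Share one cache between scalar and list context.
memoize('complicated', LIST_CACHE => 'MERGE');

my $x   = complicated(142);    # scalar context: computed and cached
my ($y) = complicated(142);    # list context: served from the merged cache
my $z   = complicated(142);    # scalar context: served from the cache

print "x=$x y=$y z=$z real calls=$calls\n";
```

Dropping the LIST_CACHE option from the memoize call should raise the count to two, matching the explanation above.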

List values in scalar context

Consider this function:

    sub iota { return reverse (1..$_[0]) }

This function normally returns a list. Suppose you memoize it and merge the caches:

    memoize 'iota', SCALAR_CACHE => 'MERGE';

    @i7 = iota(7);
    $i7 = iota(7);

Here the first call caches the list (7,6,5,4,3,2,1). The second call does not really make sense. Memoize cannot guess what behavior iota should have in scalar context without actually calling it in scalar context. Normally Memoize would call iota in scalar context and cache the result, but the SCALAR_CACHE => 'MERGE' option says not to do that, but to use the cached list-context value instead. But it cannot return a list of seven elements in a scalar context. In this case $i7 will receive the first element of the cached list value, namely 7.

Merged disk caches

Another use for MERGE is when you want both kinds of return values stored in the same disk file; this saves you from having to deal with two disk files instead of one. You can use a normalizer function to keep the two sets of return values separate. For example:

    tie my %cache => 'MLDBM', 'DB_File', $filename, ...;

    memoize 'myfunc',
      NORMALIZER => 'n',
      SCALAR_CACHE => [HASH => \%cache],
      LIST_CACHE => 'MERGE',
    ;

    sub n {
      my $context = wantarray() ? 'L' : 'S';
      # ... now compute the hash key from the arguments ...
      $hashkey = "$context:$hashkey";
    }

This normalizer function will store scalar context return values in the disk file under keys that begin with S:, and list context return values under keys that begin with L:.

OTHER FACILITIES

unmemoize

There's an unmemoize function that you can import if you want to. Why would you want to? Here's an example: Suppose you have your cache tied to a DBM file, and you want to make sure that the cache is written out to disk if someone interrupts the program. If the program exits normally, this will happen anyway, but if someone types control-C or something then the program will terminate immediately without synchronizing the database. So what you can do instead is

    $SIG{INT} = sub { unmemoize 'function' };

unmemoize accepts a reference to, or the name of a previously memoized function, and undoes whatever it did to provide the memoized version in the first place, including making the name refer to the unmemoized version if appropriate. It returns a reference to the unmemoized version of the function.

If you ask it to unmemoize a function that was never memoized, it croaks.
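
The following sketch exercises both behaviors: restoring the original function, and croaking on a function that was never memoized. The functions work and plain and the counters are hypothetical names chosen for the example:

```perl
use strict;
use warnings;
use Memoize qw(memoize unmemoize);

my $calls = 0;    # illustrative counter

sub work  { my ($n) = @_; $calls++; return $n + 1 }
sub plain { return 'never memoized' }

memoize('work');
my $r1 = work(1);
my $r2 = work(1);                # cache hit, body not run again
my $calls_while_memoized = $calls;

unmemoize('work');               # the name refers to the original again
my $r3 = work(1);                # body runs again
my $calls_after_unmemoize = $calls;

# unmemoize croaks for a function that was never memoized.
my $ok = eval { unmemoize('plain'); 1 };
my $error = $@;

print "while memoized: $calls_while_memoized, after: $calls_after_unmemoize\n";
print "unmemoize('plain') ", ($ok ? "succeeded" : "failed: $error");
```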

flush_cache

flush_cache(function) will flush out the caches, discarding all the cached data. The argument may be a function name or a reference to a function. For finer control over when data is discarded or expired, see the documentation for Memoize::Expire, included in this package.

Note that if the cache is a tied hash, flush_cache will attempt to invoke the CLEAR method on the hash. If there is no CLEAR method, this will cause a run-time error.

An alternative approach to cache flushing is to use the HASH option (see above) to request that Memoize use a particular hash variable as its cache. Then you can examine or modify the hash at any time in any way you desire. You may flush the cache by using %hash = ().
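
Both ways of flushing can be tried side by side. This is a sketch with a hypothetical function shout and an ordinary (untied) hash supplied through the HASH option:

```perl
use strict;
use warnings;
use Memoize qw(memoize flush_cache);

my $calls = 0;    # illustrative counter
my %cache;        # our own hash, handed to Memoize as the scalar cache

sub shout { my ($s) = @_; $calls++; return uc $s }

memoize('shout', SCALAR_CACHE => [ HASH => \%cache ]);

my $r = shout('hi');
$r = shout('hi');                       # cache hit
my $entries_before = keys %cache;       # the cache is visible to us

flush_cache('shout');                   # discard everything via Memoize
my $entries_after = keys %cache;
$r = shout('hi');                       # recomputed

%cache = ();                            # discard everything by hand
$r = shout('hi');                       # recomputed again

print "entries before flush: $entries_before, after: $entries_after\n";
print "real calls: $calls\n";
```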

CAVEATS

Memoization is not a cure-all:

  • Do not memoize a function whose behavior depends on program state other than its own arguments, such as global variables, the time of day, or file input. These functions will not produce correct results when memoized. For a particularly easy example:

        sub f {
          time;
        }

    This function takes no arguments, and as far as Memoize is concerned, it always returns the same result. Memoize is wrong, of course, and the memoized version of this function will call time once to get the current time, and it will return that same time every time you call it after that.

  • Do not memoize a function with side effects.

        sub f {
          my ($a, $b) = @_;
          my $s = $a + $b;
          print "$a + $b = $s.\n";
        }

    This function accepts two arguments, adds them, and prints their sum. Its return value is the return value of print (1 on success), but you probably didn't care about that. But Memoize doesn't understand that. If you memoize this function, you will get the result you expect the first time you ask it to print the sum of 2 and 3, but subsequent calls will return 1 (the return value of print) without actually printing anything.

  • Do not memoize a function that returns a data structure that is modified by its caller.

    Consider these functions: getusers returns a list of users somehow, and then main throws away the first user on the list and prints the rest:

        sub main {
          my $userlist = getusers();
          shift @$userlist;
          foreach $u (@$userlist) {
            print "User $u\n";
          }
        }

        sub getusers {
          my @users;
          # Do something to get a list of users;
          \@users;  # Return reference to list.
        }

    If you memoize getusers here, it will work right exactly once. The reference to the users list will be stored in the memo table. main will discard the first element from the referenced list. The next time you invoke main, Memoize will not call getusers; it will just return the same reference to the same list it got last time. But this time the list has already had its head removed; main will erroneously remove another element from it. The list will get shorter and shorter every time you call main.

    Similarly, this:

        $u1 = getusers();
        $u2 = getusers();
        pop @$u1;

    will modify $u2 as well as $u1, because both variables are references to the same array. Had getusers not been memoized, $u1 and $u2 would have referred to different arrays.

  • Do not memoize a very simple function.

    Recently someone mentioned to me that the Memoize module made his program run slower instead of faster. It turned out that he was memoizing the following function:

        sub square {
          $_[0] * $_[0];
        }

    I pointed out that Memoize uses a hash, and that looking up a number in the hash is necessarily going to take a lot longer than a single multiplication. There really is no way to speed up the square function.

    Memoization is not magical.
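
The shared-reference caveat above is the easiest of these to trip over, so here is a small runnable sketch of it. The user names are made up for the example:

```perl
use strict;
use warnings;
use Memoize;

# Returns a reference to a fresh array on every real call.
sub getusers {
    my @users = qw(alice bob carol);    # hypothetical data
    return \@users;
}

memoize('getusers');

my $u1 = getusers();
my $u2 = getusers();    # cache hit: the very same reference as $u1

pop @$u1;               # modifies the cached array itself

print "same reference: ", ($u1 == $u2 ? "yes" : "no"), "\n";
print "users seen through \$u2: @$u2\n";
```

If callers need to modify the result, one possible workaround is to have them copy it first, for example with `my @copy = @{ getusers() };`.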

PERSISTENT CACHE SUPPORT

You can tie the cache tables to any sort of tied hash that you want to, as long as it supports TIEHASH, FETCH, STORE, and EXISTS. For example,

    tie my %cache => 'GDBM_File', $filename, O_RDWR|O_CREAT, 0666;
    memoize 'function', SCALAR_CACHE => [HASH => \%cache];

works just fine. For some storage methods, you need a little glue.

SDBM_File doesn't supply an EXISTS method, so included in this package is a glue module called Memoize::SDBM_File which does provide one. Use this instead of plain SDBM_File to store your cache table on disk in an SDBM_File database:

    tie my %cache => 'Memoize::SDBM_File', $filename, O_RDWR|O_CREAT, 0666;
    memoize 'function', SCALAR_CACHE => [HASH => \%cache];

NDBM_File has the same problem and the same solution. (Use Memoize::NDBM_File instead of plain NDBM_File.)

Storable isn't a tied hash class at all. You can use it to store a hash to disk and retrieve it again, but you can't modify the hash while it's on the disk. So if you want to store your cache table in a Storable database, use Memoize::Storable, which puts a hashlike front-end onto Storable. The hash table is actually kept in memory, and is loaded from your Storable file at the time you memoize the function, and stored back at the time you unmemoize the function (or when your program exits):

    tie my %cache => 'Memoize::Storable', $filename;
    memoize 'function', SCALAR_CACHE => [HASH => \%cache];

    tie my %cache => 'Memoize::Storable', $filename, 'nstore';
    memoize 'function', SCALAR_CACHE => [HASH => \%cache];

Include the `nstore' option to have the Storable database written in `network order'. (See Storable for more details about this.)

The flush_cache() function will raise a run-time error unless the tied package provides a CLEAR method.
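
To illustrate the requirement stated at the start of this section, here is a sketch of a custom tied hash used as a cache. The class CountingCache is hypothetical; it inherits TIEHASH, FETCH, EXISTS, and CLEAR from the core Tie::StdHash class and overrides only STORE so that writes can be counted:

```perl
use strict;
use warnings;
use Memoize;

package CountingCache;    # hypothetical tied-hash class for this example
require Tie::Hash;
our @ISA    = ('Tie::StdHash');
our $stores = 0;

# Count every write that Memoize makes to the cache.
sub STORE {
    my ($self, $key, $value) = @_;
    $stores++;
    $self->{$key} = $value;
    return;
}

package main;

tie my %cache, 'CountingCache';

sub slow_length { my ($s) = @_; return length $s }

memoize('slow_length', SCALAR_CACHE => [ HASH => \%cache ]);

my $n1 = slow_length('memoize');    # computed, then stored in the tied hash
my $n2 = slow_length('memoize');    # fetched from the tied hash

print "length=$n2 stores=$CountingCache::stores\n";
```

A class like this could just as well write to a file or a database in STORE; the in-memory version is used here only to keep the example self-contained.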

EXPIRATION SUPPORT

See Memoize::Expire, which is a plug-in module that adds expiration functionality to Memoize. If you don't like the kinds of policies that Memoize::Expire implements, it is easy to write your own plug-in module to implement whatever policy you desire. Memoize comes with several examples. An expiration manager that implements a LRU policy is available on CPAN as Memoize::ExpireLRU.

BUGS

The test suite is much better, but always needs improvement.

There is some problem with the way goto &f works under threaded Perl, perhaps because of the lexical scoping of @_. This is a bug in Perl, and until it is resolved, memoized functions will see a slightly different caller() and will perform a little more slowly on threaded perls than unthreaded perls.

Some versions of DB_File won't let you store data under a key of length 0. That means that if you have a function f which you memoized and the cache is in a DB_File database, then the value of f() (f called with no arguments) will not be memoized. If this is a big problem, you can supply a normalizer function that prepends "x" to every key.

MAILING LIST

To join a very low-traffic mailing list for announcements about Memoize, send an empty note to mjd-perl-memoize-request@plover.com.

AUTHOR

Mark-Jason Dominus (mjd-perl-memoize+@plover.com ), Plover Systems co.

See the Memoize.pm Page at http://perl.plover.com/Memoize/ for news and upgrades. Near this page, at http://perl.plover.com/MiniMemoize/ there is an article about memoization and about the internals of Memoize that appeared in The Perl Journal, issue #13. (This article is also included in the Memoize distribution as `article.html'.)

The author's book Higher-Order Perl (2005, ISBN 1558607013, published by Morgan Kaufmann) discusses memoization (and many other topics) in tremendous detail. It is available on-line for free. For more information, visit http://hop.perl.plover.com/ .

To join a mailing list for announcements about Memoize, send an empty message to mjd-perl-memoize-request@plover.com. This mailing list is for announcements only and has extremely low traffic---fewer than two messages per year.

COPYRIGHT AND LICENSE

Copyright 1998, 1999, 2000, 2001, 2012 by Mark Jason Dominus

This library is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

THANK YOU

Many thanks to Florian Ragwitz for administration and packaging assistance, to John Tromp for bug reports, to Jonathan Roy for bug reports and suggestions, to Michael Schwern for other bug reports and patches, to Mike Cariaso for helping me to figure out the Right Thing to Do About Expiration, to Joshua Gerth, Joshua Chamas, Jonathan Roy (again), Mark D. Anderson, and Andrew Johnson for more suggestions about expiration, to Brent Powers for the Memoize::ExpireLRU module, to Ariel Scolnicov for delightful messages about the Fibonacci function, to Dion Almaer for thought-provoking suggestions about the default normalizer, to Walt Mankowski and Kurt Starsinic for much help investigating problems under threaded Perl, to Alex Dudkevich for reporting the bug in prototyped functions and for checking my patch, to Tony Bass for many helpful suggestions, to Jonathan Roy (again) for finding a use for unmemoize(), to Philippe Verdret for enlightening discussion of Hook::PrePostCall, to Nat Torkington for advice I ignored, to Chris Nandor for portability advice, to Randal Schwartz for suggesting the flush_cache function, and to Jenda Krynicky for being a light in the world.

Special thanks to Jarkko Hietaniemi, the 5.8.0 pumpking, for including this module in the core and for his patient and helpful guidance during the integration process.

 
NDBM_File

NAME

NDBM_File - Tied access to ndbm files

SYNOPSIS

    use Fcntl;   # For O_RDWR, O_CREAT, etc.
    use NDBM_File;

    tie(%h, 'NDBM_File', 'filename', O_RDWR|O_CREAT, 0666)
      or die "Couldn't tie NDBM file 'filename': $!; aborting";

    # Now read and change the hash
    $h{newkey} = newvalue;
    print $h{oldkey};
    ...

    untie %h;

DESCRIPTION

NDBM_File establishes a connection between a Perl hash variable and a file in NDBM_File format. You can manipulate the data in the file just as if it were in a Perl hash, but when your program exits, the data will remain in the file, to be used the next time your program runs.

Use NDBM_File with the Perl built-in tie function to establish the connection between the variable and the file. The arguments to tie should be:

1. The hash variable you want to tie.

2. The string "NDBM_File". (This tells Perl to use the NDBM_File package to perform the functions of the hash.)

3. The name of the file you want to tie to the hash.

4. Flags. Use one of:

  • O_RDONLY

    Read-only access to the data in the file.

  • O_WRONLY

    Write-only access to the data in the file.

  • O_RDWR

    Both read and write access.

If you want to create the file if it does not exist, add O_CREAT to any of these, as in the example. If you omit O_CREAT and the file does not already exist, the tie call will fail.

5. The default permissions to use if a new file is created. The actual permissions will be modified by the user's umask, so you should probably use 0666 here. (See umask.)

DIAGNOSTICS

On failure, the tie call returns an undefined value and probably sets $! to contain the reason the file could not be tied.

ndbm store returned -1, errno 22, key "..." at ...

This warning is emitted when you try to store a key or a value that is too long. It means that the change was not recorded in the database. See BUGS AND WARNINGS below.

BUGS AND WARNINGS

There are a number of limits on the size of the data that you can store in the NDBM file. The most important is that the length of a key, plus the length of its associated value, may not exceed 1008 bytes.

See tie, perldbmfilter, Fcntl

 
NEXT

NAME

NEXT.pm - Provide a pseudo-class NEXT (et al) that allows method redispatch

SYNOPSIS

    use NEXT;

    package A;
    sub A::method   { print "$_[0]: A method\n";   $_[0]->NEXT::method() }
    sub A::DESTROY  { print "$_[0]: A dtor\n";     $_[0]->NEXT::DESTROY() }

    package B;
    use base qw( A );
    sub B::AUTOLOAD { print "$_[0]: B AUTOLOAD\n"; $_[0]->NEXT::AUTOLOAD() }
    sub B::DESTROY  { print "$_[0]: B dtor\n";     $_[0]->NEXT::DESTROY() }

    package C;
    sub C::method   { print "$_[0]: C method\n";   $_[0]->NEXT::method() }
    sub C::AUTOLOAD { print "$_[0]: C AUTOLOAD\n"; $_[0]->NEXT::AUTOLOAD() }
    sub C::DESTROY  { print "$_[0]: C dtor\n";     $_[0]->NEXT::DESTROY() }

    package D;
    use base qw( B C );
    sub D::method   { print "$_[0]: D method\n";   $_[0]->NEXT::method() }
    sub D::AUTOLOAD { print "$_[0]: D AUTOLOAD\n"; $_[0]->NEXT::AUTOLOAD() }
    sub D::DESTROY  { print "$_[0]: D dtor\n";     $_[0]->NEXT::DESTROY() }

    package main;

    my $obj = bless {}, "D";

    $obj->method();          # Calls D::method, A::method, C::method
    $obj->missing_method();  # Calls D::AUTOLOAD, B::AUTOLOAD, C::AUTOLOAD

    # Clean-up calls D::DESTROY, B::DESTROY, A::DESTROY, C::DESTROY

DESCRIPTION

NEXT.pm adds a pseudoclass named NEXT to any program that uses it. If a method m calls $self->NEXT::m(), the call to m is redispatched as if the calling method had not originally been found.

In other words, a call to $self->NEXT::m() resumes the depth-first, left-to-right search of $self's class hierarchy that resulted in the original call to m.

Note that this is not the same thing as $self->SUPER::m(), which begins a new dispatch that is restricted to searching the ancestors of the current class. $self->NEXT::m() can backtrack past the current class -- to look for a suitable method in other ancestors of $self -- whereas $self->SUPER::m() cannot.

A typical use would be in the destructors of a class hierarchy, as illustrated in the synopsis above. Each class in the hierarchy has a DESTROY method that performs some class-specific action and then redispatches the call up the hierarchy. As a result, when an object of class D is destroyed, the destructors of all its parent classes are called (in depth-first, left-to-right order).

Another typical use of redispatch would be in AUTOLOAD'ed methods. If such a method determined that it was not able to handle a particular call, it might choose to redispatch that call, in the hope that some other AUTOLOAD (above it, or to its left) might do better.

By default, if a redispatch attempt fails to find another method elsewhere in the object's class hierarchy, it quietly gives up and does nothing (but see Enforcing redispatch). This gracious acquiescence is also unlike the (generally annoying) behaviour of SUPER, which throws an exception if it cannot redispatch.
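
A minimal sketch of this backtracking, with hypothetical classes Left, Right, and Child and a log array used only to record the call order:

```perl
use strict;
use warnings;
use NEXT;

our @log;    # records the order in which methods run

package Left;
sub hello { push @main::log, 'Left'; $_[0]->NEXT::hello() }

package Right;
sub hello { push @main::log, 'Right'; $_[0]->NEXT::hello() }

package Child;
our @ISA = ('Left', 'Right');
sub hello { push @main::log, 'Child'; $_[0]->NEXT::hello() }

package main;

Child->hello();
print "call order: @main::log\n";
```

Left and Right are unrelated to each other, so a SUPER::hello call inside Left::hello would have no ancestor to search. NEXT instead continues the search that started at Child and reaches Right::hello. The final redispatch from Right finds nothing and quietly returns.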

Note that it is a fatal error for any method (including AUTOLOAD) to attempt to redispatch any method that does not have the same name. For example:

    sub D::oops { print "oops!\n"; $_[0]->NEXT::other_method() }

Enforcing redispatch

It is possible to make NEXT redispatch more demandingly (i.e. like SUPER does), so that the redispatch throws an exception if it cannot find a "next" method to call.

To do this, simply invoke the redispatch as:

    $self->NEXT::ACTUAL::method();

rather than:

    $self->NEXT::method();

The ACTUAL tells NEXT that there must actually be a next method to call, or it should throw an exception.

NEXT::ACTUAL is most commonly used in AUTOLOAD methods, as a means to decline an AUTOLOAD request, but preserve the normal exception-on-failure semantics:

    sub AUTOLOAD {
        if ($AUTOLOAD =~ /foo|bar/) {
            # handle here
        }
        else {  # try elsewhere
            shift()->NEXT::ACTUAL::AUTOLOAD(@_);
        }
    }

By using NEXT::ACTUAL, if there is no other AUTOLOAD to handle the method call, an exception will be thrown (as usually happens in the absence of a suitable AUTOLOAD).
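
The difference between the two forms can be seen by redispatching from a class that has no ancestors. The classes Quiet and Demanding below are hypothetical and exist only for this sketch:

```perl
use strict;
use warnings;
use NEXT;

package Quiet;
sub new  { return bless {}, shift }
# Plain NEXT: silently does nothing when there is no next method.
sub ping { my ($self) = @_; $self->NEXT::ping(); return 'quiet ok' }

package Demanding;
sub new  { return bless {}, shift }
# NEXT::ACTUAL: throws an exception when there is no next method.
sub ping { my ($self) = @_; $self->NEXT::ACTUAL::ping(); return 'not reached' }

package main;

my $quiet     = eval { Quiet->new->ping };
my $demanding = eval { Demanding->new->ping };
my $error     = $@;

print "Quiet:     ", (defined $quiet ? $quiet : 'died'), "\n";
print "Demanding: ", (defined $demanding ? $demanding : "died: $error");
```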

Avoiding repetitions

If NEXT redispatching is used in the methods of a "diamond" class hierarchy:

    #     A   B
    #    / \ /
    #   C   D
    #    \ /
    #     E

    use NEXT;

    package A;
    sub foo { print "called A::foo\n"; shift->NEXT::foo() }

    package B;
    sub foo { print "called B::foo\n"; shift->NEXT::foo() }

    package C; @ISA = qw( A );
    sub foo { print "called C::foo\n"; shift->NEXT::foo() }

    package D; @ISA = qw(A B);
    sub foo { print "called D::foo\n"; shift->NEXT::foo() }

    package E; @ISA = qw(C D);
    sub foo { print "called E::foo\n"; shift->NEXT::foo() }

    E->foo();

then derived classes may (re-)inherit base-class methods through two or more distinct paths (e.g. in the way E inherits A::foo twice -- through C and D). In such cases, a sequence of NEXT redispatches will invoke the multiply inherited method as many times as it is inherited. For example, the above code prints:

    called E::foo
    called C::foo
    called A::foo
    called D::foo
    called A::foo
    called B::foo

(i.e. A::foo is called twice).

In some cases this may be the desired effect within a diamond hierarchy, but in others (e.g. for destructors) it may be more appropriate to call each method only once during a sequence of redispatches.

To cover such cases, you can redispatch methods via:

    $self->NEXT::DISTINCT::method();

rather than:

    $self->NEXT::method();

This causes the redispatcher to visit each distinct method only once. That is, to skip any classes in the hierarchy that it has already visited during redispatch. So, for example, if the previous example were rewritten:

    package A;
    sub foo { print "called A::foo\n"; shift->NEXT::DISTINCT::foo() }

    package B;
    sub foo { print "called B::foo\n"; shift->NEXT::DISTINCT::foo() }

    package C; @ISA = qw( A );
    sub foo { print "called C::foo\n"; shift->NEXT::DISTINCT::foo() }

    package D; @ISA = qw(A B);
    sub foo { print "called D::foo\n"; shift->NEXT::DISTINCT::foo() }

    package E; @ISA = qw(C D);
    sub foo { print "called E::foo\n"; shift->NEXT::DISTINCT::foo() }

    E->foo();

then it would print:

    called E::foo
    called C::foo
    called A::foo
    called D::foo
    called B::foo

and omit the second call to A::foo (since it would not be distinct from the first call to A::foo).

Note that you can also use:

    $self->NEXT::DISTINCT::ACTUAL::method();

or:

    $self->NEXT::ACTUAL::DISTINCT::method();

to get both unique invocation and exception-on-failure.

Note that, for historical compatibility, you can also use NEXT::UNSEEN instead of NEXT::DISTINCT.

Invoking all versions of a method with a single call

Yet another pseudo-class that NEXT.pm provides is EVERY. Its behaviour is considerably simpler than that of the NEXT family. A call to:

    $obj->EVERY::foo();

calls every method named foo that the object in $obj has inherited. That is:

    use NEXT;

    package A; @ISA = qw(B D X);
    sub foo { print "A::foo " }

    package B; @ISA = qw(D X);
    sub foo { print "B::foo " }

    package X; @ISA = qw(D);
    sub foo { print "X::foo " }

    package D;
    sub foo { print "D::foo " }

    package main;

    my $obj = bless {}, 'A';
    $obj->EVERY::foo();        # prints "A::foo B::foo X::foo D::foo "

Prefixing a method call with EVERY:: causes every method in the object's hierarchy with that name to be invoked. As the above example illustrates, they are not called in Perl's usual "left-most-depth-first" order. Instead, they are called "breadth-first-dependency-wise".

That means that the inheritance tree of the object is traversed breadth-first and the resulting order of classes is used as the sequence in which methods are called. However, that sequence is modified by imposing a rule that the appropriate method of a derived class must be called before the same method of any ancestral class. That's why, in the above example, X::foo is called before D::foo, even though D comes before X in @B::ISA.

In general, there's no need to worry about the order of calls. They will be left-to-right, breadth-first, most-derived-first. This works perfectly for most inherited methods (including destructors), but is inappropriate for some kinds of methods (such as constructors, cloners, debuggers, and initializers) where it's more appropriate that the least-derived methods be called first (as more-derived methods may rely on the behaviour of their "ancestors"). In that case, instead of using the EVERY pseudo-class:

    $obj->EVERY::foo();        # prints "A::foo B::foo X::foo D::foo "

you can use the EVERY::LAST pseudo-class:

    $obj->EVERY::LAST::foo();  # prints "D::foo X::foo B::foo A::foo "

which reverses the order of method call.

Whichever version is used, the actual methods are called in the same context (list, scalar, or void) as the original call via EVERY, and return:

  • A hash of array references in list context. Each entry of the hash has the fully qualified method name as its key and a reference to an array containing the method's list-context return values as its value.

  • A reference to a hash of scalar values in scalar context. Each entry of the hash has the fully qualified method name as its key and the method's scalar-context return value as its value.

  • Nothing in void context (obviously).
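
The return values listed above can be inspected directly. In this sketch the classes Animal and Dog and the method describe are hypothetical:

```perl
use strict;
use warnings;
use NEXT;

package Animal;
sub describe { return 'animal' }

package Dog;
our @ISA = ('Animal');
sub describe { return 'dog' }

package main;

my $obj = bless {}, 'Dog';

# List context: fully qualified method name => reference to array of values.
my %by_name = $obj->EVERY::describe();

# Scalar context: reference to a hash of name => scalar value.
my $scalar_result = $obj->EVERY::describe();

for my $name (sort keys %by_name) {
    print "$name returned (@{ $by_name{$name} }) in list context, ",
          "'$scalar_result->{$name}' in scalar context\n";
}
```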

Using EVERY methods

The typical way to use an EVERY call is to wrap it in another base method, that all classes inherit. For example, to ensure that every destructor an object inherits is actually called (as opposed to just the left-most-depth-first-est one):

    use NEXT;

    package Base;
    sub DESTROY { $_[0]->EVERY::Destroy }

    package Derived1;
    use base 'Base';
    sub Destroy {...}

    package Derived2;
    use base 'Base', 'Derived1';
    sub Destroy {...}

et cetera. Every derived class that needs its own clean-up behaviour simply adds its own Destroy method (not a DESTROY method), which the call to EVERY::Destroy in the inherited destructor then correctly picks up.

Likewise, to create a class hierarchy in which every initializer inherited by a new object is invoked:

    use NEXT;

    package Base;
    sub new {
        my ($class, %args) = @_;
        my $obj = bless {}, $class;
        $obj->EVERY::LAST::Init(\%args);
        return $obj;    # return the object, not the result of the EVERY call
    }

    package Derived1;
    use base 'Base';
    sub Init {
        my ($self, $argsref) = @_;    # the invocant is passed first
        ...
    }

    package Derived2;
    use base 'Base', 'Derived1';
    sub Init {
        my ($self, $argsref) = @_;
        ...
    }

et cetera. Every derived class that needs some additional initialization behaviour simply adds its own Init method (not a new method), which the call to EVERY::LAST::Init in the inherited constructor then correctly picks up.

AUTHOR

Damian Conway (damian@conway.org)

BUGS AND IRRITATIONS

Because it's a module, not an integral part of the interpreter, NEXT.pm has to guess where the surrounding call was found in the method look-up sequence. In the presence of diamond inheritance patterns it occasionally guesses wrong.

It's also too slow (despite caching).

Comments, suggestions, and patches welcome.

COPYRIGHT

    Copyright (c) 2000-2001, Damian Conway. All Rights Reserved.
    This module is free software. It may be used, redistributed
    and/or modified under the same terms as Perl itself.
 
O

NAME

O - Generic interface to Perl Compiler backends

SYNOPSIS

    perl -MO=[-q,]Backend[,OPTIONS] foo.pl

DESCRIPTION

This is the module that is used as a frontend to the Perl Compiler.

If you pass the -q option to the module, then the STDOUT filehandle will be redirected into the variable $O::BEGIN_output during compilation. This has the effect that any output printed to STDOUT by BEGIN blocks or use'd modules will be stored in this variable rather than printed. It's useful with those backends which produce output themselves (Deparse, Concise, etc.), so that their output is not confused with that generated by the code being compiled.

The -qq option behaves like -q, except that it also closes STDERR after deparsing has finished. This suppresses the "Syntax OK" message normally produced by perl.

CONVENTIONS

Most compiler backends use the following conventions: OPTIONS consists of a comma-separated list of words (no white-space). The -v option usually puts the backend into verbose mode. The -ofile option generates output to file instead of stdout. The -D option followed by various letters turns on various internal debugging flags. See the documentation for the desired backend (named B::Backend for the example above) to find out about that backend.

IMPLEMENTATION

This section is only necessary for those who want to write a compiler backend module that can be used via this module.

The command-line mentioned in the SYNOPSIS section corresponds to the Perl code

    use O ("Backend", OPTIONS);

The O::import function loads the appropriate B::Backend module and calls its compile function, passing it OPTIONS. That function is expected to return a sub reference which we'll call CALLBACK. Next, the "compile-only" flag is switched on (equivalent to the command-line option -c) and a CHECK block is registered which calls CALLBACK. Thus the main Perl program mentioned on the command-line is read in, parsed and compiled into internal syntax tree form. Since the -c flag is set, the program does not start running (excepting BEGIN blocks of course) but the CALLBACK function registered by the compiler backend is called.

In summary, a compiler backend module should be called "B::Foo" for some Foo and live in the appropriate directory for that name. It should define a function called compile. When the user types

    perl -MO=Foo,OPTIONS foo.pl

that function is called and is passed those OPTIONS (split on commas). It should return a sub ref to the main compilation function. After the user's program is loaded and parsed, that returned sub ref is invoked which can then go ahead and do the compilation, usually by making use of the B module's functionality.
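
The summary above can be sketched as a minimal backend. B::Foo is the placeholder name used in the text, and what the callback prints is an assumption made only for illustration; a real backend would do its compilation work there.

```perl
package B::Foo;    # placeholder backend name, normally saved as B/Foo.pm

use strict;
use warnings;
use B ();

# Called by O::import with the OPTIONS already split on commas.
sub compile {
    my @options = @_;

    # Return the CALLBACK; O arranges for it to run from a CHECK block,
    # after the user's program has been parsed and compiled.
    return sub {
        my $root = B::main_root();    # root op of the main program
        printf "B::Foo options: %s\n", join(' ', @options);
        printf "main program root op: %s\n", $root->name;
    };
}

1;
```

With B/Foo.pm somewhere on @INC, this sketch would be invoked as perl -MO=Foo,-v foo.pl. The options are captured in a closure so that the callback still has them when it finally runs.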

BUGS

The -q and -qq options don't work correctly if perl isn't compiled with PerlIO support: STDOUT will be closed instead of being redirected to $O::BEGIN_output.

AUTHOR

Malcolm Beattie, mbeattie@sable.ox.ac.uk

 
Opcode

NAME

Opcode - Disable named opcodes when compiling perl code

SYNOPSIS

    use Opcode;

DESCRIPTION

Perl code is always compiled into an internal format before execution.

Evaluating perl code (e.g. via "eval" or "do 'file'") causes the code to be compiled into an internal format and then, provided there was no error in the compilation, executed. The internal format is based on many distinct opcodes.

By default no opmask is in effect and any code can be compiled.

The Opcode module allows you to define an operator mask to be in effect when perl next compiles any code. Attempting to compile code which contains a masked opcode will cause the compilation to fail with an error. The code will not be executed.

NOTE

The Opcode module is not usually used directly. See the ops pragma and the Safe module for more typical uses.

WARNING

The authors make no warranty, implied or otherwise, about the suitability of this software for safety or security purposes.

The authors shall not in any case be liable for special, incidental, consequential, indirect or other similar damages arising from the use of this software.

Your mileage will vary. If in any doubt do not use it.

Operator Names and Operator Lists

The canonical list of operator names is the contents of the array PL_op_name defined and initialised in file opcode.h of the Perl source distribution (and installed into the perl library).

Each operator has both a terse name (its opname) and a more verbose or recognisable descriptive name. The opdesc function can be used to return a list of descriptions for a list of operators.

Many of the functions and methods listed below take a list of operators as parameters. Most operator lists can be made up of several types of element. Each element can be one of

  • an operator name (opname)

    Operator names are typically small lowercase words like enterloop, leaveloop, last, next, redo etc. Sometimes they are rather cryptic like gv2cv, i_ncmp and ftsvtx.

  • an operator tag name (optag)

    Operator tags can be used to refer to groups (or sets) of operators. Tag names always begin with a colon. The Opcode module defines several optags and the user can define others using the define_optag function.

  • a negated opname or optag

    An opname or optag can be prefixed with an exclamation mark, e.g., !mkdir. Negating an opname or optag means remove the corresponding ops from the accumulated set of ops at that point.

  • an operator set (opset)

    An opset is a binary string of approximately 44 bytes which holds a set of zero or more operators.

    The opset and opset_to_ops functions can be used to convert from a list of operators to an opset and vice versa.

    Wherever a list of operators can be given you can use one or more opsets. See also Manipulating Opsets below.
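
The element types above can be combined in a single operator list. The following is a minimal sketch; the particular ops chosen are arbitrary examples.

```perl
use strict;
use warnings;
use Opcode qw(opset opset_to_ops);

# An operator list mixing an optag, a negated opname and a plain opname.
# Elements are processed left to right, so "!print" removes print
# from the ops accumulated from :base_io.
my $set = opset(qw(:base_io !print mkdir));

my %allowed = map { $_ => 1 } opset_to_ops($set);

print "readline is in the set\n"  if $allowed{readline};
print "print is not in the set\n" unless $allowed{print};
print "mkdir is in the set\n"     if $allowed{mkdir};

# An opset can itself be used as an element of an operator list.
my $bigger = opset($set, 'rmdir');
```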

Opcode Functions

The Opcode package contains functions for manipulating operator names, tags and sets. All are available for export by the package.

  • opcodes

    In a scalar context opcodes returns the number of opcodes in this version of perl (around 350 for perl-5.7.0).

    In a list context it returns a list of all the operator names. (Not yet implemented, use @names = opset_to_ops(full_opset).)

  • opset (OP, ...)

    Returns an opset containing the listed operators.

  • opset_to_ops (OPSET)

    Returns a list of operator names corresponding to those operators in the set.

  • opset_to_hex (OPSET)

    Returns a string representation of an opset. Can be handy for debugging.

  • full_opset

    Returns an opset which includes all operators.

  • empty_opset

    Returns an opset which contains no operators.

  • invert_opset (OPSET)

    Returns an opset which is the inverse set of the one supplied.

  • verify_opset (OPSET, ...)

    Returns true if the supplied opset looks like a valid opset (is the right length etc) otherwise it returns false. If an optional second parameter is true then verify_opset will croak on an invalid opset instead of returning false.

    Most of the other Opcode functions call verify_opset automatically and will croak if given an invalid opset.

  • define_optag (OPTAG, OPSET)

    Define OPTAG as a symbolic name for OPSET. Optag names always start with a colon (:).

    The optag name used must not be defined already (define_optag will croak if it is already defined). Optag names are global to the perl process and optag definitions cannot be altered or deleted once defined.

    It is strongly recommended that applications using Opcode should use a leading capital letter on their tag names since lowercase names are reserved for use by the Opcode module. If using Opcode within a module you should prefix your tag names with the name of your module to ensure uniqueness and thus avoid clashes with other modules.

  • opmask_add (OPSET)

    Adds the supplied opset to the current opmask. Note that there is currently no mechanism for unmasking ops once they have been masked. This is intentional.

  • opmask

    Returns an opset corresponding to the current opmask.

  • opdesc (OP, ...)

    This takes a list of operator names and returns the corresponding list of operator descriptions.

  • opdump (PAT)

    Dumps to STDOUT a two column list of op names and op descriptions. If an optional pattern is given then only lines which match the (case insensitive) pattern will be output.

    It's designed to be used as a handy command line utility:

        perl -MOpcode=opdump -e opdump
        perl -MOpcode=opdump -e 'opdump Eval'
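
Several of the functions above can be exercised together. This is a small sketch rather than a complete tour; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Opcode qw(opcodes opset_to_ops full_opset empty_opset
              invert_opset verify_opset opdesc);

my $count = opcodes();                     # scalar context: number of opcodes
my @names = opset_to_ops(full_opset());    # all operator names

# Inverting the full set leaves no operators.
my @none = opset_to_ops(invert_opset(full_opset()));

# verify_opset returns false for a string of the wrong length
# (it would croak instead if a true second argument were given).
my $valid   = verify_opset(empty_opset());
my $invalid = verify_opset('x');

my ($description) = opdesc('mkdir');

printf "%d opcodes, %d names, mkdir is '%s'\n",
    $count, scalar(@names), $description;
```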

Manipulating Opsets

Opsets may be manipulated using the perl bit vector operators & (and), | (or), ^ (xor) and ~ (negate/invert).

However you should never rely on the numerical position of any opcode within the opset. In other words both sides of a bit vector operator should be opsets returned from Opcode functions.

Also, since the number of opcodes in your current version of perl might not be an exact multiple of eight, there may be unused bits in the last byte of an opset. This should not cause any problems (Opcode functions ignore those extra bits) but it does mean that using the ~ operator will typically not produce the same 'physical' opset 'string' as the invert_opset function.
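
A minimal sketch of these operators follows. It assumes the bitwise feature is not enabled, so that |, & and ~ act on the opset strings.

```perl
use strict;
use warnings;
use Opcode qw(opset opset_to_ops invert_opset);

my $io   = opset(':base_io');
my $loop = opset(':base_loop');

# Both operands come from Opcode functions, as recommended above.
my $either = $io | $loop;    # union
my $common = $io & $loop;    # intersection

# ~ may set the unused bits in the last byte, so compare the
# operator names rather than the raw strings.
my @via_tilde  = opset_to_ops(~$io);
my @via_invert = opset_to_ops(invert_opset($io));

print "same operators\n" if "@via_tilde" eq "@via_invert";
```

Under the bitwise feature (enabled, for instance, by use v5.28 or later) the plain operators treat their operands as numbers, and the string forms |. &. and ~. would be needed instead.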

TO DO (maybe)

    $bool = opset_eq($opset1, $opset2)      true if opsets are logically equivalent
    $yes  = opset_can($opset, @ops)         true if $opset has all @ops set
    @diff = opset_diff($opset1, $opset2)    => ('foo', '!bar', ...)

Predefined Opcode Tags

  • :base_core
        null stub scalar pushmark wantarray const defined undef
        rv2sv sassign
        rv2av aassign aelem aelemfast aelemfast_lex aslice av2arylen
        rv2hv helem hslice each values keys exists delete aeach akeys
        avalues reach rvalues rkeys
        preinc i_preinc predec i_predec postinc i_postinc
        postdec i_postdec int hex oct abs pow multiply i_multiply
        divide i_divide modulo i_modulo add i_add subtract i_subtract
        left_shift right_shift bit_and bit_xor bit_or negate i_negate
        not complement
        lt i_lt gt i_gt le i_le ge i_ge eq i_eq ne i_ne ncmp i_ncmp
        slt sgt sle sge seq sne scmp
        substr vec stringify study pos length index rindex ord chr
        ucfirst lcfirst uc lc fc quotemeta trans transr chop schop
        chomp schomp
        match split qr
        list lslice splice push pop shift unshift reverse
        cond_expr flip flop andassign orassign dorassign and or dor xor
        warn die lineseq nextstate scope enter leave
        rv2cv anoncode prototype coreargs
        entersub leavesub leavesublv return method method_named -- XXX loops via recursion?
        leaveeval -- needed for Safe to operate, is safe without entereval
  • :base_mem

    These memory related ops are not included in :base_core because they can easily be used to implement a resource attack (e.g., consume all available memory).

        concat repeat join range
        anonlist anonhash

    Note that despite the existence of this optag a memory resource attack may still be possible using only :base_core ops.

    Disabling these ops is a very heavy handed way to attempt to prevent a memory resource attack. It's probable that a specific memory limit mechanism will be added to perl in the near future.

  • :base_loop

    These loop ops are not included in :base_core because they can easily be used to implement a resource attack (e.g., consume all available CPU time).

        grepstart grepwhile
        mapstart mapwhile
        enteriter iter
        enterloop leaveloop unstack
        last next redo
        goto
  • :base_io

    These ops enable filehandle (rather than filename) based input and output. These are safe on the assumption that only pre-existing filehandles are available for use. Usually, to create new filehandles other ops such as open would need to be enabled, if you don't take into account the magical open of ARGV.

        readline rcatline getc read
        formline enterwrite leavewrite
        print say sysread syswrite send recv
        eof tell seek sysseek
        readdir telldir seekdir rewinddir
  • :base_orig

    These are a hotchpotch of opcodes still waiting to be considered.

        gvsv gv gelem
        padsv padav padhv padcv padany padrange introcv clonecv
        once
        rv2gv refgen srefgen ref
        bless -- could be used to change ownership of objects (reblessing)
        pushre regcmaybe regcreset regcomp subst substcont
        sprintf prtf -- can core dump
        crypt
        tie untie
        dbmopen dbmclose
        sselect select
        pipe_op sockpair
        getppid getpgrp setpgrp getpriority setpriority
        localtime gmtime
        entertry leavetry -- can be used to 'hide' fatal errors
        entergiven leavegiven
        enterwhen leavewhen
        break continue
        smartmatch
        custom -- where should this go
  • :base_math

    These ops are not included in :base_core because of the risk of them being used to generate floating point exceptions (which would have to be caught using a $SIG{FPE} handler).

        atan2 sin cos exp log sqrt

    These ops are not included in :base_core because they have an effect beyond the scope of the compartment.

        rand srand
  • :base_thread

    These ops are related to multi-threading.

        lock
  • :default

    A handy tag name for a reasonable default set of ops. (The current ops allowed are unstable while development continues. It will change.)

        :base_core :base_mem :base_loop :base_orig :base_thread

    This list used to contain :base_io prior to Opcode 1.07.

    If safety matters to you (and why else would you be using the Opcode module?) then you should not rely on the definition of this, or indeed any other, optag!

  • :filesys_read
        stat lstat readlink
        ftatime ftblk ftchr ftctime ftdir fteexec fteowned
        fteread ftewrite ftfile ftis ftlink ftmtime ftpipe
        ftrexec ftrowned ftrread ftsgid ftsize ftsock ftsuid
        fttty ftzero ftrwrite ftsvtx
        fttext ftbinary
        fileno
  • :sys_db
        ghbyname ghbyaddr ghostent shostent ehostent      -- hosts
        gnbyname gnbyaddr gnetent snetent enetent         -- networks
        gpbyname gpbynumber gprotoent sprotoent eprotoent -- protocols
        gsbyname gsbyport gservent sservent eservent      -- services
        gpwnam gpwuid gpwent spwent epwent getlogin       -- users
        ggrnam ggrgid ggrent sgrent egrent                -- groups
  • :browse

    A handy tag name for a reasonable default set of ops beyond the :default optag. Like :default (and indeed all the other optags) its current definition is unstable while development continues. It will change.

    The :browse tag represents the next step beyond :default. It is a superset of the :default ops and adds :filesys_read and :sys_db. The intent is that scripts can access more (possibly sensitive) information about your system but not be able to change it.

        :default :filesys_read :sys_db
  • :filesys_open
        sysopen open close
        umask binmode
        open_dir closedir -- other dir ops are in :base_io
  • :filesys_write
        link unlink rename symlink truncate
        mkdir rmdir
        utime chmod chown
        fcntl -- not strictly filesys related, but possibly as dangerous?
  • :subprocess
        backtick system
        fork
        wait waitpid
        glob -- access to Cshell via <`rm *`>
  • :ownprocess
        exec exit kill
        time tms -- could be used for timing attacks (paranoid?)
  • :others

    This tag holds groups of assorted specialist opcodes that don't warrant having optags defined for them.

    SystemV Interprocess Communications:

        msgctl msgget msgrcv msgsnd
        semctl semget semop
        shmctl shmget shmread shmwrite
  • :load

    This tag holds opcodes related to loading modules and getting information about calling environment and args.

        require dofile
        caller runcv
  • :still_to_be_decided
        chdir
        flock ioctl
        socket getpeername ssockopt
        bind connect listen accept shutdown gsockopt getsockname
        sleep alarm -- changes global timer state and signal handling
        sort -- assorted problems including core dumps
        tied -- can be used to access object implementing a tie
        pack unpack -- can be used to create/use memory pointers
        hintseval -- constant op holding eval hints
        entereval -- can be used to hide code from initial compile
        reset
        dbstate -- perl -d version of nextstate(ment) opcode
  • :dangerous

    This tag is simply a bucket for opcodes that are unlikely to be used via a tag name but need to be tagged for completeness and documentation.

        syscall dump chroot

SEE ALSO

ops -- perl pragma interface to Opcode module.

Safe -- Opcode and namespace limited execution compartments

AUTHORS

Originally designed and implemented by Malcolm Beattie, mbeattie@sable.ox.ac.uk as part of Safe version 1.

Split out from Safe module version 1, named opcode tags and other changes added by Tim Bunce.

 
POSIX

NAME

POSIX - Perl interface to IEEE Std 1003.1

SYNOPSIS

    use POSIX ();
    use POSIX qw(setsid);
    use POSIX qw(:errno_h :fcntl_h);

    printf "EINTR is %d\n", EINTR;

    $sess_id = POSIX::setsid();

    $fd = POSIX::open($path, O_CREAT|O_EXCL|O_WRONLY, 0644);
    # note: that's a filedescriptor, *NOT* a filehandle

DESCRIPTION

The POSIX module permits you to access all (or nearly all) the standard POSIX 1003.1 identifiers. Many of these identifiers have been given Perl-ish interfaces.

Everything is exported by default with the exception of any POSIX functions with the same name as a built-in Perl function, such as abs, alarm, rmdir, write, etc., which will be exported only if you ask for them explicitly. This is an unfortunate backwards compatibility feature. You can stop the exporting by saying use POSIX () and then using the fully qualified names (i.e. POSIX::SEEK_END), or by giving an explicit import list. If you do neither, and opt for the default, use POSIX; has to import 553 symbols.
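
The two ways of avoiding the default export can be sketched as follows. The functions and constant used are arbitrary examples.

```perl
use strict;
use warnings;

# Import nothing and use fully qualified names ...
use POSIX ();
my $seek_end = POSIX::SEEK_END();

# ... or import only the names that are needed.
use POSIX qw(floor ceil);
my ($down, $up) = (floor(3.7), ceil(3.2));

print "SEEK_END=$seek_end floor=$down ceil=$up\n";
```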

This document gives a condensed list of the features available in the POSIX module. Consult your operating system's manpages for general information on most features. Consult perlfunc for functions which are noted as being identical to Perl's builtin functions.

The first section describes POSIX functions from the 1003.1 specification. The second section describes some classes for signal objects, TTY objects, and other miscellaneous objects. The remaining sections list various constants and macros in an organization which roughly follows IEEE Std 1003.1b-1993.

CAVEATS

A few functions are not implemented because they are C specific. If you attempt to call these, they will print a message telling you that they aren't implemented, and suggest using the Perl equivalent should one exist. For example, trying to access the setjmp() call will elicit the message "setjmp() is C-specific: use eval {} instead".

Furthermore, some evil vendors will claim 1003.1 compliance, but in fact are not so: they will not pass the PCTS (POSIX Compliance Test Suites). For example, one vendor may not define EDEADLK, or the semantics of the errno values set by open(2) might not be quite right. Perl does not attempt to verify POSIX compliance. That means you can currently successfully say "use POSIX", and then later in your program you find that your vendor has been lax and there's no usable ICANON macro after all. This could be construed to be a bug.

FUNCTIONS

  • _exit

    This is identical to the C function _exit(). It exits the program immediately, which means, among other things, that buffered I/O is not flushed.

    Note that when using threads on Linux this is not a good way to exit a thread, because on Linux processes and threads are kind of the same thing (note: while this is the situation in early 2003, there are projects under way to have threads with more POSIX-like semantics in Linux). If you do not want to return from a thread, detach the thread.

  • abort

    This is identical to the C function abort(). It terminates the process with a SIGABRT signal unless the signal is caught by a handler that does not return normally (for example, one that does a longjmp).

  • abs

    This is identical to Perl's builtin abs() function, returning the absolute value of its numerical argument.

  • access

    Determines the accessibility of a file.

        if( POSIX::access( "/", &POSIX::R_OK ) ){
            print "have read permission\n";
        }

    Returns undef on failure. Note: do not use access() for security purposes. Between the access() call and the operation you are preparing for the permissions might change: a classic race condition.
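
    One way to avoid the race is to attempt the operation itself and check its result. The following is a sketch only; the scratch file is created purely for illustration.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempfile);

# Hypothetical scratch file, created only for this illustration.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# access() is fine as a hint, for example to word a message ...
my $readable = POSIX::access($path, POSIX::R_OK());
print "looks readable\n" if $readable;

# ... but for the real operation, just try it and check the result,
# so that no window exists between the check and the use.
my $opened = 0;
if (open my $fh, '<', $path) {
    $opened = 1;
    print "opened for reading\n";
    close $fh;
}
else {
    warn "cannot open $path: $!\n";
}
```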

  • acos

    This is identical to the C function acos(), returning the arcus cosine of its numerical argument. See also Math::Trig.

  • alarm

    This is identical to Perl's builtin alarm() function, either for arming or disarming the SIGALRM timer.

  • asctime

    This is identical to the C function asctime(). It returns a string of the form

        "Fri Jun  2 18:22:13 2000\n\0"

    and it is called thusly

        $asctime = asctime($sec, $min, $hour, $mday, $mon, $year,
                           $wday, $yday, $isdst);

    The $mon is zero-based: January equals 0. The $year is 1900-based: 2001 equals 101. $wday and $yday default to zero (and are usually ignored anyway), and $isdst defaults to -1.
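
    As a sketch, the string shown above can be produced like this. $wday and $yday are passed explicitly here, as an assumption-free way of not depending on how the defaults are treated.

```perl
use strict;
use warnings;
use POSIX ();

# 2 June 2000, 18:22:13. June is month 5 and 2000 is year 100.
# That date was a Friday ($wday 5) and day 153 of the year ($yday,
# counted from zero).
my $string = POSIX::asctime(13, 22, 18, 2, 5, 100, 5, 153);

print $string;    # the string already ends in a newline
```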

  • asin

    This is identical to the C function asin(), returning the arcus sine of its numerical argument. See also Math::Trig.

  • assert

    Unimplemented, but you can use die and the Carp module to achieve similar things.

  • atan

    This is identical to the C function atan(), returning the arcus tangent of its numerical argument. See also Math::Trig.

  • atan2

    This is identical to Perl's builtin atan2() function, returning the arcus tangent defined by its two numerical arguments, the y coordinate and the x coordinate. See also Math::Trig.

  • atexit

    atexit() is C-specific: use END {} instead, see perlsub.

  • atof

    atof() is C-specific. Perl converts strings to numbers transparently. If you need to force a scalar to a number, add a zero to it.

  • atoi

    atoi() is C-specific. Perl converts strings to numbers transparently. If you need to force a scalar to a number, add a zero to it. If you need to have just the integer part, see int.

  • atol

    atol() is C-specific. Perl converts strings to numbers transparently. If you need to force a scalar to a number, add a zero to it. If you need to have just the integer part, see int.
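
    The replacements suggested for atof(), atoi() and atol() can be sketched as:

```perl
use strict;
use warnings;

my $text = '3.75';

my $number  = $text + 0;     # numeric value, in place of atof()
my $integer = int($text);    # integer part, in place of atoi() or atol()

print "$number $integer\n";
```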

  • bsearch

    bsearch() not supplied. For doing binary search on wordlists, see Search::Dict.

  • calloc

    calloc() is C-specific. Perl does memory management transparently.

  • ceil

    This is identical to the C function ceil(), returning the smallest integer value greater than or equal to the given numerical argument.

  • chdir

    This is identical to Perl's builtin chdir() function, allowing one to change the working (default) directory, see chdir.

  • chmod

    This is identical to Perl's builtin chmod() function, allowing one to change file and directory permissions, see chmod.

  • chown

    This is identical to Perl's builtin chown() function, allowing one to change file and directory owners and groups, see chown.

  • clearerr

    Use the method IO::Handle::clearerr() instead, to reset the error state (if any) and EOF state (if any) of the given stream.

  • clock

    This is identical to the C function clock(), returning the amount of spent processor time in microseconds.

  • close

    Close the file. This uses file descriptors such as those obtained by calling POSIX::open.

        $fd = POSIX::open( "foo", &POSIX::O_RDONLY );
        POSIX::close( $fd );

    Returns undef on failure.

    See also close.

  • closedir

    This is identical to Perl's builtin closedir() function for closing a directory handle, see closedir.

  • cos

    This is identical to Perl's builtin cos() function, for returning the cosine of its numerical argument, see cos. See also Math::Trig.

  • cosh

    This is identical to the C function cosh(), for returning the hyperbolic cosine of its numeric argument. See also Math::Trig.

  • creat

    Create a new file. This returns a file descriptor like the ones returned by POSIX::open. Use POSIX::close to close the file.

        $fd = POSIX::creat( "foo", 0611 );
        POSIX::close( $fd );

    See also sysopen and its O_CREAT flag.

  • ctermid

    Generates the path name for the controlling terminal.

        $path = POSIX::ctermid();

  • ctime

    This is identical to the C function ctime() and equivalent to asctime(localtime(...)), see asctime and localtime.

  • cuserid

    Get the login name of the owner of the current process.

        $name = POSIX::cuserid();

  • difftime

    This is identical to the C function difftime(), for returning the time difference (in seconds) between two times (as returned by time()), see time.

  • div

    div() is C-specific; use int on the usual / division to get the quotient, and the modulus operator % to get the remainder.

  • dup

    This is similar to the C function dup(), for duplicating a file descriptor.

    This uses file descriptors such as those obtained by calling POSIX::open.

    Returns undef on failure.

  • dup2

    This is similar to the C function dup2(), for duplicating a file descriptor to another known file descriptor.

    This uses file descriptors such as those obtained by calling POSIX::open.

    Returns undef on failure.

  • errno

    Returns the value of errno.

        $errno = POSIX::errno();

    This is identical to the numerical value of $!, see $ERRNO in perlvar.
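
    A minimal sketch of that equivalence follows. The missing path is hypothetical and is placed in a fresh temporary directory so that it cannot exist.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such-file";    # hypothetical path that does not exist

# The open fails, so errno is set; capture it before doing anything else.
my $fd         = POSIX::open($missing, POSIX::O_RDONLY());
my $from_perl  = $! + 0;
my $from_posix = POSIX::errno();

print "errno is $from_posix ($!)\n" unless defined $fd;
```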

  • execl

    execl() is C-specific, see exec.

  • execle

    execle() is C-specific, see exec.

  • execlp

    execlp() is C-specific, see exec.

  • execv

    execv() is C-specific, see exec.

  • execve

    execve() is C-specific, see exec.

  • execvp

    execvp() is C-specific, see exec.

  • exit

    This is identical to Perl's builtin exit() function for exiting the program, see exit.

  • exp

    This is identical to Perl's builtin exp() function for returning the exponential (e raised to the power) of the numerical argument, see exp.

  • fabs

    This is identical to Perl's builtin abs() function for returning the absolute value of the numerical argument, see abs.

  • fclose

    Use method IO::Handle::close() instead, or see close.

  • fcntl

    This is identical to Perl's builtin fcntl() function, see fcntl.

  • fdopen

    Use method IO::Handle::new_from_fd() instead, or see open.

  • feof

    Use method IO::Handle::eof() instead, or see eof.

  • ferror

    Use method IO::Handle::error() instead.

  • fflush

    Use method IO::Handle::flush() instead. See also $OUTPUT_AUTOFLUSH in perlvar.

  • fgetc

    Use method IO::Handle::getc() instead, or see read.

  • fgetpos

    Use method IO::Seekable::getpos() instead, or see seek.

  • fgets

    Use method IO::Handle::gets() instead. Similar to <>, also known as readline.

  • fileno

    Use method IO::Handle::fileno() instead, or see fileno.

  • floor

    This is identical to the C function floor(), returning the largest integer value less than or equal to the numerical argument.
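
    For negative arguments floor(), ceil() and Perl's builtin int differ, as this short sketch shows:

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);

my $x = -3.5;

my $floor = floor($x);    # largest integer <= $x
my $ceil  = ceil($x);     # smallest integer >= $x
my $trunc = int($x);      # int truncates toward zero

print "floor=$floor ceil=$ceil int=$trunc\n";
```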

  • fmod

    This is identical to the C function fmod().

        $r = fmod($x, $y);

    It returns the remainder $r = $x - $n*$y, where $n = trunc($x/$y). The $r has the same sign as $x and magnitude (absolute value) less than the magnitude of $y.
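
    A short sketch of the sign rule, contrasted with Perl's builtin % operator (whose result takes the sign of the right operand):

```perl
use strict;
use warnings;
use POSIX qw(fmod);

my $positive = fmod( 7.5, 2);    # remainder of 7.5 / 2
my $negative = fmod(-7.5, 2);    # same magnitude, sign follows $x

# Perl's % works on integers and follows the sign of the right operand.
my $perl_mod = -7 % 2;

print "$positive $negative $perl_mod\n";
```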

  • fopen

    Use method IO::File::open() instead, or see open.

  • fork

    This is identical to Perl's builtin fork() function for duplicating the current process, see fork, and perlfork if you are on Windows.

  • fpathconf

    Retrieves the value of a configurable limit on a file or directory. This uses file descriptors such as those obtained by calling POSIX::open.

    The following will determine the maximum allowable pathname length on the filesystem which holds /var/foo.

        $fd = POSIX::open( "/var/foo", &POSIX::O_RDONLY );
        $path_max = POSIX::fpathconf( $fd, &POSIX::_PC_PATH_MAX );

    Returns undef on failure.

  • fprintf

    fprintf() is C-specific, see printf instead.

  • fputc

    fputc() is C-specific, see print instead.

  • fputs

    fputs() is C-specific, see print instead.

  • fread

    fread() is C-specific, see read instead.

  • free

    free() is C-specific. Perl does memory management transparently.

  • freopen

    freopen() is C-specific, see open instead.

  • frexp

    Return the mantissa and exponent of a floating-point number.

        ($mantissa, $exponent) = POSIX::frexp( 1.234e56 );

  • fscanf

    fscanf() is C-specific, use <> and regular expressions instead.

  • fseek

    Use method IO::Seekable::seek() instead, or see seek.

  • fsetpos

    Use method IO::Seekable::setpos() instead, or see seek.

  • fstat

    Get file status. This uses file descriptors such as those obtained by calling POSIX::open. The data returned is identical to the data from Perl's builtin stat function.

        $fd = POSIX::open( "foo", &POSIX::O_RDONLY );
        @stats = POSIX::fstat( $fd );
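
    A fuller sketch of the descriptor-based calls, with the failure checks added. The scratch file is hypothetical and exists only for the illustration.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempfile);

# Hypothetical scratch file with a little content in it.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "hello\n";
close $tmp_fh or die "close failed: $!";

# POSIX::open returns a file descriptor, or undef on failure.
my $fd = POSIX::open($path, POSIX::O_RDONLY());
die "POSIX::open failed: $!" unless defined $fd;

my @stats = POSIX::fstat($fd);
my $size  = $stats[7];    # same layout as Perl's builtin stat

defined POSIX::close($fd) or die "POSIX::close failed: $!";

print "size is $size bytes\n";
```
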
  • fsync

    Use method IO::Handle::sync() instead.

  • ftell

    Use method IO::Seekable::tell() instead, or see tell.

  • fwrite

    fwrite() is C-specific, see print instead.

  • getc

    This is identical to Perl's builtin getc() function, see getc.

  • getchar

    Returns one character from STDIN. Identical to Perl's getc(), see getc.

  • getcwd

    Returns the name of the current working directory. See also Cwd.

  • getegid

    Returns the effective group identifier. Similar to Perl's builtin variable $), see $EGID in perlvar.

  • getenv

    Returns the value of the specified environment variable. The same information is available through the %ENV hash.

  • geteuid

    Returns the effective user identifier. Identical to Perl's builtin $> variable, see $EUID in perlvar.

  • getgid

    Returns the user's real group identifier. Similar to Perl's builtin variable $(, see $GID in perlvar.

  • getgrgid

    This is identical to Perl's builtin getgrgid() function for returning group entries by group identifiers, see getgrgid.

  • getgrnam

    This is identical to Perl's builtin getgrnam() function for returning group entries by group names, see getgrnam.

  • getgroups

    Returns the ids of the user's supplementary groups. Similar to Perl's builtin variable $) , see $GID in perlvar.

  • getlogin

    This is identical to Perl's builtin getlogin() function for returning the user name associated with the current session, see getlogin.

  • getpgrp

    This is identical to Perl's builtin getpgrp() function for returning the process group identifier of the current process, see getpgrp.

  • getpid

    Returns the process identifier. Identical to Perl's builtin variable $$ , see $PID in perlvar.

  • getppid

    This is identical to Perl's builtin getppid() function for returning the process identifier of the parent process of the current process, see getppid.

  • getpwnam

    This is identical to Perl's builtin getpwnam() function for returning user entries by user names, see getpwnam.

  • getpwuid

    This is identical to Perl's builtin getpwuid() function for returning user entries by user identifiers, see getpwuid.

  • gets

    Returns one line from STDIN , similar to <>, also known as the readline() function, see readline.

    NOTE: if you have C programs that still use gets() , be very afraid. The gets() function is a source of endless grief because it has no buffer overrun checks. It should never be used. The fgets() function should be preferred instead.

  • getuid

    Returns the user's identifier. Identical to Perl's builtin $< variable, see $UID in perlvar.

  • gmtime

    This is identical to Perl's builtin gmtime() function for converting seconds since the epoch to a date in Greenwich Mean Time, see gmtime.

  • isalnum

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered isalnum . Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:alnum:]]/ construct instead, or possibly the /\w/ construct.

  • isalpha

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered isalpha . Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:alpha:]]/ construct instead.

  • isatty

    Returns a boolean indicating whether the specified filehandle is connected to a tty. Similar to the -t operator, see -X.

  • iscntrl

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered iscntrl . Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:cntrl:]]/ construct instead.

  • isdigit

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered isdigit (unlikely, but still possible). Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:digit:]]/ construct instead, or the /\d/ construct.

  • isgraph

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered isgraph . Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:graph:]]/ construct instead.

  • islower

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered islower . Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:lower:]]/ construct instead. Do not use /[a-z]/ .

  • isprint

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered isprint . Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:print:]]/ construct instead.

  • ispunct

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered ispunct . Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:punct:]]/ construct instead.

  • isspace

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered isspace . Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:space:]]/ construct instead, or the /\s/ construct. (Note that /\s/ and /[[:space:]]/ are slightly different in that /[[:space:]]/ can normally match a vertical tab, while /\s/ does not.)

  • isupper

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered isupper . Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:upper:]]/ construct instead. Do not use /[A-Z]/ .

  • isxdigit

    This is identical to the C function, except that it can apply to a single character or to a whole string. Note that locale settings may affect what characters are considered isxdigit (unlikely, but still possible). Does not work on Unicode characters code point 256 or higher. Consider using regular expressions and the /[[:xdigit:]]/ construct instead, or simply /[0-9a-f]/i .
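
    The regular-expression replacements recommended for the is* functions above might be wrapped as follows. This is only a sketch: the helper names are hypothetical, and treating the empty string as a non-match is an assumption of this example rather than documented POSIX module behaviour.

```perl
use strict;
use warnings;

# Hypothetical helpers, not part of POSIX: each returns 1 only when
# every character of a non-empty string belongs to the class.
sub is_alnum  { return $_[0] =~ /\A[[:alnum:]]+\z/ ? 1 : 0 }
sub is_digit  { return $_[0] =~ /\A[[:digit:]]+\z/ ? 1 : 0 }
sub is_xdigit { return $_[0] =~ /\A[0-9a-f]+\z/i   ? 1 : 0 }
sub is_space  { return $_[0] =~ /\A[[:space:]]+\z/ ? 1 : 0 }

print is_alnum("abc123"), is_digit("12a"),
      is_xdigit("DEADbeef"), is_space(" \t\n"), "\n";   # prints "1011"
```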

  • kill

    This is identical to Perl's builtin kill() function for sending signals to processes (often to terminate them), see kill.

  • labs

    (For returning absolute values of long integers.) labs() is C-specific, see abs instead.

  • lchown

    This is identical to the C function, except the order of arguments is consistent with Perl's builtin chown() with the added restriction of only one path, not a list of paths. Does the same thing as the chown() function but changes the owner of a symbolic link instead of the file the symbolic link points to.

  • ldexp

    This is identical to the C function ldexp() for multiplying floating point numbers with powers of two.

        $x_quadrupled = POSIX::ldexp($x, 2);

  • ldiv

    (For computing quotients and remainders of long integers.) ldiv() is C-specific, use / and int() instead.

  • link

    This is identical to Perl's builtin link() function for creating hard links into files, see link.

  • localeconv

    Get numeric formatting information. Returns a reference to a hash containing the current locale formatting values.

    Here is how to query the database for the de (Deutsch or German) locale.

        my $loc = POSIX::setlocale( &POSIX::LC_ALL, "de" );
        print "Locale: \"$loc\"\n";
        my $lconv = POSIX::localeconv();
        foreach my $property (qw(
            decimal_point
            thousands_sep
            grouping
            int_curr_symbol
            currency_symbol
            mon_decimal_point
            mon_thousands_sep
            mon_grouping
            positive_sign
            negative_sign
            int_frac_digits
            frac_digits
            p_cs_precedes
            p_sep_by_space
            n_cs_precedes
            n_sep_by_space
            p_sign_posn
            n_sign_posn
        ))
        {
            printf qq(%s: "%s",\n), $property, $lconv->{$property};
        }

  • localtime

    This is identical to Perl's builtin localtime() function for converting seconds since the epoch to a date, see localtime.

  • log

    This is identical to Perl's builtin log() function, returning the natural (e-based) logarithm of the numerical argument, see log.

  • log10

    This is identical to the C function log10(), returning the base-10 logarithm of the numerical argument. You can also use

        sub log10 { log($_[0]) / log(10) }

    or

        sub log10 { log($_[0]) / 2.30258509299405 }

    or

        sub log10 { log($_[0]) * 0.434294481903252 }

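
    A small comparison of POSIX::log10() with the first pure-Perl variant, as a sketch. The name my_log10 is hypothetical and chosen only to avoid clashing with the POSIX function. The results are compared after rounding because the division can differ from the exact value in the last bit.

```perl
use strict;
use warnings;
use POSIX ();

# Pure-Perl fallback from the text above, under a different name.
sub my_log10 { return log($_[0]) / log(10) }

my $from_posix = POSIX::log10(1000);
my $from_perl  = my_log10(1000);

# Round for display: the two results may differ in the last bit.
printf "POSIX: %.6f  pure Perl: %.6f\n", $from_posix, $from_perl;
```
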
  • longjmp

    longjmp() is C-specific: use die instead.

  • lseek

    Move the file's read/write position. This uses file descriptors such as those obtained by calling POSIX::open.

        $fd = POSIX::open( "foo", &POSIX::O_RDONLY );
        $off_t = POSIX::lseek( $fd, 0, &POSIX::SEEK_SET );

    Returns undef on failure.
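
    To show the effect of moving the position, the sketch below seeks past the first six bytes of a scratch file and reads the rest. The scratch file and its contents are assumptions made for this example.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "hello world";
close $fh or die "close: $!";

my $fd = POSIX::open($path, POSIX::O_RDONLY());
defined $fd or die "POSIX::open: $!";

# Skip the first six bytes, then read the next five.
my $offset = POSIX::lseek($fd, 6, POSIX::SEEK_SET());
defined $offset or die "lseek: $!";

my $buf = '';
my $bytes = POSIX::read($fd, $buf, 5);
defined $bytes or die "read: $!";
POSIX::close($fd);

print "offset=$offset bytes=$bytes buf=$buf\n";   # offset=6 bytes=5 buf=world
```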

  • malloc

    malloc() is C-specific. Perl does memory management transparently.

  • mblen

    This is identical to the C function mblen() . Perl does not have any support for the wide and multibyte characters of the C standards, so this might be a rather useless function.

  • mbstowcs

    This is identical to the C function mbstowcs() . Perl does not have any support for the wide and multibyte characters of the C standards, so this might be a rather useless function.

  • mbtowc

    This is identical to the C function mbtowc() . Perl does not have any support for the wide and multibyte characters of the C standards, so this might be a rather useless function.

  • memchr

    memchr() is C-specific, see index instead.

  • memcmp

    memcmp() is C-specific, use eq instead, see perlop.

  • memcpy

    memcpy() is C-specific, use = , see perlop, or see substr.

  • memmove

    memmove() is C-specific, use = , see perlop, or see substr.

  • memset

    memset() is C-specific, use x instead, see perlop.

  • mkdir

    This is identical to Perl's builtin mkdir() function for creating directories, see mkdir.

  • mkfifo

    This is similar to the C function mkfifo() for creating FIFO special files.

        if (mkfifo($path, $mode)) { ... }

    Returns undef on failure. The $mode is similar to the mode of mkdir(), see mkdir, though for mkfifo you must specify the $mode .

  • mktime

    Convert date/time info to a calendar time.

    Synopsis:

        mktime(sec, min, hour, mday, mon, year, wday = 0, yday = 0, isdst = -1)

    The month (mon), weekday (wday), and yearday (yday) begin at zero. I.e., January is 0, not 1; Sunday is 0, not 1; January 1st is 0, not 1. The year (year) is given in years since 1900. I.e., the year 1995 is 95; the year 2001 is 101. Consult your system's mktime() manpage for details about these and the other arguments.

    Calendar time for December 12, 1995, at 10:30 am.

        $time_t = POSIX::mktime( 0, 30, 10, 12, 11, 95 );
        print "Date = ", POSIX::ctime($time_t);

    Returns undef on failure.
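
    The zero-based and 1900-based conventions are easy to get wrong, so here is a sketch that feeds the same date through mktime() and back through the builtin localtime(). It assumes the local timezone has no clock change at that moment.

```perl
use strict;
use warnings;
use POSIX qw(mktime);

# 10:30:00 on December 12, 1995: mon is 0-based, year counts from 1900.
my $time_t = mktime(0, 30, 10, 12, 11, 95);
defined $time_t or die "mktime failed";

# localtime() is the inverse and fills in wday and yday.
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday) = localtime($time_t);
printf "%04d-%02d-%02d %02d:%02d wday=%d yday=%d\n",
    $year + 1900, $mon + 1, $mday, $hour, $min, $wday, $yday;
```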

  • modf

    Return the integral and fractional parts of a floating-point number.

        ($fractional, $integral) = POSIX::modf( 3.14 );

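
    The order of the return values is easy to mix up. A short sketch, using values that are exactly representable in binary so the printed results are exact:

```perl
use strict;
use warnings;
use POSIX qw(modf);

# Note the order: the fractional part comes first.
my ($fractional, $integral) = modf(3.75);
print "integral=$integral fractional=$fractional\n";   # integral=3 fractional=0.75

# Both parts carry the sign of a negative input.
my ($neg_frac, $neg_int) = modf(-3.75);
print "integral=$neg_int fractional=$neg_frac\n";      # integral=-3 fractional=-0.75
```
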
  • nice

    This is similar to the C function nice() , for changing the scheduling preference of the current process. Positive arguments mean more polite process, negative values more needy process. Normal user processes can only be more polite.

    Returns undef on failure.

  • offsetof

    offsetof() is C-specific, you probably want to see pack instead.

  • open

    Open a file for reading or writing. This returns file descriptors, not Perl filehandles. Use POSIX::close to close the file.

    Open a file read-only with mode 0666.

        $fd = POSIX::open( "foo" );

    Open a file for read and write.

        $fd = POSIX::open( "foo", &POSIX::O_RDWR );

    Open a file for write, with truncation.

        $fd = POSIX::open( "foo", &POSIX::O_WRONLY | &POSIX::O_TRUNC );

    Create a new file with mode 0640. Set up the file for writing.

        $fd = POSIX::open( "foo", &POSIX::O_CREAT | &POSIX::O_WRONLY, 0640 );

    Returns undef on failure.

    See also sysopen.

  • opendir

    Open a directory for reading.

        $dir = POSIX::opendir( "/var" );
        @files = POSIX::readdir( $dir );
        POSIX::closedir( $dir );

    Returns undef on failure.

  • pathconf

    Retrieves the value of a configurable limit on a file or directory.

    The following will determine the maximum length of the longest allowable pathname on the filesystem which holds /var.

        $path_max = POSIX::pathconf( "/var", &POSIX::_PC_PATH_MAX );

    Returns undef on failure.

  • pause

    This is similar to the C function pause() , which suspends the execution of the current process until a signal is received.

    Returns undef on failure.

  • perror

    This is identical to the C function perror() , which outputs to the standard error stream the specified message followed by ": " and the current error string. Use the warn() function and the $! variable instead, see warn and $ERRNO in perlvar.

  • pipe

    Create an interprocess channel. This returns file descriptors like those returned by POSIX::open.

        my ($read, $write) = POSIX::pipe();
        POSIX::write( $write, "hello", 5 );
        POSIX::read( $read, $buf, 5 );

    See also pipe.

  • pow

    Computes $x raised to the power $exponent .

        $ret = POSIX::pow( $x, $exponent );

    You can also use the ** operator, see perlop.

  • printf

    Formats and prints the specified arguments to STDOUT. See also printf.

  • putc

    putc() is C-specific, see print instead.

  • putchar

    putchar() is C-specific, see print instead.

  • puts

    puts() is C-specific, see print instead.

  • qsort

    qsort() is C-specific, see sort instead.

  • raise

    Sends the specified signal to the current process. See also kill, and $$ ($PID) in perlvar.

  • rand

    rand() is non-portable, see rand instead.

  • read

    Read from a file. This uses file descriptors such as those obtained by calling POSIX::open. If the buffer $buf is not large enough for the read then Perl will extend it to make room for the request.

        $fd = POSIX::open( "foo", &POSIX::O_RDONLY );
        $bytes = POSIX::read( $fd, $buf, 3 );

    Returns undef on failure.

    See also sysread.

  • readdir

    This is identical to Perl's builtin readdir() function for reading directory entries, see readdir.

  • realloc

    realloc() is C-specific. Perl does memory management transparently.

  • remove

    This is identical to Perl's builtin unlink() function for removing files, see unlink.

  • rename

    This is identical to Perl's builtin rename() function for renaming files, see rename.

  • rewind

    Seeks to the beginning of the file.

  • rewinddir

    This is identical to Perl's builtin rewinddir() function for rewinding directory entry streams, see rewinddir.

  • rmdir

    This is identical to Perl's builtin rmdir() function for removing (empty) directories, see rmdir.

  • scanf

    scanf() is C-specific, use <> and regular expressions instead, see perlre.

  • setgid

    Sets the real group identifier and the effective group identifier for this process. Similar to assigning a value to Perl's builtin $) variable, see $EGID in perlvar, except that the latter changes only the effective group identifier, and that setgid() takes only a single numeric argument, as opposed to a space-separated list of numbers.

  • setjmp

    setjmp() is C-specific: use eval {} instead, see eval.

  • setlocale

    Modifies and queries the program's locale. The following examples assume

        use POSIX qw(setlocale LC_ALL LC_CTYPE);

    has been issued.

    The following will set the traditional UNIX system locale behavior (the second argument "C").

        $loc = setlocale( LC_ALL, "C" );

    The following will query the current LC_CTYPE category. (No second argument means 'query'.)

        $loc = setlocale( LC_CTYPE );

    The following will set the LC_CTYPE behaviour according to the locale environment variables (the second argument ""). Please see your system's setlocale(3) documentation for the locale environment variables' meaning or consult perllocale.

        $loc = setlocale( LC_CTYPE, "" );

    The following will set the LC_COLLATE behaviour to Argentinian Spanish. NOTE: The naming and availability of locales depends on your operating system. Please consult perllocale for how to find out which locales are available in your system.

        $loc = setlocale( LC_COLLATE, "es_AR.ISO8859-1" );

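
    Combining the query and set forms gives a save-and-restore pattern. The sketch below uses the LC_NUMERIC category and the "C" locale because the latter is available everywhere; both choices are made for this example only.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_NUMERIC);

# Query only: remember whatever is in effect now.
my $saved = setlocale(LC_NUMERIC);

# Switch to the always-available "C" locale for predictable output.
my $loc = setlocale(LC_NUMERIC, "C");
defined $loc or die "setlocale failed";
print "now using: $loc\n";

# Restore the previous setting.
setlocale(LC_NUMERIC, $saved) if defined $saved;
```
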
  • setpgid

    This is similar to the C function setpgid() for setting the process group identifier of the current process.

    Returns undef on failure.

  • setsid

    This is identical to the C function setsid() for setting the session identifier of the current process.

  • setuid

    Sets the real user identifier and the effective user identifier for this process. Similar to assigning a value to the Perl's builtin $< variable, see $UID in perlvar, except that the latter will change only the real user identifier.

  • sigaction

    Detailed signal management. This uses POSIX::SigAction objects for the action and oldaction arguments (the oldaction can also be just a hash reference). Consult your system's sigaction manpage for details, see also POSIX::SigRt .

    Synopsis:

        sigaction(signal, action, oldaction = 0)

    Returns undef on failure. The signal must be a number (like SIGHUP), not a string (like "SIGHUP"), though Perl does try hard to understand you.

    If you use the SA_SIGINFO flag, the signal handler receives, in addition to the first argument (the signal name), a second argument: a hash reference containing the following keys, with semantics as defined by POSIX/SUSv3:

        signo       the signal number
        errno       the error number
        code        if this is zero or less, the signal was sent by
                    a user process and the uid and pid make sense,
                    otherwise the signal was sent by the kernel

    The following are also defined by POSIX/SUSv3, but unfortunately not very widely implemented:

        pid         the process id generating the signal
        uid         the uid of the process id generating the signal
        status      exit value or signal for SIGCHLD
        band        band event for SIGPOLL

    A third argument is also passed to the handler, which contains a copy of the raw binary contents of the siginfo structure: if a system has some non-POSIX fields, this third argument is where to unpack() them from.

    Note that not all siginfo values make sense simultaneously (some are valid only for certain signals, for example), and not all values make sense from a Perl perspective; you should consult your system's sigaction and possibly also siginfo documentation.
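
    A minimal sketch of installing a handler through sigaction, assuming a Unix-like system where SIGUSR1 exists. The handler body and the choice of signal are illustrative, and the "safe" flag is set so that the handler runs as a deferred signal.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);

my $seen = '';

# The handler receives the signal name as its first argument.
my $action = POSIX::SigAction->new(
    sub { $seen = $_[0] },      # handler
    POSIX::SigSet->new,         # no extra signals blocked while it runs
    0,                          # no sa_flags
);
$action->safe(1);               # deliver as a deferred ("safe") signal

sigaction(SIGUSR1, $action) or die "sigaction: $!";

kill 'USR1', $$;                # send the signal to ourselves

print "handler saw: $seen\n";
```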

  • siglongjmp

    siglongjmp() is C-specific: use die instead.

  • sigpending

    Examine signals that are blocked and pending. This uses POSIX::SigSet objects for the sigset argument. Consult your system's sigpending manpage for details.

    Synopsis:

        sigpending(sigset)

    Returns undef on failure.

  • sigprocmask

    Change and/or examine the calling process's signal mask. This uses POSIX::SigSet objects for the sigset and oldsigset arguments. Consult your system's sigprocmask manpage for details.

    Synopsis:

        sigprocmask(how, sigset, oldsigset = 0)

    Returns undef on failure.

    Note that you can't reliably block or unblock a signal from its own signal handler if you're using safe signals. Other signals can be blocked or unblocked reliably.
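
    The interplay of sigprocmask and sigpending can be sketched as follows, assuming a Unix-like system with SIGUSR1. The signal is blocked, sent, observed as pending, and then delivered once the old mask is restored.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);

my $count = 0;
$SIG{USR1} = sub { $count++ };

my $block    = POSIX::SigSet->new(SIGUSR1);
my $old_mask = POSIX::SigSet->new;

# Block SIGUSR1, saving the previous mask.
sigprocmask(SIG_BLOCK, $block, $old_mask) or die "sigprocmask: $!";

kill 'USR1', $$;                      # stays pending while blocked

my $pending = POSIX::SigSet->new;
sigpending($pending) or die "sigpending: $!";
my $is_pending          = $pending->ismember(SIGUSR1);
my $count_while_blocked = $count;

# Restore the old mask; the pending signal is now delivered.
sigprocmask(SIG_SETMASK, $old_mask) or die "sigprocmask: $!";

print "pending=$is_pending before=$count_while_blocked after=$count\n";
```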

  • sigsetjmp

    sigsetjmp() is C-specific: use eval {} instead, see eval.

  • sigsuspend

    Install a signal mask and suspend process until signal arrives. This uses POSIX::SigSet objects for the signal_mask argument. Consult your system's sigsuspend manpage for details.

    Synopsis:

        sigsuspend(signal_mask)

    Returns undef on failure.

  • sin

    This is identical to Perl's builtin sin() function for returning the sine of the numerical argument, see sin. See also Math::Trig.

  • sinh

    This is identical to the C function sinh() for returning the hyperbolic sine of the numerical argument. See also Math::Trig.

  • sleep

    This is functionally identical to Perl's builtin sleep() function for suspending the execution of the current process for a certain number of seconds, see sleep. There is one significant difference, however: POSIX::sleep() returns the number of unslept seconds, while CORE::sleep() returns the number of slept seconds.

  • sprintf

    This is similar to Perl's builtin sprintf() function for returning a string that has the arguments formatted as requested, see sprintf.

  • sqrt

    This is identical to Perl's builtin sqrt() function for returning the square root of the numerical argument, see sqrt.

  • srand

    Give a seed to the pseudorandom number generator, see srand.

  • sscanf

    sscanf() is C-specific, use regular expressions instead, see perlre.

  • stat

    This is identical to Perl's builtin stat() function for returning information about files and directories.

  • strcat

    strcat() is C-specific, use .= instead, see perlop.

  • strchr

    strchr() is C-specific, see index instead.

  • strcmp

    strcmp() is C-specific, use eq or cmp instead, see perlop.

  • strcoll

    This is identical to the C function strcoll() for collating (comparing) strings transformed using the strxfrm() function. Not really needed since Perl can do this transparently, see perllocale.

  • strcpy

    strcpy() is C-specific, use = instead, see perlop.

  • strcspn

    strcspn() is C-specific, use regular expressions instead, see perlre.

  • strerror

    Returns the error string for the specified errno. Identical to the string form of the $! , see $ERRNO in perlvar.

  • strftime

    Convert date and time information to string. Returns the string.

    Synopsis:

        strftime(fmt, sec, min, hour, mday, mon, year, wday = -1, yday = -1, isdst = -1)

    The month (mon), weekday (wday), and yearday (yday) begin at zero. I.e., January is 0, not 1; Sunday is 0, not 1; January 1st is 0, not 1. The year (year) is given in years since 1900. I.e., the year 1995 is 95; the year 2001 is 101. Consult your system's strftime() manpage for details about these and the other arguments.

    If you want your code to be portable, your format (fmt ) argument should use only the conversion specifiers defined by the ANSI C standard (C89, to play safe). These are aAbBcdHIjmMpSUwWxXyYZ% . But even then, the results of some of the conversion specifiers are non-portable. For example, the specifiers aAbBcpZ change according to the locale settings of the user, and both how to set locales (the locale names) and what output to expect are non-standard. The specifier c changes according to the timezone settings of the user and the timezone computation rules of the operating system. The Z specifier is notoriously unportable since the names of timezones are non-standard. Sticking to the numeric specifiers is the safest route.

    The given arguments are made consistent as though by calling mktime() before calling your system's strftime() function, except that the isdst value is not affected.

    The string for Tuesday, December 12, 1995.

        $str = POSIX::strftime( "%A, %B %d, %Y", 0, 0, 0, 12, 11, 95, 2 );
        print "$str\n";

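
    Following the portability advice above, a sketch that sticks to numeric conversion specifiers, so the output does not depend on the user's locale. The format strings are examples, not a recommendation from the original text.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Numeric specifiers only, so the result does not depend on the locale.
my $stamp = strftime("%Y-%m-%d %H:%M:%S", 0, 30, 10, 12, 11, 95);
print "$stamp\n";   # prints "1995-12-12 10:30:00"

# The list returned by localtime() (or gmtime()) can be passed directly.
my $today = strftime("%Y-%m-%d", localtime);
print "$today\n";
```
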
  • strlen

    strlen() is C-specific, use length() instead, see length.

  • strncat

    strncat() is C-specific, use .= instead, see perlop.

  • strncmp

    strncmp() is C-specific, use eq instead, see perlop.

  • strncpy

    strncpy() is C-specific, use = instead, see perlop.

  • strpbrk

    strpbrk() is C-specific, use regular expressions instead, see perlre.

  • strrchr

    strrchr() is C-specific, see rindex instead.

  • strspn

    strspn() is C-specific, use regular expressions instead, see perlre.

  • strstr

    This is identical to Perl's builtin index() function, see index.

  • strtod

    String to double translation. Returns the parsed number and the number of characters in the unparsed portion of the string. Truly POSIX-compliant systems set $! ($ERRNO) to indicate a translation error, so clear $! before calling strtod. However, non-POSIX systems may not check for overflow, and therefore will never set $!.

    strtod should respect any POSIX setlocale() settings.

    To parse a string $str as a floating point number use

        $! = 0;
        ($num, $n_unparsed) = POSIX::strtod($str);

    The second returned item and $! can be used to check for valid input:

        if (($str eq '') || ($n_unparsed != 0) || $!) {
            die "Non-numeric input $str" . ($! ? ": $!\n" : "\n");
        }

    When called in a scalar context strtod returns the parsed number.
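
    The validity check above can be folded into a small wrapper. This is a sketch only: parse_float is a hypothetical helper, and LC_NUMERIC is forced to "C" here so that "." is the decimal point whatever the environment says.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_NUMERIC);

# Use "." as the decimal point regardless of the environment.
setlocale(LC_NUMERIC, "C");

# Hypothetical helper: returns the number, or undef for invalid input.
sub parse_float {
    my ($str) = @_;
    return if !defined $str || $str eq '';
    local $! = 0;
    my ($num, $n_unparsed) = POSIX::strtod($str);
    return if $n_unparsed != 0 || $!;
    return $num;
}

for my $input ("3.14", "1e3", "12abc", "") {
    my $value = parse_float($input);
    printf "%-7s => %s\n", "'$input'", defined $value ? $value : "invalid";
}
```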

  • strtok

    strtok() is C-specific, use regular expressions instead, see perlre, or split.

  • strtol

    String to (long) integer translation. Returns the parsed number and the number of characters in the unparsed portion of the string. Truly POSIX-compliant systems set $! ($ERRNO) to indicate a translation error, so clear $! before calling strtol. However, non-POSIX systems may not check for overflow, and therefore will never set $!.

    strtol should respect any POSIX setlocale() settings.

    To parse a string $str as a number in some base $base use

        $! = 0;
        ($num, $n_unparsed) = POSIX::strtol($str, $base);

    The base should be zero or between 2 and 36, inclusive. When the base is zero or omitted strtol will use the string itself to determine the base: a leading "0x" or "0X" means hexadecimal; a leading "0" means octal; any other leading characters mean decimal. Thus, "1234" is parsed as a decimal number, "01234" as an octal number, and "0x1234" as a hexadecimal number.

    The second returned item and $! can be used to check for valid input:

        if (($str eq '') || ($n_unparsed != 0) || $!) {
            die "Non-numeric input $str" . ($! ? ": $!\n" : "\n");
        }

    When called in a scalar context strtol returns the parsed number.
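
    The base-detection rules described above can be tried out with a short sketch. The input strings are arbitrary examples, and the %parsed hash exists only to collect the results.

```perl
use strict;
use warnings;
use POSIX ();

my %parsed;

# Each entry is [string, base]; base 0 lets strtol infer it from the prefix.
for my $case (["1234", 0], ["01234", 0], ["0x1234", 0], ["777", 8], ["zz", 36]) {
    my ($str, $base) = @$case;
    my ($num, $n_unparsed) = POSIX::strtol($str, $base);
    $parsed{$str} = $num;
    print "strtol('$str', $base) = $num ($n_unparsed unparsed)\n";
}
```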

  • strtoul

    String to unsigned (long) integer translation. strtoul() is identical to strtol() except that strtoul() only parses unsigned integers. See strtol for details.

    Note: Some vendors supply strtod() and strtol() but not strtoul(). Other vendors that do supply strtoul() parse "-1" as a valid value.

  • strxfrm

    String transformation. Returns the transformed string.

        $dst = POSIX::strxfrm( $src );

    Used in conjunction with the strcoll() function, see strcoll.

    Not really needed since Perl can do this transparently, see perllocale.

  • sysconf

    Retrieves values of system configurable variables.

    The following will get the number of clock ticks per second.

        $clock_ticks = POSIX::sysconf( &POSIX::_SC_CLK_TCK );

    Returns undef on failure.

  • system

    This is identical to Perl's builtin system() function, see system.

  • tan

    This is identical to the C function tan() , returning the tangent of the numerical argument. See also Math::Trig.

  • tanh

    This is identical to the C function tanh() , returning the hyperbolic tangent of the numerical argument. See also Math::Trig.

  • tcdrain

    This is similar to the C function tcdrain() for draining the output queue of its argument stream.

    Returns undef on failure.

  • tcflow

    This is similar to the C function tcflow() for controlling the flow of its argument stream.

    Returns undef on failure.

  • tcflush

    This is similar to the C function tcflush() for flushing the I/O buffers of its argument stream.

    Returns undef on failure.

  • tcgetpgrp

    This is identical to the C function tcgetpgrp() for returning the process group identifier of the foreground process group of the controlling terminal.

  • tcsendbreak

    This is similar to the C function tcsendbreak() for sending a break on its argument stream.

    Returns undef on failure.

  • tcsetpgrp

    This is similar to the C function tcsetpgrp() for setting the process group identifier of the foreground process group of the controlling terminal.

    Returns undef on failure.

  • time

    This is identical to Perl's builtin time() function for returning the number of seconds since the epoch (whatever it is for the system), see time.

  • times

    The times() function returns elapsed realtime since some point in the past (such as system startup), user and system times for this process, and user and system times used by child processes. All times are returned in clock ticks.

        ($realtime, $user, $system, $cuser, $csystem) = POSIX::times();

    Note: Perl's builtin times() function returns four values, measured in seconds.
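
    Since POSIX::times() reports clock ticks, converting to seconds needs the tick rate. The sketch below assumes a system where sysconf supports _SC_CLK_TCK.

```perl
use strict;
use warnings;
use POSIX ();

# Clock ticks per second; needed to turn POSIX::times() values into seconds.
my $ticks_per_sec = POSIX::sysconf(POSIX::_SC_CLK_TCK());
defined $ticks_per_sec or die "sysconf: $!";

my ($realtime, $user, $system, $cuser, $csystem) = POSIX::times();

printf "user: %.2fs  system: %.2fs\n",
    $user / $ticks_per_sec, $system / $ticks_per_sec;
```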

  • tmpfile

    Use method IO::File::new_tmpfile() instead, or see File::Temp.

  • tmpnam

    Returns a name for a temporary file.

        $tmpfile = POSIX::tmpnam();

    For security reasons, which are probably detailed in your system's documentation for the C library tmpnam() function, this interface should not be used; instead see File::Temp.

  • tolower

    This is identical to the C function, except that it can apply to a single character or to a whole string. Consider using the lc() function, see lc, or the equivalent \L operator inside doublequotish strings.

  • toupper

    This is identical to the C function, except that it can apply to a single character or to a whole string. Consider using the uc() function, see uc, or the equivalent \U operator inside doublequotish strings.

  • ttyname

    This is identical to the C function ttyname() for returning the name of the current terminal.

  • tzname

    Retrieves the time conversion information from the tzname variable.

        POSIX::tzset();
        ($std, $dst) = POSIX::tzname();

  • tzset

    This is identical to the C function tzset() for setting the current timezone based on the environment variable TZ , to be used by ctime() , localtime(), mktime() , and strftime() functions.

  • umask

    This is identical to Perl's builtin umask() function for setting (and querying) the file creation permission mask, see umask.

  • uname

    Get name of current operating system.

        ($sysname, $nodename, $release, $version, $machine) = POSIX::uname();

    Note that the actual meanings of the various fields are not that well standardized, do not expect any great portability. The $sysname might be the name of the operating system, the $nodename might be the name of the host, the $release might be the (major) release number of the operating system, the $version might be the (minor) release number of the operating system, and the $machine might be a hardware identifier. Maybe.

  • ungetc

    Use method IO::Handle::ungetc() instead.

  • unlink

    This is identical to Perl's builtin unlink() function for removing files, see unlink.

  • utime

    This is identical to Perl's builtin utime() function for changing the time stamps of files and directories, see utime.

  • vfprintf

    vfprintf() is C-specific, see printf instead.

  • vprintf

    vprintf() is C-specific, see printf instead.

  • vsprintf

    vsprintf() is C-specific, see sprintf instead.

  • wait

    This is identical to Perl's builtin wait() function, see wait.

  • waitpid

    Wait for a child process to change state. This is identical to Perl's builtin waitpid() function, see waitpid.

        $pid = POSIX::waitpid( -1, POSIX::WNOHANG );
        print "status = ", ($? >> 8), "\n";

  • wcstombs

    This is identical to the C function wcstombs(). Perl does not have any support for the wide and multibyte characters of the C standards, so this might be a rather useless function.

  • wctomb

    This is identical to the C function wctomb(). Perl does not have any support for the wide and multibyte characters of the C standards, so this might be a rather useless function.

  • write

    Write to a file. This uses file descriptors such as those obtained by calling POSIX::open.

    $fd = POSIX::open( "foo", &POSIX::O_WRONLY );
    $buf = "hello";
    $bytes = POSIX::write( $fd, $buf, 5 );

    Returns undef on failure.

    See also syswrite.
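    A slightly fuller sketch of the same descriptor-level sequence, with each return value checked for undef. The scratch file comes from File::Temp and exists only for illustration; in real code the path would be your own.

```perl
use strict;
use warnings;
use POSIX ();
use File::Temp ();

# Hypothetical scratch file; File::Temp removes it when $tmp goes away.
my $tmp  = File::Temp->new;
my $path = $tmp->filename;

# POSIX::open returns a numeric file descriptor, or undef on failure.
my $fd = POSIX::open($path, POSIX::O_WRONLY() | POSIX::O_TRUNC());
defined $fd or die "POSIX::open failed: $!";

my $buf   = "hello";
my $bytes = POSIX::write($fd, $buf, length $buf);
defined $bytes or die "POSIX::write failed: $!";

defined POSIX::close($fd) or die "POSIX::close failed: $!";

print "wrote $bytes bytes to $path\n";
```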

CLASSES

POSIX::SigAction

  • new

    Creates a new POSIX::SigAction object which corresponds to the C struct sigaction. This object will be destroyed automatically when it is no longer needed. The first parameter is the handler, a sub reference. The second parameter is a POSIX::SigSet object; it defaults to the empty set. The third parameter contains the sa_flags; it defaults to 0.

    $sigset = POSIX::SigSet->new(SIGINT, SIGQUIT);
    $sigaction = POSIX::SigAction->new( \&handler, $sigset, &POSIX::SA_NOCLDSTOP );

    This POSIX::SigAction object is intended for use with the POSIX::sigaction() function.

  • handler
  • mask
  • flags

    Accessor functions to get/set the values of a SigAction object.

    $sigset = $sigaction->mask;
    $sigaction->flags(&POSIX::SA_RESTART);
  • safe

    Accessor function for the "safe signals" flag of a SigAction object; see perlipc for general information on safe (a.k.a. "deferred") signals. If you wish to handle a signal safely, use this accessor to set the "safe" flag in the POSIX::SigAction object:

    1. $sigaction->safe(1);

    You may also examine the "safe" flag on the output action object which is filled in when given as the third parameter to POSIX::sigaction():

    sigaction(SIGINT, $new_action, $old_action);
    if ($old_action->safe) {
        # previous SIGINT handler used safe signals
    }
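    The pieces described for POSIX::SigAction can be put together roughly as follows. This is a sketch that assumes a Unix-like system where SIGUSR1 exists; the counter variable and the bounded wait loop are illustrative choices, not part of the POSIX module.

```perl
use strict;
use warnings;
use POSIX qw(SIGUSR1 SA_RESTART sigaction);

my $caught = 0;                     # illustrative counter bumped by the handler

# Empty set: no additional signals are blocked while the handler runs.
my $mask   = POSIX::SigSet->new;
my $action = POSIX::SigAction->new(sub { $caught++ }, $mask, SA_RESTART);
$action->safe(1);                   # request deferred ("safe") delivery

# Receives the previously installed action.
my $old = POSIX::SigAction->new;
sigaction(SIGUSR1, $action, $old) or die "sigaction failed: $!";

kill 'USR1', $$;                    # send the signal to this process

# Bounded wait so the sketch cannot hang if delivery is delayed.
for (1 .. 50) {
    last if $caught;
    select(undef, undef, undef, 0.1);
}

print "handler ran $caught time(s)\n";
```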

POSIX::SigRt

  • %SIGRT

    A hash of the POSIX realtime signal handlers. It is an extension of the standard %SIG: $POSIX::SIGRT{SIGRTMIN} is roughly equivalent to $SIG{SIGRTMIN}, but the right POSIX moves (see below) are made with POSIX::SigSet and POSIX::sigaction instead of accessing %SIG.

    You can set the %POSIX::SIGRT elements to set the POSIX realtime signal handlers, use delete and exists on the elements, and use scalar on the %POSIX::SIGRT to find out how many POSIX realtime signals there are available (SIGRTMAX - SIGRTMIN + 1, the SIGRTMAX is a valid POSIX realtime signal).

    Setting the %SIGRT elements is equivalent to calling this:

    sub new {
        my ($rtsig, $handler, $flags) = @_;
        my $sigset = POSIX::SigSet->new($rtsig);
        my $sigact = POSIX::SigAction->new($handler, $sigset, $flags);
        sigaction($rtsig, $sigact);
    }

    The flags default to zero. If you want something different, you can either use local on $POSIX::SigRt::SIGACTION_FLAGS, or you can derive from POSIX::SigRt and define your own new() (the tied hash STORE method of the %SIGRT calls new($rtsig, $handler, $SIGACTION_FLAGS), where the $rtsig ranges from zero to SIGRTMAX - SIGRTMIN + 1).

    Just as with any signal, you can use sigaction($rtsig, undef, $oa) to retrieve the installed signal handler (or, rather, the signal action).

    NOTE: whether POSIX realtime signals really work in your system, or whether Perl has been compiled so that it works with them, is outside of this discussion.

  • SIGRTMIN

    Return the minimum POSIX realtime signal number available, or undef if no POSIX realtime signals are available.

  • SIGRTMAX

    Return the maximum POSIX realtime signal number available, or undef if no POSIX realtime signals are available.

POSIX::SigSet

  • new

    Create a new SigSet object. This object will be destroyed automatically when it is no longer needed. Arguments may be supplied to initialize the set.

    Create an empty set.

    1. $sigset = POSIX::SigSet->new;

    Create a set with SIGUSR1.

    1. $sigset = POSIX::SigSet->new( &POSIX::SIGUSR1 );
  • addset

    Add a signal to a SigSet object.

    1. $sigset->addset( &POSIX::SIGUSR2 );

    Returns undef on failure.

  • delset

    Remove a signal from the SigSet object.

    1. $sigset->delset( &POSIX::SIGUSR2 );

    Returns undef on failure.

  • emptyset

    Initialize the SigSet object to be empty.

    1. $sigset->emptyset();

    Returns undef on failure.

  • fillset

    Initialize the SigSet object to include all signals.

    1. $sigset->fillset();

    Returns undef on failure.

  • ismember

    Tests the SigSet object to see if it contains a specific signal.

    if( $sigset->ismember( &POSIX::SIGUSR1 ) ){
        print "contains SIGUSR1\n";
    }
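    The SigSet methods above are typically combined with POSIX::sigprocmask(). The following is a sketch assuming a Unix-like system; the variable names are illustrative, and the "critical section" is only a placeholder comment.

```perl
use strict;
use warnings;
use POSIX qw(SIGUSR1 SIGUSR2 SIG_BLOCK SIG_SETMASK sigprocmask);

# Start with SIGUSR1, add SIGUSR2, then take SIGUSR1 out again.
my $set = POSIX::SigSet->new(SIGUSR1);
$set->addset(SIGUSR2);
$set->delset(SIGUSR1);

my $has_usr1 = $set->ismember(SIGUSR1) ? 1 : 0;
my $has_usr2 = $set->ismember(SIGUSR2) ? 1 : 0;
print "SIGUSR1 in set: $has_usr1, SIGUSR2 in set: $has_usr2\n";

# Block the signals in $set, remembering the previous mask in $old.
my $old = POSIX::SigSet->new;
sigprocmask(SIG_BLOCK, $set, $old) or die "sigprocmask failed: $!";

# ... critical section that must not be interrupted by SIGUSR2 ...

# Restore the mask that was in effect before.
sigprocmask(SIG_SETMASK, $old) or die "sigprocmask failed: $!";
```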

POSIX::Termios

  • new

    Create a new Termios object. This object will be destroyed automatically when it is no longer needed. A Termios object corresponds to the termios C struct. new() mallocs a new one, getattr() fills it from a file descriptor, and setattr() sets a file descriptor's parameters to match Termios' contents.

    1. $termios = POSIX::Termios->new;
  • getattr

    Get terminal control attributes.

    Obtain the attributes for stdin.

    $termios->getattr( 0 ); # Recommended for clarity.
    $termios->getattr();

    Obtain the attributes for stdout.

    1. $termios->getattr( 1 )

    Returns undef on failure.

  • getcc

    Retrieve a value from the c_cc field of a termios object. The c_cc field is an array so an index must be specified.

    1. $c_cc[1] = $termios->getcc(1);
  • getcflag

    Retrieve the c_cflag field of a termios object.

    1. $c_cflag = $termios->getcflag;
  • getiflag

    Retrieve the c_iflag field of a termios object.

    1. $c_iflag = $termios->getiflag;
  • getispeed

    Retrieve the input baud rate.

    1. $ispeed = $termios->getispeed;
  • getlflag

    Retrieve the c_lflag field of a termios object.

    1. $c_lflag = $termios->getlflag;
  • getoflag

    Retrieve the c_oflag field of a termios object.

    1. $c_oflag = $termios->getoflag;
  • getospeed

    Retrieve the output baud rate.

    1. $ospeed = $termios->getospeed;
  • setattr

    Set terminal control attributes.

    Set attributes immediately for stdout.

    1. $termios->setattr( 1, &POSIX::TCSANOW );

    Returns undef on failure.

  • setcc

    Set a value in the c_cc field of a termios object. The c_cc field is an array so an index must be specified.

    1. $termios->setcc( &POSIX::VEOF, 1 );
  • setcflag

    Set the c_cflag field of a termios object.

    1. $termios->setcflag( $c_cflag | &POSIX::CLOCAL );
  • setiflag

    Set the c_iflag field of a termios object.

    1. $termios->setiflag( $c_iflag | &POSIX::BRKINT );
  • setispeed

    Set the input baud rate.

    1. $termios->setispeed( &POSIX::B9600 );

    Returns undef on failure.

  • setlflag

    Set the c_lflag field of a termios object.

    1. $termios->setlflag( $c_lflag | &POSIX::ECHO );
  • setoflag

    Set the c_oflag field of a termios object.

    1. $termios->setoflag( $c_oflag | &POSIX::OPOST );
  • setospeed

    Set the output baud rate.

    1. $termios->setospeed( &POSIX::B9600 );

    Returns undef on failure.

  • Baud rate values

    B38400 B75 B200 B134 B300 B1800 B150 B0 B19200 B1200 B9600 B600 B4800 B50 B2400 B110

  • Terminal interface values

    TCSADRAIN TCSANOW TCOON TCIOFLUSH TCOFLUSH TCION TCIFLUSH TCSAFLUSH TCIOFF TCOOFF

  • c_cc field values

    VEOF VEOL VERASE VINTR VKILL VQUIT VSUSP VSTART VSTOP VMIN VTIME NCCS

  • c_cflag field values

    CLOCAL CREAD CSIZE CS5 CS6 CS7 CS8 CSTOPB HUPCL PARENB PARODD

  • c_iflag field values

    BRKINT ICRNL IGNBRK IGNCR IGNPAR INLCR INPCK ISTRIP IXOFF IXON PARMRK

  • c_lflag field values

    ECHO ECHOE ECHOK ECHONL ICANON IEXTEN ISIG NOFLSH TOSTOP

  • c_oflag field values

    OPOST
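    As a read-only sketch of how the Termios methods and the c_lflag constants fit together, the helper below reports whether ECHO is set on a file descriptor. The helper name echo_enabled is hypothetical; it returns undef when getattr() fails, which is what happens for descriptors that are not terminals.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: 1 if ECHO is set, 0 if clear, undef if getattr fails.
sub echo_enabled {
    my ($fd) = @_;
    my $termios = POSIX::Termios->new;
    defined $termios->getattr($fd) or return;   # e.g. not a terminal
    return ($termios->getlflag & POSIX::ECHO()) ? 1 : 0;
}

my $state = echo_enabled(0);                    # 0 is stdin
if (defined $state) {
    print "echo on stdin is ", ($state ? "on" : "off"), "\n";
}
else {
    print "stdin is not a terminal\n";
}
```

    To actually change a setting, the same object would be modified with setlflag() and written back with setattr(), as described in the entries above.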

PATHNAME CONSTANTS

  • Constants

    _PC_CHOWN_RESTRICTED _PC_LINK_MAX _PC_MAX_CANON _PC_MAX_INPUT _PC_NAME_MAX _PC_NO_TRUNC _PC_PATH_MAX _PC_PIPE_BUF _PC_VDISABLE

POSIX CONSTANTS

  • Constants

    _POSIX_ARG_MAX _POSIX_CHILD_MAX _POSIX_CHOWN_RESTRICTED _POSIX_JOB_CONTROL _POSIX_LINK_MAX _POSIX_MAX_CANON _POSIX_MAX_INPUT _POSIX_NAME_MAX _POSIX_NGROUPS_MAX _POSIX_NO_TRUNC _POSIX_OPEN_MAX _POSIX_PATH_MAX _POSIX_PIPE_BUF _POSIX_SAVED_IDS _POSIX_SSIZE_MAX _POSIX_STREAM_MAX _POSIX_TZNAME_MAX _POSIX_VDISABLE _POSIX_VERSION

SYSTEM CONFIGURATION

  • Constants

    _SC_ARG_MAX _SC_CHILD_MAX _SC_CLK_TCK _SC_JOB_CONTROL _SC_NGROUPS_MAX _SC_OPEN_MAX _SC_PAGESIZE _SC_SAVED_IDS _SC_STREAM_MAX _SC_TZNAME_MAX _SC_VERSION

ERRNO

  • Constants

    E2BIG EACCES EADDRINUSE EADDRNOTAVAIL EAFNOSUPPORT EAGAIN EALREADY EBADF EBUSY ECHILD ECONNABORTED ECONNREFUSED ECONNRESET EDEADLK EDESTADDRREQ EDOM EDQUOT EEXIST EFAULT EFBIG EHOSTDOWN EHOSTUNREACH EINPROGRESS EINTR EINVAL EIO EISCONN EISDIR ELOOP EMFILE EMLINK EMSGSIZE ENAMETOOLONG ENETDOWN ENETRESET ENETUNREACH ENFILE ENOBUFS ENODEV ENOENT ENOEXEC ENOLCK ENOMEM ENOPROTOOPT ENOSPC ENOSYS ENOTBLK ENOTCONN ENOTDIR ENOTEMPTY ENOTSOCK ENOTTY ENXIO EOPNOTSUPP EPERM EPFNOSUPPORT EPIPE EPROCLIM EPROTONOSUPPORT EPROTOTYPE ERANGE EREMOTE ERESTART EROFS ESHUTDOWN ESOCKTNOSUPPORT ESPIPE ESRCH ESTALE ETIMEDOUT ETOOMANYREFS ETXTBSY EUSERS EWOULDBLOCK EXDEV

FCNTL

  • Constants

    FD_CLOEXEC F_DUPFD F_GETFD F_GETFL F_GETLK F_OK F_RDLCK F_SETFD F_SETFL F_SETLK F_SETLKW F_UNLCK F_WRLCK O_ACCMODE O_APPEND O_CREAT O_EXCL O_NOCTTY O_NONBLOCK O_RDONLY O_RDWR O_TRUNC O_WRONLY

FLOAT

  • Constants

    DBL_DIG DBL_EPSILON DBL_MANT_DIG DBL_MAX DBL_MAX_10_EXP DBL_MAX_EXP DBL_MIN DBL_MIN_10_EXP DBL_MIN_EXP FLT_DIG FLT_EPSILON FLT_MANT_DIG FLT_MAX FLT_MAX_10_EXP FLT_MAX_EXP FLT_MIN FLT_MIN_10_EXP FLT_MIN_EXP FLT_RADIX FLT_ROUNDS LDBL_DIG LDBL_EPSILON LDBL_MANT_DIG LDBL_MAX LDBL_MAX_10_EXP LDBL_MAX_EXP LDBL_MIN LDBL_MIN_10_EXP LDBL_MIN_EXP

LIMITS

  • Constants

    ARG_MAX CHAR_BIT CHAR_MAX CHAR_MIN CHILD_MAX INT_MAX INT_MIN LINK_MAX LONG_MAX LONG_MIN MAX_CANON MAX_INPUT MB_LEN_MAX NAME_MAX NGROUPS_MAX OPEN_MAX PATH_MAX PIPE_BUF SCHAR_MAX SCHAR_MIN SHRT_MAX SHRT_MIN SSIZE_MAX STREAM_MAX TZNAME_MAX UCHAR_MAX UINT_MAX ULONG_MAX USHRT_MAX

LOCALE

  • Constants

    LC_ALL LC_COLLATE LC_CTYPE LC_MONETARY LC_NUMERIC LC_TIME

MATH

  • Constants

    HUGE_VAL

SIGNAL

  • Constants

    SA_NOCLDSTOP SA_NOCLDWAIT SA_NODEFER SA_ONSTACK SA_RESETHAND SA_RESTART SA_SIGINFO SIGABRT SIGALRM SIGCHLD SIGCONT SIGFPE SIGHUP SIGILL SIGINT SIGKILL SIGPIPE SIGQUIT SIGSEGV SIGSTOP SIGTERM SIGTSTP SIGTTIN SIGTTOU SIGUSR1 SIGUSR2 SIG_BLOCK SIG_DFL SIG_ERR SIG_IGN SIG_SETMASK SIG_UNBLOCK

STAT

  • Constants

    S_IRGRP S_IROTH S_IRUSR S_IRWXG S_IRWXO S_IRWXU S_ISGID S_ISUID S_IWGRP S_IWOTH S_IWUSR S_IXGRP S_IXOTH S_IXUSR

  • Macros

    S_ISBLK S_ISCHR S_ISDIR S_ISFIFO S_ISREG

STDLIB

  • Constants

    EXIT_FAILURE EXIT_SUCCESS MB_CUR_MAX RAND_MAX

STDIO

  • Constants

    BUFSIZ EOF FILENAME_MAX L_ctermid L_cuserid L_tmpname TMP_MAX

TIME

  • Constants

    CLK_TCK CLOCKS_PER_SEC

UNISTD

  • Constants

    R_OK SEEK_CUR SEEK_END SEEK_SET STDIN_FILENO STDOUT_FILENO STDERR_FILENO W_OK X_OK

WAIT

  • Constants

    WNOHANG WUNTRACED

    • WNOHANG

      Do not suspend the calling process until a child process changes state but instead return immediately.

    • WUNTRACED

      Catch stopped child processes.

  • Macros

    WIFEXITED WEXITSTATUS WIFSIGNALED WTERMSIG WIFSTOPPED WSTOPSIG

    • WIFEXITED

      WIFEXITED(${^CHILD_ERROR_NATIVE}) returns true if the child process exited normally (exit() or by falling off the end of main()).

    • WEXITSTATUS

      WEXITSTATUS(${^CHILD_ERROR_NATIVE}) returns the normal exit status of the child process (only meaningful if WIFEXITED(${^CHILD_ERROR_NATIVE}) is true)

    • WIFSIGNALED

      WIFSIGNALED(${^CHILD_ERROR_NATIVE}) returns true if the child process terminated because of a signal

    • WTERMSIG

      WTERMSIG(${^CHILD_ERROR_NATIVE}) returns the signal the child process terminated for (only meaningful if WIFSIGNALED(${^CHILD_ERROR_NATIVE}) is true)

    • WIFSTOPPED

      WIFSTOPPED(${^CHILD_ERROR_NATIVE}) returns true if the child process is currently stopped (can happen only if you specified the WUNTRACED flag to waitpid())

    • WSTOPSIG

      WSTOPSIG(${^CHILD_ERROR_NATIVE}) returns the signal the child process was stopped for (only meaningful if WIFSTOPPED(${^CHILD_ERROR_NATIVE}) is true)
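A short sketch of how the macros above are usually applied after waiting for a child. It assumes a platform with a real fork(); the exit status 3 and the variable names are arbitrary choices for illustration.

```perl
use strict;
use warnings;
use POSIX qw(WIFEXITED WEXITSTATUS WIFSIGNALED WTERMSIG);

defined(my $pid = fork()) or die "fork failed: $!";
if ($pid == 0) {
    POSIX::_exit(3);                 # child: leave at once with status 3
}

waitpid($pid, 0) == $pid or die "waitpid failed: $!";
my $native = ${^CHILD_ERROR_NATIVE}; # raw status as returned by the system

my $summary = "ended in an unrecognised way";
if (WIFEXITED($native)) {
    $summary = "exited with status " . WEXITSTATUS($native);
}
elsif (WIFSIGNALED($native)) {
    $summary = "terminated by signal " . WTERMSIG($native);
}

print "child $pid $summary\n";
```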

 
Perl 5 version 18.2 documentation

PerlIO

NAME

PerlIO - On demand loader for PerlIO layers and root of PerlIO::* name space

SYNOPSIS

  open($fh, "<:crlf", "my.txt"); # support platform-native and CRLF text files
  open($fh, "<", "his.jpg");     # portably open a binary file for reading
  binmode($fh);

  Shell:
  PERLIO=perlio perl ....

DESCRIPTION

When an undefined layer 'foo' is encountered in an open or binmode layer specification then C code performs the equivalent of:

  1. use PerlIO 'foo';

The perl code in PerlIO.pm then attempts to locate a layer by doing

  1. require PerlIO::foo;

Otherwise the PerlIO package is a place holder for additional PerlIO related functions.

The following layers are currently defined:

  • :unix

    Lowest level layer which provides basic PerlIO operations in terms of UNIX/POSIX numeric file descriptor calls (open(), read(), write(), lseek(), close()).

  • :stdio

    Layer which calls fread, fwrite and fseek/ftell etc. Note that as this is "real" stdio it will ignore any layers beneath it and go straight to the operating system via the C library as usual.

  • :perlio

    A from scratch implementation of buffering for PerlIO. Provides fast access to the buffer for sv_gets which implements perl's readline/<> and in general attempts to minimize data copying.

    :perlio will insert a :unix layer below itself to do low level IO.

  • :crlf

    A layer that implements DOS/Windows like CRLF line endings. On read converts pairs of CR,LF to a single "\n" newline character. On write converts each "\n" to a CR,LF pair. Note that this layer will silently refuse to be pushed on top of itself.

    It currently does not mimic MS-DOS in treating Control-Z as an end-of-file marker.

    Based on the :perlio layer.

  • :utf8

    Declares that the stream accepts perl's internal encoding of characters. (Which really is UTF-8 on ASCII machines, but is UTF-EBCDIC on EBCDIC machines.) This allows any character perl can represent to be read from or written to the stream. The UTF-X encoding is chosen to render simple text parts (i.e. non-accented letters, digits and common punctuation) human readable in the encoded file.

    Here is how to write your native data out using UTF-8 (or UTF-EBCDIC) and then read it back in.

    open(F, ">:utf8", "data.utf");
    print F $out;
    close(F);

    open(F, "<:utf8", "data.utf");
    $in = <F>;
    close(F);

    Note that this layer does not validate byte sequences. For reading input, using :encoding(utf8) instead of bare :utf8 is strongly recommended.

  • :bytes

    This is the inverse of the :utf8 layer. It turns off the flag on the layer below so that data read from it is considered to be "octets" i.e. characters in the range 0..255 only. Likewise on output perl will warn if a "wide" character is written to such a stream.

  • :raw

    The :raw layer is defined as being identical to calling binmode($fh) - the stream is made suitable for passing binary data, i.e. each byte is passed as-is. The stream will still be buffered.

    In Perl 5.6 and some books the :raw layer (previously sometimes also referred to as a "discipline") is documented as the inverse of the :crlf layer. That is no longer the case - other layers which would alter the binary nature of the stream are also disabled. If you want UNIX line endings on a platform that normally does CRLF translation, but still want UTF-8 or encoding defaults, the appropriate thing to do is to add :perlio to the PERLIO environment variable.

    The implementation of :raw is as a pseudo-layer which when "pushed" pops itself and then any layers which do not declare themselves as suitable for binary data. (Undoing :utf8 and :crlf are implemented by clearing flags rather than popping layers but that is an implementation detail.)

    As a consequence of the fact that :raw normally pops layers, it usually only makes sense to have it as the only or first element in a layer specification. When used as the first element it provides a known base on which to build e.g.

    1. open($fh,":raw:utf8",...)

    will construct a "binary" stream, but then enable UTF-8 translation.

  • :pop

    A pseudo layer that removes the top-most layer. Gives perl code a way to manipulate the layer stack. Should be considered as experimental. Note that :pop only works on real layers and will not undo the effects of pseudo layers like :utf8. An example of a possible use might be:

    open($fh,...)
    ...
    binmode($fh,":encoding(...)"); # next chunk is encoded
    ...
    binmode($fh,":pop"); # back to un-encoded

    A more elegant (and safer) interface is needed.

  • :win32

    On Win32 platforms this experimental layer uses the native "handle" IO rather than the unix-like numeric file descriptor layer. Known to be buggy as of perl 5.8.2.
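To make the difference between a decoding layer and :raw concrete, the sketch below writes a short string through :encoding(UTF-8) and reads it back twice, once as bytes and once as characters. The scratch file and variable names are illustrative; the sample string deliberately contains no newline so that :crlf defaults on DOS-like platforms do not affect the byte count.

```perl
use strict;
use warnings;
use File::Temp ();

my $tmp  = File::Temp->new;          # hypothetical scratch file
my $path = $tmp->filename;

# Six characters: "caf", e-acute, a space, and U+263A.
my $text = "caf\x{e9} \x{263A}";

open(my $out, '>:encoding(UTF-8)', $path) or die "open for write: $!";
print {$out} $text;
close($out) or die "close: $!";

# :raw shows the encoded bytes exactly as stored on disk.
open(my $raw, '<:raw', $path) or die "open raw: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

# :encoding(UTF-8) validates and decodes back to characters.
open(my $in, '<:encoding(UTF-8)', $path) or die "open decoded: $!";
my $chars = do { local $/; <$in> };
close($in);

printf "%d bytes on disk, %d characters after decoding\n",
    length($bytes), length($chars);
```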

Custom Layers

It is possible to write custom layers in addition to the above builtin ones, both in C/XS and Perl. Two such layers (and one example written in Perl using the latter) come with the Perl distribution.

  • :encoding

    Use :encoding(ENCODING) either in open() or binmode() to install a layer that transparently does character set and encoding transformations, for example from Shift-JIS to Unicode. Note that under stdio an :encoding also enables :utf8. See PerlIO::encoding for more information.

  • :mmap

    A layer which implements "reading" of files by using mmap() to make a (whole) file appear in the process's address space, and then using that as PerlIO's "buffer". This may be faster in certain circumstances for large files, and may result in less physical memory use when multiple processes are reading the same file.

    Files which are not mmap()-able revert to behaving like the :perlio layer. Writes also behave like the :perlio layer, as mmap() for write needs extra house-keeping (to extend the file) which negates any advantage.

    The :mmap layer will not exist if the platform does not support mmap().

  • :via

    Use :via(MODULE) either in open() or binmode() to install a layer that does whatever transformation (for example compression / decompression, encryption / decryption) to the filehandle. See PerlIO::via for more information.

Alternatives to raw

To get a binary stream an alternate method is to use:

  open($fh,"whatever")
  binmode($fh);

this has the advantage of being backward compatible with how such things have had to be coded on some platforms for years.

To get an unbuffered stream specify an unbuffered layer (e.g. :unix) in the open call:

  1. open($fh,"<:unix",$path)

Defaults and how to override them

If the platform is MS-DOS like and normally does CRLF to "\n" translation for text files then the default layers are:

  1. unix crlf

(The low level "unix" layer may be replaced by a platform specific low level layer.)

Otherwise if Configure found out how to do "fast" IO using the system's stdio, then the default layers are:

  1. unix stdio

Otherwise the default layers are

  1. unix perlio

These defaults may change once perlio has been better tested and tuned.

The default can be overridden by setting the environment variable PERLIO to a space separated list of layers (unix or platform low level layer is always pushed first).

This can be used to see the effect of/bugs in the various layers e.g.

  cd .../perl/t
  PERLIO=stdio ./perl harness
  PERLIO=perlio ./perl harness

For the various values of PERLIO see PERLIO in perlrun.

Querying the layers of filehandles

The following returns the names of the PerlIO layers on a filehandle.

  1. my @layers = PerlIO::get_layers($fh); # Or FH, *FH, "FH".

The layers are returned in the order an open() or binmode() call would use them. Note that the "default stack" depends on the operating system and on the Perl version, and both the compile-time and runtime configurations of Perl.

The following table summarizes the default layers on UNIX-like and DOS-like platforms and depending on the setting of $ENV{PERLIO}:

  PERLIO       UNIX-like                 DOS-like
  ------       ---------                 --------
  unset / ""   unix perlio / stdio [1]   unix crlf
  stdio        unix perlio / stdio [1]   stdio
  perlio       unix perlio               unix perlio

  # [1] "stdio" if Configure found out how to do "fast stdio" (depends
  # on the stdio implementation) and in Perl 5.8, otherwise "unix perlio"

By default the layers from the input side of the filehandle are returned; to get the output side, use the optional output argument:

  1. my @layers = PerlIO::get_layers($fh, output => 1);

(Usually the layers are identical on either side of a filehandle but for example with sockets there may be differences, or if you have been using the open pragma.)
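A small sketch of both calls on a handle opened with an explicit encoding layer. The scratch file is hypothetical, and the exact layer names printed depend on the platform and on how Perl was built, so none are shown here.

```perl
use strict;
use warnings;
use File::Temp ();

my $tmp = File::Temp->new;           # hypothetical scratch file
open(my $fh, '<:encoding(UTF-8)', $tmp->filename)
    or die "open failed: $!";

# PerlIO::get_layers is built in; no "use PerlIO" is needed to call it.
my @input  = PerlIO::get_layers($fh);
my @output = PerlIO::get_layers($fh, output => 1);

print "input side:  @input\n";
print "output side: @output\n";

close($fh);
```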

There is no set_layers(), nor does get_layers() return a tied array mirroring the stack, or anything fancy like that. This is not accidental or unintentional. The PerlIO layer stack is a bit more complicated than just a stack (see for example the behaviour of :raw). You are supposed to use open() and binmode() to manipulate the stack.

Implementation details follow, please close your eyes.

The arguments to layers are by default returned in parentheses after the name of the layer, and certain layers (like utf8) are not real layers but instead flags on real layers; to get all of these returned separately, use the optional details argument:

  1. my @layer_and_args_and_flags = PerlIO::get_layers($fh, details => 1);

The result will be up to three times the number of layers: the first element will be a name, the second element the arguments (unspecified arguments will be undef), the third element the flags, the fourth element a name again, and so forth.
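One way to consume that flat list is to peel off three elements at a time. The sketch below does so and stores each triple in a hash; the hash keys and the scratch file are illustrative, and the numeric flag values are internal details that may differ between Perl versions.

```perl
use strict;
use warnings;
use File::Temp ();

my $tmp = File::Temp->new;           # hypothetical scratch file
open(my $fh, '<:encoding(UTF-8)', $tmp->filename)
    or die "open failed: $!";

my @flat  = PerlIO::get_layers($fh, details => 1);
my $count = @flat;                   # name, arguments, flags per layer

# Regroup the flat list into one record per layer.
my @layers;
while (my ($name, $args, $flags) = splice(@flat, 0, 3)) {
    push @layers, { name => $name, args => $args, flags => $flags };
}
close($fh);

for my $layer (@layers) {
    printf "%-10s args=%-16s flags=0x%x\n",
        $layer->{name},
        defined $layer->{args}  ? $layer->{args}  : 'undef',
        defined $layer->{flags} ? $layer->{flags} : 0;
}
```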

You may open your eyes now.

AUTHOR

Nick Ing-Simmons <nick@ing-simmons.net>

SEE ALSO

binmode, open, perlunicode, perliol, Encode

 
Perl 5 version 18.2 documentation

SDBM_File

NAME

SDBM_File - Tied access to sdbm files

SYNOPSIS

  use Fcntl; # For O_RDWR, O_CREAT, etc.
  use SDBM_File;

  tie(%h, 'SDBM_File', 'filename', O_RDWR|O_CREAT, 0666)
      or die "Couldn't tie SDBM file 'filename': $!; aborting";

  # Now read and change the hash
  $h{newkey} = 'newvalue';
  print $h{oldkey};
  ...

  untie %h;

DESCRIPTION

SDBM_File establishes a connection between a Perl hash variable and a file in SDBM format. You can manipulate the data in the file just as if it were in a Perl hash, but when your program exits, the data will remain in the file, to be used the next time your program runs.

Use SDBM_File with the Perl built-in tie function to establish the connection between the variable and the file. The arguments to tie should be:

1.

The hash variable you want to tie.

2.

The string "SDBM_File". (This tells Perl to use the SDBM_File package to perform the functions of the hash.)

3.

The name of the file you want to tie to the hash.

4.

Flags. Use one of:

  • O_RDONLY

    Read-only access to the data in the file.

  • O_WRONLY

    Write-only access to the data in the file.

  • O_RDWR

    Both read and write access.

If you want to create the file if it does not exist, add O_CREAT to any of these, as in the example. If you omit O_CREAT and the file does not already exist, the tie call will fail.

5.

The default permissions to use if a new file is created. The actual permissions will be modified by the user's umask, so you should probably use 0666 here. (See umask.)
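Putting the five arguments together, a minimal round trip might look like the sketch below. The scratch directory, the base name 'demo' and the key are hypothetical; the data is written in one tie, then read back through a second, read-only tie to show that it persisted.

```perl
use strict;
use warnings;
use Fcntl;                           # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;
use File::Temp ();
use File::Spec ();

# Hypothetical scratch directory, removed when $dir goes out of scope.
my $dir  = File::Temp->newdir;
my $base = File::Spec->catfile($dir->dirname, 'demo');  # sdbm adds .pag and .dir

my %h;
tie(%h, 'SDBM_File', $base, O_RDWR|O_CREAT, 0666)
    or die "Couldn't tie SDBM file '$base': $!; aborting";
$h{colour} = 'green';
untie %h;

# Reopen read-only; the permissions argument is unused without O_CREAT.
my %again;
tie(%again, 'SDBM_File', $base, O_RDONLY, 0)
    or die "Couldn't re-tie SDBM file '$base': $!; aborting";
my $value = $again{colour};
untie %again;

print "colour = $value\n";
```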

DIAGNOSTICS

On failure, the tie call returns an undefined value and probably sets $! to contain the reason the file could not be tied.

sdbm store returned -1, errno 22, key "..." at ...

This warning is emitted when you try to store a key or a value that is too long. It means that the change was not recorded in the database. See BUGS AND WARNINGS below.

BUGS AND WARNINGS

There are a number of limits on the size of the data that you can store in the SDBM file. The most important is that the length of a key, plus the length of its associated value, may not exceed 1008 bytes.

See tie, perldbmfilter, Fcntl

 
Perl 5 version 18.2 documentation

Safe

NAME

Safe - Compile and execute code in restricted compartments

SYNOPSIS

  use Safe;

  $compartment = new Safe;

  $compartment->permit(qw(time sort :browse));

  $result = $compartment->reval($unsafe_code);

DESCRIPTION

The Safe extension module allows the creation of compartments in which perl code can be evaluated. Each compartment has

  • a new namespace

    The "root" of the namespace (i.e. "main::") is changed to a different package and code evaluated in the compartment cannot refer to variables outside this namespace, even with run-time glob lookups and other tricks.

    Code which is compiled outside the compartment can choose to place variables into (or share variables with) the compartment's namespace and only that data will be visible to code evaluated in the compartment.

    By default, the only variables shared with compartments are the "underscore" variables $_ and @_ (and, technically, the less frequently used %_, the _ filehandle and so on). This is because otherwise perl operators which default to $_ will not work and neither will the assignment of arguments to @_ on subroutine entry.

  • an operator mask

    Each compartment has an associated "operator mask". Recall that perl code is compiled into an internal format before execution. Evaluating perl code (e.g. via "eval" or "do 'file'") causes the code to be compiled into an internal format and then, provided there was no error in the compilation, executed. Code evaluated in a compartment compiles subject to the compartment's operator mask. Attempting to evaluate code in a compartment which contains a masked operator will cause the compilation to fail with an error. The code will not be executed.

    The default operator mask for a newly created compartment is the ':default' optag.

    It is important that you read the Opcode module documentation for more information, especially for detailed definitions of opnames, optags and opsets.

    Since it is only at the compilation stage that the operator mask applies, controlled access to potentially unsafe operations can be achieved by having a handle to a wrapper subroutine (written outside the compartment) placed into the compartment. For example,

    $cpt = new Safe;
    sub wrapper {
        # vet arguments and perform potentially unsafe operations
    }
    $cpt->share('&wrapper');
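A runnable sketch of the wrapper idea. The wrapper name checked_length, its argument check and the sample arguments are hypothetical stand-ins; a real wrapper would perform whatever vetted operation the compartment is not allowed to compile itself.

```perl
use strict;
use warnings;
use Safe;

# Hypothetical wrapper, compiled outside the compartment. It vets its
# argument and then performs a stand-in for a potentially unsafe operation.
sub checked_length {
    my ($name) = @_;
    die "bad argument\n"
        unless defined $name && $name =~ /\A[A-Za-z0-9_.]+\z/;
    return length $name;
}

my $cpt = Safe->new;
$cpt->share('&checked_length');

# Accepted argument: the wrapper's return value comes back from reval.
my $ok = $cpt->reval(q{ checked_length('notes.txt') });

# Rejected argument: the wrapper dies, so reval returns undef and sets $@.
my $bad       = $cpt->reval(q{ checked_length('../etc/passwd') });
my $bad_error = $@;

print "accepted: $ok\n";
print "rejected: $bad_error";
```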

WARNING

The authors make no warranty, implied or otherwise, about the suitability of this software for safety or security purposes.

The authors shall not in any case be liable for special, incidental, consequential, indirect or other similar damages arising from the use of this software.

Your mileage will vary. If in any doubt do not use it.

METHODS

To create a new compartment, use

  1. $cpt = new Safe;

Optional argument is (NAMESPACE), where NAMESPACE is the root namespace to use for the compartment (defaults to "Safe::Root0", incremented for each new compartment).

Note that version 1.00 of the Safe module supported a second optional parameter, MASK. That functionality has been withdrawn pending deeper consideration. Use the permit and deny methods described below.

The following methods can then be used on the compartment object returned by the above constructor. The object argument is implicit in each case.

permit (OP, ...)

Permit the listed operators to be used when compiling code in the compartment (in addition to any operators already permitted).

You can list opcodes by names, or use a tag name; see Predefined Opcode Tags in Opcode.

permit_only (OP, ...)

Permit only the listed operators to be used when compiling code in the compartment (no other operators are permitted).

deny (OP, ...)

Deny the listed operators from being used when compiling code in the compartment (other operators may still be permitted).

deny_only (OP, ...)

Deny only the listed operators from being used when compiling code in the compartment (all other operators will be permitted, so you probably don't want to use this method).

trap (OP, ...)

untrap (OP, ...)

The trap and untrap methods are synonyms for deny and permit respectively.
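The effect of these methods can be observed through reval() and $@. The sketch below first tries system(), which the default mask does not permit, and then denies and re-permits the uc operator. The operator choices are only for illustration.

```perl
use strict;
use warnings;
use Safe;

my $cpt = Safe->new;

# system() is not permitted by default, so compilation of this code fails.
my $sys       = $cpt->reval(q{ system('true') });
my $sys_error = $@;

# Explicitly deny an otherwise harmless operator ...
$cpt->deny('uc');
my $denied       = $cpt->reval(q{ uc('hello') });
my $denied_error = $@;

# ... and permit it again.
$cpt->permit('uc');
my $allowed = $cpt->reval(q{ uc('hello') });

print "system: $sys_error";
print "uc while denied: $denied_error";
print "uc once permitted: $allowed\n";
```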

share (NAME, ...)

This shares the variable(s) in the argument list with the compartment. This is almost identical to exporting variables using the Exporter module.

Each NAME must be the name of a non-lexical variable, typically with the leading type identifier included. A bareword is treated as a function name.

Examples of legal names are '$foo' for a scalar, '@foo' for an array, '%foo' for a hash, '&foo' or 'foo' for a subroutine and '*foo' for a glob (i.e. all symbol table entries associated with "foo", including scalar, array, hash, sub and filehandle).

Each NAME is assumed to be in the calling package. See share_from for an alternative method (which share uses).

share_from (PACKAGE, ARRAYREF)

This method is similar to share() but allows you to explicitly name the package that symbols should be shared from. The symbol names (including type characters) are supplied as an array reference.

  1. $safe->share_from('main', [ '$foo', '%bar', 'func' ]);

Names can include package names, which are relative to the specified PACKAGE. So these two calls have the same effect:

  $safe->share_from('Scalar::Util', [ 'reftype' ]);
  $safe->share_from('main', [ 'Scalar::Util::reftype' ]);

varglob (VARNAME)

This returns a glob reference for the symbol table entry of VARNAME in the package of the compartment. VARNAME must be the name of a variable without any leading type marker. For example:

  1. ${$cpt->varglob('foo')} = "Hello world";

has the same effect as:

  $cpt = new Safe 'Root';
  $Root::foo = "Hello world";

but avoids the need to know $cpt's package name.
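As a sketch of varglob() used in both directions, the code below seeds a scalar and an array in the compartment, lets compartment code read them and set a third variable, and then reads that variable back from outside. All variable names are illustrative.

```perl
use strict;
use warnings;
use Safe;

my $cpt = Safe->new;

# Seed variables inside the compartment without knowing its package name.
${ $cpt->varglob('greeting') } = "Hello world";
@{ $cpt->varglob('numbers') }  = (1, 2, 3);

# Code in the compartment sees them as ordinary package variables.
my $seen = $cpt->reval(q{ $greeting . ": " . scalar(@numbers) });

# The compartment can also leave results behind for the caller to collect.
$cpt->reval(q{ $result = $greeting . "!" });
my $result = ${ $cpt->varglob('result') };

print "$seen\n$result\n";
```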

reval (STRING, STRICT)

This evaluates STRING as perl code inside the compartment.

The code can only see the compartment's namespace (as returned by the root method). The compartment's root package appears to be the main:: package to the code inside the compartment.

Any attempt by the code in STRING to use an operator which is not permitted by the compartment will cause an error (at run-time of the main program but at compile-time for the code in STRING). The error is of the form "'%s' trapped by operation mask...".

If an operation is trapped in this way, then the code in STRING will not be executed. If such a trapped operation occurs or any other compile-time or return error, then $@ is set to the error message, just as with an eval().

If there is no error, then the method returns the value of the last expression evaluated, or a return statement may be used, just as with subroutines and eval(). The context (list or scalar) is determined by the caller as usual.

If the return value of reval() is (or contains) any code reference, those code references are wrapped to be themselves executed always in the compartment. See wrap_code_refs_within.

The formerly undocumented STRICT argument sets strictness: if true 'use strict;' is used, otherwise it uses 'no strict;'. Note: if STRICT is omitted 'no strict;' is the default.
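
The behaviour described above can be sketched as follows, assuming the default operator mask of a freshly created compartment (under which sleep is expected to be denied). The variable names are illustrative only.

```perl
use strict;
use warnings;
use Safe;

my $cpt = Safe->new;

# A permitted expression: reval returns the value of the last expression.
my $sum = $cpt->reval('2 + 3');

# An operator denied by the default mask: the code is never executed,
# reval returns undef and $@ holds the "trapped by operation mask" message.
my $blocked    = $cpt->reval('sleep 0');
my $trap_error = $@;

# Without STRICT, undeclared package variables are accepted.
my $lax = $cpt->reval('$answer = 42; $answer');

# With a true STRICT argument the same style of code fails to compile.
my $strict       = $cpt->reval('$other = 42; $other', 1);
my $strict_error = $@;

print "sum: $sum\n";
print "trap error: $trap_error";
print "lax: $lax\n";
print "strict error: $strict_error";
```

Checking $@ after every reval call, as with eval, is the only reliable way to tell a trapped or failed evaluation from one that legitimately returned undef.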

Some points to note:

If the entereval op is permitted then the code can use eval "..." to 'hide' code which might use denied ops. This is not a major problem since when the code tries to execute the eval it will fail because the opmask is still in effect. However this technique would allow clever, and possibly harmful, code to 'probe' the boundaries of what is possible.

Any string eval which is executed by code executing in a compartment, or by code called from code executing in a compartment, will be eval'd in the namespace of the compartment. This is potentially a serious problem.

Consider a function foo() in package pkg compiled outside a compartment but shared with it. Assume the compartment has a root package called 'Root'. If foo() contains an eval statement like eval '$foo = 1' then, normally, $pkg::foo will be set to 1. If foo() is called from the compartment (by whatever means) then instead of setting $pkg::foo, the eval will actually set $Root::pkg::foo.

This can easily be demonstrated by using a module, such as the Socket module, which uses eval "..." as part of an AUTOLOAD function. You can 'use' the module outside the compartment and share an (autoloaded) function with the compartment. If an autoload is triggered by code in the compartment, or by any code anywhere that is called by any means from the compartment, then the eval in the Socket module's AUTOLOAD function happens in the namespace of the compartment. Any variables created or used by the eval'd code are now under the control of the code in the compartment.

A similar effect applies to all runtime symbol lookups in code called from a compartment but not compiled within it.

rdo (FILENAME)

This evaluates the contents of file FILENAME inside the compartment. See above documentation on the reval method for further details.

root (NAMESPACE)

This method returns the name of the package that is the root of the compartment's namespace.

Note that this behaviour differs from version 1.00 of the Safe module, where the root method could be used to change the namespace. That functionality has been withdrawn pending deeper consideration.

mask (MASK)

This is a get-or-set method for the compartment's operator mask.

With no MASK argument present, it returns the current operator mask of the compartment.

With the MASK argument present, it sets the operator mask for the compartment (equivalent to calling the deny_only method).
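
A sketch of using mask as both getter and setter: the current mask is saved, an operator is denied, and the saved mask is then restored. It assumes that join is permitted by the default mask; the deny method used here is part of the Safe interface.

```perl
use strict;
use warnings;
use Safe;

my $cpt  = Safe->new;
my $code = 'join("-", 1, 2, 3)';

# Getter: take a copy of the current operator mask.
my $saved_mask = $cpt->mask;

my $before = $cpt->reval($code);

# Deny one more operator, then try the same code again.
$cpt->deny('join');
my $denied       = $cpt->reval($code);
my $denied_error = $@;

# Setter: put the saved mask back.
$cpt->mask($saved_mask);
my $after = $cpt->reval($code);

print "before: $before\n";
print "while denied: $denied_error";
print "after restore: $after\n";
```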

wrap_code_ref (CODEREF)

Returns a reference to an anonymous subroutine that, when executed, will call CODEREF with the Safe compartment 'in effect'. In other words, with the package namespace adjusted and the opmask enabled.

Note that the opmask doesn't affect the already compiled code, it only affects any further compilation that the already compiled code may try to perform.

This is particularly useful when applied to code references returned from reval().

(It also provides a kind of workaround for RT#60374: "Safe.pm sort {} bug with -Dusethreads". See http://rt.perl.org/rt3//Public/Bug/Display.html?id=60374 for much more detail.)
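
To illustrate the point about further compilation, the following sketch wraps a subroutine that performs a string eval at call time. It assumes the default mask denies sleep; the subroutine name $probe is hypothetical.

```perl
use strict;
use warnings;
use Safe;

my $cpt = Safe->new;

# Compiled outside the compartment. The string passed to eval is only
# compiled when the subroutine is called.
my $probe = sub {
    my $ok = eval 'sleep 0; 1';
    return $ok ? 'compiled' : 'blocked';
};

my $wrapped = $cpt->wrap_code_ref($probe);

my $outside = $probe->();     # called directly, no compartment in effect
my $inside  = $wrapped->();   # called with the compartment's opmask in effect

print "direct call:  $outside\n";
print "wrapped call: $inside\n";
```

The body of $probe itself runs in both cases; only the code compiled by the string eval during the wrapped call is subject to the mask.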

wrap_code_refs_within (...)

Wraps any CODE references found within the arguments by replacing each with the result of calling wrap_code_ref on the CODE reference. Any ARRAY or HASH references in the arguments are inspected recursively.

Returns nothing.

RISKS

This section is just an outline of some of the things code in a compartment might do (intentionally or unintentionally) which can have an effect outside the compartment.

  • Memory

    Consuming all (or nearly all) available memory.

  • CPU

    Causing infinite loops etc.

  • Snooping

    Copying private information out of your system. Even something as simple as your user name is of value to others. Much useful information could be gleaned from your environment variables for example.

  • Signals

    Causing signals (especially SIGFPE and SIGALRM) to affect your process.

    Setting up a signal handler will need to be carefully considered and controlled. What mask is in effect when a signal handler gets called? If a user can get an imported function to get an exception and call the user's signal handler, does that user's restricted mask get re-instated before the handler is called? Does an imported handler get called with its original mask or the user's one?

  • State Changes

    Ops such as chdir obviously affect the process as a whole and not just the code in the compartment. Ops such as rand and srand have a similar but more subtle effect.

AUTHOR

Originally designed and implemented by Malcolm Beattie.

Reworked to use the Opcode module and other changes added by Tim Bunce.

Currently maintained by the Perl 5 Porters, <perl5-porters@perl.org>.

 
SelectSaver

NAME

SelectSaver - save and restore selected file handle

SYNOPSIS

    use SelectSaver;

    {
        my $saver = SelectSaver->new(FILEHANDLE);
        # FILEHANDLE is selected
    }
    # previous handle is selected

    {
        my $saver = SelectSaver->new;
        # new handle may be selected, or not
    }
    # previous handle is selected

DESCRIPTION

A SelectSaver object contains a reference to the file handle that was selected when it was created. If its new method gets an extra parameter, then that parameter is selected; otherwise, the selected file handle remains unchanged.

When a SelectSaver is destroyed, it re-selects the file handle that was selected when it was created.
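
A minimal sketch of the scoping behaviour, using an in-memory filehandle so that the effect can be observed. The variable names are illustrative.

```perl
use strict;
use warnings;
use SelectSaver;

my $captured = '';
open(my $buffer, '>', \$captured) or die "open: $!";

my $original = select;    # name of the currently selected handle

{
    my $saver = SelectSaver->new($buffer);
    print "inside the block\n";    # goes to $buffer, not to STDOUT
}

# $saver has been destroyed, so the previous handle is selected again.
my $restored = select;

close $buffer or die "close: $!";
print "captured: $captured";
```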

 
SelfLoader

NAME

SelfLoader - load functions only on demand

SYNOPSIS

    package FOOBAR;
    use SelfLoader;

    ... (initializing code)

    __DATA__
    sub {....

DESCRIPTION

This module tells its users that functions in the FOOBAR package are to be autoloaded from after the __DATA__ token. See also Autoloading in perlsub.

The __DATA__ token

The __DATA__ token tells the perl compiler that the perl code for compilation is finished. Everything after the __DATA__ token is available for reading via the filehandle FOOBAR::DATA, where FOOBAR is the name of the current package when the __DATA__ token is reached. This works just the same as __END__ does in package 'main', but for other modules data after __END__ is not automatically retrievable, whereas data after __DATA__ is. The __DATA__ token is not recognized in versions of perl prior to 5.001m.

Note that it is possible to have __DATA__ tokens in the same package in multiple files, and that the last __DATA__ token in a given package that is encountered by the compiler is the one accessible by the filehandle. This also applies to __END__ and main, i.e. if the 'main' program has an __END__, but a module 'require'd (_not_ 'use'd) by that program has a 'package main;' declaration followed by a '__DATA__', then the DATA filehandle is set to access the data after the __DATA__ in the module, _not_ the data after the __END__ token in the 'main' program, since the compiler encounters the 'require'd file later.

SelfLoader autoloading

The SelfLoader works by the user placing the __DATA__ token after perl code which needs to be compiled and run at 'require' time, but before subroutine declarations that can be loaded in later - usually because they may never be called.

The SelfLoader will read from the FOOBAR::DATA filehandle to load in the data after __DATA__, and load in any subroutine when it is called. The costs are the one-time parsing of the data after __DATA__, and a load delay for the _first_ call of any autoloaded function. The benefits (hopefully) are a faster compilation phase, with no need to load functions which are never used.

The SelfLoader will stop reading from __DATA__ if it encounters the __END__ token - just as you would expect. If the __END__ token is present, and is followed by the token DATA, then the SelfLoader leaves the FOOBAR::DATA filehandle open on the line after that token.

The SelfLoader exports the AUTOLOAD subroutine to the package using the SelfLoader, and this loads the called subroutine when it is first called.

There is no advantage to putting subroutines which will _always_ be called after the __DATA__ token.
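
The arrangement described above can be sketched in a single file, run as a script from disk. The package name Demo and the subroutines eager and lazy are hypothetical; note that the package in effect at the __DATA__ token must be the one that uses the SelfLoader.

```perl
use strict;
use warnings;

package Demo;
use SelfLoader;

# Compiled immediately, like any ordinary subroutine.
sub eager { return "compiled up front" }

package main;

print Demo::eager(), "\n";
# The first call goes through the AUTOLOAD exported by SelfLoader,
# which reads Demo::DATA, compiles lazy() and then calls it.
print Demo::lazy(21), "\n";

# Switch back so that the data below is attached to Demo::DATA.
package Demo;
__DATA__
sub lazy { my ($n) = @_; return $n * 2 }
```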

Autoloading and package lexicals

A 'my $pack_lexical' statement makes the variable $pack_lexical local _only_ to the file up to the __DATA__ token. Subroutines declared elsewhere _cannot_ see these types of variables, just as if you declared subroutines in the package but in another file, they cannot see these variables.

So specifically, autoloaded functions cannot see package lexicals (this applies to both the SelfLoader and the Autoloader). The vars pragma provides an alternative to defining package-level globals that will be visible to autoloaded routines. See the documentation on vars in the pragma section of perlmod.

SelfLoader and AutoLoader

The SelfLoader can replace the AutoLoader: change 'use AutoLoader' to 'use SelfLoader', and change the __END__ token to __DATA__. Note that the SelfLoader exports the AUTOLOAD function; if you have your own AUTOLOAD and are using the AutoLoader too, you probably know what you're doing. You will need perl version 5.001m or later to use this (version 5.001 with all patches up to patch m).

There is no need to inherit from the SelfLoader.

The SelfLoader works similarly to the AutoLoader, but picks up the subs from after the __DATA__ instead of in the 'lib/auto' directory. There is a maintenance gain in not needing to run AutoSplit on the module at installation, and a runtime gain in not needing to keep opening and closing files to load subs. There is a runtime loss in needing to parse the code after the __DATA__ . Details of the AutoLoader and another view of these distinctions can be found in that module's documentation.

__DATA__, __END__, and the FOOBAR::DATA filehandle.

This section is only relevant if you want to use the FOOBAR::DATA together with the SelfLoader.

Data after the __DATA__ token in a module is read using the FOOBAR::DATA filehandle. __END__ can still be used to denote the end of the __DATA__ section if followed by the token DATA - this is supported by the SelfLoader. The FOOBAR::DATA filehandle is left open if an __END__ followed by a DATA is found, with the filehandle positioned at the start of the line after the __END__ token. If no __END__ token is present, or an __END__ token with no DATA token on the same line, then the filehandle is closed.

The SelfLoader reads from wherever the current position of the FOOBAR::DATA filehandle is, until the EOF or __END__ . This means that if you want to use that filehandle (and ONLY if you want to), you should either

1. Put all your subroutine declarations immediately after the __DATA__ token and put your own data after those declarations, using the __END__ token to mark the end of subroutine declarations. You must also ensure that the SelfLoader reads first by calling 'SelfLoader->load_stubs();', or by using a function which is selfloaded;

or

2. Read the FOOBAR::DATA filehandle first, leaving the handle open and positioned at the first line of subroutine declarations.

You could conceivably do both.

Classes and inherited methods.

For modules which are not classes, this section is not relevant. This section is only relevant if you have methods which could be inherited.

A subroutine stub (or forward declaration) looks like

    sub stub;

i.e. it is a subroutine declaration without the body of the subroutine. For modules which are not classes, there is no real need for stubs as far as autoloading is concerned.

For modules which ARE classes, and need to handle inherited methods, stubs are needed to ensure that the method inheritance mechanism works properly. You can load the stubs into the module at 'require' time by adding the statement 'SelfLoader->load_stubs();' to the module.

The alternative is to put the stubs in before the __DATA__ token BEFORE releasing the module, and for this purpose the Devel::SelfStubber module is available. However this does require the extra step of ensuring that the stubs are in the module. If this is done I strongly recommend that this is done BEFORE releasing the module - it should NOT be done at install time in general.

Multiple packages and fully qualified subroutine names

Subroutines in multiple packages within the same file are supported - but you should note that this requires exporting the SelfLoader::AUTOLOAD to every package which requires it. This is done automatically by the SelfLoader when it first loads the subs into the cache, but you should really specify it in the initialization before the __DATA__ by putting a 'use SelfLoader' statement in each package.

Fully qualified subroutine names are also supported. For example,

    __DATA__
    sub foo::bar {23}
    package baz;
    sub dob {32}

will all be loaded correctly by the SelfLoader, and the SelfLoader will ensure that the packages 'foo' and 'baz' correctly have the SelfLoader AUTOLOAD method when the data after __DATA__ is first parsed.

AUTHOR

SelfLoader is maintained by the perl5-porters. Please direct any questions to the canonical mailing list. Anything that is applicable to the CPAN release can be sent to its maintainer, though.

Author and Maintainer: The Perl5-Porters <perl5-porters@perl.org>

Maintainer of the CPAN release: Steffen Mueller <smueller@cpan.org>

COPYRIGHT AND LICENSE

This package has been part of the perl core since the first release of perl5. It has been released separately to CPAN so older installations can benefit from bug fixes.

This package has the same copyright and license as the perl core:

    Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999,
    2000, 2001, 2002, 2003, 2004, 2005, 2006 by Larry Wall and others

    All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of either:

    a) the GNU General Public License as published by the Free
    Software Foundation; either version 1, or (at your option) any
    later version, or

    b) the "Artistic License" which comes with this Kit.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See either
    the GNU General Public License or the Artistic License for more details.

    You should have received a copy of the Artistic License with this
    Kit, in the file named "Artistic". If not, I'll be glad to provide one.

    You should also have received a copy of the GNU General Public License
    along with this program in the file named "Copying". If not, write to the
    Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
    MA 02110-1301, USA or visit their web page on the internet at
    http://www.gnu.org/copyleft/gpl.html.

    For those of you that choose to use the GNU General Public License,
    my interpretation of the GNU General Public License is that no Perl
    script falls under the terms of the GPL unless you explicitly put
    said script under the terms of the GPL yourself. Furthermore, any
    object code linked with perl does not automatically fall under the
    terms of the GPL, provided such object code only adds definitions
    of subroutines and variables, and does not otherwise impair the
    resulting interpreter from executing any standard Perl script. I
    consider linking in C subroutines in this manner to be the moral
    equivalent of defining subroutines in the Perl language itself. You
    may sell such an object file as proprietary provided that you provide
    or offer to provide the Perl source, as specified by the GNU General
    Public License. (This is merely an alternate way of specifying input
    to the program.) You may also sell a binary produced by the dumping of
    a running Perl script that belongs to you, provided that you provide or
    offer to provide the Perl source as specified by the GPL. (The
    fact that a Perl interpreter and your code are in the same binary file
    is, in this case, a form of mere aggregation.) This is my interpretation
    of the GPL. If you still have concerns or difficulties understanding
    my intent, feel free to contact me. Of course, the Artistic License
    spells all this out for your protection, so you may prefer to use that.
 
Socket

NAME

Socket - networking constants and support functions

SYNOPSIS

Socket is a low-level module used by, among other things, the IO::Socket family of modules. The following examples demonstrate some low-level uses but a practical program would likely use the higher-level API provided by IO::Socket or similar instead.

    use Socket qw(PF_INET SOCK_STREAM pack_sockaddr_in inet_aton);

    socket(my $socket, PF_INET, SOCK_STREAM, 0)
        or die "socket: $!";

    my $port = getservbyname "echo", "tcp";
    connect($socket, pack_sockaddr_in($port, inet_aton("localhost")))
        or die "connect: $!";

    print $socket "Hello, world!\n";
    print <$socket>;

See also the EXAMPLES section.

DESCRIPTION

This module provides a variety of constants, structure manipulators and other functions related to socket-based networking. The values and functions provided are useful when used in conjunction with Perl core functions such as socket(), setsockopt() and bind(). It also provides several other support functions, mostly for dealing with conversions of network addresses between human-readable and native binary forms, and for hostname resolver operations.

Some constants and functions are exported by default by this module; but for backward-compatibility any recently-added symbols are not exported by default and must be requested explicitly. When an import list is provided to the use Socket line, the default exports are not automatically imported. It is therefore best practice always to explicitly list all the symbols required.

Also, some common socket "newline" constants are provided: the constants CR, LF, and CRLF, as well as $CR, $LF, and $CRLF, which map to \015, \012, and \015\012. If you do not want to use the literal characters in your programs, then use the constants provided here. They are not exported by default, but can be imported individually, and with the :crlf export tag:

    use Socket qw(:DEFAULT :crlf);

    $sock->print("GET / HTTP/1.0$CRLF");

The entire getaddrinfo() subsystem can be exported using the tag :addrinfo; this exports the getaddrinfo() and getnameinfo() functions, and all the AI_*, NI_*, NIx_* and EAI_* constants.
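
As a small sketch of the :crlf tag, the following builds a request string without writing the line-ending bytes literally. No network connection is made; the request text is only an example.

```perl
use strict;
use warnings;
use Socket qw(:crlf);

# $CRLF interpolates in strings; CRLF is the equivalent constant.
my $request = "GET / HTTP/1.0$CRLF$CRLF";

printf "request is %d bytes long\n", length $request;
print "variable and constant agree\n" if $CRLF eq CRLF;
```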

CONSTANTS

In each of the following groups, there may be many more constants provided than just the ones given as examples in the section heading. If the heading ends with "..." then there are likely more; the exact constants provided will depend on the OS and headers found at compile-time.

PF_INET, PF_INET6, PF_UNIX, ...

Protocol family constants to use as the first argument to socket() or the value of the SO_DOMAIN or SO_FAMILY socket option.

AF_INET, AF_INET6, AF_UNIX, ...

Address family constants used by the socket address structures, to pass to such functions as inet_pton() or getaddrinfo(), or are returned by such functions as sockaddr_family().

SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, ...

Socket type constants to use as the second argument to socket(), or the value of the SO_TYPE socket option.

SOCK_NONBLOCK, SOCK_CLOEXEC

Linux-specific shortcuts to specify the O_NONBLOCK and FD_CLOEXEC flags during a socket(2) call.

    socket( my $sockh, PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, 0 )

SOL_SOCKET

Socket option level constant for setsockopt() and getsockopt().

SO_ACCEPTCONN, SO_BROADCAST, SO_ERROR, ...

Socket option name constants for setsockopt() and getsockopt() at the SOL_SOCKET level.

IP_OPTIONS, IP_TOS, IP_TTL, ...

Socket option name constants for IPv4 socket options at the IPPROTO_IP level.

MSG_BCAST, MSG_OOB, MSG_TRUNC, ...

Message flag constants for send() and recv().

SHUT_RD, SHUT_RDWR, SHUT_WR

Direction constants for shutdown().

INADDR_ANY, INADDR_BROADCAST, INADDR_LOOPBACK, INADDR_NONE

Constants giving the special AF_INET addresses for wildcard, broadcast, local loopback, and invalid addresses.

Normally equivalent to inet_aton('0.0.0.0'), inet_aton('255.255.255.255'), inet_aton('localhost') and inet_aton('255.255.255.255') respectively.

IPPROTO_IP, IPPROTO_IPV6, IPPROTO_TCP, ...

IP protocol constants to use as the third argument to socket(), the level argument to getsockopt() or setsockopt(), or the value of the SO_PROTOCOL socket option.

TCP_CORK, TCP_KEEPALIVE, TCP_NODELAY, ...

Socket option name constants for TCP socket options at the IPPROTO_TCP level.

IN6ADDR_ANY, IN6ADDR_LOOPBACK

Constants giving the special AF_INET6 addresses for wildcard and local loopback.

Normally equivalent to inet_pton(AF_INET6, "::") and inet_pton(AF_INET6, "::1") respectively.

IPV6_ADD_MEMBERSHIP, IPV6_MTU, IPV6_V6ONLY, ...

Socket option name constants for IPv6 socket options at the IPPROTO_IPV6 level.

STRUCTURE MANIPULATORS

The following functions convert between lists of Perl values and packed binary strings representing structures.

$family = sockaddr_family $sockaddr

Takes a packed socket address (as returned by pack_sockaddr_in(), pack_sockaddr_un() or the perl builtin functions getsockname() and getpeername()). Returns the address family tag. This will be one of the AF_* constants, such as AF_INET for a sockaddr_in address or AF_UNIX for a sockaddr_un. It can be used to figure out what unpack to use for a sockaddr of unknown type.

$sockaddr = pack_sockaddr_in $port, $ip_address

Takes two arguments, a port number and an opaque string (as returned by inet_aton(), or a v-string). Returns the sockaddr_in structure with those arguments packed in and AF_INET filled in. For Internet domain sockets, this structure is normally what you need for the arguments in bind(), connect(), and send().

($port, $ip_address) = unpack_sockaddr_in $sockaddr

Takes a sockaddr_in structure (as returned by pack_sockaddr_in(), getpeername() or recv()). Returns a list of two elements: the port and an opaque string representing the IP address (you can use inet_ntoa() to convert the address to the four-dotted numeric format). Will croak if the structure does not represent an AF_INET address.

In scalar context will return just the IP address.

$sockaddr = sockaddr_in $port, $ip_address

($port, $ip_address) = sockaddr_in $sockaddr

A wrapper of pack_sockaddr_in() or unpack_sockaddr_in(). In list context, unpacks its argument and returns a list consisting of the port and IP address. In scalar context, packs its port and IP address arguments as a sockaddr_in and returns it.

Provided largely for legacy compatibility; it is better to use pack_sockaddr_in() or unpack_sockaddr_in() explicitly.
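
A round-trip sketch using the explicit pack and unpack functions together with sockaddr_family. The port and address are arbitrary example values, and no socket is opened.

```perl
use strict;
use warnings;
use Socket qw(AF_INET pack_sockaddr_in unpack_sockaddr_in
              sockaddr_family inet_aton inet_ntoa);

# Pack a port and a numeric IPv4 address into a sockaddr_in structure.
my $sockaddr = pack_sockaddr_in(8080, inet_aton('127.0.0.1'));

# The family tag tells us which unpack function applies.
my $family = sockaddr_family($sockaddr);

# Unpack it again and turn the opaque address back into dotted form.
my ($port, $ip_address) = unpack_sockaddr_in($sockaddr);
my $host = inet_ntoa($ip_address);

print "AF_INET address\n" if $family == AF_INET;
print "$host:$port\n";
```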

$sockaddr = pack_sockaddr_in6 $port, $ip6_address, [$scope_id, [$flowinfo]]

Takes two to four arguments, a port number, an opaque string (as returned by inet_pton()), optionally a scope ID number, and optionally a flow label number. Returns the sockaddr_in6 structure with those arguments packed in and AF_INET6 filled in. IPv6 equivalent of pack_sockaddr_in().

($port, $ip6_address, $scope_id, $flowinfo) = unpack_sockaddr_in6 $sockaddr

Takes a sockaddr_in6 structure. Returns a list of four elements: the port number, an opaque string representing the IPv6 address, the scope ID, and the flow label. (You can use inet_ntop() to convert the address to the usual string format). Will croak if the structure does not represent an AF_INET6 address.

In scalar context will return just the IP address.

$sockaddr = sockaddr_in6 $port, $ip6_address, [$scope_id, [$flowinfo]]

($port, $ip6_address, $scope_id, $flowinfo) = sockaddr_in6 $sockaddr

A wrapper of pack_sockaddr_in6() or unpack_sockaddr_in6(). In list context, unpacks its argument according to unpack_sockaddr_in6(). In scalar context, packs its arguments according to pack_sockaddr_in6().

Provided largely for legacy compatibility; it is better to use pack_sockaddr_in6() or unpack_sockaddr_in6() explicitly.
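
The IPv6 equivalent can be sketched the same way, assuming the platform's Socket build includes IPv6 support. Only the two required arguments are passed here, so the scope ID and flow label are left at their defaults.

```perl
use strict;
use warnings;
use Socket qw(AF_INET6 pack_sockaddr_in6 unpack_sockaddr_in6
              sockaddr_family inet_pton inet_ntop);

# Pack a port and the IPv6 loopback address.
my $sockaddr = pack_sockaddr_in6(443, inet_pton(AF_INET6, '::1'));

my $family = sockaddr_family($sockaddr);
my ($port, $ip6_address, $scope_id, $flowinfo) = unpack_sockaddr_in6($sockaddr);
my $host = inet_ntop(AF_INET6, $ip6_address);

print "AF_INET6 address\n" if $family == AF_INET6;
print "[$host]:$port scope=$scope_id flow=$flowinfo\n";
```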

$sockaddr = pack_sockaddr_un $path

Takes one argument, a pathname. Returns the sockaddr_un structure with that path packed in with AF_UNIX filled in. For PF_UNIX sockets, this structure is normally what you need for the arguments in bind(), connect(), and send().

($path) = unpack_sockaddr_un $sockaddr

Takes a sockaddr_un structure (as returned by pack_sockaddr_un(), getpeername() or recv()). Returns a list of one element: the pathname. Will croak if the structure does not represent an AF_UNIX address.

$sockaddr = sockaddr_un $path

($path) = sockaddr_un $sockaddr

A wrapper of pack_sockaddr_un() or unpack_sockaddr_un(). In a list context, unpacks its argument and returns a list consisting of the pathname. In a scalar context, packs its pathname as a sockaddr_un and returns it.

Provided largely for legacy compatibility; it is better to use pack_sockaddr_un() or unpack_sockaddr_un() explicitly.

These are only supported if your system has <sys/un.h>.
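
A short sketch for UNIX-domain addresses, assuming a system with <sys/un.h>. The path is an arbitrary example and no file is created.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX pack_sockaddr_un unpack_sockaddr_un sockaddr_family);

# Pack a pathname into a sockaddr_un structure and read it back.
my $sockaddr = pack_sockaddr_un('/tmp/demo.sock');
my $family   = sockaddr_family($sockaddr);
my ($path)   = unpack_sockaddr_un($sockaddr);

print "AF_UNIX address\n" if $family == AF_UNIX;
print "path: $path\n";
```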

$ip_mreq = pack_ip_mreq $multiaddr, $interface

Takes an IPv4 multicast address and optionally an interface address (or INADDR_ANY). Returns the ip_mreq structure with those arguments packed in. Suitable for use with the IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP sockopts.

($multiaddr, $interface) = unpack_ip_mreq $ip_mreq

Takes an ip_mreq structure. Returns a list of two elements; the IPv4 multicast address and interface address.
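
A sketch of packing and unpacking an ip_mreq structure. The multicast group is an arbitrary example; joining the group would additionally require a setsockopt() call on a real socket, which is not shown.

```perl
use strict;
use warnings;
use Socket qw(INADDR_ANY inet_aton inet_ntoa pack_ip_mreq unpack_ip_mreq);

# Group address plus "any interface".
my $ip_mreq = pack_ip_mreq(inet_aton('224.0.0.251'), INADDR_ANY);

my ($multiaddr, $interface) = unpack_ip_mreq($ip_mreq);
my $group = inet_ntoa($multiaddr);
my $iface = inet_ntoa($interface);

print "group $group on interface $iface\n";
```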

$ip_mreq_source = pack_ip_mreq_source $multiaddr, $source, $interface

Takes an IPv4 multicast address, source address, and optionally an interface address (or INADDR_ANY ). Returns the ip_mreq_source structure with those arguments packed in. Suitable for use with the IP_ADD_SOURCE_MEMBERSHIP and IP_DROP_SOURCE_MEMBERSHIP sockopts.

($multiaddr, $source, $interface) = unpack_ip_mreq_source $ip_mreq

Takes an ip_mreq_source structure. Returns a list of three elements; the IPv4 multicast address, source address and interface address.

$ipv6_mreq = pack_ipv6_mreq $multiaddr6, $ifindex

Takes an IPv6 multicast address and an interface number. Returns the ipv6_mreq structure with those arguments packed in. Suitable for use with the IPV6_ADD_MEMBERSHIP and IPV6_DROP_MEMBERSHIP sockopts.

($multiaddr6, $ifindex) = unpack_ipv6_mreq $ipv6_mreq

Takes an ipv6_mreq structure. Returns a list of two elements; the IPv6 address and an interface number.

FUNCTIONS

$ip_address = inet_aton $string

Takes a string giving the name of a host, or a textual representation of an IP address, and translates that to a packed binary address structure suitable to pass to pack_sockaddr_in(). If passed a hostname that cannot be resolved, returns undef. For multi-homed hosts (hosts with more than one address), the first address found is returned.

For portability do not assume that the result of inet_aton() is 32 bits wide, in other words, that it would contain only the IPv4 address in network order.

This IPv4-only function is provided largely for legacy reasons. Newly-written code should use getaddrinfo() or inet_pton() instead for IPv6 support.

$string = inet_ntoa $ip_address

Takes a packed binary address structure such as returned by unpack_sockaddr_in() (or a v-string representing the four octets of the IPv4 address in network order) and translates it into a string of the form d.d.d.d where each d is a number less than 256 (the normal human-readable four dotted number notation for Internet addresses).

This IPv4-only function is provided largely for legacy reasons. Newly-written code should use getnameinfo() or inet_ntop() instead for IPv6 support.

$address = inet_pton $family, $string

Takes an address family (such as AF_INET or AF_INET6) and a string containing a textual representation of an address in that family and translates that to a packed binary address structure.

See also getaddrinfo() for a more powerful and flexible function to look up socket addresses given hostnames or textual addresses.

$string = inet_ntop $family, $address

Takes an address family and a packed binary address structure and translates it into a human-readable textual representation of the address; typically in d.d.d.d form for AF_INET or hhhh:hhhh::hhhh form for AF_INET6.

See also getnameinfo() for a more powerful and flexible function to turn socket addresses into human-readable textual representations.
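
A sketch of converting addresses in both families with inet_pton() and inet_ntop(). The addresses come from documentation-reserved ranges, and the exact textual form produced for IPv6 is left to the platform's conversion routine.

```perl
use strict;
use warnings;
use Socket qw(AF_INET AF_INET6 inet_pton inet_ntop);

# Text to packed binary form.
my $v4 = inet_pton(AF_INET,  '192.0.2.1');
my $v6 = inet_pton(AF_INET6, '2001:db8:0:0:0:0:0:1');

printf "IPv4 is %d bytes, IPv6 is %d bytes\n", length $v4, length $v6;

# Packed binary form back to text.
my $v4_text = inet_ntop(AF_INET,  $v4);
my $v6_text = inet_ntop(AF_INET6, $v6);

print "$v4_text\n$v6_text\n";
```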

($err, @result) = getaddrinfo $host, $service, [$hints]

Given both a hostname and service name, this function attempts to resolve the host name into a list of network addresses, and the service name into a protocol and port number, and then returns a list of address structures suitable to connect() to it.

Given just a host name, this function attempts to resolve it to a list of network addresses, and then returns a list of address structures giving these addresses.

Given just a service name, this function attempts to resolve it to a protocol and port number, and then returns a list of address structures that represent it suitable to bind() to. This use should be combined with the AI_PASSIVE flag; see below.

Given neither name, it generates an error.

If present, $hints should be a reference to a hash, where the following keys are recognised:

  • flags => INT

    A bitfield containing AI_* constants; see below.

  • family => INT

    Restrict to only generating addresses in this address family

  • socktype => INT

    Restrict to only generating addresses of this socket type

  • protocol => INT

    Restrict to only generating addresses for this protocol

The return value will be a list; the first value being an error indication, followed by a list of address structures (if no error occurred).

The error value will be a dualvar; comparable to the EAI_* error constants, or printable as a human-readable error message string. If no error occurred it will be zero numerically and an empty string.

Each value in the results list will be a hash reference containing the following fields:

  • family => INT

    The address family (e.g. AF_INET)

  • socktype => INT

    The socket type (e.g. SOCK_STREAM)

  • protocol => INT

    The protocol (e.g. IPPROTO_TCP)

  • addr => STRING

    The address in a packed string (such as would be returned by pack_sockaddr_in())

  • canonname => STRING

    The canonical name for the host if the AI_CANONNAME flag was provided, or undef otherwise. This field will only be present on the first returned address.

The following flag constants are recognised in the $hints hash. Other flag constants may exist as provided by the OS.

  • AI_PASSIVE

    Indicates that this resolution is for a local bind() for a passive (i.e. listening) socket, rather than an active (i.e. connecting) socket.

  • AI_CANONNAME

    Indicates that the caller wishes the canonical hostname (canonname) field of the result to be filled in.

  • AI_NUMERICHOST

    Indicates that the caller will pass a numeric address, rather than a hostname, and that getaddrinfo() must not perform a resolve operation on this name. This flag will prevent a possibly-slow network lookup operation, and instead return an error if a hostname is passed.
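As a rough illustration of the hints hash and the result fields described above, the following sketch resolves a numeric address without any network lookup. The address 127.0.0.1 and port 80 are arbitrary values chosen for the example, not anything the module requires.

```perl
use strict;
use warnings;
use Socket qw(:addrinfo SOCK_STREAM AF_INET unpack_sockaddr_in inet_ntoa);

# AI_NUMERICHOST forbids a name lookup, so this call does not wait on DNS.
my %hints = (
    flags    => AI_NUMERICHOST,
    family   => AF_INET,
    socktype => SOCK_STREAM,
);

my ($err, @res) = getaddrinfo("127.0.0.1", "80", \%hints);
die "Cannot getaddrinfo - $err" if $err;

# Each result is a hash reference; its addr field is a packed sockaddr.
my ($port, $packed_ip) = unpack_sockaddr_in($res[0]{addr});
my $ip = inet_ntoa($packed_ip);
print "family=$res[0]{family} socktype=$res[0]{socktype} addr=$ip:$port\n";
```

The addr field can be handed straight to connect() or bind(); it is unpacked here only to show what it contains.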

($err, $hostname, $servicename) = getnameinfo $sockaddr, [$flags, [$xflags]]

Given a packed socket address (such as from getsockname(), getpeername(), or returned by getaddrinfo() in an addr field), returns the hostname and symbolic service name it represents. $flags may be a bitmask of NI_* constants, or defaults to 0 if unspecified.

The return value will be a list; the first value being an error condition, followed by the hostname and service name.

The error value will be a dualvar: comparable to the EAI_* error constants, or printable as a human-readable error message string. The host and service names will be plain strings.

The following flag constants are recognised as $flags. Other flag constants may exist as provided by the OS.

  • NI_NUMERICHOST

    Requests that a human-readable string representation of the numeric address be returned directly, rather than performing a name resolve operation that may convert it into a hostname. This will also avoid potentially-blocking network IO.

  • NI_NUMERICSERV

    Requests that the port number be returned directly as a number representation rather than performing a name resolve operation that may convert it into a service name.

  • NI_NAMEREQD

    If a name resolve operation fails to provide a name, then this flag will cause getnameinfo() to indicate an error, rather than returning the numeric representation as a human-readable string.

  • NI_DGRAM

    Indicates that the socket address relates to a SOCK_DGRAM socket, for the services whose name differs between TCP and UDP protocols.

The following constants may be supplied as $xflags.

  • NIx_NOHOST

    Indicates that the caller is not interested in the hostname of the result, so it does not have to be converted. undef will be returned as the hostname.

  • NIx_NOSERV

    Indicates that the caller is not interested in the service name of the result, so it does not have to be converted. undef will be returned as the service name.
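The flags above might be combined as in this small sketch, which formats a hand-built address purely numerically. The address and port are made-up example values; only the numeric flags are used so that no resolver lookup is involved.

```perl
use strict;
use warnings;
use Socket qw(:addrinfo pack_sockaddr_in inet_aton);

# A packed AF_INET address for 127.0.0.1 port 80 (arbitrary example values).
my $sockaddr = pack_sockaddr_in(80, inet_aton("127.0.0.1"));

# Ask for numeric forms only, so no name resolution is attempted.
my ($err, $host, $service) =
    getnameinfo($sockaddr, NI_NUMERICHOST | NI_NUMERICSERV);
die "Cannot getnameinfo - $err" if $err;
print "$host:$service\n";

# With NIx_NOSERV the service conversion is skipped and undef comes back.
my ($err2, $host_only, $no_service) =
    getnameinfo($sockaddr, NI_NUMERICHOST, NIx_NOSERV);
die "Cannot getnameinfo - $err2" if $err2;
print "host only: $host_only\n";
```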

getaddrinfo() / getnameinfo() ERROR CONSTANTS

The following constants may be returned by getaddrinfo() or getnameinfo(). Others may be provided by the OS.

  • EAI_AGAIN

    A temporary failure occurred during name resolution. The operation may be successful if it is retried later.

  • EAI_BADFLAGS

    The value of the flags hint to getaddrinfo(), or the $flags parameter to getnameinfo() contains unrecognised flags.

  • EAI_FAMILY

    The family hint to getaddrinfo(), or the family of the socket address passed to getnameinfo() is not supported.

  • EAI_NODATA

    The host name supplied to getaddrinfo() did not provide any usable address data.

  • EAI_NONAME

    The host name supplied to getaddrinfo() does not exist, or the address supplied to getnameinfo() is not associated with a host name and the NI_NAMEREQD flag was supplied.

  • EAI_SERVICE

    The service name supplied to getaddrinfo() is not available for the socket type given in the $hints.
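Because the error value is a dualvar, the same scalar can be tested numerically against a constant and interpolated as a message. A minimal sketch, assuming a deliberately invalid host string (the name used here is made up) so that the failure happens locally:

```perl
use strict;
use warnings;
use Socket qw(:addrinfo SOCK_STREAM);

# AI_NUMERICHOST with a non-numeric host fails without any DNS traffic.
my ($err, @res) = getaddrinfo("not-a-numeric-address", "80",
    { flags => AI_NUMERICHOST, socktype => SOCK_STREAM });

my $report;
if (!$err) {
    $report = "resolved " . scalar(@res) . " address(es)";
}
elsif ($err == EAI_NONAME) {
    # Numeric comparison uses the error-code half of the dualvar.
    $report = "no such name: $err";
}
else {
    # String interpolation uses the message half.
    $report = "lookup failed: $err";
}
print "$report\n";
```

Which constant a given platform returns for this case may vary, which is why the sketch keeps a generic fallback branch.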

EXAMPLES

Lookup for connect()

The getaddrinfo() function converts a hostname and a service name into a list of structures, each containing a potential way to connect() to the named service on the named host.

  use IO::Socket;
  use Socket qw(SOCK_STREAM getaddrinfo);

  my %hints = (socktype => SOCK_STREAM);
  my ($err, @res) = getaddrinfo("localhost", "echo", \%hints);
  die "Cannot getaddrinfo - $err" if $err;

  my $sock;
  foreach my $ai (@res) {
      my $candidate = IO::Socket->new();
      $candidate->socket($ai->{family}, $ai->{socktype}, $ai->{protocol})
          or next;
      $candidate->connect($ai->{addr})
          or next;
      $sock = $candidate;
      last;
  }
  die "Cannot connect to localhost:echo" unless $sock;

  $sock->print("Hello, world!\n");
  print <$sock>;

Because a list of potential candidates is returned, the foreach loop tries each in turn until it finds one for which both the socket() and connect() calls succeed.

This function performs the work of the legacy functions gethostbyname(), getservbyname(), inet_aton() and pack_sockaddr_in().

In practice this logic is better performed by IO::Socket::IP.

Making a human-readable string out of an address

The getnameinfo() function converts a socket address, such as returned by getsockname() or getpeername(), into a pair of human-readable strings representing the address and service name.

  use IO::Socket::IP;
  use Socket qw(getnameinfo);

  my $server = IO::Socket::IP->new(LocalPort => 12345, Listen => 1) or
      die "Cannot listen - $@";
  my $socket = $server->accept or die "accept: $!";

  my ($err, $hostname, $servicename) = getnameinfo($socket->peername);
  die "Cannot getnameinfo - $err" if $err;
  print "The peer is connected from $hostname\n";

Since in this example only the hostname was used, the redundant conversion of the port number into a service name may be omitted by passing the NIx_NOSERV flag.

  use Socket qw(getnameinfo NIx_NOSERV);
  my ($err, $hostname) = getnameinfo($socket->peername, 0, NIx_NOSERV);

This function performs the work of the legacy functions unpack_sockaddr_in(), inet_ntoa(), gethostbyaddr() and getservbyport().

In practice this logic is better performed by IO::Socket::IP.

Resolving hostnames into IP addresses

To turn a hostname into a human-readable plain IP address use getaddrinfo() to turn the hostname into a list of socket structures, then getnameinfo() on each one to make it a readable IP address again.

  use Socket qw(:addrinfo SOCK_RAW);

  my ($err, @res) = getaddrinfo($hostname, "", {socktype => SOCK_RAW});
  die "Cannot getaddrinfo - $err" if $err;

  while( my $ai = shift @res ) {
      my ($err, $ipaddr) = getnameinfo($ai->{addr}, NI_NUMERICHOST, NIx_NOSERV);
      die "Cannot getnameinfo - $err" if $err;
      print "$ipaddr\n";
  }

The socktype hint to getaddrinfo() filters the results to only include one socket type and protocol. Without this most OSes return three combinations, for SOCK_STREAM, SOCK_DGRAM and SOCK_RAW, resulting in triplicate output of addresses. The NI_NUMERICHOST flag to getnameinfo() causes it to return a string-formatted plain IP address, rather than reverse resolving it back into a hostname.

This combination performs the work of the legacy functions gethostbyname() and inet_ntoa().

Accessing socket options

The many SO_* and other constants provide the socket option names for getsockopt() and setsockopt().

  use IO::Socket::INET;
  use Socket qw(SOL_SOCKET SO_RCVBUF IPPROTO_IP IP_TTL);

  my $socket = IO::Socket::INET->new(LocalPort => 0, Proto => 'udp')
      or die "Cannot create socket: $@";

  $socket->setsockopt(SOL_SOCKET, SO_RCVBUF, 64*1024) or
      die "setsockopt: $!";

  print "Receive buffer is ", $socket->getsockopt(SOL_SOCKET, SO_RCVBUF),
      " bytes\n";
  print "IP TTL is ", $socket->getsockopt(IPPROTO_IP, IP_TTL), "\n";

As a convenience, IO::Socket's setsockopt() method will convert a number into a packed byte buffer, and getsockopt() will unpack a byte buffer of the correct size back into a number.
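For comparison, here is a sketch of what that convenience replaces when the built-in setsockopt() and getsockopt() functions are used on a plain filehandle. The choice of SO_REUSEADDR and the "i" pack template (a native int) are assumptions made for the example.

```perl
use strict;
use warnings;
use Socket qw(AF_INET SOCK_STREAM SOL_SOCKET SO_REUSEADDR);

socket(my $sock, AF_INET, SOCK_STREAM, 0) or die "socket: $!";

# The built-ins work on packed buffers, so pack the int explicitly.
setsockopt($sock, SOL_SOCKET, SO_REUSEADDR, pack("i", 1))
    or die "setsockopt: $!";

# getsockopt returns the raw buffer; unpack it back into a number.
my $packed = getsockopt($sock, SOL_SOCKET, SO_REUSEADDR);
defined $packed or die "getsockopt: $!";
my $reuse = unpack("i", $packed);
print "SO_REUSEADDR is ", ($reuse ? "on" : "off"), "\n";
```

The exact non-zero value read back for a boolean option can differ between operating systems, so the sketch only tests it for truth.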

AUTHOR

This module was originally maintained in Perl core by the Perl 5 Porters.

It was extracted to dual-life on CPAN at version 1.95 by Paul Evans <leonerd@leonerd.org.uk>

Storable

NAME

Storable - persistence for Perl data structures

SYNOPSIS

  use Storable;
  store \%table, 'file';
  $hashref = retrieve('file');

  use Storable qw(nstore store_fd nstore_fd fd_retrieve freeze thaw dclone);

  # Network order
  nstore \%table, 'file';
  $hashref = retrieve('file'); # There is NO nretrieve()

  # Storing to and retrieving from an already opened file
  store_fd \@array, \*STDOUT;
  nstore_fd \%table, \*STDOUT;
  $aryref = fd_retrieve(\*SOCKET);
  $hashref = fd_retrieve(\*SOCKET);

  # Serializing to memory
  $serialized = freeze \%table;
  %table_clone = %{ thaw($serialized) };

  # Deep (recursive) cloning
  $cloneref = dclone($ref);

  # Advisory locking
  use Storable qw(lock_store lock_nstore lock_retrieve);
  lock_store \%table, 'file';
  lock_nstore \%table, 'file';
  $hashref = lock_retrieve('file');

DESCRIPTION

The Storable package brings persistence to your Perl data structures containing SCALAR, ARRAY, HASH or REF objects, i.e. anything that can be conveniently stored to disk and retrieved at a later time.

It can be used in the regular procedural way by calling store with a reference to the object to be stored, along with the file name where the image should be written.

The routine returns undef for I/O problems or other internal error, a true value otherwise. Serious errors are propagated as a die exception.

To retrieve data stored to disk, use retrieve with a file name. The objects stored into that file are recreated into memory for you, and a reference to the root object is returned. In case an I/O error occurs while reading, undef is returned instead. Other serious errors are propagated via die.

Since storage is performed recursively, you might want to stuff references to objects that share a lot of common data into a single array or hash table, and then store that object. That way, when you retrieve back the whole thing, the objects will continue to share what they originally shared.
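A small sketch of that sharing behaviour, using made-up record data and a temporary file created with File::Temp (neither is part of Storable itself):

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

# Hypothetical data: two records pointing at the same settings hash.
my $shared  = { theme => 'blue' };
my @records = ({ name => 'a', settings => $shared },
               { name => 'b', settings => $shared });

my $file = tempdir(CLEANUP => 1) . "/records.sto";
store(\@records, $file) or die "Cannot store to $file\n";

my $copy = retrieve($file);
die "Cannot retrieve from $file\n" unless defined $copy;

# Both records in the copy refer to one settings hash, so a change
# made through the first record is visible through the second.
$copy->[0]{settings}{theme} = 'red';
print "second record sees: $copy->[1]{settings}{theme}\n";
```

Had the two records been stored with separate store calls, each retrieved record would have carried its own private copy of the settings.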

At the cost of a slight header overhead, you may store to an already opened file descriptor using the store_fd routine, and retrieve from a file via fd_retrieve. Those names aren't imported by default, so you will have to do that explicitly if you need those routines. The file descriptor you supply must be already opened, for read if you're going to retrieve and for write if you wish to store.

  store_fd(\%table, *STDOUT) || die "can't store to stdout\n";
  $hashref = fd_retrieve(*STDIN);

You can also store data in network order to allow easy sharing across multiple platforms, or when storing on a socket known to be remotely connected. The routines to call have an initial n prefix for network, as in nstore and nstore_fd. At retrieval time, your data will be correctly restored so you don't have to know whether you're restoring from native or network ordered data. Double values are stored stringified to ensure portability as well, at the slight risk of losing some precision in the last decimals.

When using fd_retrieve , objects are retrieved in sequence, one object (i.e. one recursive tree) per associated store_fd .
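The one-object-per-call pairing could look like the following sketch. It writes two images to a temporary file handle (from File::Temp, an assumption of this example) and reads them back in the same order.

```perl
use strict;
use warnings;
use Storable qw(store_fd fd_retrieve);
use File::Temp qw(tempfile);

# tempfile() returns a handle opened for both reading and writing.
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;

# Two store_fd calls write two independent images back to back.
store_fd([1, 2, 3], $fh)        or die "Cannot store array\n";
store_fd({ answer => 42 }, $fh) or die "Cannot store hash\n";

# Rewind, then read the images back in the order they were written.
seek($fh, 0, 0) or die "seek: $!";
my $aryref  = fd_retrieve($fh);
my $hashref = fd_retrieve($fh);
print "array has ", scalar(@$aryref),
      " items, answer is $hashref->{answer}\n";
```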

If you're more from the object-oriented camp, you can inherit from Storable and directly store your objects by invoking store as a method. The fact that the root of the to-be-stored tree is a blessed reference (i.e. an object) is special-cased so that the retrieve does not provide a reference to that object but rather the blessed object reference itself. (Otherwise, you'd get a reference to that blessed object).

MEMORY STORE

The Storable engine can also store data into a Perl scalar instead, to later retrieve them. This is mainly used to freeze a complex structure in some safe compact memory place (where it can possibly be sent to another process via some IPC, since freezing the structure also serializes it in effect). Later on, and maybe somewhere else, you can thaw the Perl scalar out and recreate the original complex structure in memory.

Surprisingly, the routines to be called are named freeze and thaw. If you wish to send out the frozen scalar to another machine, use nfreeze instead to get a portable image.

Note that freezing an object structure and immediately thawing it actually achieves a deep cloning of that structure:

  dclone(.) = thaw(freeze(.))

Storable provides you with a dclone interface which does not create that intermediary scalar but instead freezes the structure in some internal memory space and then immediately thaws it out.
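A brief sketch of the in-memory routines, with an invented %table as the data being copied:

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw dclone);

my %table = (name => 'example', tags => ['x', 'y']);

# nfreeze gives a portable (network order) image held in a plain scalar.
my $frozen = nfreeze(\%table);
my $thawed = thaw($frozen);

# dclone produces an equivalent deep copy without the intermediate scalar.
my $clone = dclone(\%table);

# Changing a nested value in the clone leaves the original untouched.
push @{ $clone->{tags} }, 'z';
print "original has ", scalar(@{ $table{tags} }),
      " tags, clone has ", scalar(@{ $clone->{tags} }), "\n";
```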

ADVISORY LOCKING

The lock_store and lock_nstore routines are equivalent to store and nstore, except that they get an exclusive lock on the file before writing. Likewise, lock_retrieve does the same as retrieve, but also gets a shared lock on the file before reading.

As with any advisory locking scheme, the protection only works if you systematically use lock_store and lock_retrieve. If one side of your application uses store whilst the other uses lock_retrieve, you will get no protection at all.

The internal advisory locking is implemented using Perl's flock() routine. If your system does not support any form of flock(), or if you share your files across NFS, you might wish to use other forms of locking by using modules such as LockFile::Simple which lock a file using a filesystem entry, instead of locking the file descriptor.
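A minimal sketch of a cooperating writer and reader, shown in one script for brevity. The file name and the jobs_done field are invented for the example, and the platform is assumed to support flock().

```perl
use strict;
use warnings;
use Storable qw(lock_store lock_retrieve);
use File::Temp qw(tempdir);

my $file = tempdir(CLEANUP => 1) . "/state.sto";

# Writer side: holds an exclusive lock on the file while writing.
lock_store({ jobs_done => 7 }, $file) or die "Cannot store to $file\n";

# Reader side: holds a shared lock on the file while reading.
my $state = lock_retrieve($file);
die "Cannot retrieve from $file\n" unless defined $state;
print "jobs done: $state->{jobs_done}\n";
```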

SPEED

The heart of Storable is written in C for decent speed. Extra low-level optimizations have been made when manipulating perl internals, to sacrifice encapsulation for the benefit of greater speed.

CANONICAL REPRESENTATION

Normally, Storable stores elements of hashes in the order they are stored internally by Perl, i.e. pseudo-randomly. If you set $Storable::canonical to some TRUE value, Storable will store hashes with the elements sorted by their key. This allows you to compare data structures by comparing their frozen representations (or even the compressed frozen representations), which can be useful for creating lookup tables for complicated queries.

Canonical order does not imply network order; those are two orthogonal settings.
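The effect of $Storable::canonical can be seen by freezing two hashes that hold the same pairs but were filled in a different order. This is only a sketch; the keys and values are arbitrary.

```perl
use strict;
use warnings;
use Storable qw(freeze);

# Same keys and values, inserted in opposite order.
my %first;
$first{$_} = "value $_" for 'a' .. 'j';
my %second;
$second{$_} = "value $_" for reverse 'a' .. 'j';

# With canonical ordering the keys are written sorted, so the two
# frozen images can be compared as plain strings.
local $Storable::canonical = 1;
my $same = freeze(\%first) eq freeze(\%second);
print "frozen images ", ($same ? "match" : "differ"), "\n";
```

Note that the comparison is on the serialized bytes, so values that look alike but are stored differently internally (a number versus its string form, say) may still produce different images.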

CODE REFERENCES

Since Storable version 2.05, CODE references may be serialized with the help of B::Deparse. To enable this feature, set $Storable::Deparse to a true value. To enable deserialization, $Storable::Eval should be set to a true value. Be aware that deserialization is done through eval, which is dangerous if the Storable file contains malicious data. You can set $Storable::Eval to a subroutine reference which would be used instead of eval. See below for an example using a Safe compartment for deserialization of CODE references.

If $Storable::Deparse and/or $Storable::Eval are set to false values, then the value of $Storable::forgive_me (see below) is respected while serializing and deserializing.

FORWARD COMPATIBILITY

This release of Storable can be used on a newer version of Perl to serialize data which is not supported by earlier Perls. By default, Storable will attempt to do the right thing, by croak()ing if it encounters data that it cannot deserialize. However, the defaults can be changed as follows:

  • utf8 data

    Perl 5.6 added support for Unicode characters with code points > 255, and Perl 5.8 has full support for Unicode characters in hash keys. Perl internally encodes strings with these characters using utf8, and Storable serializes them as utf8. By default, if an older version of Perl encounters a utf8 value it cannot represent, it will croak(). To change this behaviour so that Storable deserializes utf8 encoded values as the string of bytes (effectively dropping the is_utf8 flag) set $Storable::drop_utf8 to some TRUE value. This is a form of data loss, because with $drop_utf8 true, it becomes impossible to tell whether the original data was the Unicode string, or a series of bytes that happen to be valid utf8.

  • restricted hashes

    Perl 5.8 adds support for restricted hashes, which have keys restricted to a given set, and can have values locked to be read only. By default, when Storable encounters a restricted hash on a perl that doesn't support them, it will deserialize it as a normal hash, silently discarding any placeholder keys and leaving the keys and all values unlocked. To make Storable croak() instead, set $Storable::downgrade_restricted to a FALSE value. To restore the default set it back to some TRUE value.

  • files from future versions of Storable

    Earlier versions of Storable would immediately croak if they encountered a file with a higher internal version number than the reading Storable knew about. Internal version numbers are increased each time new data types (such as restricted hashes) are added to the vocabulary of the file format. This meant that a newer Storable module had no way of writing a file readable by an older Storable, even if the writer didn't store newer data types.

    This version of Storable will defer croaking until it encounters a data type in the file that it does not recognize. This means that it will continue to read files generated by newer Storable modules which are careful in what they write out, making it easier to upgrade Storable modules in a mixed environment.

    The old behaviour of immediate croaking can be re-instated by setting $Storable::accept_future_minor to some FALSE value.

All these variables have no effect on a newer Perl which supports the relevant feature.

ERROR REPORTING

Storable uses the "exception" paradigm, in that it does not try to work around failures: if something bad happens, an exception is generated from the caller's perspective (see Carp and croak()). Use eval {} to trap those exceptions.

When Storable croaks, it tries to report the error via the logcroak() routine from the Log::Agent package, if it is available.

Normal errors are reported by having store() or retrieve() return undef. Such errors are usually I/O errors (or truncated stream errors at retrieval).
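Both reporting paths can be handled together, roughly as in this sketch. It feeds retrieve() a text file that is deliberately not a Storable image; whether that particular failure arrives as an exception or as undef is treated as unknown here, so both branches are covered.

```perl
use strict;
use warnings;
use Storable qw(retrieve);
use File::Temp qw(tempfile);

# Write a file that is deliberately not a Storable image.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "this is plain text, not a Storable image\n";
close $fh or die "close: $!";

# eval {} traps a croak; a plain undef return signals a normal error.
my $data = eval { retrieve($path) };
my $outcome;
if (my $error = $@) {
    $outcome = "retrieve croaked: $error";
}
elsif (!defined $data) {
    $outcome = "retrieve returned undef\n";
}
else {
    $outcome = "retrieve succeeded\n";
}
print $outcome;
```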

WIZARDS ONLY

Hooks

Any class may define hooks that will be called during the serialization and deserialization process on objects that are instances of that class. Those hooks can redefine the way serialization is performed (and therefore, how the symmetrical deserialization should be conducted).

Since we said earlier:

  dclone(.) = thaw(freeze(.))

everything we say about hooks should also hold for deep cloning. However, hooks get to know whether the operation is a mere serialization, or a cloning.

Therefore, when serializing hooks are involved,

  dclone(.) <> thaw(freeze(.))

Well, you could keep them in sync, but there's no guarantee it will always hold on classes somebody else wrote. Besides, there is little to gain in doing so: a serializing hook could keep only one attribute of an object, which is probably not what should happen during a deep cloning of that same object.

Here is the hooking interface:

  • STORABLE_freeze obj, cloning

    The serializing hook, called on the object during serialization. It can be inherited, or defined in the class itself, like any other method.

    Arguments: obj is the object to serialize, cloning is a flag indicating whether we're in a dclone() or a regular serialization via store() or freeze().

    Returned value: A LIST ($serialized, $ref1, $ref2, ...) where $serialized is the serialized form to be used, and the optional $ref1, $ref2, etc... are extra references that you wish to let the Storable engine serialize.

    At deserialization time, you will be given back the same LIST, but all the extra references will be pointing into the deserialized structure.

    The first time the hook is hit in a serialization flow, you may have it return an empty list. That will signal the Storable engine to further discard that hook for this class and to therefore revert to the default serialization of the underlying Perl data. The hook will again be normally processed in the next serialization.

    Unless you know better, a serializing hook should always say:

    sub STORABLE_freeze {
        my ($self, $cloning) = @_;
        return if $cloning; # Regular default serialization
        ....
    }

    in order to keep reasonable dclone() semantics.

  • STORABLE_thaw obj, cloning, serialized, ...

    The deserializing hook called on the object during deserialization. But wait: if we're deserializing, there's no object yet... right?

    Wrong: the Storable engine creates an empty one for you. If you know Eiffel, you can view STORABLE_thaw as an alternate creation routine.

    This means the hook can be inherited like any other method, and that obj is your blessed reference for this particular instance.

    The other arguments should look familiar if you know STORABLE_freeze: cloning is true when we're part of a deep clone operation, serialized is the serialized string you returned to the engine in STORABLE_freeze, and there may be an optional list of references, in the same order you gave them at serialization time, pointing to the deserialized objects (which have been processed courtesy of the Storable engine).

    When the Storable engine does not find any STORABLE_thaw hook routine, it tries to load the class by requiring the package dynamically (using the blessed package name), and then re-attempts the lookup. If at that time the hook cannot be located, the engine croaks. Note that this mechanism will fail if you define several classes in the same file, but perlmod warned you.

    It is up to you to use this information to populate obj the way you want.

    Returned value: none.

  • STORABLE_attach class, cloning, serialized

    While STORABLE_freeze and STORABLE_thaw are useful for classes where each instance is independent, this mechanism has difficulty (or is incompatible) with objects that exist as common process-level or system-level resources, such as singleton objects, database pools, caches or memoized objects.

    The alternative STORABLE_attach method provides a solution for these shared objects. Instead of STORABLE_freeze --> STORABLE_thaw, you implement STORABLE_freeze --> STORABLE_attach.

    Arguments: class is the class we are attaching to, cloning is a flag indicating whether we're in a dclone() or a regular de-serialization via thaw(), and serialized is the stored string for the resource object.

    Because these resource objects are considered to be owned by the entire process/system, and not the "property" of whatever is being serialized, no references underneath the object should be included in the serialized string. Thus, in any class that implements STORABLE_attach, the STORABLE_freeze method cannot return any references, and Storable will throw an error if STORABLE_freeze tries to return references.

    All information required to "attach" back to the shared resource object must be contained only in the STORABLE_freeze return string. Otherwise, STORABLE_freeze behaves as normal for STORABLE_attach classes.

    Because STORABLE_attach is passed the class (rather than an object), it also returns the object directly, rather than modifying the passed object.

    Returned value: object of type class
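As an illustration of the freeze/thaw pair of hooks, here is a sketch built around a hypothetical Counter class. Its field names and the decision to drop the cache on serialization are inventions for the example, not anything Storable prescribes.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

{
    # Hypothetical class: "count" is persistent, "cache" is scratch data.
    package Counter;

    sub new {
        my ($class, $count) = @_;
        return bless { count => $count, cache => { hits => 99 } }, $class;
    }

    sub STORABLE_freeze {
        my ($self, $cloning) = @_;
        return if $cloning;          # let dclone() use default serialization
        return ($self->{count});     # the serialized string, no extra refs
    }

    sub STORABLE_thaw {
        my ($self, $cloning, $serialized) = @_;
        # $self is the empty blessed object created by the engine.
        $self->{count} = $serialized;
        $self->{cache} = {};
        return;
    }
}

my $copy = thaw(freeze(Counter->new(5)));
print ref($copy), " count=$copy->{count}, cache entries=",
      scalar(keys %{ $copy->{cache} }), "\n";
```

Because the freeze hook returns an empty list when cloning, dclone() on a Counter would copy the cache as well, which matches the advice above about keeping reasonable dclone() semantics.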

Predicates

Predicates are not exportable. They must be called by explicitly prefixing them with the Storable package name.

  • Storable::last_op_in_netorder

    The Storable::last_op_in_netorder() predicate will tell you whether network order was used in the last store or retrieve operation. If you don't know how to use this, just forget about it.

  • Storable::is_storing

    Returns true if within a store operation (via STORABLE_freeze hook).

  • Storable::is_retrieving

    Returns true if within a retrieve operation (via STORABLE_thaw hook).

Recursion

With hooks comes the ability to recurse back to the Storable engine. Indeed, hooks are regular Perl code, and Storable is convenient when it comes to serializing and deserializing things, so why not use it to handle the serialization string?

There are a few things you need to know, however:

  • You can create endless loops if the things you serialize via freeze() (for instance) point back to the object we're trying to serialize in the hook.

  • Shared references among objects will not stay shared: if we're serializing the list of object [A, C] where both object A and C refer to the SAME object B, and if there is a serializing hook in A that says freeze(B), then when deserializing, we'll get [A', C'] where A' refers to B', but C' refers to D, a deep clone of B'. The topology was not preserved.

That's why STORABLE_freeze lets you provide a list of references to serialize. The engine guarantees that those will be serialized in the same context as the other objects, and therefore that shared objects will stay shared.

In the above [A, C] example, the STORABLE_freeze hook could return:

  ("something", $self->{B})

and the B part would be serialized by the engine. In STORABLE_thaw, you would get back the reference to the B' object, deserialized for you.

Therefore, recursion should normally be avoided, but is nonetheless supported.
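The [A, C] scenario can be sketched as follows. The Holder class and its label and b fields are hypothetical; the point is only that B is handed to the engine as an extra reference instead of being frozen inside the hook.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

{
    # Hypothetical class whose hook hands its "b" member to the engine.
    package Holder;

    sub STORABLE_freeze {
        my ($self, $cloning) = @_;
        return ($self->{label}, $self->{b});
    }

    sub STORABLE_thaw {
        my ($self, $cloning, $label, $b_ref) = @_;
        $self->{label} = $label;
        $self->{b}     = $b_ref;   # already deserialized by the engine
        return;
    }
}

my $b_obj = { payload => 'shared' };
my $a_obj = bless { label => 'A', b => $b_obj }, 'Holder';
my $c_obj = { label => 'C', b => $b_obj };   # plain hash, no hooks

my $copy = thaw(freeze([ $a_obj, $c_obj ]));

# Reference equality: both copies point at a single B'.
my $still_shared = $copy->[0]{b} == $copy->[1]{b};
print "B is ", ($still_shared ? "still shared" : "duplicated"), "\n";
```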

Deep Cloning

There is a Clone module available on CPAN which implements deep cloning natively, i.e. without freezing to memory and thawing the result. It is aimed to replace Storable's dclone() some day. However, it does not currently support Storable hooks to redefine the way deep cloning is performed.

Storable magic

Yes, there's a lot of that :-) But more precisely, in UNIX systems there's a utility called file, which recognizes data files based on their contents (usually their first few bytes). For this to work, a certain file called magic needs to be taught about the signature of the data. Where that configuration file lives depends on the UNIX flavour; often it's something like /usr/share/misc/magic or /etc/magic. Your system administrator needs to do the updating of the magic file. The necessary signature information is output to STDOUT by invoking Storable::show_file_magic(). Note that the GNU implementation of the file utility, version 3.38 or later, is expected to contain support for recognising Storable files out-of-the-box, in addition to other kinds of Perl files.

You can also use the following functions to extract the file header information from Storable images:

  • $info = Storable::file_magic( $filename )

    If the given file is a Storable image return a hash describing it. If the file is readable, but not a Storable image return undef. If the file does not exist or is unreadable then croak.

    The hash returned has the following elements:

    • version

      This returns the file format version. It is a string like "2.7".

      Note that this version number is not the same as the version number of the Storable module itself. For instance Storable v0.7 creates files in format v2.0 and Storable v2.15 creates files in format v2.7. The file format version number only increments when additional features that would confuse older versions of the module are added.

      Files older than v2.0 will have one of the version numbers "-1", "0" or "1". No minor number was used at that time.

    • version_nv

      This returns the file format version as a number. It is a string like "2.007". This value is suitable for numeric comparisons.

      The constant function Storable::BIN_VERSION_NV returns a comparable number that represents the highest file version number that this version of Storable fully supports (but see discussion of $Storable::accept_future_minor above). The constant function Storable::BIN_WRITE_VERSION_NV returns what file version is written and might be less than Storable::BIN_VERSION_NV in some configurations.

    • major, minor

      This also returns the file format version. If the version is "2.7" then major would be 2 and minor would be 7. The minor element is missing when major is less than 2.

    • hdrsize

      This is the number of bytes that the Storable header occupies.

    • netorder

      This is TRUE if the image stores data in network order. This means that it was created with nstore() or similar.

    • byteorder

      This is only present when netorder is FALSE. It is the $Config{byteorder} string of the perl that created this image. It is a string like "1234" (32 bit little endian) or "87654321" (64 bit big endian). This must match the current perl for the image to be readable by Storable.

    • intsize, longsize, ptrsize, nvsize

      These are only present when netorder is FALSE. These are the sizes of various C datatypes of the perl that created this image. These must match the current perl for the image to be readable by Storable.

      The nvsize element is only present for file format v2.2 and higher.

    • file

      The name of the file.

  • $info = Storable::read_magic( $buffer )
  • $info = Storable::read_magic( $buffer, $must_be_file )

    The $buffer should be a Storable image or the first few bytes of it. If $buffer starts with a Storable header, then a hash describing the image is returned, otherwise undef is returned.

    The hash has the same structure as the one returned by Storable::file_magic(). The file element is true if the image is a file image.

    If the $must_be_file argument is provided and is TRUE, then return undef unless the image looks like it belongs to a file dump.

    The maximum size of a Storable header is currently 21 bytes. If the provided $buffer is only the first part of a Storable image it should at least be this long to ensure that read_magic() will recognize it as such.
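A short sketch of both header functions, using a temporary file (via File::Temp, an assumption of the example) and a small in-memory image:

```perl
use strict;
use warnings;
use Storable qw(nstore freeze);
use File::Temp qw(tempdir);

my $file = tempdir(CLEANUP => 1) . "/table.sto";
nstore({ key => 'value' }, $file) or die "Cannot store to $file\n";

# Inspect the header of a file image without retrieving its contents.
my $info = Storable::file_magic($file)
    or die "$file is not a Storable image\n";
print "format $info->{version}, netorder=", ($info->{netorder} ? 1 : 0),
      ", header is $info->{hdrsize} bytes\n";

# read_magic works on an in-memory image (or just its first bytes).
my $mem_info = Storable::read_magic(freeze([1, 2, 3]))
    or die "buffer does not start with a Storable header\n";
print "in-memory image, file flag=", ($mem_info->{file} ? 1 : 0), "\n";
```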

EXAMPLES

Here are some code samples showing a possible usage of Storable:

  use Storable qw(store retrieve freeze thaw dclone);

  %color = ('Blue' => 0.1, 'Red' => 0.8, 'Black' => 0, 'White' => 1);

  store(\%color, 'mycolors') or die "Can't store %color in mycolors!\n";

  $colref = retrieve('mycolors');
  die "Unable to retrieve from mycolors!\n" unless defined $colref;
  printf "Blue is still %lf\n", $colref->{'Blue'};

  $colref2 = dclone(\%color);

  $str = freeze(\%color);
  printf "Serialization of %%color is %d bytes long.\n", length($str);
  $colref3 = thaw($str);

which prints (on my machine):

  Blue is still 0.100000
  Serialization of %color is 102 bytes long.

Serialization of CODE references and deserialization in a safe compartment:

  use Storable qw(freeze thaw);
  use Safe;
  use strict;

  my $safe = Safe->new;
  # because of opcodes used in "use strict":
  $safe->permit(qw(:default require));

  # serialize CODE refs via B::Deparse, deserialize inside the compartment
  local $Storable::Deparse = 1;
  local $Storable::Eval = sub { $safe->reval($_[0]) };

  my $serialized = freeze(sub { 42 });
  my $code = thaw($serialized);
  $code->() == 42;

SECURITY WARNING

Do not accept Storable documents from untrusted sources!

Some features of Storable can lead to security vulnerabilities if you accept Storable documents from untrusted sources. Most obviously, the optional (off by default) CODE reference serialization feature allows transfer of code to the deserializing process. Furthermore, any serialized object will cause Storable to helpfully load the module corresponding to the class of the object in the deserializing module. For manipulated module names, this can load almost arbitrary code. Finally, the deserialized object's destructors will be invoked when the objects get destroyed in the deserializing process. Maliciously crafted Storable documents may put such objects in the value of a hash key that is overridden by another key/value pair in the same hash, thus causing immediate destructor execution.

In a future version of Storable, we intend to provide options to disable loading modules for classes and to disable deserializing objects altogether. Nonetheless, Storable deserializing documents from untrusted sources is expected to have other, yet undiscovered, security concerns such as allowing an attacker to cause the deserializer to crash hard.

Therefore, let me repeat: Do not accept Storable documents from untrusted sources!

If your application requires accepting data from untrusted sources, you are best off with a less powerful serialization format and implementation that is more likely to be safe. If your data is sufficiently simple, JSON is a good choice and offers maximum interoperability.
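
As a minimal sketch of that alternative, the colour table from the samples above can be round-tripped through JSON::PP, which ships with Perl. The choice of JSON::PP and the variable names are assumptions made for illustration; any JSON module with a similar interface would do.

```perl
use strict;
use warnings;
use JSON::PP;

# Plain data only: strings and numbers, no objects or code references.
my %color = (Blue => 0.1, Red => 0.8, Black => 0, White => 1);

# canonical() sorts hash keys so that the text is stable between runs.
my $json = JSON::PP->new->canonical;
my $text = $json->encode(\%color);
print "$text\n";

# Decoding this document yields a plain hash of scalars.
my $copy = $json->decode($text);
printf "Blue is still %s\n", $copy->{Blue};
```

The trade-off is that only simple structures survive: blessed objects, code references and shared references within a structure are not carried across as they are with Storable.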

WARNING

If you're using references as keys within your hash tables, you're bound to be disappointed when retrieving your data. Indeed, Perl stringifies references used as hash table keys. If you later wish to access the items via another reference stringification (i.e. using the same reference that was used for the key originally to record the value into the hash table), it will work because both references stringify to the same string.

It won't work across a sequence of store and retrieve operations, however, because the addresses in the retrieved objects, which are part of the stringified references, will probably differ from the original addresses. The topology of your structure is preserved, but not hidden semantics like those.
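
The effect can be seen without touching the disk. The sketch below uses dclone() as a stand-in for a store/retrieve pair; the structure and its names are made up for the illustration.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my $point = [1, 2];
# The key is the string "ARRAY(0x...)", not the reference itself.
my %label = ($point => 'origin');
my $data  = { point => $point, label => \%label };

# Within the original structure the reference still finds its entry.
print "before: ",
    (exists $data->{label}{ $data->{point} } ? "found" : "missing"), "\n";

# The copied array lives at a new address, but the copied key still
# holds the text of the old address.
my $copy = dclone($data);
print "after:  ",
    (exists $copy->{label}{ $copy->{point} } ? "found" : "missing"), "\n";
```

If such a lookup has to survive serialization, one option is to key the hash on something stable, such as an identifier stored inside the object.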

On platforms where it matters, be sure to call binmode() on the descriptors that you pass to Storable functions.

Storing data canonically that contains large hashes can be significantly slower than storing the same data normally, as temporary arrays to hold the keys for each hash have to be allocated, populated, sorted and freed. Some tests have shown a halving of the speed of storing -- the exact penalty will depend on the complexity of your data. There is no slowdown on retrieval.
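
What the extra sorting buys is a reproducible image. Here is a small sketch, with hash names chosen for the example, that freezes two hashes holding the same pairs inserted in opposite orders:

```perl
use strict;
use warnings;
use Storable qw(freeze);

# Same key/value pairs, inserted in opposite orders.
my (%first, %second);
$first{$_}  = "value $_" for 1 .. 200;
$second{$_} = "value $_" for reverse 1 .. 200;

my $same;
{
    # Keys are sorted before being written while this is true.
    local $Storable::canonical = 1;
    $same = freeze(\%first) eq freeze(\%second);
}
print "canonical images ", ($same ? "match" : "differ"), "\n";
```

The values here are all strings on purpose; the BUGS section describes how numbers that have been used as strings can still make canonical images differ.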

BUGS

You can't store GLOB, FORMLINE, REGEXP, etc. If you can define semantics for those operations, feel free to enhance Storable so that it can deal with them.

The store functions will croak if they run into such references unless you set $Storable::forgive_me to some TRUE value. In that case, the fatal message is turned into a warning and some meaningless string is stored instead.

Setting $Storable::canonical may not yield frozen strings that compare equal due to possible stringification of numbers. When the string version of a scalar exists, it is the form stored; therefore, if you happen to use your numbers as strings between two freezing operations on the same data structures, you will get different results.

When storing doubles in network order, their value is stored as text. Even so, you should not expect non-numeric floating-point values such as infinity and "not a number" to pass successfully through a nstore()/retrieve() pair.

As Storable neither knows nor cares about character sets (although it does know that characters may be more than eight bits wide), any difference in the interpretation of character codes between a host and a target system is your problem. In particular, if host and target use different code points to represent the characters used in the text representation of floating-point numbers, you will not be able to exchange floating-point data, even with nstore().

Storable::drop_utf8 is a blunt tool. There is no facility either to return all strings as utf8 sequences, or to attempt to convert utf8 data back to 8 bit and croak() if the conversion fails.

Prior to Storable 2.01, no distinction was made between signed and unsigned integers on storing. By default Storable prefers to store a scalar's string representation (if it has one), so this would only cause problems when storing large unsigned integers that had never been converted to string or floating point: in other words, values that had been generated by integer operations such as logic ops and then not used in any string or arithmetic context before storing.

64 bit data in perl 5.6.0 and 5.6.1

This section only applies to you if you have existing data written out by Storable 2.02 or earlier on perl 5.6.0 or 5.6.1 on Unix or Linux which has been configured with 64 bit integer support (not the default). If you got a precompiled perl, rather than running Configure to build your own perl from source, then it almost certainly does not affect you, and you can stop reading now (unless you're curious). If you're using perl on Windows it does not affect you.

Storable writes a file header which contains the sizes of various C language types for the C compiler that built Storable (when not writing in network order), and will refuse to load files written by a Storable not on the same (or compatible) architecture. This check and a check on machine byteorder are needed because the size of various fields in the file are given by the sizes of the C language types, and so files written on different architectures are incompatible. This is done for increased speed. (When writing in network order, all fields are written out as standard lengths, which allows full interworking, but takes longer to read and write.)
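
The header can be inspected with Storable::file_magic(). The following is only a sketch: the temporary directory and file names are invented for the example, and it prints just a few of the fields that the function reports.

```perl
use strict;
use warnings;
use Storable qw(store nstore);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my %data = (answer => 42);

store(\%data,  "$dir/native.sto");   # native byte order and C type sizes
nstore(\%data, "$dir/network.sto");  # network order, portable

for my $name (qw(native network)) {
    my $info = Storable::file_magic("$dir/$name.sto")
        or die "$name.sto is not a Storable file\n";
    printf "%-8s netorder=%d", $name, $info->{netorder} ? 1 : 0;
    # Type sizes are only recorded for native-order files.
    printf " intsize=%d longsize=%d", @$info{qw(intsize longsize)}
        unless $info->{netorder};
    print "\n";
}
```

The sizes printed for the native file depend on how the running perl was built, which is exactly why such files do not travel between architectures.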

Perl 5.6.x introduced the ability to optionally configure the perl interpreter to use C's long long type to allow scalars to store 64 bit integers on 32 bit systems. However, due to the way the Perl configuration system generated the C configuration files on non-Windows platforms, and the way Storable generates its header, nothing in the Storable file header reflected whether the perl writing was using 32 or 64 bit integers, despite the fact that Storable was storing some data differently in the file. Hence Storable running on perl with 64 bit integers will read the header from a file written by a 32 bit perl, not realise that the data is actually in a subtly incompatible format, and then go horribly wrong (possibly crashing) if it encounters a stored integer. This is a design failure.

Storable has now been changed to write out and read in a file header with information about the size of integers. It's impossible to detect whether an old file being read in was written with 32 or 64 bit integers (they have the same header) so it's impossible to automatically switch to a correct backwards compatibility mode. Hence this Storable defaults to the new, correct behaviour.

What this means is that if you have data written by Storable 1.x running on perl 5.6.0 or 5.6.1 configured with 64 bit integers on Unix or Linux then by default this Storable will refuse to read it, giving the error Byte order is not compatible. If you have such data then you should set $Storable::interwork_56_64bit to a true value to make this Storable read and write files with the old header. You should also migrate your data, or any older perl you are communicating with, to this current version of Storable.

If you don't have data written with the specific configuration of perl described above, then you do not need to, and should not, do anything. Don't set the flag - not only will Storable on an identically configured perl refuse to load them, but Storable on a differently configured perl will load them believing them to be correct for it, and then may well fail or crash part way through reading them.

CREDITS

Thank you to (in chronological order):

  1. Jarkko Hietaniemi <jhi@iki.fi>
  2. Ulrich Pfeifer <pfeifer@charly.informatik.uni-dortmund.de>
  3. Benjamin A. Holzman <bholzman@earthlink.net>
  4. Andrew Ford <A.Ford@ford-mason.co.uk>
  5. Gisle Aas <gisle@aas.no>
  6. Jeff Gresham <gresham_jeffrey@jpmorgan.com>
  7. Murray Nesbitt <murray@activestate.com>
  8. Marc Lehmann <pcg@opengroup.org>
  9. Justin Banks <justinb@wamnet.com>
  10. Jarkko Hietaniemi <jhi@iki.fi> (AGAIN, as perl 5.7.0 Pumpkin!)
  11. Salvador Ortiz Garcia <sog@msg.com.mx>
  12. Dominic Dunlop <domo@computer.org>
  13. Erik Haugan <erik@solbors.no>
  14. Benjamin A. Holzman <ben.holzman@grantstreet.com>

for their bug reports, suggestions and contributions.

Benjamin Holzman contributed the tied variable support, Andrew Ford contributed the canonical order for hashes, and Gisle Aas fixed a few misunderstandings of mine regarding the perl internals, and optimized the emission of "tags" in the output streams by simply counting the objects instead of tagging them (leading to a binary incompatibility for the Storable image starting at version 0.6--older images are, of course, still properly understood). Murray Nesbitt made Storable thread-safe. Marc Lehmann added overloading and references to tied items support. Benjamin Holzman added a performance improvement for overloaded classes; thanks to Grant Street Group for footing the bill.

AUTHOR

Storable was written by Raphael Manfredi <Raphael_Manfredi@pobox.com>. Maintenance is now done by the perl5-porters <perl5-porters@perl.org>.

Please e-mail us with problems, bug fixes, comments and complaints, although if you have compliments you should send them to Raphael. Please don't e-mail Raphael with problems, as he no longer works on Storable, and your message will be delayed while he forwards it to us.

SEE ALSO

Clone.

 
Symbol

Perl 5 version 18.2 documentation

NAME

Symbol - manipulate Perl symbols and their names

SYNOPSIS

  use Symbol;
  $sym = gensym;
  open($sym, "filename");
  $_ = <$sym>;
  # etc.
  ungensym $sym; # no effect
  # replace *FOO{IO} handle but not $FOO, %FOO, etc.
  *FOO = geniosym;
  print qualify("x"), "\n"; # "main::x"
  print qualify("x", "FOO"), "\n"; # "FOO::x"
  print qualify("BAR::x"), "\n"; # "BAR::x"
  print qualify("BAR::x", "FOO"), "\n"; # "BAR::x"
  print qualify("STDOUT", "FOO"), "\n"; # "main::STDOUT" (global)
  print qualify(\*x), "\n"; # returns \*x
  print qualify(\*x, "FOO"), "\n"; # returns \*x
  use strict 'refs';
  print { qualify_to_ref $fh } "foo!\n";
  $ref = qualify_to_ref $name, $pkg;
  use Symbol qw(delete_package);
  delete_package('Foo::Bar');
  print "deleted\n" unless exists $Foo::{'Bar::'};

DESCRIPTION

Symbol::gensym creates an anonymous glob and returns a reference to it. Such a glob reference can be used as a file or directory handle.

For backward compatibility with older implementations that didn't support anonymous globs, Symbol::ungensym is also provided. But it doesn't do anything.

Symbol::geniosym creates an anonymous IO handle. This can be assigned into an existing glob without affecting the non-IO portions of the glob.

Symbol::qualify turns unqualified symbol names into qualified variable names (e.g. "myvar" -> "MyPackage::myvar"). If it is given a second parameter, qualify uses it as the default package; otherwise, it uses the package of its caller. Regardless, global variable names (e.g. "STDOUT", "ENV", "SIG") are always qualified with "main::".

Qualification applies only to symbol names (strings). References are left unchanged under the assumption that they are glob references, which are qualified by their nature.

Symbol::qualify_to_ref is just like Symbol::qualify except that it returns a glob ref rather than a symbol name, so you can use the result even if use strict 'refs' is in effect.
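
A short sketch of these functions under strict follows. The in-memory file and the @queue array are invented for the example; nothing here is specific to them.

```perl
use strict;
use warnings;
use Symbol qw(gensym qualify qualify_to_ref);

# gensym: an anonymous glob used as a file handle.
my $text = "first line\nsecond line\n";
my $fh   = gensym;
open($fh, '<', \$text) or die "Can't open in-memory file: $!";
my $first = <$fh>;
close $fh;
print "read: $first";

# qualify returns a name; qualify_to_ref returns a glob reference
# that can be dereferenced even under "use strict 'refs'".
our @queue = ('a', 'b');
my $name = qualify('queue', 'main');          # the string "main::queue"
my $ref  = qualify_to_ref('queue', 'main');   # a reference to *main::queue
print "$name holds ", scalar @{*$ref}, " items\n";
```

Since the advent of lexical file handles (open my $fh, ...), gensym is mostly needed in older code or in modules that must hand out a glob.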

Symbol::delete_package wipes out a whole package namespace. Note this routine is not exported by default--you may want to import it explicitly.
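
As an illustration only, the sketch below creates a throwaway package (Scratch::Pad is a hypothetical name) and then removes it, checking the parent stash in the same way as the SYNOPSIS does.

```perl
use strict;
use warnings;
use Symbol qw(delete_package);

{
    package Scratch::Pad;    # hypothetical throwaway package
    our $counter = 5;
    sub bump { return ++$counter }
}

print "bump returned ", Scratch::Pad::bump(), "\n";
print "stash present before: ",
    (exists $Scratch::{'Pad::'} ? "yes" : "no"), "\n";

delete_package('Scratch::Pad');

print "stash present after:  ",
    (exists $Scratch::{'Pad::'} ? "yes" : "no"), "\n";
```

Note that the call to Scratch::Pad::bump() is made before the package is deleted; calling compiled code that refers to the package afterwards runs into the problem described under BUGS.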

BUGS

Symbol::delete_package is a bit too powerful. It undefines every symbol that lives in the specified package. Since perl, for performance reasons, does not perform a symbol table lookup each time a function is called or a global variable is accessed, some code that has already been loaded and that makes use of symbols in package Foo may stop working after you delete Foo, even if you reload the Foo module afterwards.

 
Test

Perl 5 version 18.2 documentation

NAME

Test - provides a simple framework for writing test scripts

SYNOPSIS

  use strict;
  use Test;
  # use a BEGIN block so we print our plan before MyModule is loaded
  BEGIN { plan tests => 14, todo => [3,4] }
  # load your module...
  use MyModule;
  # Helpful notes. All note-lines must start with a "#".
  print "# I'm testing MyModule version $MyModule::VERSION\n";
  ok(0); # failure
  ok(1); # success
  ok(0); # ok, expected failure (see todo list, above)
  ok(1); # surprise success!
  ok(0,1); # failure: '0' ne '1'
  ok('broke','fixed'); # failure: 'broke' ne 'fixed'
  ok('fixed','fixed'); # success: 'fixed' eq 'fixed'
  ok('fixed',qr/x/); # success: 'fixed' =~ qr/x/
  ok(sub { 1+1 }, 2); # success: '2' eq '2'
  ok(sub { 1+1 }, 3); # failure: '2' ne '3'
  my @list = (0,0);
  ok @list, 3, "\@list=".join(',',@list); #extra notes
  ok 'segmentation fault', '/(?i)success/'; #regex match
  skip(
    $^O =~ m/MSWin/ ? "Skip if MSWin" : 0, # whether to skip
    $foo, $bar # arguments just like for ok(...)
  );
  skip(
    $^O =~ m/MSWin/ ? 0 : "Skip unless MSWin", # whether to skip
    $foo, $bar # arguments just like for ok(...)
  );

DESCRIPTION

This module simplifies the task of writing test files for Perl modules, such that their output is in the format that Test::Harness expects to see.

QUICK START GUIDE

To write a test for your new (and probably not even done) module, create a new file called t/test.t (in a new t directory). If you have multiple test files, to test the "foo", "bar", and "baz" feature sets, then feel free to call your files t/foo.t, t/bar.t, and t/baz.t.

Functions

This module defines three public functions, plan(...), ok(...), and skip(...). By default, all three are exported by the use Test; statement.

  • plan(...)
      BEGIN { plan %theplan; }

    This should be the first thing you call in your test script. It declares your testing plan: how many tests there will be, whether any of them are allowed to fail, and so on.

    Typical usage is just:

      use Test;
      BEGIN { plan tests => 23 }

    These are the things that you can put in the parameters to plan:

    • tests => number

      The number of tests in your script. This means all ok() and skip() calls.

    • todo => [1,5,14]

      A reference to a list of tests which are allowed to fail. See TODO TESTS.

    • onfail => sub { ... }
    • onfail => \&some_sub

      A subroutine reference to be run at the end of the test script, if any of the tests fail. See ONFAIL.

    You must call plan(...) once and only once. You should call it in a BEGIN {...} block, like so:

      BEGIN { plan tests => 23 }

  • ok(...)
      ok(1 + 1 == 2);
      ok($have, $expect);
      ok($have, $expect, $diagnostics);

    This function is the reason for Test's existence. It's the basic function that handles printing "ok " or "not ok ", along with the current test number. (That's what Test::Harness wants to see.)

    In its most basic usage, ok(...) simply takes a single scalar expression. If its value is true, the test passes; if false, the test fails. Examples:

      # Examples of ok(scalar)
      ok( 1 + 1 == 2 ); # ok if 1 + 1 == 2
      ok( $foo =~ /bar/ ); # ok if $foo contains 'bar'
      ok( baz($x + $y) eq 'Armondo' ); # ok if baz($x + $y) returns
                                       # 'Armondo'
      ok( @a == @b ); # ok if @a and @b are the same length

    The expression is evaluated in scalar context. So the following will work:

      ok( @stuff ); # ok if @stuff has any elements
      ok( !grep !defined $_, @stuff ); # ok if everything in @stuff is
                                       # defined.

    A special case is if the expression is a subroutine reference (in either sub {...} syntax or \&foo syntax). In that case, it is executed and its value (true or false) determines if the test passes or fails. For example,

      ok( sub { # See whether sleep works at least passably
        my $start_time = time;
        sleep 5;
        time() - $start_time >= 4
      });

    In its two-argument form, ok(arg1, arg2) compares the two scalar values to see if they match. They match if both are undefined, or if arg2 is a regex that matches arg1, or if they compare equal with eq.

      # Example of ok(scalar, scalar)
      ok( "this", "that" ); # not ok, 'this' ne 'that'
      ok( "", undef ); # not ok, "" is defined

    The second argument is considered a regex if it is either a regex object or a string that looks like a regex. Regex objects are constructed with the qr// operator in recent versions of perl. A string is considered to look like a regex if its first and last characters are "/", or if the first character is "m" and its second and last characters are both the same non-alphanumeric non-whitespace character.

    Regex examples:

      ok( 'JaffO', '/Jaff/' ); # ok, 'JaffO' =~ /Jaff/
      ok( 'JaffO', 'm|Jaff|' ); # ok, 'JaffO' =~ m|Jaff|
      ok( 'JaffO', qr/Jaff/ ); # ok, 'JaffO' =~ qr/Jaff/;
      ok( 'JaffO', '/(?i)jaff/' ); # ok, 'JaffO' =~ /jaff/i;

    If either (or both!) is a subroutine reference, it is run and used as the value for comparing. For example:

      ok sub {
        open(OUT, ">x.dat") || die $!;
        print OUT "\x{e000}";
        close OUT;
        my $bytecount = -s 'x.dat';
        unlink 'x.dat' or warn "Can't unlink : $!";
        return $bytecount;
      },
      4
      ;

    The above test passes two values to ok(arg1, arg2) -- the first a coderef, and the second the number 4. Before ok compares them, it calls the coderef, and uses its return value as the real value of this parameter. Assuming that the coderef returns 4 (the value of $bytecount), ok ends up testing 4 eq 4. Since that's true, this test passes.

    Finally, you can append an optional third argument, in ok(arg1,arg2, note), where note is a string value that will be printed if the test fails. This should be some useful information about the test, pertaining to why it failed, and/or a description of the test. For example:

      ok( grep($_ eq 'something unique', @stuff), 1,
          "Something that should be unique isn't!\n".
          '@stuff = '.join ', ', @stuff
      );

    Unfortunately, a note cannot be used with the single argument style of ok(). That is, if you try ok(arg1, note), then Test will interpret this as ok(arg1, arg2), and probably end up testing arg1 eq arg2 -- and that's not what you want!

    All of the above special cases can occasionally cause some problems. See BUGS and CAVEATS.

  • skip(skip_if_true, args...)

    This is used for tests that under some conditions can be skipped. It's basically equivalent to:

      if( $skip_if_true ) {
        ok(1);
      } else {
        ok( args... );
      }

    ...except that the ok(1) emits not just "ok testnum" but actually "ok testnum # skip_if_true_value".

    The arguments after the skip_if_true are what is fed to ok(...) if this test isn't skipped.

    Example usage:

      my $if_MSWin =
        $^O =~ m/MSWin/ ? 'Skip if under MSWin' : '';
      # A test to be skipped if under MSWin (i.e., run except under MSWin)
      skip($if_MSWin, thing($foo), thing($bar) );

    Or, going the other way:

      my $unless_MSWin =
        $^O =~ m/MSWin/ ? '' : 'Skip unless under MSWin';
      # A test to be skipped unless under MSWin (i.e., run only under MSWin)
      skip($unless_MSWin, thing($foo), thing($bar) );

    The tricky thing to remember is that the first parameter is true if you want to skip the test, not run it; and it also doubles as a note about why it's being skipped. So in the first codeblock above, read the code as "skip if MSWin -- (otherwise) test whether thing($foo) is thing($bar)" or for the second case, "skip unless MSWin...".

    Also, when your skip_if_true string is true, it really should (for backwards compatibility with older Test.pm versions) start with the string "Skip", as shown in the above examples.

    Note that in the above cases, thing($foo) and thing($bar) are evaluated -- but as long as the skip_if_true is true, skip(...) just tosses out their values (i.e., it does not bother to treat them like values to ok(...)). But if you need to not eval the arguments when skipping the test, use this format:

      skip( $unless_MSWin,
        sub {
          # This code returns true if the test passes.
          # (But it doesn't even get called if the test is skipped.)
          thing($foo) eq thing($bar)
        }
      );

    or even this, which is basically equivalent:

      skip( $unless_MSWin,
        sub { thing($foo) }, sub { thing($bar) }
      );

    That is, both are like this:

      if( $unless_MSWin ) {
        ok(1); # but it actually appends "# $unless_MSWin"
               # so that Test::Harness can tell it's a skip
      } else {
        # Not skipping, so actually call and evaluate...
        ok( sub { thing($foo) }, sub { thing($bar) } );
      }
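
Putting the three functions together, a complete test file might look like the sketch below. The greet function is a hypothetical stand-in for the code under test, defined inline so that the file runs on its own.

```perl
use strict;
use warnings;
use Test;

BEGIN { plan tests => 4 }

# Hypothetical function under test.
sub greet { return "Hello, $_[0]" }

# ok() and skip() return true when the test passes or is skipped.
my @results = (
    ok( greet('World') =~ /World/ ),              # one argument: truth test
    ok( greet('World'), 'Hello, World' ),         # two arguments: eq
    ok( greet('World'), qr/^Hello/, 'prefix' ),   # regex object plus a note
    skip( ($^O =~ /MSWin/ ? 'Skip if under MSWin' : ''),
          sub { greet('perl') }, 'Hello, perl' ), # coderef is run unless skipped
);
```

Collecting the return values in @results is not required; it is done here only to show that each call reports whether it passed.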

TEST TYPES

  • NORMAL TESTS

    These tests are expected to succeed. Usually, most or all of your tests are in this category. If a normal test doesn't succeed, then that means that something is wrong.

  • SKIPPED TESTS

    The skip(...) function is for tests that might or might not be possible to run, depending on the availability of platform-specific features. The first argument should evaluate to true (think "yes, please skip") if the required feature is not available. After the first argument, skip(...) works exactly the same way as ok(...) does.

  • TODO TESTS

    TODO tests are designed for maintaining an executable TODO list. These tests are expected to fail. If a TODO test does succeed, then the feature in question shouldn't be on the TODO list, now should it?

    Packages should NOT be released with succeeding TODO tests. As soon as a TODO test starts working, it should be promoted to a normal test, and the newly working feature should be documented in the release notes or in the change log.

ONFAIL

  BEGIN { plan tests => 4, onfail => sub { warn "CALL 911!" } }

Although test failures should be enough, extra diagnostics can be triggered at the end of a test run. onfail is passed an array ref of hash refs that describe each test failure. Each hash will contain at least the following fields: package, repetition, and result. (You shouldn't rely on any other fields being present.) If the test had an expected value or a diagnostic (or "note") string, these will also be included.

The optional onfail hook might be used simply to print out the version of your package and/or how to report problems. It might also be used to generate extremely sophisticated diagnostics for a particularly bizarre test failure. However it's not a panacea. Core dumps or other unrecoverable errors prevent the onfail hook from running. (It is run inside an END block.) Besides, onfail is probably over-kill in most cases. (Your test code should be simpler than the code it is testing, yes?)
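
A hook that walks the failure list could be sketched as follows. It relies only on the package and result fields named above, and it contains one deliberately failing test so that the hook has something to report.

```perl
use strict;
use warnings;
use Test;

BEGIN {
    plan tests  => 2,
         onfail => sub {
             my ($failures) = @_;    # array ref of hash refs
             for my $f (@$failures) {
                 warn "# failed: package=$f->{package} result=$f->{result}\n";
             }
         };
}

ok(1);
ok(0);   # deliberate failure; the hook runs when the script ends
```

Because the hook runs inside an END block, its output appears after all of the "ok" and "not ok" lines.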

BUGS and CAVEATS

  • ok(...)'s special handling of strings which look like they might be regexes can cause unexpected behavior. An innocent:

      ok( $fileglob, '/path/to/some/*stuff/' );

    will fail, since Test.pm considers the second argument to be a regex! The best bet is to use the one-argument form:

      ok( $fileglob eq '/path/to/some/*stuff/' );

  • ok(...)'s use of string eq can sometimes cause odd problems when comparing numbers, especially if you're casting a string to a number:

      $foo = "1.0";
      ok( $foo, 1 ); # not ok, "1.0" ne 1

    Your best bet is to use the single argument form:

      ok( $foo == 1 ); # ok "1.0" == 1

  • As you may have inferred from the above documentation and examples, ok's prototype is ($;$$) (and, incidentally, skip's is ($;$$$)). This means, for example, that you can do ok @foo, @bar to compare the size of the two arrays. But don't be fooled into thinking that ok @foo, @bar means a comparison of the contents of two arrays -- you're comparing just the number of elements of each. It's so easy to make that mistake in reading ok @foo, @bar that you might want to be very explicit about it, and instead write ok scalar(@foo), scalar(@bar).

  • This almost definitely doesn't do what you expect:

      ok $thingy->can('some_method');

    Why? Because can returns a coderef to mean "yes it can (and the method is this...)", and then ok sees a coderef and thinks you're passing a function that you want it to call, using the truth of its result! I.e., just like:

      ok $thingy->can('some_method')->();

    What you probably want instead is this:

      ok $thingy->can('some_method') && 1;

    If the can returns false, then that is passed to ok. If it returns true, then the larger expression $thingy->can('some_method') && 1 returns 1, which ok sees as a simple signal of success, as you would expect.

  • The syntax for skip is about the only way it can be, but it's still quite confusing. Just start with the above examples and you'll be okay.

    Moreover, users may expect this:

      skip $unless_mswin, foo($bar), baz($quux);

    to not evaluate foo($bar) and baz($quux) when the test is being skipped. In reality they are evaluated; skip just won't bother comparing them if $unless_mswin is true.

    You could do this:

      skip $unless_mswin, sub{foo($bar)}, sub{baz($quux)};

    But that's not terribly pretty. You may find it simpler or clearer in the long run to just do things like this:

      if( $^O =~ m/MSWin/ ) {
        print "# Yay, we're under $^O\n";
        ok foo($bar), baz($quux);
        ok thing($whatever), baz($stuff);
        ok blorp($quux, $whatever);
        ok foo($barzbarz), thang($quux);
      } else {
        print "# Feh, we're under $^O. Watch me skip some tests...\n";
        for(1 .. 4) { skip "Skip unless under MSWin" }
      }

    But be quite sure that ok is called exactly as many times in the first block as skip is called in the second block.
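
To see in advance which strings fall foul of the first caveat, the documented rule can be restated as a small predicate. This is a hypothetical helper written for illustration, not a function exported by Test.pm, and it assumes that at least one character must sit between the delimiters.

```perl
use strict;
use warnings;

# Hypothetical helper: restates the documented rule for when the second
# argument of ok() is treated as a regex rather than compared with eq.
sub looks_like_regex {
    my ($expected) = @_;
    return 0 unless defined $expected;
    return 1 if ref($expected) eq 'Regexp';            # built with qr//
    return 0 if ref $expected;                         # some other reference
    return 1 if $expected =~ m{\A/.+/\z}s;             # "/.../"
    return 1 if $expected =~ m{\Am([^\w\s]).+\1\z}s;   # "m|...|", "m,...," etc.
    return 0;
}

for my $candidate ('/Jaff/', 'm|Jaff|', 'fixed', '/path/to/some/*stuff/') {
    printf "%-24s %s\n", $candidate,
        looks_like_regex($candidate) ? 'regex' : 'plain string';
}
```

Any expected value for which such a check is true is safer in the one-argument form, ok( $have eq $expect ).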

ENVIRONMENT

If the PERL_TEST_DIFF environment variable is set, it will be used as a command for comparing unexpected multiline results. If you have GNU diff installed, you might want to set PERL_TEST_DIFF to diff -u. If you don't have a suitable program, you might install the Text::Diff module and then set PERL_TEST_DIFF to be perl -MText::Diff -e 'print diff(@ARGV)'. If PERL_TEST_DIFF isn't set but the Algorithm::Diff module is available, then it will be used to show the differences in multiline results.

NOTE

A past developer of this module once said that it was no longer being actively developed. However, rumors of its demise were greatly exaggerated. Feedback and suggestions are quite welcome.

Be aware that the main value of this module is its simplicity. Note that there are already more ambitious modules out there, such as Test::More and Test::Unit.

Some earlier versions of this module had docs with some confusing typos in the description of skip(...).

SEE ALSO

Test::Harness

Test::Simple, Test::More, Devel::Cover

Test::Builder for building your own testing library.

Test::Unit is an interesting XUnit-style testing library.

Test::Inline and SelfTest let you embed tests in code.

AUTHOR

Copyright (c) 1998-2000 Joshua Nathaniel Pritikin.

Copyright (c) 2001-2002 Michael G. Schwern.

Copyright (c) 2002-2004 Sean M. Burke.

Current maintainer: Jesse Vincent. <jesse@bestpractical.com>

This package is free software and is provided "as is" without express or implied warranty. It may be used, redistributed and/or modified under the same terms as Perl itself.

 
Thread

Perl 5 version 18.2 documentation

NAME

Thread - Manipulate threads in Perl (for old code only)

DEPRECATED

The Thread module served as the frontend to the old-style thread model, called 5005threads, that was introduced in release 5.005. That model was deprecated, and has been removed in version 5.10.

For old code and interim backwards compatibility, the Thread module has been reworked to function as a frontend for the new interpreter threads (ithreads) model. However, some previous functionality is not available. Further, the data sharing models between the two thread models are completely different, and anything to do with data sharing has to be thought about differently. With ithreads, you must explicitly share() variables between the threads.

You are strongly encouraged to migrate any existing threaded code to the new model (i.e., use the threads and threads::shared modules) as soon as possible.

HISTORY

In Perl 5.005, the thread model was that all data is implicitly shared, and shared access to data has to be explicitly synchronized. This model is called 5005threads.

In Perl 5.6, a new model was introduced in which all data was thread local and shared access to data has to be explicitly declared. This model is called ithreads, for "interpreter threads".

In Perl 5.6, the ithreads model was not available as a public API; only as an internal API that was available for extension writers, and to implement fork() emulation on Win32 platforms.

In Perl 5.8, the ithreads model became available through the threads module, and the 5005threads model was deprecated.

In Perl 5.10, the 5005threads model was removed from the Perl interpreter.

SYNOPSIS

  use Thread qw(:DEFAULT async yield);
  my $t = Thread->new(\&start_sub, @start_args);
  $result = $t->join;
  $t->detach;
  if ($t->done) {
    $t->join;
  }
  if($t->equal($another_thread)) {
    # ...
  }
  yield();
  my $tid = Thread->self->tid;
  lock($scalar);
  lock(@array);
  lock(%hash);
  my @list = Thread->list;

DESCRIPTION

The Thread module provides multithreading support for Perl.

FUNCTIONS

  • $thread = Thread->new(\&start_sub)
  • $thread = Thread->new(\&start_sub, LIST)

    new starts a new thread of execution in the referenced subroutine. The optional list is passed as parameters to the subroutine. Execution continues in both the subroutine and the code after the new call.

    Thread->new returns a thread object representing the newly created thread.

  • lock VARIABLE

    lock places a lock on a variable until the lock goes out of scope.

    If the variable is locked by another thread, the lock call will block until it's available. lock is recursive, so multiple calls to lock are safe--the variable will remain locked until the outermost lock on the variable goes out of scope.

    Locks on variables only affect lock calls--they do not affect normal access to a variable. (Locks on subs are different, and covered in a bit.) If you really, really want locks to block access, then go ahead and tie them to something and manage this yourself. This is done on purpose. While managing access to variables is a good thing, Perl doesn't force you out of its living room...

    If a container object, such as a hash or array, is locked, the elements of that container are not locked. For example, if a thread does a lock @a, any other thread doing a lock($a[12]) won't block.

    Finally, lock will traverse up references exactly one level. lock(\$a) is equivalent to lock($a), while lock(\\$a) is not.

  • async BLOCK;

    async creates a thread to execute the block immediately following it. This block is treated as an anonymous sub, and so must have a semi-colon after the closing brace. Like Thread->new, async returns a thread object.

  • Thread->self

    The Thread->self function returns a thread object that represents the thread making the Thread->self call.

  • Thread->list

    Returns a list of all non-joined, non-detached Thread objects.

  • cond_wait VARIABLE

    The cond_wait function takes a locked variable as a parameter, unlocks the variable, and blocks until another thread does a cond_signal or cond_broadcast for that same locked variable. The variable that cond_wait blocked on is relocked after the cond_wait is satisfied. If there are multiple threads cond_waiting on the same variable, all but one will reblock waiting to reacquire the lock on the variable. (So if you're only using cond_wait for synchronization, give up the lock as soon as possible.)

  • cond_signal VARIABLE

    The cond_signal function takes a locked variable as a parameter and unblocks one thread that's cond_waiting on that variable. If more than one thread is blocked in a cond_wait on that variable, only one (and which one is indeterminate) will be unblocked.

    If there are no threads blocked in a cond_wait on the variable, the signal is discarded.

  • cond_broadcast VARIABLE

    The cond_broadcast function works similarly to cond_signal. cond_broadcast, though, will unblock all the threads that are blocked in a cond_wait on the locked variable, rather than only one.

  • yield

    The yield function allows another thread to take control of the CPU. The exact results are implementation-dependent.
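The lock and condition functions described above can be sketched as a small hand-off between two threads. This is only an illustration under stated assumptions: it needs a perl built with ithreads, it uses the threads and threads::shared modules directly rather than the Thread wrapper, and the variable names ($ready, @log) are made up for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $ready :shared = 0;    # condition flag, guarded by its own lock
my @log   :shared;        # messages written by the worker

my $worker = threads->create(sub {
    lock($ready);                       # held until this sub returns
    cond_wait($ready) until $ready;     # gives up the lock while blocked
    push @log, "worker saw ready=$ready";
    return threads->tid;
});

{
    lock($ready);
    $ready = 1;
    cond_signal($ready);                # wakes one waiter, if there is one
}                                       # lock released at end of block

my $tid = $worker->join;                # returns the worker's return value
print "$_\n" for @log;
print "worker tid: $tid\n";
```

The worker re-checks $ready in a loop, so it behaves the same whether the signal arrives before or after it starts waiting; a signal sent with no waiter is simply discarded.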

METHODS

  • join

    join waits for a thread to end and returns any values the thread exited with. join will block until the thread has ended, though it won't block if the thread has already terminated.

    If the thread being joined died, the error it died with will be returned at this time. If you don't want the thread performing the join to die as well, you should either wrap the join in an eval or use the eval thread method instead of join.

  • detach

    detach tells a thread that it is never going to be joined, i.e. that all traces of its existence can be removed once it stops running. Errors in detached threads will not be visible anywhere; if you want to catch them, you should use $SIG{__DIE__} or something like that.

  • equal

    equal tests whether two thread objects represent the same thread and returns true if they do.

  • tid

    The tid method returns the tid of a thread. The tid is a monotonically increasing integer assigned when a thread is created. The main thread of a program will have a tid of zero, while subsequent threads will have tids assigned starting with one.

  • done

    The done method returns true if the thread you're checking has finished, and false otherwise.

DEFUNCT

The following were implemented with 5005threads, but are no longer available with ithreads.

  • lock(\&sub)

    With 5005threads, you could also lock a sub such that any calls to that sub from another thread would block until the lock was released.

    Also, subroutines could be declared with the :locked attribute which would serialize access to the subroutine, but allowed different threads non-simultaneous access.

  • eval

    The eval method wrapped an eval around a join, and so waited for a thread to exit, passing along any values the thread might have returned and placing any errors into $@ .

  • flags

    The flags method returned the flags for the thread - an integer value corresponding to the internal flags for the thread.

SEE ALSO

threads, threads::shared, Thread::Queue, Thread::Semaphore

 
UNIVERSAL

NAME

UNIVERSAL - base class for ALL classes (blessed references)

SYNOPSIS

    $is_io = $fd->isa("IO::Handle");
    $is_io = Class->isa("IO::Handle");
    $does_log = $obj->DOES("Logger");
    $does_log = Class->DOES("Logger");
    $sub = $obj->can("print");
    $sub = Class->can("print");
    $sub = eval { $ref->can("fandango") };
    $ver = $obj->VERSION;
    # but never do this!
    $is_io = UNIVERSAL::isa($fd, "IO::Handle");
    $sub = UNIVERSAL::can($obj, "print");

DESCRIPTION

UNIVERSAL is the base class from which all blessed references inherit. See perlobj.

UNIVERSAL provides the following methods:

  • $obj->isa( TYPE )
  • CLASS->isa( TYPE )
  • eval { VAL->isa( TYPE ) }

    Where

    • TYPE

      is a package name

    • $obj

      is a blessed reference or a package name

    • CLASS

      is a package name

    • VAL

      is any of the above or an unblessed reference

    When used as an instance or class method ($obj->isa( TYPE )), isa returns true if $obj is blessed into package TYPE or inherits from package TYPE.

    When used as a class method (CLASS->isa( TYPE ), sometimes referred to as a static method), isa returns true if CLASS is the package TYPE or inherits from package TYPE.

    If you're not sure what you have (the VAL case), wrap the method call in an eval block to catch the exception if VAL is undefined.

    If you want to be sure that you're calling isa as an instance method rather than a class method, check the invocand with blessed from Scalar::Util first:

        use Scalar::Util 'blessed';
        if ( blessed( $obj ) && $obj->isa("Some::Class") ) {
            ...
        }

  • $obj->DOES( ROLE )
  • CLASS->DOES( ROLE )

    DOES checks if the object or class performs the role ROLE. A role is a named group of specific behavior (often methods of particular names and signatures), similar to a class, but not necessarily a complete class by itself. For example, logging or serialization may be roles.

    DOES and isa are similar, in that if either is true, you know that the object or class on which you call the method can perform specific behavior. However, DOES is different from isa in that it does not care how the invocand performs the operations, merely that it does. (isa of course mandates an inheritance relationship. Other relationships include aggregation, delegation, and mocking.)

    By default, classes in Perl only perform the UNIVERSAL role, as well as the roles of all classes in their inheritance hierarchy. In other words, by default DOES responds identically to isa.

    There is a relationship between roles and classes, as each class implies the existence of a role of the same name. There is also a relationship between inheritance and roles, in that a subclass that inherits from an ancestor class implicitly performs any roles its parent performs. Thus you can use DOES in place of isa safely, as it will return true in all places where isa will return true (provided that any overridden DOES and isa methods behave appropriately).

  • $obj->can( METHOD )
  • CLASS->can( METHOD )
  • eval { VAL->can( METHOD ) }

    can checks if the object or class has a method called METHOD. If it does, then it returns a reference to the sub. If it does not, then it returns undef. This includes methods inherited or imported by $obj, CLASS, or VAL.

    can cannot know whether an object will be able to provide a method through AUTOLOAD (unless the object's class has overridden can appropriately), so a return value of undef does not necessarily mean the object will not be able to handle the method call. To get around this some module authors use a forward declaration (see perlsub) for methods they will handle via AUTOLOAD. For such 'dummy' subs, can will still return a code reference, which, when called, will fall through to the AUTOLOAD. If no suitable AUTOLOAD is provided, calling the coderef will cause an error.

    You may call can as a class (static) method or an object method.

    Again, the same rule about having a valid invocand applies -- use an eval block or blessed if you need to be extra paranoid.

  • VERSION ( [ REQUIRE ] )

    VERSION will return the value of the variable $VERSION in the package the object is blessed into. If REQUIRE is given then it will do a comparison and die if the package version is not greater than or equal to REQUIRE, or if either $VERSION or REQUIRE is not a "lax" version number (as defined by the version module).

    The return from VERSION will actually be the stringified version object using the package $VERSION scalar, which is guaranteed to be equivalent but may not be precisely the contents of the $VERSION scalar. If you want the actual contents of $VERSION, use $CLASS::VERSION instead.

    VERSION can be called as either a class (static) method or an object method.
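The methods above can be tried out together in one small script. This is a sketch, not part of UNIVERSAL itself: the Animal and Dog classes and the "Fetcher" role are invented for illustration, and the package NAME BLOCK syntax assumes Perl 5.14 or later.

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

package Animal {
    our $VERSION = '1.02';
    sub new   { my ($class, %args) = @_; return bless {%args}, $class }
    sub speak { return 'generic noise' }
}

package Dog {
    our @ISA     = ('Animal');
    our $VERSION = '0.05';
    sub speak { return 'Woof' }

    # Claim the made-up "Fetcher" role without inheriting from anything
    # called Fetcher; defer every other question to UNIVERSAL::DOES.
    sub DOES {
        my ($invocant, $role) = @_;
        return 1 if $role eq 'Fetcher';
        return $invocant->SUPER::DOES($role);
    }
}

package main;

my $dog = Dog->new(name => 'Rex');

my %check = (
    isa_object  => $dog->isa('Animal')   ? 1 : 0,   # instance method
    isa_class   => Dog->isa('Animal')    ? 1 : 0,   # class method
    isa_role    => $dog->isa('Fetcher')  ? 1 : 0,   # no such ancestor
    does_role   => $dog->DOES('Fetcher') ? 1 : 0,   # overridden DOES
    does_parent => $dog->DOES('Animal')  ? 1 : 0,   # default: same as isa
);

my $speak   = $dog->can('speak');                  # code reference or undef
my $sound   = $speak ? $dog->$speak() : undef;
my $missing = $dog->can('fly');                    # no such method
my $version = Dog->VERSION;                        # stringified $Dog::VERSION
my $too_old = !eval { Dog->VERSION('1.00'); 1 };   # dies because 0.05 < 1.00

# An unblessed reference is not a valid invocant, so guard with blessed().
my $plain   = {};
my $guarded = blessed($plain) && $plain->isa('Animal') ? 1 : 0;

print "$_: $check{$_}\n" for sort keys %check;
print "sound: $sound, version: $version\n";
print "can fly: ", (defined $missing ? 'yes' : 'no'), "\n";
```

Overriding DOES rather than isa keeps the inheritance answer honest while still letting the class advertise behavior it provides by other means.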

WARNINGS

NOTE: can directly uses Perl's internal code for method lookup, and isa uses a very similar method and caching strategy. This may cause strange effects if the Perl code dynamically changes @ISA in any package.

You may add other methods to the UNIVERSAL class via Perl or XS code. You do not need to use UNIVERSAL to make these methods available to your program (and you should not do so).

EXPORTS

None by default.

You may request the import of three functions (isa, can, and VERSION), but this feature is deprecated and will be removed. Please don't do this in new code.

For example, previous versions of this documentation suggested using isa as a function to determine the type of a reference:

    use UNIVERSAL 'isa';
    $yes = isa $h, "HASH";
    $yes = isa "Foo", "Bar";

The problem is that this code will never call an overridden isa method in any class. Instead, use reftype from Scalar::Util for the first case:

    use Scalar::Util 'reftype';
    $yes = reftype( $h ) eq "HASH";

and the method form of isa for the second:

    $yes = Foo->isa("Bar");
 
XSLoader

NAME

XSLoader - Dynamically load C libraries into Perl code

VERSION

Version 0.16

SYNOPSIS

    package YourPackage;
    require XSLoader;
    XSLoader::load();

DESCRIPTION

This module defines a standard simplified interface to the dynamic linking mechanisms available on many platforms. Its primary purpose is to implement cheap automatic dynamic loading of Perl modules.

For a more complicated interface, see DynaLoader. Most features of DynaLoader are not implemented in XSLoader; for example, dl_load_flags is not honored by XSLoader.

Migration from DynaLoader

A typical module using DynaLoader starts like this:

    package YourPackage;
    require DynaLoader;
    our @ISA = qw( OnePackage OtherPackage DynaLoader );
    our $VERSION = '0.01';
    bootstrap YourPackage $VERSION;

Change this to

    package YourPackage;
    use XSLoader;
    our @ISA = qw( OnePackage OtherPackage );
    our $VERSION = '0.01';
    XSLoader::load 'YourPackage', $VERSION;

In other words: replace require DynaLoader with use XSLoader, remove DynaLoader from @ISA, and change bootstrap to XSLoader::load. Do not forget to quote the name of your package on the XSLoader::load line, and add a comma (,) before the arguments ($VERSION above).

Of course, if @ISA contained only DynaLoader, there is no need to have the @ISA assignment at all; moreover, if instead of our one uses the more backward-compatible

    use vars qw($VERSION @ISA);

one can remove this reference to @ISA together with the @ISA assignment.

If no $VERSION was specified on the bootstrap line, the last line becomes

    XSLoader::load 'YourPackage';

If the call to load is from YourPackage, then that can be further simplified to

    XSLoader::load();

as load will use caller to determine the package.

Backward compatible boilerplate

If you want to have your cake and eat it too, you need a more complicated boilerplate.

    package YourPackage;
    use vars qw($VERSION @ISA);
    @ISA = qw( OnePackage OtherPackage );
    $VERSION = '0.01';
    eval {
        require XSLoader;
        XSLoader::load('YourPackage', $VERSION);
        1;
    } or do {
        require DynaLoader;
        push @ISA, 'DynaLoader';
        bootstrap YourPackage $VERSION;
    };

The parentheses around the XSLoader::load() arguments are needed since we replaced use XSLoader by require, so the compiler does not know that a function XSLoader::load() is present.

This boilerplate uses the low-overhead XSLoader if present; if used with an antique Perl which has no XSLoader, it falls back to using DynaLoader.
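The eval in the boilerplate above works because a failed load raises an ordinary fatal error that can be trapped. The sketch below shows that in isolation. Hypothetical::Adder is a made-up package with no compiled half, so the load is expected to fail; the pure-Perl stand-in is an illustration of what one might do next, not something XSLoader provides.

```perl
use strict;
use warnings;

package Hypothetical::Adder;
our $VERSION = '0.01';

# Try the compiled implementation first. No shared object exists for this
# made-up package, so load() dies and the eval returns false.
my $xs_loaded = eval {
    require XSLoader;
    XSLoader::load(__PACKAGE__, $VERSION);
    1;
};
my $load_error = $xs_loaded ? '' : $@;

# Pure-Perl stand-in, installed only when the XS half is missing.
*add = sub { return $_[0] + $_[1] } unless $xs_loaded;

package main;

print "XS loaded: ", ($xs_loaded ? 'yes' : 'no'), "\n";
print "load error: $load_error" if $load_error;
print "2 + 3 = ", Hypothetical::Adder::add(2, 3), "\n";
```

The trailing 1 inside the eval matters: without it, a successful load whose last statement returned a false value would be mistaken for a failure.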

Order of initialization: early load()

Skip this section if the XSUB functions are supposed to be called from other modules only; read it only if you call your XSUBs from the code in your module, or have a BOOT: section in your XS file (see The BOOT: Keyword in perlxs). What is described here is equally applicable to the DynaLoader interface.

A sufficiently complicated module using XS would have both Perl code (defined in YourPackage.pm) and XS code (defined in YourPackage.xs). If this Perl code makes calls into this XS code, and/or this XS code makes calls to the Perl code, one should be careful with the order of initialization.

The call to XSLoader::load() (or bootstrap()) calls the module's bootstrap code. For modules built by xsubpp (nearly all modules) this has three side effects:

  • A sanity check is done to ensure that the versions of the .pm and the (compiled) .xs parts are compatible. If $VERSION was specified, this is used for the check. If not specified, it defaults to $XS_VERSION // $VERSION (in the module's namespace).

  • the XSUBs are made accessible from Perl

  • if a BOOT: section was present in the .xs file, the code there is called.

Consequently, if the code in the .pm file makes calls to these XSUBs, it is convenient to have XSUBs installed before the Perl code is defined; for example, this makes prototypes for XSUBs visible to this Perl code. Alternatively, if the BOOT: section makes calls to Perl functions (or uses Perl variables) defined in the .pm file, they must be defined prior to the call to XSLoader::load() (or bootstrap()).

The first situation being much more frequent, it makes sense to rewrite the boilerplate as

    package YourPackage;
    use XSLoader;
    use vars qw($VERSION @ISA);
    BEGIN {
        @ISA = qw( OnePackage OtherPackage );
        $VERSION = '0.01';
        # Put Perl code used in the BOOT: section here
        XSLoader::load 'YourPackage', $VERSION;
    }
    # Put Perl code making calls into XSUBs here
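Why the BEGIN block helps with prototypes can be seen without any XS at all. In this sketch a glob assignment inside BEGIN stands in for XSLoader::load installing an XSUB; count_items and @queue are invented names, and the real loader is not involved.

```perl
use strict;
use warnings;

BEGIN {
    # Stand-in for XSLoader::load(): installs a prototyped sub while the
    # rest of the file is still waiting to be compiled.
    *count_items = sub (\@) { return scalar @{ $_[0] } };
}

my @queue = qw(a b c);

# Compiled after the BEGIN block ran, so the (\@) prototype is known here
# and @queue is passed as a single array reference.
my $count = count_items(@queue);
print "items: $count\n";    # prints "items: 3"
```

Had the sub been installed by an ordinary run-time statement instead, the call would have been compiled without the prototype and @queue would have been flattened into a plain argument list.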

The most hairy case

If the interdependence of your BOOT: section and Perl code is more complicated than this (e.g., the BOOT: section makes calls to Perl functions which make calls to XSUBs with prototypes), get rid of the BOOT: section altogether. Replace it with a function onBOOT(), and call it like this:

    package YourPackage;
    use XSLoader;
    use vars qw($VERSION @ISA);
    BEGIN {
        @ISA = qw( OnePackage OtherPackage );
        $VERSION = '0.01';
        XSLoader::load 'YourPackage', $VERSION;
    }
    # Put Perl code used in onBOOT() function here; calls to XSUBs are
    # prototype-checked.
    onBOOT;
    # Put Perl initialization code assuming that XS is initialized here

DIAGNOSTICS

  • Can't find '%s' symbol in %s

    (F) The bootstrap symbol could not be found in the extension module.

  • Can't load '%s' for module %s: %s

    (F) The loading or initialisation of the extension module failed. The detailed error follows.

  • Undefined symbols present after loading %s: %s

    (W) As the message says, some symbols stay undefined although the extension module was correctly loaded and initialised. The list of undefined symbols follows.

LIMITATIONS

To reduce the overhead as much as possible, only one possible location is checked to find the extension DLL (this location is where make install would put the DLL). If not found, the search for the DLL is transparently delegated to DynaLoader, which looks for the DLL along the @INC list.

In particular, this is applicable to the structure of @INC used for testing not-yet-installed extensions. This means that running uninstalled extensions may have much more overhead than running the same extensions after make install.

KNOWN BUGS

The new simpler way to call XSLoader::load() with no arguments at all does not work on Perl 5.8.4 and 5.8.5.

BUGS

Please report any bugs or feature requests via the perlbug(1) utility.

SEE ALSO

DynaLoader

AUTHORS

Ilya Zakharevich originally extracted XSLoader from DynaLoader.

CPAN version is currently maintained by Sébastien Aperghis-Tramoni <sebastien@aperghis.net>.

Previous maintainer was Michael G Schwern <schwern@pobox.com>.

COPYRIGHT & LICENSE

Copyright (C) 1990-2011 by Larry Wall and others.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

 
a2p

NAME

a2p - Awk to Perl translator

SYNOPSIS

a2p [options] [filename]

DESCRIPTION

A2p takes an awk script specified on the command line (or from standard input) and produces a comparable perl script on the standard output.

OPTIONS

Options include:

  • -D<number>

    sets debugging flags.

  • -F<character>

    tells a2p that this awk script is always invoked with this -F switch.

  • -n<fieldlist>

    specifies the names of the input fields if input does not have to be split into an array. If you were translating an awk script that processes the password file, you might say:

        a2p -7 -nlogin.password.uid.gid.gcos.shell.home

    Any delimiter can be used to separate the field names.

  • -<number>

    causes a2p to assume that input will always have that many fields.

  • -o

    tells a2p to use old awk behavior. The only current differences are:

    • Old awk always has a line loop, even if there are no line actions, whereas new awk does not.

    • In old awk, sprintf is extremely greedy about its arguments. For example, given the statement

          print sprintf(some_args), extra_args;

      old awk considers extra_args to be arguments to sprintf; new awk considers them arguments to print.

Considerations

A2p cannot do as good a job translating as a human would, but it usually does pretty well. There are some areas where you may want to examine the perl script produced and tweak it some. Here are some of them, in no particular order.

There is an awk idiom of putting int() around a string expression to force numeric interpretation, even though the argument is always integer anyway. This is generally unneeded in perl, but a2p can't tell if the argument is always going to be integer, so it leaves it in. You may wish to remove it.

Perl differentiates numeric comparison from string comparison. Awk has one operator for both that decides at run time which comparison to do. A2p does not try to do a complete job of awk emulation at this point. Instead it guesses which one you want. It's almost always right, but it can be spoofed. All such guesses are marked with the comment "#???". You should go through and check them. You might want to run at least once with the -w switch to perl, which will warn you if you use == where you should have used eq.
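To see why a wrong guess matters, here is a small sketch of the two comparison operators on the same data. The values are made up; the warning handler is only there to show what -w (or use warnings) reports when == meets text.

```perl
use strict;
use warnings;

my ($left, $right) = ('10', '10.0');

my $same_number = ($left == $right) ? 1 : 0;   # 10 and 10.0 are equal numbers
my $same_string = ($left eq $right) ? 1 : 0;   # "10" and "10.0" differ as text

# Using == on non-numeric strings compares 0 with 0 and, under warnings,
# complains about the operands.
my @complaints;
my $fruit_equal;
{
    local $SIG{__WARN__} = sub { push @complaints, $_[0] };
    my ($x, $y) = ('apple', 'orange');
    $fruit_equal = ($x == $y) ? 1 : 0;
}

print "same number: $same_number, same string: $same_string\n";
print "apple == orange: $fruit_equal (", scalar(@complaints), " warnings)\n";
```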

Perl does not attempt to emulate the behavior of awk in which nonexistent array elements spring into existence simply by being referenced. If somehow you are relying on this mechanism to create null entries for a subsequent for...in, they won't be there in perl.
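A short sketch of the difference, with a made-up hash: reading a missing element leaves the hash empty, so a translated script that relied on awk's behavior has to create the entry itself. The //= operator used here assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

my %seen;
my $value      = $seen{ghost};          # reading does not create the key
my $after_read = scalar keys %seen;     # hash is still empty

# To mimic awk, create the null entry explicitly.
$seen{ghost} //= '';
my $after_assign = scalar keys %seen;

print "keys after read: $after_read, after assignment: $after_assign\n";
```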

If a2p makes a split line that assigns to a list of variables that looks like (Fld1, Fld2, Fld3...) you may want to rerun a2p using the -n option mentioned above. This will let you name the fields throughout the script. If it splits to an array instead, the script is probably referring to the number of fields somewhere.

The exit statement in awk doesn't necessarily exit; it goes to the END block if there is one. Awk scripts that do contortions within the END block to bypass the block under such circumstances can be simplified by removing the conditional in the END block and just exiting directly from the perl script.

Perl has two kinds of array, numerically-indexed and associative. Perl associative arrays are called "hashes". Awk arrays are usually translated to hashes, but if you happen to know that the index is always going to be numeric you could change the {...} to [...]. Iteration over a hash is done using the keys() function, but iteration over an array is NOT. You might need to modify any loop that iterates over such an array.
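If you do convert a hash to an array, the loop has to change with it. The following sketch walks the same made-up data both ways; note the shift from awk's 1-based keys to Perl's 0-based indices.

```perl
use strict;
use warnings;

# As a2p would usually translate it: a hash keyed by 1, 2, 3.
my %by_key = (3 => 'c', 1 => 'a', 2 => 'b');
my @from_hash;
for my $key (sort { $a <=> $b } keys %by_key) {   # keys() has no fixed order
    push @from_hash, "$key=$by_key{$key}";
}

# After changing {...} to [...]: a real array, walked by index.
my @by_index = ('a', 'b', 'c');
my @from_array;
for my $i (0 .. $#by_index) {
    my $awk_index = $i + 1;                       # awk counted from 1
    push @from_array, "$awk_index=$by_index[$i]";
}

print "hash:  @from_hash\n";
print "array: @from_array\n";
```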

Awk starts by assuming OFMT has the value %.6g. Perl's historical equivalent, $#, defaulted to %.20g, but its magic was removed in Perl 5.10. If the awk script relies on the default value of OFMT, format the numbers explicitly with printf or sprintf instead.

Near the top of the line loop will be the split operation that is implicit in the awk script. There are times when you can move this down past some conditionals that test the entire record so that the split is not done as often.

For aesthetic reasons you may wish to change index variables from being 1-based (awk style) to 0-based (Perl style). Be sure to change all operations the variable is involved in to match.

Cute comments that say "# Here is a workaround because awk is dumb" are passed through unmodified.

Awk scripts are often embedded in a shell script that pipes stuff into and out of awk. Often the shell script wrapper can be incorporated into the perl script, since perl can start up pipes into and out of itself, and can do other things that awk can't do by itself.

Scripts that refer to the special variables RSTART and RLENGTH can often be simplified by referring to the variables $`, $& and $', as long as they are within the scope of the pattern match that sets them.
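As a rough guide to the correspondence, this sketch derives awk-style RSTART and RLENGTH values from Perl's match variables. The sample text and the variable names are invented; the defaults of 0 and -1 follow awk's convention for a failed match.

```perl
use strict;
use warnings;

my $text = 'foobarbaz';
my ($rstart, $rlength, $before, $match, $after) = (0, -1, '', '', '');

if ($text =~ /bar/) {
    ($before, $match, $after) = ($`, $&, $');
    $rstart  = length($before) + 1;    # awk positions are 1-based
    $rlength = length($match);
}

print "RSTART=$rstart RLENGTH=$rlength\n";    # prints "RSTART=4 RLENGTH=3"
print "before=$before match=$match after=$after\n";
```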

The produced perl script may have subroutines defined to deal with awk's semantics regarding getline and print. Since a2p usually picks correctness over efficiency, it is almost always possible to rewrite such code to be more efficient by discarding the semantic sugar.

For efficiency, you may wish to remove the keyword from any return statement that is the last statement executed in a subroutine. A2p catches the most common case, but doesn't analyze embedded blocks for subtler cases.

ARGV[0] translates to $ARGV0, but ARGV[n] translates to $ARGV[$n-1]. A loop that tries to iterate over ARGV[0] won't find it.

ENVIRONMENT

A2p uses no environment variables.

AUTHOR

Larry Wall <larry@wall.org>

FILES

SEE ALSO

    perl    The perl compiler/interpreter
    s2p     sed to perl translator

DIAGNOSTICS

BUGS

It would be possible to emulate awk's behavior in selecting string versus numeric operations at run time by inspection of the operands, but it would be gross and inefficient. Besides, a2p almost always guesses right.

Storage for the awk syntax tree is currently static, and can run out.

 
attributes

NAME

attributes - get/set subroutine or variable attributes

SYNOPSIS

    sub foo : method ;
    my ($x,@y,%z) : Bent = 1;
    my $s = sub : method { ... };

    use attributes ();      # optional, to get subroutine declarations
    my @attrlist = attributes::get(\&foo);

    use attributes 'get';   # import the attributes::get subroutine
    my @attrlist = get \&foo;

DESCRIPTION

Subroutine declarations and definitions may optionally have attribute lists associated with them. (Variable my declarations also may, but see the warning below.) Perl handles these declarations by passing some information about the call site and the thing being declared along with the attribute list to this module. In particular, the first example above is equivalent to the following:

    use attributes __PACKAGE__, \&foo, 'method';

The second example in the synopsis does something equivalent to this:

    use attributes ();
    my ($x,@y,%z);
    attributes::->import(__PACKAGE__, \$x, 'Bent');
    attributes::->import(__PACKAGE__, \@y, 'Bent');
    attributes::->import(__PACKAGE__, \%z, 'Bent');
    ($x,@y,%z) = 1;

Yes, that's a lot of expansion.

WARNING: attribute declarations for variables are still evolving. The semantics and interfaces of such declarations could change in future versions. They are present for purposes of experimentation with what the semantics ought to be. Do not rely on the current implementation of this feature.

There are only a few attributes currently handled by Perl itself (or directly by this module, depending on how you look at it). However, package-specific attributes are allowed by an extension mechanism. (See Package-specific Attribute Handling below.)

The setting of subroutine attributes happens at compile time. Variable attributes in our declarations are also applied at compile time. However, my variables get their attributes applied at run-time. This means that you have to reach the run-time component of the my before those attributes will get applied. For example:

    my $x : Bent = 42 if 0;

will neither assign 42 to $x nor will it apply the Bent attribute to the variable.
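The run-time behavior of my attributes can be observed by counting handler calls. This is a sketch under assumptions: Bent is the made-up attribute from the synopsis, and MODIFY_SCALAR_ATTRIBUTES is a package-specific handler of the kind described under Package-specific Attribute Handling; $applied and make_bent are invented names.

```perl
use strict;
use warnings;

our $applied;
BEGIN { $applied = 0 }    # set at compile time so later counts are honest

# Handler for the made-up "Bent" attribute; anything else is reported
# back as unrecognized.
sub MODIFY_SCALAR_ATTRIBUTES {
    my ($package, $ref, @attrs) = @_;
    $applied++;
    return grep { $_ ne 'Bent' } @attrs;
}

sub make_bent {
    my $x : Bent = 42;    # attribute applied when this statement runs
    return $x;
}

my $after_compile = $applied;    # nothing applied yet
make_bent() for 1 .. 2;
my $after_calls = $applied;      # once per execution of the my

print "after compile: $after_compile, after two calls: $after_calls\n";
```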

An attempt to set an unrecognized attribute is a fatal error. (The error is trappable, but it still stops the compilation within that eval.) Setting an attribute with a name that's all lowercase letters that's not a built-in attribute (such as "foo") will result in a warning with -w or use warnings 'reserved'.

What import does

In the description it is mentioned that

    sub foo : method;

is equivalent to

    use attributes __PACKAGE__, \&foo, 'method';

As you might know this calls the import function of attributes at compile time with these parameters: 'attributes', the caller's package name, the reference to the code and 'method'.

    attributes->import( __PACKAGE__, \&foo, 'method' );

So you want to know what import actually does?

First of all import gets the type of the third parameter ('CODE' in this case). attributes.pm checks if there is a subroutine called MODIFY_<reftype>_ATTRIBUTES in the caller's namespace (here: 'main'). In this case a subroutine MODIFY_CODE_ATTRIBUTES is required. Then this method is called to check if you have used a "bad attribute". The subroutine call in this example would look like

    MODIFY_CODE_ATTRIBUTES( 'main', \&foo, 'method' );

MODIFY_<reftype>_ATTRIBUTES has to return a list of all "bad attributes". If there are any bad attributes import croaks.

(See Package-specific Attribute Handling below.)
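The round trip described above can be sketched with a pair of handlers in package main. Logged and Nonsense are made-up attribute names, and %Logged is just one possible way for a handler to remember what it accepted.

```perl
use strict;
use warnings;
use attributes ();

our %Logged;    # stringified code reference => true, filled at compile time

sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $code, @attrs) = @_;
    my @bad;
    for my $attr (@attrs) {
        if ($attr eq 'Logged') { $Logged{$code} = 1 }
        else                   { push @bad, $attr }
    }
    return @bad;    # a non-empty list makes import croak
}

sub FETCH_CODE_ATTRIBUTES {
    my ($package, $code) = @_;
    return $Logged{$code} ? ('Logged') : ();
}

sub greet : Logged { return 'hello' }

my @attrs = attributes::get(\&greet);

# An attribute the handler does not know is a compile-time error, which a
# string eval can trap.
my $ok    = eval q{ sub oops : Nonsense { 1 } 1 };
my $error = $ok ? '' : $@;

print "attributes of greet: @attrs\n";
print "bad attribute trapped: ", ($error ? 'yes' : 'no'), "\n";
```

Both handlers must already be compiled when the attributed sub is declared, which is why they come first in the file.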

Built-in Attributes

The following are the built-in attributes for subroutines:

  • lvalue

    Indicates that the referenced subroutine is a valid lvalue and can be assigned to. The subroutine must return a modifiable value such as a scalar variable, as described in perlsub.

    This module allows one to set this attribute on a subroutine that is already defined. For Perl subroutines (XSUBs are fine), it may or may not do what you want, depending on the code inside the subroutine, with details subject to change in future Perl versions. You may run into problems with lvalue context not being propagated properly into the subroutine, or maybe even assertion failures. For this reason, a warning is emitted if warnings are enabled. In other words, you should only do this if you really know what you are doing. You have been warned.

  • method

    Indicates that the referenced subroutine is a method. A subroutine so marked will not trigger the "Ambiguous call resolved as CORE::%s" warning.

  • locked

    The "locked" attribute is deprecated, and has no effect in 5.10.0 and later. It was used as part of the now-removed "Perl 5.005 threads".

The following are the built-in attributes for variables:

  • shared

    Indicates that the referenced variable can be shared across different threads when used in conjunction with the threads and threads::shared modules.

  • unique

    The "unique" attribute is deprecated, and has no effect in 5.10.0 and later. It used to indicate that a single copy of an our variable was to be used by all interpreters should the program happen to be running in a multi-interpreter environment.

Available Subroutines

The following subroutines are available for general use once this module has been loaded:

  • get

    This routine expects a single parameter--a reference to a subroutine or variable. It returns a list of attributes, which may be empty. If passed invalid arguments, it uses die() (via Carp::croak) to raise a fatal exception. If it can find an appropriate package name for a class method lookup, it will include the results from a FETCH_type_ATTRIBUTES call in its return list, as described in Package-specific Attribute Handling below. Otherwise, only built-in attributes will be returned.

  • reftype

    This routine expects a single parameter--a reference to a subroutine or variable. It returns the built-in type of the referenced variable, ignoring any package into which it might have been blessed. This can be useful for determining the type value which forms part of the method names described in Package-specific Attribute Handling below.

Note that these routines are not exported by default.

Package-specific Attribute Handling

WARNING: the mechanisms described here are still experimental. Do not rely on the current implementation. In particular, there is no provision for applying package attributes to 'cloned' copies of subroutines used as closures. (See Making References in perlref for information on closures.) Package-specific attribute handling may change incompatibly in a future release.

When an attribute list is present in a declaration, a check is made to see whether an attribute 'modify' handler is present in the appropriate package (or its @ISA inheritance tree). Similarly, when attributes::get is called on a valid reference, a check is made for an appropriate attribute 'fetch' handler. See EXAMPLES to see how the "appropriate package" determination works.

The handler names are based on the underlying type of the variable being declared or of the reference passed. Because these attributes are associated with subroutine or variable declarations, this deliberately ignores any possibility of being blessed into some package. Thus, a subroutine declaration uses "CODE" as its type, and even a blessed hash reference uses "HASH" as its type.

The class methods invoked for modifying and fetching are these:

  • FETCH_type_ATTRIBUTES

    This method is called with two arguments: the relevant package name, and a reference to a variable or subroutine for which package-defined attributes are desired. The expected return value is a list of associated attributes. This list may be empty.

  • MODIFY_type_ATTRIBUTES

    This method is called with two fixed arguments, followed by the list of attributes from the relevant declaration. The two fixed arguments are the relevant package name and a reference to the declared subroutine or variable. The expected return value is a list of attributes which were not recognized by this handler. Note that this allows for a derived class to delegate a call to its base class, and then only examine the attributes which the base class didn't already handle for it.

    The call to this method is currently made during the processing of the declaration. In particular, this means that a subroutine reference will probably be for an undefined subroutine, even if this declaration is actually part of the definition.

Calling attributes::get() from within the scope of a null package declaration package ; for an unblessed variable reference will not provide any starting package name for the 'fetch' method lookup. Thus, this circumstance will not result in a method call for package-defined attributes. A named subroutine knows to which symbol table entry it belongs (or originally belonged), and it will use the corresponding package. An anonymous subroutine knows the package name into which it was compiled (unless it was also compiled with a null package declaration), and so it will use that package name.
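The two handlers described above can be sketched as follows. This is only an illustration under stated assumptions, not part of the attributes module: the package name Kennel, the attribute name Watchful, and the %attrs_for registry are all invented for the example.

```perl
package Kennel;    # hypothetical package name
use strict;
use warnings;
use attributes ();
use Scalar::Util qw(refaddr);

# Hypothetical registry mapping a variable's address to its attribute names.
my %attrs_for;

# Called when a scalar is declared with attributes in this package.
# Returns the attributes this handler does not recognise.
sub MODIFY_SCALAR_ATTRIBUTES {
    my ($package, $ref, @attrs) = @_;
    my @known   = grep { $_ eq 'Watchful' } @attrs;
    my @unknown = grep { $_ ne 'Watchful' } @attrs;
    push @{ $attrs_for{ refaddr($ref) } }, @known;
    return @unknown;
}

# Called by attributes::get to report package-defined attributes.
sub FETCH_SCALAR_ATTRIBUTES {
    my ($package, $ref) = @_;
    return @{ $attrs_for{ refaddr($ref) } || [] };
}

my $spot : Watchful;
my @found = attributes::get(\$spot);
print "@found\n";
```

Because attributes::get is called from package Kennel on an unblessed reference, the lookup for the 'fetch' handler starts in Kennel, which is why FETCH_SCALAR_ATTRIBUTES is found here.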

Syntax of Attribute Lists

An attribute list is a sequence of attribute specifications, separated by whitespace or a colon (with optional whitespace). Each attribute specification is a simple name, optionally followed by a parenthesised parameter list. If such a parameter list is present, it is scanned past as for the rules for the q() operator. (See Quote and Quote-like Operators in perlop.) The parameter list is passed as it was found, however, and not as per q().

Some examples of syntactically valid attribute lists:

    switch(10,foo(7,3)) : expensive
    Ugly('\(") :Bad
    _5x5
    lvalue method

Some examples of syntactically invalid attribute lists (with annotation):

    switch(10,foo()    # ()-string not balanced
    Ugly('(')          # ()-string not balanced
    5x5                # "5x5" not a valid identifier
    Y2::north          # "Y2::north" not a simple identifier
    foo + bar          # "+" neither a colon nor whitespace

EXPORTS

Default exports

None.

Available exports

The routines get and reftype are exportable.

Export tags defined

The :ALL tag will get all of the above exports.
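As a small illustration of the two exportable routines, the sketch below imports both and applies them to a blessed hash and to an lvalue subroutine. The class name Some::Class and the subroutine counter are made up for the example.

```perl
use strict;
use warnings;
use attributes qw(get reftype);    # nothing is exported by default

# reftype reports the underlying type and ignores blessing.
my $object = bless {}, 'Some::Class';    # hypothetical class name
my $type   = reftype($object);

# get returns the attributes attached to the referenced subroutine.
my $counter = 0;
sub counter : lvalue { $counter }
my @attrs = get(\&counter);

print "$type\n";     # HASH
print "@attrs\n";    # lvalue
```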

EXAMPLES

Here are some samples of syntactically valid declarations, with annotation as to how they resolve internally into use attributes invocations by perl. These examples are primarily useful to see how the "appropriate package" is found for the possible method lookups for package-defined attributes.

1.

Code:

    package Canine;
    package Dog;
    my Canine $spot : Watchful ;

Effect:

    use attributes ();
    attributes::->import(Canine => \$spot, "Watchful");

2.

Code:

    package Felis;
    my $cat : Nervous;

Effect:

    use attributes ();
    attributes::->import(Felis => \$cat, "Nervous");

3.

Code:

    package X;
    sub foo : lvalue ;

Effect:

    use attributes X => \&foo, "lvalue";

4.

Code:

    package X;
    sub Y::x : lvalue { 1 }

Effect:

    use attributes Y => \&Y::x, "lvalue";

5.

Code:

    package X;
    sub foo { 1 }

    package Y;
    BEGIN { *bar = \&X::foo; }

    package Z;
    sub Y::bar : lvalue ;

Effect:

    use attributes X => \&X::foo, "lvalue";

This last example is purely for purposes of completeness. You should not be trying to mess with the attributes of something in a package that's not your own.

MORE EXAMPLES

1.

    sub MODIFY_CODE_ATTRIBUTES {
        my ($class, $code, @attrs) = @_;

        my $allowed = 'MyAttribute';
        my @bad = grep { $_ ne $allowed } @attrs;

        return @bad;
    }

    sub foo : MyAttribute {
        print "foo\n";
    }

This example runs. At compile time MODIFY_CODE_ATTRIBUTES is called. In that subroutine, we check if any attribute is disallowed and we return a list of these "bad attributes".

As we return an empty list, everything is fine.

2.

    sub MODIFY_CODE_ATTRIBUTES {
        my ($class, $code, @attrs) = @_;

        my $allowed = 'MyAttribute';
        my @bad = grep { $_ ne $allowed } @attrs;

        return @bad;
    }

    sub foo : MyAttribute Test {
        print "foo\n";
    }

This example is aborted at compile time as we use the attribute "Test" which isn't allowed. MODIFY_CODE_ATTRIBUTES returns a list that contains a single element ('Test').
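To observe the rejection without aborting the whole program, the declaration can be compiled at run time inside a string eval. The following is a minimal sketch of that idea; the package name Demo is hypothetical, and the exact wording of the error message is left to perl.

```perl
package Demo;    # hypothetical package name
use strict;
use warnings;

# Accept only 'MyAttribute'; anything returned here is treated as invalid.
sub MODIFY_CODE_ATTRIBUTES {
    my ($class, $code, @attrs) = @_;
    return grep { $_ ne 'MyAttribute' } @attrs;
}

# Compile the declaration at run time so that the failure can be trapped.
my $compiled = eval q{
    sub foo : MyAttribute Test { print "foo\n" }
    1;
};
my $error = $compiled ? '' : $@;
print "declaration rejected: $error" if $error;
```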

SEE ALSO

Private Variables via my() in perlsub and Subroutine Attributes in perlsub for details on the basic declarations; use for details on the normal invocation mechanism.

 
autodie

NAME

autodie - Replace functions with ones that succeed or die with lexical scope

SYNOPSIS

    use autodie;                      # Recommended: implies 'use autodie qw(:default)'

    use autodie qw(:all);             # Recommended more: defaults and system/exec.

    use autodie qw(open close);       # open/close succeed or die

    open(my $fh, "<", $filename);     # No need to check!

    {
        no autodie qw(open);          # open failures won't die
        open(my $fh, "<", $filename); # Could fail silently!
        no autodie;                   # disable all autodies
    }

DESCRIPTION

    bIlujDI' yIchegh()Qo'; yIHegh()!

    It is better to die() than to return() in failure.

            -- Klingon programming proverb.

The autodie pragma provides a convenient way to replace functions that normally return false on failure with equivalents that throw an exception on failure.

The autodie pragma has lexical scope, meaning that functions and subroutines altered with autodie will only change their behaviour until the end of the enclosing block, file, or eval.

If system is specified as an argument to autodie , then it uses IPC::System::Simple to do the heavy lifting. See the description of that module for more information.
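The lexical scoping described above can be seen in a short sketch. The path used here is hypothetical and is assumed not to exist on the machine running the example.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist.
my $missing = '/nonexistent/autodie-demo/input.txt';

my ($inside, $outside);
{
    use autodie qw(open);    # in effect until the end of this block
    $inside = eval { open(my $fh, '<', $missing); 1 } ? 'returned' : 'died';
}

# autodie is no longer in effect here, so open simply returns false.
$outside = open(my $fh, '<', $missing) ? 'opened' : 'returned false';

print "inside block:  open $inside\n";
print "outside block: open $outside\n";
```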

EXCEPTIONS

Exceptions produced by the autodie pragma are members of the autodie::exception class. The preferred way to work with these exceptions under Perl 5.10 is as follows:

    use feature qw(switch say);

    eval {
        use autodie;

        open(my $fh, '<', $some_file);

        my @records = <$fh>;

        # Do things with @records...

        close($fh);
    };

    given ($@) {
        when (undef)   { say "No error";                    }
        when ('open')  { say "Error from open";             }
        when (':io')   { say "Non-open, IO error.";         }
        when (':all')  { say "All other autodie errors."    }
        default        { say "Not an autodie error at all." }
    }

Under Perl 5.8, the given/when structure is not available, so the following structure may be used:

    eval {
        use autodie;

        open(my $fh, '<', $some_file);

        my @records = <$fh>;

        # Do things with @records...

        close($fh);
    };

    if ($@ and $@->isa('autodie::exception')) {
        if    ($@->matches('open')) { print "Error from open\n";     }
        elsif ($@->matches(':io' )) { print "Non-open, IO error.\n"; }
    } elsif ($@) {
        # A non-autodie exception.
    }

See autodie::exception for further information on interrogating exceptions.
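As a rough sketch of interrogating an exception, the example below forces open to fail and then inspects the resulting object. It uses only the function, matches and errno methods of autodie::exception; the path is hypothetical and assumed not to exist, and the %seen hash is just a scratch structure for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

my $missing = '/nonexistent/autodie-demo/input.txt';    # hypothetical path

my %seen;
eval {
    use autodie;
    open(my $fh, '<', $missing);
    close($fh);
};
my $err = $@;

if (blessed($err) and $err->isa('autodie::exception')) {
    $seen{function} = $err->function;                      # fully qualified built-in name
    $seen{is_open}  = $err->matches('open')    ? 1 : 0;    # matches the function itself
    $seen{is_io}    = $err->matches(':io')     ? 1 : 0;    # matches an enclosing category
    $seen{is_sock}  = $err->matches(':socket') ? 1 : 0;    # unrelated category

    print "failed call:  $seen{function}\n";
    print "system error: ", $err->errno, "\n";
}
```

Copying $@ into a lexical straight after the eval avoids losing the exception if later code happens to reset $@.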

CATEGORIES

Autodie uses a simple set of categories to group together similar built-ins. Requesting a category type (starting with a colon) will enable autodie for all built-ins beneath that category. For example, requesting :file will enable autodie for the built-ins listed under it below, including close, fcntl, fileno, open and sysopen.

The categories are currently:

    :all
        :default
            :io
                read
                seek
                sysread
                sysseek
                syswrite
                :dbm
                    dbmclose
                    dbmopen
                :file
                    binmode
                    close
                    fcntl
                    fileno
                    flock
                    ioctl
                    open
                    sysopen
                    truncate
                :filesys
                    chdir
                    closedir
                    opendir
                    link
                    mkdir
                    readlink
                    rename
                    rmdir
                    symlink
                    unlink
                :ipc
                    pipe
                    :msg
                        msgctl
                        msgget
                        msgrcv
                        msgsnd
                    :semaphore
                        semctl
                        semget
                        semop
                    :shm
                        shmctl
                        shmget
                        shmread
                :socket
                    accept
                    bind
                    connect
                    getsockopt
                    listen
                    recv
                    send
                    setsockopt
                    shutdown
                    socketpair
            :threads
                fork
        :system
            system
            exec
Note that while the above category system is presently a strict hierarchy, this should not be assumed.

A plain use autodie implies use autodie qw(:default) . Note that system and exec are not enabled by default. system requires the optional IPC::System::Simple module to be installed, and enabling system or exec will invalidate their exotic forms. See BUGS below for more details.
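Requesting a single category affects only the built-ins beneath it. The sketch below enables :filesys and shows that opendir dies while open, which belongs to :file, still returns false. The directory name is hypothetical and assumed not to exist.

```perl
use strict;
use warnings;
use autodie qw(:filesys);    # opendir, mkdir, unlink, ... but not open

my $missing_dir = '/nonexistent/autodie-demo';    # hypothetical path

# opendir belongs to :filesys, so a failure raises an exception.
my $opendir = eval { opendir(my $dh, $missing_dir); 1 } ? 'returned' : 'died';

# open belongs to :file, which was not requested, so it only returns false.
my $open = open(my $fh, '<', "$missing_dir/input.txt") ? 'opened' : 'returned false';

print "opendir $opendir\n";
print "open $open\n";
```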

The syntax:

    use autodie qw(:1.994);

allows the :default list from a particular version to be used. This provides the convenience of using the default methods, but the surety that no behavioral changes will occur if the autodie module is upgraded.

autodie can be enabled for all of Perl's built-ins, including system and exec with:

    use autodie qw(:all);

FUNCTION SPECIFIC NOTES

flock

It is not considered an error for flock to return false if it fails due to an EWOULDBLOCK (or equivalent) condition. This means one can still use the common convention of testing the return value of flock when called with the LOCK_NB option:

    use autodie;
    use Fcntl qw(:flock);    # imports LOCK_EX and LOCK_NB

    if ( flock($fh, LOCK_EX | LOCK_NB) ) {
        # We have a lock
    }

Autodying flock will generate an exception if flock returns false with any other error.

system/exec

The system built-in is considered to have failed in the following circumstances:

  • The command does not start.

  • The command is killed by a signal.

  • The command returns a non-zero exit value (but see below).

On success, the autodying form of system returns the exit value rather than the contents of $? .

Additional allowable exit values can be supplied as an optional first argument to autodying system:

    system( [ 0, 1, 2 ], $cmd, @args);  # 0,1,2 are good exit values

autodie uses the IPC::System::Simple module to change system. See its documentation for further information.

Applying autodie to system or exec causes the exotic forms system { $cmd } @args or exec { $cmd } @args to be considered a syntax error until the end of the lexical scope. If you really need to use the exotic form, you can call CORE::system or CORE::exec instead, or use no autodie qw(system exec) before calling the exotic form.

GOTCHAS

Functions called in list context are assumed to have failed if they return an empty list, or a list consisting only of a single undef element.

DIAGNOSTICS

  • :void cannot be used with lexical scope

    The :void option is supported in Fatal, but not autodie. To work around this, autodie may be explicitly disabled until the end of the current block with no autodie. To disable autodie for only a single function (e.g., open) use no autodie qw(open).

    autodie performs no checking of called context to determine whether to throw an exception; the explicitness of error handling with autodie is a deliberate feature.

  • No user hints defined for %s

    You've insisted on hints for user-subroutines, either by pre-pending a ! to the subroutine name itself, or earlier in the list of arguments to autodie . However the subroutine in question does not have any hints available.

See also DIAGNOSTICS in Fatal.

BUGS

"Used only once" warnings can be generated when autodie or Fatal is used with package filehandles (eg, FILE ). Scalar filehandles are strongly recommended instead.

When using autodie or Fatal with user subroutines, the declaration of those subroutines must appear before the first use of Fatal or autodie , or have been exported from a module. Attempting to use Fatal or autodie on other user subroutines will result in a compile-time error.

Due to a bug in Perl, autodie may "lose" any format which has the same name as an autodying built-in or function.

autodie may not work correctly if used inside a file with a name that looks like a string eval, such as eval (3).

autodie and string eval

Due to the current implementation of autodie, unexpected results may be seen when used near or with the string version of eval.

Under Perl 5.8 only, autodie does not propagate into string eval statements, although it can be explicitly enabled inside a string eval.

Under Perl 5.10 only, using a string eval when autodie is in effect can cause the autodie behaviour to leak into the surrounding scope. This can be worked around by using a no autodie at the end of the scope to explicitly remove autodie's effects, or by avoiding the use of string eval.

None of these bugs exist when using block eval. The use of autodie with block eval is considered good practice.

REPORTING BUGS

Please report bugs via the CPAN Request Tracker at http://rt.cpan.org/NoAuth/Bugs.html?Dist=autodie.

FEEDBACK

If you find this module useful, please consider rating it on the CPAN Ratings service at http://cpanratings.perl.org/rate?distribution=autodie .

The module author loves to hear how autodie has made your life better (or worse). Feedback can be sent to <pjf@perltraining.com.au>.

AUTHOR

Copyright 2008-2009, Paul Fenwick <pjf@perltraining.com.au>

LICENSE

This module is free software. You may distribute it under the same terms as Perl itself.

SEE ALSO

Fatal, autodie::exception, autodie::hints, IPC::System::Simple

Perl tips, autodie at http://perltraining.com.au/tips/2008-08-20.html

ACKNOWLEDGEMENTS

Mark Reed and Roland Giersig -- Klingon translators.

See the AUTHORS file for full credits. The latest version of this file can be found at http://github.com/pfenwick/autodie/tree/master/AUTHORS .

 
autouse

NAME

autouse - postpone load of modules until a function is used

SYNOPSIS

    use autouse 'Carp' => qw(carp croak);
    carp "this carp was predeclared and autoused ";

DESCRIPTION

If the module Module is already loaded, then the declaration

    use autouse 'Module' => qw(func1 func2($;$));

is equivalent to

    use Module qw(func1 func2);

if Module defines func2() with prototype ($;$), and func1() has no prototype. (At least if Module uses Exporter's import; otherwise it is a fatal error.)

If the module Module is not loaded yet, then the above declaration declares functions func1() and func2() in the current package. When these functions are called, they load the package Module if needed, and substitute themselves with the correct definitions.
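The deferred loading can be observed by checking %INC before and after the first call. This is a minimal sketch that uses the core module File::Basename purely as a convenient example; the path passed to basename is made up.

```perl
use strict;
use warnings;
use autouse 'File::Basename' => qw(basename);

# Only a stub exists so far. File::Basename itself should not be loaded yet,
# unless some other code has already loaded it.
my $loaded_before = exists $INC{'File/Basename.pm'} ? 'yes' : 'no';

# The first call loads the module and replaces the stub with the real function.
my $name = basename('/tmp/example/report.txt');

my $loaded_after = exists $INC{'File/Basename.pm'} ? 'yes' : 'no';

print "loaded before first call: $loaded_before\n";
print "basename result:          $name\n";
print "loaded after first call:  $loaded_after\n";
```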

WARNING

Using autouse will move important steps of your program's execution from compile time to runtime. This can

  • Break the execution of your program if the module you autoused has some initialization which it expects to be done early.

  • Hide bugs in your code, since important checks (like correctness of prototypes) are moved from compile time to runtime. In particular, if the prototype you specified on the autouse line is wrong, you will not find out until the corresponding function is executed. This is especially unfortunate for functions which are not always called (note that autousing such functions gives the biggest win; for a workaround see below).

To alleviate the second problem (partially) it is advised to write your scripts like this:

    use Module;
    use autouse Module => qw(carp($) croak(&$));
    carp "this carp was predeclared and autoused ";

The first line ensures that the errors in your argument specification are found early. When you ship your application you should comment out the first line, since it makes the second one useless.

AUTHOR

Ilya Zakharevich (ilya@math.ohio-state.edu)

SEE ALSO

perl(1).

 
base

NAME

base - Establish an ISA relationship with base classes at compile time

SYNOPSIS

    package Baz;
    use base qw(Foo Bar);

DESCRIPTION

Unless you are using the fields pragma, consider this module discouraged in favor of the lighter-weight parent .

Allows you to load one or more modules and set up inheritance from those modules at the same time. Roughly similar in effect to

    package Baz;
    BEGIN {
        require Foo;
        require Bar;
        push @ISA, qw(Foo Bar);
    }

When base tries to require a module, it will not die if it cannot find the module's file, but will die on any other error. After all this, should your base class be empty, containing no symbols, base will die. This is useful for inheriting from classes in the same file as yourself but where the filename does not match the base module name, like so:

    # in Bar.pm
    package Foo;
    sub exclaim { "I can have such a thing?!" }

    package Bar;
    use base "Foo";

There is no Foo.pm, but because Foo defines a symbol (the exclaim subroutine), base will not die when the require fails to load Foo.pm.

base will also initialize the fields if one of the base classes has it. Multiple inheritance of fields is NOT supported; if two or more base classes each have inheritable fields, the 'base' pragma will croak. See fields for a description of this feature.

The base class' import method is not called.
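A minimal sketch of inheriting from a class defined in the same file follows. The class names My::Shape and My::Square are hypothetical, and it is assumed that no My/Shape.pm exists anywhere in @INC.

```perl
use strict;
use warnings;

# Hypothetical classes, both defined in this one file.
package My::Shape;
sub new      { my ($class, %args) = @_; return bless {%args}, $class }
sub describe { my $self = shift; return ref($self) . ' with area ' . $self->area }

package My::Square;
use base 'My::Shape';    # the require fails quietly; My::Shape already has symbols
sub area { my $self = shift; return $self->{side} ** 2 }

package main;
my $square = My::Square->new(side => 3);
my $text   = $square->describe;
print "$text\n";    # My::Square with area 9
print "is a My::Shape\n" if $square->isa('My::Shape');
```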

DIAGNOSTICS

  • Base class package "%s" is empty.

    base.pm was unable to require the base package, because it was not found in your path.

  • Class 'Foo' tried to inherit from itself

    Attempting to inherit from yourself generates a warning.

        package Foo;
        use base 'Foo';

HISTORY

This module was introduced with Perl 5.004_04.

CAVEATS

Due to the limitations of the implementation, you must use base before you declare any of your own fields.

SEE ALSO

fields

 
bigint

NAME

bigint - Transparent BigInteger support for Perl

SYNOPSIS

    use bigint;

    $x = 2 + 4.5;                           # BigInt 6
    print 2 ** 512,"\n";                    # really is what you think it is
    print inf + 42,"\n";                    # inf
    print NaN * 7,"\n";                     # NaN
    print hex("0x1234567890123490"),"\n";   # Perl v5.10.0 or later

    {
        no bigint;
        print 2 ** 256,"\n";                # a normal Perl scalar now
    }

    # Import into current package:
    use bigint qw/hex oct/;
    print hex("0x1234567890123490"),"\n";
    print oct("01234567890123490"),"\n";

DESCRIPTION

All operators (including basic math operations) except the range operator .. are overloaded. Integer constants are created as proper BigInts.

Floating point constants are truncated to integer. All parts and results of expressions are also truncated.

Unlike integer, this pragma creates integer constants that are only limited in their size by the available memory and CPU time.
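The difference from ordinary Perl numbers can be sketched by computing the same power inside and outside the scope of the pragma. The variable names are arbitrary; the only assumption is a perl recent enough (v5.10 or later) for bigint to be lexically scoped.

```perl
use strict;
use warnings;

my ($big, $half, $native);
{
    use bigint;
    $big  = 2 ** 100;    # exact, because the constants are big integer objects
    $half = 7 / 2;       # results are truncated to integers
}

# Outside the block the pragma is not in effect, so this is a floating-point value.
$native = 2 ** 100;

print "$big\n";       # 1267650600228229401496703205376
print "$half\n";      # 3
print "$native\n";    # a floating-point approximation in exponent notation
```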

use integer vs. use bigint

There is one small difference between use integer and use bigint : the former will not affect assignments to variables and the return value of some functions. bigint truncates these results to integer too:

    # perl -Minteger -wle 'print 3.2'
    3.2
    # perl -Minteger -wle 'print 3.2 + 0'
    3
    # perl -Mbigint -wle 'print 3.2'
    3
    # perl -Mbigint -wle 'print 3.2 + 0'
    3

    # perl -Mbigint -wle 'print exp(1) + 0'
    2
    # perl -Mbigint -wle 'print exp(1)'
    2
    # perl -Minteger -wle 'print exp(1)'
    2.71828182845905
    # perl -Minteger -wle 'print exp(1) + 0'
    2

In practice this seldom makes a difference, as parts and results of expressions will be truncated anyway, but this can, for instance, affect the return value of subroutines:

    sub three_integer { use integer; return 3.2; }
    sub three_bigint { use bigint; return 3.2; }

    print three_integer(), " ", three_bigint(),"\n";    # prints "3.2 3"

Options

bigint recognizes some options that can be passed while loading it via use. The options can (currently) be either a single letter form, or the long form. The following options exist:

  • a or accuracy

    This sets the accuracy for all math operations. The argument must be greater than or equal to zero. See Math::BigInt's bround() function for details.

        perl -Mbigint=a,2 -le 'print 12345+1'

    Note that setting precision and accuracy at the same time is not possible.

  • p or precision

    This sets the precision for all math operations. The argument can be any integer. Negative values mean a fixed number of digits after the dot, and are ignored since all operations happen in integer space. A positive value rounds to this digit left from the dot. 0 or 1 mean round to integer and are ignored like negative values.

    See Math::BigInt's bfround() function for details.

        perl -Mbignum=p,5 -le 'print 123456789+123'

    Note that setting precision and accuracy at the same time is not possible.

  • t or trace

    This enables a trace mode and is primarily for debugging bigint or Math::BigInt.

  • hex

    Override the built-in hex() method with a version that can handle big integers. This overrides it by exporting it to the current package. Under Perl v5.10.0 and higher, this is not so necessary, as hex() is lexically overridden in the current scope whenever the bigint pragma is active.

  • oct

    Override the built-in oct() method with a version that can handle big integers. This overrides it by exporting it to the current package. Under Perl v5.10.0 and higher, this is not so necessary, as oct() is lexically overridden in the current scope whenever the bigint pragma is active.

  • l, lib, try or only

    Load a different math lib, see Math Library.

        perl -Mbigint=lib,GMP -e 'print 2 ** 512'
        perl -Mbigint=try,GMP -e 'print 2 ** 512'
        perl -Mbigint=only,GMP -e 'print 2 ** 512'

    Currently there is no way to specify more than one library on the command line. This means the following does not work:

        perl -Mbignum=l,GMP,Pari -e 'print 2 ** 512'

    This will hopefully be fixed soon ;)

  • v or version

    This prints out the name and version of all modules used and then exits.

        perl -Mbigint=v

Math Library

Math with the numbers is done (by default) by a module called Math::BigInt::Calc. This is equivalent to saying:

    use bigint lib => 'Calc';

You can change this by using:

    use bignum lib => 'GMP';

The following would first try to find Math::BigInt::Foo, then Math::BigInt::Bar, and when this also fails, revert to Math::BigInt::Calc:

    use bigint lib => 'Foo,Math::BigInt::Bar';

Using lib warns if none of the specified libraries can be found and Math::BigInt did fall back to one of the default libraries. To suppress this warning, use try instead:

    use bignum try => 'GMP';

If you want the code to die instead of falling back, use only instead:

    use bignum only => 'GMP';

Please see respective module documentation for further details.
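One way to check which backend ended up being used is to ask Math::BigInt for its configuration. The sketch below assumes that Math::BigInt->config() returns a hash reference with a lib entry, as in the Math::BigInt versions the author of this example is familiar with; which backend is reported depends on what is installed.

```perl
use strict;
use warnings;
use bigint try => 'GMP';    # falls back quietly if Math::BigInt::GMP is missing

# config() reports, among other things, which backend was actually loaded,
# for example Math::BigInt::GMP if it is installed, or a default backend otherwise.
my $backend = Math::BigInt->config()->{lib};
my $value   = 2 ** 64;

print "backend in use: $backend\n";
print "$value\n";    # 18446744073709551616
```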

Internal Format

The numbers are stored as objects, and their internals might change at any time, especially between math operations. The objects also might belong to different classes, like Math::BigInt or Math::BigInt::Lite. Mixing them together, even with normal scalars, is not extraordinary, but normal and expected.

You should not depend on the internal format; all accesses must go through accessor methods. E.g. looking at $x->{sign} is not a good idea since there is no guarantee that the object in question has such a hash key, nor that there is a hash underneath at all.

Sign

The sign is either '+', '-', 'NaN', '+inf' or '-inf'. You can access it with the sign() method.

A sign of 'NaN' is used to represent the result when input arguments are not numbers or as a result of 0/0. '+inf' and '-inf' represent positive and negative infinity, respectively. You will get '+inf' when dividing a positive number by 0, and '-inf' when dividing any negative number by 0.

Method calls

Since all numbers are now objects, you can use all functions that are part of the BigInt API. You can only use the bxxx() notation, and not the fxxx() notation, though.

But a warning is in order. When using the following to make a copy of a number, only a shallow copy will be made.

    $x = 9; $y = $x;
    $x = $y = 7;

Using the copy or the original with overloaded math is okay, e.g. the following works:

    $x = 9; $y = $x;
    print $x + 1, " ", $y,"\n";         # prints 10 9

but calling any method that modifies the number directly will result in both the original and the copy being modified:

    $x = 9; $y = $x;
    print $x->badd(1), " ", $y,"\n";    # prints 10 10

    $x = 9; $y = $x;
    print $x->binc(1), " ", $y,"\n";    # prints 10 10

    $x = 9; $y = $x;
    print $x->bmul(2), " ", $y,"\n";    # prints 18 18

Using methods that do not modify, but test the contents works:

    $x = 9; $y = $x;
    $z = 9 if $x->is_zero();            # works fine

See the documentation about the copy constructor and = in overload, as well as the documentation in BigInt for further details.
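The shallow-copy pitfall above, and one way around it, can be sketched as follows. The example relies on the copy() method of Math::BigInt to obtain an independent object; the variable names are arbitrary.

```perl
use strict;
use warnings;
use bigint;

# Plain assignment copies only the reference, so both names share one object.
my $x = 9;
my $y = $x;
$x->badd(1);            # modifies the shared object in place
print "$x $y\n";        # 10 10

# An explicit copy() gives an independent object.
my $p = 9;
my $q = $p->copy();
$p->badd(1);
print "$p $q\n";        # 10 9
```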

Methods

  • inf()

    A shortcut to return Math::BigInt->binf(). Useful because Perl does not always handle bareword inf properly.

  • NaN()

    A shortcut to return Math::BigInt->bnan(). Useful because Perl does not always handle bareword NaN properly.

  • e

        # perl -Mbigint=e -wle 'print e'

    Returns Euler's number e, aka exp(1). Note that under bigint, this is truncated to an integer, and hence simply '2'.

  • PI

        # perl -Mbigint=PI -wle 'print PI'

    Returns PI. Note that under bigint, this is truncated to an integer, and hence simply '3'.

  • bexp()

        bexp($power,$accuracy);

    Returns Euler's number e raised to the appropriate power, to the wanted accuracy.

    Note that under bigint, the result is truncated to an integer.

    Example:

        # perl -Mbigint=bexp -wle 'print bexp(1,80)'

  • bpi()

        bpi($accuracy);

    Returns PI to the wanted accuracy. Note that under bigint, this is truncated to an integer, and hence simply '3'.

    Example:

        # perl -Mbigint=bpi -wle 'print bpi(80)'

  • upgrade()

    Returns the class that numbers are upgraded to; this is in fact the value of $Math::BigInt::upgrade.

  • in_effect()

        use bigint;

        print "in effect\n" if bigint::in_effect;       # true
        {
            no bigint;
            print "in effect\n" if bigint::in_effect;   # false
        }

    Returns true if bigint is in effect in the current scope, and false otherwise.

    This method only works on Perl v5.9.4 or later.

CAVEATS

  • ranges

    Perl does not allow overloading of ranges, so you can neither safely use ranges with bigint endpoints, nor is the iterator variable a bigint.

        use 5.010;
        for my $i (12..13) {
            for my $j (20..21) {
                say $i ** $j;   # produces a floating-point number,
                                # not a big integer
            }
        }

  • in_effect()

    This method only works on Perl v5.9.4 or later.

  • hex()/oct()

    bigint overrides these routines with versions that can also handle big integer values. Under Perl prior to version v5.9.4, however, this will not happen unless you specifically ask for it with the two import tags "hex" and "oct" - and then it will be global and cannot be disabled inside a scope with "no bigint":

        use bigint qw/hex oct/;

        print hex("0x1234567890123456");
        {
            no bigint;
            print hex("0x1234567890123456");
        }

    The second call to hex() will warn about a non-portable constant.

    Compare this to:

        use bigint;

        # will warn only under Perl older than v5.9.4
        print hex("0x1234567890123456");

MODULES USED

bigint is just a thin wrapper around various modules of the Math::BigInt family. Think of it as the head of the family, who runs the shop, and orders the others to do the work.

The following modules are currently used by bigint:

    Math::BigInt::Lite      (for speed, and only if it is loadable)
    Math::BigInt

EXAMPLES

Some cool command line examples to impress the Python crowd ;) You might want to compare them to the results under -Mbignum or -Mbigrat:

    perl -Mbigint -le 'print sqrt(33)'
    perl -Mbigint -le 'print 2*255'
    perl -Mbigint -le 'print 4.5+2*255'
    perl -Mbigint -le 'print 3/7 + 5/7 + 8/3'
    perl -Mbigint -le 'print 123->is_odd()'
    perl -Mbigint -le 'print log(2)'
    perl -Mbigint -le 'print 2 ** 0.5'
    perl -Mbigint=a,65 -le 'print 2 ** 0.2'
    perl -Mbignum=a,65,l,GMP -le 'print 7 ** 7777'

LICENSE

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

Especially bigrat as in perl -Mbigrat -le 'print 1/3+1/4' and bignum as in perl -Mbignum -le 'print sqrt(2)' .

Math::BigInt, Math::BigRat and Math::Big as well as Math::BigInt::Pari and Math::BigInt::GMP.

AUTHORS

(C) by Tels http://bloodgate.com/ in early 2002 - 2007.

 
bignum

NAME

bignum - Transparent BigNumber support for Perl

SYNOPSIS

    use bignum;

    $x = 2 + 4.5;                       # BigFloat 6.5
    print 2 ** 512 * 0.1,"\n";          # really is what you think it is
    print inf * inf,"\n";               # prints inf
    print NaN * 3,"\n";                 # prints NaN

    {
        no bignum;
        print 2 ** 256,"\n";            # a normal Perl scalar now
    }

    # for older Perls, import into current package:
    use bignum qw/hex oct/;
    print hex("0x1234567890123490"),"\n";
    print oct("01234567890123490"),"\n";

DESCRIPTION

All operators (including basic math operations) are overloaded. Integer and floating-point constants are created as proper BigInts or BigFloats, respectively.

If you do

    use bignum;

at the top of your script, Math::BigFloat and Math::BigInt will be loaded and any constant number will be converted to an object (Math::BigFloat for floats like 3.1415 and Math::BigInt for integers like 1234).

So, the following line:

    $x = 1234;

actually creates a Math::BigInt and stores a reference to it in $x. This happens transparently and behind your back, so to speak.

You can see this with the following:

    perl -Mbignum -le 'print ref(1234)'

Don't worry if it says Math::BigInt::Lite; bignum and friends will use Lite if it is installed, since it is faster for some operations. It will be automatically upgraded to BigInt whenever necessary:

    perl -Mbignum -le 'print ref(2**255)'

This also means it is a bad idea to check for some specific package, since the actual contents of $x might be something unexpected. Because bignum works transparently, calling ref() should not be necessary anyway.

Since Math::BigInt and BigFloat also overload the normal math operations, the following line will still work:

    perl -Mbignum -le 'print ref(1234+1234)'

Since numbers are actually objects, you can call all the usual methods from BigInt/BigFloat on them. This even works to some extent on expressions:

    perl -Mbignum -le '$x = 1234; print $x->bdec()'
    perl -Mbignum -le 'print 1234->copy()->binc();'
    perl -Mbignum -le 'print 1234->copy()->binc->badd(6);'
    perl -Mbignum -le 'print +(1234)->copy()->binc()'

(Note that print doesn't do what you expect if the expression starts with '(' hence the + )

You can even chain the operations together as usual:

    perl -Mbignum -le 'print 1234->copy()->binc->badd(6);'
    1241

Under bignum (or bigint or bigrat), Perl will "upgrade" the numbers appropriately. This means that:

    perl -Mbignum -le 'print 1234+4.5'
    1238.5

will work correctly. These mixed cases do not always work when using Math::BigInt or Math::BigFloat alone, or at least not in the way normal Perl scalars work.
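The same behaviour can be sketched in a script rather than a one-liner. The variable names are arbitrary; the second calculation is added here only to illustrate that decimal constants are held exactly rather than as binary floating-point values.

```perl
use strict;
use warnings;
use bignum;

my $mixed = 1234 + 4.5;    # integer and float operands are combined transparently
my $sum   = 0.1 + 0.2;     # decimal arithmetic, no binary rounding error

print "$mixed\n";                                 # 1238.5
print $sum == 0.3 ? "equal\n" : "not equal\n";    # equal
```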

If you do want to work with large integers like under use integer; , try use bigint; :

    perl -Mbigint -le 'print 1234.5+4.5'
    1238

There is also use bigrat; which gives you big rationals:

    perl -Mbigrat -le 'print 1234+4.1'
    12381/10

The entire upgrading/downgrading is still experimental and might not work as you expect or may even have bugs. You might get errors like this:

  Can't use an undefined value as an ARRAY reference at
  /usr/local/lib/perl5/5.8.0/Math/BigInt/Calc.pm line 864

This means somewhere a routine got a BigFloat/Lite but expected a BigInt (or vice versa) and the upgrade/downgrade path was missing. This is a bug; please report it so that we can fix it.

You might consider using just Math::BigInt or Math::BigFloat, since they allow you finer control over what gets done in which module/space. For instance, simple loop counters will be Math::BigInts under use bignum; and this is slower than keeping them as Perl scalars:

  perl -Mbignum -le 'for ($i = 0; $i < 10; $i++) { print ref($i); }'

Please note the following does not work as expected (prints nothing), since overloading of '..' is not yet possible in Perl (as of v5.8.0):

  perl -Mbignum -le 'for (1..2) { print ref($_); }'

Options

bignum recognizes some options that can be passed while loading it via use. The options can (currently) be either a single letter form, or the long form. The following options exist:

  • a or accuracy

    This sets the accuracy for all math operations. The argument must be greater than or equal to zero. See Math::BigInt's bround() function for details.

      perl -Mbignum=a,50 -le 'print sqrt(20)'

    Note that setting precision and accuracy at the same time is not possible.

  • p or precision

    This sets the precision for all math operations. The argument can be any integer. Negative values mean a fixed number of digits after the dot, while a positive value rounds to this digit left from the dot. 0 or 1 mean round to integer. See Math::BigInt's bfround() function for details.

      perl -Mbignum=p,-50 -le 'print sqrt(20)'

    Note that setting precision and accuracy at the same time is not possible.

  • t or trace

    This enables a trace mode and is primarily for debugging bignum or Math::BigInt/Math::BigFloat.

  • l or lib

    Load a different math lib, see Math Library.

      perl -Mbignum=l,GMP -e 'print 2 ** 512'

    Currently there is no way to specify more than one library on the command line. This means the following does not work:

      perl -Mbignum=l,GMP,Pari -e 'print 2 ** 512'

    This will hopefully be fixed soon ;)

  • hex

    Override the built-in hex() method with a version that can handle big numbers. This overrides it by exporting it to the current package. Under Perl v5.10.0 and higher, this is not so necessary, as hex() is lexically overridden in the current scope whenever the bignum pragma is active.

  • oct

    Override the built-in oct() method with a version that can handle big numbers. This overrides it by exporting it to the current package. Under Perl v5.10.0 and higher, this is not so necessary, as oct() is lexically overridden in the current scope whenever the bignum pragma is active.

  • v or version

    This prints out the name and version of all modules used and then exits.

      perl -Mbignum=v
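The difference between the accuracy (a) and precision (p) options described above is easier to see on a single value. The sketch below does not use the pragma options; it calls the bround() and bfround() methods that the option descriptions refer to directly on a Math::BigFloat object. The sample value and variable names are illustrative only.

```perl
use strict;
use warnings;
use Math::BigFloat;

my $x = Math::BigFloat->new('123.456789');

# bround() and bfround() modify their object, so work on copies.
my $by_accuracy  = $x->copy->bround(4);    # keep 4 significant digits
my $by_precision = $x->copy->bfround(-3);  # keep 3 digits after the dot

print "accuracy 4:   $by_accuracy\n";      # 123.5
print "precision -3: $by_precision\n";     # 123.457
```

Accuracy counts significant digits from the left, while a negative precision fixes the number of digits after the dot, which is why the two results differ.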

Methods

Besides import() and AUTOLOAD() there are only a few other methods.

Since all numbers are now objects, you can use all functions that are part of the BigInt or BigFloat API. It is wise to use only the bxxx() notation, and not the fxxx() notation, though. This allows the underlying object to morph into a class other than BigFloat.

Caveats

But a warning is in order. When using the following to make a copy of a number, only a shallow copy will be made.

  $x = 9; $y = $x;
  $x = $y = 7;

If you want to make a real copy, use the following:

  $y = $x->copy();

Using the copy or the original with overloaded math is okay, e.g. the following works:

  $x = 9; $y = $x;
  print $x + 1, " ", $y,"\n"; # prints 10 9

but calling any method that modifies the number directly will change both the original and the copy, since both refer to the same object:

  $x = 9; $y = $x;
  print $x->badd(1), " ", $y,"\n"; # prints 10 10
  $x = 9; $y = $x;
  print $x->binc(1), " ", $y,"\n"; # prints 10 10
  $x = 9; $y = $x;
  print $x->bmul(2), " ", $y,"\n"; # prints 18 18

Using methods that do not modify, but only test, the contents works:

  $x = 9; $y = $x;
  $z = 9 if $x->is_zero(); # works fine

See the documentation about the copy constructor and = in overload, as well as the documentation in BigInt for further details.
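The shallow-copy behaviour described above can be reproduced without the pragma by creating the object explicitly. This is a minimal sketch using Math::BigInt directly; the variable names $alias and $clone are illustrative only.

```perl
use strict;
use warnings;
use Math::BigInt;

my $x     = Math::BigInt->new(9);
my $alias = $x;           # copies only the reference: same object as $x
my $clone = $x->copy();   # a real, independent copy

$x->badd(1);              # modifies the object in place

print "$x $alias $clone\n";   # 10 10 9
```

Only the value obtained through copy() is unaffected by the in-place badd().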

  • inf()

    A shortcut to return Math::BigInt->binf(). Useful because Perl does not always handle bareword inf properly.

  • NaN()

    A shortcut to return Math::BigInt->bnan(). Useful because Perl does not always handle bareword NaN properly.

  • e

      # perl -Mbignum=e -wle 'print e'

    Returns Euler's number e, aka exp(1).

  • PI()

      # perl -Mbignum=PI -wle 'print PI'

    Returns PI.

  • bexp()

      bexp($power,$accuracy);

    Returns Euler's number e raised to the appropriate power, to the wanted accuracy.

    Example:

      # perl -Mbignum=bexp -wle 'print bexp(1,80)'

  • bpi()

      bpi($accuracy);

    Returns PI to the wanted accuracy.

    Example:

      # perl -Mbignum=bpi -wle 'print bpi(80)'

  • upgrade()

    Returns the class that numbers are upgraded to; this is in fact the value of $Math::BigInt::upgrade.

  • in_effect()

      use bignum;
      print "in effect\n" if bignum::in_effect; # true
      {
          no bignum;
          print "in effect\n" if bignum::in_effect; # false
      }

    Returns true if bignum is in effect in the current scope, and false otherwise.

    This method only works on Perl v5.9.4 or later.
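The one-line examples for bexp() and bpi() above can also be written as a script. This is a sketch under the assumption that importing the two names in the use line behaves the same way as the -Mbignum=bexp and -Mbignum=bpi command-line forms; the choice of 20 digits is arbitrary.

```perl
use strict;
use warnings;
use bignum qw(bexp bpi);

# Ask for 20 significant digits of each constant.
my $pi = bpi(20);
my $e  = bexp(1, 20);

print "pi is roughly $pi\n";
print "e  is roughly $e\n";
```

Both values are number objects, so further arithmetic on them stays in arbitrary precision.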

Math Library

Math with the numbers is done (by default) by a module called Math::BigInt::Calc. This is equivalent to saying:

  use bignum lib => 'Calc';

You can change this by using:

  use bignum lib => 'GMP';

The following would first try to find Math::BigInt::Foo, then Math::BigInt::Bar, and when this also fails, revert to Math::BigInt::Calc:

  use bignum lib => 'Foo,Math::BigInt::Bar';

Please see the respective module documentation for further details.

Using lib warns if none of the specified libraries can be found and Math::BigInt falls back to one of the default libraries. To suppress this warning, use try instead:

  use bignum try => 'GMP';

If you want the code to die instead of falling back, use only instead:

  use bignum only => 'GMP';
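To check which library was actually selected after a possible fallback, Math::BigInt can be asked for its configuration. The sketch below loads Math::BigInt directly rather than through bignum, and relies on Math::BigInt's config() method; treating its lib entry as the name of the active backend is an assumption based on the Math::BigInt documentation, not something stated in this page.

```perl
use strict;
use warnings;

# "try" falls back silently if Math::BigInt::GMP is not installed.
use Math::BigInt try => 'GMP';

my $backend = Math::BigInt->config()->{lib};

print "math library in use: $backend\n";
```

On a system without GMP this is expected to report the default library instead of dying or warning.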

INTERNAL FORMAT

The numbers are stored as objects, and their internals might change at any time, especially between math operations. The objects also might belong to different classes, like Math::BigInt or Math::BigFloat. Mixing them together, even with normal scalars, is not extraordinary, but normal and expected.

You should not depend on the internal format; all accesses must go through accessor methods. E.g. looking at $x->{sign} is not a bright idea, since there is no guarantee that the object in question has such a hash key, or that there is a hash underneath at all.

SIGN

The sign is either '+', '-', 'NaN', '+inf' or '-inf' and stored separately. You can access it with the sign() method.

A sign of 'NaN' is used to represent the result when input arguments are not numbers or as a result of 0/0. '+inf' and '-inf' represent plus and minus infinity, respectively. You will get '+inf' when dividing a positive number by 0, and '-inf' when dividing any negative number by 0.
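The five possible sign values can be observed with the sign() method. This is a small sketch using Math::BigInt objects created by hand instead of through the pragma; the input values are arbitrary.

```perl
use strict;
use warnings;
use Math::BigInt;

my $pos  = Math::BigInt->new(5);
my $neg  = Math::BigInt->new(-5);
my $nan  = Math::BigInt->new('abc');          # input is not a number
my $pinf = Math::BigInt->new(5)->bdiv(0);     # positive number / 0
my $ninf = Math::BigInt->new(-5)->bdiv(0);    # negative number / 0

my @signs = map { $_->sign() } $pos, $neg, $nan, $pinf, $ninf;

print join(' ', @signs), "\n";                # + - NaN +inf -inf
```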

CAVEATS

  • in_effect()

    This method only works on Perl v5.9.4 or later.

  • hex()/oct()

    bigint overrides these routines with versions that can also handle big integer values. Under Perl prior to version v5.9.4, however, this will not happen unless you specifically ask for it with the two import tags "hex" and "oct" - and then it will be global and cannot be disabled inside a scope with "no bigint":

      use bigint qw/hex oct/;
      print hex("0x1234567890123456");
      {
          no bigint;
          print hex("0x1234567890123456");
      }

    The second call to hex() will warn about a non-portable constant.

    Compare this to:

      use bigint;
      # will warn only under Perl older than v5.9.4
      print hex("0x1234567890123456");

MODULES USED

bignum is just a thin wrapper around various modules of the Math::BigInt family. Think of it as the head of the family, who runs the shop, and orders the others to do the work.

The following modules are currently used by bignum:

  Math::BigInt::Lite (for speed, and only if it is loadable)
  Math::BigInt
  Math::BigFloat

EXAMPLES

Some cool command line examples to impress the Python crowd ;)

  perl -Mbignum -le 'print sqrt(33)'
  perl -Mbignum -le 'print 2*255'
  perl -Mbignum -le 'print 4.5+2*255'
  perl -Mbignum -le 'print 3/7 + 5/7 + 8/3'
  perl -Mbignum -le 'print 123->is_odd()'
  perl -Mbignum -le 'print log(2)'
  perl -Mbignum -le 'print exp(1)'
  perl -Mbignum -le 'print 2 ** 0.5'
  perl -Mbignum=a,65 -le 'print 2 ** 0.2'
  perl -Mbignum=a,65,l,GMP -le 'print 7 ** 7777'

LICENSE

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

Especially bigrat as in perl -Mbigrat -le 'print 1/3+1/4' .

Math::BigFloat, Math::BigInt, Math::BigRat and Math::Big as well as Math::BigInt::Pari and Math::BigInt::GMP.

AUTHORS

(C) by Tels http://bloodgate.com/ in early 2002 - 2007.

 
bigrat

NAME

bigrat - Transparent BigNumber/BigRational support for Perl

SYNOPSIS

  use bigrat;

  print 2 + 4.5,"\n"; # BigFloat 6.5
  print 1/3 + 1/4,"\n"; # produces 7/12

  {
      no bigrat;
      print 1/3,"\n"; # 0.33333...
  }

  # Import into current package:
  use bigrat qw/hex oct/;
  print hex("0x1234567890123490"),"\n";
  print oct("01234567890123490"),"\n";

DESCRIPTION

All operators (including basic math operations) are overloaded. Integer and floating-point constants are created as proper BigInts or BigFloats, respectively.

Unlike bignum, this module upgrades to Math::BigRat, meaning that instead of 2.5 you will get 2+1/2 as output.
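The same rational arithmetic can be seen without the pragma by creating Math::BigRat objects by hand. This is a minimal sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Math::BigRat;

my $third   = Math::BigRat->new('1/3');
my $quarter = Math::BigRat->new('1/4');

# The overloaded + returns a new Math::BigRat and leaves both operands intact.
my $sum = $third + $quarter;

print "$sum\n";   # 7/12
```

Under use bigrat; the two constructor calls are what the pragma arranges for you behind the scenes when a fraction cannot be represented as an integer.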

Modules Used

bigrat is just a thin wrapper around various modules of the Math::BigInt family. Think of it as the head of the family, who runs the shop, and orders the others to do the work.

The following modules are currently used by bigrat:

  Math::BigInt::Lite (for speed, and only if it is loadable)
  Math::BigInt
  Math::BigFloat
  Math::BigRat

Math Library

Math with the numbers is done (by default) by a module called Math::BigInt::Calc. This is equivalent to saying:

  use bigrat lib => 'Calc';

You can change this by using:

  use bigrat lib => 'GMP';

The following would first try to find Math::BigInt::Foo, then Math::BigInt::Bar, and when this also fails, revert to Math::BigInt::Calc:

  use bigrat lib => 'Foo,Math::BigInt::Bar';

Using lib warns if none of the specified libraries can be found and Math::BigInt falls back to one of the default libraries. To suppress this warning, use try instead:

  use bigrat try => 'GMP';

If you want the code to die instead of falling back, use only instead:

  use bigrat only => 'GMP';

Please see respective module documentation for further details.

Sign

The sign is either '+', '-', 'NaN', '+inf' or '-inf'.

A sign of 'NaN' is used to represent the result when input arguments are not numbers or as a result of 0/0. '+inf' and '-inf' represent plus and minus infinity, respectively. You will get '+inf' when dividing a positive number by 0, and '-inf' when dividing any negative number by 0.

Methods

Since all numbers are now objects, you can use all functions that are part of the BigInt or BigFloat API. It is wise to use only the bxxx() notation, and not the fxxx() notation, though. This makes you independent of the fact that the underlying object might morph into a different class than BigFloat.

  • inf()

    A shortcut to return Math::BigInt->binf(). Useful because Perl does not always handle bareword inf properly.

  • NaN()

    A shortcut to return Math::BigInt->bnan(). Useful because Perl does not always handle bareword NaN properly.

  • e

      # perl -Mbigrat=e -wle 'print e'

    Returns Euler's number e, aka exp(1).

  • PI

      # perl -Mbigrat=PI -wle 'print PI'

    Returns PI.

  • bexp()

      bexp($power,$accuracy);

    Returns Euler's number e raised to the appropriate power, to the wanted accuracy.

    Example:

      # perl -Mbigrat=bexp -wle 'print bexp(1,80)'

  • bpi()

      bpi($accuracy);

    Returns PI to the wanted accuracy.

    Example:

      # perl -Mbigrat=bpi -wle 'print bpi(80)'

  • upgrade()

    Returns the class that numbers are upgraded to; this is in fact the value of $Math::BigInt::upgrade.

  • in_effect()

      use bigrat;
      print "in effect\n" if bigrat::in_effect; # true
      {
          no bigrat;
          print "in effect\n" if bigrat::in_effect; # false
      }

    Returns true if bigrat is in effect in the current scope, and false otherwise.

    This method only works on Perl v5.9.4 or later.

MATH LIBRARY

Math with the numbers is done (by default) by a module called Math::BigInt::Calc. See the Math Library section above for how to select a different library.

Caveat

But a warning is in order. When using the following to make a copy of a number, only a shallow copy will be made.

  $x = 9; $y = $x;
  $x = $y = 7;

If you want to make a real copy, use the following:

  $y = $x->copy();

Using the copy or the original with overloaded math is okay, e.g. the following works:

  $x = 9; $y = $x;
  print $x + 1, " ", $y,"\n"; # prints 10 9

but calling any method that modifies the number directly will change both the original and the copy, since both refer to the same object:

  $x = 9; $y = $x;
  print $x->badd(1), " ", $y,"\n"; # prints 10 10
  $x = 9; $y = $x;
  print $x->binc(1), " ", $y,"\n"; # prints 10 10
  $x = 9; $y = $x;
  print $x->bmul(2), " ", $y,"\n"; # prints 18 18

Using methods that do not modify, but only test, the contents works:

  $x = 9; $y = $x;
  $z = 9 if $x->is_zero(); # works fine

See the documentation about the copy constructor and = in overload, as well as the documentation in BigInt for further details.

Options

bigrat recognizes some options that can be passed while loading it via use. The options can (currently) be either a single letter form, or the long form. The following options exist:

  • a or accuracy

    This sets the accuracy for all math operations. The argument must be greater than or equal to zero. See Math::BigInt's bround() function for details.

      perl -Mbigrat=a,50 -le 'print sqrt(20)'

    Note that setting precision and accuracy at the same time is not possible.

  • p or precision

    This sets the precision for all math operations. The argument can be any integer. Negative values mean a fixed number of digits after the dot, while a positive value rounds to this digit left from the dot. 0 or 1 mean round to integer. See Math::BigInt's bfround() function for details.

      perl -Mbigrat=p,-50 -le 'print sqrt(20)'

    Note that setting precision and accuracy at the same time is not possible.

  • t or trace

    This enables a trace mode and is primarily for debugging bignum or Math::BigInt/Math::BigFloat.

  • l or lib

    Load a different math lib, see MATH LIBRARY.

      perl -Mbigrat=l,GMP -e 'print 2 ** 512'

    Currently there is no way to specify more than one library on the command line. This means the following does not work:

      perl -Mbigrat=l,GMP,Pari -e 'print 2 ** 512'

    This will hopefully be fixed soon ;)

  • hex

    Override the built-in hex() method with a version that can handle big numbers. This overrides it by exporting it to the current package. Under Perl v5.10.0 and higher, this is not so necessary, as hex() is lexically overridden in the current scope whenever the bigrat pragma is active.

  • oct

    Override the built-in oct() method with a version that can handle big numbers. This overrides it by exporting it to the current package. Under Perl v5.10.0 and higher, this is not so necessary, as oct() is lexically overridden in the current scope whenever the bigrat pragma is active.

  • v or version

    This prints out the name and version of all modules used and then exits.

      perl -Mbigrat=v

CAVEATS

  • in_effect()

    This method only works on Perl v5.9.4 or later.

  • hex()/oct()

    bigint overrides these routines with versions that can also handle big integer values. Under Perl prior to version v5.9.4, however, this will not happen unless you specifically ask for it with the two import tags "hex" and "oct" - and then it will be global and cannot be disabled inside a scope with "no bigint":

      use bigint qw/hex oct/;
      print hex("0x1234567890123456");
      {
          no bigint;
          print hex("0x1234567890123456");
      }

    The second call to hex() will warn about a non-portable constant.

    Compare this to:

      use bigint;
      # will warn only under Perl older than v5.9.4
      print hex("0x1234567890123456");

EXAMPLES

  perl -Mbigrat -le 'print sqrt(33)'
  perl -Mbigrat -le 'print 2*255'
  perl -Mbigrat -le 'print 4.5+2*255'
  perl -Mbigrat -le 'print 3/7 + 5/7 + 8/3'
  perl -Mbigrat -le 'print 12->is_odd()'
  perl -Mbigrat=l,GMP -le 'print 7 ** 7777'

LICENSE

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

Especially bignum.

Math::BigFloat, Math::BigInt, Math::BigRat and Math::Big as well as Math::BigInt::Pari and Math::BigInt::GMP.

AUTHORS

(C) by Tels http://bloodgate.com/ in early 2002 - 2007.

 
blib

NAME

blib - Use MakeMaker's uninstalled version of a package

SYNOPSIS

  perl -Mblib script [args...]
  perl -Mblib=dir script [args...]

DESCRIPTION

Looks for MakeMaker-like 'blib' directory structure starting in dir (or current directory) and working back up to five levels of '..'.

Intended for use on command line with -M option as a way of testing arbitrary scripts against an uninstalled version of a package.

However, it is possible to write:

  use blib;
  # or
  use blib '..';

etc. if you really must.

BUGS

Pollutes global name space for development only task.

AUTHOR

Nick Ing-Simmons nik@tiuk.ti.com

 
bytes

NAME

bytes - Perl pragma to force byte semantics rather than character semantics

NOTICE

This pragma reflects early attempts to incorporate Unicode into perl and has since been superseded. It breaks encapsulation (i.e. it exposes the innards of how the perl executable currently happens to store a string), and use of this module for anything other than debugging purposes is strongly discouraged. If you feel that the functions here within might be useful for your application, this possibly indicates a mismatch between your mental model of Perl Unicode and the current reality. In that case, you may wish to read some of the perl Unicode documentation: perluniintro, perlunitut, perlunifaq and perlunicode.

SYNOPSIS

  use bytes;
  ... chr(...);       # or bytes::chr
  ... index(...);     # or bytes::index
  ... length(...);    # or bytes::length
  ... ord(...);       # or bytes::ord
  ... rindex(...);    # or bytes::rindex
  ... substr(...);    # or bytes::substr
  no bytes;

DESCRIPTION

The use bytes pragma disables character semantics for the rest of the lexical scope in which it appears. no bytes can be used to reverse the effect of use bytes within the current lexical scope.

Perl normally assumes character semantics in the presence of character data (i.e. data that has come from a source that has been marked as being of a particular character encoding). When use bytes is in effect, the encoding is temporarily ignored, and each string is treated as a series of bytes.

As an example, when Perl sees $x = chr(400), it encodes the character in UTF-8 and stores it in $x. Then it is marked as character data, so, for instance, length $x returns 1. However, in the scope of the bytes pragma, $x is treated as a series of bytes (the bytes that make up the UTF-8 encoding) and length $x returns 2:

  $x = chr(400);
  print "Length is ", length $x, "\n";     # "Length is 1"
  printf "Contents are %vd\n", $x;         # "Contents are 400"
  {
      use bytes; # or "require bytes; bytes::length()"
      print "Length is ", length $x, "\n"; # "Length is 2"
      printf "Contents are %vd\n", $x;     # "Contents are 198.144"
  }

chr(), ord(), substr(), index() and rindex() behave similarly.

For more on the implications and differences between character semantics and byte semantics, see perluniintro and perlunicode.
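Since the NOTICE above discourages this pragma outside of debugging, it may help to see one way of obtaining the same byte count without it. The sketch below uses the Encode module, which is an assumption on my part and is not mentioned in this page, to encode the string explicitly and measure the result.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

my $x = chr(400);

my $chars = length($x);                # counts characters
my $bytes = length(encode_utf8($x));   # counts bytes of the UTF-8 encoding

print "characters: $chars, UTF-8 bytes: $bytes\n";   # characters: 1, UTF-8 bytes: 2
```

The explicit encoding step states which byte representation is being measured instead of exposing whatever perl happens to store internally.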

LIMITATIONS

bytes::substr() does not work as an lvalue.

SEE ALSO

perluniintro, perlunicode, utf8

 
c2ph

NAME

c2ph, pstruct - Dump C structures as generated from cc -g -S stabs

SYNOPSIS

  c2ph [-dpnP] [var=val] [files ...]

OPTIONS

  Options:

  -w      wide; short for: type_width=45 member_width=35 offset_width=8
  -x      hex; short for:  offset_fmt=x offset_width=08 size_fmt=x size_width=04

  -n      do not generate perl code  (default when invoked as pstruct)
  -p      generate perl code         (default when invoked as c2ph)
  -v      generate perl code, with C decls as comments

  -i      do NOT recompute sizes for intrinsic datatypes
  -a      dump information on intrinsics also

  -t      trace execution
  -d      spew reams of debugging output

  -slist  give comma-separated list of structures to dump

DESCRIPTION

The following is the old c2ph.doc documentation by Tom Christiansen <tchrist@perl.com> Date: 25 Jul 91 08:10:21 GMT

Once upon a time, I wrote a program called pstruct. It was a perl program that tried to parse out C structures and display their member offsets for you. This was especially useful for people looking at binary dumps or poking around the kernel.

Pstruct was not a pretty program. Neither was it particularly robust. The problem, you see, was that the C compiler was much better at parsing C than I could ever hope to be.

So I got smart: I decided to be lazy and let the C compiler parse the C, which would spit out debugger stabs for me to read. These were much easier to parse. It's still not a pretty program, but at least it's more robust.

Pstruct takes any .c or .h files, or preferably .s ones, since that's the format it is going to massage them into anyway, and spits out listings like this:

  struct tty {
    int                 tty.t_locker            000    4
    int                 tty.t_mutex_index       004    4
    struct tty *        tty.t_tp_virt           008    4
    struct clist        tty.t_rawq              00c   20
      int               tty.t_rawq.c_cc         00c    4
      int               tty.t_rawq.c_cmax       010    4
      int               tty.t_rawq.c_cfx        014    4
      int               tty.t_rawq.c_clx        018    4
      struct tty *      tty.t_rawq.c_tp_cpu     01c    4
      struct tty *      tty.t_rawq.c_tp_iop     020    4
      unsigned char *   tty.t_rawq.c_buf_cpu    024    4
      unsigned char *   tty.t_rawq.c_buf_iop    028    4
    struct clist        tty.t_canq              02c   20
      int               tty.t_canq.c_cc         02c    4
      int               tty.t_canq.c_cmax       030    4
      int               tty.t_canq.c_cfx        034    4
      int               tty.t_canq.c_clx        038    4
      struct tty *      tty.t_canq.c_tp_cpu     03c    4
      struct tty *      tty.t_canq.c_tp_iop     040    4
      unsigned char *   tty.t_canq.c_buf_cpu    044    4
      unsigned char *   tty.t_canq.c_buf_iop    048    4
    struct clist        tty.t_outq              04c   20
      int               tty.t_outq.c_cc         04c    4
      int               tty.t_outq.c_cmax       050    4
      int               tty.t_outq.c_cfx        054    4
      int               tty.t_outq.c_clx        058    4
      struct tty *      tty.t_outq.c_tp_cpu     05c    4
      struct tty *      tty.t_outq.c_tp_iop     060    4
      unsigned char *   tty.t_outq.c_buf_cpu    064    4
      unsigned char *   tty.t_outq.c_buf_iop    068    4
    (*int)()            tty.t_oproc_cpu         06c    4
    (*int)()            tty.t_oproc_iop         070    4
    (*int)()            tty.t_stopproc_cpu      074    4
    (*int)()            tty.t_stopproc_iop      078    4
    struct thread *     tty.t_rsel              07c    4

etc.

Actually, this was generated by a particular set of options. You can control the formatting of each column, whether you prefer wide or fat, hex or decimal, leading zeroes or whatever.

All you need to be able to use this is a C compiler that generates BSD/GCC-style stabs. The -g option on native BSD compilers and GCC should get this for you.

To learn more, just type a bogus option, like -\?, and a long usage message will be provided. There are a fair number of possibilities.

If you're only a C programmer, then this is the end of the message for you. You can quit right now, and if you care to, save off the source and run it when you feel like it. Or not.

But if you're a perl programmer, then for you I have something much more wondrous than just a structure offset printer.

You see, if you call pstruct by its other incybernation, c2ph, you have a code generator that translates C code into perl code! Well, structure and union declarations at least, but that's quite a bit.

Prior to this point, anyone programming in perl who wanted to interact with C programs, like the kernel, was forced to guess the layouts of the C structures, and then hardwire these into his program. Of course, when you took your wonderfully crafted program to a system where the sgtty structure was laid out differently, your program broke. Which is a shame.

We've had Larry's h2ph translator, which helped, but that only works on cpp symbols, not real C, which was also very much needed. What I offer you is a symbolic way of getting at all the C structures. I've couched them in terms of packages and functions. Consider the following program:

  #!/usr/local/bin/perl

  require 'syscall.ph';
  require 'sys/time.ph';
  require 'sys/resource.ph';

  $ru = "\0" x &rusage'sizeof();

  syscall(&SYS_getrusage, &RUSAGE_SELF, $ru) && die "getrusage: $!";

  @ru = unpack($t = &rusage'typedef(), $ru);

  $utime = $ru[ &rusage'ru_utime + &timeval'tv_sec ]
         + ($ru[ &rusage'ru_utime + &timeval'tv_usec ]) / 1e6;

  $stime = $ru[ &rusage'ru_stime + &timeval'tv_sec ]
         + ($ru[ &rusage'ru_stime + &timeval'tv_usec ]) / 1e6;

  printf "you have used %8.3fs+%8.3fu seconds.\n", $utime, $stime;

As you see, the name of the package is the name of the structure. Regular fields are just their own names. Plus the following accessor functions are provided for your convenience:

  struct      This takes no arguments, and is merely the number of first-level
              elements in the structure.  You would use this for indexing
              into arrays of structures, perhaps like this

                  $usec = $u[ &user'u_utimer
                              + (&ITIMER_VIRTUAL * &itimerval'struct)
                              + &itimerval'it_value
                              + &timeval'tv_usec
                            ];

  sizeof      Returns the bytes in the structure, or the member if
              you pass it an argument, such as

                  &rusage'sizeof(&rusage'ru_utime)

  typedef     This is the perl format definition for passing to pack and
              unpack.  If you ask for the typedef of a nothing, you get
              the whole structure, otherwise you get that of the member
              you ask for.  Padding is taken care of, as is the magic to
              guarantee that a union is unpacked into all its aliases.
              Bitfields are not quite yet supported however.

  offsetof    This function is the byte offset into the array of that
              member.  You may wish to use this for indexing directly
              into the packed structure with vec() if you're too lazy
              to unpack it.

  typeof      Not to be confused with the typedef accessor function, this
              one returns the C type of that field.  This would allow
              you to print out a nice structured pretty print of some
              structure without knowing anything about it beforehand.
              No args to this one is a noop.  Someday I'll post such
              a thing to dump out your u structure for you.
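The typedef accessor described above returns a template string for pack and unpack. As a rough illustration of what such a template does, and independent of any real c2ph output, the sketch below hand-writes a template for a hypothetical structure with two signed 32-bit fields. The layout, the template and the variable names are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical layout: two signed 32-bit fields (seconds, microseconds).
my $template = 'l l';

# Build the raw bytes, as a system call might hand them back.
my $packed = pack($template, 12, 345_678);

# Split the bytes back into fields using the same template.
my ($sec, $usec) = unpack($template, $packed);

my $size    = length($packed);
my $elapsed = $sec + $usec / 1e6;

printf "%d bytes, %.6f seconds\n", $size, $elapsed;   # 8 bytes, 12.345678 seconds
```

With generated accessors, the template and the field positions would come from the structure's package instead of being written out by hand.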

The way I see this being used is like basically this:

  % h2ph <some_include_file.h  >  /usr/lib/perl/tmp.ph
  % c2ph  some_include_file.h  >> /usr/lib/perl/tmp.ph
  % install

It's a little trickier with c2ph because you have to get the includes right. I can't know this for your system, but it's not usually too terribly difficult.

The code isn't pretty as I mentioned -- I never thought it would be a 1000-line program when I started, or I might not have begun. :-) But I would have been less cavalier in how the parts of the program communicated with each other, etc. It might also have helped if I didn't have to divine the makeup of the stabs on the fly, and then account for micro differences between my compiler and gcc.

Anyway, here it is. Should run on perl v4 or greater. Maybe less.

  --tom
 
charnames

NAME

charnames - access to Unicode character names and named character sequences; also define character names

SYNOPSIS

  use charnames ':full';
  print "\N{GREEK SMALL LETTER SIGMA} is called sigma.\n";
  print "\N{LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW}",
        " is an officially named sequence of two Unicode characters\n";

  use charnames ':loose';
  print "\N{Greek small-letter sigma}",
        "can be used to ignore case, underscores, most blanks,",
        "and when you aren't sure if the official name has hyphens\n";

  use charnames ':short';
  print "\N{greek:Sigma} is an upper-case sigma.\n";

  use charnames qw(cyrillic greek);
  print "\N{sigma} is Greek sigma, and \N{be} is Cyrillic b.\n";

  use utf8;
  use charnames ":full", ":alias" => {
      e_ACUTE => "LATIN SMALL LETTER E WITH ACUTE",
      mychar => 0xE8000,  # Private use area
      "自転車に乗る人" => "BICYCLIST"
  };
  print "\N{e_ACUTE} is a small letter e with an acute.\n";
  print "\N{mychar} allows me to name private use characters.\n";
  print "And I can create synonyms in other languages,",
        " such as \N{自転車に乗る人} for BICYCLIST (U+1F6B4)\n";

  use charnames ();
  print charnames::viacode(0x1234); # prints "ETHIOPIC SYLLABLE SEE"
  printf "%04X", charnames::vianame("GOTHIC LETTER AHSA"); # prints
                                                           # "10330"
  print charnames::vianame("LATIN CAPITAL LETTER A");      # prints 65 on
                                                           # ASCII platforms;
                                                           # 193 on EBCDIC
  print charnames::string_vianame("LATIN CAPITAL LETTER A"); # prints "A"

DESCRIPTION

Pragma use charnames is used to gain access to the names of the Unicode characters and named character sequences, and to allow you to define your own character and character sequence names.

All forms of the pragma enable use of the following 3 functions:

  • charnames::string_vianame(name)

  • charnames::vianame(name)

  • charnames::viacode(code)

Starting in Perl v5.16, any occurrence of \N{CHARNAME} sequences in a double-quotish string automatically loads this module with arguments :full and :short (described below) if it hasn't already been loaded with different arguments, in order to compile the named Unicode character into position in the string. Prior to v5.16, an explicit use charnames was required to enable this usage. (However, prior to v5.16, the form "use charnames ();" did not enable \N{CHARNAME}.)

Note that \N{U+...}, where the ... is a hexadecimal number, also inserts a character into a string. The character it inserts is the one whose code point (ordinal value) is equal to the number. For example, "\N{U+263a}" is the Unicode (white background, black foreground) smiley face equivalent to "\N{WHITE SMILING FACE}". Also note that \N{...} can mean a regex quantifier instead of a character name when the ... is a number or a comma-separated pair of numbers (see QUANTIFIERS in perlreref); that usage is not related to this pragma.

The charnames pragma supports arguments :full , :loose , :short , script names and customized aliases.

If :full is present, for expansion of \N{CHARNAME}, the string CHARNAME is first looked up in the list of standard Unicode character names.

:loose is a variant of :full which allows CHARNAME to be less precisely specified. Details are in LOOSE MATCHES.

If :short is present, and CHARNAME has the form SCRIPT:CNAME, then CNAME is looked up as a letter in script SCRIPT, as described in the next paragraph. Or, if use charnames is used with script name arguments, then for \N{CHARNAME} the name CHARNAME is looked up as a letter in the given scripts (in the specified order). Customized aliases can override these, and are explained in CUSTOM ALIASES.

For lookup of CHARNAME inside a given script SCRIPTNAME, this pragma looks in the table of standard Unicode names for the names

    SCRIPTNAME CAPITAL LETTER CHARNAME
    SCRIPTNAME SMALL LETTER CHARNAME
    SCRIPTNAME LETTER CHARNAME

If CHARNAME is all lowercase, then the CAPITAL variant is ignored, otherwise the SMALL variant is ignored, and both CHARNAME and SCRIPTNAME are converted to all uppercase for look-up. Other than that, both of them follow loose rules if :loose is also specified; strict otherwise.

Note that \N{...} is compile-time; it's a special form of string constant used inside double-quotish strings. This means that you cannot use variables inside the \N{...}. If you want similar run-time functionality, use charnames::string_vianame().

Note, starting in Perl 5.18, the name BELL refers to the Unicode character U+1F514, instead of the traditional U+0007. For the latter, use ALERT or BEL.

It is a syntax error to use \N{NAME} where NAME is unknown.

For \N{NAME} , it is a fatal error if use bytes is in effect and the input name is that of a character that won't fit into a byte (i.e., whose ordinal is above 255).

Otherwise, any string that includes a \N{charname} or \N{U+code point} will automatically have Unicode semantics (see Byte and Character Semantics in perlunicode).

LOOSE MATCHES

By specifying :loose , Unicode's loose character name matching rules are selected instead of the strict exact match used otherwise. That means that CHARNAME doesn't have to be so precisely specified. Upper/lower case doesn't matter (except with scripts as mentioned above), nor do any underscores, and the only hyphens that matter are those at the beginning or end of a word in the name (with one exception: the hyphen in U+1180 HANGUL JUNGSEONG O-E does matter). Also, blanks not adjacent to hyphens don't matter. The official Unicode names are quite variable as to where they use hyphens versus spaces to separate word-like units, and this option allows you to not have to care as much. The reason non-medial hyphens matter is because of cases like U+0F60 TIBETAN LETTER -A versus U+0F68 TIBETAN LETTER A . The hyphen here is significant, as is the space before it, and so both must be included.

:loose slows down look-ups by a factor of 2 to 3 versus :full , but the trade-off may be worth it to you. Each individual look-up takes very little time, and the results are cached, so the speed difference would become a factor only in programs that do look-ups of many different spellings, and probably only when those look-ups are through vianame() and string_vianame() , since \N{...} look-ups are done at compile time.
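As a rough illustration of loose matching, the sketch below looks up the same character twice at run time: once with the official spelling and once with an all-lowercase spelling that only :loose is expected to accept. The variable names are illustrative and not part of the module.

```perl
use strict;
use warnings;
use charnames ':loose';

# Official spelling, accepted under both :full and :loose.
my $official = charnames::string_vianame("LATIN SMALL LETTER E WITH ACUTE");

# Lowercase spelling; case is ignored under :loose.
my $loose_spelling = charnames::string_vianame("latin small letter e with acute");

if (defined $official && defined $loose_spelling && $official eq $loose_spelling) {
    print "both spellings name the same character\n";
}
```

Because results are cached, repeating either look-up later in the program should cost very little.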

ALIASES

Starting in Unicode 6.1 and Perl v5.16, Unicode defines many abbreviations and names that were formerly Perl extensions, and some additional ones that Perl did not previously accept. The list is getting too long to reproduce here, but you can get the complete list from the Unicode web site: http://www.unicode.org/Public/UNIDATA/NameAliases.txt.

Earlier versions of Perl accepted almost all the 6.1 names. These were most extensively documented in the v5.14 version of this pod: http://perldoc.perl.org/5.14.0/charnames.html#ALIASES.

CUSTOM ALIASES

You can add customized aliases to standard (:full) Unicode naming conventions. The aliases override any standard definitions, so, if you're twisted enough, you can change "\N{LATIN CAPITAL LETTER A}" to mean "B", etc.

Aliases must begin with a character that is alphabetic. After that, each may contain any combination of word (\w ) characters, SPACE (U+0020), HYPHEN-MINUS (U+002D), LEFT PARENTHESIS (U+0028), RIGHT PARENTHESIS (U+0029), and NO-BREAK SPACE (U+00A0). These last three should never have been allowed in names, and are retained for backwards compatibility only; they may be deprecated and removed in future releases of Perl, so don't use them for new names. (More precisely, the first character of a name you specify must be something that matches all of \p{ID_Start} , \p{Alphabetic} , and \p{Gc=Letter} . This makes sure it is what any reasonable person would view as an alphabetic character. And, the continuation characters that match \w must also match \p{ID_Continue} .) Starting with Perl v5.18, any Unicode characters meeting the above criteria may be used; prior to that only Latin1-range characters were acceptable.

An alias can map to either an official Unicode character name (not a loose matched name) or to a numeric code point (ordinal). The latter is useful for assigning names to code points in Unicode private use areas such as U+E800 through U+F8FF. A numeric code point must be a non-negative integer or a string beginning with "U+" or "0x" with the remainder considered to be a hexadecimal integer. A literal numeric constant must be unsigned; it will be interpreted as hex if it has a leading zero or contains non-decimal hex digits; otherwise it will be interpreted as decimal.

Aliases are added either by the use of anonymous hashes:

    use charnames ":alias" => {
        e_ACUTE => "LATIN SMALL LETTER E WITH ACUTE",
        mychar1 => 0xE8000,
    };
    my $str = "\N{e_ACUTE}";

or by using a file containing aliases:

    use charnames ":alias" => "pro";

This will try to read "unicore/pro_alias.pl" from the @INC path. This file should return a list in plain perl:

    (
        A_GRAVE   => "LATIN CAPITAL LETTER A WITH GRAVE",
        A_CIRCUM  => "LATIN CAPITAL LETTER A WITH CIRCUMFLEX",
        A_DIAERES => "LATIN CAPITAL LETTER A WITH DIAERESIS",
        A_TILDE   => "LATIN CAPITAL LETTER A WITH TILDE",
        A_BREVE   => "LATIN CAPITAL LETTER A WITH BREVE",
        A_RING    => "LATIN CAPITAL LETTER A WITH RING ABOVE",
        A_MACRON  => "LATIN CAPITAL LETTER A WITH MACRON",
        mychar2   => "U+E8001",
    );

Both these methods insert ":full" automatically as the first argument (if no other argument is given), and you can give the ":full" explicitly as well, like

    use charnames ":full", ":alias" => "pro";

":loose" has no effect with these. Input names must match exactly, using ":full" rules.

Also, both these methods currently allow only single characters to be named. To name a sequence of characters, use a custom translator (described below).

charnames::string_vianame(name)

This is a runtime equivalent to \N{...}. name can be any expression that evaluates to a name accepted by \N{...} under the :full option to charnames. In addition, any other options for the controlling "use charnames" in the same scope apply, like :loose or any script list, :short option, or custom aliases you may have defined.

The only differences are due to the fact that string_vianame is run-time and \N{} is compile-time. You can't interpolate inside a \N{} (so \N{$variable} doesn't work); and if the input name is unknown, string_vianame returns undef instead of it being a syntax error.
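The two differences described above can be sketched as follows. The variable names and the deliberately bogus character name are made up for illustration.

```perl
use strict;
use warnings;
use charnames ':full';

# The name is held in a variable, which \N{...} cannot handle.
my $name = "LATIN SMALL LETTER E WITH ACUTE";
my $char = charnames::string_vianame($name);
printf "found U+%04X\n", ord $char if defined $char;

# An unknown name gives undef at run time rather than a syntax error.
my $missing = charnames::string_vianame("NO SUCH CHARACTER NAME");
print "unknown name returned undef\n" unless defined $missing;
```

Checking the result with defined before using it is a reasonable habit whenever the name comes from user input or a file.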

charnames::vianame(name)

This is similar to string_vianame . The main difference is that under most circumstances, vianame returns an ordinal code point, whereas string_vianame returns a string. For example,

    printf "U+%04X", charnames::vianame("FOUR TEARDROP-SPOKED ASTERISK");

prints "U+2722".

This leads to the other two differences. Since a single code point is returned, the function can't handle named character sequences, as these are composed of multiple characters (it returns undef for these). And the code point can be that of any character, even ones that aren't legal under the use bytes pragma.

See BUGS for the circumstances in which the behavior differs from that described above.

charnames::viacode(code)

Returns the full name of the character indicated by the numeric code. For example,

    print charnames::viacode(0x2722);

prints "FOUR TEARDROP-SPOKED ASTERISK".

The name returned is the "best" (defined below) official name or alias for the code point, if available; otherwise your custom alias for it, if defined; otherwise undef. This means that your alias will only be returned for code points that don't have an official Unicode name (nor alias) such as private use code points.

If you define more than one name for the code point, it is indeterminate which one will be returned.

As mentioned, the function returns undef if no name is known for the code point. In Unicode the proper name for these is the empty string, which undef stringifies to. (If you ask for a code point past the legal Unicode maximum of U+10FFFF that you haven't assigned an alias to, you get undef plus a warning.)

The input number must be a non-negative integer, or a string beginning with "U+" or "0x" with the remainder considered to be a hexadecimal integer. A literal numeric constant must be unsigned; it will be interpreted as hex if it has a leading zero or contains non-decimal hex digits; otherwise it will be interpreted as decimal.
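To make the accepted input forms concrete, here is a small sketch that passes the same code point to viacode as a numeric literal, as a "U+" string, and as a "0x" string. The array name is illustrative only.

```perl
use strict;
use warnings;
use charnames ();

# The same code point written three ways: a number, "U+hex", and "0xhex".
my @names = map { charnames::viacode($_) } (0x2722, "U+2722", "0x2722");

print "$_\n" for @names;   # each line reads FOUR TEARDROP-SPOKED ASTERISK
```

The string forms are handy when the code point arrives as text, for instance from a configuration file, and has not been converted to a number.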

As mentioned above under ALIASES, Unicode 6.1 defines extra names (synonyms or aliases) for some code points, most of which were already available as Perl extensions. All these are accepted by \N{...} and the other functions in this module, but viacode has to choose which one name to return for a given input code point, so it returns the "best" name. To understand how this works, it is helpful to know more about the Unicode name properties. All code points actually have only a single name, which (starting in Unicode 2.0) can never change once a character has been assigned to the code point. But mistakes have been made in assigning names, for example sometimes a clerical error was made during the publishing of the Standard which caused words to be misspelled, and there was no way to correct those. The Name_Alias property was eventually created to handle these situations. If a name was wrong, a corrected synonym would be published for it, using Name_Alias. viacode will return that corrected synonym as the "best" name for a code point. (It is even possible, though it hasn't happened yet, that the correction itself will need to be corrected, and so another Name_Alias can be created for that code point; viacode will return the most recent correction.)

The Unicode name for each of the control characters (such as LINE FEED) is the empty string. However almost all had names assigned by other standards, such as the ASCII Standard, or were in common use. viacode returns these names as the "best" ones available. Unicode 6.1 has created Name_Aliases for each of them, including alternate names, like NEW LINE. viacode uses the original name, "LINE FEED" in preference to the alternate. Similarly the name returned for U+FEFF is "ZERO WIDTH NO-BREAK SPACE", not "BYTE ORDER MARK".

Until Unicode 6.1, the 4 control characters U+0080, U+0081, U+0084, and U+0099 did not have names nor aliases. To preserve backwards compatibility, any alias you define for these code points will be returned by this function, in preference to the official name.

Some code points also have abbreviated names, such as "LF" or "NL". viacode never returns these.

Because a name correction may be added in future Unicode releases, the name that viacode returns may change as a result. This is a rare event, but it does happen.

CUSTOM TRANSLATORS

The mechanism of translation of \N{...} escapes is general and not hardwired into charnames.pm. A module can install custom translations (inside the scope which uses the module) with the following magic incantation:

    sub import {
        shift;
        $^H{charnames} = \&translator;
    }

Here translator() is a subroutine which takes CHARNAME as an argument, and returns text to insert into the string instead of the \N{CHARNAME} escape.

This is the only way you can create a custom named sequence of code points.

Since the text to insert should be different in bytes mode and out of it, the function should check the current state of the bytes flag, as in:

    use bytes ();    # for $bytes::hint_bits
    sub translator {
        if ($^H & $bytes::hint_bits) {
            return bytes_translator(@_);
        }
        else {
            return utf8_translator(@_);
        }
    }

See CUSTOM ALIASES above for restrictions on CHARNAME.

Of course, vianame , viacode , and string_vianame would need to be overridden as well.

BUGS

vianame() normally returns an ordinal code point, but when the input name is of the form U+... , it returns a chr instead. In this case, if use bytes is in effect and the character won't fit into a byte, it returns undef and raises a warning.

Since evaluation of the translation function (see CUSTOM TRANSLATORS) happens in the middle of compilation (of a string literal), the translation function should not do any evals or requires. This restriction should be lifted (but is low priority) in a future version of Perl.

 
config_data

Perl 5 version 18.2 documentation

NAME

config_data - Query or change configuration of Perl modules

SYNOPSIS

    # Get config/feature values
    config_data --module Foo::Bar --feature bazzable
    config_data --module Foo::Bar --config magic_number

    # Set config/feature values
    config_data --module Foo::Bar --set_feature bazzable=1
    config_data --module Foo::Bar --set_config magic_number=42

    # Print a usage message
    config_data --help

DESCRIPTION

The config_data tool provides a command-line interface to the configuration of Perl modules. By "configuration", we mean something akin to "user preferences" or "local settings". This is a formalization and abstraction of the systems that people like Andreas Koenig (CPAN::Config ), Jon Swartz (HTML::Mason::Config ), Andy Wardley (Template::Config ), and Larry Wall (perl's own Config.pm) have developed independently.

The configuration system employed here was developed in the context of Module::Build. Under this system, configuration information for a module Foo, for example, is stored in a module called Foo::ConfigData (I would have called it Foo::Config, but that was taken by all those other systems mentioned in the previous paragraph...). These ...::ConfigData modules contain the configuration data, as well as publicly accessible methods for querying and setting (yes, actually re-writing) the configuration data. The config_data script (whose docs you are currently reading) is merely a front-end for those methods. If you wish, you may create alternate front-ends.

The two types of data that may be stored are called config values and feature values. A config value may be any perl scalar, including references to complex data structures. It must, however, be serializable using Data::Dumper . A feature is a boolean (1 or 0) value.

USAGE

This script functions as a basic getter/setter wrapper around the configuration of a single module. On the command line, specify which module's configuration you're interested in, and pass options to get or set config or feature values. The following options are supported:

  • module

    Specifies the name of the module to configure (required).

  • feature

    When passed the name of a feature , shows its value. The value will be 1 if the feature is enabled, 0 if the feature is not enabled, or empty if the feature is unknown. When no feature name is supplied, the names and values of all known features will be shown.

  • config

    When passed the name of a config entry, shows its value. The value will be displayed using Data::Dumper (or similar) as perl code. When no config name is supplied, the names and values of all known config entries will be shown.

  • set_feature

    Sets the given feature to the given boolean value. Specify the value as either 1 or 0.

  • set_config

    Sets the given config entry to the given value.

  • eval

    If the --eval option is used, the values in set_config will be evaluated as perl code before being stored. This allows moderately complicated data structures to be stored. For really complicated structures, you probably shouldn't use this command-line interface, just use the Perl API instead.

  • help

    Prints a help message, including a few examples, and exits.

AUTHOR

Ken Williams, kwilliams@cpan.org

COPYRIGHT

Copyright (c) 1999, Ken Williams. All rights reserved.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

Module::Build(3), perl(1).

 
constant

Perl 5 version 18.2 documentation

NAME

constant - Perl pragma to declare constants

SYNOPSIS

    use constant PI    => 4 * atan2(1, 1);
    use constant DEBUG => 0;

    print "Pi equals ", PI, "...\n" if DEBUG;

    use constant {
        SEC   => 0,
        MIN   => 1,
        HOUR  => 2,
        MDAY  => 3,
        MON   => 4,
        YEAR  => 5,
        WDAY  => 6,
        YDAY  => 7,
        ISDST => 8,
    };

    use constant WEEKDAYS => qw(
        Sunday Monday Tuesday Wednesday Thursday Friday Saturday
    );

    print "Today is ", (WEEKDAYS)[ (localtime)[WDAY] ], ".\n";

DESCRIPTION

This pragma allows you to declare constants at compile-time.

When you declare a constant such as PI using the method shown above, each machine your script runs upon can have as many digits of accuracy as it can use. Also, your program will be easier to read, more likely to be maintained (and maintained correctly), and far less likely to send a space probe to the wrong planet because nobody noticed the one equation in which you wrote 3.14195 .

When a constant is used in an expression, Perl replaces it with its value at compile time, and may then optimize the expression further. In particular, any code in an if (CONSTANT) block will be optimized away if the constant is false.

NOTES

As with all use directives, defining a constant happens at compile time. Thus, it's probably not correct to put a constant declaration inside of a conditional statement (like if ($foo) { use constant ... } ).

Constants defined using this module cannot be interpolated into strings like variables. However, concatenation works just fine:

    print "Pi equals PI...\n";        # WRONG: does not expand "PI"
    print "Pi equals ".PI."...\n";    # right

Even though a reference may be declared as a constant, the reference may point to data which may be changed, as this code shows.

    use constant ARRAY => [ 1,2,3,4 ];
    print ARRAY->[1];
    ARRAY->[1] = " be changed";
    print ARRAY->[1];

Dereferencing constant references incorrectly (such as using an array subscript on a constant hash reference, or vice versa) will be trapped at compile time.

Constants belong to the package they are defined in. To refer to a constant defined in another package, specify the full package name, as in Some::Package::CONSTANT . Constants may be exported by modules, and may also be called as either class or instance methods, that is, as Some::Package->CONSTANT or as $obj->CONSTANT where $obj is an instance of Some::Package . Subclasses may define their own constants to override those in their base class.
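The method-call and override behaviour described above might look like the following sketch. The packages Shape and Square, the constant SIDES, and the describe method are hypothetical names invented for this example.

```perl
use strict;
use warnings;

package Shape;
use constant SIDES => 0;
sub new      { my $class = shift; return bless {}, $class }
# Calling the constant as a method lets a subclass supply its own value.
sub describe { my $self = shift; return ref($self) . " has " . $self->SIDES() . " sides" }

package Square;
our @ISA = ('Shape');
use constant SIDES => 4;    # overrides Shape::SIDES for Square objects

package main;

my $via_class    = Shape->SIDES;            # class-method call
my $via_instance = Square->new->describe;   # instance-method call inside describe

print "$via_class\n";       # prints 0
print "$via_instance\n";    # prints Square has 4 sides
```

Note that a constant reached through a method call is resolved at run time, so it does not get the compile-time inlining that a plain SIDES would.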

The use of all caps for constant names is merely a convention, although it is recommended in order to make constants stand out and to help avoid collisions with other barewords, keywords, and subroutine names. Constant names must begin with a letter or underscore. Names beginning with a double underscore are reserved. Some poor choices for names will generate warnings, if warnings are enabled at compile time.

List constants

Constants may be lists of more (or less) than one value. A constant with no values evaluates to undef in scalar context. Note that constants with more than one value do not return their last value in scalar context as one might expect. They currently return the number of values, but this may change in the future. Do not use constants with multiple values in scalar context.

NOTE: This implies that the expression defining the value of a constant is evaluated in list context. This may produce surprises:

    use constant TIMESTAMP => localtime;           # WRONG!
    use constant TIMESTAMP => scalar localtime;    # right

The first line above defines TIMESTAMP as a 9-element list, as returned by localtime() in list context. To set it to the string returned by localtime() in scalar context, an explicit scalar keyword is required.

List constants are lists, not arrays. To index or slice them, they must be placed in parentheses.

    my @workdays = WEEKDAYS[1 .. 5];      # WRONG!
    my @workdays = (WEEKDAYS)[1 .. 5];    # right

Defining multiple constants at once

Instead of writing multiple use constant statements, you may define multiple constants in a single statement by giving, instead of the constant name, a reference to a hash where the keys are the names of the constants to be defined. Obviously, all constants defined using this method must have a single value.

    use constant {
        FOO => "A single value",
        BAR => "This", "won't", "work!",    # Error!
    };

This is a fundamental limitation of the way hashes are constructed in Perl. The error messages produced when this happens will often be quite cryptic -- in the worst case there may be none at all, and you'll only later find that something is broken.

When defining multiple constants, you cannot use the values of other constants defined in the same declaration. This is because the calling package doesn't know about any constant within that group until after the use statement is finished.

    use constant {
        BITMASK => 0xAFBAEBA8,
        NEGMASK => ~BITMASK,    # Error!
    };
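One way around this limitation is to split the declaration into two use statements, so that the first constant exists by the time the second is compiled. The sketch below does that; the extra "& 0xFFFFFFFF" is an addition for this example, there only to keep the result at 32 bits regardless of the platform's integer size.

```perl
use strict;
use warnings;

# Declared first, so it is already defined when the next statement compiles.
use constant BITMASK => 0xAFBAEBA8;

# Safe: BITMASK is a known constant here. The mask keeps the value to 32 bits.
use constant NEGMASK => ~BITMASK() & 0xFFFFFFFF;

printf "%08X\n", NEGMASK;    # prints 50451457
```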

Magic constants

Magical values and references can be made into constants at compile time, allowing for way cool stuff like this. (These error numbers aren't totally portable, alas.)

    use constant E2BIG => ($! = 7);
    print E2BIG, "\n";      # something like "Arg list too long"
    print 0+E2BIG, "\n";    # "7"

You can't produce a tied constant by giving a tied scalar as the value. References to tied variables, however, can be used as constants without any problems.

TECHNICAL NOTES

In the current implementation, scalar constants are actually inlinable subroutines. As of version 5.004 of Perl, the appropriate scalar constant is inserted directly in place of some subroutine calls, thereby saving the overhead of a subroutine call. See Constant Functions in perlsub for details about how and when this happens.

In the rare case in which you need to discover at run time whether a particular constant has been declared via this module, you may use this function to examine the hash %constant::declared . If the given constant name does not include a package name, the current package is used.

    sub declared ($) {
        use constant 1.01;    # don't omit this!
        my $name = shift;
        $name =~ s/^::/main::/;
        my $pkg = caller;
        my $full_name = $name =~ /::/ ? $name : "${pkg}::$name";
        $constant::declared{$full_name};
    }

CAVEATS

In the current version of Perl, list constants are not inlined and some symbols may be redefined without generating a warning.

It is not possible to have a subroutine or a keyword with the same name as a constant in the same package. This is probably a Good Thing.

A constant with a name in the list STDIN STDOUT STDERR ARGV ARGVOUT ENV INC SIG is not allowed anywhere but in package main:: , for technical reasons.

Unlike constants in some languages, these cannot be overridden on the command line or via environment variables.

You can get into trouble if you use constants in a context which automatically quotes barewords (as is true for any subroutine call). For example, you can't say $hash{CONSTANT} because CONSTANT will be interpreted as a string. Use $hash{CONSTANT()} or $hash{+CONSTANT} to prevent the bareword quoting mechanism from kicking in. Similarly, since the => operator quotes a bareword immediately to its left, you have to say CONSTANT() => 'value' (or simply use a comma in place of the big arrow) instead of CONSTANT => 'value' .
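The bareword-quoting pitfall is easier to see side by side. In this sketch the constant KEY and the hashes %paint and %by_value are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;
use constant KEY => 'color';

# The second pair uses the literal string "KEY" because => quotes the bareword.
my %paint = (color => 'red', KEY => 'literal key');

my $quoted = $paint{KEY};      # bareword is quoted: looks up the string "KEY"
my $plus   = $paint{+KEY};     # unary plus defeats quoting: looks up "color"
my $call   = $paint{KEY()};    # explicit call does the same

# On the left of =>, KEY() is needed to use the constant's value as the key.
my %by_value = (KEY() => 'blue');

print "$quoted, $plus, $call, ", join(",", keys %by_value), "\n";
# prints: literal key, red, red, color
```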

SEE ALSO

Readonly - Facility for creating read-only scalars, arrays, hashes.

Attribute::Constant - Make read-only variables via attribute

Scalar::Readonly - Perl extension to the SvREADONLY scalar flag

Hash::Util - A selection of general-utility hash subroutines (mostly to lock/unlock keys and values)

BUGS

Please report any bugs or feature requests via the perlbug(1) utility.

AUTHORS

Tom Phoenix, <rootbeer@redcat.com>, with help from many other folks.

Multiple constant declarations at once added by Casey West, <casey@geeknest.com>.

Documentation mostly rewritten by Ilmari Karonen, <perl@itz.pp.sci.fi>.

This program is maintained by the Perl 5 Porters. The CPAN distribution is maintained by Sébastien Aperghis-Tramoni <sebastien@aperghis.net>.

COPYRIGHT & LICENSE

Copyright (C) 1997, 1999 Tom Phoenix

This module is free software; you can redistribute it or modify it under the same terms as Perl itself.

 
corelist

Perl 5 version 18.2 documentation

NAME

corelist - a commandline frontend to Module::CoreList

DESCRIPTION

See Module::CoreList for a description.

SYNOPSIS

    corelist -v
    corelist [-a|-d] <ModuleName> | /<ModuleRegex>/ [<ModuleVersion>] ...
    corelist [-v <PerlVersion>] [ <ModuleName> | /<ModuleRegex>/ ] ...
    corelist [-r <PerlVersion>] ...
    corelist --feature <FeatureName> [<FeatureName>] ...
    corelist --diff PerlVersion PerlVersion
    corelist --upstream <ModuleName>

OPTIONS

  • -a

    lists all versions of the given module (or the matching modules, in case you used a module regexp) in the perls Module::CoreList knows about.

        corelist -a Unicode

        Unicode was first released with perl v5.6.2
          v5.6.2     3.0.1
          v5.8.0     3.2.0
          v5.8.1     4.0.0
          v5.8.2     4.0.0
          v5.8.3     4.0.0
          v5.8.4     4.0.1
          v5.8.5     4.0.1
          v5.8.6     4.0.1
          v5.8.7     4.1.0
          v5.8.8     4.1.0
          v5.8.9     5.1.0
          v5.9.0     4.0.0
          v5.9.1     4.0.0
          v5.9.2     4.0.1
          v5.9.3     4.1.0
          v5.9.4     4.1.0
          v5.9.5     5.0.0
          v5.10.0    5.0.0
          v5.10.1    5.1.0
          v5.11.0    5.1.0
          v5.11.1    5.1.0
          v5.11.2    5.1.0
          v5.11.3    5.2.0
          v5.11.4    5.2.0
          v5.11.5    5.2.0
          v5.12.0    5.2.0
          v5.12.1    5.2.0
          v5.12.2    5.2.0
          v5.12.3    5.2.0
          v5.12.4    5.2.0
          v5.13.0    5.2.0
          v5.13.1    5.2.0
          v5.13.2    5.2.0
          v5.13.3    5.2.0
          v5.13.4    5.2.0
          v5.13.5    5.2.0
          v5.13.6    5.2.0
          v5.13.7    6.0.0
          v5.13.8    6.0.0
          v5.13.9    6.0.0
          v5.13.10   6.0.0
          v5.13.11   6.0.0
          v5.14.0    6.0.0
          v5.14.1    6.0.0
          v5.15.0    6.0.0

  • -d

    finds the first perl version where a module has been released by date, and not by version number (as is the default).

  • --diff

    Given two versions of perl, this prints a human-readable table of all module changes between the two. The output format may change in the future, and is meant for humans, not programs. For programs, use the Module::CoreList API.

  • -? or -help

    help! help! help! to see more help, try --man.

  • -man

    all of the help

  • -v

    lists all of the perl release versions we got the CoreList for.

    If you pass a version argument (value of $], like 5.00503 or 5.008008), you get a list of all the modules and their respective versions. (If you have the version module, you can also use new-style version numbers, like 5.8.8.)

    In module filtering context, it can be used as Perl version filter.

  • -r

    lists all of the perl releases and when they were released

    If you pass a perl version you get the release date for that version only.

  • --feature, -f

    lists the first version bundle of each named feature given

  • --upstream, -u

    Shows whether the given module is primarily maintained in the perl core or on CPAN, along with its bug tracker URL.

As a special case, if you specify the module name Unicode , you'll get the version number of the Unicode Character Database bundled with the requested perl versions.
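For scripts that need this information, the --diff entry above recommends the Module::CoreList API over parsing corelist output. A minimal sketch, assuming only that Module::CoreList is installed, is shown below; the variable names are illustrative, and File::Spec::Aliens is the same non-existent module used in the examples that follow.

```perl
use strict;
use warnings;
use Module::CoreList;

# Earliest perl release that shipped the module, or undef if it never did.
my $first   = Module::CoreList->first_release('File::Spec');
my $unknown = Module::CoreList->first_release('File::Spec::Aliens');

print "File::Spec first shipped with perl $first\n" if defined $first;
print "File::Spec::Aliens is not in the core list\n" unless defined $unknown;
```

The version reported depends on the data in the installed copy of Module::CoreList, so no particular number is shown here.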

EXAMPLES

    $ corelist File::Spec
    File::Spec was first released with perl 5.005

    $ corelist File::Spec 0.83
    File::Spec 0.83 was released with perl 5.007003

    $ corelist File::Spec 0.89
    File::Spec 0.89 was not in CORE (or so I think)

    $ corelist File::Spec::Aliens
    File::Spec::Aliens was not in CORE (or so I think)

    $ corelist /IPC::Open/
    IPC::Open2 was first released with perl 5
    IPC::Open3 was first released with perl 5

    $ corelist /MANIFEST/i
    ExtUtils::Manifest was first released with perl 5.001

    $ corelist /Template/
    /Template/ has no match in CORE (or so I think)

    $ corelist -v 5.8.8 B
    B 1.09_01

    $ corelist -v 5.8.8 /^B::/
    B::Asmdata      1.01
    B::Assembler    0.07
    B::Bblock       1.02_01
    B::Bytecode     1.01_01
    B::C            1.04_01
    B::CC           1.00_01
    B::Concise      0.66
    B::Debug        1.02_01
    B::Deparse      0.71
    B::Disassembler 1.05
    B::Lint         1.03
    B::O            1.00
    B::Showlex      1.02
    B::Stackobj     1.00
    B::Stash        1.00
    B::Terse        1.03_01
    B::Xref         1.01

COPYRIGHT

Copyright (c) 2002-2007 by D.H. aka PodMaster

Currently maintained by the perl 5 porters <perl5-porters@perl.org>.

This program is distributed under the same terms as perl itself. See http://perl.org/ or http://cpan.org/ for more info on that.

 
cpan

Perl 5 version 18.2 documentation

NAME

cpan - easily interact with CPAN from the command line

SYNOPSIS

    # with arguments and no switches, installs specified modules
    cpan module_name [ module_name ... ]

    # with switches, installs modules with extra behavior
    cpan [-cfgimtTw] module_name [ module_name ... ]

    # with just the dot, install from the distribution in the
    # current directory
    cpan .

    # without arguments, starts CPAN.pm shell
    cpan

    # dump the configuration
    cpan -J

    # load a different configuration to install Module::Foo
    cpan -j some/other/file Module::Foo

    # without arguments, but some switches
    cpan [-ahrvACDlLO]

DESCRIPTION

This script provides a command interface (not a shell) to CPAN. At the moment it uses CPAN.pm to do the work, but it is not a one-shot command runner for CPAN.pm.

Options

  • -a

    Creates a CPAN.pm autobundle with CPAN::Shell->autobundle.

  • -A module [ module ... ]

    Shows the primary maintainers for the specified modules.

  • -c module

    Runs a `make clean` in the specified module's directories.

  • -C module [ module ... ]

    Show the Changes files for the specified modules.

  • -D module [ module ... ]

    Show the module details.

  • -f

    Force the specified action when it would normally have failed. Use this to install a module even if its tests fail. With this option, -i is no longer optional: you must give it explicitly to force an install:

        % cpan -f -i Module::Foo

  • -F

    Turn off CPAN.pm's attempts to lock anything. You should be careful with this since you might end up with multiple scripts trying to muck in the same directory. This isn't so much of a concern if you're loading a special config with -j , and that config sets up its own work directories.

  • -g module [ module ... ]

    Downloads to the current directory the latest distribution of the module.

  • -G module [ module ... ]

    UNIMPLEMENTED

    Download to the current directory the latest distribution of the modules, unpack each distribution, and create a git repository for each distribution.

    If you want this feature, check out Yanick Champoux's Git::CPAN::Patch distribution.

  • -h

    Print a help message and exit. When you specify -h , it ignores all of the other options and arguments.

  • -i

    Install the specified modules.

  • -I

    Load local::lib (think like -I for loading lib paths).

  • -j Config.pm

    Load the file that has the CPAN configuration data. This should have the same format as the standard CPAN/Config.pm file, which defines $CPAN::Config as an anonymous hash.

  • -J

    Dump the configuration in the same format that CPAN.pm uses. This is useful for checking the configuration as well as using the dump as a starting point for a new, custom configuration.

  • -l

    List all installed modules with their versions.

  • -L author [ author ... ]

    List the modules by the specified authors.

  • -m

    Make the specified modules.

  • -O

    Show the out-of-date modules.

  • -p

    Ping the configured mirrors.

  • -P

    Find the best mirrors you could be using (but don't configure them just yet).

  • -r

    Recompiles dynamically loaded modules with CPAN::Shell->recompile.

  • -t

    Run a `make test` on the specified modules.

  • -T

    Do not test modules. Simply install them.

  • -u

    Upgrade all installed modules. Blindly doing this can really break things, so keep a backup.

  • -v

    Print the script version and CPAN.pm version then exit.

  • -V

    Print detailed information about the cpan client.

  • -w

    UNIMPLEMENTED

    Turn on cpan warnings. This checks various things, like directory permissions, and tells you about problems you might have.
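
As a rough illustration of the kind of file that -j expects, a minimal configuration might look like the sketch below. The file name, the paths, and the mirror URL are placeholders chosen for this example, and a real CPAN/Config.pm normally sets many more keys, so dumping your current settings with cpan -J is likely the safer starting point.

```perl
# my-config.pm: hypothetical file name, used as
#   cpan -j my-config.pm Module::Foo
# The paths and the mirror URL are placeholders for illustration only.
$CPAN::Config = {
    'cpan_home'         => q[/tmp/my-cpan],
    'build_dir'         => q[/tmp/my-cpan/build],
    'keep_source_where' => q[/tmp/my-cpan/sources],
    'urllist'           => [ q[http://www.cpan.org/] ],
};
1;
```

Giving the custom configuration its own work directories, as here, is also what makes -F less risky.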

Examples

    # print a help message
    cpan -h

    # print the version numbers
    cpan -v

    # create an autobundle
    cpan -a

    # recompile modules
    cpan -r

    # upgrade all installed modules
    cpan -u

    # install modules ( sole -i is optional )
    cpan -i Netscape::Bookmarks Business::ISBN

    # force install modules ( must use -i )
    cpan -fi CGI::Minimal URI

ENVIRONMENT VARIABLES

  • CPAN_OPTS

    cpan splits this variable on whitespace and prepends that list to @ARGV before it processes the command-line arguments. For instance, if you always want to use local::lib, you can set CPAN_OPTS to -I .
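
The effect of CPAN_OPTS can be pictured with a small sketch. The helper name effective_args is invented for this illustration and is not part of the cpan script; it only mimics the "split on whitespace, then prepend" behaviour described above.

```perl
use strict;
use warnings;

# Hypothetical helper: combine the words of a CPAN_OPTS-style string
# with the arguments given on the command line.
sub effective_args {
    my ($opts, @argv) = @_;
    # split ' ' splits on runs of whitespace and ignores leading blanks
    my @extra = defined $opts ? split(' ', $opts) : ();
    return (@extra, @argv);
}

my @args = effective_args('-I -T', 'Module::Foo');
print "@args\n";    # prints: -I -T Module::Foo
```

With CPAN_OPTS set to "-I -T", running cpan Module::Foo would therefore behave like cpan -I -T Module::Foo.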

EXIT VALUES

The script exits with zero if it thinks that everything worked, or a positive number if it thinks that something failed. Note, however, that in some cases it has to divine a failure by the output of things it does not control. For now, the exit codes are vague:

    1    An unknown error
    2    There was an external problem
    4    There was an internal problem with the script
    8    A module failed to install
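
A wrapper script could translate these values into messages along the following lines. This is only a sketch: the function name is made up, and it assumes each value is reported on its own, since the text above does not say whether the values can be combined.

```perl
use strict;
use warnings;

# Exit values as listed above; 0 means cpan thinks everything worked.
my %CPAN_EXIT = (
    0 => 'everything appeared to work',
    1 => 'an unknown error',
    2 => 'an external problem',
    4 => 'an internal problem with the script',
    8 => 'a module failed to install',
);

# Hypothetical helper, not part of the cpan script.
sub describe_cpan_exit {
    my ($code) = @_;
    return $CPAN_EXIT{$code} // "undocumented exit value $code";
}

# Typical use (commented out so that nothing is installed here):
#   system('cpan', '-i', 'Some::Module');
#   print describe_cpan_exit($? >> 8), "\n";
print describe_cpan_exit(8), "\n";    # prints: a module failed to install
```

Because cpan sometimes has to guess at failures from output it does not control, such a wrapper should probably treat these messages as hints rather than as a firm diagnosis.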

TO DO

* one shot configuration values from the command line

BUGS

* none noted

SEE ALSO

Most behaviour, including environment variables and configuration, comes directly from CPAN.pm.

SOURCE AVAILABILITY

This code is on GitHub:

    git://github.com/briandfoy/cpan_script.git

CREDITS

Japheth Cleaver added the bits to allow a forced install (-f).

Jim Brandt suggested and provided the initial implementation for the up-to-date and Changes features.

Adam Kennedy pointed out that exit() causes problems on Windows, where this script ends up with a .bat extension.

AUTHOR

brian d foy, <bdfoy@cpan.org>

COPYRIGHT

Copyright (c) 2001-2013, brian d foy, All Rights Reserved.

You may redistribute this under the same terms as Perl itself.

 
cpan2dist

NAME

cpan2dist - The CPANPLUS distribution creator

DESCRIPTION

This script will create distributions of CPAN modules of the format you specify, including its prerequisites. These packages can then be installed using the corresponding package manager for the format.

Note, you can also do this interactively from the default shell, CPANPLUS::Shell::Default . See the CPANPLUS::Dist documentation, as well as the documentation of your format of choice for any format specific documentation.

USAGE

Built-In Filter Lists

Some modules you'd rather not package. Some because they are part of core-perl and you don't want a new package. Some because they won't build on your system. Some because your package manager of choice already packages them for you.

There may be a myriad of reasons. You can use the --ignore and --ban options for this, but we provide some built-in lists that catch common cases. You can use these built-in lists if you like, or supply your own if need be.

Built-In Ignore List

You can use this list of regexes to keep matching modules from being listed as prerequisites of a package. This is particularly useful if they are bundled with core-perl anyway and have known issues building.

Toggle it by supplying the --default-ignorelist option.

Built-In Ban list

You can use this list of regexes to disable building of these modules altogether.

Toggle it by supplying the --default-banlist option.

SEE ALSO

CPANPLUS::Dist, CPANPLUS::Module, CPANPLUS::Shell::Default, cpanp

BUG REPORTS

Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

AUTHOR

This module by Jos Boumans <kane@cpan.org>.

COPYRIGHT

The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

 
cpanp

NAME

cpanp - The CPANPLUS launcher

SYNOPSIS

cpanp

cpanp [-]a [ --[no-]option... ] author...

cpanp [-]mfitulrcz [ --[no-]option... ] module...

cpanp [-]d [ --[no-]option... ] [ --fetchdir=... ] module...

cpanp [-]xb [ --[no-]option... ]

cpanp [-]o [ --[no-]option... ] [ module... ]

DESCRIPTION

This script launches the CPANPLUS utility to perform various operations from the command line. If it's invoked without arguments, an interactive shell is executed by default.

Optionally, it can take a single-letter switch and one or more arguments, to perform the associated action on each argument. A summary of the available commands is listed below; cpanp -h provides a detailed list.

    h                   # help information
    v                   # version information
    a AUTHOR ...        # search by author(s)
    m MODULE ...        # search by module(s)
    f MODULE ...        # list all releases of a module
    i MODULE ...        # install module(s)
    t MODULE ...        # test module(s)
    u MODULE ...        # uninstall module(s)
    d MODULE ...        # download module(s)
    l MODULE ...        # display detailed information about module(s)
    r MODULE ...        # display README files of module(s)
    c MODULE ...        # check for module report(s) from cpan-testers
    z MODULE ...        # extract module(s) and open command prompt in it
    x                   # reload CPAN indices
    o [ MODULE ... ]    # list installed module(s) that aren't up to date
    b                   # write a bundle file for your configuration

Each command may be followed by one or more options. If an option name is preceded by no, the corresponding option is set to 0; otherwise it is set to 1.

Example: To skip a module's tests,

    cpanp -i --skiptest MODULE ...

Valid options for most commands are cpantest, debug, flush, force, prereqs, storable, verbose, md5, signature, and skiptest; the 'd' command also accepts fetchdir. Please consult CPANPLUS::Configure for an explanation of their meanings.

Example: To download a module's tarball to the current directory,

    cpanp -d --fetchdir=. MODULE ...
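
The option convention described above (set to 1 when given, set to 0 when preceded by no, or given an explicit value as with fetchdir) can be sketched as follows. This is an illustration only and not cpanp's actual parser; the function name parse_options and the exact --no-NAME spelling it accepts are assumptions based on the --[no-]option notation in the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical helper: separate --option style switches from the rest.
sub parse_options {
    my (@args) = @_;
    my (%opt, @rest);
    for my $arg (@args) {
        if ($arg =~ /^--(no-)?(\w+)(?:=(.*))?$/) {
            my ($negated, $name, $value) = ($1, $2, $3);
            # --name=value keeps the value; otherwise 1, or 0 for --no-name
            $opt{$name} = defined $value ? $value : ($negated ? 0 : 1);
        }
        else {
            push @rest, $arg;
        }
    }
    return (\%opt, \@rest);
}

my ($opt, $rest) = parse_options(
    qw(-d --skiptest --no-verbose --fetchdir=. Some::Module)
);
print join(', ', map { "$_=$opt->{$_}" } sort keys %$opt), "\n";
# prints: fetchdir=., skiptest=1, verbose=0
print "@$rest\n";
# prints: -d Some::Module
```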
 
diagnostics

NAME

diagnostics, splain - produce verbose warning diagnostics

SYNOPSIS

Using the diagnostics pragma:

    use diagnostics;
    use diagnostics -verbose;
    enable diagnostics;
    disable diagnostics;

Using the splain standalone filter program:

    perl program 2>diag.out
    splain [-v] [-p] diag.out

Using diagnostics to get stack traces from a misbehaving script:

    perl -Mdiagnostics=-traceonly my_script.pl

DESCRIPTION

The diagnostics Pragma

This module extends the terse diagnostics normally emitted by both the perl compiler and the perl interpreter (from running perl with a -w switch or use warnings ), augmenting them with the more explicative and endearing descriptions found in perldiag. Like the other pragmata, it affects the compilation phase of your program rather than merely the execution phase.

To use in your program as a pragma, merely invoke

    use diagnostics;

at the start (or near the start) of your program. (Note that this does enable perl's -w flag.) Your whole compilation will then be subject(ed :-) to the enhanced diagnostics. These still go out STDERR.

Due to the interaction between runtime and compiletime issues, and because it's probably not a very good idea anyway, you may not use no diagnostics to turn them off at compiletime. However, you may control their behaviour at runtime using the disable() and enable() methods to turn them off and on respectively.

The -verbose flag first prints out the perldiag introduction before any other diagnostics. The $diagnostics::PRETTY variable can generate nicer escape sequences for pagers.

Warnings dispatched from perl itself (or more accurately, those that match descriptions found in perldiag) are only displayed once (no duplicate descriptions). User code generated warnings a la warn() are unaffected, allowing duplicate user messages to be displayed.

This module also adds a stack trace to the error message when perl dies. This is useful for pinpointing what caused the death. The -traceonly (or just -t) flag turns off the explanations of warning messages, leaving just the stack traces. So if your script is dying, run it again with

    perl -Mdiagnostics=-traceonly my_bad_script

to see the call stack at the time of death. By supplying the -warntrace (or just -w) flag, any warnings emitted will also come with a stack trace.

The splain Program

While apparently a whole nuther program, splain is actually nothing more than a link to the (executable) diagnostics.pm module, as well as a link to the diagnostics.pod documentation. The -v flag is like the use diagnostics -verbose directive. The -p flag is like the $diagnostics::PRETTY variable. Since you're post-processing with splain, there's no sense in being able to enable() or disable() processing.

Output from splain is directed to STDOUT, unlike the pragma.

EXAMPLES

The following file is certain to trigger a few errors at both runtime and compiletime:

    use diagnostics;
    print NOWHERE "nothing\n";
    print STDERR "\n\tThis message should be unadorned.\n";
    warn "\tThis is a user warning";
    print "\nDIAGNOSTIC TESTER: Please enter a <CR> here: ";
    my $a, $b = scalar <STDIN>;
    print "\n";
    print $x/$y;

If you prefer to run your program first and look at its problems afterwards, do this:

    perl -w test.pl 2>test.out
    ./splain < test.out

Note that this is not in general possible in shells of more dubious heritage, where the theoretical

    (perl -w test.pl >/dev/tty) >& test.out
    ./splain < test.out

fails because you have just moved the existing stdout somewhere else.

If you don't want to modify your source code, but still want on-the-fly warnings, do this:

    exec 3>&1; perl -w test.pl 2>&1 1>&3 3>&- | splain 1>&2 3>&-

Nifty, eh?

If you want to control warnings on the fly, do something like this. Make sure you do the use first, or you won't be able to get at the enable() or disable() methods.

    use diagnostics; # checks entire compilation phase
    print "\ntime for 1st bogus diags: SQUAWKINGS\n";
    print BOGUS1 'nada';
    print "done with 1st bogus\n";

    disable diagnostics; # only turns off runtime warnings
    print "\ntime for 2nd bogus: (squelched)\n";
    print BOGUS2 'nada';
    print "done with 2nd bogus\n";

    enable diagnostics; # turns back on runtime warnings
    print "\ntime for 3rd bogus: SQUAWKINGS\n";
    print BOGUS3 'nada';
    print "done with 3rd bogus\n";

    disable diagnostics;
    print "\ntime for 4th bogus: (squelched)\n";
    print BOGUS4 'nada';
    print "done with 4th bogus\n";

INTERNALS

Diagnostic messages derive from the perldiag.pod file when available at runtime. Otherwise, they may be embedded in the file itself when the splain package is built. See the Makefile for details.

If an extant $SIG{__WARN__} handler is discovered, it will continue to be honored, but only after the diagnostics::splainthis() function (the module's $SIG{__WARN__} interceptor) has had its way with your warnings.

There is a $diagnostics::DEBUG variable you may set if you're desperately curious what sorts of things are being intercepted.

    BEGIN { $diagnostics::DEBUG = 1 }

BUGS

Not being able to say "no diagnostics" is annoying, but may not be insurmountable.

The -pretty directive is called too late to affect matters. You have to do this instead, and before you load the module.

    BEGIN { $diagnostics::PRETTY = 1 }

I could start up faster by delaying compilation until it should be needed, but this gets a "panic: top_level" when using the pragma form in Perl 5.001e.

While it's true that this documentation is somewhat subserious, if you use a program named splain, you should expect a bit of whimsy.

AUTHOR

Tom Christiansen <tchrist@mox.perl.com>, 25 June 1995.

 
enc2xs

NAME

enc2xs -- Perl Encode Module Generator

SYNOPSIS

    enc2xs -[options]
    enc2xs -M ModName mapfiles...
    enc2xs -C

DESCRIPTION

enc2xs builds a Perl extension for use by Encode from either Unicode Character Mapping files (.ucm) or Tcl Encoding Files (.enc). Besides being used internally during the build process of the Encode module, you can use enc2xs to add your own encoding to perl. No knowledge of XS is necessary.

Quick Guide

If you want to know as little about Perl as possible but need to add a new encoding, just read this chapter and forget the rest.

0.

Have a .ucm file ready. You can get it from somewhere, write your own from scratch, or grab one from the Encode distribution and customize it. For the UCM format, see the next chapter. In the example below, I'll call my theoretical encoding myascii, defined in my.ucm. $ is a shell prompt.

    $ ls -F
    my.ucm

1.

Issue a command as follows:

    $ enc2xs -M My my.ucm
    generating Makefile.PL
    generating My.pm
    generating README
    generating Changes

Now take a look at your current directory. It should look like this:

    $ ls -F
    Makefile.PL My.pm my.ucm t/

The following files were created:

    Makefile.PL - MakeMaker script
    My.pm       - Encode submodule
    t/My.t      - test file

1.1.

If you want *.ucm installed together with the modules, do as follows:

    $ mkdir Encode
    $ mv *.ucm Encode
    $ enc2xs -M My Encode/*ucm

2.

Edit the generated files. You don't have to if you have no time AND no intention of giving them to someone else. But it is a good idea to edit the pod and to add more tests.

3.

Now issue a command all Perl Mongers love:

    $ perl Makefile.PL
    Writing Makefile for Encode::My

4.

Now all you have to do is make.

    $ make
    cp My.pm blib/lib/Encode/My.pm
    /usr/local/bin/perl /usr/local/bin/enc2xs -Q -O \
      -o encode_t.c -f encode_t.fnm
    Reading myascii (myascii)
    Writing compiled form
    128 bytes in string tables
    384 bytes (75%) saved spotting duplicates
    1 bytes (0.775%) saved using substrings
    ....
    chmod 644 blib/arch/auto/Encode/My/My.bs
    $

The time it takes varies depending on how fast your machine is and how large your encoding is. Unless you are working on something big like euc-tw, it won't take too long.

5.

You can "make install" already, but you should test first.

    $ make test
    PERL_DL_NONLAZY=1 /usr/local/bin/perl -Iblib/arch -Iblib/lib \
      -e 'use Test::Harness qw(&runtests $verbose); \
      $verbose=0; runtests @ARGV;' t/*.t
    t/My....ok
    All tests successful.
    Files=1, Tests=2, 0 wallclock secs
    ( 0.09 cusr + 0.01 csys = 0.09 CPU)

6.

If you are content with the test result, just "make install".

7.

If you want to add your encoding to Encode's demand-loading list (so you don't have to "use Encode::YourEncoding"), run

    enc2xs -C

to update Encode::ConfigLocal, a module that controls local settings. After that, "use Encode;" is enough to load your encodings on demand.

The Unicode Character Map

Encode uses the Unicode Character Map (UCM) format for source character mappings. This format is used by IBM's ICU package and was adopted by Nick Ing-Simmons for use with the Encode module. Since UCM is more flexible than Tcl's Encoding Map and far more user-friendly, this is the recommended format for Encode now.

A UCM file looks like this.

    #
    # Comments
    #
    <code_set_name> "US-ascii" # Required
    <code_set_alias> "ascii" # Optional
    <mb_cur_min> 1 # Required; usually 1
    <mb_cur_max> 1 # Max. # of bytes/char
    <subchar> \x3F # Substitution char
    #
    CHARMAP
    <U0000> \x00 |0 # <control>
    <U0001> \x01 |0 # <control>
    <U0002> \x02 |0 # <control>
    ....
    <U007C> \x7C |0 # VERTICAL LINE
    <U007D> \x7D |0 # RIGHT CURLY BRACKET
    <U007E> \x7E |0 # TILDE
    <U007F> \x7F |0 # <control>
    END CHARMAP

  • Anything that follows # is treated as a comment.

  • The header section continues until a line containing the word CHARMAP. This section has a form of <keyword> value, one pair per line. Strings used as values must be quoted. Barewords are treated as numbers. \xXX represents a byte.

    Most of the keywords are self-explanatory. subchar means substitution character, not subcharacter. When you encode a Unicode sequence into this encoding but no matching character is found, the byte sequence defined here will be used. For most cases, the value here is \x3F; in ASCII, this is a question mark.

  • CHARMAP starts the character map section. Each line has a form as follows:

        <UXXXX> \xXX.. |0 # comment
          ^     ^      ^
          |     |      +- Fallback flag
          |     +-------- Encoded byte sequence
          +-------------- Unicode Character ID in hex

    The format is roughly the same as a header section except for the fallback flag: | followed by 0..3. The meaning of the possible values is as follows:

    • |0

      Round trip safe. A character decoded to Unicode encodes back to the same byte sequence. Most characters have this flag.

    • |1

      Fallback for unicode -> encoding. When seen, enc2xs adds this character for the encode map only.

    • |2

      Skip sub-char mapping should there be no code point.

    • |3

      Fallback for encoding -> unicode. When seen, enc2xs adds this character for the decode map only.

  • And finally, END CHARMAP ends the section.
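
To make the role of the fallback flags concrete, here is a minimal sketch of a reader for the CHARMAP section. It is not how enc2xs itself is implemented: the function name parse_charmap is invented for this example, the header section is simply skipped, and |2 entries are ignored.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Encode or enc2xs.
# Returns (\%decode, \%encode): bytes => code point, code point => bytes.
sub parse_charmap {
    my (@lines) = @_;
    my (%decode, %encode);
    my $in_map = 0;
    for my $line (@lines) {
        $line =~ s/#.*//;                 # anything after # is a comment
        next unless $line =~ /\S/;
        last if $line =~ /^\s*END\s+CHARMAP\b/;
        if ($line =~ /^\s*CHARMAP\b/) { $in_map = 1; next }
        next unless $in_map;              # header lines are not interpreted
        next unless $line =~
            /^\s*<U([0-9A-Fa-f]+)>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|([0-3])/;
        my ($cp, $hex, $flag) = (hex($1), $2, $3);
        # turn the \xXX notation into real octets
        (my $bytes = $hex) =~ s/\\x([0-9A-Fa-f]{2})/chr(hex($1))/ge;
        $decode{$bytes} = $cp    if $flag == 0 || $flag == 3;  # |3: decode map only
        $encode{$cp}    = $bytes if $flag == 0 || $flag == 1;  # |1: encode map only
    }
    return (\%decode, \%encode);
}

my ($decode, $encode) = parse_charmap(
    'CHARMAP',
    '<U2550> \xF9\xF9 |0',
    '<U2550> \xA2\xA4 |3',
    'END CHARMAP',
);
printf "%d decode entries, %d encode entries\n",
    scalar(keys %$decode), scalar(keys %$encode);
# prints: 2 decode entries, 1 encode entries
```

Both byte sequences decode to U+2550, but only the |0 entry is used when encoding, which is the behaviour the flag descriptions above call for.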

When you are manually creating a UCM file, you should copy ascii.ucm or an existing encoding which is close to yours, rather than write your own from scratch.

When you do so, make sure you leave at least U0000 to U0020 as is, unless your environment is EBCDIC.

CAVEAT: not all features in UCM are implemented. For example, icu:state is not used. Because of that, you need to write a perl module if you want to support algorithmic encodings, notably the ISO-2022 series. Such modules include Encode::JP::2022_JP, Encode::KR::2022_KR, and Encode::TW::HZ.

Coping with duplicate mappings

When you create a map, you SHOULD make your mappings round-trip safe. That is, encode('your-encoding', decode('your-encoding', $data)) eq $data holds for all characters that are marked as |0. Here is how to make sure:

  • Sort your map in Unicode order.

  • When you have a duplicate entry, mark either one with '|1' or '|3'.

  • And make sure the '|1' or '|3' entry FOLLOWS the '|0' entry.

Here is an example from big5-eten.

    <U2550> \xF9\xF9 |0
    <U2550> \xA2\xA4 |3

Internally, the Encoding -> Unicode and Unicode -> Encoding maps look like this:

    E to U               U to E
    --------------------------------------
    \xF9\xF9 => U2550    U2550 => \xF9\xF9
    \xA2\xA4 => U2550

So it is round-trip safe for \xF9\xF9. But if the two lines above are swapped, here is what happens:

    E to U               U to E
    --------------------------------------
    \xA2\xA4 => U2550    U2550 => \xF9\xF9
    (\xF9\xF9 => U2550 is now overwritten!)
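
Once an encoding is installed, the round-trip property can also be checked directly with Encode. The sketch below uses a made-up helper name, is_round_trip_safe, and demonstrates it with encodings that ship with Encode; the same call could be tried with your own encoding name and byte sequences.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: true if the bytes survive decode followed by encode.
sub is_round_trip_safe {
    my ($encoding, $bytes) = @_;
    my $copy  = $bytes;                      # keep the caller's data untouched
    my $chars = decode($encoding, $copy);
    my $back  = encode($encoding, $chars);
    return $back eq $bytes;
}

print is_round_trip_safe('ascii', 'A')           ? "safe\n" : "not safe\n";
print is_round_trip_safe('iso-8859-1', "\xE9")   ? "safe\n" : "not safe\n";
# To probe the example above, one could try:
#   is_round_trip_safe('big5-eten', "\xF9\xF9")
#   is_round_trip_safe('big5-eten', "\xA2\xA4")
```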

The Encode package comes with ucmlint, a crude but sufficient utility to check the integrity of a UCM file. Check under the Encode/bin directory for this.

When in doubt, you can use ucmsort, yet another utility under the Encode/bin directory.

Bookmarks

SEE ALSO

Encode, perlmod, perlpod

 
encoding

NAME

encoding - allows you to write your script in non-ascii or non-utf8

WARNING

This module is deprecated under perl 5.18. It uses a mechanism provided by perl that is deprecated under 5.18 and higher, and may be removed in a future version.

SYNOPSIS

    use encoding "greek"; # Perl like Greek to you?
    use encoding "euc-jp"; # Jperl!

    # or you can even do this if your shell supports your native encoding
    perl -Mencoding=latin2 -e'...' # Feeling centrally European?
    perl -Mencoding=euc-kr -e'...' # Or Korean?

    # more control
    # A simple euc-cn => utf-8 converter
    use encoding "euc-cn", STDOUT => "utf8"; while(<>){print};

    # "no encoding;" supported (but not scoped!)
    no encoding;

    # an alternate way, Filter
    use encoding "euc-jp", Filter=>1;
    # now you can use kanji identifiers -- in euc-jp!

    # switch on locale -
    # note that this probably means that unless you have a complete control
    # over the environments the application is ever going to be run, you should
    # NOT use the feature of encoding pragma allowing you to write your script
    # in any recognized encoding because changing locale settings will wreck
    # the script; you can of course still use the other features of the pragma.
    use encoding ':locale';

ABSTRACT

Let's start with a bit of history: Perl 5.6.0 introduced Unicode support. You could apply substr() and regexes even to complex CJK characters -- so long as the script was written in UTF-8. But back then, text editors that supported UTF-8 were still rare and many users instead chose to write scripts in legacy encodings, giving up a whole new feature of Perl 5.6.

Rewind to the future: starting from perl 5.8.0 with the encoding pragma, you can write your script in any encoding you like (so long as the Encode module supports it) and still enjoy Unicode support. This pragma achieves that by doing the following:

  • Internally converts all literals (q//, qq//, qr//, qw//, qx//) from the specified encoding to utf8. In Perl 5.8.1 and later, literals in tr/// and the DATA pseudo-filehandle are also converted.

  • Changes the PerlIO layers of STDIN and STDOUT to the specified encoding.

Literal Conversions

You can write code in EUC-JP as follows:

    my $Rakuda = "\xF1\xD1\xF1\xCC"; # Camel in Kanji
    #             <-char-><-char->   # 4 octets
    s/\bCamel\b/$Rakuda/;

And with use encoding "euc-jp" in effect, it is the same thing as the code in UTF-8:

    my $Rakuda = "\x{99F1}\x{99DD}"; # two Unicode Characters
    s/\bCamel\b/$Rakuda/;

PerlIO layers for STD(IN|OUT)

The encoding pragma also modifies the filehandle layers of STDIN and STDOUT to the specified encoding. Therefore,

    use encoding "euc-jp";
    my $message = "Camel is the symbol of perl.\n";
    my $Rakuda = "\xF1\xD1\xF1\xCC"; # Camel in Kanji
    $message =~ s/\bCamel\b/$Rakuda/;
    print $message;

Will print "\xF1\xD1\xF1\xCC is the symbol of perl.\n", not "\x{99F1}\x{99DD} is the symbol of perl.\n".

You can override this by giving extra arguments; see below.
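
Since the pragma is deprecated, it may help to see the same effect spelled out by hand. The following sketch does not use the encoding pragma at all; it decodes the EUC-JP literal explicitly with Encode and sets the output layer with binmode, which is one possible way (an assumption of this example, not something this page prescribes) to get equivalent output.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Do by hand what the pragma does implicitly for STDOUT.
binmode(STDOUT, ':encoding(euc-jp)');

# Decode the EUC-JP octets into Unicode characters explicitly.
my $rakuda  = decode('euc-jp', "\xF1\xD1\xF1\xCC");   # Camel in Kanji
my $message = "Camel is the symbol of perl.\n";
$message =~ s/\bCamel\b/$rakuda/;

# The layer encodes the characters back to EUC-JP octets on output.
print $message;
```

Unlike the pragma, both steps here are explicit and affect only the literal and the filehandle named in the code.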

Implicit upgrading for byte strings

By default, if strings operating under byte semantics and strings with Unicode character data are concatenated, the new string will be created by decoding the byte strings as ISO 8859-1 (Latin-1).

The encoding pragma changes this to use the specified encoding instead. For example:

    use encoding 'utf8';
    my $string = chr(20000); # a Unicode string
    utf8::encode($string); # now it's a UTF-8 encoded byte string
    # concatenate with another Unicode string
    print length($string . chr(20000));

Will print 2 , because $string is upgraded as UTF-8. Without use encoding 'utf8'; , it will print 4 instead, since $string is three octets when interpreted as Latin-1.
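
The two results can be reproduced without the pragma by making the decoding step explicit. This is a sketch using Encode's decode function rather than use encoding; it shows the default Latin-1 upgrade next to an explicit UTF-8 decode.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $string = chr(20000);      # one Unicode character (U+4E20)
utf8::encode($string);        # now a UTF-8 encoded byte string

print length($string), "\n";                                 # prints 3
# Default behaviour: the bytes are upgraded as if they were Latin-1.
print length($string . chr(20000)), "\n";                    # prints 4
# Explicit decoding gives the result the pragma would produce.
print length(decode('UTF-8', $string) . chr(20000)), "\n";   # prints 2
```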

Side effects

If the encoding pragma is in scope then the lengths returned are calculated from the length of $/ in Unicode characters, which is not always the same as the length of $/ in the native encoding.

This pragma affects utf8::upgrade, but not utf8::downgrade.

FEATURES THAT REQUIRE 5.8.1

Some of the features offered by this pragma require perl 5.8.1. Most of these were implemented by Inaba Hiroto. All other features and changes work with 5.8.0.

  • "NON-EUC" doublebyte encodings

    Because perl needs to parse the script before applying this pragma, encodings such as Shift_JIS and Big-5 that may contain '\' (BACKSLASH; \x5c) in the second byte fail, because the second byte may accidentally escape the quoting character that follows. Perl 5.8.1 or later fixes this problem.

  • tr//

    tr// was overlooked by the Perl 5 porters when they released perl 5.8.0. See the section below for details.

  • DATA pseudo-filehandle

    Another feature that was overlooked was DATA .

USAGE

  • use encoding [ENCNAME] ;

    Sets the script encoding to ENCNAME. Unless ${^UNICODE} exists and is non-zero, the PerlIO layers of STDIN and STDOUT are set to ":encoding(ENCNAME)".

    Note that STDERR WILL NOT be changed.

    Also note that non-STD file handles remain unaffected. Use use open or binmode to change layers of those.

    If no encoding is specified, the environment variable PERL_ENCODING is consulted. If no encoding can be found, the error Unknown encoding 'ENCNAME' will be thrown.

  • use encoding ENCNAME [ STDIN => ENCNAME_IN ...] ;

    You can also individually set encodings of STDIN and STDOUT via the STDIN => ENCNAME form. In this case, you cannot omit the first ENCNAME. STDIN => undef turns the IO transcoding completely off.

    When ${^UNICODE} exists and is non-zero, these options will be completely ignored. ${^UNICODE} is a variable introduced in perl 5.8.1. See ${^UNICODE} in perlvar and -C in perlrun for details (perl 5.8.1 and later).

  • use encoding ENCNAME Filter=>1;

    This turns the encoding pragma into a source filter. While the default approach just decodes interpolated literals (in qq() and qr()), this will apply a source filter to the entire source code. See The Filter Option below for details.

  • no encoding;

    Unsets the script encoding. The layers of STDIN, STDOUT are reset to ":raw" (the default unprocessed raw stream of bytes).

The Filter Option

The magic of use encoding is not applied to the names of identifiers. In order to make ${"\x{4eba}"}++ ($human++, where human is a single Han ideograph) work, you still need to write your script in UTF-8 -- or use a source filter. That's what 'Filter=>1' does.

What does this mean? Your source code behaves as if it is written in UTF-8 with 'use utf8' in effect. So even if your editor only supports Shift_JIS, for example, you can still try examples in Chapter 15 of Programming Perl, 3rd Ed.. For instance, you can use UTF-8 identifiers.

This option is significantly slower and (as of this writing) non-ASCII identifiers are not very stable WITHOUT this option and with the source code written in UTF-8.

Filter-related changes at Encode version 1.87

  • The Filter option now sets STDIN and STDOUT like non-filter options. And STDIN=>ENCODING and STDOUT=>ENCODING work like non-filter version.

  • use utf8 is implicitly declared so you no longer have to use utf8 to ${"\x{4eba}"}++ .

CAVEATS

NOT SCOPED

The pragma is per script, not a per-block lexical. Only the last use encoding or no encoding matters, and it affects the whole script. The no encoding pragma is supported, and use encoding can appear as many times as you want in a given script, but using this pragma more than once is discouraged.

For the same reason, the use of this pragma inside modules is also discouraged (though not as strongly as the case above; see below).

If you still have to write a module with this pragma, be very careful about the load order. See the code below:

    # called module
    package Module_IN_BAR;
    use encoding "bar";
    # stuff in "bar" encoding here
    1;

    # caller script
    use encoding "foo";
    use Module_IN_BAR;
    # surprise! use encoding "bar" is in effect.

The best way to avoid this oddity is to use this pragma RIGHT AFTER other modules are loaded, i.e.

    use Module_IN_BAR;
    use encoding "foo";

DO NOT MIX MULTIPLE ENCODINGS

Notice that only literals (string or regular expression) having only legacy code points are affected: if you mix data like this

    \xDF\x{100}

the data is assumed to be in (Latin 1 and) Unicode, not in your native encoding. In other words, this will match in "greek":

    "\xDF" =~ /\x{3af}/

but this will not

    "\xDF\x{100}" =~ /\x{3af}\x{100}/

since the \xDF (ISO 8859-7 GREEK SMALL LETTER IOTA WITH TONOS) on the left will not be upgraded to \x{3af} (Unicode GREEK SMALL LETTER IOTA WITH TONOS) because of the \x{100} on the left. You should not be mixing your legacy data and Unicode in the same string.

This pragma also affects encoding of the 0x80..0xFF code point range: normally characters in that range are left as eight-bit bytes (unless they are combined with characters with code points 0x100 or larger, in which case all characters need to become UTF-8 encoded), but if the encoding pragma is present, even the 0x80..0xFF range always gets UTF-8 encoded.

After all, the best thing about this pragma is that you don't have to resort to \x{....} just to spell your name in a native encoding. So feel free to put your strings in your encoding in quotes and regexes.

tr/// with ranges

The encoding pragma works by decoding string literals in q//, qq//, qr//, qw//, qx// and so forth. In perl 5.8.0, this does not apply to tr///. Therefore,

  use encoding 'euc-jp';
  #....
  $kana =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/;
  #           -------- -------- -------- --------

does not work as

  $kana =~ tr/\x{3041}-\x{3093}/\x{30a1}-\x{30f3}/;

  • Legend of characters above

      utf8     euc-jp   charnames::viacode()
      -----------------------------------------
      \x{3041} \xA4\xA1 HIRAGANA LETTER SMALL A
      \x{3093} \xA4\xF3 HIRAGANA LETTER N
      \x{30a1} \xA5\xA1 KATAKANA LETTER SMALL A
      \x{30f3} \xA5\xF3 KATAKANA LETTER N

This counterintuitive behavior has been fixed in perl 5.8.1.
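The mapping in the legend can also be tried without the pragma. The following is a minimal sketch that swaps in an explicit Encode::decode call for use encoding, so that tr/// sees characters rather than EUC-JP bytes. The variable names are invented for the illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Two EUC-JP encoded hiragana: SMALL A and N (see the legend above).
my $euc_bytes = "\xA4\xA1\xA4\xF3";

# Decode the bytes into Perl characters explicitly.
my $kana = decode('euc-jp', $euc_bytes);

# tr/// now operates on code points, so the Unicode ranges apply.
(my $katakana = $kana) =~ tr/\x{3041}-\x{3093}/\x{30a1}-\x{30f3}/;

# Show the resulting code points.
printf "U+%04X U+%04X\n", map { ord } split //, $katakana;
```

Decoding at the input boundary like this keeps the byte-level encoding out of the tr/// ranges altogether.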

workaround to tr///;

In perl 5.8.0, you can work around this as follows:

  use encoding 'euc-jp';
  # ....
  eval qq{ \$kana =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/ };

Note that the tr/// expression is surrounded by qq{}. The idea is the same as the classic idiom that makes tr/// 'interpolate':

  tr/$from/$to/; # wrong!
  eval qq{ tr/$from/$to/ }; # workaround.

Under the encoding pragma even q// is affected, so tr/// not being decoded was clearly contrary to the intent of the Perl 5 Porters; it has been fixed in Perl 5.8.1 and later.

EXAMPLE - Greekperl

  use encoding "iso 8859-7";

  # \xDF in ISO 8859-7 (Greek) is \x{3af} in Unicode.
  $a = "\xDF";
  $b = "\x{100}";
  printf "%#x\n", ord($a); # will print 0x3af, not 0xdf
  $c = $a . $b;
  # $c will be "\x{3af}\x{100}", not "\x{df}\x{100}".

  # chr() is affected, and ...
  print "mega\n" if ord(chr(0xdf)) == 0x3af;
  # ... ord() is affected by the encoding pragma ...
  print "tera\n" if ord(pack("C", 0xdf)) == 0x3af;
  # ... as are eq and cmp ...
  print "peta\n" if "\x{3af}" eq pack("C", 0xdf);
  print "exa\n" if ("\x{3af}" cmp pack("C", 0xdf)) == 0;
  # ... but pack/unpack C are not affected, in case you still
  # want to go back to your native encoding
  print "zetta\n" if unpack("C", (pack("C", 0xdf))) == 0xdf;
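The same \xDF to \x{3af} correspondence can be checked without the pragma. This is a small sketch that uses the Encode module's decode and encode functions instead of use encoding; the variable names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# One byte in ISO 8859-7: GREEK SMALL LETTER IOTA WITH TONOS.
my $native = "\xDF";

# Decode explicitly instead of relying on the encoding pragma.
my $chars = decode('iso-8859-7', $native);
printf "%#x\n", ord($chars);

# Going back to the native encoding yields the original byte.
my $bytes = encode('iso-8859-7', $chars);
printf "%#x\n", ord($bytes);
```

Here the conversion happens only where decode and encode are called, so literals elsewhere in the script keep their usual meaning.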

KNOWN PROBLEMS

  • literals in regex that are longer than 127 bytes

    For native multibyte encodings (either fixed or variable length), the current implementation of the regular expressions may introduce recoding errors for regular expression literals longer than 127 bytes.

  • EBCDIC

    The encoding pragma is not supported on EBCDIC platforms. (Porters who are willing and able to remove this limitation are welcome.)

  • format

    This pragma doesn't work well with format because PerlIO does not get along very well with it. When a format contains non-ASCII characters, it prints garbage or triggers "wide character" warnings. To see the problem, try the code below.

      # Save this one in utf8
      # replace *non-ascii* with a non-ascii string
      my $camel;
      format STDOUT =
      *non-ascii*@>>>>>>>
      $camel
      .
      $camel = "*non-ascii*";
      binmode(STDOUT=>':encoding(utf8)'); # bang!
      write; # funny
      print $camel, "\n"; # fine

    Without binmode, write() happens to work, but then print() fails instead.

    At any rate, the very use of format is questionable when it comes to unicode characters since you have to consider such things as character width (i.e. double-width for ideographs) and directions (i.e. BIDI for Arabic and Hebrew).

  • Thread safety

    use encoding ... is not thread-safe (i.e., do not use in threaded applications).

The Logic of :locale

The logic of :locale is as follows:

1. If the platform supports the langinfo(CODESET) interface, the codeset returned is used as the default encoding for the open pragma.

2. If 1. didn't work but we are under the locale pragma, the environment variables LC_ALL and LANG (in that order) are matched for encodings (the part after ., if any), and if any found, that is used as the default encoding for the open pragma.

3. If 1. and 2. didn't work, the environment variables LC_ALL and LANG (in that order) are matched for anything looking like UTF-8, and if any found, :utf8 is used as the default encoding for the open pragma.

If your locale environment variables (LC_ALL, LC_CTYPE, LANG) contain the strings 'UTF-8' or 'UTF8' (case-insensitive matching), the default encoding of your STDIN, STDOUT, and STDERR, and of any subsequent file open, is UTF-8.
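The environment check described in the last paragraph can be sketched roughly as follows. This is not the pragma's actual implementation; env_requests_utf8 is a hypothetical helper, and it only looks at the three variables named above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if any of the locale variables mentions
# UTF-8 or UTF8, matched case-insensitively.
sub env_requests_utf8 {
    my ($env) = @_;
    for my $name (qw(LC_ALL LC_CTYPE LANG)) {
        my $value = $env->{$name};
        next unless defined $value;
        return 1 if $value =~ /UTF-?8/i;
    }
    return 0;
}

print env_requests_utf8(\%ENV)
    ? "locale asks for UTF-8\n"
    : "locale does not ask for UTF-8\n";
```

Passing the environment as a hash reference keeps the check easy to exercise with made-up values.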

HISTORY

This pragma first appeared in Perl 5.8.0. For features that require 5.8.1 or later, see above.

The :locale subpragma was implemented in 2.01, or Perl 5.8.6.

SEE ALSO

perlunicode, Encode, open, Filter::Util::Call

Ch. 15 of Programming Perl (3rd Edition) by Larry Wall, Tom Christiansen, Jon Orwant; O'Reilly & Associates; ISBN 0-596-00027-8

 
feature

Perl 5 version 18.2 documentation

NAME

feature - Perl pragma to enable new features

SYNOPSIS

  use feature qw(say switch);
  given ($foo) {
      when (1)          { say "\$foo == 1" }
      when ([2,3])      { say "\$foo == 2 || \$foo == 3" }
      when (/^a[bc]d$/) { say "\$foo eq 'abd' || \$foo eq 'acd'" }
      when ($_ > 100)   { say "\$foo > 100" }
      default           { say "None of the above" }
  }

  use feature ':5.10'; # loads all features available in perl 5.10

  use v5.10;           # implicitly loads :5.10 feature bundle

DESCRIPTION

It is usually impossible to add new syntax to Perl without breaking some existing programs. This pragma provides a way to minimize that risk. New syntactic constructs, or new semantic meanings to older constructs, can be enabled by use feature 'foo', and will be parsed only when the appropriate feature pragma is in scope. (Nevertheless, the CORE:: prefix provides access to all Perl keywords, regardless of this pragma.)

Lexical effect

Like other pragmas (use strict, for example), features have a lexical effect. use feature qw(foo) will only make the feature "foo" available from that point to the end of the enclosing block.

  {
      use feature 'say';
      say "say is available here";
  }
  print "But not here.\n";

no feature

Features can also be turned off by using no feature "foo". This too has lexical effect.

  use feature 'say';
  say "say is available here";
  {
      no feature 'say';
      print "But not here.\n";
  }
  say "Yet it is here.";

no feature with no features specified will reset to the default group. To disable all features (an unusual request!) use no feature ':all'.

AVAILABLE FEATURES

The 'say' feature

use feature 'say' tells the compiler to enable the Perl 6 style say function.

See say for details.

This feature is available starting with Perl 5.10.

The 'state' feature

use feature 'state' tells the compiler to enable state variables.

See Persistent Private Variables in perlsub for details.

This feature is available starting with Perl 5.10.
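As a short illustration of what a state variable does, here is a sketch of a counter whose value persists between calls. The subroutine name next_id is invented for the example.

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical counter: $id is initialised once and keeps its
# value across calls, unlike a plain "my" variable.
sub next_id {
    state $id = 0;
    return ++$id;
}

my @ids = map { next_id() } 1 .. 3;
print "@ids\n";
```

Without the feature, the usual way to get the same effect is a closure over a lexical declared in an enclosing block.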

The 'switch' feature

use feature 'switch' tells the compiler to enable the Perl 6 given/when construct.

See Switch Statements in perlsyn for details.

This feature is available starting with Perl 5.10.

The 'unicode_strings' feature

use feature 'unicode_strings' tells the compiler to use Unicode semantics in all string operations executed within its scope (unless they are also within the scope of either use locale or use bytes ). The same applies to all regular expressions compiled within the scope, even if executed outside it. It does not change the internal representation of strings, but only how they are interpreted.

no feature 'unicode_strings' tells the compiler to use the traditional Perl semantics wherein the native character set semantics is used unless it is clear to Perl that Unicode is desired. This can lead to some surprises when the behavior suddenly changes. (See The Unicode Bug in perlunicode for details.) For this reason, if you are potentially using Unicode in your program, the use feature 'unicode_strings' subpragma is strongly recommended.

This feature is available starting with Perl 5.12; was almost fully implemented in Perl 5.14; and extended in Perl 5.16 to cover quotemeta.
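The difference between the two modes shows up with characters in the 0x80..0xFF range that are stored as single bytes. The following sketch, which assumes no use locale or use bytes is in scope, uppercases "\xE9" (LATIN SMALL LETTER E WITH ACUTE) under both settings.

```perl
use strict;
use warnings;

# A one-character string that is not UTF-8 encoded internally.
my $e_acute = "\xE9";

my ($legacy, $unicode);
{
    no feature 'unicode_strings';
    $legacy = uc($e_acute);    # native semantics: left unchanged
}
{
    use feature 'unicode_strings';
    $unicode = uc($e_acute);   # Unicode semantics: becomes \xC9
}

printf "legacy: %02X, unicode_strings: %02X\n",
    ord($legacy), ord($unicode);
```

Only the interpretation of the string differs between the two blocks; the input string itself is the same in both.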

The 'unicode_eval' and 'evalbytes' features

Under the unicode_eval feature, Perl's eval function, when passed a string, will evaluate it as a string of characters, ignoring any use utf8 declarations. use utf8 exists to declare the encoding of the script, which only makes sense for a stream of bytes, not a string of characters. Source filters are forbidden, as they also really only make sense on strings of bytes. Any attempt to activate a source filter will result in an error.

The evalbytes feature enables the evalbytes keyword, which evaluates the argument passed to it as a string of bytes. It dies if the string contains any characters outside the 8-bit range. Source filters work within evalbytes: they apply to the contents of the string being evaluated.

Together, these two features are intended to replace the historical eval function, which has (at least) two bugs in it, that cannot easily be fixed without breaking existing programs:

  • eval behaves differently depending on the internal encoding of the string, sometimes treating its argument as a string of bytes, and sometimes as a string of characters.

  • Source filters activated within eval leak out into whichever file scope is currently being compiled. To give an example with the CPAN module Semi::Semicolons:

      BEGIN { eval "use Semi::Semicolons; # not filtered here " }
      # filtered here!

    evalbytes fixes that to work the way one would expect:

      use feature "evalbytes";
      BEGIN { evalbytes "use Semi::Semicolons; # filtered " }
      # not filtered

These two features are available starting with Perl 5.16.
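A minimal sketch of evalbytes in use follows. The second call relies on the statement above that evalbytes dies when the string contains characters outside the 8-bit range; the exact error text is not shown because it is not given here.

```perl
use strict;
use warnings;
use feature 'evalbytes';

# The argument is treated as a string of bytes.
my $sum = evalbytes "1 + 2";
print "sum: $sum\n";

# A string with a character above 0xFF cannot be treated as bytes,
# so, per the description above, this call dies and eval {} traps it.
my $wide_ok = eval { evalbytes "1 + \x{100}"; 1 } ? 1 : 0;
print "wide string accepted: $wide_ok\n";
```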

The 'current_sub' feature

This provides the __SUB__ token that returns a reference to the current subroutine or undef outside of a subroutine.

This feature is available starting with Perl 5.16.
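One common use is recursion inside an anonymous subroutine, which otherwise has no name to call itself by. A minimal sketch, with $factorial as an illustrative name:

```perl
use strict;
use warnings;
use feature 'current_sub';

# __SUB__ refers to the subroutine currently running, so the
# anonymous sub can recurse without closing over its own variable.
my $factorial = sub {
    my ($n) = @_;
    return $n <= 1 ? 1 : $n * __SUB__->($n - 1);
};

print $factorial->(5), "\n";
```

Calling through __SUB__ rather than through $factorial avoids a reference cycle between the closure and the variable that holds it.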

The 'array_base' feature

This feature supports the legacy $[ variable. See $[ in perlvar and arybase. It is on by default but disabled under use v5.16 (see IMPLICIT LOADING, below).

This feature is available under this name starting with Perl 5.16. In previous versions, it was simply on all the time, and this pragma knew nothing about it.

The 'fc' feature

use feature 'fc' tells the compiler to enable the fc function, which implements Unicode casefolding.

See fc for details.

This feature is available from Perl 5.16 onwards.
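A typical use of fc is case-insensitive comparison. The helper below is a hypothetical example, not part of the feature itself:

```perl
use strict;
use warnings;
use feature 'fc';

# Hypothetical helper: compare two strings ignoring case by
# comparing their casefolded forms.
sub same_ignoring_case {
    my ($left, $right) = @_;
    return fc($left) eq fc($right);
}

print same_ignoring_case("Hello", "hELLO") ? "match\n" : "no match\n";
```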

The 'lexical_subs' feature

WARNING: This feature is still experimental and the implementation may change in future versions of Perl. For this reason, Perl will warn when you use the feature, unless you have explicitly disabled the warning:

  1. no warnings "experimental::lexical_subs";

This enables declaration of subroutines via my sub foo , state sub foo and our sub foo syntax. See Lexical Subroutines in perlsub for details.

This feature is available from Perl 5.18 onwards.

FEATURE BUNDLES

It's possible to load multiple features together, using a feature bundle. The name of a feature bundle is prefixed with a colon, to distinguish it from an actual feature.

  use feature ":5.10";

The following feature bundles are available:

  bundle    features included
  --------- -----------------
  :default  array_base
  :5.10     say state switch array_base
  :5.12     say state switch unicode_strings array_base
  :5.14     say state switch unicode_strings array_base
  :5.16     say state switch unicode_strings
            unicode_eval evalbytes current_sub fc
  :5.18     say state switch unicode_strings
            unicode_eval evalbytes current_sub fc

The :default bundle represents the feature set that is enabled before any use feature or no feature declaration.

Specifying sub-versions such as the 0 in 5.14.0 in feature bundles has no effect. Feature bundles are guaranteed to be the same for all sub-versions.

  use feature ":5.14.0"; # same as ":5.14"
  use feature ":5.14.1"; # same as ":5.14"

IMPLICIT LOADING

Instead of loading feature bundles by name, it is easier to let Perl do implicit loading of a feature bundle for you.

There are two ways to load the feature pragma implicitly:

  • By using the -E switch on the Perl command-line instead of -e. That will enable the feature bundle for that version of Perl in the main compilation unit (that is, the one-liner that follows -E).

  • By explicitly requiring a minimum Perl version number for your program, with the use VERSION construct. That is,

      use v5.10.0;

    will do an implicit

      no feature ':all';
      use feature ':5.10';

    and so on. Note how the trailing sub-version is automatically stripped from the version.

    But to avoid portability warnings (see use), you may prefer:

      use 5.010;

    with the same effect.

    If the required version is older than Perl 5.10, the ":default" feature bundle is automatically loaded instead.
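As a small check of implicit loading, the sketch below uses say without any explicit use feature line. It writes to an in-memory filehandle only so that the result can be inspected; the variable names are illustrative.

```perl
use 5.010;    # implicitly loads the :5.10 bundle, which includes say
use strict;
use warnings;

# Capture the output in a scalar via an in-memory filehandle.
my $output = '';
open(my $out, '>', \$output) or die "cannot open in-memory handle: $!";
say {$out} "say is enabled by use 5.010";
close $out;

print $output;
```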

 
fields

Perl 5 version 18.2 documentation

NAME

fields - compile-time class fields

SYNOPSIS

  {
      package Foo;
      use fields qw(foo bar _Foo_private);
      sub new {
          my Foo $self = shift;
          unless (ref $self) {
              $self = fields::new($self);
              $self->{_Foo_private} = "this is Foo's secret";
          }
          $self->{foo} = 10;
          $self->{bar} = 20;
          return $self;
      }
  }

  my $var = Foo->new;
  $var->{foo} = 42;

  # this will generate an error
  $var->{zap} = 42;

  # subclassing
  {
      package Bar;
      use base 'Foo';
      use fields qw(baz _Bar_private); # not shared with Foo
      sub new {
          my $class = shift;
          my $self = fields::new($class);
          $self->SUPER::new(); # init base fields
          $self->{baz} = 10; # init own fields
          $self->{_Bar_private} = "this is Bar's secret";
          return $self;
      }
  }

DESCRIPTION

The fields pragma enables compile-time verified class fields.

NOTE: The current implementation keeps the declared fields in the %FIELDS hash of the calling package, but this may change in future versions. Do not update the %FIELDS hash directly, because it must be created at compile-time for it to be fully useful, as is done by this pragma.

Only valid for perl before 5.9.0:

If a typed lexical variable holding a reference is used to access a hash element and a package with the same name as the type has declared class fields using this pragma, then the operation is turned into an array access at compile time.

The related base pragma will combine fields from base classes and any fields declared using the fields pragma. This enables field inheritance to work properly.

Field names that start with an underscore character are made private to the class and are not visible to subclasses. Inherited fields can be overridden but will generate a warning if used together with the -w switch.

Only valid for perls before 5.9.0:

The effect of all this is that you can have objects with named fields which are as compact and as fast to access as arrays. This only works as long as the objects are accessed through properly typed variables. If the objects are not typed, access is only checked at run time.

The following functions are supported:

  • new

    perl before 5.9.0: fields::new() creates and blesses a pseudo-hash comprised of the fields declared using the fields pragma into the specified class.

    perl 5.9.0 and higher: fields::new() creates and blesses a restricted-hash comprised of the fields declared using the fields pragma into the specified class.

    This function is usable with or without pseudo-hashes. It is the recommended way to construct a fields-based object.

    This makes it possible to write a constructor like this:

      package Critter::Sounds;
      use fields qw(cat dog bird);

      sub new {
          my $self = shift;
          $self = fields::new($self) unless ref $self;
          $self->{cat} = 'meow'; # scalar element
          @$self{'dog','bird'} = ('bark','tweet'); # slice
          return $self;
      }

  • phash

    before perl 5.9.0:

    fields::phash() can be used to create and initialize a plain (unblessed) pseudo-hash. This function should always be used instead of creating pseudo-hashes directly.

    If the first argument is a reference to an array, the pseudo-hash will be created with keys from that array. If a second argument is supplied, it must also be a reference to an array whose elements will be used as the values. If the second array contains fewer elements than the first, the trailing elements of the pseudo-hash will not be initialized. This makes it particularly useful for creating a pseudo-hash from subroutine arguments:

      sub dogtag {
          my $tag = fields::phash([qw(name rank ser_num)], [@_]);
      }

    fields::phash() also accepts a list of key-value pairs that will be used to construct the pseudo-hash. Examples:

      my $tag = fields::phash(name => "Joe",
                              rank => "captain",
                              ser_num => 42);

      my $pseudohash = fields::phash(%args);

    perl 5.9.0 and higher:

    Pseudo-hashes have been removed from Perl as of 5.10. Consider using restricted hashes or fields::new() instead. Using fields::phash() will cause an error.
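On perl 5.9.0 and higher, the run-time side of this can be seen with an untyped variable. The sketch below uses a made-up Point class; it assumes only that fields::new() returns a restricted hash, as described above, so that storing to an undeclared key dies.

```perl
use strict;
use warnings;

{
    package Point;    # hypothetical class for illustration
    use fields qw(x y);

    sub new {
        my ($class, $x, $y) = @_;
        my $self = fields::new($class);
        @$self{qw(x y)} = ($x, $y);
        return $self;
    }
}

my $point = Point->new(1, 2);
$point->{x} = 10;                     # declared field: allowed

# 'z' was never declared, so the restricted hash rejects it.
my $ok    = eval { $point->{z} = 3; 1 };
my $error = $ok ? '' : $@;
print "assignment to 'z' rejected: $error" unless $ok;
```

Because $point is not a typed lexical, the mistake is caught only when the assignment runs, not at compile time.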

SEE ALSO

base

 
filetest

Perl 5 version 18.2 documentation

NAME

filetest - Perl pragma to control the filetest permission operators

SYNOPSIS

  $can_perhaps_read = -r "file"; # use the mode bits
  {
      use filetest 'access'; # intuit harder
      $can_really_read = -r "file";
  }
  $can_perhaps_read = -r "file"; # use the mode bits again

DESCRIPTION

This pragma tells the compiler to change the behaviour of the filetest permission operators, -r -w -x -R -W -X (see perlfunc).

The default behaviour of file test operators is to use the simple mode bits as returned by the stat() family of system calls. However, many operating systems have additional features to define more complex access rights, for example ACLs (Access Control Lists). For such environments, use filetest may help the permission operators to return results more consistent with other tools.

The use filetest or no filetest statements affect file tests defined in their block, up to the end of the closest enclosing block (they are lexically block-scoped).

Currently, only the access sub-pragma is implemented. It enables (or disables) the use of access() when available, that is, on most UNIX systems and other POSIX environments. See details below.

Consider this carefully

The stat() mode bits are probably right for most of the files and directories found on your system, because few people want to use the additional features offered by access(). But you may encounter surprises if your program runs on a system that uses ACLs, since the stat() information won't reflect the actual permissions.

There may be a slight performance decrease in the filetest operations when the filetest pragma is in effect, because checking the mode bits is very cheap compared with calling access().

Also, note that using the file tests for security purposes is a lost cause from the start: there is a window open for race conditions (who is to say that the permissions will not change between the test and the real operation?). Therefore if you are serious about security, just try the real operation and test for its success - think in terms of atomic operations. Filetests are more useful for filesystem administrative tasks, when you have no need for the content of the elements on disk.
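The "just try the real operation" advice can be sketched as follows. read_first_line is a hypothetical helper, and the path used in the call is an arbitrary example that is expected not to exist.

```perl
use strict;
use warnings;

# Hypothetical helper: instead of testing -r first and opening later,
# attempt the open and report its failure, so there is no gap between
# the check and the operation.
sub read_first_line {
    my ($path) = @_;
    open(my $fh, '<', $path) or return (undef, "$!");
    my $line = <$fh>;
    close $fh;
    return ($line, undef);
}

my ($line, $err) = read_first_line('/no/such/dir/example.txt');
if (defined $err) {
    print "open failed: $err\n";
}
else {
    print "first line: ", (defined $line ? $line : "(empty file)\n");
}
```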

The "access" sub-pragma

UNIX and POSIX systems provide an abstract access() operating system call, which should be used to query the read, write, and execute rights. This function hides the various distinct approaches taken by operating-system-specific security features, such as Access Control Lists (ACLs).

The extended filetest functionality is used by Perl only when the argument of the operators is a filename, not when it is a filehandle.
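A small sketch comparing the two modes on a file name is shown below. It creates a temporary file with File::Temp so that the example has something to test; on a freshly created file owned by the current user, both checks are expected to agree.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file; it is removed when the program exits.
my ($fh, $path) = tempfile(UNLINK => 1);

# Default behaviour: decide from the stat() mode bits.
my $by_mode_bits = -r $path ? 1 : 0;

# With the pragma: ask access() where the system provides it.
my $by_access;
{
    use filetest 'access';
    $by_access = -r $path ? 1 : 0;
}

print "mode bits: $by_mode_bits, access(): $by_access\n";
```

Note that the tests are applied to $path, a file name, since the pragma has no effect on tests of a filehandle.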

Limitation with regard to _

Because access() does not invoke stat() (at least not in a way visible to Perl), the stat result cache "_" is not set. This means that the outcome of the following two tests is different. In the first, _ holds the stat bits of /etc/passwd; in the second, it still contains the bits of /etc.

  { -d '/etc';
    -w '/etc/passwd';
    print -f _ ? 'Yes' : 'No'; # Yes
  }
  { use filetest 'access';
    -d '/etc';
    -w '/etc/passwd';
    print -f _ ? 'Yes' : 'No'; # No
  }

This applies only if your OS implements access(); if it does not, the pragma is simply ignored. Best not to use _ at all in a file where the filetest pragma is active!

As a side effect, as _ doesn't work, stacked filetest operators (-f -w $file) won't work either.

This limitation might be removed in a future version of perl.

 
find2perl

Perl 5 version 18.2 documentation

NAME

find2perl - translate find command lines to Perl code

SYNOPSIS

  find2perl [paths] [predicates] | perl

DESCRIPTION

find2perl is a little translator to convert find command lines to equivalent Perl code. The resulting code is typically faster than running find itself.
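To give a feel for the kind of code involved, here is a hand-written sketch using File::Find that corresponds roughly to the command line find DIR -type f -name '*.txt' -print. It is not actual find2perl output, and the scratch directory and file names are made up so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Temp qw(tempdir);

# Build a small scratch tree to search.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.log c.txt)) {
    my $path = File::Spec->catfile($dir, $name);
    open(my $fh, '>', $path) or die "cannot create $path: $!";
    close $fh;
}

# The "wanted" callback plays the role of the predicates:
# -type f and -name '*.txt', followed by collecting the path.
my @found;
find(sub {
    push @found, $File::Find::name if -f $_ && /\.txt\z/;
}, $dir);

print "$_\n" for sort @found;
```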

"paths" are a set of paths where find2perl will start its searches and "predicates" are taken from the following list.

  • ! PREDICATE

    Negate the sense of the following predicate. The ! must be passed as a distinct argument, so it may need to be surrounded by whitespace and/or quoted from interpretation by the shell using a backslash (just as with using find(1)).

  • ( PREDICATES )

    Group the given PREDICATES. The parentheses must be passed as distinct arguments, so they may need to be surrounded by whitespace and/or quoted from interpretation by the shell using a backslash (just as with using find(1)).

  • PREDICATE1 PREDICATE2

    True if _both_ PREDICATE1 and PREDICATE2 are true; PREDICATE2 is not evaluated if PREDICATE1 is false.

  • PREDICATE1 -o PREDICATE2

    True if either one of PREDICATE1 or PREDICATE2 is true; PREDICATE2 is not evaluated if PREDICATE1 is true.

  • -follow

    Follow (dereference) symlinks. The checking of file attributes depends on the position of the -follow option. If it precedes the file check option, a stat is done, which means the file check applies to the file the symbolic link points to. If the -follow option follows the file check option, the check applies to the symbolic link itself, i.e. an lstat is done.

  • -depth

    Change the directory traversal order so that each directory's contents are processed before the directory itself (depth-first, post-order), as with find(1).

  • -prune

    Do not descend into the directory currently matched.

  • -xdev

    Do not traverse mount points (prunes search at mount-point directories).

  • -name GLOB

    File name matches specified GLOB wildcard pattern. GLOB may need to be quoted to avoid interpretation by the shell (just as with using find(1)).

  • -iname GLOB

    Like -name, but the match is case insensitive.

  • -path GLOB

    Path name matches specified GLOB wildcard pattern.

  • -ipath GLOB

    Like -path, but the match is case insensitive.

  • -perm PERM

    Low-order 9 bits of permission match octal value PERM.

  • -perm -PERM

    The bits specified in PERM are all set in file's permissions.

  • -type X

    The file's type matches perl's -X operator.

  • -fstype TYPE

    Filesystem of current path is of type TYPE (only NFS/non-NFS distinction is implemented).

  • -user USER

    True if USER is owner of file.

  • -group GROUP

    True if file's group is GROUP.

  • -nouser

    True if file's owner is not in password database.

  • -nogroup

    True if file's group is not in group database.

  • -inum INUM

    True if file's inode number is INUM.

  • -links N

    True if (hard) link count of file matches N (see below).

  • -size N

    True if file's size matches N (see below). N is normally counted in 512-byte blocks, but a suffix of "c" specifies that size should be counted in characters (bytes) and a suffix of "k" specifies that size should be counted in 1024-byte blocks.

  • -atime N

    True if last-access time of file matches N (measured in days, see below).

  • -ctime N

    True if last-changed time of file's inode matches N (measured in days, see below).

  • -mtime N

    True if last-modified time of file matches N (measured in days, see below).

  • -newer FILE

    True if last-modified time of file is more recent than that of FILE.

  • -print

    Print out path of file (always true). If none of -exec, -ls, -print0, or -ok is specified, then -print will be added implicitly.

  • -print0

    Like -print, but terminates with \0 instead of \n.

  • -exec OPTIONS ;

    exec() the arguments in OPTIONS in a subprocess; any occurrence of {} in OPTIONS will first be substituted with the path of the current file. Note that the command "rm" has been special-cased to use perl's unlink() function instead (as an optimization). The ; must be passed as a distinct argument, so it may need to be surrounded by whitespace and/or quoted from interpretation by the shell using a backslash (just as with using find(1)).

  • -ok OPTIONS ;

    Like -exec, but first prompts user; if user's response does not begin with a y, skip the exec. The ; must be passed as a distinct argument, so it may need to be surrounded by whitespace and/or quoted from interpretation by the shell using a backslash (just as with using find(1)).

  • -eval EXPR

    Has the perl script eval() the EXPR.

  • -ls

    Simulates -exec ls -dils {} ;

  • -tar FILE

    Adds current output to tar-format FILE.

  • -cpio FILE

    Adds current output to old-style cpio-format FILE.

  • -ncpio FILE

    Adds current output to "new"-style cpio-format FILE.

Predicates which take a numeric argument N can come in three forms:

  • N is prefixed with a +: match values greater than N
  • N is prefixed with a -: match values less than N
  • N is not prefixed with either + or -: match only values equal to N
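The three forms of a numeric argument can be expressed as a small matching routine. The following is only a sketch of the rule as stated above; matches_count is a hypothetical name, not something find2perl provides.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether $value satisfies a numeric
# predicate argument written as +N, -N, or N.
sub matches_count {
    my ($spec, $value) = @_;
    my ($sign, $n) = $spec =~ /\A([+-]?)(\d+)\z/
        or die "bad numeric argument: $spec\n";
    return $value > $n if $sign eq '+';   # +N: greater than N
    return $value < $n if $sign eq '-';   # -N: less than N
    return $value == $n;                  #  N: exactly N
}

for my $spec (qw(+7 -7 7)) {
    printf "%-3s matches 10? %s\n", $spec,
        matches_count($spec, 10) ? "yes" : "no";
}
```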

SEE ALSO

find, File::Find.

 
h2ph

Perl 5 version 18.2 documentation

NAME

h2ph - convert .h C header files to .ph Perl header files

SYNOPSIS

h2ph [-d destination directory] [-r | -a] [-l] [headerfiles]

DESCRIPTION

h2ph converts any C header files specified to the corresponding Perl header file format. It is most easily run while in /usr/include:

  cd /usr/include; h2ph * sys/*

or

  cd /usr/include; h2ph * sys/* arpa/* netinet/*

or

  cd /usr/include; h2ph -r -l .

The output files are placed in the hierarchy rooted at Perl's architecture dependent library directory. You can specify a different hierarchy with a -d switch.

If run with no arguments, filters standard input to standard output.
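Once generated, a .ph file is loaded with require like any other Perl library file. The sketch below assumes that h2ph has been run on the system headers so that sys/syscall.ph exists somewhere in @INC; that is an assumption about the local installation, so the require is wrapped in eval and the script reports either outcome.

```perl
use strict;
use warnings;

# Try to load a converted header. This only succeeds if h2ph has
# already produced sys/syscall.ph in a directory listed in @INC.
my $loaded = eval { require 'sys/syscall.ph'; 1 } ? 1 : 0;

if ($loaded) {
    print "sys/syscall.ph loaded\n";
}
else {
    print "sys/syscall.ph not available; run h2ph first\n";
}
```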

OPTIONS

  • -d destination_dir

    Put the resulting .ph files beneath destination_dir, instead of beneath the default Perl library location ($Config{'installsitearch'}).

  • -r

    Run recursively; if any of headerfiles are directories, then run h2ph on all files in those directories (and their subdirectories, etc.). -r and -a are mutually exclusive.

  • -a

    Run automagically; convert headerfiles, as well as any .h files which they include. This option will search for .h files in all directories which your C compiler ordinarily uses. -a and -r are mutually exclusive.

  • -l

    Symbolic links will be replicated in the destination directory. If -l is not specified, then links are skipped over.

  • -h

    Put 'hints' in the .ph files which will help in locating problems with h2ph. In those cases when you require a .ph file containing syntax errors, instead of the cryptic

      [ some error condition ] at (eval mmm) line nnn

    you will see the slightly more helpful

      [ some error condition ] at filename.ph line nnn

    However, the .ph files almost double in size when built using -h.

  • -D

    Include the code from the .h file as a comment in the .ph file. This is primarily used for debugging h2ph.

  • -Q

    'Quiet' mode; don't print out the names of the files being converted.

ENVIRONMENT

No environment variables are used.

FILES

  /usr/include/*.h
  /usr/include/sys/*.h

etc.

AUTHOR

Larry Wall

SEE ALSO

perl(1)

DIAGNOSTICS

The usual warnings if it can't read or write the files involved.

BUGS

Doesn't construct the %sizeof array for you.

It doesn't handle all C constructs, but it does attempt to isolate definitions inside evals so that you can get at the definitions that it can translate.

It's only intended as a rough tool. You may need to dicker with the files produced.

You have to run this program by hand; it's not run as part of the Perl installation.

Doesn't handle complicated expressions built piecemeal, a la:

  enum {
      FIRST_VALUE,
      SECOND_VALUE,
  #ifdef ABC
      THIRD_VALUE
  #endif
  };

Doesn't necessarily locate all of your C compiler's internally-defined symbols.

 
h2xs

Perl 5 version 18.2 documentation

NAME

h2xs - convert .h C header files to Perl extensions

SYNOPSIS

h2xs [OPTIONS ...] [headerfile ... [extra_libraries]]

h2xs -h|-?|--help

DESCRIPTION

h2xs builds a Perl extension from C header files. The extension will include functions which can be used to retrieve the value of any #define statement which was in the C header files.

The module_name will be used for the name of the extension. If module_name is not supplied then the name of the first header file will be used, with the first character capitalized.

If the extension might need extra libraries, they should be included here. The extension Makefile.PL will take care of checking whether the libraries actually exist and how they should be loaded. The extra libraries should be specified in the form -lm -lposix, etc, just as on the cc command line. By default, the Makefile.PL will search through the library path determined by Configure. That path can be augmented by including arguments of the form -L/another/library/path in the extra-libraries argument.

In spite of its name, h2xs may also be used to create a skeleton pure Perl module. See the -X option.

OPTIONS

  • -A, --omit-autoload

    Omit all autoload facilities. This is the same as -c but also removes the use AutoLoader statement from the .pm file.

  • -B, --beta-version

    Use an alpha/beta style version number. Causes version number to be "0.00_01" unless -v is specified.

  • -C, --omit-changes

    Omits creation of the Changes file, and adds a HISTORY section to the POD template.

  • -F, --cpp-flags=addflags

    Additional flags to specify to C preprocessor when scanning header for function declarations. Writes these options in the generated Makefile.PL too.

  • -M, --func-mask=regular expression

    selects functions/macros to process.

  • -O, --overwrite-ok

    Allows a pre-existing extension directory to be overwritten.

  • -P, --omit-pod

    Omit the autogenerated stub POD section.

  • -X, --omit-XS

    Omit the XS portion. Used to generate a skeleton pure Perl module. -c and -f are implicitly enabled.

  • -a, --gen-accessors

    Generate an accessor method for each element of structs and unions. The generated methods are named after the element name; will return the current value of the element if called without additional arguments; and will set the element to the supplied value (and return the new value) if called with an additional argument. Embedded structures and unions are returned as a pointer rather than the complete structure, to facilitate chained calls.

    These methods all apply to the Ptr type for the structure; additionally two methods are constructed for the structure type itself, _to_ptr which returns a Ptr type pointing to the same structure, and a new method to construct and return a new structure, initialised to zeroes.

  • -b, --compat-version=version

    Generates a .pm file which is backwards compatible with the specified perl version.

    For versions < 5.6.0, the changes are:

      - no use of 'our' (uses 'use vars' instead)
      - no 'use warnings'

    Specifying a compatibility version higher than the version of perl you are using to run h2xs will have no effect. If unspecified h2xs will default to compatibility with the version of perl you are using to run h2xs.

  • -c, --omit-constant

    Omit constant() from the .xs file and corresponding specialised AUTOLOAD from the .pm file.

  • -d, --debugging

    Turn on debugging messages.

  • -e, --omit-enums=[regular expression]

    If regular expression is not given, skip all constants that are defined in a C enumeration. Otherwise skip only those constants that are defined in an enum whose name matches regular expression.

    Since regular expression is optional, make sure that this switch is followed by at least one other switch if you omit regular expression and have some pending arguments such as header-file names. This is ok:

        h2xs -e -n Module::Foo foo.h

    This is not ok:

        h2xs -n Module::Foo -e foo.h

    In the latter, foo.h is taken as regular expression.

  • -f, --force

    Allows an extension to be created for a header even if that header is not found in standard include directories.

  • -g, --global

    Include code for safely storing static data in the .xs file. Extensions that do not make use of static data can ignore this option.

  • -h, -?, --help

    Print the usage, help and version for this h2xs and exit.

  • -k, --omit-const-func

    For function arguments declared as const, omit the const attribute in the generated XS code.

  • -m, --gen-tied-var

    Experimental: for each variable declared in the header file(s), declare a perl variable of the same name magically tied to the C variable.

  • -n, --name=module_name

    Specifies a name to be used for the extension, e.g., -n RPC::DCE

  • -o, --opaque-re=regular expression

    Use "opaque" data type for the C types matched by the regular expression, even if these types are typedef-equivalent to types from typemaps. Should not be used without -x.

    This may be useful since, say, types which are typedef-equivalent to integers may represent OS-related handles, and one may want to work with these handles in an OO way, as in $handle->do_something(). Use -o . if you want to handle all the typedef'ed types as opaque types.

    The type-to-match is whitespace-normalized before matching (except for commas, which have no whitespace before them, and multiple * which have no whitespace between them).

  • -p, --remove-prefix=prefix

    Specify a prefix which should be removed from the Perl function names, e.g., -p sec_rgy_ This sets up the XS PREFIX keyword and removes the prefix from functions that are autoloaded via the constant() mechanism.

  • -s, --const-subs=sub1,sub2

    Create a perl subroutine for the specified macros rather than autoload with the constant() subroutine. These macros are assumed to have a return type of char *, e.g., -s sec_rgy_wildcard_name,sec_rgy_wildcard_sid.

  • -t, --default-type=type

    Specify the internal type that the constant() mechanism uses for macros. The default is IV (signed integer). Currently all macros found during the header scanning process will be assumed to have this type. Future versions of h2xs may gain the ability to make educated guesses.

  • --use-new-tests

    When --compat-version (-b) is present, the generated tests will use Test::More rather than Test, which is the default for versions before 5.6.2. Test::More will be added to PREREQ_PM in the generated Makefile.PL.

  • --use-old-tests

    Will force the generation of test code that uses the older Test module.

  • --skip-exporter

    Do not use Exporter and/or export any symbol.

  • --skip-ppport

    Do not use Devel::PPPort; the generated extension gets no portability layer for older perl versions.

  • --skip-autoloader

    Do not use the module AutoLoader; but keep the constant() function and sub AUTOLOAD for constants.

  • --skip-strict

    Do not use the pragma strict.

  • --skip-warnings

    Do not use the pragma warnings.

  • -v, --version=version

    Specify a version number for this extension. This version number is added to the templates. The default is 0.01, or 0.00_01 if -B is specified. The version specified should be numeric.

  • -x, --autogen-xsubs

    Automatically generate XSUBs based on function declarations in the header file. The package C::Scan should be installed. If this option is specified, the name of the header file may look like NAME1,NAME2. In this case NAME1 is used instead of the specified string, but XSUBs are emitted only for the declarations included from file NAME2.

    Note that some types of arguments/return-values for functions may result in XSUB-declarations/typemap-entries which need hand-editing. Such may be objects which cannot be converted from/to a pointer (like long long), pointers to functions, or arrays. See also the section on LIMITATIONS of -x.
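The ordering caveat described under -e is the usual behaviour of a switch with an optional value. The sketch below reproduces it with Getopt::Long, where ':s' marks an optional string and '=s' a mandatory one. The option specifications and the parse_args helper are assumptions made for illustration; they are not taken from the h2xs source.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Illustrative option specs only (an assumption, not h2xs's own code).
# ':s' = optional string value, '=s' = mandatory string value.
sub parse_args {
    my @args = @_;
    my ($enum_re, $name);
    GetOptionsFromArray(
        \@args,
        'e|omit-enums:s' => \$enum_re,
        'n|name=s'       => \$name,
    ) or die "bad options\n";
    # Whatever is left in @args would be treated as header files.
    return { enum_re => $enum_re, name => $name, rest => [@args] };
}

# -e is followed by another switch, so it gets an empty value.
my $ok  = parse_args(qw(-e -n Module::Foo foo.h));

# -e is followed by a plain word, which is swallowed as its value.
my $bad = parse_args(qw(-n Module::Foo -e foo.h));

printf "ok:  -e='%s' headers=(%s)\n", $ok->{enum_re},  join(',', @{ $ok->{rest} });
printf "bad: -e='%s' headers=(%s)\n", $bad->{enum_re}, join(',', @{ $bad->{rest} });
```

In the second call no header name survives option parsing, which is why a value-less -e should be followed by another switch.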

EXAMPLES

  # Default behavior, extension is Rusers
  h2xs rpcsvc/rusers

  # Same, but extension is RUSERS
  h2xs -n RUSERS rpcsvc/rusers

  # Extension is rpcsvc::rusers. Still finds <rpcsvc/rusers.h>
  h2xs rpcsvc::rusers

  # Extension is ONC::RPC. Still finds <rpcsvc/rusers.h>
  h2xs -n ONC::RPC rpcsvc/rusers

  # Without constant() or AUTOLOAD
  h2xs -c rpcsvc/rusers

  # Creates templates for an extension named RPC
  h2xs -cfn RPC

  # Extension is ONC::RPC.
  h2xs -cfn ONC::RPC

  # Extension is a pure Perl module with no XS code.
  h2xs -X My::Module

  # Extension is Lib::Foo which works at least with Perl5.005_03.
  # Constants are created for all #defines and enums h2xs can find
  # in foo.h.
  h2xs -b 5.5.3 -n Lib::Foo foo.h

  # Extension is Lib::Foo which works at least with Perl5.005_03.
  # Constants are created for all #defines but only for enums
  # whose names do not start with 'bar_'.
  h2xs -b 5.5.3 -e '^bar_' -n Lib::Foo foo.h

  # Makefile.PL will look for library -lrpc in
  # additional directory /opt/net/lib
  h2xs rpcsvc/rusers -L/opt/net/lib -lrpc

  # Extension is DCE::rgynbase
  # prefix "sec_rgy_" is dropped from perl function names
  h2xs -n DCE::rgynbase -p sec_rgy_ dce/rgynbase

  # Extension is DCE::rgynbase
  # prefix "sec_rgy_" is dropped from perl function names
  # subroutines are created for sec_rgy_wildcard_name and
  # sec_rgy_wildcard_sid
  h2xs -n DCE::rgynbase -p sec_rgy_ \
       -s sec_rgy_wildcard_name,sec_rgy_wildcard_sid dce/rgynbase

  # Make XS without defines in perl.h, but with function declarations
  # visible from perl.h. Name of the extension is perl1.
  # When scanning perl.h, define -DEXT=extern -DdEXT= -DINIT(x)=
  # Extra backslashes below because the string is passed to shell.
  # Note that a directory with perl header files would
  # be added automatically to include path.
  h2xs -xAn perl1 -F "-DEXT=extern -DdEXT= -DINIT\(x\)=" perl.h

  # Same with function declaration in proto.h as visible from perl.h.
  h2xs -xAn perl2 perl.h,proto.h

  # Same but select only functions which match /^av_/
  h2xs -M '^av_' -xAn perl2 perl.h,proto.h

  # Same but treat SV* etc as "opaque" types
  h2xs -o '^[S]V \*$' -M '^av_' -xAn perl2 perl.h,proto.h
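The -b 5.5.3 examples ask for a .pm file that older perls can compile. As described under -b, the difference for versions before 5.6.0 is how package variables are declared and whether 'use warnings' appears. The sketch below contrasts the two styles; the package names and the frobnicate symbol are hypothetical, and this is not literal h2xs output.

```perl
# Style suited to perls older than 5.6.0: no 'our', no 'use warnings'.
package Lib::Foo::Legacy;    # hypothetical package name
use strict;
use vars qw($VERSION @EXPORT_OK);
$VERSION   = '0.01';
@EXPORT_OK = qw(frobnicate);

# Style for perl 5.6.0 and later.
package Lib::Foo::Modern;    # hypothetical package name
use strict;
use warnings;
our $VERSION   = '0.01';
our @EXPORT_OK = qw(frobnicate);

package main;
print "$Lib::Foo::Legacy::VERSION $Lib::Foo::Modern::VERSION\n";
```

Both forms create the same package variables; 'use vars' simply predates 'our'.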

Extension based on .h and .c files

Suppose that you have some C files implementing some functionality, and the corresponding header files. How do you create an extension which makes this functionality accessible in Perl? The example below assumes that the header files are interface_simple.h and interface_hairy.h, and that you want the Perl module to be named Ext::Ension. If you need some preprocessor directives and/or linking with external libraries, see the flags -F, -L and -l in OPTIONS.

  • Find the directory name

    Start with a dummy run of h2xs:

        h2xs -Afn Ext::Ension

    The only purpose of this step is to create the needed directories, and let you know the names of these directories. From the output you can see that the directory for the extension is Ext/Ension.

  • Copy C files

    Copy your header files and C files to this directory Ext/Ension.

  • Create the extension

    Run h2xs, overwriting older autogenerated files:

        h2xs -Oxan Ext::Ension interface_simple.h interface_hairy.h

    h2xs looks for header files after changing to the extension directory, so it will find your header files OK.

  • Archive and test

    As usual, run

        cd Ext/Ension
        perl Makefile.PL
        make dist
        make
        make test

  • Hints

    It is important to do make dist as early as possible. This way you can easily merge(1) your changes to autogenerated files if you decide to edit your .h files and rerun h2xs.

    Do not forget to edit the documentation in the generated .pm file.

    Consider the autogenerated files as skeletons only; you may invent better interfaces than what h2xs could guess.

    Consider this section as a guideline only; some other options of h2xs may better suit your needs.

ENVIRONMENT

No environment variables are used.

AUTHOR

Larry Wall and others

SEE ALSO

perl, perlxstut, ExtUtils::MakeMaker, and AutoLoader.

DIAGNOSTICS

The usual warnings if it cannot read or write the files involved.

LIMITATIONS of -x

h2xs cannot distinguish whether an argument to a C function which is of the form, say, int *, is an input, output, or input/output parameter. In particular, argument declarations of the form

  int
  foo(n)
      int *n

are better rewritten as

  int
  foo(n)
      int &n

if n is an input parameter.

Additionally, h2xs has no facilities to intuit that a function

  int
  foo(addr,l)
      char *addr
      int   l

takes a pair of address and length of data at this address, so it is better to rewrite this function as

  int
  foo(sv)
      SV *sv
    PREINIT:
      STRLEN len;
      char *s;
    CODE:
      s = SvPV(sv,len);
      RETVAL = foo(s, len);
    OUTPUT:
      RETVAL

or alternately

  static int
  my_foo(SV *sv)
  {
      STRLEN len;
      char *s = SvPV(sv,len);

      return foo(s, len);
  }

  MODULE = foo    PACKAGE = foo    PREFIX = my_

  int
  foo(sv)
      SV *sv

See perlxs and perlxstut for additional details.

 

if

NAME

if - use a Perl module if a condition holds

SYNOPSIS

  use if CONDITION, MODULE => ARGUMENTS;

DESCRIPTION

The construct

  use if CONDITION, MODULE => ARGUMENTS;

has no effect unless CONDITION is true. In this case the effect is the same as that of

  use MODULE ARGUMENTS;

Above, => provides the necessary quoting of MODULE. If it is not used (e.g., there are no ARGUMENTS to give), you must quote MODULE yourself.
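A short sketch of the three situations described here: a true condition, a false condition, and a module named without ARGUMENTS. The choice of List::Util, POSIX and Data::Dumper is arbitrary; they are used only because they ship with perl.

```perl
use strict;
use warnings;

# Condition true: same effect as  use List::Util qw(sum);
use if 1, 'List::Util' => qw(sum);

# Condition false: nothing is loaded or imported on our behalf.
use if 0, 'POSIX' => qw(floor);

# No ARGUMENTS and therefore no =>, so MODULE is quoted by hand.
use if 1, 'Data::Dumper';

print sum(1, 2, 3), "\n";
print defined(&floor)  ? "floor imported\n"  : "floor not imported\n";
print defined(&Dumper) ? "Dumper imported\n" : "Dumper not imported\n";
```

Because this is a use statement, CONDITION is evaluated at compile time, so it can only depend on values that are already known at that point.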

BUGS

The current implementation does not allow specification of the required version of the module.

AUTHOR

Ilya Zakharevich <ilyaz@cpan.org>.

 

FAQs

 

Perl functions by category

Functions for SCALARs or strings

  • chomp - remove a trailing record separator from a string
  • chop - remove the last character from a string
  • chr - get character this number represents
  • crypt - one-way passwd-style encryption
  • hex - convert a string to a hexadecimal number
  • index - find a substring within a string
  • lc - return lower-case version of a string
  • lcfirst - return a string with just the next letter in lower case
  • length - return the number of bytes in a string
  • oct - convert a string to an octal number
  • ord - find a character's numeric representation
  • pack - convert a list into a binary representation
  • q/STRING/ - singly quote a string
  • qq/STRING/ - doubly quote a string
  • reverse - flip a string or a list
  • rindex - right-to-left substring search
  • sprintf - formatted print into a string
  • substr - get or alter a portion of a string
  • tr/// - transliterate a string
  • uc - return upper-case version of a string
  • ucfirst - return a string with just the next letter in upper case
  • y/// - transliterate a string

Regular expressions and pattern matching

  • m// - match a string with a regular expression pattern
  • pos - find or set the offset for the last/next m//g search
  • qr/STRING/ - Compile pattern
  • quotemeta - quote regular expression magic characters
  • s/// - replace a pattern with a string
  • split - split up a string using a regexp delimiter
  • study - optimize input data for repeated searches

Numeric functions

  • abs - absolute value function
  • atan2 - arctangent of Y/X in the range -PI to PI
  • cos - cosine function
  • exp - raise e to a power
  • hex - convert a string to a hexadecimal number
  • int - get the integer portion of a number
  • log - retrieve the natural logarithm for a number
  • oct - convert a string to an octal number
  • rand - retrieve the next pseudorandom number
  • sin - return the sine of a number
  • sqrt - square root function
  • srand - seed the random number generator

Functions for real @ARRAYs

  • pop - remove the last element from an array and return it
  • push - append one or more elements to an array
  • shift - remove the first element of an array, and return it
  • splice - add or remove elements anywhere in an array
  • unshift - prepend more elements to the beginning of a list

Functions for list data

  • grep - locate elements in a list that test true against a given criterion
  • join - join a list into a string using a separator
  • map - apply a change to a list to get back a new list with the changes
  • qw/STRING/ - quote a list of words
  • reverse - flip a string or a list
  • sort - sort a list of values
  • unpack - convert binary structure into normal perl variables
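As a quick illustration of how several of these list functions compose, here is a minimal sketch; the word list is made up for the example.

```perl
use strict;
use warnings;

my @words = qw(pear apple fig banana);               # qw// quotes a list of words

my @long   = grep { length($_) > 3 } @words;         # keep elements passing a test
my @upper  = map  { uc $_ } @long;                   # build a new, transformed list
my @sorted = sort @upper;                            # default string comparison
my $line   = join ', ', reverse @sorted;             # flip the list, then join it

print "$line\n";
```

Each step returns a new list and leaves @words untouched.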

Functions for real %HASHes

  • delete - deletes a value from a hash
  • each - retrieve the next key/value pair from a hash
  • exists - test whether a hash key is present
  • keys - retrieve list of indices from a hash
  • values - return a list of the values in a hash

Input and output functions

  • binmode - prepare binary files for I/O
  • close - close file (or pipe or socket) handle
  • closedir - close directory handle
  • dbmclose - breaks binding on a tied dbm file
  • dbmopen - create binding on a tied dbm file
  • die - raise an exception or bail out
  • eof - test a filehandle for its end
  • fileno - return file descriptor from filehandle
  • flock - lock an entire file with an advisory lock
  • format - declare a picture format for use by the write() function
  • getc - get the next character from the filehandle
  • print - output a list to a filehandle
  • printf - output a formatted list to a filehandle
  • read - fixed-length buffered input from a filehandle
  • readdir - get a directory entry from a directory handle
  • readline - fetch a record from a file
  • rewinddir - reset directory handle
  • seek - reposition file pointer for random-access I/O
  • seekdir - reposition directory pointer
  • select - reset default output or do I/O multiplexing
  • syscall - execute an arbitrary system call
  • sysread - fixed-length unbuffered input from a filehandle
  • sysseek - position I/O pointer on handle used with sysread and syswrite
  • syswrite - fixed-length unbuffered output to a filehandle
  • tell - get current seekpointer on a filehandle
  • telldir - get current seekpointer on a directory handle
  • truncate - shorten a file
  • warn - print debugging info
  • write - print a picture record

Functions for fixed length data or records

  • pack - convert a list into a binary representation
  • read - fixed-length buffered input from a filehandle
  • syscall - execute an arbitrary system call
  • sysread - fixed-length unbuffered input from a filehandle
  • sysseek - position I/O pointer on handle used with sysread and syswrite
  • syswrite - fixed-length unbuffered output to a filehandle
  • unpack - convert binary structure into normal perl variables
  • vec - test or set particular bits in a string

Functions for filehandles, files, or directories

  • -X - a file test (-r, -x, etc)
  • chdir - change your current working directory
  • chmod - changes the permissions on a list of files
  • chown - change the ownership on a list of files
  • chroot - make directory new root for path lookups
  • fcntl - file control system call
  • glob - expand filenames using wildcards
  • ioctl - system-dependent device control system call
  • link - create a hard link in the filesystem
  • lstat - stat a symbolic link
  • mkdir - create a directory
  • open - open a file, pipe, or descriptor
  • opendir - open a directory
  • readlink - determine where a symbolic link is pointing
  • rename - change a filename
  • rmdir - remove a directory
  • stat - get a file's status information
  • symlink - create a symbolic link to a file
  • sysopen - open a file, pipe, or descriptor
  • umask - set file creation mode mask
  • unlink - remove one link to a file
  • utime - set a file's last access and modify times

Keywords related to control flow of your perl program

  • caller - get context of the current subroutine call
  • continue - optional trailing block in a while or foreach
  • die - raise an exception or bail out
  • do - turn a BLOCK into a TERM
  • dump - create an immediate core dump
  • eval - catch exceptions or compile and run code
  • exit - terminate this program
  • goto - create spaghetti code
  • last - exit a block prematurely
  • next - iterate a block prematurely
  • prototype - get the prototype (if any) of a subroutine
  • redo - start this loop iteration over again
  • return - get out of a function early
  • sub - declare a subroutine, possibly anonymously
  • wantarray - get void vs scalar vs list context of current subroutine call

Keywords altering or affecting scoping of identifiers

  • caller - get context of the current subroutine call
  • import - patch a module's namespace into your own
  • local - create a temporary value for a global variable (dynamic scoping)
  • my - declare and assign a local variable (lexical scoping)
  • our - declare and assign a package variable (lexical scoping)
  • package - declare a separate global namespace
  • use - load in a module at compile time
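The difference between dynamic scoping (local) and lexical scoping (my, our) is easiest to see when one subroutine calls another. A minimal sketch follows; the variable and subroutine names are invented for the example.

```perl
use strict;
use warnings;

our $level = 'global';             # package variable, also reachable as $main::level

sub show { return $level }         # always reads the package variable

sub with_local {
    local $level = 'localized';    # temporary value, visible to called subs
    return show();
}

sub with_my {
    my $level = 'lexical';         # new lexical variable; show() cannot see it
    return show();
}

print with_local(), "\n";          # show() sees the temporary value
print with_my(), "\n";             # show() still sees the package variable
print show(), "\n";                # the local value has been restored
```

local saves the old value and restores it when the enclosing block exits, whereas my creates an unrelated variable that exists only within its block.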

Miscellaneous functions

  • defined - test whether a value, variable, or function is defined
  • dump - create an immediate core dump
  • eval - catch exceptions or compile and run code
  • formline - internal function used for formats
  • local - create a temporary value for a global variable (dynamic scoping)
  • my - declare and assign a local variable (lexical scoping)
  • our - declare and assign a package variable (lexical scoping)
  • prototype - get the prototype (if any) of a subroutine
  • reset - clear all variables of a given name
  • scalar - force a scalar context
  • undef - remove a variable or function definition
  • wantarray - get void vs scalar vs list context of current subroutine call

Functions for processes and process groups

  • alarm - schedule a SIGALRM
  • exec - abandon this program to run another
  • fork - create a new process just like this one
  • getpgrp - get process group
  • getppid - get parent process ID
  • getpriority - get current nice value
  • kill - send a signal to a process or process group
  • pipe - open a pair of connected filehandles
  • qx/STRING/ - backquote quote a string
  • readpipe - execute a system command and collect standard output
  • setpgrp - set the process group of a process
  • setpriority - set a process's nice value
  • sleep - block for some number of seconds
  • system - run a separate program
  • times - return elapsed time for self and child processes
  • wait - wait for any child process to die
  • waitpid - wait for a particular child process to die

Keywords related to perl modules

  • do - turn a BLOCK into a TERM
  • import - patch a module's namespace into your own
  • no - unimport some module symbols or semantics at compile time
  • package - declare a separate global namespace
  • require - load in external functions from a library at runtime
  • use - load in a module at compile time

Keywords related to classes and object-orientedness

  • bless - create an object
  • dbmclose - breaks binding on a tied dbm file
  • dbmopen - create binding on a tied dbm file
  • package - declare a separate global namespace
  • ref - find out the type of thing being referenced
  • tie - bind a variable to an object class
  • tied - get a reference to the object underlying a tied variable
  • untie - break a tie binding to a variable
  • use - load in a module at compile time

Low-level socket functions

  • accept - accept an incoming socket connect
  • bind - binds an address to a socket
  • connect - connect to a remote socket
  • getpeername - find the other end of a socket connection
  • getsockname - retrieve the sockaddr for a given socket
  • getsockopt - get socket options on a given socket
  • listen - register your socket as a server
  • recv - receive a message over a socket
  • send - send a message over a socket
  • setsockopt - set some socket options
  • shutdown - close down just half of a socket connection
  • socket - create a socket
  • socketpair - create a pair of sockets

System V interprocess communication functions

  • msgctl - SysV IPC message control operations
  • msgget - get SysV IPC message queue
  • msgrcv - receive a SysV IPC message from a message queue
  • msgsnd - send a SysV IPC message to a message queue
  • semctl - SysV semaphore control operations
  • semget - get set of SysV semaphores
  • semop - SysV semaphore operations
  • shmctl - SysV shared memory operations
  • shmget - get SysV shared memory segment identifier
  • shmread - read SysV shared memory
  • shmwrite - write SysV shared memory

Fetching user and group info

Fetching network info

Time-related functions

  • gmtime - convert UNIX time into record or string using Greenwich time
  • localtime - convert UNIX time into record or string using local time
  • time - return number of seconds since 1970
  • times - return elapsed time for self and child processes
 

Perl functions A-Z

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z

A

  • AUTOLOAD
  • abs - absolute value function
  • accept - accept an incoming socket connect
  • alarm - schedule a SIGALRM
  • and
  • atan2 - arctangent of Y/X in the range -PI to PI

B

  • BEGIN
  • bind - binds an address to a socket
  • binmode - prepare binary files for I/O
  • bless - create an object
  • break - break out of a "given" block

C

  • CHECK
  • caller - get context of the current subroutine call
  • chdir - change your current working directory
  • chmod - changes the permissions on a list of files
  • chomp - remove a trailing record separator from a string
  • chop - remove the last character from a string
  • chown - change the ownership on a list of files
  • chr - get character this number represents
  • chroot - make directory new root for path lookups
  • close - close file (or pipe or socket) handle
  • closedir - close directory handle
  • cmp
  • connect - connect to a remote socket
  • continue - optional trailing block in a while or foreach
  • cos - cosine function
  • crypt - one-way passwd-style encryption

D

  • DESTROY
  • __DATA__
  • dbmclose - breaks binding on a tied dbm file
  • dbmopen - create binding on a tied dbm file
  • default
  • defined - test whether a value, variable, or function is defined
  • delete - deletes a value from a hash
  • die - raise an exception or bail out
  • do - turn a BLOCK into a TERM
  • dump - create an immediate core dump

E

F

  • __FILE__
  • fc
  • fcntl - file control system call
  • fileno - return file descriptor from filehandle
  • flock - lock an entire file with an advisory lock
  • for
  • foreach
  • fork - create a new process just like this one
  • format - declare a picture format for use by the write() function
  • formline - internal function used for formats

G

H

  • hex - convert a string to a hexadecimal number

I

  • INIT
  • if
  • import - patch a module's namespace into your own
  • index - find a substring within a string
  • int - get the integer portion of a number
  • ioctl - system-dependent device control system call

J

  • join - join a list into a string using a separator

K

  • keys - retrieve list of indices from a hash
  • kill - send a signal to a process or process group

L

  • __LINE__
  • last - exit a block prematurely
  • lc - return lower-case version of a string
  • lcfirst - return a string with just the next letter in lower case
  • le
  • length - return the number of bytes in a string
  • link - create a hard link in the filesystem
  • listen - register your socket as a server
  • local - create a temporary value for a global variable (dynamic scoping)
  • localtime - convert UNIX time into record or string using local time
  • lock - get a thread lock on a variable, subroutine, or method
  • log - retrieve the natural logarithm for a number
  • lstat - stat a symbolic link
  • lt

M

  • m - match a string with a regular expression pattern
  • map - apply a change to a list to get back a new list with the changes
  • mkdir - create a directory
  • msgctl - SysV IPC message control operations
  • msgget - get SysV IPC message queue
  • msgrcv - receive a SysV IPC message from a message queue
  • msgsnd - send a SysV IPC message to a message queue
  • my - declare and assign a local variable (lexical scoping)

N

  • ne
  • next - iterate a block prematurely
  • no - unimport some module symbols or semantics at compile time
  • not

O

  • oct - convert a string to an octal number
  • open - open a file, pipe, or descriptor
  • opendir - open a directory
  • or
  • ord - find a character's numeric representation
  • our - declare and assign a package variable (lexical scoping)

P

  • __PACKAGE__
  • pack - convert a list into a binary representation
  • package - declare a separate global namespace
  • pipe - open a pair of connected filehandles
  • pop - remove the last element from an array and return it
  • pos - find or set the offset for the last/next m//g search
  • print - output a list to a filehandle
  • printf - output a formatted list to a filehandle
  • prototype - get the prototype (if any) of a subroutine
  • push - append one or more elements to an array

Q

  • q - singly quote a string
  • qq - doubly quote a string
  • qr - Compile pattern
  • quotemeta - quote regular expression magic characters
  • qw - quote a list of words
  • qx - backquote quote a string

R

  • rand - retrieve the next pseudorandom number
  • read - fixed-length buffered input from a filehandle
  • readdir - get a directory entry from a directory handle
  • readline - fetch a record from a file
  • readlink - determine where a symbolic link is pointing
  • readpipe - execute a system command and collect standard output
  • recv - receive a message over a socket
  • redo - start this loop iteration over again
  • ref - find out the type of thing being referenced
  • rename - change a filename
  • require - load in external functions from a library at runtime
  • reset - clear all variables of a given name
  • return - get out of a function early
  • reverse - flip a string or a list
  • rewinddir - reset directory handle
  • rindex - right-to-left substring search
  • rmdir - remove a directory

S

  • __SUB__
  • s - replace a pattern with a string
  • say - print with newline
  • scalar - force a scalar context
  • seek - reposition file pointer for random-access I/O
  • seekdir - reposition directory pointer
  • select - reset default output or do I/O multiplexing
  • semctl - SysV semaphore control operations
  • semget - get set of SysV semaphores
  • semop - SysV semaphore operations
  • send - send a message over a socket
  • setgrent - prepare group file for use
  • sethostent - prepare hosts file for use
  • setnetent - prepare networks file for use
  • setpgrp - set the process group of a process
  • setpriority - set a process's nice value
  • setprotoent - prepare protocols file for use
  • setpwent - prepare passwd file for use
  • setservent - prepare services file for use
  • setsockopt - set some socket options
  • shift - remove the first element of an array, and return it
  • shmctl - SysV shared memory operations
  • shmget - get SysV shared memory segment identifier
  • shmread - read SysV shared memory
  • shmwrite - write SysV shared memory
  • shutdown - close down just half of a socket connection
  • sin - return the sine of a number
  • sleep - block for some number of seconds
  • socket - create a socket
  • socketpair - create a pair of sockets
  • sort - sort a list of values
  • splice - add or remove elements anywhere in an array
  • split - split up a string using a regexp delimiter
  • sprintf - formatted print into a string
  • sqrt - square root function
  • srand - seed the random number generator
  • stat - get a file's status information
  • state - declare and assign a state variable (persistent lexical scoping)
  • study - optimize input data for repeated searches
  • sub - declare a subroutine, possibly anonymously
  • substr - get or alter a portion of a string
  • symlink - create a symbolic link to a file
  • syscall - execute an arbitrary system call
  • sysopen - open a file, pipe, or descriptor
  • sysread - fixed-length unbuffered input from a filehandle
  • sysseek - position I/O pointer on handle used with sysread and syswrite
  • system - run a separate program
  • syswrite - fixed-length unbuffered output to a filehandle

T

  • tell - get current seekpointer on a filehandle
  • telldir - get current seekpointer on a directory handle
  • tie - bind a variable to an object class
  • tied - get a reference to the object underlying a tied variable
  • time - return number of seconds since 1970
  • times - return elapsed time for self and child processes
  • tr - transliterate a string
  • truncate - shorten a file

U

  • UNITCHECK
  • uc - return upper-case version of a string
  • ucfirst - return a string with just the next letter in upper case
  • umask - set file creation mode mask
  • undef - remove a variable or function definition
  • unless
  • unlink - remove one link to a file
  • unpack - convert binary structure into normal perl variables
  • unshift - prepend more elements to the beginning of a list
  • untie - break a tie binding to a variable
  • until
  • use - load in a module at compile time
  • utime - set a file's last access and modify times

V

  • values - return a list of the values in a hash
  • vec - test or set particular bits in a string

W

  • wait - wait for any child process to die
  • waitpid - wait for a particular child process to die
  • wantarray - get void vs scalar vs list context of current subroutine call
  • warn - print debugging info
  • when
  • while
  • write - print a picture record

X

  • -X - a file test (-r, -x, etc)
  • x
  • xor

Y

  • y - transliterate a string
 
perldoc-html/index-history.html000644 000765 000024 00000040622 12275777371 016740 0ustar00jjstaff000000 000000 History / Changes - perldoc.perl.org

History / Changes

 
perldoc-html/index-internals.html000644 000765 000024 00000036204 12275777357 017243 0ustar00jjstaff000000 000000 Internals and C language interface - perldoc.perl.org

Internals and C language interface

  • perlembed - how to embed perl in your C program
  • perldebguts - Guts of Perl debugging
  • perlxs - XS language reference manual
  • perlxstut - Tutorial for writing XSUBs
  • perlxstypemap - Perl XS C/Perl type mapping
  • perlinterp - An overview of the Perl interpreter
  • perlsource - A guide to the Perl source tree
  • perlrepository - Links to current information on the Perl source repository
  • perlclib - Internal replacements for standard C library functions
  • perlguts - Introduction to the Perl API
  • perlcall - Perl calling conventions from C
  • perlapi - autogenerated documentation for the perl public API
  • perlintern - autogenerated documentation of purely internal Perl functions
  • perlmroapi - Perl method resolution plugin interface
  • perliol - C API for Perl's implementation of IO in Layers.
  • perlapio - perl's IO abstraction interface.
  • perlhack - How to hack on Perl
  • perlhacktut - Walk through the creation of a simple C code patch
  • perlhacktips - Tips for Perl core C code hacking
  • perlreguts - Description of the Perl regular expression engine.
  • perlreapi - Perl regular expression plugin interface
  • perlpolicy - Various and sundry policies and commitments related to the Perl core
 
perldoc-html/index-language.html000644 000765 000024 00000041455 12275777331 017023 0ustar00jjstaff000000 000000 Language reference - perldoc.perl.org

Language reference

 
perldoc-html/index-licence.html000644 000765 000024 00000031541 12275777410 016633 0ustar00jjstaff000000 000000 Licence - perldoc.perl.org

Licence

 
perldoc-html/index-modules-A.html000644 000765 000024 00000037136 12275777421 017067 0ustar00jjstaff000000 000000 Core modules (A) - perldoc.perl.org

Core modules (A)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-B.html000644 000765 000024 00000036605 12275777423 017072 0ustar00jjstaff000000 000000 Core modules (B) - perldoc.perl.org

Core modules (B)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
  • B - The Perl Compiler Backend
  • B::Concise - Walk Perl syntax tree, printing concise info about ops
  • B::Debug - Walk Perl syntax tree, printing debug info about ops
  • B::Deparse - Perl compiler backend to produce perl code
  • B::Lint - Perl lint
  • B::Lint::Debug - Adds debugging stringification to B::
  • B::Showlex - Show lexical variables used in functions or files
  • B::Terse - Walk Perl syntax tree, printing terse info about ops
  • B::Xref - Generates cross reference reports for Perl programs
  • Benchmark - benchmark running times of Perl code
 
perldoc-html/index-modules-C.html000644 000765 000024 00000053246 12275777424 017074 0ustar00jjstaff000000 000000 Core modules (C) - perldoc.perl.org

Core modules (C)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-D.html000644 000765 000024 00000041074 12275777435 017073 0ustar00jjstaff000000 000000 Core modules (D) - perldoc.perl.org

Core modules (D)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-E.html000644 000765 000024 00000054566 12275777437 017110 0ustar00jjstaff000000 000000 Core modules (E) - perldoc.perl.org

Core modules (E)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-F.html000644 000765 000024 00000042574 12275777443 017102 0ustar00jjstaff000000 000000 Core modules (F) - perldoc.perl.org

Core modules (F)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-G.html000644 000765 000024 00000035106 12275777446 017077 0ustar00jjstaff000000 000000 Core modules (G) - perldoc.perl.org

Core modules (G)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
  • Getopt::Long - Extended processing of command line options
  • Getopt::Std - Process single-character switches with switch clustering
 
perldoc-html/index-modules-H.html000644 000765 000024 00000035077 12275777446 017107 0ustar00jjstaff000000 000000 Core modules (H) - perldoc.perl.org

Core modules (H)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-I.html000644 000765 000024 00000045230 12275777447 017101 0ustar00jjstaff000000 000000 Core modules (I) - perldoc.perl.org

Core modules (I)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-J.html000644 000765 000024 00000034521 12275777455 017102 0ustar00jjstaff000000 000000 Core modules (J) - perldoc.perl.org

Core modules (J)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-K.html000644 000765 000024 00000034521 12275777456 017104 0ustar00jjstaff000000 000000 Core modules (K) - perldoc.perl.org

Core modules (K)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-L.html000644 000765 000024 00000040264 12275777460 017101 0ustar00jjstaff000000 000000 Core modules (L) - perldoc.perl.org

Core modules (L)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-M.html000644 000765 000024 00000050615 12275777465 017110 0ustar00jjstaff000000 000000 Core modules (M) - perldoc.perl.org

Core modules (M)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-N.html000644 000765 000024 00000040126 12275777473 017104 0ustar00jjstaff000000 000000 Core modules (N) - perldoc.perl.org

Core modules (N)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
  • NDBM_File - Tied access to ndbm files
  • Net::Cmd - Network Command class (as used by FTP, SMTP etc)
  • Net::Config - Local configuration data for libnet
  • Net::Domain - Attempt to evaluate the current host's internet name and domain
  • Net::FTP - FTP Client class
  • Net::hostent - by-name interface to Perl's built-in gethost*() functions
  • Net::netent - by-name interface to Perl's built-in getnet*() functions
  • Net::Netrc - OO interface to users netrc file
  • Net::NNTP - NNTP Client class
  • Net::Ping - check a remote host for reachability
  • Net::POP3 - Post Office Protocol 3 Client class (RFC1939)
  • Net::protoent - by-name interface to Perl's built-in getproto*() functions
  • Net::servent - by-name interface to Perl's built-in getserv*() functions
  • Net::SMTP - Simple Mail Transfer Protocol Client
  • Net::Time - time and daytime network client interface
  • NEXT - Provide a pseudo-class NEXT (et al) that allows method redispatch
 
perldoc-html/index-modules-O.html000644 000765 000024 00000035230 12275777474 017106 0ustar00jjstaff000000 000000 Core modules (O) - perldoc.perl.org

Core modules (O)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
  • O - Generic interface to Perl Compiler backends
  • Object::Accessor - interface to create per object accessors
  • Opcode - Disable named opcodes when compiling perl code
 
perldoc-html/index-modules-P.html000644 000765 000024 00000052471 12275777475 017116 0ustar00jjstaff000000 000000 Core modules (P) - perldoc.perl.org

Core modules (P)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-Q.html000644 000765 000024 00000034521 12275777503 017103 0ustar00jjstaff000000 000000 Core modules (Q) - perldoc.perl.org

Core modules (Q)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-R.html000644 000765 000024 00000034521 12275777503 017104 0ustar00jjstaff000000 000000 Core modules (R) - perldoc.perl.org

Core modules (R)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-S.html000644 000765 000024 00000036771 12275777503 017116 0ustar00jjstaff000000 000000 Core modules (S) - perldoc.perl.org

Core modules (S)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
  • Safe - Compile and execute code in restricted compartments
  • Scalar::Util - A selection of general-utility scalar subroutines
  • SDBM_File - Tied access to sdbm files
  • Search::Dict - look - search for key in dictionary file
  • SelectSaver - save and restore selected file handle
  • SelfLoader - load functions only on demand
  • Socket - networking constants and support functions
  • Storable - persistence for Perl data structures
  • Symbol - manipulate Perl symbols and their names
  • Sys::Hostname - Try every conceivable way to get hostname
  • Sys::Syslog - Perl interface to the UNIX syslog(3) calls
 
perldoc-html/index-modules-T.html000644 000765 000024 00000057274 12275777505 017122 0ustar00jjstaff000000 000000 Core modules (T) - perldoc.perl.org

Core modules (T)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-U.html000644 000765 000024 00000036005 12275777515 017111 0ustar00jjstaff000000 000000 Core modules (U) - perldoc.perl.org

Core modules (U)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-V.html000644 000765 000024 00000034521 12275777517 017115 0ustar00jjstaff000000 000000 Core modules (V) - perldoc.perl.org

Core modules (V)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-W.html000644 000765 000024 00000034521 12275777517 017116 0ustar00jjstaff000000 000000 Core modules (W) - perldoc.perl.org

Core modules (W)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-X.html000644 000765 000024 00000034677 12275777517 017133 0ustar00jjstaff000000 000000 Core modules (X) - perldoc.perl.org

Core modules (X)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
  • XSLoader - Dynamically load C libraries into Perl code
 
perldoc-html/index-modules-Y.html000644 000765 000024 00000034521 12275777517 017120 0ustar00jjstaff000000 000000 Core modules (Y) - perldoc.perl.org

Core modules (Y)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-modules-Z.html000644 000765 000024 00000034521 12275777517 017121 0ustar00jjstaff000000 000000 Core modules (Z) - perldoc.perl.org

Core modules (Z)

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
 
perldoc-html/index-overview.html000644 000765 000024 00000032232 12275777321 017076 0ustar00jjstaff000000 000000 Overview - perldoc.perl.org

Overview

  • perl - The Perl 5 language interpreter
  • perlintro - a brief introduction and overview of Perl
  • perlrun - how to execute the Perl interpreter
  • perlbook - Books about and related to Perl
  • perlcommunity - a brief overview of the Perl community
 
perldoc-html/index-platforms.html000644 000765 000024 00000037020 12275777410 017236 0ustar00jjstaff000000 000000 Platform specific - perldoc.perl.org

Platform specific

 
perldoc-html/index-pragmas.html000644 000765 000024 00000041602 12275777413 016665 0ustar00jjstaff000000 000000 Pragmas - perldoc.perl.org

Pragmas

  • attributes - get/set subroutine or variable attributes
  • autodie - Replace functions with ones that succeed or die with lexical scope
  • autouse - postpone load of modules until a function is used
  • base - Establish an ISA relationship with base classes at compile time
  • bigint - Transparent BigInteger support for Perl
  • bignum - Transparent BigNumber support for Perl
  • bigrat - Transparent BigNumber/BigRational support for Perl
  • blib - Use MakeMaker's uninstalled version of a package
  • bytes - Perl pragma to force byte semantics rather than character semantics
  • charnames - access to Unicode character names and named character sequences; also define character names
  • constant - Perl pragma to declare constants
  • diagnostics - produce verbose warning diagnostics
  • encoding - allows you to write your script in non-ascii or non-utf8
  • feature - Perl pragma to enable new features
  • fields - compile-time class fields
  • filetest - Perl pragma to control the filetest permission operators
  • if - use a Perl module if a condition holds
  • integer - Perl pragma to use integer arithmetic instead of floating point
  • less - perl pragma to request less of something
  • lib - manipulate @INC at compile time
  • locale - Perl pragma to use or avoid POSIX locales for built-in operations
  • mro - Method Resolution Order
  • open - perl pragma to set default PerlIO layers for input and output
  • ops - Perl pragma to restrict unsafe operations when compiling
  • overload - Package for overloading Perl operations
  • overloading - perl pragma to lexically control overloading
  • parent - Establish an ISA relationship with base classes at compile time
  • re - Perl pragma to alter regular expression behaviour
  • sigtrap - Perl pragma to enable simple signal handling
  • sort - perl pragma to control sort() behaviour
  • strict - Perl pragma to restrict unsafe constructs
  • subs - Perl pragma to predeclare sub names
  • threads - Perl interpreter-based threads
  • threads::shared - Perl extension for sharing data structures between threads
  • utf8 - Perl pragma to enable/disable UTF-8 (or UTF-EBCDIC) in source code
  • vars - Perl pragma to predeclare global variable names
  • vmsish - Perl pragma to control VMS-specific language features
  • warnings - Perl pragma to control optional warnings
  • warnings::register - warnings import function
 
perldoc-html/index-tutorials.html000644 000765 000024 00000035135 12275777322 017264 0ustar00jjstaff000000 000000 Tutorials - perldoc.perl.org

Tutorials

 
perldoc-html/index-utilities.html000644 000765 000024 00000040174 12275777417 017255 0ustar00jjstaff000000 000000 Utilities - perldoc.perl.org

Utilities

  • perlutil - utilities packaged with the Perl distribution
  • a2p - Awk to Perl translator
  • c2ph - Dump C structures as generated from cc -g -S stabs
  • config_data - Query or change configuration of Perl modules
  • corelist - a commandline frontend to Module::CoreList
  • cpan - easily interact with CPAN from the command line
  • cpanp - The CPANPLUS launcher
  • cpan2dist - The CPANPLUS distribution creator
  • enc2xs - Perl Encode Module Generator
  • find2perl - translate find command lines to Perl code
  • h2ph - convert .h C header files to .ph Perl header files
  • h2xs - convert .h C header files to Perl extensions
  • instmodsh - A shell to examine installed modules
  • libnetcfg - configure libnet
  • perlbug - how to submit bug reports on Perl
  • piconv - iconv(1), reinvented in perl
  • prove - Run tests through a TAP harness.
  • psed - a stream editor
  • podchecker - check the syntax of POD format documentation files
  • perldoc - Look up Perl documentation in Pod format.
  • perlivp - Perl Installation Verification Procedure
  • pod2html - convert .pod files to .html files
  • pod2latex - convert pod documentation to latex format
  • pod2man - Convert POD data to formatted *roff input
  • pod2text - Convert POD data to formatted ASCII text
  • pod2usage - print usage messages from embedded pod docs in files
  • podselect - print selected sections of pod documentation on standard output
  • pstruct - Dump C structures as generated from cc -g -S stabs
  • ptar
  • ptardiff - program that diffs an extracted archive against an unextracted one
  • s2p - a stream editor
  • shasum - Print or Check SHA Checksums
  • splain - produce verbose warning diagnostics
  • xsubpp - compiler to convert Perl XS code into C code
  • perlthanks - how to submit bug reports on Perl
 
perldoc-html/index.html000644 000765 000024 00000036641 12276001417 015225 0ustar00jjstaff000000 000000 Perl programming documentation - perldoc.perl.org

Perl 5 version 18.2 documentation

Core documentation for Perl 5 version 18.2, in HTML and PDF formats.

To find out what's new in Perl 5.18.2, read the perldelta manpage.

If you are new to the Perl language, good places to start reading are the introduction and overview at perlintro, and the extensive FAQ section, which provides answers to over 300 common questions.

Site features

  • Improved navigation
    When you scroll down a page, the top navigation bar remains visible at the top of your screen, so the page name, breadcrumb trail, and other links are always available.
  • Pop-up index display
    Documentation pages now have a 'Show page index' link in the navigation bar. Clicking this opens a draggable, resizable window with an overview of the page you're reading.
  • Improved search
    It's now even easier to find the page you need. For example, just type 'getopt long' into the search box to be taken directly to the Getopt::Long documentation.
  • Recently viewed pages
    The right-hand side panel shows the last 10 documentation pages you viewed. As with the search engine, this feature still works if you're using an offline local copy of the site.
  • Improved syntax highlighting
    As well as a better highlighting algorithm, code blocks now have line numbers to make it easier to see line breaks.
  • Module index links
    View the module indexes with a single click from the left-hand side panel.

Downloads

The complete documentation set is available to download for offline use.

As well as the documentation pages, the perldoc search engine is also included in the above downloads. No installation is required, just unpack the archive and open the index.html file in your web browser.

To obtain Perl itself, please go to http://www.perl.org/get.html.

   
perldoc-html/instmodsh.html000644 000765 000024 00000034507 12275777420 016142 0ustar00jjstaff000000 000000 instmodsh - perldoc.perl.org

instmodsh

NAME

instmodsh - A shell to examine installed modules

SYNOPSIS

    instmodsh

DESCRIPTION

A little interface to ExtUtils::Installed to examine installed modules, validate your packlists and even create a tarball from an installed module.
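
As a rough illustration of what instmodsh builds on, the sketch below queries ExtUtils::Installed directly. It assumes the installed modules left packlists that ExtUtils::Installed can find; the 'unknown' fallback label is this example's own choice and not something the module prints.

```perl
use strict;
use warnings;
use ExtUtils::Installed;

# Scan @INC for .packlist files (this can take a moment on large installs).
my $installed = ExtUtils::Installed->new;

for my $module ( $installed->modules ) {
    # version() may return nothing when no version could be determined.
    my $version = $installed->version($module) || 'unknown';
    print "$module $version\n";
}
```

The interactive shell offers the same information, plus packlist validation and tarball creation, without writing any code.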

SEE ALSO

ExtUtils::Installed

 
perldoc-html/integer.html000644 000765 000024 00000050561 12275777415 015571 0ustar00jjstaff000000 000000 integer - perldoc.perl.org

integer

NAME

integer - Perl pragma to use integer arithmetic instead of floating point

SYNOPSIS

    use integer;
    $x = 10/3;
    # $x is now 3, not 3.33333333333333333

DESCRIPTION

This tells the compiler to use integer operations from here to the end of the enclosing BLOCK. On many machines, this doesn't matter a great deal for most computations, but on those without floating point hardware, it can make a big difference in performance.
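
The block scoping described above can be sketched as follows. The variable names are illustrative only; the point is that the pragma stops applying at the closing brace of the enclosing block.

```perl
use strict;
use warnings;

my $inside;
{
    use integer;            # in effect until the closing brace
    $inside = 10 / 3;       # integer division
}

my $outside = 10 / 3;       # ordinary floating-point division again

print "inside the block:  $inside\n";
print "outside the block: $outside\n";
```

Keeping use integer; inside a small block is a reasonable way to limit its effect to the arithmetic that actually needs it.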

Note that this only affects how most of the arithmetic and relational operators handle their operands and results, and not how all numbers everywhere are treated. Specifically, use integer; has the effect that before computing the results of the arithmetic operators (+, -, *, /, %, +=, -=, *=, /=, %=, and unary minus), the comparison operators (<, <=, >, >=, ==, !=, <=>), and the bitwise operators (|, &, ^, <<, >>, |=, &=, ^=, <<=, >>=), the operands have their fractional portions truncated (or floored), and the result will have its fractional portion truncated as well. In addition, the range of operands and results is restricted to that of familiar two's complement integers, i.e., -(2**31) .. (2**31-1) on 32-bit architectures, and -(2**63) .. (2**63-1) on 64-bit architectures. For example, this code

    use integer;
    $x = 5.8;
    $y = 2.5;
    $z = 2.7;
    $a = 2**31 - 1; # Largest positive integer on 32-bit machines
    $, = ", ";
    print $x, -$x, $x + $y, $x - $y, $x / $y, $x * $y, $y == $z, $a, $a + 1;

will print: 5.8, -5, 7, 3, 2, 10, 1, 2147483647, -2147483648

Note that $x is still printed as having its true non-integer value of 5.8 since it wasn't operated on. And note too the wrap-around from the largest positive integer to the most negative one. Also, use integer; does not affect the arguments passed to functions or the values they return. E.g.,

    srand(1.5);
    $, = ", ";
    print sin(.5), cos(.5), atan2(1,2), sqrt(2), rand(10);

will give the same result whether or not use integer; is in effect. The power operator ** is also not affected, so that 2 ** .5 is always the square root of 2. Now, it so happens that the pre- and post-increment and decrement operators, ++ and --, are not affected by use integer; either. Some may rightly consider this to be a bug, but at least it's a long-standing one.
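
A small sketch of these exceptions, with illustrative variable names: the power and increment operators keep their fractional values, while ordinary addition truncates its operands.

```perl
use strict;
use warnings;

my ( $root, $counter, $sum );
{
    use integer;

    $root    = 2 ** .5;     # ** is not affected: still the square root of 2
    $counter = 1.5;
    $counter++;             # ++ is not affected: the fraction survives

    my $half = 1.5;
    $sum = $half + 1;       # + is affected: 1.5 is truncated to 1 first
}

printf "root=%.4f counter=%s sum=%s\n", $root, $counter, $sum;
```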

Finally, use integer; also has an additional effect on the bitwise operators. Normally, the operands and results are treated as unsigned integers, but with use integer; the operands and results are signed. This means, among other things, that ~0 is -1, and -2 & -5 is -6.
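
The signed versus unsigned behaviour can be compared side by side. This is a minimal sketch; the variable names are this example's own.

```perl
use strict;
use warnings;

my $unsigned_not = ~0;          # all bits set, read as an unsigned integer

my ( $signed_not, $signed_and );
{
    use integer;
    $signed_not = ~0;           # the same bit pattern, read as signed
    $signed_and = -2 & -5;      # signed operands and signed result
}

print "without use integer: ~0 is $unsigned_not\n";
print "with use integer:    ~0 is $signed_not, -2 & -5 is $signed_and\n";
```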

Internally, native integer arithmetic (as provided by your C compiler) is used. This means that Perl's own semantics for arithmetic operations may not be preserved. One common source of trouble is the modulus of negative numbers, which Perl does one way, but your hardware may do another.

    % perl -le 'print (4 % -3)'
    -2
    % perl -Minteger -le 'print (4 % -3)'
    1

See Pragmatic Modules in perlmodlib and Integer Arithmetic in perlop.

 
perldoc-html/less.html000644 000765 000024 00000045705 12275777415 015106 0ustar00jjstaff000000 000000 less - perldoc.perl.org

less

NAME

less - perl pragma to request less of something

SYNOPSIS

    use less 'CPU';

DESCRIPTION

This is a user-pragma. If you're very lucky some code you're using will know that you asked for less CPU usage or ram or fat or... we just can't know. Consult your documentation on everything you're currently using.

For general suggestions, try requesting CPU or memory.

    use less 'memory';
    use less 'CPU';
    use less 'fat';

If you ask for nothing in particular, you'll be asking for less 'please'.

    use less 'please';

FOR MODULE AUTHORS

less has been in the core as a "joke" module for ages now and it hasn't had any real way of communicating any information to anything. Thanks to Nicholas Clark we have user pragmas (see perlpragma) and now less can do something.

You can probably expect your users to be able to guess that they can request less CPU or memory or just "less" overall.

If the user didn't specify anything, it's interpreted as having used the please tag. It's up to you to make this useful.

    # equivalent
    use less;
    use less 'please';

BOOLEAN = less->of( FEATURE )

The class method less->of( NAME ) returns a boolean to tell you whether your user requested less of something.

    if ( less->of( 'CPU' ) ) {
        ...
    }
    elsif ( less->of( 'memory' ) ) {
    }
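
A self-contained sketch of the boolean form follows. For simplicity it makes the request and the query in the same lexical scope, which is an assumption of this example and not a statement about how a real module would be laid out; the variable names are illustrative.

```perl
use strict;
use warnings;

my ( $wants_cpu, $wants_memory, $outside );
{
    use less 'CPU';                                  # lexically scoped request
    $wants_cpu    = less->of('CPU')    ? 1 : 0;      # requested in this scope
    $wants_memory = less->of('memory') ? 1 : 0;      # not requested
}
$outside = less->of('CPU') ? 1 : 0;                  # the request ended with the block

print "CPU: $wants_cpu, memory: $wants_memory, outside: $outside\n";
```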

FEATURES = less->of()

If you don't ask for any feature, you get the list of features that the user requested you to be nice to. This has the nice side effect that if you don't respect anything in particular then you can just ask for it and use it like a boolean.

    if ( less->of ) {
        ...
    }
    else {
        ...
    }
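
The list form, and the default please tag, might be exercised like this. It is a sketch with illustrative variable names, again querying from the same scope that made the request.

```perl
use strict;
use warnings;

my ( @requested, $default_tag );
{
    use less 'memory';
    use less 'CPU';
    my @tags = less->of;            # list context: every tag requested here
    @requested = sort @tags;
}
{
    use less;                       # no argument is treated as 'please'
    $default_tag = less->of('please') ? 1 : 0;
}

print "requested: @requested\n";
print "bare 'use less' implies please: $default_tag\n";
```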

CAVEATS

  • This probably does nothing.
  • This works only on 5.10+

    At least it's backwards compatible in not doing much.

 
perldoc-html/lib.html000644 000765 000024 00000051076 12275777415 014704 0ustar00jjstaff000000 000000 lib - perldoc.perl.org

lib

NAME

lib - manipulate @INC at compile time

SYNOPSIS

    use lib LIST;
    no lib LIST;

DESCRIPTION

This is a small simple module which simplifies the manipulation of @INC at compile time.

It is typically used to add extra directories to perl's search path so that later use or require statements will find modules which are not located on perl's default search path.

Adding directories to @INC

The parameters to use lib are added to the start of the perl search path. Saying

    use lib LIST;

is almost the same as saying

    BEGIN { unshift(@INC, LIST) }

For each directory in LIST (called $dir here) the lib module also checks to see if a directory called $dir/$archname/auto exists. If so the $dir/$archname directory is assumed to be a corresponding architecture specific directory and is added to @INC in front of $dir. lib.pm also checks if directories called $dir/$version and $dir/$version/$archname exist and adds these directories to @INC.

The current value of $archname can be found with this command:

    perl -V:archname

The corresponding command to get the current value of $version is:

    perl -V:version

To avoid memory leaks, all trailing duplicate entries in @INC are removed.
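
As a minimal sketch of the behaviour described in this section, the following adds one directory with use lib and another with the roughly equivalent BEGIN block. Both paths are placeholders chosen for illustration and are assumed not to exist, so no architecture-specific directories are added in front of them.

```perl
use strict;
use warnings;

# Placeholder path, used only for illustration.
use lib '/nonexistent/myapp/lib';

# Roughly the same effect, without the archname and version checks.
BEGIN { unshift @INC, '/nonexistent/other/lib' }

# Both run at compile time, so the later statement ends up first in @INC.
print "first:  $INC[0]\n";
print "second: $INC[1]\n";
```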

Deleting directories from @INC

You should normally only add directories to @INC. If you need to delete directories from @INC take care to only delete those which you added yourself or which you are certain are not needed by other modules in your script. Other modules may have added directories which they need for correct operation.

The no lib statement deletes all instances of each named directory from @INC.

For each directory in LIST (called $dir here) the lib module also checks to see if a directory called $dir/$archname/auto exists. If so the $dir/$archname directory is assumed to be a corresponding architecture specific directory and is also deleted from @INC.

Restoring original @INC

When the lib module is first loaded it records the current value of @INC in an array @lib::ORIG_INC . To restore @INC to that value you can say

    @INC = @lib::ORIG_INC;
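
The following sketch shows deletion and restoration together. The directory name is again hypothetical, and the BEGIN blocks are only there to take snapshots of @INC at compile time, between the use lib and no lib statements.

```perl
use strict;
use warnings;

our (@after_use, @after_no);

use lib 'hypothetical_lib_dir';      # illustrative name only
BEGIN { @after_use = @INC }          # snapshot after adding
no lib 'hypothetical_lib_dir';
BEGIN { @after_no = @INC }           # snapshot after deleting

my $present_after_use = grep { $_ eq 'hypothetical_lib_dir' } @after_use;
my $present_after_no  = grep { $_ eq 'hypothetical_lib_dir' } @after_no;
print "after use lib: $present_after_use, after no lib: $present_after_no\n";

# Put @INC back to the value recorded when lib.pm was first loaded.
@INC = @lib::ORIG_INC;
```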

CAVEATS

In order to keep lib.pm small and simple, it only works with Unix filepaths. This doesn't mean it only works on Unix, but non-Unix users must first translate their file paths to Unix conventions.

    # VMS users wanting to put [.stuff.moo] into
    # their @INC would write
    use lib 'stuff/moo';

NOTES

In the future, this module will likely use File::Spec for determining paths, as it does now for Mac OS (where Unix-style or Mac-style paths work, and Unix-style paths are converted properly to Mac-style paths before being added to @INC).

If you try to add a file to @INC as follows:

    use lib 'this_is_a_file.txt';

lib will warn about this. The sole exceptions are files with the .par extension which are intended to be used as libraries.

SEE ALSO

FindBin - optional module which deals with paths relative to the source file.

PAR - optional module which can treat .par files as Perl libraries.

AUTHOR

Tim Bunce, 2nd June 1995.

lib is maintained by the perl5-porters. Please direct any questions to the canonical mailing list. Anything that is applicable to the CPAN release can be sent to its maintainer, though.

Maintainer: The Perl5-Porters <perl5-porters@perl.org>

Maintainer of the CPAN release: Steffen Mueller <smueller@cpan.org>

COPYRIGHT AND LICENSE

This package has been part of the perl core since perl 5.001. It has been released separately to CPAN so older installations can benefit from bug fixes.

This package has the same copyright and license as the perl core.

 
libnetcfg

NAME

libnetcfg - configure libnet

DESCRIPTION

The libnetcfg utility can be used to configure libnet. Starting with perl 5.8, libnet is part of the standard Perl distribution, but libnetcfg can be used for any libnet installation.

USAGE

Without arguments libnetcfg displays the current configuration.

    $ libnetcfg
    # old config ./libnet.cfg
    daytime_hosts        ntp1.none.such
    ftp_int_passive      0
    ftp_testhost         ftp.funet.fi
    inet_domain          none.such
    nntp_hosts           nntp.none.such
    ph_hosts
    pop3_hosts           pop.none.such
    smtp_hosts           smtp.none.such
    snpp_hosts
    test_exist           1
    test_hosts           1
    time_hosts           ntp.none.such
    # libnetcfg -h for help
    $

It tells where the old configuration file was found (if found).

The -h option will show a usage message.

To change the configuration you will need to use either the -c or the -d options.

The name of the old configuration file is "libnet.cfg" by default, unless otherwise specified using the -i option (-i oldfile). It is searched for first in the current directory, and then in your module path.

The default name of the new configuration file is "libnet.cfg", and by default it is written to the current directory, unless otherwise specified using the -o option, -o newfile .

SEE ALSO

Net::Config, libnetFAQ

AUTHORS

Graham Barr, the original Configure script of libnet.

Jarkko Hietaniemi, conversion into libnetcfg for inclusion into Perl 5.8.

 
locale

NAME

locale - Perl pragma to use or avoid POSIX locales for built-in operations

SYNOPSIS

    @x = sort @y;      # Unicode sorting order
    {
        use locale;
        @x = sort @y;  # Locale-defined sorting order
    }
    @x = sort @y;      # Unicode sorting order again

DESCRIPTION

This pragma tells the compiler to enable (or disable) the use of POSIX locales for built-in operations (for example, LC_CTYPE for regular expressions, LC_COLLATE for string comparison, and LC_NUMERIC for number formatting). Each "use locale" or "no locale" affects statements to the end of the enclosing BLOCK.

Starting in Perl 5.16, a hybrid mode for this pragma is available,

    use locale ':not_characters';

which enables only the portions of locales that don't affect the character set (that is, all except LC_COLLATE and LC_CTYPE). This is useful when mixing Unicode and locales, including UTF-8 locales.

    use locale ':not_characters';
    use open ":locale";           # Convert I/O to/from Unicode
    use POSIX qw(locale_h);       # Import the LC_ALL constant
    setlocale(LC_ALL, "");        # Required for the next statement
                                  # to take effect
    printf "%.2f\n", 12345.67;    # Locale-defined formatting
    @x = sort @y;                 # Unicode-defined sorting order.
                                  # (Note that you will get better
                                  # results using Unicode::Collate.)

See perllocale for more detailed information on how Perl supports locales.

NOTE

If your system does not support locales, then loading this module will cause the program to die with a message:

    "Your vendor does not support locales, you cannot use the locale
    module."
 
mro

NAME

mro - Method Resolution Order

SYNOPSIS

    use mro;        # enables next::method and friends globally
    use mro 'dfs';  # enable DFS MRO for this class (Perl default)
    use mro 'c3';   # enable C3 MRO for this class

DESCRIPTION

The "mro" namespace provides several utilities for dealing with method resolution order and method caching in general.

These interfaces are only available in Perl 5.9.5 and higher. See MRO::Compat on CPAN for a mostly forwards compatible implementation for older Perls.

OVERVIEW

It's possible to change the MRO of a given class either by using use mro as shown in the synopsis, or by using the mro::set_mro function below.

The special methods next::method, next::can, and maybe::next::method are not available until this mro module has been loaded via use or require.

The C3 MRO

In addition to the traditional Perl default MRO (depth first search, called DFS here), Perl now offers the C3 MRO as well. Perl's support for C3 is based on the work done in Stevan Little's module Class::C3, and most of the C3-related documentation here is ripped directly from there.

What is C3?

C3 is the name of an algorithm which aims to provide a sane method resolution order under multiple inheritance. It was first introduced in the language Dylan (see links in the SEE ALSO section), and then later adopted as the preferred MRO (Method Resolution Order) for the new-style classes in Python 2.3. Most recently it has been adopted as the "canonical" MRO for Perl 6 classes, and the default MRO for Parrot objects as well.

How does C3 work

C3 works by always preserving local precedence ordering. This essentially means that no class will appear before any of its subclasses. Take, for instance, the classic diamond inheritance pattern:

         <A>
        /   \
      <B>   <C>
        \   /
         <D>

The standard Perl 5 MRO would be (D, B, A, C). The result being that A appears before C, even though C is the subclass of A. The C3 MRO algorithm however, produces the following order: (D, B, C, A), which does not have this issue.

This example is fairly trivial; for more complex cases and a deeper explanation, see the links in the SEE ALSO section.
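
The two orderings can be checked directly with mro::get_linear_isa. In this sketch the class names Top, Left, Right and Bottom are illustrative stand-ins for A, B, C and D in the diagram above.

```perl
use strict;
use warnings;
use mro;

# Diamond hierarchy: Top plays the role of A, Bottom the role of D.
package Top;
sub new { return bless {}, shift }

package Left;
our @ISA = ('Top');

package Right;
our @ISA = ('Top');

package Bottom;
our @ISA = ('Left', 'Right');

package main;

# Ask for each linearization explicitly via the optional second argument.
my @dfs = @{ mro::get_linear_isa('Bottom', 'dfs') };
my @c3  = @{ mro::get_linear_isa('Bottom', 'c3') };

print "dfs: @dfs\n";   # dfs: Bottom Left Top Right
print "c3:  @c3\n";    # c3:  Bottom Left Right Top
```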

Functions

mro::get_linear_isa($classname[, $type])

Returns an arrayref which is the linearized MRO of the given class. Uses whichever MRO is currently in effect for that class by default, or the given MRO (either c3 or dfs) if specified as $type.

The linearized MRO of a class is an ordered array of all of the classes one would search when resolving a method on that class, starting with the class itself.

If the requested class doesn't yet exist, this function will still succeed, and return [ $classname ].

Note that UNIVERSAL (and any members of UNIVERSAL 's MRO) are not part of the MRO of a class, even though all classes implicitly inherit methods from UNIVERSAL and its parents.

mro::set_mro ($classname, $type)

Sets the MRO of the given class to the $type argument (either c3 or dfs ).

mro::get_mro($classname)

Returns the MRO of the given class (either c3 or dfs ).
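
A small sketch of mro::set_mro and mro::get_mro follows, showing that switching the MRO at run time changes which method is found. The class and method names are hypothetical.

```perl
use strict;
use warnings;
use mro;

package Top;
sub who { return 'Top' }

package Left;                 # deliberately has no who() of its own
our @ISA = ('Top');

package Right;
our @ISA = ('Top');
sub who { return 'Right' }

package Bottom;
our @ISA = ('Left', 'Right');

package main;

my $mro_before = mro::get_mro('Bottom');   # the Perl default
my $who_before = Bottom->who;              # DFS reaches Top before Right

mro::set_mro('Bottom', 'c3');

my $mro_after = mro::get_mro('Bottom');
my $who_after = Bottom->who;               # C3 reaches Right before Top

print "$mro_before: $who_before\n";   # dfs: Top
print "$mro_after: $who_after\n";     # c3: Right
```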

mro::get_isarev($classname)

Gets the mro_isarev for this class, returned as an arrayref of class names. These are every class that "isa" the given class name, even if the isa relationship is indirect. This is used internally by the MRO code to keep track of method/MRO cache invalidations.

As with mro::get_linear_isa above, UNIVERSAL is special. UNIVERSAL (and parents') isarev lists do not include every class in existence, even though all classes are effectively descendants for method inheritance purposes.

mro::is_universal($classname)

Returns a boolean status indicating whether or not the given classname is either UNIVERSAL itself, or one of UNIVERSAL 's parents by @ISA inheritance.

Any class for which this function returns true is "universal" in the sense that all classes potentially inherit methods from it.
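
The two introspection functions above can be tried on a small hierarchy. This is only a sketch with hypothetical class names; the result of mro::get_isarev is sorted here because no particular order is assumed.

```perl
use strict;
use warnings;
use mro;

package Top;
sub new { return bless {}, shift }

package Left;
our @ISA = ('Top');

package Right;
our @ISA = ('Top');

package Bottom;
our @ISA = ('Left', 'Right');

package main;

# Every class that "isa" Top, directly or indirectly.
my @descendants = sort @{ mro::get_isarev('Top') };

my $universal_flag = mro::is_universal('UNIVERSAL') ? 1 : 0;
my $top_flag       = mro::is_universal('Top')       ? 1 : 0;

print "descendants of Top: @descendants\n";
print "UNIVERSAL is universal: $universal_flag, Top is universal: $top_flag\n";
```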

mro::invalidate_all_method_caches()

Increments PL_sub_generation , which invalidates method caching in all packages.

mro::method_changed_in($classname)

Invalidates the method cache of any classes dependent on the given class. This is not normally necessary. The only known case where pure perl code can confuse the method cache is when you manually install a new constant subroutine by using a readonly scalar value, like the internals of constant do. If you find another case, please report it so we can either fix it or document the exception here.

mro::get_pkg_gen($classname)

Returns an integer which is incremented every time a real local method in the package $classname changes, or the local @ISA of $classname is modified.

This is intended for authors of modules which do lots of class introspection, as it allows them to very quickly check if anything important about the local properties of a given class has changed since the last time they looked. It does not increment on method/@ISA changes in superclasses.

It's still up to you to seek out the actual changes, and there might not actually be any. Perhaps all of the changes since you last checked cancelled each other out and left the package in the state it was in before.

This integer normally starts off at a value of 1 when a package stash is instantiated. Calling it on packages whose stashes do not exist at all will return 0 . If a package stash is completely deleted (not a normal occurrence, but it can happen if someone does something like undef %PkgName:: ), the number will be reset to either 0 or 1 , depending on how completely the package was wiped out.
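
A sketch of how an introspection module might use mro::get_pkg_gen as a cheap change detector is shown below. The package names are hypothetical, and only relative changes in the counter are relied on, not its exact values.

```perl
use strict;
use warnings;
use mro;

{
    package Widget;
    our @ISA = ();
    sub render { return 'widget' }
}

my $missing = mro::get_pkg_gen('NoSuchPackageHere');   # stash does not exist
my $start   = mro::get_pkg_gen('Widget');

{
    no warnings 'once';
    *Widget::resize = sub { return 'resized' };        # add a local method
}
my $after_method = mro::get_pkg_gen('Widget');

@Widget::ISA = ('WidgetBase');                         # change the local @ISA
my $after_isa = mro::get_pkg_gen('Widget');

print "method change detected\n" if $after_method > $start;
print "\@ISA change detected\n"   if $after_isa > $after_method;
```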

next::method

This is somewhat like SUPER , but it uses the C3 method resolution order to get better consistency in multiple inheritance situations. Note that while inheritance in general follows whichever MRO is in effect for the given class, next::method only uses the C3 MRO.

One generally uses it like so:

    sub some_method {
        my $self = shift;
        my $superclass_answer = $self->next::method(@_);
        return $superclass_answer + 1;
    }

Note that you don't (re-)specify the method name. It forces you to always use the same method name as the method you started in.

It can be called on an object or a class, of course.

The way it resolves which actual method to call is:

  1. First, it determines the linearized C3 MRO of the object or class it is being called on.

  2. Then, it determines the class and method name of the context it was invoked from.

  3. Finally, it searches down the C3 MRO list until it reaches the contextually enclosing class, then searches further down the MRO list for the next method with the same name as the contextually enclosing method.

Failure to find a next method will result in an exception being thrown (see below for alternatives).

This is substantially different than the behavior of SUPER under complex multiple inheritance. (This becomes obvious when one realizes that the common superclasses in the C3 linearizations of a given class and one of its parents will not always be ordered the same for both.)
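
The resolution steps above can be traced on a diamond hierarchy. In this sketch (class and method names are hypothetical) each class prepends its own name and then hands over to next::method, so the returned string records the order in which the methods ran.

```perl
use strict;
use warnings;
use mro;

package Top;
sub describe { return 'Top' }

package Left;
our @ISA = ('Top');
sub describe { my $self = shift; return 'Left > ' . $self->next::method(@_) }

package Right;
our @ISA = ('Top');
sub describe { my $self = shift; return 'Right > ' . $self->next::method(@_) }

package Bottom;
our @ISA = ('Left', 'Right');
sub describe { my $self = shift; return 'Bottom > ' . $self->next::method(@_) }

package main;

# The C3 MRO of Bottom is (Bottom, Left, Right, Top), so Left's
# next::method is Right::describe rather than Top::describe.
my $chain = Bottom->describe;
print "$chain\n";   # Bottom > Left > Right > Top
```

With SUPER::describe in place of next::method, Left would call Top directly and Right::describe would never run.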

Caveat: Calling next::method from methods defined outside the class:

There is an edge case when using next::method from within a subroutine which was created in a different module than the one it is called from. It sounds complicated, but it really isn't. Here is an example which will not work correctly:

    *Foo::foo = sub { (shift)->next::method(@_) };

The problem exists because the anonymous subroutine being assigned to the *Foo::foo glob will show up in the call stack as being called __ANON__ and not foo as you might expect. Since next::method uses caller to find the name of the method it was called in, it will fail in this case.

But fear not, there's a simple solution. The module Sub::Name will reach into the perl internals and assign a name to an anonymous subroutine for you. Simply do this:

    use Sub::Name 'subname';
    *Foo::foo = subname 'Foo::foo' => sub { (shift)->next::method(@_) };

and things will Just Work.

next::can

This is similar to next::method, but just returns either a code reference or undef to indicate that no further methods of this name exist.

maybe::next::method

In simple cases, it is equivalent to:

    $self->next::method(@_) if $self->next::can;

But there are some cases where only this solution works (like goto &maybe::next::method).
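
The difference between the three calls shows up at the end of a method chain. The following sketch uses hypothetical classes in which the parent has no method of the same name.

```perl
use strict;
use warnings;
use mro;

package Base;
sub new { return bless {}, shift }

package Derived;
our @ISA = ('Base');

# Base has no setup(), so there is nothing further down the MRO.
sub setup {
    my $self = shift;
    my $next = $self->next::can;              # undef here
    $self->maybe::next::method(@_);           # quietly does nothing
    return defined $next ? 'has next' : 'end of chain';
}

sub strict_setup {
    my $self = shift;
    return $self->next::method(@_);           # throws: no next method
}

package main;

my $obj    = Derived->new;
my $status = $obj->setup;
my $lived  = eval { $obj->strict_setup; 1 } ? 1 : 0;

print "setup: $status\n";
print "next::method threw an exception\n" unless $lived;
```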

SEE ALSO

The original Dylan paper

Pugs

The Pugs prototype Perl 6 Object Model uses C3

Parrot

Parrot now uses C3

Python 2.3 MRO related links

Class::C3

AUTHOR

Brandon L. Black, <blblack@gmail.com>

Based on Stevan Little's Class::C3

 
open

NAME

open - perl pragma to set default PerlIO layers for input and output

SYNOPSIS

    use open IN  => ":crlf", OUT => ":bytes";
    use open OUT => ':utf8';
    use open IO  => ":encoding(iso-8859-7)";
    use open IO  => ':locale';
    use open ':encoding(utf8)';
    use open ':locale';
    use open ':encoding(iso-8859-7)';
    use open ':std';

DESCRIPTION

Full-fledged support for I/O layers is now implemented provided Perl is configured to use PerlIO as its IO system (which is now the default).

The open pragma serves as one of the interfaces to declare default "layers" (also known as "disciplines") for all I/O. Any two-argument open(), readpipe() (aka qx//) and similar operators found within the lexical scope of this pragma will use the declared defaults. Even three-argument opens may be affected by this pragma when they don't specify IO layers in MODE.

With the IN subpragma you can declare the default layers of input streams, and with the OUT subpragma you can declare the default layers of output streams. With the IO subpragma you can control both input and output streams simultaneously.

If you have a legacy encoding, you can use the :encoding(...) tag.

If you want to set your encoding layers based on your locale environment variables, you can use the :locale tag. For example:

    $ENV{LANG} = 'ru_RU.KOI8-R';
    # the :locale will probe the locale environment variables like LANG
    use open OUT => ':locale';
    open(O, ">koi8");
    print O chr(0x430); # Unicode CYRILLIC SMALL LETTER A = KOI8-R 0xc1
    close O;
    open(I, "<koi8");
    printf "%#x\n", ord(<I>); # this should print 0xc1
    close I;

These are equivalent

    use open ':encoding(utf8)';
    use open IO => ':encoding(utf8)';

as are these

    use open ':locale';
    use open IO => ':locale';

and these

    use open ':encoding(iso-8859-7)';
    use open IO => ':encoding(iso-8859-7)';

The matching of encoding names is loose: case does not matter, and many encodings have several aliases. See Encode::Supported for details and the list of supported locales.

When open() is given an explicit list of layers (with the three-arg syntax), they override the list declared using this pragma. open() can also be given a single colon (:) for a layer name, to override this pragma and use the default (:raw on Unix, :crlf on Windows).
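
The interaction between the pragma's defaults and explicit layers can be seen in the sketch below. It assumes a PerlIO-enabled perl; the temporary file comes from File::Temp and the variable names are illustrative.

```perl
use strict;
use warnings;
use open IO => ':encoding(UTF-8)';
use File::Temp qw(tempfile);

my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# No layer in MODE, so the pragma's default output layer applies.
open(my $out, '>', $path) or die "cannot write $path: $!";
print {$out} chr(0x430);              # CYRILLIC SMALL LETTER A
close $out;

# An explicit layer list overrides the pragma: read the raw bytes.
open(my $raw, '<:raw', $path) or die "cannot read $path: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

# No layer again, so input is decoded with the pragma's default.
open(my $in, '<', $path) or die "cannot read $path: $!";
my $text = do { local $/; <$in> };
close $in;

printf "%d bytes on disk, %d character (U+%04X) when decoded\n",
    length $bytes, length $text, ord $text;
```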

The :std subpragma on its own has no effect, but if combined with the :utf8 or :encoding subpragmas, it converts the standard filehandles (STDIN, STDOUT, STDERR) to comply with the encoding selected for input/output handles. For example, if both input and output are chosen to be :encoding(utf8) , a :std will mean that STDIN, STDOUT, and STDERR are also in :encoding(utf8) . On the other hand, if only output is chosen to be in :encoding(koi8r) , a :std will cause only the STDOUT and STDERR to be in koi8r . The :locale subpragma implicitly turns on :std .

The logic of :locale is described in full in encoding, but in short it is first trying nl_langinfo(CODESET) and then guessing from the LC_ALL and LANG locale environment variables.

Directory handles may also support PerlIO layers in the future.

NONPERLIO FUNCTIONALITY

If Perl is not built to use PerlIO as its IO system then only the two pseudo-layers :bytes and :crlf are available.

The :bytes layer corresponds to "binary mode" and the :crlf layer corresponds to "text mode" on platforms that distinguish between the two modes when opening files (which is many DOS-like platforms, including Windows). These two layers are no-ops on platforms where binmode() is a no-op, but perform their functions everywhere if PerlIO is enabled.

IMPLEMENTATION DETAILS

PerlIO::Layer has a class method, find, which is implemented as XS code. It is called by import to validate the layers:

    PerlIO::Layer::->find("perlio")

The return value (if defined) is a Perl object, of class PerlIO::Layer which is created by the C code in perlio.c. As yet there is nothing useful you can do with the object at the perl level.

SEE ALSO

binmode, open, perlunicode, PerlIO, encoding

 
ops

NAME

ops - Perl pragma to restrict unsafe operations when compiling

SYNOPSIS

    perl -Mops=:default ...    # only allow reasonably safe operations
    perl -M-ops=system ...     # disable the 'system' opcode

DESCRIPTION

Since the ops pragma currently has an irreversible global effect, it is only of significant practical use with the -M option on the command line.

See the Opcode module for information about opcodes, optags, opmasks and important information about safety.

SEE ALSO

Opcode, Safe, perlrun

 
overload

NAME

overload - Package for overloading Perl operations

SYNOPSIS

    package SomeThing;
    use overload
        '+' => \&myadd,
        '-' => \&mysub;
        # etc
    ...
    package main;
    $a = SomeThing->new( 57 );
    $b = 5 + $a;
    ...
    if (overload::Overloaded $b) {...}
    ...
    $strval = overload::StrVal $b;

DESCRIPTION

This pragma allows overloading of Perl's operators for a class. To overload built-in functions, see Overriding Built-in Functions in perlsub instead.

Fundamentals

Declaration

Arguments of the use overload directive are (key, value) pairs. For the full set of legal keys, see Overloadable Operations below.

Operator implementations (the values) can be subroutines, references to subroutines, or anonymous subroutines - in other words, anything legal inside a &{ ... } call. Values specified as strings are interpreted as method names. Thus

    package Number;
    use overload
        "-"  => "minus",
        "*=" => \&muas,
        '""' => sub { ...; };

declares that subtraction is to be implemented by method minus() in the class Number (or one of its base classes), and that the function Number::muas() is to be used for the assignment form of multiplication, *= . It also defines an anonymous subroutine to implement stringification: this is called whenever an object blessed into the package Number is used in a string context (this subroutine might, for example, return the number as a Roman numeral).

Calling Conventions and Magic Autogeneration

The following sample implementation of minus() (which assumes that Number objects are simply blessed references to scalars) illustrates the calling conventions:

    package Number;
    sub minus {
        my ($self, $other, $swap) = @_;
        my $result = $$self - $other;         # *
        $result = -$result if $swap;
        ref $result ? $result : bless \$result;
    }
    # * may recurse once - see table below

Three arguments are passed to all subroutines specified in the use overload directive (with one exception - see nomethod). The first of these is the operand providing the overloaded operator implementation - in this case, the object whose minus() method is being called.

The second argument is the other operand, or undef in the case of a unary operator.

The third argument is set to TRUE if (and only if) the two operands have been swapped. Perl may do this to ensure that the first argument ($self ) is an object implementing the overloaded operation, in line with general object calling conventions. For example, if $x and $y are Number s:

    operation   |   generates a call to
    ============|======================
    $x - $y     |   minus($x, $y, '')
    $x - 7      |   minus($x, 7, '')
    7 - $x      |   minus($x, 7, 1)

Perl may also use minus() to implement other operators which have not been specified in the use overload directive, according to the rules for Magic Autogeneration described later. For example, the use overload above declared no subroutine for any of the operators -- , neg (the overload key for unary minus), or -= . Thus

    operation   |   generates a call to
    ============|======================
    -$x         |   minus($x, 0, 1)
    $x--        |   minus($x, 1, undef)
    $x -= 3     |   minus($x, 3, undef)

Note the undefs: where autogeneration results in the method for a standard operator which does not change either of its operands, such as - , being used to implement an operator which changes the operand ("mutators": here, -- and -= ), Perl passes undef as the third argument. This still evaluates as FALSE, consistent with the fact that the operands have not been swapped, but gives the subroutine a chance to alter its behaviour in these cases.

In all the above examples, minus() is required only to return the result of the subtraction: Perl takes care of the assignment to $x. In fact, such methods should not modify their operands, even if undef is passed as the third argument (see Overloadable Operations).
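
The calling conventions can be exercised with a complete, runnable variant of the Number class. This is a sketch rather than the implementation above: the constructor new, the stringification handler and the unwrapping of $other are additions made for illustration, and minus() always returns a fresh object without touching its operands.

```perl
use strict;
use warnings;

package Number;
use overload
    '-'  => 'minus',
    '""' => sub { ${ $_[0] } };

sub new {
    my ($class, $value) = @_;
    return bless \$value, $class;
}

sub minus {
    my ($self, $other, $swap) = @_;
    my $rhs    = ref($other) ? $$other : $other;   # unwrap another Number
    my $result = $$self - $rhs;
    $result = -$result if $swap;                   # operands were swapped
    return Number->new($result);                   # operands left unchanged
}

package main;

my $x = Number->new(10);
my $swapped   = 7 - $x;    # minus($x, 7, 1)
my $unswapped = $x - 7;    # minus($x, 7, '')
my $y = $x;
$y -= 4;                   # autogenerated from '-'; Perl assigns the result

print "$swapped $unswapped $y $x\n";   # -3 3 6 10
```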

The same is not true of implementations of ++ and -- : these are expected to modify their operand. An appropriate implementation of -- might look like

    use overload '--' => "decr",
        # ...
    sub decr { --${$_[0]}; }

Mathemagic, Mutators, and Copy Constructors

The term 'mathemagic' describes the overloaded implementation of mathematical operators. Mathemagical operations raise an issue. Consider the code:

    $a = $b;
    --$a;

If $a and $b are scalars then after these statements

    $a == $b - 1

An object, however, is a reference to blessed data, so if $a and $b are objects then the assignment $a = $b copies only the reference, leaving $a and $b referring to the same object data. One might therefore expect the operation --$a to decrement $b as well as $a . However, this would not be consistent with how we expect the mathematical operators to work.

Perl resolves this dilemma by transparently calling a copy constructor before calling a method defined to implement a mutator (-- , += , and so on.). In the above example, when Perl reaches the decrement statement, it makes a copy of the object data in $a and assigns to $a a reference to the copied data. Only then does it call decr() , which alters the copied data, leaving $b unchanged. Thus the object metaphor is preserved as far as possible, while mathemagical operations still work according to the arithmetic metaphor.

Note: the preceding paragraph describes what happens when Perl autogenerates the copy constructor for an object based on a scalar. For other cases, see Copy Constructor.
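
The effect of the autogenerated copy constructor can be observed with a scalar-based object. In this sketch the Counter class, its constructor and the variable names are hypothetical; no '=' handler is declared, so the copy described above is the one Perl generates itself.

```perl
use strict;
use warnings;

package Counter;
use overload
    '--' => 'decr',
    '""' => sub { ${ $_[0] } };

sub new {
    my ($class, $value) = @_;
    return bless \$value, $class;
}

# A mutator: it changes the object it is handed.
sub decr {
    my ($self) = @_;
    --$$self;
    return $self;
}

package main;

my $original = Counter->new(5);
my $copy     = $original;    # both variables refer to the same data
--$copy;                     # Perl copies the data before calling decr()

print "$copy $original\n";   # 4 5
```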

Overloadable Operations

The complete list of keys that can be specified in the use overload directive is given, separated by spaces, in the values of the hash %overload::ops :

    with_assign       => '+ - * / % ** << >> x .',
    assign            => '+= -= *= /= %= **= <<= >>= x= .=',
    num_comparison    => '< <= > >= == !=',
    '3way_comparison' => '<=> cmp',
    str_comparison    => 'lt le gt ge eq ne',
    binary            => '& &= | |= ^ ^=',
    unary             => 'neg ! ~',
    mutators          => '++ --',
    func              => 'atan2 cos sin exp abs log sqrt int',
    conversion        => 'bool "" 0+ qr',
    iterators         => '<>',
    filetest          => '-X',
    dereferencing     => '${} @{} %{} &{} *{}',
    matching          => '~~',
    special           => 'nomethod fallback ='

Most of the overloadable operators map one-to-one to these keys. Exceptions, including additional overloadable operations not apparent from this hash, are included in the notes which follow.

A warning is issued if an attempt is made to register an operator not found above.

  • not

    The operator not is not a valid key for use overload . However, if the operator ! is overloaded then the same implementation will be used for not (since the two operators differ only in precedence).

  • neg

    The key neg is used for unary minus to disambiguate it from binary - .

  • ++ , --

    Assuming they are to behave analogously to Perl's ++ and -- , overloaded implementations of these operators are required to mutate their operands.

    No distinction is made between prefix and postfix forms of the increment and decrement operators: these differ only in the point at which Perl calls the associated subroutine when evaluating an expression.

  • Assignments
        += -= *= /= %= **= <<= >>= x= .=
        &= |= ^=

    Simple assignment is not overloadable (the '=' key is used for the Copy Constructor). Perl does have a way to make assignments to an object do whatever you want, but this involves using tie(), not overload - see tie and the COOKBOOK examples below.

    The subroutine for the assignment variant of an operator is required only to return the result of the operation. It is permitted to change the value of its operand (this is safe because Perl calls the copy constructor first), but this is optional since Perl assigns the returned value to the left-hand operand anyway.

    An object that overloads an assignment operator does so only in respect of assignments to that object. In other words, Perl never calls the corresponding methods with the third argument (the "swap" argument) set to TRUE. For example, the operation

        $a *= $b

    cannot lead to $b 's implementation of *= being called, even if $a is a scalar. (It can, however, generate a call to $b 's method for * ).

  • Non-mutators with a mutator variant
        + - * / % ** << >> x .
        & | ^

    As described above, Perl may call methods for operators like + and & in the course of implementing missing operations like ++ , += , and &= . While these methods may detect this usage by testing the definedness of the third argument, they should in all cases avoid changing their operands. This is because Perl does not call the copy constructor before invoking these methods.

  • int

    Traditionally, the Perl function int rounds to 0 (see int), and so for floating-point-like types one should follow the same semantic.

  • String, numeric, boolean, and regexp conversions
        "" 0+ bool

    These conversions are invoked according to context as necessary. For example, the subroutine for '""' (stringify) may be used where the overloaded object is passed as an argument to print, and that for 'bool' where it is tested in the condition of a flow control statement (like while ) or the ternary ?: operation.

    Of course, in contexts like, for example, $obj + 1 , Perl will invoke $obj 's implementation of + rather than (in this example) converting $obj to a number using the numify method '0+' (an exception to this is when no method has been provided for '+' and fallback is set to TRUE).

    The subroutines for '""' , '0+' , and 'bool' can return any arbitrary Perl value. If the corresponding operation for this value is overloaded too, the operation will be called again with this value.

    As a special case if the overload returns the object itself then it will be used directly. An overloaded conversion returning the object is probably a bug, because you're likely to get something that looks like YourPackage=HASH(0x8172b34) .

        qr

    The subroutine for 'qr' is used wherever the object is interpolated into or used as a regexp, including when it appears on the RHS of a =~ or !~ operator.

    qr must return a compiled regexp, or a ref to a compiled regexp (such as qr// returns), and any further overloading on the return value will be ignored.

  • Iteration

    If <> is overloaded then the same implementation is used for both the read-filehandle syntax <$var> and globbing syntax <${var}> .

  • File tests

    The key '-X' is used to specify a subroutine to handle all the filetest operators (-f , -x , and so on: see -X for the full list); it is not possible to overload any filetest operator individually. To distinguish them, the letter following the '-' is passed as the second argument (that is, in the slot that for binary operators is used to pass the second operand).

    Calling an overloaded filetest operator does not affect the stat value associated with the special filehandle _ . It still refers to the result of the last stat, lstat or unoverloaded filetest.

    This overload was introduced in Perl 5.12.

  • Matching

    The key "~~" allows you to override the smart matching logic used by the ~~ operator and the switch construct (given /when ). See Switch Statements in perlsyn and feature.

    Unusually, the overloaded implementation of the smart match operator does not get full control of the smart match behaviour. In particular, in the following code:

        package Foo;
        use overload '~~' => 'match';
        my $obj = Foo->new();
        $obj ~~ [ 1,2,3 ];

    the smart match does not invoke the method call like this:

        $obj->match([1,2,3],0);

    rather, the smart match distributive rule takes precedence, so $obj is smart matched against each array element in turn until a match is found, so you may see between one and three of these calls instead:

        $obj->match(1,0);
        $obj->match(2,0);
        $obj->match(3,0);

    Consult the match table in Smartmatch Operator in perlop for details of when overloading is invoked.

  • Dereferencing
        ${} @{} %{} &{} *{}

    If these operators are not explicitly overloaded then they work in the normal way, yielding the underlying scalar, array, or whatever stores the object data (or the appropriate error message if the dereference operator doesn't match it). Defining a catch-all 'nomethod' (see below) makes no difference to this as the catch-all function will not be called to implement a missing dereference operator.

    If a dereference operator is overloaded then it must return a reference of the appropriate type (for example, the subroutine for key '${}' should return a reference to a scalar, not a scalar), or another object which overloads the operator: that is, the subroutine only determines what is dereferenced and the actual dereferencing is left to Perl. As a special case, if the subroutine returns the object itself then it will not be called again - avoiding infinite recursion.

  • Special

      nomethod fallback =

    See Special Keys for use overload.
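The dereferencing rule above (the subroutine returns a reference and Perl does the dereferencing) can be sketched as follows. This is a minimal illustration only: the class Stack and its items field are hypothetical names, not part of overload.pm.

```perl
use strict;
use warnings;

package Stack;    # hypothetical hash-based class, for illustration only

# The '@{}' handler returns an ARRAY reference, not a list;
# Perl performs the actual dereference on whatever comes back.
# Since '%{}' is not overloaded, $_[0]{items} is an ordinary hash access.
use overload '@{}' => sub { $_[0]{items} };

sub new {
    my ($class, @items) = @_;
    return bless { items => [@items] }, $class;
}

package main;

my $stack = Stack->new(1, 2, 3);
push @$stack, 4;            # goes through the '@{}' handler
my @copy  = @$stack;        # (1, 2, 3, 4)
my $first = $stack->[0];    # 1
```

Because the object itself is a hash and only '@{}' is overloaded, the handler can reach the underlying data without any risk of recursion.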

Magic Autogeneration

If a method for an operation is not found then Perl tries to autogenerate a substitute implementation from the operations that have been defined.

Note: the behaviour described in this section can be disabled by setting fallback to FALSE (see fallback).

In the following tables, numbers indicate priority. For example, the table below states that, if no implementation for '!' has been defined then Perl will implement it using 'bool' (that is, by inverting the value returned by the method for 'bool'); if boolean conversion is also unimplemented then Perl will use '0+' or, failing that, '""'.

  operator | can be autogenerated from
           |
           | 0+   ""   bool   .   x
  =========|==========================
     0+    |       1     2
     ""    |  1          2
     bool  |  1    2
     int   |  1    2     3
     !     |  2    3     1
     qr    |  2    1     3
     .     |  2    1     3
     x     |  2    1     3
     .=    |  3    2     4    1
     x=    |  3    2     4        1
     <>    |  2    1     3
     -X    |  2    1     3

Note: The iterator ('<>') and file test ('-X') operators work as normal: if the operand is not a blessed glob or IO reference then it is converted to a string (using the method for '""', '0+', or 'bool') to be interpreted as a glob or filename.

  operator | can be autogenerated from
           |
           |  <   <=>   neg   -=    -
  =========|==========================
     neg   |                        1
     -=    |                        1
     --    |                  1     2
     abs   | a1    a2    b1        b2    [*]
     <     |       1
     <=    |       1
     >     |       1
     >=    |       1
     ==    |       1
     !=    |       1

  * one from [a1, a2] and one from [b1, b2]

Just as numeric comparisons can be autogenerated from the method for '<=>', string comparisons can be autogenerated from that for 'cmp':

  operators          | can be autogenerated from
  ===================|===========================
  lt gt le ge eq ne  |           cmp

Similarly, autogeneration for keys '+=' and '++' is analogous to '-=' and '--' above:

  operator | can be autogenerated from
           |
           |  +=    +
  =========|==========================
     +=    |        1
     ++    |   1    2

And other assignment variations are analogous to '+=' and '-=' (and similar to '.=' and 'x=' above):

            operator ||  *=  /=  %=  **=  <<=  >>=  &=  ^=  |=
  -------------------||----------------------------------------
  autogenerated from ||  *   /   %   **   <<   >>   &   ^   |

Note also that the copy constructor (key '=') may be autogenerated, but only for objects based on scalars. See Copy Constructor.
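As a minimal sketch of the tables above, the hypothetical class Num below declares only '-', '<=>' and '""', and relies on autogeneration for negation, abs, '<' and '-='. The class name and its v field are assumptions made for the example.

```perl
use strict;
use warnings;

package Num;    # hypothetical wrapper around a plain number

use overload
    '-'   => \&minus,
    '<=>' => \&compare,
    '""'  => sub { $_[0]{v} };

sub new { my ($class, $v) = @_; return bless { v => $v }, $class }

sub minus {
    my ($x, $y, $swapped) = @_;
    $y = $y->{v} if ref $y;
    return Num->new($swapped ? $y - $x->{v} : $x->{v} - $y);
}

sub compare {
    my ($x, $y, $swapped) = @_;
    $y = $y->{v} if ref $y;
    my $r = $x->{v} <=> $y;
    return $swapped ? -$r : $r;
}

package main;

my $n   = Num->new(5);
my $neg = -$n;                  # 'neg' autogenerated from '-'
my $abs = abs(Num->new(-3));    # 'abs' autogenerated from '<=>' and '-'
my $lt  = $n < 7;               # '<' autogenerated from '<=>'
$n -= 2;                        # '-=' autogenerated from '-'
```

Note that the autogenerated comparison yields an ordinary logical value, while the results of negation, abs and '-=' are whatever the '-' subroutine returns (here, new Num objects).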

Minimal Set of Overloaded Operations

Since some operations can be automatically generated from others, there is a minimal set of operations that need to be overloaded in order to have the complete set of overloaded operations at one's disposal. Of course, the autogenerated operations may not do exactly what the user expects. The minimal set is:

  + - * / % ** << >> x
  <=> cmp
  & | ^ ~
  atan2 cos sin exp log sqrt int
  "" 0+ bool
  ~~

Of the conversions, only one of string, boolean or numeric is needed because each can be generated from either of the other two.
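For instance, a class that overloads string conversion alone still works in numeric and boolean contexts. The sketch below assumes a hypothetical class Answer with a text field; only the '""' key is declared.

```perl
use strict;
use warnings;

package Answer;    # hypothetical class that overloads string conversion only

use overload '""' => sub { $_[0]{text} };

sub new { my ($class, $text) = @_; return bless { text => $text }, $class }

package main;

my $answer = Answer->new('42');
my $string = "$answer";                       # direct use of '""'
my $number = sprintf '%d', $answer;           # numeric conversion derived from '""'
my $truth  = $answer ? 'true' : 'false';      # boolean conversion derived from '""'
```

Arithmetic operators such as '+' are a different matter: they are not conversions, so they remain unavailable here unless fallback is set to a true value.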

Special Keys for use overload

nomethod

The 'nomethod' key is used to specify a catch-all function to be called for any operator that is not individually overloaded. The specified function will be passed four parameters. The first three arguments coincide with those that would have been passed to the corresponding method if it had been defined. The fourth argument is the use overload key for that missing method.

For example, if $a is an object blessed into a package declaring

  use overload 'nomethod' => 'catch_all', # ...

then the operation

  3 + $a

could (unless a method is specifically declared for the key '+') result in a call

  catch_all($a, 3, 1, '+')

See How Perl Chooses an Operator Implementation.
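The four-argument call described above can be observed with a small recording class. This is a sketch under stated assumptions: Tracer, @calls and the return value 0 are hypothetical and exist only to make the arguments visible.

```perl
use strict;
use warnings;

package Tracer;    # hypothetical class that only declares 'nomethod'

use overload nomethod => \&catch_all;

our @calls;        # log of catch-all invocations, for demonstration

sub new { return bless {}, shift }

sub catch_all {
    my ($self, $other, $swapped, $op) = @_;
    push @calls, { other => $other, swapped => ($swapped ? 1 : 0), op => $op };
    return 0;      # arbitrary result, for illustration
}

package main;

my $obj     = Tracer->new;
my $sum     = 3 + $obj;    # recorded as other => 3, swapped => 1, op => '+'
my $product = $obj * 2;    # recorded as other => 2, swapped => 0, op => '*'
```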

fallback

The value assigned to the key 'fallback' tells Perl how hard it should try to find an alternative way to implement a missing operator.

  • defined, but FALSE
      use overload "fallback" => 0, # ... ;

    This disables Magic Autogeneration.

  • undef

    In the default case where no value is explicitly assigned to fallback, magic autogeneration is enabled.

  • TRUE

    The same as for undef, but if a missing operator cannot be autogenerated then, instead of issuing an error message, Perl is allowed to revert to what it would have done for that operator if there had been no use overload directive.

    Note: in most cases, particularly the Copy Constructor, this is unlikely to be appropriate behaviour.

See How Perl Chooses an Operator Implementation.
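The difference between an undefined and a TRUE fallback can be sketched with two hypothetical classes, Unset and Permissive, that both overload numeric conversion only. The class names and the constant 42 are assumptions for the example.

```perl
use strict;
use warnings;

package Unset;         # 'fallback' left undefined: autogeneration only
use overload '0+' => sub { 42 };
sub new { return bless {}, shift }

package Permissive;    # 'fallback' TRUE: built-in operators as a last resort
use overload '0+' => sub { 42 }, fallback => 1;
sub new { return bless {}, shift }

package main;

my $unset      = Unset->new;
my $permissive = Permissive->new;

my $as_string = "$unset";                         # '""' autogenerated from '0+'
my $died      = !defined eval { $unset + 1 };     # '+' cannot be autogenerated
my $sum       = $permissive + 1;                  # built-in '+' on the numeric value
```

With fallback undefined the missing '+' is an error, whereas with a TRUE fallback Perl quietly uses its built-in addition on the converted operand.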

Copy Constructor

As mentioned above, this operation is called when a mutator is applied to a reference that shares its object with some other reference. For example, if $b is mathemagical, and '++' is overloaded with 'incr', and '=' is overloaded with 'clone', then the code

  $a = $b;
  # ... (other code which does not modify $a or $b) ...
  ++$b;

would be executed in a manner equivalent to

  $a = $b;
  # ...
  $b = $b->clone(undef, "");
  $b->incr(undef, "");

Note:

  • The subroutine for '=' does not overload the Perl assignment operator: it is used only to allow mutators to work as described here. (See Assignments above.)

  • As for other operations, the subroutine implementing '=' is passed three arguments, though the last two are always undef and ''.

  • The copy constructor is called only before a call to a function declared to implement a mutator, for example, if ++$b; in the code above is effected via a method declared for key '++' (or 'nomethod', passed '++' as the fourth argument) or, by autogeneration, '+='. It is not called if the increment operation is effected by a call to the method for '+' since, in the equivalent code,

      $a = $b;
      $b = $b + 1;

    the data referred to by $a is unchanged by the assignment to $b of a reference to new object data.

  • The copy constructor is not called if Perl determines that it is unnecessary because there is no other reference to the data being modified.

  • If 'fallback' is undefined or TRUE then a copy constructor can be autogenerated, but only for objects based on scalars. In other cases it needs to be defined explicitly. Where an object's data is stored as, for example, an array of scalars, the following might be appropriate:

      use overload '=' => sub { bless [ @{$_[0]} ] }, # ...

  • If 'fallback' is TRUE and no copy constructor is defined then, for objects not based on scalars, Perl may silently fall back on simple assignment - that is, assignment of the object reference. In effect, this disables the copy constructor mechanism since no new copy of the object data is created. This is almost certainly not what you want. (It is, however, consistent: for example, Perl's fallback for the ++ operator is to increment the reference itself.)
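The copy-constructor behaviour described in this section can be sketched as follows. The class Counter and the $clones counter are hypothetical; they mirror the 'incr'/'clone' naming used above purely for illustration.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class mirroring the 'incr'/'clone' example

use overload
    '++' => \&incr,
    '='  => \&clone,
    '""' => sub { $_[0]{n} };

our $clones = 0;    # counts copy-constructor calls, for demonstration

sub new   { my ($class, $n) = @_; return bless { n => $n }, $class }
sub incr  { $_[0]{n}++; return $_[0] }
sub clone { $clones++; return Counter->new($_[0]{n}) }

package main;

my $shared = Counter->new(1);
my $alias  = $shared;    # both variables now refer to the same object
++$shared;               # clone() runs first, then incr() on the copy

my $solo = Counter->new(5);
++$solo;                 # no other reference, so clone() is not called
```

After this runs, $alias still stringifies as 1 while $shared stringifies as 2, and the copy constructor has been invoked exactly once.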

How Perl Chooses an Operator Implementation

Which is checked first, nomethod or fallback? If the two operands of an operator are of different types and both overload the operator, which implementation is used? The following are the precedence rules:

1. If the first operand has declared a subroutine to overload the operator then use that implementation.

2. Otherwise, if fallback is TRUE or undefined for the first operand then see if the rules for autogeneration allow another of its operators to be used instead.

3. Unless the operator is an assignment (+=, -=, etc.), repeat step (1) in respect of the second operand.

4. Repeat Step (2) in respect of the second operand.

5. If the first operand has a "nomethod" method then use that.

6. If the second operand has a "nomethod" method then use that.

7. If fallback is TRUE for both operands then perform the usual operation for the operator, treating the operands as numbers, strings, or booleans as appropriate for the operator (see note).

8. Nothing worked - die.

Where there is only one operand (or only one operand with overloading) the checks in respect of the other operand above are skipped.

There are exceptions to the above rules for dereference operations (which, if Step 1 fails, always fall back to the normal, built-in implementations - see Dereferencing), and for ~~ (which has its own set of rules - see Matching under Overloadable Operations above).

Note on Step 7: some operators have a different semantic depending on the type of their operands. As there is no way to instruct Perl to treat the operands as, e.g., numbers instead of strings, the result here may not be what you expect. See BUGS AND PITFALLS.
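Steps 1 and 3 can be illustrated with two hypothetical classes, Left and Right, that both overload '+'. The class names and the returned strings are assumptions chosen to make the selected implementation visible.

```perl
use strict;
use warnings;

package Left;
use overload '+' => sub { 'Left::+' };
sub new { return bless {}, shift }

package Right;
use overload '+' => sub {
    my (undef, undef, $swapped) = @_;
    $swapped ? 'Right::+ (swapped)' : 'Right::+';
};
sub new { return bless {}, shift }

package main;

my ($l, $r) = (Left->new, Right->new);
my $first  = $l + $r;    # step 1: the first operand's implementation is used
my $second = $r + $l;    # step 1 again, now with Right as first operand
my $third  = 1 + $r;     # step 3: second operand, called with swapped arguments
```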

Losing Overloading

The restriction for the comparison operation is that even if, for example, cmp should return a blessed reference, the autogenerated lt function will produce only a standard logical value based on the numerical value of the result of cmp. In particular, a working numeric conversion is needed in this case (possibly expressed in terms of other conversions).

Similarly, .= and x= operators lose their mathemagical properties if the string conversion substitution is applied.

When you chop() a mathemagical object it is promoted to a string and its mathemagical properties are lost. The same can happen with other operations as well.
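A short sketch of the '.=' case: the hypothetical class Label overloads only string conversion, so an appending assignment replaces the object with a plain string. The class name and its text field are assumptions.

```perl
use strict;
use warnings;

package Label;    # hypothetical class with string conversion only
use overload '""' => sub { $_[0]{text} };
sub new { my ($class, $text) = @_; return bless { text => $text }, $class }

package main;

my $label      = Label->new('abc');
my $was_object = ref $label;    # 'Label'
$label .= 'x';                  # '.=' substituted via string conversion
my $is_object  = ref $label;    # '' : $label is now an ordinary string
```

To keep the object, the class would need its own '.' or '.=' subroutine returning a blessed result.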

Inheritance and Overloading

Overloading respects inheritance via the @ISA hierarchy. Inheritance interacts with overloading in two ways.

  • Method names in the use overload directive

    If value in

      use overload key => value;

    is a string, it is interpreted as a method name - which may (in the usual way) be inherited from another class.

  • Overloading of an operation is inherited by derived classes

    Any class derived from an overloaded class is also overloaded and inherits its operator implementations. If the same operator is overloaded in more than one ancestor then the implementation is determined by the usual inheritance rules.

    For example, if A inherits from B and C (in that order), B overloads + with \&D::plus_sub, and C overloads + by "plus_meth", then the subroutine D::plus_sub will be called to implement operation + for an object in package A.

Note that in Perl versions prior to 5.18, inheritance of the fallback key was not governed by the above rules. The value of fallback in the first overloaded ancestor was used. This was fixed in 5.18 to follow the usual rules of inheritance.
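Both interactions can be seen in a small sketch. The classes Base and Derived and their method names are hypothetical: '+' is declared by method name, so a derived class can override it, while '-' is bound to a code reference and is inherited unchanged.

```perl
use strict;
use warnings;

package Base;
use overload
    '+' => 'plus',       # method name: resolved through inheritance
    '-' => \&minus;      # code reference: always Base::minus
sub new   { return bless {}, shift }
sub plus  { 'Base::plus' }
sub minus { 'Base::minus' }

package Derived;
our @ISA = ('Base');
sub plus  { 'Derived::plus' }     # picked up by the inherited '+' overload
sub minus { 'Derived::minus' }    # not used by '-'

package main;

my $d      = Derived->new;
my $added  = $d + 1;    # 'Derived::plus'
my $subbed = $d - 1;    # 'Base::minus'
```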

Run-time Overloading

Since all use directives are executed at compile-time, the only way to change overloading during run-time is to

  eval 'use overload "+" => \&addmethod';

You can also use

  eval 'no overload "+", "--", "<="';

though the use of these constructs during run-time is questionable.

Public Functions

Package overload.pm provides the following public functions:

  • overload::StrVal(arg)

    Gives the string value of arg as in the absence of stringify overloading. If you are using this to get the address of a reference (useful for checking if two references point to the same thing) then you may be better off using Scalar::Util::refaddr(), which is faster.

  • overload::Overloaded(arg)

    Returns true if arg is subject to overloading of some operations.

  • overload::Method(obj,op)

    Returns undef or a reference to the method that implements op.
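The three functions can be exercised as in the following sketch. The class Temp and the string 'hot' are hypothetical; the overload:: calls are the public functions listed above.

```perl
use strict;
use warnings;

package Temp;    # hypothetical class with overloaded stringification
use overload '""' => sub { 'hot' };
sub new { return bless {}, shift }

package main;

my $temp     = Temp->new;
my $shown    = "$temp";                          # uses the overloaded '""'
my $raw      = overload::StrVal($temp);          # of the form Temp=HASH(0x...)
my $is_magic = overload::Overloaded($temp);      # true
my $handler  = overload::Method($temp, '""');    # CODE reference
my $missing  = overload::Method($temp, '+');     # undef: '+' is not overloaded
```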

Overloading Constants

For some applications, the Perl parser mangles constants too much. It is possible to hook into this process via overload::constant() and overload::remove_constant() functions.

These functions take a hash as an argument. The recognized keys of this hash are:

  • integer

    to overload integer constants,

  • float

    to overload floating point constants,

  • binary

    to overload octal and hexadecimal constants,

  • q

    to overload q-quoted strings, constant pieces of qq- and qx-quoted strings and here-documents,

  • qr

    to overload constant pieces of regular expressions.

The corresponding values are references to functions which take three arguments: the first is the initial string form of the constant, the second is how Perl interprets this constant, and the third is how the constant is used. Note that the initial string form does not contain string delimiters, and has backslashes in backslash-delimiter combinations stripped (thus the value of the delimiter is not relevant for processing of this string). The return value of this function is how this constant is going to be interpreted by Perl. The third argument is undefined except for overloaded q- and qr-constants, where it is q in single-quote context (strings, regular expressions, and single-quote here-documents), tr for arguments of the tr/y operators, s for the right-hand side of the s operator, and qq otherwise.

Since an expression "ab$cd,," is just a shortcut for 'ab' . $cd . ',,', it is expected that overloaded constant strings are equipped with a reasonable overloaded catenation operator; otherwise the results will be absurd. Similarly, negative numbers are considered as negations of positive constants.

Note that it is probably meaningless to call the functions overload::constant() and overload::remove_constant() from anywhere but import() and unimport() methods. From these methods they may be called as

  sub import {
      shift;
      return unless @_;
      die "unknown import: @_" unless @_ == 1 and $_[0] eq ':constant';
      overload::constant integer => sub {Math::BigInt->new(shift)};
  }
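A self-contained sketch of the same idea, without depending on Math::BigInt: the hypothetical package TaggedInt installs an integer handler from its import() method, and a BEGIN block stands in for the use statement that would normally call import(). The package name, its fields and the "int<...>" string form are assumptions made for the example.

```perl
use strict;
use warnings;

package TaggedInt;    # hypothetical pragma-like module, for illustration

use overload '""' => sub { "int<$_[0]{source}>" };

sub import {
    my $class = shift;
    overload::constant(
        integer => sub {
            # source text, Perl's interpretation, usage context
            my ($source, $value, $context) = @_;
            return bless { source => $source, value => $value }, $class;
        },
    );
    return;
}

package main;

my $plain = 42;    # ordinary integer
my $tagged;
{
    BEGIN { TaggedInt->import }    # what "use TaggedInt" would do
    $tagged = 42;                  # compiled through the 'integer' handler
}
my $after = 42;    # the handler is not in effect outside the block
```

The handler is lexically scoped, which is why it makes sense to install it from import(): it affects only the code compiled in the scope that loaded the module.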

IMPLEMENTATION

What follows is subject to change RSN.

The table of methods for all operations is cached in magic for the symbol table hash for the package. The cache is invalidated during processing of use overload, no overload, new function definitions, and changes in @ISA.

(Every SVish thing has a magic queue, and magic is an entry in that queue. This is how a single variable may participate in multiple forms of magic simultaneously. For instance, environment variables regularly have two forms at once: their %ENV magic and their taint magic. However, the magic which implements overloading is applied to the stashes, which are rarely used directly, thus should not slow down Perl.)

If a package uses overload, it carries a special flag. This flag is also set when new functions are defined or @ISA is modified. There will be a slight speed penalty on the very first operation thereafter that supports overloading, while the overload tables are updated. If there is no overloading present, the flag is turned off. Thus the only speed penalty thereafter is the checking of this flag.

It is expected that arguments to methods that are not explicitly supposed to be changed are constant (but this is not enforced).

COOKBOOK

Please add examples to what follows!

Two-face Scalars

Put this in two_face.pm in your Perl library directory:

  package two_face;   # Scalars with separate string and
                      # numeric values.
  sub new { my $p = shift; bless [@_], $p }
  use overload '""' => \&str, '0+' => \&num, fallback => 1;
  sub num {shift->[1]}
  sub str {shift->[0]}

Use it as follows:

  require two_face;
  my $seven = two_face->new("vii", 7);
  printf "seven=$seven, seven=%d, eight=%d\n", $seven, $seven+1;
  print "seven contains 'i'\n" if $seven =~ /i/;

(The second line creates a scalar which has both a string value, and a numeric value.) This prints:

  seven=vii, seven=7, eight=8
  seven contains 'i'

Two-face References

Suppose you want to create an object which is accessible as both an array reference and a hash reference.

  package two_refs;
  use overload '%{}' => \&gethash, '@{}' => sub { $ {shift()} };
  sub new {
      my $p = shift;
      bless \ [@_], $p;
  }
  sub gethash {
      my %h;
      my $self = shift;
      tie %h, ref $self, $self;
      \%h;
  }
  sub TIEHASH { my $p = shift; bless \ shift, $p }
  my %fields;
  my $i = 0;
  $fields{$_} = $i++ foreach qw{zero one two three};
  sub STORE {
      my $self = ${shift()};
      my $key = $fields{shift()};
      defined $key or die "Out of band access";
      $$self->[$key] = shift;
  }
  sub FETCH {
      my $self = ${shift()};
      my $key = $fields{shift()};
      defined $key or die "Out of band access";
      $$self->[$key];
  }

Now one can access an object using both the array and hash syntax:

  my $bar = two_refs->new(3,4,5,6);
  $bar->[2] = 11;
  $bar->{two} == 11 or die 'bad hash fetch';

Note several important features of this example. First of all, the actual type of $bar is a scalar reference, and we do not overload the scalar dereference. Thus we can get the actual non-overloaded contents of $bar by just using $$bar (what we do in functions which overload dereference). Similarly, the object returned by the TIEHASH() method is a scalar reference.

Second, we create a new tied hash each time the hash syntax is used. This allows us not to worry about a possibility of a reference loop, which would lead to a memory leak.

Both these problems can be cured. Say, if we want to overload hash dereference on a reference to an object which is implemented as a hash itself, the only problem one has to circumvent is how to access this actual hash (as opposed to the virtual hash exhibited by the overloaded dereference operator). Here is one possible fetching routine:

  sub access_hash {
      my ($self, $key) = (shift, shift);
      my $class = ref $self;
      bless $self, 'overload::dummy'; # Disable overloading of %{}
      my $out = $self->{$key};
      bless $self, $class;            # Restore overloading
      $out;
  }

To remove creation of the tied hash on each access, one may use an extra level of indirection which allows a non-circular structure of references:

  package two_refs1;
  use overload '%{}' => sub { ${shift()}->[1] },
               '@{}' => sub { ${shift()}->[0] };
  sub new {
      my $p = shift;
      my $a = [@_];
      my %h;
      tie %h, $p, $a;
      bless \ [$a, \%h], $p;
  }
  sub gethash {
      my %h;
      my $self = shift;
      tie %h, ref $self, $self;
      \%h;
  }
  sub TIEHASH { my $p = shift; bless \ shift, $p }
  my %fields;
  my $i = 0;
  $fields{$_} = $i++ foreach qw{zero one two three};
  sub STORE {
      my $a = ${shift()};
      my $key = $fields{shift()};
      defined $key or die "Out of band access";
      $a->[$key] = shift;
  }
  sub FETCH {
      my $a = ${shift()};
      my $key = $fields{shift()};
      defined $key or die "Out of band access";
      $a->[$key];
  }

Now if $baz is overloaded like this, then $baz is a reference to a reference to the intermediate array, which keeps a reference to an actual array, and the access hash. The tie()ing object for the access hash is a reference to a reference to the actual array, so

  • There are no loops of references.

  • Both "objects" which are blessed into the class two_refs1 are references to a reference to an array, thus references to a scalar. Thus the accessor expression $$foo->[$ind] involves no overloaded operations.

Symbolic Calculator

Put this in symbolic.pm in your Perl library directory:

  package symbolic;           # Primitive symbolic calculator
  use overload nomethod => \&wrap;
  sub new { shift; bless ['n', @_] }
  sub wrap {
      my ($obj, $other, $inv, $meth) = @_;
      ($obj, $other) = ($other, $obj) if $inv;
      bless [$meth, $obj, $other];
  }

This module is very unusual as overloaded modules go: it does not provide any of the usual overloaded operators; instead it provides an implementation for nomethod. In this example the nomethod subroutine returns an object which encapsulates operations done over the objects: symbolic->new(3) contains ['n', 3], and 2 + symbolic->new(3) contains ['+', 2, ['n', 3]].

Here is an example of a script which "calculates" the side of a circumscribed octagon using the above package:

  require symbolic;
  my $iter = 1;                   # 2**($iter+2) = 8
  my $side = symbolic->new(1);
  my $cnt = $iter;
  while ($cnt--) {
      $side = (sqrt(1 + $side**2) - 1)/$side;
  }
  print "OK\n";

The value of $side is

  ['/', ['-', ['sqrt', ['+', 1, ['**', ['n', 1], 2]],
                       undef], 1], ['n', 1]]

Note that while we obtained this value using a nice little script, there is no simple way to use this value. In fact this value may be inspected in the debugger (see perldebug), but only if the bareStringify option is set, and not via the p command.

If one attempts to print this value, then the overloaded operator "" will be called, which will call the nomethod operator. The result of this operator will be stringified again, but this result is again of type symbolic, which will lead to an infinite loop.

Add a pretty-printer method to the module symbolic.pm:

  sub pretty {
      my ($meth, $a, $b) = @{+shift};
      $a = 'u' unless defined $a;
      $b = 'u' unless defined $b;
      $a = $a->pretty if ref $a;
      $b = $b->pretty if ref $b;
      "[$meth $a $b]";
  }

Now one can finish the script by

  print "side = ", $side->pretty, "\n";

The method pretty is doing object-to-string conversion, so it is natural to overload the operator "" using this method. However, inside such a method it is not necessary to pretty-print the components $a and $b of an object. In the above subroutine "[$meth $a $b]" is a catenation of some strings and components $a and $b. If these components use overloading, the catenation operator will look for an overloaded operator .; if not present, it will look for an overloaded operator "". Thus it is enough to use

  use overload nomethod => \&wrap, '""' => \&str;
  sub str {
      my ($meth, $a, $b) = @{+shift};
      $a = 'u' unless defined $a;
      $b = 'u' unless defined $b;
      "[$meth $a $b]";
  }

Now one can change the last line of the script to

  print "side = $side\n";

which outputs

  side = [/ [- [sqrt [+ 1 [** [n 1 u] 2]] u] 1] [n 1 u]]

and one can inspect the value in the debugger using all the possible methods.

Something is still amiss: consider the loop variable $cnt of the script. It was a number, not an object. We cannot make this value of type symbolic, since then the loop will not terminate.

Indeed, to terminate the cycle, $cnt should become false. However, the operator bool for checking falsity is overloaded (this time via overloaded ""), and returns a long string, thus any object of type symbolic is true. To overcome this, we need a way to compare an object to 0. In fact, it is easier to write a numeric conversion routine.

Here is the text of symbolic.pm with such a routine added (and slightly modified str()):

  package symbolic;           # Primitive symbolic calculator
  use overload
      nomethod => \&wrap, '""' => \&str, '0+' => \&num;
  sub new { shift; bless ['n', @_] }
  sub wrap {
      my ($obj, $other, $inv, $meth) = @_;
      ($obj, $other) = ($other, $obj) if $inv;
      bless [$meth, $obj, $other];
  }
  sub str {
      my ($meth, $a, $b) = @{+shift};
      $a = 'u' unless defined $a;
      if (defined $b) {
          "[$meth $a $b]";
      } else {
          "[$meth $a]";
      }
  }
  my %subr = ( n => sub {$_[0]},
               sqrt => sub {sqrt $_[0]},
               '-' => sub {shift() - shift()},
               '+' => sub {shift() + shift()},
               '/' => sub {shift() / shift()},
               '*' => sub {shift() * shift()},
               '**' => sub {shift() ** shift()},
             );
  sub num {
      my ($meth, $a, $b) = @{+shift};
      my $subr = $subr{$meth}
          or die "Do not know how to ($meth) in symbolic";
      $a = $a->num if ref $a eq __PACKAGE__;
      $b = $b->num if ref $b eq __PACKAGE__;
      $subr->($a,$b);
  }

All the work of numeric conversion is done in %subr and num(). Of course, %subr is not complete, it contains only operators used in the example below. Here is the extra-credit question: why do we need an explicit recursion in num()? (Answer is at the end of this section.)

Use this module like this:

  require symbolic;
  my $iter = symbolic->new(2);    # 16-gon
  my $side = symbolic->new(1);
  my $cnt = $iter;
  while ($cnt) {
      $cnt = $cnt - 1;            # Mutator '--' not implemented
      $side = (sqrt(1 + $side**2) - 1)/$side;
  }
  printf "%s=%f\n", $side, $side;
  printf "pi=%f\n", $side*(2**($iter+2));

It prints (without so many line breaks)

  [/ [- [sqrt [+ 1 [** [/ [- [sqrt [+ 1 [** [n 1] 2]]] 1]
     [n 1]] 2]]] 1]
     [/ [- [sqrt [+ 1 [** [n 1] 2]]] 1] [n 1]]]=0.198912
  pi=3.182598

The above module is very primitive. It does not implement mutator methods (++, -= and so on), does not do deep copying (not required without mutators!), and implements only those arithmetic operations which are used in the example.

To implement most arithmetic operations is easy; one should just use the tables of operations, and change the code which fills %subr to

  my %subr = ( 'n' => sub {$_[0]} );
  foreach my $op (split " ", $overload::ops{with_assign}) {
      $subr{$op} = $subr{"$op="} = eval "sub {shift() $op shift()}";
  }
  my @bins = qw(binary 3way_comparison num_comparison str_comparison);
  foreach my $op (split " ", "@overload::ops{ @bins }") {
      $subr{$op} = eval "sub {shift() $op shift()}";
  }
  foreach my $op (split " ", "@overload::ops{qw(unary func)}") {
      print "defining '$op'\n";
      $subr{$op} = eval "sub {$op shift()}";
  }

Since subroutines implementing assignment operators are not required to modify their operands (see Overloadable Operations above), we do not need anything special to make += and friends work, besides adding these operators to %subr and defining a copy constructor (needed since Perl has no way to know that the implementation of '+=' does not mutate the argument - see Copy Constructor).

To implement a copy constructor, add '=' => \&cpy to the use overload line, and add the following code (which assumes that mutators change things one level deep only, so recursive copying is not needed):

  sub cpy {
      my $self = shift;
      bless [@$self], ref $self;
  }

To make ++ and -- work, we need to implement actual mutators, either directly, or in nomethod. We continue to do things inside nomethod, thus add

  if ($meth eq '++' or $meth eq '--') {
      @$obj = ($meth, (bless [@$obj]), 1); # Avoid circular reference
      return $obj;
  }

after the first line of wrap(). This is not the most efficient implementation; one may consider

  sub inc { $_[0] = bless ['++', shift, 1]; }

instead.

As a final remark, note that one can fill %subr by

  my %subr = ( 'n' => sub {$_[0]} );
  foreach my $op (split " ", $overload::ops{with_assign}) {
      $subr{$op} = $subr{"$op="} = eval "sub {shift() $op shift()}";
  }
  my @bins = qw(binary 3way_comparison num_comparison str_comparison);
  foreach my $op (split " ", "@overload::ops{ @bins }") {
      $subr{$op} = eval "sub {shift() $op shift()}";
  }
  foreach my $op (split " ", "@overload::ops{qw(unary func)}") {
      $subr{$op} = eval "sub {$op shift()}";
  }
  $subr{'++'} = $subr{'+'};
  $subr{'--'} = $subr{'-'};

This finishes implementation of a primitive symbolic calculator in 50 lines of Perl code. Since the numeric values of subexpressions are not cached, the calculator is very slow.

Here is the answer for the exercise: In the case of str(), we need no explicit recursion since the overloaded .-operator will fall back to an existing overloaded operator "". Overloaded arithmetic operators do not fall back to numeric conversion if fallback is not explicitly requested. Thus without an explicit recursion num() would convert ['+', $a, $b] to $a + $b, which would just rebuild the argument of num().

If you wonder why defaults for conversion are different for str() and num(), note how easy it was to write the symbolic calculator. This simplicity is due to an appropriate choice of defaults. One extra note: due to the explicit recursion num() is more fragile than str(): we need to explicitly check for the type of $a and $b. If components $a and $b happen to be of some related type, this may lead to problems.

Really Symbolic Calculator

One may wonder why we call the above calculator symbolic. The reason is that the actual calculation of the value of expression is postponed until the value is used.

To see it in action, add a method

  sub STORE {
      my $obj = shift;
      $#$obj = 1;
      @$obj[0,1] = ('=', shift);   # array slice through the reference
  }

to the package symbolic. After this change one can do

  my $a = symbolic->new(3);
  my $b = symbolic->new(4);
  my $c = sqrt($a**2 + $b**2);

and the numeric value of $c becomes 5. However, after calling

  $a->STORE(12); $b->STORE(5);

the numeric value of $c becomes 13. There is no doubt now that the module symbolic provides a symbolic calculator indeed.

To hide the rough edges under the hood, provide a tie()d interface to the package symbolic. Add methods

  sub TIESCALAR { my $pack = shift; $pack->new(@_) }
  sub FETCH { shift }
  sub nop { }               # Around a bug

(the bug, fixed in Perl 5.14, is described in BUGS). One can use this new interface as

  tie $a, 'symbolic', 3;
  tie $b, 'symbolic', 4;
  $a->nop; $b->nop;         # Around a bug
  my $c = sqrt($a**2 + $b**2);

Now the numeric value of $c is 5. After $a = 12; $b = 5 the numeric value of $c becomes 13. To insulate the user of the module add a method

  sub vars { my $p = shift; tie($_, $p), $_->nop foreach @_; }

Now

  my ($a, $b);
  symbolic->vars($a, $b);
  my $c = sqrt($a**2 + $b**2);
  $a = 3; $b = 4;
  printf "c5 %s=%f\n", $c, $c;
  $a = 12; $b = 5;
  printf "c13 %s=%f\n", $c, $c;

shows that the numeric value of $c follows changes to the values of $a and $b.

AUTHOR

Ilya Zakharevich <ilya@math.mps.ohio-state.edu>.

SEE ALSO

The overloading pragma can be used to enable or disable overloaded operations within a lexical scope - see overloading.

DIAGNOSTICS

When Perl is run with the -Do switch or its equivalent, overloading induces diagnostic messages.

Using the m command of the Perl debugger (see perldebug) one can deduce which operations are overloaded (and which ancestor triggers this overloading). Say, if eq is overloaded, then the method (eq is shown by the debugger. The method () corresponds to the fallback key (in fact the presence of this method shows that this package has overloading enabled, and it is what is used by the Overloaded function of module overload).

The module might issue the following warnings:

  • Odd number of arguments for overload::constant

    (W) The call to overload::constant contained an odd number of arguments. The arguments should come in pairs.

  • '%s' is not an overloadable type

    (W) You tried to overload a constant type the overload package is unaware of.

  • '%s' is not a code reference

    (W) The second (fourth, sixth, ...) argument of overload::constant needs to be a code reference. Either an anonymous subroutine, or a reference to a subroutine.

  • overload arg '%s' is invalid

    (W) use overload was passed an argument it did not recognize. Did you mistype an operator?

BUGS AND PITFALLS

  • A pitfall when fallback is TRUE and Perl resorts to a built-in implementation of an operator is that some operators have more than one semantic, for example |:

      use overload '0+' => sub { $_[0]->{n}; },
          fallback => 1;
      my $x = bless { n => 4 }, "main";
      my $y = bless { n => 8 }, "main";
      print $x | $y, "\n";

    You might expect this to output "12". In fact, it prints "<": the ASCII result of treating "|" as a bitwise string operator - that is, the result of treating the operands as the strings "4" and "8" rather than numbers. The fact that numify (0+) is implemented but stringify ("") isn't makes no difference since the latter is simply autogenerated from the former.

    The only way to change this is to provide your own subroutine for '|'.

  • Magic autogeneration increases the potential for inadvertently creating self-referential structures. Currently Perl will not free self-referential structures until cycles are explicitly broken. For example,

        use overload '+' => 'add';
        sub add { bless [ \$_[0], \$_[1] ] };

    is asking for trouble, since

        $obj += $y;

    will effectively become

        $obj = add($obj, $y, undef);

    with the same result as

        $obj = [\$obj, \$foo];

    Even if no explicit assignment-variants of operators are present in the script, they may be generated by the optimizer. For example,

        "obj = $obj\n"

    may be optimized to

        my $tmp = 'obj = ' . $obj; $tmp .= "\n";

  • The symbol table is filled with names looking like line-noise.

  • This bug was fixed in Perl 5.18, but may still trip you up if you are using older versions:

    For the purpose of inheritance every overloaded package behaves as if fallback is present (possibly undefined). This may create interesting effects if some package is not overloaded, but inherits from two overloaded packages.

  • Before Perl 5.14, the relation between overloading and tie()ing was broken. Overloading was triggered or not based on the previous class of the tie()d variable.

    This happened because the presence of overloading was checked too early, before any tie()d access was attempted. If the class of the value FETCH()ed from the tied variable does not change, a simple workaround for code that is to run on older Perl versions is to access the value (via () = $foo or some such) immediately after tie()ing, so that after this call the previous class coincides with the current one.

  • Barewords are not covered by overloaded string constants.

  • The range operator .. cannot be overloaded.
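As a hedged illustration of the first pitfall in this list, the sketch below supplies an explicit '|' handler so that the operands are always combined numerically. The package name Num, the n field, and the helper _num are hypothetical names chosen for this example; it shows one possible way to write such a handler, not the only one.

```perl
use strict;
use warnings;

package Num;   # hypothetical class name, for illustration only

# Return the numeric payload of either a Num object or a plain scalar.
sub _num { my ($v) = @_; return ref($v) ? $v->{n} : $v }

use overload
    '0+' => sub { $_[0]->{n} },
    # Explicit handler: force both operands to numbers before OR-ing,
    # so Perl never falls back to the bitwise *string* form of '|'.
    '|'  => sub {
        my $left  = 0 + _num($_[0]);
        my $right = 0 + _num($_[1]);
        return $left | $right;
    },
    fallback => 1;

package main;

my $x = bless { n => 4 }, 'Num';
my $y = bless { n => 8 }, 'Num';
my $or = $x | $y;
print "$or\n";   # prints 12
```

Because '|' is commutative, the handler can ignore the third (swapped) argument that Perl passes to binary operator handlers.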

 
overloading

NAME

overloading - perl pragma to lexically control overloading

SYNOPSIS

    {
        no overloading;
        my $str = "$object"; # doesn't call stringification overload
    }
    # it's lexical, so this stringifies:
    warn "$object";
    # overloading can also be disabled per op
    no overloading qw("");
    warn "$object";
    # and also reenabled
    use overloading;

DESCRIPTION

This pragma allows you to lexically disable or enable overloading.

  • no overloading

    Disables overloading entirely in the current lexical scope.

  • no overloading @ops

    Disables only specific overloads in the current lexical scope.

  • use overloading

    Reenables overloading in the current lexical scope.

  • use overloading @ops

    Reenables overloading only for specific ops in the current lexical scope.
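The following minimal sketch shows the lexical effect of the pragma on stringification. The Point class and its x field are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

package Point;   # hypothetical class, for illustration only
use overload '""' => sub { 'Point(' . $_[0]->{x} . ')' };

package main;

my $p = bless { x => 3 }, 'Point';

# Overloading active: the '""' handler runs.
my $pretty = "$p";

# Overloading disabled inside this block only.
my $raw = do { no overloading; "$p" };

print "$pretty\n";   # prints Point(3)
print "$raw\n";      # default reference string, e.g. Point=HASH(0x...); the address varies
```

A typical use is obtaining the default reference string (for debugging or as a hash key) without triggering a possibly expensive stringification handler.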

 
parent

NAME

parent - Establish an ISA relationship with base classes at compile time

SYNOPSIS

  1. package Baz;
  2. use parent qw(Foo Bar);

DESCRIPTION

Allows you to load one or more modules and set up inheritance from those modules at the same time. Mostly similar in effect to

    package Baz;
    BEGIN {
        require Foo;
        require Bar;
        push @ISA, qw(Foo Bar);
    }

By default, every base class needs to live in a file of its own. If you want to have a subclass and its parent class in the same file, you can tell parent not to load any modules by using the -norequire switch:

    package Foo;
    sub exclaim { "I CAN HAS PERL" }
    package DoesNotLoadFooBar;
    use parent -norequire, 'Foo', 'Bar';
    # will not go looking for Foo.pm or Bar.pm

This is equivalent to the following code:

    package Foo;
    sub exclaim { "I CAN HAS PERL" }
    package DoesNotLoadFooBar;
    push @DoesNotLoadFooBar::ISA, 'Foo', 'Bar';

This is also helpful for the case where a package lives within a differently named file:

    package MyHash;
    use Tie::Hash;
    use parent -norequire, 'Tie::StdHash';

This is equivalent to the following code:

    package MyHash;
    require Tie::Hash;
    push @ISA, 'Tie::StdHash';

If you want to load a subclass from a file that require would not consider an eligible filename (that is, it does not end in either .pm or .pmc), use the following code:

    package MySecondPlugin;
    require './plugins/custom.plugin'; # contains Plugin::Custom
    use parent -norequire, 'Plugin::Custom';
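To make the -norequire case concrete, here is a small self-contained sketch in which a parent and a child class share one file and the child calls an inherited method. The class names Animal and Dog and their methods are hypothetical.

```perl
use strict;
use warnings;

package Animal;   # hypothetical parent class defined in this same file
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub speak { my $self = shift; return ref($self) . ' makes a sound' }

package Dog;      # hypothetical child class
# -norequire: do not look for Animal.pm, the class is already defined above.
use parent -norequire, 'Animal';
sub speak { my $self = shift; return $self->SUPER::speak() . ': Woof' }

package main;

my $dog    = Dog->new;
my $phrase = $dog->speak;
print "$phrase\n";   # prints Dog makes a sound: Woof
```

Without -norequire, the use parent line would try to load Animal.pm from @INC and would most likely fail, since no such file exists in this sketch.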

DIAGNOSTICS

  • Class 'Foo' tried to inherit from itself

    Attempting to inherit from yourself generates a warning.

        package Foo;
        use parent 'Foo';

HISTORY

This module was forked from base to remove the cruft that had accumulated in it.

CAVEATS

SEE ALSO

base

AUTHORS AND CONTRIBUTORS

Rafaël Garcia-Suarez, Bart Lateur, Max Maischein, Anno Siegel, Michael Schwern

MAINTAINER

Max Maischein corion@cpan.org

Copyright (c) 2007-10 Max Maischein <corion@cpan.org>

Based on the idea of base.pm, which was introduced with Perl 5.004_04.

LICENSE

This module is released under the same terms as Perl itself.

 
perl

NAME

perl - The Perl 5 language interpreter

SYNOPSIS

perl [ -sTtuUWX ] [ -hv ] [ -V[:configvar] ] [ -cw ] [ -d[t][:debugger] ] [ -D[number/list] ] [ -pna ] [ -Fpattern ] [ -l[octal] ] [ -0[octal/hexadecimal] ] [ -Idir ] [ -m[-]module ] [ -M[-]'module...' ] [ -f ] [ -C [number/list] ] [ -S ] [ -x[dir] ] [ -i[extension] ] [ [-e|-E] 'command' ] [ -- ] [ programfile ] [ argument ]...

For more information on these options, you can run perldoc perlrun .

GETTING HELP

The perldoc program gives you access to all the documentation that comes with Perl. You can get more documentation, tutorials and community support online at http://www.perl.org/.

If you're new to Perl, you should start by running perldoc perlintro , which is a general intro for beginners and provides some background to help you navigate the rest of Perl's extensive documentation. Run perldoc perldoc to learn more things you can do with perldoc.

For ease of access, the Perl manual has been split up into several sections.

Overview

  1. perl Perl overview (this section)
  2. perlintro Perl introduction for beginners
  3. perlrun Perl execution and options
  4. perltoc Perl documentation table of contents

Tutorials

  1. perlreftut Perl references short introduction
  2. perldsc Perl data structures intro
  3. perllol Perl data structures: arrays of arrays
  4. perlrequick Perl regular expressions quick start
  5. perlretut Perl regular expressions tutorial
  6. perlootut Perl OO tutorial for beginners
  7. perlperf Perl Performance and Optimization Techniques
  8. perlstyle Perl style guide
  9. perlcheat Perl cheat sheet
  10. perltrap Perl traps for the unwary
  11. perldebtut Perl debugging tutorial
  12. perlfaq Perl frequently asked questions
  13. perlfaq1 General Questions About Perl
  14. perlfaq2 Obtaining and Learning about Perl
  15. perlfaq3 Programming Tools
  16. perlfaq4 Data Manipulation
  17. perlfaq5 Files and Formats
  18. perlfaq6 Regexes
  19. perlfaq7 Perl Language Issues
  20. perlfaq8 System Interaction
  21. perlfaq9 Networking

Reference Manual

  1. perlsyn Perl syntax
  2. perldata Perl data structures
  3. perlop Perl operators and precedence
  4. perlsub Perl subroutines
  5. perlfunc Perl built-in functions
  6. perlopentut Perl open() tutorial
  7. perlpacktut Perl pack() and unpack() tutorial
  8. perlpod Perl plain old documentation
  9. perlpodspec Perl plain old documentation format specification
  10. perlpodstyle Perl POD style guide
  11. perldiag Perl diagnostic messages
  12. perllexwarn Perl warnings and their control
  13. perldebug Perl debugging
  14. perlvar Perl predefined variables
  15. perlre Perl regular expressions, the rest of the story
  16. perlrebackslash Perl regular expression backslash sequences
  17. perlrecharclass Perl regular expression character classes
  18. perlreref Perl regular expressions quick reference
  19. perlref Perl references, the rest of the story
  20. perlform Perl formats
  21. perlobj Perl objects
  22. perltie Perl objects hidden behind simple variables
  23. perldbmfilter Perl DBM filters
  24. perlipc Perl interprocess communication
  25. perlfork Perl fork() information
  26. perlnumber Perl number semantics
  27. perlthrtut Perl threads tutorial
  28. perlport Perl portability guide
  29. perllocale Perl locale support
  30. perluniintro Perl Unicode introduction
  31. perlunicode Perl Unicode support
  32. perlunifaq Perl Unicode FAQ
  33. perluniprops Index of Unicode properties in Perl
  34. perlunitut Perl Unicode tutorial
  35. perlebcdic Considerations for running Perl on EBCDIC platforms
  36. perlsec Perl security
  37. perlmod Perl modules: how they work
  38. perlmodlib Perl modules: how to write and use
  39. perlmodstyle Perl modules: how to write modules with style
  40. perlmodinstall Perl modules: how to install from CPAN
  41. perlnewmod Perl modules: preparing a new module for distribution
  42. perlpragma Perl modules: writing a user pragma
  43. perlutil utilities packaged with the Perl distribution
  44. perlfilter Perl source filters
  45. perldtrace Perl's support for DTrace
  46. perlglossary Perl Glossary

Internals and C Language Interface

  1. perlembed Perl ways to embed perl in your C or C++ application
  2. perldebguts Perl debugging guts and tips
  3. perlxstut Perl XS tutorial
  4. perlxs Perl XS application programming interface
  5. perlxstypemap Perl XS C/Perl type conversion tools
  6. perlclib Internal replacements for standard C library functions
  7. perlguts Perl internal functions for those doing extensions
  8. perlcall Perl calling conventions from C
  9. perlmroapi Perl method resolution plugin interface
  10. perlreapi Perl regular expression plugin interface
  11. perlreguts Perl regular expression engine internals
  12. perlapi Perl API listing (autogenerated)
  13. perlintern Perl internal functions (autogenerated)
  14. perliol C API for Perl's implementation of IO in Layers
  15. perlapio Perl internal IO abstraction interface
  16. perlhack Perl hackers guide
  17. perlsource Guide to the Perl source tree
  18. perlinterp Overview of the Perl interpreter source and how it works
  19. perlhacktut Walk through the creation of a simple C code patch
  20. perlhacktips Tips for Perl core C code hacking
  21. perlpolicy Perl development policies
  22. perlgit Using git with the Perl repository

Miscellaneous

  1. perlbook Perl book information
  2. perlcommunity Perl community information
  3. perldoc Look up Perl documentation in Pod format
  4. perlhist Perl history records
  5. perldelta Perl changes since previous version
  6. perl5181delta Perl changes in version 5.18.1
  7. perl5180delta Perl changes in version 5.18.0
  8. perl5161delta Perl changes in version 5.16.1
  9. perl5162delta Perl changes in version 5.16.2
  10. perl5163delta Perl changes in version 5.16.3
  11. perl5160delta Perl changes in version 5.16.0
  12. perl5144delta Perl changes in version 5.14.4
  13. perl5143delta Perl changes in version 5.14.3
  14. perl5142delta Perl changes in version 5.14.2
  15. perl5141delta Perl changes in version 5.14.1
  16. perl5140delta Perl changes in version 5.14.0
  17. perl5125delta Perl changes in version 5.12.5
  18. perl5124delta Perl changes in version 5.12.4
  19. perl5123delta Perl changes in version 5.12.3
  20. perl5122delta Perl changes in version 5.12.2
  21. perl5121delta Perl changes in version 5.12.1
  22. perl5120delta Perl changes in version 5.12.0
  23. perl5101delta Perl changes in version 5.10.1
  24. perl5100delta Perl changes in version 5.10.0
  25. perl589delta Perl changes in version 5.8.9
  26. perl588delta Perl changes in version 5.8.8
  27. perl587delta Perl changes in version 5.8.7
  28. perl586delta Perl changes in version 5.8.6
  29. perl585delta Perl changes in version 5.8.5
  30. perl584delta Perl changes in version 5.8.4
  31. perl583delta Perl changes in version 5.8.3
  32. perl582delta Perl changes in version 5.8.2
  33. perl581delta Perl changes in version 5.8.1
  34. perl58delta Perl changes in version 5.8.0
  35. perl561delta Perl changes in version 5.6.1
  36. perl56delta Perl changes in version 5.6
  37. perl5005delta Perl changes in version 5.005
  38. perl5004delta Perl changes in version 5.004
  39. perlexperiment A listing of experimental features in Perl
  40. perlartistic Perl Artistic License
  41. perlgpl GNU General Public License

Language-Specific

  1. perlcn Perl for Simplified Chinese (in EUC-CN)
  2. perljp Perl for Japanese (in EUC-JP)
  3. perlko Perl for Korean (in EUC-KR)
  4. perltw Perl for Traditional Chinese (in Big5)

Platform-Specific

  1. perlaix Perl notes for AIX
  2. perlamiga Perl notes for AmigaOS
  3. perlbs2000 Perl notes for POSIX-BC BS2000
  4. perlce Perl notes for WinCE
  5. perlcygwin Perl notes for Cygwin
  6. perldgux Perl notes for DG/UX
  7. perldos Perl notes for DOS
  8. perlfreebsd Perl notes for FreeBSD
  9. perlhaiku Perl notes for Haiku
  10. perlhpux Perl notes for HP-UX
  11. perlhurd Perl notes for Hurd
  12. perlirix Perl notes for Irix
  13. perllinux Perl notes for Linux
  14. perlmacos Perl notes for Mac OS (Classic)
  15. perlmacosx Perl notes for Mac OS X
  16. perlnetware Perl notes for NetWare
  17. perlopenbsd Perl notes for OpenBSD
  18. perlos2 Perl notes for OS/2
  19. perlos390 Perl notes for OS/390
  20. perlos400 Perl notes for OS/400
  21. perlplan9 Perl notes for Plan 9
  22. perlqnx Perl notes for QNX
  23. perlriscos Perl notes for RISC OS
  24. perlsolaris Perl notes for Solaris
  25. perlsymbian Perl notes for Symbian
  26. perltru64 Perl notes for Tru64
  27. perlvms Perl notes for VMS
  28. perlvos Perl notes for Stratus VOS
  29. perlwin32 Perl notes for Windows

Stubs for Deleted Documents

  1. perlboot
  2. perlbot
  3. perltodo
  4. perltooc
  5. perltoot
  6. perlrepository

On a Unix-like system, these documentation files will usually also be available as manpages for use with the man program.

Some documentation is not available as man pages, so if a cross-reference is not found by man, try it with perldoc. Perldoc can also take you directly to documentation for functions (with the -f switch). See perldoc --help (or perldoc perldoc or man perldoc ) for other helpful options perldoc has to offer.

In general, if something strange has gone wrong with your program and you're not sure where you should look for help, try making your code comply with use strict and use warnings. These will often point out exactly where the trouble is.

DESCRIPTION

Perl officially stands for Practical Extraction and Report Language, except when it doesn't.

Perl was originally a language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. It quickly became a good language for many system management tasks. Over the years, Perl has grown into a general-purpose programming language. It's widely used for everything from quick "one-liners" to full-scale application development.

The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal). It combines (in the author's opinion, anyway) some of the best features of sed, awk, and sh, making it familiar and easy to use for Unix users to whip up quick solutions to annoying problems. Its general-purpose programming facilities support procedural, functional, and object-oriented programming paradigms, making Perl a comfortable language for the long haul on major projects, whatever your bent.

Perl's roots in text processing haven't been forgotten over the years. It still boasts some of the most powerful regular expressions to be found anywhere, and its support for Unicode text is world-class. It handles all kinds of structured text, too, through an extensive collection of extensions. Those libraries, collected in the CPAN, provide ready-made solutions to an astounding array of problems. When they haven't set the standard themselves, they steal from the best -- just like Perl itself.

AVAILABILITY

Perl is available for most operating systems, including virtually all Unix-like platforms. See Supported Platforms in perlport for a listing.

ENVIRONMENT

See perlrun.

AUTHOR

Larry Wall <larry@wall.org>, with the help of oodles of other folks.

If your Perl success stories and testimonials may be of help to others who wish to advocate the use of Perl in their applications, or if you wish to simply express your gratitude to Larry and the Perl developers, please write to perl-thanks@perl.org .

FILES

  1. "@INC" locations of perl libraries

SEE ALSO

  1. http://www.perl.org/ the Perl homepage
  2. http://www.perl.com/ Perl articles (O'Reilly)
  3. http://www.cpan.org/ the Comprehensive Perl Archive
  4. http://www.pm.org/ the Perl Mongers

DIAGNOSTICS

Using the use strict pragma ensures that all variables are properly declared and prevents other misuses of legacy Perl features.

The use warnings pragma produces some lovely diagnostics. One can also use the -w flag, but its use is normally discouraged, because it gets applied to all executed Perl code, including that not under your control.

See perldiag for explanations of all Perl's diagnostics. The use diagnostics pragma automatically turns Perl's normally terse warnings and errors into these longer forms.

Compilation errors will tell you the line number of the error, with an indication of the next token or token type that was to be examined. (In a script passed to Perl via -e switches, each -e is counted as one line.)

Setuid scripts have additional constraints that can produce error messages such as "Insecure dependency". See perlsec.

Did we mention that you should definitely consider using the use warnings pragma?
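As a small, hedged illustration of the kind of diagnostic use warnings produces, the sketch below traps a warning with a local $SIG{__WARN__} handler so that it can be inspected rather than printed to STDERR. The variable names are made up for the example.

```perl
use strict;
use warnings;

my @caught;
{
    # Collect warnings in an array instead of letting them go to STDERR.
    local $SIG{__WARN__} = sub { push @caught, $_[0] };

    my $count;               # declared but never assigned
    my $total = $count + 1;  # undef is treated as 0, but a warning is issued
}

print scalar(@caught), " warning(s) caught\n";
print $caught[0];   # the text mentions an uninitialized value; wording depends on the Perl version
```

Without use warnings, the addition would silently treat the undefined value as 0 and nothing would be reported.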

BUGS

The behavior implied by the use warnings pragma is not mandatory.

Perl is at the mercy of your machine's definitions of various operations such as type casting, atof(), and floating-point output with sprintf().

If your stdio requires a seek or eof between reads and writes on a particular stream, so does Perl. (This doesn't apply to sysread() and syswrite().)

While none of the built-in data types have any arbitrary size limits (apart from memory size), there are still a few arbitrary limits: a given variable name may not be longer than 251 characters. Line numbers displayed by diagnostics are internally stored as short integers, so they are limited to a maximum of 65535 (higher numbers usually being affected by wraparound).

You may mail your bug reports (be sure to include full configuration information as output by the myconfig program in the perl source tree, or by perl -V ) to perlbug@perl.org . If you've succeeded in compiling perl, the perlbug script in the utils/ subdirectory can be used to help mail in a bug report.

Perl actually stands for Pathologically Eclectic Rubbish Lister, but don't tell anyone I said that.

NOTES

The Perl motto is "There's more than one way to do it." Divining how many more is left as an exercise to the reader.

The three principal virtues of a programmer are Laziness, Impatience, and Hubris. See the Camel Book for why.

 
perl5004delta

NAME

perl5004delta - what's new for perl5.004

DESCRIPTION

This document describes differences between the 5.003 release (as documented in Programming Perl, second edition--the Camel Book) and this one.

Supported Environments

Perl5.004 builds out of the box on Unix, Plan 9, LynxOS, VMS, OS/2, QNX, AmigaOS, and Windows NT. Perl runs on Windows 95 as well, but it cannot be built there, for lack of a reasonable command interpreter.

Core Changes

Most importantly, many bugs were fixed, including several security problems. See the Changes file in the distribution for details.

List assignment to %ENV works

%ENV = () and %ENV = @list now work as expected (except on VMS where it generates a fatal error).

Change to "Can't locate Foo.pm in @INC" error

The error "Can't locate Foo.pm in @INC" now lists the contents of @INC for easier debugging.

Compilation option: Binary compatibility with 5.003

There is a new Configure question that asks if you want to maintain binary compatibility with Perl 5.003. If you choose binary compatibility, you do not have to recompile your extensions, but you might have symbol conflicts if you embed Perl in another application, just as in the 5.003 release. By default, binary compatibility is preserved at the expense of symbol table pollution.

$PERL5OPT environment variable

You may now put Perl options in the $PERL5OPT environment variable. Unless Perl is running with taint checks, it will interpret this variable as if its contents had appeared on a "#!perl" line at the beginning of your script, except that hyphens are optional. PERL5OPT may only be used to set the following switches: -[DIMUdmw].

Limitations on -M, -m, and -T options

The -M and -m options are no longer allowed on the #! line of a script. If a script needs a module, it should invoke it with the use pragma.

The -T option is also forbidden on the #! line of a script, unless it was present on the Perl command line. Due to the way #! works, this usually means that -T must be in the first argument. Thus:

    #!/usr/bin/perl -T -w

will probably work for an executable script invoked as scriptname, while:

    #!/usr/bin/perl -w -T

will probably fail under the same conditions. (Non-Unix systems will probably not follow this rule.) But perl scriptname is guaranteed to fail, since then there is no chance of -T being found on the command line before it is found on the #! line.

More precise warnings

If you removed the -w option from your Perl 5.003 scripts because it made Perl too verbose, we recommend that you try putting it back when you upgrade to Perl 5.004. Each new perl version tends to remove some undesirable warnings, while adding new warnings that may catch bugs in your scripts.

Deprecated: Inherited AUTOLOAD for non-methods

Before Perl 5.004, AUTOLOAD functions were looked up as methods (using the @ISA hierarchy), even when the function to be autoloaded was called as a plain function (e.g. Foo::bar() ), not a method (e.g. Foo->bar() or $obj->bar() ).

Perl 5.005 will use method lookup only for methods' AUTOLOADs. However, there is a significant base of existing code that may be using the old behavior. So, as an interim step, Perl 5.004 issues an optional warning when a non-method uses an inherited AUTOLOAD.

The simple rule is: Inheritance will not work when autoloading non-methods. The simple fix for old code is: In any module that used to depend on inheriting AUTOLOAD for non-methods from a base class named BaseClass , execute *AUTOLOAD = \&BaseClass::AUTOLOAD during startup.
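The fix described above can be sketched as follows. BaseClass is the name used in the text; the Derived package, the frobnicate function, and the string returned by AUTOLOAD are hypothetical and serve only to make the example observable.

```perl
use strict;
use warnings;

package BaseClass;   # name taken from the text above
our $AUTOLOAD;

# Catch-all: report which function was requested and with what arguments.
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;             # strip the package qualifier
    return if $name eq 'DESTROY';  # never autoload destructors
    return "autoloaded $name(" . join(', ', @_) . ")";
}

package Derived;     # hypothetical subclass
our @ISA = ('BaseClass');

# Alias the base class's AUTOLOAD into this package, so that plain
# function calls no longer depend on inheritance to find it.
*AUTOLOAD = \&BaseClass::AUTOLOAD;

package main;

my $result = Derived::frobnicate(1, 2);   # function call, not a method call
print "$result\n";   # prints autoloaded frobnicate(1, 2)
```

Method calls such as Derived->frobnicate would find BaseClass::AUTOLOAD through @ISA anyway; the glob assignment matters only for the plain function call form.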

Previously deprecated %OVERLOAD is no longer usable

Using %OVERLOAD to define overloading was deprecated in 5.003. Overloading is now defined using the overload pragma. %OVERLOAD is still used internally but should not be used by Perl scripts. See overload for more details.

Subroutine arguments created only when they're modified

In Perl 5.004, nonexistent array and hash elements used as subroutine parameters are brought into existence only if they are actually assigned to (via @_ ).

Earlier versions of Perl vary in their handling of such arguments. Perl versions 5.002 and 5.003 always brought them into existence. Perl versions 5.000 and 5.001 brought them into existence only if they were not the first argument (which was almost certainly a bug). Earlier versions of Perl never brought them into existence.

For example, given this code:

    undef @a; undef %a;
    sub show { print $_[0] };
    sub change { $_[0]++ };
    show($a[2]);
    change($a{b});

After this code executes in Perl 5.004, $a{b} exists but $a[2] does not. In Perl 5.002 and 5.003, both $a{b} and $a[2] would have existed (but $a[2]'s value would have been undefined).
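The behavior can be checked directly with exists. The sketch below is a strict-safe variant of the code above; show copies its argument instead of printing it, purely to avoid an "uninitialized" warning, and the variable names holding the results are made up for the example.

```perl
use strict;
use warnings;

my (@a, %a);

sub show   { my $copy = $_[0]; return }   # only reads its argument
sub change { $_[0]++; return }            # assigns through @_

show($a[2]);
change($a{b});

# Record whether each element was brought into existence by the call.
my $hash_elem_exists  = exists $a{b} ? 1 : 0;
my $array_elem_exists = exists $a[2] ? 1 : 0;

print "\$a{b} exists: $hash_elem_exists\n";    # 1: change() assigned to it
print "\$a[2] exists: $array_elem_exists\n";   # 0: show() only read it
```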

Group vector changeable with $)

The $) special variable has always (well, in Perl 5, at least) reflected not only the current effective group, but also the group list as returned by the getgroups() C function (if there is one). However, until this release, there has not been a way to call the setgroups() C function from Perl.

In Perl 5.004, assigning to $) is exactly symmetrical with examining it: The first number in its string value is used as the effective gid; if there are any numbers after the first one, they are passed to the setgroups() C function (if there is one).

Fixed parsing of $$<digit>, &$<digit>, etc.

Perl versions before 5.004 misinterpreted any type marker followed by "$" and a digit. For example, "$$0" was incorrectly taken to mean "${$}0" instead of "${$0}". This bug is (mostly) fixed in Perl 5.004.

However, the developers of Perl 5.004 could not fix this bug completely, because at least two widely-used modules depend on the old meaning of "$$0" in a string. So Perl 5.004 still interprets "$$<digit>" in the old (broken) way inside strings; but it generates this message as a warning. And in Perl 5.005, this special treatment will cease.

Fixed localization of $<digit>, $&, etc.

Perl versions before 5.004 did not always properly localize the regex-related special variables. Perl 5.004 does localize them, as the documentation has always said it should. This may result in $1, $2, etc. no longer being set where existing programs use them.

No resetting of $. on implicit close

The documentation for Perl 5.0 has always stated that $. is not reset when an already-open file handle is reopened with no intervening call to close. Due to a bug, perl versions 5.000 through 5.003 did reset $. under that circumstance; Perl 5.004 does not.

wantarray may return undef

The wantarray operator returns true if a subroutine is expected to return a list, and false otherwise. In Perl 5.004, wantarray can also return the undefined value if a subroutine's return value will not be used at all, which allows subroutines to avoid a time-consuming calculation of a return value if it isn't going to be used.
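A minimal sketch of the three possible results of wantarray follows. The subroutine name context and the @seen array are hypothetical and used only to record what each call observed.

```perl
use strict;
use warnings;

my @seen;   # records the context observed by each call

sub context {
    my $ctx = wantarray          ? 'list'
            : defined(wantarray) ? 'scalar'
            :                      'void';
    push @seen, $ctx;
    return $ctx;
}

my @list   = context();   # list context
my $scalar = context();   # scalar context
context();                # void context: the return value is discarded

print join(', ', @seen), "\n";   # prints list, scalar, void
```

A subroutine that builds an expensive result can test defined(wantarray) first and return early when called in void context.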

eval EXPR determines value of EXPR in scalar context

Perl (version 5) used to determine the value of EXPR inconsistently, sometimes incorrectly using the surrounding context for the determination. Now, the value of EXPR (before being parsed by eval) is always determined in a scalar context. Once parsed, it is executed as before, by providing the context that the scope surrounding the eval provided. This change makes the behavior Perl4 compatible, besides fixing bugs resulting from the inconsistent behavior. This program:

    @a = qw(time now is time);
    print eval @a;
    print '|', scalar eval @a;

used to print something like "timenowis881399109|4", but now (and in perl4) prints "4|4".

Changes to tainting checks

A bug in previous versions may have failed to detect some insecure conditions when taint checks are turned on. (Taint checks are used in setuid or setgid scripts, or when explicitly turned on with the -T invocation option.) Although it's unlikely, this may cause a previously-working script to now fail, which should be construed as a blessing since that indicates a potentially-serious security hole was just plugged.

The new restrictions when tainting include:

  • No glob() or <*>

    These operators may spawn the C shell (csh), which cannot be made safe. This restriction will be lifted in a future version of Perl when globbing is implemented without the use of an external program.

  • No spawning if tainted $CDPATH, $ENV, $BASH_ENV

    These environment variables may alter the behavior of spawned programs (especially shells) in ways that subvert security. So now they are treated as dangerous, in the manner of $IFS and $PATH.

  • No spawning if tainted $TERM doesn't look like a terminal name

    Some termcap libraries do unsafe things with $TERM. However, it would be unnecessarily harsh to treat all $TERM values as unsafe, since only shell metacharacters can cause trouble in $TERM. So a tainted $TERM is considered to be safe if it contains only alphanumerics, underscores, dashes, and colons, and unsafe if it contains other characters (including whitespace).

New Opcode module and revised Safe module

A new Opcode module supports the creation, manipulation and application of opcode masks. The revised Safe module has a new API and is implemented using the new Opcode module. Please read the new Opcode and Safe documentation.

Embedding improvements

In older versions of Perl it was not possible to create more than one Perl interpreter instance inside a single process without leaking like a sieve and/or crashing. The bugs that caused this behavior have all been fixed. However, you still must take care when embedding Perl in a C program. See the updated perlembed manpage for tips on how to manage your interpreters.

Internal change: FileHandle class based on IO::* classes

File handles are now stored internally as type IO::Handle. The FileHandle module is still supported for backwards compatibility, but it is now merely a front end to the IO::* modules, specifically IO::Handle, IO::Seekable, and IO::File. We suggest, but do not require, that you use the IO::* modules in new code.

In harmony with this change, *GLOB{FILEHANDLE} is now just a backward-compatible synonym for *GLOB{IO} .

Internal change: PerlIO abstraction interface

It is now possible to build Perl with AT&T's sfio IO package instead of stdio. See perlapio for more details, and the INSTALL file for how to use it.

New and changed syntax

  • $coderef->(PARAMS)

    A subroutine reference may now be suffixed with an arrow and a (possibly empty) parameter list. This syntax denotes a call of the referenced subroutine, with the given parameters (if any).

    This new syntax follows the pattern of $hashref->{FOO} and $aryref->[$foo] : You may now write &$subref($foo) as $subref->($foo) . All these arrow terms may be chained; thus, &{$table->{FOO}}($bar) may now be written $table->{FOO}->($bar) .
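The equivalence of the old and new call forms can be sketched like this. The $table dispatch table and its FOO entry are hypothetical, chosen to mirror the names used in the text.

```perl
use strict;
use warnings;

# Hypothetical dispatch table holding one code reference.
my $table = {
    FOO => sub { return 'FOO got: ' . join(', ', @_) },
};

my $subref = $table->{FOO};

my $old_style   = &$subref('x');              # pre-5.004 syntax
my $new_style   = $subref->('x');             # arrow syntax
my $old_chained = &{ $table->{FOO} }('x');    # old form through the table
my $new_chained = $table->{FOO}->('x');       # chained arrows

print "$_\n" for $old_style, $new_style, $old_chained, $new_chained;
```

All four calls invoke the same subroutine with the same argument, so the four printed lines are identical.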

New and changed builtin constants

  • __PACKAGE__

    The current package name at compile time, or the undefined value if there is no current package (due to a package; directive). Like __FILE__ and __LINE__ , __PACKAGE__ does not interpolate into strings.
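A short sketch of __PACKAGE__ in use, including the fact that it does not interpolate into strings. The package name My::Logger and the tag subroutine are hypothetical.

```perl
use strict;
use warnings;

package My::Logger;   # hypothetical package name

# __PACKAGE__ is resolved at compile time to the enclosing package name.
sub tag { return '[' . __PACKAGE__ . ']' }

package main;

my $tag     = My::Logger::tag();   # "[My::Logger]"
my $here    = __PACKAGE__;         # "main"
my $literal = "__PACKAGE__";       # no interpolation: the literal text "__PACKAGE__"

print "$tag $here $literal\n";   # prints [My::Logger] main __PACKAGE__
```

To embed the package name in a string, concatenate it as tag does rather than placing the token inside quotes.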

New and changed builtin variables

  • $^E

    Extended error message on some platforms. (Also known as $EXTENDED_OS_ERROR if you use English ).

  • $^H

    The current set of syntax checks enabled by use strict . See the documentation of strict for more details. Not actually new, but newly documented. Because it is intended for internal use by Perl core components, there is no use English long name for this variable.

  • $^M

    By default, running out of memory is not trappable. However, if compiled for this, Perl may use the contents of $^M as an emergency pool after die()ing with this message. Suppose that your Perl were compiled with -DPERL_EMERGENCY_SBRK and used Perl's malloc. Then

        $^M = 'a' x (1<<16);

    would allocate a 64K buffer for use when in emergency. See the INSTALL file for information on how to enable this option. As a disincentive to casual use of this advanced feature, there is no use English long name for this variable.

New and changed builtin functions

  • delete on slices

    This now works (e.g. delete @ENV{'PATH', 'MANPATH'}).
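A small sketch of delete on a hash slice, using an ordinary hash rather than %ENV so that the example has no side effects. The hash contents are made up.

```perl
use strict;
use warnings;

# Hypothetical configuration hash used only for this illustration.
my %config = (host => 'localhost', port => 8080, debug => 1, trace => 0);

# Delete two keys at once; the deleted values are returned in order.
my @removed = delete @config{qw(debug trace)};

my @left = sort keys %config;
print "removed: @removed; left: @left\n";
```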

  • flock

    is now supported on more platforms, prefers fcntl to lockf when emulating, and always flushes before (un)locking.

  • printf and sprintf

    Perl now implements these functions itself; it doesn't use the C library function sprintf() any more, except for floating-point numbers, and even then only known flags are allowed. As a result, it is now possible to know which conversions and flags will work, and what they will do.

    The new conversions in Perl's sprintf() are:

    %i   a synonym for %d
    %p   a pointer (the address of the Perl value, in hexadecimal)
    %n   special: *stores* the number of characters output so far
         into the next variable in the parameter list

    The new flags that go between the % and the conversion are:

    #    prefix octal with "0", hex with "0x"
    h    interpret integer as C type "short" or "unsigned short"
    V    interpret integer as Perl's standard integer type

    Also, where a number would appear in the flags, an asterisk ("*") may be used instead, in which case Perl uses the next item in the parameter list as the given number (that is, as the field width or precision). If a field width obtained through "*" is negative, it has the same effect as the '-' flag: left-justification.

    See sprintf for a complete list of conversions and flags.
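The sketch below exercises a few of the conversions and flags described above: %i, the # flag, and "*" for a field width or precision taken from the argument list. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $int   = sprintf('%i',   42);          # %i behaves like %d
my $oct   = sprintf('%#o',  8);           # "#" prefixes octal with "0"
my $hex   = sprintf('%#x',  255);         # "#" prefixes hex with "0x"
my $right = sprintf('%*d',  6, 42);       # width taken from the argument list
my $left  = sprintf('%*d', -6, 42);       # negative width left-justifies
my $prec  = sprintf('%.*f', 2, 3.14159);  # precision taken from the argument list

print join('|', $int, $oct, $hex, $right, $left, $prec), "\n";
```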

  • keys as an lvalue

    As an lvalue, keys allows you to increase the number of hash buckets allocated for the given hash. This can gain you a measure of efficiency if you know the hash is going to get big. (This is similar to pre-extending an array by assigning a larger number to $#array.) If you say

    keys %hash = 200;

    then %hash will have at least 200 buckets allocated for it. These buckets will be retained even if you do %hash = (); use undef %hash if you want to free the storage while %hash is still in scope. You can't shrink the number of buckets allocated for the hash using keys in this way (but you needn't worry about doing this by accident, as trying has no effect).

  • my() in Control Structures

    You can now use my() (with or without the parentheses) in the control expressions of control structures such as:

    while (defined(my $line = <>)) {
        $line = lc $line;
    } continue {
        print $line;
    }

    if ((my $answer = <STDIN>) =~ /^y(es)?$/i) {
        user_agrees();
    } elsif ($answer =~ /^n(o)?$/i) {
        user_disagrees();
    } else {
        chomp $answer;
        die "`$answer' is neither `yes' nor `no'";
    }

    Also, you can declare a foreach loop control variable as lexical by preceding it with the word "my". For example, in:

    foreach my $i (1, 2, 3) {
        some_function();
    }

    $i is a lexical variable, and the scope of $i extends to the end of the loop, but not beyond it.

    Note that you still cannot use my() on global punctuation variables such as $_ and the like.

  • pack() and unpack()

    A new format 'w' represents a BER compressed integer (as defined in ASN.1). Its format is a sequence of one or more bytes, each of which provides seven bits of the total value, with the most significant first. Bit eight of each byte is set, except for the last byte, in which bit eight is clear.

    If 'p' or 'P' are given undef as values, they now generate a NULL pointer.

    Both pack() and unpack() now fail when their templates contain invalid types. (Invalid types used to be ignored.)
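To illustrate the 'w' format, the sketch below packs 300 as a BER compressed integer. Since 300 needs two seven-bit groups, it occupies two bytes, with the high bit set on the first byte only. The final lines show that an invalid template letter now raises an exception; 'Y' is used here on the assumption that it is not a valid pack type.

```perl
use strict;
use warnings;

# 300 = 0b10_0101100: two seven-bit groups, most significant first.
my $packed = pack('w', 300);
my $hex    = unpack('H*', $packed);      # bytes as hexadecimal digits

my ($value) = unpack('w', $packed);      # decodes back to the integer

# Several values can be packed and unpacked with a repeat count.
my @round = unpack('w*', pack('w*', 1, 128, 16384));

# An invalid template letter is an error rather than being ignored.
my $accepted = eval { my $junk = pack('Y', 1); 1 };

print "$hex $value @round\n";
```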

  • sysseek()

    The new sysseek() operator is a variant of seek() that sets and gets the file's system read/write position, using the lseek(2) system call. It is the only reliable way to seek before using sysread() or syswrite(). Its return value is the new position, or the undefined value on failure.
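A minimal sketch of sysseek() used together with syswrite() and sysread() on a temporary file. File::Temp is used here only to obtain a scratch file handle; it is not part of the feature being described.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);              # SEEK_SET, SEEK_CUR, SEEK_END
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);

# Use only the unbuffered sys* calls on this handle.
defined syswrite($fh, "hello, world\n") or die "syswrite: $!";

# Move to byte offset 7; the new position is returned.
my $pos = sysseek($fh, 7, SEEK_SET);
defined $pos or die "sysseek: $!";

my $count = sysread($fh, my $buf, 5);
defined $count or die "sysread: $!";

print "pos=$pos read=$buf\n";
```

Checking the result with defined rather than for truth is deliberate, since a successful seek to the start of the file must still be distinguishable from failure.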

  • use VERSION

    If the first argument to use is a number, it is treated as a version number instead of a module name. If the version of the Perl interpreter is less than VERSION, then an error message is printed and Perl exits immediately. Because use occurs at compile time, this check happens immediately during the compilation process, unlike require VERSION, which waits until runtime for the check. This is often useful if you need to check the current Perl version before you use library modules which have changed in incompatible ways from older versions of Perl. (We try not to do this more than we have to.)

  • use Module VERSION LIST

    If the VERSION argument is present between Module and LIST, then the use will call the VERSION method in class Module with the given version as an argument. The default VERSION method, inherited from the UNIVERSAL class, croaks if the given version is larger than the value of the variable $Module::VERSION. (Note that there is not a comma after VERSION!)

    This version-checking mechanism is similar to the one currently used in the Exporter module, but it is faster and can be used with modules that don't use the Exporter. It is the recommended method for new code.
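The check that use Module VERSION LIST performs can be sketched by calling the VERSION method directly. The package My::Module and its version number are hypothetical; it is defined inline so the example is self-contained, which is also why the method is called explicitly instead of through use.

```perl
use strict;
use warnings;

package My::Module;          # hypothetical module, defined inline
our $VERSION = '1.20';

package main;

# Roughly what "use My::Module 1.10;" would check at compile time.
my $ok = eval { My::Module->VERSION(1.10); 1 } ? 1 : 0;

# Asking for a newer version than is installed raises an exception.
my $rejected = eval { My::Module->VERSION(2.00); 1 } ? 0 : 1;
my $error    = $@;

print "ok=$ok rejected=$rejected\n";
print "error: $error" if $rejected;
```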

  • prototype(FUNCTION)

    Returns the prototype of a function as a string (or undef if the function has no prototype). FUNCTION is a reference to or the name of the function whose prototype you want to retrieve. (Not actually new; just never documented before.)
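A short sketch of prototype() applied by name and by reference. The three subroutines are invented for the example.

```perl
use strict;
use warnings;

sub now ()    { return time }                               # empty prototype
sub max2 ($$) { return $_[0] > $_[1] ? $_[0] : $_[1] }      # two scalars
sub plain     { return }                                    # no prototype

my $p_now   = prototype('now');      # looked up by name
my $p_max2  = prototype(\&max2);     # looked up by reference
my $p_plain = prototype('plain');    # undef: no prototype at all

printf "now=(%s) max2=(%s) plain=%s\n",
    $p_now, $p_max2, defined $p_plain ? "($p_plain)" : 'undef';
```

Note the difference between an empty prototype, which yields an empty string, and no prototype, which yields undef.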

  • srand

    The default seed for srand, which used to be time, has been changed. Now it's a heady mix of difficult-to-predict system-dependent values, which should be sufficient for most everyday purposes.

    Previous to version 5.004, calling rand without first calling srand would yield the same sequence of random numbers on most or all machines. Now, when perl sees that you're calling rand and haven't yet called srand, it calls srand with the default seed. You should still call srand manually if your code might ever be run on a pre-5.004 system, of course, or if you want a seed other than the default.
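When a repeatable sequence is wanted (for tests, say), an explicit seed still gives one. A minimal sketch, with an arbitrary seed value:

```perl
use strict;
use warnings;

# The seed 42 is arbitrary; any fixed value gives a repeatable sequence.
srand(42);
my @first = map { int rand 1000 } 1 .. 5;

srand(42);
my @second = map { int rand 1000 } 1 .. 5;

print "first:  @first\nsecond: @second\n";
```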

  • $_ as Default

    Functions documented in the Camel to default to $_ now in fact do, and all those that do are so documented in perlfunc.

  • m//gc does not reset search position on failure

    The m//g match iteration construct has always reset its target string's search position (which is visible through the pos operator) when a match fails; as a result, the next m//g match after a failure starts again at the beginning of the string. With Perl 5.004, this reset may be disabled by adding the "c" (for "continue") modifier, i.e. m//gc. This feature, in conjunction with the \G zero-width assertion, makes it possible to chain matches together. See perlop and perlre.
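Chaining matches with \G and the "c" modifier is the basis of simple hand-written lexers. The sketch below tokenizes a tiny expression; the token names and the input are invented for illustration. Because of /c, a failed alternative leaves pos() where it was, so the next pattern is tried at the same offset.

```perl
use strict;
use warnings;

my $input = 'x = 42 + y';
my @tokens;

while (1) {
    if    ($input =~ /\G\s+/gc)            { next }                        # skip blanks
    elsif ($input =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, "IDENT($1)" }
    elsif ($input =~ /\G(\d+)/gc)          { push @tokens, "NUM($1)" }
    elsif ($input =~ /\G([=+])/gc)         { push @tokens, "OP($1)" }
    elsif ($input =~ /\G\z/gc)             { last }                        # end of input
    else {
        die 'unexpected character at offset ' . (pos($input) || 0) . "\n";
    }
}

print "@tokens\n";
```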

  • m//x ignores whitespace before ?*+{}

    The m//x construct has always been intended to ignore all unescaped whitespace. However, before Perl 5.004, whitespace had the effect of escaping repeat modifiers like "*" or "?"; for example, /a *b/x was (mis)interpreted as /a\*b/x. This bug has been fixed in 5.004.
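With the fix in place, whitespace before a quantifier under /x is simply ignored, as this small sketch checks:

```perl
use strict;
use warnings;

# Under /x this is the same pattern as /\Aa*b\z/.
my $re = qr/\A a * b \z/x;

my $matches_run     = 'aaab' =~ $re ? 1 : 0;   # "a" repeated, then "b"
my $matches_bare_b  = 'b'    =~ $re ? 1 : 0;   # zero "a"s is allowed
my $matches_literal = 'a*b'  =~ $re ? 1 : 0;   # "*" is not a literal star

print "$matches_run $matches_bare_b $matches_literal\n";
```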

  • nested sub{} closures work now

    Prior to the 5.004 release, nested anonymous functions didn't work right. They do now.
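As an illustration of what is meant, the innermost anonymous sub below closes over lexicals from two enclosing scopes. The function names are hypothetical.

```perl
use strict;
use warnings;

# Returns a factory; the factory returns a function of one argument.
sub make_scaler_factory {
    my ($offset) = @_;
    return sub {
        my ($factor) = @_;
        # This sub captures both $offset and $factor.
        return sub { return $offset + $factor * $_[0] };
    };
}

my $factory = make_scaler_factory(10);
my $times3  = $factory->(3);
my $times5  = $factory->(5);

my $a3 = $times3->(2);
my $a5 = $times5->(2);
print "$a3 $a5\n";
```

Each inner closure keeps its own copy of $factor while sharing the same $offset.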

  • formats work right on changing lexicals

    Just like anonymous functions that contain lexical variables that change (like a lexical index variable for a foreach loop), formats now work properly. For example, this silently failed before (printed only zeros), but is fine now:

    my $i;
    foreach $i ( 1 .. 10 ) {
        write;
    }
    format =
        my i is @#
        $i
    .

    However, it still fails (without a warning) if the foreach is within a subroutine:

    my $i;
    sub foo {
        foreach $i ( 1 .. 10 ) {
            write;
        }
    }
    foo;
    format =
        my i is @#
        $i
    .

New builtin methods

The UNIVERSAL package automatically contains the following methods that are inherited by all other classes:

  • isa(CLASS)

    isa returns true if its object is blessed into CLASS or a subclass of CLASS.

    isa is also exportable and can be called as a sub with two arguments. This makes it possible to check what a reference points to. Example:

    use UNIVERSAL qw(isa);
    if (isa($ref, 'ARRAY')) {
        ...
    }

  • can(METHOD)

    can checks to see if its object has a method called METHOD. If it does, a reference to the sub is returned; if it does not, undef is returned.

  • VERSION( [NEED] )

    VERSION returns the version number of the class (package). If the NEED argument is given, it checks that the current version (as defined by the $VERSION variable in the given package) is not less than NEED, and dies if this is not the case. This method is normally called as a class method. It is called automatically by the VERSION form of use.

    use A 1.2 qw(some imported subs);
    # implies:
    A->VERSION(1.2);

NOTE: can directly uses Perl's internal code for method lookup, and isa uses a very similar method and caching strategy. This may cause strange effects if the Perl code dynamically changes @ISA in any package.

You may add other methods to the UNIVERSAL class via Perl or XS code. You do not need to use UNIVERSAL in order to make these methods available to your program. This is necessary only if you wish to have isa available as a plain subroutine in the current package.
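The sketch below calls isa and can as inherited methods, which needs no use UNIVERSAL line. The Animal and Dog classes are invented for the example; the fully qualified UNIVERSAL::isa call at the end is the function form applied to a plain reference.

```perl
use strict;
use warnings;

package Animal;                       # hypothetical base class
sub new   { my ($class) = @_; return bless {}, $class }
sub speak { return 'generic noise' }

package Dog;                          # hypothetical subclass
our @ISA = ('Animal');
sub speak { return 'Woof' }

package main;

my $dog = Dog->new;

my $is_animal = $dog->isa('Animal') ? 1 : 0;

# can returns a code reference, which may then be invoked as a method.
my $method = $dog->can('speak');
my $sound  = $method ? $dog->$method() : 'silent';

my $can_fly = defined $dog->can('fly') ? 1 : 0;

# Function form, usable on unblessed references as well.
my $is_array = UNIVERSAL::isa([1, 2, 3], 'ARRAY') ? 1 : 0;

print "$is_animal $sound $can_fly $is_array\n";
```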

TIEHANDLE now supported

See perltie for other kinds of tie()s.

  • TIEHANDLE classname, LIST

    This is the constructor for the class. That means it is expected to return an object of some sort. The reference can be used to hold some internal information.

    sub TIEHANDLE {
        print "<shout>\n";
        my $i;
        return bless \$i, shift;
    }

  • PRINT this, LIST

    This method will be triggered every time the tied handle is printed to. Beyond its self reference it also expects the list that was passed to the print function.

    sub PRINT {
        $r = shift;
        $$r++;
        return print join( $, => map {uc} @_), $\;
    }

  • PRINTF this, LIST

    This method will be triggered every time the tied handle is printed to with the printf() function. Beyond its self reference it also expects the format and list that was passed to the printf function.

    sub PRINTF {
        shift;
        my $fmt = shift;
        print sprintf($fmt, @_)."\n";
    }

  • READ this, LIST

    This method will be called when the handle is read from via the read or sysread functions.

    sub READ {
        $r = shift;
        my($buf,$len,$offset) = @_;
        print "READ called, \$buf=$buf, \$len=$len, \$offset=$offset";
    }

  • READLINE this

    This method will be called when the handle is read from. The method should return undef when there is no more data.

    sub READLINE {
        $r = shift;
        return "PRINT called $$r times\n";
    }

  • GETC this

    This method will be called when the getc function is called.

    sub GETC { print "Don't GETC, Get Perl"; return "a"; }

  • DESTROY this

    As with the other types of ties, this method will be called when the tied handle is about to be destroyed. This is useful for debugging and possibly for cleaning up.

    sub DESTROY {
        print "</shout>\n";
    }
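Putting the pieces together, here is a self-contained sketch of a tied handle in use. It is loosely modeled on the methods above but is not the same class: this hypothetical Shout class stores upper-cased output in memory instead of printing it, so that the effect can be inspected.

```perl
use strict;
use warnings;

package Shout;    # hypothetical class for this illustration

sub TIEHANDLE {
    my ($class) = @_;
    return bless { count => 0, out => [] }, $class;
}

# Called for every print to the tied handle.
sub PRINT {
    my ($self, @list) = @_;
    $self->{count}++;
    push @{ $self->{out} }, uc join('', @list);
    return 1;
}

# Called when the tied handle is read with <FH>.
sub READLINE {
    my ($self) = @_;
    return "PRINT called $self->{count} times\n";
}

package main;

tie *FH, 'Shout';
print FH 'hello';
print FH 'tied ', 'world';

my $report   = <FH>;
my @captured = @{ tied(*FH)->{out} };
untie *FH;

print $report;
print "$_\n" for @captured;
```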

Malloc enhancements

If perl is compiled with the malloc included with the perl distribution (that is, if perl -V:d_mymalloc is 'define') then you can print memory statistics at runtime by running Perl thusly:

  env PERL_DEBUG_MSTATS=2 perl your_script_here

The value of 2 means to print statistics after compilation and on exit; with a value of 1, the statistics are printed only on exit. (If you want the statistics at an arbitrary time, you'll need to install the optional module Devel::Peek.)

Three new compilation flags are recognized by malloc.c. (They have no effect if perl is compiled with system malloc().)

  • -DPERL_EMERGENCY_SBRK

    If this macro is defined, running out of memory need not be a fatal error: a memory pool can be allocated by assigning to the special variable $^M. See $^M.

  • -DPACK_MALLOC

    Perl memory allocation is by bucket with sizes close to powers of two. Because of this, malloc overhead may be large, especially for data whose size is exactly a power of two. If PACK_MALLOC is defined, perl uses a slightly different algorithm for small allocations (up to 64 bytes long), which makes it possible to bring the overhead down to 1 byte for allocations which are powers of two (and which appear quite often).

    Expected memory savings (with 8-byte alignment in alignbytes) are about 20% for typical Perl usage. Expected slowdown due to additional malloc overhead is in fractions of a percent (hard to measure, because of the effect of saved memory on speed).

  • -DTWO_POT_OPTIMIZE

    Similarly to PACK_MALLOC, this macro improves allocations of data with size close to a power of two; but this works for big allocations (starting with 16K by default). Such allocations are typical for big hashes and special-purpose scripts, especially image processing.

    On recent systems, the fact that perl requires 2M from system for 1M allocation will not affect speed of execution, since the tail of such a chunk is not going to be touched (and thus will not require real memory). However, it may result in a premature out-of-memory error. So if you will be manipulating very large blocks with sizes close to powers of two, it would be wise to define this macro.

    Expected saving of memory is 0-100% (100% in applications which require most memory in such 2**n chunks); expected slowdown is negligible.

Miscellaneous efficiency enhancements

Functions that have an empty prototype and that do nothing but return a fixed value are now inlined (e.g. sub PI () { 3.14159 }).
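A brief sketch of such constant functions in use. The names PI, TWO_PI and circle_area are illustrative; TWO_PI is declared with the constant pragma, which creates the same kind of subroutine.

```perl
use strict;
use warnings;

# Empty prototype plus a fixed return value: a candidate for inlining.
sub PI () { 3.14159 }

# The constant pragma builds an equivalent subroutine for us.
use constant TWO_PI => 2 * PI;

sub circle_area {
    my ($r) = @_;
    return PI * $r ** 2;    # PI is parsed as a term, not as a call taking arguments
}

my $area = circle_area(2);
printf "area=%.5f two_pi=%.5f\n", $area, TWO_PI;
```

The empty prototype matters for parsing as well as for speed: without it, the expression after PI would be taken as the argument list of a function call.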

Each unique hash key is only allocated once, no matter how many hashes have an entry with that key. So even if you have 100 copies of the same hash, the hash keys never have to be reallocated.

Support for More Operating Systems

Support for the following operating systems is new in Perl 5.004.

Win32

Perl 5.004 now includes support for building a "native" perl under Windows NT, using the Microsoft Visual C++ compiler (versions 2.0 and above) or the Borland C++ compiler (versions 5.02 and above). The resulting perl can be used under Windows 95 (if it is installed in the same directory locations as it got installed in Windows NT). This port includes support for perl extension building tools like ExtUtils::MakeMaker and h2xs, so that many extensions available on the Comprehensive Perl Archive Network (CPAN) can now be readily built under Windows NT. See http://www.perl.com/ for more information on CPAN and README.win32 in the perl distribution for more details on how to get started with building this port.

There is also support for building perl under the Cygwin32 environment. Cygwin32 is a set of GNU tools that make it possible to compile and run many Unix programs under Windows NT by providing a mostly Unix-like interface for compilation and execution. See README.cygwin32 in the perl distribution for more details on this port and how to obtain the Cygwin32 toolkit.

Plan 9

See README.plan9 in the perl distribution.

QNX

See README.qnx in the perl distribution.

AmigaOS

See README.amigaos in the perl distribution.

Pragmata

Six new pragmatic modules exist:

  • use autouse MODULE => qw(sub1 sub2 sub3)

    Defers require MODULE until someone calls one of the specified subroutines (which must be exported by MODULE). This pragma should be used with caution, and only when necessary.

  • use blib
  • use blib 'dir'

    Looks for MakeMaker-like 'blib' directory structure starting in dir (or current directory) and working back up to five levels of parent directories.

    Intended for use on command line with -M option as a way of testing arbitrary scripts against an uninstalled version of a package.

  • use constant NAME => VALUE

    Provides a convenient interface for creating compile-time constants. See Constant Functions in perlsub.

  • use locale

    Tells the compiler to enable (or disable) the use of POSIX locales for builtin operations.

    When use locale is in effect, the current LC_CTYPE locale is used for regular expressions and case mapping; LC_COLLATE for string ordering; and LC_NUMERIC for numeric formatting in printf and sprintf (but not in print). LC_NUMERIC is always used in write, since lexical scoping of formats is problematic at best.

    Each use locale or no locale affects statements to the end of the enclosing BLOCK or, if not inside a BLOCK, to the end of the current file. Locales can be switched and queried with POSIX::setlocale().

    See perllocale for more information.

  • use ops

    Disable unsafe opcodes, or any named opcodes, when compiling Perl code.

  • use vmsish

    Enable VMS-specific language features. Currently, there are three VMS-specific features available: 'status', which makes $? and system return genuine VMS status values instead of emulating POSIX; 'exit', which makes exit take a genuine VMS status value instead of assuming that exit 1 is an error; and 'time', which makes all times relative to the local time zone, in the VMS tradition.

Modules

Required Updates

Though Perl 5.004 is compatible with almost all modules that work with Perl 5.003, there are a few exceptions:

  Module   Required Version for Perl 5.004
  ------   -------------------------------
  Filter   Filter-1.12
  LWP      libwww-perl-5.08
  Tk       Tk400.202 (-w makes noise)

Also, the majordomo mailing list program, version 1.94.1, doesn't work with Perl 5.004 (nor with perl 4), because it executes an invalid regular expression. This bug is fixed in majordomo version 1.94.2.

Installation directories

The installperl script now places the Perl source files for extensions in the architecture-specific library directory, which is where the shared libraries for extensions have always been. This change is intended to allow administrators to keep the Perl 5.004 library directory unchanged from a previous version, without running the risk of binary incompatibility between extensions' Perl source and shared libraries.

Module information summary

Brand new modules, arranged by topic rather than strictly alphabetically:

  CGI.pm               Web server interface ("Common Gateway Interface")
  CGI/Apache.pm        Support for Apache's Perl module
  CGI/Carp.pm          Log server errors with helpful context
  CGI/Fast.pm          Support for FastCGI (persistent server process)
  CGI/Push.pm          Support for server push
  CGI/Switch.pm        Simple interface for multiple server types
  CPAN                 Interface to Comprehensive Perl Archive Network
  CPAN::FirstTime      Utility for creating CPAN configuration file
  CPAN::Nox            Runs CPAN while avoiding compiled extensions
  IO.pm                Top-level interface to IO::* classes
  IO/File.pm           IO::File extension Perl module
  IO/Handle.pm         IO::Handle extension Perl module
  IO/Pipe.pm           IO::Pipe extension Perl module
  IO/Seekable.pm       IO::Seekable extension Perl module
  IO/Select.pm         IO::Select extension Perl module
  IO/Socket.pm         IO::Socket extension Perl module
  Opcode.pm            Disable named opcodes when compiling Perl code
  ExtUtils/Embed.pm    Utilities for embedding Perl in C programs
  ExtUtils/testlib.pm  Fixes up @INC to use just-built extension
  FindBin.pm           Find path of currently executing program
  Class/Struct.pm      Declare struct-like datatypes as Perl classes
  File/stat.pm         By-name interface to Perl's builtin stat
  Net/hostent.pm       By-name interface to Perl's builtin gethost*
  Net/netent.pm        By-name interface to Perl's builtin getnet*
  Net/protoent.pm      By-name interface to Perl's builtin getproto*
  Net/servent.pm       By-name interface to Perl's builtin getserv*
  Time/gmtime.pm       By-name interface to Perl's builtin gmtime
  Time/localtime.pm    By-name interface to Perl's builtin localtime
  Time/tm.pm           Internal object for Time::{gm,local}time
  User/grent.pm        By-name interface to Perl's builtin getgr*
  User/pwent.pm        By-name interface to Perl's builtin getpw*
  Tie/RefHash.pm       Base class for tied hashes with references as keys
  UNIVERSAL.pm         Base class for *ALL* classes

Fcntl

New constants in the existing Fcntl module are now supported, provided that your operating system happens to support them:

  F_GETOWN F_SETOWN
  O_ASYNC O_DEFER O_DSYNC O_FSYNC O_SYNC
  O_EXLOCK O_SHLOCK

These constants are intended for use with the Perl operators sysopen() and fcntl() and the basic database modules like SDBM_File. For the exact meaning of these and other Fcntl constants please refer to your operating system's documentation for fcntl() and open().

In addition, the Fcntl module now provides these constants for use with the Perl operator flock():

  LOCK_SH LOCK_EX LOCK_NB LOCK_UN

These constants are defined in all environments (because where there is no flock() system call, Perl emulates it). However, for historical reasons, these constants are not exported unless they are explicitly requested with the ":flock" tag (e.g. use Fcntl ':flock').
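A minimal sketch of flock() with the symbolic constants, taking a non-blocking exclusive lock on a scratch file. File::Temp is used only to create the file and is not required by flock itself.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);             # LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);

# Request an exclusive lock, but fail at once instead of waiting.
my $locked = flock($fh, LOCK_EX | LOCK_NB) ? 1 : 0;
die "cannot lock $path: $!" unless $locked;

print {$fh} "written while holding the lock\n";

my $unlocked = flock($fh, LOCK_UN) ? 1 : 0;
die "cannot unlock $path: $!" unless $unlocked;

print "locked=$locked unlocked=$unlocked\n";
```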

IO

The IO module provides a simple mechanism to load all the IO modules at one go. Currently this includes:

  IO::Handle
  IO::Seekable
  IO::File
  IO::Pipe
  IO::Socket

For more information on any of these modules, please see its respective documentation.

Math::Complex

The Math::Complex module has been totally rewritten, and now supports more operations. These are overloaded:

  + - * / ** <=> neg ~ abs sqrt exp log sin cos atan2 "" (stringify)

And these functions are now exported:

  pi i Re Im arg
  log10 logn ln cbrt root
  tan
  csc sec cot
  asin acos atan
  acsc asec acot
  sinh cosh tanh
  csch sech coth
  asinh acosh atanh
  acsch asech acoth
  cplx cplxe
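A few of the overloaded operators and exported functions can be tried out as follows. This is only a sketch; the variable names are arbitrary.

```perl
use strict;
use warnings;
use Math::Complex;

my $z = cplx(3, 4);               # 3 + 4i

my $modulus = abs($z);            # overloaded abs: the modulus of $z
my $root    = sqrt(-4);           # overridden sqrt returns a complex number
my $sum     = $z + cplx(1, -4);   # overloaded +

printf "modulus=%s root=%s sum=%s\n", $modulus, "$root", "$sum";
```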

Math::Trig

This new module provides a simpler interface to parts of Math::Complex for those who need trigonometric functions only for real numbers.

DB_File

There have been quite a few changes made to DB_File. Here are a few of the highlights:

  • Fixed a handful of bugs.

  • By public demand, added support for the standard hash function exists().

  • Made it compatible with Berkeley DB 1.86.

  • Made negative subscripts work with RECNO interface.

  • Changed the default flags from O_RDWR to O_CREAT|O_RDWR and the default mode from 0640 to 0666.

  • Made DB_File automatically import the open() constants (O_RDWR, O_CREAT etc.) from Fcntl, if available.

  • Updated documentation.

Refer to the HISTORY section in DB_File.pm for a complete list of changes. Everything after DB_File 1.01 has been added since 5.003.

Net::Ping

Major rewrite - support added for both udp echo and real icmp pings.

Object-oriented overrides for builtin operators

Many of the Perl builtins returning lists now have object-oriented overrides. These are:

  File::stat
  Net::hostent
  Net::netent
  Net::protoent
  Net::servent
  Time::gmtime
  Time::localtime
  User::grent
  User::pwent

For example, you can now say

  use File::stat;
  use User::pwent;
  $his = (stat($filename)->uid == getpwnam($whoever)->uid);
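As a portable sketch of the by-name interfaces, the following uses File::stat and Time::gmtime on a scratch file and on the epoch. File::Temp is used only to create a file to examine.

```perl
use strict;
use warnings;
use File::stat;                   # stat() now returns an object
use Time::gmtime;                 # gmtime() now returns an object
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} '12345';
close $fh or die "close $path: $!";

my $st = stat($path) or die "stat $path: $!";
my $size = $st->size;             # field accessed by name, not by list index

# year is counted from 1900, as in the builtin gmtime list.
my $epoch_year = gmtime(0)->year + 1900;

print "size=$size epoch_year=$epoch_year\n";
```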

Utility Changes

pod2html

  • Sends converted HTML to standard output

    The pod2html utility included with Perl 5.004 is entirely new. By default, it sends the converted HTML to its standard output, instead of writing it to a file like Perl 5.003's pod2html did. Use the --outfile=FILENAME option to write to a file.

xsubpp

  • void XSUBs now default to returning nothing

    Due to a documentation/implementation bug in previous versions of Perl, XSUBs with a return type of void have actually been returning one value. Usually that value was the GV for the XSUB, but sometimes it was some already freed or reused value, which would sometimes lead to program failure.

    In Perl 5.004, if an XSUB is declared as returning void, it actually returns no value, i.e. an empty list (though there is a backward-compatibility exception; see below). If your XSUB really does return an SV, you should give it a return type of SV *.

    For backward compatibility, xsubpp tries to guess whether a void XSUB is really void or if it wants to return an SV *. It does so by examining the text of the XSUB: if xsubpp finds what looks like an assignment to ST(0), it assumes that the XSUB's return type is really SV *.

C Language API Changes

  • gv_fetchmethod and perl_call_sv

    The gv_fetchmethod function finds a method for an object, just like in Perl 5.003. The GV it returns may be a method cache entry. However, in Perl 5.004, method cache entries are not visible to users; therefore, they can no longer be passed directly to perl_call_sv. Instead, you should use the GvCV macro on the GV to extract its CV, and pass the CV to perl_call_sv.

    The most likely symptom of passing the result of gv_fetchmethod to perl_call_sv is Perl's producing an "Undefined subroutine called" error on the second call to a given method (since there is no cache on the first call).

  • perl_eval_pv

    A new function handy for eval'ing strings of Perl code inside C code. This function returns the value from the eval statement, which can be used instead of fetching globals from the symbol table. See perlguts, perlembed and perlcall for details and examples.

  • Extended API for manipulating hashes

    Internal handling of hash keys has changed. The old hashtable API is still fully supported, and will likely remain so. The additions to the API allow passing keys as SV*s, so that tied hashes can be given real scalars as keys rather than plain strings (nontied hashes still can only use strings as keys). New extensions must use the new hash access functions and macros if they wish to use SV* keys. These additions also make it feasible to manipulate HE*s (hash entries), which can be more efficient. See perlguts for details.

Documentation Changes

Many of the base and library pods were updated. These new pods are included in section 1:

  • perldelta

    This document.

  • perlfaq

    Frequently asked questions.

  • perllocale

    Locale support (internationalization and localization).

  • perltoot

    Tutorial on Perl OO programming.

  • perlapio

    Perl internal IO abstraction interface.

  • perlmodlib

    Perl module library and recommended practice for module creation. Extracted from perlmod (which is much smaller as a result).

  • perldebug

    Although not new, this has been massively updated.

  • perlsec

    Although not new, this has been massively updated.

New Diagnostics

Several new conditions will trigger warnings that were silent before. Some only affect certain platforms. The following new warnings and errors outline these. These messages are classified as follows (listed in increasing order of desperation):

  (W) A warning (optional).
  (D) A deprecation (optional).
  (S) A severe warning (mandatory).
  (F) A fatal error (trappable).
  (P) An internal error you should never see (trappable).
  (X) A very fatal error (nontrappable).
  (A) An alien error message (not generated by Perl).

  • "my" variable %s masks earlier declaration in same scope

    (W) A lexical variable has been redeclared in the same scope, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier variable will still exist until the end of the scope or until all closure referents to it are destroyed.

  • %s argument is not a HASH element or slice

    (F) The argument to delete() must be either a hash element, such as

    $foo{$bar}
    $ref->[12]->{"susie"}

    or a hash slice, such as

    @foo{$bar, $baz, $xyzzy}
    @{$ref->[12]}{"susie", "queue"}

  • Allocation too large: %lx

    (X) You can't allocate more than 64K on an MS-DOS machine.

  • Allocation too large

    (F) You can't allocate more than 2^31+"small amount" bytes.

  • Applying %s to %s will act on scalar(%s)

    (W) The pattern match (//), substitution (s///), and transliteration (tr///) operators work on scalar values. If you apply one of them to an array or a hash, it will convert the array or hash to a scalar value (the length of an array or the population info of a hash) and then work on that scalar value. This is probably not what you meant to do. See grep and map for alternatives.

  • Attempt to free nonexistent shared string

    (P) Perl maintains a reference counted internal table of strings to optimize the storage and access of hash keys and other strings. This indicates someone tried to decrement the reference count of a string that can no longer be found in the table.

  • Attempt to use reference as lvalue in substr

    (W) You supplied a reference as the first argument to substr() used as an lvalue, which is pretty strange. Perhaps you forgot to dereference it first. See substr.

  • Bareword "%s" refers to nonexistent package

    (W) You used a qualified bareword of the form Foo::, but the compiler saw no other uses of that namespace before that point. Perhaps you need to predeclare a package?

  • Can't redefine active sort subroutine %s

    (F) Perl optimizes the internal handling of sort subroutines and keeps pointers into them. You tried to redefine one such sort subroutine when it was currently active, which is not allowed. If you really want to do this, you should write sort { &func } @x instead of sort func @x.

  • Can't use bareword ("%s") as %s ref while "strict refs" in use

    (F) Only hard references are allowed by "strict refs". Symbolic references are disallowed. See perlref.

  • Cannot resolve method `%s' overloading `%s' in package `%s'

    (P) Internal error trying to resolve overloading specified by a method name (as opposed to a subroutine reference).

  • Constant subroutine %s redefined

    (S) You redefined a subroutine which had previously been eligible for inlining. See Constant Functions in perlsub for commentary and workarounds.

  • Constant subroutine %s undefined

    (S) You undefined a subroutine which had previously been eligible for inlining. See Constant Functions in perlsub for commentary and workarounds.

  • Copy method did not return a reference

    (F) The method which overloads "=" is buggy. See Copy Constructor in overload.

  • Died

    (F) You passed die() an empty string (the equivalent of die "") or you called it with no args and both $@ and $_ were empty.

  • Exiting pseudo-block via %s

    (W) You are exiting a rather special block construct (like a sort block or subroutine) by unconventional means, such as a goto, or a loop control statement. See sort.

  • Identifier too long

    (F) Perl limits identifiers (names for variables, functions, etc.) to 252 characters for simple names, somewhat more for compound names (like $A::B). You've exceeded Perl's limits. Future versions of Perl are likely to eliminate these arbitrary limitations.

  • Illegal character %s (carriage return)

    (F) A carriage return character was found in the input. This is an error, and not a warning, because carriage return characters can break multi-line strings, including here documents (e.g., print <<EOF;).

  • Illegal switch in PERL5OPT: %s

    (X) The PERL5OPT environment variable may only be used to set the following switches: -[DIMUdmw].

  • Integer overflow in hex number

    (S) The literal hex number you have specified is too big for your architecture. On a 32-bit architecture the largest hex literal is 0xFFFFFFFF.

  • Integer overflow in octal number

    (S) The literal octal number you have specified is too big for your architecture. On a 32-bit architecture the largest octal literal is 037777777777.

  • internal error: glob failed

    (P) Something went wrong with the external program(s) used for glob and <*.c>. This may mean that your csh (C shell) is broken. If so, you should change all of the csh-related variables in config.sh: if you have tcsh, make the variables refer to it as if it were csh (e.g. full_csh='/usr/bin/tcsh'); otherwise, make them all empty (except that d_csh should be 'undef') so that Perl will think csh is missing. In either case, after editing config.sh, run ./Configure -S and rebuild Perl.

  • Invalid conversion in %s: "%s"

    (W) Perl does not understand the given format conversion. See sprintf.

  • Invalid type in pack: '%s'

    (F) The given character is not a valid pack type. See pack.

  • Invalid type in unpack: '%s'

    (F) The given character is not a valid unpack type. See unpack.

  • Name "%s::%s" used only once: possible typo

    (W) Typographical errors often show up as unique variable names. If you had a good reason for having a unique name, then just mention it again somehow to suppress the message (the use vars pragma is provided for just this purpose).

  • Null picture in formline

    (F) The first argument to formline must be a valid format picture specification. It was found to be empty, which probably means you supplied it an uninitialized value. See perlform.

  • Offset outside string

    (F) You tried to do a read/write/send/recv operation with an offset pointing outside the buffer. This is difficult to imagine. The sole exception to this is that sysread()ing past the buffer will extend the buffer and zero pad the new area.

  • Out of memory!

    (X|F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request.

    The request was judged to be small, so the possibility to trap it depends on the way Perl was compiled. By default it is not trappable. However, if compiled for this, Perl may use the contents of $^M as an emergency pool after die()ing with this message. In this case the error is trappable once.

  • Out of memory during request for %s

    (F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. However, the request was judged large enough (compile-time default is 64K), so a possibility to shut down by trapping this error is granted.

  • panic: frexp

    (P) The library function frexp() failed, making printf("%f") impossible.

  • Possible attempt to put comments in qw() list

    (W) qw() lists contain items separated by whitespace; as with literal strings, comment characters are not ignored, but are instead treated as literal data. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.)

    You probably wrote something like this:

    @list = qw(
        a # a comment
        b # another comment
    );

    when you should have written this:

    @list = qw(
        a
        b
    );

    If you really want comments, build your list the old-fashioned way, with quotes and commas:

    @list = (
        'a', # a comment
        'b', # another comment
    );

  • Possible attempt to separate words with commas

    (W) qw() lists contain items separated by whitespace; therefore commas aren't needed to separate the items. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.)

    You probably wrote something like this:

    qw! a, b, c !;

    which puts literal commas into some of the list items. Write it without commas if you don't want them to appear in your data:

    qw! a b c !;

  • Scalar value @%s{%s} better written as $%s{%s}

    (W) You've used a hash slice (indicated by @) to select a single element of a hash. Generally it's better to ask for a scalar value (indicated by $). The difference is that $foo{&bar} always behaves like a scalar, both when assigning to it and when evaluating its argument, while @foo{&bar} behaves like a list when you assign to it, and provides a list context to its subscript, which can do weird things if you're expecting only one subscript.

  • Stub found while resolving method `%s' overloading `%s' in %s

    (P) Overloading resolution over @ISA tree may be broken by importing stubs. Stubs should never be implicitly created, but explicit calls to can may break this.

  • Too late for "-T" option

    (X) The #! line (or local equivalent) in a Perl script contains the -T option, but Perl was not invoked with -T in its argument list. This is an error because, by the time Perl discovers a -T in a script, it's too late to properly taint everything from the environment. So Perl gives up.

  • untie attempted while %d inner references still exist

    (W) A copy of the object returned from tie (or tied) was still valid when untie was called.

  • Unrecognized character %s

    (F) The Perl parser has no idea what to do with the specified character in your Perl script (or eval). Perhaps you tried to run a compressed script, a binary program, or a directory as a Perl program.

  • Unsupported function fork

    (F) Your version of executable does not support forking.

    Note that under some systems, like OS/2, there may be different flavors of Perl executables, some of which may support fork, some not. Try changing the name you call Perl by to perl_ , perl__ , and so on.

  • Use of "$$<digit>" to mean "${$}<digit>" is deprecated

    (D) Perl versions before 5.004 misinterpreted any type marker followed by "$" and a digit. For example, "$$0" was incorrectly taken to mean "${$}0" instead of "${$0}". This bug is (mostly) fixed in Perl 5.004.

    However, the developers of Perl 5.004 could not fix this bug completely, because at least two widely-used modules depend on the old meaning of "$$0" in a string. So Perl 5.004 still interprets "$$<digit>" in the old (broken) way inside strings; but it generates this message as a warning. And in Perl 5.005, this special treatment will cease.

  • Value of %s can be "0"; test with defined()

    (W) In a conditional expression, you used <HANDLE>, <*> (glob), each(), or readdir() as a boolean value. Each of these constructs can return a value of "0"; that would make the conditional expression false, which is probably not what you intended. When using these constructs in conditional expressions, test their values with the defined operator.

  • Variable "%s" may be unavailable

    (W) An inner (nested) anonymous subroutine is inside a named subroutine, and outside that is another subroutine; and the anonymous (innermost) subroutine is referencing a lexical variable defined in the outermost subroutine. For example:

    sub outermost { my $a; sub middle { sub { $a } } }

    If the anonymous subroutine is called or referenced (directly or indirectly) from the outermost subroutine, it will share the variable as you would expect. But if the anonymous subroutine is called or referenced when the outermost subroutine is not active, it will see the value of the shared variable as it was before and during the *first* call to the outermost subroutine, which is probably not what you want.

    In these circumstances, it is usually best to make the middle subroutine anonymous, using the sub {} syntax. Perl has specific support for shared variables in nested anonymous subroutines; a named subroutine in between interferes with this feature.

  • Variable "%s" will not stay shared

    (W) An inner (nested) named subroutine is referencing a lexical variable defined in an outer subroutine.

    When the inner subroutine is called, it will probably see the value of the outer subroutine's variable as it was before and during the *first* call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable. In other words, the variable will no longer be shared.

    Furthermore, if the outer subroutine is anonymous and references a lexical variable outside itself, then the outer and inner subroutines will never share the given variable.

    This problem can usually be solved by making the inner subroutine anonymous, using the sub {} syntax. When inner anonymous subs that reference variables in outer subroutines are called or referenced, they are automatically rebound to the current values of such variables.

  • Warning: something's wrong

    (W) You passed warn() an empty string (the equivalent of warn "" ) or you called it with no args and $_ was empty.

  • Ill-formed logical name |%s| in prime_env_iter

    (W) A warning peculiar to VMS. A logical name was encountered when preparing to iterate over %ENV which violates the syntactic rules governing logical names. Since it cannot be translated normally, it is skipped, and will not appear in %ENV. This may be a benign occurrence, as some software packages might directly modify logical name tables and introduce nonstandard names, or it may indicate that a logical name table has been corrupted.

  • Got an error from DosAllocMem

    (P) An error peculiar to OS/2. Most probably you're using an obsolete version of Perl, and this should not happen anyway.

  • Malformed PERLLIB_PREFIX

    (F) An error peculiar to OS/2. PERLLIB_PREFIX should be of the form

    prefix1;prefix2

    or

    prefix1 prefix2

    with nonempty prefix1 and prefix2. If prefix1 is indeed a prefix of a builtin library search path, prefix2 is substituted. The error may appear if components are not found, or are too long. See "PERLLIB_PREFIX" in README.os2.

  • PERL_SH_DIR too long

    (F) An error peculiar to OS/2. PERL_SH_DIR is the directory to find the sh-shell in. See "PERL_SH_DIR" in README.os2.

  • Process terminated by SIG%s

    (W) This is a standard message issued by OS/2 applications, while *nix applications die in silence. It is considered a feature of the OS/2 port. One can easily disable this by appropriate sighandlers, see Signals in perlipc. See also "Process terminated by SIGTERM/SIGINT" in README.os2.
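The behavior behind the 'Variable "%s" will not stay shared' diagnostic listed above can be sketched as follows. The subroutine names are hypothetical, and the closure warning is switched off only so that the sketch runs quietly; in real code the warning is the useful part.

```perl
use strict;
use warnings;

# Hypothetical names, for illustration only.
sub outer_named {
    my $x = shift;
    no warnings 'closure';             # silence "will not stay shared" here
    sub inner_named { return $x }      # named sub: bound to the first $x only
    return inner_named();
}

sub outer_anon {
    my $x = shift;
    my $inner = sub { return $x };     # anonymous sub: rebound on every call
    return $inner->();
}

my @named = map { outer_named($_) } 1 .. 3;
my @anon  = map { outer_anon($_) }  1 .. 3;

print "named: @named\n";               # named: 1 1 1
print "anon:  @anon\n";                # anon:  1 2 3
```

The named inner subroutine keeps seeing the value from the first call, while the anonymous one tracks each new call, which is why the diagnostic recommends the sub {} form.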

BUGS

If you find what you think is a bug, you might check the headers of recently posted articles in the comp.lang.perl.misc newsgroup. There may also be information at http://www.perl.com/perl/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Make sure you trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to <perlbug@perl.com> to be analysed by the Perl porting team.

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl. This file has been significantly updated for 5.004, so even veteran users should look through it.

The README file for general stuff.

The Copying file for copyright information.

HISTORY

Constructed by Tom Christiansen, grabbing material with permission from innumerable contributors, with kibitzing by more than a few Perl porters.

Last update: Wed May 14 11:14:09 EDT 1997

 
perl5005delta

NAME

perl5005delta - what's new for perl5.005

DESCRIPTION

This document describes differences between the 5.004 release and this one.

About the new versioning system

Perl is now developed on two tracks: a maintenance track that makes small, safe updates to released production versions with emphasis on compatibility; and a development track that pursues more aggressive evolution. Maintenance releases (which should be considered production quality) have subversion numbers that run from 1 to 49 , and development releases (which should be considered "alpha" quality) run from 50 to 99 .

Perl 5.005 is the combined product of the new dual-track development scheme.

Incompatible Changes

WARNING: This version is not binary compatible with Perl 5.004.

Starting with Perl 5.004_50 there were many deep and far-reaching changes to the language internals. If you have dynamically loaded extensions that you built under perl 5.003 or 5.004, you can continue to use them with 5.004, but you will need to rebuild and reinstall those extensions to use them with 5.005. See INSTALL for detailed instructions on how to upgrade.

Default installation structure has changed

The new Configure defaults are designed to allow a smooth upgrade from 5.004 to 5.005, but you should read INSTALL for a detailed discussion of the changes in order to adapt them to your system.

Perl Source Compatibility

When none of the experimental features are enabled, there should be very few user-visible Perl source compatibility issues.

If threads are enabled, then some caveats apply. @_ and $_ become lexical variables. The effect of this should be largely transparent to the user, but there are some boundary conditions under which the user will need to be aware of the issues. For example, local(@_) results in a "Can't localize lexical variable @_ ..." message. This may be enabled in a future version.

Some new keywords have been introduced. These are generally expected to have very little impact on compatibility. See New INIT keyword, New lock keyword, and New qr// operator.

Certain barewords are now reserved. Use of these will provoke a warning if you have asked for them with the -w switch. See our is now a reserved word.

C Source Compatibility

There have been a large number of changes in the internals to support the new features in this release.

  • Core sources now require ANSI C compiler

    An ANSI C compiler is now required to build perl. See INSTALL.

  • All Perl global variables must now be referenced with an explicit prefix

    All Perl global variables that are visible for use by extensions now have a PL_ prefix. New extensions should not refer to perl globals by their unqualified names. To preserve sanity, we provide limited backward compatibility for globals that are being widely used like sv_undef and na (which should now be written as PL_sv_undef , PL_na etc.)

    If you find that your XS extension does not compile anymore because a perl global is not visible, try adding a PL_ prefix to the global and rebuild.

    It is strongly recommended that all functions in the Perl API that don't begin with perl be referenced with a Perl_ prefix. The bare function names without the Perl_ prefix are supported with macros, but this support may cease in a future release.

    See perlapi.

  • Enabling threads has source compatibility issues

    Perl built with threading enabled requires extensions to use the new dTHR macro to initialize the handle to access per-thread data. If you see a compiler error that talks about the variable thr not being declared (when building a module that has XS code), you need to add dTHR; at the beginning of the block that elicited the error.

    The API function perl_get_sv("@",GV_ADD) should be used instead of directly accessing perl globals as GvSV(errgv) . The API call is backward compatible with existing perls and provides source compatibility when threading is enabled.

    See C Source Compatibility for more information.

Binary Compatibility

This version is NOT binary compatible with older versions. All extensions will need to be recompiled. Further, binaries built with threads enabled are incompatible with binaries built without. This should largely be transparent to the user, as all binary incompatible configurations have their own unique architecture name, and extension binaries get installed at unique locations. This allows coexistence of several configurations in the same directory hierarchy. See INSTALL.

Security fixes may affect compatibility

A few taint leaks and taint omissions have been corrected. This may lead to "failure" of scripts that used to work with older versions. Compiling with -DINCOMPLETE_TAINTS provides a perl with minimal amounts of changes to the tainting behavior. But note that the resulting perl will have known insecurities.

Oneliners with the -e switch do not create temporary files anymore.

Relaxed new mandatory warnings introduced in 5.004

Many new warnings that were introduced in 5.004 have been made optional. Some of these warnings are still present, but perl's new features make them less often a problem. See New Diagnostics.

Licensing

Perl has a new Social Contract for contributors. See Porting/Contract.

The license included in much of the Perl documentation has changed. Most of the Perl documentation was previously under the implicit GNU General Public License or the Artistic License (at the user's choice). Now much of the documentation unambiguously states the terms under which it may be distributed. Those terms are in general much less restrictive than the GNU GPL. See perl and the individual perl manpages listed therein.

Core Changes

Threads

WARNING: Threading is considered an experimental feature. Details of the implementation may change without notice. There are known limitations and some bugs. These are expected to be fixed in future versions.

See README.threads.

Compiler

WARNING: The Compiler and related tools are considered experimental. Features may change without notice, and there are known limitations and bugs. Since the compiler is fully external to perl, the default configuration will build and install it.

The Compiler produces three different types of transformations of a perl program. The C backend generates C code that captures perl's state just before execution begins. It eliminates the compile-time overheads of the regular perl interpreter, but the run-time performance remains comparatively the same. The CC backend generates optimized C code equivalent to the code path at run-time. The CC backend has greater potential for big optimizations, but only a few optimizations are implemented currently. The Bytecode backend generates a platform independent bytecode representation of the interpreter's state just before execution. Thus, the Bytecode back end also eliminates much of the compilation overhead of the interpreter.

The compiler comes with several valuable utilities.

B::Lint is an experimental module to detect and warn about suspicious code, especially the cases that the -w switch does not detect.

B::Deparse can be used to demystify perl code, and understand how perl optimizes certain constructs.

B::Xref generates cross reference reports of all definition and use of variables, subroutines and formats in a program.

B::Showlex shows the lexical variables used by a subroutine or file at a glance.

perlcc is a simple frontend for compiling perl.

See ext/B/README , B, and the respective compiler modules.

Regular Expressions

Perl's regular expression engine has been seriously overhauled, and many new constructs are supported. Several bugs have been fixed.

Here is an itemized summary:

  • Many new and improved optimizations

    Changes in the RE engine:

    Unneeded nodes removed;
    Substrings merged together;
    New types of nodes to process (SUBEXPR)* and similar expressions
        quickly, used if the SUBEXPR has no side effects and matches
        strings of the same length;
    Better optimizations by lookup for constant substrings;
    Better search for constant substrings anchored by $ ;

    Changes in Perl code using RE engine:

    More optimizations to s/longer/short/;
    study() was not working;
    /blah/ may be optimized to an analogue of index() if $& $` $' not seen;
    Unneeded copying of matched-against string removed;
    Only matched part of the string is copied if $` $' were not seen;

  • Many bug fixes

    Note that only the major bug fixes are listed here. See Changes for others.

    Backtracking might not restore start of $3.
    No feedback if max count for * or + on "complex" subexpression
        was reached, similarly (but at compile time) for {3,34567}
    Primitive restrictions on max count introduced to decrease a
        possibility of a segfault;
    (ZERO-LENGTH)* could segfault;
    (ZERO-LENGTH)* was prohibited;
    Long REs were not allowed;
    /RE/g could skip matches at the same position after a
        zero-length match;

  • New regular expression constructs

    The following new syntax elements are supported:

    (?<=RE)
    (?<!RE)
    (?{ CODE })
    (?i-x)
    (?i:RE)
    (?(COND)YES_RE|NO_RE)
    (?>RE)
    \z

  • New operator for precompiled regular expressions

    See New qr// operator.

  • Other improvements

    Better debugging output (possibly with colors),
        even from non-debugging Perl;
    RE engine code now looks like C, not like assembler;
    Behaviour of RE modifiable by `use re' directive;
    Improved documentation;
    Test suite significantly extended;
    Syntax [:^upper:] etc., reserved inside character classes;

  • Incompatible changes

    (?i) localized inside enclosing group;
    $( is not interpolated into RE any more;
    /RE/g may match at the same position (with non-zero length)
        after a zero-length match (bug fix).

See perlre and perlop.
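A few of the new constructs listed above can be tried out with a short script. This is only a sketch; the sample strings and variable names are invented for illustration.

```perl
use strict;
use warnings;

# (?<=RE): lookbehind. Capture digits only when a dollar sign precedes them.
my @amounts = 'cost: $42, tax: $7, qty: 3' =~ /(?<=\$)(\d+)/g;

# (?i:RE): case-insensitivity applies to the group only, not to "bar".
my $group_local_i = ('FOObar' =~ /(?i:foo)bar/ && 'FOOBAR' !~ /(?i:foo)bar/) ? 1 : 0;

# \z: matches only at the very end of the string, whereas $ also
# matches just before a trailing newline.
my $strict_end = ("line\n" =~ /line$/ && "line\n" !~ /line\z/) ? 1 : 0;

# (?>RE): the group does not give back characters once it has matched,
# so the trailing "ab" can no longer be satisfied.
my $no_backtrack = ('aaab' =~ /a+ab/ && 'aaab' !~ /(?>a+)ab/) ? 1 : 0;

print "amounts: @amounts\n";           # amounts: 42 7
print "group-local /i: $group_local_i\n";
print "\\z is strict: $strict_end\n";
print "(?>...) holds on: $no_backtrack\n";
```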

Improved malloc()

See banner at the beginning of malloc.c for details.

Quicksort is internally implemented

Perl now contains its own highly optimized qsort() routine. The new qsort() is resistant to inconsistent comparison functions, so Perl's sort() will not provoke coredumps any more when given poorly written sort subroutines. (Some C library qsort() s that were being used before used to have this problem.) In our testing, the new qsort() required the minimal number of pair-wise compares on average, among all known qsort() implementations.

See sort in perlfunc.

Reliable signals

Perl's signal handling is susceptible to random crashes, because signals arrive asynchronously, and the Perl runtime is not reentrant at arbitrary times.

However, one experimental implementation of reliable signals is available when threads are enabled. See Thread::Signal . Also see INSTALL for how to build a Perl capable of threads.

Reliable stack pointers

The internals now reallocate the perl stack only at predictable times. In particular, magic calls never trigger reallocations of the stack, because all reentrancy of the runtime is handled using a "stack of stacks". This should improve reliability of cached stack pointers in the internals and in XSUBs.

More generous treatment of carriage returns

Perl used to complain if it encountered literal carriage returns in scripts. Now they are mostly treated like whitespace within program text. Inside string literals and here documents, literal carriage returns are ignored if they occur paired with linefeeds, or get interpreted as whitespace if they stand alone. This behavior means that literal carriage returns in files should be avoided. You can get the older, more compatible (but less generous) behavior by defining the preprocessor symbol PERL_STRICT_CR when building perl. Of course, all this has nothing whatever to do with how escapes like \r are handled within strings.

Note that this doesn't somehow magically allow you to keep all text files in DOS format. The generous treatment only applies to files that perl itself parses. If your C compiler doesn't allow carriage returns in files, you may still be unable to build modules that need a C compiler.

Memory leaks

substr, pos and vec don't leak memory anymore when used in lvalue context. Many small leaks that impacted applications that embed multiple interpreters have been fixed.

Better support for multiple interpreters

The build-time option -DMULTIPLICITY has had many of the details reworked. Some previously global variables that should have been per-interpreter now are. With care, this allows interpreters to call each other. See the PerlInterp extension on CPAN.

Behavior of local() on array and hash elements is now well-defined

See Temporary Values via local() in perlsub.
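As a minimal sketch of localizing individual elements (the %config and @stack variables and the subroutine names are made up for this example):

```perl
use strict;
use warnings;

our %config = (debug => 0);
our @stack  = (1, 2, 3);

sub report { return "debug=$config{debug} top=$stack[2]" }

sub noisy {
    # Only these two elements are saved and restored; the rest of the
    # hash and array are left alone.
    local $config{debug} = 1;
    local $stack[2]      = 99;
    return report();
}

my $during = noisy();     # values seen while the local() is in effect
my $after  = report();    # original values are back once noisy() returns

print "$during\n";        # debug=1 top=99
print "$after\n";         # debug=0 top=3
```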

%! is transparently tied to the Errno module

See perlvar, and Errno.
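A short sketch of how %! can be consulted. To keep the example deterministic it sets $! by hand rather than provoking a real failed system call; that shortcut is an assumption of the sketch, not something the release notes prescribe.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);

my ($is_enoent, $is_eacces);
{
    local $! = ENOENT;                  # pretend a call just failed with ENOENT
    $is_enoent = $!{ENOENT} ? 1 : 0;    # true: matches the current errno
    $is_eacces = $!{EACCES} ? 1 : 0;    # false: a different error
}

print "ENOENT set: $is_enoent, EACCES set: $is_eacces\n";
```

In ordinary code the test would follow the failing call directly, for example `open(...) or do { warn "missing" if $!{ENOENT} }`.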

Pseudo-hashes are supported

See perlref.

EXPR foreach EXPR is supported

See perlsyn.
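For instance, the statement-modifier form lets a single expression run once per list element (a trivial sketch):

```perl
use strict;
use warnings;

my @doubled;
push @doubled, $_ * 2 foreach 1 .. 3;   # $_ is set to each element in turn

print "@doubled\n";                     # 2 4 6
```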

Keywords can be globally overridden

See perlsub.
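A minimal sketch of a global override, using the CORE::GLOBAL namespace described in perlsub. The choice of sleep and the @naps bookkeeping array are assumptions made for the example.

```perl
use strict;
use warnings;

our @naps;

BEGIN {
    no warnings 'once';
    # Must be installed at compile time, before any code that calls sleep
    # is compiled. This replacement records the request and returns at once.
    *CORE::GLOBAL::sleep = sub { push @naps, $_[0]; return $_[0] };
}

my $slept = sleep 5;                    # calls the replacement, not the builtin

print "recorded naps: @naps\n";         # recorded naps: 5
```

The original builtin remains reachable as CORE::sleep.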

$^E is meaningful on Win32

See perlvar.

foreach (1..1000000) optimized

foreach (1..1000000) is now optimized into a counting loop. It does not try to allocate a 1000000-size list anymore.

Foo:: can be used as implicitly quoted package name

Barewords caused unintuitive behavior when a subroutine with the same name as a package happened to be defined. Thus, new Foo @args uses the result of the call to Foo() instead of treating Foo as a literal. The recommended way to write barewords in the indirect object slot is new Foo:: @args . Note that the method new() is called with a first argument of Foo , not Foo:: when you do that.
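The sketch below shows the trailing :: at work. It uses the arrow form of the method call rather than the indirect object form, and the Counter package is invented for the example.

```perl
use strict;
use warnings;

{
    package Counter;
    sub new {
        my ($class, %args) = @_;
        # Remember the class name we were handed, to show it has no "::".
        return bless { %args, made_by => $class }, $class;
    }
}

# A function that happens to share the package's name.
sub Counter { die "the function Counter() was called\n" }

# The trailing :: marks Counter as a package name, so the function
# above is not consulted.
my $obj = Counter::->new(start => 1);

print ref($obj), " made by $obj->{made_by}\n";   # Counter made by Counter
```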

exists $Foo::{Bar::} tests existence of a package

It was impossible to test for the existence of a package without actually creating it before. Now exists $Foo::{Bar::} can be used to test if the Foo::Bar namespace has been created.
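A small sketch of the test, with the stash key written as a quoted string; Foo::Bar and Foo::Baz are hypothetical package names.

```perl
use strict;
use warnings;

{
    package Foo::Bar;
    sub hello { return "hi" }
}

# %Foo:: is the symbol table of package Foo; nested packages appear in it
# under keys ending in "::".
my $has_bar = exists $Foo::{'Bar::'} ? 1 : 0;
my $has_baz = exists $Foo::{'Baz::'} ? 1 : 0;

print "Foo::Bar exists: $has_bar\n";    # Foo::Bar exists: 1
print "Foo::Baz exists: $has_baz\n";    # Foo::Baz exists: 0
```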

Better locale support

See perllocale.

Experimental support for 64-bit platforms

Perl5 has always had 64-bit support on systems with 64-bit longs. Starting with 5.005, the beginnings of experimental support for systems with 32-bit long and 64-bit 'long long' integers have been added. If you add -DUSE_LONG_LONG to your ccflags in config.sh (or manually define it in perl.h) then perl will be built with 'long long' support. There will be many compiler warnings, and the resultant perl may not work on all systems. There are many other issues related to third-party extensions and libraries. This option exists to allow people to work on those issues.

prototype() returns useful results on builtins

See prototype.
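As a brief sketch, a builtin is queried by prefixing its name with CORE::. The two builtins below were picked for the example; builtins whose calling convention cannot be written as a prototype yield undef.

```perl
use strict;
use warnings;

my $rename_proto = prototype('CORE::rename');   # two scalar arguments
my $system_proto = prototype('CORE::system');   # not expressible: undef

print "rename: $rename_proto\n";
print "system: ", (defined $system_proto ? $system_proto : 'undef'), "\n";
```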

Extended support for exception handling

die() now accepts a reference value, and $@ gets set to that value in exception traps. This makes it possible to propagate exception objects. This is an undocumented experimental feature.
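A minimal sketch of propagating a reference through die(); the hash layout is invented for the example, and a blessed object would work the same way.

```perl
use strict;
use warnings;

my $err = { code => 42, message => 'disk full' };

my $ok     = eval { die $err; 1 };      # die with a reference, not a string
my $caught = $@;                        # $@ holds that very reference

print "caught code $caught->{code}: $caught->{message}\n"
    if !$ok && ref $caught eq 'HASH';
```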

Re-blessing in DESTROY() supported for chaining DESTROY() methods

See Destructors in perlobj.

All printf format conversions are handled internally

See printf.

New INIT keyword

INIT subs are like BEGIN and END , but they get run just before the perl runtime begins execution. e.g., the Perl Compiler makes use of INIT blocks to initialize and resolve pointers to XSUBs.
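The ordering can be seen with a short script. This sketch assumes it is run as an ordinary program file, since INIT blocks only take effect for code compiled before the main run begins.

```perl
use strict;
use warnings;

our @order;

BEGIN { push @order, 'BEGIN' }          # runs as soon as it is compiled

push @order, 'main';                    # ordinary run-time statement

INIT  { push @order, 'INIT' }           # runs after compilation, before run time

print "@order\n";                       # BEGIN INIT main
```

Although the INIT block appears after the ordinary statement in the source, it runs first.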

New lock keyword

The lock keyword is the fundamental synchronization primitive in threaded perl. When threads are not enabled, it is currently a noop.

To minimize impact on source compatibility this keyword is "weak", i.e., any user-defined subroutine of the same name overrides it, unless a use Thread has been seen.

New qr// operator

The qr// operator, which is syntactically similar to the other quote-like operators, is used to create precompiled regular expressions. This compiled form can now be explicitly passed around in variables, and interpolated in other regular expressions. See perlop.
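A small sketch of building one precompiled pattern out of another; the patterns and sample text are made up for illustration.

```perl
use strict;
use warnings;

my $word = qr/[a-z]+/i;                 # compiled once, carries its own /i flag
my $pair = qr/^($word)\s+($word)$/;     # interpolated into a larger pattern

my @parts = 'Hello world' =~ $pair;     # a qr// object can be matched directly
my $kind  = ref $word;                  # precompiled patterns are Regexp objects

print "parts: @parts\n";                # parts: Hello world
print "kind:  $kind\n";                 # kind:  Regexp
```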

our is now a reserved word

Calling a subroutine with the name our will now provoke a warning when using the -w switch.

Tied arrays are now fully supported

See Tie::Array.

Tied handles support is better

Several missing hooks have been added. There is also a new base class for TIEHANDLE implementations. See Tie::Handle.

4th argument to substr

substr() can now both return and replace in one operation. The optional 4th argument is the replacement string. See substr.
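For example (a minimal sketch with an invented string):

```perl
use strict;
use warnings;

my $s = "Hello, world";

# Replace 5 characters starting at offset 7, and get the old text back.
my $old = substr($s, 7, 5, "Perl");

print "$old\n";                         # world
print "$s\n";                           # Hello, Perl
```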

Negative LENGTH argument to splice

splice() with a negative LENGTH argument now works similarly to what the LENGTH did for substr(). Previously a negative LENGTH was treated as 0. See splice.
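A short sketch of the new behavior, using an invented list:

```perl
use strict;
use warnings;

my @a = (1 .. 6);

# Remove everything from index 1 onward, except the last 2 elements.
my @removed = splice(@a, 1, -2);

print "removed: @removed\n";            # removed: 2 3 4
print "left:    @a\n";                  # left:    1 5 6
```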

Magic lvalues are now more magical

When you say something like substr($x, 5) = "hi" , the scalar returned by substr() is special, in that any modifications to it affect $x. (This is called a 'magic lvalue' because an 'lvalue' is something on the left side of an assignment.) Normally, this is exactly what you would expect to happen, but Perl uses the same magic if you use substr(), pos(), or vec() in a context where they might be modified, like taking a reference with \ or as an argument to a sub that modifies @_ . In previous versions, this 'magic' only went one way, but now changes to the scalar the magic refers to ($x in the above example) affect the magic lvalue too. For instance, this code now acts differently:

    $x = "hello";
    sub printit {
        $x = "g'bye";
        print $_[0], "\n";
    }
    printit(substr($x, 0, 5));

In previous versions, this would print "hello", but it now prints "g'bye".

<> now reads in records

If $/ is a reference to an integer, or a scalar that holds an integer, <> will read in records instead of lines. For more info, see $/ in perlvar.
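A minimal sketch of record-oriented reading. For convenience it reads from an in-memory filehandle, a facility of later perls than the release described here; an ordinary file would behave the same way.

```perl
use strict;
use warnings;

my $data = 'abcdefghij';
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

# With $/ set to a reference to an integer, each read returns at most
# that many bytes instead of a line.
my @records = do { local $/ = \4; <$fh> };
close $fh;

print join('|', @records), "\n";        # abcd|efgh|ij
```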

Supported Platforms

Configure has many incremental improvements. Site-wide policy for building perl can now be made persistent, via Policy.sh. Configure also records the command-line arguments used in config.sh.

New Platforms

BeOS is now supported. See README.beos.

DOS is now supported under the DJGPP tools. See README.dos (installed as perldos on some systems).

MiNT is now supported. See README.mint.

MPE/iX is now supported. See README.mpeix.

MVS (aka OS390, aka Open Edition) is now supported. See README.os390 (installed as perlos390 on some systems).

Stratus VOS is now supported. See README.vos.

Changes in existing support

Win32 support has been vastly enhanced. Support has been added for Perl Object, a C++ encapsulation of Perl. GCC and EGCS are now supported on Win32. See README.win32, aka perlwin32.

VMS configuration system has been rewritten. See README.vms (installed as README_vms on some systems).

The hints files for most Unix platforms have seen incremental improvements.

Modules and Pragmata

New Modules

  • B

    Perl compiler and tools. See B.

  • Data::Dumper

    A module to pretty print Perl data. See Data::Dumper.

  • Dumpvalue

    A module to dump perl values to the screen. See Dumpvalue.

  • Errno

    A module to look up errors more conveniently. See Errno.

  • File::Spec

    A portable API for file operations.

  • ExtUtils::Installed

    Query and manage installed modules.

  • ExtUtils::Packlist

    Manipulate .packlist files.

  • Fatal

    Make functions/builtins succeed or die.

  • IPC::SysV

    Constants and other support infrastructure for System V IPC operations in perl.

  • Test

    A framework for writing test suites.

  • Tie::Array

    Base class for tied arrays.

  • Tie::Handle

    Base class for tied handles.

  • Thread

    Perl thread creation, manipulation, and support.

  • attrs

    Set subroutine attributes.

  • fields

    Compile-time class fields.

  • re

    Various pragmata to control behavior of regular expressions.
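Two of the new modules listed above, Data::Dumper and File::Spec, can be tried with a few lines. This is only a sketch; the data and path components are invented.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Spec;

# Compact, bare output instead of an indented "$VAR1 = ...;" statement.
$Data::Dumper::Terse  = 1;
$Data::Dumper::Indent = 0;
my $dump = Dumper({ name => 'perl' });

# Build a path with the separator of the current platform, then split it again.
my $path  = File::Spec->catfile('lib', 'Tie', 'Array.pm');
my @parts = File::Spec->splitdir($path);

print "$dump\n";
print "$path has ", scalar(@parts), " components\n";
```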

Changes in existing modules

  • Benchmark

    You can now run tests for x seconds instead of guessing the right number of tests to run.

    Keeps better time.

  • Carp

    Carp has a new function cluck(). cluck() warns, like carp(), but also adds a stack backtrace to the error message, like confess().

  • CGI

    CGI has been updated to version 2.42.

  • Fcntl

    More Fcntl constants added: F_SETLK64, F_SETLKW64, O_LARGEFILE for large (more than 4G) file access (the 64-bit support is not yet working, though, so no need to get overly excited), Free/Net/OpenBSD locking behaviour flags F_FLOCK, F_POSIX, Linux F_SHLCK, and O_ACCMODE: the mask of O_RDONLY, O_WRONLY, and O_RDWR.

  • Math::Complex

    The accessor methods Re, Im, arg, abs, rho, and theta ($z->Re()) can now also act as mutators ($z->Re(3)).

  • Math::Trig

    A little bit of radial trigonometry (cylindrical and spherical) added, for example the great circle distance.

  • POSIX

    POSIX now has its own platform-specific hints files.

  • DB_File

    DB_File supports version 2.x of Berkeley DB. See ext/DB_File/Changes .

  • MakeMaker

    MakeMaker now supports writing empty makefiles and provides a way to specify that site umask() policy should be honored. There is also better support for manipulation of .packlist files, and getting information about installed modules.

    Extensions that have both architecture-dependent and architecture-independent files are now always installed completely in the architecture-dependent locations. Previously, the shareable parts were shared both across architectures and across perl versions and were therefore liable to be overwritten with newer versions that might have subtle incompatibilities.

  • CPAN

    See perlmodinstall and CPAN.

  • Cwd

    Cwd::cwd is faster on most platforms.

Utility Changes

h2ph and related utilities have been vastly overhauled.

perlcc, a new experimental front end for the compiler, is available.

The crude GNU configure emulator is now called configure.gnu to avoid trampling on Configure under case-insensitive filesystems.

perldoc used to be rather slow. The slower features are now optional. In particular, case-insensitive searches need the -i switch, and recursive searches need -r . You can set these switches in the PERLDOC environment variable to get the old behavior.

Documentation Changes

Config.pm now has a glossary of variables.

Porting/patching.pod has detailed instructions on how to create and submit patches for perl.

perlport specifies guidelines on how to write portably.

perlmodinstall describes how to fetch and install modules from CPAN sites.

Some more Perl traps are documented now. See perltrap.

perlopentut gives a tutorial on using open().

perlreftut gives a tutorial on references.

perlthrtut gives a tutorial on threads.

New Diagnostics

  • Ambiguous call resolved as CORE::%s(), qualify as such or use &

    (W) A subroutine you have declared has the same name as a Perl keyword, and you have used the name without qualification for calling one or the other. Perl decided to call the builtin because the subroutine is not imported.

    To force interpretation as a subroutine call, either put an ampersand before the subroutine name, or qualify the name with its package. Alternatively, you can import the subroutine (or pretend that it's imported with the use subs pragma).

    To silently interpret it as the Perl operator, use the CORE:: prefix on the operator (e.g. CORE::log($x)) or declare the subroutine to be an object method (see attrs).

  • Bad index while coercing array into hash

    (F) The index looked up in the hash found as the 0'th element of a pseudo-hash is not legal. Index values must be 1 or greater. See perlref.

  • Bareword "%s" refers to nonexistent package

    (W) You used a qualified bareword of the form Foo:: , but the compiler saw no other uses of that namespace before that point. Perhaps you need to predeclare a package?

  • Can't call method "%s" on an undefined value

    (F) You used the syntax of a method call, but the slot filled by the object reference or package name contains an undefined value. Something like this will reproduce the error:

        $BADREF = undef;
        process $BADREF 1,2,3;
        $BADREF->process(1,2,3);

  • Can't check filesystem of script "%s" for nosuid

    (P) For some reason you can't check the filesystem of the script for nosuid.

  • Can't coerce array into hash

    (F) You used an array where a hash was expected, but the array has no information on how to map from keys to array indices. You can do that only with arrays that have a hash reference at index 0.

  • Can't goto subroutine from an eval-string

    (F) The "goto subroutine" call can't be used to jump out of an eval "string". (You can use it to jump out of an eval {BLOCK}, but you probably don't want to.)

  • Can't localize pseudo-hash element

    (F) You said something like local $ar->{'key'} , where $ar is a reference to a pseudo-hash. That hasn't been implemented yet, but you can get a similar effect by localizing the corresponding array element directly: local $ar->[$ar->[0]{'key'}] .

  • Can't use %%! because Errno.pm is not available

    (F) The first time the %! hash is used, perl automatically loads the Errno.pm module. The Errno module is expected to tie the %! hash to provide symbolic names for $! errno values.

  • Cannot find an opnumber for "%s"

    (F) A string of a form CORE::word was given to prototype(), but there is no builtin with the name word .

  • Character class syntax [. .] is reserved for future extensions

    (W) Within regular expression character classes ([]) the syntax beginning with "[." and ending with ".]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[." and ".\]".

  • Character class syntax [: :] is reserved for future extensions

    (W) Within regular expression character classes ([]) the syntax beginning with "[:" and ending with ":]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[:" and ":\]".

  • Character class syntax [= =] is reserved for future extensions

    (W) Within regular expression character classes ([]) the syntax beginning with "[=" and ending with "=]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[=" and "=\]".

  • %s: Eval-group in insecure regular expression

    (F) Perl detected tainted data when trying to compile a regular expression that contains the (?{ ... }) zero-width assertion, which is unsafe. See (?{ code }) in perlre, and perlsec.

  • %s: Eval-group not allowed, use re 'eval'

    (F) A regular expression contained the (?{ ... }) zero-width assertion, but that construct is only allowed when the use re 'eval' pragma is in effect. See (?{ code }) in perlre.

  • %s: Eval-group not allowed at run time

    (F) Perl tried to compile a regular expression containing the (?{ ... }) zero-width assertion at run time, as it would when the pattern contains interpolated values. Since that is a security risk, it is not allowed. If you insist, you may still do this by explicitly building the pattern from an interpolated string at run time and using that in an eval(). See (?{ code }) in perlre.

  • Explicit blessing to '' (assuming package main)

    (W) You are blessing a reference to a zero length string. This has the effect of blessing the reference into the package main. This is usually not what you want. Consider providing a default target package, e.g. bless($ref, $p || 'MyPackage');

  • Illegal hex digit ignored

    (W) You may have tried to use a character other than 0 - 9 or A - F in a hexadecimal number. Interpretation of the hexadecimal number stopped before the illegal character.

  • No such array field

    (F) You tried to access an array as a hash, but the field name used is not defined. The hash at index 0 should map all valid field names to array indices for that to work.

  • No such field "%s" in variable %s of type %s

    (F) You tried to access a field of a typed variable where the type does not know about the field name. The field names are looked up in the %FIELDS hash in the type package at compile time. The %FIELDS hash is usually set up with the 'fields' pragma.

  • Out of memory during ridiculously large request

    (F) You can't allocate more than 2^31+"small amount" bytes. This error is most likely to be caused by a typo in the Perl program. e.g., $arr[time] instead of $arr[$time] .

  • Range iterator outside integer range

    (F) One (or both) of the numeric arguments to the range operator ".." are outside the range which can be represented by integers internally. One possible workaround is to force Perl to use magical string increment by prepending "0" to your numbers.

  • Recursive inheritance detected while looking for method '%s' %s

    (F) More than 100 levels of inheritance were encountered while invoking a method. Probably indicates an unintended loop in your inheritance hierarchy.

  • Reference found where even-sized list expected

    (W) You gave a single reference where Perl was expecting a list with an even number of elements (for assignment to a hash). This usually means that you used the anon hash constructor when you meant to use parens. In any case, a hash requires key/value pairs.

        %hash = { one => 1, two => 2, };   # WRONG
        %hash = [ qw/ an anon array / ];   # WRONG
        %hash = ( one => 1, two => 2, );   # right
        %hash = qw( one 1 two 2 );         # also fine

  • Undefined value assigned to typeglob

    (W) An undefined value was assigned to a typeglob, a la *foo = undef . This does nothing. It's possible that you really mean undef *foo .

  • Use of reserved word "%s" is deprecated

    (D) The indicated bareword is a reserved word. Future versions of perl may use it as a keyword, so you're better off either explicitly quoting the word in a manner appropriate for its context of use, or using a different name altogether. The warning can be suppressed for subroutine names by either adding a & prefix, or using a package qualifier, e.g. &our() , or Foo::our() .

  • perl: warning: Setting locale failed.

    (S) The whole warning message will look something like:

        perl: warning: Setting locale failed.
        perl: warning: Please check that your locale settings:
                LC_ALL = "En_US",
                LANG = (unset)
            are supported and installed on your system.
        perl: warning: Falling back to the standard locale ("C").

    Exactly which locale settings failed varies. In the example above, LC_ALL was "En_US" and LANG had no value. This error means that Perl detected that you and/or your system administrator have set up the so-called locale system, but Perl could not use those settings. Fortunately this is not dead serious: there is a "default locale" called "C" that Perl can and will use, so the script will still run. Until you really fix the problem, however, you will get the same error message each time you run Perl. How to really fix the problem can be found in LOCALE PROBLEMS in perllocale.

Obsolete Diagnostics

  • Can't mktemp()

    (F) The mktemp() routine failed for some reason while trying to process a -e switch. Maybe your /tmp partition is full, or clobbered.

    Removed because -e doesn't use temporary files any more.

  • Can't write to temp file for -e: %s

    (F) The write routine failed for some reason while trying to process a -e switch. Maybe your /tmp partition is full, or clobbered.

    Removed because -e doesn't use temporary files any more.

  • Cannot open temporary file

    (F) The create routine failed for some reason while trying to process a -e switch. Maybe your /tmp partition is full, or clobbered.

    Removed because -e doesn't use temporary files any more.

  • regexp too big

    (F) The current implementation of regular expressions uses shorts as address offsets within a string. Unfortunately this means that if the regular expression compiles to longer than 32767, it'll blow up. Usually when you want a regular expression this big, there is a better way to do it with multiple statements. See perlre.

Configuration Changes

You can use "Configure -Uinstallusrbinperl", which causes installperl to skip installing perl as /usr/bin/perl as well. This is useful if you prefer not to modify /usr/bin for some reason, but it can be harmful because many scripts assume that Perl is available as /usr/bin/perl.

BUGS

If you find what you think is a bug, you might check the headers of recently posted articles in the comp.lang.perl.misc newsgroup. There may also be information at http://www.perl.com/perl/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Make sure you trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to <perlbug@perl.com> to be analysed by the Perl porting team.

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

HISTORY

Written by Gurusamy Sarathy <gsar@activestate.com>, with many contributions from The Perl Porters.

Send omissions or corrections to <perlbug@perl.com>.


perl5100delta

NAME

perl5100delta - what is new for perl 5.10.0

DESCRIPTION

This document describes the differences between the 5.8.8 release and the 5.10.0 release.

Many of the bug fixes in 5.10.0 were already seen in the 5.8.X maintenance releases; they are not duplicated here and are documented in the set of man pages named perl58[1-8]?delta.

Core Enhancements

The feature pragma

The feature pragma is used to enable new syntax that would break Perl's backwards-compatibility with older releases of the language. It's a lexical pragma, like strict or warnings .

Currently the following new features are available: switch (adds a switch statement), say (adds a say built-in function), and state (adds a state keyword for declaring "static" variables). Those features are described in their own sections of this document.

The feature pragma is also implicitly loaded when you require a minimal perl version (with the use VERSION construct) greater than, or equal to, 5.9.5. See feature for details.

New -E command-line switch

-E is equivalent to -e, but it implicitly enables all optional features (like use feature ":5.10" ).

Defined-or operator

A new operator // (defined-or) has been implemented. The following expression:

    $a // $b

is merely equivalent to

    defined $a ? $a : $b

and the statement

    $c //= $d;

can now be used instead of

    $c = $d unless defined $c;

The // operator has the same precedence and associativity as ||. Special care has been taken to ensure that this operator Does What You Mean while not breaking old code, but some edge cases involving the empty regular expression may now parse differently. See perlop for details.
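As a minimal sketch of the difference from ||, the following assumes a hypothetical %config hash whose retries entry is deliberately set to 0. The hash and its keys are illustrative only and are not part of the original text.

```perl
use strict;
use warnings;

# Hypothetical settings: 'retries' is defined but false, 'timeout' is absent.
my %config = (retries => 0);

# || tests truth, so a legitimate 0 is replaced by the fallback.
my $retries_or  = $config{retries} || 3;

# // tests definedness, so the 0 survives.
my $retries_dor = $config{retries} // 3;

# //= assigns only when the left-hand side is undefined.
my $timeout = $config{timeout};
$timeout //= 30;

print "|| gives $retries_or, // gives $retries_dor, timeout is $timeout\n";
```

The || form silently discards a value of 0 or the empty string, which is the usual reason to prefer // for defaults.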

Switch and Smart Match operator

Perl 5 now has a switch statement. It's available when use feature 'switch' is in effect. This feature introduces three new keywords, given , when , and default :

    given ($foo) {
        when (/^abc/) { $abc = 1; }
        when (/^def/) { $def = 1; }
        when (/^xyz/) { $xyz = 1; }
        default { $nothing = 1; }
    }

A more complete description of how Perl matches the switch variable against the when conditions is given in Switch statements in perlsyn.

This kind of match is called smart match, and it's also possible to use it outside of switch statements, via the new ~~ operator. See Smart matching in detail in perlsyn.

This feature was contributed by Robin Houston.

Regular expressions

  • Recursive Patterns

    It is now possible to write recursive patterns without using the (??{}) construct. This new way is more efficient, and in many cases easier to read.

    Each capturing parenthesis can now be treated as an independent pattern that can be entered by using the (?PARNO) syntax (PARNO standing for "parenthesis number"). For example, the following pattern will match nested balanced angle brackets:

        /
          ^              # start of line
          (              # start capture buffer 1
            <            #   match an opening angle bracket
            (?:          #   match one of:
              (?>        #     don't backtrack over the inside of this group
                [^<>]+   #       one or more non angle brackets
              )          #     end non backtracking group
            |            #     ... or ...
              (?1)       #     recurse to bracket 1 and try it again
            )*           #   0 or more times.
            >            #   match a closing angle bracket
          )              # end capture buffer one
          $              # end of line
        /x

    PCRE users should note that Perl's recursive regex feature allows backtracking into a recursed pattern, whereas in PCRE the recursion is atomic or "possessive" in nature. As in the example above, you can add (?>) to control this selectively. (Yves Orton)

  • Named Capture Buffers

    It is now possible to name capturing parentheses in a pattern and refer to the captured contents by name. The naming syntax is (?<NAME>....). It's possible to backreference to a named buffer with the \k<NAME> syntax. In code, the new magical hashes %+ and %- can be used to access the contents of the capture buffers.

    Thus, to replace all doubled chars with a single copy, one could write

        s/(?<letter>.)\k<letter>/$+{letter}/g

    Only buffers with defined contents will be "visible" in the %+ hash, so it's possible to do something like

        foreach my $name (keys %+) {
            print "content of buffer '$name' is $+{$name}\n";
        }

    The %- hash is a bit more complete, since it will contain array refs holding values from all capture buffers similarly named, if there should be many of them.

    %+ and %- are implemented as tied hashes through the new module Tie::Hash::NamedCapture .

    Users exposed to the .NET regex engine will find that the perl implementation differs in that the numerical ordering of the buffers is sequential, and not "unnamed first, then named". Thus in the pattern

        /(A)(?<B>B)(C)(?<D>D)/

    $1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D'. A .NET programmer would instead expect $1 to be 'A', $2 to be 'C', $3 to be 'B' and $4 to be 'D'. This is considered a feature. :-) (Yves Orton)

  • Possessive Quantifiers

    Perl now supports the "possessive quantifier" syntax of the "atomic match" pattern. Basically a possessive quantifier matches as much as it can and never gives any back. Thus it can be used to control backtracking. The syntax is similar to non-greedy matching, except that a '+' is used as the modifier instead of a '?'. Thus ?+, *+, ++, and {min,max}+ are now legal quantifiers. (Yves Orton)

  • Backtracking control verbs

    The regex engine now supports a number of special-purpose backtrack control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL) and (*ACCEPT). See perlre for their descriptions. (Yves Orton)

  • Relative backreferences

    A new syntax \g{N} or \gN where "N" is a decimal integer allows a safer form of back-reference notation as well as allowing relative backreferences. This should make it easier to generate and embed patterns that contain backreferences. See Capture buffers in perlre. (Yves Orton)

  • \K escape

    The functionality of Jeff Pinyan's module Regexp::Keep has been added to the core. In regular expressions you can now use the special escape \K as a way to do something like floating length positive lookbehind. It is also useful in substitutions like:

        s/(foo)bar/$1/g

    that can now be converted to

        s/foo\Kbar//g

    which is much more efficient. (Yves Orton)

  • Vertical and horizontal whitespace, and linebreak

    Regular expressions now recognize the \v and \h escapes that match vertical and horizontal whitespace, respectively. \V and \H logically match their complements.

    \R matches a generic linebreak, that is, vertical whitespace, plus the multi-character sequence "\x0D\x0A" .
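Several of the regular expression additions listed above can be tried together in a short script. This is only an illustrative sketch: the sample strings and variable names are made up for the example.

```perl
use strict;
use warnings;

# Named capture with \k<NAME>: collapse doubled characters via %+.
(my $squeezed = "bookkeeper") =~ s/(?<letter>.)\k<letter>/$+{letter}/g;

# \K: keep the "foo" already matched and delete only the "bar" after it.
(my $trimmed = "foobar foobaz") =~ s/foo\Kbar//g;

# Relative backreference: \g{-1} is the most recently opened group.
my $has_repeat = ("say hello hello" =~ /\b(\w+) \g{-1}\b/) ? 1 : 0;

# Possessive quantifier: a++ takes every "a" and never gives one back,
# so the trailing "a" cannot match; the ordinary a+ can backtrack.
my $possessive = ("aaa" =~ /^a++a/) ? 1 : 0;
my $greedy     = ("aaa" =~ /^a+a/)  ? 1 : 0;

# \R: generic linebreak, treating "\x0D\x0A" as a single break.
my @lines = split /\R/, "one\x0D\x0Atwo\x0Athree";

print "$squeezed | $trimmed | $has_repeat | $possessive$greedy | @lines\n";
```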

say()

say() is a new built-in, only available when use feature 'say' is in effect, that is similar to print(), but that implicitly appends a newline to the printed string. See say. (Robin Houston)
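A small sketch of say(), writing to an in-memory filehandle so the appended newlines are easy to inspect. The $out and $fh names are only for illustration.

```perl
use strict;
use warnings;
use feature 'say';

# Capture the output in a scalar instead of printing to STDOUT.
my $out = '';
open my $fh, '>', \$out or die "cannot open in-memory handle: $!";

say {$fh} "one";            # one newline is appended after the string
say {$fh} "two", "three";   # the newline follows the whole list, not each item

close $fh;
print $out;
```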

Lexical $_

The default variable $_ can now be lexicalized, by declaring it like any other lexical variable, with a simple

    my $_;

The operations that default on $_ will use the lexically-scoped version of $_ when it exists, instead of the global $_ .

In a map or a grep block, if $_ was previously my'ed, then the $_ inside the block is lexical as well (and scoped to the block).

In a scope where $_ has been lexicalized, you can still have access to the global version of $_ by using $::_ , or, more simply, by overriding the lexical declaration with our $_ . (Rafael Garcia-Suarez)

The _ prototype

A new prototype character has been added. _ is equivalent to $ but defaults to $_ if the corresponding argument isn't supplied (both $ and _ denote a scalar). Due to the optional nature of the argument, you can only use it at the end of a prototype, or before a semicolon.

This has a small incompatible consequence: the prototype() function has been adjusted to return _ for some built-ins in appropriate cases (for example, prototype('CORE::rmdir')). (Rafael Garcia-Suarez)
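As a sketch of the _ prototype, the hypothetical shout() subroutine below upper-cases its argument and falls back to $_ when called without one. The subroutine name is an assumption made for the example.

```perl
use strict;
use warnings;

# The _ prototype: one scalar argument, defaulting to $_ when omitted.
sub shout(_) {
    my ($text) = @_;
    return uc $text;
}

# Explicit argument.
my $explicit = shout("quiet");

# No argument: each call receives the current $_ from the loop.
my @loud;
for ("a", "b") {
    push @loud, shout();
}

print "$explicit @loud\n";
```

The subroutine has to be declared before the calls are compiled, as with any prototype.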

UNITCHECK blocks

UNITCHECK, a new special code block, has been introduced, in addition to BEGIN, CHECK, INIT and END.

CHECK and INIT blocks, while useful for some specialized purposes, are always executed at the transition between the compilation and the execution of the main program, and thus are useless whenever code is loaded at runtime. On the other hand, UNITCHECK blocks are executed just after the unit which defined them has been compiled. See perlmod for more information. (Alex Gough)
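The following sketch shows the ordering for code loaded at runtime, using a string eval as the compilation unit. The @events array is only there to record the order and is not part of any API.

```perl
use strict;
use warnings;

our @events;
push @events, 'before eval';

# The string eval is compiled at runtime as its own unit. Its UNITCHECK
# block runs as soon as that unit has been compiled, before the body runs.
eval q{
    UNITCHECK { push @events, 'UNITCHECK' }
    push @events, 'body';
    1;
} or die $@;

print join(', ', @events), "\n";
```

A CHECK or INIT block in the same position would be too late to run, since the main program has already started executing.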

New Pragma, mro

A new pragma, mro (for Method Resolution Order), has been added. It lets you switch, on a per-class basis, the algorithm that perl uses to find inherited methods in a multiple inheritance hierarchy. The default MRO hasn't changed (DFS, for Depth First Search). Another MRO is available: the C3 algorithm. See mro for more information. (Brandon Black)

Note that, due to changes in the implementation of class hierarchy search, code that used to undef the *ISA glob will most probably break. Anyway, undef'ing *ISA had the side-effect of removing the magic on the @ISA array and should not have been done in the first place. Also, the cache *::ISA::CACHE:: no longer exists; to force reset the @ISA cache, you now need to use the mro API, or more simply to assign to @ISA (e.g. with @ISA = @ISA ).
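To make the difference between DFS and C3 concrete, here is a sketch using a hypothetical diamond hierarchy. The class names Base, Left, Right and Child and the hello method are invented for the example.

```perl
use strict;
use warnings;
use mro;

# Hypothetical diamond: Child inherits from Left and Right, both from Base.
{ package Base;  sub hello { 'Base' } }
{ package Left;  our @ISA = ('Base'); }
{ package Right; our @ISA = ('Base'); sub hello { 'Right' } }
{ package Child; our @ISA = ('Left', 'Right'); }

# Linearizations under each algorithm.
my $dfs = join ' ', @{ mro::get_linear_isa('Child', 'dfs') };
my $c3  = join ' ', @{ mro::get_linear_isa('Child', 'c3') };

# Under the default DFS, Base is searched before Right.
my $before = Child->hello;

# Switch this one class to C3; Right is now searched before Base.
mro::set_mro('Child', 'c3');
my $after = Child->hello;

print "dfs: $dfs\nc3:  $c3\n$before -> $after\n";
```

Inside a class, the same switch is normally written as use mro 'c3'.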

readdir() may return a "short filename" on Windows

The readdir() function may return a "short filename" when the long filename contains characters outside the ANSI codepage. Similarly Cwd::cwd() may return a short directory name, and glob() may return short names as well. On the NTFS file system these short names can always be represented in the ANSI codepage. This will not be true for all other file system drivers; e.g. the FAT filesystem stores short filenames in the OEM codepage, so some files on FAT volumes remain inaccessible through the ANSI APIs.

Similarly, $^X, @INC, and $ENV{PATH} are preprocessed at startup to make sure all paths are valid in the ANSI codepage (if possible).

The Win32::GetLongPathName() function now returns the UTF-8 encoded correct long file name instead of using replacement characters to force the name into the ANSI codepage. The new Win32::GetANSIPathName() function can be used to turn a long pathname into a short one only if the long one cannot be represented in the ANSI codepage.

Many other functions in the Win32 module have been improved to accept UTF-8 encoded arguments. Please see Win32 for details.

readpipe() is now overridable

The built-in function readpipe() is now overridable. Overriding it also overrides its operator counterpart, qx// (a.k.a. ``). Moreover, it now defaults to $_ if no argument is provided. (Rafael Garcia-Suarez)

Default argument for readline()

readline() now defaults to *ARGV if no argument is provided. (Rafael Garcia-Suarez)

state() variables

A new class of variables has been introduced. State variables are similar to my variables, but are declared with the state keyword in place of my. They're visible only in their lexical scope, but their value is persistent: unlike my variables, they're not undefined at scope entry, but retain their previous value. (Rafael Garcia-Suarez, Nicholas Clark)

To use state variables, one needs to enable them by using

    use feature 'state';

or by using the -E command-line switch in one-liners. See Persistent Private Variables in perlsub.
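A minimal sketch of a state variable, using a hypothetical next_id() counter. The initialization runs once, and the value persists between calls.

```perl
use strict;
use warnings;
use feature 'state';

# $n is visible only inside next_id(), but keeps its value across calls.
sub next_id {
    state $n = 0;   # initialized the first time only
    return ++$n;
}

my @ids = map { next_id() } 1 .. 3;
print "@ids\n";
```

With my in place of state, every call would start again from 0.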

Stacked filetest operators

As a new form of syntactic sugar, it's now possible to stack up filetest operators. You can now write -f -w -x $file in a row to mean -x $file && -w _ && -f _ . See -X.
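The equivalence can be checked with a temporary file, as in this sketch. It assumes File::Temp is available and that the temporary file is readable and writable by the current user.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is removed when the program exits.
my ($fh, $path) = tempfile(UNLINK => 1);

# Stacked form: the tests are applied from right to left.
my $stacked = (-f -w -r $path) ? 1 : 0;

# Equivalent long form, reusing the stat buffer through _.
my $chained = (-r $path && -w _ && -f _) ? 1 : 0;

print "stacked=$stacked chained=$chained\n";
```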

UNIVERSAL::DOES()

The UNIVERSAL class has a new method, DOES() . It has been added to solve semantic problems with the isa() method. isa() checks for inheritance, while DOES() has been designed to be overridden when module authors use other types of relations between classes (in addition to inheritance). (chromatic)

See $obj->DOES( ROLE ) in UNIVERSAL.
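As a sketch, a class can claim a role without inheriting from anything by overriding DOES(). The Duck class and the Quacker role name below are hypothetical.

```perl
use strict;
use warnings;

{
    package Duck;

    sub new { return bless {}, shift }

    # Claim the Quacker role, then defer to the default behaviour,
    # which answers the same way isa() would.
    sub DOES {
        my ($self, $role) = @_;
        return 1 if $role eq 'Quacker';
        return $self->SUPER::DOES($role);
    }
}

my $duck = Duck->new;

my $does_role = $duck->DOES('Quacker') ? 1 : 0;   # role claimed by override
my $isa_role  = $duck->isa('Quacker')  ? 1 : 0;   # no such parent class
my $does_self = $duck->DOES('Duck')    ? 1 : 0;   # falls through to isa()

print "DOES=$does_role isa=$isa_role self=$does_self\n";
```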

Formats

Formats were improved in several ways. A new field, ^*, can be used for variable-width, one-line-at-a-time text. Null characters are now handled correctly in picture lines. Using @# and ~~ together will now produce a compile-time error, as those format fields are incompatible. perlform has been improved, and miscellaneous bugs fixed.

Byte-order modifiers for pack() and unpack()

There are two new byte-order modifiers, > (big-endian) and < (little-endian), that can be appended to most pack() and unpack() template characters and groups to force a certain byte-order for that type or group. See pack and perlpacktut for details.
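A short sketch of the modifiers, compared against the older fixed-order template characters N, V and n:

```perl
use strict;
use warnings;

# L is a 32-bit unsigned value; > and < fix its byte order.
my $big    = pack 'L>', 1;   # same bytes as pack 'N', 1
my $little = pack 'L<', 1;   # same bytes as pack 'V', 1

# The same two bytes read as a 16-bit value in each order.
my ($be) = unpack 'S>', "\x01\x02";
my ($le) = unpack 'S<', "\x01\x02";

# A modifier on a group applies to every type inside it.
my $group = pack '(SL)>', 1, 2;

printf "%s %s %d %d %s\n",
    unpack('H*', $big), unpack('H*', $little), $be, $le, unpack('H*', $group);
```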

no VERSION

You can now use no followed by a version number to specify that you want to use a version of perl older than the specified one.

chdir, chmod and chown on filehandles

chdir, chmod and chown can now work on filehandles as well as filenames, if the system supports respectively fchdir , fchmod and fchown , thanks to a patch provided by Gisle Aas.

OS groups

$( and $) now return groups in the order in which the OS returns them, thanks to Gisle Aas. This wasn't previously the case.

Recursive sort subs

You can now use recursive subroutines with sort(), thanks to Robin Houston.

Exceptions in constant folding

The constant folding routine is now wrapped in an exception handler, and if folding throws an exception (such as attempting to evaluate 0/0), perl now retains the current optree, rather than aborting the whole program. Without this change, programs would not compile if they had expressions that happened to generate exceptions, even though those expressions were in code that could never be reached at runtime. (Nicholas Clark, Dave Mitchell)
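A minimal sketch of the effect: the constant expression 1 / 0 below cannot be folded, so the failure is deferred to runtime, where an eval block can catch it. The variable names are illustrative.

```perl
use strict;
use warnings;

my $reached = 0;

# The program compiles even though 1 / 0 is a constant expression;
# the division only fails when this statement is actually executed.
my $ok = eval {
    $reached = 1;
    my $x = 1 / 0;
    1;
};
my $error = $@;

print "reached=$reached error=$error";
```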

Source filters in @INC

It's possible to enhance the mechanism of subroutine hooks in @INC by adding a source filter on top of the filehandle opened and returned by the hook. This feature was planned a long time ago, but wasn't quite working until now. See require for details. (Nicholas Clark)

New internal variables

  • ${^RE_DEBUG_FLAGS}

    This variable controls what debug flags are in effect for the regular expression engine when running under use re "debug" . See re for details.

  • ${^CHILD_ERROR_NATIVE}

    This variable gives the native status returned by the last pipe close, backtick command, successful call to wait() or waitpid(), or from the system() operator. See perlvar for details. (Contributed by Gisle Aas.)

  • ${^RE_TRIE_MAXBUF}

    See Trie optimisation of literal string alternations.

  • ${^WIN32_SLOPPY_STAT}

    See Sloppy stat on Windows.

Miscellaneous

unpack() now defaults to unpacking the $_ variable.

mkdir() without arguments now defaults to $_ .

The internal dump output has been improved, so that non-printable characters such as newline and backspace are output in \x notation, rather than octal.

The -C option can no longer be used on the #! line. It wasn't working there anyway, since the standard streams are already set up at this point in the execution of the perl interpreter. You can use binmode() instead to get the desired behaviour.

UCD 5.0.0

The copy of the Unicode Character Database included in Perl 5 has been updated to version 5.0.0.

MAD

MAD, which stands for Miscellaneous Attribute Decoration, is a still-in-development work leading to a Perl 5 to Perl 6 converter. To enable it, it's necessary to pass the argument -Dmad to Configure. The obtained perl isn't binary compatible with a regular perl 5.10, and has space and speed penalties; moreover not all regression tests still pass with it. (Larry Wall, Nicholas Clark)

kill() on Windows

On Windows platforms, kill(-9, $pid) now kills a process tree. (On Unix, this delivers the signal to all processes in the same process group.)

Incompatible Changes

Packing and UTF-8 strings

The semantics of pack() and unpack() regarding UTF-8-encoded data have been changed. Processing is now by default character per character instead of byte per byte on the underlying encoding. Notably, code that used things like pack("a*", $string) to see through the encoding of a string will now simply get back the original $string. Packed strings can also get upgraded during processing when you store upgraded characters. You can get the old behaviour by using use bytes.

To be consistent with pack(), the C0 in unpack() templates indicates that the data is to be processed in character mode, i.e. character by character; on the contrary, U0 in unpack() indicates UTF-8 mode, where the packed string is processed in its UTF-8-encoded Unicode form on a byte by byte basis. This is reversed with regard to perl 5.8.X, but now consistent between pack() and unpack().

Moreover, C0 and U0 can also be used in pack() templates to specify respectively character and byte modes.

C0 and U0 in the middle of a pack or unpack format now switch to the specified encoding mode, honoring parens grouping. Previously, parens were ignored.

Also, there is a new pack() character format, W, which is intended to replace the old C. C is kept for unsigned chars coded as bytes in the string's internal representation. W represents unsigned (logical) character values, which can be greater than 255. It is therefore more robust when dealing with potentially UTF-8-encoded data (as C will wrap values outside the range 0..255, and not respect the string encoding).

In practice, that means that pack formats are now encoding-neutral, except C .
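As a small sketch of the W format, the code point 0x263A (chosen arbitrarily for the example) packs to a single logical character, which C cannot represent:

```perl
use strict;
use warnings;

# W packs one logical character, even when its value is above 255.
my $wide = pack 'W', 0x263A;

my $chars = length $wide;   # counted in characters, not bytes
my $value = ord $wide;      # round-trips to the original code point

# C remains suitable for values that fit in a single byte.
my $byte = pack 'C', 65;

printf "%d character(s), U+%04X, byte '%s'\n", $chars, $value, $byte;
```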

For consistency, A in unpack() format now trims all Unicode whitespace from the end of the string. Before perl 5.9.2, it used to strip only the classical ASCII space characters.

Byte/character count feature in unpack()

A new unpack() template character, "." , returns the number of bytes or characters (depending on the selected encoding mode, see above) read so far.

The $* and $# variables have been removed

$* , which was deprecated in favor of the /s and /m regexp modifiers, has been removed.

The deprecated $# variable (output format for numbers) has been removed.

Two new severe warnings, $#/$* is no longer supported, have been added.

substr() lvalues are no longer fixed-length

The lvalues returned by the three argument form of substr() used to be a "fixed length window" on the original string. In some cases this could cause surprising action at a distance or other undefined behaviour. Now the length of the window adjusts itself to the length of the string assigned to it.
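The adjusting window can be observed by assigning to the same substr() lvalue twice, here through a foreach alias. The sample string and variable names are made up for this sketch.

```perl
use strict;
use warnings;

my $text = 'abcdef';
my @snapshots;

# foreach aliases $window to the lvalue returned by substr().
for my $window (substr($text, 1, 2)) {
    # Replace the two-character window "bc"; the window grows to four.
    $window = 'XYZW';
    push @snapshots, $text;

    # The second assignment replaces all four characters just stored.
    $window = '-';
    push @snapshots, $text;
}

print "@snapshots\n";
```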

Parsing of -f _

The identifier _ is now forced to be a bareword after a filetest operator. This solves a number of misparsing issues when a global _ subroutine is defined.

:unique

The :unique attribute has been made a no-op, since its current implementation was fundamentally flawed and not threadsafe.

Effect of pragmas in eval

The compile-time value of the %^H hint variable can now propagate into eval("")uated code. This makes it more useful to implement lexical pragmas.

As a side-effect of this, the overloaded-ness of constants now propagates into eval("").

chdir FOO

A bareword argument to chdir() is now recognized as a file handle. Earlier releases interpreted the bareword as a directory name. (Gisle Aas)

Handling of .pmc files

An old feature of perl was that before require or use look for a file with a .pm extension, they will first look for a similar filename with a .pmc extension. If this file is found, it will be loaded in place of any potentially existing file ending in a .pm extension.

Previously, .pmc files were loaded only if more recent than the matching .pm file. Starting with 5.9.4, they will always be loaded if they exist.

$^V is now a version object instead of a v-string

$^V can still be used with the %vd format in printf, but any character-level operations will now access the string representation of the version object and not the ordinals of a v-string. Expressions like substr($^V, 0, 2) or split //, $^V no longer work and must be rewritten.
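A sketch of the usages that keep working with the version object. The printed values depend on the perl that runs it, so none are shown here.

```perl
use strict;
use warnings;

# %vd still yields the dotted-decimal form, without the leading "v".
my $dotted = sprintf '%vd', $^V;

# Plain stringification gives the version object's string form.
my $string = "$^V";

# $^V is now an object rather than a v-string.
my $class = ref $^V;

# For numeric comparisons, $] remains a simple alternative.
my $at_least_510 = ($] >= 5.010) ? 1 : 0;

print "dotted=$dotted string=$string class=$class\n";
```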

@- and @+ in patterns

The special arrays @- and @+ are no longer interpolated in regular expressions. (Sadahiro Tomoyuki)

$AUTOLOAD can now be tainted

If you call a subroutine by a tainted name, and if it defers to an AUTOLOAD function, then $AUTOLOAD will be (correctly) tainted. (Rick Delaney)

Tainting and printf

When perl is run under taint mode, printf() and sprintf() will now reject any tainted format argument. (Rafael Garcia-Suarez)

undef and signal handlers

Undefining or deleting a signal handler via undef $SIG{FOO} is now equivalent to setting it to 'DEFAULT' . (Rafael Garcia-Suarez)

strictures and dereferencing in defined()

use strict 'refs' used to ignore a symbolic dereference in an argument to defined(), as in:

    use strict 'refs';
    my $x = 'foo';
    if (defined $$x) {...}

This now correctly produces the run-time error Can't use string as a SCALAR ref while "strict refs" in use.

defined @$foo and defined %$bar are now also subject to strict 'refs' (that is, $foo and $bar shall be proper references there.) (defined(@foo) and defined(%bar) are discouraged constructs anyway.) (Nicholas Clark)
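
Since this is a run-time error, it can be trapped with eval. The following sketch (variable names are illustrative) contrasts a plain string with a real reference:

```perl
use strict;
use warnings;

my $name = 'foo';       # a plain string, not a reference

# The symbolic dereference inside defined() now dies at run time,
# so trap it with eval to inspect the message.
my $survived = eval { my $is_defined = defined $$name; 1 };
my $error = $survived ? '' : $@;

# A real reference is still fine.
my $ref = \my $target;
my $ref_defined = defined $$ref ? 1 : 0;    # $target was never assigned

print $error;
```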

(?p{}) has been removed

The regular expression construct (?p{}), which was deprecated in perl 5.8, has been removed. Use (??{}) instead. (Rafael Garcia-Suarez)

Pseudo-hashes have been removed

Support for pseudo-hashes has been removed from Perl 5.9. (The fields pragma remains here, but uses an alternate implementation.)

Removal of the bytecode compiler and of perlcc

perlcc , the byteloader and the supporting modules (B::C, B::CC, B::Bytecode, etc.) are no longer distributed with the perl sources. Those experimental tools have never worked reliably, and, due to the lack of volunteers to keep them in line with the perl interpreter developments, it was decided to remove them instead of shipping a broken version of those. The last version of those modules can be found with perl 5.9.4.

However, the B compiler framework stays supported in the perl core, along with the more useful modules it has made possible (among others, B::Deparse and B::Concise).

Removal of the JPL

The JPL (Java-Perl Lingo) has been removed from the perl sources tarball.

Recursive inheritance detected earlier

Perl will now immediately throw an exception if you modify any package's @ISA in such a way that it would cause recursive inheritance.

Previously, the exception would not occur until Perl attempted to make use of the recursive inheritance while resolving a method or doing a $foo->isa($bar) lookup.
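
A minimal sketch of the new behaviour, using two hypothetical packages, Demo::Parent and Demo::Child, that exist only for this illustration:

```perl
use strict;
use warnings;

# Demo::Child already inherits from Demo::Parent.
@Demo::Child::ISA = ('Demo::Parent');

# Closing the loop is rejected at the point of assignment,
# before any method call is attempted.
my $survived = eval { @Demo::Parent::ISA = ('Demo::Child'); 1 };
my $error = $survived ? '' : $@;

print $error;
```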

warnings::enabled and warnings::warnif changed to favor users of modules

The behaviour in 5.10.x favors the person using the module; the behaviour in 5.8.x favors the module writer.

Assume the following code:

    main calls Foo::Bar::baz()
    Foo::Bar inherits from Foo::Base
    Foo::Bar::baz() calls Foo::Base::_bazbaz()
    Foo::Base::_bazbaz() calls: warnings::warnif('substr', 'some warning message');

On 5.8.x, the code warns when Foo::Bar contains use warnings; it does not matter whether Foo::Base or main have warnings enabled. To disable the warning, one has to modify Foo::Bar.

On 5.10.0 and newer, the code warns when main contains use warnings; it does not matter whether Foo::Base or Foo::Bar have warnings enabled. To disable the warning, one has to modify main.

Modules and Pragmata

Upgrading individual core modules

Even more core modules are now also available separately through the CPAN. If you wish to update one of these modules, you don't need to wait for a new perl release. From within the cpan shell, running the 'r' command will report on modules with upgrades available. See perldoc CPAN for more information.

Pragmata Changes

  • feature

    The new pragma feature is used to enable new features that might break old code. See The feature pragma above.

  • mro

    This new pragma makes it possible to change the algorithm used to resolve inherited methods. See New Pragma, mro above.

  • Scoping of the sort pragma

    The sort pragma is now lexically scoped. Its effect used to be global.

  • Scoping of bignum , bigint , bigrat

    The three numeric pragmas bignum , bigint and bigrat are now lexically scoped. (Tels)

  • base

    The base pragma now warns if a class tries to inherit from itself. (Curtis "Ovid" Poe)

  • strict and warnings

    strict and warnings will now complain loudly if they are loaded via incorrect casing (as in use Strict; ). (Johan Vromans)

  • version

    The version module provides support for version objects.

  • warnings

    The warnings pragma doesn't load Carp anymore. That means that code that used Carp routines without having loaded it at compile time might need to be adjusted; typically, the following (faulty) code won't work anymore, and will require parentheses to be added after the function name:

        use warnings;
        require Carp;
        Carp::confess 'argh';

  • less

    less now does something useful (or at least it tries to). In fact, it has been turned into a lexical pragma. So, in your modules, you can now test whether your users have requested to use less CPU, or less memory, less magic, or maybe even less fat. See less for more. (Joshua ben Jore)
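
As an illustration of the mro pragma listed above, the following sketch builds a small diamond hierarchy and compares the C3 order with the default depth-first order. The Demo:: packages are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Hypothetical diamond hierarchy, for illustration only.
package Demo::Top;    sub hello { 'Top' }
package Demo::Left;   our @ISA = ('Demo::Top');
package Demo::Right;  our @ISA = ('Demo::Top');  sub hello { 'Right' }
package Demo::Bottom; use mro 'c3'; our @ISA = ('Demo::Left', 'Demo::Right');

package main;

# Linearisation actually used by Demo::Bottom (C3, as requested above).
my @c3_order  = @{ mro::get_linear_isa('Demo::Bottom') };

# The same hierarchy resolved with the traditional depth-first search.
my @dfs_order = @{ mro::get_linear_isa('Demo::Bottom', 'dfs') };

# Under C3, Demo::Right is consulted before Demo::Top.
my $greeting = Demo::Bottom->hello;

print "c3:  @c3_order\ndfs: @dfs_order\nhello: $greeting\n";
```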

New modules

  • encoding::warnings , by Audrey Tang, is a module to emit warnings whenever an ASCII character string containing high-bit bytes is implicitly converted into UTF-8. It's a lexical pragma since Perl 5.9.4; on older perls, its effect is global.

  • Module::CoreList , by Richard Clamp, is a small handy module that tells you what versions of core modules ship with any versions of Perl 5. It comes with a command-line frontend, corelist .

  • Math::BigInt::FastCalc is an XS-enabled, and thus faster, version of Math::BigInt::Calc .

  • Compress::Zlib is an interface to the zlib compression library. It comes with a bundled version of zlib, so having a working zlib is not a prerequisite to install it. It's used by Archive::Tar (see below).

  • IO::Zlib is an IO:: -style interface to Compress::Zlib .

  • Archive::Tar is a module to manipulate tar archives.

  • Digest::SHA , a module used to calculate many types of SHA digests, has been included for SHA support in the CPAN module.

  • ExtUtils::CBuilder and ExtUtils::ParseXS have been added.

  • Hash::Util::FieldHash , by Anno Siegel, has been added. This module provides support for field hashes: hashes that maintain an association of a reference with a value, in a thread-safe garbage-collected way. Such hashes are useful to implement inside-out objects.

  • Module::Build , by Ken Williams, has been added. It's an alternative to ExtUtils::MakeMaker to build and install perl modules.

  • Module::Load , by Jos Boumans, has been added. It provides a single interface to load Perl modules and .pl files.

  • Module::Loaded , by Jos Boumans, has been added. It's used to mark modules as loaded or unloaded.

  • Package::Constants , by Jos Boumans, has been added. It's a simple helper to list all constants declared in a given package.

  • Win32API::File , by Tye McQueen, has been added (for Windows builds). This module provides low-level access to Win32 system API calls for files/dirs.

  • Locale::Maketext::Simple , needed by CPANPLUS, is a simple wrapper around Locale::Maketext::Lexicon . Note that Locale::Maketext::Lexicon isn't included in the perl core; the behaviour of Locale::Maketext::Simple gracefully degrades when the latter isn't present.

  • Params::Check implements a generic input parsing/checking mechanism. It is used by CPANPLUS.

  • Term::UI simplifies the task of asking questions at a terminal prompt.

  • Object::Accessor provides an interface to create per-object accessors.

  • Module::Pluggable is a simple framework to create modules that accept pluggable sub-modules.

  • Module::Load::Conditional provides simple ways to query and possibly load installed modules.

  • Time::Piece provides an object oriented interface to time functions, overriding the built-ins localtime() and gmtime().

  • IPC::Cmd helps to find and run external commands, possibly interactively.

  • File::Fetch provides a simple generic file fetching mechanism.

  • Log::Message and Log::Message::Simple are used by the log facility of CPANPLUS .

  • Archive::Extract is a generic archive extraction mechanism for .tar (plain, gzipped or bzipped) or .zip files.

  • CPANPLUS provides an API and a command-line tool to access the CPAN mirrors.

  • Pod::Escapes provides utilities that are useful in decoding Pod E<...> sequences.

  • Pod::Simple is now the backend for several of the Pod-related modules included with Perl.
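
As a brief illustration of one of the new modules, Digest::SHA can be used through its functional or its object interface. This is a sketch; the input string is the standard "abc" test vector.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex sha256_hex);

# Functional interface: one call per digest.
my $sha1   = sha1_hex('abc');
my $sha256 = sha256_hex('abc');

# Object interface, feeding the data in pieces.
my $ctx = Digest::SHA->new(256);
$ctx->add('a');
$ctx->add('bc');
my $streamed = $ctx->hexdigest;

print "$sha1\n$sha256\n";
```

The same digests are available from the command line through the shasum utility that ships with the module.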

Selected Changes to Core Modules

  • Attribute::Handlers

    Attribute::Handlers can now report the caller's file and line number. (David Feldman)

    All interpreted attributes are now passed as array references. (Damian Conway)

  • B::Lint

    B::Lint is now based on Module::Pluggable , and so can be extended with plugins. (Joshua ben Jore)

  • B

    It's now possible to access the lexical pragma hints (%^H ) by using the method B::COP::hints_hash(). It returns a B::RHE object, which in turn can be used to get a hash reference via the method B::RHE::HASH(). (Joshua ben Jore)

  • Thread

    As the old 5005thread threading model has been removed, in favor of the ithreads scheme, the Thread module is now a compatibility wrapper, to be used in old code only. It has been removed from the default list of dynamic extensions.

Utility Changes

  • perl -d

    The Perl debugger can now save all debugger commands for sourcing later; notably, it can now emulate stepping backwards, by restarting and rerunning all bar the last command from a saved command history.

    It can also display the parent inheritance tree of a given class, with the i command.

  • ptar

    ptar is a pure perl implementation of tar that comes with Archive::Tar .

  • ptardiff

    ptardiff is a small utility used to generate a diff between the contents of a tar archive and a directory tree. Like ptar , it comes with Archive::Tar .

  • shasum

    shasum is a command-line utility, used to print or to check SHA digests. It comes with the new Digest::SHA module.

  • corelist

    The corelist utility is now installed with perl (see New modules above).

  • h2ph and h2xs

    h2ph and h2xs have been made more robust with regard to "modern" C code.

    h2xs implements a new option --use-xsloader to force use of XSLoader even in backwards compatible modules.

    The handling of authors' names that had apostrophes has been fixed.

    Any enums with negative values are now skipped.

  • perlivp

    perlivp no longer checks for *.ph files by default. Use the new -a option to run all tests.

  • find2perl

    find2perl now assumes -print as a default action. Previously, it needed to be specified explicitly.

    Several bugs have been fixed in find2perl , regarding -exec and -eval . Also the options -path , -ipath and -iname have been added.

  • config_data

    config_data is a new utility that comes with Module::Build . It provides a command-line interface to the configuration of Perl modules that use Module::Build's framework of configurability (that is, *::ConfigData modules that contain local configuration information for their parent modules.)

  • cpanp

    cpanp , the CPANPLUS shell, has been added. (cpanp-run-perl , a helper for CPANPLUS operation, has been added too, but isn't intended for direct use).

  • cpan2dist

    cpan2dist is a new utility that comes with CPANPLUS. It's a tool to create distributions (or packages) from CPAN modules.

  • pod2html

    The output of pod2html has been enhanced to be more customizable via CSS. Some formatting problems were also corrected. (Jari Aalto)

New Documentation

The perlpragma manpage documents how to write one's own lexical pragmas in pure Perl (something that is possible starting with 5.9.4).

The new perlglossary manpage is a glossary of terms used in the Perl documentation, technical and otherwise, kindly provided by O'Reilly Media, Inc.

The perlreguts manpage, courtesy of Yves Orton, describes internals of the Perl regular expression engine.

The perlreapi manpage describes the interface to the perl interpreter used to write pluggable regular expression engines (by Ævar Arnfjörð Bjarmason).

The perlunitut manpage is a tutorial for programming with Unicode and string encodings in Perl, courtesy of Juerd Waalboer.

A new manual page, perlunifaq (the Perl Unicode FAQ), has been added (Juerd Waalboer).

The perlcommunity manpage gives a description of the Perl community on the Internet and in real life. (Edgar "Trizor" Bering)

The CORE manual page documents the CORE:: namespace. (Tels)

The long-existing feature of /(?{...})/ regexps setting $_ and pos() is now documented.

Performance Enhancements

In-place sorting

Sorting arrays in place (@a = sort @a ) is now optimized to avoid making a temporary copy of the array.

Likewise, reverse sort ... is now optimized to sort in reverse, avoiding the generation of a temporary intermediate list.
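
Both forms are ordinary Perl and need no special syntax to benefit, as in this short sketch (the data is illustrative):

```perl
use strict;
use warnings;

my @numbers = (42, 7, 19, 3);

# Sorting an array back into itself: the case handled in place.
@numbers = sort { $a <=> $b } @numbers;

# "reverse sort" is carried out as a single descending sort.
my @descending = reverse sort { $a <=> $b } @numbers;

print "@numbers\n@descending\n";
```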

Lexical array access

Access to elements of lexical arrays via a numeric constant between 0 and 255 is now faster. (This used to be only the case for global arrays.)

XS-assisted SWASHGET

Some pure-perl code that perl was using to retrieve Unicode properties and transliteration mappings has been reimplemented in XS.

Constant subroutines

The interpreter internals now support a far more memory efficient form of inlineable constants. Storing a reference to a constant value in a symbol table is equivalent to a full typeglob referencing a constant subroutine, but using about 400 bytes less memory. This proxy constant subroutine is automatically upgraded to a real typeglob with subroutine if necessary. The approach taken is analogous to the existing space optimisation for subroutine stub declarations, which are stored as plain scalars in place of the full typeglob.

Several of the core modules have been converted to use this feature for their system dependent constants - as a result use POSIX; now takes about 200K less memory.

PERL_DONT_CREATE_GVSV

The new compilation flag PERL_DONT_CREATE_GVSV , introduced as an option in perl 5.8.8, is turned on by default in perl 5.9.3. It prevents perl from creating an empty scalar with every new typeglob. See perl589delta for details.

Weak references are cheaper

Weak reference creation is now O(1) rather than O(n), courtesy of Nicholas Clark. Weak reference deletion remains O(n), but if deletion only happens at program exit, it may be skipped completely.

sort() enhancements

Salvador Fandiño provided improvements to reduce the memory usage of sort and to speed up some cases.

Memory optimisations

Several internal data structures (typeglobs, GVs, CVs, formats) have been restructured to use less memory. (Nicholas Clark)

UTF-8 cache optimisation

The UTF-8 caching code is now more efficient, and used more often. (Nicholas Clark)

Sloppy stat on Windows

On Windows, perl's stat() function normally opens the file to determine the link count and update attributes that may have been changed through hard links. Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up stat() by not performing this operation. (Jan Dubois)

Regular expressions optimisations

  • Engine de-recursivised

    The regular expression engine is no longer recursive, meaning that patterns that used to overflow the stack will either die with useful explanations, or run to completion, which, since they were able to blow the stack before, will likely take a very long time to happen. If you were experiencing the occasional stack overflow (or segfault) and upgrade to discover that now perl apparently hangs instead, look for a degenerate regex. (Dave Mitchell)

  • Single char char-classes treated as literals

    Classes of a single character are now treated the same as if the character had been used as a literal, meaning that code that uses char-classes as an escaping mechanism will see a speedup. (Yves Orton)

  • Trie optimisation of literal string alternations

    Alternations, where possible, are optimised into more efficient matching structures. String literal alternations are merged into a trie and are matched simultaneously. This means that instead of O(N) time for matching N alternations at a given point, the new code performs in O(1) time. A new special variable, ${^RE_TRIE_MAXBUF}, has been added to fine-tune this optimization. (Yves Orton)

    Note: Much code exists that works around perl's historic poor performance on alternations. Often the tricks used to do so will disable the new optimisations. Hopefully the utility modules used for this purpose will be educated about these new optimisations.

  • Aho-Corasick start-point optimisation

    When a pattern starts with a trie-able alternation and there aren't better optimisations available, the regex engine will use Aho-Corasick matching to find the start point. (Yves Orton)
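
A plain alternation of literal strings is the kind of pattern that should qualify for the trie optimisation described above. The sketch below builds one from a word list; the keywords and input lines are illustrative. Whether a given pattern is actually compiled into a trie can be checked with use re 'debug'.

```perl
use strict;
use warnings;

my @keywords    = qw(apple apricot banana cherry);
my $alternation = join '|', map { quotemeta($_) } @keywords;

# A straightforward alternation of literals, left for the engine to optimise.
my $keyword_re = qr/\b(?:$alternation)\b/;

my @lines   = ('ripe banana', 'green pear', 'apricot jam');
my @matched = grep { $_ =~ $keyword_re } @lines;

print "$_\n" for @matched;
```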

Installation and Configuration Improvements

Configuration improvements

  • -Dusesitecustomize

    Run-time customization of @INC can be enabled by passing the -Dusesitecustomize flag to Configure. When enabled, this will make perl run $sitelibexp/sitecustomize.pl before anything else. This script can then be set up to add additional entries to @INC.

  • Relocatable installations

    There is now Configure support for creating a relocatable perl tree. If you Configure with -Duserelocatableinc , then the paths in @INC (and everything else in %Config) can be optionally located via the path of the perl executable.

    That means that, if the string ".../" is found at the start of any path, it's substituted with the directory of $^X. So, the relocation can be configured on a per-directory basis, although the default with -Duserelocatableinc is that everything is relocated. The initial install is done to the original configured prefix.

  • strlcat() and strlcpy()

    The configuration process now detects whether strlcat() and strlcpy() are available. When they are not available, perl's own version is used (from Russ Allbery's public domain implementation). Various places in the perl interpreter now use them. (Steve Peters)

  • d_pseudofork and d_printf_format_null

    A new configuration variable, available as $Config{d_pseudofork} in the Config module, has been added, to distinguish real fork() support from fake pseudofork used on Windows platforms.

    A new configuration variable, d_printf_format_null , has been added, to see if printf-like formats are allowed to be NULL.

  • Configure help

    Configure -h has been extended with the most commonly used options.
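
The new configuration variables listed above can be read at run time through the Config module. A minimal sketch (the descriptive strings are illustrative):

```perl
use strict;
use warnings;
use Config;

# Distinguish a native fork() from the emulation used on Windows.
my $fork_kind = $Config{d_pseudofork} ? 'emulated (pseudofork)'
              : $Config{d_fork}       ? 'native fork()'
              :                         'none';

# Whether printf-like formats are allowed to be NULL on this build.
my $null_format_ok = $Config{d_printf_format_null} ? 1 : 0;

print "fork support: $fork_kind\n";
print "NULL printf formats allowed: $null_format_ok\n";
```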

Compilation improvements

  • Parallel build

    Parallel makes should work properly now, although there may still be problems if make test is instructed to run in parallel.

  • Borland's compilers support

    Building with Borland's compilers on Win32 should work more smoothly. In particular Steve Hay has worked to sidestep many warnings emitted by their compilers and at least one C compiler internal error.

  • Static build on Windows

    Perl extensions on Windows now can be statically built into the Perl DLL.

    Also, it's now possible to build a perl-static.exe that doesn't depend on the Perl DLL on Win32. See the Win32 makefiles for details. (Vadim Konovalov)

  • ppport.h files

    All ppport.h files in the XS modules bundled with perl are now autogenerated at build time. (Marcus Holland-Moritz)

  • C++ compatibility

    Efforts have been made to make perl and the core XS modules compilable with various C++ compilers (although the situation is not perfect with some of the compilers on some of the platforms tested.)

  • Support for Microsoft 64-bit compiler

    Support for building perl with Microsoft's 64-bit compiler has been improved. (ActiveState)

  • Visual C++

    Perl can now be compiled with Microsoft Visual C++ 2005 (and 2008 Beta 2).

  • Win32 builds

    All win32 builds (MS-Win, WinCE) have been merged and cleaned up.

Installation improvements

  • Module auxiliary files

    README files and changelogs for CPAN modules bundled with perl are no longer installed.

New Or Improved Platforms

Perl has been reported to work on Symbian OS. See perlsymbian for more information.

Many improvements have been made towards making Perl work correctly on z/OS.

Perl has been reported to work on DragonFlyBSD and MidnightBSD.

Perl has also been reported to work on NexentaOS ( http://www.gnusolaris.org/ ).

The VMS port has been improved. See perlvms.

Support for Cray XT4 Catamount/Qk has been added. See hints/catamount.sh in the source code distribution for more information.

Vendor patches have been merged for RedHat and Gentoo.

DynaLoader::dl_unload_file() now works on Windows.

Selected Bug Fixes

  • strictures in regexp-eval blocks

    strict wasn't in effect in regexp-eval blocks (/(?{...})/ ).

  • Calling CORE::require()

    CORE::require() and CORE::do() were always parsed as require() and do() when they were overridden. This is now fixed.

  • Subscripts of slices

    You can now use a non-arrowed form for chained subscripts after a list slice, like in:

        ({foo => "bar"})[0]{foo}

    This used to be a syntax error; a -> was required.

  • no warnings 'category' works correctly with -w

    Previously, when running with warnings enabled globally via -w , selectively disabling specific warning categories would actually turn off all warnings. This is now fixed; no warnings 'io'; will now only turn off warnings in the io class.

  • threads improvements

    Several memory leaks in ithreads were closed. Also, ithreads were made less memory-intensive.

    threads is now a dual-life module, also available on CPAN. It has been expanded in many ways. A kill() method is available for thread signalling. One can get thread status, or the list of running or joinable threads.

    A new threads->exit() method is used to exit from the application (this is the default for the main thread) or from the current thread only (this is the default for all other threads). On the other hand, the exit() built-in now always causes the whole application to terminate. (Jerry D. Hedden)

  • chr() and negative values

    chr() on a negative value now gives \x{FFFD} , the Unicode replacement character, unless the bytes pragma is in effect, in which case the low eight bits of the value are used.

  • PERL5SHELL and tainting

    On Windows, the PERL5SHELL environment variable is now checked for taintedness. (Rafael Garcia-Suarez)

  • Using *FILE{IO}

    stat() and -X filetests now treat *FILE{IO} filehandles like *FILE filehandles. (Steve Peters)

  • Overloading and reblessing

    Overloading now works when references are reblessed into another class. Internally, this has been implemented by moving the flag for "overloading" from the reference to the referent, which logically is where it should always have been. (Nicholas Clark)

  • Overloading and UTF-8

    A few bugs related to UTF-8 handling with objects that have stringification overloaded have been fixed. (Nicholas Clark)

  • eval memory leaks fixed

    Traditionally, eval 'syntax error' has leaked badly. Many (but not all) of these leaks have now been eliminated or reduced. (Dave Mitchell)

  • Random device on Windows

    In previous versions, perl would read the file /dev/urandom if it existed when seeding its random number generator. That file is unlikely to exist on Windows, and if it did would probably not contain appropriate data, so perl no longer tries to read it on Windows. (Alex Davies)

  • PERLIO_DEBUG

    The PERLIO_DEBUG environment variable no longer has any effect for setuid scripts and for scripts run with -T.

    Moreover, with a thread-enabled perl, using PERLIO_DEBUG could lead to an internal buffer overflow. This has been fixed.

  • PerlIO::scalar and read-only scalars

    PerlIO::scalar will now prevent writing to read-only scalars. Moreover, seek() is now supported with PerlIO::scalar-based filehandles, the underlying string being zero-filled as needed. (Rafael, Jarkko Hietaniemi)

  • study() and UTF-8

    study() never worked for UTF-8 strings, but could lead to false results. It's now a no-op on UTF-8 data. (Yves Orton)

  • Critical signals

    The signals SIGILL, SIGBUS and SIGSEGV are now always delivered in an "unsafe" manner (contrary to other signals, that are deferred until the perl interpreter reaches a reasonably stable state; see Deferred Signals (Safe Signals) in perlipc). (Rafael)

  • @INC-hook fix

    When a module or a file is loaded through an @INC-hook, and when this hook has set a filename entry in %INC, __FILE__ is now set for this module according to the contents of that %INC entry. (Rafael)

  • -t switch fix

    The -w and -t switches can now be used together without messing up which categories of warnings are activated. (Rafael)

  • Duping UTF-8 filehandles

    Duping a filehandle which has the :utf8 PerlIO layer set will now properly carry that layer on the duped filehandle. (Rafael)

  • Localisation of hash elements

    Localizing a hash element whose key was given as a variable didn't work correctly if the variable was changed while the local() was in effect (as in local $h{$x}; ++$x ). (Bo Lindbergh)
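
Two of the fixes listed above, chained subscripts after a list slice and chr() on negative values, can be exercised with a short sketch (the variable names are illustrative):

```perl
use strict;
use warnings;

# Chained subscript directly after a list slice; no "->" is needed.
my $value = ({ foo => "bar" })[0]{foo};

# chr() on a negative value yields U+FFFD, the replacement character ...
my $replacement = do { no warnings 'utf8'; chr(-1) };

# ... unless the bytes pragma is in effect, where the low eight bits are used.
my $byte = do { use bytes; no warnings 'utf8'; chr(-1) };

printf "%s U+%04X 0x%02X\n", $value, ord($replacement), ord($byte);
```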

New or Changed Diagnostics

  • Use of uninitialized value

    Perl will now try to tell you the name of the variable (if any) that was undefined.

  • Deprecated use of my() in false conditional

    A new deprecation warning, Deprecated use of my() in false conditional, has been added, to warn against the use of the dubious and deprecated construct

        my $x if 0;

    See perldiag. Use state variables instead.

  • !=~ should be !~

    A new warning, !=~ should be !~ , is emitted to prevent this misspelling of the non-matching operator.

  • Newline in left-justified string

    The warning Newline in left-justified string has been removed.

  • Too late for "-T" option

    The error Too late for "-T" option has been reformulated to be more descriptive.

  • "%s" variable %s masks earlier declaration

    This warning is now emitted in more consistent cases; in short, when one of the declarations involved is a my variable:

        my $x; my $x; # warns
        my $x; our $x; # warns
        our $x; my $x; # warns

    On the other hand, the following:

        our $x; our $x;

    now gives a "our" variable %s redeclared warning.

  • readdir()/closedir()/etc. attempted on invalid dirhandle

    These new warnings are now emitted when a dirhandle is used but is either closed or not really a dirhandle.

  • Opening dirhandle/filehandle %s also as a file/directory

    Two deprecation warnings have been added: (Rafael)

        Opening dirhandle %s also as a file
        Opening filehandle %s also as a directory

  • Use of -P is deprecated

    Perl's command-line switch -P is now deprecated.

  • v-string in use/require is non-portable

    Perl will warn you against potential backwards compatibility problems with the use VERSION syntax.

  • perl -V

    perl -V has several improvements, making it more useable from shell scripts to get the value of configuration variables. See perlrun for details.
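
For the deprecated my $x if 0 construct mentioned above, a state variable gives the same persistent-but-private effect. A minimal sketch, with a hypothetical next_id() function:

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical counter function, for illustration only.
sub next_id {
    state $counter = 0;     # initialised once, kept across calls
    return ++$counter;
}

my @ids = map { next_id() } 1 .. 3;
print "@ids\n";
```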

Changed Internals

In general, the source code of perl has been refactored, tidied up, and optimized in many places. Also, memory management and allocation have been improved in several places.

When compiling the perl core with gcc, as many gcc warning flags are turned on as is possible on the platform. (This quest for cleanliness doesn't extend to XS code because we cannot guarantee the tidiness of code we didn't write.) Similar strictness flags have been added or tightened for various other C compilers.

Reordering of SVt_* constants

The relative ordering of constants that define the various types of SV has changed; in particular, SVt_PVGV has been moved before SVt_PVLV , SVt_PVAV , SVt_PVHV and SVt_PVCV . This is unlikely to make any difference unless you have code that explicitly makes assumptions about that ordering. (The inheritance hierarchy of B::* objects has been changed to reflect this.)

Elimination of SVt_PVBM

Related to this, the internal type SVt_PVBM has been removed. This dedicated type of SV was used by the index operator and parts of the regexp engine to facilitate fast Boyer-Moore matches. Its use internally has been replaced by SVs of type SVt_PVGV .

New type SVt_BIND

A new type SVt_BIND has been added, in readiness for the project to implement Perl 6 on 5. There deliberately is no implementation yet, and SVs of this type cannot yet be created or destroyed.

Removal of CPP symbols

The C preprocessor symbols PERL_PM_APIVERSION and PERL_XS_APIVERSION , which were supposed to give the version number of the oldest perl binary-compatible (resp. source-compatible) with the present one, were not used, and sometimes had misleading values. They have been removed.

Less space is used by ops

The BASEOP structure now uses less space. The op_seq field has been removed and replaced by a single bit bit-field op_opt . op_type is now 9 bits long. (Consequently, the B::OP class doesn't provide an seq method anymore.)

New parser

perl's parser is now generated by bison (it used to be generated by byacc.) As a result, it seems to be a bit more robust.

Also, Dave Mitchell improved the lexer debugging output under -DT .

Use of const

Andy Lester supplied many improvements to determine which function parameters and local variables could actually be declared const to the C compiler. Steve Peters provided new *_set macros and reworked the core to use these rather than assigning to macros in LVALUE context.

Mathoms

A new file, mathoms.c, has been added. It contains functions that are no longer used in the perl core, but that remain available for binary or source compatibility reasons. However, those functions will not be compiled in if you add -DNO_MATHOMS in the compiler flags.

AvFLAGS has been removed

The AvFLAGS macro has been removed.

av_* changes

The av_*() functions, used to manipulate arrays, no longer accept null AV* parameters.

$^H and %^H

The implementation of the special variables $^H and %^H has changed, to allow implementing lexical pragmas in pure Perl.

B:: modules inheritance changed

The inheritance hierarchy of B:: modules has changed; B::NV now inherits from B::SV (it used to inherit from B::IV ).

Anonymous hash and array constructors

The anonymous hash and array constructors now take 1 op in the optree instead of 3, now that pp_anonhash and pp_anonlist return a reference to a hash/array when the op is flagged with OPf_SPECIAL. (Nicholas Clark)

Known Problems

There's still a remaining problem in the implementation of the lexical $_ : it doesn't work inside /(?{...})/ blocks. (See the TODO test in t/op/mydef.t.)

Stacked filetest operators won't work when the filetest pragma is in effect, because they rely on the stat() buffer _ being populated, and filetest bypasses stat().

UTF-8 problems

The handling of Unicode still is unclean in several places, where it's dependent on whether a string is internally flagged as UTF-8. This will be made more consistent in perl 5.12, but that won't be possible without a certain amount of backwards incompatibility.

Platform Specific Problems

When compiled with g++ and thread support on Linux, it's reported that $! stops working correctly. This is related to the fact that glibc provides two strerror_r(3) implementations, and perl selects the wrong one.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/rt3/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

SEE ALSO

The Changes file and the perl590delta to perl595delta man pages for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

perl5101delta

NAME

perl5101delta - what is new for perl v5.10.1

DESCRIPTION

This document describes differences between the 5.10.0 release and the 5.10.1 release.

If you are upgrading from an earlier release such as 5.8.8, first read perl5100delta, which describes differences between 5.8.8 and 5.10.0.

Incompatible Changes

Switch statement changes

The handling of complex expressions by the given/when switch statement has been enhanced. There are two new cases where when now interprets its argument as a boolean, instead of an expression to be used in a smart match:

  • flip-flop operators

    The .. and ... flip-flop operators are now evaluated in boolean context, following their usual semantics; see Range Operators in perlop.

    Note that, as in perl 5.10.0, when (1..10) will not work to test whether a given value is an integer between 1 and 10; you should use when ([1..10]) instead (note the array reference).

    Unlike in 5.10.0, however, the flip-flop operators are now evaluated in boolean context, which makes them useful in a when(), notably for implementing bistable conditions such as:

        when (/^=begin/ .. /^=end/) {
            # do something
        }

  • defined-or operator

    A compound expression involving the defined-or operator, as in when (expr1 // expr2), will be treated as boolean if the first expression is boolean. (This just extends the existing rule that applies to the regular or operator, as in when (expr1 || expr2) .)
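The bistable behaviour mentioned for the flip-flop operators is their ordinary boolean-context semantics, so it can be tried without given/when at all. The following is a minimal sketch using a plain statement modifier; the sample lines and the @pod array are made up for illustration and are not part of the original text.

```perl
use strict;
use warnings;

# Hypothetical input: a few lines with an =begin/=end region in the middle.
my @lines = ('intro', '=begin', 'body 1', 'body 2', '=end', 'outro');

my @pod;
for (@lines) {
    # The flip-flop turns true on the line matching /^=begin/ and stays
    # true up to and including the line matching /^=end/.
    push @pod, $_ if /^=begin/ .. /^=end/;
}

print join('|', @pod), "\n";
```

The same condition placed inside a when() relies on the 5.10.1 rule that the flip-flop is evaluated as a boolean rather than smart matched.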

The next section details further changes to the semantics of the smart match operator, which naturally also modify the behaviour of switch statements where smart matching is implicitly used.

Smart match changes

Changes to type-based dispatch

The smart match operator ~~ is no longer commutative. The behaviour of a smart match now depends primarily on the type of its right hand argument. Moreover, its semantics have been adjusted for greater consistency or usefulness in several cases. While backwards compatibility is generally maintained, several changes must be noted:

  • Code references with an empty prototype are no longer treated specially. They are passed an argument like the other code references (even if they choose to ignore it).

  • %hash ~~ sub {} and @array ~~ sub {} now test that the subroutine returns a true value for each key of the hash (or element of the array), instead of passing the whole hash or array as a reference to the subroutine.

  • Because smart match is no longer commutative, code references are no longer treated specially when they appear on the left of the ~~ operator; they are treated like any ordinary scalar.

  • undef ~~ %hash is always false (since undef can't be a key in a hash). No implicit conversion to "" is done (as was the case in perl 5.10.0).

  • $scalar ~~ @array now always distributes the smart match across the elements of the array. It is true if at least one element of @array satisfies $scalar ~~ $element. This is a generalization of the old behaviour that tested whether the array contained the scalar.

The full dispatch table for the smart match operator is given in Smart matching in detail in perlsyn.

Smart match and overloading

According to the rule of dispatch based on the rightmost argument type, when an object overloading ~~ appears on the right side of the operator, the overload routine will always be called (with a 3rd argument set to a true value; see overload). However, when the object appears on the left, the overload routine will be called only when the rightmost argument is a simple scalar. This way, distributivity of smart match across arrays is not broken, nor are the other behaviours with complex types (coderefs, hashes, regexes). Thus, writers of overloading routines for smart match mostly need to worry only about comparing against a scalar, and possibly about stringification overloading; the other common cases will be automatically handled consistently.

~~ will now refuse to work on objects that do not overload it (in order to avoid relying on the object's underlying structure). (However, if the object overloads the stringification or the numification operators, and if overload fallback is active, it will be used instead, as usual.)

Other incompatible changes

  • The semantics of use feature :5.10* have changed slightly. See Modules and Pragmata for more information.

  • It is now a run-time error to use the smart match operator ~~ with an object that has no overload defined for it. (This way ~~ will not break encapsulation by matching against the object's internal representation as a reference.)

  • The version control system used for the development of the perl interpreter has been switched from Perforce to git. This is mainly an internal issue that only affects people actively working on the perl core; but it may have minor external visibility, for example in some details of the output of perl -V. See perlrepository for more information.

  • The internal structure of the ext/ directory in the perl source has been reorganised. In general, a module Foo::Bar whose source was stored under ext/Foo/Bar/ is now located under ext/Foo-Bar/. Also, some modules have been moved from lib/ to ext/. This is purely a source tarball change, and should make no difference to the compilation or installation of perl, unless you have a very customised build process that explicitly relies on this structure, or which hard-codes the nonxs_ext Configure parameter. Specifically, this change does not by default alter the location of any files in the final installation.

  • As part of the Test::Harness 2.x to 3.x upgrade, the experimental Test::Harness::Straps module has been removed. See Updated Modules for more details.

  • As part of the ExtUtils::MakeMaker upgrade, the ExtUtils::MakeMaker::bytes and ExtUtils::MakeMaker::vmsish modules have been removed from this distribution.

  • Module::CoreList no longer contains the %Module::CoreList::patchlevel hash.

  • This one is actually a change introduced in 5.10.0, but it was missed from that release's perldelta, so it is mentioned here instead.

    A bugfix related to the handling of the /m modifier and qr resulted in a change of behaviour between 5.8.x and 5.10.0:

        # matches in 5.8.x, doesn't match in 5.10.0
        $re = qr/^bar/; "foo\nbar" =~ /$re/m;
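For code that depended on the 5.8.x behaviour, one possible adjustment is to put the /m modifier on the qr// object itself, since a compiled pattern keeps its own flags when interpolated. The sketch below compares the two spellings; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = "foo\nbar";

# The qr// object carries its own flags, so an outer /m does not reach
# inside it: ^ can only match at the very start of the string here.
my $re_plain = qr/^bar/;
my $outer_m  = ($text =~ /$re_plain/m) ? 1 : 0;

# Putting /m on the qr// itself lets ^ match after the embedded newline.
my $re_multi = qr/^bar/m;
my $inner_m  = ($text =~ /$re_multi/) ? 1 : 0;

print "outer /m only: $outer_m\n";
print "qr//m:         $inner_m\n";
```

On 5.10.0 and later the first match is expected to fail and the second to succeed.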

Core Enhancements

Unicode Character Database 5.1.0

The copy of the Unicode Character Database included in Perl 5.10.1 has been updated to 5.1.0 from 5.0.0. See http://www.unicode.org/versions/Unicode5.1.0/#Notable_Changes for the notable changes.

A proper interface for pluggable Method Resolution Orders

As of Perl 5.10.1 there is a new interface for plugging and using method resolution orders other than the default (linear depth first search). The C3 method resolution order added in 5.10.0 has been re-implemented as a plugin, without changing its Perl-space interface. See perlmroapi for more information.

The overloading pragma

This pragma allows you to lexically disable or enable overloading for some or all operations. (Yuval Kogman)
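As a rough illustration of the lexical scoping, the sketch below overloads stringification on a made-up Temperature class (not part of Perl or of the original text) and then switches overloading off inside one block.

```perl
use strict;
use warnings;

package Temperature;    # hypothetical class, for illustration only
use overload '""' => sub { $_[0]{degrees} . ' C' };
sub new { my ($class, $degrees) = @_; return bless { degrees => $degrees }, $class }

package main;

my $t = Temperature->new(21);

# Overloaded stringification is used here.
my $with = "$t";

my $without;
{
    no overloading;     # disables all overloading until the end of this block
    $without = "$t";    # plain reference string such as Temperature=HASH(0x...)
}

print "with overloading:    $with\n";
print "without overloading: $without\n";
```

Individual operations can also be named, as in no overloading '""', to disable only part of a class's overloading.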

Parallel tests

The core distribution can now run its regression tests in parallel on Unix-like platforms. Instead of running make test , set TEST_JOBS in your environment to the number of tests to run in parallel, and run make test_harness . On a Bourne-like shell, this can be done as

    TEST_JOBS=3 make test_harness    # Run 3 tests in parallel

An environment variable is used, rather than parallel make itself, because TAP::Harness needs to be able to schedule individual non-conflicting test scripts itself, and there is no standard interface to make utilities to interact with their job schedulers.

Note that currently some test scripts may fail when run in parallel (most notably ext/IO/t/io_dir.t ). If necessary run just the failing scripts again sequentially and see if the failures go away.

DTrace support

Some support for DTrace has been added. See "DTrace support" in INSTALL.

Support for configure_requires in CPAN module metadata

Both CPAN and CPANPLUS now support the configure_requires keyword in the META.yml metadata file included in most recent CPAN distributions. This allows distribution authors to specify configuration prerequisites that must be installed before running Makefile.PL or Build.PL.

See the documentation for ExtUtils::MakeMaker or Module::Build for more on how to specify configure_requires when creating a distribution for CPAN.

Modules and Pragmata

New Modules and Pragmata

  • autodie

    This is a new lexically-scoped alternative for the Fatal module. The bundled version is 2.06_01. Note that in this release, using a string eval when autodie is in effect can cause the autodie behaviour to leak into the surrounding scope. See BUGS in autodie for more details.

  • Compress::Raw::Bzip2

    This has been added to the core (version 2.020).

  • parent

    This pragma establishes an ISA relationship with base classes at compile time. It provides the key feature of base without the feature creep.

  • Parse::CPAN::Meta

    This has been added to the core (version 1.39).
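The two new pragmata in this list, autodie and parent, can be sketched together as follows. The Animal and Dog classes and the file path are invented for the example; the path is simply assumed not to exist.

```perl
use strict;
use warnings;

package Animal;                       # hypothetical base class
sub new   { my $class = shift; return bless {}, $class }
sub speak { return '...' }

package Dog;                          # hypothetical subclass
# -norequire: Animal lives in this file, so parent must not try to load Animal.pm
use parent -norequire, 'Animal';
sub speak { return 'Woof' }

package main;

my $dog = Dog->new;
print $dog->speak, "\n";

my $path = '/nonexistent/dir/file.txt';   # assumed not to exist

my $died_inside = 0;
{
    use autodie;                      # lexically scoped: only this block is affected
    eval { open(my $fh, '<', $path); 1 } or $died_inside = 1;
}

# Outside the block, open() goes back to returning false on failure.
my $ok_outside = open(my $fh2, '<', $path) ? 1 : 0;

print "died inside autodie scope: $died_inside\n";
print "open succeeded outside:    $ok_outside\n";
```

The example uses a block eval rather than a string eval, because of the scope leak noted above for this release of autodie.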

Pragmata Changes

  • attributes

    Upgraded from version 0.08 to 0.09.

  • attrs

    Upgraded from version 1.02 to 1.03.

  • base

    Upgraded from version 2.13 to 2.14. See parent for a replacement.

  • bigint

    Upgraded from version 0.22 to 0.23.

  • bignum

    Upgraded from version 0.22 to 0.23.

  • bigrat

    Upgraded from version 0.22 to 0.23.

  • charnames

    Upgraded from version 1.06 to 1.07.

    The Unicode NameAliases.txt database file has been added. This has the effect of adding some extra \N character names that formerly wouldn't have been recognised; for example, "\N{LATIN CAPITAL LETTER GHA}" .

  • constant

    Upgraded from version 1.13 to 1.17.

  • feature

    The meaning of the :5.10 and :5.10.X feature bundles has changed slightly. The last component, if any (i.e. X ) is simply ignored. This is predicated on the assumption that new features will not, in general, be added to maintenance releases. So :5.10 and :5.10.X have identical effect. This is a change to the behaviour documented for 5.10.0.

  • fields

    Upgraded from version 2.13 to 2.14 (this was just a version bump; there were no functional changes).

  • lib

    Upgraded from version 0.5565 to 0.62.

  • open

    Upgraded from version 1.06 to 1.07.

  • overload

    Upgraded from version 1.06 to 1.07.

  • overloading

    See The overloading pragma above.

  • version

    Upgraded from version 0.74 to 0.77.
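Two of the changes in this list are easy to try from a script: the feature bundle and the extra \N character names. This is a small sketch under the assumption that the running perl ships a Unicode database containing NameAliases.txt, as 5.10.1 and later do.

```perl
use strict;
use warnings;
use feature ':5.10';        # per the text above, ':5.10.1' selects the same bundle
use charnames ':full';

# A name that is recognised through the NameAliases.txt database.
my $gha = "\N{LATIN CAPITAL LETTER GHA}";

# say comes from the :5.10 bundle; print the code point rather than the
# character itself to avoid wide-character output issues.
say sprintf 'U+%04X', ord $gha;
```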

Updated Modules

  • Archive::Extract

    Upgraded from version 0.24 to 0.34.

  • Archive::Tar

    Upgraded from version 1.38 to 1.52.

  • Attribute::Handlers

    Upgraded from version 0.79 to 0.85.

  • AutoLoader

    Upgraded from version 5.63 to 5.68.

  • AutoSplit

    Upgraded from version 1.05 to 1.06.

  • B

    Upgraded from version 1.17 to 1.22.

  • B::Debug

    Upgraded from version 1.05 to 1.11.

  • B::Deparse

    Upgraded from version 0.83 to 0.89.

  • B::Lint

    Upgraded from version 1.09 to 1.11.

  • B::Xref

    Upgraded from version 1.01 to 1.02.

  • Benchmark

    Upgraded from version 1.10 to 1.11.

  • Carp

    Upgraded from version 1.08 to 1.11.

  • CGI

    Upgraded from version 3.29 to 3.43 (also includes the "default_value for popup_menu()" fix from 3.45).

  • Compress::Zlib

    Upgraded from version 2.008 to 2.020.

  • CPAN

    Upgraded from version 1.9205 to 1.9402. CPAN::FTP has a local fix to stop it being too verbose on download failure.

  • CPANPLUS

    Upgraded from version 0.84 to 0.88.

  • CPANPLUS::Dist::Build

    Upgraded from version 0.06_02 to 0.36.

  • Cwd

    Upgraded from version 3.25_01 to 3.30.

  • Data::Dumper

    Upgraded from version 2.121_14 to 2.124.

  • DB

    Upgraded from version 1.01 to 1.02.

  • DB_File

    Upgraded from version 1.816_1 to 1.820.

  • Devel::PPPort

    Upgraded from version 3.13 to 3.19.

  • Digest::MD5

    Upgraded from version 2.36_01 to 2.39.

  • Digest::SHA

    Upgraded from version 5.45 to 5.47.

  • DirHandle

    Upgraded from version 1.01 to 1.03.

  • Dumpvalue

    Upgraded from version 1.12 to 1.13.

  • DynaLoader

    Upgraded from version 1.08 to 1.10.

  • Encode

    Upgraded from version 2.23 to 2.35.

  • Errno

    Upgraded from version 1.10 to 1.11.

  • Exporter

    Upgraded from version 5.62 to 5.63.

  • ExtUtils::CBuilder

    Upgraded from version 0.21 to 0.2602.

  • ExtUtils::Command

    Upgraded from version 1.13 to 1.16.

  • ExtUtils::Constant

    Upgraded from version 0.20 to 0.22. (Note that neither of these versions is available on CPAN.)

  • ExtUtils::Embed

    Upgraded from version 1.27 to 1.28.

  • ExtUtils::Install

    Upgraded from version 1.44 to 1.54.

  • ExtUtils::MakeMaker

    Upgraded from version 6.42 to 6.55_02.

    Note that ExtUtils::MakeMaker::bytes and ExtUtils::MakeMaker::vmsish have been removed from this distribution.

  • ExtUtils::Manifest

    Upgraded from version 1.51_01 to 1.56.

  • ExtUtils::ParseXS

    Upgraded from version 2.18_02 to 2.2002.

  • Fatal

    Upgraded from version 1.05 to 2.06_01. See also the new pragma autodie .

  • File::Basename

    Upgraded from version 2.76 to 2.77.

  • File::Compare

    Upgraded from version 1.1005 to 1.1006.

  • File::Copy

    Upgraded from version 2.11 to 2.14.

  • File::Fetch

    Upgraded from version 0.14 to 0.20.

  • File::Find

    Upgraded from version 1.12 to 1.14.

  • File::Path

    Upgraded from version 2.04 to 2.07_03.

  • File::Spec

    Upgraded from version 3.2501 to 3.30.

  • File::stat

    Upgraded from version 1.00 to 1.01.

  • File::Temp

    Upgraded from version 0.18 to 0.22.

  • FileCache

    Upgraded from version 1.07 to 1.08.

  • FileHandle

    Upgraded from version 2.01 to 2.02.

  • Filter::Simple

    Upgraded from version 0.82 to 0.84.

  • Filter::Util::Call

    Upgraded from version 1.07 to 1.08.

  • FindBin

    Upgraded from version 1.49 to 1.50.

  • GDBM_File

    Upgraded from version 1.08 to 1.09.

  • Getopt::Long

    Upgraded from version 2.37 to 2.38.

  • Hash::Util::FieldHash

    Upgraded from version 1.03 to 1.04. This fixes a memory leak.

  • I18N::Collate

    Upgraded from version 1.00 to 1.01.

  • IO

    Upgraded from version 1.23_01 to 1.25.

    This makes non-blocking mode work on Windows in IO::Socket::INET [CPAN #43573].

  • IO::Compress::*

    Upgraded from version 2.008 to 2.020.

  • IO::Dir

    Upgraded from version 1.06 to 1.07.

  • IO::Handle

    Upgraded from version 1.27 to 1.28.

  • IO::Socket

    Upgraded from version 1.30_01 to 1.31.

  • IO::Zlib

    Upgraded from version 1.07 to 1.09.

  • IPC::Cmd

    Upgraded from version 0.40_1 to 0.46.

  • IPC::Open3

    Upgraded from version 1.02 to 1.04.

  • IPC::SysV

    Upgraded from version 1.05 to 2.01.

  • lib

    Upgraded from version 0.5565 to 0.62.

  • List::Util

    Upgraded from version 1.19 to 1.21.

  • Locale::Maketext

    Upgraded from version 1.12 to 1.13.

  • Log::Message

    Upgraded from version 0.01 to 0.02.

  • Math::BigFloat

    Upgraded from version 1.59 to 1.60.

  • Math::BigInt

    Upgraded from version 1.88 to 1.89.

  • Math::BigInt::FastCalc

    Upgraded from version 0.16 to 0.19.

  • Math::BigRat

    Upgraded from version 0.21 to 0.22.

  • Math::Complex

    Upgraded from version 1.37 to 1.56.

  • Math::Trig

    Upgraded from version 1.04 to 1.20.

  • Memoize

    Upgraded from version 1.01_02 to 1.01_03 (just a minor documentation change).

  • Module::Build

    Upgraded from version 0.2808_01 to 0.34_02.

  • Module::CoreList

    Upgraded from version 2.13 to 2.18. This release no longer contains the %Module::CoreList::patchlevel hash.

  • Module::Load

    Upgraded from version 0.12 to 0.16.

  • Module::Load::Conditional

    Upgraded from version 0.22 to 0.30.

  • Module::Loaded

    Upgraded from version 0.01 to 0.02.

  • Module::Pluggable

    Upgraded from version 3.6 to 3.9.

  • NDBM_File

    Upgraded from version 1.07 to 1.08.

  • Net::Ping

    Upgraded from version 2.33 to 2.36.

  • NEXT

    Upgraded from version 0.60_01 to 0.64.

  • Object::Accessor

    Upgraded from version 0.32 to 0.34.

  • OS2::REXX

    Upgraded from version 1.03 to 1.04.

  • Package::Constants

    Upgraded from version 0.01 to 0.02.

  • PerlIO

    Upgraded from version 1.04 to 1.06.

  • PerlIO::via

    Upgraded from version 0.04 to 0.07.

  • Pod::Man

    Upgraded from version 2.16 to 2.22.

  • Pod::Parser

    Upgraded from version 1.35 to 1.37.

  • Pod::Simple

    Upgraded from version 3.05 to 3.07.

  • Pod::Text

    Upgraded from version 3.08 to 3.13.

  • POSIX

    Upgraded from version 1.13 to 1.17.

  • Safe

    Upgraded from 2.12 to 2.18.

  • Scalar::Util

    Upgraded from version 1.19 to 1.21.

  • SelectSaver

    Upgraded from 1.01 to 1.02.

  • SelfLoader

    Upgraded from 1.11 to 1.17.

  • Socket

    Upgraded from 1.80 to 1.82.

  • Storable

    Upgraded from 2.18 to 2.20.

  • Switch

    Upgraded from version 2.13 to 2.14. Please see Deprecations.

  • Symbol

    Upgraded from version 1.06 to 1.07.

  • Sys::Syslog

    Upgraded from version 0.22 to 0.27.

  • Term::ANSIColor

    Upgraded from version 1.12 to 2.00.

  • Term::ReadLine

    Upgraded from version 1.03 to 1.04.

  • Term::UI

    Upgraded from version 0.18 to 0.20.

  • Test::Harness

    Upgraded from version 2.64 to 3.17.

    Note that one side-effect of the 2.x to 3.x upgrade is that the experimental Test::Harness::Straps module (and its supporting Assert , Iterator , Point and Results modules) have been removed. If you still need this, then they are available in the (unmaintained) Test-Harness-Straps distribution on CPAN.

  • Test::Simple

    Upgraded from version 0.72 to 0.92.

  • Text::ParseWords

    Upgraded from version 3.26 to 3.27.

  • Text::Tabs

    Upgraded from version 2007.1117 to 2009.0305.

  • Text::Wrap

    Upgraded from version 2006.1117 to 2009.0305.

  • Thread::Queue

    Upgraded from version 2.00 to 2.11.

  • Thread::Semaphore

    Upgraded from version 2.01 to 2.09.

  • threads

    Upgraded from version 1.67 to 1.72.

  • threads::shared

    Upgraded from version 1.14 to 1.29.

  • Tie::RefHash

    Upgraded from version 1.37 to 1.38.

  • Tie::StdHandle

    This has documentation changes, and has been assigned a version for the first time: version 4.2.

  • Time::HiRes

    Upgraded from version 1.9711 to 1.9719.

  • Time::Local

    Upgraded from version 1.18 to 1.1901.

  • Time::Piece

    Upgraded from version 1.12 to 1.15.

  • Unicode::Normalize

    Upgraded from version 1.02 to 1.03.

  • Unicode::UCD

    Upgraded from version 0.25 to 0.27.

    charinfo() now works on Unified CJK code points added to later versions of Unicode.

    casefold() has new fields returned to provide both a simpler interface and previously missing information. The old fields are retained for backwards compatibility. Information about Turkic-specific code points is now returned.

    The documentation has been corrected and expanded.

  • UNIVERSAL

    Upgraded from version 1.04 to 1.05.

  • Win32

    Upgraded from version 0.34 to 0.39.

  • Win32API::File

    Upgraded from version 0.1001_01 to 0.1101.

  • XSLoader

    Upgraded from version 0.08 to 0.10.
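For the Unicode::UCD entry above, the fields returned by casefold() can be inspected directly. The sketch below simply dumps whatever the installed version returns for one code point; which fields appear depends on the Unicode::UCD version in use, so no particular set is assumed here.

```perl
use strict;
use warnings;
use Unicode::UCD qw(casefold);

# U+00DF LATIN SMALL LETTER SHARP S has a multi-character full case folding.
my $fold = casefold(0x00DF);

# Print every field the installed Unicode::UCD provides for this code point.
for my $field (sort keys %$fold) {
    my $value = defined $fold->{$field} ? $fold->{$field} : '(undef)';
    print "$field => $value\n";
}
```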

Utility Changes

  • h2ph

    Now looks in include-fixed too, which is a recent addition to gcc's search path.

  • h2xs

    No longer incorrectly treats enum values like macros (Daniel Burr).

    Now handles C++ style comments (//) properly in enums. (A patch from Rainer Weikusat was used; Daniel Burr also proposed a similar fix.)

  • perl5db.pl

    LVALUE subroutines now work under the debugger.

    The debugger now correctly handles proxy constant subroutines, and subroutine stubs.

  • perlthanks

    Perl 5.10.1 adds a new utility perlthanks, which is a variant of perlbug, but for sending non-bug-reports to the authors and maintainers of Perl. Getting nothing but bug reports can become a bit demoralising: we'll see if this changes things.

New Documentation

  • perlhaiku

    This contains instructions on how to build perl for the Haiku platform.

  • perlmroapi

    This describes the new interface for pluggable Method Resolution Orders.

  • perlperf

    This document, by Richard Foley, introduces performance and optimization techniques, with particular reference to perl programs.

  • perlrepository

    This describes how to access the perl source using the git version control system.

  • perlthanks

    This describes the new perlthanks utility.

Changes to Existing Documentation

The various large Changes* files (which listed every change made to perl over the last 18 years) have been removed, and replaced by a small file, also called Changes , which just explains how that same information may be extracted from the git version control system.

The file Porting/patching.pod has been deleted, as it mainly described interacting with the old Perforce-based repository, which is now obsolete. Information still relevant has been moved to perlrepository.

perlapi, perlintern, perlmodlib and perltoc are now all generated at build time, rather than being shipped as part of the release.

Performance Enhancements

  • A new internal cache means that isa() will often be faster.

  • Under use locale , the locale-relevant information is now cached on read-only values, such as the list returned by keys %hash . This makes operations such as sort keys %hash in the scope of use locale much faster.

  • Empty DESTROY methods are no longer called.

Installation and Configuration Improvements

ext/ reorganisation

The layout of directories in ext has been revised. Specifically, all extensions are now flat, and at the top level, with / in pathnames replaced by -, so that ext/Data/Dumper/ is now ext/Data-Dumper/, etc. The names of the extensions as specified to Configure, and as reported by %Config::Config under the keys dynamic_ext, known_extensions, nonxs_ext and static_ext have not changed, and still use /. Hence this change will not have any effect once perl is installed. However, Attribute::Handlers, Safe and mro have now become extensions in their own right, so if you run Configure with options to specify an exact list of extensions to build, you will need to change it to account for this.

For 5.10.2, it is planned that many dual-life modules will have been moved from lib to ext; again this will have no effect on an installed perl, but will matter if you invoke Configure with a pre-canned list of extensions to build.

Configuration improvements

If vendorlib and vendorarch are the same, then they are only added to @INC once.

$Config{usedevel} and the C-level PERL_USE_DEVEL are now defined if perl is built with -Dusedevel .

Configure will enable use of -fstack-protector , to provide protection against stack-smashing attacks, if the compiler supports it.

Configure will now determine the correct prototypes for re-entrant functions, and for gconvert , if you are using a C++ compiler rather than a C compiler.

On Unix, if you build from a tree containing a git repository, the configuration process will note the commit hash you have checked out, for display in the output of perl -v and perl -V . Unpushed local commits are automatically added to the list of local patches displayed by perl -V .

Compilation improvements

As part of the flattening of ext, all extensions on all platforms are built by make_ext.pl. This replaces the Unix-specific ext/util/make_ext, VMS-specific make_ext.com and Win32-specific win32/buildext.pl.

Platform Specific Changes

  • AIX

    Removed libbsd for AIX 5L and 6.1. Only flock() was used from libbsd.

    Removed libgdbm for AIX 5L and 6.1. The libgdbm is delivered as an optional package with the AIX Toolbox. Unfortunately the 64 bit version is broken.

    Hints changes mean that AIX 4.2 should work again.

  • Cygwin

    On Cygwin we now strip the last number from the DLL. This has been the behaviour in the cygwin.com build for years. The hints files have been updated.

  • FreeBSD

    The hints files now identify the correct threading libraries on FreeBSD 7 and later.

  • Irix

    We now work around a bizarre preprocessor bug in the Irix 6.5 compiler: cc -E - unfortunately goes into K&R mode, but cc -E file.c doesn't.

  • Haiku

    Patches from the Haiku maintainers have been merged in. Perl should now build on Haiku.

  • MirOS BSD

    Perl should now build on MirOS BSD.

  • NetBSD

    Hints now support versions 5.*.

  • Stratus VOS

    Various changes from Stratus have been merged in.

  • Symbian

    There is now support for Symbian S60 3.2 SDK and S60 5.0 SDK.

  • Win32

    Improved message window handling means that alarm and kill messages will no longer be dropped under race conditions.

  • VMS

    Reads from the in-memory temporary files of PerlIO::scalar used to fail if $/ was set to a numeric reference (to indicate record-style reads). This is now fixed.

    VMS now supports getgrgid.

    Many improvements and cleanups have been made to the VMS file name handling and conversion code.

    Enabling the PERL_VMS_POSIX_EXIT logical name now encodes a POSIX exit status in a VMS condition value for better interaction with GNV's bash shell and other utilities that depend on POSIX exit values. See $? in perlvms for details.

Selected Bug Fixes

  • 5.10.0 inadvertently disabled an optimisation, which caused a measurable performance drop in list assignment, such as is often used to assign function parameters from @_ . The optimisation has been re-instated, and the performance regression fixed.

  • Fixed memory leak on while (1) { map 1, 1 } [RT #53038].

  • Some potential coredumps in PerlIO fixed [RT #57322,54828].

  • The debugger now works with lvalue subroutines.

  • The debugger's m command was broken on modules that defined constants [RT #61222].

  • crypt() and string complement could return tainted values for untainted arguments [RT #59998].

  • The -i.suffix command-line switch now recreates the file using restricted permissions, before changing its mode to match the original file. This eliminates a potential race condition [RT #60904].

  • On some Unix systems, the value in $? would not have the top bit set ($? & 128 ) even if the child core dumped.

  • Under some circumstances, $^R could incorrectly become undefined [RT #57042].

  • (XS) In various hash functions, passing a pre-computed hash when the key is UTF-8 might result in an incorrect lookup.

  • (XS) Including XSUB.h before perl.h gave a compile-time error [RT #57176].

  • $object->isa('Foo') would report false if the package Foo didn't exist, even if the object's @ISA contained Foo .

  • Various bugs in the mro code (new in 5.10.0), triggered by manipulating @ISA, have been found and fixed.

  • Bitwise operations on references could crash the interpreter, e.g. $x=\$y; $x |= "foo" [RT #54956].

  • Patterns including alternation might be sensitive to the internal UTF-8 representation, e.g.

        my $byte = chr(192);
        my $utf8 = chr(192); utf8::upgrade($utf8);
        $utf8 =~ /$byte|X/i;    # failed in 5.10.0

  • Within UTF8-encoded Perl source files (i.e. where use utf8 is in effect), double-quoted literal strings could be corrupted where a \xNN , \0NNN or \N{} is followed by a literal character with ordinal value greater than 255 [RT #59908].

  • B::Deparse failed to correctly deparse various constructs: readpipe STRING [RT #62428], CORE::require(STRING) [RT #62488], sub foo(_) [RT #62484].

  • Using setpgrp() with no arguments could corrupt the perl stack.

  • The block form of eval is now specifically trappable by Safe and ops . Previously it was erroneously treated like string eval.

  • In 5.10.0, the two characters [~ were sometimes parsed as the smart match operator (~~ ) [RT #63854].

  • In 5.10.0, the * quantifier in patterns was sometimes treated as {0,32767} [RT #60034, #60464]. For example, this match would fail:

        ("ab" x 32768) =~ /^(ab)*$/

  • shmget was limited to a 32 bit segment size on a 64 bit OS [RT #63924].

  • Using next or last to exit a given block no longer produces a spurious warning like the following:

        Exiting given via last at foo.pl line 123

  • On Windows, '.\foo' and '..\foo' were treated differently than './foo' and '../foo' by do and require [RT #63492].

  • Assigning a format to a glob could corrupt the format; e.g.:

        *bar=*foo{FORMAT};    # foo format now bad

  • Attempting to coerce a typeglob to a string or number could cause an assertion failure. The correct error message is now generated, Can't coerce GLOB to $type.

  • Under use filetest 'access' , -x was using the wrong access mode. This has been fixed [RT #49003].

  • length on a tied scalar that returned a Unicode value would not be correct the first time. This has been fixed.

  • Using an array tie inside an array tie could SEGV. This has been fixed. [RT #51636]

  • A race condition inside PerlIOStdio_close() has been identified and fixed. This used to cause various threading issues, including SEGVs.

  • In unpack, the use of () groups in scalar context was internally placing a list on the interpreter's stack, which manifested in various ways, including SEGVs. This is now fixed [RT #50256].

  • Magic was called twice in substr, \&$x , tie $x, $m and chop. These have all been fixed.

  • A 5.10.0 optimisation to clear the temporary stack within the implicit loop of s///ge has been reverted, as it turned out to be the cause of obscure bugs in seemingly unrelated parts of the interpreter [commit ef0d4e17921ee3de].

  • The line numbers for warnings inside elsif are now correct.

  • The .. operator now works correctly with ranges whose ends are at or close to the values of the smallest and largest integers.

  • binmode STDIN, ':raw' could lead to segmentation faults on some platforms. This has been fixed [RT #54828].

  • An off-by-one error meant that index $str, ... was effectively being executed as index "$str\0", ... . This has been fixed [RT #53746].

  • Various leaks associated with named captures in regexes have been fixed [RT #57024].

  • A weak reference to a hash would leak. This was affecting DBI [RT #56908].

  • Using (?|) in a regex could cause a segfault [RT #59734].

  • Use of a UTF-8 tr// within a closure could cause a segfault [RT #61520].

  • Calling sv_chop() or otherwise upgrading an SV could result in an unaligned 64-bit access on the SPARC architecture [RT #60574].

  • In the 5.10.0 release, inc_version_list would incorrectly list 5.10.* after 5.8.* ; this affected the @INC search order [RT #67628].

  • In 5.10.0, pack "a*", $tainted_value returned a non-tainted value [RT #52552].

  • In 5.10.0, printf and sprintf could produce the fatal error panic: utf8_mg_pos_cache_update when printing UTF-8 strings [RT #62666].

  • In the 5.10.0 release, a dynamically created AUTOLOAD method might be missed (method cache issue) [RT #60220,60232].

  • In the 5.10.0 release, a combination of use feature and //ee could cause a memory leak [RT #63110].

  • -C on the shebang (#! ) line is once more permitted if it is also specified on the command line. -C on the shebang line used to be a silent no-op if it was not also on the command line, so perl 5.10.0 disallowed it, which broke some scripts. Now perl checks whether it is also on the command line and only dies if it is not [RT #67880].

  • In 5.10.0, certain types of re-entrant regular expression could crash, or cause the following assertion failure [RT #60508]:

    Assertion rx->sublen >= (s - rx->subbeg) + i failed

New or Changed Diagnostics

  • panic: sv_chop %s

    This new fatal error occurs when the C routine Perl_sv_chop() was passed a position that is not within the scalar's string buffer. This could be caused by buggy XS code, and at this point recovery is not possible.

  • Can't locate package %s for the parents of %s

    This warning has been removed. In general, it only got produced in conjunction with other warnings, and removing it allowed an ISA lookup optimisation to be added.

  • v-string in use/require is non-portable

    This warning has been removed.

  • Deep recursion on subroutine "%s"

    It is now possible to change the depth threshold for this warning from the default of 100, by recompiling the perl binary, setting the C pre-processor macro PERL_SUB_DEPTH_WARN to the desired value.

Changed Internals

  • The J.R.R. Tolkien quotes at the head of C source files have been checked and proper citations added, thanks to a patch from Tom Christiansen.

  • vcroak() now accepts a null first argument. In addition, a full audit was made of the "not NULL" compiler annotations, and those for several other internal functions were corrected.

  • New macros dSAVEDERRNO , dSAVE_ERRNO , SAVE_ERRNO , RESTORE_ERRNO have been added to formalise the temporary saving of the errno variable.

  • The function Perl_sv_insert_flags has been added to augment Perl_sv_insert .

  • The function Perl_newSV_type(type) has been added, equivalent to Perl_newSV() followed by Perl_sv_upgrade(type) .

  • The function Perl_newSVpvn_flags() has been added, equivalent to Perl_newSVpvn() and then performing the action relevant to the flag.

    Two flag bits are currently supported.

    • SVf_UTF8

      This will call SvUTF8_on() for you. (Note that this does not convert a sequence of ISO 8859-1 characters to UTF-8.) A wrapper, newSVpvn_utf8(), is available for this.

    • SVs_TEMP

      Call sv_2mortal() on the new SV.

    There is also a wrapper that takes constant strings, newSVpvs_flags() .

  • The function Perl_croak_xs_usage has been added as a wrapper to Perl_croak .

  • The functions PerlIO_find_layer and PerlIO_list_alloc are now exported.

  • PL_na has been exterminated from the core code, replaced by local STRLEN temporaries, or *_nolen() calls. Either approach is faster than PL_na, which is a pointer dereference into the interpreter structure under ithreads, and a global variable otherwise.

  • Perl_mg_free() used to leave freed memory accessible via SvMAGIC() on the scalar. It now updates the linked list to remove each piece of magic as it is freed.

  • Under ithreads, the regex in PL_reg_curpm is now reference counted. This eliminates a lot of hackish workarounds to cope with it not being reference counted.

  • Perl_mg_magical() would sometimes incorrectly turn on SvRMAGICAL() . This has been fixed.

  • The public IV and NV flags are now not set if the string value has trailing "garbage". This behaviour is consistent with not setting the public IV or NV flags if the value is out of range for the type.

  • SV allocation tracing has been added to the diagnostics enabled by -Dm . The tracing can alternatively output via the PERL_MEM_LOG mechanism, if that was enabled when the perl binary was compiled.

  • Uses of Nullav , Nullcv , Nullhv , Nullop , Nullsv etc have been replaced by NULL in the core code, and non-dual-life modules, as NULL is clearer to those unfamiliar with the core code.

  • A macro MUTABLE_PTR(p) has been added, which on (non-pedantic) gcc will not cast away const, returning a void *. Macros such as MUTABLE_AV(p) and MUTABLE_CV(p) build on this, casting to AV *, CV * etc. without casting away const. This allows proper compile-time auditing of const correctness in the core, and helped pick up some errors (now fixed).

  • Macros mPUSHs() and mXPUSHs() have been added, for pushing SVs on the stack and mortalizing them.

  • Use of the private structure mro_meta has changed slightly. Nothing outside the core should be accessing this directly anyway.

  • A new tool, Porting/expand-macro.pl has been added, that allows you to view how a C preprocessor macro would be expanded when compiled. This is handy when trying to decode the macro hell that is the perl guts.

New Tests

Many modules updated from CPAN incorporate new tests.

Several tests that have the potential to hang forever if they fail now incorporate a "watchdog" functionality that will kill them after a timeout, which helps ensure that make test and make test_harness run to completion automatically. (Jerry Hedden).

Some core-specific tests have been added:

  • t/comp/retainedlines.t

    Check that the debugger can retain source lines from eval.

  • t/io/perlio_fail.t

    Check that bad layers fail.

  • t/io/perlio_leaks.t

    Check that PerlIO layers are not leaking.

  • t/io/perlio_open.t

    Check that certain special forms of open work.

  • t/io/perlio.t

    General PerlIO tests.

  • t/io/pvbm.t

    Check that there is no unexpected interaction between the internal types PVBM and PVGV .

  • t/mro/package_aliases.t

    Check that mro works properly in the presence of aliased packages.

  • t/op/dbm.t

    Tests for dbmopen and dbmclose.

  • t/op/index_thr.t

    Tests for the interaction of index and threads.

  • t/op/pat_thr.t

    Tests for the interaction of esoteric patterns and threads.

  • t/op/qr_gc.t

    Test that qr doesn't leak.

  • t/op/reg_email_thr.t

    Tests for the interaction of regex recursion and threads.

  • t/op/regexp_qr_embed_thr.t

    Tests for the interaction of patterns with embedded qr// and threads.

  • t/op/regexp_unicode_prop.t

    Tests for Unicode properties in regular expressions.

  • t/op/regexp_unicode_prop_thr.t

    Tests for the interaction of Unicode properties and threads.

  • t/op/reg_nc_tie.t

    Test the tied methods of Tie::Hash::NamedCapture .

  • t/op/reg_posixcc.t

    Check that POSIX character classes behave consistently.

  • t/op/re.t

    Check that exportable re functions in universal.c work.

  • t/op/setpgrpstack.t

    Check that setpgrp works.

  • t/op/substr_thr.t

    Tests for the interaction of substr and threads.

  • t/op/upgrade.t

    Check that upgrading and assigning scalars works.

  • t/uni/lex_utf8.t

    Check that Unicode in the lexer works.

  • t/uni/tie.t

    Check that Unicode and tie work.

Known Problems

This is a list of some significant unfixed bugs, which are regressions from either 5.10.0 or 5.8.x.

  • List::Util::first misbehaves in the presence of a lexical $_ (typically introduced by my $_ or implicitly by given ). The variable which gets set for each iteration is the package variable $_ , not the lexical $_ [RT #67694].

    A similar issue may occur in other modules that provide functions which take a block as their first argument, like

        foo { ... $_ ...} list

  • The charnames pragma may generate a run-time error when a regex is interpolated [RT #56444]:

        use charnames ':full';
        my $r1 = qr/\N{THAI CHARACTER SARA I}/;
        "foo" =~ $r1;    # okay
        "foo" =~ /$r1+/; # runtime error

    A workaround is to generate the character outside of the regex:

        my $a = "\N{THAI CHARACTER SARA I}";
        my $r1 = qr/$a/;

  • Some regexes may run much more slowly when run in a child thread compared with the thread the pattern was compiled into [RT #55600].

Deprecations

The following items are now deprecated.

  • Switch is buggy and should be avoided. From perl 5.11.0 onwards, it is intended that any use of the core version of this module will emit a warning, and that the module will eventually be removed from the core (probably in perl 5.14.0). See Switch statements in perlsyn for its replacement.

  • suidperl will be removed in 5.12.0. This provides a mechanism to emulate setuid permission bits on systems that don't support it properly.

Acknowledgements

Some of the work in this release was funded by a TPF grant.

Nicholas Clark officially retired from maintenance pumpking duty at the end of 2008; however in reality he has put much effort in since then to help get 5.10.1 into a fit state to be released, including writing a considerable chunk of this perldelta.

Steffen Mueller and David Golden in particular helped get CPAN modules polished and synchronised with their in-core equivalents.

Craig Berry was tireless in getting maint to run under VMS, no matter how many times we broke it for him.

The other core committers contributed most of the changes, and applied most of the patches sent in by the hundreds of contributors listed in AUTHORS.

(Sorry to all the people I haven't mentioned by name).

Finally, thanks to Larry Wall, without whom none of this would be necessary.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5120delta

NAME

perl5120delta - what is new for perl v5.12.0

DESCRIPTION

This document describes differences between the 5.10.0 release and the 5.12.0 release.

Many of the bug fixes in 5.12.0 are already included in the 5.10.1 maintenance release.

You can see the list of those changes in the 5.10.1 release notes (perl5101delta).

Core Enhancements

New package NAME VERSION syntax

This new syntax allows a module author to set the $VERSION of a namespace when the namespace is declared with 'package'. It eliminates the need for our $VERSION = ... and similar constructs. E.g.

    package Foo::Bar 1.23;
    # $Foo::Bar::VERSION == 1.23

There are several advantages to this:

  • $VERSION is parsed in exactly the same way as use NAME VERSION

  • $VERSION is set at compile time

  • $VERSION is a version object that provides proper overloading of comparison operators so comparing $VERSION to decimal (1.23) or dotted-decimal (v1.2.3) version numbers works correctly.

  • Eliminates $VERSION = ... and eval $VERSION clutter

  • As it requires VERSION to be a numeric literal or v-string literal, it can be statically parsed by toolchain modules without resorting to eval, which MM->parse_version has to use for $VERSION = ...

It does not break old code that uses only package NAME, but code that uses package NAME VERSION will need to be restricted to perl 5.12.0 or newer. This is analogous to the change to open from two-args to three-args. Users requiring the latest Perl will benefit, and perhaps after several years, it will become standard practice.

However, package NAME VERSION requires a new, 'strict' version number format. See Version number formats for details.
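
As a minimal sketch of the syntax described above (the package name Local::Demo and its sub are made up for illustration), the following declares a version in the package statement and then checks it through UNIVERSAL::VERSION:

```perl
use strict;
use warnings;

# Local::Demo is a hypothetical package used only for illustration.
package Local::Demo 1.23;

sub hello { return 'hello' }

package main;

# The version given in the package statement ends up in $VERSION.
my $declared = "$Local::Demo::VERSION";

# UNIVERSAL::VERSION can enforce a minimum version as usual:
# it returns normally if the requirement is met and dies otherwise.
my $satisfies_1_20 = eval { Local::Demo->VERSION(1.20); 1 } ? 1 : 0;
my $satisfies_2_00 = eval { Local::Demo->VERSION(2.00); 1 } ? 1 : 0;

print "declared=$declared 1.20=$satisfies_1_20 2.00=$satisfies_2_00\n";
```

No our $VERSION = ... line is needed anywhere in the package.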

The ... operator

A new operator, ... , nicknamed the Yada Yada operator, has been added. It is intended to mark placeholder code that is not yet implemented. See Yada Yada Operator in perlop.
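
A small sketch of how the placeholder behaves (parse_config is a hypothetical stub name): the code compiles, and an exception is only raised if the unfinished sub is actually called.

```perl
use strict;
use warnings;

# parse_config() is a hypothetical stub that has not been written yet.
sub parse_config { ... }

# Compilation succeeds; calling the stub throws an exception.
my $ok    = eval { parse_config(); 1 };
my $error = $@;

print "parse_config() died: $error" unless $ok;
```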

Implicit strictures

Using the use VERSION syntax with a version number greater than or equal to 5.11.0 will lexically enable strictures just like use strict would do (in addition to enabling features). The following:

    use 5.12.0;

means:

    use strict;
    use feature ':5.12';
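
One way to observe the implicit strictures, sketched here with string evals so that the compile-time failure can be caught, is to compile the same kind of statement with and without the version declaration. The variable names are arbitrary, and the enclosing file deliberately has no use strict of its own.

```perl
# Deliberately no "use strict" at file level, so that only the
# string evals below decide whether strictures are in effect.

# Without a version declaration, an undeclared global is accepted.
my $lax_ok = (eval q{ $undeclared_a = 1; 1 }) ? 1 : 0;

# With "use 5.12.0", strictures are switched on and compilation fails.
my $strict_ok = (eval q{ use 5.12.0; $undeclared_b = 1; 1 }) ? 1 : 0;
my $error     = $@;

print "without use VERSION: ", ($lax_ok    ? "compiled" : "failed"), "\n";
print "with use 5.12.0:     ", ($strict_ok ? "compiled" : "failed"), "\n";
print "error: $error" if $error;
```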

Unicode improvements

Perl 5.12 comes with Unicode 5.2, the latest version available to us at the time of release. This version of Unicode was released in October 2009. See http://www.unicode.org/versions/Unicode5.2.0 for further details about what's changed in this version of the standard. See perlunicode for instructions on installing and using other versions of Unicode.

Additionally, Perl's developers have significantly improved Perl's Unicode implementation. For full details, see Unicode overhaul below.

Y2038 compliance

Perl's core time-related functions are now Y2038 compliant. (It may not mean much to you, but your kids will love it!)

qr overloading

It is now possible to overload the qr// operator, that is, conversion to regexp, like it was already possible to overload conversion to boolean, string or number of objects. It is invoked when an object appears on the right hand side of the =~ operator or when it is interpolated into a regexp. See overload.
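
The sketch below shows one possible use of the new overload key. Local::Literal is a hypothetical class, not part of any distribution; its qr handler returns a pattern that matches the wrapped string literally and case-insensitively.

```perl
use strict;
use warnings;

# Local::Literal is a hypothetical class used only for illustration.
package Local::Literal;

use overload 'qr' => sub {
    my ($self) = @_;
    my $text = $self->{text};
    return qr/\Q$text\E/i;    # the handler must return a compiled regexp
};

sub new {
    my ($class, $text) = @_;
    return bless { text => $text }, $class;
}

package main;

my $literal = Local::Literal->new('a.b');

# Object on the right hand side of =~
my $rhs_match    = ('xA.By' =~ $literal) ? 1 : 0;
# The dot was quoted, so it does not match an arbitrary character
my $dot_is_plain = ('xaXby' =~ $literal) ? 1 : 0;
# Object interpolated into a larger pattern
my $interpolated = ('A.B' =~ /\A$literal\z/) ? 1 : 0;

print "rhs=$rhs_match dot=$dot_is_plain interpolated=$interpolated\n";
```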

Pluggable keywords

Extension modules can now cleanly hook into the Perl parser to define new kinds of keyword-headed expression and compound statement. The syntax following the keyword is defined entirely by the extension. This allows a completely non-Perl sublanguage to be parsed inline, with the correct ops cleanly generated.

See PL_keyword_plugin in perlapi for the mechanism. The Perl core source distribution also includes a new module XS::APItest::KeywordRPN, which implements reverse Polish notation arithmetic via pluggable keywords. This module is mainly used for test purposes, and is not normally installed, but also serves as an example of how to use the new mechanism.

Perl's developers consider this feature to be experimental. We may remove it or change it in a backwards-incompatible way in Perl 5.14.

APIs for more internals

The lowest layers of the lexer and parts of the pad system now have C APIs available to XS extensions. These are necessary to support proper use of pluggable keywords, but have other uses too. The new APIs are experimental, and only cover a small proportion of what would be necessary to take full advantage of the core's facilities in these areas. It is intended that the Perl 5.13 development cycle will see the addition of a full range of clean, supported interfaces.

Perl's developers consider this feature to be experimental. We may remove it or change it in a backwards-incompatible way in Perl 5.14.

Overridable function lookup

Where an extension module hooks the creation of rv2cv ops to modify the subroutine lookup process, this now works correctly for bareword subroutine calls. This means that prototypes on subroutines referenced this way will be processed correctly. (Previously bareword subroutine names were initially looked up, for parsing purposes, by an unhookable mechanism, so extensions could only properly influence subroutine names that appeared with an & sigil.)

A proper interface for pluggable Method Resolution Orders

As of Perl 5.12.0 there is a new interface for plugging and using method resolution orders other than the default linear depth first search. The C3 method resolution order added in 5.10.0 has been re-implemented as a plugin, without changing its Perl-space interface. See perlmroapi for more information.
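
The Perl-space interface mentioned above is unchanged, so an existing call such as mro::get_linear_isa should behave as before. A minimal sketch with a hypothetical diamond hierarchy (Local::A to Local::D) compares the default order with C3:

```perl
use strict;
use warnings;

# A hypothetical diamond: D inherits from B and C, which both inherit from A.
package Local::A; our @ISA = ();
package Local::B; our @ISA = ('Local::A');
package Local::C; our @ISA = ('Local::A');
package Local::D; our @ISA = ('Local::B', 'Local::C');

package main;

use mro;

# Default depth-first search visits Local::A before Local::C.
my @dfs = @{ mro::get_linear_isa('Local::D', 'dfs') };

# C3 keeps Local::A after both of its subclasses.
my @c3 = @{ mro::get_linear_isa('Local::D', 'c3') };

print "dfs: @dfs\n";
print "c3:  @c3\n";
```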

\N experimental regex escape

Perl now supports \N, a new regex escape which you can think of as the inverse of \n. It will match any character that is not a newline, independently of the presence or absence of the single line match modifier /s. It is not usable within a character class. \N{3} means to match 3 non-newlines; \N{5,} means to match at least 5. \N{NAME} still means the character or sequence named NAME, but NAME can no longer be something like 3 or 5, (with a trailing comma).

This will break a custom charnames translator which allows numbers for character names, as \N{3} will now mean to match 3 non-newline characters, and not the character whose name is 3 . (No name defined by the Unicode standard is a number, so only custom translators might be affected.)

Perl's developers are somewhat concerned about possible user confusion with the existing \N{...} construct which matches characters by their Unicode name. Consequently, this feature is experimental. We may remove it or change it in a backwards-incompatible way in Perl 5.14.
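
A short sketch of the difference between . and \N under /s, and of the quantified form, using an arbitrary two-line string:

```perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Under /s the dot also matches the newline, so this grabs everything.
my ($with_dot) = $text =~ /\A(.+)/s;

# \N never matches a newline, with or without /s.
my ($with_N) = $text =~ /\A(\N+)/s;

# \N{3} is \N with a quantifier: exactly three non-newline characters.
my ($three) = $text =~ /\A(\N{3})/;

print "dot matched ", length($with_dot), " characters\n";
print "\\N matched '$with_N'\n";
print "\\N{3} matched '$three'\n";
```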

DTrace support

Perl now has some support for DTrace. See "DTrace support" in INSTALL.

Support for configure_requires in CPAN module metadata

Both CPAN and CPANPLUS now support the configure_requires keyword in the META.yml metadata file included in most recent CPAN distributions. This allows distribution authors to specify configuration prerequisites that must be installed before running Makefile.PL or Build.PL.

See the documentation for ExtUtils::MakeMaker or Module::Build for more on how to specify configure_requires when creating a distribution for CPAN.

each, keys, values are now more flexible

The each, keys, and values functions can now operate on arrays. For an array, the keys are its indices.
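
For illustration, applied to a small made-up array:

```perl
use strict;
use warnings;

my @colours = qw(red green blue);

my @indices  = keys @colours;      # the array indices
my @elements = values @colours;    # the array elements

# each returns an (index, element) pair on every call.
my @pairs;
while (my ($index, $colour) = each @colours) {
    push @pairs, "$index=$colour";
}

print "indices:  @indices\n";
print "elements: @elements\n";
print "pairs:    @pairs\n";
```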

when as a statement modifier

when is now allowed to be used as a statement modifier.

$, flexibility

The variable $, may now be tied.

// in when clauses

// now behaves like || in when clauses.

Enabling warnings from your shell environment

You can now set -W from the PERL5OPT environment variable.

delete local

delete local now allows you to locally delete a hash entry.
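
A minimal sketch, assuming a package hash named %config that exists only for this example: the entry disappears for the duration of the enclosing block and is restored when the block exits.

```perl
use strict;
use warnings;

# %config and debug_state() are hypothetical names for illustration.
our %config = (debug => 1, colour => 'red');

sub debug_state {
    return exists $config{debug} ? 'present' : 'absent';
}

my $inside;
{
    # The entry is deleted only until the end of this block.
    delete local $config{debug};
    $inside = debug_state();
}
my $outside = debug_state();

print "inside the block:  $inside\n";
print "outside the block: $outside\n";
```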

New support for Abstract namespace sockets

Abstract namespace sockets are a Linux-specific socket type that lives in the AF_UNIX family, slightly abusing it to be able to use arbitrary character arrays as addresses: they start with a nul byte and are not terminated by a nul byte. Instead, their extent is given by the address length passed along with the address to the socket system calls (for example bind() and connect()).

32-bit limit on substr arguments removed

The 32-bit limit on substr arguments has now been removed. The full range of the system's signed and unsigned integers is now available for the pos and len arguments.

Potentially Incompatible Changes

Deprecations warn by default

Over the years, Perl's developers have deprecated a number of language features for a variety of reasons. Perl now defaults to issuing a warning if a deprecated language feature is used. Many of the deprecations Perl now warns you about have been deprecated for many years. You can find a list of what was deprecated in a given release of Perl in the perl5xxdelta.pod file for that release.

To disable this feature in a given lexical scope, you should use no warnings 'deprecated'. For information about which language features are deprecated and explanations of various deprecation warnings, please see perldiag. See Deprecations below for the list of features and modules Perl's developers have deprecated as part of this release.
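
The lexical scoping of the 'deprecated' category can be sketched without relying on any particular deprecated core feature. In the example below, Local::Legacy and old_api() are hypothetical, and the warning is raised with warnings::warnif, which honours the caller's lexical warning settings; this is an illustration of the scoping, not of how the core emits its own deprecation warnings.

```perl
use strict;
use warnings;

# Local::Legacy and old_api() are hypothetical names for illustration.
package Local::Legacy;

sub old_api {
    # Reported only if the caller has the 'deprecated' category enabled.
    warnings::warnif('deprecated', 'old_api() is deprecated');
    return 42;
}

package main;

# Collect warnings instead of printing them.
my @caught;
local $SIG{__WARN__} = sub { push @caught, @_ };

Local::Legacy::old_api();        # category enabled here: warns

{
    no warnings 'deprecated';
    Local::Legacy::old_api();    # silent within this lexical scope
}

print scalar(@caught), " warning(s) caught\n";
```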

Version number formats

Acceptable version number formats have been formalized into "strict" and "lax" rules. package NAME VERSION takes a strict version number. UNIVERSAL::VERSION and the version object constructors take lax version numbers. Providing an invalid version will result in a fatal error. The version argument in use NAME VERSION is first parsed as a numeric literal or v-string and then passed to UNIVERSAL::VERSION (and must then pass the "lax" format test).

These formats are documented fully in the version module. To a first approximation, a "strict" version number is a positive decimal number (integer or decimal-fraction) without exponentiation or else a dotted-decimal v-string with a leading 'v' character and at least three components. A "lax" version number allows v-strings with fewer than three components or without a leading 'v'. Under "lax" rules, both decimal and dotted-decimal versions may have a trailing "alpha" component separated by an underscore character after a fractional or dotted-decimal component.

The version module adds version::is_strict and version::is_lax functions to check a scalar against these rules.
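
As a rough illustration of the two rule sets, the sketch below runs a handful of arbitrary candidate strings through both functions. The minimum version 0.82 requested for the version module is an assumption about when these functions became available.

```perl
use strict;
use warnings;
use version 0.82;    # assumed minimum that provides is_strict / is_lax

my @candidates = ('1.23', 'v1.2.3', 'v1.2', '1.2.3', '1.23_01');

my %report;
for my $candidate (@candidates) {
    $report{$candidate} = {
        strict => version::is_strict($candidate) ? 1 : 0,
        lax    => version::is_lax($candidate)    ? 1 : 0,
    };
    printf "%-8s strict=%d lax=%d\n",
        $candidate, @{ $report{$candidate} }{qw(strict lax)};
}
```

Strings such as v1.2 (fewer than three components) or 1.23_01 (alpha component) are the kind of input that the lax rules accept and the strict rules reject.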

@INC reorganization

In @INC, ARCHLIB and PRIVLIB now occur after the current version's site_perl and vendor_perl. Modules installed into site_perl and vendor_perl will now be loaded in preference to those installed in ARCHLIB and PRIVLIB.

REGEXPs are now first class

Internally, Perl now treats compiled regular expressions (such as those created with qr//) as first class entities. Perl modules which serialize, deserialize or otherwise have deep interaction with Perl's internal data structures need to be updated for this change. Most affected CPAN modules have already been updated as of this writing.

Switch statement changes

The given/when switch statement handles complex statements better than Perl 5.10.0 did. (These enhancements are also available in 5.10.1 and subsequent 5.10 releases.) There are two new cases where when now interprets its argument as a boolean, instead of an expression to be used in a smart match:

  • flip-flop operators

    The .. and ... flip-flop operators are now evaluated in boolean context, following their usual semantics; see Range Operators in perlop.

    Note that, as in perl 5.10.0, when (1..10) will not work to test whether a given value is an integer between 1 and 10; you should use when ([1..10]) instead (note the array reference).

    However, contrary to 5.10.0, evaluating the flip-flop operators in boolean context ensures it can now be useful in a when() , notably for implementing bistable conditions, like in:

        when (/^=begin/ .. /^=end/) {
            # do something
        }

  • defined-or operator

    A compound expression involving the defined-or operator, as in when (expr1 // expr2), will be treated as boolean if the first expression is boolean. (This just extends the existing rule that applies to the regular or operator, as in when (expr1 || expr2) .)

Smart match changes

Since Perl 5.10.0, Perl's developers have made a number of changes to the smart match operator. These, of course, also alter the behaviour of the switch statements where smart matching is implicitly used. These changes were also made for the 5.10.1 release, and will remain in subsequent 5.10 releases.

Changes to type-based dispatch

The smart match operator ~~ is no longer commutative. The behaviour of a smart match now depends primarily on the type of its right hand argument. Moreover, its semantics have been adjusted for greater consistency or usefulness in several cases. While the general backwards compatibility is maintained, several changes must be noted:

  • Code references with an empty prototype are no longer treated specially. They are passed an argument like the other code references (even if they choose to ignore it).

  • %hash ~~ sub {} and @array ~~ sub {} now test that the subroutine returns a true value for each key of the hash (or element of the array), instead of passing the whole hash or array as a reference to the subroutine.

  • Because the operator is no longer commutative, code references are no longer treated specially when appearing on the left of the ~~ operator; they are treated like any ordinary scalar.

  • undef ~~ %hash is always false (since undef can't be a key in a hash). No implicit conversion to "" is done (as was the case in perl 5.10.0).

  • $scalar ~~ @array now always distributes the smart match across the elements of the array. It is true if at least one element of @array satisfies $scalar ~~ $element. This is a generalization of the old behaviour that tested whether the array contained the scalar.

The full dispatch table for the smart match operator is given in Smart matching in detail in perlsyn.
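
A few of the rules listed above can be sketched as follows. Note that in later perls smart match was flagged as experimental and then deprecated, so this example switches warnings off; the data is made up.

```perl
use strict;
no warnings;    # later perls warn about any use of smart match

my @numbers = (1, 2, 3, 10);
my %ages    = (alice => 30);

# Scalar on the left, array on the right: distributed over the elements.
my $has_three = (3 ~~ @numbers) ? 1 : 0;

# Array on the left, code reference on the right: the sub is called once
# per element and must return true for every one of them.
my $all_positive = (@numbers ~~ sub { $_[0] > 0 }) ? 1 : 0;
my $all_small    = (@numbers ~~ sub { $_[0] < 5 }) ? 1 : 0;

# undef against a hash is always false.
my $undef_in_hash = (undef ~~ %ages) ? 1 : 0;

print "has_three=$has_three all_positive=$all_positive ",
      "all_small=$all_small undef_in_hash=$undef_in_hash\n";
```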

Smart match and overloading

According to the rule of dispatch based on the rightmost argument type, when an object overloading ~~ appears on the right side of the operator, the overload routine will always be called (with a 3rd argument set to a true value, see overload). However, when the object appears on the left, the overload routine will be called only when the rightmost argument is a simple scalar. This way, distributivity of smart match across arrays is not broken, nor are the other behaviours with complex types (coderefs, hashes, regexes). Thus, writers of overloading routines for smart match mostly need to worry only about comparing against a scalar, and possibly about stringification overloading; the other common cases will be automatically handled consistently.

~~ will now refuse to work on objects that do not overload it (in order to avoid relying on the object's underlying structure). (However, if the object overloads the stringification or the numification operators, and if overload fallback is active, it will be used instead, as usual.)

Other potentially incompatible changes

  • The definitions of a number of Unicode properties have changed to match those of the current Unicode standard. These are listed below under Unicode overhaul. This change may break code that expects the old definitions.

  • The boolkeys op has moved to the group of hash ops. This breaks binary compatibility.

  • Filehandles are now always blessed into IO::File .

    The previous behaviour was to bless Filehandles into FileHandle (an empty proxy class) if it was loaded into memory and otherwise to bless them into IO::Handle .

  • The semantics of use feature :5.10* have changed slightly. See Modules and Pragmata for more information.

  • Perl's developers now use git, rather than Perforce. This should be a purely internal change only relevant to people actively working on the core. However, you may see minor differences in perl as a consequence of the change, for example in some of the details of the output of perl -V. See perlrepository for more information.

  • As part of the Test::Harness 2.x to 3.x upgrade, the experimental Test::Harness::Straps module has been removed. See Modules and Pragmata for more details.

  • As part of the ExtUtils::MakeMaker upgrade, the ExtUtils::MakeMaker::bytes and ExtUtils::MakeMaker::vmsish modules have been removed from this distribution.

  • Module::CoreList no longer contains the %patchlevel hash.

  • length undef now returns undef.

  • Unsupported private C API functions are now declared "static" to prevent leakage to Perl's public API.

  • To support the bootstrapping process, miniperl no longer builds with UTF-8 support in the regexp engine.

    This allows a build to complete with PERL_UNICODE set and a UTF-8 locale. Without this there's a bootstrapping problem, as miniperl can't load the UTF-8 components of the regexp engine, because they're not yet built.

  • miniperl's @INC is now restricted to just -I... , the split of $ENV{PERL5LIB} , and "."

  • A space or a newline is now required after a "#line XXX" directive.

  • Tied filehandles now have an additional method EOF which provides the EOF type.

  • To better match all other flow control statements, foreach may no longer be used as an attribute.

  • Perl's command-line switch "-P", which was deprecated in version 5.10.0, has now been removed. The CPAN module Filter::cpp can be used as an alternative.
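
Two of the changes in the list above can be observed directly from Perl code. This is only a quick check, not an exhaustive demonstration:

```perl
use strict;
use warnings;

# length of an undefined value is now undef rather than 0.
my $length          = length(undef);
my $length_is_undef = defined $length ? 0 : 1;

# The IO object behind a filehandle is blessed into IO::File.
my $handle_class = ref *STDOUT{IO};

print "length(undef) is ", ($length_is_undef ? "undef" : "defined"), "\n";
print "STDOUT's IO object is a $handle_class\n";
```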

Deprecations

From time to time, Perl's developers find it necessary to deprecate features or modules we've previously shipped as part of the core distribution. We are well aware of the pain and frustration that a backwards-incompatible change to Perl can cause for developers building or maintaining software in Perl. You can be sure that when we deprecate a functionality or syntax, it isn't a choice we make lightly. Sometimes, we choose to deprecate functionality or syntax because it was found to be poorly designed or implemented. Sometimes, this is because they're holding back other features or causing performance problems. Sometimes, the reasons are more complex. Wherever possible, we try to keep deprecated functionality available to developers in its previous form for at least one major release. So long as a deprecated feature isn't actively disrupting our ability to maintain and extend Perl, we'll try to leave it in place as long as possible.

The following items are now deprecated:

  • suidperl

    suidperl is no longer part of Perl. It used to provide a mechanism to emulate setuid permission bits on systems that don't support it properly.

  • Use of := to mean an empty attribute list

    An accident of Perl's parser meant that these constructions were all equivalent:

        my $pi := 4;
        my $pi : = 4;
        my $pi :  = 4;

    with the : being treated as the start of an attribute list, which ends before the = . As whitespace is not significant here, all are parsed as an empty attribute list, hence all the above are equivalent to, and better written as

        my $pi = 4;

    because no attribute processing is done for an empty list.

    As is, this meant that := cannot be used as a new token, without silently changing the meaning of existing code. Hence that particular form is now deprecated, and will become a syntax error. If it is absolutely necessary to have empty attribute lists (for example, because of a code generator) then avoid the warning by adding a space before the = .

  • UNIVERSAL->import()

    The method UNIVERSAL->import() is now deprecated. Attempting to pass import arguments to a use UNIVERSAL statement will result in a deprecation warning.

  • Use of "goto" to jump into a construct

    Using goto to jump from an outer scope into an inner scope is now deprecated. This rare use case was causing problems in the implementation of scopes.

  • Custom character names in \N{name} that don't look like names

    In \N{name}, name can be just about anything. The standard Unicode names have a very limited domain, but a custom name translator could create names that are, for example, made up entirely of punctuation symbols. It is now deprecated to make names that don't begin with an alphabetic character, or that contain anything other than alphanumerics and a very few other characters, namely spaces, dashes, parentheses and colons. Because of the added meaning of \N (see \N experimental regex escape), names that look like curly brace-enclosed quantifiers won't work. For example, \N{3,4} now means to match 3 to 4 non-newlines; before, a custom name 3,4 could have been created.

  • Deprecated Modules

    The following modules will be removed from the core distribution in a future release, and should be installed from CPAN instead. Distributions on CPAN which require these should add them to their prerequisites. The core versions of these modules will issue a deprecation warning.

    If you ship a packaged version of Perl, either alone or as part of a larger system, then you should carefully consider the repercussions of core module deprecations. You may want to consider shipping your default build of Perl with packages for some or all deprecated modules which install into vendor or site perl library directories. This will inhibit the deprecation warnings.

    Alternatively, you may want to consider patching lib/deprecate.pm to provide deprecation warnings specific to your packaging system or distribution of Perl, consistent with how your packaging system or distribution manages a staged transition from a release where the installation of a single package provides the given functionality, to a later release where the system administrator needs to know to install multiple packages to get that same functionality.

    You can silence these deprecation warnings by installing the modules in question from CPAN. To install the latest version of all of them, just install Task::Deprecations::5_12.

  • Assignment to $[
  • Use of the attribute :locked on subroutines
  • Use of "locked" with the attributes pragma
  • Use of "unique" with the attributes pragma
  • Perl_pmflag

    Perl_pmflag is no longer part of Perl's public API. Calling it now generates a deprecation warning, and it will be removed in a future release. Although listed as part of the API, it was never documented, and only ever used in toke.c, and prior to 5.10, regcomp.c. In core, it has been replaced by a static function.

  • Numerous Perl 4-era libraries

    termcap.pl, tainted.pl, stat.pl, shellwords.pl, pwd.pl, open3.pl, open2.pl, newgetopt.pl, look.pl, find.pl, finddepth.pl, importenv.pl, hostname.pl, getopts.pl, getopt.pl, getcwd.pl, flush.pl, fastcwd.pl, exceptions.pl, ctime.pl, complete.pl, cacheout.pl, bigrat.pl, bigint.pl, bigfloat.pl, assert.pl, abbrev.pl, dotsh.pl, and timelocal.pl are all now deprecated. Earlier, Perl's developers intended to remove these libraries from Perl's core for the 5.14.0 release.

    During final testing before the release of 5.12.0, several developers discovered current production code using these ancient libraries, some inside the Perl core itself. Accordingly, the pumpking granted them a stay of execution. They will begin to warn about their deprecation in the 5.14.0 release and will be removed in the 5.16.0 release.
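The quantified form of \N mentioned under custom character names above can be illustrated with a minimal sketch. The sample string and variable names are assumptions chosen for the example, not anything taken from the Perl distribution.

```perl
use strict;
use warnings;

# Hypothetical sample text: two lines of different lengths.
my $text = "abc\ndefgh\n";

# \N{3,4} is a quantified \N: three to four characters that are not newlines.
my ($first) = $text =~ /(\N{3,4})/;

# Collect every run of exactly four non-newline characters.
my @runs_of_four = $text =~ /(\N{4})/g;

print "first: $first\n";
print "runs of four: @runs_of_four\n";
```

Because a newline can never be part of the match, the first capture stops at the end of the first line.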

Unicode overhaul

Perl's developers have made a concerted effort to update Perl to be in sync with the latest Unicode standard. Changes for this include:

Perl can now handle every Unicode character property. New documentation, perluniprops, lists all available non-Unihan character properties. By default, perl does not expose Unihan, deprecated or Unicode-internal properties. See below for more details on these; there is also a section in the pod listing them, and explaining why they are not exposed.

Perl now fully supports the Unicode compound-style of using = and : in writing regular expressions: \p{property=value} and \p{property:value} (both of which mean the same thing).

Perl now fully supports the Unicode loose matching rules for text between the braces in \p{...} constructs. In addition, Perl allows underscores between digits of numbers.

Perl now accepts all the Unicode-defined synonyms for properties and property values.
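As a rough illustration of the compound form, loose matching, and synonyms described above, the following sketch matches the same character against several spellings of one property. The choice of General_Category and of the letter "A" is only an assumption made for the example.

```perl
use strict;
use warnings;

# Four spellings that should all name the same property value.
my @forms = (
    qr/\p{General_Category=Uppercase_Letter}/,   # compound form with =
    qr/\p{General_Category:Uppercase_Letter}/,   # compound form with :
    qr/\p{gc=Lu}/,                               # short synonyms
    qr/\p{general_category=uppercase_letter}/,   # case is ignored
);

# 1 for each form that matches LATIN CAPITAL LETTER A, 0 otherwise.
my @matches = map { 'A' =~ $_ ? 1 : 0 } @forms;

print "matches: @matches\n";
```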

qr/\X/, which matches a Unicode logical character, has been expanded to work better with various Asian languages. It now is defined as an extended grapheme cluster. (See http://www.unicode.org/reports/tr29/). Anything matched previously and that made sense will continue to be accepted. Additionally:

  • \X will not break apart a CR LF sequence.

  • \X will now match a sequence which includes the ZWJ and ZWNJ characters.

  • \X will now always match at least one character, including an initial mark. Marks generally come after a base character, but it is possible in Unicode to have them in isolation, and \X will now handle that case, for example at the beginning of a line or after a ZWSP. This is also where \X stops matching things it used to match that don't make sense. Formerly, for example, you could have the nonsensical case of an accented LF.

  • \X will now match a (Korean) Hangul syllable sequence, and the Thai and Lao exception cases.

Otherwise, this change should be transparent for the non-affected languages.
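A small sketch of the extended grapheme cluster behaviour listed above, using a made-up string that contains a combining mark and a CR LF pair. Lengths are printed rather than the clusters themselves to avoid wide-character output.

```perl
use strict;
use warnings;

# Hypothetical sample: "e" + COMBINING ACUTE ACCENT, then CR LF, then "x".
my $text = "e\x{301}\r\nx";

# Each \X match is one extended grapheme cluster.
my @clusters = $text =~ /(\X)/g;

# Length in code points of each cluster.
my @lengths = map { length } @clusters;

print "cluster count: ", scalar(@clusters), "\n";
print "cluster lengths: @lengths\n";
```

The base letter and its mark form one cluster, and the CR LF sequence is kept together as another.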

\p{...} matches using the Canonical_Combining_Class property were completely broken in previous releases of Perl. They should now work correctly.

Before Perl 5.12, the Unicode Decomposition_Type=Compat property and a Perl extension had the same name, which led to neither matching all the correct values (with more than 100 mistakes in one, and several thousand in the other). The Perl extension has now been renamed to be Decomposition_Type=Noncanonical (short: dt=noncanon). It has the same meaning as was previously intended, namely the union of all the non-canonical Decomposition types, with Unicode Compat being just one of those.

\p{Decomposition_Type=Canonical} now includes the Hangul syllables.

\p{Uppercase} and \p{Lowercase} now work as the Unicode standard says they should. This means they each match a few more characters than they used to.

\p{Cntrl} now matches the same characters as \p{Control}. This means it will no longer match Private Use (gc=co), Surrogate (gc=cs), or Format (gc=cf) code points. The Format code points represent the biggest possible problem. All but 36 of them are either officially deprecated or strongly discouraged from being used. Of those 36, likely the most widely used are the soft hyphen (U+00AD), and BOM, ZWSP, ZWNJ, WJ, and similar characters, plus bidirectional controls.

\p{Alpha} now matches the same characters as \p{Alphabetic}. Before 5.12, Perl's definition included a number of things that aren't really alpha (all marks) while omitting many that were. The definitions of \p{Alnum} and \p{Word} depend on Alpha's definition and have changed accordingly.

\p{Word} no longer incorrectly matches non-word characters such as fractions.

\p{Print} no longer matches the line control characters: Tab, LF, CR, FF, VT, and NEL. This brings it in line with standards and the documentation.

\p{XDigit} now matches the same characters as \p{Hex_Digit}. This means that in addition to the characters it currently matches, [A-Fa-f0-9], it will also match the 22 fullwidth equivalents, for example U+FF10: FULLWIDTH DIGIT ZERO.
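The changes to \p{Cntrl}, \p{Print}, and \p{XDigit} described above can be checked with a short sketch. The three sample characters are chosen here purely for illustration.

```perl
use strict;
use warnings;

# SOFT HYPHEN (U+00AD) is a Format character, so \p{Cntrl} should not match it.
my $soft_hyphen_is_cntrl = "\x{AD}"   =~ /\p{Cntrl}/  ? 1 : 0;

# Tab is a line control character, so \p{Print} should not match it.
my $tab_is_print         = "\t"       =~ /\p{Print}/  ? 1 : 0;

# FULLWIDTH DIGIT ZERO (U+FF10) is a Hex_Digit, so \p{XDigit} should match it.
my $fullwidth_is_xdigit  = "\x{FF10}" =~ /\p{XDigit}/ ? 1 : 0;

print "soft hyphen is Cntrl:       $soft_hyphen_is_cntrl\n";
print "tab is Print:               $tab_is_print\n";
print "fullwidth zero is XDigit:   $fullwidth_is_xdigit\n";
```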

The Numeric type property has been extended to include the Unihan characters.

There is a new Perl extension, the 'Present_In', or simply 'In', property. This is an extension of the Unicode Age property: \p{In=5.0} matches any code point whose usage has been determined as of Unicode version 5.0, whereas \p{Age=5.0} matches only code points added in precisely version 5.0.
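To make the difference between the two properties concrete, here is a minimal sketch. The two sample code points are assumptions picked for the example: U+0041 has been in Unicode since version 1.1, and U+1B05 BALINESE LETTER AKARA was added in version 5.0.

```perl
use strict;
use warnings;

my $latin_a  = "A";          # U+0041, assigned long before Unicode 5.0
my $balinese = "\x{1B05}";   # BALINESE LETTER AKARA, assigned in Unicode 5.0

# In is cumulative: anything assigned as of 5.0 matches.
my $a_in_5    = $latin_a  =~ /\p{In=5.0}/  ? 1 : 0;
my $bali_in_5 = $balinese =~ /\p{In=5.0}/  ? 1 : 0;

# Age is exact: only code points first assigned in 5.0 match.
my $a_age_5    = $latin_a  =~ /\p{Age=5.0}/ ? 1 : 0;
my $bali_age_5 = $balinese =~ /\p{Age=5.0}/ ? 1 : 0;

print "U+0041: In=5.0 $a_in_5, Age=5.0 $a_age_5\n";
print "U+1B05: In=5.0 $bali_in_5, Age=5.0 $bali_age_5\n";
```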

A number of properties now have the correct values for unassigned code points. The affected properties are Bidi_Class, East_Asian_Width, Joining_Type, Decomposition_Type, Hangul_Syllable_Type, Numeric_Type, and Line_Break.

The Default_Ignorable_Code_Point, ID_Continue, and ID_Start properties are now up to date with current Unicode definitions.

Earlier versions of Perl erroneously exposed certain properties that are supposed to be Unicode internal-only. Use of these in regular expressions will now generate, if enabled, a deprecation warning message. The properties are: Other_Alphabetic, Other_Default_Ignorable_Code_Point, Other_Grapheme_Extend, Other_ID_Continue, Other_ID_Start, Other_Lowercase, Other_Math, and Other_Uppercase.

It is now possible to change which Unicode properties Perl understands on a per-installation basis. As mentioned above, certain properties are turned off by default. These include all the Unihan properties (which should be accessible via the CPAN module Unicode::Unihan) and any deprecated or Unicode internal-only property that Perl has never exposed.

The generated files in the lib/unicore/To directory are now more clearly marked as being stable, directly usable by applications. New hash entries in them give the format of the normal entries, which allows for easier machine parsing. Perl can generate files in this directory for any property, though most are suppressed. You can find instructions for changing which are written in perluniprops.

Modules and Pragmata

New Modules and Pragmata

  • autodie

    autodie is a new lexically-scoped alternative for the Fatal module. The bundled version is 2.06_01. Note that in this release, using a string eval when autodie is in effect can cause the autodie behaviour to leak into the surrounding scope. See BUGS in autodie for more details.

    Version 2.06_01 has been added to the Perl core.

  • Compress::Raw::Bzip2

    Version 2.024 has been added to the Perl core.

  • overloading

    overloading allows you to lexically disable or enable overloading for some or all operations.

    Version 0.001 has been added to the Perl core.

  • parent

    parent establishes an ISA relationship with base classes at compile time. It provides the key feature of base without further unwanted behaviors.

    Version 0.223 has been added to the Perl core.

  • Parse::CPAN::Meta

    Version 1.40 has been added to the Perl core.

  • VMS::DCLsym

    Version 1.03 has been added to the Perl core.

  • VMS::Stdio

    Version 2.4 has been added to the Perl core.

  • XS::APItest::KeywordRPN

    Version 0.003 has been added to the Perl core.
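As a hedged illustration of two of the new pragmata listed above, the following sketch uses parent to set up inheritance and overloading to switch off an overloaded stringification within a block. The Animal, Dog, and Temperature packages are hypothetical and exist only for this example; -norequire is used because they live in the same file.

```perl
use strict;
use warnings;

package Animal;
sub new   { my ($class) = @_; return bless {}, $class }
sub speak { my ($self) = @_; return ref($self) . " makes a sound" }

package Dog;
# Establish the ISA relationship at compile time without loading a file.
use parent -norequire, 'Animal';

package Temperature;
use overload '""' => sub { $_[0]{degrees} . " C" };
sub new { my ($class, $degrees) = @_; return bless { degrees => $degrees }, $class }

package main;

my $dog   = Dog->new;
my $sound = $dog->speak;          # inherited from Animal

my $temp = Temperature->new(21);
my $with_overload = "$temp";      # uses the overloaded stringification

# Inside this block overloading is lexically disabled.
my $without_overload = do { no overloading; "$temp" };

print "$sound\n";
print "with overloading:    $with_overload\n";
print "without overloading: $without_overload\n";
```

With overloading disabled, stringifying the object should fall back to the default form that shows the class name and reference address.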

Updated Pragmata

  • base

    Upgraded from version 2.13 to 2.15.

  • bignum

    Upgraded from version 0.22 to 0.23.

  • charnames

    charnames now contains the Unicode NameAliases.txt database file. This has the effect of adding some extra \N character names that formerly wouldn't have been recognised; for example, "\N{LATIN CAPITAL LETTER GHA}".

    Upgraded from version 1.06 to 1.07.

  • constant

    Upgraded from version 1.13 to 1.20.

  • diagnostics

    diagnostics now supports %.0f formatting internally.

    diagnostics no longer suppresses Use of uninitialized value in range (or flip) warnings. [perl #71204]

    Upgraded from version 1.17 to 1.19.

  • feature

    In feature, the meaning of the :5.10 and :5.10.X feature bundles has changed slightly. The last component, if any (i.e. X), is simply ignored. This is predicated on the assumption that new features will not, in general, be added to maintenance releases. So :5.10 and :5.10.X have identical effect. This is a change to the behaviour documented for 5.10.0.

    feature now includes the unicode_strings feature:

    use feature "unicode_strings";

    This pragma turns on Unicode semantics for the case-changing operations (uc, lc, ucfirst, lcfirst) on strings that don't have the internal UTF-8 flag set, but that contain single-byte characters between 128 and 255.

    Upgraded from version 1.11 to 1.16.

  • less

    less now includes the stash_name method to allow subclasses of less to pick where in %^H to store their stash.

    Upgraded from version 0.02 to 0.03.

  • lib

    Upgraded from version 0.5565 to 0.62.

  • mro

    mro is now implemented as an XS extension. The documented interface has not changed. Code relying on the implementation detail that some mro:: methods happened to be available at all times gets to "keep both pieces".

    Upgraded from version 1.00 to 1.02.

  • overload

    overload now allows overloading of 'qr'.

    Upgraded from version 1.06 to 1.10.

  • threads

    Upgraded from version 1.67 to 1.75.

  • threads::shared

    Upgraded from version 1.14 to 1.32.

  • version

    version now has support for Version number formats as described earlier in this document and in its own documentation.

    Upgraded from version 0.74 to 0.82.

  • warnings

    warnings has a new warnings::fatal_enabled() function. It also includes a new illegalproto warning category. See also New or Changed Diagnostics for this change.

    Upgraded from version 1.06 to 1.09.
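The effect of the unicode_strings feature described under feature above can be sketched as follows. The sample character, U+00E9 (e with acute accent), is an assumption chosen because it lies between 128 and 255 and is stored without the internal UTF-8 flag.

```perl
use strict;
use warnings;

# A single-byte string holding the character U+00E9.
my $e_acute = "\xE9";

# Outside the feature's scope, uc leaves the character unchanged.
my $without = uc $e_acute;

my $with;
{
    # Within this block, case changing follows Unicode rules.
    use feature 'unicode_strings';
    $with = uc $e_acute;
}

printf "without feature: %02X\n", ord $without;
printf "with feature:    %02X\n", ord $with;
```

Because the feature is lexically scoped, only the uc call inside the block should produce U+00C9.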

Updated Modules

  • Archive::Extract

    Upgraded from version 0.24 to 0.38.

  • Archive::Tar

    Upgraded from version 1.38 to 1.54.

  • Attribute::Handlers

    Upgraded from version 0.79 to 0.87.

  • AutoLoader

    Upgraded from version 5.63 to 5.70.

  • B::Concise

    Upgraded from version 0.74 to 0.78.

  • B::Debug

    Upgraded from version 1.05 to 1.12.

  • B::Deparse

    Upgraded from version 0.83 to 0.96.

  • B::Lint

    Upgraded from version 1.09 to 1.11_01.

  • CGI

    Upgraded from version 3.29 to 3.48.

  • Class::ISA

    Upgraded from version 0.33 to 0.36.

    NOTE: Class::ISA is deprecated and may be removed from a future version of Perl.

  • Compress::Raw::Zlib

    Upgraded from version 2.008 to 2.024.

  • CPAN

    Upgraded from version 1.9205 to 1.94_56.

  • CPANPLUS

    Upgraded from version 0.84 to 0.90.

  • CPANPLUS::Dist::Build

    Upgraded from version 0.06_02 to 0.46.

  • Data::Dumper

    Upgraded from version 2.121_14 to 2.125.

  • DB_File

    Upgraded from version 1.816_1 to 1.820.

  • Devel::PPPort

    Upgraded from version 3.13 to 3.19.

  • Digest

    Upgraded from version 1.15 to 1.16.

  • Digest::MD5

    Upgraded from version 2.36_01 to 2.39.

  • Digest::SHA

    Upgraded from version 5.45 to 5.47.

  • Encode

    Upgraded from version 2.23 to 2.39.

  • Exporter

    Upgraded from version 5.62 to 5.64_01.

  • ExtUtils::CBuilder

    Upgraded from version 0.21 to 0.27.

  • ExtUtils::Command

    Upgraded from version 1.13 to 1.16.

  • ExtUtils::Constant

    Upgraded from version 0.2 to 0.22.

  • ExtUtils::Install

    Upgraded from version 1.44 to 1.55.

  • ExtUtils::MakeMaker

    Upgraded from version 6.42 to 6.56.

  • ExtUtils::Manifest

    Upgraded from version 1.51_01 to 1.57.

  • ExtUtils::ParseXS

    Upgraded from version 2.18_02 to 2.21.

  • File::Fetch

    Upgraded from version 0.14 to 0.24.

  • File::Path

    Upgraded from version 2.04 to 2.08_01.

  • File::Temp

    Upgraded from version 0.18 to 0.22.

  • Filter::Simple

    Upgraded from version 0.82 to 0.84.

  • Filter::Util::Call

    Upgraded from version 1.07 to 1.08.

  • Getopt::Long

    Upgraded from version 2.37 to 2.38.

  • IO

    Upgraded from version 1.23_01 to 1.25_02.

  • IO::Zlib

    Upgraded from version 1.07 to 1.10.

  • IPC::Cmd

    Upgraded from version 0.40_1 to 0.54.

  • IPC::SysV

    Upgraded from version 1.05 to 2.01.

  • Locale::Maketext

    Upgraded from version 1.12 to 1.14.

  • Locale::Maketext::Simple

    Upgraded from version 0.18 to 0.21.

  • Log::Message

    Upgraded from version 0.01 to 0.02.

  • Log::Message::Simple

    Upgraded from version 0.04 to 0.06.

  • Math::BigInt

    Upgraded from version 1.88 to 1.89_01.

  • Math::BigInt::FastCalc

    Upgraded from version 0.16 to 0.19.

  • Math::BigRat

    Upgraded from version 0.21 to 0.24.

  • Math::Complex

    Upgraded from version 1.37 to 1.56.

  • Memoize

    Upgraded from version 1.01_02 to 1.01_03.

  • MIME::Base64

    Upgraded from version 3.07_01 to 3.08.

  • Module::Build

    Upgraded from version 0.2808_01 to 0.3603.

  • Module::CoreList

    Upgraded from version 2.12 to 2.29.

  • Module::Load

    Upgraded from version 0.12 to 0.16.

  • Module::Load::Conditional

    Upgraded from version 0.22 to 0.34.

  • Module::Loaded

    Upgraded from version 0.01 to 0.06.

  • Module::Pluggable

    Upgraded from version 3.6 to 3.9.

  • Net::Ping

    Upgraded from version 2.33 to 2.36.

  • NEXT

    Upgraded from version 0.60_01 to 0.64.

  • Object::Accessor

    Upgraded from version 0.32 to 0.36.

  • Package::Constants

    Upgraded from version 0.01 to 0.02.

  • PerlIO

    Upgraded from version 1.04 to 1.06.

  • Pod::Parser

    Upgraded from version 1.35 to 1.37.

  • Pod::Perldoc

    Upgraded from version 3.14_02 to 3.15_02.

  • Pod::Plainer

    Upgraded from version 0.01 to 1.02.

    NOTE: Pod::Plainer is deprecated and may be removed from a future version of Perl.

  • Pod::Simple

    Upgraded from version 3.05 to 3.13.

  • Safe

    Upgraded from version 2.12 to 2.22.

  • SelfLoader

    Upgraded from version 1.11 to 1.17.

  • Storable

    Upgraded from version 2.18 to 2.22.

  • Switch

    Upgraded from version 2.13 to 2.16.

    NOTE: Switch is deprecated and may be removed from a future version of Perl.

  • Sys::Syslog

    Upgraded from version 0.22 to 0.27.

  • Term::ANSIColor

    Upgraded from version 1.12 to 2.02.

  • Term::UI

    Upgraded from version 0.18 to 0.20.

  • Test

    Upgraded from version 1.25 to 1.25_02.

  • Test::Harness

    Upgraded from version 2.64 to 3.17.

  • Test::Simple

    Upgraded from version 0.72 to 0.94.

  • Text::Balanced

    Upgraded from version 2.0.0 to 2.02.

  • Text::ParseWords

    Upgraded from version 3.26 to 3.27.

  • Text::Soundex

    Upgraded from version 3.03 to 3.03_01.

  • Thread::Queue

    Upgraded from version 2.00 to 2.11.

  • Thread::Semaphore

    Upgraded from version 2.01 to 2.09.

  • Tie::RefHash

    Upgraded from version 1.37 to 1.38.

  • Time::HiRes

    Upgraded from version 1.9711 to 1.9719.

  • Time::Local

    Upgraded from version 1.18 to 1.1901_01.

  • Time::Piece

    Upgraded from version 1.12 to 1.15.

  • Unicode::Collate

    Upgraded from version 0.52 to 0.52_01.

  • Unicode::Normalize

    Upgraded from version 1.02 to 1.03.

  • Win32

    Upgraded from version 0.34 to 0.39.

  • Win32API::File

    Upgraded from version 0.1001_01 to 0.1101.

  • XSLoader

    Upgraded from version 0.08 to 0.10.

Removed Modules and Pragmata

  • attrs

    Removed from the Perl core. Prior version was 1.02.

  • CPAN::API::HOWTO

    Removed from the Perl core. Prior version was 'undef'.

  • CPAN::DeferedCode

    Removed from the Perl core. Prior version was 5.50.

  • CPANPLUS::inc

    Removed from the Perl core. Prior version was 'undef'.

  • DCLsym

    Removed from the Perl core. Prior version was 1.03.

  • ExtUtils::MakeMaker::bytes

    Removed from the Perl core. Prior version was 6.42.

  • ExtUtils::MakeMaker::vmsish

    Removed from the Perl core. Prior version was 6.42.

  • Stdio

    Removed from the Perl core. Prior version was 2.3.

  • Test::Harness::Assert

    Removed from the Perl core. Prior version was 0.02.

  • Test::Harness::Iterator

    Removed from the Perl core. Prior version was 0.02.

  • Test::Harness::Point

    Removed from the Perl core. Prior version was 0.01.

  • Test::Harness::Results

    Removed from the Perl core. Prior version was 0.01.

  • Test::Harness::Straps

    Removed from the Perl core. Prior version was 0.26_01.

  • Test::Harness::Util

    Removed from the Perl core. Prior version was 0.01.

  • XSSymSet

    Removed from the Perl core. Prior version was 1.1.

Deprecated Modules and Pragmata

See Deprecated Modules above.

Documentation

New Documentation

  • perlhaiku contains instructions on how to build perl for the Haiku platform.

  • perlmroapi describes the new interface for pluggable Method Resolution Orders.

  • perlperf, by Richard Foley, provides an introduction to the use of performance and optimization techniques which can be used with particular reference to perl programs.

  • perlrepository describes how to access the perl source using the git version control system.

  • perlpolicy extends the "Social contract about contributed modules" into the beginnings of a document on Perl porting policies.

Changes to Existing Documentation

  • The various large Changes* files (which listed every change made to perl over the last 18 years) have been removed, and replaced by a small file, also called Changes, which just explains how that same information may be extracted from the git version control system.

  • Porting/patching.pod has been deleted, as it mainly described interacting with the old Perforce-based repository, which is now obsolete. Information still relevant has been moved to perlrepository.

  • The syntax unless (EXPR) BLOCK else BLOCK is now documented as valid, as is the syntax unless (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK, although actually using the latter may not be the best idea for the readability of your source code.

  • Documented -X overloading.

  • Documented that when() treats most of the filetest operators specially.

  • Documented when as a syntax modifier.

  • Eliminated "Old Perl threads tutorial", which described 5005 threads.

    pod/perlthrtut.pod is the same material reworked for ithreads.

  • Correct previous documentation: v-strings are not deprecated

    With version objects, we need them to use MODULE VERSION syntax. This patch removes the deprecation notice.

  • Security contact information is now part of perlsec.

  • A significant fraction of the core documentation has been updated to clarify the behavior of Perl's Unicode handling.

    Much of the remaining core documentation has been reviewed and edited for clarity, consistent use of language, and to fix the spelling of Tom Christiansen's name.

  • The Pod specification (perlpodspec) has been updated to bring the specification in line with modern usage already supported by most Pod systems. A parameter string may now follow the format name in a "begin/end" region. Links to URIs with a text description are now allowed. The usage of L<"section"> has been marked as deprecated.

  • if.pm has been documented in use as a means to get conditional loading of modules despite the implicit BEGIN block around use.

  • The documentation for $1 in perlvar.pod has been clarified.

  • \N{U+code point} is now documented.
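Two of the items documented above, the unless ... elsif ... else syntax and conditional module loading with if.pm, can be sketched together. The classify function and the WANT_SUM constant are hypothetical names introduced for this example.

```perl
use strict;
use warnings;

# A compile-time condition; "use if" needs its condition known at compile time.
use constant WANT_SUM => 1;

# Load List::Util and import sum only when the condition is true.
use if WANT_SUM, 'List::Util' => qw(sum);

# unless/elsif/else is valid syntax, though it can be hard to read.
sub classify {
    my ($n) = @_;
    unless ($n) {
        return 'zero';
    }
    elsif ($n < 0) {
        return 'negative';
    }
    else {
        return 'positive';
    }
}

my @labels = map { classify($_) } (0, -3, 7);
my $total  = sum(1, 2, 3);

print "labels: @labels\n";
print "total: $total\n";
```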

Selected Performance Enhancements

  • A new internal cache means that isa() will often be faster.

  • The implementation of C3 Method Resolution Order has been optimised - linearisation for classes with single inheritance is 40% faster. Performance for multiple inheritance is unchanged.

  • Under use locale, the locale-relevant information is now cached on read-only values, such as the list returned by keys %hash. This makes operations such as sort keys %hash in the scope of use locale much faster.

  • Empty DESTROY methods are no longer called.

  • Perl_sv_utf8_upgrade() is now faster.

  • keys on an empty hash is now faster.

  • if (%foo) has been optimized to be faster than if (keys %foo).

  • The string repetition operator ($str x $num) is now several times faster when $str has length one or $num is large.

  • Reversing an array to itself (as in @a = reverse @a) in void context now happens in-place and is several orders of magnitude faster than it used to be. It will also preserve non-existent elements whenever possible, i.e. for non-magical arrays or tied arrays with EXISTS and DELETE methods.
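For reference, the idioms that benefit from the optimisations above look like this. The sketch only shows the forms involved and does not measure speed; the variable names and data are made up for the example.

```perl
use strict;
use warnings;

my %seen;

# Testing a hash directly in boolean context.
my $before = %seen ? 'non-empty' : 'empty';
$seen{apple} = 1;
my $after  = %seen ? 'non-empty' : 'empty';

# Reversing an array onto itself.
my @a = (1 .. 5);
@a = reverse @a;

# Repeating a one-character string.
my $rule = '-' x 20;

print "before: $before, after: $after\n";
print "reversed: @a\n";
print "$rule\n";
```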

Installation and Configuration Improvements

  • perlapi, perlintern, perlmodlib and perltoc are now all generated at build time, rather than being shipped as part of the release.

  • If vendorlib and vendorarch are the same, then they are only added to @INC once.

  • $Config{usedevel} and the C-level PERL_USE_DEVEL are now defined if perl is built with -Dusedevel.

  • Configure will enable use of -fstack-protector, to provide protection against stack-smashing attacks, if the compiler supports it.

  • Configure will now determine the correct prototypes for re-entrant functions and for gconvert if you are using a C++ compiler rather than a C compiler.

  • On Unix, if you build from a tree containing a git repository, the configuration process will note the commit hash you have checked out, for display in the output of perl -v and perl -V. Unpushed local commits are automatically added to the list of local patches displayed by perl -V.

  • Perl now supports SystemTap's dtrace compatibility layer and an issue with linking miniperl has been fixed in the process.

  • perldoc now uses less -R instead of less for improved behaviour in the face of groff's new usage of ANSI escape codes.

  • perl -V now reports use of the compile-time options USE_PERL_ATOF and USE_ATTRIBUTES_FOR_PERLIO.

  • As part of the flattening of ext, all extensions on all platforms are built by make_ext.pl. This replaces the Unix-specific ext/util/make_ext, VMS-specific make_ext.com and Win32-specific win32/buildext.pl.

Internal Changes

Each release of Perl sees numerous internal changes which shouldn't affect day to day usage but may still be notable for developers working with Perl's source code.

  • The J.R.R. Tolkien quotes at the head of C source files have been checked and proper citations added, thanks to a patch from Tom Christiansen.

  • The internal structure of the dual-life modules traditionally found in the lib/ and ext/ directories in the perl source has changed significantly. Where possible, dual-lifed modules have been extracted from lib/ and ext/.

    Dual-lifed modules maintained by Perl's developers as part of the Perl core now live in dist/. Dual-lifed modules maintained primarily on CPAN now live in cpan/. When reporting a bug in a module located under cpan/, please send your bug report directly to the module's bug tracker or author, rather than Perl's bug tracker.

  • \N{...} now compiles better, always forces UTF-8 internal representation

    Perl's developers have fixed several problems with the recognition of \N{...} constructs. As part of this, perl will store any scalar or regex containing \N{name} or \N{U+code point} in its definition in UTF-8 format. (This was true previously for all occurrences of \N{name} that did not use a custom translator, but now it's always true.)

  • Perl_magic_setmglob now knows about globs, fixing RT #71254.

  • SVt_RV no longer exists. RVs are now stored in IVs.

  • Perl_vcroak() now accepts a null first argument. In addition, a full audit was made of the "not NULL" compiler annotations, and those for several other internal functions were corrected.

  • New macros dSAVEDERRNO, dSAVE_ERRNO, SAVE_ERRNO, RESTORE_ERRNO have been added to formalise the temporary saving of the errno variable.

  • The function Perl_sv_insert_flags has been added to augment Perl_sv_insert.

  • The function Perl_newSV_type(type) has been added, equivalent to Perl_newSV() followed by Perl_sv_upgrade(type).

  • The function Perl_newSVpvn_flags() has been added, equivalent to Perl_newSVpvn() and then performing the action relevant to the flag.

    Two flag bits are currently supported.

    • SVf_UTF8 will call SvUTF8_on() for you. (Note that this does not convert a sequence of ISO 8859-1 characters to UTF-8.) A wrapper, newSVpvn_utf8(), is available for this.

    • SVs_TEMP now calls Perl_sv_2mortal() on the new SV.

    There is also a wrapper that takes constant strings, newSVpvs_flags().

  • The function Perl_croak_xs_usage has been added as a wrapper to Perl_croak.

  • Perl now exports the functions PerlIO_find_layer and PerlIO_list_alloc.

  • PL_na has been exterminated from the core code, replaced by local STRLEN temporaries, or *_nolen() calls. Either approach is faster than PL_na, which is a pointer dereference into the interpreter structure under ithreads, and a global variable otherwise.

  • Perl_mg_free() used to leave freed memory accessible via SvMAGIC() on the scalar. It now updates the linked list to remove each piece of magic as it is freed.

  • Under ithreads, the regex in PL_reg_curpm is now reference counted. This eliminates a lot of hackish workarounds to cope with it not being reference counted.

  • Perl_mg_magical() would sometimes incorrectly turn on SvRMAGICAL(). This has been fixed.

  • The public IV and NV flags are now not set if the string value has trailing "garbage". This behaviour is consistent with not setting the public IV or NV flags if the value is out of range for the type.

  • Uses of Nullav, Nullcv, Nullhv, Nullop, Nullsv etc. have been replaced by NULL in the core code, and non-dual-life modules, as NULL is clearer to those unfamiliar with the core code.

  • A macro MUTABLE_PTR(p) has been added, which on (non-pedantic) gcc will not cast away const, returning a void *. Macros MUTABLE_AV(p), MUTABLE_CV(p) etc. build on this, casting to AV * etc. without casting away const. This allows proper compile-time auditing of const correctness in the core, and helped pick up some errors (now fixed).

  • Macros mPUSHs() and mXPUSHs() have been added, for pushing SVs on the stack and mortalizing them.

  • Use of the private structure mro_meta has changed slightly. Nothing outside the core should be accessing this directly anyway.

  • A new tool, Porting/expand-macro.pl has been added, that allows you to view how a C preprocessor macro would be expanded when compiled. This is handy when trying to decode the macro hell that is the perl guts.

Testing

Testing improvements

  • Parallel tests

    The core distribution can now run its regression tests in parallel on Unix-like platforms. Instead of running make test, set TEST_JOBS in your environment to the number of tests to run in parallel, and run make test_harness. On a Bourne-like shell, this can be done as:

    TEST_JOBS=3 make test_harness # Run 3 tests in parallel

    An environment variable is used, rather than parallel make itself, because TAP::Harness needs to be able to schedule individual non-conflicting test scripts itself, and there is no standard interface to make utilities to interact with their job schedulers.

    Note that currently some test scripts may fail when run in parallel (most notably ext/IO/t/io_dir.t). If necessary, run just the failing scripts again sequentially and see if the failures go away.

  • Test harness flexibility

    It's now possible to override PERL5OPT and friends in t/TEST.

  • Test watchdog

    Several tests that have the potential to hang forever if they fail now incorporate a "watchdog" functionality that will kill them after a timeout, which helps ensure that make test and make test_harness run to completion automatically.

New Tests

Perl's developers have added a number of new tests to the core. In addition to the items listed below, many modules updated from CPAN incorporate new tests.

  • Significant cleanups to core tests to ensure that language and interpreter features are not used before they're tested.

  • make test_porting now runs a number of important pre-commit checks which might be of use to anyone working on the Perl core.

  • t/porting/podcheck.t automatically checks the well-formedness of POD found in all .pl, .pm and .pod files in the MANIFEST, other than in dual-lifed modules which are primarily maintained outside the Perl core.

  • t/porting/manifest.t now tests that all files listed in MANIFEST are present.

  • t/op/while_readdir.t tests that a bare readdir in a while loop sets $_.

  • t/comp/retainedlines.t checks that the debugger can retain source lines from eval.

  • t/io/perlio_fail.t checks that bad layers fail.

  • t/io/perlio_leaks.t checks that PerlIO layers are not leaking.

  • t/io/perlio_open.t checks that certain special forms of open work.

  • t/io/perlio.t includes general PerlIO tests.

  • t/io/pvbm.t checks that there is no unexpected interaction between the internal types PVBM and PVGV.

  • t/mro/package_aliases.t checks that mro works properly in the presence of aliased packages.

  • t/op/dbm.t tests dbmopen and dbmclose.

  • t/op/index_thr.t tests the interaction of index and threads.

  • t/op/pat_thr.t tests the interaction of esoteric patterns and threads.

  • t/op/qr_gc.t tests that qr doesn't leak.

  • t/op/reg_email_thr.t tests the interaction of regex recursion and threads.

  • t/op/regexp_qr_embed_thr.t tests the interaction of patterns with embedded qr// and threads.

  • t/op/regexp_unicode_prop.t tests Unicode properties in regular expressions.

  • t/op/regexp_unicode_prop_thr.t tests the interaction of Unicode properties and threads.

  • t/op/reg_nc_tie.t tests the tied methods of Tie::Hash::NamedCapture.

  • t/op/reg_posixcc.t checks that POSIX character classes behave consistently.

  • t/op/re.t checks that exportable re functions in universal.c work.

  • t/op/setpgrpstack.t checks that setpgrp works.

  • t/op/substr_thr.t tests the interaction of substr and threads.

  • t/op/upgrade.t checks that upgrading and assigning scalars works.

  • t/uni/lex_utf8.t checks that Unicode in the lexer works.

  • t/uni/tie.t checks that Unicode and tie work.

  • t/comp/final_line_num.t tests whether line numbers are correct at EOF.

  • t/comp/form_scope.t tests format scoping.

  • t/comp/line_debug.t tests whether @{"_<$file"} works.

  • t/op/filetest_t.t tests if -t file test works.

  • t/op/qr.t tests qr.

  • t/op/utf8cache.t tests malfunctions of the utf8 cache.

  • t/re/uniprops.t tests Unicode's \p{} regex constructs.

  • t/op/filehandle.t tests some suitably portable filetest operators to check that they work as expected, particularly in the light of some internal changes made in how filehandles are blessed.

  • t/op/time_loop.t tests that unix times greater than 2**63, which can now be handed to gmtime and localtime, do not cause an internal overflow or an excessively long loop.

New or Changed Diagnostics

New Diagnostics

  • SV allocation tracing has been added to the diagnostics enabled by -Dm. The tracing can alternatively be output via the PERL_MEM_LOG mechanism, if that was enabled when the perl binary was compiled.

  • Smartmatch resolution tracing has been added as a new diagnostic. Use -DM to enable it.

  • A new debugging flag -DB now dumps subroutine definitions, leaving -Dx for its original purpose of dumping syntax trees.

  • Perl 5.12 provides a number of new diagnostic messages to help you write better code. See perldiag for details of these new messages.

    • Bad plugin affecting keyword '%s'

    • gmtime(%.0f) too large

    • Lexing code attempted to stuff non-Latin-1 character into Latin-1 input

    • Lexing code internal error (%s)

    • localtime(%.0f) too large

    • Overloaded dereference did not return a reference

    • Overloaded qr did not return a REGEXP

    • Perl_pmflag() is deprecated, and will be removed from the XS API

    • lvalue attribute ignored after the subroutine has been defined

      This new warning is issued when one attempts to mark a subroutine as lvalue after it has been defined.

    • Perl now warns you if ++ or -- are unable to change the value because it's beyond the limit of representation.

      This uses a new warnings category: "imprecision".

    • lc, uc, lcfirst, and ucfirst warn when passed undef.

    • Show constant in "Useless use of a constant in void context"

    • Prototype after '%s'

    • panic: sv_chop %s

      This new fatal error occurs when the C routine Perl_sv_chop() was passed a position that is not within the scalar's string buffer. This could be caused by buggy XS code, and at this point recovery is not possible.

    • The fatal error Malformed UTF-8 returned by \N is now produced if the charnames handler returns malformed UTF-8.

    • If an unresolved named character or sequence was encountered when compiling a regex pattern then the fatal error \N{NAME} must be resolved by the lexer is now produced. This can happen, for example, when using a single-quotish context like $re = '\N{SPACE}'; /$re/;. See perldiag for more examples of how the lexer can get bypassed.

    • Invalid hexadecimal number in \N{U+...} is a new fatal error triggered when the character constant represented by ... is not a valid hexadecimal number.

    • The new meaning of \N as [^\n] is not valid in a bracketed character class, just like . in a character class loses its special meaning, and will cause the fatal error \N in a character class must be a named character: \N{...}.

    • The rules on what is legal for the ... in \N{...} have been tightened up so that unless the ... begins with an alphabetic character and continues with a combination of alphanumerics, dashes, spaces, parentheses or colons then the warning Deprecated character(s) in \N{...} starting at '%s' is now issued.

    • The warning Using just the first characters returned by \N{} will be issued if the charnames handler returns a sequence of characters which exceeds the limit of the number of characters that can be used. The message will indicate which characters were used and which were discarded.
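The "imprecision" category mentioned in the list above can be tried out with a short script. This is only a minimal sketch: the warning handler and the variable names are illustrative assumptions, and 1e300 is chosen simply because it lies far beyond the range in which a floating-point value can represent consecutive integers.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them (illustrative harness).
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# ++ cannot change a value this large, so perl warns under "imprecision".
my $big = 1e300;
$big++;

{
    # The same operation with the category switched off stays silent.
    no warnings 'imprecision';
    my $quiet = 1e300;
    $quiet++;
}

print scalar(@warnings), " warning(s) captured\n";
print $warnings[0] if @warnings;
```

Only the first increment should be reported; the block with no warnings 'imprecision' shows how to silence the category locally without disabling other warnings.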

Changed Diagnostics

A number of existing diagnostic messages have been improved or corrected:

  • A new warning category illegalproto allows finer-grained control of warnings around function prototypes.

    The two warnings:

    • Illegal character in prototype for %s : %s
    • Prototype after '%c' for %s : %s

    have been moved from the syntax top-level warnings category into a new first-level category, illegalproto. These two warnings are currently the only ones emitted during parsing of an invalid/illegal prototype, so one can now use

    no warnings 'illegalproto';

    to suppress only those, but not other syntax-related warnings. Warnings where prototypes are changed, ignored, or not met are still in the prototype category as before.

  • Deep recursion on subroutine "%s"

    It is now possible to change the depth threshold for this warning from the default of 100, by recompiling the perl binary, setting the C pre-processor macro PERL_SUB_DEPTH_WARN to the desired value.

  • Illegal character in prototype warning is now more precise when reporting illegal characters after _

  • mro merging error messages are now very similar to those produced by Algorithm::C3.

  • Amelioration of the error message "Unrecognized character %s in column %d"

    The error message now reads "Unrecognized character %s; marked by <-- HERE after %s<-- HERE near column %d". This should make it a little simpler to spot and correct the suspicious character.

  • Perl now explicitly points to $. when it causes an uninitialized warning for ranges in scalar context.

  • split now warns when called in void context.

  • printf-style functions called with too few arguments will now issue the warning "Missing argument in %s" [perl #71000]

  • Perl now properly returns a syntax error instead of segfaulting if each, keys, or values is used without an argument.

  • tell() now fails properly if called without an argument and when no previous file was read.

    tell() now returns -1, and sets errno to EBADF, thus restoring the 5.8.x behaviour.

  • overload no longer implicitly unsets fallback on repeated 'use overload' lines.

  • POSIX::strftime() can now handle Unicode characters in the format string.

  • The syntax category was removed from five warnings that should only be in deprecated.

  • Three fatal pack/unpack error messages have been normalized to panic: %s

  • Unicode character is illegal has been rephrased to be more accurate

    It now reads Unicode non-character is illegal in interchange and the perldiag documentation has been expanded a bit.

  • Currently, all but the first of the several characters that the charnames handler may return are discarded when used in a regular expression pattern bracketed character class. If this happens then the warning Using just the first character returned by \N{} in character class will be issued.

  • The warning Missing right brace on \N{} or unescaped left brace after \N. Assuming the latter will be issued if Perl encounters a \N{ but doesn't find a matching }. In this case Perl doesn't know if it was mistakenly omitted, or if "match non-newline" followed by "match a {" was desired. It assumes the latter because that is actually a valid interpretation as written, unlike the other case. If you meant the former, you need to add the matching right brace. If you did mean the latter, you can silence this warning by writing instead \N\{.

  • gmtime and localtime called with numbers smaller than they can reliably handle will now issue the warnings gmtime(%.0f) too small and localtime(%.0f) too small.
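The illegalproto category described in the list above can be demonstrated as follows. This is a hedged sketch rather than anything taken from the Perl test suite: the sub names bad_proto_loud and bad_proto_quiet are hypothetical, and string eval is used only so that the compile-time warnings are raised after the handler has been installed.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them (illustrative harness).
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# "bar" is not a valid prototype, so compiling this sub warns.
eval q{ sub bad_proto_loud (bar) { 1 } 1 } or die $@;
my $loud = @warnings;

# The same invalid prototype with only the illegalproto category disabled.
eval q{
    no warnings 'illegalproto';
    sub bad_proto_quiet (bar) { 1 }
    1;
} or die $@;
my $quiet = @warnings - $loud;

print "with category enabled:  $loud warning(s)\n";
print "with category disabled: $quiet warning(s)\n";
```

Because only illegalproto is switched off in the second eval, other syntax-related warnings would still be reported there.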

The following diagnostic messages have been removed:

  • Runaway format

  • Can't locate package %s for the parents of %s

    In general this warning was only produced in conjunction with other warnings, and removing it allowed an ISA lookup optimisation to be added.

  • v-string in use/require is non-portable

Utility Changes

  • h2ph now looks in include-fixed too, which is a recent addition to gcc's search path.

  • h2xs no longer incorrectly treats enum values like macros. It also now handles C++ style comments (//) properly in enums.

  • perl5db.pl now supports LVALUE subroutines. Additionally, the debugger now correctly handles proxy constant subroutines, and subroutine stubs.

  • perlbug now uses %Module::CoreList::bug_tracker to print out upstream bug tracker URLs. If a user identifies a particular module as the topic of their bug report and we're able to divine the URL for its upstream bug tracker, perlbug now provides a message to the user explaining that the core copies the CPAN version directly, and provides the URL for reporting the bug directly to the upstream author.

    perlbug no longer reports "Message sent" when it hasn't actually sent the message.

  • perlthanks is a new utility for sending non-bug-reports to the authors and maintainers of Perl. Getting nothing but bug reports can become a bit demoralising. If Perl 5.12 works well for you, please try out perlthanks. It will make the developers smile.

  • Perl's developers have fixed bugs in a2p having to do with the match() operator in list context. Additionally, a2p no longer generates code that uses the $[ variable.
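The %Module::CoreList::bug_tracker hash that perlbug consults can also be read directly. The following is a minimal sketch; the module names are chosen purely for illustration, and which modules have an entry (and what it contains) depends on the Module::CoreList version installed.

```perl
use strict;
use warnings;
use Module::CoreList;

# Module names are illustrative; not every module has a tracker recorded.
for my $module (qw(CGI Test::More No::Such::Module)) {
    my $url = $Module::CoreList::bug_tracker{$module};
    printf "%-18s %s\n", $module,
        defined $url ? $url : '(no upstream tracker recorded)';
}

# Number of modules the installed Module::CoreList knows about.
my $tracked = keys %Module::CoreList::bug_tracker;
print "$tracked modules listed\n";
```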

Selected Bug Fixes

  • U+0FFFF is now a legal character in regular expressions.

  • pp_qr now always returns a new regexp SV. Resolves RT #69852.

    Instead of returning a(nother) reference to the (pre-compiled) regexp in the optree, use reg_temp_copy() to create a copy of it, and return a reference to that. This resolves issues about Regexp::DESTROY not being called in a timely fashion (the original bug tracked by RT #69852), as well as bugs related to blessing regexps, and of assigning to regexps, as described in correspondence added to the ticket.

    It transpires that we also need to undo the SvPVX() sharing when ithreads cloning a Regexp SV, because mother_re is set to NULL, instead of a cloned copy of the mother_re. This change might fix bugs with regexps and threads in certain other situations, but as yet neither tests nor bug reports have indicated any problems, so it might not actually be an edge case that it's possible to reach.

  • Several compilation errors and segfaults when perl was built with -Dmad were fixed.

  • Fixes for lexer API changes in 5.11.2 which broke NYTProf's savesrc option.

  • -t should only return TRUE for file handles connected to a TTY

    The Microsoft C version of isatty() returns TRUE for all character mode devices, including the /dev/null-style "nul" device and printers like "lpt1".

  • Fixed a regression caused by commit fafafbaf which caused a panic during parameter passing [perl #70171]

  • On systems which allow in-place edits without backup files, -i'*' now works as the documentation says it does [perl #70802]

  • Saving and restoring magic flags no longer loses readonly flag.

  • The malformed syntax grep EXPR LIST (note the missing comma) no longer causes abrupt and total failure.

  • Regular expressions compiled with qr{} literals properly set $' when matching again.

  • Using named subroutines with sort should no longer lead to bus errors [perl #71076]

  • Numerous bugfixes catch small issues caused by the recently-added Lexer API.

  • Smart match against @_ sometimes gave false negatives. [perl #71078]

  • $@ may now be assigned a read-only value (without error or busting the stack).

  • sort called recursively from within an active comparison subroutine no longer causes a bus error if run multiple times. [perl #71076]

  • Tie::Hash::NamedCapture::* will not abort if passed bad input (RT #71828)

  • @_ and $_ no longer leak under threads (RT #34342 and #41138, also #70602, #70974)

  • -I on shebang line now adds directories in front of @INC as documented, and as does -I when specified on the command-line.

  • kill is now fatal when called on non-numeric process identifiers. Previously, an undef process identifier would be interpreted as a request to kill process 0, which would terminate the current process group on POSIX systems. Since process identifiers are always integers, killing a non-numeric process is now fatal.

  • 5.10.0 inadvertently disabled an optimisation, which caused a measurable performance drop in list assignment, such as is often used to assign function parameters from @_. The optimisation has been re-instated, and the performance regression fixed. (This fix is also present in 5.10.1)

  • Fixed memory leak on while (1) { map 1, 1 } [RT #53038].

  • Some potential coredumps in PerlIO fixed [RT #57322,54828].

  • The debugger now works with lvalue subroutines.

  • The debugger's m command was broken on modules that defined constants [RT #61222].

  • crypt and string complement could return tainted values for untainted arguments [RT #59998].

  • The -i .suffix command-line switch now recreates the file using restricted permissions, before changing its mode to match the original file. This eliminates a potential race condition [RT #60904].

  • On some Unix systems, the value in $? would not have the top bit set ($? & 128) even if the child core dumped.

  • Under some circumstances, $^R could incorrectly become undefined [RT #57042].

  • In the XS API, various hash functions, when passed a pre-computed hash where the key is UTF-8, might result in an incorrect lookup.

  • XS code including XSUB.h before perl.h gave a compile-time error [RT #57176].

  • $object->isa('Foo') would report false if the package Foo didn't exist, even if the object's @ISA contained Foo.

  • Various bugs in the new-to-5.10.0 mro code, triggered by manipulating @ISA, have been found and fixed.

  • Bitwise operations on references could crash the interpreter, e.g. $x=\$y; $x |= "foo" [RT #54956].

  • Patterns including alternation might be sensitive to the internal UTF-8 representation, e.g.

    my $byte = chr(192);
    my $utf8 = chr(192); utf8::upgrade($utf8);
    $utf8 =~ /$byte|X}/i; # failed in 5.10.0

  • Within UTF8-encoded Perl source files (i.e. where use utf8 is in effect), double-quoted literal strings could be corrupted where a \xNN, \0NNN or \N{} is followed by a literal character with ordinal value greater than 255 [RT #59908].

  • B::Deparse failed to correctly deparse various constructs: readpipe STRING [RT #62428], CORE::require(STRING) [RT #62488], sub foo(_) [RT #62484].

  • Using setpgrp with no arguments could corrupt the perl stack.

  • The block form of eval is now specifically trappable by Safe and ops. Previously it was erroneously treated like string eval.

  • In 5.10.0, the two characters [~ were sometimes parsed as the smart match operator (~~) [RT #63854].

  • In 5.10.0, the * quantifier in patterns was sometimes treated as {0,32767} [RT #60034, #60464]. For example, this match would fail:

    ("ab" x 32768) =~ /^(ab)*$/

  • shmget was limited to a 32 bit segment size on a 64 bit OS [RT #63924].

  • Using next or last to exit a given block no longer produces a spurious warning like the following:

    Exiting given via last at foo.pl line 123

  • Assigning a format to a glob could corrupt the format; e.g.:

    *bar=*foo{FORMAT}; # foo format now bad

  • Attempting to coerce a typeglob to a string or number could cause an assertion failure. The correct error message is now generated, Can't coerce GLOB to $type.

  • Under use filetest 'access', -x was using the wrong access mode. This has been fixed [RT #49003].

  • length on a tied scalar that returned a Unicode value would not be correct the first time. This has been fixed.

  • Using an array tie inside an array tie could SEGV. This has been fixed. [RT #51636]

  • A race condition inside PerlIOStdio_close() has been identified and fixed. This used to cause various threading issues, including SEGVs.

  • In unpack, the use of () groups in scalar context was internally placing a list on the interpreter's stack, which manifested in various ways, including SEGVs. This is now fixed [RT #50256].

  • Magic was called twice in substr, \&$x, tie $x, $m and chop. These have all been fixed.

  • A 5.10.0 optimisation to clear the temporary stack within the implicit loop of s///ge has been reverted, as it turned out to be the cause of obscure bugs in seemingly unrelated parts of the interpreter [commit ef0d4e17921ee3de].

  • The line numbers for warnings inside elsif are now correct.

  • The .. operator now works correctly with ranges whose ends are at or close to the values of the smallest and largest integers.

  • binmode STDIN, ':raw' could lead to segmentation faults on some platforms. This has been fixed [RT #54828].

  • An off-by-one error meant that index $str, ... was effectively being executed as index "$str\0", ... . This has been fixed [RT #53746].

  • Various leaks associated with named captures in regexes have been fixed [RT #57024].

  • A weak reference to a hash would leak. This was affecting DBI [RT #56908].

  • Using (?|) in a regex could cause a segfault [RT #59734].

  • Use of a UTF-8 tr// within a closure could cause a segfault [RT #61520].

  • Calling Perl_sv_chop() or otherwise upgrading an SV could result in an unaligned 64-bit access on the SPARC architecture [RT #60574].

  • In the 5.10.0 release, inc_version_list would incorrectly list 5.10.* after 5.8.*; this affected the @INC search order [RT #67628].

  • In 5.10.0, pack "a*", $tainted_value returned a non-tainted value [RT #52552].

  • In 5.10.0, printf and sprintf could produce the fatal error panic: utf8_mg_pos_cache_update when printing UTF-8 strings [RT #62666].

  • In the 5.10.0 release, a dynamically created AUTOLOAD method might be missed (method cache issue) [RT #60220,60232].

  • In the 5.10.0 release, a combination of use feature and //ee could cause a memory leak [RT #63110].

  • -C on the shebang (#!) line is once more permitted if it is also specified on the command line. -C on the shebang line used to be a silent no-op if it was not also on the command line, so perl 5.10.0 disallowed it, which broke some scripts. Now perl checks whether it is also on the command line and only dies if it is not [RT #67880].

  • In 5.10.0, certain types of re-entrant regular expression could crash, or cause the following assertion failure [RT #60508]:

    Assertion rx->sublen >= (s - rx->subbeg) + i failed

  • Perl now includes previously missing files from the Unicode Character Database.

  • Perl now honors TMPDIR when opening an anonymous temporary file.
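One of the fixes above concerns $object->isa('Foo') when Foo is named in @ISA but was never defined as a package. A minimal sketch of the fixed behaviour follows; Child and Foo are hypothetical package names used only for illustration.

```perl
use strict;
use warnings;

# Foo is listed as a parent but is never defined anywhere.
package Child;
our @ISA = ('Foo');

package main;

my $object = bless {}, 'Child';

# With the fix, the answer follows @ISA even though package Foo is empty.
my $is_foo = $object->isa('Foo') ? 1 : 0;
print "isa('Foo'): $is_foo\n";
```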

Platform Specific Changes

Perl is incredibly portable. In general, if a platform has a C compiler, someone has ported Perl to it (or will soon). We're happy to announce that Perl 5.12 includes support for several new platforms. At the same time, it's time to bid farewell to some (very) old friends.

New Platforms

  • Haiku

    Perl's developers have merged patches from Haiku's maintainers. Perl should now build on Haiku.

  • MirOS BSD

    Perl should now build on MirOS BSD.

Discontinued Platforms

  • Domain/OS
  • MiNT
  • Tenon MachTen

Updated Platforms

  • AIX
    • Removed libbsd for AIX 5L and 6.1. Only flock() was used from libbsd.

    • Removed libgdbm for AIX 5L and 6.1 if libgdbm < 1.8.3-5 is installed. The libgdbm is delivered as an optional package with the AIX Toolbox. Unfortunately the versions below 1.8.3-5 are broken.

    • Hints changes mean that AIX 4.2 should work again.

  • Cygwin
    • Perl now supports IPv6 on Cygwin 1.7 and newer.

    • On Cygwin we now strip the last number from the DLL. This has been the behaviour in the cygwin.com build for years. The hints files have been updated.

  • Darwin (Mac OS X)
    • Skip testing the be_BY.CP1131 locale on Darwin 10 (Mac OS X 10.6), as it's still buggy.

    • Correct infelicities in the regexp used to identify buggy locales on Darwin 8 and 9 (Mac OS X 10.4 and 10.5, respectively).

  • DragonFly BSD
    • Fix thread library selection [perl #69686]

  • FreeBSD
    • The hints files now identify the correct threading libraries on FreeBSD 7 and later.

  • Irix
    • We now work around a bizarre preprocessor bug in the Irix 6.5 compiler: cc -E - unfortunately goes into K&R mode, but cc -E file.c doesn't.

  • NetBSD
    • Hints now supports versions 5.*.

  • OpenVMS
    • -UDEBUGGING is now the default on VMS.

      This matches what has long been the default everywhere else. Command-line selection of -UDEBUGGING and -DDEBUGGING now also works in configure.com; previously the only way to turn it off was by saying no in answer to the interactive question.

    • The default pipe buffer size on VMS has been updated to 8192 on 64-bit systems.

    • Reads from the in-memory temporary files of PerlIO::scalar used to fail if $/ was set to a numeric reference (to indicate record-style reads). This is now fixed.

    • VMS now supports getgrgid.

    • Many improvements and cleanups have been made to the VMS file name handling and conversion code.

    • Enabling the PERL_VMS_POSIX_EXIT logical name now encodes a POSIX exit status in a VMS condition value for better interaction with GNV's bash shell and other utilities that depend on POSIX exit values. See $? in perlvms for details.

    • File::Copy now detects Unix compatibility mode on VMS.

  • Stratus VOS
    • Various changes from Stratus have been merged in.

  • Symbian
    • There is now support for Symbian S60 3.2 SDK and S60 5.0 SDK.

  • Windows
    • Perl 5.12 supports Windows 2000 and later. The supporting code for legacy versions of Windows is still included, but will be removed during the next development cycle.

    • Initial support for building Perl with MinGW-w64 is now available.

    • perl.exe now includes a manifest resource to specify the trustInfo settings for Windows Vista and later. Without this setting Windows would treat perl.exe as a legacy application and apply various heuristics like redirecting access to protected file system areas (like the "Program Files" folder) to the user's "VirtualStore" instead of generating a proper "permission denied" error.

      The manifest resource also requests the Microsoft Common-Controls version 6.0 (themed controls introduced in Windows XP). Check out the Win32::VisualStyles module on CPAN to switch back to old style unthemed controls for legacy applications.

    • The -t filetest operator now only returns true if the filehandle is connected to a console window. In previous versions of Perl it would return true for all character mode devices, including NUL and LPT1.

    • The -p filetest operator now works correctly, and the Fcntl::S_IFIFO constant is defined when Perl is compiled with Microsoft Visual C. In previous Perl versions -p always returned a false value, and the Fcntl::S_IFIFO constant was not defined.

      This bug is specific to Microsoft Visual C and never affected Perl binaries built with MinGW.

    • The socket error codes are now more widely supported: the POSIX module will define the symbolic names, like POSIX::EWOULDBLOCK, and stringification of socket error codes in $! now works as well:

      C:\>perl -MPOSIX -E "$!=POSIX::EWOULDBLOCK; say $!"
      A non-blocking socket operation could not be completed immediately.

    • flock() will now set sensible error codes in $!. Previous Perl versions copied the value of $^E into $!, which caused much confusion.

    • select() now supports all empty fd_sets more correctly.

    • '.\foo' and '..\foo' were treated differently than './foo' and '../foo' by do and require [RT #63492].

    • Improved message window handling means that alarm and kill messages will no longer be dropped under race conditions.

    • Various bits of Perl's build infrastructure are no longer converted to win32 line endings at release time. If this hurts you, please report the problem with the perlbug program included with perl.
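The -p filetest mentioned in the Windows notes above can be checked with a portable script. This is a sketch only: it uses pipe and File::Temp (both assumptions made for illustration, not taken from the release notes) to obtain one handle that is a pipe and one that is a regular file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# One end of an anonymous pipe, and a handle on an ordinary temporary file.
pipe(my $reader, my $writer) or die "pipe failed: $!";
my ($file, $filename) = tempfile(UNLINK => 1);

# -p is true only for handles that refer to a pipe (FIFO).
my $pipe_is_fifo = (-p $reader) ? 1 : 0;
my $file_is_fifo = (-p $file)   ? 1 : 0;

print "pipe handle: $pipe_is_fifo\n";
print "file handle: $file_is_fifo\n";
```

On a perl affected by the Microsoft Visual C bug described above, the pipe handle would have been reported as false as well.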

Known Problems

This is a list of some significant unfixed bugs, which are regressions from either 5.10.x or 5.8.x.

  • Some CPANPLUS tests may fail if there is a functioning file ../../cpanp-run-perl outside your build directory. The failure shouldn't imply there's a problem with the actual functional software. The bug is already fixed in [RT #74188] and is scheduled for inclusion in perl-v5.12.1.

  • List::Util::first misbehaves in the presence of a lexical $_ (typically introduced by my $_ or implicitly by given). The variable which gets set for each iteration is the package variable $_, not the lexical $_ [RT #67694].

    A similar issue may occur in other modules that provide functions which take a block as their first argument, like

    foo { ... $_ ...} list

  • Some regexes may run much more slowly when run in a child thread compared with the thread the pattern was compiled into [RT #55600].

  • Things like "\N{LATIN SMALL LIGATURE FF}" =~ /\N{LATIN SMALL LETTER F}+/ will appear to hang as they get into a very long running loop [RT #72998].

  • Several porters have reported mysterious crashes when Perl's entire test suite is run after a build on certain Windows 2000 systems. When run by hand, the individual tests reportedly work fine.

Errata

  • This one is actually a change introduced in 5.10.0, but it was missed from that release's perldelta, so it is mentioned here instead.

    A bugfix related to the handling of the /m modifier and qr resulted in a change of behaviour between 5.8.x and 5.10.0:

    # matches in 5.8.x, doesn't match in 5.10.0
    $re = qr/^bar/; "foo\nbar" =~ /$re/m;
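The change described in this erratum can be observed by comparing a qr object compiled without /m against one compiled with it. The sketch below uses illustrative variable names; it shows that an outer /m does not reach into an already compiled qr object, so the modifier has to be given to qr itself.

```perl
use strict;
use warnings;

my $text = "foo\nbar";

my $re_plain = qr/^bar/;    # compiled without /m
my $re_multi = qr/^bar/m;   # /m is part of the compiled pattern

# The outer /m is not applied to the pattern inside $re_plain.
my $outer_m = ($text =~ /$re_plain/m) ? 1 : 0;

# Here ^ may match after the embedded newline.
my $inner_m = ($text =~ /$re_multi/) ? 1 : 0;

print "outer /m on plain qr: $outer_m\n";
print "qr compiled with /m:  $inner_m\n";
```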

Acknowledgements

Perl 5.12.0 represents approximately two years of development since Perl 5.10.0 and contains over 750,000 lines of changes across over 3,000 files from over 200 authors and committers.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.12.0:

Aaron Crane, Abe Timmerman, Abhijit Menon-Sen, Abigail, Adam Russell, Adriano Ferreira, Ævar Arnfjörð Bjarmason, Alan Grover, Alexandr Ciornii, Alex Davies, Alex Vandiver, Andreas Koenig, Andrew Rodland, andrew@sundale.net, Andy Armstrong, Andy Dougherty, Jose AUGUSTE-ETIENNE, Benjamin Smith, Ben Morrow, bharanee rathna, Bo Borgerson, Bo Lindbergh, Brad Gilbert, Bram, Brendan O'Dea, brian d foy, Charles Bailey, Chip Salzenberg, Chris 'BinGOs' Williams, Christoph Lamprecht, Chris Williams, chromatic, Claes Jakobsson, Craig A. Berry, Dan Dascalescu, Daniel Frederick Crisman, Daniel M. Quinlan, Dan Jacobson, Dan Kogai, Dave Mitchell, Dave Rolsky, David Cantrell, David Dick, David Golden, David Mitchell, David M. Syzdek, David Nicol, David Wheeler, Dennis Kaarsemaker, Dintelmann, Peter, Dominic Dunlop, Dr.Ruud, Duke Leto, Enrico Sorcinelli, Eric Brine, Father Chrysostomos, Florian Ragwitz, Frank Wiegand, Gabor Szabo, Gene Sullivan, Geoffrey T. Dairiki, George Greer, Gerard Goossen, Gisle Aas, Goro Fuji, Graham Barr, Green, Paul, Hans Dieter Pearcey, Harmen, H. Merijn Brand, Hugo van der Sanden, Ian Goodacre, Igor Sutton, Ingo Weinhold, James Bence, James Mastros, Jan Dubois, Jari Aalto, Jarkko Hietaniemi, Jay Hannah, Jerry Hedden, Jesse Vincent, Jim Cromie, Jody Belka, John E. Malmberg, John Malmberg, John Peacock, John Peacock via RT, John P. Linderman, John Wright, Josh ben Jore, Jos I. Boumans, Karl Williamson, Kenichi Ishigaki, Ken Williams, Kevin Brintnall, Kevin Ryde, Kurt Starsinic, Leon Brocard, Lubomir Rintel, Luke Ross, Marcel Grünauer, Marcus Holland-Moritz, Mark Jason Dominus, Marko Asplund, Martin Hasch, Mashrab Kuvatov, Matt Kraai, Matt S Trout, Max Maischein, Michael Breen, Michael Cartmell, Michael G Schwern, Michael Witten, Mike Giroux, Milosz Tanski, Moritz Lenz, Nicholas Clark, Nick Cleaton, Niko Tyni, Offer Kaye, Osvaldo Villalon, Paul Fenwick, Paul Gaborit, Paul Green, Paul Johnson, Paul Marquess, Philip Hazel, Philippe Bruhat, Rafael Garcia-Suarez, Rainer Tammer, Rajesh Mandalemula, Reini Urban, Renée Bäcker, Ricardo Signes, Ricardo SIGNES, Richard Foley, Rich Rauenzahn, Rick Delaney, Risto Kankkunen, Robert May, Roberto C. Sanchez, Robin Barker, SADAHIRO Tomoyuki, Salvador Ortiz Garcia, Sam Vilain, Scott Lanning, Sébastien Aperghis-Tramoni, Sérgio Durigan Júnior, Shlomi Fish, Simon 'corecode' Schubert, Sisyphus, Slaven Rezic, Smylers, Steffen Müller, Steffen Ullrich, Stepan Kasal, Steve Hay, Steven Schubiger, Steve Peters, Tels, The Doctor, Tim Bunce, Tim Jenness, Todd Rinaldo, Tom Christiansen, Tom Hukins, Tom Wyant, Tony Cook, Torsten Schoenfeld, Tye McQueen, Vadim Konovalov, Vincent Pit, Hio YAMASHINA, Yasuhiro Matsumoto, Yitzchak Scott-Thoennes, Yuval Kogman, Yves Orton, Zefram, Zsban Ambrus

This is woefully incomplete as it's automatically generated from version control history. In particular, it doesn't include the names of the (very much appreciated) contributors who reported issues in previous versions of Perl that helped make Perl 5.12.0 better. For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl 5.12.0 distribution.

Our "retired" pumpkings Nicholas Clark and Rafael Garcia-Suarez deserve special thanks for their brilliant and substantive ongoing contributions. Nicholas personally authored over 30% of the patches since 5.10.0. Rafael comes in second in patch authorship with 11%, but is first by a long shot in committing patches authored by others, pushing 44% of the commits since 5.10.0 in this category, often after providing considerable coaching to the patch authors. These statistics in no way comprise all of their contributions, but express in shorthand that we couldn't have done it without them.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/. There may also be information at http://www.perl.org/, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analyzed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

http://dev.perl.org/perl5/errata.html for a list of issues found after this release, as well as a list of CPAN modules known to be incompatible with this release.

 
perl5121delta

NAME

perl5121delta - what is new for perl v5.12.1

DESCRIPTION

This document describes differences between the 5.12.0 release and the 5.12.1 release.

If you are upgrading from an earlier release such as 5.10.1, first read perl5120delta, which describes differences between 5.10.1 and 5.12.0.

Incompatible Changes

There are no changes intentionally incompatible with 5.12.0. If any incompatibilities with 5.12.0 exist, they are bugs. Please report them.

Core Enhancements

Other than the bug fixes listed below, there should be no user-visible changes to the core language in this release.

Modules and Pragmata

Pragmata Changes

  • We fixed exporting of is_strict and is_lax from version.

    These were being exported with a wrapper that treated them as method calls, which caused them to fail. They are just functions, are documented as such, and should never be subclassed, so this patch just exports them directly as functions without the wrapper.
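With the export fixed, is_strict and is_lax can be imported and called as plain functions. The following is a minimal sketch; the candidate version strings are illustrative choices, not examples taken from the version documentation.

```perl
use strict;
use warnings;
use version qw(is_lax is_strict);

# Candidate strings chosen for illustration only.
for my $candidate ('1.2', 'v1.2.3', 'v1.2', '1.02_03') {
    printf "%-8s strict=%d lax=%d\n",
        $candidate,
        is_strict($candidate) ? 1 : 0,
        is_lax($candidate)    ? 1 : 0;
}
```

The strict rules are a subset of the lax ones, so a string such as 'v1.2' (fewer than three dotted components) or '1.02_03' (an underscore) is accepted only by is_lax.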

Updated Modules

  • We upgraded CGI.pm to version 3.49 to incorporate fixes for regressions introduced in the release we shipped with Perl 5.12.0.

  • We upgraded Pod::Simple to version 3.14 to get an improvement to C<< >> parsing.

  • We made a small fix to the CPANPLUS test suite to fix an occasional spurious test failure.

  • We upgraded Safe to version 2.27 to wrap coderefs returned by reval() and rdo().

Changes to Existing Documentation

  • We added the new maintenance release policy to perlpolicy.pod

  • We've clarified the multiple-angle-bracket construct in the spec for POD in perlpodspec

  • We added a missing explanation for a warning about := to perldiag.pod

  • We removed a false claim in perlunitut that all text strings are Unicode strings in Perl.

  • We updated the Github mirror link in perlrepository to mirrors/perl, not github/perl

  • We fixed a minor error in perl5114delta.pod.

  • We replaced a mention of the now-obsolete Switch.pm with given/when.

  • We improved documentation about $sitelibexp/sitecustomize.pl in perlrun.

  • We corrected perlmodlib.pod which had unintentionally omitted a number of modules.

  • We updated the documentation for 'require' in perlfunc.pod relating to putting Perl code in @INC.

  • We reinstated some erroneously-removed documentation about quotemeta in perlfunc.

  • We fixed an a2p example in perlutil.pod.

  • We filled in a blank in perlport.pod with the release date of Perl 5.12.

  • We fixed broken links in a number of perldelta files.

  • We corrected the documentation for Carp.pm, which incorrectly stated that the $Carp::Verbose variable makes cluck generate stack backtraces.

  • We fixed a number of typos in Pod::Functions

  • We improved documentation of case-changing functions in perlfunc.pod

  • We corrected perlgpl.pod to contain the correct version of the GNU General Public License.

Testing

Testing Improvements

  • t/op/sselect.t is now less prone to clock jitter during timing checks on Windows.

    sleep() time on Win32 may be rounded down to a multiple of the clock tick interval.

  • lib/blib.t and lib/locale.t: Fixes for test failures on Darwin/PPC

  • perl5db.t: Fix for test failures when Term::ReadLine::Gnu is installed.

Installation and Configuration Improvements

Configuration improvements

  • We updated INSTALL with notes about how to deal with broken dbm.h on OpenSUSE (and possibly other platforms)

Bug Fixes

Platform Specific Notes

HP-UX

  • Perl now allows -Duse64bitint without promoting to use64bitall on HP-UX

AIX

  • Perl now builds on AIX 4.2

    The changes required work around AIX 4.2's lack of support for IPv6 and its limited support for POSIX sigaction().

FreeBSD 7

  • FreeBSD 7 no longer contains /usr/bin/objformat. At build time, Perl now skips the objformat check for versions 7 and higher and assumes ELF.

VMS

  • It's now possible to build extensions on older (pre 7.3-2) VMS systems.

    DCL symbol length was limited to 1K up until about seven years or so ago, but there was no particularly deep reason to prevent those older systems from configuring and building Perl.

  • We fixed the previously-broken -Uuseperlio build on VMS.

    We were checking a variable that doesn't exist in the non-default case of disabling perlio. Now we only look at it when it exists.

  • We fixed the -Uuseperlio command-line option in configure.com.

    Formerly it only worked if you went through all the questions interactively and explicitly answered no.

Known Problems

  • List::Util::first misbehaves in the presence of a lexical $_ (typically introduced by my $_ or implicitly by given). The variable which gets set for each iteration is the package variable $_, not the lexical $_.

    A similar issue may occur in other modules that provide functions which take a block as their first argument, like

        foo { ... $_ ...} list

    See also: http://rt.perl.org/rt3/Public/Bug/Display.html?id=67694

  • Module::Load::Conditional and version have an unfortunate interaction which can cause CPANPLUS to crash when it encounters an unparseable version string. Upgrading to CPANPLUS 0.9004 or Module::Load::Conditional 0.38 from CPAN will resolve this issue.

Acknowledgements

Perl 5.12.1 represents approximately four weeks of development since Perl 5.12.0 and contains approximately 4,000 lines of changes across 142 files from 28 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.12.1:

Ævar Arnfjörð Bjarmason, Chris Williams, chromatic, Craig A. Berry, David Golden, Father Chrysostomos, Florian Ragwitz, Frank Wiegand, Gene Sullivan, Goro Fuji, H.Merijn Brand, James E Keenan, Jan Dubois, Jesse Vincent, Josh ben Jore, Karl Williamson, Leon Brocard, Michael Schwern, Nga Tang Chan, Nicholas Clark, Niko Tyni, Philippe Bruhat, Rafael Garcia-Suarez, Ricardo Signes, Steffen Mueller, Todd Rinaldo, Vincent Pit and Zefram.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5122delta

NAME

perl5122delta - what is new for perl v5.12.2

DESCRIPTION

This document describes differences between the 5.12.1 release and the 5.12.2 release.

If you are upgrading from an earlier major version, such as 5.10.1, first read perl5120delta, which describes differences between 5.10.1 and 5.12.0, as well as perl5121delta, which describes earlier changes in the 5.12 stable release series.

Incompatible Changes

There are no changes intentionally incompatible with 5.12.1. If any exist, they are bugs and reports are welcome.

Core Enhancements

Other than the bug fixes listed below, there should be no user-visible changes to the core language in this release.

Modules and Pragmata

New Modules and Pragmata

This release does not introduce any new modules or pragmata.

Pragmata Changes

In the previous release, no VERSION; statements triggered a bug which could cause feature bundles to be loaded and strict mode to be enabled unintentionally.

Updated Modules

  • Carp

    Upgraded from version 1.16 to 1.17.

    Carp now detects incomplete caller EXPR overrides and avoids using bogus @DB::args. To provide backtraces, Carp relies on particular behaviour of the caller built-in. Carp now detects if other code has overridden this with an incomplete implementation, and modifies its backtrace accordingly. Previously incomplete overrides would cause incorrect values in backtraces (best case), or obscure fatal errors (worst case).

    This fixes certain cases of Bizarre copy of ARRAY caused by modules overriding caller() incorrectly.

  • CPANPLUS

    A patch to cpanp-run-perl has been backported from CPANPLUS 0.9004. This resolves RT #55964 and RT #57106, both of which related to failures to install distributions that use Module::Install::DSL.

  • File::Glob

    A regression that caused a crash when CORE::GLOBAL::glob could not be found after loading File::Glob has been fixed. Perl now correctly falls back to external globbing via pp_glob.

  • File::Copy

    File::Copy::copy(FILE, DIR) is now documented.

  • File::Spec

    Upgraded from version 3.31 to 3.31_01.

    Several portability fixes were made in File::Spec::VMS: a colon is now recognized as a delimiter in native filespecs; caret-escaped delimiters are recognized for better handling of extended filespecs; catpath() returns an empty directory rather than the current directory if the input directory name is empty; abs2rel() properly handles Unix-style input.
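The newly documented File::Copy::copy(FILE, DIR) form can be sketched as follows. The file name notes.txt and the temporary directories are illustrative assumptions; the point is that when the destination is an existing directory, the copy keeps the base name of the source file.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;
use File::Temp qw(tempdir);

# Two scratch directories that are removed when the program exits.
my $src_dir = tempdir(CLEANUP => 1);
my $dst_dir = tempdir(CLEANUP => 1);

# Create a small source file to copy.
my $src = File::Spec->catfile($src_dir, 'notes.txt');
open my $out, '>', $src or die "Cannot write $src: $!";
print {$out} "hello\n";
close $out or die "Cannot close $src: $!";

# The destination is a directory, so copy() reuses the source file's base name.
copy($src, $dst_dir) or die "Copy failed: $!";

my $copied = File::Spec->catfile($dst_dir, 'notes.txt');
print "copied to $copied\n" if -e $copied;
```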

Utility Changes

  • perlbug now always gives the reporter a chance to change the email address it guesses for them.

  • perlbug should no longer warn about uninitialized values when using the -d and -v options.

Changes to Existing Documentation

  • The existing policy on backward-compatibility and deprecation has been added to perlpolicy, along with definitions of terms like deprecation.

  • srand's usage has been clarified.

  • The entry for die was reorganized to emphasize its role in the exception mechanism.

  • Perl's INSTALL file has been clarified to explicitly state that Perl requires a C89 compliant ANSI C Compiler.

  • IO::Socket's getsockopt() and setsockopt() have been documented.

  • alarm()'s inability to interrupt blocking IO on Windows has been documented.

  • Math::TrulyRandom hasn't been updated since 1996 and has been removed as a recommended solution for random number generation.

  • perlrun has been updated to clarify the behaviour of octal flags to perl.

  • To ease user confusion, $# and $*, two special variables that were removed in earlier versions of Perl, have been documented.

  • The version of perlfaq shipped with the Perl core has been updated from the official FAQ version, which is now maintained in the briandfoy/perlfaq branch of the Perl repository at git://perl5.git.perl.org/perl.git.

Installation and Configuration Improvements

Configuration improvements

  • The d_u32align configuration probe on ARM has been fixed.

Compilation improvements

  • An "incompatible operand types" error in ternary expressions when building with clang has been fixed.

  • Perl now skips setuid File::Copy tests on partitions it detects to be mounted as nosuid.

Selected Bug Fixes

  • A possible segfault in the T_PRTOBJ default typemap has been fixed.

  • A possible memory leak when using caller EXPR to set @DB::args has been fixed.

  • Several memory leaks when loading XS modules were fixed.

  • unpack() now handles scalar context correctly for %32H and %32u, fixing a potential crash. Previously, split() would crash because the third item on the stack wasn't the regular expression it expected. unpack("%2H", ...) would return both the unpacked result and the checksum on the stack, as would unpack("%2u", ...). [perl #73814]

  • Perl now avoids using memory after calling free() in pp_require when there are CODEREFs in @INC.

  • A bug that could cause "Unknown error" messages when "call_sv(code, G_EVAL)" is called from an XS destructor has been fixed.

  • The implementation of the open $fh, '>', \$buffer feature now supports get/set magic and thus tied buffers correctly.

  • The pp_getc, pp_tell, and pp_eof opcodes now make room on the stack for their return values in cases where no argument was passed in.

  • When matching Unicode strings, under some conditions inappropriate backtracking would result in a Malformed UTF-8 character (fatal) error. This should no longer occur. See [perl #75680].
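For readers unfamiliar with the in-memory filehandle feature mentioned in the list above, here is a minimal sketch using an ordinary (untied) scalar as the buffer. The variable names are illustrative; the fix described above concerns tied and magical buffers, which this sketch does not exercise.

```perl
use strict;
use warnings;

# Passing a reference to a scalar makes open() write into that scalar.
my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
print {$fh} "line one\n";
print {$fh} "line two\n";
close $fh or die "Cannot close in-memory handle: $!";

print length($buffer), " bytes captured\n";
```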

Platform Specific Notes

AIX

  • README.aix has been updated with information about the XL C/C++ V11 compiler suite.

Windows

  • When building Perl with the mingw64 x64 cross-compiler, the incpath, libpth, ldflags, lddlflags and ldflags_nolargefiles values in Config.pm and Config_heavy.pl were not previously being set correctly because, with that compiler, the include and lib directories are not immediately below $(CCHOME).

VMS

  • git_version.h is now installed on VMS. This was an oversight in v5.12.0 which caused some extensions to fail to build.

  • Several memory leaks in stat FILEHANDLE have been fixed.

  • A memory leak in Perl_rename() due to a double allocation has been fixed.

  • A memory leak in vms_fid_to_name() (used by realpath() and realname()) has been fixed.

Acknowledgements

Perl 5.12.2 represents approximately three months of development since Perl 5.12.1 and contains approximately 2,000 lines of changes across 100 files from 36 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.12.2:

Abigail, Ævar Arnfjörð Bjarmason, Ben Morrow, brian d foy, Brian Phillips, Chas. Owens, Chris 'BinGOs' Williams, Chris Williams, Craig A. Berry, Curtis Jewell, Dan Dascalescu, David Golden, David Mitchell, Father Chrysostomos, Florian Ragwitz, George Greer, H.Merijn Brand, Jan Dubois, Jesse Vincent, Jim Cromie, Karl Williamson, Lars Dɪᴇᴄᴋᴏᴡ 迪拉斯, Leon Brocard, Maik Hentsche, Matt S Trout, Nicholas Clark, Rafael Garcia-Suarez, Rainer Tammer, Ricardo Signes, Salvador Ortiz Garcia, Sisyphus, Slaven Rezic, Steffen Mueller, Tony Cook, Vincent Pit and Yves Orton.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5123delta

NAME

perl5123delta - what is new for perl v5.12.3

DESCRIPTION

This document describes differences between the 5.12.2 release and the 5.12.3 release.

If you are upgrading from an earlier release such as 5.12.1, first read perl5122delta, which describes differences between 5.12.1 and 5.12.2. The major changes made in 5.12.0 are described in perl5120delta.

Incompatible Changes

There are no changes intentionally incompatible with 5.12.2. If any exist, they are bugs and reports are welcome.

Core Enhancements

keys, values work on arrays

You can now use the keys, values, each builtin functions on arrays (previously you could only use them on hashes). See perlfunc for details. This is actually a change introduced in perl 5.12.0, but it was missed from that release's perldelta.
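A short sketch of the array forms, assuming Perl 5.12 or later (the array contents are illustrative):

```perl
use strict;
use warnings;

my @colours = qw(red green blue);

# On an array, keys returns the indices and values returns the elements.
my @indices = keys @colours;      # 0, 1, 2
my @items   = values @colours;    # red, green, blue

# each returns (index, element) pairs until the array is exhausted.
while (my ($index, $colour) = each @colours) {
    print "$index: $colour\n";
}
```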

Bug Fixes

"no VERSION" will now correctly deparse with B::Deparse, as will certain constant expressions.

Module::Build should now more reliably pass its tests under Cygwin.

Lvalue subroutines are again able to return copy-on-write scalars. This had been broken since version 5.10.0.

Platform Specific Notes

  • Solaris

    A separate DTrace is now built for miniperl, which means that perl can be compiled with -Dusedtrace on Solaris again.

  • VMS

    A number of regressions on VMS have been fixed. In addition to minor cleanup of questionable expressions in vms.c, file permissions should no longer be garbled by the PerlIO layer, and spurious record boundaries should no longer be introduced by the PerlIO layer during output.

    For more details and discussion on the latter, see:

        http://www.nntp.perl.org/group/perl.vmsperl/2010/11/msg15419.html

  • VOS

    A few very small changes were made to the build process on VOS to better support the platform. Longer-than-32-character filenames are now supported on OpenVOS, and Perl now builds properly without IPv6 support.

Acknowledgements

Perl 5.12.3 represents approximately four months of development since Perl 5.12.2 and contains approximately 2500 lines of changes across 54 files from 16 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.12.3:

Craig A. Berry, David Golden, David Leadbeater, Father Chrysostomos, Florian Ragwitz, Jesse Vincent, Karl Williamson, Nick Johnston, Nicolas Kaiser, Paul Green, Rafael Garcia-Suarez, Rainer Tammer, Ricardo Signes, Steffen Mueller, Zsbán Ambrus, Ævar Arnfjörð Bjarmason

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5124delta

NAME

perl5124delta - what is new for perl v5.12.4

DESCRIPTION

This document describes differences between the 5.12.3 release and the 5.12.4 release.

If you are upgrading from an earlier release such as 5.12.2, first read perl5123delta, which describes differences between 5.12.2 and 5.12.3. The major changes made in 5.12.0 are described in perl5120delta.

Incompatible Changes

There are no changes intentionally incompatible with 5.12.3. If any exist, they are bugs and reports are welcome.

Selected Bug Fixes

When strict "refs" mode is off, %{...} in rvalue context returns undef if its argument is undefined. An optimisation introduced in Perl 5.12.0 to make keys %{...} faster when used as a boolean did not take this into account, causing keys %{+undef} (and keys %$foo when $foo is undefined) to be an error, which it should be in strict mode only [perl #81750].

lc, uc, lcfirst, and ucfirst no longer return untainted strings when the argument is tainted. This has been broken since perl 5.8.9 [perl #87336].

Fixed a case where it was possible that a freed buffer may have been read from when parsing a here document.

Modules and Pragmata

Module::CoreList has been upgraded from version 2.43 to 2.50.

Testing

The cpan/CGI/t/http.t test script has been fixed to work when the environment has HTTPS_* environment variables, such as HTTPS_PROXY.

Documentation

Updated the documentation for rand() in perlfunc to note that it is not cryptographically secure.

Platform Specific Notes

  • Linux

    Support Ubuntu 11.04's new multi-arch library layout.

Acknowledgements

Perl 5.12.4 represents approximately 5 months of development since Perl 5.12.3 and contains approximately 200 lines of changes across 11 files from 8 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.12.4:

Andy Dougherty, David Golden, David Leadbeater, Father Chrysostomos, Florian Ragwitz, Jesse Vincent, Leon Brocard, Zsbán Ambrus.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5125delta

NAME

perl5125delta - what is new for perl v5.12.5

DESCRIPTION

This document describes differences between the 5.12.4 release and the 5.12.5 release.

If you are upgrading from an earlier release such as 5.12.3, first read perl5124delta, which describes differences between 5.12.3 and 5.12.4.

Security

Encode decode_xs n-byte heap-overflow (CVE-2011-2939)

A bug in Encode could, on certain inputs, cause the heap to overflow. This problem has been corrected. Bug reported by Robert Zacek.

File::Glob::bsd_glob() memory error with GLOB_ALTDIRFUNC (CVE-2011-2728)

Calling File::Glob::bsd_glob with the unsupported flag GLOB_ALTDIRFUNC would cause an access violation / segfault. A Perl program that accepts a flags value from an external source could expose itself to denial of service or arbitrary code execution attacks. There are no known exploits in the wild. The problem has been corrected by explicitly disabling all unsupported flags and setting unused function pointers to null. Bug reported by Clément Lecigne.

Heap buffer overrun in 'x' string repeat operator (CVE-2012-5195)

Poorly written perl code that allows an attacker to specify the count to perl's 'x' string repeat operator can already cause a memory exhaustion denial-of-service attack. A flaw in versions of perl before 5.15.5 can escalate that into a heap buffer overrun; coupled with versions of glibc before 2.16, it possibly allows the execution of arbitrary code.

This problem has been fixed.
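Independently of the interpreter fix, code that takes a repeat count from outside can guard against the memory-exhaustion case described above. The following is only a sketch: the safe_repeat helper and the $MAX_REPEAT limit are hypothetical names chosen for illustration, not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical application-level ceiling on repeat counts.
my $MAX_REPEAT = 1_000;

# Hypothetical helper: validate an externally supplied count before using 'x'.
sub safe_repeat {
    my ($string, $count) = @_;
    die "repeat count must be a non-negative integer\n"
        unless defined $count && $count =~ /\A[0-9]+\z/;
    die "repeat count exceeds limit of $MAX_REPEAT\n"
        if $count > $MAX_REPEAT;
    return $string x $count;
}

print safe_repeat('-', 20), "\n";
```

An appropriate limit depends on the application; the value here is arbitrary.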

Incompatible Changes

There are no changes intentionally incompatible with 5.12.4. If any exist, they are bugs and reports are welcome.

Modules and Pragmata

Updated Modules

B::Concise

B::Concise no longer produces mangled output with the -tree option [perl #80632].

charnames

A regression introduced in Perl 5.8.8 that caused charnames::viacode(0) to return undef instead of the string "NULL" has been fixed [perl #72624].
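A small check of the corrected behaviour might look like this (a sketch; the printed wording is illustrative):

```perl
use strict;
use warnings;
use charnames ':full';

# viacode() maps a code point to its character name.
my $name = charnames::viacode(0);
print defined $name ? "U+0000 is named $name\n" : "U+0000 has no name\n";
```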

Encode has been upgraded from version 2.39 to version 2.39_01.

See Security.

File::Glob has been upgraded from version 1.07 to version 1.07_01.

See Security.

Unicode::UCD

The documentation for the upper function now actually says "upper", not "lower".

Module::CoreList

Module::CoreList has been updated to version 2.50_02 to add data for this release.

Changes to Existing Documentation

perlebcdic

The perlebcdic document contains a helpful table to use in tr/// to convert between EBCDIC and Latin1/ASCII. Unfortunately, the table was the inverse of the one it describes. This has been corrected.

perlunicode

The section on User-Defined Case Mappings had some bad markup and unclear sentences, making parts of it unreadable. This has been rectified.

perluniprops

This document has been corrected to take non-ASCII platforms into account.

Installation and Configuration Improvements

Platform Specific Changes

  • Mac OS X

    There have been configuration and test fixes to make Perl build cleanly on Lion and Mountain Lion.

  • NetBSD

    The NetBSD hints file was corrected to be compatible with NetBSD 6.*

Selected Bug Fixes

  • chop now correctly handles characters above "\x{7fffffff}" [perl #73246].

  • ($<,$>) = (...) stopped working properly in 5.12.0. It is supposed to make a single setreuid() call, rather than calling setruid() and seteuid() separately. Consequently it did not work properly. This has been fixed [perl #75212].

  • Fixed a regression of kill() when a match variable is used for the process ID to kill [perl #75812].

  • UNIVERSAL::VERSION no longer leaks memory. It started leaking in Perl 5.10.0.

  • The C-level my_strftime function no longer leaks memory. This fixes a memory leak in POSIX::strftime [perl #73520].

  • caller no longer leaks memory when called from the DB package if @DB::args was assigned to after the first call to caller. Carp was triggering this bug [perl #97010].

  • Passing to index an offset beyond the end of the string when the string is encoded internally in UTF-8 no longer causes panics [perl #75898].

  • Syntax errors in (?{...}) blocks in regular expressions no longer cause panic messages [perl #2353].

  • Perl 5.10.0 introduced some faulty logic that made "U*" in the middle of a pack template equivalent to "U0" if the input string was empty. This has been fixed [perl #90160].
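To illustrate the index case from the list above, here is a minimal sketch. The sample string is an assumption; it contains a character above U+00FF, so Perl stores it internally as UTF-8.

```perl
use strict;
use warnings;

# U+263A forces the string to be held internally as UTF-8.
my $text = "na\x{EF}ve caf\x{E9} \x{263A}";

# Ask index() to start searching well past the end of the string.
my $pos = index($text, 'caf', length($text) + 10);
print "position: $pos\n";
```

With the fix in place the call simply reports that the substring was not found from that position.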

Errata

split() and @_

split() no longer modifies @_ when called in scalar or void context. In void context it now produces a "Useless use of split" warning. This is actually a change introduced in perl 5.12.0, but it was missed from that release's perl5120delta.
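The scalar-context behaviour can be sketched like this, assuming Perl 5.12 or later (count_fields is a hypothetical helper name):

```perl
use strict;
use warnings;

# Hypothetical helper: counts comma-separated fields in a line.
sub count_fields {
    my ($line) = @_;
    # In scalar context split returns the number of fields and leaves @_ alone.
    my $count = split /,/, $line;
    return ($count, scalar @_);
}

my ($fields, $args_left) = count_fields('a,b,c');
print "fields=$fields args=$args_left\n";
```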

Acknowledgements

Perl 5.12.5 represents approximately 17 months of development since Perl 5.12.4 and contains approximately 1,900 lines of changes across 64 files from 18 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.12.5:

Andy Dougherty, Chris 'BinGOs' Williams, Craig A. Berry, David Mitchell, Dominic Hargreaves, Father Chrysostomos, Florian Ragwitz, George Greer, Goro Fuji, Jesse Vincent, Karl Williamson, Leon Brocard, Nicholas Clark, Rafael Garcia-Suarez, Reini Urban, Ricardo Signes, Steve Hay, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5140delta

NAME

perl5140delta - what is new for perl v5.14.0

DESCRIPTION

This document describes differences between the 5.12.0 release and the 5.14.0 release.

If you are upgrading from an earlier release such as 5.10.0, first read perl5120delta, which describes differences between 5.10.0 and 5.12.0.

Some of the bug fixes in this release have been backported to subsequent releases of 5.12.x. Those are indicated with the 5.12.x version in parentheses.

Notice

As described in perlpolicy, the release of Perl 5.14.0 marks the official end of support for Perl 5.10. Users of Perl 5.10 or earlier should consider upgrading to a more recent release of Perl.

Core Enhancements

Unicode

Unicode Version 6.0 is now supported (mostly)

Perl comes with the Unicode 6.0 database updated with Corrigendum #8, with one exception noted below. See http://unicode.org/versions/Unicode6.0.0/ for details on the new release. Perl does not support any Unicode provisional properties, including the new ones for this release.

Unicode 6.0 has chosen to use the name BELL for the character at U+1F514, which is a symbol that looks like a bell, and is used in Japanese cell phones. This conflicts with the long-standing Perl usage of having BELL mean the ASCII BEL character, U+0007. In Perl 5.14, \N{BELL} continues to mean U+0007, but its use generates a deprecation warning message unless such warnings are turned off. The new name for U+0007 in Perl is ALERT, which corresponds nicely with the existing shorthand sequence for it, "\a". \N{BEL} means U+0007, with no warning given. The character at U+1F514 has no name in 5.14, but can be referred to by \N{U+1F514}. In Perl 5.16, \N{BELL} will refer to U+1F514; all code that uses \N{BELL} should be converted to use \N{ALERT}, \N{BEL}, or "\a" before upgrading.

Full functionality for use feature 'unicode_strings'

This release provides full functionality for use feature 'unicode_strings' . Under its scope, all string operations executed and regular expressions compiled (even if executed outside its scope) have Unicode semantics. See the 'unicode_strings' feature in feature. However, see Inverted bracketed character classes and multi-character folds, below.

This feature avoids most forms of the "Unicode Bug" (see The Unicode Bug in perlunicode for details). If there is any possibility that your code will process Unicode strings, you are strongly encouraged to use this subpragma to avoid nasty surprises.
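As a rough illustration of the scoping described above, the following sketch matches the same native 8-bit string against \w outside and inside the feature's lexical scope. The variable names are made up for this example, and it assumes no use locale is in effect.

```perl
use strict;
use warnings;

# A native 8-bit string holding code point 0xE9
# (LATIN SMALL LETTER E WITH ACUTE); it is not UTF-8 encoded internally.
my $e_acute = "\xE9";

# Outside the feature's scope the character is not treated as a word
# character, which is one symptom of the "Unicode Bug".
my $plain = ($e_acute =~ /\w/) ? 1 : 0;

my $unicode;
{
    use feature 'unicode_strings';
    # Inside the scope the regex is compiled with Unicode semantics.
    $unicode = ($e_acute =~ /\w/) ? 1 : 0;
}

print "without feature: $plain, with feature: $unicode\n";
```

Because the feature is lexically scoped, it can be enabled one block or one file at a time while older code is being audited.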

\N{NAME} and charnames enhancements

  • \N{NAME} and charnames::vianame now know about the abbreviated character names listed by Unicode, such as NBSP, SHY, LRO, ZWJ, etc.; all customary abbreviations for the C0 and C1 control characters (such as ACK, BEL, CAN, etc.); and a few new variants of some C1 full names that are in common usage.

  • Unicode has several named character sequences, in which particular sequences of code points are given names. \N{NAME} now recognizes these.

  • \N{NAME}, charnames::vianame, and charnames::viacode now know about every character in Unicode. In earlier releases of Perl, they didn't know about the Hangul syllables or several CJK (Chinese/Japanese/Korean) characters.

  • It is now possible to override Perl's abbreviations with your own custom aliases.

  • You can now create a custom alias of the ordinal of a character, known by \N{NAME}, charnames::vianame() , and charnames::viacode() . Previously, aliases had to be to official Unicode character names. This made it impossible to create an alias for unnamed code points, such as those reserved for private use.

  • The new function charnames::string_vianame() is a run-time version of \N{NAME}, returning the string of characters whose Unicode name is its parameter. It can handle Unicode named character sequences, whereas the pre-existing charnames::vianame() cannot, as the latter returns a single code point.

See charnames for details on all these changes.
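The difference between the two lookup functions might be sketched as follows. The variable names are illustrative only, and the sequence name is the one used as an example in the charnames documentation.

```perl
use strict;
use warnings;
use charnames ':full';

my $name = "LATIN SMALL LETTER A";

# vianame() returns the code point as a number ...
my $code = charnames::vianame($name);

# ... while string_vianame() returns the character(s) as a string.
my $char = charnames::string_vianame($name);

# A named sequence covers more than one code point, so only
# string_vianame() can represent it.
my $seq = charnames::string_vianame(
    "LATIN CAPITAL LETTER E WITH VERTICAL LINE BELOW");

printf "code point: %d, character: %s, sequence length: %d\n",
    $code, $char, length $seq;
```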

New warnings categories for problematic (non-)Unicode code points.

Three new warnings subcategories of "utf8" have been added. These allow you to turn off some "utf8" warnings, while allowing other warnings to remain on. The three categories are: surrogate when UTF-16 surrogates are encountered; nonchar when Unicode non-character code points are encountered; and non_unicode when code points above the legal Unicode maximum of 0x10FFFF are encountered.

Any unsigned value can be encoded as a character

With this release, Perl is adopting a model that any unsigned value can be treated as a code point and encoded internally (as utf8) without warnings, not just the code points that are legal in Unicode. However, unless utf8 or the corresponding sub-category (see previous item) of lexical warnings have been explicitly turned off, outputting or executing a Unicode-defined operation such as upper-casing on such a code point generates a warning. Attempting to input these using strict rules (such as with the :encoding(UTF-8) layer) will continue to fail. Prior to this release, handling was inconsistent and in places, incorrect.

Unicode non-characters, some of which previously were erroneously considered illegal in places by Perl, contrary to the Unicode Standard, are now always legal internally. Inputting or outputting them works the same as with the non-legal Unicode code points, because the Unicode Standard says they are (only) illegal for "open interchange".

Unicode database files not installed

The Unicode database files are no longer installed with Perl. This doesn't affect any functionality in Perl and saves significant disk space. If you need these files, you can download them from http://www.unicode.org/Public/zipped/6.0.0/.

Regular Expressions

(?^...) construct signifies default modifiers

An ASCII caret "^" immediately following a "(?" in a regular expression now means that the subexpression does not inherit surrounding modifiers such as /i, but reverts to the Perl defaults. Any modifiers following the caret override the defaults.

Stringification of regular expressions now uses this notation. For example, qr/hlagh/i would previously be stringified as (?i-xsm:hlagh), but now it's stringified as (?^i:hlagh).

The main purpose of this change is to allow tests that rely on the stringification not to have to change whenever new modifiers are added. See Extended Patterns in perlre.

This change is likely to break code that compares stringified regular expressions with fixed strings containing ?-xism.
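A short sketch of both effects of the caret, under the assumption that no pragma adds further default modifiers (for example, use feature 'unicode_strings' would add a u to the stringified form):

```perl
use strict;
use warnings;

# Stringification uses the caret notation.
my $re   = qr/hlagh/i;
my $text = "$re";
print "$text\n";

# Inside a case-insensitive pattern, (?^:...) reverts to the defaults,
# so the group is matched case-sensitively again.
my $reverted  = ("FOO" =~ /(?^:foo)/i) ? "match" : "no match";
my $inherited = ("FOO" =~ /(?:foo)/i)  ? "match" : "no match";

print "with caret: $reverted, without caret: $inherited\n";
```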

/d, /l, /u, and /a modifiers

Four new regular expression modifiers have been added. These are mutually exclusive: only one can be turned on at a time.

  • The /l modifier says to compile the regular expression as if it were in the scope of use locale , even if it is not.

  • The /u modifier says to compile the regular expression as if it were in the scope of a use feature 'unicode_strings' pragma.

  • The /d (default) modifier is used to override any use locale and use feature 'unicode_strings' pragmas in effect at the time of compiling the regular expression.

  • The /a regular expression modifier restricts \s, \d and \w and the POSIX ([[:posix:]]) character classes to the ASCII range. Their complements and \b and \B are correspondingly affected. Otherwise, /a behaves like the /u modifier, in that case-insensitive matching uses Unicode semantics.

    If the /a modifier is repeated, then additionally in case-insensitive matching, no ASCII character can match a non-ASCII character. For example,

        "k" =~ /\N{KELVIN SIGN}/ai
        "\xDF" =~ /ss/ai

    match but

        "k" =~ /\N{KELVIN SIGN}/aai
        "\xDF" =~ /ss/aai

    do not match.

See Modifiers in perlre for more detail.
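The following sketch contrasts default matching with /a and /aa. It writes KELVIN SIGN as \x{212A} so that no charnames import is needed; the variable names are illustrative.

```perl
use strict;
use warnings;

my $arabic_one = "\x{0661}";    # ARABIC-INDIC DIGIT ONE

# A string containing code points above 255 is matched with Unicode
# rules by default, so \d accepts the non-ASCII digit ...
my $d_default = ($arabic_one =~ /\d/)  ? 1 : 0;
# ... whereas /a restricts \d to the ASCII digits 0-9.
my $d_ascii   = ($arabic_one =~ /\d/a) ? 1 : 0;

# With a single /a, case-insensitive matching still uses Unicode folds,
# so "k" matches KELVIN SIGN (U+212A). Doubling the modifier forbids
# ASCII/non-ASCII matches.
my $fold_a  = ("k" =~ /\x{212A}/ai)  ? 1 : 0;
my $fold_aa = ("k" =~ /\x{212A}/aai) ? 1 : 0;

print "default \\d: $d_default, /a \\d: $d_ascii, ",
      "/ai fold: $fold_a, /aai fold: $fold_aa\n";
```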

Non-destructive substitution

The substitution (s///) and transliteration (y///) operators now support an /r option that copies the input variable, carries out the substitution on the copy, and returns the result. The original remains unmodified.

    my $old = "cat";
    my $new = $old =~ s/cat/dog/r;
    # $old is "cat" and $new is "dog"

This is particularly useful with map. See perlop for more examples.
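A minimal sketch of the map use case, with made-up file names: because /r returns the modified copy, the block passed to map yields the new strings and leaves the source list alone.

```perl
use strict;
use warnings;

my @files   = ("report.txt", "notes.txt", "todo.txt");

# Without /r, s/// inside map would modify @files in place and return
# the number of substitutions rather than the new string.
my @backups = map { s/\.txt\z/.bak/r } @files;

# The same option works for transliteration.
my $word  = "hello";
my $shout = $word =~ tr/a-z/A-Z/r;

print "@files\n";
print "@backups\n";
print "$word -> $shout\n";
```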

Re-entrant regular expression engine

It is now safe to use regular expressions within (?{...}) and (??{...}) code blocks inside regular expressions.

These blocks are still experimental, however, and still have problems with lexical (my) variables and abnormal exiting.

use re '/flags'

The re pragma now has the ability to turn on regular expression flags till the end of the lexical scope:

    use re "/x";
    "foo" =~ / (.+) /; # /x implied

See '/flags' mode in re for details.

\o{...} for octals

There is a new octal escape sequence, "\o" , in doublequote-like contexts. This construct allows large octal ordinals beyond the current max of 0777 to be represented. It also allows you to specify a character in octal which can safely be concatenated with other regex snippets and which won't be confused with being a backreference to a regex capture group. See Capture groups in perlre.
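The points above can be sketched as follows; the variable names are illustrative.

```perl
use strict;
use warnings;

# An ordinary octal escape, written with braces.
my $letter = "\o{101}";            # same character as "\101"

# An ordinal beyond \777 (octal 1000 is decimal 512).
my $wide = "\o{1000}";

# In a regex, \o{1} is always the character with ordinal 1; it cannot
# be mistaken for a backreference to the first capture group.
my $matched = ("x\x01" =~ /(x)\o{1}/) ? 1 : 0;

printf "%s %d %d\n", $letter, ord $wide, $matched;
```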

Add \p{Titlecase} as a synonym for \p{Title}

This synonym is added for symmetry with the Unicode property names \p{Uppercase} and \p{Lowercase} .

Regular expression debugging output improvement

Regular expression debugging output (turned on by use re 'debug' ) now uses hexadecimal when escaping non-ASCII characters, instead of octal.

Return value of delete $+{...}

Custom regular expression engines can now determine the return value of delete on an entry of %+ or %- .

Syntactical Enhancements

Array and hash container functions accept references

Warning: This feature is considered experimental, as the exact behaviour may change in a future version of Perl.

All builtin functions that operate directly on array or hash containers now also accept unblessed hard references to arrays or hashes:

    |----------------------------+---------------------------|
    | Traditional syntax         | Terse syntax              |
    |----------------------------+---------------------------|
    | push @$arrayref, @stuff    | push $arrayref, @stuff    |
    | unshift @$arrayref, @stuff | unshift $arrayref, @stuff |
    | pop @$arrayref             | pop $arrayref             |
    | shift @$arrayref           | shift $arrayref           |
    | splice @$arrayref, 0, 2    | splice $arrayref, 0, 2    |
    | keys %$hashref             | keys $hashref             |
    | keys @$arrayref            | keys $arrayref            |
    | values %$hashref           | values $hashref           |
    | values @$arrayref          | values $arrayref          |
    | ($k,$v) = each %$hashref   | ($k,$v) = each $hashref   |
    | ($k,$v) = each @$arrayref  | ($k,$v) = each $arrayref  |
    |----------------------------+---------------------------|

This allows these builtin functions to act on long dereferencing chains or on the return value of subroutines without needing to wrap them in @{} or %{} :

    push @{$obj->tags}, $new_tag; # old way
    push $obj->tags, $new_tag; # new way
    for ( keys %{$hoh->{genres}{artists}} ) {...} # old way
    for ( keys $hoh->{genres}{artists} ) {...} # new way

Single term prototype

The + prototype is a special alternative to $ that acts like \[@%] when given a literal array or hash variable, but will otherwise force scalar context on the argument. See Prototypes in perlsub.
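As a sketch of how the + prototype behaves, the hypothetical helper count_items() below accepts a literal array, a literal hash, or a reference. The helper and variable names are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: counts the elements of an array or hash.
# The + prototype passes a literal @array or %hash by reference and
# imposes scalar context on anything else.
sub count_items(+) {
    my ($container) = @_;
    return ref $container eq 'HASH'
        ? scalar(keys %$container)
        : scalar(@$container);
}

my @list = (1, 2, 3);
my %map  = (a => 1, b => 2);
my $ref  = [4, 5, 6, 7];

my $n_list = count_items(@list);   # receives \@list
my $n_map  = count_items(%map);    # receives \%map
my $n_ref  = count_items($ref);    # receives $ref unchanged

print "$n_list $n_map $n_ref\n";
```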

package block syntax

A package declaration can now contain a code block, in which case the declaration is in scope inside that block only. So package Foo { ... } is precisely equivalent to { package Foo; ... }. It also works with a version number in the declaration, as in package Foo 1.2 { ... }, which is its most attractive feature. See perlfunc.
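A small sketch of the block form with a version number. The Counter class is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# The package declaration, its version, and its scope in one construct.
package Counter 1.2 {
    sub new  { my ($class) = @_; return bless { n => 0 }, $class }
    sub bump { my ($self)  = @_; return ++$self->{n} }
}

# Outside the block, the current package is main again.
my $current = __PACKAGE__;

my $counter = Counter->new;
$counter->bump for 1 .. 3;

print "$current ", Counter->VERSION, " $counter->{n}\n";
```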

Statement labels can appear in more places

Statement labels can now occur before any type of statement or declaration, such as package.

Stacked labels

Multiple statement labels can now appear before a single statement.

Uppercase X/B allowed in hexadecimal/binary literals

Literals may now use either upper case 0X... or 0B... prefixes, in addition to the already supported 0x... and 0b... syntax [perl #76296].

C, Ruby, Python, and PHP already support this syntax, and it makes Perl more internally consistent: a round-trip with eval sprintf "%#X", 0x10 now returns 16, just like eval sprintf "%#x", 0x10.
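The round-trip mentioned above can be sketched like this; the variable names are illustrative.

```perl
use strict;
use warnings;

my $hex = 0X1F;     # same value as 0x1F
my $bin = 0B101;    # same value as 0b101

# sprintf "%#X" produces an upper-case prefix, which eval can now
# read back as a numeric literal.
my $formatted  = sprintf "%#X", 0x10;
my $round_trip = eval $formatted;

print "$hex $bin $formatted $round_trip\n";
```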

Overridable tie functions

tie, tied and untie can now be overridden [perl #75902].

Exception Handling

To make them more reliable and consistent, several changes have been made to how die, warn, and $@ behave.

  • When an exception is thrown inside an eval, the exception is no longer at risk of being clobbered by destructor code running during unwinding. Previously, the exception was written into $@ early in the throwing process, and would be overwritten if eval was used internally in the destructor for an object that had to be freed while exiting from the outer eval. Now the exception is written into $@ last thing before exiting the outer eval, so the code running immediately thereafter can rely on the value in $@ correctly corresponding to that eval. ($@ is still also set before exiting the eval, for the sake of destructors that rely on this.)

    Likewise, a local $@ inside an eval no longer clobbers any exception thrown in its scope. Previously, the restoration of $@ upon unwinding would overwrite any exception being thrown. Now the exception gets to the eval anyway. So local $@ is safe before a die.

    Exceptions thrown from object destructors no longer modify the $@ of the surrounding context. (If the surrounding context was exception unwinding, this used to be another way to clobber the exception being thrown.) Previously such an exception was sometimes emitted as a warning, and then either was string-appended to the surrounding $@ or completely replaced the surrounding $@ , depending on whether that exception and the surrounding $@ were strings or objects. Now, an exception in this situation is always emitted as a warning, leaving the surrounding $@ untouched. In addition to object destructors, this also affects any function call run by XS code using the G_KEEPERR flag.

  • Warnings for warn can now be objects in the same way as exceptions for die. If an object-based warning gets the default handling of writing to standard error, it is stringified as before with the filename and line number appended. But a $SIG{__WARN__} handler now receives an object-based warning as an object, where previously it was passed the result of stringifying the object.
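Two of the changes above can be sketched in a few lines. The Guard class is hypothetical; its destructor uses eval internally, which is the situation that used to clobber the exception.

```perl
use strict;
use warnings;

{
    # Hypothetical class whose destructor happens to use eval.
    package Guard;
    sub new     { return bless {}, shift }
    sub DESTROY { eval { 1 } }    # a successful eval resets $@ to ""
}

eval {
    my $guard = Guard->new;       # freed while the eval unwinds
    die "original failure\n";
};
my $caught = $@;                  # still the original exception
print "caught: $caught";

# A __WARN__ handler now receives an object-based warning as-is
# instead of its stringification.
my $received;
{
    local $SIG{__WARN__} = sub { $received = $_[0] };
    my $warning = { code => 42 };
    warn $warning;
}
print "handler got a ", ref($received), " reference\n";
```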

Other Enhancements

Assignment to $0 sets the legacy process name with prctl() on Linux

On Linux the legacy process name is now set with prctl(2), in addition to altering the POSIX name via argv[0] , as Perl has done since version 4.000. Now system utilities that read the legacy process name such as ps, top, and killall recognize the name you set when assigning to $0 . The string you supply is truncated at 16 bytes; this limitation is imposed by Linux.

srand() now returns the seed

This allows programs that need to have repeatable results not to have to come up with their own seed-generating mechanism. Instead, they can use srand() and stash the return value for future use. One example is a test program with too many combinations to test comprehensively in the time available for each run. It can test a random subset each time and, should there be a failure, log the seed used for that run so this can later be used to produce the same results.
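A minimal sketch of the log-and-replay idea described above, using nothing beyond the srand() and rand() builtins:

```perl
use strict;
use warnings;

# Let Perl pick a seed and remember it (it could be logged here).
my $seed = srand();

my @first = map { int(rand(1000)) } 1 .. 5;

# Re-seeding with the stored value reproduces the same sequence.
srand($seed);
my @second = map { int(rand(1000)) } 1 .. 5;

print "seed $seed reproduces the run: ",
      ("@first" eq "@second" ? "yes" : "no"), "\n";
```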

printf-like functions understand post-1980 size modifiers

Perl's printf and sprintf operators, and Perl's internal printf replacement function, now understand the C90 size modifiers "hh" (char), "z" (size_t), and "t" (ptrdiff_t). Also, when compiled with a C99 compiler, Perl now understands the size modifier "j" (intmax_t) (but this is not portable).

So, for example, on any modern machine, sprintf("%hhd", 257) returns "1".

New global variable ${^GLOBAL_PHASE}

A new global variable, ${^GLOBAL_PHASE} , has been added to allow introspection of the current phase of the Perl interpreter. It's explained in detail in ${^GLOBAL_PHASE} in perlvar and in BEGIN, UNITCHECK, CHECK, INIT and END in perlmod.
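As a small sketch, the script below records the phase at compile time and at run time, and prints it once more from an END block. It assumes the code is run as a main program file; code compiled later (for example by a string eval) would see a different phase in its BEGIN block.

```perl
use strict;
use warnings;

our @phases;

BEGIN { push @phases, ${^GLOBAL_PHASE} }   # while the program compiles
push @phases, ${^GLOBAL_PHASE};            # during normal execution
END   { print "END block sees: ", ${^GLOBAL_PHASE}, "\n" }

print "@phases\n";
```

A common use is inside DESTROY methods, where comparing the variable with 'DESTRUCT' tells whether the object is being freed during global destruction.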

-d:-foo calls Devel::foo::unimport

The syntax -d:foo was extended in 5.6.1 to make -d:foo=bar equivalent to -MDevel::foo=bar, which expands internally to use Devel::foo 'bar' . Perl now allows prefixing the module name with -, with the same semantics as -M; that is:

  • -d:-foo

    Equivalent to -M-Devel::foo: expands to no Devel::foo and calls Devel::foo->unimport() if that method exists.

  • -d:-foo=bar

    Equivalent to -M-Devel::foo=bar: expands to no Devel::foo 'bar' , and calls Devel::foo->unimport("bar") if that method exists.

This is particularly useful for suppressing the default actions of a Devel::* module's import method whilst still loading it for debugging.

Filehandle method calls load IO::File on demand

When a method call on a filehandle would die because the method cannot be resolved and IO::File has not been loaded, Perl now loads IO::File via require and attempts method resolution again:

    open my $fh, ">", $file;
    $fh->binmode(":raw"); # loads IO::File and succeeds

This also works for globs like STDOUT , STDERR , and STDIN :

    STDOUT->autoflush(1);

Because this on-demand load happens only if method resolution fails, the legacy approach of manually loading an IO::File parent class for partial method support still works as expected:

    use IO::Handle;
    open my $fh, ">", $file;
    $fh->autoflush(1); # IO::File not loaded

Improved IPv6 support

The Socket module provides new affordances for IPv6, including implementations of the Socket::getaddrinfo() and Socket::getnameinfo() functions, along with related constants and a handful of new functions. See Socket.

DTrace probes now include package name

The DTrace probes now include an additional argument, arg3 , which contains the package the subroutine being entered or left was compiled in.

For example, using the following DTrace script:

    perl$target:::sub-entry
    {
        printf("%s::%s\n", copyinstr(arg0), copyinstr(arg3));
    }

and then running:

    $ perl -e 'sub test { }; test'

DTrace will print:

    main::test

New C APIs

See Internal Changes.

Security

User-defined regular expression properties

User-Defined Character Properties in perlunicode documented that you can create custom properties by defining subroutines whose names begin with "In" or "Is". However, Perl did not actually enforce that naming restriction, so \p{foo::bar} could call foo::bar() if it existed. The documented convention is now enforced.

Also, Perl no longer allows tainted regular expressions to invoke a user-defined property. It simply dies instead [perl #82616].

Incompatible Changes

Perl 5.14.0 is not binary-compatible with any previous stable release.

In addition to the sections that follow, see C API Changes.

Regular Expressions and String Escapes

Inverted bracketed character classes and multi-character folds

Some characters match a sequence of two or three characters in /i regular expression matching under Unicode rules. One example is LATIN SMALL LETTER SHARP S which matches the sequence ss.

    'ss' =~ /\A[\N{LATIN SMALL LETTER SHARP S}]\z/i # Matches

This, however, can lead to very counter-intuitive results, especially when inverted. Because of this, Perl 5.14 does not use multi-character /i matching in inverted character classes.

    'ss' =~ /\A[^\N{LATIN SMALL LETTER SHARP S}]+\z/i # ???

This should match any sequence of characters that is neither SHARP S nor what SHARP S matches under /i. "s" isn't SHARP S, but Unicode says that "ss" is what SHARP S matches under /i. So which one "wins"? Do you fail the match because the string has ss or accept it because it has an s followed by another s?

Earlier releases of Perl did allow this multi-character matching, but due to bugs, it mostly did not work.

\400-\777

In certain circumstances, \400-\777 in regexes have behaved differently than they behave in all other doublequote-like contexts. Since 5.10.1, Perl has issued a deprecation warning when this happens. Now, these literals behave the same in all doublequote-like contexts, namely to be equivalent to \x{100}-\x{1FF}, with no deprecation warning.

In the command-line option -0, \400-\777 retain their conventional meaning: they slurp whole input files. Previously, this was documented only for -0777.

Because of various ambiguities, you should use the new \o{...} construct to represent characters in octal instead.

Most \p{} properties are now immune to case-insensitive matching

For most Unicode properties, it doesn't make sense to have them match differently under /i case-insensitive matching. Doing so can lead to unexpected results and potential security holes. For example

    m/\p{ASCII_Hex_Digit}+/i

could previously match non-ASCII characters because of the Unicode matching rules (although there were several bugs with this). Now matching under /i gives the same results as non-/i matching except for those few properties where people have come to expect differences, namely the ones where casing is an integral part of their meaning, such as m/\p{Uppercase}/i and m/\p{Lowercase}/i, both of which match the same code points as matched by m/\p{Cased}/i. Details are in Unicode Properties in perlrecharclass.
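The distinction drawn above might be checked with a sketch such as this one; the variable names are illustrative.

```perl
use strict;
use warnings;

# Casing properties are among the few that still react to /i:
# under /i, \p{Uppercase} matches any cased character.
my $upper_plain = ("a" =~ /\p{Uppercase}/)  ? 1 : 0;
my $upper_fold  = ("a" =~ /\p{Uppercase}/i) ? 1 : 0;

# Other properties match the same characters with or without /i.
my $hex_plain = ("g" =~ /\p{ASCII_Hex_Digit}/)  ? 1 : 0;
my $hex_fold  = ("g" =~ /\p{ASCII_Hex_Digit}/i) ? 1 : 0;

print "Uppercase: $upper_plain/$upper_fold, ",
      "ASCII_Hex_Digit: $hex_plain/$hex_fold\n";
```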

User-defined property handlers that need to match differently under /i must be changed to read the new boolean parameter passed to them, which is non-zero if case-insensitive matching is in effect and 0 otherwise. See User-Defined Character Properties in perlunicode.

\p{} implies Unicode semantics

Specifying a Unicode property in the pattern indicates that the pattern is meant for matching according to Unicode rules, the way \N{NAME} does.

Regular expressions retain their localeness when interpolated

Regular expressions compiled under use locale now retain this when interpolated into a new regular expression compiled outside a use locale , and vice-versa.

Previously, one regular expression interpolated into another inherited the localeness of the surrounding regex, losing whatever state it originally had. This is considered a bug fix, but may trip up code that has come to rely on the incorrect behaviour.

Stringification of regexes has changed

Default regular expression modifiers are now notated using (?^...) . Code relying on the old stringification will fail. This is so that when new modifiers are added, such code won't have to keep changing each time this happens, because the stringification will automatically incorporate the new modifiers.

Code that needs to work properly with both old- and new-style regexes can avoid the whole issue by using (for perls since 5.9.5; see re):

    use re qw(regexp_pattern);
    my ($pat, $mods) = regexp_pattern($re_ref);

If the actual stringification is important or older Perls need to be supported, you can use something like the following:

    # Accept both old and new-style stringification
    my $modifiers = (qr/foobar/ =~ /\Q(?^/) ? "^" : "-xism";

And then use $modifiers instead of -xism.

Run-time code blocks in regular expressions inherit pragmata

Code blocks in regular expressions ((?{...}) and (??{...}) ) previously did not inherit pragmata (strict, warnings, etc.) if the regular expression was compiled at run time as happens in cases like these two:

    use re "eval";
    $foo =~ $bar; # when $bar contains (?{...})
    $foo =~ /$bar(?{ $finished = 1 })/;

This bug has now been fixed, but code that relied on the buggy behaviour may need to be fixed to account for the correct behaviour.

Stashes and Package Variables

Localised tied hashes and arrays are no longer tied

In the following:

    tie @a, ...;
    {
        local @a;
        # here, @a is now a new, untied array
    }
    # here, @a refers again to the old, tied array

Earlier versions of Perl incorrectly tied the new local array. This has now been fixed. This fix could however potentially cause a change in behaviour of some code.
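The behaviour can be observed with tied(), as in this sketch. It uses Tie::StdArray from the core Tie::Array module as a stand-in for any tie class.

```perl
use strict;
use warnings;
use Tie::Array;    # provides the Tie::StdArray class

our @a;
tie @a, 'Tie::StdArray';

my ($inside, $outside);
{
    local @a;
    # tied() returns the underlying object, or undef if not tied.
    $inside = tied(@a) ? "tied" : "untied";
}
$outside = tied(@a) ? "tied" : "untied";

print "inside: $inside, outside: $outside\n";
```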

Stashes are now always defined

defined %Foo:: now always returns true, even when no symbols have yet been defined in that package.

This is a side-effect of removing a special-case kludge in the tokeniser, added for 5.10.0, to hide side-effects of changes to the internal storage of hashes. The fix drastically reduces hashes' memory overhead.

Calling defined on a stash has been deprecated since 5.6.0, warned on lexicals since 5.6.0, and warned for stashes and other package variables since 5.12.0. defined %hash has always exposed an implementation detail: emptying a hash by deleting all entries from it does not make defined %hash false. Hence defined %hash is not valid code to determine whether an arbitrary hash is empty. Instead, use the behaviour of an empty %hash always returning false in scalar context.
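A minimal sketch of the recommended emptiness test; the hash name is illustrative.

```perl
use strict;
use warnings;

my %cache = (answer => 42);
delete $cache{answer};

# A hash in boolean (scalar) context is false exactly when it has
# no entries, regardless of what it held earlier.
my $state = %cache ? "has entries" : "empty";

print "$state\n";
```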

Clearing stashes

Stash list assignment %foo:: = () used to make the stash temporarily anonymous while it was being emptied. Consequently, any of its subroutines referenced elsewhere would become anonymous, showing up as "(unknown)" in caller. They now retain their package names such that caller returns the original sub name if there is still a reference to its typeglob and "foo::__ANON__" otherwise [perl #79208].

Dereferencing typeglobs

If you assign a typeglob to a scalar variable:

    $glob = *foo;

the glob that is copied to $glob is marked with a special flag indicating that the glob is just a copy. This allows subsequent assignments to $glob to overwrite the glob. The original glob, however, is immutable.

Some Perl operators did not distinguish between these two types of globs. This would result in strange behaviour in edge cases: untie $scalar would not untie the scalar if the last thing assigned to it was a glob (because it treated it as untie *$scalar, which unties a handle). Assignment to a glob slot (such as *$glob = \@some_array) would simply assign \@some_array to $glob.

To fix this, the *{} operator (including its *foo and *$foo forms) has been modified to make a new immutable glob if its operand is a glob copy. This allows operators that make a distinction between globs and scalars to be modified to treat only immutable globs as globs. (tie, tied and untie have been left as they are for compatibility's sake, but will warn. See Deprecations.)

This causes an incompatible change in code that assigns a glob to the return value of *{} when that operator was passed a glob copy. Take the following code, for instance:

    $glob = *foo;
    *$glob = *bar;

The *$glob on the second line returns a new immutable glob. That new glob is made an alias to *bar. Then it is discarded. So the second assignment has no effect.

See http://rt.perl.org/rt3/Public/Bug/Display.html?id=77810 for more detail.

Magic variables outside the main package

In previous versions of Perl, magic variables like $! , %SIG , etc. would "leak" into other packages. So %foo::SIG could be used to access signals, ${"foo::!"} (with strict mode off) to access C's errno , etc.

This was a bug, or an "unintentional" feature, which caused various ill effects, such as signal handlers being wiped when modules were loaded, etc.

This has been fixed (or the feature has been removed, depending on how you see it).

local($_) strips all magic from $_

local() on scalar variables gives them a new value but keeps all their magic intact. This has proven problematic for the default scalar variable $_, where perlsub recommends that any subroutine that assigns to $_ should first localize it. This would throw an exception if $_ is aliased to a read-only variable, and could in general have various unintentional side-effects.

Therefore, as an exception to the general rule, local($_) will not only assign a new value to $_, but also remove all existing magic from it as well.
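The situation described above can be sketched with a hypothetical helper, trimmed(), that localizes $_ while the caller's $_ is aliased to a read-only constant.

```perl
use strict;
use warnings;

# Hypothetical helper that follows the perlsub advice of localizing
# $_ before assigning to it.
sub trimmed {
    local $_ = shift;
    s/^\s+|\s+$//g;
    return $_;
}

my @out;
for (1, 2) {    # $_ is aliased to read-only constants here
    push @out, trimmed("  item $_  ");
}

print join(", ", @out), "\n";
```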

Parsing of package and variable names

Parsing the names of packages and package variables has changed: multiple adjacent pairs of colons, as in foo::::bar , are now all treated as package separators.

Regardless of this change, the exact parsing of package separators has never been guaranteed and is subject to change in future Perl versions.

Changes to Syntax or to Perl Operators

given return values

given blocks now return the last evaluated expression, or an empty list if the block was exited by break. Thus you can now write:

    my $type = do {
        given ($num) {
            break when undef;
            "integer" when /^[+-]?[0-9]+$/;
            "float" when /^[+-]?[0-9]+(?:\.[0-9]+)?$/;
            "unknown";
        }
    };

See Return value in perlsyn for details.

Change in parsing of certain prototypes

Functions declared with the following prototypes now behave correctly as unary functions:

    *
    \$ \% \@ \* \&
    \[...]
    ;$ ;*
    ;\$ ;\% etc.
    ;\[...]

Due to this bug fix [perl #75904], functions using the (*), (;$) and (;*) prototypes are parsed with higher precedence than before. So in the following example:

    sub foo(;$);
    foo $a < $b;

the second line is now parsed correctly as foo($a) < $b, rather than foo($a < $b). This happens when one of these operators is used in an unparenthesised argument:

    < > <= >= lt gt le ge
    == != <=> eq ne cmp ~~
    &
    |  ^
    &&
    || //
    .. ...
    ?:
    = += -= *= etc.
    , =>
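The change in precedence can be made visible with a sketch like the one below. The function twice() is hypothetical; it is chosen so that the two possible parses give different results.

```perl
use strict;
use warnings;

# Hypothetical unary function with an optional scalar argument.
sub twice(;$) {
    my $n = @_ ? $_[0] : 0;
    return 2 * $n;
}

my ($x, $y) = (3, 5);

# Parsed as twice($x) < $y, i.e. 6 < 5, which is false.
# The old parse, twice($x < $y), would have returned 2.
my $result = twice $x < $y;

# Parentheses still give the other meaning when it is wanted.
my $explicit = twice($x < $y);

print "unparenthesised: ", ($result ? "true" : "false"),
      ", explicit: $explicit\n";
```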

Smart-matching against array slices

Previously, the following code resulted in a successful match:

    my @a = qw(a y0 z);
    my @b = qw(a x0 z);
    @a[0 .. $#b] ~~ @b;

This odd behaviour has now been fixed [perl #77468].

Negation treats strings differently from before

The unary negation operator, - , now treats strings that look like numbers as numbers [perl #57706].
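A brief sketch of the resulting behaviour; the variable names are illustrative.

```perl
use strict;
use warnings;

# Strings that look like numbers are negated numerically.
my $neg      = -"10";
my $positive = -"-10";

# Identifier-like strings still just get a leading minus sign.
my $flag = -"foo";

print "$neg $positive $flag\n";
```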

Negative zero

Negative zero (-0.0), when converted to a string, now becomes "0" on all platforms. It used to become "-0" on some, but "0" on others.

If you still need to determine whether a zero is negative, use sprintf("%g", $zero) =~ /^-/ or the Data::Float module on CPAN.

:= is now a syntax error

Previously my $pi := 4 was exactly equivalent to my $pi : = 4, with the : being treated as the start of an attribute list, ending before the =. The use of := to mean : = was deprecated in 5.12.0, and is now a syntax error. This allows future use of := as a new token.

Outside the core's tests for it, we find no Perl 5 code on CPAN using this construction, so we believe that this change will have little impact on real-world codebases.

If it is absolutely necessary to have empty attribute lists (for example, because of a code generator), simply avoid the error by adding a space before the = .

Change in the parsing of identifiers

Characters outside the Unicode "XIDStart" set are no longer allowed at the beginning of an identifier. This means that certain accents and marks that normally follow an alphabetic character may no longer be the first character of an identifier.

Threads and Processes

Directory handles not copied to threads

On systems other than Windows that do not have a fchdir function, newly-created threads no longer inherit directory handles from their parent threads. Such programs would usually have crashed anyway [perl #75154].

close on shared pipes

To avoid deadlocks, the close function no longer waits for the child process to exit if the underlying file descriptor is still in use by another thread. It returns true in such cases.

fork() emulation will not wait for signalled children

On Windows, parent processes would not terminate until all forked children had terminated first. However, kill("KILL", ...) is inherently unstable on pseudo-processes, and kill("TERM", ...) might not get delivered if the child is blocked in a system call.

To avoid the deadlock and still provide a safe mechanism to terminate the hosting process, Perl no longer waits for children that have been sent a SIGTERM signal. It is up to the parent process to waitpid() for these children if child-cleanup processing must be allowed to finish. However, it is also then the responsibility of the parent to avoid the deadlock by making sure the child process can't be blocked on I/O.

See perlfork for more information about the fork() emulation on Windows.

Configuration

Naming fixes in Policy_sh.SH may invalidate Policy.sh

Several long-standing typos and naming confusions in Policy_sh.SH have been fixed, standardizing on the variable names used in config.sh.

This will change the behaviour of Policy.sh if you happen to have been accidentally relying on its incorrect behaviour.

Perl source code is read in text mode on Windows

Perl scripts used to be read in binary mode on Windows for the benefit of the ByteLoader module (which is no longer part of core Perl). This had the side-effect of breaking various operations on the DATA filehandle, including seek()/tell(), and even simply reading from DATA after filehandles have been flushed by a call to system(), backticks, fork() etc.

The default build options for Windows have been changed to read Perl source code on Windows in text mode now. ByteLoader will (hopefully) be updated on CPAN to automatically handle this situation [perl #28106].

Deprecations

See also Deprecated C APIs.

Omitting a space between a regular expression and subsequent word

Omitting the space between a regular expression operator or its modifiers and the following word is deprecated. For example, m/foo/sand $bar is for now still parsed as m/foo/s and $bar, but will now issue a warning.
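As a minimal sketch of the preferred spelling (the variable names $text and $bar are made up for illustration), keeping whitespace between the modifier and the following word avoids the deprecated form:

```perl
use strict;
use warnings;

my $text = "line one\nfoo\n";
my $bar  = 1;

# Deprecated spelling fuses the "s" modifier with "and": m/foo/sand $bar
# Preferred spelling keeps whitespace between the modifier and the word.
my $ok = ($text =~ m/foo/s and $bar) ? 1 : 0;

print "matched: $ok\n";
```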

\cX

The backslash-c construct was designed as a way of specifying non-printable characters, but there were no restrictions (on ASCII platforms) on what the character following the c could be. Now, a deprecation warning is raised if that character isn't an ASCII character. Also, a deprecation warning is raised for "\c{" (which is the same as simply saying ";").

"\b{" and "\B{"

In regular expressions, a literal "{" immediately following a "\b" (not in a bracketed character class) or a "\B" is now deprecated to allow for its future use by Perl itself.

Perl 4-era .pl libraries

Perl bundles a handful of library files that predate Perl 5. This bundling is now deprecated for most of these files, which are now available from CPAN. The affected files now warn when run, if they were installed as part of the core.

This is a mandatory warning, not obeying -X or lexical warning bits. The warning is modelled on that supplied by deprecate.pm for deprecated-in-core .pm libraries. It points to the specific CPAN distribution that contains the .pl libraries. The CPAN versions, of course, do not generate the warning.

List assignment to $[

Assignment to $[ was deprecated and started to give warnings in Perl version 5.12.0. This version of Perl (5.14) now also emits a warning when assigning to $[ in list context. This fixes an oversight in 5.12.0.

Use of qw(...) as parentheses

Historically the parser fooled itself into thinking that qw(...) literals were always enclosed in parentheses, and as a result you could sometimes omit parentheses around them:

    for $x qw(a b c) { ... }

The parser no longer lies to itself in this way. Wrap the list literal in parentheses like this:

    for $x (qw(a b c)) { ... }

This is being deprecated because the parentheses in for $i (1,2,3) { ... } are not part of expression syntax. They are part of the statement syntax, with the for statement wanting literal parentheses. The synthetic parentheses that a qw expression acquired were only intended to be treated as part of expression syntax.

Note that this does not change the behaviour of cases like:

    use POSIX qw(setlocale localeconv);
    our @EXPORT = qw(foo bar baz);

where parentheses were never required around the expression.

\N{BELL}

Use of \N{BELL} is deprecated because Unicode is using that name for a different character. See Unicode Version 6.0 is now supported (mostly) for more explanation.

?PATTERN?

?PATTERN? (without the initial m) has been deprecated and now produces a warning. This is to allow future use of ? in new operators. The match-once functionality is still available as m?PATTERN?.
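A small sketch of the match-once behaviour using the non-deprecated m?PATTERN? spelling; the word list is made up for illustration:

```perl
use strict;
use warnings;

my @first_hits;
for my $word (qw(apple avocado banana apricot)) {
    # m?...? succeeds only once until reset() is called, so only the
    # first word starting with "a" is recorded.
    push @first_hits, $word if $word =~ m?^a?;
}

print "@first_hits\n";
```

Calling reset() would re-arm the pattern so that it can match once more.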

Tie functions on scalars holding typeglobs

Calling a tie function (tie, tied, untie) with a scalar argument acts on a filehandle if the scalar happens to hold a typeglob.

This is a long-standing bug that will be removed in Perl 5.16, as there is currently no way to tie the scalar itself when it holds a typeglob, and no way to untie a scalar that has had a typeglob assigned to it.

Now there is a deprecation warning whenever a tie function is used on a handle without an explicit *.
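The following is a minimal sketch of tying a handle through a scalar that holds a typeglob, written with the explicit * so that the intent is unambiguous. The UpcaseHandle class is hypothetical and exists only for this illustration:

```perl
use strict;
use warnings;
use Symbol qw(gensym);

# Hypothetical tie class: upper-cases everything printed to the handle
# and appends it to a caller-supplied buffer.
package UpcaseHandle;
sub TIEHANDLE { my ($class, $buffer) = @_; return bless { buffer => $buffer }, $class }
sub PRINT     { my $self = shift; ${ $self->{buffer} } .= uc join '', @_; return 1 }

package main;

my $captured = '';
my $handle   = *{ gensym() };    # a scalar that holds a typeglob

# The explicit * says that the handle, not the scalar, is being tied.
tie *$handle, 'UpcaseHandle', \$captured;
print {$handle} "hello";
untie *$handle;

print "$captured\n";
```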

User-defined case-mapping

This feature is being deprecated due to its many issues, as documented in User-Defined Case Mappings (for serious hackers only) in perlunicode. This feature will be removed in Perl 5.16. Instead use the CPAN module Unicode::Casing, which provides improved functionality.

Deprecated modules

The following module will be removed from the core distribution in a future release, and should be installed from CPAN instead. Distributions on CPAN that require this module should add it to their prerequisites. The core version of this module now issues a deprecation warning.

If you ship a packaged version of Perl, either alone or as part of a larger system, then you should carefully consider the repercussions of core module deprecations. You may want to consider shipping your default build of Perl with a package for the deprecated module that installs into vendor or site Perl library directories. This will inhibit the deprecation warnings.

Alternatively, you may want to consider patching lib/deprecate.pm to provide deprecation warnings specific to your packaging system or distribution of Perl, consistent with how your packaging system or distribution manages a staged transition from a release where the installation of a single package provides the given functionality, to a later release where the system administrator needs to know to install multiple packages to get that same functionality.

You can silence these deprecation warnings by installing the module in question from CPAN. To install the latest version of it by role rather than by name, just install Task::Deprecations::5_14.

Performance Enhancements

"Safe signals" optimisation

Signal dispatch has been moved from the runloop into control ops. This should give a few percent speed increase, and eliminates nearly all the speed penalty caused by the introduction of "safe signals" in 5.8.0. Signals should still be dispatched within the same statement as they were previously. If this does not happen, or if you find it possible to create uninterruptible loops, this is a bug; please report it along with steps to reproduce the problem.

Optimisation of shift() and pop() calls without arguments

Two fewer OPs are used for shift() and pop() calls with no argument (with implicit @_). This change makes shift() 5% faster than shift @_ on non-threaded perls, and 25% faster on threaded ones.
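For illustration only, here is a sketch of the argument-less form that benefits from this optimisation. The sum_args helper is hypothetical; inside a subroutine, a bare shift operates on @_:

```perl
use strict;
use warnings;

# Hypothetical helper: drains its arguments with a bare shift, which
# operates on @_ inside a subroutine.
sub sum_args {
    my $total = 0;
    while (@_) {
        $total += shift;      # same result as: shift @_
    }
    return $total;
}

print sum_args(1, 2, 3), "\n";
```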

Optimisation of regexp engine string comparison work

The foldEQ_utf8 API function for case-insensitive comparison of strings (which is used heavily by the regexp engine) was substantially refactored and optimised -- and its documentation much improved as a free bonus.

Regular expression compilation speed-up

Compiling regular expressions has been made faster when upgrading the regex to utf8 is necessary but this isn't known when the compilation begins.

String appending is 100 times faster

When doing a lot of string appending, perls built to use the system's malloc could end up allocating a lot more memory than needed in an inefficient way.

sv_grow, the function used to allocate more memory if necessary when appending to a string, has been taught to round up the memory it requests to a certain geometric progression, making it much faster on certain platforms and configurations. On Win32, it's now about 100 times faster.

Eliminate PL_* accessor functions under ithreads

When MULTIPLICITY was first developed, and interpreter state moved into an interpreter struct, thread- and interpreter-local PL_* variables were defined as macros that called accessor functions (returning the address of the value) outside the Perl core. The intent was to allow members within the interpreter struct to change size without breaking binary compatibility, so that bug fixes could be merged to a maintenance branch that necessitated such a size change. This mechanism was redundant and penalised well-behaved code. It has been removed.

Freeing weak references

When there are many weak references to an object, freeing that object can under some circumstances take O(N*N) time, where N is the number of references. The circumstances in which this can happen have been reduced [perl #75254].

Lexical array and hash assignments

An earlier optimisation to speed up my @array = ... and my %hash = ... assignments caused a bug and was disabled in Perl 5.12.0.

Now we have found another way to speed up these assignments [perl #82110].

@_ uses less memory

Previously, @_ was allocated for every subroutine at compile time with enough space for four entries. Now this allocation is done on demand when the subroutine is called [perl #72416].

Size optimisations to SV and HV structures

xhv_fill has been eliminated from struct xpvhv, saving 1 IV per hash and, on some systems, making struct xpvhv cache-aligned. To keep this memory saving from causing a slowdown elsewhere, boolean use of HvFILL now calls HvTOTALKEYS instead (which is equivalent). The fill data are now calculated on demand when actually required, but cases where this is needed should be rare.

The order of structure elements in SV bodies has changed. Effectively, the NV slot has swapped location with STASH and MAGIC. As all access to SV members is via macros, this should be completely transparent. This change allows the space saving for PVHVs documented above, and may reduce the memory allocation needed for PVIVs on some architectures.

XPV, XPVIV, and XPVNV now allocate only the parts of the SV body they actually use, saving some space.

Scalars containing regular expressions now allocate only the part of the SV body they actually use, saving some space.

Memory consumption improvements to Exporter

The @EXPORT_FAIL AV is no longer created unless needed, hence neither is the typeglob backing it. This saves about 200 bytes for every package that uses Exporter but doesn't use this functionality.

Memory savings for weak references

For weak references, the common case of just a single weak reference per referent has been optimised to reduce the storage required. In this case it saves the equivalent of one small Perl array per referent.
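The optimised case, a single weak reference per referent, is the usual parent/child back-reference pattern. A minimal sketch using Scalar::Util follows; the data structure is made up for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $parent = { name => 'root', children => [] };
my $child  = { name => 'leaf', parent => $parent };
push @{ $parent->{children} }, $child;

# Weaken the back-reference so the parent/child cycle does not leak.
weaken($child->{parent});

my $was_weak = isweak($child->{parent}) ? 1 : 0;

undef $parent;    # drops the last strong reference to the parent
my $parent_gone = defined $child->{parent} ? 0 : 1;

print "weak=$was_weak gone=$parent_gone\n";
```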

%+ and %- use less memory

The bulk of the Tie::Hash::NamedCapture module used to be in the Perl core. It has now been moved to an XS module to reduce overhead for programs that do not use %+ or %-.
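For readers unfamiliar with these variables, here is a short sketch of what %+ and %- provide; the sample strings and capture names are made up for illustration:

```perl
use strict;
use warnings;

# %+ maps each capture name to the leftmost defined capture of that name.
my %date;
if ('2011-05-14' =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/) {
    %date = %+;    # copy, because %+ changes on the next successful match
}

# %- maps each capture name to an array of all captures sharing the name.
my @digits;
if ('a1b2' =~ /(?<d>\d)\D(?<d>\d)/) {
    @digits = @{ $-{d} };
}

print join('/', @date{qw(day month year)}), "\n";
print "@digits\n";
```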

Multiple small improvements to threads

The internal structures of threading now make fewer API calls and fewer allocations, resulting in noticeably smaller object code. Additionally, many thread context checks have been deferred so they're done only as needed (although this is only possible for non-debugging builds).

Adjacent pairs of nextstate opcodes are now optimized away

Previously, in code such as

    use constant DEBUG => 0;
    sub GAK {
        warn if DEBUG;
        print "stuff\n";
    }

the ops for warn if DEBUG would be folded to a null op (ex-const), but the nextstate op would remain, resulting in a runtime op dispatch of nextstate, nextstate, etc.

The execution of a sequence of nextstate ops is indistinguishable from just the last nextstate op, so the peephole optimizer now eliminates the first of a pair of nextstate ops. The exception is when the first carries a label, since labels must not be eliminated by the optimizer, and label usage isn't conclusively known at compile time.

Modules and Pragmata

New Modules and Pragmata

  • CPAN::Meta::YAML 0.003 has been added as a dual-life module. It supports a subset of YAML sufficient for reading and writing META.yml and MYMETA.yml files included with CPAN distributions or generated by the module installation toolchain. It should not be used for any other general YAML parsing or generation task.

  • CPAN::Meta version 2.110440 has been added as a dual-life module. It provides a standard library to read, interpret and write CPAN distribution metadata files (like META.json and META.yml) that describe a distribution, its contents, and the requirements for building it and installing it. The latest CPAN distribution metadata specification is included as CPAN::Meta::Spec and notes on changes in the specification over time are given in CPAN::Meta::History.

  • HTTP::Tiny 0.012 has been added as a dual-life module. It is a very small, simple HTTP/1.1 client designed for simple GET requests and file mirroring. It has been added so that CPAN.pm and CPANPLUS can "bootstrap" HTTP access to CPAN using pure Perl without relying on external binaries like curl(1) or wget(1).

  • JSON::PP 2.27105 has been added as a dual-life module to allow CPAN clients to read META.json files in CPAN distributions.

  • Module::Metadata 1.000004 has been added as a dual-life module. It gathers package and POD information from Perl module files. It is a standalone module based on Module::Build::ModuleInfo for use by other module installation toolchain components. Module::Build::ModuleInfo has been deprecated in favor of this module instead.

  • Perl::OSType 1.002 has been added as a dual-life module. It maps Perl operating system names (like "dragonfly" or "MSWin32") to more generic types with standardized names (like "Unix" or "Windows"). It has been refactored out of Module::Build and ExtUtils::CBuilder and consolidates such mappings into a single location for easier maintenance.

  • The following modules were added by the Unicode::Collate upgrade. See below for details.

    Unicode::Collate::CJK::Big5

    Unicode::Collate::CJK::GB2312

    Unicode::Collate::CJK::JISX0208

    Unicode::Collate::CJK::Korean

    Unicode::Collate::CJK::Pinyin

    Unicode::Collate::CJK::Stroke

  • Version::Requirements version 0.101020 has been added as a dual-life module. It provides a standard library to model and manipulate module prerequisites and version constraints defined in CPAN::Meta::Spec.
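As a rough sketch of how two of the newly added modules might be used, the example below decodes a JSON fragment with JSON::PP and maps operating system names to generic types with Perl::OSType. The metadata string is hypothetical and only loosely shaped like a META.json file; it is not a complete or valid metadata document.

```perl
use strict;
use warnings;
use JSON::PP ();
use Perl::OSType qw(os_type);

# Hypothetical metadata fragment, loosely shaped like a META.json file.
my $meta_json =
    '{"name":"My-Dist","version":"0.01",'
  . '"prereqs":{"runtime":{"requires":{"perl":"5.014"}}}}';

my $meta     = JSON::PP->new->decode($meta_json);
my $min_perl = $meta->{prereqs}{runtime}{requires}{perl};

# os_type() with an argument maps an OS name (as found in $^O) to a
# generic type; without an argument it uses the current $^O.
my %family = map { $_ => os_type($_) } qw(dragonfly MSWin32);

print "$meta->{name} needs perl $min_perl\n";
print "$_ => $family{$_}\n" for sort keys %family;
```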

Updated Modules and Pragmata

  • attributes has been upgraded from version 0.12 to 0.14.

  • Archive::Extract has been upgraded from version 0.38 to 0.48.

    Updates since 0.38 include: a safe print method that guards Archive::Extract from changes to $\; a fix to the tests when run in core Perl; support for TZ files; a modification for the lzma logic to favour IO::Uncompress::Unlzma; and a fix for an issue with NetBSD-current and its new unzip(1) executable.

  • Archive::Tar has been upgraded from version 1.54 to 1.76.

    Important changes since 1.54 include the following:

    • Compatibility with busybox implementations of tar(1).

    • A fix so that write() and create_archive() close only filehandles they themselves opened.

    • A bug was fixed regarding the exit code of extract_archive.

    • The ptar(1) utility has a new option to allow safe creation of tarballs without world-writable files on Windows, allowing those archives to be uploaded to CPAN.

    • A new ptargrep(1) utility for using regular expressions against the contents of files in a tar archive.

    • pax extended headers are now skipped.

  • Attribute::Handlers has been upgraded from version 0.87 to 0.89.

  • autodie has been upgraded from version 2.06_01 to 2.1001.

  • AutoLoader has been upgraded from version 5.70 to 5.71.

  • The B module has been upgraded from version 1.23 to 1.29.

    It no longer crashes when taking apart a y/// containing characters outside the octet range or compiled in a use utf8 scope.

    The size of the shared object has been reduced by about 40%, with no reduction in functionality.

  • B::Concise has been upgraded from version 0.78 to 0.83.

    B::Concise marks rv2sv(), rv2av(), and rv2hv() ops with the new OPpDEREF flag as "DREFed".

    It no longer produces mangled output with the -tree option [perl #80632].

  • B::Debug has been upgraded from version 1.12 to 1.16.

  • B::Deparse has been upgraded from version 0.96 to 1.03.

    The deparsing of a nextstate op has changed when it has both a label and a change of package (relative to the previous nextstate), of %^H, or of other state. The label was previously emitted first, but is now emitted last (5.12.1).

    The no 5.13.2 or similar form is now correctly handled by B::Deparse (5.12.3).

    B::Deparse now properly handles the code that applies a conditional pattern match against implicit $_ as it was fixed in [perl #20444].

    Deparsing of our followed by a variable with funny characters (as permitted under the use utf8 pragma) has also been fixed [perl #33752].

  • B::Lint has been upgraded from version 1.11_01 to 1.13.

  • base has been upgraded from version 2.15 to 2.16.

  • Benchmark has been upgraded from version 1.11 to 1.12.

  • bignum has been upgraded from version 0.23 to 0.27.

  • Carp has been upgraded from version 1.15 to 1.20.

    Carp now detects incomplete caller EXPR overrides and avoids using bogus @DB::args. To provide backtraces, Carp relies on particular behaviour of the caller() builtin. Carp now detects if other code has overridden this with an incomplete implementation, and modifies its backtrace accordingly. Previously incomplete overrides would cause incorrect values in backtraces (best case), or obscure fatal errors (worst case).

    This fixes certain cases of "Bizarre copy of ARRAY" caused by modules overriding caller() incorrectly (5.12.2).

    It now also avoids using regular expressions that cause Perl to load its Unicode tables, so as to avoid the "BEGIN not safe after errors" error that ensues if there has been a syntax error [perl #82854].

  • CGI has been upgraded from version 3.48 to 3.52.

    This provides the following security fixes: the MIME boundary in multipart_init() is now random and the handling of newlines embedded in header values has been improved.

  • Compress::Raw::Bzip2 has been upgraded from version 2.024 to 2.033.

    It has been updated to use bzip2(1) 1.0.6.

  • Compress::Raw::Zlib has been upgraded from version 2.024 to 2.033.

  • constant has been upgraded from version 1.20 to 1.21.

    Unicode constants work once more. They have been broken since Perl 5.10.0 [CPAN RT #67525].

  • CPAN has been upgraded from version 1.94_56 to 1.9600.

    Major highlights:

    • much less configuration dialog hassle
    • support for META/MYMETA.json
    • support for local::lib
    • support for HTTP::Tiny to reduce the dependency on FTP sites
    • automatic mirror selection
    • iron out all known bugs in configure_requires
    • support for distributions compressed with bzip2(1)
    • allow Foo/Bar.pm on the command line to mean Foo::Bar

  • CPANPLUS has been upgraded from version 0.90 to 0.9103.

    A change to cpanp-run-perl resolves RT #55964 and RT #57106, both of which related to failures to install distributions that use Module::Install::DSL (5.12.2).

    A dependency on Config was not recognised as a core module dependency. This has been fixed.

    CPANPLUS now includes support for META.json and MYMETA.json.

  • CPANPLUS::Dist::Build has been upgraded from version 0.46 to 0.54.

  • Data::Dumper has been upgraded from version 2.125 to 2.130_02.

    The indentation used to be off when $Data::Dumper::Terse was set. This has been fixed [perl #73604].

    This upgrade also fixes a crash when using custom sort functions that might cause the stack to change [perl #74170].

    Dumpxs no longer crashes with globs returned by *$io_ref [perl #72332].

  • DB_File has been upgraded from version 1.820 to 1.821.

  • DBM_Filter has been upgraded from version 0.03 to 0.04.

  • Devel::DProf has been upgraded from version 20080331.00 to 20110228.00.

    Merely loading Devel::DProf now no longer triggers profiling to start. Both use Devel::DProf and perl -d:DProf ... behave as before and start the profiler.

    NOTE: Devel::DProf is deprecated and will be removed from a future version of Perl. We strongly recommend that you install and use Devel::NYTProf instead, as it offers significantly improved profiling and reporting.

  • Devel::Peek has been upgraded from version 1.04 to 1.07.

  • Devel::SelfStubber has been upgraded from version 1.03 to 1.05.

  • diagnostics has been upgraded from version 1.19 to 1.22.

    It now renders pod links slightly better, and has been taught to find descriptions for messages that share their descriptions with other messages.

  • Digest::MD5 has been upgraded from version 2.39 to 2.51.

    It is now safe to use this module in combination with threads.

  • Digest::SHA has been upgraded from version 5.47 to 5.61.

    shasum now more closely mimics sha1sum(1)/md5sum(1).

    addfile accepts all POSIX filenames.

    New SHA-512/224 and SHA-512/256 transforms (ref. NIST Draft FIPS 180-4 [February 2011])

  • DirHandle has been upgraded from version 1.03 to 1.04.

  • Dumpvalue has been upgraded from version 1.13 to 1.16.

  • DynaLoader has been upgraded from version 1.10 to 1.13.

    It fixes a buffer overflow when passed a very long file name.

    It no longer inherits from AutoLoader; hence it no longer produces weird error messages for unsuccessful method calls on classes that inherit from DynaLoader [perl #84358].

  • Encode has been upgraded from version 2.39 to 2.42.

    Now, all 66 Unicode non-characters are treated the same way U+FFFF has always been treated: in cases when it was disallowed, all 66 are disallowed, and in cases where it warned, all 66 warn.

  • Env has been upgraded from version 1.01 to 1.02.

  • Errno has been upgraded from version 1.11 to 1.13.

    The implementation of Errno has been refactored to use about 55% less memory.

    On some platforms with unusual header files, like Win32 gcc(1) using mingw64 headers, some constants that weren't actually error numbers have been exposed by Errno. This has been fixed [perl #77416].

  • Exporter has been upgraded from version 5.64_01 to 5.64_03.

    Exporter no longer overrides $SIG{__WARN__} [perl #74472]

  • ExtUtils::CBuilder has been upgraded from version 0.27 to 0.280203.

  • ExtUtils::Command has been upgraded from version 1.16 to 1.17.

  • ExtUtils::Constant has been upgraded from 0.22 to 0.23.

    The AUTOLOAD helper code generated by ExtUtils::Constant::ProxySubs can now croak() for missing constants, or generate a complete AUTOLOAD subroutine in XS, allowing simplification of many modules that use it (Fcntl, File::Glob, GDBM_File, I18N::Langinfo, POSIX, Socket).

    ExtUtils::Constant::ProxySubs can now optionally push the names of all constants onto the package's @EXPORT_OK.

  • ExtUtils::Install has been upgraded from version 1.55 to 1.56.

  • ExtUtils::MakeMaker has been upgraded from version 6.56 to 6.57_05.

  • ExtUtils::Manifest has been upgraded from version 1.57 to 1.58.

  • ExtUtils::ParseXS has been upgraded from version 2.21 to 2.2210.

  • Fcntl has been upgraded from version 1.06 to 1.11.

  • File::Basename has been upgraded from version 2.78 to 2.82.

  • File::CheckTree has been upgraded from version 4.4 to 4.41.

  • File::Copy has been upgraded from version 2.17 to 2.21.

  • File::DosGlob has been upgraded from version 1.01 to 1.04.

    It allows patterns containing literal parentheses: they no longer need to be escaped. On Windows, it no longer adds an extra ./ to file names returned when the pattern is a relative glob with a drive specification, like C:*.pl [perl #71712].

  • File::Fetch has been upgraded from version 0.24 to 0.32.

    HTTP::Lite is now supported for the "http" scheme.

    The fetch(1) utility is supported on FreeBSD, NetBSD, and Dragonfly BSD for the http and ftp schemes.

  • File::Find has been upgraded from version 1.15 to 1.19.

    It improves handling of backslashes on Windows, so that paths like C:\dir\/file are no longer generated [perl #71710].

  • File::Glob has been upgraded from version 1.07 to 1.12.

  • File::Spec has been upgraded from version 3.31 to 3.33.

    Several portability fixes were made in File::Spec::VMS: a colon is now recognized as a delimiter in native filespecs; caret-escaped delimiters are recognized for better handling of extended filespecs; catpath() returns an empty directory rather than the current directory if the input directory name is empty; and abs2rel() properly handles Unix-style input (5.12.2).

  • File::stat has been upgraded from 1.02 to 1.05.

    The -x and -X file test operators now work correctly when run by the superuser.

  • Filter::Simple has been upgraded from version 0.84 to 0.86.

  • GDBM_File has been upgraded from 1.10 to 1.14.

    This fixes a memory leak when DBM filters are used.

  • Hash::Util has been upgraded from 0.07 to 0.11.

    Hash::Util no longer emits spurious "uninitialized" warnings when recursively locking hashes that have undefined values [perl #74280].

  • Hash::Util::FieldHash has been upgraded from version 1.04 to 1.09.

  • I18N::Collate has been upgraded from version 1.01 to 1.02.

  • I18N::Langinfo has been upgraded from version 0.03 to 0.08.

    langinfo() now defaults to using $_ if there is no argument given, just as the documentation has always claimed.

  • I18N::LangTags has been upgraded from version 0.35 to 0.35_01.

  • if has been upgraded from version 0.05 to 0.0601.

  • IO has been upgraded from version 1.25_02 to 1.25_04.

    This version of IO includes a new IO::Select, which now allows IO::Handle objects (and objects in derived classes) to be removed from an IO::Select set even if the underlying file descriptor is closed or invalid.

  • IPC::Cmd has been upgraded from version 0.54 to 0.70.

    Resolves an issue with splitting Win32 command lines. An argument consisting of the single character "0" used to be omitted (CPAN RT #62961).

  • IPC::Open3 has been upgraded from 1.05 to 1.09.

    open3() now produces an error if the exec call fails, allowing this condition to be distinguished from a child process that exited with a non-zero status [perl #72016].

    The internal xclose() routine now knows how to handle file descriptors as documented, so duplicating STDIN in a child process using its file descriptor now works [perl #76474].

  • IPC::SysV has been upgraded from version 2.01 to 2.03.

  • lib has been upgraded from version 0.62 to 0.63.

  • Locale::Maketext has been upgraded from version 1.14 to 1.19.

    Locale::Maketext now supports external caches.

    This upgrade also fixes an infinite loop in Locale::Maketext::Guts::_compile() when working with tainted values (CPAN RT #40727).

    ->maketext calls now back up and restore $@ so error messages are not suppressed (CPAN RT #34182).

  • Log::Message has been upgraded from version 0.02 to 0.04.

  • Log::Message::Simple has been upgraded from version 0.06 to 0.08.

  • Math::BigInt has been upgraded from version 1.89_01 to 1.994.

    This fixes, among other things, incorrect results when computing binomial coefficients [perl #77640].

    It also prevents sqrt($int) from crashing under use bigrat [perl #73534].

  • Math::BigInt::FastCalc has been upgraded from version 0.19 to 0.28.

  • Math::BigRat has been upgraded from version 0.24 to 0.26_02.

  • Memoize has been upgraded from version 1.01_03 to 1.02.

  • MIME::Base64 has been upgraded from 3.08 to 3.13.

    Includes new functions to calculate the length of encoded and decoded base64 strings.

    Now provides encode_base64url() and decode_base64url() functions to process the base64 scheme for "URL applications".

  • Module::Build has been upgraded from version 0.3603 to 0.3800.

    A notable change is the deprecation of several modules. Module::Build::Version has been deprecated and Module::Build now relies on the version pragma directly. Module::Build::ModuleInfo has been deprecated in favor of a standalone copy called Module::Metadata. Module::Build::YAML has been deprecated in favor of CPAN::Meta::YAML.

    Module::Build now also generates META.json and MYMETA.json files in accordance with version 2 of the CPAN distribution metadata specification, CPAN::Meta::Spec. The older format META.yml and MYMETA.yml files are still generated.

  • Module::CoreList has been upgraded from version 2.29 to 2.47.

    Besides listing the updated core modules of this release, it also stops listing the Filespec module. That module never existed in core. The scripts generating Module::CoreList confused it with VMS::Filespec, which actually is a core module as of Perl 5.8.7.

  • Module::Load has been upgraded from version 0.16 to 0.18.

  • Module::Load::Conditional has been upgraded from version 0.34 to 0.44.

  • The mro pragma has been upgraded from version 1.02 to 1.07.

  • NDBM_File has been upgraded from version 1.08 to 1.12.

    This fixes a memory leak when DBM filters are used.

  • Net::Ping has been upgraded from version 2.36 to 2.38.

  • NEXT has been upgraded from version 0.64 to 0.65.

  • Object::Accessor has been upgraded from version 0.36 to 0.38.

  • ODBM_File has been upgraded from version 1.07 to 1.10.

    This fixes a memory leak when DBM filters are used.

  • Opcode has been upgraded from version 1.15 to 1.18.

  • The overload pragma has been upgraded from 1.10 to 1.13.

    overload::Method can now handle subroutines that are themselves blessed into overloaded classes [perl #71998].

    The documentation has greatly improved. See Documentation below.

  • Params::Check has been upgraded from version 0.26 to 0.28.

  • The parent pragma has been upgraded from version 0.223 to 0.225.

  • Parse::CPAN::Meta has been upgraded from version 1.40 to 1.4401.

    The latest Parse::CPAN::Meta can now read YAML and JSON files using CPAN::Meta::YAML and JSON::PP, which are now part of the Perl core.

  • PerlIO::encoding has been upgraded from version 0.12 to 0.14.

  • PerlIO::scalar has been upgraded from 0.07 to 0.11.

    A read() after a seek() beyond the end of the string no longer thinks it has data to read [perl #78716].

  • PerlIO::via has been upgraded from version 0.09 to 0.11.

  • Pod::Html has been upgraded from version 1.09 to 1.11.

  • Pod::LaTeX has been upgraded from version 0.58 to 0.59.

  • Pod::Perldoc has been upgraded from version 3.15_02 to 3.15_03.

  • Pod::Simple has been upgraded from version 3.13 to 3.16.

  • POSIX has been upgraded from 1.19 to 1.24.

    It now includes constants for POSIX signals.

  • The re pragma has been upgraded from version 0.11 to 0.18.

    The use re '/flags' subpragma is new.

    The regmust() function used to crash when called on a regular expression belonging to a pluggable engine. Now it croaks instead.

    regmust() no longer leaks memory.

  • Safe has been upgraded from version 2.25 to 2.29.

    Coderefs returned by reval() and rdo() are now wrapped via wrap_code_refs() (5.12.1).

    This fixes a possible infinite loop when looking for coderefs.

    It adds several version::vxs::* routines to the default share.

  • SDBM_File has been upgraded from version 1.06 to 1.09.

  • SelfLoader has been upgraded from 1.17 to 1.18.

    It now works in taint mode [perl #72062].

  • The sigtrap pragma has been upgraded from version 1.04 to 1.05.

    It no longer tries to modify read-only arguments when generating a backtrace [perl #72340].

  • Socket has been upgraded from version 1.87 to 1.94.

    See Improved IPv6 support above.

  • Storable has been upgraded from version 2.22 to 2.27.

    Includes performance improvement for overloaded classes.

    This adds support for serialising code references that contain UTF-8 strings correctly. The Storable minor version number changed as a result, meaning that Storable users who set $Storable::accept_future_minor to a FALSE value will see errors (see FORWARD COMPATIBILITY in Storable for more details).

    Freezing no longer gets confused if the Perl stack gets reallocated during freezing [perl #80074].

  • Sys::Hostname has been upgraded from version 1.11 to 1.16.

  • Term::ANSIColor has been upgraded from version 2.02 to 3.00.

  • Term::UI has been upgraded from version 0.20 to 0.26.

  • Test::Harness has been upgraded from version 3.17 to 3.23.

  • Test::Simple has been upgraded from version 0.94 to 0.98.

    Among many other things, subtests without a plan or no_plan now have an implicit done_testing() added to them.

  • Thread::Semaphore has been upgraded from version 2.09 to 2.12.

    It provides two new methods that give more control over the decrementing of semaphores: down_nb and down_force.

  • Thread::Queue has been upgraded from version 2.11 to 2.12.

  • The threads pragma has been upgraded from version 1.75 to 1.83.

  • The threads::shared pragma has been upgraded from version 1.32 to 1.37.

  • Tie::Hash has been upgraded from version 1.03 to 1.04.

    Calling Tie::Hash->TIEHASH() used to loop forever. Now it croaks.
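
    A small sketch of what this looks like to calling code, assuming the hash is tied directly to the base class rather than to a subclass; the exact wording of the error message is not relied upon here:

    ```perl
    use strict;
    use warnings;
    use Tie::Hash;

    # Tying straight to the abstract base class now dies promptly,
    # so the failure can be trapped with eval.
    my $tied  = eval { tie my %h, 'Tie::Hash' };
    my $error = $@;

    print "tie failed: $error" unless defined $tied;
    ```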

  • Tie::Hash::NamedCapture has been upgraded from version 0.06 to 0.08.

  • Tie::RefHash has been upgraded from version 1.38 to 1.39.

  • Time::HiRes has been upgraded from version 1.9719 to 1.9721_01.

  • Time::Local has been upgraded from version 1.1901_01 to 1.2000.

  • Time::Piece has been upgraded from version 1.15_01 to 1.20_01.

  • Unicode::Collate has been upgraded from version 0.52_01 to 0.73.

    Unicode::Collate has been updated to use Unicode 6.0.0.

    Unicode::Collate::Locale now supports a plethora of new locales: ar, be, bg, de__phonebook, hu, hy, kk, mk, nso, om, tn, vi, hr, ig, ja, ko, ru, sq, se, sr, to, uk, zh, zh__big5han, zh__gb2312han, zh__pinyin, and zh__stroke.

    The following modules have been added:

    Unicode::Collate::CJK::Big5 for zh__big5han, which tailors CJK Unified Ideographs in the order of CLDR's big5han ordering.

    Unicode::Collate::CJK::GB2312 for zh__gb2312han, which tailors CJK Unified Ideographs in the order of CLDR's gb2312han ordering.

    Unicode::Collate::CJK::JISX0208, which tailors 6355 kanji (CJK Unified Ideographs) in the JIS X 0208 order.

    Unicode::Collate::CJK::Korean, which tailors CJK Unified Ideographs in the order of CLDR's Korean ordering.

    Unicode::Collate::CJK::Pinyin for zh__pinyin, which tailors CJK Unified Ideographs in the order of CLDR's pinyin ordering.

    Unicode::Collate::CJK::Stroke for zh__stroke, which tailors CJK Unified Ideographs in the order of CLDR's stroke ordering.

    This also sees the switch from using the pure-Perl version of this module to the XS version.

  • Unicode::Normalize has been upgraded from version 1.03 to 1.10.

  • Unicode::UCD has been upgraded from version 0.27 to 0.32.

    A new function, Unicode::UCD::num(), has been added. This function returns the numeric value of the string passed it or undef if the string in its entirety has no "safe" numeric value. (For more detail, and for the definition of "safe", see num() in Unicode::UCD.)
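
    As a brief illustration (the input strings are arbitrary examples), num() can be imported and called like this:

    ```perl
    use strict;
    use warnings;
    use Unicode::UCD qw(num);

    my $plain = num("2014");    # every character is a decimal digit
    my $mixed = num("20x4");    # contains a non-numeric character

    print "plain: $plain\n";
    print "mixed: ", ( defined $mixed ? $mixed : 'undef' ), "\n";
    ```

    The first call yields the number 2014; the second yields undef, because the string as a whole has no numeric value.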

    This upgrade also includes several bug fixes:

    • charinfo()
      • It is now updated to Unicode Version 6.0.0 with Corrigendum #8, excepting that, just as with Perl 5.14, the code point at U+1F514 has no name.

      • Hangul syllable code points have the correct names, and their decompositions are always output without requiring Lingua::KO::Hangul::Util to be installed.

      • CJK (Chinese-Japanese-Korean) code points U+2A700 to U+2B734 and U+2B740 to U+2B81D are now properly handled.

      • Numeric values are now output for those CJK code points that have them.

      • Names output for code points with multiple aliases are now the corrected ones.

    • charscript()

      This now correctly returns "Unknown" instead of undef for the script of a code point that hasn't been assigned another one.

    • charblock()

      This now correctly returns "No_Block" instead of undef for the block of a code point that hasn't been assigned to another one.

  • The version pragma has been upgraded from 0.82 to 0.88.

    Because of a bug, now fixed, the is_strict() and is_lax() functions did not work when exported (5.12.1).

  • The warnings pragma has been upgraded from version 1.09 to 1.12.

    Calling use warnings without arguments is now significantly more efficient.

  • The warnings::register pragma has been upgraded from version 1.01 to 1.02.

    It is now possible to register warning categories other than the names of packages using warnings::register. See perllexwarn(1) for more information.

  • XSLoader has been upgraded from version 0.10 to 0.13.

  • VMS::DCLsym has been upgraded from version 1.03 to 1.05.

    Two bugs have been fixed [perl #84086]:

    The symbol table name was lost when tying a hash, due to a thinko in TIEHASH. The result was that all tied hashes interacted with the local symbol table.

    Unless a symbol table name had been explicitly specified in the call to the constructor, querying the special key :LOCAL failed to identify objects connected to the local symbol table.

  • The Win32 module has been upgraded from version 0.39 to 0.44.

    This release has several new functions: Win32::GetSystemMetrics(), Win32::GetProductInfo(), Win32::GetOSDisplayName().

    The names returned by Win32::GetOSName() and Win32::GetOSDisplayName() have been corrected.

  • XS::Typemap has been upgraded from version 0.03 to 0.05.

Removed Modules and Pragmata

As promised in Perl 5.12.0's release notes, the following modules have been removed from the core distribution, and if needed should be installed from CPAN instead.

  • Class::ISA has been removed from the Perl core. Prior version was 0.36.

  • Pod::Plainer has been removed from the Perl core. Prior version was 1.02.

  • Switch has been removed from the Perl core. Prior version was 2.16.

The removal of Shell has been deferred until after 5.14, as the implementation of Shell shipped with 5.12.0 did not correctly issue the warning that it was to be removed from core.

Documentation

New Documentation

perlgpl

perlgpl has been updated to contain GPL version 1, as is included in the README distributed with Perl (5.12.1).

Perl 5.12.x delta files

The perldelta files for Perl 5.12.1 to 5.12.3 have been added from the maintenance branch: perl5121delta, perl5122delta, perl5123delta.

perlpodstyle

New style guide for POD documentation, split mostly from the NOTES section of the pod2man(1) manpage.

perlsource, perlinterp, perlhacktut, and perlhacktips

See perlhack and perlrepository revamp, below.

Changes to Existing Documentation

perlmodlib is now complete

The perlmodlib manpage that came with Perl 5.12.0 was missing several modules due to a bug in the script that generates the list. This has been fixed [perl #74332] (5.12.1).

Replace incorrect tr/// table in perlebcdic

perlebcdic contains a helpful table to use in tr/// to convert between EBCDIC and Latin1/ASCII. The table was the inverse of the one it describes, though the code that used the table worked correctly for the specific example given.

The table has been corrected and the sample code changed to correspond.

The table has also been changed to hex from octal, and the recipes in the pod have been altered to print out leading zeros to make all values the same length.

Tricks for user-defined casing

perlunicode now contains an explanation of how to override, mangle and otherwise tweak the way Perl handles upper-, lower- and other-case conversions on Unicode data, and how to provide scoped changes to alter one's own code's behaviour without stomping on anybody else's.

INSTALL explicitly states that Perl requires a C89 compiler

This was already true, but it's now Officially Stated For The Record (5.12.2).

Explanation of \xHH and \oOOO escapes

perlop has been updated with more detailed explanation of these two character escapes.

-0NNN switch

In perlrun, the behaviour of the -0NNN switch for -0400 or higher has been clarified (5.12.2).

Maintenance policy

perlpolicy now contains the policy on what patches are acceptable for maintenance branches (5.12.1).

Deprecation policy

perlpolicy now contains the policy on compatibility and deprecation along with definitions of terms like "deprecation" (5.12.2).

New descriptions in perldiag

The following existing diagnostics are now documented:

perlbook

perlbook has been expanded to cover many more popular books.

SvTRUE macro

The documentation for the SvTRUE macro in perlapi was simply wrong in stating that get-magic is not processed. It has been corrected.

op manipulation functions

Several API functions that process optrees have been newly documented.

perlvar revamp

perlvar reorders the variables and groups them by topic. Each variable introduced after Perl 5.000 notes the first version in which it is available. perlvar also has a new section for deprecated variables to note when they were removed.

Array and hash slices in scalar context

These are now documented in perldata.
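
The behaviour in question can be sketched as follows; the variable names and data are made up for illustration. In scalar context, a slice evaluates to the last element it selects:

```perl
use strict;
use warnings;

my @letters = qw(a b c d);
my %age     = ( tom => 30, ann => 25 );

# Scalar assignment puts each slice in scalar context,
# so only the last selected element is kept.
my $from_array = @letters[ 1, 2 ];       # 'c'
my $from_hash  = @age{ qw(tom ann) };    # 25

print "$from_array $from_hash\n";
```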

use locale and formats

perlform and perllocale have been corrected to state that use locale affects formats.

overload

overload's documentation has practically undergone a rewrite. It is now much more straightforward and clear.

perlhack and perlrepository revamp

The perlhack document is now much shorter, and focuses on the Perl 5 development process and submitting patches to Perl. The technical content has been moved to several new documents, perlsource, perlinterp, perlhacktut, and perlhacktips. This technical content has been only lightly edited.

The perlrepository document has been renamed to perlgit. This new document is just a how-to on using git with the Perl source code. Any other content that used to be in perlrepository has been moved to perlhack.

Time::Piece examples

Examples in perlfaq4 have been updated to show the use of Time::Piece.
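
The snippet below is not taken from perlfaq4; it is only a short sketch of the style of code Time::Piece allows, using the epoch so that the results are fixed:

```perl
use strict;
use warnings;
use Time::Piece;    # overrides localtime() and gmtime() to return objects

my $epoch = gmtime(0);

my $date  = $epoch->ymd;                              # 1970-01-01
my $yday  = $epoch->yday;                             # 0 (day of year, zero-based)
my $stamp = $epoch->strftime('%Y-%m-%d %H:%M:%S');    # 1970-01-01 00:00:00

print "$date (day $yday) $stamp\n";
```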

Diagnostics

The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see perldiag.

New Diagnostics

New Errors

  • Closure prototype called

    This error occurs when a subroutine reference passed to an attribute handler is called, if the subroutine is a closure [perl #68560].

  • Insecure user-defined property %s

    Perl detected tainted data when trying to compile a regular expression that contains a call to a user-defined character property function, meaning \p{IsFoo} or \p{InFoo}. See User-Defined Character Properties in perlunicode and perlsec.

  • panic: gp_free failed to free glob pointer - something is repeatedly re-creating entries

    This new error is triggered if a destructor called on an object in a typeglob that is being freed creates a new typeglob entry containing an object with a destructor that creates a new entry containing an object etc.

  • Parsing code internal error (%s)

    This new fatal error is produced when parsing code supplied by an extension violates the parser's API in a detectable way.

  • refcnt: fd %d%s

    This new error only occurs if an internal consistency check fails when a pipe is about to be closed.

  • Regexp modifier "/%c" may not appear twice

    The regular expression pattern has one of the mutually exclusive modifiers repeated.

  • Regexp modifiers "/%c" and "/%c" are mutually exclusive

    The regular expression pattern has more than one of the mutually exclusive modifiers.

  • Using !~ with %s doesn't make sense

    This error occurs when !~ is used with s///r or y///r.
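
Two of the errors listed above are raised at compile time, so one way to observe them is through a string eval. This is only a sketch: the patterns and variable names are arbitrary, and the /l and /u pair is assumed to be one of the mutually exclusive combinations.

```perl
use strict;
use warnings;

# Negated binding with a non-destructive substitution.
my $ok_negated  = eval q{ my $s = "abc"; my $r = $s !~ s/b/x/r; 1 };
my $err_negated = $@;

# Two mutually exclusive character-set modifiers on one pattern.
my $ok_mods  = eval q{ my $m = "abc" =~ /b/lu; 1 };
my $err_mods = $@;

print "negated: $err_negated";
print "modifiers: $err_mods";
```

Both evals fail to compile, and $@ holds the corresponding diagnostic in each case.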

New Warnings

  • "\b{" is deprecated; use "\b\{" instead
  • "\B{" is deprecated; use "\B\{" instead

    Use of an unescaped "{" immediately following a \b or \B is now deprecated in order to reserve its use for Perl itself in a future release.

  • Operation "%s" returns its argument for ...

    Performing an operation requiring Unicode semantics (such as case-folding) on a Unicode surrogate or a non-Unicode character now triggers this warning.

  • Use of qw(...) as parentheses is deprecated

    See Use of qw(...) as parentheses, above, for details.

Changes to Existing Diagnostics

  • The "Variable $foo is not imported" warning that precedes a strict 'vars' error has now been assigned the "misc" category, so that no warnings will suppress it [perl #73712].

  • warn() and die() now produce "Wide character" warnings when fed a character outside the byte range if STDERR is a byte-sized handle.

  • The "Layer does not match this perl" error message has been replaced with these more helpful messages [perl #73754]:

    • PerlIO layer function table size (%d) does not match size expected by this perl (%d)

    • PerlIO layer instance size (%d) does not match size expected by this perl (%d)

  • The "Found = in conditional" warning that is emitted when a constant is assigned to a variable in a condition is now withheld if the constant is actually a subroutine or one generated by use constant, since the value of the constant may not be known at the time the program is written [perl #77762].

  • Previously, if none of the gethostbyaddr(), gethostbyname() and gethostent() functions were implemented on a given platform, they would all die with the message "Unsupported socket function 'gethostent' called", with analogous messages for getnet*() and getserv*(). This has been corrected.

  • The warning message about unrecognized regular expression escapes passed through has been changed to include any literal "{" following the two-character escape. For example, "\q{" is now emitted instead of "\q".

Utility Changes

perlbug(1)

  • perlbug now looks in the EMAIL environment variable for a return address if the REPLY-TO and REPLYTO variables are empty.

  • perlbug did not previously generate a "From:" header, potentially resulting in dropped mail; it now includes that header.

  • The user's address is now used as the Return-Path.

    Many systems these days don't have a valid Internet domain name, and perlbug@perl.org does not accept email with a return-path that does not resolve. So the user's address is now passed to sendmail so it's less likely to get stuck in a mail queue somewhere [perl #82996].

  • perlbug now always gives the reporter a chance to change the email address it guesses for them (5.12.2).

  • perlbug should no longer warn about uninitialized values when using the -d and -v options (5.12.2).

perl5db.pl

  • The remote terminal works after forking and spawns new sessions, one per forked process.

ptargrep

  • ptargrep is a new utility to apply pattern matching to the contents of files in a tar archive. It comes with Archive::Tar.

Configuration and Compilation

See also Naming fixes in Policy_sh.SH may invalidate Policy.sh, above.

  • CCINCDIR and CCLIBDIR for the mingw64 cross-compiler are now correctly under $(CCHOME)\mingw\include and \lib rather than immediately below $(CCHOME).

    This means the "incpath", "libpth", "ldflags", "lddlflags" and "ldflags_nolargefiles" values in Config.pm and Config_heavy.pl are now set correctly.

  • make test.valgrind has been adjusted to account for cpan/dist/ext separation.

  • On compilers that support it, -Wwrite-strings is now added to cflags by default.

  • The Encode module can now (once again) be included in a static Perl build. The special-case handling for this situation got broken in Perl 5.11.0, and has now been repaired.

  • The previous default size of a PerlIO buffer (4096 bytes) has been increased to the larger of 8192 bytes and your local BUFSIZ. Benchmarks show that doubling this decade-old default increases read and write performance by around 25% to 50% when using the default layers of perlio on top of unix. To choose a non-default size, such as to get back the old value or to obtain an even larger value, configure with:

    ./Configure -Accflags=-DPERLIOBUF_DEFAULT_BUFSIZ=N

    where N is the desired size in bytes; it should probably be a multiple of your page size.

  • An "incompatible operand types" error in ternary expressions when building with clang has been fixed (5.12.2).

  • Perl now skips setuid File::Copy tests on partitions it detects mounted as nosuid (5.12.2).

Platform Support

New Platforms

  • AIX

    Perl now builds on AIX 4.2 (5.12.1).

Discontinued Platforms

  • Apollo DomainOS

    The last vestiges of support for this platform have been excised from the Perl distribution. It was officially discontinued in version 5.12.0. It had not worked for years before that.

  • MacOS Classic

    The last vestiges of support for this platform have been excised from the Perl distribution. It was officially discontinued in an earlier version.

Platform-Specific Notes

AIX

  • README.aix has been updated with information about the XL C/C++ V11 compiler suite (5.12.2).

ARM

  • The d_u32align configuration probe on ARM has been fixed (5.12.2).

Cygwin

  • MakeMaker has been updated to build manpages on cygwin.

  • Improved rebase behaviour

    If a DLL is updated on cygwin, the old imagebase address is reused. This solves most rebase errors, especially when updating core DLLs. See http://www.tishler.net/jason/software/rebase/rebase-2.4.2.README for more information.

  • Support for the standard cygwin dll prefix (needed for FFIs)

  • Updated build hints file

FreeBSD 7

  • FreeBSD 7 no longer contains /usr/bin/objformat. At build time, Perl now skips the objformat check for versions 7 and higher and assumes ELF (5.12.1).

HP-UX

  • Perl now allows -Duse64bitint without promoting to use64bitall on HP-UX (5.12.1).

IRIX

  • Conversion of strings to floating-point numbers is now more accurate on IRIX systems [perl #32380].

Mac OS X

  • Early versions of Mac OS X (Darwin) had buggy implementations of the setregid(), setreuid(), setrgid(), and setruid() functions, so Perl would pretend they did not exist.

    These functions are now recognised on Mac OS 10.5 (Leopard; Darwin 9) and higher, as they have been fixed [perl #72990].

MirBSD

  • Previously if you built Perl with a shared libperl.so on MirBSD (the default config), it would work up to the installation; however, once installed, it would be unable to find libperl. Path handling is now treated as in the other BSD dialects.

NetBSD

  • The NetBSD hints file has been changed to make the system malloc the default.

OpenBSD

  • OpenBSD > 3.7 has a new malloc implementation which is mmap-based, and as such can release memory back to the OS; however, Perl's use of this malloc causes a substantial slowdown, so we now default to using Perl's malloc instead [perl #75742].

OpenVOS

  • Perl now builds again with OpenVOS (formerly known as Stratus VOS) [perl #78132] (5.12.3).

Solaris

  • DTrace is now supported on Solaris. There used to be build failures, but these have been fixed [perl #73630] (5.12.3).

VMS

  • Extension building on older (pre 7.3-2) VMS systems was broken because configure.com hit the DCL symbol length limit of 1K. We now work within this limit when assembling the list of extensions in the core build (5.12.1).

  • We fixed configuring and building Perl with -Uuseperlio (5.12.1).

  • PerlIOUnix_open now honours the default permissions on VMS.

    When perlio became the default and unix became the default bottom layer, the most common path for creating files from Perl became PerlIOUnix_open , which has always explicitly used 0666 as the permission mask. This prevents inheriting permissions from RMS defaults and ACLs, so to avoid that problem, we now pass 0777 to open(). In the VMS CRTL, 0777 has a special meaning over and above intersecting with the current umask; specifically, it allows Unix syscalls to preserve native default permissions (5.12.3).

  • The shortening of symbols longer than 31 characters in the core C sources and in extensions is now by default done by the C compiler rather than by xsubpp (which could only do so for generated symbols in XS code). You can reenable xsubpp's symbol shortening by configuring with -Uuseshortenedsymbols, but you'll have some work to do to get the core sources to compile.

  • Record-oriented files (record format variable or variable with fixed control) opened for write by the perlio layer will now be line-buffered to prevent the introduction of spurious line breaks whenever the perlio buffer fills up.

  • git_version.h is now installed on VMS. This was an oversight in v5.12.0 which caused some extensions to fail to build (5.12.2).

  • Several memory leaks in stat FILEHANDLE have been fixed (5.12.2).

  • A memory leak in Perl_rename() due to a double allocation has been fixed (5.12.2).

  • A memory leak in vms_fid_to_name() (used by realpath() and realname()) has been fixed (5.12.2).

Windows

See also fork() emulation will not wait for signalled children and Perl source code is read in text mode on Windows, above.

  • Fixed build process for SDK2003SP1 compilers.

  • Compilation with Visual Studio 2010 is now supported.

  • When using old 32-bit compilers, the define _USE_32BIT_TIME_T is now set in $Config{ccflags}. This improves portability when compiling XS extensions using new compilers, but for a Perl compiled with old 32-bit compilers.

  • $Config{gccversion} is now set correctly when Perl is built using the mingw64 compiler from http://mingw64.org [perl #73754].

  • When building Perl with the mingw64 x64 cross-compiler, the incpath, libpth, ldflags, lddlflags and ldflags_nolargefiles values in Config.pm and Config_heavy.pl were not previously being set correctly because, with that compiler, the include and lib directories are not immediately below $(CCHOME) (5.12.2).

  • The build process proceeds more smoothly with mingw and dmake when C:\MSYS\bin is in the PATH, due to a Cwd fix.

  • Support for building with Visual C++ 2010 is now underway, but is not yet complete. See README.win32 or perlwin32 for more details.

  • The option to use an externally-supplied crypt(), or to build with no crypt() at all, has been removed. Perl supplies its own crypt() implementation for Windows, and the political situation that required this part of the distribution to sometimes be omitted is long gone.

Internal Changes

New APIs

CLONE_PARAMS structure added to ease correct thread creation

Modules that create threads should now create CLONE_PARAMS structures by calling the new function Perl_clone_params_new(), and free them with Perl_clone_params_del(). This will ensure compatibility with any future changes to the internals of the CLONE_PARAMS structure layout, and that it is correctly allocated and initialised.

New parsing functions

Several functions have been added for parsing Perl statements and expressions. These functions are meant to be used by XS code invoked during Perl parsing, in a recursive-descent manner, to allow modules to augment the standard Perl syntax.

Hints hash API

A new C API for introspecting the hinthash %^H at runtime has been added. See cop_hints_2hv, cop_hints_fetchpvn, cop_hints_fetchpvs, cop_hints_fetchsv, and hv_copy_hints_hv in perlapi for details.

A new, experimental API has been added for accessing the internal structure that Perl uses for %^H. See the functions beginning with cophh_ in perlapi.

C interface to caller()

The caller_cx function has been added as an XSUB-writer's equivalent of caller(). See perlapi for details.

Custom per-subroutine check hooks

XS code in an extension module can now annotate a subroutine (whether implemented in XS or in Perl) so that nominated XS code will be called at compile time (specifically as part of op checking) to change the op tree of that subroutine. The compile-time check function (supplied by the extension module) can implement argument processing that can't be expressed as a prototype, generate customised compile-time warnings, perform constant folding for a pure function, inline a subroutine consisting of sufficiently simple ops, replace the whole call with a custom op, and so on. This was previously all possible by hooking the entersub op checker, but the new mechanism makes it easy to tie the hook to a specific subroutine. See cv_set_call_checker in perlapi.

To help in writing custom check hooks, several subtasks within standard entersub op checking have been separated out and exposed in the API.

Improved support for custom OPs

Custom ops can now be registered with the new custom_op_register C function and the XOP structure. This will make it easier to add new properties of custom ops in the future. Two new properties have been added already, xop_class and xop_peep.

xop_class is one of the OA_*OP constants. It allows B and other introspection mechanisms to work with custom ops that aren't BASEOPs. xop_peep is a pointer to a function that will be called for ops of this type from Perl_rpeep.

See Custom Operators in perlguts and Custom Operators in perlapi for more detail.

The old PL_custom_op_names/PL_custom_op_descs interface is still supported but discouraged.

Scope hooks

It is now possible for XS code to hook into Perl's lexical scope mechanism at compile time, using the new Perl_blockhook_register function. See Compile-time scope hooks in perlguts.

The recursive part of the peephole optimizer is now hookable

In addition to PL_peepp, for hooking into the toplevel peephole optimizer, PL_rpeepp is now available to hook into the optimizer recursing into side-chains of the optree.

New non-magical variants of existing functions

The following functions/macros have been added to the API. The *_nomg macros are equivalent to their non-_nomg variants, except that they ignore get-magic. Those ending in _flags allow one to specify whether get-magic is processed.

  sv_2bool_flags
  SvTRUE_nomg
  sv_2nv_flags
  SvNV_nomg
  sv_cmp_flags
  sv_cmp_locale_flags
  sv_eq_flags
  sv_collxfrm_flags

In some of these cases, the non-_flags functions have been replaced with wrappers around the new functions.

pv/pvs/sv versions of existing functions

Many functions ending with pvn now have equivalent pv/pvs/sv versions.

List op-building functions

List op-building functions have been added to the API. See op_append_elem, op_append_list, and op_prepend_elem in perlapi.

LINKLIST

The LINKLIST macro, part of op building that constructs the execution-order op chain, has been added to the API.

Localisation functions

The save_freeop, save_op, save_pushi32ptr and save_pushptrptr functions have been added to the API.

Stash names

A stash can now have a list of effective names in addition to its usual name. The first effective name can be accessed via the HvENAME macro, which is now the recommended name to use in MRO linearisations (HvNAME being a fallback if there is no HvENAME).

These names are added and deleted via hv_ename_add and hv_ename_delete. These two functions are not part of the API.

New functions for finding and removing magic

The mg_findext() and sv_unmagicext() functions have been added to the API. They allow extension authors to find and remove magic attached to scalars based on both the magic type and the magic virtual table, similar to how sv_magicext() attaches magic of a certain type and with a given virtual table to a scalar. This eliminates the need for extensions to walk the list of MAGIC pointers of an SV to find the magic that belongs to them.

find_rundefsv

This function returns the SV representing $_, whether it's lexical or dynamic.

Perl_croak_no_modify

Perl_croak_no_modify() is short-hand for Perl_croak("%s", PL_no_modify).

PERL_STATIC_INLINE define

The PERL_STATIC_INLINE define has been added to provide the best-guess incantation to use for static inline functions, if the C compiler supports C99-style static inline. If it doesn't, it'll give a plain static.

HAS_STATIC_INLINE can be used to check if the compiler actually supports inline functions.

New pv_escape option for hexadecimal escapes

A new option, PERL_PV_ESCAPE_NONASCII, has been added to pv_escape to dump all characters above ASCII in hexadecimal. Before, one could get all characters as hexadecimal or the Latin1 non-ASCII as octal.

lex_start

lex_start has been added to the API, but is considered experimental.

op_scope() and op_lvalue()

The op_scope() and op_lvalue() functions have been added to the API, but are considered experimental.

C API Changes

PERL_POLLUTE has been removed

The option to define PERL_POLLUTE to expose older 5.005 symbols for backwards compatibility has been removed. Its use was always discouraged, and MakeMaker contains a more specific escape hatch:

  perl Makefile.PL POLLUTE=1

This can be used for modules that have not been upgraded to 5.6 naming conventions (and really should be completely obsolete by now).

Check API compatibility when loading XS modules

When Perl's API changes in incompatible ways (which usually happens between major releases), XS modules compiled for previous versions of Perl will no longer work. They need to be recompiled against the new Perl.

The XS_APIVERSION_BOOTCHECK macro has been added to ensure that modules are recompiled and to prevent users from accidentally loading modules compiled for old perls into newer perls. That macro, which is called when loading every newly compiled extension, compares the API version of the running perl with the version a module has been compiled for and raises an exception if they don't match.

Perl_fetch_cop_label

The first argument of the C API function Perl_fetch_cop_label has changed from struct refcounted_he * to COP *, to insulate the user from implementation details.

This API function was marked as "may change", and likely isn't in use outside the core. (Neither an unpacked CPAN nor Google's codesearch finds any other references to it.)

GvCV() and GvGP() are no longer lvalues

The new GvCV_set() and GvGP_set() macros are now provided to replace assignment to those two macros.

This allows a future commit to eliminate some backref magic between GV and CVs, which will require complete control over assignment to the gp_cv slot.

CvGV() is no longer an lvalue

Under some circumstances, the CvGV() field of a CV is now reference-counted. To ensure consistent behaviour, direct assignment to it, for example CvGV(cv) = gv, is now a compile-time error. A new macro, CvGV_set(cv,gv), has been introduced to run this operation safely. Note that modification of this field is not part of the public API, regardless of this new macro (and despite its being listed in this section).

CvSTASH() is no longer an lvalue

The CvSTASH() macro can now only be used as an rvalue. CvSTASH_set() has been added to replace assignment to CvSTASH(). This is to ensure that backreferences are handled properly. These macros are not part of the API.

Calling conventions for newFOROP and newWHILEOP

The way the parser handles labels has been cleaned up and refactored. As a result, the newFOROP() constructor function no longer takes a parameter stating what label is to go in the state op.

The newWHILEOP() and newFOROP() functions no longer accept a line number as a parameter.

Flags passed to uvuni_to_utf8_flags and utf8n_to_uvuni

Some of the flags parameters to uvuni_to_utf8_flags() and utf8n_to_uvuni() have changed. This is a result of Perl's now allowing internal storage and manipulation of code points that are problematic in some situations. Hence, the default actions for these functions have been complemented to allow these code points. The new flags are documented in perlapi. Code that requires the problematic code points to be rejected needs to change to use the new flags. Some flag names are retained for backward source compatibility, though they do nothing, as they are now the default. However, the flags UNICODE_ALLOW_FDD0, UNICODE_ALLOW_FFFF, UNICODE_ILLEGAL, and UNICODE_IS_ILLEGAL have been removed, as they stem from a fundamentally broken model of how the Unicode non-character code points should be handled, which is now described in Non-character code points in perlunicode. See also the Unicode section under Selected Bug Fixes.

Deprecated C APIs

  • Perl_ptr_table_clear

    Perl_ptr_table_clear is no longer part of Perl's public API. Calling it now generates a deprecation warning, and it will be removed in a future release.

  • sv_compile_2op

    The sv_compile_2op() API function is now deprecated. Searches suggest that nothing on CPAN is using it, so this should have zero impact.

    It attempted to provide an API to compile code down to an optree, but failed to bind correctly to lexicals in the enclosing scope. It's not possible to fix this problem within the constraints of its parameters and return value.

  • find_rundefsvoffset

    The find_rundefsvoffset function has been deprecated. It appeared that its design was insufficient for reliably getting the lexical $_ at run-time.

    Use the new find_rundefsv function or the UNDERBAR macro instead. They directly return the right SV representing $_, whether it's lexical or dynamic.

  • CALL_FPTR and CPERLscope

    These are left over from an old implementation of MULTIPLICITY using C++ objects, which was removed in Perl 5.8. Nowadays these macros do exactly nothing, so they shouldn't be used anymore.

    For compatibility, they are still defined for external XS code. Only extensions defining PERL_CORE must be updated now.

Other Internal Changes

Stack unwinding

The protocol for unwinding the C stack at the last stage of a die has changed how it identifies the target stack frame. This now uses a separate variable PL_restartjmpenv, where previously it relied on the blk_eval.cur_top_env pointer in the eval context frame that has nominally just been discarded. This change means that code running during various stages of Perl-level unwinding no longer needs to take care to avoid destroying the ghost frame.

Scope stack entries

The format of entries on the scope stack has been changed, resulting in a reduction of memory usage of about 10%. In particular, the memory used by the scope stack to record each active lexical variable has been halved.

Memory allocation for pointer tables

Memory allocation for pointer tables has been changed. Previously Perl_ptr_table_store allocated memory from the same arena system as SV bodies and HEs, with freed memory remaining bound to those arenas until interpreter exit. Now it allocates memory from arenas private to the specific pointer table, and that memory is returned to the system when Perl_ptr_table_free is called. Additionally, allocation and release are both less CPU intensive.

UNDERBAR

The UNDERBAR macro now calls find_rundefsv. dUNDERBAR is now a no-op but should still be used to ensure past and future compatibility.

String comparison routines renamed

The ibcmp_* functions have been renamed and are now called foldEQ, foldEQ_locale, and foldEQ_utf8. The old names are still available as macros.

chop and chomp implementations merged

The opcode bodies for chop and chomp and for schop and schomp have been merged. The implementation functions Perl_do_chop() and Perl_do_chomp(), never part of the public API, have been merged and moved to a static function in pp.c. This shrinks the Perl binary slightly, and should not affect any code outside the core (unless it is relying on the order of side-effects when chomp is passed a list of values).

Selected Bug Fixes

I/O

  • Perl no longer produces this warning:

        $ perl -we 'open(my $f, ">", \my $x); binmode($f, "scalar")'
        Use of uninitialized value in binmode at -e line 1.

  • Opening a glob reference via open($fh, ">", \*glob) no longer causes the glob to be corrupted when the filehandle is printed to. This would cause Perl to crash whenever the glob's contents were accessed [perl #77492].

  • PerlIO no longer crashes when called recursively, such as from a signal handler. Now it just leaks memory [perl #75556].

  • Most I/O functions were not warning for unopened handles unless the "closed" and "unopened" warnings categories were both enabled. Now only use warnings 'unopened' is necessary to trigger these warnings, as had always been the intention.

  • There have been several fixes to PerlIO layers:

    When binmode(FH, ":crlf") pushes the :crlf layer on top of the stack, it no longer enables crlf layers lower in the stack so as to avoid unexpected results [perl #38456].

    Opening a file in :raw mode now does what it advertises to do (first open the file, then binmode it), instead of simply leaving off the top layer [perl #80764].

    The three layers :pop, :utf8, and :bytes didn't allow stacking when opening a file. For example this:

        open(FH, ">:pop:perlio", "some.file") or die $!;

    would throw an "Invalid argument" error. This has been fixed in this release [perl #82484].
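The 'unopened' warnings fix described above can be checked with a small sketch. The handle name NOSUCHHANDLE and the @caught collector are hypothetical, chosen only for this illustration; the script enables the single 'unopened' category and records whatever warning a print to a never-opened handle produces.

```perl
use strict;
no warnings;                 # start with every warning category off
use warnings 'unopened';     # then enable only the 'unopened' category

my @caught;                  # hypothetical collector for this demo
{
    # Capture run-time warnings instead of sending them to STDERR.
    local $SIG{__WARN__} = sub { push @caught, $_[0] };
    print NOSUCHHANDLE "hello\n";   # NOSUCHHANDLE was never opened
}

print "caught: $_" for @caught;
```

On a Perl with this fix, the print should be reported even though the 'closed' category is not enabled.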

Regular Expression Bug Fixes

  • The regular expression engine no longer loops when matching "\N{LATIN SMALL LIGATURE FF}" =~ /f+/i and similar expressions [perl #72998] (5.12.1).

  • The trie runtime code should no longer allocate massive amounts of memory [perl #74484].

  • Syntax errors in (?{...}) blocks no longer cause panic messages [perl #2353].

  • A pattern like (?:(o){2})? no longer causes a "panic" error [perl #39233].

  • A fatal error in regular expressions containing (.*?) when processing UTF-8 data has been fixed [perl #75680] (5.12.2).

  • An erroneous regular expression engine optimisation that caused regex verbs like *COMMIT sometimes to be ignored has been removed.

  • The regular expression bracketed character class [\8\9] was effectively the same as [89\000], incorrectly matching a NUL character. It also gave incorrect warnings that the 8 and 9 were ignored. Now [\8\9] is the same as [89] and gives legitimate warnings that \8 and \9 are unrecognized escape sequences, passed-through.

  • A regular expression match in the right-hand side of a global substitution (s///g) that is in the same scope will no longer cause match variables to have the wrong values on subsequent iterations. This can happen when an array or hash subscript is interpolated in the right-hand side, as in s|(.)|@a{ print($1), /./ }|g [perl #19078].

  • Several cases in which characters in the Latin-1 non-ASCII range (0x80 to 0xFF) used not to match themselves, or used to match both a character class and its complement, have been fixed. For instance, U+00E2 could match both \w and \W [perl #78464] [perl #18281] [perl #60156].

  • Matching a Unicode character against an alternation containing characters that happened to match continuation bytes in the former's UTF8 representation (like qq{\x{30ab}} =~ /\xab|\xa9/) would cause erroneous warnings [perl #70998].

  • The trie optimisation was not taking empty groups into account, preventing "foo" from matching /\A(?:(?:)foo|bar|zot)\z/ [perl #78356].

  • A pattern containing a + inside a lookahead would sometimes cause an incorrect match failure in a global match (for example, /(?=(\S+))/g) [perl #68564].

  • A regular expression optimisation would sometimes cause a match with a {n,m} quantifier to fail when it should have matched [perl #79152].

  • Case-insensitive matching in regular expressions compiled under use locale now works much more sanely when the pattern or target string is internally encoded in UTF8. Previously, under these conditions the localeness was completely lost. Now, code points above 255 are treated as Unicode, but code points between 0 and 255 are treated using the current locale rules, regardless of whether the pattern or the string is encoded in UTF8. The few case-insensitive matches that cross the 255/256 boundary are not allowed. For example, 0xFF does not caselessly match the character at 0x178, LATIN CAPITAL LETTER Y WITH DIAERESIS, because 0xFF may not be LATIN SMALL LETTER Y in the current locale, and Perl has no way of knowing if that character even exists in the locale, much less what code point it is.

  • The (?|...) regular expression construct no longer crashes if the final branch has more sets of capturing parentheses than any other branch. This was fixed in Perl 5.10.1 for the case of a single branch, but that fix did not take multiple branches into account [perl #84746].

  • A bug has been fixed in the implementation of {...} quantifiers in regular expressions that prevented the code block in /((\w+)(?{ print $2 })){2}/ from seeing the $2 sometimes [perl #84294].

Syntax/Parsing Bugs

  • when (scalar) {...} no longer crashes, but produces a syntax error [perl #74114] (5.12.1).

  • A label right before a string eval (foo: eval $string) no longer causes the label to be associated also with the first statement inside the eval [perl #74290] (5.12.1).

  • The no VERSION form of no (for example, no 5.13.2) no longer tries to turn on features or pragmata (like strict) [perl #70075] (5.12.2).

  • BEGIN {require 5.12.0} now behaves as documented, rather than behaving identically to use 5.12.0. Previously, require in a BEGIN block was erroneously executing the use feature ':5.12.0' and use strict behaviour, which only use was documented to provide [perl #69050].

  • A regression introduced in Perl 5.12.0, which made my $x = 3; $x = length(undef) leave $x set to 3, has been fixed. $x will now be undef [perl #85508] (5.12.2).

  • When strict "refs" mode is off, %{...} in rvalue context returns undef if its argument is undefined. An optimisation introduced in Perl 5.12.0 to make keys %{...} faster when used as a boolean did not take this into account, causing keys %{+undef} (and keys %$foo when $foo is undefined) to be an error, which it should be in strict mode only [perl #81750].

  • Constant-folding used to cause

        $text =~ ( 1 ? /phoo/ : /bear/)

    to turn into

        $text =~ /phoo/

    at compile time. Now it correctly matches against $_ [perl #20444].

  • Parsing Perl code (either with string eval or by loading modules) from within a UNITCHECK block no longer causes the interpreter to crash [perl #70614].

  • String evals no longer fail after 2 billion scopes have been compiled [perl #83364].

  • The parser no longer hangs when encountering certain Unicode characters, such as U+387 [perl #74022].

  • Defining a constant with the same name as one of Perl's special blocks (like INIT) stopped working in 5.12.0, but has now been fixed [perl #78634].

  • A reference to a literal value used as a hash key ($hash{\"foo"}) used to be stringified, even if the hash was tied [perl #79178].

  • A closure containing an if statement followed by a constant or variable is no longer treated as a constant [perl #63540].

  • state can now be used with attributes. It used to mean the same thing as my if any attributes were present [perl #68658].

  • Expressions like @$a > 3 no longer cause $a to be mentioned in the "Use of uninitialized value in numeric gt" warning when $a is undefined (since it is not part of the > expression, but the operand of the @ ) [perl #72090].

  • Accessing an element of a package array with a hard-coded number (as opposed to an arbitrary expression) would crash if the array did not exist. Usually the array would be autovivified during compilation, but typeglob manipulation could remove it, as in these two cases which used to crash:

        *d = *a; print $d[0];
        undef *d; print $d[0];

  • The -C command-line option, when used on the shebang line, can now be followed by other options [perl #72434].

  • The B module was returning B::OPs instead of B::LOGOPs for entertry [perl #80622]. This was due to a bug in the Perl core, not in B itself.
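As a quick illustration of the length(undef) fix listed above, the following sketch assigns the result over an existing value and checks whether the variable ends up undefined. It only restates the behaviour described in that entry.

```perl
use strict;
use warnings;

my $x = 3;
$x = length(undef);   # length of an undefined value is undef, not 0

# With the regression fixed, the assignment overwrites the old value 3.
print defined $x ? "still defined: $x\n" : "x is undef\n";
```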

Stashes, Globs and Method Lookup

Perl 5.10.0 introduced a new internal mechanism for caching MROs (method resolution orders, or lists of parent classes; aka "isa" caches) to make method lookup faster (so @ISA arrays would not have to be searched repeatedly). Unfortunately, this brought with it quite a few bugs. Almost all of these have been fixed now, along with a few MRO-related bugs that existed before 5.10.0:

  • The following used to have erratic effects on method resolution, because the "isa" caches were not reset or otherwise ended up listing the wrong classes. These have been fixed.

    • Aliasing packages by assigning to globs [perl #77358]
    • Deleting packages by deleting their containing stash elements
    • Undefining the glob containing a package (undef *Foo::)
    • Undefining an ISA glob (undef *Foo::ISA)
    • Deleting an ISA stash element (delete $Foo::{ISA})
    • Sharing @ISA arrays between classes (via *Foo::ISA = \@Bar::ISA or *Foo::ISA = *Bar::ISA) [perl #77238]

    undef *Foo::ISA would even stop a new @Foo::ISA array from updating caches.

  • Typeglob assignments would crash if the glob's stash no longer existed, so long as the glob assigned to were named ISA or the glob on either side of the assignment contained a subroutine.

  • PL_isarev, which is accessible to Perl via mro::get_isarev, is now updated properly when packages are deleted or removed from the @ISA of other classes. This allows many packages to be created and deleted without causing a memory leak [perl #75176].
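The reverse-ISA bookkeeping mentioned in the last item can be observed from Perl with mro::get_isarev. This is a minimal sketch; the class names Demo::Parent and Demo::Child are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use mro;    # core module that provides mro::get_isarev

# Make the hypothetical Demo::Child inherit from Demo::Parent.
@Demo::Child::ISA = ('Demo::Parent');
my @before = sort @{ mro::get_isarev('Demo::Parent') };

# Remove the parent again; the reverse-ISA list should follow suit.
@Demo::Child::ISA = ();
my @after = @{ mro::get_isarev('Demo::Parent') };

print "subclasses before: @before\n";
print "subclasses after:  ", scalar(@after), "\n";
```

mro::get_isarev returns a reference to the list of classes that inherit from the named class, so an empty list after the removal indicates that the stale entry was cleaned up.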

In addition, various other bugs related to typeglobs and stashes have been fixed:

  • Some work has been done on the internal pointers that link between symbol tables (stashes), typeglobs, and subroutines. This has the effect that various edge cases related to deleting stashes or stash entries (for example, %FOO:: = ()), and complex typeglob or code-reference aliasing, will no longer crash the interpreter.

  • Assigning a reference to a glob copy now assigns to a glob slot instead of overwriting the glob with a scalar [perl #1804] [perl #77508].

  • A bug when replacing the glob of a loop variable within the loop has been fixed [perl #21469]. This means the following code will no longer crash:

        for $x (...) {
            *x = *y;
        }

  • Assigning a glob to a PVLV used to convert it to a plain string. Now it works correctly, and a PVLV can hold a glob. This would happen when a nonexistent hash or array element was passed to a subroutine:

        sub { $_[0] = *foo }->($hash{key});
        # $_[0] would have been the string "*main::foo"

    It also happened when a glob was assigned to, or returned from, an element of a tied array or hash [perl #36051].

  • When trying to report Use of uninitialized value $Foo::BAR, crashes could occur if the glob holding the global variable in question had been detached from its original stash by, for example, delete $::{"Foo::"}. This has been fixed by disabling the reporting of variable names in those cases.

  • During the restoration of a localised typeglob on scope exit, any destructors called as a result would be able to see the typeglob in an inconsistent state, containing freed entries, which could result in a crash. This would affect code like this:

        local *@;
        eval { die bless [] }; # puts an object in $@
        sub DESTROY {
            local $@; # boom
        }

    Now the glob entries are cleared before any destructors are called. This also means that destructors can vivify entries in the glob. So Perl tries again and, if the entries are re-created too many times, dies with a "panic: gp_free ..." error message.

  • If a typeglob is freed while a subroutine attached to it is still referenced elsewhere, the subroutine is renamed to __ANON__ in the same package, unless the package has been undefined, in which case the __ANON__ package is used. This could cause packages to be sometimes autovivified, such as if the package had been deleted. Now this no longer occurs. The __ANON__ package is also now used when the original package is no longer attached to the symbol table. This avoids memory leaks in some cases [perl #87664].

  • Subroutines and package variables inside a package whose name ends with :: can now be accessed with a fully qualified name.

Unicode

  • What has become known as "the Unicode Bug" is almost completely resolved in this release. Under use feature 'unicode_strings' (which is automatically selected by use 5.012 and above), the internal storage format of a string no longer affects the external semantics [perl #58182].

    There are two known exceptions:

    1

    The now-deprecated, user-defined case-changing functions require utf8-encoded strings to operate. The CPAN module Unicode::Casing has been written to replace this feature without its drawbacks, and the feature is scheduled to be removed in 5.16.

    2

    quotemeta() (and its in-line equivalent \Q) can also give different results depending on whether a string is encoded in UTF-8. See The Unicode Bug in perlunicode.

  • Handling of Unicode non-character code points has changed. Previously they were mostly considered illegal, except that in some places only one of the 66 of them was known. The Unicode Standard considers them all legal, but forbids their "open interchange". This is part of the change to allow internal use of any code point (see Core Enhancements). Together, these changes resolve [perl #38722], [perl #51918], [perl #51936], and [perl #63446].

  • Case-insensitive "/i" regular expression matching of Unicode characters that match multiple characters now works much more as intended. For example

        "\N{LATIN SMALL LIGATURE FFI}" =~ /ffi/ui

    and

        "ffi" =~ /\N{LATIN SMALL LIGATURE FFI}/ui

    are both true. Previously, there were many bugs with this feature. What hasn't been fixed are the places where the pattern contains the multiple characters, but the characters are split up by other things, such as in

        "\N{LATIN SMALL LIGATURE FFI}" =~ /(f)(f)i/ui

    or

        "\N{LATIN SMALL LIGATURE FFI}" =~ /ffi*/ui

    or

        "\N{LATIN SMALL LIGATURE FFI}" =~ /[a-f][f-m][g-z]/ui

    None of these match.

    Also, this matching doesn't fully conform to the current Unicode Standard, which asks that the matching be made upon the NFD (Normalization Form Decomposed) of the text. However, as of this writing (April 2010), the Unicode Standard is currently in flux about what it will recommend doing in such scenarios. It may be that it will throw out the whole concept of multi-character matches [perl #71736].

  • Naming a deprecated character in \N{NAME} no longer leaks memory.

  • We fixed a bug that could cause \N{NAME} constructs followed by a single "." to be parsed incorrectly [perl #74978] (5.12.1).

  • chop now correctly handles characters above "\x{7fffffff}" [perl #73246].

  • Passing to index an offset beyond the end of the string when the string is encoded internally in UTF8 no longer causes panics [perl #75898].

  • warn() and die() now respect utf8-encoded scalars [perl #45549].

  • Sometimes the UTF8 length cache would not be reset on a value returned by substr, causing length(substr($uni_string, ...)) to give wrong answers. With ${^UTF8CACHE} set to -1, it would also produce a "panic" error message [perl #77692].
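The "Unicode Bug" item at the top of this list says that, under the unicode_strings feature, the internal storage format of a string no longer affects its semantics. A minimal sketch of that claim, using uc on the same character stored two ways (the variable names are illustrative only):

```perl
use strict;
use warnings;
use feature 'unicode_strings';

my $native   = "\xE9";       # LATIN SMALL LETTER E WITH ACUTE, one byte internally
my $upgraded = "\xE9";
utf8::upgrade($upgraded);    # same character, now stored as UTF-8 internally

my $uc_native   = uc $native;
my $uc_upgraded = uc $upgraded;

# Both should be U+00C9 regardless of how the string is stored.
printf "same result: %s\n", $uc_native eq $uc_upgraded ? "yes" : "no";
```

Without the feature, the non-upgraded string would typically be left unchanged by uc, which is the inconsistency the feature removes.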

Ties, Overloading and Other Magic

  • Overloading now works properly in conjunction with tied variables. What formerly happened was that most ops checked their arguments for overloading before checking for magic, so for example an overloaded object returned by a tied array access would usually be treated as not overloaded [RT #57012].

  • Various instances of magic (like tie methods) being called on tied variables too many or too few times have been fixed:

    • $tied->() did not always call FETCH [perl #8438].

    • Filetest operators and y/// and tr/// were calling FETCH too many times.

    • The = operator used to ignore magic on its right-hand side if the scalar happened to hold a typeglob (if a typeglob was the last thing returned from or assigned to a tied scalar) [perl #77498].

    • Dereference operators used to ignore magic if the argument was a reference already (such as from a previous FETCH) [perl #72144].

    • splice now calls set-magic (so changes made by splice @ISA are respected by method calls) [perl #78400].

    • In-memory files created by open($fh, ">", \$buffer) were not calling FETCH/STORE at all [perl #43789] (5.12.2).

    • utf8::is_utf8() now respects get-magic (like $1) (5.12.1).

  • Non-commutative binary operators used to swap their operands if the same tied scalar was used for both operands and returned a different value for each FETCH. For instance, if $t returned 2 the first time and 3 the second, then $t/$t would evaluate to 1.5. This has been fixed [perl #87708].

  • String eval now detects taintedness of overloaded or tied arguments [perl #75716].

  • String eval and regular expression matches against objects with string overloading no longer cause memory corruption or crashes [perl #77084].

  • readline EXPR now honors <> overloading on tied arguments.

  • <expr> always respects overloading now if the expression is overloaded.

    Because "<> as glob" was parsed differently from "<> as filehandle" from 5.6 onwards, something like <$foo[0]> did not handle overloading, even if $foo[0] was an overloaded object. This was contrary to the documentation for overload, and meant that <> could not be used as a general overloaded iterator operator.

  • The fallback behaviour of overloading on binary operators was asymmetric [perl #71286].

  • Magic applied to variables in the main package no longer affects other packages. See Magic variables outside the main package above [perl #76138].

  • Sometimes magic (ties, taintedness, etc.) attached to variables could cause an object to last longer than it should, or cause a crash if a tied variable were freed from within a tie method. These have been fixed [perl #81230].

  • DESTROY methods of objects implementing ties are no longer able to crash by accessing the tied variable through a weak reference [perl #86328].

  • Fixed a regression of kill() when a match variable is used for the process ID to kill [perl #75812].

  • $AUTOLOAD used to remain tainted forever if it ever became tainted. Now it is correctly untainted if an autoloaded method is called and the method name was not tainted.

  • sprintf now dies when passed a tainted scalar for the format. It did already die for arbitrary expressions, but not for simple scalars [perl #82250].

  • lc, uc, lcfirst, and ucfirst no longer return untainted strings when the argument is tainted. This has been broken since perl 5.8.9 [perl #87336].
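The splice item above notes that changes made through splice @ISA are now respected by method calls. A minimal sketch, using the hypothetical classes Demo::Base and Demo::Derived:

```perl
use strict;
use warnings;

# Hypothetical classes, for illustration only.
{ package Demo::Base;    sub greet { return "hello from Demo::Base" } }
{ package Demo::Derived; our @ISA; }

# Before the splice, Demo::Derived has no parent and no greet method.
my $before = Demo::Derived->can('greet') ? 1 : 0;

# Insert the parent with splice; set-magic should refresh method lookup.
splice @Demo::Derived::ISA, 0, 0, 'Demo::Base';
my $after = Demo::Derived->greet;

print "before: $before, after: $after\n";
```

The earlier failed lookup matters here: it gives the method cache a chance to remember that greet is missing, which is the situation the fix addresses.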

The Debugger

  • The Perl debugger now also works in taint mode [perl #76872].

  • Subroutine redefinition works once more in the debugger [perl #48332].

  • When -d is used on the shebang (#!) line, the debugger now has access to the lines of the main program. In the past, this sometimes worked and sometimes did not, depending on the order in which things happened to be arranged in memory [perl #71806].

  • A possible memory leak when using caller EXPR to set @DB::args has been fixed (5.12.2).

  • Perl no longer stomps on $DB::single, $DB::trace, and $DB::signal if these variables already have values when $^P is assigned to [perl #72422].

  • #line directives in string evals were not properly updating the arrays of lines of code (@{"_< ..."}) that the debugger (or any debugging or profiling module) uses. In threaded builds, they were not being updated at all. In non-threaded builds, the line number was ignored, so any change to the existing line number would cause the lines to be misnumbered [perl #79442].

Threads

  • Perl no longer accidentally clones lexicals in scope within active stack frames in the parent when creating a child thread [perl #73086].

  • Several memory leaks in cloning and freeing threaded Perl interpreters have been fixed [perl #77352].

  • Creating a new thread when directory handles were open used to cause a crash, because the handles were not cloned, but simply passed to the new thread, resulting in a double free.

    Now directory handles are cloned properly on Windows and on systems that have a fchdir function. On other systems, new threads simply do not inherit directory handles from their parent threads [perl #75154].

  • The typeglob *,, which holds the scalar variable $, (output field separator), had the wrong reference count in child threads.

  • When pipes are shared between threads, the close function (and any implicit close, such as on thread exit) no longer blocks [perl #78494].

  • Perl now does a timely cleanup of SVs that are cloned into a new thread but then discovered to be orphaned (that is, their owners are not cloned). This eliminates several "scalars leaked" warnings when joining threads.

Scoping and Subroutines

  • Lvalue subroutines are again able to return copy-on-write scalars. This had been broken since version 5.10.0 [perl #75656] (5.12.3).

  • require no longer causes caller to return the wrong file name for the scope that called require and other scopes higher up that had the same file name [perl #68712].

  • sort with a ($$)-prototyped comparison routine used to cause the value of @_ to leak out of the sort. Taking a reference to @_ within the sorting routine could cause a crash [perl #72334].

  • Match variables (like $1) no longer persist between calls to a sort subroutine [perl #76026].

  • Iterating with foreach over an array returned by an lvalue sub now works [perl #23790].

  • $@ is now localised during calls to binmode to prevent action at a distance [perl #78844].

  • Calling a closure prototype (what is passed to an attribute handler for a closure) now results in a "Closure prototype called" error message instead of a crash [perl #68560].

  • Mentioning a read-only lexical variable from the enclosing scope in a string eval no longer causes the variable to become writable [perl #19135].

Signals

  • Within signal handlers, $! is now implicitly localized.

  • CHLD signals are no longer unblocked after a signal handler is called if they were blocked before by POSIX::sigprocmask [perl #82040].

  • A signal handler called within a signal handler could cause leaks or double-frees. Now fixed [perl #76248].

Miscellaneous Memory Leaks

  • Several memory leaks when loading XS modules were fixed (5.12.2).

  • substr EXPR,OFFSET,LENGTH,REPLACEMENT, index STR,SUBSTR,POSITION, keys HASH, and vec EXPR,OFFSET,BITS could, when used in combination with lvalues, result in leaking the scalar value they operate on, and cause its destruction to happen too late. This has now been fixed.

  • The postincrement and postdecrement operators, ++ and --, used to cause leaks when used on references. This has now been fixed.

  • Nested map and grep blocks no longer leak memory when processing large lists [perl #48004].

  • use VERSION and no VERSION no longer leak memory [perl #78436] [perl #69050].

  • .= followed by <> or readline would leak memory if $/ contained characters beyond the octet range and the scalar assigned to happened to be encoded as UTF8 internally [perl #72246].

  • eval 'BEGIN{die}' no longer leaks memory on non-threaded builds.

Memory Corruption and Crashes

  • glob() no longer crashes when %File::Glob:: is empty and CORE::GLOBAL::glob isn't present [perl #75464] (5.12.2).

  • readline() has been fixed when interrupted by signals so it no longer returns the "same thing" as before or random memory.

  • When assigning a list with duplicated keys to a hash, the assignment used to return garbage and/or freed values:

        @a = %h = (list with some duplicate keys);

    This has now been fixed [perl #31865].

  • The mechanism for freeing objects in globs used to leave dangling pointers to freed SVs, meaning Perl users could see corrupted state during destruction.

    Perl now frees only the affected slots of the GV, rather than freeing the GV itself. This makes sure that there are no dangling refs or corrupted state during destruction.

  • The interpreter no longer crashes when freeing deeply-nested arrays of arrays. Hashes have not been fixed yet [perl #44225].

  • Concatenating long strings under use encoding no longer causes Perl to crash [perl #78674].

  • Calling ->import on a class lacking an import method could corrupt the stack, resulting in strange behaviour. For instance,

        push @a, "foo", $b = bar->import;

    would assign "foo" to $b [perl #63790].

  • The recv function could crash when called with the MSG_TRUNC flag [perl #75082].

  • formline no longer crashes when passed a tainted format picture. It also taints $^A now if its arguments are tainted [perl #79138].

  • A bug in how we process filetest operations could cause a segfault. Filetests don't always expect an op on the stack, so we now use TOPs only if we're sure that we're not stating the _ filehandle. This is indicated by OPf_KIDS (as checked in ck_ftst) [perl #74542] (5.12.1).

  • unpack() now handles scalar context correctly for %32H and %32u, fixing a potential crash. split() would crash because the third item on the stack wasn't the regular expression it expected. unpack("%2H", ...) would return both the unpacked result and the checksum on the stack, as would unpack("%2u", ...) [perl #73814] (5.12.2).

Fixes to Various Perl Operators

  • The &, |, and ^ bitwise operators no longer coerce read-only arguments [perl #20661].

  • Stringifying a scalar containing "-0.0" no longer has the effect of turning false into true [perl #45133].

  • Some numeric operators were converting integers to floating point, resulting in loss of precision on 64-bit platforms [perl #77456].

  • sprintf() was ignoring locales when called with constant arguments [perl #78632].

  • Combining the vector (%v) flag and dynamic precision would cause sprintf to confuse the order of its arguments, making it treat the string as the precision and vice-versa [perl #83194].

Bugs Relating to the C API

  • The C-level lex_stuff_pvn function would sometimes cause a spurious syntax error on the last line of the file if it lacked a final semicolon [perl #74006] (5.12.1).

  • The eval_sv and eval_pv C functions now set $@ correctly when there is a syntax error and no G_KEEPERR flag, and never set it if the G_KEEPERR flag is present [perl #3719].

  • The XS multicall API no longer causes subroutines to lose reference counts if called via the multicall interface from within those very subroutines. This affects modules like List::Util. Calling one of its functions with an active subroutine as the first argument could cause a crash [perl #78070].

  • The SvPVbyte function available to XS modules now calls magic before downgrading the SV, to avoid warnings about wide characters [perl #72398].

  • The ref types in the typemap for XS bindings now support magical variables [perl #72684].

  • sv_catsv_flags no longer calls mg_get on its second argument (the source string) if the flags passed to it do not include SV_GMAGIC. So it now matches the documentation.

  • my_strftime no longer leaks memory. This fixes a memory leak in POSIX::strftime [perl #73520].

  • XSUB.h now correctly redefines fgets under PERL_IMPLICIT_SYS [perl #55049] (5.12.1).

  • XS code using fputc() or fputs() on Windows could cause an error due to their arguments being swapped [perl #72704] (5.12.1).

  • A possible segfault in the T_PTROBJ default typemap has been fixed (5.12.2).

  • A bug that could cause "Unknown error" messages when call_sv(code, G_EVAL) is called from an XS destructor has been fixed (5.12.2).

Known Problems

This is a list of significant unresolved issues which are regressions from earlier versions of Perl or which affect widely-used CPAN modules.

  • List::Util::first misbehaves in the presence of a lexical $_ (typically introduced by my $_ or implicitly by given). The variable that gets set for each iteration is the package variable $_, not the lexical $_.

    A similar issue may occur in other modules that provide functions which take a block as their first argument, like

        foo { ... $_ ...} list

    See also: http://rt.perl.org/rt3/Public/Bug/Display.html?id=67694

  • readline() returns an empty string instead of a cached previous value when it is interrupted by a signal.

  • The changes in prototype handling break Switch. A patch has been sent upstream and will hopefully appear on CPAN soon.

  • The upgrade to ExtUtils-MakeMaker-6.57_05 has caused some tests in the Module-Install distribution on CPAN to fail. (Specifically, 02_mymeta.t tests 5 and 21; 18_all_from.t tests 6 and 15; 19_authors.t tests 5, 13, 21, and 29; and 20_authors_with_special_characters.t tests 6, 15, and 23 in version 1.00 of that distribution now fail.)

  • On VMS, Time::HiRes tests will fail due to a bug in the CRTL's implementation of setitimer: previous timer values would be cleared if a timer expired but not if the timer was reset before expiring. HP OpenVMS Engineering have corrected the problem and will release a patch in due course (Quix case # QXCM1001115136).

  • On VMS, there were a handful of Module::Build test failures we didn't get to before the release; please watch CPAN for updates.

Errata

keys(), values(), and each() work on arrays

You can now use the keys(), values(), and each() builtins on arrays; previously you could use them only on hashes. See perlfunc for details. This is actually a change introduced in perl 5.12.0, but it was missed from that release's perl5120delta.
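As a minimal sketch of the array forms described above (the array name and contents are made up for illustration, and Perl 5.12 or later is assumed):

```perl
use strict;
use warnings;

my @colours = qw(red green blue);

# keys() on an array returns its indices; values() returns its elements.
my @indices  = keys @colours;
my @elements = values @colours;

# each() on an array returns (index, element) pairs in index order.
my @pairs;
while (my ($index, $colour) = each @colours) {
    push @pairs, "$index=$colour";
}

print join(', ', @pairs), "\n";
```

As with hashes, each() keeps an iterator on the array, so it is probably best not to leave such a loop early without resetting it (for example by calling keys @colours in void context).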

split() and @_

split() no longer modifies @_ when called in scalar or void context. In void context it now produces a "Useless use of split" warning. This was also a perl 5.12.0 change that missed the perldelta.

Obituary

Randy Kobes, creator of http://kobesearch.cpan.org/ and contributor/maintainer to several core Perl toolchain modules, passed away on September 18, 2010 after a battle with lung cancer. The community was richer for his involvement. He will be missed.

Acknowledgements

Perl 5.14.0 represents one year of development since Perl 5.12.0 and contains nearly 550,000 lines of changes across nearly 3,000 files from 150 authors and committers.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.14.0:

Aaron Crane, Abhijit Menon-Sen, Abigail, Ævar Arnfjörð Bjarmason, Alastair Douglas, Alexander Alekseev, Alexander Hartmaier, Alexandr Ciornii, Alex Davies, Alex Vandiver, Ali Polatel, Allen Smith, Andreas König, Andrew Rodland, Andy Armstrong, Andy Dougherty, Aristotle Pagaltzis, Arkturuz, Arvan, A. Sinan Unur, Ben Morrow, Bo Lindbergh, Boris Ratner, Brad Gilbert, Bram, brian d foy, Brian Phillips, Casey West, Charles Bailey, Chas. Owens, Chip Salzenberg, Chris 'BinGOs' Williams, chromatic, Craig A. Berry, Curtis Jewell, Dagfinn Ilmari Mannsåker, Dan Dascalescu, Dave Rolsky, David Caldwell, David Cantrell, David Golden, David Leadbeater, David Mitchell, David Wheeler, Eric Brine, Father Chrysostomos, Fingle Nark, Florian Ragwitz, Frank Wiegand, Franz Fasching, Gene Sullivan, George Greer, Gerard Goossen, Gisle Aas, Goro Fuji, Grant McLean, gregor herrmann, H.Merijn Brand, Hongwen Qiu, Hugo van der Sanden, Ian Goodacre, James E Keenan, James Mastros, Jan Dubois, Jay Hannah, Jerry D. Hedden, Jesse Vincent, Jim Cromie, Jirka Hruška, John Peacock, Joshua ben Jore, Joshua Pritikin, Karl Williamson, Kevin Ryde, kmx, Lars Dɪᴇᴄᴋᴏᴡ 迪拉斯, Larwan Berke, Leon Brocard, Leon Timmermans, Lubomir Rintel, Lukas Mai, Maik Hentsche, Marty Pauley, Marvin Humphrey, Matt Johnson, Matt S Trout, Max Maischein, Michael Breen, Michael Fig, Michael G Schwern, Michael Parker, Michael Stevens, Michael Witten, Mike Kelly, Moritz Lenz, Nicholas Clark, Nick Cleaton, Nick Johnston, Nicolas Kaiser, Niko Tyni, Noirin Shirley, Nuno Carvalho, Paul Evans, Paul Green, Paul Johnson, Paul Marquess, Peter J. Holzer, Peter John Acklam, Peter Martini, Philippe Bruhat (BooK), Piotr Fusik, Rafael Garcia-Suarez, Rainer Tammer, Reini Urban, Renee Baecker, Ricardo Signes, Richard Möhn, Richard Soderberg, Rob Hoelz, Robin Barker, Ruslan Zakirov, Salvador Fandiño, Salvador Ortiz Garcia, Shlomi Fish, Sinan Unur, Sisyphus, Slaven Rezic, Steffen Müller, Steve Hay, Steven Schubiger, Steve Peters, Sullivan Beck, Tatsuhiko Miyagawa, Tim Bunce, Todd Rinaldo, Tom Christiansen, Tom Hukins, Tony Cook, Tye McQueen, Vadim Konovalov, Vernon Lyon, Vincent Pit, Walt Mankowski, Wolfram Humann, Yves Orton, Zefram, and Zsbán Ambrus.

This is woefully incomplete as it's automatically generated from version control history. In particular, it doesn't include the names of the (very much appreciated) contributors who reported issues in previous versions of Perl that helped make Perl 5.14.0 better. For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl 5.14.0 distribution.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the Perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who are able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please use this address for security issues in the Perl core only, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

perl5141delta

NAME

perl5141delta - what is new for perl v5.14.1

DESCRIPTION

This document describes differences between the 5.14.0 release and the 5.14.1 release.

If you are upgrading from an earlier release such as 5.12.0, first read perl5140delta, which describes differences between 5.12.0 and 5.14.0.

Core Enhancements

No changes since 5.14.0.

Security

No changes since 5.14.0.

Incompatible Changes

There are no changes intentionally incompatible with 5.14.0. If any exist, they are bugs and reports are welcome.

Deprecations

There have been no deprecations since 5.14.0.

Modules and Pragmata

New Modules and Pragmata

None

Updated Modules and Pragmata

  • B::Deparse has been upgraded from version 1.03 to 1.04, to address two regressions in Perl 5.14.0:

    Deparsing of the glob operator and its diamond (<> ) form now works again. [perl #90898]

    The presence of subroutines named :::: or :::::: no longer causes B::Deparse to hang.

  • Pod::Perldoc has been upgraded from version 3.15_03 to 3.15_04.

    It corrects the search paths on VMS. [perl #90640]

Removed Modules and Pragmata

None

Documentation

New Documentation

None

Changes to Existing Documentation

perlfunc

  • given , when and default are now listed in perlfunc.

  • Documentation for use now includes a pointer to if.pm.

perllol

  • perllol has been expanded with examples using the new push $scalar syntax introduced in Perl 5.14.0.

perlop

  • The explanation of bitwise operators has been expanded to explain how they work on Unicode strings.

  • The section on the triple-dot or yada-yada operator has been moved up, as it used to separate two closely related sections about the comma operator.

  • More examples for m//g have been added.

  • The <<\FOO here-doc syntax has been documented.
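To illustrate the <<\FOO here-doc form mentioned in the perlop notes above, here is a small sketch. The variable name and text are invented for the example; the point is that a backslashed terminator is treated like a single-quoted one, so the body is not interpolated.

```perl
use strict;
use warnings;

my $cost = 42;

# <<\END behaves like <<'END': the body is taken literally.
my $literal = <<\END;
Price: $cost
END

# <<"END" interpolates variables as usual.
my $interpolated = <<"END";
Price: $cost
END

print $literal, $interpolated;
```

The single-quoted spelling <<'END' is the more common way to write this today; the backslash form is mainly of interest when reading older code.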

perlrun

  • perlrun has undergone a significant clean-up. Most notably, the -0x... form of the -0 flag has been clarified, and the final section on environment variables has been corrected and expanded.

POSIX

  • The invocation documentation for WIFEXITED , WEXITSTATUS , WIFSIGNALED , WTERMSIG , WIFSTOPPED , and WSTOPSIG was corrected.
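The following sketch shows one way the wait-status macros named above are typically invoked. It assumes a POSIX-like system and passes the native status from ${^CHILD_ERROR_NATIVE}; the child command is only a stand-in.

```perl
use strict;
use warnings;
use POSIX qw(WIFEXITED WEXITSTATUS WIFSIGNALED WTERMSIG);

# Run a child perl that exits with status 3 (stand-in for a real command).
system($^X, '-e', 'exit 3');

# The W* macros expect the native wait status, which Perl exposes
# as ${^CHILD_ERROR_NATIVE} after system(), wait(), or waitpid().
my $status = ${^CHILD_ERROR_NATIVE};

if (WIFEXITED($status)) {
    printf "child exited with status %d\n", WEXITSTATUS($status);
}
elsif (WIFSIGNALED($status)) {
    printf "child was killed by signal %d\n", WTERMSIG($status);
}
```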

Diagnostics

The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see perldiag.

New Diagnostics

None

Changes to Existing Diagnostics

None

Utility Changes

None

Configuration and Compilation

  • regexp.h has been modified for compatibility with GCC's -Werror option, as used by some projects that include perl's header files.

Testing

  • Some test failures in dist/Locale-Maketext/t/09_compile.t that could occur depending on the environment have been fixed. [perl #89896]

  • A watchdog timer for t/re/re.t was lengthened to accommodate SH-4 systems which were unable to complete the tests before the previous timer ran out.

Platform Support

New Platforms

None

Discontinued Platforms

None

Platform-Specific Notes

Solaris

  • Documentation listing the Solaris packages required to build Perl on Solaris 9 and Solaris 10 has been corrected.

Mac OS X

  • The lib/locale.t test script has been updated to work on the upcoming Lion release.

  • Mac OS X specific compilation instructions have been clarified.

Ubuntu Linux

  • The ODBM_File installation process has been updated with the new library paths on Ubuntu natty.

Internal Changes

  • The compiled representation of formats is now stored via the mg_ptr of their PERL_MAGIC_fm. Previously it was stored in the string buffer, beyond SvLEN(), the regular end of the string. SvCOMPILED() and SvCOMPILED_{on,off}() now exist solely for compatibility for XS code. The first is always 0; the other two are now no-ops.

Bug Fixes

  • A bug has been fixed that would cause a "Use of freed value in iteration" error if the next two hash elements that would be iterated over are deleted. [perl #85026]

  • Passing the same constant subroutine to both index and formline no longer causes one or the other to fail. [perl #89218]

  • 5.14.0 introduced some memory leaks in regular expression character classes such as [\w\s] , which have now been fixed.

  • An edge case in regular expression matching could potentially loop. This happened only under /i in bracketed character classes that have characters with multi-character folds, and the target string to match against includes the first portion of the fold, followed by another character that has a multi-character fold that begins with the remaining portion of the fold, plus some more.

    "s\N{U+DF}" =~ /[\x{DF}foo]/i

    is one such case. \xDF folds to "ss" .

  • Several Unicode case-folding bugs have been fixed.

  • The new (in 5.14.0) regular expression modifier /a when repeated like /aa forbids the characters outside the ASCII range that match characters inside that range from matching under /i. This did not work under some circumstances, all involving alternation, such as:

    "\N{KELVIN SIGN}" =~ /k|foo/iaa;

    succeeded inappropriately. This is now fixed.

  • Fixed a case where it was possible that a freed buffer may have been read from when parsing a here document.
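The /aa behaviour described in the bug-fix list above can be checked with a short sketch like the following. It assumes Perl 5.14.1 or later and writes KELVIN SIGN as a code point so that no charnames import is needed; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $kelvin = "\x{212A}";    # U+212A KELVIN SIGN

# Under plain /i, KELVIN SIGN case-folds to "k" and so matches.
my $plain = $kelvin =~ /k/i ? 1 : 0;

# Under /iaa, non-ASCII characters may not match ASCII ones.
my $restricted = $kelvin =~ /k/iaa ? 1 : 0;

# The alternation form from the bug report should also fail to match.
my $alternation = $kelvin =~ /k|foo/iaa ? 1 : 0;

print "plain=$plain restricted=$restricted alternation=$alternation\n";
```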

Acknowledgements

Perl 5.14.1 represents approximately four weeks of development since Perl 5.14.0 and contains approximately 3500 lines of changes across 38 files from 17 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.14.1:

Bo Lindbergh, Claudio Ramirez, Craig A. Berry, David Leadbeater, Father Chrysostomos, Jesse Vincent, Jim Cromie, Justin Case, Karl Williamson, Leo Lapworth, Nicholas Clark, Nobuhiro Iwamatsu, smash, Tom Christiansen, Ton Hospel, Vladimir Timofeev, and Zsbán Ambrus.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5142delta

NAME

perl5142delta - what is new for perl v5.14.2

DESCRIPTION

This document describes differences between the 5.14.1 release and the 5.14.2 release.

If you are upgrading from an earlier release such as 5.14.0, first read perl5141delta, which describes differences between 5.14.0 and 5.14.1.

Core Enhancements

No changes since 5.14.0.

Security

File::Glob::bsd_glob() memory error with GLOB_ALTDIRFUNC (CVE-2011-2728).

Calling File::Glob::bsd_glob with the unsupported flag GLOB_ALTDIRFUNC would cause an access violation / segfault. A Perl program that accepts a flags value from an external source could expose itself to denial of service or arbitrary code execution attacks. There are no known exploits in the wild. The problem has been corrected by explicitly disabling all unsupported flags and setting unused function pointers to null. Bug reported by Clément Lecigne.
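Independently of the fix, a program that takes glob flags from outside can restrict them to a known set before calling bsd_glob. The sketch below is one possible approach, not something prescribed by File::Glob: the helper name sanitize_glob_flags and the choice of allowed flags are assumptions made for illustration, and the :bsd_glob import tag assumes a reasonably recent File::Glob.

```perl
use strict;
use warnings;
use File::Glob qw(:bsd_glob);

# Flags this (hypothetical) application is willing to accept.
my $ALLOWED_FLAGS = GLOB_BRACE | GLOB_NOCASE | GLOB_TILDE;

# Hypothetical helper: reject non-numeric input and drop every bit
# that is not explicitly allowed.
sub sanitize_glob_flags {
    my ($raw) = @_;
    die "glob flags must be a non-negative integer\n"
        unless defined $raw && $raw =~ /\A[0-9]+\z/;
    return $raw & $ALLOWED_FLAGS;
}

my $untrusted = 0xFFFF;    # stand-in for a value read from a request
my @files = bsd_glob('*.{pl,pm}', sanitize_glob_flags($untrusted));

print scalar(@files), " matching files\n";
```

Masking against an allow-list means that flags the application never intended to support are silently ignored instead of reaching the C-level glob code.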

Encode decode_xs n-byte heap-overflow (CVE-2011-2939)

A bug in Encode could, on certain inputs, cause the heap to overflow. This problem has been corrected. Bug reported by Robert Zacek.

Incompatible Changes

There are no changes intentionally incompatible with 5.14.0. If any exist, they are bugs and reports are welcome.

Deprecations

There have been no deprecations since 5.14.0.

Modules and Pragmata

New Modules and Pragmata

None

Updated Modules and Pragmata

  • CPAN has been upgraded from version 1.9600 to version 1.9600_01.

    CPAN::Distribution has been upgraded from version 1.9602 to 1.9602_01.

    Backported bugfixes from CPAN version 1.9800. Ensures proper detection of configure_requires prerequisites from CPAN Meta files in the case where dynamic_config is true. [rt.cpan.org #68835]

    Also ensures that configure_requires is only checked in META files, not MYMETA files, to protect against MYMETA generation that drops configure_requires .

  • Encode has been upgraded from version 2.42 to 2.42_01.

    See Security.

  • File::Glob has been upgraded from version 1.12 to version 1.13.

    See Security.

  • PerlIO::scalar has been upgraded from version 0.11 to 0.11_01.

    It fixes a problem with open my $fh, ">", \$scalar not working if $scalar is a copy-on-write scalar.
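For readers unfamiliar with the in-memory filehandles that PerlIO::scalar provides, here is a minimal sketch of the open my $fh, ">", \$scalar idiom referred to above. The variable names and text are illustrative.

```perl
use strict;
use warnings;

my $buffer = '';

# Opening a reference to a scalar gives an in-memory filehandle.
open my $out, '>', \$buffer or die "Cannot open in-memory file: $!";
print {$out} "line one\n", "line two\n";
close $out or die "Cannot close in-memory file: $!";

# The scalar now holds everything that was printed; read it back.
open my $in, '<', \$buffer or die "Cannot reopen in-memory file: $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " lines captured\n";
```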

Removed Modules and Pragmata

None

Platform Support

New Platforms

None

Discontinued Platforms

None

Platform-Specific Notes

  • HP-UX PA-RISC/64 now supports gcc-4.x

    A fix to correct the socketsize now makes the test suite pass on HP-UX PA-RISC for 64bitall builds.

  • Building on OS X 10.7 Lion and Xcode 4 works again

    The build system has been updated to work with the build tools under Mac OS X 10.7.

Bug Fixes

  • In @INC filters (subroutines returned by subroutines in @INC), $_ used to misbehave: If returned from a subroutine, it would not be copied, but the variable itself would be returned; and freeing $_ (e.g., with undef *_ ) would cause perl to crash. This has been fixed [perl #91880].

  • Perl 5.10.0 introduced some faulty logic that made "U*" in the middle of a pack template equivalent to "U0" if the input string was empty. This has been fixed [perl #90160].

  • caller no longer leaks memory when called from the DB package if @DB::args was assigned to after the first call to caller. Carp was triggering this bug [perl #97010].

  • utf8::decode had a nasty bug that would modify copy-on-write scalars' string buffers in place (i.e., skipping the copy). This could result in hashes having two elements with the same key [perl #91834].

  • Localising a tied variable used to make it read-only if it contained a copy-on-write string.

  • Elements of restricted hashes (see the fields pragma) containing copy-on-write values couldn't be deleted, nor could such hashes be cleared (%hash = () ).

  • Locking a hash element that is a glob copy no longer causes subsequent assignment to it to corrupt the glob.

  • A panic involving the combination of the regular expression modifiers /aa introduced in 5.14.0 and the \b escape sequence has been fixed [perl #95964].
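The utf8::decode item in the list above concerns scalars that share a string buffer. The sketch below shows the intended behaviour on a fixed perl, assuming a plain assignment may share the buffer internally: decoding the copy in place leaves the original bytes untouched.

```perl
use strict;
use warnings;

my $bytes = "caf\xC3\xA9";   # UTF-8 encoded bytes: "cafe" with an acute accent
my $copy  = $bytes;          # may share a buffer internally (copy-on-write)

# utf8::decode works in place on its argument and returns true on success.
my $ok = utf8::decode($copy);

printf "decoded=%d length(copy)=%d length(bytes)=%d\n",
    ($ok ? 1 : 0), length($copy), length($bytes);
```

The copy is four characters long after decoding, while the original remains five bytes.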

Known Problems

This is a list of some significant unfixed bugs, which are regressions from 5.12.0.

  • PERL_GLOBAL_STRUCT is broken.

    Since perl 5.14.0, building with -DPERL_GLOBAL_STRUCT hasn't been possible. This means that perl currently doesn't work on any platforms that require it to be built this way, including Symbian.

    While PERL_GLOBAL_STRUCT now works again on recent development versions of perl, it has not been verified that it works on Symbian again.

    We'd be very interested in hearing from anyone working with Perl on Symbian.

Acknowledgements

Perl 5.14.2 represents approximately three months of development since Perl 5.14.1 and contains approximately 1200 lines of changes across 61 files from 9 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.14.2:

Craig A. Berry, David Golden, Father Chrysostomos, Florian Ragwitz, H.Merijn Brand, Karl Williamson, Nicholas Clark, Pau Amma and Ricardo Signes.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5143delta

NAME

perl5143delta - what is new for perl v5.14.3

DESCRIPTION

This document describes differences between the 5.14.2 release and the 5.14.3 release.

If you are upgrading from an earlier release such as 5.12.0, first read perl5140delta, which describes differences between 5.12.0 and 5.14.0.

Core Enhancements

No changes since 5.14.0.

Security

Digest unsafe use of eval (CVE-2011-3597)

The Digest->new() function did not properly sanitize input before using it in an eval() call, which could lead to the injection of arbitrary Perl code.

In order to exploit this flaw, the attacker would need to be able to set the algorithm name used, or be able to execute arbitrary Perl code already.

This problem has been fixed.
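Even with the fix in place, a program that lets users choose a digest algorithm can validate the name first. The following is a sketch of one such check; the helper name safe_digest and the allow-list contents are assumptions for illustration, not part of the Digest API.

```perl
use strict;
use warnings;
use Digest;

# Hypothetical allow-list of algorithm names this application supports.
my %ALLOWED_ALGORITHM = map { $_ => 1 } qw(MD5 SHA-1 SHA-256);

# Hypothetical helper: refuse any name that is not on the allow-list
# before handing it to Digest->new().
sub safe_digest {
    my ($name) = @_;
    die "unsupported digest algorithm\n"
        unless defined $name && $ALLOWED_ALGORITHM{$name};
    return Digest->new($name);
}

my $hex = safe_digest('MD5')->add('abc')->hexdigest;
print "$hex\n";
```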

Heap buffer overrun in 'x' string repeat operator (CVE-2012-5195)

Poorly written perl code that allows an attacker to specify the count to perl's 'x' string repeat operator can already cause a memory exhaustion denial-of-service attack. A flaw in versions of perl before 5.15.5 can escalate that into a heap buffer overrun; coupled with versions of glibc before 2.16, it possibly allows the execution of arbitrary code.

This problem has been fixed.
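The memory-exhaustion risk mentioned for the 'x' repeat operator remains an application-level concern, so code that takes a repeat count from outside may want to bound it. A minimal sketch, in which the limit MAX_REPEAT and the helper repeat_string are hypothetical names chosen for this example:

```perl
use strict;
use warnings;

use constant MAX_REPEAT => 1_000;   # hypothetical application limit

# Hypothetical helper: validate the count before using the x operator.
sub repeat_string {
    my ($string, $count) = @_;
    die "repeat count must be a non-negative integer\n"
        unless defined $count && $count =~ /\A[0-9]{1,9}\z/;
    die "repeat count exceeds limit\n" if $count > MAX_REPEAT;
    return $string x $count;
}

print repeat_string('ab', 3), "\n";
```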

Incompatible Changes

There are no changes intentionally incompatible with 5.14.0. If any exist, they are bugs and reports are welcome.

Deprecations

There have been no deprecations since 5.14.0.

Modules and Pragmata

New Modules and Pragmata

None

Updated Modules and Pragmata

  • PerlIO::scalar was updated to fix a bug in which opening a filehandle to a glob copy caused assertion failures (under debugging) or hangs or other erratic behaviour without debugging.

  • ODBM_File and NDBM_File were updated to allow building on GNU/Hurd.

  • IPC::Open3 has been updated to fix a regression introduced in perl 5.12, which broke IPC::Open3::open3($in, $out, $err, '-') . [perl #95748]

  • Digest has been upgraded from version 1.16 to 1.16_01.

    See Security.

  • Module::CoreList has been updated to version 2.49_04 to add data for this release.

Removed Modules and Pragmata

None

Documentation

New Documentation

None

Changes to Existing Documentation

perlcheat

Configuration and Compilation

  • h2ph was updated to correctly search gcc include directories on platforms such as Debian with multi-architecture support.

  • In Configure, the test for procselfexe was refactored into a loop.

Platform Support

New Platforms

None

Discontinued Platforms

None

Platform-Specific Notes

  • FreeBSD

    The FreeBSD hints file was corrected to be compatible with FreeBSD 10.0.

  • Solaris and NetBSD

    Configure was updated for "procselfexe" support on Solaris and NetBSD.

  • HP-UX

    README.hpux was updated to note the existence of a broken header in HP-UX 11.00.

  • Linux

    libutil is no longer used when compiling on Linux platforms, which avoids warnings being emitted.

    The system gcc (rather than any other gcc which might be in the compiling user's path) is now used when searching for libraries such as -lm .

  • Mac OS X

    The locale tests were updated to reflect the behaviour of locales in Mountain Lion.

  • GNU/Hurd

    Various build and test fixes were included for GNU/Hurd.

    LFS support was enabled in GNU/Hurd.

  • NetBSD

    The NetBSD hints file was corrected to be compatible with NetBSD 6.*

Bug Fixes

  • A regression has been fixed that was introduced in 5.14, in /i regular expression matching, in which a match improperly fails if the pattern is in UTF-8, the target string is not, and a Latin-1 character precedes a character in the string that should match the pattern. [perl #101710]

  • In case-insensitive regular expression pattern matching on UTF-8 encoded strings, the scan for the start of a match no longer looks only at the first possible position. Previously this caused matches such as "f\x{FB00}" =~ /ff/i to fail.

  • The sitecustomize support was made relocatableinc aware, so that -Dusesitecustomize and -Duserelocatableinc may be used together.

  • The smartmatch operator (~~ ) was changed so that the right-hand side takes precedence during Any ~~ Object operations.

  • A bug has been fixed in the tainting support, in which an index() operation on a tainted constant would cause all other constants to become tainted. [perl #64804]

  • A regression has been fixed that was introduced in perl 5.12, whereby tainting errors were not correctly propagated through die(). [perl #111654]

  • A regression has been fixed that was introduced in perl 5.14, in which /[[:lower:]]/i and /[[:upper:]]/i no longer matched the opposite case. [perl #101970]
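The last item in the list above, about /[[:lower:]]/i and /[[:upper:]]/i, can be illustrated with a short check. This is a sketch that assumes a perl containing the fix; the variable names are made up.

```perl
use strict;
use warnings;

# Under /i, the POSIX case classes match characters of either case.
my $upper_vs_lower = 'A' =~ /[[:lower:]]/i ? 1 : 0;
my $lower_vs_upper = 'a' =~ /[[:upper:]]/i ? 1 : 0;

# Without /i, the class is case-sensitive as usual.
my $case_sensitive = 'A' =~ /[[:lower:]]/ ? 1 : 0;

print "upper_vs_lower=$upper_vs_lower lower_vs_upper=$lower_vs_upper ",
      "case_sensitive=$case_sensitive\n";
```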

Acknowledgements

Perl 5.14.3 represents approximately 12 months of development since Perl 5.14.2 and contains approximately 2,300 lines of changes across 64 files from 22 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.14.3:

Abigail, Andy Dougherty, Carl Hayter, Chris 'BinGOs' Williams, Dave Rolsky, David Mitchell, Dominic Hargreaves, Father Chrysostomos, Florian Ragwitz, H.Merijn Brand, Jilles Tjoelker, Karl Williamson, Leon Timmermans, Michael G Schwern, Nicholas Clark, Niko Tyni, Pino Toscano, Ricardo Signes, Salvador Fandiño, Samuel Thibault, Steve Hay, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5144delta

NAME

perl5144delta - what is new for perl v5.14.4

DESCRIPTION

This document describes differences between the 5.14.3 release and the 5.14.4 release.

If you are upgrading from an earlier release such as 5.12.0, first read perl5140delta, which describes differences between 5.12.0 and 5.14.0.

Core Enhancements

No changes since 5.14.0.

Security

This release contains one major, one medium, and a number of minor security fixes. The latter are included mainly to allow the test suite to pass cleanly with the clang compiler's address sanitizer facility.

CVE-2013-1667: memory exhaustion with arbitrary hash keys

With a carefully crafted set of hash keys (for example arguments on a URL), it is possible to cause a hash to consume a large amount of memory and CPU, and thus possibly to achieve a Denial-of-Service.

This problem has been fixed.

memory leak in Encode

The UTF-8 encoding implementation in Encode.xs had a memory leak which has been fixed.

[perl #111594] Socket::unpack_sockaddr_un heap-buffer-overflow

A read buffer overflow could occur when copying sockaddr buffers. Fairly harmless.

This problem has been fixed.

[perl #111586] SDBM_File: fix off-by-one access to global ".dir"

An extra byte was being copied for some string literals. Fairly harmless.

This problem has been fixed.

off-by-two error in List::Util

A string literal was being used that included two bytes beyond the end of the string. Fairly harmless.

This problem has been fixed.

[perl #115994] fix segv in regcomp.c:S_join_exact()

Under debugging builds, while marking optimised-out regex nodes as type OPTIMIZED , it could treat blocks of exact text as if they were nodes, and thus SEGV. Fairly harmless.

This problem has been fixed.

[perl #115992] PL_eval_start use-after-free

The statement local $[; , when preceded by an eval, and when not part of an assignment, could crash. Fairly harmless.

This problem has been fixed.

wrap-around with IO on long strings

Reading or writing strings greater than 2**31 bytes in size could segfault due to integer wraparound.

This problem has been fixed.

Incompatible Changes

There are no changes intentionally incompatible with 5.14.0. If any exist, they are bugs and reports are welcome.

Deprecations

There have been no deprecations since 5.14.0.

Modules and Pragmata

New Modules and Pragmata

None

Updated Modules and Pragmata

The following modules have just the minor code fixes as listed above in Security (version numbers have not changed):

  • Socket
  • SDBM_File
  • List::Util

Encode has been upgraded from version 2.42_01 to version 2.42_02.

Module::CoreList has been updated to version 2.49_06 to add data for this release.

Removed Modules and Pragmata

None.

Documentation

New Documentation

None.

Changes to Existing Documentation

None.

Diagnostics

No new or changed diagnostics.

Utility Changes

None

Configuration and Compilation

No changes.

Platform Support

New Platforms

None.

Discontinued Platforms

None.

Platform-Specific Notes

  • VMS

    5.14.3 failed to compile on VMS due to incomplete application of a patch series that allowed userelocatableinc and usesitecustomize to be used simultaneously. Other platforms were not affected and the problem has now been corrected.

Selected Bug Fixes

  • In Perl 5.14.0, $tainted ~~ @array stopped working properly. Sometimes it would erroneously fail (when $tainted contained a string that occurs in the array after the first element) or erroneously succeed (when undef occurred after the first element) [perl #93590].

Known Problems

None.

Acknowledgements

Perl 5.14.4 represents approximately 5 months of development since Perl 5.14.3 and contains approximately 1,700 lines of changes across 49 files from 12 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.14.4:

Andy Dougherty, Chris 'BinGOs' Williams, Christian Hansen, Craig A. Berry, Dave Rolsky, David Mitchell, Dominic Hargreaves, Father Chrysostomos, Florian Ragwitz, Reini Urban, Ricardo Signes, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5160delta

NAME

perl5160delta - what is new for perl v5.16.0

DESCRIPTION

This document describes differences between the 5.14.0 release and the 5.16.0 release.

If you are upgrading from an earlier release such as 5.12.0, first read perl5140delta, which describes differences between 5.12.0 and 5.14.0.

Some bug fixes in this release have been backported to later releases of 5.14.x. Those are indicated with the 5.14.x version in parentheses.

Notice

With the release of Perl 5.16.0, the 5.12.x series of releases is now out of its support period. There may be future 5.12.x releases, but only in the event of a critical security issue. Users of Perl 5.12 or earlier should consider upgrading to a more recent release of Perl.

This policy is described in greater detail in perlpolicy.

Core Enhancements

use VERSION

As of this release, version declarations like use v5.16 now disable all features before enabling the new feature bundle. This means that the following holds true:

  use 5.016;
  # only 5.16 features enabled here
  use 5.014;
  # only 5.14 features enabled here (not 5.16)

use v5.12 and higher continue to enable strict, but explicit use strict and no strict now override the version declaration, even when they come first:

  no strict;
  use 5.012;
  # no strict here

There is a new ":default" feature bundle that represents the set of features enabled before any version declaration or use feature has been seen. Version declarations below 5.10 now enable the ":default" feature set. This does not actually change the behavior of use v5.8 , because features added to the ":default" set are those that were traditionally enabled by default, before they could be turned off.

no feature now resets to the default feature set. To disable all features (which is likely to be a pretty special-purpose request, since it presumably won't match any named set of semantics) you can now write no feature ':all' .

$[ is now disabled under use v5.16 . It is part of the default feature set and can be turned on or off explicitly with use feature 'array_base' .

__SUB__

The new __SUB__ token, available under the current_sub feature (see feature) or use v5.16 , returns a reference to the current subroutine, making it easier to write recursive closures.
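
As a minimal sketch of the recursive-closure case (the variable name $factorial is only illustrative), an anonymous subroutine can call itself through __SUB__ without capturing a variable that refers back to it:

```perl
use strict;
use warnings;
use feature 'current_sub';   # also enabled by "use v5.16"

# Anonymous recursive subroutine: __SUB__ refers to the sub itself,
# so no outer variable has to hold a self-reference.
my $factorial = sub {
    my ($n) = @_;
    return 1 if $n <= 1;
    return $n * __SUB__->($n - 1);
};

print $factorial->(5), "\n";    # prints 120
```

Because the closure does not capture $factorial, it avoids the circular reference that the older "my $f; $f = sub { ... $f->(...) }" idiom creates.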

New and Improved Built-ins

More consistent eval

The eval operator sometimes treats a string argument as a sequence of characters and sometimes as a sequence of bytes, depending on the internal encoding. The internal encoding is not supposed to make any difference, but there is code that relies on this inconsistency.

The new unicode_eval and evalbytes features (enabled under use 5.16.0 ) resolve this. The unicode_eval feature causes eval $string to treat the string always as Unicode. The evalbytes feature provides a function, itself called evalbytes, which evaluates its argument always as a string of bytes.

These features also fix oddities with source filters leaking to outer dynamic scopes.

See feature for more detail.

substr lvalue revamp

When substr is called in lvalue or potential lvalue context with two or three arguments, a special lvalue scalar is returned that modifies the original string (the first argument) when assigned to.

Previously, the offsets (the second and third arguments) passed to substr would be converted immediately to match the string, negative offsets being translated to positive and offsets beyond the end of the string being truncated.

Now, the offsets are recorded without modification in the special lvalue scalar that is returned, and the original string is not even looked at by substr itself, but only when the returned lvalue is read or modified.

These changes result in an incompatible change:

If the original string changes length after the call to substr but before assignment to its return value, negative offsets will remember their position from the end of the string, affecting code like this:

  my $string = "string";
  my $lvalue = \substr $string, -4, 2;
  print $$lvalue, "\n"; # prints "ri"
  $string = "bailing twine";
  print $$lvalue, "\n"; # prints "wi"; used to print "il"

The same thing happens with an omitted third argument. The returned lvalue will always extend to the end of the string, even if the string becomes longer.
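
The omitted-length case can be sketched the same way. The variable names are illustrative, and the second output assumes Perl 5.16 or later:

```perl
use strict;
use warnings;

my $string = "abc";
my $tail   = \substr $string, 1;   # no length: "from offset 1 to the end"

print $$tail, "\n";                # prints "bc"
$string = "abcdef";                # the original string grows
print $$tail, "\n";                # prints "bcdef" on 5.16 and later
```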

Since this change also allowed many bugs to be fixed (see The substr operator), and since the behavior of negative offsets has never been specified, the change was deemed acceptable.

Return value of tied

The value returned by tied on a tied variable is now the actual scalar that holds the object to which the variable is tied. This lets ties be weakened with Scalar::Util::weaken(tied $tied_variable) .

Unicode Support

Supports (almost) Unicode 6.1

Besides the addition of whole new scripts, and new characters in existing scripts, this new version of Unicode, as always, makes some changes to existing characters. One change that may trip up some applications is that the General Category of two characters in the Latin-1 range, PILCROW SIGN and SECTION SIGN, has been changed from Other_Symbol to Other_Punctuation. The same change has been made for a character in each of Tibetan, Ethiopic, and Aegean. The code points U+3248..U+324F (CIRCLED NUMBER TEN ON BLACK SQUARE through CIRCLED NUMBER EIGHTY ON BLACK SQUARE) have had their General Category changed from Other_Symbol to Other_Numeric. The Line Break property has changes for Hebrew and Japanese; and because of other changes in 6.1, the Perl regular expression construct \X now works differently for some characters in Thai and Lao.

New aliases (synonyms) have been defined for many property values; these, along with the previously existing ones, are all cross-indexed in perluniprops.

The return value of charnames::viacode() is affected by other changes:

  Code point  Old Name              New Name
  U+000A      LINE FEED (LF)        LINE FEED
  U+000C      FORM FEED (FF)        FORM FEED
  U+000D      CARRIAGE RETURN (CR)  CARRIAGE RETURN
  U+0085      NEXT LINE (NEL)       NEXT LINE
  U+008E      SINGLE-SHIFT 2        SINGLE-SHIFT-2
  U+008F      SINGLE-SHIFT 3        SINGLE-SHIFT-3
  U+0091      PRIVATE USE 1         PRIVATE USE-1
  U+0092      PRIVATE USE 2         PRIVATE USE-2
  U+2118      SCRIPT CAPITAL P      WEIERSTRASS ELLIPTIC FUNCTION

Perl will accept any of these names as input, but charnames::viacode() now returns the new name of each pair. The change for U+2118 is considered by Unicode to be a correction; that is, the original name was a mistake (but again, it will remain forever valid to use it to refer to U+2118). But most of these changes are the fallout of the mistake Unicode 6.0 made in naming a character used in Japanese cell phones to be "BELL", which conflicts with the longstanding industry use of (and Unicode's recommendation to use) that name to mean the ASCII control character at U+0007. Therefore, that name has been deprecated in Perl since v5.14, and any use of it will raise a warning message (unless turned off). The name "ALERT" is now the preferred name for this code point, with "BEL" an acceptable short form. The name for the new cell phone character, at code point U+1F514, remains undefined in this version of Perl (hence we don't implement quite all of Unicode 6.1), but starting in v5.18, BELL will mean this character, and not U+0007.

Unicode has taken steps to make sure that this sort of mistake does not happen again. The Standard now includes all generally accepted names and abbreviations for control characters, whereas previously it didn't (though there were recommended names for most of them, which Perl used). This means that most of those recommended names are now officially in the Standard. Unicode did not recommend names for the four code points listed above between U+008E and U+0092, and in standardizing them Unicode subtly changed the names that Perl had previously given them, by replacing the final blank in each name by a hyphen. Unicode also officially accepts names that Perl had deprecated, such as FILE SEPARATOR. Now the only deprecated name is BELL. Finally, Perl now uses the new official names instead of the old (now considered obsolete) names for the first four code points in the list above (the ones which have the parentheses in them).

Now that the names have been placed in the Unicode standard, these kinds of changes should not happen again, though corrections, such as to U+2118, are still possible.

Unicode also added some name abbreviations, which Perl now accepts: SP for SPACE; TAB for CHARACTER TABULATION; NEW LINE, END OF LINE, NL, and EOL for LINE FEED; LOCKING-SHIFT ONE for SHIFT OUT; LOCKING-SHIFT ZERO for SHIFT IN; and ZWNBSP for ZERO WIDTH NO-BREAK SPACE.

More details on this version of Unicode are provided in http://www.unicode.org/versions/Unicode6.1.0/.

use charnames is no longer needed for \N{name}

When \N{name} is encountered, the charnames module is now automatically loaded when needed as if the :full and :short options had been specified. See charnames for more information.
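
A short sketch of what this permits; the variable name is illustrative and the script deliberately contains no use charnames line:

```perl
use strict;
use warnings;

# No "use charnames" line: the module is loaded on demand at compile time.
my $alpha = "\N{GREEK SMALL LETTER ALPHA}";
printf "U+%04X\n", ord $alpha;     # prints U+03B1
```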

\N{...} can now have Unicode loose name matching

This is described in the charnames item in Updated Modules and Pragmata below.

Unicode Symbol Names

Perl now has proper support for Unicode in symbol names. It used to be that *{$foo} would ignore the internal UTF8 flag and use the bytes of the underlying representation to look up the symbol. That meant that *{"\x{100}"} and *{"\xc4\x80"} would return the same thing. All these parts of Perl have been fixed to account for Unicode:

  • Method names (including those passed to use overload )

  • Typeglob names (including names of variables, subroutines, and filehandles)

  • Package names

  • goto

  • Symbolic dereferencing

  • Second argument to bless() and tie()

  • Return value of ref()

  • Subroutine prototypes

  • Attributes

  • Various warnings and error messages that mention variable names or values, methods, etc.

In addition, a parsing bug has been fixed that prevented *{é} from implicitly quoting the name, but instead interpreted it as *{+é} , which would cause a strict violation.

*{"*a::b"} automatically strips off the * if it is followed by an ASCII letter. That has been extended to all Unicode identifier characters.

One-character non-ASCII non-punctuation variables (like $é) are now subject to "Used only once" warnings. They used to be exempt, as they were treated as punctuation variables.

Also, single-character Unicode punctuation variables (like $‰ ) are now supported [perl #69032].

Improved ability to mix locales and Unicode, including UTF-8 locales

An optional parameter has been added to use locale

  use locale ':not_characters';

which tells Perl to use all but the LC_CTYPE and LC_COLLATE portions of the current locale. Instead, the character set is assumed to be Unicode. This lets locales and Unicode be seamlessly mixed, including the increasingly frequent UTF-8 locales. When using this hybrid form of locales, the :locale layer to the open pragma can be used to interface with the file system, and there are CPAN modules available for ARGV and environment variable conversions.

Full details are in perllocale.

New function fc and corresponding escape sequence \F for Unicode foldcase

Unicode foldcase is an extension to lowercase that gives better results when comparing two strings case-insensitively. It has long been used internally in regular expression /i matching. Now it is available explicitly through the new fc function call (enabled by "use feature 'fc'" , or use v5.16 , or explicitly callable via CORE::fc ) or through the new \F sequence in double-quotish strings.

Full details are in fc.
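
The difference between lowercasing and foldcasing can be sketched with the German sharp s, which folds to "ss" but is left unchanged by lc. The strings and variable names below are illustrative only:

```perl
use v5.16;      # enables the fc and unicode_strings features, plus strict
use warnings;

my $upper = "STRASSE";
my $lower = "stra\N{U+00DF}e";   # U+00DF is LATIN SMALL LETTER SHARP S

# lc leaves the sharp s alone, so the lowercase comparison fails ...
my $lc_equal = lc($upper) eq lc($lower) ? "equal" : "different";
# ... while fc folds it to "ss", so the foldcase comparison succeeds.
my $fc_equal = fc($upper) eq fc($lower) ? "equal" : "different";

say "lc: $lc_equal";     # lc: different
say "fc: $fc_equal";     # fc: equal

# \F applies the same folding inside a double-quoted string.
say "folded: \F$upper";  # folded: strasse
```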

The Unicode Script_Extensions property is now supported.

New in Unicode 6.0, this is an improved Script property. Details are in Scripts in perlunicode.

XS Changes

Improved typemaps for Some Builtin Types

Most XS authors will know there is a longstanding bug in the OUTPUT typemap for T_AVREF (AV* ), T_HVREF (HV* ), T_CVREF (CV* ), and T_SVREF (SVREF or \$foo ) that requires manually decrementing the reference count of the return value instead of the typemap taking care of this. For backwards-compatibility, this cannot be changed in the default typemaps. But we now provide additional typemaps T_AVREF_REFCOUNT_FIXED , etc. that do not exhibit this bug. Using them in your extension is as simple as having one line in your TYPEMAP section:

  HV* T_HVREF_REFCOUNT_FIXED

is_utf8_char()

The XS-callable function is_utf8_char() , when presented with malformed UTF-8 input, can read up to 12 bytes beyond the end of the string. This cannot be fixed without changing its API, and so its use is now deprecated. Use is_utf8_char_buf() (described just below) instead.

Added is_utf8_char_buf()

This function is designed to replace the deprecated is_utf8_char() function. It includes an extra parameter to make sure it doesn't read past the end of the input buffer.

Other is_utf8_foo() functions, as well as utf8_to_foo() , etc.

Most other XS-callable functions that take UTF-8 encoded input implicitly assume that the UTF-8 is valid (not malformed) with respect to buffer length. Do not do things such as change a character's case or see if it is alphanumeric without first being sure that it is valid UTF-8. This can be safely done for a whole string by using one of the functions is_utf8_string() , is_utf8_string_loc() , and is_utf8_string_loclen() .

New Pad API

Many new functions have been added to the API for manipulating lexical pads. See Pad Data Structures in perlapi for more information.

Changes to Special Variables

$$ can be assigned to

$$ was made read-only in Perl 5.8.0. But only sometimes: local $$ would make it writable again. Some CPAN modules were using local $$ or XS code to bypass the read-only check, so there is no reason to keep $$ read-only. (This change also allowed a bug to be fixed while maintaining backward compatibility.)

$^X converted to an absolute path on FreeBSD, OS X and Solaris

$^X is now converted to an absolute path on OS X, FreeBSD (without needing /proc mounted) and Solaris 10 and 11. This augments the previous approach of using /proc on Linux, FreeBSD, and NetBSD (in all cases, where mounted).

This makes relocatable perl installations more useful on these platforms. (See "Relocatable @INC" in INSTALL)

Debugger Changes

Features inside the debugger

The current Perl's feature bundle is now enabled for commands entered in the interactive debugger.

New option for the debugger's t command

The t command in the debugger, which toggles tracing mode, now accepts a numeric argument that determines how many levels of subroutine calls to trace.

enable and disable

The debugger now has disable and enable commands for disabling existing breakpoints and re-enabling them. See perldebug.

Breakpoints with file names

The debugger's "b" command for setting breakpoints now lets a line number be prefixed with a file name. See b [file]:[line] [condition] in perldebug.

The CORE Namespace

The CORE:: prefix

The CORE:: prefix can now be used on keywords enabled by feature.pm, even outside the scope of use feature .

Subroutines in the CORE namespace

Many Perl keywords are now available as subroutines in the CORE namespace. This lets them be aliased:

  BEGIN { *entangle = \&CORE::tie }
  entangle $variable, $package, @args;

And for prototypes to be bypassed:

  sub mytie(\[%$*@]$@) {
      my ($ref, $pack, @args) = @_;
      ... do something ...
      goto &CORE::tie;
  }

Some of these cannot be called through references or via &foo syntax, but must be called as barewords.

See CORE for details.
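
A self-contained sketch of the aliasing case, using lc rather than tie so that it runs on its own. The alias name lowercase is hypothetical:

```perl
use strict;
use warnings;

# Alias a built-in under a new name ("lowercase" is a made-up name).
BEGIN { *lowercase = \&CORE::lc }

print lowercase("HeLLo"), "\n";   # prints "hello"
```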

Other Changes

Anonymous handles

Automatically generated file handles are now named __ANONIO__ when the variable name cannot be determined, rather than $__ANONIO__.

Autoloaded sort Subroutines

Custom sort subroutines can now be autoloaded [perl #30661]:

  sub AUTOLOAD { ... }
  @sorted = sort foo @list; # uses AUTOLOAD

continue no longer requires the "switch" feature

The continue keyword has two meanings. It can introduce a continue block after a loop, or it can exit the current when block. Up to now, the latter meaning was valid only with the "switch" feature enabled, and was a syntax error otherwise. Since the main purpose of feature.pm is to avoid conflicts with user-defined subroutines, there is no reason for continue to depend on it.

DTrace probes for interpreter phase change

The phase-change probes will fire when the interpreter's phase changes, which tracks the ${^GLOBAL_PHASE} variable. arg0 is the new phase name; arg1 is the old one. This is useful for limiting your instrumentation to one or more of: compile time, run time, or destruct time.

__FILE__() Syntax

The __FILE__ , __LINE__ and __PACKAGE__ tokens can now be written with an empty pair of parentheses after them. This makes them parse the same way as time, fork and other built-in functions.

The \$ prototype accepts any scalar lvalue

The \$ and \[$] subroutine prototypes now accept any scalar lvalue argument. Previously they accepted only scalars beginning with $ and hash and array elements. This change makes them consistent with the way the built-in read and recv functions (among others) parse their arguments. This means that one can override the built-in functions with custom subroutines that parse their arguments the same way.

_ in subroutine prototypes

The _ character in subroutine prototypes is now allowed before @ or % .
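
A minimal sketch of such a prototype, assuming Perl 5.16 or later; describe is a hypothetical helper used only for illustration:

```perl
use strict;
use warnings;

# "_" takes one scalar (defaulting to $_); "@" then slurps the rest.
sub describe(_@) {
    my ($subject, @tags) = @_;
    return "$subject [@tags]";
}

print describe("apple", "red", "round"), "\n";   # prints "apple [red round]"

my $defaulted;
for ("pear") {
    $defaulted = describe();     # no arguments: $_ supplies the subject
}
print "$defaulted\n";            # prints "pear []"
```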

Security

Use is_utf8_char_buf() and not is_utf8_char()

The latter function is now deprecated because its API is insufficient to guarantee that it doesn't read (up to 12 bytes in the worst case) beyond the end of its input string. See is_utf8_char_buf().

Malformed UTF-8 input could cause attempts to read beyond the end of the buffer

Two new XS-accessible functions, utf8_to_uvchr_buf() and utf8_to_uvuni_buf() are now available to prevent this, and the Perl core has been converted to use them. See Internal Changes.

File::Glob::bsd_glob() memory error with GLOB_ALTDIRFUNC (CVE-2011-2728).

Calling File::Glob::bsd_glob with the unsupported flag GLOB_ALTDIRFUNC would cause an access violation / segfault. A Perl program that accepts a flags value from an external source could expose itself to denial of service or arbitrary code execution attacks. There are no known exploits in the wild. The problem has been corrected by explicitly disabling all unsupported flags and setting unused function pointers to null. Bug reported by Clément Lecigne. (5.14.2)

Privileges are now set correctly when assigning to $(

A hypothetical bug (probably unexploitable in practice) caused by the incorrect setting of the effective group ID while setting $( has been fixed. The bug would have affected only systems that have setresgid() but not setregid() , but no such systems are known to exist.

Deprecations

Don't read the Unicode data base files in lib/unicore

It is now deprecated to directly read the Unicode data base files. These are stored in the lib/unicore directory. Instead, you should use the new functions in Unicode::UCD. These provide a stable API, and give complete information.

Perl may at some point in the future change or remove these files. The file which applications were most likely to have used is lib/unicore/ToDigit.pl. prop_invmap() in Unicode::UCD can be used to get at its data instead.
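
As a rough sketch of querying property data through Unicode::UCD instead of reading the files, the example below uses prop_invlist(), a simpler sibling of prop_invmap() in the same module. The helper in_invlist and the choice of the Uppercase property are assumptions made for illustration:

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_invlist);

# prop_invlist returns a sorted inversion list: the element at each even
# index starts a range of code points that have the property, and the
# element at each odd index starts a range of code points that do not.
my @uppercase = prop_invlist("Uppercase");

# in_invlist is a hypothetical helper: binary-search for the last list
# element that is <= $code_point; the property holds if its index is even.
sub in_invlist {
    my ($code_point, $list) = @_;
    my ($low, $high) = (0, $#$list);
    my $found = -1;                      # -1: below the first range
    while ($low <= $high) {
        my $mid = int(($low + $high) / 2);
        if ($list->[$mid] <= $code_point) {
            $found = $mid;
            $low   = $mid + 1;
        }
        else {
            $high = $mid - 1;
        }
    }
    return $found >= 0 && $found % 2 == 0;
}

for my $char ("A", "a") {
    my $verdict = in_invlist(ord($char), \@uppercase) ? "uppercase" : "not uppercase";
    print "$char: $verdict\n";
}
```

The lookup is a binary search, so it needs only a logarithmic number of probes per code point.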

XS functions is_utf8_char() , utf8_to_uvchr() and utf8_to_uvuni()

These functions are deprecated because they could read beyond the end of the input string. Use the new is_utf8_char_buf(), utf8_to_uvchr_buf() and utf8_to_uvuni_buf() instead.

Future Deprecations

This section serves as a notice of features that are likely to be removed or deprecated in the next release of perl (5.18.0). If your code depends on these features, you should contact the Perl 5 Porters via the mailing list or perlbug to explain your use case and inform the deprecation process.

Core Modules

These modules may be marked as deprecated from the core. This only means that they will no longer be installed by default with the core distribution, but will remain available on the CPAN.

  • CPANPLUS

  • Filter::Simple

  • PerlIO::mmap

  • Pod::LaTeX

  • Pod::Parser

  • SelfLoader

  • Text::Soundex

  • Thread.pm

Platforms with no supporting programmers

These platforms will probably have their special build support removed during the 5.17.0 development series.

  • BeOS

  • djgpp

  • dgux

  • EPOC

  • MPE/iX

  • Rhapsody

  • UTS

  • VM/ESA

Other Future Deprecations

  • Swapping of $< and $>

    For more information about this future deprecation, see the relevant RT ticket.

  • sfio, stdio

    Perl supports being built without PerlIO proper, using a stdio or sfio wrapper instead. A perl build like this will not support IO layers and thus Unicode IO, making it rather handicapped.

    PerlIO supports a stdio layer if stdio use is desired, and similarly a sfio layer could be produced.

  • Unescaped literal "{" in regular expressions.

    Starting with v5.20, it is planned to require a literal "{" to be escaped, for example by preceding it with a backslash. In v5.18, a deprecated warning message will be emitted for all such uses. This affects only patterns that are to match a literal "{" . Other uses of this character, such as part of a quantifier or sequence as in those below, are completely unaffected:

    /foo{3,5}/
    /\p{Alphabetic}/
    /\N{DIGIT ZERO}/

    Removing this will permit extensions to Perl's pattern syntax and better error checking for existing syntax. See Quantifiers in perlre for an example.

  • Revamping "\Q" semantics in double-quotish strings when combined with other escapes.

    There are several bugs and inconsistencies involving combinations of \Q and escapes like \x , \L , etc., within a \Q...\E pair. These need to be fixed, and doing so will necessarily change current behavior. The changes have not yet been settled.

Incompatible Changes

Special blocks called in void context

Special blocks (BEGIN , CHECK , INIT , UNITCHECK , END ) are now called in void context. This avoids wasteful copying of the result of the last statement [perl #108794].

The overloading pragma and regexp objects

With no overloading , regular expression objects returned by qr// are now stringified as "Regexp=REGEXP(0xbe600d)" instead of the regular expression itself [perl #108780].
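
The difference can be sketched as follows, assuming Perl 5.16 or later. The variable names are illustrative, and the address in the second string varies from run to run:

```perl
use strict;
use warnings;

my $pattern = qr/ab+c/;

my $with_overloading = "$pattern";     # the pattern text, e.g. (?^:ab+c)

my $without_overloading = do {
    no overloading;
    "$pattern";                        # Regexp=REGEXP(0x...) on 5.16 and later
};

print "$with_overloading\n$without_overloading\n";
```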

Two XS typemap Entries removed

Two presumably unused XS typemap entries have been removed from the core typemap: T_DATAUNIT and T_CALLBACK. If you are, against all odds, a user of these, please see the instructions on how to restore them in perlxstypemap.

Unicode 6.1 has incompatibilities with Unicode 6.0

These are detailed in Supports (almost) Unicode 6.1 above. You can compile this version of Perl to use Unicode 6.0. See Hacking Perl to work on earlier Unicode versions (for very serious hackers only) in perlunicode.

Borland compiler

All support for the Borland compiler has been dropped. The code had not worked for a long time anyway.

Certain deprecated Unicode properties are no longer supported by default

Perl should never have exposed certain Unicode properties that are used by Unicode internally and not meant to be publicly available. Use of these has generated deprecated warning messages since Perl 5.12. The removed properties are Other_Alphabetic, Other_Default_Ignorable_Code_Point, Other_Grapheme_Extend, Other_ID_Continue, Other_ID_Start, Other_Lowercase, Other_Math, and Other_Uppercase.

Perl may be recompiled to include any or all of them; instructions are given in Unicode character properties that are NOT accepted by Perl in perluniprops.

Dereferencing IO thingies as typeglobs

The *{...} operator, when passed a reference to an IO thingy (as in *{*STDIN{IO}} ), creates a new typeglob containing just that IO object. Previously, it would stringify as an empty string, but some operators would treat it as undefined, producing an "uninitialized" warning. Now it stringifies as __ANONIO__ [perl #96326].

User-defined case-changing operations

This feature was deprecated in Perl 5.14, and has now been removed. The CPAN module Unicode::Casing provides better functionality without the drawbacks that this feature had, as are detailed in the 5.14 documentation: http://perldoc.perl.org/5.14.0/perlunicode.html#User-Defined-Case-Mappings-%28for-serious-hackers-only%29

XSUBs are now 'static'

XSUB C functions are now 'static', that is, they are not visible from outside the compilation unit. Users can use the new XS_EXTERNAL(name) and XS_INTERNAL(name) macros to pick the desired linking behavior. The ordinary XS(name) declaration for XSUBs will continue to declare non-'static' XSUBs for compatibility, but the XS compiler, ExtUtils::ParseXS (xsubpp ) will emit 'static' XSUBs by default. ExtUtils::ParseXS's behavior can be reconfigured from XS using the EXPORT_XSUB_SYMBOLS keyword. See perlxs for details.

Weakening read-only references

Weakening read-only references is no longer permitted. It should never have worked anyway, and could sometimes result in crashes.

Tying scalars that hold typeglobs

Attempting to tie a scalar after a typeglob was assigned to it would instead tie the handle in the typeglob's IO slot. This meant that it was impossible to tie the scalar itself. Similar problems affected tied and untie: tied $scalar would return false on a tied scalar if the last thing returned was a typeglob, and untie $scalar on such a tied scalar would do nothing.

We fixed this problem before Perl 5.14.0, but it caused problems with some CPAN modules, so we put in a deprecation cycle instead.

Now the deprecation has been removed and this bug has been fixed. So tie $scalar will always tie the scalar, not the handle it holds. To tie the handle, use tie *$scalar (with an explicit asterisk). The same applies to tied *$scalar and untie *$scalar .

IPC::Open3 no longer provides xfork() , xclose_on_exec() and xpipe_anon()

All three functions were private, undocumented, and unexported. They do not appear to be used by any code on CPAN. Two have been inlined and one deleted entirely.

$$ no longer caches PID

Previously, if one called fork(3) from C, Perl's notion of $$ could go out of sync with what getpid() returns. By always fetching the value of $$ via getpid(), this potential bug is eliminated. Code that depends on the caching behavior will break. As described in Core Enhancements, $$ is now writable, but it will be reset during a fork.

$$ and getppid() no longer emulate POSIX semantics under LinuxThreads

The POSIX emulation of $$ and getppid() under the obsolete LinuxThreads implementation has been removed. This only impacts users of Linux 2.4 and users of Debian GNU/kFreeBSD up to and including 6.0, not the vast majority of Linux installations that use NPTL threads.

This means that getppid(), like $$ , is now always guaranteed to return the OS's idea of the current state of the process, not perl's cached version of it.

See the documentation for $$ for details.

$< , $> , $( and $) are no longer cached

Similarly to the changes to $$ and getppid(), the internal caching of $< , $> , $( and $) has been removed.

When we cached these values our idea of what they were would drift out of sync with reality if someone (e.g., someone embedding perl) called sete?[ug]id() without updating PL_e?[ug]id . Having to deal with this complexity wasn't worth it given how cheap the gete?[ug]id() system call is.

This change will break a handful of CPAN modules that use the XS-level PL_uid , PL_gid , PL_euid or PL_egid variables.

The fix for those breakages is to use PerlProc_gete?[ug]id() to retrieve them (e.g., PerlProc_getuid() ), and not to assign to PL_e?[ug]id if you change the UID/GID/EUID/EGID. There is no longer any need to do so since perl will always retrieve the up-to-date version of those values from the OS.

Which Non-ASCII characters get quoted by quotemeta and \Q has changed

This is unlikely to result in a real problem, as Perl does not attach special meaning to any non-ASCII character, so it is currently irrelevant which are quoted or not. This change fixes bug [perl #77654] and brings Perl's behavior more into line with Unicode's recommendations. See quotemeta.

Performance Enhancements

  • Improved performance for Unicode properties in regular expressions

    Matching a code point against a Unicode property is now done via a binary search instead of linear. This means for example that the worst case for a 1000 item property is 10 probes instead of 1000. This inefficiency has been compensated for in the past by permanently storing in a hash the results of a given probe plus the results for the adjacent 64 code points, under the theory that near-by code points are likely to be searched for. A separate hash was used for each mention of a Unicode property in each regular expression. Thus, qr/\p{foo}abc\p{foo}/ would generate two hashes. Any probes in one instance would be unknown to the other, and the hashes could expand separately to be quite large if the regular expression were used on many different widely-separated code points. Now, however, there is just one hash shared by all instances of a given property. This means that if \p{foo} is matched against "A" in one regular expression in a thread, the result will be known immediately to all regular expressions, and the relentless march of using up memory is slowed considerably.

  • Version declarations with the use keyword (e.g., use 5.012 ) are now faster, as they enable features without loading feature.pm.

  • local $_ is faster now, as it no longer iterates through magic that it is not going to copy anyway.

  • Perl 5.12.0 sped up the destruction of objects whose classes define empty DESTROY methods (to prevent autoloading), by simply not calling such empty methods. This release takes this optimization a step further, by not calling any DESTROY method that begins with a return statement. This can be useful for destructors that are only used for debugging:

    use constant DEBUG => 1;
    sub DESTROY { return unless DEBUG; ... }

    Constant-folding will reduce the first statement to return; if DEBUG is set to 0, triggering this optimization.

  • Assigning to a variable that holds a typeglob or copy-on-write scalar is now much faster. Previously the typeglob would be stringified or the copy-on-write scalar would be copied before being clobbered.

  • Assignment to substr in void context is now more than twice its previous speed. Instead of creating and returning a special lvalue scalar that is then assigned to, substr modifies the original string itself.

  • substr no longer calculates a value to return when called in void context.

  • Due to changes in File::Glob, Perl's glob function and its <...> equivalent are now much faster. The splitting of the pattern into words has been rewritten in C, resulting in speed-ups of 20% for some cases.

    This does not affect glob on VMS, as it does not use File::Glob.

  • The short-circuiting operators &&, ||, and // , when chained (such as $a || $b || $c ), are now considerably faster to short-circuit, due to reduced optree traversal.

  • The implementation of s///r makes one fewer copy of the scalar's value.

  • Recursive calls to lvalue subroutines in lvalue scalar context use less memory.
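
The binary search mentioned in the first item of this list can be sketched in Perl. The sketch assumes the property is held as an inversion list: a sorted array in which elements at even indexes start a range inside the set and elements at odd indexes start a range outside it. It is an illustration only, not the core's C implementation; the in_invlist helper and the sample list are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $cp is in the set described by the
# inversion list @$list, else 0. The binary search needs roughly 10 probes
# for a list of about 1000 entries.
sub in_invlist {
    my ($cp, $list) = @_;
    my ($lo, $hi) = (0, scalar @$list);    # search window is [$lo, $hi)
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($list->[$mid] <= $cp) { $lo = $mid + 1 }
        else                      { $hi = $mid }
    }
    # $lo now counts the elements <= $cp. An odd count means the nearest
    # range start at or below $cp opened an "inside" range.
    return $lo % 2 ? 1 : 0;
}

# Sample data (an assumption for this sketch): ASCII 0-9 and A-Z.
my @digits_and_upper = (0x30, 0x3A, 0x41, 0x5B);

for my $char ('5', 'Q', ':', 'a') {
    printf "%s: %d\n", $char, in_invlist(ord($char), \@digits_and_upper);
}
```

Each probe halves the search window, which is where the logarithmic worst case comes from.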

Modules and Pragmata

Deprecated Modules

New Modules and Pragmata

  • arybase -- this new module implements the $[ variable.

  • PerlIO::mmap 0.010 has been added to the Perl core.

    The mmap PerlIO layer is no longer implemented by perl itself, but has been moved out into the new PerlIO::mmap module.
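
From the caller's side, the layer is still requested by name in open. The following is a minimal sketch, assuming a platform where mmap is available; the scratch file and its contents are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file to read back.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "first line\nsecond line\n";
close $out or die "close failed: $!";

# Request the mmap layer explicitly in the open mode.
open my $in, '<:mmap', $path
    or die "cannot open $path with :mmap: $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " lines read via :mmap\n";
```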

Updated Modules and Pragmata

This is only an overview of selected module updates. For a complete list of updates, run:

  $ corelist --diff 5.14.0 5.16.0

You can substitute your favorite version in place of 5.14.0, too.
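
A similar comparison can be made from Perl code through the %Module::CoreList::version hash. This is a rough sketch rather than a reimplementation of corelist --diff: it reports only modules present in both releases whose version changed, and it assumes the installed Module::CoreList has data for both releases.

```perl
use strict;
use warnings;
use Module::CoreList;

# Perl releases are keyed by their numeric version (5.014000 is 5.14.0).
my $old = $Module::CoreList::version{5.014000} || {};
my $new = $Module::CoreList::version{5.016000} || {};

# Collect modules present in both releases whose version changed.
my %changed;
for my $module (sort keys %$new) {
    next unless exists $old->{$module};
    my ($from, $to) = map { defined $_ ? $_ : 'undef' }
                      $old->{$module}, $new->{$module};
    $changed{$module} = [$from, $to] if $from ne $to;
}

printf "%-25s %s -> %s\n", $_, @{ $changed{$_} } for sort keys %changed;
```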

  • Archive::Extract has been upgraded from version 0.48 to 0.58.

    Includes a fix for FreeBSD to only use unzip if it is located in /usr/local/bin , as FreeBSD 9.0 will ship with a limited unzip in /usr/bin .

  • Archive::Tar has been upgraded from version 1.76 to 1.82.

    Adjustments to handle files >8gb (>0777777777777 octal) and a feature to return the MD5SUM of files in the archive.

  • base has been upgraded from version 2.16 to 2.18.

    base no longer sets a module's $VERSION to "-1" when a module it loads does not define a $VERSION . This change has been made because "-1" is not a valid version number under the new "lax" criteria used internally by UNIVERSAL::VERSION . (See version for more on "lax" version criteria.)

    base no longer internally skips loading modules it has already loaded and instead relies on require to inspect %INC . This fixes a bug when base is used with code that clears %INC to force a module to be reloaded.

  • Carp has been upgraded from version 1.20 to 1.26.

    It now includes last read filehandle info and puts a dot after the file and line number, just like errors from die [perl #106538].

  • charnames has been updated from version 1.18 to 1.30.

    charnames can now be invoked with a new option, :loose , which is like the existing :full option, but enables Unicode loose name matching. Details are in LOOSE MATCHES in charnames.

  • B::Deparse has been upgraded from version 1.03 to 1.14. This fixes numerous deparsing bugs.

  • CGI has been upgraded from version 3.52 to 3.59.

    It uses the public and documented FCGI.pm API in CGI::Fast. CGI::Fast was using an FCGI API that was deprecated and removed from documentation more than ten years ago. Usage of this deprecated API with FCGI >= 0.70 and FCGI <= 0.73 introduces a security issue. See https://rt.cpan.org/Public/Bug/Display.html?id=68380 and http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2011-2766

    Things that may break your code:

    url() was fixed to return PATH_INFO when it is explicitly requested with either the path=>1 or path_info=>1 flag.

    If your code is running under mod_rewrite (or compatible) and you are calling self_url() or you are calling url() and passing path_info=>1 , these methods will actually be returning PATH_INFO now, as you have explicitly requested or self_url() has requested on your behalf.

    The PATH_INFO has been omitted in such URLs since the issue was introduced in the 3.12 release in December, 2005.

    This bug is so old that your application may have come to depend on it or work around it. Check your application before upgrading to this release.

    Examples of affected method calls:

    $q->url(-absolute => 1, -query => 1, -path_info => 1);
    $q->url(-path=>1);
    $q->url(-full=>1,-path=>1);
    $q->url(-rewrite=>1,-path=>1);
    $q->self_url();

    We no longer read from STDIN when the Content-Length is not set, preventing requests with no Content-Length from sometimes freezing. This is consistent with the CGI RFC 3875, and is also consistent with CGI::Simple. However, the old behavior may have been expected by some command-line uses of CGI.pm.

    In addition, the DELETE HTTP verb is now supported.

  • Compress::Zlib has been upgraded from version 2.035 to 2.048.

    IO::Compress::Zip and IO::Uncompress::Unzip now have support for LZMA (method 14). There is a fix for a CRC issue in IO::Uncompress::Unzip, and it supports Streamed Stored context now. A Zip64 issue in IO::Compress::Zip, which occurred when the content size was exactly 0xFFFFFFFF, has also been fixed.

  • Digest::SHA has been upgraded from version 5.61 to 5.71.

    Added BITS mode to the addfile method and shasum. This makes partial-byte inputs possible via files/STDIN and lets shasum check all 8074 NIST Msg vectors, where previously special programming was required to do this.

  • Encode has been upgraded from version 2.42 to 2.44.

    Missing aliases added, a deep recursion error fixed and various documentation updates.

    Addressed 'decode_xs n-byte heap-overflow' security bug in Unicode.xs (CVE-2011-2939). (5.14.2)

  • ExtUtils::CBuilder has been upgraded from version 0.280203 to 0.280206.

    The new version appends CFLAGS and LDFLAGS to their Config.pm counterparts.

  • ExtUtils::ParseXS has been upgraded from version 2.2210 to 3.16.

    Much of ExtUtils::ParseXS, the module behind the XS compiler xsubpp , was rewritten and cleaned up. It has been made somewhat more extensible and now finally uses strictures.

    The typemap logic has been moved into a separate module, ExtUtils::Typemaps. See New Modules and Pragmata, above.

    For a complete set of changes, please see the ExtUtils::ParseXS changelog, available on the CPAN.

  • File::Glob has been upgraded from version 1.12 to 1.17.

    On Windows, tilde (~) expansion now checks the USERPROFILE environment variable, after checking HOME .

    It has a new :bsd_glob export tag, intended to replace :glob . Like :glob it overrides glob with a function that does not split the glob pattern into words, but, unlike :glob , it iterates properly in scalar context, instead of returning the last file.

    There are other changes affecting Perl's own glob operator (which uses File::Glob internally, except on VMS). See Performance Enhancements and Selected Bug Fixes.

  • FindBin has been upgraded from version 1.50 to 1.51.

    It no longer returns a wrong result if a script of the same name as the current one exists in the path and is executable.

  • HTTP::Tiny has been upgraded from version 0.012 to 0.017.

    Added support for using $ENV{http_proxy} to set the default proxy host.

    Adds additional shorthand methods for all common HTTP verbs, a post_form() method for POST-ing x-www-form-urlencoded data and a www_form_urlencode() utility method.

  • IO has been upgraded from version 1.25_04 to 1.25_06, and IO::Handle from version 1.31 to 1.33.

    Together, these upgrades fix a problem with IO::Handle's getline and getlines methods. When these methods are called on the special ARGV handle, the next file is automatically opened, as happens with the built-in <> and readline functions. But, unlike the built-ins, these methods were not respecting the caller's use of the open pragma and applying the appropriate I/O layers to the newly-opened file [rt.cpan.org #66474].

  • IPC::Cmd has been upgraded from version 0.70 to 0.76.

    Capturing of command output (both STDOUT and STDERR ) is now supported using IPC::Open3 on MSWin32 without requiring IPC::Run.

  • IPC::Open3 has been upgraded from version 1.09 to 1.12.

    Fixes a bug which prevented use of open3 on Windows when *STDIN , *STDOUT or *STDERR had been localized.

    Fixes a bug which prevented duplicating numeric file descriptors on Windows.

    open3 with "-" for the program name works once more. This was broken in version 1.06 (and hence in Perl 5.14.0) [perl #95748].

  • Locale::Codes has been upgraded from version 3.16 to 3.21.

    Added Language Extension codes (langext) and Language Variation codes (langvar) as defined in the IANA language registry.

    Added language codes from ISO 639-5.

    Added language/script codes from the IANA language subtag registry.

    Fixed an uninitialized value warning [rt.cpan.org #67438].

    Fixed the return value for the all_XXX_codes and all_XXX_names functions [rt.cpan.org #69100].

    Reorganized modules to move Locale::MODULE to Locale::Codes::MODULE to allow for cleaner future additions. The original four modules (Locale::Language, Locale::Currency, Locale::Country, Locale::Script) will continue to work, but all new sets of codes will be added in the Locale::Codes namespace.

    The code2XXX, XXX2code, all_XXX_codes, and all_XXX_names functions now support retired codes. All codesets may be specified by a constant or by their name now. Previously, they were specified only by a constant.

    The alias_code function exists for backward compatibility. It has been replaced by rename_country_code. The alias_code function will be removed some time after September, 2013.

    All work is now done in the central module (Locale::Codes). Previously, some was still done in the wrapper modules (Locale::Codes::*). Added Language Family codes (langfam) as defined in ISO 639-5.

  • Math::BigFloat has been upgraded from version 1.993 to 1.997.

    The numify method has been corrected to return a normalized Perl number (the result of 0 + $thing ), instead of a string [rt.cpan.org #66732].

  • Math::BigInt has been upgraded from version 1.994 to 1.998.

    It provides a new bsgn method that complements the babs method.

    It fixes the internal objectify function's handling of "foreign objects" so they are converted to the appropriate class (Math::BigInt or Math::BigFloat).

  • Math::BigRat has been upgraded from version 0.2602 to 0.2603.

    int() on a Math::BigRat object containing -1/2 now creates a Math::BigInt containing 0, rather than -0. Math::BigInt does not even support negative zero, so the resulting object was actually malformed [perl #95530].

  • Math::Complex has been upgraded from version 1.56 to 1.59 and Math::Trig from version 1.2 to 1.22.

    Fixes include: correct copy constructor usage; fix polarwise formatting with numeric format specifier; and more stable great_circle_direction algorithm.

  • Module::CoreList has been upgraded from version 2.51 to 2.66.

    The corelist utility now understands the -r option for displaying Perl release dates and the --diff option to print the set of modlib changes between two perl distributions.

  • Module::Metadata has been upgraded from version 1.000004 to 1.000009.

    Adds provides method to generate a CPAN META provides data structure correctly; use of package_versions_from_directory is discouraged.

  • ODBM_File has been upgraded from version 1.10 to 1.12.

    The XS code is now compiled with PERL_NO_GET_CONTEXT , which will aid performance under ithreads.

  • open has been upgraded from version 1.08 to 1.10.

    It no longer turns off layers on standard handles when invoked without the ":std" directive. Similarly, when invoked with the ":std" directive, it now clears layers on STDERR before applying the new ones, and not just on STDIN and STDOUT [perl #92728].

  • overload has been upgraded from version 1.13 to 1.18.

    overload::Overloaded no longer calls can on the class, but uses another means to determine whether the object has overloading. It was never correct for it to call can , as overloading does not respect AUTOLOAD. So classes that autoload methods and implement can no longer have to account for overloading [perl #40333].

    A warning is now produced for invalid arguments. See New Diagnostics.

  • PerlIO::scalar has been upgraded from version 0.11 to 0.14.

    (This is the module that implements open $fh, '>', \$scalar .)

    It fixes a problem with open my $fh, ">", \$scalar not working if $scalar is a copy-on-write scalar. (5.14.2)

    It also fixes a hang that occurs with readline or <$fh> if a typeglob has been assigned to $scalar [perl #92258].

    It no longer assumes during seek that $scalar is a string internally. If it didn't crash, it was close to doing so [perl #92706]. Also, the internal print routine no longer assumes that the position set by seek is valid, but extends the string to that position, filling the intervening bytes (between the old length and the seek position) with nulls [perl #78980].

    Printing to an in-memory handle now works if the $scalar holds a reference, stringifying the reference before modifying it. References used to be treated as empty strings.

    Printing to an in-memory handle no longer crashes if the $scalar happens to hold a number internally, but no string buffer.

    Printing to an in-memory handle no longer creates scalars that confuse the regular expression engine [perl #108398].

  • Pod::Functions has been upgraded from version 1.04 to 1.05.

    Functions.pm is now generated at perl build time from annotations in perlfunc.pod. This will ensure that Pod::Functions and perlfunc remain in synchronisation.

  • Pod::Html has been upgraded from version 1.11 to 1.1502.

    This is an extensive rewrite of Pod::Html to use Pod::Simple under the hood. The output has changed significantly.

  • Pod::Perldoc has been upgraded from version 3.15_03 to 3.17.

    It corrects the search paths on VMS [perl #90640]. (5.14.1)

    The -v option now fetches the right section for $0 .

    This upgrade has numerous significant fixes. Consult its changelog on the CPAN for more information.

  • POSIX has been upgraded from version 1.24 to 1.30.

    POSIX no longer uses AutoLoader. Any code which was relying on this implementation detail was buggy, and may fail because of this change. The module's Perl code has been considerably simplified, roughly halving the number of lines, with no change in functionality. The XS code has been refactored to reduce the size of the shared object by about 12%, with no change in functionality. More POSIX functions now have tests.

    sigsuspend and pause now run signal handlers before returning, as the whole point of these two functions is to wait until a signal has arrived, and then return after it has been triggered. Delayed, or "safe", signals were preventing that from happening, possibly resulting in race conditions [perl #107216].

    POSIX::sleep is now a direct call into the underlying OS sleep function, instead of being a Perl wrapper on CORE::sleep . POSIX::dup2 now returns the correct value on Win32 (i.e., the file descriptor). POSIX::SigSet sigsuspend and sigpending and POSIX::pause now dispatch safe signals immediately before returning to their caller.

    POSIX::Termios::setattr now defaults the third argument to TCSANOW , instead of 0. On most platforms TCSANOW is defined to be 0, but on some 0 is not a valid parameter, which caused a call with defaults to fail.

  • Socket has been upgraded from version 1.94 to 2.001.

    It has new functions and constants for handling IPv6 sockets:

    pack_ipv6_mreq
    unpack_ipv6_mreq
    IPV6_ADD_MEMBERSHIP
    IPV6_DROP_MEMBERSHIP
    IPV6_MTU
    IPV6_MTU_DISCOVER
    IPV6_MULTICAST_HOPS
    IPV6_MULTICAST_IF
    IPV6_MULTICAST_LOOP
    IPV6_UNICAST_HOPS
    IPV6_V6ONLY

  • Storable has been upgraded from version 2.27 to 2.34.

    It no longer turns copy-on-write scalars into read-only scalars when freezing and thawing.

  • Sys::Syslog has been upgraded from version 0.27 to 0.29.

    This upgrade closes many outstanding bugs.

  • Term::ANSIColor has been upgraded from version 3.00 to 3.01.

    Only interpret an initial array reference as a list of colors, not any initial reference, allowing the colored function to work properly on objects with stringification defined.

  • Term::ReadLine has been upgraded from version 1.07 to 1.09.

    Term::ReadLine now supports any event loop, including unpublished ones and simple IO::Select loops, without the need to rewrite existing code for any particular framework [perl #108470].

  • threads::shared has been upgraded from version 1.37 to 1.40.

    Destructors on shared objects used to be ignored sometimes if the objects were referenced only by shared data structures. This has been mostly fixed, but destructors may still be ignored if the objects still exist at global destruction time [perl #98204].

  • Unicode::Collate has been upgraded from version 0.73 to 0.89.

    Updated to CLDR 1.9.1.

    Locales updated to CLDR 2.0: mk, mt, nb, nn, ro, ru, sk, sr, sv, uk, zh__pinyin, zh__stroke

    Newly supported locales: bn, fa, ml, mr, or, pa, sa, si, si__dictionary, sr_Latn, sv__reformed, ta, te, th, ur, wae.

    Tailored compatibility ideographs as well as unified ideographs for the locales: ja, ko, zh__big5han, zh__gb2312han, zh__pinyin, zh__stroke.

    Locale/*.pl files are now searched for in @INC.

  • Unicode::Normalize has been upgraded from version 1.10 to 1.14.

    Fixes for the removal of unicore/CompositionExclusions.txt from core.

  • Unicode::UCD has been upgraded from version 0.32 to 0.43.

    This adds four new functions: prop_aliases() and prop_value_aliases() , which are used to find all Unicode-approved synonyms for property names, or to convert from one name to another; prop_invlist which returns all code points matching a given Unicode binary property; and prop_invmap which returns the complete specification of a given Unicode property.

  • Win32API::File has been upgraded from version 0.1101 to 0.1200.

    Added SetStdHandle and GetStdHandle functions.
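
As an illustration of the Unicode::UCD functions described above, the following sketch queries one binary property and one enumerated property. The property names were chosen for the example only.

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_aliases prop_value_aliases prop_invlist);

# All names Unicode accepts for a property.
my @names = prop_aliases('ASCII_Hex_Digit');
print "property aliases: @names\n";

# Synonyms for one value of an enumerated property.
my @values = prop_value_aliases('General_Category', 'Lu');
print "value aliases: @values\n";

# Inversion list for a binary property: elements at even indexes start a
# range that matches, elements at odd indexes start a range that does not.
my @invlist = prop_invlist('ASCII_Hex_Digit');
print "range boundaries: ",
      join(' ', map { sprintf 'U+%04X', $_ } @invlist), "\n";
```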

Removed Modules and Pragmata

As promised in Perl 5.14.0's release notes, the following modules have been removed from the core distribution, and if needed should be installed from CPAN instead.

  • Devel::DProf has been removed from the Perl core. Prior version was 20110228.00.

  • Shell has been removed from the Perl core. Prior version was 0.72_01.

  • Several old perl4-style libraries which have been deprecated with 5.14 are now removed:

    abbrev.pl assert.pl bigfloat.pl bigint.pl bigrat.pl cacheout.pl
    complete.pl ctime.pl dotsh.pl exceptions.pl fastcwd.pl flush.pl
    getcwd.pl getopt.pl getopts.pl hostname.pl importenv.pl
    lib/find{,depth}.pl look.pl newgetopt.pl open2.pl open3.pl
    pwd.pl shellwords.pl stat.pl tainted.pl termcap.pl timelocal.pl

    They can be found on CPAN as Perl4::CoreLibs.

Documentation

New Documentation

perldtrace

perldtrace describes Perl's DTrace support, lists the provided probes, and gives examples of their use.

perlexperiment

This document is intended to provide a list of experimental features in Perl. It is still a work in progress.

perlootut

This is a new OO tutorial. It focuses on basic OO concepts, and then recommends that readers choose an OO framework from CPAN.

perlxstypemap

The new manual describes the XS typemapping mechanism in unprecedented detail and combines new documentation with information extracted from perlxs and the previously unofficial list of all core typemaps.

Changes to Existing Documentation

perlapi

  • The HV API has long accepted negative lengths to show that the key is in UTF8. This is now documented.

  • The boolSV() macro is now documented.

perlfunc

  • dbmopen treats a 0 mode as a special case, that prevents a nonexistent file from being created. This has been the case since Perl 5.000, but was never documented anywhere. Now the perlfunc entry mentions it [perl #90064].

  • As an accident of history, open $fh, '<:', ... applies the default layers for the platform (:raw on Unix, :crlf on Windows), ignoring whatever is declared by open.pm. This seems such a useful feature that it has been documented in the perlfunc entry for open and in the documentation for the open pragma.

  • The entry for split has been rewritten. It is now far clearer than before.

perlguts

  • A new section, Autoloading with XSUBs, has been added, which explains the two APIs for accessing the name of the autoloaded sub.

  • Some function descriptions in perlguts were confusing, as it was not clear whether they referred to the function above or below the description. This has been clarified [perl #91790].

perlobj

  • This document has been rewritten from scratch, and its coverage of various OO concepts has been expanded.

perlop

  • Documentation of the smartmatch operator has been reworked and moved from perlsyn to perlop where it belongs.

    It has also been corrected for the case of undef on the left-hand side. The list of different smart match behaviors had an item in the wrong place.

  • Documentation of the ellipsis statement (... ) has been reworked and moved from perlop to perlsyn.

  • The explanation of bitwise operators has been expanded to explain how they work on Unicode strings (5.14.1).

  • More examples for m//g have been added (5.14.1).

  • The <<\FOO here-doc syntax has been documented (5.14.1).

perlpragma

  • There is now a standard convention for naming keys in the %^H , documented under Key naming.

Laundering and Detecting Tainted Data in perlsec

  • The example function for checking for taintedness contained a subtle error. $@ needs to be localized to prevent its changing this global's value outside the function. The preferred method to check for this remains tainted in Scalar::Util.
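
A minimal sketch of such a check, written in the style of the perlsec example with $@ localized, next to the preferred Scalar::Util::tainted. The sample data and the error string are made up for the illustration; when the script runs without taint mode, both checks report untainted data.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Taint check in the style of the perlsec example. Localizing $@ keeps the
# evals below from clobbering the caller's copy of that global.
sub is_tainted {
    local $@;
    return !eval { eval('#' . substr(join('', @_), 0, 0)); 1 };
}

$@ = "earlier error\n";    # pretend a previous eval failed
my $data = 'plain string';

my $by_eval   = is_tainted($data) ? 1 : 0;
my $by_util   = tainted($data)    ? 1 : 0;
my $preserved = $@ eq "earlier error\n" ? 'yes' : 'no';

print "is_tainted=$by_eval tainted=$by_util \$\@ preserved: $preserved\n";
```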

perllol

  • perllol has been expanded with examples using the new push $scalar syntax introduced in Perl 5.14.0 (5.14.1).

perlmod

  • perlmod now states explicitly that some types of explicit symbol table manipulation are not supported. This codifies what was effectively already the case [perl #78074].

perlpodstyle

  • The tips on which formatting codes to use have been corrected and greatly expanded.

  • There are now a couple of example one-liners for previewing POD files after they have been edited.

perlre

perlrun

  • perlrun has undergone a significant clean-up. Most notably, the -0x... form of the -0 flag has been clarified, and the final section on environment variables has been corrected and expanded (5.14.1).

perlsub

  • The ($;) prototype syntax, which has existed for rather a long time, is now documented in perlsub. It lets a unary function have the same precedence as a list operator.
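
The difference in precedence can be seen with two otherwise identical subroutines. This is a small sketch; the names double_unary and double_list are hypothetical.

```perl
use strict;
use warnings;

# Both subs double their argument; only the prototype differs.
sub double_unary ($)  { return 2 * $_[0] }
sub double_list  ($;) { return 2 * $_[0] }

# ($): parsed like a named unary operator, which binds tighter than "<",
# so this is (double_unary 1) < 2, that is, 2 < 2.
my $unary = double_unary 1 < 2;

# ($;): parsed like a list operator, so the whole comparison is the
# argument: double_list(1 < 2), that is, 2 * 1.
my $list = double_list 1 < 2;

printf "unary form is %s, list form is %s\n",
       ($unary ? 'true' : 'false'), $list;
```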

perltie

  • The required syntax for tying handles has been documented.

perlvar

  • The documentation for $! has been corrected and clarified. It used to state that $! could be undef, which is not the case. It was also unclear whether system calls set C's errno or Perl's $! [perl #91614].

  • Documentation for $$ has been amended with additional cautions regarding changing the process ID.

Other Changes

  • perlxs was extended with documentation on inline typemaps.

  • perlref has a new Circular References section explaining how circularities may not be freed and how to solve that with weak references.

  • Parts of perlapi were clarified, and Perl equivalents of some C functions have been added as an additional mode of exposition.

  • A few parts of perlre and perlrecharclass were clarified.

Removed Documentation

Old OO Documentation

The old OO tutorials, perltoot, perltooc, and perlboot, have been removed. The perlbot (bag of object tricks) document has been removed as well.

Development Deltas

The perldelta files for development releases are no longer packaged with perl. These can still be found in the perl source code repository.

Diagnostics

The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see perldiag.

New Diagnostics

New Errors

New Warnings

Removed Errors

  • "sort is now a reserved word"

    This error used to occur when sort was called without arguments, followed by ; or ). (E.g., sort; would die, but {sort} was OK.) This error message was added in Perl 3 to catch code like close(sort) which would no longer work. More than two decades later, this message is no longer appropriate. Now sort without arguments is always allowed, and returns an empty list, as it did in those cases where it was already allowed [perl #90030].

Changes to Existing Diagnostics

  • The "Applying pattern match..." or similar warning produced when an array or hash is on the left-hand side of the =~ operator now mentions the name of the variable.

  • The "Attempt to free non-existent shared string" has had the spelling of "non-existent" corrected to "nonexistent". It was already listed with the correct spelling in perldiag.

  • The error messages for using default and when outside a topicalizer have been standardized to match the messages for continue and loop controls. They now read 'Can't "default" outside a topicalizer' and 'Can't "when" outside a topicalizer'. They both used to be 'Can't use when() outside a topicalizer' [perl #91514].

  • The message, "Code point 0x%X is not Unicode, no properties match it; all inverse properties do" has been changed to "Code point 0x%X is not Unicode, all \p{} matches fail; all \P{} matches succeed".

  • Redefinition warnings for constant subroutines used to be mandatory, even occurring under no warnings . Now they respect the warnings pragma.

  • The "glob failed" warning message is now suppressible via no warnings [perl #111656].

  • The Invalid version format error message now says "negative version number" within the parentheses, rather than "non-numeric data", for negative numbers.

  • The two warnings Possible attempt to put comments in qw() list and Possible attempt to separate words with commas are no longer mutually exclusive: the same qw construct may produce both.

  • The uninitialized warning for y///r when $_ is implicit and undefined now mentions the variable name, just like the non-/r variation of the operator.

  • The 'Use of "foo" without parentheses is ambiguous' warning has been extended to apply also to user-defined subroutines with a (;$) prototype, and not just to built-in functions.

  • Warnings that mention the names of lexical (my) variables with Unicode characters in them now respect the presence or absence of the :utf8 layer on the output handle, instead of outputting UTF8 regardless. Also, the correct names are included in the strings passed to $SIG{__WARN__} handlers, rather than the raw UTF8 bytes.

Utility Changes

h2ph

  • h2ph used to generate code of the form

    unless(defined(&FOO)) {
        sub FOO () {42;}
    }

    But the subroutine is a compile-time declaration, and is hence unaffected by the condition. It has now been corrected to emit a string eval around the subroutine [perl #99368].
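
The effect of wrapping the declaration in a string eval can be sketched as follows. This is not h2ph's literal output; the names BAR and BAZ are hypothetical.

```perl
use strict;
use warnings;

# An existing definition that must not be replaced.
sub BAR () { 'original' }

# The string is compiled only if the condition holds at run time, so the
# existing BAR is left alone ...
eval 'sub BAR () { "replacement" }' unless defined &BAR;

# ... while BAZ, which does not exist yet, gets defined.
eval 'sub BAZ () { 42 }' unless defined &BAZ;
die $@ if $@;

print BAR(), ' ', BAZ(), "\n";
```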

splain

  • splain no longer emits backtraces with the first line number repeated.

    This:

    Uncaught exception from user code:
    Cannot fwiddle the fwuddle at -e line 1.
    at -e line 1
    main::baz() called at -e line 1
    main::bar() called at -e line 1
    main::foo() called at -e line 1

    has become this:

    Uncaught exception from user code:
    Cannot fwiddle the fwuddle at -e line 1.
    main::baz() called at -e line 1
    main::bar() called at -e line 1
    main::foo() called at -e line 1

  • Some error messages consist of multiple lines that are listed as separate entries in perldiag. splain has been taught to find the separate entries in these cases, instead of simply failing to find the message.

zipdetails

  • This is a new utility, included as part of an IO::Compress::Base upgrade.

    zipdetails displays information about the internal record structure of the zip file. It is not concerned with displaying any details of the compressed data stored in the zip file.

Configuration and Compilation

  • regexp.h has been modified for compatibility with GCC's -Werror option, as used by some projects that include perl's header files (5.14.1).

  • USE_LOCALE{,_COLLATE,_CTYPE,_NUMERIC} have been added to the output of perl -V, as they affect the behavior of the interpreter binary (albeit in only a small area).

  • The code and tests for IPC::Open2 have been moved from ext/IPC-Open2 into ext/IPC-Open3, as IPC::Open2::open2() is implemented as a thin wrapper around IPC::Open3::_open3() , and hence is very tightly coupled to it.

  • The magic types and magic vtables are now generated from data in a new script regen/mg_vtable.pl, instead of being maintained by hand. As different EBCDIC variants can't agree on the code point for '~', the character to code point conversion is done at build time by generate_uudmap to a new generated header mg_data.h. PL_vtbl_bm and PL_vtbl_fm are now defined by the pre-processor as PL_vtbl_regexp , instead of being distinct C variables. PL_vtbl_sig has been removed.

  • Building with -DPERL_GLOBAL_STRUCT works again. This configuration is not generally used.

  • Perl configured with MAD now correctly frees MADPROP structures when OPs are freed. MADPROPs are now allocated with PerlMemShared_malloc().

  • makedef.pl has been refactored. This should have no noticeable effect on any of the platforms that use it as part of their build (AIX, VMS, Win32).

  • useperlio can no longer be disabled.

  • The file global.sym is no longer needed, and has been removed. It contained a list of all exported functions and was one of the files generated by regen/embed.pl from data in embed.fnc and regen/opcodes. The code has been refactored so that the only user of global.sym, makedef.pl, now reads embed.fnc and regen/opcodes directly, removing the need to store the list of exported functions in an intermediate file.

    As global.sym was never installed, this change should not be visible outside the build process.

  • pod/buildtoc, used by the build process to build perltoc, has been refactored and simplified. It now contains only code to build perltoc; the code to regenerate Makefiles has been moved to Porting/pod_rules.pl. It's a bug if this change has any material effect on the build process.

  • pod/roffitall is now built by pod/buildtoc, instead of being shipped with the distribution. Its list of manpages is now generated (and therefore current). See also RT #103202 for an unresolved related issue.

  • The man page for XS::Typemap is no longer installed. XS::Typemap is a test module which is not installed, hence installing its documentation makes no sense.

  • The -Dusesitecustomize and -Duserelocatableinc options now work together properly.

Platform Support

Platform-Specific Notes

Cygwin

  • Since version 1.7, Cygwin supports native UTF-8 paths. If Perl is built under that environment, directory and filenames will be UTF-8 encoded.

  • Cygwin does not initialize all original Win32 environment variables. See README.cygwin for a discussion of the newly-added Cygwin::sync_winenv() function [perl #110190] and for further links.

HP-UX

  • HP-UX PA-RISC/64 now supports gcc-4.x.

    A fix to correct the socketsize now makes the test suite pass on HP-UX PA-RISC for 64bitall builds. (5.14.2)

VMS

  • Remove unnecessary includes, fix miscellaneous compiler warnings and close some unclosed comments on vms/vms.c.

  • Remove sockadapt layer from the VMS build.

  • Explicit support for VMS versions before v7.0 and DEC C versions before v6.0 has been removed.

  • Since Perl 5.10.1, the home-grown stat wrapper has been unable to distinguish between a directory name containing an underscore and an otherwise-identical filename containing a dot in the same position (e.g., t/test_pl as a directory and t/test.pl as a file). This problem has been corrected.

  • The build on VMS now permits names of the resulting symbols in C code for Perl longer than 31 characters. Symbols like Perl__it_was_the_best_of_times_it_was_the_worst_of_times can now be created freely without causing the VMS linker to seize up.

GNU/Hurd

  • Numerous build and test failures on GNU/Hurd have been resolved with hints for building DBM modules, detection of the library search path, and enabling of large file support.

OpenVOS

  • Perl is now built with dynamic linking on OpenVOS, the minimum supported version of which is now Release 17.1.0.

SunOS

  • The CC workshop C++ compiler is now detected and used on systems that ship without cc.

Internal Changes

  • The compiled representation of formats is now stored via the mg_ptr of their PERL_MAGIC_fm. Previously it was stored in the string buffer, beyond SvLEN(), the regular end of the string. SvCOMPILED() and SvCOMPILED_{on,off}() now exist solely for compatibility with XS code. The first is always 0; the other two are now no-ops. (5.14.1)

  • Some global variables have been marked const , members in the interpreter structure have been re-ordered, and the opcodes have been re-ordered. The op OP_AELEMFAST has been split into OP_AELEMFAST and OP_AELEMFAST_LEX .

  • When emptying a hash of its elements (e.g., via undef(%h), or %h=()), the HvARRAY field is no longer temporarily zeroed. Any destructors called on the freed elements see the remaining elements. Thus, %h=() becomes more like delete $h{$_} for keys %h.

  • Boyer-Moore compiled scalars are now PVMGs, and the Boyer-Moore tables are now stored via the mg_ptr of their PERL_MAGIC_bm . Previously they were PVGVs, with the tables stored in the string buffer, beyond SvLEN() . This eliminates the last place where the core stores data beyond SvLEN() .

  • Simplified logic in Perl_sv_magic() introduces a small change of behavior for error cases involving unknown magic types. Previously, if Perl_sv_magic() was passed a magic type unknown to it, it would

    1. Croak "Modification of a read-only value attempted" if read only

    2. Return without error if the SV happened to already have this magic

    3. Otherwise croak "Don't know how to handle magic of type \\%o"

    Now it will always croak "Don't know how to handle magic of type \\%o", even on read-only values, or SVs which already have the unknown magic type.

  • The experimental fetch_cop_label function has been renamed to cop_fetch_label .

  • The cop_store_label function has been added to the API, but is experimental.

  • embedvar.h has been simplified, and one level of macro indirection for PL_* variables has been removed for the default (non-multiplicity) configuration. PERLVAR*() macros now directly expand their arguments to tokens such as PL_defgv , instead of expanding to PL_Idefgv , with embedvar.h defining a macro to map PL_Idefgv to PL_defgv . XS code which has unwarranted chumminess with the implementation may need updating.

  • An API has been added to explicitly choose whether to export XSUB symbols. More detail can be found in the comments for commit e64345f8.

  • The is_gv_magical_sv function has been eliminated and merged with gv_fetchpvn_flags . It used to be called to determine whether a GV should be autovivified in rvalue context. Now it has been replaced with a new GV_ADDMG flag (not part of the API).

  • The returned code point from the function utf8n_to_uvuni() when the input is malformed UTF-8, malformations are allowed, and utf8 warnings are off is now the Unicode REPLACEMENT CHARACTER whenever the malformation is such that no well-defined code point can be computed. Previously the returned value was essentially garbage. The only malformations that have well-defined values are a zero-length string (0 is the return), and overlong UTF-8 sequences.

  • Padlists are now marked AvREAL ; i.e., reference-counted. They have always been reference-counted, but were not marked real, because pad.c did its own clean-up, instead of using the usual clean-up code in sv.c. That caused problems in thread cloning, so now the AvREAL flag is on, but is turned off in pad.c right before the padlist is freed (after pad.c has done its custom freeing of the pads).

  • All C files that make up the Perl core have been converted to UTF-8.

  • These new functions have been added as part of the work on Unicode symbols:

    1. HvNAMELEN
    2. HvNAMEUTF8
    3. HvENAMELEN
    4. HvENAMEUTF8
    5. gv_init_pv
    6. gv_init_pvn
    7. gv_init_pvsv
    8. gv_fetchmeth_pv
    9. gv_fetchmeth_pvn
    10. gv_fetchmeth_sv
    11. gv_fetchmeth_pv_autoload
    12. gv_fetchmeth_pvn_autoload
    13. gv_fetchmeth_sv_autoload
    14. gv_fetchmethod_pv_flags
    15. gv_fetchmethod_pvn_flags
    16. gv_fetchmethod_sv_flags
    17. gv_autoload_pv
    18. gv_autoload_pvn
    19. gv_autoload_sv
    20. newGVgen_flags
    21. sv_derived_from_pv
    22. sv_derived_from_pvn
    23. sv_derived_from_sv
    24. sv_does_pv
    25. sv_does_pvn
    26. sv_does_sv
    27. whichsig_pv
    28. whichsig_pvn
    29. whichsig_sv
    30. newCONSTSUB_flags

    The gv_fetchmethod_*_flags functions, like gv_fetchmethod_flags, are experimental and may change in a future release.

  • The following functions were added. These are not part of the API:

    1. GvNAMEUTF8
    2. GvENAMELEN
    3. GvENAME_HEK
    4. CopSTASH_flags
    5. CopSTASH_flags_set
    6. PmopSTASH_flags
    7. PmopSTASH_flags_set
    8. sv_sethek
    9. HEKfARG

    There is also a HEKf macro corresponding to SVf , for interpolating HEKs in formatted strings.

  • sv_catpvn_flags takes a couple of new internal-only flags, SV_CATBYTES and SV_CATUTF8 , which tell it whether the char array to be concatenated is UTF8. This allows for more efficient concatenation than creating temporary SVs to pass to sv_catsv .

  • For XS AUTOLOAD subs, $AUTOLOAD is set once more, as it was in 5.6.0. This is in addition to setting SvPVX(cv) , for compatibility with 5.8 to 5.14. See Autoloading with XSUBs in perlguts.

  • Perl now checks whether the array (the linearized isa) returned by a MRO plugin begins with the name of the class itself, for which the array was created, instead of assuming that it does. This prevents the first element from being skipped during method lookup. It also means that mro::get_linear_isa may return an array with one more element than the MRO plugin provided [perl #94306].

  • PL_curstash is now reference-counted.

  • There are now feature bundle hints in PL_hints ($^H ) that version declarations use, to avoid having to load feature.pm. One setting of the hint bits indicates a "custom" feature bundle, which means that the entries in %^H still apply. feature.pm uses that.

    The HINT_FEATURE_MASK macro is defined in perl.h along with other hints. Other macros for setting and testing features and bundles are in the new feature.h. FEATURE_IS_ENABLED (which has moved to feature.h) is no longer used throughout the codebase; more specific macros defined in feature.h, e.g., FEATURE_SAY_IS_ENABLED, are used instead.

  • lib/feature.pm is now a generated file, created by the new regen/feature.pl script, which also generates feature.h.

  • Tied arrays are now always AvREAL . If @_ or DB::args is tied, it is reified first, to make sure this is always the case.

  • Two new functions utf8_to_uvchr_buf() and utf8_to_uvuni_buf() have been added. These are the same as utf8_to_uvchr and utf8_to_uvuni (which are now deprecated), but take an extra parameter that is used to guard against reading beyond the end of the input string. See utf8_to_uvchr_buf in perlapi and utf8_to_uvuni_buf in perlapi.

  • The regular expression engine now does TRIE case insensitive matches under Unicode. This may change the output of use re 'debug'; , and will speed up various things.

  • There is a new wrap_op_checker() function, which provides a thread-safe alternative to writing to PL_check directly.

Selected Bug Fixes

Array and hash

  • A bug has been fixed that would cause a "Use of freed value in iteration" error if the next two hash elements that would be iterated over are deleted [perl #85026]. (5.14.1)

  • Deleting the current hash iterator (the hash element that would be returned by the next call to each) in void context used not to free it [perl #85026].

  • Deletion of methods via delete $Class::{method} syntax used to update method caches if called in void context, but not scalar or list context.

  • When hash elements are deleted in void context, the internal hash entry is now freed before the value is freed, to prevent destructors called by that latter freeing from seeing the hash in an inconsistent state. It was possible to cause double-frees if the destructor freed the hash itself [perl #100340].

  • A keys optimization in Perl 5.12.0 to make it faster on empty hashes caused each not to reset the iterator if called after the last element was deleted.

  • Freeing deeply nested hashes no longer crashes [perl #44225].

  • It is possible from XS code to create hashes with elements that have no values. The hash element and slice operators used to crash when handling these in lvalue context. They now produce a "Modification of non-creatable hash value attempted" error message.

  • If list assignment to a hash or array triggered destructors that freed the hash or array itself, a crash would ensue. This is no longer the case [perl #107440].

  • It used to be possible to free the typeglob of a localized array or hash (e.g., local @{"x"}; delete $::{x} ), resulting in a crash on scope exit.

  • Some core bugs affecting Hash::Util have been fixed: locking a hash element that is a glob copy no longer causes the next assignment to it to corrupt the glob (5.14.2), and unlocking a hash element that holds a copy-on-write scalar no longer causes modifications to that scalar to modify other scalars that were sharing the same string buffer.
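The method-cache fix for delete $Class::{method} described above can be illustrated with a minimal sketch. The class names Animal and Dog are hypothetical and exist only for this example; the sketch assumes a perl with the fix applied (5.16 or later).

```perl
use strict;
use warnings;

package Animal;
sub speak { 'generic noise' }

package Dog;
our @ISA = ('Animal');
sub speak { 'woof' }

package main;

# Call the method once so that the method cache is populated.
my $before = Dog->speak;

# Delete the stash entry in scalar context (the returned glob is kept).
my $deleted_glob = delete $Dog::{speak};

# With the cache invalidated, lookup falls through to the parent class.
my $after = Dog->speak;

print "before: $before, after: $after\n";
```

On an affected perl the second call could still find the stale cached method, because only a void-context delete updated the cache.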

C API fixes

  • The newHVhv XS function now works on tied hashes, instead of crashing or returning an empty hash.

  • The SvIsCOW C macro now returns false for read-only copies of typeglobs, such as those created by:

        $hash{elem} = *foo;
        Hash::Util::lock_value %hash, 'elem';

    It used to return true.

  • The SvPVutf8 C function no longer tries to modify its argument, resulting in errors [perl #108994].

  • SvPVutf8 now works properly with magical variables.

  • SvPVbyte now works properly on non-PVs.

  • When presented with malformed UTF-8 input, the XS-callable functions is_utf8_string(), is_utf8_string_loc(), and is_utf8_string_loclen() could read beyond the end of the input string by up to 12 bytes. This no longer happens [perl #32080]. However, is_utf8_char() currently still has this defect; see is_utf8_char() above.

  • The C-level pregcomp function could become confused about whether the pattern was in UTF8 if the pattern was an overloaded, tied, or otherwise magical scalar [perl #101940].

Compile-time hints

  • Tying %^H no longer causes perl to crash or ignore the contents of %^H when entering a compilation scope [perl #106282].

  • eval $string and require used not to localize %^H during compilation if it was empty at the time the eval call itself was compiled. This could lead to scary side effects, like use re "/m" enabling other flags that the surrounding code was trying to enable for its caller [perl #68750].

  • eval $string and require no longer localize hints ($^H and %^H ) at run time, but only during compilation of the $string or required file. This makes BEGIN { $^H{foo}=7 } equivalent to BEGIN { eval '$^H{foo}=7' } [perl #70151].

  • Creating a BEGIN block from XS code (via newXS or newATTRSUB ) would, on completion, make the hints of the current compiling code the current hints. This could cause warnings to occur in a non-warning scope.
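The equivalence between BEGIN { $^H{foo}=7 } and BEGIN { eval '$^H{foo}=7' } mentioned above can be checked with a short sketch. The hint keys demo_direct and demo_via_eval are hypothetical names chosen for this example, and the sketch assumes a perl with the fix applied.

```perl
use strict;
use warnings;

our ($direct, $via_eval);

# Set a hint key directly, then read it back in a later BEGIN block.
BEGIN { $^H{demo_direct} = 7 }
BEGIN { $direct = $^H{demo_direct} }

# Set another key from inside a string eval that runs at BEGIN time.
# Because hints are localized only while the string is compiled, the
# assignment made when the eval runs is still visible afterwards.
BEGIN { eval q{ $^H{demo_via_eval} = 7; 1 } or die $@ }
BEGIN { $via_eval = $^H{demo_via_eval} }

print "direct=", $direct // 'undef', " via_eval=", $via_eval // 'undef', "\n";
```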

Copy-on-write scalars

Copy-on-write or shared hash key scalars were introduced in 5.8.0, but most Perl code did not encounter them (they were used mostly internally). Perl 5.10.0 extended them, such that assigning __PACKAGE__ or a hash key to a scalar would make it copy-on-write. Several parts of Perl were not updated to account for them, but have now been fixed.

  • utf8::decode had a nasty bug that would modify copy-on-write scalars' string buffers in place (i.e., skipping the copy). This could result in hashes having two elements with the same key [perl #91834]. (5.14.2)

  • Lvalue subroutines were not allowing COW scalars to be returned. This was fixed for lvalue scalar context in Perl 5.12.3 and 5.14.0, but list context was not fixed until this release.

  • Elements of restricted hashes (see the fields pragma) containing copy-on-write values couldn't be deleted, nor could such hashes be cleared (%hash = () ). (5.14.2)

  • Localizing a tied variable used to make it read-only if it contained a copy-on-write string. (5.14.2)

  • Assigning a copy-on-write string to a stash element no longer causes a double free. Regardless of this change, the results of such assignments are still undefined.

  • Assigning a copy-on-write string to a tied variable no longer stops that variable from being tied if it happens to be a PVMG or PVLV internally.

  • Doing a substitution on a tied variable returning a copy-on-write scalar used to cause an assertion failure or an "Attempt to free nonexistent shared string" warning.

  • This one is a regression from 5.12: In 5.14.0, the bitwise assignment operators |= , ^= and &= started leaving the left-hand side undefined if it happened to be a copy-on-write string [perl #108480].

  • Storable, Devel::Peek and PerlIO::scalar had similar problems. See Updated Modules and Pragmata, above.

The debugger

  • dumpvar.pl, and therefore the x command in the debugger, have been fixed to handle objects blessed into classes whose names contain "=". The contents of such objects used not to be dumped [perl #101814].

  • The "R" command for restarting a debugger session has been fixed to work on Windows, or any other system lacking a POSIX::_SC_OPEN_MAX constant [perl #87740].

  • The #line 42 foo directive used not to update the arrays of lines used by the debugger if it occurred in a string eval. This was partially fixed in 5.14, but it worked only for a single #line 42 foo in each eval. Now it works for multiple.

  • When subroutine calls are intercepted by the debugger, the name of the subroutine or a reference to it is stored in $DB::sub , for the debugger to access. Sometimes (such as $foo = *bar; undef *bar; &$foo ) $DB::sub would be set to a name that could not be used to find the subroutine, and so the debugger's attempt to call it would fail. Now the check to see whether a reference is needed is more robust, so those problems should not happen anymore [rt.cpan.org #69862].

  • Every subroutine has a filename associated with it that the debugger uses. The one associated with constant subroutines used to be misallocated when cloned under threads. Consequently, debugging threaded applications could result in memory corruption [perl #96126].

Dereferencing operators

  • defined(${"..."}), defined(*{"..."}), etc., used to return true for most, but not all built-in variables, if they had not been used yet. This bug affected ${^GLOBAL_PHASE} and ${^UTF8CACHE} , among others. It also used to return false if the package name was given as well (${"::!"} ) [perl #97978, #97492].

  • Perl 5.10.0 introduced a similar bug: defined(*{"foo"}) where "foo" represents the name of a built-in global variable used to return false if the variable had never been used before, but only on the first call. This, too, has been fixed.

  • Since 5.6.0, *{ ... } has been inconsistent in how it treats undefined values. It would die in strict mode or lvalue context for most undefined values, but would treat the specific scalar returned by undef() (&PL_sv_undef internally) as the empty string (with a warning). This has been corrected. undef() is now treated like other undefined scalars, as in Perl 5.005.

Filehandle, last-accessed

Perl has an internal variable that stores the last filehandle to be accessed. It is used by $. and by tell and eof without arguments.
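As a minimal sketch of what "last-accessed filehandle" means in practice, the following reads from two in-memory filehandles and shows that $. and argument-less tell and eof follow whichever handle was read most recently. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Two in-memory filehandles with different contents.
my $two_lines = "alpha\nbeta\n";
my $one_line  = "gamma\n";
open my $first,  '<', \$two_lines or die "open failed: $!";
open my $second, '<', \$one_line  or die "open failed: $!";

my $line;
$line = <$first>;
$line = <$first>;
my $after_first = $.;      # line number for $first, the last handle read

$line = <$second>;
my $after_second = $.;     # now refers to $second instead

# Without arguments, tell and eof also use the last-accessed handle.
my $position = tell;
my $at_eof   = eof;

print "after_first=$after_first after_second=$after_second position=$position\n";
```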

  • It used to be possible to set this internal variable to a glob copy and then modify that glob copy to be something other than a glob, and still have the last-accessed filehandle associated with the variable after assigning a glob to it again:

        my $foo = *STDOUT;  # $foo is a glob copy
        <$foo>;             # $foo is now the last-accessed handle
        $foo = 3;           # no longer a glob
        $foo = *STDERR;     # still the last-accessed handle

    Now the $foo = 3 assignment unsets that internal variable, so there is no last-accessed filehandle, just as if <$foo> had never happened.

    This also prevents some unrelated handle from becoming the last-accessed handle if $foo falls out of scope and the same internal SV gets used for another handle [perl #97988].

  • A regression in 5.14 caused these statements not to set that internal variable:

        my $fh = *STDOUT;
        tell $fh;
        eof  $fh;
        seek $fh, 0,0;
        tell     *$fh;
        eof      *$fh;
        seek     *$fh, 0,0;
        readline *$fh;

    This is now fixed, but tell *{ *$fh } still has the problem, and it is not clear how to fix it [perl #106536].

Filetests and stat

The term "filetests" refers to the operators that consist of a hyphen followed by a single letter: -r, -x, -M, etc. The term "stacked" when applied to filetests means followed by another filetest operator sharing the same operand, as in -r -x -w $foo.
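To make the "stacked" terminology concrete, here is a small sketch that compares a stacked filetest with its long-hand equivalent on a temporary file. It uses File::Temp to create the file; the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small temporary file to test against.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "some text\n";
close $fh or die "close failed: $!";

# Stacked form: every operator applies to the same operand, $path.
my $stacked = (-e -f -r $path) ? 1 : 0;

# Long-hand form, evaluated right to left like the stacked one.
my $longhand = (-r $path && -f $path && -e $path) ? 1 : 0;

print "stacked=$stacked longhand=$longhand\n";
```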

  • stat produces more consistent warnings. It no longer warns for "_" [perl #71002] and no longer skips the warning at times for other unopened handles. It no longer warns about an unopened handle when the operating system's fstat function fails.

  • stat would sometimes return negative numbers for large inode numbers, because it was using the wrong internal C type. [perl #84590]

  • lstat is documented to fall back to stat (with a warning) when given a filehandle. When passed an IO reference, it was actually doing the equivalent of stat _ and ignoring the handle.

  • -T _ with no preceding stat used to produce a confusing "uninitialized" warning, even though there is no visible uninitialized value to speak of.

  • -T , -B , -l and -t now work when stacked with other filetest operators [perl #77388].

  • In 5.14.0, filetest ops (-r , -x , etc.) started calling FETCH on a tied argument belonging to the previous argument to a list operator, if called with a bareword argument or no argument at all. This has been fixed, so push @foo, $tied, -r no longer calls FETCH on $tied .

  • In Perl 5.6, -l followed by anything other than a bareword would treat its argument as a file name. That was changed in 5.8 for glob references (\*foo ), but not for globs themselves (*foo ). -l started returning undef for glob references without setting the last stat buffer that the "_" handle uses, but only if warnings were turned on. With warnings off, it was the same as 5.6. In other words, it was simply buggy and inconsistent. Now the 5.6 behavior has been restored.

  • -l followed by a bareword no longer "eats" the previous argument to the list operator in whose argument list it resides. Hence, print "bar", -l foo now actually prints "bar", because -l no longer eats it.

  • Perl keeps several internal variables to keep track of the last stat buffer, from which file(handle) it originated, what type it was, and whether the last stat succeeded.

    There were various cases where these could get out of synch, resulting in inconsistent or erratic behavior in edge cases (every mention of -T applies to -B as well):

    • -T HANDLE, even though it does a stat, was not resetting the last stat type, so an lstat _ following it would merrily return the wrong results. Also, it was not setting the success status.

    • Freeing the handle last used by stat or a filetest could result in -T _ using an unrelated handle.

    • stat with an IO reference would not reset the stat type or record the filehandle for -T _ to use.

    • Fatal warnings could cause the stat buffer not to be reset for a filetest operator on an unopened filehandle or -l on any handle. Fatal warnings also stopped -T from setting $! .

    • When the last stat was on an unreadable file, -T _ is supposed to return undef, leaving the last stat buffer unchanged. But it was setting the stat type, causing lstat _ to stop working.

    • -T FILENAME was not resetting the internal stat buffers for unreadable files.

    These have all been fixed.

Formats

  • Several edge cases have been fixed with formats and formline; in particular, where the format itself is potentially variable (such as with ties and overloading), and where the format and data differ in their encoding. In both these cases, it used to be possible for the output to be corrupted [perl #91032].

  • formline no longer converts its argument into a string in-place. So passing a reference to formline no longer destroys the reference [perl #79532].

  • Assignment to $^A (the format output accumulator) now recalculates the number of lines output.
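The formline change for references noted above can be sketched as follows: a reference passed as the picture is stringified for output, but the variable itself stays a reference. This is an illustrative example under the assumption of a perl with the fix applied.

```perl
use strict;
use warnings;

my $data = [ 1, 2, 3 ];

$^A = '';                 # clear the format output accumulator
formline($data);          # the reference is stringified for use as the picture
my $picture_text = $^A;   # something like "ARRAY(0x...)"
$^A = '';

# $data is still an array reference and its contents are intact.
print "ref type: ", ref($data), ", last element: $data->[2]\n";
```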

given and when

  • given was not scoping its implicit $_ properly, resulting in memory leaks or "Variable is not available" warnings [perl #94682].

  • given was not calling set-magic on the implicit lexical $_ that it uses. This meant, for example, that pos would be remembered from one execution of the same given block to the next, even if the input were a different variable [perl #84526].

  • when blocks are now capable of returning variables declared inside the enclosing given block [perl #93548].

The glob operator

  • On OSes other than VMS, Perl's glob operator (and the <...> form) use File::Glob underneath. File::Glob splits the pattern into words, before feeding each word to its bsd_glob function.

    There were several inconsistencies in the way the split was done. Now quotation marks (' and ") are always treated as shell-style word delimiters (that allow whitespace as part of a word) and backslashes are always preserved, unless they exist to escape quotation marks. Before, those would only sometimes be the case, depending on whether the pattern contained whitespace. Also, escaped whitespace at the end of the pattern is no longer stripped [perl #40470].

  • CORE::glob now works as a way to call the default globbing function. It used to respect overrides, despite the CORE:: prefix.

  • Under miniperl (used to configure modules when perl itself is built), glob now clears %ENV before calling csh, since the latter croaks on some systems if it does not like the contents of the LS_COLORS environment variable [perl #98662].
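The word-splitting rule for quotation marks in glob patterns can be shown with a short sketch. It assumes a Unix-like system, a perl with the fix applied, and a temporary directory whose path contains no whitespace; the file name is made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create a file whose name contains a space.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/two words.txt";
open my $fh, '>', $path or die "cannot create $path: $!";
close $fh or die "close failed: $!";

# Unquoted: the pattern is split on whitespace into separate words.
my @unquoted = glob("$dir/two words.txt");

# Quoted: the double quotes keep the whole path together as one word.
my @quoted = glob(qq{"$dir/two words.txt"});

print scalar(@unquoted), " unquoted result(s), ",
      scalar(@quoted), " quoted result(s)\n";
```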

Lvalue subroutines

  • Explicit return now returns the actual argument passed to return, instead of copying it [perl #72724, #72706].

  • Lvalue subroutines used to enforce lvalue syntax (i.e., whatever can go on the left-hand side of = ) for the last statement and the arguments to return. Since lvalue subroutines are not always called in lvalue context, this restriction has been lifted.

  • Lvalue subroutines are less restrictive about what values can be returned. It used to croak on values returned by shift and delete and from other subroutines, but no longer does so [perl #71172].

  • Empty lvalue subroutines (sub :lvalue {} ) used to return @_ in list context. All subroutines used to do this, but regular subs were fixed in Perl 5.8.2. Now lvalue subroutines have been likewise fixed.

  • Autovivification now works on values returned from lvalue subroutines [perl #7946], as does returning keys in lvalue context.

  • Lvalue subroutines used to copy their return values in rvalue context. Not only was this a waste of CPU cycles, but it also caused bugs. A ($) prototype would cause an lvalue sub to copy its return value [perl #51408], and while(lvalue_sub() =~ m/.../g) { ... } would loop endlessly [perl #78680].

  • When called in potential lvalue context (e.g., subroutine arguments or a list passed to for ), lvalue subroutines used to copy any read-only value that was returned. E.g., sub :lvalue { $] } would not return $] , but a copy of it.

  • When called in potential lvalue context, an lvalue subroutine returning arrays or hashes used to bind the arrays or hashes to scalar variables, resulting in bugs. This was fixed in 5.14.0 if an array were the first thing returned from the subroutine (but not for $scalar, @array or hashes being returned). Now a more general fix has been applied [perl #23790].

  • Method calls whose arguments were all surrounded with my() or our() (as in $object->method(my($a,$b)) ) used to force lvalue context on the subroutine. This would prevent lvalue methods from returning certain values.

  • Lvalue sub calls that are not determined to be such at compile time (&$name or &{"name"}) are no longer exempt from strict refs if they occur in the last statement of an lvalue subroutine [perl #102486].

  • Sub calls whose subs are not visible at compile time, if they occurred in the last statement of an lvalue subroutine, would reject non-lvalue subroutines and die with "Can't modify non-lvalue subroutine call" [perl #102486].

    Non-lvalue sub calls whose subs are visible at compile time exhibited the opposite bug. If the call occurred in the last statement of an lvalue subroutine, there would be no error when the lvalue sub was called in lvalue context. Perl would blindly assign to the temporary value returned by the non-lvalue subroutine.

  • AUTOLOAD routines used to take precedence over the actual sub being called (i.e., when autoloading wasn't needed), for sub calls in lvalue or potential lvalue context, if the subroutine was not visible at compile time.

  • Applying the :lvalue attribute to an XSUB or to an aliased subroutine stub with sub foo :lvalue; syntax stopped working in Perl 5.12. This has been fixed.

  • Applying the :lvalue attribute to a subroutine that is already defined does not work properly, as the attribute changes the way the sub is compiled. Hence, Perl 5.12 began warning when an attempt is made to apply the attribute to an already defined sub. In such cases, the attribute is discarded.

    But the change in 5.12 missed the case where custom attributes are also present: that case still silently and ineffectively applied the attribute. That omission has now been corrected. sub foo :lvalue :Whatever (when foo is already defined) now warns about the :lvalue attribute, and does not apply it.

  • A bug affecting lvalue context propagation through nested lvalue subroutine calls has been fixed. Previously, returning a value in nested rvalue context would be treated as lvalue context by the inner subroutine call, resulting in some values (such as read-only values) being rejected.
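Two of the lvalue subroutine behaviors described above, assignment through the sub and the m//g loop in rvalue context, can be sketched together. The accessor name text is hypothetical, and the loop guard is only a safety net for perls without the fix.

```perl
use strict;
use warnings;

my $text = 'a1b2c3';

# A minimal lvalue accessor for the lexical above.
sub text :lvalue { $text }

# Lvalue context: the assignment goes through to $text.
text() = 'x7y8z9';

# Rvalue context: the sub returns $text itself rather than a copy, so
# pos() advances between iterations and the loop terminates.
my @digits;
my $guard = 0;
while (text() =~ /(\d)/g) {
    push @digits, $1;
    last if ++$guard > 10;   # safety net for perls older than 5.16
}

print "text=$text digits=@digits\n";
```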

Overloading

  • Arithmetic assignment ($left += $right) involving overloaded objects that rely on the 'nomethod' override no longer segfaults when the left operand is not overloaded.

  • Errors that occur when methods cannot be found during overloading now mention the correct package name, as they did in 5.8.x, instead of erroneously mentioning the "overload" package, as they have since 5.10.0.

  • Undefining %overload:: no longer causes a crash.

Prototypes of built-in keywords

  • The prototype function no longer dies for the __FILE__ , __LINE__ and __PACKAGE__ directives. It now returns an empty-string prototype for them, because they are syntactically indistinguishable from nullary functions like time.

  • prototype now returns undef for all overridable infix operators, such as eq , which are not callable in any way resembling functions. It used to return incorrect prototypes for some and die for others [perl #94984].

  • The prototypes of several built-in functions--getprotobynumber, lock, not and select--have been corrected, or at least are now closer to reality than before.
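The prototype results described above can be inspected directly. This is a small sketch for a perl with these changes applied; it only queries prototypes and does not call the built-ins.

```perl
use strict;
use warnings;

# __FILE__ takes no arguments, just like the nullary built-in time.
my $file_proto = prototype('CORE::__FILE__');
my $time_proto = prototype('CORE::time');

# eq is an infix operator, so it has no function-style prototype.
my $eq_proto = prototype('CORE::eq');

for my $pair ([ '__FILE__', $file_proto ], [ 'time', $time_proto ], [ 'eq', $eq_proto ]) {
    my ($name, $proto) = @$pair;
    printf "%-8s => %s\n", $name, defined $proto ? "'$proto'" : 'undef';
}
```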

Regular expressions

  • /[[:ascii:]]/ and /[[:blank:]]/ now use locale rules under use locale when the platform supports that. Previously, they used the platform's native character set.

  • m/[[:ascii:]]/i and /\p{ASCII}/i now match identically (when not under a differing locale). This fixes a regression introduced in 5.14 in which the first expression could match characters outside of ASCII, such as the KELVIN SIGN.

  • /.*/g would sometimes refuse to match at the end of a string that ends with "\n". This has been fixed [perl #109206].

  • Starting with 5.12.0, Perl used to get its internal bookkeeping muddled up after assigning ${ qr// } to a hash element and locking it with Hash::Util. This could result in double frees, crashes, or erratic behavior.

  • The new (in 5.14.0) regular expression modifier /a, when repeated as /aa, forbids characters outside the ASCII range that match characters inside that range from matching under /i. This did not work under some circumstances, all involving alternation. For example,

        "\N{KELVIN SIGN}" =~ /k|foo/iaa;

    succeeded inappropriately. This is now fixed.

  • 5.14.0 introduced some memory leaks in regular expression character classes such as [\w\s] , which have now been fixed. (5.14.1)

  • An edge case in regular expression matching could potentially loop. This happened only under /i in bracketed character classes that have characters with multi-character folds, and the target string to match against includes the first portion of the fold, followed by another character that has a multi-character fold that begins with the remaining portion of the fold, plus some more.

        "s\N{U+DF}" =~ /[\x{DF}foo]/i

    is one such case. \xDF folds to "ss". (5.14.1)

  • A few characters in regular expression pattern matches did not match correctly in some circumstances, all involving /i. The affected characters are: COMBINING GREEK YPOGEGRAMMENI, GREEK CAPITAL LETTER IOTA, GREEK CAPITAL LETTER UPSILON, GREEK PROSGEGRAMMENI, GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA, GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS, GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA, GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS, LATIN SMALL LETTER LONG S, LATIN SMALL LIGATURE LONG S T, and LATIN SMALL LIGATURE ST.

  • A memory leak regression in regular expression compilation under threading has been fixed.

  • A regression introduced in 5.14.0 has been fixed. This involved an inverted bracketed character class in a regular expression that consisted solely of a Unicode property. That property wasn't getting inverted outside the Latin1 range.

  • Three problematic Unicode characters now work better in regex pattern matching under /i.

    In the past, three Unicode characters: LATIN SMALL LETTER SHARP S, GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS, and GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS, along with the sequences that they fold to (including "ss" for LATIN SMALL LETTER SHARP S), did not properly match under /i. 5.14.0 fixed some of these cases, but introduced others, including a panic when one of the characters or sequences was used in the (?(DEFINE) regular expression predicate. The known bugs that were introduced in 5.14 have now been fixed, as have some other edge cases that never worked until now. These all involve using the characters and sequences outside bracketed character classes under /i. This closes [perl #98546].

    There remain known problems when using certain characters with multi-character folds inside bracketed character classes, including such constructs as qr/[\N{LATIN SMALL LETTER SHARP S}a-z]/i. These remaining bugs are addressed in [perl #89774].

  • RT #78266: The regex engine has been leaking memory when accessing named captures that weren't matched as part of a regex ever since 5.10 when they were introduced; e.g., this would consume over a hundred MB of memory:

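        for (1..10_000_000) {
            # The optional group matches via "foo", so the named
            # capture "capture" never participates in the match.
            if ("foo" =~ /(foo|(?<capture>bar))?/) {
                my $capture = $+{capture};
            }
        }
        # Report the resident set size of this process.
        system "ps -o rss $$";
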
  • In 5.14, /[[:lower:]]/i and /[[:upper:]]/i no longer matched the opposite case. This has been fixed [perl #101970].

  • A regular expression match with an overloaded object on the right-hand side would sometimes stringify the object too many times.

  • A regression has been fixed that was introduced in 5.14, in /i regular expression matching, in which a match improperly fails if the pattern is in UTF-8, the target string is not, and a Latin-1 character precedes a character in the string that should match the pattern. [perl #101710]

  • In case-insensitive regular expression pattern matching on UTF-8 encoded strings, the scan for the start of a match no longer looks only at the first possible position. This used to cause matches such as "f\x{FB00}" =~ /ff/i to fail.

  • The regexp optimizer no longer crashes on debugging builds when merging fixed-string nodes with inconvenient contents.

  • A panic involving the combination of the regular expression modifiers /aa and the \b escape sequence introduced in 5.14.0 has been fixed [perl #95964]. (5.14.2)

  • The combination of the regular expression modifiers /aa and the \b and \B escape sequences did not work properly on UTF-8 encoded strings. All non-ASCII characters under /aa should be treated as non-word characters, but what was happening was that Unicode rules were used to determine wordness/non-wordness for non-ASCII characters. This is now fixed [perl #95968].

  • (?foo: ...) no longer loses the passed-in character set.

  • The trie optimization used to have problems with alternations containing an empty (?:), causing "x" =~ /\A(?>(?:(?:)A|B|C?x))\z/ not to match, whereas it should [perl #111842].

  • Use of lexical (my) variables in code blocks embedded in regular expressions will no longer result in memory corruption or crashes.

    Nevertheless, these code blocks are still experimental, as there are still problems with the wrong variables being closed over (in loops for instance) and with abnormal exiting (e.g., die) causing memory corruption.

  • The \h , \H , \v and \V regular expression metacharacters used to cause a panic error message when trying to match at the end of the string [perl #96354].

  • The abbreviations for four C1 control characters, MW, PM, RI, and ST, were previously unrecognized by \N{}, vianame(), and string_vianame().

  • Mentioning a variable named "&" other than $& (i.e., @& or %& ) no longer stops $& from working. The same applies to variables named "'" and "`" [perl #24237].

  • Creating a UNIVERSAL::AUTOLOAD sub no longer stops %+ , %- and %! from working some of the time [perl #105024].
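
Several of the regular expression fixes listed above can be checked directly. The following is a minimal sketch, assuming a perl of 5.16 or later; the variable names are illustrative only and do not come from the perl distribution.

```perl
use strict;
use warnings;

# POSIX classes under /i match the opposite case again [perl #101970].
my $posix_ok = ("A" =~ /[[:lower:]]/i && "a" =~ /[[:upper:]]/i) ? 1 : 0;

# U+FB00 (LATIN SMALL LIGATURE FF) folds to "ff", so a match must be
# found even though it does not start at the first possible position.
my $fold_ok = ("f\x{FB00}" =~ /ff/i) ? 1 : 0;

# Under /aa only ASCII characters count as word characters, so there is
# a word boundary between "a" and U+00E9, even in a UTF-8 encoded string.
my $str = "a\x{E9}";
utf8::upgrade($str);
my $aa_ok = ($str =~ /a\b/aa && $str !~ /\w\w/aa) ? 1 : 0;

# The trie optimization copes with an empty (?:) in an alternation
# [perl #111842].
my $trie_ok = ("x" =~ /\A(?>(?:(?:)A|B|C?x))\z/) ? 1 : 0;

printf "posix=%d fold=%d aa=%d trie=%d\n",
    $posix_ok, $fold_ok, $aa_ok, $trie_ok;
```

On a perl that still has any of these bugs, the corresponding flag would be 0.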

Smartmatching

  • ~~ now correctly handles the precedence of Any~~Object, and is not tricked by an overloaded object on the left-hand side.

  • In Perl 5.14.0, $tainted ~~ @array stopped working properly. Sometimes it would erroneously fail (when $tainted contained a string that occurs in the array after the first element) or erroneously succeed (when undef occurred after the first element) [perl #93590].

The sort operator

  • sort was not treating sub {} and sub {()} as equivalent when such a sub was provided as the comparison routine. It used to croak on sub {()} .

  • sort now works once more with custom sort routines that are XSUBs. It stopped working in 5.10.0.

  • sort with a constant for a custom sort routine, although it produces unsorted results, no longer crashes. It started crashing in 5.10.0.

  • Warnings emitted by sort when a custom comparison routine returns a non-numeric value now contain "in sort" and show the line number of the sort operator, rather than the last line of the comparison routine. The warnings also now occur only if warnings are enabled in the scope where sort occurs. Previously the warnings would occur if enabled in the comparison routine's scope.

  • sort { $a <=> $b } , which is optimized internally, now produces "uninitialized" warnings for NaNs (not-a-number values), since <=> returns undef for those. This brings it in line with sort { 1; $a <=> $b } and other more complex cases, which are not optimized [perl #94390].

The substr operator

  • Tied (and otherwise magical) variables are no longer exempt from the "Attempt to use reference as lvalue in substr" warning.

  • That warning now occurs when the returned lvalue is assigned to, not when substr itself is called. This makes a difference only if the return value of substr is referenced and later assigned to.

  • Passing a substring of a read-only value or a typeglob to a function (potential lvalue context) no longer causes an immediate "Can't coerce" or "Modification of a read-only value" error. That error occurs only if the passed value is assigned to.

    The same thing happens with the "substr outside of string" error. If the lvalue is only read from, not written to, it is now just a warning, as with rvalue substr.

  • substr assignments no longer call FETCH twice if the first argument is a tied variable, just once.
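
One way to observe how often FETCH is called during a substr assignment is to tie a scalar to a class that counts its calls. This is a minimal sketch; CountingScalar is a hypothetical class written for this illustration, and the exact counts printed may depend on the perl version in use.

```perl
use strict;
use warnings;

# Hypothetical tie class that counts FETCH and STORE calls.
package CountingScalar;

sub TIESCALAR {
    my ($class, $value) = @_;
    return bless { value => $value, fetch => 0, store => 0 }, $class;
}

sub FETCH {
    my $self = shift;
    $self->{fetch}++;
    return $self->{value};
}

sub STORE {
    my ($self, $value) = @_;
    $self->{store}++;
    $self->{value} = $value;
}

package main;

my $obj = tie my $tied, 'CountingScalar', "hello";

# Lvalue substr on a tied scalar: the value is fetched, modified,
# and stored back through the tie interface.
substr($tied, 0, 1) = "J";

# Read the stored value from the object itself, so that inspecting
# the result does not add another FETCH call.
my $result = $obj->{value};
print "value=$result fetch=$obj->{fetch} store=$obj->{store}\n";
```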

Support for embedded nulls

Some parts of Perl did not work correctly with nulls (chr 0 ) embedded in strings. That meant that, for instance, $m = "a\0b"; foo->$m would call the "a" method, instead of the actual method name contained in $m. These parts of perl have been fixed to support nulls:

  • Method names

  • Typeglob names (including filehandle and subroutine names)

  • Package names, including the return value of ref()

  • Typeglob elements (*foo{"THING\0stuff"} )

  • Signal names

  • Various warnings and error messages that mention variable names or values, methods, etc.

One side effect of these changes is that blessing into "\0" no longer causes ref() to return false.
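
The method-name case described above can be sketched as follows, assuming perl 5.16 or later. The package Foo and its methods are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

package Foo;

sub a { return "plain a" }

package main;

my $m = "a\0b";
{
    no strict 'refs';
    # Install a method whose name contains an embedded null.
    *{"Foo::$m"} = sub { return "a-null-b" };
}

# Dispatch uses the full name held in $m, not the part before the null.
my $called = Foo->$m;
my $plain  = Foo->a;

print "called=$called plain=$plain\n";
```

Before these fixes, the first call would have reached Foo::a instead.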

Threading bugs

  • Typeglobs returned from threads are no longer cloned if the parent thread already has a glob with the same name. This means that returned subroutines will now assign to the right package variables [perl #107366].

  • Some cases of threads crashing due to memory allocation during cloning have been fixed [perl #90006].

  • Thread joining would sometimes emit "Attempt to free unreferenced scalar" warnings if caller had been used from the DB package before thread creation [perl #98092].

  • Locking a subroutine (via lock &sub ) is no longer a compile-time error for regular subs. For lvalue subroutines, it no longer tries to return the sub as a scalar, resulting in strange side effects like ref \$_ returning "CODE" in some instances.

    lock &sub is now a run-time error if threads::shared is loaded (a no-op otherwise), but that may be rectified in a future version.

Tied variables

  • Various cases in which FETCH was being ignored or called too many times have been fixed:

    • PerlIO::get_layers [perl #97956]

    • $tied =~ y/a/b/ , chop $tied and chomp $tied when $tied holds a reference.

    • When calling local $_ [perl #105912]

    • Four-argument select

    • A tied buffer passed to sysread

    • $tied .= <>

    • Three-argument open, the third being a tied file handle (as in open $fh, ">&", $tied )

    • sort with a reference to a tied glob for the comparison routine.

    • .. and ... in list context [perl #53554].

    • ${$tied} , @{$tied} , %{$tied} and *{$tied} where the tied variable returns a string (&{} was unaffected)

    • defined ${ $tied_variable }

    • Various functions that take a filehandle argument in rvalue context (close, readline, etc.) [perl #97482]

    • Some cases of dereferencing a complex expression, such as ${ (), $tied } = 1 , used to call FETCH multiple times, but now call it once.

    • $tied->method where $tied returns a package name--even resulting in a failure to call the method, due to memory corruption

    • Assignments like *$tied = \&{"..."} and *glob = $tied

    • chdir, chmod, chown, utime, truncate, stat, lstat and the filetest ops (-r , -x , etc.)

  • caller sets @DB::args to the subroutine arguments when called from the DB package. It used to crash when doing so if @DB::args happened to be tied. Now it croaks instead.

  • Tying an element of %ENV or %^H and then deleting that element would result in a call to the tie object's DELETE method, even though tying the element itself is supposed to be equivalent to tying a scalar (the element is, of course, a scalar) [perl #67490].

  • When Perl autovivifies an element of a tied array or hash (which entails calling STORE with a new reference), it now calls FETCH immediately after the STORE, instead of assuming that FETCH would have returned the same reference. This can make it easier to implement tied objects [perl #35865, #43011].

  • Four-argument select no longer produces its "Non-string passed as bitmask" warning on tied or tainted variables that are strings.

  • Localizing a tied scalar that returns a typeglob no longer stops it from being tied till the end of the scope.

  • Attempting to goto out of a tied handle method used to cause memory corruption or crashes. Now it produces an error message instead [perl #8611].

  • A bug has been fixed that occurs when a tied variable is used as a subroutine reference: if the last thing assigned to or returned from the variable was a reference or typeglob, the \&$tied could either crash or return the wrong subroutine. The reference case is a regression introduced in Perl 5.10.0. For typeglobs, it has probably never worked till now.

Version objects and vstrings

  • The bitwise complement operator (and possibly other operators, too) when passed a vstring would leave vstring magic attached to the return value, even though the string had changed. This meant that version->new(~v1.2.3) would create a version looking like "v1.2.3" even though the string passed to version->new was actually "\376\375\374". This also caused B::Deparse to deparse ~v1.2.3 incorrectly, without the ~ [perl #29070].

  • Assigning a vstring to a magic (e.g., tied, $! ) variable and then assigning something else used to blow away all magic. This meant that tied variables would come undone, $! would stop getting updated on failed system calls, $| would stop setting autoflush, and other mischief would take place. This has been fixed.

  • version->new("version") and printf "%vd", "version" no longer crash [perl #102586].

  • Version comparisons, such as those that happen implicitly with use v5.43 , no longer cause locale settings to change [perl #105784].

  • Version objects no longer cause memory leaks in boolean context [perl #109762].

Warnings, redefinition

  • Subroutines from the autouse namespace are once more exempt from redefinition warnings. This used to work in 5.005, but was broken in 5.6 for most subroutines. For subs created via XS that redefine subroutines from the autouse package, this stopped working in 5.10.

  • New XSUBs now produce redefinition warnings if they overwrite existing subs, as they did in 5.8.x. (The autouse logic was reversed in 5.10-14. Only subroutines from the autouse namespace would warn when clobbered.)

  • newCONSTSUB used to use compile-time warning hints, instead of run-time hints. The following code should never produce a redefinition warning, but it used to, if newCONSTSUB redefined an existing subroutine:

    use warnings;
    BEGIN {
        no warnings;
        some_XS_function_that_calls_new_CONSTSUB();
    }

  • Redefinition warnings for constant subroutines are on by default (what are known as severe warnings in perldiag). This occurred only when it was a glob assignment or declaration of a Perl subroutine that caused the warning. If the creation of XSUBs triggered the warning, it was not a default warning. This has been corrected.

  • The internal check to see whether a redefinition warning should occur used to emit "uninitialized" warnings in cases like this:

    use warnings "uninitialized";
    use constant {u => undef, v => undef};
    sub foo(){u}
    sub foo(){v}

Warnings, "Uninitialized"

  • Various functions that take a filehandle argument in rvalue context (close, readline, etc.) used to warn twice for an undefined handle [perl #97482].

  • dbmopen now only warns once, rather than three times, if the mode argument is undef [perl #90064].

  • The += operator does not usually warn when the left-hand side is undef, but it was doing so for tied variables. This has been fixed [perl #44895].

  • A bug fix in Perl 5.14 introduced a new bug, causing "uninitialized" warnings to report the wrong variable if the operator in question had two operands and one was %{...} or @{...} . This has been fixed [perl #103766].

  • .. and ... in list context now mention the name of the variable in "uninitialized" warnings for string (as opposed to numeric) ranges.

Weak references

  • Weakening the first argument to an automatically-invoked DESTROY method could result in erroneous "DESTROY created new reference" errors or crashes. Now it is an error to weaken a read-only reference.

  • Weak references to lexical hashes going out of scope were not going stale (becoming undefined), but continued to point to the hash.

  • Weak references to lexical variables going out of scope are now broken before any magical methods (e.g., DESTROY on a tie object) are called. This prevents such methods from modifying the variable that will be seen the next time the scope is entered.

  • Creating a weak reference to an @ISA array or accessing the array index ($#ISA ) could result in confused internal bookkeeping for elements later added to the @ISA array. For instance, creating a weak reference to the element itself could push that weak reference on to @ISA; and elements added after use of $#ISA would be ignored by method lookup [perl #85670].
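
The fix for weak references to lexical hashes can be illustrated with a short sketch using weaken from Scalar::Util. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $weak;
my $alive_inside;
{
    my %h = (key => 'value');
    $weak = \%h;
    weaken($weak);

    # While %h is in scope, the weak reference is still usable.
    $alive_inside = defined $weak ? 1 : 0;
}

# Once %h has gone out of scope, the weak reference goes stale.
my $stale = defined $weak ? 0 : 1;

print "alive_inside=$alive_inside stale=$stale\n";
```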

Other notable fixes

  • quotemeta now quotes consistently the same non-ASCII characters under use feature 'unicode_strings', regardless of whether the string is encoded in UTF-8 or not, hence fixing the last vestiges (we hope) of the notorious Unicode Bug (see The "Unicode Bug" in perlunicode) [perl #77654].

    Which of these code points is quoted has changed, based on Unicode's recommendations. See quotemeta for details.

  • study is now a no-op, presumably fixing all outstanding bugs related to study causing regex matches to behave incorrectly!

  • When one writes open foo || die, which used to work in Perl 4, a "Precedence problem" warning is produced. This warning used to apply erroneously to fully-qualified bareword handle names not followed by ||. This has been corrected.

  • After package aliasing (*foo:: = *bar:: ), select with 0 or 1 argument would sometimes return a name that could not be used to refer to the filehandle, or sometimes it would return undef even when a filehandle was selected. Now it returns a typeglob reference in such cases.

  • PerlIO::get_layers no longer ignores some arguments that it thinks are numeric, while treating others as filehandle names. It is now consistent for flat scalars (i.e., not references).

  • Unrecognized switches on #! line

    If a switch, such as -x, that cannot occur on the #! line is used there, perl dies with "Can't emulate...".

    It used to produce the same message for switches that perl did not recognize at all, whether on the command line or the #! line.

    Now it produces the "Unrecognized switch" error message [perl #104288].

  • system now temporarily blocks the SIGCHLD signal handler, to prevent the signal handler from stealing the exit status [perl #105700].

  • The %n formatting code for printf and sprintf, which causes the number of characters to be assigned to the next argument, now actually assigns the number of characters, instead of the number of bytes.

    It also works now with special lvalue functions like substr and with nonexistent hash and array elements [perl #3471, #103492].

  • Perl skips copying values returned from a subroutine, for the sake of speed, if doing so would make no observable difference. Because of faulty logic, this would happen with the result of delete, shift or splice, even if the result was referenced elsewhere. It also did so with tied variables about to be freed [perl #91844, #95548].

  • utf8::decode now refuses to modify read-only scalars [perl #91850].

  • Freeing $_ inside a grep or map block, a code block embedded in a regular expression, or an @INC filter (a subroutine returned by a subroutine in @INC) used to result in double frees or crashes [perl #91880, #92254, #92256].

  • eval returns undef in scalar context or an empty list in list context when there is a run-time error. When eval was passed a string in list context and a syntax error occurred, it used to return a list containing a single undefined element. Now it returns an empty list in list context for all errors [perl #80630].

  • goto &func no longer crashes, but produces an error message, when the unwinding of the current subroutine's scope fires a destructor that undefines the subroutine being "goneto" [perl #99850].

  • Perl now holds an extra reference count on the package that code is currently compiling in. This means that the following code no longer crashes [perl #101486]:

    package Foo;
    BEGIN {*Foo:: = *Bar::}
    sub foo;

  • The x repetition operator no longer crashes on 64-bit builds with large repeat counts [perl #94560].

  • Calling require on an implicit $_ when *CORE::GLOBAL::require has been overridden does not segfault anymore, and $_ is now passed to the overriding subroutine [perl #78260].

  • use and require are no longer affected by the I/O layers active in the caller's scope (enabled by open.pm) [perl #96008].

  • our $::é; (which is invalid) no longer produces the "Compilation error at lib/utf8_heavy.pl..." error message, which it started emitting in 5.10.0 [perl #99984].

  • On 64-bit systems, read() now understands large string offsets beyond the 32-bit range.

  • Errors that occur when processing subroutine attributes no longer cause the subroutine's op tree to leak.

  • Passing the same constant subroutine to both index and formline no longer causes one or the other to fail [perl #89218]. (5.14.1)

  • List assignment to lexical variables declared with attributes in the same statement (my ($x,@y) : blimp = (72,94) ) stopped working in Perl 5.8.0. It has now been fixed.

  • Perl 5.10.0 introduced some faulty logic that made "U*" in the middle of a pack template equivalent to "U0" if the input string was empty. This has been fixed [perl #90160]. (5.14.2)

  • Destructors on objects were not called during global destruction on objects that were not referenced by any scalars. This could happen if an array element were blessed (e.g., bless \$a[0] ) or if a closure referenced a blessed variable (bless \my @a; sub foo { @a } ).

    Now there is an extra pass during global destruction to fire destructors on any objects that might be left after the usual passes that check for objects referenced by scalars [perl #36347].

  • Fixed a case where it was possible that a freed buffer may have been read from when parsing a here document [perl #90128]. (5.14.1)

  • each(ARRAY) is now wrapped in defined(...), like each(HASH), inside a while condition [perl #90888].

  • A problem with context propagation when a do block is an argument to return has been fixed. It used to cause undef to be returned in certain cases of a return inside an if block which itself is followed by another return.

  • Calling index with a tainted constant no longer causes constants in subsequently compiled code to become tainted [perl #64804].

  • Infinite loops like 1 while 1 used to stop strict 'subs' mode from working for the rest of the block.

  • For list assignments like ($a,$b) = ($b,$a), Perl has to make a copy of the items on the right-hand side before assigning them to the left. For efficiency's sake, it assigns the values on the right straight to the items on the left if no one variable is mentioned on both sides, as in ($a,$b) = ($c,$d). The logic for determining when it can cheat was faulty, in that && and || on the right-hand side could fool it. So ($a,$b) = $some_true_value && ($b,$a) would end up assigning the value of $b to both scalars.

  • Perl no longer tries to apply lvalue context to the string in ("string", $variable) ||= 1 (which used to be an error). Since the left-hand side of ||= is evaluated in scalar context, that's a scalar comma operator, which gives all but the last item void context. There is no such thing as void lvalue context, so it was a mistake for Perl to try to force it [perl #96942].

  • caller no longer leaks memory when called from the DB package if @DB::args was assigned to after the first call to caller. Carp was triggering this bug [perl #97010]. (5.14.2)

  • close and similar filehandle functions, when called on built-in global variables (like $+ ), used to die if the variable happened to hold the undefined value, instead of producing the usual "Use of uninitialized value" warning.

  • When autovivified file handles were introduced in Perl 5.6.0, readline was inadvertently made to autovivify when called as readline($foo) (but not as <$foo> ). It has now been fixed never to autovivify.

  • Calling an undefined anonymous subroutine (e.g., what $x holds after undef &{$x = sub{}} ) used to cause a "Not a CODE reference" error, which has been corrected to "Undefined subroutine called" [perl #71154].

  • Causing @DB::args to be freed between uses of caller no longer results in a crash [perl #93320].

  • setpgrp($foo) used to be equivalent to ($foo, setpgrp) , because setpgrp was ignoring its argument if there was just one. Now it is equivalent to setpgrp($foo,0).

  • shmread was not setting the scalar flags correctly when reading from shared memory, causing the existing cached numeric representation in the scalar to persist [perl #98480].

  • ++ and -- now work on copies of globs, instead of dying.

  • splice() doesn't warn when truncating

    You can now limit the size of an array using splice(@a,MAX_LEN) without worrying about warnings.

  • $$ is no longer tainted. Since this value comes directly from getpid() , it is always safe.

  • The parser no longer leaks a filehandle if STDIN was closed before parsing started [perl #37033].

  • die; with a non-reference, non-string, or magical (e.g., tainted) value in $@ now properly propagates that value [perl #111654].
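
A few of the fixes in the list above lend themselves to a quick check. The following is a minimal sketch, assuming perl 5.16 or later; all variable names are illustrative.

```perl
use strict;
use warnings;

# eval STRING with a syntax error returns an empty list in list context.
my @eval_result = eval "1 +";
my $eval_count  = scalar @eval_result;

# each(ARRAY) in a while condition is wrapped in defined(), so index 0
# (which is false) does not end the loop early.
my @letters = qw(x y z);
my @indices;
while (my $i = each @letters) {
    push @indices, $i;
}

# && on the right-hand side no longer fools the list-assignment
# shortcut, so this swaps the two values.
my ($x, $y) = (1, 2);
my $true = 1;
($x, $y) = $true && ($y, $x);

# Two-argument splice truncates without warning, even when the offset
# lies past the end of the array.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };
my @short = (1, 2, 3);
my @long  = (1 .. 10);
splice(@short, 5);    # nothing to remove
splice(@long, 5);     # keeps the first five elements

print "eval_count=$eval_count indices=@indices x=$x y=$y ",
      "short=", scalar(@short), " long=", scalar(@long),
      " warnings=", scalar(@warnings), "\n";
```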

Known Problems

  • On Solaris, we have two kinds of failure.

    If make is Sun's make, we get an error about a badly formed macro assignment in the Makefile. That happens when ./Configure tries to make depends. Configure then exits 0, but further make-ing fails.

    If make is gmake, Configure completes, then we get errors related to /usr/include/stdbool.h

  • On Win32, a number of tests hang unless STDERR is redirected. The cause of this is still under investigation.

  • When building as root with a umask that prevents files from being other-readable, t/op/filetest.t will fail. This is a test bug, not a bug in perl's behavior.

  • Configuring with a recent gcc and link-time-optimization, such as Configure -Doptimize='-O2 -flto' fails because the optimizer optimizes away some of Configure's tests. A workaround is to omit the -flto flag when running Configure, but add it back in while actually building, something like

    sh Configure -Doptimize=-O2
    make OPTIMIZE='-O2 -flto'

  • The following CPAN modules have test failures with perl 5.16. Patches have been submitted for all of these, so hopefully there will be new releases soon:

Acknowledgements

Perl 5.16.0 represents approximately 12 months of development since Perl 5.14.0 and contains approximately 590,000 lines of changes across 2,500 files from 139 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.16.0:

Aaron Crane, Abhijit Menon-Sen, Abigail, Alan Haggai Alavi, Alberto Simões, Alexandr Ciornii, Andreas König, Andy Dougherty, Aristotle Pagaltzis, Bo Johansson, Bo Lindbergh, Breno G. de Oliveira, brian d foy, Brian Fraser, Brian Greenfield, Carl Hayter, Chas. Owens, Chia-liang Kao, Chip Salzenberg, Chris 'BinGOs' Williams, Christian Hansen, Christopher J. Madsen, chromatic, Claes Jacobsson, Claudio Ramirez, Craig A. Berry, Damian Conway, Daniel Kahn Gillmor, Darin McBride, Dave Rolsky, David Cantrell, David Golden, David Leadbeater, David Mitchell, Dee Newcum, Dennis Kaarsemaker, Dominic Hargreaves, Douglas Christopher Wilson, Eric Brine, Father Chrysostomos, Florian Ragwitz, Frederic Briere, George Greer, Gerard Goossen, Gisle Aas, H.Merijn Brand, Hojung Youn, Ian Goodacre, James E Keenan, Jan Dubois, Jerry D. Hedden, Jesse Luehrs, Jesse Vincent, Jilles Tjoelker, Jim Cromie, Jim Meyering, Joel Berger, Johan Vromans, Johannes Plunien, John Hawkinson, John P. Linderman, John Peacock, Joshua ben Jore, Juerd Waalboer, Karl Williamson, Karthik Rajagopalan, Keith Thompson, Kevin J. Woolley, Kevin Ryde, Laurent Dami, Leo Lapworth, Leon Brocard, Leon Timmermans, Louis Strous, Lukas Mai, Marc Green, Marcel Grünauer, Mark A. Stratman, Mark Dootson, Mark Jason Dominus, Martin Hasch, Matthew Horsfall, Max Maischein, Michael G Schwern, Michael Witten, Mike Sheldrake, Moritz Lenz, Nicholas Clark, Niko Tyni, Nuno Carvalho, Pau Amma, Paul Evans, Paul Green, Paul Johnson, Perlover, Peter John Acklam, Peter Martini, Peter Scott, Phil Monsen, Pino Toscano, Rafael Garcia-Suarez, Rainer Tammer, Reini Urban, Ricardo Signes, Robin Barker, Rodolfo Carvalho, Salvador Fandiño, Sam Kimbrel, Samuel Thibault, Shawn M Moore, Shigeya Suzuki, Shirakata Kentaro, Shlomi Fish, Sisyphus, Slaven Rezic, Spiros Denaxas, Steffen Müller, Steffen Schwigon, Stephen Bennett, Stephen Oberholtzer, Stevan Little, Steve Hay, Steve Peters, Thomas Sibley, Thorsten Glaser, Timothe Litt, Todd Rinaldo, Tom Christiansen, Tom Hukins, Tony Cook, Vadim Konovalov, Vincent Pit, Vladimir Timofeev, Walt Mankowski, Yves Orton, Zefram, Zsbán Ambrus, Ævar Arnfjörð Bjarmason.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/. There may also be information at http://www.perl.org/, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please use this address only for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

perl5161delta

NAME

perl5161delta - what is new for perl v5.16.1

DESCRIPTION

This document describes differences between the 5.16.0 release and the 5.16.1 release.

If you are upgrading from an earlier release such as 5.14.0, first read perl5160delta, which describes differences between 5.14.0 and 5.16.0.

Security

An off-by-two error in Scalar-List-Util has been fixed

The bugfix was in Scalar-List-Util 1.23_04, and perl 5.16.1 includes Scalar-List-Util 1.25.

Incompatible Changes

There are no changes intentionally incompatible with 5.16.0. If any exist, they are bugs, and we request that you submit a report. See Reporting Bugs below.

Modules and Pragmata

Updated Modules and Pragmata

  • Scalar::Util and List::Util have been upgraded from version 1.23 to version 1.25.

  • B::Deparse has been updated from version 1.14 to 1.14_01. An "uninitialized" warning emitted by B::Deparse has been squashed [perl #113464].

Configuration and Compilation

  • Building perl with some Windows compilers used to fail due to a problem with miniperl's glob operator (which uses the perlglob program) deleting the PATH environment variable [perl #113798].

Platform Support

Platform-Specific Notes

  • VMS

    All C header files from the top-level directory of the distribution are now installed on VMS, providing consistency with a long-standing practice on other platforms. Previously only a subset were installed, which broke non-core extension builds for extensions that depended on the missing include files.

Selected Bug Fixes

  • A regression introduced in Perl v5.16.0 involving tr/SEARCHLIST/REPLACEMENTLIST/ has been fixed. Only the first instance is supposed to be meaningful if a character appears more than once in SEARCHLIST. Under some circumstances, the final instance was overriding all earlier ones. [perl #113584]

  • B::COP::stashlen has been added. This provides access to an internal field added in perl 5.16 under threaded builds. It was broken at the last minute before 5.16 was released [perl #113034].

  • The re pragma will no longer clobber $_ [perl #113750].

  • Unicode 6.1 published an incorrect alias for one of the Canonical_Combining_Class property's values (which range between 0 and 254). The alias CCC133 should have been CCC132 . Perl now overrides the data file furnished by Unicode to give the correct value.

  • Duplicating scalar filehandles works again. [perl #113764]

  • Under threaded perls, a runtime code block in a regular expression could corrupt the package name stored in the op tree, resulting in bad reads in caller, and possibly crashes [perl #113060].

  • For efficiency's sake, many operators and built-in functions return the same scalar each time. Lvalue subroutines and subroutines in the CORE:: namespace were allowing this implementation detail to leak through. print &CORE::uc("a"), &CORE::uc("b") used to print "BB". The same thing would happen with an lvalue subroutine returning the return value of uc. Now the value is copied in such cases [perl #113044].

  • __SUB__ now works in special blocks (BEGIN , END , etc.).

  • Formats that reference lexical variables from outside no longer result in crashes.
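
Two of the fixes above are easy to verify by hand. This is a minimal sketch, assuming perl 5.16.1 or later; the variable names are illustrative.

```perl
use strict;
use warnings;

# Only the first occurrence of a repeated SEARCHLIST character counts:
# "a" maps to "x", and the later "a" to "y" pair is ignored.
(my $translated = "banana") =~ tr/aa/xy/;

# Each &CORE::uc call now yields its own copy of the result, so the
# second call no longer overwrites the first.
my $joined = join "", &CORE::uc("a"), &CORE::uc("b");

print "translated=$translated joined=$joined\n";
```

With the 5.16.0 regressions present, the translation could have used "y" instead, and the joined string would have been "BB".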

Known Problems

There are no new known problems, but consult Known Problems in perl5160delta to see those identified in the 5.16.0 release.

Acknowledgements

Perl 5.16.1 represents approximately 2 months of development since Perl 5.16.0 and contains approximately 14,000 lines of changes across 96 files from 8 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.16.1:

Chris 'BinGOs' Williams, Craig A. Berry, Father Chrysostomos, Karl Williamson, Paul Johnson, Reini Urban, Ricardo Signes, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5162delta

NAME

perl5162delta - what is new for perl v5.16.2

DESCRIPTION

This document describes differences between the 5.16.1 release and the 5.16.2 release.

If you are upgrading from an earlier release such as 5.16.0, first read perl5161delta, which describes differences between 5.16.0 and 5.16.1.

Incompatible Changes

There are no changes intentionally incompatible with 5.16.0. If any exist, they are bugs, and we request that you submit a report. See Reporting Bugs below.

Modules and Pragmata

Updated Modules and Pragmata

Configuration and Compilation

  • Configuration should no longer be confused by ls colorization.

Platform Support

Platform-Specific Notes

  • AIX

    Configure now always adds -qlanglvl=extc99 to the CC flags on AIX when using xlC. This will make it easier to compile a number of XS-based modules that assume C99 [perl #113778].

Selected Bug Fixes

  • fix /\h/ equivalence with /[\h]/

    see [perl #114220]
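
One way to confirm that the two forms agree is to compare them over a range of code points. The following is a hypothetical self-check, not part of the perl test suite; the upper bound U+3000 is chosen here only because it is the highest horizontal-space character.

```perl
use strict;
use warnings;

# Compare \h with [\h] for every code point up to U+3000
# (IDEOGRAPHIC SPACE) and collect any disagreement.
my @mismatches;
for my $cp (0 .. 0x3000) {
    my $char    = chr $cp;
    my $bare    = $char =~ /\h/   ? 1 : 0;
    my $bracket = $char =~ /[\h]/ ? 1 : 0;
    push @mismatches, sprintf("U+%04X", $cp) if $bare != $bracket;
}

print @mismatches ? "Mismatches: @mismatches\n" : "\\h and [\\h] agree\n";
```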

Known Problems

There are no new known problems.

Acknowledgements

Perl 5.16.2 represents approximately 2 months of development since Perl 5.16.1 and contains approximately 740 lines of changes across 20 files from 9 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.16.2:

Andy Dougherty, Craig A. Berry, Darin McBride, Dominic Hargreaves, Karen Etheridge, Karl Williamson, Peter Martini, Ricardo Signes, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5163delta

NAME

perl5163delta - what is new for perl v5.16.3

DESCRIPTION

This document describes differences between the 5.16.2 release and the 5.16.3 release.

If you are upgrading from an earlier release such as 5.16.1, first read perl5162delta, which describes differences between 5.16.1 and 5.16.2.

Core Enhancements

No changes since 5.16.0.

Security

This release contains one major and a number of minor security fixes. These latter are included mainly to allow the test suite to pass cleanly with the clang compiler's address sanitizer facility.

CVE-2013-1667: memory exhaustion with arbitrary hash keys

With a carefully crafted set of hash keys (for example arguments on a URL), it is possible to cause a hash to consume a large amount of memory and CPU, and thus possibly to achieve a Denial-of-Service.

This problem has been fixed.

wrap-around with IO on long strings

Reading or writing strings greater than 2**31 bytes in size could segfault due to integer wraparound.

This problem has been fixed.

memory leak in Encode

The UTF-8 encoding implementation in Encode.xs had a memory leak which has been fixed.

Incompatible Changes

There are no changes intentionally incompatible with 5.16.0. If any exist, they are bugs and reports are welcome.

Deprecations

There have been no deprecations since 5.16.0.

Modules and Pragmata

Updated Modules and Pragmata

  • Encode has been upgraded from version 2.44 to version 2.44_01.

  • Module::CoreList has been upgraded from version 2.76 to version 2.76_02.

  • XS::APItest has been upgraded from version 0.38 to version 0.39.

Known Problems

None.

Acknowledgements

Perl 5.16.3 represents approximately 4 months of development since Perl 5.16.2 and contains approximately 870 lines of changes across 39 files from 7 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.16.3:

Andy Dougherty, Chris 'BinGOs' Williams, Dave Rolsky, David Mitchell, Michael Schroeder, Ricardo Signes, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5180delta

NAME

perl5180delta - what is new for perl v5.18.0

DESCRIPTION

This document describes differences between the v5.16.0 release and the v5.18.0 release.

If you are upgrading from an earlier release such as v5.14.0, first read perl5160delta, which describes differences between v5.14.0 and v5.16.0.

Core Enhancements

New mechanism for experimental features

Newly-added experimental features will now require this incantation:

    no warnings "experimental::feature_name";
    use feature "feature_name"; # would warn without the prev line

There is a new warnings category, called "experimental", containing warnings that the feature pragma emits when enabling experimental features.

Newly-added experimental features will also be given special warning IDs, which consist of "experimental::" followed by the name of the feature. (The plan is to extend this mechanism eventually to all warnings, to allow them to be enabled or disabled individually, and not just by category.)

By saying

    no warnings "experimental::feature_name";

you are taking responsibility for any breakage that future changes to, or removal of, the feature may cause.

Since some features (like ~~ or my $_ ) now emit experimental warnings, and you may want to disable them in code that is also run on perls that do not recognize these warning categories, consider using the if pragma like this:

    no if $] >= 5.018, warnings => "experimental::feature_name";

Existing experimental features may begin emitting these warnings, too. Please consult perlexperiment for information on which features are considered experimental.

Hash overhaul

Changes to the implementation of hashes in perl v5.18.0 will be one of the most visible changes to the behavior of existing code.

By default, two distinct hash variables with identical keys and values may now provide their contents in a different order where it was previously identical.

When encountering these changes, the key to cleaning up from them is to accept that hashes are unordered collections and to act accordingly.

Hash randomization

The seed used by Perl's hash function is now random. This means that the order in which keys/values will be returned from functions like keys(), values(), and each() will differ from run to run.

This change was introduced to make Perl's hashes more robust to algorithmic complexity attacks, and also because we discovered that it exposes hash ordering dependency bugs and makes them easier to track down.

Toolchain maintainers might want to invest in additional infrastructure to test for things like this. Running tests several times in a row and then comparing results will make it easier to spot hash order dependencies in code. Authors are strongly encouraged not to expose the key order of Perl's hashes to insecure audiences.

Further, every hash has its own iteration order, which should make it much more difficult to determine what the current hash seed is.
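
Code that needs a stable order should impose one explicitly rather than rely on what keys() happens to return. A minimal sketch, with a made-up %stock hash:

```perl
use strict;
use warnings;

my %stock = (apple => 3, banana => 7, cherry => 1);

# keys() may return a different order on every run, so sort explicitly.
my @by_name  = sort keys %stock;
my @by_count = sort { $stock{$a} <=> $stock{$b} || $a cmp $b } keys %stock;

print "by name:  @by_name\n";
print "by count: @by_count\n";
```

The secondary comparison on the key keeps the second ordering deterministic even when two counts are equal.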

New hash functions

Perl v5.18 includes support for multiple hash functions and changes the default to ONE_AT_A_TIME_HARD. You can choose a different algorithm by defining a symbol at compile time. For a current list, consult the INSTALL document. Note that as of Perl v5.18 we can only recommend use of the default or SIPHASH. All the others are known to have security issues and are for research purposes only.

PERL_HASH_SEED environment variable now takes a hex value

PERL_HASH_SEED no longer accepts an integer as a parameter; instead the value is expected to be a binary value encoded in a hex string, such as "0xf5867c55039dc724". This is to make the infrastructure support hash seeds of arbitrary lengths, which might exceed that of an integer. (SipHash uses a 16 byte seed.)

PERL_PERTURB_KEYS environment variable added

The PERL_PERTURB_KEYS environment variable allows one to control the level of randomization applied to keys and friends.

When PERL_PERTURB_KEYS is 0, perl will not randomize the key order at all. The chance that the order returned by keys changes due to an insert will be the same as in previous perls, basically only when the bucket size is changed.

When PERL_PERTURB_KEYS is 1, perl will randomize keys in a non-repeatable way. The chance that the order returned by keys changes due to an insert will be very high. This is the most secure and default mode.

When PERL_PERTURB_KEYS is 2, perl will randomize keys in a repeatable way. Repeated runs of the same program should produce the same output every time.

PERL_HASH_SEED implies a non-default PERL_PERTURB_KEYS setting. Setting PERL_HASH_SEED=0 (exactly one 0) implies PERL_PERTURB_KEYS=0 (hash key randomization disabled); setting PERL_HASH_SEED to any other value implies PERL_PERTURB_KEYS=2 (deterministic and repeatable hash key randomization). Specifying PERL_PERTURB_KEYS explicitly to a different level overrides this behavior.
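
The effect of these variables can be observed by running a child perl twice with randomization disabled and comparing the key order. This is a sketch only: the helper name child_key_order is invented for the example, and it assumes that the perl binary in $^X honours both variables and that list-form pipe opens work on the platform.

```perl
use strict;
use warnings;

# Run a child perl and return the order in which keys() lists a small hash.
sub child_key_order {
    my $program = 'my %h = map { $_ => 1 } "a" .. "t"; print join ",", keys %h';
    open(my $child, '-|', $^X, '-e', $program)
        or die "Cannot run $^X: $!";
    my $order = <$child>;
    close $child;
    return $order;
}

my ($first, $second);
{
    # Both settings are given explicitly so that nothing inherited from
    # the surrounding environment changes the outcome.
    local $ENV{PERL_HASH_SEED}    = '0';
    local $ENV{PERL_PERTURB_KEYS} = '0';
    $first  = child_key_order();
    $second = child_key_order();
}

print $first eq $second ? "repeatable\n" : "order differs\n";
```

Such settings are useful for reproducing a failure; they should not be left enabled in production, since they remove the protection that randomization provides.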

Hash::Util::hash_seed() now returns a string

Hash::Util::hash_seed() now returns a string instead of an integer. This is to make the infrastructure support hash seeds of arbitrary lengths which might exceed that of an integer. (SipHash uses a 16 byte seed.)
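
Because the seed is now a string of bytes, code that prints it will usually want to convert it to hex first. A small sketch, assuming Hash::Util as shipped with v5.18 or later:

```perl
use strict;
use warnings;
use Hash::Util qw(hash_seed);

# The seed is a string of raw bytes, so render it as hex for display.
my $seed = hash_seed();
my $hex  = unpack "H*", $seed;

printf "seed is %d bytes: 0x%s\n", length($seed), $hex;
```

The seed is sensitive information and should not be shown to untrusted parties.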

Output of PERL_HASH_SEED_DEBUG has been changed

The environment variable PERL_HASH_SEED_DEBUG now makes perl show both the hash function perl was built with, and the seed, in hex, in use for that process. Code parsing this output, should it exist, must change to accommodate the new format. Example of the new format:

    $ PERL_HASH_SEED_DEBUG=1 ./perl -e1
    HASH_FUNCTION = MURMUR3 HASH_SEED = 0x1476bb9f

Upgrade to Unicode 6.2

Perl now supports Unicode 6.2. A list of changes from Unicode 6.1 is at http://www.unicode.org/versions/Unicode6.2.0.

Character name aliases may now include non-Latin1-range characters

It is possible to define your own names for characters for use in \N{...}, charnames::vianame(), etc. These names can now be composed of characters from the whole Unicode range. This allows for names to be in your native language, and not just English. Certain restrictions apply to the characters that may be used (you can't define a name that has punctuation in it, for example). See CUSTOM ALIASES in charnames.

New DTrace probes

The following new DTrace probes have been added:

  • op-entry

  • loading-file

  • loaded-file

${^LAST_FH}

This new variable provides access to the filehandle that was last read. This is the handle used by $. and by tell and eof without arguments.
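
A brief sketch of how the variable might be used. The in-memory file and the variable names are there only to keep the example self-contained.

```perl
use strict;
use warnings;

# An in-memory file keeps the sketch self-contained.
my $data = "first\nsecond\n";
open(my $in, '<', \$data) or die "Cannot open in-memory file: $!";

my $line1 = <$in>;          # $in is now the last-read handle
my $last  = ${^LAST_FH};    # reference to that handle
my $line2 = <$last>;        # reads the next line from the same handle

print "line $.: $line2";
```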

Regular Expression Set Operations

This is an experimental feature to allow matching against the union, intersection, etc., of sets of code points, similar to Unicode::Regex::Set. It can also be used to extend /x processing to [bracketed] character classes, and as a replacement of user-defined properties, allowing more complex expressions than they do. See Extended Bracketed Character Classes in perlrecharclass.

Lexical subroutines

This new feature is still considered experimental. To enable it:

    use 5.018;
    no warnings "experimental::lexical_subs";
    use feature "lexical_subs";

You can now declare subroutines with state sub foo, my sub foo, and our sub foo. (state sub requires that the "state" feature be enabled, unless you write it as CORE::state sub foo.)

state sub creates a subroutine visible within the lexical scope in which it is declared. The subroutine is shared between calls to the outer sub.

my sub declares a lexical subroutine that is created each time the enclosing block is entered. state sub is generally slightly faster than my sub .

our sub declares a lexical alias to the package subroutine of the same name.

For more information, see Lexical Subroutines in perlsub.
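
As a rough illustration, here is one enclosing subroutine that declares both a my sub and a state sub. The names describe, double and triple are invented for the example.

```perl
use 5.018;
use warnings;
no warnings "experimental::lexical_subs";
use feature "lexical_subs";

sub describe {
    my ($n) = @_;

    # Both helpers are invisible outside this block.
    my sub double    { return 2 * $_[0] }
    state sub triple { return 3 * $_[0] }

    return double($n) + triple($n);
}

say "describe(2) = ", describe(2);
```

Neither helper is installed in the package symbol table, so code elsewhere in the file cannot call double or triple by name.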

Computed Labels

The loop controls next, last and redo, and the special dump operator, now allow arbitrary expressions to be used to compute labels at run time. Previously, any argument that was not a constant was treated as the empty string.
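
A minimal sketch of a computed label, assuming v5.18 or later. The label is built by concatenation purely to show that it need not be a literal.

```perl
use strict;
use warnings;

my @visited;
my $target = "OUT" . "ER";    # label name held in a variable

OUTER: for my $i (1 .. 3) {
    for my $j (1 .. 3) {
        push @visited, "$i$j";
        last $target if $j == 2;    # leaves both loops, like "last OUTER"
    }
}

print "@visited\n";
```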

More CORE:: subs

Several more built-in functions have been added as subroutines to the CORE:: namespace - namely, those non-overridable keywords that can be implemented without custom parsers: defined, delete, exists, glob, pos, prototype, scalar, split, study, and undef.

As some of these have prototypes, prototype('CORE::...') has been changed to not make a distinction between overridable and non-overridable keywords. This is to make prototype('CORE::pos') consistent with prototype(&CORE::pos).
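
The consistency described above can be observed by asking for the prototype both by name and through a code reference. In this sketch the reference is taken with \&CORE::pos; the output is printed rather than assumed.

```perl
use strict;
use warnings;

# Look up the prototype by name and through a code reference.
my $by_name = prototype('CORE::pos');
my $by_ref  = prototype(\&CORE::pos);

for my $pair ([ 'by name', $by_name ], [ 'by reference', $by_ref ]) {
    my ($how, $proto) = @$pair;
    printf "%-13s %s\n", $how, defined $proto ? $proto : '(none)';
}
```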

kill with negative signal names

kill has always allowed a negative signal number, which kills the process group instead of a single process. It has also allowed signal names. But it did not behave consistently, because negative signal names were treated as 0. Now negative signal names like -INT are supported and treated the same way as -2 [perl #112990].

Security

See also: hash overhaul

Some of the changes in the hash overhaul were made to enhance security. Please read that section.

Storable security warning in documentation

The documentation for Storable now includes a section which warns readers of the danger of accepting Storable documents from untrusted sources. The short version is that deserializing certain types of data can lead to loading modules and other code execution. This is documented behavior and wanted behavior, but this opens an attack vector for malicious entities.

Locale::Maketext allowed code injection via a malicious template

If users could provide a translation string to Locale::Maketext, this could be used to invoke arbitrary Perl subroutines available in the current process.

This has been fixed, but it is still possible to invoke any method provided by Locale::Maketext itself or a subclass that you are using. One of these methods in turn will invoke the Perl core's sprintf subroutine.

In summary, allowing users to provide translation strings without auditing them is a bad idea.

This vulnerability is documented in CVE-2012-6329.

Avoid calling memset with a negative count

Poorly written perl code that allows an attacker to specify the count to perl's x string repeat operator can already cause a memory exhaustion denial-of-service attack. A flaw in versions of perl before v5.15.5 can escalate that into a heap buffer overrun; coupled with versions of glibc before 2.16, it possibly allows the execution of arbitrary code.

The flaw addressed by this commit has been assigned identifier CVE-2012-5195 and was researched by Tim Brown.

Incompatible Changes

See also: hash overhaul

Some of the changes in the hash overhaul are not fully compatible with previous versions of perl. Please read that section.

An unknown character name in \N{...} is now a syntax error

Previously, it warned, and the Unicode REPLACEMENT CHARACTER was substituted. Unicode now recommends that this situation be a syntax error. Also, the previous behavior led to some confusing warnings and behaviors, and since the REPLACEMENT CHARACTER has no use other than as a stand-in for some unknown character, any code that has this problem is buggy.

Formerly deprecated characters in \N{} character name aliases are now errors.

Since v5.12.0, it has been deprecated to use certain characters in user-defined \N{...} character names. These now cause a syntax error. For example, it is now an error to begin a name with a digit, such as in

    my $undraftable = "\N{4F}"; # Syntax error!

or to have commas anywhere in the name. See CUSTOM ALIASES in charnames.

\N{BELL} now refers to U+1F514 instead of U+0007

Unicode 6.0 reused the name "BELL" for a different code point than it traditionally had meant. Since Perl v5.14, use of this name still referred to U+0007, but would raise a deprecation warning. Now, "BELL" refers to U+1F514, and the name for U+0007 is "ALERT". All the functions in charnames have been correspondingly updated.
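
The new mapping is easy to verify. A short sketch, assuming v5.18 or later:

```perl
use strict;
use warnings;

my %code_point = (
    BELL  => ord("\N{BELL}"),
    ALERT => ord("\N{ALERT}"),
);

printf "%-5s U+%04X\n", $_, $code_point{$_} for sort keys %code_point;
```

Code that needs the control character on both old and new perls can avoid the name entirely and write "\a" or "\x07".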

New Restrictions in Multi-Character Case-Insensitive Matching in Regular Expression Bracketed Character Classes

Unicode has now withdrawn their previous recommendation for regular expressions to automatically handle cases where a single character can match multiple characters case-insensitively, for example, the letter LATIN SMALL LETTER SHARP S and the sequence ss . This is because it turns out to be impracticable to do this correctly in all circumstances. Because Perl has tried to do this as best it can, it will continue to do so. (We are considering an option to turn it off.) However, a new restriction is being added on such matches when they occur in [bracketed] character classes. People were specifying things such as /[\0-\xff]/i , and being surprised that it matches the two character sequence ss (since LATIN SMALL LETTER SHARP S occurs in this range). This behavior is also inconsistent with using a property instead of a range: \p{Block=Latin1} also includes LATIN SMALL LETTER SHARP S, but /[\p{Block=Latin1}]/i does not match ss . The new rule is that for there to be a multi-character case-insensitive match within a bracketed character class, the character must be explicitly listed, and not as an end point of a range. This more closely obeys the Principle of Least Astonishment. See Bracketed Character Classes in perlrecharclass. Note that a bug [perl #89774], now fixed as part of this change, prevented the previous behavior from working fully.
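
The difference between an explicitly listed character and a range can be sketched as follows. The variable names are illustrative, and the expected behavior assumes v5.18 or later.

```perl
use strict;
use warnings;

# Explicitly listed: the multi-character fold to "ss" is allowed.
my $listed = "ss" =~ /\A[\N{LATIN SMALL LETTER SHARP S}]\z/i ? 1 : 0;

# Included only through a range: no multi-character fold is attempted.
my $ranged = "ss" =~ /\A[\0-\xff]\z/iu ? 1 : 0;

print "listed: $listed, ranged: $ranged\n";
```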

Explicit rules for variable names and identifiers

Due to an oversight, single character variable names in v5.16 were completely unrestricted. This opened the door to several kinds of insanity. As of v5.18, these now follow the rules of other identifiers, in addition to accepting characters that match the \p{POSIX_Punct} property.

There is no longer any difference in the parsing of identifiers specified by using braces versus without braces. For instance, perl used to allow ${foo:bar} (with a single colon) but not $foo:bar . Now that both are handled by a single code path, they are both treated the same way: both are forbidden. Note that this change is about the range of permissible literal identifiers, not other expressions.

Vertical tabs are now whitespace

No one could recall why \s didn't match \cK , the vertical tab. Now it does. Given the extreme rarity of that character, very little breakage is expected. That said, here's what it means:

\s in a regex now matches a vertical tab in all circumstances.

Literal vertical tabs in a regex literal are ignored when the /x modifier is used.

Leading vertical tabs, alone or mixed with other whitespace, are now ignored when interpreting a string as a number. For example:

    $dec = " \cK \t 123";
    $hex = " \cK \t 0xF";
    say 0 + $dec; # was 0 with warning, now 123
    say int $dec; # was 0, now 123
    say oct $hex; # was 0, now 15
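
The regex side of the change can be checked in the same way. A minimal sketch, assuming v5.18 or later:

```perl
use strict;
use warnings;

my $vt = "\cK";    # vertical tab, U+000B

my $is_space = $vt =~ /\A\s\z/ ? 1 : 0;
my $number   = 0 + " \cK \t 123";

print "\\s matches VT: $is_space\n";
print "numeric value: $number\n";
```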

/(?{})/ and /(??{})/ have been heavily reworked

The implementation of this feature has been almost completely rewritten. Although its main intent is to fix bugs, some behaviors, especially related to the scope of lexical variables, will have changed. This is described more fully in the Selected Bug Fixes section.

Stricter parsing of substitution replacement

It is no longer possible to abuse the way the parser parses s///e like this:

    %_=(_,"Just another ");
    $_="Perl hacker,\n";
    s//_}->{_/e;print

given now aliases the global $_

Instead of assigning to an implicit lexical $_ , given now makes the global $_ an alias for its argument, just like foreach . However, it still uses lexical $_ if there is lexical $_ in scope (again, just like foreach ) [perl #114020].

The smartmatch family of features are now experimental

Smart match, added in v5.10.0 and significantly revised in v5.10.1, has been a regular point of complaint. Although there are a number of ways in which it is useful, it has also proven problematic and confusing for both users and implementors of Perl. There have been a number of proposals on how to best address the problem. It is clear that smartmatch is almost certainly either going to change or go away in the future. Relying on its current behavior is not recommended.

Warnings will now be issued when the parser sees ~~, given, or when. To disable these warnings, you can add this line to the appropriate scope:

    no if $] >= 5.018, warnings => "experimental::smartmatch";

Consider, though, replacing the use of these features, as they may change behavior again before becoming stable.
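
As one possible replacement, a list-membership test can use grep, and a given/when block can become an ordinary if/elsif chain. This is a sketch with made-up data, not a drop-in equivalent for every use of smartmatch.

```perl
use strict;
use warnings;

my @allowed = qw(red green blue);
my $colour  = "green";

# Instead of: $colour ~~ @allowed
my $is_allowed = grep { $_ eq $colour } @allowed;

# Instead of given/when: an ordinary if/elsif chain.
my $kind;
if    ($colour eq "red")  { $kind = "warm" }
elsif ($colour eq "blue") { $kind = "cool" }
else                      { $kind = "neutral" }

print "$colour: allowed=", ($is_allowed ? "yes" : "no"), ", kind=$kind\n";
```

Spelling out the comparison operator (eq here) also removes the guesswork about which kind of match smartmatch would have chosen.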

Lexical $_ is now experimental

Since it was introduced in Perl v5.10, it has caused much confusion with no obvious solution:

  • Various modules (e.g., List::Util) expect callback routines to use the global $_ . use List::Util 'first'; my $_; first { $_ == 1 } @list does not work as one would expect.

  • A my $_ declaration earlier in the same file can cause confusing closure warnings.

  • The "_" subroutine prototype character allows called subroutines to access your lexical $_ , so it is not really private after all.

  • Nevertheless, subroutines with a "(@)" prototype and methods cannot access the caller's lexical $_ , unless they are written in XS.

  • But even XS routines cannot access a lexical $_ declared, not in the calling subroutine, but in an outer scope, iff that subroutine happened not to mention $_ or use any operators that default to $_ .

It is our hope that lexical $_ can be rehabilitated, but this may cause changes in its behavior. Please use it with caution until it becomes stable.

readline() with $/ = \N now reads N characters, not N bytes

Previously, when reading from a stream with I/O layers such as encoding , the readline() function, otherwise known as the <> operator, would read N bytes from the top-most layer. [perl #79960]

Now, N characters are read instead.

There is no change in behaviour when reading from streams with no extra layers, since bytes map exactly to characters.
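
The following sketch reads a two-character record from a UTF-8 stream. The in-memory file and the sample bytes are assumptions made to keep the example self-contained.

```perl
use strict;
use warnings;

# UTF-8 bytes for U+00E9 and U+20AC, followed by "abc".
my $bytes = "\xC3\xA9\xE2\x82\xAC" . "abc";
open(my $in, '<:encoding(UTF-8)', \$bytes)
    or die "Cannot open in-memory file: $!";

my $chunk;
{
    local $/ = \2;    # fixed-size records of 2
    $chunk = <$in>;
}
close $in;

printf "read %d characters: %s\n", length($chunk),
    join(" ", map { sprintf("U+%04X", ord $_) } split //, $chunk);
```

Under the old behavior the same read would have stopped after two bytes, that is, after the first character only.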

Overridden glob is now passed one argument

glob overrides used to be passed a magical undocumented second argument that identified the caller. Nothing on CPAN was using this, and it got in the way of a bug fix, so it was removed. If you really need to identify the caller, see Devel::Callsite on CPAN.

Here doc parsing

The body of a here document inside a quote-like operator now always begins on the line after the "<<foo" marker. Previously, it was documented to begin on the line following the containing quote-like operator, but that was only sometimes the case [perl #114040].

Alphanumeric operators must now be separated from the closing delimiter of regular expressions

You may no longer write something like:

    m/a/and 1

Instead you must write

    m/a/ and 1

with whitespace separating the operator from the closing delimiter of the regular expression. Not having whitespace has resulted in a deprecation warning since Perl v5.14.0.

qw(...) can no longer be used as parentheses

qw lists used to fool the parser into thinking they were always surrounded by parentheses. This permitted some surprising constructions such as foreach $x qw(a b c) {...} , which should really be written foreach $x (qw(a b c)) {...} . These would sometimes get the lexer into the wrong state, so they didn't fully work, and the similar foreach qw(a b c) {...} that one might expect to be permitted never worked at all.

This side effect of qw has now been abolished. It has been deprecated since Perl v5.13.11. It is now necessary to use real parentheses everywhere that the grammar calls for them.
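
The portable spelling simply wraps the qw list in parentheses, for example:

```perl
use strict;
use warnings;

my @seen;
for my $x (qw(a b c)) {    # real parentheses around the qw list
    push @seen, uc $x;
}

print "@seen\n";
```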

Interaction of lexical and default warnings

Turning on any lexical warnings used to first disable all default warnings if lexical warnings were not already enabled:

    $*; # deprecation warning
    use warnings "void";
    $#; # void warning; no deprecation warning

Now, the debugging , deprecated , glob, inplace and malloc warnings categories are left on when turning on lexical warnings (unless they are turned off by no warnings , of course).

This may cause deprecation warnings to occur in code that used to be free of warnings.

Those are the only categories consisting only of default warnings. Default warnings in other categories are still disabled by use warnings "category" , as we do not yet have the infrastructure for controlling individual warnings.

state sub and our sub

Due to an accident of history, state sub and our sub were equivalent to a plain sub, so one could even create an anonymous sub with our sub { ... } . These are now disallowed outside of the "lexical_subs" feature. Under the "lexical_subs" feature they have new meanings described in Lexical Subroutines in perlsub.

Defined values stored in environment are forced to byte strings

A value stored in an environment variable has always been stringified. In this release, it is converted to be only a byte string. First, it is forced to be only a string. Then if the string is utf8 and the equivalent of utf8::downgrade() works, that result is used; otherwise, the equivalent of utf8::encode() is used, and a warning is issued about wide characters (Diagnostics).
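
The two cases can be sketched as follows. DEMO_VALUE is a hypothetical variable name, and the lengths in the comments assume the behavior described above.

```perl
use strict;
use warnings;

my @warnings;
my ($latin1_len, $wide_len);
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    local $ENV{DEMO_VALUE} = "";    # hypothetical variable name

    my $latin1 = "caf\x{e9}";
    utf8::upgrade($latin1);         # character string that can be downgraded
    $ENV{DEMO_VALUE} = $latin1;
    $latin1_len = length $ENV{DEMO_VALUE};    # 4 bytes after downgrading

    $ENV{DEMO_VALUE} = "\x{263A}";  # cannot be downgraded, so it is encoded
    $wide_len = length $ENV{DEMO_VALUE};      # 3 bytes of UTF-8
}

print "latin1: $latin1_len bytes, wide: $wide_len bytes\n";
print "warning: $_" for @warnings;
```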

require dies for unreadable files

When require encounters an unreadable file, it now dies. It used to ignore the file and continue searching the directories in @INC [perl #113422].

gv_fetchmeth_* and SUPER

The various gv_fetchmeth_* XS functions used to treat a package whose name ended with ::SUPER specially. A method lookup on the Foo::SUPER package would be treated as a SUPER method lookup on the Foo package. This is no longer the case. To do a SUPER lookup, pass the Foo stash and the GV_SUPER flag.

split's first argument is more consistently interpreted

After some changes earlier in v5.17, split's behavior has been simplified: if the PATTERN argument evaluates to a string containing one space, it is treated the way that a literal string containing one space once was.
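
For instance, a separator held in a variable now triggers the same whitespace-splitting mode as a literal " ". A minimal sketch, assuming v5.18 or later:

```perl
use strict;
use warnings;

my $separator = " ";    # a single space held in a variable
my @fields = split $separator, "  alpha  beta gamma ";

print join("|", @fields), "\n";
```

Leading whitespace is skipped and runs of whitespace count as one separator, so no empty fields are produced.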

Deprecations

Module removals

The following modules will be removed from the core distribution in a future release, and will at that time need to be installed from CPAN. Distributions on CPAN which require these modules will need to list them as prerequisites.

The core versions of these modules will now issue "deprecated" -category warnings to alert you to this fact. To silence these deprecation warnings, install the modules in question from CPAN.

Note that these are (with rare exceptions) fine modules that you are encouraged to continue to use. Their disinclusion from core primarily hinges on their necessity to bootstrapping a fully functional, CPAN-capable Perl installation, not usually on concerns over their design.

Deprecated Utilities

The following utilities will be removed from the core distribution in a future release as their associated modules have been deprecated. They will remain available with the applicable CPAN distribution.

  • cpanp
  • cpanp-run-perl
  • cpan2dist

    These items are part of the CPANPLUS distribution.

  • pod2latex

    This item is part of the Pod::LaTeX distribution.

PL_sv_objcount

This interpreter-global variable used to track the total number of Perl objects in the interpreter. It is no longer maintained and will be removed altogether in Perl v5.20.

Five additional characters should be escaped in patterns with /x

When a regular expression pattern is compiled with /x, Perl treats 6 characters as white space to ignore, such as SPACE and TAB. However, Unicode recommends 11 characters be treated thusly. We will conform with this in a future Perl version. In the meantime, use of any of the missing characters will raise a deprecation warning, unless turned off. The five characters are:

  1. U+0085 NEXT LINE
  2. U+200E LEFT-TO-RIGHT MARK
  3. U+200F RIGHT-TO-LEFT MARK
  4. U+2028 LINE SEPARATOR
  5. U+2029 PARAGRAPH SEPARATOR

User-defined charnames with surprising whitespace

A user-defined character name with trailing or multiple spaces in a row is likely a typo. This now generates a warning when defined, on the assumption that uses of it will be unlikely to include the excess whitespace.

Various XS-callable functions are now deprecated

All the functions used to classify characters will be removed from a future version of Perl, and should not be used. With participating C compilers (e.g., gcc), compiling any file that uses any of these will generate a warning. These were not intended for public use; there are equivalent, faster, macros for most of them.

See Character classes in perlapi. The complete list is:

is_uni_alnum, is_uni_alnumc, is_uni_alnumc_lc, is_uni_alnum_lc, is_uni_alpha, is_uni_alpha_lc, is_uni_ascii, is_uni_ascii_lc, is_uni_blank, is_uni_blank_lc, is_uni_cntrl, is_uni_cntrl_lc, is_uni_digit, is_uni_digit_lc, is_uni_graph, is_uni_graph_lc, is_uni_idfirst, is_uni_idfirst_lc, is_uni_lower, is_uni_lower_lc, is_uni_print, is_uni_print_lc, is_uni_punct, is_uni_punct_lc, is_uni_space, is_uni_space_lc, is_uni_upper, is_uni_upper_lc, is_uni_xdigit, is_uni_xdigit_lc, is_utf8_alnum, is_utf8_alnumc, is_utf8_alpha, is_utf8_ascii, is_utf8_blank, is_utf8_char, is_utf8_cntrl, is_utf8_digit, is_utf8_graph, is_utf8_idcont, is_utf8_idfirst, is_utf8_lower, is_utf8_mark, is_utf8_perl_space, is_utf8_perl_word, is_utf8_posix_digit, is_utf8_print, is_utf8_punct, is_utf8_space, is_utf8_upper, is_utf8_xdigit, is_utf8_xidcont, is_utf8_xidfirst.

In addition, these three functions, which have never worked properly, are deprecated: to_uni_lower_lc, to_uni_title_lc, and to_uni_upper_lc.

Certain rare uses of backslashes within regexes are now deprecated

There are three pairs of characters that Perl recognizes as metacharacters in regular expression patterns: {}, [], and (). These can be used as well to delimit patterns, as in:

    m{foo}
    s(foo)(bar)

Since they are metacharacters, they have special meaning to regular expression patterns, and it turns out that you can't turn off that special meaning by the normal means of preceding them with a backslash, if you use them, paired, within a pattern delimited by them. For example, in

    m{foo\{1,3\}}

the backslashes do not change the behavior, and this matches "fo" followed by one to three more occurrences of "o".

Usages like this, where they are interpreted as metacharacters, are exceedingly rare; we think there are none, for example, in all of CPAN. Hence, this deprecation should affect very little code. It does give notice, however, that any such code needs to change, which will in turn allow us to change the behavior in future Perl versions so that the backslashes do have an effect, and without fear that we are silently breaking any existing code.

Splitting the tokens (? and (* in regular expressions

A deprecation warning is now raised if the ( and ? are separated by white space or comments in (?...) regular expression constructs. Similarly, if the ( and * are separated in (*VERB...) constructs.

Pre-PerlIO IO implementations

In theory, you can currently build perl without PerlIO. Instead, you'd use a wrapper around stdio or sfio. In practice, this isn't very useful. It's not well tested, and without any support for IO layers or (thus) Unicode, it's not much of a perl. Building without PerlIO will most likely be removed in the next version of perl.

PerlIO supports a stdio layer if stdio use is desired. Similarly a sfio layer could be produced in the future, if needed.

Future Deprecations

  • Platforms without support infrastructure

    Both Windows CE and z/OS have been historically under-maintained, and are currently neither successfully building nor regularly being smoke tested. Efforts are underway to change this situation, but it should not be taken for granted that the platforms are safe and supported. If they do not become buildable and regularly smoked, support for them may be actively removed in future releases. If you have an interest in these platforms and you can lend your time, expertise, or hardware to help support these platforms, please let the perl development effort know by emailing perl5-porters@perl.org .

    Some platforms that appear otherwise entirely dead are also on the short list for removal between now and v5.20.0:

    • DG/UX
    • NeXT

    We also think it likely that current versions of Perl will no longer build on AmigaOS, DJGPP, NetWare (natively), OS/2 and Plan 9. If you are using Perl on such a platform and have an interest in ensuring Perl's future on them, please contact us.

    We believe that Perl has long been unable to build on mixed endian architectures (such as PDP-11s), and intend to remove any remaining support code. Similarly, code supporting the long unmaintained GNU dld will be removed soon if no-one makes themselves known as an active user.

  • Swapping of $< and $>

    Perl has supported the idiom of swapping $< and $> (and likewise $( and $)) to temporarily drop permissions since 5.0, like this:

        ($<, $>) = ($>, $<);

    However, this idiom modifies the real user/group id, which can have undesirable side-effects, is no longer useful on any platform perl supports and complicates the implementation of these variables and list assignment in general.

    As an alternative, assignment only to $> is recommended:

        local $> = $<;

    See also: Setuid Demystified.

  • microperl , long broken and of unclear present purpose, will be removed.

  • Revamping "\Q" semantics in double-quotish strings when combined with other escapes.

    There are several bugs and inconsistencies involving combinations of \Q and escapes like \x , \L , etc., within a \Q...\E pair. These need to be fixed, and doing so will necessarily change current behavior. The changes have not yet been settled.

  • Use of $x , where x stands for any actual (non-printing) C0 control character will be disallowed in a future Perl version. Use ${x} instead (where again x stands for a control character), or better, $^A , where ^ is a caret (CIRCUMFLEX ACCENT), and A stands for any of the characters listed at the end of OPERATOR DIFFERENCES in perlebcdic.

Performance Enhancements

  • Lists of lexical variable declarations (my($x, $y) ) are now optimised down to a single op and are hence faster than before.

  • A new C preprocessor define NO_TAINT_SUPPORT was added that, if set, disables Perl's taint support altogether. Using the -T or -t command line flags will cause a fatal error. Beware that both core tests as well as many a CPAN distribution's tests will fail with this change. On the upside, it provides a small performance benefit due to reduced branching.

    Do not enable this unless you know exactly what you are getting yourself into.

  • pack with constant arguments is now constant folded in most cases [perl #113470].

  • Speed up in regular expression matching against Unicode properties. The largest gain is for \X , the Unicode "extended grapheme cluster." The gain for it is about 35% - 40%. Bracketed character classes, e.g., [0-9\x{100}] containing code points above 255 are also now faster.

  • On platforms supporting it, several former macros are now implemented as static inline functions. This should speed things up slightly on non-GCC platforms.

  • The optimisation of hashes in boolean context has been extended to affect scalar(%hash), %hash ? ... : ... , and sub { %hash || ... } .

  • Filetest operators manage the stack in a fractionally more efficient manner.

  • Globs used in a numeric context are now numified directly in most cases, rather than being numified via stringification.

  • The x repetition operator is now folded to a single constant at compile time if called in scalar context with constant operands and no parentheses around the left operand.
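
The folding of the x operator described in the last item can be observed with B::Deparse, which regenerates source text from the compiled op tree. This is a sketch only: the variable names are illustrative, and the exact text that Deparse emits varies between Perl versions.

```perl
use strict;
use warnings;
use B::Deparse;

# The repetition has constant operands and no parentheses around the left
# operand, so it is a candidate for compile-time folding.
my $code = sub { my $line = '-=' x 4; return $line };

# If the expression was folded, the deparsed body contains the finished
# string rather than an "x" expression.
my $text = B::Deparse->new->coderef2text($code);
print $text, "\n";
```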

Modules and Pragmata

New Modules and Pragmata

  • Config::Perl::V version 0.16 has been added as a dual-lifed module. It provides structured data retrieval of perl -V output including information only known to the perl binary and not available via Config.

Updated Modules and Pragmata

For a complete list of updates, run:

    $ corelist --diff 5.16.0 5.18.0

You can substitute your favorite version in place of 5.16.0 , too.
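
The same data is available programmatically through Module::CoreList, the module behind the corelist utility. The sketch below assumes that the installed Module::CoreList knows about the releases queried; the variable names are illustrative.

```perl
use strict;
use warnings;
use Module::CoreList;

# %Module::CoreList::version maps a numeric perl version to the modules
# (and their versions) shipped with that release.
my $carp_516 = $Module::CoreList::version{5.016000}{'Carp'};
my $carp_518 = $Module::CoreList::version{5.018000}{'Carp'};
print "Carp: $carp_516 -> $carp_518\n";

# first_release() reports the perl release in which a module first appeared.
my $first = Module::CoreList->first_release('Config::Perl::V');
print "Config::Perl::V first shipped with perl $first\n";
```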

  • Archive::Extract has been upgraded to 0.68.

    Work around an edge case on Linux with Busybox's unzip.

  • Archive::Tar has been upgraded to 1.90.

    ptar now supports the -T option as well as dashless options [rt.cpan.org #75473], [rt.cpan.org #75475].

    Auto-encode filenames marked as UTF-8 [rt.cpan.org #75474].

    Don't use tell on IO::Zlib handles [rt.cpan.org #64339].

    Don't try to chown on symlinks.

  • autodie has been upgraded to 2.13.

    autodie now plays nicely with the 'open' pragma.

  • B has been upgraded to 1.42.

    The stashoff method of COPs has been added. This provides access to an internal field added in perl 5.16 under threaded builds [perl #113034].

    B::COP::stashpv now supports UTF-8 package names and embedded NULs.

    All CVf_* and GVf_* and more SV-related flag values are now provided as constants in the B:: namespace and available for export. The default export list has not changed.

    This makes the module work with the new pad API.

  • B::Concise has been upgraded to 0.95.

    The -nobanner option has been fixed, and formats can now be dumped. When passed a sub name to dump, it will check also to see whether it is the name of a format. If a sub and a format share the same name, it will dump both.

    This adds support for the new OPpMAYBE_TRUEBOOL and OPpTRUEBOOL flags.

  • B::Debug has been upgraded to 1.18.

    This adds support (experimentally) for B::PADLIST , which was added in Perl 5.17.4.

  • B::Deparse has been upgraded to 1.20.

    Avoid warning when run under perl -w .

    It now deparses loop controls with the correct precedence, and multiple statements in a format line are also now deparsed correctly.

    This release suppresses trailing semicolons in formats.

    This release adds stub deparsing for lexical subroutines.

    It no longer dies when deparsing sort without arguments. It now correctly omits the comma for system $prog @args and exec $prog @args .

  • bignum, bigint and bigrat have been upgraded to 0.33.

    The overrides for hex and oct have been rewritten, eliminating several problems, and making one incompatible change:

    • Formerly, whichever of use bigint or use bigrat was compiled later would take precedence over the other, causing hex and oct not to respect the other pragma when in scope.

    • Using any of these three pragmata would cause hex and oct anywhere else in the program to evaluate their arguments in list context and prevent them from inferring $_ when called without arguments.

    • Using any of these three pragmata would make oct("1234") return 1234 (for any number not beginning with 0) anywhere in the program. Now "1234" is translated from octal to decimal, whether within the pragma's scope or not.

    • The global overrides that facilitate lexical use of hex and oct now respect any existing overrides that were in place before the new overrides were installed, falling back to them outside of the scope of use bignum .

    • use bignum "hex" , use bignum "oct" and similar invocations for bigint and bigrat now export a hex or oct function, instead of providing a global override.

  • Carp has been upgraded to 1.29.

    Carp is no longer confused when caller returns undef for a package that has been deleted.

    The longmess() and shortmess() functions are now documented.

  • CGI has been upgraded to 3.63.

    Unrecognized HTML escape sequences are now handled better, problematic trailing newlines are no longer inserted after <form> tags by startform() or start_form() , and bogus "Insecure Dependency" warnings appearing with some versions of perl are now worked around.

  • Class::Struct has been upgraded to 0.64.

    The constructor now respects overridden accessor methods [perl #29230].

  • Compress::Raw::Bzip2 has been upgraded to 2.060.

    The misuse of Perl's "magic" API has been fixed.

  • Compress::Raw::Zlib has been upgraded to 2.060.

    Upgrade bundled zlib to version 1.2.7.

    Fix build failures on Irix, Solaris, and Win32, and also when building as C++ [rt.cpan.org #69985], [rt.cpan.org #77030], [rt.cpan.org #75222].

    The misuse of Perl's "magic" API has been fixed.

    compress(), uncompress(), memGzip() and memGunzip() have been sped up by making parameter validation more efficient.

  • CPAN::Meta::Requirements has been upgraded to 2.122.

    Treat undef requirements to from_string_hash as 0 (with a warning).

    Added requirements_for_module method.

  • CPANPLUS has been upgraded to 0.9135.

    Allow adding blib/script to PATH.

    Save the history between invocations of the shell.

    Handle multiple makemakerargs and makeflags arguments better.

    This resolves issues with the SQLite source engine.

  • Data::Dumper has been upgraded to 2.145.

    It has been optimized to only build a seen-scalar hash as necessary, thereby speeding up serialization drastically.

    Additional tests were added in order to improve statement, branch, condition and subroutine coverage. On the basis of the coverage analysis, some of the internals of Dumper.pm were refactored. Almost all methods are now documented.

  • DB_File has been upgraded to 1.827.

    The main Perl module no longer uses the "@_" construct.

  • Devel::Peek has been upgraded to 1.11.

    This fixes compilation with C++ compilers and makes the module work with the new pad API.

  • Digest::MD5 has been upgraded to 2.52.

    Fix Digest::Perl::MD5 OO fallback [rt.cpan.org #66634].

  • Digest::SHA has been upgraded to 5.84.

    This fixes a double-free bug, which might have caused vulnerabilities in some cases.

  • DynaLoader has been upgraded to 1.18.

    This is due to a minor code change in the XS for the VMS implementation.

    This fixes warnings about using CODE sections without an OUTPUT section.

  • Encode has been upgraded to 2.49.

    The Mac alias x-mac-ce has been added, and various bugs have been fixed in Encode::Unicode, Encode::UTF7 and Encode::GSM0338.

  • Env has been upgraded to 1.04.

    Its SPLICE implementation no longer misbehaves in list context.

  • ExtUtils::CBuilder has been upgraded to 0.280210.

    Manifest files are now correctly embedded for those versions of VC++ which make use of them. [perl #111782, #111798].

    A list of symbols to export can now be passed to link() when on Windows, as on other OSes [perl #115100].

  • ExtUtils::ParseXS has been upgraded to 3.18.

    The generated C code now avoids unnecessarily incrementing PL_amagic_generation on Perl versions where it's done automatically (or on current Perl where the variable no longer exists).

    This avoids a bogus warning for initialised XSUB non-parameters [perl #112776].

  • File::Copy has been upgraded to 2.26.

    copy() no longer zeros files when copying into the same directory, and also now fails (as it has long been documented to do) when attempting to copy a file over itself.

  • File::DosGlob has been upgraded to 1.10.

    The internal cache of file names that it keeps for each caller is now freed when that caller is freed. This means use File::DosGlob 'glob'; eval 'scalar <*>' no longer leaks memory.

  • File::Fetch has been upgraded to 0.38.

    Added the 'file_default' option for URLs that do not have a file component.

    Use File::HomeDir when available, and provide PERL5_CPANPLUS_HOME to override the autodetection.

    Always re-fetch CHECKSUMS if fetchdir is set.

  • File::Find has been upgraded to 1.23.

    This fixes inconsistent unixy path handling on VMS.

    Individual files may now appear in list of directories to be searched [perl #59750].

  • File::Glob has been upgraded to 1.20.

    File::Glob has had exactly the same fix as File::DosGlob. Since it is what Perl's own glob operator itself uses (except on VMS), this means eval 'scalar <*>' no longer leaks.

    A space-separated list of patterns that returns long lists of results no longer results in memory corruption or crashes. This bug was introduced in Perl 5.16.0. [perl #114984]

  • File::Spec::Unix has been upgraded to 3.40.

    abs2rel could produce incorrect results when given two relative paths or the root directory twice [perl #111510].

  • File::stat has been upgraded to 1.07.

    File::stat ignores the filetest pragma, and warns when used in combination therewith. But it was not warning for -r . This has been fixed [perl #111640].

    -p now works, and does not return false for pipes [perl #111638].

    Previously File::stat 's overloaded -x and -X operators did not give the correct results for directories or executable files when running as root. They had been treating executable permissions for root just like for any other user, performing group membership tests etc for files not owned by root. They now follow the correct Unix behaviour - for a directory they are always true, and for a file if any of the three execute permission bits are set then they report that root can execute the file. Perl's builtin -x and -X operators have always been correct.

  • File::Temp has been upgraded to 0.23.

    Fixes various bugs involving directory removal. Defers unlinking tempfiles if the initial unlink fails, which fixes problems on NFS.

  • GDBM_File has been upgraded to 1.15.

    The undocumented optional fifth parameter to TIEHASH has been removed. This was intended to provide control of the callback used by gdbm* functions in case of fatal errors (such as filesystem problems), but did not work (and could never have worked). No code on CPAN even attempted to use it. The callback is now always the previous default, croak . Problems on some platforms with how the C croak function is called have also been resolved.

  • Hash::Util has been upgraded to 0.15.

    hash_unlocked and hashref_unlocked now return true if the hash is unlocked, instead of always returning false [perl #112126].

    hash_unlocked , hashref_unlocked , lock_hash_recurse and unlock_hash_recurse are now exportable [perl #112126].

    Two new functions, hash_locked and hashref_locked , have been added. Oddly enough, these two functions were already exported, even though they did not exist [perl #112126].

  • HTTP::Tiny has been upgraded to 0.025.

    Add SSL verification features [github #6], [github #9].

    Include the final URL in the response hashref.

    Add local_address option.

    This improves SSL support.

  • IO has been upgraded to 1.28.

    sync() can now be called on read-only file handles [perl #64772].

    IO::Socket tries harder to cache or otherwise fetch socket information.

  • IPC::Cmd has been upgraded to 0.80.

    Use POSIX::_exit instead of exit in run_forked [rt.cpan.org #76901].

  • IPC::Open3 has been upgraded to 1.13.

    The open3() function no longer uses POSIX::close() to close file descriptors since that breaks the ref-counting of file descriptors done by PerlIO in cases where the file descriptors are shared by PerlIO streams, leading to attempts to close the file descriptors a second time when any such PerlIO streams are closed later on.

  • Locale::Codes has been upgraded to 3.25.

    It includes some new codes.

  • Memoize has been upgraded to 1.03.

    Fix the MERGE cache option.

  • Module::Build has been upgraded to 0.4003.

    Fixed bug where modules without $VERSION might have a version of '0' listed in 'provides' metadata, which will be rejected by PAUSE.

    Fixed bug in PodParser to allow numerals in module names.

    Fixed bug where giving arguments twice led to them becoming arrays, resulting in install paths like ARRAY(0xdeadbeef)/lib/Foo.pm.

    A minor bug fix allows markup to be used around the leading "Name" in a POD "abstract" line, and some documentation improvements have been made.

  • Module::CoreList has been upgraded to 2.90.

    Version information is now stored as a delta, which greatly reduces the size of the CoreList.pm file.

    This restores compatibility with older versions of perl and cleans up the corelist data for various modules.

  • Module::Load::Conditional has been upgraded to 0.54.

    Fix use of requires on perls installed to a path with spaces.

    Various enhancements include the new use of Module::Metadata.

  • Module::Metadata has been upgraded to 1.000011.

    The creation of a Module::Metadata object for a typical module file has been sped up by about 40%, and some spurious warnings about $VERSION s have been suppressed.

  • Module::Pluggable has been upgraded to 4.7.

    Amongst other changes, triggers are now allowed on events, which gives a powerful way to modify behaviour.

  • Net::Ping has been upgraded to 2.41.

    This fixes some test failures on Windows.

  • Opcode has been upgraded to 1.25.

    Reflect the removal of the boolkeys opcode and the addition of the clonecv, introcv and padcv opcodes.

  • overload has been upgraded to 1.22.

    no overload now warns for invalid arguments, just like use overload .

  • PerlIO::encoding has been upgraded to 0.16.

    This is the module implementing the ":encoding(...)" I/O layer. It no longer corrupts memory or crashes when the encoding back-end reallocates the buffer or gives it a typeglob or shared hash key scalar.

  • PerlIO::scalar has been upgraded to 0.16.

    The buffer scalar supplied may now only contain code points 0xFF or lower. [perl #109828]

  • Perl::OSType has been upgraded to 1.003.

    This fixes a bug detecting the VOS operating system.

  • Pod::Html has been upgraded to 1.18.

    The option --libpods has been reinstated. It is deprecated, and its use does nothing other than issue a warning that it is no longer supported.

    Since the HTML files generated by pod2html claim to have a UTF-8 charset, actually write the files out using UTF-8 [perl #111446].

  • Pod::Simple has been upgraded to 3.28.

    Numerous improvements have been made, mostly to Pod::Simple::XHTML, which also has a compatibility change: the codes_in_verbatim option is now disabled by default. See cpan/Pod-Simple/ChangeLog for the full details.

  • re has been upgraded to 0.23.

    Single character [class]es like /[s]/ or /[s]/i are now optimized as if they did not have the brackets, i.e. /s/ or /s/i .

    See note about op_comp in the Internal Changes section below.

  • Safe has been upgraded to 2.35.

    Fix interactions with Devel::Cover .

    Don't eval code under no strict .

  • Scalar::Util has been upgraded to version 1.27.

    Fix an overloading issue with sum .

    first and reduce now check the callback first (so &first(1) is disallowed).

    Fix tainted on magical values [rt.cpan.org #55763].

    Fix sum on previously magical values [rt.cpan.org #61118].

    Fix reading past the end of a fixed buffer [rt.cpan.org #72700].

  • Search::Dict has been upgraded to 1.07.

    No longer require stat on filehandles.

    Use fc for casefolding.

  • Socket has been upgraded to 2.009.

    Constants and functions required for IP multicast source group membership have been added.

    unpack_sockaddr_in() and unpack_sockaddr_in6() now return just the IP address in scalar context, and inet_ntop() now guards against incorrect length scalars being passed in.

    This fixes an uninitialized memory read.

  • Storable has been upgraded to 2.41.

    Modifying $_[0] within STORABLE_freeze no longer results in crashes [perl #112358].

    An object whose class implements STORABLE_attach is now thawed only once when there are multiple references to it in the structure being thawed [perl #111918].

    Restricted hashes were not always thawed correctly [perl #73972].

    Storable would croak when freezing a blessed REF object with a STORABLE_freeze() method [perl #113880].

    It can now freeze and thaw vstrings correctly. This causes a slight incompatible change in the storage format, so the format version has increased to 2.9.

    This contains various bugfixes, including compatibility fixes for older versions of Perl and vstring handling.

  • Sys::Syslog has been upgraded to 0.32.

    This contains several bug fixes relating to getservbyname(), setlogsock() and log levels in syslog() , together with fixes for Windows, Haiku-OS and GNU/kFreeBSD. See cpan/Sys-Syslog/Changes for the full details.

  • Term::ANSIColor has been upgraded to 4.02.

    Add support for italics.

    Improve error handling.

  • Term::ReadLine has been upgraded to 1.10. This fixes the use of the cpan and cpanp shells on Windows in the event that the current drive happens to contain a \dev\tty file.

  • Test::Harness has been upgraded to 3.26.

    Fix glob semantics on Win32 [rt.cpan.org #49732].

    Don't use Win32::GetShortPathName when calling perl [rt.cpan.org #47890].

    Ignore -T when reading shebang [rt.cpan.org #64404].

    Handle the case where we don't know the wait status of the test more gracefully.

    Make the test summary 'ok' line overridable so that it can be changed to a plugin to make the output of prove idempotent.

    Don't run world-writable files.

  • Text::Tabs and Text::Wrap have been upgraded to 2012.0818. Support for Unicode combining characters has been added to them both.

  • threads::shared has been upgraded to 1.31.

    This adds the option to warn about or ignore attempts to clone structures that can't be cloned, as opposed to just unconditionally dying in that case.

    This adds support for dual-valued values as created by Scalar::Util::dualvar.

  • Tie::StdHandle has been upgraded to 4.3.

    READ now respects the offset argument to read [perl #112826].

  • Time::Local has been upgraded to 1.2300.

    Seconds values greater than 59 but less than 60 no longer cause timegm() and timelocal() to croak.

  • Unicode::UCD has been upgraded to 0.53.

    This adds a function all_casefolds() that returns all the casefolds.

  • Win32 has been upgraded to 0.47.

    New APIs have been added for getting and setting the current code page.
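
A few of the module changes listed above can be spot-checked with a short script. This is a sketch assuming Perl v5.18 or later with the core Hash::Util and Socket modules; the hash contents, port and address are arbitrary illustrative values.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_hash hash_locked hash_unlocked);
use Socket qw(pack_sockaddr_in unpack_sockaddr_in inet_aton inet_ntoa);

# Hash::Util: hash_unlocked() reports true for an ordinary hash, and
# hash_locked() reports true once the hash has been locked.
my %config = (colour => 'red');
my $unlocked_before = hash_unlocked(%config) ? 1 : 0;
lock_hash(%config);
my $locked_after = hash_locked(%config) ? 1 : 0;

# Socket: in scalar context unpack_sockaddr_in() returns only the address.
my $packed  = pack_sockaddr_in(80, inet_aton('127.0.0.1'));
my $ip_only = unpack_sockaddr_in($packed);
my $dotted  = inet_ntoa($ip_only);

# Core oct(): a string without a leading 0 is still read as octal.
my $octal = oct('1234');    # 1*512 + 2*64 + 3*8 + 4

print "unlocked=$unlocked_before locked=$locked_after ip=$dotted oct=$octal\n";
```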

Removed Modules and Pragmata

Documentation

Changes to Existing Documentation

perlcheat

  • perlcheat has been reorganized, and a few new sections were added.

perldata

  • Now explicitly documents the behaviour of hash initializer lists that contain duplicate keys.

perldiag

  • The explanation of symbolic references being prevented by "strict refs" now doesn't assume that the reader knows what symbolic references are.

perlfaq

  • perlfaq has been synchronized with version 5.0150040 from CPAN.

perlfunc

  • The return value of pipe is now documented.

  • Clarified documentation of our.

perlop

  • Loop control verbs (dump, goto, next, last and redo) have always had the same precedence as assignment operators, but this was not documented until now.

Diagnostics

The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see perldiag.

New Diagnostics

New Errors

New Warnings

Changes to Existing Diagnostics

  • $* is no longer supported

    The warning that use of $* and $# is no longer supported is now generated for every location that references them. Previously it would fail to be generated if another variable using the same typeglob was seen first (e.g. @* before $* ), and would not be generated for the second and subsequent uses. (It's hard to fix the failure to generate warnings at all without also generating them every time, and warning every time is consistent with the warnings that $[ used to generate.)

  • The warnings for \b{ and \B{ were added. They are deprecation warnings, and can be turned off via that category. One should not have to turn off regular regexp warnings as well to get rid of these.

  • Constant(%s): Call to &{$^H{%s}} did not return a defined value

    Constant overloading that returns undef results in this error message. For numeric constants, it used to say "Constant(undef)". "undef" has been replaced with the number itself.

  • The error produced when a module cannot be loaded now includes a hint that the module may need to be installed: "Can't locate hopping.pm in @INC (you may need to install the hopping module) (@INC contains: ...)"

  • vector argument not supported with alpha versions

    This warning was not suppressible, even with no warnings. Now it is suppressible, and has been moved from the "internal" category to the "printf" category.

  • Can't do {n,m} with n > m in regex; marked by <-- HERE in m/%s/

    This fatal error has been turned into a warning that reads:

    Quantifier {n,m} with n > m can't match in regex

    (W regexp) Minima should be less than or equal to maxima. If you really want your regexp to match something 0 times, just put {0}.

  • The "Runaway prototype" warning that occurs in bizarre cases has been removed as being unhelpful and inconsistent.

  • The "Not a format reference" error has been removed, as the only case in which it could be triggered was a bug.

  • The "Unable to create sub named %s" error has been removed for the same reason.

  • The 'Can't use "my %s" in sort comparison' error has been downgraded to a warning, '"my %s" used in sort comparison' (with 'state' instead of 'my' for state variables). In addition, the heuristics for guessing whether lexical $a or $b has been misused have been improved to generate fewer false positives. Lexical $a and $b are no longer disallowed if they are outside the sort block. Also, a named unary or list operator inside the sort block no longer causes the $a or $b to be ignored [perl #86136].
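
Two of the changed diagnostics above can be observed from a script. The sketch assumes Perl v5.18 or later. No::Such::Module::Here is a deliberately nonexistent, made-up module name, and the exact wording of the messages may differ between Perl versions.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Formerly fatal, now a warning: a quantifier whose minimum exceeds its
# maximum. The pattern still compiles; it simply cannot match.
my $pattern  = 'a{3,1}';
my $regex    = eval { qr/$pattern/ };
my $compiled = defined $regex ? 1 : 0;

# A failed require now carries an installation hint in its error message.
my $loaded = eval { require No::Such::Module::Here; 1 };
my $error  = $@;

print "compiled=$compiled\n", @warnings, $error;
```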

Utility Changes

h2xs

  • h2xs no longer produces invalid code for empty defines. [perl #20636]

Configuration and Compilation

  • Added useversionedarchname option to Configure

    When set, it includes 'api_versionstring' in 'archname'. E.g. x86_64-linux-5.13.6-thread-multi. It is unset by default.

    This feature was requested by Tim Bunce, who observed that INSTALL_BASE creates a library structure that does not differentiate by perl version. Instead, it places architecture specific files in "$install_base/lib/perl5/$archname". This makes it difficult to use a common INSTALL_BASE library path with multiple versions of perl.

    By setting -Duseversionedarchname, the $archname will be distinct for architecture and API version, allowing mixed use of INSTALL_BASE.
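
To see what a given perl was built with, the values recorded by Configure can be read through the core Config module. The sketch below is a heuristic, not an official check: it assumes that a versioned archname contains the api_versionstring value as a substring.

```perl
use strict;
use warnings;
use Config;    # core module exposing the values recorded by Configure

my $archname = $Config{archname};
my $api      = $Config{api_versionstring};

# Heuristic only: with -Duseversionedarchname the API version string is
# expected to appear inside archname; without it, normally not.
my $versioned =
    (defined $api && length $api && index($archname, $api) >= 0) ? 1 : 0;

print "archname:          $archname\n";
print "api_versionstring: ", (defined $api ? $api : '(unset)'), "\n";
print "archname looks versioned: ", ($versioned ? 'yes' : 'no'), "\n";
```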

  • Add a PERL_NO_INLINE_FUNCTIONS option

    If PERL_NO_INLINE_FUNCTIONS is defined, "inline.h" is not included.

    This permits test code to include the perl headers for definitions without creating a link dependency on the perl library (which may not exist yet).

  • Configure will honour the external MAILDOMAIN environment variable, if set.

  • installman no longer ignores the silent option

  • Both META.yml and META.json files are now included in the distribution.

  • Configure will now correctly detect isblank() when compiling with a C++ compiler.

  • The pager detection in Configure has been improved to allow responses which specify options after the program name, e.g. /usr/bin/less -R, if the user accepts the default value. This helps perldoc when handling ANSI escapes [perl #72156].

Testing

  • The test suite now has a section for tests that require very large amounts of memory. These tests won't run by default; they can be enabled by setting the PERL_TEST_MEMORY environment variable to the number of gibibytes of memory that may be safely used.

Platform Support

Discontinued Platforms

  • BeOS

    BeOS was an operating system for personal computers developed by Be Inc, initially for their BeBox hardware. The OS Haiku was written as an open source replacement for/continuation of BeOS, and its perl port is current and actively maintained.

  • UTS Global

    Support code relating to UTS global has been removed. UTS was a mainframe version of System V created by Amdahl, subsequently sold to UTS Global. The port has not been touched since before Perl v5.8.0, and UTS Global is now defunct.

  • VM/ESA

    Support for VM/ESA has been removed. The port was tested on 2.3.0, which IBM ended service on in March 2002. 2.4.0 ended service in June 2003, and was superseded by Z/VM. The current version of Z/VM is V6.2.0, and scheduled for end of service on 2015/04/30.

  • MPE/IX

    Support for MPE/IX has been removed.

  • EPOC

    Support code relating to EPOC has been removed. EPOC was a family of operating systems developed by Psion for mobile devices. It was the predecessor of Symbian. The port was last updated in April 2002.

  • Rhapsody

    Support for Rhapsody has been removed.

Platform-Specific Notes

AIX

Configure now always adds -qlanglvl=extc99 to the CC flags on AIX when using xlC. This will make it easier to compile a number of XS-based modules that assume C99 [perl #113778].

clang++

There is now a workaround for a compiler bug that prevented compiling with clang++ since Perl v5.15.7 [perl #112786].

C++

When compiling the Perl core as C++ (which is only semi-supported), the mathom functions are now compiled as extern "C", to ensure proper binary compatibility. (However, binary compatibility isn't generally guaranteed anyway in the situations where this would matter.)

Darwin

Stop hardcoding an alignment on 8 byte boundaries to fix builds using -Dusemorebits.

Haiku

Perl should now work out of the box on Haiku R1 Alpha 4.

MidnightBSD

libc_r was removed from recent versions of MidnightBSD, and older versions work better with pthread. Threading is now enabled using pthread, which corrects build errors with threading enabled on 0.4-CURRENT.

Solaris

In Configure, avoid running sed commands with flags not supported on Solaris.

VMS

  • Where possible, the case of filenames and command-line arguments is now preserved by enabling the CRTL features DECC$EFS_CASE_PRESERVE and DECC$ARGV_PARSE_STYLE at start-up time. The latter only takes effect when extended parse is enabled in the process from which Perl is run.

  • The character set for Extended Filename Syntax (EFS) is now enabled by default on VMS. Among other things, this provides better handling of dots in directory names, multiple dots in filenames, and spaces in filenames. To obtain the old behavior, set the logical name DECC$EFS_CHARSET to DISABLE.

  • Fixed linking on builds configured with -Dusemymalloc=y.

  • Experimental support for building Perl with the HP C++ compiler is available by configuring with -Dusecxx.

  • All C header files from the top-level directory of the distribution are now installed on VMS, providing consistency with a long-standing practice on other platforms. Previously only a subset were installed, which broke non-core extension builds for extensions that depended on the missing include files.

  • Quotes are now removed from the command verb (but not the parameters) for commands spawned via system, backticks, or a piped open. Previously, quotes on the verb were passed through to DCL, which would fail to recognize the command. Also, if the verb is actually a path to an image or command procedure on an ODS-5 volume, quoting it now allows the path to contain spaces.

  • The a2p build has been fixed for the HP C++ compiler on OpenVMS.

Win32

  • Perl can now be built using Microsoft's Visual C++ 2012 compiler by specifying CCTYPE=MSVC110 (or MSVC110FREE if you are using the free Express edition for Windows Desktop) in win32/Makefile.

  • The option to build without USE_SOCKETS_AS_HANDLES has been removed.

  • Fixed a problem where perl could crash while cleaning up threads (including the main thread) in threaded debugging builds on Win32 and possibly other platforms [perl #114496].

  • A rare race condition that would lead to sleep taking more time than requested, and possibly even hanging, has been fixed [perl #33096].

  • link on Win32 now attempts to set $! to more appropriate values based on the Win32 API error code. [perl #112272]

  • Perl no longer mangles the environment block, e.g. when launching a new sub-process, when the environment contains non-ASCII characters. Known problems still remain, however, when the environment contains characters outside of the current ANSI codepage (e.g. see the item about Unicode in %ENV in http://perl5.git.perl.org/perl.git/blob/HEAD:/Porting/todo.pod). [perl #113536]

  • Building perl with some Windows compilers used to fail due to a problem with miniperl's glob operator (which uses the perlglob program) deleting the PATH environment variable [perl #113798].

  • A new makefile option, USE_64_BIT_INT, has been added to the Windows makefiles. Set this to "define" when building a 32-bit perl if you want it to use 64-bit integers.

  • Machine code size reductions, already made to the DLLs of XS modules in Perl v5.17.2, have now been extended to the perl DLL itself.

  • Building with VC++ 6.0 was inadvertently broken in Perl v5.17.2 but has now been fixed again.
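
Whether a particular perl uses 64-bit integers on a 32-bit build (the effect of the USE_64_BIT_INT makefile option mentioned in this list) can be checked from Perl itself. A minimal sketch using the core Config module; the variable names are illustrative:

```perl
use strict;
use warnings;
use Config;

# ivsize and ptrsize are recorded in bytes.
my $iv_bits  = 8 * $Config{ivsize};
my $ptr_bits = 8 * $Config{ptrsize};
my $use64int = defined $Config{use64bitint} ? $Config{use64bitint} : '';

printf "integer width: %d bits, pointer width: %d bits\n", $iv_bits, $ptr_bits;
print "use64bitint: ", ($use64int eq 'define' ? 'define' : 'not set'), "\n";
print "32-bit perl with 64-bit integers\n"
    if $ptr_bits == 32 && $iv_bits == 64;
```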

WinCE

Building on WinCE is now possible once again, although more work is required to fully restore a clean build.

Internal Changes

  • Synonyms for the misleadingly named av_len() have been created: av_top_index() and av_tindex(). All three of these return the number of the highest index in the array, not the number of elements it contains.

  • SvUPGRADE() is no longer an expression. Originally this macro (and its underlying function, sv_upgrade()) were documented as boolean, although in reality they always croaked on error and never returned false. In 2005 the documentation was updated to specify a void return value, but SvUPGRADE() was left always returning 1 for backwards compatibility. This has now been removed, and SvUPGRADE() is now a statement with no return value.

    So this is now a syntax error:

        if (!SvUPGRADE(sv)) { croak(...); }

    If you have code like that, simply replace it with

        SvUPGRADE(sv);

    or to avoid compiler warnings with older perls, possibly

        (void)SvUPGRADE(sv);

  • Perl has a new copy-on-write mechanism that allows any SvPOK scalar to be upgraded to a copy-on-write scalar. A reference count on the string buffer is stored in the string buffer itself. This feature is not enabled by default.

    It can be enabled in a perl build by running Configure with -Accflags=-DPERL_NEW_COPY_ON_WRITE, and we would encourage XS authors to try their code with such an enabled perl, and provide feedback. Unfortunately, there is not yet a good guide to updating XS code to cope with COW. Until such a document is available, consult the perl5-porters mailing list.

    It breaks a few XS modules by allowing copy-on-write scalars to go through code paths that never encountered them before.

  • Copy-on-write no longer uses the SvFAKE and SvREADONLY flags. Hence, SvREADONLY indicates a true read-only SV.

    Use the SvIsCOW macro (as before) to identify a copy-on-write scalar.

  • PL_glob_index is gone.

  • The private Perl_croak_no_modify has had its context parameter removed. It now has a void prototype. Users of the public API croak_no_modify remain unaffected.

  • Copy-on-write (shared hash key) scalars are no longer marked read-only. SvREADONLY returns false on such an SV, but SvIsCOW still returns true.

  • A new op type, OP_PADRANGE has been introduced. The perl peephole optimiser will, where possible, substitute a single padrange op for a pushmark followed by one or more pad ops, and possibly also skipping list and nextstate ops. In addition, the op can carry out the tasks associated with the RHS of a my(...) = @_ assignment, so those ops may be optimised away too.

  • Case-insensitive matching inside a [bracketed] character class with a multi-character fold no longer excludes one of the possibilities in the circumstances that it used to. [perl #89774].

  • PL_formfeed has been removed.

  • The regular expression engine no longer reads one byte past the end of the target string. While for all internally well-formed scalars this should never have been a problem, this change facilitates clever tricks with string buffers in CPAN modules. [perl #73542]

  • Inside a BEGIN block, PL_compcv now points to the currently-compiling subroutine, rather than the BEGIN block itself.

  • mg_length has been deprecated.

  • sv_len now always returns a byte count and sv_len_utf8 a character count. Previously, sv_len and sv_len_utf8 were both buggy and would sometimes return bytes and sometimes characters. sv_len_utf8 no longer assumes that its argument is in UTF-8. Neither of these creates UTF-8 caches for tied or overloaded values or for non-PVs any more.

  • sv_mortalcopy now copies string buffers of shared hash key scalars when called from XS modules [perl #79824].

  • RXf_SPLIT and RXf_SKIPWHITE are no longer used. They are now #defined as 0.

  • The new RXf_MODIFIES_VARS flag can be set by custom regular expression engines to indicate that the execution of the regular expression may cause variables to be modified. This lets s/// know to skip certain optimisations. Perl's own regular expression engine sets this flag for the special backtracking verbs that set $REGMARK and $REGERROR.

  • The APIs for accessing lexical pads have changed considerably.

    PADLISTs are no longer AVs, but their own type instead. PADLISTs now contain a PAD and a PADNAMELIST of PADNAMEs, rather than AVs for the pad and the list of pad names. PADs, PADNAMELISTs, and PADNAMEs are to be accessed as such through the newly added pad API instead of the plain AV and SV APIs. See perlapi for details.

  • In the regex API, the numbered capture callbacks are passed an index indicating what match variable is being accessed. There are special index values for the $`, $&, $' variables. Previously the same three values were used to retrieve ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} too, but these have now been assigned three separate values. See Numbered capture callbacks in perlreapi.

  • PL_sawampersand was previously a boolean indicating that any of $`, $&, $' had been seen; it now contains three one-bit flags indicating the presence of each of the variables individually.

  • The CV * typemap entry now supports &{} overloading and typeglobs, just like &{...} [perl #96872].

  • The SVf_AMAGIC flag to indicate overloading is now on the stash, not the object. It is now set automatically whenever a method or @ISA changes, so its meaning has changed, too. It now means "potentially overloaded". When the overload table is calculated, the flag is automatically turned off if there is no overloading, so there should be no noticeable slowdown.

    The staleness of the overload tables is now checked when overload methods are invoked, rather than during bless.

    "A" magic is gone. The changes to the handling of the SVf_AMAGIC flag eliminate the need for it.

    PL_amagic_generation has been removed as no longer necessary. For XS modules, it is now a macro alias to PL_na.

    The fallback overload setting is now stored in a stash entry separate from overloadedness itself.

  • The character-processing code has been cleaned up in places. The changes should be operationally invisible.

  • The study function was made a no-op in v5.16. It was simply disabled via a return statement; the code was left in place. Now the code supporting what study used to do has been removed.

  • Under threaded perls, there is no longer a separate PV allocated for every COP to store its package name (cop->stashpv). Instead, there is an offset (cop->stashoff) into the new PL_stashpad array, which holds stash pointers.

  • In the pluggable regex API, the regexp_engine struct has acquired a new field op_comp, which is currently just for perl's internal use, and should be initialized to NULL by other regex plugin modules.

  • A new function alloccopstash has been added to the API, but is considered experimental. See perlapi.

  • Perl used to implement get magic in a way that would sometimes hide bugs in code that could call mg_get() too many times on magical values. This hiding of errors no longer occurs, so long-standing bugs may become visible now. If you see magic-related errors in XS code, check to make sure it, together with the Perl API functions it uses, calls mg_get() only once on SvGMAGICAL() values.

  • OP allocation for CVs now uses a slab allocator. This simplifies memory management for OPs allocated to a CV, so cleaning up after a compilation error is simpler and safer [perl #111462][perl #112312].

  • PERL_DEBUG_READONLY_OPS has been rewritten to work with the new slab allocator, allowing it to catch more violations than before.

  • The old slab allocator for ops, which was only enabled for PERL_IMPLICIT_SYS and PERL_DEBUG_READONLY_OPS, has been retired.

Selected Bug Fixes

  • Here document terminators no longer require a terminating newline character when they occur at the end of a file. This was already the case at the end of a string eval [perl #65838].

  • -DPERL_GLOBAL_STRUCT builds now free the global struct after they've finished using it.

  • A trailing '/' on a path in @INC will no longer have an additional '/' appended.

  • The :crlf layer now works when unread data doesn't fit into its own buffer. [perl #112244].

  • ungetc() now handles UTF-8 encoded data. [perl #116322].

  • A bug in the core typemap caused any C types that map to the T_BOOL core typemap entry to not be set, updated, or modified when the T_BOOL variable was used in an OUTPUT: section with an exception for RETVAL. T_BOOL in an INPUT: section was not affected. Using a T_BOOL return type for an XSUB (RETVAL) was not affected. A side effect of fixing this bug is, if a T_BOOL is specified in the OUTPUT: section (which previously did nothing to the SV), and a read only SV (literal) is passed to the XSUB, croaks like "Modification of a read-only value attempted" will happen. [perl #115796]

  • On many platforms, providing a directory name as the script name caused perl to do nothing and report success. It should now universally report an error and exit nonzero. [perl #61362]

  • sort {undef} ... under fatal warnings no longer crashes. It had begun crashing in Perl v5.16.

  • Stashes blessed into each other (bless \%Foo::, 'Bar'; bless \%Bar::, 'Foo') no longer result in double frees. This bug started happening in Perl v5.16.

  • Numerous memory leaks have been fixed, mostly involving fatal warnings and syntax errors.

  • Some failed regular expression matches such as 'f' =~ /../g were not resetting pos. Also, "match-once" patterns (m?...?g) failed to reset it, too, when invoked a second time [perl #23180].

  • Several bugs involving local *ISA and local *Foo:: causing stale MRO caches have been fixed.

  • Defining a subroutine when its typeglob has been aliased no longer results in stale method caches. This bug was introduced in Perl v5.10.

  • Localising a typeglob containing a subroutine when the typeglob's package has been deleted from its parent stash no longer produces an error. This bug was introduced in Perl v5.14.

  • Under some circumstances, local *method=... would fail to reset method caches upon scope exit.

  • /[.foo.]/ is no longer an error, but produces a warning (as before) and is treated as /[.fo]/ [perl #115818].

  • goto $tied_var now calls FETCH before deciding what type of goto (subroutine or label) this is.

  • Renaming packages through glob assignment (*Foo:: = *Bar::; *Bar:: = *Baz::) in combination with m?...? and reset no longer makes threaded builds crash.

  • A number of bugs related to assigning a list to hash have been fixed. Many of these involve lists with repeated keys like (1, 1, 1, 1).

    • The expression scalar(%h = (1, 1, 1, 1)) now returns 4, not 2.

    • The return value of %h = (1, 1, 1) in list context was wrong. Previously this would return (1, undef, 1), now it returns (1, undef).

    • Perl now issues the same warning on ($s, %h) = (1, {}) as it does for (%h) = ({}), "Reference found where even-sized list expected".

    • A number of additional edge cases in list assignment to hashes were corrected. For more details see commit 23b7025ebc.
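
The first two corrected return values in this list can be checked with a short script. This is a sketch only; the "Odd number of elements" warning for the three-element list is silenced purely to keep the demonstration quiet.

```perl
use strict;
use warnings;

my %h;

# Scalar context: the count of right-hand-side elements, duplicates included.
my $count = scalar(%h = (1, 1, 1, 1));

# List context with an odd-sized list: the missing value comes back as undef.
my @returned;
{
    no warnings 'misc';    # silence "Odd number of elements in hash assignment"
    @returned = (%h = (1, 1, 1));
}

print "scalar context: $count\n";
print "list context:   ",
    join(', ', map { defined $_ ? $_ : 'undef' } @returned), "\n";
```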

  • Attributes applied to lexical variables no longer leak memory. [perl #114764]

  • dump, goto, last, next, redo or require followed by a bareword (or version) and then an infix operator is no longer a syntax error. It used to be for those infix operators (like +) that have a different meaning where a term is expected. [perl #105924]

  • require a::b . 1 and require a::b + 1 no longer produce erroneous ambiguity warnings. [perl #107002]

  • Class method calls are now allowed on any string, and not just strings beginning with an alphanumeric character. [perl #105922]

  • An empty pattern created with qr// used in m/// no longer triggers the "empty pattern reuses last pattern" behaviour. [perl #96230]

  • Tying a hash during iteration no longer results in a memory leak.

  • Freeing a tied hash during iteration no longer results in a memory leak.

  • List assignment to a tied array or hash that dies on STORE no longer results in a memory leak.

  • If the hint hash (%^H) is tied, compile-time scope entry (which copies the hint hash) no longer leaks memory if FETCH dies. [perl #107000]

  • Constant folding no longer inappropriately triggers the special split " " behaviour. [perl #94490]

  • defined scalar(@array), defined do { &foo }, and similar constructs now treat the argument to defined as a simple scalar. [perl #97466]

  • Running a custom debugger that defines no *DB::DB glob or provides a subroutine stub for &DB::DB no longer results in a crash, but an error instead. [perl #114990]

  • reset "" now matches its documentation. reset only resets m?...? patterns when called with no argument. An empty string for an argument now does nothing. (It used to be treated as no argument.) [perl #97958]

  • printf with an argument returning an empty list no longer reads past the end of the stack, resulting in erratic behaviour. [perl #77094]

  • --subname no longer produces erroneous ambiguity warnings. [perl #77240]

  • v10 is now allowed as a label or package name. This was inadvertently broken when v-strings were added in Perl v5.6. [perl #56880]

  • length, pos, substr and sprintf could be confused by ties, overloading, references and typeglobs if the stringification of such changed the internal representation to or from UTF-8. [perl #114410]

  • utf8::encode now calls FETCH and STORE on tied variables. utf8::decode now calls STORE (it was already calling FETCH).

  • $tied =~ s/$non_utf8/$utf8/ no longer loops infinitely if the tied variable returns a Latin-1 string, shared hash key scalar, or reference or typeglob that stringifies as ASCII or Latin-1. This was a regression from v5.12.

  • s/// without /e is now better at detecting when it needs to forego certain optimisations, fixing some buggy cases:

    • Match variables in certain constructs (&&, ||, .. and others) in the replacement part; e.g., s/(.)/$l{$a||$1}/g. [perl #26986]

    • Aliases to match variables in the replacement.

    • $REGERROR or $REGMARK in the replacement. [perl #49190]

    • An empty pattern (s//$foo/) that causes the last-successful pattern to be used, when that pattern contains code blocks that modify the variables in the replacement.

  • The taintedness of the replacement string no longer affects the taintedness of the return value of s///e.

  • The $| autoflush variable is created on-the-fly when needed. If this happened (e.g., if it was mentioned in a module or eval) when the currently-selected filehandle was a typeglob with an empty IO slot, it used to crash. [perl #115206]

  • Line numbers at the end of a string eval are no longer off by one. [perl #114658]

  • @INC filters (subroutines returned by subroutines in @INC) that set $_ to a copy-on-write scalar no longer cause the parser to modify that string buffer in place.

  • length($object) no longer returns the undefined value if the object has string overloading that returns undef. [perl #115260]

  • The use of PL_stashcache, the stash name lookup cache for method calls, has been restored.

    Commit da6b625f78f5f133 in August 2011 inadvertently broke the code that looks up values in PL_stashcache. As it's only a cache, quite correctly everything carried on working without it.

  • The error "Can't localize through a reference" had disappeared in v5.16.0 when local %$ref appeared on the last line of an lvalue subroutine. This error disappeared for \local %$ref in perl v5.8.1. It has now been restored.

  • The parsing of here-docs has been improved significantly, fixing several parsing bugs and crashes and one memory leak, and correcting wrong subsequent line numbers under certain conditions.

  • Inside an eval, the error message for an unterminated here-doc no longer has a newline in the middle of it [perl #70836].

  • A substitution inside a substitution pattern (s/${s|||}//) no longer confuses the parser.

  • It may be an odd place to allow comments, but s//"" # hello/e has always worked, unless there happens to be a null character before the first #. Now it works even in the presence of nulls.

  • An invalid range in tr/// or y/// no longer results in a memory leak.

  • String eval no longer treats a semicolon-delimited quote-like operator at the very end (eval 'q;;') as a syntax error.

  • warn {$_ => 1} + 1 is no longer a syntax error. The parser used to get confused with certain list operators followed by an anonymous hash and then an infix operator that shares its form with a unary operator.

  • (caller $n)[6] (which gives the text of the eval) used to return the actual parser buffer. Modifying it could result in crashes. Now it always returns a copy. The string returned no longer has "\n;" tacked on to the end. The returned text also includes here-doc bodies, which used to be omitted.

  • The UTF-8 position cache is now reset when accessing magical variables, to avoid the string buffer and the UTF-8 position cache getting out of sync [perl #114410].

  • Various cases of get magic being called twice for magical UTF-8 strings have been fixed.

  • This code (when not in the presence of $& etc)

        $_ = 'x' x 1_000_000;
        1 while /(.)/;

    used to skip the buffer copy for performance reasons, but suffered from $1 etc changing if the original string changed. That's now been fixed.

  • Perl doesn't use PerlIO anymore to report out of memory messages, as PerlIO might attempt to allocate more memory.

  • In a regular expression, if something is quantified with {n,m} where n > m, it can't possibly match. Previously this was a fatal error, but now is merely a warning (and that something won't match). [perl #82954].

  • It used to be possible for formats defined in subroutines that have subsequently been undefined and redefined to close over variables in the wrong pad (the newly-defined enclosing sub), resulting in crashes or "Bizarre copy" errors.

  • Redefinition of XSUBs at run time could produce warnings with the wrong line number.

  • The %vd sprintf format does not support version objects for alpha versions. It used to output the format itself (%vd) when passed an alpha version, and also emit an "Invalid conversion in printf" warning. It no longer does, but produces the empty string in the output. It also no longer leaks memory in this case.

  • $obj->SUPER::method calls in the main package could fail if the SUPER package had already been accessed by other means.

  • Stash aliasing (*foo:: = *bar::) no longer causes SUPER calls to ignore changes to methods or @ISA or use the wrong package.

  • Method calls on packages whose names end in ::SUPER are no longer treated as SUPER method calls, resulting in failure to find the method. Furthermore, defining subroutines in such packages no longer causes them to be found by SUPER method calls on the containing package [perl #114924].

  • \w now matches the code points U+200C (ZERO WIDTH NON-JOINER) and U+200D (ZERO WIDTH JOINER). \W no longer matches these. This change is because Unicode corrected their definition of what \w should match.
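
A quick way to confirm the \w change on a given perl is shown below. It is a minimal sketch; the sample string "a\x{200D}b" is made up for illustration.

```perl
use strict;
use warnings;

my $zwnj = "\x{200C}";    # ZERO WIDTH NON-JOINER
my $zwj  = "\x{200D}";    # ZERO WIDTH JOINER

my $zwnj_is_word = ($zwnj =~ /\A\w\z/) ? 1 : 0;
my $zwj_is_word  = ($zwj  =~ /\A\w\z/) ? 1 : 0;

# A joiner between letters no longer splits a \w+ run in two.
my $single_run = ("a\x{200D}b" =~ /\A\w+\z/) ? 1 : 0;

print "U+200C matches \\w: $zwnj_is_word\n";
print "U+200D matches \\w: $zwj_is_word\n";
print "a<ZWJ>b is one \\w+ run: $single_run\n";
```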

  • dump LABEL no longer leaks its label.

  • Constant folding no longer changes the behaviour of functions like stat() and truncate() that can take either filenames or handles. stat 1 ? foo : bar now treats its argument as a file name (since it is an arbitrary expression), rather than the handle "foo".

  • truncate FOO, $len no longer falls back to treating "FOO" as a file name if the filehandle has been deleted. This was broken in Perl v5.16.0.

  • Subroutine redefinitions after sub-to-glob and glob-to-glob assignments no longer cause double frees or panic messages.

  • s/// now turns vstrings into plain strings when performing a substitution, even if the resulting string is the same (s/a/a/).

  • Prototype mismatch warnings no longer erroneously treat constant subs as having no prototype when they actually have "".

  • Constant subroutines and forward declarations no longer prevent prototype mismatch warnings from omitting the sub name.

  • undef on a subroutine now clears call checkers.

  • The ref operator started leaking memory on blessed objects in Perl v5.16.0. This has been fixed [perl #114340].

  • use no longer tries to parse its arguments as a statement, which used to make use constant { () }; a syntax error [perl #114222].

  • On debugging builds, "uninitialized" warnings inside formats no longer cause assertion failures.

  • On debugging builds, subroutines nested inside formats no longer cause assertion failures [perl #78550].

  • Formats and use statements are now permitted inside formats.

  • print $x and sub { print $x }->() now always produce the same output. It was possible for the latter to refuse to close over $x if the variable was not active; e.g., if it was defined outside a currently-running named subroutine.

  • Similarly, print $x and print eval '$x' now produce the same output. This also allows "my $x if 0" variables to be seen in the debugger [perl #114018].

  • Formats called recursively no longer stomp on their own lexical variables, but each recursive call has its own set of lexicals.

  • Attempting to free an active format or the handle associated with it no longer results in a crash.

  • Format parsing no longer gets confused by braces, semicolons and low-precedence operators. It used to be possible to use braces as format delimiters (instead of = and .), but only sometimes. Semicolons and low-precedence operators in format argument lines no longer confuse the parser into ignoring the line's return value. In format argument lines, braces can now be used for anonymous hashes, instead of being treated always as do blocks.

  • Formats can now be nested inside code blocks in regular expressions and other quoted constructs (/(?{...})/ and qq/${...}/) [perl #114040].

  • Formats are no longer created after compilation errors.

  • Under debugging builds, the -DA command line option started crashing in Perl v5.16.0. It has been fixed [perl #114368].

  • A potential deadlock scenario involving the premature termination of a pseudo-forked child in a Windows build with ithreads enabled has been fixed. This resolves the common problem of the t/op/fork.t test hanging on Windows [perl #88840].

  • The code which generates errors from require() could potentially read one or two bytes before the start of the filename for filenames less than three bytes long and ending /\.p?\z/. This has now been fixed. Note that it could never have happened with module names given to use() or require() anyway.

  • The handling of pathnames of modules given to require() has been made thread-safe on VMS.

  • Non-blocking sockets have been fixed on VMS.

  • Pod can now be nested in code inside a quoted construct outside of a string eval. This used to work only within string evals [perl #114040].

  • goto '' now looks for an empty label, producing the "goto must have label" error message, instead of exiting the program [perl #111794].

  • goto "\0" now dies with "Can't find label" instead of "goto must have label".

  • The C function hv_store used to result in crashes when used on %^H [perl #111000].

  • A call checker attached to a closure prototype via cv_set_call_checker is now copied to closures cloned from it. So cv_set_call_checker now works inside an attribute handler for a closure.

  • Writing to $^N used to have no effect. Now it croaks with "Modification of a read-only value" by default, but that can be overridden by a custom regular expression engine, as with $1 [perl #112184].

  • undef on a control character glob (undef *^H) no longer emits an erroneous warning about ambiguity [perl #112456].

  • For efficiency's sake, many operators and built-in functions return the same scalar each time. Lvalue subroutines and subroutines in the CORE:: namespace were allowing this implementation detail to leak through. print &CORE::uc("a"), &CORE::uc("b") used to print "BB". The same thing would happen with an lvalue subroutine returning the return value of uc. Now the value is copied in such cases.
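
The CORE::uc case from this entry can be reproduced directly. A minimal sketch; calling core functions through the ampersand syntax requires Perl v5.16 or later.

```perl
use strict;
use warnings;

# Both return values are collected before being joined. If the two calls
# shared one scalar, the first value would be overwritten by the second.
my $joined = join '', &CORE::uc("a"), &CORE::uc("b");

print "$joined\n";
```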

  • method {} syntax with an empty block or a block returning an empty list used to crash or use some random value left on the stack as its invocant. Now it produces an error.

  • vec now works with extremely large offsets (>2 GB) [perl #111730].

  • Changes to overload settings now take effect immediately, as do changes to inheritance that affect overloading. They used to take effect only after bless.

    Objects that were created before a class had any overloading used to remain non-overloaded even if the class gained overloading through use overload or @ISA changes, and even after bless. This has been fixed [perl #112708].
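
The following sketch shows an object picking up overloading that is added after the object was created. The Temperature class and its field are hypothetical, chosen only for illustration; overload->import is called at run time from inside the package so that it applies to that class.

```perl
use strict;
use warnings;

package Temperature;    # hypothetical class, for illustration only

sub new {
    my ($class, $deg) = @_;
    return bless { deg => $deg }, $class;
}

package main;

# The object exists before the class has any overloading.
my $t      = Temperature->new(21);
my $before = "$t";      # default stringification: Temperature=HASH(0x...)

# Add string overloading at run time, from within the class's package.
{
    package Temperature;
    require overload;
    overload->import('""' => sub { $_[0]{deg} . ' C' });
}

my $after = "$t";       # the existing object now uses the overload

print "before: $before\n";
print "after:  $after\n";
```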

  • Classes with overloading can now inherit fallback values.

  • Overloading was not respecting a fallback value of 0 if there were overloaded objects on both sides of an assignment operator like += [perl #111856].

  • pos now croaks with hash and array arguments, instead of producing erroneous warnings.

  • while(each %h) now implies while(defined($_ = each %h)), like readline and readdir.
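
The implied defined test matters for keys that are false as strings, such as "0". A minimal sketch with a made-up hash:

```perl
use strict;
use warnings;

# The key "0" is false in boolean context but defined, so the loop
# must not stop when it is returned.
my %h = (0 => 'zero', 1 => 'one');

my @seen;
while (each %h) {
    push @seen, $_;    # $_ holds the current key
}

print join(',', sort @seen), "\n";
```

Note that $_ is assigned, not localized, so code inside the loop body that changes $_ affects the caller's $_ as well.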

  • Subs in the CORE:: namespace no longer crash after undef *_ when called with no argument list (&CORE::time with no parentheses).

  • unpack no longer produces the "'/' must follow a numeric type in unpack" error when it is the data that are at fault [perl #60204].

  • join and "@array" now call FETCH only once on a tied $" [perl #8931].

  • Some subroutine calls generated by compiling core ops affected by a CORE::GLOBAL override had op checking performed twice. The checking is always idempotent for pure Perl code, but the double checking can matter when custom call checkers are involved.

  • A race condition used to exist around fork that could cause a signal sent to the parent to be handled by both parent and child. Signals are now blocked briefly around fork to prevent this from happening [perl #82580].

  • The implementation of code blocks in regular expressions, such as (?{}) and (??{}) , has been heavily reworked to eliminate a whole slew of bugs. The main user-visible changes are:

    • Code blocks within patterns are now parsed in the same pass as the surrounding code; in particular it is no longer necessary to have balanced braces: this now works:

      /(?{ $x='{' })/

      This means that this error message is no longer generated:

      Sequence (?{...}) not terminated or not {}-balanced in regex

      but a new error may be seen:

      Sequence (?{...}) not terminated with ')'

      In addition, literal code blocks within run-time patterns are only compiled once, at perl compile-time:

      for my $p (...) {
          # this 'FOO' block of code is compiled once,
          # at the same time as the surrounding 'for' loop
          /$p{(?{FOO;})/;
      }

    • Lexical variables are now sane as regards scope, recursion and closure behavior. In particular, /A(?{B})C/ behaves (from a closure viewpoint) exactly like /A/ && do { B } && /C/ , while qr/A(?{B})C/ is like sub {/A/ && do { B } && /C/} . So this code now works how you might expect, creating three regexes that match 0, 1, and 2:

      for my $i (0..2) {
          push @r, qr/^(??{$i})$/;
      }
      "1" =~ $r[1]; # matches

    • The use re 'eval' pragma is now only required for code blocks defined at runtime; in particular in the following, the text of the $r pattern is still interpolated into the new pattern and recompiled, but the individual compiled code-blocks within $r are reused rather than being recompiled, and use re 'eval' isn't needed any more:

      my $r = qr/abc(?{....})def/;
      /xyz$r/;

    • Flow control operators no longer crash. Each code block runs in a new dynamic scope, so next etc. will not see any enclosing loops. return returns a value from the code block, not from any enclosing subroutine.

    • Perl normally caches the compilation of run-time patterns, and doesn't recompile if the pattern hasn't changed, but this is now disabled if required for the correct behavior of closures. For example:

      my $code = '(??{$x})';
      for my $x (1..3) {
          # recompile to see fresh value of $x each time
          $x =~ /$code/;
      }

    • The /msix and (?msix) etc. flags are now propagated into the return value from (??{}) ; this now works:

      "AB" =~ /a(??{'b'})/i;

    • Warnings and errors will appear to come from the surrounding code (or for run-time code blocks, from an eval) rather than from an re_eval :

      use re 'eval'; $c = '(?{ warn "foo" })'; /$c/;
      /(?{ warn "foo" })/;

      formerly gave:

      foo at (re_eval 1) line 1.
      foo at (re_eval 2) line 1.

      and now gives:

      foo at (eval 1) line 1.
      foo at /some/prog line 2.

  • Perl now can be recompiled to use any Unicode version. In v5.16, it worked on Unicodes 6.0 and 6.1, but there were various bugs if earlier releases were used; the older the release the more problems.

  • vec no longer produces "uninitialized" warnings in lvalue context [perl #9423].

  • An optimization involving fixed strings in regular expressions could cause a severe performance penalty in edge cases. This has been fixed [perl #76546].

  • In certain cases, including empty subpatterns within a regular expression (such as (?:) or (?:|) ) could disable some optimizations. This has been fixed.

  • The "Can't find an opnumber" message that prototype produces when passed a string like "CORE::nonexistent_keyword" now passes UTF-8 and embedded NULs through unchanged [perl #97478].

  • prototype now treats magical variables like $1 the same way as non-magical variables when checking for the CORE:: prefix, instead of treating them as subroutine names.

  • Under threaded perls, a runtime code block in a regular expression could corrupt the package name stored in the op tree, resulting in bad reads in caller, and possibly crashes [perl #113060].

  • Referencing a closure prototype (\&{$_[1]} in an attribute handler for a closure) no longer results in a copy of the subroutine (or assertion failures on debugging builds).

  • eval '__PACKAGE__' now returns the right answer on threaded builds if the current package has been assigned over (as in *ThisPackage:: = *ThatPackage:: ) [perl #78742].

  • If a package is deleted by code that it calls, it is possible for caller to see a stack frame belonging to that deleted package. caller could crash if the stash's memory address was reused for a scalar and a substitution was performed on the same scalar [perl #113486].

  • UNIVERSAL::can no longer treats its first argument differently depending on whether it is a string or number internally.

  • open with <& for the mode checks to see whether the third argument is a number, in determining whether to treat it as a file descriptor or a handle name. Magical variables like $1 were always failing the numeric check and being treated as handle names.

  • warn's handling of magical variables ($1 , ties) has undergone several fixes. FETCH is only called once now on a tied argument or a tied $@ [perl #97480]. Tied variables returning objects that stringify as "" are no longer ignored. A tied $@ that happened to return a reference the previous time it was used is no longer ignored.

  • warn "" now treats $@ with a number in it the same way, regardless of whether it happened via $@=3 or $@="3" . It used to ignore the former. Now it appends "\t...caught", as it has always done with $@="3" .

  • Numeric operators on magical variables (e.g., $1 + 1 ) used to use floating point operations even where integer operations were more appropriate, resulting in loss of accuracy on 64-bit platforms [perl #109542].

  • Unary negation no longer treats a string as a number if the string happened to be used as a number at some point. So, if $x contains the string "dogs", -$x returns "-dogs" even if $y=0+$x has happened at some point.

  • In Perl v5.14, -'-10' was fixed to return "10", not "+10". But magical variables ($1 , ties) were not fixed till now [perl #57706].

  • Unary negation now treats strings consistently, regardless of the internal UTF8 flag.

  • A regression introduced in Perl v5.16.0 involving tr/SEARCHLIST/REPLACEMENTLIST/ has been fixed. Only the first instance is supposed to be meaningful if a character appears more than once in SEARCHLIST. Under some circumstances, the final instance was overriding all earlier ones. [perl #113584]

  • Regular expressions like qr/\87/ previously silently inserted a NUL character, thus matching as if it had been written qr/\00087/. Now it matches as if it had been written as qr/87/, with a message that the sequence "\8" is unrecognized.

  • __SUB__ now works in special blocks (BEGIN , END , etc.).

  • Thread creation on Windows could theoretically result in a crash if done inside a BEGIN block. It still does not work properly, but it no longer crashes [perl #111610].

  • \&{''} (with the empty string) now autovivifies a stub like any other sub name, and no longer produces the "Unable to create sub" error [perl #94476].

  • A regression introduced in v5.14.0 has been fixed, in which some calls to the re module would clobber $_ [perl #113750].

  • do FILE now always either sets or clears $@ , even when the file can't be read. This ensures that testing $@ first (as recommended by the documentation) always returns the correct result.

  • The array iterator used for the each @array construct is now correctly reset when @array is cleared [perl #75596]. This happens, for example, when the array is globally assigned to, as in @array = (...) , but not when its values are assigned to. In terms of the XS API, it means that av_clear() will now reset the iterator.

    This mirrors the behaviour of the hash iterator when the hash is cleared.

  • $class->can , $class->isa , and $class->DOES now return correct results, regardless of whether that package referred to by $class exists [perl #47113].

  • Arriving signals no longer clear $@ [perl #45173].

  • Allow my () declarations with an empty variable list [perl #113554].

  • During parsing, subs declared after errors no longer leave stubs [perl #113712].

  • Closures containing no string evals no longer hang on to their containing subroutines, allowing variables closed over by outer subroutines to be freed when the outer sub is freed, even if the inner sub still exists [perl #89544].

  • Duplication of in-memory filehandles by opening with a "<&=" or ">&=" mode stopped working properly in v5.16.0. It was causing the new handle to reference a different scalar variable. This has been fixed [perl #113764].

  • qr// expressions no longer crash with custom regular expression engines that do not set offs at regular expression compilation time [perl #112962].

  • delete local no longer crashes with certain magical arrays and hashes [perl #112966].

  • local on elements of certain magical arrays and hashes used not to arrange to have the element deleted on scope exit, even if the element did not exist before local.

  • scalar(write) no longer returns multiple items [perl #73690].

  • String to floating point conversions no longer misparse certain strings under use locale [perl #109318].

  • @INC filters that die no longer leak memory [perl #92252].

  • The implementations of overloaded operations are now called in the correct context. This allows, among other things, being able to properly override <> [perl #47119].

  • Specifying only the fallback key when calling use overload now behaves properly [perl #113010].

  • sub foo { my $a = 0; while ($a) { ... } } and sub foo { while (0) { ... } } now return the same thing [perl #73618].

  • String negation now behaves the same under use integer; as it does without [perl #113012].

  • chr now returns the Unicode replacement character (U+FFFD) for -1, regardless of the internal representation. -1 used to wrap if the argument was tied or a string internally.

  • Using a format after its enclosing sub was freed could crash as of perl v5.12.0, if the format referenced lexical variables from the outer sub.

  • Using a format after its enclosing sub was undefined could crash as of perl v5.10.0, if the format referenced lexical variables from the outer sub.

  • Using a format defined inside a closure, which format references lexical variables from outside, never really worked unless the write call was directly inside the closure. In v5.10.0 it even started crashing. Now the copy of that closure nearest the top of the call stack is used to find those variables.

  • Formats that close over variables in special blocks no longer crash if a stub exists with the same name as the special block before the special block is compiled.

  • The parser no longer gets confused, treating eval foo () as a syntax error if preceded by print; [perl #16249].

  • The return value of syscall is no longer truncated on 64-bit platforms [perl #113980].

  • Constant folding no longer causes print 1 ? FOO : BAR to print to the FOO handle [perl #78064].

  • do subname now calls the named subroutine and uses the file name it returns, instead of opening a file named "subname".

  • Subroutines looked up by rv2cv check hooks (registered by XS modules) are now taken into consideration when determining whether foo bar should be the sub call foo(bar) or the method call "bar"->foo .

  • CORE::foo::bar is no longer treated specially, allowing global overrides to be called directly via CORE::GLOBAL::uc(...) [perl #113016].

  • Calling an undefined sub whose typeglob has been undefined now produces the customary "Undefined subroutine called" error, instead of "Not a CODE reference".

  • Two bugs involving @ISA have been fixed. *ISA = *glob_without_array and undef *ISA; @{*ISA} would prevent future modifications to @ISA from updating the internal caches used to look up methods. The *glob_without_array case was a regression from Perl v5.12.

  • Regular expression optimisations sometimes caused $ with /m to produce failed or incorrect matches [perl #114068].

  • __SUB__ now works in a sort block when the enclosing subroutine is predeclared with sub foo; syntax [perl #113710].

  • Unicode properties only apply to Unicode code points, which leads to some subtleties when regular expressions are matched against above-Unicode code points. There is a warning generated to draw your attention to this. However, this warning was being generated inappropriately in some cases, such as when a program was being parsed. Non-Unicode matches such as \w and [:word:] should not generate the warning, as their definitions don't limit them to apply to only Unicode code points. Now the message is only generated when matching against \p{} and \P{} . There remains a bug, [perl #114148], for the very few properties in Unicode that match just a single code point. The warning is not generated if they are matched against an above-Unicode code point.

  • Uninitialized warnings mentioning hash elements would only mention the element name if it was not in the first bucket of the hash, due to an off-by-one error.

  • A regular expression optimizer bug could cause multiline "^" to behave incorrectly in the presence of line breaks, such that "/\n\n" =~ m#\A(?:^/$)#im would not match [perl #115242].

  • Failed fork in list context no longer corrupts the stack. @a = (1, 2, fork, 3) used to gobble up the 2 and assign (1, undef, 3) if the fork call failed.

  • Numerous memory leaks have been fixed, mostly involving tied variables that die, regular expression character classes and code blocks, and syntax errors.

  • Assigning a regular expression (${qr//} ) to a variable that happens to hold a floating point number no longer causes assertion failures on debugging builds.

  • Assigning a regular expression to a scalar containing a number no longer causes subsequent numification to produce random numbers.

  • Assigning a regular expression to a magic variable no longer wipes away the magic. This was a regression from v5.10.

  • Assigning a regular expression to a blessed scalar no longer results in crashes. This was also a regression from v5.10.

  • Regular expressions can now be assigned to tied hash and array elements without flattening into strings.

  • Numifying a regular expression no longer results in an uninitialized warning.

  • Negative array indices no longer cause EXISTS methods of tied variables to be ignored. This was a regression from v5.12.

  • Negative array indices no longer result in crashes on arrays tied to non-objects.

  • $byte_overload .= $utf8 no longer results in doubly-encoded UTF-8 if the left-hand scalar happened to have produced a UTF-8 string the last time overloading was invoked.

  • goto &sub now uses the current value of @_, instead of using the array the subroutine was originally called with. This means local @_ = (...); goto &sub now works [perl #43077].

  • If a debugger is invoked recursively, it no longer stomps on its own lexical variables. Formerly under recursion all calls would share the same set of lexical variables [perl #115742].

  • *_{ARRAY} returned from a subroutine no longer spontaneously becomes empty.

  • When using say to print to a tied filehandle, the value of $\ is correctly localized, even if it was previously undef. [perl #119927]
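
Several of the fixes listed above are easy to observe from plain Perl. The following is a minimal sketch, assuming perl v5.18.0 or later; the variable names (%h, @keys_seen, @array, $negated and so on) are illustrative and do not come from the release notes.

```perl
use strict;
use warnings;

# Sketch only: assumes perl v5.18.0 or later. All names are illustrative.

# 1. A bare each in a while condition now assigns to $_ and tests
#    definedness, so a false key such as "0" no longer ends the loop early.
my %h = (0 => 'zero');
my @keys_seen;
while (each %h) {
    push @keys_seen, $_;
}

# 2. Clearing an array resets its each() iterator.
my @array = (10, 20, 30);
my @first = each @array;           # advance the iterator past index 0
@array = (40, 50, 60);             # list assignment clears the array first
my ($index, $value) = each @array; # iteration starts again at index 0

# 3. Unary minus on a non-numeric string stays a string operation,
#    even after the string has been used in numeric context.
my $x = "dogs";
{ no warnings 'numeric'; my $y = 0 + $x; }
my $negated = -$x;

# 4. Each qr// closes over the value of $i current when it was created.
my @r;
for my $i (0 .. 2) {
    push @r, qr/^(??{$i})$/;
}
my $one_matches_r1 = "1" =~ $r[1] ? 1 : 0;
my $one_matches_r0 = "1" =~ $r[0] ? 1 : 0;

# 5. Flags such as /i propagate into the pattern returned by (??{}).
my $flag_propagated = "AB" =~ /a(??{'b'})/i ? 1 : 0;

print "keys seen: @keys_seen\n";
print "each after clear: index=$index value=$value\n";
print "negated: $negated\n";
print "r1 matches '1': $one_matches_r1, r0 matches '1': $one_matches_r0\n";
print "flag propagated: $flag_propagated\n";
```

On releases before v5.18.0 some of these results would be expected to differ, which makes a script like this a quick check when porting code between versions.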

Known Problems

  • UTF8-flagged strings in %ENV on HP-UX 11.00 are buggy

    The interaction of UTF8-flagged strings and %ENV on HP-UX 11.00 is currently dodgy in some not-yet-fully-diagnosed way. Expect test failures in t/op/magic.t, followed by unknown behavior when storing wide characters in the environment.

Obituary

Hojung Yoon (AMORETTE), 24, of Seoul, South Korea, went to his long rest on May 8, 2013 with llama figurine and autographed TIMTOADY card. He was a brilliant young Perl 5 & 6 hacker and a devoted member of Seoul.pm. He programmed Perl, talked Perl, ate Perl, and loved Perl. We believe that he is still programming in Perl with his broken IBM laptop somewhere. He will be missed.

Acknowledgements

Perl v5.18.0 represents approximately 12 months of development since Perl v5.16.0 and contains approximately 400,000 lines of changes across 2,100 files from 113 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl v5.18.0:

Aaron Crane, Aaron Trevena, Abhijit Menon-Sen, Adrian M. Enache, Alan Haggai Alavi, Alexandr Ciornii, Andrew Tam, Andy Dougherty, Anton Nikishaev, Aristotle Pagaltzis, Augustina Blair, Bob Ernst, Brad Gilbert, Breno G. de Oliveira, Brian Carlson, Brian Fraser, Charlie Gonzalez, Chip Salzenberg, Chris 'BinGOs' Williams, Christian Hansen, Colin Kuskie, Craig A. Berry, Dagfinn Ilmari Mannsåker, Daniel Dragan, Daniel Perrett, Darin McBride, Dave Rolsky, David Golden, David Leadbeater, David Mitchell, David Nicol, Dominic Hargreaves, E. Choroba, Eric Brine, Evan Miller, Father Chrysostomos, Florian Ragwitz, François Perrad, George Greer, Goro Fuji, H.Merijn Brand, Herbert Breunung, Hugo van der Sanden, Igor Zaytsev, James E Keenan, Jan Dubois, Jasmine Ahuja, Jerry D. Hedden, Jess Robinson, Jesse Luehrs, Joaquin Ferrero, Joel Berger, John Goodyear, John Peacock, Karen Etheridge, Karl Williamson, Karthik Rajagopalan, Kent Fredric, Leon Timmermans, Lucas Holt, Lukas Mai, Marcus Holland-Moritz, Markus Jansen, Martin Hasch, Matthew Horsfall, Max Maischein, Michael G Schwern, Michael Schroeder, Moritz Lenz, Nicholas Clark, Niko Tyni, Oleg Nesterov, Patrik Hägglund, Paul Green, Paul Johnson, Paul Marquess, Peter Martini, Rafael Garcia-Suarez, Reini Urban, Renee Baecker, Rhesa Rozendaal, Ricardo Signes, Robin Barker, Ronald J. Kimball, Ruslan Zakirov, Salvador Fandiño, Sawyer X, Scott Lanning, Sergey Alekseev, Shawn M Moore, Shirakata Kentaro, Shlomi Fish, Sisyphus, Smylers, Steffen Müller, Steve Hay, Steve Peters, Steven Schubiger, Sullivan Beck, Sven Strickroth, Sébastien Aperghis-Tramoni, Thomas Sibley, Tobias Leich, Tom Wyant, Tony Cook, Vadim Konovalov, Vincent Pit, Volker Schatz, Walt Mankowski, Yves Orton, Zefram.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

perl5181delta

NAME

perl5181delta - what is new for perl v5.18.1

DESCRIPTION

This document describes differences between the 5.18.0 release and the 5.18.1 release.

If you are upgrading from an earlier release such as 5.16.0, first read perl5180delta, which describes differences between 5.16.0 and 5.18.0.

Incompatible Changes

There are no changes intentionally incompatible with 5.18.0. If any exist, they are bugs, and we request that you submit a report. See Reporting Bugs below.

Modules and Pragmata

Updated Modules and Pragmata

  • B has been upgraded from 1.42 to 1.42_01, fixing bugs related to lexical subroutines.

  • Digest::SHA has been upgraded from 5.84 to 5.84_01, fixing a crashing bug. [RT #118649]

  • Module::CoreList has been upgraded from 2.89 to 2.96.

Platform Support

Platform-Specific Notes

  • AIX

    A rarely encountered configuration bug in the AIX hints file has been corrected.

  • MidnightBSD

    After a patch to the relevant hints file, perl should now build correctly on MidnightBSD 0.4-RELEASE.

Selected Bug Fixes

  • Starting in v5.18.0, a construct like /[#](?{})/x would have its # incorrectly interpreted as a comment. The code block would be skipped, unparsed. This has been corrected.

  • A number of memory leaks related to the new, experimental regexp bracketed character class feature have been plugged.

  • The OP allocation code now returns correctly aligned memory in all cases for struct pmop . Previously it could return memory only aligned to a 4-byte boundary, which is not correct for an ithreads build with 64 bit IVs on some 32 bit platforms. Notably, this caused the build to fail completely on sparc GNU/Linux. [RT #118055]

  • The debugger's man command has been fixed. It was broken in the v5.18.0 release. The man command is aliased to the names doc and perldoc; all now work again.

  • @_ is now correctly visible in the debugger, fixing a regression introduced in v5.18.0's debugger. [RT #118169]

  • Fixed a small number of regexp constructions that could either fail to match or crash perl when the string being matched against was allocated above the 2GB line on 32-bit systems. [RT #118175]

  • Perl v5.16 inadvertently introduced a bug whereby calls to XSUBs that were not visible at compile time were treated as lvalues and could be assigned to, even when the subroutine was not an lvalue sub. This has been fixed. [perl #117947]

  • Perl v5.18 inadvertently introduced a bug whereby the truthiness of dual-vars (i.e. variables with both string and numeric values, such as $! ) was determined by the numeric value rather than the string value. This has been fixed. [RT #118159]

  • Perl v5.18 inadvertently introduced a bug whereby interpolating mixed up- and down-graded UTF-8 strings in a regex could result in malformed UTF-8 in the pattern: specifically if a downgraded character in the range \x80..\xff followed a UTF-8 string, e.g.

    utf8::upgrade(  my $u = "\x{e5}");
    utf8::downgrade(my $d = "\x{e5}");
    /$u$d/

    [perl #118297].

  • Lexical constants (my sub a() { 42 } ) no longer crash when inlined.

  • Parameter prototypes attached to lexical subroutines are now respected when compiling sub calls without parentheses. Previously, the prototypes were honoured only for calls with parentheses. [RT #116735]

  • Syntax errors in lexical subroutines in combination with calls to the same subroutines no longer cause crashes at compile time.

  • The dtrace sub-entry probe now works with lexical subs, instead of crashing [perl #118305].

  • Undefining an inlinable lexical subroutine (my sub foo() { 42 } undef &foo ) would result in a crash if warnings were turned on.

  • Deep recursion warnings no longer crash lexical subroutines. [RT #118521]
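
Three of the regressions above can be checked with a short script. This is a sketch under the assumption of perl v5.18.1 or later; the variable names are hypothetical, and dualvar comes from the core Scalar::Util module rather than from the release notes.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# Sketch only: assumes perl v5.18.1 or later. All names are illustrative.

# 1. Under /x, a "#" inside a bracketed character class is a literal,
#    so the code block that follows it is parsed and executed.
my $block_ran = 0;
my $hash_matched = "#" =~ /[#](?{ $block_ran = 1 })/x ? 1 : 0;

# 2. The truth of a dual-var follows its string value, not its number.
my $true_dv  = dualvar(0, "yes");  # numerically 0, but a non-empty string
my $false_dv = dualvar(1, "");     # numerically 1, but an empty string
my $true_is_true   = $true_dv  ? 1 : 0;
my $false_is_false = $false_dv ? 0 : 1;

# 3. Interpolating an upgraded and a downgraded copy of the same
#    character into one pattern yields a well-formed pattern.
utf8::upgrade(  my $u = "\x{e5}");
utf8::downgrade(my $d = "\x{e5}");
my $mixed_matched = "\x{e5}\x{e5}" =~ /\A$u$d\z/ ? 1 : 0;

print "hash matched: $hash_matched, block ran: $block_ran\n";
print "dualvar(0, 'yes') is true: $true_is_true\n";
print "dualvar(1, '') is false: $false_is_false\n";
print "mixed UTF-8 pattern matched: $mixed_matched\n";
```

The lexical-subroutine fixes are left out on purpose: the feature was experimental in this release series and needs extra pragmas to enable.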

Acknowledgements

Perl 5.18.1 represents approximately 2 months of development since Perl 5.18.0 and contains approximately 8,400 lines of changes across 60 files from 12 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.18.1:

Chris 'BinGOs' Williams, Craig A. Berry, Dagfinn Ilmari Mannsåker, David Mitchell, Father Chrysostomos, Karl Williamson, Lukas Mai, Nicholas Clark, Peter Martini, Ricardo Signes, Shlomi Fish, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl5182delta

NAME

perl5182delta - what is new for perl v5.18.2

DESCRIPTION

This document describes differences between the 5.18.1 release and the 5.18.2 release.

If you are upgrading from an earlier release such as 5.18.0, first read perl5181delta, which describes differences between 5.18.0 and 5.18.1.

Modules and Pragmata

Updated Modules and Pragmata

  • B has been upgraded from version 1.42_01 to 1.42_02.

    The fix for [perl #118525] introduced a regression in the behaviour of B::CV::GV , changing the return value from a B::SPECIAL object on a NULL CvGV to undef. B::CV::GV again returns a B::SPECIAL object in this case. [perl #119413]

  • B::Concise has been upgraded from version 0.95 to 0.95_01.

    This fixes a bug in dumping unexpected SPECIALs.

  • English has been upgraded from version 1.06 to 1.06_01. This fixes an error about the performance of $` , $& , and $' .

  • File::Glob has been upgraded from version 1.20 to 1.20_01.

Documentation

Changes to Existing Documentation

  • perlrepository has been restored with a pointer to more useful pages.

  • perlhack has been updated with the latest changes from blead.

Selected Bug Fixes

  • Perl 5.18.1 introduced a regression along with a bugfix for lexical subs. Some B::SPECIAL results from B::CV::GV became undefs instead. This broke Devel::Cover among other libraries. This has been fixed. [perl #119351]

  • Perl 5.18.0 introduced a regression whereby [:^ascii:] , if used in the same character class as other qualifiers, would fail to match characters in the Latin-1 block. This has been fixed. [perl #120799]

  • Perl 5.18.0 introduced a regression when using ->SUPER::method with AUTOLOAD by looking up AUTOLOAD from the current package, rather than the current package’s superclass. This has been fixed. [perl #120694]

  • Perl 5.18.0 introduced a regression whereby -bareword was no longer permitted under the strict and integer pragmata when used together. This has been fixed. [perl #120288]

  • Previously PerlIOBase_dup didn't check if pushing the new layer succeeded before (optionally) setting the utf8 flag. This could cause segfaults-by-nullpointer. This has been fixed.

  • A buffer overflow with very long identifiers has been fixed.

  • A regression from 5.16 in the handling of padranges led to assertion failures if a keyword plugin declined to handle the second ‘my’, but only after creating a padop.

    This affected, at least, Devel::CallParser under threaded builds.

    This has been fixed.

  • The construct $r=qr/.../; /$r/p is now handled properly, an issue which had been worsened by changes in 5.18.0. [perl #118213]
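
The user-visible fixes in this list can be exercised as follows. This is a minimal sketch, assuming perl v5.18.2 or later; the package names My::Base and My::Derived and all variable names are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Sketch only: assumes perl v5.18.2 or later. All names are illustrative.

# 1. [:^ascii:] combined with another element in the same class
#    still matches a character in the Latin-1 block.
my $latin1_matched = "\x{e9}" =~ /[[:^ascii:]\d]/ ? 1 : 0;

# 2. ->SUPER::method falls back to the AUTOLOAD of the superclass,
#    not the AUTOLOAD of the current package.
{
    package My::Base;
    sub AUTOLOAD { return "My::Base::AUTOLOAD" }

    package My::Derived;
    our @ISA = ('My::Base');
    sub AUTOLOAD { return "My::Derived::AUTOLOAD" }
    sub greet {
        my $class = shift;
        return $class->SUPER::greet();   # My::Base defines no greet
    }
}
my $handler = My::Derived->greet;

# 3. -bareword is permitted again with strict and integer both in effect.
my $negated_bareword;
{
    use integer;
    $negated_bareword = -bareword;
}

# 4. /p applies when the pattern is an interpolated qr// object.
my $r = qr/b+/;
my ($pre, $match, $post) = ('', '', '');
if ("aabbcc" =~ /$r/p) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "Latin-1 matched: $latin1_matched\n";
print "SUPER call handled by: $handler\n";
print "negated bareword: $negated_bareword\n";
print "pre=$pre match=$match post=$post\n";
```

The internal fixes (PerlIOBase_dup, the identifier buffer overflow, padranges) have no simple pure-Perl reproduction and are not shown.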

Acknowledgements

Perl 5.18.2 represents approximately 3 months of development since Perl 5.18.1 and contains approximately 980 lines of changes across 39 files from 4 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.18.2:

Craig A. Berry, David Mitchell, Ricardo Signes, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl561delta

NAME

perl561delta - what's new for perl v5.6.1

DESCRIPTION

This document describes differences between the 5.005 release and the 5.6.1 release.

Summary of changes between 5.6.0 and 5.6.1

This section contains a summary of the changes between the 5.6.0 release and the 5.6.1 release. More details about the changes mentioned here may be found in the Changes files that accompany the Perl source distribution. See perlhack for pointers to online resources where you can inspect the individual patches described by these changes.

Security Issues

suidperl will not run /bin/mail anymore, because some platforms have a /bin/mail that is vulnerable to buffer overflow attacks.

Note that suidperl is neither built nor installed by default in any recent version of perl. Use of suidperl is highly discouraged. If you think you need it, try alternatives such as sudo first. See http://www.courtesan.com/sudo/ .

Core bug fixes

This is not an exhaustive list. It is intended to cover only the significant user-visible changes.

  • UNIVERSAL::isa()

    A bug in the caching mechanism used by UNIVERSAL::isa() that affected base.pm has been fixed. The bug has existed since the 5.005 releases, but wasn't tickled by base.pm in those releases.

  • Memory leaks

    Various cases of memory leaks and attempts to access uninitialized memory have been cured. See Known Problems below for further issues.

  • Numeric conversions

    Numeric conversions did not recognize changes in the string value properly in certain circumstances.

    In other situations, large unsigned numbers (those above 2**31) could sometimes lose their unsignedness, causing bogus results in arithmetic operations.

    Integer modulus on large unsigned integers sometimes returned incorrect values.

    Perl 5.6.0 generated "not a number" warnings on certain conversions where previous versions didn't.

    These problems have all been rectified.

    Infinity is now recognized as a number.

  • qw(a\\b)

    In Perl 5.6.0, qw(a\\b) produced a string with two backslashes instead of one, in a departure from the behavior in previous versions. The older behavior has been reinstated.

  • caller()

    caller() could cause core dumps in certain situations. Carp was sometimes affected by this problem.

  • Bugs in regular expressions

    Pattern matches on overloaded values are now handled correctly.

    Perl 5.6.0 parsed m/\x{ab}/ incorrectly, leading to spurious warnings. This has been corrected.

    The RE engine found in Perl 5.6.0 accidentally pessimised certain kinds of simple pattern matches. These are now handled better.

    Regular expression debug output (whether through use re 'debug' or via -Dr ) now looks better.

    Multi-line matches like "a\nxb\n" =~ /(?!\A)x/m were flawed. The bug has been fixed.

    Use of $& could trigger a core dump under some situations. This is now avoided.

    Match variables $1 et al., weren't being unset when a pattern match was backtracking, and the anomaly showed up inside /...(?{ ... }).../ etc. These variables are now tracked correctly.

    pos() did not return the correct value within s///ge in earlier versions. This is now handled correctly.

  • "slurp" mode

    readline() on files opened in "slurp" mode could return an extra "" at the end in certain situations. This has been corrected.

  • Autovivification of symbolic references to special variables

    Autovivification of symbolic references of special variables described in perlvar (as in ${$num} ) was accidentally disabled. This works again now.

  • Lexical warnings

    Lexical warnings now propagate correctly into eval "..." .

    use warnings qw(FATAL all) did not work as intended. This has been corrected.

    Lexical warnings could leak into other scopes in some situations. This is now fixed.

    warnings::enabled() now reports the state of $^W correctly if the caller isn't using lexical warnings.

  • Spurious warnings and errors

    Perl 5.6.0 could emit spurious warnings about redefinition of dl_error() when statically building extensions into perl. This has been corrected.

    "our" variables could result in bogus "Variable will not stay shared" warnings. This is now fixed.

    "our" variables of the same name declared in two sibling blocks resulted in bogus warnings about "redeclaration" of the variables. The problem has been corrected.

  • glob()

    Compatibility of the builtin glob() with old csh-based glob has been improved with the addition of GLOB_ALPHASORT option. See File::Glob .

    File::Glob::glob() has been renamed to File::Glob::bsd_glob() because the name clashes with the builtin glob(). The older name is still available for compatibility, but is deprecated.

    Spurious syntax errors generated in certain situations, when glob() caused File::Glob to be loaded for the first time, have been fixed.

  • Tainting

    Some cases of inconsistent taint propagation (such as within hash values) have been fixed.

    The tainting behavior of sprintf() has been rationalized. It does not taint the result of floating point formats anymore, making the behavior consistent with that of string interpolation.

  • sort()

    Arguments to sort() weren't being provided the right wantarray() context. The comparison block is now run in scalar context, and the arguments to be sorted are always provided list context.

    sort() is also fully reentrant, in the sense that the sort function can itself call sort(). This did not work reliably in previous releases.

  • #line directives

    #line directives now work correctly when they appear at the very beginning of eval "..." .

  • Subroutine prototypes

    The (\&) prototype now works properly.

  • map()

    map() could get pathologically slow when the result list it generates is larger than the source list. The performance has been improved for common scenarios.

  • Debugger

    Debugger exit code now reflects the script exit code.

    Condition "0" in breakpoints is now treated correctly.

    The d command now checks the line number.

    $. is no longer corrupted by the debugger.

    All debugger output now correctly goes to the socket if RemotePort is set.

  • PERL5OPT

    PERL5OPT can be set to more than one switch group. Previously, it used to be limited to one group of options only.

  • chop()

    chop(@list) in list context returned the characters chopped in reverse order. This has been reversed to be in the right order.

  • Unicode support

    Unicode support has seen a large number of incremental improvements, but continues to be highly experimental. It is not expected to be fully supported in the 5.6.x maintenance releases.

    substr(), join(), repeat(), reverse(), quotemeta() and string concatenation were all handling Unicode strings incorrectly in Perl 5.6.0. This has been corrected.

    Support for tr///CU and tr///UC etc. has been removed since we realized the interface is broken. For similar functionality, see pack.

    The Unicode Character Database has been updated to version 3.0.1 with additions made available to the public as of August 30, 2000.

    The Unicode character classes \p{Blank} and \p{SpacePerl} have been added. "Blank" is like C isblank(), that is, it contains only "horizontal whitespace" (the space character is, the newline isn't), and the "SpacePerl" is the Unicode equivalent of \s (\p{Space} isn't, since that includes the vertical tabulator character, whereas \s doesn't.)

    If you are experimenting with Unicode support in perl, the development versions of Perl may have more to offer. In particular, I/O layers are now available in the development track, but not in the maintenance track, primarily due to backward compatibility issues. Unicode support is also evolving rapidly on a daily basis in the development track--the maintenance track only reflects the most conservative of these changes.

  • 64-bit support

    Support for 64-bit platforms has been improved, but continues to be experimental. The level of support varies greatly among platforms.

  • Compiler

    The B Compiler and its various backends have had many incremental improvements, but they continue to remain highly experimental. Use in production environments is discouraged.

    The perlcc tool has been rewritten so that the user interface is much more like that of a C compiler.

    The perlbc tool has been removed. Use perlcc -B instead.

  • Lvalue subroutines

    There have been various bugfixes to support lvalue subroutines better. However, the feature still remains experimental.

  • IO::Socket

    IO::Socket::INET failed to open the specified port if the service name was not known. It now correctly uses the supplied port number as is.

  • File::Find

    File::Find now chdir()s correctly when chasing symbolic links.

  • xsubpp

    xsubpp now tolerates embedded POD sections.

  • no Module;

    no Module; does not produce an error even if Module does not have an unimport() method. This parallels the behavior of use vis-a-vis import.

  • Tests

    A large number of tests have been added.
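As a small illustration of the qw(a\\b) item in the list above, the following sketch checks that a doubled backslash inside qw() yields a single literal backslash, as it does in a single-quoted string. The variable names are only illustrative.

```perl
use strict;
use warnings;

# Inside qw(), "\\" stands for one literal backslash, as in q().
my ($word) = qw(a\\b);
my $single_quoted = 'a\\b';

printf "length=%d matches_single_quoted=%s\n",
    length($word), ($word eq $single_quoted ? "yes" : "no");
```

The word has three characters: "a", one backslash, and "b".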

Core features

untie() will now call an UNTIE() hook if it exists. See perltie for details.

The -DT command line switch outputs copious tokenizing information. See perlrun.

Arrays are now always interpolated in double-quotish strings. Previously, "foo@bar.com" used to be a fatal error at compile time, if an array @bar was not used or declared. This transitional behavior was intended to help migrate perl4 code, and is deemed to be no longer useful. See Arrays now always interpolate into double-quoted strings.

keys(), each(), pop(), push(), shift(), splice() and unshift() can all be overridden now.

my __PACKAGE__ $obj now does the expected thing.
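The UNTIE() hook mentioned above can be sketched as follows. The LoggedScalar class and its @events log are hypothetical names made up for this example; see perltie for the authoritative description of the hook.

```perl
use strict;
use warnings;

package LoggedScalar;    # hypothetical tie class, for illustration only

our @events;             # records which hooks were called

sub TIESCALAR {
    my ($class, $value) = @_;
    push @events, 'tie';
    return bless { value => $value }, $class;
}
sub FETCH { return $_[0]{value} }
sub STORE { $_[0]{value} = $_[1]; return }
sub UNTIE { push @events, 'untie'; return }    # called by untie()

package main;

tie my $scalar, 'LoggedScalar', 42;
my $seen = $scalar;      # goes through FETCH
untie $scalar;           # triggers the UNTIE hook

print join(',', @LoggedScalar::events), "\n";
```

If the class defines no UNTIE method, untie() simply skips the hook.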

Configuration issues

On some systems (IRIX and Solaris among them) the system malloc is demonstrably better. While the defaults haven't been changed in order to retain binary compatibility with earlier releases, you may be better off building perl with Configure -Uusemymalloc ... as discussed in the INSTALL file.

Configure has been enhanced in various ways:

  • Minimizes use of temporary files.

  • By default, does not link perl with libraries not used by it, such as the various dbm libraries. SunOS 4.x hints preserve behavior on that platform.

  • Support for pdp11-style memory models has been removed due to obsolescence.

  • Building outside the source tree is supported on systems that have symbolic links. This is done by running

        sh /path/to/source/Configure -Dmksymlinks ...
        make all test install

    in a directory other than the perl source directory. See INSTALL.

  • Configure -S can be run non-interactively.

Documentation

README.aix, README.solaris and README.macos have been added. README.posix-bc has been renamed to README.bs2000. These are installed as perlaix, perlsolaris, perlmacos, and perlbs2000 respectively.

The following pod documents are brand new:

    perlclib        Internal replacements for standard C library functions
    perldebtut      Perl debugging tutorial
    perlebcdic      Considerations for running Perl on EBCDIC platforms
    perlnewmod      Perl modules: preparing a new module for distribution
    perlrequick     Perl regular expressions quick start
    perlretut       Perl regular expressions tutorial
    perlutil        utilities packaged with the Perl distribution

The INSTALL file has been expanded to cover various issues, such as 64-bit support.

A longer list of contributors has been added to the source distribution. See the file AUTHORS .

Numerous other changes have been made to the included documentation and FAQs.

Bundled modules

The following modules have been added.

  • B::Concise

    Walks Perl syntax tree, printing concise info about ops. See B::Concise.

  • File::Temp

    Returns name and handle of a temporary file safely. See File::Temp.

  • Pod::LaTeX

    Converts Pod data to formatted LaTeX. See Pod::LaTeX.

  • Pod::Text::Overstrike

    Converts POD data to formatted overstrike text. See Pod::Text::Overstrike.

The following modules have been upgraded.

  • CGI

    CGI v2.752 is now included.

  • CPAN

    CPAN v1.59_54 is now included.

  • Class::Struct

    Various bugfixes have been added.

  • DB_File

    DB_File v1.75 supports newer Berkeley DB versions, among other improvements.

  • Devel::Peek

    Devel::Peek has been enhanced to support dumping of memory statistics, when perl is built with the included malloc().

  • File::Find

    File::Find now supports pre and post-processing of the files in order to sort() them, etc.

  • Getopt::Long

    Getopt::Long v2.25 is included.

  • IO::Poll

    Various bug fixes have been included.

  • IPC::Open3

    IPC::Open3 allows use of numeric file descriptors.

  • Math::BigFloat

    The fmod() function supports modulus operations. Various bug fixes have also been included.

  • Math::Complex

    Math::Complex handles inf, NaN etc., better.

  • Net::Ping

    ping() could fail on odd number of data bytes, and when the echo service isn't running. This has been corrected.

  • Opcode

    A memory leak has been fixed.

  • Pod::Parser

    Version 1.13 of the Pod::Parser suite is included.

  • Pod::Text

    Pod::Text and related modules have been upgraded to the versions in podlators suite v2.08.

  • SDBM_File

    On dosish platforms, some keys went missing because of lack of support for files with "holes". A workaround for the problem has been added.

  • Sys::Syslog

    Various bug fixes have been included.

  • Tie::RefHash

    Now supports Tie::RefHash::Nestable to automagically tie hashref values.

  • Tie::SubstrHash

    Various bug fixes have been included.
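The File::Find entry in the list above mentions pre-processing of directory entries. A minimal sketch of that, assuming the documented preprocess option and a throwaway directory created with File::Temp (the file names are invented for the example):

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Create a scratch directory with three empty files, deliberately out of order.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(c.txt a.txt b.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Can't create '$dir/$name': $!";
    close $fh;
}

my @visited;
find(
    {
        # preprocess receives the entries of the current directory and
        # returns the list that wanted() should be called on, in order.
        preprocess => sub { return sort @_ },
        wanted     => sub { push @visited, $_ if -f $_ },
    },
    $dir,
);

print "@visited\n";
```

Because the entries are sorted before wanted() runs, the files are visited in alphabetical order regardless of the order readdir() returns them.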

Platform-specific improvements

The following new ports are now available.

  • NCR MP-RAS
  • NonStop-UX

Perl now builds under Amdahl UTS.

Perl has also been verified to build under Amiga OS.

Support for EPOC has been much improved. See README.epoc.

Building perl with -Duseithreads or -Duse5005threads now works under HP-UX 10.20 (previously it only worked under 10.30 or later). You will need a thread library package installed. See README.hpux.

Long doubles should now work under Linux.

Mac OS Classic is now supported in the mainstream source package. See README.macos.

Support for MPE/iX has been updated. See README.mpeix.

Support for OS/2 has been improved. See os2/Changes and README.os2.

Dynamic loading on z/OS (formerly OS/390) has been improved. See README.os390.

Support for VMS has seen many incremental improvements, including better support for operators like backticks and system(), and better %ENV handling. See README.vms and perlvms.

Support for Stratus VOS has been improved. See vos/Changes and README.vos.

Support for Windows has been improved.

  • fork() emulation has been improved in various ways, but still continues to be experimental. See perlfork for known bugs and caveats.

  • %SIG has been enabled under USE_ITHREADS, but its use is completely unsupported under all configurations.

  • Borland C++ v5.5 is now a supported compiler that can build Perl. However, the generated binaries continue to be incompatible with those generated by the other supported compilers (GCC and Visual C++).

  • Non-blocking waits for child processes (or pseudo-processes) are supported via waitpid($pid, &POSIX::WNOHANG) .

  • A memory leak in accept() has been fixed.

  • wait(), waitpid() and backticks now return the correct exit status under Windows 9x.

  • Trailing new %ENV entries weren't propagated to child processes. This is now fixed.

  • Current directory entries in %ENV are now correctly propagated to child processes.

  • Duping socket handles with open(F, ">&MYSOCK") now works under Windows 9x.

  • The makefiles now provide a single switch to bulk-enable all the features enabled in ActiveState ActivePerl (a popular binary distribution).

  • Win32::GetCwd() correctly returns C:\ instead of C: when at the drive root. Other bugs in chdir() and Cwd::cwd() have also been fixed.

  • fork() correctly returns undef and sets EAGAIN when it runs out of pseudo-process handles.

  • ExtUtils::MakeMaker now uses $ENV{LIB} to search for libraries.

  • UNC path handling is better when perl is built to support fork().

  • A handle leak in socket handling has been fixed.

  • send() works from within a pseudo-process.
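The non-blocking wait mentioned in the list above can be sketched like this. The example polls a child with waitpid() and WNOHANG; the exit status 7 and the polling intervals are arbitrary choices for the illustration.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: pause briefly, then exit with a recognizable status.
    select(undef, undef, undef, 0.2);
    POSIX::_exit(7);
}

# Parent: poll without blocking until the child has been reaped.
my $polls = 0;
my $exit_status;
while (1) {
    my $reaped = waitpid($pid, WNOHANG);
    die "waitpid failed: $!" if $reaped == -1;
    if ($reaped == $pid) {
        $exit_status = $? >> 8;    # high byte of $? holds the exit code
        last;
    }
    $polls++;
    select(undef, undef, undef, 0.05);    # do other work here instead
}

print "child exited with status $exit_status after $polls polls\n";
```

While the child is still running, waitpid() returns 0 immediately, so the parent is free to do other work between polls.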

Unless specifically qualified otherwise, the remainder of this document covers changes between the 5.005 and 5.6.0 releases.

Core Enhancements

Interpreter cloning, threads, and concurrency

Perl 5.6.0 introduces the beginnings of support for running multiple interpreters concurrently in different threads. In conjunction with the perl_clone() API call, which can be used to selectively duplicate the state of any given interpreter, it is possible to compile a piece of code once in an interpreter, clone that interpreter one or more times, and run all the resulting interpreters in distinct threads.

On the Windows platform, this feature is used to emulate fork() at the interpreter level. See perlfork for details about that.

This feature is still in evolution. It is eventually meant to be used to selectively clone a subroutine and data reachable from that subroutine in a separate interpreter and run the cloned subroutine in a separate thread. Since there is no shared data between the interpreters, little or no locking will be needed (unless parts of the symbol table are explicitly shared). This is obviously intended to be an easy-to-use replacement for the existing threads support.

Support for cloning interpreters and interpreter concurrency can be enabled using the -Dusethreads Configure option (see win32/Makefile for how to enable it on Windows.) The resulting perl executable will be functionally identical to one that was built with -Dusemultiplicity, but the perl_clone() API call will only be available in the former.

-Dusethreads enables the cpp macro USE_ITHREADS by default, which in turn enables Perl source code changes that provide a clear separation between the op tree and the data it operates with. The former is immutable, and can therefore be shared between an interpreter and all of its clones, while the latter is considered local to each interpreter, and is therefore copied for each clone.

Note that building Perl with the -Dusemultiplicity Configure option is adequate if you wish to run multiple independent interpreters concurrently in different threads. -Dusethreads only provides the additional functionality of the perl_clone() API call and other support for running cloned interpreters concurrently.

    NOTE: This is an experimental feature.  Implementation details are
    subject to change.

Lexically scoped warning categories

You can now control the granularity of warnings emitted by perl at a finer level using the use warnings pragma. warnings and perllexwarn have copious documentation on this feature.
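A minimal sketch of the finer granularity: one warning category is switched off inside a block while it stays enabled outside. The __WARN__ handler is only there so the example can count the warnings that get through.

```perl
use strict;
use warnings;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $undef;
{
    # Only the 'uninitialized' category is disabled, and only in this block.
    no warnings 'uninitialized';
    my $quiet = "value: " . $undef;
}
my $loud = "value: " . $undef;    # warns: the category is enabled again here

printf "%d warning(s) caught\n", scalar @caught;
```

Only the concatenation outside the block produces a warning.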

Unicode and UTF-8 support

Perl now uses UTF-8 as its internal representation for character strings. The utf8 and bytes pragmas are used to control this support in the current lexical scope. See perlunicode, utf8 and bytes for more information.
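As a rough illustration of the lexical scoping of the bytes pragma, the sketch below measures the same one-character string with character semantics and, inside a small block, with byte semantics.

```perl
use strict;
use warnings;

my $smiley = "\x{263A}";    # one character, WHITE SMILING FACE

my $characters = length $smiley;
my $octets     = do { use bytes; length $smiley };    # byte semantics in this block only

print "characters=$characters octets=$octets\n";
```

The character U+263A occupies three octets in UTF-8, so the two lengths differ.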

This feature is expected to evolve quickly to support some form of I/O disciplines that can be used to specify the kind of input and output data (bytes or characters). Until that happens, additional modules from CPAN will be needed to complete the toolkit for dealing with Unicode.

    NOTE: This should be considered an experimental feature.  Implementation
    details are subject to change.

Support for interpolating named characters

The new \N escape interpolates named characters within strings. For example, "Hi! \N{WHITE SMILING FACE}" evaluates to a string with a Unicode smiley face at the end.
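A short sketch of the escape in use. The charnames pragma is loaded explicitly here, which older perls require for \N{...} and newer ones tolerate.

```perl
use strict;
use warnings;
use charnames ':full';

my $greeting  = "Hi! \N{WHITE SMILING FACE}";
my $last_char = substr($greeting, -1);

printf "length=%d last=U+%04X\n", length($greeting), ord($last_char);
```

The named character is a single character with code point U+263A, so the string is five characters long.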

"our" declarations

An "our" declaration introduces a value that can be best understood as a lexically scoped symbolic alias to a global variable in the package that was current where the variable was declared. This is mostly useful as an alternative to the vars pragma, but also provides the opportunity to introduce typing and other attributes for such variables. See our.
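The "lexically scoped alias" wording can be illustrated with a small sketch. The package name Counter is hypothetical; the point is that the short name stays usable after a later package statement, while still referring to the variable in the package where it was declared.

```perl
use strict;
use warnings;

package Counter;    # hypothetical package, for illustration only

our $total = 0;     # short name for $Counter::total
sub bump { return ++$total }

package main;

Counter::bump() for 1 .. 3;

my $via_full_name = $Counter::total;
my $via_alias     = $total;    # still $Counter::total: the alias follows lexical scope

print "full=$via_full_name alias=$via_alias\n";
```

Both names read the same package variable.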

Support for strings represented as a vector of ordinals

Literals of the form v1.2.3.4 are now parsed as a string composed of characters with the specified ordinals. This is an alternative, more readable way to construct (possibly Unicode) strings instead of interpolating characters, as in "\x{1}\x{2}\x{3}\x{4}" . The leading v may be omitted if there are more than two ordinals, so 1.2.3 is parsed the same as v1.2.3 .

Strings written in this form are also useful to represent version "numbers". It is easy to compare such version "numbers" (which are really just plain strings) using any of the usual string comparison operators eq , ne , lt , gt , etc., or perform bitwise string operations on them using |, & , etc.

In conjunction with the new $^V magic variable (which contains the perl version as a string), such literals can be used as a readable way to check if you're running a particular version of Perl:

    # this will parse in older versions of Perl also
    if ($^V and $^V gt v5.6.0) {
        # new features supported
    }

require and use also have some special magic to support such literals. They will be interpreted as a version rather than as a module name:

    require v5.6.0;     # croak if $^V lt v5.6.0
    use v5.6.0;         # same, but croaks at compile-time

Alternatively, the v may be omitted if there is more than one dot:

    require 5.6.0;
    use 5.6.0;

Also, sprintf and printf support the Perl-specific format flag %v to print ordinals of characters in arbitrary strings:

    printf "v%vd", $^V;         # prints current version, such as "v5.5.650"
    printf "%*vX", ":", $addr;  # formats IPv6 address
    printf "%*vb", " ", $bits;  # displays bitstring

See Scalar value constructors in perldata for additional information.
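The pieces described in this section can be tried together in a short sketch. The address and version literals are arbitrary examples.

```perl
use strict;
use warnings;

my $address = v192.168.0.1;    # four characters with these ordinals

my $dotted = sprintf "%vd", $address;         # ordinals joined with "."
my $colons = sprintf "%*vX", ":", $address;   # hex ordinals, custom join string

# Character-by-character string comparison orders version strings sensibly:
# chr(10) sorts after chr(6), which plain text "5.10.0" vs "5.6.0" would not.
my $newer = (v5.10.0 gt v5.6.0) ? 1 : 0;

print "$dotted $colons length=", length($address), " newer=$newer\n";
```

Here $dotted is "192.168.0.1" and $colons is "C0:A8:0:1".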

Improved Perl version numbering system

Beginning with Perl version 5.6.0, the version number convention has been changed to a "dotted integer" scheme that is more commonly found in open source projects.

Maintenance versions of v5.6.0 will be released as v5.6.1, v5.6.2 etc. The next development series following v5.6.0 will be numbered v5.7.x, beginning with v5.7.0, and the next major production release following v5.6.0 will be v5.8.0.

The English module now sets $PERL_VERSION to $^V (a string value) rather than $] (a numeric value). (This is a potential incompatibility. Send us a report via perlbug if you are affected by this.)

The v1.2.3 syntax is also now legal in Perl. See Support for strings represented as a vector of ordinals for more on that.

To cope with the new versioning system's use of at least three significant digits for each version component, the method used for incrementing the subversion number has also changed slightly. We assume that versions older than v5.6.0 have been incrementing the subversion component in multiples of 10. Versions after v5.6.0 will increment them by 1. Thus, using the new notation, 5.005_03 is the "same" as v5.5.30, and the first maintenance version following v5.6.0 will be v5.6.1 (which should be read as being equivalent to a floating point value of 5.006_001 in the older format, stored in $] ).
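The correspondence between the dotted form and the older floating point form can be worked through with a small helper. The function name old_style_number is made up for this sketch; it pads each of the last two components to three digits, as the paragraph above describes.

```perl
use strict;
use warnings;

# Hypothetical helper: render revision.version.subversion in the
# older floating point notation, three digits per component.
sub old_style_number {
    my ($revision, $version, $subversion) = @_;
    return sprintf "%d.%03d%03d", $revision, $version, $subversion;
}

print old_style_number(5, 6, 1), "\n";     # v5.6.1
print old_style_number(5, 5, 30), "\n";    # v5.5.30, i.e. 5.005_03
```

The first call gives 5.006001, and the second gives 5.005030, which is numerically the same value as 5.00503.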

New syntax for declaring subroutine attributes

Formerly, if you wanted to mark a subroutine as being a method call or as requiring an automatic lock() when it is entered, you had to declare that with a use attrs pragma in the body of the subroutine. That can now be accomplished with declaration syntax, like this:

    sub mymethod : locked method;
    ...
    sub mymethod : locked method {
        ...
    }
    sub othermethod :locked :method;
    ...
    sub othermethod :locked :method {
        ...
    }

(Note how only the first : is mandatory, and whitespace surrounding the : is optional.)

AutoSplit.pm and SelfLoader.pm have been updated to keep the attributes with the stubs they provide. See attributes.
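A minimal sketch of the declaration syntax, using only the method attribute and reading it back with attributes::get(). The Shape package and its area() method are hypothetical; the locked attribute is left out here because its handling differs between Perl versions.

```perl
use strict;
use warnings;
use attributes ();

package Shape;    # hypothetical class, for illustration only

# The attribute is part of the declaration; no pragma in the body is needed.
sub area :method {
    my ($self) = @_;
    return $self->{width} * $self->{height};
}

package main;

my @attrs = attributes::get(\&Shape::area);
my $rect  = bless { width => 3, height => 4 }, 'Shape';

print "attributes: @attrs; area: ", $rect->area, "\n";
```

attributes::get() returns the list of attributes attached to the referenced subroutine.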

File and directory handles can be autovivified

Similar to how constructs such as $x->[0] autovivify a reference, handle constructors (open(), opendir(), pipe(), socketpair(), sysopen(), socket(), and accept()) now autovivify a file or directory handle if the handle passed to them is an uninitialized scalar variable. This allows constructs such as open(my $fh, ...) and open(local $fh,...) to be used to create filehandles that will conveniently be closed automatically when the scope ends, provided there are no other references to them. This largely eliminates the need for typeglobs when opening filehandles that must be passed around, as in the following example:

    sub myopen {
        open my $fh, "@_"
            or die "Can't open '@_': $!";
        return $fh;
    }
    {
        my $f = myopen("</etc/motd");
        print <$f>;
        # $f implicitly closed here
    }

open() with more than two arguments

If open() is passed three arguments instead of two, the second argument is used as the mode and the third argument is taken to be the file name. This is primarily useful for protecting against unintended magic behavior of the traditional two-argument form. See open.
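The three-argument form combines naturally with autovivified lexical handles. The sketch below writes and re-reads a scratch file; the use of File::Temp to obtain a safe file name is an assumption of the example, not something this section prescribes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Obtain a scratch file name; the file is removed when the program exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

{
    # Mode and file name are separate arguments, so characters such as
    # '>' or '|' in $path are never interpreted as part of the mode.
    open(my $out, '>', $path) or die "Can't write '$path': $!";
    print {$out} "hello\n";
}   # $out goes out of scope here and is closed automatically

open(my $in, '<', $path) or die "Can't read '$path': $!";
my $line = <$in>;
close $in;

print $line;
```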

64-bit support

Any platform that has 64-bit integers either

    (1) natively as longs or ints
    (2) via special compiler flags
    (3) using long long or int64_t

is able to use "quads" (64-bit integers) as follows:

  • constants (decimal, hexadecimal, octal, binary) in the code

  • arguments to oct() and hex()

  • arguments to print(), printf() and sprintf() (flag prefixes ll, L, q)

  • printed as such

  • pack() and unpack() "q" and "Q" formats

  • in basic arithmetics: + - * / % (NOTE: operating close to the limits of the integer values may produce surprising results)

  • in bit arithmetics: & | ^ ~ << >> (NOTE: these used to be forced to be 32 bits wide but now operate on the full native width.)

  • vec()

Note that unless you have the case (1) you will have to configure and compile Perl using the -Duse64bitint Configure flag.

    NOTE: The Configure flags -Duselonglong and -Duse64bits have been
    deprecated.  Use -Duse64bitint instead.

There are actually two modes of 64-bitness: the first one is achieved using Configure -Duse64bitint and the second one using Configure -Duse64bitall. The difference is that the first one is minimal and the second one maximal. The first works in more places than the second.

The use64bitint option does only as much as is required to get 64-bit integers into Perl (this may mean, for example, using "long longs") while your memory may still be limited to 2 gigabytes (because your pointers could still be 32-bit). Note that the name 64bitint does not imply that your C compiler will be using 64-bit ints (it might, but it doesn't have to): use64bitint means that you will be able to have 64-bit-wide scalar values.

The use64bitall goes all the way by attempting to switch also integers (if it can), longs (and pointers) to being 64-bit. This may create an even more binary incompatible Perl than -Duse64bitint: the resulting executable may not run at all in a 32-bit box, or you may have to reboot/reconfigure/rebuild your operating system to be 64-bit aware.

Natively 64-bit systems like Alpha and Cray need neither -Duse64bitint nor -Duse64bitall.

Last but not least: note that due to Perl's habit of always using floating point numbers, the quads are still not true integers. When quads overflow their limits (0...18_446_744_073_709_551_615 unsigned, -9_223_372_036_854_775_808...9_223_372_036_854_775_807 signed), they are silently promoted to floating point numbers, after which they will start losing precision (in their lower digits).
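The silent promotion can be observed with a short sketch. It assumes nothing about the build beyond what Config reports; the precision loss shown in the last step is what one would expect on a perl with 64-bit integers and ordinary 8-byte doubles, and may differ on other configurations.

```perl
use strict;
use warnings;
use Config;

my $max_unsigned = ~0;                  # largest native unsigned integer
my $promoted     = $max_unsigned + 1;   # overflows, so Perl switches to floating point

# Once promoted, adding 1 may no longer change the value.
my $precision_lost = ($promoted + 1 == $promoted) ? 1 : 0;

printf "ivsize=%d nvsize=%d max=%s promoted=%s precision_lost=%d\n",
    $Config{ivsize}, $Config{nvsize}, $max_unsigned, $promoted, $precision_lost;
```

The output depends on how perl was built, which is why the sketch prints the integer and floating point sizes alongside the result.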

    NOTE: 64-bit support is still experimental on most platforms.
    Existing support only covers the LP64 data model.  In particular, the
    LLP64 data model is not yet supported.  64-bit libraries and system
    APIs on many platforms have not stabilized--your mileage may vary.

Large file support

If you have filesystems that support "large files" (files larger than 2 gigabytes), you may now also be able to create and access them from Perl.

    NOTE: The default action is to enable large file support, if
    available on the platform.

If the large file support is on, and you have a Fcntl constant O_LARGEFILE, the O_LARGEFILE is automatically added to the flags of sysopen().

Beware that unless your filesystem also supports "sparse files" seeking to umpteen petabytes may be inadvisable.

Note that in addition to requiring a proper file system to do large files you may also need to adjust your per-process (or your per-system, or per-process-group, or per-user-group) maximum filesize limits before running Perl scripts that try to handle large files, especially if you intend to write such files.

Finally, in addition to your process/process group maximum filesize limits, you may have quota limits on your filesystems that stop you (your user id or your user group id) from using large files.

Adjusting your process/user/group/file system/operating system limits is outside the scope of Perl core language. For process limits, you may try increasing the limits using your shell's limits/limit/ulimit command before running Perl. The BSD::Resource extension (not included with the standard Perl distribution) may also be of use; it offers the getrlimit/setrlimit interface that can be used to adjust process resource usage limits, including the maximum filesize limit.

Long doubles

In some systems you may be able to use long doubles to enhance the range and precision of your double precision floating point numbers (that is, Perl's numbers). Use Configure -Duselongdouble to enable this support (if it is available).

"more bits"

You can "Configure -Dusemorebits" to turn on both the 64-bit support and the long double support.

Enhanced support for sort() subroutines

Perl subroutines with a prototype of ($$) , and XSUBs in general, can now be used as sort subroutines. In either case, the two elements to be compared are passed as normal parameters in @_. See sort.

For unprototyped sort subroutines, the historical behavior of passing the elements to be compared as the global variables $a and $b remains unchanged.

sort $coderef @foo allowed

sort() did not accept a subroutine reference as the comparison function in earlier versions. This is now permitted.
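Both forms can be sketched side by side: a named comparison subroutine with a ($$) prototype, which receives its operands in @_, and a code reference that uses the traditional $a and $b. The subroutine and variable names are illustrative only.

```perl
use strict;
use warnings;

# Prototype ($$): the two elements arrive in @_ rather than in $a and $b.
sub by_length ($$) {
    my ($left, $right) = @_;
    return length($left) <=> length($right) || $left cmp $right;
}

my @words = sort by_length qw(pear fig banana kiwi);

# A subroutine reference as the comparison function.
my $descending = sub { $b <=> $a };
my @numbers    = sort $descending 3, 10, 1;

print "@words\n@numbers\n";
```

The words come out as "fig kiwi pear banana" (shortest first, ties broken alphabetically) and the numbers as "10 3 1".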

File globbing implemented internally

Perl now uses the File::Glob implementation of the glob() operator automatically. This avoids using an external csh process and the problems associated with it.

    NOTE: This is currently an experimental feature.  Interfaces and
    implementation are subject to change.

Support for CHECK blocks

In addition to BEGIN , INIT , END , DESTROY and AUTOLOAD , subroutines named CHECK are now special. These are queued up during compilation and behave similarly to END blocks, except they are called at the end of compilation rather than at the end of execution. They cannot be called directly.

POSIX character class syntax [: :] supported

For example to match alphabetic characters use /[[:alpha:]]/. See perlre for details.
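
A brief sketch of the syntax in use follows; the word list is invented for the example.

```perl
use strict;
use warnings;

# Keep only the words made up entirely of alphabetic characters.
my @candidates = qw(perl 5.6 v5 camel);
my @alphabetic = grep { /^[[:alpha:]]+$/ } @candidates;

print "@alphabetic\n";    # perl camel
```

Note that the class name goes inside a bracketed character class, hence the doubled brackets.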

Better pseudo-random number generator

In 5.005_0x and earlier, perl's rand() function used the C library rand(3) function. As of 5.005_52, Configure tests for drand48(), random(), and rand() (in that order) and picks the first one it finds.

These changes should result in better random numbers from rand().

Improved qw// operator

The qw// operator is now evaluated at compile time into a true list instead of being replaced with a run time call to split(). This removes the confusing misbehaviour of qw// in scalar context, which had inherited that behaviour from split().

Thus:

  1. $foo = ($bar) = qw(a b c); print "$foo|$bar\n";

now correctly prints "3|a", instead of "2|a".

Better worst-case behavior of hashes

Small changes in the hashing algorithm have been implemented in order to improve the distribution of lower order bits in the hashed value. This is expected to yield better performance on keys that are repeated sequences.

pack() format 'Z' supported

The new format type 'Z' is useful for packing and unpacking null-terminated strings. See pack.

pack() format modifier '!' supported

The new format type modifier '!' is useful for packing and unpacking native shorts, ints, and longs. See pack.

pack() and unpack() support counted strings

The template character '/' can be used to specify a counted string type to be packed or unpacked. See pack.

Comments in pack() templates

The '#' character in a template introduces a comment up to end of the line. This facilitates documentation of pack() templates.
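
The 'Z' format, counted strings, and template comments described above can be sketched together as follows. The strings and variable names are arbitrary examples.

```perl
use strict;
use warnings;

# 'Z*' packs a string and appends a terminating null byte.
my $c_string = pack('Z*', 'perl');
printf "Z* packed length: %d\n", length $c_string;    # 5

# 'n/a*' writes a 16-bit big-endian length followed by that many bytes.
my $counted = pack('n/a*', 'hello');
printf "counted length: %d\n", length $counted;       # 7

# '#' starts a comment that runs to the end of the template line.
my $template = join "\n",
    'n/a*   # length-prefixed payload',
    'Z*     # null-terminated name',
    '';
my $record = pack($template, 'hello', 'perl');
my ($payload, $name) = unpack($template, $record);
print "$payload $name\n";    # hello perl
```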

Weak references

In previous versions of Perl, you couldn't cache objects so as to allow them to be deleted if the last reference from outside the cache is deleted. The reference in the cache would hold a reference count on the object and the objects would never be destroyed.

Another familiar problem is with circular references. When an object references itself, its reference count would never go down to zero, and it would not get destroyed until the program is about to exit.

Weak references solve this by allowing you to "weaken" any reference, that is, make it not count towards the reference count. When the last non-weak reference to an object is deleted, the object is destroyed and all the weak references to the object are automatically undef-ed.

To use this feature, you need the Devel::WeakRef package from CPAN, which contains additional documentation.

  NOTE: This is an experimental feature. Details are subject to change.
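
The sketch below illustrates the idea on a parent/child cycle. It is an assumption of this example that weaken() and isweak() are taken from Scalar::Util, which ships with later Perl releases; it is not the Devel::WeakRef interface mentioned above, and the data structure is invented.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);   # assumption: stand-in for Devel::WeakRef

my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };
$parent->{child} = $child;            # parent -> child is a strong reference
weaken($child->{parent});             # child -> parent no longer counts

print "back-reference is weak\n" if isweak($child->{parent});

undef $parent;    # drop the last strong reference to the parent hash
print defined $child->{parent} ? "still alive\n" : "parent freed\n";
```

Once the parent is destroyed, the weak reference held by the child becomes undef, as described above.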

Binary numbers supported

Binary numbers are now supported as literals, in s?printf formats, and oct():

  1. $answer = 0b101010;
  2. printf "The answer is: %b\n", oct("0b101010");

Lvalue subroutines

Subroutines can now return modifiable lvalues. See Lvalue subroutines in perlsub.

  NOTE: This is an experimental feature. Details are subject to change.
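
A minimal sketch of an lvalue subroutine follows. The %config hash and the retries() accessor are hypothetical names.

```perl
use strict;
use warnings;

my %config = (retries => 3);

# The :lvalue attribute lets callers assign to the returned element.
sub retries : lvalue { $config{retries} }

retries() = 5;     # assigns through the subroutine
retries() += 2;    # modifying operators work as well

print "$config{retries}\n";    # 7
```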

Some arrows may be omitted in calls through references

Perl now allows the arrow to be omitted in many constructs involving subroutine calls through references. For example, $foo[10]->('foo') may now be written $foo[10]('foo'). This is rather similar to how the arrow may be omitted from $foo[10]->{'foo'}. Note, however, that the arrow is still required for foo(10)->('bar').
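
For illustration, both spellings are shown below on a made-up array of handlers and a dispatch table.

```perl
use strict;
use warnings;

my @handlers = (sub { "first: @_" }, sub { "second: @_" });
my %dispatch = (double => sub { 2 * $_[0] });

my $with_arrow    = $handlers[1]->('x');
my $without_arrow = $handlers[1]('x');      # arrow omitted between subscripts
my $doubled       = $dispatch{double}(21);

print "$with_arrow | $without_arrow | $doubled\n";
# second: x | second: x | 42
```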

Boolean assignment operators are legal lvalues

Constructs such as ($a ||= 2) += 1 are now allowed.

exists() is supported on subroutine names

The exists() builtin now works on subroutine names. A subroutine is considered to exist if it has been declared (even if implicitly). See exists for examples.
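
The difference between exists() and defined() on subroutine names can be sketched as follows; all three subroutine names are hypothetical.

```perl
use strict;
use warnings;

sub declared_only;           # forward declaration, no body
sub with_body { return 1 }

my %seen = (
    declared_exists  => exists(&declared_only)   ? 1 : 0,
    declared_defined => defined(&declared_only)  ? 1 : 0,
    body_exists      => exists(&with_body)       ? 1 : 0,
    unknown_exists   => exists(&never_mentioned) ? 1 : 0,
);

print "$_ = $seen{$_}\n" for sort keys %seen;
```

A subroutine that has only been declared exists but is not defined, while a name that was never declared does not exist at all.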

exists() and delete() are supported on array elements

The exists() and delete() builtins now work on simple arrays as well. The behavior is similar to that on hash elements.

exists() can be used to check whether an array element has been initialized. This avoids autovivifying array elements that don't exist. If the array is tied, the EXISTS() method in the corresponding tied package will be invoked.

delete() may be used to remove an element from the array and return it. The array element at that position returns to its uninitialized state, so that testing for the same element with exists() will return false. If the element happens to be the one at the end, the size of the array also shrinks down to the highest element that tests true for exists(), or 0 if none such is found. If the array is tied, the DELETE() method in the corresponding tied package will be invoked.

See exists and delete for examples.
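
A small sketch of the behavior described above, using an invented sparse array, is given here. Note that the documentation of later Perl releases discourages calling exists() and delete() on array elements, so this is an illustration rather than a recommendation.

```perl
use strict;
use warnings;

my @slots;
$slots[3] = 'd';                  # elements 0..2 are never initialized

my $gap_exists  = exists $slots[1] ? 1 : 0;    # 0: never initialized
my $last_exists = exists $slots[3] ? 1 : 0;    # 1
my $size_before = scalar @slots;               # 4

my $removed    = delete $slots[3];             # returns 'd'
my $size_after = scalar @slots;                # 0: no element exists any more

print "$gap_exists $last_exists $size_before $removed $size_after\n";
```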

Pseudo-hashes work better

Dereferencing some types of reference values in a pseudo-hash, such as $ph->{foo}[1], was accidentally disallowed. This has been corrected.

When applied to a pseudo-hash element, exists() now reports whether the specified value exists, not merely if the key is valid.

delete() now works on pseudo-hashes. When given a pseudo-hash element or slice it deletes the values corresponding to the keys (but not the keys themselves). See Pseudo-hashes: Using an array as a hash in perlref.

Pseudo-hash slices with constant keys are now optimized to array lookups at compile-time.

List assignments to pseudo-hash slices are now supported.

The fields pragma now provides ways to create pseudo-hashes, via fields::new() and fields::phash(). See fields.

  NOTE: The pseudo-hash data type continues to be experimental. Limiting oneself to the interface elements provided by the fields pragma will provide protection from any future changes.

Automatic flushing of output buffers

fork(), exec(), system(), qx//, and pipe open()s now flush buffers of all files opened for output when the operation was attempted. This mostly eliminates confusing buffering mishaps suffered by users unaware of how Perl internally handles I/O.

This is not supported on some platforms like Solaris where a suitably correct implementation of fflush(NULL) isn't available.

Better diagnostics on meaningless filehandle operations

Constructs such as open(<FH>) and close(<FH>) are compile time errors. Attempting to read from filehandles that were opened only for writing will now produce warnings (just as writing to read-only filehandles does).

Where possible, buffered data discarded from duped input filehandle

open(NEW, "<&OLD") now attempts to discard any data that was previously read and buffered in OLD before duping the handle. On platforms where doing this is allowed, the next read operation on NEW will return the same data as the corresponding operation on OLD. Formerly, it would have returned the data from the start of the following disk block instead.

eof() has the same old magic as <>

eof() would return true if no attempt to read from <> had yet been made. eof() has been changed to have a little magic of its own: it now opens the <> files.

binmode() can be used to set :crlf and :raw modes

binmode() now accepts a second argument that specifies a discipline for the handle in question. The two pseudo-disciplines ":raw" and ":crlf" are currently supported on DOS-derivative platforms. See binmode and open.

-T filetest recognizes UTF-8 encoded files as "text"

The algorithm used for the -T filetest has been enhanced to correctly identify UTF-8 content as "text".

system(), backticks and pipe open now reflect exec() failure

On Unix and similar platforms, system(), qx() and open(FOO, "cmd |") etc., are implemented via fork() and exec(). When the underlying exec() fails, earlier versions did not report the error properly, since the exec() happened to be in a different process.

The child process now communicates with the parent about the error in launching the external command, which allows these constructs to return with their usual error value and set $!.
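
On a Unix-like platform the effect can be sketched as below. The program path is deliberately nonexistent and purely hypothetical; the block form of system() is used so that no shell is involved.

```perl
use strict;
use warnings;
no warnings 'exec';    # silence the "Can't exec" warning for this demonstration

my $missing = '/no/such/dir/no-such-program';   # hypothetical path
my $status  = system { $missing } $missing;     # direct exec, no shell

# Capture $! immediately, before anything else can overwrite it.
my ($errno, $message) = ($! + 0, "$!");

if ($status == -1) {
    print "could not launch $missing: $message\n";
}
```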

Improved diagnostics

Line numbers are no longer suppressed (under most likely circumstances) during the global destruction phase.

Diagnostics emitted from code running in threads other than the main thread are now accompanied by the thread ID.

Embedded null characters in diagnostics now actually show up. They used to truncate the message in prior versions.

$foo::a and $foo::b are now exempt from "possible typo" warnings only if sort() is encountered in package foo.

Unrecognized alphabetic escapes encountered when parsing quote constructs now generate a warning, since they may take on new semantics in later versions of Perl.

Many diagnostics now report the internal operation in which the warning was provoked, like so:

  1. Use of uninitialized value in concatenation (.) at (eval 1) line 1.
  2. Use of uninitialized value in print at (eval 1) line 1.

Diagnostics that occur within eval may also report the file and line number where the eval is located, in addition to the eval sequence number and the line number within the evaluated text itself. For example:

  1. Not enough arguments for scalar at (eval 4)[newlib/perl5db.pl:1411] line 2, at EOF

Diagnostics follow STDERR

Diagnostic output now goes to whichever file the STDERR handle is pointing at, instead of always going to the underlying C runtime library's stderr.

More consistent close-on-exec behavior

On systems that support a close-on-exec flag on filehandles, the flag is now set for any handles created by pipe(), socketpair(), socket(), and accept(), if that is warranted by the value of $^F that may be in effect. Earlier versions neglected to set the flag for handles created with these operators. See pipe, socketpair, socket, accept, and $^F in perlvar.

syswrite() ease-of-use

The length argument of syswrite() has become optional.

Better syntax checks on parenthesized unary operators

Expressions such as:

  1. print defined(&foo,&bar,&baz);
  2. print uc("foo","bar","baz");
  3. undef($foo,&bar);

used to be accidentally allowed in earlier versions, and produced unpredictable behaviour. Some produced ancillary warnings when used in this way; others silently did the wrong thing.

The parenthesized forms of most unary operators that expect a single argument now ensure that they are not called with more than one argument, making the cases shown above syntax errors. The usual behaviour of:

  1. print defined &foo, &bar, &baz;
  2. print uc "foo", "bar", "baz";
  3. undef $foo, &bar;

remains unchanged. See perlop.

Bit operators support full native integer width

The bit operators (& | ^ ~ << >>) now operate on the full native integral width (the exact size of which is available in $Config{ivsize}). For example, if your platform is either natively 64-bit or if Perl has been configured to use 64-bit integers, these operations apply to 8 bytes (as opposed to 4 bytes on 32-bit platforms). For portability, be sure to mask off the excess bits in the result of unary ~, e.g., ~$x & 0xffffffff.
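
The masking advice can be sketched as follows; the value of $x is arbitrary.

```perl
use strict;
use warnings;
use Config;

my $width_bits = 8 * $Config{ivsize};    # 32 or 64 on most builds
my $x          = 0x000000ff;

my $native   = ~$x;                 # result depends on the integer width
my $portable = ~$x & 0xffffffff;    # always 0xffffff00

printf "%d-bit integers: ~x = %#x, masked = %#x\n",
    $width_bits, $native, $portable;
```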

Improved security features

More potentially unsafe operations taint their results for improved security.

The passwd and shell fields returned by getpwent(), getpwnam(), and getpwuid() are now tainted, because the user can affect their own encrypted password and login shell.

The variable modified by shmread(), and messages returned by msgrcv() (and its object-oriented interface IPC::SysV::Msg::rcv) are also tainted, because other untrusted processes can modify messages and shared memory segments for their own nefarious purposes.

More functional bareword prototype (*)

Bareword prototypes have been rationalized to enable them to be used to override builtins that accept barewords and interpret them in a special way, such as require or do.

Arguments prototyped as * will now be visible within the subroutine as either a simple scalar or as a reference to a typeglob. See Prototypes in perlsub.

require and do may be overridden

require and do 'file' operations may be overridden locally by importing subroutines of the same name into the current package (or globally by importing them into the CORE::GLOBAL:: namespace). Overriding require will also affect use, provided the override is visible at compile-time. See Overriding Built-in Functions in perlsub.

$^X variables may now have names longer than one character

Formerly, $^X was synonymous with ${"\cX"}, but $^XY was a syntax error. Now variable names that begin with a control character may be arbitrarily long. However, for compatibility reasons, these variables must be written with explicit braces, as ${^XY} for example. ${^XYZ} is synonymous with ${"\cXYZ"}. Variable names with more than one control character, such as ${^XY^Z}, are illegal.

The old syntax has not changed. As before, `^X' may be either a literal control-X character or the two-character sequence `caret' plus `X'. When braces are omitted, the variable name stops after the control character. Thus "$^XYZ" continues to be synonymous with $^X . "YZ" as before.

As before, lexical variables may not have names beginning with control characters. As before, variables whose names begin with a control character are always forced to be in package `main'. All such variables are reserved for future extensions, except those that begin with ^_, which may be used by user programs and are guaranteed not to acquire special meaning in any future version of Perl.

New variable $^C reflects -c switch

$^C has a boolean value that reflects whether perl is being run in compile-only mode (i.e. via the -c switch). Since BEGIN blocks are executed under such conditions, this variable enables perl code to determine whether actions that make sense only during normal running are warranted. See perlvar.

New variable $^V contains Perl version as a string

$^V contains the Perl version number as a string composed of characters whose ordinals match the version numbers, i.e. v5.6.0. This may be used in string comparisons.

See Support for strings represented as a vector of ordinals for an example.

Optional Y2K warnings

If Perl is built with the cpp macro PERL_Y2KWARN defined, it emits optional warnings when concatenating the number 19 with another number.

This behavior must be specifically enabled when running Configure. See INSTALL and README.Y2K.

Arrays now always interpolate into double-quoted strings

In double-quoted strings, arrays now interpolate, no matter what. The behavior in earlier versions of perl 5 was that arrays would interpolate into strings if the array had been mentioned before the string was compiled, and otherwise Perl would raise a fatal compile-time error. In versions 5.000 through 5.003, the error was

  1. Literal @example now requires backslash

In versions 5.004_01 through 5.6.0, the error was

  1. In string, @example now must be written as \@example

The idea here was to get people into the habit of writing "fred\@example.com" when they wanted a literal @ sign, just as they have always written "Give me back my \$5" when they wanted a literal $ sign.

Starting with 5.6.1, when Perl sees an @ sign in a double-quoted string, it always attempts to interpolate an array, regardless of whether or not the array has been used or declared already. The fatal error has been downgraded to an optional warning:

  1. Possible unintended interpolation of @example in string

This warns you that "fred@example.com" is going to turn into fred.com if you don't backslash the @. See http://perl.plover.com/at-error.html for more details about the history here.
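
The safe spellings, and what happens when an array of that name does exist, can be sketched like this. The @example array is declared here only to make the interpolation visible.

```perl
use strict;
use warnings;

my $escaped = "fred\@example.com";    # backslash keeps the @ literal
my $single  = 'fred@example.com';     # single quotes never interpolate

my @example      = ('EXAMPLE');
my $interpolated = "fred@example.com";    # the array is interpolated

print "$escaped\n$single\n$interpolated\n";
```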

@- and @+ provide starting/ending offsets of regex submatches

The new magic variables @- and @+ provide the starting and ending offsets, respectively, of $&, $1, $2, etc. See perlvar for details.
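
As an illustration, the offsets can be combined with substr() to recover a submatch. The sample text is invented.

```perl
use strict;
use warnings;

my $text = 'The quick brown fox';
my ($whole, $second) = ('', '');

if ($text =~ /(quick)\s+(brown)/) {
    # $-[0]/$+[0] bracket the whole match; $-[N]/$+[N] bracket group N.
    $whole  = "$-[0]-$+[0]";                          # 4-15
    $second = substr($text, $-[2], $+[2] - $-[2]);    # brown
    print "match spans $whole, group 2 is '$second'\n";
}
```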

Modules and Pragmata

Modules

  • attributes

    While used internally by Perl as a pragma, this module also provides a way to fetch subroutine and variable attributes. See attributes.

  • B

    The Perl Compiler suite has been extensively reworked for this release. More of the standard Perl test suite passes when run under the Compiler, but there is still a significant way to go to achieve production quality compiled executables.

    NOTE: The Compiler suite remains highly experimental. The generated code may not be correct, even when it manages to execute without errors.
  • Benchmark

    Overall, Benchmark results exhibit lower average error and better timing accuracy.

    You can now run tests for n seconds instead of guessing the right number of tests to run: e.g., timethese(-5, ...) will run each code for at least 5 CPU seconds. Zero as the "number of repetitions" means "for at least 3 CPU seconds". The output format has also changed. For example:

    1. use Benchmark;$x=3;timethese(-5,{a=>sub{$x*$x},b=>sub{$x**2}})

    will now output something like this:

    1. Benchmark: running a, b, each for at least 5 CPU seconds...
    2. a: 5 wallclock secs ( 5.77 usr + 0.00 sys = 5.77 CPU) @ 200551.91/s (n=1156516)
    3. b: 4 wallclock secs ( 5.00 usr + 0.02 sys = 5.02 CPU) @ 159605.18/s (n=800686)

    New features: "each for at least N CPU seconds...", "wallclock secs", and the "@ operations/CPU second (n=operations)".

    timethese() now returns a reference to a hash of Benchmark objects containing the test results, keyed on the names of the tests.

    timethis() now returns the iterations field in the Benchmark result object instead of 0.

    timethese(), timethis(), and the new cmpthese() (see below) can also take a format specifier of 'none' to suppress output.

    A new function countit() is just like timeit() except that it takes a TIME instead of a COUNT.

    A new function cmpthese() prints a chart comparing the results of each test returned from a timethese() call. For each possible pair of tests, the percentage speed difference (iters/sec or seconds/iter) is shown.

    For other details, see Benchmark.
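
    A hedged sketch of the newer interface follows. The two test subroutines and the iteration count are arbitrary choices, and a fixed count is used instead of a CPU-time limit to keep the run short.

    ```perl
    use strict;
    use warnings;
    use Benchmark qw(timethese cmpthese);

    my $x = 3;

    # 'none' suppresses the per-test report; the results come back as a
    # hash reference of Benchmark objects keyed on the test names.
    my $results = timethese(20_000, {
        multiply => sub { my $s = 0; $s += $x * $x  for 1 .. 50; $s },
        power    => sub { my $s = 0; $s += $x ** 2  for 1 .. 50; $s },
    }, 'none');

    for my $name (sort keys %$results) {
        printf "%-8s ran %d iterations\n", $name, $results->{$name}->iters;
    }

    cmpthese($results);    # prints the comparison chart
    ```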

  • ByteLoader

    The ByteLoader is a dedicated extension to generate and run Perl bytecode. See ByteLoader.

  • constant

    References can now be used.

    The new version also allows a leading underscore in constant names, but disallows a double leading underscore (as in "__LINE__"). Some other names are disallowed or warned against, including BEGIN, END, etc. Some names which were forced into main:: used to fail silently in some cases; now they're fatal (outside of main::) and an optional warning (inside of main::). The ability to detect whether a constant had been set with a given name has been added.

    See constant.
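
    For example, constants holding references might be declared as in this sketch; the names and values are illustrative only.

    ```perl
    use strict;
    use warnings;

    use constant COLORS => [qw(red green blue)];
    use constant LIMITS => { min => 0, max => 10 };

    my $second = COLORS->[1];     # green
    my $upper  = LIMITS->{max};   # 10

    print "$second $upper\n";
    ```

    Only the reference is constant; the data it points to can still be modified.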

  • charnames

    This pragma implements the \N string escape. See charnames.

  • Data::Dumper

    A Maxdepth setting can be specified to avoid venturing too deeply into deep data structures. See Data::Dumper.

    The XSUB implementation of Dump() is now automatically called if the Useqq setting is not in use.

    Dumping qr// objects works correctly.
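
    A minimal sketch of the Maxdepth setting, applied to a made-up nested structure:

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    $Data::Dumper::Maxdepth = 1;    # do not descend below the top level
    $Data::Dumper::Sortkeys = 1;    # stable key order for readable output

    my $data = { name => 'demo', nested => { deep => [1, 2, 3] } };
    my $dump = Dumper($data);

    print $dump;    # 'nested' is shown as 'HASH(0x...)' rather than expanded
    ```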

  • DB

    DB is an experimental module that exposes a clean abstraction to Perl's debugging API.

  • DB_File

    DB_File can now be built with Berkeley DB versions 1, 2 or 3. See ext/DB_File/Changes.

  • Devel::DProf

    Devel::DProf, a Perl source code profiler, has been added. See Devel::DProf and dprofpp.

  • Devel::Peek

    The Devel::Peek module provides access to the internal representation of Perl variables and data. It is a data debugging tool for the XS programmer.

  • Dumpvalue

    The Dumpvalue module provides screen dumps of Perl data.

  • DynaLoader

    DynaLoader now supports a dl_unload_file() function on platforms that support unloading shared objects using dlclose().

    Perl can also optionally arrange to unload all extension shared objects loaded by Perl. To enable this, build Perl with the Configure option -Accflags=-DDL_UNLOAD_ALL_AT_EXIT. (This may be useful if you are using Apache with mod_perl.)

  • English

    $PERL_VERSION now stands for $^V (a string value) rather than for $] (a numeric value).

  • Env

    Env now supports accessing environment variables like PATH as array variables.

  • Fcntl

    More Fcntl constants added: F_SETLK64, F_SETLKW64, O_LARGEFILE for large file (more than 4GB) access (NOTE: the O_LARGEFILE is automatically added to sysopen() flags if large file support has been configured, as is the default), Free/Net/OpenBSD locking behaviour flags F_FLOCK, F_POSIX, Linux F_SHLCK, and O_ACCMODE: the combined mask of O_RDONLY, O_WRONLY, and O_RDWR. The seek()/sysseek() constants SEEK_SET, SEEK_CUR, and SEEK_END are available via the :seek tag. The chmod()/stat() S_IF* constants and S_IS* functions are available via the :mode tag.
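
    The :seek and :mode tags can be sketched as follows. An in-memory filehandle and the current directory are used here only so that the example needs no external files.

    ```perl
    use strict;
    use warnings;
    use Fcntl qw(:seek :mode);

    # Read the last three bytes of an in-memory file using SEEK_END.
    my $data = '0123456789';
    open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";
    seek($fh, -3, SEEK_END) or die "seek failed: $!";
    read($fh, my $tail, 3);
    close $fh;

    # Classify a stat() mode with the S_IS* functions.
    my $mode = (stat '.')[2];
    my $kind = S_ISDIR($mode) ? 'directory' : 'not a directory';

    printf "tail=%s, '.' is a %s, permissions=%04o\n",
        $tail, $kind, S_IMODE($mode);
    ```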

  • File::Compare

    A compare_text() function has been added, which allows custom comparison functions. See File::Compare.

  • File::Find

    File::Find now works correctly when the wanted() function is either autoloaded or is a symbolic reference.

    A bug that caused File::Find to lose track of the working directory when pruning top-level directories has been fixed.

    File::Find now also supports several other options to control its behavior. It can follow symbolic links if the follow option is specified. Enabling the no_chdir option will make File::Find skip changing the current directory when walking directories. The untaint flag can be useful when running with taint checks enabled.

    See File::Find.

  • File::Glob

    This extension implements BSD-style file globbing. By default, it will also be used for the internal implementation of the glob() operator. See File::Glob.

  • File::Spec

    New methods have been added to the File::Spec module: devnull() returns the name of the null device (/dev/null on Unix) and tmpdir() the name of the temp directory (normally /tmp on Unix). There are now also methods to convert between absolute and relative filenames: abs2rel() and rel2abs(). For compatibility with operating systems that specify volume names in file paths, the splitpath(), splitdir(), and catdir() methods have been added.

  • File::Spec::Functions

    The new File::Spec::Functions module provides a function interface to the File::Spec module. It allows the shorthand

    1. $fullname = catfile($dir1, $dir2, $file);

    instead of

    1. $fullname = File::Spec->catfile($dir1, $dir2, $file);
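
    A brief sketch combining the function interface with some of the newer File::Spec methods is shown here. The path components are invented, and functions that are not exported by default are requested explicitly.

    ```perl
    use strict;
    use warnings;
    use File::Spec;
    use File::Spec::Functions qw(catfile splitdir rel2abs);

    my $relative = catfile('lib', 'File', 'Spec.pm');
    my @parts    = splitdir($relative);    # ('lib', 'File', 'Spec.pm')
    my $absolute = rel2abs($relative);     # resolved against the current directory

    print "relative: $relative\n";
    print "absolute: $absolute\n";
    print "null device: ", File::Spec->devnull, "\n";
    print "temp directory: ", File::Spec->tmpdir, "\n";
    ```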
  • Getopt::Long

    Getopt::Long licensing has changed to allow the Perl Artistic License as well as the GPL. It used to be GPL only, which got in the way of non-GPL applications that wanted to use Getopt::Long.

    Getopt::Long encourages the use of Pod::Usage to produce help messages. For example:

        use Getopt::Long;
        use Pod::Usage;
        my $man = 0;
        my $help = 0;
        GetOptions('help|?' => \$help, man => \$man) or pod2usage(2);
        pod2usage(1) if $help;
        pod2usage(-exitstatus => 0, -verbose => 2) if $man;

        __END__

        =head1 NAME

        sample - Using Getopt::Long and Pod::Usage

        =head1 SYNOPSIS

        sample [options] [file ...]

         Options:
           -help            brief help message
           -man             full documentation

        =head1 OPTIONS

        =over 8

        =item B<-help>

        Print a brief help message and exits.

        =item B<-man>

        Prints the manual page and exits.

        =back

        =head1 DESCRIPTION

        B<This program> will read the given input file(s) and do something
        useful with the contents thereof.

        =cut

    See Pod::Usage for details.

    A bug that prevented the non-option call-back <> from being specified as the first argument has been fixed.

    To specify the characters < and > as option starters, use ><. Note, however, that changing option starters is strongly deprecated.

  • IO

    write() and syswrite() will now accept a single-argument form of the call, for consistency with Perl's syswrite().

    You can now create a TCP-based IO::Socket::INET without forcing a connect attempt. This allows you to configure its options (like making it non-blocking) and then call connect() manually.

    A bug that prevented the IO::Socket::protocol() accessor from ever returning the correct value has been corrected.

    IO::Socket::connect now uses non-blocking IO instead of alarm() to do connect timeouts.

    IO::Socket::accept now uses select() instead of alarm() for doing timeouts.

    IO::Socket::INET->new now sets $! correctly on failure. $@ is still set for backwards compatibility.

  • JPL

    Java Perl Lingo is now distributed with Perl. See jpl/README for more information.

  • lib

    use lib now weeds out any trailing duplicate entries. no lib removes all named entries.

  • Math::BigInt

    The bitwise operations <<, >>, &, |, and ~ are now supported on bigints.

  • Math::Complex

    The accessor methods Re, Im, arg, abs, rho, and theta can now also act as mutators (accessor $z->Re(), mutator $z->Re(3)).

    The class method display_format and the corresponding object method display_format, in addition to accepting just one argument, can now also accept a parameter hash. Recognized keys of the parameter hash are "style", which corresponds to the old one-parameter case, and two new parameters: "format", which is a printf()-style format string used for both parts of a complex number (it usually defaults to "%.15g"; you can revert to the default by setting the format string to undef), and "polar_pretty_print" (defaults to true), which controls whether an attempt is made to recognize small multiples and rationals of pi (2pi, pi/2) in the argument (angle) of a polar complex number.

    The potentially disruptive change is that in list context both methods now return the parameter hash, instead of only the value of the "style" parameter.
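
    The accessor and mutator forms of Re and Im can be sketched as follows; the numbers are chosen so that the modulus comes out as a whole number.

    ```perl
    use strict;
    use warnings;
    use Math::Complex;

    my $z = cplx(3, 4);
    my $before = abs($z);    # modulus of 3+4i, i.e. 5

    $z->Re(6);               # mutator form: set the real part
    $z->Im(8);               # mutator form: set the imaginary part
    my $after = abs($z);     # modulus of 6+8i, i.e. 10

    printf "Re=%g Im=%g |z| went from %g to %g\n",
        $z->Re, $z->Im, $before, $after;
    ```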

  • Math::Trig

    A little bit of radial trigonometry (cylindrical and spherical), radial coordinate conversions, and the great circle distance were added.

  • Pod::Parser, Pod::InputObjects

    Pod::Parser is a base class for parsing and selecting sections of pod documentation from an input stream. This module takes care of identifying pod paragraphs and commands in the input and hands off the parsed paragraphs and commands to user-defined methods which are free to interpret or translate them as they see fit.

    Pod::InputObjects defines some input objects needed by Pod::Parser, and by advanced users of Pod::Parser who need more information about a command than its name and text.

    As of release 5.6.0 of Perl, Pod::Parser is now the officially sanctioned "base parser code" recommended for use by all pod2xxx translators. Pod::Text (pod2text) and Pod::Man (pod2man) have already been converted to use Pod::Parser and efforts to convert Pod::HTML (pod2html) are already underway. For any questions or comments about pod parsing and translating issues and utilities, please use the pod-people@perl.org mailing list.

    For further information, please see Pod::Parser and Pod::InputObjects.

  • Pod::Checker, podchecker

    This utility checks pod files for correct syntax, according to perlpod. Obvious errors are flagged as such, while warnings are printed for mistakes that can be handled gracefully. The checklist is not complete yet. See Pod::Checker.

  • Pod::ParseUtils, Pod::Find

    These modules provide a set of gizmos that are useful mainly for pod translators. Pod::Find traverses directory structures and returns found pod files, along with their canonical names (like File::Spec::Unix). Pod::ParseUtils contains Pod::List (useful for storing pod list information), Pod::Hyperlink (for parsing the contents of L<> sequences) and Pod::Cache (for caching information about pod files, e.g., link nodes).

  • Pod::Select, podselect

    Pod::Select is a subclass of Pod::Parser which provides a function named "podselect()" to filter out user-specified sections of raw pod documentation from an input stream. podselect is a script that provides access to Pod::Select from other scripts to be used as a filter. See Pod::Select.

  • Pod::Usage, pod2usage

    Pod::Usage provides the function "pod2usage()" to print usage messages for a Perl script based on its embedded pod documentation. The pod2usage() function is generally useful to all script authors since it lets them write and maintain a single source (the pods) for documentation, thus removing the need to create and maintain redundant usage message text consisting of information already in the pods.

    There is also a pod2usage script which can be used from other kinds of scripts to print usage messages from pods (even for non-Perl scripts with pods embedded in comments).

    For details and examples, please see Pod::Usage.

  • Pod::Text and Pod::Man

    Pod::Text has been rewritten to use Pod::Parser. While pod2text() is still available for backwards compatibility, the module now has a new preferred interface. See Pod::Text for the details. The new Pod::Text module is easily subclassed for tweaks to the output, and two such subclasses (Pod::Text::Termcap for man-page-style bold and underlining using termcap information, and Pod::Text::Color for markup with ANSI color sequences) are now standard.

    pod2man has been turned into a module, Pod::Man, which also uses Pod::Parser. In the process, several outstanding bugs related to quotes in section headers, quoting of code escapes, and nested lists have been fixed. pod2man is now a wrapper script around this module.

  • SDBM_File

    An EXISTS method has been added to this module (and sdbm_exists() has been added to the underlying sdbm library), so one can now call exists on an SDBM_File tied hash and get the correct result, rather than a runtime error.

    A bug that may have caused data loss when more than one disk block happens to be read from the database in a single FETCH() has been fixed.

  • Sys::Syslog

    Sys::Syslog now uses XSUBs to access facilities from syslog.h so it no longer requires syslog.ph to exist.

  • Sys::Hostname

    Sys::Hostname now uses XSUBs to call the C library's gethostname() or uname() if they exist.

  • Term::ANSIColor

    Term::ANSIColor is a very simple module to provide easy and readable access to the ANSI color and highlighting escape sequences, supported by most ANSI terminal emulators. It is now included standard.

  • Time::Local

    The timelocal() and timegm() functions used to silently return bogus results when the date fell outside the machine's integer range. They now consistently croak() if the date falls in an unsupported range.

  • Win32

    The error return value in list context has been changed for all functions that return a list of values. Previously these functions returned a list with a single element undef if an error occurred. Now these functions return the empty list in these situations. This applies to the following functions:

    1. Win32::FsType
    2. Win32::GetOSVersion

    The remaining functions are unchanged and continue to return undef on error even in list context.

    The Win32::SetLastError(ERROR) function has been added as a complement to the Win32::GetLastError() function.

    The new Win32::GetFullPathName(FILENAME) returns the full absolute pathname for FILENAME in scalar context. In list context it returns a two-element list containing the fully qualified directory name and the filename. See Win32.

  • XSLoader

    The XSLoader extension is a simpler alternative to DynaLoader. See XSLoader.

  • DBM Filters

    A new feature called "DBM Filters" has been added to all the DBM modules--DB_File, GDBM_File, NDBM_File, ODBM_File, and SDBM_File. DBM Filters add four new methods to each DBM module:

    1. filter_store_key
    2. filter_store_value
    3. filter_fetch_key
    4. filter_fetch_value

    These can be used to filter key-value pairs before the pairs are written to the database or just after they are read from the database. See perldbmfilter for further information.
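
    A hedged sketch of two of these methods on an SDBM_File database follows. The database lives in a temporary directory created with File::Temp, and the filters themselves (upper-casing on store, bracketing on fetch) are arbitrary illustrations.

    ```perl
    use strict;
    use warnings;
    use Fcntl;
    use SDBM_File;
    use File::Temp qw(tempdir);

    my $dir = tempdir(CLEANUP => 1);
    my $db  = tie(my %greeting, 'SDBM_File', "$dir/demo", O_RDWR | O_CREAT, 0666)
        or die "Cannot tie SDBM file: $!";

    # Each filter receives the value in $_ and modifies it in place.
    $db->filter_store_value(sub { $_ = uc $_ });     # applied before each write
    $db->filter_fetch_value(sub { $_ = "<$_>" });    # applied after each read

    $greeting{en} = 'hello';
    my $stored = $greeting{en};
    print "$stored\n";    # <HELLO>

    undef $db;
    untie %greeting;
    ```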

Pragmata

use attrs is now obsolete, and is only provided for backward-compatibility. It's been replaced by the sub : attributes syntax. See Subroutine Attributes in perlsub and attributes.

A lexical warnings pragma, use warnings, has been added to control optional warnings. See perllexwarn.
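
The lexical scoping can be sketched as follows. A __WARN__ handler collects the warnings so that the effect of the inner block is visible; the variable names are hypothetical.

```perl
use strict;
use warnings;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my ($quiet, $loud);    # both deliberately left undefined

{
    no warnings 'uninitialized';        # disabled for this block only
    my $text = "value: " . $quiet;      # silent
}
{
    my $text = "value: " . $loud;       # warns: uninitialized value
}

printf "%d warning(s) caught\n", scalar @caught;    # 1
```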

use filetest to control the behaviour of filetests (-r -w ...). Currently only one subpragma is implemented, "use filetest 'access';", which uses access(2) or equivalent to check permissions instead of using stat(2) as usual. This matters in filesystems where there are ACLs (access control lists): the stat(2) might lie, but access(2) knows better.

The open pragma can be used to specify default disciplines for handle constructors (e.g. open()) and for qx//. The two pseudo-disciplines :raw and :crlf are currently supported on DOS-derivative platforms (i.e. where binmode is not a no-op). See also "binmode() can be used to set :crlf and :raw modes".

Utility Changes

dprofpp

dprofpp is used to display profile data generated using Devel::DProf. See dprofpp.

find2perl

The find2perl utility now uses the enhanced features of the File::Find module. The -depth and -follow options are supported. Pod documentation is also included in the script.

h2xs

The h2xs tool can now work in conjunction with C::Scan (available from CPAN) to automatically parse real-life header files. The -M, -a, -k, and -o options are new.

perlcc

perlcc now supports the C and Bytecode backends. By default, it generates output from the simple C backend rather than the optimized C backend.

Support for non-Unix platforms has been improved.

perldoc

perldoc has been reworked to avoid possible security holes. It will not by default let itself be run as the superuser, but you may still use the -U switch to try to make it drop privileges first.

The Perl Debugger

Many bug fixes and enhancements were added to perl5db.pl, the Perl debugger. New commands include < ?, > ?, and { ? to list out current actions, and man docpage to run your doc viewer on some perl docset. Quoted options are now supported. The help information was rearranged, and should be viewable once again if you're using less as your pager. A serious security hole was plugged--you should immediately remove all older versions of the Perl debugger as installed in previous releases, all the way back to perl3, from your system to avoid being bitten by this.

Improved Documentation

Many of the platform-specific README files are now part of the perl installation. See perl for the complete list.

  • perlapi.pod

    The official list of public Perl API functions.

  • perlboot.pod

    A tutorial for beginners on object-oriented Perl.

  • perlcompile.pod

    An introduction to using the Perl Compiler suite.

  • perldbmfilter.pod

    A howto document on using the DBM filter facility.

  • perldebug.pod

    All material unrelated to running the Perl debugger, plus all low-level guts-like details that risked crushing the casual user of the debugger, have been relocated from the old manpage to the next entry below.

  • perldebguts.pod

    This new manpage contains excessively low-level material not related to the Perl debugger, but slightly related to debugging Perl itself. It also contains some arcane internal details of how the debugging process works that may only be of interest to developers of Perl debuggers.

  • perlfork.pod

    Notes on the fork() emulation currently available for the Windows platform.

  • perlfilter.pod

    An introduction to writing Perl source filters.

  • perlhack.pod

    Some guidelines for hacking the Perl source code.

  • perlintern.pod

    A list of internal functions in the Perl source code. (List is currently empty.)

  • perllexwarn.pod

    Introduction and reference information about lexically scoped warning categories.

  • perlnumber.pod

    Detailed information about numbers as they are represented in Perl.

  • perlopentut.pod

    A tutorial on using open() effectively.

  • perlreftut.pod

    A tutorial that introduces the essentials of references.

  • perltootc.pod

    A tutorial on managing class data for object modules.

  • perltodo.pod

    Discussion of the most often wanted features that may someday be supported in Perl.

  • perlunicode.pod

    An introduction to Unicode support features in Perl.

Performance enhancements

Simple sort() using { $a <=> $b } and the like are optimized

Many common sort() operations using a simple inlined block are now optimized for faster performance.
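The kind of simple inlined comparison block this refers to looks like the following. The sample data is made up, and the sketch shows only the forms, not a measurement of the speedup.

```perl
use strict;
use warnings;

my @nums = (10, 9, 100, 1);

# Simple inlined comparison blocks of the form the text describes.
my @ascending  = sort { $a <=> $b } @nums;
my @descending = sort { $b <=> $a } @nums;
my @by_name    = sort { $a cmp $b } qw(pear apple fig);
```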

Optimized assignments to lexical variables

Certain operations in the RHS of assignment statements have been optimized to directly set the lexical variable on the LHS, eliminating redundant copying overheads.

Faster subroutine calls

Minor changes in how subroutine calls are handled internally provide marginal improvements in performance.

delete(), each(), values() and hash iteration are faster

The hash values returned by delete(), each(), values() and hashes in a list context are the actual values in the hash, instead of copies. This results in significantly better performance, because it eliminates needless copying in most situations.
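One visible consequence of values() returning the actual hash values rather than copies is that modifying the returned elements modifies the hash. A minimal sketch, with a made-up %price hash:

```perl
use strict;
use warnings;

my %price = (apple => 1, pear => 2);

# Each element yielded by values() is the stored value itself,
# so assigning through $_ updates the hash in place.
$_ *= 10 for values %price;
```

Code that relied on getting private copies should copy explicitly, for example with `my @copy = values %price;`.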

Installation and Configuration Improvements

-Dusethreads means something different

The -Dusethreads flag now enables the experimental interpreter-based thread support by default. To get the flavor of experimental threads that was in 5.005 instead, you need to run Configure with "-Dusethreads -Duse5005threads".

As of v5.6.0, interpreter-threads support is still lacking a way to create new threads from Perl (i.e., use Thread; will not work with interpreter threads). use Thread; continues to be available when you specify the -Duse5005threads option to Configure, bugs and all.

  NOTE: Support for threads continues to be an experimental feature.
  Interfaces and implementation are subject to sudden and drastic changes.

New Configure flags

The following new flags may be enabled on the Configure command line by running Configure with -Dflag.

  1. usemultiplicity
  2. usethreads useithreads (new interpreter threads: no Perl API yet)
  3. usethreads use5005threads (threads as they were in 5.005)
  4. use64bitint (equal to now deprecated 'use64bits')
  5. use64bitall
  6. uselongdouble
  7. usemorebits
  8. uselargefiles
  9. usesocks (only SOCKS v5 supported)
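To see which of these flags a given perl was built with, the standard Config module can be consulted at run time. This is a sketch that assumes a flag counts as enabled when its %Config entry is the string 'define'; the %enabled hash is a name introduced here for illustration.

```perl
use strict;
use warnings;
use Config;

# Flag names taken from the list above.
my @flags = qw(
    usemultiplicity usethreads useithreads use5005threads
    use64bitint use64bitall uselongdouble usemorebits
    uselargefiles usesocks
);

my %enabled;
for my $flag (@flags) {
    my $value = $Config{$flag};
    # Assumption: Configure records an enabled flag as 'define'.
    $enabled{$flag} = (defined $value && $value eq 'define') ? 1 : 0;
}

printf "%-16s %s\n", $_, ($enabled{$_} ? 'yes' : 'no') for @flags;
```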

Threadedness and 64-bitness now more daring

The Configure options enabling the use of threads and the use of 64-bitness are now more daring in the sense that they no longer have an explicit list of operating systems with known threads/64-bit capabilities. In other words: if your operating system has the necessary APIs and datatypes, you should be able just to go ahead and use them, for threads by Configure -Dusethreads, and for 64 bits either explicitly by Configure -Duse64bitint or implicitly if your system has 64-bit wide datatypes. See also 64-bit support.

Long Doubles

Some platforms have "long doubles", floating point numbers of even larger range than ordinary "doubles". To enable using long doubles for Perl's scalars, use -Duselongdouble.

-Dusemorebits

You can enable both -Duse64bitint and -Duselongdouble with -Dusemorebits. See also 64-bit support.

-Duselargefiles

Some platforms support system APIs that are capable of handling large files (typically, files larger than two gigabytes). Perl will try to use these APIs if you ask for -Duselargefiles.

See Large file support for more information.

installusrbinperl

You can use "Configure -Uinstallusrbinperl" which causes installperl to skip installing perl also as /usr/bin/perl. This is useful if you prefer not to modify /usr/bin for some reason or another but harmful because many scripts assume that Perl can be found at /usr/bin/perl.

SOCKS support

You can use "Configure -Dusesocks" which causes Perl to probe for the SOCKS proxy protocol library (v5, not v4). For more information on SOCKS, see:

  1. http://www.socks.nec.com/

-A flag

You can "post-edit" the Configure variables using the Configure -A switch. The editing happens immediately after the platform specific hints files have been processed but before the actual configuration process starts. Run Configure -h to find out the full -A syntax.

Enhanced Installation Directories

The installation structure has been enriched to improve the support for maintaining multiple versions of perl, to provide locations for vendor-supplied modules, scripts, and manpages, and to ease maintenance of locally-added modules, scripts, and manpages. See the section on Installation Directories in the INSTALL file for complete details. For most users building and installing from source, the defaults should be fine.

If you previously used Configure -Dsitelib or -Dsitearch to set special values for library directories, you might wish to consider using the new -Dsiteprefix setting instead. Also, if you wish to re-use a config.sh file from an earlier version of perl, you should be sure to check that Configure makes sensible choices for the new directories. See INSTALL for complete details.

gcc automatically tried if 'cc' does not seem to be working

On many platforms the vendor-supplied 'cc' is too stripped-down to build Perl (basically, the 'cc' doesn't do ANSI C). If this seems to be the case and the 'cc' does not seem to be the GNU C compiler 'gcc', an automatic attempt is made to find and use 'gcc' instead.

Platform specific changes

Supported platforms

  • The Mach CThreads (NEXTSTEP, OPENSTEP) are now supported by the Thread extension.

  • GNU/Hurd is now supported.

  • Rhapsody/Darwin is now supported.

  • EPOC is now supported (on Psion 5).

  • The cygwin port (formerly cygwin32) has been greatly improved.

DOS

  • Perl now works with djgpp 2.02 (and 2.03 alpha).

  • Environment variable names are not converted to uppercase any more.

  • Incorrect exit codes from backticks have been fixed.

  • This port continues to use its own builtin globbing (not File::Glob).

OS390 (OpenEdition MVS)

Support for this EBCDIC platform has not been renewed in this release. There are difficulties in reconciling Perl's standardization on UTF-8 as its internal representation for characters with the EBCDIC character set, because the two are incompatible.

It is unclear whether future versions will renew support for this platform, but the possibility exists.

VMS

Numerous revisions and extensions to configuration, build, testing, and installation process to accommodate core changes and VMS-specific options.

Expand %ENV-handling code to allow runtime mapping to logical names, CLI symbols, and CRTL environ array.

Extension of subprocess invocation code to accept filespecs as command "verbs".

Add to Perl command line processing the ability to use default file types and to recognize Unix-style 2>&1.

Expansion of File::Spec::VMS routines, and integration into ExtUtils::MM_VMS.

Extension of ExtUtils::MM_VMS to handle complex extensions more flexibly.

Barewords at start of Unix-syntax paths may be treated as text rather than only as logical names.

Optional secure translation of several logical names used internally by Perl.

Miscellaneous bugfixing and porting of new core code to VMS.

Thanks are gladly extended to the many people who have contributed VMS patches, testing, and ideas.

Win32

Perl can now emulate fork() internally, using multiple interpreters running in different concurrent threads. This support must be enabled at build time. See perlfork for detailed information.
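The usual fork/wait idiom is what the emulation is meant to support. The sketch below runs as-is on platforms with a native fork(); on Win32 it assumes a perl built with the emulation enabled. The exit code 7 is arbitrary.

```perl
use strict;
use warnings;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child (a separate interpreter thread under the Win32 emulation).
    exit 7;
}

# Parent: wait for the child and recover its exit code from $?.
waitpid($pid, 0);
my $child_status = $? >> 8;
```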

When given a pathname that consists only of a drivename, such as A:, opendir() and stat() now use the current working directory for the drive rather than the drive root.

The builtin XSUB functions in the Win32:: namespace are documented. See Win32.

$^X now contains the full path name of the running executable.

A Win32::GetLongPathName() function is provided to complement Win32::GetFullPathName() and Win32::GetShortPathName(). See Win32.

POSIX::uname() is supported.

system(1,...) now returns true process IDs rather than process handles. kill() accepts any real process id, rather than strictly return values from system(1,...).

For better compatibility with Unix, kill(0, $pid) can now be used to test whether a process exists.
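A sketch of the existence test follows. The helper name process_exists is hypothetical, and treating EPERM as "exists but not ours" reflects Unix semantics that may not carry over to every platform.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if a process with this id appears to exist.
sub process_exists {
    my ($pid) = @_;
    # Signal 0 sends nothing; it only checks whether signalling is possible.
    return 1 if kill(0, $pid);
    # Assumption: EPERM means the process exists but belongs to someone else.
    return $!{EPERM} ? 1 : 0;
}

# The current process always exists.
my $self_alive = process_exists($$);
```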

The Shell module is supported.

Better support for building Perl under command.com in Windows 95 has been added.

Scripts are read in binary mode by default to allow ByteLoader (and the filter mechanism in general) to work properly. For compatibility, the DATA filehandle will be set to text mode if a carriage return is detected at the end of the line containing the __END__ or __DATA__ token; if not, the DATA filehandle will be left open in binary mode. Earlier versions always opened the DATA filehandle in text mode.

The glob() operator is implemented via the File::Glob extension, which supports glob syntax of the C shell. This increases the flexibility of the glob() operator, but there may be compatibility issues for programs that relied on the older globbing syntax. If you want to preserve compatibility with the older syntax, you might want to run perl with -MFile::DosGlob. For details and compatibility information, see File::Glob.
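Brace expansion is one example of the C shell syntax that glob() accepts. In this sketch the file names are made up; because the pattern contains no wildcards, the expansion does not depend on which files actually exist.

```perl
use strict;
use warnings;

# csh-style alternation: each comma-separated choice yields one name.
my @names = glob("{alpha,beta}.txt");
```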

Significant bug fixes

<HANDLE> on empty files

With $/ set to undef, "slurping" an empty file returns a string of zero length (instead of undef, as it used to) the first time the HANDLE is read after $/ is set to undef. Further reads yield undef.

This means that the following will append "foo" to an empty file (it used to do nothing):

  1. perl -0777 -pi -e 's/^/foo/' empty_file

The behaviour of:

  1. perl -pi -e 's/^/foo/' empty_file

is unchanged (it continues to leave the file empty).
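The same behaviour can be observed from inside a program. This sketch assumes a freshly created empty temporary file; the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create an empty scratch file.
my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp;

open(my $in, '<', $path) or die "Cannot open $path: $!";

my ($first, $second);
{
    local $/;            # undef: slurp mode
    $first  = <$in>;     # zero-length string on the first read
    $second = <$in>;     # undef on further reads
}
close $in;
```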

eval '...' improvements

Line numbers (as reflected by caller() and most diagnostics) within eval '...' were often incorrect where here documents were involved. This has been corrected.

Lexical lookups for variables appearing in eval '...' within functions that were themselves called within an eval '...' were searching the wrong place for lexicals. The lexical search now correctly ends at the subroutine's block boundary.

The use of return within eval {...} caused $@ not to be reset correctly when no exception occurred within the eval. This has been fixed.

Parsing of here documents used to be flawed when they appeared as the replacement expression in eval 's/.../.../e'. This has been fixed.
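The fix for return within eval {...} can be illustrated with a short sketch. The sub name attempt and the stale error text are made up for the example.

```perl
use strict;
use warnings;

sub attempt {
    $@ = "stale error";                  # pretend an earlier failure left this behind
    my $value = eval { return 42 };      # return leaves the eval, not the sub
    return ($value, $@);
}

my ($value, $err) = attempt();
```

Since no exception occurred inside the eval, $@ is expected to be the empty string afterwards.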

All compilation errors are true errors

Some "errors" encountered at compile time were by necessity generated as warnings followed by eventual termination of the program. This enabled more such errors to be reported in a single run, rather than causing a hard stop at the first error that was encountered.

The mechanism for reporting such errors has been reimplemented to queue compile-time errors and report them at the end of the compilation as true errors rather than as warnings. This fixes cases where error messages leaked through in the form of warnings when code was compiled at run time using eval STRING, and also allows such errors to be reliably trapped using eval "...".

Implicitly closed filehandles are safer

Sometimes implicitly closed filehandles (as when they are localized, and Perl automatically closes them on exiting the scope) could inadvertently set $? or $!. This has been corrected.

Behavior of list slices is more consistent

When taking a slice of a literal list (as opposed to a slice of an array or hash), Perl used to return an empty list if the result happened to be composed of all undef values.

The new behavior is to produce an empty list if (and only if) the original list was empty. Consider the following example:

  1. @a = (1,undef,undef,2)[2,1,2];

The old behavior would have resulted in @a having no elements. The new behavior ensures it has three undefined elements.

Note in particular that the behavior of slices of the following cases remains unchanged:

  1. @a = ()[1,2];
  2. @a = (getpwent)[7,0];
  3. @a = (anything_returning_empty_list())[2,1,2];
  4. @a = @b[2,1,2];
  5. @a = @c{'a','b','c'};

See perldata.
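The two cases can be compared directly by counting the elements of each result. The variable names here are illustrative.

```perl
use strict;
use warnings;

# Slice of a non-empty literal list whose selected elements are all undef.
my @from_literal = (1, undef, undef, 2)[2, 1, 2];

# Slice of an empty list.
my @from_empty = ()[1, 2];

my $n_literal = @from_literal;   # three undefined elements
my $n_empty   = @from_empty;     # no elements
```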

(\$) prototype and $foo{a}

A scalar reference prototype now correctly allows a hash or array element in that slot.
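For instance, a sub declared with a (\$) prototype can be handed a hash or array element directly. The sub name bump and the sample data are hypothetical.

```perl
use strict;
use warnings;

# The (\$) prototype makes Perl pass a reference to the scalar argument.
sub bump(\$) {
    my ($ref) = @_;
    $$ref += 1;
    return;
}

my %foo  = (a => 1);
my @list = (10, 20);

bump($foo{a});     # hash element in the \$ slot
bump($list[1]);    # array element in the \$ slot
```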

goto &sub and AUTOLOAD

The goto &sub construct works correctly when &sub happens to be autoloaded.

-bareword allowed under use integer

The autoquoting of barewords preceded by - did not work in prior versions when the integer pragma was enabled. This has been fixed.

Failures in DESTROY()

When code in a destructor threw an exception, it went unnoticed in earlier versions of Perl, unless someone happened to be looking in $@ just after the point the destructor happened to run. Such failures are now visible as warnings when warnings are enabled.
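A sketch of how such a failure surfaces, assuming a made-up class Fragile whose destructor always dies. The warning is captured through a `__WARN__` handler so that it can be inspected.

```perl
use strict;
use warnings;

package Fragile;
sub new     { return bless {}, shift }
sub DESTROY { die "cleanup failed\n" }   # exception thrown during destruction

package main;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

{
    my $obj = Fragile->new;
}   # $obj goes out of scope here and DESTROY runs
```

The program carries on after the block; the exception is reported as a warning carrying the "(in cleanup)" prefix rather than being propagated.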

Locale bugs fixed

printf() and sprintf() previously reset the numeric locale back to the default "C" locale. This has been fixed.

Numbers formatted according to the local numeric locale (such as using a decimal comma instead of a decimal dot) caused "isn't numeric" warnings, even while the operations accessing those numbers produced correct results. These warnings have been discontinued.

Memory leaks

The eval 'return sub {...}' construct could sometimes leak memory. This has been fixed.

Operations that aren't filehandle constructors used to leak memory when used on invalid filehandles. This has been fixed.

Constructs that modified @_ could fail to deallocate values in @_ and thus leak memory. This has been corrected.

Spurious subroutine stubs after failed subroutine calls

Perl could sometimes create empty subroutine stubs when a subroutine was not found in the package. Such cases stopped later method lookups from progressing into base packages. This has been corrected.

Taint failures under -U

When running in unsafe mode, taint violations could sometimes cause silent failures. This has been fixed.

END blocks and the -c switch

Prior versions used to run BEGIN and END blocks when Perl was run in compile-only mode. Since this is typically not the expected behavior, END blocks are not executed anymore when the -c switch is used, or if compilation fails.

See Support for CHECK blocks for how to run things when the compile phase ends.

Potential to leak DATA filehandles

Using the __DATA__ token creates an implicit filehandle to the file that contains the token. It is the program's responsibility to close it when it is done reading from it.

This caveat is now better explained in the documentation. See perldata.

New or Changed Diagnostics

  • "%s" variable %s masks earlier declaration in same %s

    (W misc) A "my" or "our" variable has been redeclared in the current scope or statement, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier variable will still exist until the end of the scope or until all closure referents to it are destroyed.

  • "my sub" not yet implemented

    (F) Lexically scoped subroutines are not yet implemented. Don't try that yet.

  • "our" variable %s redeclared

    (W misc) You seem to have already declared the same global once before in the current lexical scope.

  • '!' allowed only after types %s

    (F) The '!' is allowed in pack() and unpack() only after certain types. See pack.

  • / cannot take a count

    (F) You had an unpack template indicating a counted-length string, but you have also specified an explicit size for the string. See pack.

  • / must be followed by a, A or Z

    (F) You had an unpack template indicating a counted-length string, which must be followed by one of the letters a, A or Z to indicate what sort of string is to be unpacked. See pack.

  • / must be followed by a*, A* or Z*

    (F) You had a pack template indicating a counted-length string. Currently the only things that can have their length counted are a*, A* or Z*. See pack.

  • / must follow a numeric type

    (F) You had an unpack template that contained a '/', but this did not follow some numeric unpack specification. See pack.

  • /%s/: Unrecognized escape \\%c passed through

    (W regexp) You used a backslash-character combination which is not recognized by Perl. This combination appears in an interpolated variable or a '-delimited regular expression. The character was understood literally.

  • /%s/: Unrecognized escape \\%c in character class passed through

    (W regexp) You used a backslash-character combination which is not recognized by Perl inside character classes. The character was understood literally.

  • /%s/ should probably be written as "%s"

    (W syntax) You have used a pattern where Perl expected to find a string, as in the first argument to join. Perl will treat the true or false result of matching the pattern against $_ as the string, which is probably not what you had in mind.

  • %s() called too early to check prototype

    (W prototype) You've called a function that has a prototype before the parser saw a definition or declaration for it, and Perl could not check that the call conforms to the prototype. You need to either add an early prototype declaration for the subroutine in question, or move the subroutine definition ahead of the call to get proper prototype checking. Alternatively, if you are certain that you're calling the function correctly, you may put an ampersand before the name to avoid the warning. See perlsub.

  • %s argument is not a HASH or ARRAY element

    (F) The argument to exists() must be a hash or array element, such as:

    1. $foo{$bar}
    2. $ref->{"susie"}[12]

  • %s argument is not a HASH or ARRAY element or slice

    (F) The argument to delete() must be either a hash or array element, such as:

    1. $foo{$bar}
    2. $ref->{"susie"}[12]

    or a hash or array slice, such as:

    1. @foo[$bar, $baz, $xyzzy]
    2. @{$ref->[12]}{"susie", "queue"}

  • %s argument is not a subroutine name

    (F) The argument to exists() for exists &sub must be a subroutine name, and not a subroutine call. exists &sub() will generate this error.

  • %s package attribute may clash with future reserved word: %s

    (W reserved) A lowercase attribute name was used that had a package-specific handler. That name might have a meaning to Perl itself some day, even though it doesn't yet. Perhaps you should use a mixed-case attribute name, instead. See attributes.

  • (in cleanup) %s

    (W misc) This prefix usually indicates that a DESTROY() method raised the indicated exception. Since destructors are usually called by the system at arbitrary points during execution, and often a vast number of times, the warning is issued only once for any number of failures that would otherwise result in the same message being repeated.

    Failure of user callbacks dispatched using the G_KEEPERR flag could also result in this warning. See G_KEEPERR in perlcall.

  • <> should be quotes

    (F) You wrote require <file> when you should have written require 'file'.

  • Attempt to join self

    (F) You tried to join a thread from within itself, which is an impossible task. You may be joining the wrong thread, or you may need to move the join() to some other thread.

  • Bad evalled substitution pattern

    (F) You've used the /e switch to evaluate the replacement for a substitution, but perl found a syntax error in the code to evaluate, most likely an unexpected right brace '}'.

  • Bad realloc() ignored

    (S) An internal routine called realloc() on something that had never been malloc()ed in the first place. Mandatory, but can be disabled by setting environment variable PERL_BADFREE to 1.

  • Bareword found in conditional

    (W bareword) The compiler found a bareword where it expected a conditional, which often indicates that an || or && was parsed as part of the last argument of the previous construct, for example:

    1. open FOO || die;

    It may also indicate a misspelled constant that has been interpreted as a bareword:

    1. use constant TYPO => 1;
    2. if (TYOP) { print "foo" }

    The strict pragma is useful in avoiding such errors.

  • Binary number > 0b11111111111111111111111111111111 non-portable

    (W portable) The binary number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See perlport for more on portability concerns.

  • Bit vector size > 32 non-portable

    (W portable) Using bit vector sizes larger than 32 is non-portable.

  • Buffer overflow in prime_env_iter: %s

    (W internal) A warning peculiar to VMS. While Perl was preparing to iterate over %ENV, it encountered a logical name or symbol definition which was too long, so it was truncated to the string shown.

  • Can't check filesystem of script "%s"

    (P) For some reason you can't check the filesystem of the script for nosuid.

  • Can't declare class for non-scalar %s in "%s"

    (S) Currently, only scalar variables can be declared with a specific class qualifier in a "my" or "our" declaration. The semantics may be extended for other types of variables in future.

  • Can't declare %s in "%s"

    (F) Only scalar, array, and hash variables may be declared as "my" or "our" variables. They must have ordinary identifiers as names.

  • Can't ignore signal CHLD, forcing to default

    (W signal) Perl has detected that it is being run with the SIGCHLD signal (sometimes known as SIGCLD) disabled. Since disabling this signal will interfere with proper determination of exit status of child processes, Perl has reset the signal to its default value. This situation typically indicates that the parent program under which Perl may be running (e.g., cron) is being very careless.

  • Can't modify non-lvalue subroutine call

    (F) Subroutines meant to be used in lvalue context should be declared as such, see Lvalue subroutines in perlsub.

  • Can't read CRTL environ

    (S) A warning peculiar to VMS. Perl tried to read an element of %ENV from the CRTL's internal environment array and discovered the array was missing. You need to figure out where your CRTL misplaced its environ or define PERL_ENV_TABLES (see perlvms) so that environ is not searched.

  • Can't remove %s: %s, skipping file

    (S) You requested an inplace edit without creating a backup file. Perl was unable to remove the original file to replace it with the modified file. The file was left unmodified.

  • Can't return %s from lvalue subroutine

    (F) Perl detected an attempt to return illegal lvalues (such as temporary or readonly values) from a subroutine used as an lvalue. This is not allowed.

  • Can't weaken a nonreference

    (F) You attempted to weaken something that was not a reference. Only references can be weakened.

  • Character class [:%s:] unknown

    (F) The class in the character class [: :] syntax is unknown. See perlre.

  • Character class syntax [%s] belongs inside character classes

    (W unsafe) The character class constructs [: :], [= =], and [. .] go inside character classes, the [] are part of the construct, for example: /[012[:alpha:]345]/. Note that [= =] and [. .] are not currently implemented; they are simply placeholders for future extensions.

  • Constant is not %s reference

    (F) A constant value (perhaps declared using the use constant pragma) is being dereferenced, but it amounts to the wrong type of reference. The message indicates the type of reference that was expected. This usually indicates a syntax error in dereferencing the constant value. See Constant Functions in perlsub and constant.

  • constant(%s): %s

    (F) The parser found inconsistencies either while attempting to define an overloaded constant, or when trying to find the character name specified in the \N{...} escape. Perhaps you forgot to load the corresponding overload or charnames pragma? See charnames and overload.

  • CORE::%s is not a keyword

    (F) The CORE:: namespace is reserved for Perl keywords.

  • defined(@array) is deprecated

    (D) defined() is not usually useful on arrays because it checks for an undefined scalar value. If you want to see if the array is empty, just use if (@array) { # not empty } for example.

  • defined(%hash) is deprecated

    (D) defined() is not usually useful on hashes because it checks for an undefined scalar value. If you want to see if the hash is empty, just use if (%hash) { # not empty } for example.

  • Did not produce a valid header

    See Server error.

  • (Did you mean "local" instead of "our"?)

    (W misc) Remember that "our" does not localize the declared global variable. You have declared it again in the same lexical scope, which seems superfluous.

  • Document contains no data

    See Server error.

  • entering effective %s failed

    (F) While under the use filetest pragma, switching the real and effective uids or gids failed.

  • false [] range "%s" in regexp

    (W regexp) A character class range must start and end at a literal character, not another character class like \d or [:alpha:]. The "-" in your false range is interpreted as a literal "-". Consider quoting the "-", "\-". See perlre.

  • Filehandle %s opened only for output

    (W io) You tried to read from a filehandle opened only for writing. If you intended it to be a read/write filehandle, you needed to open it with "+<" or "+>" or "+>>" instead of with ">". If you intended only to read from the file, use "<". See open.

  • flock() on closed filehandle %s

    (W closed) The filehandle you're attempting to flock() got itself closed some time before now. Check your logic flow. flock() operates on filehandles. Are you attempting to call flock() on a dirhandle by the same name?

  • Global symbol "%s" requires explicit package name

    (F) You've said "use strict vars", which indicates that all variables must either be lexically scoped (using "my"), declared beforehand using "our", or explicitly qualified to say which package the global variable is in (using "::").

  • Hexadecimal number > 0xffffffff non-portable

    (W portable) The hexadecimal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See perlport for more on portability concerns.

  • Ill-formed CRTL environ value "%s"

    (W internal) A warning peculiar to VMS. Perl tried to read the CRTL's internal environ array, and encountered an element without the = delimiter used to separate keys from values. The element is ignored.

  • Ill-formed message in prime_env_iter: |%s|

    (W internal) A warning peculiar to VMS. Perl tried to read a logical name or CLI symbol definition when preparing to iterate over %ENV, and didn't see the expected delimiter between key and value, so the line was ignored.

  • Illegal binary digit %s

    (F) You used a digit other than 0 or 1 in a binary number.

  • Illegal binary digit %s ignored

    (W digit) You may have tried to use a digit other than 0 or 1 in a binary number. Interpretation of the binary number stopped before the offending digit.

  • Illegal number of bits in vec

    (F) The number of bits in vec() (the third argument) must be a power of two from 1 to 32 (or 64, if your platform supports that).

  • Integer overflow in %s number

    (W overflow) The hexadecimal, octal or binary number you have specified either as a literal or as an argument to hex() or oct() is too big for your architecture, and has been converted to a floating point number. On a 32-bit architecture the largest hexadecimal, octal or binary number representable without overflow is 0xFFFFFFFF, 037777777777, or 0b11111111111111111111111111111111 respectively. Note that Perl transparently promotes all numbers to a floating point representation internally--subject to loss of precision errors in subsequent operations.

  • Invalid %s attribute: %s

    The indicated attribute for a subroutine or variable was not recognized by Perl or by a user-supplied handler. See attributes.

  • Invalid %s attributes: %s

    The indicated attributes for a subroutine or variable were not recognized by Perl or by a user-supplied handler. See attributes.

  • invalid [] range "%s" in regexp

    The offending range is now explicitly displayed.

  • Invalid separator character %s in attribute list

    (F) Something other than a colon or whitespace was seen between the elements of an attribute list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon. See attributes.

  • Invalid separator character %s in subroutine attribute list

    (F) Something other than a colon or whitespace was seen between the elements of a subroutine attribute list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon.

  • leaving effective %s failed

    (F) While under the use filetest pragma, switching the real and effective uids or gids failed.

  • Lvalue subs returning %s not implemented yet

    (F) Due to limitations in the current implementation, array and hash values cannot be returned in subroutines used in lvalue context. See Lvalue subroutines in perlsub.

  • Method %s not permitted

    See Server error.

  • Missing %sbrace%s on \N{}

    (F) Wrong syntax of character name literal \N{charname} within double-quotish context.

  • Missing command in piped open

    (W pipe) You used the open(FH, "| command") or open(FH, "command |") construction, but the command was missing or blank.

  • Missing name in "my sub"

    (F) The reserved syntax for lexically scoped subroutines requires that they have a name with which they can be found.

  • No %s specified for -%c

    (F) The indicated command line switch needs a mandatory argument, but you haven't specified one.

  • No package name allowed for variable %s in "our"

    (F) Fully qualified variable names are not allowed in "our" declarations, because that doesn't make much sense under existing semantics. Such syntax is reserved for future extensions.

  • No space allowed after -%c

    (F) The argument to the indicated command line switch must follow immediately after the switch, without intervening spaces.

  • no UTC offset information; assuming local time is UTC

    (S) A warning peculiar to VMS. Perl was unable to find the local timezone offset, so it's assuming that local system time is equivalent to UTC. If it's not, define the logical name SYS$TIMEZONE_DIFFERENTIAL to translate to the number of seconds which need to be added to UTC to get local time.

  • Octal number > 037777777777 non-portable

    (W portable) The octal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See perlport for more on portability concerns and on writing portable code.

  • panic: del_backref

    (P) Failed an internal consistency check while trying to reset a weak reference.

  • panic: kid popen errno read

    (F) A forked child returned an incomprehensible message about its errno.

  • panic: magic_killbackrefs

    (P) Failed an internal consistency check while trying to reset all weak references to an object.

  • Parentheses missing around "%s" list

    (W parenthesis) You said something like

        my $foo, $bar = @_;

    when you meant

        my ($foo, $bar) = @_;

    Remember that "my", "our", and "local" bind tighter than comma.

  • Possible unintended interpolation of %s in string

    (W ambiguous) It used to be that Perl would try to guess whether you wanted an array interpolated or a literal @. It no longer does this; arrays are now always interpolated into strings. This means that if you try something like:

        print "fred@example.com";

    and the array @example doesn't exist, Perl is going to print fred.com, which is probably not what you wanted. To get a literal @ sign in a string, put a backslash before it, just as you would to get a literal $ sign.
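
    As a small illustration of the advice above, the following sketch shows the backslashed form next to a single-quoted string, which never interpolates (the single-quoted variant is an addition here, not something the diagnostic itself mentions):

```perl
use strict;
use warnings;

# A backslash keeps the @ literal inside a double-quoted string.
my $escaped = "fred\@example.com";

# Single-quoted strings do not interpolate at all.
my $single = 'fred@example.com';

print "$escaped\n";
print "same\n" if $escaped eq $single;
```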

  • Possible Y2K bug: %s

    (W y2k) You are concatenating the number 19 with another number, which could be a potential Year 2000 problem.

  • pragma "attrs" is deprecated, use "sub NAME : ATTRS" instead

    (W deprecated) You have written something like this:

        sub doit
        {
            use attrs qw(locked);
        }

    You should use the new declaration syntax instead.

        sub doit : locked
        {
            ...

    The use attrs pragma is now obsolete, and is only provided for backward-compatibility. See Subroutine Attributes in perlsub.

  • Premature end of script headers

    See Server error.

  • Repeat count in pack overflows

    (F) You can't specify a repeat count so large that it overflows your signed integers. See pack.

  • Repeat count in unpack overflows

    (F) You can't specify a repeat count so large that it overflows your signed integers. See unpack.

  • realloc() of freed memory ignored

    (S) An internal routine called realloc() on something that had already been freed.

  • Reference is already weak

    (W misc) You have attempted to weaken a reference that is already weak. Doing so has no effect.
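
    One way to avoid weakening a reference twice is to test it first. This sketch assumes the weaken() and isweak() functions from Scalar::Util, which ships with later Perl releases; the data structure is hypothetical:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $target = { name => 'example' };   # strong reference keeps the data alive
my $ref    = $target;                 # second reference, about to be weakened

weaken($ref) unless isweak($ref);     # weakens the reference
weaken($ref) unless isweak($ref);     # skipped: already weak

print "weak: ", (isweak($ref) ? "yes" : "no"), "\n";
print "name: $ref->{name}\n";
```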

  • setpgrp can't take arguments

    (F) Your system has the setpgrp() from BSD 4.2, which takes no arguments, unlike POSIX setpgid(), which takes a process ID and process group ID.

  • Strange *+?{} on zero-length expression

    (W regexp) You applied a regular expression quantifier in a place where it makes no sense, such as on a zero-width assertion. Try putting the quantifier inside the assertion instead. For example, the way to match "abc" provided that it is followed by three repetitions of "xyz" is /abc(?=(?:xyz){3})/, not /abc(?=xyz){3}/.
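
    The recommended pattern can be tried out with a short sketch like this one (the test strings are made up for illustration):

```perl
use strict;
use warnings;

# The quantifier sits inside the lookahead, so "abc" matches only when
# three repetitions of "xyz" follow it.
my $re = qr/abc(?=(?:xyz){3})/;

for my $candidate ("abcxyzxyzxyz", "abcxyzxyz") {
    my $verdict = $candidate =~ $re ? "matches" : "no match";
    printf "%-14s %s\n", $candidate, $verdict;
}
```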

  • switching effective %s is not implemented

    (F) While under the use filetest pragma, we cannot switch the real and effective uids or gids.

  • This Perl can't reset CRTL environ elements (%s)
  • This Perl can't set CRTL environ elements (%s=%s)

    (W internal) Warnings peculiar to VMS. You tried to change or delete an element of the CRTL's internal environ array, but your copy of Perl wasn't built with a CRTL that contained the setenv() function. You'll need to rebuild Perl with a CRTL that does, or redefine PERL_ENV_TABLES (see perlvms) so that the environ array isn't the target of the change to %ENV which produced the warning.

  • Too late to run %s block

    (W void) A CHECK or INIT block is being defined during run time proper, when the opportunity to run them has already passed. Perhaps you are loading a file with require or do when you should be using use instead. Or perhaps you should put the require or do inside a BEGIN block.

  • Unknown open() mode '%s'

    (F) The second argument of 3-argument open() is not among the list of valid modes: <, >, >>, +<, +>, +>>, -|, |-.
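
    The following sketch exercises two of the valid modes and then traps the fatal error for an invalid one. It assumes File::Temp is available for creating a scratch file; the mode string '<>' is simply an example of something not in the list above:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; removed when the program exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Write with the '>' mode, then read back with the '<' mode.
open(my $out, '>', $path) or die "cannot write $path: $!";
print {$out} "hello\n";
close $out or die "cannot close $path: $!";

open(my $in, '<', $path) or die "cannot read $path: $!";
my $line = <$in>;
close $in;

# An unrecognized mode is a fatal error, so trap it with eval.
my $ok = eval { open(my $bad, '<>', $path); 1 };
my $error = $ok ? '' : $@;

print "read back: $line";
print "error: $error" unless $ok;
```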

  • Unknown process %x sent message to prime_env_iter: %s

    (P) An error peculiar to VMS. Perl was reading values for %ENV before iterating over it, and someone else stuck a message in the stream of data Perl expected. Someone's very confused, or perhaps trying to subvert Perl's population of %ENV for nefarious purposes.

  • Unrecognized escape \\%c passed through

    (W misc) You used a backslash-character combination which is not recognized by Perl. The character was understood literally.

  • Unterminated attribute parameter in attribute list

    (F) The lexer saw an opening (left) parenthesis character while parsing an attribute list, but the matching closing (right) parenthesis character was not found. You may need to add (or remove) a backslash character to get your parentheses to balance. See attributes.

  • Unterminated attribute list

    (F) The lexer found something other than a simple identifier at the start of an attribute, and it wasn't a semicolon or the start of a block. Perhaps you terminated the parameter list of the previous attribute too soon. See attributes.

  • Unterminated attribute parameter in subroutine attribute list

    (F) The lexer saw an opening (left) parenthesis character while parsing a subroutine attribute list, but the matching closing (right) parenthesis character was not found. You may need to add (or remove) a backslash character to get your parentheses to balance.

  • Unterminated subroutine attribute list

    (F) The lexer found something other than a simple identifier at the start of a subroutine attribute, and it wasn't a semicolon or the start of a block. Perhaps you terminated the parameter list of the previous attribute too soon.

  • Value of CLI symbol "%s" too long

    (W misc) A warning peculiar to VMS. Perl tried to read the value of an %ENV element from a CLI symbol table, and found a resultant string longer than 1024 characters. The return value has been truncated to 1024 characters.

  • Version number must be a constant number

    (P) The attempt to translate a use Module n.n LIST statement into its equivalent BEGIN block found an internal inconsistency with the version number.

New tests

  • lib/attrs

    Compatibility tests for sub : attrs vs the older use attrs.

  • lib/env

    Tests for new environment scalar capability (e.g., use Env qw($BAR);).

  • lib/env-array

    Tests for new environment array capability (e.g., use Env qw(@PATH);).

  • lib/io_const

    IO constants (SEEK_*, _IO*).

  • lib/io_dir

    Directory-related IO methods (new, read, close, rewind, tied delete).

  • lib/io_multihomed

    INET sockets with multi-homed hosts.

  • lib/io_poll

    IO poll().

  • lib/io_unix

    UNIX sockets.

  • op/attrs

    Regression tests for my ($x,@y,%z) : attrs and sub : attrs.

  • op/filetest

    File test operators.

  • op/lex_assign

    Verify operations that access pad objects (lexicals and temporaries).

  • op/exists_sub

    Verify exists &sub operations.

Incompatible Changes

Perl Source Incompatibilities

Beware that any new warnings that have been added or old ones that have been enhanced are not considered incompatible changes.

Since all new warnings must be explicitly requested via the -w switch or the warnings pragma, it is ultimately the programmer's responsibility to ensure that warnings are enabled judiciously.

  • CHECK is a new keyword

    All subroutine definitions named CHECK are now special. See "Support for CHECK blocks" for more information.

  • Treatment of list slices of undef has changed

    There is a potential incompatibility in the behavior of list slices that are comprised entirely of undefined values. See Behavior of list slices is more consistent.

  • Format of $English::PERL_VERSION is different

    The English module now sets $PERL_VERSION to $^V (a string value) rather than $] (a numeric value). This is a potential incompatibility. Send us a report via perlbug if you are affected by this.

    See Improved Perl version numbering system for the reasons for this change.

  • Literals of the form 1.2.3 parse differently

    Previously, numeric literals with more than one dot in them were interpreted as a floating point number concatenated with one or more numbers. Such "numbers" are now parsed as strings composed of the specified ordinals.

    For example, print 97.98.99 used to output 97.9899 in earlier versions, but now prints abc.

    See Support for strings represented as a vector of ordinals.
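
    A minimal sketch of the new parsing, using the same literal as the example above:

```perl
use strict;
use warnings;

# A bare literal with two or more dots is parsed as a string built from
# the characters with ordinals 97, 98 and 99.
my $string = 97.98.99;

print "$string\n";
print "length: ", length($string), "\n";
```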

  • Possibly changed pseudo-random number generator

    Perl programs that depend on reproducing a specific set of pseudo-random numbers may now produce different output due to improvements made to the rand() builtin. You can use sh Configure -Drandfunc=rand to obtain the old behavior.

    See Better pseudo-random number generator.

  • Hashing function for hash keys has changed

    Even though Perl hashes are not order preserving, the apparently random order encountered when iterating on the contents of a hash is actually determined by the hashing algorithm used. Improvements in the algorithm may yield a random order that is different from that of previous versions, especially when iterating on hashes.

    See Better worst-case behavior of hashes for additional information.

  • undef fails on read only values

    Using the undef operator on a readonly value (such as $1) has the same effect as assigning undef to the readonly value--it throws an exception.

  • Close-on-exec bit may be set on pipe and socket handles

    Pipe and socket handles are also now subject to the close-on-exec behavior determined by the special variable $^F.

    See More consistent close-on-exec behavior.

  • Writing "$$1" to mean "${$}1" is unsupported

    Perl 5.004 deprecated the interpretation of $$1 and similar within interpolated strings to mean $$ . "1", but still allowed it.

    In Perl 5.6.0 and later, "$$1" always means "${$1}".

  • delete(), each(), values() and \(%h) operate on aliases to values, not copies

    delete(), each(), values() and hashes (e.g. \(%h)) in a list context return the actual values in the hash, instead of copies (as they used to in earlier versions). Typical idioms for using these constructs copy the returned values, but this can make a significant difference when creating references to the returned values. Keys in the hash are still returned as copies when iterating on a hash.

    See also delete(), each(), values() and hash iteration are faster.
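
    The difference between aliased values and copied keys can be seen in a short sketch (the hash contents are made up for illustration):

```perl
use strict;
use warnings;

my %count = (a => 1, b => 2);

# values() returns aliases, so modifying $_ changes the hash itself.
$_ *= 10 for values %count;

# keys() returns copies, so modifying them leaves the hash keys alone.
for my $key (keys %count) {
    $key .= "_changed";
}

print "$_ => $count{$_}\n" for sort keys %count;
```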

  • vec(EXPR,OFFSET,BITS) enforces powers-of-two BITS

    vec() generates a run-time error if the BITS argument is not a valid power-of-two integer.
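
    A sketch of both cases, assuming the error is trapped with eval so the program can continue (8 and 3 are arbitrary choices of a valid and an invalid BITS value):

```perl
use strict;
use warnings;

my $bits = '';

# 8 is a power of two, so these assignments are accepted.
vec($bits, 0, 8) = 65;    # first byte becomes "A"
vec($bits, 1, 8) = 66;    # second byte becomes "B"

# 3 is not a power of two, so vec() raises a run-time error.
my $ok = eval { my $value = vec($bits, 0, 3); 1 };
my $error = $ok ? '' : $@;

print "bits: $bits\n";
print "error: $error" unless $ok;
```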

  • Text of some diagnostic output has changed

    Most references to internal Perl operations in diagnostics have been changed to be more descriptive. This may be an issue for programs that may incorrectly rely on the exact text of diagnostics for proper functioning.

  • %@ has been removed

    The undocumented special variable %@ that used to accumulate "background" errors (such as those that happen in DESTROY()) has been removed, because it could potentially result in memory leaks.

  • Parenthesized not() behaves like a list operator

    The not operator now falls under the "if it looks like a function, it behaves like a function" rule.

    As a result, the parenthesized form can be used with grep and map. The following construct used to be a syntax error before, but it works as expected now:

        grep not($_), @things;

    On the other hand, using not with a literal list slice may not work. The following previously allowed construct:

        print not (1,2,3)[0];

    needs to be written with additional parentheses now:

        print not((1,2,3)[0]);

    The behavior remains unaffected when not is not followed by parentheses.

  • Semantics of bareword prototype (*) have changed

    The semantics of the bareword prototype * have changed. Perl 5.005 always coerced simple scalar arguments to a typeglob, which wasn't useful in situations where the subroutine must distinguish between a simple scalar and a typeglob. The new behavior is to not coerce bareword arguments to a typeglob. The value will always be visible as either a simple scalar or as a reference to a typeglob.

    See More functional bareword prototype (*).

  • Semantics of bit operators may have changed on 64-bit platforms

    If your platform is either natively 64-bit or if Perl has been configured to use 64-bit integers, i.e., $Config{ivsize} is 8, there may be a potential incompatibility in the behavior of bitwise numeric operators (& | ^ ~ << >>). These operators used to strictly operate on the lower 32 bits of integers in previous versions, but now operate over the entire native integral width. In particular, note that unary ~ will produce different results on platforms that have different $Config{ivsize}. For portability, be sure to mask off the excess bits in the result of unary ~, e.g., ~$x & 0xffffffff.

    See Bit operators support full native integer width.
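
    The masking advice can be sketched as follows; the input value is arbitrary, and the masked result is the same regardless of the integer width reported by $Config{ivsize}:

```perl
use strict;
use warnings;
use Config;

my $x = 0x0000ffff;

# Unary ~ flips every bit of the native integer, so its raw result
# depends on the platform. Masking keeps only the low 32 bits.
my $masked = ~$x & 0xffffffff;

printf "ivsize: %d bytes\n", $Config{ivsize};
printf "masked: 0x%08x\n", $masked;
```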

  • More builtins taint their results

    As described in Improved security features, there may be more sources of taint in a Perl program.

    To avoid these new tainting behaviors, you can build Perl with the Configure option -Accflags=-DINCOMPLETE_TAINTS. Beware that the ensuing perl binary may be insecure.

C Source Incompatibilities

  • PERL_POLLUTE

    Release 5.005 grandfathered old global symbol names by providing preprocessor macros for extension source compatibility. As of release 5.6.0, these preprocessor definitions are not available by default. You need to explicitly compile perl with -DPERL_POLLUTE to get these definitions. For extensions still using the old symbols, this option can be specified via MakeMaker:

        perl Makefile.PL POLLUTE=1

  • PERL_IMPLICIT_CONTEXT

    This new build option provides a set of macros for all API functions such that an implicit interpreter/thread context argument is passed to every API function. As a result of this, something like sv_setsv(foo,bar) amounts to a macro invocation that actually translates to something like Perl_sv_setsv(my_perl,foo,bar). While this is generally expected to not have any significant source compatibility issues, the difference between a macro and a real function call will need to be considered.

    This means that there is a source compatibility issue as a result of this if your extensions attempt to use pointers to any of the Perl API functions.

    Note that the above issue is not relevant to the default build of Perl, whose interfaces continue to match those of prior versions (but subject to the other options described here).

    See Background and PERL_IMPLICIT_CONTEXT in perlguts for detailed information on the ramifications of building Perl with this option.

        NOTE: PERL_IMPLICIT_CONTEXT is automatically enabled whenever Perl is built
        with one of -Dusethreads, -Dusemultiplicity, or both. It is not
        intended to be enabled by users at this time.

  • PERL_POLLUTE_MALLOC

    Enabling Perl's malloc in release 5.005 and earlier caused the namespace of the system's malloc family of functions to be usurped by the Perl versions, since by default they used the same names. Besides causing problems on platforms that do not allow these functions to be cleanly replaced, this also meant that the system versions could not be called in programs that used Perl's malloc. Previous versions of Perl have allowed this behaviour to be suppressed with the HIDEMYMALLOC and EMBEDMYMALLOC preprocessor definitions.

    As of release 5.6.0, Perl's malloc family of functions have default names distinct from the system versions. You need to explicitly compile perl with -DPERL_POLLUTE_MALLOC to get the older behaviour. HIDEMYMALLOC and EMBEDMYMALLOC have no effect, since the behaviour they enabled is now the default.

    Note that these functions do not constitute Perl's memory allocation API. See Memory Allocation in perlguts for further information about that.

Compatible C Source API Changes

  • PATCHLEVEL is now PERL_VERSION

    The cpp macros PERL_REVISION, PERL_VERSION, and PERL_SUBVERSION are now available by default from perl.h, and reflect the base revision, patchlevel, and subversion respectively. PERL_REVISION had no prior equivalent, while PERL_VERSION and PERL_SUBVERSION were previously available as PATCHLEVEL and SUBVERSION.

    The new names cause less pollution of the cpp namespace and reflect what the numbers have come to stand for in common practice. For compatibility, the old names are still supported when patchlevel.h is explicitly included (as required before), so there is no source incompatibility from the change.

Binary Incompatibilities

In general, the default build of this release is expected to be binary compatible for extensions built with the 5.005 release or its maintenance versions. However, specific platforms may have broken binary compatibility due to changes in the defaults used in hints files. Therefore, please be sure to always check the platform-specific README files for any notes to the contrary.

The usethreads or usemultiplicity builds are not binary compatible with the corresponding builds in 5.005.

On platforms that require an explicit list of exports (AIX, OS/2 and Windows, among others), purely internal symbols such as parser functions and the run time opcodes are not exported by default. Perl 5.005 used to export all functions irrespective of whether they were considered part of the public API or not.

For the full list of public API functions, see perlapi.

Known Problems

Localizing a tied hash element may leak memory

As of the 5.6.1 release, there is a known leak when code such as this is executed:

    use Tie::Hash;
    tie my %tie_hash => 'Tie::StdHash';
    ...
    local($tie_hash{Foo}) = 1; # leaks

Known test failures

  • 64-bit builds

    Subtest #15 of lib/b.t may fail under 64-bit builds on platforms such as HP-UX PA64 and Linux IA64. The issue is still being investigated.

    The lib/io_multihomed test may hang in HP-UX if Perl has been configured to be 64-bit. Because other 64-bit platforms do not hang in this test, HP-UX is suspect. All other tests pass in 64-bit HP-UX. The test attempts to create and connect to "multihomed" sockets (sockets which have multiple IP addresses).

    Note that 64-bit support is still experimental.

  • Failure of Thread tests

    Subtests 19 and 20 of the lib/thr5005.t test are known to fail due to fundamental problems in the 5.005 threading implementation. These are not new failures--Perl 5.005_0x has the same bugs, but didn't have these tests. (Note that support for 5.005-style threading remains experimental.)

  • NEXTSTEP 3.3 POSIX test failure

    In NEXTSTEP 3.3p2 the implementation of strftime(3) in the operating system libraries is buggy: the %j format numbers the days of a month starting from zero, which, while logical to programmers, may cause subtests 19 to 27 of the lib/posix test to fail.

  • Tru64 (aka Digital UNIX, aka DEC OSF/1) lib/sdbm test failure with gcc

    If compiled with gcc 2.95 the lib/sdbm test will fail (dump core). The cure is to use the vendor cc, which comes with the operating system and produces good code.

EBCDIC platforms not fully supported

In earlier releases of Perl, EBCDIC environments like OS390 (also known as Open Edition MVS) and VM-ESA were supported. Due to changes required by the UTF-8 (Unicode) support, the EBCDIC platforms are not supported in Perl 5.6.0.

The 5.6.1 release improves support for EBCDIC platforms, but they are not fully supported yet.

UNICOS/mk CC failures during Configure run

In UNICOS/mk the following errors may appear during the Configure run:

    Guessing which symbols your C compiler and preprocessor define...
    CC-20 cc: ERROR File = try.c, Line = 3
    ...
    bad switch yylook 79bad switch yylook 79bad switch yylook 79bad switch yylook 79#ifdef A29K
    ...
    4 errors detected in the compilation of "try.c".

The culprit is the broken awk of UNICOS/mk. The effect is fortunately rather mild: Perl itself is not adversely affected by the error, only the h2ph utility coming with Perl, and that is rather rarely needed these days.

Arrow operator and arrays

When the left argument to the arrow operator -> is an array, or the scalar operator operating on an array, the result of the operation must be considered erroneous. For example:

    @x->[2]
    scalar(@x)->[2]

These expressions will get run-time errors in some future release of Perl.

Experimental features

As discussed above, many features are still experimental. Interfaces and implementation of these features are subject to change, and in extreme cases, even subject to removal in some future release of Perl. These features include the following:

  • Threads
  • Unicode
  • 64-bit support
  • Lvalue subroutines
  • Weak references
  • The pseudo-hash data type
  • The Compiler suite
  • Internal implementation of file globbing
  • The DB module
  • The regular expression code constructs:

    (?{ code }) and (??{ code })

Obsolete Diagnostics

  • Character class syntax [: :] is reserved for future extensions

    (W) Within regular expression character classes ([]) the syntax beginning with "[:" and ending with ":]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[:" and ":\]".

  • Ill-formed logical name |%s| in prime_env_iter

    (W) A warning peculiar to VMS. A logical name was encountered when preparing to iterate over %ENV which violates the syntactic rules governing logical names. Because it cannot be translated normally, it is skipped, and will not appear in %ENV. This may be a benign occurrence, as some software packages might directly modify logical name tables and introduce nonstandard names, or it may indicate that a logical name table has been corrupted.

  • In string, @%s now must be written as \@%s

    The description of this error used to say:

        (Someday it will simply assume that an unbackslashed @
        interpolates an array.)

    That day has come, and this fatal error has been removed. It has been replaced by a non-fatal warning instead. See Arrays now always interpolate into double-quoted strings for details.

  • Probable precedence problem on %s

    (W) The compiler found a bareword where it expected a conditional, which often indicates that an || or && was parsed as part of the last argument of the previous construct, for example:

        open FOO || die;

  • regexp too big

    (F) The current implementation of regular expressions uses shorts as address offsets within a string. Unfortunately this means that if the regular expression compiles to longer than 32767, it'll blow up. Usually when you want a regular expression this big, there is a better way to do it with multiple statements. See perlre.

  • Use of "$$<digit>" to mean "${$}<digit>" is deprecated

    (D) Perl versions before 5.004 misinterpreted any type marker followed by "$" and a digit. For example, "$$0" was incorrectly taken to mean "${$}0" instead of "${$0}". This bug is (mostly) fixed in Perl 5.004.

    However, the developers of Perl 5.004 could not fix this bug completely, because at least two widely-used modules depend on the old meaning of "$$0" in a string. So Perl 5.004 still interprets "$$<digit>" in the old (broken) way inside strings; but it generates this message as a warning. And in Perl 5.005, this special treatment will cease.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup. There may also be information at http://www.perl.com/, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

HISTORY

Written by Gurusamy Sarathy <gsar@ActiveState.com>, with many contributions from The Perl Porters.

Send omissions or corrections to <perlbug@perl.org>.

perl56delta

NAME

perl56delta - what's new for perl v5.6.0

DESCRIPTION

This document describes differences between the 5.005 release and the 5.6.0 release.

Core Enhancements

Interpreter cloning, threads, and concurrency

Perl 5.6.0 introduces the beginnings of support for running multiple interpreters concurrently in different threads. In conjunction with the perl_clone() API call, which can be used to selectively duplicate the state of any given interpreter, it is possible to compile a piece of code once in an interpreter, clone that interpreter one or more times, and run all the resulting interpreters in distinct threads.

On the Windows platform, this feature is used to emulate fork() at the interpreter level. See perlfork for details about that.

This feature is still in evolution. It is eventually meant to be used to selectively clone a subroutine and data reachable from that subroutine in a separate interpreter and run the cloned subroutine in a separate thread. Since there is no shared data between the interpreters, little or no locking will be needed (unless parts of the symbol table are explicitly shared). This is obviously intended to be an easy-to-use replacement for the existing threads support.

Support for cloning interpreters and interpreter concurrency can be enabled using the -Dusethreads Configure option (see win32/Makefile for how to enable it on Windows). The resulting perl executable will be functionally identical to one that was built with -Dusemultiplicity, but the perl_clone() API call will only be available in the former.

-Dusethreads enables the cpp macro USE_ITHREADS by default, which in turn enables Perl source code changes that provide a clear separation between the op tree and the data it operates with. The former is immutable, and can therefore be shared between an interpreter and all of its clones, while the latter is considered local to each interpreter, and is therefore copied for each clone.

Note that building Perl with the -Dusemultiplicity Configure option is adequate if you wish to run multiple independent interpreters concurrently in different threads. -Dusethreads only provides the additional functionality of the perl_clone() API call and other support for running cloned interpreters concurrently.

    NOTE: This is an experimental feature. Implementation details are
    subject to change.

Lexically scoped warning categories

You can now control the granularity of warnings emitted by perl at a finer level using the use warnings pragma. warnings and perllexwarn have copious documentation on this feature.
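
A minimal sketch of the finer-grained control, assuming the 'uninitialized' category is the one to silence and that a __WARN__ handler collects whatever is emitted (the variable names are illustrative):

```perl
use strict;
use warnings;

# Collect warnings so the effect of each scope can be inspected.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

my $undef;

{
    # Only the 'uninitialized' category is disabled, and only within
    # this block.
    no warnings 'uninitialized';
    my $quiet = "value: " . $undef;
}

# Outside the block the category is active again, so this warns.
my $loud = "value: " . $undef;

print "warnings seen: ", scalar(@seen), "\n";
```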

Unicode and UTF-8 support

Perl now uses UTF-8 as its internal representation for character strings. The utf8 and bytes pragmas are used to control this support in the current lexical scope. See perlunicode, utf8 and bytes for more information.
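
As a rough illustration of the lexical scoping, this sketch measures the same string once in characters and once in bytes; the code point chosen is arbitrary:

```perl
use strict;
use warnings;

# A single character above the one-byte range.
my $smiley = "\x{263A}";

# By default length() counts characters.
my $chars = length($smiley);

# Within the scope of the bytes pragma, length() counts the bytes of
# the internal UTF-8 representation.
my $octets = do { use bytes; length($smiley) };

print "characters: $chars\n";
print "bytes: $octets\n";
```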

This feature is expected to evolve quickly to support some form of I/O disciplines that can be used to specify the kind of input and output data (bytes or characters). Until that happens, additional modules from CPAN will be needed to complete the toolkit for dealing with Unicode.

    NOTE: This should be considered an experimental feature. Implementation
    details are subject to change.

Support for interpolating named characters

The new \N escape interpolates named characters within strings. For example, "Hi! \N{WHITE SMILING FACE}" evaluates to a string with a Unicode smiley face at the end.
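
A small sketch of the escape in use, assuming the charnames pragma is loaded with the ':full' tag so that full character names are recognized:

```perl
use strict;
use warnings;
use charnames ':full';

# The named character is interpolated at the end of the string.
my $greeting = "Hi! \N{WHITE SMILING FACE}";

# Inspect the code point of the last character.
my $last = substr($greeting, -1);
printf "last character: U+%04X\n", ord($last);
```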

"our" declarations

An "our" declaration introduces a value that can be best understood as a lexically scoped symbolic alias to a global variable in the package that was current where the variable was declared. This is mostly useful as an alternative to the vars pragma, but also provides the opportunity to introduce typing and other attributes for such variables. See our.
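
A sketch of "our" standing in for the vars pragma under use strict; the package and subroutine names are hypothetical:

```perl
use strict;
use warnings;

package Counter;

# Declares a lexically scoped alias to the package variable
# $Counter::total, so it can be used unqualified under strict.
our $total = 0;

sub add {
    my ($amount) = @_;
    $total += $amount;
    return $total;
}

package main;

Counter::add(5);
Counter::add(7);

# The same variable is reachable by its fully qualified name.
print "total: $Counter::total\n";
```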

Support for strings represented as a vector of ordinals

Literals of the form v1.2.3.4 are now parsed as a string composed of characters with the specified ordinals. This is an alternative, more readable way to construct (possibly Unicode) strings instead of interpolating characters, as in "\x{1}\x{2}\x{3}\x{4}". The leading v may be omitted if there are more than two ordinals, so 1.2.3 is parsed the same as v1.2.3.

Strings written in this form are also useful to represent version "numbers". It is easy to compare such version "numbers" (which are really just plain strings) using any of the usual string comparison operators eq, ne, lt, gt, etc., or perform bitwise string operations on them using |, &, etc.

In conjunction with the new $^V magic variable (which contains the perl version as a string), such literals can be used as a readable way to check if you're running a particular version of Perl:

    # this will parse in older versions of Perl also
    if ($^V and $^V gt v5.6.0) {
        # new features supported
    }

require and use also have some special magic to support such literals, but this particular usage should be avoided because it leads to misleading error messages under versions of Perl which don't support vector strings. Using a true version number will ensure correct behavior in all versions of Perl:

    require 5.006;    # run time check for v5.6
    use 5.006_001;    # compile time check for v5.6.1

Also, sprintf and printf support the Perl-specific format flag %v to print ordinals of characters in arbitrary strings:

    printf "v%vd", $^V;           # prints current version, such as "v5.5.650"
    printf "%*vX", ":", $addr;    # formats IPv6 address
    printf "%*vb", " ", $bits;    # displays bitstring

See Scalar value constructors in perldata for additional information.
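
The pieces above can be tried together in a short sketch. It uses only literal vector strings; the variable names are illustrative.

```perl
use strict;
use warnings;

my $have = v5.6.1;
my $want = v5.6.0;

# Vector strings are ordinary strings, so string comparison applies.
my $newer = ($have gt $want) ? 1 : 0;

# %vd joins the ordinal of each character with dots.
my $dotted = sprintf("%vd", $have);

# %*v takes the join string from the argument list.
my $address = sprintf("%*vX", ":", v255.16.1);

# A quoted string is not a vector literal: %vd shows the ordinals
# of the characters '1', '.', '2', '.', '3'.
my $raw = sprintf("%vd", "1.2.3");

print "$dotted is newer than ", sprintf("%vd", $want), "\n" if $newer;
print "$address\n$raw\n";
```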

Improved Perl version numbering system

Beginning with Perl version 5.6.0, the version number convention has been changed to a "dotted integer" scheme that is more commonly found in open source projects.

Maintenance versions of v5.6.0 will be released as v5.6.1, v5.6.2 etc. The next development series following v5.6.0 will be numbered v5.7.x, beginning with v5.7.0, and the next major production release following v5.6.0 will be v5.8.0.

The English module now sets $PERL_VERSION to $^V (a string value) rather than $] (a numeric value). (This is a potential incompatibility. Send us a report via perlbug if you are affected by this.)

The v1.2.3 syntax is also now legal in Perl. See Support for strings represented as a vector of ordinals for more on that.

To cope with the new versioning system's use of at least three significant digits for each version component, the method used for incrementing the subversion number has also changed slightly. We assume that versions older than v5.6.0 have been incrementing the subversion component in multiples of 10. Versions after v5.6.0 will increment them by 1. Thus, using the new notation, 5.005_03 is the "same" as v5.5.30, and the first maintenance version following v5.6.0 will be v5.6.1 (which should be read as being equivalent to a floating point value of 5.006_001 in the older format, stored in $] ).

New syntax for declaring subroutine attributes

Formerly, if you wanted to mark a subroutine as being a method call or as requiring an automatic lock() when it is entered, you had to declare that with a use attrs pragma in the body of the subroutine. That can now be accomplished with declaration syntax, like this:

    sub mymethod : locked method;
    ...
    sub mymethod : locked method {
        ...
    }

    sub othermethod :locked :method;
    ...
    sub othermethod :locked :method {
        ...
    }

(Note how only the first : is mandatory, and whitespace surrounding the : is optional.)

AutoSplit.pm and SelfLoader.pm have been updated to keep the attributes with the stubs they provide. See attributes.

File and directory handles can be autovivified

Similar to how constructs such as $x->[0] autovivify a reference, handle constructors (open(), opendir(), pipe(), socketpair(), sysopen(), socket(), and accept()) now autovivify a file or directory handle if the handle passed to them is an uninitialized scalar variable. This allows constructs such as open(my $fh, ...) and open(local $fh,...) to be used to create filehandles that are closed automatically when the scope ends, provided there are no other references to them. This largely eliminates the need for typeglobs when opening filehandles that must be passed around, as in the following example:

    sub myopen {
        open my $fh, "@_"
            or die "Can't open '@_': $!";
        return $fh;
    }

    {
        my $f = myopen("</etc/motd");
        print <$f>;
        # $f implicitly closed here
    }

open() with more than two arguments

If open() is passed three arguments instead of two, the second argument is used as the mode and the third argument is taken to be the file name. This is primarily useful for protecting against unintended magic behavior of the traditional two-argument form. See open.
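
A minimal sketch combining the three-argument form with autovivified lexical handles. It uses a scratch file from File::Temp, and `write_lines` and `read_lines` are hypothetical helper names.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; removed when the program exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

sub write_lines {
    my ($file, @lines) = @_;
    # The mode is a separate argument, so $file is never parsed for magic.
    open(my $out, '>', $file) or die "Can't write '$file': $!";
    print {$out} "$_\n" for @lines;
    close $out or die "Can't close '$file': $!";
}

sub read_lines {
    my ($file) = @_;
    open(my $in, '<', $file) or die "Can't read '$file': $!";
    my @lines = <$in>;
    chomp @lines;
    return @lines;    # $in is closed when it goes out of scope
}

write_lines($path, 'first', 'second');
my @got = read_lines($path);
print join(',', @got), "\n";
```

Because the mode is never taken from the file name, a name that begins with `>` or ends with `|` is treated as a plain name.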

64-bit support

Any platform that has 64-bit integers either

    (1) natively as longs or ints
    (2) via special compiler flags
    (3) using long long or int64_t

is able to use "quads" (64-bit integers) as follows:

  • constants (decimal, hexadecimal, octal, binary) in the code

  • arguments to oct() and hex()

  • arguments to print(), printf() and sprintf() (flag prefixes ll, L, q)

  • printed as such

  • pack() and unpack() "q" and "Q" formats

  • in basic arithmetics: + - * / % (NOTE: operating close to the limits of the integer values may produce surprising results)

  • in bit arithmetics: & | ^ ~ << >> (NOTE: these used to be forced to be 32 bits wide but now operate on the full native width.)

  • vec()

Note that unless you have case (1), you will have to configure and compile Perl using the -Duse64bitint Configure flag.

    NOTE: The Configure flags -Duselonglong and -Duse64bits have been
    deprecated. Use -Duse64bitint instead.

There are actually two modes of 64-bitness: the first one is achieved using Configure -Duse64bitint and the second one using Configure -Duse64bitall. The difference is that the first one is minimal and the second one maximal. The first works in more places than the second.

The use64bitint option does only as much as is required to get 64-bit integers into Perl (this may mean, for example, using "long longs") while your memory may still be limited to 2 gigabytes (because your pointers could still be 32-bit). Note that the name 64bitint does not imply that your C compiler will be using 64-bit ints (it might, but it doesn't have to): use64bitint means that you will be able to have 64-bit-wide scalar values.

The use64bitall option goes all the way by also attempting to switch integers (if it can), longs, and pointers to being 64-bit. This may create an even more binary incompatible Perl than -Duse64bitint: the resulting executable may not run at all on a 32-bit box, or you may have to reboot/reconfigure/rebuild your operating system to be 64-bit aware.

Natively 64-bit systems like Alpha and Cray need neither -Duse64bitint nor -Duse64bitall.

Last but not least: note that due to Perl's habit of always using floating point numbers, the quads are still not true integers. When quads overflow their limits (0...18_446_744_073_709_551_615 unsigned, -9_223_372_036_854_775_808...9_223_372_036_854_775_807 signed), they are silently promoted to floating point numbers, after which they will start losing precision (in their lower digits).

    NOTE: 64-bit support is still experimental on most platforms.
    Existing support only covers the LP64 data model. In particular, the
    LLP64 data model is not yet supported. 64-bit libraries and system
    APIs on many platforms have not stabilized--your mileage may vary.

Large file support

If you have filesystems that support "large files" (files larger than 2 gigabytes), you may now also be able to create and access them from Perl.

    NOTE: The default action is to enable large file support, if
    available on the platform.

If the large file support is on, and you have a Fcntl constant O_LARGEFILE, the O_LARGEFILE is automatically added to the flags of sysopen().

Beware that unless your filesystem also supports "sparse files" seeking to umpteen petabytes may be inadvisable.

Note that in addition to requiring a proper file system to do large files you may also need to adjust your per-process (or your per-system, or per-process-group, or per-user-group) maximum filesize limits before running Perl scripts that try to handle large files, especially if you intend to write such files.

Finally, in addition to your process/process group maximum filesize limits, you may have quota limits on your filesystems that stop you (your user id or your user group id) from using large files.

Adjusting your process/user/group/file system/operating system limits is outside the scope of the Perl core language. For process limits, you may try increasing the limits using your shell's limits/limit/ulimit command before running Perl. The BSD::Resource extension (not included with the standard Perl distribution) may also be of use: it offers the getrlimit/setrlimit interface, which can be used to adjust process resource usage limits, including the maximum filesize limit.

Long doubles

In some systems you may be able to use long doubles to enhance the range and precision of your double precision floating point numbers (that is, Perl's numbers). Use Configure -Duselongdouble to enable this support (if it is available).

"more bits"

You can "Configure -Dusemorebits" to turn on both the 64-bit support and the long double support.

Enhanced support for sort() subroutines

Perl subroutines with a prototype of ($$) , and XSUBs in general, can now be used as sort subroutines. In either case, the two elements to be compared are passed as normal parameters in @_. See sort.

For unprototyped sort subroutines, the historical behavior of passing the elements to be compared as the global variables $a and $b remains unchanged.

sort $coderef @foo allowed

sort() did not accept a subroutine reference as the comparison function in earlier versions. This is now permitted.
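
Both forms can be sketched side by side. The `by_length` subroutine and the sample data are illustrative names, not part of Perl.

```perl
use strict;
use warnings;

# With a ($$) prototype the two elements arrive in @_ instead of $a and $b.
sub by_length($$) {
    my ($left, $right) = @_;
    return length($left) <=> length($right) || $left cmp $right;
}

my @words = qw(pear fig banana kiwi);

my @by_length = sort by_length @words;

# A code reference held in a scalar is accepted as the comparison function;
# being unprototyped, it still uses the global $a and $b.
my $reverse_alpha = sub { $b cmp $a };
my @descending    = sort $reverse_alpha @words;

print "@by_length\n@descending\n";
```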

File globbing implemented internally

Perl now uses the File::Glob implementation of the glob() operator automatically. This avoids using an external csh process and the problems associated with it.

    NOTE: This is currently an experimental feature. Interfaces and
    implementation are subject to change.

Support for CHECK blocks

In addition to BEGIN, INIT, END, DESTROY and AUTOLOAD, subroutines named CHECK are now special. These are queued up during compilation and behave similarly to END blocks, except that they are called at the end of compilation rather than at the end of execution. They cannot be called directly.

POSIX character class syntax [: :] supported

For example to match alphabetic characters use /[[:alpha:]]/. See perlre for details.
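
A small sketch of the class syntax inside a bracketed character class (the token list is made up for illustration):

```perl
use strict;
use warnings;

my @tokens = qw(perl 5.6 v6 camel_case Camel);

# Tokens made up entirely of alphabetic characters.
my @alpha_only = grep { /\A[[:alpha:]]+\z/ } @tokens;

# Tokens containing no digit at all.
my @no_digits = grep { !/[[:digit:]]/ } @tokens;

print "@alpha_only\n@no_digits\n";
```

Note that the POSIX class is only valid inside an enclosing bracketed class, hence the doubled brackets.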

Better pseudo-random number generator

In 5.005_0x and earlier, perl's rand() function used the C library rand(3) function. As of 5.005_52, Configure tests for drand48(), random(), and rand() (in that order) and picks the first one it finds.

These changes should result in better random numbers from rand().

Improved qw// operator

The qw// operator is now evaluated at compile time into a true list instead of being replaced with a run time call to split(). This removes the confusing misbehaviour of qw// in scalar context, which had inherited that behaviour from split().

Thus:

    $foo = ($bar) = qw(a b c); print "$foo|$bar\n";

now correctly prints "3|a", instead of "2|a".

Better worst-case behavior of hashes

Small changes in the hashing algorithm have been implemented in order to improve the distribution of lower order bits in the hashed value. This is expected to yield better performance on keys that are repeated sequences.

pack() format 'Z' supported

The new format type 'Z' is useful for packing and unpacking null-terminated strings. See pack.

pack() format modifier '!' supported

The new format type modifier '!' is useful for packing and unpacking native shorts, ints, and longs. See pack.

pack() and unpack() support counted strings

The template character '/' can be used to specify a counted string type to be packed or unpacked. See pack.

Comments in pack() templates

The '#' character in a template introduces a comment up to end of the line. This facilitates documentation of pack() templates.
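
The four pack() additions above can be combined in one commented template. This is a sketch with a made-up record layout, not a format taken from the text.

```perl
use strict;
use warnings;

# Whitespace and '#' comments inside a template are ignored by pack/unpack.
my $template = <<'END_TEMPLATE';
    Z8      # name: 8 bytes, null-padded and null-terminated
    n/a*    # payload: 16-bit big-endian length, then that many bytes
    l!      # a native long (width depends on the platform)
END_TEMPLATE

my $record = pack($template, 'perl', 'hello', -1);
my ($name, $payload, $long) = unpack($template, $record);

print "name=$name payload=$payload long=$long\n";
```

On unpacking, 'Z' drops everything from the first null onward, and the '/' item reads the length before consuming the string.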

Weak references

In previous versions of Perl, you couldn't cache objects so as to allow them to be deleted if the last reference from outside the cache is deleted. The reference in the cache would hold a reference count on the object and the objects would never be destroyed.

Another familiar problem is with circular references. When an object references itself, its reference count would never go down to zero, and it would not get destroyed until the program is about to exit.

Weak references solve this by allowing you to "weaken" any reference, that is, make it not count towards the reference count. When the last non-weak reference to an object is deleted, the object is destroyed and all the weak references to the object are automatically undef-ed.

To use this feature, you need the Devel::WeakRef package from CPAN, which contains additional documentation.
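
The caching scenario can be sketched as follows. This example uses `weaken` from Scalar::Util, an interface that ships with later Perl releases, rather than the Devel::WeakRef package named above. That substitution is an assumption made so that the sketch runs without extra installs.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);   # later interface, not Devel::WeakRef

my $object = { name => 'expensive thing' };
my %cache  = (thing => $object);

# The cache entry no longer counts towards the reference count.
weaken($cache{thing});
my $was_weak = isweak($cache{thing}) ? 1 : 0;

# Dropping the last strong reference destroys the object,
# and the weak reference in the cache becomes undef.
undef $object;
my $gone = defined $cache{thing} ? 0 : 1;

print "weak=$was_weak gone=$gone\n";
```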

    NOTE: This is an experimental feature. Details are subject to change.

Binary numbers supported

Binary numbers are now supported as literals, in s?printf formats, and oct():

    $answer = 0b101010;
    printf "The answer is: %b\n", oct("0b101010");

Lvalue subroutines

Subroutines can now return modifiable lvalues. See Lvalue subroutines in perlsub.

    NOTE: This is an experimental feature. Details are subject to change.
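
A minimal sketch of an lvalue subroutine. The `level` accessor and `%config` hash are hypothetical names.

```perl
use strict;
use warnings;

my %config = (level => 1);

# The :lvalue attribute lets a call to level() appear on the left of '='.
sub level : lvalue { $config{level} }

level() = 5;

print "level is now $config{level}\n";
```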

Some arrows may be omitted in calls through references

Perl now allows the arrow to be omitted in many constructs involving subroutine calls through references. For example, $foo[10]->('foo') may now be written $foo[10]('foo') . This is rather similar to how the arrow may be omitted from $foo[10]->{'foo'} . Note however, that the arrow is still required for foo(10)->('bar') .
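
The cases above can be sketched like this (the handler and dispatch tables are made up for illustration):

```perl
use strict;
use warnings;

my @handlers = ( sub { return "handled $_[0]" } );
my %dispatch = ( add => sub { return $_[0] + $_[1] } );

my $with_arrow    = $handlers[0]->('event');
my $without_arrow = $handlers[0]('event');     # arrow omitted after a subscript
my $sum           = $dispatch{add}(2, 3);

# The arrow is still required directly after an ordinary function call.
sub make_adder { my $n = shift; return sub { return $_[0] + $n } }
my $six = make_adder(1)->(5);

print "$without_arrow, sum=$sum, six=$six\n";
```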

Boolean assignment operators are legal lvalues

Constructs such as ($a ||= 2) += 1 are now allowed.
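
For instance, a default can be applied and then adjusted in one statement. The names below are illustrative.

```perl
use strict;
use warnings;

my %opt;
($opt{retries} ||= 2) += 1;   # unset: defaults to 2, then incremented

my $n = 5;
($n ||= 2) += 1;              # already true: keeps 5, then incremented

print "retries=$opt{retries} n=$n\n";
```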

exists() is supported on subroutine names

The exists() builtin now works on subroutine names. A subroutine is considered to exist if it has been declared (even if implicitly). See exists for examples.

exists() and delete() are supported on array elements

The exists() and delete() builtins now work on simple arrays as well. The behavior is similar to that on hash elements.

exists() can be used to check whether an array element has been initialized. This avoids autovivifying array elements that don't exist. If the array is tied, the EXISTS() method in the corresponding tied package will be invoked.

delete() may be used to remove an element from the array and return it. The array element at that position returns to its uninitialized state, so that testing for the same element with exists() will return false. If the element happens to be the one at the end, the size of the array also shrinks up to the highest element that tests true for exists(), or 0 if none such is found. If the array is tied, the DELETE() method in the corresponding tied package will be invoked.

See exists and delete for examples.
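
A short sketch of the array behavior described above, using a made-up `@slots` array:

```perl
use strict;
use warnings;

my @slots = (10, 20, 30);
$slots[5] = 60;                          # indices 3 and 4 are never assigned

my $has_3       = exists $slots[3] ? 1 : 0;
my $has_5       = exists $slots[5] ? 1 : 0;
my $size_before = @slots;

# Deleting the last element returns it and shrinks the array down to
# the highest element that still tests true for exists().
my $removed    = delete $slots[5];
my $size_after = @slots;

print "has_3=$has_3 has_5=$has_5 removed=$removed ",
      "size $size_before -> $size_after\n";
```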

Pseudo-hashes work better

Dereferencing some types of reference values in a pseudo-hash, such as $ph->{foo}[1] , was accidentally disallowed. This has been corrected.

When applied to a pseudo-hash element, exists() now reports whether the specified value exists, not merely if the key is valid.

delete() now works on pseudo-hashes. When given a pseudo-hash element or slice it deletes the values corresponding to the keys (but not the keys themselves). See Pseudo-hashes: Using an array as a hash in perlref.

Pseudo-hash slices with constant keys are now optimized to array lookups at compile-time.

List assignments to pseudo-hash slices are now supported.

The fields pragma now provides ways to create pseudo-hashes, via fields::new() and fields::phash(). See fields.

    NOTE: The pseudo-hash data type continues to be experimental.
    Limiting oneself to the interface elements provided by the
    fields pragma will provide protection from any future changes.

Automatic flushing of output buffers

fork(), exec(), system(), qx//, and pipe open()s now flush buffers of all files opened for output when the operation was attempted. This mostly eliminates confusing buffering mishaps suffered by users unaware of how Perl internally handles I/O.

This is not supported on some platforms like Solaris where a suitably correct implementation of fflush(NULL) isn't available.

Better diagnostics on meaningless filehandle operations

Constructs such as open(<FH>) and close(<FH>) are compile time errors. Attempting to read from filehandles that were opened only for writing will now produce warnings (just as writing to read-only filehandles does).

Where possible, buffered data discarded from duped input filehandle

open(NEW, "<&OLD") now attempts to discard any data that was previously read and buffered in OLD before duping the handle. On platforms where doing this is allowed, the next read operation on NEW will return the same data as the corresponding operation on OLD . Formerly, it would have returned the data from the start of the following disk block instead.

eof() has the same old magic as <>

Previously, eof() would return true if no attempt to read from <> had yet been made. eof() now has a little magic of its own: it opens the <> files.

binmode() can be used to set :crlf and :raw modes

binmode() now accepts a second argument that specifies a discipline for the handle in question. The two pseudo-disciplines ":raw" and ":crlf" are currently supported on DOS-derivative platforms. See binmode and open.

-T filetest recognizes UTF-8 encoded files as "text"

The algorithm used for the -T filetest has been enhanced to correctly identify UTF-8 content as "text".

system(), backticks and pipe open now reflect exec() failure

On Unix and similar platforms, system(), qx() and open(FOO, "cmd |") etc., are implemented via fork() and exec(). When the underlying exec() fails, earlier versions did not report the error properly, since the exec() happened to be in a different process.

The child process now communicates with the parent about the error in launching the external command, which allows these constructs to return with their usual error value and set $!.
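
On a Unix-like system this can be observed with a command that cannot be launched. The path below is a hypothetical one, assumed not to exist.

```perl
use strict;
use warnings;

my $command = '/nonexistent/demo-command';   # hypothetical, assumed absent

my ($status, $error);
{
    no warnings 'exec';                      # silence "Can't exec" for the demo
    $status = system { $command } $command;  # indirect-object form: no shell
    $error  = "$!";                          # copy $! before anything resets it
}

if ($status == -1) {
    print "could not launch $command: $error\n";
}
```

The indirect-object form of system() is used here so that no shell sits between Perl and the failing exec().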

Improved diagnostics

Line numbers are no longer suppressed (under most likely circumstances) during the global destruction phase.

Diagnostics emitted from code running in threads other than the main thread are now accompanied by the thread ID.

Embedded null characters in diagnostics now actually show up. They used to truncate the message in prior versions.

$foo::a and $foo::b are now exempt from "possible typo" warnings only if sort() is encountered in package foo .

Unrecognized alphabetic escapes encountered when parsing quote constructs now generate a warning, since they may take on new semantics in later versions of Perl.

Many diagnostics now report the internal operation in which the warning was provoked, like so:

    Use of uninitialized value in concatenation (.) at (eval 1) line 1.
    Use of uninitialized value in print at (eval 1) line 1.

Diagnostics that occur within eval may also report the file and line number where the eval is located, in addition to the eval sequence number and the line number within the evaluated text itself. For example:

    Not enough arguments for scalar at (eval 4)[newlib/perl5db.pl:1411] line 2, at EOF

Diagnostics follow STDERR

Diagnostic output now goes to whichever file the STDERR handle is pointing at, instead of always going to the underlying C runtime library's stderr .

More consistent close-on-exec behavior

On systems that support a close-on-exec flag on filehandles, the flag is now set for any handles created by pipe(), socketpair(), socket(), and accept(), if that is warranted by the value of $^F that may be in effect. Earlier versions neglected to set the flag for handles created with these operators. See pipe, socketpair, socket, accept, and $^F in perlvar.

syswrite() ease-of-use

The length argument of syswrite() has become optional.

Better syntax checks on parenthesized unary operators

Expressions such as:

    print defined(&foo,&bar,&baz);
    print uc("foo","bar","baz");
    undef($foo,&bar);

used to be accidentally allowed in earlier versions, and produced unpredictable behaviour. Some produced ancillary warnings when used in this way; others silently did the wrong thing.

The parenthesized forms of most unary operators that expect a single argument now ensure that they are not called with more than one argument, making the cases shown above syntax errors. The usual behaviour of:

    print defined &foo, &bar, &baz;
    print uc "foo", "bar", "baz";
    undef $foo, &bar;

remains unchanged. See perlop.

Bit operators support full native integer width

The bit operators (& | ^ ~ << >>) now operate on the full native integral width (the exact size of which is available in $Config{ivsize}). For example, if your platform is either natively 64-bit or if Perl has been configured to use 64-bit integers, these operations apply to 8 bytes (as opposed to 4 bytes on 32-bit platforms). For portability, be sure to mask off the excess bits in the result of unary ~ , e.g., ~$x & 0xffffffff .
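
A brief sketch of the portability advice. The variable names are illustrative, and the native result differs between 32-bit and 64-bit builds.

```perl
use strict;
use warnings;
use Config;

my $native_bits = $Config{ivsize} * 8;

my $x        = 0x0F;
my $native   = ~$x;                 # width depends on how perl was built
my $portable = ~$x & 0xFFFFFFFF;    # low 32 bits only, same everywhere

printf "native width: %d bits\n", $native_bits;
printf "~0x0F native: %X\n", $native;
printf "~0x0F masked: %X\n", $portable;
```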

Improved security features

More potentially unsafe operations taint their results for improved security.

The passwd and shell fields returned by the getpwent(), getpwnam(), and getpwuid() are now tainted, because the user can affect their own encrypted password and login shell.

The variable modified by shmread(), and messages returned by msgrcv() (and its object-oriented interface IPC::SysV::Msg::rcv) are also tainted, because other untrusted processes can modify messages and shared memory segments for their own nefarious purposes.

More functional bareword prototype (*)

Bareword prototypes have been rationalized to enable them to be used to override builtins that accept barewords and interpret them in a special way, such as require or do.

Arguments prototyped as * will now be visible within the subroutine as either a simple scalar or as a reference to a typeglob. See Prototypes in perlsub.

require and do may be overridden

require and do 'file' operations may be overridden locally by importing subroutines of the same name into the current package (or globally by importing them into the CORE::GLOBAL:: namespace). Overriding require will also affect use, provided the override is visible at compile-time. See Overriding Built-in Functions in perlsub.

$^X variables may now have names longer than one character

Formerly, $^X was synonymous with ${"\cX"}, but $^XY was a syntax error. Now variable names that begin with a control character may be arbitrarily long. However, for compatibility reasons, these variables must be written with explicit braces, as ${^XY} for example. ${^XYZ} is synonymous with ${"\cXYZ"}. Variable names with more than one control character, such as ${^XY^Z} , are illegal.

The old syntax has not changed. As before, `^X' may be either a literal control-X character or the two-character sequence `caret' plus `X'. When braces are omitted, the variable name stops after the control character. Thus "$^XYZ" continues to be synonymous with $^X . "YZ" as before.

As before, lexical variables may not have names beginning with control characters. As before, variables whose names begin with a control character are always forced to be in package `main'. All such variables are reserved for future extensions, except those that begin with ^_, which may be used by user programs and are guaranteed not to acquire special meaning in any future version of Perl.

New variable $^C reflects -c switch

$^C has a boolean value that reflects whether perl is being run in compile-only mode (i.e. via the -c switch). Since BEGIN blocks are executed under such conditions, this variable enables perl code to determine whether actions that make sense only during normal running are warranted. See perlvar.

New variable $^V contains Perl version as a string

$^V contains the Perl version number as a string composed of characters whose ordinals match the version numbers, i.e. v5.6.0. This may be used in string comparisons.

See Support for strings represented as a vector of ordinals for an example.

Optional Y2K warnings

If Perl is built with the cpp macro PERL_Y2KWARN defined, it emits optional warnings when concatenating the number 19 with another number.

This behavior must be specifically enabled when running Configure. See INSTALL and README.Y2K.

Arrays now always interpolate into double-quoted strings

In double-quoted strings, arrays now interpolate, no matter what. The behavior in earlier versions of perl 5 was that arrays would interpolate into strings if the array had been mentioned before the string was compiled, and otherwise Perl would raise a fatal compile-time error. In versions 5.000 through 5.003, the error was

    Literal @example now requires backslash

In versions 5.004_01 through 5.6.0, the error was

    In string, @example now must be written as \@example

The idea here was to get people into the habit of writing "fred\@example.com" when they wanted a literal @ sign, just as they have always written "Give me back my \$5" when they wanted a literal $ sign.

Starting with 5.6.1, when Perl now sees an @ sign in a double-quoted string, it always attempts to interpolate an array, regardless of whether or not the array has been used or declared already. The fatal error has been downgraded to an optional warning:

    Possible unintended interpolation of @example in string

This warns you that "fred@example.com" is going to turn into fred.com if you don't backslash the @ . See http://perl.plover.com/at-error.html for more details about the history here.

@- and @+ provide starting/ending offsets of regex matches

The new magic variables @- and @+ provide the starting and ending offsets, respectively, of $&, $1, $2, etc. See perlvar for details.
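
As a rough illustration, the offsets can be used to recover the matched text with substr(). The sample string is made up.

```perl
use strict;
use warnings;

my $text = 'release v5.6.0 today';
my (@starts, @ends, $whole);

if ($text =~ /v(\d+)\.(\d+)/) {
    @starts = @-;    # copy: the originals change with the next successful match
    @ends   = @+;
    # Element 0 describes the whole match, element N describes $N.
    $whole  = substr($text, $starts[0], $ends[0] - $starts[0]);
}

print "matched '$whole' at $starts[0]..$ends[0]\n";
printf "group %d spans %d..%d\n", $_, $starts[$_], $ends[$_] for 1 .. $#starts;
```

The end offsets in @+ point one past the last matched character, so the length is simply the end minus the start.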

Modules and Pragmata

Modules

  • attributes

    While used internally by Perl as a pragma, this module also provides a way to fetch subroutine and variable attributes. See attributes.

  • B

    The Perl Compiler suite has been extensively reworked for this release. More of the standard Perl test suite passes when run under the Compiler, but there is still a significant way to go to achieve production quality compiled executables.

        NOTE: The Compiler suite remains highly experimental. The
        generated code may not be correct, even when it manages to execute
        without errors.

  • Benchmark

    Overall, Benchmark results exhibit lower average error and better timing accuracy.

    You can now run tests for n seconds instead of guessing the right number of tests to run: e.g., timethese(-5, ...) will run each code for at least 5 CPU seconds. Zero as the "number of repetitions" means "for at least 3 CPU seconds". The output format has also changed. For example:

        use Benchmark;$x=3;timethese(-5,{a=>sub{$x*$x},b=>sub{$x**2}})

    will now output something like this:

        Benchmark: running a, b, each for at least 5 CPU seconds...
        a: 5 wallclock secs ( 5.77 usr + 0.00 sys = 5.77 CPU) @ 200551.91/s (n=1156516)
        b: 4 wallclock secs ( 5.00 usr + 0.02 sys = 5.02 CPU) @ 159605.18/s (n=800686)

    New features: "each for at least N CPU seconds...", "wallclock secs", and the "@ operations/CPU second (n=operations)".

    timethese() now returns a reference to a hash of Benchmark objects containing the test results, keyed on the names of the tests.

    timethis() now returns the iterations field in the Benchmark result object instead of 0.

    timethese(), timethis(), and the new cmpthese() (see below) can also take a format specifier of 'none' to suppress output.

    A new function countit() is just like timeit() except that it takes a TIME instead of a COUNT.

    A new function cmpthese() prints a chart comparing the results of each test returned from a timethese() call. For each possible pair of tests, the percentage speed difference (iters/sec or seconds/iter) is shown.

    For other details, see Benchmark.

  • ByteLoader

    The ByteLoader is a dedicated extension to generate and run Perl bytecode. See ByteLoader.

  • constant

    References can now be used.

    The new version also allows a leading underscore in constant names, but disallows a double leading underscore (as in "__LINE__"). Some other names are disallowed or warned against, including BEGIN, END, etc. Some names which were forced into main:: used to fail silently in some cases; now they're fatal (outside of main::) and an optional warning (inside of main::). The ability to detect whether a constant had been set with a given name has been added.

    See constant.

  • charnames

    This pragma implements the \N string escape. See charnames.

  • Data::Dumper

    A Maxdepth setting can be specified to avoid venturing too deeply into deep data structures. See Data::Dumper.

    The XSUB implementation of Dump() is now automatically called if the Useqq setting is not in use.

    Dumping qr// objects works correctly.

  • DB

    DB is an experimental module that exposes a clean abstraction to Perl's debugging API.

  • DB_File

    DB_File can now be built with Berkeley DB versions 1, 2 or 3. See ext/DB_File/Changes .

  • Devel::DProf

    Devel::DProf, a Perl source code profiler has been added. See Devel::DProf and dprofpp.

  • Devel::Peek

    The Devel::Peek module provides access to the internal representation of Perl variables and data. It is a data debugging tool for the XS programmer.

  • Dumpvalue

    The Dumpvalue module provides screen dumps of Perl data.

  • DynaLoader

    DynaLoader now supports a dl_unload_file() function on platforms that support unloading shared objects using dlclose().

    Perl can also optionally arrange to unload all extension shared objects loaded by Perl. To enable this, build Perl with the Configure option -Accflags=-DDL_UNLOAD_ALL_AT_EXIT . (This may be useful if you are using Apache with mod_perl.)

  • English

    $PERL_VERSION now stands for $^V (a string value) rather than for $] (a numeric value).

  • Env

    Env now supports accessing environment variables like PATH as array variables.

  • Fcntl

    More Fcntl constants added: F_SETLK64, F_SETLKW64, O_LARGEFILE for large file (more than 2GB) access (NOTE: the O_LARGEFILE is automatically added to sysopen() flags if large file support has been configured, as is the default), Free/Net/OpenBSD locking behaviour flags F_FLOCK, F_POSIX, Linux F_SHLCK, and O_ACCMODE: the combined mask of O_RDONLY, O_WRONLY, and O_RDWR. The seek()/sysseek() constants SEEK_SET, SEEK_CUR, and SEEK_END are available via the :seek tag. The chmod()/stat() S_IF* constants and S_IS* functions are available via the :mode tag.

  • File::Compare

    A compare_text() function has been added, which allows custom comparison functions. See File::Compare.

  • File::Find

    File::Find now works correctly when the wanted() function is either autoloaded or is a symbolic reference.

    A bug that caused File::Find to lose track of the working directory when pruning top-level directories has been fixed.

    File::Find now also supports several other options to control its behavior. It can follow symbolic links if the follow option is specified. Enabling the no_chdir option will make File::Find skip changing the current directory when walking directories. The untaint flag can be useful when running with taint checks enabled.

    See File::Find.

  • File::Glob

    This extension implements BSD-style file globbing. By default, it will also be used for the internal implementation of the glob() operator. See File::Glob.

  • File::Spec

    New methods have been added to the File::Spec module: devnull() returns the name of the null device (/dev/null on Unix) and tmpdir() the name of the temp directory (normally /tmp on Unix). There are now also methods to convert between absolute and relative filenames: abs2rel() and rel2abs(). For compatibility with operating systems that specify volume names in file paths, the splitpath(), splitdir(), and catdir() methods have been added.

  • File::Spec::Functions

    The new File::Spec::Functions module provides a function interface to the File::Spec module. It allows the shorthand

        $fullname = catfile($dir1, $dir2, $file);

    instead of

        $fullname = File::Spec->catfile($dir1, $dir2, $file);

  • Getopt::Long

    Getopt::Long licensing has changed to allow the Perl Artistic License as well as the GPL. It used to be GPL only, which got in the way of non-GPL applications that wanted to use Getopt::Long.

    Getopt::Long encourages the use of Pod::Usage to produce help messages. For example:

        use Getopt::Long;
        use Pod::Usage;
        my $man = 0;
        my $help = 0;
        GetOptions('help|?' => \$help, man => \$man) or pod2usage(2);
        pod2usage(1) if $help;
        pod2usage(-exitstatus => 0, -verbose => 2) if $man;

        __END__

        =head1 NAME

        sample - Using Getopt::Long and Pod::Usage

        =head1 SYNOPSIS

        sample [options] [file ...]

         Options:
           -help            brief help message
           -man             full documentation

        =head1 OPTIONS

        =over 8

        =item B<-help>

        Print a brief help message and exits.

        =item B<-man>

        Prints the manual page and exits.

        =back

        =head1 DESCRIPTION

        B<This program> will read the given input file(s) and do something
        useful with the contents thereof.

        =cut

    See Pod::Usage for details.

    A bug that prevented the non-option call-back <> from being specified as the first argument has been fixed.

    To specify the characters < and > as option starters, use ><. Note, however, that changing option starters is strongly deprecated.

  • IO

    write() and syswrite() will now accept a single-argument form of the call, for consistency with Perl's syswrite().

    You can now create a TCP-based IO::Socket::INET without forcing a connect attempt. This allows you to configure its options (like making it non-blocking) and then call connect() manually.

    A bug that prevented the IO::Socket::protocol() accessor from ever returning the correct value has been corrected.

    IO::Socket::connect now uses non-blocking IO instead of alarm() to do connect timeouts.

    IO::Socket::accept now uses select() instead of alarm() for doing timeouts.

    IO::Socket::INET->new now sets $! correctly on failure. $@ is still set for backwards compatibility.

  • JPL

    Java Perl Lingo is now distributed with Perl. See jpl/README for more information.

  • lib

    use lib now weeds out any trailing duplicate entries. no lib removes all named entries.

  • Math::BigInt

    The bitwise operations <<, >>, &, |, and ~ are now supported on bigints.
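
    A brief sketch of the overloaded bitwise operators on Math::BigInt objects; the operand values are arbitrary.

```perl
use strict;
use warnings;
use Math::BigInt;

my $one = Math::BigInt->new(1);

# Shift well past the width of a native integer.
my $big = $one << 100;
print "1 << 100 = $big\n";

# Bitwise AND, OR and NOT on bigint operands.
my $x   = Math::BigInt->new(12);    # binary 1100
my $y   = Math::BigInt->new(10);    # binary 1010
my $and = $x & $y;                  # binary 1000
my $or  = $x | $y;                  # binary 1110
my $not = ~$x;                      # -$x - 1
print "and=$and or=$or not=$not\n";

# Shifting right again recovers the original value.
my $back = $big >> 100;
print "back=$back\n";
```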

  • Math::Complex

    The accessor methods Re, Im, arg, abs, rho, and theta can now also act as mutators (accessor $z->Re(), mutator $z->Re(3)).

    The class method display_format and the corresponding object method display_format, in addition to accepting just one argument, can now also accept a parameter hash. Recognized keys of a parameter hash are "style", which corresponds to the old one-parameter case, and two new parameters: "format", which is a printf()-style format string used for both parts of a complex number (it usually defaults to "%.15g", and you can revert to the default by setting the format string to undef), and "polar_pretty_print" (defaults to true), which controls whether an attempt is made to recognize small multiples and rationals of pi (2pi, pi/2) in the argument (angle) of a polar complex number.

    The potentially disruptive change is that in list context both methods now return the parameter hash, instead of only the value of the "style" parameter.
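
    The sketch below exercises the mutator form of Re() and the list-context return of display_format(). The numbers and the '%.3f' format are arbitrary choices for the example.

```perl
use strict;
use warnings;
use Math::Complex;

my $z = cplx(1, 2);

# Called without arguments, Re() and Im() read the parts...
printf "before: Re=%s Im=%s\n", $z->Re, $z->Im;

# ...and with an argument they set them.
$z->Re(3);
printf "after:  Re=%s Im=%s\n", $z->Re, $z->Im;

# In list context display_format() returns the whole parameter hash.
my %format = $z->display_format;
print "style is $format{style}\n";

# A parameter hash can set the style and the printf()-style format together.
$z->display_format(style => 'polar', format => '%.3f');
print "polar form: $z\n";
```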

  • Math::Trig

    A little bit of radial trigonometry (cylindrical and spherical), radial coordinate conversions, and the great circle distance were added.
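
    As a hedged illustration of the great circle distance, the sketch below measures a quarter of the equator. The helper to_theta_phi() and the Earth radius figure are assumptions made for the example, not part of Math::Trig.

```perl
use strict;
use warnings;
use Math::Trig qw(great_circle_distance deg2rad pi);

# great_circle_distance() takes theta (longitude) and phi (the angle
# measured down from the north pole), so a latitude has to be converted.
sub to_theta_phi {
    my ($lat_deg, $lon_deg) = @_;
    return (deg2rad($lon_deg), deg2rad(90 - $lat_deg));
}

# Two points on the equator, 90 degrees of longitude apart, on a unit sphere.
my $quarter = great_circle_distance(to_theta_phi(0, 0), to_theta_phi(0, 90), 1);
printf "quarter circle: %.6f (pi/2 is %.6f)\n", $quarter, pi() / 2;

# The same call with a radius argument gives a distance in that unit.
my $earth_radius_km = 6378;    # approximate equatorial radius (assumption)
my $km = great_circle_distance(to_theta_phi(0, 0), to_theta_phi(0, 90), $earth_radius_km);
printf "along the Earth's equator: about %.0f km\n", $km;
```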

  • Pod::Parser, Pod::InputObjects

    Pod::Parser is a base class for parsing and selecting sections of pod documentation from an input stream. This module takes care of identifying pod paragraphs and commands in the input and hands off the parsed paragraphs and commands to user-defined methods which are free to interpret or translate them as they see fit.

    Pod::InputObjects defines the input objects that Pod::Parser needs. They are also of use to advanced users of Pod::Parser who need more information about a command than its name and text.

    As of release 5.6.0 of Perl, Pod::Parser is now the officially sanctioned "base parser code" recommended for use by all pod2xxx translators. Pod::Text (pod2text) and Pod::Man (pod2man) have already been converted to use Pod::Parser and efforts to convert Pod::HTML (pod2html) are already underway. For any questions or comments about pod parsing and translating issues and utilities, please use the pod-people@perl.org mailing list.

    For further information, please see Pod::Parser and Pod::InputObjects.

  • Pod::Checker, podchecker

    This utility checks pod files for correct syntax, according to perlpod. Obvious errors are flagged as such, while warnings are printed for mistakes that can be handled gracefully. The checklist is not complete yet. See Pod::Checker.

  • Pod::ParseUtils, Pod::Find

    These modules provide a set of gizmos that are useful mainly for pod translators. Pod::Find traverses directory structures and returns found pod files, along with their canonical names (like File::Spec::Unix). Pod::ParseUtils contains Pod::List (useful for storing pod list information), Pod::Hyperlink (for parsing the contents of L<> sequences) and Pod::Cache (for caching information about pod files, e.g., link nodes).

  • Pod::Select, podselect

    Pod::Select is a subclass of Pod::Parser which provides a function named "podselect()" to filter out user-specified sections of raw pod documentation from an input stream. podselect is a script that provides access to Pod::Select from other scripts to be used as a filter. See Pod::Select.

  • Pod::Usage, pod2usage

    Pod::Usage provides the function "pod2usage()" to print usage messages for a Perl script based on its embedded pod documentation. The pod2usage() function is generally useful to all script authors since it lets them write and maintain a single source (the pods) for documentation, thus removing the need to create and maintain redundant usage message text consisting of information already in the pods.

    There is also a pod2usage script which can be used from other kinds of scripts to print usage messages from pods (even for non-Perl scripts with pods embedded in comments).

    For details and examples, please see Pod::Usage.

  • Pod::Text and Pod::Man

    Pod::Text has been rewritten to use Pod::Parser. While pod2text() is still available for backwards compatibility, the module now has a new preferred interface. See Pod::Text for the details. The new Pod::Text module is easily subclassed for tweaks to the output, and two such subclasses (Pod::Text::Termcap for man-page-style bold and underlining using termcap information, and Pod::Text::Color for markup with ANSI color sequences) are now standard.

    pod2man has been turned into a module, Pod::Man, which also uses Pod::Parser. In the process, several outstanding bugs related to quotes in section headers, quoting of code escapes, and nested lists have been fixed. pod2man is now a wrapper script around this module.

  • SDBM_File

    An EXISTS method has been added to this module (and sdbm_exists() has been added to the underlying sdbm library), so one can now call exists on an SDBM_File tied hash and get the correct result, rather than a runtime error.

    A bug that may have caused data loss when more than one disk block happens to be read from the database in a single FETCH() has been fixed.

  • Sys::Syslog

    Sys::Syslog now uses XSUBs to access facilities from syslog.h so it no longer requires syslog.ph to exist.

  • Sys::Hostname

    Sys::Hostname now uses XSUBs to call the C library's gethostname() or uname() if they exist.

  • Term::ANSIColor

    Term::ANSIColor is a very simple module to provide easy and readable access to the ANSI color and highlighting escape sequences, supported by most ANSI terminal emulators. It is now included standard.

  • Time::Local

    The timelocal() and timegm() functions used to silently return bogus results when the date fell outside the machine's integer range. They now consistently croak() if the date falls in an unsupported range.
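
    Because these functions now croak(), callers handling uncertain input may want to trap the exception. Which dates count as unsupported depends on the platform and the Time::Local version, so the sketch below uses an impossible day of the month, which current versions also reject.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# timegm() takes the same fields that gmtime() returns:
# seconds, minutes, hours, day of month, month (0-11), year.
my $epoch = timegm(0, 0, 0, 1, 0, 1970);
print "start of 1970 in UTC is epoch second $epoch\n";

# A rejected field raises an exception, so wrap uncertain input in eval.
my $bad   = eval { timegm(0, 0, 0, 32, 0, 2000) };
my $error = $@;
print "rejected: $error" unless defined $bad;
```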

  • Win32

    The error return value in list context has been changed for all functions that return a list of values. Previously these functions returned a list with a single element undef if an error occurred. Now these functions return the empty list in these situations. This applies to the following functions:

    Win32::FsType
    Win32::GetOSVersion

    The remaining functions are unchanged and continue to return undef on error even in list context.

    The Win32::SetLastError(ERROR) function has been added as a complement to the Win32::GetLastError() function.

    The new Win32::GetFullPathName(FILENAME) returns the full absolute pathname for FILENAME in scalar context. In list context it returns a two-element list containing the fully qualified directory name and the filename. See Win32.

  • XSLoader

    The XSLoader extension is a simpler alternative to DynaLoader. See XSLoader.

  • DBM Filters

    A new feature called "DBM Filters" has been added to all the DBM modules--DB_File, GDBM_File, NDBM_File, ODBM_File, and SDBM_File. DBM Filters add four new methods to each DBM module:

    filter_store_key
    filter_store_value
    filter_fetch_key
    filter_fetch_value

    These can be used to filter key-value pairs before the pairs are written to the database or just after they are read from the database. See perldbmfilter for further information.
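
    A minimal sketch of installing filters on a tied SDBM_File hash follows. The file name, the upper-casing of keys and the "v1:" value marker are invented for the example, and File::Temp is assumed to be available.

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my %db;
my $handle = tie %db, 'SDBM_File', "$dir/demo", O_RDWR | O_CREAT, 0666
    or die "Cannot tie SDBM file: $!";

# Each filter works on $_. Here keys are upper-cased on the way in,
# and values get a marker added on store that is removed on fetch.
$handle->filter_store_key(sub { $_ = uc $_ });
$handle->filter_store_value(sub { $_ = "v1:$_" });
$handle->filter_fetch_value(sub { s/^v1:// });

$db{apple} = 'red';
my $value = $db{APPLE};    # the stored key is "APPLE"
my @keys  = keys %db;
print "value=$value keys=@keys\n";

undef $handle;             # drop the extra reference before untie
untie %db;
```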

Pragmata

use attrs is now obsolete, and is only provided for backward-compatibility. It's been replaced by the sub : attributes syntax. See Subroutine Attributes in perlsub and attributes.
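
A short sketch of the attribute syntax, using the built-in lvalue attribute; the sub and variable names are hypothetical.

```perl
use strict;
use warnings;
use attributes ();

my $setting = 1;

# The attribute follows the sub name (and prototype, if any) after a colon.
sub setting : lvalue { $setting }

setting() = 42;            # assigns through the lvalue sub
print "setting is now $setting\n";

# attributes::get() reports the built-in attributes attached to a sub.
my @attrs = attributes::get(\&setting);
print "attributes: @attrs\n";
```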

The lexical warnings pragma (use warnings;) controls optional warnings. See perllexwarn.
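
The sketch below shows the lexical scoping: one warning category is switched off inside a block and is active again outside it. The __WARN__ handler is only there to make the effect visible.

```perl
use strict;
use warnings;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $missing;    # deliberately left undefined

{
    # Warnings are switched off by category, and only inside this block.
    no warnings 'uninitialized';
    my $quiet = "value: " . $missing;
}

# Outside the block the category is active again, so this line warns.
my $noisy = "value: " . $missing;

print "caught ", scalar(@caught), " warning(s)\n";
print $caught[0] if @caught;
```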

use filetest controls the behaviour of filetests (-r -w ...). Currently only one subpragma is implemented, "use filetest 'access';", which uses access(2) or equivalent to check permissions instead of using stat(2) as usual. This matters in filesystems where there are ACLs (access control lists): the stat(2) might lie, but access(2) knows better.

The open pragma can be used to specify default disciplines for handle constructors (e.g. open()) and for qx//. The two pseudo-disciplines :raw and :crlf are currently supported on DOS-derivative platforms (i.e. where binmode is not a no-op). See also the section "binmode() can be used to set :crlf and :raw modes".

Utility Changes

dprofpp

dprofpp is used to display profile data generated using Devel::DProf. See dprofpp.

find2perl

The find2perl utility now uses the enhanced features of the File::Find module. The -depth and -follow options are supported. Pod documentation is also included in the script.

h2xs

The h2xs tool can now work in conjunction with C::Scan (available from CPAN) to automatically parse real-life header files. The -M, -a, -k, and -o options are new.

perlcc

perlcc now supports the C and Bytecode backends. By default, it generates output from the simple C backend rather than the optimized C backend.

Support for non-Unix platforms has been improved.

perldoc

perldoc has been reworked to avoid possible security holes. It will not by default let itself be run as the superuser, but you may still use the -U switch to try to make it drop privileges first.

The Perl Debugger

Many bug fixes and enhancements were added to perl5db.pl, the Perl debugger. New commands include < ?, > ?, and { ? to list out current actions, man docpage to run your doc viewer on some perl docset, and support for quoted options. The help information was rearranged, and should be viewable once again if you're using less as your pager. A serious security hole was plugged--you should immediately remove all older versions of the Perl debugger as installed in previous releases, all the way back to perl3, from your system to avoid being bitten by this.

Improved Documentation

Many of the platform-specific README files are now part of the perl installation. See perl for the complete list.

  • perlapi.pod

    The official list of public Perl API functions.

  • perlboot.pod

    A tutorial for beginners on object-oriented Perl.

  • perlcompile.pod

    An introduction to using the Perl Compiler suite.

  • perldbmfilter.pod

    A howto document on using the DBM filter facility.

  • perldebug.pod

    All material unrelated to running the Perl debugger, plus all low-level guts-like details that risked crushing the casual user of the debugger, have been relocated from the old manpage to the next entry below.

  • perldebguts.pod

    This new manpage contains excessively low-level material not related to the Perl debugger, but slightly related to debugging Perl itself. It also contains some arcane internal details of how the debugging process works that may only be of interest to developers of Perl debuggers.

  • perlfork.pod

    Notes on the fork() emulation currently available for the Windows platform.

  • perlfilter.pod

    An introduction to writing Perl source filters.

  • perlhack.pod

    Some guidelines for hacking the Perl source code.

  • perlintern.pod

    A list of internal functions in the Perl source code. (List is currently empty.)

  • perllexwarn.pod

    Introduction and reference information about lexically scoped warning categories.

  • perlnumber.pod

    Detailed information about numbers as they are represented in Perl.

  • perlopentut.pod

    A tutorial on using open() effectively.

  • perlreftut.pod

    A tutorial that introduces the essentials of references.

  • perltootc.pod

    A tutorial on managing class data for object modules.

  • perltodo.pod

    Discussion of the most often wanted features that may someday be supported in Perl.

  • perlunicode.pod

    An introduction to Unicode support features in Perl.

Performance enhancements

Simple sort() using { $a <=> $b } and the like are optimized

Many common sort() operations using a simple inlined block are now optimized for faster performance.

Optimized assignments to lexical variables

Certain operations in the RHS of assignment statements have been optimized to directly set the lexical variable on the LHS, eliminating redundant copying overheads.

Faster subroutine calls

Minor changes in how subroutine calls are handled internally provide marginal improvements in performance.

delete(), each(), values() and hash iteration are faster

The hash values returned by delete(), each(), values() and hashes in a list context are the actual values in the hash, instead of copies. This results in significantly better performance, because it eliminates needless copying in most situations.
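
One visible consequence is that modifying the scalars returned by values() modifies the hash itself. A minimal sketch, with made-up data:

```perl
use strict;
use warnings;

my %price = (apple => 2, pear => 3);

# values() yields the hash's own scalars, so changing $_ changes the hash.
$_ *= 10 for values %price;

print "$_ => $price{$_}\n" for sort keys %price;
```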

Installation and Configuration Improvements

-Dusethreads means something different

The -Dusethreads flag now enables the experimental interpreter-based thread support by default. To get the flavor of experimental threads that was in 5.005 instead, you need to run Configure with "-Dusethreads -Duse5005threads".

As of v5.6.0, interpreter-threads support is still lacking a way to create new threads from Perl (i.e., use Thread; will not work with interpreter threads). use Thread; continues to be available when you specify the -Duse5005threads option to Configure, bugs and all.

  NOTE: Support for threads continues to be an experimental feature.
  Interfaces and implementation are subject to sudden and drastic changes.

New Configure flags

The following new flags may be enabled on the Configure command line by running Configure with -Dflag.

  usemultiplicity
  usethreads useithreads      (new interpreter threads: no Perl API yet)
  usethreads use5005threads   (threads as they were in 5.005)
  use64bitint                 (equal to now deprecated 'use64bits')
  use64bitall
  uselongdouble
  usemorebits
  uselargefiles
  usesocks                    (only SOCKS v5 supported)

Threadedness and 64-bitness now more daring

The Configure options enabling the use of threads and the use of 64-bitness are now more daring in the sense that they no longer have an explicit list of operating systems of known threads/64-bit capabilities. In other words: if your operating system has the necessary APIs and datatypes, you should be able just to go ahead and use them, for threads by Configure -Dusethreads, and for 64 bits either explicitly by Configure -Duse64bitint or implicitly if your system has 64-bit wide datatypes. See also 64-bit support.

Long Doubles

Some platforms have "long doubles", floating point numbers of even larger range than ordinary "doubles". To enable using long doubles for Perl's scalars, use -Duselongdouble.

-Dusemorebits

You can enable both -Duse64bitint and -Duselongdouble with -Dusemorebits. See also 64-bit support.

-Duselargefiles

Some platforms support system APIs that are capable of handling large files (typically, files larger than two gigabytes). Perl will try to use these APIs if you ask for -Duselargefiles.

See Large file support for more information.

installusrbinperl

You can use "Configure -Uinstallusrbinperl", which causes installperl to skip installing perl also as /usr/bin/perl. This is useful if you prefer not to modify /usr/bin for some reason or another, but harmful because many scripts expect to find Perl at /usr/bin/perl.

SOCKS support

You can use "Configure -Dusesocks" which causes Perl to probe for the SOCKS proxy protocol library (v5, not v4). For more information on SOCKS, see:

  http://www.socks.nec.com/

-A flag

You can "post-edit" the Configure variables using the Configure -A switch. The editing happens immediately after the platform specific hints files have been processed but before the actual configuration process starts. Run Configure -h to find out the full -A syntax.

Enhanced Installation Directories

The installation structure has been enriched to improve the support for maintaining multiple versions of perl, to provide locations for vendor-supplied modules, scripts, and manpages, and to ease maintenance of locally-added modules, scripts, and manpages. See the section on Installation Directories in the INSTALL file for complete details. For most users building and installing from source, the defaults should be fine.

If you previously used Configure -Dsitelib or -Dsitearch to set special values for library directories, you might wish to consider using the new -Dsiteprefix setting instead. Also, if you wish to re-use a config.sh file from an earlier version of perl, you should be sure to check that Configure makes sensible choices for the new directories. See INSTALL for complete details.

Platform specific changes

Supported platforms

  • The Mach CThreads (NEXTSTEP, OPENSTEP) are now supported by the Thread extension.

  • GNU/Hurd is now supported.

  • Rhapsody/Darwin is now supported.

  • EPOC is now supported (on Psion 5).

  • The cygwin port (formerly cygwin32) has been greatly improved.

DOS

  • Perl now works with djgpp 2.02 (and 2.03 alpha).

  • Environment variable names are not converted to uppercase any more.

  • Incorrect exit codes from backticks have been fixed.

  • This port continues to use its own builtin globbing (not File::Glob).

OS390 (OpenEdition MVS)

Support for this EBCDIC platform has not been renewed in this release. There are difficulties in reconciling Perl's standardization on UTF-8 as its internal representation for characters with the EBCDIC character set, because the two are incompatible.

It is unclear whether future versions will renew support for this platform, but the possibility exists.

VMS

Numerous revisions and extensions to configuration, build, testing, and installation process to accommodate core changes and VMS-specific options.

Expand %ENV-handling code to allow runtime mapping to logical names, CLI symbols, and CRTL environ array.

Extension of subprocess invocation code to accept filespecs as command "verbs".

Add to Perl command line processing the ability to use default file types and to recognize Unix-style 2>&1.

Expansion of File::Spec::VMS routines, and integration into ExtUtils::MM_VMS.

Extension of ExtUtils::MM_VMS to handle complex extensions more flexibly.

Barewords at start of Unix-syntax paths may be treated as text rather than only as logical names.

Optional secure translation of several logical names used internally by Perl.

Miscellaneous bugfixing and porting of new core code to VMS.

Thanks are gladly extended to the many people who have contributed VMS patches, testing, and ideas.

Win32

Perl can now emulate fork() internally, using multiple interpreters running in different concurrent threads. This support must be enabled at build time. See perlfork for detailed information.

When given a pathname that consists only of a drivename, such as A:, opendir() and stat() now use the current working directory for the drive rather than the drive root.

The builtin XSUB functions in the Win32:: namespace are documented. See Win32.

$^X now contains the full path name of the running executable.

A Win32::GetLongPathName() function is provided to complement Win32::GetFullPathName() and Win32::GetShortPathName(). See Win32.

POSIX::uname() is supported.

system(1,...) now returns true process IDs rather than process handles. kill() accepts any real process id, rather than strictly return values from system(1,...).

For better compatibility with Unix, kill(0, $pid) can now be used to test whether a process exists.

The Shell module is supported.

Better support for building Perl under command.com in Windows 95 has been added.

Scripts are read in binary mode by default to allow ByteLoader (and the filter mechanism in general) to work properly. For compatibility, the DATA filehandle will be set to text mode if a carriage return is detected at the end of the line containing the __END__ or __DATA__ token; if not, the DATA filehandle will be left open in binary mode. Earlier versions always opened the DATA filehandle in text mode.

The glob() operator is implemented via the File::Glob extension, which supports glob syntax of the C shell. This increases the flexibility of the glob() operator, but there may be compatibility issues for programs that relied on the older globbing syntax. If you want to preserve compatibility with the older syntax, you might want to run perl with -MFile::DosGlob. For details and compatibility information, see File::Glob.

Significant bug fixes

<HANDLE> on empty files

With $/ set to undef, "slurping" an empty file returns a string of zero length (instead of undef, as it used to) the first time the HANDLE is read after $/ is set to undef. Further reads yield undef.

This means that the following will append "foo" to an empty file (it used to do nothing):

  perl -0777 -pi -e 's/^/foo/' empty_file

The behaviour of:

  perl -pi -e 's/^/foo/' empty_file

is unchanged (it continues to leave the file empty).
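
The same behaviour can be observed from inside a program. The sketch below assumes File::Temp is available to create the empty file; the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create an empty file to read back.
my ($out, $path) = tempfile(UNLINK => 1);
close $out or die "close failed: $!";

open my $in, '<', $path or die "Cannot open $path: $!";
my ($first, $second);
{
    local $/;            # undef $/ selects slurp mode
    $first  = <$in>;     # empty string on the first read
    $second = <$in>;     # undef on later reads
}
close $in;

print "first read is ",  (defined $first  ? "defined, length " . length($first) : "undef"), "\n";
print "second read is ", (defined $second ? "defined" : "undef"), "\n";
```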

eval '...' improvements

Line numbers (as reflected by caller() and most diagnostics) within eval '...' were often incorrect where here documents were involved. This has been corrected.

Lexical lookups for variables appearing in eval '...' within functions that were themselves called within an eval '...' were searching the wrong place for lexicals. The lexical search now correctly ends at the subroutine's block boundary.

The use of return within eval {...} caused $@ not to be reset correctly when no exception occurred within the eval. This has been fixed.

Parsing of here documents used to be flawed when they appeared as the replacement expression in eval 's/.../.../e'. This has been fixed.

All compilation errors are true errors

Some "errors" encountered at compile time were by necessity generated as warnings followed by eventual termination of the program. This enabled more such errors to be reported in a single run, rather than causing a hard stop at the first error that was encountered.

The mechanism for reporting such errors has been reimplemented to queue compile-time errors and report them at the end of the compilation as true errors rather than as warnings. This fixes cases where error messages leaked through in the form of warnings when code was compiled at run time using eval STRING, and also allows such errors to be reliably trapped using eval "...".
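
A minimal sketch of trapping a compile-time error from a string eval; the broken code string is invented for the example.

```perl
use strict;
use warnings;

# The string below has a deliberate syntax error.
my $result = eval 'my $x = ; 1';
my $error  = $@;

if (!defined $result) {
    # The compile-time error is available in $@ after the eval.
    print "compilation failed: $error";
}
```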

Implicitly closed filehandles are safer

Sometimes implicitly closed filehandles (as when they are localized, and Perl automatically closes them on exiting the scope) could inadvertently set $? or $!. This has been corrected.

Behavior of list slices is more consistent

When taking a slice of a literal list (as opposed to a slice of an array or hash), Perl used to return an empty list if the result happened to be composed of all undef values.

The new behavior is to produce an empty list if (and only if) the original list was empty. Consider the following example:

  @a = (1,undef,undef,2)[2,1,2];

The old behavior would have resulted in @a having no elements. The new behavior ensures it has three undefined elements.

Note in particular that the behavior of slices of the following cases remains unchanged:

  @a = ()[1,2];
  @a = (getpwent)[7,0];
  @a = (anything_returning_empty_list())[2,1,2];
  @a = @b[2,1,2];
  @a = @c{'a','b','c'};

See perldata.
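
The element counts can be checked directly, as in this small sketch (the array names are arbitrary):

```perl
use strict;
use warnings;

# A slice of a non-empty literal list keeps one element per index,
# even when every selected element is undef.
my @from_values = (1, undef, undef, 2)[2, 1, 2];
print scalar(@from_values), " elements from the non-empty list\n";

# A slice of an empty list is still empty.
my @from_empty = ()[1, 2];
print scalar(@from_empty), " elements from the empty list\n";
```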

(\$) prototype and $foo{a}

A scalar reference prototype now correctly allows a hash or array element in that slot.
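
For instance, with a hypothetical increment() sub declared with a (\$) prototype:

```perl
use strict;
use warnings;

# The (\$) prototype makes Perl pass a reference to the scalar argument.
sub increment (\$) {
    my ($ref) = @_;
    $$ref++;
    return;
}

my %count = (a => 1);
my @list  = (10, 20);

increment $count{a};    # hash element in the \$ slot
increment $list[1];     # array element in the \$ slot

print "count{a}=$count{a} list[1]=$list[1]\n";
```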

goto &sub and AUTOLOAD

The goto &sub construct works correctly when &sub happens to be autoloaded.

-bareword allowed under use integer

The autoquoting of barewords preceded by - did not work in prior versions when the integer pragma was enabled. This has been fixed.

Failures in DESTROY()

When code in a destructor threw an exception, it went unnoticed in earlier versions of Perl, unless someone happened to be looking in $@ just after the point the destructor happened to run. Such failures are now visible as warnings when warnings are enabled.
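
A sketch of how such a failure surfaces; the Fragile class and its message are invented, and the __WARN__ handler is only there to capture the warning for display.

```perl
use strict;
use warnings;

package Fragile;
sub new     { return bless {}, shift }
sub DESTROY { die "cleanup went wrong\n" }

package main;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

{
    my $object = Fragile->new;
}   # $object is destroyed here; the die() in DESTROY becomes a warning

print "program continues\n";
print "warning seen: $_" for @caught;
```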

Locale bugs fixed

printf() and sprintf() previously reset the numeric locale back to the default "C" locale. This has been fixed.

Numbers formatted according to the local numeric locale (such as using a decimal comma instead of a decimal dot) caused "isn't numeric" warnings, even while the operations accessing those numbers produced correct results. These warnings have been discontinued.

Memory leaks

The eval 'return sub {...}' construct could sometimes leak memory. This has been fixed.

Operations that aren't filehandle constructors used to leak memory when used on invalid filehandles. This has been fixed.

Constructs that modified @_ could fail to deallocate values in @_ and thus leak memory. This has been corrected.

Spurious subroutine stubs after failed subroutine calls

Perl could sometimes create empty subroutine stubs when a subroutine was not found in the package. Such cases stopped later method lookups from progressing into base packages. This has been corrected.

Taint failures under -U

When running in unsafe mode, taint violations could sometimes cause silent failures. This has been fixed.

END blocks and the -c switch

Prior versions used to run BEGIN and END blocks when Perl was run in compile-only mode. Since this is typically not the expected behavior, END blocks are not executed anymore when the -c switch is used, or if compilation fails.

See Support for CHECK blocks for how to run things when the compile phase ends.

Potential to leak DATA filehandles

Using the __DATA__ token creates an implicit filehandle to the file that contains the token. It is the program's responsibility to close it when it is done reading from it.

This caveat is now better explained in the documentation. See perldata.

New or Changed Diagnostics

  • "%s" variable %s masks earlier declaration in same %s

    (W misc) A "my" or "our" variable has been redeclared in the current scope or statement, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier variable will still exist until the end of the scope or until all closure referents to it are destroyed.

  • "my sub" not yet implemented

    (F) Lexically scoped subroutines are not yet implemented. Don't try that yet.

  • "our" variable %s redeclared

    (W misc) You seem to have already declared the same global once before in the current lexical scope.

  • '!' allowed only after types %s

    (F) The '!' is allowed in pack() and unpack() only after certain types. See pack.

  • / cannot take a count

    (F) You had an unpack template indicating a counted-length string, but you have also specified an explicit size for the string. See pack.

  • / must be followed by a, A or Z

    (F) You had an unpack template indicating a counted-length string, which must be followed by one of the letters a, A or Z to indicate what sort of string is to be unpacked. See pack.

  • / must be followed by a*, A* or Z*

    (F) You had a pack template indicating a counted-length string. Currently the only things that can have their length counted are a*, A* or Z*. See pack.

  • / must follow a numeric type

    (F) You had an unpack template that contained a '/', but this did not follow some numeric unpack specification. See pack.
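
    For reference, a well-formed counted-length template looks roughly like the sketch below, which pairs a numeric length item with a string item on both the pack and the unpack side; the sample string is arbitrary.

```perl
use strict;
use warnings;

# "n/a*" packs a 16-bit big-endian length followed by that many bytes.
my $packed = pack 'n/a*', 'hello';
printf "packed %d bytes: %s\n", length($packed), unpack('H*', $packed);

# On unpack the count after "/" is read from the data, not the template.
my ($string) = unpack 'n/a', $packed;
print "unpacked: $string\n";
```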

  • /%s/: Unrecognized escape \\%c passed through

    (W regexp) You used a backslash-character combination which is not recognized by Perl. This combination appears in an interpolated variable or a '-delimited regular expression. The character was understood literally.

  • /%s/: Unrecognized escape \\%c in character class passed through

    (W regexp) You used a backslash-character combination which is not recognized by Perl inside character classes. The character was understood literally.

  • /%s/ should probably be written as "%s"

    (W syntax) You have used a pattern where Perl expected to find a string, as in the first argument to join. Perl will treat the true or false result of matching the pattern against $_ as the string, which is probably not what you had in mind.

  • %s() called too early to check prototype

    (W prototype) You've called a function that has a prototype before the parser saw a definition or declaration for it, and Perl could not check that the call conforms to the prototype. You need to either add an early prototype declaration for the subroutine in question, or move the subroutine definition ahead of the call to get proper prototype checking. Alternatively, if you are certain that you're calling the function correctly, you may put an ampersand before the name to avoid the warning. See perlsub.

  • %s argument is not a HASH or ARRAY element

    (F) The argument to exists() must be a hash or array element, such as:

    $foo{$bar}
    $ref->{"susie"}[12]

  • %s argument is not a HASH or ARRAY element or slice

    (F) The argument to delete() must be either a hash or array element, such as:

    $foo{$bar}
    $ref->{"susie"}[12]

    or a hash or array slice, such as:

    @foo[$bar, $baz, $xyzzy]
    @{$ref->[12]}{"susie", "queue"}

  • %s argument is not a subroutine name

    (F) The argument to exists() for exists &sub must be a subroutine name, and not a subroutine call. exists &sub() will generate this error.

  • %s package attribute may clash with future reserved word: %s

    (W reserved) A lowercase attribute name was used that had a package-specific handler. That name might have a meaning to Perl itself some day, even though it doesn't yet. Perhaps you should use a mixed-case attribute name, instead. See attributes.

  • (in cleanup) %s

    (W misc) This prefix usually indicates that a DESTROY() method raised the indicated exception. Since destructors are usually called by the system at arbitrary points during execution, and often a vast number of times, the warning is issued only once for any number of failures that would otherwise result in the same message being repeated.

    Failure of user callbacks dispatched using the G_KEEPERR flag could also result in this warning. See G_KEEPERR in perlcall.

  • <> should be quotes

    (F) You wrote require <file> when you should have written require 'file'.

  • Attempt to join self

    (F) You tried to join a thread from within itself, which is an impossible task. You may be joining the wrong thread, or you may need to move the join() to some other thread.

  • Bad evalled substitution pattern

    (F) You've used the /e switch to evaluate the replacement for a substitution, but perl found a syntax error in the code to evaluate, most likely an unexpected right brace '}'.

  • Bad realloc() ignored

    (S) An internal routine called realloc() on something that had never been malloc()ed in the first place. Mandatory, but can be disabled by setting environment variable PERL_BADFREE to 1.

  • Bareword found in conditional

    (W bareword) The compiler found a bareword where it expected a conditional, which often indicates that an || or && was parsed as part of the last argument of the previous construct, for example:

    1. open FOO || die;

    It may also indicate a misspelled constant that has been interpreted as a bareword:

    1. use constant TYPO => 1;
    2. if (TYOP) { print "foo" }

    The strict pragma is useful in avoiding such errors.
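
One way to avoid the precedence trap is sketched below. It uses an in-memory filehandle (a reference to the hypothetical scalar $text) purely so that the example is self-contained; a real program would pass a file name.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";   # stand-in for the contents of a real file

# The low-precedence "or" applies to the whole open() call.
open(my $fh, '<', \$text) or die "Cannot open: $!";

# With parenthesized arguments, "||" also tests the result of open().
open(my $fh2, '<', \$text) || die "Cannot open: $!";

my $first = <$fh>;
close $fh;
close $fh2;
print $first;
```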

  • Binary number > 0b11111111111111111111111111111111 non-portable

    (W portable) The binary number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See perlport for more on portability concerns.

  • Bit vector size > 32 non-portable

    (W portable) Using bit vector sizes larger than 32 is non-portable.

  • Buffer overflow in prime_env_iter: %s

    (W internal) A warning peculiar to VMS. While Perl was preparing to iterate over %ENV, it encountered a logical name or symbol definition which was too long, so it was truncated to the string shown.

  • Can't check filesystem of script "%s"

    (P) For some reason you can't check the filesystem of the script for nosuid.

  • Can't declare class for non-scalar %s in "%s"

    (S) Currently, only scalar variables can be declared with a specific class qualifier in a "my" or "our" declaration. The semantics may be extended for other types of variables in future.

  • Can't declare %s in "%s"

    (F) Only scalar, array, and hash variables may be declared as "my" or "our" variables. They must have ordinary identifiers as names.

  • Can't ignore signal CHLD, forcing to default

    (W signal) Perl has detected that it is being run with the SIGCHLD signal (sometimes known as SIGCLD) disabled. Since disabling this signal will interfere with proper determination of exit status of child processes, Perl has reset the signal to its default value. This situation typically indicates that the parent program under which Perl may be running (e.g., cron) is being very careless.

  • Can't modify non-lvalue subroutine call

    (F) Subroutines meant to be used in lvalue context should be declared as such. See Lvalue subroutines in perlsub.
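
A minimal sketch of such a declaration follows. The names slot and $stored are made up for the illustration.

```perl
use strict;
use warnings;

my $stored = 0;

# The "lvalue" attribute lets a call to slot() appear on the left of an assignment.
sub slot : lvalue { $stored }

slot() = 42;          # assigns through the subroutine to $stored
print "$stored\n";
```

Without the attribute, the assignment to slot() is what produces the "Can't modify non-lvalue subroutine call" error.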

  • Can't read CRTL environ

    (S) A warning peculiar to VMS. Perl tried to read an element of %ENV from the CRTL's internal environment array and discovered the array was missing. You need to figure out where your CRTL misplaced its environ or define PERL_ENV_TABLES (see perlvms) so that environ is not searched.

  • Can't remove %s: %s, skipping file

    (S) You requested an inplace edit without creating a backup file. Perl was unable to remove the original file to replace it with the modified file. The file was left unmodified.

  • Can't return %s from lvalue subroutine

    (F) Perl detected an attempt to return illegal lvalues (such as temporary or readonly values) from a subroutine used as an lvalue. This is not allowed.

  • Can't weaken a nonreference

    (F) You attempted to weaken something that was not a reference. Only references can be weakened.
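
The sketch below shows weakening applied to a reference. It assumes the weaken() and isweak() functions from Scalar::Util, which ships with later Perl releases; the data structure is hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);   # assumption: Scalar::Util is available

my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };
$parent->{child} = $child;            # parent and child now refer to each other

# Weaken the back-reference so the cycle does not keep both hashes alive.
weaken($child->{parent});

my $is_weak = isweak($child->{parent}) ? 1 : 0;
print "back-reference is weak: $is_weak\n";
```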

  • Character class [:%s:] unknown

    (F) The class in the character class [: :] syntax is unknown. See perlre.

  • Character class syntax [%s] belongs inside character classes

    (W unsafe) The character class constructs [: :], [= =], and [. .] go inside character classes; the [] are part of the construct, for example: /[012[:alpha:]345]/. Note that [= =] and [. .] are not currently implemented; they are simply placeholders for future extensions.

  • Constant is not %s reference

    (F) A constant value (perhaps declared using the use constant pragma) is being dereferenced, but it amounts to the wrong type of reference. The message indicates the type of reference that was expected. This usually indicates a syntax error in dereferencing the constant value. See Constant Functions in perlsub and constant.
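
For reference, a constant holding a reference can be dereferenced as sketched below. The constant names are hypothetical.

```perl
use strict;
use warnings;

use constant PRIMES => [2, 3, 5, 7];                 # array reference
use constant LIMITS => { low => 1, high => 9 };      # hash reference

my $first = PRIMES->[0];        # arrow dereference matches the array reference
my @all   = @{ +PRIMES };       # unary plus keeps {PRIMES} from being read as a variable name
my $high  = LIMITS->{high};     # arrow dereference matches the hash reference

print "$first ", scalar(@all), " $high\n";
```

Using the wrong kind of dereference on such a constant, for example treating PRIMES as a hash reference, is the sort of mistake this message reports.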

  • constant(%s): %s

    (F) The parser found inconsistencies either while attempting to define an overloaded constant, or when trying to find the character name specified in the \N{...} escape. Perhaps you forgot to load the corresponding overload or charnames pragma? See charnames and overload.

  • CORE::%s is not a keyword

    (F) The CORE:: namespace is reserved for Perl keywords.

  • defined(@array) is deprecated

    (D) defined() is not usually useful on arrays because it checks for an undefined scalar value. If you want to see if the array is empty, just use if (@array) { # not empty } for example.

  • defined(%hash) is deprecated

    (D) defined() is not usually useful on hashes because it checks for an undefined scalar value. If you want to see if the hash is empty, just use if (%hash) { # not empty } for example.
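
The emptiness tests suggested in the two entries above can be sketched as follows, with hypothetical variable names.

```perl
use strict;
use warnings;

my @queue = ();
my %seen  = ();

# An array or hash in boolean context is true only when it has elements.
my $array_before = @queue ? 'not empty' : 'empty';
my $hash_before  = %seen  ? 'not empty' : 'empty';

push @queue, 'job';
$seen{job} = 1;

my $array_after = @queue ? 'not empty' : 'empty';
my $hash_after  = %seen  ? 'not empty' : 'empty';

print "array: $array_before -> $array_after\n";
print "hash:  $hash_before -> $hash_after\n";
```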

  • Did not produce a valid header

    See Server error.

  • (Did you mean "local" instead of "our"?)

    (W misc) Remember that "our" does not localize the declared global variable. You have declared it again in the same lexical scope, which seems superfluous.

  • Document contains no data

    See Server error.

  • entering effective %s failed

    (F) While under the use filetest pragma, switching the real and effective uids or gids failed.

  • false [] range "%s" in regexp

    (W regexp) A character class range must start and end at a literal character, not another character class like \d or [:alpha:]. The "-" in your false range is interpreted as a literal "-". Consider quoting the "-", "\-". See perlre.

  • Filehandle %s opened only for output

    (W io) You tried to read from a filehandle opened only for writing. If you intended it to be a read/write filehandle, you needed to open it with "+<" or "+>" or "+>>" instead of with "<" or nothing. If you intended only to read from the file, use "<". See open.

  • flock() on closed filehandle %s

    (W closed) The filehandle you're attempting to flock() got itself closed some time before now. Check your logic flow. flock() operates on filehandles. Are you attempting to call flock() on a dirhandle by the same name?

  • Global symbol "%s" requires explicit package name

    (F) You've said "use strict vars", which indicates that all variables must either be lexically scoped (using "my"), declared beforehand using "our", or explicitly qualified to say which package the global variable is in (using "::").
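
The three ways of satisfying use strict 'vars' mentioned above look roughly like this; the variable names are invented for the example.

```perl
use strict;
use warnings;

my  $lexical = 1;           # lexically scoped with "my"
our $shared  = 2;           # package variable declared with "our"
$main::qualified = 3;       # package variable named with an explicit package

my $sum = $lexical + $shared + $main::qualified;
print "$sum\n";
```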

  • Hexadecimal number > 0xffffffff non-portable

    (W portable) The hexadecimal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See perlport for more on portability concerns.

  • Ill-formed CRTL environ value "%s"

    (W internal) A warning peculiar to VMS. Perl tried to read the CRTL's internal environ array, and encountered an element without the = delimiter used to separate keys from values. The element is ignored.

  • Ill-formed message in prime_env_iter: |%s|

    (W internal) A warning peculiar to VMS. Perl tried to read a logical name or CLI symbol definition when preparing to iterate over %ENV, and didn't see the expected delimiter between key and value, so the line was ignored.

  • Illegal binary digit %s

    (F) You used a digit other than 0 or 1 in a binary number.

  • Illegal binary digit %s ignored

    (W digit) You may have tried to use a digit other than 0 or 1 in a binary number. Interpretation of the binary number stopped before the offending digit.

  • Illegal number of bits in vec

    (F) The number of bits in vec() (the third argument) must be a power of two from 1 to 32 (or 64, if your platform supports that).

  • Integer overflow in %s number

    (W overflow) The hexadecimal, octal or binary number you have specified either as a literal or as an argument to hex() or oct() is too big for your architecture, and has been converted to a floating point number. On a 32-bit architecture the largest hexadecimal, octal or binary number representable without overflow is 0xFFFFFFFF, 037777777777, or 0b11111111111111111111111111111111 respectively. Note that Perl transparently promotes all numbers to a floating point representation internally--subject to loss of precision errors in subsequent operations.

  • Invalid %s attribute: %s

    The indicated attribute for a subroutine or variable was not recognized by Perl or by a user-supplied handler. See attributes.

  • Invalid %s attributes: %s

    The indicated attributes for a subroutine or variable were not recognized by Perl or by a user-supplied handler. See attributes.

  • invalid [] range "%s" in regexp

    The offending range is now explicitly displayed.

  • Invalid separator character %s in attribute list

    (F) Something other than a colon or whitespace was seen between the elements of an attribute list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon. See attributes.

  • Invalid separator character %s in subroutine attribute list

    (F) Something other than a colon or whitespace was seen between the elements of a subroutine attribute list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon.

  • leaving effective %s failed

    (F) While under the use filetest pragma, switching the real and effective uids or gids failed.

  • Lvalue subs returning %s not implemented yet

    (F) Due to limitations in the current implementation, array and hash values cannot be returned in subroutines used in lvalue context. See Lvalue subroutines in perlsub.

  • Method %s not permitted

    See Server error.

  • Missing %sbrace%s on \N{}

    (F) Wrong syntax of character name literal \N{charname} within double-quotish context.

  • Missing command in piped open

    (W pipe) You used the open(FH, "| command") or open(FH, "command |") construction, but the command was missing or blank.

  • Missing name in "my sub"

    (F) The reserved syntax for lexically scoped subroutines requires that they have a name with which they can be found.

  • No %s specified for -%c

    (F) The indicated command line switch needs a mandatory argument, but you haven't specified one.

  • No package name allowed for variable %s in "our"

    (F) Fully qualified variable names are not allowed in "our" declarations, because that doesn't make much sense under existing semantics. Such syntax is reserved for future extensions.

  • No space allowed after -%c

    (F) The argument to the indicated command line switch must follow immediately after the switch, without intervening spaces.

  • no UTC offset information; assuming local time is UTC

    (S) A warning peculiar to VMS. Perl was unable to find the local timezone offset, so it's assuming that local system time is equivalent to UTC. If it's not, define the logical name SYS$TIMEZONE_DIFFERENTIAL to translate to the number of seconds which need to be added to UTC to get local time.

  • Octal number > 037777777777 non-portable

    (W portable) The octal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See perlport for more on portability concerns and on writing portable code.

  • panic: del_backref

    (P) Failed an internal consistency check while trying to reset a weak reference.

  • panic: kid popen errno read

    (F) A forked child returned an incomprehensible message about its errno.

  • panic: magic_killbackrefs

    (P) Failed an internal consistency check while trying to reset all weak references to an object.

  • Parentheses missing around "%s" list

    (W parenthesis) You said something like

    1. my $foo, $bar = @_;

    when you meant

    1. my ($foo, $bar) = @_;

    Remember that "my", "our", and "local" bind tighter than comma.
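
In context, the parenthesized form might be used as in this small sketch (the subroutine area is hypothetical).

```perl
use strict;
use warnings;

sub area {
    # The parentheses make this a list assignment that declares both variables.
    my ($width, $height) = @_;
    return $width * $height;
}

my $area = area(3, 4);
print "$area\n";
```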

  • Possible unintended interpolation of %s in string

    (W ambiguous) It used to be that Perl would try to guess whether you wanted an array interpolated or a literal @. It no longer does this; arrays are now always interpolated into strings. This means that if you try something like:

    1. print "fred@example.com";

    and the array @example doesn't exist, Perl is going to print fred.com, which is probably not what you wanted. To get a literal @ sign in a string, put a backslash before it, just as you would to get a literal $ sign.
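
Two ways of writing the address so that the @ stays literal are sketched here. The single-quoted form is an addition to the advice above and relies on single quotes not interpolating.

```perl
use strict;
use warnings;

my $escaped = "fred\@example.com";   # backslash keeps the @ literal in double quotes
my $single  = 'fred@example.com';    # single quotes never interpolate arrays

print "$escaped\n$single\n";
```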

  • Possible Y2K bug: %s

    (W y2k) You are concatenating the number 19 with another number, which could be a potential Year 2000 problem.

  • pragma "attrs" is deprecated, use "sub NAME : ATTRS" instead

    (W deprecated) You have written something like this:

    1. sub doit
    2. {
    3. use attrs qw(locked);
    4. }

    You should use the new declaration syntax instead.

    1. sub doit : locked
    2. {
    3. ...

    The use attrs pragma is now obsolete, and is only provided for backward-compatibility. See Subroutine Attributes in perlsub.

  • Premature end of script headers

    See Server error.

  • Repeat count in pack overflows

    (F) You can't specify a repeat count so large that it overflows your signed integers. See pack.

  • Repeat count in unpack overflows

    (F) You can't specify a repeat count so large that it overflows your signed integers. See unpack.

  • realloc() of freed memory ignored

    (S) An internal routine called realloc() on something that had already been freed.

  • Reference is already weak

    (W misc) You have attempted to weaken a reference that is already weak. Doing so has no effect.

  • setpgrp can't take arguments

    (F) Your system has the setpgrp() from BSD 4.2, which takes no arguments, unlike POSIX setpgid(), which takes a process ID and process group ID.

  • Strange *+?{} on zero-length expression

    (W regexp) You applied a regular expression quantifier in a place where it makes no sense, such as on a zero-width assertion. Try putting the quantifier inside the assertion instead. For example, the way to match "abc" provided that it is followed by three repetitions of "xyz" is /abc(?=(?:xyz){3})/, not /abc(?=xyz){3}/.
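
The recommended pattern can be exercised as follows; the test strings are made up for the illustration.

```perl
use strict;
use warnings;

# The quantifier sits inside the lookahead, on the group it should repeat.
my $pattern = qr/abc(?=(?:xyz){3})/;

my $three_repeats = ("abcxyzxyzxyz" =~ $pattern) ? 1 : 0;   # followed by three "xyz"
my $two_repeats   = ("abcxyzxyz"    =~ $pattern) ? 1 : 0;   # followed by only two

print "three: $three_repeats, two: $two_repeats\n";
```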

  • switching effective %s is not implemented

    (F) While under the use filetest pragma, we cannot switch the real and effective uids or gids.

  • This Perl can't reset CRTL environ elements (%s)
  • This Perl can't set CRTL environ elements (%s=%s)

    (W internal) Warnings peculiar to VMS. You tried to change or delete an element of the CRTL's internal environ array, but your copy of Perl wasn't built with a CRTL that contained the setenv() function. You'll need to rebuild Perl with a CRTL that does, or redefine PERL_ENV_TABLES (see perlvms) so that the environ array isn't the target of the change to %ENV which produced the warning.

  • Too late to run %s block

    (W void) A CHECK or INIT block is being defined during run time proper, when the opportunity to run them has already passed. Perhaps you are loading a file with require or do when you should be using use instead. Or perhaps you should put the require or do inside a BEGIN block.

  • Unknown open() mode '%s'

    (F) The second argument of 3-argument open() is not among the list of valid modes: <, >, >>, +<, +>, +>>, -|, |-.

  • Unknown process %x sent message to prime_env_iter: %s

    (P) An error peculiar to VMS. Perl was reading values for %ENV before iterating over it, and someone else stuck a message in the stream of data Perl expected. Someone's very confused, or perhaps trying to subvert Perl's population of %ENV for nefarious purposes.

  • Unrecognized escape \\%c passed through

    (W misc) You used a backslash-character combination which is not recognized by Perl. The character was understood literally.

  • Unterminated attribute parameter in attribute list

    (F) The lexer saw an opening (left) parenthesis character while parsing an attribute list, but the matching closing (right) parenthesis character was not found. You may need to add (or remove) a backslash character to get your parentheses to balance. See attributes.

  • Unterminated attribute list

    (F) The lexer found something other than a simple identifier at the start of an attribute, and it wasn't a semicolon or the start of a block. Perhaps you terminated the parameter list of the previous attribute too soon. See attributes.

  • Unterminated attribute parameter in subroutine attribute list

    (F) The lexer saw an opening (left) parenthesis character while parsing a subroutine attribute list, but the matching closing (right) parenthesis character was not found. You may need to add (or remove) a backslash character to get your parentheses to balance.

  • Unterminated subroutine attribute list

    (F) The lexer found something other than a simple identifier at the start of a subroutine attribute, and it wasn't a semicolon or the start of a block. Perhaps you terminated the parameter list of the previous attribute too soon.

  • Value of CLI symbol "%s" too long

    (W misc) A warning peculiar to VMS. Perl tried to read the value of an %ENV element from a CLI symbol table, and found a resultant string longer than 1024 characters. The return value has been truncated to 1024 characters.

  • Version number must be a constant number

    (P) The attempt to translate a use Module n.n LIST statement into its equivalent BEGIN block found an internal inconsistency with the version number.

New tests

  • lib/attrs

    Compatibility tests for sub : attrs vs the older use attrs.

  • lib/env

    Tests for new environment scalar capability (e.g., use Env qw($BAR);).

  • lib/env-array

    Tests for new environment array capability (e.g., use Env qw(@PATH);).

  • lib/io_const

    IO constants (SEEK_*, _IO*).

  • lib/io_dir

    Directory-related IO methods (new, read, close, rewind, tied delete).

  • lib/io_multihomed

    INET sockets with multi-homed hosts.

  • lib/io_poll

    IO poll().

  • lib/io_unix

    UNIX sockets.

  • op/attrs

    Regression tests for my ($x,@y,%z) : attrs and sub : attrs.

  • op/filetest

    File test operators.

  • op/lex_assign

    Verify operations that access pad objects (lexicals and temporaries).

  • op/exists_sub

    Verify exists &sub operations.

Incompatible Changes

Perl Source Incompatibilities

Beware that any new warnings that have been added or old ones that have been enhanced are not considered incompatible changes.

Since all new warnings must be explicitly requested via the -w switch or the warnings pragma, it is ultimately the programmer's responsibility to ensure that warnings are enabled judiciously.

  • CHECK is a new keyword

    All subroutine definitions named CHECK are now special. See Support for CHECK blocks for more information.

  • Treatment of list slices of undef has changed

    There is a potential incompatibility in the behavior of list slices that are comprised entirely of undefined values. See Behavior of list slices is more consistent.

  • Format of $English::PERL_VERSION is different

    The English module now sets $PERL_VERSION to $^V (a string value) rather than $] (a numeric value). This is a potential incompatibility. Send us a report via perlbug if you are affected by this.

    See Improved Perl version numbering system for the reasons for this change.

  • Literals of the form 1.2.3 parse differently

    Previously, numeric literals with more than one dot in them were interpreted as a floating point number concatenated with one or more numbers. Such "numbers" are now parsed as strings composed of the specified ordinals.

    For example, print 97.98.99 used to output 97.9899 in earlier versions, but now prints abc.

    See Support for strings represented as a vector of ordinals.
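
The difference can be observed with a short sketch like the one below. The %vd format of sprintf, which prints the ordinals of a string joined by dots, is used here only to show what the literal contains.

```perl
use strict;
use warnings;

my $vstring   = 97.98.99;                  # two dots: parsed as a string of ordinals
my $ordinals  = sprintf("%vd", $vstring);  # the ordinals, joined with dots
my $old_style = 97.98 . 99;                # explicit concatenation, as older versions read it

print "$vstring $ordinals $old_style\n";
```

Code that depended on the old interpretation can spell out the concatenation explicitly, as in the last assignment.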

  • Possibly changed pseudo-random number generator

    Perl programs that depend on reproducing a specific set of pseudo-random numbers may now produce different output due to improvements made to the rand() builtin. You can use sh Configure -Drandfunc=rand to obtain the old behavior.

    See Better pseudo-random number generator.

  • Hashing function for hash keys has changed

    Even though Perl hashes are not order preserving, the apparently random order encountered when iterating on the contents of a hash is actually determined by the hashing algorithm used. Improvements in the algorithm may yield a random order that is different from that of previous versions, especially when iterating on hashes.

    See Better worst-case behavior of hashes for additional information.
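
Programs that need a stable order can impose one explicitly rather than rely on the hashing function. A minimal sketch with a hypothetical %age hash:

```perl
use strict;
use warnings;

my %age = (carol => 41, alice => 30, bob => 25);

my @by_name = sort keys %age;                               # alphabetical by key
my @by_age  = sort { $age{$a} <=> $age{$b} } keys %age;     # numeric by value

print "@by_name\n@by_age\n";
```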

  • undef fails on read only values

    Using the undef operator on a readonly value (such as $1) has the same effect as assigning undef to the readonly value--it throws an exception.

  • Close-on-exec bit may be set on pipe and socket handles

    Pipe and socket handles are also now subject to the close-on-exec behavior determined by the special variable $^F.

    See More consistent close-on-exec behavior.

  • Writing "$$1" to mean "${$}1" is unsupported

    Perl 5.004 deprecated the interpretation of $$1 and similar within interpolated strings to mean $$ . "1", but still allowed it.

    In Perl 5.6.0 and later, "$$1" always means "${$1}".
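
Code that really wants the process ID followed by a digit can say so unambiguously, as in this sketch:

```perl
use strict;
use warnings;

my $braced   = "${$}1";      # braces delimit the variable name: process ID, then "1"
my $explicit = $$ . "1";     # the same thing with explicit concatenation

print "$braced\n";
```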

  • delete(), each(), values() and \(%h) operate on aliases to values, not copies

    delete(), each(), values() and hashes (e.g. \(%h)) in a list context return the actual values in the hash, instead of copies (as they used to in earlier versions). Typical idioms for using these constructs copy the returned values, but this can make a significant difference when creating references to the returned values. Keys in the hash are still returned as copies when iterating on a hash.

    See also delete(), each(), values() and hash iteration are faster.
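
The effect of the aliasing is easiest to see with values(). In this sketch (the %price hash is hypothetical), modifying the aliased values changes the hash, while modifying a copied list does not.

```perl
use strict;
use warnings;

my %price = (apple => 1, pear => 2);

# values() returns aliases, so modifying $_ changes the hash itself.
$_ *= 10 for values %price;

# Assigning to an array copies the values; the hash is unaffected below.
my @copies = values %price;
$_ = 0 for @copies;

print "apple=$price{apple} pear=$price{pear}\n";
```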

  • vec(EXPR,OFFSET,BITS) enforces powers-of-two BITS

    vec() generates a run-time error if the BITS argument is not a valid power-of-two integer.
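
A small sketch of both cases follows. The eval block is there only to capture the run-time error, and the variable names are hypothetical.

```perl
use strict;
use warnings;

my $bits = '';
vec($bits, 0, 8) = 65;        # 8 is a power of two: stores one byte

my $width = 3;                # not a power of two
my $ok    = eval { my $value = vec($bits, 0, $width); 1 };
my $error = $ok ? '' : $@;

print "stored: $bits\n";
print "error:  $error";
```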

  • Text of some diagnostic output has changed

    Most references to internal Perl operations in diagnostics have been changed to be more descriptive. This may be an issue for programs that may incorrectly rely on the exact text of diagnostics for proper functioning.

  • %@ has been removed

    The undocumented special variable %@ that used to accumulate "background" errors (such as those that happen in DESTROY()) has been removed, because it could potentially result in memory leaks.

  • Parenthesized not() behaves like a list operator

    The not operator now falls under the "if it looks like a function, it behaves like a function" rule.

    As a result, the parenthesized form can be used with grep and map. The following construct used to be a syntax error before, but it works as expected now:

    1. grep not($_), @things;

    On the other hand, using not with a literal list slice may not work. The following previously allowed construct:

    1. print not (1,2,3)[0];

    needs to be written with additional parentheses now:

    1. print not((1,2,3)[0]);

    The behavior remains unaffected when not is not followed by parentheses.
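
Both forms can be tried out with a sketch such as this one, using a hypothetical @things array:

```perl
use strict;
use warnings;

my @things = (0, 1, '', 'x');

# Parenthesized not() acts like a function, so it can be grep's expression.
my @false = grep not($_), @things;

# A list slice needs its own parentheses inside not().
my $negated = not((@things)[1]);      # (@things)[1] is 1, so this is false

print scalar(@false), " false values\n";
```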

  • Semantics of bareword prototype (*) have changed

    The semantics of the bareword prototype * have changed. Perl 5.005 always coerced simple scalar arguments to a typeglob, which wasn't useful in situations where the subroutine must distinguish between a simple scalar and a typeglob. The new behavior is to not coerce bareword arguments to a typeglob. The value will always be visible as either a simple scalar or as a reference to a typeglob.

    See More functional bareword prototype (*).

  • Semantics of bit operators may have changed on 64-bit platforms

    If your platform is either natively 64-bit or if Perl has been configured to use 64-bit integers, i.e., $Config{ivsize} is 8, there may be a potential incompatibility in the behavior of bitwise numeric operators (& | ^ ~ << >>). These operators used to strictly operate on the lower 32 bits of integers in previous versions, but now operate over the entire native integral width. In particular, note that unary ~ will produce different results on platforms that have different $Config{ivsize}. For portability, be sure to mask off the excess bits in the result of unary ~, e.g., ~$x & 0xffffffff.

    See Bit operators support full native integer width.
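
The masking idiom can be sketched as follows. The value of $native depends on the integer width of the perl running the code; $portable should not.

```perl
use strict;
use warnings;
use Config;

my $x = 0;

my $native   = ~$x;                  # all bits set across the native integer width
my $portable = ~$x & 0xffffffff;     # masked down to the lower 32 bits

print "ivsize:   $Config{ivsize}\n";
print "native:   $native\n";
print "portable: $portable\n";
```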

  • More builtins taint their results

    As described in Improved security features, there may be more sources of taint in a Perl program.

    To avoid these new tainting behaviors, you can build Perl with the Configure option -Accflags=-DINCOMPLETE_TAINTS. Beware that the ensuing perl binary may be insecure.

C Source Incompatibilities

  • PERL_POLLUTE

    Release 5.005 grandfathered old global symbol names by providing preprocessor macros for extension source compatibility. As of release 5.6.0, these preprocessor definitions are not available by default. You need to explicitly compile perl with -DPERL_POLLUTE to get these definitions. For extensions still using the old symbols, this option can be specified via MakeMaker:

    1. perl Makefile.PL POLLUTE=1
  • PERL_IMPLICIT_CONTEXT

    This new build option provides a set of macros for all API functions such that an implicit interpreter/thread context argument is passed to every API function. As a result of this, something like sv_setsv(foo,bar) amounts to a macro invocation that actually translates to something like Perl_sv_setsv(my_perl,foo,bar). While this is generally expected to not have any significant source compatibility issues, the difference between a macro and a real function call will need to be considered.

    This means that there is a source compatibility issue as a result of this if your extensions attempt to use pointers to any of the Perl API functions.

    Note that the above issue is not relevant to the default build of Perl, whose interfaces continue to match those of prior versions (but subject to the other options described here).

    See Background and PERL_IMPLICIT_CONTEXT in perlguts for detailed information on the ramifications of building Perl with this option.

    NOTE: PERL_IMPLICIT_CONTEXT is automatically enabled whenever Perl is built with one of -Dusethreads, -Dusemultiplicity, or both. It is not intended to be enabled by users at this time.

  • PERL_POLLUTE_MALLOC

    Enabling Perl's malloc in release 5.005 and earlier caused the namespace of the system's malloc family of functions to be usurped by the Perl versions, since by default they used the same names. Besides causing problems on platforms that do not allow these functions to be cleanly replaced, this also meant that the system versions could not be called in programs that used Perl's malloc. Previous versions of Perl have allowed this behaviour to be suppressed with the HIDEMYMALLOC and EMBEDMYMALLOC preprocessor definitions.

    As of release 5.6.0, Perl's malloc family of functions have default names distinct from the system versions. You need to explicitly compile perl with -DPERL_POLLUTE_MALLOC to get the older behaviour. HIDEMYMALLOC and EMBEDMYMALLOC have no effect, since the behaviour they enabled is now the default.

    Note that these functions do not constitute Perl's memory allocation API. See Memory Allocation in perlguts for further information about that.

Compatible C Source API Changes

  • PATCHLEVEL is now PERL_VERSION

    The cpp macros PERL_REVISION, PERL_VERSION, and PERL_SUBVERSION are now available by default from perl.h, and reflect the base revision, patchlevel, and subversion respectively. PERL_REVISION had no prior equivalent, while PERL_VERSION and PERL_SUBVERSION were previously available as PATCHLEVEL and SUBVERSION.

    The new names cause less pollution of the cpp namespace and reflect what the numbers have come to stand for in common practice. For compatibility, the old names are still supported when patchlevel.h is explicitly included (as required before), so there is no source incompatibility from the change.

Binary Incompatibilities

In general, the default build of this release is expected to be binary compatible for extensions built with the 5.005 release or its maintenance versions. However, specific platforms may have broken binary compatibility due to changes in the defaults used in hints files. Therefore, please be sure to always check the platform-specific README files for any notes to the contrary.

The usethreads or usemultiplicity builds are not binary compatible with the corresponding builds in 5.005.

On platforms that require an explicit list of exports (AIX, OS/2 and Windows, among others), purely internal symbols such as parser functions and the run time opcodes are not exported by default. Perl 5.005 used to export all functions irrespective of whether they were considered part of the public API or not.

For the full list of public API functions, see perlapi.

Known Problems

Thread test failures

Subtests 19 and 20 of the lib/thr5005.t test are known to fail due to fundamental problems in the 5.005 threading implementation. These are not new failures--Perl 5.005_0x has the same bugs, but didn't have these tests.

EBCDIC platforms not supported

In earlier releases of Perl, EBCDIC environments like OS390 (also known as Open Edition MVS) and VM-ESA were supported. Due to changes required by the UTF-8 (Unicode) support, the EBCDIC platforms are not supported in Perl 5.6.0.

In 64-bit HP-UX the lib/io_multihomed test may hang

The lib/io_multihomed test may hang in HP-UX if Perl has been configured to be 64-bit. Because other 64-bit platforms do not hang in this test, HP-UX is suspect. All other tests pass in 64-bit HP-UX. The test attempts to create and connect to "multihomed" sockets (sockets which have multiple IP addresses).

NEXTSTEP 3.3 POSIX test failure

In NEXTSTEP 3.3p2 the implementation of strftime(3) in the operating system libraries is buggy: the %j format numbers the days of a month starting from zero, which, while being logical to programmers, will cause subtests 19 to 27 of the lib/posix test to fail.

Tru64 (aka Digital UNIX, aka DEC OSF/1) lib/sdbm test failure with gcc

If compiled with gcc 2.95 the lib/sdbm test will fail (dump core). The cure is to use the vendor cc, which comes with the operating system and produces good code.

UNICOS/mk CC failures during Configure run

In UNICOS/mk the following errors may appear during the Configure run:

  1. Guessing which symbols your C compiler and preprocessor define...
  2. CC-20 cc: ERROR File = try.c, Line = 3
  3. ...
  4. bad switch yylook 79bad switch yylook 79bad switch yylook 79bad switch yylook 79#ifdef A29K
  5. ...
  6. 4 errors detected in the compilation of "try.c".

The culprit is the broken awk of UNICOS/mk. The effect is fortunately rather mild: Perl itself is not adversely affected by the error, only the h2ph utility coming with Perl, and that is rather rarely needed these days.

Arrow operator and arrays

When the left argument to the arrow operator -> is an array, or the scalar operator operating on an array, the result of the operation must be considered erroneous. For example:

  1. @x->[2]
  2. scalar(@x)->[2]

These expressions will get run-time errors in some future release of Perl.
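
Equivalent code that does not put an array on the left of the arrow might look like the following sketch, where the arrow is applied only to a reference.

```perl
use strict;
use warnings;

my @x = (10, 20, 30);

my $direct  = $x[2];           # plain element access
my $ref     = \@x;
my $via_ref = $ref->[2];       # arrow applied to a reference to the array
my $inline  = (\@x)->[2];      # the same, taking the reference in place

print "$direct $via_ref $inline\n";
```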

Experimental features

As discussed above, many features are still experimental. Interfaces and implementation of these features are subject to change, and in extreme cases, even subject to removal in some future release of Perl. These features include the following:

  • Threads
  • Unicode
  • 64-bit support
  • Lvalue subroutines
  • Weak references
  • The pseudo-hash data type
  • The Compiler suite
  • Internal implementation of file globbing
  • The DB module
  • The regular expression code constructs:

    (?{ code }) and (??{ code })

Obsolete Diagnostics

  • Character class syntax [: :] is reserved for future extensions

    (W) Within regular expression character classes ([]) the syntax beginning with "[:" and ending with ":]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[:" and ":\]".

  • Ill-formed logical name |%s| in prime_env_iter

    (W) A warning peculiar to VMS. A logical name was encountered when preparing to iterate over %ENV which violates the syntactic rules governing logical names. Because it cannot be translated normally, it is skipped, and will not appear in %ENV. This may be a benign occurrence, as some software packages might directly modify logical name tables and introduce nonstandard names, or it may indicate that a logical name table has been corrupted.

  • In string, @%s now must be written as \@%s

    The description of this error used to say:

      (Someday it will simply assume that an unbackslashed @
      interpolates an array.)

    That day has come, and this fatal error has been removed. It has been replaced by a non-fatal warning instead. See Arrays now always interpolate into double-quoted strings for details.

  • Probable precedence problem on %s

    (W) The compiler found a bareword where it expected a conditional, which often indicates that an || or && was parsed as part of the last argument of the previous construct, for example:

      open FOO || die;

  • regexp too big

    (F) The current implementation of regular expressions uses shorts as address offsets within a string. Unfortunately this means that if the regular expression compiles to longer than 32767, it'll blow up. Usually when you want a regular expression this big, there is a better way to do it with multiple statements. See perlre.

  • Use of "$$<digit>" to mean "${$}<digit>" is deprecated

    (D) Perl versions before 5.004 misinterpreted any type marker followed by "$" and a digit. For example, "$$0" was incorrectly taken to mean "${$}0" instead of "${$0}". This bug is (mostly) fixed in Perl 5.004.

    However, the developers of Perl 5.004 could not fix this bug completely, because at least two widely-used modules depend on the old meaning of "$$0" in a string. So Perl 5.004 still interprets "$$<digit>" in the old (broken) way inside strings; but it generates this message as a warning. And in Perl 5.005, this special treatment will cease.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup. There may also be information at http://www.perl.com/perl/, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

HISTORY

Written by Gurusamy Sarathy <gsar@activestate.com>, with many contributions from The Perl Porters.

Send omissions or corrections to <perlbug@perl.org>.

perl581delta

NAME

perl581delta - what is new for perl v5.8.1

DESCRIPTION

This document describes differences between the 5.8.0 release and the 5.8.1 release.

If you are upgrading from an earlier release such as 5.6.1, first read perl58delta, which describes differences between 5.6.0 and 5.8.0.

In case you are wondering about 5.6.1, it was bug-fix-wise rather identical to the development release 5.7.1. Confused? This timeline hopefully helps a bit: it lists the new major releases, their maintenance releases, and the development releases.

    New     Maintenance  Development

    5.6.0                             2000-Mar-22
                         5.7.0        2000-Sep-02
            5.6.1                     2001-Apr-08
                         5.7.1        2001-Apr-09
                         5.7.2        2001-Jul-13
                         5.7.3        2002-Mar-05
    5.8.0                             2002-Jul-18
            5.8.1                     2003-Sep-25

Incompatible Changes

Hash Randomisation

Mainly due to security reasons, the "random ordering" of hashes has been made even more random. Previously while the order of hash elements from keys(), values(), and each() was essentially random, it was still repeatable. Now, however, the order varies between different runs of Perl.

Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order.

The added randomness may affect applications.

One possible scenario is when output of an application has included hash data. For example, if you have used the Data::Dumper module to dump data into different files, and then compared the files to see whether the data has changed, now you will have false positives since the order in which hashes are dumped will vary. In general the cure is to sort the keys (or the values); in particular for Data::Dumper to use the Sortkeys option. If some particular order is really important, use tied hashes: for example the Tie::IxHash module which by default preserves the order in which the hash elements were added.
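The cure described above can be sketched as follows. This is only an illustration: the %config hash and its contents are made up for the example, and the only features relied on are the Data::Dumper Sortkeys option and an explicit sort of the keys.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample data; any hash would do.
my %config = (host => 'localhost', port => 8080, user => 'guest');

# Ask Data::Dumper to emit hash keys in sorted order, so that two dumps
# of the same data compare equal regardless of the internal hash order.
local $Data::Dumper::Sortkeys = 1;
my $dump = Dumper(\%config);
print $dump;

# For hand-written output, sort the keys explicitly.
my $report = join '', map { "$_=$config{$_}\n" } sort keys %config;
print $report;
```

Both outputs are then stable from run to run, which is what file comparisons need.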

A more subtle problem is reliance on the order of "global destruction". That is what happens at the end of execution: Perl destroys all data structures, including user data. If your destructors (the DESTROY subroutines) have assumed any particular ordering to the global destruction, there might be problems ahead. For example, in a destructor of one object you cannot assume that objects of any other class are still available, unless you hold a reference to them. If the environment variable PERL_DESTRUCT_LEVEL is set to a non-zero value, or if Perl is exiting a spawned thread, it will also destruct the ordinary references and the symbol tables that are no longer in use. You can't call a class method or an ordinary function on a class that has been collected that way.

The hash randomisation is certain to reveal hidden assumptions about some particular ordering of hash elements, and outright bugs: it revealed a few bugs in the Perl core and core modules.

To disable the hash randomisation at run time, set the environment variable PERL_HASH_SEED to 0 (zero) before running Perl (for more information see PERL_HASH_SEED in perlrun). To disable the feature completely at compile time, compile with -DNO_HASH_SEED (see INSTALL).

See Algorithmic Complexity Attacks in perlsec for the original rationale behind this change.

UTF-8 On Filehandles No Longer Activated By Locale

In Perl 5.8.0 all filehandles, including the standard filehandles, were implicitly set to be in Unicode UTF-8 if the locale settings indicated the use of UTF-8. This feature caused too many problems, so the feature was turned off and redesigned: see Core Enhancements.

Single-number v-strings are no longer v-strings before "=>"

The version strings or v-strings (see Version Strings in perldata) feature introduced in Perl 5.6.0 has been a source of some confusion-- especially when the user did not want to use it, but Perl thought it knew better. Especially troublesome has been the feature that before a "=>" a version string (a "v" followed by digits) has been interpreted as a v-string instead of a string literal. In other words:

    %h = ( v65 => 42 );

has meant since Perl 5.6.0

    %h = ( 'A' => 42 );

(at least on platforms of ASCII progeny). Perl 5.8.1 restores the more natural interpretation

    %h = ( 'v65' => 42 );

The multi-number v-strings like v65.66 and 65.66.67 still continue to be v-strings in Perl 5.8.
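A small sketch of the difference, assuming Perl 5.8.1 or later on an ASCII platform; the variable names are illustrative only:

```perl
use strict;
use warnings;

# A single-number "v-string" before => is now just the string 'v65'.
my %h = ( v65 => 42 );
my @keys = keys %h;
print "key: $keys[0]\n";

# Multi-number v-strings are still v-strings: each number is a code point,
# and the %vd format prints them back in dotted form.
my $vstr   = v65.66;
my $dotted = sprintf '%vd', $vstr;
print "as version: $dotted\n";
```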

(Win32) The -C Switch Has Been Repurposed

The -C switch has changed in an incompatible way. The old semantics of this switch only made sense in Win32 and only in the "use utf8" universe in 5.6.x releases, and do not make sense for the Unicode implementation in 5.8.0. Since this switch could not have been used by anyone, it has been repurposed. The behavior that this switch enabled in 5.6.x releases may be supported in a transparent, data-dependent fashion in a future release.

For the new life of this switch, see UTF-8 no longer default under UTF-8 locales, and -C in perlrun.

(Win32) The /d Switch Of cmd.exe

Perl 5.8.1 uses the /d switch when running the cmd.exe shell internally for system(), backticks, and when opening pipes to external programs. The extra switch disables the execution of AutoRun commands from the registry, which is generally considered undesirable when running external programs. If you wish to retain compatibility with the older behavior, set PERL5SHELL in your environment to cmd /x/c.

Core Enhancements

UTF-8 no longer default under UTF-8 locales

In Perl 5.8.0 many Unicode features were introduced. One of them was found to be of more nuisance than benefit: the automagic (and silent) "UTF-8-ification" of filehandles, including the standard filehandles, if the user's locale settings indicated use of UTF-8.

For example, if you had en_US.UTF-8 as your locale, your STDIN and STDOUT were automatically "UTF-8", in other words an implicit binmode(..., ":utf8") was made. This meant that trying to print, say, chr(0xff), ended up printing the bytes 0xc3 0xbf. Hardly what you had in mind unless you were aware of this feature of Perl 5.8.0. The problem is that the vast majority of people weren't: for example in RedHat releases 8 and 9 the default locale setting is UTF-8, so all RedHat users got UTF-8 filehandles, whether they wanted it or not. The pain was intensified by the Unicode implementation of Perl 5.8.0 (still) having nasty bugs, especially related to the use of s/// and tr///. (These bugs have been fixed in 5.8.1.)

Therefore a decision was made to backtrack the feature and change it from implicit silent default to explicit conscious option. The new Perl command line option -C and its counterpart environment variable PERL_UNICODE can now be used to control how Perl and Unicode interact at interfaces like I/O and for example the command line arguments. See -C in perlrun and PERL_UNICODE in perlrun for more information.
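The effect of the :utf8 layer on chr(0xff) can be seen without touching the locale or the -C switch. The sketch below is an illustration rather than a description of how -C works internally: it requests the layer explicitly on an in-memory filehandle, and the helper name bytes_written is made up for the example.

```perl
use strict;
use warnings;

# Write chr(0xff) through the given layer into an in-memory file and
# return the resulting bytes as a hex string.
sub bytes_written {
    my ($layer) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer)
        or die "cannot open in-memory file: $!";
    print {$fh} chr(0xff);
    close($fh) or die "close failed: $!";
    return join ' ', map { sprintf '%02x', ord } split //, $buffer;
}

my $raw  = bytes_written(':raw');   # one byte, written as-is
my $utf8 = bytes_written(':utf8');  # encoded as two UTF-8 bytes
print "raw:  $raw\n";
print "utf8: $utf8\n";
```

In 5.8.1 the second behaviour has to be asked for, either with a layer as here or through -C and PERL_UNICODE.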

Unsafe signals again available

In Perl 5.8.0 the so-called "safe signals" were introduced. This means that Perl no longer handles signals immediately but instead "between opcodes", when it is safe to do so. The earlier immediate handling easily could corrupt the internal state of Perl, resulting in mysterious crashes.

However, the new safer model has its problems too. Because an opcode, a basic unit of Perl execution, is now never interrupted but is instead left to run to completion, certain operations that can take a long time now really do take a long time. For example, certain network operations have their own blocking and timeout mechanisms, and being able to interrupt them immediately would be nice.

Therefore perl 5.8.1 introduces a "backdoor" to restore the pre-5.8.0 (pre-5.7.3, really) signal behaviour. Just set the environment variable PERL_SIGNALS to unsafe, and the old immediate (and unsafe) signal handling behaviour returns. See PERL_SIGNALS in perlrun and Deferred Signals (Safe Signals) in perlipc.

In completely unrelated news, you can now use safe signals with POSIX::SigAction. See POSIX::SigAction in POSIX.
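A minimal sketch of a safe handler installed through POSIX::SigAction, assuming a Unix-like platform where SIGUSR1 exists. The choice of signal and the counter variable are illustrative only.

```perl
use strict;
use warnings;
use POSIX qw(SIGUSR1 sigaction);

my $caught = 0;

# Build a sigaction whose handler is deferred until it is safe to run.
my $action = POSIX::SigAction->new(sub { $caught++ }, POSIX::SigSet->new, 0);
$action->safe(1);
sigaction(SIGUSR1, $action) or die "sigaction failed: $!";

# Send the signal to ourselves; the handler runs between opcodes.
kill 'USR1', $$;
print "caught: $caught\n";
```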

Tied Arrays with Negative Array Indices

Formerly, the indices passed to the FETCH, STORE, EXISTS, and DELETE methods in a tied array class were always non-negative. If the actual argument was negative, Perl would call FETCHSIZE implicitly and add the result to the index before passing the result to the tied array method. This behaviour is now optional. If the tied array class contains a package variable named $NEGATIVE_INDICES which is set to a true value, negative values will be passed to FETCH, STORE, EXISTS, and DELETE unchanged.
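The two behaviours can be compared with a pair of toy classes. Both class names are hypothetical, and each implements only the methods needed for a read; a real tied array class would provide more.

```perl
use strict;
use warnings;

package Translated;    # hypothetical class: Perl adjusts negative indices
sub TIEARRAY  { return bless {}, shift }
sub FETCHSIZE { return 10 }
sub FETCH     { my ($self, $index) = @_; return "got $index" }

package Untranslated;  # hypothetical class: negative indices arrive as-is
our $NEGATIVE_INDICES = 1;
sub TIEARRAY  { return bless {}, shift }
sub FETCHSIZE { return 10 }
sub FETCH     { my ($self, $index) = @_; return "got $index" }

package main;

tie my @old_style, 'Translated';
tie my @new_style, 'Untranslated';

my $old = $old_style[-1];   # FETCH sees FETCHSIZE + (-1)
my $new = $new_style[-1];   # FETCH sees -1 itself
print "$old\n$new\n";
```

A class that opts in takes over the meaning of negative indices entirely, so it should decide for itself what -1 refers to.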

local ${$x}

The syntaxes

    local ${$x}
    local @{$x}
    local %{$x}

now do localise variables, given that the $x is a valid variable name.
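A short sketch of the scalar case, assuming $x holds the name of a package variable in the current package. The variable and subroutine names are made up for the example, and strict refs has to be relaxed because $x is used as a symbolic reference.

```perl
use strict;
use warnings;

our $level = 'outer';
sub current_level { return $main::level }

my $x = 'level';    # name of the package variable to localise
my $inside;
{
    no strict 'refs';          # $x is used as a symbolic reference
    local ${$x} = 'inner';     # localises $main::level for this block
    $inside = current_level();
}
my $outside = current_level(); # the old value is back after the block
print "inside: $inside, outside: $outside\n";
```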

Unicode Character Database 4.0.0

The copy of the Unicode Character Database included in Perl 5.8 has been updated to 4.0.0 from 3.2.0. This means for example that the Unicode character properties are as in Unicode 4.0.0.

Deprecation Warnings

There is one new feature deprecation. Perl 5.8.0 forgot to add some deprecation warnings; these warnings have now been added. Finally, there is a reminder of an impending feature removal.

(Reminder) Pseudo-hashes are deprecated (really)

Pseudo-hashes were deprecated in Perl 5.8.0 and will be removed in Perl 5.10.0, see perl58delta for details. Each attempt to access pseudo-hashes will trigger the warning Pseudo-hashes are deprecated. If you really want to continue using pseudo-hashes but not to see the deprecation warnings, use:

    no warnings 'deprecated';

Or you can continue to use the fields pragma, but please don't expect the data structures to be pseudohashes any more.

(Reminder) 5.005-style threads are deprecated (really)

5.005-style threads (activated by use Thread;) were deprecated in Perl 5.8.0 and will be removed after Perl 5.8, see perl58delta for details. Each 5.005-style thread creation will trigger the warning 5.005 threads are deprecated. If you really want to continue using the 5.005 threads but not to see the deprecation warnings, use:

    no warnings 'deprecated';

(Reminder) The $* variable is deprecated (really)

The $* variable controlling multi-line matching has been deprecated and will be removed after 5.8. The variable has been deprecated for a long time, and a deprecation warning Use of $* is deprecated is given; now the variable will finally be removed. The functionality has been supplanted by the /s and /m modifiers on pattern matching. If you really want to continue using the $* variable but not to see the deprecation warnings, use:

    no warnings 'deprecated';

Miscellaneous Enhancements

map in void context is no longer expensive. map is now context aware, and will not construct a list if called in void context.

If a socket gets closed by the server while printing to it, the client now gets a SIGPIPE. While this new feature was not planned, it fell naturally out of PerlIO changes, and is to be considered an accidental feature.

PerlIO::get_layers(FH) returns the names of the PerlIO layers active on a filehandle.

PerlIO::via layers can now have an optional UTF8 method to indicate whether the layer wants to "auto-:utf8" the stream.

utf8::is_utf8() has been added as a quick way to test whether a scalar is encoded internally in UTF-8 (Unicode).
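The two new introspection functions can be tried out as follows. This is a sketch using an in-memory filehandle; the sample strings are arbitrary.

```perl
use strict;
use warnings;

# List the PerlIO layers on a filehandle opened with an explicit :utf8 layer.
my $data = "line one\n";
open(my $fh, '<:utf8', \$data) or die "cannot open in-memory file: $!";
my @layers = PerlIO::get_layers($fh);
print "layers: @layers\n";
close($fh);

# Check the internal encoding flag of two scalars.
my $wide  = chr(0x263A);   # a character above 0xFF needs the UTF-8 form
my $plain = 'abc';         # a plain literal is stored as bytes
my $wide_flag  = utf8::is_utf8($wide)  ? 1 : 0;
my $plain_flag = utf8::is_utf8($plain) ? 1 : 0;
print "wide: $wide_flag, plain: $plain_flag\n";
```

Note that utf8::is_utf8() reports how Perl stores the scalar internally, not whether the data is valid or meaningful text.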

Modules and Pragmata

Updated Modules And Pragmata

The following modules and pragmata have been updated since Perl 5.8.0:

  • base
  • B::Bytecode

    In much better shape than it used to be. Still far from perfect, but maybe worth a try.

  • B::Concise
  • B::Deparse
  • Benchmark

    An optional feature, :hireswallclock, now allows for high resolution wall clock times (uses Time::HiRes).

  • ByteLoader

    See B::Bytecode.

  • bytes

    Now has bytes::substr.

  • CGI
  • charnames

    One can now have custom character name aliases.

  • CPAN

    There is now a simple command line frontend to the CPAN.pm module called cpan.

  • Data::Dumper

    A new option, Pair, allows choosing the separator between hash keys and values.

  • DB_File
  • Devel::PPPort
  • Digest::MD5
  • Encode

    Significant updates on the encoding pragma functionality (tr/// and the DATA filehandle, formats).

    If a filehandle has been marked as having an encoding, unmappable characters are detected already during input, not later (when the corrupted data is being used).

    The ISO 8859-6 conversion table has been corrected (the 0x30..0x39 erroneously mapped to U+0660..U+0669, instead of U+0030..U+0039). The GSM 03.38 conversion did not handle escape sequences correctly. The UTF-7 encoding has been added (making Encode feature-complete with Unicode::String).

  • fields
  • libnet
  • Math::BigInt

    A lot of bugs have been fixed since v1.60, the version included in Perl v5.8.0. Especially noteworthy are the bug in Calc that caused div and mod to fail for some large values, and the fixes to the handling of bad inputs.

    Some new features were added, e.g. the broot() method, you can now pass parameters to config() to change some settings at runtime, and it is now possible to trap the creation of NaN and infinity.

    As usual, some optimizations took place and made the math overall a tad faster. In some cases, quite a lot faster, actually. Especially alternative libraries like Math::BigInt::GMP benefit from this. In addition, a lot of the quite clunky routines like fsqrt() and flog() are now much much faster.

  • MIME::Base64
  • NEXT

    Diamond inheritance now works.

  • Net::Ping
  • PerlIO::scalar

    Reading from non-string scalars (like the special variables, see perlvar) now works.

  • podlators
  • Pod::LaTeX
  • PodParsers
  • Pod::Perldoc

    Complete rewrite. As a side-effect, no longer refuses to startup when run by root.

  • Scalar::Util

    New utilities: refaddr, isvstring, looks_like_number, set_prototype.

  • Storable

    Can now store code references (via B::Deparse, so not foolproof).

  • strict

    Earlier versions of the strict pragma did not check the parameters implicitly passed to its "import" (use) and "unimport" (no) routines. This allowed a false idiom such as:

      use strict qw(@ISA);
      @ISA = qw(Foo);

    This however (probably) raised the false expectation that the strict refs, vars and subs were being enforced (and that @ISA was somehow "declared"). But the strict refs, vars, and subs are not enforced when using this false idiom.

    Starting from Perl 5.8.1, the above will cause an error to be raised. This may cause programs which used to execute seemingly correctly without warnings and errors to fail when run under 5.8.1. This happens because

      use strict qw(@ISA);

    will now fail with the error:

      Unknown 'strict' tag(s) '@ISA'

    The remedy to this problem is to replace this code with the correct idiom:

      use strict;
      use vars qw(@ISA);
      @ISA = qw(Foo);

  • Term::ANSIcolor
  • Test::Harness

    Now much more picky about extra or missing output from test scripts.

  • Test::More
  • Test::Simple
  • Text::Balanced
  • Time::HiRes

    Use of nanosleep(), if available, allows mixing subsecond sleeps with alarms.

  • threads

    Several fixes, for example for join() problems and memory leaks. In some platforms (like Linux) that use glibc the minimum memory footprint of one ithread has been reduced by several hundred kilobytes.

  • threads::shared

    Many memory leaks have been fixed.

  • Unicode::Collate
  • Unicode::Normalize
  • Win32::GetFolderPath
  • Win32::GetOSVersion

    Now returns extra information.
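Two of the additions in the list above, the Data::Dumper Pair option and the new Scalar::Util utilities, can be sketched like this. The data and the ': ' separator are arbitrary choices for the example, and the other Data::Dumper settings are there only to keep the output on one line.

```perl
use strict;
use warnings;
use Data::Dumper;
use Scalar::Util qw(refaddr looks_like_number);

# Choose the separator printed between hash keys and values.
local $Data::Dumper::Pair     = ': ';
local $Data::Dumper::Sortkeys = 1;
local $Data::Dumper::Terse    = 1;   # no "$VAR1 = " prefix
local $Data::Dumper::Indent   = 0;   # single-line output
my $dumped = Dumper({ a => 1, b => 2 });
print "$dumped\n";

# refaddr() gives the address a reference points to, so two references
# to the same array compare equal.
my $ref  = [];
my $copy = $ref;
my $same = refaddr($ref) == refaddr($copy) ? 1 : 0;

# looks_like_number() applies Perl's own notion of a numeric string.
my $numeric     = looks_like_number('1e5') ? 1 : 0;
my $not_numeric = looks_like_number('abc') ? 1 : 0;
print "same: $same, numeric: $numeric, not numeric: $not_numeric\n";
```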

Utility Changes

The h2xs utility now produces a more modern layout: Foo-Bar/lib/Foo/Bar.pm instead of Foo/Bar/Bar.pm. Also, the boilerplate test is now called t/Foo-Bar.t instead of t/1.t.

The Perl debugger (lib/perl5db.pl) has now been extensively documented and bugs found while documenting have been fixed.

perldoc has been rewritten from scratch to be more robust and feature rich.

perlcc -B works now at least somewhat better, while perlcc -c is rather more broken. (The Perl compiler suite as a whole continues to be experimental.)

New Documentation

perl573delta has been added to list the differences between the (now quite obsolete) development releases 5.7.2 and 5.7.3.

perl58delta has been added: it is the perldelta of 5.8.0, detailing the differences between 5.6.0 and 5.8.0.

perlartistic has been added: it is the Artistic License in pod format, making it easier for modules to refer to it.

perlcheat has been added: it is a Perl cheat sheet.

perlgpl has been added: it is the GNU General Public License in pod format, making it easier for modules to refer to it.

perlmacosx has been added to tell about the installation and use of Perl in Mac OS X.

perlos400 has been added to tell about the installation and use of Perl in OS/400 PASE.

perlreref has been added: it is a regular expressions quick reference.

Installation and Configuration Improvements

The Unix standard Perl location, /usr/bin/perl, is no longer overwritten by default if it exists. This change was very prudent because so many Unix vendors already provide a /usr/bin/perl, but simultaneously many system utilities may depend on that exact version of Perl, so better not to overwrite it.

One can now specify installation directories for site and vendor man and HTML pages, and site and vendor scripts. See INSTALL.

One can now specify a destination directory for Perl installation by specifying the DESTDIR variable for make install. (This feature is slightly different from the previous Configure -Dinstallprefix=....) See INSTALL.

gcc versions 3.x introduced a new warning that caused a lot of noise during Perl compilation: gcc -Ialreadyknowndirectory (warning: changing search order). This warning has now been avoided by Configure weeding out such directories before the compilation.

One can now build subsets of Perl core modules by using the Configure flags -Dnoextensions=... and -Donlyextensions=..., see INSTALL.

Platform-specific enhancements

In Cygwin Perl can now be built with threads (Configure -Duseithreads). This works with both Cygwin 1.3.22 and Cygwin 1.5.3.

In newer FreeBSD releases Perl 5.8.0 compilation failed because it tried to use malloc.h, which in FreeBSD is just a dummy file that causes a fatal error if you even try to use it. Now malloc.h is not used.

Perl is now known to build also in Hitachi HI-UXMPP.

Perl is now known to build again in LynxOS.

Mac OS X now installs with Perl version number embedded in installation directory names for easier upgrading of user-compiled Perl, and the installation directories in general are more standard. In other words, the default installation no longer breaks the Apple-provided Perl. On the other hand, with Configure -Dprefix=/usr you can now really replace the Apple-supplied Perl (please be careful).

Mac OS X now builds Perl statically by default. This change was done mainly for faster startup times. The Apple-provided Perl is still dynamically linked and shared, and you can enable the sharedness for your own Perl builds by Configure -Duseshrplib.

Perl has been ported to IBM's OS/400 PASE environment. The best way to build a Perl for PASE is to use an AIX host as a cross-compilation environment. See README.os400.

Yet another cross-compilation option has been added: now Perl builds on OpenZaurus, a Linux distribution based on Mandrake + Embedix for the Sharp Zaurus PDA. See the Cross/README file.

Tru64 when using gcc 3 drops the optimisation for toke.c to -O2 because of gigantic memory use with the default -O3.

Tru64 can now build Perl with the newer Berkeley DBs.

Building Perl on WinCE has been much enhanced, see README.ce and README.perlce.

Selected Bug Fixes

Closures, eval and lexicals

There have been many fixes in the area of anonymous subs, lexicals and closures. Although this means that Perl is now more "correct", it is possible that some existing code will break that happens to rely on the faulty behaviour. In practice this is unlikely unless your code contains a very complex nesting of anonymous subs, evals and lexicals.

Generic fixes

If an input filehandle is marked :utf8 and Perl sees illegal UTF-8 coming in when doing <FH>, then, if warnings are enabled, a warning is given immediately, instead of Perl being silent about it and unhappy about the broken data later. (The :encoding(utf8) layer also works the same way.)

binmode(SOCKET, ":utf8") only worked on the input side, not on the output side of the socket. Now it works both ways.

For threaded Perls certain system database functions like getpwent() and getgrent() now grow their result buffer dynamically, instead of failing. This means that at sites with lots of users and groups the functions no longer fail by returning only partial results.

Perl 5.8.0 had accidentally broken the capability for users to define their own uppercase<->lowercase Unicode mappings (as advertised by the Camel). This feature has been fixed and is also documented better.

In 5.8.0 this

    $some_unicode .= <FH>;

didn't work correctly but instead corrupted the data. This has now been fixed.

Tied methods like FETCH etc. may now safely access tied values, i.e. resulting in a recursive call to FETCH etc. Remember to break the recursion, though.

At startup Perl blocks the SIGFPE signal away since there isn't much Perl can do about it. Previously this blocking was in effect also for programs executed from within Perl. Now Perl restores the original SIGFPE handling routine, whatever it was, before running external programs.

Line numbers in Perl scripts may now be greater than 65536, or 2**16. (Perl scripts have always been able to be larger than that; it's just that the line numbers in reported errors and warnings "wrapped around".) While scripts that large usually indicate a need to rethink your code a bit, such Perl scripts do exist, for example as results from generated code. Now line numbers can go all the way to 4294967296, or 2**32.

Platform-specific fixes

Linux

  • Setting $0 works again (with certain limitations that Perl cannot do much about: see $0 in perlvar)

HP-UX

  • Setting $0 now works.

VMS

  • Configuration now tests for the presence of poll(), and IO::Poll now uses the vendor-supplied function if detected.

  • A rare access violation at Perl start-up could occur if the Perl image was installed with privileges or if there was an identifier with the subsystem attribute set in the process's rightslist. Either of these circumstances triggered tainting code that contained a pointer bug. The faulty pointer arithmetic has been fixed.

  • The length limit on values (not keys) in the %ENV hash has been raised from 255 bytes to 32640 bytes (except when the PERL_ENV_TABLES setting overrides the default use of logical names for %ENV). If it is necessary to access these long values from outside Perl, be aware that they are implemented using search list logical names that store the value in pieces, each 255-byte piece (up to 128 of them) being an element in the search list. When doing a lookup in %ENV from within Perl, the elements are combined into a single value. The existing VMS-specific ability to access individual elements of a search list logical name via the $ENV{'foo;N'} syntax (where N is the search list index) is unimpaired.

  • The piping implementation now uses local rather than global DCL symbols for inter-process communication.

  • File::Find could become confused when navigating to a relative directory whose name collided with a logical name. This problem has been corrected by adding directory syntax to relative path names, thus preventing logical name translation.

Win32

  • A memory leak in the fork() emulation has been fixed.

  • The return value of the ioctl() built-in function was accidentally broken in 5.8.0. This has been corrected.

  • The internal message loop executed by perl during blocking operations sometimes interfered with messages that were external to Perl. This often resulted in blocking operations terminating prematurely or returning incorrect results, when Perl was executing under environments that could generate Windows messages. This has been corrected.

  • Pipes and sockets are now automatically in binary mode.

  • The four-argument form of select() did not preserve $! (errno) properly when there were errors in the underlying call. This is now fixed.

  • The "CR CR LF" problem has been fixed; binmode(FH, ":crlf") is now effectively a no-op.

New or Changed Diagnostics

All the warnings related to pack() and unpack() were made more informative and consistent.

Changed "A thread exited while %d threads were running"

The old version

    A thread exited while %d other threads were still running

was misleading because the "other" included also the thread giving the warning.

Removed "Attempt to clear a restricted hash"

It is not illegal to clear a restricted hash, so the warning was removed.

New "Illegal declaration of anonymous subroutine"

You must specify the block of code for sub.

Changed "Invalid range "%s" in transliteration operator"

The old version

    Invalid [] range "%s" in transliteration operator

was simply wrong because there are no "[] ranges" in tr///.

New "Missing control char name in \c"

Self-explanatory.

New "Newline in left-justified string for %s"

The padding spaces would appear after the newline, which is probably not what you had in mind.

New "Possible precedence problem on bitwise %c operator"

If you think this

    $x & $y == 0

tests whether the bitwise AND of $x and $y is zero, you will like this warning.
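The reason for the warning is that == binds more tightly than &. The sketch below spells out both readings with explicit parentheses; the sample values 4 and 3 are chosen so that the two readings disagree.

```perl
use strict;
use warnings;

my ($x, $y) = (4, 3);

# What the programmer usually means: is the bitwise AND of $x and $y zero?
my $intended = ( ($x & $y) == 0 ) ? 1 : 0;

# How Perl actually parses "$x & $y == 0": the comparison happens first,
# and its (false) result is then ANDed with $x.
my $parsed = ( $x & ($y == 0) ) ? 1 : 0;

print "intended: $intended, parsed: $parsed\n";
```

Writing the parentheses explicitly, as in the first form, both silences the warning and gives the intended test.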

New "Pseudo-hashes are deprecated"

This warning should have been already in 5.8.0, since they are.

New "read() on %s filehandle %s"

You cannot read() (or sysread()) from a closed or unopened filehandle.

New "5.005 threads are deprecated"

This warning should have been already in 5.8.0, since they are.

New "Tied variable freed while still in use"

Something pulled the plug on a live tied variable; Perl plays safe by bailing out.

New "To%s: illegal mapping '%s'"

An illegal user-defined Unicode casemapping was specified.

New "Use of freed value in iteration"

Something modified the values being iterated over. This is not good.

Changed Internals

This news matters to you only if you either write XS code or like to know about or hack Perl internals (using Devel::Peek or any of the B:: modules counts), or like to run Perl with the -D option.

The embedding examples of perlembed have been reviewed to be up to date and consistent: for example, the correct use of PERL_SYS_INIT3() and PERL_SYS_TERM().

Extensive reworking of the pad code (the code responsible for lexical variables) has been conducted by Dave Mitchell.

Extensive work on the v-strings by John Peacock.

UTF-8 length and position cache: to speed up the handling of Unicode (UTF-8) scalars, a cache was introduced. Potential problems exist if an extension bypasses the official APIs and directly modifies the PV of an SV: the UTF-8 cache does not get cleared as it should.

APIs obsoleted in Perl 5.8.0, like sv_2pv, sv_catpvn, sv_catsv, sv_setsv, are again available.

Certain Perl core C APIs like cxinc and regatom are no longer available at all to code outside the Perl core or the Perl core extensions. This is intentional. They never should have been available with the shorter names, and if your application depends on them, you should (be ashamed and) contact perl5-porters to discuss what the proper APIs are.

Certain Perl core C APIs like Perl_list are no longer available without their Perl_ prefix. If your XS module stops working because some functions cannot be found, in many cases a simple fix is to add the Perl_ prefix to the function and the thread context aTHX_ as the first argument of the function call. This is also how it should always have been done: letting the Perl_-less forms leak from the core was an accident. For cleaner embedding you can also force this for all APIs by defining at compile time the cpp define PERL_NO_SHORT_NAMES.

Perl_save_bool() has been added.

Regexp objects (those created with qr) now have S-magic rather than R-magic. This fixed regexps of the form /...(??{...;$x})/ to no longer ignore changes made to $x. The S-magic avoids dropping the caching optimization and making (??{...}) constructs obscenely slow (and consequently useless). See also Magic Variables in perlguts. Regexp::Copy was affected by this change.

The Perl internal debugging macros DEBUG() and DEB() have been renamed to PERL_DEBUG() and PERL_DEB() to avoid namespace conflicts.

-DL removed (the leaktest had been broken and unsupported for years, use alternative debugging mallocs or tools like valgrind and Purify).

Verbose modifier v added for -DXv and -Dsv; see perlrun.

New Tests

In Perl 5.8.0 there were about 69000 separate tests in about 700 test files; in Perl 5.8.1 there are about 77000 separate tests in about 780 test files. The exact numbers depend on the Perl configuration and on the operating system platform.

Known Problems

The hash randomisation mentioned in Incompatible Changes is definitely problematic: it will wake dormant bugs and shake out bad assumptions.

If you want to use mod_perl 2.x with Perl 5.8.1, you will need mod_perl-1.99_10 or higher. Earlier versions of mod_perl 2.x do not work with the randomised hashes. (mod_perl 1.x works fine.) You will also need Apache::Test 1.04 or higher.

Many of the rarer platforms that worked 100% or pretty close to it with perl 5.8.0 have been left a little bit untended since their maintainers have been otherwise busy lately, and therefore there will be more failures on those platforms. Such platforms include Mac OS Classic, IBM z/OS (and other EBCDIC platforms), and NetWare. The most common Perl platforms (Unix and Unix-like, Microsoft platforms, and VMS) have large enough testing and expert population that they are doing well.

Tied hashes in scalar context

Tied hashes do not currently return anything useful in scalar context, for example when used as boolean tests:

    if (%tied_hash) { ... }

The current nonsensical behaviour is always to return false, regardless of whether the hash is empty or has elements.

The root cause is that there is no interface for the implementors of tied hashes to implement the behaviour of a hash in scalar context.

Net::Ping 450_service and 510_ping_udp failures

The subtests 9 and 18 of lib/Net/Ping/t/450_service.t, and the subtest 2 of lib/Net/Ping/t/510_ping_udp.t might fail if you have an unusual networking setup. For example in the latter case the test is trying to send a UDP ping to the IP address 127.0.0.1.

B::C

The C-generating compiler backend B::C (the frontend being perlcc -c) is even more broken than it used to be because of the extensive lexical variable changes. (The good news is that B::Bytecode and ByteLoader are better than they used to be.)

Platform Specific Problems

EBCDIC Platforms

IBM z/OS and other EBCDIC platforms continue to be problematic regarding Unicode support. Many Unicode tests are skipped when they really should be fixed.

Cygwin 1.5 problems

In Cygwin 1.5 the io/tell and op/sysio tests have failures for some as yet unknown reason. In 1.5.5 the threads tests stress_cv, stress_re, and stress_string fail unless the environment variable PERLIO is set to "perlio" (which also makes the io/tell failure go away).

Perl 5.8.1 does build and work well with Cygwin 1.3: with (uname -a) CYGWIN_NT-5.0 ... 1.3.22(0.78/3/2) 2003-03-18 09:20 i686 ... a 100% "make test" was achieved with Configure -des -Duseithreads.

HP-UX: HP cc warnings about sendfile and sendpath

With certain HP C compiler releases (e.g. B.11.11.02) you will get many warnings like this (lines wrapped for easier reading):

    cc: "/usr/include/sys/socket.h", line 504: warning 562:
      Redeclaration of "sendfile" with a different storage class specifier:
        "sendfile" will have internal linkage.
    cc: "/usr/include/sys/socket.h", line 505: warning 562:
      Redeclaration of "sendpath" with a different storage class specifier:
        "sendpath" will have internal linkage.

The warnings show up both during the build of Perl and during certain lib/ExtUtils tests that invoke the C compiler. The warning, however, is not serious and can be ignored.

IRIX: t/uni/tr_7jis.t falsely failing

The test t/uni/tr_7jis.t is known to report failure under 'make test' or the test harness with certain releases of IRIX (at least IRIX 6.5 and MIPSpro Compilers Version 7.3.1.1m), but if run manually the test fully passes.

Mac OS X: no usemymalloc

The Perl malloc (-Dusemymalloc) does not work at all in Mac OS X. This is not that serious, though, since the native malloc works just fine.

Tru64: No threaded builds with GNU cc (gcc)

In the latest Tru64 releases (e.g. v5.1B or later) gcc cannot be used to compile a threaded Perl (-Duseithreads) because the system <pthread.h> file doesn't know about gcc.

Win32: sysopen, sysread, syswrite

As of the 5.8.0 release, sysopen()/sysread()/syswrite() do not behave like they used to in 5.6.1 and earlier with respect to "text" mode. These built-ins now always operate in "binary" mode (even if sysopen() was passed the O_TEXT flag, or if binmode() was used on the file handle). Note that this issue should only make a difference for disk files, as sockets and pipes have always been in "binary" mode in the Windows port. As this behavior is currently considered a bug, compatible behavior may be re-introduced in a future release. Until then, the use of sysopen(), sysread() and syswrite() is not supported for "text" mode operations.

Future Directions

The following things might happen in future. The first publicly available releases having these characteristics will be the developer releases Perl 5.9.x, culminating in the Perl 5.10.0 release. These are our best guesses at the moment: we reserve the right to rethink.

  • PerlIO will become The Default. Currently (in Perl 5.8.x) the stdio library is still used if Perl thinks it can use certain tricks to make stdio go really fast. For future releases our goal is to make PerlIO go even faster.

  • A new feature called assertions will be available. This means that one can have code called assertions sprinkled in the code: usually they are optimised away, but they can be enabled with the -A option.

  • A new operator // (defined-or) will be available. This means that one will be able to say

        $a // $b

    instead of

        defined $a ? $a : $b

    and

        $c //= $d;

    instead of

        $c = $d unless defined $c;

    The operator will have the same precedence and associativity as ||. A source code patch against the Perl 5.8.1 sources will be available in CPAN as authors/id/H/HM/HMBRAND/dor-5.8.1.diff.

  • unpack() will default to unpacking $_.

  • Various Copy-On-Write techniques will be investigated in hopes of speeding up Perl.

  • CPANPLUS, Inline, and Module::Build will become core modules.

  • The ability to write true lexically scoped pragmas will be introduced.

  • Work will continue on the bytecompiler and byteloader.

  • v-strings as they currently exist are scheduled to be deprecated. The v-less form (1.2.3) will become a "version object" when used with use, require, and $VERSION. $^V will also be a "version object" so the printf("%vd",...) construct will no longer be needed. The v-ful version (v1.2.3) will become obsolete. The equivalence of strings and v-strings (e.g. that currently 5.8.0 is equal to "\5\8\0") will go away. There may be no deprecation warning for v-strings, though: it is quite hard to detect when v-strings are being used safely, and when they are not.

  • 5.005 Threads Will Be Removed

  • The $* Variable Will Be Removed (it was deprecated a long time ago)

  • Pseudohashes Will Be Removed
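
The difference between || and the defined-or operator described in the list above is easiest to see with a value that is defined but false. The following is a small sketch, assuming a perl in which // is already available (it shipped in Perl 5.10.0); the %opt hash and its keys are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical settings hash: 'retries' is defined but false (0),
# 'timeout' is missing entirely.
my %opt = (retries => 0);

my $with_or  = $opt{retries} || 3;   # || tests truth, so 0 is replaced by 3
my $with_dor = $opt{retries} // 3;   # // tests definedness, so 0 survives

my $timeout = $opt{timeout};
$timeout //= 30;                     # assigns only because $timeout is undef

print "||: $with_or, //: $with_dor, timeout: $timeout\n";
```

With || a legitimate setting of 0 would be silently overridden, which is the usual reason for reaching for // when supplying defaults.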

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org/. There may also be information at http://www.perl.com/, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

perl582delta

NAME

perl582delta - what is new for perl v5.8.2

DESCRIPTION

This document describes differences between the 5.8.1 release and the 5.8.2 release.

If you are upgrading from an earlier release such as 5.6.1, first read the perl58delta, which describes differences between 5.6.0 and 5.8.0, and the perl581delta, which describes differences between 5.8.0 and 5.8.1.

Incompatible Changes

For threaded builds for modules calling certain re-entrant system calls, binary compatibility was accidentally lost between 5.8.0 and 5.8.1. Binary compatibility with 5.8.0 has been restored in 5.8.2, which necessitates breaking compatibility with 5.8.1. We see this as the lesser of two evils.

This will only affect people who have a threaded perl 5.8.1, and compiled modules which use these calls, and now attempt to run the compiled modules with 5.8.2. The fix is to re-compile and re-install the modules using 5.8.2.

Core Enhancements

Hash Randomisation

The hash randomisation introduced with 5.8.1 has been amended. It transpired that although the implementation introduced in 5.8.1 was source compatible with 5.8.0, it was not binary compatible in certain cases. 5.8.2 contains an improved implementation which is both source and binary compatible with both 5.8.0 and 5.8.1, and remains robust against the form of attack which prompted the change for 5.8.1.

We are grateful to the Debian project for their input in this area. See Algorithmic Complexity Attacks in perlsec for the original rationale behind this change.

Threading

Several memory leaks associated with variables shared between threads have been fixed.

Modules and Pragmata

Updated Modules And Pragmata

The following modules and pragmata have been updated since Perl 5.8.1:

  • Devel::PPPort
  • Digest::MD5
  • I18N::LangTags
  • libnet
  • MIME::Base64
  • Pod::Perldoc
  • strict

    Documentation improved

  • Tie::Hash

    Documentation improved

  • Time::HiRes
  • Unicode::Collate
  • Unicode::Normalize
  • UNIVERSAL

    Documentation improved

Selected Bug Fixes

Some syntax errors involving unrecognized filetest operators are now handled correctly by the parser.

Changed Internals

Interpreter initialization is more complete when -DMULTIPLICITY is off. This should resolve problems with initializing and destroying the Perl interpreter more than once in a single process.

Platform Specific Problems

Dynamic linker flags have been tweaked for Solaris and OS X, which should solve problems seen while building some XS modules.

Bugs in OS/2 sockets and tmpfile have been fixed.

In OS X setreuid and friends are troublesome - perl will now work around their problems as best possible.

Future Directions

Starting with 5.8.3 we intend to make more frequent maintenance releases, with a smaller number of changes in each. The intent is to propagate bug fixes out to stable releases more rapidly and make upgrading stable releases less of an upheaval. This should give end users more flexibility in their choice of upgrade timing, and allow them easier assessment of the impact of upgrades. The current plan is for code freezes as follows

  • 5.8.3 23:59:59 GMT, Wednesday December 31st 2003

  • 5.8.4 23:59:59 GMT, Wednesday March 31st 2004

  • 5.8.5 23:59:59 GMT, Wednesday June 30th 2004

with the release following soon after, when testing is complete.

See Future Directions in perl581delta for more soothsaying.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org/. There may also be information at http://www.perl.com/, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl583delta

NAME

perl583delta - what is new for perl v5.8.3

DESCRIPTION

This document describes differences between the 5.8.2 release and the 5.8.3 release.

If you are upgrading from an earlier release such as 5.6.1, first read the perl58delta, which describes differences between 5.6.0 and 5.8.0, and the perl581delta and perl582delta, which describe differences between 5.8.0, 5.8.1 and 5.8.2.

Incompatible Changes

There are no changes incompatible with 5.8.2.

Core Enhancements

A SCALAR method is now available for tied hashes. This is called when a tied hash is used in scalar context, such as

    if (%tied_hash) {
        ...
    }

The old behaviour was that %tied_hash would return whatever would have been returned for that hash before the hash was tied (so usually 0). The new behaviour in the absence of a SCALAR method is to return TRUE if in the middle of an each iteration, and otherwise call FIRSTKEY to check if the hash is empty (making sure that a subsequent each will also begin by calling FIRSTKEY). Please see SCALAR in perltie for the full details and caveats.
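
As an illustration of the SCALAR method, here is a minimal sketch of a tied hash class that reports its number of keys in scalar context. The class name CountedHash is hypothetical, and inheriting from Tie::StdHash is just a convenient way to get the remaining tie methods; perltie remains the authoritative description.

```perl
package CountedHash;    # hypothetical class name, for illustration only
use strict;
use warnings;
require Tie::Hash;
our @ISA = ('Tie::StdHash');    # inherits TIEHASH, FETCH, STORE, ...

# Called when the tied hash is evaluated in scalar context.
sub SCALAR {
    my ($self) = @_;
    return scalar keys %$self;  # number of keys; 0 (false) when empty
}

package main;

tie my %tied_hash, 'CountedHash';
my $when_empty = %tied_hash ? 'true' : 'false';

$tied_hash{answer} = 42;
my $when_filled = scalar(%tied_hash);

print "empty: $when_empty, filled: $when_filled\n";
```

Returning a count keeps the boolean test meaningful and gives callers something more useful than a bare true value.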

Modules and Pragmata

  • CGI
  • Cwd
  • Digest
  • Digest::MD5
  • Encode
  • File::Spec
  • FindBin

    An again() function is provided to resolve problems where modules in different directories wish to use FindBin.

  • List::Util

    You can now weaken references to read-only values.

  • Math::BigInt
  • PodParser
  • Pod::Perldoc
  • POSIX
  • Unicode::Collate
  • Unicode::Normalize
  • Test::Harness
  • threads::shared

    cond_wait has a new two argument form. cond_timedwait has been added.

Utility Changes

find2perl now assumes -print as a default action. Previously, it needed to be specified explicitly.

A new utility, prove , makes it easy to run an individual regression test at the command line. prove is part of Test::Harness, which users of earlier Perl versions can install from CPAN.

New Documentation

The documentation has been revised in places to produce more standard manpages.

The documentation for the special code blocks (BEGIN, CHECK, INIT, END) has been improved.

Installation and Configuration Improvements

Perl now builds on OpenVMS I64

Selected Bug Fixes

Using substr() on a UTF8 string could cause subsequent accesses on that string to return garbage. This was due to incorrect UTF8 offsets being cached, and is now fixed.

join() could return garbage when the same join() statement was used to process 8 bit data having earlier processed UTF8 data, due to the flags on that statement's temporary workspace not being reset correctly. This is now fixed.

$a .. $b will now work as expected when either $a or $b is undef.

Using Unicode keys with tied hashes should now work correctly.

Reading $^E now preserves $!. Previously, the C code implementing $^E did not preserve errno , so reading $^E could cause errno and therefore $! to change unexpectedly.

Reentrant functions will (once more) work with C++. 5.8.2 introduced a bugfix which accidentally broke the compilation of Perl extensions written in C++.

New or Changed Diagnostics

The fatal error "DESTROY created new reference to dead object" is now documented in perldiag.

Changed Internals

The hash code has been refactored to reduce source duplication. The external interface is unchanged, and aside from the bug fixes described above, there should be no change in behaviour.

hv_clear_placeholders is now part of the perl API

Some C macros have been tidied. In particular macros which create temporary local variables now name these variables more defensively, which should avoid bugs where names clash.

<signal.h> is now always included.

Configuration and Building

Configure now invokes callbacks regardless of the value of the variable they are called for. Previously callbacks were only invoked in the "case $variable $define)" branch. This change should only affect platform maintainers writing configuration hints files.

Platform Specific Problems

The regression test ext/threads/shared/t/wait.t fails on early RedHat 9 and HP-UX 10.20 due to bugs in their threading implementations. RedHat users should see https://rhn.redhat.com/errata/RHBA-2003-136.html and consider upgrading their glibc.

Known Problems

Detached threads aren't supported on Windows yet, as they may lead to memory access violation problems.

There is a known race condition opening scripts in suidperl. suidperl is neither built nor installed by default, and has been deprecated since perl 5.8.0. You are advised to replace use of suidperl with tools such as sudo (http://www.courtesan.com/sudo/).

We have a backlog of unresolved bugs. Dealing with bugs and bug reports is unglamorous work; not something ideally suited to volunteer labour, but that is all that we have.

The perl5 development team are implementing changes to help address this problem, which should go live in early 2004.

Future Directions

Code freeze for the next maintenance release (5.8.4) is on March 31st 2004, with release expected by mid April. Similarly 5.8.5's freeze will be at the end of June, with release by mid July.

Obituary

Iain 'Spoon' Truskett, Perl hacker, author of perlreref and contributor to CPAN, died suddenly on 29th December 2003, aged 24. He will be missed.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org. There may also be information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl584delta

NAME

perl584delta - what is new for perl v5.8.4

DESCRIPTION

This document describes differences between the 5.8.3 release and the 5.8.4 release.

Incompatible Changes

Many minor bugs have been fixed. Scripts which happen to rely on previously erroneous behaviour will consider these fixes as incompatible changes :-) You are advised to perform sufficient acceptance testing on this release to satisfy yourself that this does not affect you, before putting this release into production.

The diagnostic output of Carp has been changed slightly, to add a space after the comma between arguments. This makes it much easier for tools such as web browsers to wrap it, but might confuse any automatic tools which perform detailed parsing of Carp output.

The internal dump output has been improved, so that non-printable characters such as newline and backspace are output in \x notation, rather than octal. This might just confuse non-robust tools which parse the output of modules such as Devel::Peek.

Core Enhancements

Malloc wrapping

Perl can now be built to detect attempts to assign pathologically large chunks of memory. Previously such assignments would suffer from integer wrap-around during size calculations causing a misallocation, which would crash perl, and could theoretically be used for "stack smashing" attacks. The wrapping defaults to enabled on platforms where we know it works (most AIX configurations, BSDi, Darwin, DEC OSF/1, FreeBSD, HP/UX, GNU Linux, OpenBSD, Solaris, VMS and most Win32 compilers) and defaults to disabled on other platforms.

Unicode Character Database 4.0.1

The copy of the Unicode Character Database included in Perl 5.8 has been updated to 4.0.1 from 4.0.0.

suidperl less insecure

Paul Szabo has analysed and patched suidperl to remove existing known insecurities. Currently there are no known holes in suidperl, but previous experience shows that we cannot be confident that these were the last. You may no longer invoke the set uid perl directly, so to preserve backwards compatibility with scripts that invoke #!/usr/bin/suidperl the only set uid binary is now sperl5.8.n (sperl5.8.4 for this release). suidperl is installed as a hard link to perl; both suidperl and perl will automatically invoke the set uid binary sperl5.8.4, so this change should be completely transparent.

For new projects the core perl team would strongly recommend that you use dedicated, single purpose security tools such as sudo in preference to suidperl .

format

In addition to bug fixes, format's features have been enhanced. See perlform.

Modules and Pragmata

The (mis)use of /tmp in core modules and documentation has been tidied up. Some modules available both within the perl core and independently from CPAN ("dual-life modules") have not yet had these changes applied; the changes will be integrated into future stable perl releases as the modules are updated on CPAN.

Updated modules

  • Attribute::Handlers
  • B
  • Benchmark
  • CGI
  • Carp
  • Cwd
  • Exporter
  • File::Find
  • IO
  • IPC::Open3
  • Locale::Maketext
  • Math::BigFloat
  • Math::BigInt
  • Math::BigRat
  • MIME::Base64
  • ODBM_File
  • POSIX
  • Shell
  • Socket

    There is experimental support for Linux abstract Unix domain sockets.

  • Storable
  • Switch

    Synced with its CPAN version 2.10

  • Sys::Syslog

    syslog() can now use numeric constants for facility names and priorities, in addition to strings.

  • Term::ANSIColor
  • Time::HiRes
  • Unicode::UCD
  • Win32

    Win32.pm/Win32.xs has moved from the libwin32 module to core Perl

  • base
  • open
  • threads

    Detached threads are now also supported on Windows.

  • utf8

Performance Enhancements

  • Accelerated Unicode case mappings (/i, lc, uc, etc).

  • In place sort optimised (eg @a = sort @a )

  • Unnecessary assignment optimised away in

        my $s = undef;
        my @a = ();
        my %h = ();

  • Optimised map in scalar context

Utility Changes

The Perl debugger (lib/perl5db.pl) can now save all debugger commands for sourcing later, and can display the parent inheritance tree of a given class.

Installation and Configuration Improvements

The build process on both VMS and Windows has had several minor improvements made. On Windows Borland's C compiler can now compile perl with PerlIO and/or USE_LARGE_FILES enabled.

perl.exe on Windows now has a "Camel" logo icon. The use of a camel with the topic of Perl is a trademark of O'Reilly and Associates Inc., and is used with their permission (ie distribution of the source, compiling a Windows executable from it, and using that executable locally). Use of the supplied camel for anything other than a perl executable's icon is specifically not covered, and anyone wishing to redistribute perl binaries with the icon should check directly with O'Reilly beforehand.

Perl should build cleanly on Stratus VOS once more.

Selected Bug Fixes

More utf8 bugs fixed, notably in how chomp, chop, send, and syswrite interact with utf8 data. Concatenation now works correctly when use bytes; is in scope.

Pragmata are now correctly propagated into (?{...}) constructions in regexps. Code such as

    my $x = qr{ ... (??{ $x }) ... };

will now (correctly) fail under use strict. (The inner $x refers, and has always referred, to $::x, because the lexical $x is not yet in scope inside its own initialiser.)
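
One way to write such a self-referencing pattern so that it compiles under use strict is to declare the lexical before assigning to it. The sketch below follows the balanced-parentheses idiom from perlre; the variable name $balanced and the sample strings are made up for illustration, and a reasonably modern perl is assumed.

```perl
use strict;
use warnings;

# Declare the lexical first, so that the $balanced used inside the
# postponed subexpression refers to it rather than to an undeclared
# package variable.
my $balanced;
$balanced = qr{
    \(                      # opening parenthesis
    (?:
        [^()]+              # a run of non-parenthesis characters
      |
        (??{ $balanced })   # or a nested balanced group
    )*
    \)                      # closing parenthesis
}x;

for my $text ('f(a(b)c)', 'f((a') {
    my $verdict = $text =~ $balanced ? 'balanced group found'
                                     : 'no balanced group';
    printf "%-10s %s\n", $text, $verdict;
}
```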

The "const in void context" warning has been suppressed for a constant in an optimised-away boolean expression such as 5 || print;

perl -i could fchmod(stdin) by mistake. This is serious if stdin is attached to a terminal, and perl is running as root. Now fixed.

New or Changed Diagnostics

The output of Carp and of the internal diagnostic routines used by Devel::Peek has been made clearer, as described in Incompatible Changes.

Changed Internals

Some bugs have been fixed in the hash internals. Restricted hashes and their place holders are now allocated and deleted at slightly different times, but this should not be visible to user code.

Future Directions

Code freeze for the next maintenance release (5.8.5) will be on 30th June 2004, with release by mid July.

Platform Specific Problems

This release is known not to build on Windows 95.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org. There may also be information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl585delta

NAME

perl585delta - what is new for perl v5.8.5

DESCRIPTION

This document describes differences between the 5.8.4 release and the 5.8.5 release.

Incompatible Changes

There are no changes incompatible with 5.8.4.

Core Enhancements

Perl's regular expression engine now contains support for matching on the intersection of two Unicode character classes. You can also now refer to user-defined character classes from within other user defined character classes.
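
A minimal sketch of both features, using the user-defined property syntax documented in perlunicode: one class refers to two others and takes their intersection with a leading "&". The class names and code point ranges here are invented for the example.

```perl
use strict;
use warnings;

# User-defined character classes are subs whose names begin with "In" or
# "Is"; each returns lines of tab-separated hexadecimal code point ranges.
sub InAsciiLower { return "61\t7A\n" }             # a-z
sub InFirstHalf  { return "41\t4D\n61\t6D\n" }     # A-M and a-m

# "+" includes another class, "&" intersects with one. Both lines refer
# to the user-defined classes above by their fully qualified names.
sub InLowerFirstHalf {
    return "+main::InAsciiLower\n&main::InFirstHalf\n";
}

for my $char (qw(c x C)) {
    my $verdict = $char =~ /\p{InLowerFirstHalf}/ ? 'matches'
                                                  : 'does not match';
    print "$char $verdict\n";
}
```

Only "c" is both a lowercase ASCII letter and in the first half of the alphabet, so it is the only one of the three sample characters that falls in the intersection.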

Modules and Pragmata

  • Carp improved to work nicely with Safe. Carp's message reporting should now be anomaly free - it will always print out line number information.

  • CGI upgraded to version 3.05

  • charnames now avoids clobbering $_

  • Digest upgraded to version 1.08

  • Encode upgraded to version 2.01

  • FileCache upgraded to version 1.04

  • libnet upgraded to version 1.19

  • Pod::Parser upgraded to version 1.28

  • Pod::Perldoc upgraded to version 3.13

  • Pod::LaTeX upgraded to version 0.57

  • Safe now works properly with Carp

  • Scalar-List-Utils upgraded to version 1.14

  • Shell's documentation has been re-written, and its historical partial auto-quoting of command arguments can now be disabled.

  • Test upgraded to version 1.25

  • Test::Harness upgraded to version 2.42

  • Time::Local upgraded to version 1.10

  • Unicode::Collate upgraded to version 0.40

  • Unicode::Normalize upgraded to version 0.30

Utility Changes

Perl's debugger

The debugger can now emulate stepping backwards, by restarting and rerunning all bar the last command from a saved command history.

h2ph

h2ph is now able to understand a very limited set of C inline functions -- basically, the inline functions that look like CPP macros. This has been introduced to deal with some of the headers of the newest versions of the glibc. The standard warning still applies; to quote h2ph's documentation, you may need to dicker with the files produced.

Installation and Configuration Improvements

Perl 5.8.5 should build cleanly from source on LynxOS.

Selected Bug Fixes

  • The in-place sort optimisation introduced in 5.8.4 had a bug. For example, in code such as

        @a = sort ($b, @a)

    the result would omit the value $b. This is now fixed.

  • The optimisation for unnecessary assignments introduced in 5.8.4 could give spurious warnings. This has been fixed.

  • Perl should now correctly detect and read BOM-marked and (BOMless) UTF-16 scripts of either endianness.

  • Creating a new thread when weak references exist was buggy, and would often cause warnings at interpreter destruction time. The known bug is now fixed.

  • Several obscure bugs involving manipulating Unicode strings with substr have been fixed.

  • Previously if Perl's file globbing function encountered a directory that it did not have permission to open it would return immediately, leading to unexpected truncation of the list of results. This has been fixed, to be consistent with Unix shells' globbing behaviour.

  • Thread creation time could vary wildly between identical runs. This was caused by a poor hashing algorithm in the thread cloning routines, which has now been fixed.

  • The internals of the ithreads implementation were not checking if OS-level thread creation had failed. threads->create() now returns undef if thread creation fails instead of crashing perl.

New or Changed Diagnostics

  • Perl -V has several improvements

    • correctly outputs local patch names that contain embedded code snippets or other characters that used to confuse it.

    • arguments to -V that look like regexps will give multiple lines of output.

    • a trailing colon suppresses the linefeed and ';' terminator, allowing embedding of queries into shell commands.

    • a leading colon removes the 'name=' part of the response, allowing mapping to any name.

  • When perl fails to find the specified script, it now outputs a second line suggesting that the user use the -S flag:

        $ perl5.8.5 missing.pl
        Can't open perl script "missing.pl": No such file or directory.
        Use -S to search $PATH for it.

Changed Internals

The Unicode character class files used by the regular expression engine are now built at build time from the supplied Unicode consortium data files, instead of being shipped prebuilt. This makes the compressed Perl source tarball about 200K smaller. A side effect is that the layout of files inside lib/unicore has changed.

Known Problems

The regression test t/uni/class.t is now performing considerably more tests, and can take several minutes to run even on a fast machine.

Platform Specific Problems

This release is known not to build on Windows 95.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org. There may also be information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl586delta

NAME

perl586delta - what is new for perl v5.8.6

DESCRIPTION

This document describes differences between the 5.8.5 release and the 5.8.6 release.

Incompatible Changes

There are no changes incompatible with 5.8.5.

Core Enhancements

The perl interpreter is now more tolerant of UTF-16-encoded scripts.

On Win32, Perl can now use non-IFS compatible LSPs, which allows Perl to work in conjunction with firewalls such as McAfee Guardian. For full details see the file README.win32, particularly if you're running Win95.

Modules and Pragmata

  • With the base pragma, an intermediate class with no fields used to mess up private fields in the base class. This has been fixed.

  • Cwd upgraded to version 3.01 (as part of the new PathTools distribution)

  • Devel::PPPort upgraded to version 3.03

  • File::Spec upgraded to version 3.01 (as part of the new PathTools distribution)

  • Encode upgraded to version 2.08

  • ExtUtils::MakeMaker remains at version 6.17, as later stable releases currently available on CPAN have some issues with core modules on some core platforms.

  • I18N::LangTags upgraded to version 0.35

  • Math::BigInt upgraded to version 1.73

  • Math::BigRat upgraded to version 0.13

  • MIME::Base64 upgraded to version 3.05

  • POSIX::sigprocmask function can now retrieve the current signal mask without also setting it.

  • Time::HiRes upgraded to version 1.65
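The POSIX::sigprocmask change in the list above can be sketched as follows. This is an illustration under stated assumptions, not code from the release: it assumes a POSIX-style platform, and the helper name current_signal_mask is hypothetical. Passing undef as the new signal set asks sigprocmask only to report the current mask.

```perl
use strict;
use warnings;
use POSIX qw(sigprocmask SIG_BLOCK SIG_SETMASK SIGUSR1);

# Hypothetical helper: return the current signal mask without changing it.
# With undef as the new set, only the "old set" argument is filled in.
sub current_signal_mask {
    my $current = POSIX::SigSet->new;
    sigprocmask(SIG_BLOCK, undef, $current)
        or die "sigprocmask failed: $!";
    return $current;
}

# Block SIGUSR1, query the mask, then restore the previous mask.
my $saved = POSIX::SigSet->new;
sigprocmask(SIG_BLOCK, POSIX::SigSet->new(SIGUSR1), $saved)
    or die "sigprocmask failed: $!";

my $mask = current_signal_mask();
printf "SIGUSR1 blocked: %s\n", $mask->ismember(SIGUSR1) ? "yes" : "no";

sigprocmask(SIG_SETMASK, $saved)
    or die "sigprocmask failed: $!";
```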

Utility Changes

Perl has a new -dt command-line flag, which enables threads support in the debugger.

Performance Enhancements

reverse sort ... is now optimized to sort in reverse, avoiding the generation of a temporary intermediate list.

for (reverse @foo) now iterates in reverse, avoiding the generation of a temporary reversed list.
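As a brief illustration, these are the two source forms the notes above refer to. The variable names are arbitrary; the results are the same with or without the optimisation, which only affects how much temporary storage is used.

```perl
use strict;
use warnings;

my @words = qw(pear apple fig);

# reverse sort: the sorted list in descending order.
my @descending = reverse sort @words;

# for (reverse @list): visit the elements from last to first.
my @seen;
for my $word (reverse @words) {
    push @seen, $word;
}

print "@descending\n";   # pear fig apple
print "@seen\n";         # fig apple pear
```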

Selected Bug Fixes

The regexp engine is now more robust when given invalid utf8 input, as is sometimes generated by buggy XS modules.

foreach on threads::shared array used to be able to crash Perl. This bug has now been fixed.

A regexp in STDOUT's destructor used to coredump, because the regexp pad was already freed. This has been fixed.

goto & is now more robust - bugs in deep recursion and chained goto & have been fixed.

Using delete on an array no longer leaks memory. A pop of an item from a shared array reference no longer causes a leak.

eval_sv() failing a taint test could corrupt the stack - this has been fixed.

On platforms with 64 bit pointers numeric comparison operators used to erroneously compare the addresses of references that are overloaded, rather than using the overloaded values. This has been fixed.

read into a UTF8-encoded buffer with an offset off the end of the buffer no longer mis-calculates buffer lengths.

Although Perl has promised since version 5.8 that sort() would be stable, the two cases sort {$b cmp $a} and sort {$b <=> $a} could produce non-stable sorts. This is corrected in perl5.8.6.
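A small sketch of what stability means for the sort {$b <=> $a} case. The input values are chosen for illustration: "1" and "1.0" compare equal numerically, as do "2" and "2.0", so a stable sort keeps each pair in its original relative order and yields 2 2.0 1.0 1.

```perl
use strict;
use warnings;

# Elements that compare equal must keep their original relative order.
my @input  = ('1.0', '1', '2', '2.0');
my @sorted = sort { $b <=> $a } @input;

print "@sorted\n";
```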

Localising $^D no longer generates a diagnostic message about valid -D flags.

New or Changed Diagnostics

For -t and -T, the message 'Too late for "-T" option' has been changed to the more informative '"-T" is on the #! line, it must also be used on the command line'.

Changed Internals

From now on all applications embedding perl will behave as if perl were compiled with -DPERL_USE_SAFE_PUTENV. See "Environment access" in the INSTALL file for details.

Most C source files now have comments at the top explaining their purpose, which should help anyone wishing to get an overview of the implementation.

New Tests

There are significantly more tests for the B suite of modules.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org. There may also be information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl587delta

NAME

perl587delta - what is new for perl v5.8.7

DESCRIPTION

This document describes differences between the 5.8.6 release and the 5.8.7 release.

Incompatible Changes

There are no changes incompatible with 5.8.6.

Core Enhancements

Unicode Character Database 4.1.0

The copy of the Unicode Character Database included in Perl 5.8 has been updated to 4.1.0 from 4.0.1. See http://www.unicode.org/versions/Unicode4.1.0/#NotableChanges for the notable changes.

suidperl less insecure

A pair of exploits in suidperl involving debugging code have been closed.

For new projects the core perl team strongly recommends that you use dedicated, single purpose security tools such as sudo in preference to suidperl.

Optional site customization script

The perl interpreter can be built to allow the use of a site customization script. By default this is not enabled, to be consistent with previous perl releases. To use this, add -Dusesitecustomize to the command line flags when running the Configure script. See also -f in perlrun.

Config.pm is now much smaller.

Config.pm is now about 3K rather than 32K, with the infrequently used code and %Config values loaded on demand. This is transparent to the programmer, but means that most code will save parsing and loading 29K of script (for example, code that uses File::Find).

Modules and Pragmata

  • B upgraded to version 1.09

  • base upgraded to version 2.07

  • bignum upgraded to version 0.17

  • bytes upgraded to version 1.02

  • Carp upgraded to version 1.04

  • CGI upgraded to version 3.10

  • Class::ISA upgraded to version 0.33

  • Data::Dumper upgraded to version 2.121_02

  • DB_File upgraded to version 1.811

  • Devel::PPPort upgraded to version 3.06

  • Digest upgraded to version 1.10

  • Encode upgraded to version 2.10

  • FileCache upgraded to version 1.05

  • File::Path upgraded to version 1.07

  • File::Temp upgraded to version 0.16

  • IO::File upgraded to version 1.11

  • IO::Socket upgraded to version 1.28

  • Math::BigInt upgraded to version 1.77

  • Math::BigRat upgraded to version 0.15

  • overload upgraded to version 1.03

  • PathTools upgraded to version 3.05

  • Pod::HTML upgraded to version 1.0503

  • Pod::Perldoc upgraded to version 3.14

  • Pod::LaTeX upgraded to version 0.58

  • Pod::Parser upgraded to version 1.30

  • Symbol upgraded to version 1.06

  • Term::ANSIColor upgraded to version 1.09

  • Test::Harness upgraded to version 2.48

  • Test::Simple upgraded to version 0.54

  • Text::Wrap upgraded to version 2001.09293, to fix a bug when wrap() was called with a non-space separator.

  • threads::shared upgraded to version 0.93

  • Time::HiRes upgraded to version 1.66

  • Time::Local upgraded to version 1.11

  • Unicode::Normalize upgraded to version 0.32

  • utf8 upgraded to version 1.05

  • Win32 upgraded to version 0.24, which provides Win32::GetFileVersion

Utility Changes

find2perl enhancements

find2perl has new options -iname, -path and -ipath.

Performance Enhancements

The internal pointer mapping hash used during ithreads cloning now uses an arena for memory allocation. In tests this reduced ithreads cloning time by about 10%.

Installation and Configuration Improvements

  • The Win32 "dmake" makefile.mk has been updated to make it compatible with the latest versions of dmake.

  • PERL_MALLOC, DEBUG_MSTATS, PERL_HASH_SEED_EXPLICIT and NO_HASH_SEED should now work in Win32 makefiles.

Selected Bug Fixes

  • The socket() function on Win32 has been fixed so that it is able to use transport providers which specify a protocol of 0 (meaning any protocol is allowed) once more. (This was broken in 5.8.6, and typically caused the use of ICMP sockets to fail.)

  • Another obscure bug involving substr and UTF-8 caused by bad internal offset caching has been identified and fixed.

  • A bug involving the loading of UTF-8 tables by the regexp engine has been fixed - code such as "\x{100}" =~ /[[:print:]]/ will no longer give corrupt results.

  • Case conversion operations such as uc on a long Unicode string could exhaust memory. This has been fixed.

  • index/rindex were buggy for some combinations of Unicode and non-Unicode data. This has been fixed.

  • read (and presumably sysread) would expose the UTF-8 internals when reading from a byte oriented file handle into a UTF-8 scalar. This has been fixed.

  • Several pack/unpack bug fixes:

    • Checksums with b or B formats were broken.

    • unpack checksums could overflow with the C format.

    • U0 and C0 are now scoped to () pack sub-templates.

    • Counted length prefixes now don't change C0/U0 mode.

    • pack Z0 used to destroy the preceding character.

    • P/p pack formats used to only recognise literal undef.

  • Using closures with ithreads could cause perl to crash. This was due to failure to correctly lock internal OP structures, and has been fixed.

  • The return value of close now correctly reflects any file errors that occur while flushing the handle's data, instead of just giving failure if the actual underlying file close operation failed.

  • not() || 1 used to segfault. not() now behaves like not(0), which was the pre 5.6.0 behaviour.

  • h2ph has various enhancements to cope with constructs in header files that used to result in incorrect or invalid output.
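Among the pack/unpack fixes listed above, the checksum forms are the least familiar. A minimal sketch of how an unpack checksum is requested, with sample data chosen for illustration:

```perl
use strict;
use warnings;

# A %<bits> prefix makes unpack return a checksum of the unpacked values
# instead of the values themselves.
my $bytes    = "abc";
my $byte_sum = unpack("%32C*", $bytes);      # 97 + 98 + 99

# With the b (bit string) format the checksum counts the set bits.
my $bitmask  = pack("C", 0b10110000);
my $set_bits = unpack("%32b*", $bitmask);

print "byte sum: $byte_sum, set bits: $set_bits\n";   # byte sum: 294, set bits: 3
```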

New or Changed Diagnostics

There is a new taint error, "%ENV is aliased to %s". This error is thrown when taint checks are enabled and when *ENV has been aliased, so that %ENV has no env-magic anymore and hence the environment cannot be verified as taint-free.

The internals of pack and unpack have been updated. All legitimate templates should work as before, but there may be some changes in the error reported for complex failure cases. Any behaviour changes for non-error cases are bugs, and should be reported.

Changed Internals

There has been a fair amount of refactoring of the C source code, partly to make it tidier and more maintainable. The resulting object code and the perl binary may well be smaller than 5.8.6, and hopefully faster in some cases, but apart from this there should be no user-detectable changes.

${^UTF8LOCALE} has been added to give perl space access to PL_utf8locale.

The size of the arenas used to allocate SV heads and most SV bodies can now be changed at compile time. The old size was 1008 bytes, the new default size is 4080 bytes.

Known Problems

Unicode strings returned from overloaded operators can be buggy. This is a long standing bug reported since 5.8.6 was released, but we do not yet have a suitable fix for it.

Platform Specific Problems

On UNICOS, lib/Math/BigInt/t/bigintc.t hangs burning CPU. ext/B/t/bytecode.t and ext/Socket/t/socketpair.t both fail tests. These are unlikely to be resolved, as our valiant UNICOS porter's last Cray is being decommissioned.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org. There may also be information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perl588delta

NAME

perl588delta - what is new for perl v5.8.8

DESCRIPTION

This document describes differences between the 5.8.7 release and the 5.8.8 release.

Incompatible Changes

There are no changes intentionally incompatible with 5.8.7. If any exist, they are bugs and reports are welcome.

Core Enhancements

  • chdir, chmod and chown can now work on filehandles as well as filenames, if the system supports fchdir, fchmod and fchown respectively, thanks to a patch provided by Gisle Aas.
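A minimal sketch of the filehandle form, assuming a system that provides fchmod. The temporary file and the mode 0640 are illustrative choices. On a system without fchmod, passing a filehandle raises an exception, hence the eval.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);

# Change permissions through the open handle rather than the path name.
my $changed = eval { chmod 0640, $fh };

if ($changed) {
    my $mode = (stat $fh)[2] & 07777;
    printf "mode is now %04o\n", $mode;
} else {
    warn "chmod on a filehandle is not available here: ", $@ || $!, "\n";
}
```

Working through the handle avoids a race in which the path is renamed or replaced between opening the file and changing its permissions.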

Modules and Pragmata

  • Attribute::Handlers upgraded to version 0.78_02

    • Documentation typo fix

  • attrs upgraded to version 1.02

    • Internal cleanup only

  • autouse upgraded to version 1.05

    • Simplified implementation

  • B upgraded to version 1.09_01

    • The inheritance hierarchy of the B:: modules has been corrected; B::NV now inherits from B::SV (instead of B::IV).

  • blib upgraded to version 1.03

    • Documentation typo fix

  • ByteLoader upgraded to version 0.06

    • Internal cleanup

  • CGI upgraded to version 3.15

    • Extraneous "?" from self_url() removed

    • scrolling_list() select attribute fixed

    • virtual_port now works properly with the https protocol

    • upload_hook() and append() now work in function-oriented mode

    • POST_MAX doesn't cause the client to hang any more

    • Automatic tab indexes are now disabled and a new -tabindex pragma has been added to turn automatic indexes back on

    • end_form() doesn't emit empty (and non-validating) <div>

    • CGI::Carp works better in certain mod_perl configurations

    • Setting $CGI::TMPDIRECTORY is now effective

    • Enhanced documentation

  • charnames upgraded to version 1.05

    • viacode() now accepts hex strings and has been optimized.

  • CPAN upgraded to version 1.76_02

    • 1 minor bug fix for Win32

  • Cwd upgraded to version 3.12

    • canonpath() on Win32 now collapses foo\.. sections correctly.

    • Improved behaviour on Symbian OS.

    • Enhanced documentation and typo fixes

    • Internal cleanup

  • Data::Dumper upgraded to version 2.121_08

    • A problem where Data::Dumper would sometimes update the iterator state of hashes has been fixed

    • Numeric labels now work

    • Internal cleanup

  • DB upgraded to version 1.01

    • A problem where the state of the regexp engine would sometimes get clobbered when running under the debugger has been fixed.

  • DB_File upgraded to version 1.814

    • Adds support for Berkeley DB 4.4.

  • Devel::DProf upgraded to version 20050603.00

    • Internal cleanup

  • Devel::Peek upgraded to version 1.03

    • Internal cleanup

  • Devel::PPPort upgraded to version 3.06_01

    • --compat-version argument checking has been improved

    • Files passed on the command line are filtered by default

    • --nofilter option to override the filtering has been added

    • Enhanced documentation

  • diagnostics upgraded to version 1.15

    • Documentation typo fix

  • Digest upgraded to version 1.14

    • The constructor now knows which module implements SHA-224

    • Documentation tweaks and typo fixes

  • Digest::MD5 upgraded to version 2.36

    • XSLoader is now used for faster loading

    • Enhanced documentation including MD5 weaknesses discovered lately

  • Dumpvalue upgraded to version 1.12

    • Documentation fix

  • DynaLoader upgraded but unfortunately we're not able to increment its version number :-(

    • Implements dl_unload_file on Win32

    • Internal cleanup

    • XSLoader 0.06 incorporated; small optimisation for calling bootstrap_inherit() and documentation enhancements.

  • Encode upgraded to version 2.12

    • A coderef is now acceptable for CHECK !

    • 3 new characters added to the ISO-8859-7 encoding

    • New encoding MIME-Header-ISO_2022_JP added

    • Problem with partial characters and encoding(utf-8-strict) fixed.

    • Documentation enhancements and typo fixes

  • English upgraded to version 1.02

    • the $COMPILING variable has been added

  • ExtUtils::Constant upgraded to version 0.17

    • Improved compatibility with older versions of perl

  • ExtUtils::MakeMaker upgraded to version 6.30 (was 6.17)

  • File::Basename upgraded to version 2.74, with changes contributed by Michael Schwern.

    • Documentation clarified and errors corrected.

    • basename now strips trailing path separators before processing the name.

    • basename now returns / for parameter /, to make basename consistent with the shell utility of the same name.

    • The suffix is no longer stripped if it is identical to the remaining characters in the name, again for consistency with the shell utility.

    • Some internal code cleanup.
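The File::Basename behaviours just listed can be seen in a short sketch. The paths are made up for illustration, and the results assume Unix path syntax, which is the default on Unix-like systems.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

my $trailing = basename("/usr/lib/");          # trailing separator stripped first
my $root     = basename("/");                  # the root stays "/"
my $stripped = basename("notes.txt", ".txt");  # ordinary suffix removal
my $kept     = basename(".txt", ".txt");       # suffix equals the whole name

print join(", ", map { "'$_'" } $trailing, $root, $stripped, $kept), "\n";
# 'lib', '/', 'notes', '.txt'
```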

  • File::Copy upgraded to version 2.09

    • Copying a file onto itself used to fail.

    • Moving a file between file systems now preserves the access and modification time stamps

  • File::Find upgraded to version 1.10

    • Win32 portability fixes

    • Enhanced documentation

  • File::Glob upgraded to version 1.05

    • Internal cleanup

  • File::Path upgraded to version 1.08

    • mkpath now preserves errno when mkdir fails

  • File::Spec upgraded to version 3.12

    • File::Spec->rootdir() now returns \ on Win32, instead of /

    • $^O could sometimes become tainted. This has been fixed.

    • canonpath on Win32 now collapses foo/.. (or foo\..) sections correctly, rather than doing the "misguided" work it was previously doing. Note that canonpath on Unix still does not collapse these sections, as doing so would be incorrect.

    • Some documentation improvements

    • Some internal code cleanup

  • FileCache upgraded to version 1.06

    • POD formatting errors in the documentation fixed

  • Filter::Simple upgraded to version 0.82

  • FindBin upgraded to version 1.47

    • Now works better with directories where access rights are more restrictive than usual.

  • GDBM_File upgraded to version 1.08

    • Internal cleanup

  • Getopt::Long upgraded to version 2.35

    • prefix_pattern has now been complemented by a new configuration option long_prefix_pattern that allows the user to specify what prefix patterns should have long option style semantics applied.

    • Options can now take multiple values at once (experimental)

    • Various bug fixes

  • if upgraded to version 0.05

    • Give more meaningful error messages from if when invoked with a condition in list context.

    • Restore backwards compatibility with earlier versions of perl

  • IO upgraded to version 1.22

    • Enhanced documentation

    • Internal cleanup

  • IPC::Open2 upgraded to version 1.02

    • Enhanced documentation

  • IPC::Open3 upgraded to version 1.02

    • Enhanced documentation

  • List::Util upgraded to version 1.18 (was 1.14)

    • Fix pure-perl version of refaddr to avoid blessing an un-blessed reference

    • Use XSLoader for faster loading

    • Fixed various memory leaks

    • Internal cleanup and portability fixes

  • Math::Complex upgraded to version 1.35

    • atan2(0, i) now works, as do all the (computable) complex argument cases

    • Fixes for certain bugs in make and emake

    • Support returning the kth root directly

    • Support [2,-3pi/8] in emake

    • Support inf for make/emake

    • Document make/emake more visibly

  • Math::Trig upgraded to version 1.03

    • Add more great circle routines: great_circle_waypoint and great_circle_destination

  • MIME::Base64 upgraded to version 3.07

    • Use XSLoader for faster loading

    • Enhanced documentation

    • Internal cleanup

  • NDBM_File upgraded to version 1.06

    • Enhanced documentation

  • ODBM_File upgraded to version 1.06

    • Documentation typo fixed

    • Internal cleanup

  • Opcode upgraded to version 1.06

    • Enhanced documentation

    • Internal cleanup

  • open upgraded to version 1.05

    • Enhanced documentation

  • overload upgraded to version 1.04

    • Enhanced documentation

  • PerlIO upgraded to version 1.04

    • PerlIO::via now iterates over layers properly

    • PerlIO::scalar understands $/ = "" now

    • encoding(utf-8-strict) with partial characters now works

    • Enhanced documentation

    • Internal cleanup

  • Pod::Functions upgraded to version 1.03

    • Documentation typos fixed

  • Pod::Html upgraded to version 1.0504

    • HTML output will now correctly link to =items on the same page, and should be valid XHTML.

    • Variable names are recognized as intended

    • Documentation typos fixed

  • Pod::Parser upgraded to version 1.32

    • Allow files that start with =head on the first line

    • Win32 portability fix

    • Exit status of pod2usage fixed

    • New -noperldoc switch for pod2usage

    • Arbitrary URL schemes now allowed

    • Documentation typos fixed

  • POSIX upgraded to version 1.09

    • Documentation typos fixed

    • Internal cleanup

  • re upgraded to version 0.05

    • Documentation typo fixed

  • Safe upgraded to version 2.12

    • Minor documentation enhancement

  • SDBM_File upgraded to version 1.05

    • Documentation typo fixed

    • Internal cleanup

  • Socket upgraded to version 1.78

    • Internal cleanup

  • Storable upgraded to version 2.15

    • This includes the STORABLE_attach hook functionality added by Adam Kennedy, and more frugal memory requirements when storing under ithreads, by using the ithreads cloning tracking code.

  • Switch upgraded to version 2.10_01

    • Documentation typos fixed

  • Sys::Syslog upgraded to version 0.13

    • Now provides numeric macros and meaningful Exporter tags.

    • No longer uses Sys::Hostname as it may provide useless values in unconfigured network environments, so instead uses INADDR_LOOPBACK directly.

    • syslog() now uses local timestamp.

    • setlogmask() now behaves like its C counterpart.

    • setlogsock() will now croak() as documented.

    • Improved error and warnings messages.

    • Improved documentation.

  • Term::ANSIColor upgraded to version 1.10

    • Fixes a bug in colored when $EACHLINE is set that caused it to not color lines consisting solely of 0 (literal zero).

    • Improved tests.

  • Term::ReadLine upgraded to version 1.02

    • Documentation tweaks

  • Test::Harness upgraded to version 2.56 (was 2.48)

    • The Test::Harness timer is now off by default.

    • Now shows elapsed time in milliseconds.

    • Various bug fixes

  • Test::Simple upgraded to version 0.62 (was 0.54)

    • is_deeply() no longer fails to work for many cases

    • Various minor bug fixes

    • Documentation enhancements

  • Text::Tabs upgraded to version 2005.0824

    • Provides a faster implementation of expand

  • Text::Wrap upgraded to version 2005.082401

    • Adds $Text::Wrap::separator2, which allows you to preserve existing newlines but add line-breaks with some other string.

  • threads upgraded to version 1.07

    • threads will now honour no warnings 'threads'

    • A thread's interpreter is now freed after $t->join() rather than after undef $t, which should fix some ithreads memory leaks. (Fixed by Dave Mitchell)

    • Some documentation typo fixes.

  • threads::shared upgraded to version 0.94

    • Documentation changes only

    • Note: An improved implementation of threads::shared is available on CPAN - this will be merged into 5.8.9 if it proves stable.

  • Tie::Hash upgraded to version 1.02

    • Documentation typo fixed

  • Time::HiRes upgraded to version 1.86 (was 1.66)

    • clock_nanosleep() and clock() functions added

    • Support for the POSIX clock_gettime() and clock_getres() has been added

    • Return undef or an empty list if the C gettimeofday() function fails

    • Improved nanosleep detection

    • Internal cleanup

    • Enhanced documentation
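A hedged sketch of the new POSIX clock interface in Time::HiRes, assuming a platform that provides clock_gettime() and clock_getres(). On platforms without them, the import itself fails. CLOCK_REALTIME is used because it is the most widely available clock.

```perl
use strict;
use warnings;
use Time::HiRes qw(clock_gettime clock_getres CLOCK_REALTIME);

# Both calls return floating-point seconds.
my $now        = clock_gettime(CLOCK_REALTIME);
my $resolution = clock_getres(CLOCK_REALTIME);

printf "time: %.6f s, resolution: %.9f s\n", $now, $resolution;
```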

  • Unicode::Collate upgraded to version 0.52

    • Now implements UCA Revision 14 (based on Unicode 4.1.0).

    • Unicode::Collate->new method no longer overwrites user's $_

    • Enhanced documentation

  • Unicode::UCD upgraded to version 0.24

    • Documentation typos fixed

  • User::grent upgraded to version 1.01

    • Documentation typo fixed

  • utf8 upgraded to version 1.06

    • Documentation typos fixed

  • vmsish upgraded to version 1.02

    • Documentation typos fixed

  • warnings upgraded to version 1.05

    • Gentler messing with Carp:: internals

    • Internal cleanup

    • Documentation update

  • Win32 upgraded to version 0.2601

    • Provides Windows Vista support to Win32::GetOSName

    • Documentation enhancements

  • XS::Typemap upgraded to version 0.02

    • Internal cleanup

Utility Changes

h2xs enhancements

h2xs implements new option --use-xsloader to force use of XSLoader even in backwards compatible modules.

The handling of authors' names that had apostrophes has been fixed.

Any enums with negative values are now skipped.

perlivp enhancements

perlivp implements new option -a and will not check for *.ph files by default any more. Use the -a option to run all tests.

New Documentation

The perlglossary manpage is a glossary of terms used in the Perl documentation, technical and otherwise, kindly provided by O'Reilly Media, Inc.

Performance Enhancements

  • Weak reference creation is now O(1) rather than O(n), courtesy of Nicholas Clark. Weak reference deletion remains O(n), but if deletion only happens at program exit, it may be skipped completely.

  • Salvador Fandiño provided improvements to reduce the memory usage of sort and to speed up some cases.

  • Jarkko Hietaniemi and Andy Lester worked to mark as much data as possible in the C source files as static, to increase the proportion of the executable file that the operating system can share between processes, and thus reduce real memory usage on multi-user systems.

Installation and Configuration Improvements

Parallel makes should work properly now, although there may still be problems if make test is instructed to run in parallel.

Building with Borland's compilers on Win32 should work more smoothly. In particular Steve Hay has worked to side step many warnings emitted by their compilers and at least one C compiler internal error.

Configure will now detect clearenv and unsetenv, thanks to a patch from Alan Burlison. It will also probe for futimes and whether sprintf correctly returns the length of the formatted string, which will both be used in perl 5.8.9.

There are improved hints for next-3.0, vmesa, IX, Darwin, Solaris, Linux, DEC/OSF, HP-UX and MPE/iX

Perl extensions on Windows can now be statically built into the Perl DLL, thanks to work by Vadim Konovalov. (This improvement was actually in 5.8.7, but was accidentally omitted from perl587delta.)

Selected Bug Fixes

no warnings 'category' works correctly with -w

Previously when running with warnings enabled globally via -w, selective disabling of specific warning categories would actually turn off all warnings. This is now fixed; now no warnings 'io'; will only turn off warnings in the io class.

This bug fix may cause some programs to start correctly issuing warnings.
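The corrected behaviour can be sketched as below. Setting $^W in a BEGIN block stands in for the -w switch so that the example is self-contained. The warning handler and the variable names are illustrative. Only the io-class warning is silenced, and the uninitialized-value warning is still reported.

```perl
use strict;

BEGIN { $^W = 1 }    # same effect as running with -w

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

{
    no warnings 'io';    # silence only the io category in this block

    # io class: printing to a closed filehandle would normally warn.
    my $buffer = '';
    open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
    close $fh;
    print {$fh} "ignored";

    # A different class: this warning must still be emitted.
    my $undefined;
    my $text = "value: " . $undefined;
}

print scalar(@warnings), " warning(s) caught\n";
print $warnings[0] if @warnings;
```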

Remove over-optimisation

Perl 5.8.4 introduced a change so that assignments of undef to a scalar, or of an empty list to an array or a hash, were optimised away. As this could cause problems when goto jumps were involved, this change has been backed out.

sprintf() fixes

Using the sprintf() function with some formats could lead to a buffer overflow in some specific cases. This has been fixed, along with several other bugs, notably in bounds checking.

In related fixes, it was possible for badly written code that did not follow the documentation of Sys::Syslog to have formatting vulnerabilities. Sys::Syslog has been changed to protect people from poor quality third party code.

Debugger and Unicode slowdown

It had been reported that running under perl's debugger when processing Unicode data could cause unexpectedly large slowdowns. The most likely cause of this was identified and fixed by Nicholas Clark.

Smaller fixes

  • FindBin now works better with directories where access rights are more restrictive than usual.

  • Several memory leaks in ithreads were closed. An improved implementation of threads::shared is available on CPAN - this will be merged into 5.8.9 if it proves stable.

  • Trailing spaces are now trimmed from $! and $^E.

  • Operations that require perl to read a process's list of groups, such as reads of $( and $), now dynamically allocate memory rather than using a fixed sized array. The fixed size array could cause C stack exhaustion on systems configured to use large numbers of groups.

  • PerlIO::scalar now works better with non-default $/ settings.

  • You can now use the x operator to repeat a qw// list. This used to raise a syntax error.

  • The debugger now correctly traces execution in eval("")uated code that contains #line directives.

  • The value of the open pragma is no longer ignored for three-argument opens.

  • The optimisation of for (reverse @a) introduced in perl 5.8.6 could misbehave when the array had undefined elements and was used in LVALUE context. Dave Mitchell provided a fix.

  • Some case insensitive matches between UTF-8 encoded data and 8 bit regexps, and vice versa, could give malformed character warnings. These have been fixed by Dave Mitchell and Yves Orton.

  • lcfirst and ucfirst could corrupt the string for certain cases where the length of the UTF-8 encoding of the string in lower case, upper case or title case differed. This was fixed by Nicholas Clark.

  • Perl will now use the C library calls unsetenv and clearenv if present to delete keys from %ENV and delete %ENV entirely, thanks to a patch from Alan Burlison.
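As a small illustration of the qw// repetition fix mentioned in the list above (the list contents are arbitrary):

```perl
use strict;
use warnings;

# In list context, x repeats the whole qw// list.
my @pattern = qw(on off) x 3;

print "@pattern\n";   # on off on off on off
```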

New or Changed Diagnostics

Attempt to set length of freed array

This is a new warning, produced in situations such as this:

  $r = do {my @a; \$#a};
  $$r = 503;

Non-string passed as bitmask

This is a new warning, produced when a number has been passed as an argument to select(), instead of a bitmask.

  # Wrong, will now warn
  $rin = fileno(STDIN);
  ($nfound,$timeleft) = select($rout=$rin, undef, undef, $timeout);
  # Should be
  $rin = '';
  vec($rin,fileno(STDIN),1) = 1;
  ($nfound,$timeleft) = select($rout=$rin, undef, undef, $timeout);

Search pattern not terminated or ternary operator parsed as search pattern

This syntax error indicates that the lexer couldn't find the final delimiter of a ?PATTERN? construct. Mentioning the ternary operator in this error message makes it easier to diagnose syntax errors.

Changed Internals

There has been a fair amount of refactoring of the C source code, partly to make it tidier and more maintainable. The resulting object code and the perl binary may well be smaller than 5.8.7, in particular due to a change contributed by Dave Mitchell which reworked the warnings code to be significantly smaller. Apart from being smaller and possibly faster, there should be no user-detectable changes.

Andy Lester supplied many improvements to determine which function parameters and local variables could actually be declared const to the C compiler. Steve Peters provided new *_set macros and reworked the core to use these rather than assigning to macros in LVALUE context.

Dave Mitchell improved the lexer debugging output under -DT.

Nicholas Clark changed the string buffer allocation so that it is now rounded up to the next multiple of 4 (or 8 on platforms with 64 bit pointers). This should reduce the number of calls to realloc without actually using any extra memory.

The HV's array of HE*s is now allocated at the correct (minimal) size, thanks to another change by Nicholas Clark. Compile with -DPERL_USE_LARGE_HV_ALLOC to use the old, sloppier, default.

For XS or embedding debugging purposes, if perl is compiled with -DDEBUG_LEAKING_SCALARS_FORK_DUMP in addition to -DDEBUG_LEAKING_SCALARS then a child process is forked just before global destruction, which is used to display the values of any scalars found to have leaked at the end of global destruction. Without this, the scalars have already been freed sufficiently at the point of detection that it is impossible to produce any meaningful dump of their contents. This feature was implemented by the indefatigable Nicholas Clark, based on an idea by Mike Giroux.

Platform Specific Problems

The optimiser on HP-UX 11.23 (Itanium 2) is currently partly disabled (scaled down to +O1) when using HP C-ANSI-C; the cause of problems at higher optimisation levels is still unclear.

There are a handful of remaining test failures on VMS, mostly due to test fixes and minor module tweaks with too many dependencies to integrate into this release from the development stream, where they have all been corrected. The following is a list of expected failures with the patch number of the fix where that is known:

  1. ext/Devel/PPPort/t/ppphtest.t #26913
  2. ext/List/Util/t/p_tainted.t #26912
  3. lib/ExtUtils/t/PL_FILES.t #26813
  4. lib/ExtUtils/t/basic.t #26813
  5. t/io/fs.t
  6. t/op/cmp.t

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org. There may also be information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/.

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 

perl589delta

NAME

perl589delta - what is new for perl v5.8.9

DESCRIPTION

This document describes differences between the 5.8.8 release and the 5.8.9 release.

Notice

The 5.8.9 release will be the last significant release of the 5.8.x series. Any future releases of 5.8.x will likely only be to deal with security issues, and platform build failures. Hence you should look to migrating to 5.10.x, if you have not started already. See Known Problems for more information.

Incompatible Changes

A particular construction in the source code of extensions written in C++ may need changing. See Changed Internals for more details. All extensions written in C, most written in C++, and all existing compiled extensions are unaffected. This was necessary to improve C++ support.

Other than this, there are no changes intentionally incompatible with 5.8.8. If any exist, they are bugs and reports are welcome.

Core Enhancements

Unicode Character Database 5.1.0.

The copy of the Unicode Character Database included in Perl 5.8 has been updated to 5.1.0 from 4.1.0. See http://www.unicode.org/versions/Unicode5.1.0/#NotableChanges for the notable changes.

stat and -X on directory handles

It is now possible to call stat and the -X filestat operators on directory handles. As both directory and file handles are barewords, there can be ambiguities over which was intended. In these situations the file handle semantics are preferred. Both also treat *FILE{IO} filehandles like *FILE filehandles.
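As a minimal sketch of the behaviour described above, the following applies -d and stat directly to a lexical directory handle. It assumes a platform where the C library can map a directory handle to a file descriptor; the path '.' is only a convenient example.

```perl
use strict;
use warnings;

# Open the current directory; '.' is just an example path.
opendir(my $dh, '.') or die "Cannot open directory: $!";

# Apply a filetest operator and stat directly to the directory handle.
my $is_dir = -d $dh;
my @info   = stat $dh;

printf "is a directory: %s\n", $is_dir ? 'yes' : 'no';
printf "stat returned %d fields\n", scalar @info;

closedir $dh;
```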

Source filters in @INC

It's possible to enhance the mechanism of subroutine hooks in @INC by adding a source filter on top of the filehandle opened and returned by the hook. This feature was planned a long time ago, but wasn't quite working until now. See require for details. (Nicholas Clark)

Exceptions in constant folding

The constant folding routine is now wrapped in an exception handler, and if folding throws an exception (such as attempting to evaluate 0/0), perl now retains the current optree, rather than aborting the whole program. Without this change, programs would not compile if they had expressions that happened to generate exceptions, even though those expressions were in code that could never be reached at runtime. (Nicholas Clark, Dave Mitchell)
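The effect can be sketched as follows. The subroutine name never_called is hypothetical; the point is only that a constant expression which would throw when folded no longer stops the program from compiling, and the error is deferred until the expression is actually evaluated.

```perl
use strict;
use warnings;

# 1 / 0 is a constant expression; folding it at compile time throws,
# so perl keeps the original optree and compilation continues.
sub never_called { return 1 / 0; }

print "compiled and running\n";

# The error only appears if the expression is evaluated at run time.
my $ok = eval { never_called(); 1 };
print "runtime error: $@" unless $ok;
```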

no VERSION

You can now use no followed by a version number to specify that you want to use a version of perl older than the specified one.
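A small sketch of the syntax, assuming it is run on some Perl 5 release that supports no VERSION: the first statement passes because the running perl is older than version 6, while the second is wrapped in a string eval because it is expected to fail at compile time.

```perl
use strict;
use warnings;

# Passes on any Perl 5: the running perl is older than version 6.
no 6;

# Fails, because the running perl is not older than 5.006.
# A string eval keeps the failure from aborting the whole program.
my $ok  = eval q{ no 5.006; 1 };
my $err = $@;

print $ok ? "unexpectedly accepted\n" : "rejected: $err";
```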

Improved internal UTF-8 caching code

The code that caches calculated UTF-8 byte offsets for character offsets for a string has been re-written. Several bugs have been located and eliminated, and the code now makes better use of the information it has, so should be faster. In particular, it doesn't scan to the end of a string before calculating an offset within the string, which should speed up some operations on long strings. It is now possible to disable the caching code at run time, to verify that it is not the cause of suspected problems.

Runtime relocatable installations

There is now Configure support for creating a perl tree that is relocatable at run time. See Relocatable installations.

New internal variables

  • ${^CHILD_ERROR_NATIVE}

    This variable gives the native status returned by the last pipe close, backtick command, successful call to wait or waitpid, or from the system operator. See perlvar for details. (Contributed by Gisle Aas.)

  • ${^UTF8CACHE}

    This variable controls the state of the internal UTF-8 offset caching code. 1 for on (the default), 0 for off, -1 to debug the caching code by checking all its results against linear scans, and panicking on any discrepancy.
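As an illustration of ${^CHILD_ERROR_NATIVE}, the sketch below runs a child perl that exits with status 3 and decodes the native status with the POSIX wait macros. It assumes a POSIX-like platform where WIFEXITED and WEXITSTATUS are available; the exit value 3 is arbitrary.

```perl
use strict;
use warnings;
use POSIX qw(WIFEXITED WEXITSTATUS);

# Run a child perl that exits with an arbitrary status of 3.
system($^X, '-e', 'exit 3');

# The native status, as returned by the operating system's wait call.
my $native = ${^CHILD_ERROR_NATIVE};

if (WIFEXITED($native)) {
    printf "child exited with status %d\n", WEXITSTATUS($native);
}

# The traditional encoding is still available in $?.
printf "\$? >> 8 is %d\n", $? >> 8;
```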

readpipe is now overridable

The built-in function readpipe is now overridable. Overriding it also overrides its operator counterpart, qx// (also known as ``).
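A minimal sketch of such an override, installed globally through CORE::GLOBAL at compile time. The replacement here never runs the command and simply reports what it was asked to run; that behaviour is an illustrative assumption, not something perl prescribes.

```perl
use strict;
use warnings;

BEGIN {
    no warnings 'once';
    # Replace readpipe (and therefore qx// and backticks) everywhere.
    # This stand-in does not execute anything; it only echoes the command.
    *CORE::GLOBAL::readpipe = sub {
        my ($cmd) = @_;
        return "intercepted: $cmd\n";
    };
}

# Backticks are now routed through the override above.
my $out = `echo hello`;
print $out;
```

Installing the override in a BEGIN block matters: the replacement must be in place before the code that uses backticks is compiled.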

simple exception handling macros

Perl 5.8.9 (and 5.10.0 onwards) now provides a couple of macros to do very basic exception handling in XS modules. You can use these macros if you call code that may croak, but you need to do some cleanup before giving control back to Perl. See Exception Handling in perlguts for more details.

-D option enhancements

  • -Dq suppresses the EXECUTING... message when running under -D

  • -Dl logs runops loop entry and exit, and jump level popping.

  • -Dv displays the process id as part of the trace output.

XS-assisted SWASHGET

Some pure-perl code that the regexp engine was using to retrieve Unicode properties and transliteration mappings has been reimplemented in XS for faster execution. (SADAHIRO Tomoyuki)

Constant subroutines

The interpreter internals now support a far more memory efficient form of inlineable constants. Storing a reference to a constant value in a symbol table is equivalent to a full typeglob referencing a constant subroutine, but using about 400 bytes less memory. This proxy constant subroutine is automatically upgraded to a real typeglob with subroutine if necessary. The approach taken is analogous to the existing space optimisation for subroutine stub declarations, which are stored as plain scalars in place of the full typeglob.

However, to aid backwards compatibility of existing code, which (wrongly) does not expect anything other than typeglobs in symbol tables, nothing in core uses this feature, other than the regression tests.

Stubs for prototyped subroutines have been stored in symbol tables as plain strings, and stubs for unprototyped subroutines as the number -1, since 5.005, so code which assumes that the core only places typeglobs in symbol tables has been making incorrect assumptions for over 10 years.
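Code that walks a symbol table can guard against non-typeglob entries along the following lines. This is a sketch only: the stub name forward_declared is hypothetical, and exactly which entries are stored as something other than a typeglob varies between perl versions, so the counts are printed rather than assumed.

```perl
use strict;
use warnings;

# A prototyped stub declaration; it is never defined in this sketch.
sub forward_declared($$);

my ($globs, $others) = (0, 0);

for my $name (keys %main::) {
    my $entry = $main::{$name};

    # Only treat the entry as a typeglob if it really is one.
    if (ref(\$entry) eq 'GLOB') {
        $globs++;
    }
    else {
        $others++;    # e.g. a stub stored as a plain scalar
    }
}

print "typeglob entries: $globs, other entries: $others\n";
```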

New Platforms

Compile support added for:

  • DragonFlyBSD

  • MidnightBSD

  • MirOS BSD

  • RISC OS

  • Cray XT4/Catamount

Modules and Pragmata

New Modules

  • Module::Pluggable is a simple framework to create modules that accept pluggable sub-modules. The bundled version is 3.8

  • Module::CoreList is a hash of hashes that is keyed on perl version as indicated in $] . The bundled version is 2.17

  • Win32API::File now available in core on Microsoft Windows. The bundled version is 0.1001_01

  • Devel::InnerPackage finds all the packages defined by a single file. It is part of the Module::Pluggable distribution. The bundled version is 0.3

Updated Modules

  • attributes upgraded to version 0.09

  • AutoLoader upgraded to version 5.67

  • AutoSplit upgraded to 1.06

  • autouse upgraded to version 1.06

  • B upgraded from 1.09_01 to 1.19

    • provides new pad related abstraction macros B::NV::COP_SEQ_RANGE_LOW, B::NV::COP_SEQ_RANGE_HIGH, B::NV::PARENT_PAD_INDEX, B::NV::PARENT_FAKELEX_FLAGS, which hide the difference in storage in 5.10.0 and later.

    • provides B::sub_generation , which exposes PL_sub_generation

    • provides B::GV::isGV_with_GP , which on pre-5.10 perls always returns true.

    • New type B::HE added with methods VAL , HASH and SVKEY_force

    • The B::GVf_IMPORTED_CV flag is now set correctly when a proxy constant subroutine is imported.

    • bugs fixed in the handling of PMOPs.

    • B::BM::PREVIOUS now returns U32, not U16. B::CV::START and B::CV::ROOT now return NULL on an XSUB, B::CV::XSUB and B::CV::XSUBANY return 0 on a non-XSUB.

  • B::C upgraded to 1.05

  • B::Concise upgraded to 0.76

    • new option -src causes the rendering of each statement (starting with the nextstate OP) to be preceded by the first line of source code that generates it.

    • new option -stash="somepackage" , requires "somepackage", and then renders each function defined in its namespace.

    • now has documentation of detailed hint symbols.

  • B::Debug upgraded to version 1.05

  • B::Deparse upgraded to version 0.87

    • properly deparse print readpipe $x, $y .

    • now handles ''->(), ::(), sub :: {}, etc. correctly [RT #43010]. All bugs in parsing these kinds of syntax are now fixed:

      perl -MO=Deparse -e '"my %h = "->()'
      perl -MO=Deparse -e '::->()'
      perl -MO=Deparse -e 'sub :: {}'
      perl -MO=Deparse -e 'package a; sub a::b::c {}'
      perl -MO=Deparse -e 'sub the::main::road {}'

    • does not deparse $^H{v_string}, which is automatically set by the internals.

  • B::Lint upgraded to version 1.11

  • B::Terse upgraded to version 1.05

  • base upgraded to version 2.13

    • loading a module via base.pm would mask a global $SIG{__DIE__} in that module.

    • push all classes at once in @ISA

  • Benchmark upgraded to version 1.10

  • bigint upgraded to 0.23

  • bignum upgraded to 0.23

  • bigrat upgraded to 0.23

  • blib upgraded to 0.04

  • Carp upgraded to version 1.10

    The argument backtrace code now shows undef as undef, instead of a string "undef".

  • CGI upgraded to version 3.42

  • charnames upgraded to 1.06

  • constant upgraded to version 1.17

  • CPAN upgraded to version 1.9301

  • Cwd upgraded to version 3.29 with some platform specific improvements (including for VMS).

  • Data::Dumper upgraded to version 2.121_17

    • Fixes hash iterator current position with the pure Perl version [RT #40668]

    • Performance enhancements, which will be most evident on platforms where repeated calls to C's realloc() are slow, such as Win32.

  • DB_File upgraded to version 1.817

  • DBM_Filter upgraded to version 0.02

  • Devel::DProf upgraded to version 20080331.00

  • Devel::Peek upgraded to version 1.04

  • Devel::PPPort upgraded to version 3.14

  • diagnostics upgraded to version 1.16

  • Digest upgraded to version 1.15

  • Digest::MD5 upgraded to version 2.37

  • DirHandle upgraded to version 1.02

    • now localises $. , $@ , $! , $^E , and $? before closing the directory handle to suppress leaking any side effects of warnings about it already being closed.

  • DynaLoader upgraded to version 1.09

    DynaLoader can now dynamically load a loadable object from a file with a non-default file extension.

  • Encode upgraded to version 2.26

    Encode::Alias includes a fix for encoding "646" on Solaris (better known as ASCII).

  • English upgraded to version 1.03

  • Errno upgraded to version 1.10

  • Exporter upgraded to version 5.63

  • ExtUtils::Command upgraded to version 1.15

  • ExtUtils::Constant upgraded to version 0.21

  • ExtUtils::Embed upgraded to version 1.28

  • ExtUtils::Install upgraded to version 1.50_01

  • ExtUtils::Installed upgraded to version 1.43

  • ExtUtils::MakeMaker upgraded to version 6.48

    • support for INSTALLSITESCRIPT and INSTALLVENDORSCRIPT configuration.

  • ExtUtils::Manifest upgraded to version 1.55

  • ExtUtils::ParseXS upgraded to version 2.19

  • Fatal upgraded to version 1.06

    • allows built-ins in CORE::GLOBAL to be made fatal.

  • Fcntl upgraded to version 1.06

  • fields upgraded to version 2.12

  • File::Basename upgraded to version 2.77

  • FileCache upgraded to version 1.07

  • File::Compare upgraded to 1.1005

  • File::Copy upgraded to 2.13

    • now uses 3-arg open.

  • File::DosGlob upgraded to 1.01

  • File::Find upgraded to version 1.13

  • File::Glob upgraded to version 1.06

    • fixes spurious results with brackets inside braces.

  • File::Path upgraded to version 2.07_02

  • File::Spec upgraded to version 3.29

    • improved handling of bad arguments.

    • some platform specific improvements (including for VMS and Cygwin), with an optimisation on abs2rel when handling both relative arguments.

  • File::stat upgraded to version 1.01

  • File::Temp upgraded to version 0.20

  • filetest upgraded to version 1.02

  • Filter::Util::Call upgraded to version 1.07

  • Filter::Simple upgraded to version 0.83

  • FindBin upgraded to version 1.49

  • GDBM_File upgraded to version 1.09

  • Getopt::Long upgraded to version 2.37

  • Getopt::Std upgraded to version 1.06

  • Hash::Util upgraded to version 0.06

  • if upgraded to version 0.05

  • IO upgraded to version 1.23

    Reduced number of calls to getpeername in IO::Socket

  • IPC::Open2 upgraded to version 1.03

  • IPC::Open3 upgraded to version 1.03

  • IPC::SysV upgraded to version 2.00

  • lib upgraded to version 0.61

    • avoid warning about loading .par files.

  • libnet upgraded to version 1.22

  • List::Util upgraded to 1.19

  • Locale::Maketext upgraded to 1.13

  • Math::BigFloat upgraded to version 1.60

  • Math::BigInt upgraded to version 1.89

  • Math::BigRat upgraded to version 0.22

    • implements new as_float method.

  • Math::Complex upgraded to version 1.54.

  • Math::Trig upgraded to version 1.18.

  • NDBM_File upgraded to version 1.07

    • improve g++ handling for systems using GDBM compatibility headers.

  • Net::Ping upgraded to version 2.35

  • NEXT upgraded to version 0.61

    • fix several bugs with NEXT when working with AUTOLOAD , eval block, and within overloaded stringification.

  • ODBM_File upgraded to 1.07

  • open upgraded to 1.06

  • ops upgraded to 1.02

  • PerlIO::encoding upgraded to version 0.11

  • PerlIO::scalar upgraded to version 0.06

    • [RT #40267] PerlIO::scalar doesn't respect readonly-ness.

  • PerlIO::via upgraded to version 0.05

  • Pod::Html upgraded to version 1.09

  • Pod::Parser upgraded to version 1.35

  • Pod::Usage upgraded to version 1.35

  • POSIX upgraded to version 1.15

    • POSIX constants that duplicate those in Fcntl are now imported from Fcntl and re-exported, rather than being duplicated by POSIX

    • POSIX::remove can remove empty directories.

    • POSIX::setlocale safer to call multiple times.

    • POSIX::SigRt added, which provides access to POSIX realtime signal functionality on systems that support it.

  • re upgraded to version 0.06_01

  • Safe upgraded to version 2.16

  • Scalar::Util upgraded to 1.19

  • SDBM_File upgraded to version 1.06

  • SelfLoader upgraded to version 1.17

  • Shell upgraded to version 0.72

  • sigtrap upgraded to version 1.04

  • Socket upgraded to version 1.81

  • Storable upgraded to 2.19

  • Switch upgraded to version 2.13

  • Sys::Syslog upgraded to version 0.27

  • Term::ANSIColor upgraded to version 1.12

  • Term::Cap upgraded to version 1.12

  • Term::ReadLine upgraded to version 1.03

  • Test::Builder upgraded to version 0.80

  • Test::Harness upgraded to version 2.64

    • this makes it able to handle newlines.

  • Test::More upgraded to version 0.80

  • Test::Simple upgraded to version 0.80

  • Text::Balanced upgraded to version 1.98

  • Text::ParseWords upgraded to version 3.27

  • Text::Soundex upgraded to version 3.03

  • Text::Tabs upgraded to version 2007.1117

  • Text::Wrap upgraded to version 2006.1117

  • Thread upgraded to version 2.01

  • Thread::Semaphore upgraded to version 2.09

  • Thread::Queue upgraded to version 2.11

    • added capability to add complex structures (e.g., hash of hashes) to queues.

    • added capability to dequeue multiple items at once.

    • added new methods to inspect and manipulate queues: peek , insert and extract

  • Tie::Handle upgraded to version 4.2

  • Tie::Hash upgraded to version 1.03

  • Tie::Memoize upgraded to version 1.1

    • Tie::Memoize::EXISTS now correctly caches its results.

  • Tie::RefHash upgraded to version 1.38

  • Tie::Scalar upgraded to version 1.01

  • Tie::StdHandle upgraded to version 4.2

  • Time::gmtime upgraded to version 1.03

  • Time::Local upgraded to version 1.1901

  • Time::HiRes upgraded to version 1.9715 with various build improvements (including VMS) and minor platform-specific bug fixes (including for HP-UX 11 ia64).

  • threads upgraded to 1.71

    • new thread state information methods: is_running , is_detached and is_joinable . list method enhanced to return running or joinable threads.

    • new thread signal method: kill

    • added capability to specify thread stack size.

    • added capability to control thread exiting behavior. Added a new exit method.

  • threads::shared upgraded to version 1.27

    • smaller and faster implementation that eliminates one internal structure and the consequent level of indirection.

    • user locks are now stored in a safer manner.

    • new function shared_clone creates a copy of an object leaving shared elements as-is and deep-cloning non-shared elements.

    • added new is_shared method.

  • Unicode::Normalize upgraded to version 1.02

  • Unicode::UCD upgraded to version 0.25

  • warnings upgraded to version 1.05_01

  • Win32 upgraded to version 0.38

    • added new function GetCurrentProcessId which returns the regular Windows process identifier of the current process, even when called from within a fork.

  • XSLoader upgraded to version 0.10

  • XS::APItest and XS::Typemap are for internal use only and hence no longer installed. Many more tests have been added to XS::APItest .

Utility Changes

debugger upgraded to version 1.31

  • Andreas König contributed two functions to save and load the debugger history.

  • NEXT::AUTOLOAD no longer emits warnings under the debugger.

  • The debugger should now correctly find the tty device on OS X 10.5 and VMS when the program forks.

  • LVALUE subs now work inside the debugger.

perlthanks

Perl 5.8.9 adds a new utility perlthanks, which is a variant of perlbug, but for sending non-bug-reports to the authors and maintainers of Perl. Getting nothing but bug reports can become a bit demoralising - we'll see if this changes things.

perlbug

perlbug now checks if you're reporting about a non-core module and suggests you report it to the CPAN author instead.

h2xs

  • won't define an empty string as a constant [RT #25366]

  • has examples for h2xs -X

h2ph

  • now attempts to deal sensibly with the difference in path implications between "" and <> quoting in #include statements.

  • now generates correct code for #if defined A || defined B [RT #39130]

New Documentation

As usual, the documentation received its share of corrections, clarifications and other nitfixes. More tags were added for indexing.

perlunitut is a tutorial written by Juerd Waalboer on Unicode-related terminology and how to correctly handle Unicode in Perl scripts.

perlunicode is updated in section user defined properties.

perluniintro has been updated in the example of detecting data that is not valid in a particular encoding.

perlcommunity provides an overview of the Perl Community along with further resources.

CORE documents the pseudo-namespace for Perl's core routines.

Changes to Existing Documentation

perlglossary adds deprecated modules and features and to be dropped modules.

perlhack has been updated with additional resources on smoke testing.

The Perl FAQs (perlfaq1..perlfaq9) have been updated.

perlcheat is updated with better details on \w , \d , and \s.

perldebug is updated with information on how to call the debugger.

perldiag now covers the use of a subroutine with an ampersand as the argument to exists and delete, and has several terminology updates on warnings.

perlfork documents the limitation of exec inside pseudo-processes.

perlfunc:

  • Documentation is fixed in section caller and pop.

  • Function alarm now mentions Time::HiRes::ualarm in preference to select.

  • Regarding precedence in -X, filetest operators are the same as unary operators, but not regarding parsing and parentheses (spotted by Eirik Berg Hanssen).

  • reverse function documentation received scalar context examples.

perllocale documentation is adjusted for number localization and POSIX::setlocale to fix Debian bug #379463.

perlmodlib is updated with CPAN::API::HOWTO and Sys::Syslog::win32::Win32

perlre documentation updated to reflect the differences between [[:xxxxx:]] and \p{IsXxxxx} matches. Also added section on /g and /c modifiers.

perlreguts describes the internals of the regular expressions engine. It has been contributed by Yves Orton.

perlrebackslash describes all perl regular expression backslash and escape sequences.

perlrecharclass describes the syntax and use of character classes in Perl Regular Expressions.

perlrun is updated to clarify the hash seed PERL_HASH_SEED. It also has more information on the options -x and -u.

perlsub example is updated to use a lexical variable for opendir syntax.

perlvar fixes confusion about real GID $( and effective GID $) .

The Perl thread tutorial example is fixed in the section Queues: Passing Data Around in perlthrtut, and in perlothrtut.

perlhack documentation extensively improved by Jarkko Hietaniemi and others.

perltoot provides information on modifying @UNIVERSAL::ISA .

perlport documentation extended to include different kill(-9, ...) semantics on Windows. It also clearly states dump is not supported on Win32 and cygwin.

INSTALL has been updated and modernised.

Performance Enhancements

  • The default since perl 5.000 has been for perl to create an empty scalar with every new typeglob. The increased use of lexical variables means that most are now unused. Thanks to Nicholas Clark's efforts, Perl can now be compiled with -DPERL_DONT_CREATE_GVSV to avoid creating these empty scalars. This will significantly decrease the number of scalars allocated for all configurations, and the number of scalars that need to be copied for ithread creation. Whilst this option is binary compatible with existing perl installations, it does change a long-standing assumption about the internals, hence it is not enabled by default, as some third party code may rely on the old behaviour.

    We would recommend testing with this configuration on new deployments of perl, particularly for multi-threaded servers, to see whether all third party code is compatible with it, as this configuration may give useful performance improvements. For existing installations we would not recommend changing to this configuration unless thorough testing is performed before deployment.

  • diagnostics no longer uses $& , which results in large speedups for regexp matching in all code using it.

  • Regular expressions classes of a single character are now treated the same as if the character had been used as a literal, meaning that code that uses char-classes as an escaping mechanism will see a speedup. (Yves Orton)

  • Creating anonymous array and hash references (i.e. [] and {}) now incurs no more overhead than creating an anonymous list or hash. Nicholas Clark provided changes with a saving of two ops and one stack push, which was measured as a slightly better than 5% improvement for these operations.

  • Many calls to strlen() have been eliminated, either because the length was already known, or by adopting or enhancing APIs that pass lengths. This has been aided by the adoption of a my_sprintf() wrapper, which returns the correct C89 value - the length of the formatted string. Previously we could not rely on the return value of sprintf(), because on some ancient but extant platforms it still returns char * .

  • index is now faster if the search string is stored in UTF-8 but only contains characters in the Latin-1 range.

  • The Unicode swatch cache inside the regexp engine is now used. (the lookup had a key mismatch, present since the initial implementation). [RT #42839]

Installation and Configuration Improvements

Relocatable installations

There is now Configure support for creating a relocatable perl tree. If you Configure with -Duserelocatableinc , then the paths in @INC (and everything else in %Config ) can be optionally located via the path of the perl executable.

At start time, any paths in @INC or Config that Configure marked as relocatable (by starting them with ".../") are prefixed with the directory of $^X. This allows the relocation to be configured on a per-directory basis, although the default with -Duserelocatableinc is that everything is relocated. The initial install is done to the original configured prefix.

Configuration improvements

Configure is now better at removing temporary files. Tom Callaway (from RedHat) also contributed patches that complete the set of flags passed to the compiler and the linker, in particular that -fPIC is now enabled on Linux. Configure will also croak when your /dev/null isn't a device.

A new configuration variable d_pseudofork has been added to Configure, and is available as $Config{d_pseudofork} in the Config module. This distinguishes real fork support from the pseudofork emulation used on Windows platforms.
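A script could use the new variable roughly as follows. This is a sketch: the three labels are illustrative, and d_fork is the long-standing Config entry for fork support in general.

```perl
use strict;
use warnings;
use Config;

# Check d_pseudofork first, so that an emulated fork is not mistaken
# for a real one on platforms that report both.
my $kind = $Config{d_pseudofork} ? 'pseudo'
         : $Config{d_fork}       ? 'real'
         :                         'none';

print "fork support on this perl: $kind\n";
```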

Config.pod and config.sh are now placed correctly for cross-compilation.

$Config{useshrplib} is now 'true' rather than 'yes' when using a shared perl library.

Compilation improvements

Parallel makes should work properly now, although there may still be problems if make test is instructed to run in parallel.

Many compilation warnings have been cleaned up. A very stubborn compiler warning in S_emulate_eaccess() was killed after six attempts. g++ support has been tuned, especially for FreeBSD.

mkppport has been integrated, and all ppport.h files in the core will now be autogenerated at build time (and removed during cleanup).

Installation improvements.

installman now works with -Duserelocatableinc and DESTDIR .

installperl no longer installs:

  • static library files of statically linked extensions when a shared perl library is being used. (They are not needed. See Windows below).

  • SIGNATURE and PAUSE*.pub (CPAN files)

  • NOTES and PATCHING (ExtUtils files)

  • perlld and ld2 (Cygwin files)

Platform Specific Changes

There are improved hints for AIX, Cygwin, DEC/OSF, FreeBSD, HP/UX, Irix 6, Linux, MachTen, NetBSD, OS/390, QNX, SCO, Solaris, SunOS, System V Release 5.x (UnixWare 7, OpenUNIX 8), Ultrix, UMIPS, uts and VOS.

FreeBSD

  • Drop -std=c89 and -ansi if using long long as the main integral type, else in FreeBSD 6.2 (and perhaps other releases), system headers do not declare some functions required by perl.

Solaris

  • Starting with Solaris 10, we do not want versioned shared libraries, because those often indicate a private use only library. These problems could often be triggered when SUNWbdb (Berkeley DB) was installed. Hence if Solaris 10 is detected set ignore_versioned_solibs=y.

VMS

  • Allow IEEE math to be deselected on OpenVMS I64 (but it remains the default).

  • Record IEEE usage in config.h

  • Help older VMS compilers by using ccflags when building munchconfig.exe .

  • Don't try to build old Thread extension on VMS when -Duseithreads has been chosen.

  • Passing a raw string of "NaN" to nawk causes a core dump - so the string has been changed to "*NaN*"

  • t/op/stat.t tests will now test hard links on VMS if they are supported.

Windows

  • When using a shared perl library installperl no longer installs static library files, import library files and export library files (of statically linked extensions) and empty bootstrap files (of dynamically linked extensions). This fixes a problem building PAR-Packer on Win32 with a debug build of perl.

  • Various improvements to the win32 build process, including support for Visual C++ 2005 Express Edition (aka Visual C++ 8.x).

  • perl.exe will now have an icon if built with MinGW or Borland.

  • Improvements to the perl-static.exe build process.

  • Add Win32 makefile option to link all extensions statically.

  • The WinCE directory has been merged into the Win32 directory.

  • setlocale tests have been re-enabled for Windows XP onwards.

Selected Bug Fixes

Unicode

Many many bugs related to the internal Unicode implementation (UTF-8) have been fixed. In particular, long standing bugs related to returning Unicode via tie, overloading or $@ are now gone, some of which were never reported.

unpack will internally convert the string back from UTF-8 on numeric types. This is a compromise between the full consistency now in 5.10, and the current behaviour, which is often used as a "feature" on string types.

Using :crlf and UTF-16 IO layers together will now work.

Fixed problems with split, Unicode /\s+/ and / \0/ .

Fixed bug RT #40641 - encoding of Unicode characters in regular expressions.

Fixed a bug where using certain patterns in a regexp led to a panic. [RT #45337]

Perl no longer segfaults (due to infinite internal recursion) if the locale's character encoding is not UTF-8 [RT #41442]:

  use open ':locale';
  print STDERR "\x{201e}"; # „

PerlIO

Inconsistencies have been fixed in the reference counting PerlIO uses to keep track of Unix file descriptors, and the API used by XS code to manage getting and releasing FILE *s.

Magic

Several bugs have been fixed in Magic, the internal system used to implement features such as tie, tainting and threads sharing.

undef @array on a tied array now correctly calls the CLEAR method.

Some of the bitwise ops were not checking whether their arguments were magical before using them. [RT #24816]

Magic is no longer invoked twice by the expression \&$x.

A bug with assigning large numbers and tainting has been resolved. [RT #40708]

A new entry has been added to the MAGIC vtable - svt_local . This is used when copying magic to the new value during local, allowing certain problems with localising shared variables to be resolved.

For the implementation details, see Magic Virtual Tables in perlguts.

Reblessing overloaded objects now works

Internally, perl object-ness is on the referent, not the reference, even though methods can only be called via a reference. However, the original implementation of overloading stored flags related to overloading on the reference, relying on the flags being copied when the reference was copied, or set at the creation of a new reference. This manifests in a bug - if you rebless an object from a class that has overloading, into one that does not, then any other existing references think that they (still) point to an overloaded object, choose these C code paths, and then throw errors. Analogously, blessing into an overloaded class when other references exist will result in them not using overloading.

The implementation has been fixed for 5.10, but this fix changes the semantics of flag bits, so is not binary compatible, so can't be applied to 5.8.9. However, 5.8.9 has a work-around that implements the same bug fix. If the referent has multiple references, then all the other references are located and corrected. A full search is avoided whenever possible by scanning lexicals outwards from the current subroutine, and the argument stack.
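
The behaviour described above can be checked from Perl code. The following is a minimal sketch, assuming a perl that contains the fix; the class names Loud and Quiet are hypothetical and exist only for this illustration. It reblesses through one reference and then inspects a second reference to the same referent.

```perl
use strict;
use warnings;

# Hypothetical classes: Loud overloads stringification, Quiet does not.
package Loud;
use overload '""' => sub { 'LOUD' };

package Quiet;

package main;

# Case 1: rebless from an overloaded class into a plain one.
my $obj    = bless {}, 'Loud';
my $alias  = $obj;              # second reference to the same referent
my $before = "$alias";          # overloaded stringification is used
bless $obj, 'Quiet';            # rebless through the first reference
my $after  = "$alias";          # the other reference must follow the change

# Case 2: bless into an overloaded class while another reference exists.
my $plain       = bless {}, 'Quiet';
my $plain_alias = $plain;
bless $plain, 'Loud';
my $gained      = "$plain_alias";   # overloading should now apply here too

print "$before\n$after\n$gained\n";
```

With the fix in place, the second line printed is the default Quiet=HASH(0x...) form rather than an error, and the third line uses the overloaded stringification.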

A certain well known Linux vendor applied incomplete versions of this bug fix to their /usr/bin/perl and then prematurely closed bug reports about performance issues without consulting back upstream. This not being enough, they then proceeded to ignore the necessary fixes to these unreleased changes for 11 months, until massive pressure was applied by their long-suffering paying customers, catalysed by the failings being featured on a prominent blog and Slashdot.

strict now propagates correctly into string evals

Under 5.8.8 and earlier:

    $ perl5.8.8 -e 'use strict; eval "use foo bar" or die $@'
    Can't locate foo.pm in @INC (@INC contains: ... .) at (eval 1) line 2.
    BEGIN failed--compilation aborted at (eval 1) line 2.

Under 5.8.9 and later:

    $ perl5.8.9 -e 'use strict; eval "use foo bar" or die $@'
    Bareword "bar" not allowed while "strict subs" in use at (eval 1) line 1.

This may cause problems with programs that parse the error message and rely on the buggy behaviour.
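
As a general illustration of strict applying inside a string eval (not a reproduction of the specific 5.8.8 bug), here is a minimal sketch. The variable name $undeclared_total is made up for the example; the point is that the error is reported through $@ rather than being silently accepted.

```perl
use strict;
use warnings;

# The eval'd string inherits "use strict" from the enclosing scope, so an
# undeclared variable is a compile-time error inside the eval. eval then
# returns undef and leaves the message in $@.
my $ok  = eval q{ $undeclared_total = 1; 1 };
my $err = $@;

print defined $ok ? "compiled\n" : "rejected: $err";
```

Code that matches on the text of $@ is exactly the kind of program the paragraph above warns about, so matching loosely (or not at all) is the safer design.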

Other fixes

  • The tokenizer no longer treats =cute (and other words beginning with =cut ) as a synonym for =cut .

  • Calling CORE::require

    CORE::require and CORE::do were always parsed as require and do when they were overridden. This is now fixed.

  • Stopped memory leak on long /etc/group entries.

  • while (my $x ...) { ...; redo } shouldn't undef $x .

    In the presence of my in the conditional of a while() , until() , or for(;;) loop, we now add an extra scope to the body so that redo doesn't undef the lexical.

  • The encoding pragma now correctly ignores anything following an @ character in the LC_ALL and LANG environment variables. [RT #49646]

  • A segfault observed with some gcc 3.3 optimisations is resolved.

  • A possible segfault when unpack used in scalar context with () groups is resolved. [RT #50256]

  • Resolved issue where $! could be changed by a signal handler interrupting a system call.

  • Fixed bug RT #37886, symbolic dereferencing was allowed in the argument of defined even under the influence of use strict 'refs' .

  • Fixed bug RT #43207, where lc/uc inside sort affected the return value.

  • Fixed bug RT #45607, where *{"BONK"} = \&{"BONK"} didn't work correctly.

  • Fixed bug RT #35878, croaking from a XSUB called via goto &xsub corrupts perl internals.

  • Fixed bug RT #32539, DynaLoader.o is moved into libperl.so to avoid the need to statically link DynaLoader into the stub perl executable. With this libperl.so provides everything needed to get a functional embedded perl interpreter to run.

  • Fix bug RT #36267 so that assigning to a tied hash doesn't change the underlying hash.

  • Fix bug RT #6006, regexp replaces using large replacement variables fail some of the time, i.e. when substitution contains something like ${10} (note the bracket) instead of just $10 .

  • Fix bug RT #45053, Perl_newCONSTSUB() is now thread safe.
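
The redo fix in the list above (a lexical declared in a while condition surviving redo) can be sketched as follows. This is an illustrative example only, assuming a perl that includes the fix; the queue contents and the $retried flag are invented for the demonstration.

```perl
use strict;
use warnings;

my @queue   = (1, 2, 3);
my @seen;
my $retried = 0;

while (my $x = shift @queue) {
    push @seen, $x;
    # Restart the loop body once for the value 2. redo does not re-evaluate
    # the condition, so $x has to keep the value it already holds.
    redo if $x == 2 && !$retried++;
}

print "@seen\n";
```

On a perl with the fix, the value 2 is recorded twice; with the old behaviour the second pass through the body would have seen an undefined $x.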

Platform Specific Fixes

Darwin / MacOS X

  • Various improvements to 64 bit builds.

  • Mutex protection added in PerlIOStdio_close() to avoid race conditions. Hopefully this fixes failures in the threads tests free.t and blocks.t.

  • Added forked terminal support to the debugger, with the ability to update the window title.

OS/2

  • A build problem with specifying USE_MULTI and USE_ITHREADS but without USE_IMP_SYS has been fixed.

  • OS2::REXX upgraded to version 1.04

Tru64

  • Aligned floating point build policies for cc and gcc.

RedHat Linux

  • Revisited a patch from 5.6.1 for RH7.2 for Intel's icc [RT #7916], added an additional check for $Config{gccversion} .

Solaris/i386

  • Use -DPTR_IS_LONG when using 64 bit integers

VMS

  • Fixed PerlIO::Scalar in-memory file record-style reads.

  • pipe shutdown at process exit should now be more robust.

  • Bugs in VMS exit handling tickled by Test::Harness 2.64 have been fixed.

  • Fix fcntl() locking capability test in configure.com.

  • Replaced shrplib='define' with useshrplib='true' on VMS.

Windows

  • File::Find used to fail when the target directory is a bare drive letter and no_chdir is 1 (the default is 0). [RT #41555]

  • A build problem with specifying USE_MULTI and USE_ITHREADS but without USE_IMP_SYS has been fixed.

  • The process id is no longer truncated to 16 bits on some Windows platforms ( http://bugs.activestate.com/show_bug.cgi?id=72443 )

  • Fixed bug RT #54828 in perlio.c where calling binmode on Win32 and Cygwin may cause a segmentation fault.

Smaller fixes

  • It is now possible to overload eq when using nomethod .

  • Various problems using overload with 64 bit integers corrected.

  • The reference count of PerlIO file descriptors is now correctly handled.

  • On VMS, escaped dots will be preserved when converted to Unix syntax.

  • keys %+ no longer throws an 'ambiguous' warning.

  • Using #!perl -d could trigger an assertion, which has been fixed.

  • Don't stringify tied code references in @INC when calling require.

  • Code references in @INC report the correct file name when __FILE__ is used.

  • Width and precision in sprintf didn't handle characters above 255 correctly. [RT #40473]

  • List slices with indices out of range now work more consistently. [RT #39882]

  • A change introduced with perl 5.8.1 broke the parsing of arguments of the form -foo=bar with the -s on the #! line. This has been fixed. See http://bugs.activestate.com/show_bug.cgi?id=43483

  • tr/// is now threadsafe. Previously it was storing a swash inside its OP, rather than in a pad.

  • pod2html labels anchors more consistently and handles nested definition lists better.

  • threads cleanup veto has been extended to include perl_free() and perl_destruct()

  • On some systems, changes to $ENV{TZ} would not always be respected by the underlying calls to localtime_r() . Perl now forces the inspection of the environment on these systems.

  • The special variable $^R is now more consistently set when executing regexps using the (?{...}) construct. In particular, it will still be set even if backreferences or optional sub-patterns (?:...)? are used.
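
One of the fixes listed above concerns width and precision in sprintf for characters above 255. A small sketch of the intended behaviour, assuming a perl with the fix and no locale or bytes pragma in effect:

```perl
use strict;
use warnings;

my $smiley = "\x{263A}";    # a single character above 255

# Width is counted in characters: one smiley, three padding spaces, then '|'.
my $padded = sprintf('%-4s|', $smiley);

# Precision is counted in characters too: keep the first two of three.
my $cut = sprintf('%.2s', "\x{263A}\x{263B}\x{263C}");

printf "%d %d\n", length $padded, length $cut;
```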

New or Changed Diagnostics

panic: sv_chop %s

This new fatal error occurs when the C routine Perl_sv_chop() was passed a position that is not within the scalar's string buffer. This is caused by buggy XS code, and at this point recovery is not possible.

Maximal count of pending signals (%s) exceeded

This new fatal error occurs when the perl process has to abort due to too many pending signals, which is bound to prevent perl from being able to handle further incoming signals safely.

panic: attempt to call %s in %s

This new fatal error occurs when the ACL version file test operator is used where it is not available on the current platform. Earlier checks mean that it should never be possible to get this.

FETCHSIZE returned a negative value

New error indicating that a tied array has claimed to have a negative number of elements.

Can't upgrade %s (%d) to %d

Previously the internal error from the SV upgrade code was the less informative Can't upgrade that kind of scalar. It now reports the current internal type, and the new type requested.

%s argument is not a HASH or ARRAY element or a subroutine

This error, thrown if an invalid argument is provided to exists now correctly includes "or a subroutine". [RT #38955]

Cannot make the non-overridable builtin %s fatal

This error in Fatal previously did not show the name of the builtin in question (now represented by %s above).

Unrecognized character '%s' in column %d

This error previously did not state the column.

Offset outside string

This can now also be generated by a seek on a file handle using PerlIO::scalar .

Invalid escape in the specified encoding in regexp; marked by <-- HERE in m/%s/

New error, introduced as part of the fix to RT #40641 to handle encoding of Unicode characters in regular expression comments.

Your machine doesn't support dump/undump.

A more informative fatal error issued when calling dump on Win32 and Cygwin. (Given that the purpose of dump is to abort with a core dump, and core dumps can't be produced on these platforms, this is more useful than silently exiting.)

Changed Internals

The perl sources can now be compiled with a C++ compiler instead of a C compiler. A necessary implementation detail is that under C++, the macro XS used to define XSUBs now includes an extern "C" definition. A side effect of this is that C++ code that used the construction

    typedef XS(SwigPerlWrapper);

now needs to be written

    typedef XSPROTO(SwigPerlWrapper);

using the new XSPROTO macro, in order to compile. C extensions are unaffected, although C extensions are encouraged to use XSPROTO too. This change was present in the 5.10.0 release of perl, so any actively maintained code that happened to use this construction should already have been adapted. Code that needs changing will fail with a compilation error.

set magic on localizing/assigning to a magic variable will now only trigger for container magics, i.e. it will for %ENV or %SIG but not for $#array .

The new API macro newSVpvs() can be used in place of constructions such as newSVpvn("ISA", 3) . It takes a single string constant, and at C compile time determines its length.

The new API function Perl_newSV_type() can be used as a more efficient replacement of the common idiom

    sv = newSV(0);
    sv_upgrade(sv, type);

Similarly, Perl_newSVpvn_flags() can be used to combine Perl_newSVpv() with Perl_sv_2mortal() or the equivalent Perl_sv_newmortal() with Perl_sv_setpvn() .

Two new macros mPUSHs() and mXPUSHs() are added, to make it easier to push mortal SVs onto the stack. They were then used to fix several bugs where values on the stack had not been mortalised.

A Perl_signbit() function was added to test the sign of an NV . It maps to the system one when available.

Perl_av_reify() , Perl_lex_end() , Perl_mod() , Perl_op_clear() , Perl_pop_return() , Perl_qerror() , Perl_setdefout() , Perl_vivify_defelem() and Perl_yylex() are now visible to extensions. This was required to allow Data::Alias to work on Windows.

Perl_find_runcv() is now visible to perl core extensions. This was required to allow Sub::Current to work on Windows.

ptr_table* functions are now available in unthreaded perl. Storable takes advantage of this.

There have been many small cleanups made to the internals. In particular, Perl_sv_upgrade() has been simplified considerably, with a straight-through code path that uses memset() and memcpy() to initialise the new body, rather than assignment via multiple temporary variables. It has also benefited from simplification and de-duplication of the arena management code.

A lot of small improvements in the code base were made due to reports from the Coverity static code analyzer.

Corrected use and documentation of Perl_gv_stashpv() , Perl_gv_stashpvn() , Perl_gv_stashsv() functions (last parameter is a bitmask, not boolean).

PERL_SYS_INIT , PERL_SYS_INIT3 and PERL_SYS_TERM macros have been changed into functions.

PERLSYS_TERM no longer requires a context. PerlIO_teardown() is now called without a context, and debugging output in this function has been disabled because that required that an interpreter was present, an invalid assumption at termination time.

All compile time options which affect binary compatibility have been grouped together into a global variable (PL_bincompat_options ).

The values of PERL_REVISION , PERL_VERSION and PERL_SUBVERSION are now baked into global variables (and hence into any shared perl library). Additionally under MULTIPLICITY , the perl executable now records the size of the interpreter structure (total, and for this version). Coupled with PL_bincompat_options this will allow 5.8.10 (and later), when compiled with a shared perl library, to perform sanity checks in main() to verify that the shared library is indeed binary compatible.

Symbolic references can now have embedded NULs. The new public function Perl_get_cvn_flags() can be used in extensions if you have to handle them.

Macro cleanups

The core code, and XS code in ext that is not dual-lived on CPAN, no longer uses the macros PL_na , NEWSV() , Null() , Nullav , Nullcv , Nullhv etc. Their use is discouraged in new code, particularly PL_na , which is a small performance hit.

New Tests

Many modules updated from CPAN incorporate new tests. Some core specific tests have been added:

  • ext/DynaLoader/t/DynaLoader.t

    Tests for the DynaLoader module.

  • t/comp/fold.t

    Tests for compile-time constant folding.

  • t/io/pvbm.t

    Tests incorporated from 5.10.0 which check that there is no unexpected interaction between the internal types PVBM and PVGV .

  • t/lib/proxy_constant_subs.t

    Tests for the new form of constant subroutines.

  • t/op/attrhand.t

    Tests for Attribute::Handlers .

  • t/op/dbm.t

    Tests for dbmopen.

  • t/op/inccode-tie.t

    Calls all tests in t/op/inccode.t after first tying @INC .

  • t/op/incfilter.t

    Tests for source filters returned from code references in @INC .

  • t/op/kill0.t

    Tests for RT #30970.

  • t/op/qrstack.t

    Tests for RT #41484.

  • t/op/qr.t

    Tests for the qr// construct.

  • t/op/regexp_qr_embed.t

    Tests for the qr// construct within another regexp.

  • t/op/regexp_qr.t

    Tests for the qr// construct.

  • t/op/rxcode.t

    Tests for RT #32840.

  • t/op/studytied.t

    Tests for study on tied scalars.

  • t/op/substT.t

    Tests for subst run under -T mode.

  • t/op/symbolcache.t

    Tests for undef and delete on stash entries that are bound to subroutines or methods.

  • t/op/upgrade.t

    Tests for Perl_sv_upgrade() .

  • t/mro/package_aliases.t

    MRO tests for isa and package aliases.

  • t/pod/twice.t

    Tests for calling Pod::Parser twice.

  • t/run/cloexec.t

    Tests for inheriting file descriptors across exec (close-on-exec).

  • t/uni/cache.t

    Tests for the UTF-8 caching code.

  • t/uni/chr.t

    Test that strange encodings do not upset Perl_pp_chr() .

  • t/uni/greek.t

    Tests for RT #40641.

  • t/uni/latin2.t

    Tests for RT #40641.

  • t/uni/overload.t

    Tests for returning Unicode from overloaded values.

  • t/uni/tie.t

    Tests for returning Unicode from tied variables.

Known Problems

There are no known new bugs.

However, programs that rely on bugs that have been fixed will have problems. Also, many bug fixes present in 5.10.0 can't be back-ported to the 5.8.x branch, because they require changes that are binary incompatible, or because the code changes are too large and hence too risky to incorporate.

We have only limited volunteer labour, and the maintenance burden is getting increasingly complex. Hence this will be the last significant release of the 5.8.x series. Any future releases of 5.8.x will likely only be to deal with security issues, and platform build failures. Hence you should look to migrating to 5.10.x, if you have not started already. Alternatively, if business requirements constrain you to continue to use 5.8.x, you may wish to consider commercial support from firms such as ActiveState.

Platform Specific Notes

Win32

readdir(), cwd() , $^X and @INC now use the alternate (short) filename if the long name is outside the current codepage (Jan Dubois).

Updated Modules

  • Win32 upgraded to version 0.38. Now has a documented 'WinVista' response from GetOSName and support for Vista's privilege elevation in IsAdminUser . Support for Unicode characters in path names. Improved cygwin and Win64 compatibility.

  • Win32API updated to 0.1001_01

  • killpg() support added to MSWin32 (Jan Dubois).

  • File::Spec::Win32 upgraded to version 3.2701

OS/2

Updated Modules

  • OS2::Process upgraded to 1.03

    Ilya Zakharevich has added and documented several Window* and Clipbrd* functions.

  • OS2::REXX::DLL , OS2::REXX updated to version 1.03

VMS

Updated Modules

  • DCLsym upgraded to version 1.03

  • Stdio upgraded to version 2.4

  • VMS::XSSymSet upgraded to 1.1.

Obituary

Nick Ing-Simmons, long time Perl hacker, author of the Tk and Encode modules, perlio.c in the core, and 5.003_02 pumpking, died of a heart attack on 25th September 2006. He will be missed.

Acknowledgements

Some of the work in this release was funded by a TPF grant.

Steve Hay worked behind the scenes working out the causes of the differences between core modules, their CPAN releases, and previous core releases, and the best way to rectify them. He doesn't want to do it again. I know this feeling, and I'm very glad he did it this time, instead of me.

Paul Fenwick assembled a team of 18 volunteers, who broke the back of writing this document. In particular, Bradley Dean, Eddy Tan, and Vincent Pit provided half the team's contribution.

Schwern verified the list of updated module versions, correcting quite a few errors that I (and everyone else) had missed, both wrongly stated module versions, and changed modules that had not been listed.

The crack Berlin-based QA team of Andreas König and Slaven Rezic tirelessly re-built snapshots, tested most everything CPAN against them, and then identified the changes responsible for any module regressions, ensuring that several show-stopper bugs were stomped before the first release candidate was cut.

The other core committers contributed most of the changes, and applied most of the patches sent in by the hundreds of contributors listed in AUTHORS.

And obviously, Larry Wall, without whom we wouldn't have Perl.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org. There may also be information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

perl58delta

NAME

perl58delta - what is new for perl v5.8.0

DESCRIPTION

This document describes differences between the 5.6.0 release and the 5.8.0 release.

Many of the bug fixes in 5.8.0 were already seen in the 5.6.1 maintenance release since the two releases were kept closely coordinated (while 5.8.0 was still called 5.7.something).

Changes that were integrated into the 5.6.1 release are marked [561] . Many of these changes have been further developed since 5.6.1 was released, those are marked [561+] .

You can see the list of changes in the 5.6.1 release (both from the 5.005_03 release and the 5.6.0 release) by reading perl561delta.

Highlights In 5.8.0

  • Better Unicode support

  • New IO Implementation

  • New Thread Implementation

  • Better Numeric Accuracy

  • Safe Signals

  • Many New Modules

  • More Extensive Regression Testing

Incompatible Changes

Binary Incompatibility

Perl 5.8 is not binary compatible with earlier releases of Perl.

You have to recompile your XS modules.

(Pure Perl modules should continue to work.)

The major reason for the discontinuity is the new IO architecture called PerlIO. PerlIO is the default configuration because without it many new features of Perl 5.8 cannot be used. In other words: you just have to recompile your modules containing XS code, sorry about that.

In future releases of Perl, non-PerlIO aware XS modules may become completely unsupported. This shouldn't be too difficult for module authors, however: PerlIO has been designed as a drop-in replacement (at the source code level) for the stdio interface.

Depending on your platform, there are also other reasons why we decided to break binary compatibility, please read on.

64-bit platforms and malloc

If your pointers are 64 bits wide, the Perl malloc is no longer being used because it does not work well with 8-byte pointers. Also, usually the system mallocs on such platforms are much better optimized for such large memory models than the Perl malloc. Some memory-hungry Perl applications like the PDL don't work well with Perl's malloc. Finally, other applications than Perl (such as mod_perl) tend to prefer the system malloc. Such platforms include Alpha and 64-bit HPPA, MIPS, PPC, and Sparc.

AIX Dynaloading

In AIX releases 4.3 and newer, dynaloading now uses the native dlopen interface of AIX instead of the old emulated interface. This change will probably break backward compatibility with compiled modules. The change was made to make Perl more compliant with other applications like mod_perl which are using the AIX native interface.

Attributes for my variables now handled at run-time

The my EXPR : ATTRS syntax now applies variable attributes at run-time. (Subroutine and our variables still get attributes applied at compile-time.) See attributes for additional details. In particular, however, this allows variable attributes to be useful for tie interfaces, which was a deficiency of earlier releases. Note that the new semantics doesn't work with the Attribute::Handlers module (as of version 0.76).

Socket Extension Dynamic in VMS

The Socket extension is now dynamically loaded instead of being statically built in. This may or may not be a problem with ancient TCP/IP stacks of VMS: we do not know since we weren't able to test Perl in such configurations.

IEEE-format Floating Point Default on OpenVMS Alpha

Perl now uses IEEE format (T_FLOAT) as the default internal floating point format on OpenVMS Alpha, potentially breaking binary compatibility with external libraries or existing data. G_FLOAT is still available as a configuration option. The default on VAX (D_FLOAT) has not changed.

New Unicode Semantics (no more use utf8 , almost)

Previously in Perl 5.6 to use Unicode one would say "use utf8" and then the operations (like string concatenation) were Unicode-aware in that lexical scope.

This was found to be an inconvenient interface, and in Perl 5.8 the Unicode model has completely changed: now the "Unicodeness" is bound to the data itself, and for most of the time "use utf8" is not needed at all. The only remaining use of "use utf8" is when the Perl script itself has been written in the UTF-8 encoding of Unicode. (UTF-8 has not been made the default since there are many Perl scripts out there that are using various national eight-bit character sets, which would be illegal in UTF-8.)

See perluniintro for the explanation of the current model, and utf8 for the current use of the utf8 pragma.
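
A minimal sketch of the data-bound model follows. It deliberately omits "use utf8", since the source text is plain ASCII and the wide character is produced by an escape. The variable names are illustrative only.

```perl
use strict;
use warnings;

# No "use utf8" is needed: the source is ASCII, and the escapes create the
# characters at run time.
my $latin1 = "caf\x{E9}";    # every character fits in eight bits
my $wide   = "\x{263A}";     # one character above 255
my $joined = $latin1 . $wide;

# Lengths are in characters, whatever the internal representation.
printf "%d %d %d\n", length $latin1, length $wide, length $joined;

# The "Unicodeness" travels with the data itself.
print utf8::is_utf8($wide) ? "wide string is flagged\n" : "not flagged\n";
```

Concatenating an eight-bit string with a wide one needs no pragma; the result simply has five characters.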

New Unicode Properties

Unicode scripts are now supported. Scripts are similar to (and superior to) Unicode blocks. The difference between scripts and blocks is that scripts are the glyphs used by a language or a group of languages, while the blocks are more artificial groupings of (mostly) 256 characters based on the Unicode numbering.

In general, scripts are more inclusive, but not universally so. For example, while the script Latin includes all the Latin characters and their various diacritic-adorned versions, it does not include the various punctuation or digits (since they are not solely Latin ).

A number of other properties are now supported, including \p{L&}, \p{Any}, \p{Assigned}, \p{Unassigned}, \p{Blank} [561] and \p{SpacePerl} [561] (along with their \P{...} versions, of course). See perlunicode for details, and more additions.

The In or Is prefix to names used with \p{...} and \P{...} is now almost always optional. The only exception is that an In prefix is required to signify a Unicode block when a block name conflicts with a script name. For example, \p{Tibetan} refers to the script, while \p{InTibetan} refers to the block. When there is no name conflict, you can omit the In from the block name (e.g. \p{BraillePatterns} ), but to be safe, it's probably best to always use the In.
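
The script and block properties described above can be tried out directly in a regular expression. This is a small sketch using characters chosen for illustration (U+0101 and U+0F40); it assumes the \p{InTibetan} spelling given in the text is accepted by the perl in use.

```perl
use strict;
use warnings;

my $a_macron   = "\x{0101}";    # LATIN SMALL LETTER A WITH MACRON
my $tibetan_ka = "\x{0F40}";    # TIBETAN LETTER KA

my %check = (
    latin_letter   => ('A'         =~ /\p{Latin}/     ? 1 : 0),
    latin_accented => ($a_macron   =~ /\p{Latin}/     ? 1 : 0),
    latin_digit    => ('1'         =~ /\p{Latin}/     ? 1 : 0),  # digits are not solely Latin
    tibetan_script => ($tibetan_ka =~ /\p{Tibetan}/   ? 1 : 0),
    tibetan_block  => ($tibetan_ka =~ /\p{InTibetan}/ ? 1 : 0),
);

printf "%-15s %d\n", $_, $check{$_} for sort keys %check;
```

The digit is the only case that does not match, which mirrors the remark that the Latin script excludes punctuation and digits.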

REF(...) Instead Of SCALAR(...)

A reference to a reference now stringifies as "REF(0x81485ec)" instead of "SCALAR(0x81485ec)" in order to be more consistent with the return value of ref().
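
For illustration, a short sketch that compares the stringified form with the value returned by ref(). The addresses will differ from run to run, so only the prefix is examined.

```perl
use strict;
use warnings;

my $value      = 42;
my $ref        = \$value;     # reference to a scalar
my $ref_to_ref = \\$value;    # reference to a reference

my $scalar_text = "$ref";           # starts with SCALAR(
my $ref_text    = "$ref_to_ref";    # starts with REF(
my $kind        = ref $ref_to_ref;  # what ref() reports for it

print "$scalar_text\n$ref_text\n$kind\n";
```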

pack/unpack D/F recycled

The undocumented pack/unpack template letters D/F have been recycled for better use: now they stand for long double (if supported by the platform) and NV (Perl internal floating point type). (They used to be aliases for d/f, but you never knew that.)
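
A brief sketch of the F template follows. It uses only F, because D raises an error on platforms without long double support. The comparison against $Config{nvsize} is an illustration of "NV in native format", not something the text above prescribes.

```perl
use strict;
use warnings;
use Config;

# 'F' packs a Perl NV in the platform's native format.
my $packed = pack('F', 1.5);
my ($back) = unpack('F', $packed);

printf "%d bytes, round trip gives %s\n", length $packed, $back;
print "matches nvsize\n" if length($packed) == $Config{nvsize};
```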

glob() now returns filenames in alphabetical order

The list of filenames from glob() (or <...>) is now by default sorted alphabetically to be csh-compliant (which is what happened before in most Unix platforms). (bsd_glob() does still sort platform natively, ASCII or EBCDIC, unless GLOB_ALPHASORT is specified.) [561]

Deprecations

  • The semantics of bless(REF, REF) were unclear and until someone proves it to make some sense, it is forbidden.

  • The obsolete chat2 library that should never have been allowed to escape the laboratory has been decommissioned.

  • Using chdir("") or chdir(undef) instead of explicit chdir() is doubtful. A failure (think chdir(some_function())) can lead to an unintended chdir() to the home directory, therefore this behaviour is deprecated.

  • The builtin dump() function has probably outlived most of its usefulness. The core-dumping functionality will remain available in future as an explicit call to CORE::dump() , but in future releases the behaviour of an unqualified dump() call may change.

  • The very dusty examples in the eg/ directory have been removed. Suggestions for new shiny examples welcome but the main issue is that the examples need to be documented, tested and (most importantly) maintained.

  • The (bogus) escape sequences \8 and \9 now give an optional warning ("Unrecognized escape passed through"). There is no need to \-escape any \w character.

  • The *glob{FILEHANDLE} is deprecated, use *glob{IO} instead.

  • The package; syntax (package without an argument) has been deprecated. Its semantics were never that clear and its implementation even less so. If you have used that feature to disallow all but fully qualified variables, use strict; instead.

  • The unimplemented POSIX regex features [[.cc.]] and [[=c=]] are still recognised but now cause fatal errors. The previous behaviour of ignoring them by default and warning if requested was unacceptable since it, in a way, falsely promised that the features could be used.

  • In future releases, non-PerlIO aware XS modules may become completely unsupported. Since PerlIO is a drop-in replacement for stdio at the source code level, this shouldn't be that drastic a change.

  • Previous versions of perl and some readings of some sections of Camel III implied that the :raw "discipline" was the inverse of :crlf . Turning off "crlfness" is no longer enough to make a stream truly binary. So the PerlIO :raw layer (or "discipline", to use the Camel book's older terminology) is now formally defined as being equivalent to binmode(FH) - which is in turn defined as doing whatever is necessary to pass each byte as-is without any translation. In particular binmode(FH) - and hence :raw - will now turn off both CRLF and UTF-8 translation and remove other layers (e.g. :encoding()) which would modify the byte stream.

  • The current user-visible implementation of pseudo-hashes (the weird use of the first array element) is deprecated starting from Perl 5.8.0 and will be removed in Perl 5.10.0, and the feature will be implemented differently. Not only is the current interface rather ugly, but the current implementation slows down normal array and hash use quite noticeably. The fields pragma interface will remain available. The restricted hashes interface is expected to be the replacement interface (see Hash::Util). If your existing programs depend on the underlying implementation, consider using Class::PseudoHash from CPAN.

  • The syntaxes @a->[...] and %h->{...} have now been deprecated.

  • After years of trying, suidperl is considered to be too complex to ever be considered truly secure. The suidperl functionality is likely to be removed in a future release.

  • The 5.005 threads model (module Thread ) is deprecated and expected to be removed in Perl 5.10. Multithreaded code should be migrated to the new ithreads model (see threads, threads::shared and perlthrtut).

  • The long deprecated uppercase aliases for the string comparison operators (EQ, NE, LT, LE, GE, GT) have now been removed.

  • The tr///C and tr///U features have been removed and will not return; the interface was a mistake. Sorry about that. For similar functionality, see pack('U0', ...) and pack('C0', ...). [561]

  • Earlier Perls treated "sub foo (@bar)" as equivalent to "sub foo (@)". The prototypes are now checked better at compile-time for invalid syntax. An optional warning is generated ("Illegal character in prototype...") but this may be upgraded to a fatal error in a future release.

  • The exec LIST and system LIST operations now produce warnings on tainted data and in some future release they will produce fatal errors.

  • The existing behaviour when localising tied arrays and hashes is wrong, and will be changed in a future release, so do not rely on the existing behaviour. See Localising Tied Arrays and Hashes Is Broken.

Core Enhancements

Unicode Overhaul

Unicode in general should be now much more usable than in Perl 5.6.0 (or even in 5.6.1). Unicode can be used in hash keys, Unicode in regular expressions should work now, Unicode in tr/// should work now, Unicode in I/O should work now. See perluniintro for introduction and perlunicode for details.

  • The Unicode Character Database coming with Perl has been upgraded to Unicode 3.2.0. For more information, see http://www.unicode.org/ . [561+] (5.6.1 has UCD 3.0.1.)

  • For developers interested in enhancing Perl's Unicode capabilities: almost all the UCD files are included with the Perl distribution in the lib/unicore subdirectory. The most notable omission, for space considerations, is the Unihan database.

  • The properties \p{Blank} and \p{SpacePerl} have been added. "Blank" is like C isblank(), that is, it contains only "horizontal whitespace" (the space character is, the newline isn't), and the "SpacePerl" is the Unicode equivalent of \s (\p{Space} isn't, since that includes the vertical tabulator character, whereas \s doesn't.)

    See "New Unicode Properties" earlier in this document for additional information on changes with Unicode properties.

PerlIO is Now The Default

  • IO is now by default done via PerlIO rather than system's "stdio". PerlIO allows "layers" to be "pushed" onto a file handle to alter the handle's behaviour. Layers can be specified at open time via 3-arg form of open:

    open($fh,'>:crlf :utf8', $path) || ...

    or on already opened handles via extended binmode:

    binmode($fh,':encoding(iso-8859-7)');

    The built-in layers are: unix (low level read/write), stdio (as in previous Perls), perlio (re-implementation of stdio buffering in a portable manner), crlf (does CRLF <=> "\n" translation as on Win32, but available on any platform). A mmap layer may be available if platform supports it (mostly Unixes).

    Layers to be applied by default may be specified via the 'open' pragma.

    See Installation and Configuration Improvements for the effects of PerlIO on your architecture name.

  • If your platform supports fork(), you can use the list form of open for pipes. For example:

    open KID_PS, "-|", "ps", "aux" or die $!;

    forks the ps(1) command (without spawning a shell, as there are more than three arguments to open()), and reads its standard output via the KID_PS filehandle. See perlipc.

  • File handles can be marked as accepting Perl's internal encoding of Unicode (UTF-8 or UTF-EBCDIC depending on platform) by a pseudo layer ":utf8" :

    open($fh,">:utf8","Uni.txt");

    Note for EBCDIC users: the pseudo layer ":utf8" is erroneously named for you, since what you will be getting is not UTF-8 but UTF-EBCDIC. See perlunicode, utf8, and http://www.unicode.org/unicode/reports/tr16/ for more information. In future releases this naming may change. See perluniintro for more information about UTF-8.

  • If your environment variables (LC_ALL, LC_CTYPE, LANG) look like you want to use UTF-8 (any of the variables match /utf-?8/i), your STDIN, STDOUT, STDERR handles and the default open layer (see open) are marked as UTF-8. (This feature, like other new features that combine Unicode and I/O, works only if you are using PerlIO, but that's the default.)

    Note that after this Perl really does assume that everything is UTF-8: for example if some input handle is not, Perl will probably very soon complain about the input data like this "Malformed UTF-8 ..." since any old eight-bit data is not legal UTF-8.

    Note for code authors: if you want to enable your users to use UTF-8 as their default encoding but in your code still have eight-bit I/O streams (such as images or zip files), you need to explicitly open() or binmode() with :bytes (see open and binmode), or you can just use binmode(FH) (nice for pre-5.8.0 backward compatibility).

  • File handles can translate character encodings from/to Perl's internal Unicode form on read/write via the ":encoding()" layer.

  • File handles can be opened to "in memory" files held in Perl scalars via:

    open($fh,'>', \$variable) || ...

  • Anonymous temporary files are available without the need to 'use FileHandle' or another module via

    open($fh,"+>", undef) || ...

    That is a literal undef, not an undefined value.
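The in-memory and anonymous temporary file forms of open() described above can be sketched as follows. This is a minimal illustration, assuming a PerlIO-enabled perl (the default); the variable names ($buffer, $tmp, $round_trip) are made up for the example.

```perl
use strict;
use warnings;

# Write to an "in memory" file held in a Perl scalar.
my $buffer = '';
open(my $out, '>', \$buffer) or die "Cannot open in-memory file: $!";
print {$out} "line one\n", "line two\n";
close($out) or die "close failed: $!";

# Read the same scalar back through a filehandle.
open(my $in, '<', \$buffer) or die "Cannot reopen in-memory file: $!";
my @lines = <$in>;
close($in);

# Anonymous temporary file: note the literal undef as the third argument.
open(my $tmp, '+>', undef) or die "Cannot create temporary file: $!";
print {$tmp} "scratch data\n";
seek($tmp, 0, 0) or die "seek failed: $!";   # rewind before reading
my $round_trip = <$tmp>;
close($tmp);

print scalar(@lines), " lines buffered; temp file held: $round_trip";
```

Because the handle is an ordinary filehandle in both cases, code written against files can be pointed at a scalar without changes.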

ithreads

The new interpreter threads ("ithreads" for short) implementation of multithreading, by Arthur Bergman, replaces the old "5.005 threads" implementation. In the ithreads model any data sharing between threads must be explicit, as opposed to the old model where data sharing was implicit. See threads and threads::shared, and perlthrtut.

As a part of the ithreads implementation Perl will also use any necessary and detectable reentrant libc interfaces.

Restricted Hashes

A restricted hash is restricted to a certain set of keys: no keys outside the set can be added. Individual keys can also be restricted so that the key cannot be deleted and the value cannot be changed. No new syntax is involved: the Hash::Util module is the interface.
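As a rough sketch of how this looks in practice, the following uses lock_keys() and lock_value() from Hash::Util. The %config hash and its keys are invented for illustration; the misspelled key 'prot' stands in for the kind of typo a restricted hash is meant to catch.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys lock_value);

my %config = (host => 'localhost', port => 8080);

# Restrict the hash to its current set of keys.
lock_keys(%config);

# Adding a key outside the set dies, so trap it with eval.
my $added_ok = eval { $config{prot} = 'tcp'; 1 };

# Existing keys can still be assigned to.
$config{port} = 9090;

# Lock one value so that it can no longer be changed.
lock_value(%config, 'host');
my $changed_ok = eval { $config{host} = 'example.org'; 1 };

print "new key accepted: ",  ($added_ok   ? "yes" : "no"), "\n";
print "host changed: ",      ($changed_ok ? "yes" : "no"), "\n";
print "port is now $config{port}\n";
```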

Safe Signals

Perl used to be fragile in that signals arriving at inopportune moments could corrupt Perl's internal state. Now Perl postpones handling of signals until it's safe (between opcodes).

This change may have surprising side effects because signals no longer interrupt Perl instantly. Perl will now first finish whatever it was doing, like finishing an internal operation (like sort()) or an external operation (like an I/O operation), and only then look at any arrived signals (and before starting the next operation). No more corrupt internal state since the current operation is always finished first, but the signal may take more time to get heard. Note that breaking out from potentially blocking operations should still work, though.
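The common alarm-based timeout idiom is one case of breaking out of a potentially blocking operation. The sketch below assumes a platform where alarm() and SIGALRM are available (most Unixes); the one-second timeout and the sleep() standing in for a blocking call are arbitrary choices for the example.

```perl
use strict;
use warnings;

my $timed_out = 0;

my $finished = eval {
    # The handler runs once Perl reaches a safe point.
    local $SIG{ALRM} = sub { die "timeout\n" };
    alarm 1;       # ask for SIGALRM in one second
    sleep 10;      # stands in for a blocking operation
    alarm 0;       # cancel the alarm if we got here in time
    1;
};
alarm 0;           # make sure no alarm is left pending

if (!$finished && $@ eq "timeout\n") {
    $timed_out = 1;
}

print $timed_out ? "operation interrupted\n" : "operation completed\n";
```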

Understanding of Numbers

In general a lot of fixing has happened in the area of Perl's understanding of numbers, both integer and floating point. Since in many systems the standard number parsing functions like strtoul() and atof() seem to have bugs, Perl tries to work around their deficiencies. This results hopefully in more accurate numbers.

Perl now tries internally to use integer values in numeric conversions and basic arithmetics (+ - * /) if the arguments are integers, and tries also to keep the results stored internally as integers. This change leads to often slightly faster and always less lossy arithmetics. (Previously Perl always preferred floating point numbers in its math.)

Arrays now always interpolate into double-quoted strings [561]

In double-quoted strings, arrays now interpolate, no matter what. The behavior in earlier versions of perl 5 was that arrays would interpolate into strings if the array had been mentioned before the string was compiled, and otherwise Perl would raise a fatal compile-time error. In versions 5.000 through 5.003, the error was

    Literal @example now requires backslash

In versions 5.004_01 through 5.6.0, the error was

    In string, @example now must be written as \@example

The idea here was to get people into the habit of writing "fred\@example.com" when they wanted a literal @ sign, just as they have always written "Give me back my \$5" when they wanted a literal $ sign.

Starting with 5.6.1, when Perl now sees an @ sign in a double-quoted string, it always attempts to interpolate an array, regardless of whether or not the array has been used or declared already. The fatal error has been downgraded to an optional warning:

    Possible unintended interpolation of @example in string

This warns you that "fred@example.com" is going to turn into fred.com if you don't backslash the @ . See http://perl.plover.com/at-error.html for more details about the history here.
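The difference between the interpolated and the backslashed form can be seen in a short sketch. The array @example is declared here (with arbitrary contents) so that the code also compiles under strict.

```perl
use strict;
use warnings;

my @example = ('a', 'b');

# The array interpolates; its elements are joined with $" (a space by default).
my $interpolated = "fred@example.com";

# A backslash keeps the @ sign literal.
my $literal = "fred\@example.com";

print "$interpolated\n";   # freda b.com
print "$literal\n";        # fred@example.com
```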

Miscellaneous Changes

  • AUTOLOAD is now lvaluable, meaning that you can add the :lvalue attribute to AUTOLOAD subroutines and you can assign to the AUTOLOAD return value.

  • The $Config{byteorder} (and corresponding BYTEORDER in config.h) was previously wrong on platforms where sizeof(long) was 4 but sizeof(IV) was 8. The byteorder was only sizeof(long) bytes long (1234 or 4321), but now it is correctly sizeof(IV) bytes long (12345678 or 87654321). (This problem didn't affect Windows platforms.)

    Also, $Config{byteorder} is now computed dynamically--this is more robust with "fat binaries" where an executable image contains binaries for more than one binary platform, and when cross-compiling.

  • perl -d:Module=arg,arg,arg now works (previously one couldn't pass in multiple arguments.)

  • do followed by a bareword now ensures that this bareword isn't a keyword (to avoid a bug where do q(foo.pl) tried to call a subroutine called q). This means that for example instead of do format() you must write do &format() .

  • The builtin dump() now gives an optional warning dump() better written as CORE::dump(), meaning that by default dump(...) is resolved as the builtin dump() which dumps core and aborts, not as (possibly) user-defined sub dump. To call the latter, qualify the call as &dump(...). (The whole dump() feature is to be considered deprecated, and possibly removed/changed in future releases.)

  • chomp() and chop() are now overridable. Note, however, that their prototype (as given by prototype("CORE::chomp")) is undefined, because it cannot be expressed, and therefore one cannot really write replacements to override these builtins.

  • END blocks are now run even if you exit/die in a BEGIN block. Internally, the execution of END blocks is now controlled by PL_exit_flags & PERL_EXIT_DESTRUCT_END. This enables the new behaviour for Perl embedders. This will default in 5.10. See perlembed.

  • Formats now support zero-padded decimal fields.

  • Although "you shouldn't do that", it was possible to write code that depends on Perl's hashed key order (Data::Dumper does this). The new algorithm "One-at-a-Time" produces a different hashed key order. More details are in Performance Enhancements.

  • lstat(FILEHANDLE) now gives a warning because the operation makes no sense. In future releases this may become a fatal error.

  • Spurious syntax errors generated in certain situations, when glob() caused File::Glob to be loaded for the first time, have been fixed. [561]

  • Lvalue subroutines can now return undef in list context. However, the lvalue subroutine feature still remains experimental. [561+]

  • A lost warning "Can't declare ... dereference in my" has been restored (Perl had it earlier but it became lost in later releases.)

  • A new special regular expression variable has been introduced: $^N , which contains the most-recently closed group (submatch).

  • no Module; does not produce an error even if Module does not have an unimport() method. This parallels the behavior of use vis-a-vis import. [561]

  • The numerical comparison operators return undef if either operand is a NaN. Previously the behaviour was unspecified.

  • our can now have an experimental optional attribute unique that affects how global variables are shared among multiple interpreters, see our.

  • The following builtin functions are now overridable: each(), keys(), pop(), push(), shift(), splice(), unshift(). [561]

  • pack() / unpack() can now group template letters with () and then apply repetition/count modifiers on the groups.

  • pack() / unpack() can now process the Perl internal numeric types: IVs, UVs, NVs-- and also long doubles, if supported by the platform. The template letters are j , J , F , and D .

  • pack('U0a*', ...) can now be used to force a string to UTF-8.

  • my __PACKAGE__ $obj now works. [561]

  • POSIX::sleep() now returns the number of unslept seconds (as the POSIX standard says), as opposed to CORE::sleep() which returns the number of slept seconds.

  • printf() and sprintf() now support parameter reordering using the %\d+\$ and *\d+\$ syntaxes. For example

    printf "%2\$s %1\$s\n", "foo", "bar";

    will print "bar foo\n". This feature helps in writing internationalised software, and in general when the order of the parameters can vary.

  • The (\&) prototype now works properly. [561]

  • prototype(\[$@%&]) is now available to implicitly create references (useful for example if you want to emulate the tie() interface).

  • A new command-line option, -t, is available. It is the little brother of -T: instead of dying on taint violations, lexical warnings are given. This is only meant as a temporary debugging aid while securing the code of old legacy applications. This is not a substitute for -T.

  • In other taint news, the exec LIST and system LIST have now been considered too risky (think exec @ARGV : it can start any program with any arguments), and now the said forms cause a warning under lexical warnings. You should carefully launder the arguments to guarantee their validity. In future releases of Perl the forms will become fatal errors so consider starting laundering now.

  • Tied hash interfaces are now required to have the EXISTS and DELETE methods (either own or inherited).

  • If tr/// is just counting characters, it doesn't attempt to modify its target.

  • untie() will now call an UNTIE() hook if it exists. See perltie for details. [561]

  • utime now supports utime undef, undef, @files to change the file timestamps to the current time.

  • The rules for allowing underscores (underbars) in numeric constants have been relaxed and simplified: now you can have an underscore simply between digits.

  • Rather than relying on C's argv[0] (which may not contain a full pathname), where possible $^X is now set by asking the operating system (e.g. by reading /proc/self/exe on Linux, /proc/curproc/file on FreeBSD).

  • A new variable, ${^TAINT} , indicates whether taint mode is enabled.

  • You can now override the readline() builtin, and this overrides also the <FILEHANDLE> angle bracket operator.

  • The command-line options -s and -F are now recognized on the shebang (#!) line.

  • Use of the /c match modifier without an accompanying /g modifier elicits a new warning: Use of /c modifier is meaningless without /g .

    Use of /c in substitutions, even with /g, elicits Use of /c modifier is meaningless in s/// .

    Use of /g with split elicits Use of /g modifier is meaningless in split .

  • Support for the CLONE special subroutine has been added. With ithreads, when a new thread is created, all Perl data is cloned; however, non-Perl data cannot be cloned automatically. In CLONE you can do whatever you need to do, for example handle the cloning of non-Perl data, if necessary. CLONE will be executed once for every package that has it defined or inherited. It will be called in the context of the new thread, so all modifications are made in the new area.

    See perlmod.
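A few of the smaller items in the list above (pack()/unpack() groups, $^N, parameter reordering in sprintf(), and underscores in numeric constants) can be tried out with a short sketch. The data values are arbitrary and chosen only for illustration.

```perl
use strict;
use warnings;

# pack()/unpack() groups: apply a repeat count to a parenthesised group.
my $packed = pack('(A2)3', 'ab', 'cd', 'ef');
my @fields = unpack('(A2)*', $packed);

# $^N holds the text of the most recently closed group.
my $last_group = '';
if ('2002-07' =~ /(\d+)-(\d+)/) {
    $last_group = $^N;
}

# sprintf() parameter reordering with %N$ indexes.
my $swapped = sprintf('%2$s %1$s', 'foo', 'bar');

# Underscores between digits in a numeric constant.
my $million = 1_000_000;

print "packed=$packed fields=@fields\n";
print "last group=$last_group swapped=$swapped million=$million\n";
```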

Modules and Pragmata

New Modules and Pragmata

  • Attribute::Handlers , originally by Damian Conway and now maintained by Arthur Bergman, allows a class to define attribute handlers.

    package MyPack;
    use Attribute::Handlers;
    sub Wolf :ATTR(SCALAR) { print "howl!\n" }
    # later, in some package using or inheriting from MyPack...
    my MyPack $Fluffy : Wolf; # the attribute handler Wolf will be called

    Both variables and routines can have attribute handlers. Handlers can be specific to type (SCALAR, ARRAY, HASH, or CODE), or specific to the exact compilation phase (BEGIN, CHECK, INIT, or END). See Attribute::Handlers.

  • B::Concise , by Stephen McCamant, is a new compiler backend for walking the Perl syntax tree, printing concise info about ops. The output is highly customisable. See B::Concise. [561+]

  • The new bignum, bigint, and bigrat pragmas, by Tels, implement transparent bignum support (using the Math::BigInt, Math::BigFloat, and Math::BigRat backends).

  • Class::ISA , by Sean Burke, is a module for reporting the search path for a class's ISA tree. See Class::ISA.

  • Cwd now has a split personality: if possible, an XS extension is used, (this will hopefully be faster, more secure, and more robust) but if not possible, the familiar Perl implementation is used.

  • Devel::PPPort , originally by Kenneth Albanowski and now maintained by Paul Marquess, has been added. It is primarily used by h2xs to enhance portability of XS modules between different versions of Perl. See Devel::PPPort.

  • Digest , frontend module for calculating digests (checksums), from Gisle Aas, has been added. See Digest.

  • Digest::MD5 for calculating MD5 digests (checksums) as defined in RFC 1321, from Gisle Aas, has been added. See Digest::MD5.

    use Digest::MD5 'md5_hex';
    $digest = md5_hex("Thirsty Camel");
    print $digest, "\n"; # 01d19d9d2045e005c3f1b80e8b164de1

    NOTE: the MD5 backward compatibility module is deliberately not included since its further use is discouraged.

    See also PerlIO::via::QuotedPrint.

  • Encode, originally by Nick Ing-Simmons and now maintained by Dan Kogai, provides a mechanism to translate between different character encodings. Support for Unicode, ISO-8859-1, and ASCII is compiled in to the module. Several other encodings (like the rest of the ISO-8859, CP*/Win*, Mac, KOI8-R, three variants of EBCDIC, Chinese, Japanese, and Korean encodings) are included and can be loaded at runtime. (For space considerations, the largest Chinese encodings have been separated into their own CPAN module, Encode::HanExtra, which Encode will use if available.) See Encode.

    Any encoding supported by Encode module is also available to the ":encoding()" layer if PerlIO is used.

  • Hash::Util is the interface to the new restricted hashes feature. (Implemented by Jeffrey Friedl, Nick Ing-Simmons, and Michael Schwern.) See Hash::Util.

  • I18N::Langinfo can be used to query locale information. See I18N::Langinfo.

  • I18N::LangTags , by Sean Burke, has functions for dealing with RFC3066-style language tags. See I18N::LangTags.

  • ExtUtils::Constant , by Nicholas Clark, is a new tool for extension writers for generating XS code to import C header constants. See ExtUtils::Constant.

  • Filter::Simple , by Damian Conway, is an easy-to-use frontend to Filter::Util::Call. See Filter::Simple.

    # in MyFilter.pm:
    package MyFilter;
    use Filter::Simple sub {
        while (my ($from, $to) = splice @_, 0, 2) {
            s/$from/$to/g;
        }
    };
    1;
    # in user's code:
    use MyFilter qr/red/ => 'green';
    print "red\n"; # this code is filtered, will print "green\n"
    print "bored\n"; # this code is filtered, will print "bogreen\n"
    no MyFilter;
    print "red\n"; # this code is not filtered, will print "red\n"

  • File::Temp, by Tim Jenness, allows one to create temporary files and directories in an easy, portable, and secure way. See File::Temp. [561+]

  • Filter::Util::Call , by Paul Marquess, provides you with the framework to write source filters in Perl. For most uses, the frontend Filter::Simple is to be preferred. See Filter::Util::Call.

  • if , by Ilya Zakharevich, is a new pragma for conditional inclusion of modules.

  • libnet, by Graham Barr, is a collection of perl5 modules related to network programming. See Net::FTP, Net::NNTP, Net::Ping (not part of libnet, but related), Net::POP3, Net::SMTP, and Net::Time.

    Perl installation leaves libnet unconfigured; use libnetcfg to configure it.

  • List::Util , by Graham Barr, is a selection of general-utility list subroutines, such as sum(), min(), first(), and shuffle(). See List::Util.

  • Locale::Constants, Locale::Country, Locale::Currency, Locale::Language, and Locale::Script, by Neil Bowers, have been added. They provide the codes for various locale standards, such as "fr" for France, "usd" for US Dollar, and "ja" for Japanese.

    use Locale::Country;
    $country = code2country('jp'); # $country gets 'Japan'
    $code = country2code('Norway'); # $code gets 'no'

    See Locale::Constants, Locale::Country, Locale::Currency, and Locale::Language.

  • Locale::Maketext , by Sean Burke, is a localization framework. See Locale::Maketext, and Locale::Maketext::TPJ13. The latter is an article about software localization, originally published in The Perl Journal #13, and republished here with kind permission.

  • Math::BigRat for big rational numbers, to accompany Math::BigInt and Math::BigFloat, from Tels. See Math::BigRat.

  • Memoize can make your functions faster by trading space for time, from Mark-Jason Dominus. See Memoize.

  • MIME::Base64 , by Gisle Aas, allows you to encode data in base64, as defined in RFC 2045 - MIME (Multipurpose Internet Mail Extensions).

    use MIME::Base64;
    $encoded = encode_base64('Aladdin:open sesame');
    $decoded = decode_base64($encoded);
    print $encoded, "\n"; # "QWxhZGRpbjpvcGVuIHNlc2FtZQ=="

    See MIME::Base64.

  • MIME::QuotedPrint , by Gisle Aas, allows you to encode data in quoted-printable encoding, as defined in RFC 2045 - MIME (Multipurpose Internet Mail Extensions).

    use MIME::QuotedPrint;
    $encoded = encode_qp("\xDE\xAD\xBE\xEF");
    $decoded = decode_qp($encoded);
    print $encoded, "\n"; # "=DE=AD=BE=EF\n"
    print $decoded, "\n"; # "\xDE\xAD\xBE\xEF\n"

    See also PerlIO::via::QuotedPrint.

  • NEXT , by Damian Conway, is a pseudo-class for method redispatch. See NEXT.

  • open is a new pragma for setting the default I/O layers for open().

  • PerlIO::scalar , by Nick Ing-Simmons, provides the implementation of IO to "in memory" Perl scalars as discussed above. It also serves as an example of a loadable PerlIO layer. Other future possibilities include PerlIO::Array and PerlIO::Code. See PerlIO::scalar.

  • PerlIO::via , by Nick Ing-Simmons, acts as a PerlIO layer and wraps PerlIO layer functionality provided by a class (typically implemented in Perl code).

  • PerlIO::via::QuotedPrint , by Elizabeth Mattijsen, is an example of a PerlIO::via class:

    use PerlIO::via::QuotedPrint;
    open($fh,">:via(QuotedPrint)",$path);

    This will automatically convert everything output to $fh to Quoted-Printable. See PerlIO::via and PerlIO::via::QuotedPrint.

  • Pod::ParseLink , by Russ Allbery, has been added, to parse L<> links in pods as described in the new perlpodspec.

  • Pod::Text::Overstrike , by Joe Smith, has been added. It converts POD data to formatted overstrike text. See Pod::Text::Overstrike. [561+]

  • Scalar::Util is a selection of general-utility scalar subroutines, such as blessed(), reftype(), and tainted(). See Scalar::Util.

  • sort is a new pragma for controlling the behaviour of sort().

  • Storable gives persistence to Perl data structures by allowing the storage and retrieval of Perl data to and from files in a fast and compact binary format. Because in effect Storable does serialisation of Perl data structures, with it you can also clone deep, hierarchical datastructures. Storable was originally created by Raphael Manfredi, but it is now maintained by Abhijit Menon-Sen. Storable has been enhanced to understand the two new hash features, Unicode keys and restricted hashes. See Storable.

  • Switch , by Damian Conway, has been added. Just by saying

    use Switch;

    you have switch and case available in Perl.

    use Switch;
    switch ($val) {
        case 1 { print "number 1" }
        case "a" { print "string a" }
        case [1..10,42] { print "number in list" }
        case (@array) { print "number in list" }
        case /\w+/ { print "pattern" }
        case qr/\w+/ { print "pattern" }
        case (%hash) { print "entry in hash" }
        case (\%hash) { print "entry in hash" }
        case (\&sub) { print "arg to subroutine" }
        else { print "previous case not true" }
    }

    See Switch.

  • Test::More , by Michael Schwern, is yet another framework for writing test scripts, more extensive than Test::Simple. See Test::More.

  • Test::Simple , by Michael Schwern, has basic utilities for writing tests. See Test::Simple.

  • Text::Balanced , by Damian Conway, has been added, for extracting delimited text sequences from strings.

    use Text::Balanced 'extract_delimited';
    ($a, $b) = extract_delimited("'never say never', he never said", "'", '');

    $a will be "'never say never'", $b will be ', he never said'.

    In addition to extract_delimited(), there are also extract_bracketed(), extract_quotelike(), extract_codeblock(), extract_variable(), extract_tagged(), extract_multiple(), gen_delimited_pat(), and gen_extract_tagged(). With these, you can implement rather advanced parsing algorithms. See Text::Balanced.

  • threads, by Arthur Bergman, is an interface to interpreter threads. Interpreter threads (ithreads) is the thread model introduced in Perl 5.6, where it was only available as an internal interface for extension writers (and for Win32 Perl for fork() emulation). See threads, threads::shared, and perlthrtut.

  • threads::shared , by Arthur Bergman, allows data sharing for interpreter threads. See threads::shared.

  • Tie::File , by Mark-Jason Dominus, associates a Perl array with the lines of a file. See Tie::File.

  • Tie::Memoize , by Ilya Zakharevich, provides on-demand loaded hashes. See Tie::Memoize.

  • Tie::RefHash::Nestable, by Edward Avis, allows storing hash references (unlike the standard Tie::RefHash). The module is contained within Tie::RefHash. See Tie::RefHash.

  • Time::HiRes , by Douglas E. Wegscheid, provides high resolution timing (ualarm, usleep, and gettimeofday). See Time::HiRes.

  • Unicode::UCD offers a querying interface to the Unicode Character Database. See Unicode::UCD.

  • Unicode::Collate , by SADAHIRO Tomoyuki, implements the UCA (Unicode Collation Algorithm) for sorting Unicode strings. See Unicode::Collate.

  • Unicode::Normalize , by SADAHIRO Tomoyuki, implements the various Unicode normalization forms. See Unicode::Normalize.

  • XS::APItest , by Tim Jenness, is a test extension that exercises XS APIs. Currently only printf() is tested: how to output various basic data types from XS.

  • XS::Typemap , by Tim Jenness, is a test extension that exercises XS typemaps. Nothing gets installed, but the code is worth studying for extension writers.
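Two of the new utility modules in the list above, List::Util and Scalar::Util, have no example of their own, so here is a brief sketch of the functions named there. The sample numbers and the class name Example::Class are placeholders for the illustration.

```perl
use strict;
use warnings;
use List::Util   qw(sum min max first);
use Scalar::Util qw(blessed reftype);

my @numbers = (3, 9, 4, 12);

my $total     = sum(@numbers);               # 28
my $smallest  = min(@numbers);               # 3
my $largest   = max(@numbers);               # 12
my $first_big = first { $_ > 5 } @numbers;   # 9, the first element over 5

# blessed() returns the class name of an object, or undef for non-objects;
# reftype() returns the underlying type regardless of blessing.
my $object = bless {}, 'Example::Class';
my $class  = blessed($object);               # 'Example::Class'
my $type   = reftype($object);               # 'HASH'
my $plain  = blessed({});                    # undef

print "total=$total min=$smallest max=$largest first=$first_big\n";
print "class=$class type=$type\n";
```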

Updated And Improved Modules and Pragmata

  • The following independently supported modules have been updated to the newest versions from CPAN: CGI, CPAN, DB_File, File::Spec, File::Temp, Getopt::Long, Math::BigFloat, Math::BigInt, the podlators bundle (Pod::Man, Pod::Text), Pod::LaTeX [561+], Pod::Parser, Storable, Term::ANSIColor, Test, Text-Tabs+Wrap.

  • attributes::reftype() now works on tied arguments.

  • AutoLoader can now be disabled with no AutoLoader; .

  • B::Deparse has been significantly enhanced by Robin Houston. It can now deparse almost all of the standard test suite (so that the tests still succeed). There is a make target "test.deparse" for trying this out.

  • Carp now has better interface documentation, and the @CARP_NOT interface has been added to get optional control over where errors are reported independently of @ISA, by Ben Tilly.

  • Class::Struct can now define the classes in compile time.

  • Class::Struct now assigns the array/hash element if the accessor is called with an array/hash element as the sole argument.

  • The return value of Cwd::fastcwd() is now tainted.

  • Data::Dumper now has an option to sort hashes.

  • Data::Dumper now has an option to dump code references using B::Deparse.

  • DB_File now supports newer Berkeley DB versions, among other improvements.

  • Devel::Peek now has an interface for the Perl memory statistics (this works only if you are using perl's malloc, and if you have compiled with debugging).

  • The English module can now be used without the infamous performance hit by saying

    use English '-no_match_vars';

    (Assuming, of course, that you don't need the troublesome variables $` , $& , or $' .) Also, introduced @LAST_MATCH_START and @LAST_MATCH_END English aliases for @- and @+ .

  • ExtUtils::MakeMaker has been significantly cleaned up and fixed. The enhanced version has also been backported to earlier releases of Perl and submitted to CPAN so that the earlier releases can enjoy the fixes.

  • The arguments of WriteMakefile() in Makefile.PL are now checked for sanity much more carefully than before. This may cause new warnings when modules are being installed. See ExtUtils::MakeMaker for more details.

  • ExtUtils::MakeMaker now uses File::Spec internally, which hopefully leads to better portability.

  • Fcntl, Socket, and Sys::Syslog have been rewritten by Nicholas Clark to use the new-style constant dispatch section (see ExtUtils::Constant). This means that they will be more robust and hopefully faster.

  • File::Find now chdir()s correctly when chasing symbolic links. [561]

  • File::Find now has pre- and post-processing callbacks. It also correctly changes directories when chasing symbolic links. Callbacks (naughtily) exiting with "next;" instead of "return;" now work.

  • File::Find is now (again) reentrant. It also has been made more portable.

  • The warnings issued by File::Find now belong to their own category. You can enable/disable them with use/no warnings 'File::Find';.

  • File::Glob::glob() has been renamed to File::Glob::bsd_glob() because the name clashes with the builtin glob(). The older name is still available for compatibility, but is deprecated. [561]

  • File::Glob now supports GLOB_LIMIT constant to limit the size of the returned list of filenames.

  • IPC::Open3 now allows the use of numeric file descriptors.

  • IO::Socket now has an atmark() method, which returns true if the socket is positioned at the out-of-band mark. The method is also exportable as a sockatmark() function.

  • IO::Socket::INET failed to open the specified port if the service name was not known. It now correctly uses the supplied port number as is. [561]

  • IO::Socket::INET has support for the ReusePort option (if your platform supports it). The Reuse option now has an alias, ReuseAddr. For clarity, you may want to prefer ReuseAddr.

  • IO::Socket::INET now supports a value of zero for LocalPort (usually meaning that the operating system will make one up.)

  • 'use lib' now works identically to @INC. Removing directories with 'no lib' now works.

  • Math::BigFloat and Math::BigInt have undergone a full rewrite by Tels. They are now orders of magnitude faster, and they support various bignum libraries such as GMP and PARI as their backends.

  • Math::Complex handles inf, NaN etc., better.

  • Net::Ping has been considerably enhanced by Rob Brown: multihoming is now supported, Win32 functionality is better, there is now time measuring functionality (optionally high-resolution using Time::HiRes), and there is now "external" protocol which uses Net::Ping::External module which runs your external ping utility and parses the output. A version of Net::Ping::External is available in CPAN.

    Note that some of the Net::Ping tests are disabled when running under the Perl distribution since one cannot assume one or more of the following: enabled echo port at localhost, full Internet connectivity, or sympathetic firewalls. You can set the environment variable PERL_TEST_Net_Ping to "1" (one) before running the Perl test suite to enable all the Net::Ping tests.

  • POSIX::sigaction() is now much more flexible and robust. You can now install coderef handlers as well as 'DEFAULT' and 'IGNORE' handlers. Previously, installing new handlers was not atomic.

  • In Safe, %INC is now localised in a Safe compartment so that use/require work.

  • In SDBM_File on dosish platforms, some keys went missing because of lack of support for files with "holes". A workaround for the problem has been added.

  • In Search::Dict one can now have a pre-processing hook for the lines being searched.

  • The Shell module now has an OO interface.

  • In Sys::Syslog there is now a failover mechanism that will go through alternative connection mechanisms until the message is successfully logged.

  • The Test module has been significantly enhanced.

  • Time::Local::timelocal() does not handle fractional seconds anymore. The rationale is that neither does localtime(), and timelocal() and localtime() are supposed to be inverses of each other.

  • The vars pragma now supports declaring fully qualified variables. (Something that our() does not and will not support.)

  • The utf8:: name space (as in the pragma) provides various Perl-callable functions to provide low level access to Perl's internal Unicode representation. At the moment only length() has been implemented.
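
The inverse relationship between timelocal() and localtime() mentioned above can be checked with a short round trip. This is a minimal sketch: the epoch value 1_000_000_000 is an arbitrary whole-second timestamp chosen for illustration, and the check assumes that instant does not fall inside a daylight-saving transition in the local time zone.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Arbitrary whole-second epoch value, chosen only for illustration.
my $epoch = 1_000_000_000;

# localtime() in list context: (sec, min, hour, mday, mon, year, ...).
my @fields = localtime($epoch);

# timelocal() takes the first six fields and returns whole seconds.
my $round_trip = timelocal(@fields[0 .. 5]);

print "round trip ", ($round_trip == $epoch ? "ok" : "differs"), "\n";
```

Because timelocal() works in whole seconds, any fractional part has to be carried separately by the caller if it matters.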

Utility Changes

  • Emacs perl mode (emacs/cperl-mode.el) has been updated to version 4.31.

  • emacs/e2ctags.pl is now much faster.

  • enc2xs is a tool for people adding their own encodings to the Encode module.

  • h2ph now supports C trigraphs.

  • h2xs now produces a template README.

  • h2xs now uses Devel::PPPort for better portability between different versions of Perl.

  • h2xs uses the new ExtUtils::Constant module which will affect newly created extensions that define constants. Since the new code is more correct (if you have two constants where the first one is a prefix of the second one, the first constant never got defined), less lossy (it uses integers for integer constant, as opposed to the old code that used floating point numbers even for integer constants), and slightly faster, you might want to consider regenerating your extension code (the new scheme makes regenerating easy). h2xs now also supports C trigraphs.

  • libnetcfg has been added to configure libnet.

  • perlbug is now much more robust. It also sends the bug report to perl.org, not perl.com.

  • perlcc has been rewritten and its user interface (that is, command line) is much more like that of the Unix C compiler, cc. (The perlbc tool has been removed. Use perlcc -B instead.) Note that perlcc is still considered very experimental and unsupported. [561]

  • perlivp is a new Installation Verification Procedure utility for running any time after installing Perl.

  • piconv is an implementation of the character conversion utility iconv, demonstrating the new Encode module.

  • pod2html now allows specifying a cache directory.

  • pod2html now produces XHTML 1.0.

  • pod2html now understands POD written using different line endings (PC-like CRLF versus Unix-like LF versus MacClassic-like CR).

  • s2p has been completely rewritten in Perl. (It is in fact a full implementation of sed in Perl: you can use the sed functionality by using the psed utility.)

  • xsubpp now understands POD documentation embedded in the *.xs files. [561]

  • xsubpp now supports the OUT keyword.

New Documentation

  • perl56delta details the changes between the 5.005 release and the 5.6.0 release.

  • perlclib documents the internal replacements for standard C library functions. (Interesting only for extension writers and Perl core hackers.) [561+]

  • perldebtut is a Perl debugging tutorial. [561+]

  • perlebcdic contains considerations for running Perl on EBCDIC platforms. [561+]

  • perlintro is a gentle introduction to Perl.

  • perliol documents the internals of PerlIO with layers.

  • perlmodstyle is a style guide for writing modules.

  • perlnewmod tells about writing and submitting a new module. [561+]

  • perlpacktut is a pack() tutorial.

  • perlpod has been rewritten to be clearer and to record the best practices gathered over the years.

  • perlpodspec is a more formal specification of the pod format, mainly of interest for writers of pod applications, not to people writing in pod.

  • perlretut is a regular expression tutorial. [561+]

  • perlrequick is a regular expressions quick-start guide. Yes, much quicker than perlretut. [561]

  • perltodo has been updated.

  • perltootc has been renamed as perltooc (so as not to conflict with perltoot in filesystems restricted to "8.3" names).

  • perluniintro is an introduction to using Unicode in Perl. (perlunicode is more of a detailed reference and background information.)

  • perlutil explains the command line utilities packaged with the Perl distribution. [561+]

The following platform-specific documents are available before the installation as README.platform, and after the installation as perlplatform:

    perlaix perlamiga perlapollo perlbeos perlbs2000
    perlce perlcygwin perldgux perldos perlepoc perlfreebsd perlhpux
    perlhurd perlirix perlmachten perlmacos perlmint perlmpeix
    perlnetware perlos2 perlos390 perlplan9 perlqnx perlsolaris
    perltru64 perluts perlvmesa perlvms perlvos perlwin32

These documents usually detail one or more of the following subjects: configuring, building, testing, installing, and sometimes also using Perl on the said platform.

Eastern Asian Perl users are now welcomed in their own languages: README.jp (Japanese), README.ko (Korean), README.cn (simplified Chinese) and README.tw (traditional Chinese), which are written in normal pod but encoded in EUC-JP, EUC-KR, EUC-CN and Big5. These will get installed as

    perljp perlko perlcn perltw

  • The documentation for the POSIX-BC platform is called "BS2000", to avoid confusion with the Perl POSIX module.

  • The documentation for the WinCE platform is called perlce (README.ce in the source code kit), to avoid confusion with the perlwin32 documentation on 8.3-restricted filesystems.

Performance Enhancements

  • map() could get pathologically slow when the result list it generates is larger than the source list. The performance has been improved for common scenarios. [561]

  • sort() is now fully reentrant, in the sense that the sort function can itself call sort(). This did not work reliably in previous releases. [561]

  • sort() has been changed to use primarily mergesort internally as opposed to the earlier quicksort. For very small lists this may result in slightly slower sorting times, but in general the speedup should be at least 20%. Additional bonuses are that the worst case behaviour of sort() is now better (in computer science terms it now runs in time O(N log N), as opposed to quicksort's Theta(N**2) worst-case run time behaviour), and that sort() is now stable (meaning that elements with identical keys will stay ordered as they were before the sort). See the sort pragma for information.

    The story in more detail: suppose you want to serve yourself a little slice of Pi.

        @digits = ( 3,1,4,1,5,9 );

    A numerical sort of the digits will yield (1,1,3,4,5,9), as expected. Which 1 comes first is hard to know, since one 1 looks pretty much like any other. You can regard this as totally trivial, or somewhat profound. However, if you just want to sort the even digits ahead of the odd ones, then what will

        sort { ($a % 2) <=> ($b % 2) } @digits;

    yield? The only even digit, 4, will come first. But how about the odd numbers, which all compare equal? With the quicksort algorithm used to implement Perl 5.6 and earlier, the order of ties is left up to the sort. So, as you add more and more digits of Pi, the order in which the sorted even and odd digits appear will change. And, for sufficiently large slices of Pi, the quicksort algorithm in Perl 5.8 won't return the same results even if reinvoked with the same input. The justification for this rests with quicksort's worst case behavior. If you run

        sort { $a <=> $b } ( 1 .. $N , 1 .. $N );

    (something you might approximate if you wanted to merge two sorted arrays using sort), doubling $N doesn't just double the quicksort time, it quadruples it. Quicksort has a worst case run time that can grow like N**2, so-called quadratic behaviour, and it can happen on patterns that may well arise in normal use. You won't notice this for small arrays, but you will notice it with larger arrays, and you may not live long enough for the sort to complete on arrays of a million elements. So the 5.8 quicksort scrambles large arrays before sorting them, as a statistical defence against quadratic behaviour. But that means if you sort the same large array twice, ties may be broken in different ways.

    Because of the unpredictability of tie-breaking order, and the quadratic worst-case behaviour, quicksort was almost completely replaced by a stable mergesort. Stable means that ties are broken to preserve the original order of appearance in the input array. So

        sort { ($a % 2) <=> ($b % 2) } (3,1,4,1,5,9);

    will yield (4,3,1,1,5,9), guaranteed. The even and odd numbers appear in the output in the same order they appeared in the input. Mergesort has worst case O(N log N) behaviour, the best value attainable. And, ironically, this mergesort does particularly well where quicksort goes quadratic: mergesort sorts (1..$N, 1..$N) in O(N) time. But quicksort was rescued at the last moment because it is faster than mergesort on certain inputs and platforms. For example, if you really don't care about the order of even and odd digits, quicksort will run in O(N) time; it's very good at sorting many repetitions of a small number of distinct elements. The quicksort divide and conquer strategy works well on platforms with relatively small, very fast, caches. Eventually, the problem gets whittled down to one that fits in the cache, from which point it benefits from the increased memory speed.

    Quicksort was rescued by implementing a sort pragma to control aspects of the sort. The stable subpragma forces stable behaviour, regardless of algorithm. The _quicksort and _mergesort subpragmas are heavy-handed ways to select the underlying implementation. The leading _ is a reminder that these subpragmas may not survive beyond 5.8. More appropriate mechanisms for selecting the implementation exist, but they wouldn't have arrived in time to save quicksort.

  • Hashes now use Bob Jenkins' "One-at-a-Time" hashing key algorithm ( http://burtleburtle.net/bob/hash/doobs.html ). This algorithm is reasonably fast while producing a much better spread of values than the old hashing algorithm (originally by Chris Torek, later tweaked by Ilya Zakharevich). Hash values output from the algorithm on a hash of all 3-char printable ASCII keys come much closer to passing the DIEHARD random number generation tests. According to perlbench, this change has not affected the overall speed of Perl.

  • unshift() should now be noticeably faster.
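
The stability guarantee described in the sort() entry above can be seen with the same digits. This is a minimal sketch, assuming the stable subpragma of the sort pragma is available in the running Perl; the variable names are illustrative.

```perl
use strict;
use warnings;
use sort 'stable';    # request a stable sort regardless of algorithm

my @digits = (3, 1, 4, 1, 5, 9);

# Compare only parity: even digits (0) sort before odd digits (1).
# All odd digits compare equal, so a stable sort keeps their input order.
my @sorted = sort { ($a % 2) <=> ($b % 2) } @digits;

print "@sorted\n";    # 4 3 1 1 5 9
```

The comparison block never looks at the digit values themselves, so the order of the odd digits in the output comes entirely from the tie-breaking rule.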

Installation and Configuration Improvements

Generic Improvements

  • INSTALL now explains how you can configure Perl to use 64-bit integers even on non-64-bit platforms.

  • Policy.sh policy change: if you are reusing a Policy.sh file (see INSTALL) and you use Configure -Dprefix=/foo/bar and in the old Policy $prefix eq $siteprefix and $prefix eq $vendorprefix, all of them will now be changed to the new prefix, /foo/bar. (Previously only $prefix changed.) If you do not like this new behaviour, specify prefix, siteprefix, and vendorprefix explicitly.

  • A new optional location for Perl libraries, otherlibdirs, is available. It can be used for example for vendor add-ons without disturbing Perl's own library directories.

  • On many platforms, the vendor-supplied 'cc' is too stripped-down to build Perl (basically, 'cc' doesn't do ANSI C). If this seems to be the case and 'cc' does not seem to be the GNU C compiler 'gcc', an automatic attempt is made to find and use 'gcc' instead.

  • gcc needs to closely track the operating system release to avoid build problems. If Configure finds that gcc was built for a different operating system release than is running, it now gives a clearly visible warning that there may be trouble ahead.

  • Since Perl 5.8 is not binary-compatible with previous releases of Perl, Configure no longer suggests including the 5.005 modules in @INC.

  • Configure -S can now run non-interactively. [561]

  • Configure support for pdp11-style memory models has been removed due to obsolescence. [561]

  • configure.gnu now works with options with whitespace in them.

  • installperl now outputs everything to STDERR.

  • Because PerlIO is now the default on most platforms, "-perlio" doesn't get appended to the $Config{archname} anymore. Instead, if you explicitly choose not to use perlio (Configure command line option -Uuseperlio), you will get "-stdio" appended.

  • Another change related to the architecture name is that "-64all" (-Duse64bitall, or "maximally 64-bit") is appended only if your pointers are 64 bits wide. (To be exact, the use64bitall is ignored.)

  • In AFS installations, one can configure the root of the AFS to be somewhere else than the default /afs by using the Configure parameter -Dafsroot=/some/where/else.

  • APPLLIB_EXP, a lesser-known configuration-time definition, has been documented. It can be used to prepend site-specific directories to Perl's default search path (@INC); see INSTALL for information.

  • The version of Berkeley DB used when Perl (and, presumably, the DB_File extension) was built is now available as @Config{qw(db_version_major db_version_minor db_version_patch)} from Perl and as DB_VERSION_MAJOR_CFG DB_VERSION_MINOR_CFG DB_VERSION_PATCH_CFG from C.

  • Building Berkeley DB3 for compatibility modes for DB, NDBM, and ODBM has been documented in INSTALL.

  • If you have CPAN access (either network or a local copy such as a CD-ROM) you can specify extra modules for Configure to build and install with Perl using the -Dextras=... option. See INSTALL for more details.

  • In addition to config.over, a new override file, config.arch, is available. This file is supposed to be used by hints file writers for architecture-wide changes (as opposed to config.over which is for site-wide changes).

  • If your file system supports symbolic links, you can build Perl outside of the source directory by

        mkdir perl/build/directory
        cd perl/build/directory
        sh /path/to/perl/source/Configure -Dmksymlinks ...

    This will create in perl/build/directory a tree of symbolic links pointing to files in /path/to/perl/source. The original files are left unaffected. After Configure has finished, you can just say

        make all test

    and Perl will be built and tested, all in perl/build/directory. [561]

  • For Perl developers, several new make targets for profiling and debugging have been added; see perlhack.

    • Use of the gprof tool to profile Perl has been documented in perlhack. There is a make target called "perl.gprof" for generating a gprofiled Perl executable.

    • If you have GCC 3, there is a make target called "perl.gcov" for creating a gcoved Perl executable for coverage analysis. See perlhack.

    • If you are on IRIX or Tru64 platforms, new profiling/debugging options have been added; see perlhack for more information about pixie and Third Degree.

  • Guidelines of how to construct minimal Perl installations have been added to INSTALL.

  • The Thread extension is now not built at all under ithreads (Configure -Duseithreads) because it wouldn't work anyway (the Thread extension requires being Configured with -Duse5005threads).

    Note that the 5.005 threads are unsupported and deprecated: if you have code written for the old threads you should migrate it to the new ithreads model.

  • The Gconvert macro ($Config{d_Gconvert}) used by perl for stringifying floating-point numbers is now more picky about using sprintf %.*g rules for the conversion. Some platforms that used to use gcvt may now resort to the slower sprintf.

  • The obsolete method of making a special (e.g., debugging) flavor of perl by saying

        make LIBPERL=libperld.a

    has been removed. Use -DDEBUGGING instead.
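
Several of the entries above refer to values recorded in the Config module. A minimal sketch of reading them from Perl follows. The Berkeley DB keys may be empty or undefined on a Perl built without Berkeley DB, so the code falls back to a placeholder string (the word "unknown" is just an illustrative choice).

```perl
use strict;
use warnings;
use Config;

# Architecture name; per the notes above, a "-stdio" suffix would
# indicate a build configured without PerlIO.
my $archname = $Config{archname};
print "archname: $archname\n";

# Berkeley DB version seen at build time (hash slice over three keys).
my @db_version = map { defined($_) && length($_) ? $_ : 'unknown' }
    @Config{qw(db_version_major db_version_minor db_version_patch)};
print "Berkeley DB at build time: ", join('.', @db_version), "\n";
```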

New Or Improved Platforms

For the list of platforms known to support Perl, see Supported Platforms in perlport.

  • AIX dynamic loading should now be better supported.

  • AIX should now work better with gcc, threads, and 64-bitness. Also the long doubles support in AIX should be better now. See perlaix.

  • AtheOS ( http://www.atheos.cx/ ) is a new platform.

  • BeOS has been reclaimed.

  • The DG/UX platform now supports 5.005-style threads. See perldgux.

  • The DYNIX/ptx platform (also known as dynixptx) is supported at or near osvers 4.5.2.

  • EBCDIC platforms (z/OS (also known as OS/390), POSIX-BC, and VM/ESA) have been regained. Many test suite tests still fail and the co-existence of Unicode and EBCDIC isn't quite settled, but the situation is much better than with Perl 5.6. See perlos390, perlbs2000 (for POSIX-BC), and perlvmesa for more information. (Note: support for VM/ESA was removed in Perl v5.18.0. The relevant information was in README.vmesa)

  • Building perl with -Duseithreads or -Duse5005threads now works under HP-UX 10.20 (previously it only worked under 10.30 or later). You will need a thread library package installed. See README.hpux. [561]

  • Mac OS Classic is now supported in the mainstream source package (MacPerl has of course been available since perl 5.004 but now the source code bases of standard Perl and MacPerl have been synchronised) [561]

  • Mac OS X (or Darwin) should now be able to build Perl even on HFS+ filesystems. (The case-insensitivity used to confuse the Perl build process.)

  • NCR MP-RAS is now supported. [561]

  • All the NetBSD specific patches (except for the installation specific ones) have been merged back to the main distribution.

  • NetWare from Novell is now supported. See perlnetware.

  • NonStop-UX is now supported. [561]

  • NEC SUPER-UX is now supported.

  • All the OpenBSD specific patches (except for the installation specific ones) have been merged back to the main distribution.

  • Perl has been tested with the GNU pth userlevel thread package ( http://www.gnu.org/software/pth/pth.html ). All thread tests of Perl now work, but not without adding some yield()s to the tests, so while pth (and other userlevel thread implementations) can be considered to be "working" with Perl ithreads, keep in mind the possible non-preemptability of the underlying thread implementation.

  • Stratus VOS is now supported using Perl's native build method (Configure). This is the recommended method to build Perl on VOS. The older methods, which build miniperl, are still available. See perlvos. [561+]

  • The Amdahl UTS Unix mainframe platform is now supported. [561]

  • WinCE is now supported. See perlce.

  • z/OS (formerly known as OS/390, formerly known as MVS OE) now has support for dynamic loading. This is not selected by default, however; you must specify -Dusedl in the arguments of Configure. [561]

Selected Bug Fixes

Numerous memory leaks and uninitialized memory accesses have been hunted down. Most importantly, anonymous subs used to leak quite a bit. [561]

  • The autouse pragma didn't work for Multi::Part::Function::Names.

  • caller() could cause core dumps in certain situations. Carp was sometimes affected by this problem. In particular, caller() now returns a subroutine name of (unknown) for subroutines that have been removed from the symbol table.

  • chop(@list) in list context returned the characters chopped in reverse order. This has been reversed to be in the right order. [561]

  • Configure no longer includes the DBM libraries (dbm, gdbm, db, ndbm) when building the Perl binary. The only exception to this is SunOS 4.x, which needs them. [561]

  • The behaviour of non-decimal but numeric string constants such as "0x23" was platform-dependent: in some platforms that was seen as 35, in some as 0, in some as a floating point number (don't ask). This was caused by Perl's using the operating system libraries in a situation where the result of the string to number conversion is undefined: now Perl consistently handles such strings as zero in numeric contexts.

  • Several debugger fixes: exit code now reflects the script exit code, condition "0" now treated correctly, the d command now checks line number, $. no longer gets corrupted, and all debugger output now goes correctly to the socket if RemotePort is set. [561]

  • The debugger (perl5db.pl) has been modified to present a more consistent commands interface, via (CommandSet=580). perl5db.t was also added to test the changes, and as a placeholder for further tests.

    See perldebug.

  • The debugger has a new dumpDepth option to control the maximum depth to which nested structures are dumped. The x command has been extended so that x N EXPR dumps out the value of EXPR to a depth of at most N levels.

  • The debugger can now show lexical variables if you have the CPAN module PadWalker installed.

  • The order of DESTROYs has been made more predictable.

  • Perl 5.6.0 could emit spurious warnings about redefinition of dl_error() when statically building extensions into perl. This has been corrected. [561]

  • dprofpp -R didn't work.

  • *foo{FORMAT} now works.

  • Infinity is now recognized as a number.

  • UNIVERSAL::isa no longer caches methods incorrectly. (This broke the Tk extension with 5.6.0.) [561]

  • Lexicals I: lexicals outside an eval "" weren't resolved correctly inside a subroutine definition inside the eval "" if they were not already referenced in the top level of the eval""ed code.

  • Lexicals II: lexicals leaked at file scope into subroutines that were declared before the lexicals.

  • Lexical warnings now propagate correctly between scopes and into eval "...".

  • use warnings qw(FATAL all) did not work as intended. This has been corrected. [561]

  • warnings::enabled() now reports the state of $^W correctly if the caller isn't using lexical warnings. [561]

  • Line renumbering with eval and #line now works. [561]

  • Fixed numerous memory leaks, especially in eval "".

  • Localised tied variables no longer leak memory.

        use Tie::Hash;
        tie my %tied_hash => 'Tie::StdHash';
        ...
        # Used to leak memory every time local() was called;
        # in a loop, this added up.
        local($tied_hash{Foo}) = 1;

  • Localised hash elements (and %ENV) are correctly unlocalised to not exist, if they didn't before they were localised.

        use Tie::Hash;
        tie my %tied_hash => 'Tie::StdHash';
        ...
        # Nothing has set the FOO element so far
        { local $tied_hash{FOO} = 'Bar' }
        # This used to print, but not now.
        print "exists!\n" if exists $tied_hash{FOO};

    As a side effect of this fix, tied hash interfaces must define the EXISTS and DELETE methods.

  • mkdir() now ignores trailing slashes in the directory name, as mandated by POSIX.

  • Some versions of glibc have a broken modfl(). This affects builds with -Duselongdouble. This version of Perl detects this brokenness and has a workaround for it. The glibc release 2.2.2 is known to have fixed the modfl() bug.

  • Modulus of unsigned numbers now works (4063328477 % 65535 used to return 27406, instead of 27407). [561]

  • Some "not a number" warnings introduced in 5.6.0 eliminated to be more compatible with 5.005. Infinity is now recognised as a number. [561]

  • Numeric conversions did not recognize changes in the string value properly in certain circumstances. [561]

  • Attributes (such as :shared) didn't work with our().

  • our() variables will not cause bogus "Variable will not stay shared" warnings. [561]

  • "our" variables of the same name declared in two sibling blocks resulted in bogus warnings about "redeclaration" of the variables. The problem has been corrected. [561]

  • pack "Z" now correctly terminates the string with "\0".

  • Fix password routines which in some shadow password platforms (e.g. HP-UX) caused getpwent() to return every other entry.

  • The PERL5OPT environment variable (for passing command line arguments to Perl) didn't work for more than a single group of options. [561]

  • PERL5OPT with embedded spaces didn't work.

  • printf() no longer resets the numeric locale to "C".

  • qw(a\\b) now parses correctly as 'a\\b': that is, as three characters, not four. [561]

  • pos() did not return the correct value within s///ge in earlier versions. This is now handled correctly. [561]

  • Printing quads (64-bit integers) with printf/sprintf now works without the q L ll prefixes (assuming you are on a quad-capable platform).

  • Regular expressions on references and overloaded scalars now work. [561+]

  • Right-hand side magic (GMAGIC) could in many cases such as string concatenation be invoked too many times.

  • scalar() now forces scalar context even when used in void context.

  • SOCKS support is now much more robust.

  • sort() arguments are now compiled in the right wantarray context (they were accidentally using the context of the sort() itself). The comparison block is now run in scalar context, and the arguments to be sorted are always evaluated in list context. [561]

  • Changed the POSIX character class [[:space:]] to include the (very rarely used) vertical tab character. Added a new POSIX-ish character class [[:blank:]] which stands for horizontal whitespace (currently, the space and the tab).

  • The tainting behaviour of sprintf() has been rationalized. It does not taint the result of floating point formats anymore, making the behaviour consistent with that of string interpolation. [561]

  • Some cases of inconsistent taint propagation (such as within hash values) have been fixed.

  • The RE engine found in Perl 5.6.0 accidentally pessimised certain kinds of simple pattern matches. These are now handled better. [561]

  • Regular expression debug output (whether through use re 'debug' or via -Dr) now looks better. [561]

  • Multi-line matches like "a\nxb\n" =~ /(?!\A)x/m were flawed. The bug has been fixed. [561]

  • Use of $& could trigger a core dump under some situations. This is now avoided. [561]

  • The regular expression captured submatches ($1, $2, ...) are now more consistently unset if the match fails, instead of leaving false data lying around in them. [561]

  • readline() on files opened in "slurp" mode could return an extra "" (blank line) at the end in certain situations. This has been corrected. [561]

  • Autovivification of symbolic references of special variables described in perlvar (as in ${$num}) was accidentally disabled. This works again now. [561]

  • Sys::Syslog ignored the LOG_AUTH constant.

  • $AUTOLOAD, sort(), lock(), and spawning subprocesses in multiple threads simultaneously are now thread-safe.

  • Tie::Array's SPLICE method was broken.

  • Allow a read-only string on the left-hand side of a non-modifying tr///.

  • If STDERR is tied, warnings caused by warn and die now correctly pass to it.

  • Several Unicode fixes.

    • BOMs (byte order marks) at the beginning of Perl files (scripts, modules) should now be transparently skipped. UTF-16 and UCS-2 encoded Perl files should now be read correctly.

    • The character tables have been updated to Unicode 3.2.0.

    • Comparing with utf8 data does not magically upgrade non-utf8 data into utf8. (This was a problem for example if you were mixing data from I/O and Unicode data: your output might have got magically encoded as UTF-8.)

    • Generating illegal Unicode code points such as U+FFFE, or the UTF-16 surrogates, now also generates an optional warning.

    • IsAlnum, IsAlpha, and IsWord now match titlecase.

    • Concatenation with the . operator or via variable interpolation, eq, substr, reverse, quotemeta, the x operator, substitution with s///, and single-quoted UTF-8 should now work.

    • The tr/// operator now works. Note that the tr///CU functionality has been removed (but see pack('U0', ...)).

    • eval "v200" now works.

    • Perl 5.6.0 parsed m/\x{ab}/ incorrectly, leading to spurious warnings. This has been corrected. [561]

    • Zero entries were missing from the Unicode classes such as IsDigit.

  • Large unsigned numbers (those above 2**31) could sometimes lose their unsignedness, causing bogus results in arithmetic operations. [561]

  • The Perl parser has been stress tested using both random input and Markov chain input and the few found crashes and lockups have been fixed.
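
A few of the behaviours listed above are easy to confirm from a script. The following is a sketch, not an exhaustive regression test. The variable names are illustrative, and the numeric warning is disabled only so that the "0x23" conversion runs quietly.

```perl
use strict;
use warnings;

# A hexadecimal-looking string is zero in numeric context;
# hex() is the explicit way to convert it.
my $as_number = do { no warnings 'numeric'; "0x23" + 0 };
my $as_hex    = hex("0x23");

# qw() collapses the doubled backslash, leaving three characters: a, \, b.
my ($word) = qw(a\\b);
my $word_length = length $word;

# [[:space:]] includes the vertical tab; [[:blank:]] is horizontal
# whitespace only, so it matches a tab but not a vertical tab.
my $vt_is_space  = "\x0B" =~ /^[[:space:]]$/ ? 1 : 0;
my $vt_is_blank  = "\x0B" =~ /^[[:blank:]]$/ ? 1 : 0;
my $tab_is_blank = "\t"   =~ /^[[:blank:]]$/ ? 1 : 0;

print "$as_number $as_hex $word_length ",
      "$vt_is_space $vt_is_blank $tab_is_blank\n";
```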

Platform Specific Changes and Fixes

  • BSDI 4.*

    Perl now works on post-4.0 BSD/OSes.

  • All BSDs

    Setting $0 now works (as much as possible; see perlvar for details).

  • Cygwin

    Numerous updates; currently synchronised with Cygwin 1.3.10.

  • Previously DYNIX/ptx had problems in its Configure probe for non-blocking I/O.

  • EPOC

    EPOC now better supported. See README.epoc. [561]

  • FreeBSD 3.*

    Perl now works on post-3.0 FreeBSDs.

  • HP-UX

    README.hpux updated; Configure -Duse64bitall now works; now uses HP-UX malloc instead of Perl malloc.

  • IRIX

    Numerous compilation flag and hint enhancements; accidental mixing of 32-bit and 64-bit libraries (a doomed attempt) made much harder.

  • Linux

    • Long doubles should now work (see INSTALL). [561]

    • Linux previously had problems related to sockaddrlen when using accept(), recvfrom() (in Perl: recv()), getpeername(), and getsockname().

  • Mac OS Classic

    Compilation of the standard Perl distribution in Mac OS Classic should now work if you have the Metrowerks development environment and the missing Mac-specific toolkit bits. Contact the macperl mailing list for details.

  • MPE/iX

    MPE/iX update after Perl 5.6.0. See README.mpeix. [561]

  • NetBSD/threads: try installing the GNU pth (should be in the packages collection, or http://www.gnu.org/software/pth/), and Configure with -Duseithreads.

  • NetBSD/sparc

    Perl now works on NetBSD/sparc.

  • OS/2

    Now works with usethreads (see INSTALL). [561]

  • Solaris

    64-bitness using the Sun Workshop compiler now works.

  • Stratus VOS

    The native build method requires at least VOS Release 14.5.0 and GNU C++/GNU Tools 2.0.1 or later. The Perl pack function now maps overflowed values to +infinity and underflowed values to -infinity.

  • Tru64 (aka Digital UNIX, aka DEC OSF/1)

    The operating system version letter now recorded in $Config{osvers}. Allow compiling with gcc (previously explicitly forbidden). Compiling with gcc still not recommended because buggy code results, even with gcc 2.95.2.

  • Unicos

    Fixed various alignment problems that led to core dumps either during build or later; no longer dies on math errors at runtime; now using full quad integers (64 bits), previously was using only 46 bit integers for speed.

  • VMS

    See Socket Extension Dynamic in VMS and IEEE-format Floating Point Default on OpenVMS Alpha for important changes not otherwise listed here.

    chdir() now works better despite a CRT bug; now works with MULTIPLICITY (see INSTALL); now works with Perl's malloc.

    The tainting of %ENV elements via keys or values was previously unimplemented. It now works as documented.

    The waitpid emulation has been improved. The worst bug (now fixed) was that a pid of -1 would cause a wildcard search of all processes on the system.

    POSIX-style signals are now emulated much better on VMS versions prior to 7.0.

    The system function and backticks operator have improved functionality and better error handling. [561]

    File access tests now use current process privileges rather than the user's default privileges, which could sometimes result in a mismatch between reported access and actual access. This improvement is only available on VMS v6.0 and later.

    There is a new kill implementation based on sys$sigprc that allows older VMS systems (pre-7.0) to use kill to send signals rather than simply force exit. This implementation also allows later systems to call kill from within a signal handler.

    Iterative logical name translations are now limited to 10 iterations in imitation of SHOW LOGICAL and other OpenVMS facilities.

  • Windows

    • Signal handling now works better than it used to. It is now implemented using a Windows message loop, and is therefore less prone to random crashes.

    • fork() emulation is now more robust, but still continues to have a few esoteric bugs and caveats. See perlfork for details. [561+]

    • A failed (pseudo)fork now returns undef and sets errno to EAGAIN. [561]

    • The following modules now work on Windows:

      1. ExtUtils::Embed [561]
      2. IO::Pipe
      3. IO::Poll
      4. Net::Ping
    • IO::File::new_tmpfile() is no longer limited to 32767 invocations per-process.

    • Better chdir() return value for a non-existent directory.

    • Compiling perl using the 64-bit Platform SDK tools is now supported.

    • The Win32::SetChildShowWindow() builtin can be used to control the visibility of windows created by child processes. See Win32 for details.

    • Non-blocking waits for child processes (or pseudo-processes) are supported via waitpid($pid, &POSIX::WNOHANG).

    • The behavior of system() with multiple arguments has been rationalized. Each unquoted argument will be automatically quoted to protect whitespace, and any existing whitespace in the arguments will be preserved. This improves the portability of system(@args) by avoiding the need for Windows cmd shell specific quoting in perl programs.

      Note that this means that some scripts that may have relied on earlier buggy behavior may no longer work correctly. For example, system("nmake /nologo", @args) will now attempt to run the file nmake /nologo and will fail when such a file isn't found. On the other hand, perl will now execute code such as system("c:/Program Files/MyApp/foo.exe", @args) correctly.

    • The perl header files no longer suppress common warnings from the Microsoft Visual C++ compiler. This means that additional warnings may now show up when compiling XS code.

    • Borland C++ v5.5 is now a supported compiler that can build Perl. However, the generated binaries continue to be incompatible with those generated by the other supported compilers (GCC and Visual C++). [561]

    • Duping socket handles with open(F, ">&MYSOCK") now works under Windows 9x. [561]

    • Current directory entries in %ENV are now correctly propagated to child processes. [561]

    • New %ENV entries now propagate to subprocesses. [561]

    • Win32::GetCwd() correctly returns C:\ instead of C: when at the drive root. Other bugs in chdir() and Cwd::cwd() have also been fixed. [561]

    • The makefiles now default to the features enabled in ActiveState ActivePerl (a popular Win32 binary distribution). [561]

    • HTML files will now be installed in c:\perl\html instead of c:\perl\lib\pod\html

    • REG_EXPAND_SZ keys are now allowed in registry settings used by perl. [561]

    • Can now send() from all threads, not just the first one. [561]

    • ExtUtils::MakeMaker now uses $ENV{LIB} to search for libraries. [561]

    • Less stack reserved per thread so that more threads can run concurrently. (Still 16M per thread.) [561]

    • File::Spec->tmpdir() now prefers C:/temp over /tmp (works better when perl is running as a service).

    • Better UNC path handling under ithreads. [561]

    • wait(), waitpid(), and backticks now return the correct exit status under Windows 9x. [561]

    • A socket handle leak in accept() has been fixed. [561]
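As a rough illustration of the multi-argument system() behaviour described in the list above, the following sketch passes the program and each argument as separate list elements, so that an argument containing whitespace needs no shell-specific quoting. It uses $^X (the running perl) as a stand-in program so that it can run anywhere; the helper name run_list and the exit codes are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: run a program with its arguments passed as a list
# and return the child's exit code (or -1 if it could not be started).
sub run_list {
    my ($program, @args) = @_;
    my $rc = system($program, @args);
    return -1 if $rc == -1;     # the program could not be started
    return $? >> 8;             # the exit code is in the high byte of $?
}

# The child exits with 3 only if "two words" arrived as a single argument.
my $child_code = 'exit(@ARGV == 1 && $ARGV[0] eq "two words" ? 3 : 1)';
my $exit = run_list($^X, '-e', $child_code, 'two words');
print "child exit code: $exit\n";
```

Keeping the program name in its own list element is the point of the exercise: a first argument such as "nmake /nologo" would be treated as the name of a single file.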

New or Changed Diagnostics

Please see perldiag for more details.

  • Ambiguous range in the transliteration operator (like a-z-9) now gives a warning.

  • chdir("") and chdir(undef) now give a deprecation warning because they cause a possible unintentional chdir to the home directory. Say chdir() if you really mean that.

  • Two new debugging options have been added: if you have compiled your Perl with debugging, you can use the -DT [561] and -DR options to trace tokenising and to add reference counts to the display of variables, respectively.

  • The lexical warnings category "deprecated" is no longer a sub-category of the "syntax" category. It is now a top-level category in its own right.

  • Unadorned dump() will now give a warning suggesting the use of an explicit CORE::dump() if that is what is really meant.

  • The "Unrecognized escape" warning has been extended to include \8, \9, and \_. There is no need to escape any of the \w characters.

  • All regular expression compilation error messages are now hopefully easier to understand both because the error message now comes before the failed regex and because the point of failure is now clearly marked by a <-- HERE marker.

  • Various I/O (and socket) functions like binmode(), close(), and so forth now more consistently warn if they are used illogically either on a yet unopened or on an already closed filehandle (or socket).

  • Using lstat() on a filehandle now gives a warning. (It's a non-sensical thing to do.)

  • The -M and -m options now warn if you didn't supply the module name.

  • If you specify a required minimum version in use, modules matching the name but not defining a $VERSION will cause a fatal failure.

  • Using negative offset for vec() in lvalue context is now a warnable offense.

  • Odd number of arguments to overload::constant now elicits a warning.

  • Odd number of elements in anonymous hash now elicits a warning.

  • The various "opened only for", "on closed", "never opened" warnings drop the main:: prefix for filehandles in the main package, for example STDIN instead of main::STDIN.

  • Subroutine prototypes are now checked more carefully; you may get warnings, for example, if you have used non-prototype characters.

  • If an attempt to use a (non-blessed) reference as an array index is made, a warning is given.

  • push @a; and unshift @a; (with no values to push or unshift) now give a warning. This may be a problem for generated and eval'ed code.

  • If you try to pack a number less than 0 or larger than 255 using the "C" format you will get an optional warning. Similarly for the "c" format and a number less than -128 or more than 127.

  • pack P format now demands an explicit size.

  • unpack w now warns of unterminated compressed integers.

  • Warnings relating to the use of PerlIO have been added.

  • Certain regex modifiers such as (?o) make sense only if applied to the entire regex. You will get an optional warning if you try to do otherwise.

  • Variable length lookbehind has not yet been implemented; trying to use it will tell you so.

  • Using arrays or hashes as references (e.g. %foo->{bar}) has been deprecated for a while. Now you will get an optional warning.

  • Warnings relating to the use of the new restricted hashes feature have been added.

  • Self-ties of arrays and hashes are not supported, and even an attempt to create one causes a fatal error.

  • Using sort in scalar context now issues an optional warning. This didn't do anything useful, as the sort was not performed.

  • Using the /g modifier in split() is meaningless and will cause a warning.

  • Using splice() past the end of an array now causes a warning.

  • Malformed Unicode encodings (UTF-8 and UTF-16) cause a lot of warnings, as does trying to use UTF-16 surrogates (which are unimplemented).

  • Trying to use Unicode characters on an I/O stream without marking the stream's encoding (using open() or binmode()) will cause "Wide character" warnings.

  • Use of v-strings in use/require causes a (backward) portability warning.

  • Warnings relating to the use of interpreter threads and their shared data have been added.
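To see the reworked regular expression error messages from the list above, one can compile a deliberately broken pattern inside eval and look at the resulting message. This is only a small sketch; the helper name regex_error and the sample pattern are invented for the illustration, and the exact wording of the message may vary between Perl versions.

```perl
use strict;
use warnings;

# Hypothetical helper: try to compile a pattern at run time and return
# the error message, or an empty string if the pattern compiles.
sub regex_error {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };
    return defined $re ? '' : $@;
}

my $message = regex_error('a(b');   # unbalanced parenthesis
print $message;
```

The message names the problem first and then shows the pattern with the point of failure marked by <-- HERE.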

Changed Internals

  • PerlIO is now the default.

  • perlapi.pod (a companion to perlguts) now attempts to document the internal API.

  • You can now build a really minimal perl called microperl. Building microperl does not require even running Configure; make -f Makefile.micro should be enough. Beware: microperl makes many assumptions, some of which may be too bold; the resulting executable may crash or otherwise misbehave in wondrous ways. For careful hackers only.

  • Added rsignal(), whichsig(), do_join(), op_clear, op_null, ptr_table_clear(), ptr_table_free(), sv_setref_uv(), and several UTF-8 interfaces to the publicised API. For the full list of the available APIs see perlapi.

  • Made it possible to propagate customised exceptions via croak()ing.

  • Now xsubs can have attributes just like subs. (Well, at least the built-in attributes.)

  • dTHR and djSP have been obsoleted; the former removed (because it's a no-op) and the latter replaced with dSP.

  • PERL_OBJECT has been completely removed.

  • The MAGIC constants (e.g. 'P') have been macrofied (e.g. PERL_MAGIC_TIED) for better source code readability and maintainability.

  • The regex compiler now maintains a structure that identifies nodes in the compiled bytecode with the corresponding syntactic features of the original regular expression. The information is attached to the new offsets member of the struct regexp. See perldebguts for more complete information.

  • The C code has been made much more gcc -Wall clean. Some warning messages still remain in some platforms, so if you are compiling with gcc you may see some warnings about dubious practices. The warnings are being worked on.

  • perly.c, sv.c, and sv.h have now been extensively commented.

  • Documentation on how to use the Perl source repository has been added to Porting/repository.pod.

  • There are now several profiling make targets.

Security Vulnerability Closed [561]

(This change was already made in 5.7.0 but bears repeating here.) (5.7.0 came out before 5.6.1: the development branch 5.7 was released earlier than the maintenance branch 5.6.)

A potential security vulnerability in the optional suidperl component of Perl was identified in August 2000. suidperl is neither built nor installed by default. As of November 2001 the only known vulnerable platform is Linux, most likely all Linux distributions. CERT and various vendors and distributors have been alerted about the vulnerability. See http://www.cpan.org/src/5.0/sperl-2000-08-05/sperl-2000-08-05.txt for more information.

The problem was caused by Perl trying to report a suspected security exploit attempt using an external program, /bin/mail. On Linux platforms the /bin/mail program had an undocumented feature which when combined with suidperl gave access to a root shell, resulting in a serious compromise instead of reporting the exploit attempt. If you don't have /bin/mail, or if you have 'safe setuid scripts', or if suidperl is not installed, you are safe.

The exploit attempt reporting feature has been completely removed from Perl 5.8.0 (and the maintenance release 5.6.1, and it was removed also from all the Perl 5.7 releases), so that particular vulnerability isn't there anymore. However, further security vulnerabilities are, unfortunately, always possible. The suidperl functionality is most probably going to be removed in Perl 5.10. In any case, suidperl should only be used by security experts who know exactly what they are doing and why they are using suidperl instead of some other solution such as sudo ( see http://www.courtesan.com/sudo/ ).

New Tests

Several new tests have been added, especially for the lib and ext subsections. There are now about 69 000 individual tests (spread over about 700 test scripts) in the regression suite (5.6.1 has about 11 700 tests, in 258 test scripts). The exact numbers depend on the platform and Perl configuration used. Many of the new tests are of course introduced by the new modules, but still in general Perl is now more thoroughly tested.

Because of the large number of tests, running the regression suite will take considerably longer than it used to: expect the suite to take up to 4-5 times longer to run than in perl 5.6. On a really fast machine you can hope to finish the suite in about 6-8 minutes (wallclock time).

The tests are now reported in a different order than in earlier Perls. (This happens because the test scripts from under t/lib have been moved to be closer to the library/extension they are testing.)

Known Problems

The Compiler Suite Is Still Very Experimental

The compiler suite is slowly getting better but it continues to be highly experimental. Use in production environments is discouraged.

Localising Tied Arrays and Hashes Is Broken

    local %tied_array;

doesn't work as one would expect: the old value is restored incorrectly. This will be changed in a future release, but we don't know yet exactly what the new semantics will be. In any case, the change will break existing code that relies on the current (ill-defined) semantics, so just avoid doing this in general.

Building Extensions Can Fail Because Of Largefiles

Some extensions like mod_perl are known to have issues with 'largefiles', a change brought by Perl 5.6.0 in which file offsets default to 64 bits wide, where supported. Modules may fail to compile at all, or they may compile and work incorrectly. Currently, there is no good solution for the problem, but Configure now provides appropriate non-largefile ccflags, ldflags, libswanted, and libs in the %Config hash (e.g., $Config{ccflags_nolargefiles}) so the extensions that are having problems can try configuring themselves without the largefileness. This is admittedly not a clean solution, and the solution may not even work at all. One potential failure is whether one can (or, if one can, whether it's a good idea to) link together binaries with different ideas about file offsets at all; all this is platform-dependent.
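An extension's build script could inspect these settings along the following lines. This is only a sketch: the text above names $Config{ccflags_nolargefiles} explicitly, while the other three key names are assumed here to follow the same _nolargefiles suffix, and the helper name nolargefile_settings is made up.

```perl
use strict;
use warnings;
use Config;

# Collect the non-largefile build settings recorded by Configure.
# Key names other than ccflags_nolargefiles are an assumption based on
# the naming pattern; missing entries are reported as empty strings.
sub nolargefile_settings {
    my %settings;
    for my $name (qw(ccflags ldflags libswanted libs)) {
        my $value = $Config{"${name}_nolargefiles"};
        $settings{$name} = defined $value ? $value : '';
    }
    return \%settings;
}

my $settings = nolargefile_settings();
my $uselargefiles = defined $Config{uselargefiles} ? $Config{uselargefiles} : 'undef';
print "uselargefiles: $uselargefiles\n";
print "$_ (no largefiles): $settings->{$_}\n" for sort keys %$settings;
```

Whether building with these flags actually produces a working extension remains platform-dependent, as noted above.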

Modifying $_ Inside for(..)

    for (1..5) { $_++ }

works without complaint. It shouldn't. (You should be able to modify only lvalue elements inside the loops.) You can see the correct behaviour by replacing the 1..5 with 1, 2, 3, 4, 5.
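The distinction can be tried out with a short script such as the one below. It is a sketch only: it shows the in-place modification of real array elements, and wraps the attempt on a list of literals in eval so that whatever error the running Perl raises can be printed rather than ending the program.

```perl
use strict;
use warnings;

# Elements of a real array are lvalues, so $_ aliases them and the
# increment changes the array in place.
my @numbers = (1, 2, 3, 4, 5);
$_++ for @numbers;
print "@numbers\n";    # prints "2 3 4 5 6"

# A list of literal constants is not modifiable. The attempt is wrapped
# in eval so that any resulting error can be inspected.
my $error = '';
my $ok = eval {
    for (1, 2, 3, 4, 5) { $_++ }
    1;
};
$error = $@ unless $ok;
print "literal list: ", ($error ne '' ? $error : "no error\n");
```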

mod_perl 1.26 Doesn't Build With Threaded Perl

Use mod_perl 1.27 or higher.

lib/ftmp-security tests warn 'system possibly insecure'

Don't panic. Read the 'make test' section of INSTALL instead.

libwww-perl (LWP) fails base/date #51

Use libwww-perl 5.65 or later.

PDL failing some tests

Use PDL 2.3.4 or later.

Perl_get_sv

You may get errors like 'Undefined symbol "Perl_get_sv"' or "can't resolve symbol 'Perl_get_sv'", or the symbol may be "Perl_sv_2pv". This probably means that you are trying to use an older shared Perl library (or extensions linked with such) with a Perl 5.8.0 executable. Perl used to have such a subroutine, but that is no longer the case. Check your shared library path, and any shared Perl libraries in those directories.

Sometimes this problem may also indicate a partial Perl 5.8.0 installation; see Mac OS X dyld undefined symbols for an example and how to deal with it.

Self-tying Problems

Self-tying of arrays and hashes is broken in rather deep and hard-to-fix ways. As a stop-gap measure to keep people from getting frustrated at the mysterious results (core dumps, most often), it is forbidden for now (you will get a fatal error even from an attempt).

A change to self-tying of globs has caused them to be recursively referenced (see: Two-Phased Garbage Collection in perlobj). You will now need an explicit untie to destroy a self-tied glob. This behaviour may be fixed at a later date.

Self-tying of scalars and IO thingies works.

ext/threads/t/libc

If this test fails, it indicates that your libc (C library) is not threadsafe. This particular test stress tests the localtime() call to find out whether it is threadsafe. See perlthrtut for more information.

Failure of Thread (5.005-style) tests

Note that support for 5.005-style threading is deprecated, experimental and practically unsupported. In 5.10, it is expected to be removed. You should migrate your code to ithreads.

The following tests are known to fail due to fundamental problems in the 5.005 threading implementation. These are not new failures--Perl 5.005_0x has the same bugs, but didn't have these tests.

    Failed Test                          Stat Wstat Total Fail  Failed  List of Failed
    -----------------------------------------------------------------------------------
    ../ext/B/t/xref.t                     255 65280    14   12  85.71%  3-14
    ../ext/List/Util/t/first.t            255 65280     7    4  57.14%  2 5-7
    ../lib/English.t                        2   512    54    2   3.70%  2-3
    ../lib/FileCache.t                                  5    1  20.00%  5
    ../lib/Filter/Simple/t/data.t                       6    3  50.00%  1-3
    ../lib/Filter/Simple/t/filter_only.                 9    3  33.33%  1-2 5
    ../lib/Math/BigInt/t/bare_mbf.t                  1627    4   0.25%  8 11 1626-1627
    ../lib/Math/BigInt/t/bigfltpm.t                  1629    4   0.25%  10 13 1628-1629
    ../lib/Math/BigInt/t/sub_mbf.t                   1633    4   0.24%  8 11 1632-1633
    ../lib/Math/BigInt/t/with_sub.t                  1628    4   0.25%  9 12 1627-1628
    ../lib/Tie/File/t/31_autodefer.t      255 65280    65   32  49.23%  34-65
    ../lib/autouse.t                                   10    1  10.00%  4
    op/flip.t                                          15    1   6.67%  15

These failures are unlikely to get fixed as 5.005-style threads are considered fundamentally broken. (Basically what happens is that competing threads can corrupt shared global state, one good example being the regular expression engine's state.)

Timing problems

The following tests may fail intermittently because of timing problems, for example if the system is heavily loaded.

    t/op/alarm.t
    ext/Time/HiRes/HiRes.t
    lib/Benchmark.t
    lib/Memoize/t/expmod_t.t
    lib/Memoize/t/speed.t

In case of failure please try running them manually, for example

    ./perl -Ilib ext/Time/HiRes/HiRes.t

Tied/Magical Array/Hash Elements Do Not Autovivify

For normal arrays $foo = \$bar[1] will assign undef to $bar[1] (assuming that it didn't exist before), but for tied/magical arrays and hashes such autovivification does not happen because there is currently no way to catch the reference creation. The same problem affects slicing over non-existent indices/keys of a tied/magical array/hash.
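The difference can be observed with a few lines of Perl. The sketch below takes a reference to a non-existent element of a plain array and of a tied array and prints the resulting element counts; Tie::StdArray (from the core Tie::Array module) is used only as a convenient tied array for the comparison, and the variable names are arbitrary.

```perl
use strict;
use warnings;
use Tie::Array;   # provides Tie::StdArray

# Plain array: taking a reference to a missing element creates it.
my @plain;
my $plain_ref   = \$plain[1];
my $plain_count = scalar @plain;
print "plain array now has $plain_count elements\n";

# Tied array: the same operation, reported for comparison.
tie my @tied, 'Tie::StdArray';
my $tied_ref   = \$tied[1];
my $tied_count = scalar @tied;
print "tied array now has $tied_count elements\n";
```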

Unicode in package/class and subroutine names does not work

One can have Unicode in identifier names, but not in package/class or subroutine names. While some limited functionality towards this does exist as of Perl 5.8.0, that is more accidental than designed; use of Unicode for the said purposes is unsupported.

One reason for this unfinished state is its (currently) inherent unportability: since both package names and subroutine names may need to be mapped to file and directory names, the Unicode capability of the filesystem becomes important-- and there unfortunately aren't portable answers.

Platform Specific Problems

AIX

  • If using the AIX native make command, instead of just "make" issue "make all". In some setups the former has been known to spuriously also try to run "make install". Alternatively, you may want to use GNU make.

  • In AIX 4.2, Perl extensions that use C++ functions that use statics may have problems in that the statics are not getting initialized. In newer AIX releases, this has been solved by linking Perl with the libC_r library, but unfortunately in AIX 4.2 the said library has an obscure bug where the various functions related to time (such as time() and gettimeofday()) return broken values, and therefore in AIX 4.2 Perl is not linked against libC_r.

  • vac 5.0.0.0 May Produce Buggy Code For Perl

    The AIX C compiler vac version 5.0.0.0 may produce buggy code, resulting in a few random tests failing when run as part of "make test", but when the failing tests are run by hand, they succeed. We suggest upgrading to at least vac version 5.0.1.0, which is known to compile Perl correctly. "lslpp -L|grep vac.C" will tell you the vac version. See README.aix.

  • If building threaded Perl, you may get a compilation warning from pp_sys.c:

        "pp_sys.c", line 4651.39: 1506-280 (W) Function argument assignment between types "unsigned char*" and "const void*" is not allowed.

    This is harmless; it is caused by getnetbyaddr() and getnetbyaddr_r() having slightly different types for their first argument.

Alpha systems with old gccs fail several tests

If you see op/pack, op/pat, op/regexp, or ext/Storable tests failing on Linux/Alpha or *BSD/Alpha, it's probably time to upgrade your gcc. gccs prior to 2.95.3 are definitely not good enough, and gcc 3.1 may be even better. (RedHat Linux/alpha with gcc 3.1 reported no problems, as did Linux 2.4.18 with gcc 2.95.4.) (In Tru64, it is preferable to use the bundled C compiler.)

AmigaOS

Perl 5.8.0 doesn't build in AmigaOS. It broke at some point during the ithreads work and we could not find Amiga experts to unbreak the problems. Perl 5.6.1 still works for AmigaOS (as does the 5.7.2 development release).

BeOS

The following tests fail on 5.8.0 Perl in BeOS Personal 5.03:

    t/op/lfs............................FAILED at test 17
    t/op/magic..........................FAILED at test 24
    ext/Fcntl/t/syslfs..................FAILED at test 17
    ext/File/Glob/t/basic...............FAILED at test 3
    ext/POSIX/t/sigaction...............FAILED at test 13
    ext/POSIX/t/waitpid.................FAILED at test 1

(Note: more information was available in README.beos until support for BeOS was removed in Perl v5.18.0)

Cygwin "unable to remap"

For example, when building the Tk extension for Cygwin, you may get an error message saying "unable to remap". This is a known problem with Cygwin, and a workaround is detailed here: http://sources.redhat.com/ml/cygwin/2001-12/msg00894.html

Cygwin ndbm tests fail on FAT

One can build but not install (or test the build of) the NDBM_File on FAT filesystems. Installation (or build) on NTFS works fine. If one attempts the test on a FAT install (or build) the following failures are expected:

    Failed Test                  Stat Wstat Total Fail  Failed  List of Failed
    ---------------------------------------------------------------------------
    ../ext/NDBM_File/ndbm.t        13  3328    71   59  83.10%  1-2 4 16-71
    ../ext/ODBM_File/odbm.t       255 65280    ??   ??       %  ??
    ../lib/AnyDBM_File.t            2   512    12    2  16.67%  1 4
    ../lib/Memoize/t/errors.t       0   139    11    5  45.45%  7-11
    ../lib/Memoize/t/tie_ndbm.t    13  3328     4    4 100.00%  1-4
    run/fresh_perl.t                           97    1   1.03%  91

NDBM_File fails and ODBM_File just coredumps.

If you intend to run only on FAT (or if using AnyDBM_File on FAT), run Configure with the -Ui_ndbm and -Ui_dbm options to prevent NDBM_File and ODBM_File being built.

DJGPP Failures

    t/op/stat............................FAILED at test 29
    lib/File/Find/t/find.................FAILED at test 1
    lib/File/Find/t/taint................FAILED at test 1
    lib/h2xs.............................FAILED at test 15
    lib/Pod/t/eol........................FAILED at test 1
    lib/Test/Harness/t/strap-analyze.....FAILED at test 8
    lib/Test/Harness/t/test-harness......FAILED at test 23
    lib/Test/Simple/t/exit...............FAILED at test 1

The above failures are known as of 5.8.0 with native builds with long filenames, but there are a few more if running under dosemu because of limitations (and maybe bugs) of dosemu:

    t/comp/cpp...........................FAILED at test 3
    t/op/inccode.........................(crash)

and a few lib/ExtUtils tests, and several hundred Encode/t/Aliases.t failures that work fine with long filenames. So you really might prefer native builds and long filenames.

FreeBSD built with ithreads coredumps reading large directories

This is a known bug in FreeBSD 4.5's readdir_r(); it has been fixed in FreeBSD 4.6 (see perlfreebsd (README.freebsd)).

FreeBSD Failing locale Test 117 For ISO 8859-15 Locales

The ISO 8859-15 locales may fail the locale test 117 in FreeBSD. This is caused by the characters \xFF (y with diaeresis) and \xBE (Y with diaeresis) not behaving correctly when being matched case-insensitively. Apparently this problem has been fixed in the latest FreeBSD releases. ( http://www.freebsd.org/cgi/query-pr.cgi?pr=34308 )

IRIX fails ext/List/Util/t/shuffle.t or Digest::MD5

IRIX with MIPSpro 7.3.1.2m or 7.3.1.3m compiler may fail the List::Util test ext/List/Util/t/shuffle.t by dumping core. This seems to be a compiler error since if compiled with gcc no core dump ensues, and no failures have been seen on the said test on any other platform.

Similarly, building the Digest::MD5 extension has been known to fail with "*** Termination code 139 (bu21)".

The cure is to lower the optimization level (Configure -Doptimize=-O2).

HP-UX lib/posix Subtest 9 Fails When LP64-Configured

If perl is configured with -Duse64bitall, the successful result of the subtest 10 of lib/posix may arrive before the successful result of the subtest 9, which confuses the test harness so much that it thinks the subtest 9 failed.

Linux with glibc 2.2.5 fails t/op/int subtest #6 with -Duse64bitint

This is a known bug in glibc 2.2.5 with long long integers. ( http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=65612 )

Linux With Sfio Fails op/misc Test 48

No known fix.

Mac OS X

Please remember to set your environment variable LC_ALL to "C" (setenv LC_ALL C) before running "make test" to avoid a lot of warnings about the broken locales of Mac OS X.

The following tests are known to fail in Mac OS X 10.1.5 because of buggy (old) implementations of Berkeley DB included in Mac OS X:

    Failed Test                  Stat Wstat Total Fail  Failed  List of Failed
    -------------------------------------------------------------------------
    ../ext/DB_File/t/db-btree.t     0    11    ??   ??       %  ??
    ../ext/DB_File/t/db-recno.t               149    3   2.01%  61 63 65

If you are building on a UFS partition, you will also probably see t/op/stat.t subtest #9 fail. This is caused by Darwin's UFS not supporting inode change time.

Also the ext/POSIX/t/posix.t subtest #10 fails but it is skipped for now because the failure is Apple's fault, not Perl's (blocked signals are lost).

If you Configure with ithreads, ext/threads/t/libc.t will fail. Again, this is not Perl's fault-- the libc of Mac OS X is not threadsafe (in this particular test, the localtime() call is found to be threadunsafe.)

Mac OS X dyld undefined symbols

If after installing Perl 5.8.0 you are getting warnings about missing symbols, for example

    dyld: perl Undefined symbols
    _perl_sv_2pv
    _perl_get_sv

you probably have an old pre-Perl-5.8.0 installation (or parts of one) in /Library/Perl (the undefined symbols used to exist in pre-5.8.0 Perls). It seems that for some reason "make install" doesn't always completely overwrite the files in /Library/Perl. You can move the old Perl shared library out of the way like this:

    cd /Library/Perl/darwin/CORE
    mv libperl.dylib libperlold.dylib

and then reissue "make install". Note that the above of course is extremely disruptive for anything using /usr/local/bin/perl. If that doesn't help, you may have to try removing all the .bundle files from beneath /Library/Perl, and again "make install"-ing.

OS/2 Test Failures

The following tests are known to fail on OS/2 (for clarity only the failures are shown, not the full error messages):

    Failed Test                      Stat Wstat Total Fail  Failed  List of Failed
    -------------------------------------------------------------------------------
    ../lib/ExtUtils/t/Mkbootstrap.t     1   256    18    1   5.56%  8
    ../lib/ExtUtils/t/Packlist.t        1   256    34    1   2.94%  17
    ../lib/ExtUtils/t/basic.t           1   256    17    1   5.88%  14
    lib/os2_process.t                   2   512   227    2   0.88%  174 209
    lib/os2_process_kid.t                         227    2   0.88%  174 209
    lib/rx_cmprt.t                    255 65280    18    3  16.67%  16-18

op/sprintf tests 91, 129, and 130

The op/sprintf tests 91, 129, and 130 are known to fail on some platforms. Examples include any platform using sfio, and Compaq/Tandem's NonStop-UX.

Test 91 is known to fail on QNX6 (nto), because sprintf '%e',0 incorrectly produces 0.000000e+0 instead of 0.000000e+00.

For tests 129 and 130, the failing platforms do not comply with the ANSI C Standard: lines 19ff on page 134 of ANSI X3.159 1989, to be exact. (They produce something other than "1" and "-1" when formatting 0.6 and -0.6 using the printf format "%.0f"; most often, they produce "0" and "-0".)
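A quick way to see how a given platform behaves is to format the same values by hand and compare them with the results the text above gives as correct. The following is only a sketch of such a check, not the actual op/sprintf tests; the hash keys are arbitrary labels.

```perl
use strict;
use warnings;

# Format the values discussed above and compare them against the
# results described as correct.
my %got = (
    exponent       => sprintf('%e',   0),
    round_positive => sprintf('%.0f', 0.6),
    round_negative => sprintf('%.0f', -0.6),
);
my %expected = (
    exponent       => '0.000000e+00',
    round_positive => '1',
    round_negative => '-1',
);

for my $name (sort keys %expected) {
    my $status = $got{$name} eq $expected{$name} ? 'ok' : 'differs';
    print "$name: got '$got{$name}', expected '$expected{$name}' ($status)\n";
}
```

A platform reporting "differs" for any of the three entries is a likely candidate for the failures listed in this section.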

SCO

The socketpair tests are known to be unhappy in SCO 3.2v5.0.4:

    ext/Socket/socketpair.t...............FAILED tests 15-45

Solaris 2.5

In case you are still using Solaris 2.5 (aka SunOS 5.5), you may experience failures (the test core dumping) in lib/locale.t. The suggested cure is to upgrade your Solaris.

Solaris x86 Fails Tests With -Duse64bitint

The following tests are known to fail in Solaris x86 with Perl configured to use 64 bit integers:

    ext/Data/Dumper/t/dumper.............FAILED at test 268
    ext/Devel/Peek/Peek..................FAILED at test 7

SUPER-UX (NEC SX)

The following tests are known to fail on SUPER-UX:

    op/64bitint...........................FAILED tests 29-30, 32-33, 35-36
    op/arith..............................FAILED tests 128-130
    op/pack...............................FAILED tests 25-5625
    op/pow................................
    op/taint..............................# msgsnd failed
    ../ext/IO/lib/IO/t/io_poll............FAILED tests 3-4
    ../ext/IPC/SysV/ipcsysv...............FAILED tests 2, 5-6
    ../ext/IPC/SysV/t/msg.................FAILED tests 2, 4-6
    ../ext/Socket/socketpair..............FAILED tests 12
    ../lib/IPC/SysV.......................FAILED tests 2, 5-6
    ../lib/warnings.......................FAILED tests 115-116, 118-119

The op/pack failure ("Cannot compress negative numbers at op/pack.t line 126") is serious but as yet unsolved. It points at some problems with the signedness handling of the C compiler, as do the 64bitint, arith, and pow failures. Most of the rest point at problems with SysV IPC.

Term::ReadKey not working on Win32

Use Term::ReadKey 2.20 or later.

UNICOS/mk

  • During Configure, the test

        Guessing which symbols your C compiler and preprocessor define...

    will probably fail with error messages like

        CC-20 cc: ERROR File = try.c, Line = 3
        The identifier "bad" is undefined.
        bad switch yylook 79bad switch yylook 79bad switch yylook 79bad switch yylook 79#ifdef A29K
        ^
        CC-65 cc: ERROR File = try.c, Line = 3
        A semicolon is expected at this point.

    This is caused by a bug in the awk utility of UNICOS/mk. You can ignore the error, but it does cause a slight problem: you cannot fully benefit from the h2ph utility (see h2ph) that can be used to convert C headers to Perl libraries, mainly used to be able to access from Perl the constants defined using C preprocessor, cpp. Because of the above error, parts of the converted headers will be invisible. Luckily, these days the need for h2ph is rare.

  • If building Perl with interpreter threads (ithreads), the getgrent(), getgrnam(), and getgrgid() functions cannot return the list of the group members due to a bug in the multithreaded support of UNICOS/mk. What this means is that in list context the functions will return only three values, not four.
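The number of values returned in list context can be checked with a short script like the one below. This is a sketch that assumes a platform where getgrgid() is implemented and where the current group has an entry in the group database; the variable names are arbitrary.

```perl
use strict;
use warnings;

# Look up the group of the current process in list context.
# $( holds a space-separated list of group IDs; the first is the real gid.
my ($gid) = split ' ', $(;
my @fields = getgrgid($gid);

if (@fields) {
    my ($name, $passwd, $group_id, $members) = @fields;
    printf "%d values returned; group %s, members: %s\n",
        scalar @fields, $name,
        defined $members ? $members : '(not returned)';
}
else {
    print "no group entry found for gid $gid\n";
}
```

On an unaffected platform the lookup yields four values, the fourth being the member list described above.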

UTS

There are a few known test failures. (Note: the relevant information was available in README.uts until support for UTS was removed in Perl v5.18.0)

VOS (Stratus)

When Perl is built using the native build process on VOS Release 14.5.0 and GNU C++/GNU Tools 2.0.1, all attempted tests either pass or result in TODO (ignored) failures.

VMS

There should be no reported test failures with a default configuration, though there are a number of tests marked TODO that point to areas needing further debugging and/or porting work.

Win32

On multi-CPU boxes, there are some problems with the I/O buffering: some output may appear twice.

XML::Parser not working

Use XML::Parser 2.31 or later.

z/OS (OS/390)

z/OS has quite a few test failures, but the situation is actually much better than it was in 5.6.0; it's just that so many new modules and tests have been added.

    Failed Test                    Stat Wstat Total Fail  Failed  List of Failed
    ---------------------------------------------------------------------------
    ../ext/Data/Dumper/t/dumper.t               357    8   2.24%  311 314 325 327 331 333 337 339
    ../ext/IO/lib/IO/t/io_unix.t                  5    4  80.00%  2-5
    ../ext/Storable/t/downgrade.t    12  3072   169   12   7.10%  14-15 46-47 78-79 110-111 150 161
    ../lib/ExtUtils/t/Constant.t    121 30976    48   48 100.00%  1-48
    ../lib/ExtUtils/t/Embed.t                     9    9 100.00%  1-9
    op/pat.t                                    922    7   0.76%  665 776 785 832-834 845
    op/sprintf.t                                224    3   1.34%  98 100 136
    op/tr.t                                      97    5   5.15%  63 71-74
    uni/fold.t                                  780    6   0.77%  61 169 196 661 710-711

The failures in dumper.t and downgrade.t are problems in the tests; those in io_unix and sprintf are problems in the USS (UDP sockets and printf formats). The pat, tr, and fold failures are genuine Perl problems caused by EBCDIC (and in the pat and fold cases, combining that with Unicode). The Constant and Embed failures are probably problems in the tests (since they test Perl's ability to build extensions, and that seems to be working reasonably well).

Unicode Support on EBCDIC Still Spotty

Though mostly working, Unicode support still has problem spots on EBCDIC platforms. One such known spot is the \p{} and \P{} regular expression constructs for code points less than 256: \p{} and \P{} test for Unicode code points, not knowing about EBCDIC.

Seen In Perl 5.7 But Gone Now

Time::Piece (previously known as Time::Object ) was removed because it was felt that it didn't have enough value in it to be a core module. It is still a useful module, though, and is available from the CPAN.

Perl 5.8 unfortunately does not build anymore on AmigaOS; this broke accidentally at some point. Since there are not that many Amiga developers available, we could not get this fixed and tested in time for 5.8.0. Perl 5.6.1 still works for AmigaOS (as does the 5.7.2 development release).

PerlIO::Scalar and PerlIO::Via (capitalised) were renamed PerlIO::scalar and PerlIO::via (all lowercase) just before 5.8.0. The main rationale was to give all core PerlIO layers all-lowercase names. The "plugins" are named as usual, for example PerlIO::via::QuotedPrint .

threads::shared::queue and threads::shared::semaphore were renamed Thread::Queue and Thread::Semaphore just before 5.8.0. The main rationale was to have thread modules follow the normal naming, Thread:: (threads and threads::shared themselves are more pragma-like, they affect compile-time, so they stay lowercase).

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org/ . There may also be information at http://www.perl.com/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V , will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

SEE ALSO

The Changes file for exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

HISTORY

Written by Jarkko Hietaniemi <jhi@iki.fi>.

perlaix

NAME

perlaix - Perl version 5 on IBM AIX (UNIX) systems

DESCRIPTION

This document describes various features of IBM's UNIX operating system AIX that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

Compiling Perl 5 on AIX

For information on compilers on older versions of AIX, see Compiling Perl 5 on older AIX versions up to 4.3.3.

When compiling Perl, you must use an ANSI C compiler. AIX does not ship an ANSI compliant C compiler with AIX by default, but binary builds of gcc for AIX are widely available. A version of gcc is also included in the AIX Toolbox which is shipped with AIX.

Supported Compilers

Currently all versions of IBM's "xlc", "xlc_r", "cc", "cc_r" or "vac" ANSI/C compiler will work for building Perl if that compiler works on your system.

If you plan to link Perl to any module that requires thread-support, like DBD::Oracle, it is better to use the _r version of the compiler. This will not build a threaded Perl, but a thread-enabled Perl. See also Threaded Perl later on.

As of writing (2010-09) only the IBM XL C for AIX or IBM XL C/C++ for AIX compiler is supported by IBM on AIX 5L/6.1/7.1.

The following compiler versions are currently supported by IBM:

  1. IBM XL C and IBM XL C/C++ V8, V9, V10, V11

The XL C for AIX is integrated in the XL C/C++ for AIX compiler and therefore also supported.

If you choose XL C/C++ V9 you need APAR IZ35785 installed, otherwise the integrated SDBM_File does not compile correctly due to an optimization bug. You can circumvent this problem by adding -qipa to the optimization flags (-Doptimize='-O -qipa'). The PTF for APAR IZ35785 which solves this problem is available from IBM (April 2009 PTF for XL C/C++ Enterprise Edition for AIX, V9.0).

If you choose XL C/C++ V11 you need the April 2010 PTF (or newer) installed, otherwise you will not get a working Perl version.

Perl can be compiled with either IBM's ANSI C compiler or with gcc. The former is recommended, as not only can it compile Perl with no difficulty, but it can also take advantage of features listed later that require the use of IBM compiler-specific command-line flags.

If you decide to use gcc, make sure your installation is recent and complete, and be sure to read the Perl INSTALL file for more gcc-specific details. Please report any hoops you had to jump through to the development team.

Incompatibility with AIX Toolbox lib gdbm

If the AIX Toolbox version of lib gdbm < 1.8.3-5 is installed on your system then Perl will not work. This library contains the header files /opt/freeware/include/gdbm/dbm.h|ndbm.h which conflict with the AIX system versions. The lib gdbm will be automatically removed from the wanted libraries if the presence of one of these two header files is detected. If you want to build Perl with GDBM support then please install at least gdbm-devel-1.8.3-5 (or higher).
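Before running Configure, the presence of the conflicting headers can be checked by hand. The following is only a minimal sketch and not part of the Perl build: the function name has_conflicting_gdbm_headers is hypothetical, and the default directory is simply the path mentioned above.

```shell
# Hypothetical helper (not part of Perl's Configure): report whether the
# AIX Toolbox gdbm compatibility headers named above exist in a directory.
has_conflicting_gdbm_headers() {
    dir=${1:-/opt/freeware/include/gdbm}
    for header in dbm.h ndbm.h; do
        if [ -f "$dir/$header" ]; then
            echo "conflict: $dir/$header"
            return 0
        fi
    done
    echo "no conflicting gdbm headers in $dir"
    return 1
}

# Example: check the default AIX Toolbox location.
has_conflicting_gdbm_headers || true
```

A non-zero return status means neither header was found in the directory that was checked; it says nothing about which gdbm version is installed.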

Perl 5 was successfully compiled and tested on:

    Perl   | AIX Level           | Compiler Level          | w th | w/o th
    -------+---------------------+-------------------------+------+-------
    5.12.2 | 5.1 TL9 32 bit      | XL C/C++ V7             | OK   | OK
    5.12.2 | 5.1 TL9 64 bit      | XL C/C++ V7             | OK   | OK
    5.12.2 | 5.2 TL10 SP8 32 bit | XL C/C++ V8             | OK   | OK
    5.12.2 | 5.2 TL10 SP8 32 bit | gcc 3.2.2               | OK   | OK
    5.12.2 | 5.2 TL10 SP8 64 bit | XL C/C++ V8             | OK   | OK
    5.12.2 | 5.3 TL8 SP8 32 bit  | XL C/C++ V9 + IZ35785   | OK   | OK
    5.12.2 | 5.3 TL8 SP8 32 bit  | gcc 4.2.4               | OK   | OK
    5.12.2 | 5.3 TL8 SP8 64 bit  | XL C/C++ V9 + IZ35785   | OK   | OK
    5.12.2 | 5.3 TL10 SP3 32 bit | XL C/C++ V11 + Apr 2010 | OK   | OK
    5.12.2 | 5.3 TL10 SP3 64 bit | XL C/C++ V11 + Apr 2010 | OK   | OK
    5.12.2 | 6.1 TL1 SP7 32 bit  | XL C/C++ V10            | OK   | OK
    5.12.2 | 6.1 TL1 SP7 64 bit  | XL C/C++ V10            | OK   | OK
    5.13   | 7.1 TL0 SP1 32 bit  | XL C/C++ V11 + Jul 2010 | OK   | OK
    5.13   | 7.1 TL0 SP1 64 bit  | XL C/C++ V11 + Jul 2010 | OK   | OK

    w th   = with thread support
    w/o th = without thread support
    OK     = tested

Successfully tested means that all "make test" runs finish with a result of 100% OK. All tests were conducted with -Duseshrplib set.

All tests were conducted on the oldest supported AIX technology level with the latest support package applied. If the tested AIX version is out of support (AIX 4.3.3, 5.1, 5.2) then the last available support level was used.

Building Dynamic Extensions on AIX

Starting from Perl 5.7.2 (and consequently 5.8.x / 5.10.x / 5.12.x) and AIX 4.3 or newer, Perl uses the AIX native dynamic loading interface in the so-called runtime linking mode instead of the emulated interface that was used in Perl releases 5.6.1 and earlier, or on AIX releases 4.2 and earlier. This change does break backward compatibility with compiled modules from earlier Perl releases. The change was made to make Perl more compliant with other applications like Apache/mod_perl which are using the AIX native interface. This change also enables the use of C++ code with static constructors and destructors in Perl extensions, which was not possible using the emulated interface.

It is highly recommended to use the new interface.

Using Large Files with Perl

Should yield no problems.

Threaded Perl

Should yield no problems with AIX 5.1 / 5.2 / 5.3 / 6.1 / 7.1.

IBM uses the AIX system Perl (V5.6.0 on AIX 5.1 and V5.8.2 on AIX 5.2 / 5.3 and 6.1; V5.8.8 on AIX 5.3 TL11 and AIX 6.1 TL4; V5.10.1 on AIX 7.1) for some AIX system scripts. If you switch the links in /usr/bin from the AIX system Perl (/usr/opt/perl5) to the newly built Perl then you get the same features as with the IBM AIX system Perl if the threaded options are used.

The threaded Perl build also works on AIX 5.1, but the IBM Perl build (Perl v5.6.0) is not threaded on AIX 5.1.

Perl 5.12 and newer is not compatible with the IBM fileset perl.libext.

64-bit Perl

If your AIX system is installed with 64-bit support, you can expect 64-bit configurations to work. If you want to use 64-bit Perl on AIX 6.1 you need an APAR for a libc.a bug which affects (n)dbm_XXX functions. The APAR number for this problem is IZ39077.

If you need more memory (larger data segment) for your Perl programs you can set:

    /etc/security/limits
    default:                    (or your user)
        data = -1               (default is 262144 * 512 byte)

With the default setting the size is limited to 128MB. The -1 removes this limit. If the "make test" fails please change your /etc/security/limits as stated above.
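The 128MB figure follows from the default value, which is counted in 512-byte blocks. As a quick check of the arithmetic (the variable names are only illustrative):

```shell
# The default "data" limit is expressed in 512-byte blocks.
blocks=262144
block_size=512
bytes=$((blocks * block_size))
megabytes=$((bytes / 1024 / 1024))
echo "$blocks blocks x $block_size bytes = $bytes bytes = $megabytes MB"
```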

Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/32-bit)

With the following options you get a threaded Perl version which passes all make tests in threaded 32-bit mode, which is the default configuration for the Perl builds that AIX ships with.

    rm config.sh
    ./Configure \
        -d \
        -Dcc=cc_r \
        -Duseshrplib \
        -Dusethreads \
        -Dprefix=/usr/opt/perl5_32

The -Dprefix option will install Perl in a directory parallel to the IBM AIX system Perl installation.

Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (32-bit)

With the following options you get a Perl version which passes all make tests in 32-bit mode.

    rm config.sh
    ./Configure \
        -d \
        -Dcc=cc_r \
        -Duseshrplib \
        -Dprefix=/usr/opt/perl5_32

The -Dprefix option will install Perl in a directory parallel to the IBM AIX system Perl installation.

Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/64-bit)

With the following options you get a threaded Perl version which passes all make tests in 64-bit mode.

    export OBJECT_MODE=64 / setenv OBJECT_MODE 64 (depending on your shell)
    rm config.sh
    ./Configure \
        -d \
        -Dcc=cc_r \
        -Duseshrplib \
        -Dusethreads \
        -Duse64bitall \
        -Dprefix=/usr/opt/perl5_64

Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (64-bit)

With the following options you get a Perl version which passes all make tests in 64-bit mode.

    export OBJECT_MODE=64 / setenv OBJECT_MODE 64 (depending on your shell)
    rm config.sh
    ./Configure \
        -d \
        -Dcc=cc_r \
        -Duseshrplib \
        -Duse64bitall \
        -Dprefix=/usr/opt/perl5_64

The -Dprefix option will install Perl in a directory parallel to the IBM AIX system Perl installation.

If you choose gcc to compile 64-bit Perl then you need to add the following option:

    -Dcc='gcc -maix64'

Compiling Perl 5 on AIX 7.1.0

A regression in AIX 7 causes a failure in make test in Time::Piece during daylight savings time. APAR IV16514 provides the fix for this. A quick test to see if it's required, assuming it is currently daylight savings in Eastern Time, would be to run TZ=EST5 date +%Z . This will come back with EST normally, but nothing if you have the problem.
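The quick test described above can be wrapped in a small script. This is a sketch only: the function name tz_check is hypothetical, and, as stated above, the result is only meaningful while daylight saving time is in effect in Eastern Time.

```shell
# Hypothetical helper: interpret the output of `TZ=EST5 date +%Z`.
# An empty abbreviation is the symptom described above.
tz_check() {
    if [ -n "$1" ]; then
        echo "ok: time zone abbreviation is '$1'"
    else
        echo "affected: empty abbreviation, APAR IV16514 is probably needed"
        return 1
    fi
}

tz_check "$(TZ=EST5 date +%Z)" || true
```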

Compiling Perl 5 on older AIX versions up to 4.3.3

Because AIX 4.3.3 reached end-of-service on December 31, 2003, this information is provided as is. Perl versions prior to Perl 5.8.9 could be compiled on AIX up to 4.3.3 with the following settings (your mileage may vary):

When compiling Perl, you must use an ANSI C compiler. AIX does not ship an ANSI compliant C-compiler with AIX by default, but binary builds of gcc for AIX are widely available.

At the moment of writing, AIX supports two different native C compilers, for which you have to pay: xlC and vac. If you decide to use either of these two (which is quite a lot easier than using gcc), be sure to upgrade to the latest available patch level. Currently:

    xlC.C     3.1.4.10 or 3.6.6.0 or 4.0.2.2 or 5.0.2.9 or 6.0.0.3
    vac.C     4.4.0.3  or 5.0.2.6 or 6.0.0.1

Note that xlC has the OS version in the name as of version 4.0.2.0, so you will find xlC.C for AIX-5.0 as package

    xlC.aix50.rte   5.0.2.0 or 6.0.0.3

Subversions are not the same "latest" on all OS versions. For example, the latest xlC-5 on aix41 is 5.0.2.9, while on aix43, it is 5.0.2.7.

Perl can be compiled with either IBM's ANSI C compiler or with gcc. The former is recommended, as not only can it compile Perl with no difficulty, but also can take advantage of features listed later that require the use of IBM compiler-specific command-line flags.

IBM's compiler patch levels 5.0.0.0 and 5.0.1.0 have compiler optimization bugs that affect compiling perl.c and regcomp.c, respectively. If Perl's configuration detects those compiler patch levels, optimization is turned off for those source files. Upgrading to at least 5.0.2.0 is recommended.

If you decide to use gcc, make sure your installation is recent and complete, and be sure to read the Perl INSTALL file for more gcc-specific details. Please report any hoops you had to jump through to the development team.

OS level

Before installing the patches to the IBM C-compiler you need to know the level of patching for the Operating System. IBM's command 'oslevel' will show the base, but is not always complete (in this example oslevel shows 4.3.NULL, whereas the system might run most of 4.3.THREE):

    # oslevel
    4.3.0.0
    # lslpp -l | grep 'bos.rte '
    bos.rte           4.3.3.75  COMMITTED  Base Operating System Runtime
    bos.rte            4.3.2.0  COMMITTED  Base Operating System Runtime
    #
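Picking the highest installed bos.rte level out of such a listing can be scripted. The sketch below makes assumptions: the function name highest_bos_rte_level is hypothetical, and it relies on the fileset name and level being the first two whitespace-separated fields, as in the sample output above.

```shell
# Hypothetical helper: read `lslpp -l` output on stdin and print the
# highest bos.rte level found, comparing the four digit groups numerically.
highest_bos_rte_level() {
    awk '$1 == "bos.rte" { print $2 }' |
        sort -t. -k1,1n -k2,2n -k3,3n -k4,4n |
        tail -n 1
}

# Fed with the sample lines shown above instead of a live system:
printf '%s\n' \
    'bos.rte 4.3.3.75 COMMITTED Base Operating System Runtime' \
    'bos.rte 4.3.2.0 COMMITTED Base Operating System Runtime' |
    highest_bos_rte_level
```

On an AIX system the same function would be fed from the real command, as in lslpp -l | highest_bos_rte_level.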

The same might happen to AIX 5.1 or other OS levels. As a side note, Perl cannot be built without bos.adt.syscalls and bos.adt.libm installed

    # lslpp -l | egrep "syscalls|libm"
    bos.adt.libm      5.1.0.25  COMMITTED  Base Application Development
    bos.adt.syscalls  5.1.0.36  COMMITTED  System Calls Application
    #

Building Dynamic Extensions on AIX < 5L

AIX supports dynamically loadable objects as well as shared libraries. Shared libraries by convention end with the suffix .a, which is a bit misleading, as an archive can contain static as well as dynamic members. For Perl dynamically loaded objects we use the .so suffix also used on many other platforms.

Note that starting from Perl 5.7.2 (and consequently 5.8.0) and AIX 4.3 or newer, Perl uses the AIX native dynamic loading interface in the so-called runtime linking mode instead of the emulated interface that was used in Perl releases 5.6.1 and earlier, or on AIX releases 4.2 and earlier. This change does break backward compatibility with compiled modules from earlier Perl releases. The change was made to make Perl more compliant with other applications like Apache/mod_perl which are using the AIX native interface. This change also enables the use of C++ code with static constructors and destructors in Perl extensions, which was not possible using the emulated interface.

The IBM ANSI C Compiler

All defaults for Configure can be used.

If you've chosen to use vac 4, be sure to run 4.4.0.3. Older versions will turn up nasty later on. For vac 5 be sure to run at least 5.0.1.0, but vac 5.0.2.6 or up is highly recommended. Note that since IBM has removed vac 5.0.2.1 through 5.0.2.5 from the software depot, these versions should be considered obsolete.

Here's a brief guide to upgrading the compiler to the latest level. Of course this is subject to change. You can only upgrade versions from ftp-available updates if the first three digit groups are the same (in which case you can skip intermediate versions, unlike the patches in the developer snapshots of Perl), or to one version up where the "base" is available. In other words, the AIX compiler patches are cumulative.

    vac.C.4.4.0.1 => vac.C.4.4.0.3  is OK     (vac.C.4.4.0.2 not needed)
    xlC.C.3.1.3.3 => xlC.C.3.1.4.10 is NOT OK (xlC.C.3.1.4.0 is not available)

    # ftp ftp.software.ibm.com
    Connected to service.boulder.ibm.com.
    : welcome message ...
    Name (ftp.software.ibm.com:merijn): anonymous
    331 Guest login ok, send your complete e-mail address as password.
    Password:
    ... accepted login stuff
    ftp> cd /aix/fixes/v4/
    ftp> dir other other.ll
    output to local-file: other.ll? y
    200 PORT command successful.
    150 Opening ASCII mode data connection for /bin/ls.
    226 Transfer complete.
    ftp> dir xlc xlc.ll
    output to local-file: xlc.ll? y
    200 PORT command successful.
    150 Opening ASCII mode data connection for /bin/ls.
    226 Transfer complete.
    ftp> bye
    ... goodbye messages
    # ls -l *.ll
    -rw-rw-rw-   1 merijn   system    1169432 Nov  2 17:29 other.ll
    -rw-rw-rw-   1 merijn   system      29170 Nov  2 17:29 xlc.ll
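The "first three digit groups" rule from the two examples at the top of this listing can be expressed as a small check. This is a sketch with a hypothetical function name, same_patch_base; it covers only the same-base half of the rule and does not know which base levels are actually available for download.

```shell
# Hypothetical helper: succeed when two fileset levels share their first
# three digit groups (for example 4.4.0.1 and 4.4.0.3).
same_patch_base() {
    [ "${1%.*}" = "${2%.*}" ]
}

for pair in "4.4.0.1 4.4.0.3" "3.1.3.3 3.1.4.10"; do
    set -- $pair
    if same_patch_base "$1" "$2"; then
        echo "$1 => $2: same base, direct update possible"
    else
        echo "$1 => $2: different base, the base level is needed first"
    fi
done
```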

On AIX 4.2 using xlC, we continue:

    # lslpp -l | fgrep 'xlC.C '
    xlC.C     3.1.4.9  COMMITTED  C for AIX Compiler
    xlC.C     3.1.4.0  COMMITTED  C for AIX Compiler
    # grep 'xlC.C.3.1.4.*.bff' xlc.ll
    -rw-r--r--   1 45776101 1   6286336 Jul 22  1996 xlC.C.3.1.4.1.bff
    -rw-rw-r--   1 45776101 1   6173696 Aug 24  1998 xlC.C.3.1.4.10.bff
    -rw-r--r--   1 45776101 1   6319104 Aug 14  1996 xlC.C.3.1.4.2.bff
    -rw-r--r--   1 45776101 1   6316032 Oct 21  1996 xlC.C.3.1.4.3.bff
    -rw-r--r--   1 45776101 1   6315008 Dec 20  1996 xlC.C.3.1.4.4.bff
    -rw-rw-r--   1 45776101 1   6178816 Mar 28  1997 xlC.C.3.1.4.5.bff
    -rw-rw-r--   1 45776101 1   6188032 May 22  1997 xlC.C.3.1.4.6.bff
    -rw-rw-r--   1 45776101 1   6191104 Sep  5  1997 xlC.C.3.1.4.7.bff
    -rw-rw-r--   1 45776101 1   6185984 Jan 13  1998 xlC.C.3.1.4.8.bff
    -rw-rw-r--   1 45776101 1   6169600 May 27  1998 xlC.C.3.1.4.9.bff
    # wget ftp://ftp.software.ibm.com/aix/fixes/v4/xlc/xlC.C.3.1.4.10.bff
    #

On AIX 4.3 using vac, we continue:

    # lslpp -l | grep 'vac.C '
    vac.C     5.0.2.2  COMMITTED  C for AIX Compiler
    vac.C     5.0.2.0  COMMITTED  C for AIX Compiler
    # grep 'vac.C.5.0.2.*.bff' other.ll
    -rw-rw-r--   1 45776101 1  13592576 Apr 16  2001 vac.C.5.0.2.0.bff
    -rw-rw-r--   1 45776101 1  14133248 Apr  9  2002 vac.C.5.0.2.3.bff
    -rw-rw-r--   1 45776101 1  14173184 May 20  2002 vac.C.5.0.2.4.bff
    -rw-rw-r--   1 45776101 1  14192640 Nov 22  2002 vac.C.5.0.2.6.bff
    # wget ftp://ftp.software.ibm.com/aix/fixes/v4/other/vac.C.5.0.2.6.bff
    #

Likewise on all other OS levels. Then execute the following command, and fill in its choices:

    # smit install_update
     -> Install and Update from LATEST Available Software
     * INPUT device / directory for software [ vac.C.5.0.2.6.bff ]
     [ OK ]
     [ OK ]

Follow the messages ... and you're done.

If you prefer a more web-like approach, a good starting point is http://www14.software.ibm.com/webapp/download/downloadaz.jsp : click "C for AIX" and follow the instructions.

The usenm option

If linking miniperl

    cc -o miniperl ... miniperlmain.o opmini.o perl.o ... -lm -lc ...

causes errors like this

    ld: 0711-317 ERROR: Undefined symbol: .aintl
    ld: 0711-317 ERROR: Undefined symbol: .copysignl
    ld: 0711-317 ERROR: Undefined symbol: .syscall
    ld: 0711-317 ERROR: Undefined symbol: .eaccess
    ld: 0711-317 ERROR: Undefined symbol: .setresuid
    ld: 0711-317 ERROR: Undefined symbol: .setresgid
    ld: 0711-317 ERROR: Undefined symbol: .setproctitle
    ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.

you could retry with

    make realclean
    rm config.sh
    ./Configure -Dusenm ...

which makes Configure use the nm tool when scanning for library symbols, which usually is not done on AIX.

Related to this, you probably should not use the -r option of Configure on AIX, because that affects how the nm tool is used.
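To see at a glance which symbols the linker complained about, the 0711-317 lines can be filtered out of a saved build log. This is only a sketch: the function name undefined_symbols is hypothetical, and it assumes the message format shown in the sample errors above.

```shell
# Hypothetical helper: read linker output on stdin and print only the
# symbol names from "0711-317 ERROR: Undefined symbol" lines.
undefined_symbols() {
    sed -n 's/^ld: 0711-317 ERROR: Undefined symbol: //p'
}

# Fed with two of the sample lines shown above:
printf '%s\n' \
    'ld: 0711-317 ERROR: Undefined symbol: .aintl' \
    'ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.' |
    undefined_symbols
```

If this prints any symbols for the miniperl link step, the -Dusenm retry described above is worth trying.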

Using GNU's gcc for building Perl

Using gcc-3.x (tested with 3.0.4, 3.1, and 3.2) now works out of the box, as do recent gcc-2.9 builds available directly from IBM as part of their Linux compatibility packages, available here:

    http://www.ibm.com/servers/aix/products/aixos/linux/

Using Large Files with Perl < 5L

Should yield no problems.

Threaded Perl < 5L

Threads seem to work OK, though at the moment not all tests pass when threads are used in combination with 64-bit configurations.

You may get a warning when doing a threaded build:

    "pp_sys.c", line 4640.39: 1506-280 (W) Function argument assignment
    between types "unsigned char*" and "const void*" is not allowed.

The exact line number may vary, but if the warning (W) comes from a line like this

    hent = PerlSock_gethostbyaddr(addr, (Netdb_hlen_t) addrlen, addrtype);

in the "pp_ghostent" function, you may ignore it safely. The warning is caused by the reentrant variant of gethostbyaddr() having a slightly different prototype than its non-reentrant variant, but the difference is not really significant here.

64-bit Perl < 5L

If your AIX is installed with 64-bit support, you can expect 64-bit configurations to work. In combination with threads some tests might still fail.

AIX 4.2 and extensions using C++ with statics

In AIX 4.2 Perl extensions that use C++ functions that use statics may have problems in that the statics are not getting initialized. In newer AIX releases this has been solved by linking Perl with the libC_r library, but unfortunately in AIX 4.2 the said library has an obscure bug where the various functions related to time (such as time() and gettimeofday()) return broken values, and therefore in AIX 4.2 Perl is not linked against the libC_r.

AUTHORS

Rainer Tammer <tammer@tammer.net>

 
perlamiga

NAME

perlamiga - Perl under Amiga OS

NOTE

Perl 5.8.0 cannot be built in AmigaOS. You can use either the maintenance release Perl 5.6.1 or the development release Perl 5.7.2 in AmigaOS. See PERL 5.8.0 BROKEN IN AMIGAOS if you want to help fix this problem.

SYNOPSIS

One can read this document in the following formats:

    man perlamiga
    multiview perlamiga.guide

to list some (not all may be available simultaneously), or it may be read as is: either as README.amiga, or pod/perlamiga.pod.

A recent version of perl for the Amiga can be found at the Geek Gadgets section of the Aminet:

    http://www.aminet.net/~aminet/dev/gg

DESCRIPTION

Prerequisites for Compiling Perl on AmigaOS

  • Unix emulation for AmigaOS: ixemul.library

    You need the Unix emulation for AmigaOS, whose most important part is ixemul.library. For a minimum setup, get the latest versions of the following packages from the Aminet archives ( http://www.aminet.net/~aminet/ ):

        ixemul-bin
        ixemul-env-bin
        pdksh-bin

    Note also that this is a minimum setup; you might want to add other packages of ADE (the Amiga Developers Environment).

  • Version of Amiga OS

    You need at the very least AmigaOS version 2.0. Recommended is version 3.1.

Starting Perl programs under AmigaOS

Start your Perl program foo with arguments arg1 arg2 arg3 the same way as on any other platform, by

    perl foo arg1 arg2 arg3

If you want to specify perl options -my_opts to the perl itself (as opposed to your program), use

    perl -my_opts foo arg1 arg2 arg3

Alternately, you can try to get a replacement for the system's Execute command that honors the #!/usr/bin/perl syntax in scripts and set the s-Bit of your scripts. Then you can invoke your scripts like under UNIX with

    foo arg1 arg2 arg3

(Note that having *nixish full path to perl /usr/bin/perl is not necessary, perl would be enough, but having full path would make it easier to use your script under *nix.)

Shortcomings of Perl under AmigaOS

Perl under AmigaOS lacks some features of perl under UNIX because of deficiencies in the UNIX-emulation, most notably:

  • fork()

  • some features of the UNIX filesystem regarding link count and file dates

  • inplace operation (the -i switch) without backup file

  • umask() works, but the correct permissions are only set when the file is finally close()d

INSTALLATION

Change to the installation directory (most probably ADE:), and extract the binary distribution:

lha -mraxe x perl-$VERSION-bin.lha

or

tar xvzpf perl-$VERSION-bin.tgz

(Of course you need lha or tar and gunzip for this.)

For installation of the Unix emulation, read the appropriate docs.

Accessing documentation

Manpages for Perl on AmigaOS

If you have man installed on your system, and you installed perl manpages, use something like this:

    man perlfunc
    man less
    man ExtUtils.MakeMaker

to access documentation for different components of Perl. Start with

    man perl

Note: You have to modify your man.conf file to search for manpages in the /ade/lib/perl5/man/man3 directory, or the man pages for the perl library will not be found.

Note that dot (.) is used as a package separator for documentation for packages, and as usual, sometimes you need to give the section - 3 above - to avoid shadowing by the less(1) manpage.

Perl HTML Documentation on AmigaOS

If you have a WWW browser available, you can build HTML docs. Change to the directory with the .pod files and run pod2html:

    cd /ade/lib/perl5/pod
    pod2html

After this you can point your browser at the file perl.html in this directory and start reading the docs.

Alternatively you may be able to get these docs prebuilt from CPAN .

Perl GNU Info Files on AmigaOS

Users of Emacs will appreciate these very much, especially with CPerl mode loaded. You need to get the latest pod2info from CPAN , or, alternatively, prebuilt info pages.

Perl LaTeX Documentation on AmigaOS

Can be constructed using pod2latex .

BUILDING PERL ON AMIGAOS

Here we discuss how to build Perl under AmigaOS.

Build Prerequisites for Perl on AmigaOS

You need to have the latest ixemul (Unix emulation for Amiga) from Aminet.

Getting the Perl Source for AmigaOS

You can either get the latest perl-for-amiga source from Ninemoons and extract it with:

    tar xvzpf perl-$VERSION-src.tgz

or get the official source from CPAN:

    http://www.cpan.org/src/5.0

Extract it like this

    tar xvzpf perl-$VERSION.tar.gz

You will see a message about errors while extracting Configure. This is normal and expected. (There is a conflict with a similarly-named file configure, but it causes no harm.)

Making Perl on AmigaOS

Remember to use a hefty wad of stack (I use 2000000)

    sh configure.gnu --prefix=/gg

Now type

    make depend

Now!

    make

Testing Perl on AmigaOS

Now run

    make test

Some tests will be skipped because they need the fork() function:

io/pipe.t, op/fork.t, lib/filehand.t, lib/open2.t, lib/open3.t, lib/io_pipe.t, lib/io_sock.t

Installing the built Perl on AmigaOS

Run

    make install

PERL 5.8.0 BROKEN IN AMIGAOS

As noted above, Perl 5.6.1 was still good in AmigaOS, as was 5.7.2. After Perl 5.7.2 (change #11423, see the Changes file, and the file pod/perlhack.pod for how to get the individual changes) Perl dropped its internal support for vfork(), and that was very probably the step that broke AmigaOS (since the ixemul library has only vfork). The build finally fails when ext/DynaLoader is being built: PERL ends up as "0" in the produced Makefile, and trying to run "0" does not quite work. Also, executing miniperl in backticks seems to generate nothing: very probably related to the (v)fork problems. Fixing the breakage requires someone quite familiar with the ixemul library, and with how one is supposed to run external commands in AmigaOS without fork().

AUTHORS

Norbert Pueschel, pueschel@imsdd.meb.uni-bonn.de

Jan-Erik Karlsson, trg@privat.utfors.se

SEE ALSO

perl(1).

 
perlapi

NAME

perlapi - autogenerated documentation for the perl public API

DESCRIPTION

This file contains the documentation of the perl public API generated by embed.pl, specifically a listing of functions, macros, flags, and variables that may be used by extension writers. At the end is a list of functions which have yet to be documented. The interfaces of those are subject to change without notice. Any functions not listed here are not part of the public API, and should not be used by extension writers at all. For these reasons, blindly using functions listed in proto.h is to be avoided when writing extensions.

Note that all Perl API global variables must be referenced with the PL_ prefix. Some macros are provided for compatibility with the older, unadorned names, but this support may be disabled in a future release.

Perl was originally written to handle US-ASCII only (that is characters whose ordinal numbers are in the range 0 - 127). And documentation and comments may still use the term ASCII, when sometimes in fact the entire range from 0 - 255 is meant.

Note that Perl can be compiled and run under EBCDIC (See perlebcdic) or ASCII. Most of the documentation (and even comments in the code) ignores the EBCDIC possibility. For almost all purposes the differences are transparent. As an example, under EBCDIC, instead of UTF-8, UTF-EBCDIC is used to encode Unicode strings, and so whenever this documentation refers to utf8 (and variants of that name, including in function names), it also (essentially transparently) means UTF-EBCDIC . But the ordinals of characters differ between ASCII, EBCDIC, and the UTF- encodings, and a string encoded in UTF-EBCDIC may occupy more bytes than in UTF-8.

The listing below is alphabetical, case insensitive.

"Gimme" Values

  • GIMME

    A backward-compatible version of GIMME_V which can only return G_SCALAR or G_ARRAY ; in a void context, it returns G_SCALAR . Deprecated. Use GIMME_V instead.

        U32 GIMME

  • GIMME_V

    The XSUB-writer's equivalent to Perl's wantarray. Returns G_VOID , G_SCALAR or G_ARRAY for void, scalar or list context, respectively. See perlcall for a usage example.

        U32 GIMME_V

  • G_ARRAY

    Used to indicate list context. See GIMME_V , GIMME and perlcall.

  • G_DISCARD

    Indicates that arguments returned from a callback should be discarded. See perlcall.

  • G_EVAL

    Used to force a Perl eval wrapper around a callback. See perlcall.

  • G_NOARGS

    Indicates that no arguments are being sent to a callback. See perlcall.

  • G_SCALAR

    Used to indicate scalar context. See GIMME_V , GIMME , and perlcall.

  • G_VOID

    Used to indicate void context. See GIMME_V and perlcall.

Array Manipulation Functions

  • AvFILL

    Same as av_top_index() . Deprecated, use av_top_index() instead.

        int AvFILL(AV* av)

  • av_clear

    Clears an array, making it empty. Does not free the memory the av uses to store its list of scalars. If any destructors are triggered as a result, the av itself may be freed when this function returns.

    Perl equivalent: @myarray = (); .

        void av_clear(AV *av)

  • av_create_and_push

    Push an SV onto the end of the array, creating the array if necessary. A small internal helper function to remove a commonly duplicated idiom.

    NOTE: this function is experimental and may change or be removed without notice.

        void av_create_and_push(AV **const avp,
                                SV *const val)

  • av_create_and_unshift_one

    Unshifts an SV onto the beginning of the array, creating the array if necessary. A small internal helper function to remove a commonly duplicated idiom.

    NOTE: this function is experimental and may change or be removed without notice.

        SV** av_create_and_unshift_one(AV **const avp,
                                       SV *const val)

  • av_delete

    Deletes the element indexed by key from the array, makes the element mortal, and returns it. If flags equals G_DISCARD , the element is freed and null is returned. Perl equivalent: my $elem = delete($myarray[$idx]); for the non-G_DISCARD version and a void-context delete($myarray[$idx]); for the G_DISCARD version.

    1. SV* av_delete(AV *av, I32 key, I32 flags)
  • av_exists

    Returns true if the element indexed by key has been initialized.

    This relies on the fact that uninitialized array elements are set to &PL_sv_undef .

    Perl equivalent: exists($myarray[$key]).

    1. bool av_exists(AV *av, I32 key)
  • av_extend

    Pre-extend an array. The key is the index to which the array should be extended.

    1. void av_extend(AV *av, I32 key)
  • av_fetch

    Returns the SV at the specified index in the array. The key is the index. If lval is true, you are guaranteed to get a real SV back (in case it wasn't real before), which you can then modify. Check that the return value is non-null before dereferencing it to a SV* .

    See Understanding the Magic of Tied Hashes and Arrays in perlguts for more information on how to use this function on tied arrays.

    The rough perl equivalent is $myarray[$idx] .

    1. SV** av_fetch(AV *av, I32 key, I32 lval)
  • av_fill

    Set the highest index in the array to the given number, equivalent to Perl's $#array = $fill; .

    The number of elements in the array will be fill + 1 after av_fill() returns. If the array was previously shorter, then the additional elements appended are set to PL_sv_undef. If the array was longer, then the excess elements are freed. av_fill(av, -1) is the same as av_clear(av).

    1. void av_fill(AV *av, I32 fill)
  • av_len

    Same as av_top_index. Returns the highest index in the array. Note that, despite what the name implies, this is not the number of elements: the element count is av_len(av) + 1. The return value therefore differs in meaning from what the similarly named sv_len returns.

    1. I32 av_len(AV *av)
  • av_make

    Creates a new AV and populates it with a list of SVs. The SVs are copied into the array, so they may be freed after the call to av_make. The new AV will have a reference count of 1.

    Perl equivalent: my @new_array = ($scalar1, $scalar2, $scalar3...);

    1. AV* av_make(I32 size, SV **strp)
  • av_pop

    Removes one SV from the end of the array, reducing its size by one and returning the SV (transferring control of one reference count) to the caller. Returns &PL_sv_undef if the array is empty.

    Perl equivalent: pop(@myarray);

    1. SV* av_pop(AV *av)
  • av_push

    Pushes an SV onto the end of the array. The array will grow automatically to accommodate the addition. This takes ownership of one reference count.

    Perl equivalent: push @myarray, $elem; .

    1. void av_push(AV *av, SV *val)
  • av_shift

    Shifts an SV off the beginning of the array. Returns &PL_sv_undef if the array is empty.

    Perl equivalent: shift(@myarray);

    1. SV* av_shift(AV *av)
  • av_store

    Stores an SV in an array. The array index is specified as key. The return value will be NULL if the operation failed or if the value did not need to be actually stored within the array (as in the case of tied arrays). Otherwise, it can be dereferenced to get the SV* that was stored there (= val).

    Note that the caller is responsible for suitably incrementing the reference count of val before the call, and decrementing it if the function returned NULL.

    Approximate Perl equivalent: $myarray[$key] = $val; .

    See Understanding the Magic of Tied Hashes and Arrays in perlguts for more information on how to use this function on tied arrays.

    1. SV** av_store(AV *av, I32 key, SV *val)
  • av_tindex

    Same as av_top_index() .

    1. int av_tindex(AV* av)
  • av_top_index

    Returns the highest index in the array. The number of elements in the array is av_top_index(av) + 1 . Returns -1 if the array is empty.

    The Perl equivalent for this is $#myarray .

    (A slightly shorter form is av_tindex .)

    1. I32 av_top_index(AV *av)
  • av_undef

    Undefines the array. Frees the memory used by the av to store its list of scalars. If any destructors are triggered as a result, the av itself may be freed.

    1. void av_undef(AV *av)
  • av_unshift

    Unshift the given number of undef values onto the beginning of the array. The array will grow automatically to accommodate the addition. You must then use av_store to assign values to these new elements.

    Perl equivalent: unshift @myarray, ( (undef) x $n );

    1. void av_unshift(AV *av, I32 num)
  • get_av

    Returns the AV of the specified Perl global or package array with the given name (so it won't work on lexical variables). flags are passed to gv_fetchpv . If GV_ADD is set and the Perl variable does not exist then it will be created. If flags is zero and the variable does not exist then NULL is returned.

    Perl equivalent: @{"$name"} .

    NOTE: the perl_ form of this function is deprecated.

    1. AV* get_av(const char *name, I32 flags)
  • newAV

    Creates a new AV. The reference count is set to 1.

    Perl equivalent: my @array; .

    1. AV* newAV()
  • sortsv

    Sort an array. Here is an example:

    1. sortsv(AvARRAY(av), av_top_index(av)+1, Perl_sv_cmp_locale);

    Currently this always uses mergesort. See sortsv_flags for a more flexible routine.

    1. void sortsv(SV** array, size_t num_elts,
    2. SVCOMPARE_t cmp)
  • sortsv_flags

    Sort an array, with various options.

    1. void sortsv_flags(SV** array, size_t num_elts,
    2. SVCOMPARE_t cmp, U32 flags)

Callback Functions

  • call_argv

    Performs a callback to the specified named and package-scoped Perl subroutine with argv (a NULL-terminated array of strings) as arguments. See perlcall.

    Approximate Perl equivalent: &{"$sub_name"}(@$argv) .

    NOTE: the perl_ form of this function is deprecated.

    1. I32 call_argv(const char* sub_name, I32 flags,
    2. char** argv)
  • call_method

    Performs a callback to the specified Perl method. The blessed object must be on the stack. See perlcall.

    NOTE: the perl_ form of this function is deprecated.

    1. I32 call_method(const char* methname, I32 flags)
  • call_pv

    Performs a callback to the specified Perl sub. See perlcall.

    NOTE: the perl_ form of this function is deprecated.

    1. I32 call_pv(const char* sub_name, I32 flags)
  • call_sv

    Performs a callback to the Perl sub whose name is in the SV. See perlcall.

    NOTE: the perl_ form of this function is deprecated.

    1. I32 call_sv(SV* sv, VOL I32 flags)
  • ENTER

    Opening bracket on a callback. See LEAVE and perlcall.

    1. ENTER;
  • eval_pv

    Tells Perl to eval the given string and return an SV* result.

    NOTE: the perl_ form of this function is deprecated.

    1. SV* eval_pv(const char* p, I32 croak_on_error)
  • eval_sv

    Tells Perl to eval the string in the SV. It supports the same flags as call_sv , with the obvious exception of G_EVAL. See perlcall.

    NOTE: the perl_ form of this function is deprecated.

    1. I32 eval_sv(SV* sv, I32 flags)
  • FREETMPS

    Closing bracket for temporaries on a callback. See SAVETMPS and perlcall.

    1. FREETMPS;
  • LEAVE

    Closing bracket on a callback. See ENTER and perlcall.

    1. LEAVE;
  • SAVETMPS

    Opening bracket for temporaries on a callback. See FREETMPS and perlcall.

    1. SAVETMPS;

Character case changing

  • toLOWER

    Converts the specified character to lowercase, if possible; otherwise returns the input character itself.

    1. char toLOWER(char ch)
  • toUPPER

    Converts the specified character to uppercase, if possible; otherwise returns the input character itself.

    1. char toUPPER(char ch)

Character classes

This section is about functions (really macros) that classify characters into types, such as punctuation versus alphabetic, etc. Most of these are analogous to regular expression character classes. (See POSIX Character Classes in perlrecharclass.) There are several variants for each class. (Not all macros have all variants; each item below lists the ones valid for it.) None are affected by use bytes , and only the ones with LC in the name are affected by the current locale.

The base function, e.g., isALPHA() , takes an octet (either a char or a U8 ) as input and returns a boolean as to whether or not the character represented by that octet is (or on non-ASCII platforms, corresponds to) an ASCII character in the named class based on platform, Unicode, and Perl rules. If the input is a number that doesn't fit in an octet, FALSE is returned.

Variant isFOO_A (e.g., isALPHA_A() ) is identical to the base function with no suffix "_A" .

Variant isFOO_L1 imposes the Latin-1 (or EBCDIC equivalent) character set onto the platform. That is, the code points that are ASCII are unaffected, since ASCII is a subset of Latin-1. But the non-ASCII code points are treated as if they are Latin-1 characters. For example, isWORDCHAR_L1() will return true when called with the code point 0xDF, which is a word character on both ASCII and EBCDIC platforms (though it represents different characters in each).

Variant isFOO_uni is like the isFOO_L1 variant, but accepts any UV code point as input. If the code point is larger than 255, Unicode rules are used to determine if it is in the character class. For example, isWORDCHAR_uni(0x100) returns TRUE, since 0x100 is LATIN CAPITAL LETTER A WITH MACRON in Unicode, and is a word character.

Variant isFOO_utf8 is like isFOO_uni , but the input is a pointer to a (known to be well-formed) UTF-8 encoded string (U8* or char* ). The classification of just the first (possibly multi-byte) character in the string is tested.

Variant isFOO_LC is like the isFOO_A and isFOO_L1 variants, but uses the C library function that gives the named classification instead of hard-coded rules. For example, isDIGIT_LC() returns the result of calling isdigit() . This means that the result is based on the current locale, which is what LC in the name stands for. FALSE is always returned if the input won't fit into an octet.

Variant isFOO_LC_uvchr is like isFOO_LC , but is defined on any UV. It returns the same as isFOO_LC for input code points less than 256, and returns the hard-coded, not-affected-by-locale, Unicode results for larger ones.

Variant isFOO_LC_utf8 is like isFOO_LC_uvchr , but the input is a pointer to a (known to be well-formed) UTF-8 encoded string (U8* or char* ). The classification of just the first (possibly multi-byte) character in the string is tested.
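
The difference between the base form and the _L1 form can be modelled in plain C. The sketch below is a standalone model, not the Perl macros: model_isALPHA and model_isALPHA_L1 are hypothetical names, the Latin-1 letter ranges are written out by hand, and an ASCII platform is assumed.

```c
#include <stdbool.h>

/* Model of the base-form rule: only ASCII letters match. Any value
 * above 0x7F, including one that does not fit in an octet, is FALSE. */
static bool model_isALPHA(unsigned long c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Model of the _L1 rule: ASCII results are unchanged, and the code
 * points 0x80..0xFF are classified as Latin-1 characters. */
static bool model_isALPHA_L1(unsigned long c)
{
    if (c > 0xFF)
        return false;                     /* input must fit in an octet */
    if (model_isALPHA(c))
        return true;
    /* Latin-1 letters: U+00AA, U+00B5, U+00BA, U+00C0..U+00D6,
     * U+00D8..U+00F6, U+00F8..U+00FF (0xD7 and 0xF7 are math signs). */
    return c == 0xAA || c == 0xB5 || c == 0xBA
        || (c >= 0xC0 && c <= 0xD6)
        || (c >= 0xD8 && c <= 0xF6)
        || (c >= 0xF8);
}
```

With this model, 0xE9 (LATIN SMALL LETTER E WITH ACUTE) is rejected by the base form and accepted by the _L1 form, while 0x100 is rejected by both because it does not fit in an octet.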

  • isALPHA

    Returns a boolean indicating whether the specified character is an alphabetic character, analogous to m/[[:alpha:]]/. See the top of this section for an explanation of variants isALPHA_A , isALPHA_L1 , isALPHA_uni , isALPHA_utf8 , isALPHA_LC , isALPHA_LC_uvchr , and isALPHA_LC_utf8 .

    1. bool isALPHA(char ch)
  • isALPHANUMERIC

    Returns a boolean indicating whether the specified character is either an alphabetic character or decimal digit, analogous to m/[[:alnum:]]/. See the top of this section for an explanation of variants isALPHANUMERIC_A , isALPHANUMERIC_L1 , isALPHANUMERIC_uni , isALPHANUMERIC_utf8 , isALPHANUMERIC_LC , isALPHANUMERIC_LC_uvchr , and isALPHANUMERIC_LC_utf8 .

    1. bool isALPHANUMERIC(char ch)
  • isASCII

    Returns a boolean indicating whether the specified character is one of the 128 characters in the ASCII character set, analogous to m/[[:ascii:]]/. On non-ASCII platforms, it returns TRUE iff this character corresponds to an ASCII character. Variants isASCII_A() and isASCII_L1() are identical to isASCII() . See the top of this section for an explanation of variants isASCII_uni , isASCII_utf8 , isASCII_LC , isASCII_LC_uvchr , and isASCII_LC_utf8 . Note, however, that some platforms do not have the C library routine isascii() . In these cases, the variants whose names contain LC are the same as the corresponding ones without.

    1. bool isASCII(char ch)
  • isBLANK

    Returns a boolean indicating whether the specified character is a character considered to be a blank, analogous to m/[[:blank:]]/. See the top of this section for an explanation of variants isBLANK_A , isBLANK_L1 , isBLANK_uni , isBLANK_utf8 , isBLANK_LC , isBLANK_LC_uvchr , and isBLANK_LC_utf8 . Note, however, that some platforms do not have the C library routine isblank() . In these cases, the variants whose names contain LC are the same as the corresponding ones without.

    1. bool isBLANK(char ch)
  • isCNTRL

    Returns a boolean indicating whether the specified character is a control character, analogous to m/[[:cntrl:]]/. See the top of this section for an explanation of variants isCNTRL_A , isCNTRL_L1 , isCNTRL_uni , isCNTRL_utf8 , isCNTRL_LC , isCNTRL_LC_uvchr , and isCNTRL_LC_utf8 . On EBCDIC platforms, you almost always want to use the isCNTRL_L1 variant.

    1. bool isCNTRL(char ch)
  • isDIGIT

    Returns a boolean indicating whether the specified character is a digit, analogous to m/[[:digit:]]/. Variants isDIGIT_A and isDIGIT_L1 are identical to isDIGIT . See the top of this section for an explanation of variants isDIGIT_uni , isDIGIT_utf8 , isDIGIT_LC , isDIGIT_LC_uvchr , and isDIGIT_LC_utf8 .

    1. bool isDIGIT(char ch)
  • isGRAPH

    Returns a boolean indicating whether the specified character is a graphic character, analogous to m/[[:graph:]]/. See the top of this section for an explanation of variants isGRAPH_A , isGRAPH_L1 , isGRAPH_uni , isGRAPH_utf8 , isGRAPH_LC , isGRAPH_LC_uvchr , and isGRAPH_LC_utf8 .

    1. bool isGRAPH(char ch)
  • isIDCONT

    Returns a boolean indicating whether the specified character can be the second or succeeding character of an identifier. This is very close to, but not quite the same as the official Unicode property XID_Continue . The difference is that this returns true only if the input character also matches isWORDCHAR. See the top of this section for an explanation of variants isIDCONT_A , isIDCONT_L1 , isIDCONT_uni , isIDCONT_utf8 , isIDCONT_LC , isIDCONT_LC_uvchr , and isIDCONT_LC_utf8 .

    1. bool isIDCONT(char ch)
  • isIDFIRST

    Returns a boolean indicating whether the specified character can be the first character of an identifier. This is very close to, but not quite the same as the official Unicode property XID_Start . The difference is that this returns true only if the input character also matches isWORDCHAR. See the top of this section for an explanation of variants isIDFIRST_A , isIDFIRST_L1 , isIDFIRST_uni , isIDFIRST_utf8 , isIDFIRST_LC , isIDFIRST_LC_uvchr , and isIDFIRST_LC_utf8 .

    1. bool isIDFIRST(char ch)
  • isLOWER

    Returns a boolean indicating whether the specified character is a lowercase character, analogous to m/[[:lower:]]/. See the top of this section for an explanation of variants isLOWER_A , isLOWER_L1 , isLOWER_uni , isLOWER_utf8 , isLOWER_LC , isLOWER_LC_uvchr , and isLOWER_LC_utf8 .

    1. bool isLOWER(char ch)
  • isOCTAL

    Returns a boolean indicating whether the specified character is an octal digit, [0-7]. The only two variants are isOCTAL_A and isOCTAL_L1 ; each is identical to isOCTAL .

    1. bool isOCTAL(char ch)
  • isPRINT

    Returns a boolean indicating whether the specified character is a printable character, analogous to m/[[:print:]]/. See the top of this section for an explanation of variants isPRINT_A , isPRINT_L1 , isPRINT_uni , isPRINT_utf8 , isPRINT_LC , isPRINT_LC_uvchr , and isPRINT_LC_utf8 .

    1. bool isPRINT(char ch)
  • isPSXSPC

    (short for Posix Space) Starting in 5.18, this is identical (experimentally) in all its forms to the corresponding isSPACE() macros. ("Experimentally" means that this change may be backed out in 5.20 or 5.22 if field experience indicates that it was unwise.) The locale forms of this macro are identical to their corresponding isSPACE() forms in all Perl releases. In releases prior to 5.18, the non-locale forms differ from their isSPACE() forms only in that the isSPACE() forms don't match a Vertical Tab, and the isPSXSPC() forms do. Otherwise they are identical. Thus this macro is analogous to what m/[[:space:]]/ matches in a regular expression. See the top of this section for an explanation of variants isPSXSPC_A , isPSXSPC_L1 , isPSXSPC_uni , isPSXSPC_utf8 , isPSXSPC_LC , isPSXSPC_LC_uvchr , and isPSXSPC_LC_utf8 .

    1. bool isPSXSPC(char ch)
  • isPUNCT

    Returns a boolean indicating whether the specified character is a punctuation character, analogous to m/[[:punct:]]/. Note that the definition of what is punctuation isn't as straightforward as one might desire. See POSIX Character Classes in perlrecharclass for details. See the top of this section for an explanation of variants isPUNCT_A , isPUNCT_L1 , isPUNCT_uni , isPUNCT_utf8 , isPUNCT_LC , isPUNCT_LC_uvchr , and isPUNCT_LC_utf8 .

    1. bool isPUNCT(char ch)
  • isSPACE

    Returns a boolean indicating whether the specified character is a whitespace character. This is analogous to what m/\s/ matches in a regular expression. Starting in Perl 5.18 (experimentally), this also matches what m/[[:space:]]/ does. ("Experimentally" means that this change may be backed out in 5.20 or 5.22 if field experience indicates that it was unwise.) Prior to 5.18, only the locale forms of this macro (the ones with LC in their names) matched precisely what m/[[:space:]]/ does. In those releases, the only difference, in the non-locale variants, was that isSPACE() did not match a vertical tab. (See isPSXSPC for a macro that matches a vertical tab in all releases.) See the top of this section for an explanation of variants isSPACE_A , isSPACE_L1 , isSPACE_uni , isSPACE_utf8 , isSPACE_LC , isSPACE_LC_uvchr , and isSPACE_LC_utf8 .

    1. bool isSPACE(char ch)
  • isUPPER

    Returns a boolean indicating whether the specified character is an uppercase character, analogous to m/[[:upper:]]/. See the top of this section for an explanation of variants isUPPER_A , isUPPER_L1 , isUPPER_uni , isUPPER_utf8 , isUPPER_LC , isUPPER_LC_uvchr , and isUPPER_LC_utf8 .

    1. bool isUPPER(char ch)
  • isWORDCHAR

    Returns a boolean indicating whether the specified character is a word character, analogous to what m/\w/ and m/[[:word:]]/ match in a regular expression. A word character is an alphabetic character, a decimal digit, a connecting punctuation character (such as an underscore), or a "mark" character that attaches to one of those (like some sort of accent). isALNUM() is a synonym provided for backward compatibility, even though a word character includes more than the standard C language meaning of alphanumeric. See the top of this section for an explanation of variants isWORDCHAR_A , isWORDCHAR_L1 , isWORDCHAR_uni , isWORDCHAR_utf8 , isWORDCHAR_LC , isWORDCHAR_LC_uvchr , and isWORDCHAR_LC_utf8 .

    1. bool isWORDCHAR(char ch)
  • isXDIGIT

    Returns a boolean indicating whether the specified character is a hexadecimal digit. In the ASCII range these are [0-9A-Fa-f] . Variants isXDIGIT_A() and isXDIGIT_L1() are identical to isXDIGIT() . See the top of this section for an explanation of variants isXDIGIT_uni , isXDIGIT_utf8 , isXDIGIT_LC , isXDIGIT_LC_uvchr , and isXDIGIT_LC_utf8 .

    1. bool isXDIGIT(char ch)

Cloning an interpreter

  • perl_clone

    Create and return a new interpreter by cloning the current one.

    perl_clone takes these flags as parameters:

    CLONEf_COPY_STACKS - copies the stacks as well. Without it, only the data is cloned and the stacks are zeroed; with it, the stacks are copied and the new perl interpreter is ready to run at the exact same point as the previous one. The pseudo-fork code uses COPY_STACKS, while threads->create doesn't.

    CLONEf_KEEP_PTR_TABLE - perl_clone keeps a ptr_table with the pointer of the old variable as a key and the new variable as a value. This allows it to check whether something has already been cloned and, if so, to use the existing value and increase the refcount rather than clone it again. If KEEP_PTR_TABLE is not set, perl_clone frees the ptr_table using ptr_table_free(PL_ptr_table); PL_ptr_table = NULL; . The reason to keep it around is if you want to dup some of your own variables that are outside the graph perl scans; an example of this code is in threads.xs create.

    CLONEf_CLONE_HOST - This is a win32 thing; it is ignored on unix. It tells Perl's win32host code (which is C++) to clone itself. This is needed on win32 if you want to run two threads at the same time. If you just want to do some stuff in a separate perl interpreter and then throw it away and return to the original one, you don't need to do anything.

    1. PerlInterpreter* perl_clone(
    2. PerlInterpreter *proto_perl,
    3. UV flags
    4. )

Compile-time scope hooks

  • BhkDISABLE

    Temporarily disable an entry in this BHK structure, by clearing the appropriate flag. The which argument is a preprocessor token indicating which entry to disable.

    NOTE: this function is experimental and may change or be removed without notice.

    1. void BhkDISABLE(BHK *hk, which)
  • BhkENABLE

    Re-enable an entry in this BHK structure, by setting the appropriate flag. The which argument is a preprocessor token indicating which entry to enable. This will assert (under -DDEBUGGING) if the entry doesn't contain a valid pointer.

    NOTE: this function is experimental and may change or be removed without notice.

    1. void BhkENABLE(BHK *hk, which)
  • BhkENTRY_set

    Set an entry in the BHK structure, and set the flags to indicate it is valid. The which argument is a preprocessing token indicating which entry to set. The type of ptr depends on the entry.

    NOTE: this function is experimental and may change or be removed without notice.

    1. void BhkENTRY_set(BHK *hk, which, void *ptr)
  • blockhook_register

    Register a set of hooks to be called when the Perl lexical scope changes at compile time. See Compile-time scope hooks in perlguts.

    NOTE: this function is experimental and may change or be removed without notice.

    NOTE: this function must be explicitly called as Perl_blockhook_register with an aTHX_ parameter.

    1. void Perl_blockhook_register(pTHX_ BHK *hk)

COP Hint Hashes

  • cophh_2hv

    Generates and returns a standard Perl hash representing the full set of key/value pairs in the cop hints hash cophh. flags is currently unused and must be zero.

    NOTE: this function is experimental and may change or be removed without notice.

    1. HV * cophh_2hv(const COPHH *cophh, U32 flags)
  • cophh_copy

    Make and return a complete copy of the cop hints hash cophh.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_copy(COPHH *cophh)
  • cophh_delete_pv

    Like cophh_delete_pvn, but takes a nul-terminated string instead of a string/length pair.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_delete_pv(const COPHH *cophh,
    2. const char *key, U32 hash,
    3. U32 flags)
  • cophh_delete_pvn

    Deletes a key and its associated value from the cop hints hash cophh, and returns the modified hash. The returned hash pointer is in general not the same as the hash pointer that was passed in. The input hash is consumed by the function, and the pointer to it must not be subsequently used. Use cophh_copy if you need both hashes.

    The key is specified by keypv and keylen. If flags has the COPHH_KEY_UTF8 bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. hash is a precomputed hash of the key string, or zero if it has not been precomputed.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_delete_pvn(COPHH *cophh,
    2. const char *keypv,
    3. STRLEN keylen, U32 hash,
    4. U32 flags)
  • cophh_delete_pvs

    Like cophh_delete_pvn, but takes a literal string instead of a string/length pair, and no precomputed hash.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_delete_pvs(const COPHH *cophh,
    2. const char *key, U32 flags)
  • cophh_delete_sv

    Like cophh_delete_pvn, but takes a Perl scalar instead of a string/length pair.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_delete_sv(const COPHH *cophh, SV *key,
    2. U32 hash, U32 flags)
  • cophh_fetch_pv

    Like cophh_fetch_pvn, but takes a nul-terminated string instead of a string/length pair.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV * cophh_fetch_pv(const COPHH *cophh,
    2. const char *key, U32 hash,
    3. U32 flags)
  • cophh_fetch_pvn

    Look up the entry in the cop hints hash cophh with the key specified by keypv and keylen. If flags has the COPHH_KEY_UTF8 bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. hash is a precomputed hash of the key string, or zero if it has not been precomputed. Returns a mortal scalar copy of the value associated with the key, or &PL_sv_placeholder if there is no value associated with the key.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV * cophh_fetch_pvn(const COPHH *cophh,
    2. const char *keypv,
    3. STRLEN keylen, U32 hash,
    4. U32 flags)
  • cophh_fetch_pvs

    Like cophh_fetch_pvn, but takes a literal string instead of a string/length pair, and no precomputed hash.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV * cophh_fetch_pvs(const COPHH *cophh,
    2. const char *key, U32 flags)
  • cophh_fetch_sv

    Like cophh_fetch_pvn, but takes a Perl scalar instead of a string/length pair.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV * cophh_fetch_sv(const COPHH *cophh, SV *key,
    2. U32 hash, U32 flags)
  • cophh_free

    Discard the cop hints hash cophh, freeing all resources associated with it.

    NOTE: this function is experimental and may change or be removed without notice.

    1. void cophh_free(COPHH *cophh)
  • cophh_new_empty

    Generate and return a fresh cop hints hash containing no entries.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_new_empty()
  • cophh_store_pv

    Like cophh_store_pvn, but takes a nul-terminated string instead of a string/length pair.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_store_pv(const COPHH *cophh,
    2. const char *key, U32 hash,
    3. SV *value, U32 flags)
  • cophh_store_pvn

    Stores a value, associated with a key, in the cop hints hash cophh, and returns the modified hash. The returned hash pointer is in general not the same as the hash pointer that was passed in. The input hash is consumed by the function, and the pointer to it must not be subsequently used. Use cophh_copy if you need both hashes.

    The key is specified by keypv and keylen. If flags has the COPHH_KEY_UTF8 bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. hash is a precomputed hash of the key string, or zero if it has not been precomputed.

    value is the scalar value to store for this key. value is copied by this function, which thus does not take ownership of any reference to it, and later changes to the scalar will not be reflected in the value visible in the cop hints hash. Complex types of scalar will not be stored with referential integrity, but will be coerced to strings.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_store_pvn(COPHH *cophh, const char *keypv,
    2. STRLEN keylen, U32 hash,
    3. SV *value, U32 flags)
  • cophh_store_pvs

    Like cophh_store_pvn, but takes a literal string instead of a string/length pair, and no precomputed hash.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_store_pvs(const COPHH *cophh,
    2. const char *key, SV *value,
    3. U32 flags)
  • cophh_store_sv

    Like cophh_store_pvn, but takes a Perl scalar instead of a string/length pair.

    NOTE: this function is experimental and may change or be removed without notice.

    1. COPHH * cophh_store_sv(const COPHH *cophh, SV *key,
    2. U32 hash, SV *value, U32 flags)

COP Hint Reading

  • cop_hints_2hv

    Generates and returns a standard Perl hash representing the full set of hint entries in the cop cop. flags is currently unused and must be zero.

    1. HV * cop_hints_2hv(const COP *cop, U32 flags)
  • cop_hints_fetch_pv

    Like cop_hints_fetch_pvn, but takes a nul-terminated string instead of a string/length pair.

    1. SV * cop_hints_fetch_pv(const COP *cop,
    2. const char *key, U32 hash,
    3. U32 flags)
  • cop_hints_fetch_pvn

    Look up the hint entry in the cop cop with the key specified by keypv and keylen. If flags has the COPHH_KEY_UTF8 bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. hash is a precomputed hash of the key string, or zero if it has not been precomputed. Returns a mortal scalar copy of the value associated with the key, or &PL_sv_placeholder if there is no value associated with the key.

    1. SV * cop_hints_fetch_pvn(const COP *cop,
    2. const char *keypv,
    3. STRLEN keylen, U32 hash,
    4. U32 flags)
  • cop_hints_fetch_pvs

    Like cop_hints_fetch_pvn, but takes a literal string instead of a string/length pair, and no precomputed hash.

    1. SV * cop_hints_fetch_pvs(const COP *cop,
    2. const char *key, U32 flags)
  • cop_hints_fetch_sv

    Like cop_hints_fetch_pvn, but takes a Perl scalar instead of a string/length pair.

    1. SV * cop_hints_fetch_sv(const COP *cop, SV *key,
    2. U32 hash, U32 flags)

Custom Operators

  • custom_op_register

    Register a custom op. See Custom Operators in perlguts.

    NOTE: this function must be explicitly called as Perl_custom_op_register with an aTHX_ parameter.

    1. void Perl_custom_op_register(pTHX_
    2. Perl_ppaddr_t ppaddr,
    3. const XOP *xop)
  • custom_op_xop

    Return the XOP structure for a given custom op. This function should be considered internal to OP_NAME and the other access macros: use them instead.

    NOTE: this function must be explicitly called as Perl_custom_op_xop with an aTHX_ parameter.

    1. const XOP * Perl_custom_op_xop(pTHX_ const OP *o)
  • XopDISABLE

    Temporarily disable a member of the XOP, by clearing the appropriate flag.

    1. void XopDISABLE(XOP *xop, which)
  • XopENABLE

    Reenable a member of the XOP which has been disabled.

    1. void XopENABLE(XOP *xop, which)
  • XopENTRY

    Return a member of the XOP structure. The which argument is a cpp token indicating which entry to return. If the member is not set, this will return a default value. The return type depends on which.

    1. XopENTRY(XOP *xop, which)
  • XopENTRY_set

    Set a member of the XOP structure. The which argument is a cpp token indicating which entry to set. See Custom Operators in perlguts for details about the available members and how they are used.

    1. void XopENTRY_set(XOP *xop, which, value)
  • XopFLAGS

    Return the XOP's flags.

    1. U32 XopFLAGS(XOP *xop)

CV Manipulation Functions

  • CvSTASH

    Returns the stash of the CV. A stash is the symbol table hash, containing the package-scoped variables in the package where the subroutine was defined. For more information, see perlguts.

    This also has a special use with XS AUTOLOAD subs. See Autoloading with XSUBs in perlguts.

    1. HV* CvSTASH(CV* cv)
  • get_cv

    Uses strlen to get the length of name , then calls get_cvn_flags .

    NOTE: the perl_ form of this function is deprecated.

    1. CV* get_cv(const char* name, I32 flags)
  • get_cvn_flags

    Returns the CV of the specified Perl subroutine. flags are passed to gv_fetchpvn_flags . If GV_ADD is set and the Perl subroutine does not exist then it will be declared (which has the same effect as saying sub name; ). If GV_ADD is not set and the subroutine does not exist then NULL is returned.

    NOTE: the perl_ form of this function is deprecated.

        CV* get_cvn_flags(const char* name, STRLEN len,
                          I32 flags)
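
The relationship between get_cv and get_cvn_flags (measure the name with strlen, then defer to the length-taking form, which either finds, declares, or returns NULL) can be sketched without linking against perl. Everything below is a hypothetical model: MySub, my_get_sub, my_get_subn and MY_ADD merely stand in for CV, get_cv, get_cvn_flags and GV_ADD, and a small fixed array replaces the symbol table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MY_ADD      0x01   /* hypothetical counterpart of GV_ADD */
#define MY_MAX_SUBS 8
#define MY_MAX_NAME 32

typedef struct { char name[MY_MAX_NAME]; } MySub;

static MySub  my_subs[MY_MAX_SUBS];   /* toy "symbol table" */
static size_t my_nsubs;

/* Length-taking form: find a sub by name, declaring it only if MY_ADD is set. */
static MySub *my_get_subn(const char *name, size_t len, int flags)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < my_nsubs; i++) {
        if (strlen(my_subs[i].name) == len
            && memcmp(my_subs[i].name, name, len) == 0)
            return &my_subs[i];
    }
    /* Not found: without MY_ADD (or without room) the answer is NULL. */
    if (!(flags & MY_ADD) || my_nsubs == MY_MAX_SUBS || len >= MY_MAX_NAME)
        return NULL;
    memcpy(my_subs[my_nsubs].name, name, len);
    my_subs[my_nsubs].name[len] = '\0';
    return &my_subs[my_nsubs++];
}

/* NUL-terminated form: measure the name, then call the length-taking form. */
static MySub *my_get_sub(const char *name, int flags)
{
    return my_get_subn(name, strlen(name), flags);
}
```

A lookup with flags of 0 returns NULL until the same name has been requested once with MY_ADD; after that both forms return the same entry.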

Embedding Functions

  • cv_clone

    Clone a CV, making a lexical closure. proto supplies the prototype of the function: its code, pad structure, and other attributes. The prototype is combined with a capture of outer lexicals to which the code refers, which are taken from the currently-executing instance of the immediately surrounding code.

        CV * cv_clone(CV *proto)
  • cv_undef

    Clear out all the active components of a CV. This can happen either by an explicit undef &foo, or by the reference count going to zero. In the former case, we keep the CvOUTSIDE pointer, so that any anonymous children can still follow the full lexical scope chain.

        void cv_undef(CV* cv)
  • find_rundefsv

    Find and return the variable that is named $_ in the lexical scope of the currently-executing function. This may be a lexical $_, or will otherwise be the global one.

        SV * find_rundefsv()
  • find_rundefsvoffset

    Find the position of the lexical $_ in the pad of the currently-executing function. Returns the offset in the current pad, or NOT_IN_PAD if there is no lexical $_ in scope (in which case the global one should be used instead). find_rundefsv is likely to be more convenient.

    NOTE: the perl_ form of this function is deprecated.

        PADOFFSET find_rundefsvoffset()
  • load_module

    Loads the module whose name is pointed to by the string part of name. Note that the actual module name, not its filename, should be given, e.g. "Foo::Bar" instead of "Foo/Bar.pm". flags can be any of PERL_LOADMOD_DENY, PERL_LOADMOD_NOIMPORT, or PERL_LOADMOD_IMPORT_OPS (or 0 for no flags). ver, if specified and not NULL, provides version semantics similar to use Foo::Bar VERSION. The optional trailing SV* arguments can be used to specify arguments to the module's import() method, similar to use Foo::Bar VERSION LIST. They must be terminated with a final NULL pointer. Note that this list can only be omitted when the PERL_LOADMOD_NOIMPORT flag has been used. Otherwise at least a single NULL pointer to designate the default import list is required.

    The reference count for each specified SV* parameter is decremented.

        void load_module(U32 flags, SV* name, SV* ver, ...)
  • nothreadhook

    Stub that provides thread hook for perl_destruct when there are no threads.

        int nothreadhook()
  • pad_add_anon

    Allocates a place in the currently-compiling pad (via pad_alloc) for an anonymous function that is lexically scoped inside the currently-compiling function. The function func is linked into the pad, and its CvOUTSIDE link to the outer scope is weakened to avoid a reference loop.

    One reference count is stolen, so you may need to do SvREFCNT_inc(func).

    optype should be an opcode indicating the type of operation that the pad entry is to support. This doesn't affect operational semantics, but is used for debugging.

        PADOFFSET pad_add_anon(CV *func, I32 optype)
  • pad_add_name_pv

    Exactly like pad_add_name_pvn, but takes a nul-terminated string instead of a string/length pair.

        PADOFFSET pad_add_name_pv(const char *name, U32 flags,
                                  HV *typestash, HV *ourstash)
  • pad_add_name_pvn

    Allocates a place in the currently-compiling pad for a named lexical variable. Stores the name and other metadata in the name part of the pad, and makes preparations to manage the variable's lexical scoping. Returns the offset of the allocated pad slot.

    namepv/namelen specify the variable's name, including leading sigil. If typestash is non-null, the name is for a typed lexical, and this identifies the type. If ourstash is non-null, it's a lexical reference to a package variable, and this identifies the package. The following flags can be OR'ed together:

        padadd_OUR           redundantly specifies if it's a package var
        padadd_STATE         variable will retain value persistently
        padadd_NO_DUP_CHECK  skip check for lexical shadowing

        PADOFFSET pad_add_name_pvn(const char *namepv,
                                   STRLEN namelen, U32 flags,
                                   HV *typestash, HV *ourstash)
  • pad_add_name_sv

    Exactly like pad_add_name_pvn, but takes the name string in the form of an SV instead of a string/length pair.

        PADOFFSET pad_add_name_sv(SV *name, U32 flags,
                                  HV *typestash, HV *ourstash)
  • pad_alloc

    Allocates a place in the currently-compiling pad, returning the offset of the allocated pad slot. No name is initially attached to the pad slot. tmptype is a set of flags indicating the kind of pad entry required, which will be set in the value SV for the allocated pad entry:

        SVs_PADMY   named lexical variable ("my", "our", "state")
        SVs_PADTMP  unnamed temporary store

    optype should be an opcode indicating the type of operation that the pad entry is to support. This doesn't affect operational semantics, but is used for debugging.

    NOTE: this function is experimental and may change or be removed without notice.

        PADOFFSET pad_alloc(I32 optype, U32 tmptype)
  • pad_compname_type

    Looks up the type of the lexical variable at position po in the currently-compiling pad. If the variable is typed, the stash of the class to which it is typed is returned. If not, NULL is returned.

        HV * pad_compname_type(PADOFFSET po)
  • pad_findmy_pv

    Exactly like pad_findmy_pvn, but takes a nul-terminated string instead of a string/length pair.

        PADOFFSET pad_findmy_pv(const char *name, U32 flags)
  • pad_findmy_pvn

    Given the name of a lexical variable, find its position in the currently-compiling pad. namepv/namelen specify the variable's name, including leading sigil. flags is reserved and must be zero. If it is not in the current pad but appears in the pad of any lexically enclosing scope, then a pseudo-entry for it is added in the current pad. Returns the offset in the current pad, or NOT_IN_PAD if no such lexical is in scope.

        PADOFFSET pad_findmy_pvn(const char *namepv,
                                 STRLEN namelen, U32 flags)
  • pad_findmy_sv

    Exactly like pad_findmy_pvn, but takes the name string in the form of an SV instead of a string/length pair.

        PADOFFSET pad_findmy_sv(SV *name, U32 flags)
  • pad_setsv

    Set the value at offset po in the current (compiling or executing) pad. Use the macro PAD_SETSV() rather than calling this function directly.

        void pad_setsv(PADOFFSET po, SV *sv)
  • pad_sv

    Get the value at offset po in the current (compiling or executing) pad. Use the macro PAD_SV instead of calling this function directly.

        SV * pad_sv(PADOFFSET po)
  • pad_tidy

    Tidy up a pad at the end of compilation of the code to which it belongs. Jobs performed here are: remove most stuff from the pads of anonsub prototypes; give it a @_; mark temporaries as such. type indicates the kind of subroutine:

        padtidy_SUB       ordinary subroutine
        padtidy_SUBCLONE  prototype for lexical closure
        padtidy_FORMAT    format

    NOTE: this function is experimental and may change or be removed without notice.

        void pad_tidy(padtidy_type type)
  • perl_alloc

    Allocates a new Perl interpreter. See perlembed.

        PerlInterpreter* perl_alloc()
  • perl_construct

    Initializes a new Perl interpreter. See perlembed.

        void perl_construct(PerlInterpreter *my_perl)
  • perl_destruct

    Shuts down a Perl interpreter. See perlembed.

        int perl_destruct(PerlInterpreter *my_perl)
  • perl_free

    Releases a Perl interpreter. See perlembed.

        void perl_free(PerlInterpreter *my_perl)
  • perl_parse

    Tells a Perl interpreter to parse a Perl script. See perlembed.

        int perl_parse(PerlInterpreter *my_perl,
                       XSINIT_t xsinit, int argc,
                       char** argv, char** env)
  • perl_run

    Tells a Perl interpreter to run. See perlembed.

        int perl_run(PerlInterpreter *my_perl)
  • require_pv

    Tells Perl to require the file named by the string argument. It is analogous to the Perl code eval "require '$file'". It's even implemented that way; consider using load_module instead.

    NOTE: the perl_ form of this function is deprecated.

        void require_pv(const char* pv)
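
The rule that load_module's trailing argument list must end in a NULL pointer, and may be left out only with PERL_LOADMOD_NOIMPORT, follows from how a C variadic function finds the end of its arguments. The sketch below is a stand-alone model, not perl's implementation: my_count_imports and MY_LOADMOD_NOIMPORT are hypothetical names, the flag value is chosen arbitrarily, and plain strings stand in for SV* arguments.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

#define MY_LOADMOD_NOIMPORT 0x2u   /* hypothetical flag value for this model */

/* Counts import arguments; the variadic list is read only when imports are wanted. */
static int my_count_imports(unsigned flags, const char *name, ...)
{
    va_list ap;
    int n = 0;

    assert(name != NULL);
    if (flags & MY_LOADMOD_NOIMPORT)
        return 0;               /* list is never touched, so it may be omitted */

    va_start(ap, name);
    while (va_arg(ap, const char *) != NULL)
        n++;                    /* only the terminating NULL stops the walk */
    va_end(ap);
    return n;
}
```

A call such as my_count_imports(0, "Foo::Bar", "a", "b", (const char *)NULL) walks two arguments and stops at the terminator. Without MY_LOADMOD_NOIMPORT, leaving the NULL out would make the loop read past the supplied arguments, which is undefined behaviour in C.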

Functions in file dump.c

  • pv_display

    Similar to

        pv_escape(dsv,pv,cur,pvlim,PERL_PV_ESCAPE_QUOTE);

    except that an additional "\0" will be appended to the string when len > cur and pv[cur] is "\0".

    Note that the final string may be up to 7 chars longer than pvlim.

        char* pv_display(SV *dsv, const char *pv, STRLEN cur,
                         STRLEN len, STRLEN pvlim)
  • pv_escape

    Escapes at most the first "count" chars of pv and puts the results into dsv such that the size of the escaped string will not exceed "max" chars and will not contain any incomplete escape sequences.

    If flags contains PERL_PV_ESCAPE_QUOTE then any double quotes in the string will also be escaped.

    Normally the SV will be cleared before the escaped string is prepared, but when PERL_PV_ESCAPE_NOCLEAR is set this will not occur.

    If PERL_PV_ESCAPE_UNI is set then the input string is treated as Unicode; if PERL_PV_ESCAPE_UNI_DETECT is set then the input string is scanned using is_utf8_string() to determine if it is Unicode.

    If PERL_PV_ESCAPE_ALL is set then all input chars will be output using \x01F1 style escapes; otherwise, if PERL_PV_ESCAPE_NONASCII is set, only chars above 127 will be escaped using this style; otherwise, only chars above 255 will be so escaped. Other non-printable chars will use octal or common escaped patterns like \n. Otherwise, if PERL_PV_ESCAPE_NOBACKSLASH is set then all chars below 255 will be treated as printable and will be output as literals.

    If PERL_PV_ESCAPE_FIRSTCHAR is set then only the first char of the string will be escaped, regardless of max. If the output is to be in hex, then it will be returned as a plain hex sequence. Thus the output will either be a single char, an octal escape sequence, a special escape like \n or a hex value.

    If PERL_PV_ESCAPE_RE is set then the escape char used will be a '%' and not a '\\'. This is because regexes very often contain backslashed sequences, whereas '%' is not a particularly common character in patterns.

    Returns a pointer to the escaped text as held by dsv.

        char* pv_escape(SV *dsv, char const * const str,
                        const STRLEN count, const STRLEN max,
                        STRLEN * const escaped,
                        const U32 flags)
  • pv_pretty

    Converts a string into something presentable, handling escaping via pv_escape() and supporting quoting and ellipses.

    If the PERL_PV_PRETTY_QUOTE flag is set then the result will be double quoted with any double quotes in the string escaped. Otherwise if the PERL_PV_PRETTY_LTGT flag is set then the result will be wrapped in angle brackets.

    If the PERL_PV_PRETTY_ELLIPSES flag is set and not all characters in string were output then an ellipsis ... will be appended to the string. Note that this happens AFTER it has been quoted.

    If start_color is non-null then it will be inserted after the opening quote (if there is one) but before the escaped text. If end_color is non-null then it will be inserted after the escaped text but before any quotes or ellipses.

    Returns a pointer to the prettified text as held by dsv.

        char* pv_pretty(SV *dsv, char const * const str,
                        const STRLEN count, const STRLEN max,
                        char const * const start_color,
                        char const * const end_color,
                        const U32 flags)
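
The guarantee that pv_escape never emits an incomplete escape sequence when it hits the max limit can be illustrated with a much simplified stand-alone model. This is only a sketch under stated assumptions: my_escape and my_escape_one are hypothetical names, output goes to a plain char buffer rather than an SV, no flags are supported, and every non-printable byte other than newline is written as a three-digit octal escape.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the escaped form of one byte into chunk; returns its length. */
static size_t my_escape_one(unsigned char c, char *chunk, size_t chunksize)
{
    if (c == '\n')
        return (size_t)snprintf(chunk, chunksize, "\\n");
    if (c == '\\')
        return (size_t)snprintf(chunk, chunksize, "\\\\");
    if (c >= 0x20 && c < 0x7F)
        return (size_t)snprintf(chunk, chunksize, "%c", c);
    return (size_t)snprintf(chunk, chunksize, "\\%03o", (unsigned)c);
}

/* Escapes up to count bytes of str into dst without exceeding max output
 * chars. Returns the number of input bytes that were consumed. */
static size_t my_escape(char *dst, size_t dstsize, const char *str,
                        size_t count, size_t max)
{
    size_t in, out = 0;

    assert(dst != NULL && str != NULL && max < dstsize);
    for (in = 0; in < count; in++) {
        char chunk[8];
        size_t len = my_escape_one((unsigned char)str[in], chunk, sizeof chunk);

        if (out + len > max)
            break;              /* the whole chunk would not fit: stop before it */
        memcpy(dst + out, chunk, len);
        out += len;
    }
    dst[out] = '\0';
    return in;
}
```

Because each byte is escaped into a temporary chunk first and copied only if the whole chunk fits, a limit that falls in the middle of an escape simply ends the output one byte earlier. The return value plays a role similar to the escaped out-parameter of pv_escape, reporting how much input was used.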

Functions in file mathoms.c

  • custom_op_desc

    Return the description of a given custom op. This was once used by the OP_DESC macro, but is no longer: it has only been kept for compatibility, and should not be used.

        const char * custom_op_desc(const OP *o)
  • custom_op_name

    Return the name for a given custom op. This was once used by the OP_NAME macro, but is no longer: it has only been kept for compatibility, and should not be used.

        const char * custom_op_name(const OP *o)
  • gv_fetchmethod

    See gv_fetchmethod_autoload.

        GV* gv_fetchmethod(HV* stash, const char* name)
  • pack_cat

    The engine implementing the pack() Perl function. Note: parameters next_in_list and flags are not used. This call should not be used; use packlist instead.

        void pack_cat(SV *cat, const char *pat,
                      const char *patend, SV **beglist,
                      SV **endlist, SV ***next_in_list,
                      U32 flags)
  • sv_2pvbyte_nolen

    Return a pointer to the byte-encoded representation of the SV. May cause the SV to be downgraded from UTF-8 as a side-effect.

    Usually accessed via the SvPVbyte_nolen macro.

        char* sv_2pvbyte_nolen(SV* sv)
  • sv_2pvutf8_nolen

    Return a pointer to the UTF-8-encoded representation of the SV. May cause the SV to be upgraded to UTF-8 as a side-effect.

    Usually accessed via the SvPVutf8_nolen macro.

        char* sv_2pvutf8_nolen(SV* sv)
  • sv_2pv_nolen

    Like sv_2pv(), but doesn't return the length too. You should usually use the macro wrapper SvPV_nolen(sv) instead.

        char* sv_2pv_nolen(SV* sv)
  • sv_catpvn_mg

    Like sv_catpvn, but also handles 'set' magic.

        void sv_catpvn_mg(SV *sv, const char *ptr,
                          STRLEN len)
  • sv_catsv_mg

    Like sv_catsv, but also handles 'set' magic.

        void sv_catsv_mg(SV *dsv, SV *ssv)
  • sv_force_normal

    Undo various types of fakery on an SV: if the PV is a shared string, make a private copy; if we're a ref, stop refing; if we're a glob, downgrade to an xpvmg. See also sv_force_normal_flags.

        void sv_force_normal(SV *sv)
  • sv_iv

    A private implementation of the SvIVx macro for compilers which can't cope with complex macro expressions. Always use the macro instead.

        IV sv_iv(SV* sv)
  • sv_nolocking

    Dummy routine which "locks" an SV when there is no locking module present. Exists to avoid a test for a NULL function pointer and because it could potentially warn under some level of strictness.

    "Superseded" by sv_nosharing().

        void sv_nolocking(SV *sv)
  • sv_nounlocking

    Dummy routine which "unlocks" an SV when there is no locking module present. Exists to avoid a test for a NULL function pointer and because it could potentially warn under some level of strictness.

    "Superseded" by sv_nosharing().

        void sv_nounlocking(SV *sv)
  • sv_nv

    A private implementation of the SvNVx macro for compilers which can't cope with complex macro expressions. Always use the macro instead.

        NV sv_nv(SV* sv)
  • sv_pv

    Use the SvPV_nolen macro instead.

        char* sv_pv(SV *sv)
  • sv_pvbyte

    Use SvPVbyte_nolen instead.

        char* sv_pvbyte(SV *sv)
  • sv_pvbyten

    A private implementation of the SvPVbyte macro for compilers which can't cope with complex macro expressions. Always use the macro instead.

        char* sv_pvbyten(SV *sv, STRLEN *lp)
  • sv_pvn

    A private implementation of the SvPV macro for compilers which can't cope with complex macro expressions. Always use the macro instead.

        char* sv_pvn(SV *sv, STRLEN *lp)
  • sv_pvutf8

    Use the SvPVutf8_nolen macro instead.

        char* sv_pvutf8(SV *sv)
  • sv_pvutf8n

    A private implementation of the SvPVutf8 macro for compilers which can't cope with complex macro expressions. Always use the macro instead.

        char* sv_pvutf8n(SV *sv, STRLEN *lp)
  • sv_taint

    Taint an SV. Use SvTAINTED_on instead.

        void sv_taint(SV* sv)
  • sv_unref

    Unsets the RV status of the SV, and decrements the reference count of whatever was being referenced by the RV. This can almost be thought of as a reversal of newSVrv. This is sv_unref_flags with the flag being zero. See SvROK_off.

        void sv_unref(SV* sv)
  • sv_usepvn

    Tells an SV to use ptr to find its string value. Implemented by calling sv_usepvn_flags with flags of 0, hence does not handle 'set' magic. See sv_usepvn_flags.

        void sv_usepvn(SV* sv, char* ptr, STRLEN len)
  • sv_usepvn_mg

    Like sv_usepvn, but also handles 'set' magic.

        void sv_usepvn_mg(SV *sv, char *ptr, STRLEN len)
  • sv_uv

    A private implementation of the SvUVx macro for compilers which can't cope with complex macro expressions. Always use the macro instead.

        UV sv_uv(SV* sv)
  • unpack_str

    The engine implementing the unpack() Perl function. Note: parameters strbeg, new_s and ocnt are not used. This call should not be used; use unpackstring instead.

        I32 unpack_str(const char *pat, const char *patend,
                       const char *s, const char *strbeg,
                       const char *strend, char **new_s,
                       I32 ocnt, U32 flags)

Functions in file op.c

  • alloccopstash

    Available only under threaded builds, this function allocates an entry in PL_stashpad for the stash passed to it.

    NOTE: this function is experimental and may change or be removed without notice.

        PADOFFSET alloccopstash(HV *hv)
  • op_contextualize

    Applies a syntactic context to an op tree representing an expression. o is the op tree, and context must be G_SCALAR, G_ARRAY, or G_VOID to specify the context to apply. The modified op tree is returned.

        OP * op_contextualize(OP *o, I32 context)

Functions in file perl.h

  • PERL_SYS_INIT

    Provides system-specific tune up of the C runtime environment necessary to run Perl interpreters. This should be called only once, before creating any Perl interpreters.

        void PERL_SYS_INIT(int *argc, char*** argv)
  • PERL_SYS_INIT3

    Provides system-specific tune up of the C runtime environment necessary to run Perl interpreters. This should be called only once, before creating any Perl interpreters.

        void PERL_SYS_INIT3(int *argc, char*** argv,
                            char*** env)
  • PERL_SYS_TERM

    Provides system-specific clean up of the C runtime environment after running Perl interpreters. This should be called only once, after freeing any remaining Perl interpreters.

        void PERL_SYS_TERM()

Functions in file pp_ctl.c

  • caller_cx

    The XSUB-writer's equivalent of caller. The returned PERL_CONTEXT structure can be interrogated to find all the information returned to Perl by caller. Note that XSUBs don't get a stack frame, so caller_cx(0, NULL) will return information for the immediately-surrounding Perl code.

    This function skips over the automatic calls to &DB::sub made on the behalf of the debugger. If the stack frame requested was a sub called by DB::sub, the return value will be the frame for the call to DB::sub, since that has the correct line number/etc. for the call site. If dbcxp is non-NULL, it will be set to a pointer to the frame for the sub call itself.

        const PERL_CONTEXT * caller_cx(
            I32 level,
            const PERL_CONTEXT **dbcxp
        )
  • find_runcv

    Locate the CV corresponding to the currently executing sub or eval. If db_seqp is non-null, skip CVs that are in the DB package and populate *db_seqp with the cop sequence number at the point that the DB:: code was entered. This allows debuggers to eval in the scope of the breakpoint rather than in the scope of the debugger itself.

        CV* find_runcv(U32 *db_seqp)

Functions in file pp_pack.c

  • packlist

    The engine implementing the pack() Perl function.

        void packlist(SV *cat, const char *pat,
                      const char *patend, SV **beglist,
                      SV **endlist)
  • unpackstring

    The engine implementing the unpack() Perl function.

    Using the template pat..patend, this function unpacks the string s..strend into a number of mortal SVs, which it pushes onto the perl argument (@_) stack (so you will need to issue a PUTBACK before and SPAGAIN after the call to this function). It returns the number of pushed elements.

    The strend and patend pointers should point to the byte following the last character of each string.

    Although this function returns its values on the perl argument stack, it doesn't take any parameters from that stack (and thus in particular there's no need to do a PUSHMARK before calling it, unlike call_pv for example).

        I32 unpackstring(const char *pat,
                         const char *patend, const char *s,
                         const char *strend, U32 flags)

Functions in file pp_sys.c

  • setdefout

    Sets PL_defoutgv, the default file handle for output, to the passed in typeglob. As PL_defoutgv "owns" a reference on its typeglob, the reference count of the passed in typeglob is increased by one, and the reference count of the typeglob that PL_defoutgv points to is decreased by one.

        void setdefout(GV* gv)

Functions in file utf8.h

  • ibcmp_utf8

    This is a synonym for (! foldEQ_utf8())

        I32 ibcmp_utf8(const char *s1, char **pe1, UV l1,
                       bool u1, const char *s2, char **pe2,
                       UV l2, bool u2)

Functions in file util.h

  • ibcmp

    This is a synonym for (! foldEQ())

        I32 ibcmp(const char* a, const char* b, I32 len)
  • ibcmp_locale

    This is a synonym for (! foldEQ_locale())

        I32 ibcmp_locale(const char* a, const char* b,
                         I32 len)
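
The ibcmp family has inverted sense relative to foldEQ: the fold-equality test answers "are these the same?", while ibcmp, like a comparison function, answers "do these differ?". The stand-alone sketch below models that relationship only. my_fold_eq and my_ibcmp are hypothetical names, and case folding is approximated with tolower in the default C locale rather than perl's own fold tables.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Model of a fold-equality test: 1 if the first len bytes match ignoring case. */
static int my_fold_eq(const char *a, const char *b, size_t len)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < len; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}

/* The "ibcmp" view of the same test: non-zero means the strings differ. */
static int my_ibcmp(const char *a, const char *b, size_t len)
{
    return !my_fold_eq(a, b, len);
}
```

In this model my_ibcmp("Hello", "hELLO", 5) is zero because the strings fold to the same bytes, whereas any mismatch within the first len bytes makes it non-zero.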

Global Variables

  • PL_check

    Array, indexed by opcode, of functions that will be called for the "check" phase of optree building during compilation of Perl code. For most (but not all) types of op, once the op has been initially built and populated with child ops it will be filtered through the check function referenced by the appropriate element of this array. The new op is passed in as the sole argument to the check function, and the check function returns the completed op. The check function may (as the name suggests) check the op for validity and signal errors. It may also initialise or modify parts of the ops, or perform more radical surgery such as adding or removing child ops, or even throw the op away and return a different op in its place.

    This array of function pointers is a convenient place to hook into the compilation process. An XS module can put its own custom check function in place of any of the standard ones, to influence the compilation of a particular type of op. However, a custom check function must never fully replace a standard check function (or even a custom check function from another module). A module modifying checking must instead wrap the preexisting check function. A custom check function must be selective about when to apply its custom behaviour. In the usual case where it decides not to do anything special with an op, it must chain the preexisting op function. Check functions are thus linked in a chain, with the core's base checker at the end.

    For thread safety, modules should not write directly to this array. Instead, use the function wrap_op_checker.

  • PL_keyword_plugin

    Function pointer, pointing at a function used to handle extended keywords. The function should be declared as

        int keyword_plugin_function(pTHX_
            char *keyword_ptr, STRLEN keyword_len,
            OP **op_ptr)

    The function is called from the tokeniser, whenever a possible keyword is seen. keyword_ptr points at the word in the parser's input buffer, and keyword_len gives its length; it is not null-terminated. The function is expected to examine the word, and possibly other state such as %^H, to decide whether it wants to handle it as an extended keyword. If it does not, the function should return KEYWORD_PLUGIN_DECLINE, and the normal parser process will continue.

    If the function wants to handle the keyword, it first must parse anything following the keyword that is part of the syntax introduced by the keyword. See Lexer interface for details.

    When a keyword is being handled, the plugin function must build a tree of OP structures, representing the code that was parsed. The root of the tree must be stored in *op_ptr. The function then returns a constant indicating the syntactic role of the construct that it has parsed: KEYWORD_PLUGIN_STMT if it is a complete statement, or KEYWORD_PLUGIN_EXPR if it is an expression. Note that a statement construct cannot be used inside an expression (except via do BLOCK and similar), and an expression is not a complete statement (it requires at least a terminating semicolon).

    When a keyword is handled, the plugin function may also have (compile-time) side effects. It may modify %^H, define functions, and so on. Typically, if side effects are the main purpose of a handler, it does not wish to generate any ops to be included in the normal compilation. In this case it is still required to supply an op tree, but it suffices to generate a single null op.

    That's how the *PL_keyword_plugin function needs to behave overall. Conventionally, however, one does not completely replace the existing handler function. Instead, take a copy of PL_keyword_plugin before assigning your own function pointer to it. Your handler function should look for keywords that it is interested in and handle those. Where it is not interested, it should call the saved plugin function, passing on the arguments it received. Thus PL_keyword_plugin actually points at a chain of handler functions, all of which have an opportunity to handle keywords, and only the last function in the chain (built into the Perl core) will normally return KEYWORD_PLUGIN_DECLINE.

    NOTE: this function is experimental and may change or be removed without notice.
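
The save-then-chain convention described for PL_keyword_plugin (and, in the same spirit, for wrapping check functions) can be sketched in plain C without the perl headers. This is a hypothetical model only: my_keyword_plugin stands in for the global pointer, the MY_PLUGIN_ constants are illustrative result codes, and the OP **op_ptr and pTHX_ parameters of the real signature are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MY_PLUGIN_DECLINE 0   /* hypothetical result codes for this model */
#define MY_PLUGIN_STMT    1

typedef int (*my_plugin_fn)(const char *kw, size_t kw_len);

/* Stand-in for the core's handler at the end of the chain: always declines. */
static int my_core_plugin(const char *kw, size_t kw_len)
{
    (void)kw;
    (void)kw_len;
    return MY_PLUGIN_DECLINE;
}

/* Stand-in for the global PL_keyword_plugin pointer. */
static my_plugin_fn my_keyword_plugin = my_core_plugin;

/* Saved copy of whatever handler was installed before ours. */
static my_plugin_fn my_next_plugin;

/* Handles only "frob"; the keyword is not NUL-terminated, so compare by length. */
static int my_frob_plugin(const char *kw, size_t kw_len)
{
    if (kw_len == 4 && memcmp(kw, "frob", 4) == 0)
        return MY_PLUGIN_STMT;
    return my_next_plugin(kw, kw_len);   /* not ours: pass it down the chain */
}

/* Installs the handler: copy the old pointer first, then overwrite it. */
static void my_install_frob(void)
{
    assert(my_keyword_plugin != my_frob_plugin);   /* never chain to ourselves */
    my_next_plugin = my_keyword_plugin;
    my_keyword_plugin = my_frob_plugin;
}
```

After my_install_frob(), a call through my_keyword_plugin with the four bytes "frob" is handled by the new function, while any other word falls through to the saved handler and is eventually declined. Comparing by length matters because the word sits in the middle of a larger input buffer.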

GV Functions

  • GvAV

    Return the AV from the GV.

        AV* GvAV(GV* gv)
  • GvCV

    Return the CV from the GV.

        CV* GvCV(GV* gv)
  • GvHV

    Return the HV from the GV.

        HV* GvHV(GV* gv)
  • GvSV

    Return the SV from the GV.

        SV* GvSV(GV* gv)
  • gv_const_sv

    If gv is a typeglob whose subroutine entry is a constant sub eligible for inlining, or gv is a placeholder reference that would be promoted to such a typeglob, then returns the value returned by the sub. Otherwise, returns NULL.

        SV* gv_const_sv(GV* gv)
  • gv_fetchmeth

    Like gv_fetchmeth_pvn, but lacks a flags parameter.

        GV* gv_fetchmeth(HV* stash, const char* name,
                         STRLEN len, I32 level)
  • gv_fetchmethod_autoload

    Returns the glob which contains the subroutine to call to invoke the method on the stash. In fact in the presence of autoloading this may be the glob for "AUTOLOAD". In this case the corresponding variable $AUTOLOAD is already set up.

    The third parameter of gv_fetchmethod_autoload determines whether AUTOLOAD lookup is performed if the given method is not present: non-zero means yes, look for AUTOLOAD; zero means no, don't look for AUTOLOAD. Calling gv_fetchmethod is equivalent to calling gv_fetchmethod_autoload with a non-zero autoload parameter.

    These functions grant the "SUPER" token as a prefix of the method name. Note that if you want to keep the returned glob for a long time, you need to check for it being "AUTOLOAD", since at the later time the call may load a different subroutine due to $AUTOLOAD changing its value. Use the glob created via a side effect to do this.

    These functions have the same side-effects as gv_fetchmeth with level==0. name should be writable if it contains ':' or a single quote ('). The warning against passing the GV returned by gv_fetchmeth to call_sv applies equally to these functions.

        GV* gv_fetchmethod_autoload(HV* stash,
                                    const char* name,
                                    I32 autoload)
  • gv_fetchmeth_autoload

    This is the old form of gv_fetchmeth_pvn_autoload, which has no flags parameter.

        GV* gv_fetchmeth_autoload(HV* stash,
                                  const char* name,
                                  STRLEN len, I32 level)
  • gv_fetchmeth_pv

    Exactly like gv_fetchmeth_pvn, but takes a nul-terminated string instead of a string/length pair.

        GV* gv_fetchmeth_pv(HV* stash, const char* name,
                            I32 level, U32 flags)
  • gv_fetchmeth_pvn

    Returns the glob with the given name and a defined subroutine or NULL. The glob lives in the given stash, or in the stashes accessible via @ISA and UNIVERSAL::.

    The argument level should be either 0 or -1. If level==0, as a side-effect creates a glob with the given name in the given stash which in the case of success contains an alias for the subroutine, and sets up caching info for this glob.

    The only significant values for flags are GV_SUPER and SVf_UTF8.

    GV_SUPER indicates that we want to look up the method in the superclasses of the stash.

    The GV returned from gv_fetchmeth may be a method cache entry, which is not visible to Perl code. So when calling call_sv, you should not use the GV directly; instead, you should use the method's CV, which can be obtained from the GV with the GvCV macro.

        GV* gv_fetchmeth_pvn(HV* stash, const char* name,
                             STRLEN len, I32 level,
                             U32 flags)
  • gv_fetchmeth_pvn_autoload

    Same as gv_fetchmeth_pvn(), but looks for autoloaded subroutines too. Returns a glob for the subroutine.

    For an autoloaded subroutine without a GV, will create a GV even if level < 0. For an autoloaded subroutine without a stub, GvCV() of the result may be zero.

    Currently, the only significant value for flags is SVf_UTF8.

        GV* gv_fetchmeth_pvn_autoload(HV* stash,
                                      const char* name,
                                      STRLEN len, I32 level,
                                      U32 flags)
  • gv_fetchmeth_pv_autoload

    Exactly like gv_fetchmeth_pvn_autoload, but takes a nul-terminated string instead of a string/length pair.

        GV* gv_fetchmeth_pv_autoload(HV* stash,
                                     const char* name,
                                     I32 level, U32 flags)
  • gv_fetchmeth_sv

    Exactly like gv_fetchmeth_pvn, but takes the name string in the form of an SV instead of a string/length pair.

        GV* gv_fetchmeth_sv(HV* stash, SV* namesv,
                            I32 level, U32 flags)
  • gv_fetchmeth_sv_autoload

    Exactly like gv_fetchmeth_pvn_autoload, but takes the name string in the form of an SV instead of a string/length pair.

        GV* gv_fetchmeth_sv_autoload(HV* stash, SV* namesv,
                                     I32 level, U32 flags)
  • gv_init

    The old form of gv_init_pvn(). It does not work with UTF8 strings, as it has no flags parameter. If the multi parameter is set, the GV_ADDMULTI flag will be passed to gv_init_pvn().

        void gv_init(GV* gv, HV* stash, const char* name,
                     STRLEN len, int multi)
  • gv_init_pv

    Same as gv_init_pvn(), but takes a nul-terminated string for the name instead of separate char * and length parameters.

        void gv_init_pv(GV* gv, HV* stash, const char* name,
                        U32 flags)
  • gv_init_pvn

    Converts a scalar into a typeglob. This is an incoercible typeglob; assigning a reference to it will assign to one of its slots, instead of overwriting it as happens with typeglobs created by SvSetSV. Converting any scalar that is SvOK() may produce unpredictable results and is reserved for perl's internal use.

    gv is the scalar to be converted.

    stash is the parent stash/package, if any.

    name and len give the name. The name must be unqualified; that is, it must not include the package name. If gv is a stash element, it is the caller's responsibility to ensure that the name passed to this function matches the name of the element. If it does not match, perl's internal bookkeeping will get out of sync.

    flags can be set to SVf_UTF8 if name is a UTF8 string, or the return value of SvUTF8(sv). It can also take the GV_ADDMULTI flag, which means to pretend that the GV has been seen before (i.e., suppress "Used once" warnings).

        void gv_init_pvn(GV* gv, HV* stash, const char* name,
                         STRLEN len, U32 flags)
  • gv_init_sv

    Same as gv_init_pvn(), but takes an SV * for the name instead of separate char * and length parameters. flags is currently unused.

        void gv_init_sv(GV* gv, HV* stash, SV* namesv,
                        U32 flags)
  • gv_stashpv

    Returns a pointer to the stash for a specified package. Uses strlen to determine the length of name, then calls gv_stashpvn().

        HV* gv_stashpv(const char* name, I32 flags)
  • gv_stashpvn

    Returns a pointer to the stash for a specified package. The namelen parameter indicates the length of the name, in bytes. flags is passed to gv_fetchpvn_flags(), so if set to GV_ADD then the package will be created if it does not already exist. If the package does not exist and flags is 0 (or any other setting that does not create packages) then NULL is returned.

    Flags may be one of:

    GV_ADD
    SVf_UTF8
    GV_NOADD_NOINIT
    GV_NOINIT
    GV_NOEXPAND
    GV_ADDMG

    The most important of these are probably GV_ADD and SVf_UTF8.

    HV* gv_stashpvn(const char* name, U32 namelen,
                    I32 flags)
  • gv_stashpvs

    Like gv_stashpvn, but takes a literal string instead of a string/length pair.

    HV* gv_stashpvs(const char* name, I32 create)
  • gv_stashsv

    Returns a pointer to the stash for a specified package. See gv_stashpvn.

    HV* gv_stashsv(SV* sv, I32 flags)

Handy Values

  • Nullav

    Null AV pointer.

    (deprecated - use (AV *)NULL instead)

  • Nullch

    Null character pointer. (No longer available when PERL_CORE is defined.)

  • Nullcv

    Null CV pointer.

    (deprecated - use (CV *)NULL instead)

  • Nullhv

    Null HV pointer.

    (deprecated - use (HV *)NULL instead)

  • Nullsv

    Null SV pointer. (No longer available when PERL_CORE is defined.)

Hash Manipulation Functions

  • cop_fetch_label

    Returns the label attached to a cop. The flags pointer may be set to SVf_UTF8 or 0.

    NOTE: this function is experimental and may change or be removed without notice.

    const char * cop_fetch_label(COP *const cop,
                                 STRLEN *len, U32 *flags)
  • cop_store_label

    Saves a label into a cop_hints_hash. You need to set flags to SVf_UTF8 for a UTF-8 label.

    NOTE: this function is experimental and may change or be removed without notice.

    void cop_store_label(COP *const cop,
                         const char *label, STRLEN len,
                         U32 flags)
  • get_hv

    Returns the HV of the specified Perl hash. flags are passed to gv_fetchpv. If GV_ADD is set and the Perl variable does not exist then it will be created. If flags is zero and the variable does not exist then NULL is returned.

    NOTE: the perl_ form of this function is deprecated.

    HV* get_hv(const char *name, I32 flags)
  • HEf_SVKEY

    This flag, used in the length slot of hash entries and magic structures, specifies the structure contains an SV* pointer where a char* pointer is to be expected. (For information only--not to be used).

  • HeHASH

    Returns the computed hash stored in the hash entry.

    U32 HeHASH(HE* he)
  • HeKEY

    Returns the actual pointer stored in the key slot of the hash entry. The pointer may be either char* or SV*, depending on the value of HeKLEN(). Can be assigned to. The HePV() or HeSVKEY() macros are usually preferable for finding the value of a key.

    void* HeKEY(HE* he)
  • HeKLEN

    If this is negative, and amounts to HEf_SVKEY, it indicates the entry holds an SV* key. Otherwise, it holds the actual length of the key. Can be assigned to. The HePV() macro is usually preferable for finding key lengths.

    STRLEN HeKLEN(HE* he)
  • HePV

    Returns the key slot of the hash entry as a char* value, doing any necessary dereferencing of possibly SV* keys. The length of the string is placed in len (this is a macro, so do not use &len). If you do not care about what the length of the key is, you may use the global variable PL_na, though this is rather less efficient than using a local variable. Remember though, that hash keys in perl are free to contain embedded nulls, so using strlen() or similar is not a good way to find the length of hash keys. This is very similar to the SvPV() macro described elsewhere in this document. See also HeUTF8.

    If you are using HePV to get values to pass to newSVpvn() to create a new SV, you should consider using newSVhek(HeKEY_hek(he)) as it is more efficient.

    char* HePV(HE* he, STRLEN len)
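
The way HeKEY, HeKLEN and HePV fit together can be sketched with a standalone toy model. This is not Perl API code: every name prefixed demo_ or DEMO_ is hypothetical, the sentinel value is an assumption, and the real structures are more involved. The sketch also shows why strlen() under-reports a key that contains an embedded NUL.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: these types and names are illustrative, not the Perl API. */

#define DEMO_SVKEY (-2L)  /* hypothetical sentinel playing the role of HEf_SVKEY */

typedef struct { const char *pv; size_t cur; } demo_sv;   /* string + length */

typedef struct {
    const void *key;  /* points at char data or at a demo_sv, depending on klen */
    long klen;        /* real key length, or DEMO_SVKEY */
} demo_he;

/* Analogue of HePV: return the key bytes and store the length in *len. */
const char *demo_he_pv(const demo_he *he, size_t *len)
{
    if (he->klen == DEMO_SVKEY) {
        const demo_sv *sv = (const demo_sv *)he->key;
        *len = sv->cur;
        return sv->pv;
    }
    *len = (size_t)he->klen;
    return (const char *)he->key;
}

/* Length of the 3-byte key "a\0b" as reported by the entry itself. */
size_t demo_len_from_entry(void)
{
    static const char raw[] = "a\0b";
    demo_he he = { raw, 3 };
    size_t len;
    (void)demo_he_pv(&he, &len);
    return len;
}

/* Length of the same key as (mis)reported by strlen(). */
size_t demo_len_from_strlen(void)
{
    static const char raw[] = "a\0b";
    demo_he he = { raw, 3 };
    size_t len;
    return strlen(demo_he_pv(&he, &len));
}

/* Length when the entry holds a demo_sv key instead of raw bytes. */
size_t demo_len_from_svkey(void)
{
    static const demo_sv sv = { "package", 7 };
    demo_he he = { &sv, DEMO_SVKEY };
    size_t len;
    (void)demo_he_pv(&he, &len);
    return len;
}
```

In this model the stored length stays authoritative for both key forms, which is the reason the text recommends a length-returning accessor over strlen().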
  • HeSVKEY

    Returns the key as an SV*, or NULL if the hash entry does not contain an SV* key.

    SV* HeSVKEY(HE* he)
  • HeSVKEY_force

    Returns the key as an SV*. Will create and return a temporary mortal SV* if the hash entry contains only a char* key.

    SV* HeSVKEY_force(HE* he)
  • HeSVKEY_set

    Sets the key to a given SV*, taking care to set the appropriate flags to indicate the presence of an SV* key, and returns the same SV*.

    SV* HeSVKEY_set(HE* he, SV* sv)
  • HeUTF8

    Returns whether the char * value returned by HePV is encoded in UTF-8, doing any necessary dereferencing of possibly SV* keys. The value returned will be 0 or non-0, not necessarily 1 (or even a value with any low bits set), so do not blindly assign this to a bool variable, as bool may be a typedef for char.

    U32 HeUTF8(HE* he)
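
A minimal sketch of the pitfall described for HeUTF8, written as a standalone toy model rather than Perl API code. The flag value DEMO_UTF8_FLAG, the demo_bool typedef and both functions are hypothetical names chosen for illustration; the sketch assumes a flag word whose low byte is zero and a bool that is only one byte wide.

```c
/* Toy model only: none of these names are part of the Perl API. */

/* Hypothetical flag word: non-zero, but with no bits set in the low byte. */
#define DEMO_UTF8_FLAG 0x20000000u

/* Models a platform where bool is a typedef for a char-sized type.
   unsigned char is used so the narrowing conversion is well defined. */
typedef unsigned char demo_bool;

/* Risky: narrowing keeps only the low 8 bits, which are all zero here. */
demo_bool demo_flag_to_bool_naive(unsigned long flag)
{
    return (demo_bool)flag;
}

/* Safer: compare against zero so any non-zero flag becomes exactly 1. */
demo_bool demo_flag_to_bool_safe(unsigned long flag)
{
    return (demo_bool)(flag != 0);
}
```

With the real macro, the same idea would be an explicit comparison such as HeUTF8(he) != 0 before the result is stored in a bool.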
  • HeVAL

    Returns the value slot (type SV*) stored in the hash entry. Can be assigned to.

    SV *foo= HeVAL(he);
    HeVAL(he)= sv;

    SV* HeVAL(HE* he)
  • HvENAME

    Returns the effective name of a stash, or NULL if there is none. The effective name represents a location in the symbol table where this stash resides. It is updated automatically when packages are aliased or deleted. A stash that is no longer in the symbol table has no effective name. This name is preferable to HvNAME for use in MRO linearisations and isa caches.

    char* HvENAME(HV* stash)
  • HvENAMELEN

    Returns the length of the stash's effective name.

    STRLEN HvENAMELEN(HV *stash)
  • HvENAMEUTF8

    Returns true if the effective name is in UTF8 encoding.

    unsigned char HvENAMEUTF8(HV *stash)
  • HvNAME

    Returns the package name of a stash, or NULL if stash isn't a stash. See SvSTASH, CvSTASH.

    char* HvNAME(HV* stash)
  • HvNAMELEN

    Returns the length of the stash's name.

    STRLEN HvNAMELEN(HV *stash)
  • HvNAMEUTF8

    Returns true if the name is in UTF8 encoding.

    unsigned char HvNAMEUTF8(HV *stash)
  • hv_assert

    Check that a hash is in an internally consistent state.

    void hv_assert(HV *hv)
  • hv_clear

    Frees all the elements of a hash, leaving it empty. The XS equivalent of %hash = (). See also hv_undef.

    If any destructors are triggered as a result, the hv itself may be freed.

    void hv_clear(HV *hv)
  • hv_clear_placeholders

    Clears any placeholders from a hash. If a restricted hash has any of its keys marked as readonly and the key is subsequently deleted, the key is not actually deleted but is marked by assigning it a value of &PL_sv_placeholder. This tags it so it will be ignored by future operations such as iterating over the hash, but will still allow the hash to have a value reassigned to the key at some future point. This function clears any such placeholder keys from the hash. See Hash::Util::lock_keys() for an example of its use.

    void hv_clear_placeholders(HV *hv)
  • hv_copy_hints_hv

    A specialised version of newHVhv for copying %^H. ohv must be a pointer to a hash (which may have %^H magic, but should be generally non-magical), or NULL (interpreted as an empty hash). The content of ohv is copied to a new hash, which has the %^H-specific magic added to it. A pointer to the new hash is returned.

    HV * hv_copy_hints_hv(HV *ohv)
  • hv_delete

    Deletes a key/value pair in the hash. The value's SV is removed from the hash, made mortal, and returned to the caller. The absolute value of klen is the length of the key. If klen is negative the key is assumed to be in UTF-8-encoded Unicode. The flags value will normally be zero; if set to G_DISCARD then NULL will be returned. NULL will also be returned if the key is not found.

    SV* hv_delete(HV *hv, const char *key, I32 klen,
                  I32 flags)
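
The klen sign convention described for hv_delete (and repeated for hv_exists, hv_fetch and hv_store) can be sketched as a standalone toy model. The demo_ names below are hypothetical and are not part of the Perl API; the sketch only shows how a single signed argument carries both a length and a UTF-8 marker.

```c
#include <stddef.h>

/* Toy model only: demo_* names are illustrative, not part of the Perl API. */

typedef struct {
    size_t len;     /* key length in bytes */
    int    is_utf8; /* non-zero if the key is to be treated as UTF-8 */
} demo_key_info;

/* Decode a klen argument the way the text describes: the absolute value is
   the length, and a negative sign marks the key as UTF-8. */
demo_key_info demo_decode_klen(int klen)
{
    demo_key_info info;
    long long wide = klen;            /* widen first so negation cannot overflow */
    info.is_utf8 = (wide < 0);
    info.len = (size_t)(wide < 0 ? -wide : wide);
    return info;
}

/* Encode a length and a UTF-8 flag into a klen value for such a call. */
int demo_encode_klen(int len, int is_utf8)
{
    return is_utf8 ? -len : len;
}
```

A caller holding a length and a UTF-8 flag separately would therefore negate the length for UTF-8 keys before passing it as klen.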
  • hv_delete_ent

    Deletes a key/value pair in the hash. The value SV is removed from the hash, made mortal, and returned to the caller. The flags value will normally be zero; if set to G_DISCARD then NULL will be returned. NULL will also be returned if the key is not found. hash can be a valid precomputed hash value, or 0 to ask for it to be computed.

    SV* hv_delete_ent(HV *hv, SV *keysv, I32 flags,
                      U32 hash)
  • hv_exists

    Returns a boolean indicating whether the specified hash key exists. The absolute value of klen is the length of the key. If klen is negative the key is assumed to be in UTF-8-encoded Unicode.

    bool hv_exists(HV *hv, const char *key, I32 klen)
  • hv_exists_ent

    Returns a boolean indicating whether the specified hash key exists. hash can be a valid precomputed hash value, or 0 to ask for it to be computed.

    bool hv_exists_ent(HV *hv, SV *keysv, U32 hash)
  • hv_fetch

    Returns the SV which corresponds to the specified key in the hash. The absolute value of klen is the length of the key. If klen is negative the key is assumed to be in UTF-8-encoded Unicode. If lval is set then the fetch will be part of a store. This means that if there is no value in the hash associated with the given key, then one is created and a pointer to it is returned. The SV* it points to can be assigned to. But always check that the return value is non-null before dereferencing it to an SV*.

    See Understanding the Magic of Tied Hashes and Arrays in perlguts for more information on how to use this function on tied hashes.

    SV** hv_fetch(HV *hv, const char *key, I32 klen,
                  I32 lval)
  • hv_fetchs

    Like hv_fetch, but takes a literal string instead of a string/length pair.

    SV** hv_fetchs(HV* tb, const char* key, I32 lval)
  • hv_fetch_ent

    Returns the hash entry which corresponds to the specified key in the hash. hash must be a valid precomputed hash number for the given key, or 0 if you want the function to compute it. If lval is set then the fetch will be part of a store. Make sure the return value is non-null before accessing it. The return value when hv is a tied hash is a pointer to a static location, so be sure to make a copy of the structure if you need to store it somewhere.

    See Understanding the Magic of Tied Hashes and Arrays in perlguts for more information on how to use this function on tied hashes.

    HE* hv_fetch_ent(HV *hv, SV *keysv, I32 lval,
                     U32 hash)
  • hv_fill

    Returns the number of hash buckets that happen to be in use. This function is wrapped by the macro HvFILL.

    Previously this value was stored in the HV structure, rather than being calculated on demand.

    STRLEN hv_fill(HV const *const hv)
  • hv_iterinit

    Prepares a starting point to traverse a hash table. Returns the number of keys in the hash (i.e. the same as HvUSEDKEYS(hv)). The return value is currently only meaningful for hashes without tie magic.

    NOTE: Before version 5.004_65, hv_iterinit used to return the number of hash buckets that happen to be in use. If you still need that esoteric value, you can get it through the macro HvFILL(hv).

    I32 hv_iterinit(HV *hv)
  • hv_iterkey

    Returns the key from the current position of the hash iterator. See hv_iterinit.

    char* hv_iterkey(HE* entry, I32* retlen)
  • hv_iterkeysv

    Returns the key as an SV* from the current position of the hash iterator. The return value will always be a mortal copy of the key. Also see hv_iterinit.

    SV* hv_iterkeysv(HE* entry)
  • hv_iternext

    Returns entries from a hash iterator. See hv_iterinit.

    You may call hv_delete or hv_delete_ent on the hash entry that the iterator currently points to, without losing your place or invalidating your iterator. Note that in this case the current entry is deleted from the hash with your iterator holding the last reference to it. Your iterator is flagged to free the entry on the next call to hv_iternext, so you must not discard your iterator immediately, or else the entry will leak; call hv_iternext to trigger the resource deallocation.

    HE* hv_iternext(HV *hv)
  • hv_iternextsv

    Performs an hv_iternext, hv_iterkey, and hv_iterval in one operation.

    SV* hv_iternextsv(HV *hv, char **key, I32 *retlen)
  • hv_iternext_flags

    Returns entries from a hash iterator. See hv_iterinit and hv_iternext. The flags value will normally be zero; if HV_ITERNEXT_WANTPLACEHOLDERS is set, the placeholder keys (for restricted hashes) will be returned in addition to normal keys. By default placeholders are automatically skipped over. Currently a placeholder is implemented with a value that is &PL_sv_placeholder. Note that the implementation of placeholders and restricted hashes may change, and the implementation currently is insufficiently abstracted for any change to be tidy.

    NOTE: this function is experimental and may change or be removed without notice.

    HE* hv_iternext_flags(HV *hv, I32 flags)
  • hv_iterval

    Returns the value from the current position of the hash iterator. See hv_iterkey.

    SV* hv_iterval(HV *hv, HE *entry)
  • hv_magic

    Adds magic to a hash. See sv_magic.

    void hv_magic(HV *hv, GV *gv, int how)
  • hv_scalar

    Evaluates the hash in scalar context and returns the result. Handles magic when the hash is tied.

    SV* hv_scalar(HV *hv)
  • hv_store

    Stores an SV in a hash. The hash key is specified as key and the absolute value of klen is the length of the key. If klen is negative the key is assumed to be in UTF-8-encoded Unicode. The hash parameter is the precomputed hash value; if it is zero then Perl will compute it.

    The return value will be NULL if the operation failed or if the value did not need to be actually stored within the hash (as in the case of tied hashes). Otherwise it can be dereferenced to get the original SV*. Note that the caller is responsible for suitably incrementing the reference count of val before the call, and decrementing it if the function returned NULL. Effectively a successful hv_store takes ownership of one reference to val. This is usually what you want; a newly created SV has a reference count of one, so if all your code does is create SVs then store them in a hash, hv_store will own the only reference to the new SV, and your code doesn't need to do anything further to tidy up. hv_store is not implemented as a call to hv_store_ent, and does not create a temporary SV for the key, so if your key data is not already in SV form then use hv_store in preference to hv_store_ent.

    See Understanding the Magic of Tied Hashes and Arrays in perlguts for more information on how to use this function on tied hashes.

    SV** hv_store(HV *hv, const char *key, I32 klen,
                  SV *val, U32 hash)
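
The ownership rule described for hv_store can be sketched with a standalone toy model of reference counting. None of the demo_ names belong to the Perl API; the single-slot "hash" and the accepts switch are assumptions made only to simulate a successful and a failed store.

```c
#include <stddef.h>

/* Toy model only: demo_* names are illustrative, not the Perl API. */

typedef struct { int refcnt; } demo_sv;

typedef struct {
    demo_sv *slot;    /* single value slot, enough for the illustration */
    int      accepts; /* 0 simulates a store that fails */
} demo_hv;

/* Analogue of hv_store: on success the hash takes over one reference to val
   and a pointer to the slot is returned; on failure NULL is returned and the
   reference count is left untouched. */
demo_sv **demo_store(demo_hv *hv, demo_sv *val)
{
    if (!hv->accepts)
        return NULL;
    hv->slot = val;
    return &hv->slot;
}

/* Caller that wants to keep its own reference as well: add one reference for
   the hash before the call, and take it back if the store did not happen. */
int demo_store_shared(demo_hv *hv, demo_sv *val)
{
    val->refcnt++;
    if (demo_store(hv, val) == NULL) {
        val->refcnt--;
        return 0;
    }
    return 1;
}

/* Reference count of a fresh value (count 1) after a shared store. */
int demo_refcnt_after_shared_store(int accepts)
{
    demo_sv val = { 1 };
    demo_hv hv = { NULL, 0 };
    hv.accepts = accepts;
    (void)demo_store_shared(&hv, &val);
    return val.refcnt;
}
```

In real XS code the increment and the compensating decrement would correspond to SvREFCNT_inc and SvREFCNT_dec around the hv_store call; a caller that hands over a freshly created value and keeps no reference of its own needs neither.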
  • hv_stores

    Like hv_store, but takes a literal string instead of a string/length pair and omits the hash parameter.

    SV** hv_stores(HV* tb, const char* key,
                   NULLOK SV* val)
  • hv_store_ent

    Stores val in a hash. The hash key is specified as key. The hash parameter is the precomputed hash value; if it is zero then Perl will compute it. The return value is the new hash entry so created. It will be NULL if the operation failed or if the value did not need to be actually stored within the hash (as in the case of tied hashes). Otherwise the contents of the return value can be accessed using the He? macros described here. Note that the caller is responsible for suitably incrementing the reference count of val before the call, and decrementing it if the function returned NULL. Effectively a successful hv_store_ent takes ownership of one reference to val. This is usually what you want; a newly created SV has a reference count of one, so if all your code does is create SVs then store them in a hash, hv_store_ent will own the only reference to the new SV, and your code doesn't need to do anything further to tidy up. Note that hv_store_ent only reads the key; unlike val it does not take ownership of it, so maintaining the correct reference count on key is entirely the caller's responsibility. hv_store is not implemented as a call to hv_store_ent, and does not create a temporary SV for the key, so if your key data is not already in SV form then use hv_store in preference to hv_store_ent.

    See Understanding the Magic of Tied Hashes and Arrays in perlguts for more information on how to use this function on tied hashes.

    HE* hv_store_ent(HV *hv, SV *key, SV *val, U32 hash)
  • hv_undef

    Undefines the hash. The XS equivalent of undef(%hash).

    As well as freeing all the elements of the hash (like hv_clear()), this also frees any auxiliary data and storage associated with the hash.

    If any destructors are triggered as a result, the hv itself may be freed.

    See also hv_clear.

    void hv_undef(HV *hv)
  • newHV

    Creates a new HV. The reference count is set to 1.

    HV* newHV()

Hook manipulation

  • wrap_op_checker

    Puts a C function into the chain of check functions for a specified op type. This is the preferred way to manipulate the PL_check array. opcode specifies which type of op is to be affected. new_checker is a pointer to the C function that is to be added to that opcode's check chain, and old_checker_p points to the storage location where a pointer to the next function in the chain will be stored. The value of new_checker is written into the PL_check array, while the value previously stored there is written to *old_checker_p.

    PL_check is global to an entire process, and a module wishing to hook op checking may find itself invoked more than once per process, typically in different threads. To handle that situation, this function is idempotent. The location *old_checker_p must initially (once per process) contain a null pointer. A C variable of static duration (declared at file scope, typically also marked static to give it internal linkage) will be implicitly initialised appropriately, if it does not have an explicit initialiser. This function will only actually modify the check chain if it finds *old_checker_p to be null. This function is also thread safe on the small scale. It uses appropriate locking to avoid race conditions in accessing PL_check.

    When this function is called, the function referenced by new_checker must be ready to be called, except for *old_checker_p being unfilled. In a threading situation, new_checker may be called immediately, even before this function has returned. *old_checker_p will always be appropriately set before new_checker is called. If new_checker decides not to do anything special with an op that it is given (which is the usual case for most uses of op check hooking), it must chain the check function referenced by *old_checker_p.

    If you want to influence compilation of calls to a specific subroutine, then use cv_set_call_checker rather than hooking checking of all entersub ops.

    void wrap_op_checker(Optype opcode,
                         Perl_check_t new_checker,
                         Perl_check_t *old_checker_p)
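
The idempotent hooking behaviour described for wrap_op_checker can be sketched as a standalone toy model. All demo_ names are hypothetical, the two-slot table merely stands in for PL_check, and the locking mentioned in the text is deliberately left out, so this is an illustration of the chaining logic only.

```c
#include <stddef.h>

/* Toy model only: demo_* names are illustrative, and locking is omitted. */

typedef int (*demo_check_t)(int op);

static int demo_default_check(int op) { return op; }

/* Stands in for PL_check; every slot starts with a default checker. */
static demo_check_t demo_check[2] = { demo_default_check, demo_default_check };

/* Analogue of wrap_op_checker: only modifies the chain if *old_checker_p
   is still null, which makes repeated calls harmless. */
void demo_wrap_checker(int opcode, demo_check_t new_checker,
                       demo_check_t *old_checker_p)
{
    if (*old_checker_p != NULL)
        return;
    *old_checker_p = demo_check[opcode];
    demo_check[opcode] = new_checker;
}

/* Static duration, no initialiser: implicitly a null pointer. */
static demo_check_t demo_next_checker;
static int demo_hook_calls;

/* A hook that does nothing special and chains to the previous checker. */
static int demo_my_checker(int op)
{
    demo_hook_calls++;
    return demo_next_checker(op);
}

/* Install the hook twice, run the chain once, report how often it ran. */
int demo_install_twice_and_run(void)
{
    demo_wrap_checker(0, demo_my_checker, &demo_next_checker);
    demo_wrap_checker(0, demo_my_checker, &demo_next_checker); /* no effect */
    demo_hook_calls = 0;
    (void)demo_check[0](7);
    return demo_hook_calls;
}
```

Without the null check, the second installation in this model would make the hook its own successor and the chain would never terminate, which is why the storage location must start out null exactly once.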

Lexer interface

  • lex_bufutf8

    Indicates whether the octets in the lexer buffer (PL_parser->linestr) should be interpreted as the UTF-8 encoding of Unicode characters. If not, they should be interpreted as Latin-1 characters. This is analogous to the SvUTF8 flag for scalars.

    In UTF-8 mode, it is not guaranteed that the lexer buffer actually contains valid UTF-8. Lexing code must be robust in the face of invalid encoding.

    The actual SvUTF8 flag of the PL_parser->linestr scalar is significant, but not the whole story regarding the input character encoding. Normally, when a file is being read, the scalar contains octets and its SvUTF8 flag is off, but the octets should be interpreted as UTF-8 if the use utf8 pragma is in effect. During a string eval, however, the scalar may have the SvUTF8 flag on, and in this case its octets should be interpreted as UTF-8 unless the use bytes pragma is in effect. This logic may change in the future; use this function instead of implementing the logic yourself.

    NOTE: this function is experimental and may change or be removed without notice.

    bool lex_bufutf8()
  • lex_discard_to

    Discards the first part of the PL_parser->linestr buffer, up to ptr. The remaining content of the buffer will be moved, and all pointers into the buffer updated appropriately. ptr must not be later in the buffer than the position of PL_parser->bufptr: it is not permitted to discard text that has yet to be lexed.

    Normally it is not necessary to do this directly, because it suffices to use the implicit discarding behaviour of lex_next_chunk and things based on it. However, if a token stretches across multiple lines, and the lexing code has kept multiple lines of text in the buffer for that purpose, then after completion of the token it would be wise to explicitly discard the now-unneeded earlier lines, to avoid future multi-line tokens growing the buffer without bound.

    NOTE: this function is experimental and may change or be removed without notice.

    void lex_discard_to(char *ptr)
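
The discarding step described for lex_discard_to can be sketched with a standalone toy buffer. This is an assumption-laden model, not the lexer's implementation: offsets replace pointers, the buffer has a fixed size, and every demo_ name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: offsets replace pointers, and demo_* names are illustrative. */

typedef struct {
    char   buf[32]; /* stands in for PL_parser->linestr */
    size_t len;     /* number of valid bytes in buf */
    size_t bufptr;  /* offset of the current lexing position */
} demo_linestr;

/* Drop the first ptr bytes. Returns 0 and changes nothing if ptr lies beyond
   bufptr, since text that has yet to be lexed must not be discarded. */
int demo_discard_to(demo_linestr *ls, size_t ptr)
{
    if (ptr > ls->bufptr)
        return 0;
    memmove(ls->buf, ls->buf + ptr, ls->len - ptr);
    ls->len -= ptr;
    ls->bufptr -= ptr; /* keep the position pointing at the same text */
    return 1;
}

/* Two lines buffered, first line fully lexed: discard it and report the
   character now under bufptr. */
int demo_char_after_discard(void)
{
    demo_linestr ls;
    memcpy(ls.buf, "ab\ncd\n", 6);
    ls.len = 6;
    ls.bufptr = 3;                 /* pointing at 'c' */
    if (!demo_discard_to(&ls, 3))
        return -1;
    return (ls.len == 3 && ls.bufptr == 0) ? ls.buf[ls.bufptr] : -1;
}

/* Attempt to discard text that has not been lexed yet. */
int demo_discard_unlexed(void)
{
    demo_linestr ls;
    memcpy(ls.buf, "ab\ncd\n", 6);
    ls.len = 6;
    ls.bufptr = 1;                 /* only 'a' has been lexed */
    return demo_discard_to(&ls, 3);
}
```

The adjustment of bufptr in the model mirrors the statement that all pointers into the buffer are updated when the remaining content is moved.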
  • lex_grow_linestr

    Reallocates the lexer buffer (PL_parser->linestr) to accommodate at least len octets (including terminating NUL). Returns a pointer to the reallocated buffer. This is necessary before making any direct modification of the buffer that would increase its length. lex_stuff_pvn provides a more convenient way to insert text into the buffer.

    Do not use SvGROW or sv_grow directly on PL_parser->linestr; this function updates all of the lexer's variables that point directly into the buffer.

    NOTE: this function is experimental and may change or be removed without notice.

    char * lex_grow_linestr(STRLEN len)
  • lex_next_chunk

    Reads in the next chunk of text to be lexed, appending it to PL_parser->linestr. This should be called when lexing code has looked to the end of the current chunk and wants to know more. It is usual, but not necessary, for lexing to have consumed the entirety of the current chunk at this time.

    If PL_parser->bufptr is pointing to the very end of the current chunk (i.e., the current chunk has been entirely consumed), normally the current chunk will be discarded at the same time that the new chunk is read in. If flags includes LEX_KEEP_PREVIOUS, the current chunk will not be discarded. If the current chunk has not been entirely consumed, then it will not be discarded regardless of the flag.

    Returns true if some new text was added to the buffer, or false if the buffer has reached the end of the input text.

    NOTE: this function is experimental and may change or be removed without notice.

    bool lex_next_chunk(U32 flags)
  • lex_peek_unichar

    Looks ahead one (Unicode) character in the text currently being lexed. Returns the codepoint (unsigned integer value) of the next character, or -1 if lexing has reached the end of the input text. To consume the peeked character, use lex_read_unichar.

    If the next character is in (or extends into) the next chunk of input text, the next chunk will be read in. Normally the current chunk will be discarded at the same time, but if flags includes LEX_KEEP_PREVIOUS then the current chunk will not be discarded.

    If the input is being interpreted as UTF-8 and a UTF-8 encoding error is encountered, an exception is generated.

    NOTE: this function is experimental and may change or be removed without notice.

    I32 lex_peek_unichar(U32 flags)
  • lex_read_space

    Reads optional spaces, in Perl style, in the text currently being lexed. The spaces may include ordinary whitespace characters and Perl-style comments. #line directives are processed if encountered. PL_parser->bufptr is moved past the spaces, so that it points at a non-space character (or the end of the input text).

    If spaces extend into the next chunk of input text, the next chunk will be read in. Normally the current chunk will be discarded at the same time, but if flags includes LEX_KEEP_PREVIOUS then the current chunk will not be discarded.

    NOTE: this function is experimental and may change or be removed without notice.

    void lex_read_space(U32 flags)
  • lex_read_to

    Consume text in the lexer buffer, from PL_parser->bufptr up to ptr. This advances PL_parser->bufptr to match ptr, performing the correct bookkeeping whenever a newline character is passed. This is the normal way to consume lexed text.

    Interpretation of the buffer's octets can be abstracted out by using the slightly higher-level functions lex_peek_unichar and lex_read_unichar.

    NOTE: this function is experimental and may change or be removed without notice.

    void lex_read_to(char *ptr)
  • lex_read_unichar

    Reads the next (Unicode) character in the text currently being lexed. Returns the codepoint (unsigned integer value) of the character read, and moves PL_parser->bufptr past the character, or returns -1 if lexing has reached the end of the input text. To non-destructively examine the next character, use lex_peek_unichar instead.

    If the next character is in (or extends into) the next chunk of input text, the next chunk will be read in. Normally the current chunk will be discarded at the same time, but if flags includes LEX_KEEP_PREVIOUS then the current chunk will not be discarded.

    If the input is being interpreted as UTF-8 and a UTF-8 encoding error is encountered, an exception is generated.

    NOTE: this function is experimental and may change or be removed without notice.

    I32 lex_read_unichar(U32 flags)
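
The peek-then-read pattern described for lex_peek_unichar and lex_read_unichar can be sketched with a standalone toy lexer. The model below is a simplification under stated assumptions: one byte per character, a single chunk with no refilling, and hypothetical demo_ names throughout.

```c
#include <stddef.h>

/* Toy model only: a byte-per-character buffer with no chunk refilling.
   demo_* names are illustrative, not the Perl API. */

typedef struct {
    const unsigned char *bufptr; /* current lexing position */
    const unsigned char *bufend; /* one past the last byte of input */
} demo_lexer;

/* Look at the next character without consuming it; -1 at end of input. */
int demo_peek_char(const demo_lexer *lx)
{
    return lx->bufptr < lx->bufend ? (int)*lx->bufptr : -1;
}

/* Return the next character and move bufptr past it; -1 at end of input. */
int demo_read_char(demo_lexer *lx)
{
    int c = demo_peek_char(lx);
    if (c != -1)
        lx->bufptr++;
    return c;
}

/* Count leading decimal digits, peeking before each read so that the first
   non-digit is left in the buffer for whatever lexes next. */
int demo_count_digits(const char *text, size_t len)
{
    demo_lexer lx;
    int n = 0;
    lx.bufptr = (const unsigned char *)text;
    lx.bufend = lx.bufptr + len;
    while (demo_peek_char(&lx) >= '0' && demo_peek_char(&lx) <= '9') {
        (void)demo_read_char(&lx);
        n++;
    }
    return n;
}
```

Because -1 never compares as a digit, the loop in the model also stops cleanly at the end of the input without a separate end-of-input test.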
  • lex_start

    Creates and initialises a new lexer/parser state object, supplying a context in which to lex and parse from a new source of Perl code. A pointer to the new state object is placed in PL_parser. An entry is made on the save stack so that upon unwinding the new state object will be destroyed and the former value of PL_parser will be restored. Nothing else need be done to clean up the parsing context.

    The code to be parsed comes from line and rsfp. line, if non-null, provides a string (in SV form) containing code to be parsed. A copy of the string is made, so subsequent modification of line does not affect parsing. rsfp, if non-null, provides an input stream from which code will be read to be parsed. If both are non-null, the code in line comes first and must consist of complete lines of input, and rsfp supplies the remainder of the source.

    The flags parameter is reserved for future use. Currently it is only used by perl internally, so extensions should always pass zero.

    NOTE: this function is experimental and may change or be removed without notice.

    void lex_start(SV *line, PerlIO *rsfp, U32 flags)
  • lex_stuff_pv

    Insert characters into the lexer buffer (PL_parser->linestr), immediately after the current lexing point (PL_parser->bufptr), reallocating the buffer if necessary. This means that lexing code that runs later will see the characters as if they had appeared in the input. It is not recommended to do this as part of normal parsing, and most uses of this facility run the risk of the inserted characters being interpreted in an unintended manner.

    The string to be inserted is represented by octets starting at pv and continuing to the first nul. These octets are interpreted as either UTF-8 or Latin-1, according to whether the LEX_STUFF_UTF8 flag is set in flags. The characters are recoded for the lexer buffer, according to how the buffer is currently being interpreted (lex_bufutf8). If it is not convenient to nul-terminate a string to be inserted, the lex_stuff_pvn function is more appropriate.

    NOTE: this function is experimental and may change or be removed without notice.

    void lex_stuff_pv(const char *pv, U32 flags)
  • lex_stuff_pvn

    Insert characters into the lexer buffer (PL_parser->linestr), immediately after the current lexing point (PL_parser->bufptr), reallocating the buffer if necessary. This means that lexing code that runs later will see the characters as if they had appeared in the input. It is not recommended to do this as part of normal parsing, and most uses of this facility run the risk of the inserted characters being interpreted in an unintended manner.

    The string to be inserted is represented by len octets starting at pv. These octets are interpreted as either UTF-8 or Latin-1, according to whether the LEX_STUFF_UTF8 flag is set in flags. The characters are recoded for the lexer buffer, according to how the buffer is currently being interpreted (lex_bufutf8). If a string to be inserted is available as a Perl scalar, the lex_stuff_sv function is more convenient.

    NOTE: this function is experimental and may change or be removed without notice.

    void lex_stuff_pvn(const char *pv, STRLEN len,
                       U32 flags)
  • lex_stuff_pvs

    Like lex_stuff_pvn, but takes a literal string instead of a string/length pair.

    NOTE: this function is experimental and may change or be removed without notice.

    void lex_stuff_pvs(const char *pv, U32 flags)
  • lex_stuff_sv

    Insert characters into the lexer buffer (PL_parser->linestr), immediately after the current lexing point (PL_parser->bufptr), reallocating the buffer if necessary. This means that lexing code that runs later will see the characters as if they had appeared in the input. It is not recommended to do this as part of normal parsing, and most uses of this facility run the risk of the inserted characters being interpreted in an unintended manner.

    The string to be inserted is the string value of sv. The characters are recoded for the lexer buffer, according to how the buffer is currently being interpreted (lex_bufutf8). If a string to be inserted is not already a Perl scalar, the lex_stuff_pvn function avoids the need to construct a scalar.

    NOTE: this function is experimental and may change or be removed without notice.

    void lex_stuff_sv(SV *sv, U32 flags)
  • lex_unstuff

    Discards text about to be lexed, from PL_parser->bufptr up to ptr. Text following ptr will be moved, and the buffer shortened. This hides the discarded text from any lexing code that runs later, as if the text had never appeared.

    This is not the normal way to consume lexed text. For that, use lex_read_to.

    NOTE: this function is experimental and may change or be removed without notice.

    void lex_unstuff(char *ptr)
  • parse_arithexpr

    Parse a Perl arithmetic expression. This may contain operators of precedence down to the bit shift operators. The expression must be followed (and thus terminated) either by a comparison or lower-precedence operator or by something that would normally terminate an expression such as semicolon. If flags includes PARSE_OPTIONAL then the expression is optional, otherwise it is mandatory. It is up to the caller to ensure that the dynamic parser state (PL_parser et al) is correctly set to reflect the source of the code to be parsed and the lexical context for the expression.

    The op tree representing the expression is returned. If an optional expression is absent, a null pointer is returned, otherwise the pointer will be non-null.

    If an error occurs in parsing or compilation, in most cases a valid op tree is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. Some compilation errors, however, will throw an exception immediately.

    NOTE: this function is experimental and may change or be removed without notice.

    OP * parse_arithexpr(U32 flags)
  • parse_barestmt

    Parse a single unadorned Perl statement. This may be a normal imperative statement or a declaration that has compile-time effect. It does not include any label or other affixture. It is up to the caller to ensure that the dynamic parser state (PL_parser et al) is correctly set to reflect the source of the code to be parsed and the lexical context for the statement.

    The op tree representing the statement is returned. This may be a null pointer if the statement is null, for example if it was actually a subroutine definition (which has compile-time side effects). If not null, it will be ops directly implementing the statement, suitable to pass to newSTATEOP. It will not normally include a nextstate or equivalent op (except for those embedded in a scope contained entirely within the statement).

    If an error occurs in parsing or compilation, in most cases a valid op tree (most likely null) is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. Some compilation errors, however, will throw an exception immediately.

    The flags parameter is reserved for future use, and must always be zero.

    NOTE: this function is experimental and may change or be removed without notice.

    1. OP * parse_barestmt(U32 flags)
  • parse_block

    Parse a single complete Perl code block. This consists of an opening brace, a sequence of statements, and a closing brace. The block constitutes a lexical scope, so my variables and various compile-time effects can be contained within it. It is up to the caller to ensure that the dynamic parser state (PL_parser et al) is correctly set to reflect the source of the code to be parsed and the lexical context for the statement.

    The op tree representing the code block is returned. This is always a real op, never a null pointer. It will normally be a lineseq list, including nextstate or equivalent ops. No ops to construct any kind of runtime scope are included by virtue of it being a block.

    If an error occurs in parsing or compilation, in most cases a valid op tree (most likely null) is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. Some compilation errors, however, will throw an exception immediately.

    The flags parameter is reserved for future use, and must always be zero.

    NOTE: this function is experimental and may change or be removed without notice.

    1. OP * parse_block(U32 flags)
  • parse_fullexpr

    Parse a single complete Perl expression. This allows the full expression grammar, including the lowest-precedence operators such as "or". The expression must be followed (and thus terminated) by a token that an expression would normally be terminated by: end-of-file, closing bracketing punctuation, semicolon, or one of the keywords that signals a postfix expression-statement modifier. If flags includes PARSE_OPTIONAL then the expression is optional, otherwise it is mandatory. It is up to the caller to ensure that the dynamic parser state (PL_parser et al) is correctly set to reflect the source of the code to be parsed and the lexical context for the expression.

    The op tree representing the expression is returned. If an optional expression is absent, a null pointer is returned, otherwise the pointer will be non-null.

    If an error occurs in parsing or compilation, in most cases a valid op tree is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. Some compilation errors, however, will throw an exception immediately.

    NOTE: this function is experimental and may change or be removed without notice.

    1. OP * parse_fullexpr(U32 flags)
  • parse_fullstmt

    Parse a single complete Perl statement. This may be a normal imperative statement or a declaration that has compile-time effect, and may include optional labels. It is up to the caller to ensure that the dynamic parser state (PL_parser et al) is correctly set to reflect the source of the code to be parsed and the lexical context for the statement.

    The op tree representing the statement is returned. This may be a null pointer if the statement is null, for example if it was actually a subroutine definition (which has compile-time side effects). If not null, it will be the result of a newSTATEOP call, normally including a nextstate or equivalent op.

    If an error occurs in parsing or compilation, in most cases a valid op tree (most likely null) is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. Some compilation errors, however, will throw an exception immediately.

    The flags parameter is reserved for future use, and must always be zero.

    NOTE: this function is experimental and may change or be removed without notice.

    1. OP * parse_fullstmt(U32 flags)
  • parse_label

    Parse a single label, possibly optional, of the type that may prefix a Perl statement. It is up to the caller to ensure that the dynamic parser state (PL_parser et al) is correctly set to reflect the source of the code to be parsed. If flags includes PARSE_OPTIONAL then the label is optional, otherwise it is mandatory.

    The name of the label is returned in the form of a fresh scalar. If an optional label is absent, a null pointer is returned.

    If an error occurs in parsing, which can only occur if the label is mandatory, a valid label is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV * parse_label(U32 flags)
  • parse_listexpr

    Parse a Perl list expression. This may contain operators of precedence down to the comma operator. The expression must be followed (and thus terminated) either by a low-precedence logic operator such as "or", or by something that would normally terminate an expression, such as a semicolon. If flags includes PARSE_OPTIONAL then the expression is optional, otherwise it is mandatory. It is up to the caller to ensure that the dynamic parser state (PL_parser et al) is correctly set to reflect the source of the code to be parsed and the lexical context for the expression.

    The op tree representing the expression is returned. If an optional expression is absent, a null pointer is returned, otherwise the pointer will be non-null.

    If an error occurs in parsing or compilation, in most cases a valid op tree is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. Some compilation errors, however, will throw an exception immediately.

    NOTE: this function is experimental and may change or be removed without notice.

    1. OP * parse_listexpr(U32 flags)
  • parse_stmtseq

    Parse a sequence of zero or more Perl statements. These may be normal imperative statements, including optional labels, or declarations that have compile-time effect, or any mixture thereof. The statement sequence ends when a closing brace or end-of-file is encountered in a place where a new statement could have validly started. It is up to the caller to ensure that the dynamic parser state (PL_parser et al) is correctly set to reflect the source of the code to be parsed and the lexical context for the statements.

    The op tree representing the statement sequence is returned. This may be a null pointer if the statements were all null, for example if there were no statements or if there were only subroutine definitions (which have compile-time side effects). If not null, it will be a lineseq list, normally including nextstate or equivalent ops.

    If an error occurs in parsing or compilation, in most cases a valid op tree is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. Some compilation errors, however, will throw an exception immediately.

    The flags parameter is reserved for future use, and must always be zero.

    NOTE: this function is experimental and may change or be removed without notice.

    1. OP * parse_stmtseq(U32 flags)
  • parse_termexpr

    Parse a Perl term expression. This may contain operators of precedence down to the assignment operators. The expression must be followed (and thus terminated) either by a comma or lower-precedence operator or by something that would normally terminate an expression such as semicolon. If flags includes PARSE_OPTIONAL then the expression is optional, otherwise it is mandatory. It is up to the caller to ensure that the dynamic parser state (PL_parser et al) is correctly set to reflect the source of the code to be parsed and the lexical context for the expression.

    The op tree representing the expression is returned. If an optional expression is absent, a null pointer is returned, otherwise the pointer will be non-null.

    If an error occurs in parsing or compilation, in most cases a valid op tree is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. Some compilation errors, however, will throw an exception immediately.

    NOTE: this function is experimental and may change or be removed without notice.

    1. OP * parse_termexpr(U32 flags)
  • PL_parser

    Pointer to a structure encapsulating the state of the parsing operation currently in progress. The pointer can be locally changed to perform a nested parse without interfering with the state of an outer parse. Individual members of PL_parser have their own documentation.

  • PL_parser->bufend

    Direct pointer to the end of the chunk of text currently being lexed, the end of the lexer buffer. This is equal to SvPVX(PL_parser->linestr) + SvCUR(PL_parser->linestr). A NUL character (zero octet) is always located at the end of the buffer, and does not count as part of the buffer's contents.

    NOTE: this function is experimental and may change or be removed without notice.

  • PL_parser->bufptr

    Points to the current position of lexing inside the lexer buffer. Characters around this point may be freely examined, within the range delimited by SvPVX(PL_parser->linestr) and PL_parser->bufend. The octets of the buffer may be intended to be interpreted as either UTF-8 or Latin-1, as indicated by lex_bufutf8.

    Lexing code (whether in the Perl core or not) moves this pointer past the characters that it consumes. It is also expected to perform some bookkeeping whenever a newline character is consumed. This movement can be more conveniently performed by the function lex_read_to, which handles newlines appropriately.

    Interpretation of the buffer's octets can be abstracted out by using the slightly higher-level functions lex_peek_unichar and lex_read_unichar.

    NOTE: this function is experimental and may change or be removed without notice.

  • PL_parser->linestart

    Points to the start of the current line inside the lexer buffer. This is useful for indicating at which column an error occurred, and not much else. This must be updated by any lexing code that consumes a newline; the function lex_read_to handles this detail.

    NOTE: this function is experimental and may change or be removed without notice.

  • PL_parser->linestr

    Buffer scalar containing the chunk currently under consideration of the text currently being lexed. This is always a plain string scalar (for which SvPOK is true). It is not intended to be used as a scalar by normal scalar means; instead refer to the buffer directly by the pointer variables described below.

    The lexer maintains various char* pointers to things in the PL_parser->linestr buffer. If PL_parser->linestr is ever reallocated, all of these pointers must be updated. Don't attempt to do this manually, but rather use lex_grow_linestr if you need to reallocate the buffer.

    The content of the text chunk in the buffer is commonly exactly one complete line of input, up to and including a newline terminator, but there are situations where it is otherwise. The octets of the buffer may be intended to be interpreted as either UTF-8 or Latin-1. The function lex_bufutf8 tells you which. Do not use the SvUTF8 flag on this scalar, which may disagree with it.

    For direct examination of the buffer, the variable PL_parser->bufend points to the end of the buffer. The current lexing position is pointed to by PL_parser->bufptr. Direct use of these pointers is usually preferable to examination of the scalar through normal scalar means.

    NOTE: this function is experimental and may change or be removed without notice.
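
    The relationship between the buffer pointers described above can be sketched in standalone C. This is only an illustration: toy_lexer, toy_lexer_init, toy_read_to and toy_column are hypothetical names, the struct is not Perl's parser structure, and the code does not use any Perl headers. It mimics the bookkeeping that the text attributes to lex_read_to (updating the line start whenever a newline is consumed) and shows how a column number falls out of bufptr and linestart.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the lexer-related members of PL_parser. */
typedef struct {
    const char *bufstart;  /* start of the buffer, like SvPVX(linestr) */
    const char *bufend;    /* end of the contents; *bufend is the NUL */
    const char *bufptr;    /* current lexing position */
    const char *linestart; /* start of the line containing bufptr */
    long line;             /* 1-based line number, bookkeeping only */
} toy_lexer;

static void toy_lexer_init(toy_lexer *lx, const char *text)
{
    lx->bufstart = text;
    lx->bufend = text + strlen(text);
    lx->bufptr = text;
    lx->linestart = text;
    lx->line = 1;
}

/* Advance bufptr to `to`, updating linestart and the line count for
 * every newline consumed on the way. */
static void toy_read_to(toy_lexer *lx, const char *to)
{
    if (to > lx->bufend)
        to = lx->bufend;
    while (lx->bufptr < to) {
        if (*lx->bufptr == '\n') {
            lx->linestart = lx->bufptr + 1;
            lx->line++;
        }
        lx->bufptr++;
    }
}

/* Zero-based column of the current position, e.g. for error messages. */
static size_t toy_column(const toy_lexer *lx)
{
    return (size_t)(lx->bufptr - lx->linestart);
}
```

    After toy_read_to has consumed text spanning a newline, toy_column reports the offset within the new line rather than within the whole buffer, which is the use of linestart that the description mentions.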

Magical Functions

  • mg_clear

    Clear something magical that the SV represents. See sv_magic.

    1. int mg_clear(SV* sv)
  • mg_copy

    Copies the magic from one SV to another. See sv_magic.

    1. int mg_copy(SV *sv, SV *nsv, const char *key,
    2. I32 klen)
  • mg_find

    Finds the magic pointer of the given type attached to the SV. See sv_magic.

    1. MAGIC* mg_find(const SV* sv, int type)
  • mg_findext

    Finds the magic pointer of the given type with the given vtbl for the SV. See sv_magicext.

    1. MAGIC* mg_findext(const SV* sv, int type,
    2. const MGVTBL *vtbl)
  • mg_free

    Free any magic storage used by the SV. See sv_magic.

    1. int mg_free(SV* sv)
  • mg_free_type

    Remove any magic of type how from the SV sv. See sv_magic.

    1. void mg_free_type(SV *sv, int how)
  • mg_get

    Do magic before a value is retrieved from the SV. The type of SV must be >= SVt_PVMG. See sv_magic.

    1. int mg_get(SV* sv)
  • mg_length

    This function is deprecated.

    It reports on the SV's length in bytes, calling length magic if available, but does not set the UTF8 flag on the sv. It will fall back to 'get' magic if there is no 'length' magic, but with no indication as to whether it called 'get' magic. It assumes the sv is a PVMG or higher. Use sv_len() instead.

    1. U32 mg_length(SV* sv)
  • mg_magical

    Turns on the magical status of an SV. See sv_magic.

    1. void mg_magical(SV* sv)
  • mg_set

    Do magic after a value is assigned to the SV. See sv_magic.

    1. int mg_set(SV* sv)
  • SvGETMAGIC

    Invokes mg_get on an SV if it has 'get' magic. For example, this will call FETCH on a tied variable. This macro evaluates its argument more than once.

    1. void SvGETMAGIC(SV* sv)
  • SvLOCK

    Arranges for a mutual exclusion lock to be obtained on sv if a suitable module has been loaded.

    1. void SvLOCK(SV* sv)
  • SvSETMAGIC

    Invokes mg_set on an SV if it has 'set' magic. This is necessary after modifying a scalar, in case it is a magical variable like $| or a tied variable (it calls STORE). This macro evaluates its argument more than once.

    1. void SvSETMAGIC(SV* sv)
  • SvSetMagicSV

    Like SvSetSV, but does any set magic required afterwards.

    1. void SvSetMagicSV(SV* dsv, SV* ssv)
  • SvSetMagicSV_nosteal

    Like SvSetSV_nosteal, but does any set magic required afterwards.

    1. void SvSetMagicSV_nosteal(SV* dsv, SV* ssv)
  • SvSetSV

    Calls sv_setsv if dsv is not the same as ssv. May evaluate arguments more than once.

    1. void SvSetSV(SV* dsv, SV* ssv)
  • SvSetSV_nosteal

    Calls a non-destructive version of sv_setsv if dsv is not the same as ssv. May evaluate arguments more than once.

    1. void SvSetSV_nosteal(SV* dsv, SV* ssv)
  • SvSHARE

    Arranges for sv to be shared between threads if a suitable module has been loaded.

    1. void SvSHARE(SV* sv)
  • SvUNLOCK

    Releases a mutual exclusion lock on sv if a suitable module has been loaded.

    1. void SvUNLOCK(SV* sv)
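
    The ordering that SvGETMAGIC and SvSETMAGIC imply (run 'get' magic before reading a value, run 'set' magic after assigning one) can be sketched without Perl headers. Everything here is hypothetical: toy_sv, toy_vtbl, toy_fetch and toy_store are illustrative names, and the struct bears no relation to Perl's real SV or MGVTBL layouts.

```c
#include <stddef.h>

typedef struct toy_sv toy_sv;

/* Hypothetical hook table, loosely modelled on 'get' and 'set' magic. */
typedef struct {
    void (*get)(toy_sv *sv);  /* run before the value is read */
    void (*set)(toy_sv *sv);  /* run after the value is assigned */
} toy_vtbl;

struct toy_sv {
    int value;
    const toy_vtbl *magic;    /* NULL when the scalar is not magical */
    int fetches;              /* counters used by the example hooks */
    int stores;
};

/* Analogue of "invoke get magic if the scalar has any". */
static void toy_getmagic(toy_sv *sv)
{
    if (sv->magic && sv->magic->get)
        sv->magic->get(sv);
}

/* Analogue of "invoke set magic if the scalar has any". */
static void toy_setmagic(toy_sv *sv)
{
    if (sv->magic && sv->magic->set)
        sv->magic->set(sv);
}

static int toy_fetch(toy_sv *sv)
{
    toy_getmagic(sv);         /* hook first, then read */
    return sv->value;
}

static void toy_store(toy_sv *sv, int v)
{
    sv->value = v;            /* assign first, then hook */
    toy_setmagic(sv);
}

/* Example hooks that count accesses, as a tied variable's FETCH and
 * STORE methods might. */
static void count_get(toy_sv *sv) { sv->fetches++; }
static void count_set(toy_sv *sv) { sv->stores++; }
static const toy_vtbl counting_vtbl = { count_get, count_set };
```

    A scalar whose magic pointer is NULL behaves like a plain variable, while one using counting_vtbl records every fetch and store.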

Memory Management

  • Copy

    The XSUB-writer's interface to the C memcpy function. The src is the source, dest is the destination, nitems is the number of items, and type is the type. May fail on overlapping copies. See also Move.

    1. void Copy(void* src, void* dest, int nitems, type)
  • CopyD

    Like Copy but returns dest. Useful for encouraging compilers to tail-call optimise.

    1. void * CopyD(void* src, void* dest, int nitems, type)
  • Move

    The XSUB-writer's interface to the C memmove function. The src is the source, dest is the destination, nitems is the number of items, and type is the type. Can do overlapping moves. See also Copy.

    1. void Move(void* src, void* dest, int nitems, type)
  • MoveD

    Like Move but returns dest. Useful for encouraging compilers to tail-call optimise.

    1. void * MoveD(void* src, void* dest, int nitems, type)
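
    The difference between Copy and Move comes down to memcpy versus memmove. The sketch below uses hypothetical macros, TOY_COPY and TOY_MOVE, that only mirror the (src, dest, nitems, type) argument order documented above; they are not Perl's definitions and the code needs no Perl headers.

```c
#include <string.h>

/* Hypothetical macros with the documented argument order. */
#define TOY_COPY(src, dest, nitems, type) \
    ((void)memcpy((dest), (src), (size_t)(nitems) * sizeof(type)))
#define TOY_MOVE(src, dest, nitems, type) \
    ((void)memmove((dest), (src), (size_t)(nitems) * sizeof(type)))

/* Shift the first n-1 elements of a one slot to the right. Source and
 * destination overlap, so the memmove-based macro is required. */
static void shift_right(int *a, int n)
{
    if (n > 1)
        TOY_MOVE(a, a + 1, n - 1, int);
}

/* Duplicate n ints into a separate array. The regions are disjoint,
 * so the memcpy-based macro is sufficient. */
static void duplicate(const int *from, int *to, int n)
{
    TOY_COPY(from, to, n, int);
}
```

    Note that the source comes first in these macros, the reverse of memcpy and memmove themselves.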
  • Newx

    The XSUB-writer's interface to the C malloc function.

    In 5.9.3, Newx() and friends replace the older New() API, and drop the first parameter, x, a debug aid which allowed callers to identify themselves. This aid has been superseded by a new build option, PERL_MEM_LOG (see PERL_MEM_LOG in perlhacktips). The older API is still there for use in XS modules supporting older perls.

    1. void Newx(void* ptr, int nitems, type)
  • Newxc

    The XSUB-writer's interface to the C malloc function, with cast. See also Newx.

    1. void Newxc(void* ptr, int nitems, type, cast)
  • Newxz

    The XSUB-writer's interface to the C malloc function. The allocated memory is zeroed with memzero. See also Newx.

    1. void Newxz(void* ptr, int nitems, type)
  • Poison

    PoisonWith(0xEF) for catching access to freed memory.

    1. void Poison(void* dest, int nitems, type)
  • PoisonFree

    PoisonWith(0xEF) for catching access to freed memory.

    1. void PoisonFree(void* dest, int nitems, type)
  • PoisonNew

    PoisonWith(0xAB) for catching access to allocated but uninitialized memory.

    1. void PoisonNew(void* dest, int nitems, type)
  • PoisonWith

    Fill up memory with a byte pattern (a byte repeated over and over again) that hopefully catches attempts to access uninitialized memory.

    1. void PoisonWith(void* dest, int nitems, type,
    2. U8 byte)
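
    Poisoning is just a typed memset with a recognisable byte. The following standalone sketch uses hypothetical TOY_POISON_* macros (not Perl's definitions) with the byte values quoted above, plus a small helper, all_bytes_are, that is only there to make the effect observable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macros following the documented argument order. */
#define TOY_POISON_WITH(dest, nitems, type, byte) \
    ((void)memset((dest), (byte), (size_t)(nitems) * sizeof(type)))
#define TOY_POISON_NEW(dest, nitems, type) \
    TOY_POISON_WITH(dest, nitems, type, 0xAB)
#define TOY_POISON_FREE(dest, nitems, type) \
    TOY_POISON_WITH(dest, nitems, type, 0xEF)

/* Returns 1 if every one of the len bytes at p equals byte. */
static int all_bytes_are(const void *p, size_t len, unsigned char byte)
{
    const unsigned char *b = (const unsigned char *)p;
    size_t i;

    for (i = 0; i < len; i++) {
        if (b[i] != byte)
            return 0;
    }
    return 1;
}
```

    A value read back from poisoned memory shows the repeated pattern (for example 0xABABABAB in a 32-bit integer), which is easy to spot in a debugger.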
  • Renew

    The XSUB-writer's interface to the C realloc function.

    1. void Renew(void* ptr, int nitems, type)
  • Renewc

    The XSUB-writer's interface to the C realloc function, with cast.

    1. void Renewc(void* ptr, int nitems, type, cast)
  • Safefree

    The XSUB-writer's interface to the C free function.

    1. void Safefree(void* ptr)
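
    The (ptr, nitems, type) calling shape shared by Newx, Newxz and Renew can be illustrated with plain C library calls. The TOY_* macros and make_range below are hypothetical and are not Perl's implementations; in particular, this sketch reports allocation failure by returning NULL, which is an assumption of the example and not a statement about how Perl's allocators behave.

```c
#include <stdlib.h>

/* Hypothetical macros mirroring the documented argument order. */
#define TOY_NEWX(ptr, nitems, type) \
    ((ptr) = (type *)malloc((size_t)(nitems) * sizeof(type)))
#define TOY_NEWXZ(ptr, nitems, type) \
    ((ptr) = (type *)calloc((size_t)(nitems), sizeof(type)))
#define TOY_RENEW(ptr, nitems, type) \
    ((ptr) = (type *)realloc((ptr), (size_t)(nitems) * sizeof(type)))
#define TOY_SAFEFREE(ptr) free(ptr)

/* Build an array holding 0..n-1, growing it once along the way.
 * Returns NULL if n <= 0 or on allocation failure. The caller releases
 * the result with TOY_SAFEFREE. */
static int *make_range(int n)
{
    int *v, *old;
    int half = n / 2, i;

    if (n <= 0)
        return NULL;
    TOY_NEWX(v, half + 1, int);      /* always at least one element */
    if (!v)
        return NULL;
    for (i = 0; i < half; i++)
        v[i] = i;

    old = v;                         /* keep the old block in case realloc fails */
    TOY_RENEW(v, n, int);
    if (!v) {
        TOY_SAFEFREE(old);
        return NULL;
    }
    for (i = half; i < n; i++)
        v[i] = i;
    return v;
}
```

    The macros assign to their first argument, so the pointer is passed by name rather than by address, as in the prototypes above.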
  • savepv

    Perl's version of strdup(). Returns a pointer to a newly allocated string which is a duplicate of pv. The size of the string is determined by strlen(). The memory allocated for the new string can be freed with the Safefree() function.

    1. char* savepv(const char* pv)
  • savepvn

    Perl's version of what strndup() would be if it existed. Returns a pointer to a newly allocated string which is a duplicate of the first len bytes from pv, plus a trailing NUL byte. The memory allocated for the new string can be freed with the Safefree() function.

    1. char* savepvn(const char* pv, I32 len)
  • savepvs

    Like savepvn, but takes a literal string instead of a string/length pair.

    1. char* savepvs(const char* s)
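
    The three duplication helpers differ only in how the length is obtained. A minimal sketch in standalone C, using hypothetical names (toy_savepvn, toy_savepv, TOY_SAVEPVS) and plain malloc in place of Perl's allocator:

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate the first len bytes of pv and append a trailing NUL.
 * Returns NULL on allocation failure (an assumption of this sketch). */
static char *toy_savepvn(const char *pv, size_t len)
{
    char *copy = (char *)malloc(len + 1);

    if (!copy)
        return NULL;
    if (len)
        memcpy(copy, pv, len);
    copy[len] = '\0';
    return copy;
}

/* Length determined by strlen(), as described for savepv. */
static char *toy_savepv(const char *pv)
{
    return toy_savepvn(pv, strlen(pv));
}

/* Literal-string variant: the length is computed at compile time with
 * sizeof. The "" s "" concatenation rejects non-literal arguments. */
#define TOY_SAVEPVS(s) toy_savepvn("" s "", sizeof(s) - 1)
```

    Each result is owned by the caller and is released with free() in this sketch, where the real functions pair with Safefree().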
  • savesharedpv

    A version of savepv() which allocates the duplicate string in memory which is shared between threads.

    1. char* savesharedpv(const char* pv)
  • savesharedpvn

    A version of savepvn() which allocates the duplicate string in memory which is shared between threads. (With the specific difference that a NULL pointer is not acceptable.)

    1. char* savesharedpvn(const char *const pv,
    2. const STRLEN len)
  • savesharedpvs

    A version of savepvs() which allocates the duplicate string in memory which is shared between threads.

    1. char* savesharedpvs(const char* s)
  • savesharedsvpv

    A version of savesharedpv() which gets the string to duplicate from the passed-in SV and allocates the duplicate string in memory which is shared between threads.

    1. char* savesharedsvpv(SV *sv)
  • savesvpv

    A version of savepv()/savepvn() which gets the string to duplicate from the passed-in SV using SvPV().

    1. char* savesvpv(SV* sv)
  • StructCopy

    This is an architecture-independent macro to copy one structure to another.

    1. void StructCopy(type *src, type *dest, type)
  • Zero

    The XSUB-writer's interface to the C memzero function. The dest is the destination, nitems is the number of items, and type is the type.

    1. void Zero(void* dest, int nitems, type)
  • ZeroD

    Like Zero but returns dest. Useful for encouraging compilers to tail-call optimise.

    1. void * ZeroD(void* dest, int nitems, type)

Miscellaneous Functions

  • fbm_compile

    Analyses the string in order to make fast searches on it using fbm_instr() -- the Boyer-Moore algorithm.

    1. void fbm_compile(SV* sv, U32 flags)
  • fbm_instr

    Returns the location of the string held in the SV littlestr within the string delimited by big and bigend. It returns NULL if the string can't be found. littlestr does not have to be fbm_compiled, but the search will not be as fast then.

    1. char* fbm_instr(unsigned char* big,
    2. unsigned char* bigend, SV* littlestr,
    3. U32 flags)
  • foldEQ

    Returns true if the leading len bytes of the strings a and b are the same case-insensitively; false otherwise. Uppercase and lowercase ASCII range bytes match themselves and their opposite case counterparts. Non-cased and non-ASCII range bytes match only themselves.

    1. I32 foldEQ(const char* a, const char* b, I32 len)
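
    The matching rule described for foldEQ (ASCII letters match either case, every other byte matches only itself) can be written out in a few lines of standalone C. toy_foldEQ and toy_ascii_fold are hypothetical names, and the sketch assumes an ASCII-based execution character set.

```c
#include <stddef.h>

/* Map ASCII uppercase letters to lowercase; leave all other bytes
 * (including bytes above 0x7F) unchanged. */
static unsigned char toy_ascii_fold(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return (unsigned char)(c + ('a' - 'A'));
    return c;
}

/* Returns 1 if the leading len bytes of a and b are equal under ASCII
 * case folding, 0 otherwise. */
static int toy_foldEQ(const char *a, const char *b, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (toy_ascii_fold((unsigned char)a[i]) !=
            toy_ascii_fold((unsigned char)b[i]))
            return 0;
    }
    return 1;
}
```

    Only the first len bytes are examined, so strings that differ later still compare equal, and bytes such as 0xC9 and 0xE9 are treated as distinct because they lie outside the ASCII range.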
  • foldEQ_locale

    Returns true if the leading len bytes of the strings a and b are the same case-insensitively in the current locale; false otherwise.

    1. I32 foldEQ_locale(const char* a, const char* b,
    2. I32 len)
  • form

    Takes a sprintf-style format pattern and conventional (non-SV) arguments and returns the formatted string.

    1. (char *) Perl_form(pTHX_ const char* pat, ...)

    can be used any place a string (char *) is required:

    1. char * s = Perl_form("%d.%d",major,minor);

    Uses a single private buffer so if you want to format several strings you must explicitly copy the earlier strings away (and free the copies when you are done).

    1. char* form(const char* pat, ...)
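
    The warning about the single private buffer is easiest to see in a small standalone analogue. toy_form and toy_keep are hypothetical helpers built on vsnprintf; they are not Perl's form() and the 256-byte buffer size is an arbitrary choice for the example.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format into one static buffer and return a pointer to it. Every call
 * overwrites the result of the previous call. */
static const char *toy_form(const char *pat, ...)
{
    static char buf[256];
    va_list ap;

    va_start(ap, pat);
    vsnprintf(buf, sizeof buf, pat, ap);
    va_end(ap);
    return buf;
}

/* Copy a formatted result so that it survives the next toy_form call.
 * The caller frees the copy. Returns NULL on allocation failure. */
static char *toy_keep(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = (char *)malloc(n);

    if (copy)
        memcpy(copy, s, n);
    return copy;
}
```

    To hold two formatted strings at once, the first must be copied with something like toy_keep before the second call, and the copy freed afterwards.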
  • getcwd_sv

    Fills the sv with the current working directory.

    1. int getcwd_sv(SV* sv)
  • mess

    Take a sprintf-style format pattern and argument list. These are used to generate a string message. If the message does not end with a newline, then it will be extended with some indication of the current location in the code, as described for mess_sv.

    Normally, the resulting message is returned in a new mortal SV. During global destruction a single SV may be shared between uses of this function.

    1. SV * mess(const char *pat, ...)
  • mess_sv

    Expands a message, intended for the user, to include an indication of the current location in the code, if the message does not already appear to be complete.

    basemsg is the initial message or object. If it is a reference, it will be used as-is and will be the result of this function. Otherwise it is used as a string, and if it already ends with a newline, it is taken to be complete, and the result of this function will be the same string. If the message does not end with a newline, then a segment such as at foo.pl line 37 will be appended, and possibly other clauses indicating the current state of execution. The resulting message will end with a dot and a newline.

    Normally, the resulting message is returned in a new mortal SV. During global destruction a single SV may be shared between uses of this function. If consume is true, then the function is permitted (but not required) to modify and return basemsg instead of allocating a new SV.

    1. SV * mess_sv(SV *basemsg, bool consume)
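
    The newline rule used by mess and mess_sv can be sketched as follows. toy_mess is a hypothetical function: it takes the file name and line number as explicit arguments, whereas the real functions derive the location from the current state of execution, and it appends only the basic "at FILE line N." segment without any of the other clauses mentioned above.

```c
#include <stdio.h>
#include <string.h>

/* Write msg into out. If msg already ends with a newline it is treated
 * as complete and copied unchanged; otherwise a location segment, a dot
 * and a newline are appended. */
static void toy_mess(char *out, size_t outsz, const char *msg,
                     const char *file, long line)
{
    size_t len = strlen(msg);

    if (len > 0 && msg[len - 1] == '\n')
        snprintf(out, outsz, "%s", msg);
    else
        snprintf(out, outsz, "%s at %s line %ld.\n", msg, file, line);
}
```

    This is why a message passed with a trailing newline is reported exactly as written, while one without it gains the location information.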
  • my_snprintf

    The C library snprintf functionality, if available and standards-compliant (uses vsnprintf, actually). However, if vsnprintf is not available, this will unfortunately use the unsafe vsprintf, which can overrun the buffer (there is an overrun check, but that may be too late). Consider using sv_vcatpvf instead, or getting vsnprintf.

    1. int my_snprintf(char *buffer, const Size_t len,
    2. const char *format, ...)
  • my_sprintf

    The C library sprintf, wrapped if necessary, to ensure that it will return the length of the string written to the buffer. Only rare pre-ANSI systems need the wrapper function - usually this is a direct call to sprintf.

    1. int my_sprintf(char *buffer, const char *pat, ...)
  • my_vsnprintf

    The C library vsnprintf if available and standards-compliant. However, if vsnprintf is not available, this will unfortunately use the unsafe vsprintf, which can overrun the buffer (there is an overrun check, but that may be too late). Consider using sv_vcatpvf instead, or getting vsnprintf.

    1. int my_vsnprintf(char *buffer, const Size_t len,
    2. const char *format, va_list ap)
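
    Where a standards-compliant vsnprintf is available, its return value makes truncation detectable. The helper below, format_checked, is a hypothetical wrapper written against the C99 library only; it says nothing about how my_snprintf or my_vsnprintf report overruns.

```c
#include <stdarg.h>
#include <stdio.h>

/* Format into buf, writing at most len bytes including the NUL.
 * Returns 1 if the whole result fitted, 0 if it was truncated or if
 * vsnprintf reported an error. */
static int format_checked(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    /* C99: n is the length the full result would have had. */
    return n >= 0 && (size_t)n < len;
}
```

    With a 4-byte buffer and the value 12345, the buffer ends up holding "123" and the function returns 0, so the caller knows the output is incomplete.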
  • new_version

    Returns a new version object based on the passed in SV:

    1. SV *sv = new_version(SV *ver);

    Does not alter the passed in ver SV. See "upg_version" if you want to upgrade the SV.

    1. SV* new_version(SV *ver)
  • prescan_version

    Validates that a given string can be parsed as a version object, but doesn't actually perform the parsing. Can use either strict or lax validation rules. Can optionally set a number of hint variables to save the parsing code some time when tokenizing.

    1. const char* prescan_version(const char *s, bool strict,
    2. const char** errstr,
    3. bool *sqv,
    4. int *ssaw_decimal,
    5. int *swidth, bool *salpha)
  • READ_XDIGIT

    Returns the value of an ASCII-range hex digit and advances the string pointer. Behaviour is only well defined when isXDIGIT(*str) is true.

    1. U8 READ_XDIGIT(char* str)
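
    A function-style analogue of "return the digit's value and advance the pointer" is sketched below. toy_read_xdigit and toy_read_hex_byte are hypothetical names, the code assumes ASCII, and, like the macro, it gives meaningful results only when the current character really is a hex digit.

```c
/* Return the value (0..15) of the hex digit at **sp and advance *sp by
 * one character. The caller must have checked that it is a hex digit. */
static unsigned toy_read_xdigit(const char **sp)
{
    unsigned char c = (unsigned char)*(*sp)++;

    if (c >= '0' && c <= '9')
        return (unsigned)(c - '0');
    /* In ASCII, 'A'..'F' and 'a'..'f' have low nibbles 1..6. */
    return (unsigned)((c & 0x0F) + 9);
}

/* Read two hex digits as one byte value. The digits are read in
 * separate statements so the order of evaluation is well defined. */
static unsigned toy_read_hex_byte(const char **sp)
{
    unsigned hi = toy_read_xdigit(sp);
    unsigned lo = toy_read_xdigit(sp);

    return (hi << 4) | lo;
}
```

    Because each call moves the pointer, a caller can decode a run of hex digits simply by calling the helper repeatedly.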
  • scan_version

    Returns a pointer to the next character after the parsed version string, as well as upgrading the passed in SV to an RV.

    Function must be called with an already existing SV like

    1. sv = newSV(0);
    2. s = scan_version(s, SV *sv, bool qv);

    Performs some preprocessing to the string to ensure that it has the correct characteristics of a version. Flags the object if it contains an underscore (which denotes this is an alpha version). The boolean qv denotes that the version should be interpreted as if it had multiple decimals, even if it doesn't.

    1. const char* scan_version(const char *s, SV *rv, bool qv)
  • strEQ

    Test two strings to see if they are equal. Returns true or false.

    1. bool strEQ(char* s1, char* s2)
  • strGE

    Test two strings to see if the first, s1, is greater than or equal to the second, s2. Returns true or false.

    1. bool strGE(char* s1, char* s2)
  • strGT

    Test two strings to see if the first, s1, is greater than the second, s2. Returns true or false.

    1. bool strGT(char* s1, char* s2)
  • strLE

    Test two strings to see if the first, s1, is less than or equal to the second, s2. Returns true or false.

    1. bool strLE(char* s1, char* s2)
  • strLT

    Test two strings to see if the first, s1, is less than the second, s2. Returns true or false.

    1. bool strLT(char* s1, char* s2)
  • strNE

    Test two strings to see if they are different. Returns true or false.

    1. bool strNE(char* s1, char* s2)
  • strnEQ

    Test two strings to see if they are equal. The len parameter indicates the number of bytes to compare. Returns true or false. (A wrapper for strncmp).

    1. bool strnEQ(char* s1, char* s2, STRLEN len)
  • strnNE

    Test two strings to see if they are different. The len parameter indicates the number of bytes to compare. Returns true or false. (A wrapper for strncmp).

    1. bool strnNE(char* s1, char* s2, STRLEN len)
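
    The comparison helpers above read naturally as thin wrappers over strcmp and strncmp. The TOY_* macros below are a hypothetical sketch of that idea in standalone C and should not be read as Perl's definitions; has_perl_prefix is an illustrative use.

```c
#include <string.h>

/* Hypothetical wrappers: each turns the three-way result of strcmp or
 * strncmp into a true/false answer. */
#define TOY_strEQ(s1, s2)       (strcmp((s1), (s2)) == 0)
#define TOY_strNE(s1, s2)       (strcmp((s1), (s2)) != 0)
#define TOY_strLT(s1, s2)       (strcmp((s1), (s2)) < 0)
#define TOY_strLE(s1, s2)       (strcmp((s1), (s2)) <= 0)
#define TOY_strGT(s1, s2)       (strcmp((s1), (s2)) > 0)
#define TOY_strGE(s1, s2)       (strcmp((s1), (s2)) >= 0)
#define TOY_strnEQ(s1, s2, len) (strncmp((s1), (s2), (len)) == 0)
#define TOY_strnNE(s1, s2, len) (strncmp((s1), (s2), (len)) != 0)

/* Example: does name start with the five bytes "PERL_"? */
static int has_perl_prefix(const char *name)
{
    return TOY_strnEQ(name, "PERL_", 5);
}
```

    Writing the test as TOY_strEQ(a, b) avoids the common mistake of treating strcmp's zero return as "false".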
  • sv_destroyable

    Dummy routine which reports that object can be destroyed when there is no sharing module present. It ignores its single SV argument, and returns 'true'. Exists to avoid test for a NULL function pointer and because it could potentially warn under some level of strict-ness.

    1. bool sv_destroyable(SV *sv)
  • sv_nosharing

    Dummy routine which "shares" an SV when there is no sharing module present. Or "locks" it. Or "unlocks" it. In other words, ignores its single SV argument. Exists to avoid test for a NULL function pointer and because it could potentially warn under some level of strict-ness.

    1. void sv_nosharing(SV *sv)
  • upg_version

    In-place upgrade of the supplied SV to a version object.

    1. SV *sv = upg_version(SV *sv, bool qv);

    Returns a pointer to the upgraded SV. Set the boolean qv if you want to force this SV to be interpreted as an "extended" version.

    1. SV* upg_version(SV *ver, bool qv)
  • vcmp

    Version object aware cmp. Both operands must already have been converted into version objects.

    1. int vcmp(SV *lhv, SV *rhv)
  • vmess

    pat and args are a sprintf-style format pattern and encapsulated argument list. These are used to generate a string message. If the message does not end with a newline, then it will be extended with some indication of the current location in the code, as described for mess_sv.

    Normally, the resulting message is returned in a new mortal SV. During global destruction a single SV may be shared between uses of this function.

    1. SV * vmess(const char *pat, va_list *args)
  • vnormal

    Accepts a version object and returns the normalized string representation. Call like:

    1. sv = vnormal(rv);

    NOTE: you can pass either the object directly or the SV contained within the RV.

    The SV returned has a refcount of 1.

    1. SV* vnormal(SV *vs)
  • vnumify

    Accepts a version object and returns the normalized floating point representation. Call like:

    1. sv = vnumify(rv);

    NOTE: you can pass either the object directly or the SV contained within the RV.

    The SV returned has a refcount of 1.

    1. SV* vnumify(SV *vs)
  • vstringify

    In order to maintain maximum compatibility with earlier versions of Perl, this function will return either the floating point notation or the multiple dotted notation, depending on whether the original version contained 1 or more dots, respectively.

    The SV returned has a refcount of 1.

    1. SV* vstringify(SV *vs)
  • vverify

    Validates that the SV contains valid internal structure for a version object. It may be passed either the version object (RV) or the hash itself (HV). If the structure is valid, it returns the HV. If the structure is invalid, it returns NULL.

    1. SV *hv = vverify(sv);

    Note that it only confirms the bare minimum structure (so as not to get confused by derived classes which may contain additional hash entries).

    1. SV* vverify(SV *vs)

MRO Functions

  • mro_get_linear_isa

    Returns the mro linearisation for the given stash. By default, this will be whatever mro_get_linear_isa_dfs returns unless some other MRO is in effect for the stash. The return value is a read-only AV*.

    You are responsible for SvREFCNT_inc() on the return value if you plan to store it anywhere semi-permanently (otherwise it might be deleted out from under you the next time the cache is invalidated).

    1. AV* mro_get_linear_isa(HV* stash)
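
    A depth-first linearisation of the kind named above can be sketched on a toy class graph. This is an illustration of the general technique (visit the class, then each parent left to right, skipping classes already listed), using hypothetical types and names (toy_class, toy_linear_isa_dfs); it does not use stashes or any Perl data structures.

```c
#include <stddef.h>

#define TOY_MAX_PARENTS 4

/* Hypothetical class record: a name and a list of parents, terminated
 * by NULL unless all TOY_MAX_PARENTS slots are used. */
typedef struct toy_class {
    const char *name;
    const struct toy_class *isa[TOY_MAX_PARENTS];
} toy_class;

/* Returns 1 if c is already among the first n entries of out. */
static int toy_seen(const toy_class **out, size_t n, const toy_class *c)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (out[i] == c)
            return 1;
    }
    return 0;
}

/* Append c and then its ancestors to out, depth first and left to
 * right, skipping classes that are already present. n is the number of
 * entries already in out and cap is its capacity. Returns the new
 * number of entries. */
static size_t toy_linear_isa_dfs(const toy_class *c, const toy_class **out,
                                 size_t n, size_t cap)
{
    size_t i;

    if (n >= cap || toy_seen(out, n, c))
        return n;
    out[n++] = c;
    for (i = 0; i < TOY_MAX_PARENTS && c->isa[i]; i++)
        n = toy_linear_isa_dfs(c->isa[i], out, n, cap);
    return n;
}
```

    For a diamond in which D inherits from B and C, and both B and C inherit from A, this sketch yields the order D, B, A, C.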
  • mro_method_changed_in

    Invalidates method caching on any child classes of the given stash, so that they might notice the changes in this one.

    Ideally, all instances of PL_sub_generation++ in perl source outside of mro.c should be replaced by calls to this.

    Perl automatically handles most of the common ways a method might be redefined. However, there are a few ways you could change a method in a stash without the cache code noticing, in which case you need to call this method afterwards:

    1) Directly manipulating the stash HV entries from XS code.

    2) Assigning a reference to a readonly scalar constant into a stash entry in order to create a constant subroutine (like constant.pm does).

    This same method is available from pure perl via mro::method_changed_in(classname).

    1. void mro_method_changed_in(HV* stash)
  • mro_register

    Registers a custom mro plugin. See perlmroapi for details.

    1. void mro_register(const struct mro_alg *mro)

Multicall Functions

Numeric functions

  • grok_bin

    Converts a string representing a binary number to numeric form.

    On entry start and *len_p give the string to scan, *flags gives conversion flags, and result should be NULL or a pointer to an NV. The scan stops at the end of the string, or the first invalid character. Unless PERL_SCAN_SILENT_ILLDIGIT is set in *flags, encountering an invalid character will also trigger a warning. On return *len_p is set to the length of the scanned string, and *flags gives output flags.

    If the value is <= UV_MAX it is returned as a UV, the output flags are clear, and nothing is written to *result. If the value is > UV_MAX grok_bin returns UV_MAX, sets PERL_SCAN_GREATER_THAN_UV_MAX in the output flags, and writes the value to *result (or the value is discarded if result is NULL).

    The binary number may optionally be prefixed with "0b" or "b" unless PERL_SCAN_DISALLOW_PREFIX is set in *flags on entry. If PERL_SCAN_ALLOW_UNDERSCORES is set in *flags then the binary number may use '_' characters to separate digits.

    UV grok_bin(const char* start, STRLEN* len_p,
                I32* flags, NV *result)
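
    The entry and exit contract described above can be modelled outside of Perl. What follows is a minimal standalone sketch, not Perl's implementation: demo_grok_bin and the DEMO_* flag values are hypothetical stand-ins, uint64_t stands in for UV and double for NV, and the warning for invalid digits is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch; real code uses the
 * PERL_SCAN_* constants from Perl's headers. */
#define DEMO_SCAN_ALLOW_UNDERSCORES    0x01
#define DEMO_SCAN_DISALLOW_PREFIX      0x02
#define DEMO_SCAN_GREATER_THAN_UV_MAX  0x04

/* Models the documented contract: *len_p and *flags are read on entry
 * and overwritten on return; *result is written only on overflow. */
uint64_t demo_grok_bin(const char *start, size_t *len_p,
                       int *flags, double *result)
{
    assert(start != NULL && len_p != NULL && flags != NULL);

    const char *s = start;
    const char *end = start + *len_p;
    const int in_flags = *flags;
    uint64_t value = 0;
    double value_nv = 0.0;
    int overflowed = 0;

    /* Optional "0b" or "b" prefix. */
    if (!(in_flags & DEMO_SCAN_DISALLOW_PREFIX)) {
        if (end - s >= 2 && s[0] == '0' && s[1] == 'b')
            s += 2;
        else if (end - s >= 1 && s[0] == 'b')
            s += 1;
    }

    for (; s < end; s++) {
        if (*s == '0' || *s == '1') {
            int bit = *s - '0';
            if (!overflowed) {
                if (value <= (UINT64_MAX >> 1)) {
                    value = (value << 1) | (uint64_t)bit;
                    continue;
                }
                /* Next shift would not fit: switch to floating point. */
                overflowed = 1;
                value_nv = (double)value;
            }
            value_nv = value_nv * 2.0 + bit;
            continue;
        }
        /* An underscore is accepted only between digits, and only on request. */
        if (*s == '_' && (in_flags & DEMO_SCAN_ALLOW_UNDERSCORES)
            && s + 1 < end && (s[1] == '0' || s[1] == '1'))
            continue;
        break; /* first invalid character ends the scan */
    }

    *len_p = (size_t)(s - start);
    if (overflowed) {
        *flags = DEMO_SCAN_GREATER_THAN_UV_MAX;
        if (result != NULL)
            *result = value_nv;
        return UINT64_MAX;
    }
    *flags = 0;
    return value;
}
```

    A caller therefore checks the output flags before trusting the return value, and reads *result only when the overflow flag is set.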
  • grok_hex

    Converts a string representing a hex number to numeric form.

    On entry start and *len_p give the string to scan, *flags gives conversion flags, and result should be NULL or a pointer to an NV. The scan stops at the end of the string, or the first invalid character. Unless PERL_SCAN_SILENT_ILLDIGIT is set in *flags, encountering an invalid character will also trigger a warning. On return *len_p is set to the length of the scanned string, and *flags gives output flags.

    If the value is <= UV_MAX it is returned as a UV, the output flags are clear, and nothing is written to *result. If the value is > UV_MAX grok_hex returns UV_MAX, sets PERL_SCAN_GREATER_THAN_UV_MAX in the output flags, and writes the value to *result (or the value is discarded if result is NULL).

    The hex number may optionally be prefixed with "0x" or "x" unless PERL_SCAN_DISALLOW_PREFIX is set in *flags on entry. If PERL_SCAN_ALLOW_UNDERSCORES is set in *flags then the hex number may use '_' characters to separate digits.

    UV grok_hex(const char* start, STRLEN* len_p,
                I32* flags, NV *result)
  • grok_number

    Recognise (or not) a number. The type of the number is returned (0 if unrecognised), otherwise it is a bit-ORed combination of IS_NUMBER_IN_UV, IS_NUMBER_GREATER_THAN_UV_MAX, IS_NUMBER_NOT_INT, IS_NUMBER_NEG, IS_NUMBER_INFINITY, IS_NUMBER_NAN (defined in perl.h).

    If the value of the number can fit in a UV, it is returned in *valuep, and IS_NUMBER_IN_UV will be set to indicate that *valuep is valid. IS_NUMBER_IN_UV will never be set unless *valuep is valid, but *valuep may have been assigned to during processing even though IS_NUMBER_IN_UV is not set on return. If valuep is NULL, IS_NUMBER_IN_UV will be set for the same cases as when valuep is non-NULL, but no actual assignment (or SEGV) will occur.

    IS_NUMBER_NOT_INT will be set with IS_NUMBER_IN_UV if trailing decimals were seen (in which case *valuep gives the true value truncated to an integer), and IS_NUMBER_NEG if the number is negative (in which case *valuep holds the absolute value). IS_NUMBER_IN_UV is not set if e notation was used or the number is larger than a UV.

    int grok_number(const char *pv, STRLEN len,
                    UV *valuep)
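
    To illustrate how the returned bits combine with *valuep, here is a small standalone sketch of a consumer that wants a signed integer. The function demo_number_to_i64 and the DEMO_* values are hypothetical placeholders chosen for this example; real code should use the IS_NUMBER_* constants from perl.h and Perl's own integer types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only. */
#define DEMO_IS_NUMBER_IN_UV    0x01
#define DEMO_IS_NUMBER_NOT_INT  0x04
#define DEMO_IS_NUMBER_NEG      0x08

/* Given the type bits and the absolute value reported by a
 * grok_number-style scanner, store an exact signed integer in *out.
 * Returns 1 on success, 0 if the number is not an exactly
 * representable integer. */
int demo_number_to_i64(int type, uint64_t absval, int64_t *out)
{
    assert(out != NULL);

    /* The value is only meaningful when the IN_UV bit is set. */
    if (!(type & DEMO_IS_NUMBER_IN_UV))
        return 0;

    /* Trailing decimals were seen: the value was truncated. */
    if (type & DEMO_IS_NUMBER_NOT_INT)
        return 0;

    if (type & DEMO_IS_NUMBER_NEG) {
        /* absval holds the magnitude; the most negative value needs care. */
        if (absval > (uint64_t)INT64_MAX + 1)
            return 0;
        *out = (absval == (uint64_t)INT64_MAX + 1)
                   ? INT64_MIN
                   : -(int64_t)absval;
        return 1;
    }

    if (absval > (uint64_t)INT64_MAX)
        return 0;
    *out = (int64_t)absval;
    return 1;
}
```

    The order of the checks mirrors the text: validity first, then the non-integer bit, then the sign.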
  • grok_numeric_radix

    Scan and skip for a numeric decimal separator (radix).

    bool grok_numeric_radix(const char **sp,
                            const char *send)
  • grok_oct

    Converts a string representing an octal number to numeric form.

    On entry start and *len_p give the string to scan, *flags gives conversion flags, and result should be NULL or a pointer to an NV. The scan stops at the end of the string, or the first invalid character. Unless PERL_SCAN_SILENT_ILLDIGIT is set in *flags, encountering an 8 or 9 will also trigger a warning. On return *len_p is set to the length of the scanned string, and *flags gives output flags.

    If the value is <= UV_MAX it is returned as a UV, the output flags are clear, and nothing is written to *result. If the value is > UV_MAX grok_oct returns UV_MAX, sets PERL_SCAN_GREATER_THAN_UV_MAX in the output flags, and writes the value to *result (or the value is discarded if result is NULL).

    If PERL_SCAN_ALLOW_UNDERSCORES is set in *flags then the octal number may use '_' characters to separate digits.

    UV grok_oct(const char* start, STRLEN* len_p,
                I32* flags, NV *result)
  • Perl_signbit

    Return a non-zero integer if the sign bit on an NV is set, and 0 if it is not.

    If Configure detects this system has a signbit() that will work with our NVs, then we just use it via the #define in perl.h. Otherwise, fall back on this implementation. As a first pass, this gets everything right except -0.0. Alas, catching -0.0 is the main use for this function, so this is not too helpful yet. Still, at least we have the scaffolding in place to support other systems, should that prove useful.

    Configure notes: This function is called 'Perl_signbit' instead of a plain 'signbit' because it is easy to imagine a system having a signbit() function or macro that doesn't happen to work with our particular choice of NVs. We shouldn't just re-#define signbit as Perl_signbit and expect the standard system headers to be happy. Also, this is a no-context function (no pTHX_) because Perl_signbit() is usually re-#defined in perl.h as a simple macro call to the system's signbit(). Users should just always call Perl_signbit().

    NOTE: this function is experimental and may change or be removed without notice.

    int Perl_signbit(NV f)
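
    The -0.0 problem mentioned above is easy to reproduce in plain C. This standalone sketch assumes IEEE 754 doubles and compares a naive comparison-based test with the C99 signbit() macro; the demo_* names are hypothetical, and the naive version only illustrates the kind of fallback the text describes.

```c
#include <assert.h>
#include <math.h>

/* Comparison-based test: correct for everything except negative zero,
 * because -0.0 < 0.0 is false. */
int demo_naive_signbit(double f)
{
    return f < 0.0;
}

/* Uses the C99 signbit() macro, which inspects the sign bit itself. */
int demo_signbit(double f)
{
    return signbit(f) != 0;
}

/* Usage: the two versions agree except on negative zero. */
void demo_signbit_check(void)
{
    assert(demo_naive_signbit(-1.5) == demo_signbit(-1.5));
    assert(demo_naive_signbit(2.0) == demo_signbit(2.0));
    assert(demo_naive_signbit(-0.0) != demo_signbit(-0.0));
}
```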
  • scan_bin

    For backwards compatibility. Use grok_bin instead.

    NV scan_bin(const char* start, STRLEN len,
                STRLEN* retlen)
  • scan_hex

    For backwards compatibility. Use grok_hex instead.

    NV scan_hex(const char* start, STRLEN len,
                STRLEN* retlen)
  • scan_oct

    For backwards compatibility. Use grok_oct instead.

    NV scan_oct(const char* start, STRLEN len,
                STRLEN* retlen)

Optree construction

  • newASSIGNOP

    Constructs, checks, and returns an assignment op. left and right supply the parameters of the assignment; they are consumed by this function and become part of the constructed op tree.

    If optype is OP_ANDASSIGN, OP_ORASSIGN, or OP_DORASSIGN, then a suitable conditional optree is constructed. If optype is the opcode of a binary operator, such as OP_BIT_OR, then an op is constructed that performs the binary operation and assigns the result to the left argument. Either way, if optype is non-zero then flags has no effect.

    If optype is zero, then a plain scalar or list assignment is constructed. The type of assignment is determined automatically. flags gives the eight bits of op_flags, except that OPf_KIDS will be set automatically, and, shifted up eight bits, the eight bits of op_private, except that the bit with value 1 or 2 is automatically set as required.

    OP * newASSIGNOP(I32 flags, OP *left, I32 optype,
                     OP *right)
  • newBINOP

    Constructs, checks, and returns an op of any binary type. type is the opcode. flags gives the eight bits of op_flags, except that OPf_KIDS will be set automatically, and, shifted up eight bits, the eight bits of op_private, except that the bit with value 1 or 2 is automatically set as required. first and last supply up to two ops to be the direct children of the binary op; they are consumed by this function and become part of the constructed op tree.

    OP * newBINOP(I32 type, I32 flags, OP *first,
                  OP *last)
  • newCONDOP

    Constructs, checks, and returns a conditional-expression (cond_expr) op. flags gives the eight bits of op_flags, except that OPf_KIDS will be set automatically, and, shifted up eight bits, the eight bits of op_private, except that the bit with value 1 is automatically set. first supplies the expression selecting between the two branches, and trueop and falseop supply the branches; they are consumed by this function and become part of the constructed op tree.

    OP * newCONDOP(I32 flags, OP *first, OP *trueop,
                   OP *falseop)
  • newFOROP

    Constructs, checks, and returns an op tree expressing a foreach loop (iteration through a list of values). This is a heavyweight loop, with structure that allows exiting the loop by last and suchlike.

    sv optionally supplies the variable that will be aliased to each item in turn; if null, it defaults to $_ (either lexical or global). expr supplies the list of values to iterate over. block supplies the main body of the loop, and cont optionally supplies a continue block that operates as a second half of the body. All of these optree inputs are consumed by this function and become part of the constructed op tree.

    flags gives the eight bits of op_flags for the leaveloop op and, shifted up eight bits, the eight bits of op_private for the leaveloop op, except that (in both cases) some bits will be set automatically.

    OP * newFOROP(I32 flags, OP *sv, OP *expr, OP *block,
                  OP *cont)
  • newGIVENOP

    Constructs, checks, and returns an op tree expressing a given block. cond supplies the expression that will be locally assigned to a lexical variable, and block supplies the body of the given construct; they are consumed by this function and become part of the constructed op tree. defsv_off is the pad offset of the scalar lexical variable that will be affected. If it is 0, the global $_ will be used.

    OP * newGIVENOP(OP *cond, OP *block,
                    PADOFFSET defsv_off)
  • newGVOP

    Constructs, checks, and returns an op of any type that involves an embedded reference to a GV. type is the opcode. flags gives the eight bits of op_flags. gv identifies the GV that the op should reference; calling this function does not transfer ownership of any reference to it.

    OP * newGVOP(I32 type, I32 flags, GV *gv)
  • newLISTOP

    Constructs, checks, and returns an op of any list type. type is the opcode. flags gives the eight bits of op_flags, except that OPf_KIDS will be set automatically if required. first and last supply up to two ops to be direct children of the list op; they are consumed by this function and become part of the constructed op tree.

    OP * newLISTOP(I32 type, I32 flags, OP *first,
                   OP *last)
  • newLOGOP

    Constructs, checks, and returns a logical (flow control) op. type is the opcode. flags gives the eight bits of op_flags, except that OPf_KIDS will be set automatically, and, shifted up eight bits, the eight bits of op_private, except that the bit with value 1 is automatically set. first supplies the expression controlling the flow, and other supplies the side (alternate) chain of ops; they are consumed by this function and become part of the constructed op tree.

    OP * newLOGOP(I32 type, I32 flags, OP *first,
                  OP *other)
  • newLOOPEX

    Constructs, checks, and returns a loop-exiting op (such as goto or last). type is the opcode. label supplies the parameter determining the target of the op; it is consumed by this function and becomes part of the constructed op tree.

    OP * newLOOPEX(I32 type, OP *label)
  • newLOOPOP

    Constructs, checks, and returns an op tree expressing a loop. This is only a loop in the control flow through the op tree; it does not have the heavyweight loop structure that allows exiting the loop by last and suchlike. flags gives the eight bits of op_flags for the top-level op, except that some bits will be set automatically as required. expr supplies the expression controlling loop iteration, and block supplies the body of the loop; they are consumed by this function and become part of the constructed op tree. debuggable is currently unused and should always be 1.

    OP * newLOOPOP(I32 flags, I32 debuggable, OP *expr,
                   OP *block)
  • newNULLLIST

    Constructs, checks, and returns a new stub op, which represents an empty list expression.

    OP * newNULLLIST()
  • newOP

    Constructs, checks, and returns an op of any base type (any type that has no extra fields). type is the opcode. flags gives the eight bits of op_flags, and, shifted up eight bits, the eight bits of op_private.

    OP * newOP(I32 type, I32 flags)
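
    Many constructors in this section take a single flags argument that carries op_flags in the low eight bits and op_private in the next eight. As a minimal standalone sketch of that packing (the demo_* helpers are hypothetical and are not part of the Perl API; int32_t stands in for I32):

```c
#include <assert.h>
#include <stdint.h>

/* Combine the two eight-bit fields into one flags argument:
 * op_flags in bits 0..7, op_private in bits 8..15. */
int32_t demo_pack_op_flags(uint8_t op_flags, uint8_t op_private)
{
    return (int32_t)op_flags | ((int32_t)op_private << 8);
}

/* Recover the low eight bits (op_flags). */
uint8_t demo_unpack_op_flags(int32_t flags)
{
    return (uint8_t)(flags & 0xff);
}

/* Recover the next eight bits (op_private). */
uint8_t demo_unpack_op_private(int32_t flags)
{
    return (uint8_t)((flags >> 8) & 0xff);
}

/* Usage: packing followed by unpacking returns the original fields. */
void demo_pack_check(void)
{
    int32_t flags = demo_pack_op_flags(0x04, 0x80);
    assert(flags == 0x8004);
    assert(demo_unpack_op_flags(flags) == 0x04);
    assert(demo_unpack_op_private(flags) == 0x80);
}
```

    With real constants, the same shape would be written as an op_flags value ORed with an op_private value shifted left by eight.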
  • newPADOP

    Constructs, checks, and returns an op of any type that involves a reference to a pad element. type is the opcode. flags gives the eight bits of op_flags. A pad slot is automatically allocated, and is populated with sv; this function takes ownership of one reference to it.

    This function only exists if Perl has been compiled to use ithreads.

    OP * newPADOP(I32 type, I32 flags, SV *sv)
  • newPMOP

    Constructs, checks, and returns an op of any pattern matching type. type is the opcode. flags gives the eight bits of op_flags and, shifted up eight bits, the eight bits of op_private .

    OP * newPMOP(I32 type, I32 flags)
  • newPVOP

    Constructs, checks, and returns an op of any type that involves an embedded C-level pointer (PV). type is the opcode. flags gives the eight bits of op_flags. pv supplies the C-level pointer, which must have been allocated using PerlMemShared_malloc; the memory will be freed when the op is destroyed.

    OP * newPVOP(I32 type, I32 flags, char *pv)
  • newRANGE

    Constructs and returns a range op, with subordinate flip and flop ops. flags gives the eight bits of op_flags for the flip op and, shifted up eight bits, the eight bits of op_private for both the flip and range ops, except that the bit with value 1 is automatically set. left and right supply the expressions controlling the endpoints of the range; they are consumed by this function and become part of the constructed op tree.

    OP * newRANGE(I32 flags, OP *left, OP *right)
  • newSLICEOP

    Constructs, checks, and returns an lslice (list slice) op. flags gives the eight bits of op_flags, except that OPf_KIDS will be set automatically, and, shifted up eight bits, the eight bits of op_private, except that the bit with value 1 or 2 is automatically set as required. listval and subscript supply the parameters of the slice; they are consumed by this function and become part of the constructed op tree.

    OP * newSLICEOP(I32 flags, OP *subscript,
                    OP *listval)
  • newSTATEOP

    Constructs a state op (COP). The state op is normally a nextstate op, but will be a dbstate op if debugging is enabled for currently-compiled code. The state op is populated from PL_curcop (or PL_compiling). If label is non-null, it supplies the name of a label to attach to the state op; this function takes ownership of the memory pointed at by label, and will free it. flags gives the eight bits of op_flags for the state op.

    If o is null, the state op is returned. Otherwise the state op is combined with o into a lineseq list op, which is returned. o is consumed by this function and becomes part of the returned op tree.

    OP * newSTATEOP(I32 flags, char *label, OP *o)
  • newSVOP

    Constructs, checks, and returns an op of any type that involves an embedded SV. type is the opcode. flags gives the eight bits of op_flags. sv gives the SV to embed in the op; this function takes ownership of one reference to it.

    OP * newSVOP(I32 type, I32 flags, SV *sv)
  • newUNOP

    Constructs, checks, and returns an op of any unary type. type is the opcode. flags gives the eight bits of op_flags, except that OPf_KIDS will be set automatically if required, and, shifted up eight bits, the eight bits of op_private, except that the bit with value 1 is automatically set. first supplies an optional op to be the direct child of the unary op; it is consumed by this function and becomes part of the constructed op tree.

    OP * newUNOP(I32 type, I32 flags, OP *first)
  • newWHENOP

    Constructs, checks, and returns an op tree expressing a when block. cond supplies the test expression, and block supplies the block that will be executed if the test evaluates to true; they are consumed by this function and become part of the constructed op tree. cond will be interpreted DWIMically, often as a comparison against $_, and may be null to generate a default block.

    OP * newWHENOP(OP *cond, OP *block)
  • newWHILEOP

    Constructs, checks, and returns an op tree expressing a while loop. This is a heavyweight loop, with structure that allows exiting the loop by last and suchlike.

    loop is an optional preconstructed enterloop op to use in the loop; if it is null then a suitable op will be constructed automatically. expr supplies the loop's controlling expression. block supplies the main body of the loop, and cont optionally supplies a continue block that operates as a second half of the body. All of these optree inputs are consumed by this function and become part of the constructed op tree.

    flags gives the eight bits of op_flags for the leaveloop op and, shifted up eight bits, the eight bits of op_private for the leaveloop op, except that (in both cases) some bits will be set automatically. debuggable is currently unused and should always be 1. has_my can be supplied as true to force the loop body to be enclosed in its own scope.

    OP * newWHILEOP(I32 flags, I32 debuggable,
                    LOOP *loop, OP *expr, OP *block,
                    OP *cont, I32 has_my)

Optree Manipulation Functions

  • ck_entersub_args_list

    Performs the default fixup of the arguments part of an entersub op tree. This consists of applying list context to each of the argument ops. This is the standard treatment used on a call marked with &, or a method call, or a call through a subroutine reference, or any other call where the callee can't be identified at compile time, or a call where the callee has no prototype.

    OP * ck_entersub_args_list(OP *entersubop)
  • ck_entersub_args_proto

    Performs the fixup of the arguments part of an entersub op tree based on a subroutine prototype. This makes various modifications to the argument ops, from applying context up to inserting refgen ops, and checking the number and syntactic types of arguments, as directed by the prototype. This is the standard treatment used on a subroutine call, not marked with &, where the callee can be identified at compile time and has a prototype.

    protosv supplies the subroutine prototype to be applied to the call. It may be a normal defined scalar, of which the string value will be used. Alternatively, for convenience, it may be a subroutine object (a CV* that has been cast to SV*) which has a prototype. The prototype supplied, in whichever form, does not need to match the actual callee referenced by the op tree.

    If the argument ops disagree with the prototype, for example by having an unacceptable number of arguments, a valid op tree is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. In the error message, the callee is referred to by the name defined by the namegv parameter.

    OP * ck_entersub_args_proto(OP *entersubop,
                                GV *namegv, SV *protosv)
  • ck_entersub_args_proto_or_list

    Performs the fixup of the arguments part of an entersub op tree either based on a subroutine prototype or using default list-context processing. This is the standard treatment used on a subroutine call, not marked with &, where the callee can be identified at compile time.

    protosv supplies the subroutine prototype to be applied to the call, or indicates that there is no prototype. It may be a normal scalar, in which case if it is defined then the string value will be used as a prototype, and if it is undefined then there is no prototype. Alternatively, for convenience, it may be a subroutine object (a CV* that has been cast to SV*), of which the prototype will be used if it has one. The prototype (or lack thereof) supplied, in whichever form, does not need to match the actual callee referenced by the op tree.

    If the argument ops disagree with the prototype, for example by having an unacceptable number of arguments, a valid op tree is returned anyway. The error is reflected in the parser state, normally resulting in a single exception at the top level of parsing which covers all the compilation errors that occurred. In the error message, the callee is referred to by the name defined by the namegv parameter.

    OP * ck_entersub_args_proto_or_list(OP *entersubop,
                                        GV *namegv,
                                        SV *protosv)
  • cv_const_sv

    If cv is a constant sub eligible for inlining, returns the constant value returned by the sub. Otherwise, returns NULL.

    Constant subs can be created with newCONSTSUB or as described in Constant Functions in perlsub.

    SV* cv_const_sv(const CV *const cv)
  • cv_get_call_checker

    Retrieves the function that will be used to fix up a call to cv. Specifically, the function is applied to an entersub op tree for a subroutine call, not marked with &, where the callee can be identified at compile time as cv.

    The C-level function pointer is returned in *ckfun_p, and an SV argument for it is returned in *ckobj_p. The function is intended to be called in this manner:

    entersubop = (*ckfun_p)(aTHX_ entersubop, namegv, (*ckobj_p));

    In this call, entersubop is a pointer to the entersub op, which may be replaced by the check function, and namegv is a GV supplying the name that should be used by the check function to refer to the callee of the entersub op if it needs to emit any diagnostics. It is permitted to apply the check function in non-standard situations, such as to a call to a different subroutine or to a method call.

    By default, the function is Perl_ck_entersub_args_proto_or_list, and the SV parameter is cv itself. This implements standard prototype processing. It can be changed, for a particular subroutine, by cv_set_call_checker.

    void cv_get_call_checker(CV *cv,
                             Perl_call_checker *ckfun_p,
                             SV **ckobj_p)
  • cv_set_call_checker

    Sets the function that will be used to fix up a call to cv. Specifically, the function is applied to an entersub op tree for a subroutine call, not marked with &, where the callee can be identified at compile time as cv.

    The C-level function pointer is supplied in ckfun, and an SV argument for it is supplied in ckobj. The function is intended to be called in this manner:

    entersubop = ckfun(aTHX_ entersubop, namegv, ckobj);

    In this call, entersubop is a pointer to the entersub op, which may be replaced by the check function, and namegv is a GV supplying the name that should be used by the check function to refer to the callee of the entersub op if it needs to emit any diagnostics. It is permitted to apply the check function in non-standard situations, such as to a call to a different subroutine or to a method call.

    The current setting for a particular CV can be retrieved by cv_get_call_checker.

    void cv_set_call_checker(CV *cv,
                             Perl_call_checker ckfun,
                             SV *ckobj)
  • LINKLIST

    Given the root of an optree, link the tree in execution order using the op_next pointers and return the first op executed. If this has already been done, it will not be redone, and o->op_next will be returned. If o->op_next is not already set, o should be at least an UNOP.

    OP* LINKLIST(OP *o)
  • newCONSTSUB

    See newCONSTSUB_flags.

    CV* newCONSTSUB(HV* stash, const char* name, SV* sv)
  • newCONSTSUB_flags

    Creates a constant sub equivalent to Perl sub FOO () { 123 } which is eligible for inlining at compile-time.

    Currently, the only useful value for flags is SVf_UTF8.

    The newly created subroutine takes ownership of a reference to the passed in SV.

    Passing NULL for SV creates a constant sub equivalent to sub BAR () {}, which won't be called if used as a destructor, but will suppress the overhead of a call to AUTOLOAD. (This form, however, isn't eligible for inlining at compile time.)

    CV* newCONSTSUB_flags(HV* stash, const char* name,
                          STRLEN len, U32 flags, SV* sv)
  • newXS

    Used by xsubpp to hook up XSUBs as Perl subs. filename needs to be static storage, as it is used directly as CvFILE(), without a copy being made.

  • op_append_elem

    Append an item to the list of ops contained directly within a list-type op, returning the lengthened list. first is the list-type op, and last is the op to append to the list. optype specifies the intended opcode for the list. If first is not already a list of the right type, it will be upgraded into one. If either first or last is null, the other is returned unchanged.

    OP * op_append_elem(I32 optype, OP *first, OP *last)
  • op_append_list

    Concatenate the lists of ops contained directly within two list-type ops, returning the combined list. first and last are the list-type ops to concatenate. optype specifies the intended opcode for the list. If either first or last is not already a list of the right type, it will be upgraded into one. If either first or last is null, the other is returned unchanged.

    OP * op_append_list(I32 optype, OP *first, OP *last)
  • OP_CLASS

    Return the class of the provided OP: that is, which of the *OP structures it uses. For core ops this currently gets the information out of PL_opargs, which does not always accurately reflect the type used. For custom ops the type is returned from the registration, and it is up to the registree to ensure it is accurate. The value returned will be one of the OA_* constants from op.h.

    U32 OP_CLASS(OP *o)
  • OP_DESC

    Return a short description of the provided OP.

    const char * OP_DESC(OP *o)
  • op_linklist

    This function is the implementation of the LINKLIST macro. It should not be called directly.

    OP* op_linklist(OP *o)
  • op_lvalue

    Propagate lvalue ("modifiable") context to an op and its children. type represents the context type, roughly based on the type of op that would do the modifying, although local() is represented by OP_NULL, because it has no op type of its own (it is signalled by a flag on the lvalue op).

    This function detects things that can't be modified, such as $x+1, and generates errors for them. For example, $x+1 = 2 would cause it to be called with an op of type OP_ADD and a type argument of OP_SASSIGN.

    It also flags things that need to behave specially in an lvalue context, such as $$x = 5 which might have to vivify a reference in $x.

    NOTE: this function is experimental and may change or be removed without notice.

    OP * op_lvalue(OP *o, I32 type)
  • OP_NAME

    Return the name of the provided OP. For core ops this looks up the name from the op_type; for custom ops from the op_ppaddr.

    const char * OP_NAME(OP *o)
  • op_prepend_elem

    Prepend an item to the list of ops contained directly within a list-type op, returning the lengthened list. first is the op to prepend to the list, and last is the list-type op. optype specifies the intended opcode for the list. If last is not already a list of the right type, it will be upgraded into one. If either first or last is null, the other is returned unchanged.

    OP * op_prepend_elem(I32 optype, OP *first, OP *last)
  • op_scope

    Wraps up an op tree with some additional ops so that at runtime a dynamic scope will be created. The original ops run in the new dynamic scope, and then, provided that they exit normally, the scope will be unwound. The additional ops used to create and unwind the dynamic scope will normally be an enter/leave pair, but a scope op may be used instead if the ops are simple enough to not need the full dynamic scope structure.

    NOTE: this function is experimental and may change or be removed without notice.

    OP * op_scope(OP *o)
  • rv2cv_op_cv

    Examines an op, which is expected to identify a subroutine at runtime, and attempts to determine at compile time which subroutine it identifies. This is normally used during Perl compilation to determine whether a prototype can be applied to a function call. cvop is the op being considered, normally an rv2cv op. A pointer to the identified subroutine is returned, if it could be determined statically, and a null pointer is returned if it was not possible to determine statically.

    Currently, the subroutine can be identified statically if the RV that the rv2cv is to operate on is provided by a suitable gv or const op. A gv op is suitable if the GV's CV slot is populated. A const op is suitable if the constant value must be an RV pointing to a CV. Details of this process may change in future versions of Perl. If the rv2cv op has the OPpENTERSUB_AMPER flag set then no attempt is made to identify the subroutine statically: this flag is used to suppress compile-time magic on a subroutine call, forcing it to use default runtime behaviour.

    If flags has the bit RV2CVOPCV_MARK_EARLY set, then the handling of a GV reference is modified. If a GV was examined and its CV slot was found to be empty, then the gv op has the OPpEARLY_CV flag set. If the op is not optimised away, and the CV slot is later populated with a subroutine having a prototype, that flag eventually triggers the warning "called too early to check prototype".

    If flags has the bit RV2CVOPCV_RETURN_NAME_GV set, then instead of returning a pointer to the subroutine it returns a pointer to the GV giving the most appropriate name for the subroutine in this context. Normally this is just the CvGV of the subroutine, but for an anonymous (CvANON) subroutine that is referenced through a GV it will be the referencing GV. The resulting GV* is cast to CV* to be returned. A null pointer is returned as usual if there is no statically-determinable subroutine.

    CV * rv2cv_op_cv(OP *cvop, U32 flags)

Pad Data Structures

  • CvPADLIST

    CVs can have CvPADLIST(cv) set to point to a PADLIST. This is the CV's scratchpad, which stores lexical variables and opcode temporary and per-thread values.

    For these purposes "formats" are a kind-of CV; eval""s are too (except they're not callable at will and are always thrown away after the eval"" is done executing). Require'd files are simply evals without any outer lexical scope.

    XSUBs don't have CvPADLIST set - dXSTARG fetches values from PL_curpad, but that is really the caller's pad (a slot of which is allocated by every entersub).

    The PADLIST has a C array where pads are stored.

    The 0th entry of the PADLIST is a PADNAMELIST (which is actually just an AV, but that may change) which represents the "names" or rather the "static type information" for lexicals. The individual elements of a PADNAMELIST are PADNAMEs (just SVs; but, again, that may change). Future refactorings might stop the PADNAMELIST from being stored in the PADLIST's array, so don't rely on it. See PadlistNAMES.

    The CvDEPTH'th entry of a PADLIST is a PAD (an AV) which is the stack frame at that depth of recursion into the CV. The 0th slot of a frame AV is an AV which is @_. Other entries are storage for variables and op targets.

    Iterating over the PADNAMELIST iterates over all possible pad items. Pad slots that are SVs_PADTMP (targets/GVs/constants) end up having &PL_sv_undef "names" (see pad_alloc()).

    Only my/our variable (SvPADMY/PADNAME_isOUR) slots get valid names. The rest are op targets/GVs/constants which are statically allocated or resolved at compile time. These don't have names by which they can be looked up from Perl code at run time through eval"" the way my/our variables can be. Since they can't be looked up by "name" but only by their index allocated at compile time (which is usually in PL_op->op_targ), wasting a name SV for them doesn't make sense.

    The SVs in the names AV have their PV being the name of the variable. xlow+1..xhigh inclusive in the NV union is a range of cop_seq numbers for which the name is valid (accessed through the macros COP_SEQ_RANGE_LOW and _HIGH). During compilation, these fields may hold the special value PERL_PADSEQ_INTRO to indicate various stages:

    COP_SEQ_RANGE_LOW    _HIGH
    -----------------    -----
    PERL_PADSEQ_INTRO    0                    variable not yet introduced:   { my ($x
    valid-seq#           PERL_PADSEQ_INTRO    variable in scope:             { my ($x)
    valid-seq#           valid-seq#           compilation of scope complete: { my ($x) }
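    The range rule above can be sketched as a small stand-alone C model. This is not Perl's implementation: toy_padname, TOY_PADSEQ_INTRO and toy_name_visible are hypothetical names, and the sentinel value chosen here is an assumption made only so that the three stages in the table fall out of one comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PERL_PADSEQ_INTRO; the value used by the
 * real interpreter is an implementation detail. */
#define TOY_PADSEQ_INTRO UINT32_MAX

/* Toy pad name holding only the two range fields discussed above. */
typedef struct {
    uint32_t low;   /* plays the role of COP_SEQ_RANGE_LOW  */
    uint32_t high;  /* plays the role of COP_SEQ_RANGE_HIGH */
} toy_padname;

/* A name is visible to a statement whose sequence number lies in
 * low+1 .. high inclusive. */
static bool toy_name_visible(const toy_padname *pn, uint32_t seq)
{
    assert(pn != NULL);
    if (pn->low == TOY_PADSEQ_INTRO)
        return false;                 /* variable not yet introduced */
    /* With high == TOY_PADSEQ_INTRO the upper bound is always met,
     * which models "in scope, scope still being compiled". */
    return seq > pn->low && seq <= pn->high;
}
```

    In this model a name with range { 10, 20 } is invisible at sequence number 10, visible from 11 through 20, and invisible again at 21.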

    For typed lexicals, the name SV is SVt_PVMG and SvSTASH points at the type. For our lexicals, the type is also SVt_PVMG, with the SvOURSTASH slot pointing at the stash of the associated global (so that duplicate our declarations in the same package can be detected). SvUVX is sometimes hijacked to store the generation number during compilation.

    If PADNAME_OUTER (SvFAKE) is set on the name SV, then that slot in the frame AV is a REFCNT'ed reference to a lexical from "outside". In this case, the name SV does not use xlow and xhigh to store a cop_seq range, since it is in scope throughout. Instead xhigh stores some flags containing info about the real lexical (is it declared in an anon, and is it capable of being instantiated multiple times?), and for fake ANONs, xlow contains the index within the parent's pad where the lexical's value is stored, to make cloning quicker.

    If the 'name' is '&' the corresponding entry in the PAD is a CV representing a possible closure. (PADNAME_OUTER and name of '&' is not a meaningful combination currently but could become so if my sub foo {} is implemented.)

    Note that formats are treated as anon subs, and are cloned each time write is called (if necessary).

    The flag SVs_PADSTALE is cleared on lexicals each time the my() is executed, and set on scope exit. This allows the 'Variable $x is not available' warning to be generated in evals, such as

    1. { my $x = 1; sub f { eval '$x'} } f();

    For state vars, SVs_PADSTALE is overloaded to mean 'not yet initialised'.

    NOTE: this function is experimental and may change or be removed without notice.

    1. PADLIST * CvPADLIST(CV *cv)
  • PadARRAY

    The C array of pad entries.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV ** PadARRAY(PAD pad)
  • PadlistARRAY

    The C array of a padlist, containing the pads. Only subscript it with numbers >= 1, as the 0th entry is not guaranteed to remain usable.

    NOTE: this function is experimental and may change or be removed without notice.

    1. PAD ** PadlistARRAY(PADLIST padlist)
  • PadlistMAX

    The index of the last pad in the padlist.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SSize_t PadlistMAX(PADLIST padlist)
  • PadlistNAMES

    The names associated with pad entries.

    NOTE: this function is experimental and may change or be removed without notice.

    1. PADNAMELIST * PadlistNAMES(PADLIST padlist)
  • PadlistNAMESARRAY

    The C array of pad names.

    NOTE: this function is experimental and may change or be removed without notice.

    1. PADNAME ** PadlistNAMESARRAY(PADLIST padlist)
  • PadlistNAMESMAX

    The index of the last pad name.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SSize_t PadlistNAMESMAX(PADLIST padlist)
  • PadlistREFCNT

    The reference count of the padlist. Currently this is always 1.

    NOTE: this function is experimental and may change or be removed without notice.

    1. U32 PadlistREFCNT(PADLIST padlist)
  • PadMAX

    The index of the last pad entry.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SSize_t PadMAX(PAD pad)
  • PadnameLEN

    The length of the name.

    NOTE: this function is experimental and may change or be removed without notice.

    1. STRLEN PadnameLEN(PADNAME pn)
  • PadnamelistARRAY

    The C array of pad names.

    NOTE: this function is experimental and may change or be removed without notice.

    1. PADNAME ** PadnamelistARRAY(PADNAMELIST pnl)
  • PadnamelistMAX

    The index of the last pad name.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SSize_t PadnamelistMAX(PADNAMELIST pnl)
  • PadnamePV

    The name stored in the pad name struct. This returns NULL for a target or GV slot.

    NOTE: this function is experimental and may change or be removed without notice.

    1. char * PadnamePV(PADNAME pn)
  • PadnameSV

    Returns the pad name as an SV. This is currently just pn. It will begin returning a new mortal SV if pad names ever stop being SVs.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV * PadnameSV(PADNAME pn)
  • PadnameUTF8

    Whether PadnamePV is in UTF8.

    NOTE: this function is experimental and may change or be removed without notice.

    1. bool PadnameUTF8(PADNAME pn)
  • pad_add_name_pvs

    Exactly like pad_add_name_pvn, but takes a literal string instead of a string/length pair.

    PADOFFSET pad_add_name_pvs(const char *name, U32 flags,
                               HV *typestash, HV *ourstash)
  • pad_findmy_pvs

    Exactly like pad_findmy_pvn, but takes a literal string instead of a string/length pair.

    1. PADOFFSET pad_findmy_pvs(const char *name, U32 flags)
  • pad_new

    Create a new padlist, updating the global variables for the currently-compiling padlist to point to the new padlist. The following flags can be OR'ed together:

    padnew_CLONE      this pad is for a cloned CV
    padnew_SAVE       save old globals on the save stack
    padnew_SAVESUB    also save extra stuff for start of sub

    PADLIST * pad_new(int flags)
  • PL_comppad

    During compilation, this points to the array containing the values part of the pad for the currently-compiling code. (At runtime a CV may have many such value arrays; at compile time just one is constructed.) At runtime, this points to the array containing the currently-relevant values for the pad for the currently-executing code.

    NOTE: this function is experimental and may change or be removed without notice.

  • PL_comppad_name

    During compilation, this points to the array containing the names part of the pad for the currently-compiling code.

    NOTE: this function is experimental and may change or be removed without notice.

  • PL_curpad

    Points directly to the body of the PL_comppad array. (I.e., this is PAD_ARRAY(PL_comppad).)

    NOTE: this function is experimental and may change or be removed without notice.

Per-Interpreter Variables

  • PL_modglobal

    PL_modglobal is a general purpose, interpreter global HV for use by extensions that need to keep information on a per-interpreter basis. In a pinch, it can also be used as a symbol table for extensions to share data among each other. It is a good idea to use keys prefixed by the package name of the extension that owns the data.

    1. HV* PL_modglobal
  • PL_na

    A convenience variable which is typically used with SvPV when one doesn't care about the length of the string. It is usually more efficient to either declare a local variable and use that instead or to use the SvPV_nolen macro.

    1. STRLEN PL_na
  • PL_opfreehook

    When non-NULL, the function pointed to by this variable will be called each time an OP is freed, with the corresponding OP as the argument. This allows extensions to free any extra attribute they have locally attached to an OP. It is also assured to first fire for the parent OP and then for its kids.

    When you replace this variable, it is considered good practice to save any previously installed hook and to call it from inside your own.

    1. Perl_ophook_t PL_opfreehook
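    The save-and-chain convention described above can be illustrated with a stand-alone C model. Everything here is hypothetical scaffolding rather than the Perl API: toy_op stands in for OP, toy_ophook_t for Perl_ophook_t, toy_opfreehook for PL_opfreehook, and earlier_hook plays the part of some other extension that installed its hook first.

```c
#include <assert.h>
#include <stddef.h>

/* Toy op that just counts how often each hook saw it being freed. */
typedef struct { int mine; int earlier; } toy_op;
typedef void (*toy_ophook_t)(toy_op *o);

static toy_ophook_t toy_opfreehook = NULL;  /* the global hook slot        */
static toy_ophook_t my_prev_hook   = NULL;  /* hook installed before ours  */

/* A hook belonging to some other (imaginary) extension. */
static void earlier_hook(toy_op *o) { o->earlier++; }

/* Our hook: release our own per-op data, then chain to the old hook. */
static void my_opfree(toy_op *o)
{
    o->mine++;                 /* cleanup is modelled as a counter */
    if (my_prev_hook)
        my_prev_hook(o);
}

/* Install our hook while remembering whatever was there (maybe NULL). */
static void my_install_hook(void)
{
    my_prev_hook   = toy_opfreehook;
    toy_opfreehook = my_opfree;
}

/* Models the interpreter freeing an op: the hook fires only if set. */
static void toy_free_op(toy_op *o)
{
    assert(o != NULL);
    if (toy_opfreehook)
        toy_opfreehook(o);
}
```

    After toy_opfreehook is set to earlier_hook and my_install_hook() is called, a single toy_free_op() reaches both hooks, which is the behaviour the convention is meant to preserve.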
  • PL_peepp

    Pointer to the per-subroutine peephole optimiser. This is a function that gets called at the end of compilation of a Perl subroutine (or equivalently independent piece of Perl code) to perform fixups of some ops and to perform small-scale optimisations. The function is called once for each subroutine that is compiled, and is passed, as sole parameter, a pointer to the op that is the entry point to the subroutine. It modifies the op tree in place.

    The peephole optimiser should never be completely replaced. Rather, add code to it by wrapping the existing optimiser. The basic way to do this can be seen in Compile pass 3: peephole optimization in perlguts. If the new code wishes to operate on ops throughout the subroutine's structure, rather than just at the top level, it is likely to be more convenient to wrap the PL_rpeepp hook.

    1. peep_t PL_peepp
  • PL_rpeepp

    Pointer to the recursive peephole optimiser. This is a function that gets called at the end of compilation of a Perl subroutine (or equivalently independent piece of Perl code) to perform fixups of some ops and to perform small-scale optimisations. The function is called once for each chain of ops linked through their op_next fields; it is recursively called to handle each side chain. It is passed, as sole parameter, a pointer to the op that is at the head of the chain. It modifies the op tree in place.

    The peephole optimiser should never be completely replaced. Rather, add code to it by wrapping the existing optimiser. The basic way to do this can be seen in Compile pass 3: peephole optimization in perlguts. If the new code wishes to operate only on ops at a subroutine's top level, rather than throughout the structure, it is likely to be more convenient to wrap the PL_peepp hook.

    1. peep_t PL_rpeepp
  • PL_sv_no

    This is the false SV. See PL_sv_yes. Always refer to this as &PL_sv_no.

    1. SV PL_sv_no
  • PL_sv_undef

    This is the undef SV. Always refer to this as &PL_sv_undef.

    1. SV PL_sv_undef
  • PL_sv_yes

    This is the true SV. See PL_sv_no. Always refer to this as &PL_sv_yes.

    1. SV PL_sv_yes

REGEXP Functions

  • SvRX

    Convenience macro to get the REGEXP from a SV. This is approximately equivalent to the following snippet:

    if (SvMAGICAL(sv))
        mg_get(sv);
    if (SvROK(sv))
        sv = MUTABLE_SV(SvRV(sv));
    if (SvTYPE(sv) == SVt_REGEXP)
        return (REGEXP*) sv;

    NULL will be returned if a REGEXP* is not found.

    1. REGEXP * SvRX(SV *sv)
  • SvRXOK

    Returns a boolean indicating whether the SV (or the one it references) is a REGEXP.

    If you want to do something with the REGEXP* later use SvRX instead and check for NULL.

    1. bool SvRXOK(SV* sv)

Simple Exception Handling Macros

Stack Manipulation Macros

  • dMARK

    Declare a stack marker variable, mark, for the XSUB. See MARK and dORIGMARK.

    1. dMARK;
  • dORIGMARK

    Saves the original stack mark for the XSUB. See ORIGMARK.

    1. dORIGMARK;
  • dSP

    Declares a local copy of perl's stack pointer for the XSUB, available via the SP macro. See SP.

    1. dSP;
  • EXTEND

    Used to extend the argument stack for an XSUB's return values. Once used, guarantees that there is room for at least nitems to be pushed onto the stack.

    1. void EXTEND(SP, int nitems)
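    The contract between EXTEND and the push macros listed in this section can be sketched with a toy stack in plain C. This is a simplified model, not Perl's argument stack: toy_stack, toy_extend, toy_push and toy_xpush are hypothetical names, and the stack holds ints instead of SV pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy argument stack; sp indexes the top element, -1 when empty. */
typedef struct {
    int  *base;
    long  sp;
    long  cap;
} toy_stack;

/* Like EXTEND: afterwards at least nitems more pushes are safe.
 * Returns 0 if memory could not be obtained. */
static int toy_extend(toy_stack *s, long nitems)
{
    long need = s->sp + 1 + nitems;
    if (need > s->cap) {
        int *p = realloc(s->base, (size_t)need * sizeof *p);
        if (!p)
            return 0;
        s->base = p;
        s->cap  = need;
    }
    return 1;
}

/* Like the PUSH* family: the caller must already have made room. */
static void toy_push(toy_stack *s, int v)
{
    assert(s->sp + 1 < s->cap);
    s->base[++s->sp] = v;
}

/* Like the XPUSH* family: make room for one item, then push it. */
static int toy_xpush(toy_stack *s, int v)
{
    if (!toy_extend(s, 1))
        return 0;
    toy_push(s, v);
    return 1;
}
```

    Calling toy_extend() once for a known number of return values and then using toy_push() avoids a capacity check per element, which is the usual reason to prefer the EXTEND plus PUSH pairing over repeated XPUSH calls.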
  • MARK

    Stack marker variable for the XSUB. See dMARK.

  • mPUSHi

    Push an integer onto the stack. The stack must have room for this element. Does not use TARG. See also PUSHi, mXPUSHi and XPUSHi.

    1. void mPUSHi(IV iv)
  • mPUSHn

    Push a double onto the stack. The stack must have room for this element. Does not use TARG. See also PUSHn, mXPUSHn and XPUSHn.

    1. void mPUSHn(NV nv)
  • mPUSHp

    Push a string onto the stack. The stack must have room for this element. The len indicates the length of the string. Does not use TARG. See also PUSHp, mXPUSHp and XPUSHp.

    1. void mPUSHp(char* str, STRLEN len)
  • mPUSHs

    Push an SV onto the stack and mortalize it. The stack must have room for this element. Does not use TARG. See also PUSHs and mXPUSHs.

    1. void mPUSHs(SV* sv)
  • mPUSHu

    Push an unsigned integer onto the stack. The stack must have room for this element. Does not use TARG. See also PUSHu, mXPUSHu and XPUSHu.

    1. void mPUSHu(UV uv)
  • mXPUSHi

    Push an integer onto the stack, extending the stack if necessary. Does not use TARG. See also XPUSHi, mPUSHi and PUSHi.

    1. void mXPUSHi(IV iv)
  • mXPUSHn

    Push a double onto the stack, extending the stack if necessary. Does not use TARG. See also XPUSHn, mPUSHn and PUSHn.

    1. void mXPUSHn(NV nv)
  • mXPUSHp

    Push a string onto the stack, extending the stack if necessary. The len indicates the length of the string. Does not use TARG. See also XPUSHp, mPUSHp and PUSHp.

    1. void mXPUSHp(char* str, STRLEN len)
  • mXPUSHs

    Push an SV onto the stack, extending the stack if necessary, and mortalize it. Does not use TARG. See also XPUSHs and mPUSHs.

    1. void mXPUSHs(SV* sv)
  • mXPUSHu

    Push an unsigned integer onto the stack, extending the stack if necessary. Does not use TARG. See also XPUSHu, mPUSHu and PUSHu.

    1. void mXPUSHu(UV uv)
  • ORIGMARK

    The original stack mark for the XSUB. See dORIGMARK.

  • POPi

    Pops an integer off the stack.

    1. IV POPi
  • POPl

    Pops a long off the stack.

    1. long POPl
  • POPn

    Pops a double off the stack.

    1. NV POPn
  • POPp

    Pops a string off the stack.

    1. char* POPp
  • POPpbytex

    Pops a string off the stack which must consist of bytes i.e. characters < 256.

    1. char* POPpbytex
  • POPpx

    Pops a string off the stack. Identical to POPp. There are two names for historical reasons.

    1. char* POPpx
  • POPs

    Pops an SV off the stack.

    1. SV* POPs
  • PUSHi

    Push an integer onto the stack. The stack must have room for this element. Handles 'set' magic. Uses TARG, so dTARGET or dXSTARG should be called to declare it. Do not call multiple TARG-oriented macros to return lists from XSUBs - see mPUSHi instead. See also XPUSHi and mXPUSHi.

    1. void PUSHi(IV iv)
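    The warning against calling several TARG-oriented macros in one XSUB can be pictured with a toy model. The sketch assumes, as the entry above implies, that a TARG-based push stores the value in one shared target and pushes that same target each time. The names toy_sv, toy_push_targ and toy_push_new are hypothetical and this is not the interpreter's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for an SV that holds a single integer. */
typedef struct { long iv; } toy_sv;

/* Like PUSHi: write the value into the one shared target and push
 * a pointer to that target. */
static void toy_push_targ(toy_sv **stack, int *sp, toy_sv *targ, long iv)
{
    targ->iv = iv;
    stack[(*sp)++] = targ;
}

/* Like mPUSHi: write the value into a fresh SV and push that.
 * The caller frees the result here; real mortal SVs are released
 * by the interpreter instead. */
static toy_sv *toy_push_new(toy_sv **stack, int *sp, long iv)
{
    toy_sv *sv = malloc(sizeof *sv);
    assert(sv != NULL);
    sv->iv = iv;
    stack[(*sp)++] = sv;
    return sv;
}
```

    Two toy_push_targ() calls with the values 1 and 2 leave both stack slots pointing at the same target, so both read back as 2. Two toy_push_new() calls keep 1 and 2 distinct, which is why the m-prefixed macros are the ones suggested for returning lists.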
  • PUSHMARK

    Opening bracket for arguments on a callback. See PUTBACK and perlcall.

    1. void PUSHMARK(SP)
  • PUSHmortal

    Push a new mortal SV onto the stack. The stack must have room for this element. Does not use TARG. See also PUSHs, XPUSHmortal and XPUSHs.

    1. void PUSHmortal()
  • PUSHn

    Push a double onto the stack. The stack must have room for this element. Handles 'set' magic. Uses TARG, so dTARGET or dXSTARG should be called to declare it. Do not call multiple TARG-oriented macros to return lists from XSUBs - see mPUSHn instead. See also XPUSHn and mXPUSHn.

    1. void PUSHn(NV nv)
  • PUSHp

    Push a string onto the stack. The stack must have room for this element. The len indicates the length of the string. Handles 'set' magic. Uses TARG, so dTARGET or dXSTARG should be called to declare it. Do not call multiple TARG-oriented macros to return lists from XSUBs - see mPUSHp instead. See also XPUSHp and mXPUSHp.

    1. void PUSHp(char* str, STRLEN len)
  • PUSHs

    Push an SV onto the stack. The stack must have room for this element. Does not handle 'set' magic. Does not use TARG. See also PUSHmortal, XPUSHs and XPUSHmortal.

    1. void PUSHs(SV* sv)
  • PUSHu

    Push an unsigned integer onto the stack. The stack must have room for this element. Handles 'set' magic. Uses TARG, so dTARGET or dXSTARG should be called to declare it. Do not call multiple TARG-oriented macros to return lists from XSUBs - see mPUSHu instead. See also XPUSHu and mXPUSHu.

    1. void PUSHu(UV uv)
  • PUTBACK

    Closing bracket for XSUB arguments. This is usually handled by xsubpp. See PUSHMARK and perlcall for other uses.

    1. PUTBACK;
  • SP

    Stack pointer. This is usually handled by xsubpp. See dSP and SPAGAIN.

  • SPAGAIN

    Refetch the stack pointer. Used after a callback. See perlcall.

    1. SPAGAIN;
  • XPUSHi

    Push an integer onto the stack, extending the stack if necessary. Handles 'set' magic. Uses TARG, so dTARGET or dXSTARG should be called to declare it. Do not call multiple TARG-oriented macros to return lists from XSUBs - see mXPUSHi instead. See also PUSHi and mPUSHi.

    1. void XPUSHi(IV iv)
  • XPUSHmortal

    Push a new mortal SV onto the stack, extending the stack if necessary. Does not use TARG. See also XPUSHs, PUSHmortal and PUSHs.

    1. void XPUSHmortal()
  • XPUSHn

    Push a double onto the stack, extending the stack if necessary. Handles 'set' magic. Uses TARG, so dTARGET or dXSTARG should be called to declare it. Do not call multiple TARG-oriented macros to return lists from XSUBs - see mXPUSHn instead. See also PUSHn and mPUSHn.

    1. void XPUSHn(NV nv)
  • XPUSHp

    Push a string onto the stack, extending the stack if necessary. The len indicates the length of the string. Handles 'set' magic. Uses TARG, so dTARGET or dXSTARG should be called to declare it. Do not call multiple TARG-oriented macros to return lists from XSUBs - see mXPUSHp instead. See also PUSHp and mPUSHp.

    1. void XPUSHp(char* str, STRLEN len)
  • XPUSHs

    Push an SV onto the stack, extending the stack if necessary. Does not handle 'set' magic. Does not use TARG. See also XPUSHmortal, PUSHs and PUSHmortal.

    1. void XPUSHs(SV* sv)
  • XPUSHu

    Push an unsigned integer onto the stack, extending the stack if necessary. Handles 'set' magic. Uses TARG, so dTARGET or dXSTARG should be called to declare it. Do not call multiple TARG-oriented macros to return lists from XSUBs - see mXPUSHu instead. See also PUSHu and mPUSHu.

    1. void XPUSHu(UV uv)
  • XSRETURN

    Return from XSUB, indicating number of items on the stack. This is usually handled by xsubpp.

    1. void XSRETURN(int nitems)
  • XSRETURN_EMPTY

    Return an empty list from an XSUB immediately.

    1. XSRETURN_EMPTY;
  • XSRETURN_IV

    Return an integer from an XSUB immediately. Uses XST_mIV.

    1. void XSRETURN_IV(IV iv)
  • XSRETURN_NO

    Return &PL_sv_no from an XSUB immediately. Uses XST_mNO.

    1. XSRETURN_NO;
  • XSRETURN_NV

    Return a double from an XSUB immediately. Uses XST_mNV.

    1. void XSRETURN_NV(NV nv)
  • XSRETURN_PV

    Return a copy of a string from an XSUB immediately. Uses XST_mPV.

    1. void XSRETURN_PV(char* str)
  • XSRETURN_UNDEF

    Return &PL_sv_undef from an XSUB immediately. Uses XST_mUNDEF.

    1. XSRETURN_UNDEF;
  • XSRETURN_UV

    Return an unsigned integer from an XSUB immediately. Uses XST_mUV.

    1. void XSRETURN_UV(IV uv)
  • XSRETURN_YES

    Return &PL_sv_yes from an XSUB immediately. Uses XST_mYES.

    1. XSRETURN_YES;
  • XST_mIV

    Place an integer into the specified position pos on the stack. The value is stored in a new mortal SV.

    1. void XST_mIV(int pos, IV iv)
  • XST_mNO

    Place &PL_sv_no into the specified position pos on the stack.

    1. void XST_mNO(int pos)
  • XST_mNV

    Place a double into the specified position pos on the stack. The value is stored in a new mortal SV.

    1. void XST_mNV(int pos, NV nv)
  • XST_mPV

    Place a copy of a string into the specified position pos on the stack. The value is stored in a new mortal SV.

    1. void XST_mPV(int pos, char* str)
  • XST_mUNDEF

    Place &PL_sv_undef into the specified position pos on the stack.

    1. void XST_mUNDEF(int pos)
  • XST_mYES

    Place &PL_sv_yes into the specified position pos on the stack.

    1. void XST_mYES(int pos)

SV Flags

  • svtype

    An enum of flags for Perl types. These are found in the file sv.h in the svtype enum. Test these flags with the SvTYPE macro.

    The types are:

    SVt_NULL
    SVt_BIND (unused)
    SVt_IV
    SVt_NV
    SVt_RV
    SVt_PV
    SVt_PVIV
    SVt_PVNV
    SVt_PVMG
    SVt_REGEXP
    SVt_PVGV
    SVt_PVLV
    SVt_PVAV
    SVt_PVHV
    SVt_PVCV
    SVt_PVFM
    SVt_PVIO

    These are most easily explained from the bottom up.

    SVt_PVIO is for I/O objects, SVt_PVFM for formats, SVt_PVCV for subroutines, SVt_PVHV for hashes and SVt_PVAV for arrays.

    All the others are scalar types, that is, things that can be bound to a $ variable. For these, the internal types are mostly orthogonal to types in the Perl language.

    Hence, checking SvTYPE(sv) < SVt_PVAV is the best way to see whether something is a scalar.

    SVt_PVGV represents a typeglob. If !SvFAKE(sv), then it is a real, incoercible typeglob. If SvFAKE(sv), then it is a scalar to which a typeglob has been assigned. Assigning to it again will stop it from being a typeglob. SVt_PVLV represents a scalar that delegates to another scalar behind the scenes. It is used, e.g., for the return value of substr and for tied hash and array elements. It can hold any scalar value, including a typeglob. SVt_REGEXP is for regular expressions.

    SVt_PVMG represents a "normal" scalar (not a typeglob, regular expression, or delegate). Since most scalars do not need all the internal fields of a PVMG, we save memory by allocating smaller structs when possible. All the other types are just simpler forms of SVt_PVMG, with fewer internal fields. SVt_NULL can only hold undef. SVt_IV can hold undef, an integer, or a reference. (SVt_RV is an alias for SVt_IV, which exists for backward compatibility.) SVt_NV can hold any of those or a double. SVt_PV can only hold undef or a string. SVt_PVIV is a superset of SVt_PV and SVt_IV. SVt_PVNV is similar. SVt_PVMG can hold anything SVt_PVNV can hold, but it can, but does not have to, be blessed or magical.
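    The ordering test SvTYPE(sv) < SVt_PVAV mentioned above relies only on the position of the type in the enum. A stand-alone sketch follows; the TOY_ names are hypothetical, the numeric values are illustrative only, and SVt_RV is left out because the text describes it as an alias for SVt_IV.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy copy of the ordering listed above; values are not Perl's. */
typedef enum {
    TOY_NULL, TOY_BIND, TOY_IV, TOY_NV, TOY_PV, TOY_PVIV, TOY_PVNV,
    TOY_PVMG, TOY_REGEXP, TOY_PVGV, TOY_PVLV,
    TOY_PVAV, TOY_PVHV, TOY_PVCV, TOY_PVFM, TOY_PVIO
} toy_svtype;

/* Everything ordered before the array type counts as a scalar. */
static bool toy_is_scalar(toy_svtype t)
{
    assert(t >= TOY_NULL && t <= TOY_PVIO);
    return t < TOY_PVAV;
}
```

    Because the aggregate and code types sit at the end of the enum, one comparison separates them from every scalar type, including typeglobs and PVLV delegates.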

  • SVt_IV

    Type flag for scalars. See svtype.

  • SVt_NULL

    Type flag for scalars. See svtype.

  • SVt_NV

    Type flag for scalars. See svtype.

  • SVt_PV

    Type flag for scalars. See svtype.

  • SVt_PVAV

    Type flag for arrays. See svtype.

  • SVt_PVCV

    Type flag for subroutines. See svtype.

  • SVt_PVFM

    Type flag for formats. See svtype.

  • SVt_PVGV

    Type flag for typeglobs. See svtype.

  • SVt_PVHV

    Type flag for hashes. See svtype.

  • SVt_PVIO

    Type flag for I/O objects. See svtype.

  • SVt_PVIV

    Type flag for scalars. See svtype.

  • SVt_PVLV

    Type flag for scalars. See svtype.

  • SVt_PVMG

    Type flag for scalars. See svtype.

  • SVt_PVNV

    Type flag for scalars. See svtype.

  • SVt_REGEXP

    Type flag for regular expressions. See svtype.

SV Manipulation Functions

  • boolSV

    Returns a true SV if b is a true value, or a false SV if b is 0.

    See also PL_sv_yes and PL_sv_no.

    1. SV * boolSV(bool b)
  • croak_xs_usage

    A specialised variant of croak() for emitting the usage message for xsubs.

    croak_xs_usage(cv, "eee_yow");

    works out the package name and subroutine name from cv, and then calls croak(). Hence if cv is &ouch::awk, it would call croak as:

    Perl_croak(aTHX_ "Usage: %"SVf"::%"SVf"(%s)", "ouch", "awk", "eee_yow");

    void croak_xs_usage(const CV *const cv,
                        const char *const params)
  • get_sv

    Returns the SV of the specified Perl scalar. flags are passed to gv_fetchpv. If GV_ADD is set and the Perl variable does not exist then it will be created. If flags is zero and the variable does not exist then NULL is returned.

    NOTE: the perl_ form of this function is deprecated.

    1. SV* get_sv(const char *name, I32 flags)
  • newRV_inc

    Creates an RV wrapper for an SV. The reference count for the original SV is incremented.

    1. SV* newRV_inc(SV* sv)
  • newSVpadname

    Creates a new SV containing the pad name. This is currently identical to newSVsv, but pad names may cease being SVs at some point, so newSVpadname is preferable.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV* newSVpadname(PADNAME *pn)
  • newSVpvn_utf8

    Creates a new SV and copies a string into it. If utf8 is true, calls SvUTF8_on on the new SV. Implemented as a wrapper around newSVpvn_flags.

    SV* newSVpvn_utf8(NULLOK const char* s, STRLEN len,
                      U32 utf8)
  • SvCUR

    Returns the length of the string which is in the SV. See SvLEN.

    1. STRLEN SvCUR(SV* sv)
  • SvCUR_set

    Set the current length of the string which is in the SV. See SvCUR and SvIV_set.

    1. void SvCUR_set(SV* sv, STRLEN len)
  • SvEND

    Returns a pointer to the spot just after the last character in the string which is in the SV, where there is usually a trailing null (even though Perl scalars do not strictly require it). See SvCUR. Access the character as *(SvEND(sv)).

    Warning: If SvCUR is equal to SvLEN, then SvEND points to unallocated memory.

    1. char* SvEND(SV* sv)
  • SvGAMAGIC

    Returns true if the SV has get magic or overloading. If either is true then the scalar is active data, and has the potential to return a new value every time it is accessed. Hence you must be careful to only read it once per user logical operation and work with that returned value. If neither is true then the scalar's value cannot change unless written to.

    1. U32 SvGAMAGIC(SV* sv)
  • SvGROW

    Expands the character buffer in the SV so that it has room for the indicated number of bytes (remember to reserve space for an extra trailing NUL character). Calls sv_grow to perform the expansion if necessary. Returns a pointer to the character buffer. SV must be of type >= SVt_PV. One alternative is to call sv_grow if you are not sure of the type of SV.

    1. char * SvGROW(SV* sv, STRLEN len)
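    How the used length, the allocated length and the end pointer relate (SvCUR, SvLEN and SvEND in the entries of this section) can be sketched with a toy buffer. The names toy_str, toy_end, toy_end_is_readable and toy_grow are hypothetical, and the model ignores magic, UTF-8 and copy-on-write entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *pv;   /* start of the string, like SvPVX   */
    size_t cur;  /* bytes in use, like SvCUR          */
    size_t len;  /* bytes allocated, like SvLEN       */
} toy_str;

/* Like SvEND: one past the last byte in use. */
static char *toy_end(const toy_str *s) { return s->pv + s->cur; }

/* True only when the byte at toy_end() lies inside the allocation. */
static bool toy_end_is_readable(const toy_str *s) { return s->cur < s->len; }

/* Like SvGROW: make sure at least newlen bytes are allocated and
 * return the (possibly moved) buffer. */
static char *toy_grow(toy_str *s, size_t newlen)
{
    if (newlen > s->len) {
        char *p = realloc(s->pv, newlen);
        assert(p != NULL);
        s->pv  = p;
        s->len = newlen;
    }
    return s->pv;
}
```

    Growing to cur + 1 before writing the trailing NUL at toy_end() is the model's version of the advice to reserve space for an extra trailing NUL character.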
  • SvIOK

    Returns a U32 value indicating whether the SV contains an integer.

    1. U32 SvIOK(SV* sv)
  • SvIOKp

    Returns a U32 value indicating whether the SV contains an integer. Checks the private setting. Use SvIOK instead.

    1. U32 SvIOKp(SV* sv)
  • SvIOK_notUV

    Returns a boolean indicating whether the SV contains a signed integer.

    1. bool SvIOK_notUV(SV* sv)
  • SvIOK_off

    Unsets the IV status of an SV.

    1. void SvIOK_off(SV* sv)
  • SvIOK_on

    Tells an SV that it is an integer.

    1. void SvIOK_on(SV* sv)
  • SvIOK_only

    Tells an SV that it is an integer and disables all other OK bits.

    1. void SvIOK_only(SV* sv)
  • SvIOK_only_UV

    Tells an SV that it is an unsigned integer and disables all other OK bits.

    1. void SvIOK_only_UV(SV* sv)
  • SvIOK_UV

    Returns a boolean indicating whether the SV contains an integer that must be interpreted as unsigned. A non-negative integer whose value is within the range of both an IV and a UV may be flagged as either SvUOK or SvIOK.

    1. bool SvIOK_UV(SV* sv)
  • SvIsCOW

    Returns a boolean indicating whether the SV is Copy-On-Write (either shared hash key scalars, or full Copy On Write scalars if 5.9.0 is configured for COW).

    1. bool SvIsCOW(SV* sv)
  • SvIsCOW_shared_hash

    Returns a boolean indicating whether the SV is Copy-On-Write shared hash key scalar.

    1. bool SvIsCOW_shared_hash(SV* sv)
  • SvIV

    Coerces the given SV to an integer and returns it. See SvIVx for a version which guarantees to evaluate sv only once.

    1. IV SvIV(SV* sv)
  • SvIVX

    Returns the raw value in the SV's IV slot, without checks or conversions. Only use when you are sure SvIOK is true. See also SvIV().

    1. IV SvIVX(SV* sv)
  • SvIVx

    Coerces the given SV to an integer and returns it. Guarantees to evaluate sv only once. Only use this if sv is an expression with side effects, otherwise use the more efficient SvIV.

    1. IV SvIVx(SV* sv)
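    Why a separate evaluate-once form exists can be shown with a toy macro. The sketch assumes the plain form is a macro that mentions its argument more than once, which is what the wording above suggests. TOY_IV, toy_ivx, toy_2iv and toy_sv are hypothetical names, not Perl's definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy scalar: iv is valid only when iok is set, otherwise pv is parsed. */
typedef struct { int iok; long iv; const char *pv; } toy_sv;

/* Slow path: convert the string and cache the result. */
static long toy_2iv(toy_sv *sv)
{
    assert(sv->pv != NULL);
    sv->iv  = strtol(sv->pv, NULL, 10);
    sv->iok = 1;
    return sv->iv;
}

/* Like SvIV in this model: the argument appears twice on each path,
 * so an argument with side effects is evaluated twice. */
#define TOY_IV(sv) ((sv)->iok ? (sv)->iv : toy_2iv(sv))

/* Like SvIVx in this model: a function evaluates its argument once. */
static long toy_ivx(toy_sv *sv) { return TOY_IV(sv); }
```

    With an array holding 10, 20 and 30 and a pointer p at its start, TOY_IV(p++) advances p twice and yields 20, while toy_ivx(p++) advances p once and yields 10.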
  • SvIV_nomg

    Like SvIV but doesn't process magic.

    1. IV SvIV_nomg(SV* sv)
  • SvIV_set

    Set the value of the IV pointer in sv to val. It is possible to perform the same function of this macro with an lvalue assignment to SvIVX. With future Perls, however, it will be more efficient to use SvIV_set instead of the lvalue assignment to SvIVX.

    1. void SvIV_set(SV* sv, IV val)
  • SvLEN

    Returns the size of the string buffer in the SV, not including any part attributable to SvOOK. See SvCUR.

    1. STRLEN SvLEN(SV* sv)
  • SvLEN_set

    Set the size of the string buffer which is in the SV. See SvLEN and SvIV_set.

    1. void SvLEN_set(SV* sv, STRLEN len)
  • SvMAGIC_set

    Set the value of the MAGIC pointer in sv to val. See SvIV_set.

    1. void SvMAGIC_set(SV* sv, MAGIC* val)
  • SvNIOK

    Returns a U32 value indicating whether the SV contains a number, integer or double.

    1. U32 SvNIOK(SV* sv)
  • SvNIOKp

    Returns a U32 value indicating whether the SV contains a number, integer or double. Checks the private setting. Use SvNIOK instead.

    1. U32 SvNIOKp(SV* sv)
  • SvNIOK_off

    Unsets the NV/IV status of an SV.

    1. void SvNIOK_off(SV* sv)
  • SvNOK

    Returns a U32 value indicating whether the SV contains a double.

    1. U32 SvNOK(SV* sv)
  • SvNOKp

    Returns a U32 value indicating whether the SV contains a double. Checks the private setting. Use SvNOK instead.

    1. U32 SvNOKp(SV* sv)
  • SvNOK_off

    Unsets the NV status of an SV.

    1. void SvNOK_off(SV* sv)
  • SvNOK_on

    Tells an SV that it is a double.

    1. void SvNOK_on(SV* sv)
  • SvNOK_only

    Tells an SV that it is a double and disables all other OK bits.

    1. void SvNOK_only(SV* sv)
  • SvNV

    Coerce the given SV to a double and return it. See SvNVx for a version which guarantees to evaluate sv only once.

    1. NV SvNV(SV* sv)
  • SvNVX

    Returns the raw value in the SV's NV slot, without checks or conversions. Only use when you are sure SvNOK is true. See also SvNV().

    1. NV SvNVX(SV* sv)
  • SvNVx

    Coerces the given SV to a double and returns it. Guarantees to evaluate sv only once. Only use this if sv is an expression with side effects, otherwise use the more efficient SvNV.

    1. NV SvNVx(SV* sv)
  • SvNV_nomg

    Like SvNV but doesn't process magic.

    1. NV SvNV_nomg(SV* sv)
  • SvNV_set

    Set the value of the NV pointer in sv to val. See SvIV_set.

    1. void SvNV_set(SV* sv, NV val)
  • SvOK

    Returns a U32 value indicating whether the value is defined. This is only meaningful for scalars.

    1. U32 SvOK(SV* sv)
  • SvOOK

    Returns a U32 indicating whether the pointer to the string buffer is offset. This hack is used internally to speed up removal of characters from the beginning of a SvPV. When SvOOK is true, then the start of the allocated string buffer is actually SvOOK_offset() bytes before SvPVX. This offset used to be stored in SvIVX, but is now stored within the spare part of the buffer.

    1. U32 SvOOK(SV* sv)
  • SvOOK_offset

    Reads into len the offset from SvPVX back to the true start of the allocated buffer, which will be non-zero if sv_chop has been used to efficiently remove characters from the start of the buffer. Implemented as a macro, which takes the address of len, which must be of type STRLEN. Evaluates sv more than once. Sets len to 0 if SvOOK(sv) is false.

    1. void SvOOK_offset(NN SV*sv, STRLEN len)
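    The offset trick behind SvOOK can be pictured with a toy buffer whose start pointer is advanced instead of its bytes being moved. This is a simplified model: toy_str, toy_chop and toy_alloc_start are hypothetical names, and the offset lives in a struct field here, whereas the text above says Perl stores it in the spare part of the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *pv;      /* current start of the string, like SvPVX        */
    size_t cur;     /* bytes in use, like SvCUR                       */
    size_t len;     /* usable size counted from pv, like SvLEN        */
    size_t offset;  /* bytes chopped from the front; nonzero ~ SvOOK  */
} toy_str;

/* Like sv_chop: drop n leading bytes by moving the pointer only. */
static void toy_chop(toy_str *s, size_t n)
{
    assert(n <= s->cur);
    s->pv     += n;
    s->cur    -= n;
    s->len    -= n;
    s->offset += n;
}

/* The true start of the allocation lies offset bytes before pv. */
static char *toy_alloc_start(const toy_str *s) { return s->pv - s->offset; }
```

    Any code that frees or reallocates the buffer has to use toy_alloc_start() rather than pv, which mirrors the statement that the allocated buffer really starts SvOOK_offset() bytes before SvPVX.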
  • SvPOK

    Returns a U32 value indicating whether the SV contains a character string.

    1. U32 SvPOK(SV* sv)
  • SvPOKp

    Returns a U32 value indicating whether the SV contains a character string. Checks the private setting. Use SvPOK instead.

    1. U32 SvPOKp(SV* sv)
  • SvPOK_off

    Unsets the PV status of an SV.

    1. void SvPOK_off(SV* sv)
  • SvPOK_on

    Tells an SV that it is a string.

    1. void SvPOK_on(SV* sv)
  • SvPOK_only

    Tells an SV that it is a string and disables all other OK bits. Will also turn off the UTF-8 status.

    1. void SvPOK_only(SV* sv)
  • SvPOK_only_UTF8

    Tells an SV that it is a string and disables all other OK bits, and leaves the UTF-8 status as it was.

    1. void SvPOK_only_UTF8(SV* sv)
  • SvPV

    Returns a pointer to the string in the SV, or a stringified form of the SV if the SV does not contain a string. The SV may cache the stringified version becoming SvPOK. Handles 'get' magic. See also SvPVx for a version which guarantees to evaluate sv only once.

    Note that there is no guarantee that the return value of SvPV() is equal to SvPVX(sv), or that SvPVX(sv) contains valid data, or that successive calls to SvPV(sv) will return the same pointer value each time. This is due to the way that things like overloading and Copy-On-Write are handled. In these cases, the return value may point to a temporary buffer or similar. If you absolutely need the SvPVX field to be valid (for example, if you intend to write to it), then see SvPV_force.

    1. char* SvPV(SV* sv, STRLEN len)
  • SvPVbyte

    Like SvPV , but converts sv to byte representation first if necessary.

    1. char* SvPVbyte(SV* sv, STRLEN len)
  • SvPVbytex

    Like SvPV , but converts sv to byte representation first if necessary. Guarantees to evaluate sv only once; use the more efficient SvPVbyte otherwise.

    1. char* SvPVbytex(SV* sv, STRLEN len)
  • SvPVbytex_force

    Like SvPV_force , but converts sv to byte representation first if necessary. Guarantees to evaluate sv only once; use the more efficient SvPVbyte_force otherwise.

    1. char* SvPVbytex_force(SV* sv, STRLEN len)
  • SvPVbyte_force

    Like SvPV_force , but converts sv to byte representation first if necessary.

    1. char* SvPVbyte_force(SV* sv, STRLEN len)
  • SvPVbyte_nolen

    Like SvPV_nolen , but converts sv to byte representation first if necessary.

    1. char* SvPVbyte_nolen(SV* sv)
  • SvPVutf8

    Like SvPV , but converts sv to utf8 first if necessary.

    1. char* SvPVutf8(SV* sv, STRLEN len)
  • SvPVutf8x

    Like SvPV , but converts sv to utf8 first if necessary. Guarantees to evaluate sv only once; use the more efficient SvPVutf8 otherwise.

    1. char* SvPVutf8x(SV* sv, STRLEN len)
  • SvPVutf8x_force

    Like SvPV_force , but converts sv to utf8 first if necessary. Guarantees to evaluate sv only once; use the more efficient SvPVutf8_force otherwise.

    1. char* SvPVutf8x_force(SV* sv, STRLEN len)
  • SvPVutf8_force

    Like SvPV_force , but converts sv to utf8 first if necessary.

    1. char* SvPVutf8_force(SV* sv, STRLEN len)
  • SvPVutf8_nolen

    Like SvPV_nolen , but converts sv to utf8 first if necessary.

    1. char* SvPVutf8_nolen(SV* sv)
  • SvPVX

    Returns a pointer to the physical string in the SV. The SV must contain a string. Prior to 5.9.3 it is not safe to execute this macro unless the SV's type >= SVt_PV.

    This is also used to store the name of an autoloaded subroutine in an XS AUTOLOAD routine. See Autoloading with XSUBs in perlguts.

    1. char* SvPVX(SV* sv)
  • SvPVx

    A version of SvPV which guarantees to evaluate sv only once. Only use this if sv is an expression with side effects, otherwise use the more efficient SvPV .

    1. char* SvPVx(SV* sv, STRLEN len)
  • SvPV_force

    Like SvPV but will force the SV into containing a string (SvPOK), and only a string (SvPOK_only), by hook or by crook. You need the force variant if you are going to update the SvPVX buffer directly. Processes get magic.

    Note that coercing an arbitrary scalar into a plain PV will potentially strip useful data from it. For example if the SV was SvROK , then the referent will have its reference count decremented, and the SV itself may be converted to an SvPOK scalar with a string buffer containing a value such as "ARRAY(0x1234)" .

    1. char* SvPV_force(SV* sv, STRLEN len)
  • SvPV_force_nomg

    Like SvPV_force , but doesn't process get magic.

    1. char* SvPV_force_nomg(SV* sv, STRLEN len)
  • SvPV_nolen

    Like SvPV but doesn't set a length variable.

    1. char* SvPV_nolen(SV* sv)
  • SvPV_nomg

    Like SvPV but doesn't process magic.

    1. char* SvPV_nomg(SV* sv, STRLEN len)
  • SvPV_nomg_nolen

    Like SvPV_nolen but doesn't process magic.

    1. char* SvPV_nomg_nolen(SV* sv)
  • SvPV_set

    Set the value of the PV pointer in sv to val. See also SvIV_set .

    Beware that the existing pointer may be involved in copy-on-write or other mischief, so do SvOOK_off(sv) and use sv_force_normal or SvPV_force (or check the SvIsCOW flag) first to make sure this modification is safe.

    1. void SvPV_set(SV* sv, char* val)
  • SvREFCNT

    Returns the value of the object's reference count.

    1. U32 SvREFCNT(SV* sv)
  • SvREFCNT_dec

    Decrements the reference count of the given SV. sv may be NULL.

    1. void SvREFCNT_dec(SV* sv)
  • SvREFCNT_dec_NN

    Same as SvREFCNT_dec, but can only be used if you know sv is not NULL. Since we don't have to check the NULLness, it's faster and smaller.

    1. void SvREFCNT_dec_NN(SV* sv)
  • SvREFCNT_inc

    Increments the reference count of the given SV, returning the SV.

    All of the following SvREFCNT_inc* macros are optimized versions of SvREFCNT_inc, and can be replaced with SvREFCNT_inc.

    1. SV* SvREFCNT_inc(SV* sv)
  • SvREFCNT_inc_NN

    Same as SvREFCNT_inc, but can only be used if you know sv is not NULL. Since we don't have to check the NULLness, it's faster and smaller.

    1. SV* SvREFCNT_inc_NN(SV* sv)
  • SvREFCNT_inc_simple

    Same as SvREFCNT_inc, but can only be used with expressions without side effects. Since we don't have to store a temporary value, it's faster.

    1. SV* SvREFCNT_inc_simple(SV* sv)
  • SvREFCNT_inc_simple_NN

    Same as SvREFCNT_inc_simple, but can only be used if you know sv is not NULL. Since we don't have to check the NULLness, it's faster and smaller.

    1. SV* SvREFCNT_inc_simple_NN(SV* sv)
  • SvREFCNT_inc_simple_void

    Same as SvREFCNT_inc_simple, but can only be used if you don't need the return value. The macro doesn't need to return a meaningful value.

    1. void SvREFCNT_inc_simple_void(SV* sv)
  • SvREFCNT_inc_simple_void_NN

    Same as SvREFCNT_inc, but can only be used if you don't need the return value, and you know that sv is not NULL. The macro doesn't need to return a meaningful value, or check for NULLness, so it's smaller and faster.

    1. void SvREFCNT_inc_simple_void_NN(SV* sv)
  • SvREFCNT_inc_void

    Same as SvREFCNT_inc, but can only be used if you don't need the return value. The macro doesn't need to return a meaningful value.

    1. void SvREFCNT_inc_void(SV* sv)
  • SvREFCNT_inc_void_NN

    Same as SvREFCNT_inc, but can only be used if you don't need the return value, and you know that sv is not NULL. The macro doesn't need to return a meaningful value, or check for NULLness, so it's smaller and faster.

    1. void SvREFCNT_inc_void_NN(SV* sv)
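    The difference between the NULL-tolerant and the _NN forms of the reference-count macros above can be illustrated with a toy model in plain C. The names toy_obj, toy_new, toy_inc, toy_inc_nn, toy_dec and toy_live are hypothetical and exist only for this sketch; it does not reproduce Perl's implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object, NOT Perl's SV. */
typedef struct {
    unsigned refcnt;
    int      value;
} toy_obj;

static int toy_live = 0;  /* number of objects currently allocated */

/* New objects start with a reference count of 1, owned by the caller. */
static toy_obj *toy_new(int value) {
    toy_obj *o = malloc(sizeof *o);
    if (!o)
        return NULL;
    o->refcnt = 1;
    o->value  = value;
    toy_live++;
    return o;
}

/* General form: tolerates NULL and returns its argument. */
static toy_obj *toy_inc(toy_obj *o) {
    if (o)
        o->refcnt++;
    return o;
}

/* _NN-style form: the caller guarantees non-NULL, so there is no
 * run-time branch (the assert only fires in debug builds). */
static toy_obj *toy_inc_nn(toy_obj *o) {
    assert(o != NULL);
    o->refcnt++;
    return o;
}

/* Tolerates NULL; frees the object when the count reaches zero. */
static void toy_dec(toy_obj *o) {
    if (!o)
        return;
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        free(o);
        toy_live--;
    }
}
```

    In this model every toy_inc or toy_inc_nn must be balanced by one toy_dec, and the initial count of 1 from toy_new is released by a final toy_dec.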
  • SvROK

    Tests if the SV is an RV.

    1. U32 SvROK(SV* sv)
  • SvROK_off

    Unsets the RV status of an SV.

    1. void SvROK_off(SV* sv)
  • SvROK_on

    Tells an SV that it is an RV.

    1. void SvROK_on(SV* sv)
  • SvRV

    Dereferences an RV to return the SV.

    1. SV* SvRV(SV* sv)
  • SvRV_set

    Set the value of the RV pointer in sv to val. See SvIV_set .

    1. void SvRV_set(SV* sv, SV* val)
  • SvSTASH

    Returns the stash of the SV.

    1. HV* SvSTASH(SV* sv)
  • SvSTASH_set

    Set the value of the STASH pointer in sv to val. See SvIV_set .

    1. void SvSTASH_set(SV* sv, HV* val)
  • SvTAINT

    Taints an SV if tainting is enabled, and if some input to the current expression is tainted--usually a variable, but possibly also implicit inputs such as locale settings. SvTAINT propagates that taintedness to the outputs of an expression in a pessimistic fashion; i.e., without paying attention to precisely which outputs are influenced by which inputs.

    1. void SvTAINT(SV* sv)
  • SvTAINTED

    Checks to see if an SV is tainted. Returns TRUE if it is, FALSE if not.

    1. bool SvTAINTED(SV* sv)
  • SvTAINTED_off

    Untaints an SV. Be very careful with this routine, as it short-circuits some of Perl's fundamental security features. XS module authors should not use this function unless they fully understand all the implications of unconditionally untainting the value. Untainting should be done in the standard perl fashion, via a carefully crafted regexp, rather than directly untainting variables.

    1. void SvTAINTED_off(SV* sv)
  • SvTAINTED_on

    Marks an SV as tainted if tainting is enabled.

    1. void SvTAINTED_on(SV* sv)
  • SvTRUE

    Returns a boolean indicating whether Perl would evaluate the SV as true or false. See SvOK() for a defined/undefined test. Handles 'get' magic unless the scalar is already SvPOK, SvIOK or SvNOK (the public, not the private flags).

    1. bool SvTRUE(SV* sv)
  • SvTRUE_nomg

    Returns a boolean indicating whether Perl would evaluate the SV as true or false. See SvOK() for a defined/undefined test. Does not handle 'get' magic.

    1. bool SvTRUE_nomg(SV* sv)
  • SvTYPE

    Returns the type of the SV. See svtype .

    1. svtype SvTYPE(SV* sv)
  • SvUOK

    Returns a boolean indicating whether the SV contains an integer that must be interpreted as unsigned. A non-negative integer whose value is within the range of both an IV and a UV may be flagged as either SvUOK or SvIOK.

    1. bool SvUOK(SV* sv)
  • SvUPGRADE

    Used to upgrade an SV to a more complex form. Uses sv_upgrade to perform the upgrade if necessary. See svtype .

    1. void SvUPGRADE(SV* sv, svtype type)
  • SvUTF8

    Returns a U32 value indicating the UTF-8 status of an SV. If things are set up properly, this indicates whether or not the SV contains UTF-8 encoded data. You should use this after a call to SvPV() or one of its variants, in case any call to string overloading updates the internal flag.

    1. U32 SvUTF8(SV* sv)
  • SvUTF8_off

    Unsets the UTF-8 status of an SV (the data is not changed, just the flag). Do not use frivolously.

    1. void SvUTF8_off(SV *sv)
  • SvUTF8_on

    Turn on the UTF-8 status of an SV (the data is not changed, just the flag). Do not use frivolously.

    1. void SvUTF8_on(SV *sv)
  • SvUV

    Coerces the given SV to an unsigned integer and returns it. See SvUVx for a version which guarantees to evaluate sv only once.

    1. UV SvUV(SV* sv)
  • SvUVX

    Returns the raw value in the SV's UV slot, without checks or conversions. Only use when you are sure SvIOK is true. See also SvUV() .

    1. UV SvUVX(SV* sv)
  • SvUVx

    Coerces the given SV to an unsigned integer and returns it. Guarantees to evaluate sv only once. Only use this if sv is an expression with side effects, otherwise use the more efficient SvUV .

    1. UV SvUVx(SV* sv)
  • SvUV_nomg

    Like SvUV but doesn't process magic.

    1. UV SvUV_nomg(SV* sv)
  • SvUV_set

    Set the value of the UV slot in sv to val. See SvIV_set.

    1. void SvUV_set(SV* sv, UV val)
  • SvVOK

    Returns a boolean indicating whether the SV contains a v-string.

    1. bool SvVOK(SV* sv)
  • sv_catpvn_nomg

    Like sv_catpvn but doesn't process magic.

    1. void sv_catpvn_nomg(SV* sv, const char* ptr,
    2. STRLEN len)
  • sv_catpv_nomg

    Like sv_catpv but doesn't process magic.

    1. void sv_catpv_nomg(SV* sv, const char* ptr)
  • sv_catsv_nomg

    Like sv_catsv but doesn't process magic.

    1. void sv_catsv_nomg(SV* dsv, SV* ssv)
  • sv_derived_from

    Exactly like sv_derived_from_pv, but doesn't take a flags parameter.

    1. bool sv_derived_from(SV* sv, const char *const name)
  • sv_derived_from_pv

    Exactly like sv_derived_from_pvn, but takes a nul-terminated string instead of a string/length pair.

    1. bool sv_derived_from_pv(SV* sv,
    2. const char *const name,
    3. U32 flags)
  • sv_derived_from_pvn

    Returns a boolean indicating whether the SV is derived from the specified class at the C level. To check derivation at the Perl level, call isa() as a normal Perl method.

    Currently, the only significant value for flags is SVf_UTF8.

    1. bool sv_derived_from_pvn(SV* sv,
    2. const char *const name,
    3. const STRLEN len, U32 flags)
  • sv_derived_from_sv

    Exactly like sv_derived_from_pvn, but takes the name string in the form of an SV instead of a string/length pair.

    1. bool sv_derived_from_sv(SV* sv, SV *namesv,
    2. U32 flags)
  • sv_does

    Like sv_does_pv, but doesn't take a flags parameter.

    1. bool sv_does(SV* sv, const char *const name)
  • sv_does_pv

    Like sv_does_sv, but takes a nul-terminated string instead of an SV.

    1. bool sv_does_pv(SV* sv, const char *const name,
    2. U32 flags)
  • sv_does_pvn

    Like sv_does_sv, but takes a string/length pair instead of an SV.

    1. bool sv_does_pvn(SV* sv, const char *const name,
    2. const STRLEN len, U32 flags)
  • sv_does_sv

    Returns a boolean indicating whether the SV performs a specific, named role. The SV can be a Perl object or the name of a Perl class.

    1. bool sv_does_sv(SV* sv, SV* namesv, U32 flags)
  • sv_report_used

    Dump the contents of all SVs not yet freed (debugging aid).

    1. void sv_report_used()
  • sv_setsv_nomg

    Like sv_setsv but doesn't process magic.

    1. void sv_setsv_nomg(SV* dsv, SV* ssv)
  • sv_utf8_upgrade_nomg

    Like sv_utf8_upgrade, but doesn't do magic on sv .

    1. STRLEN sv_utf8_upgrade_nomg(NN SV *sv)

SV-Body Allocation

  • looks_like_number

    Test if the content of an SV looks like a number (or is a number). Inf and Infinity are treated as numbers (so will not issue a non-numeric warning), even if your atof() doesn't grok them. Get-magic is ignored.

    1. I32 looks_like_number(SV *const sv)
  • newRV_noinc

    Creates an RV wrapper for an SV. The reference count for the original SV is not incremented.

    1. SV* newRV_noinc(SV *const sv)
  • newSV

    Creates a new SV. A non-zero len parameter indicates the number of bytes of preallocated string space the SV should have. An extra byte for a trailing NUL is also reserved. (SvPOK is not set for the SV even if string space is allocated.) The reference count for the new SV is set to 1.

    In 5.9.3, newSV() replaces the older NEWSV() API, and drops the first parameter, x, a debug aid which allowed callers to identify themselves. This aid has been superseded by a new build option, PERL_MEM_LOG (see PERL_MEM_LOG in perlhacktips). The older API is still there for use in XS modules supporting older perls.

    1. SV* newSV(const STRLEN len)
  • newSVhek

    Creates a new SV from the hash key structure. It will generate scalars that point to the shared string table where possible. Returns a new (undefined) SV if the hek is NULL.

    1. SV* newSVhek(const HEK *const hek)
  • newSViv

    Creates a new SV and copies an integer into it. The reference count for the SV is set to 1.

    1. SV* newSViv(const IV i)
  • newSVnv

    Creates a new SV and copies a floating point value into it. The reference count for the SV is set to 1.

    1. SV* newSVnv(const NV n)
  • newSVpv

    Creates a new SV and copies a string into it. The reference count for the SV is set to 1. If len is zero, Perl will compute the length using strlen(). For efficiency, consider using newSVpvn instead.

    1. SV* newSVpv(const char *const s, const STRLEN len)
  • newSVpvf

    Creates a new SV and initializes it with the string formatted like sprintf.

    1. SV* newSVpvf(const char *const pat, ...)
  • newSVpvn

    Creates a new SV and copies a buffer into it, which may contain NUL characters (\0) and other binary data. The reference count for the SV is set to 1. Note that if len is zero, Perl will create a zero length (Perl) string. You are responsible for ensuring that the source buffer is at least len bytes long. If the buffer argument is NULL the new SV will be undefined.

    1. SV* newSVpvn(const char *const s, const STRLEN len)
  • newSVpvn_flags

    Creates a new SV and copies a string into it. The reference count for the SV is set to 1. Note that if len is zero, Perl will create a zero length string. You are responsible for ensuring that the source string is at least len bytes long. If the s argument is NULL the new SV will be undefined. Currently the only flag bits accepted are SVf_UTF8 and SVs_TEMP . If SVs_TEMP is set, then sv_2mortal() is called on the result before returning. If SVf_UTF8 is set, s is considered to be in UTF-8 and the SVf_UTF8 flag will be set on the new SV. newSVpvn_utf8() is a convenience wrapper for this function, defined as

    1. #define newSVpvn_utf8(s, len, u) \
    2. newSVpvn_flags((s), (len), (u) ? SVf_UTF8 : 0)

    1. SV* newSVpvn_flags(const char *const s,
    2. const STRLEN len,
    3. const U32 flags)
  • newSVpvn_share

    Creates a new SV with its SvPVX_const pointing to a shared string in the string table. If the string does not already exist in the table, it is created first. Turns on the SvIsCOW flag (or READONLY and FAKE in 5.16 and earlier). If the hash parameter is non-zero, that value is used; otherwise the hash is computed. The string's hash can later be retrieved from the SV with the SvSHARED_HASH() macro. The idea here is that as the string table is used for shared hash keys these strings will have SvPVX_const == HeKEY and hash lookup will avoid string compare.

    1. SV* newSVpvn_share(const char* s, I32 len, U32 hash)
  • newSVpvs

    Like newSVpvn , but takes a literal string instead of a string/length pair.

    1. SV* newSVpvs(const char* s)
  • newSVpvs_flags

    Like newSVpvn_flags , but takes a literal string instead of a string/length pair.

    1. SV* newSVpvs_flags(const char* s, U32 flags)
  • newSVpvs_share

    Like newSVpvn_share , but takes a literal string instead of a string/length pair and omits the hash parameter.

    1. SV* newSVpvs_share(const char* s)
  • newSVpv_share

    Like newSVpvn_share , but takes a nul-terminated string instead of a string/length pair.

    1. SV* newSVpv_share(const char* s, U32 hash)
  • newSVrv

    Creates a new SV for the existing RV, rv , to point to. If rv is not an RV then it will be upgraded to one. If classname is non-null then the new SV will be blessed in the specified package. The new SV is returned and its reference count is 1. The reference count 1 is owned by rv .

    1. SV* newSVrv(SV *const rv,
    2. const char *const classname)
  • newSVsv

    Creates a new SV which is an exact duplicate of the original SV. (Uses sv_setsv .)

    1. SV* newSVsv(SV *const old)
  • newSVuv

    Creates a new SV and copies an unsigned integer into it. The reference count for the SV is set to 1.

    1. SV* newSVuv(const UV u)
  • newSV_type

    Creates a new SV, of the type specified. The reference count for the new SV is set to 1.

    1. SV* newSV_type(const svtype type)
  • sv_2bool

    This macro is only used by sv_true() or its macro equivalent, and only if the latter's argument is neither SvPOK, SvIOK nor SvNOK. It calls sv_2bool_flags with the SV_GMAGIC flag.

    1. bool sv_2bool(SV *const sv)
  • sv_2bool_flags

    This function is only used by sv_true() and friends, and only if the latter's argument is neither SvPOK, SvIOK nor SvNOK. If the flags contain SV_GMAGIC, then it does an mg_get() first.

    1. bool sv_2bool_flags(SV *const sv, const I32 flags)
  • sv_2cv

    Using various gambits, try to get a CV from an SV; in addition, try if possible to set *st and *gvp to the stash and GV associated with it. The flags in lref are passed to gv_fetchsv.

    1. CV* sv_2cv(SV* sv, HV **const st, GV **const gvp,
    2. const I32 lref)
  • sv_2io

    Using various gambits, try to get an IO from an SV: the IO slot if it's a GV; or the recursive result if we're an RV; or the IO slot of the symbol named after the PV if we're a string.

    'Get' magic is ignored on the sv passed in, but will be called on SvRV(sv) if sv is an RV.

    1. IO* sv_2io(SV *const sv)
  • sv_2iv_flags

    Return the integer value of an SV, doing any necessary string conversion. If flags includes SV_GMAGIC, does an mg_get() first. Normally used via the SvIV(sv) and SvIVx(sv) macros.

    1. IV sv_2iv_flags(SV *const sv, const I32 flags)
  • sv_2mortal

    Marks an existing SV as mortal. The SV will be destroyed "soon", either by an explicit call to FREETMPS, or by an implicit call at places such as statement boundaries. SvTEMP() is turned on which means that the SV's string buffer can be "stolen" if this SV is copied. See also sv_newmortal and sv_mortalcopy .

    1. SV* sv_2mortal(SV *const sv)
  • sv_2nv_flags

    Return the numeric (NV) value of an SV, doing any necessary string or integer conversion. If flags includes SV_GMAGIC, does an mg_get() first. Normally used via the SvNV(sv) and SvNVx(sv) macros.

    1. NV sv_2nv_flags(SV *const sv, const I32 flags)
  • sv_2pvbyte

    Return a pointer to the byte-encoded representation of the SV, and set *lp to its length. May cause the SV to be downgraded from UTF-8 as a side-effect.

    Usually accessed via the SvPVbyte macro.

    1. char* sv_2pvbyte(SV *sv, STRLEN *const lp)
  • sv_2pvutf8

    Return a pointer to the UTF-8-encoded representation of the SV, and set *lp to its length. May cause the SV to be upgraded to UTF-8 as a side-effect.

    Usually accessed via the SvPVutf8 macro.

    1. char* sv_2pvutf8(SV *sv, STRLEN *const lp)
  • sv_2pv_flags

    Returns a pointer to the string value of an SV, and sets *lp to its length. If flags includes SV_GMAGIC, does an mg_get() first. Coerces sv to a string if necessary. Normally invoked via the SvPV_flags macro. sv_2pv() and sv_2pv_nomg usually end up here too.

    1. char* sv_2pv_flags(SV *const sv, STRLEN *const lp,
    2. const I32 flags)
  • sv_2uv_flags

    Return the unsigned integer value of an SV, doing any necessary string conversion. If flags includes SV_GMAGIC, does an mg_get() first. Normally used via the SvUV(sv) and SvUVx(sv) macros.

    1. UV sv_2uv_flags(SV *const sv, const I32 flags)
  • sv_backoff

    Remove any string offset. You should normally use the SvOOK_off macro wrapper instead.

    1. int sv_backoff(SV *const sv)
  • sv_bless

    Blesses an SV into a specified package. The SV must be an RV. The package must be designated by its stash (see gv_stashpv() ). The reference count of the SV is unaffected.

    1. SV* sv_bless(SV *const sv, HV *const stash)
  • sv_catpv

    Concatenates the string onto the end of the string which is in the SV. If the SV has the UTF-8 status set, then the bytes appended should be valid UTF-8. Handles 'get' magic, but not 'set' magic. See sv_catpv_mg .

    1. void sv_catpv(SV *const sv, const char* ptr)
  • sv_catpvf

    Processes its arguments like sprintf and appends the formatted output to an SV. If the appended data contains "wide" characters (including, but not limited to, SVs with a UTF-8 PV formatted with %s, and characters >255 formatted with %c), the original SV might get upgraded to UTF-8. Handles 'get' magic, but not 'set' magic. See sv_catpvf_mg . If the original SV was UTF-8, the pattern should be valid UTF-8; if the original SV was bytes, the pattern should be too.

    1. void sv_catpvf(SV *const sv, const char *const pat,
    2. ...)
  • sv_catpvf_mg

    Like sv_catpvf , but also handles 'set' magic.

    1. void sv_catpvf_mg(SV *const sv,
    2. const char *const pat, ...)
  • sv_catpvn

    Concatenates the string onto the end of the string which is in the SV. The len indicates number of bytes to copy. If the SV has the UTF-8 status set, then the bytes appended should be valid UTF-8. Handles 'get' magic, but not 'set' magic. See sv_catpvn_mg .

    1. void sv_catpvn(SV *dsv, const char *sstr, STRLEN len)
  • sv_catpvn_flags

    Concatenates the string onto the end of the string which is in the SV. The len indicates number of bytes to copy. If the SV has the UTF-8 status set, then the bytes appended should be valid UTF-8. If flags has the SV_SMAGIC bit set, will mg_set on dsv afterwards if appropriate. sv_catpvn and sv_catpvn_nomg are implemented in terms of this function.

    1. void sv_catpvn_flags(SV *const dstr,
    2. const char *sstr,
    3. const STRLEN len,
    4. const I32 flags)
  • sv_catpvs

    Like sv_catpvn , but takes a literal string instead of a string/length pair.

    1. void sv_catpvs(SV* sv, const char* s)
  • sv_catpvs_flags

    Like sv_catpvn_flags , but takes a literal string instead of a string/length pair.

    1. void sv_catpvs_flags(SV* sv, const char* s,
    2. I32 flags)
  • sv_catpvs_mg

    Like sv_catpvn_mg , but takes a literal string instead of a string/length pair.

    1. void sv_catpvs_mg(SV* sv, const char* s)
  • sv_catpvs_nomg

    Like sv_catpvn_nomg , but takes a literal string instead of a string/length pair.

    1. void sv_catpvs_nomg(SV* sv, const char* s)
  • sv_catpv_flags

    Concatenates the string onto the end of the string which is in the SV. If the SV has the UTF-8 status set, then the bytes appended should be valid UTF-8. If flags has the SV_SMAGIC bit set, will mg_set on the modified SV if appropriate.

    1. void sv_catpv_flags(SV *dstr, const char *sstr,
    2. const I32 flags)
  • sv_catpv_mg

    Like sv_catpv , but also handles 'set' magic.

    1. void sv_catpv_mg(SV *const sv, const char *const ptr)
  • sv_catsv

    Concatenates the string from SV ssv onto the end of the string in SV dsv . If ssv is null, does nothing; otherwise modifies only dsv . Handles 'get' magic on both SVs, but no 'set' magic. See sv_catsv_mg and sv_catsv_nomg .

    1. void sv_catsv(SV *dsv, SV *ssv)
  • sv_catsv_flags

    Concatenates the string from SV ssv onto the end of the string in SV dsv. If ssv is null, does nothing; otherwise modifies only dsv. If flags has the SV_GMAGIC bit set, mg_get is called on both SVs if appropriate. If flags has the SV_SMAGIC bit set, mg_set is called on the modified SV afterward, if appropriate. sv_catsv, sv_catsv_nomg, and sv_catsv_mg are implemented in terms of this function.

    1. void sv_catsv_flags(SV *const dsv, SV *const ssv,
    2. const I32 flags)
  • sv_chop

    Efficient removal of characters from the beginning of the string buffer. SvPOK(sv), or at least SvPOKp(sv), must be true and the ptr must be a pointer to somewhere inside the string buffer. The ptr becomes the first character of the adjusted string. Uses the "OOK hack". On return, only SvPOK(sv) and SvPOKp(sv) among the OK flags will be true.

    Beware: after this function returns, ptr and SvPVX_const(sv) may no longer refer to the same chunk of data.

    The unfortunate similarity of this function's name to that of Perl's chop operator is strictly coincidental. This function works from the left; chop works from the right.

    1. void sv_chop(SV *const sv, const char *const ptr)
  • sv_clear

    Clear an SV: call any destructors, free up any memory used by the body, and free the body itself. The SV's head is not freed, although its type is set to all 1's so that it won't inadvertently be assumed to be live during global destruction etc. This function should only be called when REFCNT is zero. Most of the time you'll want to call sv_free() (or its macro wrapper SvREFCNT_dec ) instead.

    1. void sv_clear(SV *const orig_sv)
  • sv_cmp

    Compares the strings in two SVs. Returns -1, 0, or 1 indicating whether the string in sv1 is less than, equal to, or greater than the string in sv2 . Is UTF-8 and 'use bytes' aware, handles get magic, and will coerce its args to strings if necessary. See also sv_cmp_locale .

    1. I32 sv_cmp(SV *const sv1, SV *const sv2)
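    The -1, 0, 1 contract described above can be sketched for plain byte strings in C. This is an illustrative model under stated assumptions: toy_cmp is a hypothetical helper, it works on pointer/length pairs rather than SVs, and it ignores UTF-8, 'use bytes', magic and string coercion, all of which the real function handles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare two byte strings given as pointer/length
 * pairs and normalise the result to -1, 0 or 1. Embedded NUL bytes are
 * compared like any other byte. */
static int toy_cmp(const char *a, size_t alen, const char *b, size_t blen) {
    assert(a != NULL && b != NULL);
    size_t n = alen < blen ? alen : blen;
    int r = n ? memcmp(a, b, n) : 0;
    if (r != 0)
        return r < 0 ? -1 : 1;   /* first differing byte decides */
    if (alen == blen)
        return 0;                /* same bytes, same length */
    return alen < blen ? -1 : 1; /* a proper prefix sorts first */
}
```

    Normalising the memcmp result matters because memcmp only promises a negative, zero or positive value, not exactly -1, 0 or 1.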
  • sv_cmp_flags

    Compares the strings in two SVs. Returns -1, 0, or 1 indicating whether the string in sv1 is less than, equal to, or greater than the string in sv2 . Is UTF-8 and 'use bytes' aware and will coerce its args to strings if necessary. If the flags include SV_GMAGIC, it handles get magic. See also sv_cmp_locale_flags .

    1. I32 sv_cmp_flags(SV *const sv1, SV *const sv2,
    2. const U32 flags)
  • sv_cmp_locale

    Compares the strings in two SVs in a locale-aware manner. Is UTF-8 and 'use bytes' aware, handles get magic, and will coerce its args to strings if necessary. See also sv_cmp .

    1. I32 sv_cmp_locale(SV *const sv1, SV *const sv2)
  • sv_cmp_locale_flags

    Compares the strings in two SVs in a locale-aware manner. Is UTF-8 and 'use bytes' aware and will coerce its args to strings if necessary. If the flags contain SV_GMAGIC, it handles get magic. See also sv_cmp_flags .

    1. I32 sv_cmp_locale_flags(SV *const sv1,
    2. SV *const sv2,
    3. const U32 flags)
  • sv_collxfrm

    This calls sv_collxfrm_flags with the SV_GMAGIC flag. See sv_collxfrm_flags .

    1. char* sv_collxfrm(SV *const sv, STRLEN *const nxp)
  • sv_collxfrm_flags

    Add Collate Transform magic to an SV if it doesn't already have it. If the flags contain SV_GMAGIC, it handles get-magic.

    Any scalar variable may carry PERL_MAGIC_collxfrm magic that contains the scalar data of the variable, but transformed to such a format that a normal memory comparison can be used to compare the data according to the locale settings.

    1. char* sv_collxfrm_flags(SV *const sv,
    2. STRLEN *const nxp,
    3. I32 const flags)
  • sv_copypv_flags

    Implementation of sv_copypv and sv_copypv_nomg. Calls get magic iff flags include SV_GMAGIC.

    1. void sv_copypv_flags(SV *const dsv, SV *const ssv,
    2. const I32 flags)
  • sv_copypv_nomg

    Like sv_copypv, but doesn't invoke get magic first.

    1. void sv_copypv_nomg(SV *const dsv, SV *const ssv)
  • sv_dec

    Auto-decrement of the value in the SV, doing string to numeric conversion if necessary. Handles 'get' magic and operator overloading.

    1. void sv_dec(SV *const sv)
  • sv_dec_nomg

    Auto-decrement of the value in the SV, doing string to numeric conversion if necessary. Handles operator overloading. Skips handling 'get' magic.

    1. void sv_dec_nomg(SV *const sv)
  • sv_eq

    Returns a boolean indicating whether the strings in the two SVs are identical. Is UTF-8 and 'use bytes' aware, handles get magic, and will coerce its args to strings if necessary.

    1. I32 sv_eq(SV* sv1, SV* sv2)
  • sv_eq_flags

    Returns a boolean indicating whether the strings in the two SVs are identical. Is UTF-8 and 'use bytes' aware and coerces its args to strings if necessary. If the flags include SV_GMAGIC, it handles get-magic, too.

    1. I32 sv_eq_flags(SV* sv1, SV* sv2, const U32 flags)
  • sv_force_normal_flags

    Undo various types of fakery on an SV, where fakery means "more than" a string: if the PV is a shared string, make a private copy; if we're a ref, stop refing; if we're a glob, downgrade to an xpvmg; if we're a copy-on-write scalar, this is the on-write time when we do the copy, and is also used locally; if this is a vstring, drop the vstring magic. If SV_COW_DROP_PV is set then a copy-on-write scalar drops its PV buffer (if any) and becomes SvPOK_off rather than making a copy. (Used where this scalar is about to be set to some other value.) In addition, the flags parameter gets passed to sv_unref_flags() when unreffing. sv_force_normal calls this function with flags set to 0.

    1. void sv_force_normal_flags(SV *const sv,
    2. const U32 flags)
  • sv_free

    Decrement an SV's reference count, and if it drops to zero, call sv_clear to invoke destructors and free up any memory used by the body; finally, deallocate the SV's head itself. Normally called via a wrapper macro SvREFCNT_dec .

    1. void sv_free(SV *const sv)
  • sv_gets

    Get a line from the filehandle and store it into the SV, optionally appending to the currently-stored string. If append is not 0, the line is appended to the SV instead of overwriting it. append should be set to the byte offset that the appended string should start at in the SV (typically, SvCUR(sv) is a suitable choice).

    1. char* sv_gets(SV *const sv, PerlIO *const fp,
    2. I32 append)
  • sv_grow

    Expands the character buffer in the SV. If necessary, uses sv_unref and upgrades the SV to SVt_PV . Returns a pointer to the character buffer. Use the SvGROW wrapper instead.

    1. char* sv_grow(SV *const sv, STRLEN newlen)
  • sv_inc

    Auto-increment of the value in the SV, doing string to numeric conversion if necessary. Handles 'get' magic and operator overloading.

    1. void sv_inc(SV *const sv)
  • sv_inc_nomg

    Auto-increment of the value in the SV, doing string to numeric conversion if necessary. Handles operator overloading. Skips handling 'get' magic.

    1. void sv_inc_nomg(SV *const sv)
  • sv_insert

    Inserts a string at the specified offset/length within the SV. Similar to the Perl substr() function. Handles get magic.

    1. void sv_insert(SV *const bigstr, const STRLEN offset,
    2. const STRLEN len,
    3. const char *const little,
    4. const STRLEN littlelen)
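    The offset/length replacement that sv_insert performs, similar to four-argument substr(), can be modelled in plain C. This sketch makes several assumptions: toy_insert is a hypothetical helper, it returns a new malloc'ed buffer instead of modifying its input in place, and it knows nothing about SVs, magic or UTF-8.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly malloc'ed, NUL-terminated copy of
 * `big` in which the `len` bytes starting at `offset` are replaced by the
 * `littlelen` bytes at `little`. The new length is stored in *newlen.
 * Returns NULL if allocation fails. */
static char *toy_insert(const char *big, size_t biglen,
                        size_t offset, size_t len,
                        const char *little, size_t littlelen,
                        size_t *newlen) {
    assert(offset <= biglen && len <= biglen - offset);
    size_t taillen = biglen - offset - len;          /* bytes after the hole */
    size_t outlen  = offset + littlelen + taillen;
    char *out = malloc(outlen + 1);
    if (!out)
        return NULL;
    memcpy(out, big, offset);                        /* head, unchanged */
    memcpy(out + offset, little, littlelen);         /* replacement */
    memcpy(out + offset + littlelen, big + offset + len, taillen); /* tail */
    out[outlen] = '\0';
    *newlen = outlen;
    return out;
}
```

    A len of zero gives a pure insertion and a littlelen of zero gives a pure deletion, which mirrors how one call covers both cases in the description above.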
  • sv_insert_flags

    Same as sv_insert , but the extra flags are passed to the SvPV_force_flags that applies to bigstr .

    1. void sv_insert_flags(SV *const bigstr,
    2. const STRLEN offset,
    3. const STRLEN len,
    4. const char *const little,
    5. const STRLEN littlelen,
    6. const U32 flags)
  • sv_isa

    Returns a boolean indicating whether the SV is blessed into the specified class. This does not check for subtypes; use sv_derived_from to verify an inheritance relationship.

    1. int sv_isa(SV* sv, const char *const name)
  • sv_isobject

    Returns a boolean indicating whether the SV is an RV pointing to a blessed object. If the SV is not an RV, or if the object is not blessed, then this will return false.

    1. int sv_isobject(SV* sv)
  • sv_len

    Returns the length of the string in the SV. Handles magic and type coercion and sets the UTF8 flag appropriately. See also SvCUR , which gives raw access to the xpv_cur slot.

    1. STRLEN sv_len(SV *const sv)
  • sv_len_utf8

    Returns the number of characters in the string in an SV, counting each multi-byte UTF-8 sequence as a single character. Handles magic and type coercion.

    1. STRLEN sv_len_utf8(SV *const sv)
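
    To illustrate what counting characters rather than bytes means, here is a minimal stand-alone sketch. It is not how Perl implements sv_len_utf8; count_utf8_chars is a hypothetical helper that assumes a well-formed UTF-8 buffer and simply counts the bytes that are not continuation bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Perl API: count the characters in
 * a well-formed UTF-8 buffer by counting every byte that is not a
 * continuation byte (continuation bytes have the bit pattern 10xxxxxx). */
size_t count_utf8_chars(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t chars = 0;
    for (size_t i = 0; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

    For the six bytes "h\xC3\xA9llo" this model reports five characters, which is the distinction between sv_len_utf8 and a plain byte length.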
  • sv_magic

    Adds magic to an SV. First upgrades sv to type SVt_PVMG if necessary, then adds a new magic item of type how to the head of the magic list.

    See sv_magicext (which sv_magic now calls) for a description of the handling of the name and namlen arguments.

    You need to use sv_magicext to add magic to SvREADONLY SVs and also to add more than one instance of the same 'how'.

    1. void sv_magic(SV *const sv, SV *const obj,
    2. const int how, const char *const name,
    3. const I32 namlen)
  • sv_magicext

    Adds magic to an SV, upgrading it if necessary. Applies the supplied vtable and returns a pointer to the magic added.

    Note that sv_magicext will allow things that sv_magic will not. In particular, you can add magic to SvREADONLY SVs, and add more than one instance of the same 'how'.

    If namlen is greater than zero, a savepvn copy of name is stored. If namlen is zero, name is stored as-is. As another special case, if (name && namlen == HEf_SVKEY), name is assumed to contain an SV* and is stored as-is with its REFCNT incremented.

    (This is now used as a subroutine by sv_magic .)

    1. MAGIC * sv_magicext(SV *const sv, SV *const obj,
    2. const int how,
    3. const MGVTBL *const vtbl,
    4. const char *const name,
    5. const I32 namlen)
  • sv_mortalcopy

    Creates a new SV which is a copy of the original SV (using sv_setsv ). The new SV is marked as mortal. It will be destroyed "soon", either by an explicit call to FREETMPS, or by an implicit call at places such as statement boundaries. See also sv_newmortal and sv_2mortal .

    1. SV* sv_mortalcopy(SV *const oldsv)
  • sv_newmortal

    Creates a new null SV which is mortal. The reference count of the SV is set to 1. It will be destroyed "soon", either by an explicit call to FREETMPS, or by an implicit call at places such as statement boundaries. See also sv_mortalcopy and sv_2mortal .

    1. SV* sv_newmortal()
  • sv_newref

    Increment an SV's reference count. Use the SvREFCNT_inc() wrapper instead.

    1. SV* sv_newref(SV *const sv)
  • sv_pos_b2u

    Converts the value pointed to by offsetp from a count of bytes from the start of the string, to a count of the equivalent number of UTF-8 chars. Handles magic and type coercion.

    1. void sv_pos_b2u(SV *const sv, I32 *const offsetp)
  • sv_pos_u2b

    Converts the value pointed to by offsetp from a count of UTF-8 chars from the start of the string, to a count of the equivalent number of bytes; if lenp is non-zero, it does the same to lenp, but this time starting from the offset, rather than from the start of the string. Handles magic and type coercion.

    Use sv_pos_u2b_flags in preference, which correctly handles strings longer than 2Gb.

    1. void sv_pos_u2b(SV *const sv, I32 *const offsetp,
    2. I32 *const lenp)
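
    The character-to-byte conversion can be sketched outside Perl as follows. This is an illustrative model only: utf8_char_to_byte_offset is a hypothetical helper that assumes well-formed UTF-8, clamps at the end of the buffer, and ignores magic and type coercion entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Perl API: return the byte offset
 * at which character number `char_offset` starts in a well-formed UTF-8
 * buffer. Offsets past the end are clamped to `len`. */
size_t utf8_char_to_byte_offset(const unsigned char *s, size_t len,
                                size_t char_offset)
{
    assert(s != NULL || len == 0);
    size_t i = 0;
    while (i < len && char_offset > 0) {
        i++;                                    /* step over the lead byte */
        while (i < len && (s[i] & 0xC0) == 0x80)
            i++;                                /* and its continuation bytes */
        char_offset--;
    }
    return i;
}
```

    A length in characters that starts at a given character offset can be converted in the same way, by taking the difference between the byte offsets of the end and the start of the span.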
  • sv_pos_u2b_flags

    Converts the value pointed to by offsetp from a count of UTF-8 chars from the start of the string, to a count of the equivalent number of bytes; if lenp is non-zero, it does the same to lenp, but this time starting from the offset, rather than from the start of the string. Handles type coercion. flags is passed to SvPV_flags , and usually should be SV_GMAGIC|SV_CONST_RETURN to handle magic.

    1. STRLEN sv_pos_u2b_flags(SV *const sv, STRLEN uoffset,
    2. STRLEN *const lenp, U32 flags)
  • sv_pvbyten_force

    The backend for the SvPVbytex_force macro. Always use the macro instead.

    1. char* sv_pvbyten_force(SV *const sv, STRLEN *const lp)
  • sv_pvn_force

    Get a sensible string out of the SV somehow. A private implementation of the SvPV_force macro for compilers which can't cope with complex macro expressions. Always use the macro instead.

    1. char* sv_pvn_force(SV* sv, STRLEN* lp)
  • sv_pvn_force_flags

    Get a sensible string out of the SV somehow. If flags has SV_GMAGIC bit set, will mg_get on sv if appropriate, else not. sv_pvn_force and sv_pvn_force_nomg are implemented in terms of this function. You normally want to use the various wrapper macros instead: see SvPV_force and SvPV_force_nomg.

    1. char* sv_pvn_force_flags(SV *const sv,
    2. STRLEN *const lp,
    3. const I32 flags)
  • sv_pvutf8n_force

    The backend for the SvPVutf8x_force macro. Always use the macro instead.

    1. char* sv_pvutf8n_force(SV *const sv, STRLEN *const lp)
  • sv_reftype

    Returns a string describing what the SV is a reference to.

    1. const char* sv_reftype(const SV *const sv, const int ob)
  • sv_replace

    Make the first argument a copy of the second, then delete the original. The target SV physically takes over ownership of the body of the source SV and inherits its flags; however, the target keeps any magic it owns, and any magic in the source is discarded. Note that this is a rather specialist SV copying operation; most of the time you'll want to use sv_setsv or one of its many macro front-ends.

    1. void sv_replace(SV *const sv, SV *const nsv)
  • sv_reset

    Underlying implementation for the reset Perl function. Note that the perl-level function is vaguely deprecated.

    1. void sv_reset(const char* s, HV *const stash)
  • sv_rvweaken

    Weaken a reference: set the SvWEAKREF flag on this RV; give the referred-to SV PERL_MAGIC_backref magic if it hasn't already; and push a back-reference to this RV onto the array of backreferences associated with that magic. If the RV is magical, set magic will be called after the RV is cleared.

    1. SV* sv_rvweaken(SV *const sv)
  • sv_setiv

    Copies an integer into the given SV, upgrading first if necessary. Does not handle 'set' magic. See also sv_setiv_mg .

    1. void sv_setiv(SV *const sv, const IV num)
  • sv_setiv_mg

    Like sv_setiv , but also handles 'set' magic.

    1. void sv_setiv_mg(SV *const sv, const IV i)
  • sv_setnv

    Copies a double into the given SV, upgrading first if necessary. Does not handle 'set' magic. See also sv_setnv_mg .

    1. void sv_setnv(SV *const sv, const NV num)
  • sv_setnv_mg

    Like sv_setnv , but also handles 'set' magic.

    1. void sv_setnv_mg(SV *const sv, const NV num)
  • sv_setpv

    Copies a string into an SV. The string must be null-terminated. Does not handle 'set' magic. See sv_setpv_mg .

    1. void sv_setpv(SV *const sv, const char *const ptr)
  • sv_setpvf

    Works like sv_catpvf but copies the text into the SV instead of appending it. Does not handle 'set' magic. See sv_setpvf_mg .

    1. void sv_setpvf(SV *const sv, const char *const pat,
    2. ...)
  • sv_setpvf_mg

    Like sv_setpvf , but also handles 'set' magic.

    1. void sv_setpvf_mg(SV *const sv,
    2. const char *const pat, ...)
  • sv_setpviv

    Copies an integer into the given SV, also updating its string value. Does not handle 'set' magic. See sv_setpviv_mg .

    1. void sv_setpviv(SV *const sv, const IV num)
  • sv_setpviv_mg

    Like sv_setpviv , but also handles 'set' magic.

    1. void sv_setpviv_mg(SV *const sv, const IV iv)
  • sv_setpvn

    Copies a string into an SV. The len parameter indicates the number of bytes to be copied. If the ptr argument is NULL the SV will become undefined. Does not handle 'set' magic. See sv_setpvn_mg .

    1. void sv_setpvn(SV *const sv, const char *const ptr,
    2. const STRLEN len)
  • sv_setpvn_mg

    Like sv_setpvn , but also handles 'set' magic.

    1. void sv_setpvn_mg(SV *const sv,
    2. const char *const ptr,
    3. const STRLEN len)
  • sv_setpvs

    Like sv_setpvn , but takes a literal string instead of a string/length pair.

    1. void sv_setpvs(SV* sv, const char* s)
  • sv_setpvs_mg

    Like sv_setpvn_mg , but takes a literal string instead of a string/length pair.

    1. void sv_setpvs_mg(SV* sv, const char* s)
  • sv_setpv_mg

    Like sv_setpv , but also handles 'set' magic.

    1. void sv_setpv_mg(SV *const sv, const char *const ptr)
  • sv_setref_iv

    Copies an integer into a new SV, optionally blessing the SV. The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

    1. SV* sv_setref_iv(SV *const rv,
    2. const char *const classname,
    3. const IV iv)
  • sv_setref_nv

    Copies a double into a new SV, optionally blessing the SV. The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

    1. SV* sv_setref_nv(SV *const rv,
    2. const char *const classname,
    3. const NV nv)
  • sv_setref_pv

    Copies a pointer into a new SV, optionally blessing the SV. The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. If the pv argument is NULL then PL_sv_undef will be placed into the SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

    Do not use with other Perl types such as HV, AV, SV, CV, because those objects will become corrupted by the pointer copy process.

    Note that sv_setref_pvn copies the string while this copies the pointer.

    1. SV* sv_setref_pv(SV *const rv,
    2. const char *const classname,
    3. void *const pv)
  • sv_setref_pvn

    Copies a string into a new SV, optionally blessing the SV. The length of the string must be specified with n . The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

    Note that sv_setref_pv copies the pointer while this copies the string.

    1. SV* sv_setref_pvn(SV *const rv,
    2. const char *const classname,
    3. const char *const pv,
    4. const STRLEN n)
  • sv_setref_pvs

    Like sv_setref_pvn , but takes a literal string instead of a string/length pair.

    1. SV * sv_setref_pvs(const char* s)
  • sv_setref_uv

    Copies an unsigned integer into a new SV, optionally blessing the SV. The rv argument will be upgraded to an RV. That RV will be modified to point to the new SV. The classname argument indicates the package for the blessing. Set classname to NULL to avoid the blessing. The new SV will have a reference count of 1, and the RV will be returned.

    1. SV* sv_setref_uv(SV *const rv,
    2. const char *const classname,
    3. const UV uv)
  • sv_setsv

    Copies the contents of the source SV sstr into the destination SV dstr . The source SV may be destroyed if it is mortal, so don't use this function if the source SV needs to be reused. Does not handle 'set' magic. Loosely speaking, it performs a copy-by-value, obliterating any previous content of the destination.

    You probably want to use one of the assortment of wrappers, such as SvSetSV , SvSetSV_nosteal , SvSetMagicSV and SvSetMagicSV_nosteal .

    1. void sv_setsv(SV *dstr, SV *sstr)
  • sv_setsv_flags

    Copies the contents of the source SV sstr into the destination SV dstr . The source SV may be destroyed if it is mortal, so don't use this function if the source SV needs to be reused. Does not handle 'set' magic. Loosely speaking, it performs a copy-by-value, obliterating any previous content of the destination. If the flags parameter has the SV_GMAGIC bit set, will mg_get on sstr if appropriate, else not. If the flags parameter has the SV_NOSTEAL bit set then the buffers of temps will not be stolen. sv_setsv and sv_setsv_nomg are implemented in terms of this function.

    You probably want to use one of the assortment of wrappers, such as SvSetSV , SvSetSV_nosteal , SvSetMagicSV and SvSetMagicSV_nosteal .

    This is the primary function for copying scalars, and most other copy-ish functions and macros use this underneath.

    1. void sv_setsv_flags(SV *dstr, SV *sstr,
    2. const I32 flags)
  • sv_setsv_mg

    Like sv_setsv , but also handles 'set' magic.

    1. void sv_setsv_mg(SV *const dstr, SV *const sstr)
  • sv_setuv

    Copies an unsigned integer into the given SV, upgrading first if necessary. Does not handle 'set' magic. See also sv_setuv_mg .

    1. void sv_setuv(SV *const sv, const UV num)
  • sv_setuv_mg

    Like sv_setuv , but also handles 'set' magic.

    1. void sv_setuv_mg(SV *const sv, const UV u)
  • sv_tainted

    Test an SV for taintedness. Use SvTAINTED instead.

    1. bool sv_tainted(SV *const sv)
  • sv_true

    Returns true if the SV has a true value by Perl's rules. Use the SvTRUE macro instead, which may call sv_true() or may instead use an in-line version.

    1. I32 sv_true(SV *const sv)
  • sv_unmagic

    Removes all magic of type type from an SV.

    1. int sv_unmagic(SV *const sv, const int type)
  • sv_unmagicext

    Removes all magic of type type with the specified vtbl from an SV.

    1. int sv_unmagicext(SV *const sv, const int type,
    2. MGVTBL *vtbl)
  • sv_unref_flags

    Unsets the RV status of the SV, and decrements the reference count of whatever was being referenced by the RV. This can almost be thought of as a reversal of newSVrv . The flags argument can contain SV_IMMEDIATE_UNREF to force the reference count to be decremented (otherwise the decrementing is conditional on the reference count being different from one or the reference being a readonly SV). See SvROK_off .

    1. void sv_unref_flags(SV *const ref, const U32 flags)
  • sv_untaint

    Untaint an SV. Use SvTAINTED_off instead.

    1. void sv_untaint(SV *const sv)
  • sv_upgrade

    Upgrade an SV to a more complex form. Generally adds a new body type to the SV, then copies across as much information as possible from the old body. It croaks if the SV is already in a more complex form than requested. You generally want to use the SvUPGRADE macro wrapper, which checks the type before calling sv_upgrade , and hence does not croak. See also svtype .

    1. void sv_upgrade(SV *const sv, svtype new_type)
  • sv_usepvn_flags

    Tells an SV to use ptr to find its string value. Normally the string is stored inside the SV but sv_usepvn allows the SV to use an outside string. The ptr should point to memory that was allocated by malloc . It must be the start of a mallocked block of memory, and not a pointer to the middle of it. The string length, len , must be supplied. By default this function will realloc (i.e. move) the memory pointed to by ptr , so that pointer should not be freed or used by the programmer after giving it to sv_usepvn, and neither should any pointers from "behind" that pointer (e.g. ptr + 1) be used.

    If flags & SV_SMAGIC is true, will call SvSETMAGIC. If flags & SV_HAS_TRAILING_NUL is true, then ptr[len] must be NUL, and the realloc will be skipped (i.e. the buffer is actually at least 1 byte longer than len , and already meets the requirements for storing in SvPVX ).

    1. void sv_usepvn_flags(SV *const sv, char* ptr,
    2. const STRLEN len,
    3. const U32 flags)
  • sv_utf8_decode

    If the PV of the SV is an octet sequence in UTF-8 and contains a multiple-byte character, the SvUTF8 flag is turned on so that it looks like a character. If the PV contains only single-byte characters, the SvUTF8 flag stays off. Scans PV for validity and returns false if the PV is invalid UTF-8.

    NOTE: this function is experimental and may change or be removed without notice.

    1. bool sv_utf8_decode(SV *const sv)
  • sv_utf8_downgrade

    Attempts to convert the PV of an SV from characters to bytes. If the PV contains a character that cannot fit in a byte, this conversion will fail; in this case, either returns false or, if fail_ok is not true, croaks.

    This is not a general purpose Unicode to byte encoding interface: use the Encode extension for that.

    NOTE: this function is experimental and may change or be removed without notice.

    1. bool sv_utf8_downgrade(SV *const sv,
    2. const bool fail_ok)
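
    The failure condition described above (a character that cannot fit in a byte) can be seen in a small stand-alone model. This is a sketch under stated assumptions, not Perl's code: utf8_downgrade_in_place is a hypothetical helper, the input is assumed to be well-formed UTF-8 on a non-EBCDIC platform, and failure is reported by returning false instead of croaking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Perl API: convert a well-formed
 * UTF-8 buffer in place to one byte per character. Returns false and
 * leaves the buffer untouched if any character is above 0xFF. */
bool utf8_downgrade_in_place(unsigned char *s, size_t *len)
{
    assert(s != NULL && len != NULL);

    /* Only the lead bytes 0xC2 and 0xC3 encode U+0080..U+00FF, so any
     * byte above 0xC3 starts a character that cannot fit in a byte. */
    for (size_t i = 0; i < *len; i++) {
        if (s[i] > 0xC3)
            return false;
    }

    size_t out = 0;
    for (size_t i = 0; i < *len; i++) {
        unsigned char c = s[i];
        if (c >= 0xC2 && i + 1 < *len) {
            /* Fold the two-byte sequence back into a single byte. */
            c = (unsigned char)(((c & 0x03) << 6) | (s[i + 1] & 0x3F));
            i++;
        }
        s[out++] = c;
    }
    *len = out;
    return true;
}
```

    In this model "caf" followed by the two bytes 0xC3 0xA9 shrinks to four bytes ending in 0xE9, while the three-byte encoding of U+20AC is rejected.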
  • sv_utf8_encode

    Converts the PV of an SV to UTF-8, but then turns the SvUTF8 flag off so that it looks like octets again.

    1. void sv_utf8_encode(SV *const sv)
  • sv_utf8_upgrade

    Converts the PV of an SV to its UTF-8-encoded form. Forces the SV to string form if it is not already. Will mg_get on sv if appropriate. Always sets the SvUTF8 flag to avoid future validity checks even if the whole string is the same in UTF-8 as not. Returns the number of bytes in the converted string.

    This is not a general purpose byte encoding to Unicode interface: use the Encode extension for that.

    1. STRLEN sv_utf8_upgrade(SV *sv)
  • sv_utf8_upgrade_flags

    Converts the PV of an SV to its UTF-8-encoded form. Forces the SV to string form if it is not already. Always sets the SvUTF8 flag to avoid future validity checks even if all the bytes are invariant in UTF-8. If flags has SV_GMAGIC bit set, will mg_get on sv if appropriate, else not. Returns the number of bytes in the converted string. sv_utf8_upgrade and sv_utf8_upgrade_nomg are implemented in terms of this function.

    This is not a general purpose byte encoding to Unicode interface: use the Encode extension for that.

    1. STRLEN sv_utf8_upgrade_flags(SV *const sv,
    2. const I32 flags)
  • sv_utf8_upgrade_nomg

    Like sv_utf8_upgrade, but doesn't do magic on sv .

    1. STRLEN sv_utf8_upgrade_nomg(SV *sv)
  • sv_vcatpvf

    Processes its arguments like vsprintf and appends the formatted output to an SV. Does not handle 'set' magic. See sv_vcatpvf_mg .

    Usually used via its frontend sv_catpvf .

    1. void sv_vcatpvf(SV *const sv, const char *const pat,
    2. va_list *const args)
  • sv_vcatpvfn
    1. void sv_vcatpvfn(SV *const sv, const char *const pat,
    2. const STRLEN patlen,
    3. va_list *const args,
    4. SV **const svargs, const I32 svmax,
    5. bool *const maybe_tainted)
  • sv_vcatpvfn_flags

    Processes its arguments like vsprintf and appends the formatted output to an SV. Uses an array of SVs if the C style variable argument list is missing (NULL). When running with taint checks enabled, indicates via maybe_tainted if results are untrustworthy (often due to the use of locales).

    If called as sv_vcatpvfn or flags include SV_GMAGIC , calls get magic.

    Usually used via one of its frontends sv_vcatpvf and sv_vcatpvf_mg .

    1. void sv_vcatpvfn_flags(SV *const sv,
    2. const char *const pat,
    3. const STRLEN patlen,
    4. va_list *const args,
    5. SV **const svargs,
    6. const I32 svmax,
    7. bool *const maybe_tainted,
    8. const U32 flags)
  • sv_vcatpvf_mg

    Like sv_vcatpvf , but also handles 'set' magic.

    Usually used via its frontend sv_catpvf_mg .

    1. void sv_vcatpvf_mg(SV *const sv,
    2. const char *const pat,
    3. va_list *const args)
  • sv_vsetpvf

    Works like sv_vcatpvf but copies the text into the SV instead of appending it. Does not handle 'set' magic. See sv_vsetpvf_mg .

    Usually used via its frontend sv_setpvf .

    1. void sv_vsetpvf(SV *const sv, const char *const pat,
    2. va_list *const args)
  • sv_vsetpvfn

    Works like sv_vcatpvfn but copies the text into the SV instead of appending it.

    Usually used via one of its frontends sv_vsetpvf and sv_vsetpvf_mg .

    1. void sv_vsetpvfn(SV *const sv, const char *const pat,
    2. const STRLEN patlen,
    3. va_list *const args,
    4. SV **const svargs, const I32 svmax,
    5. bool *const maybe_tainted)
  • sv_vsetpvf_mg

    Like sv_vsetpvf , but also handles 'set' magic.

    Usually used via its frontend sv_setpvf_mg .

    1. void sv_vsetpvf_mg(SV *const sv,
    2. const char *const pat,
    3. va_list *const args)

Unicode Support

  • bytes_cmp_utf8

    Compares the sequence of characters (stored as octets) in b , blen with the sequence of characters (stored as UTF-8) in u , ulen . Returns 0 if they are equal, -1 or -2 if the first string is less than the second string, +1 or +2 if the first string is greater than the second string.

    -1 or +1 is returned if the shorter string was identical to the start of the longer string. -2 or +2 is returned if there was a difference between characters within the strings.

    1. int bytes_cmp_utf8(const U8 *b, STRLEN blen,
    2. const U8 *u, STRLEN ulen)
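
    The return-value convention above can be modelled in plain C. The following is a sketch only, not the code in the Perl source: cmp_bytes_with_utf8 is a hypothetical helper that assumes a non-EBCDIC platform and well-formed UTF-8 in u, and it treats any character above 0xFF as greater than every byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Perl API: compare the bytes in
 * b (one character per byte) with the UTF-8 encoded characters in u.
 * Returns 0, -1/+1 (one string is a prefix of the other) or -2/+2
 * (the strings differ at some character). */
int cmp_bytes_with_utf8(const unsigned char *b, size_t blen,
                        const unsigned char *u, size_t ulen)
{
    assert(b != NULL && u != NULL);
    const unsigned char *bend = b + blen;
    const unsigned char *uend = u + ulen;

    while (b < bend && u < uend) {
        unsigned int c = *u++;
        if (c >= 0x80) {
            if ((c == 0xC2 || c == 0xC3) && u < uend) {
                /* Two-byte sequence for U+0080..U+00FF. */
                c = ((c & 0x03u) << 6) | (*u++ & 0x3Fu);
            } else {
                /* Character above 0xFF: no single byte can equal it. */
                return -2;
            }
        }
        if (*b != c)
            return *b < c ? -2 : 2;
        b++;
    }
    if (b == bend && u == uend)
        return 0;
    return b < bend ? 1 : -1;   /* the longer string sorts last */
}
```

    With this model the byte 0xE9 compares equal to the UTF-8 sequence 0xC3 0xA9, "ab" against "abc" gives -1, and "abd" against "abc" gives +2.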
  • bytes_from_utf8

    Converts a string s of length len from UTF-8 into native byte encoding. Unlike utf8_to_bytes but like bytes_to_utf8, returns a pointer to the newly-created string, and updates len to contain the new length. Returns the original string if no conversion occurs, in which case len is unchanged. Does nothing if is_utf8 points to 0. Sets is_utf8 to 0 if s is converted or consisted entirely of characters that are invariant in utf8 (i.e., US-ASCII on non-EBCDIC machines).

    NOTE: this function is experimental and may change or be removed without notice.

    1. U8* bytes_from_utf8(const U8 *s, STRLEN *len,
    2. bool *is_utf8)
  • bytes_to_utf8

    Converts a string s of length len bytes from the native encoding into UTF-8. Returns a pointer to the newly-created string, and sets len to reflect the new length in bytes.

    A NUL character will be written after the end of the string.

    If you want to convert to UTF-8 from encodings other than the native (Latin1 or EBCDIC), see sv_recode_to_utf8().

    NOTE: this function is experimental and may change or be removed without notice.

    1. U8* bytes_to_utf8(const U8 *s, STRLEN *len)
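
    As an illustration of the conversion, here is a stand-alone sketch that assumes the native encoding is Latin-1 (that is, a non-EBCDIC platform). latin1_to_utf8 is a hypothetical helper using malloc; the real function allocates through Perl's own memory routines, so this only models the byte-level transformation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the Perl API: encode `*len` Latin-1
 * bytes as UTF-8. Returns a newly malloc'd, NUL-terminated buffer
 * (caller frees) and updates *len to the new length in bytes. */
unsigned char *latin1_to_utf8(const unsigned char *s, size_t *len)
{
    assert(s != NULL && len != NULL);

    /* Worst case: every byte becomes two, plus the trailing NUL. */
    unsigned char *out = malloc(*len * 2 + 1);
    if (out == NULL)
        return NULL;

    size_t n = 0;
    for (size_t i = 0; i < *len; i++) {
        unsigned char c = s[i];
        if (c < 0x80) {
            out[n++] = c;                         /* invariant byte */
        } else {
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    *len = n;
    return out;
}
```

    In this model the four bytes "caf" plus 0xE9 become five bytes ending in 0xC3 0xA9, followed by the NUL.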
  • foldEQ_utf8

    Returns true if the leading portions of the strings s1 and s2 (either or both of which may be in UTF-8) are the same case-insensitively; false otherwise. How far into the strings to compare is determined by other input parameters.

    If u1 is true, the string s1 is assumed to be in UTF-8-encoded Unicode; otherwise it is assumed to be in native 8-bit encoding. Correspondingly for u2 with respect to s2 .

    If the byte length l1 is non-zero, it says how far into s1 to check for fold equality. In other words, s1 +l1 will be used as a goal to reach. The scan will not be considered to be a match unless the goal is reached, and scanning won't continue past that goal. Correspondingly for l2 with respect to s2 .

    If pe1 is non-NULL and the pointer it points to is not NULL, that pointer is considered an end pointer to the position 1 byte past the maximum point in s1 beyond which scanning will not continue under any circumstances. (This routine assumes that UTF-8 encoded input strings are not malformed; malformed input can cause it to read past pe1 ). This means that if both l1 and pe1 are specified, and pe1 is less than s1 +l1 , the match will never be successful because it can never get as far as its goal (and in fact is asserted against). Correspondingly for pe2 with respect to s2 .

    At least one of s1 and s2 must have a goal (at least one of l1 and l2 must be non-zero), and if both do, both have to be reached for a successful match. Also, if the fold of a character is multiple characters, all of them must be matched (see tr21 reference below for 'folding').

    Upon a successful match, if pe1 is non-NULL, it will be set to point to the beginning of the next character of s1 beyond what was matched. Correspondingly for pe2 and s2 .

    For case-insensitiveness, the "casefolding" of Unicode is used instead of upper/lowercasing both the characters, see http://www.unicode.org/unicode/reports/tr21/ (Case Mappings).

    1. I32 foldEQ_utf8(const char *s1, char **pe1, UV l1,
    2. bool u1, const char *s2, char **pe2,
    3. UV l2, bool u2)
  • is_ascii_string

    Returns true if the first len bytes of the string s are the same whether or not the string is encoded in UTF-8 (or UTF-EBCDIC on EBCDIC machines). That is, if they are invariant. On ASCII-ish machines, only ASCII characters fit this definition, hence the function's name.

    If len is 0, it will be calculated using strlen(s) .

    See also is_utf8_string(), is_utf8_string_loclen(), and is_utf8_string_loc().

    1. bool is_ascii_string(const U8 *s, STRLEN len)
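
    On an ASCII-based platform the invariance test amounts to checking that no byte has its high bit set. The sketch below shows that reading, including the len == 0 convention; all_bytes_invariant is a hypothetical helper and makes no attempt to cover EBCDIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Perl API: true if the first
 * `len` bytes of s are all below 0x80. A len of 0 means "use strlen",
 * so s must then be NUL-terminated. */
bool all_bytes_invariant(const unsigned char *s, size_t len)
{
    assert(s != NULL);
    if (len == 0)
        len = strlen((const char *)s);
    for (size_t i = 0; i < len; i++) {
        if (s[i] >= 0x80)
            return false;
    }
    return true;
}
```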
  • is_utf8_char

    DEPRECATED!

    Tests if some arbitrary number of bytes begins in a valid UTF-8 character. Note that an INVARIANT (i.e. ASCII on non-EBCDIC machines) character is a valid UTF-8 character. The actual number of bytes in the UTF-8 character will be returned if it is valid, otherwise 0.

    This function is deprecated due to the possibility that malformed input could cause reading beyond the end of the input buffer. Use is_utf8_char_buf instead.

    1. STRLEN is_utf8_char(const U8 *s)
  • is_utf8_char_buf

    Returns the number of bytes that comprise the first UTF-8 encoded character in buffer buf . buf_end should point to one position beyond the end of the buffer. 0 is returned if buf does not point to a complete, valid UTF-8 encoded character.

    Note that an INVARIANT character (i.e. ASCII on non-EBCDIC machines) is a valid UTF-8 character.

    1. STRLEN is_utf8_char_buf(const U8 *buf,
    2. const U8 *buf_end)
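
    The role of buf_end is to stop the check from reading past the buffer when the input is truncated. A minimal sketch of that idea follows. utf8_char_len_in_buf is a hypothetical helper restricted to standard one- to four-byte UTF-8; it does not reject overlong three- and four-byte forms and makes no attempt to match the exact set of sequences Perl accepts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Perl API: return the length in
 * bytes of the UTF-8 character at buf, or 0 if the bytes before buf_end
 * do not form a complete character. Never reads at or past buf_end. */
size_t utf8_char_len_in_buf(const unsigned char *buf,
                            const unsigned char *buf_end)
{
    assert(buf != NULL && buf_end != NULL);
    if (buf >= buf_end)
        return 0;

    unsigned char c = buf[0];
    size_t need;
    if (c < 0x80)
        return 1;                       /* invariant (ASCII) character */
    else if (c >= 0xC2 && c <= 0xDF)
        need = 2;
    else if (c >= 0xE0 && c <= 0xEF)
        need = 3;
    else if (c >= 0xF0 && c <= 0xF4)
        need = 4;
    else
        return 0;                       /* stray continuation or bad lead */

    if ((size_t)(buf_end - buf) < need)
        return 0;                       /* character is cut off */
    for (size_t i = 1; i < need; i++) {
        if ((buf[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
    }
    return need;
}
```

    Given the three bytes 0xE2 0x82 0xAC, the model returns 3 when buf_end is three bytes past buf and 0 when it is only two bytes past, without touching the missing byte.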
  • is_utf8_string

    Returns true if the first len bytes of string s form a valid UTF-8 string, false otherwise. If len is 0, it will be calculated using strlen(s) (which means that, if you use this option, s must have a terminating NUL byte). Note that a string consisting entirely of ASCII characters is also 'a valid UTF-8 string'.

    See also is_ascii_string(), is_utf8_string_loclen(), and is_utf8_string_loc().

    1. bool is_utf8_string(const U8 *s, STRLEN len)
  • is_utf8_string_loc

    Like is_utf8_string but stores the location of the failure (in the case of "utf8ness failure") or the location s+len (in the case of "utf8ness success") in ep .

    See also is_utf8_string_loclen() and is_utf8_string().

    1. bool is_utf8_string_loc(const U8 *s, STRLEN len,
    2. const U8 **ep)
  • is_utf8_string_loclen

    Like is_utf8_string() but stores the location of the failure (in the case of "utf8ness failure") or the location s+len (in the case of "utf8ness success") in ep , and the number of UTF-8 encoded characters in el .

    See also is_utf8_string_loc() and is_utf8_string().

    1. bool is_utf8_string_loclen(const U8 *s, STRLEN len,
    2. const U8 **ep, STRLEN *el)
  • pv_uni_display

    Builds in the scalar dsv a displayable version of the string spv , of length len , the displayable version being at most pvlim bytes long (if longer, the rest is truncated and "..." will be appended).

    The flags argument can have UNI_DISPLAY_ISPRINT set to display isPRINT()able characters as themselves, and UNI_DISPLAY_BACKSLASH to display backslash, newline, carriage return, form feed, tab and alert as \\, \n, \r, \f, \t and \a (UNI_DISPLAY_BACKSLASH is preferred over UNI_DISPLAY_ISPRINT for \\). UNI_DISPLAY_QQ (and its alias UNI_DISPLAY_REGEX) have both UNI_DISPLAY_BACKSLASH and UNI_DISPLAY_ISPRINT turned on.

    The pointer to the PV of the dsv is returned.

    1. char* pv_uni_display(SV *dsv, const U8 *spv,
    2. STRLEN len, STRLEN pvlim,
    3. UV flags)
  • sv_cat_decode

    The encoding is assumed to be an Encode object, and the PV of ssv is assumed to be octets in that encoding. Decoding of the input starts at the position (PV + *offset). The decoded UTF-8 string is concatenated onto dsv. Decoding terminates when the string tstr appears in the decoded output or when the input in the PV of ssv ends. The value that offset points to is updated to the last input position in ssv.

    Returns TRUE if the terminator was found, else returns FALSE.

    1. bool sv_cat_decode(SV* dsv, SV *encoding, SV *ssv,
    2. int *offset, char* tstr, int tlen)
  • sv_recode_to_utf8

    The encoding is assumed to be an Encode object, on entry the PV of the sv is assumed to be octets in that encoding, and the sv will be converted into Unicode (and UTF-8).

    If the sv already is UTF-8 (or if it is not POK), or if the encoding is not a reference, nothing is done to the sv. If the encoding is not an Encode::XS Encoding object, bad things will happen. (See lib/encoding.pm and Encode.)

    The PV of the sv is returned.

    1. char* sv_recode_to_utf8(SV* sv, SV *encoding)
  • sv_uni_display

    Builds in the scalar dsv a displayable version of the scalar ssv , the displayable version being at most pvlim bytes long (if longer, the rest is truncated and "..." will be appended).

    The flags argument is as in pv_uni_display().

    The pointer to the PV of the dsv is returned.

    1. char* sv_uni_display(SV *dsv, SV *ssv, STRLEN pvlim,
    2. UV flags)
  • to_utf8_case

    p points to the UTF-8 string encoding the character that is being converted. This routine assumes that the character at p is well-formed.

    ustrp points to the character buffer that receives the conversion result, and lenp points to the length of the result.

    swashp is a pointer to the swash to use.

    Both the special and normal mappings are stored in lib/unicore/To/Foo.pl, and loaded by SWASHNEW, using lib/utf8_heavy.pl. The special (usually, but not always, a multicharacter mapping), is tried first.

    The special is a string like "utf8::ToSpecLower", which means the hash %utf8::ToSpecLower. The access to the hash is through Perl_to_utf8_case().

    The normal is a string like "ToLower" which means the swash %utf8::ToLower.

    1. UV to_utf8_case(const U8 *p, U8* ustrp,
    2. STRLEN *lenp, SV **swashp,
    3. const char *normal,
    4. const char *special)
  • to_utf8_fold

    Convert the UTF-8 encoded character at p to its foldcase version and store that in UTF-8 in ustrp and its length in bytes in lenp . Note that the ustrp needs to be at least UTF8_MAXBYTES_CASE+1 bytes since the foldcase version may be longer than the original character (up to three characters).

    The first character of the foldcased version is returned (but note, as explained above, that there may be more.)

    The character at p is assumed by this routine to be well-formed.

    1. UV to_utf8_fold(const U8 *p, U8* ustrp,
    2. STRLEN *lenp)
  • to_utf8_lower

    Convert the UTF-8 encoded character at p to its lowercase version and store that in UTF-8 in ustrp and its length in bytes in lenp . Note that the ustrp needs to be at least UTF8_MAXBYTES_CASE+1 bytes since the lowercase version may be longer than the original character.

    The first character of the lowercased version is returned (but note, as explained above, that there may be more.)

    The character at p is assumed by this routine to be well-formed.

    1. UV to_utf8_lower(const U8 *p, U8* ustrp,
    2. STRLEN *lenp)
  • to_utf8_title

    Convert the UTF-8 encoded character at p to its titlecase version and store that in UTF-8 in ustrp and its length in bytes in lenp . Note that the ustrp needs to be at least UTF8_MAXBYTES_CASE+1 bytes since the titlecase version may be longer than the original character.

    The first character of the titlecased version is returned (but note, as explained above, that there may be more.)

    The character at p is assumed by this routine to be well-formed.

        UV to_utf8_title(const U8 *p, U8* ustrp,
                         STRLEN *lenp)
  • to_utf8_upper

    Convert the UTF-8 encoded character at p to its uppercase version and store that in UTF-8 in ustrp and its length in bytes in lenp . Note that the ustrp needs to be at least UTF8_MAXBYTES_CASE+1 bytes since the uppercase version may be longer than the original character.

    The first character of the uppercased version is returned (but note, as explained above, that there may be more.)

    The character at p is assumed by this routine to be well-formed.

        UV to_utf8_upper(const U8 *p, U8* ustrp,
                         STRLEN *lenp)
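    The calling contract shared by the case-changing routines above (result written to a caller-supplied buffer, byte length stored through lenp, first character returned) can be illustrated with a standalone toy. This is only a sketch: toy_to_upper is a hypothetical name, it is not Perl's to_utf8_upper(), and it handles just ASCII and U+00DF, which uppercases to the two characters "SS" under Unicode's full case mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy uppercase mapping covering only ASCII and U+00DF (sharp s).
 * NOT Perl's to_utf8_upper(); it exists to show the calling contract. */
static uint32_t toy_to_upper(const unsigned char *p,
                             unsigned char *ustrp, size_t *lenp)
{
    assert(p != NULL && ustrp != NULL && lenp != NULL);
    /* This sketch accepts ASCII or the two-byte encoding of U+00DF only. */
    assert(p[0] < 0x80 || (p[0] == 0xC3 && p[1] == 0x9F));

    if (p[0] == 0xC3) {
        /* One input character becomes two output characters. */
        ustrp[0] = 'S';
        ustrp[1] = 'S';
        *lenp = 2;
    } else {
        ustrp[0] = (p[0] >= 'a' && p[0] <= 'z')
                 ? (unsigned char)(p[0] - 'a' + 'A') : p[0];
        *lenp = 1;
    }
    /* Only the first character of the result is returned. */
    return ustrp[0];
}
```

    A caller that looks only at the return value would see 'S' for U+00DF and miss the second character, which is why the length stored through lenp has to be consulted as well.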
  • utf8n_to_uvchr

    Returns the native character value of the first character in the string s which is assumed to be in UTF-8 encoding; retlen will be set to the length, in bytes, of that character.

    curlen and flags are the same as for utf8n_to_uvuni().

        UV utf8n_to_uvchr(const U8 *s, STRLEN curlen,
                          STRLEN *retlen, U32 flags)
  • utf8n_to_uvuni

    Bottom level UTF-8 decode routine. Returns the code point value of the first character in the string s, which is assumed to be in UTF-8 (or UTF-EBCDIC) encoding, and no longer than curlen bytes; *retlen (if retlen isn't NULL) will be set to the length, in bytes, of that character.

    The value of flags determines the behavior when s does not point to a well-formed UTF-8 character. If flags is 0, when a malformation is found, zero is returned and *retlen is set so that (s + *retlen ) is the next possible position in s that could begin a non-malformed character. Also, if UTF-8 warnings haven't been lexically disabled, a warning is raised.

    Various ALLOW flags can be set in flags to allow (and not warn on) individual types of malformations, such as the sequence being overlong (that is, when there is a shorter sequence that can express the same code point; overlong sequences are expressly forbidden in the UTF-8 standard due to potential security issues). Another malformation example is the first byte of a character not being a legal first byte. See utf8.h for the list of such flags. For allowed 0 length strings, this function returns 0; for allowed overlong sequences, the computed code point is returned; for all other allowed malformations, the Unicode REPLACEMENT CHARACTER is returned, as these have no determinable reasonable value.

    The UTF8_CHECK_ONLY flag overrides the behavior when a non-allowed (by other flags) malformation is found. If this flag is set, the routine assumes that the caller will raise a warning, and this function will silently just set *retlen to -1 (cast to STRLEN ) and return zero.

    Note that this API requires disambiguation between successfully decoding a NUL character and an error return (unless the UTF8_CHECK_ONLY flag is set), as in both cases, 0 is returned. To disambiguate, upon a zero return, see if the first byte of s is 0 as well. If so, the input was a NUL; if not, the input had an error.

    Certain code points are considered problematic. These are Unicode surrogates, Unicode non-characters, and code points above the Unicode maximum of 0x10FFFF. By default these are considered regular code points, but certain situations warrant special handling for them. If flags contains UTF8_DISALLOW_ILLEGAL_INTERCHANGE, all three classes are treated as malformations and handled as such. The flags UTF8_DISALLOW_SURROGATE, UTF8_DISALLOW_NONCHAR, and UTF8_DISALLOW_SUPER (meaning above the legal Unicode maximum) can be set to disallow these categories individually.

    The flags UTF8_WARN_ILLEGAL_INTERCHANGE, UTF8_WARN_SURROGATE, UTF8_WARN_NONCHAR, and UTF8_WARN_SUPER will cause warning messages to be raised for their respective categories, but otherwise the code points are considered valid (not malformations). To get a category to both be treated as a malformation and raise a warning, specify both the WARN and DISALLOW flags. (But note that warnings are not raised if lexically disabled nor if UTF8_CHECK_ONLY is also specified.)

    Very large code points (above 0x7FFF_FFFF) are considered more problematic than the others that are above the Unicode legal maximum. There are several reasons: they require at least 32 bits to represent them on ASCII platforms, are not representable at all on EBCDIC platforms, and the original UTF-8 specification never went above this number (the current 0x10FFFF limit was imposed later). (The smaller ones, those that fit into 32 bits, are representable by a UV on ASCII platforms, but not by an IV, which means that the number of operations that can be performed on them is quite restricted.) The UTF-8 encoding on ASCII platforms for these large code points begins with a byte containing 0xFE or 0xFF. The UTF8_DISALLOW_FE_FF flag will cause them to be treated as malformations, while allowing smaller above-Unicode code points. (Of course UTF8_DISALLOW_SUPER will treat all above-Unicode code points, including these, as malformations.) Similarly, UTF8_WARN_FE_FF acts just like the other WARN flags, but applies just to these code points.

    All other code points corresponding to Unicode characters, including private use and those yet to be assigned, are never considered malformed and never warn.

    Most code should use utf8_to_uvchr_buf() rather than call this directly.

        UV utf8n_to_uvuni(const U8 *s, STRLEN curlen,
                          STRLEN *retlen, U32 flags)
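    The zero-return ambiguity described above can be seen in a small standalone decoder. The following is a sketch under stated assumptions, not Perl's implementation: toy_decode and zero_means_nul are hypothetical names, the decoder is strict (it rejects overlong sequences and handles at most four-byte characters on an ASCII platform), and on error it simply sets *retlen to 1 rather than searching for the next plausible start byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strict decoder; NOT Perl's utf8n_to_uvuni().
 * Returns the code point and stores its byte length in *retlen.
 * On a malformation it returns 0 and sets *retlen to 1 (0 for empty input). */
static uint32_t toy_decode(const unsigned char *s, size_t curlen, size_t *retlen)
{
    /* Smallest code point that legitimately needs N bytes, indexed by N. */
    static const uint32_t min_for_len[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t need, i;
    uint32_t cp;

    assert(s != NULL && retlen != NULL);
    if (curlen == 0) {
        *retlen = 0;
        return 0;                               /* empty input */
    }
    *retlen = 1;

    if (s[0] < 0x80)
        return s[0];                            /* ASCII, including NUL */
    else if ((s[0] & 0xE0) == 0xC0) { need = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; cp = s[0] & 0x07; }
    else
        return 0;                               /* illegal first byte */

    if (need > curlen)
        return 0;                               /* truncated character */
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                           /* missing continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min_for_len[need])
        return 0;                               /* overlong encoding */

    *retlen = need;
    return cp;
}

/* The rule from the text: a zero return means NUL only if the first
 * input byte is itself zero; otherwise the input had an error. */
static int zero_means_nul(const unsigned char *s, size_t curlen)
{
    return curlen > 0 && s[0] == 0;
}
```

    With this sketch, the single byte 0x00 and the overlong pair 0xC0 0x80 both decode to 0, and only the first-byte check tells them apart.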
  • utf8_distance

    Returns the number of UTF-8 characters between the UTF-8 pointers a and b .

    WARNING: use only if you *know* that the pointers point inside the same UTF-8 buffer.

        IV utf8_distance(const U8 *a, const U8 *b)
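    One way to picture a character distance is to count the bytes between the two pointers that are not continuation bytes. The sketch below is a standalone illustration with a hypothetical name (toy_utf8_distance); it assumes well-formed UTF-8 on an ASCII platform, and its sign convention (positive when a lies after b, like the pointer subtraction a - b) is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the characters between a and b by counting start bytes, that is,
 * bytes that do not match the continuation pattern 10xxxxxx.
 * Both pointers must lie in the same well-formed UTF-8 buffer. */
static ptrdiff_t toy_utf8_distance(const unsigned char *a, const unsigned char *b)
{
    const unsigned char *lo, *hi;
    ptrdiff_t n = 0;

    assert(a != NULL && b != NULL);
    lo = a < b ? a : b;
    hi = a < b ? b : a;
    for (; lo < hi; lo++) {
        if ((*lo & 0xC0) != 0x80)
            n++;
    }
    /* Sign follows a - b (an assumption of this sketch). */
    return a < b ? -n : n;
}
```

    The same-buffer warning matters here too: comparing or walking between pointers into different buffers has no defined meaning in C.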
  • utf8_hop

    Return the UTF-8 pointer s displaced by off characters, either forward or backward.

    WARNING: do not use the following unless you *know* off is within the UTF-8 data pointed to by s *and* that on entry s is aligned on the first byte of a character or just after the last byte of a character.

        U8* utf8_hop(const U8 *s, I32 off)
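    The alignment requirement in the warning becomes clearer from a standalone sketch of hopping. The names toy_skip and toy_utf8_hop are hypothetical; the code assumes well-formed UTF-8 of at most four bytes per character on an ASCII platform and does no bounds checking, which is exactly why the caller must know that off stays inside the data.

```c
#include <assert.h>
#include <stddef.h>

/* Byte length of a character, judged from its first byte. */
static size_t toy_skip(unsigned char first)
{
    if (first < 0x80)           return 1;
    if ((first & 0xE0) == 0xC0) return 2;
    if ((first & 0xF0) == 0xE0) return 3;
    if ((first & 0xF8) == 0xF0) return 4;
    return 1;                   /* not a valid start byte; step one byte */
}

/* Moves s forward (off > 0) or backward (off < 0) by |off| characters.
 * s must be aligned on a character boundary; no bounds are checked. */
static const unsigned char *toy_utf8_hop(const unsigned char *s, int off)
{
    assert(s != NULL);
    if (off >= 0) {
        while (off-- > 0)
            s += toy_skip(*s);
    } else {
        while (off++ < 0) {
            s--;                            /* last byte of previous char */
            while ((*s & 0xC0) == 0x80)     /* back up over continuations */
                s--;
        }
    }
    return s;
}
```

    If s pointed into the middle of a character, the forward branch would misread a continuation byte as a start byte, and the result would no longer be aligned.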
  • utf8_length

    Return the length of the UTF-8 char encoded string s in characters. Stops at e (inclusive). If e < s or if the scan would end up past e , croaks.

        STRLEN utf8_length(const U8* s, const U8 *e)
  • utf8_to_bytes

    Converts a string s of length len from UTF-8 into native byte encoding. Unlike bytes_to_utf8, this over-writes the original string, and updates len to contain the new length. Returns zero on failure, setting len to -1.

    If you need a copy of the string, see bytes_from_utf8.

    NOTE: this function is experimental and may change or be removed without notice.

        U8* utf8_to_bytes(U8 *s, STRLEN *len)
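    The in-place overwrite and the failure convention (zero returned, length set to -1) can be sketched in standalone C. This is an illustration only, assuming an ASCII platform where the native byte encoding is Latin-1; toy_utf8_to_bytes is a hypothetical name and not the Perl function.

```c
#include <assert.h>
#include <stddef.h>

/* Converts UTF-8 to single bytes in place. Fails if any character is
 * above 0xFF or malformed: returns NULL and sets *len to (size_t)-1,
 * leaving the buffer untouched. */
static unsigned char *toy_utf8_to_bytes(unsigned char *s, size_t *len)
{
    size_t i;
    unsigned char *d = s;

    assert(s != NULL && len != NULL);

    /* First pass: check that every character fits in one byte.
     * Only lead bytes 0xC2 and 0xC3 encode 0x80..0xFF. */
    for (i = 0; i < *len; ) {
        if (s[i] < 0x80) {
            i++;
        } else if ((s[i] == 0xC2 || s[i] == 0xC3)
                   && i + 1 < *len && (s[i + 1] & 0xC0) == 0x80) {
            i += 2;
        } else {
            *len = (size_t)-1;
            return NULL;
        }
    }

    /* Second pass: overwrite; the output never outruns the input. */
    for (i = 0; i < *len; ) {
        if (s[i] < 0x80) {
            *d++ = s[i++];
        } else {
            *d++ = (unsigned char)(((s[i] & 0x03) << 6) | (s[i + 1] & 0x3F));
            i += 2;
        }
    }
    *len = (size_t)(d - s);
    return s;
}
```

    Validating before writing is a design choice of the sketch: it means a failed call leaves the caller's buffer as it was.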
  • utf8_to_uvchr

    DEPRECATED!

    Returns the native code point of the first character in the string s which is assumed to be in UTF-8 encoding; retlen will be set to the length, in bytes, of that character.

    Some, but not all, UTF-8 malformations are detected, and in fact, some malformed input could cause reading beyond the end of the input buffer, which is why this function is deprecated. Use utf8_to_uvchr_buf instead.

    If s points to one of the detected malformations, and UTF8 warnings are enabled, zero is returned and *retlen is set (if retlen isn't NULL) to -1. If those warnings are off, the computed value if well-defined (or the Unicode REPLACEMENT CHARACTER, if not) is silently returned, and *retlen is set (if retlen isn't NULL) so that (s + *retlen ) is the next possible position in s that could begin a non-malformed character. See utf8n_to_uvuni for details on when the REPLACEMENT CHARACTER is returned.

        UV utf8_to_uvchr(const U8 *s, STRLEN *retlen)
  • utf8_to_uvchr_buf

    Returns the native code point of the first character in the string s which is assumed to be in UTF-8 encoding; send points to 1 beyond the end of s. *retlen will be set to the length, in bytes, of that character.

    If s does not point to a well-formed UTF-8 character and UTF8 warnings are enabled, zero is returned and *retlen is set (if retlen isn't NULL) to -1. If those warnings are off, the computed value, if well-defined (or the Unicode REPLACEMENT CHARACTER if not), is silently returned, and *retlen is set (if retlen isn't NULL) so that (s + *retlen ) is the next possible position in s that could begin a non-malformed character. See utf8n_to_uvuni for details on when the REPLACEMENT CHARACTER is returned.

        UV utf8_to_uvchr_buf(const U8 *s, const U8 *send,
                             STRLEN *retlen)
  • utf8_to_uvuni

    DEPRECATED!

    Returns the Unicode code point of the first character in the string s which is assumed to be in UTF-8 encoding; retlen will be set to the length, in bytes, of that character.

    This function should only be used when the returned UV is considered an index into the Unicode semantic tables (e.g. swashes).

    Some, but not all, UTF-8 malformations are detected, and in fact, some malformed input could cause reading beyond the end of the input buffer, which is why this function is deprecated. Use utf8_to_uvuni_buf instead.

    If s points to one of the detected malformations, and UTF8 warnings are enabled, zero is returned and *retlen is set (if retlen isn't NULL) to -1. If those warnings are off, the computed value, if well-defined (or the Unicode REPLACEMENT CHARACTER, if not), is silently returned, and *retlen is set (if retlen isn't NULL) so that (s + *retlen ) is the next possible position in s that could begin a non-malformed character. See utf8n_to_uvuni for details on when the REPLACEMENT CHARACTER is returned.

        UV utf8_to_uvuni(const U8 *s, STRLEN *retlen)
  • utf8_to_uvuni_buf

    Returns the Unicode code point of the first character in the string s which is assumed to be in UTF-8 encoding; send points to 1 beyond the end of s. *retlen will be set to the length, in bytes, of that character.

    This function should only be used when the returned UV is considered an index into the Unicode semantic tables (e.g. swashes).

    If s does not point to a well-formed UTF-8 character and UTF8 warnings are enabled, zero is returned and *retlen is set (if retlen isn't NULL) to -1. If those warnings are off, the computed value if well-defined (or the Unicode REPLACEMENT CHARACTER, if not) is silently returned, and *retlen is set (if retlen isn't NULL) so that (s + *retlen ) is the next possible position in s that could begin a non-malformed character. See utf8n_to_uvuni for details on when the REPLACEMENT CHARACTER is returned.

        UV utf8_to_uvuni_buf(const U8 *s, const U8 *send,
                             STRLEN *retlen)
  • uvchr_to_utf8

    Adds the UTF-8 representation of the Native code point uv to the end of the string d ; d should have at least UTF8_MAXBYTES+1 free bytes available. The return value is the pointer to the byte after the end of the new character. In other words,

        d = uvchr_to_utf8(d, uv);

    is the recommended wide native character-aware way of saying

        *(d++) = uv;

        U8* uvchr_to_utf8(U8 *d, UV uv)
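    The "append and return the pointer past the new character" idiom can be tried with a standalone encoder. The sketch below uses a hypothetical name (toy_uv_to_utf8), covers only code points up to 0x10FFFF on an ASCII platform, and is not Perl's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the UTF-8 encoding of uv at d and returns the position just
 * past it, so calls can be chained as d = toy_uv_to_utf8(d, uv).
 * The caller must provide at least 4 free bytes at d. */
static unsigned char *toy_uv_to_utf8(unsigned char *d, uint32_t uv)
{
    assert(d != NULL && uv <= 0x10FFFF);
    if (uv < 0x80) {
        *d++ = (unsigned char)uv;
    } else if (uv < 0x800) {
        *d++ = (unsigned char)(0xC0 | (uv >> 6));
        *d++ = (unsigned char)(0x80 | (uv & 0x3F));
    } else if (uv < 0x10000) {
        *d++ = (unsigned char)(0xE0 | (uv >> 12));
        *d++ = (unsigned char)(0x80 | ((uv >> 6) & 0x3F));
        *d++ = (unsigned char)(0x80 | (uv & 0x3F));
    } else {
        *d++ = (unsigned char)(0xF0 | (uv >> 18));
        *d++ = (unsigned char)(0x80 | ((uv >> 12) & 0x3F));
        *d++ = (unsigned char)(0x80 | ((uv >> 6) & 0x3F));
        *d++ = (unsigned char)(0x80 | (uv & 0x3F));
    }
    return d;
}
```

    Returning the advanced pointer is what lets the call stand in for *(d++) = uv: the caller never needs to know how many bytes a given code point took.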
  • uvuni_to_utf8_flags

    Adds the UTF-8 representation of the Unicode code point uv to the end of the string d ; d should have at least UTF8_MAXBYTES+1 free bytes available. The return value is the pointer to the byte after the end of the new character. In other words,

        d = uvuni_to_utf8_flags(d, uv, flags);

    or, in most cases,

        d = uvuni_to_utf8(d, uv);

    (which is equivalent to)

        d = uvuni_to_utf8_flags(d, uv, 0);

    This is the recommended Unicode-aware way of saying

        *(d++) = uv;

    where uv is a code point expressed in Latin-1 or above, not the platform's native character set. Almost all code should instead use uvchr_to_utf8 or uvchr_to_utf8_flags.

    This function will convert to UTF-8 (and not warn) even code points that aren't legal Unicode or are problematic, unless flags contains one or more of the following flags:

    If uv is a Unicode surrogate code point and UNICODE_WARN_SURROGATE is set, the function will raise a warning, provided UTF8 warnings are enabled. If instead UNICODE_DISALLOW_SURROGATE is set, the function will fail and return NULL. If both flags are set, the function will both warn and return NULL.

    The UNICODE_WARN_NONCHAR and UNICODE_DISALLOW_NONCHAR flags correspondingly affect how the function handles a Unicode non-character. Likewise, the UNICODE_WARN_SUPER and UNICODE_DISALLOW_SUPER flags affect the handling of code points that are above the Unicode maximum of 0x10FFFF. Code points above 0x7FFF_FFFF (which are even less portable) can be warned about and/or disallowed, even if other above-Unicode code points are accepted, by using the UNICODE_WARN_FE_FF and UNICODE_DISALLOW_FE_FF flags.

    And finally, the flag UNICODE_WARN_ILLEGAL_INTERCHANGE selects all four of the above WARN flags; and UNICODE_DISALLOW_ILLEGAL_INTERCHANGE selects all four DISALLOW flags.

        U8* uvuni_to_utf8_flags(U8 *d, UV uv, UV flags)
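    The three problematic categories named above (surrogates, non-characters, and code points above 0x10FFFF) are defined by fixed ranges, so a classifier is easy to sketch. The TOY_* constants and function names below are hypothetical stand-ins for illustration; they are not the UNICODE_* flags from Perl's headers, and the very large (above 0x7FFF_FFFF) category is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical category bits; not Perl's UNICODE_* flag values. */
enum { TOY_SURROGATE = 1, TOY_NONCHAR = 2, TOY_SUPER = 4 };

/* Returns the category bit for a problematic code point, or 0. */
static unsigned toy_classify(uint32_t uv)
{
    if (uv >= 0xD800 && uv <= 0xDFFF)
        return TOY_SURROGATE;
    if (uv > 0x10FFFF)
        return TOY_SUPER;
    /* Non-characters: U+FDD0..U+FDEF, plus the last two code points
     * (xxFFFE and xxFFFF) of every plane. */
    if ((uv >= 0xFDD0 && uv <= 0xFDEF) || (uv & 0xFFFE) == 0xFFFE)
        return TOY_NONCHAR;
    return 0;
}

/* True if uv falls in any category the caller chose to disallow. */
static int toy_is_disallowed(uint32_t uv, unsigned disallow)
{
    assert((disallow & ~(unsigned)(TOY_SURROGATE | TOY_NONCHAR | TOY_SUPER)) == 0);
    return (toy_classify(uv) & disallow) != 0;
}
```

    Passing all three bits corresponds in spirit to an "illegal interchange" setting, while passing 0 accepts everything, matching the default described above.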

Variables created by xsubpp and xsubpp internal functions

  • ax

    Variable which is set up by xsubpp to indicate the stack base offset, used by the ST , XSprePUSH and XSRETURN macros. The dMARK macro must be called beforehand to set up the MARK variable.

        I32 ax
  • CLASS

    Variable which is set up by xsubpp to indicate the class name for a C++ XS constructor. This is always a char* . See THIS .

        char* CLASS
  • dAX

    Sets up the ax variable. This is usually handled automatically by xsubpp by calling dXSARGS .

        dAX;
  • dAXMARK

    Sets up the ax variable and stack marker variable mark . This is usually handled automatically by xsubpp by calling dXSARGS .

        dAXMARK;
  • dITEMS

    Sets up the items variable. This is usually handled automatically by xsubpp by calling dXSARGS .

        dITEMS;
  • dUNDERBAR

    Sets up any variable needed by the UNDERBAR macro. It used to define padoff_du , but it is currently a no-op. However, using it is still strongly advised, to ensure past and future compatibility.

        dUNDERBAR;
  • dXSARGS

    Sets up stack and mark pointers for an XSUB, calling dSP and dMARK. Sets up the ax and items variables by calling dAX and dITEMS . This is usually handled automatically by xsubpp .

        dXSARGS;
  • dXSI32

    Sets up the ix variable for an XSUB which has aliases. This is usually handled automatically by xsubpp .

        dXSI32;
  • items

    Variable which is set up by xsubpp to indicate the number of items on the stack. See Variable-length Parameter Lists in perlxs.

        I32 items
  • ix

    Variable which is set up by xsubpp to indicate which of an XSUB's aliases was used to invoke it. See The ALIAS: Keyword in perlxs.

        I32 ix
  • newXSproto

    Used by xsubpp to hook up XSUBs as Perl subs. Adds Perl prototypes to the subs.

  • RETVAL

    Variable which is set up by xsubpp to hold the return value for an XSUB. This is always the proper type for the XSUB. See The RETVAL Variable in perlxs.

        (whatever) RETVAL
  • ST

    Used to access elements on the XSUB's stack.

        SV* ST(int ix)
  • THIS

    Variable which is set up by xsubpp to designate the object in a C++ XSUB. This is always the proper type for the C++ object. See CLASS and Using XS With C++ in perlxs.

        (whatever) THIS
  • UNDERBAR

    The SV* corresponding to the $_ variable. Works even if there is a lexical $_ in scope.

  • XS

    Macro to declare an XSUB and its C parameter list. This is handled by xsubpp . It is the same as using the more explicit XS_EXTERNAL macro.

  • XS_APIVERSION_BOOTCHECK

    Macro to verify that the Perl API version an XS module has been compiled against matches the API version of the Perl interpreter it's being loaded into.

        XS_APIVERSION_BOOTCHECK;
  • XS_EXTERNAL

    Macro to declare an XSUB and its C parameter list explicitly exporting the symbols.

  • XS_INTERNAL

    Macro to declare an XSUB and its C parameter list without exporting the symbols. This is handled by xsubpp and generally preferable over exporting the XSUB symbols unnecessarily.

  • XS_VERSION

    The version identifier for an XS module. This is usually handled automatically by ExtUtils::MakeMaker . See XS_VERSION_BOOTCHECK .

  • XS_VERSION_BOOTCHECK

    Macro to verify that a PM module's $VERSION variable matches the XS module's XS_VERSION variable. This is usually handled automatically by xsubpp . See The VERSIONCHECK: Keyword in perlxs.

        XS_VERSION_BOOTCHECK;

Warning and Dying

  • croak

    This is an XS interface to Perl's die function.

    Take a sprintf-style format pattern and argument list. These are used to generate a string message. If the message does not end with a newline, then it will be extended with some indication of the current location in the code, as described for mess_sv.

    The error message will be used as an exception, by default returning control to the nearest enclosing eval, but subject to modification by a $SIG{__DIE__} handler. In any case, the croak function never returns normally.

    For historical reasons, if pat is null then the contents of ERRSV ($@ ) will be used as an error message or object instead of building an error message from arguments. If you want to throw a non-string object, or build an error message in an SV yourself, it is preferable to use the croak_sv function, which does not involve clobbering ERRSV .

        void croak(const char *pat, ...)
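    The newline rule shared by croak and warn (a message without a trailing newline gets location information appended) can be modeled in standalone C. This is a sketch only: toy_mess is a hypothetical helper, the file and line are passed in explicitly rather than taken from the interpreter, and the exact suffix text is an assumption modeled on the familiar "at FILE line N." wording; see mess_sv for what Perl really appends.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Formats a message into buf. If the formatted text does not end with a
 * newline, a location suffix is appended. Returns 0, or -1 on a
 * formatting error. Output is truncated to fit size bytes. */
static int toy_mess(char *buf, size_t size, const char *file, int line,
                    const char *pat, ...)
{
    va_list args;
    int n;
    size_t used;

    assert(buf != NULL && size > 0 && file != NULL && pat != NULL);

    va_start(args, pat);
    n = vsnprintf(buf, size, pat, args);
    va_end(args);
    if (n < 0)
        return -1;

    /* vsnprintf reports the untruncated length; clamp it to the buffer. */
    used = (size_t)n < size ? (size_t)n : size - 1;

    if (used == 0 || buf[used - 1] != '\n')
        snprintf(buf + used, size - used, " at %s line %d.\n", file, line);
    return 0;
}
```

    In practice this means that ending a pattern with "\n" is how a caller suppresses the location text.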
  • croak_no_modify

    Exactly equivalent to Perl_croak(aTHX_ "%s", PL_no_modify) , but generates terser object code than using Perl_croak . Less code used on exception code paths reduces CPU cache pressure.

        void croak_no_modify()
  • croak_sv

    This is an XS interface to Perl's die function.

    baseex is the error message or object. If it is a reference, it will be used as-is. Otherwise it is used as a string, and if it does not end with a newline then it will be extended with some indication of the current location in the code, as described for mess_sv.

    The error message or object will be used as an exception, by default returning control to the nearest enclosing eval, but subject to modification by a $SIG{__DIE__} handler. In any case, the croak_sv function never returns normally.

    To die with a simple string message, the croak function may be more convenient.

        void croak_sv(SV *baseex)
  • die

    Behaves the same as croak, except for the return type. It should be used only where the OP * return type is required. The function never actually returns.

        OP * die(const char *pat, ...)
  • die_sv

    Behaves the same as croak_sv, except for the return type. It should be used only where the OP * return type is required. The function never actually returns.

        OP * die_sv(SV *baseex)
  • vcroak

    This is an XS interface to Perl's die function.

    pat and args are a sprintf-style format pattern and encapsulated argument list. These are used to generate a string message. If the message does not end with a newline, then it will be extended with some indication of the current location in the code, as described for mess_sv.

    The error message will be used as an exception, by default returning control to the nearest enclosing eval, but subject to modification by a $SIG{__DIE__} handler. In any case, the vcroak function never returns normally.

    For historical reasons, if pat is null then the contents of ERRSV ($@ ) will be used as an error message or object instead of building an error message from arguments. If you want to throw a non-string object, or build an error message in an SV yourself, it is preferable to use the croak_sv function, which does not involve clobbering ERRSV .

        void vcroak(const char *pat, va_list *args)
  • vwarn

    This is an XS interface to Perl's warn function.

    pat and args are a sprintf-style format pattern and encapsulated argument list. These are used to generate a string message. If the message does not end with a newline, then it will be extended with some indication of the current location in the code, as described for mess_sv.

    The error message or object will by default be written to standard error, but this is subject to modification by a $SIG{__WARN__} handler.

    Unlike with vcroak, pat is not permitted to be null.

        void vwarn(const char *pat, va_list *args)
  • warn

    This is an XS interface to Perl's warn function.

    Take a sprintf-style format pattern and argument list. These are used to generate a string message. If the message does not end with a newline, then it will be extended with some indication of the current location in the code, as described for mess_sv.

    The error message or object will by default be written to standard error, but this is subject to modification by a $SIG{__WARN__} handler.

    Unlike with croak, pat is not permitted to be null.

        void warn(const char *pat, ...)
  • warn_sv

    This is an XS interface to Perl's warn function.

    baseex is the error message or object. If it is a reference, it will be used as-is. Otherwise it is used as a string, and if it does not end with a newline then it will be extended with some indication of the current location in the code, as described for mess_sv.

    The error message or object will by default be written to standard error, but this is subject to modification by a $SIG{__WARN__} handler.

    To warn with a simple string message, the warn function may be more convenient.

        void warn_sv(SV *baseex)

Undocumented functions

The following functions have been flagged as part of the public API, but are currently undocumented. Use them at your own risk, as the interfaces are subject to change. Functions that are not listed in this document are not intended for public use, and should NOT be used under any circumstances.

If you use one of the undocumented functions below, you may wish to consider creating and submitting documentation for it. If your patch is accepted, this will indicate that the interface is stable (unless it is explicitly marked otherwise).

  • GetVars
  • Gv_AMupdate
  • PerlIO_clearerr
  • PerlIO_close
  • PerlIO_context_layers
  • PerlIO_eof
  • PerlIO_error
  • PerlIO_fileno
  • PerlIO_fill
  • PerlIO_flush
  • PerlIO_get_base
  • PerlIO_get_bufsiz
  • PerlIO_get_cnt
  • PerlIO_get_ptr
  • PerlIO_read
  • PerlIO_seek
  • PerlIO_set_cnt
  • PerlIO_set_ptrcnt
  • PerlIO_setlinebuf
  • PerlIO_stderr
  • PerlIO_stdin
  • PerlIO_stdout
  • PerlIO_tell
  • PerlIO_unread
  • PerlIO_write
  • amagic_call
  • amagic_deref_call
  • any_dup
  • atfork_lock
  • atfork_unlock
  • av_arylen_p
  • av_iter_p
  • block_gimme
  • call_atexit
  • call_list
  • calloc
  • cast_i32
  • cast_iv
  • cast_ulong
  • cast_uv
  • ck_warner
  • ck_warner_d
  • ckwarn
  • ckwarn_d
  • clone_params_del
  • clone_params_new
  • croak_nocontext
  • csighandler
  • cx_dump
  • cx_dup
  • cxinc
  • deb
  • deb_nocontext
  • debop
  • debprofdump
  • debstack
  • debstackptrs
  • delimcpy
  • despatch_signals
  • die_nocontext
  • dirp_dup
  • do_aspawn
  • do_binmode
  • do_close
  • do_gv_dump
  • do_gvgv_dump
  • do_hv_dump
  • do_join
  • do_magic_dump
  • do_op_dump
  • do_open
  • do_open9
  • do_openn
  • do_pmop_dump
  • do_spawn
  • do_spawn_nowait
  • do_sprintf
  • do_sv_dump
  • doing_taint
  • doref
  • dounwind
  • dowantarray
  • dump_all
  • dump_eval
  • dump_fds
  • dump_form
  • dump_indent
  • dump_mstats
  • dump_packsubs
  • dump_sub
  • dump_vindent
  • filter_add
  • filter_del
  • filter_read
  • foldEQ_latin1
  • form_nocontext
  • fp_dup
  • fprintf_nocontext
  • free_global_struct
  • free_tmps
  • get_context
  • get_mstats
  • get_op_descs
  • get_op_names
  • get_ppaddr
  • get_vtbl
  • gp_dup
  • gp_free
  • gp_ref
  • gv_AVadd
  • gv_HVadd
  • gv_IOadd
  • gv_SVadd
  • gv_add_by_type
  • gv_autoload4
  • gv_autoload_pv
  • gv_autoload_pvn
  • gv_autoload_sv
  • gv_check
  • gv_dump
  • gv_efullname
  • gv_efullname3
  • gv_efullname4
  • gv_fetchfile
  • gv_fetchfile_flags
  • gv_fetchpv
  • gv_fetchpvn_flags
  • gv_fetchsv
  • gv_fullname
  • gv_fullname3
  • gv_fullname4
  • gv_handler
  • gv_name_set
  • he_dup
  • hek_dup
  • hv_common
  • hv_common_key_len
  • hv_delayfree_ent
  • hv_eiter_p
  • hv_eiter_set
  • hv_free_ent
  • hv_ksplit
  • hv_name_set
  • hv_placeholders_get
  • hv_placeholders_p
  • hv_placeholders_set
  • hv_rand_set
  • hv_riter_p
  • hv_riter_set
  • init_global_struct
  • init_i18nl10n
  • init_i18nl14n
  • init_stacks
  • init_tm
  • instr
  • is_lvalue_sub
  • leave_scope
  • load_module_nocontext
  • magic_dump
  • malloc
  • markstack_grow
  • mess_nocontext
  • mfree
  • mg_dup
  • mg_size
  • mini_mktime
  • moreswitches
  • mro_get_from_name
  • mro_get_private_data
  • mro_set_mro
  • mro_set_private_data
  • my_atof
  • my_atof2
  • my_bcopy
  • my_bzero
  • my_chsize
  • my_cxt_index
  • my_cxt_init
  • my_dirfd
  • my_exit
  • my_failure_exit
  • my_fflush_all
  • my_fork
  • my_htonl
  • my_lstat
  • my_memcmp
  • my_memset
  • my_ntohl
  • my_pclose
  • my_popen
  • my_popen_list
  • my_setenv
  • my_socketpair
  • my_stat
  • my_strftime
  • my_strlcat
  • my_strlcpy
  • my_swap
  • newANONATTRSUB
  • newANONHASH
  • newANONLIST
  • newANONSUB
  • newATTRSUB
  • newAVREF
  • newCVREF
  • newFORM
  • newGVREF
  • newGVgen
  • newGVgen_flags
  • newHVREF
  • newHVhv
  • newIO
  • newMYSUB
  • newPROG
  • newRV
  • newSUB
  • newSVREF
  • newSVpvf_nocontext
  • new_collate
  • new_ctype
  • new_numeric
  • new_stackinfo
  • ninstr
  • op_dump
  • op_free
  • op_null
  • op_refcnt_lock
  • op_refcnt_unlock
  • parser_dup
  • perl_alloc_using
  • perl_clone_using
  • pmop_dump
  • pop_scope
  • pregcomp
  • pregexec
  • pregfree
  • pregfree2
  • printf_nocontext
  • ptr_table_clear
  • ptr_table_fetch
  • ptr_table_free
  • ptr_table_new
  • ptr_table_split
  • ptr_table_store
  • push_scope
  • re_compile
  • re_dup_guts
  • re_intuit_start
  • re_intuit_string
  • realloc
  • reentrant_free
  • reentrant_init
  • reentrant_retry
  • reentrant_size
  • ref
  • reg_named_buff_all
  • reg_named_buff_exists
  • reg_named_buff_fetch
  • reg_named_buff_firstkey
  • reg_named_buff_nextkey
  • reg_named_buff_scalar
  • regclass_swash
  • regdump
  • regdupe_internal
  • regexec_flags
  • regfree_internal
  • reginitcolors
  • regnext
  • repeatcpy
  • rninstr
  • rsignal
  • rsignal_state
  • runops_debug
  • runops_standard
  • rvpv_dup
  • safesyscalloc
  • safesysfree
  • safesysmalloc
  • safesysrealloc
  • save_I16
  • save_I32
  • save_I8
  • save_adelete
  • save_aelem
  • save_aelem_flags
  • save_alloc
  • save_aptr
  • save_ary
  • save_bool
  • save_clearsv
  • save_delete
  • save_destructor
  • save_destructor_x
  • save_freeop
  • save_freepv
  • save_freesv
  • save_generic_pvref
  • save_generic_svref
  • save_gp
  • save_hash
  • save_hdelete
  • save_helem
  • save_helem_flags
  • save_hints
  • save_hptr
  • save_int
  • save_item
  • save_iv
  • save_list
  • save_long
  • save_mortalizesv
  • save_nogv
  • save_op
  • save_padsv_and_mortalize
  • save_pptr
  • save_pushi32ptr
  • save_pushptr
  • save_pushptrptr
  • save_re_context
  • save_scalar
  • save_set_svflags
  • save_shared_pvref
  • save_sptr
  • save_svref
  • save_vptr
  • savestack_grow
  • savestack_grow_cnt
  • scan_num
  • scan_vstring
  • screaminstr
  • seed
  • set_context
  • set_numeric_local
  • set_numeric_radix
  • set_numeric_standard
  • share_hek
  • si_dup
  • ss_dup
  • stack_grow
  • start_subparse
  • str_to_version
  • sv_2iv
  • sv_2pv
  • sv_2uv
  • sv_catpvf_mg_nocontext
  • sv_catpvf_nocontext
  • sv_dump
  • sv_dup
  • sv_dup_inc
  • sv_peek
  • sv_pvn_nomg
  • sv_setpvf_mg_nocontext
  • sv_setpvf_nocontext
  • sv_utf8_upgrade_flags_grow
  • swash_fetch
  • swash_init
  • sys_init
  • sys_init3
  • sys_intern_clear
  • sys_intern_dup
  • sys_intern_init
  • sys_term
  • taint_env
  • taint_proper
  • tmps_grow
  • unlnk
  • unsharepvn
  • utf16_to_utf8
  • utf16_to_utf8_reversed
  • uvchr_to_utf8_flags
  • uvuni_to_utf8
  • vdeb
  • vform
  • vload_module
  • vnewSVpvf
  • vwarner
  • warn_nocontext
  • warner
  • warner_nocontext
  • whichsig
  • whichsig_pv
  • whichsig_pvn
  • whichsig_sv

AUTHORS

Until May 1997, this document was maintained by Jeff Okamoto <okamoto@corp.hp.com>. It is now maintained as part of Perl itself.

With lots of help and suggestions from Dean Roehrich, Malcolm Beattie, Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, Neil Bowers, Matthew Green, Tim Bunce, Spider Boardman, Ulrich Pfeifer, Stephen McCamant, and Gurusamy Sarathy.

API Listing originally by Dean Roehrich <roehrich@cray.com>.

Updated to be autogenerated from comments in the source by Benjamin Stuhl.

SEE ALSO

perlguts, perlxs, perlxstut, perlintern

 
perlapio

NAME

perlapio - perl's IO abstraction interface.

SYNOPSIS

    #define PERLIO_NOT_STDIO 0 /* For co-existence with stdio only */
    #include <perlio.h> /* Usually via #include <perl.h> */
    PerlIO *PerlIO_stdin(void);
    PerlIO *PerlIO_stdout(void);
    PerlIO *PerlIO_stderr(void);
    PerlIO *PerlIO_open(const char *path,const char *mode);
    PerlIO *PerlIO_fdopen(int fd, const char *mode);
    PerlIO *PerlIO_reopen(const char *path, const char *mode, PerlIO *old); /* deprecated */
    int PerlIO_close(PerlIO *f);
    int PerlIO_stdoutf(const char *fmt,...);
    int PerlIO_puts(PerlIO *f,const char *string);
    int PerlIO_putc(PerlIO *f,int ch);
    int PerlIO_write(PerlIO *f,const void *buf,size_t numbytes);
    int PerlIO_printf(PerlIO *f, const char *fmt,...);
    int PerlIO_vprintf(PerlIO *f, const char *fmt, va_list args);
    int PerlIO_flush(PerlIO *f);
    int PerlIO_eof(PerlIO *f);
    int PerlIO_error(PerlIO *f);
    void PerlIO_clearerr(PerlIO *f);
    int PerlIO_getc(PerlIO *d);
    int PerlIO_ungetc(PerlIO *f,int ch);
    int PerlIO_read(PerlIO *f, void *buf, size_t numbytes);
    int PerlIO_fileno(PerlIO *f);
    void PerlIO_setlinebuf(PerlIO *f);
    Off_t PerlIO_tell(PerlIO *f);
    int PerlIO_seek(PerlIO *f, Off_t offset, int whence);
    void PerlIO_rewind(PerlIO *f);
    int PerlIO_getpos(PerlIO *f, SV *save); /* prototype changed */
    int PerlIO_setpos(PerlIO *f, SV *saved); /* prototype changed */
    int PerlIO_fast_gets(PerlIO *f);
    int PerlIO_has_cntptr(PerlIO *f);
    int PerlIO_get_cnt(PerlIO *f);
    char *PerlIO_get_ptr(PerlIO *f);
    void PerlIO_set_ptrcnt(PerlIO *f, char *ptr, int count);
    int PerlIO_canset_cnt(PerlIO *f); /* deprecated */
    void PerlIO_set_cnt(PerlIO *f, int count); /* deprecated */
    int PerlIO_has_base(PerlIO *f);
    char *PerlIO_get_base(PerlIO *f);
    int PerlIO_get_bufsiz(PerlIO *f);
    PerlIO *PerlIO_importFILE(FILE *stdio, const char *mode);
    FILE *PerlIO_exportFILE(PerlIO *f, int flags);
    FILE *PerlIO_findFILE(PerlIO *f);
    void PerlIO_releaseFILE(PerlIO *f,FILE *stdio);
    int PerlIO_apply_layers(PerlIO *f, const char *mode, const char *layers);
    int PerlIO_binmode(PerlIO *f, int ptype, int imode, const char *layers);
    void PerlIO_debug(const char *fmt,...);

DESCRIPTION

Perl's source code, and extensions that want maximum portability, should use the above functions instead of those defined in ANSI C's stdio.h. The perl headers (in particular "perlio.h") will #define them to the I/O mechanism selected at Configure time.

The functions are modeled on those in stdio.h, but parameter order has been "tidied up a little".

PerlIO * takes the place of FILE *. Like FILE * it should be treated as opaque (it is probably safe to assume it is a pointer to something).

There are currently three implementations:

1.
USE_STDIO

All of the above are #define'd to stdio functions or are trivial wrapper functions which call stdio. Only in this case is PerlIO * a FILE *. This has been the default implementation since the abstraction was introduced in perl5.003_02.

2.
USE_SFIO

A "legacy" implementation in terms of the "sfio" library. Used for some specialist applications on Unix machines ("sfio" is not widely ported away from Unix). Most of the above are #define'd to the sfio functions. PerlIO * is in this case Sfio_t *.

3.
USE_PERLIO

Introduced just after perl5.7.0, this is a re-implementation of the above abstraction which allows perl more control over how IO is done as it decouples IO from the way the operating system and C library choose to do things. For USE_PERLIO PerlIO * has an extra layer of indirection - it is a pointer-to-a-pointer. This allows the PerlIO * to remain with a known value while swapping the implementation around underneath at run time. In this case all the above are true (but very simple) functions which call the underlying implementation.

This is the only implementation for which PerlIO_apply_layers() does anything "interesting".

The USE_PERLIO implementation is described in perliol.
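The pointer-to-a-pointer arrangement described for USE_PERLIO can be illustrated with a small sketch. This is a simplified model, not perl's real data structures: the names DemoImpl, DemoIO, demo_putc and demo_swap are hypothetical and exist only for this illustration. It shows how the handle held by the caller keeps its value while the implementation behind it is replaced at run time.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for one I/O implementation ("layer"). */
typedef struct DemoImpl {
    const char *name;
    int (*filter)(int ch);   /* what this implementation does to a byte */
} DemoImpl;

/* The caller holds a DemoIO *, i.e. a pointer to a pointer. */
typedef DemoImpl *DemoIO;

static int pass_through(int ch) { return ch; }
static int upper_case(int ch)   { return toupper((unsigned char)ch); }

static DemoImpl demo_raw   = { "raw",   pass_through };
static DemoImpl demo_upper = { "upper", upper_case };

/* Every call goes through the slot, so it sees the current implementation. */
int demo_putc(DemoIO *f, int ch)
{
    assert(f != NULL && *f != NULL);
    return (*f)->filter(ch);
}

/* Swap the implementation underneath; the caller's DemoIO * is unchanged. */
void demo_swap(DemoIO *f, DemoImpl *impl)
{
    assert(f != NULL && impl != NULL);
    *f = impl;
}
```

A caller would keep one DemoIO * for the lifetime of the handle and call demo_swap() on it whenever the behaviour needs to change, which is loosely what applying a layer does in the real implementation.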

Because "perlio.h" is a thin layer (for efficiency) the semantics of these functions are somewhat dependent on the underlying implementation. Where these variations are understood they are noted below.

Unless otherwise noted, functions return 0 on success, or a negative value (usually EOF which is usually -1) and set errno on error.

  • PerlIO_stdin(), PerlIO_stdout(), PerlIO_stderr()

    Use these rather than stdin, stdout, stderr. They are written to look like "function calls" rather than variables because this makes it easier to make them function calls if a platform cannot export data to loaded modules, or if (say) different "threads" might have different values.

  • PerlIO_open(path, mode), PerlIO_fdopen(fd,mode)

    These correspond to fopen()/fdopen() and the arguments are the same. Return NULL and set errno if there is an error. There may be an implementation limit on the number of open handles, which may be lower than the limit on the number of open files - errno may not be set when NULL is returned if this limit is exceeded.

  • PerlIO_reopen(path,mode,f)

    While this currently exists in all three implementations perl itself does not use it. As perl does not use it, it is not well tested.

    Perl prefers to dup the new low-level descriptor to the descriptor used by the existing PerlIO. This may become the behaviour of this function in the future.

  • PerlIO_printf(f,fmt,...), PerlIO_vprintf(f,fmt,a)

    These are fprintf()/vfprintf() equivalents.

  • PerlIO_stdoutf(fmt,...)

    This is the printf() equivalent. printf is #defined to this function, so it is (currently) legal to use printf(fmt,...) in perl sources.

  • PerlIO_read(f,buf,count), PerlIO_write(f,buf,count)

    These correspond functionally to fread() and fwrite() but the arguments and return values are different. The PerlIO_read() and PerlIO_write() signatures have been modeled on the more sane low level read() and write() functions instead: the "file" argument is passed first, there is only one "count", and the return value can distinguish between error and EOF.

    Returns a byte count if successful (which may be zero or positive), or returns a negative value and sets errno on error. Depending on the implementation, errno may be EINTR if the operation was interrupted by a signal.

  • PerlIO_close(f)

    Depending on the implementation, errno may be EINTR if the operation was interrupted by a signal.

  • PerlIO_puts(f,s), PerlIO_putc(f,c)

    These correspond to fputs() and fputc(). Note that arguments have been revised to have "file" first.

  • PerlIO_ungetc(f,c)

    This corresponds to ungetc(). Note that arguments have been revised to have "file" first. Arranges that the next read operation will return the byte c. Despite the implied "character" in the name only values in the range 0..0xFF are defined. Returns the byte c on success or -1 (EOF) on error. The number of bytes that can be "pushed back" may vary; only 1 character is certain, and then only if it is the last character that was read from the handle.

  • PerlIO_getc(f)

    This corresponds to getc(). Despite the c in the name only byte range 0..0xFF is supported. Returns the character read or -1 (EOF) on error.

  • PerlIO_eof(f)

    This corresponds to feof(). Returns a true/false indication of whether the handle is at end of file. For terminal devices this may or may not be "sticky" depending on the implementation. The flag is cleared by PerlIO_seek(), or PerlIO_rewind().

  • PerlIO_error(f)

    This corresponds to ferror(). Returns a true/false indication of whether there has been an IO error on the handle.

  • PerlIO_fileno(f)

    This corresponds to fileno(). Note that on some platforms the meaning of "fileno" may not match Unix. Returns -1 if the handle has no open descriptor associated with it.

  • PerlIO_clearerr(f)

    This corresponds to clearerr(), i.e., clears 'error' and (usually) 'eof' flags for the "stream". Does not return a value.

  • PerlIO_flush(f)

    This corresponds to fflush(). Sends any buffered write data to the underlying file. If called with NULL this may flush all open streams (or core dump with some USE_STDIO implementations). Calling on a handle open for read only, or on which the last operation was a read of some kind, may lead to undefined behaviour on some USE_STDIO implementations. The USE_PERLIO (layers) implementation tries to behave better: it flushes all open streams when passed NULL, and attempts to retain data on read streams either in the buffer or by seeking the handle to the current logical position.

  • PerlIO_seek(f,offset,whence)

    This corresponds to fseek(). Sends buffered write data to the underlying file, or discards any buffered read data, then positions the file descriptor as specified by offset and whence (sic). This is the correct thing to do when switching between read and write on the same handle (see issues with PerlIO_flush() above). Offset is of type Off_t which is a perl Configure value which may not be the same as stdio's off_t.

  • PerlIO_tell(f)

    This corresponds to ftell(). Returns the current file position, or (Off_t) -1 on error. May just return the value the system "knows" without making a system call or checking the underlying file descriptor (so use on shared file descriptors is not safe without a PerlIO_seek()). Return value is of type Off_t which is a perl Configure value which may not be the same as stdio's off_t.

  • PerlIO_getpos(f,p), PerlIO_setpos(f,p)

    These correspond (loosely) to fgetpos() and fsetpos(). Rather than stdio's Fpos_t they expect a "Perl Scalar Value" to be passed. What is stored there should be considered opaque. The layout of the data may vary from handle to handle. When not using stdio, or if the platform does not have the stdio calls, they are implemented in terms of PerlIO_tell() and PerlIO_seek().

  • PerlIO_rewind(f)

    This corresponds to rewind(). It is usually defined as being

      PerlIO_seek(f,(Off_t)0L, SEEK_SET);
      PerlIO_clearerr(f);

  • PerlIO_tmpfile()

    This corresponds to tmpfile(), i.e., returns an anonymous PerlIO or NULL on error. The system will attempt to automatically delete the file when closed. On Unix the file is usually unlink-ed just after it is created so it does not matter how it gets closed. On other systems the file may only be deleted if closed via PerlIO_close() and/or the program exits via exit. Depending on the implementation there may be "race conditions" which allow other processes access to the file, though in general it will be safer in this regard than ad hoc schemes.

  • PerlIO_setlinebuf(f)

    This corresponds to setlinebuf(). Does not return a value. What constitutes a "line" is implementation dependent but usually means that writing "\n" flushes the buffer. What happens with things like "this\nthat" is uncertain. (Perl core uses it only when "dumping"; it has nothing to do with $| auto-flush.)
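The return convention described above for PerlIO_read() and PerlIO_write() (a byte count on success, zero at end of file, a negative value with errno set on error, and possibly EINTR after a signal) can be sketched with the POSIX read() call that those signatures were modeled on. The helper name read_fully is hypothetical, and the sketch uses a plain file descriptor rather than a PerlIO *; code linked against perl would presumably handle the PerlIO_read() return value along the same lines.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to `want` bytes from fd into buf, retrying when a signal
 * interrupts the call.  Returns the number of bytes read (fewer than
 * `want`, possibly 0, means end of file was reached), or -1 with
 * errno set on error.
 */
ssize_t read_fully(int fd, void *buf, size_t want)
{
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < want) {
        ssize_t n = read(fd, p + got, want - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: try again */
            return -1;           /* real error, errno is set */
        }
        if (n == 0)
            break;               /* end of file */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Because error and end of file are reported through different return values, the caller does not need a separate feof()/ferror() style check after each read.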

Co-existence with stdio

There is outline support for co-existence of PerlIO with stdio. Obviously if PerlIO is implemented in terms of stdio there is no problem. However, in other cases mechanisms must exist to create a FILE * which can be passed to library code which is going to use stdio calls.

The first step is to add this line:

  #define PERLIO_NOT_STDIO 0

before including any perl header files. (This will probably become the default at some point). That prevents "perlio.h" from attempting to #define stdio functions onto PerlIO functions.

XS code is probably better using "typemap" if it expects FILE * arguments. The standard typemap will be adjusted to comprehend any changes in this area.

  • PerlIO_importFILE(f,mode)

    Used to get a PerlIO * from a FILE *.

    The mode argument should be a string as would be passed to fopen/PerlIO_open. If it is NULL then - for legacy support - the code will (depending upon the platform and the implementation) either attempt to empirically determine the mode in which f is open, or use "r+" to indicate a read/write stream.

    Once called the FILE * should ONLY be closed by calling PerlIO_close() on the returned PerlIO *.

    The PerlIO is set to textmode. Use PerlIO_binmode if this is not the desired mode.

    This is not the reverse of PerlIO_exportFILE().

  • PerlIO_exportFILE(f,mode)

    Given a PerlIO * create a 'native' FILE * suitable for passing to code expecting to be compiled and linked with ANSI C stdio.h. The mode argument should be a string as would be passed to fopen/PerlIO_open. If it is NULL then - for legacy support - the FILE * is opened in the same mode as the PerlIO *.

    The fact that such a FILE * has been 'exported' is recorded, (normally by pushing a new :stdio "layer" onto the PerlIO *), which may affect future PerlIO operations on the original PerlIO *. You should not call fclose() on the file unless you call PerlIO_releaseFILE() to disassociate it from the PerlIO *. (Do not use PerlIO_importFILE() for doing the disassociation.)

    Calling this function repeatedly will create a FILE * on each call (and will push an :stdio layer each time as well).

  • PerlIO_releaseFILE(p,f)

    Calling PerlIO_releaseFILE informs PerlIO that all use of FILE * is complete. It is removed from the list of 'exported' FILE *s, and the associated PerlIO * should revert to its original behaviour.

    Use this to disassociate a file from a PerlIO * that was associated using PerlIO_exportFILE().

  • PerlIO_findFILE(f)

    Returns a native FILE * used by a stdio layer. If there is none, it will create one with PerlIO_exportFILE. In either case the FILE * should be considered as belonging to the PerlIO subsystem and should only be closed by calling PerlIO_close().

"Fast gets" Functions

In addition to the stdio-like API defined above there is an "implementation" interface which allows perl to get at internals of PerlIO. The following calls correspond to the various FILE_xxx macros determined by Configure - or their equivalent in other implementations. This section is really of interest only to those concerned with detailed perl-core behaviour, implementing a PerlIO mapping or writing code which can make use of the "read ahead" that has been done by the IO system in the same way perl does. Note that any code that uses these interfaces must be prepared to do things the traditional way if a handle does not support them.

  • PerlIO_fast_gets(f)

    Returns true if implementation has all the interfaces required to allow perl's sv_gets to "bypass" normal IO mechanism. This can vary from handle to handle.

      PerlIO_fast_gets(f) = PerlIO_has_cntptr(f) && \
                            PerlIO_canset_cnt(f) && \
                            'Can set pointer into buffer'

  • PerlIO_has_cntptr(f)

    Implementation can return pointer to current position in the "buffer" and a count of bytes available in the buffer. Do not use this - use PerlIO_fast_gets.

  • PerlIO_get_cnt(f)

    Return count of readable bytes in the buffer. Zero or negative return means no more bytes available.

  • PerlIO_get_ptr(f)

    Return pointer to next readable byte in buffer. Accessing via the pointer (dereferencing) is only safe if PerlIO_get_cnt() has returned a positive value. Only positive offsets up to the value returned by PerlIO_get_cnt() are allowed.

  • PerlIO_set_ptrcnt(f,p,c)

    Set pointer into buffer, and a count of bytes still in the buffer. Should be used only to set pointer to within range implied by previous calls to PerlIO_get_ptr and PerlIO_get_cnt. The two values must be consistent with each other (implementation may only use one or the other or may require both).

  • PerlIO_canset_cnt(f)

    Implementation can adjust its idea of number of bytes in the buffer. Do not use this - use PerlIO_fast_gets.

  • PerlIO_set_cnt(f,c)

    Obscure - set count of bytes in the buffer. Deprecated. Only usable if PerlIO_canset_cnt() returns true. Currently used only in doio.c to force count less than -1 to -1. Perhaps should be PerlIO_set_empty or similar. This call may actually do nothing if "count" is deduced from pointer and a "limit". Do not use this - use PerlIO_set_ptrcnt().

  • PerlIO_has_base(f)

    Returns true if implementation has a buffer, and can return pointer to whole buffer and its size. Used by perl for -T / -B tests. Other uses would be very obscure...

  • PerlIO_get_base(f)

    Return start of buffer. Access only positive offsets in the buffer up to the value returned by PerlIO_get_bufsiz().

  • PerlIO_get_bufsiz(f)

    Return the total number of bytes in the buffer. This is neither the number that can be read, nor the amount of memory allocated to the buffer. Rather it is what the operating system and/or implementation happened to read() (or whatever) last time IO was requested.
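The way PerlIO_get_cnt(), PerlIO_get_ptr() and PerlIO_set_ptrcnt() are meant to cooperate can be sketched with a toy buffer. Everything here (ToyIO and the toy_ functions) is hypothetical and only mimics the documented contract: read the count, look at the bytes through the pointer while staying inside that count, then hand back a pointer and count that agree with each other. It is not how perl's sv_gets is actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy handle: a read position inside some buffer. */
typedef struct {
    char *ptr;   /* next readable byte */
    int   cnt;   /* readable bytes remaining from ptr */
} ToyIO;

char *toy_get_ptr(ToyIO *f) { return f->ptr; }
int   toy_get_cnt(ToyIO *f) { return f->cnt; }

/* The new pointer and count must describe the same position. */
void toy_set_ptrcnt(ToyIO *f, char *ptr, int cnt)
{
    assert(ptr >= f->ptr && ptr <= f->ptr + f->cnt);
    assert(cnt == f->cnt - (int)(ptr - f->ptr));
    f->ptr = ptr;
    f->cnt = cnt;
}

/*
 * Copy one line (through '\n', or else whatever is buffered) straight
 * out of the buffer.  Returns the number of bytes copied; 0 means the
 * buffer is empty (or `out` has no room) and the caller must fall back
 * to ordinary reads.
 */
size_t toy_fast_gets(ToyIO *f, char *out, size_t cap)
{
    int cnt = toy_get_cnt(f);
    char *ptr, *nl;
    size_t n;

    if (cnt <= 0 || cap == 0)
        return 0;
    ptr = toy_get_ptr(f);
    nl = memchr(ptr, '\n', (size_t)cnt);
    n = nl ? (size_t)(nl - ptr) + 1 : (size_t)cnt;
    if (n > cap)
        n = cap;
    memcpy(out, ptr, n);
    toy_set_ptrcnt(f, ptr + n, cnt - (int)n);   /* consume what was copied */
    return n;
}
```

The zero return is the important part: as noted at the start of this section, callers must be prepared to do things the traditional way when the fast path has nothing to offer.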

Other Functions

  • PerlIO_apply_layers(f,mode,layers)

    The new interface to the USE_PERLIO implementation. The layers ":crlf" and ":raw" are the only ones allowed for other implementations and those are silently ignored. (As of perl5.8 ":raw" is deprecated.) Use PerlIO_binmode() below for the portable case.

  • PerlIO_binmode(f,ptype,imode,layers)

    The hook used by perl's binmode operator. ptype is perl's character for the kind of IO:

    • '<' read
    • '>' write
    • '+' read/write

    imode is O_BINARY or O_TEXT.

    layers is a string of layers to apply; only ":crlf" makes sense in the non USE_PERLIO case. (As of perl5.8 ":raw" is deprecated in favour of passing NULL.)

    Portable cases are:

      PerlIO_binmode(f,ptype,O_BINARY,NULL);
    and
      PerlIO_binmode(f,ptype,O_TEXT,":crlf");

    On Unix these calls probably have no effect whatsoever. Elsewhere they alter "\n" to CR,LF translation and possibly cause a special text "end of file" indicator to be written or honoured on read. The effect of making the call after doing any IO to the handle depends on the implementation. (It may be ignored, affect any data which is already buffered as well, or only apply to subsequent data.)

  • PerlIO_debug(fmt,...)

    PerlIO_debug is a printf()-like function which can be used for debugging. No return value. Its main use is inside PerlIO where using real printf, warn() etc. would recursively call PerlIO and be a problem.

    PerlIO_debug writes to the file named by $ENV{'PERLIO_DEBUG'}. Typical use might be:

      Bourne shells (sh, ksh, bash, zsh, ash, ...):
       PERLIO_DEBUG=/dev/tty ./perl somescript some args

      Csh/Tcsh:
       setenv PERLIO_DEBUG /dev/tty
       ./perl somescript some args

      If you have the "env" utility:
       env PERLIO_DEBUG=/dev/tty ./perl somescript some args

      Win32:
       set PERLIO_DEBUG=CON
       perl somescript some args

    If $ENV{'PERLIO_DEBUG'} is not set PerlIO_debug() is a no-op.
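The behaviour described for PerlIO_debug() (a printf()-like call that writes to a file named by an environment variable and does nothing when the variable is unset) can be imitated in a few lines of standard C. This is only a sketch of the pattern, not perl's implementation: demo_debug_to, demo_debug and the DEMO_DEBUG variable are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Append one formatted message to the file `path`.  Returns 1 if the
 * message was written, 0 if path is NULL or empty (the no-op case) or
 * the file cannot be opened.
 */
int demo_debug_to(const char *path, const char *fmt, ...)
{
    FILE *fp;
    va_list args;

    assert(fmt != NULL);
    if (path == NULL || *path == '\0')
        return 0;                 /* not configured: do nothing */
    fp = fopen(path, "a");
    if (fp == NULL)
        return 0;
    va_start(args, fmt);
    vfprintf(fp, fmt, args);
    va_end(args);
    fclose(fp);
    return 1;
}

/* The target file is named by the (hypothetical) DEMO_DEBUG variable. */
#define demo_debug(...) demo_debug_to(getenv("DEMO_DEBUG"), __VA_ARGS__)
```

With this in place, running a program as DEMO_DEBUG=/dev/tty ./prog would show the messages on the terminal, while running it without the variable leaves every demo_debug() call silent.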

 
perlartistic

SYNOPSIS

  You can refer to this document in Pod via "L<perlartistic>"
  Or you can see this document by entering "perldoc perlartistic"

DESCRIPTION

Perl is free software; you can redistribute it and/or modify it under the terms of either:

  a) the GNU General Public License as published by the Free
     Software Foundation; either version 1, or (at your option) any
     later version, or
  b) the "Artistic License" which comes with this Kit.

This is "The Artistic License". It's here so that modules, programs, etc., that want to declare this as their distribution license can link to it.

For the GNU General Public License, see perlgpl.

The "Artistic License"

Preamble

The intent of this document is to state the conditions under which a Package may be copied, such that the Copyright Holder maintains some semblance of artistic control over the development of the package, while giving the users of the package the right to use and distribute the Package in a more-or-less customary fashion, plus the right to make reasonable modifications.

Definitions

  • "Package"

    refers to the collection of files distributed by the Copyright Holder, and derivatives of that collection of files created through textual modification.

  • "Standard Version"

    refers to such a Package if it has not been modified, or has been modified in accordance with the wishes of the Copyright Holder as specified below.

  • "Copyright Holder"

    is whoever is named in the copyright or copyrights for the package.

  • "You"

    is you, if you're thinking about copying or distributing this Package.

  • "Reasonable copying fee"

    is whatever you can justify on the basis of media cost, duplication charges, time of people involved, and so on. (You will not be required to justify it to the Copyright Holder, but only to the computing community at large as a market that must bear the fee.)

  • "Freely Available"

    means that no fee is charged for the item itself, though there may be fees involved in handling the item. It also means that recipients of the item may redistribute it under the same conditions they received it.

Conditions

1.

You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers.

2.

You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. A Package modified in such a way shall still be considered the Standard Version.

3.

You may otherwise modify your copy of this Package in any way, provided that you insert a prominent notice in each changed file stating how and when you changed that file, and provided that you do at least ONE of the following:

  • a)

    place your modifications in the Public Domain or otherwise make them Freely Available, such as by posting said modifications to Usenet or an equivalent medium, or placing the modifications on a major archive site such as uunet.uu.net, or by allowing the Copyright Holder to include your modifications in the Standard Version of the Package.

  • b)

    use the modified Package only within your corporation or organization.

  • c)

    rename any non-standard executables so the names do not conflict with standard executables, which must also be provided, and provide a separate manual page for each non-standard executable that clearly documents how it differs from the Standard Version.

  • d)

    make other distribution arrangements with the Copyright Holder.

4.

You may distribute the programs of this Package in object code or executable form, provided that you do at least ONE of the following:

  • a)

    distribute a Standard Version of the executables and library files, together with instructions (in the manual page or equivalent) on where to get the Standard Version.

  • b)

    accompany the distribution with the machine-readable source of the Package with your modifications.

  • c)

    give non-standard executables non-standard names, and clearly document the differences in manual pages (or equivalent), together with instructions on where to get the Standard Version.

  • d)

    make other distribution arrangements with the Copyright Holder.

5.

You may charge a reasonable copying fee for any distribution of this Package. You may charge any fee you choose for support of this Package. You may not charge a fee for this Package itself. However, you may distribute this Package in aggregate with other (possibly commercial) programs as part of a larger (possibly commercial) software distribution provided that you do not advertise this Package as a product of your own. You may embed this Package's interpreter within an executable of yours (by linking); this shall be construed as a mere form of aggregation, provided that the complete Standard Version of the interpreter is so embedded.

6.

The scripts and library files supplied as input to or produced as output from the programs of this Package do not automatically fall under the copyright of this Package, but belong to whoever generated them, and may be sold commercially, and may be aggregated with this Package. If such scripts or library files are aggregated with this Package via the so-called "undump" or "unexec" methods of producing a binary executable image, then distribution of such an image shall neither be construed as a distribution of this Package nor shall it fall under the restrictions of Paragraphs 3 and 4, provided that you do not represent such an executable image as a Standard Version of this Package.

7.

C subroutines (or comparably compiled subroutines in other languages) supplied by you and linked into this Package in order to emulate subroutines and variables of the language defined by this Package shall not be considered part of this Package, but are the equivalent of input as in Paragraph 6, provided these subroutines do not change the language in any way that would cause it to fail the regression tests for the language.

8.

Aggregation of this Package with a commercial distribution is always permitted provided that the use of this Package is embedded; that is, when no overt attempt is made to make this Package's interfaces visible to the end user of the commercial distribution. Such use shall not be construed as a distribution of this Package.

9.

The name of the Copyright Holder may not be used to endorse or promote products derived from this software without specific prior written permission.

10.

THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.

The End

 
perlbook

NAME

perlbook - Books about and related to Perl

DESCRIPTION

There are many books on Perl and Perl-related topics. A few of these are good, some are OK, but many aren't worth your money. There is a list of these books, some with extensive reviews, at http://books.perl.org/ . We list some of the books here, and while listing a book implies our endorsement, don't think that not including a book means anything.

Most of these books are available online through Safari Books Online ( http://safaribooksonline.com/ ).

The most popular books

The major reference book on Perl, written by the creator of Perl, is Programming Perl:

  • Programming Perl (the "Camel Book"):
    by Tom Christiansen, brian d foy, Larry Wall with Jon Orwant
    ISBN 978-0-596-00492-7 [4th edition February 2012]
    ISBN 978-1-4493-9890-3 [ebook]
    http://oreilly.com/catalog/9780596004927

The Ram is a cookbook with hundreds of examples of using Perl to accomplish specific tasks:

  • The Perl Cookbook (the "Ram Book"):
    by Tom Christiansen and Nathan Torkington,
    with Foreword by Larry Wall
    ISBN 978-0-596-00313-5 [2nd Edition August 2003]
    http://oreilly.com/catalog/9780596003135/

If you want to learn the basics of Perl, you might start with the Llama book, which assumes that you already know a little about programming:

  • Learning Perl (the "Llama Book")
    by Randal L. Schwartz, Tom Phoenix, and brian d foy
    ISBN 978-1-4493-0358-7 [6th edition June 2011]
    http://oreilly.com/catalog/0636920018452

The tutorial started in the Llama continues in the Alpaca, which introduces the intermediate features of references, data structures, object-oriented programming, and modules:

  • Intermediate Perl (the "Alpaca Book")
    by Randal L. Schwartz and brian d foy, with Tom Phoenix
    foreword by Damian Conway
    ISBN 978-1-4493-9309-0 [2nd edition August 2012]
    http://oreilly.com/catalog/0636920012689/

References

You might want to keep these desktop references close by your keyboard:

  • Perl 5 Pocket Reference
    by Johan Vromans
    ISBN 978-1-4493-0370-9 [5th edition July 2011]
    ISBN 978-1-4493-0813-1 [ebook]
    http://oreilly.com/catalog/0636920018476/
  • Perl Debugger Pocket Reference
    by Richard Foley
    ISBN 978-0-596-00503-0 [1st edition January 2004]
    http://oreilly.com/catalog/9780596005030/
  • Regular Expression Pocket Reference
    by Tony Stubblebine
    ISBN 978-0-596-51427-3 [July 2007]
    http://oreilly.com/catalog/9780596514273/

Tutorials

  • Beginning Perl
    by James Lee
    ISBN 1-59059-391-X [3rd edition April 2010]
    http://www.apress.com/9781430227939
  • Learning Perl
    by Randal L. Schwartz, Tom Phoenix, and brian d foy
    ISBN 978-0-596-52010-6 [5th edition June 2008]
    http://oreilly.com/catalog/9780596520106
  • Intermediate Perl (the "Alpaca Book")
    by Randal L. Schwartz and brian d foy, with Tom Phoenix
    foreword by Damian Conway
    ISBN 0-596-10206-2 [1st edition March 2006]
    http://oreilly.com/catalog/9780596102067
  • Mastering Perl
    by brian d foy
    ISBN 978-0-596-10206-7 [1st edition July 2007]
    http://www.oreilly.com/catalog/9780596527242
  • Effective Perl Programming
    by Joseph N. Hall, Joshua A. McAdams, brian d foy
    ISBN 0-321-49694-9 [2nd edition 2010]
    http://www.effectiveperlprogramming.com/

Task-Oriented

  • Writing Perl Modules for CPAN
    by Sam Tregar
    ISBN 1-59059-018-X [1st edition August 2002]
    http://www.apress.com/9781590590188
  • The Perl Cookbook
    by Tom Christiansen and Nathan Torkington
    with foreword by Larry Wall
    ISBN 1-56592-243-3 [2nd edition August 2003]
    http://oreilly.com/catalog/9780596003135
  • Automating System Administration with Perl
    by David N. Blank-Edelman
    ISBN 978-0-596-00639-6 [2nd edition May 2009]
    http://oreilly.com/catalog/9780596006396
  • Real World SQL Server Administration with Perl
    by Linchi Shea
    ISBN 1-59059-097-X [1st edition July 2003]
    http://www.apress.com/9781590590973

Special Topics

  • Regular Expressions Cookbook
    by Jan Goyvaerts and Steven Levithan
    ISBN 978-0-596-52069-4 [May 2009]
    http://oreilly.com/catalog/9780596520694
  • Programming the Perl DBI
    by Tim Bunce and Alligator Descartes
    ISBN 978-1-56592-699-8 [February 2000]
    http://oreilly.com/catalog/9781565926998
  • Perl Best Practices
    by Damian Conway
    ISBN 978-0-596-00173-5 [1st edition July 2005]
    http://oreilly.com/catalog/9780596001735
  • Higher-Order Perl
    by Mark-Jason Dominus
    ISBN 1-55860-701-3 [1st edition March 2005]
    http://hop.perl.plover.com/
  • Mastering Regular Expressions
    by Jeffrey E. F. Friedl
    ISBN 978-0-596-52812-6 [3rd edition August 2006]
    http://oreilly.com/catalog/9780596528126
  • Network Programming with Perl
    by Lincoln Stein
    ISBN 0-201-61571-1 [1st edition 2001]
    http://www.pearsonhighered.com/educator/product/Network-Programming-with-Perl/9780201615715.page
  • Perl Template Toolkit
    by Darren Chamberlain, Dave Cross, and Andy Wardley
    ISBN 978-0-596-00476-7 [December 2003]
    http://oreilly.com/catalog/9780596004767
  • Object Oriented Perl
    by Damian Conway
    with foreword by Randal L. Schwartz
    ISBN 1-884777-79-1 [1st edition August 1999]
    http://www.manning.com/conway/
  • Data Munging with Perl
    by Dave Cross
    ISBN 1-930110-00-6 [1st edition 2001]
    http://www.manning.com/cross
  • Mastering Perl/Tk
    by Steve Lidie and Nancy Walsh
    ISBN 978-1-56592-716-2 [1st edition January 2002]
    http://oreilly.com/catalog/9781565927162
  • Extending and Embedding Perl
    by Tim Jenness and Simon Cozens
    ISBN 1-930110-82-0 [1st edition August 2002]
    http://www.manning.com/jenness
  • Pro Perl Debugging
    by Richard Foley with Andy Lester
    ISBN 1-59059-454-1 [1st edition July 2005]
    http://www.apress.com/9781590594544

Free (as in beer) books

Some of these books are available as free downloads.

Higher-Order Perl: http://hop.perl.plover.com/

Other interesting, non-Perl books

You might notice several familiar Perl concepts in this collection of ACM columns from Jon Bentley. The similarity to the title of the major Perl book (which came later) is not completely accidental:

  • Programming Pearls
    by Jon Bentley
    ISBN 978-0-201-65788-3 [2nd edition, October 1999]
  • More Programming Pearls
    by Jon Bentley
    ISBN 0-201-11889-0 [January 1988]

A note on freshness

Each version of Perl comes with the documentation that was current at the time of release. This poses a problem for content such as book lists. There are probably very nice books published after this list was included in your Perl release, and you can check the latest released version at http://perldoc.perl.org/perlbook.html .

Some of the books we've listed appear almost ancient in internet scale, but we've included those books because they still describe the current way of doing things. Not everything in Perl changes every day. Many of the beginner-level books, too, go over basic features and techniques that are still valid today. In general though, we try to limit this list to books published in the past five years.

Get your book listed

If your Perl book isn't listed and you think it should be, let us know.

 
perlboot

NAME

perlboot - This document has been deleted

DESCRIPTION

For information on OO programming with Perl, please see perlootut and perlobj.

perlbot

NAME

perlbot - This document has been deleted

DESCRIPTION

For information on OO programming with Perl, please see perlootut and perlobj.

perlbs2000

NAME

perlbs2000 - building and installing Perl for BS2000.

SYNOPSIS

This document will help you Configure, build, test and install Perl on BS2000 in the POSIX subsystem.

DESCRIPTION

This is a ported perl for the POSIX subsystem in BS2000 VERSION OSD V3.1A or later. It may work on other versions, but we started porting and testing it with 3.1A and are currently using Version V4.0A.

You may need the following GNU programs in order to install perl:

gzip on BS2000

We used version 1.2.4, which could be installed out of the box with one failure during 'make check'.

bison on BS2000

The yacc coming with BS2000 POSIX didn't work for us. So we had to use bison. We had to make a few changes to perl in order to use the pure (reentrant) parser of bison. We used version 1.25, but we had to add a few changes due to EBCDIC. See below for more details concerning yacc.

Unpacking Perl Distribution on BS2000

To extract an ASCII tar archive on BS2000 POSIX you need an ASCII filesystem (we used the mountpoint /usr/local/ascii for this). Now you extract the archive in the ASCII filesystem without I/O-conversion:

    cd /usr/local/ascii
    export IO_CONVERSION=NO
    gunzip < /usr/local/src/perl.tar.gz | pax -r

You may ignore the error message for the first element of the archive (this doesn't look like a tar archive / skipping to next file...), it's only the directory which will be created automatically anyway.

After extracting the archive you copy the whole directory tree to your EBCDIC filesystem. This time you use I/O-conversion:

    cd /usr/local/src
    IO_CONVERSION=YES
    cp -r /usr/local/ascii/perl5.005_02 ./

Compiling Perl on BS2000

There is a "hints" file for BS2000 called hints.posix-bc (because posix-bc is the OS name given by `uname`) that specifies the correct values for most things. The major problem is (of course) the EBCDIC character set. We have the German EBCDIC version.

Because of our problems with the native yacc we used GNU bison to generate a pure (=reentrant) parser for perly.y. So our yacc is really the following script:

    -----8<-----/usr/local/bin/yacc-----8<-----
    #! /usr/bin/sh

    # Bison as a reentrant yacc:

    # save parameters:
    params=""
    while [[ $# -gt 1 ]]; do
        params="$params $1"
        shift
    done

    # add flag %pure_parser:

    tmpfile=/tmp/bison.$$.y
    echo %pure_parser > $tmpfile
    cat $1 >> $tmpfile

    # call bison:

    echo "/usr/local/bin/bison --yacc $params $1\t\t\t(Pure Parser)"
    /usr/local/bin/bison --yacc $params $tmpfile

    # cleanup:

    rm -f $tmpfile
    -----8<----------8<-----

We still use the normal yacc for a2p.y though!!! We made a softlink called byacc to distinguish between the two versions:

ln -s /usr/bin/yacc /usr/local/bin/byacc

We build perl using GNU make. We tried the native make once and it worked too.

Testing Perl on BS2000

We still got a few errors during make test. Some of them are the result of using bison. Bison prints parser error instead of syntax error, so we may ignore them. The following list shows our errors; your results may differ:

    op/numconvert.......FAILED tests 1409-1440
    op/regexp...........FAILED tests 483, 496
    op/regexp_noamp.....FAILED tests 483, 496
    pragma/overload.....FAILED tests 152-153, 170-171
    pragma/warnings.....FAILED tests 14, 82, 129, 155, 192, 205, 207
    lib/bigfloat........FAILED tests 351-352, 355
    lib/bigfltpm........FAILED tests 354-355, 358
    lib/complex.........FAILED tests 267, 487
    lib/dumper..........FAILED tests 43, 45
    Failed 11/231 test scripts, 95.24% okay. 57/10595 subtests failed, 99.46% okay.

Installing Perl on BS2000

We have no nroff on BS2000 POSIX (yet), so we ignored any errors while installing the documentation.

Using Perl in the Posix-Shell of BS2000

BS2000 POSIX doesn't support the shebang notation (#!/usr/local/bin/perl), so you have to use the following lines instead:

    : # use perl
        eval 'exec /usr/local/bin/perl -S $0 ${1+"$@"}'
            if $running_under_some_shell;

Using Perl in "native" BS2000

We don't have much experience with this yet, but try the following:

Copy your Perl executable to a BS2000 LLM using bs2cp:

bs2cp /usr/local/bin/perl 'bs2:perl(perl,l)'

Now you can start it with the following (SDF) command:

/START-PROG FROM-FILE=*MODULE(PERL,PERL),PROG-MODE=*ANY,RUN-MODE=*ADV

First you get the BS2000 commandline prompt ('*'). Here you may enter your parameters, e.g. -e 'print "Hello World!\\n";' (note the double backslash!) or -w and the name of your Perl script. Filenames starting with / are searched in the POSIX filesystem, others are searched in the BS2000 filesystem. You may even use wildcards if you put a % in front of your filename (e.g. -w checkfiles.pl %*.c). Read your C/C++ manual for additional possibilities of the commandline prompt (look for PARAMETER-PROMPTING).

Floating point anomalies on BS2000

There appears to be a bug in the floating point implementation on BS2000 POSIX systems such that calling int() on the product of a number and a small magnitude number is not the same as calling int() on the quotient of that number and a large magnitude number. For example, in the following Perl code:

    my $x = 100000.0;
    my $y = int($x * 1e-5) * 1e5; # '0'
    my $z = int($x / 1e+5) * 1e5; # '100000'
    print "\$y is $y and \$z is $z\n"; # $y is 0 and $z is 100000

Although one would expect the quantities $y and $z to be the same and equal to 100000, on BS2000 they differ and are 0 and 100000 respectively.
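One possible way to guard against this kind of truncation surprise is to nudge the value by a small tolerance before calling int(). The sketch below is only an illustration: the helper name int_with_tolerance and the default tolerance of 1e-9 are assumptions made for this example, not part of Perl or of the BS2000 port, and a suitable tolerance depends on the magnitude of your data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): truncate toward zero after
# nudging the value away from zero by a small tolerance, so that a
# result such as 0.9999999999999 is treated as 1 rather than 0.
sub int_with_tolerance {
    my ($value, $eps) = @_;
    $eps = 1e-9 unless defined $eps;    # assumed default tolerance
    return $value >= 0 ? int($value + $eps) : int($value - $eps);
}

my $x = 100000.0;
my $y = int_with_tolerance($x * 1e-5) * 1e5;
my $z = int_with_tolerance($x / 1e+5) * 1e5;
print "\$y is $y and \$z is $z\n";
```

The tolerance trades one kind of error for another: values that are genuinely just below an integer will also be rounded up, so this is only appropriate when the inputs are expected to be whole numbers.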

Using PerlIO and different encodings on ASCII and EBCDIC partitions

Since version 5.8 Perl uses the new PerlIO on BS2000. This lets you use a different encoding for each IO channel. For example you may use

    use Encode;
    open($f, ">:encoding(ascii)", "test.ascii");
    print $f "Hello World!\n";
    open($f, ">:encoding(posix-bc)", "test.ebcdic");
    print $f "Hello World!\n";
    open($f, ">:encoding(latin1)", "test.latin1");
    print $f "Hello World!\n";
    open($f, ">:encoding(utf8)", "test.utf8");
    print $f "Hello World!\n";

to get four files containing "Hello World!\n" in ASCII, EBCDIC, ISO Latin-1 (in this example identical to ASCII) and UTF-EBCDIC (in this example identical to normal EBCDIC), respectively. See the documentation of Encode::PerlIO for details.
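The same encoding layers can be used for reading. The following is a minimal sketch, not taken from the original port notes, that writes a line through each layer and reads it back with lexical filehandles and error checks. The temporary directory and the %round_trip and %size hashes are assumptions made for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory for the example files; removed when the script exits.
my $dir = tempdir(CLEANUP => 1);

my (%round_trip, %size);
for my $enc (qw(ascii posix-bc latin1)) {
    my $path = "$dir/test.$enc";

    # Write one line through the encoding layer.
    open(my $out, ">:encoding($enc)", $path) or die "Cannot write $path: $!";
    print {$out} "Hello World!\n";
    close($out) or die "Cannot close $path: $!";

    # Read it back through the same layer; the text should be unchanged.
    open(my $in, "<:encoding($enc)", $path) or die "Cannot read $path: $!";
    my $line = <$in>;
    close($in);

    $round_trip{$enc} = defined $line && $line eq "Hello World!\n";
    $size{$enc}       = -s $path;
    print "$enc: ", ($round_trip{$enc} ? "round trip ok" : "mismatch"),
          ", $size{$enc} bytes on disk\n";
}
```

Because the layer converts on both write and read, the program always sees the same characters; only the bytes stored in the file differ between encodings.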

As the PerlIO layer uses raw IO internally, all this completely ignores the type of your filesystem (ASCII or EBCDIC) and the IO_CONVERSION environment variable. If you want the old behavior, in which the BS2000 IO functions decide on conversion depending on the filesystem, PerlIO is still your friend. Use IO_CONVERSION as usual and tell Perl to use the native IO layer:

    export IO_CONVERSION=YES
    export PERLIO=stdio

Now your IO would be ASCII on ASCII partitions and EBCDIC on EBCDIC partitions. See the documentation of PerlIO (without Encode:: !) for further possibilities.
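To check which layers a handle is actually using after changing PERLIO, you can ask Perl directly. This is a small sketch using the built-in PerlIO::get_layers function; the layer names it prints depend on your build and environment, so no particular output is claimed here.

```perl
use strict;
use warnings;

# PerlIO::get_layers is built into perl; no module needs to be loaded.
my @layers = PerlIO::get_layers(*STDOUT);
print "STDOUT layers: @layers\n";
```

Running this once with and once without PERLIO set in the environment is a quick way to see whether the setting took effect.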

AUTHORS

Thomas Dorner

SEE ALSO

INSTALL, perlport.

Mailing list

If you are interested in the z/OS (formerly known as OS/390) and POSIX-BC (BS2000) ports of Perl then see the perl-mvs mailing list. To subscribe, send an empty message to perl-mvs-subscribe@perl.org.

See also:

  1. http://lists.perl.org/list/perl-mvs.html

There are web archives of the mailing list at:

  1. http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/
  2. http://archive.develooper.com/perl-mvs@perl.org/

HISTORY

This document was originally written by Thomas Dorner for the 5.005 release of Perl.

This document was podified for the 5.6 release of perl on 11 July 2000.

 
perlbug

NAME

perlbug - how to submit bug reports on Perl

SYNOPSIS

perlbug

perlbug [ -v ] [ -a address ] [ -s subject ] [ -b body | -f inputfile ] [ -F outputfile ] [ -r returnaddress ] [ -e editor ] [ -c adminaddress | -C ] [ -S ] [ -t ] [ -d ] [ -A ] [ -h ] [ -T ]

perlbug [ -v ] [ -r returnaddress ] [ -A ] [ -ok | -okay | -nok | -nokay ]

perlthanks

DESCRIPTION

This program is designed to help you generate and send bug reports (and thank-you notes) about perl5 and the modules which ship with it.

In most cases, you can just run it interactively from a command line without any special arguments and follow the prompts.

If you have found a bug with a non-standard port (one that was not part of the standard distribution), a binary distribution, or a non-core module (such as Tk, DBI, etc), then please see the documentation that came with that distribution to determine the correct place to report bugs.

If you are unable to send your report using perlbug (most likely because your system doesn't have a way to send mail that perlbug recognizes), you may be able to use this tool to compose your report and save it to a file which you can then send to perlbug@perl.org using your regular mail client.

In extreme cases, perlbug may not work well enough on your system to guide you through composing a bug report. In those cases, you may be able to use perlbug -d to get system configuration information to include in a manually composed bug report to perlbug@perl.org.

When reporting a bug, please run through this checklist:

  • What version of Perl are you running?

    Type perl -v at the command line to find out.

  • Are you running the latest released version of perl?

    Look at http://www.perl.org/ to find out. If you are not using the latest released version, please try to replicate your bug on the latest stable release.

    Note that reports about bugs in old versions of Perl, especially those which indicate you haven't also tested the current stable release of Perl, are likely to receive less attention from the volunteers who build and maintain Perl than reports about bugs in the current release.

    This tool isn't appropriate for reporting bugs in any version prior to Perl 5.0.

  • Are you sure what you have is a bug?

    A significant number of the bug reports we get turn out to be documented features in Perl. Make sure the issue you've run into isn't intentional by glancing through the documentation that comes with the Perl distribution.

    Given the sheer volume of Perl documentation, this isn't a trivial undertaking, but if you can point to documentation that suggests the behaviour you're seeing is wrong, your issue is likely to receive more attention. You may want to start with perldoc perltrap for pointers to common traps that new (and experienced) Perl programmers run into.

    If you're unsure of the meaning of an error message you've run across, run perldoc perldiag for an explanation. If the message isn't in perldiag, it probably isn't generated by Perl. You may have luck consulting your operating system documentation instead.

    If you are on a non-UNIX platform, see perldoc perlport, as some features may be unimplemented or work differently.

    You may be able to figure out what's going wrong using the Perl debugger. For information about how to use the debugger, see perldoc perldebug.

  • Do you have a proper test case?

    The easier it is to reproduce your bug, the more likely it will be fixed -- if nobody can duplicate your problem, it probably won't be addressed.

    A good test case has most of these attributes: short, simple code; few dependencies on external commands, modules, or libraries; no platform-dependent code (unless it's a platform-specific bug); clear, simple documentation.

    A good test case is almost always a good candidate to be included in Perl's test suite. If you have the time, consider writing your test case so that it can be easily included into the standard test suite.

  • Have you included all relevant information?

    Be sure to include the exact error messages, if any. "Perl gave an error" is not an exact error message.

    If you get a core dump (or equivalent), you may use a debugger (dbx, gdb, etc) to produce a stack trace to include in the bug report.

    NOTE: unless your Perl has been compiled with debug info (often -g), the stack trace is likely to be somewhat hard to use because it will most probably contain only the function names and not their arguments. If possible, recompile your Perl with debug info and reproduce the crash and the stack trace.

  • Can you describe the bug in plain English?

    The easier it is to understand a reproducible bug, the more likely it will be fixed. Any insight you can provide into the problem will help a great deal. In other words, try to analyze the problem (to the extent you can) and report your discoveries.

  • Can you fix the bug yourself?

    A bug report which includes a patch to fix it will almost definitely be fixed. When sending a patch, please use the diff program with the -u option to generate "unified" diff files. Bug reports with patches are likely to receive significantly more attention and interest than those without patches.

    Your patch may be returned with requests for changes, or requests for more detailed explanations about your fix.

    Here are a few hints for creating high-quality patches:

    Make sure the patch is not reversed (the first argument to diff is typically the original file, the second argument your changed file). Make sure you test your patch by applying it with the patch program before you send it on its way. Try to follow the same style as the code you are trying to patch. Make sure your patch really does work (make test, if the thing you're patching is covered by Perl's test suite).

  • Can you use perlbug to submit the report?

    perlbug will, amongst other things, ensure your report includes crucial information about your version of perl. If perlbug is unable to mail your report after you have typed it in, you may have to compose the message yourself, add the output produced by perlbug -d and email it to perlbug@perl.org. If, for some reason, you cannot run perlbug at all on your system, be sure to include the entire output produced by running perl -V (note the uppercase V).

    Whether you use perlbug or send the email manually, please make your Subject line informative. "a bug" is not informative. Neither is "perl crashes" nor is "HELP!!!". These don't help. A compact description of what's wrong is fine.

  • Can you use perlbug to submit a thank-you note?

    Yes, you can do this either by using the -T option or by invoking the program as perlthanks. Thank-you notes are good. They make people smile.
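As an illustration of the "proper test case" item in the checklist above, a short, self-contained script written with Test::More is usually easy for others to run and to adapt for the test suite. The two checks below are placeholders invented for this sketch; a real report would replace them with the smallest code that shows the misbehaviour, stating the expected result in each test description.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Each test states the expected behaviour; a failing test documents the bug.
is(substr("perlbug", 0, 4), "perl",
   "substr returns the leading characters");
is(join(",", sort { $a <=> $b } 10, 9, 2), "2,9,10",
   "numeric sort orders by value");
```

Keeping the script free of external modules, commands and platform-specific paths matches the attributes of a good test case listed above.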

Having done your bit, please be prepared to wait, to be told the bug is in your code, or possibly to get no reply at all. The volunteers who maintain Perl are busy folks, so if your problem is an obvious bug in your own code, is difficult to understand or is a duplicate of an existing report, you may not receive a personal reply.

If it is important to you that your bug be fixed, do monitor the perl5-porters@perl.org mailing list and the commit logs to development versions of Perl, and encourage the maintainers with kind words or offers of frosty beverages. (Please do be kind to the maintainers. Harassing or flaming them is likely to have the opposite effect of the one you want.)

Feel free to update the ticket about your bug on http://rt.perl.org if a new version of Perl is released and your bug is still present.

OPTIONS

  • -a

    Address to send the report to. Defaults to perlbug@perl.org.

  • -A

    Don't send a bug received acknowledgement to the reply address. Generally it is only sensible to use this option if you are a perl maintainer actively watching perl porters for your message to arrive.

  • -b

    Body of the report. If not included on the command line, or in a file with -f, you will get a chance to edit the message.

  • -C

    Don't send copy to administrator.

  • -c

    Address to send copy of report to. Defaults to the address of the local perl administrator (recorded when perl was built).

  • -d

    Data mode (the default if you redirect or pipe output). This prints out your configuration data, without mailing anything. You can use this with -v to get more complete data.

  • -e

    Editor to use.

  • -f

    File containing the body of the report. Use this to quickly send a prepared message.

  • -F

    File to output the results to instead of sending as an email. Useful particularly when running perlbug on a machine with no direct internet connection.

  • -h

    Prints a brief summary of the options.

  • -ok

    Report successful build on this system to perl porters. Forces -S and -C. Forces and supplies values for -s and -b. Only prompts for a return address if it cannot guess it (for use with make). Honors return address specified with -r. You can use this with -v to get more complete data. Only makes a report if this system is less than 60 days old.

  • -okay

    As -ok except it will report on older systems.

  • -nok

    Report unsuccessful build on this system. Forces -C. Forces and supplies a value for -s, then requires you to edit the report and say what went wrong. Alternatively, a prepared report may be supplied using -f. Only prompts for a return address if it cannot guess it (for use with make). Honors return address specified with -r. You can use this with -v to get more complete data. Only makes a report if this system is less than 60 days old.

  • -nokay

    As -nok except it will report on older systems.

  • -r

    Your return address. The program will ask you to confirm its default if you don't use this option.

  • -S

    Send without asking for confirmation.

  • -s

    Subject to include with the message. You will be prompted if you don't supply one on the command line.

  • -t

    Test mode. The target address defaults to perlbug-test@perl.org.

  • -T

    Send a thank-you note instead of a bug report.

  • -v

    Include verbose configuration data in the report.

AUTHORS

Kenneth Albanowski (<kjahds@kjahds.com>), subsequently doctored by Gurusamy Sarathy (<gsar@activestate.com>), Tom Christiansen (<tchrist@perl.com>), Nathan Torkington (<gnat@frii.com>), Charles F. Randall (<cfr@pobox.com>), Mike Guy (<mjtg@cam.ac.uk>), Dominic Dunlop (<domo@computer.org>), Hugo van der Sanden (<hv@crypt.org>), Jarkko Hietaniemi (<jhi@iki.fi>), Chris Nandor (<pudge@pobox.com>), Jon Orwant (<orwant@media.mit.edu>), Richard Foley (<richard.foley@rfi.net>), and Jesse Vincent (<jesse@bestpractical.com>).

SEE ALSO

perl(1), perldebug(1), perldiag(1), perlport(1), perltrap(1), diff(1), patch(1), dbx(1), gdb(1)

BUGS

None known (guess what must have been used to report them?)

 
perlcall

NAME

perlcall - Perl calling conventions from C

DESCRIPTION

The purpose of this document is to show you how to call Perl subroutines directly from C, i.e., how to write callbacks.

Apart from discussing the C interface provided by Perl for writing callbacks the document uses a series of examples to show how the interface actually works in practice. In addition some techniques for coding callbacks are covered.

Examples where callbacks are necessary include

  • An Error Handler

    You have created an XSUB interface to an application's C API.

    A fairly common feature in applications is to allow you to define a C function that will be called whenever something nasty occurs. What we would like is to be able to specify a Perl subroutine that will be called instead.

  • An Event-Driven Program

    The classic example of where callbacks are used is when writing an event driven program, such as for an X11 application. In this case you register functions to be called whenever specific events occur, e.g., a mouse button is pressed, the cursor moves into a window or a menu item is selected.

Although the techniques described here are applicable when embedding Perl in a C program, this is not the primary goal of this document. There are other details that must be considered and are specific to embedding Perl. For details on embedding Perl in C refer to perlembed.

Before you launch yourself head first into the rest of this document, it would be a good idea to have read the following two documents--perlxs and perlguts.

THE CALL_ FUNCTIONS

Although this stuff is easier to explain using examples, you first need to be aware of a few important definitions.

Perl has a number of C functions that allow you to call Perl subroutines. They are

    I32 call_sv(SV* sv, I32 flags);
    I32 call_pv(char *subname, I32 flags);
    I32 call_method(char *methname, I32 flags);
    I32 call_argv(char *subname, I32 flags, char **argv);

The key function is call_sv. All the other functions are fairly simple wrappers which make it easier to call Perl subroutines in special cases. At the end of the day they will all call call_sv to invoke the Perl subroutine.

All the call_* functions have a flags parameter which is used to pass a bit mask of options to Perl. This bit mask operates identically for each of the functions. The settings available in the bit mask are discussed in FLAG VALUES.

Each of the functions will now be discussed in turn.

  • call_sv

    call_sv takes two parameters. The first, sv, is an SV*. This allows you to specify the Perl subroutine to be called either as a C string (which has first been converted to an SV) or a reference to a subroutine. The section, Using call_sv, shows how you can make use of call_sv.

  • call_pv

    The function, call_pv, is similar to call_sv except it expects its first parameter to be a C char* which identifies the Perl subroutine you want to call, e.g., call_pv("fred", 0) . If the subroutine you want to call is in another package, just include the package name in the string, e.g., "pkg::fred" .

  • call_method

    The function call_method is used to call a method from a Perl class. The parameter methname corresponds to the name of the method to be called. Note that the class that the method belongs to is passed on the Perl stack rather than in the parameter list. This class can be either the name of the class (for a static method) or a reference to an object (for a virtual method). See perlobj for more information on static and virtual methods and Using call_method for an example of using call_method.

  • call_argv

    call_argv calls the Perl subroutine specified by the C string stored in the subname parameter. It also takes the usual flags parameter. The final parameter, argv , consists of a NULL-terminated list of C strings to be passed as parameters to the Perl subroutine. See Using call_argv.

All the functions return an integer. This is a count of the number of items returned by the Perl subroutine. The actual items returned by the subroutine are stored on the Perl stack.

As a general rule you should always check the return value from these functions. Even if you are expecting only a particular number of values to be returned from the Perl subroutine, there is nothing to stop someone from doing something unexpected--don't say you haven't been warned.

FLAG VALUES

The flags parameter in all the call_* functions is one of G_VOID, G_SCALAR, or G_ARRAY, which indicate the call context, OR'ed together with a bit mask of any combination of the other G_* symbols defined below.

G_VOID

Calls the Perl subroutine in a void context.

This flag has 2 effects:

1.

It indicates to the subroutine being called that it is executing in a void context (if it executes wantarray the result will be the undefined value).

2.

It ensures that nothing is actually returned from the subroutine.

The value returned by the call_* function indicates how many items have been returned by the Perl subroutine--in this case it will be 0.

G_SCALAR

Calls the Perl subroutine in a scalar context. This is the default context flag setting for all the call_* functions.

This flag has 2 effects:

1.

It indicates to the subroutine being called that it is executing in a scalar context (if it executes wantarray the result will be false).

2.

It ensures that only a scalar is actually returned from the subroutine. The subroutine can, of course, ignore the wantarray and return a list anyway. If so, then only the last element of the list will be returned.

The value returned by the call_* function indicates how many items have been returned by the Perl subroutine - in this case it will be either 0 or 1.

If 0, then you have specified the G_DISCARD flag.

If 1, then the item actually returned by the Perl subroutine will be stored on the Perl stack - the section Returning a Scalar shows how to access this value on the stack. Remember that regardless of how many items the Perl subroutine returns, only the last one will be accessible from the stack - think of the case where only one value is returned as being a list with only one element. Any other items that were returned will not exist by the time control returns from the call_* function. The section Returning a list in a scalar context shows an example of this behavior.
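The "only the last element" rule mirrors what happens on the Perl side when a subroutine that returns a literal list is called in scalar context. The sketch below shows this in plain Perl, without any C; the subroutine name three_values is made up for the illustration.

```perl
use strict;
use warnings;

# Returns a literal list of three items.
sub three_values { return (10, 20, 30) }

my $last = three_values();    # scalar context: only the last item survives
my @all  = three_values();    # list context: every item is available

print "scalar context got $last\n";    # scalar context got 30
print "list context got @all\n";       # list context got 10 20 30
```

A C caller using G_SCALAR sees the equivalent of $last on the stack, while a caller using G_ARRAY sees the equivalent of @all.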

G_ARRAY

Calls the Perl subroutine in a list context.

As with G_SCALAR, this flag has 2 effects:

1.

It indicates to the subroutine being called that it is executing in a list context (if it executes wantarray the result will be true).

2.

It ensures that all items returned from the subroutine will be accessible when control returns from the call_* function.

The value returned by the call_* function indicates how many items have been returned by the Perl subroutine.

If 0, then you have specified the G_DISCARD flag.

If not 0, then it will be a count of the number of items returned by the subroutine. These items will be stored on the Perl stack. The section Returning a list of values gives an example of using the G_ARRAY flag and the mechanics of accessing the returned items from the Perl stack.

G_DISCARD

By default, the call_* functions place the items returned by the Perl subroutine on the stack. If you are not interested in these items, then setting this flag will make Perl get rid of them automatically for you. Note that it is still possible to indicate a context to the Perl subroutine by using either G_SCALAR or G_ARRAY.

If you do not set this flag then it is very important that you make sure that any temporaries (i.e., parameters passed to the Perl subroutine and values returned from the subroutine) are disposed of yourself. The section Returning a Scalar gives details of how to dispose of these temporaries explicitly and the section Using Perl to dispose of temporaries discusses the specific circumstances where you can ignore the problem and let Perl deal with it for you.

G_NOARGS

Whenever a Perl subroutine is called using one of the call_* functions, it is assumed by default that parameters are to be passed to the subroutine. If you are not passing any parameters to the Perl subroutine, you can save a bit of time by setting this flag. It has the effect of not creating the @_ array for the Perl subroutine.

Although the functionality provided by this flag may seem straightforward, it should be used only if there is a good reason to do so. The reason for being cautious is that, even if you have specified the G_NOARGS flag, it is still possible for the Perl subroutine that has been called to think that you have passed it parameters.

In fact, what can happen is that the Perl subroutine you have called can access the @_ array from a previous Perl subroutine. This will occur when the code that is executing the call_* function has itself been called from another Perl subroutine. The code below illustrates this:

    sub fred
      { print "@_\n" }

    sub joe
      { &fred }

    &joe(1,2,3);

This will print

    1 2 3

What has happened is that fred accesses the @_ array which belongs to joe .
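The difference between sharing the caller's @_ and starting with a fresh one can be seen by comparing the two call forms side by side. This is a small Perl-only sketch; the subroutine names are invented for the example.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

# Ampersand call without parentheses: count_args sees the caller's @_.
sub via_amp    { return &count_args }

# Ampersand call with parentheses: count_args gets a new, empty @_.
sub via_parens { return &count_args() }

print via_amp(1, 2, 3), " ", via_parens(1, 2, 3), "\n";    # 3 0
```

The first form behaves like the G_NOARGS situation described above, where no new @_ is created and the one from the enclosing subroutine remains visible.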

G_EVAL

It is possible for the Perl subroutine you are calling to terminate abnormally, e.g., by calling die explicitly or by not actually existing. By default, when either of these events occurs, the process will terminate immediately. If you want to trap this type of event, specify the G_EVAL flag. It will put an eval { } around the subroutine call.

Whenever control returns from the call_* function you need to check the $@ variable as you would in a normal Perl script.

The value returned from the call_* function is dependent on what other flags have been specified and whether an error has occurred. Here are all the different cases that can occur:

  • If the call_* function returns normally, then the value returned is as specified in the previous sections.

  • If G_DISCARD is specified, the return value will always be 0.

  • If G_ARRAY is specified and an error has occurred, the return value will always be 0.

  • If G_SCALAR is specified and an error has occurred, the return value will be 1 and the value on the top of the stack will be undef. This means that if you have already detected the error by checking $@ and you want the program to continue, you must remember to pop the undef from the stack.

See Using G_EVAL for details on using G_EVAL.
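Since G_EVAL wraps the call in an eval { }, the Perl-level equivalent may help in picturing what the C caller observes. The following sketch is plain Perl, not the C interface itself; the subroutine risky and the name no_such_sub are hypothetical.

```perl
use strict;
use warnings;

sub risky { die "something nasty occurred\n" }

# In scalar context a failed eval yields undef, and $@ holds the message.
my $result = eval { risky() };
my $error  = $@;
print "result is ", (defined $result ? $result : "undef"),
      ", error: $error";

# Calling a subroutine that does not exist is trapped in the same way.
my $called = eval { no strict 'refs'; &{"no_such_sub"}(); 1 };
my $missing_error = $@;
print "second call failed: $missing_error" unless $called;
```

The undef produced by the failed eval corresponds to the undef that a G_SCALAR|G_EVAL call leaves on the stack after an error.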

G_KEEPERR

Using the G_EVAL flag described above will always set $@ : clearing it if there was no error, and setting it to describe the error if there was an error in the called code. This is what you want if your intention is to handle possible errors, but sometimes you just want to trap errors and stop them interfering with the rest of the program.

This scenario will mostly be applicable to code that is meant to be called from within destructors, asynchronous callbacks, and signal handlers. In such situations, where the code being called has little relation to the surrounding dynamic context, the main program needs to be insulated from errors in the called code, even if they can't be handled intelligently. It may also be useful to do this with code for __DIE__ or __WARN__ hooks, and tie functions.

The G_KEEPERR flag is meant to be used in conjunction with G_EVAL in call_* functions that are used to implement such code, or with eval_sv . This flag has no effect on the call_* functions when G_EVAL is not used.

When G_KEEPERR is used, any error in the called code will terminate the call as usual, and the error will not propagate beyond the call (as usual for G_EVAL), but it will not go into $@ . Instead the error will be converted into a warning, prefixed with the string "\t(in cleanup)". This can be disabled using no warnings 'misc' . If there is no error, $@ will not be cleared.

Note that the G_KEEPERR flag does not propagate into inner evals; these may still set $@ .

The G_KEEPERR flag was introduced in Perl version 5.002.

See Using G_KEEPERR for an example of a situation that warrants the use of this flag.
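Destructors are one place where this behaviour can be observed from Perl code alone. In the sketch below, which uses a made-up Guard class, an error raised inside DESTROY is reported as a warning marked "(in cleanup)" instead of being stored in $@. A __WARN__ handler collects the warning so that it can be printed.

```perl
use strict;
use warnings;

package Guard;
sub new     { return bless {}, shift }
sub DESTROY { die "cleanup failed\n" }    # an error raised during cleanup

package main;

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

$@ = "earlier error\n";
{
    my $guard = Guard->new;    # destroyed at the end of this block
}

print "warning: $_" for @warnings;
print "\$\@ now holds: $@";
```

The value placed in $@ before the block is there only to show whether the destructor's error leaks into it.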

Determining the Context

As mentioned above, you can determine the context of the currently executing subroutine in Perl with wantarray. The equivalent test can be made in C by using the GIMME_V macro, which returns G_ARRAY if you have been called in a list context, G_SCALAR if in a scalar context, or G_VOID if in a void context (i.e., the return value will not be used). An older version of this macro is called GIMME ; in a void context it returns G_SCALAR instead of G_VOID . An example of using the GIMME_V macro is shown in section Using GIMME_V.
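On the Perl side, the three contexts can be told apart with wantarray, which returns undef, false or true. The sketch below records the context of each call; the subroutine name report_context and the @seen array are assumptions for this illustration.

```perl
use strict;
use warnings;

my @seen;

# Record the context in which the subroutine was called.
sub report_context {
    my $want = wantarray;
    push @seen, !defined($want) ? 'void'
              : $want           ? 'list'
              :                   'scalar';
    return;
}

report_context();                 # void context
my $scalar = report_context();    # scalar context
my @list   = report_context();    # list context

print "@seen\n";    # void scalar list
```

These three outcomes correspond to the G_VOID, G_SCALAR and G_ARRAY values that GIMME_V returns in C.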

EXAMPLES

Enough of the definition talk! Let's have a few examples.

Perl provides many macros to assist in accessing the Perl stack. Wherever possible, these macros should always be used when interfacing to Perl internals. We hope this should make the code less vulnerable to any changes made to Perl in the future.

Another point worth noting is that in the first series of examples I have made use of only the call_pv function. This has been done to keep the code simpler and ease you into the topic. Wherever possible, if the choice is between using call_pv and call_sv, you should always try to use call_sv. See Using call_sv for details.

No Parameters, Nothing Returned

This first trivial example will call a Perl subroutine, PrintUID, to print out the UID of the process.

    sub PrintUID
    {
        print "UID is $<\n";
    }

and here is a C function to call it

    static void
    call_PrintUID(void)
    {
        dSP;

        PUSHMARK(SP);
        call_pv("PrintUID", G_DISCARD|G_NOARGS);
    }

Simple, eh?

A few points to note about this example:

1.

Ignore dSP and PUSHMARK(SP) for now. They will be discussed in the next example.

2.

We aren't passing any parameters to PrintUID so G_NOARGS can be specified.

3.

We aren't interested in anything returned from PrintUID, so G_DISCARD is specified. Even if PrintUID was changed to return some value(s), having specified G_DISCARD will mean that they will be wiped by the time control returns from call_pv.

4.

As call_pv is being used, the Perl subroutine is specified as a C string. In this case the subroutine name has been 'hard-wired' into the code.

5.

Because we specified G_DISCARD, it is not necessary to check the value returned from call_pv. It will always be 0.

Passing Parameters

Now let's make a slightly more complex example. This time we want to call a Perl subroutine, LeftString, which will take 2 parameters--a string ($s) and an integer ($n). The subroutine will simply print the first $n characters of the string.

So the Perl subroutine would look like this:

    sub LeftString
    {
        my($s, $n) = @_;
        print substr($s, 0, $n), "\n";
    }

The C function required to call LeftString would look like this:

    static void
    call_LeftString(char *a, int b)
    {
        dSP;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSVpv(a, 0)));
        XPUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        call_pv("LeftString", G_DISCARD);

        FREETMPS;
        LEAVE;
    }

Here are a few notes on the C function call_LeftString.

1.

Parameters are passed to the Perl subroutine using the Perl stack. This is the purpose of the code beginning with the line dSP and ending with the line PUTBACK. The dSP declares a local copy of the stack pointer. This local copy should always be accessed as SP.

2.

If you are going to put something onto the Perl stack, you need to know where to put it. This is the purpose of the macro dSP--it declares and initializes a local copy of the Perl stack pointer.

All the other macros which will be used in this example require you to have used this macro.

The exception to this rule is if you are calling a Perl subroutine directly from an XSUB function. In this case it is not necessary to use the dSP macro explicitly--it will be declared for you automatically.

3.

Any parameters to be pushed onto the stack should be bracketed by the PUSHMARK and PUTBACK macros. The purpose of these two macros, in this context, is to count the number of parameters you are pushing automatically. Then whenever Perl is creating the @_ array for the subroutine, it knows how big to make it.

The PUSHMARK macro tells Perl to make a mental note of the current stack pointer. Even if you aren't passing any parameters (like the example shown in the section No Parameters, Nothing Returned) you must still call the PUSHMARK macro before you can call any of the call_* functions--Perl still needs to know that there are no parameters.

The PUTBACK macro sets the global copy of the stack pointer to be the same as our local copy. If we didn't do this, call_pv wouldn't know where the two parameters we pushed were--remember that up to now all the stack pointer manipulation we have done is with our local copy, not the global copy.

4.

Next, we come to XPUSHs. This is where the parameters actually get pushed onto the stack. In this case we are pushing a string and an integer.

See XSUBs and the Argument Stack in perlguts for details on how the XPUSH macros work.

5.

Because we created temporary values (by means of sv_2mortal() calls) we will have to tidy up the Perl stack and dispose of mortal SVs.

This is the purpose of

    ENTER;
    SAVETMPS;

at the start of the function, and

    FREETMPS;
    LEAVE;

at the end. The ENTER/SAVETMPS pair creates a boundary for any temporaries we create. This means that the temporaries we get rid of will be limited to those which were created after these calls.

The FREETMPS/LEAVE pair will get rid of any values returned by the Perl subroutine (see next example), plus it will also dump the mortal SVs we have created. Having ENTER/SAVETMPS at the beginning of the code makes sure that no other mortals are destroyed.

Think of these macros as working a bit like { and } in Perl to limit the scope of local variables.

See the section Using Perl to Dispose of Temporaries for details of an alternative to using these macros.

6.

Finally, LeftString can now be called via the call_pv function. The only flag specified this time is G_DISCARD. Because we are passing 2 parameters to the Perl subroutine this time, we have not specified G_NOARGS.
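The interplay of dSP, PUSHMARK, and PUTBACK described in note 3 can be modeled without Perl at all. The sketch below is a toy model in plain C, not the Perl API: every toy_* name is hypothetical, the stack holds plain ints, and the "callee" counts its arguments as the distance between the global stack pointer and the recorded mark. It illustrates why values pushed through a local copy of the pointer stay invisible to the callee until the local copy is published.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the Perl API. */
#define TOY_STACK_SIZE 16

static int  toy_stack[TOY_STACK_SIZE];
static int *toy_global_sp = toy_stack;  /* stands in for the interpreter's stack pointer */
static int *toy_mark      = toy_stack;  /* stands in for the mark saved by PUSHMARK */

/* What the callee sees: its arguments lie between the mark and the global pointer. */
size_t
toy_callee_arg_count(void)
{
    return (size_t)(toy_global_sp - toy_mark);
}

/* Push two values through a local pointer but omit the PUTBACK step. */
size_t
toy_call_without_putback(int a, int b)
{
    int *sp = toy_global_sp;                      /* like dSP: take a local copy */

    assert(sp + 2 < toy_stack + TOY_STACK_SIZE);  /* room for two values */
    toy_mark = sp;                                /* like PUSHMARK(SP) */
    *++sp = a;                                    /* like XPUSHs: only the local copy moves */
    *++sp = b;
    return toy_callee_arg_count();                /* the global pointer never moved */
}

/* The same pushes, followed by the PUTBACK step. */
size_t
toy_call_with_putback(int a, int b)
{
    size_t seen;
    int *sp = toy_global_sp;

    assert(sp + 2 < toy_stack + TOY_STACK_SIZE);
    toy_mark = sp;
    *++sp = a;
    *++sp = b;
    toy_global_sp = sp;           /* like PUTBACK: publish the local copy */
    seen = toy_callee_arg_count();
    toy_global_sp = toy_mark;     /* the callee consumes its arguments */
    return seen;
}
```

In this model toy_call_without_putback(7, 4) reports that the callee saw no arguments, while toy_call_with_putback(7, 4) reports two.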

Returning a Scalar

Now for an example of dealing with the items returned from a Perl subroutine.

Here is a Perl subroutine, Adder, that takes 2 integer parameters and simply returns their sum.

    sub Adder
    {
        my($a, $b) = @_;
        $a + $b;
    }

Because we are now concerned with the return value from Adder, the C function required to call it is now a bit more complex.

    static void
    call_Adder(int a, int b)
    {
        dSP;
        int count;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSViv(a)));
        XPUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        count = call_pv("Adder", G_SCALAR);

        SPAGAIN;

        if (count != 1)
            croak("Big trouble\n");

        printf ("The sum of %d and %d is %d\n", a, b, POPi);

        PUTBACK;
        FREETMPS;
        LEAVE;
    }

Points to note this time are

1.

The only flag specified this time was G_SCALAR. That means that the @_ array will be created and that the value returned by Adder will still exist after the call to call_pv.

2.

The purpose of the macro SPAGAIN is to refresh the local copy of the stack pointer. This is necessary because it is possible that the memory allocated to the Perl stack has been reallocated during the call_pv call.

If you are making use of the Perl stack pointer in your code you must always refresh the local copy using SPAGAIN whenever you make use of the call_* functions or any other Perl internal function.

3.

Although only a single value was expected to be returned from Adder, it is still good practice to check the return code from call_pv anyway.

Expecting a single value is not quite the same as knowing that there will be one. If someone modified Adder to return a list and we didn't check for that possibility and take appropriate action the Perl stack would end up in an inconsistent state. That is something you really don't want to happen ever.

4.

The POPi macro is used here to pop the return value from the stack. In this case we wanted an integer, so POPi was used.

Here is the complete list of POP macros available, along with the types they return.

    POPs    SV
    POPp    pointer
    POPn    double
    POPi    integer
    POPl    long

5.

The final PUTBACK is used to leave the Perl stack in a consistent state before exiting the function. This is necessary because when we popped the return value from the stack with POPi it updated only our local copy of the stack pointer. Remember, PUTBACK sets the global stack pointer to be the same as our local copy.

Returning a List of Values

Now, let's extend the previous example to return both the sum of the parameters and the difference.

Here is the Perl subroutine

    sub AddSubtract
    {
        my($a, $b) = @_;
        ($a+$b, $a-$b);
    }

and this is the C function

    static void
    call_AddSubtract(int a, int b)
    {
        dSP;
        int count;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSViv(a)));
        XPUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        count = call_pv("AddSubtract", G_ARRAY);

        SPAGAIN;

        if (count != 2)
            croak("Big trouble\n");

        printf ("%d - %d = %d\n", a, b, POPi);
        printf ("%d + %d = %d\n", a, b, POPi);

        PUTBACK;
        FREETMPS;
        LEAVE;
    }

If call_AddSubtract is called like this

    call_AddSubtract(7, 4);

then here is the output

    7 - 4 = 3
    7 + 4 = 11

Notes

1.

We wanted list context, so G_ARRAY was used.

2.

Not surprisingly POPi is used twice this time because we were retrieving 2 values from the stack. The important thing to note is that when using the POP* macros they come off the stack in reverse order.
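The reverse ordering is a property of any last-in, first-out stack and can be seen in a few lines of plain C. This is a toy model, not the Perl API: the toy_* names are hypothetical and the stack holds plain ints. The stand-in for AddSubtract pushes the sum first and the difference second, so the caller's first pop yields the difference.

```c
#include <assert.h>

/* Toy model only: none of these names exist in the Perl API. */
#define TOY_DEPTH 8

static int toy_values[TOY_DEPTH];
static int toy_top = 0;              /* index of the next free slot */

void
toy_push(int v)
{
    assert(toy_top < TOY_DEPTH);
    toy_values[toy_top++] = v;
}

int
toy_pop(void)
{
    assert(toy_top > 0);
    return toy_values[--toy_top];
}

/* Stand-in for AddSubtract: "returns" the sum first, then the difference. */
void
toy_add_subtract(int a, int b)
{
    toy_push(a + b);
    toy_push(a - b);
}

/* Caller: the first pop yields the LAST value that was returned. */
void
toy_call_add_subtract(int a, int b, int *sum, int *difference)
{
    toy_add_subtract(a, b);
    *difference = toy_pop();
    *sum        = toy_pop();
}
```

Calling toy_call_add_subtract(7, 4, &sum, &difference) leaves 11 in sum and 3 in difference, mirroring the order of the two printf calls in call_AddSubtract.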

Returning a List in a Scalar Context

Say the Perl subroutine in the previous section was called in a scalar context, like this

    static void
    call_AddSubScalar(int a, int b)
    {
        dSP;
        int count;
        int i;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSViv(a)));
        XPUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        count = call_pv("AddSubtract", G_SCALAR);

        SPAGAIN;

        printf ("Items Returned = %d\n", count);

        for (i = 1; i <= count; ++i)
            printf ("Value %d = %d\n", i, POPi);

        PUTBACK;
        FREETMPS;
        LEAVE;
    }

The other modification made is that call_AddSubScalar will print the number of items returned from the Perl subroutine and their value (for simplicity it assumes that they are integer). So if call_AddSubScalar is called

    call_AddSubScalar(7, 4);

then the output will be

    Items Returned = 1
    Value 1 = 3

In this case the main point to note is that only the last item in the list returned by AddSubtract actually made it back to call_AddSubScalar.

Returning Data from Perl via the Parameter List

It is also possible to return values directly via the parameter list--whether it is actually desirable to do it is another matter entirely.

The Perl subroutine, Inc, below takes 2 parameters and increments each directly.

    sub Inc
    {
        ++ $_[0];
        ++ $_[1];
    }

and here is a C function to call it.

    static void
    call_Inc(int a, int b)
    {
        dSP;
        int count;
        SV * sva;
        SV * svb;

        ENTER;
        SAVETMPS;

        sva = sv_2mortal(newSViv(a));
        svb = sv_2mortal(newSViv(b));

        PUSHMARK(SP);
        XPUSHs(sva);
        XPUSHs(svb);
        PUTBACK;

        count = call_pv("Inc", G_DISCARD);

        if (count != 0)
            croak ("call_Inc: expected 0 values from 'Inc', got %d\n",
                   count);

        printf ("%d + 1 = %d\n", a, SvIV(sva));
        printf ("%d + 1 = %d\n", b, SvIV(svb));

        FREETMPS;
        LEAVE;
    }

To be able to access the two parameters that were pushed onto the stack after they return from call_pv it is necessary to make a note of their addresses--thus the two variables sva and svb.

The reason this is necessary is that the area of the Perl stack which held them will very likely have been overwritten by something else by the time control returns from call_pv.

Using G_EVAL

Now an example using G_EVAL. Below is a Perl subroutine which computes the difference of its 2 parameters. If the result would be negative, the subroutine calls die.

    sub Subtract
    {
        my ($a, $b) = @_;
        die "death can be fatal\n" if $a < $b;
        $a - $b;
    }

and some C to call it

    static void
    call_Subtract(int a, int b)
    {
        dSP;
        int count;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSViv(a)));
        XPUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        count = call_pv("Subtract", G_EVAL|G_SCALAR);

        SPAGAIN;

        /* Check the eval first */
        if (SvTRUE(ERRSV))
        {
            printf ("Uh oh - %s\n", SvPV_nolen(ERRSV));
            POPs;
        }
        else
        {
            if (count != 1)
                croak("call_Subtract: wanted 1 value from 'Subtract', got %d\n",
                      count);

            printf ("%d - %d = %d\n", a, b, POPi);
        }

        PUTBACK;
        FREETMPS;
        LEAVE;
    }

If call_Subtract is called thus

    call_Subtract(4, 5)

the following will be printed

    Uh oh - death can be fatal

Notes

1.

We want to be able to catch the die so we have used the G_EVAL flag. Not specifying this flag would mean that the program would terminate immediately at the die statement in the subroutine Subtract.

2.

The code

    if (SvTRUE(ERRSV))
    {
        printf ("Uh oh - %s\n", SvPV_nolen(ERRSV));
        POPs;
    }

is the direct equivalent of this bit of Perl

    print "Uh oh - $@\n" if $@;

PL_errgv is a perl global of type GV * that points to the symbol table entry containing the error. ERRSV therefore refers to the C equivalent of $@.

3.

Note that the stack is popped using POPs in the block where SvTRUE(ERRSV) is true. This is necessary because whenever a call_* function invoked with G_EVAL|G_SCALAR returns an error, the top of the stack holds the value undef. Because we want the program to continue after detecting this error, it is essential that the stack be tidied up by removing the undef.

Using G_KEEPERR

Consider this rather facetious example, where we have used an XS version of the call_Subtract example above inside a destructor:

    package Foo;
    sub new { bless {}, $_[0] }
    sub Subtract {
        my($a,$b) = @_;
        die "death can be fatal" if $a < $b;
        $a - $b;
    }
    sub DESTROY { call_Subtract(5, 4); }
    sub foo { die "foo dies"; }

    package main;
    {
        my $foo = Foo->new;
        eval { $foo->foo };
    }
    print "Saw: $@" if $@;             # should be, but isn't

This example will fail to recognize that an error occurred inside the eval {}. Here's why: the call_Subtract code got executed while perl was cleaning up temporaries when exiting the outer braced block, and because call_Subtract is implemented with call_pv using the G_EVAL flag, it promptly reset $@. This results in the failure of the outermost test for $@, and thereby the failure of the error trap.

Appending the G_KEEPERR flag, so that the call_pv call in call_Subtract reads:

    count = call_pv("Subtract", G_EVAL|G_SCALAR|G_KEEPERR);

will preserve the error and restore reliable error handling.

Using call_sv

In all the previous examples I have 'hard-wired' the name of the Perl subroutine to be called from C. Most of the time though, it is more convenient to be able to specify the name of the Perl subroutine from within the Perl script.

Consider the Perl code below

    sub fred
    {
        print "Hello there\n";
    }

    CallSubPV("fred");

Here is a snippet of XSUB which defines CallSubPV.

    void
    CallSubPV(name)
        char *  name
        CODE:
        PUSHMARK(SP);
        call_pv(name, G_DISCARD|G_NOARGS);

That is fine as far as it goes. The limitation is that the Perl subroutine can be specified only as a string, whereas Perl also allows references to subroutines and anonymous subroutines. This is where call_sv is useful.

The code below for CallSubSV is identical to CallSubPV except that the name parameter is now defined as an SV* and we use call_sv instead of call_pv.

    void
    CallSubSV(name)
        SV *    name
        CODE:
        PUSHMARK(SP);
        call_sv(name, G_DISCARD|G_NOARGS);

Because we are using an SV to call fred the following can all be used:

    CallSubSV("fred");
    CallSubSV(\&fred);
    $ref = \&fred;
    CallSubSV($ref);
    CallSubSV( sub { print "Hello there\n" } );

As you can see, call_sv gives you much greater flexibility in how you can specify the Perl subroutine.

You should note that, if it is necessary to store the SV (name in the example above) which corresponds to the Perl subroutine so that it can be used later in the program, it is not enough just to store a copy of the pointer to the SV. Say the code above had been like this:

    static SV * rememberSub;

    void
    SaveSub1(name)
        SV *    name
        CODE:
        rememberSub = name;

    void
    CallSavedSub1()
        CODE:
        PUSHMARK(SP);
        call_sv(rememberSub, G_DISCARD|G_NOARGS);

The reason this is wrong is that, by the time you come to use the pointer rememberSub in CallSavedSub1, it may or may not still refer to the Perl subroutine that was recorded in SaveSub1. This is particularly true for these cases:

    SaveSub1(\&fred);
    CallSavedSub1();

    SaveSub1( sub { print "Hello there\n" } );
    CallSavedSub1();

By the time each of the SaveSub1 statements above has been executed, the SV*s which corresponded to the parameters will no longer exist. Expect an error message from Perl of the form

    Can't use an undefined value as a subroutine reference at ...

for each of the CallSavedSub1 lines.

Similarly, with this code

    $ref = \&fred;
    SaveSub1($ref);
    $ref = 47;
    CallSavedSub1();

you can expect one of these messages (which one you actually get depends on the version of Perl you are using)

    Not a CODE reference at ...
    Undefined subroutine &main::47 called ...

The variable $ref may have referred to the subroutine fred when the call to SaveSub1 was made, but by the time CallSavedSub1 gets called it holds the number 47. Because we saved only a pointer to the original SV in SaveSub1, any changes to $ref will be tracked by the pointer rememberSub. This means that whenever CallSavedSub1 gets called, it will attempt to execute the code which is referenced by the SV* rememberSub. In this case though, it now refers to the integer 47, so expect Perl to complain loudly.

A similar but more subtle problem is illustrated with this code:

    $ref = \&fred;
    SaveSub1($ref);
    $ref = \&joe;
    CallSavedSub1();

This time whenever CallSavedSub1 gets called it will execute the Perl subroutine joe (assuming it exists) rather than fred as was originally requested in the call to SaveSub1.

To get around these problems it is necessary to take a full copy of the SV. The code below shows SaveSub2 modified to do that.

    static SV * keepSub = (SV*)NULL;

    void
    SaveSub2(name)
        SV *    name
        CODE:
        /* Take a copy of the callback */
        if (keepSub == (SV*)NULL)
            /* First time, so create a new SV */
            keepSub = newSVsv(name);
        else
            /* Been here before, so overwrite */
            SvSetSV(keepSub, name);

    void
    CallSavedSub2()
        CODE:
        PUSHMARK(SP);
        call_sv(keepSub, G_DISCARD|G_NOARGS);

To avoid creating a new SV every time SaveSub2 is called, the function first checks to see if it has been called before. If not, then space for a new SV is allocated and the reference to the Perl subroutine name is copied to the variable keepSub in one operation using newSVsv. Thereafter, whenever SaveSub2 is called, the existing SV, keepSub, is overwritten with the new value using SvSetSV.
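The difference between remembering a pointer and taking a copy does not depend on Perl and can be reproduced in plain C. The sketch below is a toy model with hypothetical toy_* names: a function-pointer variable stands in for the Perl scalar $ref, toy_save_sub1 keeps only the address of that variable (as SaveSub1 does), and toy_save_sub2 copies its current value (as SaveSub2 does).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a function-pointer variable stands in for the scalar $ref. */
typedef int (*ToySub)(void);

int toy_fred(void) { return 1; }
int toy_joe(void)  { return 2; }

static ToySub *toy_remember_sub = NULL;  /* SaveSub1 style: address of the caller's variable */
static ToySub  toy_keep_sub     = NULL;  /* SaveSub2 style: private copy of its value */

void toy_save_sub1(ToySub *ref) { toy_remember_sub = ref; }
void toy_save_sub2(ToySub *ref) { toy_keep_sub = *ref; }

/* Calls whatever the caller's variable holds NOW. */
int
toy_call_saved_sub1(void)
{
    assert(toy_remember_sub != NULL && *toy_remember_sub != NULL);
    return (*toy_remember_sub)();
}

/* Calls whatever the variable held when it was saved. */
int
toy_call_saved_sub2(void)
{
    assert(toy_keep_sub != NULL);
    return toy_keep_sub();
}
```

If a variable holding toy_fred is saved both ways and then reassigned to toy_joe, toy_call_saved_sub1 runs toy_joe while toy_call_saved_sub2 still runs toy_fred.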

Using call_argv

Here is a Perl subroutine which prints whatever parameters are passed to it.

    sub PrintList
    {
        my(@list) = @_;
        foreach (@list) { print "$_\n" }
    }

And here is an example of call_argv which will call PrintList.

    static char * words[] = {"alpha", "beta", "gamma", "delta", NULL};

    static void
    call_PrintList(void)
    {
        dSP;

        call_argv("PrintList", G_DISCARD, words);
    }

Note that it is not necessary to call PUSHMARK in this instance. This is because call_argv will do it for you.

Using call_method

Consider the following Perl code:

    {
        package Mine;

        sub new
        {
            my($type) = shift;
            bless [@_]
        }

        sub Display
        {
            my ($self, $index) = @_;
            print "$index: $$self[$index]\n";
        }

        sub PrintID
        {
            my($class) = @_;
            print "This is Class $class version 1.0\n";
        }
    }

It implements just a very simple class to manage an array. Apart from the constructor, new, it declares methods, one static and one virtual. The static method, PrintID, simply prints out the class name and a version number. The virtual method, Display, prints out a single element of the array. Here is an all-Perl example of using it.

    $a = Mine->new('red', 'green', 'blue');
    $a->Display(1);
    Mine->PrintID;

will print

    1: green
    This is Class Mine version 1.0

Calling a Perl method from C is fairly straightforward. The following things are required:

  • A reference to the object for a virtual method or the name of the class for a static method

  • The name of the method

  • Any other parameters specific to the method

Here is a simple XSUB which illustrates the mechanics of calling both the PrintID and Display methods from C.

    void
    call_Method(ref, method, index)
        SV *    ref
        char *  method
        int     index
        CODE:
        PUSHMARK(SP);
        XPUSHs(ref);
        XPUSHs(sv_2mortal(newSViv(index)));
        PUTBACK;

        call_method(method, G_DISCARD);

    void
    call_PrintID(class, method)
        char *  class
        char *  method
        CODE:
        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSVpv(class, 0)));
        PUTBACK;

        call_method(method, G_DISCARD);

So the methods PrintID and Display can be invoked like this:

    $a = Mine->new('red', 'green', 'blue');
    call_Method($a, 'Display', 1);
    call_PrintID('Mine', 'PrintID');

The only thing to note is that, in both the static and virtual methods, the method name is not passed via the stack--it is used as the first parameter to call_method.

Using GIMME_V

Here is a trivial XSUB which prints the context in which it is currently executing.

    void
    PrintContext()
        CODE:
        I32 gimme = GIMME_V;
        if (gimme == G_VOID)
            printf ("Context is Void\n");
        else if (gimme == G_SCALAR)
            printf ("Context is Scalar\n");
        else
            printf ("Context is Array\n");

And here is some Perl to test it.

    PrintContext;
    $a = PrintContext;
    @a = PrintContext;

The output from that will be

    Context is Void
    Context is Scalar
    Context is Array

Using Perl to Dispose of Temporaries

In the examples given to date, any temporaries created in the callback (i.e., parameters passed on the stack to the call_* function or values returned via the stack) have been freed by one of these methods:

  • Specifying the G_DISCARD flag with call_*

  • Explicitly using the ENTER/SAVETMPS--FREETMPS/LEAVE pairing

There is another method which can be used, namely letting Perl do it for you automatically whenever it regains control after the callback has terminated. This is done by simply not using the

    ENTER;
    SAVETMPS;
    ...
    FREETMPS;
    LEAVE;

sequence in the callback (and not, of course, specifying the G_DISCARD flag).

If you are going to use this method you have to be aware of a possible memory leak which can arise under very specific circumstances. To explain these circumstances you need to know a bit about the flow of control between Perl and the callback routine.

The examples given at the start of the document (an error handler and an event driven program) are typical of the two main sorts of flow control that you are likely to encounter with callbacks. There is a very important distinction between them, so pay attention.

In the first example, an error handler, the flow of control could be as follows. You have created an interface to an external library. Control can reach the external library like this

    perl --> XSUB --> external library

Whilst control is in the library, an error condition occurs. You have previously set up a Perl callback to handle this situation, so it will get executed. Once the callback has finished, control will drop back to Perl again. Here is what the flow of control will be like in that situation

    perl --> XSUB --> external library
                      ...
                      error occurs
                      ...
                      external library --> call_* --> perl
                                                       |
    perl <-- XSUB <-- external library <-- call_* <----+

After processing of the error using call_* is completed, control reverts back to Perl more or less immediately.

In the diagram, the further right you go the more deeply nested the scope is. It is only when control is back with perl on the extreme left of the diagram that you will have dropped back to the enclosing scope and any temporaries you have left hanging around will be freed.

In the second example, an event driven program, the flow of control will be more like this

    perl --> XSUB --> event handler
                      ...
                      event handler --> call_* --> perl
                                                    |
                      event handler <-- call_* <----+
                      ...
                      event handler --> call_* --> perl
                                                    |
                      event handler <-- call_* <----+
                      ...
                      event handler --> call_* --> perl
                                                    |
                      event handler <-- call_* <----+

In this case the flow of control can consist of only the repeated sequence

    event handler --> call_* --> perl

for practically the complete duration of the program. This means that control may never drop back to the surrounding scope in Perl at the extreme left.

So what is the big problem? Well, if you are expecting Perl to tidy up those temporaries for you, you might be in for a long wait. For Perl to dispose of your temporaries, control must drop back to the enclosing scope at some stage. In the event driven scenario that may never happen. This means that, as time goes on, your program will create more and more temporaries, none of which will ever be freed. As each of these temporaries consumes some memory your program will eventually consume all the available memory in your system--kapow!

So here is the bottom line--if you are sure that control will revert back to the enclosing Perl scope fairly quickly after the end of your callback, then it isn't absolutely necessary to dispose explicitly of any temporaries you may have created. Mind you, if you are at all uncertain about what to do, it doesn't do any harm to tidy up anyway.
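The build-up of temporaries in the event-driven case can be modeled in plain C. This is a toy model, not Perl's implementation: the toy_* names are hypothetical, toy_new_mortal loosely plays the role of creating a mortal SV, toy_save_tmps that of SAVETMPS, and toy_free_tmps that of FREETMPS. The event loop never returns to an enclosing scope, so anything a callback does not free itself stays pending.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: none of these names exist in the Perl API. */
#define TOY_MAX_TMPS 1024

static void *toy_tmps[TOY_MAX_TMPS];   /* temporaries waiting to be freed */
static int   toy_tmps_count = 0;

/* Allocate a temporary and queue it for later disposal. */
void *
toy_new_mortal(size_t size)
{
    void *p = malloc(size);

    assert(p != NULL && toy_tmps_count < TOY_MAX_TMPS);
    toy_tmps[toy_tmps_count++] = p;
    return p;
}

/* Remember how many temporaries existed on entry (the "floor"). */
int
toy_save_tmps(void)
{
    return toy_tmps_count;
}

/* Free every temporary created since the given floor. */
void
toy_free_tmps(int floor)
{
    while (toy_tmps_count > floor)
        free(toy_tmps[--toy_tmps_count]);
}

/* A callback that creates two temporaries and optionally tidies up. */
void
toy_callback(int tidy_up)
{
    int floor = toy_save_tmps();

    (void)toy_new_mortal(32);
    (void)toy_new_mortal(32);
    if (tidy_up)
        toy_free_tmps(floor);
}

/* Run the callback repeatedly; return how many temporaries are left pending. */
int
toy_event_loop(int events, int tidy_up)
{
    int i;
    int start = toy_tmps_count;

    for (i = 0; i < events; ++i)
        toy_callback(tidy_up);
    return toy_tmps_count - start;
}
```

With explicit tidying, toy_event_loop(100, 1) leaves nothing pending; without it, toy_event_loop(100, 0) leaves 200 temporaries allocated, and the number grows with every further event.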

Strategies for Storing Callback Context Information

Potentially one of the trickiest problems to overcome when designing a callback interface can be figuring out how to store the mapping between the C callback function and the Perl equivalent.

To help understand why this can be a real problem first consider how a callback is set up in an all C environment. Typically a C API will provide a function to register a callback. This will expect a pointer to a function as one of its parameters. Below is a call to a hypothetical function register_fatal which registers the C function to get called when a fatal error occurs.

    register_fatal(cb1);

The single parameter cb1 is a pointer to a function, so you must have defined cb1 in your code, say something like this

    static void
    cb1(void)
    {
        printf ("Fatal Error\n");
        exit(1);
    }

Now change that to call a Perl subroutine instead

    static SV * callback = (SV*)NULL;

    static void
    cb1(void)
    {
        dSP;

        PUSHMARK(SP);

        /* Call the Perl sub to process the callback */
        call_sv(callback, G_DISCARD);
    }

    void
    register_fatal(fn)
        SV *    fn
        CODE:
        /* Remember the Perl sub */
        if (callback == (SV*)NULL)
            callback = newSVsv(fn);
        else
            SvSetSV(callback, fn);

        /* register the callback with the external library */
        register_fatal(cb1);

where the Perl equivalent of register_fatal and the callback it registers, pcb1, might look like this

    # Register the sub pcb1
    register_fatal(\&pcb1);

    sub pcb1
    {
        die "I'm dying...\n";
    }

The mapping between the C callback and the Perl equivalent is stored in the global variable callback.

This will be adequate if you only ever need one callback registered at any time. An example could be an error handler like the code sketched out above. Remember though, repeated calls to register_fatal will replace the previously registered callback function with the new one.

Say for example you want to interface to a library which allows asynchronous file i/o. In this case you may be able to register a callback whenever a read operation has completed. To be of any use we want to be able to call separate Perl subroutines for each file that is opened. As it stands, the error handler example above would not be adequate as it allows only a single callback to be defined at any time. What we require is a means of storing the mapping between the opened file and the Perl subroutine we want to be called for that file.

Say the i/o library has a function asynch_read which associates a C function ProcessRead with a file handle fh--this assumes that it has also provided some routine to open the file and so obtain the file handle.

    asynch_read(fh, ProcessRead)

This may expect the C ProcessRead function to be of this form

    void
    ProcessRead(int fh, char *buffer)
    {
        ...
    }

To provide a Perl interface to this library we need to be able to map between the fh parameter and the Perl subroutine we want called. A hash is a convenient mechanism for storing this mapping. The code below shows a possible implementation

    static HV * Mapping = (HV*)NULL;

    void
    asynch_read(fh, callback)
        int     fh
        SV *    callback
        CODE:
        /* If the hash doesn't already exist, create it */
        if (Mapping == (HV*)NULL)
            Mapping = newHV();

        /* Save the fh -> callback mapping */
        hv_store(Mapping, (char*)&fh, sizeof(fh), newSVsv(callback), 0);

        /* Register with the C Library */
        asynch_read(fh, asynch_read_if);

and asynch_read_if could look like this

    static void
    asynch_read_if(int fh, char *buffer)
    {
        dSP;
        SV ** sv;

        /* Get the callback associated with fh */
        sv = hv_fetch(Mapping, (char*)&fh, sizeof(fh), FALSE);
        if (sv == (SV**)NULL)
            croak("Internal error...\n");

        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSViv(fh)));
        XPUSHs(sv_2mortal(newSVpv(buffer, 0)));
        PUTBACK;

        /* Call the Perl sub */
        call_sv(*sv, G_DISCARD);
    }

For completeness, here is asynch_close. This shows how to remove the entry from the hash Mapping.

    void
    asynch_close(fh)
        int     fh
        CODE:
        /* Remove the entry from the hash */
        (void) hv_delete(Mapping, (char*)&fh, sizeof(fh), G_DISCARD);

        /* Now call the real asynch_close */
        asynch_close(fh);

So the Perl interface would look like this

    sub callback1
    {
        my($handle, $buffer) = @_;
    }

    # Register the Perl callback
    asynch_read($fh, \&callback1);

    asynch_close($fh);

The mapping between the C callback and Perl is stored in the global hash Mapping this time. Using a hash has the distinct advantage that it allows an unlimited number of callbacks to be registered.

What if the interface provided by the C callback doesn't contain a parameter that lets you map from the file handle to the Perl subroutine? Say in the asynchronous i/o package, the callback function gets passed only the buffer parameter like this

    void
    ProcessRead(char *buffer)
    {
        ...
    }

Without the file handle there is no straightforward way to map from the C callback to the Perl subroutine.

In this case a possible way around this problem is to predefine a series of C functions to act as the interface to Perl, thus

    #define MAX_CB          3
    #define NULL_HANDLE     -1
    typedef void (*FnMap)(char *);

    struct MapStruct {
        FnMap    Function;
        SV *     PerlSub;
        int      Handle;
    };

    static void fn1(char *);
    static void fn2(char *);
    static void fn3(char *);

    static struct MapStruct Map [MAX_CB] =
        {
            { fn1, NULL, NULL_HANDLE },
            { fn2, NULL, NULL_HANDLE },
            { fn3, NULL, NULL_HANDLE }
        };

    static void
    Pcb(int index, char *buffer)
    {
        dSP;

        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSVpv(buffer, 0)));
        PUTBACK;

        /* Call the Perl sub */
        call_sv(Map[index].PerlSub, G_DISCARD);
    }

    static void
    fn1(char *buffer)
    {
        Pcb(0, buffer);
    }

    static void
    fn2(char *buffer)
    {
        Pcb(1, buffer);
    }

    static void
    fn3(char *buffer)
    {
        Pcb(2, buffer);
    }

    void
    array_asynch_read(fh, callback)
        int     fh
        SV *    callback
        CODE:
        int index;
        int null_index = MAX_CB;

        /* Find the same handle or an empty entry */
        for (index = 0; index < MAX_CB; ++index)
        {
            if (Map[index].Handle == fh)
                break;

            if (Map[index].Handle == NULL_HANDLE)
                null_index = index;
        }

        if (index == MAX_CB && null_index == MAX_CB)
            croak ("Too many callback functions registered\n");

        if (index == MAX_CB)
            index = null_index;

        /* Save the file handle */
        Map[index].Handle = fh;

        /* Remember the Perl sub */
        if (Map[index].PerlSub == (SV*)NULL)
            Map[index].PerlSub = newSVsv(callback);
        else
            SvSetSV(Map[index].PerlSub, callback);

        asynch_read(fh, Map[index].Function);

    void
    array_asynch_close(fh)
        int     fh
        CODE:
        int index;

        /* Find the file handle */
        for (index = 0; index < MAX_CB; ++ index)
            if (Map[index].Handle == fh)
                break;

        if (index == MAX_CB)
            croak ("could not close fh %d\n", fh);

        Map[index].Handle = NULL_HANDLE;
        SvREFCNT_dec(Map[index].PerlSub);
        Map[index].PerlSub = (SV*)NULL;

        asynch_close(fh);

In this case the functions fn1, fn2, and fn3 are used to remember the Perl subroutine to be called. Each of the functions holds a separate hard-wired index which is used in the function Pcb to access the Map array and actually call the Perl subroutine.

There are some obvious disadvantages with this technique.

Firstly, the code is considerably more complex than with the previous example.

Secondly, there is a hard-wired limit (in this case 3) to the number of callbacks that can exist simultaneously. The only way to increase the limit is by modifying the code to add more functions and then recompiling. Nonetheless, as long as the number of functions is chosen with some care, it is still a workable solution and in some cases is the only one available.

To summarize, here are a number of possible methods for you to consider for storing the mapping between C and the Perl callback:

1. Ignore the problem - Allow only 1 callback

For a lot of situations, like interfacing to an error handler, this may be a perfectly adequate solution.

2. Create a sequence of callbacks - hard wired limit

If it is impossible to tell from the parameters passed back from the C callback what the context is, then you may need to create a sequence of C callback interface functions, and store pointers to each in an array.

3. Use a parameter to map to the Perl callback

A hash is an ideal mechanism to store the mapping between C and Perl.
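The second method can be tried out without linking against Perl at all. What follows is only a sketch of the same idea in plain C: a fixed table of hard-wired trampolines, each of which knows its own index. The names raw_cb, handler_fn, register_handler, unregister_handler, dispatch and remember are made up for this illustration, handler_fn stands in for the Perl sub, and handles are assumed to be non-negative so that -1 can mark a free slot.

```c
#include <stddef.h>

#define MAX_CB      3
#define NULL_HANDLE -1

/* The context-free callback type imposed by the (hypothetical) C library. */
typedef void (*raw_cb)(const char *buffer);

/* What we actually want to run: a handler that is told its handle. */
typedef void (*handler_fn)(int handle, const char *buffer);

struct map_entry {
    raw_cb     function;   /* hard-wired trampoline for this slot */
    handler_fn handler;    /* stands in for the Perl sub          */
    int        handle;     /* NULL_HANDLE when the slot is free   */
};

static void fn1(const char *buffer);
static void fn2(const char *buffer);
static void fn3(const char *buffer);

static struct map_entry map[MAX_CB] = {
    { fn1, NULL, NULL_HANDLE },
    { fn2, NULL, NULL_HANDLE },
    { fn3, NULL, NULL_HANDLE }
};

/* Common code: the index tells us which handle and handler are meant. */
static void dispatch(int index, const char *buffer)
{
    if (map[index].handler != NULL)
        map[index].handler(map[index].handle, buffer);
}

/* Each trampoline carries nothing but its own index. */
static void fn1(const char *buffer) { dispatch(0, buffer); }
static void fn2(const char *buffer) { dispatch(1, buffer); }
static void fn3(const char *buffer) { dispatch(2, buffer); }

/* Returns the trampoline to hand to the C library, or NULL if all
 * MAX_CB slots are taken by other handles. */
static raw_cb register_handler(int handle, handler_fn handler)
{
    int index, free_index = MAX_CB;

    /* Find the same handle, or remember the first empty entry */
    for (index = 0; index < MAX_CB; ++index) {
        if (map[index].handle == handle)
            break;
        if (map[index].handle == NULL_HANDLE && free_index == MAX_CB)
            free_index = index;
    }
    if (index == MAX_CB)
        index = free_index;
    if (index == MAX_CB)
        return NULL;

    map[index].handle  = handle;
    map[index].handler = handler;
    return map[index].function;
}

/* Returns 0 on success, -1 if the handle was not registered. */
static int unregister_handler(int handle)
{
    int index;

    for (index = 0; index < MAX_CB; ++index) {
        if (map[index].handle == handle) {
            map[index].handle  = NULL_HANDLE;
            map[index].handler = NULL;
            return 0;
        }
    }
    return -1;
}

/* A sample handler that just records which handle it was called for. */
static int last_handle = NULL_HANDLE;

static void remember(int handle, const char *buffer)
{
    (void) buffer;
    last_handle = handle;
}
```

Calling the pointer returned by register_handler(7, remember) with any buffer ends up in remember with handle 7, even though the pointer itself takes no handle. A fourth distinct handle is refused, which is the hard-wired limit discussed above.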

Alternate Stack Manipulation

Although I have made use of only the POP* macros to access values returned from Perl subroutines, it is also possible to bypass these macros and read the stack using the ST macro (See perlxs for a full description of the ST macro).

Most of the time the POP* macros should be adequate; the main problem with them is that they force you to process the returned values in sequence. This may not be the most suitable way to process the values in some cases. What we want is to be able to access the stack in a random order. The ST macro as used when coding an XSUB is ideal for this purpose.

The code below is the example given in the section "Returning a List of Values" recoded to use ST instead of POP*.

    static void
    call_AddSubtract2(a, b)
    int a;
    int b;
    {
        dSP;
        I32 ax;
        int count;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSViv(a)));
        XPUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        count = call_pv("AddSubtract", G_ARRAY);

        SPAGAIN;
        SP -= count;
        ax = (SP - PL_stack_base) + 1;

        if (count != 2)
            croak("Big trouble\n");

        printf ("%d + %d = %d\n", a, b, SvIV(ST(0)));
        printf ("%d - %d = %d\n", a, b, SvIV(ST(1)));

        PUTBACK;
        FREETMPS;
        LEAVE;
    }

Notes

1. Notice that it was necessary to define the variable ax. This is because the ST macro expects it to exist. If we were in an XSUB it would not be necessary to define ax as it is already defined for us.

2. The code

    SPAGAIN;
    SP -= count;
    ax = (SP - PL_stack_base) + 1;

sets the stack up so that we can use the ST macro.

3. Unlike the original coding of this example, the returned values are not accessed in reverse order. So ST(0) refers to the first value returned by the Perl subroutine and ST(count-1) refers to the last.
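The arithmetic in note 2 is easier to follow on a toy stack. The sketch below uses no Perl at all: stack_base, sp, MOCK_ST, add_subtract and call_add_subtract are invented stand-ins, and it assumes that ST(n) behaves like an index into the stack base offset by ax, which is how the notes above describe it. Slot 0 of the toy stack is deliberately left unused so that sp can always point at the top-most item.

```c
#include <stddef.h>

static int  stack_base[16];     /* stand-in for PL_stack_base         */
static int *sp = stack_base;    /* points at the current top-most item */

/* Stand-in for ST(): index n counted from the first returned value. */
#define MOCK_ST(ax, n) (stack_base[(ax) + (n)])

/* Stand-in for the Perl sub: pushes a+b and a-b, returns the count. */
static int add_subtract(int a, int b)
{
    *++sp = a + b;
    *++sp = a - b;
    return 2;
}

/* Calls add_subtract and reads the results in the order returned. */
static void call_add_subtract(int a, int b, int out[2])
{
    int       count = add_subtract(a, b);
    ptrdiff_t ax;

    sp -= count;                    /* sp now sits just below the results */
    ax  = (sp - stack_base) + 1;    /* index of the first returned value  */

    out[0] = MOCK_ST(ax, 0);        /* first value returned: a + b */
    out[1] = MOCK_ST(ax, 1);        /* last value returned:  a - b */
}
```

Because the results are addressed by index rather than popped, they can be read in any order, and the stack pointer is already back where it was before the call.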

Creating and Calling an Anonymous Subroutine in C

As we've already shown, call_sv can be used to invoke an anonymous subroutine. However, our example showed a Perl script invoking an XSUB to perform this operation. Let's see how it can be done inside our C code:

    ...

    SV *cvrv = eval_pv("sub { print 'You will not find me cluttering any namespace!' }", TRUE);

    ...

    call_sv(cvrv, G_VOID|G_NOARGS);

eval_pv is used to compile the anonymous subroutine, which will be the return value as well (read more about eval_pv in perlapi). Once this code reference is in hand, it can be mixed in with all the previous examples we've shown.

LIGHTWEIGHT CALLBACKS

Sometimes you need to invoke the same subroutine repeatedly. This usually happens with a function that acts on a list of values, such as Perl's built-in sort(). You can pass a comparison function to sort(), which will then be invoked for every pair of values that needs to be compared. The first() and reduce() functions from List::Util follow a similar pattern.

In this case it is possible to speed up the routine (often quite substantially) by using the lightweight callback API. The idea is that the calling context only needs to be created and destroyed once, and the sub can be called arbitrarily many times in between.

It is usual to pass parameters using global variables (typically $_ for one parameter, or $a and $b for two parameters) rather than via @_. (It is possible to use the @_ mechanism if you know what you're doing, though there is as yet no supported API for it. It's also inherently slower.)

The pattern of macro calls is like this:

    dMULTICALL;                 /* Declare local variables */
    I32 gimme = G_SCALAR;       /* context of the call: G_SCALAR,
                                 * G_ARRAY, or G_VOID */

    PUSH_MULTICALL(cv);         /* Set up the context for calling cv,
                                   and set local vars appropriately */

    /* loop */ {
        /* set the value(s) of your parameter variables */
        MULTICALL;              /* Make the actual call */
    } /* end of loop */

    POP_MULTICALL;              /* Tear down the calling context */

For some concrete examples, see the implementation of the first() and reduce() functions of List::Util 1.18. There you will also find a header file that emulates the multicall API on older versions of perl.

SEE ALSO

perlxs, perlguts, perlembed

AUTHOR

Paul Marquess

Special thanks to the following people who assisted in the creation of the document.

Jeff Okamoto, Tim Bunce, Nick Gianniotis, Steve Kelem, Gurusamy Sarathy and Larry Wall.

DATE

Version 1.3, 14th Apr 1997

 

perlce

NAME

perlce - Perl for WinCE

Building Perl for WinCE

DESCRIPTION

This file gives the instructions for building Perl 5.8 and above for WinCE. Please read and understand the terms under which this software is distributed.

General explanations on cross-compiling WinCE

  • miniperl is built. This is a single executable (without a DLL), intended to run on Win32, and it facilitates the rest of the build process; all binaries built after it are foreign and should not be run locally.

    miniperl is built using ./win32/Makefile; this is part of the normal build process, invoked as a dependency from wince/Makefile.ce.

  • After miniperl is built, configpm is invoked to create the right Config.pm in the right place, along with its corresponding Cross.pm.

    Unlike in the Win32 build, miniperl will not have the host's Config.pm within reach; instead it uses the Config.pm from within the cross-compilation directories.

    The file Cross.pm is dead simple: for a given cross-architecture, it places in @INC the path where the perl modules are, with the right Config.pm in that place.

    That said, miniperl -Ilib -MConfig -we 1 should report an error, because it cannot find Config.pm. If it does not give an error, the wrong Config.pm has been substituted, and the resulting binaries will be a mess.

    miniperl -MCross -MConfig -we 1 should run okay, and it will provide the right Config.pm for further compilations.

  • During the extensions build phase, the script ./win32/buildext.pl is invoked, which steps into the ./ext subdirectories and builds each extension in turn.

    All invocations of Makefile.PL are given -MCross to enable cross-compilation.

BUILD

This section describes the steps to be performed to build PerlCE. You may find additional information about building perl for WinCE, and some pre-built binaries, at http://perlce.sourceforge.net.

Tools & SDK

For compiling, you need the following:

  • Microsoft Embedded Visual Tools
  • Microsoft Visual C++
  • Rainer Keuchel's celib-sources
  • Rainer Keuchel's console-sources

Needed source files can be downloaded at http://perlce.sourceforge.net

Make

Normally you only need to edit ./win32/ce-helpers/compile.bat to reflect your system and run it.

The file ./win32/ce-helpers/compile.bat is actually a wrapper that calls nmake -f makefile.ce with appropriate parameters; it accepts extra parameters and forwards them to the nmake command as additional arguments. You should pass the target this way.

To prepare a distribution, do the following:

  • go to ./win32 subdirectory
  • edit file ./win32/ce-helpers/compile.bat
  • run compile.bat
  • run compile.bat dist

Makefile.ce has a CROSS_NAME macro, which is used to refer to your cross-compilation scheme. You can assign a name to it, but this is not necessary: by default it is set from your machine configuration name, such as "wince-sh3-hpc-wce211", which is enough to distinguish different builds made at the same time. The option is handy when you want several different builds for the same platform, for example a threaded build. The following example assumes that all environment variables required by the C cross-compiler are set properly (a special *.bat file fits this purpose perfectly) and that your compile.bat has the proper "MACHINE" parameter set to, say, wince-mips-pocket-wce300.

    compile.bat
    compile.bat dist
    compile.bat CROSS_NAME=mips-wce300-thr "USE_ITHREADS=define" "USE_IMP_SYS=define" "USE_MULTI=define"
    compile.bat CROSS_NAME=mips-wce300-thr "USE_ITHREADS=define" "USE_IMP_SYS=define" "USE_MULTI=define" dist

If all goes well and there are no errors during the build, you'll get two independent distributions: wince-mips-pocket-wce300 and mips-wce300-thr.

The target dist prepares the distribution file set. The target zipdist does the same as dist but additionally compresses the distribution files into a zip archive.

NOTE: during a build, one or more Config.pm files for cross-compilation ("foreign" Config.pm) may be created; these are hidden inside ../xlib/$(CROSS_NAME) with other auxiliary files. It is important to note, however, that there should be no Config.pm for the host miniperl. If you get an error that perl could not find Config.pm somewhere in the build process, something went wrong. Most probably you forgot to specify a cross-compilation when invoking miniperl.exe on Makefile.PL. When building an extension for cross-compilation, your command line should look like

    ..\miniperl.exe -I..\lib -MCross=mips-wce300-thr Makefile.PL

or just

    ..\miniperl.exe -I..\lib -MCross Makefile.PL

to refer to the cross-compilation that was created last.

All questions related to building for WinCE devices could be asked in perlce-user@lists.sourceforge.net mailing list.

Using Perl on WinCE

DESCRIPTION

PerlCE is currently linked with a simple console window, so it also works on non-hpc devices.

The simple stdio implementation creates the files stdin.txt, stdout.txt and stderr.txt, so you might examine them if your console has only a limited number of cols.

When exitcode is non-zero, a message box appears, otherwise the console closes, so you might have to catch an exit with status 0 in your program to see any output.

stdout/stderr now go into the files /perl-stdout.txt and /perl-stderr.txt.

PerlIDE is handy to deal with perlce.

LIMITATIONS

No fork(), pipe(), popen() etc.

ENVIRONMENT

All environment vars must be stored in HKLM\Environment as strings. They are read at process startup.

  • PERL5LIB

    Usual perl lib path (semicolon-separated list).

  • PATH

    Semicolon-separated list for executables.

  • TMP

    Tempdir.

  • UNIXROOTPATH

    Root for accessing some special files, e.g. /dev/null, /etc/services.

  • ROWS/COLS

    Rows/cols for console.

  • HOME

    Home directory.

  • CONSOLEFONTSIZE

    Size for console font.

You can set these with cereg.exe, a (remote) registry editor or via the PerlIDE.

REGISTRY

To start perl by clicking on a perl source file, you have to make the corresponding entries in HKCR (see ce-helpers/wince-reg.bat). cereg.exe (which must be executed on a desktop PC with ActiveSync) is reported not to work on some devices. In that case you have to create the registry entries by hand using a registry editor.

XS

The following Win32-Methods are built-in:

    newXS("Win32::GetCwd", w32_GetCwd, file);
    newXS("Win32::SetCwd", w32_SetCwd, file);
    newXS("Win32::GetTickCount", w32_GetTickCount, file);
    newXS("Win32::GetOSVersion", w32_GetOSVersion, file);
    newXS("Win32::IsWinNT", w32_IsWinNT, file);
    newXS("Win32::IsWin95", w32_IsWin95, file);
    newXS("Win32::IsWinCE", w32_IsWinCE, file);
    newXS("Win32::CopyFile", w32_CopyFile, file);
    newXS("Win32::Sleep", w32_Sleep, file);
    newXS("Win32::MessageBox", w32_MessageBox, file);
    newXS("Win32::GetPowerStatus", w32_GetPowerStatus, file);
    newXS("Win32::GetOemInfo", w32_GetOemInfo, file);
    newXS("Win32::ShellEx", w32_ShellEx, file);

BUGS

Opening files for read-write is currently not supported if they use stdio (normal perl file handles).

If you find bugs or if it does not work at all on your device, send mail to the address below. Please report the details of your device (processor, ceversion, devicetype (hpc/palm/pocket)) and the date of the downloaded files.

INSTALLATION

Currently installation instructions are at http://perlce.sourceforge.net/.

Once the installation and testing processes stabilize, this information will be made more precise.

ACKNOWLEDGEMENTS

The port for Win32 was used as a reference.

History of WinCE port

5.6.0

Initial port of perl to WinCE. It was performed in separate directory named wince. This port was based on contents of ./win32 directory. miniperl was not built, user must have HOST perl and properly edit makefile.ce to reflect this.

5.8.0

wince port was kept in the same ./wince directory, and wince/Makefile.ce was used to invoke native compiler to create HOST miniperl, which then facilitates cross-compiling process. Extension building support was added.

5.9.4

Two directories ./win32 and ./wince were merged, so perlce build process comes in ./win32 directory.

AUTHORS

  • Rainer Keuchel <coyxc@rainer-keuchel.de>

    provided initial port of Perl, which appears to be most essential work, as it was a breakthrough on having Perl ported at all. Many thanks and obligations to Rainer!

  • Vadim Konovalov

    made further support of WinCE port.

 

perlcheat

NAME

perlcheat - Perl 5 Cheat Sheet

DESCRIPTION

This 'cheat sheet' is a handy reference, meant for beginning Perl programmers. Not everything is mentioned, but 195 features may already be overwhelming.

The sheet

    CONTEXTS  SIGILS  ref        ARRAYS        HASHES
    void      $scalar SCALAR     @array        %hash
    scalar    @array  ARRAY      @array[0, 2]  @hash{'a', 'b'}
    list      %hash   HASH       $array[0]     $hash{'a'}
              &sub    CODE
              *glob   GLOB       SCALAR VALUES
                      FORMAT     number, string, ref, glob, undef
    REFERENCES
    \      reference       $$foo[1]       aka $foo->[1]
    $@%&*  dereference     $$foo{bar}     aka $foo->{bar}
    []     anon. arrayref  ${$$foo[1]}[2] aka $foo->[1]->[2]
    {}     anon. hashref   ${$$foo[1]}[2] aka $foo->[1][2]
    \()    list of refs
                           SYNTAX
    OPERATOR PRECEDENCE    foreach (LIST) { }     for (a;b;c) { }
    ->                     while   (e) { }        until (e)   { }
    ++ --                  if      (e) { } elsif (e) { } else { }
    **                     unless  (e) { } elsif (e) { } else { }
    ! ~ \ u+ u-            given   (e) { when (e) {} default {} }
    =~ !~
    * / % x                 NUMBERS vs STRINGS  FALSE vs TRUE
    + - .                   =          =        undef, "", 0, "0"
    << >>                   +          .        anything else
    named uops              == !=      eq ne
    < > <= >= lt gt le ge   < > <= >=  lt gt le ge
    == != <=> eq ne cmp ~~  <=>        cmp
    &
    | ^             REGEX MODIFIERS       REGEX METACHARS
    &&              /i case insensitive   ^      string begin
    || //           /m line based ^$      $      str end (bfr \n)
    .. ...          /s . includes \n      +      one or more
    ?:              /x ignore wh.space    *      zero or more
    = += last goto  /p preserve           ?      zero or one
    , =>            /a ASCII    /aa safe  {3,7}  repeat in range
    list ops        /l locale   /d  dual  |      alternation
    not             /u Unicode            []     character class
    and             /e evaluate /ee rpts  \b     word boundary
    or xor          /g global             \z     string end
                    /o compile pat once   ()     capture
    DEBUG                                 (?:p)  no capture
    -MO=Deparse     REGEX CHARCLASSES     (?#t)  comment
    -MO=Terse       .   [^\n]             (?=p)  ZW pos ahead
    -D##            \s  whitespace        (?!p)  ZW neg ahead
    -d:Trace        \w  word chars        (?<=p) ZW pos behind \K
                    \d  digits            (?<!p) ZW neg behind
    CONFIGURATION   \pP named property    (?>p)  no backtrack
    perl -V:ivsize  \h  horiz.wh.space    (?|p|p)branch reset
                    \R  linebreak         (?<n>p)named capture
                    \S \W \D \H negate    \g{n}  ref to named cap
                                          \K     keep left part
    FUNCTION RETURN LISTS
    stat      localtime    caller         SPECIAL VARIABLES
     0 dev    0 second      0 package     $_    default variable
     1 ino    1 minute      1 filename    $0    program name
     2 mode   2 hour        2 line        $/    input separator
     3 nlink  3 day         3 subroutine  $\    output separator
     4 uid    4 month-1     4 hasargs     $|    autoflush
     5 gid    5 year-1900   5 wantarray   $!    sys/libcall error
     6 rdev   6 weekday     6 evaltext    $@    eval error
     7 size   7 yearday     7 is_require  $$    process ID
     8 atime  8 is_dst      8 hints       $.    line number
     9 mtime                9 bitmask     @ARGV command line args
    10 ctime               10 hinthash    @INC  include paths
    11 blksz               3..10 only     @_    subroutine args
    12 blcks               with EXPR      %ENV  environment

ACKNOWLEDGEMENTS

The first version of this document appeared on Perl Monks, where several people had useful suggestions. Thank you, Perl Monks.

A special thanks to Damian Conway, who not only suggested important changes, but also took the time to count the number of listed features and make a Perl 6 version to show that Perl will stay Perl.

AUTHOR

Juerd Waalboer <#####@juerd.nl>, with the help of many Perl Monks.

SEE ALSO

 

perlclib

NAME

perlclib - Internal replacements for standard C library functions

DESCRIPTION

One thing Perl porters should note is that perl doesn't tend to use that much of the C standard library internally; you'll see very little use of, for example, the ctype.h functions in there. This is because Perl tends to reimplement or abstract standard library functions, so that we know exactly how they're going to operate.

This is a reference card for people who are familiar with the C library and who want to do things the Perl way; to tell them which functions they ought to use instead of the more normal C functions.

Conventions

In the following tables:

  • t

    is a type.

  • p

    is a pointer.

  • n

    is a number.

  • s

    is a string.

sv, av, hv, etc. represent variables of their respective types.

File Operations

Instead of the stdio.h functions, you should use the Perl abstraction layer. Instead of FILE* types, you need to be handling PerlIO* types. Don't forget that with the new PerlIO layered I/O abstraction FILE* types may not even be available. See also the perlapio documentation for more information about the following functions:

    Instead Of:                 Use:

    stdin                       PerlIO_stdin()
    stdout                      PerlIO_stdout()
    stderr                      PerlIO_stderr()

    fopen(fn, mode)             PerlIO_open(fn, mode)
    freopen(fn, mode, stream)   PerlIO_reopen(fn, mode, perlio) (Deprecated)
    fflush(stream)              PerlIO_flush(perlio)
    fclose(stream)              PerlIO_close(perlio)

File Input and Output

    Instead Of:                 Use:

    fprintf(stream, fmt, ...)   PerlIO_printf(perlio, fmt, ...)

    [f]getc(stream)             PerlIO_getc(perlio)
    [f]putc(stream, n)          PerlIO_putc(perlio, n)
    ungetc(n, stream)           PerlIO_ungetc(perlio, n)

Note that the PerlIO equivalents of fread and fwrite are slightly different from their C library counterparts:

    fread(p, size, n, stream)   PerlIO_read(perlio, buf, numbytes)
    fwrite(p, size, n, stream)  PerlIO_write(perlio, buf, numbytes)

    fputs(s, stream)            PerlIO_puts(perlio, s)

There is no equivalent to fgets; one should use sv_gets instead:

    fgets(s, n, stream)         sv_gets(sv, perlio, append)

File Positioning

    Instead Of:                 Use:

    feof(stream)                PerlIO_eof(perlio)
    fseek(stream, n, whence)    PerlIO_seek(perlio, n, whence)
    rewind(stream)              PerlIO_rewind(perlio)

    fgetpos(stream, p)          PerlIO_getpos(perlio, sv)
    fsetpos(stream, p)          PerlIO_setpos(perlio, sv)

    ferror(stream)              PerlIO_error(perlio)
    clearerr(stream)            PerlIO_clearerr(perlio)

Memory Management and String Handling

    Instead Of:                     Use:

    t* p = malloc(n)                Newx(p, n, t)
    t* p = calloc(n, s)             Newxz(p, n, t)
    p = realloc(p, n)               Renew(p, n, t)
    memcpy(dst, src, n)             Copy(src, dst, n, t)
    memmove(dst, src, n)            Move(src, dst, n, t)
    memcpy(dst, src, sizeof(t))     StructCopy(src, dst, t)
    memset(dst, 0, n * sizeof(t))   Zero(dst, n, t)
    memzero(dst, n)                 Zero(dst, n, char)
    free(p)                         Safefree(p)

    strdup(p)                       savepv(p)
    strndup(p, n)                   savepvn(p, n) (Hey, strndup doesn't exist!)

    strstr(big, little)             instr(big, little)
    strcmp(s1, s2)                  strLE(s1, s2) / strEQ(s1, s2) / strGT(s1,s2)
    strncmp(s1, s2, n)              strnNE(s1, s2, n) / strnEQ(s1, s2, n)

Notice that the order of the arguments to Copy and Move differs from that of memcpy and memmove.
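The argument order is easy to get wrong, so here is a small sketch that can be compiled without the Perl headers. SKETCH_COPY and SKETCH_MOVE are stand-ins written for this illustration only; they mimic the (src, dst, n, t) argument order shown in the table, but the real Copy and Move come from the Perl headers and may do additional checking. The helper functions copy_ints and shift_right_by_one are likewise hypothetical.

```c
#include <string.h>

/* Illustrative stand-ins only: same argument order as Copy and Move,
 * i.e. source first, then destination, element count, element type. */
#define SKETCH_COPY(src, dst, n, t) \
    ((void) memcpy((dst), (src), (n) * sizeof(t)))
#define SKETCH_MOVE(src, dst, n, t) \
    ((void) memmove((dst), (src), (n) * sizeof(t)))

/* memcpy(dst, src, n * sizeof(int)) written with the Perl-style order:
 * n counts elements, not bytes, and the type supplies the size. */
static void copy_ints(const int *src, int *dst, size_t n)
{
    SKETCH_COPY(src, dst, n, int);
}

/* Overlapping regions need the memmove-like variant. */
static void shift_right_by_one(int *buf, size_t n)
{
    if (n > 1)
        SKETCH_MOVE(buf, buf + 1, n - 1, int);
}
```

Writing the count in elements and passing the type separately means the sizeof multiplication cannot be forgotten, which is one reason to prefer this style over raw byte counts.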

Most of the time, though, you'll want to be dealing with SVs internally instead of raw char * strings:

    strlen(s)               sv_len(sv)
    strcpy(dt, src)         sv_setpv(sv, s)
    strncpy(dt, src, n)     sv_setpvn(sv, s, n)
    strcat(dt, src)         sv_catpv(sv, s)
    strncat(dt, src, n)     sv_catpvn(sv, s, n)
    sprintf(s, fmt, ...)    sv_setpvf(sv, fmt, ...)

Note also the existence of sv_catpvf and sv_vcatpvfn, combining concatenation with formatting.

Sometimes instead of zeroing the allocated heap by using Newxz() you should consider "poisoning" the data. This means writing a bit pattern into it that should be illegal as pointers (and floating point numbers), and also hopefully surprising enough as integers, so that any code attempting to use the data without forethought will break sooner rather than later. Poisoning can be done using the Poison() macros, which have similar arguments to Zero():

    PoisonWith(dst, n, t, b)    scribble memory with byte b
    PoisonNew(dst, n, t)        equal to PoisonWith(dst, n, t, 0xAB)
    PoisonFree(dst, n, t)       equal to PoisonWith(dst, n, t, 0xEF)
    Poison(dst, n, t)           equal to PoisonFree(dst, n, t)
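To see what poisoning does to a buffer, the following sketch reproduces the idea in plain C. The SKETCH_POISON_* macros and the all_bytes_are helper are invented for this illustration and are not the Perl macros themselves; only the byte values 0xAB and 0xEF and the (dst, n, t) argument shape are taken from the table above.

```c
#include <string.h>

/* Illustrative stand-ins for the Poison family: fill n elements of
 * type t at dst with the byte b. */
#define SKETCH_POISON_WITH(dst, n, t, b) \
    ((void) memset((dst), (b), (n) * sizeof(t)))
#define SKETCH_POISON_NEW(dst, n, t)  SKETCH_POISON_WITH(dst, n, t, 0xAB)
#define SKETCH_POISON_FREE(dst, n, t) SKETCH_POISON_WITH(dst, n, t, 0xEF)

/* Returns 1 if every one of the nbytes bytes at p equals b, else 0. */
static int all_bytes_are(const void *p, size_t nbytes, unsigned char b)
{
    const unsigned char *bytes = p;
    size_t i;

    for (i = 0; i < nbytes; ++i)
        if (bytes[i] != b)
            return 0;
    return 1;
}
```

A freshly poisoned int buffer reads back as large, obviously wrong values rather than zeros, so code that forgets to initialize it tends to fail quickly instead of appearing to work.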

Character Class Tests

There are two types of character class tests that Perl implements: one type deals in chars and is thus not Unicode aware (and hence deprecated unless you know you should use it), and the other type deals in UVs and knows about Unicode properties. In the following table, c is a char, and u is a Unicode codepoint.

    Instead Of:     Use:            But better use:

    isalnum(c)      isALNUM(c)      isALNUM_uni(u)
    isalpha(c)      isALPHA(c)      isALPHA_uni(u)
    iscntrl(c)      isCNTRL(c)      isCNTRL_uni(u)
    isdigit(c)      isDIGIT(c)      isDIGIT_uni(u)
    isgraph(c)      isGRAPH(c)      isGRAPH_uni(u)
    islower(c)      isLOWER(c)      isLOWER_uni(u)
    isprint(c)      isPRINT(c)      isPRINT_uni(u)
    ispunct(c)      isPUNCT(c)      isPUNCT_uni(u)
    isspace(c)      isSPACE(c)      isSPACE_uni(u)
    isupper(c)      isUPPER(c)      isUPPER_uni(u)
    isxdigit(c)     isXDIGIT(c)     isXDIGIT_uni(u)

    tolower(c)      toLOWER(c)      toLOWER_uni(u)
    toupper(c)      toUPPER(c)      toUPPER_uni(u)

stdlib.h functions

    Instead Of:             Use:

    atof(s)                 Atof(s)
    atol(s)                 Atol(s)
    strtod(s, &p)           Nothing. Just don't use it.
    strtol(s, &p, n)        Strtol(s, &p, n)
    strtoul(s, &p, n)       Strtoul(s, &p, n)

Notice also the grok_bin, grok_hex, and grok_oct functions in numeric.c for converting strings representing numbers in the respective bases into NVs.

In theory Strtol and Strtoul may not be defined if the machine perl is built on doesn't actually have strtol and strtoul. But as those 2 functions are part of the 1989 ANSI C spec we suspect you'll find them everywhere by now.

    int rand()              double Drand01()
    srand(n)                { seedDrand01((Rand_seed_t)n);
                              PL_srand_called = TRUE; }

    exit(n)                 my_exit(n)
    system(s)               Don't. Look at pp_system or use my_popen

    getenv(s)               PerlEnv_getenv(s)
    setenv(s, val)          my_putenv(s, val)

Miscellaneous functions

You should not even want to use setjmp.h functions, but if you think you do, use the JMPENV stack in scope.h instead.

For signal/sigaction, use rsignal(signo, handler).

SEE ALSO

perlapi, perlapio, perlguts

 

perlcommunity

NAME

perlcommunity - a brief overview of the Perl community

DESCRIPTION

This document aims to provide an overview of the vast perl community, which is far too large and diverse to list in detail. If any specific niche has been forgotten, it is not meant as an insult but an omission for the sake of brevity.

The Perl community is as diverse as Perl, and there is a large amount of evidence that the Perl users apply TMTOWTDI to all endeavors, not just programming. From websites, to IRC, to mailing lists, there is more than one way to get involved in the community.

Where to Find the Community

There is a central directory for the Perl community: http://perl.org maintained by the Perl Foundation (http://www.perlfoundation.org/), which tracks and provides services for a variety of other community sites.

Mailing Lists and Newsgroups

Perl runs on e-mail; there is no doubt about it. The Camel book was originally written mostly over e-mail and today Perl's development is co-ordinated through mailing lists. The largest repository of Perl mailing lists is located at http://lists.perl.org.

Most Perl-related projects set up mailing lists for both users and contributors. If you don't see a certain project listed at http://lists.perl.org, check the particular website for that project. Most mailing lists are archived at http://nntp.perl.org/.

There are also plenty of Perl related newsgroups located under comp.lang.perl.* .

IRC

The Perl community has a rather large IRC presence. For starters, it has its own IRC network, irc://irc.perl.org. General (not help-oriented) chat can be found at irc://irc.perl.org/#perl. Many other more specific chats are also hosted on the network. Information about irc.perl.org is located on the network's website: http://www.irc.perl.org. For a more help-oriented #perl, check out irc://irc.freenode.net/#perl. Perl 6 development also has a presence in irc://irc.freenode.net/#perl6. Most Perl-related channels will be kind enough to point you in the right direction if you ask nicely.

Any large IRC network (Dalnet, EFnet) is also likely to have a #perl channel, with varying activity levels.

Websites

Perl websites come in a variety of forms, but they fit into two large categories: forums and news websites. There are many Perl-related websites, so only a few of the community's largest are mentioned here.

News sites

  • http://perl.com/

    Run by O'Reilly Media (the publisher of the Camel Book, among other Perl-related literature), perl.com provides current Perl news, articles, and resources for Perl developers as well as a directory of other useful websites.

  • http://blogs.perl.org/

    Many members of the community have a Perl-related blog on this site. If you'd like to join them, you can sign up for free.

  • http://use.perl.org/

    use Perl; used to provide a slashdot-style news/blog website covering all things Perl, from minutes of the meetings of the Perl 6 Design team to conference announcements with (ir)relevant discussion. It no longer accepts updates, but you can still use the site to read old entries and comments.

Forums

  • http://www.perlmonks.org/

    PerlMonks is one of the largest Perl forums, and describes itself as "A place for individuals to polish, improve, and showcase their Perl skills." and "A community which allows everyone to grow and learn from each other."

  • http://stackoverflow.com/

    Stack Overflow is a free question-and-answer site for programmers. It's not focussed solely on Perl, but it does have an active group of users who do their best to help people with their Perl programming questions.

User Groups

Many cities around the world have local Perl Mongers chapters. A Perl Mongers chapter is a local user group which typically holds regular in-person meetings, both social and technical; helps organize local conferences, workshops, and hackathons; and provides a mailing list or other continual contact method for its members to keep in touch.

To find your local Perl Mongers (or PM as they're commonly abbreviated) group check the international Perl Mongers directory at http://www.pm.org/.

Workshops

Perl workshops are, as the name might suggest, workshops where Perl is taught in a variety of ways. At the workshops, subjects range from a beginner's introduction (such as the Pittsburgh Perl Workshop's "Zero To Perl") to much more advanced subjects.

There are several great resources for locating workshops: the websites mentioned above, the calendar mentioned below, and the YAPC Europe website, http://www.yapceurope.org/, which is probably the best resource for European Perl events.

Hackathons

Hackathons are a very different kind of gathering where Perl hackers gather to do just that, hack nonstop for an extended (several day) period on a specific project or projects. Information about hackathons can be located in the same place as information about workshops as well as in irc://irc.perl.org/#perl.

If you have never been to a hackathon, here are a few basic things you need to know before attending: have a working laptop and know how to use it; check out the involved projects beforehand; have the necessary version control client; and bring backup equipment (an extra LAN cable, additional power strips, etc.) because someone will forget.

Conventions

Perl has two major annual conventions: The Perl Conference (now part of OSCON), put on by O'Reilly, and Yet Another Perl Conference or YAPC (pronounced yap-see), which is localized into several regional YAPCs (North America, Europe, Asia) in a stunning grassroots display by the Perl community. For more information about either conference, check out their respective web pages: OSCON http://conferences.oreillynet.com/; YAPC http://www.yapc.org.

A relatively new conference franchise with a large Perl portion is the Open Source Developers Conference or OSDC. First held in Australia it has recently also spread to Israel and France. More information can be found at: http://www.osdc.com.au/ for Australia, http://www.osdc.org.il for Israel, and http://www.osdc.fr/ for France.

Calendar of Perl Events

The Perl Review, http://www.theperlreview.com, maintains a website and Google calendar (http://www.theperlreview.com/community_calendar) for tracking workshops, hackathons, Perl Mongers meetings, and other events. Views of this calendar are at http://www.perl.org/events.html and http://www.yapc.org.

Not every event or Perl Mongers group is on that calendar, so don't lose heart if you don't see yours posted. To have your event or group listed, contact brian d foy (brian@theperlreview.com).

AUTHOR

Edgar "Trizor" Bering <trizor@gmail.com>

 
perlcygwin

NAME

perlcygwin - Perl for Cygwin

SYNOPSIS

This document will help you configure, make, test and install Perl on Cygwin. This document also describes features of Cygwin that will affect how Perl behaves at runtime.

NOTE: There are pre-built Perl packages available for Cygwin and a version of Perl is provided in the normal Cygwin install. If you do not need to customize the configuration, consider using one of those packages.

PREREQUISITES FOR COMPILING PERL ON CYGWIN

Cygwin = GNU+Cygnus+Windows (Don't leave UNIX without it)

The Cygwin tools are ports of the popular GNU development tools for Win32 platforms. They run thanks to the Cygwin library which provides the UNIX system calls and environment these programs expect. More information about this project can be found at:

http://www.cygwin.com/

A recent net or commercial release of Cygwin is required.

At the time this document was last updated, Cygwin 1.7.16 was current.

Cygwin Configuration

While building Perl some changes may be necessary to your Cygwin setup so that Perl builds cleanly. These changes are not required for normal Perl usage.

NOTE: The binaries that are built will run on all Win32 versions. They do not depend on your host system (WinXP/Win2K/Win7) or your Cygwin configuration (binary/text mounts, cygserver). The only dependencies come from hard-coded pathnames like /usr/local. However, your host system and Cygwin configuration will affect Perl's runtime behavior (see TEST).

  • PATH

    Set the PATH environment variable so that Configure finds the Cygwin versions of programs. Any Windows directories that are not needed should be removed or moved to the end of your PATH.

  • nroff

    If you do not have nroff (which is part of the groff package), Configure will not prompt you to install man pages.

CONFIGURE PERL ON CYGWIN

The default options gathered by Configure with the assistance of hints/cygwin.sh will build a Perl that supports dynamic loading (which requires a shared cygperl5_16.dll).

This will run Configure and keep a record:

    ./Configure 2>&1 | tee log.configure

If you are willing to accept all the defaults run Configure with -de. However, several useful customizations are available.

Stripping Perl Binaries on Cygwin

It is possible to strip the EXEs and DLLs created by the build process. The resulting binaries will be significantly smaller. If you want the binaries to be stripped, you can either add a -s option when Configure prompts you,

    Any additional ld flags (NOT including libraries)? [none] -s
    Any special flags to pass to g++ to create a dynamically loaded library?
    [none] -s
    Any special flags to pass to gcc to use dynamic linking? [none] -s

or you can edit hints/cygwin.sh and uncomment the relevant variables near the end of the file.

Optional Libraries for Perl on Cygwin

Several Perl functions and modules depend on the existence of some optional libraries. Configure will find them if they are installed in one of the directories listed as being used for library searches. Pre-built packages for most of these are available from the Cygwin installer.

  • -lcrypt

    The crypt package distributed with Cygwin is a Linux compatible 56-bit DES crypt port by Corinna Vinschen.

    Alternatively, the crypt libraries in GNU libc have been ported to Cygwin.

  • -lgdbm_compat (use GDBM_File )

    GDBM is available for Cygwin.

    NOTE: The GDBM library only works on NTFS partitions.

  • -ldb (use DB_File )

    BerkeleyDB is available for Cygwin.

    NOTE: The BerkeleyDB library only completely works on NTFS partitions.

  • cygserver (use IPC::SysV )

    A port of SysV IPC is available for Cygwin.

    NOTE: This has not been extensively tested. In particular, d_semctl_semun is undefined because it fails a Configure test and on Win9x the shm*() functions seem to hang. It also creates a compile time dependency because perl.h includes <sys/ipc.h> and <sys/sem.h> (which will be required in the future when compiling CPAN modules). CURRENTLY NOT SUPPORTED!

  • -lutil

    Included with the standard Cygwin netrelease is the inetutils package which includes libutil.a.

Configure-time Options for Perl on Cygwin

The INSTALL document describes several Configure-time options. Some of these will work with Cygwin, others are not yet possible. Also, some of these are experimental. You can either select an option when Configure prompts you or you can define (undefine) symbols on the command line.

  • -Uusedl

    Undefining this symbol forces Perl to be compiled statically.

  • -Dusemymalloc

    By default Perl does not use the malloc() included with the Perl source, because it was slower and not entirely thread-safe. If you want to force Perl to build with the old malloc(), define this symbol.

  • -Uuseperlio

    Undefining this symbol disables the PerlIO abstraction. PerlIO is now the default; it is not recommended to disable PerlIO.

  • -Dusemultiplicity

    Multiplicity is required when embedding Perl in a C program and using more than one interpreter instance. This is only required when you build a not-threaded perl with -Uuseithreads .

  • -Uuse64bitint

    By default Perl uses 64 bit integers. If you want to use smaller 32 bit integers, undefine this symbol.

  • -Duselongdouble

    gcc supports long doubles (12 bytes). However, several additional long double math functions are necessary to use them within Perl ({atan2, cos, exp, floor, fmod, frexp, isnan, log, modf, pow, sin, sqrt}l, strtold). These are not yet available with newlib, the Cygwin libc.

  • -Uuseithreads

    Undefine this symbol if you want a faster, non-threaded perl.

  • -Duselargefiles

    Cygwin uses 64-bit integers for internal size and position calculations; this will be correctly detected and defined by Configure.

  • -Dmksymlinks

    Use this to build perl outside of the source tree. Details can be found in the INSTALL document. This is the recommended way to build perl from sources.

Suspicious Warnings on Cygwin

You may see some messages during Configure that seem suspicious.

  • Win9x and d_eofnblk

    Win9x does not correctly report EOF with a non-blocking read on a closed pipe. You will see the following messages:

        But it also returns -1 to signal EOF, so be careful!
        WARNING: you can't distinguish between EOF and no data!
        *** WHOA THERE!!! ***
        The recommended value for $d_eofnblk on this machine was "define"!
        Keep the recommended value? [y]

    At least for consistency with WinNT, you should keep the recommended value.

  • Compiler/Preprocessor defines

    The following error occurs because of the Cygwin #define of _LONG_DOUBLE :

        Guessing which symbols your C compiler and preprocessor define...
        try.c:<line#>: missing binary operator

    This failure does not seem to cause any problems. With older gcc versions, "parse error" is reported instead of "missing binary operator".

MAKE ON CYGWIN

Simply run make and wait:

    make 2>&1 | tee log.make

TEST ON CYGWIN

There are two steps to running the test suite:

  1. make test 2>&1 | tee log.make-test
  2. cd t; ./perl harness 2>&1 | tee ../log.harness

The same tests are run both times, but more information is provided when running as ./perl harness.

Test results vary depending on your host system and your Cygwin configuration. If a test can pass in some Cygwin setup, it is always attempted and explainable test failures are documented. It is possible for Perl to pass all the tests, but it is more likely that some tests will fail for one of the reasons listed below.

File Permissions on Cygwin

UNIX file permissions are based on sets of mode bits for {read,write,execute} for each {user,group,other}. By default Cygwin only tracks the Win32 read-only attribute represented as the UNIX file user write bit (files are always readable, files are executable if they have a .{com,bat,exe} extension or begin with #! , directories are always readable and executable). On WinNT with the ntea CYGWIN setting, the additional mode bits are stored as extended file attributes. On WinNT with the default ntsec CYGWIN setting, permissions use the standard WinNT security descriptors and access control lists. Without one of these options, these tests will fail (listing not updated yet):

    Failed Test       List of failed
    ------------------------------------
    io/fs.t           5, 7, 9-10
    lib/anydbm.t      2
    lib/db-btree.t    20
    lib/db-hash.t     16
    lib/db-recno.t    18
    lib/gdbm.t        2
    lib/ndbm.t        2
    lib/odbm.t        2
    lib/sdbm.t        2
    op/stat.t         9, 20 (.tmp not an executable extension)

NDBM_File and ODBM_File do not work on FAT filesystems

Do not use NDBM_File or ODBM_File on a FAT filesystem. They can be built on a FAT filesystem, but many tests will fail:

    Failed Test                  Stat Wstat  Total  Fail   Failed  List of Failed
    -----------------------------------------------------------------------------
    ../ext/NDBM_File/ndbm.t        13  3328     71    59   83.10%  1-2 4 16-71
    ../ext/ODBM_File/odbm.t       255 65280     ??    ??        %  ??
    ../lib/AnyDBM_File.t            2   512     12     2   16.67%  1 4
    ../lib/Memoize/t/errors.t       0   139     11     5   45.45%  7-11
    ../lib/Memoize/t/tie_ndbm.t    13  3328      4     4  100.00%  1-4
    run/fresh_perl.t                            97     1    1.03%  91

If you intend to run only on FAT (or if using AnyDBM_File on FAT), run Configure with the -Ui_ndbm and -Ui_dbm options to prevent NDBM_File and ODBM_File being built.
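
For scripts that use AnyDBM_File on a perl that was already built with NDBM_File, a runtime alternative may be to set @AnyDBM_File::ISA before loading the module, as the AnyDBM_File documentation describes. The following is a minimal sketch, assuming SDBM_File (which is bundled with Perl) is an acceptable backend; the database name example_db is hypothetical.

```perl
use strict;
use warnings;
use Fcntl;                       # O_CREAT, O_RDWR
use File::Temp qw(tempdir);
use File::Spec;

# Restrict AnyDBM_File to SDBM_File so that NDBM_File and ODBM_File
# are never selected (assumption: SDBM_File is good enough here).
BEGIN { @AnyDBM_File::ISA = qw(SDBM_File) }
use AnyDBM_File;

my $dir  = tempdir(CLEANUP => 1);
my $base = File::Spec->catfile($dir, 'example_db');   # hypothetical name

tie my %db, 'AnyDBM_File', $base, O_CREAT | O_RDWR, 0640
    or die "Cannot tie $base: $!";
$db{greeting} = 'hello';
my $value = $db{greeting};
untie %db;

print "stored and read back: $value\n";
```

This does not replace the Configure options above; it only changes which DBM implementation a given script picks up.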

With NTFS (and no CYGWIN=nontsec), there should be no problems even if perl was built on FAT.

fork() failures in io_* tests

A fork() failure may result in the following tests failing:

  1. ext/IO/lib/IO/t/io_multihomed.t
  2. ext/IO/lib/IO/t/io_sock.t
  3. ext/IO/lib/IO/t/io_unix.t

See comment on fork in Miscellaneous below.

Specific features of the Cygwin port

Script Portability on Cygwin

Cygwin does an outstanding job of providing UNIX-like semantics on top of Win32 systems. However, in addition to the items noted above, there are some differences that you should know about. This is a very brief guide to portability, more information can be found in the Cygwin documentation.

  • Pathnames

    Cygwin pathnames are separated by forward (/) slashes. Universal Naming Convention (//UNC) paths are also supported. Since cygwin-1.7, non-POSIX pathnames are discouraged. Names may contain all printable characters.

    File names are case insensitive, but case preserving. A pathname that contains a backslash or drive letter is a Win32 pathname, and is not subject to the translations applied to POSIX style pathnames. Cygwin will warn you about such pathnames, so it is better to convert them to POSIX style.

    For conversion we have Cygwin::win_to_posix_path() and Cygwin::posix_to_win_path() .

    Since cygwin-1.7 pathnames are UTF-8 encoded.

  • Text/Binary

    Since cygwin-1.7 textmounts are deprecated and strongly discouraged.

    When a file is opened it is in either text or binary mode. In text mode a file is subject to CR/LF/Ctrl-Z translations. With Cygwin, the default mode for an open() is determined by the mode of the mount that underlies the file. See Cygwin::is_binmount(). Perl provides a binmode() function to set binary mode on files that otherwise would be treated as text. sysopen() with the O_TEXT flag sets text mode on files that otherwise would be treated as binary:

        sysopen(FOO, "bar", O_WRONLY|O_CREAT|O_TEXT)

    lseek(), tell() and sysseek() only work with files opened in binary mode.

    The text/binary issue is covered at length in the Cygwin documentation.

  • PerlIO

    PerlIO overrides the default Cygwin Text/Binary behaviour. A file will always be treated as binary, regardless of the mode of the mount it lives on, just like it is in UNIX. So CR/LF translation needs to be requested in either the open() call like this:

        open(FH, ">:crlf", "out.txt");

    which will do conversion from LF to CR/LF on the output, or in the environment settings (add this to your .bashrc):

        export PERLIO=crlf

    which will pull in the crlf PerlIO layer which does LF -> CRLF conversion on every output generated by perl.

  • .exe

    The Cygwin stat(), lstat() and readlink() functions make the .exe extension transparent by looking for foo.exe when you ask for foo (unless a foo also exists). Cygwin does not require a .exe extension, but gcc adds it automatically when building a program. However, when accessing an executable as a normal file (e.g., cp in a makefile) the .exe is not transparent. The install program included with Cygwin automatically appends a .exe when necessary.

  • Cygwin vs. Windows process ids

    Cygwin processes have their own pid, which is different from the underlying Windows pid. Most POSIX-compliant Proc functions expect the cygwin pid, but several Win32::Process functions expect the winpid. For example, $$ is the cygwin pid of /usr/bin/perl, which is not the winpid. Use Cygwin::pid_to_winpid() and Cygwin::winpid_to_pid() to translate between them.

  • Cygwin vs. Windows errors

    Under Cygwin, $^E is the same as $!. When using Win32 API Functions, use Win32::GetLastError() to get the last Windows error.

  • rebase errors on fork or system

    Using fork() or system() out to another perl after loading multiple DLLs may result in a DLL base address conflict. The internal cygwin error looks like the following:

        0 [main] perl 8916 child_info_fork::abort: data segment start: parent (0xC1A000) != child(0xA6A000)

    or:

        183 [main] perl 3588 C:\cygwin\bin\perl.exe: *** fatal error - unable to remap C:\cygwin\bin\cygsvn_subr-1-0.dll to same address as parent(0x6FB30000) != 0x6FE60000
        46 [main] perl 3488 fork: child 3588 - died waiting for dll loading, errno11

    See http://cygwin.com/faq/faq-nochunks.html#faq.using.fixing-fork-failures for details. It helps if not too many DLLs are loaded in memory, so that the available address space is larger; for example, closing MS Internet Explorer might help.

    Use the perlrebase or rebase utilities to resolve the conflicting dll addresses. The rebase package is included in the Cygwin setup. Use setup.exe from http://www.cygwin.com/setup.exe to install it.

    1. kill all perl processes and run perlrebase or

    2. kill all cygwin processes and services, start dash from cmd.exe and run rebaseall .

  • chown()

    On WinNT chown() can change a file's user and group IDs. On Win9x chown() is a no-op, although this is appropriate since there is no security model.

  • Miscellaneous

    File locking using the F_GETLK command to fcntl() is a stub that returns ENOSYS .

    Win9x can not rename() an open file (although WinNT can).

    The Cygwin chroot() implementation has holes (it can not restrict file access by native Win32 programs).

    Inplace editing perl -i of files doesn't work without doing a backup of the file being edited perl -i.bak because of windowish restrictions, therefore Perl adds the suffix .bak automatically if you use perl -i without specifying a backup extension.
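
The LF to CR/LF conversion described under PerlIO above can be observed by writing through the :crlf layer and reading the bytes back with :raw. This is a minimal sketch, assuming a temporary file is acceptable and that the PERLIO environment variable is not otherwise set; it is not specific to Cygwin.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is removed when the program exits.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Write through the :crlf layer: each "\n" is emitted as "\r\n".
open(my $out, '>:crlf', $path) or die "open for write failed: $!";
print {$out} "line one\nline two\n";
close $out or die "close failed: $!";

# Read back with :raw so that no translation hides the bytes on disk.
open(my $in, '<:raw', $path) or die "open for read failed: $!";
my $bytes = do { local $/; <$in> };
close $in;

# Show the line endings in a visible form.
(my $visible = $bytes) =~ s/\r/\\r/g;
$visible =~ s/\n/\\n/g;
print "$visible\n";
```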

Prebuilt methods:

  • Cwd::cwd

    Returns the current working directory.

  • Cygwin::pid_to_winpid

    Translates a cygwin pid to the corresponding Windows pid (which may or may not be the same).

  • Cygwin::winpid_to_pid

    Translates a Windows pid to the corresponding cygwin pid (if any).

  • Cygwin::win_to_posix_path

    Translates a Windows path to the corresponding cygwin path respecting the current mount points. With a second non-null argument returns an absolute path. Double-byte characters will not be translated.

  • Cygwin::posix_to_win_path

    Translates a cygwin path to the corresponding Windows path respecting the current mount points. With a second non-null argument returns an absolute path. Double-byte characters will not be translated.

  • Cygwin::mount_table()

    Returns an array of [mnt_dir, mnt_fsname, mnt_type, mnt_opts].

        perl -e 'for $i (Cygwin::mount_table) {print join(" ",@$i),"\n";}'
        /bin c:\cygwin\bin system binmode,cygexec
        /usr/bin c:\cygwin\bin system binmode
        /usr/lib c:\cygwin\lib system binmode
        / c:\cygwin system binmode
        /cygdrive/c c: system binmode,noumount
        /cygdrive/d d: system binmode,noumount
        /cygdrive/e e: system binmode,noumount

  • Cygwin::mount_flags

    Returns the mount type and flags for a specified mount point. A comma-separated string of mntent->mnt_type (always "system" or "user"), then the mntent->mnt_opts, where the first is always "binmode" or "textmode".

        system|user,binmode|textmode,exec,cygexec,cygdrive,mixed,
        notexec,managed,nosuid,devfs,proc,noumount

    If the argument is "/cygdrive", then just the volume mount settings, and the cygdrive mount prefix are returned.

    User mounts override system mounts.

        $ perl -e 'print Cygwin::mount_flags "/usr/bin"'
        system,binmode,cygexec
        $ perl -e 'print Cygwin::mount_flags "/cygdrive"'
        binmode,cygdrive,/cygdrive

  • Cygwin::is_binmount

    Returns true if the given cygwin path is binary mounted, false if the path is mounted in textmode.

  • Cygwin::sync_winenv

    Cygwin does not initialize all original Win32 environment variables. See the bottom of this page http://cygwin.com/cygwin-ug-net/setup-env.html for "Restricted Win32 environment".

    Certain Win32 programs called from cygwin programs might need some environment variables; for example, ADODB needs %COMMONPROGRAMFILES%. Call Cygwin::sync_winenv() to copy all Win32 environment variables to your process, and note that cygwin will warn on every encounter of non-POSIX paths.
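
The functions listed above might be combined as in the following sketch. It assumes a perl running under Cygwin; on any other platform the Cygwin:: functions do not exist, so the code checks $^O first. The path C:\Windows is only an illustrative input.

```perl
use strict;
use warnings;

my $on_cygwin = $^O eq 'cygwin';

if ($on_cygwin) {
    # $$ is the Cygwin pid; translate it to the Windows pid and back.
    my $winpid = Cygwin::pid_to_winpid($$);
    my $cygpid = Cygwin::winpid_to_pid($winpid);
    print "cygwin pid $$ -> windows pid $winpid -> cygwin pid $cygpid\n";

    # Convert a Windows path to POSIX form and back, asking for
    # absolute paths via the second argument.
    my $posix = Cygwin::win_to_posix_path('C:\\Windows', 1);
    my $win   = Cygwin::posix_to_win_path($posix, 1);
    print "$posix <-> $win\n";

    # Report whether the root mount is binary or text mounted.
    my $mode = Cygwin::is_binmount('/') ? 'binary' : 'text';
    print "/ is $mode mounted\n";
}
else {
    print "Not running under Cygwin ($^O); Cygwin:: functions are unavailable.\n";
}
```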

INSTALL PERL ON CYGWIN

This will install Perl, including man pages.

    make install 2>&1 | tee log.make-install

NOTE: If STDERR is redirected make install will not prompt you to install perl into /usr/bin.

You may need to be Administrator to run make install . If you are not, you must have write access to the directories in question.

Information on installing the Perl documentation in HTML format can be found in the INSTALL document.

MANIFEST ON CYGWIN

These are the files in the Perl release that contain references to Cygwin. These very brief notes attempt to explain the reason for all conditional code. Hopefully, keeping this up to date will allow the Cygwin port to be kept as clean as possible.

  • Documentation
    1. INSTALL README.cygwin README.win32 MANIFEST
    2. pod/perl.pod pod/perlport.pod pod/perlfaq3.pod
    3. pod/perldelta.pod pod/perl5004delta.pod pod/perl56delta.pod
    4. pod/perl561delta.pod pod/perl570delta.pod pod/perl572delta.pod
    5. pod/perl573delta.pod pod/perl58delta.pod pod/perl581delta.pod
    6. pod/perl590delta.pod pod/perlhist.pod pod/perlmodlib.pod
    7. pod/perltoc.pod Porting/Glossary pod/perlgit.pod
    8. Porting/checkAUTHORS.pl
    9. dist/Cwd/Changes ext/Compress-Raw-Zlib/Changes
    10. ext/Compress-Raw-Zlib/README ext/Compress-Zlib/Changes
    11. ext/DB_File/Changes ext/Encode/Changes ext/Sys-Syslog/Changes
    12. ext/Time-HiRes/Changes ext/Win32API-File/Changes lib/CGI/Changes
    13. lib/ExtUtils/CBuilder/Changes lib/ExtUtils/Changes lib/ExtUtils/NOTES
    14. lib/ExtUtils/PATCHING lib/ExtUtils/README lib/Module/Build/Changes
    15. lib/Net/Ping/Changes lib/Test/Harness/Changes
    16. lib/Term/ANSIColor/ChangeLog lib/Term/ANSIColor/README README.symbian
    17. symbian/TODO
  • Build, Configure, Make, Install
    1. cygwin/Makefile.SHs
    2. ext/IPC/SysV/hints/cygwin.pl
    3. ext/NDBM_File/hints/cygwin.pl
    4. ext/ODBM_File/hints/cygwin.pl
    5. hints/cygwin.sh
    6. Configure - help finding hints from uname,
    7. shared libperl required for dynamic loading
    8. Makefile.SH Cross/Makefile-cross-SH
    9. - linklibperl
    10. Porting/patchls - cygwin in port list
    11. installman - man pages with :: translated to .
    12. installperl - install dll, install to 'pods'
    13. makedepend.SH - uwinfix
    14. regen_lib.pl - file permissions
    15. NetWare/Makefile
    16. plan9/mkfile
    17. symbian/sanity.pl symbian/sisify.pl
    18. hints/uwin.sh
    19. vms/descrip_mms.template
    20. win32/Makefile win32/makefile.mk
  • Tests
    1. t/io/fs.t - no file mode checks if not ntsec
    2. skip rename() check when not check_case:relaxed
    3. t/io/tell.t - binmode
    4. t/lib/cygwin.t - builtin cygwin function tests
    5. t/op/groups.t - basegroup has ID = 0
    6. t/op/magic.t - $^X/symlink WORKAROUND, s/.exe//
    7. t/op/stat.t - no /dev, skip Win32 ftCreationTime quirk
    8. (cache manager sometimes preserves ctime of file
    9. previously created and deleted), no -u (setuid)
    10. t/op/taint.t - can't use empty path under Cygwin Perl
    11. t/op/time.t - no tzset()
  • Compiled Perl Source
    1. EXTERN.h - __declspec(dllimport)
    2. XSUB.h - __declspec(dllexport)
    3. cygwin/cygwin.c - os_extras (getcwd, spawn, and several Cygwin:: functions)
    4. perl.c - os_extras, -i.bak
    5. perl.h - binmode
    6. doio.c - win9x can not rename a file when it is open
    7. pp_sys.c - do not define h_errno, init _pwent_struct.pw_comment
    8. util.c - use setenv
    9. util.h - PERL_FILE_IS_ABSOLUTE macro
    10. pp.c - Comment about Posix vs IEEE math under Cygwin
    11. perlio.c - CR/LF mode
    12. perliol.c - Comment about EXTCONST under Cygwin
  • Compiled Module Source
    1. ext/Compress-Raw-Zlib/Makefile.PL
    2. - Can't install via CPAN shell under Cygwin
    3. ext/Compress-Raw-Zlib/zlib-src/zutil.h
    4. - Cygwin is Unix-like and has vsnprintf
    5. ext/Errno/Errno_pm.PL - Special handling for Win32 Perl under Cygwin
    6. ext/POSIX/POSIX.xs - tzname defined externally
    7. ext/SDBM_File/sdbm/pair.c
    8. - EXTCONST needs to be redefined from EXTERN.h
    9. ext/SDBM_File/sdbm/sdbm.c
    10. - binary open
    11. ext/Sys/Syslog/Syslog.xs
    12. - Cygwin has syslog.h
    13. ext/Sys/Syslog/win32/compile.pl
    14. - Convert paths to Windows paths
    15. ext/Time-HiRes/HiRes.xs
    16. - Various timers not available
    17. ext/Time-HiRes/Makefile.PL
    18. - Find w32api/windows.h
    19. ext/Win32/Makefile.PL - Use various libraries under Cygwin
    20. ext/Win32/Win32.xs - Child dir and child env under Cygwin
    21. ext/Win32API-File/File.xs
    22. - _open_osfhandle not implemented under Cygwin
    23. ext/Win32CORE/Win32CORE.c
    24. - __declspec(dllexport)
  • Perl Modules/Scripts
    1. ext/B/t/OptreeCheck.pm - Comment about stderr/stdout order under Cygwin
    2. ext/Digest-SHA/bin/shasum
    3. - Use binary mode under Cygwin
    4. ext/Sys/Syslog/win32/Win32.pm
    5. - Convert paths to Windows paths
    6. ext/Time-HiRes/HiRes.pm
    7. - Comment about various timers not available
    8. ext/Win32API-File/File.pm
    9. - _open_osfhandle not implemented under Cygwin
    10. ext/Win32CORE/Win32CORE.pm
    11. - History of Win32CORE under Cygwin
    12. lib/CGI.pm - binmode and path separator
    13. lib/CPANPLUS/Dist/MM.pm - Commented out code that fails under Win32/Cygwin
    14. lib/CPANPLUS/Internals/Constants/Report.pm
    15. - OS classifications
    16. lib/CPANPLUS/Internals/Constants.pm
    17. - Constants for Cygwin
    18. lib/CPANPLUS/Internals/Report.pm
    19. - Example of Cygwin report
    20. lib/CPANPLUS/Module.pm
    21. - Abort if running on old Cygwin version
    22. lib/Cwd.pm - hook to internal Cwd::cwd
    23. lib/ExtUtils/CBuilder/Platform/cygwin.pm
    24. - use gcc for ld, and link to libperl.dll.a
    25. lib/ExtUtils/CBuilder.pm
    26. - Cygwin is Unix-like
    27. lib/ExtUtils/Install.pm - Install and rename issues under Cygwin
    28. lib/ExtUtils/MM.pm - OS classifications
    29. lib/ExtUtils/MM_Any.pm - Example for Cygwin
    30. lib/ExtUtils/MakeMaker.pm
    31. - require MM_Cygwin.pm
    32. lib/ExtUtils/MM_Cygwin.pm
    33. - canonpath, cflags, manifypods, perl_archive
    34. lib/File/Fetch.pm - Comment about quotes using a Cygwin example
    35. lib/File/Find.pm - on remote drives stat() always sets st_nlink to 1
    36. lib/File/Spec/Cygwin.pm - case_tolerant
    37. lib/File/Spec/Unix.pm - preserve //unc
    38. lib/File/Spec/Win32.pm - References a message on cygwin.com
    39. lib/File/Spec.pm - Pulls in lib/File/Spec/Cygwin.pm
    40. lib/File/Temp.pm - no directory sticky bit
    41. lib/Module/Build/Compat.pm - Comment references 'make' under Cygwin
    42. lib/Module/Build/Platform/cygwin.pm
    43. - Use '.' for man page separator
    44. lib/Module/Build.pm - Cygwin is Unix-like
    45. lib/Module/CoreList.pm - List of all module files and versions
    46. lib/Net/Domain.pm - No domainname command under Cygwin
    47. lib/Net/Netrc.pm - Bypass using stat() under Cygwin
    48. lib/Net/Ping.pm - ECONNREFUSED is EAGAIN under Cygwin
    49. lib/Pod/Find.pm - Set 'pods' dir
    50. lib/Pod/Perldoc/ToMan.pm - '-c' switch for pod2man
    51. lib/Pod/Perldoc.pm - Use 'less' pager, and use .exe extension
    52. lib/Term/ANSIColor.pm - Cygwin terminal info
    53. lib/perl5db.pl - use stdin not /dev/tty
    54. utils/perlbug.PL - Add CYGWIN environment variable to report
  • Perl Module Tests
    1. dist/Cwd/t/cwd.t
    2. ext/Compress-Zlib/t/14gzopen.t
    3. ext/DB_File/t/db-btree.t
    4. ext/DB_File/t/db-hash.t
    5. ext/DB_File/t/db-recno.t
    6. ext/DynaLoader/t/DynaLoader.t
    7. ext/File-Glob/t/basic.t
    8. ext/GDBM_File/t/gdbm.t
    9. ext/POSIX/t/sysconf.t
    10. ext/POSIX/t/time.t
    11. ext/SDBM_File/t/sdbm.t
    12. ext/Sys/Syslog/t/syslog.t
    13. ext/Time-HiRes/t/HiRes.t
    14. ext/Win32/t/Unicode.t
    15. ext/Win32API-File/t/file.t
    16. ext/Win32CORE/t/win32core.t
    17. lib/AnyDBM_File.t
    18. lib/Archive/Extract/t/01_Archive-Extract.t
    19. lib/Archive/Tar/t/02_methods.t
    20. lib/CPANPLUS/t/05_CPANPLUS-Internals-Fetch.t
    21. lib/CPANPLUS/t/20_CPANPLUS-Dist-MM.t
    22. lib/ExtUtils/t/Embed.t
    23. lib/ExtUtils/t/eu_command.t
    24. lib/ExtUtils/t/MM_Cygwin.t
    25. lib/ExtUtils/t/MM_Unix.t
    26. lib/File/Compare.t
    27. lib/File/Copy.t
    28. lib/File/Find/t/find.t
    29. lib/File/Path.t
    30. lib/File/Spec/t/crossplatform.t
    31. lib/File/Spec/t/Spec.t
    32. lib/Module/Build/t/destinations.t
    33. lib/Net/hostent.t
    34. lib/Net/Ping/t/110_icmp_inst.t
    35. lib/Net/Ping/t/500_ping_icmp.t
    36. lib/Net/t/netrc.t
    37. lib/Pod/Simple/t/perlcyg.pod
    38. lib/Pod/Simple/t/perlcygo.txt
    39. lib/Pod/Simple/t/perlfaq.pod
    40. lib/Pod/Simple/t/perlfaqo.txt
    41. lib/User/grent.t
    42. lib/User/pwent.t

BUGS ON CYGWIN

Support for swapping real and effective user and group IDs is incomplete. On WinNT Cygwin provides setuid() , seteuid() , setgid() and setegid() . However, additional Cygwin calls for manipulating WinNT access tokens and security contexts are required.

AUTHORS

Charles Wilson <cwilson@ece.gatech.edu>, Eric Fifer <egf7@columbia.edu>, alexander smishlajev <als@turnhere.com>, Steven Morlock <newspost@morlock.net>, Sebastien Barre <Sebastien.Barre@utc.fr>, Teun Burgers <burgers@ecn.nl>, Gerrit P. Haase <gp@familiehaase.de>, Reini Urban <rurban@cpan.org>, Jan Dubois <jand@activestate.com>, Jerry D. Hedden <jdhedden@cpan.org>.

HISTORY

Last updated: 2012-02-08

 
perldata

NAME

perldata - Perl data types

DESCRIPTION

Variable names

Perl has three built-in data types: scalars, arrays of scalars, and associative arrays of scalars, known as "hashes". A scalar is a single string (of any size, limited only by the available memory), number, or a reference to something (which will be discussed in perlref). Normal arrays are ordered lists of scalars indexed by number, starting with 0. Hashes are unordered collections of scalar values indexed by their associated string key.

Values are usually referred to by name, or through a named reference. The first character of the name tells you to what sort of data structure it refers. The rest of the name tells you the particular value to which it refers. Usually this name is a single identifier, that is, a string beginning with a letter or underscore, and containing letters, underscores, and digits. In some cases, it may be a chain of identifiers, separated by :: (or by the slightly archaic '); all but the last are interpreted as names of packages, to locate the namespace in which to look up the final identifier (see Packages in perlmod for details). For a more in-depth discussion on identifiers, see Identifier parsing. It's possible to substitute for a simple identifier, an expression that produces a reference to the value at runtime. This is described in more detail below and in perlref.

Perl also has its own built-in variables whose names don't follow these rules. They have strange names so they don't accidentally collide with one of your normal variables. Strings that match parenthesized parts of a regular expression are saved under names containing only digits after the $ (see perlop and perlre). In addition, several special variables that provide windows into the inner working of Perl have names containing punctuation characters and control characters. These are documented in perlvar.

Scalar values are always named with '$', even when referring to a scalar that is part of an array or a hash. The '$' symbol works semantically like the English word "the" in that it indicates a single value is expected.

    $days               # the simple scalar value "days"
    $days[28]           # the 29th element of array @days
    $days{'Feb'}        # the 'Feb' value from hash %days
    $#days              # the last index of array @days

Entire arrays (and slices of arrays and hashes) are denoted by '@', which works much as the word "these" or "those" does in English, in that it indicates multiple values are expected.

    @days               # ($days[0], $days[1],... $days[n])
    @days[3,4,5]        # same as ($days[3],$days[4],$days[5])
    @days{'a','c'}      # same as ($days{'a'},$days{'c'})

Entire hashes are denoted by '%':

    %days               # (key1, val1, key2, val2 ...)
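
The sigil rules above can be tried out in a short program. This is an illustrative sketch only; the contents of @days and %days are made up for the example.

```perl
use strict;
use warnings;

my $days = 'a plain scalar';                    # the scalar $days
my @days = map { "day$_" } 0 .. 30;             # the array @days, indices 0..30
my %days = (Jan => 31, Feb => 28, Mar => 31);   # the hash %days

my $element = $days[28];             # one element of @days, hence '$'
my $feb     = $days{'Feb'};          # one value of %days, hence '$'
my $last    = $#days;                # the last index of @days
my @slice   = @days[3, 4, 5];        # array slice: several values, hence '@'
my @lengths = @days{'Jan', 'Mar'};   # hash slice: several values, hence '@'

print "$element $feb $last @slice @lengths\n";
# prints "day28 28 30 day3 day4 day5 31 31"
```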

In addition, subroutines are named with an initial '&', though this is optional when unambiguous, just as the word "do" is often redundant in English. Symbol table entries can be named with an initial '*', but you don't really care about that yet (if ever :-).

Every variable type has its own namespace, as do several non-variable identifiers. This means that you can, without fear of conflict, use the same name for a scalar variable, an array, or a hash--or, for that matter, for a filehandle, a directory handle, a subroutine name, a format name, or a label. This means that $foo and @foo are two different variables. It also means that $foo[1] is a part of @foo, not a part of $foo. This may seem a bit weird, but that's okay, because it is weird.
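
As a small illustration of the separate namespaces, the following sketch uses the name foo for a scalar, an array, a hash, and a subroutine at the same time. The values are arbitrary.

```perl
use strict;
use warnings;

our $foo = 'scalar';
our @foo = ('array', 'values');
our %foo = (kind => 'hash');
sub foo { return 'subroutine' }

# $foo[1] is an element of @foo; the scalar $foo is not affected.
$foo[1] = 'changed';

my $summary = join ' | ', $foo, "@foo", $foo{kind}, foo();
print "$summary\n";
# prints "scalar | array changed | hash | subroutine"
```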

Because variable references always start with '$', '@', or '%', the "reserved" words aren't in fact reserved with respect to variable names. They are reserved with respect to labels and filehandles, however, which don't have an initial special character. You can't have a filehandle named "log", for instance. Hint: you could say open(LOG,'logfile') rather than open(log,'logfile'). Using uppercase filehandles also improves readability and protects you from conflict with future reserved words. Case is significant--"FOO", "Foo", and "foo" are all different names. Names that start with a letter or underscore may also contain digits and underscores.

It is possible to replace such an alphanumeric name with an expression that returns a reference to the appropriate type. For a description of this, see perlref.

Names that start with a digit may contain only more digits. Names that do not start with a letter, underscore, digit or a caret (i.e. a control character) are limited to one character, e.g., $% or $$ . (Most of these one character names have a predefined significance to Perl. For instance, $$ is the current process id.)

Identifier parsing

Up until Perl 5.18, the actual rules for what makes a valid identifier were a bit fuzzy. However, in general, anything defined here should work on previous versions of Perl, while the opposite -- edge cases that work in previous versions, but aren't defined here -- probably won't work on newer versions. As an important side note, the following applies only to bareword identifiers as found in Perl source code, not to identifiers introduced through symbolic references, which have far fewer restrictions. If working under the effect of the use utf8; pragma, the following rules apply:

  1. / (?[ ( \p{Word} & \p{XID_Start} ) + [_] ]) \p{XID_Continue}* /x

If not under use utf8 , the source is treated as ASCII + 128 extra controls, and identifiers should match

  1. / (?aa) (?!\d) \w+ /x

That is, any word character in the ASCII range, as long as the first character is not a digit.
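
To illustrate the non-utf8 rule, here is a small sketch that anchors the pattern above with \A and \z (an addition for the demo) and tries it on a few made-up names:

```perl
use strict;
use warnings;

# Anchored version of the non-utf8 pattern from the text.
my $ascii_identifier = qr/\A (?aa) (?!\d) \w+ \z/x;

my %is_identifier;
for my $name (qw(foo _private foo2 2foo foo-bar)) {
    $is_identifier{$name} = $name =~ $ascii_identifier ? 1 : 0;
    printf "%-8s %s\n", $name,
        $is_identifier{$name} ? "matches" : "does not match";
}
```

Only the names that start with a letter or underscore and contain nothing but word characters should match.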

There are two package separators in Perl: A double colon (:: ) and a single quote ('). Normal identifiers can start or end with a double colon, and can contain several parts delimited by double colons. Single quotes have similar rules, but with the exception that they are not legal at the end of an identifier: That is, $'foo and $foo'bar are legal, but $foo'bar' is not.

Moreover, normal identifiers can start or end with any number of double colons (::), and can contain several parts delimited by double colons. Additionally, if the identifier is preceded by a sigil -- that is, if the identifier is part of a variable name -- it may optionally be enclosed in braces.

While you can mix double colons with single quotes, the quotes must come after the colons: $::::'foo and $foo::'bar are legal, but $::'::foo and $foo'::bar are not.

Put together, a grammar to match a basic identifier becomes

    /
      (?(DEFINE)
          (?<variable>
              (?&sigil)
              (?:
                      (?&normal_identifier)
                  |   \{ \s* (?&normal_identifier) \s* \}
              )
          )
          (?<normal_identifier>
              (?: :: )* '?
              (?&basic_identifier)
              (?: (?= (?: :: )+ '? | (?: :: )* ' ) (?&normal_identifier) )?
              (?: :: )*
          )
          (?<basic_identifier>
              # is use utf8 on?
              (?(?{ (caller(0))[8] & $utf8::hint_bits })
                  (?&Perl_XIDS) \p{XID_Continue}*
                | (?aa) (?!\d) \w+
              )
          )
          (?<sigil> [&*\$\@\%])
          (?<Perl_XIDS> (?[ ( \p{Word} & \p{XID_Start} ) + [_] ]) )
      )
    /x

Meanwhile, special identifiers don't follow the above rules. For the most part, all of the identifiers in this category have a special meaning given by Perl. Because they have special parsing rules, these generally can't be fully-qualified. They come in four forms:

  • A sigil, followed solely by digits matching \p{POSIX_Digit}, like $0 , $1 , or $10000 .
  • A sigil, followed by either a caret and a single POSIX uppercase letter, like $^V or $^W , or a sigil followed by a literal control character matching the \p{POSIX_Cntrl} property. Due to a historical oddity, if not running under use utf8 , the 128 extra controls in the [0x80-0xff] range may also be used in length one variables.
  • Similar to the above, a sigil, followed by bareword text in braces, where the first character is either a caret followed by an uppercase letter, or a literal control character, like ${^GLOBAL_PHASE} or ${\7LOBAL_PHASE} .
  • A sigil followed by a single character matching the \p{POSIX_Punct} property, like $! or %+ .

Context

The interpretation of operations and values in Perl sometimes depends on the requirements of the context around the operation or value. There are two major contexts: list and scalar. Certain operations return list values in contexts wanting a list, and scalar values otherwise. If this is true of an operation it will be mentioned in the documentation for that operation. In other words, Perl overloads certain operations based on whether the expected return value is singular or plural. Some words in English work this way, like "fish" and "sheep".

In a reciprocal fashion, an operation provides either a scalar or a list context to each of its arguments. For example, if you say

  1. int( <STDIN> )

the integer operation provides scalar context for the <> operator, which responds by reading one line from STDIN and passing it back to the integer operation, which will then find the integer value of that line and return that. If, on the other hand, you say

  1. sort( <STDIN> )

then the sort operation provides list context for <>, which will proceed to read every line available up to the end of file, and pass that list of lines back to the sort routine, which will then sort those lines and return them as a list to whatever the context of the sort was.
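
The difference can be sketched with an in-memory filehandle standing in for STDIN (an assumption made so the example is self-contained); the input lines are made up:

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for STDIN in this demo.
my $input = "42\n7\n19\n";
open(my $in, '<', \$input) or die "Cannot open in-memory file: $!";

my $first  = int(<$in>);     # scalar context: reads only the first line
my @sorted = sort(<$in>);    # list context: reads all remaining lines
close($in);

chomp(@sorted);
print "first: $first\n";
print "sorted: @sorted\n";
```

Note that sort compares the remaining lines as strings here, so "19" comes before "7".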

Assignment is a little bit special in that it uses its left argument to determine the context for the right argument. Assignment to a scalar evaluates the right-hand side in scalar context, while assignment to an array or hash evaluates the right-hand side in list context. Assignment to a list (or slice, which is just a list anyway) also evaluates the right-hand side in list context.
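
A brief sketch of how the left side of an assignment picks the context, using a made-up array @items:

```perl
use strict;
use warnings;

my @items = ('a', 'b', 'c');

my $count   = @items;    # scalar assignment: the array yields its length
my ($first) = @items;    # list assignment: the first element is copied
my @copy    = @items;    # list assignment: every element is copied

print "count=$count first=$first copy=@copy\n";
```

The parentheses around $first are what turn the second assignment into a list assignment.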

When you use the use warnings pragma or Perl's -w command-line option, you may see warnings about useless uses of constants or functions in "void context". Void context just means the value has been discarded, such as a statement containing only "fred"; or getpwuid(0);. It still counts as scalar context for functions that care whether or not they're being called in list context.

User-defined subroutines may choose to care whether they are being called in a void, scalar, or list context. Most subroutines do not need to bother, though. That's because both scalars and lists are automatically interpolated into lists. See wantarray for how you would dynamically discern your function's calling context.
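
As an illustration, a hypothetical helper named context can use wantarray to report how it was called; wantarray is true in list context, false but defined in scalar context, and undefined in void context:

```perl
use strict;
use warnings;

my @seen;    # records the context of each call

# Hypothetical helper: reports the context it was called in.
sub context {
    my $context = wantarray          ? 'list'
                : defined(wantarray) ? 'scalar'
                :                      'void';
    push @seen, $context;
    return $context;
}

my @list   = context();    # list context
my $scalar = context();    # scalar context
context();                 # void context

print "@seen\n";
```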

Scalar values

All data in Perl is a scalar, an array of scalars, or a hash of scalars. A scalar may contain one single value in any of three different flavors: a number, a string, or a reference. In general, conversion from one form to another is transparent. Although a scalar may not directly hold multiple values, it may contain a reference to an array or hash which in turn contains multiple values.

Scalars aren't necessarily one thing or another. There's no place to declare a scalar variable to be of type "string", type "number", type "reference", or anything else. Because of the automatic conversion of scalars, operations that return scalars don't need to care (and in fact, cannot care) whether their caller is looking for a string, a number, or a reference. Perl is a contextually polymorphic language whose scalars can be strings, numbers, or references (which includes objects). Although strings and numbers are considered pretty much the same thing for nearly all purposes, references are strongly-typed, uncastable pointers with builtin reference-counting and destructor invocation.

A scalar value is interpreted as FALSE in the Boolean sense if it is undefined, the null string or the number 0 (or its string equivalent, "0"), and TRUE if it is anything else. The Boolean context is just a special kind of scalar context where no conversion to a string or a number is ever performed.
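
The rule can be checked with a short sketch; the sample values below are chosen for illustration. Note that strings such as "0.0" and "00" are true, because only the exact string "0" counts as false:

```perl
use strict;
use warnings;

my @false_values = (undef, '', 0, '0');
my @true_values  = ('0.0', '00', ' ', '0E0', 'a', -1);

# grep in scalar context returns the number of matching elements.
my $false_count = grep { !$_ } @false_values;
my $true_count  = grep { $_ } @true_values;

print "$false_count of ", scalar(@false_values), " values are false\n";
print "$true_count of ", scalar(@true_values), " values are true\n";
```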

There are actually two varieties of null strings (sometimes referred to as "empty" strings), a defined one and an undefined one. The defined version is just a string of length zero, such as "" . The undefined version is the value that indicates that there is no real value for something, such as when there was an error, or at end of file, or when you refer to an uninitialized variable or element of an array or hash. Although in early versions of Perl, an undefined scalar could become defined when first used in a place expecting a defined value, this no longer happens except for rare cases of autovivification as explained in perlref. You can use the defined() operator to determine whether a scalar value is defined (this has no meaning on arrays or hashes), and the undef() operator to produce an undefined value.

To find out whether a given string is a valid non-zero number, it's sometimes enough to test it against both numeric 0 and also lexical "0" (although this will cause noises if warnings are on). That's because strings that aren't numbers count as 0, just as they do in awk:

    if ($str == 0 && $str ne "0") {
        warn "That doesn't look like a number";
    }

That method may be best because otherwise you won't treat IEEE notations like NaN or Infinity properly. At other times, you might prefer to determine whether string data can be used numerically by calling the POSIX::strtod() function or by inspecting your string with a regular expression (as documented in perlre).

    warn "has nondigits"        if     /\D/;
    warn "not a natural number" unless /^\d+$/;             # rejects -3
    warn "not an integer"       unless /^-?\d+$/;           # rejects +3
    warn "not an integer"       unless /^[+-]?\d+$/;
    warn "not a decimal number" unless /^-?\d+\.?\d*$/;     # rejects .2
    warn "not a decimal number" unless /^-?(?:\d+(?:\.\d*)?|\.\d+)$/;
    warn "not a C float"
        unless /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;
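
As a sketch, the last ("C float") pattern above can be stored with qr// and tried against a handful of made-up candidate strings:

```perl
use strict;
use warnings;

# The "C float" pattern from the list above, stored for reuse.
my $c_float = qr/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;

my %looks_like_float;
for my $candidate ('42', '-3.5', '.5', '5.', '1e10', '-.5E-3',
                   '', 'abc', '1.2.3', 'e5') {
    $looks_like_float{$candidate} = $candidate =~ $c_float ? 1 : 0;
    printf "'%s' %s\n", $candidate,
        $looks_like_float{$candidate} ? "is a C float" : "is not a C float";
}
```

The first six candidates should be accepted and the last four rejected.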

The length of an array is a scalar value. You may find the last subscript of array @days by evaluating $#days , as in csh. Note that this isn't the length of the array; it's the subscript of the last element, which is a different value (one less than the length) since there is ordinarily a 0th element. Assigning to $#days actually changes the length of the array. Shortening an array this way destroys intervening values. Lengthening an array that was previously shortened does not recover values that were in those elements.

You can also gain some minuscule measure of efficiency by pre-extending an array that is going to get big. You can also extend an array by assigning to an element that is off the end of the array. You can truncate an array down to nothing by assigning the null list () to it. The following are equivalent:

  1. @whatever = ();
  2. $#whatever = -1;

If you evaluate an array in scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value, like the C comma operator, nor of built-in functions, which return whatever they feel like returning.) The following is always true:

  1. scalar(@whatever) == $#whatever + 1;
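
A small sketch of the relationship between $#days and the length, including the effect of assigning to $#days (the array contents are made up):

```perl
use strict;
use warnings;

my @days = qw(Mon Tue Wed Thu Fri);

my $last_index = $#days;           # 4
my $length     = scalar(@days);    # 5

$#days = 2;                        # shorten: Thu and Fri are destroyed
my $shortened = scalar(@days);     # 3

$#days = 4;                        # lengthen again: new elements are undef
my $recovered = defined $days[4] ? 1 : 0;    # 0, "Fri" does not come back

print "$last_index $length $shortened $recovered\n";
```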

Some programmers choose to use an explicit conversion so as to leave nothing to doubt:

  1. $element_count = scalar(@whatever);

If you evaluate a hash in scalar context, it returns false if the hash is empty. If there are any key/value pairs, it returns true; more precisely, the value returned is a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash. This is pretty much useful only to find out whether Perl's internal hashing algorithm is performing poorly on your data set. For example, you stick 10,000 things in a hash, but evaluating %HASH in scalar context reveals "1/16" , which means only one out of sixteen buckets has been touched, and presumably contains all 10,000 of your items. This isn't supposed to happen. If a tied hash is evaluated in scalar context, the SCALAR method is called (with a fallback to FIRSTKEY ).

You can preallocate space for a hash by assigning to the keys() function. This rounds up the allocated buckets to the next power of two:

  1. keys(%users) = 1000; # allocate 1024 buckets

Scalar value constructors

Numeric literals are specified in any of the following floating point or integer formats:

  1. 12345
  2. 12345.67
  3. .23E-10 # a very small number
  4. 3.14_15_92 # a very important number
  5. 4_294_967_296 # underscore for legibility
  6. 0xff # hex
  7. 0xdead_beef # more hex
  8. 0377 # octal (only numbers, begins with 0)
  9. 0b011011 # binary

You are allowed to use underscores (underbars) in numeric literals between digits for legibility (but not multiple underscores in a row: 23__500 is not legal; 23_500 is). You could, for example, group binary digits by threes (as for a Unix-style mode argument such as 0b110_100_100) or by fours (to represent nibbles, as in 0b1010_0110) or in other groups.
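
The following sketch checks that the underscored and prefixed literals denote the expected values; the variable names are illustrative only:

```perl
use strict;
use warnings;

my $big    = 4_294_967_296;    # same number as 4294967296
my $hex    = 0xdead_beef;      # hexadecimal
my $octal  = 0377;             # octal, because of the leading 0
my $binary = 0b011011;         # binary
my $mode   = 0b110_100_100;    # binary digits grouped by threes

print "underscores are ignored\n" if $big == 4294967296;
print "octal 0377 is 255\n"       if $octal == 255;
print "binary 0b011011 is 27\n"   if $binary == 27;
print "mode equals 0644\n"        if $mode == 0644;
```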

String literals are usually delimited by either single or double quotes. They work much like quotes in the standard Unix shells: double-quoted string literals are subject to backslash and variable substitution; single-quoted strings are not (except for \' and \\ ). The usual C-style backslash rules apply for making characters such as newline, tab, etc., as well as some more exotic forms. See Quote and Quote-like Operators in perlop for a list.

Hexadecimal, octal, or binary representations in string literals (e.g. '0xff') are not automatically converted to their integer representation. The hex() and oct() functions make these conversions for you. See hex and oct for more details.
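
A short sketch of the difference between explicit conversion and ordinary numeric conversion of such strings (the warning is silenced only for the demonstration):

```perl
use strict;
use warnings;

my $from_hex = hex('0xff');        # 255
my $from_oct = oct('0377');        # 255
my $also_hex = oct('0xff');        # 255: oct() recognizes the 0x prefix
my $from_bin = oct('0b011011');    # 27

my $naive;
{
    no warnings 'numeric';         # silence "isn't numeric" for the demo
    $naive = '0xff' + 0;           # 0: only the leading "0" is used
}

print "$from_hex $from_oct $also_hex $from_bin $naive\n";
```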

You can also embed newlines directly in your strings, i.e., they can end on a different line than they begin. This is nice, but if you forget your trailing quote, the error will not be reported until Perl finds another line containing the quote character, which may be much further on in the script. Variable substitution inside strings is limited to scalar variables, arrays, and array or hash slices. (In other words, names beginning with $ or @, followed by an optional bracketed expression as a subscript.) The following code segment prints out "The price is $100."

  1. $Price = '$100'; # not interpolated
  2. print "The price is $Price.\n"; # interpolated

There is no double interpolation in Perl, so the $100 is left as is.

By default floating point numbers substituted inside strings use the dot (".") as the decimal separator. If use locale is in effect, and POSIX::setlocale() has been called, the character used for the decimal separator is affected by the LC_NUMERIC locale. See perllocale and POSIX.

As in some shells, you can enclose the variable name in braces to disambiguate it from following alphanumerics (and underscores). You must also do this when interpolating a variable into a string to separate the variable name from a following double-colon or an apostrophe, since these would be otherwise treated as a package separator:

  1. $who = "Larry";
  2. print PASSWD "${who}::0:0:Superuser:/:/bin/perl\n";
  3. print "We use ${who}speak when ${who}'s here.\n";

Without the braces, Perl would have looked for a $whospeak, a $who::0 , and a $who's variable. The last two would be the $0 and the $s variables in the (presumably) non-existent package who .

In fact, a simple identifier within such curlies is forced to be a string, and likewise within a hash subscript. Neither need quoting. Our earlier example, $days{'Feb'} can be written as $days{Feb} and the quotes will be assumed automatically. But anything more complicated in the subscript will be interpreted as an expression. This means for example that $version{2.0}++ is equivalent to $version{2}++ , not to $version{'2.0'}++ .
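
The subscript rules can be sketched as follows; %days and %version are stand-ins for the hashes mentioned above:

```perl
use strict;
use warnings;

my %days = (Feb => 28);
my $bare   = $days{Feb};      # simple identifier: quoted automatically
my $quoted = $days{'Feb'};    # the same element

my %version;
$version{2.0}++;              # an expression: the number 2.0 becomes key "2"
$version{'2.0'}++;            # a string: the key is "2.0"

my @keys = sort keys %version;
print "keys: @keys\n";
```

Two distinct keys end up in %version, which is easy to overlook when version numbers are used as hash keys.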

Version Strings

A literal of the form v1.20.300.4000 is parsed as a string composed of characters with the specified ordinals. This form, known as v-strings, provides an alternative, more readable way to construct strings, rather than use the somewhat less readable interpolation form "\x{1}\x{14}\x{12c}\x{fa0}" . This is useful for representing Unicode strings, and for comparing version "numbers" using the string comparison operators, cmp , gt , lt etc. If there are two or more dots in the literal, the leading v may be omitted.

  1. print v9786; # prints SMILEY, "\x{263a}"
  2. print v102.111.111; # prints "foo"
  3. print 102.111.111; # same

Such literals are accepted by both require and use for doing a version check. Note that using the v-strings for IPv4 addresses is not portable unless you also use the inet_aton()/inet_ntoa() routines of the Socket package.
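
As a sketch of why v-strings suit version comparisons, the example below compares two versions both as v-strings and as plain strings, and uses sprintf's %vd format to print a v-string readably:

```perl
use strict;
use warnings;

my $version = v1.20.300.4000;
my $dotted  = sprintf("%vd", $version);    # "1.20.300.4000"
my $chars   = length($version);            # 4 characters

# As v-strings, 5.10.0 sorts after 5.8.9, because chr(10) gt chr(8).
my $vstring_newer = (v5.10.0 gt v5.8.9) ? 1 : 0;

# As plain strings the comparison goes wrong, because "1" lt "8".
my $plain_newer = ('5.10.0' gt '5.8.9') ? 1 : 0;

print "$dotted has $chars characters\n";
print "v-string: $vstring_newer, plain string: $plain_newer\n";
```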

Note that since Perl 5.8.1 the single-number v-strings (like v65 ) are not v-strings before the => operator (which is usually used to separate a hash key from a hash value); instead they are interpreted as literal strings ('v65'). They were v-strings from Perl 5.6.0 to Perl 5.8.0, but that caused more confusion and breakage than good. Multi-number v-strings like v65.66 and 65.66.67 continue to be v-strings always.

Special Literals

The special literals __FILE__, __LINE__, and __PACKAGE__ represent the current filename, line number, and package name at that point in your program. __SUB__ gives a reference to the current subroutine. They may be used only as separate tokens; they will not be interpolated into strings. If there is no current package (due to an empty package; directive), __PACKAGE__ is the undefined value. (But the empty package; is no longer supported, as of version 5.10.) Outside of a subroutine, __SUB__ is the undefined value. __SUB__ is only available in 5.16 or higher, and only with a use v5.16 or use feature "current_sub" declaration.
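
A minimal sketch showing that these literals work as separate tokens but are not interpolated into strings; the package name Demo::Literals is hypothetical:

```perl
use strict;
use warnings;

package Demo::Literals;    # hypothetical package name

my $package          = __PACKAGE__;            # "Demo::Literals"
my $line             = __LINE__;               # line number of this statement
my $not_interpolated = "__PACKAGE__";          # stays literal text
my $spliced          = "in " . __PACKAGE__;    # concatenate instead

print "$package\n$not_interpolated\n$spliced\n";

package main;
```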

The two control characters ^D and ^Z, and the tokens __END__ and __DATA__ may be used to indicate the logical end of the script before the actual end of file. Any following text is ignored.

Text after __DATA__ may be read via the filehandle PACKNAME::DATA , where PACKNAME is the package that was current when the __DATA__ token was encountered. The filehandle is left open pointing to the line after __DATA__. The program should close DATA when it is done reading from it. (Leaving it open leaks filehandles if the module is reloaded for any reason, so it's a safer practice to close it.) For compatibility with older scripts written before __DATA__ was introduced, __END__ behaves like __DATA__ in the top level script (but not in files loaded with require or do) and leaves the remaining contents of the file accessible via main::DATA .

See SelfLoader for more description of __DATA__, and an example of its use. Note that you cannot read from the DATA filehandle in a BEGIN block: the BEGIN block is executed as soon as it is seen (during compilation), at which point the corresponding __DATA__ (or __END__) token has not yet been seen.

Barewords

A word that has no other interpretation in the grammar will be treated as if it were a quoted string. These are known as "barewords". As with filehandles and labels, a bareword that consists entirely of lowercase letters risks conflict with future reserved words, and if you use the use warnings pragma or the -w switch, Perl will warn you about any such words. Perl limits barewords (like identifiers) to about 250 characters. Future versions of Perl are likely to eliminate these arbitrary limitations.

Some people may wish to outlaw barewords entirely. If you say

  1. use strict 'subs';

then any bareword that would NOT be interpreted as a subroutine call produces a compile-time error instead. The restriction lasts to the end of the enclosing block. An inner block may countermand this by saying no strict 'subs' .
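
Because the error is raised at compile time, the sketch below wraps each variant in a string eval so the surrounding program keeps running; the bareword hello is made up:

```perl
use strict;
use warnings;

# Without strict subs, the bareword is treated as the string "hello".
my $allowed = eval q{
    no strict 'subs';
    no warnings 'reserved';
    my $word = hello;
    $word;
};

# With strict subs, the same code fails to compile.
my $rejected = eval q{
    use strict 'subs';
    no warnings 'reserved';
    my $word = hello;
    $word;
};
my $error = $@;

print "allowed: $allowed\n";
print "rejected with: $error" unless defined $rejected;
```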

Array Interpolation

Arrays and slices are interpolated into double-quoted strings by joining the elements with the delimiter specified in the $" variable ($LIST_SEPARATOR if "use English;" is specified), space by default. The following are equivalent:

  1. $temp = join($", @ARGV);
  2. system "echo $temp";
  3. system "echo @ARGV";
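
A small sketch of the separator at work, using a made-up array and a localized change to $" :

```perl
use strict;
use warnings;

my @parts = qw(alpha beta gamma);

my $default = "@parts";          # joined with a space by default

my $custom;
{
    local $" = ', ';             # change the separator for this block only
    $custom = "@parts[0, 1]";    # slices are joined the same way
}

print "$default\n$custom\n";
```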

Within search patterns (which also undergo double-quotish substitution) there is an unfortunate ambiguity: Is /$foo[bar]/ to be interpreted as /${foo}[bar]/ (where [bar] is a character class for the regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to array @foo)? If @foo doesn't otherwise exist, then it's obviously a character class. If @foo exists, Perl takes a good guess about [bar] , and is almost always right. If it does guess wrong, or if you're just plain paranoid, you can force the correct interpretation with curly braces as above.

If you're looking for the information on how to use here-documents, which used to be here, that's been moved to Quote and Quote-like Operators in perlop.

List value constructors

List values are denoted by separating individual values by commas (and enclosing the list in parentheses where precedence requires it):

  1. (LIST)

In a context not requiring a list value, the value of what appears to be a list literal is simply the value of the final element, as with the C comma operator. For example,

  1. @foo = ('cc', '-E', $bar);

assigns the entire list value to array @foo, but

  1. $foo = ('cc', '-E', $bar);

assigns the value of variable $bar to the scalar variable $foo. Note that the value of an actual array in scalar context is the length of the array; the following assigns the value 3 to $foo:

  1. @foo = ('cc', '-E', $bar);
  2. $foo = @foo; # $foo gets 3

You may have an optional comma before the closing parenthesis of a list literal, so that you can say:

    @foo = (
        1,
        2,
        3,
    );

To use a here-document to assign an array, one line per element, you might use an approach like this:

    @sauces = <<End_Lines =~ m/(\S.*\S)/g;
        normal tomato
        spicy tomato
        green chile
        pesto
        white wine
    End_Lines

LISTs do automatic interpolation of sublists. That is, when a LIST is evaluated, each element of the list is evaluated in list context, and the resulting list value is interpolated into LIST just as if each individual element were a member of LIST. Thus arrays and hashes lose their identity in a LIST--the list

  1. (@foo,@bar,&SomeSub,%glarch)

contains all the elements of @foo followed by all the elements of @bar, followed by all the elements returned by the subroutine named SomeSub called in list context, followed by the key/value pairs of %glarch. To make a list reference that does NOT interpolate, see perlref.
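
The flattening can be sketched with small made-up containers; some_sub is a hypothetical subroutine returning two values:

```perl
use strict;
use warnings;

my @foo    = (1, 2);
my @bar    = (3);
my %glarch = (key => 'value');
sub some_sub { return ('x', 'y') }

# Everything is flattened into one list of seven scalars.
my @flat = (@foo, @bar, some_sub(), %glarch);

# References keep the arrays apart: two elements, each an array reference.
my @nested = (\@foo, \@bar);

print scalar(@flat), " flat elements, ", scalar(@nested), " references\n";
```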

The null list is represented by (). Interpolating it in a list has no effect. Thus ((),(),()) is equivalent to (). Similarly, interpolating an array with no elements is the same as if no array had been interpolated at that point.

This interpolation combines with the facts that the opening and closing parentheses are optional (except when necessary for precedence) and that lists may end with an optional comma, which means that multiple commas within lists are legal syntax. The list 1,,3 is a concatenation of two lists, 1, and 3 , the first of which ends with that optional comma. That is, 1,,3 is (1,),(3) , which is 1,3 . (Similarly, 1,,,3 is (1,),(,),3 , which is 1,3 , and so on.) Not that we'd advise you to use this obfuscation.

A list value may also be subscripted like a normal array. You must put the list in parentheses to avoid ambiguity. For example:

    # Stat returns list value.
    $time = (stat($file))[8];

    # SYNTAX ERROR HERE.
    $time = stat($file)[8];  # OOPS, FORGOT PARENTHESES

    # Find a hex digit.
    $hexdigit = ('a','b','c','d','e','f')[$digit-10];

    # A "reverse comma operator".
    return (pop(@foo),pop(@foo))[0];

Lists may be assigned to only when each element of the list is itself legal to assign to:

  1. ($a, $b, $c) = (1, 2, 3);
  2. ($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00);

An exception to this is that you may assign to undef in a list. This is useful for throwing away some of the return values of a function:

  1. ($dev, $ino, undef, undef, $uid, $gid) = stat($file);

List assignment in scalar context returns the number of elements produced by the expression on the right side of the assignment:

  1. $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
  2. $x = (($foo,$bar) = f()); # set $x to f()'s return count

This is handy when you want to do a list assignment in a Boolean context, because most list functions return a null list when finished, which when assigned produces a 0, which is interpreted as FALSE.

It's also the source of a useful idiom for executing a function or performing an operation in list context and then counting the number of return values, by assigning to an empty list and then using that assignment in scalar context. For example, this code:

  1. $count = () = $string =~ /\d+/g;

will place into $count the number of digit groups found in $string. This happens because the pattern match is in list context (since it is being assigned to the empty list), and will therefore return a list of all matching parts of the string. The list assignment in scalar context will translate that into the number of elements (here, the number of times the pattern matched) and assign that to $count. Note that simply using

  1. $count = $string =~ /\d+/g;

would not have worked, since a pattern match in scalar context will only return true or false, rather than a count of matches.
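
With a concrete, made-up string the difference looks like this:

```perl
use strict;
use warnings;

my $string = "a1 b22 c333";

my $count   = () = $string =~ /\d+/g;    # list assignment counted: 3
my $matched = $string =~ /\d+/g;         # scalar match: merely true

my ($foo, $bar);
my $assigned = (($foo, $bar) = (3, 2, 1));    # 3, not 2

print "count=$count assigned=$assigned\n";
print "scalar match only reports success\n" if $matched;
```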

The final element of a list assignment may be an array or a hash:

  1. ($a, $b, @rest) = split;
  2. my($a, $b, %rest) = @_;

You can actually put an array or hash anywhere in the list, but the first one in the list will soak up all the values, and anything after it will become undefined. This may be useful in a my() or local().
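
A brief sketch of how the first array or hash in the list soaks up the remaining values (all names are illustrative):

```perl
use strict;
use warnings;

my ($first, @rest)  = (1, 2, 3);    # @rest gets (2, 3)
my (@all, $never)   = (1, 2, 3);    # @all takes everything; $never is undef
my ($label, %pairs) = ('id', a => 1, b => 2);

print "first=$first rest=@rest all=@all\n";
print "\$never is undefined\n" unless defined $never;
print "$label: a=$pairs{a} b=$pairs{b}\n";
```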

A hash can be initialized using a literal list holding pairs of items to be interpreted as a key and a value:

  1. # same as map assignment above
  2. %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);

While literal lists and named arrays are often interchangeable, that's not the case for hashes. Just because you can subscript a list value like a normal array does not mean that you can subscript a list value as a hash. Likewise, hashes included as parts of other lists (including parameters lists and return lists from functions) always flatten out into key/value pairs. That's why it's good to use references sometimes.

It is often more readable to use the => operator between key/value pairs. The => operator is mostly just a more visually distinctive synonym for a comma, but it also arranges for its left-hand operand to be interpreted as a string if it's a bareword that would be a legal simple identifier. => doesn't quote compound identifiers that contain double colons. This makes it nice for initializing hashes:

    %map = (
        red   => 0x00f,
        blue  => 0x0f0,
        green => 0xf00,
    );

or for initializing hash references to be used as records:

    $rec = {
        witch => 'Mable the Merciless',
        cat   => 'Fluffy the Ferocious',
        date  => '10/31/1776',
    };

or for using call-by-named-parameter to complicated functions:

    $field = $query->radio_group(
        name      => 'group_name',
        values    => ['eenie','meenie','minie'],
        default   => 'meenie',
        linebreak => 'true',
        labels    => \%labels
    );

Note that just because a hash is initialized in that order doesn't mean that it comes out in that order. See sort for examples of how to arrange for an output ordering.

If a key appears more than once in the initializer list of a hash, the last occurrence wins:

    %circle = (
        center => [5, 10],
        center => [27, 9],
        radius => 100,
        color  => [0xDF, 0xFF, 0x00],
        radius => 54,
    );

    # same as
    %circle = (
        center => [27, 9],
        color  => [0xDF, 0xFF, 0x00],
        radius => 54,
    );

This can be used to provide overridable configuration defaults:

  1. # values in %args take priority over %config_defaults
  2. %config = (%config_defaults, %args);
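
With some made-up settings, the override pattern plays out as follows:

```perl
use strict;
use warnings;

# Hypothetical defaults and caller-supplied arguments.
my %config_defaults = (host => 'localhost', port => 8080, debug => 0);
my %args            = (port => 9090);

# Later pairs win, so values in %args replace the defaults.
my %config = (%config_defaults, %args);

for my $key (sort keys %config) {
    print "$key = $config{$key}\n";
}
```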

Subscripts

An array can be accessed one scalar at a time by specifying a dollar sign ($ ), then the name of the array (without the leading @ ), then the subscript inside square brackets. For example:

    @myarray = (5, 50, 500, 5000);
    print "The Third Element is ", $myarray[2], "\n";

The array indices start with 0. A negative subscript retrieves its value from the end. In our example, $myarray[-1] would have been 5000, and $myarray[-2] would have been 500.

Hash subscripts are similar, only instead of square brackets curly brackets are used. For example:

    %scientists =
    (
        "Newton"   => "Isaac",
        "Einstein" => "Albert",
        "Darwin"   => "Charles",
        "Feynman"  => "Richard",
    );

    print "Darwin's First Name is ", $scientists{"Darwin"}, "\n";

You can also subscript a list to get a single element from it:

  1. $dir = (getpwnam("daemon"))[7];

Multi-dimensional array emulation

Multidimensional arrays may be emulated by subscripting a hash with a list. The elements of the list are joined with the subscript separator (see $; in perlvar).

  1. $foo{$a,$b,$c}

is equivalent to

  1. $foo{join($;, $a, $b, $c)}

The default subscript separator is "\034", the same as SUBSEP in awk.
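
A minimal sketch of the emulation, assuming the default value of $; is still in effect; %grid and the coordinates are made up:

```perl
use strict;
use warnings;

my %grid;
my ($row, $col) = (1, 2);

$grid{$row, $col} = 'x';    # one key, built by joining with $;

my ($key) = keys %grid;
my $same  = $key eq join($;, $row, $col) ? 1 : 0;

# Recover the coordinates; this assumes the default separator "\034".
my @coords = split /\034/, $key;

print "same key: $same, coordinates: @coords\n";
```

Keep in mind that this is still a single flat hash, so keys containing the separator character themselves would be ambiguous.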

Slices

A slice accesses several elements of a list, an array, or a hash simultaneously using a list of subscripts. It's more convenient than writing out the individual elements as a list of separate scalar values.

  1. ($him, $her) = @folks[0,-1]; # array slice
  2. @them = @folks[0 .. 3]; # array slice
  3. ($who, $home) = @ENV{"USER", "HOME"}; # hash slice
  4. ($uid, $dir) = (getpwnam("daemon"))[2,7]; # list slice

Since you can assign to a list of variables, you can also assign to an array or hash slice.

    @days[3..5] = qw/Wed Thu Fri/;
    @colors{'red','blue','green'}
        = (0xff0000, 0x0000ff, 0x00ff00);
    @folks[0, -1] = @folks[-1, 0];

The previous assignments are exactly equivalent to

    ($days[3], $days[4], $days[5]) = qw/Wed Thu Fri/;
    ($colors{'red'}, $colors{'blue'}, $colors{'green'})
        = (0xff0000, 0x0000ff, 0x00ff00);
    ($folks[0], $folks[-1]) = ($folks[-1], $folks[0]);

Since changing a slice changes the original array or hash that it's slicing, a foreach construct will alter some--or even all--of the values of the array or hash.

    foreach (@array[ 4 .. 10 ]) { s/peter/paul/ }

    foreach (@hash{qw[key1 key2]}) {
        s/^\s+//;           # trim leading whitespace
        s/\s+$//;           # trim trailing whitespace
        s/(\w+)/\u\L$1/g;   # "titlecase" words
    }
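
To see the aliasing on concrete, made-up data, note which elements change and which stay as they were:

```perl
use strict;
use warnings;

my @names = qw(peter mary peter);
for (@names[0, 1]) { s/peter/paul/ }    # only elements 0 and 1 are visited

my %hash = (key1 => '  hello world ', key2 => 'untouched');
for (@hash{qw(key1)}) {
    s/^\s+//;    # trim leading whitespace
    s/\s+$//;    # trim trailing whitespace
}

print "@names\n";
print "[$hash{key1}] [$hash{key2}]\n";
```

The last element of @names keeps its original value because it was not part of the slice.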

A slice of an empty list is still an empty list. Thus:

  1. @a = ()[1,0]; # @a has no elements
  2. @b = (@a)[0,1]; # @b has no elements

But:

  1. @a = (1)[1,0]; # @a has two elements
  2. @b = (1,undef)[1,0,2]; # @b has three elements

More generally, a slice yields the empty list if it indexes only beyond the end of a list:

  1. @a = (1)[ 1,2]; # @a has no elements
  2. @b = (1)[0,1,2]; # @b has three elements

This makes it easy to write loops that terminate when a null list is returned:

    while ( ($home, $user) = (getpwent)[7,0]) {
        printf "%-8s %s\n", $user, $home;
    }

As noted earlier in this document, the scalar sense of list assignment is the number of elements on the right-hand side of the assignment. The null list contains no elements, so when the password file is exhausted, the result is 0, not 2.

Slices in scalar context return the last item of the slice.

  1. @a = qw/first second third/;
  2. %h = (first => 'A', second => 'B');
  3. $t = @a[0, 1]; # $t is now 'second'
  4. $u = @h{'first', 'second'}; # $u is now 'B'

If you're confused about why you use an '@' there on a hash slice instead of a '%', think of it like this. The type of bracket (square or curly) governs whether it's an array or a hash being looked at. On the other hand, the leading symbol ('$' or '@') on the array or hash indicates whether you are getting back a singular value (a scalar) or a plural one (a list).

Typeglobs and Filehandles

Perl uses an internal type called a typeglob to hold an entire symbol table entry. The type prefix of a typeglob is a * , because it represents all types. This used to be the preferred way to pass arrays and hashes by reference into a function, but now that we have real references, this is seldom needed.

The main use of typeglobs in modern Perl is to create symbol table aliases. This assignment:

    *this = *that;

makes $this an alias for $that, @this an alias for @that, %this an alias for %that, &this an alias for &that, etc. Much safer is to use a reference. This:

    local *Here::blue = \$There::green;

temporarily makes $Here::blue an alias for $There::green, but doesn't make @Here::blue an alias for @There::green, or %Here::blue an alias for %There::green, etc. See Symbol Tables in perlmod for more examples of this. Strange though this may seem, this is the basis for the whole module import/export system.
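A short sketch of the reference form, using the package names from the text with made-up values: only the scalar slot is aliased, writes go through to the original variable, and the array of the same name is left alone.

```perl
use strict;
use warnings;

# Illustrative values for the variables named in the text.
$There::green = 'emerald';
@There::green = (1, 2, 3);

my ($seen, $array_size);
{
    # Alias only the scalar slot of *Here::blue, for this block only.
    local *Here::blue = \$There::green;

    $seen       = $Here::blue;          # reads $There::green
    $Here::blue = 'jade';               # writes through the alias
    $array_size = scalar @Here::blue;   # @Here::blue is not aliased
}

print "seen=$seen green=$There::green array_size=$array_size\n";
# prints: seen=emerald green=jade array_size=0
```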

Another use for typeglobs is to pass filehandles into a function or to create new filehandles. If you need to use a typeglob to save away a filehandle, do it this way:

    $fh = *STDOUT;

or perhaps as a real reference, like this:

    $fh = \*STDOUT;

See perlsub for examples of using these as indirect filehandles in functions.
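A minimal sketch of both forms being handed to a function follows. The emit() helper is hypothetical, and an in-memory handle named OUT stands in for STDOUT so that the result can be inspected.

```perl
use strict;
use warnings;

# Hypothetical helper: prints a line to whatever handle it is given.
sub emit {
    my ($fh, $message) = @_;
    print {$fh} "$message\n";
}

# An in-memory file, so the output ends up in $buffer.
my $buffer = '';
open( OUT, '>', \$buffer ) or die "Cannot open in-memory file: $!";

my $glob = *OUT;     # the typeglob itself, stored in a scalar
my $ref  = \*OUT;    # a real reference to the typeglob

emit( $glob, 'passed as a typeglob' );
emit( $ref,  'passed as a reference' );
close(OUT) or die "Cannot close in-memory file: $!";

print $buffer;
```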

Typeglobs are also a way to create a local filehandle using the local() operator. These last until their block is exited, but may be passed back. For example:

    sub newopen {
        my $path = shift;
        local *FH;      # not my!
        open (FH, $path) or return undef;
        return *FH;
    }
    $fh = newopen('/etc/passwd');

Now that we have the *foo{THING} notation, typeglobs aren't used as much for filehandle manipulations, although they're still needed to pass brand new file and directory handles into or out of functions. That's because *HANDLE{IO} only works if HANDLE has already been used as a handle. In other words, *FH must be used to create new symbol table entries; *foo{THING} cannot. When in doubt, use *FH .

All functions that are capable of creating filehandles (open(), opendir(), pipe(), socketpair(), sysopen(), socket(), and accept()) automatically create an anonymous filehandle if the handle passed to them is an uninitialized scalar variable. This allows constructs such as open(my $fh, ...) and open(local $fh,...) to be used to create filehandles that will conveniently be closed automatically when the scope ends, provided there are no other references to them. This largely eliminates the need for typeglobs when opening filehandles that must be passed around, as in the following example:

    sub myopen {
        open my $fh, "@_"
            or die "Can't open '@_': $!";
        return $fh;
    }
    {
        my $f = myopen("</etc/motd");
        print <$f>;
        # $f implicitly closed here
    }

Note that if an initialized scalar variable is used instead the result is different: my $fh='zzz'; open($fh, ...) is equivalent to open( *{'zzz'}, ...) . use strict 'refs' forbids such practice.

Another way to create anonymous filehandles is with the Symbol module or with the IO::Handle module and its ilk. These modules have the advantage of not hiding different types of the same name during the local(). See the bottom of open for an example.

SEE ALSO

See perlvar for a description of Perl's built-in variables and a discussion of legal variable names. See perlref, perlsub, and Symbol Tables in perlmod for more discussion on typeglobs and the *foo{THING} syntax.

 
perldbmfilter

NAME

perldbmfilter - Perl DBM Filters

SYNOPSIS

    $db = tie %hash, 'DBM', ...
    $old_filter = $db->filter_store_key  ( sub { ... } );
    $old_filter = $db->filter_store_value( sub { ... } );
    $old_filter = $db->filter_fetch_key  ( sub { ... } );
    $old_filter = $db->filter_fetch_value( sub { ... } );

DESCRIPTION

The four filter_* methods shown above are available in all the DBM modules that ship with Perl, namely DB_File, GDBM_File, NDBM_File, ODBM_File and SDBM_File.

Each of the methods works identically, and is used to install (or uninstall) a single DBM Filter. The only difference between them is the place that the filter is installed.

To summarise:

  • filter_store_key

    If a filter has been installed with this method, it will be invoked every time you write a key to a DBM database.

  • filter_store_value

    If a filter has been installed with this method, it will be invoked every time you write a value to a DBM database.

  • filter_fetch_key

    If a filter has been installed with this method, it will be invoked every time you read a key from a DBM database.

  • filter_fetch_value

    If a filter has been installed with this method, it will be invoked every time you read a value from a DBM database.

You can use any combination of the methods from none to all four.

All filter methods return the existing filter, if present, or undef if not.

To delete a filter pass undef to it.

The Filter

When each filter is called by Perl, a local copy of $_ will contain the key or value to be filtered. Filtering is achieved by modifying the contents of $_ . The return code from the filter is ignored.
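The rules above can be sketched with SDBM_File as follows. The scratch file name and the case-changing filters are purely illustrative, and the example assumes SDBM_File is installed, as it is with a standard Perl. It shows that a filter works by editing $_, that its return value is ignored, that installing a filter returns the previous one, and that passing undef removes it.

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;
use File::Temp qw(tempdir);
use File::Spec;

# Scratch database in a temporary directory (names are illustrative).
my $dir      = tempdir( CLEANUP => 1 );
my $filename = File::Spec->catfile( $dir, 'demo' );

my %hash;
my $db = tie( %hash, 'SDBM_File', $filename, O_RDWR | O_CREAT, 0640 )
    or die "Cannot open $filename: $!\n";

# The filter edits $_ in place; the value it returns is ignored.
my $first = $db->filter_store_key( sub { $_ = lc $_; 'ignored' } );

# Installing another filter returns the one it replaces.
my $second = $db->filter_store_key( sub { $_ = uc $_ } );

$hash{Perl} = 'camel';          # stored under the key "PERL"
my @keys = keys %hash;

# Passing undef removes the filter again.
$db->filter_store_key(undef);
my $raw = $hash{PERL};          # no filter now, so use the stored key

undef $db;
untie %hash;

print "first=",   ( defined $first ? 'set' : 'undef' ),
      " second=", ref($second),
      " keys=@keys raw=$raw\n";
# prints: first=undef second=CODE keys=PERL raw=camel
```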

An Example: the NULL termination problem.

DBM Filters are useful for a class of problems where you always want to make the same transformation to all keys, all values or both.

For example, consider the following scenario. You have a DBM database that you need to share with a third-party C application. The C application assumes that all keys and values are NULL terminated. Unfortunately when Perl writes to DBM databases it doesn't use NULL termination, so your Perl application will have to manage NULL termination itself. When you write to the database you will have to use something like this:

    $hash{"$key\0"} = "$value\0";

Similarly the NULL needs to be taken into account when you are considering the length of existing keys/values.

It would be much better if you could ignore the NULL termination issue in the main application code and have a mechanism that automatically added the terminating NULL to all keys and values whenever you write to the database and removed it again when you read from the database. As I'm sure you have already guessed, this is a problem that DBM Filters can fix very easily.

    use strict;
    use warnings;
    use SDBM_File;
    use Fcntl;

    my %hash;
    my $filename = "filt";
    unlink $filename;

    my $db = tie(%hash, 'SDBM_File', $filename, O_RDWR|O_CREAT, 0640)
        or die "Cannot open $filename: $!\n";

    # Install DBM Filters
    $db->filter_fetch_key  ( sub { s/\0$// } );
    $db->filter_store_key  ( sub { $_ .= "\0" } );
    $db->filter_fetch_value(
        sub { no warnings 'uninitialized'; s/\0$// } );
    $db->filter_store_value( sub { $_ .= "\0" } );

    $hash{"abc"} = "def";
    my $a = $hash{"ABC"};   # no such key: the fetch filter sees undef
    # ...
    undef $db;
    untie %hash;

The code above uses SDBM_File, but it will work with any of the DBM modules.

Hopefully the contents of each of the filters should be self-explanatory. Both "fetch" filters remove the terminating NULL, and both "store" filters add a terminating NULL.

Another Example: Key is a C int.

Here is another real-life example. By default, whenever Perl writes to a DBM database it always writes the key and value as strings. So when you use this:

    $hash{12345} = "something";

the key 12345 will get stored in the DBM database as the 5 byte string "12345". If you actually want the key to be stored in the DBM database as a C int, you will have to use pack when writing, and unpack when reading.

Here is a DBM Filter that does it:

    use strict;
    use warnings;
    use DB_File;

    my %hash;
    my $filename = "filt";
    unlink $filename;

    my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
        or die "Cannot open $filename: $!\n";

    $db->filter_fetch_key  ( sub { $_ = unpack("i", $_) } );
    $db->filter_store_key  ( sub { $_ = pack ("i", $_) } );

    $hash{123} = "def";
    # ...
    undef $db;
    untie %hash;
The code above uses DB_File, but again it will work with any of the DBM modules.

This time only two filters have been used; we only need to manipulate the contents of the key, so it wasn't necessary to install any value filters.
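To see what the two key filters do to the stored bytes, the following sketch applies the same pack and unpack calls outside of any database. The size of a packed int depends on the platform, so it is compared with $Config{intsize} rather than with a fixed number.

```perl
use strict;
use warnings;
use Config;

my $as_string = "12345";                # what Perl stores by default
my $as_int    = pack( "i", 12345 );     # what the store filter produces

printf "string key: %d bytes\n", length($as_string);    # 5 bytes
printf "packed key: %d bytes (intsize is %d)\n",
    length($as_int), $Config{intsize};

# The fetch filter reverses the transformation.
my $round_trip = unpack( "i", $as_int );
print "round trip: $round_trip\n";      # round trip: 12345
```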

SEE ALSO

DB_File, GDBM_File, NDBM_File, ODBM_File and SDBM_File.

AUTHOR

Paul Marquess

 
perldebguts

NAME

perldebguts - Guts of Perl debugging

DESCRIPTION

This is not perldebug, which tells you how to use the debugger. This manpage describes low-level details concerning the debugger's internals, which range from difficult to impossible to understand for anyone who isn't incredibly intimate with Perl's guts. Caveat lector.

Debugger Internals

Perl has special debugging hooks at compile-time and run-time used to create debugging environments. These hooks are not to be confused with the perl -Dxxx command described in perlrun, which is usable only if a special Perl is built per the instructions in the INSTALL podpage in the Perl source tree.

For example, whenever you call Perl's built-in caller function from the package DB , the arguments that the corresponding stack frame was called with are copied to the @DB::args array. These mechanisms are enabled by calling Perl with the -d switch. Specifically, the following additional features are enabled (cf. $^P in perlvar):

  • Perl inserts the contents of $ENV{PERL5DB} (or BEGIN {require 'perl5db.pl'} if not present) before the first line of your program.

  • Each array @{"_<$filename"} holds the lines of $filename for a file compiled by Perl. The same is also true for evaled strings that contain subroutines, or which are currently being executed. The $filename for evaled strings looks like (eval 34) .

    Values in this array are magical in numeric context: they compare equal to zero only if the line is not breakable.

  • Each hash %{"_<$filename"} contains breakpoints and actions keyed by line number. Individual entries (as opposed to the whole hash) are settable. Perl only cares about Boolean true here, although the values used by perl5db.pl have the form "$break_condition\0$action" .

    The same holds for evaluated strings that contain subroutines, or which are currently being executed. The $filename for evaled strings looks like (eval 34) .

  • Each scalar ${"_<$filename"} contains "_<$filename" . This is also the case for evaluated strings that contain subroutines, or which are currently being executed. The $filename for evaled strings looks like (eval 34) .

  • After each required file is compiled, but before it is executed, DB::postponed(*{"_<$filename"}) is called if the subroutine DB::postponed exists. Here, the $filename is the expanded name of the required file, as found in the values of %INC.

  • After each subroutine subname is compiled, the existence of $DB::postponed{subname} is checked. If this key exists, DB::postponed(subname) is called if the DB::postponed subroutine also exists.

  • A hash %DB::sub is maintained, whose keys are subroutine names and whose values have the form filename:startline-endline . filename has the form (eval 34) for subroutines defined inside evals.

  • When the execution of your program reaches a point that can hold a breakpoint, the DB::DB() subroutine is called if any of the variables $DB::trace , $DB::single , or $DB::signal is true. These variables are not localizable. This feature is disabled when executing inside DB::DB() , including functions called from it unless $^D & (1<<30) is true.

  • When execution of the program reaches a subroutine call, a call to &DB::sub (args) is made instead, with $DB::sub holding the name of the called subroutine. (This doesn't happen if the subroutine was compiled in the DB package.)

Note that if &DB::sub needs external data for it to work, no subroutine call is possible without it. As an example, the standard debugger's &DB::sub depends on the $DB::deep variable (it defines how many levels of recursion deep into the debugger you can go before a mandatory break). If $DB::deep is not defined, subroutine calls are not possible, even though &DB::sub exists.
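The @DB::args behaviour mentioned at the start of this section does not itself require the -d switch, which makes it easy to try out. In this minimal sketch, the args_of_caller() helper is hypothetical. The points to note are that caller is compiled inside package DB and is called in list context with an argument.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the arguments its caller was invoked with.
sub args_of_caller {
    my @args;
    {
        package DB;              # caller() must be compiled in package DB
        my @frame = caller(1);   # list context, with an argument
        @args = @DB::args;       # arguments of the frame just examined
    }
    return @args;
}

sub greet { return args_of_caller() }

my @got = greet( 'hello', 42 );
print "@got\n";    # hello 42
```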

Writing Your Own Debugger

Environment Variables

The PERL5DB environment variable can be used to define a debugger. For example, the minimal "working" debugger (it actually doesn't do anything) consists of one line:

    sub DB::DB {}

It can easily be defined like this:

    $ PERL5DB="sub DB::DB {}" perl -d your-script

Another brief debugger, slightly more useful, can be created with only the line:

    sub DB::DB {print ++$i; scalar <STDIN>}

This debugger prints a number which increments for each statement encountered and waits for you to hit a newline before continuing to the next statement.

The following debugger is actually useful:

    {
        package DB;
        sub DB {}
        sub sub {print ++$i, " $sub\n"; &$sub}
    }

It prints the sequence number of each subroutine call and the name of the called subroutine. Note that &DB::sub is being compiled into the package DB through the use of the package directive.
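One way to try this debugger is sketched below, under a few assumptions: a small driver writes a throwaway script (its subroutines a and b are invented for the example), puts the debugger on a single line in PERL5DB, and runs a child perl with -d. The list form of the piped open avoids shell quoting, but may not be available on every platform. The trace is expected to name main::a and then main::b.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A tiny hypothetical script to be traced.
my ( $script_fh, $script ) = tempfile( SUFFIX => '.pl', UNLINK => 1 );
print {$script_fh} "sub a { b() }\nsub b { 1 }\na();\n";
close $script_fh or die "Cannot write $script: $!";

# The debugger from the text, on one line, handed over via PERL5DB.
local $ENV{PERL5DB} =
    '{ package DB; sub DB {} sub sub { print ++$i, " $sub\n"; &$sub } }';

# Run a child perl under -d and capture what the debugger prints.
open( my $child, '-|', $^X, '-d', $script )
    or die "Cannot run $^X: $!";
my @trace = <$child>;
close $child;

print @trace;
```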

When it starts, the debugger reads your rc file (./.perldb or ~/.perldb under Unix), which can set important options. A subroutine, &afterinit, can be defined here as well; it is executed after the debugger completes its own initialization.

After the rc file is read, the debugger reads the PERLDB_OPTS environment variable and uses it to set debugger options. The contents of this variable are treated as if they were the argument of an o ... debugger command (q.v. in Configurable Options in perldebug).

Debugger Internal Variables

In addition to the file and subroutine-related variables mentioned above, the debugger also maintains various magical internal variables.

  • @DB::dbline is an alias for @{"::_<current_file"} , which holds the lines of the currently-selected file (compiled by Perl), either explicitly chosen with the debugger's f command, or implicitly by flow of execution.

    Values in this array are magical in numeric context: they compare equal to zero only if the line is not breakable.

  • %DB::dbline is an alias for %{"::_<current_file"} , which contains breakpoints and actions keyed by line number in the currently-selected file, either explicitly chosen with the debugger's f command, or implicitly by flow of execution.

    As previously noted, individual entries (as opposed to the whole hash) are settable. Perl only cares about Boolean true here, although the values used by perl5db.pl have the form "$break_condition\0$action" .

Debugger Customization Functions

Some functions are provided to simplify customization.

  • See Configurable Options in perldebug for a description of options parsed by DB::parse_options(string) .

  • DB::dump_trace(skip[,count]) skips the specified number of frames and returns a list containing information about the calling frames (all of them, if count is missing). Each entry is a reference to a hash with keys context (either ., $ , or @ ), sub (subroutine name, or info about eval), args (undef or a reference to an array), file , and line .

  • DB::print_trace(FH, skip[, count[, short]]) prints formatted info about caller frames. The last two functions may be convenient as arguments to < , << commands.

Note that any variables and functions that are not documented in this manpage (or in perldebug) are considered for internal use only, and as such are subject to change without notice.

Frame Listing Output Examples

The frame option can be used to control the output of frame information. For example, contrast this expression trace:

    $ perl -de 42
    Stack dump during die enabled outside of evals.
    Loading DB routines from perl5db.pl patch level 0.94
    Emacs support available.
    Enter h or 'h h' for help.
    main::(-e:1): 0
    DB<1> sub foo { 14 }
    DB<2> sub bar { 3 }
    DB<3> t print foo() * bar()
    main::((eval 172):3): print foo() * bar();
    main::foo((eval 168):2):
    main::bar((eval 170):2):
    42

with this one, once the option frame=2 has been set:

    DB<4> o f=2
    frame = '2'
    DB<5> t print foo() * bar()
    3: foo() * bar()
    entering main::foo
    2: sub foo { 14 };
    exited main::foo
    entering main::bar
    2: sub bar { 3 };
    exited main::bar
    42

By way of demonstration, we present below a laborious listing resulting from setting your PERLDB_OPTS environment variable to the value f=n N , and running perl -d -V from the command line. Examples using various values of n are shown to give you a feel for the difference between settings. Long though it may be, this is not a complete listing, but only excerpts.

1
  1. entering main::BEGIN
  2. entering Config::BEGIN
  3. Package lib/Exporter.pm.
  4. Package lib/Carp.pm.
  5. Package lib/Config.pm.
  6. entering Config::TIEHASH
  7. entering Exporter::import
  8. entering Exporter::export
  9. entering Config::myconfig
  10. entering Config::FETCH
  11. entering Config::FETCH
  12. entering Config::FETCH
  13. entering Config::FETCH
2
  1. entering main::BEGIN
  2. entering Config::BEGIN
  3. Package lib/Exporter.pm.
  4. Package lib/Carp.pm.
  5. exited Config::BEGIN
  6. Package lib/Config.pm.
  7. entering Config::TIEHASH
  8. exited Config::TIEHASH
  9. entering Exporter::import
  10. entering Exporter::export
  11. exited Exporter::export
  12. exited Exporter::import
  13. exited main::BEGIN
  14. entering Config::myconfig
  15. entering Config::FETCH
  16. exited Config::FETCH
  17. entering Config::FETCH
  18. exited Config::FETCH
  19. entering Config::FETCH
3
  1. in $=main::BEGIN() from /dev/null:0
  2. in $=Config::BEGIN() from lib/Config.pm:2
  3. Package lib/Exporter.pm.
  4. Package lib/Carp.pm.
  5. Package lib/Config.pm.
  6. in $=Config::TIEHASH('Config') from lib/Config.pm:644
  7. in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
  8. in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
  9. in @=Config::myconfig() from /dev/null:0
  10. in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
  11. in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
  12. in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
  13. in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
  14. in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574
  15. in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574
4
  1. in $=main::BEGIN() from /dev/null:0
  2. in $=Config::BEGIN() from lib/Config.pm:2
  3. Package lib/Exporter.pm.
  4. Package lib/Carp.pm.
  5. out $=Config::BEGIN() from lib/Config.pm:0
  6. Package lib/Config.pm.
  7. in $=Config::TIEHASH('Config') from lib/Config.pm:644
  8. out $=Config::TIEHASH('Config') from lib/Config.pm:644
  9. in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
  10. in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
  11. out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
  12. out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
  13. out $=main::BEGIN() from /dev/null:0
  14. in @=Config::myconfig() from /dev/null:0
  15. in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
  16. out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
  17. in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
  18. out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
  19. in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
  20. out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
  21. in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
5
  1. in $=main::BEGIN() from /dev/null:0
  2. in $=Config::BEGIN() from lib/Config.pm:2
  3. Package lib/Exporter.pm.
  4. Package lib/Carp.pm.
  5. out $=Config::BEGIN() from lib/Config.pm:0
  6. Package lib/Config.pm.
  7. in $=Config::TIEHASH('Config') from lib/Config.pm:644
  8. out $=Config::TIEHASH('Config') from lib/Config.pm:644
  9. in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
  10. in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
  11. out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
  12. out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
  13. out $=main::BEGIN() from /dev/null:0
  14. in @=Config::myconfig() from /dev/null:0
  15. in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
  16. out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
  17. in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
  18. out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
6
  1. in $=CODE(0x15eca4)() from /dev/null:0
  2. in $=CODE(0x182528)() from lib/Config.pm:2
  3. Package lib/Exporter.pm.
  4. out $=CODE(0x182528)() from lib/Config.pm:0
  5. scalar context return from CODE(0x182528): undef
  6. Package lib/Config.pm.
  7. in $=Config::TIEHASH('Config') from lib/Config.pm:628
  8. out $=Config::TIEHASH('Config') from lib/Config.pm:628
  9. scalar context return from Config::TIEHASH: empty hash
  10. in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
  11. in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
  12. out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
  13. scalar context return from Exporter::export: ''
  14. out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
  15. scalar context return from Exporter::import: ''

In all cases shown above, the line indentation shows the call tree. If bit 2 of frame is set, a line is printed on exit from a subroutine as well. If bit 4 is set, the arguments are printed along with the caller info. If bit 8 is set, the arguments are printed even if they are tied or references. If bit 16 is set, the return value is printed, too.
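Since frame is a bit mask, a given value can be decoded by testing each bit in turn. The describe_frame() helper below is hypothetical and covers only the four bits described in the preceding paragraph.

```perl
use strict;
use warnings;

# Hypothetical helper: lists the behaviours that the text above
# attributes to each bit of a frame value.
sub describe_frame {
    my ($frame) = @_;
    my @features;
    push @features, 'exit lines'               if $frame & 2;
    push @features, 'arguments'                if $frame & 4;
    push @features, 'tied/reference arguments' if $frame & 8;
    push @features, 'return values'            if $frame & 16;
    return @features;
}

for my $frame ( 2, 6, 30 ) {
    printf "frame=%-2d: %s\n", $frame, join( ', ', describe_frame($frame) );
}
# frame=2 : exit lines
# frame=6 : exit lines, arguments
# frame=30: exit lines, arguments, tied/reference arguments, return values
```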

When a package is compiled, a line like this

    Package lib/Carp.pm.

is printed with proper indentation.

Debugging Regular Expressions

There are two ways to enable debugging output for regular expressions.

If your perl is compiled with -DDEBUGGING , you may use the -Dr flag on the command line.

Otherwise, one can use re 'debug' , which has effects at compile time and run time. Since Perl 5.9.5, this pragma is lexically scoped.
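A minimal sketch of the pragma's lexical scope follows. The test strings are made up. The debugging output itself goes to STDERR and its exact form varies between Perl versions, so it is not reproduced here.

```perl
use strict;
use warnings;

my $matched;
{
    # Lexically scoped: only patterns compiled in this block are traced.
    use re 'debug';
    $matched = 'bdeffgeghik' =~ /[bc]d(ef*g)+h[ij]k$/;
}

# No debugging output for this pattern: the pragma's scope has ended.
my $plain = 'abc' =~ /b/;

print "both patterns matched\n" if $matched && $plain;
```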

Compile-time Output

The debugging output at compile time looks like this:

    Compiling REx '[bc]d(ef*g)+h[ij]k$'
    size 45 Got 364 bytes for offset annotations.
    first at 1
    rarest char g at 0
    rarest char d at 0
    1: ANYOF[bc](12)
    12: EXACT <d>(14)
    14: CURLYX[0] {1,32767}(28)
    16: OPEN1(18)
    18: EXACT <e>(20)
    20: STAR(23)
    21: EXACT <f>(0)
    23: EXACT <g>(25)
    25: CLOSE1(27)
    27: WHILEM[1/1](0)
    28: NOTHING(29)
    29: EXACT <h>(31)
    31: ANYOF[ij](42)
    42: EXACT <k>(44)
    44: EOL(45)
    45: END(0)
    anchored 'de' at 1 floating 'gh' at 3..2147483647 (checking floating)
    stclass 'ANYOF[bc]' minlen 7
    Offsets: [45]
    1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
    0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
    11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
    0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
    Omitting $` $& $' support.

The first line shows the pre-compiled form of the regex. The second shows the size of the compiled form (in arbitrary units, usually 4-byte words) and the total number of bytes allocated for the offset/length table, usually 4+size*8. The next line shows the label id of the first node that does a match.

The

    anchored 'de' at 1 floating 'gh' at 3..2147483647 (checking floating)
    stclass 'ANYOF[bc]' minlen 7

line (split into two lines above) contains optimizer information. In the example shown, the optimizer found that the match should contain a substring de at offset 1, plus substring gh at some offset between 3 and infinity. Moreover, when checking for these substrings (to abandon impossible matches quickly), Perl will check for the substring gh before checking for the substring de . The optimizer may also use the knowledge that the match starts (at the first id) with a character class, and no string shorter than 7 characters can possibly match.
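These facts can be checked by hand against the pattern. In this short sketch the two test strings are invented: the first is the shortest possible match, and the second repeats the group so that gh moves to the right while de stays at offset 1.

```perl
use strict;
use warnings;

my $re = qr/[bc]d(ef*g)+h[ij]k$/;

# One character for each mandatory element of the pattern.
my $shortest = 'bdeghik';
print "length:  ", length($shortest), "\n";                     # 7
print "matches: ", ( $shortest =~ $re ? 'yes' : 'no' ), "\n";   # yes
print "'de' at: ", index( $shortest, 'de' ), "\n";              # 1
print "'gh' at: ", index( $shortest, 'gh' ), "\n";              # 3

# With more repetitions of the group, 'gh' floats to the right.
my $longer = 'cdeffgegefgeghjk';
print "matches: ", ( $longer =~ $re ? 'yes' : 'no' ), "\n";     # yes
print "'de' at: ", index( $longer, 'de' ), "\n";                # 1
print "'gh' at: ", index( $longer, 'gh' ), "\n";                # 12
```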

The fields of interest which may appear in this line are

  • anchored STRING at POS
  • floating STRING at POS1..POS2

    See above.

  • matching floating/anchored

    Which substring to check first.

  • minlen

    The minimal length of the match.

  • stclass TYPE

    Type of first matching node.

  • noscan

    Don't scan for the found substrings.

  • isall

    Means that the optimizer information is all that the regular expression contains, and thus one does not need to enter the regex engine at all.

  • GPOS

    Set if the pattern contains \G .

  • plus

    Set if the pattern starts with a repeated char (as in x+y).

  • implicit

    Set if the pattern starts with .*.

  • with eval

    Set if the pattern contains eval-groups, such as (?{ code }) and (??{ code }) .

  • anchored(TYPE)

    If the pattern may match only at a handful of places, with TYPE being BOL , MBOL , or GPOS . See the table below.

If a substring is known to match at end-of-line only, it may be followed by $ , as in floating 'k'$ .

The optimizer-specific information is used to avoid entering (a slow) regex engine on strings that definitely will not match. If the isall flag is set, a call to the regex engine may be avoided even when the optimizer found an appropriate place for the match.

Above the optimizer section is the list of nodes of the compiled form of the regex. Each line has format

id: TYPE OPTIONAL-INFO (next-id)
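As an illustration of this format, the sketch below splits a few node lines from the compile-time listing into their parts. The parse_node() helper and its regular expression are hypothetical conveniences, not something provided by Perl, and they are only meant to cope with lines shaped like these samples.

```perl
use strict;
use warnings;

# Hypothetical parser for one node line of the form
#   id: TYPE OPTIONAL-INFO (next-id)
sub parse_node {
    my ($line) = @_;
    my ( $id, $type, $info, $next ) =
        $line =~ /^\s*(\d+):\s+([A-Z][A-Z0-9_]*)(.*?)\((\d+)\)\s*$/
        or return;
    $info =~ s/^\s+|\s+$//g;    # trim the optional info
    return { id => $id, type => $type, info => $info, next => $next };
}

# Sample lines taken from the listing earlier in this section.
my @lines = (
    '1: ANYOF[bc](12)',
    '12: EXACT <d>(14)',
    '14: CURLYX[0] {1,32767}(28)',
    '16: OPEN1(18)',
);

for my $line (@lines) {
    my $node = parse_node($line) or next;
    printf "id=%-2d type=%-6s next=%-2d info=%s\n",
        @{$node}{qw(id type next info)};
}
```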

Types of Nodes

Here are the possible types, with short descriptions:

  1. # TYPE arg-description [num-args] [longjump-len] DESCRIPTION
  2. # Exit points
  3. END no End of program.
  4. SUCCEED no Return from a subroutine, basically.
  5. # Anchors:
  6. BOL no Match "" at beginning of line.
  7. MBOL no Same, assuming multiline.
  8. SBOL no Same, assuming singleline.
  9. EOS no Match "" at end of string.
  10. EOL no Match "" at end of line.
  11. MEOL no Same, assuming multiline.
  12. SEOL no Same, assuming singleline.
  13. BOUND no Match "" at any word boundary using
  14. native charset semantics for non-utf8
  15. BOUNDL no Match "" at any locale word boundary
  16. BOUNDU no Match "" at any word boundary using
  17. Unicode semantics
  18. BOUNDA no Match "" at any word boundary using ASCII
  19. semantics
  20. NBOUND no Match "" at any word non-boundary using
  21. native charset semantics for non-utf8
  22. NBOUNDL no Match "" at any locale word non-boundary
  23. NBOUNDU no Match "" at any word non-boundary using
  24. Unicode semantics
  25. NBOUNDA no Match "" at any word non-boundary using
  26. ASCII semantics
  27. GPOS no Matches where last m//g left off.
  28. # [Special] alternatives:
  29. REG_ANY no Match any one character (except newline).
  30. SANY no Match any one character.
  31. CANY no Match any one byte.
  32. ANYOF sv Match character in (or not in) this
  33. class, single char match only
  34. ANYOF_WARN_SUPER sv Match character in (or not in) this
  35. class, warn (if enabled) upon matching a
  36. char above Unicode max;
  37. ANYOF_SYNTHETIC sv Synthetic start class
  38. POSIXD none Some [[:class:]] under /d; the FLAGS
  39. field gives which one
  40. POSIXL none Some [[:class:]] under /l; the FLAGS
  41. field gives which one
  42. POSIXU none Some [[:class:]] under /u; the FLAGS
  43. field gives which one
  44. POSIXA none Some [[:class:]] under /a; the FLAGS
  45. field gives which one
  46. NPOSIXD none complement of POSIXD, [[:^class:]]
  47. NPOSIXL none complement of POSIXL, [[:^class:]]
  48. NPOSIXU none complement of POSIXU, [[:^class:]]
  49. NPOSIXA none complement of POSIXA, [[:^class:]]
  50. CLUMP no Match any extended grapheme cluster
  51. sequence
  52. # Alternation
  53. # BRANCH The set of branches constituting a single choice are
  54. # hooked together with their "next" pointers, since
  55. # precedence prevents anything being concatenated to
  56. # any individual branch. The "next" pointer of the last
  57. # BRANCH in a choice points to the thing following the
  58. # whole choice. This is also where the final "next"
  59. # pointer of each individual branch points; each branch
  60. # starts with the operand node of a BRANCH node.
  61. #
  62. BRANCH node Match this alternative, or the next...
  63. # Back pointer
  64. # BACK Normal "next" pointers all implicitly point forward;
  65. # BACK exists to make loop structures possible.
  66. # not used
  67. BACK no Match "", "next" ptr points backward.
  68. # Literals
  69. EXACT str Match this string (preceded by length).
  70. EXACTF str Match this non-UTF-8 string (not
  71. guaranteed to be folded) using /id rules
  72. (w/len).
  73. EXACTFL str Match this string (not guaranteed to be
  74. folded) using /il rules (w/len).
  75. EXACTFU str Match this string (folded iff in UTF-8,
  76. length in folding doesn't change if not
  77. in UTF-8) using /iu rules (w/len).
  78. EXACTFA str Match this string (not guaranteed to be
  79. folded) using /iaa rules (w/len).
  80. EXACTFU_SS str Match this string (folded iff in UTF-8,
  81. length in folding may change even if not
  82. in UTF-8) using /iu rules (w/len).
  83. EXACTFU_TRICKYFOLD str Match this folded UTF-8 string using /iu
  84. rules
  85. # Do nothing types
  86. NOTHING no Match empty string.
  87. # A variant of above which delimits a group, thus stops optimizations
  88. TAIL no Match empty string. Can jump here from
  89. outside.
  90. # Loops
  91. # STAR,PLUS '?', and complex '*' and '+', are implemented as
  92. # circular BRANCH structures using BACK. Simple cases
  93. # (one character per match) are implemented with STAR
  94. # and PLUS for speed and to minimize recursive plunges.
  95. #
  96. STAR node Match this (simple) thing 0 or more
  97. times.
  98. PLUS node Match this (simple) thing 1 or more
  99. times.
  100. CURLY sv 2 Match this simple thing {n,m} times.
  101. CURLYN no 2 Capture next-after-this simple thing
  102. CURLYM no 2 Capture this medium-complex thing {n,m}
  103. times.
  104. CURLYX sv 2 Match this complex thing {n,m} times.
  105. # This terminator creates a loop structure for CURLYX
  106. WHILEM no Do curly processing and see if rest
  107. matches.
  108. # Buffer related
  109. # OPEN,CLOSE,GROUPP ...are numbered at compile time.
  110. OPEN num 1 Mark this point in input as start of #n.
  111. CLOSE num 1 Analogous to OPEN.
  112. REF num 1 Match some already matched string
  113. REFF num 1 Match already matched string, folded
  114. using native charset semantics for non-
  115. utf8
  116. REFFL num 1 Match already matched string, folded in
  117. loc.
  118. REFFU num 1 Match already matched string, folded
  119. using unicode semantics for non-utf8
  120. REFFA num 1 Match already matched string, folded
  121. using unicode semantics for non-utf8, no
  122. mixing ASCII, non-ASCII
  123. # Named references. Code in regcomp.c assumes that these all are after
  124. # the numbered references
  125. NREF no-sv 1 Match some already matched string
  126. NREFF no-sv 1 Match already matched string, folded
  127. using native charset semantics for non-
  128. utf8
  129. NREFFL no-sv 1 Match already matched string, folded in
  130. loc.
  131. NREFFU num 1 Match already matched string, folded
  132. using unicode semantics for non-utf8
  133. NREFFA num 1 Match already matched string, folded
  134. using unicode semantics for non-utf8, no
  135. mixing ASCII, non-ASCII
  136. IFMATCH off 1 2 Succeeds if the following matches.
  137. UNLESSM off 1 2 Fails if the following matches.
  138. SUSPEND off 1 1 "Independent" sub-RE.
  139. IFTHEN off 1 1 Switch, should be preceded by switcher.
  140. GROUPP num 1 Whether the group matched.
  141. # Support for long RE
  142. LONGJMP off 1 1 Jump far away.
  143. BRANCHJ off 1 1 BRANCH with long offset.
  144. # The heavy worker
  145. EVAL evl 1 Execute some Perl code.
  146. # Modifiers
  147. MINMOD no Next operator is not greedy.
  148. LOGICAL no Next opcode should set the flag only.
  149. # This is not used yet
  150. RENUM off 1 1 Group with independently numbered parens.
  151. # Trie Related
  152. # Behave the same as A|LIST|OF|WORDS would. The '..C' variants
  153. # have inline charclass data (ascii only), the 'C' store it in the
  154. # structure.
  155. TRIE trie 1 Match many EXACT(F[ALU]?)? at once.
  156. flags==type
  157. TRIEC trie Same as TRIE, but with embedded charclass
  158. charclass data
  159. AHOCORASICK trie 1 Aho Corasick stclass. flags==type
  160. AHOCORASICKC trie Same as AHOCORASICK, but with embedded
  161. charclass charclass data
  162. # Regex Subroutines
  163. GOSUB num/ofs 2L recurse to paren arg1 at (signed) ofs
  164. arg2
  165. GOSTART no recurse to start of pattern
  166. # Special conditionals
  167. NGROUPP no-sv 1 Whether the group matched.
  168. INSUBP num 1 Whether we are in a specific recurse.
  169. DEFINEP none 1 Never execute directly.
  170. # Backtracking Verbs
  171. ENDLIKE none Used only for the type field of verbs
  172. OPFAIL none Same as (?!)
  173. ACCEPT parno 1 Accepts the current matched string.
  174. # Verbs With Arguments
  175. VERB no-sv 1 Used only for the type field of verbs
  176. PRUNE no-sv 1 Pattern fails at this startpoint if no-
  177. backtracking through this
  178. MARKPOINT no-sv 1 Push the current location for rollback by
  179. cut.
  180. SKIP no-sv 1 On failure skip forward (to the mark)
  181. before retrying
  182. COMMIT no-sv 1 Pattern fails outright if backtracking
  183. through this
  184. CUTGROUP no-sv 1 On failure go to the next alternation in
  185. the group
  186. # Control what to keep in $&.
  187. KEEPS no $& begins here.
  188. # New charclass like patterns
  189. LNBREAK none generic newline pattern
  190. # SPECIAL REGOPS
  191. # This is not really a node, but an optimized away piece of a "long"
  192. # node. To simplify debugging output, we mark it as if it were a node
  193. OPTIMIZED off Placeholder for dump.
  194. # Special opcode with the property that no opcode in a compiled program
  195. # will ever be of this type. Thus it can be used as a flag value that
  196. # no other opcode has been seen. END is used similarly, in that an END
  197. # node cant be optimized. So END implies "unoptimizable" and PSEUDO
  198. # mean "not seen anything to optimize yet".
  199. PSEUDO off Pseudo opcode for internal use.

Following the optimizer information is a dump of the offset/length table, here split across several lines:

  1. Offsets: [45]
  2. 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
  3. 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
  4. 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
  5. 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]

The first line here indicates that the offset/length table contains 45 entries. Each entry is a pair of integers, denoted by offset[length]. Entries are numbered starting with 1, so entry #1 here is 1[4] and entry #12 is 5[1]. 1[4] indicates that the node labeled 1: (the 1: ANYOF[bc]) begins at character position 1 in the pre-compiled form of the regex, and has a length of 4 characters. 5[1] in position 12 indicates that the node labeled 12: (the 12: EXACT <d>) begins at character position 5 in the pre-compiled form of the regex, and has a length of 1 character. 12[1] in position 14 indicates that the node labeled 14: (the 14: CURLYX[0] {1,32767}) begins at character position 12 in the pre-compiled form of the regex, and has a length of 1 character; that is, it corresponds to the + symbol in the precompiled regex.

0[0] items indicate that there is no corresponding node.
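The numbering rule above can be checked mechanically. The following is a minimal sketch, not part of perl itself: parse_offsets is a hypothetical helper that walks the dump shown above, numbers the entries from 1, and skips the 0[0] placeholders.

```perl
use strict;
use warnings;

# Offset/length dump copied from the example above (the "[45]" count is
# left out so that it can be recomputed).
my $dump = join ' ',
    '1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]',
    '0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]',
    '11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]',
    '0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]';

# parse_offsets is a hypothetical helper. It returns the number of
# entries and a hash keyed by node number (entries count from 1).
sub parse_offsets {
    my ($text) = @_;
    my %node;
    my $n = 0;
    while ($text =~ /(\d+)\[(\d+)\]/g) {
        $n++;
        next if $1 == 0 && $2 == 0;    # 0[0]: no corresponding node
        $node{$n} = { offset => $1, length => $2 };
    }
    return ($n, \%node);
}

my ($count, $nodes) = parse_offsets($dump);
print "$count entries\n";
for my $id (sort { $a <=> $b } keys %$nodes) {
    printf "node %2d: offset %2d, length %d\n",
        $id, $nodes->{$id}{offset}, $nodes->{$id}{length};
}
```

Run against the dump above, this should report 45 entries, with node 12 at offset 5 and node 14 at offset 12, matching the explanation in the text.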

Run-time Output

First of all, when doing a match, one may get no run-time output even if debugging is enabled. This means that the regex engine was never entered and that all of the job was therefore done by the optimizer.

If the regex engine was entered, the output may look like this:

  1. Matching '[bc]d(ef*g)+h[ij]k$' against 'abcdefg__gh__'
  2. Setting an EVAL scope, savestack=3
  3. 2 <ab> <cdefg__gh_> | 1: ANYOF
  4. 3 <abc> <defg__gh_> | 11: EXACT <d>
  5. 4 <abcd> <efg__gh_> | 13: CURLYX {1,32767}
  6. 4 <abcd> <efg__gh_> | 26: WHILEM
  7. 0 out of 1..32767 cc=effff31c
  8. 4 <abcd> <efg__gh_> | 15: OPEN1
  9. 4 <abcd> <efg__gh_> | 17: EXACT <e>
  10. 5 <abcde> <fg__gh_> | 19: STAR
  11. EXACT <f> can match 1 times out of 32767...
  12. Setting an EVAL scope, savestack=3
  13. 6 <bcdef> <g__gh__> | 22: EXACT <g>
  14. 7 <bcdefg> <__gh__> | 24: CLOSE1
  15. 7 <bcdefg> <__gh__> | 26: WHILEM
  16. 1 out of 1..32767 cc=effff31c
  17. Setting an EVAL scope, savestack=12
  18. 7 <bcdefg> <__gh__> | 15: OPEN1
  19. 7 <bcdefg> <__gh__> | 17: EXACT <e>
  20. restoring \1 to 4(4)..7
  21. failed, try continuation...
  22. 7 <bcdefg> <__gh__> | 27: NOTHING
  23. 7 <bcdefg> <__gh__> | 28: EXACT <h>
  24. failed...
  25. failed...

The most significant information in the output is about the particular node of the compiled regex that is currently being tested against the target string. The format of these lines is

STRING-OFFSET <PRE-STRING> <POST-STRING> |ID: TYPE

The TYPE info is indented with respect to the backtracking level. Other incidental information appears interspersed within.

Debugging Perl Memory Usage

Perl is a profligate wastrel when it comes to memory use. There is a saying that to estimate memory usage of Perl, assume a reasonable algorithm for memory allocation, multiply that estimate by 10, and while you still may miss the mark, at least you won't be quite so astonished. This is not absolutely true, but may provide a good grasp of what happens.

Assume that an integer cannot take less than 20 bytes of memory, a float cannot take less than 24 bytes, a string cannot take less than 32 bytes (all these examples assume 32-bit architectures; the results are quite a bit worse on 64-bit architectures). If a variable is accessed in two of three different ways (which require an integer, a float, or a string), the memory footprint may increase yet another 20 bytes. A sloppy malloc(3) implementation can inflate these numbers dramatically.

On the opposite end of the scale, a declaration like

  1. sub foo;

may take up to 500 bytes of memory, depending on which release of Perl you're running.

Anecdotal estimates of source-to-compiled code bloat suggest an eightfold increase. This means that the compiled form of reasonable (normally commented, properly indented etc.) code will take about eight times more space in memory than the code took on disk.

The -DL command-line switch has been obsolete since circa Perl 5.6.0 (it was available only if Perl was built with -DDEBUGGING). The switch was used to track Perl's memory allocations and possible memory leaks. These days the use of malloc debugging tools like Purify or valgrind is suggested instead. See also PERL_MEM_LOG in perlhacktips.

One way to find out how much memory is being used by Perl data structures is to install the Devel::Size module from CPAN: it gives you the minimum number of bytes required to store a particular data structure. Please be mindful of the difference between the size() and total_size().
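As a rough illustration of the difference between size() and total_size(), here is a minimal sketch. It assumes Devel::Size may or may not be installed, so the module is loaded inside an eval; the %data structure is made up for the example, and the byte counts will vary by platform and perl version.

```perl
use strict;
use warnings;

# Devel::Size is a CPAN module and may be missing, so load it lazily.
my $have_size = eval { require Devel::Size; 1 };

# Illustrative data only: a hash holding a reference to a larger array.
my %data = (
    name => 'example',
    list => [ 1 .. 100 ],
);

my ($shallow, $deep);
if ($have_size) {
    # size() reports the container structure only; total_size() also
    # walks the things the container refers to.
    $shallow = Devel::Size::size(\%data);
    $deep    = Devel::Size::total_size(\%data);
    print "size: $shallow bytes, total_size: $deep bytes\n";
}
else {
    print "Devel::Size is not installed\n";
}
```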

If Perl has been compiled using Perl's malloc you can analyze Perl memory usage by setting $ENV{PERL_DEBUG_MSTATS}.

Using $ENV{PERL_DEBUG_MSTATS}

If your perl is using Perl's malloc() and was compiled with the necessary switches (this is the default), then it will print memory usage statistics after compiling your code when $ENV{PERL_DEBUG_MSTATS} > 1, and before termination of the program when $ENV{PERL_DEBUG_MSTATS} >= 1. The report format is similar to the following example:

  1. $ PERL_DEBUG_MSTATS=2 perl -e "require Carp"
  2. Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
  3. 14216 free: 130 117 28 7 9 0 2 2 1 0 0
  4. 437 61 36 0 5
  5. 60924 used: 125 137 161 55 7 8 6 16 2 0 1
  6. 74 109 304 84 20
  7. Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
  8. Memory allocation statistics after execution: (buckets 4(4)..8188(8192)
  9. 30888 free: 245 78 85 13 6 2 1 3 2 0 1
  10. 315 162 39 42 11
  11. 175816 used: 265 176 1112 111 26 22 11 27 2 1 1
  12. 196 178 1066 798 39
  13. Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.

It is possible to ask for such a statistic at arbitrary points in your execution using the mstat() function out of the standard Devel::Peek module.
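A minimal sketch of such a call follows. The label string and the @big array are illustrative only, and the load is wrapped in an eval on the assumption that Devel::Peek might be packaged separately on some systems.

```perl
use strict;
use warnings;

# Devel::Peek ships with perl; the eval guards against installations
# where it has been split into a separate package.
my $have_peek = eval { require Devel::Peek; 1 };

my @big = (1 .. 10_000);    # allocate something worth reporting

if ($have_peek) {
    # The report goes to STDERR, prefixed by the given label. On a perl
    # that is not using Perl's own malloc, expect only a short notice
    # instead of the bucket statistics.
    Devel::Peek::mstat("after building \@big: ");
}
```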

Here is some explanation of that format:

  • buckets SMALLEST(APPROX)..GREATEST(APPROX)

    Perl's malloc() uses bucketed allocations. Every request is rounded up to the closest bucket size available, and a bucket is taken from the pool of buckets of that size.

    The line above describes the limits of buckets currently in use. Each bucket has two sizes: memory footprint and the maximal size of user data that can fit into this bucket. Suppose in the above example that the smallest bucket were size 4. The biggest bucket would have usable size 8188, and the memory footprint would be 8192.

    In a Perl built for debugging, some buckets may have negative usable size. This means that these buckets cannot (and will not) be used. For larger buckets, the memory footprint may be one page greater than a power of 2. If so, the corresponding power of two is printed in the APPROX field above.

  • Free/Used

    The 1 or 2 rows of numbers following that correspond to the number of buckets of each size between SMALLEST and GREATEST . In the first row, the sizes (memory footprints) of buckets are powers of two--or possibly one page greater. In the second row, if present, the memory footprints of the buckets are between the memory footprints of two buckets "above".

    For example, suppose under the previous example, the memory footprints were

    1. free: 8 16 32 64 128 256 512 1024 2048 4096 8192
    2. 4 12 24 48 80

    With a non-DEBUGGING perl, the buckets starting from 128 have a 4-byte overhead, and thus an 8192-long bucket may take up to 8188-byte allocations.

  • Total sbrk(): SBRKed/SBRKs:CONTINUOUS

    The first two fields give the total amount of memory perl sbrk(2)ed (ess-broken? :-) and number of sbrk(2)s used. The third number is what perl thinks about continuity of returned chunks. So long as this number is positive, malloc() will assume that it is probable that sbrk(2) will provide continuous memory.

    Memory allocated by external libraries is not counted.

  • pad: 0

    The amount of sbrk(2)ed memory needed to keep buckets aligned.

  • heads: 2192

    Although memory overhead of bigger buckets is kept inside the bucket, for smaller buckets, it is kept in separate areas. This field gives the total size of these areas.

  • chain: 0

    malloc() may want to subdivide a bigger bucket into smaller buckets. If only a part of the deceased bucket is left unsubdivided, the rest is kept as an element of a linked list. This field gives the total size of these chunks.

  • tail: 6144

    To minimize the number of sbrk(2)s, malloc() asks for more memory. This field gives the size of the yet unused part, which is sbrk(2)ed, but never touched.
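The summary line described by the last few items can be pulled apart with two patterns. This is only a sketch: parse_mstats_summary is a hypothetical helper, and the field names are taken from the list above, not from any perl API.

```perl
use strict;
use warnings;

# Summary line copied from the "after execution" report above.
my $line = 'Total sbrk(): 215040/47:145. '
         . 'Odd ends: pad+heads+chain+tail: 0+2192+0+6144.';

# parse_mstats_summary is a hypothetical helper; it returns a hash
# reference, or nothing if the line does not look like a summary.
sub parse_mstats_summary {
    my ($text) = @_;
    my %f;
    # A list assignment in boolean context yields the number of captured
    # values, so "or return" fires when the pattern does not match.
    @f{qw(sbrked sbrks continuous)} =
        $text =~ m{Total sbrk\(\): (\d+)/(\d+):(-?\d+)}
        or return;
    @f{qw(pad heads chain tail)} =
        $text =~ m{pad\+heads\+chain\+tail: (\d+)\+(\d+)\+(\d+)\+(\d+)}
        or return;
    return \%f;
}

my $stats = parse_mstats_summary($line);
printf "%-10s %s\n", $_, $stats->{$_}
    for qw(sbrked sbrks continuous pad heads chain tail);
```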

SEE ALSO

perldebug, perlguts, perlrun, re, and Devel::DProf.

 
perldebtut

NAME

perldebtut - Perl debugging tutorial

DESCRIPTION

A (very) lightweight introduction to the use of the perl debugger, and a pointer to existing, deeper sources of information on the subject of debugging perl programs.

There's an extraordinary number of people out there who don't appear to know anything about using the perl debugger, though they use the language every day. This is for them.

use strict

First of all, there's a few things you can do to make your life a lot more straightforward when it comes to debugging perl programs, without using the debugger at all. To demonstrate, here's a simple script, named "hello", with a problem:

  1. #!/usr/bin/perl
  2. $var1 = 'Hello World'; # always wanted to do that :-)
  3. $var2 = "$varl\n";
  4. print $var2;
  5. exit;

While this compiles and runs happily, it probably won't do what's expected: it doesn't print "Hello World\n" at all. It will, on the other hand, do exactly what it was told to do, computers being a bit that way inclined. That is, it will print out a newline character, and you'll get what looks like a blank line. It looks like there are 2 variables when (because of the typo) there are really 3:

  1. $var1 = 'Hello World';
  2. $varl = undef;
  3. $var2 = "\n";

To catch this kind of problem, we can force each variable to be declared before use by pulling in the strict module, by putting 'use strict;' after the first line of the script.

Now when you run it, perl complains about the 3 undeclared variables and we get four error messages because one variable is referenced twice:

  1. Global symbol "$var1" requires explicit package name at ./t1 line 4.
  2. Global symbol "$var2" requires explicit package name at ./t1 line 5.
  3. Global symbol "$varl" requires explicit package name at ./t1 line 5.
  4. Global symbol "$var2" requires explicit package name at ./t1 line 7.
  5. Execution of ./hello aborted due to compilation errors.

Luvverly! and to fix this we declare all variables explicitly and now our script looks like this:

  1. #!/usr/bin/perl
  2. use strict;
  3. my $var1 = 'Hello World';
  4. my $varl = undef;
  5. my $var2 = "$varl\n";
  6. print $var2;
  7. exit;

We then do (always a good idea) a syntax check before we try to run it again:

  1. > perl -c hello
  2. hello syntax OK

And now when we run it, we get "\n" still, but at least we know why. Just getting this script to compile has exposed the '$varl' (with the letter 'l') variable, and simply changing $varl to $var1 solves the problem.
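If you'd like to watch strict catch the typo without touching the script itself, something like the following sketch may help. It feeds the original script body, typo and all but with 'my' added to the two real variables, to a string eval and keeps whatever perl has to say about it; the heredoc is just a convenient way to hold the text.

```perl
use strict;
use warnings;

# The script body from above, with the typo ($varl, letter "l") left in
# and no declaration for it.
my $source = <<'END_SCRIPT';
use strict;
my $var1 = 'Hello World';
my $var2 = "$varl\n";
print $var2;
END_SCRIPT

# A string eval compiles the text on the spot; under strict the compile
# fails and the diagnostic is left in $@.
my $ok  = eval "$source; 1";
my $err = $ok ? '' : $@;
print $ok ? "compiled and ran\n" : "strict refused to compile it:\n$err";
```

The message should name $varl, which is exactly the pointer we needed.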

Looking at data and -w and v

Ok, but how about when you want to really see your data, what's in that dynamic variable, just before using it?

  1. #!/usr/bin/perl
  2. use strict;
  3. my $key = 'welcome';
  4. my %data = (
  5. 'this' => qw(that),
  6. 'tom' => qw(and jerry),
  7. 'welcome' => q(Hello World),
  8. 'zip' => q(welcome),
  9. );
  10. my @data = keys %data;
  11. print "$data{$key}\n";
  12. exit;

Looks OK, after it's been through the syntax check (perl -c scriptname), we run it and all we get is a blank line again! Hmmmm.

One common debugging approach here, would be to liberally sprinkle a few print statements, to add a check just before we print out our data, and another just after:

  1. print "All OK\n" if grep($key, keys %data);
  2. print "$data{$key}\n";
  3. print "done: '$data{$key}'\n";

And try again:

  1. > perl data
  2. All OK
  3. done: ''

After much staring at the same piece of code and not seeing the wood for the trees for some time, we get a cup of coffee and try another approach. That is, we bring in the cavalry by giving perl the '-d' switch on the command line:

  1. > perl -d data
  2. Default die handler restored.
  3. Loading DB routines from perl5db.pl version 1.07
  4. Editor support available.
  5. Enter h or `h h' for help, or `man perldebug' for more help.
  6. main::(./data:4): my $key = 'welcome';

Now, what we've done here is to launch the built-in perl debugger on our script. It's stopped at the first line of executable code and is waiting for input.

Before we go any further, you'll want to know how to quit the debugger: use just the letter 'q', not the words 'quit' or 'exit':

  1. DB<1> q
  2. >

That's it, you're back on home turf again.

help

Fire the debugger up again on your script and we'll look at the help menu. There are a couple of ways of calling help: a simple 'h' will get the summary help list, '|h' (pipe-h) will pipe the help through your pager (which is probably 'more' or 'less'), and finally, 'h h' (h-space-h) will give you the entire help screen. Here is the summary page:

  1. DB<1> h

  1. List/search source lines: Control script execution:
  2. l [ln|sub] List source code T Stack trace
  3. - or . List previous/current line s [expr] Single step [in expr]
  4. v [line] View around line n [expr] Next, steps over subs
  5. f filename View source in file <CR/Enter> Repeat last n or s
  6. /pattern/ ?patt? Search forw/backw r Return from subroutine
  7. M Show module versions c [ln|sub] Continue until position
  8. Debugger controls: L List break/watch/actions
  9. o [...] Set debugger options t [expr] Toggle trace [trace expr]
  10. <[<]|{[{]|>[>] [cmd] Do pre/post-prompt b [ln|event|sub] [cnd] Set breakpoint
  11. ! [N|pat] Redo a previous command B ln|* Delete a/all breakpoints
  12. H [-num] Display last num commands a [ln] cmd Do cmd before line
  13. = [a val] Define/list an alias A ln|* Delete a/all actions
  14. h [db_cmd] Get help on command w expr Add a watch expression
  15. h h Complete help page W expr|* Delete a/all watch exprs
  16. |[|]db_cmd Send output to pager ![!] syscmd Run cmd in a subprocess
  17. q or ^D Quit R Attempt a restart
  18. Data Examination: expr Execute perl code, also see: s,n,t expr
  19. x|m expr Evals expr in list context, dumps the result or lists methods.
  20. p expr Print expression (uses script's current package).
  21. S [[!]pat] List subroutine names [not] matching pattern
  22. V [Pk [Vars]] List Variables in Package. Vars can be ~pattern or !pattern.
  23. X [Vars] Same as "V current_package [Vars]".
  24. y [n [Vars]] List lexicals in higher scope <n>. Vars same as V.
  25. For more help, type h cmd_letter, or run man perldebug for all docs.

More confusing options than you can shake a big stick at! It's not as bad as it looks and it's very useful to know more about all of it, and fun too!

There's a couple of useful ones to know about straight away. You wouldn't think we're using any libraries at all at the moment, but 'M' will show which modules are currently loaded, and their version number, while 'm' will show the methods, and 'S' shows all subroutines (by pattern) as shown below. 'V' and 'X' show variables in the program by package scope and can be constrained by pattern.

  1. DB<2>S str
  2. dumpvar::stringify
  3. strict::bits
  4. strict::import
  5. strict::unimport

Using 'X' and cousins requires you not to use the type identifiers ($@%), just the 'name':

  1. DB<3> X ~err
  2. FileHandle(stderr) => fileno(2)

Remember we're in our tiny program with a problem, we should have a look at where we are, and what our data looks like. First of all let's view some code at our present position (the first line of code in this case), via 'v':

  1. DB<4> v
  2. 1 #!/usr/bin/perl
  3. 2: use strict;
  4. 3
  5. 4==> my $key = 'welcome';
  6. 5: my %data = (
  7. 6 'this' => qw(that),
  8. 7 'tom' => qw(and jerry),
  9. 8 'welcome' => q(Hello World),
  10. 9 'zip' => q(welcome),
  11. 10 );

At line number 4 is a helpful pointer, that tells you where you are now. To see more code, type 'v' again:

  1. DB<4> v
  2. 8 'welcome' => q(Hello World),
  3. 9 'zip' => q(welcome),
  4. 10 );
  5. 11: my @data = keys %data;
  6. 12: print "All OK\n" if grep($key, keys %data);
  7. 13: print "$data{$key}\n";
  8. 14: print "done: '$data{$key}'\n";
  9. 15: exit;

And if you wanted to list line 5 again, type 'l 5', (note the space):

  1. DB<4> l 5
  2. 5: my %data = (

In this case, there's not much to see, but of course normally there's pages of stuff to wade through, and 'l' can be very useful. To reset your view to the line we're about to execute, type a lone period '.':

  1. DB<5> .
  2. main::(./data_a:4): my $key = 'welcome';

The line shown is the one that is about to be executed next, it hasn't happened yet. So while we can print a variable with the letter 'p', at this point all we'd get is an empty (undefined) value back. What we need to do is to step through the next executable statement with an 's':

  1. DB<6> s
  2. main::(./data_a:5): my %data = (
  3. main::(./data_a:6): 'this' => qw(that),
  4. main::(./data_a:7): 'tom' => qw(and jerry),
  5. main::(./data_a:8): 'welcome' => q(Hello World),
  6. main::(./data_a:9): 'zip' => q(welcome),
  7. main::(./data_a:10): );

Now we can have a look at that first ($key) variable:

  1. DB<7> p $key
  2. welcome

Line 13 is where the action is, so let's continue down to there via the letter 'c', which by the way, inserts a 'one-time-only' breakpoint at the given line or subroutine:

  1. DB<8> c 13
  2. All OK
  3. main::(./data_a:13): print "$data{$key}\n";

We've gone past our check (where 'All OK' was printed) and have stopped just before the meat of our task. We could try to print out a couple of variables to see what is happening:

  1. DB<9> p $data{$key}

Not much in there, let's have a look at our hash:

  1. DB<10> p %data
  2. Hello Worldziptomandwelcomejerrywelcomethisthat
  3. DB<11> p keys %data
  4. Hello Worldtomwelcomejerrythis

Well, this isn't very easy to read, and using the helpful manual (h h), the 'x' command looks promising:

  1. DB<12> x %data
  2. 0 'Hello World'
  3. 1 'zip'
  4. 2 'tom'
  5. 3 'and'
  6. 4 'welcome'
  7. 5 undef
  8. 6 'jerry'
  9. 7 'welcome'
  10. 8 'this'
  11. 9 'that'

That's not much help, a couple of welcomes in there, but no indication of which are keys, and which are values, it's just a listed array dump and, in this case, not particularly helpful. The trick here, is to use a reference to the data structure:

  1. DB<13> x \%data
  2. 0 HASH(0x8194bc4)
  3. 'Hello World' => 'zip'
  4. 'jerry' => 'welcome'
  5. 'this' => 'that'
  6. 'tom' => 'and'
  7. 'welcome' => undef

The reference is truly dumped and we can finally see what we're dealing with. Our quoting was perfectly valid but wrong for our purposes, with 'and jerry' being treated as 2 separate words rather than a phrase, thus throwing the evenly paired hash structure out of alignment.
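The same trick works outside the debugger. Here's a small sketch using the core Data::Dumper module (not something the debugger session above used) on the hash as the broken script ends up building it; the Sortkeys and Indent settings are only there to keep the output tidy.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order for readable output
$Data::Dumper::Indent   = 1;

# The hash as the broken script ends up building it (see the dump above).
my %data = (
    'Hello World' => 'zip',
    'jerry'       => 'welcome',
    'this'        => 'that',
    'tom'         => 'and',
    'welcome'     => undef,
);

my @flat = %data;             # list context: keys and values interleaved
my $dump = Dumper(\%data);    # reference: the pairing is preserved

print scalar(@flat), " items when flattened\n";
print $dump;
```

Flattened, all you can say is that there are 10 items; dumped through a reference, each key sits next to its value.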

The '-w' switch would have told us about this, had we used it at the start, and saved us a lot of trouble:

  1. > perl -w data
  2. Odd number of elements in hash assignment at ./data line 5.

We fix our quoting: 'tom' => q(and jerry), and run it again, this time we get our expected output:

  1. > perl -w data
  2. Hello World

While we're here, take a closer look at the 'x' command, it's really useful and will merrily dump out nested references, complete objects, partial objects - just about whatever you throw at it:

Let's make a quick object and x-plode it, first we'll start the debugger: it wants some form of input from STDIN, so we give it something non-committal, a zero:

  1. > perl -de 0
  2. Default die handler restored.
  3. Loading DB routines from perl5db.pl version 1.07
  4. Editor support available.
  5. Enter h or `h h' for help, or `man perldebug' for more help.
  6. main::(-e:1): 0

Now build an on-the-fly object over a couple of lines (note the backslash):

  1. DB<1> $obj = bless({'unique_id'=>'123', 'attr'=> \
  2. cont: {'col' => 'black', 'things' => [qw(this that etc)]}}, 'MY_class')

And let's have a look at it:

  1. DB<2> x $obj
  2. 0 MY_class=HASH(0x828ad98)
  3. 'attr' => HASH(0x828ad68)
  4. 'col' => 'black'
  5. 'things' => ARRAY(0x828abb8)
  6. 0 'this'
  7. 1 'that'
  8. 2 'etc'
  9. 'unique_id' => 123
  10. DB<3>

Useful, huh? You can eval nearly anything in there, and experiment with bits of code or regexes until the cows come home:

  1. DB<3> @data = qw(this that the other atheism leather theory scythe)
  2. DB<4> p 'saw -> '.($cnt += map { print "\t:\t$_\n" } grep(/the/, sort @data))
  3. atheism
  4. leather
  5. other
  6. scythe
  7. the
  8. theory
  9. saw -> 6

If you want to see the command History, type an 'H':

  1. DB<5> H
  2. 4: p 'saw -> '.($cnt += map { print "\t:\t$_\n" } grep(/the/, sort @data))
  3. 3: @data = qw(this that the other atheism leather theory scythe)
  4. 2: x $obj
  5. 1: $obj = bless({'unique_id'=>'123', 'attr'=>
  6. {'col' => 'black', 'things' => [qw(this that etc)]}}, 'MY_class')
  7. DB<5>

And if you want to repeat any previous command, use the exclamation: '!':

  1. DB<5> !4
  2. p 'saw -> '.($cnt += map { print "$_\n" } grep(/the/, sort @data))
  3. atheism
  4. leather
  5. other
  6. scythe
  7. the
  8. theory
  9. saw -> 12
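In case the jump from 6 to 12 looks odd: map in scalar context returns the number of elements it produced, and $cnt keeps its value between debugger commands, so the second run adds another 6. A stripped-down sketch of the same arithmetic (without the printing):

```perl
use strict;
use warnings;

my @data = qw(this that the other atheism leather theory scythe);
my @hits = grep { /the/ } sort @data;

my $cnt = 0;
# "+=" puts map in scalar context, where it returns an element count.
$cnt += map { uc $_ } @hits;
print "after first run:  $cnt\n";
$cnt += map { uc $_ } @hits;
print "after second run: $cnt\n";
```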

For more on references see perlref and perlreftut.

Stepping through code

Here's a simple program which converts between Celsius and Fahrenheit, it too has a problem:

  1. #!/usr/bin/perl -w
  2. use strict;
  3. my $arg = $ARGV[0] || '-c20';
  4. if ($arg =~ /^\-(c|f)((\-|\+)*\d+(\.\d+)*)$/) {
  5. my ($deg, $num) = ($1, $2);
  6. my ($in, $out) = ($num, $num);
  7. if ($deg eq 'c') {
  8. $deg = 'f';
  9. $out = &c2f($num);
  10. } else {
  11. $deg = 'c';
  12. $out = &f2c($num);
  13. }
  14. $out = sprintf('%0.2f', $out);
  15. $out =~ s/^((\-|\+)*\d+)\.0+$/$1/;
  16. print "$out $deg\n";
  17. } else {
  18. print "Usage: $0 -[c|f] num\n";
  19. }
  20. exit;
  21. sub f2c {
  22. my $f = shift;
  23. my $c = 5 * $f - 32 / 9;
  24. return $c;
  25. }
  26. sub c2f {
  27. my $c = shift;
  28. my $f = 9 * $c / 5 + 32;
  29. return $f;
  30. }

For some reason, the Fahrenheit to Celsius conversion fails to return the expected output. This is what it does:

  1. > temp -c0.72
  2. 33.30 f
  3. > temp -f33.3
  4. 162.94 c

Not very consistent! We'll set a breakpoint in the code manually and run it under the debugger to see what's going on. A breakpoint is a flag: the debugger runs without interruption until it reaches the breakpoint, then stops execution and offers a prompt for further interaction. In normal use, these debugger commands are completely ignored, and they are safe, if a little messy, to leave in production code.

  1. my ($in, $out) = ($num, $num);
  2. $DB::single=2; # insert at line 9!
  3. if ($deg eq 'c')
  4. ...
  5. > perl -d temp -f33.3
  6. Default die handler restored.
  7. Loading DB routines from perl5db.pl version 1.07
  8. Editor support available.
  9. Enter h or `h h' for help, or `man perldebug' for more help.
  10. main::(temp:4): my $arg = $ARGV[0] || '-c100';

We'll simply continue down to our pre-set breakpoint with a 'c':

  1. DB<1> c
  2. main::(temp:10): if ($deg eq 'c') {
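The $DB::single assignment used above can be tried in isolation. This is a minimal sketch with a made-up sub called scale; the 'no warnings' line is a precaution we've added in case a single mention of the variable would otherwise draw a "used only once" warning.

```perl
use strict;
use warnings;

# scale is a made-up sub, used only to have somewhere to stop.
sub scale {
    my ($num) = @_;

    # Under "perl -d" this makes the debugger stop at the next statement;
    # in a normal run it is just an assignment to a package variable.
    { no warnings 'once'; $DB::single = 1; }

    my $out = $num * 2;
    return $out;
}

print scale(21), "\n";
```

Run it normally and it just prints its result; run it with perl -d and a 'c' should take you straight to the line after the assignment.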

Followed by a view command to see where we are:

  1. DB<1> v
  2. 7: my ($deg, $num) = ($1, $2);
  3. 8: my ($in, $out) = ($num, $num);
  4. 9: $DB::single=2;
  5. 10==> if ($deg eq 'c') {
  6. 11: $deg = 'f';
  7. 12: $out = &c2f($num);
  8. 13 } else {
  9. 14: $deg = 'c';
  10. 15: $out = &f2c($num);
  11. 16 }

And a print to show what values we're currently using:

  1. DB<1> p $deg, $num
  2. f33.3

We can put another break point on any line beginning with a colon, we'll use line 17 as that's just as we come out of the subroutine, and we'd like to pause there later on:

  1. DB<2> b 17

There's no feedback from this, but you can see what breakpoints are set by using the list 'L' command:

  1. DB<3> L
  2. temp:
  3. 17: print "$out $deg\n";
  4. break if (1)

Note that to delete a breakpoint you use 'B'.

Now we'll continue down into our subroutine, this time rather than by line number, we'll use the subroutine name, followed by the now familiar 'v':

  1. DB<3> c f2c
  2. main::f2c(temp:30): my $f = shift;
  3. DB<4> v
  4. 24: exit;
  5. 25
  6. 26 sub f2c {
  7. 27==> my $f = shift;
  8. 28: my $c = 5 * $f - 32 / 9;
  9. 29: return $c;
  10. 30 }
  11. 31
  12. 32 sub c2f {
  13. 33: my $c = shift;

Note that if there was a subroutine call between us and line 29, and we wanted to single-step through it, we could use the 's' command, and to step over it we would use 'n' which would execute the sub, but not descend into it for inspection. In this case though, we simply continue down to line 29:

  1. DB<4> c 29
  2. main::f2c(temp:29): return $c;

And have a look at the return value:

  1. DB<5> p $c
  2. 162.944444444444

This is not the right answer at all, but the sum looks correct. I wonder if it's anything to do with operator precedence? We'll try a couple of other possibilities with our sum:

  1. DB<6> p (5 * $f - 32 / 9)
  2. 162.944444444444
  3. DB<7> p 5 * $f - (32 / 9)
  4. 162.944444444444
  5. DB<8> p (5 * $f) - 32 / 9
  6. 162.944444444444
  7. DB<9> p 5 * ($f - 32) / 9
  8. 0.722222222222221

:-) that's more like it! Ok, now we can set our return variable and we'll return out of the sub with an 'r':

  1. DB<10> $c = 5 * ($f - 32) / 9
  2. DB<11> r
  3. scalar context return from main::f2c: 0.722222222222221

Looks good, let's just continue off the end of the script:

  1. DB<12> c
  2. 0.72 c
  3. Debugged program terminated. Use q to quit or R to restart,
  4. use O inhibit_exit to avoid stopping after program termination,
  5. h q, h R or h O to get additional info.

A quick fix to the offending line (insert the missing parentheses) in the actual program and we're finished.
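For the record, here is a sketch of the two versions side by side. f2c_buggy is our name for the original line, kept only for comparison; f2c and c2f are as in the script, with the parentheses added.

```perl
use strict;
use warnings;

# Original line: "*" and "/" bind tighter than "-", so this computes
# (5 * $f) - (32 / 9).
sub f2c_buggy { my $f = shift; return 5 * $f - 32 / 9; }

# Fixed line: subtract 32 first, then scale.
sub f2c { my $f = shift; return 5 * ($f - 32) / 9; }

sub c2f { my $c = shift; return 9 * $c / 5 + 32; }

printf "buggy: %.2f  fixed: %.2f  round trip: %.2f\n",
    f2c_buggy(33.3), f2c(33.3), c2f(f2c(33.3));
```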

Placeholder for a, w, t, T

Actions, watch variables, stack traces etc.: on the TODO list.

  1. a
  2. w
  3. t
  4. T

REGULAR EXPRESSIONS

Ever wanted to know what a regex looked like? You'll need perl compiled with the DEBUGGING flag for this one:

  1. > perl -Dr -e '/^pe(a)*rl$/i'
  2. Compiling REx `^pe(a)*rl$'
  3. size 17 first at 2
  4. rarest char
  5. at 0
  6. 1: BOL(2)
  7. 2: EXACTF <pe>(4)
  8. 4: CURLYN[1] {0,32767}(14)
  9. 6: NOTHING(8)
  10. 8: EXACTF <a>(0)
  11. 12: WHILEM(0)
  12. 13: NOTHING(14)
  13. 14: EXACTF <rl>(16)
  14. 16: EOL(17)
  15. 17: END(0)
  16. floating `'$ at 4..2147483647 (checking floating) stclass `EXACTF <pe>'
  17. anchored(BOL) minlen 4
  18. Omitting $` $& $' support.
  19. EXECUTING...
  20. Freeing REx: `^pe(a)*rl$'

Did you really want to know? :-) For more gory details on getting regular expressions to work, have a look at perlre, perlretut, and to decode the mysterious labels (BOL and CURLYN, etc. above), see perldebguts.
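If your perl wasn't built with DEBUGGING, the re pragma offers a similar view. A minimal sketch, with looks_like_perl being a made-up name; the trace goes to STDERR, and its exact layout depends on your perl version:

```perl
use strict;
use warnings;

# looks_like_perl is a made-up name for this example.
sub looks_like_perl {
    my ($word) = @_;
    use re 'debug';    # lexically scoped: only this sub is traced
    return $word =~ /^pe(a)*rl$/i ? 1 : 0;
}

print looks_like_perl('Pearl'), "\n";
```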

OUTPUT TIPS

To get all the output from your error log, and not miss any messages via helpful operating system buffering, insert a line like this, at the start of your script:

  1. $|=1;
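Bear in mind that $| only affects the currently selected filehandle, which is STDOUT unless you've changed it. A slightly more explicit sketch, using the core IO::Handle module rather than the special variable:

```perl
use strict;
use warnings;
use IO::Handle;

# $| only affects the currently selected handle (STDOUT by default);
# the autoflush method names the handle explicitly.
STDOUT->autoflush(1);
STDERR->autoflush(1);    # STDERR is normally unbuffered already

print STDOUT "about to do something risky\n";
print STDERR "something went wrong\n";
```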

To watch the tail of a dynamically growing logfile, (from the command line):

  1. tail -f $error_log

Wrapping all die calls in a handler routine can be useful to see how, and from where, they're being called, perlvar has more information:

  1. BEGIN { $SIG{__DIE__} = sub { require Carp; Carp::confess(@_) } }
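To get a feel for what the handler buys you, here's a small sketch that installs it for a single block and catches the result. The subs inner and outer are made up for the example, and the exact layout of the trace depends on your version of Carp.

```perl
use strict;
use warnings;
use Carp ();

# inner and outer are made-up subs that give the trace some depth.
sub inner { die "boom\n" }
sub outer { inner() }

my $trace = do {
    # Install the handler for this block only, so later dies are unaffected.
    local $SIG{__DIE__} = sub { Carp::confess(@_) };
    eval { outer(); 1 } ? '' : $@;
};

print $trace;
```

Instead of a bare "boom", the captured message should now list the calls that led up to it.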

Various useful techniques for the redirection of STDOUT and STDERR filehandles are explained in perlopentut and perlfaq8.

CGI

Just a quick hint here for all those CGI programmers who can't figure out how on earth to get past that 'waiting for input' prompt, when running their CGI script from the command-line, try something like this:

  1. > perl -d my_cgi.pl -nodebug

Of course CGI and perlfaq9 will tell you more.

GUIs

The command line interface is tightly integrated with an emacs extension and there's a vi interface too.

You don't have to do this all on the command line, though, there are a few GUI options out there. The nice thing about these is you can wave a mouse over a variable and a dump of its data will appear in an appropriate window, or in a popup balloon, no more tiresome typing of 'x $varname' :-)

In particular have a hunt around for the following:

ptkdb perlTK based wrapper for the built-in debugger

ddd data display debugger

PerlDevKit and PerlBuilder are NT specific

NB. (more info on these and others would be appreciated).

SUMMARY

We've seen how to encourage good coding practices with use strict and -w. You can run the perl debugger with perl -d scriptname and inspect your data from within it with the p and x commands. You can walk through your code, set breakpoints with b and step through that code with s or n, continue with c and return from a sub with r. Fairly intuitive stuff when you get down to it.

There is of course lots more to find out about; this has just scratched the surface. The best way to learn more is to use perldoc to find out more about the language, to read the on-line help (perldebug is probably the next place to go), and of course, experiment.

SEE ALSO

perldebug, perldebguts, perldiag, perlrun

AUTHOR

Richard Foley <richard.foley@rfi.net> Copyright (c) 2000

CONTRIBUTORS

Various people have made helpful suggestions and contributions, in particular:

Ronald J Kimball <rjk@linguist.dartmouth.edu>

Hugo van der Sanden <hv@crypt0.demon.co.uk>

Peter Scott <Peter@PSDT.com>

 
perldebug

NAME

perldebug - Perl debugging

DESCRIPTION

First of all, have you tried using the -w switch?

If you're new to the Perl debugger, you may prefer to read perldebtut, which is a tutorial introduction to the debugger.

The Perl Debugger

If you invoke Perl with the -d switch, your script runs under the Perl source debugger. This works like an interactive Perl environment, prompting for debugger commands that let you examine source code, set breakpoints, get stack backtraces, change the values of variables, etc. This is so convenient that you often fire up the debugger all by itself just to test out Perl constructs interactively to see what they do. For example:

  $ perl -d -e 42

In Perl, the debugger is not a separate program the way it usually is in the typical compiled environment. Instead, the -d flag tells the compiler to insert source information into the parse trees it's about to hand off to the interpreter. That means your code must first compile correctly for the debugger to work on it. Then when the interpreter starts up, it preloads a special Perl library file containing the debugger.

The program will halt right before the first run-time executable statement (but see below regarding compile-time statements) and ask you to enter a debugger command. Contrary to popular expectations, whenever the debugger halts and shows you a line of code, it always displays the line it's about to execute, rather than the one it has just executed.

Any command not recognized by the debugger is directly executed (eval'd) as Perl code in the current package. (The debugger uses the DB package for keeping its own state information.)

Note that the said eval is bound by an implicit scope. As a result any newly introduced lexical variable or any modified capture buffer content is lost after the eval. The debugger is a nice environment to learn Perl, but if you interactively experiment using material which should be in the same scope, stuff it in one line.
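The effect of that implicit scope can be reproduced outside the debugger with ordinary string evals. What follows is only a sketch of the behaviour, not the debugger's own code; the helper name run_line is made up for illustration, and strictness is relaxed inside it so that an unknown $x falls back to an (undefined) package variable.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the debugger's command loop: every "line" is
# evaluated by its own string eval, so every line gets its own lexical scope.
sub run_line {
    my ($code) = @_;
    no strict 'vars';    # unknown names fall back to package variables
    no warnings;         # keep the demonstration quiet
    my @result = eval $code;
    die $@ if $@;
    return @result;
}

# Two separate lines: the lexical $x declared by the first is gone in the second.
run_line('my $x = 42;');
my ($lost) = run_line('defined($x) ? "defined" : "undef"');

# One line: declaration and use share a scope, so $x is still there.
my ($kept) = run_line('my $x = 42; defined($x) ? "defined" : "undef"');

print "separate lines: $lost\n";
print "same line:      $kept\n";
```

This is why the advice above is to put a my declaration and the code that uses it on a single debugger line.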

For any text entered at the debugger prompt, leading and trailing whitespace is first stripped before further processing. If a debugger command coincides with some function in your own program, merely precede the function with something that doesn't look like a debugger command, such as a leading ; or perhaps a + , or by wrapping it with parentheses or braces.

Calling the Debugger

There are several ways to call the debugger:

  • perl -d program_name

    On the given program identified by program_name .

  • perl -d -e 0

    Interactively supply an arbitrary expression using -e .

  • perl -d:Ptkdb program_name

    Debug a given program via the Devel::Ptkdb GUI.

  • perl -dt threaded_program_name

    Debug a given program using threads (experimental).

Debugger Commands

The interactive debugger understands the following commands:

  • h

    Prints out a summary help message

  • h [command]

    Prints out a help message for the given debugger command.

  • h h

    The special argument of h h produces the entire help page, which is quite long.

    If the output of the h h command (or any command, for that matter) scrolls past your screen, precede the command with a leading pipe symbol so that it's run through your pager, as in

    DB> |h h

    You may change the pager that is used via the o pager=... command.

  • p expr

    Same as print {$DB::OUT} expr in the current package. In particular, because this is just Perl's own print function, this means that nested data structures and objects are not dumped, unlike with the x command.

    The DB::OUT filehandle is opened to /dev/tty, regardless of where STDOUT may be redirected to.

  • x [maxdepth] expr

    Evaluates its expression in list context and dumps out the result in a pretty-printed fashion. Nested data structures are printed out recursively, unlike the real print function in Perl. When dumping hashes, you'll probably prefer 'x \%h' rather than 'x %h'. See Dumpvalue if you'd like to do this yourself.

    The output format is governed by multiple options described under Configurable Options.

    If the maxdepth is included, it must be a numeral N; the value is dumped only N levels deep, as if the dumpDepth option had been temporarily set to N.

  • V [pkg [vars]]

    Display all (or some) variables in package (defaulting to main ) using a data pretty-printer (hashes show their keys and values so you see what's what, control characters are made printable, etc.). Make sure you don't put the type specifier (like $ ) there, just the symbol names, like this:

    V DB filename line

    Use ~pattern and !pattern for positive and negative regexes.

    This is similar to calling the x command on each applicable var.

  • X [vars]

    Same as V currentpackage [vars] .

  • y [level [vars]]

    Display all (or some) lexical variables (mnemonic: mY variables) in the current scope or level scopes higher. You can limit the variables that you see with vars which works exactly as it does for the V and X commands. Requires the PadWalker module version 0.08 or higher; will warn if this isn't installed. Output is pretty-printed in the same style as for V and the format is controlled by the same options.

  • T

    Produce a stack backtrace. See below for details on its output.

  • s [expr]

    Single step. Executes until the beginning of another statement, descending into subroutine calls. If an expression is supplied that includes function calls, it too will be single-stepped.

  • n [expr]

    Next. Executes over subroutine calls, until the beginning of the next statement. If an expression is supplied that includes function calls, those functions will be executed with stops before each statement.

  • r

    Continue until the return from the current subroutine. Dump the return value if the PrintRet option is set (default).

  • <CR>

    Repeat last n or s command.

  • c [line|sub]

    Continue, optionally inserting a one-time-only breakpoint at the specified line or subroutine.

  • l

    List next window of lines.

  • l min+incr

    List incr+1 lines starting at min .

  • l min-max

    List lines min through max . l - is synonymous to - .

  • l line

    List a single line.

  • l subname

    List first window of lines from subroutine. subname may be a variable that contains a code reference.

  • -

    List previous window of lines.

  • v [line]

    View a few lines of code around the current line.

  • .

    Return the internal debugger pointer to the line last executed, and print out that line.

  • f filename

    Switch to viewing a different file or eval statement. If filename is not a full pathname found in the values of %INC, it is considered a regex.

    evaled strings (when accessible) are considered to be filenames: f (eval 7) and f eval 7\b access the body of the 7th evaled string (in the order of execution). The bodies of the currently executed eval and of evaled strings that define subroutines are saved and thus accessible.

  • /pattern/

    Search forwards for pattern (a Perl regex); final / is optional. The search is case-insensitive by default.

  • ?pattern?

    Search backwards for pattern; final ? is optional. The search is case-insensitive by default.

  • L [abw]

    List (default all) actions, breakpoints and watch expressions

  • S [[!]regex]

    List subroutine names [not] matching the regex.

  • t [n]

    Toggle trace mode (see also the AutoTrace option). Optional argument is the maximum number of levels to trace below the current one; anything deeper than that will be silent.

  • t [n] expr

    Trace through execution of expr . Optional first argument is the maximum number of levels to trace below the current one; anything deeper than that will be silent. See Frame Listing Output Examples in perldebguts for examples.

  • b

    Sets breakpoint on current line

  • b [line] [condition]

    Set a breakpoint before the given line. If a condition is specified, it's evaluated each time the statement is reached: a breakpoint is taken only if the condition is true. Breakpoints may only be set on lines that begin an executable statement. Conditions don't use if :

    b 237 $x > 30
    b 237 ++$count237 < 11
    b 33 /pattern/i

    If the line number is ., sets a breakpoint on the current line:

    b . $n > 100

  • b [file]:[line] [condition]

    Set a breakpoint before the given line in a (possibly different) file. If a condition is specified, it's evaluated each time the statement is reached: a breakpoint is taken only if the condition is true. Breakpoints may only be set on lines that begin an executable statement. Conditions don't use if :

    b lib/MyModule.pm:237 $x > 30
    b /usr/lib/perl5/site_perl/CGI.pm:100 ++$count100 < 11

  • b subname [condition]

    Set a breakpoint before the first line of the named subroutine. subname may be a variable containing a code reference (in this case condition is not supported).

  • b postpone subname [condition]

    Set a breakpoint at first line of subroutine after it is compiled.

  • b load filename

    Set a breakpoint before the first executed line of the filename, which should be a full pathname found amongst the %INC values.

  • b compile subname

    Sets a breakpoint before the first statement executed after the specified subroutine is compiled.

  • B line

    Delete a breakpoint from the specified line.

  • B *

    Delete all installed breakpoints.

  • disable [file]:[line]

    Disable the breakpoint so it won't stop the execution of the program. Breakpoints are enabled by default and can be re-enabled using the enable command.

  • disable [line]

    Disable the breakpoint so it won't stop the execution of the program. Breakpoints are enabled by default and can be re-enabled using the enable command.

    This is done for a breakpoint in the current file.

  • enable [file]:[line]

    Enable the breakpoint so it will stop the execution of the program.

  • enable [line]

    Enable the breakpoint so it will stop the execution of the program.

    This is done for a breakpoint in the current file.

  • a [line] command

    Set an action to be done before the line is executed. If line is omitted, set an action on the line about to be executed. The sequence of steps taken by the debugger is

    1. check for a breakpoint at this line
    2. print the line if necessary (tracing)
    3. do any actions associated with that line
    4. prompt user if at a breakpoint or in single-step
    5. evaluate line

    For example, this will print out $foo every time line 53 is passed:

    a 53 print "DB FOUND $foo\n"

  • A line

    Delete an action from the specified line.

  • A *

    Delete all installed actions.

  • w expr

    Add a global watch-expression. Whenever a watched global changes the debugger will stop and display the old and new values.

  • W expr

    Delete watch-expression

  • W *

    Delete all watch-expressions.

  • o

    Display all options.

  • o booloption ...

    Set each listed Boolean option to the value 1 .

  • o anyoption? ...

    Print out the value of one or more options.

  • o option=value ...

    Set the value of one or more options. If the value has internal whitespace, it should be quoted. For example, you could set o pager="less -MQeicsNfr" to call less with those specific options. You may use either single or double quotes, but if you do, you must escape any embedded instances of the same sort of quote you began with, as well as escaping any escapes that immediately precede that quote but which are not meant to escape the quote itself. In other words, you follow single-quoting rules irrespective of the quote; e.g.: o option='this isn\'t bad' or o option="She said, \"Isn't it?\"" .

    For historical reasons, the =value is optional, but defaults to 1 only where it is safe to do so--that is, mostly for Boolean options. It is always better to assign a specific value using = . The option can be abbreviated, but for clarity probably should not be. Several options can be set together. See Configurable Options for a list of these.

  • < ?

    List out all pre-prompt Perl command actions.

  • < [ command ]

    Set an action (Perl command) to happen before every debugger prompt. A multi-line command may be entered by backslashing the newlines.

  • < *

    Delete all pre-prompt Perl command actions.

  • << command

    Add an action (Perl command) to happen before every debugger prompt. A multi-line command may be entered by backslashing the newlines.

  • > ?

    List out post-prompt Perl command actions.

  • > command

    Set an action (Perl command) to happen after the prompt when you've just given a command to return to executing the script. A multi-line command may be entered by backslashing the newlines (we bet you couldn't have guessed this by now).

  • > *

    Delete all post-prompt Perl command actions.

  • >> command

    Adds an action (Perl command) to happen after the prompt when you've just given a command to return to executing the script. A multi-line command may be entered by backslashing the newlines.

  • { ?

    List out pre-prompt debugger commands.

  • { [ command ]

    Set an action (debugger command) to happen before every debugger prompt. A multi-line command may be entered in the customary fashion.

    Because this command is in some senses new, a warning is issued if you appear to have accidentally entered a block instead. If that's what you mean to do, write it with a leading semicolon, as in ;{ ... } , or even as do { ... } .

  • { *

    Delete all pre-prompt debugger commands.

  • {{ command

    Add an action (debugger command) to happen before every debugger prompt. A multi-line command may be entered, if you can guess how: see above.

  • ! number

    Redo a previous command (defaults to the previous command).

  • ! -number

    Redo number'th previous command.

  • ! pattern

    Redo last command that started with pattern. See o recallCommand , too.

  • !! cmd

    Run cmd in a subprocess (reads from DB::IN, writes to DB::OUT) See o shellBang , also. Note that the user's current shell (well, their $ENV{SHELL} variable) will be used, which can interfere with proper interpretation of exit status or signal and coredump information.

  • source file

    Read and execute debugger commands from file. file may itself contain source commands.

  • H -number

    Display the last number commands. Only commands longer than one character are listed. If number is omitted, list them all.

  • q or ^D

    Quit. ("quit" doesn't work for this, unless you've made an alias.) This is the only supported way to exit the debugger, though typing exit twice might work.

    Set the inhibit_exit option to 0 if you want to be able to step off the end of the script. You may also need to set $finished to 0 if you want to step through global destruction.

  • R

    Restart the debugger by exec()ing a new session. We try to maintain your history across this, but internal settings and command-line options may be lost.

    The following settings are currently preserved: history, breakpoints, actions, debugger options, and the Perl command-line options -w, -I, and -e.

  • |dbcmd

    Run the debugger command, piping DB::OUT into your current pager.

  • ||dbcmd

    Same as |dbcmd but DB::OUT is temporarily selected as well.

  • = [alias value]

    Define a command alias, like

    = quit q

    or list current aliases.

  • command

    Execute command as a Perl statement. A trailing semicolon will be supplied. If the Perl statement would otherwise be confused for a Perl debugger command, use a leading semicolon, too.

  • m expr

    List which methods may be called on the result of the evaluated expression. The expression may evaluate to a reference to a blessed object, or to a package name.

  • M

    Display all loaded modules and their versions.

  • man [manpage]

    Despite its name, this calls your system's default documentation viewer on the given page, or on the viewer itself if manpage is omitted. If that viewer is man, the current Config information is used to invoke man using the proper MANPATH or -M manpath option. Failed lookups of the form XXX that match known manpages of the form perlXXX will be retried. This lets you type man debug or man op from the debugger.

    On systems traditionally bereft of a usable man command, the debugger invokes perldoc. Occasionally this determination is incorrect due to recalcitrant vendors or, rather more felicitously, to enterprising users. If you fall into either category, just manually set the $DB::doccmd variable to whatever viewer you use to view the Perl documentation on your system. This may be set in an rc file, or through direct assignment. We're still waiting for a working example of something along the lines of:

    $DB::doccmd = 'netscape -remote http://something.here/';
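The difference between the p and x commands described in the list above can be approximated in an ordinary script. This is a sketch using the core Dumpvalue module, which the x entry points to; the sample data is invented, and the debugger's actual output format depends on the options described under Configurable Options.

```perl
use strict;
use warnings;
use Dumpvalue;

# Invented sample data: a hash holding a nested array.
my %h = (name => 'camel', legs => [1, 2, 3, 4]);

# Roughly what "p \%h" gives you: print just stringifies the reference,
# producing something of the form HASH(0x...).
my $as_printed = "" . \%h;
print "$as_printed\n";

# Roughly what "x \%h" gives you: a recursive, pretty-printed dump that
# descends into the nested array.
my $dumper = Dumpvalue->new;
$dumper->dumpValue(\%h);
```

Passing a reference, as in x \%h, keeps keys and values paired in the dump; x %h would dump the flattened key/value list instead.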

Configurable Options

The debugger has numerous options settable using the o command, either interactively or from the environment or an rc file. (./.perldb or ~/.perldb under Unix.)

  • recallCommand , ShellBang

    The characters used to recall a command or spawn a shell. By default, both are set to ! , which is unfortunate.

  • pager

    Program to use for output of pager-piped commands (those beginning with a | character.) By default, $ENV{PAGER} will be used. Because the debugger uses your current terminal characteristics for bold and underlining, if the chosen pager does not pass escape sequences through unchanged, the output of some debugger commands will not be readable when sent through the pager.

  • tkRunning

    Run Tk while prompting (with ReadLine).

  • signalLevel , warnLevel , dieLevel

    Level of verbosity. By default, the debugger leaves your exceptions and warnings alone, because altering them can break correctly running programs. It will attempt to print a message when uncaught INT, BUS, or SEGV signals arrive. (But see the mention of signals in BUGS below.)

    To disable this default safe mode, set these values to something higher than 0. At a level of 1, you get backtraces upon receiving any kind of warning (this is often annoying) or exception (this is often valuable). Unfortunately, the debugger cannot discern fatal exceptions from non-fatal ones. If dieLevel is even 1, then your non-fatal exceptions are also traced and unceremoniously altered if they came from eval'ed strings or from any kind of eval within modules you're attempting to load. If dieLevel is 2, the debugger doesn't care where they came from: It usurps your exception handler and prints out a trace, then modifies all exceptions with its own embellishments. This may perhaps be useful for some tracing purposes, but tends to hopelessly destroy any program that takes its exception handling seriously.

  • AutoTrace

    Trace mode (similar to t command, but can be put into PERLDB_OPTS ).

  • LineInfo

    File or pipe to print line number info to. If it is a pipe (say, |visual_perl_db), then a short message is used. This is the mechanism used to interact with a slave editor or visual debugger, such as the special vi or emacs hooks, or the ddd graphical debugger.

  • inhibit_exit

    If 0, allows stepping off the end of the script.

  • PrintRet

    Print return value after r command if set (default).

  • ornaments

    Affects screen appearance of the command line (see Term::ReadLine). There is currently no way to disable these, which can render some output illegible on some displays, or with some pagers. This is considered a bug.

  • frame

    Affects the printing of messages upon entry and exit from subroutines. If frame & 2 is false, messages are printed on entry only. (Printing on exit might be useful if interspersed with other messages.)

    If frame & 4 , arguments to functions are printed, plus context and caller info. If frame & 8 , overloaded stringify and tied FETCH is enabled on the printed arguments. If frame & 16 , the return value from the subroutine is printed.

    The length at which the argument list is truncated is governed by the next option:

  • maxTraceLen

    Length to truncate the argument list when the frame option's bit 4 is set.

  • windowSize

    Change the size of code list window (default is 10 lines).

The following options affect what happens with V , X , and x commands:

  • arrayDepth , hashDepth

    Print only first N elements ('' for all).

  • dumpDepth

    Limit recursion depth to N levels when dumping structures. Negative values are interpreted as infinity. Default: infinity.

  • compactDump , veryCompact

    Change the style of array and hash output. If compactDump is set, short arrays may be printed on one line.

  • globPrint

    Whether to print contents of globs.

  • DumpDBFiles

    Dump arrays holding debugged files.

  • DumpPackages

    Dump symbol tables of packages.

  • DumpReused

    Dump contents of "reused" addresses.

  • quote , HighBit , undefPrint

    Change the style of string dump. The default value for quote is auto ; one can enable double-quotish or single-quotish format by setting it to " or ', respectively. By default, characters with their high bit set are printed verbatim.

  • UsageOnly

    Rudimentary per-package memory usage dump. Calculates total size of strings found in variables in the package. This does not include lexicals in a module's file scope, or lost in closures.

  • HistFile

    The path of the file from which the history (assuming a usable Term::ReadLine backend) will be read on the debugger's startup, and to which it will be saved on shutdown (for persistence across sessions). Similar in concept to Bash's .bash_history file.

  • HistSize

    The count of the saved lines in the history (assuming HistFile above).
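Since the frame option in the list above is a bit mask, it can help to spell out what a given value asks for. The helper below is a hypothetical sketch that only restates the bit meanings given in the frame entry; it is not part of the debugger, and treating 0 as "no frame messages" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: list the effects requested by a value of the
# frame option, following the bit meanings documented above.
sub describe_frame {
    my ($frame) = @_;
    return () unless $frame;    # assumption: 0 means no frame messages

    my @effects = ('messages on subroutine entry');
    push @effects, 'messages on subroutine exit'          if $frame & 2;
    push @effects, 'arguments, context and caller info'   if $frame & 4;
    push @effects, 'overloaded stringify and tied FETCH'  if $frame & 8;
    push @effects, 'return value of the subroutine'       if $frame & 16;
    return @effects;
}

for my $value (2, 6, 22) {
    print "frame=$value: ", join('; ', describe_frame($value)), "\n";
}
```

For example, frame=22 (16 + 4 + 2) asks for entry and exit messages, argument and caller details, and return values.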

After the rc file is read, the debugger reads the $ENV{PERLDB_OPTS} environment variable and parses this as the remainder of an "o ..." line as one might enter at the debugger prompt. You may place the initialization options TTY , noTTY , ReadLine , and NonStop there.

If your rc file contains:

  parse_options("NonStop=1 LineInfo=db.out AutoTrace");

then your script will run without human intervention, putting trace information into the file db.out. (If you interrupt it, you'd better reset LineInfo to /dev/tty if you expect to see anything.)

  • TTY

    The TTY to use for debugging I/O.

  • noTTY

    If set, the debugger goes into NonStop mode and will not connect to a TTY. If interrupted (or if control goes to the debugger via explicit setting of $DB::signal or $DB::single from the Perl script), it connects to a TTY specified in the TTY option at startup, or to a tty found at runtime using the Term::Rendezvous module of your choice.

    This module should implement a method named new that returns an object with two methods: IN and OUT . These should return filehandles to use for debugging input and output correspondingly. The new method should inspect an argument containing the value of $ENV{PERLDB_NOTTY} at startup, or "$ENV{HOME}/.perldbtty$$" otherwise. This file is not inspected for proper ownership, so security hazards are theoretically possible.

  • ReadLine

    If false, readline support in the debugger is disabled in order to debug applications that themselves use ReadLine.

  • NonStop

    If set, the debugger goes into non-interactive mode until interrupted, or programmatically by setting $DB::signal or $DB::single.

Here's an example of using the $ENV{PERLDB_OPTS} variable:

  $ PERLDB_OPTS="NonStop frame=2" perl -d myprogram

That will run the script myprogram without human intervention, printing out the call tree with entry and exit points. Note that NonStop=1 frame=2 is equivalent to N f=2 , and that originally, options could be uniquely abbreviated by the first letter (modulo the Dump* options). It is nevertheless recommended that you always spell them out in full for legibility and future compatibility.

Other examples include

  $ PERLDB_OPTS="NonStop LineInfo=listing frame=2" perl -d myprogram

which runs the script non-interactively, printing info on each entry into a subroutine and each executed line into the file named listing. (If you interrupt it, you had better reset LineInfo to something "interactive"!)

Another example (using standard shell syntax to show environment variable settings):

  $ ( PERLDB_OPTS="NonStop frame=1 AutoTrace LineInfo=tperl.out" \
      perl -d myprogram )

which may be useful for debugging a program that uses Term::ReadLine itself. Do not forget to detach your shell from the TTY in the window that corresponds to /dev/ttyXX, say, by issuing a command like

  $ sleep 1000000

See Debugger Internals in perldebguts for details.

Debugger Input/Output

  • Prompt

    The debugger prompt is something like

    DB<8>

    or even

    DB<<17>>

    where that number is the command number, which you'd use to access the command with the built-in csh-like history mechanism. For example, !17 would repeat command number 17. The depth of the angle brackets indicates the nesting depth of the debugger. You could get more than one set of brackets, for example, if you're already at a breakpoint and then print the result of a function call that itself has a breakpoint, or if you step into an expression via the s/n/t expression command.

  • Multiline commands

    If you want to enter a multi-line command, such as a subroutine definition with several statements or a format, escape the newline that would normally end the debugger command with a backslash. Here's an example:

    DB<1> for (1..4) { \
    cont: print "ok\n"; \
    cont: }
    ok
    ok
    ok
    ok

    Note that this business of escaping a newline is specific to interactive commands typed into the debugger.

  • Stack backtrace

    Here's an example of what a stack backtrace via T command might look like:

    $ = main::infested called from file 'Ambulation.pm' line 10
    @ = Ambulation::legs(1, 2, 3, 4) called from file 'camel_flea' line 7
    $ = main::pests('bactrian', 4) called from file 'camel_flea' line 4

    The left-hand character up there indicates the context in which the function was called, with $ and @ meaning scalar or list contexts respectively, and . meaning void context (which is actually a sort of scalar context). The display above says that you were in the function main::infested when you ran the stack dump, and that it was called in scalar context from line 10 of the file Ambulation.pm, but without any arguments at all, meaning it was called as &infested . The next stack frame shows that the function Ambulation::legs was called in list context from the camel_flea file with four arguments. The last stack frame shows that main::pests was called in scalar context, also from camel_flea, but from line 4.

    If you execute the T command from inside an active use statement, the backtrace will contain both a require frame and an eval frame.

  • Line Listing Format

    This shows the sorts of output the l command can produce:

    DB<<13>> l
    101: @i{@i} = ();
    102:b @isa{@i,$pack} = ()
    103 if(exists $i{$prevpack} || exists $isa{$pack});
    104 }
    105
    106 next
    107==> if(exists $isa{$pack});
    108
    109:a if ($extra-- > 0) {
    110: %isa = ($pack,1);

    Breakable lines are marked with : . Lines with breakpoints are marked by b and those with actions by a . The line that's about to be executed is marked by ==>.

    Please be aware that code in debugger listings may not look the same as your original source code. Line directives and external source filters can alter the code before Perl sees it, causing code to move from its original positions or take on entirely different forms.

  • Frame listing

    When the frame option is set, the debugger prints entered (and optionally exited) subroutines in different styles. See perldebguts for incredibly long examples of these.
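The context characters shown at the left of a stack backtrace correspond to what Perl's own caller function reports for each frame. The following sketch, which is independent of the debugger, uses invented helper names (context_char and probe) to show how the same three characters can be derived.

```perl
use strict;
use warnings;

# Hypothetical helper: report the context in which the *calling* sub was
# invoked, using the same characters as the backtrace ($, @ or .).
sub context_char {
    my $wantarray = (caller(1))[5];    # wantarray value for the caller's frame
    return '.' unless defined $wantarray;    # void context
    return $wantarray ? '@' : '$';           # list or scalar context
}

my @seen;
sub probe { push @seen, context_char(); return }

my @list   = probe();    # called in list context
my $scalar = probe();    # called in scalar context
probe();                 # called in void context

print "@seen\n";
```

Each call to probe records one character, in the order list, scalar, void.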

Debugging Compile-Time Statements

If you have compile-time executable statements (such as code within BEGIN, UNITCHECK and CHECK blocks or use statements), these will not be stopped by the debugger, although requires and INIT blocks will. (Compile-time statements can still be traced with the AutoTrace option set in PERLDB_OPTS .) From your own Perl code, however, you can transfer control back to the debugger using the following statement, which is harmless if the debugger is not running:

  $DB::single = 1;

If you set $DB::single to 2, it's equivalent to having just typed the n command, whereas a value of 1 means the s command. The $DB::trace variable should be set to 1 to simulate having typed the t command.
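As a sketch of how this is typically used, the invented script below sets $DB::single both inside a BEGIN block and conditionally inside a loop. Run under perl -d it should hand control to the debugger at those points; run normally, the assignments only set an otherwise unused package variable. The record numbers and the @stopped_for bookkeeping are purely illustrative.

```perl
use strict;
use warnings;

BEGIN {
    # Under perl -d, the debugger takes over here, so the rest of this
    # compile-time block can be stepped through.
    $DB::single = 1;
    # ... compile-time code you want to inspect ...
}

# Stop only when something interesting happens, rather than on every pass.
my @stopped_for;
for my $record (1 .. 5) {
    if ($record == 3) {
        $DB::single = 1;               # 1 behaves like s, 2 like n
        push @stopped_for, $record;    # bookkeeping for this example only
    }
}

print "would have stopped for record(s): @stopped_for\n";
```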

Another way to debug compile-time code is to start the debugger, set a breakpoint on the load of some module:

  DB<7> b load f:/perllib/lib/Carp.pm
  Will stop on load of 'f:/perllib/lib/Carp.pm'.

and then restart the debugger using the R command (if possible). One can use b compile subname for the same purpose.

Debugger Customization

The debugger probably contains enough configuration hooks that you won't ever have to modify it yourself. You may change the behaviour of the debugger from within the debugger using its o command, from the command line via the PERLDB_OPTS environment variable, and from customization files.

You can do some customization by setting up a .perldb file, which contains initialization code. For instance, you could make aliases like these (the last one is one people expect to be there):

  $DB::alias{'len'} = 's/^len(.*)/p length($1)/';
  $DB::alias{'stop'} = 's/^stop (at|in)/b/';
  $DB::alias{'ps'} = 's/^ps\b/p scalar /';
  $DB::alias{'quit'} = 's/^quit(\s*)/exit/';

You can change options from .perldb by using calls like this one:

  parse_options("NonStop=1 LineInfo=db.out AutoTrace=1 frame=2");

The code is executed in the package DB . Note that .perldb is processed before processing PERLDB_OPTS . If .perldb defines the subroutine afterinit , that function is called after debugger initialization ends. .perldb may be contained in the current directory, or in the home directory. Because this file is sourced in by Perl and may contain arbitrary commands, for security reasons, it must be owned by the superuser or the current user, and writable by no one but its owner.
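The ownership and permission rule above can be checked by hand before you rely on an rc file. The function below is a hypothetical sketch of such a check, assuming Unix-style permissions; it restates the rule as documented and is not claimed to be the debugger's own implementation. The temporary file merely stands in for a .perldb file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical check: the file must be owned by root or by the current
# user, and must not be writable by group or others.
sub rc_file_looks_safe {
    my ($path) = @_;
    my @st = stat($path) or return 0;            # unreadable or missing
    my ($mode, $uid) = @st[2, 4];
    return 0 unless $uid == 0 || $uid == $<;     # superuser or current user
    return 0 if $mode & 022;                     # group- or world-writable
    return 1;
}

# Demonstrate on a temporary file standing in for .perldb.
my ($fh, $rc) = tempfile(UNLINK => 1);

chmod 0600, $rc;
my $private = rc_file_looks_safe($rc);

chmod 0666, $rc;
my $shared = rc_file_looks_safe($rc);

print "mode 0600: ", ($private ? "acceptable" : "rejected"), "\n";
print "mode 0666: ", ($shared  ? "acceptable" : "rejected"), "\n";
```

If the debugger appears to ignore your .perldb, a file mode that lets others write to it is a likely cause.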

You can mock TTY input to debugger by adding arbitrary commands to @DB::typeahead. For example, your .perldb file might contain:

  1. sub afterinit { push @DB::typeahead, "b 4", "b 6"; }

This would attempt to set breakpoints on lines 4 and 6 immediately after debugger initialization. Note that @DB::typeahead is not a supported interface and is subject to change in future releases.

If you want to modify the debugger, copy perl5db.pl from the Perl library to another name and hack it to your heart's content. You'll then want to set your PERL5DB environment variable to say something like this:

  1. BEGIN { require "myperl5db.pl" }

As a last resort, you could also use PERL5DB to customize the debugger by directly setting internal variables or calling debugger functions.

Note that any variables and functions that are not documented in this document (or in perldebguts) are considered for internal use only, and as such are subject to change without notice.

Readline Support / History in the Debugger

As shipped, the only command-line history supplied is a simplistic one that checks for leading exclamation points. However, if you install the Term::ReadKey and Term::ReadLine modules from CPAN (such as Term::ReadLine::Gnu, Term::ReadLine::Perl, ...) you will have full editing capabilities much like those GNU readline(3) provides. Look for these in the modules/by-module/Term directory on CPAN. These do not support normal vi command-line editing, however.

A rudimentary command-line completion is also available, including lexical variables in the current scope if the PadWalker module is installed.

Without Readline support you may see the symbols "^[[A", "^[[C", "^[[B", "^[[D", "^H", ... when using the arrow keys and/or the backspace key.

Editor Support for Debugging

If you have the FSF's version of emacs installed on your system, it can interact with the Perl debugger to provide an integrated software development environment reminiscent of its interactions with C debuggers.

Recent versions of Emacs come with a start file for making emacs act like a syntax-directed editor that understands (some of) Perl's syntax. See perlfaq3.

A similar setup by Tom Christiansen for interacting with any vendor-shipped vi and the X11 window system is also available. This works similarly to the integrated multiwindow support that emacs provides, where the debugger drives the editor. At the time of this writing, however, that tool's eventual location in the Perl distribution was uncertain.

Users of vi should also look into vim and gvim, the mousey and windy version, for coloring of Perl keywords.

Note that only perl can truly parse Perl, so all such CASE tools fall somewhat short of the mark, especially if you don't program your Perl as a C programmer might.

The Perl Profiler

If you wish to supply an alternative debugger for Perl to run, invoke your script with a colon and a package argument given to the -d flag. Perl's alternative debuggers include a Perl profiler, Devel::NYTProf, which is available separately as a CPAN distribution. To profile your Perl program in the file mycode.pl, just type:

  1. $ perl -d:NYTProf mycode.pl

When the script terminates, the profiler will create a database of the profile information that you can turn into reports using the profiler's tools. See perlperf for details.

Debugging Regular Expressions

use re 'debug' enables you to see the gory details of how the Perl regular expression engine works. In order to understand this typically voluminous output, one must not only have some idea about how regular expression matching works in general, but also know how Perl's regular expressions are internally compiled into an automaton. These matters are explored in some detail in Debugging Regular Expressions in perldebguts.
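
As a minimal sketch of how the pragma is typically switched on, the example below confines it to a single block so that only one pattern is traced. The sample string and pattern are arbitrary choices for illustration. The trace itself is written to STDERR and is not reproduced here.

```perl
use strict;
use warnings;

my $matched;
{
    use re 'debug';                 # lexically scoped: only this block is traced
    $matched = "foobar" =~ /o+b/;   # compilation and matching details go to STDERR
}

print "matched\n" if $matched;
```

Keeping the pragma inside a small block limits the volume of output to the pattern under investigation.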

Debugging Memory Usage

Perl contains internal support for reporting its own memory usage, but this is a fairly advanced concept that requires some understanding of how memory allocation works. See Debugging Perl Memory Usage in perldebguts for the details.

SEE ALSO

You did try the -w switch, didn't you?

perldebtut, perldebguts, re, DB, Devel::NYTProf, Dumpvalue, and perlrun.

When debugging a script that uses #! and is thus normally found in $PATH, the -S option causes perl to search $PATH for it, so you don't have to type the path or which $scriptname.

  1. $ perl -Sd foo.pl

BUGS

You cannot get stack frame information or in any fashion debug functions that were not compiled by Perl, such as those from C or C++ extensions.

If you alter your @_ arguments in a subroutine (such as with shift or pop), the stack backtrace will not show the original values.

The debugger does not currently work in conjunction with the -W command-line switch, because it itself is not free of warnings.

If you're in a slow syscall (like waiting, accepting, or reading from your keyboard or a socket) and haven't set up your own $SIG{INT} handler, then you won't be able to CTRL-C your way back to the debugger, because the debugger's own $SIG{INT} handler doesn't understand that it needs to raise an exception to longjmp(3) out of slow syscalls.

 
perldelta

NAME

perldelta - what is new for perl v5.18.2

DESCRIPTION

This document describes differences between the 5.18.1 release and the 5.18.2 release.

If you are upgrading from an earlier release such as 5.18.0, first read perl5181delta, which describes differences between 5.18.0 and 5.18.1.

Modules and Pragmata

Updated Modules and Pragmata

  • B has been upgraded from version 1.42_01 to 1.42_02.

    The fix for [perl #118525] introduced a regression in the behaviour of B::CV::GV, changing the return value from a B::SPECIAL object on a NULL CvGV to undef. B::CV::GV again returns a B::SPECIAL object in this case. [perl #119413]

  • B::Concise has been upgraded from version 0.95 to 0.95_01.

    This fixes a bug in dumping unexpected SPECIALs.

  • English has been upgraded from version 1.06 to 1.06_01. This fixes an error about the performance of $`, $&, and $'.

  • File::Glob has been upgraded from version 1.20 to 1.20_01.

Documentation

Changes to Existing Documentation

  • perlrepository has been restored with a pointer to more useful pages.

  • perlhack has been updated with the latest changes from blead.

Selected Bug Fixes

  • Perl 5.18.1 introduced a regression along with a bugfix for lexical subs. Some B::SPECIAL results from B::CV::GV became undefs instead. This broke Devel::Cover among other libraries. This has been fixed. [perl #119351]

  • Perl 5.18.0 introduced a regression whereby [:^ascii:], if used in the same character class as other qualifiers, would fail to match characters in the Latin-1 block. This has been fixed. [perl #120799]

  • Perl 5.18.0 introduced a regression when using ->SUPER::method with AUTOLOAD by looking up AUTOLOAD from the current package, rather than the current package’s superclass. This has been fixed. [perl #120694]

  • Perl 5.18.0 introduced a regression whereby -bareword was no longer permitted under the strict and integer pragmata when used together. This has been fixed. [perl #120288]

  • Previously PerlIOBase_dup didn't check if pushing the new layer succeeded before (optionally) setting the utf8 flag. This could cause segfaults-by-nullpointer. This has been fixed.

  • A buffer overflow with very long identifiers has been fixed.

  • A regression from 5.16 in the handling of padranges led to assertion failures if a keyword plugin declined to handle the second ‘my’, but only after creating a padop.

    This affected, at least, Devel::CallParser under threaded builds.

    This has been fixed.

  • The construct $r=qr/.../; /$r/p is now handled properly, an issue which had been worsened by changes in 5.18.0. [perl #118213]

Acknowledgements

Perl 5.18.2 represents approximately 3 months of development since Perl 5.18.1 and contains approximately 980 lines of changes across 39 files from 4 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.18.2:

Craig A. Berry, David Mitchell, Ricardo Signes, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the AUTHORS file in the Perl source distribution.

Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the perlbug program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of perl -V, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SEE ALSO

The Changes file for an explanation of how to view exhaustive details on what changed.

The INSTALL file for how to build Perl.

The README file for general stuff.

The Artistic and Copying files for copyright information.

 
perldgux

NAME

perldgux - Perl under DG/UX.

SYNOPSIS

One can read this document in the following formats:

  1. man perldgux
  2. view perl perldgux
  3. explorer perldgux.html
  4. info perldgux

to list some (not all may be available simultaneously), or it may be read as is: as README.dgux.

DESCRIPTION

Perl 5.7/8.x for DG/UX ix86 R4.20MU0x

BUILDING PERL ON DG/UX

Non-threaded Perl on DG/UX

Just run the ./Configure script from the top directory, then run "make" to compile.

Threaded Perl on DG/UX

If you are using GCC-2.95.x rev(DG/UX) as your compiler, an easy way to configure perl on your DG/UX machine is to run the command:

./Configure -Dusethreads -Duseithreads -Dusedevel -des

This will automatically accept all the defaults, in particular /usr/local/ as the installation directory. Note that GCC-2.95.x rev(DG/UX) knows the -pthread switch, which allows it to link DG/UX's -lthread library correctly.

If you want to change the installation directory, or you have a standard DG/UX with the C compiler GCC-2.7.2.x, then you have no choice but to do an interactive build by issuing the command:

./Configure -Dusethreads -Duseithreads

In particular with GCC-2.7.2.x accept all the defaults and *watch* out for the message:

  1. Any additional ld flags (NOT including libraries)? [ -pthread]

Enter -lthread here instead of -pthread. The GCC-2.7.2.x that comes with the DG/UX OS does NOT know the -pthread switch, so your build will fail if you accept the default. Once configuration has completed correctly, run "make" to compile.

Testing Perl on DG/UX

Issuing a "make test" will run all the tests. If the test lib/ftmp-security gives you as a result something like

  1. lib/ftmp-security....File::Temp::_gettemp:
  2. Parent directory (/tmp/) is not safe (sticky bit not set
  3. when world writable?) at lib/ftmp-security.t line 100

don't panic; just set the sticky bit on your /tmp directory by running the following as root:

cd /
chmod +t /tmp

Then rerun the tests. This time they should all pass.

Installing the built perl on DG/UX

Run the command "make install".

AUTHOR

Takis Psarogiannakopoulos University of Cambridge Centre for Mathematical Sciences Department of Pure Mathematics Wilberforce road Cambridge CB3 0WB , UK email <takis@XFree86.Org>

SEE ALSO

perl(1).

 
perldiag

NAME

perldiag - various Perl diagnostics

DESCRIPTION

These messages are classified as follows (listed in increasing order of desperation):

  1. (W) A warning (optional).
  2. (D) A deprecation (enabled by default).
  3. (S) A severe warning (enabled by default).
  4. (F) A fatal error (trappable).
  5. (P) An internal error you should never see (trappable).
  6. (X) A very fatal error (nontrappable).
  7. (A) An alien error message (not generated by Perl).

The majority of messages from the first three classifications above (W, D & S) can be controlled using the warnings pragma.

If a message can be controlled by the warnings pragma, its warning category is included with the classification letter in the description below. E.g. (W closed) means a warning in the closed category.

Optional warnings are enabled by using the warnings pragma or the -w and -W switches. Warnings may be captured by setting $SIG{__WARN__} to a reference to a routine that will be called on each warning instead of printing it. See perlvar.
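
A minimal sketch of capturing warnings with $SIG{__WARN__} follows. The @captured array and the sample string are illustrative choices, not part of any Perl interface.

```perl
use strict;
use warnings;

my @captured;

# Collect each warning instead of letting Perl print it to STDERR.
$SIG{__WARN__} = sub { push @captured, $_[0] };

my $text  = "abc";
my $total = $text + 1;    # triggers: Argument "abc" isn't numeric in addition (+)

print "captured: $_" for @captured;
```

The handler receives the complete warning text, including the file and line number, as its first argument.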

Severe warnings are always enabled, unless they are explicitly disabled with the warnings pragma or the -X switch.

Trappable errors may be trapped using the eval operator. See eval. In almost all cases, warnings may be selectively disabled or promoted to fatal errors using the warnings pragma. See warnings.
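
The two mechanisms can be combined, as in this minimal sketch: a warning category is promoted to a fatal error inside a block, and the resulting exception is trapped with eval. The variable names are illustrative.

```perl
use strict;
use warnings;

my $ok = eval {
    use warnings FATAL => 'numeric';   # promote this category to a fatal error
    my $text  = "abc";
    my $total = $text + 1;             # now dies instead of warning
    1;
};
my $err = $@;

print "trapped: $err" unless $ok;
```

Outside the eval block the numeric category goes back to being an ordinary warning, because the pragma is lexically scoped.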

The messages are in alphabetical order, without regard to upper or lower-case. Some of these messages are generic. Spots that vary are denoted with a %s or other printf-style escape. These escapes are ignored by the alphabetical order, as are all characters other than letters. To look up your message, just ignore anything that is not a letter.

  • accept() on closed socket %s

    (W closed) You tried to do an accept on a closed socket. Did you forget to check the return value of your socket() call? See accept.

  • Allocation too large: %x

    (X) You can't allocate more than 64K on an MS-DOS machine.

  • '%c' allowed only after types %s

    (F) The modifiers '!', '<' and '>' are allowed in pack() or unpack() only after certain types. See pack.

  • Ambiguous call resolved as CORE::%s(), qualify as such or use &

    (W ambiguous) A subroutine you have declared has the same name as a Perl keyword, and you have used the name without qualification for calling one or the other. Perl decided to call the builtin because the subroutine is not imported.

    To force interpretation as a subroutine call, either put an ampersand before the subroutine name, or qualify the name with its package. Alternatively, you can import the subroutine (or pretend that it's imported with the use subs pragma).

    To silently interpret it as the Perl operator, use the CORE:: prefix on the operator (e.g. CORE::log($x)) or declare the subroutine to be an object method (see Subroutine Attributes in perlsub or attributes).

  • Ambiguous range in transliteration operator

    (F) You wrote something like tr/a-z-0// which doesn't mean anything at all. To include a - character in a transliteration, put it either first or last. (In the past, tr/a-z-0// was synonymous with tr/a-y//, which was probably not what you would have expected.)

  • Ambiguous use of %s resolved as %s

    (S ambiguous) You said something that may not be interpreted the way you thought. Normally it's pretty easy to disambiguate it by supplying a missing quote, operator, parenthesis pair or declaration.

  • Ambiguous use of %c resolved as operator %c

    (S ambiguous) %, &, and * are both infix operators (modulus, bitwise and, and multiplication) and initial special characters (denoting hashes, subroutines and typeglobs), and you said something like *foo * foo that might be interpreted as either of them. We assumed you meant the infix operator, but please try to make it more clear -- in the example given, you might write *foo * foo() if you really meant to multiply a glob by the result of calling a function.

  • Ambiguous use of %c{%s} resolved to %c%s

    (W ambiguous) You wrote something like @{foo}, which might be asking for the variable @foo, or it might be calling a function named foo, and dereferencing it as an array reference. If you wanted the variable, you can just write @foo. If you wanted to call the function, write @{foo()} ... or you could just not have a variable and a function with the same name, and save yourself a lot of trouble.

  • Ambiguous use of %c{%s[...]} resolved to %c%s[...]
  • Ambiguous use of %c{%s{...}} resolved to %c%s{...}

    (W ambiguous) You wrote something like ${foo[2]} (where foo represents the name of a Perl keyword), which might be looking for element number 2 of the array named @foo, in which case please write $foo[2], or you might have meant to pass an anonymous arrayref to the function named foo, and then do a scalar deref on the value it returns. If you meant that, write ${foo([2])}.

    In regular expressions, the ${foo[2]} syntax is sometimes necessary to disambiguate between array subscripts and character classes. /$length[2345]/, for instance, will be interpreted as $length followed by the character class [2345]. If an array subscript is what you want, you can avoid the warning by changing /${length[2345]}/ to the unsightly /${\$length[2345]}/, by renaming your array to something that does not coincide with a built-in keyword, or by simply turning off warnings with no warnings 'ambiguous';.

  • Ambiguous use of -%s resolved as -&%s()

    (S ambiguous) You wrote something like -foo, which might be the string "-foo", or a call to the function foo, negated. If you meant the string, just write "-foo". If you meant the function call, write -foo().

  • '|' and '<' may not both be specified on command line

    (F) An error peculiar to VMS. Perl does its own command line redirection, and found that STDIN was a pipe, and that you also tried to redirect STDIN using '<'. Only one STDIN stream to a customer, please.

  • '|' and '>' may not both be specified on command line

    (F) An error peculiar to VMS. Perl does its own command line redirection, and thinks you tried to redirect stdout both to a file and into a pipe to another command. You need to choose one or the other, though nothing's stopping you from piping into a program or Perl script which 'splits' output into two streams, such as

    1. open(OUT,">$ARGV[0]") or die "Can't write to $ARGV[0]: $!";
    2. while (<STDIN>) {
    3. print;
    4. print OUT;
    5. }
    6. close OUT;
  • Applying %s to %s will act on scalar(%s)

    (W misc) The pattern match (//), substitution (s///), and transliteration (tr///) operators work on scalar values. If you apply one of them to an array or a hash, it will convert the array or hash to a scalar value (the length of an array, or the population info of a hash) and then work on that scalar value. This is probably not what you meant to do. See grep and map for alternatives.
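
    The grep and map alternatives mentioned above might look like the following minimal sketch. The @fruit array and its contents are made up for illustration.

    ```perl
    use strict;
    use warnings;

    my @fruit = qw(apple banana cherry mango);

    # Applying a match to the array itself would test scalar(@fruit), i.e. "4".
    # To test the elements, iterate over them with grep or map instead.
    my @hits  = grep { /an/ } @fruit;   # the elements that match
    my $count = grep { /an/ } @fruit;   # in scalar context: how many match
    my @upper = map  { uc }   @fruit;   # transform every element

    print "@hits\n";     # banana mango
    print "$count\n";    # 2
    print "@upper\n";    # APPLE BANANA CHERRY MANGO
    ```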

  • Arg too short for msgsnd

    (F) msgsnd() requires a string at least as long as sizeof(long).

  • %s argument is not a HASH or ARRAY element or a subroutine

    (F) The argument to exists() must be a hash or array element or a subroutine with an ampersand, such as:

    1. $foo{$bar}
    2. $ref->{"susie"}[12]
    3. &do_something
  • %s argument is not a HASH or ARRAY element or slice

    (F) The argument to delete() must be either a hash or array element, such as:

    1. $foo{$bar}
    2. $ref->{"susie"}[12]

    or a hash or array slice, such as:

    1. @foo[$bar, $baz, $xyzzy]
    2. @{$ref->[12]}{"susie", "queue"}
  • %s argument is not a subroutine name

    (F) The argument to exists() for exists &sub must be a subroutine name, and not a subroutine call. exists &sub() will generate this error.

  • Argument "%s" isn't numeric%s

    (W numeric) The indicated string was fed as an argument to an operator that expected a numeric value instead. If you're fortunate the message will identify which operator was so unfortunate.

  • Argument list not closed for PerlIO layer "%s"

    (W layer) When pushing a layer with arguments onto the Perl I/O system you forgot the ) that closes the argument list. (Layers take care of transforming data between external and internal representations.) Perl stopped parsing the layer list at this point and did not attempt to push this layer. If your program didn't explicitly request the failing operation, it may be the result of the value of the environment variable PERLIO.

  • Array @%s missing the @ in argument %d of %s()

    (D deprecated) Really old Perl let you omit the @ on array names in some spots. This is now heavily deprecated.

  • A sequence of multiple spaces in a charnames alias definition is deprecated

    (D) You defined a character name which had multiple space characters in a row. Change them to single spaces. Usually these names are defined in the :alias import argument to use charnames, but they could be defined by a translator installed into $^H{charnames}. See CUSTOM ALIASES in charnames.

  • assertion botched: %s

    (X) The malloc package that comes with Perl had an internal failure.

  • Assertion failed: file "%s"

    (X) A general assertion failed. The file in question must be examined.

  • Assigning non-zero to $[ is no longer possible

    (F) When the "array_base" feature is disabled (e.g., under use v5.16;) the special variable $[, which is deprecated, is now a fixed zero value.

  • Assignment to both a list and a scalar

    (F) If you assign to a conditional operator, the 2nd and 3rd arguments must either both be scalars or both be lists. Otherwise Perl won't know which context to supply to the right side.

  • A thread exited while %d threads were running

    (W threads)(S) When using threaded Perl, a thread (not necessarily the main thread) exited while there were still other threads running. Usually it's a good idea first to collect the return values of the created threads by joining them, and only then to exit from the main thread. See threads.

  • Attempt to access disallowed key '%s' in a restricted hash

    (F) The failing code has attempted to get or set a key which is not in the current set of allowed keys of a restricted hash.
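
    A minimal sketch of how this error arises, using lock_keys and unlock_keys from Hash::Util. The %config hash and its keys are invented for the example.

    ```perl
    use strict;
    use warnings;
    use Hash::Util qw(lock_keys unlock_keys);

    my %config = (host => 'localhost', port => 8080);
    lock_keys(%config);        # only 'host' and 'port' are allowed from now on

    $config{port} = 9090;      # fine: the key already exists

    my $err;
    eval { $config{prot} = 1; 1 } or $err = $@;   # misspelled key is fatal
    print "trapped: $err" if $err;

    unlock_keys(%config);      # lift the restriction again
    $config{timeout} = 30;     # new keys are accepted once more
    ```

    Restricting the key set this way turns a silent typo in a hash key into an error that can be trapped with eval.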

  • Attempt to bless into a reference

    (F) The CLASSNAME argument to the bless() operator is expected to be the name of the package to bless the resulting object into. You've supplied instead a reference to something: perhaps you wrote

    1. bless $self, $proto;

    when you intended

    1. bless $self, ref($proto) || $proto;

    If you actually want to bless into the stringified version of the reference supplied, you need to stringify it yourself, for example by:

    1. bless $self, "$proto";
  • Attempt to clear deleted array

    (S debugging) An array was assigned to when it was being freed. Freed values are not supposed to be visible to Perl code. This can also happen if XS code calls av_clear from a custom magic callback on the array.

  • Attempt to delete disallowed key '%s' from a restricted hash

    (F) The failing code attempted to delete from a restricted hash a key which is not in its key set.

  • Attempt to delete readonly key '%s' from a restricted hash

    (F) The failing code attempted to delete a key whose value has been declared readonly from a restricted hash.

  • Attempt to free non-arena SV: 0x%x

    (S internal) All SV objects are supposed to be allocated from arenas that will be garbage collected on exit. An SV was discovered to be outside any of those arenas.

  • Attempt to free nonexistent shared string '%s'%s

    (S internal) Perl maintains a reference-counted internal table of strings to optimize the storage and access of hash keys and other strings. This indicates someone tried to decrement the reference count of a string that can no longer be found in the table.

  • Attempt to free temp prematurely: SV 0x%x

    (S debugging) Mortalized values are supposed to be freed by the free_tmps() routine. This indicates that something else is freeing the SV before the free_tmps() routine gets a chance, which means that the free_tmps() routine will be freeing an unreferenced scalar when it does try to free it.

  • Attempt to free unreferenced glob pointers

    (S internal) The reference counts got screwed up on symbol aliases.

  • Attempt to free unreferenced scalar: SV 0x%x

    (S internal) Perl went to decrement the reference count of a scalar to see if it would go to 0, and discovered that it had already gone to 0 earlier, and should have been freed, and in fact, probably was freed. This could indicate that SvREFCNT_dec() was called too many times, or that SvREFCNT_inc() was called too few times, or that the SV was mortalized when it shouldn't have been, or that memory has been corrupted.

  • Attempt to join self

    (F) You tried to join a thread from within itself, which is an impossible task. You may be joining the wrong thread, or you may need to move the join() to some other thread.

  • Attempt to pack pointer to temporary value

    (W pack) You tried to pass a temporary value (like the result of a function, or a computed expression) to the "p" pack() template. This means the result contains a pointer to a location that could become invalid anytime, even before the end of the current statement. Use literals or global values as arguments to the "p" pack() template to avoid this warning.

  • Attempt to reload %s aborted.

    (F) You tried to load a file with use or require that failed to compile once already. Perl will not try to compile this file again unless you delete its entry from %INC. See require and %INC in perlvar.
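
    The sketch below reproduces the situation with a throwaway module and then recovers by deleting the %INC entry. The module name Flaky and the temporary directory are hypothetical and exist only for this example.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $dir  = tempdir(CLEANUP => 1);
    my $file = "$dir/Flaky.pm";          # hypothetical module
    unshift @INC, $dir;

    # The first version has a syntax error (missing closing brace).
    open my $fh, '>', $file or die "Can't write $file: $!";
    print {$fh} "package Flaky; sub answer { 42 \n";
    close $fh;

    my $first_ok  = eval { require Flaky; 1 };   # fails to compile
    my $again_ok  = eval { require Flaky; 1 };   # refused: already failed once
    my $again_err = $@;
    print "second attempt: $again_err" unless $again_ok;

    # Repair the file, forget the failed attempt, and load it again.
    open $fh, '>', $file or die "Can't write $file: $!";
    print {$fh} "package Flaky; sub answer { 42 } 1;\n";
    close $fh;

    delete $INC{'Flaky.pm'};
    require Flaky;
    print Flaky::answer(), "\n";
    ```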

  • Attempt to set length of freed array

    (W misc) You tried to set the length of an array which has been freed. You can do this by storing a reference to the scalar representing the last index of an array and later assigning through that reference. For example

    1. $r = do {my @a; \$#a};
    2. $$r = 503
  • Attempt to use reference as lvalue in substr

    (W substr) You supplied a reference as the first argument to substr() used as an lvalue, which is pretty strange. Perhaps you forgot to dereference it first. See substr.

  • Attribute "locked" is deprecated

    (D deprecated) You have used the attributes pragma to modify the "locked" attribute on a code reference. The :locked attribute is obsolete, has had no effect since 5005 threads were removed, and will be removed in a future release of Perl 5.

  • Attribute "unique" is deprecated

    (D deprecated) You have used the attributes pragma to modify the "unique" attribute on an array, hash or scalar reference. The :unique attribute has had no effect since Perl 5.8.8, and will be removed in a future release of Perl 5.

  • av_reify called on tied array

    (S debugging) This indicates that something went wrong and Perl got very confused about @_ or @DB::args being tied.

  • Bad arg length for %s, is %u, should be %d

    (F) You passed a buffer of the wrong size to one of msgctl(), semctl() or shmctl(). In C parlance, the correct sizes are, respectively, sizeof(struct msqid_ds *), sizeof(struct semid_ds *), and sizeof(struct shmid_ds *).

  • Bad evalled substitution pattern

    (F) You've used the /e switch to evaluate the replacement for a substitution, but perl found a syntax error in the code to evaluate, most likely an unexpected right brace '}'.

  • Bad filehandle: %s

    (F) A symbol was passed to something wanting a filehandle, but the symbol has no filehandle associated with it. Perhaps you didn't do an open(), or did it in another package.

  • Bad free() ignored

    (S malloc) An internal routine called free() on something that had never been malloc()ed in the first place. Mandatory, but can be disabled by setting environment variable PERL_BADFREE to 0.

    This message can be seen quite often with DB_File on systems with "hard" dynamic linking, like AIX and OS/2 . It is a bug of Berkeley DB which is left unnoticed if DB uses forgiving system malloc().

  • Bad hash

    (P) One of the internal hash routines was passed a null HV pointer.

  • Badly placed ()'s

    (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself.

  • Bad name after %s

    (F) You started to name a symbol by using a package prefix, and then didn't finish the symbol. In particular, you can't interpolate outside of quotes, so

    1. $var = 'myvar';
    2. $sym = mypack::$var;

    is not the same as

    1. $var = 'myvar';
    2. $sym = "mypack::$var";
  • Bad plugin affecting keyword '%s'

    (F) An extension using the keyword plugin mechanism violated the plugin API.

  • Bad realloc() ignored

    (S malloc) An internal routine called realloc() on something that had never been malloc()ed in the first place. Mandatory, but can be disabled by setting the environment variable PERL_BADFREE to 1.

  • Bad symbol for array

    (P) An internal request asked to add an array entry to something that wasn't a symbol table entry.

  • Bad symbol for dirhandle

    (P) An internal request asked to add a dirhandle entry to something that wasn't a symbol table entry.

  • Bad symbol for filehandle

    (P) An internal request asked to add a filehandle entry to something that wasn't a symbol table entry.

  • Bad symbol for hash

    (P) An internal request asked to add a hash entry to something that wasn't a symbol table entry.

  • Bareword found in conditional

    (W bareword) The compiler found a bareword where it expected a conditional, which often indicates that an || or && was parsed as part of the last argument of the previous construct, for example:

    1. open FOO || die;

    It may also indicate a misspelled constant that has been interpreted as a bareword:

    1. use constant TYPO => 1;
    2. if (TYOP) { print "foo" }

    The strict pragma is useful in avoiding such errors.

  • Bareword "%s" not allowed while "strict subs" in use

    (F) With "strict subs" in use, a bareword is only allowed as a subroutine identifier, in curly brackets or to the left of the "=>" symbol. Perhaps you need to predeclare a subroutine?

  • Bareword "%s" refers to nonexistent package

    (W bareword) You used a qualified bareword of the form Foo::, but the compiler saw no other uses of that namespace before that point. Perhaps you need to predeclare a package?

  • BEGIN failed--compilation aborted

    (F) An untrapped exception was raised while executing a BEGIN subroutine. Compilation stops immediately and the interpreter is exited.

  • BEGIN not safe after errors--compilation aborted

    (F) Perl found a BEGIN {} subroutine (or a use directive, which implies a BEGIN {}) after one or more compilation errors had already occurred. Since the intended environment for the BEGIN {} could not be guaranteed (due to the errors), and since subsequent code likely depends on its correct operation, Perl just gave up.

  • \1 better written as $1

    (W syntax) Outside of patterns, backreferences live on as variables. The use of backslashes is grandfathered on the right-hand side of a substitution, but stylistically it's better to use the variable form because other Perl programmers will expect it, and it works better if there are more than 9 backreferences.
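
    A minimal sketch of the preferred variable form on the right-hand side of a substitution (the sample string is made up for illustration):

```perl
use strict;
use warnings;

my $name = 'Wall Larry';

# $1 and $2 hold the captured groups on the replacement side.
(my $swapped = $name) =~ s/(\w+) (\w+)/$2 $1/;
print "$swapped\n";
```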

  • Binary number > 0b11111111111111111111111111111111 non-portable

    (W portable) The binary number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See perlport for more on portability concerns.

  • bind() on closed socket %s

    (W closed) You tried to do a bind on a closed socket. Did you forget to check the return value of your socket() call? See bind.

  • binmode() on closed filehandle %s

    (W unopened) You tried binmode() on a filehandle that was never opened. Check your control flow and number of arguments.

  • "\b{" is deprecated; use "\b\{" or "\b[{]" instead in regex; marked by <-- HERE in m/%s/
  • "\B{" is deprecated; use "\B\{" or "\B[{]" instead in regex; marked by <-- HERE in m/%s/

    (W deprecated) Use of an unescaped "{" immediately following a \b or \B is now deprecated so as to reserve its use for Perl itself in a future release. You can either precede the brace with a backslash, or enclose it in square brackets; the latter is the way to go if the pattern delimiters are {}.

  • Bit vector size > 32 non-portable

    (W portable) Using bit vector sizes larger than 32 is non-portable.

  • Bizarre copy of %s

    (P) Perl detected an attempt to copy an internal value that is not copiable.

  • Buffer overflow in prime_env_iter: %s

    (W internal) A warning peculiar to VMS. While Perl was preparing to iterate over %ENV, it encountered a logical name or symbol definition which was too long, so it was truncated to the string shown.

  • Bizarre SvTYPE [%d]

    (P) When starting a new thread or returning values from a thread, Perl encountered an invalid data type.

  • Callback called exit

    (F) A subroutine invoked from an external package via call_sv() exited by calling exit.

  • %s() called too early to check prototype

    (W prototype) You've called a function that has a prototype before the parser saw a definition or declaration for it, and Perl could not check that the call conforms to the prototype. You need to either add an early prototype declaration for the subroutine in question, or move the subroutine definition ahead of the call to get proper prototype checking. Alternatively, if you are certain that you're calling the function correctly, you may put an ampersand before the name to avoid the warning. See perlsub.
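
    One way to apply the first remedy is an early prototype declaration, sketched below. The subroutine name largest and its data are hypothetical.

```perl
use strict;
use warnings;

# Forward declaration: the parser now knows the prototype before the call.
sub largest(\@);

my @numbers = (3, 9, 4);

# Because of the (\@) prototype, @numbers is passed as one array reference.
my $max = largest @numbers;
print "$max\n";

# The definition can come after the call.
sub largest(\@) {
    my ($list) = @_;
    my $best = $list->[0];
    for my $n (@$list) {
        $best = $n if $n > $best;
    }
    return $best;
}
```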

  • Cannot compress integer in pack

    (F) An argument to pack("w",...) was too large to compress. The BER compressed integer format can only be used with positive integers, and you attempted to compress Infinity or a very large number (> 1e308). See pack.

  • Cannot compress negative numbers in pack

    (F) An argument to pack("w",...) was negative. The BER compressed integer format can only be used with positive integers. See pack.
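
    The following sketch shows the "w" format accepting a non-negative integer and rejecting a negative one. The values are arbitrary, and eval is used only to trap the fatal error for inspection.

```perl
use strict;
use warnings;

# 300 = 2 * 128 + 44, so the BER encoding is the two bytes 0x82 0x2c.
my $encoded = unpack 'H*', pack('w', 300);
print "$encoded\n";

# A negative argument is fatal, so trap it with eval.
my $negative = -1;
my $packed = eval { pack('w', $negative) };
my $pack_error = $@;
print "pack failed: $pack_error" unless defined $packed;
```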

  • Cannot convert a reference to %s to typeglob

    (F) You manipulated Perl's symbol table directly, stored a reference in it, then tried to access that symbol via conventional Perl syntax. The access triggers Perl to autovivify that typeglob, but there is no legal conversion from that type of reference to a typeglob.

  • Cannot copy to %s

    (P) Perl detected an attempt to copy a value to an internal type that cannot be directly assigned to.

  • Cannot find encoding "%s"

    (S io) You tried to apply an encoding that did not exist to a filehandle, either with open() or binmode().

  • Cannot set tied @DB::args

    (F) caller tried to set @DB::args, but found it tied. Tying @DB::args is not supported. (Before this error was added, it used to crash.)

  • Cannot tie unreifiable array

    (P) You somehow managed to call tie on an array that does not keep a reference count on its arguments and cannot be made to do so. Such arrays are not even supposed to be accessible to Perl code, but are only used internally.

  • Can only compress unsigned integers in pack

    (F) An argument to pack("w",...) was not an integer. The BER compressed integer format can only be used with positive integers, and you attempted to compress something else. See pack.

  • Can't bless non-reference value

    (F) Only hard references may be blessed. This is how Perl "enforces" encapsulation of objects. See perlobj.

  • Can't "break" in a loop topicalizer

    (F) You called break, but you're in a foreach block rather than a given block. You probably meant to use next or last.

  • Can't "break" outside a given block

    (F) You called break, but you're not inside a given block.

  • Can't call method "%s" on an undefined value

    (F) You used the syntax of a method call, but the slot filled by the object reference or package name contains an undefined value. Something like this will reproduce the error:

        $BADREF = undef;
        process $BADREF 1,2,3;
        $BADREF->process(1,2,3);

  • Can't call method "%s" on unblessed reference

    (F) A method call must know in what package it's supposed to run. It ordinarily finds this out from the object reference you supply, but you didn't supply an object reference in this case. A reference isn't an object reference until it has been blessed. See perlobj.
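
    As a sketch of how code might guard against this error, the example below checks a reference with blessed from Scalar::Util before calling a method on it. The Counter class is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical class, defined only for this illustration.
{
    package Counter;
    sub new   { my ($class) = @_; return bless { count => 0 }, $class }
    sub count { my ($self) = @_; return $self->{count} }
}

my $plain  = { count => 0 };   # a reference, but not blessed
my $object = Counter->new;     # blessed into Counter

for my $candidate ($plain, $object) {
    if (blessed $candidate) {
        print "count is ", $candidate->count, "\n";
    }
    else {
        print "not an object, so no method call is attempted\n";
    }
}

# Calling a method on the unblessed reference is fatal; eval traps it.
my $method_error = eval { $plain->count; 1 } ? '' : $@;
print $method_error;
```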

  • Can't call method "%s" without a package or object reference

    (F) You used the syntax of a method call, but the slot filled by the object reference or package name contains an expression that returns a defined value which is neither an object reference nor a package name. Something like this will reproduce the error:

        $BADREF = 42;
        process $BADREF 1,2,3;
        $BADREF->process(1,2,3);

  • Can't chdir to %s

    (F) You called perl -x/foo/bar, but /foo/bar is not a directory that you can chdir to, possibly because it doesn't exist.

  • Can't check filesystem of script "%s" for nosuid

    (P) For some reason you can't check the filesystem of the script for nosuid.

  • Can't coerce %s to %s in %s

    (F) Certain types of SVs, in particular real symbol table entries (typeglobs), can't be forced to stop being what they are. So you can't say things like:

        *foo += 1;

    You CAN say

        $foo = *foo;
        $foo += 1;

    but then $foo no longer contains a glob.

  • Can't "continue" outside a when block

    (F) You called continue, but you're not inside a when or default block.

  • Can't create pipe mailbox

    (P) An error peculiar to VMS. The process is suffering from exhausted quotas or other plumbing problems.

  • Can't declare %s in "%s"

    (F) Only scalar, array, and hash variables may be declared as "my", "our" or "state" variables. They must have ordinary identifiers as names.

  • Can't "default" outside a topicalizer

    (F) You have used a default block that is neither inside a foreach loop nor a given block. (Note that this error is issued on exit from the default block, so you won't get the error if you use an explicit continue.)

  • Can't do inplace edit: %s is not a regular file

    (S inplace) You tried to use the -i switch on a special file, such as a file in /dev, a FIFO or an uneditable directory. The file was ignored.

  • Can't do inplace edit on %s: %s

    (S inplace) The creation of the new file failed for the indicated reason.

  • Can't do inplace edit without backup

    (F) You're on a system such as MS-DOS that gets confused if you try reading from a deleted (but still opened) file. You have to say -i.bak, or some such.

  • Can't do inplace edit: %s would not be unique

    (S inplace) Your filesystem does not support filenames longer than 14 characters and Perl was unable to create a unique filename during inplace editing with the -i switch. The file was ignored.

  • Can't do waitpid with flags

    (F) This machine doesn't have either waitpid() or wait4(), so only waitpid() without flags is emulated.

  • Can't emulate -%s on #! line

    (F) The #! line specifies a switch that doesn't make sense at this point. For example, it'd be kind of silly to put a -x on the #! line.

  • Can't %s %s-endian %ss on this platform

    (F) Your platform's byte-order is neither big-endian nor little-endian, or it has a very strange pointer size. Packing and unpacking big- or little-endian floating point values and pointers may not be possible. See pack.

  • Can't exec "%s": %s

    (W exec) A system(), exec(), or piped open call could not execute the named program for the indicated reason. Typical reasons include: the permissions were wrong on the file, the file wasn't found in $ENV{PATH}, the executable in question was compiled for another architecture, or the #! line in a script points to an interpreter that can't be run for similar reasons. (Or maybe your system doesn't support #! at all.)

  • Can't exec %s

    (F) Perl was trying to execute the indicated program for you because that's what the #! line said. If that's not what you wanted, you may need to mention "perl" on the #! line somewhere.

  • Can't execute %s

    (F) You used the -S switch, but the copies of the script to execute found in the PATH did not have correct permissions.

  • Can't find an opnumber for "%s"

    (F) A string of the form CORE::word was given to prototype(), but there is no builtin with the name word.

  • Can't find %s character property "%s"

    (F) You used \p{} or \P{} but the character property by that name could not be found. Maybe you misspelled the name of the property? See Properties accessible through \p{} and \P{} in perluniprops for a complete list of available official properties.

  • Can't find label %s

    (F) You said to goto a label that isn't mentioned anywhere that it's possible for us to go to. See goto.

  • Can't find %s on PATH

    (F) You used the -S switch, but the script to execute could not be found in the PATH.

  • Can't find %s on PATH, '.' not in PATH

    (F) You used the -S switch, but the script to execute could not be found in the PATH, or at least not with the correct permissions. The script exists in the current directory, but PATH prohibits running it.

  • Can't find string terminator %s anywhere before EOF

    (F) Perl strings can stretch over multiple lines. This message means that the closing delimiter was omitted. Because bracketed quotes count nesting levels, the following is missing its final parenthesis:

        print q(The character '(' starts a side comment.);

    If you're getting this error from a here-document, you may have included unseen whitespace before or after your closing tag or there may not be a linebreak after it. A good programmer's editor will have a way to help you find these characters (or lack of characters). See perlop for the full details on here-documents.
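
    One possible way around the nesting problem in the q() example above, sketched here, is to pick a delimiter that does not occur unbalanced in the text:

```perl
use strict;
use warnings;

# Braces as delimiters: the lone '(' inside no longer affects nesting.
my $with_braces = q{The character '(' starts a side comment.};

# Balanced parentheses nest correctly inside q().
my $balanced = q(Nested (parentheses) are fine when they pair up.);

print "$with_braces\n$balanced\n";
```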

  • Can't find Unicode property definition "%s"

    (F) You may have tried to use \p which means a Unicode property (for example \p{Lu} matches all uppercase letters). If you did mean to use a Unicode property, see Properties accessible through \p{} and \P{} in perluniprops for a complete list of available properties. If you didn't mean to use a Unicode property, escape the \p, either by \\p (just the \p) or by \Q\p (the rest of the string, or until \E).

  • Can't fork: %s

    (F) A fatal error occurred while trying to fork while opening a pipeline.

  • Can't fork, trying again in 5 seconds

    (W pipe) A fork in a piped open failed with EAGAIN and will be retried after five seconds.

  • Can't get filespec - stale stat buffer?

    (S) A warning peculiar to VMS. This arises because of the difference between access checks under VMS and under the Unix model Perl assumes. Under VMS, access checks are done by filename, rather than by bits in the stat buffer, so that ACLs and other protections can be taken into account. Unfortunately, Perl assumes that the stat buffer contains all the necessary information, and passes it, instead of the filespec, to the access-checking routine. It will try to retrieve the filespec using the device name and FID present in the stat buffer, but this works only if you haven't made a subsequent call to the CRTL stat() routine, because the device name is overwritten with each call. If this warning appears, the name lookup failed, and the access-checking routine gave up and returned FALSE, just to be conservative. (Note: The access-checking routine knows about the Perl stat operator and file tests, so you shouldn't ever see this warning in response to a Perl command; it arises only if some internal code takes stat buffers lightly.)

  • Can't get pipe mailbox device name

    (P) An error peculiar to VMS. After creating a mailbox to act as a pipe, Perl can't retrieve its name for later use.

  • Can't get SYSGEN parameter value for MAXBUF

    (P) An error peculiar to VMS. Perl asked $GETSYI how big you want your mailbox buffers to be, and didn't get an answer.

  • Can't "goto" into the middle of a foreach loop

    (F) A "goto" statement was executed to jump into the middle of a foreach loop. You can't get there from here. See goto.

  • Can't "goto" out of a pseudo block

    (F) A "goto" statement was executed to jump out of what might look like a block, except that it isn't a proper block. This usually occurs if you tried to jump out of a sort() block or subroutine, which is a no-no. See goto.

  • Can't goto subroutine from a sort sub (or similar callback)

    (F) The "goto subroutine" call can't be used to jump out of the comparison sub for a sort(), or from a similar callback (such as the reduce() function in List::Util).

  • Can't goto subroutine from an eval-%s

    (F) The "goto subroutine" call can't be used to jump out of an eval "string" or block.

  • Can't goto subroutine outside a subroutine

    (F) The deeply magical "goto subroutine" call can only replace one subroutine call for another. It can't manufacture one out of whole cloth. In general you should be calling it out of only an AUTOLOAD routine anyway. See goto.

  • Can't ignore signal CHLD, forcing to default

    (W signal) Perl has detected that it is being run with the SIGCHLD signal (sometimes known as SIGCLD) disabled. Since disabling this signal will interfere with proper determination of exit status of child processes, Perl has reset the signal to its default value. This situation typically indicates that the parent program under which Perl may be running (e.g. cron) is being very careless.

  • Can't kill a non-numeric process ID

    (F) Process identifiers must be (signed) integers. It is a fatal error to attempt to kill() an undefined, empty-string or otherwise non-numeric process identifier.

  • Can't "last" outside a loop block

    (F) A "last" statement was executed to break out of the current block, except that there's this itty bitty problem called there isn't a current block. Note that an "if" or "else" block doesn't count as a "loopish" block, as doesn't a block given to sort(), map() or grep(). You can usually double the curlies to get the same effect though, because the inner curlies will be considered a block that loops once. See last.
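
    The doubled-curlies technique mentioned above can be sketched as follows. The variable names and values are illustrative only.

```perl
use strict;
use warnings;

my @log;
my $value = 7;

# The outer braces belong to the if; the inner braces form a bare
# block, which counts as a loop that runs once, so 'last' is legal.
if ($value > 5) {{
    push @log, 'checked';
    last if $value == 7;      # leaves the inner bare block
    push @log, 'not reached';
}}
push @log, 'done';

print "@log\n";
```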

  • Can't linearize anonymous symbol table

    (F) Perl tried to calculate the method resolution order (MRO) of a package, but failed because the package stash has no name.

  • Can't load '%s' for module %s

    (F) The module you tried to load failed to load a dynamic extension. This may either mean that you upgraded your version of perl to one that is incompatible with your old dynamic extensions (which is known to happen between major versions of perl), or (more likely) that your dynamic extension was built against an older version of the library that is installed on your system. You may need to rebuild your old dynamic extensions.

  • Can't localize lexical variable %s

    (F) You used local on a variable name that was previously declared as a lexical variable using "my" or "state". This is not allowed. If you want to localize a package variable of the same name, qualify it with the package name.

  • Can't localize through a reference

    (F) You said something like local $$ref, which Perl can't currently handle, because when it goes to restore the old value of whatever $ref pointed to after the scope of the local() is finished, it can't be sure that $ref will still be a reference.

  • Can't locate %s

    (F) You said to do (or require, or use) a file that couldn't be found. Perl looks for the file in all the locations mentioned in @INC, unless the file name included the full path to the file. Perhaps you need to set the PERL5LIB or PERL5OPT environment variable to say where the extra library is, or maybe the script needs to add the library name to @INC. Or maybe you just misspelled the name of the file. See require and lib.
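
    A minimal sketch of extending @INC with the lib pragma and of trapping a failed require. Both the directory and the module name are hypothetical; the module is assumed not to be installed.

```perl
use strict;
use warnings;

# Hypothetical directory; 'use lib' prepends it to @INC at compile time.
use lib '/opt/myapp/lib';

# Hypothetical module name, assumed not to be installed anywhere.
my $loaded = eval { require No::Such::Module::ForThisExample; 1 };
my $load_error = $@;
print "require failed: $load_error" unless $loaded;
```

    Wrapping require in eval is useful mainly for optional dependencies; for required ones, letting the error propagate is usually clearer.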

  • Can't locate auto/%s.al in @INC

    (F) A function (or method) was called in a package which allows autoload, but there is no function to autoload. Most probable causes are a misprint in a function/method name or a failure to AutoSplit the file, say, by doing make install.

  • Can't locate loadable object for module %s in @INC

    (F) The module you loaded is trying to load an external library, such as foo.so or bar.dll, but the DynaLoader module was unable to locate this library. See DynaLoader.

  • Can't locate object method "%s" via package "%s"

    (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor do any of its base classes. See perlobj.

  • Can't locate package %s for @%s::ISA

    (W syntax) The @ISA array contained the name of another package that doesn't seem to exist.

  • Can't locate PerlIO%s

    (F) You tried to use in open() a PerlIO layer that does not exist, e.g. open(FH, ">:nosuchlayer", "somefile").

  • Can't make list assignment to %ENV on this system

    (F) List assignment to %ENV is not supported on some systems, notably VMS.

  • Can't make loaded symbols global on this platform while loading %s

    (W) A module passed the flag 0x01 to DynaLoader::dl_load_file() to request that symbols from the stated file are made available globally within the process, but that functionality is not available on this platform. Whilst the module likely will still work, this may prevent the perl interpreter from loading other XS-based extensions which need to link directly to functions defined in the C or XS code in the stated file.

  • Can't modify %s in %s

    (F) You aren't allowed to assign to the item indicated, or otherwise try to change it, such as with an auto-increment.

  • Can't modify nonexistent substring

    (P) The internal routine that does assignment to a substr() was handed a NULL.

  • Can't modify non-lvalue subroutine call

    (F) Subroutines meant to be used in lvalue context should be declared as such. See Lvalue subroutines in perlsub.

  • Can't msgrcv to read-only var

    (F) The target of a msgrcv must be modifiable to be used as a receive buffer.

  • Can't "next" outside a loop block

    (F) A "next" statement was executed to reiterate the current block, but there isn't a current block. Note that an "if" or "else" block doesn't count as a "loopish" block, as doesn't a block given to sort(), map() or grep(). You can usually double the curlies to get the same effect though, because the inner curlies will be considered a block that loops once. See next.

  • Can't open %s

    (F) You tried to run a perl built with MAD support with the PERL_XMLDUMP environment variable set, but the file named by that variable could not be opened.

  • Can't open %s: %s

    (S inplace) The implicit opening of a file through use of the <> filehandle, either implicitly under the -n or -p command-line switches, or explicitly, failed for the indicated reason. Usually this is because you don't have read permission for a file which you named on the command line.

    (F) You tried to call perl with the -e switch, but /dev/null (or your operating system's equivalent) could not be opened.

  • Can't open a reference

    (W io) You tried to open a scalar reference for reading or writing, using the 3-arg open() syntax:

        open FH, '>', $ref;

    but your version of perl is compiled without perlio, and this form of open is not supported.

  • Can't open bidirectional pipe

    (W pipe) You tried to say open(CMD, "|cmd|"), which is not supported. You can try any of several modules in the Perl library to do this, such as IPC::Open2. Alternatively, direct the pipe's output to a file using ">", and then read it in under a different file handle.
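
    A rough sketch of the IPC::Open2 approach follows. As an assumption made for portability, the child process is the running perl binary ($^X) with a one-line program that upper-cases its input; any other filter command could take its place.

```perl
use strict;
use warnings;
use IPC::Open2;

# $^X is the running perl binary; the child upper-cases each input line.
my $pid = open2(my $from_child, my $to_child, $^X, '-pe', '$_ = uc');

print {$to_child} "hello\n";
close $to_child;                 # signal end of input to the child

my $reply = <$from_child>;
waitpid $pid, 0;                 # reap the child
print $reply;
```

    Closing the write handle before reading avoids the situation where both processes wait on each other, which is a known risk with bidirectional pipes.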

  • Can't open error file %s as stderr

    (F) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the file specified after '2>' or '2>>' on the command line for writing.

  • Can't open input file %s as stdin

    (F) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the file specified after '<' on the command line for reading.

  • Can't open output file %s as stdout

    (F) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the file specified after '>' or '>>' on the command line for writing.

  • Can't open output pipe (name: %s)

    (P) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the pipe into which to send data destined for stdout.

  • Can't open perl script "%s": %s

    (F) The script you specified can't be opened for the indicated reason.

    If you're debugging a script that uses #!, and normally relies on the shell's $PATH search, the -S option causes perl to do that search, so you don't have to type the path or `which $scriptname`.

  • Can't read CRTL environ

    (S) A warning peculiar to VMS. Perl tried to read an element of %ENV from the CRTL's internal environment array and discovered the array was missing. You need to figure out where your CRTL misplaced its environ or define PERL_ENV_TABLES (see perlvms) so that environ is not searched.

  • Can't "redo" outside a loop block

    (F) A "redo" statement was executed to restart the current block, but there isn't a current block. Note that an "if" or "else" block doesn't count as a "loopish" block, as doesn't a block given to sort(), map() or grep(). You can usually double the curlies to get the same effect though, because the inner curlies will be considered a block that loops once. See redo.

  • Can't remove %s: %s, skipping file

    (S inplace) You requested an inplace edit without creating a backup file. Perl was unable to remove the original file to replace it with the modified file. The file was left unmodified.

  • Can't rename %s to %s: %s, skipping file

    (S inplace) The rename done by the -i switch failed for some reason, probably because you don't have write permission to the directory.

  • Can't reopen input pipe (name: %s) in binary mode

    (P) An error peculiar to VMS. Perl thought stdin was a pipe, and tried to reopen it to accept binary data. Alas, it failed.

  • Can't reset %ENV on this system

    (F) You called reset('E') or similar, which tried to reset all variables in the current package beginning with "E". In the main package, that includes %ENV. Resetting %ENV is not supported on some systems, notably VMS.

  • Can't resolve method "%s" overloading "%s" in package "%s"

    (F)(P) Error resolving overloading specified by a method name (as opposed to a subroutine reference): no such method callable via the package. If the method name is ???, this is an internal error.

  • Can't return %s from lvalue subroutine

    (F) Perl detected an attempt to return illegal lvalues (such as temporary or readonly values) from a subroutine used as an lvalue. This is not allowed.

  • Can't return outside a subroutine

    (F) The return statement was executed in mainline code, that is, where there was no subroutine call to return out of. See perlsub.

  • Can't return %s to lvalue scalar context

    (F) You tried to return a complete array or hash from an lvalue subroutine, but you called the subroutine in a way that made Perl think you meant to return only one value. You probably meant to write parentheses around the call to the subroutine, which tell Perl that the call should be in list context.

  • Can't stat script "%s"

    (P) For some reason you can't fstat() the script even though you have it open already. Bizarre.

  • Can't take log of %g

    (F) For ordinary real numbers, you can't take the logarithm of a negative number or zero. There's a Math::Complex package that comes standard with Perl, though, if you really want to do that for the negative numbers.

  • Can't take sqrt of %g

    (F) For ordinary real numbers, you can't take the square root of a negative number. There's a Math::Complex package that comes standard with Perl, though, if you really want to do that.
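
    A short sketch of the Math::Complex route mentioned above, assuming its default exports (which include sqrt, Re and Im). The built-in remains reachable as CORE::sqrt and still raises the fatal error.

```perl
use strict;
use warnings;
use Math::Complex;   # overrides sqrt() and log() in this package

my $root = sqrt(-4);             # a complex result instead of a fatal error
print "sqrt(-4) = $root\n";

# The built-in is still reachable as CORE::sqrt and still dies.
my $n = -4;
my $real_root = eval { CORE::sqrt($n) };
my $sqrt_error = $@;
print "built-in failed: $sqrt_error" unless defined $real_root;
```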

  • Can't undef active subroutine

    (F) You can't undefine a routine that's currently running. You can, however, redefine it while it's running, and you can even undef the redefined subroutine while the old routine is running. Go figure.

  • Can't upgrade %s (%d) to %d

    (P) The internal sv_upgrade routine adds "members" to an SV, making it into a more specialized kind of SV. The top several SV types are so specialized, however, that they cannot be interconverted. This message indicates that such a conversion was attempted.

  • Can't use '%c' after -mname

    (F) You tried to call perl with the -m switch, but you put something other than "=" after the module name.

  • Can't use anonymous symbol table for method lookup

    (F) The internal routine that does method lookup was handed a symbol table that doesn't have a name. Symbol tables can become anonymous, for example, by undefining stashes: undef %Some::Package::.

  • Can't use an undefined value as %s reference

    (F) A value used as either a hard reference or a symbolic reference must be a defined value. This helps to delurk some insidious errors.

  • Can't use bareword ("%s") as %s ref while "strict refs" in use

    (F) Only hard references are allowed by "strict refs". Symbolic references are disallowed. See perlref.

  • Can't use %! because Errno.pm is not available

    (F) The first time the %! hash is used, perl automatically loads the Errno.pm module. The Errno module is expected to tie the %! hash to provide symbolic names for $! errno values.

  • Can't use both '<' and '>' after type '%c' in %s

    (F) A type cannot be forced to have both big-endian and little-endian byte-order at the same time, so this combination of modifiers is not allowed. See pack.

  • Can't use %s for loop variable

    (F) Only a simple scalar variable may be used as a loop variable on a foreach.

  • Can't use global %s in "%s"

    (F) You tried to declare a magical variable as a lexical variable. This is not allowed, because the magic can be tied to only one location (namely the global variable) and it would be incredibly confusing to have variables in your program that looked like magical variables but weren't.

  • Can't use '%c' in a group with different byte-order in %s

    (F) You attempted to force a different byte-order on a type that is already inside a group with a byte-order modifier. For example you cannot force little-endianness on a type that is inside a big-endian group.

  • Can't use "my %s" in sort comparison

    (F) The global variables $a and $b are reserved for sort comparisons. You mentioned $a or $b in the same line as the <=> or cmp operator, and the variable had earlier been declared as a lexical variable. Either qualify the sort variable with the package name, or rename the lexical variable.

  • Can't use %s ref as %s ref

    (F) You've mixed up your reference types. You have to dereference a reference of the type needed. You can use the ref() function to test the type of the reference, if need be.

  • Can't use string ("%s") as %s ref while "strict refs" in use

    (F) You've told Perl to dereference a string, something that use strict blocks to prevent it from happening accidentally. See Symbolic references in perlref. This can be triggered by an @ or $ in a double-quoted string immediately before interpolating a variable, for example in "user @$twitter_id", which says to treat the contents of $twitter_id as an array reference; use a \ to have a literal @ symbol followed by the contents of $twitter_id: "user \@$twitter_id".
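
    The interpolation case can be sketched as below. The account name is a made-up value, and eval is used only to capture the fatal error.

```perl
use strict;
use warnings;

my $twitter_id = 'example_user';   # hypothetical account name

# Escaping the @ keeps it literal; only $twitter_id is interpolated.
my $label = "user \@$twitter_id";
print "$label\n";

# Without the backslash, @$twitter_id dereferences the string as an
# array, which "strict refs" turns into a fatal run-time error.
my $bad = eval { "user @$twitter_id" };
my $deref_error = $@;
print "interpolation failed: $deref_error" unless defined $bad;
```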

  • Can't use subscript on %s

    (F) The compiler tried to interpret a bracketed expression as a subscript. But to the left of the brackets was an expression that didn't look like a hash or array reference, or anything else subscriptable.

  • Can't use \%c to mean $%c in expression

    (W syntax) In an ordinary expression, backslash is a unary operator that creates a reference to its argument. The use of backslash to indicate a backreference to a matched substring is valid only as part of a regular expression pattern. Trying to do this in ordinary Perl code produces a value that prints out looking like SCALAR(0xdecaf). Use the $1 form instead.

  • Can't weaken a nonreference

    (F) You attempted to weaken something that was not a reference. Only references can be weakened.

  • Can't "when" outside a topicalizer

    (F) You have used a when() block that is neither inside a foreach loop nor a given block. (Note that this error is issued on exit from the when block, so you won't get the error if the match fails, or if you use an explicit continue.)

  • Can't x= to read-only value

    (F) You tried to repeat a constant value (often the undefined value) with an assignment operator, which implies modifying the value itself. Perhaps you need to copy the value to a temporary, and repeat that.

  • Character following "\c" must be ASCII

    (F)(W deprecated, syntax) In \cX, X must be an ASCII character. It is planned to make this fatal in all instances in Perl v5.20. In the cases where it isn't fatal, the character this evaluates to is derived by exclusive or'ing the code point of this character with 0x40.

    Note that non-alphabetic ASCII characters are discouraged here as well, and using non-printable ones will be deprecated starting in v5.18.

  • Character in 'C' format wrapped in pack

    (W pack) You said

        pack("C", $x)

    where $x is either less than 0 or more than 255; the "C" format is only for encoding native operating system characters (ASCII, EBCDIC, and so on) and not for Unicode characters, so Perl behaved as if you meant

        pack("C", $x & 255)

    If you actually want to pack Unicode codepoints, use the "U" format instead.
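
    To make the difference concrete, this sketch packs the same out-of-range value once with an explicit mask under "C" and once as a code point under "U". The value 300 is arbitrary.

```perl
use strict;
use warnings;

my $x = 300;

# Masking explicitly keeps only the low 8 bits and avoids the warning.
my $byte = pack 'C', $x & 255;

# The "U" format packs the code point itself.
my $char = pack 'U', $x;

printf "%d %d\n", ord($byte), ord($char);
```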

  • Character in 'W' format wrapped in pack

    (W pack) You said

        pack("U0W", $x)

    where $x is either less than 0 or more than 255. However, U0-mode expects all values to fall in the interval [0, 255], so Perl behaved as if you meant:

        pack("U0W", $x & 255)

  • Character in 'c' format wrapped in pack

    (W pack) You said

        pack("c", $x)

    where $x is either less than -128 or more than 127; the "c" format is only for encoding native operating system characters (ASCII, EBCDIC, and so on) and not for Unicode characters, so Perl behaved as if you meant

        pack("c", $x & 255);

    If you actually want to pack Unicode codepoints, use the "U" format instead.

  • Character in '%c' format wrapped in unpack

    (W unpack) You tried something like

        unpack("H", "\x{2a1}")

    where the format expects to process a byte (a character with a value below 256), but a higher value was provided instead. Perl uses the value modulo 256 instead, as if you had provided:

        unpack("H", "\x{a1}")

  • Character(s) in '%c' format wrapped in pack

    (W pack) You tried something like

    pack("u", "\x{1f3}b")

    where the format expects to process a sequence of bytes (character with a value below 256), but some of the characters had a higher value. Perl uses the character values modulus 256 instead, as if you had provided:

    pack("u", "\x{f3}b")

  • Character(s) in '%c' format wrapped in unpack

    (W unpack) You tried something like

    unpack("s", "\x{1f3}b")

    where the format expects to process a sequence of bytes (character with a value below 256), but some of the characters had a higher value. Perl uses the character values modulus 256 instead, as if you had provided:

    unpack("s", "\x{f3}b")

  • "\c{" is deprecated and is more clearly written as ";"

    (D deprecated, syntax) The \cX construct is intended to be a way to specify non-printable characters. You used it with a "{" which evaluates to ";", which is printable. It is planned to remove the ability to specify a semi-colon this way in Perl 5.20. Just use a semi-colon or a backslash-semi-colon without the "\c".

  • "\c%c" is more clearly written simply as "%s"

    (W syntax) The \cX construct is intended to be a way to specify non-printable characters. You used it for a printable one, which is better written as simply itself, perhaps preceded by a backslash for non-word characters.

  • Cloning substitution context is unimplemented

    (F) Creating a new thread inside the s/// operator is not supported.

  • close() on unopened filehandle %s

    (W unopened) You tried to close a filehandle that was never opened.

  • closedir() attempted on invalid dirhandle %s

    (W io) The dirhandle you tried to close is either closed or not really a dirhandle. Check your control flow.

  • Closure prototype called

    (F) If a closure has attributes, the subroutine passed to an attribute handler is the prototype that is cloned when a new closure is created. This subroutine cannot be called.

  • Code missing after '/'

    (F) You had a (sub-)template that ends with a '/'. There must be another template code following the slash. See pack.

  • Code point 0x%X is not Unicode, may not be portable
  • Code point 0x%X is not Unicode, all \p{} matches fail; all \P{} matches succeed

    (S utf8, non_unicode) You had a code point above the Unicode maximum of U+10FFFF.

    Perl allows strings to contain a superset of Unicode code points, up to the limit of what is storable in an unsigned integer on your system, but these may not be accepted by other languages/systems. At one time, it was legal in some standards to have code points up to 0x7FFF_FFFF, but not higher. Code points above 0xFFFF_FFFF require larger than a 32 bit word.

    None of the Unicode or Perl-defined properties will match a non-Unicode code point. For example,

    chr(0x7FF_FFFF) =~ /\p{Any}/

    will not match, because the code point is not in Unicode. But

    chr(0x7FF_FFFF) =~ /\P{Any}/

    will match.

    This may be counterintuitive at times, as both these fail:

    chr(0x110000) =~ /\p{ASCII_Hex_Digit=True}/   # Fails.
    chr(0x110000) =~ /\p{ASCII_Hex_Digit=False}/  # Also fails!

    and both these succeed:

    chr(0x110000) =~ /\P{ASCII_Hex_Digit=True}/   # Succeeds.
    chr(0x110000) =~ /\P{ASCII_Hex_Digit=False}/  # Also succeeds!

  • %s: Command not found

    (A) You've accidentally run your script through csh or another shell instead of Perl. Check the #! line, or manually feed your script into Perl yourself. The #! line at the top of your file could look like

    #!/usr/bin/perl -w

  • Compilation failed in require

    (F) Perl could not compile a file specified in a require statement. Perl uses this generic message when none of the errors that it encountered were severe enough to halt compilation immediately.

  • Complex regular subexpression recursion limit (%d) exceeded

    (W regexp) The regular expression engine uses recursion in complex situations where back-tracking is required. Recursion depth is limited to 32766, or perhaps less in architectures where the stack cannot grow arbitrarily. ("Simple" and "medium" situations are handled without recursion and are not subject to a limit.) Try shortening the string under examination; looping in Perl code (e.g. with while ) rather than in the regular expression engine; or rewriting the regular expression so that it is simpler or backtracks less. (See perlfaq2 for information on Mastering Regular Expressions.)

  • cond_broadcast() called on unlocked variable

    (W threads) Within a thread-enabled program, you tried to call cond_broadcast() on a variable which wasn't locked. The cond_broadcast() function is used to wake up another thread that is waiting in a cond_wait(). To ensure that the signal isn't sent before the other thread has a chance to enter the wait, it is usual for the signaling thread first to wait for a lock on the variable. This lock attempt will only succeed after the other thread has entered cond_wait() and thus relinquished the lock.

  • cond_signal() called on unlocked variable

    (W threads) Within a thread-enabled program, you tried to call cond_signal() on a variable which wasn't locked. The cond_signal() function is used to wake up another thread that is waiting in a cond_wait(). To ensure that the signal isn't sent before the other thread has a chance to enter the wait, it is usual for the signaling thread first to wait for a lock on the variable. This lock attempt will only succeed after the other thread has entered cond_wait() and thus relinquished the lock.

  • connect() on closed socket %s

    (W closed) You tried to do a connect on a closed socket. Did you forget to check the return value of your socket() call? See connect.

  • Constant(%s): Call to &{$^H{%s}} did not return a defined value

    (F) The subroutine registered to handle constant overloading (see overload) or a custom charnames handler (see CUSTOM TRANSLATORS in charnames) returned an undefined value.

  • Constant(%s): $^H{%s} is not defined

    (F) The parser found inconsistencies while attempting to define an overloaded constant. Perhaps you forgot to load the corresponding overload pragma?

  • Constant(%s) unknown

    (F) The parser found inconsistencies either while attempting to define an overloaded constant, or when trying to find the character name specified in the \N{...} escape. Perhaps you forgot to load the corresponding overload pragma?

  • Constant is not %s reference

    (F) A constant value (perhaps declared using the use constant pragma) is being dereferenced, but it amounts to the wrong type of reference. The message indicates the type of reference that was expected. This usually indicates a syntax error in dereferencing the constant value. See Constant Functions in perlsub and constant.

  • Constant subroutine %s redefined

    (W redefine)(S) You redefined a subroutine which had previously been eligible for inlining. See Constant Functions in perlsub for commentary and workarounds.

  • Constant subroutine %s undefined

    (W misc) You undefined a subroutine which had previously been eligible for inlining. See Constant Functions in perlsub for commentary and workarounds.

  • Copy method did not return a reference

    (F) The method which overloads "=" is buggy. See Copy Constructor in overload.

  • &CORE::%s cannot be called directly

    (F) You tried to call a subroutine in the CORE:: namespace with &foo syntax or through a reference. Some subroutines in this package cannot yet be called that way, but must be called as barewords. Something like this will work:

    BEGIN { *shove = \&CORE::push; }
    shove @array, 1,2,3; # pushes on to @array

  • CORE::%s is not a keyword

    (F) The CORE:: namespace is reserved for Perl keywords.

  • corrupted regexp pointers

    (P) The regular expression engine got confused by what the regular expression compiler gave it.

  • corrupted regexp program

    (P) The regular expression engine got passed a regexp program without a valid magic number.

  • Corrupt malloc ptr 0x%x at 0x%x

    (P) The malloc package that comes with Perl had an internal failure.

  • Corrupted regexp opcode %d > %d

    (F) This is either an error in Perl, or, if you're using one, your custom regular expression engine. If not the latter, report the problem through the perlbug utility.

  • Count after length/code in unpack

    (F) You had an unpack template indicating a counted-length string, but you have also specified an explicit size for the string. See pack.

  • Deep recursion on anonymous subroutine
  • Deep recursion on subroutine "%s"

    (W recursion) This subroutine has called itself (directly or indirectly) 100 times more than it has returned. This probably indicates an infinite recursion, unless you're writing strange benchmark programs, in which case it indicates something else.

    This threshold can be changed from 100, by recompiling the perl binary, setting the C pre-processor macro PERL_SUB_DEPTH_WARN to the desired value.
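
    To see the warning without recompiling anything, a small sketch such as the following may help. The subroutine names sum_to and quiet_sum_to are hypothetical; the second one shows that the warning can also be silenced lexically with no warnings 'recursion' when deep recursion is intentional:

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical helper: recursive sum of 1..$n.
sub sum_to {
    my ($n) = @_;
    return $n <= 0 ? 0 : $n + sum_to($n - 1);
}

my $total = sum_to(150);    # deeper than the default threshold of 100

# Same computation, with the warning disabled inside the subroutine only.
sub quiet_sum_to {
    no warnings 'recursion';
    my ($n) = @_;
    return $n <= 0 ? 0 : $n + quiet_sum_to($n - 1);
}

my $before      = @warnings;
my $quiet_total = quiet_sum_to(150);
my $added       = @warnings - $before;   # warnings raised by the quiet version

print "total: $total, warnings so far: ", scalar(@warnings), "\n";
```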

  • defined(@array) is deprecated

    (D deprecated) defined() is not usually useful on arrays because it checks for an undefined scalar value. If you want to see if the array is empty, just use if (@array) { # not empty } for example.

  • defined(%hash) is deprecated

    (D deprecated) defined() is not usually right on hashes and has been discouraged since 5.004.

    Although defined %hash is false on a plain not-yet-used hash, it becomes true in several non-obvious circumstances, including iterators, weak references, stash names, even remaining true after undef %hash . These things make defined %hash fairly useless in practice.

    If a check for non-empty is what you wanted then just put it in boolean context (see Scalar values in perldata):

    if (%hash) {
        # not empty
    }

    If you had defined %Foo::Bar::QUUX to check whether such a package variable exists then that's never really been reliable, and isn't a good way to enquire about the features of a package, or whether it's loaded, etc.

  • (?(DEFINE)....) does not allow branches in regex; marked by <-- HERE in m/%s/

    (F) You used something like (?(DEFINE)...|..) which is illegal. The most likely cause of this error is that you left out a parenthesis inside of the .... part.

    The <-- HERE shows whereabouts in the regular expression the problem was discovered.

  • %s defines neither package nor VERSION--version check failed

    (F) You said something like "use Module 42" but in the Module file there are neither package declarations nor a $VERSION .

  • Delimiter for here document is too long

    (F) In a here document construct like <<FOO, the label FOO is too long for Perl to handle. You have to be seriously twisted to write code that triggers this error.

  • Deprecated use of my() in false conditional

    (D deprecated) You used a declaration similar to my $x if 0 . There has been a long-standing bug in Perl that causes a lexical variable not to be cleared at scope exit when its declaration includes a false conditional. Some people have exploited this bug to achieve a kind of static variable. Since we intend to fix this bug, we don't want people relying on this behavior. You can achieve a similar static effect by declaring the variable in a separate block outside the function, e.g.

    sub f { my $x if 0; return $x++ }

    becomes

    { my $x; sub f { return $x++ } }

    Beginning with perl 5.9.4, you can also use state variables to have lexicals that are initialized only once (see feature):

    sub f { state $x; return $x++ }

  • DESTROY created new reference to dead object '%s'

    (F) A DESTROY() method created a new reference to the object which is just being DESTROYed. Perl is confused, and prefers to abort rather than to create a dangling reference.

  • Did not produce a valid header

    See Server error.

  • %s did not return a true value

    (F) A required (or used) file must return a true value to indicate that it compiled correctly and ran its initialization code correctly. It's traditional to end such a file with a "1;", though any true value would do. See require.
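
    The following sketch reproduces the error under controlled conditions. It writes a throwaway file (the name NoTrue.pl and its contents are made up for the illustration) whose last statement is false, then requires it inside an eval:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a small library file whose final statement evaluates to false.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/NoTrue.pl";
open my $fh, '>', $path or die "cannot write $path: $!";
print {$fh} "package NoTrue;\nsub hello { 'hi' }\n0;\n";
close $fh or die "cannot close $path: $!";

# require dies because the file's last value is 0 rather than a true value.
my $ok    = eval { require $path; 1 };
my $error = $@;

print "require failed: $error" unless $ok;
```

    Changing the final line of the generated file from 0; to 1; would let the require succeed.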

  • (Did you mean &%s instead?)

    (W misc) You probably referred to an imported subroutine &FOO as $FOO or some such.

  • (Did you mean "local" instead of "our"?)

    (W misc) Remember that "our" does not localize the declared global variable. You have declared it again in the same lexical scope, which seems superfluous.

  • (Did you mean $ or @ instead of %?)

    (W) You probably said %hash{$key} when you meant $hash{$key} or @hash{@keys}. On the other hand, maybe you just meant %hash and got carried away.

  • Died

    (F) You passed die() an empty string (the equivalent of die "" ) or you called it with no args and $@ was empty.

  • Document contains no data

    See Server error.

  • %s does not define %s::VERSION--version check failed

    (F) You said something like "use Module 42" but the Module did not define a $VERSION .

  • '/' does not take a repeat count

    (F) You cannot put a repeat count of any kind right after the '/' code. See pack.

  • Don't know how to handle magic of type '%s'

    (P) The internal handling of magical variables has been cursed.

  • do_study: out of memory

    (P) This should have been caught by safemalloc() instead.

  • (Do you need to predeclare %s?)

    (S syntax) This is an educated guess made in conjunction with the message "%s found where operator expected". It often means a subroutine or module name is being referenced that hasn't been declared yet. This may be because of ordering problems in your file, or because of a missing "sub", "package", "require", or "use" statement. If you're referencing something that isn't defined yet, you don't actually have to define the subroutine or package before the current location. You can use an empty "sub foo;" or "package FOO;" to enter a "forward" declaration.
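
    A forward declaration of the kind mentioned above might look like this. The subroutine name greet is hypothetical; the point is that the empty declaration lets the later call be parsed as a list operator even though the body appears further down:

```perl
use strict;
use warnings;

sub greet;                      # forward declaration, body comes later

# Without the declaration above, this paren-less call would not parse
# as a subroutine call.
my $message = greet "world";
print "$message\n";

sub greet {
    my ($who) = @_;
    return "hello, $who";
}
```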

  • dump() better written as CORE::dump()

    (W misc) You used the obsolescent dump() built-in function, without fully qualifying it as CORE::dump() . Maybe it's a typo. See dump.

  • dump is not supported

    (F) Your machine doesn't support dump/undump.

  • Duplicate free() ignored

    (S malloc) An internal routine called free() on something that had already been freed.

  • Duplicate modifier '%c' after '%c' in %s

    (W unpack) You have applied the same modifier more than once after a type in a pack template. See pack.

  • elseif should be elsif

    (S syntax) There is no keyword "elseif" in Perl because Larry thinks it's ugly. Your code will be interpreted as an attempt to call a method named "elseif" for the class returned by the following block. This is unlikely to be what you want.

  • Empty \%c{} in regex; marked by <-- HERE in m/%s/

    (F) \p and \P are used to introduce a named Unicode property, as described in perlunicode and perlre. You used \p or \P in a regular expression without specifying the property name.

  • entering effective %s failed

    (F) While under the use filetest pragma, switching the real and effective uids or gids failed.

  • %ENV is aliased to %s

    (F) You're running under taint mode, and the %ENV variable has been aliased to another hash, so it no longer reflects the state of the program's environment. This is potentially insecure.

  • Error converting file specification %s

    (F) An error peculiar to VMS. Because Perl may have to deal with file specifications in either VMS or Unix syntax, it converts them to a single form when it must operate on them directly. Either you've passed an invalid file specification to Perl, or you've found a case the conversion routines don't handle. Drat.

  • Escape literal pattern white space under /x

    (D deprecated) You compiled a regular expression pattern with /x to ignore white space, and you used, as a literal, one of the characters that Perl plans to eventually treat as white space. The character must be escaped somehow, or it will work differently on a future Perl that does treat it as white space. The easiest way is to insert a backslash immediately before it, or to enclose it with square brackets. This change is to bring Perl into conformance with Unicode recommendations. Here are the five characters that generate this warning: U+0085 NEXT LINE, U+200E LEFT-TO-RIGHT MARK, U+200F RIGHT-TO-LEFT MARK, U+2028 LINE SEPARATOR, and U+2029 PARAGRAPH SEPARATOR.

  • Eval-group in insecure regular expression

    (F) Perl detected tainted data when trying to compile a regular expression that contains the (?{ ... }) zero-width assertion, which is unsafe. See (?{ code }) in perlre, and perlsec.

  • Eval-group not allowed at runtime, use re 'eval' in regex m/%s/

    (F) Perl tried to compile a regular expression containing the (?{ ... }) zero-width assertion at run time, as it would when the pattern contains interpolated values. Since that is a security risk, it is not allowed. If you insist, you may still do this by using the re 'eval' pragma or by explicitly building the pattern from an interpolated string at run time and using that in an eval(). See (?{ code }) in perlre.
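
    As a hedged sketch of the situation described above: the variable $fragment stands in for any interpolated string carrying a code block, and $hits is a hypothetical counter. The first match attempt is refused; the second opts in with use re 'eval', which should only be done for strings you trust:

```perl
use strict;
use warnings;

our $hits = 0;
my $fragment = '(?{ $hits++ })';    # code block arriving via interpolation

# Without "use re 'eval'", compiling the interpolated pattern is fatal.
my $allowed = eval { "abc" =~ /b$fragment/; 1 };
my $error   = $@;

# Opting in lexically permits the run-time code block.
my $matched;
{
    use re 'eval';
    $matched = "abc" =~ /b$fragment/ ? 1 : 0;
}

print "refused: $error" unless $allowed;
print "matched with re 'eval': $matched\n";
```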

  • Eval-group not allowed, use re 'eval' in regex m/%s/

    (F) A regular expression contained the (?{ ... }) zero-width assertion, but that construct is only allowed when the use re 'eval' pragma is in effect. See (?{ code }) in perlre.

  • EVAL without pos change exceeded limit in regex; marked by <-- HERE in m/%s/

    (F) You used a pattern that nested too many EVAL calls without consuming any text. Restructure the pattern so that text is consumed.

    The <-- HERE shows whereabouts in the regular expression the problem was discovered.

  • Excessively long <> operator

    (F) The contents of a <> operator may not exceed the maximum size of a Perl identifier. If you're just trying to glob a long list of filenames, try using the glob() operator, or put the filenames into a variable and glob that.

  • exec? I'm not *that* kind of operating system

    (F) The exec function is not implemented on some systems, e.g., Symbian OS. See perlport.

  • Execution of %s aborted due to compilation errors.

    (F) The final summary message when a Perl compilation fails.

  • Exiting eval via %s

    (W exiting) You are exiting an eval by unconventional means, such as a goto, or a loop control statement.

  • Exiting format via %s

    (W exiting) You are exiting a format by unconventional means, such as a goto, or a loop control statement.

  • Exiting pseudo-block via %s

    (W exiting) You are exiting a rather special block construct (like a sort block or subroutine) by unconventional means, such as a goto, or a loop control statement. See sort.

  • Exiting subroutine via %s

    (W exiting) You are exiting a subroutine by unconventional means, such as a goto, or a loop control statement.
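
    For instance, a loop control statement inside a subroutine acts on the caller's loop and triggers this warning. In the sketch below (bail_out and should_stop are hypothetical names), the second loop shows a more conventional arrangement in which the subroutine only reports and the loop itself decides:

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

sub bail_out { last }    # leaves the caller's loop, with a warning

my @seen;
for my $i (1 .. 3) {
    bail_out() if $i == 2;
    push @seen, $i;
}

# A more conventional shape: the sub reports, the loop decides.
sub should_stop { my ($i) = @_; return $i == 2 }

my @seen_clean;
for my $i (1 .. 3) {
    last if should_stop($i);
    push @seen_clean, $i;
}

print "seen: @seen; seen_clean: @seen_clean\n";
```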

  • Exiting substitution via %s

    (W exiting) You are exiting a substitution by unconventional means, such as a return, a goto, or a loop control statement.

  • Expecting close bracket in regex; marked by <-- HERE in m/%s/

    (F) You wrote something like

    (?13

    to denote a capturing group of the form (?PARNO), but omitted the ")" .

  • Experimental "%s" subs not enabled

    (F) To use lexical subs, you must first enable them:

    no warnings 'experimental::lexical_subs';
    use feature 'lexical_subs';
    my sub foo { ... }

  • Explicit blessing to '' (assuming package main)

    (W misc) You are blessing a reference to a zero length string. This has the effect of blessing the reference into the package main. This is usually not what you want. Consider providing a default target package, e.g. bless($ref, $p || 'MyPackage');
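
    The suggested default can be wrapped in a small constructor. This is only a sketch: make_object is a hypothetical helper, and 'MyPackage' is the placeholder name used above:

```perl
use strict;
use warnings;

# Hypothetical constructor that falls back to a default package when the
# class name is empty or undefined.
sub make_object {
    my ($class) = @_;
    my $ref = {};
    return bless($ref, $class || 'MyPackage');
}

my $default = make_object('');        # would otherwise bless into main
my $named   = make_object('Other');

print ref($default), " ", ref($named), "\n";
```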

  • %s: Expression syntax

    (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself.

  • %s failed--call queue aborted

    (F) An untrapped exception was raised while executing a UNITCHECK, CHECK, INIT, or END subroutine. Processing of the remainder of the queue of such routines has been prematurely ended.

  • False [] range "%s" in regex; marked by <-- HERE in m/%s/

    (W regexp) A character class range must start and end at a literal character, not another character class like \d or [:alpha:]. The "-" in your false range is interpreted as a literal "-". Consider quoting the "-", "\-". The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Fatal VMS error (status=%d) at %s, line %d

    (P) An error peculiar to VMS. Something untoward happened in a VMS system service or RTL routine; Perl's exit status should provide more details. The filename in "at %s" and the line number in "line %d" tell you which section of the Perl source code is distressed.

  • fcntl is not implemented

    (F) Your machine apparently doesn't implement fcntl(). What is this, a PDP-11 or something?

  • FETCHSIZE returned a negative value

    (F) A tied array claimed to have a negative number of elements, which is not possible.

  • Field too wide in 'u' format in pack

    (W pack) Each line in an uuencoded string starts with a length indicator which can't encode values above 63. So there is no point in asking for a line length bigger than that. Perl will behave as if you specified u63 as the format.

  • Filehandle %s opened only for input

    (W io) You tried to write on a read-only filehandle. If you intended it to be a read-write filehandle, you needed to open it with "+<" or "+>" or "+>>" instead of with "<" or nothing. If you intended only to write the file, use ">" or ">>". See open.
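
    A minimal sketch of the difference between the open modes, using a temporary file (all variable names here are ours): writing through a handle opened with "<" fails with this warning, while "+<" allows both reading and writing.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_END);
use File::Temp qw(tempfile);

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Create a small file to work with.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\n";
close $tmp_fh or die "close: $!";

# Read-only handle: print fails and the (W io) warning is raised.
open my $ro, '<', $path or die "open: $!";
my $wrote = print {$ro} "nope\n";
close $ro;

# Read-write handle: seek to the end, then append.
open my $rw, '+<', $path or die "open: $!";
seek $rw, 0, SEEK_END or die "seek: $!";
my $appended = print {$rw} "second line\n";
close $rw or die "close: $!";

# Confirm what ended up in the file.
open my $check, '<', $path or die "open: $!";
my @lines = <$check>;
close $check;

print "lines in file: ", scalar(@lines), "\n";
```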

  • Filehandle %s opened only for output

    (W io) You tried to read from a filehandle opened only for writing. If you intended it to be a read/write filehandle, you needed to open it with "+<" or "+>" or "+>>" instead of with ">". If you intended only to read from the file, use "<". See open. Another possibility is that you attempted to open filedescriptor 0 (also known as STDIN) for output (maybe you closed STDIN earlier?).

  • Filehandle %s reopened as %s only for input

    (W io) You opened for reading a filehandle that got the same filehandle id as STDOUT or STDERR. This occurred because you closed STDOUT or STDERR previously.

  • Filehandle STDIN reopened as %s only for output

    (W io) You opened for writing a filehandle that got the same filehandle id as STDIN. This occurred because you closed STDIN previously.

  • Final $ should be \$ or $name

    (F) You must now decide whether the final $ in a string was meant to be a literal dollar sign, or was meant to introduce a variable name that happens to be missing. So you have to put either the backslash or the name.

  • flock() on closed filehandle %s

    (W closed) The filehandle you're attempting to flock() got itself closed some time before now. Check your control flow. flock() operates on filehandles. Are you attempting to call flock() on a dirhandle by the same name?

  • Format not terminated

    (F) A format must be terminated by a line with a solitary dot. Perl got to the end of your file without finding such a line.

  • Format %s redefined

    (W redefine) You redefined a format. To suppress this warning, say

    {
        no warnings 'redefine';
        eval "format NAME =...";
    }

  • Found = in conditional, should be ==

    (W syntax) You said

    if ($foo = 123)

    when you meant

    if ($foo == 123)

    (or something like that).

  • %s found where operator expected

    (S syntax) The Perl lexer knows whether to expect a term or an operator. If it sees what it knows to be a term when it was expecting to see an operator, it gives you this warning. Usually it indicates that an operator or delimiter was omitted, such as a semicolon.

  • gdbm store returned %d, errno %d, key "%s"

    (S) A warning from the GDBM_File extension that a store failed.

  • gethostent not implemented

    (F) Your C library apparently doesn't implement gethostent(), probably because if it did, it'd feel morally obligated to return every hostname on the Internet.

  • get%sname() on closed socket %s

    (W closed) You tried to get a socket or peer socket name on a closed socket. Did you forget to check the return value of your socket() call?

  • getpwnam returned invalid UIC %#o for user "%s"

    (S) A warning peculiar to VMS. The call to sys$getuai underlying the getpwnam operator returned an invalid UIC.

  • getsockopt() on closed socket %s

    (W closed) You tried to get a socket option on a closed socket. Did you forget to check the return value of your socket() call? See getsockopt.

  • given is experimental

    (S experimental::smartmatch) given depends on both a lexical $_ and smartmatch, both of which are experimental, so its behavior may change or even be removed in any future release of perl. See the explanation under Experimental Details on given and when in perlsyn.

  • Global symbol "%s" requires explicit package name

    (F) You've said "use strict" or "use strict vars", which indicates that all variables must either be lexically scoped (using "my" or "state"), declared beforehand using "our", or explicitly qualified to say which package the global variable is in (using "::").
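
    The three accepted forms can be sketched as follows (the variable names are illustrative only). The string eval at the end shows the error being raised for an undeclared variable, without aborting the surrounding program:

```perl
use strict;
use warnings;

my  $lexical     = 'scoped with my';
our $declared    = 'declared with our';
$main::qualified = 'fully qualified';

my @values = ($lexical, $declared, $main::qualified);

# An undeclared global fails to compile under strict vars.
my $ok    = eval q{ $undeclared = 1; 1 };
my $error = $@;

print "$_\n" for @values;
print "compile error: $error" unless $ok;
```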

  • glob failed (%s)

    (S glob) Something went wrong with the external program(s) used for glob and <*.c> . Usually, this means that you supplied a glob pattern that caused the external program to fail and exit with a nonzero status. If the message indicates that the abnormal exit resulted in a coredump, this may also mean that your csh (C shell) is broken. If so, you should change all of the csh-related variables in config.sh: If you have tcsh, make the variables refer to it as if it were csh (e.g. full_csh='/usr/bin/tcsh' ); otherwise, make them all empty (except that d_csh should be 'undef' ) so that Perl will think csh is missing. In either case, after editing config.sh, run ./Configure -S and rebuild Perl.

  • Glob not terminated

    (F) The lexer saw a left angle bracket in a place where it was expecting a term, so it's looking for the corresponding right angle bracket, and not finding it. Chances are you left some needed parentheses out earlier in the line, and you really meant a "less than".

  • gmtime(%f) too large

    (W overflow) You called gmtime with a number that was larger than it can reliably handle and gmtime probably returned the wrong date. This warning is also triggered with NaN (the special not-a-number value).

  • gmtime(%f) too small

    (W overflow) You called gmtime with a number that was smaller than it can reliably handle and gmtime probably returned the wrong date.

  • Got an error from DosAllocMem

    (P) An error peculiar to OS/2. Most probably you're using an obsolete version of Perl, and this should not happen anyway.

  • goto must have label

    (F) Unlike with "next" or "last", you're not allowed to goto an unspecified destination. See goto.

  • Goto undefined subroutine%s

    (F) You tried to call a subroutine with goto &sub syntax, but the indicated subroutine hasn't been defined, or if it was, it has since been undefined.

  • ()-group starts with a count

    (F) A ()-group started with a count. A count is supposed to follow something: a template character or a ()-group. See pack.

  • Group name must start with a non-digit word character in regex; marked by <-- HERE in m/%s/

    (F) Group names must follow the rules for perl identifiers, meaning they must start with a non-digit word character. A common cause of this error is using (?&0) instead of (?0). See perlre.

  • %s had compilation errors.

    (F) The final summary message when a perl -c fails.

  • Had to create %s unexpectedly

    (S internal) A routine asked for a symbol from a symbol table that ought to have existed already, but for some reason it didn't, and had to be created on an emergency basis to prevent a core dump.

  • Hash %%s missing the % in argument %d of %s()

    (D deprecated) Really old Perl let you omit the % on hash names in some spots. This is now heavily deprecated.

  • %s has too many errors

    (F) The parser has given up trying to parse the program after 10 errors. Further error messages would likely be uninformative.

  • Hexadecimal number > 0xffffffff non-portable

    (W portable) The hexadecimal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See perlport for more on portability concerns.

  • -i used with no filenames on the command line, reading from STDIN

    (S inplace) The -i option was passed on the command line, indicating that the script is intended to edit files inplace, but no files were given. This is usually a mistake, since editing STDIN inplace doesn't make sense, and can be confusing because it can make perl look like it is hanging when it is really just trying to read from STDIN. You should either pass a filename to edit, or remove -i from the command line. See perlrun for more details.

  • Identifier too long

    (F) Perl limits identifiers (names for variables, functions, etc.) to about 250 characters for simple names, and somewhat more for compound names (like $A::B ). You've exceeded Perl's limits. Future versions of Perl are likely to eliminate these arbitrary limitations.

  • Ignoring zero length \N{} in character class in regex; marked by <-- HERE in m/%s/

    (W regexp) Named Unicode character escapes (\N{...}) may return a zero-length sequence. When such an escape is used in a character class its behaviour is not well defined. Check that the correct escape has been used, and the correct charname handler is in scope.

  • Illegal binary digit %s

    (F) You used a digit other than 0 or 1 in a binary number.

  • Illegal binary digit %s ignored

    (W digit) You may have tried to use a digit other than 0 or 1 in a binary number. Interpretation of the binary number stopped before the offending digit.

  • Illegal character after '_' in prototype for %s : %s

    (W illegalproto) An illegal character was found in a prototype declaration. Legal characters in prototypes are $, @, %, *, ;, [, ], &, \, and +.

  • Illegal character \%o (carriage return)

    (F) Perl normally treats carriage returns in the program text as it would any other whitespace, which means you should never see this error when Perl was built using standard options. For some reason, your version of Perl appears to have been built without this support. Talk to your Perl administrator.

  • Illegal character in prototype for %s : %s

    (W illegalproto) An illegal character was found in a prototype declaration. Legal characters in prototypes are $, @, %, *, ;, [, ], &, \, and +.

  • Illegal declaration of anonymous subroutine

    (F) When using the sub keyword to construct an anonymous subroutine, you must always specify a block of code. See perlsub.

  • Illegal declaration of subroutine %s

    (F) A subroutine was not declared correctly. See perlsub.

  • Illegal division by zero

    (F) You tried to divide a number by 0. Either something was wrong in your logic, or you need to put a conditional in to guard against meaningless input.
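
    One possible shape for such a guard is sketched below. The helper safe_ratio is hypothetical; it returns undef instead of dividing when the denominator is zero, while the unguarded division is trapped with eval to show the fatal error:

```perl
use strict;
use warnings;

# Hypothetical guard: return undef rather than dividing by zero.
sub safe_ratio {
    my ($num, $den) = @_;
    return unless $den;
    return $num / $den;
}

my $zero = 0;

# Unguarded division dies; eval traps the error.
my $ok    = eval { my $r = 10 / $zero; 1 };
my $error = $@;

my $guarded = safe_ratio(10, $zero);    # undef
my $fine    = safe_ratio(10, 4);        # 2.5

print "error: $error" unless $ok;
print "guarded result is undefined\n" unless defined $guarded;
print "fine: $fine\n";
```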

  • Illegal hexadecimal digit %s ignored

    (W digit) You may have tried to use a character other than 0 - 9 or A - F, a - f in a hexadecimal number. Interpretation of the hexadecimal number stopped before the illegal character.

  • Illegal modulus zero

    (F) You tried to divide a number by 0 to get the remainder. Most numbers don't take to this kindly.

  • Illegal number of bits in vec

    (F) The number of bits in vec() (the third argument) must be a power of two from 1 to 32 (or 64, if your platform supports that).
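
    A short sketch of valid and invalid widths (the variable names are ours): 8 is a power of two and works, while 3 is rejected with this error.

```perl
use strict;
use warnings;

my $bits = '';
vec($bits, 0, 8) = 65;    # element 0, 8 bits wide: "A"
vec($bits, 1, 8) = 66;    # element 1, 8 bits wide: "B"

# A width of 3 is not a power of two, so vec() dies.
my $ok    = eval { my $v = vec($bits, 0, 3); 1 };
my $error = $@;

print "string: $bits\n";
print "error: $error" unless $ok;
```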

  • Illegal octal digit %s

    (F) You used an 8 or 9 in an octal number.

  • Illegal octal digit %s ignored

    (W digit) You may have tried to use an 8 or 9 in an octal number. Interpretation of the octal number stopped before the 8 or 9.
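
    The "ignored" variants of the digit warnings for hexadecimal, octal and binary numbers can be observed with hex() and oct() on strings. In this sketch (input strings chosen for illustration), conversion stops just before the offending character and the digits read so far are returned:

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Inputs are held in variables so the conversions happen at run time.
my ($hex_in, $oct_in, $bin_in) = ("12g4", "789", "0b102");

my $hex = hex($hex_in);    # stops before "g", yields 0x12
my $oct = oct($oct_in);    # stops before "8", yields 07
my $bin = oct($bin_in);    # stops before "2", yields 0b10

print "hex=$hex oct=$oct bin=$bin\n";
print "warning: $_" for @warnings;
```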

  • Illegal pattern in regex; marked by <-- HERE in m/%s/

    (F) You wrote something like

        (?+foo)

    The "+" is valid only when followed by digits, indicating a capturing group. See (?PARNO).

  • Illegal switch in PERL5OPT: -%c

    (X) The PERL5OPT environment variable may only be used to set the following switches: -[CDIMUdmtw].

  • Ill-formed CRTL environ value "%s"

    (W internal) A warning peculiar to VMS. Perl tried to read the CRTL's internal environ array, and encountered an element without the = delimiter used to separate keys from values. The element is ignored.

  • Ill-formed message in prime_env_iter: |%s|

    (W internal) A warning peculiar to VMS. Perl tried to read a logical name or CLI symbol definition when preparing to iterate over %ENV, and didn't see the expected delimiter between key and value, so the line was ignored.

  • (in cleanup) %s

    (W misc) This prefix usually indicates that a DESTROY() method raised the indicated exception. Since destructors are usually called by the system at arbitrary points during execution, and often a vast number of times, the warning is issued only once for any number of failures that would otherwise result in the same message being repeated.

    Failure of user callbacks dispatched using the G_KEEPERR flag could also result in this warning. See G_KEEPERR in perlcall.

  • In '(*VERB...)', splitting the initial '(*' is deprecated in regex; marked by <-- HERE in m/%s/

    (D regexp, deprecated) The two-character sequence "(*" in this context in a regular expression pattern should be an indivisible token, with nothing intervening between the "(" and the "*", but you separated them. Due to an accident of implementation, this prohibition was not enforced, but we do plan to forbid it in a future Perl version. This message serves to give you fair warning of this pending change.

  • In '(?...)', splitting the initial '(?' is deprecated in regex; marked by <-- HERE in m/%s/

    (D regexp, deprecated) The two-character sequence "(?" in this context in a regular expression pattern should be an indivisible token, with nothing intervening between the "(" and the "?", but you separated them. Due to an accident of implementation, this prohibition was not enforced, but we do plan to forbid it in a future Perl version. This message serves to give you fair warning of this pending change.

  • Incomplete expression within '(?[ ])' in regex; marked by <-- HERE in m/%s/

    (F) There was a syntax error within the (?[ ]). This can happen if the expression inside the construct was completely empty, or if there are too many or too few operands for the number of operators. Perl is not smart enough to give you a more precise indication as to what is wrong.

  • Inconsistent hierarchy during C3 merge of class '%s': merging failed on parent '%s'

    (F) The method resolution order (MRO) of the given class is not C3-consistent, and you have enabled the C3 MRO for this class. See the C3 documentation in mro for more information.

  • In EBCDIC the v-string components cannot exceed 2147483647

    (F) An error peculiar to EBCDIC. Internally, v-strings are stored as Unicode code points, and encoded in EBCDIC as UTF-EBCDIC. The UTF-EBCDIC encoding is limited to code points no larger than 2147483647 (0x7FFFFFFF).

  • Infinite recursion in regex; marked by <-- HERE in m/%s/

    (F) You used a pattern that references itself without consuming any input text. You should check the pattern to ensure that recursive patterns either consume text or fail.

    The <-- HERE shows whereabouts in the regular expression the problem was discovered.

  • Initialization of state variables in list context currently forbidden

    (F) Currently the implementation of "state" only permits the initialization of scalar variables in scalar context. Re-write state ($a) = 42 as state $a = 42 to change from list to scalar context. Constructions such as state (@a) = foo() will be supported in a future perl release.
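
    A minimal sketch of the scalar form that is permitted; the subroutine name next_id is hypothetical.

```perl
use strict;
use warnings;
use feature 'state';

sub next_id {
    state $id = 0;    # scalar initialization, performed only on the first call
    return ++$id;
}

my @ids = map { next_id() } 1 .. 3;
```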

  • Insecure dependency in %s

    (F) You tried to do something that the tainting mechanism didn't like. The tainting mechanism is turned on when you're running setuid or setgid, or when you specify -T to turn it on explicitly. The tainting mechanism labels all data that's derived directly or indirectly from the user, who is considered to be unworthy of your trust. If any such data is used in a "dangerous" operation, you get this error. See perlsec for more information.

  • Insecure directory in %s

    (F) You can't use system(), exec(), or a piped open in a setuid or setgid script if $ENV{PATH} contains a directory that is writable by the world. Also, the PATH must not contain any relative directory. See perlsec.

  • Insecure $ENV{%s} while running %s

    (F) You can't use system(), exec(), or a piped open in a setuid or setgid script if any of $ENV{PATH}, $ENV{IFS}, $ENV{CDPATH}, $ENV{ENV}, $ENV{BASH_ENV} or $ENV{TERM} are derived from data supplied (or potentially supplied) by the user. The script must set the path to a known value, using trustworthy data. See perlsec.

  • Insecure user-defined property %s

    (F) Perl detected tainted data when trying to compile a regular expression that contains a call to a user-defined character property function, i.e. \p{IsFoo} or \p{InFoo}. See User-Defined Character Properties in perlunicode and perlsec.

  • Integer overflow in format string for %s

    (F) The indexes and widths specified in the format string of printf() or sprintf() are too large. The numbers must not overflow the size of integers for your architecture.

  • Integer overflow in %s number

    (S overflow) The hexadecimal, octal or binary number you have specified either as a literal or as an argument to hex() or oct() is too big for your architecture, and has been converted to a floating point number. On a 32-bit architecture the largest hexadecimal, octal or binary number representable without overflow is 0xFFFFFFFF, 037777777777, or 0b11111111111111111111111111111111 respectively. Note that Perl transparently promotes all numbers to a floating point representation internally--subject to loss of precision errors in subsequent operations.

  • Integer overflow in srand

    (S overflow) The number you have passed to srand is too big to fit in your architecture's integer representation. The number has been replaced with the largest integer supported (0xFFFFFFFF on 32-bit architectures). This means you may be getting less randomness than you expect, because different random seeds above the maximum will return the same sequence of random numbers.

  • Integer overflow in version
  • Integer overflow in version %d

    (W overflow) Some portion of a version initialization is too large for the size of integers for your architecture. This is not a warning because there is no rational reason for a version to try and use an element larger than typically 2**32. This is usually caused by trying to use some odd mathematical operation as a version, like 100/9.

  • Internal disaster in regex; marked by <-- HERE in m/%s/

    (P) Something went badly wrong in the regular expression parser. The <-- HERE shows whereabouts in the regular expression the problem was discovered.

  • Internal inconsistency in tracking vforks

    (S) A warning peculiar to VMS. Perl keeps track of the number of times you've called fork and exec, to determine whether the current call to exec should affect the current script or a subprocess (see exec LIST in perlvms). Somehow, this count has become scrambled, so Perl is making a guess and treating this exec as a request to terminate the Perl script and execute the specified command.

  • Internal urp in regex; marked by <-- HERE in m/%s/

    (P) Something went badly awry in the regular expression parser. The <-- HERE shows whereabouts in the regular expression the problem was discovered.

  • %s (...) interpreted as function

    (W syntax) You've run afoul of the rule that says that any list operator followed by parentheses turns into a function, with all the list operator's arguments found inside the parentheses. See Terms and List Operators (Leftward) in perlop.

  • Invalid %s attribute: %s

    (F) The indicated attribute for a subroutine or variable was not recognized by Perl or by a user-supplied handler. See attributes.

  • Invalid %s attributes: %s

    (F) The indicated attributes for a subroutine or variable were not recognized by Perl or by a user-supplied handler. See attributes.

  • Invalid [] range "%*.*s" in regex; marked by <-- HERE in m/%s/

    (F) You wrote something like

        [z-a]

    in a regular expression pattern. Ranges must be specified with the lowest code point first. Instead write

        [a-z]

  • Invalid character in \N{...}; marked by <-- HERE in \N{%s}

    (F) Only certain characters are valid for character names. The indicated one isn't. See CUSTOM ALIASES in charnames.

  • Invalid character in charnames alias definition; marked by <-- HERE in '%s

    (F) You tried to create a custom alias for a character name, with the :alias option to use charnames and the specified character in the indicated name isn't valid. See CUSTOM ALIASES in charnames.

  • Invalid conversion in %s: "%s"

    (W printf) Perl does not understand the given format conversion. See sprintf.
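
    As an illustration, the sketch below triggers the warning with a conversion letter that sprintf does not define ("%y" is chosen here as an assumed example) and then shows a valid format.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $bad_format = "%y items";
my $garbled    = sprintf($bad_format, 5);    # warns: Invalid conversion

my $good_format = "%d items";
my $fixed       = sprintf($good_format, 5);
```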

  • Invalid escape in the specified encoding in regex; marked by <-- HERE in m/%s/

    (W regexp) The numeric escape (for example \xHH) of value < 256 didn't correspond to a single character through the conversion from the encoding specified by the encoding pragma. The escape was replaced with REPLACEMENT CHARACTER (U+FFFD) instead. The <-- HERE shows whereabouts in the regular expression the escape was discovered.

  • Invalid hexadecimal number in \N{U+...}
  • Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/%s/

    (F) The character constant represented by ... is not a valid hexadecimal number. Either it is empty, or you tried to use a character other than 0 - 9 or A - F, a - f in a hexadecimal number.

  • Invalid module name %s with -%c option: contains single ':'

    (F) The module argument to perl's -m and -M command-line options cannot contain single colons in the module name, but only in the arguments after "=". In other words, -MFoo::Bar=:baz is ok, but -MFoo:Bar=baz is not.

  • Invalid mro name: '%s'

    (F) You tried to mro::set_mro("classname", "foo") or use mro 'foo', where foo is not a valid method resolution order (MRO). Currently, the only valid ones supported are dfs and c3, unless you have loaded a module that is an MRO plugin. See mro and perlmroapi.

  • Invalid negative number (%s) in chr

    (W utf8) You passed a negative number to chr. Negative numbers are not valid character numbers, so it returns the Unicode replacement character (U+FFFD).
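
    A short sketch of this behaviour, capturing the warning so the returned character can be inspected; the variable names are illustrative.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $code = -1;
my $char = chr($code);                   # warns, then yields U+FFFD

my $is_replacement = ord($char) == 0xFFFD ? 1 : 0;
```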

  • invalid option -D%c, use -D'' to see choices

    (S debugging) Perl was called with invalid debugger flags. Call perl with the -D option with no flags to see the list of acceptable values. See also -Dletters in perlrun.

  • Invalid [] range "%s" in regex; marked by <-- HERE in m/%s/

    (F) The range specified in a character class had a minimum character greater than the maximum character. One possibility is that you forgot the {} from your ending \x{} - \x without the curly braces can go only up to ff. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.
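
    When a pattern is built at run time, the error can be trapped with eval. The sketch below assumes a reversed range supplied in a variable and contrasts it with the corrected range.

```perl
use strict;
use warnings;

my $reversed = '[z-a]';
my $compiled = eval { qr/$reversed/ };   # dies while compiling the pattern
my $error    = $@;                       # mentions "Invalid [] range"

my $fixed   = qr/[a-z]/;
my $matches = "m" =~ $fixed ? 1 : 0;
```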

  • Invalid range "%s" in transliteration operator

    (F) The range specified in the tr/// or y/// operator had a minimum character greater than the maximum character. See perlop.

  • Invalid separator character %s in attribute list

    (F) Something other than a colon or whitespace was seen between the elements of an attribute list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon. See attributes.

  • Invalid separator character %s in PerlIO layer specification %s

    (W layer) When pushing layers onto the Perl I/O system, something other than a colon or whitespace was seen between the elements of a layer list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon.

  • Invalid strict version format (%s)

    (F) A version number did not meet the "strict" criteria for versions. A "strict" version number is a positive decimal number (integer or decimal-fraction) without exponentiation or else a dotted-decimal v-string with a leading 'v' character and at least three components. The parenthesized text indicates which criteria were not met. See the version module for more details on allowed version formats.

  • Invalid type '%s' in %s

    (F) The given character is not a valid pack or unpack type. See pack.

    (W) The given character is not a valid pack or unpack type but used to be silently ignored.

  • Invalid version format (%s)

    (F) A version number did not meet the "lax" criteria for versions. A "lax" version number is a positive decimal number (integer or decimal-fraction) without exponentiation or else a dotted-decimal v-string. If the v-string has fewer than three components, it must have a leading 'v' character. Otherwise, the leading 'v' is optional. Both decimal and dotted-decimal versions may have a trailing "alpha" component separated by an underscore character after a fractional or dotted-decimal component. The parenthesized text indicates which criteria were not met. See the version module for more details on allowed version formats.
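
    The strict and lax rules can be compared side by side. This sketch assumes that the version module bundled with your Perl provides the is_strict() and is_lax() functions; the sample version strings are illustrative.

```perl
use strict;
use warnings;
use version;

my @samples = qw(1.02 v1.2.3 1.2.3 v1.2);

# Map each sample to 1 or 0 under the strict and the lax rules.
my %strict = map { ($_, version::is_strict($_) ? 1 : 0) } @samples;
my %lax    = map { ($_, version::is_lax($_)    ? 1 : 0) } @samples;
```

    Under the rules described above, "1.2.3" (no leading 'v') and "v1.2" (fewer than three components) should pass only the lax check.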

  • Invalid version object

    (F) The internal structure of the version object was invalid. Perhaps the internals were modified directly in some way or an arbitrary reference was blessed into the "version" class.

  • ioctl is not implemented

    (F) Your machine apparently doesn't implement ioctl(), which is pretty strange for a machine that supports C.

  • ioctl() on unopened %s

    (W unopened) You tried ioctl() on a filehandle that was never opened. Check your control flow and number of arguments.

  • IO layers (like '%s') unavailable

    (F) Your Perl has not been configured to have PerlIO, and therefore you cannot use IO layers. To have PerlIO, Perl must be configured with 'useperlio'.

  • IO::Socket::atmark not implemented on this architecture

    (F) Your machine doesn't implement the sockatmark() functionality, neither as a system call nor an ioctl call (SIOCATMARK).

  • $* is no longer supported

    (D deprecated, syntax) The special variable $*, deprecated in older perls, has been removed as of 5.9.0 and is no longer supported. In previous versions of perl the use of $* enabled or disabled multi-line matching within a string.

    Instead of using $* you should use the /m (and maybe /s) regexp modifiers. You can enable /m for a lexical scope (even a whole file) with use re '/m'. (In older versions: when $* was set to a true value then all regular expressions behaved as if they were written using /m.)
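
    A brief sketch of both replacements, using an assumed three-line string: the /m modifier on a single match, and use re '/m' for a lexical scope.

```perl
use strict;
use warnings;

my $text = "alpha\nbeta\ngamma";

# With /m, ^ and $ match at the start and end of every line.
my @with_m = $text =~ /^(\w+)$/mg;

# Without /m they anchor only to the string as a whole, so nothing matches.
my $count_without_m = () = $text =~ /^(\w+)$/g;

# use re '/m' applies the modifier to every pattern in the enclosing block.
my @scoped;
{
    use re '/m';
    @scoped = $text =~ /^(\w+)$/g;
}
```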

  • $# is no longer supported

    (D deprecated, syntax) The special variable $#, deprecated in older perls, has been removed as of 5.9.3 and is no longer supported. You should use the printf/sprintf functions instead.
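
    For example, an explicit sprintf format controls how a number is printed; the values and formats below are illustrative.

```perl
use strict;
use warnings;

my $pi = 3.14159265;

my $two_places = sprintf("%.2f", $pi);     # fixed number of decimal places
my $padded     = sprintf("%05.1f", 2.5);   # zero-padded to a width of 5
```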

  • '%s' is not a code reference

    (W overload) The second (fourth, sixth, ...) argument of overload::constant needs to be a code reference: either an anonymous subroutine or a reference to a subroutine.

  • '%s' is not an overloadable type

    (W overload) You tried to overload a constant type the overload package is unaware of.

  • Junk on end of regexp in regex m/%s/

    (P) The regular expression parser is confused.

  • Label not found for "last %s"

    (F) You named a loop to break out of, but you're not currently in a loop of that name, not even if you count where you were called from. See last.
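
    A minimal sketch of a correctly labelled loop: the label ROW is attached to the outer loop, so last ROW inside the inner loop can find it. The data and names are hypothetical.

```perl
use strict;
use warnings;

my @grid = ([1, 2, 3], [4, 5, 6], [7, 8, 9]);

my $found;
my $visited = 0;

ROW: for my $row (@grid) {
    for my $cell (@$row) {
        $visited++;
        if ($cell == 5) {
            $found = $cell;
            last ROW;          # leaves both loops at once
        }
    }
}
```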

  • Label not found for "next %s"

    (F) You named a loop to continue, but you're not currently in a loop of that name, not even if you count where you were called from. See next.

  • Label not found for "redo %s"

    (F) You named a loop to restart, but you're not currently in a loop of that name, not even if you count where you were called from. See redo.

  • leaving effective %s failed

    (F) While under the use filetest pragma, switching the real and effective uids or gids failed.

  • length/code after end of string in unpack

    (F) While unpacking, the string buffer was already used up when an unpack length/code combination tried to obtain more data. This results in an undefined value for the length. See pack.

  • length() used on %s

    (W syntax) You used length() on either an array or a hash when you probably wanted a count of the items.

    Array size can be obtained by doing:

        scalar(@array);

    The number of items in a hash can be obtained by doing:

        scalar(keys %hash);

  • Lexing code attempted to stuff non-Latin-1 character into Latin-1 input

    (F) An extension is attempting to insert text into the current parse (using lex_stuff_pvn or similar), but tried to insert a character that couldn't be part of the current input. This is an inherent pitfall of the stuffing mechanism, and one of the reasons to avoid it. Where it is necessary to stuff, stuffing only plain ASCII is recommended.

  • Lexing code internal error (%s)

    (F) Lexing code supplied by an extension violated the lexer's API in a detectable way.

  • listen() on closed socket %s

    (W closed) You tried to do a listen on a closed socket. Did you forget to check the return value of your socket() call? See listen.

  • List form of piped open not implemented

    (F) On some platforms, notably Windows, the three-or-more-arguments form of open does not support pipes, such as open($pipe, '|-', @args). Use the two-argument open($pipe, '|prog arg1 arg2...') form instead.

  • localtime(%f) too large

    (W overflow) You called localtime with a number that was larger than it can reliably handle and localtime probably returned the wrong date. This warning is also triggered with NaN (the special not-a-number value).

  • localtime(%f) too small

    (W overflow) You called localtime with a number that was smaller than it can reliably handle and localtime probably returned the wrong date.

  • Lookbehind longer than %d not implemented in regex m/%s/

    (F) There is currently a limit on the length of string which lookbehind can handle. This restriction may be eased in a future release.

  • Lost precision when %s %f by 1

    (W imprecision) The value you attempted to increment or decrement by one is too large for the underlying floating point representation to store accurately, hence the target of ++ or -- is unchanged. Perl issues this warning because it has already switched from integers to floating point when values are too large for integers, and now even floating point is insufficient. You may wish to switch to using Math::BigInt explicitly.
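
    As a sketch of the Math::BigInt alternative mentioned above, the counter below starts at 2**53 (written out as a string) and is incremented exactly.

```perl
use strict;
use warnings;
use Math::BigInt;

my $counter = Math::BigInt->new("9007199254740992");   # 2**53
$counter->binc();                                      # exact increment by one

my $text = $counter->bstr();
```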

  • lstat() on filehandle%s

    (W io) You tried to do an lstat on a filehandle. What did you mean by that? lstat() makes sense only on filenames. (Perl did a fstat() instead on the filehandle.)

  • lvalue attribute %s already-defined subroutine

    (W misc) Although attributes.pm allows this, turning the lvalue attribute on or off on a Perl subroutine that is already defined does not always work properly. It may or may not do what you want, depending on what code is inside the subroutine, with exact details subject to change between Perl versions. Only do this if you really know what you are doing.

  • lvalue attribute ignored after the subroutine has been defined

    (W misc) Using the :lvalue declarative syntax to make a Perl subroutine an lvalue subroutine after it has been defined is not permitted. To make the subroutine an lvalue subroutine, add the lvalue attribute to the definition, or put the sub foo :lvalue; declaration before the definition.

    See also attributes.pm.

  • Malformed integer in [] in pack

    (F) Between the brackets enclosing a numeric repeat count only digits are permitted. See pack.

  • Malformed integer in [] in unpack

    (F) Between the brackets enclosing a numeric repeat count only digits are permitted. See pack.

  • Malformed PERLLIB_PREFIX

    (F) An error peculiar to OS/2. PERLLIB_PREFIX should be of the form

        prefix1;prefix2

    or

        prefix1 prefix2

    with nonempty prefix1 and prefix2. If prefix1 is indeed a prefix of a builtin library search path, prefix2 is substituted. The error may appear if components are not found, or are too long. See "PERLLIB_PREFIX" in perlos2.

  • Malformed prototype for %s: %s

    (F) You tried to use a function with a malformed prototype. The syntax of function prototypes is given a brief compile-time check for obvious errors like invalid characters. A more rigorous check is run when the function is called.

  • Malformed UTF-8 character (%s)

    (S utf8)(F) Perl detected a string that didn't comply with UTF-8 encoding rules, even though it had the UTF8 flag on.

    One possible cause is that you set the UTF8 flag yourself for data that you thought to be in UTF-8 but it wasn't (it was for example legacy 8-bit data). To guard against this, you can use Encode::decode_utf8.

    If you use the :encoding(UTF-8) PerlIO layer for input, invalid byte sequences are handled gracefully, but if you use :utf8, the flag is set without validating the data, possibly resulting in this error message.

    See also Handling Malformed Data in Encode.
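
    One way to validate bytes before treating them as characters is to decode them with Encode and ask it to croak on bad input. This is a sketch with assumed sample data, using Encode::decode with the FB_CROAK and LEAVE_SRC check flags.

```perl
use strict;
use warnings;
use Encode ();

my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();

my $good_bytes = "caf\xC3\xA9";                       # valid UTF-8 for "café"
my $text       = Encode::decode('UTF-8', $good_bytes, $check);

my $bad_bytes = "caf\xFF";                            # 0xFF never occurs in UTF-8
my $result    = eval { Encode::decode('UTF-8', $bad_bytes, $check) };
my $error     = $@;                                   # set because decoding failed
```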

  • Malformed UTF-8 character immediately after '%s'

    (F) You said use utf8, but the program file doesn't comply with UTF-8 encoding rules. The message prints out the properly encoded characters just before the first bad one. If utf8 warnings are enabled, a warning is generated that gives more details about the type of malformation.

  • Malformed UTF-8 returned by \N{%s} immediately after '%s'

    (F) The charnames handler returned malformed UTF-8.

  • Malformed UTF-8 string in '%c' format in unpack

    (F) You tried to unpack something that didn't comply with UTF-8 encoding rules and perl was unable to guess how to make more progress.

  • Malformed UTF-8 string in pack

    (F) You tried to pack something that didn't comply with UTF-8 encoding rules and perl was unable to guess how to make more progress.

  • Malformed UTF-8 string in unpack

    (F) You tried to unpack something that didn't comply with UTF-8 encoding rules and perl was unable to guess how to make more progress.

  • Malformed UTF-16 surrogate

    (F) Perl thought it was reading UTF-16 encoded character data, but while doing so it encountered a malformed Unicode surrogate.

  • %s matches null string many times in regex; marked by <-- HERE in m/%s/

    (W regexp) The pattern you've specified would be an infinite loop if the regular expression engine didn't specifically check for that. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Maximal count of pending signals (%u) exceeded

    (F) Perl aborted due to too high a number of signals pending. This usually indicates that your operating system tried to deliver signals too fast (with a very high priority), starving the perl process of the resources it would need to reach a point where it can process signals safely. (See Deferred Signals (Safe Signals) in perlipc.)

  • "%s" may clash with future reserved word

    (W) This warning may be due to running a perl5 script through a perl4 interpreter, especially if the word that is being warned about is "use" or "my".

  • '%' may not be used in pack

    (F) You can't pack a string by supplying a checksum, because the checksumming process loses information, and you can't go the other way. See unpack.
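
    To illustrate the one-way nature of checksums, the sketch below computes a 32-bit checksum with unpack and shows that the same template is rejected by pack. The sample data is an assumption.

```perl
use strict;
use warnings;

my $data     = "abc";
my $template = "%32C*";

# unpack sums the byte values: 97 + 98 + 99.
my $checksum = unpack($template, $data);

# The reverse direction is not possible, so pack dies.
my $packed = eval { pack($template, 97, 98, 99) };
my $error  = $@;
```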

  • Method for operation %s not found in package %s during blessing

    (F) An attempt was made to specify an entry in an overloading table that doesn't resolve to a valid subroutine. See overload.

  • Method %s not permitted

    See Server error.

  • Might be a runaway multi-line %s string starting on line %d

    (S) An advisory indicating that the previous error may have been caused by a missing delimiter on a string or pattern, because it eventually ended earlier on the current line.

  • Misplaced _ in number

    (W syntax) An underscore (underbar) in a numeric constant did not separate two digits.

  • Missing argument in %s

    (W uninitialized) A printf-type format required more arguments than were supplied.
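
    A short sketch of the mismatch: the format below has two conversions, so it needs two arguments. The names and values are illustrative.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $format = "%s scored %d";

my $short    = sprintf($format, "alice");        # one argument too few: warns
my $complete = sprintf($format, "alice", 42);    # matches the format
```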

  • Missing argument to -%c

    (F) The argument to the indicated command line switch must follow immediately after the switch, without intervening spaces.

  • Missing braces on \N{}
  • Missing braces on \N{} in regex; marked by <-- HERE in m/%s/

    (F) Wrong syntax of character name literal \N{charname} within double-quotish context. This can also happen when there is a space (or comment) between the \N and the { in a regex with the /x modifier. This modifier does not change the requirement that the brace immediately follow the \N.

  • Missing braces on \o{}

    (F) A \o must be followed immediately by a { in double-quotish context.

  • Missing comma after first argument to %s function

    (F) While certain functions allow you to specify a filehandle or an "indirect object" before the argument list, this ain't one of them.

  • Missing command in piped open

    (W pipe) You used the open(FH, "| command") or open(FH, "command |") construction, but the command was missing or blank.

  • Missing control char name in \c

    (F) A double-quoted string ended with "\c", without the required control character name.

  • Missing name in "%s sub"

    (F) The reserved syntax for lexically scoped subroutines requires that they have a name with which they can be found.

  • Missing $ on loop variable

    (F) Apparently you've been programming in csh too much. Variables are always mentioned with the $ in Perl, unlike in the shells, where it can vary from one line to the next.

  • (Missing operator before %s?)

    (S syntax) This is an educated guess made in conjunction with the message "%s found where operator expected". Often the missing operator is a comma.

  • Missing right brace on \%c{} in regex; marked by <-- HERE in m/%s/

    (F) Missing right brace in \x{...}, \p{...}, \P{...}, or \N{...}.

  • Missing right brace on \N{} or unescaped left brace after \N

    (F) \N has two meanings.

    The traditional one has it followed by a name enclosed in braces, meaning the character (or sequence of characters) given by that name. Thus \N{ASTERISK} is another way of writing *, valid in both double-quoted strings and regular expression patterns. In patterns, it doesn't have the meaning an unescaped * does.

    Starting in Perl 5.12.0, \N also can have an additional meaning (only) in patterns, namely to match a non-newline character. (This is short for [^\n], and like . but is not affected by the /s regex modifier.)

    This can lead to some ambiguities. When \N is not followed immediately by a left brace, Perl assumes the [^\n] meaning. Also, if the braces form a valid quantifier such as \N{3} or \N{5,}, Perl assumes that this means to match the given quantity of non-newlines (in these examples, 3; and 5 or more, respectively). In all other cases, where there is a \N{ and a matching }, Perl assumes that a character name is desired.

    However, if there is no matching }, Perl doesn't know if it was mistakenly omitted, or if [^\n]{ was desired, and raises this error. If you meant the former, add the right brace; if you meant the latter, escape the brace with a backslash, like so: \N\{

  • Missing right curly or square bracket

    (F) The lexer counted more opening curly or square brackets than closing ones. As a general rule, you'll find it's missing near the place you were last editing.

  • (Missing semicolon on previous line?)

    (S syntax) This is an educated guess made in conjunction with the message "%s found where operator expected". Don't automatically put a semicolon on the previous line just because you saw this message.

  • Modification of a read-only value attempted

    (F) You tried, directly or indirectly, to change the value of a constant. You didn't, of course, try "2 = 1", because the compiler catches that. But an easy way to do the same thing is:

        sub mod { $_[0] = 1 }
        mod(2);

    Another way is to assign to a substr() that's off the end of the string.

    Yet another way is to assign to a foreach loop VAR when VAR is aliased to a constant in the loop LIST:

        $x = 1;
        foreach my $n ($x, 2) {
            $n *= 2; # modifies the $x, but fails on attempt to
        }            # modify the 2

  • Modification of non-creatable array value attempted, %s

    (F) You tried to make an array value spring into existence, and the subscript was probably negative, even counting from end of the array backwards.

  • Modification of non-creatable hash value attempted, %s

    (P) You tried to make a hash value spring into existence, and it couldn't be created for some peculiar reason.

  • Module name must be constant

    (F) Only a bare module name is allowed as the first argument to a "use".

  • Module name required with -%c option

    (F) The -M or -m options say that Perl should load some module, but you omitted the name of the module. Consult perlrun for full details about -M and -m.

  • More than one argument to '%s' open

    (F) The open function has been asked to open multiple files. This can happen if you are trying to open a pipe to a command that takes a list of arguments, but have forgotten to specify a piped open mode. See open for details.

  • msg%s not implemented

    (F) You don't have System V message IPC on your system.

  • Multidimensional syntax %s not supported

    (W syntax) Multidimensional arrays aren't written like $foo[1,2,3]. They're written like $foo[1][2][3], as in C.
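
    For illustration, a two-dimensional array is an array of array references, and each dimension gets its own pair of brackets. The data is hypothetical.

```perl
use strict;
use warnings;

my @matrix = (
    [1, 2, 3],
    [4, 5, 6],
);

my $cell = $matrix[1][2];    # row 1, column 2
$matrix[0][1] = 20;          # assignment uses the same syntax

my $row_count = scalar @matrix;
my $col_count = scalar @{ $matrix[0] };
```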

  • '/' must follow a numeric type in unpack

    (F) You had an unpack template that contained a '/', but this did not follow some unpack specification producing a numeric value. See pack.
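
    A sketch of a valid length/code template: the numeric type C supplies the length that the following a* consumes. The payload is an assumed example.

```perl
use strict;
use warnings;

# pack writes a one-byte length followed by the string itself.
my $record = pack("C/a*", "hello");

# unpack reads the length first, then exactly that many bytes.
my ($payload) = unpack("C/a*", $record . "trailing data");
```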

  • "my sub" not yet implemented

    (F) Lexically scoped subroutines are not yet implemented. Don't try that yet.

  • "my" variable %s can't be in a package

    (F) Lexically scoped variables aren't in a package, so it doesn't make sense to try to declare one with a package qualifier on the front. Use local() if you want to localize a package variable.
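
    A minimal sketch of localizing a package variable with local() instead; the variable and subroutine names are hypothetical.

```perl
use strict;
use warnings;

our $indent = 0;    # package variable, declared with our

sub current_indent { return $indent }

sub nested {
    local $indent = $indent + 2;    # temporary value, restored on exit
    return current_indent();
}

my $inside = nested();
my $after  = current_indent();
```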

  • Name "%s::%s" used only once: possible typo

    (W once) Typographical errors often show up as unique variable names. If you had a good reason for having a unique name, then just mention it again somehow to suppress the message. The our declaration is provided for this purpose.

    NOTE: This warning detects symbols that have been used only once so $c, @c, %c, *c, &c, sub c{}, c(), and c (the filehandle or format) are considered the same; if a program uses $c only once but also uses any of the others it will not trigger this warning.

  • \N in a character class must be a named character: \N{...} in regex; marked by <-- HERE in m/%s/

    (F) The new (5.12) meaning of \N as [^\n] is not valid in a bracketed character class, for the same reason that . in a character class loses its specialness: it matches almost everything, which is probably not what you want.

  • \N{NAME} must be resolved by the lexer in regex; marked by <-- HERE in m/%s/

    (F) When compiling a regex pattern, an unresolved named character or sequence was encountered. This can happen in any of several ways that bypass the lexer, such as using single-quotish context, or an extra backslash in double-quotish:

        $re = '\N{SPACE}';   # Wrong!
        $re = "\\N{SPACE}";  # Wrong!
        /$re/;

    Instead, use double-quotes with a single backslash:

        $re = "\N{SPACE}";   # ok
        /$re/;

    The lexer can be bypassed as well by creating the pattern from smaller components:

        $re = '\N';
        /${re}{SPACE}/;      # Wrong!

    It's not a good idea to split a construct in the middle like this, and it doesn't work here. Instead use the solution above.

    Finally, the message also can happen under the /x regex modifier when the \N is separated by spaces from the {, in which case, remove the spaces.

        /\N {SPACE}/x;       # Wrong!
        /\N{SPACE}/x;        # ok

  • Need exactly 3 octal digits in regex; marked by <-- HERE in m/%s/

    (F) Within (?[ ]), all constants interpreted as octal need to be exactly 3 digits long. This helps catch some ambiguities. If your constant is too short, add leading zeros, like

        (?[ [ \078 ] ])      # Syntax error!
        (?[ [ \0078 ] ])     # Works
        (?[ [ \007 8 ] ])    # Clearer

    The maximum number this construct can express is \777. If you need a larger one, you need to use \o{} instead. If you meant two separate things, you need to separate them:

        (?[ [ \7776 ] ])        # Syntax error!
        (?[ [ \o{7776} ] ])     # One meaning
        (?[ [ \777 6 ] ])       # Another meaning
        (?[ [ \777 \006 ] ])    # Still another

  • Negative '/' count in unpack

    (F) The length count obtained from a length/code unpack operation was negative. See pack.

  • Negative length

    (F) You tried to do a read/write/send/recv operation with a buffer length that is less than 0. This is difficult to imagine.

  • Negative offset to vec in lvalue context

    (F) When vec is called in an lvalue context, the second argument must be greater than or equal to zero.
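
    As an illustration only, the following sketch triggers this error and traps it with eval. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $bits = '';

# A non-negative offset is fine: this stores 65 in element 3 (8 bits wide),
# growing the string to four bytes.
vec($bits, 3, 8) = 65;

# A negative offset in lvalue context is fatal, so trap it with eval.
my $ok  = eval { vec($bits, -1, 8) = 1; 1 };
my $err = $ok ? '' : $@;

print "trapped: $err" unless $ok;
```

    Checking the offset before the assignment avoids the need for eval altogether.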

  • Nested quantifiers in regex; marked by <-- HERE in m/%s/

    (F) You can't quantify a quantifier without intervening parentheses. So things like ** or +* or ?* are illegal. The <-- HERE shows whereabouts in the regular expression the problem was discovered.

    Note that the minimal matching quantifiers, *?, +?, and ??, appear to be nested quantifiers, but aren't. See perlre.
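
    The difference can be seen by compiling a few patterns at run time. This is a small sketch, not part of the original text; the patterns and the %result hash are chosen only for illustration.

```perl
use strict;
use warnings;

my %result;

# 'a**' and 'a+*' quantify a quantifier; 'a*?' and 'a??' are minimal
# matching quantifiers and compile normally.
for my $pattern ('a**', 'a+*', 'a*?', 'a??') {
    my $re = eval { qr/$pattern/ };
    $result{$pattern} = defined $re ? 'ok' : $@;
}

for my $pattern (sort keys %result) {
    my $status = $result{$pattern} eq 'ok' ? 'compiles' : 'fails';
    print "$pattern $status\n";
}
```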

  • %s never introduced

    (S internal) The symbol in question was declared but somehow went out of scope before it could possibly have been used.

  • next::method/next::can/maybe::next::method cannot find enclosing method

    (F) next::method needs to be called within the context of a real method in a real package, and it could not find such a context. See mro.

  • No %s allowed while running setuid

    (F) Certain operations are deemed to be too insecure for a setuid or setgid script to even be allowed to attempt. Generally speaking there will be another way to do what you want that is, if not secure, at least securable. See perlsec.

  • No code specified for -%c

    (F) Perl's -e and -E command-line options require an argument. If you want to run an empty program, pass the empty string as a separate argument or run a program consisting of a single 0 or 1:

        perl -e ""
        perl -e0
        perl -e1

  • No comma allowed after %s

    (F) A list operator that has a filehandle or "indirect object" is not allowed to have a comma between that and the following arguments. Otherwise it'd be just another one of the arguments.

    One possible cause is that you expected to have imported a constant into your namespace with use or import, but no such import took place; for example, your operating system may not support that particular constant. Hopefully you used an explicit import list for the constants you expect to see; please see use and import. An explicit import list would probably have caught this error earlier, but it naturally does not remedy the fact that your operating system still does not support that constant. Alternatively, you may have a typo in the import list of use or import, or in the constant name at the line where this error was triggered.

  • No command into which to pipe on command line

    (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '|' at the end of the command line, so it doesn't know where you want to pipe the output from this command.

  • No DB::DB routine defined

    (F) The currently executing code was compiled with the -d switch, but for some reason the current debugger (e.g. perl5db.pl or a Devel:: module) didn't define a routine to be called at the beginning of each statement.

  • No dbm on this machine

    (P) This is counted as an internal error, because every machine should supply dbm nowadays, because Perl comes with SDBM. See SDBM_File.

  • No DB::sub routine defined

    (F) The currently executing code was compiled with the -d switch, but for some reason the current debugger (e.g. perl5db.pl or a Devel:: module) didn't define a DB::sub routine to be called at the beginning of each ordinary subroutine call.

  • No directory specified for -I

    (F) The -I command-line switch requires a directory name as part of the same argument. Use -Ilib, for instance. -I lib won't work.

  • No error file after 2> or 2>> on command line

    (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '2>' or a '2>>' on the command line, but can't find the name of the file to which to write data destined for stderr.

  • No group ending character '%c' found in template

    (F) A pack or unpack template has an opening '(' or '[' without its matching counterpart. See pack.

  • No input file after < on command line

    (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '<' on the command line, but can't find the name of the file from which to read data for stdin.

  • No next::method '%s' found for %s

    (F) next::method found no further instances of this method name in the remaining packages of the MRO of this class. If you don't want it throwing an exception, use maybe::next::method or next::can. See mro.
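
    A minimal sketch of both behaviours follows. The Animal and Dog classes and their methods are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

package Animal;                 # hypothetical base class
use mro 'c3';
sub new { my $class = shift; return bless {}, $class }

package Dog;                    # hypothetical subclass
use mro 'c3';
our @ISA = ('Animal');

# Nothing later in Dog's MRO defines speak(), so next::method dies.
sub speak   { my $self = shift; return $self->next::method(@_) }

# maybe::next::method returns quietly when there is nothing to call.
sub whisper { my $self = shift; return $self->maybe::next::method(@_) }

package main;

my $dog = Dog->new;

my $spoke       = eval { $dog->speak; 1 };
my $speak_error = $spoke ? '' : $@;
my $whisper     = $dog->whisper;

print "speak failed: $speak_error" unless $spoke;
print "whisper returned nothing\n" unless defined $whisper;
```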

  • "no" not allowed in expression

    (F) The "no" keyword is recognized and executed at compile time, and returns no useful value. See perlmod.

  • No output file after > on command line

    (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a lone '>' at the end of the command line, so it doesn't know where you wanted to redirect stdout.

  • No output file after > or >> on command line

    (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '>' or a '>>' on the command line, but can't find the name of the file to which to write data destined for stdout.

  • No package name allowed for variable %s in "our"

    (F) Fully qualified variable names are not allowed in "our" declarations, because that doesn't make much sense under existing semantics. Such syntax is reserved for future extensions.

  • No Perl script found in input

    (F) You called perl -x, but no line was found in the file beginning with #! and containing the word "perl".

  • No setregid available

    (F) Configure didn't find anything resembling the setregid() call for your system.

  • No setreuid available

    (F) Configure didn't find anything resembling the setreuid() call for your system.

  • No such class field "%s" in variable %s of type %s

    (F) You tried to access a key from a hash through the indicated typed variable but that key is not allowed by the package of the same type. The indicated package has restricted the set of allowed keys using the fields pragma.

  • No such class %s

    (F) You provided a class qualifier in a "my", "our" or "state" declaration, but this class doesn't exist at this point in your program.

  • No such hook: %s

    (F) You specified a signal hook that was not recognized by Perl. Currently, Perl accepts __DIE__ and __WARN__ as valid signal hooks.

  • No such pipe open

    (P) An error peculiar to VMS. The internal routine my_pclose() tried to close a pipe which hadn't been opened. This should have been caught earlier as an attempt to close an unopened filehandle.

  • No such signal: SIG%s

    (W signal) You specified a signal name as a subscript to %SIG that was not recognized. Say kill -l in your shell to see the valid signal names on your system.
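
    The signal names Perl knows about are also available from the Config module. The sketch below captures the warning for a made-up signal name (NOSUCH) and shows one way to check a name first; it is an illustration, not the only approach.

```perl
use strict;
use warnings;
use Config;

# Names of the signals known to this perl, as recorded at build time.
my %known = map { $_ => 1 } split ' ', $Config{sig_name};

my @warnings;
{
    # Collect warnings instead of printing them.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    $SIG{NOSUCH} = sub { };     # NOSUCH is not a real signal name
}

for my $name ('INT', 'NOSUCH') {
    my $status = $known{$name} ? 'known' : 'unknown';
    print "SIG$name is $status\n";
}
```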

  • Not a CODE reference

    (F) Perl was trying to evaluate a reference to a code value (that is, a subroutine), but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See also perlref.

  • Not a GLOB reference

    (F) Perl was trying to evaluate a reference to a "typeglob" (that is, a symbol table entry that looks like *foo), but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See perlref.

  • Not a HASH reference

    (F) Perl was trying to evaluate a reference to a hash value, but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See perlref.

  • Not an ARRAY reference

    (F) Perl was trying to evaluate a reference to an array value, but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See perlref.
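
    The CODE, HASH and ARRAY variants of this error can be reproduced, and avoided with ref(), along the following lines. This is a sketch; call_if_code is a hypothetical helper written for the example.

```perl
use strict;
use warnings;

my $aref = [1, 2, 3];
my $href = { a => 1 };
my %error;

# Each eval dereferences a reference as the wrong type and records the error.
eval { $aref->();             1 } or $error{CODE}  = $@;
eval { my %copy = %$aref;     1 } or $error{HASH}  = $@;
eval { my @copy = @$href;     1 } or $error{ARRAY} = $@;

# Checking ref() first avoids the fatal error.
sub call_if_code {
    my ($thing) = @_;
    return ref($thing) eq 'CODE' ? $thing->() : undef;
}

print "$_: $error{$_}" for sort keys %error;
```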

  • Not an unblessed ARRAY reference

    (F) You passed a reference to a blessed array to push, shift or another array function. These only accept unblessed array references or arrays beginning explicitly with @.

  • Not a SCALAR reference

    (F) Perl was trying to evaluate a reference to a scalar value, but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See perlref.

  • Not a subroutine reference

    (F) Perl was trying to evaluate a reference to a code value (that is, a subroutine), but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See also perlref.

  • Not a subroutine reference in overload table

    (F) An attempt was made to specify an entry in an overloading table that doesn't somehow point to a valid subroutine. See overload.

  • Not enough arguments for %s

    (F) The function requires more arguments than you specified.

  • Not enough format arguments

    (W syntax) A format specified more picture fields than the next line supplied. See perlform.

  • %s: not found

    (A) You've accidentally run your script through the Bourne shell instead of Perl. Check the #! line, or manually feed your script into Perl yourself.

  • no UTC offset information; assuming local time is UTC

    (S) A warning peculiar to VMS. Perl was unable to find the local timezone offset, so it's assuming that local system time is equivalent to UTC. If it's not, define the logical name SYS$TIMEZONE_DIFFERENTIAL to translate to the number of seconds which need to be added to UTC to get local time.

  • Non-hex character in regex; marked by <-- HERE in m/%s/

    (F) In a regular expression, there was a non-hexadecimal character where a hex one was expected, like

        (?[ [ \xDG ] ])
        (?[ [ \x{DEKA} ] ])

  • Non-octal character '%c'. Resolved as "%s"

    (W digit) In parsing an octal numeric constant, a character was unexpectedly encountered that isn't octal. The resulting value is as indicated.

  • Non-octal character in regex; marked by <-- HERE in m/%s/

    (F) In a regular expression, there was a non-octal character where an octal one was expected, like

        (?[ [ \o{1278} ] ])

  • Non-string passed as bitmask

    (W misc) A number has been passed as a bitmask argument to select(). Use the vec() function to construct the file descriptor bitmasks for select. See select.
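
    A bitmask built with vec() looks roughly like this. The sketch assumes, purely for illustration, that descriptors 0 and 2 are the ones of interest.

```perl
use strict;
use warnings;

# Build the read set as a string, one bit per file descriptor.
# Assumption for this example: descriptors 0 and 2 are of interest.
my $rin = '';
vec($rin, 0, 1) = 1;
vec($rin, 2, 1) = 1;

# One character per bit, lowest descriptor first.
my $bits = unpack('b*', $rin);
print "mask bits: $bits\n";

# Pass a copy so that $rin keeps the original set; a timeout of 0 polls.
my $nfound = select(my $rout = $rin, undef, undef, 0);
```

    Passing a plain number such as 5 instead of $rin is what triggers the warning.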

  • (?[...]) not valid in locale in regex; marked by <-- HERE in m/%s/

    (F) (?[...]) cannot be used within the scope of a use locale or with an /l regular expression modifier, as that would require deferring to run-time the calculation of what it should evaluate to, and it is regex compile-time only.

  • Null filename used

    (F) You can't require the null filename, especially because on many machines that means the current directory! See require.

  • NULL OP IN RUN

    (S debugging) Some internal routine called run() with a null opcode pointer.

  • Null picture in formline

    (F) The first argument to formline must be a valid format picture specification. It was found to be empty, which probably means you supplied it an uninitialized value. See perlform.

  • Null realloc

    (P) An attempt was made to realloc NULL.

  • NULL regexp argument

    (P) The internal pattern matching routines blew it big time.

  • NULL regexp parameter

    (P) The internal pattern matching routines are out of their gourd.

  • Number too long

    (F) Perl limits the representation of decimal numbers in programs to about 250 characters. You've exceeded that length. Future versions of Perl are likely to eliminate this arbitrary limitation. In the meantime, try using scientific notation (e.g. "1e6" instead of "1_000_000").

  • Number with no digits

    (F) Perl was looking for a number but found nothing that looked like a number. This happens, for example, with \o{} when there is no number between the braces.

  • "my %s" used in sort comparison

    (W syntax) The package variables $a and $b are used for sort comparisons. You used $a or $b as an operand to the <=> or cmp operator inside a sort comparison block, and the variable had earlier been declared as a lexical variable. Either qualify the sort variable with the package name, or rename the lexical variable.

  • Octal number > 037777777777 non-portable

    (W portable) The octal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See perlport for more on portability concerns.

  • Odd number of arguments for overload::constant

    (W overload) The call to overload::constant contained an odd number of arguments. The arguments should come in pairs.

  • Odd number of elements in anonymous hash

    (W misc) You specified an odd number of elements to initialize a hash, which is odd, because hashes come in key/value pairs.

  • Odd number of elements in hash assignment

    (W misc) You specified an odd number of elements to initialize a hash, which is odd, because hashes come in key/value pairs.
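
    Both this warning and the anonymous-hash variant above can be observed at run time. The following is a sketch with made-up data; the final check shows one way to catch the problem before assigning.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Three elements: the key 'version' has no value.
my @pairs = (name => 'perl', 'version');

my %config = @pairs;        # hash assignment
my $anon   = { @pairs };    # anonymous hash

# The dangling key is still stored, with an undefined value.
print "version exists but is undefined\n"
    if exists $config{version} && !defined $config{version};

# Guarding against the problem before assigning:
print "list has an odd number of elements\n" if @pairs % 2;
```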

  • Offset outside string

    (F)(W layer) You tried to do a read/write/send/recv/seek operation with an offset pointing outside the buffer. This is difficult to imagine. The sole exceptions to this are that zero padding will take place when going past the end of the string when either sysread()ing a file, or when seeking past the end of a scalar opened for I/O (in anticipation of future reads and to imitate the behaviour with real files).

  • %s() on unopened %s

    (W unopened) An I/O operation was attempted on a filehandle that was never initialized. You need to do an open(), a sysopen(), or a socket() call, or call a constructor from the FileHandle package.

  • -%s on unopened filehandle %s

    (W unopened) You tried to invoke a file test operator on a filehandle that isn't open. Check your control flow. See also -X.

  • Strings with code points over 0xFF may not be mapped into in-memory file handles

    (W utf8) You tried to open a reference to a scalar for read or append where the scalar contained code points over 0xFF. In-memory files model on-disk files and can only contain bytes.

  • oops: oopsAV

    (S internal) An internal warning that the grammar is screwed up.

  • oops: oopsHV

    (S internal) An internal warning that the grammar is screwed up.

  • Opening dirhandle %s also as a file

    (D io, deprecated) You used open() to associate a filehandle to a symbol (glob or scalar) that already holds a dirhandle. Although legal, this idiom might render your code confusing and is deprecated.

  • Opening filehandle %s also as a directory

    (D io, deprecated) You used opendir() to associate a dirhandle to a symbol (glob or scalar) that already holds a filehandle. Although legal, this idiom might render your code confusing and is deprecated.

  • Operand with no preceding operator in regex; marked by <-- HERE in m/%s/

    (F) You wrote something like

        (?[ \p{Digit} \p{Thai} ])

    There are two operands, but no operator giving how you want to combine them.

  • Operation "%s": no method found, %s

    (F) An attempt was made to perform an overloaded operation for which no handler was defined. While some handlers can be autogenerated in terms of other handlers, there is no default handler for any operation, unless the fallback overloading key is specified to be true. See overload.
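
    For example, a class that overloads only + cannot be used with -. The Meters class below is hypothetical and serves only to reproduce the message.

```perl
use strict;
use warnings;

package Meters;                 # hypothetical class overloading only '+'
use overload '+' => \&add;

sub new { my ($class, $n) = @_; return bless { n => $n }, $class }

sub add {
    my ($left, $right) = @_;
    my $other = ref $right ? $right->{n} : $right;
    return Meters->new($left->{n} + $other);
}

package main;

my $x = Meters->new(3);
my $y = Meters->new(4);

my $sum = $x + $y;              # handled by Meters::add

# '-' has no handler and cannot be autogenerated from '+'.
my $ok  = eval { my $diff = $x - $y; 1 };
my $err = $ok ? '' : $@;

print "sum: $sum->{n}\n";
print "subtraction failed: $err" unless $ok;
```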

  • Operation "%s" returns its argument for non-Unicode code point 0x%X

    (S utf8, non_unicode) You performed an operation requiring Unicode semantics on a code point that is not in Unicode, so what it should do is not defined. Perl has chosen to have it do nothing, and warn you.

    If the operation shown is "ToFold", it means that case-insensitive matching in a regular expression was done on the code point.

    If you know what you are doing you can turn off this warning by no warnings 'non_unicode';.

  • Operation "%s" returns its argument for UTF-16 surrogate U+%X

    (S utf8, surrogate) You performed an operation requiring Unicode semantics on a Unicode surrogate. Unicode frowns upon the use of surrogates for anything but storing strings in UTF-16, but semantics are (reluctantly) defined for the surrogates, and they are to do nothing for this operation. Because the use of surrogates can be dangerous, Perl warns.

    If the operation shown is "ToFold", it means that case-insensitive matching in a regular expression was done on the code point.

    If you know what you are doing you can turn off this warning by no warnings 'surrogate';.

  • Operator or semicolon missing before %s

    (S ambiguous) You used a variable or subroutine call where the parser was expecting an operator. The parser has assumed you really meant to use an operator, but this is highly likely to be incorrect. For example, if you say "*foo *foo" it will be interpreted as if you said "*foo * 'foo'".

  • "our" variable %s redeclared

    (W misc) You seem to have already declared the same global once before in the current lexical scope.

  • Out of memory!

    (X) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. Perl has no option but to exit immediately.

    At least in Unix you may be able to get past this by increasing your process datasize limits: in csh/tcsh use limit and limit datasize n (where n is the number of kilobytes) to check the current limits and change them, and in ksh/bash/zsh use ulimit -a and ulimit -d n, respectively.

  • Out of memory during %s extend

    (X) An attempt was made to extend an array, a list, or a string beyond the largest possible memory allocation.

  • Out of memory during "large" request for %s

    (F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. However, the request was judged large enough (the compile-time default is 64K) that the error can be trapped, giving the program a chance to shut down.

  • Out of memory during request for %s

    (X)(F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request.

    The request was judged to be small, so the possibility to trap it depends on the way perl was compiled. By default it is not trappable. However, if compiled for this, Perl may use the contents of $^M as an emergency pool after die()ing with this message. In this case the error is trappable once, and the error message will include the line and file where the failed request happened.

  • Out of memory during ridiculously large request

    (F) You can't allocate more than 2^31+"small amount" bytes. This error is most likely to be caused by a typo in the Perl program, e.g., $arr[time] instead of $arr[$time].

  • Out of memory for yacc stack

    (F) The yacc parser wanted to grow its stack so it could continue parsing, but realloc() wouldn't give it more memory, virtual or otherwise.

  • '.' outside of string in pack

    (F) The argument to a '.' in your template tried to move the working position to before the start of the packed string being built.

  • '@' outside of string in unpack

    (F) You had a template that specified an absolute position outside the string being unpacked. See pack.
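
    A short sketch of a template that stays inside the string and one that does not; the three-byte record is invented for the example.

```perl
use strict;
use warnings;

my $record = 'abc';

# '@1' moves to offset 1, which is inside the string.
my @inside = unpack('@1 a2', $record);

# '@5' points past the end of a three-byte string and is fatal.
my $ok  = eval { my @fields = unpack('@5 a1', $record); 1 };
my $err = $ok ? '' : $@;

print "inside: $inside[0]\n";
print "outside failed: $err" unless $ok;
```

    Comparing the offset with length($record) before unpacking avoids the error.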

  • '@' outside of string with malformed UTF-8 in unpack

    (F) You had a template that specified an absolute position outside the string being unpacked. The string being unpacked was also invalid UTF-8. See pack.

  • overload arg '%s' is invalid

    (W overload) The overload pragma was passed an argument it did not recognize. Did you mistype an operator?

  • Overloaded dereference did not return a reference

    (F) An object with an overloaded dereference operator was dereferenced, but the overloaded operation did not return a reference. See overload.

  • Overloaded qr did not return a REGEXP

    (F) An object with a qr overload was used as part of a match, but the overloaded operation didn't return a compiled regexp. See overload.

  • %s package attribute may clash with future reserved word: %s

    (W reserved) A lowercase attribute name was used that had a package-specific handler. That name might have a meaning to Perl itself some day, even though it doesn't yet. Perhaps you should use a mixed-case attribute name, instead. See attributes.

  • pack/unpack repeat count overflow

    (F) You can't specify a repeat count so large that it overflows your signed integers. See pack.

  • page overflow

    (W io) A single call to write() produced more lines than can fit on a page. See perlform.

  • panic: %s

    (P) An internal error.

  • panic: attempt to call %s in %s

    (P) One of the file test operators entered a code branch that calls an ACL-related function, but that function is not available on this platform. Earlier checks mean that it should not be possible to enter this branch on this platform.

  • panic: child pseudo-process was never scheduled

    (P) A child pseudo-process in the ithreads implementation on Windows was not scheduled within the time period allowed and therefore was not able to initialize properly.

  • panic: ck_grep, type=%u

    (P) Failed an internal consistency check trying to compile a grep.

  • panic: ck_split, type=%u

    (P) Failed an internal consistency check trying to compile a split.

  • panic: corrupt saved stack index %ld

    (P) The savestack was requested to restore more localized values than there are in the savestack.

  • panic: del_backref

    (P) Failed an internal consistency check while trying to reset a weak reference.

  • panic: die %s

    (P) We popped the context stack to an eval context, and then discovered it wasn't an eval context.

  • panic: do_subst

    (P) The internal pp_subst() routine was called with invalid operational data.

  • panic: do_trans_%s

    (P) The internal do_trans routines were called with invalid operational data.

  • panic: fold_constants JMPENV_PUSH returned %d

    (P) While attempting to fold constants, an exception other than an eval failure was caught.

  • panic: frexp

    (P) The library function frexp() failed, making printf("%f") impossible.

  • panic: goto, type=%u, ix=%ld

    (P) We popped the context stack to a context with the specified label, and then discovered it wasn't a context we know how to do a goto in.

  • panic: gp_free failed to free glob pointer

    (P) The internal routine used to clear a typeglob's entries tried repeatedly, but each time something re-created entries in the glob. Most likely the glob contains an object with a reference back to the glob and a destructor that adds a new object to the glob.

  • panic: INTERPCASEMOD, %s

    (P) The lexer got into a bad state at a case modifier.

  • panic: INTERPCONCAT, %s

    (P) The lexer got into a bad state parsing a string with brackets.

  • panic: kid popen errno read

    (F) A forked child returned an incomprehensible message about its errno.

  • panic: last, type=%u

    (P) We popped the context stack to a block context, and then discovered it wasn't a block context.

  • panic: leave_scope clearsv

    (P) A writable lexical variable became read-only somehow within the scope.

  • panic: leave_scope inconsistency %u

    (P) The savestack probably got out of sync. At least, there was an invalid enum on the top of it.

  • panic: magic_killbackrefs

    (P) Failed an internal consistency check while trying to reset all weak references to an object.

  • panic: malloc, %s

    (P) Something requested a negative number of bytes of malloc.

  • panic: memory wrap

    (P) Something tried to allocate more memory than possible.

  • panic: pad_alloc, %p!=%p

    (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from.

  • panic: pad_free curpad, %p!=%p

    (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from.

  • panic: pad_free po

    (P) An invalid scratch pad offset was detected internally.

  • panic: pad_reset curpad, %p!=%p

    (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from.

  • panic: pad_sv po

    (P) An invalid scratch pad offset was detected internally.

  • panic: pad_swipe curpad, %p!=%p

    (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from.

  • panic: pad_swipe po

    (P) An invalid scratch pad offset was detected internally.

  • panic: pp_iter, type=%u

    (P) The foreach iterator got called in a non-loop context frame.

  • panic: pp_match%s

    (P) The internal pp_match() routine was called with invalid operational data.

  • panic: pp_split, pm=%p, s=%p

    (P) Something terrible went wrong in setting up for the split.

  • panic: realloc, %s

    (P) Something requested a negative number of bytes of realloc.

  • panic: reference miscount on nsv in sv_replace() (%d != 1)

    (P) The internal sv_replace() function was handed a new SV with a reference count other than 1.

  • panic: restartop in %s

    (P) Some internal routine requested a goto (or something like it), and didn't supply the destination.

  • panic: return, type=%u

    (P) We popped the context stack to a subroutine or eval context, and then discovered it wasn't a subroutine or eval context.

  • panic: scan_num, %s

    (P) scan_num() got called on something that wasn't a number.

  • panic: Sequence (?{...}): no code block found

    (P) While compiling a pattern that has embedded (?{}) or (??{}) code blocks, perl couldn't locate the code block that should have already been seen and compiled by perl before control passed to the regex compiler.

  • panic: sv_chop %s

    (P) The sv_chop() routine was passed a position that is not within the scalar's string buffer.

  • panic: sv_insert, midend=%p, bigend=%p

    (P) The sv_insert() routine was told to remove more string than there was string.

  • panic: strxfrm() gets absurd - a => %u, ab => %u

    (P) The interpreter's sanity check of the C function strxfrm() failed. In your current locale the returned transformation of the string "ab" is shorter than that of the string "a", which makes no sense.

  • panic: top_env

    (P) The compiler attempted to do a goto, or something weird like that.

  • panic: unimplemented op %s (#%d) called

    (P) The compiler is screwed up and attempted to use an op that isn't permitted at run time.

  • panic: utf16_to_utf8: odd bytelen

    (P) Something tried to call utf16_to_utf8 with an odd (as opposed to even) byte length.

  • panic: utf16_to_utf8_reversed: odd bytelen

    (P) Something tried to call utf16_to_utf8_reversed with an odd (as opposed to even) byte length.

  • panic: yylex, %s

    (P) The lexer got into a bad state while processing a case modifier.

  • Parentheses missing around "%s" list

    (W parenthesis) You said something like

        my $foo, $bar = @_;

    when you meant

        my ($foo, $bar) = @_;

    Remember that "my", "our", "local" and "state" bind tighter than comma.

  • Parsing code internal error (%s)

    (F) Parsing code supplied by an extension violated the parser's API in a detectable way.

  • Passing malformed UTF-8 to "%s" is deprecated

    (D deprecated, utf8) This message indicates a bug either in the Perl core or in XS code. Such code was trying to find out if a character, allegedly stored internally encoded as UTF-8, was of a given type, such as being punctuation or a digit. But the character was not encoded in legal UTF-8. The %s is replaced by a string that can be used by knowledgeable people to determine what the type being checked against was. If utf8 warnings are enabled, a further message is raised, giving details of the malformation.

  • Pattern subroutine nesting without pos change exceeded limit in regex; marked by <-- HERE in m/%s/

    (F) You used a pattern that uses too many nested subpattern calls without consuming any text. Restructure the pattern so text is consumed before the nesting limit is exceeded.

    The <-- HERE shows whereabouts in the regular expression the problem was discovered.

  • -p destination: %s

    (F) An error occurred during the implicit output invoked by the -p command-line switch. (This output goes to STDOUT unless you've redirected it with select().)

  • (perhaps you forgot to load "%s"?)

    (F) This is an educated guess made in conjunction with the message "Can't locate object method \"%s\" via package \"%s\"". It often means that a method requires a package that has not been loaded.

  • Perl folding rules are not up-to-date for 0x%X; please use the perlbug utility to report; in regex; marked by <-- HERE in m/%s/

    (D regexp, deprecated) You used a regular expression with case-insensitive matching, and there is a bug in Perl in which the built-in regular expression folding rules are not accurate. This may lead to incorrect results. Please report this as a bug using the perlbug utility. (This message is marked deprecated so that it will be turned on by default.)

  • Perl_my_%s() not available

    (F) Your platform has very uncommon byte-order and integer size, so it was not possible to set up some or all fixed-width byte-order conversion functions. This is only a problem when you're using the '<' or '>' modifiers in (un)pack templates. See pack.

  • Perl %s required (did you mean %s?)--this is only %s, stopped

    (F) The code you are trying to run has asked for a newer version of Perl than you are running. Perhaps use 5.10 was written instead of use 5.010 or use v5.10. Without the leading v, the number is interpreted as a decimal, with every three digits after the decimal point representing a part of the version number. So 5.10 is equivalent to v5.100.
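
    The decimal interpretation can be checked with the version module, which is assumed here to follow the same rules as use VERSION. This is only an illustrative sketch.

```perl
use strict;
use warnings;
use version;

# Decimal form: each group of three digits after the point is one component,
# so "5.10" is read as 5.100.
my $decimal = version->parse('5.10')->normal;

# Zero-padded decimal and dotted forms both mean Perl 5, version 10.
my $padded  = version->parse('5.010')->normal;
my $dotted  = version->parse('v5.10')->normal;

print "5.10  is $decimal\n";
print "5.010 is $padded\n";
print "v5.10 is $dotted\n";
```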

  • Perl %s required--this is only version %s, stopped

    (F) The module in question uses features of a version of Perl more recent than the currently running version. How long has it been since you upgraded, anyway? See require.

  • PERL_SH_DIR too long

    (F) An error peculiar to OS/2. PERL_SH_DIR is the directory to find the sh-shell in. See "PERL_SH_DIR" in perlos2.

  • PERL_SIGNALS illegal: "%s"

    (X) See PERL_SIGNALS in perlrun for legal values.

  • Perls since %s too modern--this is %s, stopped

    (F) The code you are trying to run claims it will not run on the version of Perl you are using because it is too new. Maybe the code needs to be updated, or maybe it is simply wrong and the version check should just be removed.

  • perl: warning: Setting locale failed.

    (S) The whole warning message will look something like:

        perl: warning: Setting locale failed.
        perl: warning: Please check that your locale settings:
                LC_ALL = "En_US",
                LANG = (unset)
            are supported and installed on your system.
        perl: warning: Falling back to the standard locale ("C").

    Exactly which locale settings failed varies. In the example above, LC_ALL was "En_US" and LANG had no value. This error means that Perl detected that you and/or your operating system supplier and/or system administrator have set up the so-called locale system but Perl could not use those settings. Fortunately this is not fatal: there is a "default locale" called "C" that Perl can and will use, and the script will be run. Until you fix the problem, however, you will get the same error message each time you run Perl. How to fix the problem is described in the LOCALE PROBLEMS section of perllocale.

  • perl: warning: Non hex character in '$ENV{PERL_HASH_SEED}', seed only partially set

    (W) PERL_HASH_SEED should match /^\s*(?:0x)?[0-9a-fA-F]+\s*\z/ but it contained a non hex character. This could mean you are not using the hash seed you think you are.
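
    A candidate value can be checked against that pattern before it is exported. The sketch below uses a hypothetical helper name, is_valid_hash_seed, which is not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the string matches the pattern
# quoted above for PERL_HASH_SEED, and 0 otherwise.
sub is_valid_hash_seed {
    my ($seed) = @_;
    return $seed =~ /^\s*(?:0x)?[0-9a-fA-F]+\s*\z/ ? 1 : 0;
}

for my $candidate ('0xDEADBEEF', '12G4') {
    printf "%-12s %s\n", $candidate,
        is_valid_hash_seed($candidate) ? 'ok' : 'contains a non hex character';
}
```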

  • perl: warning: strange setting in '$ENV{PERL_PERTURB_KEYS}': '%s'

    (W) Perl was run with the environment variable PERL_PERTURB_KEYS defined but containing an unexpected value. The legal values of this setting are as follows.

    1. Numeric | String        | Result
    2. --------+---------------+-----------------------------------------
    3. 0       | NO            | Disables key traversal randomization
    4. 1       | RANDOM        | Enables full key traversal randomization
    5. 2       | DETERMINISTIC | Enables repeatable key traversal randomization

    Both numeric and string values are accepted, but note that string values are case sensitive. The default for this setting is "RANDOM" or 1.

  • pid %x not a child

    (W exec) A warning peculiar to VMS. Waitpid() was asked to wait for a process which isn't a subprocess of the current process. While this is fine from VMS' perspective, it's probably not what you intended.

  • 'P' must have an explicit size in unpack

    (F) The unpack format P must have an explicit size, not "*".

  • POSIX class [:%s:] unknown in regex; marked by <-- HERE in m/%s/

    (F) The class in the character class [: :] syntax is unknown. The <-- HERE shows whereabouts in the regular expression the problem was discovered. Note that the POSIX character classes do not have the is prefix the corresponding C interfaces have: in other words, it's [[:print:]], not isprint. See perlre.

  • POSIX getpgrp can't take an argument

    (F) Your system has POSIX getpgrp(), which takes no argument, unlike the BSD version, which takes a pid.

  • POSIX syntax [%c %c] belongs inside character classes in regex; marked by <-- HERE in m/%s/

    (W regexp) The character class constructs [: :], [= =], and [. .] go inside character classes, the [] are part of the construct, for example: /[012[:alpha:]345]/. Note that [= =] and [. .] are not currently implemented; they are simply placeholders for future extensions and will cause fatal errors. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • POSIX syntax [. .] is reserved for future extensions in regex; marked by <-- HERE in m/%s/

    (F) Within regular expression character classes ([]) the syntax beginning with "[." and ending with ".]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[." and ".\]". The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • POSIX syntax [= =] is reserved for future extensions in regex; marked by <-- HERE in m/%s/

    (F) Within regular expression character classes ([]) the syntax beginning with "[=" and ending with "=]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[=" and "=\]". The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Possible attempt to put comments in qw() list

    (W qw) qw() lists contain items separated by whitespace; as with literal strings, comment characters are not ignored, but are instead treated as literal data. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.)

    You probably wrote something like this:

    1. @list = qw(
    2. a # a comment
    3. b # another comment
    4. );

    when you should have written this:

    1. @list = qw(
    2. a
    3. b
    4. );

    If you really want comments, build your list the old-fashioned way, with quotes and commas:

    1. @list = (
    2. 'a', # a comment
    3. 'b', # another comment
    4. );

  • Possible attempt to separate words with commas

    (W qw) qw() lists contain items separated by whitespace; therefore commas aren't needed to separate the items. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.)

    You probably wrote something like this:

    1. qw! a, b, c !;

    which puts literal commas into some of the list items. Write it without commas if you don't want them to appear in your data:

    1. qw! a b c !;

  • Possible memory corruption: %s overflowed 3rd argument

    (F) An ioctl() or fcntl() returned more than Perl was bargaining for. Perl guesses a reasonable buffer size, but puts a sentinel byte at the end of the buffer just in case. This sentinel byte got clobbered, and Perl assumes that memory is now corrupted. See ioctl.

  • Possible precedence problem on bitwise %c operator

    (W precedence) Your program uses a bitwise logical operator in conjunction with a numeric comparison operator, like this:

    1. if ($x & $y == 0) { ... }

    This expression is actually equivalent to $x & ($y == 0), due to the higher precedence of ==. This is probably not what you want. (If you really meant to write this, disable the warning, or, better, put the parentheses explicitly and write $x & ($y == 0)).
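
    The two readings give different answers for ordinary inputs. A small sketch with illustrative values, using explicit parentheses in both cases so that no warning is raised:

```perl
use strict;
use warnings;

my ($x, $y) = (6, 0);

# How Perl parses "$x & $y == 0": the comparison happens first.
my $as_parsed = $x & ($y == 0);      # 6 & 1

# What was most likely intended: test the result of the bitwise AND.
my $intended  = ($x & $y) == 0;      # (6 & 0) == 0

print "as parsed: ", ($as_parsed ? "true" : "false"), "\n";
print "intended:  ", ($intended  ? "true" : "false"), "\n";
```

    With these values the parsed form is false and the intended form is true.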

  • Possible unintended interpolation of $\ in regex

    (W ambiguous) You said something like m/$\/ in a regex. The regex m/foo$\s+bar/m translates to: match the word 'foo', the output record separator (see $\ in perlvar) and the letter 's' (one time or more) followed by the word 'bar'.

    If this is what you intended then you can silence the warning by using m/${\}/ (for example: m/foo${\}s+bar/).

    If instead you intended to match the word 'foo' at the end of the line followed by whitespace and the word 'bar' on the next line then you can use m/$(?)\/ (for example: m/foo$(?)\s+bar/).

  • Possible unintended interpolation of %s in string

    (W ambiguous) You said something like '@foo' in a double-quoted string but there was no array @foo in scope at the time. If you wanted a literal @foo, then write it as \@foo; otherwise find out what happened to the array you apparently lost track of.
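
    Both cases can be seen side by side in a short sketch (the variable names and address are made up for illustration):

```perl
use strict;
use warnings;

my @hosts = ('alpha', 'beta');

# A backslash keeps the @ literal inside a double-quoted string.
my $literal      = "admin\@example.com";

# An array that is in scope is interpolated, joined with $" (a space by default).
my $interpolated = "hosts: @hosts";

print "$literal\n$interpolated\n";
```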

  • Precedence problem: open %s should be open(%s)

    (S precedence) The old irregular construct

    1. open FOO || die;

    is now misinterpreted as

    1. open(FOO || die);

    because of the strict regularization of Perl 5's grammar into unary and list operators. (The old open was a little of both.) You must put parentheses around the filehandle, or use the new "or" operator instead of "||".
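
    Either fix can be written as follows. This is a sketch only; the helper names and the path are hypothetical, and lexical filehandles are used instead of the bareword FOO.

```perl
use strict;
use warnings;

# Low-precedence "or" applies to the whole open call, no parentheses needed.
sub can_read_with_or {
    my ($path) = @_;
    open my $fh, '<', $path or return 0;
    close $fh;
    return 1;
}

# With "||", the arguments to open must be parenthesized.
sub can_read_with_parens {
    my ($path) = @_;
    open(my $fh, '<', $path) || return 0;
    close $fh;
    return 1;
}

# Assumed not to exist on the machine running this sketch.
my $missing = '/no/such/dir/no-such-file';
print can_read_with_or($missing), can_read_with_parens($missing), "\n";
```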

  • Premature end of script headers

    See Server error.

  • printf() on closed filehandle %s

    (W closed) The filehandle you're writing to got itself closed sometime before now. Check your control flow.

  • print() on closed filehandle %s

    (W closed) The filehandle you're printing on got itself closed sometime before now. Check your control flow.

  • Process terminated by SIG%s

    (W) This is a standard message issued by OS/2 applications, while *nix applications die in silence. It is considered a feature of the OS/2 port. One can easily disable this with appropriate signal handlers; see Signals in perlipc. See also "Process terminated by SIGTERM/SIGINT" in perlos2.

  • Property '%s' is unknown in regex; marked by <-- HERE in m/%s/

    (F) The named property which you specified via \p or \P is not one known to Perl. Perhaps you misspelled the name? See Properties accessible through \p{} and \P{} in perluniprops for a complete list of available official properties. If it is a user-defined property it must have been defined by the time the regular expression is compiled.

  • Prototype after '%c' for %s : %s

    (W illegalproto) A character follows % or @ in a prototype. This is useless, since % and @ gobble the rest of the subroutine arguments.

  • Prototype mismatch: %s vs %s

    (S prototype) The subroutine being declared or defined had previously been declared or defined with a different function prototype.

  • Prototype not terminated

    (F) You've omitted the closing parenthesis in a function prototype definition.

  • \p{} uses Unicode rules, not locale rules

    (W) You compiled a regular expression that contained a Unicode property match (\p or \P), but the regular expression is also being told to use the run-time locale, not Unicode. Instead, use a POSIX character class, which should know about the locale's rules. (See POSIX Character Classes in perlrecharclass.)

    Even if the run-time locale is ISO 8859-1 (Latin1), which is a subset of Unicode, some properties will give results that are not valid for that subset.

    Here are a couple of examples to help you see what's going on. If the locale is ISO 8859-7, the character at code point 0xD7 is the "GREEK CAPITAL LETTER CHI". But in Unicode that code point means the "MULTIPLICATION SIGN" instead, and \p always uses the Unicode meaning. That means that \p{Alpha} won't match, but [[:alpha:]] should. Only in the Latin1 locale are all the characters in the same positions as they are in Unicode. But, even here, some properties give incorrect results. An example is \p{Changes_When_Uppercased} which is true for "LATIN SMALL LETTER Y WITH DIAERESIS", but since the upper case of that character is not in Latin1, in that locale it doesn't change when upper cased.

  • Quantifier {n,m} with n > m can't match in regex

    (W regexp) Minima should be less than or equal to maxima. If you really want your regexp to match something 0 times, just put {0}.

  • Quantifier follows nothing in regex; marked by <-- HERE in m/%s/

    (F) You started a regular expression with a quantifier. Backslash it if you meant it literally. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Quantifier in {,} bigger than %d in regex; marked by <-- HERE in m/%s/

    (F) There is currently a limit to the size of the min and max values of the {min,max} construct. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Quantifier unexpected on zero-length expression in regex; marked by <-- HERE in m/%s/

    (W regexp) You applied a regular expression quantifier in a place where it makes no sense, such as on a zero-width assertion. Try putting the quantifier inside the assertion instead. For example, the way to match "abc" provided that it is followed by three repetitions of "xyz" is /abc(?=(?:xyz){3})/, not /abc(?=xyz){3}/.

    The <-- HERE shows whereabouts in the regular expression the problem was discovered.
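
    The corrected form, with the quantifier inside the lookahead, can be exercised like this (a small sketch; the sample strings are made up):

```perl
use strict;
use warnings;

# The {3} applies to the group inside the lookahead, not to the assertion.
my $re = qr/abc(?=(?:xyz){3})/;

my $three_repeats = 'abcxyzxyzxyz' =~ $re ? 1 : 0;
my $two_repeats   = 'abcxyzxyz'    =~ $re ? 1 : 0;

print "three repetitions: $three_repeats, two repetitions: $two_repeats\n";
```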

  • Quantifier {n,m} with n > m can't match in regex; marked by <-- HERE in m/%s/

    (W regexp) Minima should be less than or equal to maxima. If you really want your regexp to match something 0 times, just put {0}.

  • Range iterator outside integer range

    (F) One (or both) of the numeric arguments to the range operator ".." are outside the range which can be represented by integers internally. One possible workaround is to force Perl to use magical string increment by prepending "0" to your numbers.

  • readdir() attempted on invalid dirhandle %s

    (W io) The dirhandle you're reading from is either closed or not really a dirhandle. Check your control flow.

  • readline() on closed filehandle %s

    (W closed) The filehandle you're reading from got itself closed sometime before now. Check your control flow.

  • read() on closed filehandle %s

    (W closed) You tried to read from a closed filehandle.

  • read() on unopened filehandle %s

    (W unopened) You tried to read from a filehandle that was never opened.

  • Reallocation too large: %x

    (F) You can't allocate more than 64K on an MS-DOS machine.

  • realloc() of freed memory ignored

    (S malloc) An internal routine called realloc() on something that had already been freed.

  • Recompile perl with -DDEBUGGING to use -D switch

    (S debugging) You can't use the -D option unless the code to produce the desired output is compiled into Perl, which entails some overhead, which is why it's currently left out of your copy.

  • Recursive call to Perl_load_module in PerlIO_find_layer

    (P) It is currently not permitted to load modules when creating a filehandle inside an %INC hook. This can happen with open my $fh, '<', \$scalar, which implicitly loads PerlIO::scalar. Try loading PerlIO::scalar explicitly first.

  • Recursive inheritance detected in package '%s'

    (F) While calculating the method resolution order (MRO) of a package, Perl believes it found an infinite loop in the @ISA hierarchy. This is a crude check that bails out after 100 levels of @ISA depth.

  • refcnt_dec: fd %d%s
  • refcnt: fd %d%s
  • refcnt_inc: fd %d%s

    (P) Perl's I/O implementation failed an internal consistency check. If you see this message, something is very wrong.

  • Reference found where even-sized list expected

    (W misc) You gave a single reference where Perl was expecting a list with an even number of elements (for assignment to a hash). This usually means that you used the anon hash constructor when you meant to use parens. In any case, a hash requires key/value pairs.

    1. %hash = { one => 1, two => 2, }; # WRONG
    2. %hash = [ qw/ an anon array / ]; # WRONG
    3. %hash = ( one => 1, two => 2, ); # right
    4. %hash = qw( one 1 two 2 ); # also fine

  • Reference is already weak

    (W misc) You have attempted to weaken a reference that is already weak. Doing so has no effect.

  • Reference to invalid group 0 in regex; marked by <-- HERE in m/%s/

    (F) You used \g0 or similar in a regular expression. You may refer to capturing parentheses only with strictly positive integers (normal backreferences) or with strictly negative integers (relative backreferences). Using 0 does not make sense.

  • Reference to nonexistent group in regex; marked by <-- HERE in m/%s/

    (F) You used something like \7 in your regular expression, but there are not at least seven sets of capturing parentheses in the expression. If you wanted to have the character with ordinal 7 inserted into the regular expression, prepend zeroes to make it three digits long: \007

    The <-- HERE shows whereabouts in the regular expression the problem was discovered.

  • Reference to nonexistent named group in regex; marked by <-- HERE in m/%s/

    (F) You used something like \k'NAME' or \k<NAME> in your regular expression, but there is no corresponding named capturing parentheses such as (?'NAME'...) or (?<NAME>...). Check if the name has been spelled correctly both in the backreference and the declaration.

    The <-- HERE shows whereabouts in the regular expression the problem was discovered.
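
    For comparison, a pattern in which the named backreference and its declaration agree might look like the sketch below. The helper name doubled_word and the group name word are illustrative choices.

```perl
use strict;
use warnings;

# Returns the first word that is immediately repeated, or undef if none is.
# \k<word> refers back to the group declared as (?<word>...).
sub doubled_word {
    my ($text) = @_;
    return $text =~ /\b(?<word>\w+)\s+\k<word>\b/ ? $+{word} : undef;
}

my $found = doubled_word('it is is fine');
print defined $found ? "repeated: $found\n" : "no repeated word\n";
```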

  • Reference to nonexistent or unclosed group in regex; marked by <-- HERE in m/%s/

    (F) You used something like \g{-7} in your regular expression, but there are not at least seven sets of closed capturing parentheses in the expression before where the \g{-7} was located.

    The <-- HERE shows whereabouts in the regular expression the problem was discovered.
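
    A relative backreference is valid only when enough closed groups precede it. A minimal sketch, using a hypothetical helper name, in which \g{-1} refers to the one group before it:

```perl
use strict;
use warnings;

# Returns the first character that appears twice in a row, or undef.
# \g{-1} means "the most recently closed capturing group".
sub first_doubled_char {
    my ($text) = @_;
    return $text =~ /(\w)\g{-1}/ ? $1 : undef;
}

my $char = first_doubled_char('book');
print defined $char ? "doubled: $char\n" : "no doubled character\n";
```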

  • regexp memory corruption

    (P) The regular expression engine got confused by what the regular expression compiler gave it.

  • Regexp modifier "/%c" may appear a maximum of twice
  • Regexp modifier "/%c" may not appear twice

    (F syntax, regexp) The regular expression pattern had too many occurrences of the specified modifier. Remove the extraneous ones.

  • Regexp modifier "%c" may not appear after the "-" in regex; marked by <-- HERE in m/%s/

    (F) Turning off the given modifier has the side effect of turning on another one. Perl currently doesn't allow this. Reword the regular expression to use the modifier you want to turn on (and place it before the minus), instead of the one you want to turn off.

  • Regexp modifiers "/%c" and "/%c" are mutually exclusive

    (F syntax, regexp) The regular expression pattern had more than one of these mutually exclusive modifiers. Retain only the modifier that is supposed to be there.

  • Regexp out of space in regex m/%s/

    (P) A "can't happen" error, because safemalloc() should have caught it earlier.

  • Repeated format line will never terminate (~~ and @# incompatible)

    (F) Your format contains the ~~ repeat-until-blank sequence and a numeric field that will never go blank so that the repetition never terminates. You might use ^# instead. See perlform.

  • Replacement list is longer than search list

    (W misc) You have used a replacement list that is longer than the search list. So the additional elements in the replacement list are meaningless.

  • '%s' resolved to '\o{%s}%d'

    (W misc, regexp) You wrote something like \08 or \179 in a double-quotish string. All but the last digit are treated as a single character, specified in octal. The last digit is the next character in the string. To tell Perl that this is indeed what you want, you can use the \o{ } syntax, or use exactly three digits to specify the octal for the character.
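
    Both unambiguous spellings produce the same string. A short sketch, assuming the intent is the character with octal code 17 (decimal 15) followed by a literal 9:

```perl
use strict;
use warnings;

# Explicit braces mark where the octal escape ends.
my $with_braces  = "\o{17}9";

# Exactly three octal digits also leaves no doubt about the fourth character.
my $three_digits = "\0179";

print $with_braces eq $three_digits ? "same string\n" : "different strings\n";
```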

  • Reversed %s= operator

    (W syntax) You wrote your assignment operator backwards. The = must always come last, to avoid ambiguity with subsequent unary operators.

  • rewinddir() attempted on invalid dirhandle %s

    (W io) The dirhandle you tried to do a rewinddir() on is either closed or not really a dirhandle. Check your control flow.

  • Scalars leaked: %d

    (S internal) Something went wrong in Perl's internal bookkeeping of scalars: not all scalar variables were deallocated by the time Perl exited. What this usually indicates is a memory leak, which is of course bad, especially if the Perl program is intended to be long-running.

  • Scalar value @%s[%s] better written as $%s[%s]

    (W syntax) You've used an array slice (indicated by @) to select a single element of an array. Generally it's better to ask for a scalar value (indicated by $). The difference is that $foo[&bar] always behaves like a scalar, both when assigning to it and when evaluating its argument, while @foo[&bar] behaves like a list when you assign to it, and provides a list context to its subscript, which can do weird things if you're expecting only one subscript.

    On the other hand, if you were actually hoping to treat the array element as a list, you need to look into how references work, because Perl will not magically convert between scalars and lists for you. See perlref.

  • Scalar value @%s{%s} better written as $%s{%s}

    (W syntax) You've used a hash slice (indicated by @) to select a single element of a hash. Generally it's better to ask for a scalar value (indicated by $). The difference is that $foo{&bar} always behaves like a scalar, both when assigning to it and when evaluating its argument, while @foo{&bar} behaves like a list when you assign to it, and provides a list context to its subscript, which can do weird things if you're expecting only one subscript.

    On the other hand, if you were actually hoping to treat the hash element as a list, you need to look into how references work, because Perl will not magically convert between scalars and lists for you. See perlref.
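
    The element and slice forms for arrays and hashes can be compared directly. This sketch uses made-up data:

```perl
use strict;
use warnings;

my @names = qw(alice bob carol);
my %age   = (alice => 30, bob => 25);

# One element: scalar sigil.
my $second  = $names[1];
my $bob_age = $age{bob};

# Several elements at once: slice with the @ sigil.
my @ends = @names[0, 2];
my @ages = @age{qw(alice bob)};

print "$second $bob_age @ends @ages\n";
```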

  • Search pattern not terminated

    (F) The lexer couldn't find the final delimiter of a // or m{} construct. Remember that bracketing delimiters count nesting level. Missing the leading $ from a variable $m may cause this error.

    Note that since Perl 5.9.0 a // can also be the defined-or construct, not just the empty search pattern. Therefore code written in Perl 5.9.0 or later that uses the // as the defined-or can be misparsed by pre-5.9.0 Perls as a non-terminated search pattern.

  • Search pattern not terminated or ternary operator parsed as search pattern

    (F) The lexer couldn't find the final delimiter of a ?PATTERN? construct.

    The question mark is also used as part of the ternary operator (as in foo ? 0 : 1) leading to some ambiguous constructions being wrongly parsed. One way to disambiguate the parsing is to put parentheses around the conditional expression, i.e. (foo) ? 0 : 1.

  • seekdir() attempted on invalid dirhandle %s

    (W io) The dirhandle you are doing a seekdir() on is either closed or not really a dirhandle. Check your control flow.

  • %sseek() on unopened filehandle

    (W unopened) You tried to use the seek() or sysseek() function on a filehandle that was either never opened or has since been closed.

  • select not implemented

    (F) This machine doesn't implement the select() system call.

  • Self-ties of arrays and hashes are not supported

    (F) Self-ties of arrays and hashes are not supported in the current implementation.

  • Semicolon seems to be missing

    (W semicolon) A nearby syntax error was probably caused by a missing semicolon, or possibly some other missing operator, such as a comma.

  • semi-panic: attempt to dup freed string

    (S internal) The internal newSVsv() routine was called to duplicate a scalar that had previously been marked as free.

  • sem%s not implemented

    (F) You don't have System V semaphore IPC on your system.

  • send() on closed socket %s

    (W closed) The socket you're sending to got itself closed sometime before now. Check your control flow.

  • Sequence (? incomplete in regex; marked by <-- HERE in m/%s/

    (F) A regular expression ended with an incomplete extension (?. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Sequence (?%s...) not implemented in regex; marked by <-- HERE in m/%s/

    (F) A proposed regular expression extension has the character reserved but has not yet been written. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Sequence (?%s...) not recognized in regex; marked by <-- HERE in m/%s/

    (F) You used a regular expression extension that doesn't make sense. The <-- HERE shows whereabouts in the regular expression the problem was discovered. This happens when using the (?^...) construct to tell Perl to use the default regular expression modifiers, and you redundantly specify a default modifier. For other causes, see perlre.

  • Sequence \%s... not terminated in regex; marked by <-- HERE in m/%s/

    (F) The regular expression expects a mandatory argument following the escape sequence and this has been omitted or incorrectly written.

  • Sequence (?#... not terminated in regex m/%s/

    (F) A regular expression comment must be terminated by a closing parenthesis. Embedded parentheses aren't allowed. See perlre.

  • Sequence (?{...}) not terminated with ')'

    (F) The end of the perl code contained within the {...} must be followed immediately by a ')'.

  • 500 Server error

    See Server error.

  • Server error

    (A) This is the error message generally seen in a browser window when trying to run a CGI program (including SSI) over the web. The actual error text varies widely from server to server. The most frequently-seen variants are "500 Server error", "Method (something) not permitted", "Document contains no data", "Premature end of script headers", and "Did not produce a valid header".

    This is a CGI error, not a Perl error.

    You need to make sure your script is executable, is accessible by the user the CGI server runs the script as (which is probably not the user account you tested it under), does not rely on any environment variables (like PATH) from the user it isn't running under, and is in a location where the CGI server can find it. Please see the following for more information:

    1. http://www.perl.org/CGI_MetaFAQ.html
    2. http://www.htmlhelp.org/faq/cgifaq.html
    3. http://www.w3.org/Security/Faq/

    You should also look at perlfaq9.

  • setegid() not implemented

    (F) You tried to assign to $), and your operating system doesn't support the setegid() system call (or equivalent), or at least Configure didn't think so.

  • seteuid() not implemented

    (F) You tried to assign to $>, and your operating system doesn't support the seteuid() system call (or equivalent), or at least Configure didn't think so.

  • setpgrp can't take arguments

    (F) Your system has the setpgrp() from BSD 4.2, which takes no arguments, unlike POSIX setpgid(), which takes a process ID and process group ID.

  • setrgid() not implemented

    (F) You tried to assign to $(, and your operating system doesn't support the setrgid() system call (or equivalent), or at least Configure didn't think so.

  • setruid() not implemented

    (F) You tried to assign to $<, and your operating system doesn't support the setruid() system call (or equivalent), or at least Configure didn't think so.

  • setsockopt() on closed socket %s

    (W closed) You tried to set a socket option on a closed socket. Did you forget to check the return value of your socket() call? See setsockopt.

  • shm%s not implemented

    (F) You don't have System V shared memory IPC on your system.

  • !=~ should be !~

    (W syntax) The non-matching operator is !~, not !=~. !=~ will be interpreted as the != (numeric not equal) and ~ (1's complement) operators: probably not what you intended.
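
    A minimal sketch of the correct operator, with a made-up input line:

```perl
use strict;
use warnings;

my $line = 'status: ok';

# !~ is true when the pattern does NOT match; it is the negation of =~.
my $no_error  = $line !~ /error/   ? 1 : 0;
my $has_error = $line =~ /error/   ? 1 : 0;

print "no error: $no_error, has error: $has_error\n";
```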

  • <> should be quotes

    (F) You wrote require <file> when you should have written require 'file'.

  • /%s/ should probably be written as "%s"

    (W syntax) You have used a pattern where Perl expected to find a string, as in the first argument to join. Perl will treat the true or false result of matching the pattern against $_ as the string, which is probably not what you had in mind.

  • shutdown() on closed socket %s

    (W closed) You tried to do a shutdown on a closed socket. Seems a bit superfluous.

  • SIG%s handler "%s" not defined

    (W signal) The signal handler named in %SIG doesn't, in fact, exist. Perhaps you put it into the wrong package?

  • Slab leaked from cv %p

    (S) If you see this message, then something is seriously wrong with the internal bookkeeping of op trees. An op tree needed to be freed after a compilation error, but could not be found, so it was leaked instead.

  • sleep(%u) too large

    (W overflow) You called sleep with a number that was larger than it can reliably handle and sleep probably slept for less time than requested.

  • Smartmatch is experimental

    (S experimental::smartmatch) This warning is emitted if you use the smartmatch (~~) operator. This is currently an experimental feature, and its details are subject to change in future releases of Perl. Particularly, its current behavior is noted for being unnecessarily complex and unintuitive, and is very likely to be overhauled.

  • Smart matching a non-overloaded object breaks encapsulation

    (F) You should not use the ~~ operator on an object that does not overload it: Perl refuses to use the object's underlying structure for the smart match.

  • sort is now a reserved word

    (F) An ancient error message that almost nobody ever runs into anymore. But before sort was a keyword, people sometimes used it as a filehandle.

  • Sort subroutine didn't return single value

    (F) A sort comparison subroutine written in XS must return exactly one item. See sort.

  • Source filters apply only to byte streams

    (F) You tried to activate a source filter (usually by loading a source filter module) within a string passed to eval. This is not permitted under the unicode_eval feature. Consider using evalbytes instead. See feature.

  • splice() offset past end of array

    (W misc) You attempted to specify an offset that was past the end of the array passed to splice(). Splicing will instead commence at the end of the array, rather than past it. If this isn't what you want, try explicitly pre-extending the array by assigning $#array = $offset. See splice.
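
    The pre-extension suggested above can be sketched as follows, with illustrative values. Note that assigning $#items = $offset creates an undefined element at $offset itself, which the inserted value then pushes one position to the right.

```perl
use strict;
use warnings;

my @items  = (1, 2, 3);
my $offset = 5;

# Pre-extend so that $offset is a valid index; new elements are undef.
$#items = $offset;

# Insert without removing anything; no "offset past end" warning is raised.
splice(@items, $offset, 0, 'x');

print scalar(@items), " elements, index $offset holds $items[$offset]\n";
```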

  • Split loop

    (P) The split was looping infinitely. (Obviously, a split shouldn't iterate more times than there are characters of input, which is what happened.) See split.

  • Statement unlikely to be reached

    (W exec) You did an exec() with some statement after it other than a die(). This is almost always an error, because exec() never returns unless there was a failure. You probably wanted to use system() instead, which does return. To suppress this warning, put the exec() in a block by itself.

  • "state" variable %s can't be in a package

    (F) Lexically scoped variables aren't in a package, so it doesn't make sense to try to declare one with a package qualifier on the front. Use local() if you want to localize a package variable.

  • "state %s" used in sort comparison

    (W syntax) The package variables $a and $b are used for sort comparisons. You used $a or $b as an operand to the <=> or cmp operator inside a sort comparison block, and the variable had earlier been declared as a lexical variable. Either qualify the sort variable with the package name, or rename the lexical variable.

  • stat() on unopened filehandle %s

    (W unopened) You tried to use the stat() function on a filehandle that was either never opened or has since been closed.

  • Stub found while resolving method "%s" overloading "%s" in package "%s"

    (P) Overloading resolution over @ISA tree may be broken by importation stubs. Stubs should never be implicitly created, but explicit calls to can may break this.

  • Subroutine "&%s" is not available

    (W closure) During compilation, an inner named subroutine or eval is attempting to capture an outer lexical subroutine that is not currently available. This can happen for one of two reasons. First, the lexical subroutine may be declared in an outer anonymous subroutine that has not yet been created. (Remember that named subs are created at compile time, while anonymous subs are created at run-time.) For example,

    1. sub { my sub a {...} sub f { \&a } }

    At the time that f is created, it can't capture the current "a" sub, since the anonymous subroutine hasn't been created yet. Conversely, the following won't give a warning since the anonymous subroutine has by now been created and is live:

    1. sub { my sub a {...} eval 'sub f { \&a }' }->();

    The second situation is caused by an eval accessing a variable that has gone out of scope, for example,

    1. sub f {
    2. my sub a {...}
    3. sub { eval '\&a' }
    4. }
    5. f()->();

    Here, when the '\&a' in the eval is being compiled, f() is not currently being executed, so its &a is not available for capture.

  • "%s" subroutine &%s masks earlier declaration in same %s

    (W misc) A "my" or "state" subroutine has been redeclared in the current scope or statement, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier subroutine will still exist until the end of the scope or until all closure references to it are destroyed.

  • Subroutine %s redefined

    (W redefine) You redefined a subroutine. To suppress this warning, say

    1. {
    2. no warnings 'redefine';
    3. eval "sub name { ... }";
    4. }
  • Substitution loop

    (P) The substitution was looping infinitely. (Obviously, a substitution shouldn't iterate more times than there are characters of input, which is what happened.) See the discussion of substitution in Regexp Quote-Like Operators in perlop.

  • Substitution pattern not terminated

    (F) The lexer couldn't find the interior delimiter of an s/// or s{}{} construct. Remember that bracketing delimiters count nesting level. Missing the leading $ from variable $s may cause this error.

  • Substitution replacement not terminated

    (F) The lexer couldn't find the final delimiter of an s/// or s{}{} construct. Remember that bracketing delimiters count nesting level. Missing the leading $ from variable $s may cause this error.

  • substr outside of string

    (W substr)(F) You tried to reference a substr() that pointed outside of a string. That is, the absolute value of the offset was larger than the length of the string. See substr. This warning is fatal if substr is used in an lvalue context (as the left hand side of an assignment or as a subroutine argument for example).
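
    The difference between the two contexts can be sketched as follows. The variable names are only illustrative, and the last line shows one possible guard (an assumption about what you want, not the only approach):

```perl
use strict;
use warnings;

my $s = "abc";
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };   # collect warnings instead of printing them

# Rvalue context: only a warning, and the result is undef.
my $piece = substr($s, 10, 2);

# Lvalue context: the same out-of-range offset is fatal, so trap it.
my $ok = eval { substr($s, 10, 2) = "x"; 1 };
my $error = $@;

# Checking the offset first avoids both outcomes.
my $offset = 10;
my $safe = $offset <= length($s) ? substr($s, $offset, 2) : '';
```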

  • sv_upgrade from type %d down to type %d

    (P) Perl tried to force the upgrade of an SV to a type which was actually inferior to its current type.

  • Switch (?(condition)... contains too many branches in regex; marked by <-- HERE in m/%s/

    (F) A (?(condition)if-clause|else-clause) construct can have at most two branches (the if-clause and the else-clause). If you want one or both to contain alternation, such as using this|that|other, enclose it in clustering parentheses:

        (?(condition)(?:this|that|other)|else-clause)

    The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Switch condition not recognized in regex; marked by <-- HERE in m/%s/

    (F) If the argument to the (?(...)if-clause|else-clause) construct is a number, it can be only a number. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • switching effective %s is not implemented

    (F) While under the use filetest pragma, we cannot switch the real and effective uids or gids.

  • %s syntax OK

    (F) The final summary message when a perl -c succeeds.

  • syntax error

    (F) Probably means you had a syntax error. Common reasons include:

    1. A keyword is misspelled.
    2. A semicolon is missing.
    3. A comma is missing.
    4. An opening or closing parenthesis is missing.
    5. An opening or closing brace is missing.
    6. A closing quote is missing.

    Often there will be another error message associated with the syntax error giving more information. (Sometimes it helps to turn on -w.) The error message itself often tells you where it was in the line when it decided to give up. Sometimes the actual error is several tokens before this, because Perl is good at understanding random input. Occasionally the line number may be misleading, and once in a blue moon the only way to figure out what's triggering the error is to call perl -c repeatedly, chopping away half the program each time to see if the error went away. Sort of the cybernetic version of 20 questions.

  • syntax error at line %d: '%s' unexpected

    (A) You've accidentally run your script through the Bourne shell instead of Perl. Check the #! line, or manually feed your script into Perl yourself.

  • syntax error in file %s at line %d, next 2 tokens "%s"

    (F) This error is likely to occur if you run a perl5 script through a perl4 interpreter, especially if the next 2 tokens are "use strict" or "my $var" or "our $var".

  • sysread() on closed filehandle %s

    (W closed) You tried to read from a closed filehandle.

  • sysread() on unopened filehandle %s

    (W unopened) You tried to read from a filehandle that was never opened.

  • Syntax error in (?[...]) in regex m/%s/

    (F) Perl could not figure out what you meant inside this construct; this notifies you that it is giving up trying.

  • System V %s is not implemented on this machine

    (F) You tried to do something with a function beginning with "sem", "shm", or "msg" but that System V IPC is not implemented in your machine. In some machines the functionality can exist but be unconfigured. Consult your system support.

  • syswrite() on closed filehandle %s

    (W closed) The filehandle you're writing to got itself closed sometime before now. Check your control flow.

  • -T and -B not implemented on filehandles

    (F) Perl can't peek at the stdio buffer of filehandles when it doesn't know about your kind of stdio. You'll have to use a filename instead.

  • Target of goto is too deeply nested

    (F) You tried to use goto to reach a label that was too deeply nested for Perl to reach. Perl is doing you a favor by refusing.

  • telldir() attempted on invalid dirhandle %s

    (W io) The dirhandle you tried to telldir() is either closed or not really a dirhandle. Check your control flow.

  • tell() on unopened filehandle

    (W unopened) You tried to use the tell() function on a filehandle that was either never opened or has since been closed.

  • That use of $[ is unsupported

    (F) Assignment to $[ is now strictly circumscribed, and interpreted as a compiler directive. You may say only one of

        $[ = 0;
        $[ = 1;
        ...
        local $[ = 0;
        local $[ = 1;
        ...

    This is to prevent the problem of one module changing the array base out from under another module inadvertently. See $[ in perlvar and arybase.

  • The crypt() function is unimplemented due to excessive paranoia.

    (F) Configure couldn't find the crypt() function on your machine, probably because your vendor didn't supply it, probably because they think the U.S. Government thinks it's a secret, or at least that they will continue to pretend that it is. And if you quote me on that, I will deny it.

  • The lexical_subs feature is experimental

    (S experimental::lexical_subs) This warning is emitted if you declare a sub with my or state. Simply suppress the warning if you want to use the feature, but know that in doing so you are taking the risk of using an experimental feature which may change or be removed in a future Perl version:

        no warnings "experimental::lexical_subs";
        use feature "lexical_subs";
        my sub foo { ... }

  • The regex_sets feature is experimental

    (S experimental::regex_sets) This warning is emitted if you use the syntax (?[ ]) in a regular expression. The details of this feature are subject to change. If you want to use it, knowing that in doing so you are taking the risk of using an experimental feature which may change in a future Perl version, you can do this to silence the warning:

        no warnings "experimental::regex_sets";

  • The %s feature is experimental

    (S experimental) This warning is emitted if you enable an experimental feature via use feature. Simply suppress the warning if you want to use the feature, but know that in doing so you are taking the risk of using an experimental feature which may change or be removed in a future Perl version:

        no warnings "experimental::lexical_subs";
        use feature "lexical_subs";

  • The %s function is unimplemented

    (F) The function indicated isn't implemented on this architecture, according to the probings of Configure.

  • The stat preceding %s wasn't an lstat

    (F) It makes no sense to test the current stat buffer for symbolic linkhood if the last stat that wrote to the stat buffer already went past the symlink to get to the real file. Use an actual filename instead.

  • The 'unique' attribute may only be applied to 'our' variables

    (F) This attribute was never supported on my or sub declarations.

  • This Perl can't reset CRTL environ elements (%s)
  • This Perl can't set CRTL environ elements (%s=%s)

    (W internal) Warnings peculiar to VMS. You tried to change or delete an element of the CRTL's internal environ array, but your copy of Perl wasn't built with a CRTL that contained the setenv() function. You'll need to rebuild Perl with a CRTL that does, or redefine PERL_ENV_TABLES (see perlvms) so that the environ array isn't the target of the change to %ENV which produced the warning.

  • This Perl has not been built with support for randomized hash key traversal but something called Perl_hv_rand_set().

    (F) Something has attempted to use an internal API call which depends on Perl being compiled with the default support for randomized hash key traversal, but this Perl has been compiled without it. You should report this warning to the relevant upstream party, or recompile perl with default options.

  • thread failed to start: %s

    (W threads)(S) The entry point function of threads->create() failed for some reason.

  • times not implemented

    (F) Your version of the C library apparently doesn't do times(). I suspect you're not running on Unix.

  • "-T" is on the #! line, it must also be used on the command line

    (X) The #! line (or local equivalent) in a Perl script contains the -T option (or the -t option), but Perl was not invoked with -T in its command line. This is an error because, by the time Perl discovers a -T in a script, it's too late to properly taint everything from the environment. So Perl gives up.

    If the Perl script is being executed as a command using the #! mechanism (or its local equivalent), this error can usually be fixed by editing the #! line so that the -%c option is a part of Perl's first argument: e.g. change perl -n -%c to perl -%c -n.

    If the Perl script is being executed as perl scriptname, then the -%c option must appear on the command line: perl -%c scriptname.

  • To%s: illegal mapping '%s'

    (F) You tried to define a customized To-mapping for lc(), lcfirst(), uc(), or ucfirst() (or their string-inlined versions), but you specified an illegal mapping. See User-Defined Character Properties in perlunicode.

  • Too deeply nested ()-groups

    (F) Your template contains ()-groups with a ridiculously deep nesting level.

  • Too few args to syscall

    (F) There has to be at least one argument to syscall() to specify the system call to call, silly dilly.

  • Too late for "-%s" option

    (X) The #! line (or local equivalent) in a Perl script contains the -M, -m or -C option.

    In the case of -M and -m, this is an error because those options are not intended for use inside scripts. Use the use pragma instead.

    The -C option only works if it is specified on the command line as well (with the same sequence of letters or numbers following). Either specify this option on the command line, or, if your system supports it, make your script executable and run it directly instead of passing it to perl.

  • Too late to run %s block

    (W void) A CHECK or INIT block is being defined during run time proper, when the opportunity to run them has already passed. Perhaps you are loading a file with require or do when you should be using use instead. Or perhaps you should put the require or do inside a BEGIN block.
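
    A small sketch of the timing problem, using a string eval to stand in for code loaded at run time by require or do. The @ran array and the warning handler are illustrative names, not part of any API:

```perl
use strict;
use warnings;

our @ran;
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };   # collect warnings instead of printing them

# Compiled at run time: the INIT phase is already over, so this block never runs.
eval q{ INIT { push @ran, 'late INIT' } 1 } or die $@;

# BEGIN blocks run as soon as they are compiled, even when compiled this late.
eval q{ BEGIN { push @ran, 'BEGIN' } 1 } or die $@;
```

    Only the BEGIN block leaves a trace in @ran, which is why moving the require or do into a BEGIN block (or switching to use) gives CHECK and INIT blocks their chance to run.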

  • Too many args to syscall

    (F) Perl supports a maximum of only 14 args to syscall().

  • Too many arguments for %s

    (F) The function requires fewer arguments than you specified.

  • Too many )'s

    (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself.

  • Too many ('s

    (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself.

  • Trailing \ in regex m/%s/

    (F) The regular expression ends with an unbackslashed backslash. Backslash it. See perlre.
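
    This usually turns up when text is interpolated into a pattern. A minimal sketch, assuming a hypothetical user-supplied string that ends in a backslash:

```perl
use strict;
use warnings;

my $input = 'C:\\temp\\';   # hypothetical input: C:\temp\ with a trailing backslash

# Interpolating the raw text leaves a dangling backslash at the end of the pattern.
my $bad = eval { qr/$input/ };
my $error = $@;

# \Q...\E (quotemeta) backslashes every backslash, so the pattern compiles
# and matches the text literally.
my $good = qr/\Q$input\E/;
my $matched = 'C:\\temp\\file.txt' =~ $good ? 1 : 0;
```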

  • Trailing white-space in a charnames alias definition is deprecated

    (D) You defined a character name which ended in a space character. Remove the trailing space(s). Usually these names are defined in the :alias import argument to use charnames, but they could be defined by a translator installed into $^H{charnames}. See CUSTOM ALIASES in charnames.

  • Transliteration pattern not terminated

    (F) The lexer couldn't find the interior delimiter of a tr/// or tr[][] or y/// or y[][] construct. Missing the leading $ from variables $tr or $y may cause this error.

  • Transliteration replacement not terminated

    (F) The lexer couldn't find the final delimiter of a tr///, tr[][], y/// or y[][] construct.

  • '%s' trapped by operation mask

    (F) You tried to use an operator from a Safe compartment in which it's disallowed. See Safe.

  • truncate not implemented

    (F) Your machine doesn't implement a file truncation mechanism that Configure knows about.

  • Type of arg %d to &CORE::%s must be %s

    (F) The subroutine in question in the CORE package requires its argument to be a hard reference to data of the specified type. Overloading is ignored, so a reference to an object that is not the specified type, but nonetheless has overloading to handle it, will still not be accepted.

  • Type of arg %d to %s must be %s (not %s)

    (F) This function requires the argument in that position to be of a certain type. Arrays must be @NAME or @{EXPR}. Hashes must be %NAME or %{EXPR}. No implicit dereferencing is allowed--use the {EXPR} forms as an explicit dereference. See perlref.

  • Type of argument to %s must be unblessed hashref or arrayref

    (F) You called keys, values or each with a scalar argument that was not a reference to an unblessed hash or array.

  • umask not implemented

    (F) Your machine doesn't implement the umask function and you tried to use it to restrict permissions for yourself (EXPR & 0700).

  • Unbalanced context: %d more PUSHes than POPs

    (S internal) The exit code detected an internal inconsistency in how many execution contexts were entered and left.

  • Unbalanced saves: %d more saves than restores

    (S internal) The exit code detected an internal inconsistency in how many values were temporarily localized.

  • Unbalanced scopes: %d more ENTERs than LEAVEs

    (S internal) The exit code detected an internal inconsistency in how many blocks were entered and left.

  • Unbalanced string table refcount: (%d) for "%s"

    (S internal) On exit, Perl found some strings remaining in the shared string table used for copy on write and for hash keys. The entries should have been freed, so this indicates a bug somewhere.

  • Unbalanced tmps: %d more allocs than frees

    (S internal) The exit code detected an internal inconsistency in how many mortal scalars were allocated and freed.

  • Undefined format "%s" called

    (F) The format indicated doesn't seem to exist. Perhaps it's really in another package? See perlform.

  • Undefined sort subroutine "%s" called

    (F) The sort comparison routine specified doesn't seem to exist. Perhaps it's in a different package? See sort.

  • Undefined subroutine &%s called

    (F) The subroutine indicated hasn't been defined, or if it was, it has since been undefined.
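
    Because the error is raised at run time, it can be trapped with eval, or avoided by testing for the subroutine first. A short sketch with made-up subroutine names:

```perl
use strict;
use warnings;

sub greet { return "hello" }

# Calling a name that was never defined is fatal at run time.
my $ok = eval { no_such_sub(); 1 };
my $error = $@;

# defined &NAME and can() let you test before calling.
my $has_greet   = defined &greet ? 1 : 0;
my $has_missing = main->can('no_such_sub') ? 1 : 0;
```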

  • Undefined subroutine called

    (F) The anonymous subroutine you're trying to call hasn't been defined, or if it was, it has since been undefined.

  • Undefined subroutine in sort

    (F) The sort comparison routine specified is declared but doesn't seem to have been defined yet. See sort.

  • Undefined top format "%s" called

    (F) The format indicated doesn't seem to exist. Perhaps it's really in another package? See perlform.

  • Undefined value assigned to typeglob

    (W misc) An undefined value was assigned to a typeglob, a la *foo = undef. This does nothing. It's possible that you really mean undef *foo.

  • %s: Undefined variable

    (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself.

  • unexec of %s into %s failed!

    (F) The unexec() routine failed for some reason. See your local FSF representative, who probably put it there in the first place.

  • Unexpected '(' with no preceding operator in regex; marked by <-- HERE in m/%s/

    (F) You had something like this:

        (?[ \p{Digit} ( \p{Lao} + \p{Thai} ) ])

    There should be an operator before the "(", as there's no indication as to how the digits are to be combined with the characters in the Lao and Thai scripts.

  • Unexpected ')' in regex; marked by <-- HERE in m/%s/

    (F) You had something like this:

        (?[ ( \p{Digit} + ) ])

    The ")" is out-of-place. Something apparently was supposed to be combined with the digits, or the "+" shouldn't be there, or something like that. Perl can't figure out what was intended.

  • Unexpected binary operator '%c' with no preceding operand in regex; marked by <-- HERE in m/%s/

    (F) You had something like this:

        (?[ | \p{Digit} ])

    where the "|" is a binary operator with an operand on the right, but no operand on the left.

  • Unexpected character in regex; marked by <-- HERE in m/%s/

    (F) You had something like this:

        (?[ z ])

    Within (?[ ]), no literal characters are allowed unless they are within an inner pair of square brackets, like

        (?[ [ z ] ])

    Another possibility is that you forgot a backslash. Perl isn't smart enough to figure out what you really meant.

  • Unexpected constant lvalue entersub entry via type/targ %d:%d

    (P) When compiling a subroutine call in lvalue context, Perl failed an internal consistency check. It encountered a malformed op tree.

  • Unicode non-character U+%X is illegal for open interchange

    (S utf8, nonchar) Certain codepoints, such as U+FFFE and U+FFFF, are defined by the Unicode standard to be non-characters. Those are legal codepoints, but are reserved for internal use; so, applications shouldn't attempt to exchange them. If you know what you are doing you can turn off this warning by no warnings 'nonchar';.

  • Unicode surrogate U+%X is illegal in UTF-8

    (S utf8, surrogate) You had a UTF-16 surrogate in a context where they are not considered acceptable. These code points, between U+D800 and U+DFFF (inclusive), are used by Unicode only for UTF-16. However, Perl internally allows all unsigned integer code points (up to the size limit available on your platform), including surrogates. But these can cause problems when being input or output, which is likely where this message came from. If you really really know what you are doing you can turn off this warning by no warnings 'surrogate';.

  • Unknown BYTEORDER

    (F) There are no byte-swapping functions for a machine with this byte order.

  • Unknown charname '%s'

    (F) The name you used inside \N{} is unknown to Perl. Check the spelling. You can say use charnames ":loose" to not have to be so precise about spaces, hyphens, and capitalization on standard Unicode names. (Any custom aliases that have been created must be specified exactly, regardless of whether :loose is used or not.) This error may also happen if the \N{} is not in the scope of the corresponding use charnames.

  • Unknown error

    (P) Perl was about to print an error message in $@, but the $@ variable did not exist, even after an attempt to create it.

  • Unknown open() mode '%s'

    (F) The second argument of 3-argument open() is not among the list of valid modes: <, >, >>, +<, +>, +>>, -|, |-, <&, >&.
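
    A common way to hit this is to pass a C-style mode such as 'r'. The sketch below traps the error and then opens an in-memory file with a valid mode; the file name example.txt is hypothetical and is never actually touched, since the mode is rejected first:

```perl
use strict;
use warnings;

# 'r' is a C fopen() mode, not a Perl one, so open() dies instead of returning false.
my $opened = eval { open(my $fh, 'r', 'example.txt'); 1 };
my $error = $@;

# With a valid mode, open() reports success or failure through its return value.
my $data = "line one\n";
my $read_ok = open(my $in, '<', \$data) ? 1 : 0;   # in-memory file, for illustration
my $line = $read_ok ? <$in> : undef;
close $in if $read_ok;
```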

  • Unknown PerlIO layer "%s"

    (W layer) An attempt was made to push an unknown layer onto the Perl I/O system. (Layers take care of transforming data between external and internal representations.) Note that some layers, such as mmap, are not supported in all environments. If your program didn't explicitly request the failing operation, it may be the result of the value of the environment variable PERLIO.

  • Unknown process %x sent message to prime_env_iter: %s

    (P) An error peculiar to VMS. Perl was reading values for %ENV before iterating over it, and someone else stuck a message in the stream of data Perl expected. Someone's very confused, or perhaps trying to subvert Perl's population of %ENV for nefarious purposes.

  • Unknown "re" subpragma '%s' (known ones are: %s)

    (W) You tried to use an unknown subpragma of the "re" pragma.

  • Unknown regex modifier "%s"

    (F) Alphanumerics immediately following the closing delimiter of a regular expression pattern are interpreted by Perl as modifier flags for the regex. One of the ones you specified is invalid. One way this can happen is if you didn't put in white space between the end of the regex and a following alphanumeric operator:

        if ($a =~ /foo/and $bar == 3) { ... }

    The "a" is a valid modifier flag, but the "n" is not, and raises this error. Likely what was meant instead was:

        if ($a =~ /foo/ and $bar == 3) { ... }

  • Unknown switch condition (?(%s in regex; marked by <-- HERE in m/%s/

    (F) The condition part of a (?(condition)if-clause|else-clause) construct is not known. The condition must be one of the following:

        (1) (2) ...           true if 1st, 2nd, etc., capture matched
        (<NAME>) ('NAME')     true if named capture matched
        (?=...) (?<=...)      true if subpattern matches
        (?!...) (?<!...)      true if subpattern fails to match
        (?{ CODE })           true if code returns a true value
        (R)                   true if evaluating inside recursion
        (R1) (R2) ...         true if directly inside capture group 1, 2, etc.
        (R&NAME)              true if directly inside named capture
        (DEFINE)              always false; for defining named subpatterns

    The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Unknown Unicode option letter '%c'

    (F) You specified an unknown Unicode option. See perlrun documentation of the -C switch for the list of known options.

  • Unknown Unicode option value %x

    (F) You specified an unknown Unicode option. See perlrun documentation of the -C switch for the list of known options.

  • Unknown verb pattern '%s' in regex; marked by <-- HERE in m/%s/

    (F) You either made a typo or have incorrectly put a * quantifier after an open brace in your pattern. Check the pattern and review perlre for details on legal verb patterns.

  • Unknown warnings category '%s'

    (F) An error issued by the warnings pragma. You specified a warnings category that is unknown to perl at this point.

    Note that if you want to enable a warnings category registered by a module (e.g. use warnings 'File::Find' ), you must have loaded this module first.
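
    Both cases can be sketched with string evals, so that the compile-time error is trappable. The misspelled category name is made up; File::Find is used because it registers a warnings category of its own:

```perl
use strict;
use warnings;

# A category perl has never heard of is rejected when the pragma is compiled.
my $ok = eval q{ use warnings 'no_such_category'; 1 };
my $error = $@;

# A category registered by a module is known once that module has been loaded.
my $registered = eval q{ use File::Find; use warnings 'File::Find'; 1 };
```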

  • Unmatched '%c' in POSIX class in regex; marked by <-- HERE in m/%s/

    (F) You had something like this:

        (?[ [:alnum] ])

    There should be a second ":", like this:

        (?[ [:alnum:] ])

  • Unmatched [ in regex; marked by <-- HERE in m/%s/

    (F) The brackets around a character class must match. If you wish to include a closing bracket in a character class, backslash it or put it first. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Unmatched '[' in POSIX class in regex; marked by <-- HERE in m/%s/

    (F) You had something like this:

        (?[ [:digit: ])

    That should be written:

        (?[ [:digit:] ])

  • Unmatched ( in regex; marked by <-- HERE in m/%s/
  • Unmatched ) in regex; marked by <-- HERE in m/%s/

    (F) Unbackslashed parentheses must always be balanced in regular expressions. If you're a vi user, the % key is valuable for finding the matching parenthesis. The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.
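
    When the pattern text comes from outside your program, one way to cope is to compile it inside an eval. The compile_pattern helper below is hypothetical, written only for this sketch:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the compiled regex (or undef) plus any error text.
sub compile_pattern {
    my ($text) = @_;
    my $re = eval { qr/$text/ };
    return ($re, $@);
}

my ($re1, $err1) = compile_pattern('(abc');    # unbalanced: fatal inside the eval
my ($re2, $err2) = compile_pattern('(abc)');   # balanced: compiles
my ($re3, $err3) = compile_pattern('\(abc');   # backslashed: a literal parenthesis
```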

  • Unmatched right %s bracket

    (F) The lexer counted more closing curly or square brackets than opening ones, so you're probably missing a matching opening bracket. As a general rule, you'll find the missing one (so to speak) near the place you were last editing.

  • Unquoted string "%s" may clash with future reserved word

    (W reserved) You used a bareword that might someday be claimed as a reserved word. It's best to put such a word in quotes, or capitalize it somehow, or insert an underbar into it. You might also declare it as a subroutine.

  • Unrecognized character %s; marked by <-- HERE after %s near column %d

    (F) The Perl parser has no idea what to do with the specified character in your Perl script (or eval) near the specified column. Perhaps you tried to run a compressed script, a binary program, or a directory as a Perl program.

  • Unrecognized escape \%c in character class in regex; marked by <-- HERE in m/%s/

    (F) You used a backslash-character combination which is not recognized by Perl inside character classes. This is a fatal error when the character class is used within (?[ ]).

  • Unrecognized escape \%c in character class passed through in regex; marked by <-- HERE in m/%s/

    (W regexp) You used a backslash-character combination which is not recognized by Perl inside character classes. The character was understood literally, but this may change in a future version of Perl. The <-- HERE shows whereabouts in the regular expression the escape was discovered.

  • Unrecognized escape \%c passed through

    (W misc) You used a backslash-character combination which is not recognized by Perl. The character was understood literally, but this may change in a future version of Perl.

  • Unrecognized escape \%s passed through in regex; marked by <-- HERE in m/%s/

    (W regexp) You used a backslash-character combination which is not recognized by Perl. The character(s) were understood literally, but this may change in a future version of Perl. The <-- HERE shows whereabouts in the regular expression the escape was discovered.

  • Unrecognized signal name "%s"

    (F) You specified a signal name to the kill() function that was not recognized. Say kill -l in your shell to see the valid signal names on your system.
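
    From inside Perl, the names your perl was built with are recorded in %Config. The sketch below uses them to check a name before calling kill(); signal_known is a hypothetical helper, not a built-in:

```perl
use strict;
use warnings;
use Config;

# %Config records the signal names known to this perl.
my %known = map { $_ => 1 } split ' ', $Config{sig_name};

# Hypothetical helper: accept names with or without a leading "SIG".
sub signal_known {
    my ($name) = @_;
    $name =~ s/^SIG//;
    return $known{$name} ? 1 : 0;
}

# An unknown name is fatal; nothing is sent to the process.
my $ok = eval { kill 'NOSUCHSIGNAL', $$; 1 };
my $error = $@;
```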

  • Unrecognized switch: -%s (-h will show valid options)

    (F) You specified an illegal option to Perl. Don't do that. (If you think you didn't do that, check the #! line to see if it's supplying the bad switch on your behalf.)

  • Unsuccessful %s on filename containing newline

    (W newline) A file operation was attempted on a filename, and that operation failed, PROBABLY because the filename contained a newline, PROBABLY because you forgot to chomp() it off. See chomp.
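
    A minimal sketch of the usual cause and cure, using a temporary file so the example is self-contained. The newline is appended by hand here to imitate a filename read from a list:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };   # collect warnings instead of printing them

my $line = "$path\n";              # as if read from a file of filenames
my $before = -e $line ? 1 : 0;     # fails, and warns about the newline
chomp $line;
my $after = -e $line ? 1 : 0;      # succeeds once the newline is gone
```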

  • Unsupported directory function "%s" called

    (F) Your machine doesn't support opendir() and readdir().

  • Unsupported function %s

    (F) This machine doesn't implement the indicated function, apparently. At least, Configure doesn't think so.

  • Unsupported function fork

    (F) Your version of the Perl executable does not support forking.

    Note that under some systems, like OS/2, there may be different flavors of Perl executables, some of which may support fork, some not. Try changing the name you call Perl by to perl_, perl__, and so on.

  • Unsupported script encoding %s

    (F) Your program file begins with a Unicode Byte Order Mark (BOM) which declares it to be in a Unicode encoding that Perl cannot read.

  • Unsupported socket function "%s" called

    (F) Your machine doesn't support the Berkeley socket mechanism, or at least that's what Configure thought.

  • Unterminated attribute list

    (F) The lexer found something other than a simple identifier at the start of an attribute, and it wasn't a semicolon or the start of a block. Perhaps you terminated the parameter list of the previous attribute too soon. See attributes.

  • Unterminated attribute parameter in attribute list

    (F) The lexer saw an opening (left) parenthesis character while parsing an attribute list, but the matching closing (right) parenthesis character was not found. You may need to add (or remove) a backslash character to get your parentheses to balance. See attributes.

  • Unterminated compressed integer

    (F) An argument to unpack("w",...) was incompatible with the BER compressed integer format and could not be converted to an integer. See pack.
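
    As a rough illustration: in the BER format every byte but the last has its high bit set, so data that ends on such a byte is incomplete. The values below are arbitrary:

```perl
use strict;
use warnings;

# pack "w" stores 7 bits per byte, with the high bit set on all but the last byte.
my $packed = pack("w", 300);
my ($value) = unpack("w", $packed);

# A final byte with the high bit still set promises more data that never arrives.
my $ok = eval { my @list = unpack("w", "\x82"); 1 };
my $error = $@;
```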

  • Unterminated delimiter for here document

    (F) This message occurs when a here document label has an initial quotation mark but the final quotation mark is missing. Perhaps you wrote:

        <<"foo

    instead of:

        <<"foo"

  • Unterminated \g{...} pattern in regex; marked by <-- HERE in m/%s/

    (F) You missed a close brace on a \g{..} pattern (group reference) in a regular expression. Fix the pattern and retry.

  • Unterminated <> operator

    (F) The lexer saw a left angle bracket in a place where it was expecting a term, so it's looking for the corresponding right angle bracket, and not finding it. Chances are you left some needed parentheses out earlier in the line, and you really meant a "less than".

  • Unterminated verb pattern argument in regex; marked by <-- HERE in m/%s/

    (F) You used a pattern of the form (*VERB:ARG) but did not terminate the pattern with a ). Fix the pattern and retry.

  • Unterminated verb pattern in regex; marked by <-- HERE in m/%s/

    (F) You used a pattern of the form (*VERB) but did not terminate the pattern with a ). Fix the pattern and retry.

  • untie attempted while %d inner references still exist

    (W untie) A copy of the object returned from tie (or tied) was still valid when untie was called.
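
    A short sketch of the situation, using Tie::StdHash from the standard Tie::Hash module as a stand-in for your own tie class:

```perl
use strict;
use warnings;
use Tie::Hash;   # provides Tie::StdHash

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };   # collect warnings instead of printing them

# Keeping the object returned by tie alive triggers the warning on untie.
my %first;
my $object = tie %first, 'Tie::StdHash';
untie %first;

# Dropping the extra reference first lets untie proceed quietly.
my %second;
my $other = tie %second, 'Tie::StdHash';
undef $other;
untie %second;
```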

  • Usage: POSIX::%s(%s)

    (F) You called a POSIX function with incorrect arguments. See FUNCTIONS in POSIX for more information.

  • Usage: Win32::%s(%s)

    (F) You called a Win32 function with incorrect arguments. See Win32 for more information.

  • $[ used in %s (did you mean $] ?)

    (W syntax) You used $[ in a comparison, such as:

        if ($[ > 5.006) {
            ...
        }

    You probably meant to use $] instead. $[ is the base for indexing arrays. $] is the Perl version number in decimal.
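
    The intended test presumably looks something like this sketch; the variable names are illustrative:

```perl
use strict;
use warnings;

# $] holds the version as a decimal number, so a numeric comparison works.
my $has_5_6 = $] >= 5.006 ? 1 : 0;

# $^V carries the same information in dotted form.
my $dotted = sprintf "%vd", $^V;
```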

  • Use \\x{...} for more than two hex characters in regex; marked by <-- HERE in m/%s/

    (F) In a regular expression, you said something like

        (?[ [ \xBEEF ] ])

    Perl isn't sure if you meant this

        (?[ [ \x{BEEF} ] ])

    or if you meant this

        (?[ [ \x{BE} E F ] ])

    You need to add either braces or blanks to disambiguate.

  • Use of each() on hash after insertion without resetting hash iterator results in undefined behavior

    (S internal) The behavior of each() after insertion is undefined; it may skip items or visit items more than once. Consider using keys() instead of each().
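
    One way to follow that advice, sketched with a made-up hash: keys() hands back the complete key list before the loop body runs, so insertions inside the loop cannot disturb the traversal.

```perl
use strict;
use warnings;

my %count = (a => 1, b => 2, c => 3);

# The key list is built once, up front; new keys added below are not visited.
for my $key (keys %count) {
    $count{"$key$key"} = $count{$key} * 2;
}
my $total = scalar keys %count;
```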

  • Useless assignment to a temporary

    (W misc) You assigned to an lvalue subroutine, but what the subroutine returned was a temporary scalar about to be discarded, so the assignment had no effect.

  • Useless (?-%s) - don't use /%s modifier in regex; marked by <-- HERE in m/%s/

    (W regexp) You have used an internal modifier such as (?-o) that has no meaning unless removed from the entire regexp:

        if ($string =~ /(?-o)$pattern/o) { ... }

    must be written as

        if ($string =~ /$pattern/) { ... }

    The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Useless localization of %s

    (W syntax) The localization of lvalues such as local($x=10) is legal, but in fact the local() currently has no effect. This may change at some point in the future, but in the meantime such code is discouraged.

  • Useless (?%s) - use /%s modifier in regex; marked by <-- HERE in m/%s/

    (W regexp) You have used an internal modifier such as (?o) that has no meaning unless applied to the entire regexp:

        if ($string =~ /(?o)$pattern/) { ... }

    must be written as

        if ($string =~ /$pattern/o) { ... }

    The <-- HERE shows whereabouts in the regular expression the problem was discovered. See perlre.

  • Useless use of /d modifier in transliteration operator

    (W misc) You have used the /d modifier where the searchlist has the same length as the replacelist. See perlop for more information about the /d modifier.

  • Useless use of '\'; doesn't escape metacharacter '%c'

    (D deprecated) You wrote a regular expression pattern something like one of these:

        m{ \x\{FF\} }x
        m{foo\{1,3\}}
        qr(foo\(bar\))
        s[foo\[a-z\]bar][baz]

    The interior braces, square brackets, and parentheses are treated as metacharacters even though they are backslashed; instead write:

        m{ \x{FF} }x
        m{foo{1,3}}
        qr(foo(bar))
        s[foo[a-z]bar][baz]

    The backslashes have no effect when a regular expression pattern is delimited by {}, [], or (), which ordinarily are metacharacters, and the delimiters are also used, paired, within the interior of the pattern. It is planned that a future Perl release will change the meaning of constructs like these so that the backslashes will have an effect, so remove them from your code.

  • Useless use of \E

    (W misc) You have a \E in a double-quotish string without a \U , \L or \Q preceding it.

  • Useless use of %s in void context

    (W void) You did something without a side effect in a context that does nothing with the return value, such as a statement that doesn't return a value from a block, or the left side of a scalar comma operator. Very often this points not to stupidity on your part, but a failure of Perl to parse your program the way you thought it would. For example, you'd get this if you mixed up your C precedence with Python precedence and said

        $one, $two = 1, 2;

    when you meant to say

        ($one, $two) = (1, 2);

    Another common error is to use ordinary parentheses to construct a list reference when you should be using square or curly brackets, for example, if you say

        $array = (1,2);

    when you should have said

        $array = [1,2];

    The square brackets explicitly turn a list value into a scalar value, while parentheses do not. So when a parenthesized list is evaluated in a scalar context, the comma is treated like C's comma operator, which throws away the left argument, which is not what you want. See perlref for more on this.

    This warning will not be issued for numerical constants equal to 0 or 1 since they are often used in statements like

        1 while sub_with_side_effects();

    String constants that would normally evaluate to 0 or 1 are warned about.

  • Useless use of "re" pragma

    (W) You did use re; without any arguments. That isn't very useful.

  • Useless use of sort in scalar context

    (W void) You used sort in scalar context, as in:

        my $x = sort @y;

    This is not very useful, and perl currently optimizes this away.
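    What was intended is usually either the number of elements or the first element after sorting. A sketch of both, with made-up data:

```perl
use strict;
use warnings;

my @y = (3, 1, 2);

# Sort in list context, then take the count from the array.
my @sorted = sort { $a <=> $b } @y;
my $count  = @sorted;

# The parentheses give list context, so $lowest receives the
# first element of the sorted list.
my ($lowest) = sort { $a <=> $b } @y;
```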

  • Useless use of (?-p) in regex; marked by <-- HERE in m/%s/

    (W regexp) The p modifier cannot be turned off once set. Trying to do so is futile.

  • Useless use of %s with no values

    (W syntax) You used the push() or unshift() function with no arguments apart from the array, like push(@x) or unshift(@foo). That won't usually have any effect on the array, so is completely useless. It's possible in principle that push(@tied_array) could have some effect if the array is tied to a class which implements a PUSH method. If so, you can write it as push(@tied_array,()) to avoid this warning.

  • Useless (%s%c) - %suse /%c modifier in regex; marked by <-- HERE in m/%s/

    (W regexp) The /g and /o regular expression modifiers are global and can't be turned off once set; hence things like (?g) or (?-o:) do nothing.

  • Useless (%sc) - %suse /gc modifier in regex; marked by <-- HERE in m/%s/

    (W regexp) The /c regular expression modifier is global, can't be turned off once set, and doesn't do anything without the /g modifier being specified as well; hence things like (?c) or (?-c:) do nothing, nor do things like (?gc) or (?-gc:).

  • "use" not allowed in expression

    (F) The "use" keyword is recognized and executed at compile time, and returns no useful value. See perlmod.

  • Use of assignment to $[ is deprecated

    (D deprecated) The $[ variable (index of the first element in an array) is deprecated. See $[ in perlvar.

  • Use of bare << to mean <<"" is deprecated

    (D deprecated) You are now encouraged to use the explicitly quoted form if you wish to use an empty line as the terminator of the here-document.

  • Use of comma-less variable list is deprecated

    (D deprecated) The values you give to a format should be separated by commas, not just aligned on a line.

  • Use of chdir('') or chdir(undef) as chdir() deprecated

    (D deprecated) chdir() with no arguments is documented to change to $ENV{HOME} or $ENV{LOGDIR}. chdir(undef) and chdir('') share this behavior, but that has been deprecated. In future versions they will simply fail.

    Be careful to check that what you pass to chdir() is defined and not blank, else you might find yourself in your home directory.
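    One way to follow this advice is a small wrapper that refuses undefined or empty directory names. The helper name chdir_checked is hypothetical, not part of Perl:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Hypothetical helper: change directory only when given a defined,
# non-empty name; return 1 on success and 0 otherwise.
sub chdir_checked {
    my ($dir) = @_;
    return 0 unless defined $dir && length $dir;
    return chdir($dir) ? 1 : 0;
}

my $before = getcwd();
my $ok     = chdir_checked(undef);   # refused; the directory is unchanged
```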

  • Use of /c modifier is meaningless in s///

    (W regexp) You used the /c modifier in a substitution. The /c modifier is not presently meaningful in substitutions.

  • Use of /c modifier is meaningless without /g

    (W regexp) You used the /c modifier with a regex operand, but didn't use the /g modifier. Currently, /c is meaningful only when /g is used. (This may change in the future.)

  • Use of := for an empty attribute list is not allowed

    (F) The construction my $x := 42 used to parse as equivalent to my $x : = 42 (applying an empty attribute list to $x ). This construct was deprecated in 5.12.0, and has now been made a syntax error, so := can be reclaimed as a new operator in the future.

    If you need an empty attribute list, for example in a code generator, add a space before the = .

  • Use of freed value in iteration

    (F) Perhaps you modified the iterated array within the loop? This error is typically caused by code like the following:

        @a = (3,4);
        @a = () for (1,2,@a);

    You are not supposed to modify arrays while they are being iterated over. For speed and efficiency reasons, Perl internally does not do full reference-counting of iterated items, hence deleting such an item in the middle of an iteration causes Perl to see a freed value.
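    A sketch of one workaround, assuming the loop really does need to empty the array: iterate over a separate copy, so the elements being visited are not the ones being freed.

```perl
use strict;
use warnings;

my @a = (3, 4);
my @seen;

# Copy the items first; the loop then walks @snapshot, whose
# elements stay alive even when @a is emptied.
my @snapshot = (1, 2, @a);
for my $item (@snapshot) {
    @a = ();
    push @seen, $item;
}
```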

  • Use of *glob{FILEHANDLE} is deprecated

    (D deprecated) You are now encouraged to use the shorter *glob{IO} form to access the filehandle slot within a typeglob.

  • Use of /g modifier is meaningless in split

    (W regexp) You used the /g modifier on the pattern for a split operator. Since split always tries to match the pattern repeatedly, the /g has no effect.

  • Use of "goto" to jump into a construct is deprecated

    (D deprecated) Using goto to jump from an outer scope into an inner scope is deprecated and should be avoided.

  • Use of inherited AUTOLOAD for non-method %s() is deprecated

    (D deprecated) As an (ahem) accidental feature, AUTOLOAD subroutines are looked up as methods (using the @ISA hierarchy) even when the subroutines to be autoloaded were called as plain functions (e.g. Foo::bar() ), not as methods (e.g. Foo->bar() or $obj->bar() ).

    This bug will be rectified in future by using method lookup only for methods' AUTOLOAD s. However, there is a significant base of existing code that may be using the old behavior. So, as an interim step, Perl currently issues an optional warning when non-methods use inherited AUTOLOAD s.

    The simple rule is: Inheritance will not work when autoloading non-methods. The simple fix for old code is: In any module that used to depend on inheriting AUTOLOAD for non-methods from a base class named BaseClass , execute *AUTOLOAD = \&BaseClass::AUTOLOAD during startup.

    In code that currently says use AutoLoader; @ISA = qw(AutoLoader); you should remove AutoLoader from @ISA and change use AutoLoader; to use AutoLoader 'AUTOLOAD'; .
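    The aliasing fix might look like the sketch below. The package names and the body of AUTOLOAD are invented for illustration; only the glob assignment comes from the text above.

```perl
use strict;
use warnings;

{
    package BaseClass;
    our $AUTOLOAD;

    # Illustrative AUTOLOAD: report which function was requested.
    sub AUTOLOAD {
        my $name = $AUTOLOAD;
        $name =~ s/.*:://;              # strip the package qualifier
        return if $name eq 'DESTROY';
        return "autoloaded $name(@_)";
    }
}

{
    package Derived;
    our @ISA = ('BaseClass');

    # Make the base class's AUTOLOAD directly visible in this package,
    # so plain function calls do not depend on inheritance.
    *AUTOLOAD = \&BaseClass::AUTOLOAD;
}

# A plain function call, not a method call.
my $result = Derived::frobnicate(1, 2);
```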

  • Use of %s in printf format not supported

    (F) You attempted to use a feature of printf that is accessible from only C. This usually means there's a better way to do it in Perl.

  • Use of %s is deprecated

    (D deprecated) The construct indicated is no longer recommended for use, generally because there's a better way to do it, and also because the old way has bad side effects.

  • Use of -l on filehandle %s

    (W io) A filehandle represents an opened file, and when you opened the file it already went past any symlink you are presumably trying to look for. The operation returned undef. Use a filename instead.

  • Use of my $_ is experimental

    (S experimental::lexical_topic) Lexical $_ is an experimental feature and its behavior may change or even be removed in any future release of perl. See the explanation under $_ in perlvar.

  • Use of %s on a handle without * is deprecated

    (D deprecated) You used tie, tied or untie on a scalar but that scalar happens to hold a typeglob, which means its filehandle will be tied. If you mean to tie a handle, use an explicit * as in tie *$handle .

    This was a long-standing bug that was removed in Perl 5.16, as there was no way to tie the scalar itself when it held a typeglob, and no way to untie a scalar that had had a typeglob assigned to it. If you see this message, you must be using an older version.

  • Use of ?PATTERN? without explicit operator is deprecated

    (D deprecated) You have written something like ?\w? , for a regular expression that matches only once. Starting this term directly with the question mark delimiter is now deprecated, so that the question mark will be available for use in new operators in the future. Write m?\w? instead, explicitly using the m operator: the question mark delimiter still invokes match-once behaviour.

  • Use of reference "%s" as array index

    (W misc) You tried to use a reference as an array index; this probably isn't what you mean, because references in numerical context tend to be huge numbers, and so usually indicates programmer error.

    If you really do mean it, explicitly numify your reference, like so: $array[0+$ref] . This warning is not given for overloaded objects, however, because you can overload the numification and stringification operators and then you presumably know what you are doing.

  • Use of state $_ is experimental

    (S experimental::lexical_topic) Lexical $_ is an experimental feature and its behavior may change or even be removed in any future release of perl. See the explanation under $_ in perlvar.

  • Use of tainted arguments in %s is deprecated

    (W taint, deprecated) You have supplied system() or exec() with multiple arguments and at least one of them is tainted. This used to be allowed but will become a fatal error in a future version of perl. Untaint your arguments. See perlsec.

  • Use of uninitialized value%s

    (W uninitialized) An undefined value was used as if it were already defined. It was interpreted as a "" or a 0, but maybe it was a mistake. To suppress this warning assign a defined value to your variables.

    To help you figure out what was undefined, perl will try to tell you the name of the variable (if any) that was undefined. In some cases it cannot do this, so it also tells you what operation you used the undefined value in. Note, however, that perl optimizes your program and the operation displayed in the warning may not necessarily appear literally in your program. For example, "that $foo" is usually optimized into "that " . $foo, and the warning will refer to the concatenation (.) operator, even though there is no . in your program.
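    A sketch of supplying a defined value at the point of use, with the defined-or operator (//, available from Perl 5.10). The configuration hash and the default port are assumptions made for the example:

```perl
use strict;
use warnings;

# Hypothetical configuration with no 'port' entry.
my %config = (host => 'localhost');

# // yields the right-hand side only when the left-hand side is undef,
# so a legitimate value of 0 or "" would be kept.
my $port  = $config{port} // 8080;
my $label = "server: $config{host}:" . $port;
```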

  • Using a hash as a reference is deprecated

    (D deprecated) You tried to use a hash as a reference, as in %foo->{"bar"} or %$ref->{"hello"} . Versions of perl <= 5.6.1 used to allow this syntax, but shouldn't have. It is now deprecated, and will be removed in a future version.

  • Using an array as a reference is deprecated

    (D deprecated) You tried to use an array as a reference, as in @foo->[23] or @$ref->[99] . Versions of perl <= 5.6.1 used to allow this syntax, but shouldn't have. It is now deprecated, and will be removed in a future version.

  • Using just the first character returned by \N{} in character class in regex; marked by <-- HERE in m/%s/

    (W regexp) A charnames handler may return a sequence of more than one character. Currently all but the first one are discarded when used in a regular expression pattern bracketed character class.

  • Using !~ with %s doesn't make sense

    (F) Using the !~ operator with s///r, tr///r or y///r is currently reserved for future use, as the exact behaviour has not been decided. (Simply returning the boolean opposite of the modified string is usually not particularly useful.)

  • UTF-16 surrogate U+%X

    (S utf8, surrogate) You had a UTF-16 surrogate in a context where they are not considered acceptable. These code points, between U+D800 and U+DFFF (inclusive), are used by Unicode only for UTF-16. However, Perl internally allows all unsigned integer code points (up to the size limit available on your platform), including surrogates. But these can cause problems when being input or output, which is likely where this message came from. If you really really know what you are doing you can turn off this warning by no warnings 'surrogate'; .

  • Value of %s can be "0"; test with defined()

    (W misc) In a conditional expression, you used <HANDLE>, <*> (glob), each(), or readdir() as a boolean value. Each of these constructs can return a value of "0"; that would make the conditional expression false, which is probably not what you intended. When using these constructs in conditional expressions, test their values with the defined operator.
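    A sketch of the defined() test on a read loop, using an in-memory filehandle whose last line is "0" with no trailing newline (the data is made up):

```perl
use strict;
use warnings;

# The final line is the string "0", which is false in boolean context.
my $data = "1\n0";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my @lines;
while (defined(my $line = <$fh>)) {   # stops only at end of file
    chomp $line;
    push @lines, $line;
}
close $fh;
```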

  • Value of CLI symbol "%s" too long

    (W misc) A warning peculiar to VMS. Perl tried to read the value of an %ENV element from a CLI symbol table, and found a resultant string longer than 1024 characters. The return value has been truncated to 1024 characters.

  • Variable "%s" is not available

    (W closure) During compilation, an inner named subroutine or eval is attempting to capture an outer lexical that is not currently available. This can happen for one of two reasons. First, the outer lexical may be declared in an outer anonymous subroutine that has not yet been created. (Remember that named subs are created at compile time, while anonymous subs are created at run-time.) For example,

        sub { my $a; sub f { $a } }

    At the time that f is created, it can't capture the current value of $a, since the anonymous subroutine hasn't been created yet. Conversely, the following won't give a warning since the anonymous subroutine has by now been created and is live:

        sub { my $a; eval 'sub f { $a }' }->();

    The second situation is caused by an eval accessing a variable that has gone out of scope, for example,

        sub f {
            my $a;
            sub { eval '$a' }
        }
        f()->();

    Here, when the '$a' in the eval is being compiled, f() is not currently being executed, so its $a is not available for capture.

  • Variable "%s" is not imported%s

    (S misc) With "use strict" in effect, you referred to a global variable that you apparently thought was imported from another module, because something else of the same name (usually a subroutine) is exported by that module. It usually means you put the wrong funny character on the front of your variable.

  • Variable length lookbehind not implemented in regex m/%s/

    (F) Lookbehind is allowed only for subexpressions whose length is fixed and known at compile time. See perlre.

  • "%s" variable %s masks earlier declaration in same %s

    (W misc) A "my", "our" or "state" variable has been redeclared in the current scope or statement, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier variable will still exist until the end of the scope or until all closure references to it are destroyed.

  • Variable syntax

    (A) You've accidentally run your script through csh instead of Perl. Check the #! line, or manually feed your script into Perl yourself.

  • Variable "%s" will not stay shared

    (W closure) An inner (nested) named subroutine is referencing a lexical variable defined in an outer named subroutine.

    When the inner subroutine is called, it will see the value of the outer subroutine's variable as it was before and during the *first* call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable. In other words, the variable will no longer be shared.

    This problem can usually be solved by making the inner subroutine anonymous, using the sub {} syntax. When inner anonymous subs that reference variables in outer subroutines are created, they are automatically rebound to the current values of such variables.
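    A sketch of the anonymous-sub form (the subroutine and variable names are invented): the inner sub is created afresh on each call to the outer one, so it sees that call's variable.

```perl
use strict;
use warnings;

sub make_report {
    my ($label) = @_;

    # An anonymous sub is created on every call, so it captures
    # the $label belonging to this particular call.
    my $format = sub { return "[$label] $_[0]" };

    return $format->('done');
}

my $first  = make_report('one');
my $second = make_report('two');
```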

  • vector argument not supported with alpha versions

    (S printf) The %vd (s)printf format does not support version objects with alpha parts.

  • Verb pattern '%s' has a mandatory argument in regex; marked by <-- HERE in m/%s/

    (F) You used a verb pattern that requires an argument. Supply an argument or check that you are using the right verb.

  • Verb pattern '%s' may not have an argument in regex; marked by <-- HERE in m/%s/

    (F) You used a verb pattern that is not allowed an argument. Remove the argument or check that you are using the right verb.

  • Version number must be a constant number

    (P) The attempt to translate a use Module n.n LIST statement into its equivalent BEGIN block found an internal inconsistency with the version number.

  • Version string '%s' contains invalid data; ignoring: '%s'

    (W misc) The version string contains invalid characters at the end, which are being ignored.

  • Warning: something's wrong

    (W) You passed warn() an empty string (the equivalent of warn "" ) or you called it with no args and $@ was empty.

  • Warning: unable to close filehandle %s properly

    (S) The implicit close() done by an open() got an error indication on the close(). This usually indicates your file system ran out of disk space.

  • Warning: Use of "%s" without parentheses is ambiguous

    (S ambiguous) You wrote a unary operator followed by something that looks like a binary operator that could also have been interpreted as a term or unary operator. For instance, if you know that the rand function has a default argument of 1.0, and you write

        rand + 5;

    you may THINK you wrote the same thing as

        rand() + 5;

    but in actual fact, you got

        rand(+5);

    So put in parentheses to say what you really mean.

  • when is experimental

    (S experimental::smartmatch) when depends on smartmatch, which is experimental. Additionally, it has several special cases that may not be immediately obvious, and their behavior may change or even be removed in any future release of perl. See the explanation under Experimental Details on given and when in perlsyn.

  • Wide character in %s

    (S utf8) Perl met a wide character (>255) when it wasn't expecting one. This warning is by default on for I/O (like print). The easiest way to quiet this warning is simply to add the :utf8 layer to the output, e.g. binmode STDOUT, ':utf8' . Another way to turn off the warning is to add no warnings 'utf8'; but that is often closer to cheating. In general, you are supposed to explicitly mark the filehandle with an encoding, see open and binmode.
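    A sketch of marking a handle with an encoding when it is opened. An in-memory filehandle stands in for a real file here, and :encoding(UTF-8) is used as the layer:

```perl
use strict;
use warnings;

# The in-memory file receives the encoded bytes.
my $buffer = '';
open my $out, '>:encoding(UTF-8)', \$buffer
    or die "cannot open in-memory file: $!";

# U+263A is above 255; the layer encodes it, so no warning is issued.
print {$out} "\x{263A}\n";
close $out or die "close failed: $!";

my $byte_count = length $buffer;   # three UTF-8 bytes plus the newline
```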

  • Within []-length '%c' not allowed

    (F) The count in the (un)pack template may be replaced by [TEMPLATE] only if TEMPLATE always matches the same amount of packed bytes that can be determined from the template alone. This is not possible if it contains any of the codes @, /, U, u, w or a *-length. Redesign the template.

  • write() on closed filehandle %s

    (W closed) The filehandle you're writing to got itself closed sometime before now. Check your control flow.

  • %s "\x%X" does not map to Unicode

    (F) When reading in different encodings, Perl tries to map everything into Unicode characters. The bytes you read in are not legal in this encoding. For example

        utf8 "\xE4" does not map to Unicode

    if you try to read the Latin-1 a-diaeresis (byte 0xE4) as UTF-8.

  • 'X' outside of string

    (F) You had a (un)pack template that specified a relative position before the beginning of the string being (un)packed. See pack.

  • 'x' outside of string in unpack

    (F) You had an unpack template that specified a relative position after the end of the string being unpacked. See pack.

  • YOU HAVEN'T DISABLED SET-ID SCRIPTS IN THE KERNEL YET!

    (F) And you probably never will, because you probably don't have the sources to your kernel, and your vendor probably doesn't give a rip about what you want. Your best bet is to put a setuid C wrapper around your script.

  • You need to quote "%s"

    (W syntax) You assigned a bareword as a signal handler name. Unfortunately, you already have a subroutine of that name declared, which means that Perl 5 will try to call the subroutine when the assignment is executed, which is probably not what you want. (If it IS what you want, put an & in front.)

  • Your random numbers are not that random

    (F) When trying to initialise the random seed for hashes, Perl could not get any randomness out of your system. This usually indicates Something Very Wrong.

SEE ALSO

warnings, perllexwarn, diagnostics.

 
perldoc

SYNOPSIS

    perldoc [-h] [-D] [-t] [-u] [-m] [-l] [-F]
        [-i] [-V] [-T] [-r]
        [-d destination_file]
        [-o formatname]
        [-M FormatterClassName]
        [-w formatteroption:value]
        [-n nroff-replacement]
        [-X]
        [-L language_code]
        PageName|ModuleName|ProgramName|URL

Examples:

    perldoc -f BuiltinFunction
    perldoc -L it -f BuiltinFunction
    perldoc -q FAQ Keyword
    perldoc -L fr -q FAQ Keyword
    perldoc -v PerlVariable

See below for more description of the switches.

DESCRIPTION

perldoc looks up a piece of documentation in .pod format that is embedded in the perl installation tree or in a perl script, and displays it via groff -man | $PAGER . (In addition, if running under HP-UX, col -x will be used.) This is primarily used for the documentation for the perl library modules.

Your system may also have man pages installed for those modules, in which case you can probably just use the man(1) command.

If you are looking for a table of contents to the Perl library modules documentation, see the perltoc page.

OPTIONS

  • -h

    Prints out a brief help message.

  • -D

    Describes search for the item in detail.

  • -t

    Display docs using plain text converter, instead of nroff. This may be faster, but it probably won't look as nice.

  • -u

    Skip the real Pod formatting, and just show the raw Pod source (Unformatted)

  • -m module

    Display the entire module: both code and unformatted pod documentation. This may be useful if the docs don't explain a function in the detail you need, and you'd like to inspect the code directly; perldoc will find the file for you and simply hand it off for display.

  • -l

    Display only the file name of the module found.

  • -F

    Consider arguments as file names; no search in directories will be performed.

  • -f perlfunc

    The -f option followed by the name of a perl built-in function will extract the documentation of this function from perlfunc.

    Example:

        perldoc -f sprintf

  • -q perlfaq-search-regexp

    The -q option takes a regular expression as an argument. It will search the question headings in perlfaq[1-9] and print the entries matching the regular expression.

    Example:

        perldoc -q shuffle

  • -v perlvar

    The -v option followed by the name of a Perl predefined variable will extract the documentation of this variable from perlvar.

    Examples:

        perldoc -v '$"'
        perldoc -v @+
        perldoc -v DATA

  • -T

    This specifies that the output is not to be sent to a pager, but is to be sent directly to STDOUT.

  • -d destination-filename

    This specifies that the output is to be sent neither to a pager nor to STDOUT, but is to be saved to the specified filename. Example: perldoc -oLaTeX -dtextwrapdocs.tex Text::Wrap

  • -o output-formatname

    This specifies that you want Perldoc to try using a Pod-formatting class for the output format that you specify. For example: -oman . This is actually just a wrapper around the -M switch; using -oformatname just looks for a loadable class by adding that format name (with different capitalizations) to the end of different classname prefixes.

    For example, -oLaTeX currently tries all of the following classes: Pod::Perldoc::ToLaTeX Pod::Perldoc::Tolatex Pod::Perldoc::ToLatex Pod::Perldoc::ToLATEX Pod::Simple::LaTeX Pod::Simple::latex Pod::Simple::Latex Pod::Simple::LATEX Pod::LaTeX Pod::latex Pod::Latex Pod::LATEX.
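    The name expansion described above can be sketched as follows. This is an illustration that reproduces the list given for -oLaTeX, not the code Pod::Perldoc actually uses; the function name candidate_classes is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: combine each class-name prefix with the format
# name as given, lowercased, capitalized, and uppercased.
sub candidate_classes {
    my ($format) = @_;
    my @prefixes = ('Pod::Perldoc::To', 'Pod::Simple::', 'Pod::');
    my @variants = ($format, lc($format), ucfirst(lc($format)), uc($format));

    my (%seen, @classes);
    for my $prefix (@prefixes) {
        for my $variant (@variants) {
            my $class = $prefix . $variant;
            push @classes, $class unless $seen{$class}++;   # skip duplicates
        }
    }
    return @classes;
}

my @classes = candidate_classes('LaTeX');
```

    For a format name that is already all lowercase, such as man, two of the four variants coincide, so the sketch yields fewer candidates.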

  • -M module-name

    This specifies the module that you want to try using for formatting the pod. The class must at least provide a parse_from_file method. For example: perldoc -MPod::Perldoc::ToChecker .

    You can specify several classes to try by joining them with commas or semicolons, as in -MTk::SuperPod;Tk::Pod .

  • -w option:value or -w option

    This specifies an option to call the formatter with. For example, -w textsize:15 will call $formatter->textsize(15) on the formatter object before it is used to format the object. For this to be valid, the formatter class must provide such a method, and the value you pass should be valid. (So if textsize expects an integer, and you do -w textsize:big, expect trouble.)

    You can use -w optionname (without a value) as shorthand for -w optionname:TRUE. This is presumably useful in cases of on/off features like: -w page_numbering .

    You can use an "=" instead of the ":", as in: -w textsize=15 . This might be more (or less) convenient, depending on what shell you use.
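    How a -w argument turns into a method call can be sketched like this. The formatter class and the helper apply_formatter_option are hypothetical stand-ins; only the option:value, option=value, and bare-option forms come from the description above.

```perl
use strict;
use warnings;

{
    # Hypothetical formatter class with two option methods.
    package My::ToyFormatter;
    sub new            { my ($class) = @_; return bless {}, $class }
    sub textsize       { my ($self, $v) = @_; $self->{textsize} = $v; return $self }
    sub page_numbering { my ($self, $v) = @_; $self->{page_numbering} = $v; return $self }
}

# Hypothetical helper: accept "name:value", "name=value", or "name",
# then call the method of that name on the formatter.
sub apply_formatter_option {
    my ($formatter, $spec) = @_;
    my ($name, $value) = split /[:=]/, $spec, 2;
    $value = 'TRUE' unless defined $value;   # bare option means TRUE

    my $method = $formatter->can($name)
        or die "formatter has no option '$name'\n";
    return $formatter->$method($value);
}

my $formatter = My::ToyFormatter->new;
apply_formatter_option($formatter, 'textsize:15');
apply_formatter_option($formatter, 'page_numbering');
```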

  • -X

    Use an index if it is present. The -X option looks for an entry whose basename matches the name given on the command line in the file $Config{archlib}/pod.idx . The pod.idx file should contain fully qualified filenames, one per line.

  • -L language_code

    This allows one to specify the language code for the desired language translation. If the POD2::<language_code> package isn't installed in your system, the switch is ignored. All available translation packages are to be found under the POD2:: namespace. See POD2::IT (or POD2::FR) to see how to create new localized POD2::* documentation packages and integrate them into Pod::Perldoc.

  • PageName|ModuleName|ProgramName|URL

    The item you want to look up. Nested modules (such as File::Basename ) are specified either as File::Basename or File/Basename . You may also give a descriptive name of a page, such as perlfunc . For URLs, HTTP and HTTPS are the only kind currently supported.

    For simple names like 'foo', when the normal search fails to find a matching page, a search with the "perl" prefix is tried as well. So "perldoc intro" is enough to find/render "perlintro.pod".

  • -n some-formatter

    Specify replacement for groff

  • -r

    Recursive search.

  • -i

    Ignore case.

  • -V

    Displays the version of perldoc you're running.

SECURITY

Because perldoc does not run properly tainted, and is known to have security issues, when run as the superuser it will attempt to drop privileges by setting the effective and real IDs to nobody's or nouser's account, or -2 if unavailable. If it cannot relinquish its privileges, it will not run.

ENVIRONMENT

Any switches in the PERLDOC environment variable will be used before the command line arguments.

Useful values for PERLDOC include -oterm , -otext , -ortf , -oxml , and so on, depending on what modules you have on hand; or the formatter class may be specified exactly with -MPod::Perldoc::ToTerm or the like.

perldoc also searches directories specified by the PERL5LIB (or PERLLIB if PERL5LIB is not defined) and PATH environment variables. (The latter is so that embedded pods for executables, such as perldoc itself, are available.)

In directories where either Makefile.PL or Build.PL exist, perldoc will add . and lib first to its search path, and as long as you're not the superuser will add blib too. This is really helpful if you're working inside of a build directory and want to read through the docs even if you have a version of a module previously installed.

perldoc will use, in order of preference, the pager defined in PERLDOC_PAGER , MANPAGER , or PAGER before trying to find a pager on its own. (MANPAGER is not used if perldoc was told to display plain text or unformatted pod.)

One useful value for PERLDOC_PAGER is less -+C -E .
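The order of preference can be sketched as a small function. This is an illustration of the rule stated above, not perldoc's own code; preferred_pagers and its arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: list the pagers named in the environment, in
# order of preference. MANPAGER is skipped for plain text or raw pod.
sub preferred_pagers {
    my ($env, $plain_output) = @_;
    my @pagers;
    push @pagers, $env->{PERLDOC_PAGER} if defined $env->{PERLDOC_PAGER};
    push @pagers, $env->{MANPAGER}
        if defined($env->{MANPAGER}) && !$plain_output;
    push @pagers, $env->{PAGER} if defined $env->{PAGER};
    return @pagers;
}

# Sample environment, for illustration only.
my %env = (MANPAGER => 'most', PAGER => 'more');
my @formatted = preferred_pagers(\%env, 0);
my @plain     = preferred_pagers(\%env, 1);
```

Passing \%ENV instead of the sample hash would apply the same rule to the real environment.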

Having PERLDOCDEBUG set to a positive integer will make perldoc emit even more descriptive output than the -D switch does; the higher the number, the more it emits.

CHANGES

Up to 3.14_05, the switch -v was used to produce verbose messages of perldoc operation, which is now enabled by -D.

SEE ALSO

perlpod, Pod::Perldoc

AUTHOR

Current maintainer: Mark Allen <mallen@cpan.org>

Past contributors are: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>, Kenneth Albanowski <kjahds@kjahds.com>, Andy Dougherty <doughera@lafcol.lafayette.edu>, and many others.

 
perldos

NAME

perldos - Perl under DOS, W31, W95.

SYNOPSIS

These are instructions for building Perl under DOS (or w??), using DJGPP v2.03 or later. Under w95 long filenames are supported.

DESCRIPTION

Before you start, you should glance through the README file found in the top-level directory where the Perl distribution was extracted. Make sure you read and understand the terms under which this software is being distributed.

This port currently supports MakeMaker (the set of modules that is used to build extensions to perl). Therefore, you should be able to build and install most extensions found in the CPAN sites.

Detailed instructions on how to build and install perl extension modules, including XS-type modules, are included. See 'BUILDING AND INSTALLING MODULES'.

Prerequisites for Compiling Perl on DOS

  • DJGPP

    DJGPP is a port of GNU C/C++ compiler and development tools to 32-bit, protected-mode environment on Intel 32-bit CPUs running MS-DOS and compatible operating systems, by DJ Delorie <dj@delorie.com> and friends.

    For more details (FAQ), check out the home of DJGPP at:

        http://www.delorie.com/djgpp/

    If you have questions about DJGPP, try posting to the DJGPP newsgroup: comp.os.msdos.djgpp, or use the email gateway djgpp@delorie.com.

    You can find the full DJGPP distribution on any of the mirrors listed here:

        http://www.delorie.com/djgpp/getting.html

    You need the following files to build perl (or add new modules):

        v2/djdev203.zip
        v2gnu/bnu2112b.zip
        v2gnu/gcc2953b.zip
        v2gnu/bsh204b.zip
        v2gnu/mak3791b.zip
        v2gnu/fil40b.zip
        v2gnu/sed3028b.zip
        v2gnu/txt20b.zip
        v2gnu/dif272b.zip
        v2gnu/grep24b.zip
        v2gnu/shl20jb.zip
        v2gnu/gwk306b.zip
        v2misc/csdpmi5b.zip

    or possibly any newer version.

  • Pthreads

    Thread support is not tested in this version of the djgpp perl.

Shortcomings of Perl under DOS

Perl under DOS lacks some features of perl under UNIX because of deficiencies in the UNIX-emulation, most notably:

  • fork() and pipe()

  • some features of the UNIX filesystem regarding link count and file dates

  • in-place operation is a little bit broken with short filenames

  • sockets
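
A script that has to run under both DOS and UNIX can ask the running perl which of these features it was built with, rather than guessing from the platform name. The following is a minimal sketch using the core Config module; the hash name %key_for and the choice of three features are illustrative only, and it says nothing about the filesystem or in-place editing limitations listed above.

```perl
use strict;
use warnings;
use Config;    # core module; exposes what Configure recorded for this perl

# Map a human-readable feature name to its Config.pm key.
my %key_for = (
    'fork()'  => 'd_fork',
    'pipe()'  => 'd_pipe',
    'sockets' => 'd_socket',
);

# Config values are the string 'define' when the feature was detected.
my %available;
for my $feature ( sort keys %key_for ) {
    my $value = $Config{ $key_for{$feature} };
    $available{$feature} = ( defined $value && $value eq 'define' ) ? 1 : 0;
    printf "%-8s %s\n", $feature,
        $available{$feature} ? "available" : "not available";
}
```

A program could then test $available{'fork()'} before deciding whether to fork or to fall back to doing the work in-process.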

Building Perl on DOS

  • Unpack the source package perl5.8*.tar.gz with djtarx. If you want to use long file names under w95 and also to get Perl to pass all its tests, don't forget to use

        set LFN=y
        set FNCASE=y

    before unpacking the archive.

  • Create a "symlink" or copy your bash.exe to sh.exe in your ($DJDIR)/bin directory.

        ln -s bash.exe sh.exe

    [If you have the recommended version of bash for DJGPP, this is already done for you.]

    And make the SHELL environment variable point to this sh.exe:

        set SHELL=c:/djgpp/bin/sh.exe      (use full path name!)

    You can do this in djgpp.env too. Add this line BEFORE any section definition:

        +SHELL=%DJDIR%/bin/sh.exe

  • If you have split.exe and gsplit.exe in your path, then rename split.exe to djsplit.exe, and gsplit.exe to split.exe. Copy or link gecho.exe to echo.exe if you don't have echo.exe. Copy or link gawk.exe to awk.exe if you don't have awk.exe.

    [If you have the recommended versions of djdev, shell utilities and gawk, all these are already done for you, and you will not need to do anything.]

  • Chdir to the djgpp subdirectory of perl toplevel and type the following commands:

        set FNCASE=y
        configure.bat

    This will do some preprocessing then run the Configure script for you. The Configure script is interactive, but in most cases you just need to press ENTER. The "set" command ensures that DJGPP preserves the letter case of file names when reading directories. If you already issued this set command when unpacking the archive, and you are in the same DOS session as when you unpacked the archive, you don't have to issue the set command again. This command is necessary *before* you start to (re)configure or (re)build perl in order to ensure both that perl builds correctly and that building XS-type modules can succeed. See the DJGPP info entry for "_preserve_fncase" for more information:

        info libc alphabetical _preserve_fncase

    If the script says that your package is incomplete, and asks whether to continue, just answer with Y (this can only happen if you don't use long filenames or forget to issue "set FNCASE=y" first).

    When Configure asks about the extensions, I suggest IO and Fcntl, and if you want database handling then SDBM_File or GDBM_File (you need to install gdbm for this one). If you want to use the POSIX extension (this is the default), make sure that the stack size of your cc1.exe is at least 512kbyte (you can check this with: stubedit cc1.exe ).

    You can use the Configure script in non-interactive mode too. When I built my perl.exe, I used something like this:

        configure.bat -des

    You can find more info about Configure's command line switches in the INSTALL file.

    When the script ends, and you want to change some values in the generated config.sh file, then run

        sh Configure -S

    after you made your modifications.

    IMPORTANT: if you use this -S switch, be sure to delete the CONFIG environment variable before running the script:

        set CONFIG=

  • Now you can compile Perl. Type:

        make

Testing Perl on DOS

Type:

    make test

If you're lucky you should see "All tests successful". But there can be a few failed subtests (less than 5 hopefully) depending on some external conditions (e.g. some subtests fail under linux/dosemu or plain dos with short filenames only).

Installation of Perl on DOS

Type:

    make install

This will copy the newly compiled perl and libraries into your DJGPP directory structure. Perl.exe and the utilities go into ($DJDIR)/bin, and the library goes under ($DJDIR)/lib/perl5. The pod documentation goes under ($DJDIR)/lib/perl5/pod.

BUILDING AND INSTALLING MODULES ON DOS

Building Prerequisites for Perl on DOS

For building and installing non-XS modules, all you need is a working perl under DJGPP. Non-XS modules do not require re-linking the perl binary, and so are simpler to build and install.

XS-type modules do require re-linking the perl binary, because part of an XS module is written in "C", and has to be linked together with the perl binary to be executed. This is required because perl under DJGPP is built with the "static link" option, due to the lack of "dynamic linking" in the DJGPP environment.
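
Whether a given perl binary can load compiled extensions at run time is recorded in its configuration, so a build script can check before deciding which procedure applies. This is a small sketch, assuming only the core Config module and its 'usedl' entry; the variable name $dynamic is invented for the example.

```perl
use strict;
use warnings;
use Config;

# 'usedl' is 'define' when this perl can load compiled extensions at run time.
my $usedl   = $Config{usedl};
my $dynamic = ( defined $usedl && $usedl eq 'define' ) ? 1 : 0;

print $dynamic
    ? "dynamic loading available: XS modules load as shared objects\n"
    : "static perl: XS modules must be linked into a new perl binary\n";
```

On a DJGPP build as described here, the second message would be the expected one.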

Because XS modules require re-linking of the perl binary, you need both the perl binary distribution and the perl source distribution to build an XS extension module. In addition, you will have to have built your perl binary from the source distribution so that all of the components of the perl binary are available for the required link step.

Unpacking CPAN Modules on DOS

First, download the module package from CPAN (e.g., the "Comma Separated Value" text package, Text-CSV-0.01.tar.gz). Then expand the contents of the package into some location on your disk. Most CPAN modules are built with an internal directory structure, so it is usually safe to expand it in the root of your DJGPP installation. Some people prefer to locate source trees under /usr/src (i.e., ($DJDIR)/usr/src ), but you may put it wherever seems most logical to you, *EXCEPT* under the same directory as your perl source code. There are special rules that apply to modules which live in the perl source tree that do not apply to most of the modules in CPAN.

Unlike other DJGPP packages, which are normal "zip" files, most CPAN module packages are "gzipped tarballs". Recent versions of WinZip will safely unpack and expand them, *UNLESS* they have zero-length files. It is a known WinZip bug (as of v7.0) that it will not extract zero-length files.

From the command line, you can use the djtar utility provided with DJGPP to unpack and expand these files. For example:

    C:\djgpp>djtarx -v Text-CSV-0.01.tar.gz

This will create the new directory ($DJDIR)/Text-CSV-0.01, filling it with the source for this module.

Building Non-XS Modules on DOS

To build a non-XS module, you can use the standard module-building instructions distributed with perl modules.

    perl Makefile.PL
    make
    make test
    make install

This is sufficient because non-XS modules install only ".pm" files and (sometimes) pod and/or man documentation. No re-linking of the perl binary is needed to build, install or use non-XS modules.

Building XS Modules on DOS

To build an XS module, you must use the standard module-building instructions distributed with perl modules *PLUS* three extra instructions specific to the DJGPP "static link" build environment.

    set FNCASE=y
    perl Makefile.PL
    make
    make perl
    make test
    make -f Makefile.aperl inst_perl MAP_TARGET=perl.exe
    make install

The first extra instruction sets DJGPP's FNCASE environment variable so that the new perl binary which you must build for an XS-type module will build correctly. The second extra instruction re-builds the perl binary in your module directory before you run "make test", so that you are testing with the new module code you built with "make". The third extra instruction installs the perl binary from your module directory into the standard DJGPP binary directory, ($DJDIR)/bin , replacing your previous perl binary.

Note that the MAP_TARGET value *must* have the ".exe" extension or you will not create a "perl.exe" to replace the one in ($DJDIR)/bin.

When you are done, the XS-module install process will have added information to your "perllocal" information telling that the perl binary has been replaced, and what module was installed. You can view this information at any time by using the command:

    perl -S perldoc perllocal

AUTHOR

Laszlo Molnar, laszlo.molnar@eth.ericsson.se [Installing/building perl]

Peter J. Farley III pjfarley@banet.net [Building/installing modules]

SEE ALSO

perl(1).

 
perldsc

NAME

perldsc - Perl Data Structures Cookbook

DESCRIPTION

Perl lets us have complex data structures. You can write something like this and all of a sudden, you'd have an array with three dimensions!

    for $x (1 .. 10) {
        for $y (1 .. 10) {
            for $z (1 .. 10) {
                $AoA[$x][$y][$z] =
                    $x ** $y + $z;
            }
        }
    }

Alas, however simple this may appear, underneath it's a much more elaborate construct than meets the eye!

How do you print it out? Why can't you say just print @AoA ? How do you sort it? How can you pass it to a function or get one of these back from a function? Is it an object? Can you save it to disk to read back later? How do you access whole rows or columns of that matrix? Do all the values have to be numeric?

As you see, it's quite easy to become confused. While some small portion of the blame for this can be attributed to the reference-based implementation, it's really more due to a lack of existing documentation with examples designed for the beginner.

This document is meant to be a detailed but understandable treatment of the many different sorts of data structures you might want to develop. It should also serve as a cookbook of examples. That way, when you need to create one of these complex data structures, you can just pinch, pilfer, or purloin a drop-in example from here.

Let's look at each of these possible constructs in detail. There are separate sections on each of the following:

  • arrays of arrays
  • hashes of arrays
  • arrays of hashes
  • hashes of hashes
  • more elaborate constructs

But for now, let's look at general issues common to all these types of data structures.

REFERENCES

The most important thing to understand about all data structures in Perl--including multidimensional arrays--is that even though they might appear otherwise, Perl @ARRAYs and %HASHes are all internally one-dimensional. They can hold only scalar values (meaning a string, number, or a reference). They cannot directly contain other arrays or hashes, but instead contain references to other arrays or hashes.

You can't use a reference to an array or hash in quite the same way that you would a real array or hash. For C or C++ programmers unused to distinguishing between arrays and pointers to the same, this can be confusing. If so, just think of it as the difference between a structure and a pointer to a structure.

You can (and should) read more about references in perlref. Briefly, references are rather like pointers that know what they point to. (Objects are also a kind of reference, but we won't be needing them right away--if ever.) This means that when you have something which looks to you like an access to a two-or-more-dimensional array and/or hash, what's really going on is that the base type is merely a one-dimensional entity that contains references to the next level. It's just that you can use it as though it were a two-dimensional one. This is actually the way almost all C multidimensional arrays work as well.

    $array[7][12]                       # array of arrays
    $array[7]{string}                   # array of hashes
    $hash{string}[7]                    # hash of arrays
    $hash{string}{'another string'}     # hash of hashes

Now, because the top level contains only references, if you try to print out your array with a simple print() function, you'll get something that doesn't look very nice, like this:

    @AoA = ( [2, 3], [4, 5, 7], [0] );
    print $AoA[1][2];
    7
    print @AoA;
    ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)

That's because Perl doesn't (ever) implicitly dereference your variables. If you want to get at the thing a reference is referring to, then you have to do this yourself using either prefix typing indicators, like ${$blah}, @{$blah}, @{$blah[$i]}, or else postfix pointer arrows, like $a->[3], $h->{fred}, or even $ob->method()->[3].
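
The two dereferencing styles can be compared side by side on the small array used above. This is only an illustrative sketch; the variable names $row, @copy and $same are made up for the example.

```perl
use strict;
use warnings;

my @AoA = ( [2, 3], [4, 5, 7], [0] );

# Each top-level element is a reference, not an array.
my $row = $AoA[1];
print ref($row), "\n";              # ARRAY

# Prefix form: wrap the reference in @{ } to get the whole array back.
my @copy = @{ $AoA[1] };
print "@copy\n";                    # 4 5 7

# Postfix arrow form: subscript through the reference.
print $row->[2], "\n";              # 7

# Between adjacent subscripts the arrow is optional.
my $same = $AoA[1]->[2] == $AoA[1][2];
print "both spellings reach the same element\n" if $same;
```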

COMMON MISTAKES

The two most common mistakes made in constructing something like an array of arrays are either accidentally counting the number of elements or else taking a reference to the same memory location repeatedly. Here's the case where you just get the count instead of a nested array:

    for $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = @array;      # WRONG!
    }

That's just the simple case of assigning an array to a scalar and getting its element count. If that's what you really and truly want, then you might do well to consider being a tad more explicit about it, like this:

    for $i (1..10) {
        @array = somefunc($i);
        $counts[$i] = scalar @array;
    }
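
To see the count being stored, the first loop can be run with a concrete function. In this sketch somefunc() is a stand-in invented for the example (the original leaves it undefined); it simply returns $n values.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of $n values (10, 20, ...).
sub somefunc { my ($n) = @_; return map { $_ * 10 } 1 .. $n }

my @AoA;
for my $i ( 1 .. 3 ) {
    my @array = somefunc($i);
    $AoA[$i] = @array;          # scalar assignment: stores the element count
}

print "AoA[3] holds: $AoA[3]\n";                               # AoA[3] holds: 3
print "is it a reference? ", ( ref($AoA[3]) ? "yes" : "no" ), "\n";   # no
```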

Here's the case of taking a reference to the same memory location again and again:

    for $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = \@array;     # WRONG!
    }

So, what's the big problem with that? It looks right, doesn't it? After all, I just told you that you need an array of references, so by golly, you've made me one!

Unfortunately, while this is true, it's still broken. All the references in @AoA refer to the very same place, and they will therefore all hold whatever was last in @array! It's similar to the problem demonstrated in the following C program:

    #include <pwd.h>
    #include <stdio.h>

    int main(void) {
        struct passwd *rp, *dp;

        rp = getpwnam("root");
        dp = getpwnam("daemon");

        printf("daemon name is %s\nroot name is %s\n",
               dp->pw_name, rp->pw_name);
        return 0;
    }

Which will print

    daemon name is daemon
    root name is daemon

The problem is that both rp and dp are pointers to the same location in memory! In C, you'd have to remember to malloc() yourself some new memory. In Perl, you'll want to use the array constructor [] or the hash constructor {} instead. Here's the right way to do the preceding broken code fragments:

    for $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = [ @array ];
    }

The square brackets make a reference to a new array with a copy of what's in @array at the time of the assignment. This is what you want.
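
The difference between the broken and the corrected loop shows up as soon as both are run on the same data. In this sketch somefunc() is again a hypothetical stand-in, and @shared and @copied are names chosen for the comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of $n values (10, 20, ...).
sub somefunc { my ($n) = @_; return map { $_ * 10 } 1 .. $n }

my ( @array, @shared, @copied );
for my $i ( 1 .. 3 ) {
    @array = somefunc($i);
    $shared[$i] = \@array;       # every slot points at the one @array
    $copied[$i] = [ @array ];    # every slot gets its own copy
}

print "shared[1] = @{ $shared[1] }\n";    # shared[1] = 10 20 30
print "copied[1] = @{ $copied[1] }\n";    # copied[1] = 10
```

Every element of @shared reports whatever was last assigned to @array, while @copied keeps the value each row had at the time it was stored.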

Note that this will produce something similar, but it's much harder to read:

    for $i (1..10) {
        @array = 0 .. $i;
        @{$AoA[$i]} = @array;
    }

Is it the same? Well, maybe so--and maybe not. The subtle difference is that when you assign something in square brackets, you know for sure it's always a brand new reference with a new copy of the data. Something else could be going on in this new case with the @{$AoA[$i]} dereference on the left-hand-side of the assignment. It all depends on whether $AoA[$i] had been undefined to start with, or whether it already contained a reference. If you had already populated @AoA with references, as in

    $AoA[3] = \@another_array;

Then the assignment with the indirection on the left-hand-side would use the existing reference that was already there:

    @{$AoA[3]} = @array;

Of course, this would have the "interesting" effect of clobbering @another_array. (Have you ever noticed how when a programmer says something is "interesting", that rather than meaning "intriguing", they're disturbingly more apt to mean that it's "annoying", "difficult", or both? :-)
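
Both outcomes of the left-hand-side dereference can be reproduced in a few lines. This is a sketch with made-up contents for @another_array and @array; slot 3 is populated beforehand and slot 4 is left undefined.

```perl
use strict;
use warnings;

my @another_array = ( "keep", "me" );
my @array         = ( 1, 2, 3 );
my @AoA;

# Slot 3 already holds a reference, so the assignment writes through it.
$AoA[3] = \@another_array;
@{ $AoA[3] } = @array;
print "another_array is now: @another_array\n";    # another_array is now: 1 2 3

# Slot 4 is undefined, so Perl creates a fresh anonymous array for it.
@{ $AoA[4] } = @array;
push @{ $AoA[4] }, 4;
print "array is still: @array\n";                  # array is still: 1 2 3
```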

So just remember always to use the array or hash constructors with [] or {}, and you'll be fine, although it's not always optimally efficient.

Surprisingly, the following dangerous-looking construct will actually work out fine:

    for $i (1..10) {
        my @array = somefunc($i);
        $AoA[$i] = \@array;
    }

That's because my() is more of a run-time statement than it is a compile-time declaration per se. This means that the my() variable is remade afresh each time through the loop. So even though it looks as though you stored the same variable reference each time, you actually did not! This is a subtle distinction that can produce more efficient code at the risk of misleading all but the most experienced of programmers. So I usually advise against teaching it to beginners. In fact, except for passing arguments to functions, I seldom like to see the gimme-a-reference operator (backslash) used much at all in code. Instead, I advise beginners that they (and most of the rest of us) should try to use the much more easily understood constructors [] and {} instead of relying upon lexical (or dynamic) scoping and hidden reference-counting to do the right thing behind the scenes.
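
Running the loop with a concrete function confirms that each iteration stores a different array. As before, somefunc() is a hypothetical stand-in for whatever produces the rows.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of $n values (10, 20, ...).
sub somefunc { my ($n) = @_; return map { $_ * 10 } 1 .. $n }

my @AoA;
for my $i ( 1 .. 3 ) {
    my @array = somefunc($i);    # a new @array on every iteration
    $AoA[$i] = \@array;
}

print "row 1: @{ $AoA[1] }\n";    # row 1: 10
print "row 3: @{ $AoA[3] }\n";    # row 3: 10 20 30
print "rows are distinct arrays\n" if $AoA[1] != $AoA[3];
```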

In summary:

    $AoA[$i] = [ @array ];      # usually best
    $AoA[$i] = \@array;         # perilous; just how my() was that array?
    @{ $AoA[$i] } = @array;     # way too tricky for most programmers

CAVEAT ON PRECEDENCE

Speaking of things like @{$AoA[$i]}, the following are actually the same thing:

    $aref->[2][2]       # clear
    $$aref[2][2]        # confusing

That's because Perl's precedence rules on its five prefix dereferencers (which look like someone swearing: $ @ * % & ) make them bind more tightly than the postfix subscripting brackets or braces! This will no doubt come as a great shock to the C or C++ programmer, who is quite accustomed to using *a[i] to mean what's pointed to by the i'th element of a . That is, they first take the subscript, and only then dereference the thing at that subscript. That's fine in C, but this isn't C.

The seemingly equivalent construct in Perl, $$aref[$i], first dereferences $aref, treating it as a reference to an array, and then gives you the i'th value of the array pointed to by $aref. If you wanted the C notion, you'd have to write ${$AoA[$i]} to force the $AoA[$i] to get evaluated first before the leading $ dereferencer.
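
The two readings give different answers when an array and a scalar happen to share a name. The sketch below deliberately sets up an array @names of scalar references alongside an unrelated array reference $names; both variables and their contents are invented for the illustration.

```perl
use strict;
use warnings;

my @names = ( \"fred", \"barney" );    # array of scalar references
my $names = [ "homer", "marge" ];      # unrelated array reference

# $$names[1] is ${$names}[1]: dereference $names first, then subscript.
my $via_ref = $$names[1];

# ${ $names[1] } subscripts @names first, then dereferences that element.
my $via_array = ${ $names[1] };

print "\$\$names[1]      gives $via_ref\n";      # gives marge
print "\${ \$names[1] } gives $via_array\n";     # gives barney
```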

WHY YOU SHOULD ALWAYS use strict

If this is starting to sound scarier than it's worth, relax. Perl has some features to help you avoid its most common pitfalls. The best way to avoid getting confused is to start every program like this:

    #!/usr/bin/perl -w
    use strict;

This way, you'll be forced to declare all your variables with my() and also disallow accidental "symbolic dereferencing". Therefore if you'd done this:

    my $aref = [
        [ "fred", "barney", "pebbles", "bambam", "dino", ],
        [ "homer", "bart", "marge", "maggie", ],
        [ "george", "jane", "elroy", "judy", ],
    ];

    print $aref[2][2];

The compiler would immediately flag that as an error at compile time, because you were accidentally accessing @aref, an undeclared variable, and it would thereby remind you to write instead:

    print $aref->[2][2]
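
The compile-time check can be observed without aborting a program by compiling the faulty code inside a string eval. This is a sketch only; the shortened data and the names $snippet, $compiled and $error are assumptions made for the example, and the exact wording of the message varies between perl versions.

```perl
use strict;
use warnings;

# The snippet is kept in a string so that its compile error can be caught.
my $snippet = <<'END_SNIPPET';
use strict;
my $aref = [ [ "fred" ], [ "homer" ], [ "george", "jane", "elroy" ] ];
print $aref[2][2];
END_SNIPPET

my $compiled = eval "$snippet; 1";
my $error    = $compiled ? "" : $@;

print $compiled ? "compiled cleanly\n" : "rejected: $error";
```

The message names @aref as the offending global symbol, which is the hint that an arrow is missing.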

DEBUGGING

You can use the debugger's x command to dump out complex data structures. For example, given the same nested structure as assigned to $aref above, but stored in $AoA, here's the debugger output:

    DB<1> x $AoA
    $AoA = ARRAY(0x13b5a0)
       0  ARRAY(0x1f0a24)
          0  'fred'
          1  'barney'
          2  'pebbles'
          3  'bambam'
          4  'dino'
       1  ARRAY(0x13b558)
          0  'homer'
          1  'bart'
          2  'marge'
          3  'maggie'
       2  ARRAY(0x13b540)
          0  'george'
          1  'jane'
          2  'elroy'
          3  'judy'
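
Outside the debugger, a similar dump can be produced from within a program. The sketch below uses the core Data::Dumper module, which the text above does not mention; it is offered as one possible alternative rather than the recommended approach.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent   = 1;    # compact, fixed-width indentation
$Data::Dumper::Sortkeys = 1;    # stable key order for any hashes

my $AoA = [
    [ "fred", "barney", "pebbles", "bambam", "dino" ],
    [ "homer", "bart", "marge", "maggie" ],
    [ "george", "jane", "elroy", "judy" ],
];

# Dumper() returns the structure as a string of Perl source.
my $dump = Dumper($AoA);
print $dump;
```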

CODE EXAMPLES

Presented with little comment (these will get their own manpages someday) here are short code examples illustrating access of various types of data structures.

ARRAYS OF ARRAYS

Declaration of an ARRAY OF ARRAYS

    @AoA = (
        [ "fred", "barney" ],
        [ "george", "jane", "elroy" ],
        [ "homer", "marge", "bart" ],
    );

Generation of an ARRAY OF ARRAYS

    # reading from file
    while ( <> ) {
        push @AoA, [ split ];
    }

    # calling a function
    for $i ( 1 .. 10 ) {
        $AoA[$i] = [ somefunc($i) ];
    }

    # using temp vars
    for $i ( 1 .. 10 ) {
        @tmp = somefunc($i);
        $AoA[$i] = [ @tmp ];
    }

    # add to an existing row
    push @{ $AoA[0] }, "wilma", "betty";

Access and Printing of an ARRAY OF ARRAYS

    # one element
    $AoA[0][0] = "Fred";

    # another element
    $AoA[1][1] =~ s/(\w)/\u$1/;

    # print the whole thing with refs
    for $aref ( @AoA ) {
        print "\t [ @$aref ],\n";
    }

    # print the whole thing with indices
    for $i ( 0 .. $#AoA ) {
        print "\t [ @{$AoA[$i]} ],\n";
    }

    # print the whole thing one at a time
    for $i ( 0 .. $#AoA ) {
        for $j ( 0 .. $#{ $AoA[$i] } ) {
            print "elt $i $j is $AoA[$i][$j]\n";
        }
    }
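
The same dereferencing syntax also answers the earlier question about whole rows and columns. This is a brief sketch on the declaration used above; @row and @col are names chosen for the example, and shorter rows would contribute undef to a column.

```perl
use strict;
use warnings;

my @AoA = (
    [ "fred", "barney" ],
    [ "george", "jane", "elroy" ],
    [ "homer", "marge", "bart" ],
);

# A whole row: dereference one element of the outer array.
my @row = @{ $AoA[1] };

# A whole column: pick the same index out of every row.
my @col = map { $_->[1] } @AoA;

print "row 1: @row\n";       # row 1: george jane elroy
print "column 1: @col\n";    # column 1: barney jane marge
```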

HASHES OF ARRAYS

Declaration of a HASH OF ARRAYS

    %HoA = (
        flintstones => [ "fred", "barney" ],
        jetsons     => [ "george", "jane", "elroy" ],
        simpsons    => [ "homer", "marge", "bart" ],
    );

Generation of a HASH OF ARRAYS

    # reading from file
    # flintstones: fred barney wilma dino
    while ( <> ) {
        next unless s/^(.*?):\s*//;
        $HoA{$1} = [ split ];
    }

    # reading from file; more temps
    # flintstones: fred barney wilma dino
    while ( $line = <> ) {
        ($who, $rest) = split /:\s*/, $line, 2;
        @fields = split ' ', $rest;
        $HoA{$who} = [ @fields ];
    }

    # calling a function that returns a list
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        $HoA{$group} = [ get_family($group) ];
    }

    # likewise, but using temps
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        @members = get_family($group);
        $HoA{$group} = [ @members ];
    }

    # append new members to an existing family
    push @{ $HoA{"flintstones"} }, "wilma", "betty";

Access and Printing of a HASH OF ARRAYS

    # one element
    $HoA{flintstones}[0] = "Fred";

    # another element
    $HoA{simpsons}[1] =~ s/(\w)/\u$1/;

    # print the whole thing
    foreach $family ( keys %HoA ) {
        print "$family: @{ $HoA{$family} }\n"
    }

    # print the whole thing with indices
    foreach $family ( keys %HoA ) {
        print "$family: ";
        foreach $i ( 0 .. $#{ $HoA{$family} } ) {
            print " $i = $HoA{$family}[$i]";
        }
        print "\n";
    }

    # print the whole thing sorted by number of members
    foreach $family ( sort { @{$HoA{$b}} <=> @{$HoA{$a}} } keys %HoA ) {
        print "$family: @{ $HoA{$family} }\n"
    }

    # print the whole thing sorted by number of members and name
    foreach $family ( sort {
                          @{$HoA{$b}} <=> @{$HoA{$a}}
                                      ||
                                  $a cmp $b
                      } keys %HoA )
    {
        print "$family: ", join(", ", sort @{ $HoA{$family} }), "\n";
    }

ARRAYS OF HASHES

Declaration of an ARRAY OF HASHES

    @AoH = (
        {
            Lead   => "fred",
            Friend => "barney",
        },
        {
            Lead => "george",
            Wife => "jane",
            Son  => "elroy",
        },
        {
            Lead => "homer",
            Wife => "marge",
            Son  => "bart",
        }
    );

Generation of an ARRAY OF HASHES

    # reading from file
    # format: LEAD=fred FRIEND=barney
    while ( <> ) {
        $rec = {};
        for $field ( split ) {
            ($key, $value) = split /=/, $field;
            $rec->{$key} = $value;
        }
        push @AoH, $rec;
    }

    # reading from file
    # format: LEAD=fred FRIEND=barney
    # no temp
    while ( <> ) {
        # split on runs of whitespace and/or "="
        push @AoH, { split /[\s=]+/ };
    }

    # calling a function that returns a key/value pair list, like
    # "lead","fred","daughter","pebbles"
    while ( %fields = getnextpairset() ) {
        push @AoH, { %fields };
    }

    # likewise, but using no temp vars
    while (<>) {
        push @AoH, { parsepairs($_) };
    }

    # add key/value to an element
    $AoH[0]{pet} = "dino";
    $AoH[2]{pet} = "santa's little helper";

Access and Printing of an ARRAY OF HASHES

    # one element
    $AoH[0]{lead} = "fred";

    # another element
    $AoH[1]{lead} =~ s/(\w)/\u$1/;

    # print the whole thing with refs
    for $href ( @AoH ) {
        print "{ ";
        for $role ( keys %$href ) {
            print "$role=$href->{$role} ";
        }
        print "}\n";
    }

    # print the whole thing with indices
    for $i ( 0 .. $#AoH ) {
        print "$i is { ";
        for $role ( keys %{ $AoH[$i] } ) {
            print "$role=$AoH[$i]{$role} ";
        }
        print "}\n";
    }

    # print the whole thing one at a time
    for $i ( 0 .. $#AoH ) {
        for $role ( keys %{ $AoH[$i] } ) {
            print "elt $i $role is $AoH[$i]{$role}\n";
        }
    }

HASHES OF HASHES

Declaration of a HASH OF HASHES

    %HoH = (
        flintstones => {
            lead => "fred",
            pal  => "barney",
        },
        jetsons => {
            lead      => "george",
            wife      => "jane",
            "his boy" => "elroy",
        },
        simpsons => {
            lead => "homer",
            wife => "marge",
            kid  => "bart",
        },
    );

Generation of a HASH OF HASHES

    # reading from file
    # flintstones: lead=fred pal=barney wife=wilma pet=dino
    while ( <> ) {
        next unless s/^(.*?):\s*//;
        $who = $1;
        for $field ( split ) {
            ($key, $value) = split /=/, $field;
            $HoH{$who}{$key} = $value;
        }
    }

    # reading from file; more temps
    while ( <> ) {
        next unless s/^(.*?):\s*//;
        $who = $1;
        $rec = {};
        $HoH{$who} = $rec;
        for $field ( split ) {
            ($key, $value) = split /=/, $field;
            $rec->{$key} = $value;
        }
    }

    # calling a function that returns a key,value hash
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        $HoH{$group} = { get_family($group) };
    }

    # likewise, but using temps
    for $group ( "simpsons", "jetsons", "flintstones" ) {
        %members = get_family($group);
        $HoH{$group} = { %members };
    }

    # append new members to an existing family
    %new_folks = (
        wife => "wilma",
        pet  => "dino",
    );

    for $what (keys %new_folks) {
        $HoH{flintstones}{$what} = $new_folks{$what};
    }

Access and Printing of a HASH OF HASHES

    # one element
    $HoH{flintstones}{wife} = "wilma";

    # another element
    $HoH{simpsons}{lead} =~ s/(\w)/\u$1/;

    # print the whole thing
    foreach $family ( keys %HoH ) {
        print "$family: { ";
        for $role ( keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

    # print the whole thing somewhat sorted
    foreach $family ( sort keys %HoH ) {
        print "$family: { ";
        for $role ( sort keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

    # print the whole thing sorted by number of members
    foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} } keys %HoH ) {
        print "$family: { ";
        for $role ( sort keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

    # establish a sort order (rank) for each role
    $i = 0;
    for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }

    # now print the whole thing sorted by number of members
    foreach $family ( sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } } keys %HoH ) {
        print "$family: { ";
        # and print these according to rank order
        for $role ( sort { $rank{$a} <=> $rank{$b} } keys %{ $HoH{$family} } ) {
            print "$role=$HoH{$family}{$role} ";
        }
        print "}\n";
    }

MORE ELABORATE RECORDS

Declaration of MORE ELABORATE RECORDS

Here's a sample showing how to create and use a record whose fields are of many different sorts:

    $rec = {
        TEXT     => $string,
        SEQUENCE => [ @old_values ],
        LOOKUP   => { %some_table },
        THATCODE => \&some_function,
        THISCODE => sub { $_[0] ** $_[1] },
        HANDLE   => \*STDOUT,
    };

    print $rec->{TEXT};

    print $rec->{SEQUENCE}[0];
    $last = pop @{ $rec->{SEQUENCE} };

    print $rec->{LOOKUP}{"key"};
    ($first_k, $first_v) = each %{ $rec->{LOOKUP} };

    $answer = $rec->{THATCODE}->($arg);
    $answer = $rec->{THISCODE}->($arg1, $arg2);

    # careful of extra block braces on fh ref
    print { $rec->{HANDLE} } "a string\n";

    use FileHandle;
    $rec->{HANDLE}->autoflush(1);
    $rec->{HANDLE}->print(" a string\n");

Declaration of a HASH OF COMPLEX RECORDS

    %TV = (
        flintstones => {
            series  => "flintstones",
            nights  => [ qw(monday thursday friday) ],
            members => [
                { name => "fred",    role => "lead", age => 36, },
                { name => "wilma",   role => "wife", age => 31, },
                { name => "pebbles", role => "kid",  age => 4, },
            ],
        },

        jetsons => {
            series  => "jetsons",
            nights  => [ qw(wednesday saturday) ],
            members => [
                { name => "george", role => "lead", age => 41, },
                { name => "jane",   role => "wife", age => 39, },
                { name => "elroy",  role => "kid",  age => 9, },
            ],
        },

        simpsons => {
            series  => "simpsons",
            nights  => [ qw(monday) ],
            members => [
                { name => "homer", role => "lead", age => 34, },
                { name => "marge", role => "wife", age => 37, },
                { name => "bart",  role => "kid",  age => 11, },
            ],
        },
    );

Generation of a HASH OF COMPLEX RECORDS

    # reading from file
    # this is most easily done by having the file itself be
    # in the raw data format as shown above. perl is happy
    # to parse complex data structures if declared as data, so
    # sometimes it's easiest to do that

    # here's a piece by piece build up
    $rec = {};
    $rec->{series} = "flintstones";
    $rec->{nights} = [ find_days() ];

    @members = ();
    # assume this file in field=value syntax
    while (<>) {
        %fields = split /[\s=]+/;
        push @members, { %fields };
    }
    $rec->{members} = [ @members ];

    # now remember the whole thing
    $TV{ $rec->{series} } = $rec;

    ###########################################################
    # now, you might want to make interesting extra fields that
    # include pointers back into the same data structure so if
    # you change one piece, it changes everywhere, like for example
    # if you wanted a {kids} field that was a reference
    # to an array of the kids' records without having duplicate
    # records and thus update problems.
    ###########################################################
    foreach $family (keys %TV) {
        $rec = $TV{$family};    # temp pointer
        @kids = ();
        for $person ( @{ $rec->{members} } ) {
            if ($person->{role} =~ /kid|son|daughter/) {
                push @kids, $person;
            }
        }
        # REMEMBER: $rec and $TV{$family} point to same data!!
        $rec->{kids} = [ @kids ];
    }

    # you copied the array, but the array itself contains pointers
    # to uncopied objects. this means that if you make bart get
    # older via

    $TV{simpsons}{kids}[0]{age}++;

    # then this would also change in
    print $TV{simpsons}{members}[2]{age};

    # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2]
    # both point to the same underlying anonymous hash table

    # print the whole thing
    foreach $family ( keys %TV ) {
        print "the $family";
        print " is on during @{ $TV{$family}{nights} }\n";
        print "its members are:\n";
        for $who ( @{ $TV{$family}{members} } ) {
            print " $who->{name} ($who->{role}), age $who->{age}\n";
        }
        # the records have no {lead} field; find the member whose role is "lead"
        ($lead) = grep { $_->{role} eq "lead" } @{ $TV{$family}{members} };
        print "it turns out that $lead->{name} has ";
        print scalar ( @{ $TV{$family}{kids} } ), " kids named ";
        print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } );
        print "\n";
    }

Database Ties

You cannot easily tie a multilevel data structure (such as a hash of hashes) to a dbm file. The first problem is that all but GDBM and Berkeley DB have size limitations, but beyond that, you also have problems with how references are to be represented on disk. One experimental module that does partially attempt to address this need is the MLDBM module. Check your nearest CPAN site as described in perlmodlib for source code to MLDBM.
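
The representation problem is easy to observe: a DBM file stores only strings, so a reference assigned to a tied hash is flattened to its printed form. The sketch below assumes the SDBM_File module that is bundled with perl and a scratch directory from File::Temp; the file name "families" and the sample record are invented for the example.

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;                   # the DBM implementation bundled with perl
use File::Temp qw(tempdir);

my $dir = tempdir( CLEANUP => 1 );    # scratch directory, removed at exit

tie my %db, 'SDBM_File', "$dir/families", O_RDWR | O_CREAT, 0666
    or die "cannot tie: $!";

# Try to store a nested hash; only its string form reaches the file.
$db{flintstones} = { lead => "fred", pal => "barney" };
my $stored = $db{flintstones};
untie %db;

my $verdict = ref($stored) ? "a reference" : "a plain string";
print "read back $verdict: $stored\n";
```

What comes back is text of the form HASH(0x...), which can no longer be dereferenced; this is the gap a module such as MLDBM tries to fill.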

SEE ALSO

perlref, perllol, perldata, perlobj

AUTHOR

Tom Christiansen <tchrist@perl.com>

 
perldtrace

NAME

perldtrace - Perl's support for DTrace

SYNOPSIS

    # dtrace -Zn 'perl::sub-entry, perl::sub-return { trace(copyinstr(arg0)) }'
    dtrace: description 'perl::sub-entry, perl::sub-return ' matched 10 probes

    # perl -E 'sub outer { inner(@_) } sub inner { say shift } outer("hello")'
    hello

    (dtrace output)
    CPU     ID                    FUNCTION:NAME
      0  75915       Perl_pp_entersub:sub-entry   BEGIN
      0  75915       Perl_pp_entersub:sub-entry   import
      0  75922      Perl_pp_leavesub:sub-return   import
      0  75922      Perl_pp_leavesub:sub-return   BEGIN
      0  75915       Perl_pp_entersub:sub-entry   outer
      0  75915       Perl_pp_entersub:sub-entry   inner
      0  75922      Perl_pp_leavesub:sub-return   inner
      0  75922      Perl_pp_leavesub:sub-return   outer

DESCRIPTION

DTrace is a framework for comprehensive system- and application-level tracing. Perl is a DTrace provider, meaning it exposes several probes for instrumentation. You can use these in conjunction with kernel-level probes, as well as probes from other providers such as MySQL, in order to diagnose software defects, or even just your application's bottlenecks.

Perl must be compiled with the -Dusedtrace option in order to make use of the provided probes. While DTrace aims to have no overhead when its instrumentation is not active, Perl's support itself cannot uphold that guarantee, so it is built without DTrace probes under most systems. One notable exception is that Mac OS X ships a /usr/bin/perl with DTrace support enabled.
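
To find out whether a particular perl was built with the probes, you can inspect its build configuration. This is a small sketch that assumes the Configure variable usedtrace is recorded in %Config as "define" when -Dusedtrace was given; the same value should be visible from the shell with perl -V:usedtrace.

```perl
use strict;
use warnings;
use Config;

# 'usedtrace' is expected to be "define" when perl was configured
# with -Dusedtrace, and undefined or empty otherwise.
my $has_dtrace = ( $Config{usedtrace} // '' ) eq 'define';

print "This perl ($^X) was ",
    ( $has_dtrace ? '' : 'not ' ),
    "built with DTrace probes\n";
```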

HISTORY

5.10.1

Perl's initial DTrace support was added, providing sub-entry and sub-return probes.

5.14.0

The sub-entry and sub-return probes gain a fourth argument: the package name of the function.

5.16.0

The phase-change probe was added.

5.18.0

The op-entry, loading-file, and loaded-file probes were added.

PROBES

  • sub-entry(SUBNAME, FILE, LINE, PACKAGE)

    Traces the entry of any subroutine. Note that all of the variables refer to the subroutine that is being invoked; there is currently no way to get hold of any information about the subroutine's caller from a DTrace action.

        :*perl*::sub-entry {
            printf("%s::%s entered at %s line %d\n",
                   copyinstr(arg3), copyinstr(arg0), copyinstr(arg1), arg2);
        }

  • sub-return(SUBNAME, FILE, LINE, PACKAGE)

    Traces the exit of any subroutine. Note that all of the variables refer to the subroutine that is returning; there is currently no way to get hold of any information about the subroutine's caller from a DTrace action.

        :*perl*::sub-return {
            printf("%s::%s returned at %s line %d\n",
                   copyinstr(arg3), copyinstr(arg0), copyinstr(arg1), arg2);
        }

  • phase-change(NEWPHASE, OLDPHASE)

    Traces changes to Perl's interpreter state. You can think of this as tracing changes to Perl's ${^GLOBAL_PHASE} variable, especially since the values for NEWPHASE and OLDPHASE are the strings that ${^GLOBAL_PHASE} reports.

        :*perl*::phase-change {
            printf("Phase changed from %s to %s\n",
                   copyinstr(arg1), copyinstr(arg0));
        }

  • op-entry(OPNAME)

    Traces the execution of each opcode in the Perl runloop. This probe is fired before the opcode is executed. When the Perl debugger is enabled, the DTrace probe is fired after the debugger hooks (but still before the opcode itself is executed).

        :*perl*::op-entry {
            printf("About to execute opcode %s\n", copyinstr(arg0));
        }

  • loading-file(FILENAME)

    Fires when Perl is about to load an individual file, whether from use, require, or do. This probe fires before the file is read from disk. The filename argument is a local filesystem path rather than a Module::Name-style name.

        :*perl*::loading-file {
            printf("About to load %s\n", copyinstr(arg0));
        }

  • loaded-file(FILENAME)

    Fires when Perl has successfully loaded an individual file, whether from use, require, or do. This probe fires after the file has been read from disk and its contents evaluated. The filename argument is a local filesystem path rather than a Module::Name-style name.

        :*perl*::loaded-file {
            printf("Successfully loaded %s\n", copyinstr(arg0));
        }

EXAMPLES

  • Most frequently called functions

        # dtrace -qZn 'sub-entry { @[strjoin(strjoin(copyinstr(arg3),"::"),copyinstr(arg0))] = count() } END {trunc(@, 10)}'

        Class::MOP::Attribute::slots                           400
        Try::Tiny::catch                                       411
        Try::Tiny::try                                         411
        Class::MOP::Instance::inline_slot_access               451
        Class::MOP::Class::Immutable::Trait:::around           472
        Class::MOP::Mixin::AttributeCore::has_initializer      496
        Class::MOP::Method::Wrapped::__ANON__                  544
        Class::MOP::Package::_package_stash                    737
        Class::MOP::Class::initialize                         1128
        Class::MOP::get_metaclass_by_name                     1204

  • Trace function calls

        # dtrace -qFZn 'sub-entry, sub-return { trace(copyinstr(arg0)) }'

        0  -> Perl_pp_entersub    BEGIN
        0  <- Perl_pp_leavesub    BEGIN
        0  -> Perl_pp_entersub    BEGIN
        0  -> Perl_pp_entersub    import
        0  <- Perl_pp_leavesub    import
        0  <- Perl_pp_leavesub    BEGIN
        0  -> Perl_pp_entersub    BEGIN
        0  -> Perl_pp_entersub    dress
        0  <- Perl_pp_leavesub    dress
        0  -> Perl_pp_entersub    dirty
        0  <- Perl_pp_leavesub    dirty
        0  -> Perl_pp_entersub    whiten
        0  <- Perl_pp_leavesub    whiten
        0  <- Perl_dounwind       BEGIN

  • Function calls during interpreter cleanup

        # dtrace -Zn 'phase-change /copyinstr(arg0) == "END"/ { self->ending = 1 } sub-entry /self->ending/ { trace(copyinstr(arg0)) }'

        CPU     ID                    FUNCTION:NAME
          1  77214       Perl_pp_entersub:sub-entry   END
          1  77214       Perl_pp_entersub:sub-entry   END
          1  77214       Perl_pp_entersub:sub-entry   cleanup
          1  77214       Perl_pp_entersub:sub-entry   _force_writable
          1  77214       Perl_pp_entersub:sub-entry   _force_writable

  • System calls at compile time

        # dtrace -qZn 'phase-change /copyinstr(arg0) == "START"/ { self->interesting = 1 } phase-change /copyinstr(arg0) == "RUN"/ { self->interesting = 0 } syscall::: /self->interesting/ { @[probefunc] = count() } END { trunc(@, 3) }'

        lseek      310
        read       374
        stat64    1056

  • Perl functions that execute the most opcodes

        # dtrace -qZn 'sub-entry { self->fqn = strjoin(copyinstr(arg3), strjoin("::", copyinstr(arg0))) } op-entry /self->fqn != ""/ { @[self->fqn] = count() } END { trunc(@, 3) }'

        warnings::unimport                  4589
        Exporter::Heavy::_rebuild_cache     5039
        Exporter::import                   14578

REFERENCES

SEE ALSO

AUTHORS

Shawn M Moore sartak@gmail.com

 
perlebcdic

NAME

perlebcdic - Considerations for running Perl on EBCDIC platforms

DESCRIPTION

An exploration of some of the issues facing Perl programmers on EBCDIC based computers. We do not cover localization, internationalization, or multi-byte character set issues other than some discussion of UTF-8 and UTF-EBCDIC.

Portions that are still incomplete are marked with XXX.

Perl used to work on EBCDIC machines, but there are now areas of the code where it doesn't. If you want to use Perl on an EBCDIC machine, please let us know by sending mail to perlbug@perl.org

COMMON CHARACTER CODE SETS

ASCII

The American Standard Code for Information Interchange (ASCII or US-ASCII) is a set of integers running from 0 to 127 (decimal) that imply character interpretation by the display and other systems of computers. The range 0..127 can be covered by setting the bits in a 7-bit binary number, hence the set is sometimes referred to as "7-bit ASCII". ASCII was described by the American National Standards Institute document ANSI X3.4-1986. It was also described by ISO 646:1991 (with localization for currency symbols). The full ASCII set is given in the table below as the first 128 elements. Languages that can be written adequately with the characters in ASCII include English, Hawaiian, Indonesian, Swahili and some Native American languages.

There are many character sets that extend the range of integers from 0..2**7-1 up to 2**8-1, or 8 bit bytes (octets if you prefer). One common one is the ISO 8859-1 character set.

ISO 8859

The ISO 8859-$n are a collection of character code sets from the International Organization for Standardization (ISO), each of which adds characters to the ASCII set that are typically found in European languages, many of which are based on the Roman, or Latin, alphabet.

Latin 1 (ISO 8859-1)

A particular 8-bit extension to ASCII that includes grave and acute accented Latin characters. Languages that can employ ISO 8859-1 include all the languages covered by ASCII as well as Afrikaans, Albanian, Basque, Catalan, Danish, Faroese, Finnish, Norwegian, Portuguese, Spanish, and Swedish. Dutch is covered albeit without the ij ligature. French is covered too but without the oe ligature. German can use ISO 8859-1 but must do so without German-style quotation marks. This set is based on Western European extensions to ASCII and is commonly encountered in world wide web work. In IBM character code set identification terminology ISO 8859-1 is also known as CCSID 819 (or sometimes 0819 or even 00819).

EBCDIC

The Extended Binary Coded Decimal Interchange Code refers to a large collection of single- and multi-byte coded character sets that are different from ASCII or ISO 8859-1 and are all slightly different from each other; they typically run on host computers. The EBCDIC encodings derive from 8-bit byte extensions of Hollerith punched card encodings. The layout on the cards was such that high bits were set for the upper and lower case alphabet characters [a-z] and [A-Z], but there were gaps within each Latin alphabet range.

Some IBM EBCDIC character sets may be known by character code set identification numbers (CCSID numbers) or code page numbers.

Perl can be compiled on platforms that run any of three commonly used EBCDIC character sets, listed below.

The 13 variant characters

Among IBM EBCDIC character code sets there are 13 characters that are often mapped to different integer values. Those characters are known as the 13 "variant" characters and are:

    \ [ ] { } ^ ~ ! # | $ @ `

When Perl is compiled for a platform, it looks at some of these characters to guess which EBCDIC character set the platform uses, and adapts itself accordingly to that platform. If the platform uses a character set that is not one of the three Perl knows about, Perl will either fail to compile, or mistakenly and silently choose one of the three. They are:

  • 0037

    Character code set ID 0037 is a mapping of the ASCII plus Latin-1 characters (i.e. ISO 8859-1) to an EBCDIC set. 0037 is used in North American English locales on the OS/400 operating system that runs on AS/400 computers. CCSID 0037 differs from ISO 8859-1 in 237 places, in other words they agree on only 19 code point values.

  • 1047

    Character code set ID 1047 is also a mapping of the ASCII plus Latin-1 characters (i.e. ISO 8859-1) to an EBCDIC set. 1047 is used under Unix System Services for OS/390 or z/OS, and OpenEdition for VM/ESA. CCSID 1047 differs from CCSID 0037 in eight places.

  • POSIX-BC

    The EBCDIC code page in use on Siemens' BS2000 system is distinct from 1047 and 0037. It is identified below as the POSIX-BC set.

Unicode code points versus EBCDIC code points

In Unicode terminology a code point is the number assigned to a character: for example, in EBCDIC the character "A" is usually assigned the number 193. In Unicode the character "A" is assigned the number 65. This causes a problem with the semantics of the pack/unpack "U", which are supposed to pack Unicode code points to characters and back to numbers. The problem is: which code points to use for code points less than 256? (for 256 and over there's no problem: Unicode code points are used) In EBCDIC, for the low 256 the EBCDIC code points are used. This means that the equivalences

    pack("U", ord($character)) eq $character
    unpack("U", $character) == ord $character

will hold. (If Unicode code points were applied consistently over all the possible code points, pack("U",ord("A")) would in EBCDIC equal A with acute or chr(101), and unpack("U", "A") would equal 65, or non-breaking space, not 193, or ord "A".)
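
The round trip through pack and unpack can be checked for the low 256 code points with a short loop. This is a sketch only: it packs each native code point with "U", compares the result with chr of the same number, and unpacks the packed string again. It has been reasoned through for ASCII platforms; the text above states that the equivalences hold on EBCDIC platforms as well.

```perl
use strict;
use warnings;

my @failures;
for my $n ( 0 .. 255 ) {
    my $character = chr $n;
    my $packed    = pack( "U", $n );    # code point -> one-character string

    push @failures, $n
        unless $packed eq $character             # same character as chr($n)
           and unpack( "U", $packed ) == $n;     # and back to the same number
}

print @failures
    ? "round trip fails for: @failures\n"
    : "pack/unpack \"U\" round-trips code points 0..255\n";
```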

Remaining Perl Unicode problems in EBCDIC

  • Many of the remaining problems seem to be related to case-insensitive matching.

  • The extensions Unicode::Collate and Unicode::Normalize are not supported under EBCDIC, likewise for the encoding pragma.

Unicode and UTF

UTF stands for Unicode Transformation Format. UTF-8 is an encoding of Unicode into a sequence of 8-bit byte chunks, based on ASCII and Latin-1. The length of a sequence required to represent a Unicode code point depends on the ordinal number of that code point, with larger numbers requiring more bytes. UTF-EBCDIC is like UTF-8, but based on EBCDIC.

You may see the term invariant character or code point. This simply means that the character has the same numeric value when encoded as when not. (Note that this is a very different concept from The 13 variant characters mentioned above.) For example, the ordinal value of 'A' is 193 in most EBCDIC code pages, and also is 193 when encoded in UTF-EBCDIC. All variant code points occupy at least two bytes when encoded. In UTF-8, the code points corresponding to the lowest 128 ordinal numbers (0 - 127: the ASCII characters) are invariant. In UTF-EBCDIC, there are 160 invariant characters. (If you care, the EBCDIC invariants are those characters which have ASCII equivalents, plus those that correspond to the C1 controls (80..9f on ASCII platforms).)

A string encoded in UTF-EBCDIC may be longer (but never shorter) than one encoded in UTF-8.
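
The notion of an invariant code point can be tested directly: encode each of the low 256 characters and see which ones come out as a single byte with an unchanged value. The sketch below uses the built-in utf8::encode function; on an ASCII platform it should count 128 invariants (0 to 127), while the text above gives 160 for UTF-EBCDIC.

```perl
use strict;
use warnings;

my @invariant;
for my $n ( 0 .. 255 ) {
    my $encoded = chr $n;
    utf8::encode($encoded);    # now the UTF-8 (or UTF-EBCDIC) byte sequence

    # Invariant: one byte long, and that byte has the same numeric value.
    push @invariant, $n
        if length($encoded) == 1 and ord($encoded) == $n;
}

printf "%d invariant code points, lowest %d, highest %d\n",
    scalar @invariant, $invariant[0], $invariant[-1];
```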

Using Encode

Starting from Perl 5.8 you can use the standard Encode module to translate from EBCDIC to Latin-1 code points. Encode knows about more EBCDIC character sets than Perl can currently be compiled to run on.

    use Encode 'from_to';

    # ord('^') differs in each of the three supported EBCDIC code pages
    my %ebcdic = ( 176 => 'cp37', 95 => 'cp1047', 106 => 'posix-bc' );

    # $a is in EBCDIC code points
    from_to($a, $ebcdic{ord '^'}, 'latin1');
    # $a is ISO 8859-1 code points

and from Latin-1 code points to EBCDIC code points:

    use Encode 'from_to';

    my %ebcdic = ( 176 => 'cp37', 95 => 'cp1047', 106 => 'posix-bc' );

    # $a is ISO 8859-1 code points
    from_to($a, 'latin1', $ebcdic{ord '^'});
    # $a is in EBCDIC code points
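
The two snippets above only make sense when run on an EBCDIC platform, because they look up the code page from the native value of '^'. On any platform you can still see what the three encodings produce by naming them explicitly. The following is a sketch that assumes Encode's cp37, cp1047 and posix-bc encodings are installed (they ship with Encode); the sample text and the %bytes_for hash are invented for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text = "Hello [World]";    # includes the variant characters '[' and ']'
my %bytes_for;

for my $charset (qw(cp37 cp1047 posix-bc)) {
    my $bytes = encode( $charset, $text );
    $bytes_for{$charset} = [ unpack 'C*', $bytes ];

    printf "%-8s %s\n", $charset, join ' ', @{ $bytes_for{$charset} };

    # Decoding with the same encoding should give the original text back.
    die "round trip failed for $charset"
        unless decode( $charset, $bytes ) eq $text;
}
```

The letters should encode identically in all three code pages, while the bytes for '[' and ']' should differ, in line with the table in SINGLE OCTET TABLES.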

For doing I/O it is suggested that you use the autotranslating features of PerlIO; see perluniintro.

Since version 5.8 Perl uses the PerlIO I/O library. This enables you to use a different encoding for each I/O channel. For example you may use

    use Encode;
    open($f, ">:encoding(ascii)", "test.ascii");
    print $f "Hello World!\n";
    open($f, ">:encoding(cp37)", "test.ebcdic");
    print $f "Hello World!\n";
    open($f, ">:encoding(latin1)", "test.latin1");
    print $f "Hello World!\n";
    open($f, ">:encoding(utf8)", "test.utf8");
    print $f "Hello World!\n";

to get four files containing "Hello World!\n" in ASCII, CP 0037 EBCDIC, ISO 8859-1 (Latin-1) (in this example identical to ASCII since only ASCII characters were printed), and UTF-EBCDIC (in this example identical to normal EBCDIC since only characters that don't differ between EBCDIC and UTF-EBCDIC were printed). See the documentation of Encode::PerlIO for details.

As the PerlIO layer uses raw IO (bytes) internally, all this totally ignores things like the type of your filesystem (ASCII or EBCDIC).
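
To confirm what an encoding layer actually stores, write through the layer and then read the file back as raw bytes. This is a minimal sketch that uses a temporary file from File::Temp instead of the file names in the example above, and adds the error checks the example omits.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ( $tmp, $path ) = tempfile( UNLINK => 1 );
close $tmp;

# Write characters; the layer turns them into CP 0037 bytes.
open( my $out, '>:encoding(cp37)', $path ) or die "Cannot write $path: $!";
print {$out} "Hello World!\n";
close $out or die "Cannot close $path: $!";

# Read the same file back without any translation to see the stored bytes.
open( my $raw, '<:raw', $path ) or die "Cannot read $path: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

print join( ' ', unpack 'C*', $bytes ), "\n";
```

The numbers printed can be compared against the CCSID 0037 column of the table in SINGLE OCTET TABLES.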

SINGLE OCTET TABLES

The following tables list the ASCII and Latin 1 ordered sets including the subsets (ranges in hexadecimal): C0 controls (00..1f), ASCII graphics (20..7e), delete (7f), C1 controls (80..9f), and Latin-1 (a.k.a. ISO 8859-1) (a0..ff). In the table, the names of the Latin 1 extensions to ASCII have been labelled with character names roughly corresponding to The Unicode Standard, Version 6.1, albeit with substitutions such as s/LATIN// and s/VULGAR// in all cases, s/CAPITAL LETTER// in some cases, and s/SMALL LETTER ([A-Z])/\l$1/ in some other cases. Controls are listed using their Unicode 6.1 abbreviations. The differences between the 0037 and 1047 sets are flagged with **. The differences between the 1047 and POSIX-BC sets are flagged with ##. All ord() numbers listed are decimal. If you would rather see this table listing octal values, then run the table (that is, the pod source text of this document, since this recipe may not work with a pod2_other_format translation) through:

  • recipe 0

        perl -ne 'if(/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
             -e '{printf("%s%-5.03o%-5.03o%-5.03o%.03o\n",$1,$2,$3,$4,$5)}' \
             perlebcdic.pod

If you want to retain the UTF-x code points then in script form you might want to write:

  • recipe 1

        open(FH,"<perlebcdic.pod") or die "Could not open perlebcdic.pod: $!";
        while (<FH>) {
            if (/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)\s+(\d+)\.?(\d*)/)
            {
                if ($7 ne '' && $9 ne '') {
                    printf(
                        "%s%-5.03o%-5.03o%-5.03o%-5.03o%-3o.%-5o%-3o.%.03o\n",
                        $1,$2,$3,$4,$5,$6,$7,$8,$9);
                }
                elsif ($7 ne '') {
                    printf("%s%-5.03o%-5.03o%-5.03o%-5.03o%-3o.%-5o%.03o\n",
                           $1,$2,$3,$4,$5,$6,$7,$8);
                }
                else {
                    printf("%s%-5.03o%-5.03o%-5.03o%-5.03o%-5.03o%.03o\n",
                           $1,$2,$3,$4,$5,$6,$8);
                }
            }
        }

If you would rather see this table listing hexadecimal values then run the table through:

  • recipe 2

        perl -ne 'if(/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
             -e '{printf("%s%-5.02X%-5.02X%-5.02X%.02X\n",$1,$2,$3,$4,$5)}' \
             perlebcdic.pod

Or, in order to retain the UTF-x code points in hexadecimal:

  • recipe 3

        open(FH,"<perlebcdic.pod") or die "Could not open perlebcdic.pod: $!";
        while (<FH>) {
            if (/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)\s+(\d+)\.?(\d*)/)
            {
                if ($7 ne '' && $9 ne '') {
                    printf(
                        "%s%-5.02X%-5.02X%-5.02X%-5.02X%-2X.%-6.02X%02X.%02X\n",
                        $1,$2,$3,$4,$5,$6,$7,$8,$9);
                }
                elsif ($7 ne '') {
                    printf("%s%-5.02X%-5.02X%-5.02X%-5.02X%-2X.%-6.02X%02X\n",
                           $1,$2,$3,$4,$5,$6,$7,$8);
                }
                else {
                    printf("%s%-5.02X%-5.02X%-5.02X%-5.02X%-5.02X%02X\n",
                           $1,$2,$3,$4,$5,$6,$8);
                }
            }
        }


                            ISO
                            8859-1
                            CCSID    CCSID    CCSID
chr                         0819     0037     1047     POSIX-BC UTF-8    UTF-EBCDIC
-----------------------------------------------------------------------------------------
<NUL>                       0        0        0        0        0        0
<SOH>                       1        1        1        1        1        1
<STX>                       2        2        2        2        2        2
<ETX>                       3        3        3        3        3        3
<EOT>                       4        55       55       55       4        55
<ENQ>                       5        45       45       45       5        45
<ACK>                       6        46       46       46       6        46
<BEL>                       7        47       47       47       7        47
<BS>                        8        22       22       22       8        22
<HT>                        9        5        5        5        9        5
<LF>                        10       37       21       21       10       21          **
<VT>                        11       11       11       11       11       11
<FF>                        12       12       12       12       12       12
<CR>                        13       13       13       13       13       13
<SO>                        14       14       14       14       14       14
<SI>                        15       15       15       15       15       15
<DLE>                       16       16       16       16       16       16
<DC1>                       17       17       17       17       17       17
<DC2>                       18       18       18       18       18       18
<DC3>                       19       19       19       19       19       19
<DC4>                       20       60       60       60       20       60
<NAK>                       21       61       61       61       21       61
<SYN>                       22       50       50       50       22       50
<ETB>                       23       38       38       38       23       38
<CAN>                       24       24       24       24       24       24
<EOM>                       25       25       25       25       25       25
<SUB>                       26       63       63       63       26       63
<ESC>                       27       39       39       39       27       39
<FS>                        28       28       28       28       28       28
<GS>                        29       29       29       29       29       29
<RS>                        30       30       30       30       30       30
<US>                        31       31       31       31       31       31
<SPACE>                     32       64       64       64       32       64
!                           33       90       90       90       33       90
"                           34       127      127      127      34       127
#                           35       123      123      123      35       123
$                           36       91       91       91       36       91
%                           37       108      108      108      37       108
&                           38       80       80       80       38       80
'                           39       125      125      125      39       125
(                           40       77       77       77       40       77
)                           41       93       93       93       41       93
*                           42       92       92       92       42       92
+                           43       78       78       78       43       78
,                           44       107      107      107      44       107
-                           45       96       96       96       45       96
.                           46       75       75       75       46       75
/                           47       97       97       97       47       97
0                           48       240      240      240      48       240
1                           49       241      241      241      49       241
2                           50       242      242      242      50       242
3                           51       243      243      243      51       243
4                           52       244      244      244      52       244
5                           53       245      245      245      53       245
6                           54       246      246      246      54       246
7                           55       247      247      247      55       247
8                           56       248      248      248      56       248
9                           57       249      249      249      57       249
:                           58       122      122      122      58       122
;                           59       94       94       94       59       94
<                           60       76       76       76       60       76
=                           61       126      126      126      61       126
>                           62       110      110      110      62       110
?                           63       111      111      111      63       111
@                           64       124      124      124      64       124
A                           65       193      193      193      65       193
B                           66       194      194      194      66       194
C                           67       195      195      195      67       195
D                           68       196      196      196      68       196
E                           69       197      197      197      69       197
F                           70       198      198      198      70       198
G                           71       199      199      199      71       199
H                           72       200      200      200      72       200
I                           73       201      201      201      73       201
J                           74       209      209      209      74       209
K                           75       210      210      210      75       210
L                           76       211      211      211      76       211
M                           77       212      212      212      77       212
N                           78       213      213      213      78       213
O                           79       214      214      214      79       214
P                           80       215      215      215      80       215
Q                           81       216      216      216      81       216
R                           82       217      217      217      82       217
S                           83       226      226      226      83       226
T                           84       227      227      227      84       227
U                           85       228      228      228      85       228
V                           86       229      229      229      86       229
W                           87       230      230      230      87       230
X                           88       231      231      231      88       231
Y                           89       232      232      232      89       232
Z                           90       233      233      233      90       233
[                           91       186      173      187      91       173         ** ##
\                           92       224      224      188      92       224         ##
]                           93       187      189      189      93       189         **
^                           94       176      95       106      94       95          ** ##
_                           95       109      109      109      95       109
`                           96       121      121      74       96       121         ##
a                           97       129      129      129      97       129
b                           98       130      130      130      98       130
c                           99       131      131      131      99       131
d                           100      132      132      132      100      132
e                           101      133      133      133      101      133
f                           102      134      134      134      102      134
g                           103      135      135      135      103      135
h                           104      136      136      136      104      136
i                           105      137      137      137      105      137
j                           106      145      145      145      106      145
k                           107      146      146      146      107      146
l                           108      147      147      147      108      147
m                           109      148      148      148      109      148
n                           110      149      149      149      110      149
o                           111      150      150      150      111      150
p                           112      151      151      151      112      151
q                           113      152      152      152      113      152
r                           114      153      153      153      114      153
s                           115      162      162      162      115      162
t                           116      163      163      163      116      163
u                           117      164      164      164      117      164
v                           118      165      165      165      118      165
w                           119      166      166      166      119      166
x                           120      167      167      167      120      167
y                           121      168      168      168      121      168
z                           122      169      169      169      122      169
{                           123      192      192      251      123      192         ##
|                           124      79       79       79       124      79
}                           125      208      208      253      125      208         ##
~                           126      161      161      255      126      161         ##
<DEL>                       127      7        7        7        127      7
<PAD>                       128      32       32       32       194.128  32
<HOP>                       129      33       33       33       194.129  33
<BPH>                       130      34       34       34       194.130  34
<NBH>                       131      35       35       35       194.131  35
<IND>                       132      36       36       36       194.132  36
<NEL>                       133      21       37       37       194.133  37          **
<SSA>                       134      6        6        6        194.134  6
<ESA>                       135      23       23       23       194.135  23
<HTS>                       136      40       40       40       194.136  40
<HTJ>                       137      41       41       41       194.137  41
<VTS>                       138      42       42       42       194.138  42
<PLD>                       139      43       43       43       194.139  43
<PLU>                       140      44       44       44       194.140  44
<RI>                        141      9        9        9        194.141  9
<SS2>                       142      10       10       10       194.142  10
<SS3>                       143      27       27       27       194.143  27
<DCS>                       144      48       48       48       194.144  48
<PU1>                       145      49       49       49       194.145  49
<PU2>                       146      26       26       26       194.146  26
<STS>                       147      51       51       51       194.147  51
<CCH>                       148      52       52       52       194.148  52
<MW>                        149      53       53       53       194.149  53
<SPA>                       150      54       54       54       194.150  54
<EPA>                       151      8        8        8        194.151  8
<SOS>                       152      56       56       56       194.152  56
<SGC>                       153      57       57       57       194.153  57
<SCI>                       154      58       58       58       194.154  58
<CSI>                       155      59       59       59       194.155  59
<ST>                        156      4        4        4        194.156  4
<OSC>                       157      20       20       20       194.157  20
<PM>                        158      62       62       62       194.158  62
<APC>                       159      255      255      95       194.159  255         ##
<NON-BREAKING SPACE>        160      65       65       65       194.160  128.65
<INVERTED "!" >             161      170      170      170      194.161  128.66
<CENT SIGN>                 162      74       74       176      194.162  128.67      ##
<POUND SIGN>                163      177      177      177      194.163  128.68
<CURRENCY SIGN>             164      159      159      159      194.164  128.69
<YEN SIGN>                  165      178      178      178      194.165  128.70
<BROKEN BAR>                166      106      106      208      194.166  128.71      ##
<SECTION SIGN>              167      181      181      181      194.167  128.72
<DIAERESIS>                 168      189      187      121      194.168  128.73      ** ##
<COPYRIGHT SIGN>            169      180      180      180      194.169  128.74
<FEMININE ORDINAL>          170      154      154      154      194.170  128.81
<LEFT POINTING GUILLEMET>   171      138      138      138      194.171  128.82
<NOT SIGN>                  172      95       176      186      194.172  128.83      ** ##
<SOFT HYPHEN>               173      202      202      202      194.173  128.84
<REGISTERED TRADE MARK>     174      175      175      175      194.174  128.85
<MACRON>                    175      188      188      161      194.175  128.86      ##
<DEGREE SIGN>               176      144      144      144      194.176  128.87
<PLUS-OR-MINUS SIGN>        177      143      143      143      194.177  128.88
<SUPERSCRIPT TWO>           178      234      234      234      194.178  128.89
<SUPERSCRIPT THREE>         179      250      250      250      194.179  128.98
<ACUTE ACCENT>              180      190      190      190      194.180  128.99
<MICRO SIGN>                181      160      160      160      194.181  128.100
<PARAGRAPH SIGN>            182      182      182      182      194.182  128.101
<MIDDLE DOT>                183      179      179      179      194.183  128.102
<CEDILLA>                   184      157      157      157      194.184  128.103
<SUPERSCRIPT ONE>           185      218      218      218      194.185  128.104
<MASC. ORDINAL INDICATOR>   186      155      155      155      194.186  128.105
<RIGHT POINTING GUILLEMET>  187      139      139      139      194.187  128.106
<FRACTION ONE QUARTER>      188      183      183      183      194.188  128.112
<FRACTION ONE HALF>         189      184      184      184      194.189  128.113
<FRACTION THREE QUARTERS>   190      185      185      185      194.190  128.114
<INVERTED QUESTION MARK>    191      171      171      171      194.191  128.115
<A WITH GRAVE>              192      100      100      100      195.128  138.65
<A WITH ACUTE>              193      101      101      101      195.129  138.66
<A WITH CIRCUMFLEX>         194      98       98       98       195.130  138.67
<A WITH TILDE>              195      102      102      102      195.131  138.68
<A WITH DIAERESIS>          196      99       99       99       195.132  138.69
<A WITH RING ABOVE>         197      103      103      103      195.133  138.70
<CAPITAL LIGATURE AE>       198      158      158      158      195.134  138.71
<C WITH CEDILLA>            199      104      104      104      195.135  138.72
<E WITH GRAVE>              200      116      116      116      195.136  138.73
<E WITH ACUTE>              201      113      113      113      195.137  138.74
<E WITH CIRCUMFLEX>         202      114      114      114      195.138  138.81
<E WITH DIAERESIS>          203      115      115      115      195.139  138.82
<I WITH GRAVE>              204      120      120      120      195.140  138.83
<I WITH ACUTE>              205      117      117      117      195.141  138.84
<I WITH CIRCUMFLEX>         206      118      118      118      195.142  138.85
<I WITH DIAERESIS>          207      119      119      119      195.143  138.86
<CAPITAL LETTER ETH>        208      172      172      172      195.144  138.87
<N WITH TILDE>              209      105      105      105      195.145  138.88
<O WITH GRAVE>              210      237      237      237      195.146  138.89
<O WITH ACUTE>              211      238      238      238      195.147  138.98
<O WITH CIRCUMFLEX>         212      235      235      235      195.148  138.99
<O WITH TILDE>              213      239      239      239      195.149  138.100
<O WITH DIAERESIS>          214      236      236      236      195.150  138.101
<MULTIPLICATION SIGN>       215      191      191      191      195.151  138.102
<O WITH STROKE>             216      128      128      128      195.152  138.103
<U WITH GRAVE>              217      253      253      224      195.153  138.104     ##
<U WITH ACUTE>              218      254      254      254      195.154  138.105
<U WITH CIRCUMFLEX>         219      251      251      221      195.155  138.106     ##
<U WITH DIAERESIS>          220      252      252      252      195.156  138.112
<Y WITH ACUTE>              221      173      186      173      195.157  138.113     ** ##
<CAPITAL LETTER THORN>      222      174      174      174      195.158  138.114
<SMALL LETTER SHARP S>      223      89       89       89       195.159  138.115
<a WITH GRAVE>              224      68       68       68       195.160  139.65
<a WITH ACUTE>              225      69       69       69       195.161  139.66
<a WITH CIRCUMFLEX>         226      66       66       66       195.162  139.67
<a WITH TILDE>              227      70       70       70       195.163  139.68
<a WITH DIAERESIS>          228      67       67       67       195.164  139.69
<a WITH RING ABOVE>         229      71       71       71       195.165  139.70
<SMALL LIGATURE ae>         230      156      156      156      195.166  139.71
<c WITH CEDILLA>            231      72       72       72       195.167  139.72
<e WITH GRAVE>              232      84       84       84       195.168  139.73
<e WITH ACUTE>              233      81       81       81       195.169  139.74
<e WITH CIRCUMFLEX>         234      82       82       82       195.170  139.81
<e WITH DIAERESIS>          235      83       83       83       195.171  139.82
<i WITH GRAVE>              236      88       88       88       195.172  139.83
<i WITH ACUTE>              237      85       85       85       195.173  139.84
<i WITH CIRCUMFLEX>         238      86       86       86       195.174  139.85
<i WITH DIAERESIS>          239      87       87       87       195.175  139.86
<SMALL LETTER eth>          240      140      140      140      195.176  139.87
<n WITH TILDE>              241      73       73       73       195.177  139.88
<o WITH GRAVE>              242      205      205      205      195.178  139.89
<o WITH ACUTE>              243      206      206      206      195.179  139.98
<o WITH CIRCUMFLEX>         244      203      203      203      195.180  139.99
<o WITH TILDE>              245      207      207      207      195.181  139.100
<o WITH DIAERESIS>          246      204      204      204      195.182  139.101
<DIVISION SIGN>             247      225      225      225      195.183  139.102
<o WITH STROKE>             248      112      112      112      195.184  139.103
<u WITH GRAVE>              249      221      221      192      195.185  139.104     ##
<u WITH ACUTE>              250      222      222      222      195.186  139.105
<u WITH CIRCUMFLEX>         251      219      219      219      195.187  139.106
<u WITH DIAERESIS>          252      220      220      220      195.188  139.112
<y WITH ACUTE>              253      141      141      141      195.189  139.113
<SMALL LETTER thorn>        254      142      142      142      195.190  139.114
<y WITH DIAERESIS>          255      223      223      223      195.191  139.115

If you would rather see the above table in CCSID 0037 order rather than ASCII + Latin-1 order then run the table through:

  • recipe 4

        perl \
            -ne 'if(/.{29}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}/)'\
             -e '{push(@l,$_)}' \
             -e 'END{print map{$_->[0]}' \
             -e '          sort{$a->[1] <=> $b->[1]}' \
             -e '          map{[$_,substr($_,34,3)]}@l;}' perlebcdic.pod

If you would rather see it in CCSID 1047 order then change the number 34 in the last line to 39, like this:

  • recipe 5

        perl \
            -ne 'if(/.{29}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}/)'\
             -e '{push(@l,$_)}' \
             -e 'END{print map{$_->[0]}' \
             -e '          sort{$a->[1] <=> $b->[1]}' \
             -e '          map{[$_,substr($_,39,3)]}@l;}' perlebcdic.pod

If you would rather see it in POSIX-BC order then change the number 39 in the last line to 44, like this:

  • recipe 6

        perl \
            -ne 'if(/.{29}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}/)'\
             -e '{push(@l,$_)}' \
             -e 'END{print map{$_->[0]}' \
             -e '          sort{$a->[1] <=> $b->[1]}' \
             -e '          map{[$_,substr($_,44,3)]}@l;}' perlebcdic.pod

IDENTIFYING CHARACTER CODE SETS

To determine from within Perl which character set you are running under, you can use the return value of ord() or chr() to test one or more character values. For example:

    $is_ascii  = "A" eq chr(65);
    $is_ebcdic = "A" eq chr(193);

Also, "\t" is a HORIZONTAL TABULATION character so that:

    $is_ascii  = ord("\t") == 9;
    $is_ebcdic = ord("\t") == 5;

To distinguish EBCDIC code pages try looking at one or more of the characters that differ between them. For example:

    $is_ebcdic_37   = "\n" eq chr(37);
    $is_ebcdic_1047 = "\n" eq chr(21);

Or better still choose a character that is uniquely encoded in any of the code sets, e.g.:

    $is_ascii           = ord('[') == 91;
    $is_ebcdic_37       = ord('[') == 186;
    $is_ebcdic_1047     = ord('[') == 173;
    $is_ebcdic_POSIX_BC = ord('[') == 187;

However, it would be unwise to write tests such as:

    $is_ascii = "\r" ne chr(13);  #  WRONG
    $is_ascii = "\n" ne chr(10);  #  ILL ADVISED

Obviously the first of these will fail to distinguish most ASCII platforms from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC platform since "\r" eq chr(13) under all of those coded character sets. But note too that because "\n" is chr(13) and "\r" is chr(10) on the classic Mac OS (which is an ASCII platform), the second $is_ascii test will lead to trouble there.

To determine whether or not perl was built under an EBCDIC code page you can use the Config module like so:

    use Config;
    $is_ebcdic = $Config{'ebcdic'} eq 'define';
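
The individual tests above can be folded into one lookup. This is a sketch rather than a canonical recipe: the %charset_for hash and its labels are invented for illustration, and the code point values for '[' are taken from the examples in this section.

```perl
use strict;
use warnings;
use Config;

# Map the native code point of '[' to a character set label.
my %charset_for = (
    91  => 'ASCII (or a superset such as Latin-1)',
    186 => 'EBCDIC CCSID 0037',
    173 => 'EBCDIC CCSID 1047',
    187 => 'EBCDIC POSIX-BC',
);

my $charset   = $charset_for{ ord '[' } // 'unknown';

# $Config{ebcdic} may be undefined on ASCII builds, hence the default.
my $is_ebcdic = ( $Config{ebcdic} // '' ) eq 'define';

print "native character set: $charset\n";
print "Config says EBCDIC:   ", ( $is_ebcdic ? 'yes' : 'no' ), "\n";
```

Testing a single variant character keeps the check cheap, and the 'unknown' fallback makes an unsupported code page visible instead of silently misclassifying it.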

CONVERSIONS

utf8::unicode_to_native() and utf8::native_to_unicode()

These functions take an input numeric code point in one encoding and return what its equivalent value is in the other.
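
A short sketch of both functions follows. On an ASCII platform the native and Unicode code points are the same, so every line should show three identical numbers; on an EBCDIC platform the middle column would differ from the outer two. The choice of sample characters is arbitrary.

```perl
use strict;
use warnings;

for my $char ( 'A', '[', '0' ) {
    my $native  = ord $char;                          # platform code point
    my $unicode = utf8::native_to_unicode($native);   # e.g. 65 for "A"
    my $back    = utf8::unicode_to_native($unicode);  # should equal $native

    printf "%s: native %3d -> Unicode %3d -> native %3d\n",
        $char, $native, $unicode, $back;
}
```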

tr///

In order to convert a string of characters from one character set to another a simple list of numbers, such as in the right columns in the above table, along with perl's tr/// operator is all that is needed. The data in the table are in ASCII/Latin1 order, hence the EBCDIC columns provide easy-to-use ASCII/Latin1 to EBCDIC operations that are also easily reversed.

For example, to convert ASCII/Latin1 to code page 037 take the output of the second numbers column from the output of recipe 2 (modified to add '\' characters), and use it in tr/// like so:

  1. $cp_037 =
  2. '\x00\x01\x02\x03\x37\x2D\x2E\x2F\x16\x05\x25\x0B\x0C\x0D\x0E\x0F' .
  3. '\x10\x11\x12\x13\x3C\x3D\x32\x26\x18\x19\x3F\x27\x1C\x1D\x1E\x1F' .
  4. '\x40\x5A\x7F\x7B\x5B\x6C\x50\x7D\x4D\x5D\x5C\x4E\x6B\x60\x4B\x61' .
  5. '\xF0\xF1\xF2\xF3\xF4\xF5\xF6\xF7\xF8\xF9\x7A\x5E\x4C\x7E\x6E\x6F' .
  6. '\x7C\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xD1\xD2\xD3\xD4\xD5\xD6' .
  7. '\xD7\xD8\xD9\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xBA\xE0\xBB\xB0\x6D' .
  8. '\x79\x81\x82\x83\x84\x85\x86\x87\x88\x89\x91\x92\x93\x94\x95\x96' .
  9. '\x97\x98\x99\xA2\xA3\xA4\xA5\xA6\xA7\xA8\xA9\xC0\x4F\xD0\xA1\x07' .
  10. '\x20\x21\x22\x23\x24\x15\x06\x17\x28\x29\x2A\x2B\x2C\x09\x0A\x1B' .
  11. '\x30\x31\x1A\x33\x34\x35\x36\x08\x38\x39\x3A\x3B\x04\x14\x3E\xFF' .
  12. '\x41\xAA\x4A\xB1\x9F\xB2\x6A\xB5\xBD\xB4\x9A\x8A\x5F\xCA\xAF\xBC' .
  13. '\x90\x8F\xEA\xFA\xBE\xA0\xB6\xB3\x9D\xDA\x9B\x8B\xB7\xB8\xB9\xAB' .
  14. '\x64\x65\x62\x66\x63\x67\x9E\x68\x74\x71\x72\x73\x78\x75\x76\x77' .
  15. '\xAC\x69\xED\xEE\xEB\xEF\xEC\xBF\x80\xFD\xFE\xFB\xFC\xAD\xAE\x59' .
  16. '\x44\x45\x42\x46\x43\x47\x9C\x48\x54\x51\x52\x53\x58\x55\x56\x57' .
  17. '\x8C\x49\xCD\xCE\xCB\xCF\xCC\xE1\x70\xDD\xDE\xDB\xDC\x8D\x8E\xDF';
  18. my $ebcdic_string = $ascii_string;
  19. eval '$ebcdic_string =~ tr/\000-\377/' . $cp_037 . '/';

To convert from EBCDIC 037 to ASCII just reverse the order of the tr/// arguments like so:

  1. my $ascii_string = $ebcdic_string;
  2. eval '$ascii_string =~ tr/' . $cp_037 . '/\000-\377/';

Similarly one could take the output of the third numbers column from recipe 2 to obtain a $cp_1047 table. The fourth numbers column of the output from recipe 2 could provide a $cp_posix_bc table suitable for transcoding as well.

If you wanted to see the inverse tables, you would first have to sort on the desired numbers column as in recipes 4, 5 or 6, then take the output of the first numbers column.
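An inverse table can also be computed rather than read off sorted recipe output. The sketch below assumes a 256-entry forward map held in an array; it uses the code page 1047 list that appears in the URL decoding example later in this document. The helper name map_octets is hypothetical.

```perl
use strict;
use warnings;

# ASCII/Latin-1 to code page 1047 map (same numbers as the @a2e_1047
# array in the URL decoding example of this document).
my @a2e_1047 = (
      0,  1,  2,  3, 55, 45, 46, 47, 22,  5, 21, 11, 12, 13, 14, 15,
     16, 17, 18, 19, 60, 61, 50, 38, 24, 25, 63, 39, 28, 29, 30, 31,
     64, 90,127,123, 91,108, 80,125, 77, 93, 92, 78,107, 96, 75, 97,
    240,241,242,243,244,245,246,247,248,249,122, 94, 76,126,110,111,
    124,193,194,195,196,197,198,199,200,201,209,210,211,212,213,214,
    215,216,217,226,227,228,229,230,231,232,233,173,224,189, 95,109,
    121,129,130,131,132,133,134,135,136,137,145,146,147,148,149,150,
    151,152,153,162,163,164,165,166,167,168,169,192, 79,208,161,  7,
     32, 33, 34, 35, 36, 37,  6, 23, 40, 41, 42, 43, 44,  9, 10, 27,
     48, 49, 26, 51, 52, 53, 54,  8, 56, 57, 58, 59,  4, 20, 62,255,
     65,170, 74,177,159,178,106,181,187,180,154,138,176,202,175,188,
    144,143,234,250,190,160,182,179,157,218,155,139,183,184,185,171,
    100,101, 98,102, 99,103,158,104,116,113,114,115,120,117,118,119,
    172,105,237,238,235,239,236,191,128,253,254,251,252,186,174, 89,
     68, 69, 66, 70, 67, 71,156, 72, 84, 81, 82, 83, 88, 85, 86, 87,
    140, 73,205,206,203,207,204,225,112,221,222,219,220,141,142,223
);

# Invert the map: if Latin-1 code $i maps to EBCDIC code $a2e_1047[$i],
# then that EBCDIC code maps back to $i.
my @e2a_1047;
$e2a_1047[ $a2e_1047[$_] ] = $_ for 0 .. 255;

# Hypothetical helper: push every octet of a string through a 256-entry map.
sub map_octets {
    my ($table, $octets) = @_;
    return join '', map { chr( $table->[ ord($_) ] ) } split //, $octets;
}

# Run on an ASCII platform, this turns ASCII octets into 1047 octets and back.
my $ebcdic = map_octets(\@a2e_1047, "Perl");
my $ascii  = map_octets(\@e2a_1047, $ebcdic);
print "round trip ok\n" if $ascii eq "Perl";
```

Unlike the tr/// form, this needs no string eval, at the cost of a per-character loop.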

iconv

XPG operability often implies the presence of an iconv utility available from the shell or from the C library. Consult your system's documentation for information on iconv.

On OS/390 or z/OS see the iconv(1) manpage. One way to invoke the iconv shell utility from within perl would be to:

  # OS/390 or z/OS example
  $ascii_data = `echo '$ebcdic_data'| iconv -f IBM-1047 -t ISO8859-1`;

or the inverse map:

  # OS/390 or z/OS example
  $ebcdic_data = `echo '$ascii_data'| iconv -f ISO8859-1 -t IBM-1047`;

For other perl-based conversion options see the Convert::* modules on CPAN.
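One option that needs no external utility is the Encode module bundled with Perl 5.8 and later, which ships EBCDIC encodings under names such as cp37, cp1047 and posix-bc. The following is a sketch that assumes it is run on an ASCII platform, where encode() yields the EBCDIC octets and decode() turns them back into text.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text   = "Hello";
my $ebcdic = encode('cp1047', $text);     # octets in code page 1047
my $back   = decode('cp1047', $ebcdic);   # characters again

printf "%s\n", join ' ', map { sprintf '%02X', ord($_) } split //, $ebcdic;
print "round trip ok\n" if $back eq $text;
```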

C RTL

The OS/390 and z/OS C run-time libraries provide _atoe() and _etoa() functions.

OPERATOR DIFFERENCES

The .. range operator treats certain character ranges with care on EBCDIC platforms. For example the following array will have twenty-six elements on either an EBCDIC platform or an ASCII platform:

  1. @alphabet = ('A'..'Z'); # $#alphabet == 25

The bitwise operators such as & ^ | may return different results when operating on string or character data in a perl program running on an EBCDIC platform than when run on an ASCII platform. Here is an example adapted from the one in perlop:

  1. # EBCDIC-based examples
  2. print "j p \n" ^ " a h"; # prints "JAPH\n"
  3. print "JA" | " ph\n"; # prints "japh\n"
  4. print "JAPH\nJunk" & "\277\277\277\277\277"; # prints "japh\n";
  5. print 'p N$' ^ " E<H\n"; # prints "Perl\n";

An interesting property of the 32 C0 control characters in the ASCII table is that they can "literally" be constructed as control characters in perl, e.g. (chr(0) eq "\c@"), (chr(1) eq "\cA"), and so on. Perl on EBCDIC platforms has been ported to take \c@ to chr(0) and \cA to chr(1), etc. as well, but the thirty-three characters that result depend on which code page you are using. The table below uses the standard acronyms for the controls. The POSIX-BC and 1047 sets are identical throughout this range and differ from the 0037 set at only one spot (21 decimal). Note that the LINE FEED character may be generated by \cJ on ASCII platforms but by \cU on 1047 or POSIX-BC platforms and cannot be generated as a "\c.letter." control character on 0037 platforms. Note also that \c\ cannot be the final element in a string or regex, as it will absorb the terminator. But \c\X is a FILE SEPARATOR concatenated with X for all X.

  chr    ord   8859-1   0037     1047 && POSIX-BC
  -------------------------------------------------
  \c?    127   <DEL>    "        "
  \c@      0   <NUL>    <NUL>    <NUL>
  \cA      1   <SOH>    <SOH>    <SOH>
  \cB      2   <STX>    <STX>    <STX>
  \cC      3   <ETX>    <ETX>    <ETX>
  \cD      4   <EOT>    <ST>     <ST>
  \cE      5   <ENQ>    <HT>     <HT>
  \cF      6   <ACK>    <SSA>    <SSA>
  \cG      7   <BEL>    <DEL>    <DEL>
  \cH      8   <BS>     <EPA>    <EPA>
  \cI      9   <HT>     <RI>     <RI>
  \cJ     10   <LF>     <SS2>    <SS2>
  \cK     11   <VT>     <VT>     <VT>
  \cL     12   <FF>     <FF>     <FF>
  \cM     13   <CR>     <CR>     <CR>
  \cN     14   <SO>     <SO>     <SO>
  \cO     15   <SI>     <SI>     <SI>
  \cP     16   <DLE>    <DLE>    <DLE>
  \cQ     17   <DC1>    <DC1>    <DC1>
  \cR     18   <DC2>    <DC2>    <DC2>
  \cS     19   <DC3>    <DC3>    <DC3>
  \cT     20   <DC4>    <OSC>    <OSC>
  \cU     21   <NAK>    <NEL>    <LF>     **
  \cV     22   <SYN>    <BS>     <BS>
  \cW     23   <ETB>    <ESA>    <ESA>
  \cX     24   <CAN>    <CAN>    <CAN>
  \cY     25   <EOM>    <EOM>    <EOM>
  \cZ     26   <SUB>    <PU2>    <PU2>
  \c[     27   <ESC>    <SS3>    <SS3>
  \c\X    28   <FS>X    <FS>X    <FS>X
  \c]     29   <GS>     <GS>     <GS>
  \c^     30   <RS>     <RS>     <RS>
  \c_     31   <US>     <US>     <US>
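The remark about LINE FEED can be checked at run time. This sketch tries each "\c" escape from A to Z and reports which one, if any, produces the platform's "\n". Going by the table above it should find J on ASCII, U on 1047 or POSIX-BC, and nothing on 0037; only the ASCII case has been considered closely here.

```perl
use strict;
use warnings;

# Build the source text "\cA", "\cB", ... and evaluate each one,
# keeping the first letter whose control character equals "\n".
my ($letter) = grep { eval(qq{"\\c$_"}) eq "\n" } 'A' .. 'Z';

if (defined $letter) {
    print "\\n can be written as \\c$letter on this platform\n";
}
else {
    print "\\n has no \\c<letter> form on this platform\n";
}
```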

FUNCTION DIFFERENCES

  • chr()

    chr() must be given an EBCDIC code number argument to yield a desired character return value on an EBCDIC platform. For example:

    1. $CAPITAL_LETTER_A = chr(193);
  • ord()

    ord() will return EBCDIC code number values on an EBCDIC platform. For example:

    1. $the_number_193 = ord("A");
  • pack()

    The c and C templates for pack() are dependent upon character set encoding. Examples of usage on EBCDIC include:

    1. $foo = pack("CCCC",193,194,195,196);
    2. # $foo eq "ABCD"
    3. $foo = pack("C4",193,194,195,196);
    4. # same thing
    5. $foo = pack("ccxxcc",193,194,195,196);
    6. # $foo eq "AB\0\0CD"
  • print()

    One must be careful with scalars and strings that are passed to print that contain ASCII encodings. One common place for this to occur is in the output of the MIME type header for CGI script writing. For example, many perl programming guides recommend something similar to:

    1. print "Content-type:\ttext/html\015\012\015\012";
    2. # this may be wrong on EBCDIC

    Under the IBM OS/390 USS Web Server or WebSphere on z/OS for example you should instead write that as:

    1. print "Content-type:\ttext/html\r\n\r\n"; # OK for DGW et al

    That is because the translation from EBCDIC to ASCII is done by the web server in this case (such code will not be appropriate for the Macintosh however). Consult your web server's documentation for further details.

  • printf()

    The formats that can convert characters to numbers and vice versa will be different from their ASCII counterparts when executed on an EBCDIC platform. Examples include:

    1. printf("%c%c%c",193,194,195); # prints ABC
  • sort()

    EBCDIC sort results may differ from ASCII sort results especially for mixed case strings. This is discussed in more detail below.

  • sprintf()

    See the discussion of printf() above. An example of the use of sprintf would be:

    1. $CAPITAL_LETTER_A = sprintf("%c",193);
  • unpack()

    See the discussion of pack() above.
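Code that has to run on both kinds of platform can often sidestep these differences by deriving numbers from character literals instead of hard-coding them. A minimal sketch with illustrative variable names:

```perl
use strict;
use warnings;

# Ask the platform for the code of 'A' rather than assuming 65 or 193.
my $code_of_A = ord('A');

# Feed pack() and sprintf() with codes obtained from ord().
my $abcd = pack('C4', map { ord($_) } qw(A B C D));
my $abc  = sprintf('%c%c%c', map { ord($_) } qw(A B C));

print "pack:    $abcd\n";
print "sprintf: $abc\n";
print "chr/ord round trip ok\n" if chr($code_of_A) eq 'A';
```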

REGULAR EXPRESSION DIFFERENCES

As of perl 5.005_03 the letter range regular expressions such as [A-Z] and [a-z] have been especially coded to not pick up gap characters. For example, characters such as ô o WITH CIRCUMFLEX that lie between I and J would not be matched by the regular expression range /[H-K]/ . This works in the other direction, too, if either of the range end points is explicitly numeric: [\x89-\x91] will match \x8e , even though \x89 is i and \x91 is j , and \x8e is a gap character from the alphabetic viewpoint.

If you do want to match the alphabet gap characters in a single octet regular expression try matching the hex or octal code such as /\313/ on EBCDIC or /\364/ on ASCII platforms to have your regular expression match o WITH CIRCUMFLEX .

Another construct to be wary of is the inappropriate use of hex or octal constants in regular expressions. Consider the following set of subs:

  1. sub is_c0 {
  2. my $char = substr(shift,0,1);
  3. $char =~ /[\000-\037]/;
  4. }
  5. sub is_print_ascii {
  6. my $char = substr(shift,0,1);
  7. $char =~ /[\040-\176]/;
  8. }
  9. sub is_delete {
  10. my $char = substr(shift,0,1);
  11. $char eq "\177";
  12. }
  13. sub is_c1 {
  14. my $char = substr(shift,0,1);
  15. $char =~ /[\200-\237]/;
  16. }
  17. sub is_latin_1 {
  18. my $char = substr(shift,0,1);
  19. $char =~ /[\240-\377]/;
  20. }

The above would be adequate if the concern was only with numeric code points. However, the concern may be with characters rather than code points and on an EBCDIC platform it may be desirable for constructs such as if (is_print_ascii("A")) {print "A is a printable character\n";} to print out the expected message. One way to represent the above collection of character classification subs that is capable of working across the four coded character sets discussed in this document is as follows:

  1. sub Is_c0 {
  2. my $char = substr(shift,0,1);
  3. if (ord('^')==94) { # ascii
  4. return $char =~ /[\000-\037]/;
  5. }
  6. if (ord('^')==176) { # 0037
  7. return $char =~ /[\000-\003\067\055-\057\026\005\045\013-\023\074\075\062\046\030\031\077\047\034-\037]/;
  8. }
  9. if (ord('^')==95 || ord('^')==106) { # 1047 || posix-bc
  10. return $char =~ /[\000-\003\067\055-\057\026\005\025\013-\023\074\075\062\046\030\031\077\047\034-\037]/;
  11. }
  12. }
  13. sub Is_print_ascii {
  14. my $char = substr(shift,0,1);
  15. $char =~ /[ !"\#\$%&'()*+,\-.\/0-9:;<=>?\@A-Z[\\\]^_`a-z{|}~]/;
  16. }
  17. sub Is_delete {
  18. my $char = substr(shift,0,1);
  19. if (ord('^')==94) { # ascii
  20. return $char eq "\177";
  21. }
  22. else { # ebcdic
  23. return $char eq "\007";
  24. }
  25. }
  26. sub Is_c1 {
  27. my $char = substr(shift,0,1);
  28. if (ord('^')==94) { # ascii
  29. return $char =~ /[\200-\237]/;
  30. }
  31. if (ord('^')==176) { # 0037
  32. return $char =~ /[\040-\044\025\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\040\024\076\377]/;
  33. }
  34. if (ord('^')==95) { # 1047
  35. return $char =~ /[\040-\045\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\040\024\076\377]/;
  36. }
  37. if (ord('^')==106) { # posix-bc
  38. return $char =~
  39. /[\040-\045\006\027\050-\054\011\012\033\060\061\032\063-\066\010\070-\073\040\024\076\137]/;
  40. }
  41. }
  42. sub Is_latin_1 {
  43. my $char = substr(shift,0,1);
  44. if (ord('^')==94) { # ascii
  45. return $char =~ /[\240-\377]/;
  46. }
  47. if (ord('^')==176) { # 0037
  48. return $char =~
  49. /[\101\252\112\261\237\262\152\265\275\264\232\212\137\312\257\274\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\375\376\373\374\255\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\335\336\333\334\215\216\337]/;
  50. }
  51. if (ord('^')==95) { # 1047
  52. return $char =~
  53. /[\101\252\112\261\237\262\152\265\273\264\232\212\260\312\257\274\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\375\376\373\374\272\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\335\336\333\334\215\216\337]/;
  54. }
  55. if (ord('^')==106) { # posix-bc
  56. return $char =~
  57. /[\101\252\260\261\237\262\320\265\171\264\232\212\272\312\257\241\220\217\352\372\276\240\266\263\235\332\233\213\267\270\271\253\144\145\142\146\143\147\236\150\164\161-\163\170\165-\167\254\151\355\356\353\357\354\277\200\340\376\335\374\255\256\131\104\105\102\106\103\107\234\110\124\121-\123\130\125-\127\214\111\315\316\313\317\314\341\160\300\336\333\334\215\216\337]/;
  58. }
  59. }

Note however that only the Is_print_ascii() sub is really independent of coded character set. Another way to write Is_latin_1() would be to use the characters in the range explicitly:

  sub Is_latin_1 {
      my $char = substr(shift,0,1);
      $char =~ /[ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]/;
  }

Although that form may run into trouble in network transit (due to the presence of 8 bit characters) or on non ISO-Latin character sets.
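A third possibility, sketched here under the assumption that utf8::native_to_unicode() (described under CONVERSIONS) is available, is to translate the character to its Latin-1 code point first and then compare plain numbers. The sub names ending in _by_number are made up for this example.

```perl
use strict;
use warnings;

# Latin-1 code point of the first character of a string, whatever the
# native coded character set is.
sub latin1_code_point {
    my $char = substr(shift, 0, 1);
    return utf8::native_to_unicode(ord($char));
}

sub Is_c0_by_number {
    return latin1_code_point(shift) <= 0x1F;
}
sub Is_delete_by_number {
    return latin1_code_point(shift) == 0x7F;
}
sub Is_c1_by_number {
    my $cp = latin1_code_point(shift);
    return $cp >= 0x80 && $cp <= 0x9F;
}
sub Is_latin_1_by_number {
    my $cp = latin1_code_point(shift);
    return $cp >= 0xA0 && $cp <= 0xFF;
}

print "tab is a C0 control\n" if Is_c0_by_number("\t");
print "A is not a C1 control\n" unless Is_c1_by_number("A");
```

This keeps 8 bit characters and per-code-page octal lists out of the source file.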

SOCKETS

Most socket programming assumes ASCII character encodings in network byte order. Exceptions can include CGI script writing under a host web server where the server may take care of translation for you. Most host web servers convert EBCDIC data to ISO-8859-1 or Unicode on output.

SORTING

One big difference between ASCII-based character sets and EBCDIC ones is the relative positions of upper and lower case letters, and of the letters compared to the digits. If sorted on an ASCII-based platform, the two-letter abbreviation for a physician comes before the two-letter abbreviation for drive; that is:

  1. @sorted = sort(qw(Dr. dr.)); # @sorted holds ('Dr.','dr.') on ASCII,
  2. # but ('dr.','Dr.') on EBCDIC

The property of lowercase before uppercase letters in EBCDIC is even carried to the Latin 1 EBCDIC pages such as 0037 and 1047. An example would be that Ë E WITH DIAERESIS (203) comes before ë e WITH DIAERESIS (235) on an ASCII platform, but the latter (83) comes before the former (115) on an EBCDIC platform. (Astute readers will note that the uppercase version of ß SMALL LETTER SHARP S is simply "SS" and that the upper case version of ÿ y WITH DIAERESIS is not in the 0..255 range but it is at U+0178 in Unicode, or "\x{178}" in a Unicode enabled Perl.)

The sort order will cause differences between results obtained on ASCII platforms versus EBCDIC platforms. What follows are some suggestions on how to deal with these differences.

Ignore ASCII vs. EBCDIC sort differences.

This is the least computationally expensive strategy. It may require some user education.

MONO CASE then sort data.

In order to minimize the expense of mono casing mixed-case text, try to tr/// towards the character set case most employed within the data. If the data are primarily UPPERCASE non Latin 1 then apply tr/a-z/A-Z/ then sort(). If the data are primarily lowercase non Latin 1 then apply tr/A-Z/a-z/ before sorting. (Square brackets are not needed in tr///; there they are ordinary characters, not class delimiters.) If the data are primarily UPPERCASE and include Latin-1 characters then apply:

  tr/a-z/A-Z/;
  tr/àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþ/ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ/;
  s/ß/SS/g;

then sort(). Do note however that such Latin-1 manipulation does not address the ÿ y WITH DIAERESIS character that will remain at code point 255 on ASCII platforms, but 223 on most EBCDIC platforms where it will sort to a place less than the EBCDIC numerals. With a Unicode-enabled Perl you might try:

  tr/ÿ/\x{178}/;

The strategy of mono casing data before sorting does not preserve the case of the data and may not be acceptable for that reason.

Convert, sort data, then re convert.

This is the most expensive proposition that does not employ a network connection.

Perform sorting on one type of platform only.

This strategy can employ a network connection. As such it would be computationally expensive.
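The "convert, sort data, then re convert" strategy can be sketched without an explicit @e2a table by building sort keys with utf8::native_to_unicode() (described under CONVERSIONS). The sub name latin1_sort_key is hypothetical, and the sketch only covers single-octet characters. Because the original strings ride along with their keys, nothing has to be converted back afterwards.

```perl
use strict;
use warnings;

# Build a key whose native ordinals equal the Latin-1 code points of the
# original characters, so that cmp orders the keys the ASCII way.
sub latin1_sort_key {
    my $string = shift;
    return join '',
        map { chr( utf8::native_to_unicode( ord($_) ) ) } split //, $string;
}

my @data   = qw(dr. Dr. drive Doctor);
my @sorted = map  { $_->[1] }
             sort { $a->[0] cmp $b->[0] }
             map  { [ latin1_sort_key($_), $_ ] } @data;

print "@sorted\n";    # Doctor Dr. dr. drive
```

On an ASCII platform the keys are identical to the data, so the extra cost is only paid for the sake of portability.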

TRANSFORMATION FORMATS

There are a variety of ways of transforming data with an intra character set mapping that serve a variety of purposes. Sorting was discussed in the previous section and a few of the other more popular mapping techniques are discussed next.

URL decoding and encoding

Note that some URLs have hexadecimal ASCII code points in them in an attempt to overcome character or protocol limitation issues. For example the tilde character is not on every keyboard hence a URL of the form:

  1. http://www.pvhp.com/~pvhp/

may also be expressed as either of:

  1. http://www.pvhp.com/%7Epvhp/
  2. http://www.pvhp.com/%7epvhp/

where 7E is the hexadecimal ASCII code point for '~'. Here is an example of decoding such a URL under CCSID 1047:

  1. $url = 'http://www.pvhp.com/%7Epvhp/';
  2. # this array assumes code page 1047
  3. my @a2e_1047 = (
  4. 0, 1, 2, 3, 55, 45, 46, 47, 22, 5, 21, 11, 12, 13, 14, 15,
  5. 16, 17, 18, 19, 60, 61, 50, 38, 24, 25, 63, 39, 28, 29, 30, 31,
  6. 64, 90,127,123, 91,108, 80,125, 77, 93, 92, 78,107, 96, 75, 97,
  7. 240,241,242,243,244,245,246,247,248,249,122, 94, 76,126,110,111,
  8. 124,193,194,195,196,197,198,199,200,201,209,210,211,212,213,214,
  9. 215,216,217,226,227,228,229,230,231,232,233,173,224,189, 95,109,
  10. 121,129,130,131,132,133,134,135,136,137,145,146,147,148,149,150,
  11. 151,152,153,162,163,164,165,166,167,168,169,192, 79,208,161, 7,
  12. 32, 33, 34, 35, 36, 37, 6, 23, 40, 41, 42, 43, 44, 9, 10, 27,
  13. 48, 49, 26, 51, 52, 53, 54, 8, 56, 57, 58, 59, 4, 20, 62,255,
  14. 65,170, 74,177,159,178,106,181,187,180,154,138,176,202,175,188,
  15. 144,143,234,250,190,160,182,179,157,218,155,139,183,184,185,171,
  16. 100,101, 98,102, 99,103,158,104,116,113,114,115,120,117,118,119,
  17. 172,105,237,238,235,239,236,191,128,253,254,251,252,186,174, 89,
  18. 68, 69, 66, 70, 67, 71,156, 72, 84, 81, 82, 83, 88, 85, 86, 87,
  19. 140, 73,205,206,203,207,204,225,112,221,222,219,220,141,142,223
  20. );
  21. $url =~ s/%([0-9a-fA-F]{2})/pack("C",$a2e_1047[hex($1)])/ge;

Conversely, here is a partial solution for the task of encoding such a URL under the 1047 code page:

  1. $url = 'http://www.pvhp.com/~pvhp/';
  2. # this array assumes code page 1047
  3. my @e2a_1047 = (
  4. 0, 1, 2, 3,156, 9,134,127,151,141,142, 11, 12, 13, 14, 15,
  5. 16, 17, 18, 19,157, 10, 8,135, 24, 25,146,143, 28, 29, 30, 31,
  6. 128,129,130,131,132,133, 23, 27,136,137,138,139,140, 5, 6, 7,
  7. 144,145, 22,147,148,149,150, 4,152,153,154,155, 20, 21,158, 26,
  8. 32,160,226,228,224,225,227,229,231,241,162, 46, 60, 40, 43,124,
  9. 38,233,234,235,232,237,238,239,236,223, 33, 36, 42, 41, 59, 94,
  10. 45, 47,194,196,192,193,195,197,199,209,166, 44, 37, 95, 62, 63,
  11. 248,201,202,203,200,205,206,207,204, 96, 58, 35, 64, 39, 61, 34,
  12. 216, 97, 98, 99,100,101,102,103,104,105,171,187,240,253,254,177,
  13. 176,106,107,108,109,110,111,112,113,114,170,186,230,184,198,164,
  14. 181,126,115,116,117,118,119,120,121,122,161,191,208, 91,222,174,
  15. 172,163,165,183,169,167,182,188,189,190,221,168,175, 93,180,215,
  16. 123, 65, 66, 67, 68, 69, 70, 71, 72, 73,173,244,246,242,243,245,
  17. 125, 74, 75, 76, 77, 78, 79, 80, 81, 82,185,251,252,249,250,255,
  18. 92,247, 83, 84, 85, 86, 87, 88, 89, 90,178,212,214,210,211,213,
  19. 48, 49, 50, 51, 52, 53, 54, 55, 56, 57,179,219,220,217,218,159
  20. );
  21. # The following regular expression does not address the
  22. # mappings for: ('.' => '%2E', '/' => '%2F', ':' => '%3A')
  23. $url =~ s/([\t "#%&\(\),;<=>\?\@\[\\\]^`{|}~])/sprintf("%%%02X",$e2a_1047[ord($1)])/ge;

where a more complete solution would split the URL into components and apply a full s/// substitution only to the appropriate parts.
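When the code page is not known in advance, the lookup arrays can be replaced by utf8::unicode_to_native() and utf8::native_to_unicode() (see CONVERSIONS). The following is a sketch with hypothetical sub names; like the example above, the encoder ignores '.', '/' and ':' and treats the URL as one string rather than splitting it into components.

```perl
use strict;
use warnings;

# %XX holds an ASCII code point; convert it to the native one before chr().
sub url_decode_sketch {
    my $url = shift;
    $url =~ s/%([0-9a-fA-F]{2})/chr( utf8::unicode_to_native( hex($1) ) )/ge;
    return $url;
}

# Convert the native code of each reserved character to ASCII for %XX.
sub url_encode_sketch {
    my $url = shift;
    $url =~ s/([\t "#%&\(\),;<=>\?\@\[\\\]^`{|}~])/sprintf("%%%02X", utf8::native_to_unicode(ord($1)))/ge;
    return $url;
}

print url_decode_sketch('http://www.pvhp.com/%7Epvhp/'), "\n";
print url_encode_sketch('http://www.pvhp.com/~pvhp/'), "\n";
```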

In the remaining examples a @e2a or @a2e array may be employed but the assignment will not be shown explicitly. For code page 1047 you could use the @a2e_1047 or @e2a_1047 arrays just shown.

uu encoding and decoding

The u template to pack() or unpack() will render EBCDIC data in EBCDIC characters equivalent to their ASCII counterparts. For example, the following will print "Yes indeed\n" on either an ASCII or EBCDIC computer:

  1. $all_byte_chrs = '';
  2. for (0..255) { $all_byte_chrs .= chr($_); }
  3. $uuencode_byte_chrs = pack('u', $all_byte_chrs);
  4. ($uu = <<'ENDOFHEREDOC') =~ s/^\s*//gm;
  5. M``$"`P0%!@<("0H+#`T.#Q`1$A,4%187&!D:&QP='A\@(2(C)"4F)R@I*BLL
  6. M+2XO,#$R,S0U-C<X.3H[/#T^/T!!0D-$149'2$E*2TQ-3D]045)35%565UA9
  7. M6EM<75Y?8&%B8V1E9F=H:6IK;&UN;W!Q<G-T=79W>'EZ>WQ]?G^`@8*#A(6&
  8. MAXB)BHN,C8Z/D)&2DY25EI>8F9J;G)V>GZ"AHJ.DI::GJ*FJJZRMKJ^PL;*S
  9. MM+6VM[BYNKN\O;Z_P,'"P\3%QL?(R<K+S,W.S]#1TM/4U=;7V-G:V]S=WM_@
  10. ?X>+CY.7FY^CIZNOL[>[O\/'R\_3U]O?X^?K[_/W^_P``
  11. ENDOFHEREDOC
  12. if ($uuencode_byte_chrs eq $uu) {
  13. print "Yes ";
  14. }
  15. $uudecode_byte_chrs = unpack('u', $uuencode_byte_chrs);
  16. if ($uudecode_byte_chrs eq $all_byte_chrs) {
  17. print "indeed\n";
  18. }

Here is a very spartan uudecoder that will work on EBCDIC provided that the @e2a array is filled in appropriately:

  1. #!/usr/local/bin/perl
  2. @e2a = ( # this must be filled in
  3. );
  4. $_ = <> until ($mode,$file) = /^begin\s*(\d*)\s*(\S*)/;
  5. open(OUT, "> $file") if $file ne "";
  6. while(<>) {
  7. last if /^end/;
  8. next if /[a-z]/;
  9. next unless int(((($e2a[ord()] - 32 ) & 077) + 2) / 3) ==
  10. int(length() / 4);
  11. print OUT unpack("u", $_);
  12. }
  13. close(OUT);
  14. chmod oct($mode), $file;

Quoted-Printable encoding and decoding

On ASCII-encoded platforms it is possible to encode characters outside of the printable set (along with the "=" sign itself) using:

  1. # This QP encoder works on ASCII only
  2. $qp_string =~ s/([=\x00-\x1F\x80-\xFF])/sprintf("=%02X",ord($1))/ge;

Whereas a QP encoder that works on both ASCII and EBCDIC platforms would look somewhat like the following (where the EBCDIC branch @e2a array is omitted for brevity):

  if (ord('A') == 65) { # ASCII
      $delete = "\x7F"; # ASCII
      @e2a = (0 .. 255); # ASCII to ASCII identity map
  }
  else { # EBCDIC
      $delete = "\x07"; # EBCDIC
      @e2a = ( # EBCDIC to ASCII map (as shown above); this must be filled in
      );
  }
  $qp_string =~
      s/([^ !"\#\$%&'()*+,\-.\/0-9:;<>?\@A-Z[\\\]^_`a-z{|}~$delete])/sprintf("=%02X",$e2a[ord($1)])/ge;

(although in production code the substitutions might be done in the EBCDIC branch with the @e2a array and separately in the ASCII branch without the expense of the identity map).

Such QP strings can be decoded with:

  1. # This QP decoder is limited to ASCII only
  2. $string =~ s/=([0-9A-Fa-f][0-9A-Fa-f])/chr hex $1/ge;
  3. $string =~ s/=[\n\r]+$//;

Whereas a QP decoder that works on both ASCII and EBCDIC platforms would look somewhat like the following (where the @a2e array is omitted for brevity):

  1. $string =~ s/=([0-9A-Fa-f][0-9A-Fa-f])/chr $a2e[hex $1]/ge;
  2. $string =~ s/=[\n\r]+$//;
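The encoder and decoder can be exercised together as a round trip. The sketch below is a simplified variant, not the code above: it uses utf8::native_to_unicode() and utf8::unicode_to_native() in place of the @e2a and @a2e arrays, it also encodes DELETE, and it ignores the line-length and soft line break rules of real Quoted-Printable. The sub names are made up for this example.

```perl
use strict;
use warnings;

# Every character other than space and the printable ASCII characters
# (except "=") becomes =XX, where XX is its ASCII/Latin-1 code in hex.
sub qp_encode_sketch {
    my $out = '';
    for my $ch (split //, shift) {
        my $cp      = utf8::native_to_unicode(ord($ch));
        my $literal = $cp == 0x20
                   || ($cp >= 0x21 && $cp <= 0x7E && $cp != 0x3D);
        $out .= $literal ? $ch : sprintf('=%02X', $cp);
    }
    return $out;
}

# =XX holds an ASCII/Latin-1 code; convert it to the native code for chr().
sub qp_decode_sketch {
    my $string = shift;
    $string =~ s/=([0-9A-Fa-f]{2})/chr( utf8::unicode_to_native( hex($1) ) )/ge;
    return $string;
}

my $encoded = qp_encode_sketch("a=b\tc");
print "$encoded\n";    # a=3Db=09c
print "round trip ok\n" if qp_decode_sketch($encoded) eq "a=b\tc";
```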

Caesarean ciphers

The practice of shifting an alphabet one or more characters for encipherment dates back thousands of years and was explicitly detailed by Gaius Julius Caesar in his Gallic Wars text. A single alphabet shift is sometimes referred to as a rotation and the shift amount is given as a number $n after the string 'rot' or "rot$n". Rot0 and rot26 would designate identity maps on the 26-letter English version of the Latin alphabet. Rot13 has the interesting property that alternate subsequent invocations are identity maps (thus rot13 is its own non-trivial inverse in the group of 26 alphabet rotations). Hence the following is a rot13 encoder and decoder that will work on ASCII and EBCDIC platforms:

  1. #!/usr/local/bin/perl
  2. while(<>){
  3. tr/n-za-mN-ZA-M/a-zA-Z/;
  4. print;
  5. }

In one-liner form:

  1. perl -ne 'tr/n-za-mN-ZA-M/a-zA-Z/;print'
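The same idea extends to any shift amount if the mapping is built from the 'a'..'z' and 'A'..'Z' ranges, which hold twenty-six elements on both kinds of platform, rather than from arithmetic on ord() values. This is a sketch; the sub name rot_n is hypothetical.

```perl
use strict;
use warnings;

# Rotate the letters of $text by $n places without touching code points.
sub rot_n {
    my ($n, $text) = @_;
    my @lower = ('a' .. 'z');
    my @upper = ('A' .. 'Z');
    my %map;
    for my $i (0 .. 25) {
        $map{ $lower[$i] } = $lower[ ($i + $n) % 26 ];
        $map{ $upper[$i] } = $upper[ ($i + $n) % 26 ];
    }
    $text =~ s/([A-Za-z])/$map{$1}/g;
    return $text;
}

print rot_n(13, "Hello, World"), "\n";               # Uryyb, Jbeyq
print rot_n(13, rot_n(13, "Hello, World")), "\n";    # Hello, World
```

A subtraction such as ord($c) - ord('a') would not work here, because the EBCDIC alphabet has gaps between i and j and between r and s.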

Hashing order and checksums

To the extent that it is possible to write code that depends on hashing order there may be differences between hashes as stored on an ASCII-based platform and hashes stored on an EBCDIC-based platform. XXX

I18N AND L10N

Internationalization (I18N) and localization (L10N) are supported at least in principle even on EBCDIC platforms. The details are system-dependent and discussed in the OS ISSUES section below.

MULTI-OCTET CHARACTER SETS

Perl may work with an internal UTF-EBCDIC encoding form for wide characters on EBCDIC platforms in a manner analogous to the way that it works with the UTF-8 internal encoding form on ASCII based platforms.

Legacy multi byte EBCDIC code pages XXX.

OS ISSUES

There may be a few system-dependent issues of concern to EBCDIC Perl programmers.

OS/400

  • PASE

    The PASE environment is a runtime environment for OS/400 that can run executables built for PowerPC AIX in OS/400; see perlos400. Unlike the ILE, which is EBCDIC-based, PASE is ASCII-based.

  • IFS access

    XXX.

OS/390, z/OS

Perl runs under UNIX System Services (USS).

  • chcp

    chcp is supported as a shell utility for displaying and changing one's code page. See also chcp(1).

  • dataset access

    For sequential data set access try:

    1. my @ds_records = `cat //DSNAME`;

    or:

    1. my @ds_records = `cat //'HLQ.DSNAME'`;

    See also the OS390::Stdio module on CPAN.

  • OS/390, z/OS iconv

    iconv is supported as both a shell utility and a C RTL routine. See also the iconv(1) and iconv(3) manual pages.

  • locales

    On OS/390 or z/OS see locale for information on locales. The L10N files are in /usr/nls/locale. $Config{d_setlocale} is 'define' on OS/390 or z/OS.

POSIX-BC?

XXX.

BUGS

This pod document contains literal Latin 1 characters and may encounter translation difficulties. In particular one popular nroff implementation was known to strip accented characters to their unaccented counterparts while attempting to view this document through the pod2man program (for example, you may see a plain y rather than one with a diaeresis as in ÿ). Another nroff truncated the resultant manpage at the first occurrence of 8 bit characters.

Not all shells will allow multiple -e string arguments to perl to be concatenated together properly as recipes 0, 2, 4, 5, and 6 might seem to imply.

SEE ALSO

perllocale, perlfunc, perlunicode, utf8.

REFERENCES

http://anubis.dkuug.dk/i18n/charmaps

http://www.unicode.org/

http://www.unicode.org/unicode/reports/tr16/

http://www.wps.com/projects/codes/ ASCII: American Standard Code for Information Infiltration Tom Jennings, September 1999.

The Unicode Standard, Version 3.0 The Unicode Consortium, Lisa Moore ed., ISBN 0-201-61633-5, Addison Wesley Developers Press, February 2000.

CDRA: IBM - Character Data Representation Architecture - Reference and Registry, IBM SC09-2190-00, December 1996.

"Demystifying Character Sets", Andrea Vine, Multilingual Computing & Technology, #26 Vol. 10 Issue 4, August/September 1999; ISSN 1523-0309; Multilingual Computing Inc. Sandpoint ID, USA.

Codes, Ciphers, and Other Cryptic and Clandestine Communication Fred B. Wrixon, ISBN 1-57912-040-7, Black Dog & Leventhal Publishers, 1998.

http://www.bobbemer.com/P-BIT.HTM IBM - EBCDIC and the P-bit; The biggest Computer Goof Ever Robert Bemer.

HISTORY

15 April 2001: added UTF-8 and UTF-EBCDIC to main table, pvhp.

AUTHOR

Peter Prymmer pvhp@best.com wrote this in 1999 and 2000 with CCSID 0819 and 0037 help from Chris Leach and André Pirard A.Pirard@ulg.ac.be as well as POSIX-BC help from Thomas Dorner Thomas.Dorner@start.de. Thanks also to Vickie Cooper, Philip Newton, William Raffloer, and Joe Smith. Trademarks, registered trademarks, service marks and registered service marks used in this document are the property of their respective owners.

 
perlembed

Perl 5 version 18.2 documentation

NAME

perlembed - how to embed perl in your C program

DESCRIPTION

PREAMBLE

Do you want to:

ROADMAP

  • Compiling your C program

  • Adding a Perl interpreter to your C program

  • Calling a Perl subroutine from your C program

  • Evaluating a Perl statement from your C program

  • Performing Perl pattern matches and substitutions from your C program

  • Fiddling with the Perl stack from your C program

  • Maintaining a persistent interpreter

  • Maintaining multiple interpreter instances

  • Using Perl modules, which themselves use C libraries, from your C program

  • Embedding Perl under Win32

Compiling your C program

If you have trouble compiling the scripts in this documentation, you're not alone. The cardinal rule: COMPILE THE PROGRAMS IN EXACTLY THE SAME WAY THAT YOUR PERL WAS COMPILED. (Sorry for yelling.)

Also, every C program that uses Perl must link in the perl library. What's that, you ask? Perl is itself written in C; the perl library is the collection of compiled C programs that were used to create your perl executable (/usr/bin/perl or equivalent). (Corollary: you can't use Perl from your C program unless Perl has been compiled on your machine, or installed properly--that's why you shouldn't blithely copy Perl executables from machine to machine without also copying the lib directory.)

When you use Perl from C, your C program will--usually--allocate, "run", and deallocate a PerlInterpreter object, which is defined by the perl library.

If your copy of Perl is recent enough to contain this documentation (version 5.002 or later), then the perl library (and EXTERN.h and perl.h, which you'll also need) will reside in a directory that looks like this:

  1. /usr/local/lib/perl5/your_architecture_here/CORE

or perhaps just

  1. /usr/local/lib/perl5/CORE

or maybe something like

  1. /usr/opt/perl5/CORE

Execute this statement for a hint about where to find CORE:

  1. perl -MConfig -e 'print $Config{archlib}'

Here's how you'd compile the example in the next section, Adding a Perl interpreter to your C program, on my Linux box:

  1. % gcc -O2 -Dbool=char -DHAS_BOOL -I/usr/local/include
  2. -I/usr/local/lib/perl5/i586-linux/5.003/CORE
  3. -L/usr/local/lib/perl5/i586-linux/5.003/CORE
  4. -o interp interp.c -lperl -lm

(That's all one line.) On my DEC Alpha running old 5.003_05, the incantation is a bit different:

  1. % cc -O2 -Olimit 2900 -DSTANDARD_C -I/usr/local/include
  2. -I/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE
  3. -L/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE -L/usr/local/lib
  4. -D__LANGUAGE_C__ -D_NO_PROTO -o interp interp.c -lperl -lm

How can you figure out what to add? Assuming your Perl is post-5.001, execute a perl -V command and pay special attention to the "cc" and "ccflags" information.

You'll have to choose the appropriate compiler (cc, gcc, et al.) for your machine: perl -MConfig -e 'print $Config{cc}' will tell you what to use.

You'll also have to choose the appropriate library directory (/usr/local/lib/...) for your machine. If your compiler complains that certain functions are undefined, or that it can't locate -lperl, then you need to change the path following the -L . If it complains that it can't find EXTERN.h and perl.h, you need to change the path following the -I .

You may have to add extra libraries as well. Which ones? Perhaps those printed by

  1. perl -MConfig -e 'print $Config{libs}'
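The individual one-liners above can be gathered into one short script. This is only a convenience sketch; it prints the same %Config entries already mentioned (archlib, cc, ccflags and libs) and nothing more.

```perl
use strict;
use warnings;
use Config;

# Show the build settings most relevant when compiling an embedding program.
for my $key (qw(archlib cc ccflags libs)) {
    my $value = defined $Config{$key} ? $Config{$key} : '(not set)';
    printf "%-8s %s\n", $key, $value;
}
```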

Provided your perl binary was properly configured and installed the ExtUtils::Embed module will determine all of this information for you:

  1. % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

If the ExtUtils::Embed module isn't part of your Perl distribution, you can retrieve it from http://www.perl.com/perl/CPAN/modules/by-module/ExtUtils/. (If this documentation came from your Perl distribution, then you're running 5.004 or better and you already have it.)

The ExtUtils::Embed kit on CPAN also contains all source code for the examples in this document, tests, additional examples and other information you may find useful.

Adding a Perl interpreter to your C program

In a sense, perl (the C program) is a good example of embedding Perl (the language), so I'll demonstrate embedding with miniperlmain.c, included in the source distribution. Here's a bastardized, non-portable version of miniperlmain.c containing the essentials of embedding:

  #include <EXTERN.h>               /* from the Perl distribution     */
  #include <perl.h>                 /* from the Perl distribution     */

  static PerlInterpreter *my_perl;  /***    The Perl interpreter    ***/

  int main(int argc, char **argv, char **env)
  {
      PERL_SYS_INIT3(&argc,&argv,&env);
      my_perl = perl_alloc();
      perl_construct(my_perl);
      PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
      perl_parse(my_perl, NULL, argc, argv, (char **)NULL);
      perl_run(my_perl);
      perl_destruct(my_perl);
      perl_free(my_perl);
      PERL_SYS_TERM();
  }

Notice that we don't use the env pointer. Normally handed to perl_parse as its final argument, env here is replaced by NULL, which means that the current environment will be used.

The macros PERL_SYS_INIT3() and PERL_SYS_TERM() provide system-specific tune up of the C runtime environment necessary to run Perl interpreters; they should only be called once regardless of how many interpreters you create or destroy. Call PERL_SYS_INIT3() before you create your first interpreter, and PERL_SYS_TERM() after you free your last interpreter.

Since PERL_SYS_INIT3() may change env, it may be more appropriate to provide env as an argument to perl_parse().

Also notice that no matter what arguments you pass to perl_parse(), PERL_SYS_INIT3() must be invoked on the argc, argv and env of the C main(), and only once.

Now compile this program (I'll call it interp.c) into an executable:

  % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

After a successful compilation, you'll be able to use interp just like perl itself:

  % interp
  print "Pretty Good Perl \n";
  print "10890 - 9801 is ", 10890 - 9801;
  <CTRL-D>
  Pretty Good Perl
  10890 - 9801 is 1089

or

  % interp -e 'printf("%x", 3735928559)'
  deadbeef

You can also read and execute Perl statements from a file while in the midst of your C program, by placing the filename in argv[1] before calling perl_parse (which is where the script is read and compiled).

Calling a Perl subroutine from your C program

To call individual Perl subroutines, you can use any of the call_* functions documented in perlcall. In this example we'll use call_argv .

That's shown below, in a program I'll call showtime.c.

  #include <EXTERN.h>
  #include <perl.h>

  static PerlInterpreter *my_perl;

  int main(int argc, char **argv, char **env)
  {
      char *args[] = { NULL };
      PERL_SYS_INIT3(&argc,&argv,&env);
      my_perl = perl_alloc();
      perl_construct(my_perl);

      perl_parse(my_perl, NULL, argc, argv, NULL);
      PL_exit_flags |= PERL_EXIT_DESTRUCT_END;

      /*** skipping perl_run() ***/

      call_argv("showtime", G_DISCARD | G_NOARGS, args);

      perl_destruct(my_perl);
      perl_free(my_perl);
      PERL_SYS_TERM();
  }

where showtime is a Perl subroutine that takes no arguments (that's the G_NOARGS) and for which I'll ignore the return value (that's the G_DISCARD). Those flags, and others, are discussed in perlcall.

I'll define the showtime subroutine in a file called showtime.pl:

  print "I shan't be printed.";

  sub showtime {
      print time;
  }

Simple enough. Now compile and run:

  % cc -o showtime showtime.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
  % showtime showtime.pl
  818284590

yielding the number of seconds that elapsed between January 1, 1970 (the beginning of the Unix epoch), and the moment I began writing this sentence.

In this particular case we don't have to call perl_run, because we set PERL_EXIT_DESTRUCT_END in PL_exit_flags, which makes perl_destruct execute the END blocks.

If you want to pass arguments to the Perl subroutine, you can add strings to the NULL -terminated args list passed to call_argv. For other data types, or to examine return values, you'll need to manipulate the Perl stack. That's demonstrated in Fiddling with the Perl stack from your C program.
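The NULL-terminated layout that call_argv expects can be shown without linking against Perl at all. The following is only a sketch: count_args() is a hypothetical helper, not part of the Perl API, and it merely walks the list the same way a consumer of such an array would.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Perl API): count the strings
 * in a NULL-terminated argument list. The terminating NULL itself is
 * not counted. */
size_t count_args(char *const *args)
{
    size_t n = 0;

    assert(args != NULL);
    while (args[n] != NULL)
        n++;
    return n;
}
```

With char *args[] = { "first", "second", NULL }, count_args(args) is 2, and with the empty list char *args[] = { NULL } used in showtime.c it is 0. Forgetting the final NULL would let such a loop run off the end of the array.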

Evaluating a Perl statement from your C program

Perl provides two API functions to evaluate pieces of Perl code. These are eval_sv in perlapi and eval_pv in perlapi.

Arguably, these are the only routines you'll ever need to execute snippets of Perl code from within your C program. Your code can be as long as you wish; it can contain multiple statements; it can employ use, require, and do to include external Perl files.

eval_pv lets us evaluate individual Perl strings, and then extract variables for coercion into C types. The following program, string.c, executes three Perl strings, extracting an int from the first, a float from the second, and a char * from the third.

  #include <EXTERN.h>
  #include <perl.h>

  static PerlInterpreter *my_perl;

  int main (int argc, char **argv, char **env)
  {
      char *embedding[] = { "", "-e", "0" };

      PERL_SYS_INIT3(&argc,&argv,&env);
      my_perl = perl_alloc();
      perl_construct( my_perl );

      perl_parse(my_perl, NULL, 3, embedding, NULL);
      PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
      perl_run(my_perl);

      /** Treat $a as an integer (IV may be wider than int, so cast) **/
      eval_pv("$a = 3; $a **= 2", TRUE);
      printf("a = %d\n", (int)SvIV(get_sv("a", 0)));

      /** Treat $a as a float **/
      eval_pv("$a = 3.14; $a **= 2", TRUE);
      printf("a = %f\n", SvNV(get_sv("a", 0)));

      /** Treat $a as a string **/
      eval_pv("$a = 'rekcaH lreP rehtonA tsuJ'; $a = reverse($a);", TRUE);
      printf("a = %s\n", SvPV_nolen(get_sv("a", 0)));

      perl_destruct(my_perl);
      perl_free(my_perl);
      PERL_SYS_TERM();
  }

All of those strange functions with sv in their names help convert Perl scalars to C types. They're described in perlguts and perlapi.

If you compile and run string.c, you'll see the results of using SvIV() to create an int, SvNV() to create a float, and SvPV_nolen() to create a string:

  a = 9
  a = 9.859600
  a = Just Another Perl Hacker

In the example above, we've created a global variable to temporarily store the computed value of our eval'ed expression. It is also possible and in most cases a better strategy to fetch the return value from eval_pv() instead. Example:

  ...
  SV *val = eval_pv("reverse 'rekcaH lreP rehtonA tsuJ'", TRUE);
  printf("%s\n", SvPV_nolen(val));
  ...

This way, we avoid namespace pollution by not creating global variables and we've simplified our code as well.

Performing Perl pattern matches and substitutions from your C program

The eval_sv() function lets us evaluate strings of Perl code, so we can define some functions that use it to "specialize" in matches and substitutions: match(), substitute(), and matches().

  I32 match(SV *string, char *pattern);

Given a string and a pattern (e.g., m/clasp/ or /\b\w*\b/, which in your C program might appear as "/\\b\\w*\\b/"), match() returns 1 if the string matches the pattern and 0 otherwise.

  I32 substitute(SV **string, char *pattern);

Given a pointer to an SV and an =~ operation (e.g., s/bob/robert/g or tr[A-Z][a-z]), substitute() modifies the string within the SV according to the operation, returning the number of substitutions made.

  I32 matches(SV *string, char *pattern, AV **matches);

Given an SV, a pattern, and a pointer to an empty AV, matches() evaluates $string =~ $pattern in a list context, and fills in matches with the array elements, returning the number of matches found.

Here's a sample program, match.c, that uses all three (long lines have been wrapped here):

  #include <EXTERN.h>
  #include <perl.h>

  static PerlInterpreter *my_perl;

  /** my_eval_sv(code, error_check)
  ** kinda like eval_sv(),
  ** but we pop the return value off the stack
  **/
  SV* my_eval_sv(SV *sv, I32 croak_on_error)
  {
      dSP;
      SV* retval;

      PUSHMARK(SP);
      eval_sv(sv, G_SCALAR);

      SPAGAIN;
      retval = POPs;
      PUTBACK;

      /* pass the message as an argument, never as the format string */
      if (croak_on_error && SvTRUE(ERRSV))
          croak("%s", SvPVx_nolen(ERRSV));

      return retval;
  }

  /** match(string, pattern)
  **
  ** Used for matches in a scalar context.
  **
  ** Returns 1 if the match was successful; 0 otherwise.
  **/
  I32 match(SV *string, char *pattern)
  {
      SV *command = newSV(0), *retval;

      sv_setpvf(command, "my $string = '%s'; $string =~ %s",
                SvPV_nolen(string), pattern);

      retval = my_eval_sv(command, TRUE);
      SvREFCNT_dec(command);

      return SvIV(retval);
  }

  /** substitute(string, pattern)
  **
  ** Used for =~ operations that modify their left-hand side (s/// and tr///)
  **
  ** Returns the number of successful matches, and
  ** modifies the input string if there were any.
  **/
  I32 substitute(SV **string, char *pattern)
  {
      SV *command = newSV(0), *retval;

      sv_setpvf(command, "$string = '%s'; ($string =~ %s)",
                SvPV_nolen(*string), pattern);

      retval = my_eval_sv(command, TRUE);
      SvREFCNT_dec(command);

      *string = get_sv("string", 0);
      return SvIV(retval);
  }

  /** matches(string, pattern, matches)
  **
  ** Used for matches in a list context.
  **
  ** Returns the number of matches,
  ** and fills in **matches with the matching substrings
  **/
  I32 matches(SV *string, char *pattern, AV **match_list)
  {
      SV *command = newSV(0);
      I32 num_matches;

      sv_setpvf(command, "my $string = '%s'; @array = ($string =~ %s)",
                SvPV_nolen(string), pattern);

      my_eval_sv(command, TRUE);
      SvREFCNT_dec(command);

      *match_list = get_av("array", 0);
      num_matches = av_top_index(*match_list) + 1;

      return num_matches;
  }

  int main (int argc, char **argv, char **env)
  {
      char *embedding[] = { "", "-e", "0" };
      AV *match_list;
      I32 num_matches, i;
      SV *text;

      PERL_SYS_INIT3(&argc,&argv,&env);
      my_perl = perl_alloc();
      perl_construct(my_perl);
      perl_parse(my_perl, NULL, 3, embedding, NULL);
      PL_exit_flags |= PERL_EXIT_DESTRUCT_END;

      text = newSV(0);
      sv_setpv(text, "When he is at a convenience store and the "
         "bill comes to some amount like 76 cents, Maynard is "
         "aware that there is something he *should* do, something "
         "that will enable him to get back a quarter, but he has "
         "no idea *what*. He fumbles through his red squeezey "
         "changepurse and gives the boy three extra pennies with "
         "his dollar, hoping that he might luck into the correct "
         "amount. The boy gives him back two of his own pennies "
         "and then the big shiny quarter that is his prize. "
         "-RICHH");

      if (match(text, "m/quarter/")) /** Does text contain 'quarter'? **/
          printf("match: Text contains the word 'quarter'.\n\n");
      else
          printf("match: Text doesn't contain the word 'quarter'.\n\n");

      if (match(text, "m/eighth/")) /** Does text contain 'eighth'? **/
          printf("match: Text contains the word 'eighth'.\n\n");
      else
          printf("match: Text doesn't contain the word 'eighth'.\n\n");

      /** Match all occurrences of /wi../ **/
      num_matches = matches(text, "m/(wi..)/g", &match_list);
      printf("matches: m/(wi..)/g found %d matches...\n", num_matches);

      for (i = 0; i < num_matches; i++)
          printf("match: %s\n", SvPV_nolen(*av_fetch(match_list, i, FALSE)));
      printf("\n");

      /** Remove all vowels from text **/
      num_matches = substitute(&text, "s/[aeiou]//gi");
      if (num_matches) {
          printf("substitute: s/[aeiou]//gi...%d substitutions made.\n",
                 num_matches);
          printf("Now text is: %s\n\n", SvPV_nolen(text));
      }

      /** Attempt a substitution **/
      if (!substitute(&text, "s/Perl/C/")) {
          printf("substitute: s/Perl/C...No substitution made.\n\n");
      }

      SvREFCNT_dec(text);
      PL_perl_destruct_level = 1;
      perl_destruct(my_perl);
      perl_free(my_perl);
      PERL_SYS_TERM();
  }

which produces the output (again, long lines have been wrapped here)

  match: Text contains the word 'quarter'.
  match: Text doesn't contain the word 'eighth'.
  matches: m/(wi..)/g found 2 matches...
  match: will
  match: with
  substitute: s/[aeiou]//gi...139 substitutions made.
  Now text is: Whn h s t cnvnnc str nd th bll cms t sm mnt lk 76 cnts,
  Mynrd s wr tht thr s smthng h *shld* d, smthng tht wll nbl hm t gt bck
  qrtr, bt h hs n d *wht*. H fmbls thrgh hs rd sqzy chngprs nd gvs th by
  thr xtr pnns wth hs dllr, hpng tht h mght lck nt th crrct mnt. Th by gvs
  hm bck tw f hs wn pnns nd thn th bg shny qrtr tht s hs prz. -RCHH
  substitute: s/Perl/C...No substitution made.
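One caveat: match(), substitute() and matches() paste the subject text between single quotes in generated Perl source, so a ' or a \ in the text would end the quoted string early or change its meaning. The sample text happens to contain neither. A minimal sketch of one way to guard against that, assuming a hypothetical helper quote_for_single_quotes() (not part of the Perl API) that is applied to the text before it is handed to sv_setpvf:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Perl API): copy src into dst,
 * putting a backslash in front of every ' and \ so that the result can
 * sit between single quotes in Perl source.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if dst (of size dstsz) is too small. */
long quote_for_single_quotes(const char *src, char *dst, size_t dstsz)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dstsz == 0)
        return -1;

    for (; *src != '\0'; src++) {
        size_t need = (*src == '\'' || *src == '\\') ? 2 : 1;

        if (n + need >= dstsz)      /* always keep room for the NUL */
            return -1;
        if (need == 2)
            dst[n++] = '\\';
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return (long)n;
}
```

Inside a single-quoted Perl string only \' and \\ are treated as escapes, which is why those two characters are the only ones handled here. Passing the text to Perl as a real scalar, rather than as source code, would avoid the quoting problem altogether.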

Fiddling with the Perl stack from your C program

When trying to explain stacks, most computer science textbooks mumble something about spring-loaded columns of cafeteria plates: the last thing you pushed on the stack is the first thing you pop off. That'll do for our purposes: your C program will push some arguments onto "the Perl stack", shut its eyes while some magic happens, and then pop the results--the return value of your Perl subroutine--off the stack.
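For readers who would rather see the cafeteria plates in code, here is a toy last-in, first-out stack in plain C. It is only an illustration of the push/pop discipline: the type int_stack and its functions are made up for this sketch and have nothing to do with how the Perl stack is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 16

/* A toy fixed-size stack of ints, for illustration only. */
typedef struct {
    int    items[STACK_MAX];
    size_t top;                 /* number of items currently stored */
} int_stack;

void stack_init(int_stack *s)
{
    assert(s != NULL);
    s->top = 0;
}

/* Push v; returns 0 on success, -1 if the stack is full. */
int stack_push(int_stack *s, int v)
{
    assert(s != NULL);
    if (s->top == STACK_MAX)
        return -1;
    s->items[s->top++] = v;
    return 0;
}

/* Pop and return the most recently pushed value.
 * The stack must not be empty. */
int stack_pop(int_stack *s)
{
    assert(s != NULL && s->top > 0);
    return s->items[--s->top];
}
```

Pushing 3 and then 4 and popping twice yields 4 first and 3 second: the last plate put on the column is the first one taken off.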

First you'll need to know how to convert between C types and Perl types, with newSViv() and sv_setnv() and newAV() and all their friends. They're described in perlguts and perlapi.

Then you'll need to know how to manipulate the Perl stack. That's described in perlcall.

Once you've understood those, embedding Perl in C is easy.

Because C has no builtin function for integer exponentiation, let's make Perl's ** operator available to it (this is less useful than it sounds, because Perl implements ** with C's pow() function). First I'll create a stub exponentiation function in power.pl:

  sub expo {
      my ($a, $b) = @_;
      return $a ** $b;
  }

Now I'll create a C program, power.c, with a function PerlPower() that contains all the perlguts necessary to push the two arguments into expo() and to pop the return value out. Take a deep breath...

  #include <EXTERN.h>
  #include <perl.h>

  static PerlInterpreter *my_perl;

  static void
  PerlPower(int a, int b)
  {
      dSP;                            /* initialize stack pointer      */
      ENTER;                          /* everything created after here */
      SAVETMPS;                       /* ...is a temporary variable.   */
      PUSHMARK(SP);                   /* remember the stack pointer    */
      XPUSHs(sv_2mortal(newSViv(a))); /* push the base onto the stack  */
      XPUSHs(sv_2mortal(newSViv(b))); /* push the exponent onto stack  */
      PUTBACK;                        /* make local stack pointer global */
      call_pv("expo", G_SCALAR);      /* call the function             */
      SPAGAIN;                        /* refresh stack pointer         */
                                      /* pop the return value from stack;
                                         POPi yields an IV, so cast it
                                         to match %d                    */
      printf ("%d to the %dth power is %d.\n", a, b, (int)POPi);
      PUTBACK;
      FREETMPS;                       /* free that return value        */
      LEAVE;                          /* ...and the XPUSHed "mortal" args.*/
  }

  int main (int argc, char **argv, char **env)
  {
      char *my_argv[] = { "", "power.pl" };

      PERL_SYS_INIT3(&argc,&argv,&env);
      my_perl = perl_alloc();
      perl_construct( my_perl );

      perl_parse(my_perl, NULL, 2, my_argv, (char **)NULL);
      PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
      perl_run(my_perl);

      PerlPower(3, 4);                /*** Compute 3 ** 4 ***/

      perl_destruct(my_perl);
      perl_free(my_perl);
      PERL_SYS_TERM();
  }

Compile and run:

  % cc -o power power.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
  % power
  3 to the 4th power is 81.
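For comparison, and to underline that the round trip through Perl is a teaching device rather than a necessity, integer exponentiation is only a few lines of plain C. This is a sketch using exponentiation by squaring; ipow() is a hypothetical name, and the function makes no attempt to detect overflow.

```c
/* Hypothetical helper: raise base to the power exp using
 * exponentiation by squaring. Overflow is not detected. */
long ipow(long base, unsigned int exp)
{
    long result = 1;

    while (exp > 0) {
        if (exp & 1u)           /* lowest bit set: multiply it in */
            result *= base;
        exp >>= 1;
        if (exp > 0)            /* square only if another round follows */
            base *= base;
    }
    return result;
}
```

ipow(3, 4) gives 81, the same answer PerlPower(3, 4) printed above, and any base raised to the power 0 gives 1.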

Maintaining a persistent interpreter

When developing interactive and/or potentially long-running applications, it's a good idea to maintain a persistent interpreter rather than allocating and constructing a new interpreter multiple times. The major reason is speed: Perl will only be loaded into memory once.

However, you have to be more cautious with namespace and variable scoping when using a persistent interpreter. In previous examples we've been using global variables in the default package main. We knew exactly what code would be run, and assumed we could avoid variable collisions and outrageous symbol table growth.

Let's say your application is a server that will occasionally run Perl code from some arbitrary file. Your server has no way of knowing what code it's going to run. Very dangerous.

If the file is pulled in by perl_parse(), compiled into a newly constructed interpreter, and cleaned out with perl_destruct() afterwards, you're shielded from most namespace troubles.

One way to avoid namespace collisions in this scenario is to translate the filename into a guaranteed-unique package name, and then compile the code into that package using eval. In the example below, each file will only be compiled once. Or, the application might choose to clean out the symbol table associated with the file after it's no longer needed. Using call_argv in perlapi, we'll call the subroutine Embed::Persistent::eval_file, which lives in the file persistent.pl, and pass the filename and a boolean cleanup/cache flag as arguments.

Note that the process will continue to grow for each file that it uses. In addition, there might be AUTOLOADed subroutines and other conditions that cause Perl's symbol table to grow. You might want to add some logic that keeps track of the process size, or restarts itself after a certain number of requests, to ensure that memory consumption is minimized. You'll also want to scope your variables with my whenever possible.
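The "restart after a certain number of requests" idea needs nothing more than a counter. The following is a minimal sketch under stated assumptions: request_budget and its two functions are hypothetical names invented here, and what "restart" means (rebuilding the interpreter, or re-executing the whole process) is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a long-running embedder. */
typedef struct {
    unsigned long served;   /* requests handled since the last restart */
    unsigned long limit;    /* restart once this many have been served */
} request_budget;

void budget_init(request_budget *b, unsigned long limit)
{
    assert(b != NULL && limit > 0);
    b->served = 0;
    b->limit  = limit;
}

/* Record one handled request. Returns 1 when the limit has been
 * reached (and resets the counter), 0 otherwise. */
int budget_note_request(request_budget *b)
{
    assert(b != NULL);
    if (++b->served >= b->limit) {
        b->served = 0;
        return 1;
    }
    return 0;
}
```

A server loop would call budget_note_request() after each file it runs and, on a return value of 1, tear the interpreter down and build a fresh one before continuing. Tracking the process size instead would follow the same pattern with a different trigger.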

  package Embed::Persistent;
  #persistent.pl

  use strict;
  our %Cache;
  use Symbol qw(delete_package);

  sub valid_package_name {
      my($string) = @_;
      $string =~ s/([^A-Za-z0-9\/])/sprintf("_%2x",unpack("C",$1))/eg;
      # second pass only for words starting with a digit
      $string =~ s|/(\d)|sprintf("/_%2x",unpack("C",$1))|eg;

      # Dress it up as a real package name
      $string =~ s|/|::|g;
      return "Embed" . $string;
  }

  sub eval_file {
      my($filename, $delete) = @_;
      my $package = valid_package_name($filename);
      my $mtime = -M $filename;
      if(defined $Cache{$package}{mtime}
         &&
         $Cache{$package}{mtime} <= $mtime)
      {
          # we have compiled this subroutine already,
          # it has not been updated on disk, nothing left to do
          print STDERR "already compiled $package->handler\n";
      }
      else {
          local *FH;
          open FH, $filename or die "open '$filename' $!";
          local($/) = undef;
          my $sub = <FH>;
          close FH;

          #wrap the code into a subroutine inside our unique package
          my $eval = qq{package $package; sub handler { $sub; }};
          {
              # hide our variables within this block
              my($filename,$mtime,$package,$sub);
              eval $eval;
          }
          die $@ if $@;

          #cache it unless we're cleaning out each time
          $Cache{$package}{mtime} = $mtime unless $delete;
      }

      eval {$package->handler;};
      die $@ if $@;

      delete_package($package) if $delete;

      #take a look if you want
      #print Devel::Symdump->rnew($package)->as_string, $/;
  }

  1;

  __END__

  /* persistent.c */
  #include <EXTERN.h>
  #include <perl.h>

  /* "1" = clean out filename's symbol table after each request, "0" = don't.
     This is a string because it is passed to Perl through the char *args[]
     list below; a bare 0 there would act as the terminating NULL. */
  #ifndef DO_CLEAN
  #define DO_CLEAN "0"
  #endif

  #define BUFFER_SIZE 1024

  static PerlInterpreter *my_perl = NULL;

  int
  main(int argc, char **argv, char **env)
  {
      char *embedding[] = { "", "persistent.pl" };
      char *args[] = { "", DO_CLEAN, NULL };
      char filename[BUFFER_SIZE];
      int exitstatus = 0;

      PERL_SYS_INIT3(&argc,&argv,&env);
      if((my_perl = perl_alloc()) == NULL) {
          fprintf(stderr, "no memory!");
          exit(1);
      }
      perl_construct(my_perl);

      PL_origalen = 1; /* don't let $0 assignment update the proctitle or embedding[0] */
      exitstatus = perl_parse(my_perl, NULL, 2, embedding, NULL);
      PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
      if(!exitstatus) {
          exitstatus = perl_run(my_perl);

          while(printf("Enter file name: ") &&
                fgets(filename, BUFFER_SIZE, stdin)) {

              /* strip the trailing \n, if there is one */
              filename[strcspn(filename, "\n")] = '\0';
              /* call the subroutine, passing it the filename as an argument */
              args[0] = filename;
              call_argv("Embed::Persistent::eval_file",
                        G_DISCARD | G_EVAL, args);

              /* check $@ */
              if(SvTRUE(ERRSV))
                  fprintf(stderr, "eval error: %s\n", SvPV_nolen(ERRSV));
          }
      }

      PL_perl_destruct_level = 0;
      perl_destruct(my_perl);
      perl_free(my_perl);
      PERL_SYS_TERM();
      exit(exitstatus);
  }

Now compile:

  % cc -o persistent persistent.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

Here's an example script file:

  #test.pl
  my $string = "hello";
  foo($string);

  sub foo {
      print "foo says: @_\n";
  }

Now run:

  % persistent
  Enter file name: test.pl
  foo says: hello
  Enter file name: test.pl
  already compiled Embed::test_2epl->handler
  foo says: hello
  Enter file name: ^C
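The filename-to-package translation done by valid_package_name() in persistent.pl can also be sketched in C, which makes the three substitutions easier to follow one character at a time. This is a hypothetical port written for illustration, not code from the Perl distribution. It deliberately differs from the Perl version in one detail: it formats escapes with %02x (zero padded) where the Perl code uses %2x (space padded), which only matters for bytes below 0x10.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical C port of valid_package_name() from persistent.pl.
 * Writes "Embed" followed by the mangled path into out.
 * Returns 0 on success, -1 if out (of size outsz) is too small. */
int valid_package_name(const char *path, char *out, size_t outsz)
{
    size_t used;
    int after_slash = 0;        /* was the previous character a '/'? */

    assert(path != NULL && out != NULL);
    if (outsz < sizeof "Embed")
        return -1;
    strcpy(out, "Embed");
    used = strlen(out);

    for (; *path != '\0'; path++) {
        unsigned char c = (unsigned char)*path;
        int is_digit = (c >= '0' && c <= '9');
        int is_alpha = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        char piece[16];
        size_t len;

        if (c == '/')
            snprintf(piece, sizeof piece, "::");          /* separator  */
        else if (is_alpha || (is_digit && !after_slash))
            snprintf(piece, sizeof piece, "%c", c);       /* keep as is */
        else
            snprintf(piece, sizeof piece, "_%02x", (unsigned int)c);

        len = strlen(piece);
        if (used + len + 1 > outsz)
            return -1;
        memcpy(out + used, piece, len + 1);   /* copies the NUL as well */
        used += len;
        after_slash = (c == '/');
    }
    return 0;
}
```

For the path /tmp/test.pl this produces Embed::tmp::test_2epl: each / becomes ::, and the dot (0x2e) is hex-escaped. A path component that starts with a digit, as in /srv/2024/a-b.pl, has that digit escaped too, giving Embed::srv::_32024::a_2db_2epl, because a package name component may not begin with a digit.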

Execution of END blocks

Traditionally END blocks have been executed at the end of the perl_run. This causes problems for applications that never call perl_run. Since perl 5.7.2 you can specify PL_exit_flags |= PERL_EXIT_DESTRUCT_END to get the new behaviour. This also enables the running of END blocks if the perl_parse fails and perl_destruct will return the exit value.

$0 assignments

When a perl script assigns a value to $0 then the perl runtime will try to make this value show up as the program name reported by "ps" by updating the memory pointed to by the argv passed to perl_parse() and also calling API functions like setproctitle() where available. This behaviour might not be appropriate when embedding perl and can be disabled by assigning the value 1 to the variable PL_origalen before perl_parse() is called.

The persistent.c example above, for instance, is likely to segfault when $0 is assigned to if the PL_origalen = 1; assignment is removed. This is because perl will try to write to the read-only memory of the embedding[] strings.

Maintaining multiple interpreter instances

Some rare applications will need to create more than one interpreter during a session. Such an application might sporadically decide to release any resources associated with the interpreter.

The program must take care to ensure that this takes place before the next interpreter is constructed. By default, when perl is not built with any special options, the global variable PL_perl_destruct_level is set to 0, since extra cleaning isn't usually needed when a program only ever creates a single interpreter in its entire lifetime.

Setting PL_perl_destruct_level to 1 makes everything squeaky clean:

  while(1) {
      ...
      /* reset global variables here with PL_perl_destruct_level = 1 */
      PL_perl_destruct_level = 1;
      perl_construct(my_perl);
      ...
      /* clean and reset _everything_ during perl_destruct */
      PL_perl_destruct_level = 1;
      perl_destruct(my_perl);
      perl_free(my_perl);
      ...
      /* let's go do it again! */
  }

When perl_destruct() is called, the interpreter's syntax parse tree and symbol tables are cleaned up, and global variables are reset. The second assignment to PL_perl_destruct_level is needed because perl_construct resets it to 0.

Now suppose we have more than one interpreter instance running at the same time. This is feasible, but only if you used the Configure option -Dusemultiplicity or the options -Dusethreads -Duseithreads when building perl. By default, enabling one of these Configure options sets the per-interpreter global variable PL_perl_destruct_level to 1, so that thorough cleaning is automatic and interpreter variables are initialized correctly. Even if you don't intend to run two or more interpreters at the same time, but only sequentially as in the above example, it is recommended to build perl with the -Dusemultiplicity option. Otherwise some interpreter variables may not be initialized correctly between consecutive runs, and your application may crash.

See also Thread-aware system interfaces in perlxs.

Using -Dusethreads -Duseithreads rather than -Dusemultiplicity is more appropriate if you intend to run multiple interpreters concurrently in different threads, because it enables support for linking in the thread libraries of your system with the interpreter.

Let's give it a try:

  #include <EXTERN.h>
  #include <perl.h>

  /* we're going to embed two interpreters */

  #define SAY_HELLO "-e", "print qq(Hi, I'm $^X\n)"

  int main(int argc, char **argv, char **env)
  {
      PerlInterpreter *one_perl, *two_perl;
      char *one_args[] = { "one_perl", SAY_HELLO };
      char *two_args[] = { "two_perl", SAY_HELLO };

      PERL_SYS_INIT3(&argc,&argv,&env);
      one_perl = perl_alloc();
      two_perl = perl_alloc();

      PERL_SET_CONTEXT(one_perl);
      perl_construct(one_perl);
      PERL_SET_CONTEXT(two_perl);
      perl_construct(two_perl);

      PERL_SET_CONTEXT(one_perl);
      perl_parse(one_perl, NULL, 3, one_args, (char **)NULL);
      PERL_SET_CONTEXT(two_perl);
      perl_parse(two_perl, NULL, 3, two_args, (char **)NULL);

      PERL_SET_CONTEXT(one_perl);
      perl_run(one_perl);
      PERL_SET_CONTEXT(two_perl);
      perl_run(two_perl);

      PERL_SET_CONTEXT(one_perl);
      perl_destruct(one_perl);
      PERL_SET_CONTEXT(two_perl);
      perl_destruct(two_perl);

      PERL_SET_CONTEXT(one_perl);
      perl_free(one_perl);
      PERL_SET_CONTEXT(two_perl);
      perl_free(two_perl);
      PERL_SYS_TERM();
  }

Note the calls to PERL_SET_CONTEXT(). These are necessary to initialize the global state that tracks which interpreter is the "current" one on the particular process or thread that may be running it. It should always be used if you have more than one interpreter and are making perl API calls on both interpreters in an interleaved fashion.

PERL_SET_CONTEXT(interp) should also be called whenever interp is used by a thread that did not create it (using either perl_alloc(), or the more esoteric perl_clone()).

Compile as usual:

  % cc -o multiplicity multiplicity.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

Run it, Run it:

  % multiplicity
  Hi, I'm one_perl
  Hi, I'm two_perl

Using Perl modules, which themselves use C libraries, from your C program

If you've played with the examples above and tried to embed a script that use()s a Perl module (such as Socket) which itself uses a C or C++ library, this probably happened:

  Can't load module Socket, dynamic loading not available in this perl.
  (You may need to build a new perl executable which either supports
  dynamic loading or has the Socket module statically linked into it.)

What's wrong?

Your interpreter doesn't know how to communicate with these extensions on its own. A little glue will help. Up until now you've been calling perl_parse(), handing it NULL for the second argument:

  perl_parse(my_perl, NULL, argc, my_argv, NULL);

That's where the glue code can be inserted to create the initial contact between Perl and linked C/C++ routines. Let's take a look at some pieces of perlmain.c to see how Perl does this:

  static void xs_init (pTHX);

  EXTERN_C void boot_DynaLoader (pTHX_ CV* cv);
  EXTERN_C void boot_Socket (pTHX_ CV* cv);

  EXTERN_C void
  xs_init(pTHX)
  {
      char *file = __FILE__;
      /* DynaLoader is a special case */
      newXS("DynaLoader::boot_DynaLoader", boot_DynaLoader, file);
      newXS("Socket::bootstrap", boot_Socket, file);
  }

Simply put: for each extension linked with your Perl executable (determined during its initial configuration on your computer or when adding a new extension), a Perl subroutine is created to incorporate the extension's routines. Normally, that subroutine is named Module::bootstrap() and is invoked when you say use Module. In turn, this hooks into an XSUB, boot_Module, which creates a Perl counterpart for each of the extension's XSUBs. Don't worry about this part; leave that to the xsubpp and extension authors. If your extension is dynamically loaded, DynaLoader creates Module::bootstrap() for you on the fly. In fact, if you have a working DynaLoader then there is rarely any need to link in any other extensions statically.

Once you have this code, slap it into the second argument of perl_parse():

  perl_parse(my_perl, xs_init, argc, my_argv, NULL);

Then compile:

  % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`
  % interp
  use Socket;
  use SomeDynamicallyLoadedModule;
  print "Now I can use extensions!\n";

ExtUtils::Embed can also automate writing the xs_init glue code.

  % perl -MExtUtils::Embed -e xsinit -- -o perlxsi.c
  % cc -c perlxsi.c `perl -MExtUtils::Embed -e ccopts`
  % cc -c interp.c `perl -MExtUtils::Embed -e ccopts`
  % cc -o interp perlxsi.o interp.o `perl -MExtUtils::Embed -e ldopts`

Consult perlxs, perlguts, and perlapi for more details.

Hiding Perl_

To completely hide the short forms of the Perl public API, add -DPERL_NO_SHORT_NAMES to the compilation flags. This means that, for example, instead of writing

  warn("%d bottles of beer on the wall", bottlecount);

you will have to write the explicit full form

  Perl_warn(aTHX_ "%d bottles of beer on the wall", bottlecount);

(See Background and PERL_IMPLICIT_CONTEXT in perlguts for the explanation of the aTHX_.) Hiding the short forms is very useful for avoiding all sorts of nasty (C preprocessor or otherwise) conflicts with other software packages. (Perl defines about 2400 APIs with these short names, give or take a few hundred, so there certainly is room for conflict.)

MORAL

You can sometimes write faster code in C, but you can always write code faster in Perl. Because you can use each from the other, combine them as you wish.

AUTHOR

Jon Orwant <orwant@media.mit.edu> and Doug MacEachern <dougm@covalent.net>, with small contributions from Tim Bunce, Tom Christiansen, Guy Decoux, Hallvard Furuseth, Dov Grobgeld, and Ilya Zakharevich.

Doug MacEachern has an article on embedding in Volume 1, Issue 4 of The Perl Journal ( http://www.tpj.com/ ). Doug is also the developer of the most widely-used Perl embedding: the mod_perl system (perl.apache.org), which embeds Perl in the Apache web server. Oracle, Binary Evolution, ActiveState, and Ben Sugars's nsapi_perl have used this model for Oracle, Netscape and Internet Information Server Perl plugins.

COPYRIGHT

Copyright (C) 1995, 1996, 1997, 1998 Doug MacEachern and Jon Orwant. All Rights Reserved.

This document may be distributed under the same terms as Perl itself.

 

perlexperiment

Perl 5 version 18.2 documentation

NAME

perlexperiment - A listing of experimental features in Perl

DESCRIPTION

This document lists the current and past experimental features in the perl core. Although all of these are documented with their appropriate topics, this succinct listing gives you an overview and basic facts about their status.

So far we've merely tried to find and list the experimental features and infer their inception, versions, etc. There's a lot of speculation here.

Current experiments

  • -Dusemultiplicity -Duseithreads

    Introduced in Perl 5.6.0

  • Long Doubles Still Don't Work In Solaris

    Introduced in Perl 5.7.0

  • our can now have an experimental optional attribute unique

    Introduced in Perl 5.8.0

    Deprecated in Perl 5.10.0

  • Linux abstract Unix domain sockets

    Introduced in Perl 5.9.2

    See also Socket

  • Pod::HTML2Pod
  • Pod::PXML
  • The :pop IO pseudolayer

    See also perlrun

  • The :win32 IO pseudolayer

    See also perlrun

  • MLDBM

    See also perldsc

  • internal functions with M flag

    See also perlguts

  • lex_start API

    Introduced in Perl 5.13.7

  • internal API for %^H

    Introduced in Perl 5.13.7

    See also cophh_ in perlapi.

  • alloccopstash

    Introduced in Perl 5.18.0

  • av_create_and_push
  • av_create_and_unshift_one
  • cop_store_label

    Introduced in Perl 5.16.0

  • PL_keyword_plugin
  • gv_fetchmethod_*_flags

    Introduced in Perl 5.16.0

  • hv_iternext_flags
  • lex_bufutf8
  • lex_discard_to
  • lex_grow_linestr
  • lex_next_chunk
  • lex_peek_unichar
  • lex_read_space
  • lex_read_to
  • lex_read_unichar
  • lex_stuff_pv
  • lex_stuff_pvn
  • lex_stuff_pvs
  • lex_stuff_sv
  • lex_unstuff
  • op_scope
  • op_lvalue
  • parse_fullstmt
  • parse_stmtseq
  • PL_parser->bufend
  • PL_parser->bufptr
  • PL_parser->linestart
  • PL_parser->linestr
  • Perl_signbit
  • pad_findmy
  • sv_utf8_decode
  • sv_utf8_downgrade
  • bytes_from_utf8
  • bytes_to_utf8
  • utf8_to_bytes
  • Lvalue subroutines

    Introduced in Perl 5.6.0

    See also perlsub

  • There is an installhtml target in the Makefile.
  • Unicode in Perl on EBCDIC
  • (?{code})

    See also perlre

  • (??{ code })

    See also perlre

  • Smart match (~~ )

    Introduced in Perl 5.10.0

    Modified in Perl 5.10.1, 5.12.0

  • Lexical $_

    Introduced in Perl 5.10.0

  • Backtracking control verbs

    (*ACCEPT)

    Introduced in: Perl 5.10

    See also: Special Backtracking Control Verbs in perlre

  • Code expressions, conditional expressions, and independent expressions in regexes
  • gv_try_downgrade

    See also perlintern

  • Experimental Support for Sun Studio Compilers for Linux OS

    See also perllinux

  • Pluggable keywords

    See PL_keyword_plugin in perlapi for the mechanism.

    Introduced in: Perl 5.11.2

  • Array and hash container functions accept references

    Introduced in Perl 5.14.0

  • Lexical subroutines

    Introduced in: Perl 5.18

    See also: Lexical Subroutines in perlsub

  • Regular Expression Set Operations

    Introduced in: Perl 5.18

    See also: Extended Bracketed Character Classes in perlrecharclass
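
Two of the entries above, lvalue subroutines and lexical subroutines, can be tried in a few lines. This is only a sketch and not part of the original listing: it assumes Perl 5.18 or later, the names $setting, setting and double are made up for illustration, and the experimental warning is silenced explicitly because the feature is flagged experimental in this release.

```perl
use 5.018;
use strict;
use warnings;
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';

# An lvalue subroutine returns something that can be assigned to.
my $setting = 1;
sub setting :lvalue { $setting }
setting() = 5;          # assigns through the subroutine to $setting

# A lexical subroutine is visible only inside the enclosing scope.
my sub double { return 2 * $_[0] }
my $doubled = double(setting());

print "setting=$setting doubled=$doubled\n";
```

Because both features are listed as experimental here, their behavior may differ in other releases; see perlsub for the details that apply to your perl.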

Accepted features

These features were so wildly successful and played so well with others that we decided to remove their experimental status and admit them as full, stable features in the world of Perl, lavishing all the benefits and luxuries thereof. They are also awarded +5 Stability and +3 Charisma.

  • The \N regex character class

    The \N character class, not to be confused with the named character sequence \N{NAME} , denotes any non-newline character in a regular expression.

    Introduced in: Perl 5.12

  • fork() emulation

    Introduced in Perl 5.6.1

    See also perlfork

  • DB module

    Introduced in Perl 5.6.0

    See also perldebug, perldebtut

  • Weak references

    Introduced in Perl 5.6.0

  • Internal file glob

    Introduced in Perl 5.6.0

  • die accepts a reference

    Introduced in Perl 5.005

  • 64-bit support

    Introduced in Perl 5.005
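
A few of the accepted features can be seen together in a short script. This is an illustrative sketch rather than text from the original document; the variable names are invented, and it assumes Perl 5.12 or later (for \N) with the core Scalar::Util module available.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# \N matches any single character that is not a newline.
my $text = "first line\nsecond line\n";
my ($head) = $text =~ /^(\N+)/;

# die accepts a reference, so an exception can carry structured data.
my $exception = { code => 42, message => 'structured failure' };
my $caught;
eval { die $exception; 1 } or $caught = $@;

# A weak reference does not keep its target alive, which helps
# break reference cycles such as parent <-> child links.
my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };
weaken( $child->{parent} );

print "head: $head\n";
print "caught code: $caught->{code}\n";
print "parent link is weak\n" if isweak( $child->{parent} );
```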

Removed features

These features are no longer considered experimental and their functionality has disappeared. It's your own fault if you wrote production programs using these features after we explicitly told you not to (see perlpolicy).

  • legacy

    The experimental legacy pragma was swallowed by the feature pragma.

    Introduced in: 5.11.2

    Removed in: 5.11.3

  • Assertions

    The -A command line switch

    Introduced in Perl 5.9.0

    Removed in Perl 5.9.5

  • Test::Harness::Straps

    Moved from Perl 5.10.1 to CPAN

  • Getopt::Long options can now take multiple values at once (experimental)

    Getopt::Long upgraded to version 2.35

    Removed in Perl 5.8.8

  • The pseudo-hash data type

    Introduced in Perl 5.6.0

    Removed in Perl 5.9.0

  • 5.005-style threading

    Introduced in Perl 5.005

    Removed in Perl 5.10

  • perlcc

    Introduced in Perl 5.005

    Moved from Perl 5.9.0 to CPAN

AUTHORS

brian d foy <brian.d.foy@gmail.com>

Sébastien Aperghis-Tramoni <saper@cpan.org>

COPYRIGHT

Copyright 2010, brian d foy <brian.d.foy@gmail.com>

LICENSE

You can use and redistribute this document under the same terms as Perl itself.

 
perlfaq

NAME

perlfaq - frequently asked questions about Perl

DESCRIPTION

The perlfaq comprises several documents that answer the most commonly asked questions about Perl and Perl programming. It's divided by topic into nine major sections outlined in this document.

Where to find the perlfaq

The perlfaq is an evolving document. Read the latest version at http://learn.perl.org/faq/. It is also included in the standard Perl distribution.

How to use the perlfaq

The perldoc command line tool is part of the standard Perl distribution. To read the perlfaq:

    $ perldoc perlfaq

To search the perlfaq question headings:

    $ perldoc -q open

How to contribute to the perlfaq

Review https://github.com/perl-doc-cats/perlfaq/wiki. If you don't find your suggestion there, create an issue or pull request against https://github.com/perl-doc-cats/perlfaq.

Once approved, changes are merged into https://github.com/tpf/perlfaq, the repository which drives http://learn.perl.org/faq/, and they are distributed with the next Perl 5 release.

What if my question isn't answered in the FAQ?

Try the resources in perlfaq2.

TABLE OF CONTENTS

  • perlfaq1 - General Questions About Perl
  • perlfaq2 - Obtaining and Learning about Perl
  • perlfaq3 - Programming Tools
  • perlfaq4 - Data Manipulation
  • perlfaq5 - Files and Formats
  • perlfaq6 - Regular Expressions
  • perlfaq7 - General Perl Language Issues
  • perlfaq8 - System Interaction
  • perlfaq9 - Web, Email and Networking

THE QUESTIONS

perlfaq1: General Questions About Perl

This section of the FAQ answers very general, high-level questions about Perl.

  • What is Perl?

  • Who supports Perl? Who develops it? Why is it free?

  • Which version of Perl should I use?

  • What are Perl 4, Perl 5, or Perl 6?

  • What is Perl 6?

  • How stable is Perl?

  • Is Perl difficult to learn?

  • How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl?

  • Can I do [task] in Perl?

  • When shouldn't I program in Perl?

  • What's the difference between "perl" and "Perl"?

  • What is a JAPH?

  • How can I convince others to use Perl?

perlfaq2: Obtaining and Learning about Perl

This section of the FAQ answers questions about where to find source and documentation for Perl, support, and related matters.

  • What machines support Perl? Where do I get it?

  • How can I get a binary version of Perl?

  • I don't have a C compiler. How can I build my own Perl interpreter?

  • I copied the Perl binary from one machine to another, but scripts don't work.

  • I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work?

  • What modules and extensions are available for Perl? What is CPAN?

  • Where can I get information on Perl?

  • What is perl.com? Perl Mongers? pm.org? perl.org? cpan.org?

  • Where can I post questions?

  • Perl Books

  • Which magazines have Perl content?

  • Which Perl blogs should I read?

  • What mailing lists are there for Perl?

  • Where can I buy a commercial version of Perl?

  • Where do I send bug reports?

perlfaq3: Programming Tools

This section of the FAQ answers questions related to programmer tools and programming support.

  • How do I do (anything)?

  • How can I use Perl interactively?

  • How do I find which modules are installed on my system?

  • How do I debug my Perl programs?

  • How do I profile my Perl programs?

  • How do I cross-reference my Perl programs?

  • Is there a pretty-printer (formatter) for Perl?

  • Is there an IDE or Windows Perl Editor?

  • Where can I get Perl macros for vi?

  • Where can I get perl-mode or cperl-mode for emacs?

  • How can I use curses with Perl?

  • How can I write a GUI (X, Tk, Gtk, etc.) in Perl?

  • How can I make my Perl program run faster?

  • How can I make my Perl program take less memory?

  • Is it safe to return a reference to local or lexical data?

  • How can I free an array or hash so my program shrinks?

  • How can I make my CGI script more efficient?

  • How can I hide the source for my Perl program?

  • How can I compile my Perl program into byte code or C?

  • How can I get #!perl to work on [MS-DOS,NT,...]?

  • Can I write useful Perl programs on the command line?

  • Why don't Perl one-liners work on my DOS/Mac/VMS system?

  • Where can I learn about CGI or Web programming in Perl?

  • Where can I learn about object-oriented Perl programming?

  • Where can I learn about linking C with Perl?

  • I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong?

  • When I tried to run my script, I got this message. What does it mean?

  • What's MakeMaker?

perlfaq4: Data Manipulation

This section of the FAQ answers questions related to manipulating numbers, dates, strings, arrays, hashes, and miscellaneous data issues.

  • Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?

  • Why is int() broken?

  • Why isn't my octal data interpreted correctly?

  • Does Perl have a round() function? What about ceil() and floor()? Trig functions?

  • How do I convert between numeric representations/bases/radixes?

  • Why doesn't & work the way I want it to?

  • How do I multiply matrices?

  • How do I perform an operation on a series of integers?

  • How can I output Roman numerals?

  • Why aren't my random numbers random?

  • How do I get a random number between X and Y?

  • How do I find the day or week of the year?

  • How do I find the current century or millennium?

  • How can I compare two dates and find the difference?

  • How can I take a string and turn it into epoch seconds?

  • How can I find the Julian Day?

  • How do I find yesterday's date?

  • Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant?

  • How do I validate input?

  • How do I unescape a string?

  • How do I remove consecutive pairs of characters?

  • How do I expand function calls in a string?

  • How do I find matching/nesting anything?

  • How do I reverse a string?

  • How do I expand tabs in a string?

  • How do I reformat a paragraph?

  • How can I access or change N characters of a string?

  • How do I change the Nth occurrence of something?

  • How can I count the number of occurrences of a substring within a string?

  • How do I capitalize all the words on one line?

  • How can I split a [character]-delimited string except when inside [character]?

  • How do I strip blank space from the beginning/end of a string?

  • How do I pad a string with blanks or pad a number with zeroes?

  • How do I extract selected columns from a string?

  • How do I find the soundex value of a string?

  • How can I expand variables in text strings?

  • What's wrong with always quoting "$vars"?

  • Why don't my <<HERE documents work?

  • What is the difference between a list and an array?

  • What is the difference between $array[1] and @array[1]?

  • How can I remove duplicate elements from a list or array?

  • How can I tell whether a certain element is contained in a list or array?

  • How do I compute the difference of two arrays? How do I compute the intersection of two arrays?

  • How do I test whether two arrays or hashes are equal?

  • How do I find the first array element for which a condition is true?

  • How do I handle linked lists?

  • How do I handle circular lists?

  • How do I shuffle an array randomly?

  • How do I process/modify each element of an array?

  • How do I select a random element from an array?

  • How do I permute N elements of a list?

  • How do I sort an array by (anything)?

  • How do I manipulate arrays of bits?

  • Why does defined() return true on empty arrays and hashes?

  • How do I process an entire hash?

  • How do I merge two hashes?

  • What happens if I add or remove keys from a hash while iterating over it?

  • How do I look up a hash element by value?

  • How can I know how many entries are in a hash?

  • How do I sort a hash (optionally by value instead of key)?

  • How can I always keep my hash sorted?

  • What's the difference between "delete" and "undef" with hashes?

  • Why don't my tied hashes make the defined/exists distinction?

  • How do I reset an each() operation part-way through?

  • How can I get the unique keys from two hashes?

  • How can I store a multidimensional array in a DBM file?

  • How can I make my hash remember the order I put elements into it?

  • Why does passing a subroutine an undefined element in a hash create it?

  • How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays?

  • How can I use a reference as a hash key?

  • How can I check if a key exists in a multilevel hash?

  • How can I prevent addition of unwanted keys into a hash?

  • How do I handle binary data correctly?

  • How do I determine whether a scalar is a number/whole/integer/float?

  • How do I keep persistent data across program calls?

  • How do I print out or copy a recursive data structure?

  • How do I define methods for every class/object?

  • How do I verify a credit card checksum?

  • How do I pack arrays of doubles or floats for XS code?

perlfaq5: Files and Formats

This section deals with I/O and the "f" issues: filehandles, flushing, formats, and footers.

  • How do I flush/unbuffer an output filehandle? Why must I do this?

  • How do I change, delete, or insert a line in a file, or append to the beginning of a file?

  • How do I count the number of lines in a file?

  • How do I delete the last N lines from a file?

  • How can I use Perl's -i option from within a program?

  • How can I copy a file?

  • How do I make a temporary file name?

  • How can I manipulate fixed-record-length files?

  • How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?

  • How can I use a filehandle indirectly?

  • How can I set up a footer format to be used with write()?

  • How can I write() into a string?

  • How can I open a filehandle to a string?

  • How can I output my numbers with commas added?

  • How can I translate tildes (~) in a filename?

  • How come when I open a file read-write it wipes it out?

  • Why do I sometimes get an "Argument list too long" when I use <*>?

  • How can I open a file with a leading ">" or trailing blanks?

  • How can I reliably rename a file?

  • How can I lock a file?

  • Why can't I just open(FH, ">file.lock")?

  • I still don't get locking. I just want to increment the number in the file. How can I do this?

  • All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?

  • How do I randomly update a binary file?

  • How do I get a file's timestamp in perl?

  • How do I set a file's timestamp in perl?

  • How do I print to more than one file at once?

  • How can I read in an entire file all at once?

  • How can I read in a file by paragraphs?

  • How can I read a single character from a file? From the keyboard?

  • How can I tell whether there's a character waiting on a filehandle?

  • How do I do a tail -f in perl?

  • How do I dup() a filehandle in Perl?

  • How do I close a file descriptor by number?

  • Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?

  • Why doesn't glob("*.*") get all the files?

  • Why does Perl let me delete read-only files? Why does -i clobber protected files? Isn't this a bug in Perl?

  • How do I select a random line from a file?

  • Why do I get weird spaces when I print an array of lines?

  • How do I traverse a directory tree?

  • How do I delete a directory tree?

  • How do I copy an entire directory?

perlfaq6: Regular Expressions

This section is surprisingly small because the rest of the FAQ is littered with answers involving regular expressions. For example, decoding a URL and checking whether something is a number can be handled with regular expressions, but those answers are found elsewhere in this document (in perlfaq9: "How do I decode or create those %-encodings on the web" and perlfaq4: "How do I determine whether a scalar is a number/whole/integer/float", to be precise).

  • How can I hope to use regular expressions without creating illegible and unmaintainable code?

  • I'm having trouble matching over more than one line. What's wrong?

  • How can I pull out lines between two patterns that are themselves on different lines?

  • How do I match XML, HTML, or other nasty, ugly things with a regex?

  • I put a regular expression into $/ but it didn't work. What's wrong?

  • How do I substitute case-insensitively on the LHS while preserving case on the RHS?

  • How can I make \w match national character sets?

  • How can I match a locale-smart version of /[a-zA-Z]/ ?

  • How can I quote a variable to use in a regex?

  • What is /o really for?

  • How do I use a regular expression to strip C-style comments from a file?

  • Can I use Perl regular expressions to match balanced text?

  • What does it mean that regexes are greedy? How can I get around it?

  • How do I process each word on each line?

  • How can I print out a word-frequency or line-frequency summary?

  • How can I do approximate matching?

  • How do I efficiently match many regular expressions at once?

  • Why don't word-boundary searches with \b work for me?

  • Why does using $&, $`, or $' slow my program down?

  • What good is \G in a regular expression?

  • Are Perl regexes DFAs or NFAs? Are they POSIX compliant?

  • What's wrong with using grep in a void context?

  • How can I match strings with multibyte characters?

  • How do I match a regular expression that's in a variable?

perlfaq7: General Perl Language Issues

This section deals with general Perl language issues that don't clearly fit into any of the other sections.

  • Can I get a BNF/yacc/RE for the Perl language?

  • What are all these $@%&* punctuation signs, and how do I know when to use them?

  • Do I always/never have to quote my strings or use semicolons and commas?

  • How do I skip some return values?

  • How do I temporarily block warnings?

  • What's an extension?

  • Why do Perl operators have different precedence than C operators?

  • How do I declare/create a structure?

  • How do I create a module?

  • How do I adopt or take over a module already on CPAN?

  • How do I create a class?

  • How can I tell if a variable is tainted?

  • What's a closure?

  • What is variable suicide and how can I prevent it?

  • How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}?

  • How do I create a static variable?

  • What's the difference between dynamic and lexical (static) scoping? Between local() and my()?

  • How can I access a dynamic variable while a similarly named lexical is in scope?

  • What's the difference between deep and shallow binding?

  • Why doesn't "my($foo) = <$fh>;" work right?

  • How do I redefine a builtin function, operator, or method?

  • What's the difference between calling a function as &foo and foo()?

  • How do I create a switch or case statement?

  • How can I catch accesses to undefined variables, functions, or methods?

  • Why can't a method included in this same file be found?

  • How can I find out my current or calling package?

  • How can I comment out a large block of Perl code?

  • How do I clear a package?

  • How can I use a variable as a variable name?

  • What does "bad interpreter" mean?

perlfaq8: System Interaction

This section of the Perl FAQ covers questions involving operating system interaction. Topics include interprocess communication (IPC), control over the user-interface (keyboard, screen and pointing devices), and most anything else not related to data manipulation.

  • How do I find out which operating system I'm running under?

  • How come exec() doesn't return?

  • How do I do fancy stuff with the keyboard/screen/mouse?

  • How do I print something out in color?

  • How do I read just one key without waiting for a return key?

  • How do I check whether input is ready on the keyboard?

  • How do I clear the screen?

  • How do I get the screen size?

  • How do I ask the user for a password?

  • How do I read and write the serial port?

  • How do I decode encrypted password files?

  • How do I start a process in the background?

  • How do I trap control characters/signals?

  • How do I modify the shadow password file on a Unix system?

  • How do I set the time and date?

  • How can I sleep() or alarm() for under a second?

  • How can I measure time under a second?

  • How can I do an atexit() or setjmp()/longjmp()? (Exception handling)

  • Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean?

  • How can I call my system's unique C functions from Perl?

  • Where do I get the include files to do ioctl() or syscall()?

  • Why do setuid perl scripts complain about kernel problems?

  • How can I open a pipe both to and from a command?

  • Why can't I get the output of a command with system()?

  • How can I capture STDERR from an external command?

  • Why doesn't open() return an error when a pipe open fails?

  • What's wrong with using backticks in a void context?

  • How can I call backticks without shell processing?

  • Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)?

  • How can I convert my shell script to perl?

  • Can I use perl to run a telnet or ftp session?

  • How can I write expect in Perl?

  • Is there a way to hide perl's command line from programs such as "ps"?

  • I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible?

  • How do I close a process's filehandle without waiting for it to complete?

  • How do I fork a daemon process?

  • How do I find out if I'm running interactively or not?

  • How do I timeout a slow event?

  • How do I set CPU limits?

  • How do I avoid zombies on a Unix system?

  • How do I use an SQL database?

  • How do I make a system() exit on control-C?

  • How do I open a file without blocking?

  • How do I tell the difference between errors from the shell and perl?

  • How do I install a module from CPAN?

  • What's the difference between require and use?

  • How do I keep my own module/library directory?

  • How do I add the directory my program lives in to the module/library search path?

  • How do I add a directory to my include path (@INC) at runtime?

  • What is socket.ph and where do I get it?

perlfaq9: Web, Email and Networking

This section deals with questions related to running web sites, sending and receiving email as well as general networking.

  • Should I use a web framework?

  • Which web framework should I use?

  • What is Plack and PSGI?

  • How do I remove HTML from a string?

  • How do I extract URLs?

  • How do I fetch an HTML file?

  • How do I automate an HTML form submission?

  • How do I decode or create those %-encodings on the web?

  • How do I redirect to another page?

  • How do I put a password on my web pages?

  • How do I make sure users can't enter values into a form that causes my CGI script to do bad things?

  • How do I parse a mail header?

  • How do I check a valid mail address?

  • How do I decode a MIME/BASE64 string?

  • How do I find the user's mail address?

  • How do I send email?

  • How do I use MIME to make an attachment to a mail message?

  • How do I read email?

  • How do I find out my hostname, domainname, or IP address?

  • How do I fetch/put an (S)FTP file?

  • How can I do RPC in Perl?

CREDITS

Tom Christiansen wrote the original perlfaq then expanded it with the help of Nat Torkington. brian d foy substantially edited and expanded the perlfaq. perlfaq-workers and others have also supplied feedback, patches and corrections over the years.

AUTHOR AND COPYRIGHT

Tom Christiansen wrote the original version of this document. brian d foy <bdfoy@cpan.org> wrote this version. See the individual perlfaq documents for additional copyright information.

This document is available under the same terms as Perl itself. Code examples in all the perlfaq documents are in the public domain. Use them as you see fit (and at your own risk with no warranty from anyone).

 
perlfaq1

NAME

perlfaq1 - General Questions About Perl

DESCRIPTION

This section of the FAQ answers very general, high-level questions about Perl.

What is Perl?

Perl is a high-level programming language with an eclectic heritage written by Larry Wall and a cast of thousands.

Perl's process, file, and text manipulation facilities make it particularly well-suited for tasks involving quick prototyping, system utilities, software tools, system management tasks, database access, graphical programming, networking, and web programming.

Perl derives from the ubiquitous C programming language and to a lesser extent from sed, awk, the Unix shell, and many other tools and languages.

These strengths make it especially popular with web developers and system administrators. Mathematicians, geneticists, journalists, managers and many other people also use Perl.

Who supports Perl? Who develops it? Why is it free?

The original culture of the pre-populist Internet and the deeply-held beliefs of Perl's author, Larry Wall, gave rise to the free and open distribution policy of Perl. Perl is supported by its users. The core, the standard Perl library, the optional modules, and the documentation you're reading now were all written by volunteers.

The core development team (known as the Perl Porters) are a group of highly altruistic individuals committed to producing better software for free than you could hope to purchase for money. You may snoop on pending developments via the archives or read the faq, or you can subscribe to the mailing list by sending perl5-porters-subscribe@perl.org a subscription request (an empty message with no subject is fine).

While the GNU project includes Perl in its distributions, there's no such thing as "GNU Perl". Perl is not produced nor maintained by the Free Software Foundation. Perl's licensing terms are also more open than GNU software's tend to be.

You can get commercial support of Perl if you wish, although for most users the informal support will more than suffice. See the answer to "Where can I buy a commercial version of Perl?" for more information.

Which version of Perl should I use?

(contributed by brian d foy)

This is often a matter of opinion and taste, and there isn't any one answer that fits everyone. In general, you want to use either the current stable release, or the stable release immediately prior to that one. Currently, those are perl5.14.x and perl5.12.x, respectively.

Beyond that, you have to consider several things and decide which is best for you.

  • If things aren't broken, upgrading perl may break them (or at least issue new warnings).

  • The latest versions of perl have more bug fixes.

  • The Perl community is geared toward supporting the most recent releases, so you'll have an easier time finding help for those.

  • Versions prior to perl5.004 had serious security problems with buffer overflows, and in some cases have CERT advisories (for instance, http://www.cert.org/advisories/CA-1997-17.html ).

  • The latest versions are probably the least widely deployed and the least widely tested, so if you are risk averse you may want to wait a few months after their release and see what problems others have.

  • The immediately preceding releases (e.g. perl5.8.x) are usually maintained for a while, although not at the same level as the current releases.

  • No one is actively supporting Perl 4. Ten years ago it was a dead camel carcass (according to this document). Now it's barely a skeleton as its whitewashed bones have fractured or eroded.

  • The current leading implementation of Perl 6, Rakudo, released a "useful, usable, 'early adopter'" distribution of Perl 6 (called Rakudo Star) in July of 2010. Please see http://rakudo.org/ for more information.

  • There are really two tracks of perl development: a maintenance version and an experimental version. The maintenance versions are stable, and have an even number as the minor release (e.g. perl5.10.x, where 10 is the minor release). The experimental versions may include features that don't make it into the stable versions, and have an odd number as the minor release (e.g. perl5.9.x, where 9 is the minor release).
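
The even/odd convention in the last item can be checked against whatever interpreter you are running. The following is a minimal sketch, not part of the original FAQ: the variable names are illustrative, and it assumes a perl new enough that $^V is a version object (5.10 or later).

```perl
use strict;
use warnings;

# $^V holds the version of the running interpreter; "%vd" formats it
# as a dotted string such as "5.18.2".
my $version = sprintf "%vd", $^V;

# The second component is the minor release discussed above.
my ( undef, $minor ) = split /\./, $version;

# Even minor numbers mark maintenance releases, odd ones experimental releases.
my $track = $minor % 2 ? 'experimental' : 'maintenance';

print "perl $version is on the $track track\n";
```

Running `perl -v` reports the same version information from the command line.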

What are Perl 4, Perl 5, or Perl 6?

In short, Perl 4 is the parent to both Perl 5 and Perl 6. Perl 5 is the older sibling, and though they are different languages, someone who knows one will spot many similarities in the other.

The number after Perl (i.e. the 5 after Perl 5) is the major release of the perl interpreter as well as the version of the language. Each major version has significant differences that earlier versions cannot support.

The current major release of Perl is Perl 5, first released in 1994. It can run scripts from the previous major release, Perl 4 (March 1991), but has significant differences.

Perl 6 is a reinvention of Perl; it is a language in the same lineage but not compatible. The two are complementary, not mutually exclusive. Perl 6 is not meant to replace Perl 5, and vice versa. See What is Perl 6? below to find out more.

See perlhist for a history of Perl revisions.

What is Perl 6?

Perl 6 was originally described as the community's rewrite of Perl 5. Development started in 2002; syntax and design work continue to this day. As the language has evolved, it has become clear that it is a separate language, incompatible with Perl 5 but in the same language family.

Contrary to popular belief, Perl 6 and Perl 5 peacefully coexist with one another. Perl 6 has proven to be a fascinating source of ideas for those using Perl 5 (the Moose object system is a well-known example). There is overlap in the communities, and this overlap fosters the tradition of sharing and borrowing that has been instrumental to Perl's success. The current leading implementation of Perl 6 is Rakudo, and you can learn more about it at http://rakudo.org.

If you want to learn more about Perl 6, or have a desire to help in the crusade to make Perl a better place, then read the Perl 6 developers page at http://www.perl6.org/ and get involved.

"We're really serious about reinventing everything that needs reinventing." --Larry Wall

How stable is Perl?

Production releases, which incorporate bug fixes and new functionality, are widely tested before release. Since the 5.000 release, we have averaged about one production release per year.

The Perl development team occasionally make changes to the internal core of the language, but all possible efforts are made toward backward compatibility.

Is Perl difficult to learn?

No, Perl is easy to start learning--and easy to keep learning. It looks like most programming languages you're likely to have experience with, so if you've ever written a C program, an awk script, a shell script, or even a BASIC program, you're already partway there.

Most tasks only require a small subset of the Perl language. One of the guiding mottos for Perl development is "there's more than one way to do it" (TMTOWTDI, sometimes pronounced "tim toady"). Perl's learning curve is therefore shallow (easy to learn) and long (there's a whole lot you can do if you really want).

Finally, because Perl is frequently (but not always, and certainly not by definition) an interpreted language, you can write your programs and test them without an intermediate compilation step, allowing you to experiment and test/debug quickly and easily. This ease of experimentation flattens the learning curve even more.

Things that make Perl easier to learn: Unix experience, almost any kind of programming experience, an understanding of regular expressions, and the ability to understand other people's code. If there's something you need to do, then it's probably already been done, and a working example is usually available for free. Don't forget Perl modules, either. They're discussed in Part 3 of this FAQ, along with CPAN, which is discussed in Part 2.

How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl?

Perl can be used for almost any coding problem, even ones which require integrating specialist C code for extra speed. As with any tool it can be used well or badly. Perl has many strengths and a few weaknesses; precisely which areas are good and bad is often a personal choice.

When choosing a language you should also be influenced by the resources, testing culture and community which surround it.

For comparisons to a specific language it is often best to create a small project in both languages and compare the results. Make sure to use all the resources of each language, as a language is far more than just its syntax.

Can I do [task] in Perl?

Perl is flexible and extensible enough for you to use on virtually any task, from one-line file-processing tasks to large, elaborate systems.

For many people, Perl serves as a great replacement for shell scripting. For others, it serves as a convenient, high-level replacement for most of what they'd program in low-level languages like C or C++. It's ultimately up to you (and possibly your management) which tasks you'll use Perl for and which you won't.

If you have a library that provides an API, you can make any component of it available as just another Perl function or variable using a Perl extension written in C or C++ and dynamically linked into your main perl interpreter. You can also go the other direction, and write your main program in C or C++, and then link in some Perl code on the fly, to create a powerful application. See perlembed.

That said, there will always be small, focused, special-purpose languages dedicated to a specific problem domain that are simply more convenient for certain kinds of problems. Perl tries to be all things to all people, but nothing special to anyone. Examples of specialized languages that come to mind include prolog and matlab.

When shouldn't I program in Perl?

One good reason is when you already have an existing application written in another language that's all done (and done well), or you have an application language specifically designed for a certain task (e.g. prolog, make).

If you find that you need to speed up a specific part of a Perl application (not something you often need) you may want to use C, but you can access this from your Perl code with perlxs.

What's the difference between "perl" and "Perl"?

"Perl" is the name of the language. Only the "P" is capitalized. The name of the interpreter (the program which runs the Perl script) is "perl" with a lowercase "p".

You may or may not choose to follow this usage. But never write "PERL", because Perl is not an acronym.

What is a JAPH?

(contributed by brian d foy)

JAPH stands for "Just another Perl hacker,", which Randal Schwartz used to sign email and usenet messages starting in the late 1980s. He previously used the phrase with many subjects ("Just another x hacker,"), so to distinguish his JAPH, he started to write them as Perl programs:

    print "Just another Perl hacker,";

Other people picked up on this and started to write clever or obfuscated programs to produce the same output, spinning things quickly out of control while still providing hours of amusement for their creators and readers.

CPAN has several JAPH programs at http://www.cpan.org/misc/japh.

How can I convince others to use Perl?

(contributed by brian d foy)

Appeal to their self interest! If Perl is new (and thus scary) to them, find something that Perl can do to solve one of their problems. That might mean that Perl either saves them something (time, headaches, money) or gives them something (flexibility, power, testability).

In general, the benefit of a language is closely related to the skill of the people using that language. If you or your team can be faster, better, and stronger through Perl, you'll deliver more value. Remember, people often respond better to what they get out of it. If you run into resistance, figure out what those people get out of the other choice and how Perl might satisfy that requirement.

You don't have to worry about finding or paying for Perl; it's freely available and several popular operating systems come with Perl. Community support in places such as Perlmonks ( http://www.perlmonks.com ) and the various Perl mailing lists ( http://lists.perl.org ) means that you can usually get quick answers to your problems.

Finally, keep in mind that Perl might not be the right tool for every job. You're a much better advocate if your claims are reasonable and grounded in reality. Dogmatically advocating anything tends to make people discount your message. Be honest about possible disadvantages to your choice of Perl since any choice has trade-offs.

You might find these links useful:

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples here are in the public domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required.

 
perlfaq2

NAME

perlfaq2 - Obtaining and Learning about Perl

DESCRIPTION

This section of the FAQ answers questions about where to find source and documentation for Perl, support, and related matters.

What machines support Perl? Where do I get it?

The standard release of Perl (the one maintained by the Perl development team) is distributed only in source code form. You can find the latest releases at http://www.cpan.org/src/.

Perl builds and runs on a bewildering number of platforms. Virtually all known and current Unix derivatives are supported (perl's native platform), as are other systems like VMS, DOS, OS/2, Windows, QNX, BeOS, OS X, MPE/iX and the Amiga.

Binary distributions for some proprietary platforms can be found in the http://www.cpan.org/ports/ directory. Because these are not part of the standard distribution, they may and in fact do differ from the base perl port in a variety of ways. You'll have to check their respective release notes to see just what the differences are. These differences can be either positive (e.g. extensions for the features of the particular platform that are not supported in the source release of perl) or negative (e.g. they might be based upon a less current source release of perl).

How can I get a binary version of Perl?

See CPAN Ports

I don't have a C compiler. How can I build my own Perl interpreter?

For Windows, use a binary version of Perl; Strawberry Perl and ActivePerl come with a bundled C compiler.

Otherwise if you really do want to build Perl, you need to get a binary version of gcc for your system first. Use a search engine to find out how to do this for your operating system.

I copied the Perl binary from one machine to another, but scripts don't work.

That's probably because you forgot libraries, or library paths differ. You really should build the whole distribution on the machine it will eventually live on, and then type make install. Most other approaches are doomed to failure.

One simple way to check that things are in the right place is to print out the hard-coded @INC that perl looks through for libraries:

    % perl -le 'print for @INC'

If this command lists any paths that don't exist on your system, then you may need to move the appropriate libraries to these locations, or create symbolic links, aliases, or shortcuts appropriately. @INC is also printed as part of the output of

    % perl -V
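
The same check can be scripted. What follows is only a minimal sketch, not something that ships with perl: the helper name missing_inc_dirs is made up for illustration. It reports every @INC entry that is not a directory on the current machine.

```perl
use strict;
use warnings;

# Hypothetical helper: return the entries that are plain strings
# (not code-ref hooks) and do not exist as directories here.
sub missing_inc_dirs {
    return grep { !ref($_) && !-d $_ } @_;
}

print "missing: $_\n" for missing_inc_dirs(@INC);
```

If it prints nothing, every directory in @INC exists; that does not prove the libraries inside them are complete.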

You might also want to check out How do I keep my own module/library directory? in perlfaq8.

I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work?

Read the INSTALL file, which is part of the source distribution. It describes in detail how to cope with most idiosyncrasies that the Configure script can't work around for any given system or architecture.

What modules and extensions are available for Perl? What is CPAN?

CPAN stands for Comprehensive Perl Archive Network, a multi-gigabyte archive replicated on hundreds of machines all over the world. CPAN contains tens of thousands of modules and extensions, source code and documentation, designed for everything from commercial database interfaces to keyboard/screen control and running large web sites.

You can search CPAN on http://metacpan.org or http://search.cpan.org/.

The master web site for CPAN is http://www.cpan.org/; http://www.cpan.org/SITES.html lists all mirrors.

See the CPAN FAQ at http://www.cpan.org/misc/cpan-faq.html for answers to the most frequently asked questions about CPAN.

The Task::Kensho module has a list of recommended modules which you should review as a good starting point.

Where can I get information on Perl?

The complete Perl documentation is available with the Perl distribution. If you have Perl installed locally, you probably have the documentation installed as well: type perldoc perl in a terminal or view online.

(Some operating system distributions may ship the documentation in a different package; for instance, on Debian, you need to install the perl-doc package.)

Many good books have been written about Perl--see the section later in perlfaq2 for more details.

What is perl.com? Perl Mongers? pm.org? perl.org? cpan.org?

Perl.com used to be part of the O'Reilly Network, a subsidiary of O'Reilly Media. Although it retains most of the original content from its O'Reilly Network days, it is now hosted by The Perl Foundation.

The Perl Foundation is an advocacy organization for the Perl language which maintains the web site http://www.perl.org/ as a general advocacy site for the Perl language. It uses the domain to provide general support services to the Perl community, including the hosting of mailing lists, web sites, and other services. There are also many other sub-domains for special topics like learning Perl and jobs in Perl, such as:

Perl Mongers uses the pm.org domain for services related to local Perl user groups, including the hosting of mailing lists and web sites. See the Perl Mongers web site for more information about joining, starting, or requesting services for a Perl user group.

CPAN, or the Comprehensive Perl Archive Network http://www.cpan.org/, is a replicated, worldwide repository of Perl software. See What is CPAN?.

Where can I post questions?

There are many Perl mailing lists for various topics; the beginners list in particular may be of use.

Other places to ask questions are the PerlMonks site and Stack Overflow.

Perl Books

There are many good books on Perl.

Which magazines have Perl content?

$foo Magazin is a German magazine dedicated to Perl ( http://www.foo-magazin.de ). The Perl-Zeitung is another German-language magazine, aimed at Perl beginners (see http://perl-zeitung.at.tf ).

Several Unix/Linux related magazines frequently include articles on Perl.

Which Perl blogs should I read?

Perl News covers some of the major events in the Perl world. Perl Weekly is a weekly e-mail (and RSS feed) of hand-picked Perl articles.

http://blogs.perl.org/ hosts many Perl blogs. There are also several blog aggregators: Perlsphere and IronMan are two of them.

What mailing lists are there for Perl?

A comprehensive list of Perl-related mailing lists can be found at http://lists.perl.org/

Where can I buy a commercial version of Perl?

Perl already is commercial software: it has a license that you can grab and carefully read to your manager. It is distributed in releases and comes in well-defined packages. There is a very large and supportive user community and an extensive literature.

If you still need commercial support, ActiveState offers this.

Where do I send bug reports?

(contributed by brian d foy)

First, ensure that you've found an actual bug. Second, ensure you've found an actual bug.

If you've found a bug with the perl interpreter or one of the modules in the standard library (those that come with Perl), you can use the perlbug utility that comes with Perl (>= 5.004). It collects information about your installation to include with your message, then sends the message to the right place.

To determine if a module came with your version of Perl, you can install and use the Module::CoreList module. It has the information about the modules (with their versions) included with each release of Perl.
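
As a rough illustration, assuming Module::CoreList is installed, the lookup can be done from a program. The helper name core_since is invented for this sketch; first_release is the Module::CoreList method that does the work.

```perl
use strict;
use warnings;
use Module::CoreList;

# Hypothetical helper: return the first perl release that shipped
# the named module, or undef if it has never been in the core.
sub core_since {
    my ($module) = @_;
    my $release = Module::CoreList->first_release($module);
    return $release;
}

for my $module (qw(Data::Dumper Business::ISBN)) {
    my $release = core_since($module);
    print "$module: ",
        defined $release ? "in core since perl $release" : "not a core module",
        "\n";
}
```

A module reported as "not a core module" came from CPAN or elsewhere, so its bugs belong with that module's own tracker rather than with perlbug.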

Every CPAN module has a bug tracker set up in RT, http://rt.cpan.org. You can submit bugs to RT either through its web interface or by email. To email a bug report, send it to bug-<distribution-name>@rt.cpan.org . For example, if you wanted to report a bug in Business::ISBN, you could send a message to bug-Business-ISBN@rt.cpan.org .
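
The address pattern above can be sketched in a few lines. The helper name rt_bug_address is hypothetical, and the sketch assumes the distribution name is simply the module name with "::" replaced by "-"; that holds for Business::ISBN but not for every module, so check the distribution name first.

```perl
use strict;
use warnings;

# Hypothetical helper: build the RT email address for a module,
# assuming its distribution is named after it ("::" becomes "-").
sub rt_bug_address {
    my ($module) = @_;
    (my $dist = $module) =~ s/::/-/g;
    return "bug-$dist\@rt.cpan.org";
}

print rt_bug_address('Business::ISBN'), "\n";   # bug-Business-ISBN@rt.cpan.org
```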

Some modules might have special reporting requirements, such as a GitHub or Google Code tracking system, so you should check the module documentation too.

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples here are in the public domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required.

 
perlfaq3

NAME

perlfaq3 - Programming Tools

DESCRIPTION

This section of the FAQ answers questions related to programmer tools and programming support.

How do I do (anything)?

Have you looked at CPAN (see perlfaq2)? The chances are that someone has already written a module that can solve your problem. Have you read the appropriate manpages? Here's a brief index:

  • Basics
  • Execution
  • Functions
  • Objects
    • perlref - Perl references and nested data structures
    • perlmod - Perl modules (packages and symbol tables)
    • perlobj - Perl objects
    • perltie - how to hide an object class in a simple variable
  • Data Structures
    • perlref - Perl references and nested data structures
    • perllol - Manipulating arrays of arrays in Perl
    • perldsc - Perl Data Structures Cookbook
  • Modules
    • perlmod - Perl modules (packages and symbol tables)
    • perlmodlib - constructing new Perl modules and finding existing ones
  • Regexes
    • perlre - Perl regular expressions
    • perlfunc - Perl builtin functions
    • perlop - Perl operators and precedence
    • perllocale - Perl locale handling (internationalization and localization)
  • Moving to perl5
  • Linking with C
    • perlxstut - Tutorial for writing XSUBs
    • perlxs - XS language reference manual
    • perlcall - Perl calling conventions from C
    • perlguts - Introduction to the Perl API
    • perlembed - how to embed perl in your C program
  • Various

    http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz (not a man-page but still useful, a collection of various essays on Perl techniques)

A crude table of contents for the Perl manpage set is found in perltoc.

How can I use Perl interactively?

The typical approach uses the Perl debugger, described in the perldebug(1) manpage, on an "empty" program, like this:

    perl -de 42

Now just type in any legal Perl code, and it will be immediately evaluated. You can also examine the symbol table, get stack backtraces, check variable values, set breakpoints, and other operations typically found in symbolic debuggers.

You can also use Devel::REPL which is an interactive shell for Perl, commonly known as a REPL - Read, Evaluate, Print, Loop. It provides various handy features.

How do I find which modules are installed on my system?

From the command line, you can use the cpan command's -l switch:

    $ cpan -l

You can also use cpan 's -a switch to create an autobundle file that CPAN.pm understands and can use to re-install every module:

    $ cpan -a

Inside a Perl program, you can use the ExtUtils::Installed module to show all installed distributions, although it can take a while to do its magic. The standard library which comes with Perl just shows up as "Perl" (although you can get those with Module::CoreList).

    use ExtUtils::Installed;

    my $inst    = ExtUtils::Installed->new();
    my @modules = $inst->modules();

If you want a list of all of the Perl module filenames, you can use File::Find::Rule:

    use File::Find::Rule;

    my @files = File::Find::Rule->
        extras({follow => 1})->
        file()->
        name( '*.pm' )->
        in( @INC )
        ;

If you do not have that module, you can do the same thing with File::Find which is part of the standard library:

    use File::Find;

    my @files;

    find(
        {
            wanted => sub {
                push @files, $File::Find::fullname
                    if -f $File::Find::fullname && /\.pm$/
            },
            follow      => 1,
            follow_skip => 2,
        },
        @INC
    );

    print join "\n", @files;

If you simply need to check quickly to see if a module is available, you can check for its documentation. If you can read the documentation the module is most likely installed. If you cannot read the documentation, the module might not have any (in rare cases):

    $ perldoc Module::Name

You can also try to include the module in a one-liner to see if perl finds it:

    $ perl -MModule::Name -e1

(If you don't receive a "Can't locate ... in @INC" error message, then Perl found the module name you asked for.)
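
The same test can be made from inside a running program by wrapping require in an eval. This is a minimal sketch; the helper name module_available and the deliberately bogus module name are placeholders for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded.
# The module name is turned into a file name so that require()
# searches @INC for it at run time.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(File::Find No::Such::Module::Hopefully)) {
    printf "%-28s %s\n", $module,
        module_available($module) ? 'found' : 'not found';
}
```

Note that this actually loads the module when it is found, which is usually what you want just before using it.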

How do I debug my Perl programs?

(contributed by brian d foy)

Before you do anything else, you can help yourself by ensuring that you let Perl tell you about problem areas in your code. By turning on warnings and strictures, you can head off many problems before they get too big. You can find out more about these in strict and warnings.

    #!/usr/bin/perl
    use strict;
    use warnings;

Beyond that, the simplest debugger is the print function. Use it to look at values as you run your program:

    print STDERR "The value is [$value]\n";

The Data::Dumper module can pretty-print Perl data structures:

    use Data::Dumper qw( Dumper );
    print STDERR "The hash is " . Dumper( \%hash ) . "\n";

Perl comes with an interactive debugger, which you can start with the -d switch. It's fully explained in perldebug.

If you'd like a graphical user interface and you have Tk, you can use ptkdb . It's on CPAN and available for free.

If you need something much more sophisticated and controllable, Leon Brocard's Devel::ebug (which you can call with the -D switch as -Debug ) gives you the programmatic hooks into everything you need to write your own (without too much pain and suffering).

You can also use a commercial debugger such as Affrus (Mac OS X), Komodo from ActiveState (Windows and Mac OS X), or EPIC (most platforms).

How do I profile my Perl programs?

(contributed by brian d foy, updated Fri Jul 25 12:22:26 PDT 2008)

The Devel namespace has several modules which you can use to profile your Perl programs.

The Devel::NYTProf (New York Times Profiler) does both statement and subroutine profiling. It's available from CPAN and you also invoke it with the -d switch:

    perl -d:NYTProf some_perl.pl

It creates a database of the profile information that you can turn into reports. The nytprofhtml command turns the data into an HTML report similar to the Devel::Cover report:

    nytprofhtml

You might also be interested in using the Benchmark module to measure and compare code snippets.
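
As a small, hedged illustration of Benchmark, the sketch below times two ways of building the same string. The sub names and the helper time_variants are invented for this example; timethese and cmpthese are the Benchmark functions doing the measuring and reporting.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Two ways to build the same string; the names are illustrative only.
my @words = ('perl') x 100;

sub with_join   { return join '', @words }
sub with_concat { my $s = ''; $s .= $_ for @words; return $s }

# Hypothetical helper: run each sub $count times and return the
# results hash; the 'none' style suppresses the per-test timing lines.
sub time_variants {
    my ($count) = @_;
    return timethese( $count, {
        join   => \&with_join,
        concat => \&with_concat,
    }, 'none' );
}

# Print a comparison chart of the two variants.
cmpthese( time_variants(50_000) );
```

The numbers vary from machine to machine and run to run, so treat the chart as a rough comparison and use a profiler to decide which code is worth measuring in the first place.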

You can read more about profiling in Programming Perl, chapter 20, or Mastering Perl, chapter 5.

perldebguts documents creating a custom debugger if you need to create a special sort of profiler. brian d foy describes the process in The Perl Journal, "Creating a Perl Debugger", http://www.ddj.com/184404522 , and "Profiling in Perl" http://www.ddj.com/184404580 .

Perl.com has two interesting articles on profiling: "Profiling Perl", by Simon Cozens, http://www.perl.com/lpt/a/850 and "Debugging and Profiling mod_perl Applications", by Frank Wiles, http://www.perl.com/pub/a/2006/02/09/debug_mod_perl.html .

Randal L. Schwartz writes about profiling in "Speeding up Your Perl Programs" for Unix Review, http://www.stonehenge.com/merlyn/UnixReview/col49.html , and "Profiling in Template Toolkit via Overriding" for Linux Magazine, http://www.stonehenge.com/merlyn/LinuxMag/col75.html .

How do I cross-reference my Perl programs?

The B::Xref module can be used to generate cross-reference reports for Perl programs.

    perl -MO=Xref[,OPTIONS] scriptname.plx

Is there a pretty-printer (formatter) for Perl?

Perl::Tidy comes with a perl script, perltidy, which indents and reformats Perl scripts to make them easier to read by trying to follow the rules of perlstyle. If you write Perl, or spend much time reading Perl, you will probably find it useful.

Of course, if you simply follow the guidelines in perlstyle, you shouldn't need to reformat. The habit of formatting your code as you write it will help prevent bugs. Your editor can and should help you with this. The perl-mode or newer cperl-mode for emacs can provide remarkable amounts of help with most (but not all) code, and even less programmable editors can provide significant assistance. Tom Christiansen and many other VI users swear by the following settings in vi and its clones:

    set ai sw=4
    map! ^O {^M}^[O^T

Put that in your .exrc file (replacing the caret characters with control characters) and away you go. In insert mode, ^T is for indenting, ^D is for undenting, and ^O is for blockdenting--as it were. A more complete example, with comments, can be found at http://www.cpan.org/authors/id/TOMC/scripts/toms.exrc.gz

Is there an IDE or Windows Perl Editor?

Perl programs are just plain text, so any editor will do.

If you're on Unix, you already have an IDE--Unix itself. The Unix philosophy is the philosophy of several small tools that each do one thing and do it well. It's like a carpenter's toolbox.

If you want an IDE, check the following (in alphabetical order, not order of preference):

For editors: if you're on Unix you probably have vi or a vi clone already, and possibly an emacs too, so you may not need to download anything. In any emacs the cperl-mode (M-x cperl-mode) gives you perhaps the best available Perl editing mode in any editor.

If you are using Windows, you can use any editor that lets you work with plain text, such as NotePad or WordPad. Word processors, such as Microsoft Word or WordPerfect, typically do not work since they insert all sorts of behind-the-scenes information, although some allow you to save files as "Text Only". You can also download text editors designed specifically for programming, such as Textpad ( http://www.textpad.com/ ) and UltraEdit ( http://www.ultraedit.com/ ), among others.

If you are using MacOS, the same concerns apply. MacPerl (for Classic environments) comes with a simple editor. Popular external editors are BBEdit ( http://www.bbedit.com/ ) or Alpha ( http://www.his.com/~jguyer/Alpha/Alpha8.html ). MacOS X users can use Unix editors as well.

or a vi clone such as

The following are Win32 multilanguage editor/IDEs that support Perl:

There is also a toyedit Text widget based editor written in Perl that is distributed with the Tk module on CPAN. The ptkdb ( http://ptkdb.sourceforge.net/ ) is a Perl/Tk-based debugger that acts as a development environment of sorts. Perl Composer ( http://perlcomposer.sourceforge.net/ ) is an IDE for Perl/Tk GUI creation.

In addition to an editor/IDE you might be interested in a more powerful shell environment for Win32. Your options include

MKS and U/WIN are commercial (U/WIN is free for educational and research purposes), Cygwin is covered by the GNU General Public License (but that shouldn't matter for Perl use). The Cygwin, MKS, and U/WIN all contain (in addition to the shells) a comprehensive set of standard Unix toolkit utilities.

If you're transferring text files between Unix and Windows using FTP be sure to transfer them in ASCII mode so the ends of lines are appropriately converted.

On Mac OS the MacPerl Application comes with a simple 32k text editor that behaves like a rudimentary IDE. In contrast to the MacPerl Application the MPW Perl tool can make use of the MPW Shell itself as an editor (with no 32k limit).

Where can I get Perl macros for vi?

For a complete version of Tom Christiansen's vi configuration file, see http://www.cpan.org/authors/Tom_Christiansen/scripts/toms.exrc.gz , the standard benchmark file for vi emulators. The file runs best with nvi, the current version of vi out of Berkeley, which incidentally can be built with an embedded Perl interpreter--see http://www.cpan.org/src/misc/ .

Where can I get perl-mode or cperl-mode for emacs?

Since Emacs version 19 patchlevel 22 or so, there have been both a perl-mode.el and support for the Perl debugger built in. These should come with the standard Emacs 19 distribution.

Note that the perl-mode of emacs will have fits with "main'foo" (single quote), and mess up the indentation and highlighting. You are probably using "main::foo" in new Perl code anyway, so this shouldn't be an issue.

For CPerlMode, see http://www.emacswiki.org/cgi-bin/wiki/CPerlMode

How can I use curses with Perl?

The Curses module from CPAN provides a dynamically loadable object module interface to a curses library. A small demo can be found at the directory http://www.cpan.org/authors/Tom_Christiansen/scripts/rep.gz ; this program repeats a command and updates the screen as needed, rendering rep ps axu similar to top.

How can I write a GUI (X, Tk, Gtk, etc.) in Perl?

(contributed by Ben Morrow)

There are a number of modules which let you write GUIs in Perl. Most GUI toolkits have a perl interface: an incomplete list follows.

  • Tk

    This works under Unix and Windows, and the current version doesn't look half as bad under Windows as it used to. Some of the GUI elements still don't 'feel' quite right, though. The interface is very natural and 'perlish', making it easy to use in small scripts that just need a simple GUI. It hasn't been updated in a while.

  • Wx

    This is a Perl binding for the cross-platform wxWidgets toolkit ( http://www.wxwidgets.org ). It works under Unix, Win32 and Mac OS X, using native widgets (Gtk under Unix). The interface follows the C++ interface closely, but the documentation is a little sparse for someone who doesn't know the library, mostly just referring you to the C++ documentation.

  • Gtk and Gtk2

    These are Perl bindings for the Gtk toolkit ( http://www.gtk.org ). The interface changed significantly between versions 1 and 2 so they have separate Perl modules. It runs under Unix, Win32 and Mac OS X (currently it requires an X server on Mac OS, but a 'native' port is underway), and the widgets look the same on every platform: i.e., they don't match the native widgets. As with Wx, the Perl bindings follow the C API closely, and the documentation requires you to read the C documentation to understand it.

  • Win32::GUI

    This provides access to most of the Win32 GUI widgets from Perl. Obviously, it only runs under Win32, and uses native widgets. The Perl interface doesn't really follow the C interface: it's been made more Perlish, and the documentation is pretty good. More advanced stuff may require familiarity with the C Win32 APIs, or reference to MSDN.

  • CamelBones

    CamelBones ( http://camelbones.sourceforge.net ) is a Perl interface to Mac OS X's Cocoa GUI toolkit, and as such can be used to produce native GUIs on Mac OS X. It's not on CPAN, as it requires frameworks that CPAN.pm doesn't know how to install, but installation is via the standard OSX package installer. The Perl API is, again, very close to the ObjC API it's wrapping, and the documentation just tells you how to translate from one to the other.

  • Qt

    There is a Perl interface to TrollTech's Qt toolkit, but it does not appear to be maintained.

  • Athena

    Sx is an interface to the Athena widget set which comes with X, but again it appears not to be much used nowadays.

How can I make my Perl program run faster?

The best way to do this is to come up with a better algorithm. This can often make a dramatic difference. Jon Bentley's book Programming Pearls (that's not a misspelling!) has some good tips on optimization, too. Advice on benchmarking boils down to: benchmark and profile to make sure you're optimizing the right part, look for better algorithms instead of microtuning your code, and when all else fails consider just buying faster hardware. You will probably want to read the answer to the earlier question "How do I profile my Perl programs?" if you haven't done so already.
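
To make "a better algorithm" concrete, here is one common case, offered as a sketch rather than a recipe: checking whether values appear in a list. The sub names are made up for the example. The first version rescans the list for every query; the second builds a hash once and then does a single lookup per query.

```perl
use strict;
use warnings;

# Slow: scans the whole list for every query, so the work grows
# with (number of queries) x (length of list).
sub seen_by_scan {
    my ($list, $queries) = @_;
    return [ map { my $q = $_; ( grep { $_ eq $q } @$list ) ? 1 : 0 } @$queries ];
}

# Faster: build a hash once, then each query is a single lookup.
sub seen_by_hash {
    my ($list, $queries) = @_;
    my %in_list = map { $_ => 1 } @$list;
    return [ map { $in_list{$_} ? 1 : 0 } @$queries ];
}

my @list    = map { "item$_" } 1 .. 1000;
my @queries = qw(item1 item500 nothing);
print "@{ seen_by_hash(\@list, \@queries) }\n";   # 1 1 0
```

Both subs return the same answers; only the amount of work differs, and the hash costs extra memory, which is the usual trade.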

A different approach is to autoload seldom-used Perl code. See the AutoSplit and AutoLoader modules in the standard distribution for that. Or you could locate the bottleneck and think about writing just that part in C, the way we used to take bottlenecks in C code and write them in assembler. Similar to rewriting in C, modules that have critical sections can be written in C (for instance, the PDL module from CPAN).

If you're currently linking your perl executable to a shared libc.so, you can often gain a 10-25% performance benefit by rebuilding it to link with a static libc.a instead. This will make a bigger perl executable, but your Perl programs (and programmers) may thank you for it. See the INSTALL file in the source distribution for more information.

The undump program was an ancient attempt to speed up Perl programs by storing the already-compiled form to disk. This is no longer a viable option, as it only worked on a few architectures, and wasn't a good solution anyway.

How can I make my Perl program take less memory?

When it comes to time-space tradeoffs, Perl nearly always prefers to throw memory at a problem. Scalars in Perl use more memory than strings in C, arrays take more than that, and hashes use even more. While there's still a lot to be done, recent releases have been addressing these issues. For example, as of 5.004, duplicate hash keys are shared amongst all hashes using them, so require no reallocation.

In some cases, using substr() or vec() to simulate arrays can be highly beneficial. For example, an array of a thousand booleans will take at least 20,000 bytes of space, but it can be turned into one 125-byte bit vector--a considerable memory savings. The standard Tie::SubstrHash module can also help for certain types of data structure. If you're working with specialist data structures (matrices, for instance) modules that implement these in C may use less memory than equivalent Perl modules.
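
The bit-vector idea can be sketched with the built-in vec() function. The flag positions used here are arbitrary examples, not anything from a real program.

```perl
use strict;
use warnings;

# One plain string holds all 1000 flags, one bit each.
my $flags = '';
vec( $flags, 999, 1 ) = 0;    # touching the last bit pre-extends the string

vec( $flags, 42,  1 ) = 1;    # set flag 42
vec( $flags, 999, 1 ) = 1;    # set flag 999

printf "bytes used: %d\n", length $flags;    # bytes used: 125
print "flag 42 is set\n" if vec( $flags, 42, 1 );

# unpack with a %32b* template sums the bits, i.e. counts the set flags.
my $set_count = unpack '%32b*', $flags;
print "flags set: $set_count\n";             # flags set: 2
```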

Another thing to try is learning whether your Perl was compiled with the system malloc or with Perl's builtin malloc. Whichever one it is, try using the other one and see whether this makes a difference. Information about malloc is in the INSTALL file in the source distribution. You can find out whether you are using perl's malloc by typing perl -V:usemymalloc.

Of course, the best way to save memory is to not do anything to waste it in the first place. Good programming practices can go a long way toward this:

  • Don't slurp!

    Don't read an entire file into memory if you can process it line by line. Or more concretely, use a loop like this:

    1. #
    2. # Good Idea
    3. #
    4. while (my $line = <$file_handle>) {
    5. # ...
    6. }

    instead of this:

    1. #
    2. # Bad Idea
    3. #
    4. my @data = <$file_handle>;
    5. foreach (@data) {
    6. # ...
    7. }

    When the files you're processing are small, it doesn't much matter which way you do it, but it makes a huge difference when they start getting larger.

  • Use map and grep selectively

    Remember that both map and grep expect a LIST argument, so doing this:

    1. @wanted = grep {/pattern/} <$file_handle>;

    will cause the entire file to be slurped. For large files, it's better to loop:

    1. while (<$file_handle>) {
    2. push(@wanted, $_) if /pattern/;
    3. }
  • Avoid unnecessary quotes and stringification

    Don't quote large strings unless absolutely necessary:

    1. my $copy = "$large_string";

    makes 2 copies of $large_string (one for $copy and another for the quotes), whereas

    1. my $copy = $large_string;

    only makes one copy.

    Ditto for stringifying large arrays:

    1. {
    2. local $, = "\n";
    3. print @big_array;
    4. }

    is much more memory-efficient than either

    1. print join "\n", @big_array;

    or

    1. {
    2. local $" = "\n";
    3. print "@big_array";
    4. }
  • Pass by reference

    Pass arrays and hashes by reference, not by value. For one thing, it's the only way to pass multiple lists or hashes (or both) in a single call/return. It also avoids creating a copy of all the contents. This requires some judgement, however, because any changes will be propagated back to the original data. If you really want to mangle (er, modify) a copy, you'll have to sacrifice the memory needed to make one.

  • Tie large variables to disk

    For "big" data stores (i.e. ones that exceed available memory) consider using one of the DB modules to store it on disk instead of in RAM. This will incur a penalty in access time, but that's probably better than causing your hard disk to thrash due to massive swapping.
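
To illustrate the last point, here is a minimal sketch that ties a hash to an SDBM file, the one DBM implementation that always ships with Perl. The temporary directory and the key names are assumptions made for the example; note too that SDBM limits each key/value pair to roughly 1k, so another DBM module may suit real data better.

```perl
use strict;
use warnings;
use Fcntl;                        # for O_RDWR and O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

# Example location only; a real program would use a permanent path.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/bigdata";

tie( my %store, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666 )
    or die "Cannot tie $file: $!";

# Each assignment goes to disk rather than into a growing in-memory hash.
$store{"key$_"} = $_ * 2 for 1 .. 1000;

my $value = $store{key500};
print "key500 => $value\n";       # key500 => 1000

my $count = keys %store;          # walks the file to count the entries
untie %store;
```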

Is it safe to return a reference to local or lexical data?

Yes. Perl's garbage collection system takes care of this so everything works out right.

    sub makeone {
        my @a = ( 1 .. 10 );
        return \@a;
    }
    for ( 1 .. 10 ) {
        push @many, makeone();
    }
    print $many[4][5], "\n";
    print "@many\n";

How can I free an array or hash so my program shrinks?

(contributed by Michael Carman)

You usually can't. Memory allocated to lexicals (i.e. my() variables) cannot be reclaimed or reused even if they go out of scope. It is reserved in case the variables come back into scope. Memory allocated to global variables can be reused (within your program) by using undef() and/or delete().

On most operating systems, memory allocated to a program can never be returned to the system. That's why long-running programs sometimes re-exec themselves. Some operating systems (notably, systems that use mmap(2) for allocating large chunks of memory) can reclaim memory that is no longer used, but on such systems, perl must be configured and compiled to use the OS's malloc, not perl's.

In general, memory allocation and de-allocation isn't something you can or should be worrying about much in Perl.

See also "How can I make my Perl program take less memory?"

How can I make my CGI script more efficient?

Beyond the normal measures described to make general Perl programs faster or smaller, a CGI program has additional issues. It may be run several times per second. Given that each time it runs it will need to be re-compiled and will often allocate a megabyte or more of system memory, this can be a killer. Compiling into C isn't going to help you because the process start-up overhead is where the bottleneck is.

There are three popular ways to avoid this overhead. One solution involves running the Apache HTTP server (available from http://www.apache.org/ ) with either of the mod_perl or mod_fastcgi plugin modules.

With mod_perl and the Apache::Registry module (distributed with mod_perl), httpd will run with an embedded Perl interpreter which pre-compiles your script and then executes it within the same address space without forking. The Apache extension also gives Perl access to the internal server API, so modules written in Perl can do just about anything a module written in C can. For more on mod_perl, see http://perl.apache.org/

With the FCGI module (from CPAN) and the mod_fastcgi module (available from http://www.fastcgi.com/ ) each of your Perl programs becomes a permanent CGI daemon process.

Finally, Plack is a Perl module and toolkit that contains PSGI middleware, helpers and adapters to web servers, allowing you to easily deploy scripts which can continue running, and provides flexibility with regards to which web server you use. It can allow existing CGI scripts to enjoy this flexibility and performance with minimal changes, or can be used along with modern Perl web frameworks to make writing and deploying web services with Perl a breeze.
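
For a feel of what such a persistent application looks like, here is a minimal sketch of a PSGI handler. It needs no modules to define or to exercise by hand; the path and the response text are invented for the example.

```perl
use strict;
use warnings;

# A PSGI application is a code reference that receives the request
# environment as a hash reference and returns [ status, headers, body ].
my $app = sub {
    my ($env) = @_;
    my $path = defined $env->{PATH_INFO} ? $env->{PATH_INFO} : '/';
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "Hello from $path\n" ],
    ];
};

# Call the handler directly with a hand-built environment to see the result.
my $res = $app->( { REQUEST_METHOD => 'GET', PATH_INFO => '/demo' } );
print "$res->[0] $res->[2][0]";    # 200 Hello from /demo
```

In a real deployment the file would typically be named app.psgi, end with $app as its last expression, and be started by a PSGI server such as plackup, which compiles it once and calls it for every request.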

These solutions can have far-reaching effects on your system and on the way you write your CGI programs, so investigate them with care.

See also http://www.cpan.org/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI/ .

How can I hide the source for my Perl program?

Delete it. :-) Seriously, there are a number of (mostly unsatisfactory) solutions with varying levels of "security".

First of all, however, you can't take away read permission, because the source code has to be readable in order to be compiled and interpreted. (That doesn't mean that a CGI script's source is readable by people on the web, though--only by people with access to the filesystem.) So you have to leave the permissions at the socially friendly 0755 level.

Some people regard this as a security problem. If your program does insecure things and relies on people not knowing how to exploit those insecurities, it is not secure. It is often possible for someone to determine the insecure things and exploit them without viewing the source. Security through obscurity, the name for hiding your bugs instead of fixing them, is little security indeed.

You can try using encryption via source filters (starting with Perl 5.8, the Filter::Simple and Filter::Util::Call modules are included in the standard distribution), but any decent programmer will be able to decrypt it. You can try using the byte code compiler and interpreter described later in perlfaq3, but the curious might still be able to de-compile it. You can try using the native-code compiler described later, but crackers might be able to disassemble it. These pose varying degrees of difficulty to people wanting to get at your code, but none can definitively conceal it (true of every language, not just Perl).

It is very easy to recover the source of Perl programs. You simply feed the program to the perl interpreter and use the modules in the B:: hierarchy. The B::Deparse module should be able to defeat most attempts to hide source. Again, this is not unique to Perl.
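
To see how little effort this takes, the sketch below hands an already-compiled subroutine to B::Deparse and gets readable source back. The subroutine itself is a throwaway example.

```perl
use strict;
use warnings;
use B::Deparse;

# Pretend this is code whose source we never saw.
my $secret = sub {
    my ($x) = @_;
    return $x * 2 + 1;
};

# coderef2text() regenerates Perl source for the body of a code reference.
my $deparser = B::Deparse->new;
my $source   = $deparser->coderef2text($secret);
print "sub $source\n";
```

The regenerated text is not byte-for-byte what was typed (formatting differs and pragmas are spelled out), but the logic is all there.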

If you're concerned about people profiting from your code, then the bottom line is that nothing but a restrictive license will give you legal security. License your software and pepper it with threatening statements like "This is unpublished proprietary software of XYZ Corp. Your access to it does not give you permission to use it blah blah blah." We are not lawyers, of course, so you should see a lawyer if you want to be sure your license's wording will stand up in court.

How can I compile my Perl program into byte code or C?

(contributed by brian d foy)

In general, you can't do this. There are some things that may work for your situation though. People usually ask this question because they want to distribute their works without giving away the source code, and most solutions trade disk space for convenience. You probably won't see much of a speed increase either, since most solutions simply bundle a Perl interpreter in the final product (but see How can I make my Perl program run faster?).

The Perl Archive Toolkit ( http://par.perl.org/ ) is Perl's analog to Java's JAR. It's freely available and on CPAN ( http://search.cpan.org/dist/PAR/ ).

There are also some commercial products that may work for you, although you have to buy a license for them.

The Perl Dev Kit ( http://www.activestate.com/Products/Perl_Dev_Kit/ ) from ActiveState can "Turn your Perl programs into ready-to-run executables for HP-UX, Linux, Solaris and Windows."

Perl2Exe ( http://www.indigostar.com/perl2exe.htm ) is a command line program for converting perl scripts to executable files. It targets both Windows and Unix platforms.

How can I get #!perl to work on [MS-DOS,NT,...]?

For OS/2 just use

  1. extproc perl -S -your_switches

as the first line in *.cmd file (-S due to a bug in cmd.exe's "extproc" handling). For DOS one should first invent a corresponding batch file and codify it in ALTERNATE_SHEBANG (see the dosish.h file in the source distribution for more information).

The Win95/NT installation, when using the ActiveState port of Perl, will modify the Registry to associate the .pl extension with the perl interpreter. If you install another port, perhaps even building your own Win95/NT Perl from the standard sources by using a Windows port of gcc (e.g., with cygwin or mingw32), then you'll have to modify the Registry yourself. In addition to associating .pl with the interpreter, NT people can use: SET PATHEXT=%PATHEXT%;.PL to let them run the program install-linux.pl merely by typing install-linux .

Under "Classic" MacOS, a perl program will have the appropriate Creator and Type, so that double-clicking it will invoke the MacPerl application. Under Mac OS X, clickable apps can be made from any #! script using Wil Sanchez' DropScript utility: http://www.wsanchez.net/software/ .

IMPORTANT!: Whatever you do, PLEASE don't get frustrated, and just throw the perl interpreter into your cgi-bin directory, in order to get your programs working for a web server. This is an EXTREMELY big security risk. Take the time to figure out how to do it correctly.

Can I write useful Perl programs on the command line?

Yes. Read perlrun for more information. Some examples follow. (These assume standard Unix shell quoting rules.)

    # sum first and last fields
    perl -lane 'print $F[0] + $F[-1]' *

    # identify text files
    perl -le 'for(@ARGV) {print if -f && -T _}' *

    # remove (most) comments from C program
    perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

    # make file a month younger than today, defeating reaper daemons
    perl -e '$X=24*60*60; utime(time(),time() + 30 * $X,@ARGV)' *

    # find first unused uid
    perl -le '$i++ while getpwuid($i); print $i'

    # display reasonable manpath
    echo $PATH | perl -nl -072 -e '
    s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}'

OK, the last one was actually an Obfuscated Perl Contest entry. :-)

Why don't Perl one-liners work on my DOS/Mac/VMS system?

The problem is usually that the command interpreters on those systems have rather different ideas about quoting than the Unix shells under which the one-liners were created. On some systems, you may have to change single-quotes to double ones, which you must NOT do on Unix or Plan9 systems. You might also have to change a single % to a %%.

For example:

    # Unix (including Mac OS X)
    perl -e 'print "Hello world\n"'

    # DOS, etc.
    perl -e "print \"Hello world\n\""

    # Mac Classic
    print "Hello world\n"
    (then Run "Myscript" or Shift-Command-R)

    # MPW
    perl -e 'print "Hello world\n"'

    # VMS
    perl -e "print ""Hello world\n"""

The problem is that none of these examples are reliable: they depend on the command interpreter. Under Unix, the first two often work. Under DOS, it's entirely possible that neither works. If 4DOS was the command shell, you'd probably have better luck like this:

  1. perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""

Under the Mac, it depends which environment you are using. The MacPerl shell, or MPW, is much like Unix shells in its support for several quoting variants, except that it makes free use of the Mac's non-ASCII characters as control characters.

Using qq(), q(), and qx(), instead of "double quotes", 'single quotes', and `backticks`, may make one-liners easier to write.

There is no general solution to all of this. It is a mess.

[Some of this answer was contributed by Kenneth Albanowski.]

Where can I learn about CGI or Web programming in Perl?

For modules, get the CGI or LWP modules from CPAN. For textbooks, see the two especially dedicated to web stuff in the question on books. For problems and questions related to the web, like "Why do I get 500 Errors" or "Why doesn't it run from the browser right when it runs fine on the command line", see the troubleshooting guides and references in perlfaq9 or in the CGI MetaFAQ:

    http://www.perl.org/CGI_MetaFAQ.html

Looking in to Plack and modern Perl web frameworks is highly recommended, though; web programming in Perl has evolved a long way from the old days of simple CGI scripts.

Where can I learn about object-oriented Perl programming?

A good place to start is perltoot, and you can use perlobj, perlboot, perltoot, perltooc, and perlbot for reference.

Good books on OO in Perl include "Object-Oriented Perl" by Damian Conway from Manning Publications and "Intermediate Perl" by Randal Schwartz, brian d foy, and Tom Phoenix from O'Reilly Media.

Where can I learn about linking C with Perl?

If you want to call C from Perl, start with perlxstut, moving on to perlxs, xsubpp, and perlguts. If you want to call Perl from C, then read perlembed, perlcall, and perlguts. Don't forget that you can learn a lot from looking at how the authors of existing extension modules wrote their code and solved their problems.

You might not need all the power of XS. The Inline::C module lets you put C code directly in your Perl source. It handles all the magic to make it work. You still have to learn at least some of the perl API but you won't have to deal with the complexity of the XS support files.

I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong?

Download the ExtUtils::Embed kit from CPAN and run make test. If the tests pass, read the pods again and again and again. If they fail, see perlbug and send a bug report with the output of make test TEST_VERBOSE=1 along with perl -V .

When I tried to run my script, I got this message. What does it mean?

A complete list of Perl's error messages and warnings with explanatory text can be found in perldiag. You can also use the splain program (distributed with Perl) to explain the error messages:

  1. perl program 2>diag.out
  2. splain [-v] [-p] diag.out

or change your program to explain the messages for you:

  1. use diagnostics;

or

  1. use diagnostics -verbose;

What's MakeMaker?

(contributed by brian d foy)

The ExtUtils::MakeMaker module, better known simply as "MakeMaker", turns a Perl script, typically called Makefile.PL , into a Makefile. The Unix tool make uses this file to manage dependencies and actions to process and install a Perl distribution.
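
A minimal Makefile.PL might look like the sketch below. The distribution name, version, and prerequisite are placeholders; a real distribution would list its own.

```perl
# Makefile.PL
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME      => 'My::Module',              # hypothetical distribution name
    VERSION   => '0.01',                    # placeholder version
    PREREQ_PM => { 'Test::More' => 0 },     # modules needed; 0 = any version
);
```

Running perl Makefile.PL writes the Makefile, after which make, make test, and make install do the rest.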

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples here are in the public domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required.

 
perlfaq4

NAME

perlfaq4 - Data Manipulation

DESCRIPTION

This section of the FAQ answers questions related to manipulating numbers, dates, strings, arrays, hashes, and miscellaneous data issues.

Data: Numbers

Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?

For the long explanation, see David Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (http://web.cse.msu.edu/~cse320/Documents/FloatingPoint.pdf).

Internally, your computer represents floating-point numbers in binary. Digital (as in powers of two) computers cannot store all numbers exactly. Some real numbers lose precision in the process. This is a problem with how computers store numbers and affects all computer languages, not just Perl.

perlnumber shows the gory details of number representations and conversions.

To limit the number of decimal places in your numbers, you can use the printf or sprintf function. See Floating-point Arithmetic in perlop for more details.

  1. printf "%.2f", 10/3;
  2. my $number = sprintf "%.2f", 10/3;
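
The sketch below shows the effect and two common ways to live with it: format for display, and compare with a tolerance rather than with ==. The digits shown assume a perl built with ordinary IEEE 754 doubles, and the tolerance of 1e-9 is an arbitrary choice for the example.

```perl
use strict;
use warnings;

my $sum = 0.1 + 0.2;

# Ask for more digits than print normally shows to see the stored value.
printf "%.17g\n", $sum;    # 0.30000000000000004 with IEEE 754 doubles

# Exact comparison is fragile; comparing within a tolerance is safer.
my $tolerance = 1e-9;      # example value, pick one that suits your data
my $close     = abs( $sum - 0.3 ) < $tolerance;
print "close enough to 0.3\n" if $close;

# For display, round to the precision you actually care about.
my $shown = sprintf "%.2f", $sum;
print "$shown\n";          # 0.30
```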

Why is int() broken?

Your int() is most probably working just fine. It's the numbers that aren't quite what you think.

First, see the answer to "Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?".

For example, this

  1. print int(0.6/0.2-2), "\n";

will on most computers print 0, not 1, because even such simple numbers as 0.6 and 0.2 cannot be represented exactly by floating-point numbers. What you think of above as 'three' is really more like 2.9999999999999995559.
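
If what you want is the nearest integer rather than truncation, round explicitly. This sketch shows two common idioms; the second assumes a non-negative value.

```perl
use strict;
use warnings;

my $x = 0.6 / 0.2 - 2;                 # slightly less than 1 on most machines

my $truncated = int($x);               # chops toward zero
my $rounded   = sprintf "%.0f", $x;    # rounds to the nearest integer
my $nearest   = int( $x + 0.5 );       # same idea, non-negative values only

print "int: $truncated, sprintf: $rounded, int(x + 0.5): $nearest\n";
```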

Why isn't my octal data interpreted correctly?

(contributed by brian d foy)

You're probably trying to convert a string to a number, which Perl only converts as a decimal number. When Perl converts a string to a number, it ignores leading spaces and zeroes, then assumes the rest of the digits are in base 10:

  1. my $string = '0644';
  2. print $string + 0; # prints 644
  3. print $string + 44; # prints 688, certainly not octal!

This problem usually involves one of the Perl built-ins that has the same name as a Unix command that uses octal numbers as arguments on the command line. In this example, chmod on the command line knows that its first argument is octal because that's what it does:

  1. %prompt> chmod 644 file

If you want to use the same literal digits (644) in Perl, you have to tell Perl to treat them as octal numbers either by prefixing the digits with a 0 or using oct:

  1. chmod( 0644, $filename ); # right, has leading zero
  2. chmod( oct(644), $filename ); # also correct

The problem comes in when you take your numbers from something that Perl thinks is a string, such as a command line argument in @ARGV :

  1. chmod( $ARGV[0], $filename ); # wrong, even if "0644"
  2. chmod( oct($ARGV[0]), $filename ); # correct, treat string as octal

You can always check the value you're using by printing it in octal notation to ensure it matches what you think it should be. Print it in octal and decimal format:

  1. printf "0%o %d", $number, $number;
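
When the mode arrives as text, it pays to validate it before handing it to oct. The helper below is a hypothetical sketch, not a standard function.

```perl
use strict;
use warnings;

# Hypothetical helper: accept "644" or "0644" and return the numeric mode.
sub mode_from_string {
    my ($text) = @_;
    die "Not an octal mode: '$text'\n"
        unless $text =~ /\A[0-7]{3,4}\z/;
    return oct $text;    # oct() always reads the digits as base 8
}

my $mode = mode_from_string('644');
printf "%04o %d\n", $mode, $mode;    # 0644 420
```

The returned value can be passed straight to chmod, and a string such as "999" is rejected instead of being silently misread.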

Does Perl have a round() function? What about ceil() and floor()? Trig functions?

Remember that int() merely truncates toward 0. For rounding to a certain number of digits, sprintf() or printf() is usually the easiest route.

  1. printf("%.3f", 3.1415926535); # prints 3.142

The POSIX module (part of the standard Perl distribution) implements ceil() , floor() , and a number of other mathematical and trigonometric functions.

  1. use POSIX;
  2. my $ceil = ceil(3.5); # 4
  3. my $floor = floor(3.5); # 3

In 5.000 to 5.003 perls, trigonometry was done in the Math::Complex module. With 5.004, the Math::Trig module (part of the standard Perl distribution) implements the trigonometric functions. Internally it uses the Math::Complex module and some functions can break out from the real axis into the complex plane, for example the inverse sine of 2.

Rounding in financial applications can have serious implications, and the rounding method used should be specified precisely. In these cases, it probably pays not to trust whichever system of rounding is being used by Perl, but instead to implement the rounding function you need yourself.

To see why, notice how you'll still have an issue on half-way-point alternation:

    for (my $i = 0; $i < 1.01; $i += 0.05) { printf "%.1f ",$i}

    0.0 0.1 0.1 0.2 0.2 0.2 0.3 0.3 0.4 0.4 0.5 0.5 0.6 0.7 0.7
    0.8 0.8 0.9 0.9 1.0 1.0

Don't blame Perl. It's the same as in C. IEEE says we have to do this. Perl numbers whose absolute values are integers under 2**31 (on 32-bit machines) will work pretty much like mathematical integers. Other numbers are not guaranteed.
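
One common way to sidestep the problem in money calculations is to keep amounts as whole numbers of the smallest unit and convert only for display. The sketch below assumes non-negative amounts in cents; the prices are made up.

```perl
use strict;
use warnings;

# Amounts held as integer cents, so addition is exact.
my @prices_in_cents = ( 1995, 499, 1 );

my $total = 0;
$total += $_ for @prices_in_cents;

# Split into whole units and cents only when formatting.
my $display = sprintf "%d.%02d", int( $total / 100 ), $total % 100;
print "$display\n";    # 24.95
```

Any rounding rule you do need (for tax or interest, say) can then be written out explicitly on integers, where its behavior at the half-way point is yours to define.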

How do I convert between numeric representations/bases/radixes?

As always with Perl there is more than one way to do it. Below are a few examples of approaches to making common conversions between number representations. This is intended to be representational rather than exhaustive.

Some of the examples later in perlfaq4 use the Bit::Vector module from CPAN. The reason you might choose Bit::Vector over the perl built-in functions is that it works with numbers of ANY size, that it is optimized for speed on some operations, and for at least some programmers the notation might be familiar.

  • How do I convert hexadecimal into decimal

    Using perl's built in conversion of 0x notation:

    1. my $dec = 0xDEADBEEF;

    Using the hex function:

    1. my $dec = hex("DEADBEEF");

    Using pack:

    1. my $dec = unpack("N", pack("H8", substr("0" x 8 . "DEADBEEF", -8)));

    Using the CPAN module Bit::Vector :

    1. use Bit::Vector;
    2. my $vec = Bit::Vector->new_Hex(32, "DEADBEEF");
    3. my $dec = $vec->to_Dec();
  • How do I convert from decimal to hexadecimal

    Using sprintf:

    1. my $hex = sprintf("%X", 3735928559); # upper case A-F
    2. my $hex = sprintf("%x", 3735928559); # lower case a-f

    Using unpack:

    1. my $hex = unpack("H*", pack("N", 3735928559));

    Using Bit::Vector:

    1. use Bit::Vector;
    2. my $vec = Bit::Vector->new_Dec(32, -559038737);
    3. my $hex = $vec->to_Hex();

    And Bit::Vector supports odd bit counts:

    1. use Bit::Vector;
    2. my $vec = Bit::Vector->new_Dec(33, 3735928559);
    3. $vec->Resize(32); # suppress leading 0 if unwanted
    4. my $hex = $vec->to_Hex();
  • How do I convert from octal to decimal

    Using Perl's built in conversion of numbers with leading zeros:

    1. my $dec = 033653337357; # note the leading 0!

    Using the oct function:

    1. my $dec = oct("33653337357");

    Using Bit::Vector:

    1. use Bit::Vector;
    2. my $vec = Bit::Vector->new(32);
    3. $vec->Chunk_List_Store(3, split(//, reverse "33653337357"));
    4. my $dec = $vec->to_Dec();
  • How do I convert from decimal to octal

    Using sprintf:

    1. my $oct = sprintf("%o", 3735928559);

    Using Bit::Vector:

    1. use Bit::Vector;
    2. my $vec = Bit::Vector->new_Dec(32, -559038737);
    3. my $oct = reverse join('', $vec->Chunk_List_Read(3));
  • How do I convert from binary to decimal

    Perl 5.6 lets you write binary numbers directly with the 0b notation:

    1. my $number = 0b10110110;

    Using oct:

    1. my $input = "10110110";
    2. my $decimal = oct( "0b$input" );

    Using pack and ord:

    1. my $decimal = ord(pack('B8', '10110110'));

    Using pack and unpack for larger strings:

    1. my $int = unpack("N", pack("B32",
    2. substr("0" x 32 . "11110101011011011111011101111", -32)));
    3. my $dec = sprintf("%d", $int);
    4. # substr() is used to left-pad a 32-character string with zeros.

    Using Bit::Vector:

        use Bit::Vector;
        my $vec = Bit::Vector->new_Bin(32, "11011110101011011011111011101111");
        my $dec = $vec->to_Dec();
  • How do I convert from decimal to binary

    Using sprintf (perl 5.6+):

    1. my $bin = sprintf("%b", 3735928559);

    Using unpack:

    1. my $bin = unpack("B*", pack("N", 3735928559));

    Using Bit::Vector:

    1. use Bit::Vector;
    2. my $vec = Bit::Vector->new_Dec(32, -559038737);
    3. my $bin = $vec->to_Bin();

    The remaining transformations (e.g. hex -> oct, bin -> hex, etc.) are left as an exercise to the inclined reader.
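
For readers who would rather not do the exercise, the remaining conversions are mostly a matter of chaining the built-ins already shown: parse with hex or oct, then format with sprintf. A short sketch using the same value as above:

```perl
use strict;
use warnings;

# hex -> binary and hex -> octal
my $bin = sprintf "%b", hex("DEADBEEF");
my $oct = sprintf "%o", hex("DEADBEEF");

# binary -> hex (oct() understands a leading "0b")
my $hex_from_bin = sprintf "%X", oct("0b11011110101011011011111011101111");

# octal -> hex
my $hex_from_oct = sprintf "%X", oct("033653337357");

print "$bin\n$oct\n$hex_from_bin\n$hex_from_oct\n";
```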

Why doesn't & work the way I want it to?

The behavior of the bitwise operators (such as & and |) depends on whether they're used on numbers or strings. With strings, the operators treat each string as a series of bits and work with that (the string "3" is the bit pattern 00110011 ). With numbers, the operators work with the binary form of the number (the number 3 is treated as the bit pattern 00000011 ).

So, saying 11 & 3 performs the "and" operation on numbers (yielding 3 ). Saying "11" & "3" performs the "and" operation on strings (yielding "1" ).

Most problems with & and | arise because the programmer thinks they have a number but really it's a string or vice versa. To avoid this, stringify the arguments explicitly (using "" or qq()) or convert them to numbers explicitly (using 0+$arg ). The rest arise because the programmer says:

  1. if ("\020\020" & "\101\101") {
  2. # ...
  3. }

but a string consisting of two null bytes (the result of "\020\020" & "\101\101" ) is not a false value in Perl. You need:

  1. if ( ("\020\020" & "\101\101") !~ /[^\000]/) {
  2. # ...
  3. }
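
The difference between the two modes, and the 0+ trick for forcing the numeric one, can be seen in this short sketch. It assumes the traditional operator behavior described above (that is, the newer "bitwise" feature is not enabled); the variable names are illustrative.

```perl
use strict;
use warnings;

my $numeric = 11 & 3;        # both operands are numbers: 1011 & 0011
my $stringy = "11" & "3";    # both operands are strings: character by character

print "numeric: $numeric\n";    # numeric: 3
print "stringy: $stringy\n";    # stringy: 1

# Data read from a file or the command line starts life as a string,
# so force numeric context before masking.
my $from_input = "11";
my $forced     = ( 0 + $from_input ) & 3;
print "forced:  $forced\n";     # forced:  3
```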

How do I multiply matrices?

Use the Math::Matrix or Math::MatrixReal modules (available from CPAN) or the PDL extension (also available from CPAN).

How do I perform an operation on a series of integers?

To call a function on each element in an array, and collect the results, use:

  1. my @results = map { my_func($_) } @array;

For example:

  1. my @triple = map { 3 * $_ } @single;

To call a function on each element of an array, but ignore the results:

  1. foreach my $iterator (@array) {
  2. some_func($iterator);
  3. }

To call a function on each integer in a (small) range, you can use:

  1. my @results = map { some_func($_) } (5 .. 25);

but you should be aware that in this form, the .. operator creates a list of all integers in the range, which can take a lot of memory for large ranges. However, the problem does not occur when using .. within a for loop, because in that case the range operator is optimized to iterate over the range, without creating the entire list. So

  1. my @results = ();
  2. for my $i (5 .. 500_005) {
  3. push(@results, some_func($i));
  4. }

or even

  1. push(@results, some_func($_)) for 5 .. 500_005;

will not create an intermediate list of 500,000 integers.

How can I output Roman numerals?

Get the Roman module from CPAN ( http://www.cpan.org/modules/by-module/Roman ).
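
If installing a module is not an option, the conversion is short enough to write by hand. The function below is a sketch of the usual subtract-the-largest-value approach, limited to 1 through 3999; it is not the Roman module's own code or interface.

```perl
use strict;
use warnings;

# Hypothetical helper, handles whole numbers from 1 to 3999.
sub to_roman {
    my ($n) = @_;
    die "Need a whole number from 1 to 3999\n"
        unless defined $n && $n =~ /\A\d+\z/ && $n >= 1 && $n <= 3999;

    # Values in descending order, including the subtractive pairs.
    my @table = (
        [ 1000, 'M' ], [ 900, 'CM' ], [ 500, 'D' ], [ 400, 'CD' ],
        [ 100,  'C' ], [ 90,  'XC' ], [ 50,  'L' ], [ 40,  'XL' ],
        [ 10,   'X' ], [ 9,   'IX' ], [ 5,   'V' ], [ 4,   'IV' ],
        [ 1,    'I' ],
    );

    my $roman = '';
    for my $pair (@table) {
        my ( $value, $symbol ) = @$pair;
        while ( $n >= $value ) {
            $roman .= $symbol;
            $n -= $value;
        }
    }
    return $roman;
}

print to_roman(1987), "\n";    # MCMLXXXVII
```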

Why aren't my random numbers random?

If you're using a version of Perl before 5.004, you must call srand once at the start of your program to seed the random number generator.

  1. BEGIN { srand() if $] < 5.004 }

5.004 and later automatically call srand at the beginning. Don't call srand more than once--you make your numbers less random, rather than more.

Computers are good at being predictable and bad at being random (despite appearances caused by bugs in your programs :-). The random article in the "Far More Than You Ever Wanted To Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz, courtesy of Tom Phoenix, talks more about this. John von Neumann said, "Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin."

Perl relies on the underlying system for the implementation of rand and srand; on some systems, the generated numbers are not random enough (especially on Windows; see http://www.perlmonks.org/?node_id=803632 ). Several CPAN modules in the Math namespace implement better pseudorandom generators; see for example Math::Random::MT ("Mersenne Twister", fast), or Math::TrulyRandom (uses the imperfections in the system's timer to generate random numbers, which is rather slow). More algorithms for random numbers are described in "Numerical Recipes in C" at http://www.nr.com/

How do I get a random number between X and Y?

To get a random number between two values, you can use the rand() built-in to get a random number between 0 and 1. From there, you shift that into the range that you want.

rand($x) returns a number such that 0 <= rand($x) < $x . Thus what you want to have perl figure out is a random number in the range from 0 to the difference between your X and Y.

That is, to get a number between 10 and 15, inclusive, you want a random number between 0 and 5 that you can then add to 10.

  1. my $number = 10 + int rand( 15-10+1 ); # ( 10,11,12,13,14, or 15 )

Hence you derive the following simple function to abstract that. It selects a random integer between the two given integers (inclusive). For example: random_int_between(50,120) .

    sub random_int_between {
        my($min, $max) = @_;
        # Assumes that the two arguments are integers themselves!
        return $min if $min == $max;
        ($min, $max) = ($max, $min) if $min > $max;
        return $min + int rand(1 + $max - $min);
    }

Data: Dates

How do I find the day or week of the year?

The day of the year is in the list returned by the localtime function. Without an argument localtime uses the current time.

  1. my $day_of_year = (localtime)[7];

The POSIX module can also format a date as the day of the year or week of the year.

  1. use POSIX qw/strftime/;
  2. my $day_of_year = strftime "%j", localtime;
  3. my $week_of_year = strftime "%W", localtime;

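To get the day or week of the year for any date, use POSIX's mktime to get a time in epoch seconds for the argument to localtime. Note that mktime takes a zero-based month and the year minus 1900, so the arguments 18, 11, 87 mean 18 December 1987.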

  1. use POSIX qw/mktime strftime/;
  2. my $week_of_year = strftime "%W",
  3. localtime( mktime( 0, 0, 0, 18, 11, 87 ) );

You can also use Time::Piece, which comes with Perl and provides a localtime that returns an object:

  1. use Time::Piece;
  2. my $day_of_year = localtime->yday;
  3. my $week_of_year = localtime->week;

The Date::Calc module provides two functions to calculate these, too:

    use Date::Calc qw(Day_of_Year Week_of_Year); # nothing is exported by default
    my $day_of_year = Day_of_Year( 1987, 12, 18 );
    my $week_of_year = Week_of_Year( 1987, 12, 18 );

How do I find the current century or millennium?

Use the following simple functions:

    sub get_century {
        return int((((localtime(shift || time))[5] + 1999))/100);
    }
    sub get_millennium {
        return 1+int((((localtime(shift || time))[5] + 1899))/1000);
    }

On some systems, the POSIX module's strftime() function has been extended in a non-standard way to use a %C format, which they sometimes claim is the "century". It isn't, because on most such systems, this is only the first two digits of the four-digit year, and thus cannot be used to determine reliably the current century or millennium.

How can I compare two dates and find the difference?

(contributed by brian d foy)

You could just store all your dates as a number and then subtract. Life isn't always that simple though.

The Time::Piece module, which comes with Perl, replaces localtime with a version that returns an object. It also overloads the comparison operators so you can compare them directly:

    use Time::Piece;
    my $date1 = localtime( $some_time );
    my $date2 = localtime( $some_other_time );

    if( $date1 < $date2 ) {
        print "The date was in the past\n";
    }

You can also get differences with a subtraction, which returns a Time::Seconds object:

    my $date_diff = $date1 - $date2;
    print "The difference is ", $date_diff->days, " days\n";
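
As a self-contained illustration of the comparison and the subtraction, here is a small sketch. The two dates are made up, and they are parsed with strptime (which yields UTC-based objects) so that the result does not depend on the local time zone.

```perl
use strict;
use warnings;
use Time::Piece;

# Two illustrative dates, one week apart.
my $date1 = Time::Piece->strptime( '2011-02-21', '%Y-%m-%d' );
my $date2 = Time::Piece->strptime( '2011-02-14', '%Y-%m-%d' );

# The overloaded comparison operators work on the objects directly.
print "date2 is earlier than date1\n" if $date2 < $date1;

# Subtracting two Time::Piece objects yields a Time::Seconds object.
my $date_diff = $date1 - $date2;

printf "The difference is %d days (%d hours)\n",
    $date_diff->days, $date_diff->hours;
```

The Time::Seconds object also offers minutes, weeks, and similar accessors if days is too coarse or too fine.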

If you want to work with formatted dates, the Date::Manip, Date::Calc, or DateTime modules can help you.

How can I take a string and turn it into epoch seconds?

If it's a regular enough string that it always has the same format, you can split it up and pass the parts to timelocal in the standard Time::Local module. Otherwise, you should look into the Date::Calc, Date::Parse, and Date::Manip modules from CPAN.
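
For the fixed-format case, the idea can be sketched as follows. The timestamp layout ("YYYY-MM-DD hh:mm:ss") and the variable names are assumptions made for the example, and it uses timegm so the result does not depend on the local time zone; swap in timelocal if the string represents local time.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical input in a fixed layout, taken to be UTC.
my $stamp = '2011-02-14 12:30:45';

my ( $year, $mon, $mday, $hour, $min, $sec ) =
    $stamp =~ /\A(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)\z/
    or die "Unrecognized timestamp: $stamp\n";

# Months are zero-based for timegm and timelocal;
# a four-digit year is taken as the actual year.
my $epoch = timegm( $sec, $min, $hour, $mday, $mon - 1, $year );

print "$stamp is $epoch seconds after the epoch\n";
```

Both functions die on out-of-range values such as month 13, so wrap the call in eval if the input is not trusted.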

How can I find the Julian Day?

(contributed by brian d foy and Dave Cross)

You can use the Time::Piece module, part of the Standard Library, which can convert a date/time to a Julian Day:

    $ perl -MTime::Piece -le 'print localtime->julian_day'
    2455607.7959375

Or the modified Julian Day:

    $ perl -MTime::Piece -le 'print localtime->mjd'
    55607.2961226851

Or even the day of the year (which is what some people think of as a Julian day):

    $ perl -MTime::Piece -le 'print localtime->yday'
    45

You can also do the same things with the DateTime module:

    $ perl -MDateTime -le'print DateTime->today->jd'
    2453401.5
    $ perl -MDateTime -le'print DateTime->today->mjd'
    53401
    $ perl -MDateTime -le'print DateTime->today->doy'
    31

You can use the Time::JulianDay module available on CPAN. Ensure that you really want to find a Julian day, though, as many people have different ideas about Julian days (see http://www.hermetic.ch/cal_stud/jdn.htm for instance):

    $ perl -MTime::JulianDay -le 'print local_julian_day( time )'
    55608

How do I find yesterday's date?

(contributed by brian d foy)

To do it correctly, you can use one of the Date modules since they work with calendars instead of times. The DateTime module makes it simple, and gives you the same time of day, only the day before, despite daylight saving time changes:

    use DateTime;
    my $yesterday = DateTime->now->subtract( days => 1 );
    print "Yesterday was $yesterday\n";

You can also use the Date::Calc module using its Today_and_Now function.

    use Date::Calc qw( Today_and_Now Add_Delta_DHMS );
    my @date_time = Add_Delta_DHMS( Today_and_Now(), -1, 0, 0, 0 );
    print "@date_time\n";

Most people try to use the time rather than the calendar to figure out dates, but that assumes that days are twenty-four hours each. For most people, there are two days a year when they aren't: the switch to and from summer time throws this off. For example, the rest of the suggestions will be wrong sometimes:

Starting with Perl 5.10, Time::Piece and Time::Seconds are part of the standard distribution, so you might think that you could do something like this:

    use Time::Piece;
    use Time::Seconds;
    my $yesterday = localtime() - ONE_DAY; # WRONG
    print "Yesterday was $yesterday\n";

The Time::Piece module exports a new localtime that returns an object, and Time::Seconds exports the ONE_DAY constant that is a set number of seconds. This means that it always gives the time 24 hours ago, which is not always yesterday. This can cause problems around the end of daylight saving time when there's one day that is 25 hours long.

You have the same problem with Time::Local, which will give the wrong answer for those same special cases:

    # contributed by Gunnar Hjalmarsson
    use Time::Local;
    my $today = timelocal 0, 0, 12, ( localtime )[3..5];
    my ($d, $m, $y) = ( localtime $today-86400 )[3..5]; # WRONG
    printf "Yesterday: %d-%02d-%02d\n", $y+1900, $m+1, $d;

Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant?

(contributed by brian d foy)

Perl itself never had a Y2K problem, although that never stopped people from creating Y2K problems on their own. See the documentation for localtime for its proper use.

Starting with Perl 5.12, localtime and gmtime can handle dates past 03:14:08 January 19, 2038, when a 32-bit based time would overflow. You still might get a warning on a 32-bit perl:

    % perl5.12 -E 'say scalar localtime( 0x9FFF_FFFFFFFF )'
    Integer overflow in hexadecimal number at -e line 1.
    Wed Nov 1 19:42:39 5576711

On a 64-bit perl, you can get even larger dates for those really long running projects:

    % perl5.12 -E 'say scalar gmtime( 0x9FFF_FFFFFFFF )'
    Thu Nov 2 00:42:39 5576711

You're still out of luck if you need to keep track of decaying protons though.

Data: Strings

How do I validate input?

(contributed by brian d foy)

There are many ways to ensure that values are what you expect or want to accept. Besides the specific examples that we cover in the perlfaq, you can also look at the modules with "Assert" and "Validate" in their names, along with other modules such as Regexp::Common.

Some modules have validation for particular types of input, such as Business::ISBN, Business::CreditCard, Email::Valid, and Data::Validate::IP.

How do I unescape a string?

It depends just what you mean by "escape". URL escapes are dealt with in perlfaq9. Shell escapes with the backslash (\) character are removed with:

    s/\\(.)/$1/g;

This won't expand "\n" or "\t" or any other special escapes.
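
If you do want the special escapes expanded as well, one possible approach is a lookup table consulted from the replacement side. This is only a sketch: the %escape table below is an illustrative subset chosen for the example, not a complete list of Perl's escapes.

```perl
use strict;
use warnings;

# Illustrative table: the letter after the backslash => the character it stands for.
my %escape = ( n => "\n", t => "\t", r => "\r" );

# Single quotes keep the backslashes literal in the sample input.
my $string = 'tab:\t newline:\n bang:\!';

# Known escapes are looked up; anything else just loses its backslash.
( my $unescaped = $string ) =~
    s/\\(.)/ exists $escape{$1} ? $escape{$1} : $1 /gse;

print $unescaped, "\n";
```

Keeping the table explicit means unknown sequences degrade to the plain character instead of being evaluated as code.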

How do I remove consecutive pairs of characters?

(contributed by brian d foy)

You can use the substitution operator to find pairs of characters (or runs of characters) and replace them with a single instance. In this substitution, we find a character in (.). The memory parentheses store the matched character in the back-reference \g1 and we use that to require that the same thing immediately follow it. We replace that part of the string with the character in $1.

    s/(.)\g1/$1/g;

We can also use the transliteration operator, tr///. In this example, the search list side of our tr/// contains nothing, but the c option complements that so it contains everything. The replacement list also contains nothing, so the transliteration is almost a no-op since it won't do any replacements (or more exactly, replace the character with itself). However, the s option squashes duplicated and consecutive characters in the string so a character does not show up next to itself.

    my $str = 'Haarlem';   # in the Netherlands
    $str =~ tr///cs;       # Now Harlem, like in New York

How do I expand function calls in a string?

(contributed by brian d foy)

This is documented in perlref, and although it's not the easiest thing to read, it does work. In each of these examples, we call the function inside the braces used to dereference a reference. If we have more than one return value, we can construct and dereference an anonymous array. In this case, we call the function in list context.

    print "The time values are @{ [localtime] }.\n";

If we want to call the function in scalar context, we have to do a bit more work. We can really have any code we like inside the braces, so we simply have to end with the scalar reference, although how you do that is up to you, and you can use code inside the braces. Note that the use of parens creates a list context, so we need scalar to force the scalar context on the function:

    print "The time is ${\(scalar localtime)}.\n";

    print "The time is ${ my $x = localtime; \$x }.\n";

If your function already returns a reference, you don't need to create the reference yourself.

    sub timestamp { my $t = localtime; \$t }

    print "The time is ${ timestamp() }.\n";

The Interpolation module can also do a lot of magic for you. You can specify a variable name, in this case E, to set up a tied hash that does the interpolation for you. It has several other methods to do this as well.

    use Interpolation E => 'eval';
    print "The time values are $E{localtime()}.\n";

In most cases, it is probably easier to simply use string concatenation, which also forces scalar context.

    print "The time is " . localtime() . ".\n";

How do I find matching/nesting anything?

To find something between two single characters, a pattern like /x([^x]*)x/ will get the intervening bits in $1. For multiple ones, then something more like /alpha(.*?)omega/ would be needed. For nested patterns and/or balanced expressions, see the so-called (?PARNO) construct (available since perl 5.10). The CPAN module Regexp::Common can help to build such regular expressions (see in particular Regexp::Common::balanced and Regexp::Common::delimited).

More complex cases will require writing a parser, probably using a parsing module from CPAN, like Regexp::Grammars, Parse::RecDescent, Parse::Yapp, Text::Balanced, or Marpa::XS.
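
To make the (?PARNO) construct a little more concrete, here is a minimal sketch that pulls balanced parenthesized groups out of a string. The pattern and the sample text are made up for illustration, and it assumes the delimiters are plain parentheses with no quoting or escaping to worry about.

```perl
use strict;
use warnings;
use 5.010;

# Group 1 matches an opening paren, then any mix of non-paren runs and
# nested groups (recursing into group 1 via (?1)), then a closing paren.
my $balanced = qr/
    (
        \(
        (?: [^()]++ | (?1) )*
        \)
    )
/x;

my $text = 'f(a(b)c) + g(d)';

my @groups;
while ( $text =~ /$balanced/g ) {
    push @groups, $1;
}

print "$_\n" for @groups;
```

Because (?1) counts groups from the start of the whole pattern, embedding this in a larger regex with earlier capture groups would need the number adjusted, or a named group with (?&name) instead.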

How do I reverse a string?

Use reverse() in scalar context, as documented in reverse.

    my $reversed = reverse $string;

How do I expand tabs in a string?

You can do it yourself:

    1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;

Or you can just use the Text::Tabs module (part of the standard Perl distribution).

    use Text::Tabs;
    my @expanded_lines = expand(@lines_with_tabs);

How do I reformat a paragraph?

Use Text::Wrap (part of the standard Perl distribution):

    use Text::Wrap;
    print wrap("\t", ' ', @paragraphs);

The paragraphs you give to Text::Wrap should not contain embedded newlines. Text::Wrap doesn't justify the lines (flush-right).

Or use the CPAN module Text::Autoformat. Formatting files can be easily done by making a shell alias, like so:

    alias fmt="perl -i -MText::Autoformat -n0777 \
        -e 'print autoformat $_, {all=>1}' $*"

See the documentation for Text::Autoformat to appreciate its many capabilities.

How can I access or change N characters of a string?

You can access the first characters of a string with substr(). To get the first character, for example, start at position 0 and grab the string of length 1.

    my $string = "Just another Perl Hacker";
    my $first_char = substr( $string, 0, 1 ); # 'J'

To change part of a string, you can use the optional fourth argument which is the replacement string.

    substr( $string, 13, 4, "Perl 5.8.0" );

You can also use substr() as an lvalue.

    substr( $string, 13, 4 ) = "Perl 5.8.0";

How do I change the Nth occurrence of something?

You have to keep track of N yourself. For example, let's say you want to change the fifth occurrence of "whoever" or "whomever" into "whosoever" or "whomsoever", case insensitively. These all assume that $_ contains the string to be altered.

    $count = 0;
    s{((whom?)ever)}{
        ++$count == 5       # is it the 5th?
            ? "${2}soever"  # yes, swap
            : $1            # renege and leave it there
    }ige;

In the more general case, you can use the /g modifier in a while loop, keeping count of matches.

    $WANT = 3;
    $count = 0;
    $_ = "One fish two fish red fish blue fish";
    while (/(\w+)\s+fish\b/gi) {
        if (++$count == $WANT) {
            print "The third fish is a $1 one.\n";
        }
    }

That prints out: "The third fish is a red one." You can also use a repetition count and repeated pattern like this:

    /(?:\w+\s+fish\s+){2}(\w+)\s+fish/i;
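
The counting technique can be wrapped up in a small helper. The replace_nth function below is hypothetical, written only for this illustration; it assumes you want to replace exactly one occurrence and leave all the others alone.

```perl
use strict;
use warnings;

# Hypothetical helper: replace only the $n-th match of $pattern in $text.
sub replace_nth {
    my ( $text, $pattern, $n, $replacement ) = @_;
    my $count = 0;
    # Every match is visited; only the $n-th one is swapped,
    # the rest are put back unchanged via $1.
    $text =~ s{($pattern)}{ ++$count == $n ? $replacement : $1 }ge;
    return $text;
}

my $line = 'One fish two fish red fish blue fish';
my $new  = replace_nth( $line, qr/fish/, 3, 'herring' );

print "$new\n";
```

Since the function works on a copy, the caller's string is left untouched; modify $_[0] instead if you want in-place behavior.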

How can I count the number of occurrences of a substring within a string?

There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the tr/// function like so:

    my $string = "ThisXlineXhasXsomeXx'sXinXit";
    my $count = ($string =~ tr/X//);
    print "There are $count X characters in the string";

This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a larger string, tr/// won't work. What you can do is wrap a while() loop around a global pattern match. For example, let's count negative integers:

    my $string = "-9 55 48 -2 23 -76 4 14 -44";
    my $count = 0;
    while ($string =~ /-\d+/g) { $count++ }
    print "There are $count negative numbers in the string";

Another version uses a global match in list context, then assigns the result to a scalar, producing a count of the number of matches.

    my $count = () = $string =~ /-\d+/g;

How do I capitalize all the words on one line?

(contributed by brian d foy)

Damian Conway's Text::Autoformat handles all of the thinking for you.

    use Text::Autoformat;
    my $x = "Dr. Strangelove or: How I Learned to Stop ".
      "Worrying and Love the Bomb";

    print $x, "\n";
    for my $style (qw( sentence title highlight )) {
        print autoformat($x, { case => $style }), "\n";
    }

How do you want to capitalize those words?

    FRED AND BARNEY'S LODGE        # all uppercase
    Fred And Barney's Lodge        # title case
    Fred and Barney's Lodge        # highlight case

It's not as easy a problem as it looks. How many words do you think are in there? Wait for it... wait for it.... If you answered 5 you're right. Perl words are groups of \w+, but that's not what you want to capitalize. How is Perl supposed to know not to capitalize that s after the apostrophe? You could try a regular expression:

    $string =~ s/ (
                 (^\w)    #at the beginning of the line
                   |      # or
                 (\s\w)   #preceded by whitespace
                   )
                /\U$1/xg;

    $string =~ s/([\w']+)/\u\L$1/g;

Now, what if you don't want to capitalize that "and"? Just use Text::Autoformat and get on with the next problem. :)

How can I split a [character]-delimited string except when inside [character]?

Several modules can handle this sort of parsing--Text::Balanced, Text::CSV, Text::CSV_XS, and Text::ParseWords, among others.

Take the example case of trying to split a string that is comma-separated into its different fields. You can't use split(/,/) because you shouldn't split if the comma is inside quotes. For example, take a data line like this:

    SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"

Due to the restriction of the quotes, this is a fairly complex problem. Thankfully, we have Jeffrey Friedl, author of Mastering Regular Expressions, to handle these for us. He suggests (assuming your string is contained in $text):

    my @new = ();
    push(@new, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes
        | ([^,]+),?
        | ,
    }gx;
    push(@new, undef) if substr($text,-1,1) eq ',';

If you want to represent quotation marks inside a quotation-mark-delimited field, escape them with backslashes (e.g., "like \"this\"").

Alternatively, the Text::ParseWords module (part of the standard Perl distribution) lets you say:

    use Text::ParseWords;
    @new = quotewords(",", 0, $text);

For parsing or generating CSV, though, using Text::CSV rather than implementing it yourself is highly recommended; you'll save yourself odd bugs popping up later by just using code which has already been tried and tested in production for years.

How do I strip blank space from the beginning/end of a string?

(contributed by brian d foy)

A substitution can do this for you. For a single line, you want to replace all the leading or trailing whitespace with nothing. You can do that with a pair of substitutions:

    s/^\s+//;
    s/\s+$//;

You can also write that as a single substitution, although it turns out the combined statement is slower than the separate ones. That might not matter to you, though:

    s/^\s+|\s+$//g;

In this regular expression, the alternation matches either at the beginning or the end of the string, since alternation has lower precedence than the anchors and each anchor therefore applies only to its own branch. With the /g flag, the substitution makes all possible matches, so it gets both. Remember, the trailing newline matches the \s+, and the $ anchor can match to the absolute end of the string, so the newline disappears too. Just add the newline to the output, which has the added benefit of preserving "blank" (consisting entirely of whitespace) lines which the ^\s+ would remove all by itself:

    while( <> ) {
        s/^\s+|\s+$//g;
        print "$_\n";
    }

For a multi-line string, you can apply the regular expression to each logical line in the string by adding the /m flag (for "multi-line"). With the /m flag, the $ matches before an embedded newline, so it doesn't remove it. This pattern still removes the newline at the end of the string:

    $string =~ s/^\s+|\s+$//gm;

Remember that lines consisting entirely of whitespace will disappear, since the first part of the alternation can match the entire string and replace it with nothing. If you need to keep embedded blank lines, you have to do a little more work. Instead of matching any whitespace (since that includes a newline), just match the other whitespace:

    $string =~ s/^[\t\f ]+|[\t\f ]+$//mg;

How do I pad a string with blanks or pad a number with zeroes?

In the following examples, $pad_len is the length to which you wish to pad the string, $text or $num contains the string to be padded, and $pad_char contains the padding character. You can use a single character string constant instead of the $pad_char variable if you know what it is in advance. And in the same way you can use an integer in place of $pad_len if you know the pad length in advance.

The simplest method uses the sprintf function. It can pad on the left or right with blanks and on the left with zeroes and it will not truncate the result. The pack function can only pad strings on the right with blanks and it will truncate the result to a maximum length of $pad_len.

    # Left padding a string with blanks (no truncation):
    my $padded = sprintf("%${pad_len}s", $text);
    my $padded = sprintf("%*s", $pad_len, $text); # same thing

    # Right padding a string with blanks (no truncation):
    my $padded = sprintf("%-${pad_len}s", $text);
    my $padded = sprintf("%-*s", $pad_len, $text); # same thing

    # Left padding a number with 0 (no truncation):
    my $padded = sprintf("%0${pad_len}d", $num);
    my $padded = sprintf("%0*d", $pad_len, $num); # same thing

    # Right padding a string with blanks using pack (will truncate):
    my $padded = pack("A$pad_len",$text);

If you need to pad with a character other than blank or zero you can use one of the following methods. They all generate a pad string with the x operator and combine that with $text. These methods do not truncate $text.

Left and right padding with any character, creating a new string:

    my $padded = $pad_char x ( $pad_len - length( $text ) ) . $text;
    my $padded = $text . $pad_char x ( $pad_len - length( $text ) );

Left and right padding with any character, modifying $text directly:

    substr( $text, 0, 0 ) = $pad_char x ( $pad_len - length( $text ) );
    $text .= $pad_char x ( $pad_len - length( $text ) );

How do I extract selected columns from a string?

(contributed by brian d foy)

If you know the columns that contain the data, you can use substr to extract a single column.

    my $column = substr( $line, $start_column, $length );

You can use split if the columns are separated by whitespace or some other delimiter, as long as whitespace or the delimiter cannot appear as part of the data.

    my $line = ' fred barney betty ';
    my @columns = split /\s+/, $line;
        # ( '', 'fred', 'barney', 'betty' );

    my $line = 'fred||barney||betty';
    my @columns = split /\|/, $line;
        # ( 'fred', '', 'barney', '', 'betty' );

If you want to work with comma-separated values, don't do this since that format is a bit more complicated. Use one of the modules that handle that format, such as Text::CSV, Text::CSV_XS, or Text::CSV_PP.

If you want to break apart an entire line of fixed columns, you can use unpack with the A (ASCII) format. By using a number after the format specifier, you can denote the column width. See the pack and unpack entries in perlfunc for more details.

    my @fields = unpack( "A8 A8 A8 A16 A4", $line ); # template first, then the data

Note that spaces in the format argument to unpack do not denote literal spaces. If you have space separated data, you may want split instead.
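
A worked example may help here. The record layout below (name, city, state, zip and their widths) is invented for the illustration; the point is only that the template comes first and that the A format strips the trailing padding from each field.

```perl
use strict;
use warnings;

# Build an illustrative fixed-width record: name (8), city (8), state (2), zip (5).
my $line = sprintf '%-8s%-8s%-2s%-5s', 'Fred', 'Bedrock', 'CA', '90210';

# Template first, then the data. "A" removes trailing spaces from each field.
my ( $name, $city, $state, $zip ) = unpack 'A8 A8 A2 A5', $line;

print join( '|', $name, $city, $state, $zip ), "\n";
```

If you need to keep the trailing spaces, the a format (lowercase) returns each field exactly as stored.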

How do I find the soundex value of a string?

(contributed by brian d foy)

You can use the Text::Soundex module. If you want to do fuzzy or close matching, you might also try the String::Approx, Text::Metaphone, and Text::DoubleMetaphone modules.

How can I expand variables in text strings?

(contributed by brian d foy)

If you can avoid it, don't, or if you can use a templating system, such as Text::Template or Template Toolkit, do that instead. You might even be able to get the job done with sprintf or printf:

    my $string = sprintf 'Say hello to %s and %s', $foo, $bar;

However, for the one-off simple case where I don't want to pull out a full templating system, I'll use a string that has two Perl scalar variables in it. In this example, I want to expand $foo and $bar to their variable's values:

    my $foo = 'Fred';
    my $bar = 'Barney';
    $string = 'Say hello to $foo and $bar';

One way I can do this involves the substitution operator and a double /e flag. The first /e evaluates $1 on the replacement side and turns it into $foo. The second /e starts with $foo and replaces it with its value. $foo, then, turns into 'Fred', and that's finally what's left in the string:

    $string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'

The /e will also silently ignore violations of strict, replacing undefined variable names with the empty string. Since I'm using the /e flag (twice even!), I have all of the same security problems I have with eval in its string form. If there's something odd in $foo, perhaps something like @{[ system "rm -rf /" ]}, then I could get myself in trouble.

To get around the security problem, I could also pull the values from a hash instead of evaluating variable names. Using a single /e, I can check the hash to ensure the value exists, and if it doesn't, I can replace the missing value with a marker, in this case ??? to signal that I missed something:

    my $string = 'This has $foo and $bar';

    my %Replacements = (
        foo => 'Fred',
    );

    # $string =~ s/\$(\w+)/$Replacements{$1}/g;
    $string =~ s/\$(\w+)/
        exists $Replacements{$1} ? $Replacements{$1} : '???'
        /eg;

    print $string;

What's wrong with always quoting "$vars"?

The problem is that those double-quotes force stringification--coercing numbers and references into strings--even when you don't want them to be strings. Think of it this way: double-quote expansion is used to produce new strings. If you already have a string, why do you need more?

If you get used to writing odd things like these:

    print "$var";       # BAD
    my $new = "$old";   # BAD
    somefunc("$var");   # BAD

You'll be in trouble. Those should (in 99.8% of the cases) be the simpler and more direct:

    print $var;
    my $new = $old;
    somefunc($var);

Otherwise, besides slowing you down, you're going to break code when the thing in the scalar is actually neither a string nor a number, but a reference:

    func(\@array);
    sub func {
        my $aref = shift;
        my $oref = "$aref"; # WRONG
    }

You can also get into subtle problems on those few operations in Perl that actually do care about the difference between a string and a number, such as the magical ++ autoincrement operator or the syscall() function.
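
Both problems are easy to see in a short, self-contained sketch. The variable names are illustrative, and the eval is there only so the script can report the failure rather than stop at it.

```perl
use strict;
use warnings;

my @array = ( 1, 2, 3 );

my $aref = \@array;    # a real reference
my $oref = "$aref";    # only a string that looks like "ARRAY(0x...)"

print "real reference has ", scalar( @{$aref} ), " elements\n";
print "stringified copy is just text: $oref\n";

# Under strict refs, trying to dereference the string is a fatal error.
my $ok = eval { my @copy = @{$oref}; 1 };
print "dereferencing the string failed: $@" unless $ok;

# The magical autoincrement treats a string and a number differently.
my $label = 'aa9';
$label++;              # string increment: becomes 'ab0'

my $number = '9' + 0;  # the addition makes this a number
$number++;             # numeric increment: becomes 10

print "$label $number\n";
```

Without strict refs the stringified "reference" would quietly be treated as a symbolic name instead, which is arguably worse than the fatal error.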

Stringification also destroys arrays.

    my @lines = `command`;
    print "@lines";     # WRONG - extra blanks
    print @lines;       # right

Why don't my <<HERE documents work?

Here documents are found in perlop. Check for these four things:

  • There must be no space after the << part.
  • There (probably) should be a semicolon at the end of the opening token.
  • You can't (easily) have any space in front of the tag.
  • There needs to be at least a line separator after the end token.

If you want to indent the text in the here document, you can do this:

  1. # all in one
  2. (my $VAR = <<HERE_TARGET) =~ s/^\s+//gm;
  3. your text
  4. goes here
  5. HERE_TARGET

But the HERE_TARGET must still be flush against the margin. If you want that indented also, you'll have to quote in the indentation.

    (my $quote = <<'    FINIS') =~ s/^\s+//gm;
            ...we will have peace, when you and all your works have
            perished--and the works of your dark master to whom you
            would deliver us. You are a liar, Saruman, and a corrupter
            of men's hearts. --Theoden in /usr/src/perl/taint.c
        FINIS
    $quote =~ s/\s+--/\n--/;

A nice general-purpose fixer-upper function for indented here documents follows. It expects to be called with a here document as its argument. It looks to see whether each line begins with a common substring, and if so, strips that substring off. Otherwise, it takes the amount of leading whitespace found on the first line and removes that much off each subsequent line.

    sub fix {
        local $_ = shift;
        my ($white, $leader); # common whitespace and common leading string
        if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\g1\g2?.*\n)+$/) {
            ($white, $leader) = ($2, quotemeta($1));
        } else {
            ($white, $leader) = (/^(\s+)/, '');
        }
        s/^\s*?$leader(?:$white)?//gm;
        return $_;
    }

This works with leading special strings, dynamically determined:

    my $remember_the_main = fix<<'    MAIN_INTERPRETER_LOOP';
    @@@ int
    @@@ runops() {
    @@@     SAVEI32(runlevel);
    @@@     runlevel++;
    @@@     while ( op = (*op->op_ppaddr)() );
    @@@     TAINT_NOT;
    @@@     return 0;
    @@@ }
    MAIN_INTERPRETER_LOOP

Or with a fixed amount of leading whitespace, with remaining indentation correctly preserved:

    my $poem = fix<<EVER_ON_AND_ON;
        Now far ahead the Road has gone,
        And I must follow, if I can,
        Pursuing it with eager feet,
        Until it joins some larger way
        Where many paths and errands meet.
        And whither then? I cannot say.
            --Bilbo in /usr/src/perl/pp_ctl.c
    EVER_ON_AND_ON

Data: Arrays

What is the difference between a list and an array?

(contributed by brian d foy)

A list is a fixed collection of scalars. An array is a variable that holds a variable collection of scalars. An array can supply its collection for list operations, so list operations also work on arrays:

    # slices
    ( 'dog', 'cat', 'bird' )[2,3];
    @animals[2,3];

    # iteration
    foreach ( qw( dog cat bird ) ) { ... }
    foreach ( @animals ) { ... }

    my @three = grep { length == 3 } qw( dog cat bird );
    my @three = grep { length == 3 } @animals;

    # supply an argument list
    wash_animals( qw( dog cat bird ) );
    wash_animals( @animals );

Array operations, which change the scalars, rearrange them, or add or subtract some scalars, only work on arrays. These can't work on a list, which is fixed. Array operations include shift, unshift, push, pop, and splice.

An array can also change its length:

    $#animals = 1;     # truncate to two elements
    $#animals = 10000; # pre-extend to 10,001 elements

You can change an array element, but you can't change a list element:

    $animals[0] = 'Rottweiler';
    qw( dog cat bird )[0] = 'Rottweiler'; # syntax error!

    foreach ( @animals ) {
        s/^d/fr/; # works fine
    }

    foreach ( qw( dog cat bird ) ) {
        s/^d/fr/; # Error! Modification of read only value!
    }

If the list element is itself a variable, though, it appears that you can change a list element. In that case the list element is the variable, not the data. You're not changing the list element, but something the list element refers to. The list element itself doesn't change: it's still the same variable.
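
A short sketch shows the difference between a list of variables and a list of literal values. The variable names are illustrative, and the eval exists only so the script can report the run-time error and carry on.

```perl
use strict;
use warnings;

my ( $cat, $dog ) = ( 'cat', 'dog' );

# foreach aliases $_ to each variable in the list, so the variables change.
foreach ( $cat, $dog ) {
    $_ = ucfirst;
}
print "$cat $dog\n";

# Literal values in a list are read-only, so the same loop body dies here.
my $ok = eval {
    foreach ( 'cat', 'dog' ) { $_ = ucfirst }
    1;
};
print "Literal list: $@" unless $ok;
```

In both loops the list itself stays the same; what differs is whether the things in it can be modified.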

You also have to be careful about context. You can assign an array to a scalar to get the number of elements in the array. This only works for arrays, though:

    my $count = @animals; # only works with arrays

If you try to do the same thing with what you think is a list, you get a quite different result. Although it looks like you have a list on the righthand side, Perl actually sees a bunch of scalars separated by a comma:

    my $scalar = ( 'dog', 'cat', 'bird' ); # $scalar gets bird

Since you're assigning to a scalar, the righthand side is in scalar context. The comma operator (yes, it's an operator!) in scalar context evaluates its lefthand side, throws away the result, and evaluates its righthand side and returns the result. In effect, that list-lookalike assigns to $scalar its rightmost value. Many people mess this up because they choose a list-lookalike whose last element is also the count they expect:

    my $scalar = ( 1, 2, 3 ); # $scalar gets 3, accidentally
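
If what you actually want is the number of items, you need list context somewhere on the righthand side. One common idiom, sketched here with illustrative names, assigns to an empty list first; going through an array works too.

```perl
use strict;
use warnings;

# Assigning to the empty list forces list context; the outer scalar
# assignment then receives the number of elements on the righthand side.
my $count = () = ( 'dog', 'cat', 'bird' );

# Alternatively, store the items in an array and use that in scalar context.
my @animals = ( 'dog', 'cat', 'bird' );
my $also    = @animals;

print "$count $also\n";
```

This is the same trick that counts regex matches with a global match in list context.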

What is the difference between $array[1] and @array[1]?

(contributed by brian d foy)

The difference is the sigil, that special character in front of the array name. The $ sigil means "exactly one item", while the @ sigil means "zero or more items". The $ gets you a single scalar, while the @ gets you a list.

The confusion arises because people incorrectly assume that the sigil denotes the variable type.

The $array[1] is a single-element access to the array. It's going to return the item in index 1 (or undef if there is no item there). If you intend to get exactly one element from the array, this is the form you should use.

The @array[1] is an array slice, although it has only one index. You can pull out multiple elements simultaneously by specifying additional indices as a list, like @array[1,4,3,0].

Using a slice on the lefthand side of the assignment supplies list context to the righthand side. This can lead to unexpected results. For instance, if you want to read a single line from a filehandle, assigning to a scalar value is fine:

    $array[1] = <STDIN>;

However, in list context, the line input operator returns all of the lines as a list. The first line goes into @array[1] and the rest of the lines mysteriously disappear:

    @array[1] = <STDIN>; # most likely not what you want

Either the use warnings pragma or the -w flag will warn you when you use an array slice with a single index.
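
You can watch the lines disappear without typing anything at a terminal. This sketch reads from an in-memory filehandle instead of STDIN; the sample data and variable names are made up, and the one-element-slice warning is switched off on purpose just for the demonstration.

```perl
use strict;
use warnings;

my $input = "first\nsecond\nthird\n";
my @array;

# Scalar assignment: reads one line and leaves the rest unread.
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";
$array[1] = <$fh>;
my $left_after_scalar = () = <$fh>;    # count the lines still unread
close $fh;

# Slice assignment: list context reads every line but keeps only the first.
open $fh, '<', \$input or die "Cannot open in-memory file: $!";
{
    no warnings 'syntax';              # silence the one-element-slice warning
    @array[1] = <$fh>;
}
my $left_after_slice = () = <$fh>;
close $fh;

print "scalar: $left_after_scalar lines left; ",
      "slice: $left_after_slice lines left\n";
```

In both cases $array[1] ends up holding the first line; the difference is what happens to the rest of the input.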

How can I remove duplicate elements from a list or array?

(contributed by brian d foy)

Use a hash. When you think the words "unique" or "duplicated", think "hash keys".

If you don't care about the order of the elements, you could just create the hash then extract the keys. It's not important how you create that hash: just that you use keys to get the unique elements.

    my %hash = map { $_, 1 } @array;
    # or a hash slice: @hash{ @array } = ();
    # or a foreach: $hash{$_} = 1 foreach ( @array );

    my @unique = keys %hash;

If you want to use a module, try the uniq function from List::MoreUtils. In list context it returns the unique elements, preserving their order in the list. In scalar context, it returns the number of unique elements.

    use List::MoreUtils qw(uniq);
    my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
    my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7

You can also go through each element and skip the ones you've seen before. Use a hash to keep track. The first time the loop sees an element, that element has no key in %Seen . The next statement creates the key and immediately uses its value, which is undef, so the loop continues to the push and increments the value for that key. The next time the loop sees that same element, its key exists in the hash and the value for that key is true (since it's not 0 or undef), so the next skips that iteration and the loop goes to the next element.

    my @unique = ();
    my %seen = ();
    foreach my $elem ( @array ) {
        next if $seen{ $elem }++;
        push @unique, $elem;
    }

You can write this more briefly using a grep, which does the same thing.

    my %seen = ();
    my @unique = grep { ! $seen{ $_ }++ } @array;

How can I tell whether a certain element is contained in a list or array?

(portions of this answer contributed by Anno Siegel and brian d foy)

Hearing the word "in" is an indication that you probably should have used a hash, not a list or array, to store your data. Hashes are designed to answer this question quickly and efficiently. Arrays aren't.

That being said, there are several ways to approach this. In Perl 5.10 and later, you can use the smart match operator to check that an item is contained in an array or a hash:

    use 5.010;
    if( $item ~~ @array ) {
        say "The array contains $item"
    }
    if( $item ~~ %hash ) {
        say "The hash contains $item"
    }

With earlier versions of Perl, you have to do a bit more work. If you are going to make this query many times over arbitrary string values, the fastest way is probably to invert the original array and maintain a hash whose keys are the first array's values:

    my @blues = qw/azure cerulean teal turquoise lapis-lazuli/;
    my %is_blue = ();
    for (@blues) { $is_blue{$_} = 1 }

Now you can check whether $is_blue{$some_color} . It might have been a good idea to keep the blues all in a hash in the first place.
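As a sketch of the same inversion written with map, followed by a couple of lookups (the color crimson is just an illustrative non-member):

```perl
use strict;
use warnings;

my @blues   = qw/azure cerulean teal turquoise lapis-lazuli/;
my %is_blue = map { $_ => 1 } @blues;    # invert the array into a lookup hash

for my $color (qw/teal crimson/) {
    print "$color is ", ( $is_blue{$color} ? "" : "not " ), "a blue\n";
}
```

Each lookup is a single hash access, no matter how many colors are in the list.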

If the values are all small integers, you could use a simple indexed array. This kind of an array will take up less space:

    my @primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31);
    my @is_tiny_prime = ();
    for (@primes) { $is_tiny_prime[$_] = 1 }
    # or simply @is_tiny_prime[@primes] = (1) x @primes;

Now you check whether $is_tiny_prime[$some_number].

If the values in question are integers instead of strings, you can save quite a lot of space by using bit strings instead:

    my @articles = ( 1..10, 150..2000, 2017 );
    my $read = '';    # the bit string starts out empty
    for (@articles) { vec($read,$_,1) = 1 }

Now check whether vec($read,$n,1) is true for some $n .
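Here is a minimal, self-contained sketch of that check. The article numbers tested are arbitrary illustrations:

```perl
use strict;
use warnings;

my @articles = ( 1 .. 10, 150 .. 2000, 2017 );
my $read = '';
vec( $read, $_, 1 ) = 1 for @articles;    # one bit per article number

for my $n ( 5, 42, 2017 ) {
    print "article $n: ", ( vec( $read, $n, 1 ) ? "read" : "unread" ), "\n";
}

# The whole set fits in a few hundred bytes
printf "bit string uses %d bytes\n", length $read;
```

The bit string costs one bit per possible article number up to the highest one stored.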

These methods guarantee fast individual tests but require a re-organization of the original list or array. They only pay off if you have to test multiple values against the same array.

If you are testing only once, the standard module List::Util exports the function first for this purpose. It works by stopping once it finds the element. It's written in C for speed, and its Perl equivalent looks like this subroutine:

    sub first (&@) {
        my $code = shift;
        foreach (@_) {
            return $_ if &{$code}();
        }
        undef;
    }
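A one-off membership test with first might look like the following sketch. Wrapping the call in defined() assumes that the value you are looking for is itself defined, since first returns undef when nothing matches:

```perl
use strict;
use warnings;
use List::Util qw(first);

my @array    = qw( apple banana cherry );
my $whatever = 'banana';

# first returns the matching element, or undef if nothing matched
my $is_there = defined( first { $_ eq $whatever } @array );
my $missing  = defined( first { $_ eq 'durian' } @array );

print "found $whatever\n" if $is_there;
print "no durian\n"       if !$missing;
```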

If speed is of little concern, the common idiom uses grep in scalar context (which returns the number of items that passed its condition) to traverse the entire list. This does have the benefit of telling you how many matches it found, though.

    my $is_there = grep $_ eq $whatever, @array;

If you want to actually extract the matching elements, simply use grep in list context.

    my @matches = grep $_ eq $whatever, @array;

How do I compute the difference of two arrays? How do I compute the intersection of two arrays?

Use a hash. Here's code to do both and more. It assumes that each element is unique in a given array:

    my (@union, @intersection, @difference);
    my %count = ();
    foreach my $element (@array1, @array2) { $count{$element}++ }
    foreach my $element (keys %count) {
        push @union, $element;
        push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
    }

Note that this is the symmetric difference, that is, all elements in either A or in B but not in both. Think of it as an xor operation.
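If you want the one-sided difference instead (elements of the first array that are not in the second), a sketch along the same lines is to build a lookup hash from one array and grep the other. The sample data here is purely illustrative:

```perl
use strict;
use warnings;

my @array1 = qw( a b c d );
my @array2 = qw( c d e );

my %in_array2 = map { $_ => 1 } @array2;
my @only_in_1 = grep { !$in_array2{$_} } @array1;

print "@only_in_1\n";    # a b
```

Unlike the counting approach, this keeps the original order of @array1 and does not require the elements of either array to be unique.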

How do I test whether two arrays or hashes are equal?

With Perl 5.10 and later, the smart match operator can give you the answer with the least amount of work:

    use 5.010;
    if( @array1 ~~ @array2 ) {
        say "The arrays are the same";
    }
    if( %hash1 ~~ %hash2 ) { # doesn't check values!
        say "The hash keys are the same";
    }

The following code works for single-level arrays. It uses a stringwise comparison, and does not distinguish defined versus undefined empty strings. Modify if you have other needs.

    $are_equal = compare_arrays(\@frogs, \@toads);
    sub compare_arrays {
        my ($first, $second) = @_;
        no warnings; # silence spurious -w undef complaints
        return 0 unless @$first == @$second;
        for (my $i = 0; $i < @$first; $i++) {
            return 0 if $first->[$i] ne $second->[$i];
        }
        return 1;
    }

For multilevel structures, you may wish to use an approach more like this one. It uses the CPAN module FreezeThaw:

    use FreezeThaw qw(cmpStr);
    my @a = my @b = ( "this", "that", [ "more", "stuff" ] );
    printf "a and b contain %s arrays\n",
        cmpStr(\@a, \@b) == 0
        ? "the same"
        : "different";

This approach also works for comparing hashes. Here we'll demonstrate two different answers:

    use FreezeThaw qw(cmpStr cmpStrHard);
    my %a = my %b = ( "this" => "that", "extra" => [ "more", "stuff" ] );
    $a{EXTRA} = \%b;
    $b{EXTRA} = \%a;
    printf "a and b contain %s hashes\n",
        cmpStr(\%a, \%b) == 0 ? "the same" : "different";
    printf "a and b contain %s hashes\n",
        cmpStrHard(\%a, \%b) == 0 ? "the same" : "different";

The first reports that both of the hashes contain the same data, while the second reports that they do not. Which you prefer is left as an exercise to the reader.

How do I find the first array element for which a condition is true?

To find the first array element which satisfies a condition, you can use the first() function in the List::Util module, which comes with Perl 5.8. This example finds the first element that contains "Perl".

    use List::Util qw(first);
    my $element = first { /Perl/ } @array;

If you cannot use List::Util, you can make your own loop to do the same thing. Once you find the element, you stop the loop with last.

    my $found;
    foreach ( @array ) {
        if( /Perl/ ) { $found = $_; last }
    }

If you want the array index, use the firstidx() function from List::MoreUtils :

    use List::MoreUtils qw(firstidx);
    my $index = firstidx { /Perl/ } @array;

Or write it yourself, iterating through the indices and checking the array element at each index until you find one that satisfies the condition:

    my( $found, $index ) = ( undef, -1 );
    for( my $i = 0; $i < @array; $i++ ) {
        if( $array[$i] =~ /Perl/ ) {
            $found = $array[$i];
            $index = $i;
            last;
        }
    }

How do I handle linked lists?

(contributed by brian d foy)

Perl's arrays do not have a fixed size, so you don't need linked lists if you just want to add or remove items. You can use array operations such as push, pop, shift, unshift, or splice to do that.

Sometimes, however, linked lists can be useful in situations where you want to "shard" an array so you have many small arrays instead of a single big array. You can keep arrays longer than Perl's largest array index, lock smaller arrays separately in threaded programs, reallocate less memory, or quickly insert elements in the middle of the chain.

Steve Lembark goes through the details in his YAPC::NA 2009 talk "Perly Linked Lists" ( http://www.slideshare.net/lembark/perly-linked-lists ), although you can just use his LinkedList::Single module.

How do I handle circular lists?

(contributed by brian d foy)

If you want to cycle through an array endlessly, you can increment the index modulo the number of elements in the array:

    my @array = qw( a b c );
    my $i = 0;
    while( 1 ) {
        print $array[ $i++ % @array ], "\n";
        last if $i > 20;
    }

You can also use Tie::Cycle to use a scalar that always has the next element of the circular array:

    use Tie::Cycle;
    tie my $cycle, 'Tie::Cycle', [ qw( FFFFFF 000000 FFFF00 ) ];
    print $cycle; # FFFFFF
    print $cycle; # 000000
    print $cycle; # FFFF00

The Array::Iterator::Circular module creates an iterator object for circular arrays:

    use Array::Iterator::Circular;
    my $color_iterator = Array::Iterator::Circular->new(
        qw(red green blue orange)
    );
    foreach ( 1 .. 20 ) {
        print $color_iterator->next, "\n";
    }

How do I shuffle an array randomly?

If you have either Perl 5.8.0 or later, or Scalar-List-Utils 1.03 or later, installed, you can say:

    use List::Util 'shuffle';
    @shuffled = shuffle(@list);

If not, you can use a Fisher-Yates shuffle.

    sub fisher_yates_shuffle {
        my $deck = shift; # $deck is a reference to an array
        return unless @$deck; # must not be empty!
        my $i = @$deck;
        while (--$i) {
            my $j = int rand ($i+1);
            @$deck[$i,$j] = @$deck[$j,$i];
        }
    }

    # shuffle my mpeg collection
    #
    my @mpeg = <audio/*/*.mp3>;
    fisher_yates_shuffle( \@mpeg ); # randomize @mpeg in place
    print @mpeg;

Note that the above implementation shuffles an array in place, unlike the List::Util::shuffle() which takes a list and returns a new shuffled list.

You've probably seen shuffling algorithms that work using splice, randomly picking an element to remove from the old array and append to the new one:

    srand;
    @new = ();
    @old = 1 .. 10; # just a demo
    while (@old) {
        push(@new, splice(@old, rand @old, 1));
    }

This is bad because splice is already O(N), and since you do it N times, you just invented a quadratic algorithm; that is, O(N**2). This does not scale, although Perl is so efficient that you probably won't notice this until you have rather largish arrays.

How do I process/modify each element of an array?

Use for/foreach:

    for (@lines) {
        s/foo/bar/; # change that word
        tr/XZ/ZX/; # swap those letters
    }

Here's another; let's compute spherical volumes:

    my @volumes = @radii;
    for (@volumes) { # @volumes has changed parts
        $_ **= 3;
        $_ *= (4/3) * 3.14159; # this will be constant folded
    }

which can also be done with map() which is made to transform one list into another:

    my @volumes = map {$_ ** 3 * (4/3) * 3.14159} @radii;

If you want to do the same thing to modify the values of the hash, you can use the values function. As of Perl 5.6 the values are not copied, so if you modify $orbit (in this case), you modify the value.

    for my $orbit ( values %orbits ) {
        ($orbit **= 3) *= (4/3) * 3.14159;
    }

Prior to perl 5.6 values returned copies of the values, so older perl code often contains constructions such as @orbits{keys %orbits} instead of values %orbits where the hash is to be modified.

How do I select a random element from an array?

Use the rand() function (see rand):

    my $index = rand @array;
    my $element = $array[$index];

Or, simply:

    my $element = $array[ rand @array ];

How do I permute N elements of a list?

Use the List::Permutor module on CPAN. If the list is actually an array, try the Algorithm::Permute module (also on CPAN). It's written in XS code and is very efficient:

    use Algorithm::Permute;
    my @array = 'a'..'d';
    my $p_iterator = Algorithm::Permute->new ( \@array );
    while (my @perm = $p_iterator->next) {
        print "next permutation: (@perm)\n";
    }

For even faster execution, you could do:

    use Algorithm::Permute;
    my @array = 'a'..'d';
    Algorithm::Permute::permute {
        print "next permutation: (@array)\n";
    } @array;

Here's a little program that generates all permutations of all the words on each line of input. The algorithm embodied in the permute() function is discussed in Volume 4 (still unpublished) of Knuth's The Art of Computer Programming and will work on any list:

    #!/usr/bin/perl -n
    # Fischer-Krause ordered permutation generator
    sub permute (&@) {
        my $code = shift;
        my @idx = 0..$#_;
        while ( $code->(@_[@idx]) ) {
            my $p = $#idx;
            --$p while $idx[$p-1] > $idx[$p];
            my $q = $p or return;
            push @idx, reverse splice @idx, $p;
            ++$q while $idx[$p-1] > $idx[$q];
            @idx[$p-1,$q]=@idx[$q,$p-1];
        }
    }
    permute { print "@_\n" } split;

The Algorithm::Loops module also provides the NextPermute and NextPermuteNum functions which efficiently find all unique permutations of an array, even if it contains duplicate values, modifying it in-place: if its elements are in reverse-sorted order then the array is reversed, making it sorted, and it returns false; otherwise the next permutation is returned.

NextPermute uses string order and NextPermuteNum numeric order, so you can enumerate all the permutations of 0..9 like this:

    use Algorithm::Loops qw(NextPermuteNum);
    my @list= 0..9;
    do { print "@list\n" } while NextPermuteNum @list;

How do I sort an array by (anything)?

Supply a comparison function to sort() (described in sort):

    @list = sort { $a <=> $b } @list;

The default sort function is cmp, string comparison, which would sort (1, 2, 10) into (1, 10, 2) . <=> , used above, is the numerical comparison operator.
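A short sketch that shows the two orderings side by side, using the same three numbers:

```perl
use strict;
use warnings;

my @list = ( 10, 2, 1 );

my @as_strings = sort @list;                  # default: string comparison
my @as_numbers = sort { $a <=> $b } @list;    # numeric comparison

print "@as_strings\n";    # 1 10 2
print "@as_numbers\n";    # 1 2 10
```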

If you have a complicated function needed to pull out the part you want to sort on, then don't do it inside the sort function. Pull it out first, because the sort BLOCK can be called many times for the same element. Here's an example of how to pull out the first word after the first number on each item, and then sort those words case-insensitively.

    my @idx;
    for (@data) {
        my $item;
        ($item) = /\d+\s*(\S+)/;
        push @idx, uc($item);
    }
    my @sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. $#idx ];

which could also be written this way, using a trick that's come to be known as the Schwartzian Transform:

    my @sorted = map { $_->[0] }
        sort { $a->[1] cmp $b->[1] }
        map { [ $_, uc( (/\d+\s*(\S+)/)[0]) ] } @data;

If you need to sort on several fields, the following paradigm is useful.

    my @sorted = sort {
        field1($a) <=> field1($b) ||
        field2($a) cmp field2($b) ||
        field3($a) cmp field3($b)
    } @data;

This can be conveniently combined with precalculation of keys as given above.
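As a sketch of that combination, here is a multi-field sort whose keys are computed once per record. The colon-separated record format and the field names are assumptions made up for this illustration:

```perl
use strict;
use warnings;

# Hypothetical records in the form "name:department:age"
my @data = ( "carol:ops:41", "alice:dev:29", "bob:dev:29", "dave:dev:35" );

my @sorted =
    map  { $_->[0] }                     # recover the original record
    sort {
           $a->[1] cmp $b->[1]           # department, as a string
        || $a->[2] <=> $b->[2]           # then age, numerically
        || $a->[3] cmp $b->[3]           # then name, as a string
    }
    map  { my ( $name, $dept, $age ) = split /:/; [ $_, $dept, $age, $name ] }
    @data;

print "$_\n" for @sorted;
```

Each record is split exactly once, however many times the sort block compares it.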

See the sort article in the "Far More Than You Ever Wanted To Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz for more about this approach.

See also the question later in perlfaq4 on sorting hashes.

How do I manipulate arrays of bits?

Use pack() and unpack(), or else vec() and the bitwise operations.

For example, you don't have to store individual bits in an array (which would mean that you're wasting a lot of space). To convert an array of bits to a string, use vec() to set the right bits. This sets $vec to have bit N set only if $ints[N] was set:

    my @ints = (...); # array of bits, e.g. ( 1, 0, 0, 1, 1, 0 ... )
    my $vec = '';
    foreach( 0 .. $#ints ) {
        vec($vec,$_,1) = 1 if $ints[$_];
    }

The string $vec only takes up as many bits as it needs. For instance, if you had 16 entries in @ints , $vec only needs two bytes to store them (not counting the scalar variable overhead).
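To check the arithmetic, here is a sketch with 16 illustrative bits. The last bit is deliberately set, because vec() only extends the string as far as the highest bit you assign to:

```perl
use strict;
use warnings;

my @ints = ( 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 );    # 16 bits
my $vec  = '';
for my $n ( 0 .. $#ints ) {
    vec( $vec, $n, 1 ) = 1 if $ints[$n];
}

printf "%d bits packed into %d bytes\n", scalar @ints, length $vec;
print unpack( "b*", $vec ), "\n";    # 1001100000000001
```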

Here's how, given a vector in $vec , you can get those bits into your @ints array:

    sub bitvec_to_list {
        my $vec = shift;
        my @ints;
        # Find null-byte density then select best algorithm
        if ($vec =~ tr/\0// / length $vec > 0.95) {
            use integer;
            my $i;
            # This method is faster with mostly null-bytes
            while($vec =~ /[^\0]/g ) {
                $i = -9 + 8 * pos $vec;
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
                push @ints, $i if vec($vec, ++$i, 1);
            }
        }
        else {
            # This method is a fast general algorithm
            use integer;
            my $bits = unpack "b*", $vec;
            push @ints, 0 if $bits =~ s/^(\d)// && $1;
            push @ints, pos $bits while($bits =~ /1/g);
        }
        return \@ints;
    }

This method gets faster the more sparse the bit vector is. (Courtesy of Tim Bunce and Winfried Koenig.)

You can make the while loop a lot shorter with this suggestion from Benjamin Goldberg:

    while($vec =~ /[^\0]+/g ) {
        push @ints, grep vec($vec, $_, 1), $-[0] * 8 .. $+[0] * 8;
    }

Or use the CPAN module Bit::Vector:

    use Bit::Vector;
    my $vector = Bit::Vector->new($num_of_bits);
    $vector->Index_List_Store(@ints);
    my @ints = $vector->Index_List_Read();

Bit::Vector provides efficient methods for bit vectors, sets of small integers, and "big int" math.

Here's a more extensive illustration using vec():

    # vec demo
    my $vector = "\xff\x0f\xef\xfe";
    print "Ilya's string \\xff\\x0f\\xef\\xfe represents the number ",
        unpack("N", $vector), "\n";
    my $is_set = vec($vector, 23, 1);
    print "Its 23rd bit is ", $is_set ? "set" : "clear", ".\n";
    pvec($vector);

    set_vec(1,1,1);
    set_vec(3,1,1);
    set_vec(23,1,1);

    set_vec(3,1,3);
    set_vec(3,2,3);
    set_vec(3,4,3);
    set_vec(3,4,7);
    set_vec(3,8,3);
    set_vec(3,8,7);

    set_vec(0,32,17);
    set_vec(1,32,17);

    sub set_vec {
        my ($offset, $width, $value) = @_;
        my $vector = '';
        vec($vector, $offset, $width) = $value;
        print "offset=$offset width=$width value=$value\n";
        pvec($vector);
    }

    sub pvec {
        my $vector = shift;
        my $bits = unpack("b*", $vector);
        print "vector length in bytes: ", length($vector), "\n";
        my @bytes = unpack("A8" x length($vector), $bits);
        print "bits are: @bytes\n\n";
    }

Why does defined() return true on empty arrays and hashes?

The short story is that you should probably only use defined on scalars or functions, not on aggregates (arrays and hashes). See defined in the 5.004 release or later of Perl for more detail.

Data: Hashes (Associative Arrays)

How do I process an entire hash?

(contributed by brian d foy)

There are a couple of ways that you can process an entire hash. You can get a list of keys, then go through each key, or grab one key-value pair at a time.

To go through all of the keys, use the keys function. This extracts all of the keys of the hash and gives them back to you as a list. You can then get the value through the particular key you're processing:

    foreach my $key ( keys %hash ) {
        my $value = $hash{$key};
        ...
    }

Once you have the list of keys, you can process that list before you process the hash elements. For instance, you can sort the keys so you can process them in lexical order:

    foreach my $key ( sort keys %hash ) {
        my $value = $hash{$key};
        ...
    }

Or, you might want to only process some of the items. If you only want to deal with the keys that start with text: , you can select just those using grep:

    foreach my $key ( grep /^text:/, keys %hash ) {
        my $value = $hash{$key};
        ...
    }

If the hash is very large, you might not want to create a long list of keys. To save some memory, you can grab one key-value pair at a time using each(), which returns a pair you haven't seen yet:

    while( my( $key, $value ) = each( %hash ) ) {
        ...
    }

The each operator returns the pairs in apparently random order, so if ordering matters to you, you'll have to stick with the keys method.

The each() operator can be a bit tricky though. You can't add or delete keys of the hash while you're using it without possibly skipping or re-processing some pairs after Perl internally rehashes all of the elements. Additionally, a hash has only one iterator, so if you mix keys, values, or each on the same hash, you risk resetting the iterator and messing up your processing. See the each entry in perlfunc for more details.

How do I merge two hashes?

(contributed by brian d foy)

Before you decide to merge two hashes, you have to decide what to do if both hashes contain keys that are the same and if you want to leave the original hashes as they were.

If you want to preserve the original hashes, copy one hash (%hash1) to a new hash (%new_hash), then add the keys from the other hash (%hash2) to the new hash. Checking that the key already exists in %new_hash gives you a chance to decide what to do with the duplicates:

    my %new_hash = %hash1; # make a copy; leave %hash1 alone
    foreach my $key2 ( keys %hash2 ) {
        if( exists $new_hash{$key2} ) {
            warn "Key [$key2] is in both hashes!";
            # handle the duplicate (perhaps only warning)
            ...;
            next;
        }
        else {
            $new_hash{$key2} = $hash2{$key2};
        }
    }

If you don't want to create a new hash, you can still use this looping technique; just change the %new_hash to %hash1 .

    foreach my $key2 ( keys %hash2 ) {
        if( exists $hash1{$key2} ) {
            warn "Key [$key2] is in both hashes!";
            # handle the duplicate (perhaps only warning)
            ...;
            next;
        }
        else {
            $hash1{$key2} = $hash2{$key2};
        }
    }

If you don't care that one hash overwrites keys and values from the other, you could just use a hash slice to add one hash to another. In this case, values from %hash2 replace values from %hash1 when they have keys in common:

    @hash1{ keys %hash2 } = values %hash2;
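If you want the same "last one wins" behavior but would rather leave both originals untouched, one sketch is to flatten both hashes into a list and assign that to a new hash. The sample data is illustrative:

```perl
use strict;
use warnings;

my %hash1 = ( a => 1,  b => 2 );
my %hash2 = ( b => 20, c => 30 );

# Later pairs in the list overwrite earlier ones with the same key
my %merged = ( %hash1, %hash2 );

print join( ", ", map { "$_=$merged{$_}" } sort keys %merged ), "\n";    # a=1, b=20, c=30
```

This copies every pair, so for very large hashes the hash slice above uses less memory.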

What happens if I add or remove keys from a hash while iterating over it?

(contributed by brian d foy)

The easy answer is "Don't do that!"

If you iterate through the hash with each(), you can delete the key most recently returned without worrying about it. If you delete or add other keys, the iterator may skip or double up on them since perl may rearrange the hash table. See the entry for each() in perlfunc.
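Here is a sketch of the one modification that is safe during an each() loop, deleting the key that was just returned. The account data is made up for the example:

```perl
use strict;
use warnings;

my %balance = ( alice => 10, bob => -3, carol => 0, dave => -7 );

while ( my ( $name, $amount ) = each %balance ) {
    # Deleting the key that each() just returned is safe
    delete $balance{$name} if $amount < 0;
}

print join( ", ", sort keys %balance ), "\n";    # alice, carol
```

For anything more involved, such as adding keys, it is simpler to loop over a copy of the key list from keys() instead.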

How do I look up a hash element by value?

Create a reverse hash:

    my %by_value = reverse %by_key;
    my $key = $by_value{$value};

That's not particularly efficient. It would be more space-efficient to use:

    while (my ($key, $value) = each %by_key) {
        $by_value{$value} = $key;
    }

If your hash could have repeated values, the methods above will only find one of the associated keys. This may or may not worry you. If it does worry you, you can always reverse the hash into a hash of arrays instead:

    while (my ($key, $value) = each %by_key) {
        push @{$key_list_by_value{$value}}, $key;
    }
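A small worked example of the hash-of-arrays approach, with illustrative fruit-and-color data, shows how repeated values keep all of their keys:

```perl
use strict;
use warnings;

my %by_key = ( apple => 'red', cherry => 'red', banana => 'yellow' );

my %key_list_by_value;
while ( my ( $key, $value ) = each %by_key ) {
    push @{ $key_list_by_value{$value} }, $key;
}

for my $value ( sort keys %key_list_by_value ) {
    my @keys = sort @{ $key_list_by_value{$value} };
    print "$value: @keys\n";
}
```

The inner sort is only there to make the output repeatable, since each() returns pairs in no particular order.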

How can I know how many entries are in a hash?

(contributed by brian d foy)

This is very similar to "How do I process an entire hash?", also in perlfaq4, but a bit simpler in the common cases.

You can use the keys() built-in function in scalar context to find out how many entries you have in a hash:

    my $key_count = keys %hash; # must be scalar context!

If you want to find out how many entries have a defined value, that's a bit different. You have to check each value. A grep is handy:

    my $defined_value_count = grep { defined } values %hash;

You can use that same structure to count the entries any way that you like. If you want the count of the keys with vowels in them, you just test for that instead:

    my $vowel_count = grep { /[aeiou]/ } keys %hash;

The grep in scalar context returns the count. If you want the list of matching items, just use it in list context instead:

    my @defined_values = grep { defined } values %hash;

The keys() function also resets the iterator, which means that you may see strange results if you use this between uses of other hash operators such as each().

How do I sort a hash (optionally by value instead of key)?

(contributed by brian d foy)

To sort a hash, start with the keys. In this example, we give the list of keys to the sort function which then compares them ASCIIbetically (which might be affected by your locale settings). The output list has the keys in ASCIIbetical order. Once we have the keys, we can go through them to create a report which lists the keys in ASCIIbetical order.

    my @keys = sort { $a cmp $b } keys %hash;
    foreach my $key ( @keys ) {
        printf "%-20s %6d\n", $key, $hash{$key};
    }

We could get more fancy in the sort() block though. Instead of comparing the keys, we can compute a value with them and use that value as the comparison.

For instance, to make our report order case-insensitive, we use lc to lowercase the keys before comparing them:

    my @keys = sort { lc $a cmp lc $b } keys %hash;

Note: if the computation is expensive or the hash has many elements, you may want to look at the Schwartzian Transform to cache the computation results.

If we want to sort by the hash value instead, we use the hash key to look it up. We still get out a list of keys, but this time they are ordered by their value.

    my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash;

From there we can get more complex. If the hash values are the same, we can provide a secondary sort on the hash key.

    my @keys = sort {
        $hash{$a} <=> $hash{$b}
            or
        "\L$a" cmp "\L$b"
    } keys %hash;
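To reverse the order of one field, swap $a and $b for that comparison only. This sketch, with made-up counts, sorts by value from highest to lowest and breaks ties by key in ascending order:

```perl
use strict;
use warnings;

my %hash = ( apple => 3, banana => 7, cherry => 3 );

my @keys = sort {
    $hash{$b} <=> $hash{$a}    # values, descending ($b before $a)
        or
    $a cmp $b                  # keys, ascending
} keys %hash;

print "@keys\n";    # banana apple cherry
```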

How can I always keep my hash sorted?

You can look into using the DB_File module and tie() using the $DB_BTREE hash bindings as documented in In Memory Databases in DB_File. The Tie::IxHash module from CPAN might also be instructive. Although this does keep your hash sorted, you might not like the slowdown you suffer from the tie interface. Are you sure you need to do this? :)

What's the difference between "delete" and "undef" with hashes?

Hashes contain pairs of scalars: the first is the key, the second is the value. The key will be coerced to a string, although the value can be any kind of scalar: string, number, or reference. If a key $key is present in %hash, exists($hash{$key}) will return true. The value for a given key can be undef, in which case $hash{$key} will be undef while exists $hash{$key} will return true. This corresponds to ($key , undef) being in the hash.

Pictures help... Here's the %hash table:

      keys  values
    +------+------+
    |  a   |  3   |
    |  x   |  7   |
    |  d   |  0   |
    |  e   |  2   |
    +------+------+

And these conditions hold

    $hash{'a'}                       is true
    $hash{'d'}                       is false
    defined $hash{'d'}               is true
    defined $hash{'a'}               is true
    exists $hash{'a'}                is true (Perl 5 only)
    grep ($_ eq 'a', keys %hash)     is true

If you now say

    undef $hash{'a'}

your table now reads:

      keys  values
    +------+------+
    |  a   | undef|
    |  x   |  7   |
    |  d   |  0   |
    |  e   |  2   |
    +------+------+

and these conditions now hold; changes in caps:

    $hash{'a'}                       is FALSE
    $hash{'d'}                       is false
    defined $hash{'d'}               is true
    defined $hash{'a'}               is FALSE
    exists $hash{'a'}                is true (Perl 5 only)
    grep ($_ eq 'a', keys %hash)     is true

Notice the last two: you have an undef value, but a defined key!

Now, consider this:

    delete $hash{'a'}

your table now reads:

      keys  values
    +------+------+
    |  x   |  7   |
    |  d   |  0   |
    |  e   |  2   |
    +------+------+

and these conditions now hold; changes in caps:

    $hash{'a'}                       is false
    $hash{'d'}                       is false
    defined $hash{'d'}               is true
    defined $hash{'a'}               is false
    exists $hash{'a'}                is FALSE (Perl 5 only)
    grep ($_ eq 'a', keys %hash)     is FALSE

See, the whole entry is gone!
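You can watch the same three states in a short script. The state_of() helper is a hypothetical name used only for this sketch:

```perl
use strict;
use warnings;

my %hash = ( a => 3, x => 7, d => 0, e => 2 );

# Hypothetical helper: describes one key of %hash
sub state_of {
    my ($key) = @_;
    return join ' ',
        ( exists  $hash{$key} ? 'exists'  : 'missing' ),
        ( defined $hash{$key} ? 'defined' : 'undef' );
}

my $before = state_of('a');          # exists defined
undef $hash{a};
my $after_undef = state_of('a');     # exists undef
delete $hash{a};
my $after_delete = state_of('a');    # missing undef

print "$before / $after_undef / $after_delete\n";
```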

Why don't my tied hashes make the defined/exists distinction?

This depends on the tied hash's implementation of EXISTS(). For example, there isn't the concept of undef with hashes that are tied to DBM* files. It also means that exists() and defined() do the same thing with a DBM* file, and what they end up doing is not what they do with ordinary hashes.

How do I reset an each() operation part-way through?

(contributed by brian d foy)

You can use the keys or values functions to reset each. To simply reset the iterator used by each without doing anything else, use one of them in void context:

    keys %hash; # resets iterator, nothing else.
    values %hash; # resets iterator, nothing else.

See the documentation for each in perlfunc.
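As a minimal sketch of the reset in action, the following advances the iterator by one pair, resets it, and confirms that each() starts again from the same first key. The hash contents are illustrative:

```perl
use strict;
use warnings;

my %hash = ( one => 1, two => 2, three => 3 );

my ($first) = each %hash;    # advance the iterator by one pair
keys %hash;                  # void context: just resets the iterator
my ($again) = each %hash;    # starts over from the beginning

print "iterator restarted at '$again'\n" if $first eq $again;
```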

How can I get the unique keys from two hashes?

First you extract the keys from the hashes into lists, then solve the "removing duplicates" problem described above. For example:

    my %seen = ();
    for my $element (keys(%foo), keys(%bar)) {
        $seen{$element}++;
    }
    my @uniq = keys %seen;

Or more succinctly:

    my @uniq = keys %{{%foo,%bar}};

Or if you really want to save space:

    my %seen = ();
    while (defined (my $key = each %foo)) {
        $seen{$key}++;
    }
    while (defined (my $key = each %bar)) {
        $seen{$key}++;
    }
    my @uniq = keys %seen;

How can I store a multidimensional array in a DBM file?

Either stringify the structure yourself (no fun), or else get the MLDBM (which uses Data::Dumper) module from CPAN and layer it on top of either DB_File or GDBM_File. You might also try DBM::Deep, but it can be a bit slow.
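For a feel of what "stringify the structure yourself" involves, here is a sketch that uses the core Storable module rather than MLDBM. That substitution is an assumption made for this example, and a plain hash stands in for the tied DBM hash, whose values must be flat strings:

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# A plain hash stands in for a hash tied to a DBM file
my %dbm;

my @matrix = ( [ 1, 2, 3 ], [ 4, 5, 6 ] );

$dbm{matrix} = freeze( \@matrix );    # serialize to a flat string before storing
my $copy = thaw( $dbm{matrix} );      # rebuild the structure after fetching

print $copy->[1][2], "\n";    # 6
```

Every change to the nested data means fetching, thawing, modifying, and freezing the whole value again, which is the bookkeeping MLDBM hides from you.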

How can I make my hash remember the order I put elements into it?

Use the Tie::IxHash module from CPAN.

    use Tie::IxHash;
    tie my %myhash, 'Tie::IxHash';
    for (my $i=0; $i<20; $i++) {
        $myhash{$i} = 2*$i;
    }
    my @keys = keys %myhash;
    # @keys = (0,1,2,3,...)

Why does passing a subroutine an undefined element in a hash create it?

(contributed by brian d foy)

Are you using a really old version of Perl?

Normally, accessing a hash key's value for a nonexistent key will not create the key.

    my %hash = ();
    my $value = $hash{ 'foo' };
    print "This won't print\n" if exists $hash{ 'foo' };

Passing $hash{ 'foo' } to a subroutine used to be a special case, though. Since you could assign directly to $_[0] , Perl had to be ready to make that assignment so it created the hash key ahead of time:

    my_sub( $hash{ 'foo' } );
    print "This will print before 5.004\n" if exists $hash{ 'foo' };
    sub my_sub {
        # $_[0] = 'bar'; # create hash key in case you do this
        1;
    }

Since Perl 5.004, however, this situation is a special case and Perl creates the hash key only when you make the assignment:

    my_sub( $hash{ 'foo' } );
    print "This will print, even after 5.004\n" if exists $hash{ 'foo' };
    sub my_sub {
        $_[0] = 'bar';
    }

However, if you want the old behavior (and think carefully about that because it's a weird side effect), you can pass a hash slice instead. Perl 5.004 didn't make this a special case:

    my_sub( @hash{ qw/foo/ } );

How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays?

Usually a hash ref, perhaps like this:

    $record = {
        NAME   => "Jason",
        EMPNO  => 132,
        TITLE  => "deputy peon",
        AGE    => 23,
        SALARY => 37_000,
        PALS   => [ "Norbert", "Rhys", "Phineas"],
    };

References are documented in perlref and perlreftut. Examples of complex data structures are given in perldsc and perllol. Examples of structures and object-oriented classes are in perltoot.
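As a sketch of how such a record is used, here are the common access patterns. The extra pal added at the end is a made-up name for illustration:

```perl
use strict;
use warnings;

my $record = {
    NAME   => "Jason",
    EMPNO  => 132,
    TITLE  => "deputy peon",
    AGE    => 23,
    SALARY => 37_000,
    PALS   => [ "Norbert", "Rhys", "Phineas" ],
};

print "$record->{NAME} is employee number $record->{EMPNO}\n";

$record->{AGE}++;                      # fields are ordinary scalars
print "first pal: $record->{PALS}[0]\n";

push @{ $record->{PALS} }, "Ferb";     # dereference the nested array to modify it
my $pal_count = @{ $record->{PALS} };  # array in scalar context gives its size
print "$pal_count pals\n";
```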

How can I use a reference as a hash key?

(contributed by brian d foy and Ben Morrow)

Hash keys are strings, so you can't really use a reference as the key. When you try to do that, perl turns the reference into its stringified form (for instance, HASH(0xDEADBEEF) ). From there you can't get back the reference from the stringified form, at least without doing some extra work on your own.

Remember that the entry in the hash will still be there even if the referenced variable goes out of scope, and that it is entirely possible for Perl to subsequently allocate a different variable at the same address. This will mean a new variable might accidentally be associated with the value for an old.

If you have Perl 5.10 or later, and you just want to store a value against the reference for lookup later, you can use the core Hash::Util::Fieldhash module. This will also handle renaming the keys if you use multiple threads (which causes all variables to be reallocated at new addresses, changing their stringification), and garbage-collecting the entries when the referenced variable goes out of scope.

If you actually need to be able to get a real reference back from each hash entry, you can use the Tie::RefHash module, which does the required work for you.
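
As a rough illustration of the difference described above, the following sketch stores the same array reference in a plain hash and in a hash tied to Tie::RefHash. The variable names and data are made up for the example.

```perl
use strict;
use warnings;
use Tie::RefHash;

my $point = [ 3, 4 ];    # hypothetical data to use as a key

# Plain hash: the reference is flattened to a string such as "ARRAY(0x...)".
my %plain = ( $point => 'stored under a string' );
my ($plain_key) = keys %plain;
print "plain key is a ", ( ref $plain_key ? 'reference' : 'string' ), "\n";

# Tied hash: Tie::RefHash hands back the original reference.
tie my %by_ref, 'Tie::RefHash';
$by_ref{$point} = 'stored under a real reference';
my ($ref_key) = keys %by_ref;
print "tied key is a ", ( ref $ref_key ? 'reference' : 'string' ), "\n";
print "first element via key: $ref_key->[0]\n";
```

With the plain hash you can only look the value up again if you still hold the original reference; with the tied hash the key itself can be dereferenced.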

How can I check if a key exists in a multilevel hash?

(contributed by brian d foy)

The trick to this problem is avoiding accidental autovivification. If you want to check three keys deep, you might naïvely try this:

  1. my %hash;
  2. if( exists $hash{key1}{key2}{key3} ) {
  3. ...;
  4. }

Even though you started with a completely empty hash, after that call to exists you've created the structure you needed to check for key3 :

  1. %hash = (
  2. 'key1' => {
  3. 'key2' => {}
  4. }
  5. );

That's autovivification. You can get around this in a few ways. The easiest way is to just turn it off. The lexical autovivification pragma is available on CPAN. Now you don't add to the hash:

  1. {
  2. no autovivification;
  3. my %hash;
  4. if( exists $hash{key1}{key2}{key3} ) {
  5. ...;
  6. }
  7. }

The Data::Diver module on CPAN can do it for you too. Its Dive subroutine can tell you not only if the keys exist but also get the value:

  1. use Data::Diver qw(Dive);
  2. my @exists = Dive( \%hash, qw(key1 key2 key3) );
  3. if( ! @exists ) {
  4. ...; # keys do not exist
  5. }
  6. elsif( ! defined $exists[0] ) {
  7. ...; # keys exist but value is undef
  8. }

You can easily do this yourself too by checking each level of the hash before you move onto the next level. This is essentially what Data::Diver does for you:

  1. if( check_hash( \%hash, qw(key1 key2 key3) ) ) {
  2. ...;
  3. }
  4. sub check_hash {
  5. my( $hash, @keys ) = @_;
  6. return unless @keys;
  7. foreach my $key ( @keys ) {
  8. return unless eval { exists $hash->{$key} };
  9. $hash = $hash->{$key};
  10. }
  11. return 1;
  12. }

How can I prevent addition of unwanted keys into a hash?

Since version 5.8.0, hashes can be restricted to a fixed number of given keys. Methods for creating and dealing with restricted hashes are exported by the Hash::Util module.
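
A minimal sketch of a restricted hash, assuming a made-up %config hash; lock_keys and unlock_keys are the Hash::Util functions used here:

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys unlock_keys);

my %config = ( host => 'localhost', port => 8080 );    # hypothetical settings
lock_keys(%config);        # only 'host' and 'port' are legal from now on

$config{port} = 9090;      # changing an existing key is still allowed

# Storing under a key outside the permitted set dies, so trap it with eval.
my $ok = eval { $config{prot} = 'typo'; 1 };
print "rejected: $@" unless $ok;

unlock_keys(%config);      # lift the restriction again
$config{timeout} = 30;     # now new keys can be added
```

This catches misspelled keys at run time; it does not stop you from changing the values of the permitted keys.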

Data: Misc

How do I handle binary data correctly?

Perl is binary-clean, so it can handle binary data just fine. On Windows or DOS, however, you have to use binmode for binary files to avoid conversions for line endings. In general, you should use binmode any time you want to work with binary data.

Also see binmode or perlopentut.

If you're concerned about 8-bit textual data then see perllocale. If you want to deal with multibyte characters, however, there are some gotchas. See the section on Regular Expressions.
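
As a small sketch of the binmode advice, the following writes every byte value to a scratch file and reads it back unchanged. The file name is an assumption made for the example.

```perl
use strict;
use warnings;

my $file  = 'binary_demo.bin';                  # hypothetical scratch file
my $bytes = join '', map { chr } 0 .. 255;      # every possible byte value

open my $out, '>', $file or die "Can't write $file: $!";
binmode $out;                                   # no line-ending translation
print {$out} $bytes;
close $out or die "Can't close $file: $!";

open my $in, '<', $file or die "Can't read $file: $!";
binmode $in;
my $read_back = do { local $/; <$in> };         # slurp the whole file
close $in;
unlink $file;

print "round trip ", ( $read_back eq $bytes ? "ok" : "FAILED" ), "\n";
```

On Unix-like systems the binmode calls make no difference, but leaving them in keeps the code portable.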

How do I determine whether a scalar is a number/whole/integer/float?

Assuming that you don't care about IEEE notations like "NaN" or "Infinity", you probably just want to use a regular expression:

  1. use 5.010;
  2. given( $number ) {
  3. when( /\D/ )
  4. { say "\thas nondigits"; continue }
  5. when( /^\d+\z/ )
  6. { say "\tis a whole number"; continue }
  7. when( /^-?\d+\z/ )
  8. { say "\tis an integer"; continue }
  9. when( /^[+-]?\d+\z/ )
  10. { say "\tis a +/- integer"; continue }
  11. when( /^-?(?:\d+\.?|\.\d)\d*\z/ )
  12. { say "\tis a real number"; continue }
  13. when( /^[+-]?(?=\.?\d)\d*\.?\d*(?:e[+-]?\d+)?\z/i)
  14. { say "\tis a C float" }
  15. }

There are also some commonly used modules for the task. Scalar::Util (distributed with 5.8) provides access to perl's internal function looks_like_number for determining whether a variable looks like a number. Data::Types exports functions that validate data types using both the above and other regular expressions. Thirdly, there is Regexp::Common which has regular expressions to match various types of numbers. Those three modules are available from the CPAN.
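
For instance, a quick check with looks_like_number might look like the sketch below. The list of candidate strings is invented for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my @candidates = ( '42', '-3.14', '1e5', 'abc', '12abc', '' );   # sample inputs
my %verdict;

for my $value (@candidates) {
    # Normalize the result to 1 or 0 so it is easy to store and print.
    $verdict{$value} = looks_like_number($value) ? 1 : 0;
    printf "%-8s %s\n", "'$value'",
        $verdict{$value} ? 'looks like a number' : 'does not look like a number';
}
```

Note that looks_like_number follows perl's own idea of a number, so it also accepts forms such as exponents that the simpler regular expressions above reject.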

If you're on a POSIX system, Perl supports the POSIX::strtod function for converting strings to doubles (and also POSIX::strtol for longs). Its semantics are somewhat cumbersome, so here's a getnum wrapper function for more convenient access. This function takes a string and returns the number it found, or undef for input that isn't a C float. The is_numeric function is a front end to getnum if you just want to say, "Is this a float?"

  1. sub getnum {
  2. use POSIX qw(strtod);
  3. my $str = shift;
  4. $str =~ s/^\s+//;
  5. $str =~ s/\s+$//;
  6. $! = 0;
  7. my($num, $unparsed) = strtod($str);
  8. if (($str eq '') || ($unparsed != 0) || $!) {
  9. return undef;
  10. }
  11. else {
  12. return $num;
  13. }
  14. }
  15. sub is_numeric { defined getnum($_[0]) }

Or you could check out the String::Scanf module on the CPAN instead.

How do I keep persistent data across program calls?

For some specific applications, you can use one of the DBM modules. See AnyDBM_File. More generically, you should consult the FreezeThaw or Storable modules from CPAN. Starting from Perl 5.8, Storable is part of the standard distribution. Here's one example using Storable's store and retrieve functions:

  1. use Storable;
  2. store(\%hash, "filename");
  3. # later on...
  4. $href = retrieve("filename"); # by ref
  5. %hash = %{ retrieve("filename") }; # direct to hash

How do I print out or copy a recursive data structure?

The Data::Dumper module on CPAN (or the 5.005 release of Perl) is great for printing out data structures. The Storable module on CPAN (or the 5.8 release of Perl), provides a function called dclone that recursively copies its argument.

  1. use Storable qw(dclone);
  2. $r2 = dclone($r1);

Where $r1 can be a reference to any kind of data structure you'd like. It will be deeply copied. Because dclone takes and returns references, you'd have to add extra punctuation if you had a hash of arrays that you wanted to copy.

  1. %newhash = %{ dclone(\%oldhash) };

How do I define methods for every class/object?

(contributed by Ben Morrow)

You can use the UNIVERSAL class (see UNIVERSAL). However, please be very careful to consider the consequences of doing this: adding methods to every object is very likely to have unintended consequences. If possible, it would be better to have all your objects inherit from some common base class, or to use an object system like Moose that supports roles.

How do I verify a credit card checksum?

Get the Business::CreditCard module from CPAN.
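
The module remains the practical answer. For orientation only, checksums of this kind are commonly based on the Luhn algorithm, and the sketch below shows that algorithm on its own. The helper name luhn_ok is hypothetical and is not part of Business::CreditCard; it checks the digits only and says nothing about whether a card actually exists.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the digits pass the Luhn check, else 0.
sub luhn_ok {
    my ($number) = @_;
    $number =~ s/[\s-]//g;                  # tolerate spaces and dashes
    return 0 unless $number =~ /^\d+\z/;    # must be digits only

    my ( $sum, $double ) = ( 0, 0 );
    for my $digit ( reverse split //, $number ) {
        # Starting from the right, double every second digit.
        my $value = $double ? $digit * 2 : $digit;
        $value -= 9 if $value > 9;          # same as adding the two digits
        $sum += $value;
        $double = !$double;
    }
    return $sum % 10 == 0 ? 1 : 0;
}

print luhn_ok('4111 1111 1111 1111') ? "passes\n" : "fails\n";
```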

How do I pack arrays of doubles or floats for XS code?

The arrays.h/arrays.c code in the PGPLOT module on CPAN does just this. If you're doing a lot of float or double processing, consider using the PDL module from CPAN instead--it makes number-crunching easy.

See http://search.cpan.org/dist/PGPLOT for the code.

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.

Perl 5 version 18.2 documentation

perlfaq5

NAME

perlfaq5 - Files and Formats

DESCRIPTION

This section deals with I/O and the "f" issues: filehandles, flushing, formats, and footers.

How do I flush/unbuffer an output filehandle? Why must I do this?

(contributed by brian d foy)

You might like to read Mark Jason Dominus's "Suffering From Buffering" at http://perl.plover.com/FAQs/Buffering.html .

Perl normally buffers output so it doesn't make a system call for every bit of output. By saving up output, it makes fewer expensive system calls. For instance, in this little bit of code, you want to print a dot to the screen for every line you process to watch the progress of your program. Instead of seeing a dot for every line, Perl buffers the output and you have a long wait before you see a row of 50 dots all at once:

  1. # long wait, then row of dots all at once
  2. while( <> ) {
  3. print ".";
  4. print "\n" unless ++$count % 50;
  5. #... expensive line processing operations
  6. }

To get around this, you have to unbuffer the output filehandle, in this case, STDOUT . You can set the special variable $| to a true value (mnemonic: making your filehandles "piping hot"):

  1. $|++;
  2. # dot shown immediately
  3. while( <> ) {
  4. print ".";
  5. print "\n" unless ++$count % 50;
  6. #... expensive line processing operations
  7. }

The $| is one of the per-filehandle special variables, so each filehandle has its own copy of its value. If you want to merge standard output and standard error for instance, you have to unbuffer each (although STDERR might be unbuffered by default):

  1. {
  2. my $previous_default = select(STDOUT); # save previous default
  3. $|++; # autoflush STDOUT
  4. select(STDERR);
  5. $|++; # autoflush STDERR, to be sure
  6. select($previous_default); # restore previous default
  7. }
  8. # now should alternate . and +
  9. while( 1 ) {
  10. sleep 1;
  11. print STDOUT ".";
  12. print STDERR "+";
  13. print STDOUT "\n" unless ++$count % 25;
  14. }

Besides the $| special variable, you can use binmode to give your filehandle a :unix layer, which is unbuffered:

  1. binmode( STDOUT, ":unix" );
  2. while( 1 ) {
  3. sleep 1;
  4. print ".";
  5. print "\n" unless ++$count % 50;
  6. }

For more information on output layers, see the entries for binmode and open in perlfunc, and the PerlIO module documentation.

If you are using IO::Handle or one of its subclasses, you can call the autoflush method to change the settings of the filehandle:

  1. use IO::Handle;
  2. open my( $io_fh ), ">", "output.txt";
  3. $io_fh->autoflush(1);

The IO::Handle objects also have a flush method. You can flush the buffer any time you want without turning on autoflush:

  1. $io_fh->flush;

How do I change, delete, or insert a line in a file, or append to the beginning of a file?

(contributed by brian d foy)

The basic idea of inserting, changing, or deleting a line from a text file involves reading and printing the file to the point you want to make the change, making the change, then reading and printing the rest of the file. Perl doesn't provide random access to lines (especially since the record input separator, $/ , is mutable), although modules such as Tie::File can fake it.

A Perl program to do these tasks takes the basic form of opening a file, printing its lines, then closing the file:

  1. open my $in, '<', $file or die "Can't read old file: $!";
  2. open my $out, '>', "$file.new" or die "Can't write new file: $!";
  3. while( <$in> ) {
  4. print $out $_;
  5. }
  6. close $out;

Within that basic form, add the parts that you need to insert, change, or delete lines.

To prepend lines to the beginning, print those lines before you enter the loop that prints the existing lines.

  1. open my $in, '<', $file or die "Can't read old file: $!";
  2. open my $out, '>', "$file.new" or die "Can't write new file: $!";
  3. print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC
  4. while( <$in> ) {
  5. print $out $_;
  6. }
  7. close $out;

To change existing lines, insert the code to modify the lines inside the while loop. In this case, the code finds all lowercased versions of "perl" and uppercases them. This happens for every line, so be sure that you're supposed to do that on every line!

  1. open my $in, '<', $file or die "Can't read old file: $!";
  2. open my $out, '>', "$file.new" or die "Can't write new file: $!";
  3. print $out "# Add this line to the top\n";
  4. while( <$in> ) {
  5. s/\b(perl)\b/Perl/g;
  6. print $out $_;
  7. }
  8. close $out;

To change only a particular line, the input line number, $. , is useful. First read and print the lines up to the one you want to change. Next, read the single line you want to change, change it, and print it. After that, read the rest of the lines and print those:

  1. while( <$in> ) { # print the lines before the change
  2. print $out $_;
  3. last if $. == 4; # line number before change
  4. }
  5. my $line = <$in>;
  6. $line =~ s/\b(perl)\b/Perl/g;
  7. print $out $line;
  8. while( <$in> ) { # print the rest of the lines
  9. print $out $_;
  10. }

To skip lines, use the looping controls. The next in this example skips comment lines, and the last stops all processing once it encounters either __END__ or __DATA__ .

    while( <$in> ) {
        next if /^\s*#/;             # skip comment lines
        last if /^__(END|DATA)__$/;  # stop at end of code marker
        print $out $_;
    }

Do the same sort of thing to delete a particular line by using next to skip the lines you don't want to show up in the output. This example skips every fifth line:

  1. while( <$in> ) {
  2. next unless $. % 5;
  3. print $out $_;
  4. }

If, for some odd reason, you really want to see the whole file at once rather than processing line-by-line, you can slurp it in (as long as you can fit the whole thing in memory!):

    open my $in, '<', $file or die "Can't read old file: $!";
    open my $out, '>', "$file.new" or die "Can't write new file: $!";
    my $content = do { local $/; <$in> }; # slurp!
    # do your magic here
    print $out $content;

Modules such as File::Slurp and Tie::File can help with that too. If you can, however, avoid reading the entire file at once. Perl won't give that memory back to the operating system until the process finishes.

You can also use Perl one-liners to modify a file in-place. The following changes all 'Fred' to 'Barney' in inFile.txt, overwriting the file with the new contents. With the -p switch, Perl wraps a while loop around the code you specify with -e , and -i turns on in-place editing. The current line is in $_ . With -p , Perl automatically prints the value of $_ at the end of the loop. See perlrun for more details.

  1. perl -pi -e 's/Fred/Barney/' inFile.txt

To make a backup of inFile.txt , give -i a file extension to add:

  1. perl -pi.bak -e 's/Fred/Barney/' inFile.txt

To change only the fifth line, you can add a test checking $. , the input line number, then only perform the operation when the test passes:

  1. perl -pi -e 's/Fred/Barney/ if $. == 5' inFile.txt

To add lines before a certain line, you can add a line (or lines!) before Perl prints $_ :

  1. perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt

You can even add a line to the beginning of a file, since the current line prints at the end of the loop:

  1. perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt

To insert a line after one already in the file, use the -n switch. It's just like -p except that it doesn't print $_ at the end of the loop, so you have to do that yourself. In this case, print $_ first, then print the line that you want to add.

  1. perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt

To delete lines, only print the ones that you want.

  1. perl -ni -e 'print if /d/' inFile.txt

How do I count the number of lines in a file?

(contributed by brian d foy)

Conceptually, the easiest way to count the lines in a file is to simply read them and count them:

  1. my $count = 0;
  2. while( <$fh> ) { $count++; }

You don't really have to count them yourself, though, since Perl already does that with the $. variable, which is the current line number from the last filehandle read:

  1. 1 while( <$fh> );
  2. my $count = $.;

If you want to use $. , you can reduce it to a simple one-liner, like one of these:

  1. % perl -lne '} print $.; {' file
  2. % perl -lne 'END { print $. }' file

Those can be rather inefficient though. If they aren't fast enough for you, you might just read chunks of data and count the number of newlines:

    my $lines = 0;
    my $buffer;
    open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
    while( sysread $fh, $buffer, 4096 ) {
        $lines += ( $buffer =~ tr/\n// );
    }
    close $fh;

However, that doesn't work if the line ending isn't a newline. You might change that tr/// to a s/// so you can count the number of times the input record separator, $/ , shows up:

    my $lines = 0;
    my $buffer;
    open my($fh), '<:raw', $filename or die "Can't open $filename: $!";
    while( sysread $fh, $buffer, 4096 ) {
        $lines += ( $buffer =~ s|$/||g );
    }
    close $fh;

If you don't mind shelling out, the wc command is usually the fastest, even with the extra interprocess overhead. Ensure that you have an untainted filename though:

    #!perl -T
    $ENV{PATH} = undef;
    my $lines;
    if( $filename =~ /^([0-9a-z_.]+)\z/ ) {
        $lines = `/usr/bin/wc -l $1`;
        chomp $lines;
    }

How do I delete the last N lines from a file?

(contributed by brian d foy)

The easiest conceptual solution is to count the lines in the file then start at the beginning and print the number of lines (minus the last N) to a new file.

Most often, the real question is how you can delete the last N lines without making more than one pass over the file, or how to do it without a lot of copying. The easy concept is the hard reality when you might have millions of lines in your file.

One trick is to use File::ReadBackwards, which starts at the end of the file. That module provides an object that wraps the real filehandle to make it easy for you to move around the file. Once you get to the spot you need, you can get the actual filehandle and work with it as normal. In this case, you get the file position at the end of the last line you want to keep and truncate the file to that point:

  1. use File::ReadBackwards;
  2. my $filename = 'test.txt';
  3. my $Lines_to_truncate = 2;
  4. my $bw = File::ReadBackwards->new( $filename )
  5. or die "Could not read backwards in [$filename]: $!";
  6. my $lines_from_end = 0;
  7. until( $bw->eof or $lines_from_end == $Lines_to_truncate ) {
  8. print "Got: ", $bw->readline;
  9. $lines_from_end++;
  10. }
  11. truncate( $filename, $bw->tell );

The File::ReadBackwards module also has the advantage of setting the input record separator to a regular expression.

You can also use the Tie::File module which lets you access the lines through a tied array. You can use normal array operations to modify your file, including setting the last index and using splice.
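
A minimal sketch of the Tie::File approach, assuming a scratch file created just for the demonstration. It drops the last lines by setting the last index of the tied array.

```perl
use strict;
use warnings;
use Tie::File;

my $filename          = 'tie_file_demo.txt';   # hypothetical scratch file
my $lines_to_truncate = 2;

# Create a five-line file to work on.
open my $out, '>', $filename or die "Can't write $filename: $!";
print {$out} "line $_\n" for 1 .. 5;
close $out or die "Can't close $filename: $!";

# Each element of the tied array is one line of the file.
tie my @lines, 'Tie::File', $filename
    or die "Can't tie $filename: $!";

# Never try to drop more lines than the file has.
my $drop = $lines_to_truncate < @lines ? $lines_to_truncate : scalar @lines;
$#lines = $#lines - $drop;    # shrinking the array shortens the file
untie @lines;

# Read the file back to see what is left, then clean up.
open my $in, '<', $filename or die "Can't read $filename: $!";
my @remaining = <$in>;
close $in;
unlink $filename;
print @remaining;
```

This is convenient rather than fast: Tie::File still has to find the line boundaries, so for very large files the File::ReadBackwards approach above may be the better fit.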

How can I use Perl's -i option from within a program?

-i sets the value of Perl's $^I variable, which in turn affects the behavior of <> ; see perlrun for more details. By modifying the appropriate variables directly, you can get the same behavior within a larger program. For example:

  1. # ...
  2. {
  3. local($^I, @ARGV) = ('.orig', glob("*.c"));
  4. while (<>) {
  5. if ($. == 1) {
  6. print "This line should appear at the top of each file\n";
  7. }
  8. s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case
  9. print;
  10. close ARGV if eof; # Reset $.
  11. }
  12. }
  13. # $^I and @ARGV return to their old values here

This block modifies all the .c files in the current directory, leaving a backup of the original data from each file in a new .c.orig file.

How can I copy a file?

(contributed by brian d foy)

Use the File::Copy module. It comes with Perl and can do a true copy across file systems, and it does its magic in a portable fashion.

  1. use File::Copy;
  2. copy( $original, $new_copy ) or die "Copy failed: $!";

If you can't use File::Copy, you'll have to do the work yourself: open the original file, open the destination file, then print to the destination file as you read the original. You also have to remember to copy the permissions, owner, and group to the new file.
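
If you do end up doing it by hand, the steps above can be sketched roughly as follows. The subroutine name copy_by_hand is made up for the example, and only the permission bits are carried over; copying owner and group with chown usually needs extra privileges and is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $original to $new_copy in fixed-size chunks.
sub copy_by_hand {
    my ( $original, $new_copy ) = @_;

    open my $in,  '<', $original or die "Can't read $original: $!";
    open my $out, '>', $new_copy or die "Can't write $new_copy: $!";
    binmode $in;     # treat the data as raw bytes on every platform
    binmode $out;

    my ( $buffer, $got );
    while ( $got = read $in, $buffer, 65536 ) {
        print {$out} $buffer or die "Can't write to $new_copy: $!";
    }
    die "Can't read from $original: $!" unless defined $got;

    close $out or die "Can't close $new_copy: $!";
    close $in;

    # Carry over the permission bits of the original file.
    my $mode = ( stat $original )[2] & 07777;
    chmod $mode, $new_copy or warn "Can't set mode on $new_copy: $!";
    return 1;
}
```

Call it as copy_by_hand( $original, $new_copy ). It dies on the first I/O error so a partial copy does not go unnoticed.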

How do I make a temporary file name?

If you don't need to know the name of the file, you can use open() with undef in place of the file name. In Perl 5.8 or later, the open() function creates an anonymous temporary file:

  1. open my $tmp, '+>', undef or die $!;

Otherwise, you can use the File::Temp module.

  1. use File::Temp qw/ tempfile tempdir /;
  2. my $dir = tempdir( CLEANUP => 1 );
  3. ($fh, $filename) = tempfile( DIR => $dir );
  4. # or if you don't need to know the filename
  5. my $fh = tempfile( DIR => $dir );

File::Temp has been a standard module since Perl 5.6.1. If you don't have a modern enough Perl installed, use the new_tmpfile class method from the IO::File module to get a filehandle opened for reading and writing. Use it if you don't need to know the file's name:

  1. use IO::File;
  2. my $fh = IO::File->new_tmpfile()
  3. or die "Unable to make new temporary file: $!";

If you're committed to creating a temporary file by hand, use the process ID and/or the current time-value. If you need to have many temporary files in one process, use a counter:

  1. BEGIN {
  2. use Fcntl;
  3. my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP};
  4. my $base_name = sprintf "%s/%d-%d-0000", $temp_dir, $$, time;
  5. sub temp_file {
  6. my $fh;
  7. my $count = 0;
  8. until( defined(fileno($fh)) || $count++ > 100 ) {
  9. $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e;
  10. # O_EXCL is required for security reasons.
  11. sysopen $fh, $base_name, O_WRONLY|O_EXCL|O_CREAT;
  12. }
  13. if( defined fileno($fh) ) {
  14. return ($fh, $base_name);
  15. }
  16. else {
  17. return ();
  18. }
  19. }
  20. }

How can I manipulate fixed-record-length files?

The most efficient way is using pack and unpack. This is faster than using substr when taking many, many strings. It is slower for just a few.

Here is a sample chunk of code to break up and put back together again some fixed-format input lines, in this case from the output of a normal, Berkeley-style ps:

  1. # sample input line:
  2. # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what
  3. my $PS_T = 'A6 A4 A7 A5 A*';
  4. open my $ps, '-|', 'ps';
  5. print scalar <$ps>;
  6. my @fields = qw( pid tt stat time command );
  7. while (<$ps>) {
  8. my %process;
  9. @process{@fields} = unpack($PS_T, $_);
  10. for my $field ( @fields ) {
  11. print "$field: <$process{$field}>\n";
  12. }
  13. print 'line=', pack($PS_T, @process{@fields} ), "\n";
  14. }

We've used a hash slice in order to easily handle the fields of each row. Storing the keys in an array makes it easy to operate on them as a group or loop over them with for . It also avoids polluting the program with global variables and using symbolic references.

How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles?

As of perl5.6, open() autovivifies file and directory handles as references if you pass it an uninitialized scalar variable. You can then pass these references just like any other scalar, and use them in the place of named handles.

    open my $fh, '>', $file_name or die "Can't open $file_name: $!";
    # or, for a package variable: open local $fh, '>', $file_name or die ...;
    print $fh "Hello World!\n";
    process_file( $fh );

If you like, you can store these filehandles in an array or a hash. If you access them directly, they aren't simple scalars and you need to give print a little help by placing the filehandle reference in braces. Perl can only figure it out on its own when the filehandle reference is a simple scalar.

  1. my @fhs = ( $fh1, $fh2, $fh3 );
  2. for( $i = 0; $i <= $#fhs; $i++ ) {
  3. print {$fhs[$i]} "just another Perl answer, \n";
  4. }

Before perl5.6, you had to deal with various typeglob idioms which you may see in older code.

  1. open FILE, "> $filename";
  2. process_typeglob( *FILE );
  3. process_reference( \*FILE );
  4. sub process_typeglob { local *FH = shift; print FH "Typeglob!" }
  5. sub process_reference { local $fh = shift; print $fh "Reference!" }

If you want to create many anonymous handles, you should check out the Symbol or IO::Handle modules.

How can I use a filehandle indirectly?

An indirect filehandle is the use of something other than a symbol in a place that a filehandle is expected. Here are ways to get indirect filehandles:

  1. $fh = SOME_FH; # bareword is strict-subs hostile
  2. $fh = "SOME_FH"; # strict-refs hostile; same package only
  3. $fh = *SOME_FH; # typeglob
  4. $fh = \*SOME_FH; # ref to typeglob (bless-able)
  5. $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob

Or, you can use the new method from one of the IO::* modules to create an anonymous filehandle and store that in a scalar variable.

  1. use IO::Handle; # 5.004 or higher
  2. my $fh = IO::Handle->new();

Then use any of those as you would a normal filehandle. Anywhere that Perl is expecting a filehandle, an indirect filehandle may be used instead. An indirect filehandle is just a scalar variable that contains a filehandle. Functions like print, open, seek, or the <FH> diamond operator will accept either a named filehandle or a scalar variable containing one:

    ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
    print $ofh "Type it: ";
    my $got = <$ifh>;
    print $efh "What was that: $got";

If you're passing a filehandle to a function, you can write the function in two ways:

  1. sub accept_fh {
  2. my $fh = shift;
  3. print $fh "Sending to indirect filehandle\n";
  4. }

Or it can localize a typeglob and use the filehandle directly:

  1. sub accept_fh {
  2. local *FH = shift;
  3. print FH "Sending to localized filehandle\n";
  4. }

Both styles work with either objects or typeglobs of real filehandles. (They might also work with strings under some circumstances, but this is risky.)

  1. accept_fh(*STDOUT);
  2. accept_fh($handle);

In the examples above, we assigned the filehandle to a scalar variable before using it. That is because only simple scalar variables, not expressions or subscripts of hashes or arrays, can be used with built-ins like print, printf, or the diamond operator. Using something other than a simple scalar variable as a filehandle is illegal and won't even compile:

    my @fd = (*STDIN, *STDOUT, *STDERR);
    print $fd[1] "Type it: "; # WRONG
    my $got = <$fd[0]>; # WRONG
    print $fd[2] "What was that: $got"; # WRONG

With print and printf, you get around this by using a block and an expression where you would place the filehandle:

  1. print { $fd[1] } "funny stuff\n";
  2. printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559;
  3. # Pity the poor deadbeef.

That block is a proper block like any other, so you can put more complicated code there. This sends the message out to one of two places:

  1. my $ok = -x "/bin/cat";
  2. print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n";
  3. print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n";

This approach of treating print and printf like object method calls doesn't work for the diamond operator. That's because it's a real operator, not just a function with a comma-less argument. Assuming you've been storing typeglobs in your structure as we did above, you can use the built-in function named readline to read a record just as <> does. Given the initialization shown above for @fd, this would work, but only because readline() requires a typeglob. It doesn't work with objects or strings, which might be a bug we haven't fixed yet.

  1. $got = readline($fd[0]);

Let it be noted that the flakiness of indirect filehandles is not related to whether they're strings, typeglobs, objects, or anything else. It's the syntax of the fundamental operators. Playing the object game doesn't help you at all here.

How can I set up a footer format to be used with write()?

There's no builtin way to do this, but perlform has a couple of techniques to make it possible for the intrepid hacker.

How can I write() into a string?

(contributed by brian d foy)

If you want to write into a string, you just have to open a filehandle to a string, which Perl has been able to do since Perl 5.8:

  1. open FH, '>', \my $string;
  2. write( FH );

Since you want to be a good programmer, you probably want to use a lexical filehandle, even though formats are designed to work with bareword filehandles since the default format names take the filehandle name. However, you can control this with some Perl special per-filehandle variables: $^ , which names the top-of-page format, and $~ which shows the line format. You have to change the default filehandle to set these variables:

  1. open my($fh), '>', \my $string;
  2. { # set per-filehandle variables
  3. my $old_fh = select( $fh );
  4. $~ = 'ANIMAL';
  5. $^ = 'ANIMAL_TOP';
  6. select( $old_fh );
  7. }
  8. format ANIMAL_TOP =
  9. ID Type Name
  10. .
  11. format ANIMAL =
  12. @## @<<< @<<<<<<<<<<<<<<
  13. $id, $type, $name
  14. .

Although write can work with lexical or package variables, whatever variables you use have to scope in the format. That most likely means you'll want to localize some package variables:

  1. {
  2. local( $id, $type, $name ) = qw( 12 cat Buster );
  3. write( $fh );
  4. }
  5. print $string;

There are also some tricks that you can play with formline and the accumulator variable $^A , but you lose a lot of the value of formats since formline won't handle paging and so on. You end up reimplementing formats when you use them.
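
For comparison, here is a minimal sketch of the formline route. The subroutine name format_row is an assumption for the example, and the picture line mirrors the ANIMAL format used above. Each call produces one formatted line; headers and paging are left for you to handle.

```perl
use strict;
use warnings;

# Hypothetical helper: format one row into a string with formline and $^A.
sub format_row {
    my ( $id, $type, $name ) = @_;

    $^A = '';    # start with an empty accumulator
    # Single quotes keep the @ fields from being read as arrays.
    formline( '@## @<<< @<<<<<<<<<<<<<<' . "\n", $id, $type, $name );

    my $text = $^A;
    $^A = '';    # leave the accumulator clean for the next caller
    return $text;
}

my $row = format_row( 12, 'cat', 'Buster' );
print $row;
```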

How can I open a filehandle to a string?

(contributed by Peter J. Holzer, hjp-usenet2@hjp.at)

Since Perl 5.8.0 a file handle referring to a string can be created by calling open with a reference to that string instead of the filename. This file handle can then be used to read from or write to the string:

    open(my $fh, '>', \$string) or die "Could not open string for writing";
    print $fh "foo\n";
    print $fh "bar\n"; # $string now contains "foo\nbar\n"
    close $fh;
    open($fh, '<', \$string) or die "Could not open string for reading";
    my $x = <$fh>; # $x now contains "foo\n"

With older versions of Perl, the IO::String module provides similar functionality.

How can I output my numbers with commas added?

(contributed by brian d foy and Benjamin Goldberg)

You can use Number::Format to separate places in a number. It handles locale information for those of you who want to insert full stops instead (or anything else that they want to use, really).

This subroutine will add commas to your number:

  1. sub commify {
  2. local $_ = shift;
  3. 1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
  4. return $_;
  5. }

This regex from Benjamin Goldberg will add commas to numbers:

  1. s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g;

It is easier to see with comments:

  1. s/(
  2. ^[-+]? # beginning of number.
  3. \d+? # first digits before first comma
  4. (?= # followed by, (but not included in the match) :
  5. (?>(?:\d{3})+) # some positive multiple of three digits.
  6. (?!\d) # an *exact* multiple, not x * 3 + 1 or whatever.
  7. )
  8. | # or:
  9. \G\d{3} # after the last group, get three digits
  10. (?=\d) # but they have to have more digits after them.
  11. )/$1,/xg;

How can I translate tildes (~) in a filename?

Use the <> (glob()) operator, documented in perlfunc. Versions of Perl older than 5.6 require that you have a shell installed that groks tildes. Later versions of Perl have this feature built in. The File::KGlob module (available from CPAN) gives more portable glob functionality.

Within Perl, you may use this directly:

  1. $filename =~ s{
  2. ^ ~ # find a leading tilde
  3. ( # save this in $1
  4. [^/] # a non-slash character
  5. * # repeated 0 or more times (0 means me)
  6. )
  7. }{
  8. $1
  9. ? (getpwnam($1))[7]
  10. : ( $ENV{HOME} || $ENV{LOGDIR} )
  11. }ex;
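
The glob approach mentioned at the start of this answer can be sketched as follows, assuming a Perl recent enough to expand tildes itself. The helper name expand_tilde is made up for the example. Be aware that the built-in glob treats whitespace as a pattern separator, so paths containing spaces need extra care.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a leading ~ or ~user; other paths pass through.
sub expand_tilde {
    my ($path) = @_;
    return $path unless $path =~ /^~/;
    my ($expanded) = glob($path);       # first (normally only) expansion
    return defined $expanded ? $expanded : $path;
}

my $home = expand_tilde('~');
print "Home directory: $home\n";
```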

How come when I open a file read-write it wipes it out?

Because you're using something like this, which truncates the file then gives you read-write access:

  1. open my $fh, '+>', '/path/name'; # WRONG (almost always)

Whoops. You should instead use this, which will fail if the file doesn't exist:

  1. open my $fh, '+<', '/path/name'; # open for update

Using ">" always clobbers or creates. Using "<" never does either. The "+" doesn't change this.

Here are examples of many kinds of file opens. Those using sysopen all assume that you've pulled in the constants from Fcntl:

  1. use Fcntl;

To open file for reading:

  1. open my $fh, '<', $path or die $!;
  2. sysopen my $fh, $path, O_RDONLY or die $!;

To open file for writing, create new file if needed or else truncate old file:

  1. open my $fh, '>', $path or die $!;
  2. sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT or die $!;
  3. sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666 or die $!;

To open file for writing, create new file, file must not exist:

  1. sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT or die $!;
  2. sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666 or die $!;

To open file for appending, create if necessary:

  open my $fh, '>>', $path or die $!;
  sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT or die $!;
  sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT, 0666 or die $!;

To open file for appending, file must exist:

  1. sysopen my $fh, $path, O_WRONLY|O_APPEND or die $!;

To open file for update, file must exist:

  1. open my $fh, '+<', $path or die $!;
  2. sysopen my $fh, $path, O_RDWR or die $!;

To open file for update, create file if necessary:

  1. sysopen my $fh, $path, O_RDWR|O_CREAT or die $!;
  2. sysopen my $fh, $path, O_RDWR|O_CREAT, 0666 or die $!;

To open file for update, file must not exist:

  1. sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT or die $!;
  2. sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT, 0666 or die $!;

To open a file without blocking, creating if necessary:

  sysopen my $fh, '/foo/somefile', O_WRONLY|O_NDELAY|O_CREAT
      or die "can't open /foo/somefile: $!";

Be warned that neither creation nor deletion of files is guaranteed to be an atomic operation over NFS. That is, two processes might both successfully create or unlink the same file! Therefore O_EXCL isn't as exclusive as you might wish.

See also perlopentut.

Why do I sometimes get an "Argument list too long" when I use <*>?

The <> operator performs a globbing operation (see above). In Perl versions earlier than v5.6.0, the internal glob() operator forks csh(1) to do the actual glob expansion, but csh can't handle more than 127 items and so gives the error message Argument list too long . People who installed tcsh as csh won't have this problem, but their users may be surprised by it.

To get around this, either upgrade to Perl v5.6.0 or later, do the glob yourself with readdir() and patterns, or use a module like File::Glob, one that doesn't use the shell to do globbing.
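Doing the glob yourself with readdir() might look something like the following sketch. The subroutine name list_matching is made up for this example, and the pattern is an ordinary regular expression rather than a shell wildcard.

```perl
use strict;
use warnings;

# Return the names in $dir that match $pattern (a regex), skipping
# dotfiles the way a shell glob would, without involving any shell.
sub list_matching {
    my ($dir, $pattern) = @_;
    opendir my $dh, $dir or die "can't opendir $dir: $!";
    my @names = sort grep { !/^\./ && /$pattern/ } readdir $dh;
    closedir $dh;
    return @names;
}

# Rough equivalent of <*.c> in the current directory.
print "$_\n" for list_matching('.', qr/\.c\z/);
```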

How can I open a file with a leading ">" or trailing blanks?

(contributed by Brian McCauley)

The special two-argument form of Perl's open() function ignores trailing blanks in filenames and infers the mode from certain leading characters (or a trailing "|"). In older versions of Perl this was the only version of open() and so it is prevalent in old code and books.

Unless you have a particular reason to use the two-argument form you should use the three-argument form of open() which does not treat any characters in the filename as special.

  1. open my $fh, "<", " file "; # filename is " file "
  2. open my $fh, ">", ">file"; # filename is ">file"

How can I reliably rename a file?

If your operating system supports a proper mv(1) utility or its functional equivalent, this works:

  1. rename($old, $new) or system("mv", $old, $new);

It may be more portable to use the File::Copy module instead. You just copy the file to the new name (checking return values), then delete the old one. This isn't really the same semantically as a rename(), which preserves meta-information like permissions, timestamps, inode info, etc.
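File::Copy also exports a move() function on request, which tries a plain rename first and falls back to copying and deleting. A minimal sketch, with rename_file as a hypothetical wrapper name:

```perl
use strict;
use warnings;
use File::Copy qw(move);

# Hypothetical wrapper: rename $old to $new, letting File::Copy fall
# back to copy-then-delete when a plain rename is not possible.
sub rename_file {
    my ($old, $new) = @_;
    move($old, $new) or die "can't move $old to $new: $!";
    return $new;
}

# Usage: rename_file('report.tmp', 'report.txt');
```

The caveat above still applies: when the fallback path is taken, the new file is a copy, so it gets a new inode and possibly different metadata.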

How can I lock a file?

Perl's builtin flock() function (see perlfunc for details) will call flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and later), and lockf(3) if neither of the two previous system calls exists. On some systems, it may even use a different form of native locking. Here are some gotchas with Perl's flock():

  1. Produces a fatal error if none of the three system calls (or their close equivalent) exists.

  2. lockf(3) does not provide shared locking, and requires that the filehandle be open for writing (or appending, or read/writing).

  3. Some versions of flock() can't lock files over a network (e.g. on NFS file systems), so you'd need to force the use of fcntl(2) when you build Perl. But even this is dubious at best. See the flock entry of perlfunc and the INSTALL file in the source distribution for information on building Perl to do this.

Two potentially non-obvious but traditional flock semantics are that it waits indefinitely until the lock is granted, and that its locks are merely advisory. Such discretionary locks are more flexible, but offer fewer guarantees. This means that files locked with flock() may be modified by programs that do not also use flock(). Cars that stop for red lights get on well with each other, but not with cars that don't stop for red lights. See the perlport manpage, your port's specific documentation, or your system-specific local manpages for details. It's best to assume traditional behavior if you're writing portable programs. (If you're not, you should as always feel perfectly free to write for your own system's idiosyncrasies (sometimes called "features"). Slavish adherence to portability concerns shouldn't get in the way of your getting your job done.)

For more information on file locking, see also File Locking in perlopentut if you have it (new for 5.6).
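If waiting indefinitely is not what you want, you can ask flock() not to block by adding LOCK_NB. The sketch below assumes a working flock() on your system; try_lock is a made-up name.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Try to take an exclusive advisory lock without waiting. Returns the
# locked filehandle, or nothing if the lock could not be taken.
sub try_lock {
    my ($path) = @_;
    open my $fh, '>>', $path or die "can't open $path: $!";
    flock($fh, LOCK_EX | LOCK_NB) or return;
    return $fh;    # the lock is held until this handle is closed
}

# Usage:
#   my $lock = try_lock('/tmp/example.lock')
#       or die "somebody else holds the lock\n";
```

Because the lock is only advisory, this protects the file only against other programs that also call flock() on it.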

Why can't I just open(FH, ">file.lock")?

A common bit of code NOT TO USE is this:

  1. sleep(3) while -e 'file.lock'; # PLEASE DO NOT USE
  2. open my $lock, '>', 'file.lock'; # THIS BROKEN CODE

This is a classic race condition: you take two steps to do something which must be done in one. That's why computer hardware provides an atomic test-and-set instruction. In theory, this "ought" to work:

  1. sysopen my $fh, "file.lock", O_WRONLY|O_EXCL|O_CREAT
  2. or die "can't open file.lock: $!";

except that lamentably, file creation (and deletion) is not atomic over NFS, so this won't work (at least, not every time) over the net. Various schemes involving link() have been suggested, but these tend to involve busy-wait, which is also less than desirable.

I still don't get locking. I just want to increment the number in the file. How can I do this?

Didn't anyone ever tell you web-page hit counters were useless? They don't count number of hits, they're a waste of time, and they serve only to stroke the writer's vanity. It's better to pick a random number; they're more realistic.

Anyway, this is what you can do if you can't help yourself.

  1. use Fcntl qw(:DEFAULT :flock);
  2. sysopen my $fh, "numfile", O_RDWR|O_CREAT or die "can't open numfile: $!";
  3. flock $fh, LOCK_EX or die "can't flock numfile: $!";
  4. my $num = <$fh> || 0;
  5. seek $fh, 0, 0 or die "can't rewind numfile: $!";
  6. truncate $fh, 0 or die "can't truncate numfile: $!";
  7. (print $fh $num+1, "\n") or die "can't write numfile: $!";
  8. close $fh or die "can't close numfile: $!";

Here's a much better web-page hit counter:

  1. $hits = int( (time() - 850_000_000) / rand(1_000) );

If the count doesn't impress your friends, then the code might. :-)

All I want to do is append a small amount of text to the end of a file. Do I still have to use locking?

If you are on a system that correctly implements flock and you use the example appending code from "perldoc -f flock" everything will be OK even if the OS you are on doesn't implement append mode correctly (if such a system exists). So if you are happy to restrict yourself to OSs that implement flock (and that's not really much of a restriction) then that is what you should do.

If you know you are only going to use a system that does correctly implement appending (i.e. not Win32) then you can omit the seek from the code in the previous answer.

If you know you are only writing code to run on an OS and filesystem that does implement append mode correctly (a local filesystem on a modern Unix for example), and you keep the file in block-buffered mode and you write less than one buffer-full of output between each manual flushing of the buffer then each bufferload is almost guaranteed to be written to the end of the file in one chunk without getting intermingled with anyone else's output. You can also use the syswrite function which is simply a wrapper around your system's write(2) system call.

There is still a small theoretical chance that a signal will interrupt the system-level write() operation before completion. There is also a possibility that some STDIO implementations may call multiple system level write()s even if the buffer was empty to start. There may be some systems where this probability is reduced to zero, and this is not a concern when using :perlio instead of your system's STDIO.
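As an illustration of the syswrite approach, here is a minimal sketch that appends one record with a single write. It assumes the OS and filesystem implement append mode correctly, and the name append_record is hypothetical.

```perl
use strict;
use warnings;
use Fcntl;

# Append one record using O_APPEND and a single syswrite call.
sub append_record {
    my ($path, $record) = @_;
    sysopen my $fh, $path, O_WRONLY | O_APPEND | O_CREAT
        or die "can't open $path: $!";
    my $written = syswrite $fh, $record;
    die "can't write $path: $!" unless defined $written;
    die "short write to $path"  unless $written == length $record;
    close $fh or die "can't close $path: $!";
    return $written;
}

# Usage: append_record('app.log', "something happened\n");
```

The short-write check is there because, as noted above, a signal can interrupt the underlying write before it completes.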

How do I randomly update a binary file?

If you're just trying to patch a binary, in many cases something as simple as this works:

  1. perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs

However, if you have fixed sized records, then you might do something more like this:

  my $RECSIZE = 220; # size of record, in bytes
  my $recno   = 37;  # which record to update
  open my $fh, '+<', 'somewhere' or die "can't update somewhere: $!";
  seek $fh, $recno * $RECSIZE, 0;
  read($fh, my $record, $RECSIZE) == $RECSIZE
      or die "can't read record $recno: $!";
  # munge the record
  seek $fh, -$RECSIZE, 1;
  print $fh $record;
  close $fh;

Locking and error checking are left as an exercise for the reader. Don't forget them or you'll be quite sorry.

How do I get a file's timestamp in perl?

If you want to retrieve the time at which the file was last read, written, or had its meta-data (owner, etc) changed, you use the -A, -M, or -C file test operations as documented in perlfunc. These retrieve the age of the file (measured against the start-time of your program) in days as a floating point number. Some platforms may not have all of these times. See perlport for details. To retrieve the "raw" time in seconds since the epoch, you would call the stat function, then use localtime(), gmtime(), or POSIX::strftime() to convert this into human-readable form.

Here's an example:

  1. my $write_secs = (stat($file))[9];
  2. printf "file %s updated at %s\n", $file,
  3. scalar localtime($write_secs);

If you prefer something more legible, use the File::stat module (part of the standard distribution in version 5.004 and later):

  1. # error checking left as an exercise for reader.
  2. use File::stat;
  3. use Time::localtime;
  4. my $date_string = ctime(stat($file)->mtime);
  5. print "file $file updated at $date_string\n";

The POSIX::strftime() approach has the benefit of being, in theory, independent of the current locale. See perllocale for details.
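A sketch of the POSIX::strftime() route follows. The format string and the choice of gmtime() (so the result is in UTC) are assumptions made for this example, as is the name mtime_string.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format a file's modification time as a numeric UTC timestamp.
sub mtime_string {
    my ($file) = @_;
    my $mtime = (stat $file)[9];
    defined $mtime or die "can't stat $file: $!";
    return strftime('%Y-%m-%d %H:%M:%S', gmtime $mtime);
}

# Usage: print "updated at ", mtime_string('/etc/hosts'), " UTC\n";
```

Use localtime() in place of gmtime() if you want the time in the local timezone.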

How do I set a file's timestamp in perl?

You use the utime() function documented in utime. By way of example, here's a little program that copies the read and write times from its first argument to all the rest of them.

  1. if (@ARGV < 2) {
  2. die "usage: cptimes timestamp_file other_files ...\n";
  3. }
  4. my $timestamp = shift;
  5. my($atime, $mtime) = (stat($timestamp))[8,9];
  6. utime $atime, $mtime, @ARGV;

Error checking is, as usual, left as an exercise for the reader.

The perldoc for utime also has an example that has the same effect as touch(1) on files that already exist.

Certain file systems have a limited ability to store the times on a file at the expected level of precision. For example, the FAT and HPFS filesystem are unable to create dates on files with a finer granularity than two seconds. This is a limitation of the filesystems, not of utime().

How do I print to more than one file at once?

To connect one filehandle to several output filehandles, you can use the IO::Tee or Tie::FileHandle::Multiplex modules.

If you only have to do this once, you can print individually to each filehandle.

  1. for my $fh ($fh1, $fh2, $fh3) { print $fh "whatever\n" }

How can I read in an entire file all at once?

The customary Perl approach for processing all the lines in a file is to do so one line at a time:

  1. open my $input, '<', $file or die "can't open $file: $!";
  2. while (<$input>) {
  3. chomp;
  4. # do something with $_
  5. }
  6. close $input or die "can't close $file: $!";

This is tremendously more efficient than reading the entire file into memory as an array of lines and then processing it one element at a time, which is often--if not almost always--the wrong approach. Whenever you see someone do this:

  1. my @lines = <INPUT>;

You should think long and hard about why you need everything loaded at once. It's just not a scalable solution.

If you "mmap" the file with the File::Map module from CPAN, you can virtually load the entire file into a string without actually storing it in memory:

  1. use File::Map qw(map_file);
  2. map_file my $string, $filename;

Once mapped, you can treat $string as you would any other string. Since you don't necessarily have to load the data, mmap-ing can be very fast and may not increase your memory footprint.

You might also find it more fun to use the standard Tie::File module, or the DB_File module's $DB_RECNO bindings, which allow you to tie an array to a file so that accessing an element of the array actually accesses the corresponding line in the file.

If you want to load the entire file, you can use the File::Slurp module to do it in one simple and efficient step:

  1. use File::Slurp;
  2. my $all_of_it = read_file($filename); # entire file in scalar
  3. my @all_lines = read_file($filename); # one line per element

Or you can read the entire file contents into a scalar like this:

  1. my $var;
  2. {
  3. local $/;
  4. open my $fh, '<', $file or die "can't open $file: $!";
  5. $var = <$fh>;
  6. }

That temporarily undefs your record separator, and will automatically close the file at block exit. If the file is already open, just use this:

  1. my $var = do { local $/; <$fh> };

You can also use a localized @ARGV to eliminate the open:

  1. my $var = do { local( @ARGV, $/ ) = $file; <> };

For ordinary files you can also use the read function.

  1. read( $fh, $var, -s $fh );

That third argument tests the byte size of the data on the $fh filehandle and reads that many bytes into the buffer $var .

How can I read in a file by paragraphs?

Use the $/ variable (see perlvar for details). You can either set it to "" to eliminate empty paragraphs ("abc\n\n\n\ndef" , for instance, gets treated as two paragraphs and not three), or "\n\n" to accept empty paragraphs.

Note that a blank line must have no blanks in it. Thus "fred\n \nstuff\n\n" is one paragraph, but "fred\n\nstuff\n\n" is two.
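To see the difference between the two settings, the sketch below reads the same text from an in-memory filehandle with each value of $/ and counts the records. The helper name read_records is made up for the example.

```perl
use strict;
use warnings;

# Read $text through an in-memory filehandle using the given $/ value
# and return the records that come back.
sub read_records {
    my ($text, $separator) = @_;
    local $/ = $separator;
    open my $fh, '<', \$text or die "can't open in-memory file: $!";
    my @records = <$fh>;
    close $fh;
    return @records;
}

my $text = "abc\n\n\n\ndef\n";
printf "paragraph mode:  %d records\n", scalar read_records($text, "");
printf "\\n\\n separator: %d records\n", scalar read_records($text, "\n\n");
```

With "" the run of newlines counts as one paragraph break; with "\n\n" the second pair of newlines comes back as an empty paragraph of its own.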

How can I read a single character from a file? From the keyboard?

You can use the builtin getc() function for most filehandles, but it won't (easily) work on a terminal device. For STDIN, either use the Term::ReadKey module from CPAN or use the sample code in getc.

If your system supports the portable operating system programming interface (POSIX), you can use the following code, which you'll note turns off echo processing as well.

  #!/usr/bin/perl -w
  use strict;
  $| = 1;
  for (1..4) {
      print "gimme: ";
      my $got = getone();
      print "--> $got\n";
  }
  exit;

  BEGIN {
      use POSIX qw(:termios_h);

      my ($term, $oterm, $echo, $noecho, $fd_stdin);

      $fd_stdin = fileno(STDIN);   # already declared above

      $term = POSIX::Termios->new();
      $term->getattr($fd_stdin);
      $oterm = $term->getlflag();

      $echo   = ECHO | ECHOK | ICANON;
      $noecho = $oterm & ~$echo;

      sub cbreak {
          $term->setlflag($noecho);
          $term->setcc(VTIME, 1);
          $term->setattr($fd_stdin, TCSANOW);
      }

      sub cooked {
          $term->setlflag($oterm);
          $term->setcc(VTIME, 0);
          $term->setattr($fd_stdin, TCSANOW);
      }

      sub getone {
          my $key = '';
          cbreak();
          sysread(STDIN, $key, 1);
          cooked();
          return $key;
      }
  }
  END { cooked() }

The Term::ReadKey module from CPAN may be easier to use. Recent versions also include support for non-portable systems.

  1. use Term::ReadKey;
  2. open my $tty, '<', '/dev/tty';
  3. print "Gimme a char: ";
  4. ReadMode "raw";
  5. my $key = ReadKey 0, $tty;
  6. ReadMode "normal";
  7. printf "\nYou said %s, char number %03d\n",
  8. $key, ord $key;

How can I tell whether there's a character waiting on a filehandle?

The very first thing you should do is look into getting the Term::ReadKey extension from CPAN. As we mentioned earlier, it now even has limited support for non-portable (read: not open systems, closed, proprietary, not POSIX, not Unix, etc.) systems.

You should also check out the Frequently Asked Questions list in comp.unix.* for things like this: the answer is essentially the same. It's very system-dependent. Here's one solution that works on BSD systems:

  1. sub key_ready {
  2. my($rin, $nfd);
  3. vec($rin, fileno(STDIN), 1) = 1;
  4. return $nfd = select($rin,undef,undef,0);
  5. }

If you want to find out how many characters are waiting, there's also the FIONREAD ioctl call to be looked at. The h2ph tool that comes with Perl tries to convert C include files to Perl code, which can be required. FIONREAD ends up defined as a function in the sys/ioctl.ph file:

  1. require 'sys/ioctl.ph';
  2. $size = pack("L", 0);
  3. ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n";
  4. $size = unpack("L", $size);

If h2ph wasn't installed or doesn't work for you, you can grep the include files by hand:

  1. % grep FIONREAD /usr/include/*/*
  2. /usr/include/asm/ioctls.h:#define FIONREAD 0x541B

Or write a small C program using the editor of champions:

  % cat > fionread.c
  #include <stdio.h>
  #include <sys/ioctl.h>
  int main(void) {
      printf("%#08x\n", FIONREAD);
      return 0;
  }
  ^D
  % cc -o fionread fionread.c
  % ./fionread
  0x4004667f

And then hard-code it, leaving porting as an exercise to your successor.

  1. $FIONREAD = 0x4004667f; # XXX: opsys dependent
  2. $size = pack("L", 0);
  3. ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n";
  4. $size = unpack("L", $size);

FIONREAD requires a filehandle connected to a stream, meaning that sockets, pipes, and tty devices work, but not files.

How do I do a tail -f in perl?

First try

  1. seek($gw_fh, 0, 1);

The statement seek($gw_fh, 0, 1) doesn't change the current position, but it does clear the end-of-file condition on the handle, so that the next <$gw_fh> makes Perl try again to read something.

If that doesn't work (it relies on features of your stdio implementation), then you need something more like this:

  for (;;) {
      for ($curpos = tell($gw_fh); <$gw_fh>; $curpos = tell($gw_fh)) {
          # search for some stuff and put it into files
      }
      # sleep for a while
      seek($gw_fh, $curpos, 0); # seek to where we had been
  }

If this still doesn't work, look into the clearerr method from IO::Handle, which resets the error and end-of-file states on the handle.

There's also a File::Tail module from CPAN.

How do I dup() a filehandle in Perl?

If you check open, you'll see that several of the ways to call open() should do the trick. For example:

  1. open my $log, '>>', '/foo/logfile';
  2. open STDERR, '>&', $log;

Or even with a literal numeric descriptor:

  my $fd = $ENV{MHCONTEXTFD};
  open my $mhcontext, "<&=$fd"; # like fdopen(3S)

Note that "<&STDIN" makes a copy, but "<&=STDIN" makes an alias. That means if you close an aliased handle, all aliases become inaccessible. This is not true with a copied one.

Error checking, as always, has been left as an exercise for the reader.
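For the curious, here is a sketch with the error checking filled in that makes the copy-versus-alias distinction visible through fileno(). The subroutine name dup_and_alias is hypothetical, and the example uses the three-argument form of open.

```perl
use strict;
use warnings;

# Given an open output handle, return a dup (new descriptor) and an
# alias (same descriptor) of it.
sub dup_and_alias {
    my ($fh) = @_;
    open my $copy,  '>&',  $fh         or die "can't dup handle: $!";
    open my $alias, '>&=', fileno($fh) or die "can't alias handle: $!";
    return ($copy, $alias);
}

my ($copy, $alias) = dup_and_alias(\*STDERR);
printf "STDERR is fd %d, copy is fd %d, alias is fd %d\n",
    fileno(STDERR), fileno($copy), fileno($alias);
```

The copy reports a different descriptor number, while the alias reports the same one as the original handle.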

How do I close a file descriptor by number?

If, for some reason, you have a file descriptor instead of a filehandle (perhaps you used POSIX::open ), you can use the close() function from the POSIX module:

  1. use POSIX ();
  2. POSIX::close( $fd );

This should rarely be necessary, as the Perl close() function is to be used for things that Perl opened itself, even if it was a dup of a numeric descriptor as with MHCONTEXT above. But if you really have to, you may be able to do this:

  require 'sys/syscall.ph';
  my $rc = syscall(SYS_close(), $fd + 0); # must force numeric
  die "can't sysclose $fd: $!" if $rc == -1; # syscall returns -1 on failure

Or, just use the fdopen(3S) feature of open():

  1. {
  2. open my $fh, "<&=$fd" or die "Cannot reopen fd=$fd: $!";
  3. close $fh;
  4. }

Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work?

Whoops! You just put a tab and a formfeed into that filename! Remember that within double quoted strings ("like\this"), the backslash is an escape character. The full list of these is in Quote and Quote-like Operators in perlop. Unsurprisingly, you don't have a file called "c:(tab)emp(formfeed)oo" or "c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.

Either single-quote your strings, or (preferably) use forward slashes. Since all DOS and Windows versions since something like MS-DOS 2.0 or so have treated / and \ the same in a path, you might as well use the one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++, awk, Tcl, Java, or Python, just to mention a few. POSIX paths are more portable, too.

Why doesn't glob("*.*") get all the files?

Because even on non-Unix ports, Perl's glob function follows standard Unix globbing semantics. You'll need glob("*") to get all (non-hidden) files. This makes glob() portable even to legacy systems. Your port may include proprietary globbing functions as well. Check its documentation for details.

Why does Perl let me delete read-only files? Why does -i clobber protected files? Isn't this a bug in Perl?

This is elaborately and painstakingly described in the file-dir-perms article in the "Far More Than You Ever Wanted To Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz .

The executive summary: learn how your filesystem works. The permissions on a file say what can happen to the data in that file. The permissions on a directory say what can happen to the list of files in that directory. If you delete a file, you're removing its name from the directory (so the operation depends on the permissions of the directory, not of the file). If you try to write to the file, the permissions of the file govern whether you're allowed to.

How do I select a random line from a file?

Short of loading the file into a database or pre-indexing the lines in the file, there are a couple of things that you can do.

Here's a reservoir-sampling algorithm from the Camel Book:

  1. srand;
  2. rand($.) < 1 && ($line = $_) while <>;

This has a significant advantage in space over reading the whole file in. You can find a proof of this method in The Art of Computer Programming, Volume 2, Section 3.4.2, by Donald E. Knuth.
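The same idea, written out as a subroutine that takes an already opened filehandle and keeps its own counter instead of relying on $. (the name pick_random_line is made up for this sketch):

```perl
use strict;
use warnings;

# Pick one line uniformly at random from an open filehandle in a
# single pass, holding only one candidate line in memory at a time.
sub pick_random_line {
    my ($fh) = @_;
    my ($line, $count);
    while ( my $candidate = <$fh> ) {
        # Replace the current choice with the Nth line with probability 1/N.
        $line = $candidate if rand( ++$count ) < 1;
    }
    return $line;    # undef if the handle had no lines
}

# Usage:
#   open my $fh, '<', $filename or die "can't open $filename: $!";
#   my $line = pick_random_line($fh);
```

The first line is always taken (rand(1) is always less than 1), and each later line replaces it with steadily decreasing probability.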

You can use the File::Random module which provides a function for that algorithm:

  1. use File::Random qw/random_line/;
  2. my $line = random_line($filename);

Another way is to use the Tie::File module, which treats the entire file as an array. Simply access a random array element.

Why do I get weird spaces when I print an array of lines?

(contributed by brian d foy)

If you are seeing spaces between the elements of your array when you print the array, you are probably interpolating the array in double quotes:

  1. my @animals = qw(camel llama alpaca vicuna);
  2. print "animals are: @animals\n";

It's the double quotes, not the print, doing this. Whenever you interpolate an array in a double quote context, Perl joins the elements with spaces (or whatever is in $" , which is a space by default):

  1. animals are: camel llama alpaca vicuna

This is different than printing the array without the interpolation:

  1. my @animals = qw(camel llama alpaca vicuna);
  2. print "animals are: ", @animals, "\n";

Now the output doesn't have the spaces between the elements because the elements of @animals simply become part of the list to print:

  1. animals are: camelllamaalpacavicuna

You might notice this when each element of the array ends with a newline and you interpolate the array in double quotes. You expect to print one element per line, but notice that every line after the first is indented:

  this is a line
   this is another line
   this is the third line

That extra space comes from the interpolation of the array. If you don't want to put anything between your array elements, don't use the array in double quotes. You can send it to print without them:

  1. print @lines;
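If you do want interpolation but with a different separator, you can localize $" for the duration. A small sketch, with interpolate_with as a hypothetical helper name:

```perl
use strict;
use warnings;

# Interpolate a list into a string using the given separator by
# localizing the list separator variable $".
sub interpolate_with {
    my ($separator, @items) = @_;
    local $" = $separator;
    return "@items";
}

my @animals = qw(camel llama alpaca vicuna);
print interpolate_with(', ', @animals), "\n";
print interpolate_with('',   @animals), "\n";
```

Using local keeps the change from leaking into other code that interpolates arrays.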

How do I traverse a directory tree?

(contributed by brian d foy)

The File::Find module, which comes with Perl, does all of the hard work to traverse a directory structure. You simply call the find subroutine with a callback subroutine and the directories you want to traverse:

  use File::Find;

  find( \&wanted, @directories );

  sub wanted {
      # full path in $File::Find::name
      # just filename in $_
      # ... do whatever you want to do ...
  }
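As a concrete instance of the callback pattern above, this sketch collects every plain file ending in .pm under the given directories. The subroutine name find_pm_files is made up for the example.

```perl
use strict;
use warnings;
use File::Find;

# Return the full paths of all plain files under @dirs whose names
# end in ".pm".
sub find_pm_files {
    my @dirs = @_;
    my @found;
    find(
        sub {
            # $_ is the bare filename; $File::Find::name is the full path.
            push @found, $File::Find::name if -f $_ && /\.pm\z/;
        },
        @dirs
    );
    my @sorted = sort @found;
    return @sorted;
}

# Usage: print "$_\n" for find_pm_files('lib');
```

The anonymous callback closes over @found, which avoids the global variable that a named wanted subroutine would otherwise need.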

The File::Find::Closures module, which you can download from CPAN, provides many ready-to-use subroutines that you can use with File::Find.

The File::Finder module, which you can download from CPAN, can help you create the callback subroutine using something closer to the syntax of the find command-line utility:

  1. use File::Find;
  2. use File::Finder;
  3. my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}');
  4. find( $deep_dirs->as_options, @places );

The File::Find::Rule module, which you can download from CPAN, has a similar interface, but does the traversal for you too:

  1. use File::Find::Rule;
  2. my @files = File::Find::Rule->file()
  3. ->name( '*.pm' )
  4. ->in( @INC );

How do I delete a directory tree?

(contributed by brian d foy)

If you have an empty directory (so, no files or subdirectories), you can use Perl's built-in rmdir. If the directory is not empty, you either have to empty it yourself (a lot of work) or use a module to help you.

The File::Path module, which comes with Perl, has a remove_tree which can take care of all of the hard work for you:

  1. use File::Path qw(remove_tree);
  2. remove_tree( @directories );

The File::Path module also has a legacy interface to the older rmtree subroutine.
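By default remove_tree only warns when something goes wrong. If you would rather inspect the failures yourself, you can pass an error option, as in this sketch (remove_trees is a hypothetical wrapper name):

```perl
use strict;
use warnings;
use File::Path qw(remove_tree);

# Remove each directory tree, collecting diagnostics instead of
# relying on remove_tree's default warnings. Returns true on success.
sub remove_trees {
    my @directories = @_;
    remove_tree( @directories, { error => \my $errors } );
    my @problems = @{ $errors || [] };
    for my $diag (@problems) {
        my ($path, $message) = %$diag;
        warn $path eq ''
            ? "general error: $message\n"
            : "problem removing $path: $message\n";
    }
    return !@problems;
}

# Usage: remove_trees('build', 'tmp') or die "cleanup incomplete\n";
```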

How do I copy an entire directory?

(contributed by Shlomi Fish)

To do the equivalent of cp -R (i.e. copy an entire directory tree recursively) in portable Perl, you'll either need to write something yourself or find a good CPAN module such as File::Copy::Recursive.

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples here are in the public domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required.

Perl 5 version 18.2 documentation
perlfaq6

NAME

perlfaq6 - Regular Expressions

DESCRIPTION

This section is surprisingly small because the rest of the FAQ is littered with answers involving regular expressions. For example, decoding a URL and checking whether something is a number can be handled with regular expressions, but those answers are found elsewhere in this document (in perlfaq9: "How do I decode or create those %-encodings on the web" and perlfaq4: "How do I determine whether a scalar is a number/whole/integer/float", to be precise).

How can I hope to use regular expressions without creating illegible and unmaintainable code?

Three techniques can make regular expressions maintainable and understandable.

  • Comments Outside the Regex

    Describe what you're doing and how you're doing it, using normal Perl comments.

    1. # turn the line into the first word, a colon, and the
    2. # number of characters on the rest of the line
    3. s/^(\w+)(.*)/ lc($1) . ":" . length($2) /meg;
  • Comments Inside the Regex

    The /x modifier causes whitespace to be ignored in a regex pattern (except in a character class and a few other places), and also allows you to use normal comments there, too. As you can imagine, whitespace and comments help a lot.

    /x lets you turn this:

    1. s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs;

    into this:

    1. s{ < # opening angle bracket
    2. (?: # Non-backreffing grouping paren
    3. [^>'"] * # 0 or more things that are neither > nor ' nor "
    4. | # or else
    5. ".*?" # a section between double quotes (stingy match)
    6. | # or else
    7. '.*?' # a section between single quotes (stingy match)
    8. ) + # all occurring one or more times
    9. > # closing angle bracket
    10. }{}gsx; # replace with nothing, i.e. delete

    It's still not quite so clear as prose, but it is very useful for describing the meaning of each part of the pattern.

  • Different Delimiters

    While we normally think of patterns as being delimited with / characters, they can be delimited by almost any character. perlre describes this. For example, the s/// above uses braces as delimiters. Selecting another delimiter can avoid quoting the delimiter within the pattern:

    1. s/\/usr\/local/\/usr\/share/g; # bad delimiter choice
    2. s#/usr/local#/usr/share#g; # better

    Using logically paired delimiters can be even more readable:

    s{/usr/local}{/usr/share}g; # better still

I'm having trouble matching over more than one line. What's wrong?

Either you don't have more than one line in the string you're looking at (probably), or else you aren't using the correct modifier(s) on your pattern (possibly).

There are many ways to get multiline data into a string. If you want it to happen automatically while reading input, you'll want to set $/ (probably to '' for paragraphs or undef for the whole file) to allow you to read more than one line at a time.

Read perlre to help you decide which of /s and /m (or both) you might want to use: /s allows dot to include newline, and /m allows caret and dollar to match next to a newline, not just at the end of the string. You do need to make sure that you've actually got a multiline string in there.
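A tiny sketch may make the difference concrete. The sample string is made up; each pattern is tried with and without the relevant modifier.

```perl
use strict;
use warnings;

my $text = "first line\nFrom second line\n";

# Without /s, dot will not match the newline; with /s it will.
print "plain dot: ", ( $text =~ /line.From/  ? "match" : "no match" ), "\n";
print "with /s:   ", ( $text =~ /line.From/s ? "match" : "no match" ), "\n";

# Without /m, ^ matches only at the very start of the string;
# with /m it also matches just after each embedded newline.
print "plain ^:   ", ( $text =~ /^From/  ? "match" : "no match" ), "\n";
print "with /m:   ", ( $text =~ /^From/m ? "match" : "no match" ), "\n";
```

In both pairs only the second pattern matches, because the only way from "line" to "From" is across the embedded newline.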

For example, this program detects duplicate words, even when they span line breaks (but not paragraph ones). For this example, we don't need /s because we aren't using dot in a regular expression that we want to cross line boundaries. Neither do we need /m because we don't want caret or dollar to match at any point inside the record next to newlines. But it's imperative that $/ be set to something other than the default, or else we won't actually ever have a multiline record read in.

  1. $/ = ''; # read in whole paragraph, not just one line
  2. while ( <> ) {
  3. while ( /\b([\w'-]+)(\s+\g1)+\b/gi ) { # word starts alpha
  4. print "Duplicate $1 at paragraph $.\n";
  5. }
  6. }

Here's some code that finds lines that begin with "From " (which would be mangled by many mailers):

  1. $/ = ''; # read in whole paragraph, not just one line
  2. while ( <> ) {
  3. while ( /^From /gm ) { # /m makes ^ match next to \n
  4. print "leading from in paragraph $.\n";
  5. }
  6. }

Here's code that finds everything between START and END in a file:

  1. undef $/; # read in whole file, not just one line or paragraph
  2. while ( <> ) {
  3. while ( /START(.*?)END/sgm ) { # /s makes . cross line boundaries
  4. print "$1\n";
  5. }
  6. }

How can I pull out lines between two patterns that are themselves on different lines?

You can use Perl's somewhat exotic .. operator (documented in perlop):

    perl -ne 'print if /START/ .. /END/' file1 file2 ...
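
In script form, the same range operator can be sketched like this. The list of lines is invented for the example and stands in for real input:

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for a file.
my @lines = ("intro\n", "START\n", "alpha\n", "END\n", "outro\n");

my @kept;
for (@lines) {
    # In scalar context .. acts as a flip-flop: it becomes true on a
    # line matching START and stays true through the next line
    # matching END, inclusive.
    push @kept, $_ if /START/ .. /END/;
}

print @kept;   # prints the START, alpha and END lines
```

Note that the delimiting lines themselves are included in the result.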

If you wanted text and not lines, you would use

    perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ...

But if you want nested occurrences of START through END, you'll run up against the problem described in the question in this section on matching balanced text.

Here's another example of using ..:

    while (<>) {
        my $in_header =   1  .. /^$/;
        my $in_body   = /^$/ .. eof;
        # now choose between them
    } continue {
        $. = 0 if eof;  # fix $.
    }

How do I match XML, HTML, or other nasty, ugly things with a regex?

Do not use regexes. Use a module and forget about the regular expressions. The XML::LibXML, HTML::TokeParser and HTML::TreeBuilder modules are good starts, although each namespace has other parsing modules specialized for certain tasks and different ways of doing it. Start at CPAN Search ( http://metacpan.org/ ) and wonder at all the work people have done for you already! :)

I put a regular expression into $/ but it didn't work. What's wrong?

$/ has to be a string. You can use these examples if you really need to do this.

If you have File::Stream, this is easy.

    use File::Stream;

    my $stream = File::Stream->new(
        $filehandle,
        separator => qr/\s*,\s*/,
    );

    print "$_\n" while <$stream>;

If you don't have File::Stream, you have to do a little more work.

You can use the four-argument form of sysread to continually add to a buffer. After you add to the buffer, you check if you have a complete line (using your regular expression).

    local $_ = "";
    while( sysread FH, $_, 8192, length ) {
        while( s/^((?s).*?)your_pattern// ) {
            my $record = $1;
            # do stuff here.
        }
    }

You can do the same thing with foreach and a match using the c flag and the \G anchor, if you do not mind your entire file being in memory at the end.

    local $_ = "";
    while( sysread FH, $_, 8192, length ) {
        foreach my $record ( m/\G((?s).*?)your_pattern/gc ) {
            # do stuff here.
        }
        substr( $_, 0, pos ) = "" if pos;
    }
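
If the input is small enough to hold in memory anyway, a simpler sketch is to slurp it and split on the pattern. This is an alternative to the approaches above, not a replacement for them; the sample data and the in-memory filehandle are assumptions made to keep the example self-contained:

```perl
use strict;
use warnings;

# Hypothetical comma-separated data, read through an in-memory handle.
my $data = "alpha , beta,gamma ,\ndelta";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

# With $/ undefined, a single read returns the whole input.
my $content = do { local $/; <$fh> };
close $fh;

# split accepts a regular expression as the record separator.
my @records = split /\s*,\s*/, $content;
print "$_\n" for @records;
```

This trades memory for simplicity, so it suits configuration-sized files better than large logs.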

How do I substitute case-insensitively on the LHS while preserving case on the RHS?

Here's a lovely Perlish solution by Larry Rosler. It exploits properties of bitwise xor on ASCII strings.

    $_= "this is a TEsT case";

    $old = 'test';
    $new = 'success';

    s{(\Q$old\E)}
    { uc $new | (uc $1 ^ $1) .
        (uc(substr $1, -1) ^ substr $1, -1) x
        (length($new) - length $1)
    }egi;

    print;

And here it is as a subroutine, modeled after the above:

    sub preserve_case {
        my ($old, $new) = @_;
        my $mask = uc $old ^ $old;

        uc $new | $mask .
            substr($mask, -1) x (length($new) - length($old))
    }

    $string = "this is a TEsT case";
    $string =~ s/(test)/preserve_case($1, "success")/egi;
    print "$string\n";

This prints:

    this is a SUcCESS case

As an alternative, to keep the case of the replacement word if it is longer than the original, you can use this code, by Jeff Pinyan:

    sub preserve_case {
        my ($from, $to) = @_;
        my ($lf, $lt) = map length, @_;

        if ($lt < $lf) { $from = substr $from, 0, $lt }
        else { $from .= substr $to, $lf }

        return uc $to | ($from ^ uc $from);
    }

This changes the sentence to "this is a SUcCess case."

Just to show that C programmers can write C in any programming language, if you prefer a more C-like solution, the following script makes the substitution have the same case, letter by letter, as the original. (It also happens to run about 240% slower than the Perlish solution runs.) If the substitution has more characters than the string being substituted, the case of the last character is used for the rest of the substitution.

    # Original by Nathan Torkington, massaged by Jeffrey Friedl
    #
    sub preserve_case
    {
        my ($old, $new) = @_;
        my $state = 0; # 0 = no change; 1 = lc; 2 = uc
        my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new));
        my $len = $oldlen < $newlen ? $oldlen : $newlen;

        for ($i = 0; $i < $len; $i++) {
            if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) {
                $state = 0;
            } elsif (lc $c eq $c) {
                substr($new, $i, 1) = lc(substr($new, $i, 1));
                $state = 1;
            } else {
                substr($new, $i, 1) = uc(substr($new, $i, 1));
                $state = 2;
            }
        }
        # finish up with any remaining new (for when new is longer than old)
        if ($newlen > $oldlen) {
            if ($state == 1) {
                substr($new, $oldlen) = lc(substr($new, $oldlen));
            } elsif ($state == 2) {
                substr($new, $oldlen) = uc(substr($new, $oldlen));
            }
        }
        return $new;
    }

How can I make \w match national character sets?

Put use locale; in your script. The \w character class is taken from the current locale.

See perllocale for details.

How can I match a locale-smart version of /[a-zA-Z]/ ?

You can use the POSIX character class syntax /[[:alpha:]]/ documented in perlre.

No matter which locale you are in, the alphabetic characters are the characters in \w without the digits and the underscore. As a regex, that looks like /[^\W\d_]/ . Its complement, the non-alphabetics, is then everything in \W along with the digits and the underscore, or /[\W\d_]/ .
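
A small sketch of that character class in use, on a made-up sample string:

```perl
use strict;
use warnings;

my $sample = "abc_123 DEF-ghi";

# [^\W\d_] is "a word character that is neither a digit nor an
# underscore", which leaves only the alphabetics.
my @alpha_runs = $sample =~ /([^\W\d_]+)/g;

print "@alpha_runs\n";   # prints "abc DEF ghi"
```

The underscore, the digits, the space and the hyphen all act as separators here.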

How can I quote a variable to use in a regex?

The Perl parser will expand $variable and @variable references in regular expressions unless the delimiter is a single quote. Remember, too, that the right-hand side of a s/// substitution is considered a double-quoted string (see perlop for more details). Remember also that any regex special characters will be acted on unless you precede the substitution with \Q. Here's an example:

    $string = "Placido P. Octopus";
    $regex  = "P.";

    $string =~ s/$regex/Polyp/;
    # $string is now "Polypacido P. Octopus"

Because . is special in regular expressions, and can match any single character, the regex P. here has matched the <Pl> in the original string.

To escape the special meaning of ., we use \Q :

    $string = "Placido P. Octopus";
    $regex  = "P.";

    $string =~ s/\Q$regex/Polyp/;
    # $string is now "Placido Polyp Octopus"

The use of \Q causes the <.> in the regex to be treated as a regular character, so that P. matches a P followed by a dot.

What is /o really for?

(contributed by brian d foy)

The /o option for regular expressions (documented in perlop and perlreref) tells Perl to compile the regular expression only once. This is only useful when the pattern contains a variable. Perls 5.6 and later handle this automatically if the pattern does not change.

Since the match operator m//, the substitution operator s///, and the regular expression quoting operator qr// are double-quotish constructs, you can interpolate variables into the pattern. See the answer to "How can I quote a variable to use in a regex?" for more details.

This example takes a regular expression from the argument list and prints the lines of input that match it:

    my $pattern = shift @ARGV;

    while( <> ) {
        print if m/$pattern/;
    }

Versions of Perl prior to 5.6 would recompile the regular expression for each iteration, even if $pattern had not changed. The /o would prevent this by telling Perl to compile the pattern the first time, then reuse that for subsequent iterations:

    my $pattern = shift @ARGV;

    while( <> ) {
        print if m/$pattern/o; # useful for Perl < 5.6
    }

In versions 5.6 and later, Perl won't recompile the regular expression if the variable hasn't changed, so you probably don't need the /o option. It doesn't hurt, but it doesn't help either. If you want any version of Perl to compile the regular expression only once even if the variable changes (thus, only using its initial value), you still need the /o.
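
To see the "only using its initial value" effect, here is a minimal sketch. The loop and the test string are invented for the illustration:

```perl
use strict;
use warnings;

my (@with_o, @without_o);

for my $pattern (qw(a b c)) {
    # With /o the pattern is compiled once, while $pattern is 'a',
    # and that compiled form is reused on every later iteration.
    push @with_o,    $pattern if "abc" =~ /^$pattern/o;

    # Without /o the current value of $pattern is used each time.
    push @without_o, $pattern if "abc" =~ /^$pattern/;
}

print "with /o:    @with_o\n";      # a b c
print "without /o: @without_o\n";   # a
```

The /o version keeps matching against /^a/ even after $pattern has moved on, which is rarely what you want in new code.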

You can watch Perl's regular expression engine at work to verify for yourself if Perl is recompiling a regular expression. The use re 'debug' pragma (comes with Perl 5.005 and later) shows the details. With Perls before 5.6, you should see re reporting that it's compiling the regular expression on each iteration. With Perl 5.6 or later, you should only see re report that for the first iteration.

    use re 'debug';

    my $regex = 'Perl';
    foreach ( qw(Perl Java Ruby Python) ) {
        print STDERR "-" x 73, "\n";
        print STDERR "Trying $_...\n";
        print STDERR "\t$_ is good!\n" if m/$regex/;
    }

How do I use a regular expression to strip C-style comments from a file?

While this actually can be done, it's much harder than you'd think. For example, this one-liner

    perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

will work in many but not all cases. You see, it's too simple-minded for certain kinds of C programs, in particular, those with what appear to be comments in quoted strings. For that, you'd need something like this, created by Jeffrey Friedl and later modified by Fred Curtis.

    $/ = undef;
    $_ = <>;
    s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
    print;

This could, of course, be more legibly written with the /x modifier, adding whitespace and comments. Here it is expanded, courtesy of Fred Curtis.

    s{
       /\*         ##  Start of /* ... */ comment
       [^*]*\*+    ##  Non-* followed by 1-or-more *'s
       (
         [^/*][^*]*\*+
       )*          ##  0-or-more things which don't start with /
                   ##    but do end with '*'
       /           ##  End of /* ... */ comment
     |             ##  OR various things which aren't comments:
       (
         "           ##  Start of " ... " string
         (
           \\.         ##  Escaped char
         |             ##    OR
           [^"\\]      ##  Non "\
         )*
         "           ##  End of " ... " string
       |             ##  OR
         '           ##  Start of ' ... ' string
         (
           \\.         ##  Escaped char
         |             ##    OR
           [^'\\]      ##  Non '\
         )*
         '           ##  End of ' ... ' string
       |             ##  OR
         .           ##  Any other char
         [^/"'\\]*   ##  Chars which don't start a comment, string or escape
       )
     }{defined $2 ? $2 : ""}gxse;

A slight modification also removes C++ comments, possibly spanning multiple lines using a continuation character:

    s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?\n|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $3 ? $3 : ""#gse;

Can I use Perl regular expressions to match balanced text?

(contributed by brian d foy)

Your first try should probably be the Text::Balanced module, which has been in the Perl standard library since Perl 5.8. It has a variety of functions to deal with tricky text. The Regexp::Common module can also help by providing canned patterns you can use.

As of Perl 5.10, you can match balanced text with regular expressions using recursive patterns. Before Perl 5.10, you had to resort to various tricks such as using Perl code in (??{}) sequences.

Here's an example using a recursive regular expression. The goal is to capture all of the text within angle brackets, including the text in nested angle brackets. This sample text has two "major" groups: a group with one level of nesting and a group with two levels of nesting. There are five total groups in angle brackets:

    I have some <brackets in <nested brackets> > and
    <another group <nested once <nested twice> > >
    and that's it.

The regular expression to match the balanced text uses two new (to Perl 5.10) regular expression features. These are covered in perlre and this example is a modified version of one in that documentation.

First, adding the new possessive + to any quantifier finds the longest match and does not backtrack. That's important since you want to handle any angle brackets through the recursion, not backtracking. The group [^<>]++ finds one or more non-angle brackets without backtracking.

Second, the new (?PARNO) refers to the sub-pattern in the particular capture group given by PARNO. In the following regex, the first capture group finds (and remembers) the balanced text, and you need that same pattern within the first buffer to get past the nested text. That's the recursive part. The (?1) uses the pattern in the outer capture group as an independent part of the regex.

Putting it all together, you have:

    #!/usr/local/bin/perl5.10.0

    my $string =<<"HERE";
    I have some <brackets in <nested brackets> > and
    <another group <nested once <nested twice> > >
    and that's it.
    HERE

    my @groups = $string =~ m/
            (                   # start of capture group 1
            <                   # match an opening angle bracket
                (?:
                    [^<>]++     # one or more non angle brackets, non backtracking
                      |
                    (?1)        # found < or >, so recurse to capture group 1
                )*
            >                   # match a closing angle bracket
            )                   # end of capture group 1
            /xg;

    $" = "\n\t";
    print "Found:\n\t@groups\n";

The output shows that Perl found the two major groups:

    Found:
        <brackets in <nested brackets> >
        <another group <nested once <nested twice> > >

With a little extra work, you can get all of the groups in angle brackets even if they are in other angle brackets too. Each time you get a balanced match, remove its outer delimiter (that's the one you just matched so don't match it again) and add it to a queue of strings to process. Keep doing that until you get no matches:

    #!/usr/local/bin/perl5.10.0

    my @queue =<<"HERE";
    I have some <brackets in <nested brackets> > and
    <another group <nested once <nested twice> > >
    and that's it.
    HERE

    my $regex = qr/
            (                   # start of bracket 1
            <                   # match an opening angle bracket
                (?:
                    [^<>]++     # one or more non angle brackets, non backtracking
                      |
                    (?1)        # recurse to bracket 1
                )*
            >                   # match a closing angle bracket
            )                   # end of bracket 1
            /x;

    $" = "\n\t";

    while( @queue ) {
        my $string = shift @queue;

        my @groups = $string =~ m/$regex/g;
        print "Found:\n\t@groups\n\n" if @groups;

        unshift @queue, map { s/^<//; s/>$//; $_ } @groups;
    }

The output shows all of the groups. The outermost matches show up first and the nested matches show up later:

    Found:
        <brackets in <nested brackets> >
        <another group <nested once <nested twice> > >

    Found:
        <nested brackets>

    Found:
        <nested once <nested twice> >

    Found:
        <nested twice>

What does it mean that regexes are greedy? How can I get around it?

Most people mean that greedy regexes match as much as they can. Technically speaking, it's actually the quantifiers (?, *, +, {}) that are greedy rather than the whole pattern; Perl prefers local greed and immediate gratification to overall greed. To get non-greedy versions of the same quantifiers, use ??, *?, +?, and {}?.

An example:

    my $s1 = my $s2 = "I am very very cold";
    $s1 =~ s/ve.*y //;      # I am cold
    $s2 =~ s/ve.*?y //;     # I am very cold

Notice how the second substitution stopped matching as soon as it encountered "y ". The *? quantifier effectively tells the regular expression engine to find a match as quickly as possible and pass control on to whatever is next in line, as you would if you were playing hot potato.

How do I process each word on each line?

Use the split function:

    while (<>) {
        foreach my $word ( split ) {
            # do something with $word here
        }
    }

Note that this isn't really a word in the English sense; it's just chunks of consecutive non-whitespace characters.

To work with only alphanumeric sequences (including underscores), you might consider

    while (<>) {
        foreach my $word (m/(\w+)/g) {
            # do something with $word here
        }
    }

How can I print out a word-frequency or line-frequency summary?

To do this, you have to parse out each word in the input stream. We'll pretend that by word you mean chunk of alphabetics, hyphens, or apostrophes, rather than the non-whitespace chunk idea of a word given in the previous question:

    my (%seen);
    while (<>) {
        while ( /(\b[^\W_\d][\w'-]+\b)/g ) {   # misses "`sheep'"
            $seen{$1}++;
        }
    }

    while ( my ($word, $count) = each %seen ) {
        print "$count $word\n";
    }

If you wanted to do the same thing for lines, you wouldn't need a regular expression:

    my (%seen);

    while (<>) {
        $seen{$_}++;
    }

    while ( my ($line, $count) = each %seen ) {
        print "$count $line";
    }

If you want these output in a sorted order, see perlfaq4: "How do I sort a hash (optionally by value instead of key)?".
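
As a quick sketch of what sorted output could look like (the sample sentence is made up, and the tie-breaking rule is just one reasonable choice), you can sort the keys by descending count:

```perl
use strict;
use warnings;

# Hypothetical input text standing in for a real input stream.
my $text = "the cat and the hat and the bat";

my %seen;
$seen{$_}++ for $text =~ /(\w+)/g;

# Sort by descending count, breaking ties alphabetically.
my @report = map  { "$seen{$_} $_" }
             sort { $seen{$b} <=> $seen{$a} or $a cmp $b } keys %seen;

print "$_\n" for @report;
```

For this input the report starts with "3 the" and "2 and", followed by the three words that occur once in alphabetical order.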

How can I do approximate matching?

See the module String::Approx available from CPAN.

How do I efficiently match many regular expressions at once?

(contributed by brian d foy)

If you have Perl 5.10 or later, this is almost trivial. You just smart match against an array of regular expression objects:

    my @patterns = ( qr/Fr.d/, qr/B.rn.y/, qr/W.lm./ );

    if( $string ~~ @patterns ) {
        ...
    }

The smart match stops when it finds a match, so it doesn't have to try every expression.
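
Smart matching was marked experimental in Perl 5.18, so you may prefer a version that avoids it. One possible sketch uses first from the core List::Util module, which also stops at the first success; the test string here is invented for the example:

```perl
use strict;
use warnings;
use List::Util qw(first);

my @patterns = ( qr/Fr.d/, qr/B.rn.y/, qr/W.lm./ );
my $string   = "Barney Rubble";   # hypothetical input

# first() returns the first element for which the block is true and
# does not evaluate the remaining elements.
my $matched = first { $string =~ $_ } @patterns;

print "The string matched one of the patterns\n" if defined $matched;
```

Because first returns the pattern object itself, you can also find out which pattern matched, something the plain smart match does not tell you.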

Before Perl 5.10, you have a bit of work to do. You want to avoid compiling a regular expression every time you want to match it. In this example, perl must recompile the regular expression for every iteration of the foreach loop since it has no way to know what $pattern will be:

    my @patterns = qw( foo bar baz );

    LINE: while( <DATA> ) {
        foreach my $pattern ( @patterns ) {
            if( /\b$pattern\b/i ) {
                print;
                next LINE;
            }
        }
    }

The qr// operator showed up in perl 5.005. It compiles a regular expression, but doesn't apply it. When you use the pre-compiled version of the regex, perl does less work. In this example, I inserted a map to turn each pattern into its pre-compiled form. The rest of the script is the same, but faster:

    my @patterns = map { qr/\b$_\b/i } qw( foo bar baz );

    LINE: while( <> ) {
        foreach my $pattern ( @patterns ) {
            if( /$pattern/ ) {
                print;
                next LINE;
            }
        }
    }

In some cases, you may be able to make several patterns into a single regular expression. Beware of situations that require backtracking though.

    my $regex = join '|', qw( foo bar baz );

    LINE: while( <> ) {
        print if /\b(?:$regex)\b/i;
    }

For more details on regular expression efficiency, see Mastering Regular Expressions by Jeffrey Friedl. He explains how the regular expressions engine works and why some patterns are surprisingly inefficient. Once you understand how perl applies regular expressions, you can tune them for individual situations.

Why don't word-boundary searches with \b work for me?

(contributed by brian d foy)

Ensure that you know what \b really does: it's the boundary between a word character, \w, and something that isn't a word character. That thing that isn't a word character might be \W, but it can also be the start or end of the string.

It's not (not!) the boundary between whitespace and non-whitespace, and it's not the stuff between words we use to create sentences.

In regex speak, a word boundary (\b) is a "zero width assertion", meaning that it doesn't represent a character in the string, but a condition at a certain position.

For the regular expression /\bPerl\b/, there has to be a word boundary before the "P" and after the "l". As long as something other than a word character precedes the "P" and follows the "l", the pattern will match. These strings match /\bPerl\b/.

    "Perl"    # no word char before P or after l
    "Perl "   # same as previous (space is not a word char)
    "'Perl'"  # the ' char is not a word char
    "Perl's"  # no word char before P, non-word char after "l"

These strings do not match /\bPerl\b/.

    "Perl_"   # _ is a word char!
    "Perler"  # no word char before P, but one after l
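
You can check the two lists above for yourself with a few lines of Perl. This is only a sketch that feeds the same sample strings through grep:

```perl
use strict;
use warnings;

my @candidates = ("Perl", "Perl ", "'Perl'", "Perl's", "Perl_", "Perler");

# Keep only the strings in which "Perl" has a word boundary on both sides.
my @matching = grep { /\bPerl\b/ } @candidates;

print "[$_]\n" for @matching;
```

The first four strings are printed; "Perl_" and "Perler" are not, because a word character follows the "l" in both.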

You don't have to use \b to match words though. You can look for non-word characters surrounded by word characters. These strings match the pattern /\b'\b/.

    "don't"   # the ' char is surrounded by "n" and "t"
    "qep'a'"  # the ' char is surrounded by "p" and "a"

These strings do not match /\b'\b/.

    "foo'"    # there is no word char after non-word '

You can also use the complement of \b, \B, to specify that there should not be a word boundary.

In the pattern /\Bam\B/, there must be a word character before the "a" and after the "m". These strings match /\Bam\B/:

    "llama"   # "am" surrounded by word chars
    "Samuel"  # same

These strings do not match /\Bam\B/:

    "Sam"      # no word boundary before "a", but one after "m"
    "I am Sam" # "am" surrounded by non-word chars

Why does using $&, $`, or $' slow my program down?

(contributed by Anno Siegel)

Once Perl sees that you need one of these variables anywhere in the program, it provides them on each and every pattern match. That means that on every pattern match the entire string will be copied, part of it to $`, part to $&, and part to $'. Thus the penalty is most severe with long strings and patterns that match often. Avoid $&, $', and $` if you can, but if you can't, once you've used them at all, use them at will because you've already paid the price. Remember that some algorithms really appreciate them. As of the 5.005 release, the $& variable is no longer "expensive" the way the other two are.

Since Perl 5.6.1 the special variables @- and @+ can functionally replace $`, $& and $'. These arrays contain the offsets of the beginning and end of each match (see perlvar for the full story), so they give you essentially the same information, but without the risk of excessive string copying.
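
As a sketch of how those offsets map onto the older variables (the sample string is made up), substr can recover each part of the string after a successful match:

```perl
use strict;
use warnings;

my $string = "hello world again";
my ($pre, $match, $post);

if ($string =~ /wor\w+/) {
    # $-[0] is the offset where the match starts, $+[0] where it ends.
    $pre   = substr($string, 0, $-[0]);               # like $`
    $match = substr($string, $-[0], $+[0] - $-[0]);   # like $&
    $post  = substr($string, $+[0]);                  # like $'
}

print "[$pre][$match][$post]\n";   # prints "[hello ][world][ again]"
```

Only the pieces you actually ask for are copied, and only for this one match.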

Perl 5.10 added three specials, ${^MATCH} , ${^PREMATCH} , and ${^POSTMATCH} to do the same job but without the global performance penalty. Perl 5.10 only sets these variables if you compile or execute the regular expression with the /p modifier.
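
A minimal sketch of the /p form, using an invented sample string and assuming Perl 5.10 or later:

```perl
use strict;
use warnings;

my $string = "hello world again";
my ($pre, $match, $post);

# The /p modifier asks Perl to preserve the matched string so that
# the ${^...} variables are available for this match.
if ($string =~ /wor\w+/p) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "[$pre][$match][$post]\n";   # prints "[hello ][world][ again]"
```

Copying the values into lexical variables straight away keeps them safe from later matches.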

What good is \G in a regular expression?

You use the \G anchor to start the next match on the same string where the last match left off. The regular expression engine cannot skip over any characters to find the next match with this anchor, so \G is similar to the beginning of string anchor, ^. The \G anchor is typically used with the g flag. It uses the value of pos() as the position to start the next match. As the match operator makes successive matches, it updates pos() with the position of the next character past the last match (or the first character of the next match, depending on how you like to look at it). Each string has its own pos() value.

Suppose you want to match all of the consecutive pairs of digits in a string like "1122a44" and stop matching when you encounter non-digits. You want to match 11 and 22, but the letter a shows up between 22 and 44 and you want to stop at a. Simply matching pairs of digits skips over the a and still matches 44.

    $_ = "1122a44";
    my @pairs = m/(\d\d)/g;   # qw( 11 22 44 )

If you use the \G anchor, you force the match after 22 to start with the a. The regular expression cannot match there since it does not find a digit, so the next match fails and the match operator returns the pairs it already found.

    $_ = "1122a44";
    my @pairs = m/\G(\d\d)/g; # qw( 11 22 )

You can also use the \G anchor in scalar context. You still need the g flag.

    $_ = "1122a44";
    while( m/\G(\d\d)/g ) {
        print "Found $1\n";
    }

After the match fails at the letter a, perl resets pos() and the next match on the same string starts at the beginning.

    $_ = "1122a44";
    while( m/\G(\d\d)/g ) {
        print "Found $1\n";
    }

    print "Found $1 after while" if m/(\d\d)/g; # finds "11"

You can disable pos() resets on fail with the c flag, documented in perlop and perlreref. Subsequent matches start where the last successful match ended (the value of pos()) even if a match on the same string has failed in the meantime. In this case, the match after the while() loop starts at the a (where the last match stopped), and since it does not use any anchor it can skip over the a to find 44.

    $_ = "1122a44";
    while( m/\G(\d\d)/gc ) {
        print "Found $1\n";
    }

    print "Found $1 after while" if m/(\d\d)/g; # finds "44"

Typically you use the \G anchor with the c flag when you want to try a different match if one fails, such as in a tokenizer. Jeffrey Friedl offers this example which works in 5.004 or later.

    while (<>) {
        chomp;
        PARSER: {
            m/ \G( \d+\b    )/gcx && do { print "number: $1\n"; redo; };
            m/ \G( \w+      )/gcx && do { print "word: $1\n"; redo; };
            m/ \G( \s+      )/gcx && do { print "space: $1\n"; redo; };
            m/ \G( [^\w\d]+ )/gcx && do { print "other: $1\n"; redo; };
        }
    }

For each line, the PARSER loop first tries to match a series of digits followed by a word boundary. This match has to start at the place the last match left off (or the beginning of the string on the first match). Since m/ \G( \d+\b )/gcx uses the c flag, if the string does not match that regular expression, perl does not reset pos() and the next match starts at the same position to try a different pattern.

Are Perl regexes DFAs or NFAs? Are they POSIX compliant?

While it's true that Perl's regular expressions resemble the DFAs (deterministic finite automata) of the egrep(1) program, they are in fact implemented as NFAs (non-deterministic finite automata) to allow backtracking and backreferencing. And they aren't POSIX-style either, because those guarantee worst-case behavior for all cases. (It seems that some people prefer guarantees of consistency, even when what's guaranteed is slowness.) See the book "Mastering Regular Expressions" (from O'Reilly) by Jeffrey Friedl for all the details you could ever hope to know on these matters (a full citation appears in perlfaq2).

What's wrong with using grep in a void context?

The problem is that grep builds a return list, regardless of the context. This means you're making Perl go to the trouble of building a list that you then just throw away. If the list is large, you waste both time and space. If your intent is to iterate over the list, then use a for loop for this purpose.

In perls older than 5.8.1, map suffers from this problem as well. But since 5.8.1, this has been fixed, and map is context aware - in void context, no lists are constructed.
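
As a small sketch of the alternative (the running total is just a stand-in for whatever side effect you need), a statement-modifier for loop does the iteration without building a result list:

```perl
use strict;
use warnings;

my @numbers = (1 .. 5);
my $total   = 0;

# Rather than: grep { $total += $_ } @numbers;
# which builds a list only to throw it away, iterate directly.
$total += $_ for @numbers;

print "$total\n";   # prints 15
```

The loop also makes it obvious to the next reader that you wanted the side effect and not a filtered list.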

How can I match strings with multibyte characters?

Starting from Perl 5.6 Perl has had some level of multibyte character support. Perl 5.8 or later is recommended. Supported multibyte character repertoires include Unicode, and legacy encodings through the Encode module. See perluniintro, perlunicode, and Encode.

If you are stuck with older Perls, you can do Unicode with the Unicode::String module, and character conversions using the Unicode::Map8 and Unicode::Map modules. If you are using Japanese encodings, you might try using the jperl 5.005_03.

Finally, the following set of approaches was offered by Jeffrey Friedl, whose article in issue #5 of The Perl Journal talks about this very matter.

Let's suppose you have some weird Martian encoding where pairs of ASCII uppercase letters encode single Martian letters (i.e. the two bytes "CV" make a single Martian letter, as do the two bytes "SG", "VS", "XX", etc.). Other bytes represent single characters, just like ASCII.

So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the nine characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'.

Now, say you want to search for the single character /GX/. Perl doesn't know about Martian, so it'll find the two bytes "GX" in the "I am CVSGXX!" string, even though that character isn't there: it just looks like it is because "SG" is next to "XX", but there's no real "GX". This is a big problem.

Here are a few ways, all painful, to deal with it:

    # Make sure adjacent "martian" bytes are no longer adjacent.
    $martian =~ s/([A-Z][A-Z])/ $1 /g;

    print "found GX!\n" if $martian =~ /GX/;

Or like this:

    my @chars = $martian =~ m/([A-Z][A-Z]|[^A-Z])/g;
    # above is conceptually similar to:     my @chars = $text =~ m/(.)/g;
    #
    foreach my $char (@chars) {
        if ($char eq 'GX') {
            print "found GX!\n";
            last;
        }
    }

Or like this:

    while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) {  # \G probably unneeded
        if ($1 eq 'GX') {
            print "found GX!\n";
            last;
        }
    }

Here's another, slightly less painful, way to do it from Benjamin Goldberg, who uses a zero-width negative look-behind assertion.

    print "found GX!\n" if $martian =~ m/
        (?<![A-Z])
        (?:[A-Z][A-Z])*?
        GX
        /x;

This succeeds if the "martian" character GX is in the string, and fails otherwise. If you don't like using (?<!), a zero-width negative look-behind assertion, you can replace (?<![A-Z]) with (?:^|[^A-Z]).

It does have the drawback of putting the wrong thing in $-[0] and $+[0], but this usually can be worked around.

How do I match a regular expression that's in a variable?

(contributed by brian d foy)

We don't have to hard-code patterns into the match operator (or anything else that works with regular expressions). We can put the pattern in a variable for later use.

The match operator is a double quote context, so you can interpolate your variable just like a double quoted string. In this case, you read the regular expression as user input and store it in $regex. Once you have the pattern in $regex, you use that variable in the match operator.

    chomp( my $regex = <STDIN> );

    if( $string =~ m/$regex/ ) { ... }

Any regular expression special characters in $regex are still special, and the pattern still has to be valid or Perl will complain. For instance, in this pattern there is an unpaired parenthesis.

    my $regex = "Unmatched ( paren";

    "Two parens to bind them all" =~ m/$regex/;

When Perl compiles the regular expression, it treats the parenthesis as the start of a capture group. When it doesn't find the closing parenthesis, it complains:

    Unmatched ( in regex; marked by <-- HERE in m/Unmatched ( <-- HERE paren/ at script line 3.

You can get around this in several ways depending on your situation. First, if you don't want any of the characters in the string to be special, you can escape them with quotemeta before you use the string.

    chomp( my $regex = <STDIN> );
    $regex = quotemeta( $regex );

    if( $string =~ m/$regex/ ) { ... }

You can also do this directly in the match operator using the \Q and \E sequences. The \Q tells Perl where to start escaping special characters, and the \E tells it where to stop (see perlop for more details).

    chomp( my $regex = <STDIN> );

    if( $string =~ m/\Q$regex\E/ ) { ... }

Alternatively, you can use qr//, the regular expression quote operator (see perlop for more details). It quotes and perhaps compiles the pattern, and you can apply regular expression flags to the pattern.

    chomp( my $input = <STDIN> );

    my $regex = qr/$input/is;

    $string =~ m/$regex/;  # same as m/$input/is

You might also want to trap any errors by wrapping an eval block around the whole thing.

    chomp( my $input = <STDIN> );

    eval {
        if( $string =~ m/\Q$input\E/ ) { ... }
    };
    warn $@ if $@;

Or...

    my $regex = eval { qr/$input/is };
    if( defined $regex ) {
        $string =~ m/$regex/;
    }
    else {
        warn $@;
    }

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.

 

Perl 5 version 18.2 documentation

perlfaq7

NAME

perlfaq7 - General Perl Language Issues

DESCRIPTION

This section deals with general Perl language issues that don't clearly fit into any of the other sections.

Can I get a BNF/yacc/RE for the Perl language?

There is no BNF, but you can paw your way through the yacc grammar in perly.y in the source distribution if you're particularly brave. The grammar relies on very smart tokenizing code, so be prepared to venture into toke.c as well.

In the words of Chaim Frenkel: "Perl's grammar can not be reduced to BNF. The work of parsing perl is distributed between yacc, the lexer, smoke and mirrors."

What are all these $@%&* punctuation signs, and how do I know when to use them?

They are type specifiers, as detailed in perldata:

    $ for scalar values (number, string or reference)
    @ for arrays
    % for hashes (associative arrays)
    & for subroutines (aka functions, procedures, methods)
    * for all types of that symbol name. In version 4 you used them like
      pointers, but in modern perls you can just use references.

There are a couple of other symbols that you're likely to encounter that aren't really type specifiers:

    <> are used for inputting a record from a filehandle.
    \  takes a reference to something.

Note that <FILE> is neither the type specifier for files nor the name of the handle. It is the <> operator applied to the handle FILE. It reads one line (well, record--see $/ in perlvar) from the handle FILE in scalar context, or all lines in list context. When performing open, close, or any other operation besides <> on files, or even when talking about the handle, do not use the brackets. These are correct: eof(FH), seek(FH, 0, 2) and "copying from STDIN to FILE".
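Here is a minimal sketch of the two contexts. It assumes an in-memory filehandle (opened on a reference to a string) so that no real file is needed; the variable names are only illustrative:

```perl
use strict;
use warnings;

# Open a read handle on an in-memory string instead of a real file.
my $data = "one\ntwo\nthree\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my $first = <$fh>;    # scalar context: reads a single record
my @rest  = <$fh>;    # list context: reads all remaining records

close $fh;            # no angle brackets when naming the handle

chomp( $first, @rest );
print "first: $first\n";    # first: one
print "rest:  @rest\n";     # rest:  two three
```

The brackets appear only where a record is actually read; open and close take the bare handle.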

Do I always/never have to quote my strings or use semicolons and commas?

Normally, a bareword doesn't need to be quoted, but in most cases probably should be (and must be under use strict ). But a hash key consisting of a simple word and the left-hand operand to the => operator both count as though they were quoted:

    This is         like this
    ------------    ---------------
    $foo{line}      $foo{'line'}
    bar => stuff    'bar' => stuff

The final semicolon in a block is optional, as is the final comma in a list. Good style (see perlstyle) says to put them in except for one-liners:

    if ($whoops) { exit 1 }
    my @nums = (1, 2, 3);

    if ($whoops) {
        exit 1;
    }

    my @lines = (
        "There Beren came from mountains cold",
        "And lost he wandered under leaves",
    );

How do I skip some return values?

One way is to treat the return values as a list and index into it:

  1. $dir = (getpwnam($user))[7];

Another way is to use undef as an element on the left-hand-side:

  1. ($dev, $ino, undef, undef, $uid, $gid) = stat($file);

You can also use a list slice to select only the elements that you need:

  1. ($dev, $ino, $uid, $gid) = ( stat($file) )[0,1,4,5];

How do I temporarily block warnings?

If you are running Perl 5.6.0 or better, the use warnings pragma allows fine control of what warnings are produced. See perllexwarn for more details.

  1. {
  2. no warnings; # temporarily turn off warnings
  3. $x = $y + $z; # I know these might be undef
  4. }

Additionally, you can enable and disable categories of warnings. You turn off the categories you want to ignore and you can still get other categories of warnings. See perllexwarn for the complete details, including the category names and hierarchy.

  1. {
  2. no warnings 'uninitialized';
  3. $x = $y + $z;
  4. }

If you have an older version of Perl, the $^W variable (documented in perlvar) controls runtime warnings for a block:

  1. {
  2. local $^W = 0; # temporarily turn off warnings
  3. $x = $y + $z; # I know these might be undef
  4. }

Note that like all the punctuation variables, you cannot currently use my() on $^W , only local().

What's an extension?

An extension is a way of calling compiled C code from Perl. Reading perlxstut is a good place to learn more about extensions.

Why do Perl operators have different precedence than C operators?

Actually, they don't. All C operators that Perl copies have the same precedence in Perl as they do in C. The problem is with operators that C doesn't have, especially functions that give a list context to everything on their right, eg. print, chmod, exec, and so on. Such functions are called "list operators" and appear as such in the precedence table in perlop.

A common mistake is to write:

  1. unlink $file || die "snafu";

This gets interpreted as:

  1. unlink ($file || die "snafu");

To avoid this problem, either put in extra parentheses or use the super low precedence or operator:

  1. (unlink $file) || die "snafu";
  2. unlink $file or die "snafu";

The "English" operators (and , or , xor , and not ) deliberately have precedence lower than that of list operators for just such situations as the one above.

Another operator with surprising precedence is exponentiation. It binds more tightly even than unary minus, making -2**2 produce a negative four and not a positive one. It is also right-associating, meaning that 2**3**2 is two raised to the ninth power, not eight squared.
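A short sketch makes both exponentiation rules concrete; the variable names are only illustrative:

```perl
use strict;
use warnings;

my $neg   = -2**2;      # parsed as -(2**2)
my $tower = 2**3**2;    # parsed as 2**(3**2)

print "$neg\n";      # -4
print "$tower\n";    # 512
print( (-2)**2, "\n" );    # 4, parentheses apply the unary minus first
```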

Although it has the same precedence as in C, Perl's ?: operator produces an lvalue. This assigns $x to either $if_true or $if_false, depending on the trueness of $maybe:

  1. ($maybe ? $if_true : $if_false) = $x;
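As a small runnable sketch of that lvalue behavior (the starting values are made up for the demonstration):

```perl
use strict;
use warnings;

my ( $if_true, $if_false ) = ( 'untouched', 'untouched' );
my $x     = 42;
my $maybe = 0;

# The conditional picks which variable receives the assignment.
( $maybe ? $if_true : $if_false ) = $x;

print "if_true:  $if_true\n";     # if_true:  untouched
print "if_false: $if_false\n";    # if_false: 42
```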

How do I declare/create a structure?

In general, you don't "declare" a structure. Just use a (probably anonymous) hash reference. See perlref and perldsc for details. Here's an example:

  1. $person = {}; # new anonymous hash
  2. $person->{AGE} = 24; # set field AGE to 24
  3. $person->{NAME} = "Nat"; # set field NAME to "Nat"

If you're looking for something a bit more rigorous, try perltoot.

How do I create a module?

perlnewmod is a good place to start. Ignore the bits about uploading to CPAN if you don't want to make your module publicly available.

ExtUtils::ModuleMaker and Module::Starter are also good places to start. Many CPAN authors now use Dist::Zilla to automate as much as possible.

Detailed documentation about modules can be found at: perlmod, perlmodlib, perlmodstyle.

If you need to include C code or C library interfaces use h2xs. h2xs will create the module distribution structure and the initial interface files. perlxs and perlxstut explain the details.

How do I adopt or take over a module already on CPAN?

Ask the current maintainer to make you a co-maintainer or transfer the module to you.

If you cannot reach the author for some reason, contact the PAUSE admins at modules@perl.org, who may be able to help, but each case is treated separately.

  • Get a login for the Perl Authors Upload Server (PAUSE) if you don't already have one: http://pause.perl.org

  • Write to modules@perl.org explaining what you did to contact the current maintainer. The PAUSE admins will also try to reach the maintainer.

  • Post a public message in a heavily trafficked site announcing your intention to take over the module.

  • Wait a bit. The PAUSE admins don't want to act too quickly in case the current maintainer is on holiday. If there's no response to private communication or the public post, a PAUSE admin can transfer it to you.

How do I create a class?

(contributed by brian d foy)

In Perl, a class is just a package, and methods are just subroutines. Perl doesn't get more formal than that and lets you set up the package just the way that you like it (that is, it doesn't set up anything for you).

The Perl documentation has several tutorials that cover class creation, including perlboot (Barnyard Object Oriented Tutorial), perltoot (Tom's Object Oriented Tutorial), perlbot (Bag o' Object Tricks), and perlobj.
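To make "a class is just a package" concrete, here is a minimal sketch. The class name Counter and its methods are hypothetical, chosen only for illustration; the tutorials above cover the many variations:

```perl
use strict;
use warnings;

package Counter;    # hypothetical class name

# Constructor: bless a hash reference into the class.
sub new {
    my ( $class, %args ) = @_;
    my $self = { count => $args{start} || 0 };
    return bless $self, $class;
}

# Methods are ordinary subroutines that receive the object first.
sub increment { my $self = shift; return ++$self->{count} }
sub count     { my $self = shift; return $self->{count} }

package main;

my $c = Counter->new( start => 5 );
$c->increment;
print $c->count, "\n";    # 6
```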

How can I tell if a variable is tainted?

You can use the tainted() function of the Scalar::Util module, available from CPAN (or included with Perl since release 5.8.0). See also Laundering and Detecting Tainted Data in perlsec.
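A minimal sketch of calling tainted() follows. Run without the -T switch, nothing is ever tainted, so both lines report "clean"; under -T you would expect the environment value to be reported as tainted:

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

my $literal = "hello";
my $env     = defined $ENV{PATH} ? $ENV{PATH} : '';

# tainted() returns true only for data that taint mode has marked.
print "literal is ", ( tainted($literal) ? "tainted" : "clean" ), "\n";
print "PATH is ",    ( tainted($env)     ? "tainted" : "clean" ), "\n";
```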

What's a closure?

Closures are documented in perlref.

Closure is a computer science term with a precise but hard-to-explain meaning. Usually, closures are implemented in Perl as anonymous subroutines with lasting references to lexical variables outside their own scopes. These lexicals magically refer to the variables that were around when the subroutine was defined (deep binding).

Closures are most often used in programming languages where you can have the return value of a function be itself a function, as you can in Perl. Note that some languages provide anonymous functions but are not capable of providing proper closures: early versions of the Python language, for example. For more information on closures, check out any textbook on functional programming. Scheme is a language that not only supports but encourages closures.

Here's a classic non-closure function-generating function:

  1. sub add_function_generator {
  2. return sub { shift() + shift() };
  3. }
  4. my $add_sub = add_function_generator();
  5. my $sum = $add_sub->(4,5); # $sum is 9 now.

The anonymous subroutine returned by add_function_generator() isn't technically a closure because it refers to no lexicals outside its own scope. Using a closure gives you a function template with some customization slots left out to be filled later.

Contrast this with the following make_adder() function, in which the returned anonymous function contains a reference to a lexical variable outside the scope of that function itself. Such a reference requires that Perl return a proper closure, thus locking in for all time the value that the lexical had when the function was created.

  1. sub make_adder {
  2. my $addpiece = shift;
  3. return sub { shift() + $addpiece };
  4. }
  5. my $f1 = make_adder(20);
  6. my $f2 = make_adder(555);

Now $f1->($n) is always 20 plus whatever $n you pass in, whereas $f2->($n) is always 555 plus whatever $n you pass in. The $addpiece in the closure sticks around.
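Calling the two generated functions shows that each one keeps its own $addpiece. This sketch repeats make_adder() so that it runs on its own:

```perl
use strict;
use warnings;

sub make_adder {
    my $addpiece = shift;
    return sub { shift() + $addpiece };
}

my $f1 = make_adder(20);
my $f2 = make_adder(555);

print $f1->(10), "\n";    # 30
print $f2->(10), "\n";    # 565
print $f1->(1),  "\n";    # 21, $f1 still remembers its own 20
```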

Closures are often used for less esoteric purposes. For example, when you want to pass in a bit of code into a function:

  1. my $line;
  2. timeout( 30, sub { $line = <STDIN> } );

If the code to execute had been passed in as a string, '$line = <STDIN>' , there would have been no way for the hypothetical timeout() function to access the lexical variable $line back in its caller's scope.

Another use for a closure is to make a variable private to a named subroutine, e.g. a counter that gets initialized at creation time of the sub and can only be modified from within the sub. This is sometimes used with a BEGIN block in package files to make sure a variable doesn't get meddled with during the lifetime of the package:

  1. BEGIN {
  2. my $id = 0;
  3. sub next_id { ++$id }
  4. }

This is discussed in more detail in perlsub; see the entry on Persistent Private Variables.

What is variable suicide and how can I prevent it?

This problem was fixed in perl 5.004_05, so preventing it means upgrading your version of perl. ;)

Variable suicide is when you (temporarily or permanently) lose the value of a variable. It is caused by scoping through my() and local() interacting with either closures or aliased foreach() iterator variables and subroutine arguments. It used to be easy to inadvertently lose a variable's value this way, but now it's much harder. Take this code:

  1. my $f = 'foo';
  2. sub T {
  3. while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" }
  4. }
  5. T;
  6. print "Finally $f\n";

If you are experiencing variable suicide, that my $f in the subroutine doesn't pick up a fresh copy of the $f whose value is 'foo' . The output shows that inside the subroutine the value of $f leaks through when it shouldn't, as in this output:

  1. foobar
  2. foobarbar
  3. foobarbarbar
  4. Finally foo

The $f that has "bar" added to it three times should be a new $f each time; my $f should create a new lexical variable each time through the loop. The expected output is:

  1. foobar
  2. foobar
  3. foobar
  4. Finally foo

How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}?

You need to pass references to these objects. See Pass by Reference in perlsub for this particular question, and perlref for information on references.

  • Passing Variables and Functions

    Regular variables and functions are quite easy to pass: just pass in a reference to an existing or anonymous variable or function:

    1. func( \$some_scalar );
    2. func( \@some_array );
    3. func( [ 1 .. 10 ] );
    4. func( \%some_hash );
    5. func( { this => 10, that => 20 } );
    6. func( \&some_func );
    7. func( sub { $_[0] ** $_[1] } );
  • Passing Filehandles

    As of Perl 5.6, you can represent filehandles with scalar variables which you treat as any other scalar.

    1. open my $fh, $filename or die "Cannot open $filename! $!";
    2. func( $fh );
    3. sub func {
    4. my $passed_fh = shift;
    5. my $line = <$passed_fh>;
    6. }

    Before Perl 5.6, you had to use the *FH or \*FH notations. These are "typeglobs"--see Typeglobs and Filehandles in perldata and especially Pass by Reference in perlsub for more information.

  • Passing Regexes

    Here's an example of how to pass in a string and a regular expression for it to match against. You construct the pattern with the qr// operator:

    1. sub compare {
    2. my ($val1, $regex) = @_;
    3. my $retval = $val1 =~ /$regex/;
    4. return $retval;
    5. }
    6. $match = compare("old McDonald", qr/d.*D/i);
  • Passing Methods

    To pass an object method into a subroutine, you can do this:

        call_a_lot(10, $some_obj, "methname");

        sub call_a_lot {
            my ($count, $widget, $trick) = @_;
            for (my $i = 0; $i < $count; $i++) {
                $widget->$trick();
            }
        }

    Or, you can use a closure to bundle up the object, its method call, and arguments:

    1. my $whatnot = sub { $some_obj->obfuscate(@args) };
    2. func($whatnot);
    3. sub func {
    4. my $code = shift;
    5. &$code();
    6. }

    You could also investigate the can() method in the UNIVERSAL class (part of the standard perl distribution).
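    As a sketch of the can() approach: it returns a code reference when the method exists, which you can then pass around or call on the object. The Widget class and its spin method are hypothetical, made up for this example:

```perl
use strict;
use warnings;

package Widget;    # hypothetical class for illustration
sub new  { return bless {}, shift }
sub spin { my ( $self, $times ) = @_; return "spun $times times" }

package main;

my $obj = Widget->new;

# can() returns a code reference if the method exists, or false if not.
if ( my $method = $obj->can('spin') ) {
    print $obj->$method(3), "\n";    # spun 3 times
}
print "no such method\n" unless $obj->can('explode');
```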

How do I create a static variable?

(contributed by brian d foy)

In Perl 5.10 and later, declare the variable with state, which you enable with use feature 'state' (or implicitly with use 5.010). The state declaration creates a lexical variable that persists between calls to the subroutine:

    use 5.010;
    sub counter { state $count = 1; $count++ }

You can fake a static variable by using a lexical variable which goes out of scope. In this example, you define the subroutine counter, and it uses the lexical variable $count. Since you wrap this in a BEGIN block, $count is defined at compile-time, but also goes out of scope at the end of the BEGIN block. The BEGIN block also ensures that the subroutine and the value it uses are defined at compile-time so the subroutine is ready to use just like any other subroutine, and you can put this code in the same place as other subroutines in the program text (i.e. at the end of the code, typically). The subroutine counter still has a reference to the data, and is the only way you can access the value (and each time you do, you increment the value). The data in the chunk of memory defined by $count is private to counter.

  1. BEGIN {
  2. my $count = 1;
  3. sub counter { $count++ }
  4. }
  5. my $start = counter();
  6. .... # code that calls counter();
  7. my $end = counter();

In the previous example, you created a function-private variable because only one function remembered its reference. You could define multiple functions while the variable is in scope, and each function can share the "private" variable. It's not really "static" because you can access it outside the function while the lexical variable is in scope, and even create references to it. In this example, increment_count and return_count share the variable. One function adds to the value and the other simply returns the value. They can both access $count , and since it has gone out of scope, there is no other way to access it.

  1. BEGIN {
  2. my $count = 1;
  3. sub increment_count { $count++ }
  4. sub return_count { $count }
  5. }

To declare a file-private variable, you still use a lexical variable. A file is also a scope, so a lexical variable defined in the file cannot be seen from any other file.

See Persistent Private Variables in perlsub for more information. The discussion of closures in perlref may help you even though we did not use anonymous subroutines in this answer.

What's the difference between dynamic and lexical (static) scoping? Between local() and my()?

local($x) saves away the old value of the global variable $x and assigns a new value for the duration of the enclosing block (typically a subroutine body). The new value is visible in other functions called from that block. This is done at run-time, so it is called dynamic scoping. local() always affects global variables, also called package variables or dynamic variables.

my($x) creates a new variable that is only visible in the enclosing block (again, typically a subroutine body). This is done at compile-time, so it is called lexical or static scoping. my() always affects private variables, also called lexical variables or (improperly) static(ly scoped) variables.

For instance:

    sub visible {
        print "var has value $var\n";
    }

    sub dynamic {
        local $var = 'local';   # new temporary value for the still-global
        visible();              #   variable called $var
    }

    sub lexical {
        my $var = 'private';    # new private variable, $var
        visible();              # (invisible outside of sub scope)
    }

    $var = 'global';

    visible();                  # prints global
    dynamic();                  # prints local
    lexical();                  # prints global

Notice how at no point does the value "private" get printed. That's because $var only has that value within the block of the lexical() function, and it is hidden from the called subroutine.

In summary, local() doesn't make what you think of as private, local variables. It gives a global variable a temporary value. my() is what you're looking for if you want private variables.

See Private Variables via my() in perlsub and Temporary Values via local() in perlsub for excruciating details.

How can I access a dynamic variable while a similarly named lexical is in scope?

If you know your package, you can just mention it explicitly, as in $Some_Pack::var. Note that the notation $::var is not the dynamic $var in the current package, but rather the one in the "main" package, as though you had written $main::var.

  1. use vars '$var';
  2. local $var = "global";
  3. my $var = "lexical";
  4. print "lexical is $var\n";
  5. print "global is $main::var\n";

Alternatively you can use the compiler directive our() to bring a dynamic variable into the current lexical scope.

  1. require 5.006; # our() did not exist before 5.6
  2. use vars '$var';
  3. local $var = "global";
  4. my $var = "lexical";
  5. print "lexical is $var\n";
  6. {
  7. our $var;
  8. print "global is $var\n";
  9. }

What's the difference between deep and shallow binding?

In deep binding, lexical variables mentioned in anonymous subroutines are the same ones that were in scope when the subroutine was created. In shallow binding, they are whichever variables with the same names happen to be in scope when the subroutine is called. Perl always uses deep binding of lexical variables (i.e., those created with my()). However, dynamic variables (aka global, local, or package variables) are effectively shallowly bound. Consider this just one more reason not to use them. See the answer to What's a closure?.
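The difference can be seen in a short sketch. The variable and subroutine names here are made up for illustration; the anonymous sub keeps the lexical it was created with, but sees whatever value the dynamic variable has at call time:

```perl
use strict;
use warnings;

our $dynamic = 'outer dynamic';
my  $lexical = 'outer lexical';

# The anonymous sub captures $lexical itself, but looks up $dynamic
# by name each time it runs.
my $show = sub { return "$lexical / $dynamic" };

sub call_it {
    my $code = shift;
    local $dynamic = 'inner dynamic';    # temporary value, seen by callees
    my $lexical    = 'inner lexical';    # a different variable, unseen by $code
    return $code->();
}

print call_it($show), "\n";    # outer lexical / inner dynamic
print $show->(), "\n";         # outer lexical / outer dynamic
```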

Why doesn't "my($foo) = <$fh>;" work right?

my() and local() give list context to the right hand side of = . The <$fh> read operation, like so many of Perl's functions and operators, can tell which context it was called in and behaves appropriately. In general, the scalar() function can help. This function does nothing to the data itself (contrary to popular myth) but rather tells its argument to behave in whatever its scalar fashion is. If that function doesn't have a defined scalar behavior, this of course doesn't help you (such as with sort()).

To enforce scalar context in this particular case, however, you need merely omit the parentheses:

  1. local($foo) = <$fh>; # WRONG
  2. local($foo) = scalar(<$fh>); # ok
  3. local $foo = <$fh>; # right

You should probably be using lexical variables anyway, although the issue is the same here:

  1. my($foo) = <$fh>; # WRONG
  2. my $foo = <$fh>; # right
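To see what the parentheses cost you, here is a sketch that reads from an in-memory filehandle (an assumption made so the example needs no real file). The list assignment consumes every line even though it keeps only the first:

```perl
use strict;
use warnings;

my $data = "first\nsecond\nthird\n";

open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my ($foo) = <$fh>;    # list context: reads every line, keeps only the first
my $next  = <$fh>;    # nothing left to read, so this is undef
close $fh;

open $fh, '<', \$data or die "Cannot reopen in-memory handle: $!";
my $bar  = <$fh>;     # scalar context: reads exactly one line
my $more = <$fh>;     # the second line is still available
close $fh;

print defined $next ? "next: $next" : "next: (nothing left)\n";
print "more: $more";
```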

How do I redefine a builtin function, operator, or method?

Why do you want to do that? :-)

If you want to override a predefined function, such as open(), then you'll have to import the new definition from a different module. See Overriding Built-in Functions in perlsub.

If you want to overload a Perl operator, such as + or ** , then you'll want to use the use overload pragma, documented in overload.
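As a rough sketch of operator overloading, the following defines + and string conversion for a small class. Vec2 and its methods are hypothetical names invented for this example; see overload for the full set of overloadable operations:

```perl
use strict;
use warnings;

package Vec2;    # hypothetical two-dimensional vector class

use overload
    '+'  => \&add,
    '""' => \&to_string;

sub new {
    my ( $class, $x, $y ) = @_;
    return bless { x => $x, y => $y }, $class;
}

# Called for $left + $right when either operand is a Vec2.
sub add {
    my ( $left, $right ) = @_;
    return Vec2->new( $left->{x} + $right->{x}, $left->{y} + $right->{y} );
}

# Called whenever a Vec2 is used as a string.
sub to_string { my $self = shift; return "($self->{x}, $self->{y})" }

package main;

my $sum = Vec2->new( 1, 2 ) + Vec2->new( 3, 4 );
print "$sum\n";    # (4, 6)
```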

If you're talking about obscuring method calls in parent classes, see Overridden Methods in perltoot.

What's the difference between calling a function as &foo and foo()?

(contributed by brian d foy)

Calling a subroutine as &foo with no trailing parentheses ignores the prototype of foo and passes it the current value of the argument list, @_. Here's an example; the bar subroutine calls &foo, which prints its argument list:

  1. sub bar { &foo }
  2. sub foo { print "Args in foo are: @_\n" }
  3. bar( qw( a b c ) );

When you call bar with arguments, you see that foo got the same @_ :

  1. Args in foo are: a b c

Calling the subroutine with trailing parentheses, with or without arguments, does not use the current @_ and respects the subroutine prototype. Changing the example to put parentheses after the call to foo changes the program:

  1. sub bar { &foo() }
  2. sub foo { print "Args in foo are: @_\n" }
  3. bar( qw( a b c ) );

Now the output shows that foo doesn't get the @_ from its caller.

  1. Args in foo are:

The main use of the @_ pass-through feature is to write subroutines whose main job it is to call other subroutines for you. For further details, see perlsub.
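A wrapper of that kind might look like the following sketch, where logged_work, real_work, and @log are made-up names for illustration:

```perl
use strict;
use warnings;

my @log;    # illustrative call log

sub real_work { return join '-', @_ }

# &real_work with no parentheses hands the wrapper's own @_ straight through.
sub logged_work {
    push @log, "called with " . scalar(@_) . " argument(s)";
    return &real_work;
}

print logged_work(qw( a b c )), "\n";    # a-b-c
print "$log[0]\n";                       # called with 3 argument(s)
```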

How do I create a switch or case statement?

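In Perl 5.10 and later, you can use the given-when construct described in perlsyn (as of Perl 5.18 it is marked experimental and warns in the experimental::smartmatch category):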

  1. use 5.010;
  2. given ( $string ) {
  3. when( 'Fred' ) { say "I found Fred!" }
  4. when( 'Barney' ) { say "I found Barney!" }
  5. when( /Bamm-?Bamm/ ) { say "I found Bamm-Bamm!" }
  6. default { say "I don't recognize the name!" }
  7. };

If one wants to use pure Perl and to be compatible with Perl versions prior to 5.10, the general answer is to use if-elsif-else:

  1. for ($variable_to_test) {
  2. if (/pat1/) { } # do something
  3. elsif (/pat2/) { } # do something else
  4. elsif (/pat3/) { } # do something else
  5. else { } # default
  6. }

Here's a simple example of a switch based on pattern matching, lined up in a way to make it look more like a switch statement. We'll do a multiway conditional based on the type of reference stored in $whatchamacallit:

    SWITCH: for (ref $whatchamacallit) {
        /^$/ && die "not a reference";
        /SCALAR/ && do {
            print_scalar($$whatchamacallit);
            last SWITCH;
        };
        /ARRAY/ && do {
            print_array(@$whatchamacallit);
            last SWITCH;
        };
        /HASH/ && do {
            print_hash(%$whatchamacallit);
            last SWITCH;
        };
        /CODE/ && do {
            warn "can't print function ref";
            last SWITCH;
        };
        # DEFAULT
        warn "User defined type skipped";
    }

See perlsyn for other examples in this style.

Sometimes you should change the positions of the constant and the variable. For example, let's say you wanted to test which of many answers you were given, but in a case-insensitive way that also allows abbreviations. You can use the following technique if the strings all start with different characters or if you want to arrange the matches so that one takes precedence over another, as "SEND" has precedence over "STOP" here:

  1. chomp($answer = <>);
  2. if ("SEND" =~ /^\Q$answer/i) { print "Action is send\n" }
  3. elsif ("STOP" =~ /^\Q$answer/i) { print "Action is stop\n" }
  4. elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" }
  5. elsif ("LIST" =~ /^\Q$answer/i) { print "Action is list\n" }
  6. elsif ("EDIT" =~ /^\Q$answer/i) { print "Action is edit\n" }

A totally different approach is to create a hash of function references.

    my %commands = (
        "happy" => \&joy,
        "sad"   => \&sullen,
        "done"  => sub { die "See ya!" },
        "mad"   => \&angry,
    );

    print "How are you? ";
    chomp($string = <STDIN>);
    if ($commands{$string}) {
        $commands{$string}->();
    } else {
        print "No such command: $string\n";
    }

Starting from Perl 5.8, a source filter module, Switch , can also be used to get switch and case. Its use is now discouraged, because it's not fully compatible with the native switch of Perl 5.10, and because, as it's implemented as a source filter, it doesn't always work as intended when complex syntax is involved.

How can I catch accesses to undefined variables, functions, or methods?

The AUTOLOAD method, discussed in Autoloading in perlsub and AUTOLOAD: Proxy Methods in perltoot, lets you capture calls to undefined functions and methods.
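Here is a minimal sketch of an AUTOLOAD method that catches calls to undefined methods. The class name Lazy and the method name frobnicate are invented for the example:

```perl
use strict;
use warnings;

package Lazy;    # hypothetical class for illustration

our $AUTOLOAD;   # Perl stores the fully qualified name of the missing sub here

sub new { return bless {}, shift }

sub AUTOLOAD {
    my $self = shift;
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # don't intercept object destruction
    return "caught call to undefined method '$name' with args (@_)";
}

package main;

my $obj    = Lazy->new;
my $result = $obj->frobnicate( 1, 2 );
print "$result\n";
```

The DESTROY check matters because Perl looks for that method whenever an object is destroyed, and without the check AUTOLOAD would intercept it too.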

When it comes to undefined variables that would trigger a warning under use warnings , you can promote the warning to an error.

  1. use warnings FATAL => qw(uninitialized);
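Once the warning is fatal, you can trap it like any other exception. A small sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings FATAL => qw(uninitialized);

my ( $y, $z ) = ( 1, undef );

# The uninitialized-value warning becomes an exception inside eval.
my $x = eval { $y + $z };
if ( !defined $x ) {
    print "caught: $@";
}
```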

Why can't a method included in this same file be found?

Some possible reasons: your inheritance is getting confused, you've misspelled the method name, or the object is of the wrong type. Check out perltoot for details about any of the above cases. You may also use print ref($object) to find out the class $object was blessed into.

Another possible reason for problems is that you've used the indirect object syntax (eg, find Guru "Samy" ) on a class name before Perl has seen that such a package exists. It's wisest to make sure your packages are all defined before you start using them, which will be taken care of if you use the use statement instead of require. If not, make sure to use arrow notation (eg., Guru->find("Samy") ) instead. Object notation is explained in perlobj.

Make sure to read about creating modules in perlmod and the perils of indirect objects in Method Invocation in perlobj.

How can I find out my current or calling package?

(contributed by brian d foy)

To find the package you are currently in, use the special literal __PACKAGE__ , as documented in perldata. You can only use the special literals as separate tokens, so you can't interpolate them into strings like you can with variables:

  1. my $current_package = __PACKAGE__;
  2. print "I am in package $current_package\n";

If you want to find the package calling your code, perhaps to give better diagnostics as Carp does, use the caller built-in:

    sub foo {
        my @args = ...;
        my( $package, $filename, $line ) = caller;
        print "I was called from package $package\n";
    }

By default, your program starts in package main , so you will always be in some package.

This is different from finding out the package an object is blessed into, which might not be the current package. For that, use blessed from Scalar::Util, part of the Standard Library since Perl 5.8:

  1. use Scalar::Util qw(blessed);
  2. my $object_package = blessed( $object );

Most of the time, you shouldn't care what package an object is blessed into, however, as long as it claims to inherit from that class:

  1. my $is_right_class = eval { $object->isa( $package ) }; # true or false

And, with Perl 5.10 and later, you don't have to check for an inheritance to see if the object can handle a role. For that, you can use DOES , which comes from UNIVERSAL :

  1. my $class_does_it = eval { $object->DOES( $role ) }; # true or false

You can safely replace isa with DOES (although the converse is not true).
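The following sketch puts isa and DOES side by side. The Animal and Dog packages are hypothetical, and the eval guards against calling a method on something that isn't an object at all:

```perl
use strict;
use warnings;

package Animal;
sub new { return bless {}, shift }

package Dog;
our @ISA = ('Animal');    # Dog inherits from Animal

package main;

my $object = Dog->new;

my $is_animal   = eval { $object->isa('Animal') };     # true
my $does_animal = eval { $object->DOES('Animal') };    # true, DOES falls back to isa
my $is_cat      = eval { $object->isa('Cat') };        # false

# Calling a method on undef dies, so eval returns undef instead.
my $nothing;
my $oops = eval { $nothing->isa('Animal') };

print "isa Animal:  ", ( $is_animal   ? "yes" : "no" ), "\n";    # yes
print "DOES Animal: ", ( $does_animal ? "yes" : "no" ), "\n";    # yes
print "isa Cat:     ", ( $is_cat      ? "yes" : "no" ), "\n";    # no
print "undef:       ", ( $oops        ? "yes" : "no" ), "\n";    # no
```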

How can I comment out a large block of Perl code?

(contributed by brian d foy)

The quick-and-dirty way to comment out more than one line of Perl is to surround those lines with Pod directives. You have to put these directives at the beginning of the line and somewhere where Perl expects a new statement (so not in the middle of statements like the # comments). You end the comment with =cut , ending the Pod section:

    =pod
    my $object = NotGonnaHappen->new();
    ignored_sub();
    $wont_be_assigned = 37;
    =cut

The quick-and-dirty method only works well when you don't plan to leave the commented code in the source. If a Pod parser comes along, your multiline comment is going to show up in the Pod translation. A better way hides it from Pod parsers as well.

The =begin directive can mark a section for a particular purpose. If the Pod parser doesn't want to handle it, it just ignores it. Label the comments with comment . End the comment using =end with the same label. You still need the =cut to go back to Perl code from the Pod comment:

    =begin comment
    my $object = NotGonnaHappen->new();
    ignored_sub();
    $wont_be_assigned = 37;
    =end comment
    =cut

For more information on Pod, check out perlpod and perlpodspec.

How do I clear a package?

Use this code, provided by Mark-Jason Dominus:

    sub scrub_package {
        no strict 'refs';
        my $pack = shift;
        die "Shouldn't delete main package"
            if $pack eq "" || $pack eq "main";
        my $stash = *{$pack . '::'}{HASH};
        my $name;
        foreach $name (keys %$stash) {
            my $fullname = $pack . '::' . $name;
            # Get rid of everything with that name.
            undef $$fullname;
            undef @$fullname;
            undef %$fullname;
            undef &$fullname;
            undef *$fullname;
        }
    }

Or, if you're using a recent release of Perl, you can just use the Symbol::delete_package() function instead.

How can I use a variable as a variable name?

Beginners often think they want to have a variable contain the name of a variable.

  1. $fred = 23;
  2. $varname = "fred";
  3. ++$$varname; # $fred now 24

This works sometimes, but it is a very bad idea for two reasons.

The first reason is that this technique only works on global variables. That means that if $fred is a lexical variable created with my() in the above example, the code wouldn't work at all: you'd accidentally access the global and skip right over the private lexical altogether. Global variables are bad because they can easily collide accidentally and in general make for non-scalable and confusing code.

Symbolic references are forbidden under the use strict pragma. They are not true references and consequently are not reference-counted or garbage-collected.

The other reason why using a variable to hold the name of another variable is a bad idea is that the question often stems from a lack of understanding of Perl data structures, particularly hashes. By using symbolic references, you are just using the package's symbol-table hash (like %main:: ) instead of a user-defined hash. The solution is to use your own hash or a real reference instead.

  1. $USER_VARS{"fred"} = 23;
  2. my $varname = "fred";
  3. $USER_VARS{$varname}++; # not $$varname++

There we're using the %USER_VARS hash instead of symbolic references. Sometimes this comes up in reading strings from the user with variable references and wanting to expand them to the values of your perl program's variables. This is also a bad idea because it conflates the program-addressable namespace and the user-addressable one. Instead of reading a string and expanding it to the actual contents of your program's own variables:

  1. $str = 'this has a $fred and $barney in it';
  2. $str =~ s/(\$\w+)/$1/eeg; # need double eval

it would be better to keep a hash around like %USER_VARS and have variable references actually refer to entries in that hash:

  1. $str =~ s/\$(\w+)/$USER_VARS{$1}/g; # no /e here at all

That's faster, cleaner, and safer than the previous approach. Of course, you don't need to use a dollar sign. You could use your own scheme to make it less confusing, like bracketed percent symbols, etc.

    $str = 'this has a %fred% and %barney% in it';
    $str =~ s/%(\w+)%/$USER_VARS{$1}/g;   # no /e here at all
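
One gap in the substitution above is what happens when the user names a variable that is not in the hash. The following is a minimal sketch, not part of the original answer, that leaves unknown placeholders untouched; the contents of %USER_VARS and the decision to keep unknown names verbatim are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical user-visible variables; only these can ever be expanded.
my %USER_VARS = ( fred => 23, barney => 'rubble' );

my $str = 'this has a %fred% and %wilma% in it';

# A single /e evaluates only our own lookup expression, never user text.
# Unknown names are put back unchanged instead of becoming empty strings.
$str =~ s{%(\w+)%}{ exists $USER_VARS{$1} ? $USER_VARS{$1} : "%$1%" }ge;

print "$str\n";
```

Because the lookup goes through %USER_VARS, the user cannot reach any of the program's own variables, whatever they type.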

Another reason that folks sometimes think they want a variable to contain the name of a variable is that they don't know how to build proper data structures using hashes. For example, let's say they wanted two hashes in their program: %fred and %barney, and that they wanted to use another scalar variable to refer to those by name.

    $name = "fred";
    $$name{WIFE} = "wilma";     # set %fred
    $name = "barney";
    $$name{WIFE} = "betty";     # set %barney

This is still a symbolic reference, and is still saddled with the problems enumerated above. It would be far better to write:

    $folks{"fred"}{WIFE}   = "wilma";
    $folks{"barney"}{WIFE} = "betty";

And just use a multilevel hash to start with.
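
If the two hashes must stay separate variables, real (hard) references give the same by-name lookup and still work under strict. This is a small sketch under that assumption; the %by_name dispatch hash is a name invented for this example.

```perl
use strict;
use warnings;

my ( %fred, %barney );

# Map the names to hard references once, instead of resolving
# strings through the symbol table at run time.
my %by_name = ( fred => \%fred, barney => \%barney );

my $ref = $by_name{fred};
$ref->{WIFE} = 'wilma';               # sets $fred{WIFE}
$by_name{barney}{WIFE} = 'betty';     # sets $barney{WIFE}

print "$fred{WIFE} $barney{WIFE}\n";
```

Unlike the symbolic version, this also works when %fred and %barney are lexicals declared with my().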

The only times that you absolutely must use symbolic references are when you really must refer to the symbol table. This may be because it's something that one can't take a real reference to, such as a format name. Doing so may also be important for method calls, since these always go through the symbol table for resolution.

In those cases, you would turn off strict 'refs' temporarily so you can play around with the symbol table. For example:

    @colors = qw(red blue green yellow orange purple violet);
    for my $name (@colors) {
        no strict 'refs';   # renege for the block
        *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
    }

All those functions (red(), blue(), green(), etc.) appear to be separate, but the real code in the closure actually was compiled only once.

So, sometimes you might want to use symbolic references to manipulate the symbol table directly. This doesn't matter for formats, handles, and subroutines, because they are always global--you can't use my() on them. For scalars, arrays, and hashes, though--and usually for subroutines-- you probably only want to use hard references.

What does "bad interpreter" mean?

(contributed by brian d foy)

The "bad interpreter" message comes from the shell, not perl. The actual message may vary depending on your platform, shell, and locale settings.

If you see "bad interpreter - no such file or directory", the first line in your perl script (the "shebang" line) does not contain the right path to perl (or any other program capable of running scripts). Sometimes this happens when you move the script from one machine to another and each machine has a different path to perl--/usr/bin/perl versus /usr/local/bin/perl for instance. It may also indicate that the source machine has CRLF line terminators and the destination machine has LF only: the shell tries to find /usr/bin/perl<CR>, but can't.

If you see "bad interpreter: Permission denied", you need to make your script executable.

In either case, you should still be able to run the scripts with perl explicitly:

    % perl script.pl

If you get a message like "perl: command not found", perl is not in your PATH, which might also mean that the location of perl is not where you expect it so you need to adjust your shebang line.

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.

perlfaq8

Perl 5 version 18.2 documentation

NAME

perlfaq8 - System Interaction

DESCRIPTION

This section of the Perl FAQ covers questions involving operating system interaction. Topics include interprocess communication (IPC), control over the user-interface (keyboard, screen and pointing devices), and most anything else not related to data manipulation.

Read the FAQs and documentation specific to the port of perl to your operating system (eg, perlvms, perlplan9, ...). These should contain more detailed information on the vagaries of your perl.

How do I find out which operating system I'm running under?

The $^O variable ($OSNAME if you use English ) contains an indication of the name of the operating system (not its release number) that your perl binary was built for.
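
As an illustration, a program might branch on $^O like this. It is only a sketch: the table lists a few common values ("MSWin32", "darwin", "linux") and is not exhaustive, and the %family name is made up for the example.

```perl
use strict;
use warnings;

# A few well-known $^O values; anything else falls through to the default.
my %family = (
    MSWin32 => 'Windows',
    darwin  => 'macOS',
    linux   => 'Linux',
);

my $label = exists $family{$^O} ? $family{$^O} : "other ($^O)";
print "perl was built for: $label\n";
```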

How come exec() doesn't return?

(contributed by brian d foy)

The exec function's job is to turn your process into another command and never to return. If that's not what you want to do, don't use exec. :)

If you want to run an external command and still keep your Perl process going, look at a piped open, fork, or system.
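
The fork route can be sketched as follows, assuming a system with a working fork(). The child is replaced by the new command through exec, while the parent carries on and later collects the exit status. The child command here is just another perl (found through $^X), chosen so the example needs no external program.

```perl
use strict;
use warnings;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child: exec replaces this process and returns only on failure.
    exec( $^X, '-e', 'exit 0' ) or die "exec failed: $!";
}

# Parent: still running, free to do other work before waiting.
waitpid( $pid, 0 );
my $child_exit = $? >> 8;
print "parent still alive; child exited with $child_exit\n";
```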

How do I do fancy stuff with the keyboard/screen/mouse?

How you access/control keyboards, screens, and pointing devices ("mice") is system-dependent. Try the following modules:

  • Keyboard
        Term::Cap              Standard perl distribution
        Term::ReadKey          CPAN
        Term::ReadLine::Gnu    CPAN
        Term::ReadLine::Perl   CPAN
        Term::Screen           CPAN
  • Screen
        Term::Cap              Standard perl distribution
        Curses                 CPAN
        Term::ANSIColor        CPAN
  • Mouse
        Tk                     CPAN
        Wx                     CPAN
        Gtk2                   CPAN
        Qt4                    kdebindings4 package

Some of these specific cases are shown as examples in other answers in this section of the perlfaq.

How do I print something out in color?

In general, you don't, because you don't know whether the recipient has a color-aware display device. If you know that they have an ANSI terminal that understands color, you can use the Term::ANSIColor module from CPAN:

    use Term::ANSIColor;
    print color("red"), "Stop!\n", color("reset");
    print color("green"), "Go!\n", color("reset");

Or like this:

    use Term::ANSIColor qw(:constants);
    print RED, "Stop!\n", RESET;
    print GREEN, "Go!\n", RESET;

How do I read just one key without waiting for a return key?

Controlling input buffering is a remarkably system-dependent matter. On many systems, you can just use the stty command as shown in getc, but as you see, that's already getting you into portability snags.

    open(TTY, "+</dev/tty") or die "no tty: $!";
    system "stty cbreak </dev/tty >/dev/tty 2>&1";
    $key = getc(TTY);        # perhaps this works
    # OR ELSE
    sysread(TTY, $key, 1);   # probably this does
    system "stty -cbreak </dev/tty >/dev/tty 2>&1";

The Term::ReadKey module from CPAN offers an easy-to-use interface that should be more efficient than shelling out to stty for each key. It even includes limited support for Windows.

    use Term::ReadKey;
    ReadMode('cbreak');
    $key = ReadKey(0);
    ReadMode('normal');

However, using the code requires that you have a working C compiler and can use it to build and install a CPAN module. Here's a solution using the standard POSIX module, which is already on your system (assuming your system supports POSIX).

    use HotKey;
    $key = readkey();

And here's the HotKey module, which hides the somewhat mystifying calls to manipulate the POSIX termios structures.

    # HotKey.pm
    package HotKey;

    use strict;
    use warnings;

    use parent 'Exporter';
    our @EXPORT = qw(cbreak cooked readkey);

    use POSIX qw(:termios_h);
    my ($term, $oterm, $echo, $noecho, $fd_stdin);

    $fd_stdin = fileno(STDIN);
    $term     = POSIX::Termios->new();
    $term->getattr($fd_stdin);
    $oterm    = $term->getlflag();

    $echo     = ECHO | ECHOK | ICANON;
    $noecho   = $oterm & ~$echo;

    sub cbreak {
        $term->setlflag($noecho);  # ok, so i don't want echo either
        $term->setcc(VTIME, 1);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub cooked {
        $term->setlflag($oterm);
        $term->setcc(VTIME, 0);
        $term->setattr($fd_stdin, TCSANOW);
    }

    sub readkey {
        my $key = '';
        cbreak();
        sysread(STDIN, $key, 1);
        cooked();
        return $key;
    }

    END { cooked() }

    1;

How do I check whether input is ready on the keyboard?

The easiest way to do this is to read a key in nonblocking mode with the Term::ReadKey module from CPAN, passing it an argument of -1 to indicate not to block:

    use Term::ReadKey;

    ReadMode('cbreak');

    if (defined (my $char = ReadKey(-1)) ) {
        # input was waiting and it was $char
    } else {
        # no input was waiting
    }

    ReadMode('normal');                  # restore normal tty settings

How do I clear the screen?

(contributed by brian d foy)

To clear the screen, you just have to print the special sequence that tells the terminal to clear the screen. Once you have that sequence, output it when you want to clear the screen.

You can use the Term::ANSIScreen module to get the special sequence. Import the cls function (or the :screen tag):

    use Term::ANSIScreen qw(cls);
    my $clear_screen = cls();

    print $clear_screen;

The Term::Cap module can also get the special sequence if you want to deal with the low-level details of terminal control. The Tputs method returns the string for the given capability:

    use Term::Cap;

    my $terminal = Term::Cap->Tgetent( { OSPEED => 9600 } );
    my $clear_string = $terminal->Tputs('cl');

    print $clear_string;

On Windows, you can use the Win32::Console module. After creating an object for the output filehandle you want to affect, call the Cls method:

    use Win32::Console;

    my $OUT = Win32::Console->new(STD_OUTPUT_HANDLE);
    $OUT->Cls;    # clears the console directly; there is no string to print

If you have a command-line program that does the job, you can call it in backticks to capture whatever it outputs so you can use it later:

    my $clear_string = `clear`;

    print $clear_string;

How do I get the screen size?

If you have the Term::ReadKey module installed from CPAN, you can use it to fetch the width and height in characters and in pixels:

    use Term::ReadKey;
    my ($wchar, $hchar, $wpixels, $hpixels) = GetTerminalSize();

This is more portable than the raw ioctl, but not as illustrative:

    require 'sys/ioctl.ph';
    die "no TIOCGWINSZ " unless defined &TIOCGWINSZ;
    open(my $tty_fh, "+</dev/tty") or die "No tty: $!";
    unless (ioctl($tty_fh, &TIOCGWINSZ, $winsize='')) {
        die sprintf "$0: ioctl TIOCGWINSZ (%08x: $!)\n", &TIOCGWINSZ;
    }
    my ($row, $col, $xpixel, $ypixel) = unpack('S4', $winsize);
    print "(row,col) = ($row,$col)";
    print "  (xpixel,ypixel) = ($xpixel,$ypixel)" if $xpixel || $ypixel;
    print "\n";

How do I ask the user for a password?

(This question has nothing to do with the web. See a different FAQ for that.)

There's an example of this in crypt (see perlfunc). First, you put the terminal into "no echo" mode, then just read the password normally. You may do this with an old-style ioctl() function, POSIX terminal control (see POSIX or its documentation in the Camel Book), or a call to the stty program, with varying degrees of portability.

You can also do this for most systems using the Term::ReadKey module from CPAN, which is easier to use and in theory more portable.

    use Term::ReadKey;

    ReadMode('noecho');
    my $password = ReadLine(0);
    ReadMode('restore');    # turn echo back on afterwards

How do I read and write the serial port?

This depends on which operating system your program is running on. In the case of Unix, the serial ports will be accessible through files in /dev ; on other systems, device names will doubtless differ. Several problem areas common to all device interaction are the following:

  • lockfiles

    Your system may use lockfiles to control multiple access. Make sure you follow the correct protocol. Unpredictable behavior can result from multiple processes reading from one device.

  • open mode

    If you expect to use both read and write operations on the device, you'll have to open it for update (see open for details). You may wish to open it without running the risk of blocking by using sysopen() and O_RDWR|O_NDELAY|O_NOCTTY from the Fcntl module (part of the standard perl distribution). See sysopen for more on this approach.

  • end of line

    Some devices will be expecting a "\r" at the end of each line rather than a "\n". In some ports of perl, "\r" and "\n" are different from their usual (Unix) ASCII values of "\015" and "\012". You may have to give the numeric values you want directly, using octal ("\015"), hex ("\x0D"), or as a control-character specification ("\cM").

        print DEV "atv1\012";    # wrong, for some devices
        print DEV "atv1\015";    # right, for some devices

    Even though with normal text files a "\n" will do the trick, there is still no unified scheme for terminating a line that is portable between Unix, DOS/Win, and Macintosh, except to terminate ALL line ends with "\015\012", and strip what you don't need from the output. This applies especially to socket I/O and autoflushing, discussed next.

  • flushing output

    If you expect characters to get to your device when you print() them, you'll want to autoflush that filehandle. You can use select() and the $| variable to control autoflushing (see $| in perlvar and select in perlfunc, or perlfaq5, "How do I flush/unbuffer an output filehandle? Why must I do this?"):

        my $old_handle = select($dev_fh);
        $| = 1;
        select($old_handle);

    You'll also see code that does this without a temporary variable, as in

        select((select($dev_fh), $| = 1)[0]);

    Or if you don't mind pulling in a few thousand lines of code just because you're afraid of a little $| variable:

        use IO::Handle;
        $dev_fh->autoflush(1);

    As mentioned in the previous item, this still doesn't work when using socket I/O between Unix and Macintosh. You'll need to hard code your line terminators, in that case.

  • non-blocking input

    If you are doing a blocking read() or sysread(), you'll have to arrange for an alarm handler to provide a timeout (see alarm). If you have a non-blocking open, you'll likely have a non-blocking read, which means you may have to use a 4-arg select() to determine whether I/O is ready on that device (see select).

While trying to read from his caller-id box, the notorious Jamie Zawinski <jwz@netscape.com> , after much gnashing of teeth and fighting with sysread, sysopen, POSIX's tcgetattr business, and various other functions that go bump in the night, finally came up with this:

    sub open_modem {
        use IPC::Open2;
        my $stty = `/bin/stty -g`;
        open2( \*MODEM_IN, \*MODEM_OUT, "cu -l$modem_device -s2400 2>&1");
        # starting cu hoses /dev/tty's stty settings, even when it has
        # been opened on a pipe...
        system("/bin/stty $stty");
        $_ = <MODEM_IN>;
        chomp;
        if ( !m/^Connected/ ) {
            print STDERR "$0: cu printed `$_' instead of `Connected'\n";
        }
    }

How do I decode encrypted password files?

You spend lots and lots of money on dedicated hardware, but this is bound to get you talked about.

Seriously, you can't if they are Unix password files--the Unix password system employs one-way encryption. It's more like hashing than encryption. The best you can do is check whether something else hashes to the same string. You can't turn a hash back into the original string. Programs like Crack can forcibly (and intelligently) try to guess passwords, but don't (can't) guarantee quick success.

If you're worried about users selecting bad passwords, you should proactively check when they try to change their password (by modifying passwd(1), for example).

How do I start a process in the background?

(contributed by brian d foy)

There's not a single way to run code in the background so you don't have to wait for it to finish before your program moves on to other tasks. Process management depends on your particular operating system, and many of the techniques are covered in perlipc.

Several CPAN modules may be able to help, including IPC::Open2 or IPC::Open3, IPC::Run, Parallel::Jobs, Parallel::ForkManager, POE, Proc::Background, and Win32::Process. There are many other modules you might use, so check those namespaces for other options too.

If you are on a Unix-like system, you might be able to get away with a system call where you put an & on the end of the command:

    system("cmd &");

You can also try using fork, as described in perlfunc (although this is the same thing that many of the modules will do for you).

  • STDIN, STDOUT, and STDERR are shared

    Both the main process and the backgrounded one (the "child" process) share the same STDIN, STDOUT and STDERR filehandles. If both try to access them at once, strange things can happen. You may want to close or reopen these for the child. You can get around this with opening a pipe (see open) but on some systems this means that the child process cannot outlive the parent.

  • Signals

    You'll have to catch the SIGCHLD signal, and possibly SIGPIPE too. SIGCHLD is sent when the backgrounded process finishes. SIGPIPE is sent when you write to a filehandle whose child process has closed (an untrapped SIGPIPE can cause your program to silently die). This is not an issue with system("cmd&").

  • Zombies

    You have to be prepared to "reap" the child process when it finishes.

        $SIG{CHLD} = sub { wait };    # reap each child as it exits

        # or, on systems that support it, have children reaped automatically
        $SIG{CHLD} = 'IGNORE';

    You can also use a double fork. You immediately wait() for your first child, and the init daemon will wait() for your grandchild once it exits.

        unless ($pid = fork) {
            unless (fork) {
                exec "what you really wanna do";
                die "exec failed!";
            }
            exit 0;
        }
        waitpid($pid, 0);

    See Signals in perlipc for other examples of code to do this. Zombies are not an issue with system("prog &") .
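
When several children may finish close together, a single wait in the handler can miss some of them, because pending signals of the same kind are not queued. A common pattern, sketched here under the assumption of a Unix-like system with fork() and WNOHANG from the POSIX module, is to loop over waitpid until nothing is left to reap. The %exit_code hash and the three short-lived children are invented for the demonstration.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my %exit_code;    # pid => exit value, filled in by the handler

$SIG{CHLD} = sub {
    local ( $!, $? );    # don't clobber the main program's status variables
    # Several children may have exited before we got here, so loop.
    while ( ( my $kid = waitpid( -1, WNOHANG ) ) > 0 ) {
        $exit_code{$kid} = $? >> 8;
    }
};

my @kids;
for my $n ( 1 .. 3 ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit($n) if $pid == 0;    # child: finish at once with status $n
    push @kids, $pid;
}

# Parent: would normally do real work here; we just poll for up to ~5s.
my $tries = 50;
while ( scalar( keys %exit_code ) < scalar(@kids) && $tries-- > 0 ) {
    select( undef, undef, undef, 0.1 );
}

printf "reaped %d children\n", scalar keys %exit_code;
```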

How do I trap control characters/signals?

You don't actually "trap" a control character. Instead, that character generates a signal which is sent to your terminal's currently foregrounded process group, which you then trap in your process. Signals are documented in Signals in perlipc and the section on "Signals" in the Camel.

You can set the values of the %SIG hash to be the functions you want to handle the signal. After perl catches the signal, it looks in %SIG for a key with the same name as the signal, then calls the subroutine value for that key.

    # as an anonymous subroutine

    $SIG{INT} = sub { syswrite(STDERR, "ouch\n", 5 ) };

    # or a reference to a function

    $SIG{INT} = \&ouch;

    # or the name of the function as a string

    $SIG{INT} = "ouch";

Perl versions before 5.8 had signal handlers in their C source code which would catch the signal and possibly run a Perl function that you had set in %SIG . This violated the rules of signal handling at that level, causing perl to dump core. Since version 5.8.0, perl looks at %SIG after the signal has been caught, rather than while it is being caught. Previous versions of this answer were incorrect.
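
A handler that does as little as possible, such as setting a flag for the main program to inspect, is a common way to use %SIG. The sketch below assumes a Unix-like system; it sends SIGINT to itself with kill so that it can run without anyone pressing Control-C, and the $interrupts counter is a name made up for the example.

```perl
use strict;
use warnings;

my $interrupts = 0;

# Keep the handler tiny: record the event and return.
$SIG{INT} = sub { $interrupts++ };

# Stand-in for the user pressing Control-C at the terminal.
kill 'INT', $$;

print "caught $interrupts interrupt(s)\n";
```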

How do I modify the shadow password file on a Unix system?

If perl was installed correctly and your shadow library was written properly, the getpw*() functions described in perlfunc should in theory provide (read-only) access to entries in the shadow password file. To change the file, make a new shadow password file (the format varies from system to system--see passwd(1) for specifics) and use pwd_mkdb(8) to install it (see pwd_mkdb(8) for more details).

How do I set the time and date?

Assuming you're running under sufficient permissions, you should be able to set the system-wide date and time by running the date(1) program. (There is no way to set the time and date on a per-process basis.) This mechanism will work for Unix, MS-DOS, Windows, and NT; the VMS equivalent is set time .

However, if all you want to do is change your time zone, you can probably get away with setting an environment variable:

    $ENV{TZ} = "MST7MDT";                          # Unixish
    $ENV{'SYS$TIMEZONE_DIFFERENTIAL'} = "-5";      # vms
    system('trn', 'comp.lang.perl.misc');

How can I sleep() or alarm() for under a second?

If you want finer granularity than the 1 second that the sleep() function provides, the easiest way is to use the select() function as documented in select. Try the Time::HiRes and the BSD::Itimer modules (available from CPAN, and starting from Perl 5.8 Time::HiRes is part of the standard distribution).
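
Both approaches can be sketched in a few lines. The quarter-second delays are arbitrary values picked for the example, and Time::HiRes is assumed to be available (it ships with Perl 5.8 and later).

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);    # fractional-second sleep() and time()

my $start = time();

# Four-argument select() with no filehandles just waits for the timeout.
select( undef, undef, undef, 0.25 );

# Time::HiRes replaces sleep() with one that accepts fractions.
sleep(0.25);

my $elapsed = time() - $start;
printf "paused for about %.2f seconds\n", $elapsed;
```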

How can I measure time under a second?

(contributed by brian d foy)

The Time::HiRes module (part of the standard distribution as of Perl 5.8) measures time with the gettimeofday() system call, which returns the time since the epoch as seconds and microseconds. If you can't install Time::HiRes for older Perls and you are on a Unixish system, you may be able to call gettimeofday(2) directly. See syscall.
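
A minimal timing sketch with Time::HiRes might look like this. The loop being timed is an arbitrary stand-in for real work; gettimeofday and tv_interval are the module's own functions.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

my $t0 = [gettimeofday];    # [seconds, microseconds] at the start

# Some work to measure: add up the integers from 1 to 100_000.
my $sum = 0;
$sum += $_ for 1 .. 100_000;

my $elapsed = tv_interval($t0);    # floating-point seconds since $t0
printf "summing took %.6f seconds\n", $elapsed;
```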

How can I do an atexit() or setjmp()/longjmp()? (Exception handling)

You can use the END block to simulate atexit() . Each package's END block is called when the program or thread ends. See the perlmod manpage for more details about END blocks.

For example, you can use this to make sure your filter program managed to finish its output without filling up the disk:

    END {
        close(STDOUT) || die "stdout close failed: $!";
    }

The END block isn't called when untrapped signals kill the program, though, so if you use END blocks you should also use

    use sigtrap qw(die normal-signals);

Perl's exception-handling mechanism is its eval() operator. You can use eval() as setjmp and die() as longjmp . For details of this, see the section on signals, especially the time-out handler for a blocking flock() in Signals in perlipc or the section on "Signals" in Programming Perl.
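
In its simplest form the pairing looks like the sketch below, where die() jumps out of a nested call and eval() is the point execution resumes from. The parse_port() helper is hypothetical, written only to have something that can fail.

```perl
use strict;
use warnings;

# Hypothetical helper that throws on bad input.
sub parse_port {
    my ($text) = @_;
    die "not a port number: $text\n"
        unless $text =~ /\A\d+\z/ && $text >= 1 && $text <= 65535;
    return $text + 0;
}

my $port;

# eval returns the block's last value (1) on success, undef after a die.
my $ok = eval { $port = parse_port('http'); 1 };
my $error = $ok ? '' : $@;

print "recovered from: $error" unless $ok;
```

Testing the return value of eval, rather than $@ alone, avoids being fooled when something else resets $@ on the way out.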

If exception handling is all you're interested in, use one of the many CPAN modules that handle exceptions, such as Try::Tiny.

If you want the atexit() syntax (and an rmexit() as well), try the AtExit module available from CPAN.

Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean?

Some Sys-V based systems, notably Solaris 2.X, redefined some of the standard socket constants. Since these were constant across all architectures, they were often hardwired into perl code. The proper way to deal with this is to "use Socket" to get the correct values.

Note that even though SunOS and Solaris are binary compatible, these values are different. Go figure.

How can I call my system's unique C functions from Perl?

In most cases, you write an external module to do it--see the answer to "Where can I learn about linking C with Perl? [h2xs, xsubpp]". However, if the function is a system call, and your system supports syscall(), you can use the syscall function (documented in perlfunc).

Remember to check the modules that came with your distribution, and CPAN as well--someone may already have written a module to do it. On Windows, try Win32::API. On Macs, try Mac::Carbon. If no module has an interface to the C function, you can inline a bit of C in your Perl source with Inline::C.

Where do I get the include files to do ioctl() or syscall()?

Historically, these would be generated by the h2ph tool, part of the standard perl distribution. This program converts cpp(1) directives in C header files to files containing subroutine definitions, like SYS_getitimer() , which you can use as arguments to your functions. It doesn't work perfectly, but it usually gets most of the job done. Simple files like errno.h, syscall.h, and socket.h were fine, but the hard ones like ioctl.h nearly always need to be hand-edited. Here's how to install the *.ph files:

    1. Become the super-user
    2. cd /usr/include
    3. h2ph *.h */*.h

If your system supports dynamic loading, for reasons of portability and sanity you probably ought to use h2xs (also part of the standard perl distribution). This tool converts C header files to Perl extensions. See perlxstut for how to get started with h2xs.

If your system doesn't support dynamic loading, you still probably ought to use h2xs. See perlxstut and ExtUtils::MakeMaker for more information (in brief, just use make perl instead of a plain make to rebuild perl with a new static extension).

Why do setuid perl scripts complain about kernel problems?

Some operating systems have bugs in the kernel that make setuid scripts inherently insecure. Perl gives you a number of options (described in perlsec) to work around such systems.

How can I open a pipe both to and from a command?

The IPC::Open2 module (part of the standard perl distribution) is an easy-to-use approach that internally uses pipe(), fork(), and exec() to do the job. Make sure you read the deadlock warnings in its documentation, though (see IPC::Open2). See also Bidirectional Communication with Another Process in perlipc and Bidirectional Communication with Yourself in perlipc.

You may also use the IPC::Open3 module (part of the standard perl distribution), but be warned that it has a different order of arguments from IPC::Open2 (see IPC::Open3).

Why can't I get the output of a command with system()?

You're confusing the purpose of system() and backticks (``). system() runs a command and returns exit status information (as a 16 bit value: the low 7 bits are the signal the process died from, if any, and the high 8 bits are the actual exit value). Backticks (``) run a command and return what it sent to STDOUT.

    my $exit_status   = system("mail-users");
    my $output_string = `ls`;
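
The packed status value can be taken apart with a few bit operations. In this sketch the child is another perl (located through $^X) that exits with status 3, an arbitrary value chosen so there is something to decode.

```perl
use strict;
use warnings;

my $status = system( $^X, '-e', 'exit 3' );
die "could not start child: $!" if $status == -1;

my $signal = $status & 127;    # low 7 bits: signal that killed it, if any
my $dumped = $status & 128;    # next bit: whether it dumped core
my $exit   = $status >> 8;     # high 8 bits: the exit value itself

printf "signal=%d core=%s exit=%d\n",
    $signal, ( $dumped ? 'yes' : 'no' ), $exit;
```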

How can I capture STDERR from an external command?

There are three basic ways of running external commands:

    system $cmd;                     # using system()
    my $output = `$cmd`;             # using backticks (``)
    open (my $pipe_fh, "$cmd |");    # using open()

With system(), both STDOUT and STDERR will go the same place as the script's STDOUT and STDERR, unless the system() command redirects them. Backticks and open() read only the STDOUT of your command.

You can also use the open3() function from IPC::Open3. Benjamin Goldberg provides some sample code:

To capture a program's STDOUT, but discard its STDERR:

    use IPC::Open3;
    use File::Spec;
    use Symbol qw(gensym);
    open(NULL, ">", File::Spec->devnull);
    my $pid = open3(gensym, \*PH, ">&NULL", "cmd");
    while( <PH> ) { }
    waitpid($pid, 0);

To capture a program's STDERR, but discard its STDOUT:

    use IPC::Open3;
    use File::Spec;
    use Symbol qw(gensym);
    open(NULL, ">", File::Spec->devnull);
    my $pid = open3(gensym, ">&NULL", \*PH, "cmd");
    while( <PH> ) { }
    waitpid($pid, 0);

To capture a program's STDERR, and let its STDOUT go to our own STDERR:

    use IPC::Open3;
    use Symbol qw(gensym);
    my $pid = open3(gensym, ">&STDERR", \*PH, "cmd");
    while( <PH> ) { }
    waitpid($pid, 0);

To read both a command's STDOUT and its STDERR separately, you can redirect them to temp files, let the command run, then read the temp files:

    use IPC::Open3;
    use Symbol qw(gensym);
    use IO::File;
    local *CATCHOUT = IO::File->new_tmpfile;
    local *CATCHERR = IO::File->new_tmpfile;
    my $pid = open3(gensym, ">&CATCHOUT", ">&CATCHERR", "cmd");
    waitpid($pid, 0);
    seek $_, 0, 0 for \*CATCHOUT, \*CATCHERR;
    while( <CATCHOUT> ) {}
    while( <CATCHERR> ) {}

But there's no real need for both to be tempfiles... the following should work just as well, without deadlocking:

    use IPC::Open3;
    use Symbol qw(gensym);
    use IO::File;
    local *CATCHERR = IO::File->new_tmpfile;
    my $pid = open3(gensym, \*CATCHOUT, ">&CATCHERR", "cmd");
    while( <CATCHOUT> ) {}
    waitpid($pid, 0);
    seek CATCHERR, 0, 0;
    while( <CATCHERR> ) {}

And it'll be faster, too, since we can begin processing the program's stdout immediately, rather than waiting for the program to finish.

With any of these, you can change file descriptors before the call:

    open(STDOUT, ">logfile");
    system("ls");

or you can use Bourne shell file-descriptor redirection:

    $output = `$cmd 2>some_file`;
    open (PIPE, "cmd 2>some_file |");

You can also use file-descriptor redirection to make STDERR a duplicate of STDOUT:

    $output = `$cmd 2>&1`;
    open (PIPE, "cmd 2>&1 |");

Note that you cannot simply open STDERR to be a dup of STDOUT in your Perl program and avoid calling the shell to do the redirection. This doesn't work:

    open(STDERR, ">&STDOUT");
    $alloutput = `cmd args`;  # stderr still escapes

This fails because the open() makes STDERR go to where STDOUT was going at the time of the open(). The backticks then make STDOUT go to a string, but don't change STDERR (which still goes to the old STDOUT).

Note that you must use Bourne shell (sh(1) ) redirection syntax in backticks, not csh(1) ! Details on why Perl's system() and backtick and pipe opens all use the Bourne shell are in the versus/csh.whynot article in the "Far More Than You Ever Wanted To Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz . To capture a command's STDERR and STDOUT together:

    $output = `cmd 2>&1`;                       # either with backticks
    $pid = open(PH, "cmd 2>&1 |");              # or with an open pipe
    while (<PH>) { }                            #    plus a read

To capture a command's STDOUT but discard its STDERR:

    $output = `cmd 2>/dev/null`;                # either with backticks
    $pid = open(PH, "cmd 2>/dev/null |");       # or with an open pipe
    while (<PH>) { }                            #    plus a read

To capture a command's STDERR but discard its STDOUT:

    $output = `cmd 2>&1 1>/dev/null`;           # either with backticks
    $pid = open(PH, "cmd 2>&1 1>/dev/null |");  # or with an open pipe
    while (<PH>) { }                            #    plus a read

To exchange a command's STDOUT and STDERR in order to capture the STDERR but leave its STDOUT to come out our old STDERR:

    $output = `cmd 3>&1 1>&2 2>&3 3>&-`;        # either with backticks
    $pid = open(PH, "cmd 3>&1 1>&2 2>&3 3>&-|");# or with an open pipe
    while (<PH>) { }                            #    plus a read

To read both a command's STDOUT and its STDERR separately, it's easiest to redirect them separately to files, and then read from those files when the program is done:

    system("program args 1>program.stdout 2>program.stderr");

Ordering is important in all these examples. That's because the shell processes file descriptor redirections in strictly left to right order.

    system("prog args 1>tmpfile 2>&1");
    system("prog args 2>&1 1>tmpfile");

The first command sends both standard out and standard error to the temporary file. The second command sends only the old standard output there, and the old standard error shows up on the old standard out.

Why doesn't open() return an error when a pipe open fails?

If the second argument to a piped open() contains shell metacharacters, perl fork()s, then exec()s a shell to decode the metacharacters and eventually run the desired program. If the program couldn't be run, it's the shell that gets the message, not Perl. All your Perl program can find out is whether the shell itself could be successfully started. You can still capture the shell's STDERR and check it for error messages. See How can I capture STDERR from an external command? elsewhere in this document, or use the IPC::Open3 module.

If there are no shell metacharacters in the argument of open(), Perl runs the command directly, without using the shell, and can correctly report whether the command started.
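
The list form of a piped open never involves a shell, so it shows the second case clearly. The sketch below assumes a Unix-like system with a real fork(); the program path is deliberately nonexistent and purely hypothetical, and the exec warning is silenced so the child does not also complain on STDERR.

```perl
use strict;
use warnings;
no warnings 'exec';    # the failed exec in the child would otherwise warn

# Hypothetical path that should not exist on any system.
my $missing = '/no/such/dir/hypothetical-program';

# List form: perl runs the program itself, with no shell in between.
my $pid = open( my $fh, '-|', $missing, '--version' );
my $reason = $pid ? '' : "$!";

if ($pid) {
    close $fh;
}
else {
    print "open failed: $reason\n";
}
```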

What's wrong with using backticks in a void context?

Strictly speaking, nothing. Stylistically speaking, it's not a good way to write maintainable code. Perl has several operators for running external commands. Backticks are one; they collect the output from the command for use in your program. The system function is another; it doesn't do this.

Writing backticks in your program sends a clear message to the readers of your code that you wanted to collect the output of the command. Why send a clear message that isn't true?

Consider this line:

  1. `cat /etc/termcap`;

You forgot to check $? to see whether the program even ran correctly. Even if you wrote

  1. print `cat /etc/termcap`;

this code could and probably should be written as

  1. system("cat /etc/termcap") == 0
  2. or die "cat program failed!";

which will echo the cat command's output as it is generated, instead of waiting until the program has completed to print it out. It also checks the return value.

system also provides direct control over whether shell wildcard processing may take place, whereas backticks do not.
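A minimal sketch of both points: the list form of system() bypasses the shell (so no wildcard or metacharacter processing happens), and $? can be decoded afterwards to see how the command ended. The child here is the running perl ($^X) with a one-line program, used only so the example does not depend on an external command.

```perl
use strict;
use warnings;

# Run a child perl that exits with status 3, without any shell involved.
my @cmd = ($^X, '-e', 'exit 3');
system(@cmd);

my $report;
if ($? == -1) {
    $report = "failed to execute: $!";
} elsif ($? & 127) {
    $report = sprintf "died with signal %d", $? & 127;
} else {
    $report = sprintf "exited with value %d", $? >> 8;
}
print "$report\n";   # prints "exited with value 3"
```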

How can I call backticks without shell processing?

This is a bit tricky. You can't simply write the command like this:

  1. @ok = `grep @opts '$search_string' @filenames`;

As of Perl 5.8.0, you can use open() with multiple arguments. Just like the list forms of system() and exec(), no shell escapes happen.

  1. open( GREP, "-|", 'grep', @opts, $search_string, @filenames );
  2. chomp(@ok = <GREP>);
  3. close GREP;

You can also:

    my @ok = ();
    if (open(GREP, "-|")) {
        while (<GREP>) {
            chomp;
            push(@ok, $_);
        }
        close GREP;
    } else {
        exec 'grep', @opts, $search_string, @filenames;
    }

Just as with system(), no shell escapes happen when you exec() a list. Further examples of this can be found in Safe Pipe Opens in perlipc.

Note that if you're using Windows, no solution to this vexing issue is even possible. Even though Perl emulates fork(), you'll still be stuck, because Windows does not have an argc/argv-style API.

Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)?

This happens only if your perl is compiled to use stdio instead of perlio; perlio is the default. Some (maybe all?) stdios set error and eof flags that you may need to clear. The IO::Handle module provides a clearerr() method that you can use; that is the technically correct way to do it. Here are some less reliable workarounds:

1. Try keeping around the seekpointer and go there, like this:

    my $where = tell($log_fh);
    seek($log_fh, $where, 0);

2. If that doesn't work, try seeking to a different part of the file and then back.

3. If that doesn't work, try seeking to a different part of the file, reading something, and then seeking back.

4. If that doesn't work, give up on your stdio package and use sysread.
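For comparison with the workarounds, clearing the flags directly might look like the sketch below. It uses the clearerr method from IO::Handle; the helper name drain_and_reset is invented for this example.

```perl
use strict;
use warnings;
use IO::Handle;   # provides the clearerr() method on filehandles

# Read a handle to end-of-file, then clear its error/eof flags so that a
# later read can pick up anything that arrives afterwards.
sub drain_and_reset {
    my ($fh) = @_;
    my @lines = <$fh>;
    my $rc = $fh->clearerr;   # 0 on success, -1 if the handle is invalid
    return ($rc, @lines);
}

# Typical use on STDIN:
#   my ($rc, @first_batch) = drain_and_reset(\*STDIN);
```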

How can I convert my shell script to perl?

Learn Perl and rewrite it. Seriously, there's no simple converter. Things that are awkward to do in the shell are easy to do in Perl, and this very awkwardness is what would make a shell->perl converter nigh-on impossible to write. By rewriting it, you'll think about what you're really trying to do, and hopefully will escape the shell's pipeline datastream paradigm, which while convenient for some matters, causes many inefficiencies.

Can I use perl to run a telnet or ftp session?

Try the Net::FTP, TCP::Client, and Net::Telnet modules (available from CPAN). http://www.cpan.org/scripts/netstuff/telnet.emul.shar will also help for emulating the telnet protocol, but Net::Telnet is quite probably easier to use.

If all you want to do is pretend to be telnet but don't need the initial telnet handshaking, then the standard dual-process approach will suffice:

    use IO::Socket;                 # new in 5.004
    my $handle = IO::Socket::INET->new('www.perl.com:80')
        or die "can't connect to port 80 on www.perl.com $!";
    $handle->autoflush(1);
    if (fork()) {                   # XXX: undef means failure
        select($handle);
        print while <STDIN>;        # everything from stdin to socket
    } else {
        print while <$handle>;      # everything from socket to stdout
    }
    close $handle;
    exit;

How can I write expect in Perl?

Once upon a time, there was a library called chat2.pl (part of the standard perl distribution), which never really got finished. If you find it somewhere, don't use it. These days, your best bet is to look at the Expect module available from CPAN, which also requires two other modules from CPAN, IO::Pty and IO::Stty.

Is there a way to hide perl's command line from programs such as "ps"?

First of all note that if you're doing this for security reasons (to avoid people seeing passwords, for example) then you should rewrite your program so that critical information is never given as an argument. Hiding the arguments won't make your program completely secure.

To actually alter the visible command line, you can assign to the variable $0 as documented in perlvar. This won't work on all operating systems, though. Daemon programs like sendmail place their state there, as in:

  1. $0 = "orcus [accepting connections]";

I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible?

  • Unix

    In the strictest sense, it can't be done--the script executes as a different process from the shell it was started from. Changes to a process are not reflected in its parent--only in any children created after the change. There is shell magic that may allow you to fake it by eval()ing the script's output in your shell; check out the comp.unix.questions FAQ for details.

How do I close a process's filehandle without waiting for it to complete?

Assuming your system supports such things, just send an appropriate signal to the process (see kill). It's common to first send a TERM signal, wait a little bit, and then send a KILL signal to finish it off.
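The TERM-then-KILL sequence could be sketched as follows, assuming a Unix-like system. The forked child that just sleeps is a stand-in for whatever process you actually started, and the two-second grace period is an arbitrary choice.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";   # for WNOHANG

# Stand-in for a long-running child process.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) { sleep 60; exit 0 }

kill 'TERM', $pid;              # ask politely first

my $reaped = 0;
for (1 .. 20) {                 # wait up to about two seconds
    $reaped = waitpid($pid, WNOHANG);
    last if $reaped == $pid;
    select(undef, undef, undef, 0.1);
}
if ($reaped != $pid) {
    kill 'KILL', $pid;          # then insist
    waitpid($pid, 0);
}
my $signal = $? & 127;          # signal number that ended the child
print "child $pid ended by signal $signal\n";
```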

How do I fork a daemon process?

If by daemon process you mean one that's detached (disassociated from its tty), then the following process is reported to work on most Unixish systems. Non-Unix users should check their Your_OS::Process module for other solutions.

  • Open /dev/tty and use the TIOCNOTTY ioctl on it. See tty(1) for details. Or better yet, you can just use the POSIX::setsid() function, so you don't have to worry about process groups.

  • Change directory to /

  • Reopen STDIN, STDOUT, and STDERR so they're not connected to the old tty.

  • Background yourself like this:

    1. fork && exit;

The Proc::Daemon module, available from CPAN, provides a function to perform these actions for you.
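If you would rather see the steps spelled out, the following sketch follows the same outline using POSIX::setsid(). It assumes a Unix-like system with /dev/null; the sub name daemonize is just a label for this example.

```perl
use strict;
use warnings;
use POSIX qw(setsid);

sub daemonize {
    chdir('/')                     or die "can't chdir to /: $!";
    open(STDIN,  '<', '/dev/null') or die "can't read /dev/null: $!";
    open(STDOUT, '>', '/dev/null') or die "can't write to /dev/null: $!";
    defined(my $pid = fork())      or die "can't fork: $!";
    exit if $pid;                  # parent exits; the child carries on
    (setsid() != -1)               or die "can't start a new session: $!";
    open(STDERR, '>&', \*STDOUT)   or die "can't dup stdout: $!";
    return $$;                     # pid of the detached process
}

# Call daemonize() once, early in the program, before doing real work.
```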

How do I find out if I'm running interactively or not?

(contributed by brian d foy)

This is a difficult question to answer, and the best answer is only a guess.

What do you really want to know? If you merely want to know if one of your filehandles is connected to a terminal, you can try the -t file test:

  1. if( -t STDOUT ) {
  2. print "I'm connected to a terminal!\n";
  3. }

However, you might be out of luck if you expect that means there is a real person on the other side. With the Expect module, another program can pretend to be a person. The program might even come close to passing the Turing test.

The IO::Interactive module does the best it can to give you an answer. Its is_interactive function returns true if the module thinks the session is interactive. Its interactive function returns an output filehandle; that filehandle points to standard output if the module thinks the session is interactive. Otherwise, the filehandle is a null handle that simply discards the output:

    use IO::Interactive qw(interactive);
    print { interactive } "I might go to standard output!\n";

This still doesn't guarantee that a real person is answering your prompts or reading your output.

If you want to know how to handle automated testing for your distribution, you can check the environment. The CPAN Testers, for instance, set the value of AUTOMATED_TESTING:

  1. unless( $ENV{AUTOMATED_TESTING} ) {
  2. print "Hello interactive tester!\n";
  3. }

How do I timeout a slow event?

Use the alarm() function, probably in conjunction with a signal handler, as documented in Signals in perlipc and the section on "Signals" in the Camel. You may instead use the more flexible Sys::AlarmCall module available from CPAN.

The alarm() function is not implemented on all versions of Windows. Check the documentation for your specific version of Perl.
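Where alarm() is available, the usual pattern wraps the slow operation in an eval with a local ALRM handler. The sketch below is one way to package that; the helper name with_timeout and its return convention are choices made for this example, not part of any module.

```perl
use strict;
use warnings;

# Run $code, giving up after $seconds. Returns (1, result) on success
# and (0) on timeout; any other exception is rethrown.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        # The trailing \n stops perl from appending file and line info.
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $result = $code->();
        alarm 0;
        1;
    };
    alarm 0;   # make sure no alarm is left pending
    return (1, $result) if $ok;
    return (0) if $@ eq "timeout\n";
    die $@;
}

my ($done, $value) = with_timeout(1, sub { sleep 5; "finished" });
print $done ? "got $value\n" : "timed out\n";
```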

How do I set CPU limits?

(contributed by Xho)

Use the BSD::Resource module from CPAN. As an example:

  1. use BSD::Resource;
  2. setrlimit(RLIMIT_CPU,10,20) or die $!;

This sets the soft and hard limits to 10 and 20 seconds, respectively. After 10 seconds of time spent running on the CPU (not "wall" time), the process will be sent a signal (XCPU on some systems) which, if not trapped, will cause the process to terminate. If that signal is trapped, then after 10 more seconds (20 seconds in total) the process will be killed with a non-trappable signal.

See the BSD::Resource documentation and your system's documentation for the gory details.

How do I avoid zombies on a Unix system?

Use the reaper code from Signals in perlipc to call wait() when a SIGCHLD is received, or else use the double-fork technique described in How do I start a process in the background? in perlfaq8.
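A reaper along those lines might look like the sketch below, assuming a Unix-like system. The %exit_status hash and the polling loop at the end exist only to make the example observable; a real program would simply get on with its work.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";   # for WNOHANG

my %exit_status;   # pid => wait status of each reaped child

$SIG{CHLD} = sub {
    local ($!, $?);   # don't disturb the main program's error variables
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $exit_status{$pid} = $?;
    }
};

my $pid = fork();
die "fork failed: $!" unless defined $pid;
exit 0 if $pid == 0;             # child: finish immediately

# Parent: the handler reaps the child whenever SIGCHLD arrives.
for (1 .. 50) {
    last if exists $exit_status{$pid};
    select(undef, undef, undef, 0.1);
}
print "reaped $pid\n" if exists $exit_status{$pid};
```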

How do I use an SQL database?

The DBI module provides an abstract interface to most database servers and types, including Oracle, DB2, Sybase, mysql, Postgresql, ODBC, and flat files. The DBI module accesses each database type through a database driver, or DBD. You can see a complete list of available drivers on CPAN: http://www.cpan.org/modules/by-module/DBD/ . You can read more about DBI on http://dbi.perl.org/ .

Other modules provide more specific access: Win32::ODBC, Alzabo, iodbc, and others found on CPAN Search: http://search.cpan.org/ .

How do I make a system() exit on control-C?

You can't. You need to imitate the system() call (see perlipc for sample code) and then have a signal handler for the INT signal that passes the signal on to the subprocess. Or you can check for it:

  1. $rc = system($cmd);
  2. if ($rc & 127) { die "signal death" }

How do I open a file without blocking?

If you're lucky enough to be using a system that supports non-blocking reads (most Unixish systems do), you need only to use the O_NDELAY or O_NONBLOCK flag from the Fcntl module in conjunction with sysopen():

    use Fcntl;
    sysopen(my $fh, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644)
        or die "can't open /foo/somefile: $!";
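To confirm that the flag took effect, you can ask the system for the descriptor's status flags with fcntl(). This is a sketch under the assumption of a Unix-like system; the scratch file in a temporary directory is arbitrary.

```perl
use strict;
use warnings;
use Fcntl;                       # O_* and F_GETFL constants
use File::Temp qw(tempdir);

# A scratch file in a temporary directory (the name is arbitrary).
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/somefile";

sysopen(my $fh, $path, O_WRONLY | O_NONBLOCK | O_CREAT, 0644)
    or die "can't open $path: $!";

# Ask the system which status flags are set on the descriptor.
my $flags = fcntl($fh, F_GETFL, 0)
    or die "can't get flags for $path: $!";
print "non-blocking mode is ", ($flags & O_NONBLOCK ? "on" : "off"), "\n";
close($fh);
```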

How do I tell the difference between errors from the shell and perl?

(answer contributed by brian d foy)

When you run a Perl script, something else is running the script for you, and that something else may output error messages. The script might emit its own warnings and error messages. Most of the time you cannot tell who said what.

You probably cannot fix the thing that runs perl, but you can change how perl outputs its warnings by defining custom warning and die functions.

Consider this script, which has an error you may not notice immediately.

  1. #!/usr/locl/bin/perl
  2. print "Hello World\n";

I get an error when I run this from my shell (which happens to be bash). That may look like perl forgot it has a print() function, but my shebang line is not the path to perl, so the shell runs the script, and I get the error.

  1. $ ./test
  2. ./test: line 3: print: command not found

A quick and dirty fix involves a little bit of code, but this may be all you need to figure out the problem.

  1. #!/usr/bin/perl -w
  2. BEGIN {
  3. $SIG{__WARN__} = sub{ print STDERR "Perl: ", @_; };
  4. $SIG{__DIE__} = sub{ print STDERR "Perl: ", @_; exit 1};
  5. }
  6. $a = 1 + undef;
  7. $x / 0;
  8. __END__

The perl message comes out with "Perl" in front. The BEGIN block works at compile time so all of the compilation errors and warnings get the "Perl:" prefix too.

  1. Perl: Useless use of division (/) in void context at ./test line 9.
  2. Perl: Name "main::a" used only once: possible typo at ./test line 8.
  3. Perl: Name "main::x" used only once: possible typo at ./test line 9.
  4. Perl: Use of uninitialized value in addition (+) at ./test line 8.
  5. Perl: Use of uninitialized value in division (/) at ./test line 9.
  6. Perl: Illegal division by zero at ./test line 9.
  7. Perl: Illegal division by zero at -e line 3.

If I don't see that "Perl:", it's not from perl.

You could also just know all the perl errors, and although there are some people who may know all of them, you probably don't. However, they all should be in the perldiag manpage. If you don't find the error in there, it probably isn't a perl error.

Looking up every message is not the easiest way, so let perl do it for you. Use the diagnostics pragma, which turns perl's normal messages into longer discussions on the topic.

  1. use diagnostics;

If you don't get a paragraph or two of expanded discussion, it might not be perl's message.

How do I install a module from CPAN?

(contributed by brian d foy)

The easiest way is to have a module also named CPAN do it for you by using the cpan command that comes with Perl. You can give it a list of modules to install:

  1. $ cpan IO::Interactive Getopt::Whatever

If you prefer CPANPLUS, it's just as easy:

  1. $ cpanp i IO::Interactive Getopt::Whatever

If you want to install a distribution from the current directory, you can tell CPAN.pm to install . (the full stop):

  1. $ cpan .

See the documentation for either of those commands to see what else you can do.

If you want to try to install a distribution by yourself, resolving all dependencies on your own, you follow one of two possible build paths.

For distributions that use Makefile.PL:

  1. $ perl Makefile.PL
  2. $ make test install

For distributions that use Build.PL:

  1. $ perl Build.PL
  2. $ ./Build test
  3. $ ./Build install

Some distributions may need to link to libraries or other third-party code and their build and installation sequences may be more complicated. Check any README or INSTALL files that you may find.

What's the difference between require and use?

(contributed by brian d foy)

Perl runs the require statement at run time. Once Perl loads, compiles, and runs the file, it doesn't do anything else. The use statement is the same as a require run at compile time, but Perl also calls the import method for the loaded package. These two are the same:

  1. use MODULE qw(import list);
  2. BEGIN {
  3. require MODULE;
  4. MODULE->import(import list);
  5. }

However, you can suppress the import by using an explicit, empty import list. Both of these still happen at compile-time:

  1. use MODULE ();
  2. BEGIN {
  3. require MODULE;
  4. }

Since use will also call the import method, the actual value for MODULE must be a bareword. That is, use cannot load files by name, although require can:

  1. require "$ENV{HOME}/lib/Foo.pm"; # no @INC searching!

See the entry for use in perlfunc for more details.
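Because require happens at run time, it can load a module whose name is only known while the program runs. The sketch below does by hand what use would do at compile time; File::Basename is used simply because it ships with Perl.

```perl
use strict;
use warnings;

# Module name chosen at run time.
my $module = 'File::Basename';

# require with a string wants a file path, so turn Foo::Bar into Foo/Bar.pm.
(my $file = "$module.pm") =~ s{::}{/}g;
require $file;

# What use would have done for us at compile time:
$module->import('basename');

# Parentheses are needed: the sub was unknown when this line was compiled.
my $name = basename('/etc/termcap');
print "$name\n";   # prints "termcap"
```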

How do I keep my own module/library directory?

When you build modules, tell Perl where to install the modules.

If you want to install modules for your own use, the easiest way might be local::lib, which you can download from CPAN. It sets various installation settings for you, and uses those same settings within your programs.

If you want more flexibility, you need to configure your CPAN client for your particular situation.

For Makefile.PL-based distributions, use the INSTALL_BASE option when generating Makefiles:

  1. perl Makefile.PL INSTALL_BASE=/mydir/perl

You can set this in your CPAN.pm configuration so modules automatically install in your private library directory when you use the CPAN.pm shell:

  1. % cpan
  2. cpan> o conf makepl_arg INSTALL_BASE=/mydir/perl
  3. cpan> o conf commit

For Build.PL-based distributions, use the --install_base option:

  1. perl Build.PL --install_base /mydir/perl

You can configure CPAN.pm to automatically use this option too:

  1. % cpan
  2. cpan> o conf mbuild_arg "--install_base /mydir/perl"
  3. cpan> o conf commit

INSTALL_BASE tells these tools to put your modules into /mydir/perl/lib/perl5. See How do I add a directory to my include path (@INC) at runtime? for details on how to run your newly installed modules.

There is one caveat with INSTALL_BASE, though, since it acts differently from the PREFIX and LIB settings that older versions of ExtUtils::MakeMaker advocated. INSTALL_BASE does not support installing modules for multiple versions of Perl or different architectures under the same directory. You should consider whether you really want that and, if you do, use the older PREFIX and LIB settings. See the ExtUtils::MakeMaker documentation for more details.

How do I add the directory my program lives in to the module/library search path?

(contributed by brian d foy)

If you know the directory already, you can add it to @INC as you would for any other directory. You might use lib if you know the directory at compile time:

  1. use lib $directory;

The trick in this task is to find the directory. Before your script does anything else (such as a chdir), you can get the current working directory with the Cwd module, which comes with Perl:

  1. BEGIN {
  2. use Cwd;
  3. our $directory = cwd;
  4. }
  5. use lib $directory;

You can do a similar thing with the value of $0, which holds the script name. That might hold a relative path, but rel2abs can turn it into an absolute path. Once you have the absolute path, dirname gives you the directory:

  1. BEGIN {
  2. use File::Spec::Functions qw(rel2abs);
  3. use File::Basename qw(dirname);
  4. my $path = rel2abs( $0 );
  5. our $directory = dirname( $path );
  6. }
  7. use lib $directory;

The FindBin module, which comes with Perl, might work. It finds the directory of the currently running script and puts it in $Bin, which you can then use to construct the right library path:

  1. use FindBin qw($Bin);
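Constructing the library path from $Bin might look like this sketch. The lib/ directory next to the script is an assumed layout; adjust the relative path to wherever your modules actually live.

```perl
use strict;
use warnings;
use FindBin qw($Bin);
use lib "$Bin/lib";   # a lib/ directory beside the script (assumed layout)

# Both use statements run at compile time, in order, so $Bin is already
# set when lib adds the directory to @INC.
my $found = grep { $_ eq "$Bin/lib" } @INC;
print "$Bin/lib is ", ($found ? "" : "not "), "in \@INC\n";
```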

You can also use local::lib to do much of the same thing. Install modules using local::lib's settings then use the module in your program:

  1. use local::lib; # sets up a local lib at ~/perl5

See the local::lib documentation for more details.

How do I add a directory to my include path (@INC) at runtime?

Here are the suggested ways of modifying your include path, including environment variables, run-time switches, and in-code statements:

  • the PERLLIB environment variable
    1. $ export PERLLIB=/path/to/my/dir
    2. $ perl program.pl
  • the PERL5LIB environment variable
    1. $ export PERL5LIB=/path/to/my/dir
    2. $ perl program.pl
  • the perl -Idir command line flag
    1. $ perl -I/path/to/my/dir program.pl
  • the lib pragma:
    1. use lib "$ENV{HOME}/myown_perllib";
  • the local::lib module:
    1. use local::lib;
    2. use local::lib "~/myown_perllib";

The last is particularly useful because it knows about machine-dependent architectures. The lib.pm pragmatic module was first included with the 5.002 release of Perl.

What is socket.ph and where do I get it?

It's a Perl 4 style file defining values for system networking constants. Sometimes it is built using h2ph when Perl is installed, but other times it is not. Modern programs should say "use Socket;" instead.

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.

Perl 5 version 18.2 documentation
perlfaq9

NAME

perlfaq9 - Web, Email and Networking

DESCRIPTION

This section deals with questions related to running web sites, sending and receiving email as well as general networking.

Should I use a web framework?

Yes. If you are building a web site with any level of interactivity (forms / users / databases), you will want to use a framework to make handling requests and responses easier.

If there is no interactivity then you may still want to look at using something like Template Toolkit or Plack::Middleware::TemplateToolkit so maintenance of your HTML files (and other assets) is easier.

Which web framework should I use?

There is no simple answer to this question. Perl frameworks can run everything from basic file servers and small scale intranets to massive multinational multilingual websites that are the core to international businesses.

Below is a list of a few frameworks with comments which might help you in making a decision, depending on your specific requirements. Start by reading the docs, then ask questions on the relevant mailing list or IRC channel.

  • Catalyst

    Strongly object-oriented and fully-featured with a long development history and a large community and addon ecosystem. It is excellent for large and complex applications, where you have full control over the server.

  • Dancer

    Young and free of legacy weight, providing a lightweight and easy to learn API. Has a growing addon ecosystem. It is best used for smaller projects and very easy to learn for beginners.

  • Mojolicious

    Fairly young with a focus on HTML5 and real-time web technologies such as WebSockets.

  • Web::Simple

    Currently experimental, strongly object-oriented, built for speed and intended as a toolkit for building micro web apps, custom frameworks or for tying together existing Plack-compatible web applications with one central dispatcher.

All of these interact with or use Plack which is worth understanding the basics of when building a website in Perl (there is a lot of useful Plack::Middleware).

What is Plack and PSGI?

PSGI is the Perl Web Server Gateway Interface Specification, a standard that many Perl web frameworks use. You should not need to understand it to build a web site; the part you might want to use is Plack.

Plack is a set of tools for using the PSGI stack. It contains middleware components, a reference server and utilities for Web application frameworks. Plack is like Ruby's Rack or Python's Paste for WSGI.

You could build a web site using Plack and your own code, but for anything other than a very basic web site, using a web framework (that uses Plack) is a better option.

How do I remove HTML from a string?

Use HTML::Strip, or HTML::FormatText which not only removes HTML but also attempts to do a little simple formatting of the resulting plain text.

How do I extract URLs?

HTML::SimpleLinkExtor will extract URLs from HTML. It handles anchors, images, objects, frames, and many other tags that can contain a URL. If you need anything more complex, you can create your own subclass of HTML::LinkExtor or HTML::Parser. You might even use HTML::SimpleLinkExtor as an example for something specifically suited to your needs.

You can use URI::Find to extract URLs from an arbitrary text document.

How do I fetch an HTML file?

(contributed by brian d foy)

Use the libwww-perl distribution. The LWP::Simple module can fetch web resources and give their content back to you as a string:

  1. use LWP::Simple qw(get);
  2. my $html = get( "http://www.example.com/index.html" );

It can also store the resource directly in a file:

  1. use LWP::Simple qw(getstore);
  2. getstore( "http://www.example.com/index.html", "foo.html" );

If you need to do something more complicated, you can use the LWP::UserAgent module to create your own user-agent (e.g. browser) to get the job done. If you want to simulate an interactive web browser, you can use the WWW::Mechanize module.

How do I automate an HTML form submission?

If you are doing something complex, such as moving through many pages and forms on a web site, you can use WWW::Mechanize. See its documentation for all the details.

If you're submitting values using the GET method, create a URL and encode the form using the query_form method:

    use LWP::Simple;
    use URI::URL;
    my $url = url('http://www.perl.com/cgi-bin/cpan_mod');
    $url->query_form(module => 'DB_File', readme => 1);
    my $content = get($url);

If you're using the POST method, create your own user agent and encode the content appropriately.

    use HTTP::Request::Common qw(POST);
    use LWP::UserAgent;
    my $ua = LWP::UserAgent->new();
    my $req = POST 'http://www.perl.com/cgi-bin/cpan_mod',
        [ module => 'DB_File', readme => 1 ];
    my $content = $ua->request($req)->as_string;

How do I decode or create those %-encodings on the web?

Most of the time you should not need to do this: your web framework, or LWP or another module if you are making a request, handles it for you.

To encode a string yourself, use the URI::Escape module. The uri_escape function returns the escaped string:

    use URI::Escape;
    my $original = "Colon : Hash # Percent %";
    my $escaped = uri_escape( $original );
    print "$escaped\n"; # 'Colon%20%3A%20Hash%20%23%20Percent%20%25'

To decode the string, use the uri_unescape function:

  1. my $unescaped = uri_unescape( $escaped );
  2. print $unescaped; # back to original

Remember not to encode a full URI; you need to escape each component separately and then join them together.
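To make the component-by-component rule concrete, here is a hand-rolled sketch of percent-encoding that needs no modules. It is an illustration of the idea, not a replacement for URI::Escape; the helper names percent_encode and percent_decode and the example.com URL are invented for this example, and the input is assumed to be a string of bytes already.

```perl
use strict;
use warnings;

# Escape everything except the RFC 3986 "unreserved" characters.
sub percent_encode {
    my ($str) = @_;
    $str =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $str;
}

sub percent_decode {
    my ($str) = @_;
    $str =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $str;
}

# Escape each component separately, then join them.
my %form  = (module => 'DB_File', note => 'a&b = c');
my $query = join '&',
    map { percent_encode($_) . '=' . percent_encode($form{$_}) }
    sort keys %form;
my $url = "http://www.example.com/search?$query";
print "$url\n";
```

Escaping the whole URL at once would also have escaped the "?", "&" and "=" that give the query its structure.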

How do I redirect to another page?

Most Perl web frameworks have a mechanism for doing this. Using the Catalyst framework, it would be:

  1. $c->res->redirect($url);
  2. $c->detach();

If you are using Plack (which most frameworks do), then Plack::Middleware::Rewrite is worth looking at if you are migrating from Apache or have URLs you want to always redirect.

How do I put a password on my web pages?

See if the web framework you are using has an authentication system and if that fits your needs.

Alternatively, look at Plack::Middleware::Auth::Basic, or one of the other Plack authentication options.

How do I make sure users can't enter values into a form that causes my CGI script to do bad things?

(contributed by brian d foy)

You can't prevent people from sending your script bad data. Even if you add some client-side checks, people may disable them or bypass them completely. For instance, someone might use a module such as LWP to submit to your web site. If you want to prevent data that try to use SQL injection or other sorts of attacks (and you should want to), you have to not trust any data that enter your program.

The perlsec documentation has general advice about data security. If you are using the DBI module, use placeholders to fill in data. If you are running external programs with system or exec, use the list forms. There are many other precautions that you should take, too many to list here, and most of them fall under the category of not using any data that you don't intend to use. Trust no one.
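One common way to act on "not using any data that you don't intend to use" is to accept only input that matches what you expect and reject everything else. The sketch below shows the idea for a single field; the sub name clean_username and the 1 to 32 word-character rule are illustrative assumptions, not a recommended policy.

```perl
use strict;
use warnings;

# Accept only what you expect; reject everything else.
sub clean_username {
    my ($input) = @_;
    return undef unless defined $input;
    # \A and \z anchor the whole string, so a trailing newline is rejected.
    my ($clean) = $input =~ /\A([A-Za-z0-9_]{1,32})\z/;
    return $clean;   # undef when the input doesn't match
}

for my $candidate ('alice_01', 'bob; rm -rf /', "carol\n") {
    my $name = clean_username($candidate);
    print defined $name ? "accepted: $name\n" : "rejected\n";
}
```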

How do I parse a mail header?

Use the Email::MIME module. It's well-tested and supports all the craziness that you'll see in the real world (comment-folding whitespace, encodings, comments, etc.).

  1. use Email::MIME;
  2. my $message = Email::MIME->new($rfc2822);
  3. my $subject = $message->header('Subject');
  4. my $from = $message->header('From');

If you've already got some other kind of email object, consider passing it to Email::Abstract and then using its cast method to get an Email::MIME object:

    use Email::Abstract;
    my $mail_message_object = read_message();   # however you obtain your existing object
    my $abstract = Email::Abstract->new($mail_message_object);
    my $email_mime_object = $abstract->cast('Email::MIME');

How do I check a valid mail address?

(partly contributed by Aaron Sherman)

This isn't as simple a question as it sounds. There are two parts:

a) How do I verify that an email address is correctly formatted?

b) How do I verify that an email address targets a valid recipient?

Without sending mail to the address and seeing whether there's a human on the other end to answer you, you cannot fully answer part b, but the Email::Valid module will do both part a and part b as far as you can in real-time.

Our best advice for verifying a person's mail address is to have them enter their address twice, just as you normally do to change a password. This usually weeds out typos. If both versions match, send mail to that address with a personal message. If you get the message back and they've followed your directions, you can be reasonably assured that it's real.

A related strategy that's less open to forgery is to give them a PIN (personal ID number). Record the address and PIN (best that it be a random one) for later processing. In the mail you send, include a link to your site with the PIN included. If the mail bounces, you know it's not valid. If they don't click on the link, either they forged the address or (assuming they got the message) following through wasn't important so you don't need to worry about it.

How do I decode a MIME/BASE64 string?

The MIME::Base64 package handles this as well as the MIME/QP encoding. Decoding base 64 becomes as simple as:

  1. use MIME::Base64;
  2. my $decoded = decode_base64($encoded);

The Email::MIME module can decode base 64-encoded email message parts transparently so the developer doesn't need to worry about it.
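A short round trip shows both directions. This sketch uses only MIME::Base64, which comes with Perl; the sample string is arbitrary.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

my $plain   = "Hello, world!";
my $encoded = encode_base64($plain, "");   # "" suppresses the trailing newline
my $decoded = decode_base64($encoded);

print "$encoded\n";   # prints "SGVsbG8sIHdvcmxkIQ=="
print "round trip ok\n" if $decoded eq $plain;
```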

How do I find the user's mail address?

Ask them for it. There are so many email providers available that it's unlikely the local system has any idea how to determine a user's email address.

The exception is for organization-specific email (e.g. foo@yourcompany.com) where policy can be codified in your program. In that case, you could look at $ENV{USER}, $ENV{LOGNAME}, and getpwuid($<) in scalar context, like so:

    my $user_name = getpwuid($<);

But you still cannot make assumptions about whether this is correct, unless your policy says it is. You really are best off asking the user.

How do I send email?

Use the Email::MIME and Email::Sender::Simple modules, like so:

    use Email::MIME;
    use Email::Sender::Simple qw(sendmail);

    # first, create your message
    my $message = Email::MIME->create(
        header_str => [
            From    => 'you@example.com',
            To      => 'friend@example.com',
            Subject => 'Happy birthday!',
        ],
        attributes => {
            encoding => 'quoted-printable',
            charset  => 'utf-8',
        },
        body_str => "Happy birthday to you!\n",
    );

    # then send it
    sendmail($message);

By default, Email::Sender::Simple will try `sendmail` first, if it exists in your $PATH. This generally isn't the case. If there's a remote mail server you use to send mail, consider investigating one of the Transport classes. At time of writing, the available transports include:

  • Email::Sender::Transport::Sendmail

    This is the default. If you can use the mail(1) or mailx(1) program to send mail from the machine where your code runs, you should be able to use this.

  • Email::Sender::Transport::SMTP

    This transport contacts a remote SMTP server over TCP. It optionally uses SSL and can authenticate to the server via SASL.

  • Email::Sender::Transport::SMTP::TLS

    This is like the SMTP transport, but uses TLS security. You can authenticate with this module as well, using any mechanisms your server supports after STARTTLS.

Telling Email::Sender::Simple to use your transport is straightforward.

  1. sendmail(
  2. $message,
  3. {
  4. transport => $email_sender_transport_object,
  5. }
  6. );

How do I use MIME to make an attachment to a mail message?

Email::MIME directly supports multipart messages. Email::MIME objects themselves are parts and can be attached to other Email::MIME objects. Consult the Email::MIME documentation for more information, including all of the supported methods and examples of their use.

How do I read email?

Use the Email::Folder module, like so:

    use Email::Folder;
    use Email::MIME;
    my $folder = Email::Folder->new('/path/to/email/folder');
    while(my $message = $folder->next_message) {
        # next_message returns Email::Simple objects, but we want
        # Email::MIME objects as they're more robust
        my $mime = Email::MIME->new($message->as_string);
    }

There are different classes in the Email::Folder namespace for supporting various mailbox types. Note that these modules are generally rather limited and only support reading rather than writing.

How do I find out my hostname, domainname, or IP address?

(contributed by brian d foy)

The Net::Domain module, which is part of the Standard Library starting in Perl 5.7.3, can get you the fully qualified domain name (FQDN), the host name, or the domain name.

    use Net::Domain qw(hostname hostfqdn hostdomain);
    my $host = hostfqdn();

The Sys::Hostname module, part of the Standard Library, can also get the hostname:

    use Sys::Hostname;
    $host = hostname();

The Sys::Hostname::Long module takes a different approach and tries harder to return the fully qualified hostname:

    use Sys::Hostname::Long 'hostname_long';
    my $hostname = hostname_long();

To get the IP address, you can use the gethostbyname built-in function to turn the name into a number. To turn that number into the dotted octet form (a.b.c.d) that most people expect, use the inet_ntoa function from the Socket module, which also comes with perl.

    use Socket;
    my $address = inet_ntoa(
        scalar gethostbyname( $host || 'localhost' )
    );

How do I fetch/put an (S)FTP file?

Net::FTP and Net::SFTP allow you to interact with FTP and SFTP (Secure FTP) servers.
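As a minimal sketch of a download with Net::FTP, something like the following might serve. The helper name fetch_via_ftp and its named arguments (host, user, password, remote, local) are hypothetical and chosen only for illustration; the Net::FTP methods used are new, login, binary, get, message, and quit.

```perl
use strict;
use warnings;
use Net::FTP;

# Hypothetical helper: download one file in binary mode.
# Returns the local file name on success and dies on any failure.
sub fetch_via_ftp {
    my (%arg) = @_;

    # new() returns undef on failure and leaves the reason in $@
    my $ftp = Net::FTP->new($arg{host}, Timeout => 30)
        or die "Cannot connect to $arg{host}: $@";

    $ftp->login($arg{user}, $arg{password})
        or die "Login failed: ", $ftp->message;
    $ftp->binary;

    # get() returns the local file name, or undef on failure
    my $local = $ftp->get($arg{remote}, $arg{local})
        or die "Download failed: ", $ftp->message;
    $ftp->quit;

    return $local;
}
```

A call would pass placeholder values such as host => 'ftp.example.com' and remote => '/pub/README'. Uploading follows the same pattern with the put method in place of get.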

How can I do RPC in Perl?

Use one of the RPC modules (https://metacpan.org/search?q=RPC).

AUTHOR AND COPYRIGHT

Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other authors as noted. All rights reserved.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples in this file are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.

 
perlfilter

NAME

perlfilter - Source Filters

DESCRIPTION

This article is about a little-known feature of Perl called source filters. Source filters alter the program text of a module before Perl sees it, much as a C preprocessor alters the source text of a C program before the compiler sees it. This article tells you more about what source filters are, how they work, and how to write your own.

The original purpose of source filters was to let you encrypt your program source to prevent casual piracy. This isn't all they can do, as you'll soon learn. But first, the basics.

CONCEPTS

Before the Perl interpreter can execute a Perl script, it must first read it from a file into memory for parsing and compilation. If that script itself includes other scripts with a use or require statement, then each of those scripts will have to be read from their respective files as well.

Now think of each logical connection between the Perl parser and an individual file as a source stream. A source stream is created when the Perl parser opens a file, it continues to exist as the source code is read into memory, and it is destroyed when Perl is finished parsing the file. If the parser encounters a require or use statement in a source stream, a new and distinct stream is created just for that file.

The diagram below represents a single source stream, with the flow of source from a Perl script file on the left into the Perl parser on the right. This is how Perl normally operates.

    file -------> parser

There are two important points to remember:

1.

Although there can be any number of source streams in existence at any given time, only one will be active.

2.

Every source stream is associated with only one file.

A source filter is a special kind of Perl module that intercepts and modifies a source stream before it reaches the parser. A source filter changes our diagram like this:

    file ----> filter ----> parser

If that doesn't make much sense, consider the analogy of a command pipeline. Say you have a shell script stored in the compressed file trial.gz. The simple pipeline command below runs the script without needing to create a temporary file to hold the uncompressed file.

    gunzip -c trial.gz | sh

In this case, the data flow from the pipeline can be represented as follows:

    trial.gz ----> gunzip ----> sh

With source filters, you can store the text of your script compressed and use a source filter to uncompress it for Perl's parser:

     compressed           gunzip
    Perl program ---> source filter ---> parser

USING FILTERS

So how do you use a source filter in a Perl script? Above, I said that a source filter is just a special kind of module. Like all Perl modules, a source filter is invoked with a use statement.

Say you want to pass your Perl source through the C preprocessor before execution. As it happens, the source filters distribution comes with a C preprocessor filter module called Filter::cpp.

Below is an example program, cpp_test , which makes use of this filter. Line numbers have been added to allow specific lines to be referenced easily.

    1: use Filter::cpp;
    2: #define TRUE 1
    3: $a = TRUE;
    4: print "a = $a\n";

When you execute this script, Perl creates a source stream for the file. Before the parser processes any of the lines from the file, the source stream looks like this:

    cpp_test ---------> parser

Line 1, use Filter::cpp , includes and installs the cpp filter module. All source filters work this way. The use statement is compiled and executed at compile time, before any more of the file is read, and it attaches the cpp filter to the source stream behind the scenes. Now the data flow looks like this:

    cpp_test ----> cpp filter ----> parser

As the parser reads the second and subsequent lines from the source stream, it feeds those lines through the cpp source filter before processing them. The cpp filter simply passes each line through the real C preprocessor. The output from the C preprocessor is then inserted back into the source stream by the filter.

                     .-> cpp --.
                     |         |
                     |         |
                     |       <-'
    cpp_test ----> cpp filter ----> parser

The parser then sees the following code:

    use Filter::cpp;
    $a = 1;
    print "a = $a\n";

Let's consider what happens when the filtered code includes another module with use:

    1: use Filter::cpp;
    2: #define TRUE 1
    3: use Fred;
    4: $a = TRUE;
    5: print "a = $a\n";

The cpp filter does not apply to the text of the Fred module, only to the text of the file that used it (cpp_test ). Although the use statement on line 3 will pass through the cpp filter, the module that gets included (Fred ) will not. The source streams look like this after line 3 has been parsed and before line 4 is parsed:

    cpp_test ---> cpp filter ---> parser (INACTIVE)

                   Fred.pm ----> parser

As you can see, a new stream has been created for reading the source from Fred.pm . This stream will remain active until all of Fred.pm has been parsed. The source stream for cpp_test will still exist, but is inactive. Once the parser has finished reading Fred.pm, the source stream associated with it will be destroyed. The source stream for cpp_test then becomes active again and the parser reads line 4 and subsequent lines from cpp_test .

You can use more than one source filter on a single file. Similarly, you can reuse the same filter in as many files as you like.

For example, if you have a uuencoded and compressed source file, it is possible to stack a uudecode filter and an uncompression filter like this:

    use Filter::uudecode; use Filter::uncompress;
    M'XL(".H<US4''V9I;F%L')Q;>7/;1I;_>_I3=&E=%:F*I"T?22Q/
    M6]9*<IQCO*XFT"0[PL%%'Y+IG?WN^ZYN-$'J.[.JE$,20/?K=_[>
    ...

Once the first line has been processed, the flow will look like this:

    file ---> uudecode ---> uncompress ---> parser
               filter         filter

Data flows through filters in the same order they appear in the source file. The uudecode filter appeared before the uncompress filter, so the source file will be uudecoded before it's uncompressed.

WRITING A SOURCE FILTER

There are three ways to write your own source filter. You can write it in C, use an external program as a filter, or write the filter in Perl. I won't cover the first two in any great detail, so I'll get them out of the way first. Writing the filter in Perl is most convenient, so I'll devote the most space to it.

WRITING A SOURCE FILTER IN C

The first of the three available techniques is to write the filter completely in C. The external module you create interfaces directly with the source filter hooks provided by Perl.

The advantage of this technique is that you have complete control over the implementation of your filter. The big disadvantage is the increased complexity required to write the filter - not only do you need to understand the source filter hooks, but you also need a reasonable knowledge of Perl guts. One of the few times it is worth going to this trouble is when writing a source scrambler. The decrypt filter (which unscrambles the source before Perl parses it) included with the source filter distribution is an example of a C source filter (see Decryption Filters, below).

  • Decryption Filters

    All decryption filters work on the principle of "security through obscurity." Regardless of how well you write a decryption filter and how strong your encryption algorithm is, anyone determined enough can retrieve the original source code. The reason is quite simple - once the decryption filter has decrypted the source back to its original form, fragments of it will be stored in the computer's memory as Perl parses it. The source might only be in memory for a short period of time, but anyone possessing a debugger, skill, and lots of patience can eventually reconstruct your program.

    That said, there are a number of steps that can be taken to make life difficult for the potential cracker. The most important: Write your decryption filter in C and statically link the decryption module into the Perl binary. For further tips to make life difficult for the potential cracker, see the file decrypt.pm in the source filters distribution.

CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE

An alternative to writing the filter in C is to create a separate executable in the language of your choice. The separate executable reads from standard input, does whatever processing is necessary, and writes the filtered data to standard output. Filter::cpp is an example of a source filter implemented as a separate executable - the executable is the C preprocessor bundled with your C compiler.

The source filter distribution includes two modules that simplify this task: Filter::exec and Filter::sh . Both allow you to run any external executable. Both use a coprocess to control the flow of data into and out of the external executable. (For details on coprocesses, see Stevens, W.R., "Advanced Programming in the UNIX Environment." Addison-Wesley, ISBN 0-201-56317-7, pages 441-445.) The difference between them is that Filter::exec spawns the external command directly, while Filter::sh spawns a shell to execute the external command. (Unix uses the Bourne shell; NT uses the cmd shell.) Spawning a shell allows you to make use of the shell metacharacters and redirection facilities.

Here is an example script that uses Filter::sh :

    use Filter::sh 'tr XYZ PQR';
    $a = 1;
    print "XYZ a = $a\n";

The output you'll get when the script is executed:

    PQR a = 1

Writing a source filter as a separate executable works fine, but a small performance penalty is incurred. For example, if you execute the small example above, a separate subprocess will be created to run the Unix tr command. Each use of the filter requires its own subprocess. If creating subprocesses is expensive on your system, you might want to consider one of the other options for creating source filters.

WRITING A SOURCE FILTER IN PERL

The easiest and most portable option available for creating your own source filter is to write it completely in Perl. To distinguish this from the previous two techniques, I'll call it a Perl source filter.

To help understand how to write a Perl source filter we need an example to study. Here is a complete source filter that performs rot13 decoding. (Rot13 is a very simple encryption scheme used in Usenet postings to hide the contents of offensive posts. It moves every letter forward thirteen places, so that A becomes N, B becomes O, and Z becomes M.)

    package Rot13;

    use Filter::Util::Call;

    sub import {
        my ($type) = @_;
        my ($ref) = [];
        filter_add(bless $ref);
    }

    sub filter {
        my ($self) = @_;
        my ($status);

        tr/n-za-mN-ZA-M/a-zA-Z/
            if ($status = filter_read()) > 0;
        $status;
    }

    1;

All Perl source filters are implemented as Perl classes and have the same basic structure as the example above.

First, we include the Filter::Util::Call module, which exports a number of functions into your filter's namespace. The filter shown above uses two of these functions, filter_add() and filter_read() .

Next, we create the filter object and associate it with the source stream by defining the import function. If you know Perl well enough, you know that import is called automatically every time a module is included with a use statement. This makes import the ideal place to both create and install a filter object.

In the example filter, the object ($ref ) is blessed just like any other Perl object. Our example uses an anonymous array, but this isn't a requirement. Because this example doesn't need to store any context information, we could have used a scalar or hash reference just as well. The next section demonstrates context data.

The association between the filter object and the source stream is made with the filter_add() function. This takes a filter object as a parameter ($ref in this case) and installs it in the source stream.

Finally, there is the code that actually does the filtering. For this type of Perl source filter, all the filtering is done in a method called filter() . (It is also possible to write a Perl source filter using a closure. See the Filter::Util::Call manual page for more details.) It's called every time the Perl parser needs another line of source to process. The filter() method, in turn, reads lines from the source stream using the filter_read() function.

If a line was available from the source stream, filter_read() returns a status value greater than zero and appends the line to $_ . A status value of zero indicates end-of-file, less than zero means an error. The filter function itself is expected to return its status in the same way, and put the filtered line it wants written to the source stream in $_ . The use of $_ accounts for the brevity of most Perl source filters.
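The closure form mentioned above might look roughly like this. It is a sketch only: the package name Rot13Closure and the helper rot13() are hypothetical, and the filter relies on filter_add() accepting a code reference in place of a blessed object.

```perl
package Rot13Closure;    # hypothetical name, not part of the Filter distribution

use strict;
use warnings;
use Filter::Util::Call;

# Decode (or encode) a string; rot13 is its own inverse.
sub rot13 {
    my ($text) = @_;
    $text =~ tr/n-za-mN-ZA-M/a-zA-Z/;
    return $text;
}

sub import {
    my ($type) = @_;

    # With a code reference, the closure itself is called each time
    # the parser needs another line; no filter() method is required.
    filter_add(
        sub {
            my $status = filter_read();    # appends the next line to $_
            $_ = rot13($_) if $status > 0;
            return $status;                # >0 line read, 0 EOF, <0 error
        }
    );
    return;
}

1;
```

Saved as Rot13Closure.pm, it would be installed in the same way as the object-based version, with use Rot13Closure; as the first line of the encoded file.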

In order to make use of the rot13 filter we need some way of encoding the source file in rot13 format. The script below, mkrot13 , does just that.

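    die "usage: mkrot13 filename\n" unless @ARGV;
    my $in  = $ARGV[0];
    my $out = "$in.tmp";

    # three-argument open with lexical filehandles
    open(my $in_fh,  '<', $in)  or die "Cannot open file $in: $!\n";
    open(my $out_fh, '>', $out) or die "Cannot open file $out: $!\n";

    # the first line of the encoded file installs the decoding filter
    print {$out_fh} "use Rot13;\n";
    while (<$in_fh>) {
        tr/a-zA-Z/n-za-mN-ZA-M/;
        print {$out_fh} $_;
    }

    close $in_fh;
    close $out_fh or die "Cannot close file $out: $!\n";

    # replace the original file with the encoded copy
    unlink $in;
    rename $out, $in or die "Cannot rename $out to $in: $!\n";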

If we encrypt this with mkrot13 :

    print " hello fred \n";

the result will be this:

    use Rot13;
    cevag " uryyb serq \a";

Running it produces this output:

     hello fred

USING CONTEXT: THE DEBUG FILTER

The rot13 example was a trivial example. Here's another demonstration that shows off a few more features.

Say you wanted to include a lot of debugging code in your Perl script during development, but you didn't want it available in the released product. Source filters offer a solution. In order to keep the example simple, let's say you wanted the debugging output to be controlled by an environment variable, DEBUG . Debugging code is enabled if the variable exists, otherwise it is disabled.

Two special marker lines will bracket debugging code, like this:

    ## DEBUG_BEGIN
    if ($year > 1999) {
        warn "Debug: millennium bug in year $year\n";
    }
    ## DEBUG_END

The filter ensures that Perl parses the code between the DEBUG_BEGIN and DEBUG_END markers only when the DEBUG environment variable exists. That means that when DEBUG does exist, the code above should be passed through the filter unchanged. The marker lines can also be passed through as-is, because the Perl parser will see them as comment lines. When DEBUG isn't set, we need a way to disable the debug code. A simple way to achieve that is to convert the lines between the two markers into comments:

    ## DEBUG_BEGIN
    #if ($year > 1999) {
    #    warn "Debug: millennium bug in year $year\n";
    #}
    ## DEBUG_END

Here is the complete Debug filter:

    package Debug;

    use strict;
    use warnings;
    use Filter::Util::Call;

    use constant TRUE  => 1;
    use constant FALSE => 0;

    sub import {
        my ($type) = @_;
        my (%context) = (
            Enabled      => defined $ENV{DEBUG},
            InTraceBlock => FALSE,
            Filename     => (caller)[1],
            LineNo       => 0,
            LastBegin    => 0,
        );
        filter_add(bless \%context);
    }

    sub Die {
        my ($self)    = shift;
        my ($message) = shift;
        my ($line_no) = shift || $self->{LastBegin};
        die "$message at $self->{Filename} line $line_no.\n"
    }

    sub filter {
        my ($self) = @_;
        my ($status);
        $status = filter_read();
        ++ $self->{LineNo};

        # deal with EOF/error first
        if ($status <= 0) {
            $self->Die("DEBUG_BEGIN has no DEBUG_END")
                if $self->{InTraceBlock};
            return $status;
        }

        if ($self->{InTraceBlock}) {
            if (/^\s*##\s*DEBUG_BEGIN/ ) {
                $self->Die("Nested DEBUG_BEGIN", $self->{LineNo})
            } elsif (/^\s*##\s*DEBUG_END/) {
                $self->{InTraceBlock} = FALSE;
            }

            # comment out the debug lines when the filter is disabled
            s/^/#/ if ! $self->{Enabled};
        } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) {
            $self->{InTraceBlock} = TRUE;
            $self->{LastBegin} = $self->{LineNo};
        } elsif ( /^\s*##\s*DEBUG_END/ ) {
            $self->Die("DEBUG_END has no DEBUG_BEGIN", $self->{LineNo});
        }
        return $status;
    }

    1;

The big difference between this filter and the previous example is the use of context data in the filter object. The filter object is based on a hash reference, and is used to keep various pieces of context information between calls to the filter function. All but two of the hash fields are used for error reporting. The first of those two, Enabled, is used by the filter to determine whether the debugging code should be given to the Perl parser. The second, InTraceBlock, is true when the filter has encountered a DEBUG_BEGIN line, but has not yet encountered the following DEBUG_END line.

If you ignore all the error checking that most of the code does, the essence of the filter is as follows:

    sub filter {
        my ($self) = @_;
        my ($status);
        $status = filter_read();

        # deal with EOF/error first
        return $status if $status <= 0;
        if ($self->{InTraceBlock}) {
            if (/^\s*##\s*DEBUG_END/) {
                $self->{InTraceBlock} = FALSE
            }

            # comment out debug lines when the filter is disabled
            s/^/#/ if ! $self->{Enabled};
        } elsif ( /^\s*##\s*DEBUG_BEGIN/ ) {
            $self->{InTraceBlock} = TRUE;
        }
        return $status;
    }

Be warned: just as the C-preprocessor doesn't know C, the Debug filter doesn't know Perl. It can be fooled quite easily:

    print <<EOM;
    ##DEBUG_BEGIN
    EOM

Such things aside, you can see that a lot can be achieved with a modest amount of code.

CONCLUSION

You now have a better understanding of what a source filter is, and you might even have a possible use for one. If you feel like playing with source filters but need a bit of inspiration, here are some extra features you could add to the Debug filter.

First, an easy one. Rather than having debugging code that is all-or-nothing, it would be much more useful to be able to control which specific blocks of debugging code get included. Try extending the syntax for debug blocks to allow each to be identified. The contents of the DEBUG environment variable can then be used to control which blocks get included.
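One possible starting point for this exercise is sketched below. It is not part of the Debug filter: the helper name block_enabled, the marker syntax "## DEBUG_BEGIN name", and the comma-separated format of the DEBUG variable are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether the block opened by $marker_line
# should be passed to the parser, given the value of $ENV{DEBUG}.
sub block_enabled {
    my ($marker_line, $debug_setting) = @_;

    # No DEBUG variable at all: every block is disabled.
    return 0 unless defined $debug_setting;

    # Assumed syntax: an optional block name follows DEBUG_BEGIN.
    my ($name) = $marker_line =~ /^\s*##\s*DEBUG_BEGIN\s+(\w+)/;

    # Unnamed blocks keep the original all-or-nothing behaviour.
    return 1 unless defined $name;

    # Assumed format: DEBUG holds a comma-separated list of block names.
    my %wanted = map { $_ => 1 } split /,/, $debug_setting;
    return $wanted{$name} ? 1 : 0;
}
```

Inside filter(), the result could be stored in the context hash when a DEBUG_BEGIN line is seen and consulted in place of the Enabled field while the block is open.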

Once you can identify individual blocks, try allowing them to be nested. That isn't difficult either.

Here is an interesting idea that doesn't involve the Debug filter. Currently Perl subroutines have fairly limited support for formal parameter lists. You can specify the number of parameters and their type, but you still have to manually take them out of the @_ array yourself. Write a source filter that allows you to have a named parameter list. Such a filter would turn this:

    sub MySub ($first, $second, @rest) { ... }

into this:

    sub MySub($$@) {
        my ($first) = shift;
        my ($second) = shift;
        my (@rest) = @_;
        ...
    }
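The core of such a transformation could be sketched as a function that rewrites a single line. This is only an illustration under simplifying assumptions: the name rewrite_signature is hypothetical, the opening brace must end the line, and nothing else about Perl's syntax is understood. A real filter would call something like it on $_ after each successful filter_read().

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite one "sub NAME (PARAMS) {" line into a
# prototyped sub followed by "my" declarations. Lines that do not match
# are returned unchanged.
sub rewrite_signature {
    my ($line) = @_;

    return $line
        unless $line =~ /^(\s*sub\s+\w+)\s*\(\s*([\$\@%]\w+(?:\s*,\s*[\$\@%]\w+)*)\s*\)\s*\{\s*$/;
    my ($head, $list) = ($1, $2);

    my @params = split /\s*,\s*/, $list;

    # The prototype is built from the sigil of each parameter.
    my $proto = join '', map { substr($_, 0, 1) } @params;

    my @body;
    for my $param (@params) {
        # Scalars are shifted off one at a time; an array or hash takes the rest.
        push @body, $param =~ /^\$/
            ? "    my ($param) = shift;"
            : "    my ($param) = \@_;";
    }

    return join("\n", "$head($proto) {", @body) . "\n";
}
```

Because the rewritten text spans more lines than the original, line numbers in later error messages would shift; a fuller filter would need to compensate, for example by emitting everything on one line.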

Finally, if you feel like a real challenge, have a go at writing a full-blown Perl macro preprocessor as a source filter. Borrow the useful features from the C preprocessor and any other macro processors you know. The tricky bit will be choosing how much knowledge of Perl's syntax you want your filter to have.

THINGS TO LOOK OUT FOR

  • Some Filters Clobber the DATA Handle

    Some source filters use the DATA handle to read the calling program. When using these source filters you cannot rely on this handle, nor expect any particular kind of behavior when operating on it. Filters based on Filter::Util::Call (and therefore Filter::Simple) do not alter the DATA filehandle.

REQUIREMENTS

The Source Filters distribution is available on CPAN, in

    CPAN/modules/by-module/Filter

Starting from Perl 5.8 Filter::Util::Call (the core part of the Source Filters distribution) is part of the standard Perl distribution. Also included is a friendlier interface called Filter::Simple, by Damian Conway.

AUTHOR

Paul Marquess <Paul.Marquess@btinternet.com>

Copyrights

This article originally appeared in The Perl Journal #11, and is copyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and The Perl Journal. This document may be distributed under the same terms as Perl itself.

 
perlfork

NAME

perlfork - Perl's fork() emulation

SYNOPSIS

    NOTE:  As of the 5.8.0 release, fork() emulation has considerably
    matured.  However, there are still a few known bugs and differences
    from real fork() that might affect you.  See the "BUGS" and
    "CAVEATS AND LIMITATIONS" sections below.

Perl provides a fork() keyword that corresponds to the Unix system call of the same name. On most Unix-like platforms where the fork() system call is available, Perl's fork() simply calls it.

On some platforms such as Windows where the fork() system call is not available, Perl can be built to emulate fork() at the interpreter level. While the emulation is designed to be as compatible as possible with the real fork() at the level of the Perl program, there are certain important differences that stem from the fact that all the pseudo child "processes" created this way live in the same real process as far as the operating system is concerned.

This document provides a general overview of the capabilities and limitations of the fork() emulation. Note that the issues discussed here are not applicable to platforms where a real fork() is available and Perl has been configured to use it.

DESCRIPTION

The fork() emulation is implemented at the level of the Perl interpreter. What this means in general is that running fork() will actually clone the running interpreter and all its state, and run the cloned interpreter in a separate thread, beginning execution in the new thread just after the point where the fork() was called in the parent. We will refer to the thread that implements this child "process" as the pseudo-process.

To the Perl program that called fork(), all this is designed to be transparent. The parent returns from the fork() with a pseudo-process ID that can be subsequently used in any process-manipulation functions; the child returns from the fork() with a value of 0 to signify that it is the child pseudo-process.
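The return-value convention described above can be sketched as follows. The helper name run_child is hypothetical; the code uses only the fork(), exit(), and waitpid() builtins and is intended to behave the same way under a real fork() and under the emulation.

```perl
use strict;
use warnings;

# Hypothetical helper: start a child that exits immediately with
# $exit_code, wait for it, and return the exit status seen by the parent.
sub run_child {
    my ($exit_code) = @_;

    my $pid = fork();
    die "fork() failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child (or pseudo-process): fork() returned 0.
        exit($exit_code);
    }

    # Parent: $pid is the (pseudo-)process ID of the child.
    waitpid($pid, 0);
    return $? >> 8;    # the exit code is in the high byte of $?
}
```

The child is identified by testing for 0 rather than for a positive value, so the check does not depend on what a pseudo-process ID looks like.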

Behavior of other Perl features in forked pseudo-processes

Most Perl features behave in a natural way within pseudo-processes.

  • $$ or $PROCESS_ID

    This special variable is correctly set to the pseudo-process ID. It can be used to identify pseudo-processes within a particular session. Note that this value is subject to recycling if any pseudo-processes are launched after others have been wait()-ed on.

  • %ENV

    Each pseudo-process maintains its own virtual environment. Modifications to %ENV affect the virtual environment, and are only visible within that pseudo-process, and in any processes (or pseudo-processes) launched from it.

  • chdir() and all other builtins that accept filenames

    Each pseudo-process maintains its own virtual idea of the current directory. Modifications to the current directory using chdir() are only visible within that pseudo-process, and in any processes (or pseudo-processes) launched from it. All file and directory accesses from the pseudo-process will correctly map the virtual working directory to the real working directory appropriately.

  • wait() and waitpid()

    wait() and waitpid() can be passed a pseudo-process ID returned by fork(). These calls will properly wait for the termination of the pseudo-process and return its status.

  • kill()

    kill('KILL', ...) can be used to terminate a pseudo-process by passing it the ID returned by fork(). The outcome of kill on a pseudo-process is unpredictable and it should not be used except under dire circumstances, because the operating system may not guarantee integrity of the process resources when a running thread is terminated. The process which implements the pseudo-processes can be blocked, causing the Perl interpreter to hang. Note that using kill('KILL', ...) on a pseudo-process typically causes memory leaks, because the thread that implements the pseudo-process does not get a chance to clean up its resources.

    kill('TERM', ...) can also be used on pseudo-processes, but the signal will not be delivered while the pseudo-process is blocked by a system call, e.g. waiting for a socket to connect, or trying to read from a socket with no data available. Starting in Perl 5.14 the parent process will not wait for children to exit once they have been signalled with kill('TERM', ...) to avoid deadlock during process exit. You will have to explicitly call waitpid() to make sure the child has time to clean up after itself, but you are then also responsible for ensuring that the child is not blocking on I/O.

  • exec()

    Calling exec() within a pseudo-process actually spawns the requested executable in a separate process and waits for it to complete before exiting with the same exit status as that process. This means that the process ID reported within the running executable will be different from what the earlier Perl fork() might have returned. Similarly, any process manipulation functions applied to the ID returned by fork() will affect the waiting pseudo-process that called exec(), not the real process it is waiting for after the exec().

    When exec() is called inside a pseudo-process then DESTROY methods and END blocks will still be called after the external process returns.

  • exit()

    exit() always exits just the executing pseudo-process, after automatically wait()-ing for any outstanding child pseudo-processes. Note that this means that the process as a whole will not exit unless all running pseudo-processes have exited. See below for some limitations with open filehandles.

  • Open handles to files, directories and network sockets

    All open handles are dup()-ed in pseudo-processes, so that closing any handles in one process does not affect the others. See below for some limitations.

Resource limits

In the eyes of the operating system, pseudo-processes created via the fork() emulation are simply threads in the same process. This means that any process-level limits imposed by the operating system apply to all pseudo-processes taken together. This includes any limits imposed by the operating system on the number of open file, directory and socket handles, limits on disk space usage, limits on memory size, limits on CPU utilization etc.

Killing the parent process

If the parent process is killed (either using Perl's kill() builtin, or using some external means) all the pseudo-processes are killed as well, and the whole process exits.

Lifetime of the parent process and pseudo-processes

During the normal course of events, the parent process and every pseudo-process started by it will wait for their respective pseudo-children to complete before they exit. This means that the parent and every pseudo-child created by it that is also a pseudo-parent will only exit after their pseudo-children have exited.

Starting with Perl 5.14 a parent will not wait() automatically for any child that has been signalled with kill('TERM', ...) to avoid a deadlock in case the child is blocking on I/O and never receives the signal.

CAVEATS AND LIMITATIONS

  • BEGIN blocks

    The fork() emulation will not work entirely correctly when called from within a BEGIN block. The forked copy will run the contents of the BEGIN block, but will not continue parsing the source stream after the BEGIN block. For example, consider the following code:

        BEGIN {
            fork and exit;      # fork child and exit the parent
            print "inner\n";
        }
        print "outer\n";

    This will print:

        inner

    rather than the expected:

        inner
        outer

    This limitation arises from fundamental technical difficulties in cloning and restarting the stacks used by the Perl parser in the middle of a parse.

  • Open filehandles

    Any filehandles open at the time of the fork() will be dup()-ed. Thus, the files can be closed independently in the parent and child, but beware that the dup()-ed handles will still share the same seek pointer. Changing the seek position in the parent will change it in the child and vice-versa. One can avoid this by opening files that need distinct seek pointers separately in the child.

    On some operating systems, notably Solaris and Unixware, calling exit() from a child process will flush and close open filehandles in the parent, thereby corrupting the filehandles. On these systems, calling _exit() is suggested instead. _exit() is available in Perl through the POSIX module. Please consult your system's manpages for more information on this.

  • Open directory handles

    Perl will completely read from all open directory handles until they reach the end of the stream. It will then seekdir() back to the original location and all future readdir() requests will be fulfilled from the cache buffer. That means that neither the directory handle held by the parent process nor the one held by the child process will see any changes made to the directory after the fork() call.

    Note that rewinddir() has a similar limitation on Windows and will not force readdir() to read the directory again either. Only a newly opened directory handle will reflect changes to the directory.
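
    One way to act on this, sketched here with a hypothetical helper named fresh_listing(), is to open a new directory handle every time an up-to-date listing is needed instead of rewinding an old one:

```perl
use strict;
use warnings;

# Hypothetical helper: return the current entries of $dir, always read
# through a newly opened handle so that recent changes are visible.
sub fresh_listing {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return @names;
}

my @entries = fresh_listing('.');
print "$_\n" for @entries;
```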

  • Forking pipe open() not yet implemented

    The open(FOO, "|-") and open(BAR, "-|") constructs are not yet implemented. This limitation can be easily worked around in new code by creating a pipe explicitly. The following example shows how to write to a forked child:

    # simulate open(FOO, "|-")
    sub pipe_to_fork ($) {
        my $parent = shift;
        pipe my $child, $parent or die;
        my $pid = fork();
        die "fork() failed: $!" unless defined $pid;
        if ($pid) {
            close $child;
        }
        else {
            close $parent;
            open(STDIN, "<&=" . fileno($child)) or die;
        }
        $pid;
    }

    if (pipe_to_fork('FOO')) {
        # parent
        print FOO "pipe_to_fork\n";
        close FOO;
    }
    else {
        # child
        while (<STDIN>) { print; }
        exit(0);
    }

    And this one reads from the child:

    # simulate open(FOO, "-|")
    sub pipe_from_fork ($) {
        my $parent = shift;
        pipe $parent, my $child or die;
        my $pid = fork();
        die "fork() failed: $!" unless defined $pid;
        if ($pid) {
            close $child;
        }
        else {
            close $parent;
            open(STDOUT, ">&=" . fileno($child)) or die;
        }
        $pid;
    }

    if (pipe_from_fork('BAR')) {
        # parent
        while (<BAR>) { print; }
        close BAR;
    }
    else {
        # child
        print "pipe_from_fork\n";
        exit(0);
    }

    Forking pipe open() constructs will be supported in future.

  • Global state maintained by XSUBs

    External subroutines (XSUBs) that maintain their own global state may not work correctly. Such XSUBs will either need to maintain locks to protect simultaneous access to global data from different pseudo-processes, or maintain all their state on the Perl symbol table, which is copied naturally when fork() is called. A callback mechanism that provides extensions an opportunity to clone their state will be provided in the near future.

  • Interpreter embedded in larger application

    The fork() emulation may not behave as expected when it is executed in an application which embeds a Perl interpreter and calls Perl APIs that can evaluate bits of Perl code. This stems from the fact that the emulation only has knowledge about the Perl interpreter's own data structures and knows nothing about the containing application's state. For example, any state carried on the application's own call stack is out of reach.

  • Thread-safety of extensions

    Since the fork() emulation runs code in multiple threads, extensions calling into non-thread-safe libraries may not work reliably when calling fork(). As Perl's threading support gradually becomes more widely adopted even on platforms with a native fork(), such extensions are expected to be fixed for thread-safety.

PORTABILITY CAVEATS

In portable Perl code, kill(9, $child) must not be used on forked processes. Killing a forked process is unsafe and has unpredictable results. See kill(), above.

BUGS

  • Having pseudo-process IDs be negative integers breaks down for the integer -1 because the wait() and waitpid() functions treat this number as being special. The tacit assumption in the current implementation is that the system never allocates a thread ID of 1 for user threads. A better representation for pseudo-process IDs will be implemented in future.

  • In certain cases, the OS-level handles created by the pipe(), socket(), and accept() operators are apparently not duplicated accurately in pseudo-processes. This only happens in some situations, but where it does happen, it may result in deadlocks between the read and write ends of pipe handles, or inability to send or receive data across socket handles.

  • This document may be incomplete in some respects.

AUTHOR

Support for concurrent interpreters and the fork() emulation was implemented by ActiveState, with funding from Microsoft Corporation.

This document is authored and maintained by Gurusamy Sarathy <gsar@activestate.com>.

SEE ALSO

fork, perlipc

 
perlform

NAME

perlform - Perl formats

DESCRIPTION

Perl has a mechanism to help you generate simple reports and charts. To facilitate this, Perl helps you code up your output page close to how it will look when it's printed. It can keep track of things like how many lines are on a page, what page you're on, when to print page headers, etc. Keywords are borrowed from FORTRAN: format() to declare and write() to execute; see their entries in perlfunc. Fortunately, the layout is much more legible, more like BASIC's PRINT USING statement. Think of it as a poor man's nroff(1).

Formats, like packages and subroutines, are declared rather than executed, so they may occur at any point in your program. (Usually it's best to keep them all together though.) They have their own namespace apart from all the other "types" in Perl. This means that if you have a function named "Foo", it is not the same thing as having a format named "Foo". However, the default name for the format associated with a given filehandle is the same as the name of the filehandle. Thus, the default format for STDOUT is named "STDOUT", and the default format for filehandle TEMP is named "TEMP". They just look the same. They aren't.

Output record formats are declared as follows:

  format NAME =
  FORMLIST
  .

If the name is omitted, format "STDOUT" is defined. A single "." in column 1 is used to terminate a format. FORMLIST consists of a sequence of lines, each of which may be one of three types:

1. A comment, indicated by putting a '#' in the first column.

2. A "picture" line giving the format for one output line.

3. An argument line supplying values to plug into the previous picture line.

Picture lines contain output field definitions, intermingled with literal text. These lines do not undergo any kind of variable interpolation. Field definitions are made up from a set of characters, for starting and extending a field to its desired width. This is the complete set of characters for field definitions:

  @    start of regular field
  ^    start of special field
  <    pad character for left justification
  |    pad character for centering
  >    pad character for right justification
  #    pad character for a right-justified numeric field
  0    instead of first #: pad number with leading zeroes
  .    decimal point within a numeric field
  ...  terminate a text field, show "..." as truncation evidence
  @*   variable width field for a multi-line value
  ^*   variable width field for next line of a multi-line value
  ~    suppress line with all fields empty
  ~~   repeat line until all fields are exhausted

Each field in a picture line starts with either "@" (at) or "^" (caret), indicating what we'll call, respectively, a "regular" or "special" field. The choice of pad characters determines whether a field is textual or numeric. The tilde operators are not part of a field. Let's look at the various possibilities in detail.

Text Fields

The length of the field is supplied by padding out the field with multiple "<", ">", or "|" characters to specify a non-numeric field with, respectively, left justification, right justification, or centering. For a regular field, the value (up to the first newline) is taken and printed according to the selected justification, truncating excess characters. If you terminate a text field with "...", three dots will be shown if the value is truncated. A special text field may be used to do rudimentary multi-line text block filling; see Using Fill Mode for details.

  Example:
     format STDOUT =
     @<<<<<< @|||||| @>>>>>>
     "left", "middle", "right"
     .
  Output:
     left    middle    right

Numeric Fields

Using "#" as a padding character specifies a numeric field, with right justification. An optional "." defines the position of the decimal point. With a "0" (zero) instead of the first "#", the formatted number will be padded with leading zeroes if necessary. A special numeric field is blanked out if the value is undefined. If the resulting value would exceed the width specified, the field is filled with "#" as overflow evidence.

  Example:
     format STDOUT =
     @### @.### @##.### @### @### ^####
     42, 3.1415, undef, 0, 10000, undef
     .
  Output:
       42 3.142   0.000    0 ####
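
The same rules can be tried without declaring a format by feeding a picture line straight to formline() and reading the result from $^A. The following is only a sketch; the helper name picture() is made up for this illustration, and the pictures are single-quoted so that "@" is not mistaken for an array:

```perl
use strict;
use warnings;

# Hypothetical helper: format one picture line and return the result.
sub picture {
    my ($pic, @values) = @_;
    $^A = "";                      # start with an empty accumulator
    formline("$pic\n", @values);
    my $line = $^A;
    $^A = "";                      # leave the accumulator clean
    return $line;
}

my $fixed    = picture('@##.##', 3.14159);   # two decimal places
my $padded   = picture('@0###',  42);        # "0" asks for leading zeroes
my $overflow = picture('@###',   10000);     # too wide for four columns

print $fixed, $padded, $overflow;
```

Because formline() appends to $^A, the helper clears the accumulator before and after each call so that one result cannot leak into the next.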

The Field @* for Variable-Width Multi-Line Text

The field "@*" can be used for printing multi-line, nontruncated values; it should (but need not) appear by itself on a line. A final line feed is chomped off, but all other characters are emitted verbatim.

The Field ^* for Variable-Width One-line-at-a-time Text

Like "@*", this is a variable-width field. The value supplied must be a scalar variable. Perl puts the first line (up to the first "\n") of the text into the field, and then chops off the front of the string so that the next time the variable is referenced, more of the text can be printed. The variable will not be restored.

  Example:
     $text = "line 1\nline 2\nline 3";
     format STDOUT =
     Text: ^*
           $text
     ~~    ^*
           $text
     .
  Output:
     Text: line 1
           line 2
           line 3

Specifying Values

The values are specified on the following format line in the same order as the picture fields. The expressions providing the values must be separated by commas. They are all evaluated in a list context before the line is processed, so a single list expression could produce multiple list elements. The expressions may be spread out to more than one line if enclosed in braces. If so, the opening brace must be the first token on the first line. If an expression evaluates to a number with a decimal part, and if the corresponding picture specifies that the decimal part should appear in the output (that is, any picture except multiple "#" characters without an embedded "."), the character used for the decimal point is determined by the current LC_NUMERIC locale if use locale is in effect. This means that, if, for example, the run-time environment happens to specify a German locale, "," will be used instead of the default ".". See perllocale and WARNINGS for more information.
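
As a sketch of the brace form, the format below spreads its value list over several lines. The format name REPORT, the variables, and the use of an in-memory filehandle to capture the output are all choices made for this illustration:

```perl
use strict;
use warnings;

our ($item, $price) = ("widget", 12.5);

format REPORT =
@<<<<<<<<< @##.##
{
    $item,
    $price
}
.

# Capture the formatted record in a string instead of printing it.
my $report = "";
open(my $fh, '>', \$report) or die "can't open in-memory handle: $!";
my $previous = select($fh);
$~ = "REPORT";                 # use REPORT for this handle
select($previous);
write($fh);
close($fh);

print $report;
```

The opening brace sits alone as the first token of the argument line, which is what allows the expressions to continue on the lines that follow.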

Using Fill Mode

On text fields the caret enables a kind of fill mode. Instead of an arbitrary expression, the value supplied must be a scalar variable that contains a text string. Perl puts the next portion of the text into the field, and then chops off the front of the string so that the next time the variable is referenced, more of the text can be printed. (Yes, this means that the variable itself is altered during execution of the write() call, and is not restored.) The next portion of text is determined by a crude line-breaking algorithm. You may use the carriage return character (\r) to force a line break. You can change which characters are legal to break on by changing the variable $: (that's $FORMAT_LINE_BREAK_CHARACTERS if you're using the English module) to a list of the desired characters.

Normally you would use a sequence of fields in a vertical stack associated with the same scalar variable to print out a block of text. You might wish to end the final field with the text "...", which will appear in the output if the text was too long to appear in its entirety.

Suppressing Lines Where All Fields Are Void

Using caret fields can produce lines where all fields are blank. You can suppress such lines by putting a "~" (tilde) character anywhere in the line. The tilde will be translated to a space upon output.

Repeating Format Lines

If you put two contiguous tilde characters "~~" anywhere into a line, the line will be repeated until all the fields on the line are exhausted, i.e. undefined. For special (caret) text fields this will occur sooner or later, but if you use a text field of the at variety, the expression you supply had better not give the same value every time forever! (shift(@f) is a simple example that would work.) Don't use a regular (at) numeric field in such lines, because it will never go blank.
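
Fill mode and line repetition can be seen together in a small sketch that hands a caret field and a "~~" marker to formline(). The sample text and the ten-column width are arbitrary choices for this illustration:

```perl
use strict;
use warnings;

my $text = "the quick brown fox jumps over the lazy dog";

# One caret field, ten columns wide; "~~" repeats the line until the
# field has nothing left to print.
$^A = "";
formline('^<<<<<<<<< ~~' . "\n", $text);
my $filled = $^A;
$^A = "";

print $filled;

# The caret field consumed $text while filling the lines.
my $left_over = defined $text ? $text : '';
print "left over: '$left_over'\n";
```

Note that $text is passed as a variable, not as a copy, because the caret field chops the printed portion off the front of it.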

Top of Form Processing

Top-of-form processing is by default handled by a format with the same name as the current filehandle with "_TOP" concatenated to it. It's triggered at the top of each page. See write.

Examples:

  # a report on the /etc/passwd file
  format STDOUT_TOP =
                           Passwd File
  Name                Login    Office   Uid   Gid Home
  ------------------------------------------------------------------
  .
  format STDOUT =
  @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
  $name,              $login,  $office,$uid,$gid, $home
  .

  # a report from a bug report form
  format STDOUT_TOP =
                           Bug Reports
  @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
  $system,                      $%,         $date
  ------------------------------------------------------------------
  .
  format STDOUT =
  Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
           $subject
  Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
         $index,                       $description
  Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
            $priority,        $date,   $description
  From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        $from,                         $description
  Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
               $programmer,            $description
  ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                       $description
  ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                       $description
  ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                       $description
  ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                       $description
  ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
                                       $description
  .

It is possible to intermix print()s with write()s on the same output channel, but you'll have to handle $- ($FORMAT_LINES_LEFT) yourself.

Format Variables

The current format name is stored in the variable $~ ($FORMAT_NAME), and the current top of form format name is in $^ ($FORMAT_TOP_NAME). The current output page number is stored in $% ($FORMAT_PAGE_NUMBER), and the number of lines on the page is in $= ($FORMAT_LINES_PER_PAGE). Whether to autoflush output on this handle is stored in $| ($OUTPUT_AUTOFLUSH). The string output before each top of page (except the first) is stored in $^L ($FORMAT_FORMFEED). These variables are set on a per-filehandle basis, so you'll need to select() into a different one to affect them:

  select((select(OUTF),
          $~ = "My_Other_Format",
          $^ = "My_Top_Format"
         )[0]);

Pretty ugly, eh? It's a common idiom though, so don't be too surprised when you see it. You can at least use a temporary variable to hold the previous filehandle. This is a much better approach in general: not only does legibility improve, but you also get an intermediate stage in the expression to single-step the debugger through:

  $ofh = select(OUTF);
  $~ = "My_Other_Format";
  $^ = "My_Top_Format";
  select($ofh);

If you use the English module, you can even read the variable names:

  use English '-no_match_vars';
  $ofh = select(OUTF);
  $FORMAT_NAME = "My_Other_Format";
  $FORMAT_TOP_NAME = "My_Top_Format";
  select($ofh);

But you still have those funny select()s. So just use the FileHandle module. Now, you can access these special variables using lowercase method names instead:

  use FileHandle;
  format_name OUTF "My_Other_Format";
  format_top_name OUTF "My_Top_Format";

Much better!

NOTES

Because the values line may contain arbitrary expressions (for at fields, not caret fields), you can farm out more sophisticated processing to other functions, like sprintf() or one of your own. For example:

  format Ident =
  @<<<<<<<<<<<<<<<
  &commify($n)
  .

To get a real at or caret into the field, do this:

  format Ident =
  I have an @ here.
  "@"
  .

To center a whole line of text, do something like this:

  format Ident =
  @|||||||||||||||||||||||||||||||||||||||||||||||
  "Some text line"
  .

There is no builtin way to say "float this to the right hand side of the page, however wide it is." You have to specify where it goes. The truly desperate can generate their own format on the fly, based on the current number of columns, and then eval() it:

  $format  = "format STDOUT = \n"
           . '^' . '<' x $cols . "\n"
           . '$entry' . "\n"
           . "\t^" . "<" x ($cols-8) . "~~\n"
           . '$entry' . "\n"
           . ".\n";
  print $format if $Debugging;
  eval $format;
  die $@ if $@;

Which would generate a format looking something like this:

  format STDOUT =
  ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  $entry
          ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
  $entry
  .

Here's a little program that's somewhat like fmt(1):

  format =
  ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ~~
  $_
  .

  $/ = '';
  while (<>) {
      s/\s*\n\s*/ /g;
      write;
  }

Footers

While $FORMAT_TOP_NAME contains the name of the current header format, there is no corresponding mechanism to automatically do the same thing for a footer. Not knowing how big a format is going to be until you evaluate it is one of the major problems. It's on the TODO list.

Here's one strategy: If you have a fixed-size footer, you can get footers by checking $FORMAT_LINES_LEFT before each write() and printing the footer yourself if necessary.
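
A rough sketch of that first strategy follows. It assumes a one-line body format, a one-line footer, no top-of-form format, and a deliberately short page; the names BODY, $footer, and the in-memory filehandle are inventions of this example rather than anything Perl prescribes:

```perl
use strict;
use warnings;

our $record;

format BODY =
@<<<<<<<<<
$record
.

my $footer       = "-- end of page --\n";   # fixed, one-line footer
my $footer_lines = 1;
my $body_lines   = 1;                        # each BODY record is one line

my $output = "";
open(my $fh, '>', \$output) or die "can't open in-memory handle: $!";
my $previous = select($fh);
$~ = "BODY";
$= = 5;                                      # short page for the demonstration

for my $n (1 .. 6) {
    $record = "record $n";
    # $- holds the lines left on the page (0 before the first write)
    if ($- > 0 && $- - $body_lines < $footer_lines) {
        print $footer;                       # goes to the selected handle
        $- = 0;                              # force a new page on the next write
    }
    write;
}

select($previous);
close($fh);
print $output;
```

The check runs before each write() so that room for the footer is reserved while there is still space on the page. A real report would also print a final footer after the loop.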

Here's another strategy: Open a pipe to yourself, using open(MYSELF, "|-") (see open) and always write() to MYSELF instead of STDOUT. Have your child process massage its STDIN to rearrange headers and footers however you like. Not very convenient, but doable.

Accessing Formatting Internals

For low-level access to the formatting mechanism, you may use formline() and access $^A (the $ACCUMULATOR variable) directly.

For example:

  $str = formline <<'END', 1,2,3;
  @<<< @||| @>>>
  END

  print "Wow, I just stored '$^A' in the accumulator!\n";

Or to make an swrite() subroutine, which is to write() what sprintf() is to printf(), do this:

  use Carp;
  sub swrite {
      croak "usage: swrite PICTURE ARGS" unless @_;
      my $format = shift;
      $^A = "";
      formline($format,@_);
      return $^A;
  }

  $string = swrite(<<'END', 1, 2, 3);
  Check me out
  @<<< @||| @>>>
  END
  print $string;

WARNINGS

The lone dot that ends a format can also prematurely end a mail message passing through a misconfigured Internet mailer (and based on experience, such misconfiguration is the rule, not the exception). So when sending format code through mail, you should indent it so that the format-ending dot is not on the left margin; this will prevent SMTP cutoff.

Lexical variables (declared with "my") are not visible within a format unless the format is declared within the scope of the lexical variable.

If a program's environment specifies an LC_NUMERIC locale and use locale is in effect when the format is declared, the locale is used to specify the decimal point character in formatted output. Formatted output cannot be controlled by use locale at the time when write() is called. See perllocale for further discussion of locale handling.

Within strings that are to be displayed in a fixed-length text field, each control character is substituted by a space. (But remember the special meaning of \r when using fill mode.) This is done to avoid misalignment when control characters "disappear" on some output media.

 
perlfreebsd

NAME

perlfreebsd - Perl version 5 on FreeBSD systems

DESCRIPTION

This document describes various features of FreeBSD that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

FreeBSD core dumps from readdir_r with ithreads

When perl is configured to use ithreads, it will use re-entrant library calls in preference to non-re-entrant versions. There is a bug in FreeBSD's readdir_r function in versions 4.5 and earlier that can cause a SEGV when reading large directories. A patch for FreeBSD libc is available (see http://www.freebsd.org/cgi/query-pr.cgi?pr=misc/30631 ) which has been integrated into FreeBSD 4.6.

$^X doesn't always contain a full path in FreeBSD

perl sets $^X where possible to a full path by asking the operating system. On FreeBSD the full path of the perl interpreter is found by using sysctl with KERN_PROC_PATHNAME if that is supported, else by reading the symlink /proc/curproc/file. FreeBSD 7 and earlier have a bug where either approach sometimes returns an incorrect value (see http://www.freebsd.org/cgi/query-pr.cgi?pr=35703 ). In these cases perl will fall back to the old behaviour of using C's argv[0] value for $^X.

AUTHOR

Nicholas Clark <nick@ccl4.org>, collating wisdom supplied by Slaven Rezic and Tim Bunce.

Please report any errors, updates, or suggestions to perlbug@perl.org.

 
perlfunc

NAME

perlfunc - Perl builtin functions

DESCRIPTION

The functions in this section can serve as terms in an expression. They fall into two major categories: list operators and named unary operators. These differ in their precedence relationship with a following comma. (See the precedence table in perlop.) List operators take more than one argument, while unary operators can never take more than one argument. Thus, a comma terminates the argument of a unary operator, but merely separates the arguments of a list operator. A unary operator generally provides scalar context to its argument, while a list operator may provide either scalar or list contexts for its arguments. If it does both, scalar arguments come first and the list argument follows, and there can only ever be one such list argument. For instance, splice() has three scalar arguments followed by a list, whereas gethostbyname() takes only a single scalar argument.

In the syntax descriptions that follow, list operators that expect a list (and provide list context for elements of the list) are shown with LIST as an argument. Such a list may consist of any combination of scalar arguments or list values; the list values will be included in the list as if each individual element were interpolated at that point in the list, forming a longer single-dimensional list value. Commas should separate literal elements of the LIST.
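
The flattening described above can be seen in a few lines. This is only an illustrative sketch, and the variable names are arbitrary:

```perl
use strict;
use warnings;

my @middle = (2, 3);

# Arrays and nested parenthesized lists are interpolated in place,
# producing one flat, single-dimensional list.
my @joined = (1, @middle, (4, 5), 6);

# push takes an array followed by a LIST; @middle is flattened into it.
my @stack;
push @stack, 'a', @middle, 'z';

print scalar(@joined), " elements: @joined\n";   # 6 elements: 1 2 3 4 5 6
print scalar(@stack),  " elements: @stack\n";    # 4 elements: a 2 3 z
```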

Any function in the list below may be used either with or without parentheses around its arguments. (The syntax descriptions omit the parentheses.) If you use parentheses, the simple but occasionally surprising rule is this: It looks like a function, therefore it is a function, and precedence doesn't matter. Otherwise it's a list operator or unary operator, and precedence does matter. Whitespace between the function and left parenthesis doesn't count, so sometimes you need to be careful:

  print 1+2+4;      # Prints 7.
  print(1+2) + 4;   # Prints 3.
  print (1+2)+4;    # Also prints 3!
  print +(1+2)+4;   # Prints 7.
  print ((1+2)+4);  # Prints 7.

If you run Perl with the -w switch it can warn you about this. For example, the third line above produces:

  print (...) interpreted as function at - line 1.
  Useless use of integer addition in void context at - line 1.

A few functions take no arguments at all, and therefore work as neither unary nor list operators. These include such functions as time and endpwent. For example, time+86_400 always means time() + 86_400.

For functions that can be used in either a scalar or list context, nonabortive failure is generally indicated in scalar context by returning the undefined value, and in list context by returning the empty list.

Remember the following important rule: There is no rule that relates the behavior of an expression in list context to its behavior in scalar context, or vice versa. It might do two totally different things. Each operator and function decides which sort of value would be most appropriate to return in scalar context. Some operators return the length of the list that would have been returned in list context. Some operators return the first value in the list. Some operators return the last value in the list. Some operators return a count of successful operations. In general, they do what you want, unless you want consistency.

A named array in scalar context is quite different from what would at first glance appear to be a list in scalar context. You can't get a list like (1,2,3) into being in scalar context, because the compiler knows the context at compile time. It would generate the scalar comma operator there, not the list construction version of the comma. That means it was never a list to start with.
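
A short sketch of the difference, using arbitrary sample values:

```perl
use strict;
use warnings;

my @array = (7, 8, 9);

my $count   = @array;     # array in scalar context: number of elements (3)
my ($first) = @array;     # list assignment: first element (7)

my $last;
{
    no warnings 'void';   # the discarded operands would otherwise warn
    $last = (7, 8, 9);    # scalar comma operator: yields its final operand (9)
}

print "count=$count first=$first last=$last\n";   # count=3 first=7 last=9
```

The parentheses in the last assignment only group the comma expression. No list is ever built there, which is why the result is the final operand rather than a length.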

In general, functions in Perl that serve as wrappers for system calls ("syscalls") of the same name (like chown(2), fork(2), closedir(2), etc.) return true when they succeed and undef otherwise, as is usually mentioned in the descriptions below. This is different from the C interfaces, which return -1 on failure. Exceptions to this rule include wait, waitpid, and syscall. System calls also set the special $! variable on failure. Other functions do not, except accidentally.
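
As a small sketch of this convention, the code below attempts to open a file that is assumed not to exist (the path is invented for the example), tests the return value, and saves $! immediately, before any later call can overwrite it:

```perl
use strict;
use warnings;

# Hypothetical path that is assumed not to exist.
my $missing = "no_such_dir_$$/no_such_file";

my $ok = open(my $fh, '<', $missing);
my ($errno, $message) = ($! + 0, "$!");   # numeric and string views of $!

if ($ok) {
    close $fh;
}
else {
    print "open failed: $message (errno $errno)\n";
}
```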

Extension modules can also hook into the Perl parser to define new kinds of keyword-headed expression. These may look like functions, but may also look completely different. The syntax following the keyword is defined entirely by the extension. If you are an implementor, see PL_keyword_plugin in perlapi for the mechanism. If you are using such a module, see the module's documentation for details of the syntax that it defines.

Perl Functions by Category

Here are Perl's functions (including things that look like functions, like some keywords and named operators) arranged by category. Some functions appear in more than one place.

Portability

Perl was born in Unix and can therefore access all common Unix system calls. In non-Unix environments, the functionality of some Unix system calls may not be available or details of the available functionality may differ slightly. The Perl functions affected by this are:

-X, binmode, chmod, chown, chroot, crypt, dbmclose, dbmopen, dump, endgrent, endhostent, endnetent, endprotoent, endpwent, endservent, exec, fcntl, flock, fork, getgrent, getgrgid, gethostbyname, gethostent, getlogin, getnetbyaddr, getnetbyname, getnetent, getppid, getpgrp, getpriority, getprotobynumber, getprotoent, getpwent, getpwnam, getpwuid, getservbyport, getservent, getsockopt, glob, ioctl, kill, link, lstat, msgctl, msgget, msgrcv, msgsnd, open, pipe, readlink, rename, select, semctl, semget, semop, setgrent, sethostent, setnetent, setpgrp, setpriority, setprotoent, setpwent, setservent, setsockopt, shmctl, shmget, shmread, shmwrite, socket, socketpair, stat, symlink, syscall, sysopen, system, times, truncate, umask, unlink, utime, wait, waitpid

For more information about the portability of these functions, see perlport and other available platform-specific documentation.

Alphabetical Listing of Perl Functions

  • -X FILEHANDLE
  • -X EXPR
  • -X DIRHANDLE
  • -X

    A file test, where X is one of the letters listed below. This unary operator takes one argument, either a filename, a filehandle, or a dirhandle, and tests the associated file to see if something is true about it. If the argument is omitted, tests $_, except for -t, which tests STDIN. Unless otherwise documented, it returns 1 for true and '' for false, or the undefined value if the file doesn't exist. Despite the funny names, precedence is the same as any other named unary operator. The operator may be any of:

    -r  File is readable by effective uid/gid.
    -w  File is writable by effective uid/gid.
    -x  File is executable by effective uid/gid.
    -o  File is owned by effective uid.

    -R  File is readable by real uid/gid.
    -W  File is writable by real uid/gid.
    -X  File is executable by real uid/gid.
    -O  File is owned by real uid.

    -e  File exists.
    -z  File has zero size (is empty).
    -s  File has nonzero size (returns size in bytes).

    -f  File is a plain file.
    -d  File is a directory.
    -l  File is a symbolic link.
    -p  File is a named pipe (FIFO), or Filehandle is a pipe.
    -S  File is a socket.
    -b  File is a block special file.
    -c  File is a character special file.
    -t  Filehandle is opened to a tty.

    -u  File has setuid bit set.
    -g  File has setgid bit set.
    -k  File has sticky bit set.

    -T  File is an ASCII text file (heuristic guess).
    -B  File is a "binary" file (opposite of -T).

    -M  Script start time minus file modification time, in days.
    -A  Same for access time.
    -C  Same for inode change time (Unix, may differ for other
        platforms)

    Example:

    while (<>) {
        chomp;
        next unless -f $_;  # ignore specials
        #...
    }

    Note that -s/a/b/ does not do a negated substitution. Saying -exp($foo) still works as expected, however: only single letters following a minus are interpreted as file tests.

    These operators are exempt from the "looks like a function rule" described above. That is, an opening parenthesis after the operator does not affect how much of the following code constitutes the argument. Put the opening parentheses before the operator to separate it from code that follows (this applies only to operators with higher precedence than unary operators, of course):

    -s($file) + 1024   # probably wrong; same as -s($file + 1024)
    (-s $file) + 1024  # correct

    The interpretation of the file permission operators -r, -R, -w, -W, -x, and -X is by default based solely on the mode of the file and the uids and gids of the user. There may be other reasons you can't actually read, write, or execute the file: for example network filesystem access controls, ACLs (access control lists), read-only filesystems, and unrecognized executable formats. Note that the use of these six specific operators to verify if some operation is possible is usually a mistake, because it may be open to race conditions.

    Also note that, for the superuser on the local filesystems, the -r, -R, -w, and -W tests always return 1, and -x and -X return 1 if any execute bit is set in the mode. Scripts run by the superuser may thus need to do a stat() to determine the actual mode of the file, or temporarily set their effective uid to something else.

    If you are using ACLs, there is a pragma called filetest that may produce more accurate results than the bare stat() mode bits. When under use filetest 'access', the above-mentioned filetests test whether the permission can(not) be granted using the access(2) family of system calls. Also note that the -x and -X may under this pragma return true even if there are no execute permission bits set (nor any extra execute permission ACLs). This strangeness is due to the underlying system calls' definitions. Note also that, due to the implementation of use filetest 'access', the _ special filehandle won't cache the results of the file tests when this pragma is in effect. Read the documentation for the filetest pragma for more information.

    The -T and -B switches work as follows. The first block or so of the file is examined for odd characters such as strange control codes or characters with the high bit set. If too many strange characters (>30%) are found, it's a -B file; otherwise it's a -T file. Also, any file containing a zero byte in the first block is considered a binary file. If -T or -B is used on a filehandle, the current IO buffer is examined rather than the first block. Both -T and -B return true on an empty file, or a file at EOF when testing a filehandle. Because you have to read a file to do the -T test, on most occasions you want to use a -f against the file first, as in next unless -f $file && -T $file.

    If any of the file tests (or either the stat or lstat operator) is given the special filehandle consisting of a solitary underline, then the stat structure of the previous file test (or stat operator) is used, saving a system call. (This doesn't work with -t, and you need to remember that lstat() and -l leave values in the stat structure for the symbolic link, not the real file.) (Also, if the stat buffer was filled by an lstat call, -T and -B will reset it with the results of stat _.) Example:

        print "Can do.\n" if -r $a || -w _ || -x _;
        stat($filename);
        print "Readable\n" if -r _;
        print "Writable\n" if -w _;
        print "Executable\n" if -x _;
        print "Setuid\n" if -u _;
        print "Setgid\n" if -g _;
        print "Sticky\n" if -k _;
        print "Text\n" if -T _;
        print "Binary\n" if -B _;

    As of Perl 5.10.0, as a form of purely syntactic sugar, you can stack file test operators, in a way that -f -w -x $file is equivalent to -x $file && -w _ && -f _ . (This is only fancy syntax: if you use the return value of -f $file as an argument to another filetest operator, no special magic will happen.)

    Portability issues: -X in perlport.

    To avoid confusing would-be users of your code with mysterious syntax errors, put something like this at the top of your script:

        use 5.010; # so filetest ops can stack
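
    As a rough illustration of the _ filehandle, stacked tests, and the parenthesization advice above, the sketch below checks a scratch file both ways. The scratch file comes from File::Temp and all variable names are placeholders, not part of the original text.

    ```perl
    use strict;
    use warnings;
    use 5.010;                      # stacked filetest operators need 5.10
    use File::Temp qw(tempfile);

    # Create a small scratch file to test against.
    my ($fh, $file) = tempfile(UNLINK => 1);
    print {$fh} "hello\n";
    close $fh or die "Can't close $file: $!";

    # One stat() call, then reuse its result through the _ filehandle.
    my $long_form = (-e $file && -f _ && -r _) ? 1 : 0;

    # The stacked form performs the same checks.
    my $stacked = (-e -f -r $file) ? 1 : 0;

    # Parenthesize the operator and its operand before doing arithmetic.
    my $padded_size = (-s $file) + 1024;
    ```

    Both $long_form and $stacked should agree for any given file; the stacked spelling is only shorter.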
  • abs VALUE
  • abs

    Returns the absolute value of its argument. If VALUE is omitted, uses $_ .

  • accept NEWSOCKET,GENERICSOCKET

    Accepts an incoming socket connection, just as accept(2) does. Returns the packed address if it succeeded, false otherwise. See the example in Sockets: Client/Server Communication in perlipc.

    On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptor, as determined by the value of $^F. See $^F in perlvar.

  • alarm SECONDS
  • alarm

    Arranges to have a SIGALRM delivered to this process after the specified number of wallclock seconds has elapsed. If SECONDS is not specified, the value stored in $_ is used. (On some machines, unfortunately, the elapsed time may be up to one second less or more than you specified because of how seconds are counted, and process scheduling may delay the delivery of the signal even further.)

    Only one timer may be counting at once. Each call disables the previous timer, and an argument of 0 may be supplied to cancel the previous timer without starting a new one. The returned value is the amount of time remaining on the previous timer.

    For delays of finer granularity than one second, the Time::HiRes module (from CPAN, and starting from Perl 5.8 part of the standard distribution) provides ualarm(). You may also use Perl's four-argument version of select() leaving the first three arguments undefined, or you might be able to use the syscall interface to access setitimer(2) if your system supports it. See perlfaq8 for details.

    It is usually a mistake to intermix alarm and sleep calls, because sleep may be internally implemented on your system with alarm.

    If you want to use alarm to time out a system call you need to use an eval/die pair. You can't rely on the alarm causing the system call to fail with $! set to EINTR because Perl sets up signal handlers to restart system calls on some systems. Using eval/die always works, modulo the caveats given in Signals in perlipc.

        eval {
            local $SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required
            alarm $timeout;
            $nread = sysread SOCKET, $buffer, $size;
            alarm 0;
        };
        if ($@) {
            die unless $@ eq "alarm\n";   # propagate unexpected errors
            # timed out
        }
        else {
            # didn't
        }

    For more information see perlipc.

    Portability issues: alarm in perlport.
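
    The eval/die pattern above can be wrapped in a reusable helper. The following is a minimal sketch, not a definitive implementation: with_timeout is a hypothetical name, and a four-argument select() stands in for a slow system call so that alarm and sleep are not mixed.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: run $code, giving up after $seconds.
    # Returns (1, result) on completion or (0) on timeout.
    sub with_timeout {
        my ($seconds, $code) = @_;
        my $result;
        my $ok = eval {
            local $SIG{ALRM} = sub { die "alarm\n" };   # NB: \n required
            alarm $seconds;
            $result = $code->();
            alarm 0;
            1;
        };
        my $err = $@;
        alarm 0;    # make sure no timer stays armed if $code died
        if (!$ok) {
            die $err unless $err eq "alarm\n";          # propagate unexpected errors
            return (0);
        }
        return (1, $result);
    }

    my ($fast_ok, $value) = with_timeout(5, sub { 42 });

    # select() with a timeout plays the part of a call that takes too long.
    my ($slow_ok) = with_timeout(1, sub { select(undef, undef, undef, 3); 'late' });
    ```

    On platforms where alarm is limited or unavailable (see perlport) this sketch will not behave as described.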

  • atan2 Y,X

    Returns the arctangent of Y/X in the range -PI to PI.

    For the tangent operation, you may use the Math::Trig::tan function, or use the familiar relation:

        sub tan { sin($_[0]) / cos($_[0]) }

    The return value for atan2(0,0) is implementation-defined; consult your atan2(3) manpage for more information.

    Portability issues: atan2 in perlport.
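
    Two common uses of atan2 are obtaining pi and converting a point to an angle. In the sketch below, angle_deg is a hypothetical helper; note that atan2 takes Y first and X second.

    ```perl
    use strict;
    use warnings;

    # atan2(1, 1) is pi/4, so multiplying by 4 yields pi.
    my $pi = 4 * atan2(1, 1);

    # Hypothetical helper: angle of the point ($x, $y) in degrees.
    sub angle_deg {
        my ($x, $y) = @_;
        return atan2($y, $x) * 180 / $pi;
    }

    my $up   = angle_deg(0, 1);     # point on the positive y axis
    my $left = angle_deg(-1, 0);    # point on the negative x axis
    ```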

  • bind SOCKET,NAME

    Binds a network address to a socket, just as bind(2) does. Returns true if it succeeded, false otherwise. NAME should be a packed address of the appropriate type for the socket. See the examples in Sockets: Client/Server Communication in perlipc.

  • binmode FILEHANDLE, LAYER
  • binmode FILEHANDLE

    Arranges for FILEHANDLE to be read or written in "binary" or "text" mode on systems where the run-time libraries distinguish between binary and text files. If FILEHANDLE is an expression, the value is taken as the name of the filehandle. Returns true on success, otherwise it returns undef and sets $! (errno).

    On some systems (in general, DOS- and Windows-based systems) binmode() is necessary when you're not working with a text file. For the sake of portability it is a good idea always to use it when appropriate, and never to use it when it isn't appropriate. Also, people can set their I/O to be by default UTF8-encoded Unicode, not bytes.

    In other words: regardless of platform, use binmode() on binary data, like images, for example.

    If LAYER is present it is a single string, but may contain multiple directives. The directives alter the behaviour of the filehandle. When LAYER is present, using binmode on a text file makes sense.

    If LAYER is omitted or specified as :raw the filehandle is made suitable for passing binary data. This includes turning off possible CRLF translation and marking it as bytes (as opposed to Unicode characters). Note that, despite what may be implied in "Programming Perl" (the Camel, 3rd edition) or elsewhere, :raw is not simply the inverse of :crlf . Other layers that would affect the binary nature of the stream are also disabled. See PerlIO, perlrun, and the discussion about the PERLIO environment variable.

    The :bytes , :crlf , :utf8 , and any other directives of the form :... , are called I/O layers. The open pragma can be used to establish default I/O layers. See open.

    The LAYER parameter of the binmode() function is described as "DISCIPLINE" in "Programming Perl, 3rd Edition". However, since the publishing of this book, by many known as "Camel III", the consensus of the naming of this functionality has moved from "discipline" to "layer". All documentation of this version of Perl therefore refers to "layers" rather than to "disciplines". Now back to the regularly scheduled documentation...

    To mark FILEHANDLE as UTF-8, use :utf8 or :encoding(UTF-8) . :utf8 just marks the data as UTF-8 without further checking, while :encoding(UTF-8) checks the data for actually being valid UTF-8. More details can be found in PerlIO::encoding.

    In general, binmode() should be called after open() but before any I/O is done on the filehandle. Calling binmode() normally flushes any pending buffered output data (and perhaps pending input data) on the handle. An exception to this is the :encoding layer that changes the default character encoding of the handle; see open. The :encoding layer sometimes needs to be called in mid-stream, and it doesn't flush the stream. The :encoding also implicitly pushes on top of itself the :utf8 layer because internally Perl operates on UTF8-encoded Unicode characters.

    The operating system, device drivers, C libraries, and Perl run-time system all conspire to let the programmer treat a single character (\n ) as the line terminator, irrespective of external representation. On many operating systems, the native text file representation matches the internal representation, but on some platforms the external representation of \n is made up of more than one character.

    All variants of Unix, Mac OS (old and new), and Stream_LF files on VMS use a single character to end each line in the external representation of text (even though that single character is CARRIAGE RETURN on old, pre-Darwin flavors of Mac OS, and is LINE FEED on Unix and most VMS files). In other systems like OS/2, DOS, and the various flavors of MS-Windows, your program sees a \n as a simple \cJ , but what's stored in text files are the two characters \cM\cJ . That means that if you don't use binmode() on these systems, \cM\cJ sequences on disk will be converted to \n on input, and any \n in your program will be converted back to \cM\cJ on output. This is what you want for text files, but it can be disastrous for binary files.

    Another consequence of using binmode() (on some systems) is that special end-of-file markers will be seen as part of the data stream. For systems from the Microsoft family this means that, if your binary data contain \cZ , the I/O subsystem will regard it as the end of the file, unless you use binmode().

    binmode() is important not only for readline() and print() operations, but also when using read(), seek(), sysread(), syswrite() and tell() (see perlport for more details). See the $/ and $\ variables in perlvar for how to manually set your input and output line-termination sequences.

    Portability issues: binmode in perlport.
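
    As a small sketch of the advice to call binmode() after open() and before any I/O, the code below writes a few bytes that text-mode translation could alter on some platforms and reads them back unchanged. The scratch file comes from File::Temp and the variable names are placeholders.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Bytes that text-mode I/O could mangle on some systems: CR, LF and Ctrl-Z.
    my $payload = "\x00\x0D\x0A\x1A\xFF";

    my ($out, $path) = tempfile(UNLINK => 1);
    binmode($out) or die "Can't binmode $path: $!";     # before any writes
    print {$out} $payload;
    close $out or die "Can't close $path: $!";

    open(my $in, '<', $path) or die "Can't open $path: $!";
    binmode($in) or die "Can't binmode $path: $!";      # before any reads
    my $copy = do { local $/; <$in> };                   # slurp the whole file
    close $in;
    ```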

  • bless REF,CLASSNAME
  • bless REF

    This function tells the thingy referenced by REF that it is now an object in the CLASSNAME package. If CLASSNAME is omitted, the current package is used. Because a bless is often the last thing in a constructor, it returns the reference for convenience. Always use the two-argument version if a derived class might inherit the function doing the blessing. See perlobj for more about the blessing (and blessings) of objects.

    Consider always blessing objects in CLASSNAMEs that are mixed case. Namespaces with all lowercase names are considered reserved for Perl pragmata. Builtin types have all uppercase names. To prevent confusion, you may wish to avoid such package names as well. Make sure that CLASSNAME is a true value.

    See Perl Modules in perlmod.
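
    The reason for preferring the two-argument form shows up as soon as a constructor is inherited. In this sketch, Animal and Dog are hypothetical classes invented for illustration.

    ```perl
    use strict;
    use warnings;

    package Animal;                 # hypothetical base class
    sub new {
        my ($class, %args) = @_;
        my $self = { name => $args{name} };
        # Two-argument bless: $class is whatever package new() was invoked on.
        return bless $self, $class;
    }
    sub name { $_[0]{name} }

    package Dog;                    # hypothetical derived class
    our @ISA = ('Animal');

    package main;

    my $dog = Dog->new(name => 'Rex');
    my $class_of_dog = ref $dog;    # the package the object was blessed into
    ```

    Had new() used the one-argument form, the object would have been blessed into Animal even when constructed through Dog.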

  • break

    Break out of a given() block.

    This keyword is enabled by the "switch" feature; see feature for more information on "switch" . You can also access it by prefixing it with CORE:: . Alternatively, include a use v5.10 or later in the current scope.

  • caller EXPR
  • caller

    Returns the context of the current subroutine call. In scalar context, returns the caller's package name if there is a caller (that is, if we're in a subroutine or eval or require) and the undefined value otherwise. In list context, returns

        #  0         1          2
        ($package, $filename, $line) = caller;

    With EXPR, it returns some extra information that the debugger uses to print a stack trace. The value of EXPR indicates how many call frames to go back before the current one.

        #  0         1          2      3            4
        ($package, $filename, $line, $subroutine, $hasargs,
        #  5          6          7            8       9         10
        $wantarray, $evaltext, $is_require, $hints, $bitmask, $hinthash)
            = caller($i);

    Here $subroutine may be (eval) if the frame is not a subroutine call, but an eval. In such a case additional elements $evaltext and $is_require are set: $is_require is true if the frame is created by a require or use statement, $evaltext contains the text of the eval EXPR statement. In particular, for an eval BLOCK statement, $subroutine is (eval) , but $evaltext is undefined. (Note also that each use statement creates a require frame inside an eval EXPR frame.) $subroutine may also be (unknown) if this particular subroutine happens to have been deleted from the symbol table. $hasargs is true if a new instance of @_ was set up for the frame. $hints and $bitmask contain pragmatic hints that the caller was compiled with. $hints corresponds to $^H , and $bitmask corresponds to ${^WARNING_BITS} . The $hints and $bitmask values are subject to change between versions of Perl, and are not meant for external use.

    $hinthash is a reference to a hash containing the value of %^H when the caller was compiled, or undef if %^H was empty. Do not modify the values of this hash, as they are the actual values stored in the optree.

    Furthermore, when called from within the DB package in list context, and with an argument, caller returns more detailed information: it sets the list variable @DB::args to be the arguments with which the subroutine was invoked.

    Be aware that the optimizer might have optimized call frames away before caller had a chance to get the information. That means that caller(N) might not return information about the call frame you expect it to, for N > 1 . In particular, @DB::args might have information from the previous time caller was called.

    Be aware that setting @DB::args is best effort, intended for debugging or generating backtraces, and should not be relied upon. In particular, as @_ contains aliases to the caller's arguments, Perl does not take a copy of @_ , so @DB::args will contain modifications the subroutine makes to @_ or its contents, not the original values at call time. @DB::args , like @_ , does not hold explicit references to its elements, so under certain cases its elements may have become freed and reallocated for other variables or temporary values. Finally, a side effect of the current implementation is that the effects of shift @_ can normally be undone (but not pop @_ or other splicing, and not if a reference to @_ has been taken, and subject to the caveat about reallocated elements), so @DB::args is actually a hybrid of the current state and initial state of @_ . Buyer beware.
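
    A brief sketch of how the frame argument selects a call frame may help. The subroutine names who_called_me and outer are hypothetical.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: report where and by whom it was called.
    sub who_called_me {
        # Frame 0 describes the call to who_called_me itself:
        # the package, file and line that call was made from.
        my ($package, $filename, $line) = caller(0);

        # Frame 1 describes the call one level up; element 3 is the
        # name of the subroutine that call entered (undef at file level).
        my $caller_sub = (caller(1))[3];

        return ($package, $line, $caller_sub);
    }

    sub outer { return who_called_me() }

    my ($pkg, $line, $sub) = outer();
    ```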

  • chdir EXPR
  • chdir FILEHANDLE
  • chdir DIRHANDLE
  • chdir

    Changes the working directory to EXPR, if possible. If EXPR is omitted, changes to the directory specified by $ENV{HOME} , if set; if not, changes to the directory specified by $ENV{LOGDIR} . (Under VMS, the variable $ENV{SYS$LOGIN} is also checked, and used if it is set.) If neither is set, chdir does nothing. It returns true on success, false otherwise. See the example under die.

    On systems that support fchdir(2), you may pass a filehandle or directory handle as the argument. On systems that don't support fchdir(2), passing handles raises an exception.

  • chmod LIST

    Changes the permissions of a list of files. The first element of the list must be the numeric mode, which should probably be an octal number, and which definitely should not be a string of octal digits: 0644 is okay, but "0644" is not. Returns the number of files successfully changed. See also oct if all you have is a string.

        $cnt = chmod 0755, "foo", "bar";
        chmod 0755, @executables;
        $mode = "0644"; chmod $mode, "foo";      # !!! sets mode to
                                                 # --w----r-T
        $mode = "0644"; chmod oct($mode), "foo"; # this is better
        $mode = 0644;   chmod $mode, "foo";      # this is best

    On systems that support fchmod(2), you may pass filehandles among the files. On systems that don't support fchmod(2), passing filehandles raises an exception. Filehandles must be passed as globs or glob references to be recognized; barewords are considered filenames.

        open(my $fh, "<", "foo");
        my $perm = (stat $fh)[2] & 07777;
        chmod($perm | 0600, $fh);

    You can also import the symbolic S_I* constants from the Fcntl module:

        use Fcntl qw( :mode );
        chmod S_IRWXU|S_IRGRP|S_IXGRP|S_IROTH|S_IXOTH, @executables;
        # Identical to the chmod 0755 of the example above.

    Portability issues: chmod in perlport.
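
    When the mode arrives as a string, for instance from a configuration file, oct converts it before the call. The sketch below assumes a Unix-like system where the permission bits can be read back with stat; the scratch file and variable names are placeholders.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my ($fh, $file) = tempfile(UNLINK => 1);
    close $fh;

    my $mode_string = "0640";                    # e.g. read from a config file
    my $changed = chmod oct($mode_string), $file;
    my $actual  = (stat $file)[2] & 07777;        # keep only the permission bits
    ```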

  • chomp VARIABLE
  • chomp( LIST )
  • chomp

    This safer version of chop removes any trailing string that corresponds to the current value of $/ (also known as $INPUT_RECORD_SEPARATOR in the English module). It returns the total number of characters removed from all its arguments. It's often used to remove the newline from the end of an input record when you're worried that the final record may be missing its newline. When in paragraph mode ($/ = "" ), it removes all trailing newlines from the string. When in slurp mode ($/ = undef ) or fixed-length record mode ($/ is a reference to an integer or the like; see perlvar) chomp() won't remove anything. If VARIABLE is omitted, it chomps $_ . Example:

        while (<>) {
            chomp;  # avoid \n on last field
            @array = split(/:/);
            # ...
        }

    If VARIABLE is a hash, it chomps the hash's values, but not its keys.

    You can actually chomp anything that's an lvalue, including an assignment:

        chomp($cwd = `pwd`);
        chomp($answer = <STDIN>);

    If you chomp a list, each element is chomped, and the total number of characters removed is returned.

    Note that parentheses are necessary when you're chomping anything that is not a simple variable. This is because chomp $cwd = `pwd`; is interpreted as (chomp $cwd) = `pwd`; , rather than as chomp( $cwd = `pwd` ) which you might expect. Similarly, chomp $a, $b is interpreted as chomp($a), $b rather than as chomp($a, $b) .
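
    The return values described above can be seen in a short sketch. It assumes $/ still has its default value of a newline; the variable names are placeholders.

    ```perl
    use strict;
    use warnings;

    my $line = "alpha\n";
    my $removed = chomp(my $copy = $line);      # parentheses required here

    my @lines = ("one\n", "two", "three\n");
    my $total = chomp(@lines);                  # chomps every element

    my $unchanged = "no newline";
    my $none = chomp($unchanged);               # nothing to remove
    ```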

  • chop VARIABLE
  • chop( LIST )
  • chop

    Chops off the last character of a string and returns the character chopped. It is much more efficient than s/.$//s because it neither scans nor copies the string. If VARIABLE is omitted, chops $_ . If VARIABLE is a hash, it chops the hash's values, but not its keys.

    You can actually chop anything that's an lvalue, including an assignment.

    If you chop a list, each element is chopped. Only the value of the last chop is returned.

    Note that chop returns the last character. To return all but the last character, use substr($string, 0, -1) .

    See also chomp.
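
    A short sketch contrasting the two operations mentioned above (variable names are placeholders):

    ```perl
    use strict;
    use warnings;

    my $word = "perl!";
    my $last = chop $word;                       # returns the removed character

    my $string = "hello";
    my $all_but_last = substr($string, 0, -1);   # leaves $string untouched
    ```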

  • chown LIST

    Changes the owner (and group) of a list of files. The first two elements of the list must be the numeric uid and gid, in that order. A value of -1 in either position is interpreted by most systems to leave that value unchanged. Returns the number of files successfully changed.

        $cnt = chown $uid, $gid, 'foo', 'bar';
        chown $uid, $gid, @filenames;

    On systems that support fchown(2), you may pass filehandles among the files. On systems that don't support fchown(2), passing filehandles raises an exception. Filehandles must be passed as globs or glob references to be recognized; barewords are considered filenames.

    Here's an example that looks up nonnumeric uids in the passwd file:

        print "User: ";
        chomp($user = <STDIN>);
        print "Files: ";
        chomp($pattern = <STDIN>);
        ($login,$pass,$uid,$gid) = getpwnam($user)
            or die "$user not in passwd file";
        @ary = glob($pattern);  # expand filenames
        chown $uid, $gid, @ary;

    On most systems, you are not allowed to change the ownership of the file unless you're the superuser, although you should be able to change the group to any of your secondary groups. On insecure systems, these restrictions may be relaxed, but this is not a portable assumption. On POSIX systems, you can detect this condition this way:

        use POSIX qw(pathconf _PC_CHOWN_RESTRICTED);
        # _PC_* names are pathconf() settings; $path is the file of interest
        $can_chown_giveaway = not pathconf($path, _PC_CHOWN_RESTRICTED);

    Portability issues: chown in perlport.

  • chr NUMBER
  • chr

    Returns the character represented by that NUMBER in the character set. For example, chr(65) is "A" in either ASCII or Unicode, and chr(0x263a) is a Unicode smiley face.

    Negative values give the Unicode replacement character (chr(0xfffd)), except under the bytes pragma, where the low eight bits of the value (truncated to an integer) are used.

    If NUMBER is omitted, uses $_ .

    For the reverse, use ord.

    Note that characters from 128 to 255 (inclusive) are by default internally not encoded as UTF-8 for backward compatibility reasons.

    See perlunicode for more about Unicode.
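
    The relationship between chr and ord, and the $_ default, can be sketched as follows (variable names are placeholders):

    ```perl
    use strict;
    use warnings;

    my $letter = chr(65);               # "A" in ASCII and Unicode
    my $code   = ord($letter);          # back to the number

    my $smiley     = chr(0x263a);       # a single character above 255
    my $round_trip = ord($smiley);

    my $alphabet = '';
    for (97 .. 122) {
        $alphabet .= chr;               # no argument: uses $_
    }
    ```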

  • chroot FILENAME
  • chroot

    This function works like the system call by the same name: it makes the named directory the new root directory for all further pathnames that begin with a / by your process and all its children. (It doesn't change your current working directory, which is unaffected.) For security reasons, this call is restricted to the superuser. If FILENAME is omitted, does a chroot to $_ .

    Portability issues: chroot in perlport.

  • close FILEHANDLE
  • close

    Closes the file or pipe associated with the filehandle, flushes the IO buffers, and closes the system file descriptor. Returns true if those operations succeed and if no error was reported by any PerlIO layer. Closes the currently selected filehandle if the argument is omitted.

    You don't have to close FILEHANDLE if you are immediately going to do another open on it, because open closes it for you. (See open.) However, an explicit close on an input file resets the line counter ($. ), while the implicit close done by open does not.

    If the filehandle came from a piped open, close returns false if one of the other syscalls involved fails or if its program exits with non-zero status. If the only problem was that the program exited non-zero, $! will be set to 0 . Closing a pipe also waits for the process executing on the pipe to exit--in case you wish to look at the output of the pipe afterwards--and implicitly puts the exit status value of that command into $? and ${^CHILD_ERROR_NATIVE} .

    If there are multiple threads running, close on a filehandle from a piped open returns true without waiting for the child process to terminate, if the filehandle is still open in another thread.

    Closing the read end of a pipe before the process writing to it at the other end is done writing results in the writer receiving a SIGPIPE. If the other end can't handle that, be sure to read all the data before closing the pipe.

    Example:

        open(OUTPUT, '|sort >foo')  # pipe to sort
            or die "Can't start sort: $!";
        #...                        # print stuff to output
        close OUTPUT                # wait for sort to finish
            or warn $! ? "Error closing sort pipe: $!"
                       : "Exit status $? from sort";
        open(INPUT, 'foo')          # get sort's results
            or die "Can't open 'foo' for input: $!";

    FILEHANDLE may be an expression whose value can be used as an indirect filehandle, usually the real filehandle name or an autovivified handle.
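
    The way close reports a child's exit status can be observed with a child that does nothing but exit. This is a sketch under the assumption that the list form of a piped open is available on your platform; it runs the current perl ($^X) as the child.

    ```perl
    use strict;
    use warnings;

    # The child here simply exits with status 3.
    open(my $pipe, '-|', $^X, '-e', 'exit 3')
        or die "Can't start child: $!";
    my @output = <$pipe>;               # read everything before closing

    my $closed_ok   = close $pipe;      # false: the child exited non-zero
    my $os_error    = $! + 0;           # 0 when the exit status was the only problem
    my $exit_status = $? >> 8;          # high byte of $? holds the exit code
    ```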

  • closedir DIRHANDLE

    Closes a directory opened by opendir and returns the success of that system call.

  • connect SOCKET,NAME

    Attempts to connect to a remote socket, just like connect(2). Returns true if it succeeded, false otherwise. NAME should be a packed address of the appropriate type for the socket. See the examples in Sockets: Client/Server Communication in perlipc.

  • continue BLOCK
  • continue

    When followed by a BLOCK, continue is actually a flow control statement rather than a function. If there is a continue BLOCK attached to a BLOCK (typically in a while or foreach ), it is always executed just before the conditional is about to be evaluated again, just like the third part of a for loop in C. Thus it can be used to increment a loop variable, even when the loop has been continued via the next statement (which is similar to the C continue statement).

    last, next, or redo may appear within a continue block; last and redo behave as if they had been executed within the main block. So will next, but since it will execute a continue block, it may be more entertaining.

        while (EXPR) {
            ### redo always comes here
            do_something;
        } continue {
            ### next always comes here
            do_something_else;
            # then back to the top to re-check EXPR
        }
        ### last always comes here

    Omitting the continue section is equivalent to using an empty one, logically enough, so next goes directly back to check the condition at the top of the loop.

    When there is no BLOCK, continue is a function that falls through the current when or default block instead of iterating a dynamically enclosing foreach or exiting a lexically enclosing given . In Perl 5.14 and earlier, this form of continue was only available when the "switch" feature was enabled. See feature and Switch Statements in perlsyn for more information.
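
    A concrete version of the loop form, as a minimal sketch: the counter is incremented in the continue block, so next cannot skip the increment and cause an endless loop.

    ```perl
    use strict;
    use warnings;

    my $i = 0;
    my @odd;
    while ($i < 10) {
        next if $i % 2 == 0;        # skip even numbers; continue block still runs
        push @odd, $i;
    } continue {
        $i++;                       # runs after every iteration, including after next
    }
    ```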

  • cos EXPR
  • cos

    Returns the cosine of EXPR (expressed in radians). If EXPR is omitted, takes the cosine of $_ .

    For the inverse cosine operation, you may use the Math::Trig::acos() function, or use this relation:

        sub acos { atan2( sqrt(1 - $_[0] * $_[0]), $_[0] ) }
  • crypt PLAINTEXT,SALT

    Creates a digest string exactly like the crypt(3) function in the C library (assuming that you actually have a version there that has not been extirpated as a potential munition).

    crypt() is a one-way hash function. The PLAINTEXT and SALT are turned into a short string, called a digest, which is returned. The same PLAINTEXT and SALT will always return the same string, but there is no (known) way to get the original PLAINTEXT from the hash. Small changes in the PLAINTEXT or SALT will result in large changes in the digest.

    There is no decrypt function. This function isn't all that useful for cryptography (for that, look for Crypt modules on your nearby CPAN mirror) and the name "crypt" is a bit of a misnomer. Instead it is primarily used to check if two pieces of text are the same without having to transmit or store the text itself. An example is checking if a correct password is given. The digest of the password is stored, not the password itself. The user types in a password that is crypt()'d with the same salt as the stored digest. If the two digests match, the password is correct.

    When verifying an existing digest string you should use the digest as the salt (like crypt($plain, $digest) eq $digest ). The SALT used to create the digest is visible as part of the digest. This ensures crypt() will hash the new string with the same salt as the digest. This allows your code to work with the standard crypt and with more exotic implementations. In other words, assume nothing about the returned string itself nor about how many bytes of SALT may matter.

    Traditionally the result is a string of 13 bytes: the first two bytes of the salt, followed by 11 bytes from the set [./0-9A-Za-z], and only the first eight bytes of PLAINTEXT mattered. But alternative hashing schemes (like MD5), higher level security schemes (like C2), and implementations on non-Unix platforms may produce different strings.

    When choosing a new salt create a random two character string whose characters come from the set [./0-9A-Za-z] (like join '', ('.', '/', 0..9, 'A'..'Z', 'a'..'z')[rand 64, rand 64] ). This set of characters is just a recommendation; the characters allowed in the salt depend solely on your system's crypt library, and Perl can't restrict what salts crypt() accepts.

    Here's an example that makes sure that whoever runs this program knows their password:

        $pwd = (getpwuid($<))[1];
        system "stty -echo";
        print "Password: ";
        chomp($word = <STDIN>);
        print "\n";
        system "stty echo";
        if (crypt($word, $pwd) ne $pwd) {
            die "Sorry...\n";
        } else {
            print "ok\n";
        }

    Of course, typing in your own password to whoever asks you for it is unwise.

    The crypt function is unsuitable for hashing large quantities of data, not least of all because you can't get the information back. Look at the Digest module for more robust algorithms.

    If using crypt() on a Unicode string (which potentially has characters with codepoints above 255), Perl tries to make sense of the situation by trying to downgrade (a copy of) the string back to an eight-bit byte string before calling crypt() (on that copy). If that works, good. If not, crypt() dies with Wide character in crypt .

    Portability issues: crypt in perlport.

  • dbmclose HASH

    [This function has been largely superseded by the untie function.]

    Breaks the binding between a DBM file and a hash.

    Portability issues: dbmclose in perlport.

  • dbmopen HASH,DBNAME,MASK

    [This function has been largely superseded by the tie function.]

    This binds a dbm(3), ndbm(3), sdbm(3), gdbm(3), or Berkeley DB file to a hash. HASH is the name of the hash. (Unlike normal open, the first argument is not a filehandle, even though it looks like one.) DBNAME is the name of the database (without the .dir or .pag extension if any). If the database does not exist, it is created with protection specified by MASK (as modified by the umask). To prevent creation of the database if it doesn't exist, you may specify a MASK of 0, and the function will return a false value if it can't find an existing database. If your system supports only the older DBM functions, you may make only one dbmopen call in your program. In older versions of Perl, if your system had neither DBM nor ndbm, calling dbmopen produced a fatal error; it now falls back to sdbm(3).

    If you don't have write access to the DBM file, you can only read hash variables, not set them. If you want to test whether you can write, either use file tests or try setting a dummy hash entry inside an eval to trap the error.

    Note that functions such as keys and values may return huge lists when used on large DBM files. You may prefer to use the each function to iterate over large DBM files. Example:

        # print out history file offsets
        dbmopen(%HIST,'/usr/lib/news/history',0666);
        while (($key,$val) = each %HIST) {
            print $key, ' = ', unpack('L',$val), "\n";
        }
        dbmclose(%HIST);

    See also AnyDBM_File for a more general description of the pros and cons of the various dbm approaches, as well as DB_File for a particularly rich implementation.

    You can control which DBM library you use by loading that library before you call dbmopen():

        use DB_File;
        # dbmopen takes three arguments; the MASK is required
        dbmopen(%NS_Hist, "$ENV{HOME}/.netscape/history.db", 0666)
            or die "Can't open netscape history file: $!";

    Portability issues: dbmopen in perlport.
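
    Since dbmopen has largely been superseded by tie, here is a rough equivalent of the history example written with tie. It is only a sketch: it uses SDBM_File because that module ships with Perl, and the database path and record contents are made up for illustration.

    ```perl
    use strict;
    use warnings;
    use Fcntl;                          # O_RDWR, O_CREAT, O_RDONLY
    use SDBM_File;                      # ships with Perl
    use File::Temp qw(tempdir);

    my $dir    = tempdir(CLEANUP => 1);
    my $dbname = "$dir/history";        # hypothetical database name

    # Write one record, then release the database.
    tie(my %hist, 'SDBM_File', $dbname, O_RDWR|O_CREAT, 0666)
        or die "Can't tie $dbname: $!";
    $hist{article} = pack('L', 12345);
    untie %hist;

    # Reopen read-only and walk it with each, as suggested for large files.
    tie(my %again, 'SDBM_File', $dbname, O_RDONLY, 0666)
        or die "Can't reopen $dbname: $!";
    my %seen;
    while (my ($key, $val) = each %again) {
        $seen{$key} = unpack('L', $val);
    }
    untie %again;
    ```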

  • defined EXPR
  • defined

    Returns a Boolean value telling whether EXPR has a value other than the undefined value undef. If EXPR is not present, $_ is checked.

    Many operations return undef to indicate failure, end of file, system error, uninitialized variable, and other exceptional conditions. This function allows you to distinguish undef from other values. (A simple Boolean test will not distinguish among undef, zero, the empty string, and "0" , which are all equally false.) Note that since undef is a valid scalar, its presence doesn't necessarily indicate an exceptional condition: pop returns undef when its argument is an empty array, or when the element to return happens to be undef.

    You may also use defined(&func) to check whether subroutine &func has ever been defined. The return value is unaffected by any forward declarations of &func . A subroutine that is not defined may still be callable: its package may have an AUTOLOAD method that makes it spring into existence the first time that it is called; see perlsub.

    Use of defined on aggregates (hashes and arrays) is deprecated. It used to report whether memory for that aggregate had ever been allocated. This behavior may disappear in future versions of Perl. You should instead use a simple test for size:

        if (@an_array) { print "has array elements\n" }
        if (%a_hash)   { print "has hash members\n"   }

    When used on a hash element, it tells you whether the value is defined, not whether the key exists in the hash. Use exists for the latter purpose.

    Examples:

        print if defined $switch{D};
        print "$val\n" while defined($val = pop(@ary));
        die "Can't readlink $sym: $!"
            unless defined($value = readlink $sym);
        sub foo { defined &$bar ? &$bar(@_) : die "No bar"; }
        $debugging = 0 unless defined $debugging;

    Note: Many folks tend to overuse defined and are then surprised to discover that the number 0 and "" (the zero-length string) are, in fact, defined values. For example, if you say

        "ab" =~ /a(.*)b/;

    The pattern match succeeds and $1 is defined, although it matched "nothing". It didn't really fail to match anything. Rather, it matched something that happened to be zero characters long. This is all very above-board and honest. When a function returns an undefined value, it's an admission that it couldn't give you an honest answer. So you should use defined only when questioning the integrity of what you're trying to do. At other times, a simple comparison to 0 or "" is what you want.
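
    A minimal sketch of the point above, using the same pattern. The variable names are illustrative only; the example checks that the capture is defined, empty, and false all at once.

```perl
use strict;
use warnings;

# The capture group matches the empty string between "a" and "b".
my $captured;
if ("ab" =~ /a(.*)b/) {
    $captured = $1;
}

my $is_defined = defined $captured ? 1 : 0;   # 1: the match succeeded
my $is_empty   = $captured eq ""   ? 1 : 0;   # 1: zero characters long
my $is_true    = $captured         ? 1 : 0;   # 0: "" is false in Boolean context

print "defined but empty\n" if $is_defined && $is_empty;
```

    Here a test for truth would wrongly suggest that nothing matched, while defined reports what actually happened.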

    See also undef, exists, ref.

  • delete EXPR

    Given an expression that specifies an element or slice of a hash, delete deletes the specified elements from that hash so that exists() on that element no longer returns true. Setting a hash element to the undefined value does not remove its key, but deleting it does; see exists.

    In list context, returns the value or values deleted, or the last such element in scalar context. The return list's length always matches that of the argument list: deleting non-existent elements returns the undefined value in their corresponding positions.
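
    The difference between assigning undef and deleting, and the shape of the returned list, can be sketched as follows. The hash and its keys are made up for illustration.

```perl
use strict;
use warnings;

my %color = (red => 1, green => 2, blue => 3);

# Assigning undef keeps the key; only the value goes away.
$color{red} = undef;
my $red_exists = exists $color{red} ? 1 : 0;    # 1

# delete removes the key itself.
delete $color{red};
my $red_gone = exists $color{red} ? 0 : 1;      # 1

# The returned list has one slot per requested key, with undef
# standing in for keys that were not present.
my @removed = delete @color{qw(green nosuch blue)};
```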

    delete() may also be used on arrays and array slices, but its behavior is less straightforward. Although exists() will return false for deleted entries, deleting array elements never changes indices of existing values; use shift() or splice() for that. However, if all deleted elements fall at the end of an array, the array's size shrinks to the position of the highest element that still tests true for exists(), or to 0 if none do.

    WARNING: Calling delete on array values is deprecated and likely to be removed in a future version of Perl.

    Deleting from %ENV modifies the environment. Deleting from a hash tied to a DBM file deletes the entry from the DBM file. Deleting from a tied hash or array may not necessarily return anything; it depends on the implementation of the tied package's DELETE method, which may do whatever it pleases.

    The delete local EXPR construct localizes the deletion to the current block at run time. Until the block exits, elements locally deleted temporarily no longer exist. See Localized deletion of elements of composite types in perlsub.
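
    As a rough illustration of localized deletion, the sketch below hides one entry for the duration of a block. The %config hash and its keys are hypothetical.

```perl
use 5.012;      # delete local needs Perl 5.12 or later
use warnings;

my %config = (debug => 1, verbose => 1);
my ($inside, $after);

{
    # The entry disappears only until this block exits.
    delete local $config{debug};
    $inside = exists $config{debug} ? 1 : 0;    # 0 while the block runs
}

$after = exists $config{debug} ? 1 : 0;         # 1 again after the block
```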

        %hash = (foo => 11, bar => 22, baz => 33);
        $scalar = delete $hash{foo};          # $scalar is 11
        $scalar = delete @hash{qw(foo bar)};  # $scalar is 22
        @array  = delete @hash{qw(foo baz)};  # @array is (undef,33)

    The following (inefficiently) deletes all the values of %HASH and @ARRAY:

        foreach $key (keys %HASH) {
            delete $HASH{$key};
        }
        foreach $index (0 .. $#ARRAY) {
            delete $ARRAY[$index];
        }

    And so do these:

        delete @HASH{keys %HASH};
        delete @ARRAY[0 .. $#ARRAY];

    But both are slower than assigning the empty list or undefining %HASH or @ARRAY, which is the customary way to empty out an aggregate:

        %HASH = ();    # completely empty %HASH
        undef %HASH;   # forget %HASH ever existed
        @ARRAY = ();   # completely empty @ARRAY
        undef @ARRAY;  # forget @ARRAY ever existed

    The EXPR can be arbitrarily complicated provided its final operation is an element or slice of an aggregate:

        delete $ref->[$x][$y]{$key};
        delete @{$ref->[$x][$y]}{$key1, $key2, @morekeys};
        delete $ref->[$x][$y][$index];
        delete @{$ref->[$x][$y]}[$index1, $index2, @moreindices];

  • die LIST

    die raises an exception. Inside an eval the error message is stuffed into $@ and the eval is terminated with the undefined value. If the exception is outside of all enclosing evals, then the uncaught exception prints LIST to STDERR and exits with a non-zero value. If you need to exit the process with a specific exit code, see exit.

    Equivalent examples:

        die "Can't cd to spool: $!\n" unless chdir '/usr/spool/news';
        chdir '/usr/spool/news' or die "Can't cd to spool: $!\n";

    If the last element of LIST does not end in a newline, the current script line number and input line number (if any) are also printed, and a newline is supplied. Note that the "input line number" (also known as "chunk") is subject to whatever notion of "line" happens to be currently in effect, and is also available as the special variable $. (see $/ in perlvar and $. in perlvar).

    Hint: sometimes appending ", stopped" to your message will cause it to make better sense when the string "at foo line 123" is appended. Suppose you are running script "canasta".

        die "/etc/games is no good";
        die "/etc/games is no good, stopped";

    produce, respectively

        /etc/games is no good at canasta line 123.
        /etc/games is no good, stopped at canasta line 123.

    If the output is empty and $@ already contains a value (typically from a previous eval) that value is reused after appending "\t...propagated". This is useful for propagating exceptions:

        eval { ... };
        die unless $@ =~ /Expected exception/;
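
    To see what a propagated message looks like, here is a small self-contained sketch. The "disk full" message is invented for the example.

```perl
use strict;
use warnings;

eval {
    eval { die "disk full\n" };   # inner failure leaves its message in $@
    die;                          # empty LIST: reuse $@, append "\t...propagated"
};
my $error = $@;
print $error;
```

    The captured message keeps the original text on its first line, followed by a tab, "...propagated", and the file and line of the bare die.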

    If the output is empty and $@ contains an object reference that has a PROPAGATE method, that method will be called with additional file and line number parameters. The return value replaces the value in $@; i.e., as if $@ = eval { $@->PROPAGATE(__FILE__, __LINE__) }; were called.

    If $@ is empty then the string "Died" is used.

    If an uncaught exception results in interpreter exit, the exit code is determined from the values of $! and $? with this pseudocode:

        exit $! if $!;              # errno
        exit $? >> 8 if $? >> 8;    # child exit status
        exit 255;                   # last resort

    The intent is to squeeze as much information as possible about the likely cause into the limited space of the system exit code. However, because $! is the value of C's errno, which can be set by any system call, the exit code used by die can be unpredictable, so it should not be relied upon other than to be non-zero.

    You can also call die with a reference argument, and if this is trapped within an eval, $@ contains that reference. This permits more elaborate exception handling using objects that maintain arbitrary state about the exception. Such a scheme is sometimes preferable to matching particular string values of $@ with regular expressions. Because $@ is a global variable and eval may be used within object implementations, be careful that analyzing the error object doesn't replace the reference in the global variable. It's easiest to make a local copy of the reference before any manipulations. Here's an example:

        use Scalar::Util "blessed";

        eval { ... ; die Some::Module::Exception->new( FOO => "bar" ) };
        if (my $ev_err = $@) {
            if (blessed($ev_err)
                && $ev_err->isa("Some::Module::Exception")) {
                # handle Some::Module::Exception
            }
            else {
                # handle all other possible exceptions
            }
        }

    Because Perl stringifies uncaught exception messages before display, you'll probably want to overload stringification operations on exception objects. See overload for details about that.
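
    One possible shape for such an exception class is sketched below. My::Error and its message field are hypothetical names chosen for the example, not part of any real module.

```perl
use strict;
use warnings;

{
    package My::Error;    # hypothetical exception class
    use overload
        '""'     => sub { "My::Error: " . $_[0]{message} },
        fallback => 1;

    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }
}

eval { die My::Error->new(message => "out of cheese") };
my $err  = $@;       # copy before anything else can touch $@
my $text = "$err";   # stringification goes through the overload
```

    If the same object went uncaught, the overloaded string is what Perl would print instead of something like My::Error=HASH(0x...).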

    You can arrange for a callback to be run just before the die does its deed, by setting the $SIG{__DIE__} hook. The associated handler is called with the error text and can change the error message, if it sees fit, by calling die again. See %SIG in perlvar for details on setting %SIG entries, and eval BLOCK for some examples. Although this hook was meant to run only right before your program exits, that is not currently the case: the $SIG{__DIE__} hook is currently called even inside eval()ed blocks/strings! If you want the hook to do nothing in such situations, put

        die @_ if $^S;

    as the first line of the handler (see $^S in perlvar). Because this promotes strange action at a distance, this counterintuitive behavior may be fixed in a future release.

    See also exit(), warn(), and the Carp module.

  • do BLOCK

    Not really a function. Returns the value of the last command in the sequence of commands indicated by BLOCK. When modified by the while or until loop modifier, executes the BLOCK once before testing the loop condition. (On other statements the loop modifiers test the conditional first.)

    do BLOCK does not count as a loop, so the loop control statements next, last, or redo cannot be used to leave or restart the block. See perlsyn for alternative strategies.
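
    The difference between a loop modifier on do BLOCK and on an ordinary statement can be sketched like this. The counters are illustrative.

```perl
use strict;
use warnings;

my ($with_do, $without_do) = (0, 0);

# The block runs once before the (false) condition is tested.
do { $with_do++ } while ($with_do < 0);

# A plain statement modifier tests the condition first, so this never runs.
$without_do++ while ($without_do < 0);

print "with do: $with_do, without do: $without_do\n";
```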

  • do SUBROUTINE(LIST)

    This form of subroutine call is deprecated. SUBROUTINE can be a bareword or scalar variable.

  • do EXPR

    Uses the value of EXPR as a filename and executes the contents of the file as a Perl script.

        do 'stat.pl';

    is largely like

        eval `cat stat.pl`;

    except that it's more concise, runs no external processes, keeps track of the current filename for error messages, searches the @INC directories, and updates %INC if the file is found. See @INC in perlvar and %INC in perlvar for these variables. It also differs in that code evaluated with do FILENAME cannot see lexicals in the enclosing scope; eval STRING does. It's the same, however, in that it does reparse the file every time you call it, so you probably don't want to do this inside a loop.

    If do can read the file but cannot compile it, it returns undef and sets an error message in $@. If do cannot read the file, it returns undef and sets $! to the error. Always check $@ first, as compilation could fail in a way that also sets $!. If the file is successfully compiled, do returns the value of the last expression evaluated.

    Inclusion of library modules is better done with the use and require operators, which also do automatic error checking and raise an exception if there's a problem.

    You might like to use do to read in a program configuration file. Manual error checking can be done this way:

        # read in config files: system first, then user
        for $file ("/share/prog/defaults.rc",
                   "$ENV{HOME}/.someprogrc")
        {
            unless ($return = do $file) {
                warn "couldn't parse $file: $@" if $@;
                warn "couldn't do $file: $!"    unless defined $return;
                warn "couldn't run $file"       unless $return;
            }
        }

  • dump LABEL
  • dump EXPR
  • dump

    This function causes an immediate core dump. See also the -u command-line switch in perlrun, which does the same thing. Primarily this is so that you can use the undump program (not supplied) to turn your core dump into an executable binary after having initialized all your variables at the beginning of the program. When the new binary is executed it will begin by executing a goto LABEL (with all the restrictions that goto suffers). Think of it as a goto with an intervening core dump and reincarnation. If LABEL is omitted, restarts the program from the top. The dump EXPR form, available starting in Perl 5.18.0, allows a name to be computed at run time, being otherwise identical to dump LABEL.

    WARNING: Any files opened at the time of the dump will not be open any more when the program is reincarnated, with possible resulting confusion by Perl.

    This function is now largely obsolete, mostly because it's very hard to convert a core file into an executable. That's why you should now invoke it as CORE::dump(), if you don't want to be warned against a possible typo.

    Unlike most named operators, this has the same precedence as assignment. It is also exempt from the looks-like-a-function rule, so dump ("foo")."bar" will cause "bar" to be part of the argument to dump.

    Portability issues: dump in perlport.

  • each HASH
  • each ARRAY
  • each EXPR

    When called on a hash in list context, returns a 2-element list consisting of the key and value for the next element of a hash. In Perl 5.12 and later only, it will also return the index and value for the next element of an array so that you can iterate over it; older Perls consider this a syntax error. When called in scalar context, returns only the key (not the value) in a hash, or the index in an array.
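
    For the array form, a minimal sketch looks like this. The @fruit array is made up for the example.

```perl
use 5.012;      # each on arrays needs Perl 5.12 or later
use warnings;

my @fruit = qw(apple banana cherry);
my @seen;

# In list context each returns (index, value) for the next element.
while (my ($index, $value) = each @fruit) {
    push @seen, "$index:$value";
}

print "@seen\n";
```

    Unlike hashes, arrays are walked in index order, so the pairs arrive as 0, 1, 2.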

    Hash entries are returned in an apparently random order. The actual random order is specific to a given hash; the exact same series of operations on two hashes may result in a different order for each hash. Any insertion into the hash may change the order, as will any deletion, with the exception that the most recent key returned by each or keys may be deleted without changing the order. So long as a given hash is unmodified you may rely on keys, values and each to repeatedly return the same order as each other. See Algorithmic Complexity Attacks in perlsec for details on why hash order is randomized. Aside from the guarantees provided here the exact details of Perl's hash algorithm and the hash traversal order are subject to change in any release of Perl.

    After each has returned all entries from the hash or array, the next call to each returns the empty list in list context and undef in scalar context; the next call following that one restarts iteration. Each hash or array has its own internal iterator, accessed by each, keys, and values. The iterator is implicitly reset when each has reached the end as just described; it can be explicitly reset by calling keys or values on the hash or array. If you add or delete a hash's elements while iterating over it, entries may be skipped or duplicated--so don't do that. Exception: In the current implementation, it is always safe to delete the item most recently returned by each(), so the following code works properly:

        while (($key, $value) = each %hash) {
            print $key, "\n";
            delete $hash{$key};   # This is safe
        }
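
    The explicit iterator reset described above can be sketched as follows. The hash contents are arbitrary.

```perl
use strict;
use warnings;

my %h = (a => 1, b => 2, c => 3);

my ($first) = each %h;   # advance the iterator by one entry
keys %h;                 # calling keys in void context resets the iterator

# Because of the reset, this loop sees every entry, not just the rest.
my $count = 0;
while (my ($key, $value) = each %h) {
    $count++;
}
```

    Without the call to keys, the loop would pick up where the first each left off and visit only two entries.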

    This prints out your environment like the printenv(1) program, but in a different order:

        while (($key,$value) = each %ENV) {
            print "$key=$value\n";
        }

    Starting with Perl 5.14, each can take a scalar EXPR, which must hold a reference to an unblessed hash or array. The argument will be dereferenced automatically. This aspect of each is considered highly experimental. The exact behaviour may change in a future version of Perl.

        while (($key,$value) = each $hashref) { ... }

    As of Perl 5.18 you can use a bare each in a while loop, which will set $_ on every iteration.

        while (each %ENV) {
            print "$_=$ENV{$_}\n";
        }

    To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

        use 5.012;  # so keys/values/each work on arrays
        use 5.014;  # so keys/values/each work on scalars (experimental)
        use 5.018;  # so each assigns to $_ in a lone while test

    See also keys, values, and sort.

  • eof FILEHANDLE
  • eof ()
  • eof

    Returns 1 if the next read on FILEHANDLE will return end of file or if FILEHANDLE is not open. FILEHANDLE may be an expression whose value gives the real filehandle. (Note that this function actually reads a character and then ungetcs it, so it isn't useful in an interactive context.) Do not read from a terminal file (or call eof(FILEHANDLE) on it) after end-of-file is reached. File types such as terminals may lose the end-of-file condition if you do.

    An eof without an argument uses the last file read. Using eof() with empty parentheses is different. It refers to the pseudo file formed from the files listed on the command line and accessed via the <> operator. Since <> isn't explicitly opened, as a normal filehandle is, an eof() before <> has been used will cause @ARGV to be examined to determine if input is available. Similarly, an eof() after <> has returned end-of-file will assume you are processing another @ARGV list, and if you haven't set @ARGV, will read input from STDIN; see I/O Operators in perlop.

    In a while (<>) loop, eof or eof(ARGV) can be used to detect the end of each file, whereas eof() will detect the end of the very last file only. Examples:

        # reset line numbering on each input file
        while (<>) {
            next if /^\s*#/;  # skip comments
            print "$.\t$_";
        } continue {
            close ARGV if eof;  # Not eof()!
        }

        # insert dashes just before last line of last file
        while (<>) {
            if (eof()) {  # check for end of last file
                print "--------------\n";
            }
            print;
            last if eof();  # needed if we're reading from a terminal
        }

    Practical hint: you almost never need to use eof in Perl, because the input operators typically return undef when they run out of data or encounter an error.

  • eval EXPR
  • eval BLOCK
  • eval

    In the first form, the return value of EXPR is parsed and executed as if it were a little Perl program. The value of the expression (which is itself determined within scalar context) is first parsed, and if there were no errors, executed as a block within the lexical context of the current Perl program. This means, in particular, that any outer lexical variables are visible to it, and any package variable settings or subroutine and format definitions remain afterwards.

    Note that the value is parsed every time the eval executes. If EXPR is omitted, evaluates $_. This form is typically used to delay parsing and subsequent execution of the text of EXPR until run time.

    If the unicode_eval feature is enabled (which is the default under a use v5.16 or higher declaration), EXPR or $_ is treated as a string of characters, so use utf8 declarations have no effect, and source filters are forbidden. In the absence of the unicode_eval feature, the string will sometimes be treated as characters and sometimes as bytes, depending on the internal encoding, and source filters activated within the eval exhibit the erratic, but historical, behaviour of affecting some outer file scope that is still compiling. See also the evalbytes keyword, which always treats its input as a byte stream and works properly with source filters, and the feature pragma.

    In the second form, the code within the BLOCK is parsed only once--at the same time the code surrounding the eval itself was parsed--and executed within the context of the current Perl program. This form is typically used to trap exceptions more efficiently than the first (see below), while also providing the benefit of checking the code within BLOCK at compile time.

    The final semicolon, if any, may be omitted from the value of EXPR or within the BLOCK.

    In both forms, the value returned is the value of the last expression evaluated inside the mini-program; a return statement may also be used, just as with subroutines. The expression providing the return value is evaluated in void, scalar, or list context, depending on the context of the eval itself. See wantarray for more on how the evaluation context can be determined.

    If there is a syntax error or runtime error, or a die statement is executed, eval returns undef in scalar context or an empty list in list context, and $@ is set to the error message. (Prior to 5.16, a bug caused undef to be returned in list context for syntax errors, but not for runtime errors.) If there was no error, $@ is set to the empty string. A control flow operator like last or goto can bypass the setting of $@. Beware that using eval neither silences Perl from printing warnings to STDERR, nor does it stuff the text of warning messages into $@. To do either of those, you have to use the $SIG{__WARN__} facility, or turn off warnings inside the BLOCK or EXPR using no warnings 'all'. See warn, perlvar, warnings and perllexwarn.
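
    As a sketch of the $SIG{__WARN__} approach, the following collects warnings raised inside an eval instead of letting them reach STDERR. The warning text and variable names are invented for the example.

```perl
use strict;
use warnings;

my @warnings;
my $ok = do {
    # Collect warnings instead of printing them; restored when the block exits.
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    eval { warn "low disk space\n"; 1 };
};
my $error = $@;   # stays empty: a warning is not an exception
```

    The eval succeeds and $@ stays empty, while the warning text ends up in @warnings.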

    Note that, because eval traps otherwise-fatal errors, it is useful for determining whether a particular feature (such as socket or symlink) is implemented. It is also Perl's exception-trapping mechanism, where the die operator is used to raise exceptions.

    If you want to trap errors when loading an XS module, some problems with the binary interface (such as Perl version skew) may be fatal even with eval unless $ENV{PERL_DL_NONLAZY} is set. See perlrun.

    If the code to be executed doesn't vary, you may use the eval-BLOCK form to trap run-time errors without incurring the penalty of recompiling each time. The error, if any, is still returned in $@. Examples:

        # make divide-by-zero nonfatal
        eval { $answer = $a / $b; }; warn $@ if $@;

        # same thing, but less efficient
        eval '$answer = $a / $b'; warn $@ if $@;

        # a compile-time error
        eval { $answer = }; # WRONG

        # a run-time error
        eval '$answer ='; # sets $@

    Using the eval{} form as an exception trap in libraries does have some issues. Due to the current arguably broken state of __DIE__ hooks, you may wish not to trigger any __DIE__ hooks that user code may have installed. You can use the local $SIG{__DIE__} construct for this purpose, as this example shows:

        # a private exception trap for divide-by-zero
        eval { local $SIG{'__DIE__'}; $answer = $a / $b; };
        warn $@ if $@;

    This is especially significant, given that __DIE__ hooks can call die again, which has the effect of changing their error messages:

        # __DIE__ hooks may modify error messages
        {
            local $SIG{'__DIE__'} =
                sub { (my $x = $_[0]) =~ s/foo/bar/g; die $x };
            eval { die "foo lives here" };
            print $@ if $@;   # prints "bar lives here"
        }

    Because this promotes action at a distance, this counterintuitive behavior may be fixed in a future release.

    With an eval, you should be especially careful to remember what's being looked at when:

        eval $x;        # CASE 1
        eval "$x";      # CASE 2
        eval '$x';      # CASE 3
        eval { $x };    # CASE 4
        eval "\$$x++";  # CASE 5
        $$x++;          # CASE 6

    Cases 1 and 2 above behave identically: they run the code contained in the variable $x. (Although case 2 has misleading double quotes making the reader wonder what else might be happening (nothing is).) Cases 3 and 4 likewise behave in the same way: they run the code '$x', which does nothing but return the value of $x. (Case 4 is preferred for purely visual reasons, but it also has the advantage of compiling at compile-time instead of at run-time.) Case 5 is a place where normally you would like to use double quotes, except that in this particular situation, you can just use symbolic references instead, as in case 6.

    Before Perl 5.14, the assignment to $@ occurred before restoration of localized variables, which means that for your code to run on older versions, a temporary is required if you want to mask some but not all errors:

        # alter $@ on nefarious repugnancy only
        {
            my $e;
            {
                local $@;  # protect existing $@
                eval { test_repugnancy() };
                # $@ =~ /nefarious/ and die $@; # Perl 5.14 and higher only
                $@ =~ /nefarious/ and $e = $@;
            }
            die $e if defined $e
        }

    eval BLOCK does not count as a loop, so the loop control statements next, last, or redo cannot be used to leave or restart the block.

    An eval '' executed within a subroutine defined in the DB package doesn't see the usual surrounding lexical scope, but rather the scope of the first non-DB piece of code that called it. You don't normally need to worry about this unless you are writing a Perl debugger.

  • evalbytes EXPR
  • evalbytes

    This function is like eval with a string argument, except it always parses its argument, or $_ if EXPR is omitted, as a string of bytes. A string containing characters whose ordinal value exceeds 255 results in an error. Source filters activated within the evaluated code apply to the code itself.

    This function is only available under the evalbytes feature, a use v5.16 (or higher) declaration, or with a CORE:: prefix. See feature for more information.
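
    A minimal sketch of enabling and calling evalbytes follows. The expression being evaluated is arbitrary.

```perl
use strict;
use warnings;
use feature 'evalbytes';   # needs Perl 5.16 or later

# The string is parsed as bytes, then run like any string eval.
my $answer = evalbytes '6 * 7';

# The CORE:: prefix works as well, with or without the feature.
my $same = CORE::evalbytes('6 * 7');
```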

  • exec LIST
  • exec PROGRAM LIST

    The exec function executes a system command and never returns; use system instead of exec if you want it to return. It fails and returns false only if the command does not exist and it is executed directly instead of via your system's command shell (see below).

    Since it's a common mistake to use exec instead of system, Perl warns you if exec is called in void context and if there is a following statement that isn't die, warn, or exit (if -w is set--but you always do that, right?). If you really want to follow an exec with some other statement, you can use one of these styles to avoid the warning:

        exec ('foo')   or print STDERR "couldn't exec foo: $!";
        { exec ('foo') }; print STDERR "couldn't exec foo: $!";

    If there is more than one argument in LIST, or if LIST is an array with more than one value, calls execvp(3) with the arguments in LIST. If there is only one scalar argument or an array with one element in it, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system's command shell for parsing (this is /bin/sh -c on Unix platforms, but varies on other platforms). If there are no shell metacharacters in the argument, it is split into words and passed directly to execvp, which is more efficient. Examples:

        exec '/bin/echo', 'Your arguments are: ', @ARGV;
        exec "sort $outfile | uniq";

    If you don't really want to execute the first argument, but want to lie to the program you are executing about its own name, you can specify the program you actually want to run as an "indirect object" (without a comma) in front of the LIST. (This always forces interpretation of the LIST as a multivalued list, even if there is only a single scalar in the list.) Example:

        $shell = '/bin/csh';
        exec $shell '-sh';    # pretend it's a login shell

    or, more directly,

        exec {'/bin/csh'} '-sh';  # pretend it's a login shell

    When the arguments get executed via the system shell, results are subject to its quirks and capabilities. See `STRING` in perlop for details.

    Using an indirect object with exec or system is also more secure. This usage (which also works fine with system()) forces interpretation of the arguments as a multivalued list, even if the list had just one argument. That way you're safe from the shell expanding wildcards or splitting up words with whitespace in them.

        @args = ( "echo surprise" );

        exec @args;               # subject to shell escapes
                                  # if @args == 1
        exec { $args[0] } @args;  # safe even with one-arg list

    The first version, the one without the indirect object, ran the echo program, passing it "surprise" as an argument. The second version didn't; it tried to run a program named "echo surprise", didn't find it, and set $? to a non-zero value indicating failure.

    Perl attempts to flush all files opened for output before the exec, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles to avoid lost output.

    Note that exec will not call your END blocks, nor will it invoke DESTROY methods on your objects.

    Portability issues: exec in perlport.

  • exists EXPR

    Given an expression that specifies an element of a hash, returns true if the specified element in the hash has ever been initialized, even if the corresponding value is undefined.

        print "Exists\n"  if exists $hash{$key};
        print "Defined\n" if defined $hash{$key};
        print "True\n"    if $hash{$key};

    exists may also be called on array elements, but its behavior is much less obvious and is strongly tied to the use of delete on arrays. Be aware that calling exists on array values is deprecated and likely to be removed in a future version of Perl.

        print "Exists\n"  if exists $array[$index];
        print "Defined\n" if defined $array[$index];
        print "True\n"    if $array[$index];

    A hash or array element can be true only if it's defined and defined only if it exists, but the reverse doesn't necessarily hold true.
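
    The nesting of these three tests can be made concrete with a short sketch. The helper state_of and the sample hash are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

my %h = (one => 1, zero => 0, nothing => undef);

# Classify a key by the strongest test it passes.
sub state_of {
    my ($hash, $key) = @_;
    return 'missing' unless exists  $hash->{$key};
    return 'undef'   unless defined $hash->{$key};
    return 'false'   unless $hash->{$key};
    return 'true';
}

my @states = map { state_of(\%h, $_) } qw(one zero nothing absent);
print "@states\n";
```

    Each key fails at a different level: "absent" does not exist, "nothing" exists but is undefined, "zero" is defined but false, and "one" passes all three tests.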

    Given an expression that specifies the name of a subroutine, returns true if the specified subroutine has ever been declared, even if it is undefined. Mentioning a subroutine name for exists or defined does not count as declaring it. Note that a subroutine that does not exist may still be callable: its package may have an AUTOLOAD method that makes it spring into existence the first time that it is called; see perlsub.

        print "Exists\n"  if exists &subroutine;
        print "Defined\n" if defined &subroutine;

    Note that the EXPR can be arbitrarily complicated as long as the final operation is a hash or array key lookup or subroutine name:

        if (exists $ref->{A}->{B}->{$key})  { }
        if (exists $hash{A}{B}{$key})       { }
        if (exists $ref->{A}->{B}->[$ix])   { }
        if (exists $hash{A}{B}[$ix])        { }
        if (exists &{$ref->{A}{B}{$key}})   { }

    Although the most deeply nested array or hash element will not spring into existence just because its existence was tested, any intervening ones will. Thus $ref->{"A"} and $ref->{"A"}->{"B"} will spring into existence due to the existence test for the $key element above. This happens anywhere the arrow operator is used, including even here:

        undef $ref;
        if (exists $ref->{"Some key"}) { }
        print $ref;  # prints HASH(0x80d3d5c)

    This surprising autovivification in what does not at first--or even second--glance appear to be an lvalue context may be fixed in a future release.
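
    One common way to sidestep this side effect is to test each level before descending. The sketch below contrasts the two approaches on hypothetical keys A, B, and key.

```perl
use strict;
use warnings;

# Naive test: creates $naive->{A} and $naive->{A}{B} as a side effect.
my $naive       = {};
my $naive_found = exists $naive->{A}{B}{key} ? 1 : 0;
my $naive_keys  = keys %$naive;      # 1: "A" sprang into existence

# Guarded test: stop at the first missing level.
my $clean       = {};
my $clean_found = (exists $clean->{A}
                && exists $clean->{A}{B}
                && exists $clean->{A}{B}{key}) ? 1 : 0;
my $clean_keys  = keys %$clean;      # 0: nothing was created
```

    Both tests report that the element is absent, but only the guarded form leaves the structure untouched.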

    Use of a subroutine call, rather than a subroutine name, as an argument to exists() is an error.

        exists &sub;    # OK
        exists &sub();  # Error

  • exit EXPR
  • exit

    Evaluates EXPR and exits immediately with that value. Example:

        $ans = <STDIN>;
        exit 0 if $ans =~ /^[Xx]/;

    See also die. If EXPR is omitted, exits with 0 status. The only universally recognized values for EXPR are 0 for success and 1 for error; other values are subject to interpretation depending on the environment in which the Perl program is running. For example, exiting 69 (EX_UNAVAILABLE) from a sendmail incoming-mail filter will cause the mailer to return the item undelivered, but that's not true everywhere.

    Don't use exit to abort a subroutine if there's any chance that someone might want to trap whatever error happened. Use die instead, which can be trapped by an eval.

    The exit() function does not always exit immediately. It calls any defined END routines first, but these END routines may not themselves abort the exit. Likewise any object destructors that need to be called are called before the real exit. END routines and destructors can change the exit status by modifying $?. If this is a problem, you can call POSIX::_exit($status) to avoid END and destructor processing. See perlmod for details.

    Portability issues: exit in perlport.

  • exp EXPR
  • exp

    Returns e (the natural logarithm base) to the power of EXPR. If EXPR is omitted, gives exp($_).
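
    A short sketch, with arbitrary variable names, showing exp as the inverse of log:

```perl
use strict;
use warnings;

my $e          = exp(1);          # e itself, roughly 2.71828
my $round_trip = log(exp(3));     # log undoes exp, giving back about 3

printf "%.5f %.1f\n", $e, $round_trip;
```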

  • fc EXPR
  • fc

    Returns the casefolded version of EXPR. This is the internal function implementing the \F escape in double-quoted strings.

    Casefolding is the process of mapping strings to a form where case differences are erased; comparing two strings in their casefolded form is effectively a way of asking if two strings are equal, regardless of case.

    Roughly, if you ever found yourself writing this

        lc($this) eq lc($that)    # Wrong!
        # or
        uc($this) eq uc($that)    # Also wrong!
        # or
        $this =~ /^\Q$that\E\z/i  # Right!

    Now you can write

        fc($this) eq fc($that)

    And get the correct results.

    Perl only implements the full form of casefolding, but you can access the simple folds using casefold() in Unicode::UCD and prop_invmap() in Unicode::UCD. For further information on casefolding, refer to the Unicode Standard, specifically sections 3.13 Default Case Operations, 4.2 Case-Normative, and 5.18 Case Mappings, available at http://www.unicode.org/versions/latest/, as well as the Case Charts available at http://www.unicode.org/charts/case/.

    If EXPR is omitted, uses $_.

    This function behaves the same way under various pragmas, such as in a locale, as lc does.

    While the Unicode Standard defines two additional forms of casefolding, one for Turkic languages and one that never maps one character into multiple characters, these are not provided by the Perl core; however, the CPAN module Unicode::Casing may be used to provide an implementation.

    This keyword is available only when the "fc" feature is enabled, or when prefixed with CORE::; see feature. Alternatively, include a use v5.16 or later in the current scope.
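
    To see why lc is not enough, here is a minimal sketch using the German sharp s (U+00DF), whose full casefold is "ss". The sample strings are assumptions chosen for illustration; use v5.16 is what turns on both the "fc" and "unicode_strings" features here.

```perl
use v5.16;      # enables the "fc" and "unicode_strings" features
use warnings;

my $this = "stra\x{DF}e";   # "strasse" spelled with U+00DF (sharp s)
my $that = "STRASSE";

# lc leaves the sharp s alone, so the two strings still differ.
my $same_by_lc = lc($this) eq lc($that) ? 1 : 0;

# fc folds the sharp s to "ss", so the two strings compare equal.
my $same_by_fc = fc($this) eq fc($that) ? 1 : 0;

say "lc: $same_by_lc, fc: $same_by_fc";
```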

  • fcntl FILEHANDLE,FUNCTION,SCALAR

    Implements the fcntl(2) function. You'll probably have to say

        use Fcntl;

    first to get the correct constant definitions. Argument processing and value returned work just like ioctl below. For example:

        use Fcntl;
        fcntl($filehandle, F_GETFL, $packed_return_buffer)
            or die "can't fcntl F_GETFL: $!";

    You don't have to check for defined on the return from fcntl. Like ioctl, it maps a 0 return from the system call into "0 but true" in Perl. This string is true in boolean context and 0 in numeric context. It is also exempt from the normal -w warnings on improper numeric conversions.

    Note that fcntl raises an exception if used on a machine that doesn't implement fcntl(2). See the Fcntl module or your fcntl(2) manpage to learn what functions are available on your system.
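
    As a rough sketch of reading status flags, the following opens a scratch file for appending and checks that O_APPEND shows up in the flags reported by F_GETFL. The temporary file is a stand-in for a real file, and the sketch assumes a Unix-like system where fcntl(2) is available.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL O_APPEND);
use File::Temp qw(tempfile);

# Scratch file (a stand-in for a real log file), reopened for append.
my (undef, $path) = tempfile(UNLINK => 1);
open(my $log, '>>', $path) or die "Can't open $path: $!";

# F_GETFL takes no real argument; 0 is passed as a placeholder.
my $flags = fcntl($log, F_GETFL, 0)
    or die "Can't fcntl F_GETFL: $!";

my $is_append = ($flags & O_APPEND) ? 1 : 0;
print "append mode: $is_append\n";
```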

    Here's an example of setting a filehandle named REMOTE to be non-blocking at the system level. You'll have to negotiate $| on your own, though.

        use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);
        $flags = fcntl(REMOTE, F_GETFL, 0)
            or die "Can't get flags for the socket: $!\n";
        $flags = fcntl(REMOTE, F_SETFL, $flags | O_NONBLOCK)
            or die "Can't set flags for the socket: $!\n";

    Portability issues: fcntl in perlport.

  • __FILE__

    A special token that returns the name of the file in which it occurs.

  • fileno FILEHANDLE

    Returns the file descriptor for a filehandle, or undefined if the filehandle is not open. If there is no real file descriptor at the OS level, as can happen with filehandles connected to memory objects via open with a reference for the third argument, -1 is returned.

    This is mainly useful for constructing bitmaps for select and low-level POSIX tty-handling operations. If FILEHANDLE is an expression, the value is taken as an indirect filehandle, generally its name.

    You can use this to find out whether two handles refer to the same underlying descriptor:

        if (fileno(THIS) == fileno(THAT)) {
            print "THIS and THAT are dups\n";
        }
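
    The two cases described for fileno can be sketched like this. The variable names are illustrative; the sketch assumes STDERR is open, as it normally is.

```perl
use strict;
use warnings;

# A handle opened on an in-memory scalar has no OS-level descriptor.
my $buffer = "some text";
open(my $mem, '<', \$buffer) or die "Can't open in-memory handle: $!";
my $mem_fd = fileno($mem);

# ">&=" makes a new Perl handle on the same descriptor (like fdopen(3)).
open(my $alias, '>&=', fileno(STDERR)) or die "Can't alias STDERR: $!";
my $same_fd = fileno($alias) == fileno(STDERR) ? 1 : 0;

print "in-memory fd: $mem_fd, alias shares descriptor: $same_fd\n";
```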
  • flock FILEHANDLE,OPERATION

    Calls flock(2), or an emulation of it, on FILEHANDLE. Returns true for success, false on failure. Produces a fatal error if used on a machine that doesn't implement flock(2), fcntl(2) locking, or lockf(3). flock is Perl's portable file-locking interface, although it locks entire files only, not records.

    Two potentially non-obvious but traditional flock semantics are that it waits indefinitely until the lock is granted, and that its locks are merely advisory. Such discretionary locks are more flexible, but offer fewer guarantees. This means that programs that do not also use flock may modify files locked with flock. See perlport, your port's specific documentation, and your system-specific local manpages for details. It's best to assume traditional behavior if you're writing portable programs. (But if you're not, you should as always feel perfectly free to write for your own system's idiosyncrasies (sometimes called "features"). Slavish adherence to portability concerns shouldn't get in the way of your getting your job done.)

    OPERATION is one of LOCK_SH, LOCK_EX, or LOCK_UN, possibly combined with LOCK_NB. These constants are traditionally valued 1, 2, 8 and 4, but you can use the symbolic names if you import them from the Fcntl module, either individually, or as a group using the :flock tag. LOCK_SH requests a shared lock, LOCK_EX requests an exclusive lock, and LOCK_UN releases a previously requested lock. If LOCK_NB is bitwise-or'ed with LOCK_SH or LOCK_EX, then flock returns immediately rather than blocking waiting for the lock; check the return status to see if you got it.
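
    A minimal sketch of a non-blocking lock attempt, using a temporary file as a stand-in for the resource being guarded. It assumes a filesystem on which flock works.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Scratch file standing in for whatever resource you want to guard.
my ($fh, $path) = tempfile(UNLINK => 1);

# LOCK_NB makes flock return false at once instead of waiting.
my $got_lock = flock($fh, LOCK_EX | LOCK_NB) ? 1 : 0;

if ($got_lock) {
    print {$fh} "exclusive access\n";
    flock($fh, LOCK_UN) or die "Cannot unlock $path: $!\n";
}
else {
    warn "$path is locked by someone else: $!\n";
}
```

    Checking the return value matters more with LOCK_NB than without it, because a false return is then an ordinary outcome rather than a rare failure.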

    To avoid the possibility of miscoordination, Perl now flushes FILEHANDLE before locking or unlocking it.

    Note that the emulation built with lockf(3) doesn't provide shared locks, and it requires that FILEHANDLE be open with write intent. These are the semantics that lockf(3) implements. Most if not all systems implement lockf(3) in terms of fcntl(2) locking, though, so the differing semantics shouldn't bite too many people.

    Note that the fcntl(2) emulation of flock(3) requires that FILEHANDLE be open with read intent to use LOCK_SH and requires that it be open with write intent to use LOCK_EX.

    Note also that some versions of flock cannot lock things over the network; you would need to use the more system-specific fcntl for that. If you like you can force Perl to ignore your system's flock(2) function, and so provide its own fcntl(2)-based emulation, by passing the switch -Ud_flock to the Configure program when you configure and build a new Perl.

    Here's a mailbox appender for BSD systems.

        # import LOCK_* and SEEK_END constants
        use Fcntl qw(:flock SEEK_END);

        sub lock {
            my ($fh) = @_;
            flock($fh, LOCK_EX) or die "Cannot lock mailbox - $!\n";
            # and, in case someone appended while we were waiting...
            seek($fh, 0, SEEK_END) or die "Cannot seek - $!\n";
        }

        sub unlock {
            my ($fh) = @_;
            flock($fh, LOCK_UN) or die "Cannot unlock mailbox - $!\n";
        }

        open(my $mbox, ">>", "/usr/spool/mail/$ENV{'USER'}")
            or die "Can't open mailbox: $!";

        lock($mbox);
        print $mbox $msg,"\n\n";
        unlock($mbox);

    On systems that support a real flock(2), locks are inherited across fork() calls, whereas those that must resort to the more capricious fcntl(2) function lose their locks, making it seriously harder to write servers.

    See also DB_File for other flock() examples.

    Portability issues: flock in perlport.

  • fork

    Does a fork(2) system call to create a new process running the same program at the same point. It returns the child pid to the parent process, 0 to the child process, or undef if the fork is unsuccessful. File descriptors (and sometimes locks on those descriptors) are shared, while everything else is copied. On most systems supporting fork(), great care has gone into making it extremely efficient (for example, using copy-on-write technology on data pages), making it the dominant paradigm for multitasking over the last few decades.

    Perl attempts to flush all files opened for output before forking the child process, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles to avoid duplicate output.

    If you fork without ever waiting on your children, you will accumulate zombies. On some systems, you can avoid this by setting $SIG{CHLD} to "IGNORE". See also perlipc for more examples of forking and reaping moribund children.
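
    The basic pattern of checking all three return values and then reaping the child might look like this. It is a sketch for systems with a real fork(2); the exit status 7 is an arbitrary value chosen for illustration.

```perl
use strict;
use warnings;

my $pid = fork();
die "Cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: do some work, then leave with a status the parent can read.
    exit 7;
}

# Parent: reap the child so that it does not linger as a zombie.
waitpid($pid, 0);
my $child_status = $? >> 8;
print "child $pid exited with status $child_status\n";
```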

    Note that if your forked child inherits system file descriptors like STDIN and STDOUT that are actually connected by a pipe or socket, even if you exit, then the remote server (such as, say, a CGI script or a backgrounded job launched from a remote shell) won't think you're done. You should reopen those to /dev/null if it's any issue.

    On some platforms such as Windows, where the fork() system call is not available, Perl can be built to emulate fork() in the Perl interpreter. The emulation is designed, at the level of the Perl program, to be as compatible as possible with the "Unix" fork(). However it has limitations that have to be considered in code intended to be portable. See perlfork for more details.

    Portability issues: fork in perlport.

  • format

    Declare a picture format for use by the write function. For example:

        format Something =
            Test: @<<<<<<<< @||||| @>>>>>
                  $str,     $%,    '$' . int($num)
        .

        $str = "widget";
        $num = $cost/$quantity;
        $~ = 'Something';
        write;

    See perlform for many details and examples.

  • formline PICTURE,LIST

    This is an internal function used by formats, though you may call it, too. It formats (see perlform) a list of values according to the contents of PICTURE, placing the output into the format output accumulator, $^A (or $ACCUMULATOR in English). Eventually, when a write is done, the contents of $^A are written to some filehandle. You could also read $^A and then set $^A back to "". Note that a format typically does one formline per line of form, but the formline function itself doesn't care how many newlines are embedded in the PICTURE. This means that the ~ and ~~ tokens treat the entire PICTURE as a single line. You may therefore need to use multiple formlines to implement a single record format, just like the format compiler.

    Be careful if you put double quotes around the picture, because an @ character may be taken to mean the beginning of an array name. formline always returns true. See perlform for other examples.

    If you are trying to use this instead of write to capture the output, you may find it easier to open a filehandle to a scalar (open $fh, ">", \$output) and write to that instead.
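
    Calling formline directly and reading the accumulator back can be sketched as follows. The picture and the sample values are made up for illustration; the picture is single-quoted so that the @ characters are not taken as array names.

```perl
use strict;
use warnings;

$^A = "";    # start from an empty accumulator

# One formline call per output line, as the format compiler would do.
formline('@<<<<< @>>>>' . "\n", "apple", 42);
formline('@<<<<< @>>>>' . "\n", "kiwi",  7);

my $report = $^A;    # collect what has been formatted so far
$^A = "";            # and leave the accumulator clean for the next user

print $report;
```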

  • getc FILEHANDLE
  • getc

    Returns the next character from the input file attached to FILEHANDLE, or the undefined value at end of file or if there was an error (in the latter case $! is set). If FILEHANDLE is omitted, reads from STDIN. This is not particularly efficient. However, it cannot be used by itself to fetch single characters without waiting for the user to hit enter. For that, try something more like:

        if ($BSD_STYLE) {
            system "stty cbreak </dev/tty >/dev/tty 2>&1";
        }
        else {
            system "stty", '-icanon', 'eol', "\001";
        }

        $key = getc(STDIN);

        if ($BSD_STYLE) {
            system "stty -cbreak </dev/tty >/dev/tty 2>&1";
        }
        else {
            system 'stty', 'icanon', 'eol', '^@'; # ASCII NUL
        }
        print "\n";

    Determination of whether $BSD_STYLE should be set is left as an exercise to the reader.

    The getattr and setattr methods of POSIX::Termios objects (see POSIX) can do this more portably on systems purporting POSIX compliance. See also the Term::ReadKey module from your nearest CPAN site; details on CPAN can be found under CPAN in perlmodlib.
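
    For the ordinary case of reading a handle one character at a time, with no terminal involved, a loop like the following is enough. The in-memory handle is an assumption made so that the sketch is self-contained; any readable filehandle behaves the same way.

```perl
use strict;
use warnings;

my $data = "ab\n";
open(my $in, '<', \$data) or die "Can't open in-memory handle: $!";

# getc returns undef at end of file, so test for definedness:
# a plain truth test would stop early on the character "0".
my @chars;
while (defined(my $ch = getc($in))) {
    push @chars, $ch;
}
close($in);

print scalar(@chars), " characters read\n";
```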

  • getlogin

    This implements the C library function of the same name, which on most systems returns the current login from /etc/utmp, if any. If it returns the empty string, use getpwuid.

        $login = getlogin || getpwuid($<) || "Kilroy";

    Do not consider getlogin for authentication: it is not as secure as getpwuid.

    Portability issues: getlogin in perlport.

  • getpeername SOCKET

    Returns the packed sockaddr address of the other end of the SOCKET connection.

        use Socket;
        $hersockaddr = getpeername(SOCK);
        ($port, $iaddr) = sockaddr_in($hersockaddr);
        $herhostname = gethostbyaddr($iaddr, AF_INET);
        $herstraddr = inet_ntoa($iaddr);
  • getpgrp PID

    Returns the current process group for the specified PID. Use a PID of 0 to get the current process group for the current process. Will raise an exception if used on a machine that doesn't implement getpgrp(2). If PID is omitted, returns the process group of the current process. Note that the POSIX version of getpgrp does not accept a PID argument, so only PID==0 is truly portable.

    Portability issues: getpgrp in perlport.

  • getppid

    Returns the process id of the parent process.

    Note for Linux users: Between v5.8.1 and v5.16.0 Perl would work around non-POSIX thread semantics on the minority of Linux systems (and Debian GNU/kFreeBSD systems) that used LinuxThreads. This emulation has since been removed. See the documentation for $$ for details.

    Portability issues: getppid in perlport.

  • getpriority WHICH,WHO

    Returns the current priority for a process, a process group, or a user. (See getpriority(2).) Will raise a fatal exception if used on a machine that doesn't implement getpriority(2).

    Portability issues: getpriority in perlport.

  • getpwnam NAME
  • getgrnam NAME
  • gethostbyname NAME
  • getnetbyname NAME
  • getprotobyname NAME
  • getpwuid UID
  • getgrgid GID
  • getservbyname NAME,PROTO
  • gethostbyaddr ADDR,ADDRTYPE
  • getnetbyaddr ADDR,ADDRTYPE
  • getprotobynumber NUMBER
  • getservbyport PORT,PROTO
  • getpwent
  • getgrent
  • gethostent
  • getnetent
  • getprotoent
  • getservent
  • setpwent
  • setgrent
  • sethostent STAYOPEN
  • setnetent STAYOPEN
  • setprotoent STAYOPEN
  • setservent STAYOPEN
  • endpwent
  • endgrent
  • endhostent
  • endnetent
  • endprotoent
  • endservent

    These routines are the same as their counterparts in the system C library. In list context, the return values from the various get routines are as follows:

        ($name,$passwd,$uid,$gid,
            $quota,$comment,$gcos,$dir,$shell,$expire) = getpw*
        ($name,$passwd,$gid,$members) = getgr*
        ($name,$aliases,$addrtype,$length,@addrs) = gethost*
        ($name,$aliases,$addrtype,$net) = getnet*
        ($name,$aliases,$proto) = getproto*
        ($name,$aliases,$port,$proto) = getserv*

    (If the entry doesn't exist you get an empty list.)

    The exact meaning of the $gcos field varies, but it usually contains the real name of the user (as opposed to the login name) and other information pertaining to the user. Beware, however, that on many systems users are able to change this information, so it cannot be trusted; the $gcos field is therefore tainted (see perlsec). The $passwd and $shell fields, the user's encrypted password and login shell, are also tainted, for the same reason.

    In scalar context, you get the name, unless the function was a lookup by name, in which case you get the other thing, whatever it is. (If the entry doesn't exist you get the undefined value.) For example:

        $uid = getpwnam($name);
        $name = getpwuid($num);
        $name = getpwent();
        $gid = getgrnam($name);
        $name = getgrgid($num);
        $name = getgrent();
        #etc.

    In getpw*() the fields $quota, $comment, and $expire are special in that they are unsupported on many systems. If the $quota is unsupported, it is an empty scalar. If it is supported, it usually encodes the disk quota. If the $comment field is unsupported, it is an empty scalar. If it is supported it usually encodes some administrative comment about the user. In some systems the $quota field may be $change or $age, fields that have to do with password aging. In some systems the $comment field may be $class. The $expire field, if present, encodes the expiration period of the account or the password. For the availability and the exact meaning of these fields in your system, please consult getpwnam(3) and your system's pwd.h file. You can also find out from within Perl what your $quota and $comment fields mean and whether you have the $expire field by using the Config module and the values d_pwquota, d_pwage, d_pwchange, d_pwcomment, and d_pwexpire.

    Shadow password files are supported only if your vendor has implemented them in the intuitive fashion that calling the regular C library routines gets the shadow versions if you're running under privilege or if there exist the shadow(3) functions as found in System V (this includes Solaris and Linux). Those systems that implement a proprietary shadow password facility are unlikely to be supported.
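
    Checking the Config values named above might be sketched like this. The %has hash and the output layout are illustrative; the sketch assumes that a supported feature is recorded in %Config as the string "define".

```perl
use strict;
use warnings;
use Config;

my @probes = qw(d_pwquota d_pwage d_pwchange d_pwcomment d_pwexpire);

# Record, for each probe, whether this perl was built with that field.
my %has;
for my $probe (@probes) {
    $has{$probe} =
        (defined $Config{$probe} && $Config{$probe} eq 'define') ? 1 : 0;
}

printf "%-12s %s\n", $_, $has{$_} ? "supported" : "not supported"
    for @probes;
```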

    The $members value returned by getgr*() is a space-separated list of the login names of the members of the group.

    For the gethost*() functions, if the h_errno variable is supported in C, it will be returned to you via $? if the function call fails. The @addrs value returned by a successful call is a list of raw addresses returned by the corresponding library call. In the Internet domain, each address is four bytes long; you can unpack it by saying something like:

        ($a,$b,$c,$d) = unpack('W4',$addrs[0]);

    The Socket library makes this slightly easier:

        use Socket;
        $iaddr = inet_aton("127.1"); # or whatever address
        $name = gethostbyaddr($iaddr, AF_INET);
        # or going the other way
        $straddr = inet_ntoa($iaddr);
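
    The two ways of decoding a raw address can be compared without touching the resolver at all. In this sketch the loopback address is an arbitrary choice, and inet_aton is used only to produce the four raw bytes that gethost*() would otherwise return.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

my $packed = inet_aton("127.0.0.1");    # four raw bytes

# Decoding by hand, as with an element of @addrs ...
my @octets  = unpack('W4', $packed);
my $by_hand = join('.', @octets);

# ... and letting the Socket module do the same job.
my $by_socket = inet_ntoa($packed);

print "$by_hand $by_socket\n";
```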

    In the opposite way, to resolve a hostname to the IP address you can write this:

        use Socket;
        $packed_ip = gethostbyname("www.perl.org");
        if (defined $packed_ip) {
            $ip_address = inet_ntoa($packed_ip);
        }

    Make sure gethostbyname() is called in SCALAR context and that its return value is checked for definedness.

    The getprotobynumber function, even though it only takes one argument, has the precedence of a list operator, so beware:

        getprotobynumber $number eq 'icmp'   # WRONG
        getprotobynumber($number eq 'icmp')  # actually means this
        getprotobynumber($number) eq 'icmp'  # better this way

    If you get tired of remembering which element of the return list contains which return value, by-name interfaces are provided in standard modules: File::stat, Net::hostent, Net::netent, Net::protoent, Net::servent, Time::gmtime, Time::localtime, User::grent, and User::pwent. These override the normal built-ins, supplying versions that return objects with the appropriate names for each field. For example:

        use File::stat;
        use User::pwent;
        $is_his = (stat($filename)->uid == getpw($whoever)->uid);

    Even though it looks as though they're the same method calls (uid), they aren't, because a File::stat object is different from a User::pwent object.

    Portability issues: getpwnam in perlport to endservent in perlport.

  • getsockname SOCKET

    Returns the packed sockaddr address of this end of the SOCKET connection, in case you don't know the address because you have several different IPs that the connection might have come in on.

        use Socket;
        $mysockaddr = getsockname(SOCK);
        ($port, $myaddr) = sockaddr_in($mysockaddr);
        printf "Connect to %s [%s]\n",
            scalar gethostbyaddr($myaddr, AF_INET),
            inet_ntoa($myaddr);
  • getsockopt SOCKET,LEVEL,OPTNAME

    Queries the option named OPTNAME associated with SOCKET at a given LEVEL. Options may exist at multiple protocol levels depending on the socket type, but at least the uppermost socket level SOL_SOCKET (defined in the Socket module) will exist. To query options at another level the protocol number of the appropriate protocol controlling the option should be supplied. For example, to indicate that an option is to be interpreted by the TCP protocol, LEVEL should be set to the protocol number of TCP, which you can get using getprotobyname.

    The function returns a packed string representing the requested socket option, or undef on error, with the reason for the error placed in $!. Just what is in the packed string depends on LEVEL and OPTNAME; consult getsockopt(2) for details. A common case is that the option is an integer, in which case the result is a packed integer, which you can decode using unpack with the i (or I) format.

    Here's an example to test whether Nagle's algorithm is enabled on a socket:

        use Socket qw(:all);

        defined(my $tcp = getprotobyname("tcp"))
            or die "Could not determine the protocol number for tcp";
        # my $tcp = IPPROTO_TCP; # Alternative
        my $packed = getsockopt($socket, $tcp, TCP_NODELAY)
            or die "getsockopt TCP_NODELAY: $!";
        my $nodelay = unpack("I", $packed);
        print "Nagle's algorithm is turned ",
            $nodelay ? "off\n" : "on\n";
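
    A query at the SOL_SOCKET level follows the same pattern. This sketch asks a freshly created, unconnected socket for its type; it assumes the process is allowed to create an Internet-domain socket.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM SOL_SOCKET SO_TYPE);

socket(my $sock, PF_INET, SOCK_STREAM, 0) or die "socket: $!";

# SO_TYPE is an integer option, so the packed result decodes with "i".
my $packed = getsockopt($sock, SOL_SOCKET, SO_TYPE)
    or die "getsockopt SO_TYPE: $!";
my $type = unpack("i", $packed);

print "this is a stream socket\n" if $type == SOCK_STREAM;
close($sock);
```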

    Portability issues: getsockopt in perlport.

  • glob EXPR
  • glob

    In list context, returns a (possibly empty) list of filename expansions on the value of EXPR such as the standard Unix shell /bin/csh would do. In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted. This is the internal function implementing the <*.c> operator, but you can use it directly. If EXPR is omitted, $_ is used. The <*.c> operator is discussed in more detail in I/O Operators in perlop.

    Note that glob splits its arguments on whitespace and treats each segment as a separate pattern. As such, glob("*.c *.h") matches all files with a .c or .h extension. The expression glob(".* *") matches all files in the current working directory. If you want to glob filenames that might contain whitespace, you'll have to use extra quotes around the spacey filename to protect it. For example, to glob filenames that have an e followed by a space followed by an f, use one of:

        @spacies = <"*e f*">;
        @spacies = glob '"*e f*"';
        @spacies = glob q("*e f*");

    If you had to get a variable through, you could do this:

        @spacies = glob "'*${var}e f*'";
        @spacies = glob qq("*${var}e f*");

    If non-empty braces are the only wildcard characters used in the glob, no filenames are matched, but potentially many strings are returned. For example, this produces nine strings, one for each pairing of fruits and colors:

        @many = glob "{apple,tomato,cherry}={green,yellow,red}";
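
    Run as a complete program, the brace expansion above can be inspected like this. No files need to exist, since braces are the only wildcard characters in the pattern.

```perl
use strict;
use warnings;

my @many  = glob "{apple,tomato,cherry}={green,yellow,red}";
my $count = @many;    # an array in scalar context gives its length

print "$_\n" for @many;
print "$count strings in all\n";
```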

    This operator is implemented using the standard File::Glob extension. See File::Glob for details, including bsd_glob which does not treat whitespace as a pattern separator.

    Portability issues: glob in perlport.

  • gmtime EXPR
  • gmtime

    Works just like localtime but the returned values are localized for the standard Greenwich time zone.

    Note: When called in list context, $isdst, the last value returned by gmtime, is always 0. There is no Daylight Saving Time in GMT.
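
    A small sketch of both calling contexts, using the epoch (time value 0) so that the result does not depend on when or where it is run:

```perl
use strict;
use warnings;

# List context: the same nine fields that localtime returns.
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = gmtime(0);

# $year counts from 1900 and $mon from 0, so adjust both for display.
my $date = sprintf("%04d-%02d-%02d", $year + 1900, $mon + 1, $mday);

# Scalar context: a ready-made string.
my $stamp = gmtime(0);

print "$date ($stamp)\n";
```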

    Portability issues: gmtime in perlport.

  • goto LABEL
  • goto EXPR
  • goto &NAME

    The goto-LABEL form finds the statement labeled with LABEL and resumes execution there. It can't be used to get out of a block or subroutine given to sort. It can be used to go almost anywhere else within the dynamic scope, including out of subroutines, but it's usually better to use some other construct such as last or die. The author of Perl has never felt the need to use this form of goto (in Perl, that is; C is another matter). (The difference is that C does not offer named loops combined with loop control. Perl does, and this replaces most structured uses of goto in other languages.)

    The goto-EXPR form expects a label name, whose scope will be resolved dynamically. This allows for computed gotos per FORTRAN, but isn't necessarily recommended if you're optimizing for maintainability:

        goto ("FOO", "BAR", "GLARCH")[$i];

    As shown in this example, goto-EXPR is exempt from the "looks like a function" rule. A pair of parentheses following it does not (necessarily) delimit its argument. goto("NE")."XT" is equivalent to goto NEXT. Also, unlike most named operators, this has the same precedence as assignment.

    Use of goto-LABEL or goto-EXPR to jump into a construct is deprecated and will issue a warning. Even then, it may not be used to go into any construct that requires initialization, such as a subroutine or a foreach loop. It also can't be used to go into a construct that is optimized away.

    The goto-&NAME form is quite different from the other forms of goto. In fact, it isn't a goto in the normal sense at all, and doesn't have the stigma associated with other gotos. Instead, it exits the current subroutine (losing any changes set by local()) and immediately calls in its place the named subroutine using the current value of @_. This is used by AUTOLOAD subroutines that wish to load another subroutine and then pretend that the other subroutine had been called in the first place (except that any modifications to @_ in the current subroutine are propagated to the other subroutine.) After the goto, not even caller will be able to tell that this routine was called first.

    NAME needn't be the name of a subroutine; it can be a scalar variable containing a code reference or a block that evaluates to a code reference.
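
    The AUTOLOAD use described above might be sketched as follows. The Counter class, its hits field, and the get_ naming scheme are all hypothetical, invented for this example; only the goto &$code step is the technique being shown.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

our $AUTOLOAD;

sub new { return bless { hits => 3 }, shift }

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;               # strip the package qualifier
    return if $name eq 'DESTROY';    # nothing to autoload at destruction
    die "No such method: $name\n" unless $name =~ /^get_(\w+)\z/;

    my $field = $1;
    my $code  = sub { my ($self) = @_; return $self->{$field} };

    # Install the accessor so that later calls bypass AUTOLOAD entirely.
    no strict 'refs';
    *{$AUTOLOAD} = $code;

    # Replace this AUTOLOAD frame with the new sub; @_ is passed along as is.
    goto &$code;
}

package main;

my $counter = Counter->new;
my $hits    = $counter->get_hits;
print "hits: $hits\n";
```

    Because of the goto, caller inside the generated accessor sees the original call site rather than AUTOLOAD.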

  • grep BLOCK LIST
  • grep EXPR,LIST

    This is similar in spirit to, but not the same as, grep(1) and its relatives. In particular, it is not limited to using regular expressions.

    Evaluates the BLOCK or EXPR for each element of LIST (locally setting $_ to each element) and returns the list value consisting of those elements for which the expression evaluated to true. In scalar context, returns the number of times the expression was true.

        @foo = grep(!/^#/, @bar);    # weed out comments

    or equivalently,

        @foo = grep {!/^#/} @bar;    # weed out comments

    Note that $_ is an alias to the list value, so it can be used to modify the elements of the LIST. While this is useful and supported, it can cause bizarre results if the elements of LIST are not variables. Similarly, grep returns aliases into the original list, much as a for loop's index variable aliases the list elements. That is, modifying an element of a list returned by grep (for example, in a foreach, map or another grep) actually modifies the element in the original list. This is usually something to be avoided when writing clear code.
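
    Both calling contexts, and the aliasing just described, can be seen in a few lines. The sample data is made up for illustration.

```perl
use strict;
use warnings;

my @lines = ("# header", "alpha", "# note", "beta");

my @code     = grep { !/^#/ } @lines;    # list context: the matching elements
my $comments = grep { /^#/ }  @lines;    # scalar context: how many matched

# The elements grep returns are aliases, so this loop changes @nums itself.
my @nums = (1, 2, 3);
$_ *= 10 for grep { $_ > 1 } @nums;

print "@code | $comments | @nums\n";
```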

    If $_ is lexical in the scope where the grep appears (because it has been declared with the deprecated my $_ construct) then, in addition to being locally aliased to the list elements, $_ keeps being lexical inside the block; i.e., it can't be seen from the outside, avoiding any potential side-effects.

    See also map for a list composed of the results of the BLOCK or EXPR.

  • hex EXPR
  • hex

    Interprets EXPR as a hex string and returns the corresponding value. (To convert strings that might start with either 0, 0x, or 0b, see oct.) If EXPR is omitted, uses $_.

        print hex '0xAf'; # prints '175'
        print hex 'aF';   # same

    Hex strings may only represent integers. Strings that would cause integer overflow trigger a warning. Leading whitespace is not stripped, unlike oct(). To present something as hex, look into printf, sprintf, and unpack.
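
    A short sketch of conversions in both directions. The values are arbitrary examples.

```perl
use strict;
use warnings;

my $from_hex   = hex("ff");        # digits are always read as hexadecimal
my $prefixed   = oct("0x1f");      # oct honours the 0x prefix ...
my $binary     = oct("0b101");     # ... and the 0b prefix ...
my $from_octal = oct("755");       # ... and otherwise reads octal

# Going the other way is a job for sprintf.
my $as_hex = sprintf("%x", 255);
my $padded = sprintf("0x%04X", 255);

print "$from_hex $prefixed $binary $from_octal $as_hex $padded\n";
```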

  • import LIST

    There is no builtin import function. It is just an ordinary method (subroutine) defined (or inherited) by modules that wish to export names to another module. The use function calls the import method for the package used. See also use, perlmod, and Exporter.

  • index STR,SUBSTR,POSITION
  • index STR,SUBSTR

    The index function searches for one string within another, but without the wildcard-like behavior of a full regular-expression pattern match. It returns the position of the first occurrence of SUBSTR in STR at or after POSITION. If POSITION is omitted, starts searching from the beginning of the string. POSITION before the beginning of the string or after its end is treated as if it were the beginning or the end, respectively. POSITION and the return value are based at zero. If the substring is not found, index returns -1.
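
    Because POSITION says where to start looking, repeated calls can walk through every occurrence. A minimal sketch, with an arbitrary sample string:

```perl
use strict;
use warnings;

my $text = "banana";
my @positions;
my $from = 0;

# Restart each search one character past the previous hit.
while ((my $at = index($text, "an", $from)) != -1) {
    push @positions, $at;
    $from = $at + 1;
}

my $missing = index($text, "xyz");    # not found

print "found at: @positions; missing gives $missing\n";
```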

  • int EXPR
  • int

    Returns the integer portion of EXPR. If EXPR is omitted, uses $_. You should not use this function for rounding: one because it truncates towards 0, and two because machine representations of floating-point numbers can sometimes produce counterintuitive results. For example, int(-6.725/0.025) produces -268 rather than the correct -269; that's because it's really more like -268.99999999999994315658 instead. Usually, the sprintf, printf, or the POSIX::floor and POSIX::ceil functions will serve you better than will int().
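
    The difference between truncating, flooring, and rounding can be sketched like this, assuming the usual IEEE double-precision floating point that the figures above are based on:

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);

my $truncated = int(-7.9);      # toward zero
my $floored   = floor(-7.9);    # toward negative infinity
my $ceiled    = ceil(7.1);      # toward positive infinity

# The case from the text: the quotient is just short of -269.
my $quotient = -6.725 / 0.025;
my $via_int  = int($quotient);
my $rounded  = sprintf("%.0f", $quotient);

print "$truncated $floored $ceiled $via_int $rounded\n";
```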

  • ioctl FILEHANDLE,FUNCTION,SCALAR

    Implements the ioctl(2) function. You'll probably first have to say

        require "sys/ioctl.ph";  # probably in
                                 # $Config{archlib}/sys/ioctl.ph

    to get the correct function definitions. If sys/ioctl.ph doesn't exist or doesn't have the correct definitions you'll have to roll your own, based on your C header files such as <sys/ioctl.h>. (There is a Perl script called h2ph that comes with the Perl kit that may help you in this, but it's nontrivial.) SCALAR will be read and/or written depending on the FUNCTION; a C pointer to the string value of SCALAR will be passed as the third argument of the actual ioctl call. (If SCALAR has no string value but does have a numeric value, that value will be passed rather than a pointer to the string value. To guarantee this to be true, add a 0 to the scalar before using it.) The pack and unpack functions may be needed to manipulate the values of structures used by ioctl.

    The return value of ioctl (and fcntl) is as follows:

        if OS returns:      then Perl returns:
            -1               undefined value
             0              string "0 but true"
        anything else           that number

    Thus Perl returns true on success and false on failure, yet you can still easily determine the actual value returned by the operating system:

        $retval = ioctl(...) || -1;
        printf "System returned %d\n", $retval;

    The special string "0 but true" is exempt from -w complaints about improper numeric conversions.
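
    The behavior of "0 but true" can be checked without making a real system call. In this sketch the string is assigned directly, standing in for what ioctl or fcntl would return; the warning handler is there only to show that no numeric-conversion warning is raised.

```perl
use strict;
use warnings;

my @seen_warnings;
local $SIG{__WARN__} = sub { push @seen_warnings, @_ };

my $ok = "0 but true";    # stand-in for a successful call that returned 0
my $failed;               # stand-in for a failed call (undefined value)

my $truth  = $ok ? "true" : "false";
my $number = $ok + 0;

# The "|| -1" idiom recovers the value the operating system returned.
my $os_success = sprintf("%d", $ok     || -1);
my $os_failure = sprintf("%d", $failed || -1);

print "$truth $number $os_success $os_failure\n";
```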

    Portability issues: ioctl in perlport.

  • join EXPR,LIST

    Joins the separate strings of LIST into a single string with fields separated by the value of EXPR, and returns that new string. Example:

        $rec = join(':', $login,$passwd,$uid,$gid,$gcos,$home,$shell);

    Beware that unlike split, join doesn't take a pattern as its first argument. Compare split.
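
    A small sketch of the contrast with split. The record fields are invented for illustration.

```perl
use strict;
use warnings;

my @fields = ("alice", "x", 1000, 1000, "Alice Example", "/home/alice", "/bin/sh");

my $rec  = join(':', @fields);    # separator is a plain string
my @back = split(/:/, $rec);      # split takes a pattern

# A "." separator is used literally by join; it is not a regex wildcard.
my $dotted = join('.', 1, 2, 3);

print "$rec\n$dotted\n";
```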

  • keys HASH
  • keys ARRAY
  • keys EXPR

    Called in list context, returns a list consisting of all the keys of the named hash, or in Perl 5.12 or later only, the indices of an array. Perl releases prior to 5.12 will produce a syntax error if you try to use an array argument. In scalar context, returns the number of keys or indices.

    Hash entries are returned in an apparently random order. The actual random order is specific to a given hash; the exact same series of operations on two hashes may result in a different order for each hash. Any insertion into the hash may change the order, as will any deletion, with the exception that the most recent key returned by each or keys may be deleted without changing the order. So long as a given hash is unmodified you may rely on keys, values and each to repeatedly return the same order as each other. See Algorithmic Complexity Attacks in perlsec for details on why hash order is randomized. Aside from the guarantees provided here the exact details of Perl's hash algorithm and the hash traversal order are subject to change in any release of Perl.

    As a side effect, calling keys() resets the internal iterator of the HASH or ARRAY (see each). In particular, calling keys() in void context resets the iterator with no other overhead.
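
    The counting and iterator-reset behavior can be sketched as follows. The %color hash is sample data; the comparison at the end relies only on the guarantee above that keys and values agree on order while the hash is unmodified.

```perl
use strict;
use warnings;

my %color = (apple => "red", banana => "yellow", cherry => "dark red");

my $count = keys %color;         # scalar context: number of keys

my ($first) = each %color;       # advance the iterator by one entry
keys %color;                     # void context: just reset the iterator
my ($again) = each %color;       # so each starts over from the beginning

# keys and values walk the hash in the same order.
my @k = keys %color;
my @v = values %color;
my $consistent = 1;
for my $i (0 .. $#k) {
    $consistent = 0 unless $color{ $k[$i] } eq $v[$i];
}

print "$count keys; iterator restarted at $again\n";
```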

    Here is yet another way to print your environment:

        @keys = keys %ENV;
        @values = values %ENV;
        while (@keys) {
            print pop(@keys), '=', pop(@values), "\n";
        }

    or how about sorted by key:

        foreach $key (sort(keys %ENV)) {
            print $key, '=', $ENV{$key}, "\n";
        }

    The returned values are copies of the original keys in the hash, so modifying them will not affect the original hash. Compare values.

    To sort a hash by value, you'll need to use a sort function. Here's a descending numeric sort of a hash by its values:

        foreach $key (sort { $hash{$b} <=> $hash{$a} } keys %hash) {
            printf "%4d %s\n", $hash{$key}, $key;
        }

    Used as an lvalue, keys allows you to increase the number of hash buckets allocated for the given hash. This can gain you a measure of efficiency if you know the hash is going to get big. (This is similar to pre-extending an array by assigning a larger number to $#array.) If you say

        keys %hash = 200;

    then %hash will have at least 200 buckets allocated for it--256 of them, in fact, since it rounds up to the next power of two. These buckets will be retained even if you do %hash = (); use undef %hash if you want to free the storage while %hash is still in scope. You can't shrink the number of buckets allocated for the hash using keys in this way (but you needn't worry about doing this by accident, as trying has no effect). keys @array in an lvalue context is a syntax error.

    Starting with Perl 5.14, keys can take a scalar EXPR, which must contain a reference to an unblessed hash or array. The argument will be dereferenced automatically. This aspect of keys is considered highly experimental. The exact behaviour may change in a future version of Perl.

        for (keys $hashref) { ... }
        for (keys $obj->get_arrayref) { ... }

    To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

        use 5.012; # so keys/values/each work on arrays
        use 5.014; # so keys/values/each work on scalars (experimental)

    See also each, values, and sort.

  • kill SIGNAL, LIST
  • kill SIGNAL

    Sends a signal to a list of processes. Returns the number of processes successfully signaled (which is not necessarily the same as the number actually killed).

        $cnt = kill 'HUP', $child1, $child2;
        kill 'KILL', @goners;

    SIGNAL may be either a signal name (a string) or a signal number. A signal name may start with a SIG prefix, thus FOO and SIGFOO refer to the same signal. The string form of SIGNAL is recommended for portability because the same signal may have different numbers in different operating systems.

    A list of signal names supported by the current platform can be found in $Config{sig_name} , which is provided by the Config module. See Config for more details.

    A negative signal name is the same as a negative signal number, killing process groups instead of processes. For example, kill '-KILL', $pgrp and kill -9, $pgrp will send SIGKILL to the entire process group specified. That means you usually want to use positive not negative signals.

    If SIGNAL is either the number 0 or the string ZERO (or SIGZERO), no signal is sent to the process, but kill checks whether it's possible to send a signal to it (that means, to be brief, that the process is owned by the same user, or we are the super-user). This is useful to check that a child process is still alive (even if only as a zombie) and hasn't changed its UID. See perlport for notes on the portability of this construct.
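
    A minimal sketch of such a liveness check follows. The helper name process_exists is hypothetical, not a builtin, and the check is only as portable as signal 0 itself.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: true if we could signal $pid, without sending anything.
    # A false result can also mean the process exists but belongs to another user.
    sub process_exists {
        my ($pid) = @_;
        return kill(0, $pid) ? 1 : 0;
    }

    my $self_alive = process_exists($$);   # our own process always qualifies
    print "process $$ is ", ($self_alive ? "alive" : "gone"), "\n";
    ```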

    The behavior of kill when a PROCESS number is zero or negative depends on the operating system. For example, on POSIX-conforming systems, zero will signal the current process group, -1 will signal all processes, and any other negative PROCESS number will act as a negative signal number and kill the entire process group specified.

    If both the SIGNAL and the PROCESS are negative, the results are undefined. A warning may be produced in a future version.

    See Signals in perlipc for more details.

    On some platforms such as Windows, where the fork() system call is not available, Perl can be built to emulate fork() at the interpreter level. This emulation has limitations related to kill that have to be considered, both for code running on Windows and for code intended to be portable.

    See perlfork for more details.

    If there is no LIST of processes, no signal is sent, and the return value is 0. This form is sometimes used, however, because it causes tainting checks to be run. But see Laundering and Detecting Tainted Data in perlsec.

    Portability issues: kill in perlport.

  • last LABEL
  • last EXPR
  • last

    The last command is like the break statement in C (as used in loops); it immediately exits the loop in question. If the LABEL is omitted, the command refers to the innermost enclosing loop. The last EXPR form, available starting in Perl 5.18.0, allows a label name to be computed at run time, and is otherwise identical to last LABEL . The continue block, if any, is not executed:

        LINE: while (<STDIN>) {
            last LINE if /^$/; # exit when done with header
            #...
        }

    last cannot be used to exit a block that returns a value such as eval {} , sub {} , or do {} , and should not be used to exit a grep() or map() operation.

    Note that a block by itself is semantically identical to a loop that executes once. Thus last can be used to effect an early exit out of such a block.
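
    Both uses can be sketched together: a labeled last that leaves nested loops, and a plain last that leaves a bare block early. The data and variable names are invented for the example.

    ```perl
    use strict;
    use warnings;

    my @grid = ([1, 2, 3], [4, 5, 6], [7, 8, 9]);
    my $found;

    # A label lets last leave the outer loop from inside the inner one.
    ROW: for my $row (@grid) {
        for my $cell (@$row) {
            if ($cell == 5) {
                $found = $cell;
                last ROW;
            }
        }
    }

    # A bare block behaves like a loop that runs once, so last exits it early.
    my $status = 'unset';
    {
        $status = 'started';
        last if defined $found;
        $status = 'finished';    # skipped because $found is defined
    }

    print "found=$found status=$status\n";
    ```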

    See also continue for an illustration of how last, next, and redo work.

    Unlike most named operators, this has the same precedence as assignment. It is also exempt from the looks-like-a-function rule, so last ("foo")."bar" will cause "bar" to be part of the argument to last.

  • lc EXPR
  • lc

    Returns a lowercased version of EXPR. This is the internal function implementing the \L escape in double-quoted strings.

    If EXPR is omitted, uses $_ .

    What gets returned depends on several factors:

    • If use bytes is in effect:

      The results follow ASCII semantics. Only characters A-Z change, to a-z respectively.

    • Otherwise, if use locale (but not use locale ':not_characters' ) is in effect:

      Respects current LC_CTYPE locale for code points < 256; and uses Unicode semantics for the remaining code points (this last can only happen if the UTF8 flag is also set). See perllocale.

      A deficiency in this is that case changes that cross the 255/256 boundary are not well-defined. For example, the lower case of LATIN CAPITAL LETTER SHARP S (U+1E9E) in Unicode semantics is U+00DF (on ASCII platforms). But under use locale , the lower case of U+1E9E is itself, because 0xDF may not be LATIN SMALL LETTER SHARP S in the current locale, and Perl has no way of knowing if that character even exists in the locale, much less what code point it is. Perl returns the input character unchanged, for all instances (and there aren't many) where the 255/256 boundary would otherwise be crossed.

    • Otherwise, if EXPR has the UTF8 flag set:

      Unicode semantics are used for the case change.

    • Otherwise, if use feature 'unicode_strings' or use locale ':not_characters' is in effect:

      Unicode semantics are used for the case change.

    • Otherwise:

      ASCII semantics are used for the case change. The lowercase of any character outside the ASCII range is the character itself.
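
    The last two cases can be compared with a short sketch. It assumes an ASCII platform, no use locale, and a string without the UTF8 flag; the variable names are invented for the example.

    ```perl
    use strict;
    use warnings;

    my $s = "\xC9";   # LATIN CAPITAL LETTER E WITH ACUTE, stored as a single byte

    my ($ascii_rules, $unicode_rules);
    {
        no feature 'unicode_strings';
        $ascii_rules = lc $s;      # outside ASCII, so left unchanged
    }
    {
        use feature 'unicode_strings';
        $unicode_rules = lc $s;    # Unicode semantics: becomes "\xE9"
    }

    printf "ascii rules: %02X, unicode rules: %02X\n",
        ord $ascii_rules, ord $unicode_rules;
    ```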

  • lcfirst EXPR
  • lcfirst

    Returns the value of EXPR with the first character lowercased. This is the internal function implementing the \l escape in double-quoted strings.

    If EXPR is omitted, uses $_ .

    This function behaves the same way under various pragmata, such as in a locale, as lc does.

  • length EXPR
  • length

    Returns the length in characters of the value of EXPR. If EXPR is omitted, returns the length of $_ . If EXPR is undefined, returns undef.

    This function cannot be used on an entire array or hash to find out how many elements these have. For that, use scalar @array and scalar keys %hash , respectively.

    Like all Perl character operations, length() normally deals in logical characters, not physical bytes. For how many bytes a string encoded as UTF-8 would take up, use length(Encode::encode_utf8(EXPR)) (you'll have to use Encode first). See Encode and perlunicode.
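
    A short sketch of the difference between the two counts, using a sample string chosen for the example:

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode_utf8);

    # "caf" plus e-acute (U+00E9) plus WHITE SMILING FACE (U+263A)
    my $str = "caf\x{e9}\x{263A}";

    my $chars = length $str;                 # logical characters
    my $bytes = length encode_utf8($str);    # bytes once encoded as UTF-8

    print "$chars characters, $bytes bytes\n";
    ```

    The ASCII letters take one byte each, U+00E9 takes two, and U+263A takes three.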

  • __LINE__

    A special token that compiles to the current line number.

  • link OLDFILE,NEWFILE

    Creates a new filename linked to the old filename. Returns true for success, false otherwise.

    Portability issues: link in perlport.

  • listen SOCKET,QUEUESIZE

    Does the same thing that the listen(2) system call does. Returns true if it succeeded, false otherwise. See the example in Sockets: Client/Server Communication in perlipc.

  • local EXPR

    You really probably want to be using my instead, because local isn't what most people think of as "local". See Private Variables via my() in perlsub for details.

    A local modifies the listed variables to be local to the enclosing block, file, or eval. If more than one value is listed, the list must be placed in parentheses. See Temporary Values via local() in perlsub for details, including issues with tied arrays and hashes.

    The delete local EXPR construct can also be used to localize the deletion of array/hash elements to the current block. See Localized deletion of elements of composite types in perlsub.

  • localtime EXPR
  • localtime

    Converts a time as returned by the time function to a 9-element list with the time analyzed for the local time zone. Typically used as follows:

        #  0    1    2     3     4    5     6     7     8
        ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
                                                    localtime(time);

    All list elements are numeric and come straight out of the C `struct tm'. $sec , $min , and $hour are the seconds, minutes, and hours of the specified time.

    $mday is the day of the month and $mon the month in the range 0..11 , with 0 indicating January and 11 indicating December. This makes it easy to get a month name from a list:

        my @abbr = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
        print "$abbr[$mon] $mday";
        # $mon=9, $mday=18 gives "Oct 18"

    $year contains the number of years since 1900. To get a 4-digit year write:

        $year += 1900;

    To get the last two digits of the year (e.g., "01" in 2001) do:

        $year = sprintf("%02d", $year % 100);

    $wday is the day of the week, with 0 indicating Sunday and 3 indicating Wednesday. $yday is the day of the year, in the range 0..364 (or 0..365 in leap years.)

    $isdst is true if the specified time occurs during Daylight Saving Time, false otherwise.
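
    Putting the offsets together, here is a minimal sketch that formats a date as YYYY-MM-DD. The helper name iso_date is hypothetical; the point is the +1900 and +1 adjustments described above.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: format an epoch time as a local YYYY-MM-DD date.
    sub iso_date {
        my ($epoch) = @_;
        my ($mday, $mon, $year) = (localtime $epoch)[3, 4, 5];
        # $year counts from 1900 and $mon from 0, so adjust both.
        return sprintf "%04d-%02d-%02d", $year + 1900, $mon + 1, $mday;
    }

    print iso_date(time), "\n";
    ```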

    If EXPR is omitted, localtime() uses the current time (as returned by time(3)).

    In scalar context, localtime() returns the ctime(3) value:

        $now_string = localtime; # e.g., "Thu Oct 13 04:54:34 1994"

    The format of this scalar value is not locale-dependent but built into Perl. For GMT instead of local time use the gmtime builtin. See also the Time::Local module (for converting seconds, minutes, hours, and such back to the integer value returned by time()), and the POSIX module's strftime(3) and mktime(3) functions.

    To get somewhat similar but locale-dependent date strings, set up your locale environment variables appropriately (please see perllocale) and try for example:

        use POSIX qw(strftime);
        $now_string = strftime "%a %b %e %H:%M:%S %Y", localtime;
        # or for GMT formatted appropriately for your locale:
        $now_string = strftime "%a %b %e %H:%M:%S %Y", gmtime;

    Note that the %a and %b , the short forms of the day of the week and the month of the year, may not necessarily be three characters wide.

    The Time::gmtime and Time::localtime modules provide a convenient, by-name access mechanism to the gmtime() and localtime() functions, respectively.

    For a comprehensive date and time representation look at the DateTime module on CPAN.

    Portability issues: localtime in perlport.

  • lock THING

    This function places an advisory lock on a shared variable or referenced object contained in THING until the lock goes out of scope.

    The value returned is the scalar itself, if the argument is a scalar, or a reference, if the argument is a hash, array or subroutine.

    lock() is a "weak keyword" : this means that if you've defined a function by this name (before any calls to it), that function will be called instead. If you are not under use threads::shared this does nothing. See threads::shared.

  • log EXPR
  • log

    Returns the natural logarithm (base e) of EXPR. If EXPR is omitted, returns the log of $_ . To get the log of another base, use basic algebra: The base-N log of a number is equal to the natural log of that number divided by the natural log of N. For example:

        sub log10 {
            my $n = shift;
            return log($n)/log(10);
        }

    See also exp for the inverse operation.
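
    The same identity generalizes to any base. The helper name log_base below is hypothetical, and the result is subject to ordinary floating-point rounding.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: logarithm of $n in an arbitrary base.
    sub log_base {
        my ($n, $base) = @_;
        return log($n) / log($base);
    }

    my $bits = log_base(1024, 2);   # approximately 10
    printf "log2(1024) is about %.3f\n", $bits;
    ```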

  • lstat FILEHANDLE
  • lstat EXPR
  • lstat DIRHANDLE
  • lstat

    Does the same thing as the stat function (including setting the special _ filehandle) but stats a symbolic link instead of the file the symbolic link points to. If symbolic links are unimplemented on your system, a normal stat is done. For much more detailed information, please see the documentation for stat.

    If EXPR is omitted, stats $_ .

    Portability issues: lstat in perlport.

  • m//

    The match operator. See Regexp Quote-Like Operators in perlop.

  • map BLOCK LIST
  • map EXPR,LIST

    Evaluates the BLOCK or EXPR for each element of LIST (locally setting $_ to each element) and returns the list value composed of the results of each such evaluation. In scalar context, returns the total number of elements so generated. Evaluates BLOCK or EXPR in list context, so each element of LIST may produce zero, one, or more elements in the returned value.

        @chars = map(chr, @numbers);

    translates a list of numbers to the corresponding characters.

        my @squares = map { $_ * $_ } @numbers;

    translates a list of numbers to their squared values.

        my @squares = map { $_ > 5 ? ($_ * $_) : () } @numbers;

    shows that the number of returned elements can differ from the number of input elements. To omit an element, return an empty list (). This could also be achieved by writing

        my @squares = map { $_ * $_ } grep { $_ > 5 } @numbers;

    which makes the intention more clear.

    Map always returns a list, which can be assigned to a hash such that the elements become key/value pairs. See perldata for more details.

        %hash = map { get_a_key_for($_) => $_ } @array;

    is just a funny way to write

        %hash = ();
        foreach (@array) {
            $hash{get_a_key_for($_)} = $_;
        }

    Note that $_ is an alias to the list value, so it can be used to modify the elements of the LIST. While this is useful and supported, it can cause bizarre results if the elements of LIST are not variables. Using a regular foreach loop for this purpose would be clearer in most cases. See also grep for an array composed of those items of the original list for which the BLOCK or EXPR evaluates to true.
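
    A minimal sketch of the two styles, with invented sample data: map builds a new list and leaves its input alone, while a foreach-style loop that assigns to $_ changes the original array through the alias.

    ```perl
    use strict;
    use warnings;

    my @nums = (1, 2, 3);

    # map returns a new list; @nums is not modified here.
    my @doubled = map { $_ * 2 } @nums;
    my $before  = "@nums";

    # Assigning to the aliased $_ changes @nums in place.
    $_ *= 2 for @nums;

    print "doubled: @doubled; nums was $before, is now @nums\n";
    ```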

    If $_ is lexical in the scope where the map appears (because it has been declared with the deprecated my $_ construct), then, in addition to being locally aliased to the list elements, $_ keeps being lexical inside the block; that is, it can't be seen from the outside, avoiding any potential side-effects.

    { starts both hash references and blocks, so map { ... could be either the start of map BLOCK LIST or map EXPR, LIST. Because Perl doesn't look ahead for the closing } it has to take a guess at which it's dealing with based on what it finds just after the {. Usually it gets it right, but if it doesn't it won't realize something is wrong until it gets to the } and encounters the missing (or unexpected) comma. The syntax error will be reported close to the }, but you'll need to change something near the { such as using a unary + to give Perl some help:

        %hash = map {  "\L$_" => 1  } @array # perl guesses EXPR. wrong
        %hash = map { +"\L$_" => 1  } @array # perl guesses BLOCK. right
        %hash = map { ("\L$_" => 1) } @array # this also works
        %hash = map {  lc($_) => 1  } @array # as does this.
        %hash = map +( lc($_) => 1 ), @array # this is EXPR and works!
        %hash = map  ( lc($_), 1 ),   @array # evaluates to (1, @array)

    or to force an anon hash constructor use +{:

        @hashes = map +{ lc($_) => 1 }, @array # EXPR, so needs
                                               # comma at end

    to get a list of anonymous hashes each with only one entry apiece.

  • mkdir FILENAME,MASK
  • mkdir FILENAME
  • mkdir

    Creates the directory specified by FILENAME, with permissions specified by MASK (as modified by umask). If it succeeds it returns true; otherwise it returns false and sets $! (errno). MASK defaults to 0777 if omitted, and FILENAME defaults to $_ if omitted.

    In general, it is better to create directories with a permissive MASK and let the user modify that with their umask than it is to supply a restrictive MASK and give the user no way to be more permissive. The exceptions to this rule are when the file or directory should be kept private (mail files, for instance). The perlfunc(1) entry on umask discusses the choice of MASK in more detail.

    Note that according to POSIX 1003.1-1996, FILENAME may have any number of trailing slashes. Some operating systems and filesystems do not get this right, so Perl automatically removes all trailing slashes to keep everyone happy.

    To recursively create a directory structure, look at the mkpath function of the File::Path module.
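
    A minimal sketch of recursive creation follows. It uses make_path, the newer interface that File::Path provides alongside mkpath, and works inside a temporary directory so nothing is left behind; the directory names are invented for the example.

    ```perl
    use strict;
    use warnings;
    use File::Path qw(make_path);
    use File::Spec;
    use File::Temp qw(tempdir);

    my $base = tempdir(CLEANUP => 1);                  # removed at exit
    my $deep = File::Spec->catdir($base, qw(a b c));   # three levels to create

    make_path($deep);    # creates a, a/b, and a/b/c as needed
    my $created = -d $deep ? 1 : 0;

    print "created $deep\n" if $created;
    ```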

  • msgctl ID,CMD,ARG

    Calls the System V IPC function msgctl(2). You'll probably have to say

        use IPC::SysV;

    first to get the correct constant definitions. If CMD is IPC_STAT , then ARG must be a variable that will hold the returned msqid_ds structure. Returns like ioctl: the undefined value for error, "0 but true" for zero, or the actual return value otherwise. See also SysV IPC in perlipc and the documentation for IPC::SysV and IPC::Semaphore .

    Portability issues: msgctl in perlport.

  • msgget KEY,FLAGS

    Calls the System V IPC function msgget(2). Returns the message queue id, or undef on error. See also SysV IPC in perlipc and the documentation for IPC::SysV and IPC::Msg .

    Portability issues: msgget in perlport.

  • msgrcv ID,VAR,SIZE,TYPE,FLAGS

    Calls the System V IPC function msgrcv to receive a message from message queue ID into variable VAR with a maximum message size of SIZE. Note that when a message is received, the message type as a native long integer will be the first thing in VAR, followed by the actual message. This packing may be opened with unpack("l! a*"). Taints the variable. Returns true if successful, false on error. See also SysV IPC in perlipc and the documentation for IPC::SysV and IPC::Msg.

    Portability issues: msgrcv in perlport.

  • msgsnd ID,MSG,FLAGS

    Calls the System V IPC function msgsnd to send the message MSG to the message queue ID. MSG must begin with the native long integer message type, followed by the message itself. This kind of packing can be achieved with pack("l! a*", $type, $message). Returns true if successful, false on error. See also the IPC::SysV and IPC::Msg documentation.
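
    The packing can be tried without touching a real message queue. This sketch only round-trips a buffer through pack and unpack; the type and payload are invented for the example.

    ```perl
    use strict;
    use warnings;

    # Build a buffer in the layout msgsnd expects: native long type, then the payload.
    my $buf = pack("l! a*", 42, "hello");

    # msgrcv hands back the same layout, so unpack recovers both parts.
    my ($type, $message) = unpack("l! a*", $buf);

    print "type=$type message=$message\n";
    ```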

    Portability issues: msgsnd in perlport.

  • my EXPR
  • my TYPE EXPR
  • my EXPR : ATTRS
  • my TYPE EXPR : ATTRS

    A my declares the listed variables to be local (lexically) to the enclosing block, file, or eval. If more than one value is listed, the list must be placed in parentheses.

    The exact semantics and interface of TYPE and ATTRS are still evolving. TYPE is currently bound to the use of the fields pragma, and attributes are handled using the attributes pragma, or starting from Perl 5.8.0 also via the Attribute::Handlers module. See Private Variables via my() in perlsub for details, and fields, attributes, and Attribute::Handlers.

  • next LABEL
  • next EXPR
  • next

    The next command is like the continue statement in C; it starts the next iteration of the loop:

        LINE: while (<STDIN>) {
            next LINE if /^#/; # discard comments
            #...
        }

    Note that if there were a continue block on the above, it would get executed even on discarded lines. If LABEL is omitted, the command refers to the innermost enclosing loop. The next EXPR form, available as of Perl 5.18.0, allows a label name to be computed at run time, being otherwise identical to next LABEL .

    next cannot be used to exit a block which returns a value such as eval {} , sub {} , or do {} , and should not be used to exit a grep() or map() operation.

    Note that a block by itself is semantically identical to a loop that executes once. Thus next will exit such a block early.

    See also continue for an illustration of how last, next, and redo work.

    Unlike most named operators, this has the same precedence as assignment. It is also exempt from the looks-like-a-function rule, so next ("foo")."bar" will cause "bar" to be part of the argument to next.

  • no MODULE VERSION LIST
  • no MODULE VERSION
  • no MODULE LIST
  • no MODULE
  • no VERSION

    See the use function, of which no is the opposite.

  • oct EXPR
  • oct

    Interprets EXPR as an octal string and returns the corresponding value. (If EXPR happens to start off with 0x , interprets it as a hex string. If EXPR starts off with 0b, it is interpreted as a binary string. Leading whitespace is ignored in all three cases.) The following will handle decimal, binary, octal, and hex in standard Perl notation:

        $val = oct($val) if $val =~ /^0/;

    If EXPR is omitted, uses $_ . To go the other way (produce a number in octal), use sprintf() or printf():

        $dec_perms = (stat("filename"))[2] & 07777;
        $oct_perm_str = sprintf "%o", $dec_perms;

    The oct() function is commonly used when a string such as 644 needs to be converted into a file mode, for example. Although Perl automatically converts strings into numbers as needed, this automatic conversion assumes base 10.

    Leading white space is ignored without warning, as too are any trailing non-digits, such as a decimal point (oct only handles non-negative integers, not negative integers or floating point).
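
    A small sketch that combines both directions. The helper name to_number is hypothetical; it applies the leading-zero test shown above and otherwise treats the string as decimal.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: accept decimal, octal, hex, or binary notation.
    sub to_number {
        my ($text) = @_;
        return $text =~ /^0/ ? oct($text) : $text + 0;
    }

    my @values = map { to_number($_) } qw(42 0755 0x1f 0b101);

    # Convert a mode given as a string of octal digits, then print it back in octal.
    my $mode_string = sprintf "%04o", oct("644");

    print "@values $mode_string\n";
    ```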

  • open FILEHANDLE,EXPR
  • open FILEHANDLE,MODE,EXPR
  • open FILEHANDLE,MODE,EXPR,LIST
  • open FILEHANDLE,MODE,REFERENCE
  • open FILEHANDLE

    Opens the file whose filename is given by EXPR, and associates it with FILEHANDLE.

    Simple examples to open a file for reading:

        open(my $fh, "<", "input.txt")
            or die "cannot open < input.txt: $!";

    and for writing:

        open(my $fh, ">", "output.txt")
            or die "cannot open > output.txt: $!";

    (The following is a comprehensive reference to open(): for a gentler introduction you may consider perlopentut.)

    If FILEHANDLE is an undefined scalar variable (or array or hash element), a new filehandle is autovivified, meaning that the variable is assigned a reference to a newly allocated anonymous filehandle. Otherwise if FILEHANDLE is an expression, its value is the real filehandle. (This is considered a symbolic reference, so use strict "refs" should not be in effect.)

    If EXPR is omitted, the global (package) scalar variable of the same name as the FILEHANDLE contains the filename. (Note that lexical variables--those declared with my or state--will not work for this purpose; so if you're using my or state, specify EXPR in your call to open.)

    If three (or more) arguments are specified, the open mode (including optional encoding) in the second argument is distinct from the filename in the third. If MODE is < or nothing, the file is opened for input. If MODE is >, the file is opened for output, with existing files first being truncated ("clobbered") and nonexisting files newly created. If MODE is >>, the file is opened for appending, again being created if necessary.

    You can put a + in front of the > or < to indicate that you want both read and write access to the file; thus +< is almost always preferred for read/write updates--the +> mode would clobber the file first. You can't usually use either read-write mode for updating textfiles, since they have variable-length records. See the -i switch in perlrun for a better approach. The file is created with permissions of 0666 modified by the process's umask value.

    These various prefixes correspond to the fopen(3) modes of r , r+ , w , w+ , a , and a+ .

    In the one- and two-argument forms of the call, the mode and filename should be concatenated (in that order), preferably separated by white space. You can--but shouldn't--omit the mode in these forms when that mode is < . It is always safe to use the two-argument form of open if the filename argument is a known literal.

    For three or more arguments if MODE is |-, the filename is interpreted as a command to which output is to be piped, and if MODE is -|, the filename is interpreted as a command that pipes output to us. In the two-argument (and one-argument) form, one should replace dash (- ) with the command. See Using open() for IPC in perlipc for more examples of this. (You are not allowed to open to a command that pipes both in and out, but see IPC::Open2, IPC::Open3, and Bidirectional Communication with Another Process in perlipc for alternatives.)

    In the form of pipe opens taking three or more arguments, if LIST is specified (extra arguments after the command name) then LIST becomes arguments to the command invoked if the platform supports it. The meaning of open with more than three arguments for non-pipe modes is not yet defined, but experimental "layers" may give extra LIST arguments meaning.
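
    A minimal sketch of the list form, assuming a platform that supports it (Unix-like systems do). It runs the current interpreter, $^X, as the child, and the one-line child program is invented for the example. Because the arguments are passed as a list, no shell is involved.

    ```perl
    use strict;
    use warnings;

    # MODE is "-|", so we read what the command writes to its STDOUT.
    open(my $pipe, "-|", $^X, "-e", 'print "hello from child\n"')
        or die "cannot start child: $!";

    my $line = <$pipe>;
    close($pipe) or die "child failed with status $?";

    print $line;
    ```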

    In the two-argument (and one-argument) form, opening <- or - opens STDIN and opening >- opens STDOUT.

    You may (and usually should) use the three-argument form of open to specify I/O layers (sometimes referred to as "disciplines") to apply to the handle that affect how the input and output are processed (see open and PerlIO for more details). For example:

        open(my $fh, "<:encoding(UTF-8)", "filename")
            || die "can't open UTF-8 encoded filename: $!";

    opens the UTF8-encoded file containing Unicode characters; see perluniintro. Note that if layers are specified in the three-argument form, then default layers stored in ${^OPEN} (see perlvar; usually set by the open pragma or the switch -CioD) are ignored. Those layers will also be ignored if you specify a colon with no name following it. In that case the default layer for the operating system (:raw on Unix, :crlf on Windows) is used.

    Open returns nonzero on success, the undefined value otherwise. If the open involved a pipe, the return value happens to be the pid of the subprocess.

    If you're running Perl on a system that distinguishes between text files and binary files, then you should check out binmode for tips for dealing with this. The key distinction between systems that need binmode and those that don't is their text file formats. Systems like Unix, Mac OS, and Plan 9, that end lines with a single character and encode that character in C as "\n" do not need binmode. The rest need it.

    When opening a file, it's seldom a good idea to continue if the request failed, so open is frequently used with die. Even if die won't do what you want (say, in a CGI script, where you want to format a suitable error message (but there are modules that can help with that problem)) always check the return value from opening a file.

    As a special case the three-argument form with a read/write mode and the third argument being undef:

        open(my $tmp, "+>", undef) or die ...

    opens a filehandle to an anonymous temporary file. Also using +< works for symmetry, but you really should consider writing something to the temporary file first. You will need to seek() to do the reading.
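
    A minimal sketch of the write, seek, read cycle on an anonymous temporary file, with sample content invented for the example:

    ```perl
    use strict;
    use warnings;

    open(my $tmp, "+>", undef) or die "cannot open temporary file: $!";

    print {$tmp} "line one\n", "line two\n";

    # Rewind to the start before reading back what was written.
    seek($tmp, 0, 0) or die "cannot seek: $!";
    my @lines = <$tmp>;
    close $tmp;

    print scalar(@lines), " lines read back\n";
    ```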

    Perl is built using PerlIO by default. Unless you've changed this (such as building Perl with Configure -Uuseperlio), you can open filehandles directly to Perl scalars via:

        open($fh, ">", \$variable) || ..

    To (re)open STDOUT or STDERR as an in-memory file, close it first:

        close STDOUT;
        open(STDOUT, ">", \$variable)
            or die "Can't open STDOUT: $!";
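
    In-memory files work for reading as well as writing. This sketch, which assumes a PerlIO-enabled Perl and uses invented sample data, writes into a scalar through one lexical handle and reads it back through another.

    ```perl
    use strict;
    use warnings;

    my $buffer = '';

    # Write into the scalar as if it were a file.
    open(my $out, ">", \$buffer) or die "cannot open in-memory file: $!";
    print {$out} "alpha\n", "beta\n";
    close $out;

    # Read the same scalar back line by line.
    open(my $in, "<", \$buffer) or die "cannot reopen in-memory file: $!";
    chomp(my @rows = <$in>);
    close $in;

    print "rows: @rows\n";
    ```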

    General examples:

        $ARTICLE = 100;
        open(ARTICLE) or die "Can't find article $ARTICLE: $!\n";
        while (<ARTICLE>) {...

        open(LOG, ">>/usr/spool/news/twitlog");  # (log is reserved)
        # if the open fails, output is discarded

        open(my $dbase, "+<", "dbase.mine")      # open for update
            or die "Can't open 'dbase.mine' for update: $!";

        open(my $dbase, "+<dbase.mine")          # ditto
            or die "Can't open 'dbase.mine' for update: $!";

        open(ARTICLE, "-|", "caesar <$article")  # decrypt article
            or die "Can't start caesar: $!";

        open(ARTICLE, "caesar <$article |")      # ditto
            or die "Can't start caesar: $!";

        open(EXTRACT, "|sort >Tmp$$")            # $$ is our process id
            or die "Can't start sort: $!";

        # in-memory files
        open(MEMORY, ">", \$var)
            or die "Can't open memory file: $!";
        print MEMORY "foo!\n";                   # output will appear in $var

        # process argument list of files along with any includes
        foreach $file (@ARGV) {
            process($file, "fh00");
        }

        sub process {
            my($filename, $input) = @_;
            $input++;                            # this is a string increment
            unless (open($input, "<", $filename)) {
                print STDERR "Can't open $filename: $!\n";
                return;
            }

            local $_;
            while (<$input>) {                   # note use of indirection
                if (/^#include "(.*)"/) {
                    process($1, $input);
                    next;
                }
                #...                             # whatever
            }
        }

    See perliol for detailed info on PerlIO.

    You may also, in the Bourne shell tradition, specify an EXPR beginning with >&, in which case the rest of the string is interpreted as the name of a filehandle (or file descriptor, if numeric) to be duped (as dup(2) ) and opened. You may use & after >, >> , < , +>, +>> , and +< . The mode you specify should match the mode of the original filehandle. (Duping a filehandle does not take into account any existing contents of IO buffers.) If you use the three-argument form, then you can pass either a number, the name of a filehandle, or the normal "reference to a glob".

    Here is a script that saves, redirects, and restores STDOUT and STDERR using various methods:

        #!/usr/bin/perl
        open(my $oldout, ">&STDOUT")     or die "Can't dup STDOUT: $!";
        open(OLDERR,     ">&", \*STDERR) or die "Can't dup STDERR: $!";

        open(STDOUT, '>', "foo.out") or die "Can't redirect STDOUT: $!";
        open(STDERR, ">&STDOUT")     or die "Can't dup STDOUT: $!";

        select STDERR; $| = 1;  # make unbuffered
        select STDOUT; $| = 1;  # make unbuffered

        print STDOUT "stdout 1\n";  # this works for
        print STDERR "stderr 1\n";  # subprocesses too

        open(STDOUT, ">&", $oldout) or die "Can't dup \$oldout: $!";
        open(STDERR, ">&OLDERR")    or die "Can't dup OLDERR: $!";

        print STDOUT "stdout 2\n";
        print STDERR "stderr 2\n";

    If you specify '<&=X' , where X is a file descriptor number or a filehandle, then Perl will do an equivalent of C's fdopen of that file descriptor (and not call dup(2) ); this is more parsimonious of file descriptors. For example:

        # open for input, reusing the fileno of $fd
        open(FILEHANDLE, "<&=$fd")

    or

        open(FILEHANDLE, "<&=", $fd)

    or

        # open for append, using the fileno of OLDFH
        open(FH, ">>&=", OLDFH)

    or

    1. open(FH, ">>&=OLDFH")

    Conserving file descriptors is not the only benefit. Sharing a descriptor also matters whenever something depends on the descriptor itself, such as locking with flock(). If you do just open(A, ">>&B"), the filehandle A will not have the same file descriptor as B, and therefore flock(A) will not flock(B) nor vice versa. But with open(A, ">>&=B"), the filehandles will share the same underlying system file descriptor.
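
    The difference between the two forms can be observed with fileno. The following is a minimal sketch, assuming a Unix-like system; the scratch file comes from File::Temp and the variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration (the name is chosen by File::Temp).
my (undef, $path) = tempfile(UNLINK => 1);
open(my $orig, ">>", $path) or die "Can't open $path: $!";

# ">>&" calls dup(2): a new descriptor that refers to the same file.
open(my $dup, ">>&", $orig) or die "Can't dup: $!";

# ">>&=" does the fdopen equivalent: same descriptor, new Perl handle.
open(my $alias, ">>&=", $orig) or die "Can't fdopen: $!";

my $dup_shares_fd   = fileno($dup)   == fileno($orig) ? 1 : 0;
my $alias_shares_fd = fileno($alias) == fileno($orig) ? 1 : 0;

print "dup shares descriptor:   $dup_shares_fd\n";    # 0
print "alias shares descriptor: $alias_shares_fd\n";  # 1
```

    Only the handle opened with ">>&=" reports the same fileno as the original.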

    Note that under Perls older than 5.8.0, Perl uses the standard C library's fdopen() to implement the = functionality. On many Unix systems, fdopen() fails when file descriptors exceed a certain value, typically 255. For Perls 5.8.0 and later, PerlIO is (most often) the default.

    You can see whether your Perl was built with PerlIO by running perl -V and looking for the useperlio= line. If it reads useperlio=define, you have PerlIO; otherwise you don't.
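
    The same setting can be read from within a program through the Config module, which exposes the values that perl -V prints. A small sketch; the variable name is illustrative.

```perl
use strict;
use warnings;
use Config;

# %Config holds the build-time settings reported by "perl -V".
my $useperlio  = $Config{useperlio};
my $has_perlio = (defined $useperlio && $useperlio eq 'define') ? 1 : 0;

print $has_perlio ? "built with PerlIO\n" : "built without PerlIO\n";
```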

    If you open a pipe on the command - (that is, specify either |- or -| with the one- or two-argument forms of open), an implicit fork is done, so open returns twice: in the parent process it returns the pid of the child process, and in the child process it returns (a defined) 0. Use defined($pid) or // to determine whether the open was successful.

    For example, use either

        $child_pid = open(FROM_KID, "-|") // die "can't fork: $!";

    or

        $child_pid = open(TO_KID, "|-") // die "can't fork: $!";

    followed by

        if ($child_pid) {
            # am the parent:
            # either write TO_KID or else read FROM_KID
            ...
            waitpid $child_pid, 0;
        } else {
            # am the child; use STDIN/STDOUT normally
            ...
            exit;
        }

    The filehandle behaves normally for the parent, but I/O to that filehandle is piped from/to the STDOUT/STDIN of the child process. In the child process, the filehandle isn't opened--I/O happens from/to the new STDOUT/STDIN. Typically this is used like the normal piped open when you want to exercise more control over just how the pipe command gets executed, such as when running setuid and you don't want to have to scan shell commands for metacharacters.
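
    As a concrete instance of the skeleton above, here is a minimal sketch in which the parent reads three lines printed by the child. It assumes a platform with a real fork(); the variable names are illustrative.

```perl
use strict;
use warnings;

my @from_child;
my $pid = open(my $from_kid, "-|") // die "can't fork: $!";

if ($pid) {
    # Parent: read whatever the child printed to its STDOUT.
    @from_child = <$from_kid>;
    close($from_kid) or warn "child exited with status $?";
} else {
    # Child: STDOUT is now the write end of the pipe.
    print "line $_ from child\n" for 1 .. 3;
    exit 0;
}

print "parent read: $_" for @from_child;
```

    Closing the handle in the parent also reaps the child, so no separate waitpid is needed in this sketch.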

    The following blocks are more or less equivalent:

        open(FOO, "|tr '[a-z]' '[A-Z]'");
        open(FOO, "|-", "tr '[a-z]' '[A-Z]'");
        open(FOO, "|-") || exec 'tr', '[a-z]', '[A-Z]';
        open(FOO, "|-", "tr", '[a-z]', '[A-Z]');

        open(FOO, "cat -n '$file'|");
        open(FOO, "-|", "cat -n '$file'");
        open(FOO, "-|") || exec "cat", "-n", $file;
        open(FOO, "-|", "cat", "-n", $file);

    The last two examples in each block show the pipe as "list form", which is not yet supported on all platforms. A good rule of thumb is that if your platform has a real fork() (in other words, if your platform is Unix, including Linux and MacOS X), you can use the list form. You would want to use the list form of the pipe so you can pass literal arguments to the command without risk of the shell interpreting any shell metacharacters in them. However, this also bars you from opening pipes to commands that intentionally contain shell metacharacters, such as:

        open(FOO, "|cat -n | expand -4 | lpr")
            // die "Can't open pipeline to lpr: $!";

    See Safe Pipe Opens in perlipc for more examples of this.

    Perl will attempt to flush all files opened for output before any operation that may do a fork, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles.

    On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptor as determined by the value of $^F. See $^F in perlvar.

    Closing any piped filehandle causes the parent process to wait for the child to finish, then returns the status value in $? and ${^CHILD_ERROR_NATIVE}.
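
    The status left behind by close can be inspected as follows. This is a sketch that assumes list-form pipe opens are available on the platform; it runs the current perl binary ($^X) as the child so that no external command is needed.

```perl
use strict;
use warnings;

# List-form pipe open; the child prints one line and exits with status 3.
open(my $kid, "-|", $^X, "-e", 'print "working\n"; exit 3')
    or die "can't start child: $!";
my @output = <$kid>;

# close() waits for the child. It returns false when the exit status is
# nonzero, and $? then holds the wait status.
my $closed_ok = close($kid) ? 1 : 0;
my $exit_code = $? >> 8;

print "output: @output";
print "close returned $closed_ok, exit code $exit_code\n";
```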

    The filename passed to the one- and two-argument forms of open() will have leading and trailing whitespace deleted and normal redirection characters honored. This property, known as "magic open", can often be used to good effect. A user could specify a filename of "rsh cat file |", or you could change certain filenames as needed:

        $filename =~ s/(.*\.gz)\s*$/gzip -dc < $1|/;
        open(FH, $filename) or die "Can't open $filename: $!";

    Use the three-argument form to open a file with arbitrary weird characters in it,

        open(FOO, "<", $file)
            || die "can't open < $file: $!";

    otherwise it's necessary to protect any leading and trailing whitespace:

        $file =~ s#^(\s)#./$1#;
        open(FOO, "< $file\0")
            || die "open failed: $!";

    (this may not work on some bizarre filesystems). One should conscientiously choose between the magic and three-argument form of open():

        open(IN, $ARGV[0]) || die "can't open $ARGV[0]: $!";

    will allow the user to specify an argument of the form "rsh cat file |", but will not work on a filename that happens to have a trailing space, while

        open(IN, "<", $ARGV[0])
            || die "can't open < $ARGV[0]: $!";

    will have exactly the opposite restrictions.
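
    The trailing-space case can be tried out directly. The sketch below assumes a filesystem that permits trailing spaces in filenames (typical on Unix); the temporary directory and the file name "report " are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $name = "$dir/report ";            # note the trailing space

# The three-argument form takes the name literally.
open(my $out, ">", $name) or die "can't create '$name': $!";
print $out "data\n";
close($out) or die "close failed: $!";

# Three-argument open finds the file again.
my $three_arg_ok = open(my $in3, "<", $name) ? 1 : 0;

# Two-argument (magic) open strips the trailing space and looks for
# "$dir/report", which does not exist.
my $two_arg_ok = open(my $in2, $name) ? 1 : 0;

print "three-argument: $three_arg_ok, two-argument: $two_arg_ok\n";
```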

    If you want a "real" C open (see open(2) on your system), then you should use the sysopen function, which involves no such magic (but may use subtly different filemodes than Perl open(), which is mapped to C fopen()). This is another way to protect your filenames from interpretation. For example:

        use Fcntl;          # supplies O_RDWR, O_CREAT, O_EXCL
        use IO::Handle;
        sysopen(HANDLE, $path, O_RDWR|O_CREAT|O_EXCL)
            or die "sysopen $path: $!";
        $oldfh = select(HANDLE); $| = 1; select($oldfh);
        print HANDLE "stuff $$\n";
        seek(HANDLE, 0, 0);
        print "File contains: ", <HANDLE>;

    Using the constructor from the IO::Handle package (or one of its subclasses, such as IO::File or IO::Socket), you can generate anonymous filehandles that have the scope of the variables used to hold them, then automatically (but silently) close once their reference counts become zero, typically at scope exit:

        use IO::File;
        #...
        sub read_myfile_munged {
            my $ALL = shift;
            # or just leave it undef to autoviv
            my $handle = IO::File->new;
            open($handle, "<", "myfile") or die "myfile: $!";
            my $first = <$handle>
                or return ();                    # Automatically closed here.
            mung($first) or die "mung failed";   # Or here.
            return ($first, <$handle>) if $ALL;  # Or here.
            return $first;                       # Or here.
        }

    WARNING: The previous example has a bug because the automatic close that happens when the refcount on $handle reaches zero does not properly detect and report failures. Always close the handle yourself and inspect the return value.

        close($handle)
            || warn "close failed: $!";

    See seek for some details about mixing reading and writing.

    Portability issues: open in perlport.

  • opendir DIRHANDLE,EXPR

    Opens a directory named EXPR for processing by readdir, telldir, seekdir, rewinddir, and closedir. Returns true if successful. DIRHANDLE may be an expression whose value can be used as an indirect dirhandle, usually the real dirhandle name. If DIRHANDLE is an undefined scalar variable (or array or hash element), the variable is assigned a reference to a new anonymous dirhandle; that is, it's autovivified. DIRHANDLEs have their own namespace separate from FILEHANDLEs.

    See the example at readdir.
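
    The autovivified form can be sketched as follows. The temporary directory and the two file names are invented for the demonstration; only opendir, readdir, and closedir are the functions under discussion.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt)) {
    open(my $fh, ">", "$dir/$name") or die "can't create $name: $!";
    close($fh) or die "close failed: $!";
}

# $dh is undefined here, so opendir autovivifies an anonymous dirhandle.
opendir(my $dh, $dir) or die "can't opendir $dir: $!";
my @entries = sort grep { $_ ne "." && $_ ne ".." } readdir($dh);
closedir($dh);

print "@entries\n";   # a.txt b.txt
```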

  • ord EXPR
  • ord

    Returns the numeric value of the first character of EXPR. If EXPR is an empty string, returns 0. If EXPR is omitted, uses $_. (Note character, not byte.)

    For the reverse, see chr. See perlunicode for more about Unicode.
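
    A few cases, sketched for an ASCII-based platform (the numeric value of "A" differs on EBCDIC); the variable names are illustrative.

```perl
use strict;
use warnings;

my $first_only = ord("ABC");        # 65: only the first character counts
my $empty      = ord("");           # 0
my $wide       = ord("\x{263A}");   # 0x263A: a character, not a byte
my $round_trip = chr(ord("A"));     # "A"

printf "%d %d %#x %s\n", $first_only, $empty, $wide, $round_trip;
```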

  • our EXPR
  • our TYPE EXPR
  • our EXPR : ATTRS
  • our TYPE EXPR : ATTRS

    our makes a lexical alias to a package variable of the same name in the current package for use within the current lexical scope.

    our has the same scoping rules as my or state, but our only declares an alias, whereas my or state both declare a variable name and allocate storage for that name within the current scope.

    This means that when use strict 'vars' is in effect, our lets you use a package variable without qualifying it with the package name, but only within the lexical scope of the our declaration. In this way, our differs from use vars, which allows use of an unqualified name only within the affected package, but across scopes.

    If more than one value is listed, the list must be placed in parentheses.

        our $foo;
        our($bar, $baz);

    An our declaration declares an alias for a package variable that will be visible across its entire lexical scope, even across package boundaries. The package in which the variable is entered is determined at the point of the declaration, not at the point of use. This means the following behavior holds:

        package Foo;
        our $bar;      # declares $Foo::bar for rest of lexical scope
        $bar = 20;

        package Bar;
        print $bar;    # prints 20, as it refers to $Foo::bar

    Multiple our declarations with the same name in the same lexical scope are allowed if they are in different packages. If they happen to be in the same package, Perl will emit warnings if you have asked for them, just like multiple my declarations. Unlike a second my declaration, which will bind the name to a fresh variable, a second our declaration in the same package, in the same scope, is merely redundant.

        use warnings;
        package Foo;
        our $bar;      # declares $Foo::bar for rest of lexical scope
        $bar = 20;

        package Bar;
        our $bar = 30; # declares $Bar::bar for rest of lexical scope
        print $bar;    # prints 30

        our $bar;      # emits warning but has no other effect
        print $bar;    # still prints 30

    An our declaration may also have a list of attributes associated with it.

    The exact semantics and interface of TYPE and ATTRS are still evolving. TYPE is currently bound to the use of the fields pragma, and attributes are handled using the attributes pragma, or, starting from Perl 5.8.0, also via the Attribute::Handlers module. See Private Variables via my() in perlsub for details, and fields, attributes, and Attribute::Handlers.

  • pack TEMPLATE,LIST

    Takes a LIST of values and converts it into a string using the rules given by the TEMPLATE. The resulting string is the concatenation of the converted values. Typically, each converted value looks like its machine-level representation. For example, on 32-bit machines an integer may be represented by a sequence of 4 bytes, which will in Perl be presented as a string that's 4 characters long.

    See perlpacktut for an introduction to this function.

    The TEMPLATE is a sequence of characters that give the order and type of values, as follows:

        a  A string with arbitrary binary data, will be null padded.
        A  A text (ASCII) string, will be space padded.
        Z  A null-terminated (ASCIZ) string, will be null padded.

        b  A bit string (ascending bit order inside each byte,
           like vec()).
        B  A bit string (descending bit order inside each byte).
        h  A hex string (low nybble first).
        H  A hex string (high nybble first).

        c  A signed char (8-bit) value.
        C  An unsigned char (octet) value.
        W  An unsigned char value (can be greater than 255).

        s  A signed short (16-bit) value.
        S  An unsigned short value.

        l  A signed long (32-bit) value.
        L  An unsigned long value.

        q  A signed quad (64-bit) value.
        Q  An unsigned quad value.
             (Quads are available only if your system supports 64-bit
              integer values _and_ if Perl has been compiled to support
              those.  Raises an exception otherwise.)

        i  A signed integer value.
        I  An unsigned integer value.
             (This 'integer' is _at_least_ 32 bits wide.  Its exact
              size depends on what a local C compiler calls 'int'.)

        n  An unsigned short (16-bit) in "network" (big-endian) order.
        N  An unsigned long (32-bit) in "network" (big-endian) order.
        v  An unsigned short (16-bit) in "VAX" (little-endian) order.
        V  An unsigned long (32-bit) in "VAX" (little-endian) order.

        j  A Perl internal signed integer value (IV).
        J  A Perl internal unsigned integer value (UV).

        f  A single-precision float in native format.
        d  A double-precision float in native format.

        F  A Perl internal floating-point value (NV) in native format.
        D  A float of long-double precision in native format.
             (Long doubles are available only if your system supports
              long double values _and_ if Perl has been compiled to
              support those.  Raises an exception otherwise.)

        p  A pointer to a null-terminated string.
        P  A pointer to a structure (fixed-length string).

        u  A uuencoded string.
        U  A Unicode character number.  Encodes to a character in
           character mode and UTF-8 (or UTF-EBCDIC in EBCDIC platforms)
           in byte mode.

        w  A BER compressed integer (not an ASN.1 BER, see perlpacktut
           for details).  Its bytes represent an unsigned integer in
           base 128, most significant digit first, with as few digits
           as possible.  Bit eight (the high bit) is set on each byte
           except the last.

        x  A null byte (a.k.a. ASCII NUL, "\000", chr(0)).
        X  Back up a byte.
        @  Null-fill or truncate to absolute position, counted from the
           start of the innermost ()-group.
        .  Null-fill or truncate to absolute position specified by
           the value.
        (  Start of a ()-group.

    One or more modifiers below may optionally follow certain letters in the TEMPLATE (the second column lists letters for which the modifier is valid):

        !   sSlLiI     Forces native (short, long, int) sizes instead
                       of fixed (16-/32-bit) sizes.

            xX         Make x and X act as alignment commands.

            nNvV       Treat integers as signed instead of unsigned.

            @.         Specify position as byte offset in the internal
                       representation of the packed string.  Efficient
                       but dangerous.

        >   sSiIlLqQ   Force big-endian byte-order on the type.
            jJfFdDpP   (The "big end" touches the construct.)

        <   sSiIlLqQ   Force little-endian byte-order on the type.
            jJfFdDpP   (The "little end" touches the construct.)

    The > and < modifiers can also be used on () groups to force a particular byte-order on all components in that group, including all its subgroups.

    The following rules apply:

    • Each letter may optionally be followed by a number indicating the repeat count. A numeric repeat count may optionally be enclosed in brackets, as in pack("C[80]", @arr). The repeat count gobbles that many values from the LIST when used with all format types other than a, A, Z, b, B, h, H, @, ., x, X, and P, where it means something else, described below. Supplying a * for the repeat count instead of a number means to use however many items are left, except for:

      • @, x, and X, where it is equivalent to 0.

      • ., where it means relative to the start of the string.

      • u, where it is equivalent to 1 (or 45, which here is equivalent).

      One can replace a numeric repeat count with a template letter enclosed in brackets to use the packed byte length of the bracketed template for the repeat count.

      For example, the template x[L] skips as many bytes as in a packed long, and the template "$t X[$t] $t" unpacks twice whatever $t (when variable-expanded) unpacks. If the template in brackets contains alignment commands (such as x![d]), its packed length is calculated as if the start of the template had the maximal possible alignment.

      When used with Z, a * as the repeat count is guaranteed to add a trailing null byte, so the resulting string is always one byte longer than the byte length of the item itself.

      When used with @, the repeat count represents an offset from the start of the innermost () group.

      When used with ., the repeat count determines the starting position to calculate the value offset as follows:

      • If the repeat count is 0, it's relative to the current position.

      • If the repeat count is *, the offset is relative to the start of the packed string.

      • And if it's an integer n, the offset is relative to the start of the nth innermost ( ) group, or to the start of the string if n is bigger than the group level.

      The repeat count for u is interpreted as the maximal number of bytes to encode per line of output, with 0, 1 and 2 replaced by 45. The repeat count should not be more than 65.

    • The a, A, and Z types gobble just one value, but pack it as a string of length count, padding with nulls or spaces as needed. When unpacking, A strips trailing whitespace and nulls, Z strips everything after the first null, and a returns data with no stripping at all.

      If the value to pack is too long, the result is truncated. If it's too long and an explicit count is provided, Z packs only $count-1 bytes, followed by a null byte. Thus Z always packs a trailing null, except when the count is 0.
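
      The padding, truncation, and stripping rules for the three string types can be checked with a short sketch. The sample strings and variable names are illustrative only.

```perl
use strict;
use warnings;

my $a_padded = pack("a5", "hi");      # "hi\0\0\0": null padded
my $A_padded = pack("A5", "hi");      # "hi   ":    space padded
my $Z_trunc  = pack("Z3", "hello");   # "he\0":     count-1 bytes plus a null
my $Z_star   = pack("Z*", "hi");      # "hi\0":     * always appends a null

my $data = "hi \0\0";
my ($a_out) = unpack("a5", $data);    # "hi \0\0": nothing stripped
my ($A_out) = unpack("A5", $data);    # "hi":      trailing spaces and nulls gone
my ($Z_out) = unpack("Z5", $data);    # "hi ":     cut at the first null

printf "%d %d %d\n", length $a_out, length $A_out, length $Z_out;   # 5 2 3
```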

    • Likewise, the b and B formats pack a string that's that many bits long. Each such format generates 1 bit of the result. These are typically followed by a repeat count like B8 or B64 .

      Each result bit is based on the least-significant bit of the corresponding input character, i.e., on ord($char)%2. In particular, characters "0" and "1" generate bits 0 and 1, as do characters "\000" and "\001" .

      Starting from the beginning of the input string, each 8-tuple of characters is converted to 1 character of output. With format b , the first character of the 8-tuple determines the least-significant bit of a character; with format B , it determines the most-significant bit of a character.

      If the length of the input string is not evenly divisible by 8, the remainder is packed as if the input string were padded by null characters at the end. Similarly during unpacking, "extra" bits are ignored.

      If the input string is longer than needed, remaining characters are ignored.

      A * for the repeat count uses all characters of the input field. On unpacking, bits are converted to a string of 0s and 1s.
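
      A short sketch of both bit orders, assuming an ASCII platform where "A" is 0x41; the variable names are illustrative.

```perl
use strict;
use warnings;

my $from_B = pack("B8", "01000001");     # "A": first character is the high bit
my $from_b = pack("b8", "10000010");     # "A": first character is the low bit
my $padded = pack("B*", "0100000101");   # 10 bits, padded to 2 bytes: 0x41 0x40

my ($bits_B) = unpack("B8", "A");        # "01000001"
my ($bits_b) = unpack("b8", "A");        # "10000010"

print "$bits_B $bits_b\n";
```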

    • The h and H formats pack a string that many nybbles (4-bit groups, representable as hexadecimal digits, "0".."9" "a".."f" ) long.

      For each such format, pack() generates 4 bits of result. With non-alphabetical characters, the result is based on the 4 least-significant bits of the input character, i.e., on ord($char)%16. In particular, characters "0" and "1" generate nybbles 0 and 1, as do bytes "\000" and "\001" . For characters "a".."f" and "A".."F" , the result is compatible with the usual hexadecimal digits, so that "a" and "A" both generate the nybble 0xA==10 . Use only these specific hex characters with this format.

      Starting from the beginning of the input string, each pair of characters is converted to 1 character of output. With format h, the first character of the pair determines the least-significant nybble of the output character; with format H, it determines the most-significant nybble.

      If the length of the input string is not even, it behaves as if padded by a null character at the end. Similarly, "extra" nybbles are ignored during unpacking.

      If the input string is longer than needed, extra characters are ignored.

      A * for the repeat count uses all characters of the input field. For unpack(), nybbles are converted to a string of hexadecimal digits.
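
      The nybble order of the two formats can be compared in a short sketch, again assuming an ASCII platform ("A" is 0x41, "B" is 0x42); the variable names are illustrative.

```perl
use strict;
use warnings;

my $from_H = pack("H4", "4142");    # "AB": high nybble first
my $from_h = pack("h4", "1424");    # "AB": low nybble first
my $odd    = pack("H3", "414");     # odd count is padded: 0x41 0x40

my ($hex_H) = unpack("H*", "AB");   # "4142"
my ($hex_h) = unpack("h*", "AB");   # "1424"

print "$hex_H $hex_h\n";
```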

    • The p format packs a pointer to a null-terminated string. You are responsible for ensuring that the string is not a temporary value, as that could potentially get deallocated before you got around to using the packed result. The P format packs a pointer to a structure of the size indicated by the length. A null pointer is created if the corresponding value for p or P is undef; similarly with unpack(), where a null pointer unpacks into undef.

      If your system has a strange pointer size--meaning a pointer is neither as big as an int nor as big as a long--it may not be possible to pack or unpack pointers in big- or little-endian byte order. Attempting to do so raises an exception.

    • The / template character allows packing and unpacking of a sequence of items where the packed structure contains a packed item count followed by the packed items themselves. This is useful when the structure you're unpacking has encoded the sizes or repeat counts for some of its fields within the structure itself as separate fields.

      For pack, you write length-item/sequence-item, and the length-item describes how the length value is packed. Formats likely to be of most use are integer-packing ones like n for Java strings, w for ASN.1 or SNMP, and N for Sun XDR.

      For pack, sequence-item may have a repeat count, in which case the minimum of that and the number of available items is used as the argument for length-item. If it has no repeat count or uses a '*', the number of available items is used.

      For unpack, an internal stack of integer arguments unpacked so far is used. You write /sequence-item and the repeat count is obtained by popping off the last element from the stack. The sequence-item must not have a repeat count.

      If sequence-item refers to a string type ("A", "a", or "Z"), the length-item is the string length, not the number of strings. With an explicit repeat count for pack, the packed string is adjusted to that length. For example:

          This code:                             gives this result:

          unpack("W/a", "\004Gurusamy")          ("Guru")
          unpack("a3/A A*", "007 Bond  J ")      (" Bond", "J")
          unpack("a3 x2 /A A*", "007: Bond, J.") ("Bond, J", ".")

          pack("n/a* w/a","hello,","world")      "\000\006hello,\005world"
          pack("a/W2", ord("a") .. ord("z"))     "2ab"

      The length-item is not returned explicitly from unpack.

      Supplying a count to the length-item format letter is only useful with A, a, or Z. Packing with a length-item of a or Z may introduce "\000" characters, which Perl does not regard as legal in numeric strings.
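
      A round trip through a length-prefixed layout can be sketched as follows. The two sample strings are invented; the template uses a 16-bit big-endian length (n) before each string.

```perl
use strict;
use warnings;

# Each string is preceded by its length as a 16-bit big-endian integer.
my $packed = pack("n/a* n/a*", "alpha", "be");

# On unpack, the sequence-item takes its count from the length-item,
# and the lengths themselves are not returned.
my @fields = unpack("n/a n/a", $packed);

printf "%d bytes, fields: %s\n", length $packed, join(",", @fields);
```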

    • The integer types s, S, l, and L may be followed by a ! modifier to specify native shorts or longs. As shown in the example above, a bare l means exactly 32 bits, although the native long as seen by the local C compiler may be larger. This is mainly an issue on 64-bit platforms. You can see whether using ! makes any difference this way:

          printf "format s is %d, s! is %d\n",
              length pack("s"), length pack("s!");

          printf "format l is %d, l! is %d\n",
              length pack("l"), length pack("l!");

      i! and I! are also allowed, but only for completeness' sake: they are identical to i and I.

      The actual sizes (in bytes) of native shorts, ints, longs, and long longs on the platform where Perl was built are also available from the command line:

          $ perl -V:{short,int,long{,long}}size
          shortsize='2';
          intsize='4';
          longsize='4';
          longlongsize='8';

      or programmatically via the Config module:

          use Config;
          print $Config{shortsize},    "\n";
          print $Config{intsize},      "\n";
          print $Config{longsize},     "\n";
          print $Config{longlongsize}, "\n";

      $Config{longlongsize} is undefined on systems without long long support.

    • The integer formats s, S, i, I, l, L, j, and J are inherently non-portable between processors and operating systems because they obey native byteorder and endianness. For example, a 4-byte integer 0x12345678 (305419896 decimal) would be ordered natively (arranged in and handled by the CPU registers) into bytes as

          0x12 0x34 0x56 0x78  # big-endian
          0x78 0x56 0x34 0x12  # little-endian

      Basically, Intel and VAX CPUs are little-endian, while everybody else, including Motorola m68k/88k, PPC, Sparc, HP PA, Power, and Cray, are big-endian. Alpha and MIPS can be either: Digital/Compaq uses (well, used) them in little-endian mode, but SGI/Cray uses them in big-endian mode.

      The names big-endian and little-endian are comic references to the egg-eating habits of the little-endian Lilliputians and the big-endian Blefuscudians from the classic Jonathan Swift satire, Gulliver's Travels. This entered computer lingo via the paper "On Holy Wars and a Plea for Peace" by Danny Cohen, USC/ISI IEN 137, April 1, 1980.

      Some systems may have even weirder byte orders such as

          0x56 0x78 0x12 0x34
          0x34 0x12 0x78 0x56

      You can determine your system endianness with this incantation:

          printf("%#02x ", $_) for unpack("W*", pack L=>0x12345678);

      The byteorder on the platform where Perl was built is also available via Config:

          use Config;
          print "$Config{byteorder}\n";

      or from the command line:

          $ perl -V:byteorder

      Byteorders "1234" and "12345678" are little-endian; "4321" and "87654321" are big-endian.

      For portably packed integers, either use the formats n, N, v, and V or else use the > and < modifiers described immediately below. See also perlport.
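
      The portable formats can be compared against each other in a short sketch. It reuses the value 0x12345678 from the text above; the variable names are illustrative.

```perl
use strict;
use warnings;

my $value = 0x12345678;

my $big    = pack("N", $value);   # always 12 34 56 78
my $little = pack("V", $value);   # always 78 56 34 12

# The endianness modifiers give the same bytes for a 32-bit unsigned value.
my $same_big    = pack("L>", $value) eq $big    ? 1 : 0;
my $same_little = pack("L<", $value) eq $little ? 1 : 0;

printf "%s %s\n", unpack("H*", $big), unpack("H*", $little);
```

      A bare L, by contrast, would produce one of these two byte sequences depending on the machine.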

    • Starting with Perl 5.10.0, integer and floating-point formats, along with the p and P formats and () groups, may all be followed by the > or < endianness modifiers to respectively enforce big- or little-endian byte-order. These modifiers are especially useful given how n, N, v, and V don't cover signed integers, 64-bit integers, or floating-point values.

      Here are some concerns to keep in mind when using an endianness modifier:

      • Exchanging signed integers between different platforms works only when all platforms store them in the same format. Most platforms store signed integers in two's-complement notation, so usually this is not an issue.

      • The > or < modifiers can only be used on floating-point formats on big- or little-endian machines. Otherwise, attempting to use them raises an exception.

      • Forcing big- or little-endian byte-order on floating-point values for data exchange can work only if all platforms use the same binary representation such as IEEE floating-point. Even if all platforms are using IEEE, there may still be subtle differences. Being able to use > or < on floating-point values can be useful, but also dangerous if you don't know exactly what you're doing. It is not a general way to portably store floating-point values.

      • When using > or < on a () group, this affects all types inside the group that accept byte-order modifiers, including all subgroups. It is silently ignored for all other types. You are not allowed to override the byte-order within a group that already has a byte-order modifier suffix.

    • Real numbers (floats and doubles) are in native machine format only. Due to the multiplicity of floating-point formats and the lack of a standard "network" representation for them, no facility for interchange has been made. This means that packed floating-point data written on one machine may not be readable on another, even if both use IEEE floating-point arithmetic (because the endianness of the memory representation is not part of the IEEE spec). See also perlport.

      If you know exactly what you're doing, you can use the > or < modifiers to force big- or little-endian byte-order on floating-point values.

      Because Perl uses doubles (or long doubles, if configured) internally for all numeric calculation, converting from double into float and thence to double again loses precision, so unpack("f", pack("f", $foo)) will not in general equal $foo.
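
      The precision loss is easy to observe. The sketch below assumes IEEE single-precision floats for the f format; 0.1 is chosen because it has no exact binary representation, and 0.5 because it does.

```perl
use strict;
use warnings;

my $foo = 0.1;

my ($via_float) = unpack("f", pack("f", $foo));   # rounded to single precision
my ($via_nv)    = unpack("F", pack("F", $foo));   # F is Perl's own NV format

my $float_differs = $via_float != $foo ? 1 : 0;
my $nv_same       = $via_nv    == $foo ? 1 : 0;

# Values exactly representable in single precision survive the trip.
my ($half) = unpack("f", pack("f", 0.5));

printf "%.17g\n%.17g\n", $foo, $via_float;
```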

    • Pack and unpack can operate in two modes: character mode (C0 mode) where the packed string is processed per character, and UTF-8 mode (U0 mode) where the packed string is processed in its UTF-8-encoded Unicode form on a byte-by-byte basis. Character mode is the default unless the format string starts with U . You can always switch mode mid-format with an explicit C0 or U0 in the format. This mode remains in effect until the next mode change, or until the end of the () group it (directly) applies to.

      Using C0 to get Unicode characters while using U0 to get non-Unicode bytes is not necessarily obvious. Probably only the first of these is what you want:

          $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
            perl -CS -ne 'printf "%v04X\n", $_ for unpack("C0A*", $_)'
          03B1.03C9
          $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
            perl -CS -ne 'printf "%v02X\n", $_ for unpack("U0A*", $_)'
          CE.B1.CF.89
          $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
            perl -C0 -ne 'printf "%v02X\n", $_ for unpack("C0A*", $_)'
          CE.B1.CF.89
          $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
            perl -C0 -ne 'printf "%v02X\n", $_ for unpack("U0A*", $_)'
          C3.8E.C2.B1.C3.8F.C2.89

      Those examples also illustrate that you should not try to use pack/unpack as a substitute for the Encode module.

    • You must yourself do any alignment or padding by inserting, for example, enough "x"es while packing. There is no way for pack() and unpack() to know where characters are going to or coming from, so they handle their output and input as flat sequences of characters.

    • A () group is a sub-TEMPLATE enclosed in parentheses. A group may take a repeat count either as postfix, or for unpack(), also via the / template character. Within each repetition of a group, positioning with @ starts over at 0. Therefore, the result of

          pack("@1A((@2A)@3A)", qw[X Y Z])

      is the string "\0X\0\0YZ".

    • x and X accept the ! modifier to act as alignment commands: they jump forward or back to the closest position aligned at a multiple of count characters. For example, to pack() or unpack() a C structure like

          struct {
              char   c;    /* one signed, 8-bit character */
              double d;
              char   cc[2];
          }

      one may need to use the template c x![d] d c[2]. This assumes that doubles must be aligned to the size of double.

      For alignment commands, a count of 0 is equivalent to a count of 1; both are no-ops.
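
      The effect of the alignment command shows up in the packed length. The sketch below writes the alignment as x!8, which assumes an 8-byte double (the case on most current platforms); the sample values are made up.

```perl
use strict;
use warnings;

# 1 byte for c, 7 bytes of padding up to offset 8, 8 for d, 2 for cc.
my $packed = pack("c x!8 d c2", 1, 3.5, 2, 3);

my ($c, $d, @cc) = unpack("c x!8 d c2", $packed);

printf "length %d: c=%d d=%s cc=%s\n", length $packed, $c, $d, "@cc";
```

      Without the x!8, the double would start at offset 1 and the string would be 11 bytes long.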

    • n, N, v, and V accept the ! modifier to represent signed 16-/32-bit integers in big-/little-endian order. This is portable only when all platforms sharing packed data use the same binary representation for signed integers; for example, when all platforms use two's-complement representation.
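
      A brief sketch of the signed variant, assuming two's-complement representation; the variable names are illustrative.

```perl
use strict;
use warnings;

my $packed = pack("n!", -2);              # bytes FF FE

my ($signed)   = unpack("n!", $packed);   # -2
my ($unsigned) = unpack("n",  $packed);   # 65534: same bytes read as unsigned

printf "%s %d %d\n", unpack("H*", $packed), $signed, $unsigned;
```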

    • Comments can be embedded in a TEMPLATE using # through the end of line. White space can separate pack codes from each other, but modifiers and repeat counts must follow immediately. Breaking complex templates into individual line-by-line components, suitably annotated, can do as much to improve legibility and maintainability of pack/unpack formats as /x can for complicated pattern matches.
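
      An annotated template might look like the following sketch. The record layout (type, flags, tag) is hypothetical and exists only to show the comment syntax.

```perl
use strict;
use warnings;

my $template = q{
    n     # record type: 16-bit, big-endian
    C     # flags: one octet
    a4    # tag: 4 bytes, null padded
};

my $record = pack($template, 1, 0x80, "ab");
my ($type, $flags, $tag) = unpack($template, $record);

printf "%d bytes: type=%d flags=%#x\n", length $record, $type, $flags;
```

      The same annotated string serves for both pack and unpack, so the layout is documented in one place.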

    • If TEMPLATE requires more arguments than pack() is given, pack() assumes additional "" arguments. If TEMPLATE requires fewer arguments than given, extra arguments are ignored.

    Examples:

        $foo = pack("WWWW",65,66,67,68);
        # foo eq "ABCD"
        $foo = pack("W4",65,66,67,68);
        # same thing
        $foo = pack("W4",0x24b6,0x24b7,0x24b8,0x24b9);
        # same thing with Unicode circled letters.
        $foo = pack("U4",0x24b6,0x24b7,0x24b8,0x24b9);
        # same thing with Unicode circled letters.  You don't get the
        # UTF-8 bytes because the U at the start of the format caused
        # a switch to U0-mode, so the UTF-8 bytes get joined into
        # characters
        $foo = pack("C0U4",0x24b6,0x24b7,0x24b8,0x24b9);
        # foo eq "\xe2\x92\xb6\xe2\x92\xb7\xe2\x92\xb8\xe2\x92\xb9"
        # This is the UTF-8 encoding of the string in the
        # previous example

        $foo = pack("ccxxcc",65,66,67,68);
        # foo eq "AB\0\0CD"

        # NOTE: The examples above featuring "W" and "c" are true
        # only on ASCII and ASCII-derived systems such as ISO Latin 1
        # and UTF-8.  On EBCDIC systems, the first example would be
        #      $foo = pack("WWWW",193,194,195,196);

        $foo = pack("s2",1,2);
        # "\001\000\002\000" on little-endian
        # "\000\001\000\002" on big-endian

        $foo = pack("a4","abcd","x","y","z");
        # "abcd"

        $foo = pack("aaaa","abcd","x","y","z");
        # "axyz"

        $foo = pack("a14","abcdefg");
        # "abcdefg\0\0\0\0\0\0\0"

        $foo = pack("i9pl", gmtime);
        # a real struct tm (on my system anyway)

        $utmp_template = "Z8 Z8 Z16 L";
        $utmp = pack($utmp_template, @utmp1);
        # a struct utmp (BSDish)

        @utmp2 = unpack($utmp_template, $utmp);
        # "@utmp1" eq "@utmp2"

        sub bintodec {
            unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
        }

        $foo = pack('sx2l', 12, 34);
        # short 12, two zero bytes padding, long 34
        $bar = pack('s@4l', 12, 34);
        # short 12, zero fill to position 4, long 34
        # $foo eq $bar
        $baz = pack('s.l', 12, 4, 34);
        # short 12, zero fill to position 4, long 34

        $foo = pack('nN', 42, 4711);
        # pack big-endian 16- and 32-bit unsigned integers
        $foo = pack('S>L>', 42, 4711);
        # exactly the same
        $foo = pack('s<l<', -42, 4711);
        # pack little-endian 16- and 32-bit signed integers
        $foo = pack('(sl)<', -42, 4711);
        # exactly the same

    The same template may generally also be used in unpack().
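
    As a minimal sketch of that symmetry, the following packs a small record and unpacks it again with one shared template. The record layout (a 16-bit id, a 32-bit length, and a 4-byte tag) is a hypothetical one chosen for illustration; the big-endian n and N codes are used so that the byte string does not depend on the host's byte order.

    ```perl
    use strict;
    use warnings;

    # Hypothetical record layout: 16-bit id, 32-bit length, 4-byte tag.
    # "n" and "N" are big-endian, so the bytes are the same on every host.
    my $template = 'n N a4';

    my $packed = pack($template, 42, 4711, 'DATA');
    my ($id, $length, $tag) = unpack($template, $packed);

    printf "%d bytes: id=%d length=%d tag=%s\n",
        length($packed), $id, $length, $tag;
    ```

    White space inside the template is ignored, so 'n N a4' and 'nNa4' describe the same ten-byte record.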

  • package NAMESPACE
  • package NAMESPACE VERSION
  • package NAMESPACE BLOCK
  • package NAMESPACE VERSION BLOCK

    Declares the BLOCK or the rest of the compilation unit as being in the given namespace. The scope of the package declaration is either the supplied code BLOCK or, in the absence of a BLOCK, from the declaration itself through the end of current scope (the enclosing block, file, or eval). That is, the forms without a BLOCK are operative through the end of the current scope, just like the my, state, and our operators. All unqualified dynamic identifiers in this scope will be in the given namespace, except where overridden by another package declaration or when they're one of the special identifiers that qualify into main:: , like STDOUT , ARGV , ENV , and the punctuation variables.

    A package statement affects dynamic variables only, including those you've used local on, but not lexically-scoped variables, which are created with my, state, or our. Typically it would be the first declaration in a file included by require or use. You can switch into a package in more than one place, since this only determines which default symbol table the compiler uses for the rest of that block. You can refer to identifiers in other packages than the current one by prefixing the identifier with the package name and a double colon, as in $SomePack::var or ThatPack::INPUT_HANDLE . If the package name is omitted, the main package is assumed. That is, $::sail is equivalent to $main::sail (as well as to $main'sail , still seen in ancient code, mostly from Perl 4).

    If VERSION is provided, package sets the $VERSION variable in the given namespace to a version object with the VERSION provided. VERSION must be a "strict" style version number as defined by the version module: a positive decimal number (integer or decimal-fraction) without exponentiation or else a dotted-decimal v-string with a leading 'v' character and at least three components. You should set $VERSION only once per package.

    See Packages in perlmod for more information about packages, modules, and classes. See perlsub for other scoping issues.
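
    The sketch below puts the VERSION and BLOCK forms together. The namespace Counter and its subroutines are hypothetical names used only for illustration, and the block form assumes Perl 5.14 or later.

    ```perl
    use strict;
    use warnings;
    use 5.014;   # the package NAMESPACE VERSION BLOCK form needs 5.14+

    # Counter is a hypothetical namespace used only for illustration.
    package Counter 1.02 {
        our $count = 0;                      # the package variable $Counter::count
        sub bump  { return ++$count }
        sub where { return __PACKAGE__ }     # name of the enclosing package
    }

    # The block has ended, so this code is compiled in package main again.
    Counter::bump() for 1 .. 3;
    print "count:   $Counter::count\n";
    print "version: $Counter::VERSION\n";
    print "inside:  ", Counter::where(), "; outside: ", __PACKAGE__, "\n";
    ```

    Because the declaration is scoped to the block, no second package statement is needed to get back to main.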

  • __PACKAGE__

    A special token that returns the name of the package in which it occurs.

  • pipe READHANDLE,WRITEHANDLE

    Opens a pair of connected pipes like the corresponding system call. Note that if you set up a loop of piped processes, deadlock can occur unless you are very careful. In addition, note that Perl's pipes use IO buffering, so you may need to set $| to flush your WRITEHANDLE after each command, depending on the application.

    Returns true on success.

    See IPC::Open2, IPC::Open3, and Bidirectional Communication with Another Process in perlipc for examples of such things.

    On systems that support a close-on-exec flag on files, that flag is set on all newly opened file descriptors whose filenos are higher than the current value of $^F (by default 2 for STDERR ). See $^F in perlvar.
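
    A minimal sketch of a one-way pipe between a parent and a forked child follows. It assumes a platform where fork is available, and uses the autoflush method from IO::Handle as an alternative to setting $| on the write handle.

    ```perl
    use strict;
    use warnings;
    use IO::Handle;   # provides the autoflush method

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    $writer->autoflush(1);            # flush after every print on this handle

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                  # child: write one line, then exit
        close $reader;
        print {$writer} "hello from child $$\n";
        close $writer;
        exit 0;
    }

    close $writer;                    # parent drops its copy of the write end,
                                      # so the reader can see end-of-file
    my $line = <$reader>;
    close $reader;
    waitpid($pid, 0);
    print "parent read: $line";
    ```

    Each process closes the end it does not use. If the parent kept its write end open, a loop reading until end-of-file would never finish.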

  • pop ARRAY
  • pop EXPR
  • pop

    Pops and returns the last value of the array, shortening the array by one element.

    Returns the undefined value if the array is empty, although this may also happen at other times. If ARRAY is omitted, pops the @ARGV array in the main program, but the @_ array in subroutines, just like shift.
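
    Because pop can return undef for reasons other than an empty array, a loop that drains a stack is usually safer when it tests the array rather than the popped value. A short sketch:

    ```perl
    use strict;
    use warnings;

    my @stack = (1, 2, 3);
    my $top = pop @stack;        # $top is 3, @stack is now (1, 2)

    # Test the array itself, not the popped value: a stored element
    # may legitimately be undef or 0.
    my @seen;
    while (@stack) {
        push @seen, pop @stack;
    }
    print "top=$top seen=@seen\n";
    ```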

    Starting with Perl 5.14, pop can take a scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of pop is considered highly experimental. The exact behaviour may change in a future version of Perl.

    To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

        use 5.014; # so push/pop/etc work on scalars (experimental)

  • pos SCALAR
  • pos

    Returns the offset of where the last m//g search left off for the variable in question ($_ is used when the variable is not specified). Note that 0 is a valid match offset. undef indicates that the search position is reset (usually due to match failure, but can also be because no match has yet been run on the scalar).

    pos directly accesses the location used by the regexp engine to store the offset, so assigning to pos will change that offset, and so will also influence the \G zero-width assertion in regular expressions. Both of these effects take place for the next match, so you can't affect the position with pos during the current match, such as in (?{pos() = 5}) or s//pos() = 5/e .

    Setting pos also resets the matched with zero-length flag, described under Repeated Patterns Matching a Zero-length Substring in perlre.

    Because a failed m//gc match doesn't reset the offset, the return from pos won't change either in this case. See perlre and perlop.
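
    The following sketch records the offsets left behind by successive m//g matches, then assigns to pos so that a later match anchored with \G starts at a chosen offset. The sample string is arbitrary.

    ```perl
    use strict;
    use warnings;

    my $text = 'aXbXc';
    my @offsets;
    while ($text =~ /X/g) {
        push @offsets, pos($text);   # offset just past each match
    }
    # The loop ends on a failed match, which resets pos($text) to undef.
    my $after_loop = pos($text);

    pos($text) = 2;                  # the next match starts at offset 2
    my $captured;
    if ($text =~ /\G(\w)/g) {        # \G anchors at the offset just set
        $captured = $1;
    }
    print "offsets=@offsets captured=$captured\n";
    ```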

  • print FILEHANDLE LIST
  • print FILEHANDLE
  • print LIST
  • print

    Prints a string or a list of strings. Returns true if successful. FILEHANDLE may be a scalar variable containing the name of or a reference to the filehandle, thus introducing one level of indirection. (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be misinterpreted as an operator unless you interpose a + or put parentheses around the arguments.) If FILEHANDLE is omitted, prints to the last selected (see select) output handle. If LIST is omitted, prints $_ to the currently selected output handle. To use FILEHANDLE alone to print the content of $_ to it, you must use a real filehandle like FH , not an indirect one like $fh . To set the default output handle to something other than STDOUT, use the select operation.

    The current value of $, (if any) is printed between each LIST item. The current value of $\ (if any) is printed after the entire LIST has been printed. Because print takes a LIST, anything in the LIST is evaluated in list context, including any subroutines whose return lists you pass to print. Be careful not to follow the print keyword with a left parenthesis unless you want the corresponding right parenthesis to terminate the arguments to the print; put parentheses around all arguments (or interpose a + , but that doesn't look as good).

    If you're storing handles in an array or hash, or in general whenever you're using any expression more complex than a bareword handle or a plain, unsubscripted scalar variable to retrieve it, you will have to use a block returning the filehandle value instead, in which case the LIST may not be omitted:

        print { $files[$i] } "stuff\n";
        print { $OK ? STDOUT : STDERR } "stuff\n";

    Printing to a closed pipe or socket will generate a SIGPIPE signal. See perlipc for more on signal handling.
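
    As a small illustration of $, and $\ and of the block form, the sketch below prints to an in-memory filehandle so the result can be inspected afterwards. The %handles hash is a hypothetical container used only to show a non-trivial handle expression.

    ```perl
    use strict;
    use warnings;

    my $out = '';
    open(my $fh, '>', \$out) or die "can't open in-memory handle: $!";

    {
        local $, = '-';      # printed between LIST items
        local $\ = "\n";     # printed after the whole LIST
        print $fh 'a', 'b', 'c';
    }

    # A hash element is more complex than a plain scalar variable,
    # so the handle has to be supplied through a block.
    my %handles = (log => $fh);
    print { $handles{log} } "via block\n";
    close $fh;

    print $out;
    ```

    Localizing $, and $\ keeps the separators from leaking into unrelated print calls elsewhere in the program.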

  • printf FILEHANDLE FORMAT, LIST
  • printf FILEHANDLE
  • printf FORMAT, LIST
  • printf

    Equivalent to print FILEHANDLE sprintf(FORMAT, LIST) , except that $\ (the output record separator) is not appended. The FORMAT and the LIST are actually parsed as a single list. The first argument of the list will be interpreted as the printf format. This means that printf(@_) will use $_[0] as the format. See sprintf for an explanation of the format argument. If use locale (including use locale ':not_characters' ) is in effect and POSIX::setlocale() has been called, the character used for the decimal separator in formatted floating-point numbers is affected by the LC_NUMERIC locale setting. See perllocale and POSIX.

    For historical reasons, if you omit the list, $_ is used as the format; to use FILEHANDLE without a list, you must use a real filehandle like FH , not an indirect one like $fh . However, this will rarely do what you want; if $_ contains formatting codes, they will be replaced with the empty string and a warning will be emitted if warnings are enabled. Just use print if you want to print the contents of $_.

    Don't fall into the trap of using a printf when a simple print would do. The print is more efficient and less error prone.
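
    A brief sketch of both points: printf is the right tool when fields need padding or rounding, and text that is not a format should be passed as an argument to "%s" rather than used as the format itself. The sample data is made up for illustration.

    ```perl
    use strict;
    use warnings;

    my @row = ('widget', 3, 4.5);
    printf "%-10s %3d %8.2f\n", @row;     # left-justified name, padded numbers

    # The first element of the list is always taken as the format, so a
    # string containing "%" must not be passed in that position.
    my $untrusted = '100% done';
    printf "%s\n", $untrusted;            # or simply: print "$untrusted\n";
    ```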

  • prototype FUNCTION

    Returns the prototype of a function as a string (or undef if the function has no prototype). FUNCTION is a reference to, or the name of, the function whose prototype you want to retrieve.

    If FUNCTION is a string starting with CORE:: , the rest is taken as a name for a Perl builtin. If the builtin's arguments cannot be adequately expressed by a prototype (such as system), prototype() returns undef, because the builtin does not really behave like a Perl function. Otherwise, the string describing the equivalent prototype is returned.
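
    The sketch below queries a user-defined subroutine by reference and by name, a subroutine without a prototype, and the system builtin mentioned above. The subroutines pair and plain are hypothetical.

    ```perl
    use strict;
    use warnings;

    sub pair($$) { return "$_[0]:$_[1]" }    # hypothetical sub with a prototype
    sub plain    { return scalar @_ }        # no prototype

    my $by_ref  = prototype(\&pair);         # lookup through a code reference
    my $by_name = prototype('pair');         # lookup by name
    my $none    = prototype(\&plain);        # undef: no prototype declared
    my $system  = prototype('CORE::system'); # undef: cannot be expressed

    print "pair: $by_ref / $by_name\n";
    print "plain has ",  (defined $none   ? "a" : "no"), " prototype\n";
    print "system has ", (defined $system ? "a" : "no"), " prototype\n";
    ```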

  • push ARRAY,LIST
  • push EXPR,LIST

    Treats ARRAY as a stack by appending the values of LIST to the end of ARRAY. The length of ARRAY increases by the length of LIST. Has the same effect as

        for $value (LIST) {
            $ARRAY[++$#ARRAY] = $value;
        }

    but is more efficient. Returns the number of elements in the array following the completed push.
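
    A short sketch of the return value and of list flattening. The last push uses an explicit dereference, which works on any Perl 5 and avoids the experimental scalar form.

    ```perl
    use strict;
    use warnings;

    my @queue = ('a');
    my $count = push @queue, 'b', 'c';     # returns the new length, 3

    my @more = ('d', 'e');
    push @queue, @more;                    # LIST is flattened: five elements now

    my $aref = \@queue;
    push @$aref, 'f';                      # explicit dereference of an array ref

    print "count=$count queue=@queue\n";
    ```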

    Starting with Perl 5.14, push can take a scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of push is considered highly experimental. The exact behaviour may change in a future version of Perl.

    To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

        use 5.014; # so push/pop/etc work on scalars (experimental)

  • q/STRING/
  • qq/STRING/
  • qw/STRING/
  • qx/STRING/

    Generalized quotes. See Quote-Like Operators in perlop.

  • qr/STRING/

    Regexp-like quote. See Regexp Quote-Like Operators in perlop.

  • quotemeta EXPR
  • quotemeta

    Returns the value of EXPR with all the ASCII non-"word" characters backslashed. (That is, all ASCII characters not matching /[A-Za-z_0-9]/ will be preceded by a backslash in the returned string, regardless of any locale settings.) This is the internal function implementing the \Q escape in double-quoted strings. (See below for the behavior on non-ASCII code points.)

    If EXPR is omitted, uses $_ .

    quotemeta (and \Q ... \E ) are useful when interpolating strings into regular expressions, because by default an interpolated variable will be considered a mini-regular expression. For example:

        my $sentence = 'The quick brown fox jumped over the lazy dog';
        my $substring = 'quick.*?fox';
        $sentence =~ s{$substring}{big bad wolf};

    Will cause $sentence to become 'The big bad wolf jumped over...' .

    On the other hand:

        my $sentence = 'The quick brown fox jumped over the lazy dog';
        my $substring = 'quick.*?fox';
        $sentence =~ s{\Q$substring\E}{big bad wolf};

    Or:

        my $sentence = 'The quick brown fox jumped over the lazy dog';
        my $substring = 'quick.*?fox';
        my $quoted_substring = quotemeta($substring);
        $sentence =~ s{$quoted_substring}{big bad wolf};

    Will both leave the sentence as is. Normally, when accepting literal string input from the user, quotemeta() or \Q must be used.

    In Perl v5.14, all non-ASCII characters are quoted in non-UTF-8-encoded strings, but not quoted in UTF-8 strings.

    Starting in Perl v5.16, Perl adopted a Unicode-defined strategy for quoting non-ASCII characters; the quoting of ASCII characters is unchanged.

    Also unchanged is the quoting of non-UTF-8 strings when outside the scope of a use feature 'unicode_strings' , which is to quote all characters in the upper Latin1 range. This provides complete backwards compatibility for old programs which do not use Unicode. (Note that unicode_strings is automatically enabled within the scope of a use v5.12 or greater.)

    Within the scope of use locale , all non-ASCII Latin1 code points are quoted whether the string is encoded as UTF-8 or not. As mentioned above, locale does not affect the quoting of ASCII-range characters. This protects against those locales where characters such as "|" are considered to be word characters.

    Otherwise, Perl quotes non-ASCII characters using an adaptation from Unicode (see http://www.unicode.org/reports/tr31/). The only code points that are quoted are those that have any of the Unicode properties: Pattern_Syntax, Pattern_White_Space, White_Space, Default_Ignorable_Code_Point, or General_Category=Control.

    Of these properties, the two important ones are Pattern_Syntax and Pattern_White_Space. They have been set up by Unicode for exactly this purpose of deciding which characters in a regular expression pattern should be quoted. No character that can be in an identifier has these properties.

    Perl promises that if we ever add regular expression pattern metacharacters to the dozen already defined (\ | ( ) [ { ^ $ * + ? . ), we will use only ones that have the Pattern_Syntax property. Perl also promises that if we ever add characters that are considered to be white space in regular expressions (currently mostly affected by /x), they will all have the Pattern_White_Space property.

    Unicode promises that the set of code points that have these two properties will never change, so something that is not quoted in v5.16 will never need to be quoted in any future Perl release. (Not all the code points that match Pattern_Syntax have actually had characters assigned to them; so there is room to grow, but they are quoted whether assigned or not. Perl, of course, would never use an unassigned code point as an actual metacharacter.)

    Quoting characters that have the other 3 properties is done to enhance the readability of the regular expression and not because they actually need to be quoted for regular expression purposes (characters with the White_Space property are likely to be indistinguishable on the page or screen from those with the Pattern_White_Space property; and the other two properties contain non-printing characters).
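
    For the ASCII case, the effect can be seen directly. This sketch quotes a short literal containing two metacharacters and uses \Q...\E to match it exactly; the candidate strings are made up for illustration.

    ```perl
    use strict;
    use warnings;

    my $literal = 'a.b*c';
    my $quoted  = quotemeta($literal);      # each non-word character gets a backslash

    # With \Q...\E only the literal text matches. Without it, the pattern
    # a.b*c would match 'aXbbbc' and would not match the literal itself.
    my @hits = grep { /^\Q$literal\E$/ } ('a.b*c', 'aXbbbc');

    print "quoted=$quoted hits=@hits\n";
    ```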

  • rand EXPR
  • rand

    Returns a random fractional number greater than or equal to 0 and less than the value of EXPR. (EXPR should be positive.) If EXPR is omitted, the value 1 is used. Currently EXPR with the value 0 is also special-cased as 1 (this was undocumented before Perl 5.8.0 and is subject to change in future versions of Perl). Automatically calls srand unless srand has already been called. See also srand.

    Apply int() to the value returned by rand() if you want random integers instead of random fractional numbers. For example,

        int(rand(10))

    returns a random integer between 0 and 9 , inclusive.

    (Note: If your rand function consistently returns numbers that are too large or too small, then your version of Perl was probably compiled with the wrong number of RANDBITS.)

    rand() is not cryptographically secure. You should not rely on it in security-sensitive situations. As of this writing, a number of third-party CPAN modules offer random number generators intended by their authors to be cryptographically secure, including: Data::Entropy, Crypt::Random, Math::Random::Secure, and Math::TrulyRandom.
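
    Two common uses of rand for non-security purposes are sketched below: an integer in an inclusive range, and a random element of an array. The helper name rand_int is hypothetical.

    ```perl
    use strict;
    use warnings;

    # Integer in the inclusive range $lo .. $hi (hypothetical helper).
    sub rand_int {
        my ($lo, $hi) = @_;
        return $lo + int(rand($hi - $lo + 1));
    }

    my @colors = qw(red green blue);
    # rand sees @colors in scalar context (its length); the fractional
    # result is truncated to an integer when used as an array index.
    my $pick = $colors[ rand @colors ];

    my $die_roll = rand_int(1, 6);
    print "rolled $die_roll, picked $pick\n";
    ```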

  • read FILEHANDLE,SCALAR,LENGTH,OFFSET
  • read FILEHANDLE,SCALAR,LENGTH

    Attempts to read LENGTH characters of data into variable SCALAR from the specified FILEHANDLE. Returns the number of characters actually read, 0 at end of file, or undef if there was an error (in the latter case $! is also set). SCALAR will be grown or shrunk so that the last character actually read is the last character of the scalar after the read.

    An OFFSET may be specified to place the read data at some place in the string other than the beginning. A negative OFFSET specifies placement at that many characters counting backwards from the end of the string. A positive OFFSET greater than the length of SCALAR results in the string being padded to the required size with "\0" bytes before the result of the read is appended.

    The call is implemented in terms of either Perl's or your system's native fread(3) library function. To get a true read(2) system call, see sysread.

    Note the characters: depending on the status of the filehandle, either (8-bit) bytes or characters are read. By default, all filehandles operate on bytes, but for example if the filehandle has been opened with the :utf8 I/O layer (see open, and the open pragma, open), the I/O will operate on UTF8-encoded Unicode characters, not bytes. Similarly for the :encoding pragma: in that case pretty much any characters can be read.
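
    The sketch below reads fixed-size chunks until end of file, distinguishing the undef (error) and 0 (end of file) return values, and then uses OFFSET to append to existing content. It reads from an in-memory filehandle so that it is self-contained.

    ```perl
    use strict;
    use warnings;

    my $data = 'ABCDEFGHIJ';
    open(my $fh, '<', \$data) or die "can't open in-memory handle: $!";

    my @chunks;
    while (1) {
        my $n = read($fh, my $buf, 4);
        die "read failed: $!" unless defined $n;   # undef signals an error
        last if $n == 0;                           # 0 signals end of file
        push @chunks, $buf;                        # last chunk may be short
    }

    # With OFFSET, the data is placed after the existing content.
    seek($fh, 0, 0) or die "seek failed: $!";
    my $acc = 'xx';
    read($fh, $acc, 3, length $acc);
    close $fh;

    print "chunks=@chunks acc=$acc\n";
    ```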

  • readdir DIRHANDLE

    Returns the next directory entry for a directory opened by opendir. If used in list context, returns all the rest of the entries in the directory. If there are no more entries, returns the undefined value in scalar context and the empty list in list context.

    If you're planning to filetest the return values out of a readdir, you'd better prepend the directory in question. Otherwise, because we didn't chdir there, it would have been testing the wrong file.

        opendir(my $dh, $some_dir) || die "can't opendir $some_dir: $!";
        @dots = grep { /^\./ && -f "$some_dir/$_" } readdir($dh);
        closedir $dh;

    As of Perl 5.12 you can use a bare readdir in a while loop, which will set $_ on every iteration.

        opendir(my $dh, $some_dir) || die;
        while(readdir $dh) {
            print "$some_dir/$_\n";
        }
        closedir $dh;

    To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious failures, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

        use 5.012; # so readdir assigns to $_ in a lone while test

  • readline EXPR
  • readline

    Reads from the filehandle whose typeglob is contained in EXPR (or from *ARGV if EXPR is not provided). In scalar context, each call reads and returns the next line until end-of-file is reached, whereupon the subsequent call returns undef. In list context, reads until end-of-file is reached and returns a list of lines. Note that the notion of "line" used here is whatever you may have defined with $/ (or $INPUT_RECORD_SEPARATOR). See $/ in perlvar.

    When $/ is set to undef, when readline is in scalar context (i.e., file slurp mode), and when an empty file is read, it returns '' the first time, followed by undef subsequently.

    This is the internal function implementing the <EXPR> operator, but you can use it directly. The <EXPR> operator is discussed in more detail in I/O Operators in perlop.

        $line = <STDIN>;
        $line = readline(*STDIN); # same thing

    If readline encounters an operating system error, $! will be set with the corresponding error message. It can be helpful to check $! when you are reading from filehandles you don't trust, such as a tty or a socket. The following example uses the operator form of readline and dies if the result is not defined.

        while ( ! eof($fh) ) {
            defined( $_ = <$fh> ) or die "readline failed: $!";
            ...
        }

    Note that you can't handle readline errors that way with the ARGV filehandle. In that case, you have to open each element of @ARGV yourself since eof handles ARGV differently.

        foreach my $arg (@ARGV) {
            open(my $fh, $arg) or warn "Can't open $arg: $!";
            while ( ! eof($fh) ) {
                defined( $_ = <$fh> )
                    or die "readline failed for $arg: $!";
                ...
            }
        }

  • readlink EXPR
  • readlink

    Returns the value of a symbolic link, if symbolic links are implemented. If not, raises an exception. If there is a system error, returns the undefined value and sets $! (errno). If EXPR is omitted, uses $_ .

    Portability issues: readlink in perlport.

  • readpipe EXPR
  • readpipe

    EXPR is executed as a system command. The collected standard output of the command is returned. In scalar context, it comes back as a single (potentially multi-line) string. In list context, returns a list of lines (however you've defined lines with $/ or $INPUT_RECORD_SEPARATOR ). This is the internal function implementing the qx/EXPR/ operator, but you can use it directly. The qx/EXPR/ operator is discussed in more detail in I/O Operators in perlop. If EXPR is omitted, uses $_ .
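
    A minimal sketch of the two contexts follows. It assumes that an echo command is available through the system shell, which holds on most Unix-like systems and on Windows but is an assumption about the environment.

    ```perl
    use strict;
    use warnings;

    # Assumes an "echo" command is available through the system shell.
    my $out   = readpipe('echo hello');     # scalar context: one string
    my @lines = readpipe('echo hello');     # list context: a list of lines

    die "command failed: exit status $?" if $? != 0;

    (my $first = $lines[0]) =~ s/\s+\z//;   # strip the trailing line ending
    print "got '$first' in ", scalar(@lines), " line(s)\n";
    ```

    As with qx//, the exit status of the command is left in $? and is worth checking before trusting the output.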

  • recv SOCKET,SCALAR,LENGTH,FLAGS

    Receives a message on a socket. Attempts to receive LENGTH characters of data into variable SCALAR from the specified SOCKET filehandle. SCALAR will be grown or shrunk to the length actually read. Takes the same flags as the system call of the same name. Returns the address of the sender if SOCKET's protocol supports this; returns an empty string otherwise. If there's an error, returns the undefined value. This call is actually implemented in terms of the recvfrom(2) system call. See UDP: Message Passing in perlipc for examples.

    Note the characters: depending on the status of the socket, either (8-bit) bytes or characters are received. By default all sockets operate on bytes, but for example if the socket has been changed using binmode() to operate with the :encoding(utf8) I/O layer (see the open pragma, open), the I/O will operate on UTF8-encoded Unicode characters, not bytes. Similarly for the :encoding pragma: in that case pretty much any characters can be read.

  • redo LABEL
  • redo EXPR
  • redo

    The redo command restarts the loop block without evaluating the conditional again. The continue block, if any, is not executed. If the LABEL is omitted, the command refers to the innermost enclosing loop. The redo EXPR form, available starting in Perl 5.18.0, allows a label name to be computed at run time, and is otherwise identical to redo LABEL . Programs that want to lie to themselves about what was just input normally use this command:

        # a simpleminded Pascal comment stripper
        # (warning: assumes no { or } in strings)
        LINE: while (<STDIN>) {
            while (s|({.*}.*){.*}|$1 |) {}
            s|{.*}| |;
            if (s|{.*| |) {
                $front = $_;
                while (<STDIN>) {
                    if (/}/) { # end of comment?
                        s|^|$front\{|;
                        redo LINE;
                    }
                }
            }
            print;
        }

    redo cannot be used to retry a block that returns a value such as eval {} , sub {} , or do {} , and should not be used to exit a grep() or map() operation.

    Note that a block by itself is semantically identical to a loop that executes once. Thus redo inside such a block will effectively turn it into a looping construct.

    See also continue for an illustration of how last, next, and redo work.

    Unlike most named operators, this has the same precedence as assignment. It is also exempt from the looks-like-a-function rule, so redo ("foo")."bar" will cause "bar" to be part of the argument to redo.

  • ref EXPR
  • ref

    Returns a non-empty string if EXPR is a reference, the empty string otherwise. If EXPR is not specified, $_ will be used. The value returned depends on the type of thing the reference is a reference to. Builtin types include:

        SCALAR
        ARRAY
        HASH
        CODE
        REF
        GLOB
        LVALUE
        FORMAT
        IO
        VSTRING
        Regexp

    If the referenced object has been blessed into a package, then that package name is returned instead. You can think of ref as a typeof operator.

        if (ref($r) eq "HASH") {
            print "r is a reference to a hash.\n";
        }
        unless (ref($r)) {
            print "r is not a reference at all.\n";
        }

    The return value LVALUE indicates a reference to an lvalue that is not a variable. You get this from taking the reference of function calls like pos() or substr(). VSTRING is returned if the reference points to a version string.

    The result Regexp indicates that the argument is a regular expression resulting from qr//.

    See also perlref.
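
    The sketch below shows what ref returns for several kinds of reference, including a blessed one. Since ref reports the package name for a blessed reference, the example also uses reftype and blessed from the core Scalar::Util module to recover the underlying type. My::Class is a hypothetical package name.

    ```perl
    use strict;
    use warnings;
    use Scalar::Util qw(blessed reftype);

    my $obj = bless {}, 'My::Class';        # My::Class is a hypothetical package

    my %kind = (
        scalar => ref(\ 'text'),            # SCALAR
        array  => ref([]),                  # ARRAY
        code   => ref(sub { 1 }),           # CODE
        refref => ref(\ []),                # REF: a reference to a reference
        regexp => ref(qr/x/),               # Regexp
        object => ref($obj),                # My::Class
        plain  => ref('text'),              # empty string: not a reference
    );

    # ref() on a blessed reference hides the underlying type;
    # reftype() reports it regardless of blessing.
    print "object is a ", reftype($obj), " blessed into ", blessed($obj), "\n";
    print "$_ => '$kind{$_}'\n" for sort keys %kind;
    ```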

  • rename OLDNAME,NEWNAME

    Changes the name of a file; an existing file NEWNAME will be clobbered. Returns true for success, false otherwise.

    Behavior of this function varies wildly depending on your system implementation. For example, it will usually not work across file system boundaries, even though the system mv command sometimes compensates for this. Other restrictions include whether it works on directories, open files, or pre-existing files. Check perlport and either the rename(2) manpage or equivalent system documentation for details.

    For a platform independent move function look at the File::Copy module.

    Portability issues: rename in perlport.
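
    One way to combine the two, sketched under the assumption that a temporary directory is writable: try rename first and fall back to move from File::Copy if it fails. The file names are hypothetical.

    ```perl
    use strict;
    use warnings;
    use File::Copy qw(move);
    use File::Temp qw(tempdir);

    my $dir = tempdir(CLEANUP => 1);        # scratch directory, removed at exit
    my ($old, $new) = ("$dir/old.txt", "$dir/new.txt");

    open(my $fh, '>', $old) or die "can't create $old: $!";
    print {$fh} "payload\n";
    close($fh) or die "can't close $old: $!";

    # Try the plain rename first. If it fails (for example across file
    # system boundaries), fall back to move, which can copy and delete.
    rename($old, $new)
        or move($old, $new)
        or die "can't move $old to $new: $!";

    print "moved\n" if -e $new && !-e $old;
    ```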

  • require VERSION
  • require EXPR
  • require

    Demands a version of Perl specified by VERSION, or demands some semantics specified by EXPR or by $_ if EXPR is not supplied.

    VERSION may be either a numeric argument such as 5.006, which will be compared to $] , or a literal of the form v5.6.1, which will be compared to $^V (aka $PERL_VERSION). An exception is raised if VERSION is greater than the version of the current Perl interpreter. Compare with use, which can do a similar check at compile time.

    Specifying VERSION as a literal of the form v5.6.1 should generally be avoided, because it leads to misleading error messages under earlier versions of Perl that do not support this syntax. The equivalent numeric version should be used instead.

        require v5.6.1;     # run time version check
        require 5.6.1;      # ditto
        require 5.006_001;  # ditto; preferred for backwards
                            # compatibility

    Otherwise, require demands that a library file be included if it hasn't already been included. The file is included via the do-FILE mechanism, which is essentially just a variety of eval with the caveat that lexical variables in the invoking script will be invisible to the included code. Has semantics similar to the following subroutine:

        sub require {
            my ($filename) = @_;
            if (exists $INC{$filename}) {
                return 1 if $INC{$filename};
                die "Compilation failed in require";
            }
            my ($realfilename,$result);
            ITER: {
                foreach $prefix (@INC) {
                    $realfilename = "$prefix/$filename";
                    if (-f $realfilename) {
                        $INC{$filename} = $realfilename;
                        $result = do $realfilename;
                        last ITER;
                    }
                }
                die "Can't find $filename in \@INC";
            }
            if ($@) {
                $INC{$filename} = undef;
                die $@;
            } elsif (!$result) {
                delete $INC{$filename};
                die "$filename did not return true value";
            } else {
                return $result;
            }
        }

    Note that the file will not be included twice under the same specified name.

    The file must return true as the last statement to indicate successful execution of any initialization code, so it's customary to end such a file with 1; unless you're sure it'll return true otherwise. But it's better just to put the 1; , in case you add more statements.

    If EXPR is a bareword, the require assumes a ".pm" extension and replaces "::" with "/" in the filename for you, to make it easy to load standard modules. This form of loading of modules does not risk altering your namespace.

    In other words, if you try this:

        require Foo::Bar; # a splendid bareword

    The require function will actually look for the "Foo/Bar.pm" file in the directories specified in the @INC array.

    But if you try this:

        $class = 'Foo::Bar';
        require $class; # $class is not a bareword
        #or
        require "Foo::Bar"; # not a bareword because of the ""

    The require function will look for the "Foo::Bar" file in the @INC array and will complain about not finding "Foo::Bar" there. In this case you can do:

        eval "require $class";
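
    An alternative that avoids string eval is to perform the bareword translation by hand and pass require the resulting file name. The sketch below does this for List::Util, chosen only because it ships with Perl; any installed module name would do.

    ```perl
    use strict;
    use warnings;

    my $class = 'List::Util';               # any installed module name

    # Translate Foo::Bar to Foo/Bar.pm by hand, then require the file name.
    (my $file = "$class.pm") =~ s{::}{/}g;
    require $file;

    my $max = $class->can('max');           # code reference to List::Util::max
    print "loaded $file, max is ", $max->(3, 9, 4), "\n";
    ```

    Since no code string is compiled, a module name that arrives from outside the program cannot smuggle in extra statements this way.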

    Now that you understand how require looks for files with a bareword argument, there is a little extra functionality going on behind the scenes. Before require looks for a ".pm" extension, it will first look for a similar filename with a ".pmc" extension. If this file is found, it will be loaded in place of any file ending in a ".pm" extension.

    You can also insert hooks into the import facility by putting Perl code directly into the @INC array. There are three forms of hooks: subroutine references, array references, and blessed objects.

    Subroutine references are the simplest case. When the inclusion system walks through @INC and encounters a subroutine, this subroutine gets called with two parameters, the first a reference to itself, and the second the name of the file to be included (e.g., "Foo/Bar.pm"). The subroutine should return either nothing or else a list of up to three values in the following order:

    1

    A filehandle, from which the file will be read.

    2

    A reference to a subroutine. If there is no filehandle (previous item), then this subroutine is expected to generate one line of source code per call, writing the line into $_ and returning 1, then finally at end of file returning 0. If there is a filehandle, then the subroutine will be called to act as a simple source filter, with the line as read in $_ . Again, return 1 for each valid line, and 0 after all lines have been returned.

    3

    Optional state for the subroutine. The state is passed in as $_[1] . A reference to the subroutine itself is passed in as $_[0] .

    If an empty list, undef, or nothing that matches the first 3 values above is returned, then require looks at the remaining elements of @INC. Note that this filehandle must be a real filehandle (strictly a typeglob or reference to a typeglob, whether blessed or unblessed); tied filehandles will be ignored and processing will stop there.

    If the hook is an array reference, its first element must be a subroutine reference. This subroutine is called as above, but the first parameter is the array reference. This lets you indirectly pass arguments to the subroutine.

    In other words, you can write:

        push @INC, \&my_sub;
        sub my_sub {
            my ($coderef, $filename) = @_; # $coderef is \&my_sub
            ...
        }

    or:

        push @INC, [ \&my_sub, $x, $y, ... ];
        sub my_sub {
            my ($arrayref, $filename) = @_;
            # Retrieve $x, $y, ...
            my @parameters = @$arrayref[1..$#$arrayref];
            ...
        }

    If the hook is an object, it must provide an INC method that will be called as above, the first parameter being the object itself. (Note that you must fully qualify the sub's name, as unqualified INC is always forced into package main .) Here is a typical code layout:

    # In Foo.pm
    package Foo;
    sub new { ... }
    sub Foo::INC {
        my ($self, $filename) = @_;
        ...
    }

    # In the main program
    push @INC, Foo->new(...);

    These hooks are also permitted to set the %INC entry corresponding to the files they have loaded. See %INC in perlvar.
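    As a minimal sketch of the subroutine-reference form, the hook below serves a module from a string held in memory. The module name My::Virtual and the %virtual_files table are hypothetical, chosen only for illustration; the hook returns a filehandle (item 1 above) opened on the in-memory source, and returns nothing for any file it does not know about so that require continues with the rest of @INC.

```perl
use strict;
use warnings;

# Hypothetical module source kept in memory, keyed by the path that
# require asks for.
my %virtual_files = (
    'My/Virtual.pm' => "package My::Virtual;\nsub answer { 42 }\n1;\n",
);

# A subroutine hook: require calls it as $hook->($hook, $filename).
unshift @INC, sub {
    my ($self, $filename) = @_;

    # Decline unknown files; require then tries the next @INC entry.
    return unless exists $virtual_files{$filename};

    my $source = $virtual_files{$filename};
    open my $fh, '<', \$source
        or die "Cannot open in-memory handle for $filename: $!";
    return $fh;    # a filehandle from which the file is read
};

require My::Virtual;
print My::Virtual::answer(), "\n";    # prints 42
```

    The hook is placed at the front of @INC here so that it is consulted before any directory; pushing it to the end would instead make it a fallback.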

    For a yet-more-powerful import facility, see use and perlmod.

  • reset EXPR
  • reset

    Generally used in a continue block at the end of a loop to clear variables and reset ?? searches so that they work again. The expression is interpreted as a list of single characters (hyphens allowed for ranges). All variables and arrays beginning with one of those letters are reset to their pristine state. If the expression is omitted, one-match searches (?pattern? ) are reset to match again. Only resets variables or searches in the current package. Always returns 1. Examples:

    reset 'X';      # reset all X variables
    reset 'a-z';    # reset lower case variables
    reset;          # just reset ?one-time? searches

    Resetting "A-Z" is not recommended because you'll wipe out your @ARGV and @INC arrays and your %ENV hash. Resets only package variables; lexical variables are unaffected, but they clean themselves up on scope exit anyway, so you'll probably want to use them instead. See my.

  • return EXPR
  • return

    Returns from a subroutine, eval, or do FILE with the value given in EXPR. Evaluation of EXPR may be in list, scalar, or void context, depending on how the return value will be used, and the context may vary from one execution to the next (see wantarray). If no EXPR is given, returns an empty list in list context, the undefined value in scalar context, and (of course) nothing at all in void context.

    (In the absence of an explicit return, a subroutine, eval, or do FILE automatically returns the value of the last expression evaluated.)

    Unlike most named operators, this is also exempt from the looks-like-a-function rule, so return ("foo")."bar" will cause "bar" to be part of the argument to return.
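    The context rules above can be sketched as follows. The subroutine names are hypothetical; the example only shows that the same return statement sees different contexts depending on the caller, and that a bare return yields an empty list or undef accordingly.

```perl
use strict;
use warnings;

# Reports the context in which the caller invoked the subroutine.
sub context_of_call {
    return 'list'   if wantarray;
    return 'scalar' if defined wantarray;
    return 'void';
}

sub nothing { return }                      # bare return
sub joined  { return ("foo")."bar" }        # "bar" is part of the argument

my @in_list   = context_of_call();          # ('list')
my $in_scalar = context_of_call();          # 'scalar'

my @empty = nothing();                      # empty list in list context
my $undef = nothing();                      # undef in scalar context

printf "%s %s %d %s %s\n",
    $in_list[0], $in_scalar, scalar(@empty),
    (defined $undef ? 'defined' : 'undef'), joined();
# prints: list scalar 0 undef foobar
```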

  • reverse LIST

    In list context, returns a list value consisting of the elements of LIST in the opposite order. In scalar context, concatenates the elements of LIST and returns a string value with all characters in the opposite order.

    print join(", ", reverse "world", "Hello");  # Hello, world
    print scalar reverse "dlrow ,", "olleH";     # Hello, world

    Used without arguments in scalar context, reverse() reverses $_ .

    $_ = "dlrow ,olleH";
    print reverse;         # No output, list context
    print scalar reverse;  # Hello, world

    Note that reversing an array to itself (as in @a = reverse @a ) will preserve non-existent elements whenever possible; i.e., for non-magical arrays or for tied arrays with EXISTS and DELETE methods.

    This operator is also handy for inverting a hash, although there are some caveats. If a value is duplicated in the original hash, only one of those can be represented as a key in the inverted hash. Also, this has to unwind one hash and build a whole new one, which may take some time on a large hash, such as from a DBM file.

    %by_name = reverse %by_address;  # Invert the hash
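
    The duplicate-value caveat can be illustrated with a small sketch. The addresses and names are made up; which of the duplicated addresses survives in the inverted hash is not specified, so the example only counts keys, and then shows one way to keep every address by collecting them in array references instead.

```perl
use strict;
use warnings;

my %by_address = (
    '10.0.0.1' => 'alpha',
    '10.0.0.2' => 'beta',
    '10.0.0.3' => 'alpha',    # duplicate value
);

# Plain inversion: one of the two 'alpha' addresses is lost.
my %by_name = reverse %by_address;
printf "%d addresses, %d names\n",
    scalar(keys %by_address), scalar(keys %by_name);
# prints: 3 addresses, 2 names

# Lossless alternative: map each name to a list of addresses.
my %addresses_for;
push @{ $addresses_for{ $by_address{$_} } }, $_ for sort keys %by_address;
print "alpha: @{ $addresses_for{alpha} }\n";
# prints: alpha: 10.0.0.1 10.0.0.3
```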
  • rewinddir DIRHANDLE

    Sets the current position to the beginning of the directory for the readdir routine on DIRHANDLE.

    Portability issues: rewinddir in perlport.

  • rindex STR,SUBSTR,POSITION
  • rindex STR,SUBSTR

    Works just like index() except that it returns the position of the last occurrence of SUBSTR in STR. If POSITION is specified, returns the last occurrence beginning at or before that position.
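
    A short sketch of the three cases (no POSITION, a POSITION that limits the search, and a substring that does not occur), using an arbitrary sample string:

```perl
use strict;
use warnings;

my $text = "hello world";

my $last    = rindex($text, "o");       # 7: the "o" in "world"
my $earlier = rindex($text, "o", 6);    # 4: last "o" starting at or before 6
my $missing = rindex($text, "z");       # -1: not found

print "$last $earlier $missing\n";      # prints: 7 4 -1
```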

  • rmdir FILENAME
  • rmdir

    Deletes the directory specified by FILENAME if that directory is empty. If it succeeds it returns true; otherwise it returns false and sets $! (errno). If FILENAME is omitted, uses $_ .

    To remove a directory tree recursively (rm -rf on Unix) look at the rmtree function of the File::Path module.

  • s///

    The substitution operator. See Regexp Quote-Like Operators in perlop.

  • say FILEHANDLE LIST
  • say FILEHANDLE
  • say LIST
  • say

    Just like print, but implicitly appends a newline. say LIST is simply an abbreviation for { local $\ = "\n"; print LIST } . To use FILEHANDLE without a LIST to print the contents of $_ to it, you must use a real filehandle like FH , not an indirect one like $fh .

    This keyword is available only when the "say" feature is enabled, or when prefixed with CORE:: ; see feature. Alternatively, put a use v5.10 (or later) declaration in the current scope.
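
    A minimal sketch, assuming the feature is enabled with use feature 'say'. Output is captured in an in-memory filehandle (an illustrative choice, not a requirement) so that the appended newlines can be inspected.

```perl
use strict;
use warnings;
use feature 'say';

# Capture output in a scalar so the result can be examined.
open my $out, '>', \my $captured
    or die "Cannot open in-memory handle: $!";

say {$out} "alpha", "beta";    # list items are printed, then "\n"
say {$out} "gamma";

close $out;
print $captured;               # prints "alphabeta" and "gamma" on two lines
```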

  • scalar EXPR

    Forces EXPR to be interpreted in scalar context and returns the value of EXPR.

    @counts = ( scalar @a, scalar @b, scalar @c );

    There is no equivalent operator to force an expression to be interpolated in list context because in practice, this is never needed. If you really wanted to do so, however, you could use the construction @{[ (some expression) ]} , but usually a simple (some expression) suffices.

    Because scalar is a unary operator, if you accidentally use a parenthesized list for the EXPR, this behaves as a scalar comma expression, evaluating all but the last element in void context and returning the final element evaluated in scalar context. This is seldom what you want.

    The following single statement:

    print uc(scalar(&foo,$bar)),$baz;

    is the moral equivalent of these two:

    &foo;
    print(uc($bar),$baz);

    See perlop for more details on unary operators and the comma operator.
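
    The difference between scalar and list interpretation of the same arrays can be sketched as follows (the array contents are arbitrary):

```perl
use strict;
use warnings;

my @a = (1, 2, 3);
my @b = ();
my @c = ('x', 'y');

# scalar forces each array into scalar context, yielding its length.
my @counts = (scalar @a, scalar @b, scalar @c);    # (3, 0, 2)

# Without scalar, the arrays are flattened into one list.
my @flattened = (@a, @b, @c);                      # 5 elements

# The @{[ ... ]} construction interpolates a list into a string.
my $text = "sizes: @{[ scalar(@a), scalar(@c) ]}";

print "@counts | ", scalar(@flattened), " | $text\n";
# prints: 3 0 2 | 5 | sizes: 3 2
```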

  • seek FILEHANDLE,POSITION,WHENCE

    Sets FILEHANDLE's position, just like the fseek call of stdio . FILEHANDLE may be an expression whose value gives the name of the filehandle. The values for WHENCE are 0 to set the new position in bytes to POSITION; 1 to set it to the current position plus POSITION; and 2 to set it to EOF plus POSITION, typically negative. For WHENCE you may use the constants SEEK_SET , SEEK_CUR , and SEEK_END (start of the file, current position, end of the file) from the Fcntl module. Returns 1 on success, false otherwise.

    Note the phrase "in bytes": even if the filehandle has been set to operate on characters (for example by using the :encoding(utf8) open layer), tell() will return byte offsets, not character offsets (because implementing that would render seek() and tell() rather slow).

    If you want to position the file for sysread or syswrite, don't use seek, because buffering makes its effect on the file's read-write position unpredictable and non-portable. Use sysseek instead.
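
    The three WHENCE values can be sketched with the Fcntl constants on a scratch file. The use of File::Temp here is only a convenience for obtaining a read-write handle.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR SEEK_END);
use File::Temp qw(tempfile);

# Scratch file opened for reading and writing; removed at exit.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "0123456789";

seek($fh, 0, SEEK_SET) or die "seek failed: $!";    # start of file
read($fh, my $head, 3);                             # "012"

seek($fh, -2, SEEK_END) or die "seek failed: $!";   # two bytes before EOF
read($fh, my $tail, 2);                             # "89"

seek($fh, 0, SEEK_CUR) or die "seek failed: $!";    # stay where we are
my $position = tell($fh);                           # 10

print "$head $tail $position\n";                    # prints: 012 89 10
```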

    Due to the rules and rigors of ANSI C, on some systems you have to do a seek whenever you switch between reading and writing. Amongst other things, this may have the effect of calling stdio's clearerr(3). A WHENCE of 1 (SEEK_CUR ) is useful for not moving the file position:

    seek(TEST,0,1);

    This is also useful for applications emulating tail -f . Once you hit EOF on your read and then sleep for a while, you (probably) have to stick in a dummy seek() to reset things. The seek doesn't change the position, but it does clear the end-of-file condition on the handle, so that the next <FILE> makes Perl try again to read something. (We hope.)

    If that doesn't work (some I/O implementations are particularly cantankerous), you might need something like this:

    for (;;) {
        for ($curpos = tell(FILE); $_ = <FILE>;
             $curpos = tell(FILE)) {
            # search for some stuff and put it into files
        }
        sleep($for_a_while);
        seek(FILE, $curpos, 0);
    }
  • seekdir DIRHANDLE,POS

    Sets the current position for the readdir routine on DIRHANDLE. POS must be a value returned by telldir. seekdir also has the same caveats about possible directory compaction as the corresponding system library routine.

  • select FILEHANDLE
  • select

    Returns the currently selected filehandle. If FILEHANDLE is supplied, sets the new current default filehandle for output. This has two effects: first, a write or a print without a filehandle default to this FILEHANDLE. Second, references to variables related to output will refer to this output channel.

    For example, to set the top-of-form format for more than one output channel, you might do the following:

    select(REPORT1);
    $^ = 'report1_top';
    select(REPORT2);
    $^ = 'report2_top';

    FILEHANDLE may be an expression whose value gives the name of the actual filehandle. Thus:

    $oldfh = select(STDERR); $| = 1; select($oldfh);

    Some programmers may prefer to think of filehandles as objects with methods, preferring to write the last example as:

    use IO::Handle;
    STDERR->autoflush(1);
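
    A small sketch of saving and restoring the default output handle. The in-memory handle $report is an illustrative stand-in for a real report file.

```perl
use strict;
use warnings;

open my $report, '>', \my $buffer
    or die "Cannot open in-memory handle: $!";

my $previous = select($report);    # returns the old default handle
print "written to the report\n";   # no filehandle given: goes to $report
select($previous);                 # restore the old default

close $report;
print "buffer holds: $buffer";
```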

    Portability issues: select in perlport.

  • select RBITS,WBITS,EBITS,TIMEOUT

    This calls the select(2) syscall with the bit masks specified, which can be constructed using fileno and vec, along these lines:

    $rin = $win = $ein = '';
    vec($rin, fileno(STDIN),  1) = 1;
    vec($win, fileno(STDOUT), 1) = 1;
    $ein = $rin | $win;

    If you want to select on many filehandles, you may wish to write a subroutine like this:

    sub fhbits {
        my @fhlist = @_;
        my $bits = "";
        for my $fh (@fhlist) {
            vec($bits, fileno($fh), 1) = 1;
        }
        return $bits;
    }
    $rin = fhbits(*STDIN, *TTY, *MYSOCK);

    The usual idiom is:

    ($nfound,$timeleft) =
        select($rout=$rin, $wout=$win, $eout=$ein, $timeout);

    or, to block until something becomes ready, just do this:

    $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);

    Most systems do not bother to return anything useful in $timeleft, so calling select() in scalar context just returns $nfound.

    Any of the bit masks can also be undef. The timeout, if specified, is in seconds, which may be fractional. Note: not all implementations are capable of returning the $timeleft. If not, they always return $timeleft equal to the supplied $timeout.

    You can effect a sleep of 250 milliseconds this way:

    select(undef, undef, undef, 0.25);

    Note that whether select gets restarted after signals (say, SIGALRM) is implementation-dependent. See also perlport for notes on the portability of select.

    On error, select behaves just like select(2): it returns -1 and sets $! .

    On some Unixes, select(2) may report a socket file descriptor as "ready for reading" even when no data is available, and thus any subsequent read would block. This can be avoided if you always use O_NONBLOCK on the socket. See select(2) and fcntl(2) for further details.

    The standard IO::Select module provides a user-friendlier interface to select, mostly because it does all the bit-mask work for you.
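
    The sketch below polls a pipe first with a hand-built bit mask and then with IO::Select. It assumes a Unix-like system where select works on pipes (on some platforms select is limited to sockets), and it uses syswrite and sysread rather than buffered I/O.

```perl
use strict;
use warnings;
use IO::Select;

pipe(my $reader, my $writer) or die "pipe failed: $!";

# Build the read mask by hand with vec and fileno.
my $rin = '';
vec($rin, fileno($reader), 1) = 1;

# Poll with a zero timeout: nothing has been written yet.
my $rout;
my $found_before = select($rout = $rin, undef, undef, 0);

syswrite($writer, "x") or die "syswrite failed: $!";

# Now the reader is ready; wait at most one second.
my $found_after = select($rout = $rin, undef, undef, 1);

# The same check through IO::Select, which manages the mask itself.
my $selector = IO::Select->new($reader);
my @ready    = $selector->can_read(1);

sysread($reader, my $byte, 1);
print "$found_before $found_after ", scalar(@ready), " $byte\n";
```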

    WARNING: One should not attempt to mix buffered I/O (like read or <FH>) with select, except as permitted by POSIX, and even then only on POSIX systems. You have to use sysread instead.

    Portability issues: select in perlport.

  • semctl ID,SEMNUM,CMD,ARG

    Calls the System V IPC function semctl(2). You'll probably have to say

    use IPC::SysV;

    first to get the correct constant definitions. If CMD is IPC_STAT or GETALL, then ARG must be a variable that will hold the returned semid_ds structure or semaphore value array. Returns like ioctl: the undefined value for error, "0 but true" for zero, or the actual return value otherwise. The ARG must consist of a vector of native short integers, which may be created with pack("s!",(0)x$nsem). See also SysV IPC in perlipc, IPC::SysV , IPC::Semaphore documentation.

    Portability issues: semctl in perlport.

  • semget KEY,NSEMS,FLAGS

    Calls the System V IPC function semget(2). Returns the semaphore id, or the undefined value on error. See also SysV IPC in perlipc, IPC::SysV , IPC::Semaphore documentation.

    Portability issues: semget in perlport.

  • semop KEY,OPSTRING

    Calls the System V IPC function semop(2) for semaphore operations such as signalling and waiting. OPSTRING must be a packed array of semop structures. Each semop structure can be generated with pack("s!3", $semnum, $semop, $semflag) . The length of OPSTRING implies the number of semaphore operations. Returns true if successful, false on error. As an example, the following code waits on semaphore $semnum of semaphore id $semid:

    $semop = pack("s!3", $semnum, -1, 0);
    die "Semaphore trouble: $!\n" unless semop($semid, $semop);

    To signal the semaphore, replace -1 with 1 . See also SysV IPC in perlipc, IPC::SysV , and IPC::Semaphore documentation.

    Portability issues: semop in perlport.

  • send SOCKET,MSG,FLAGS,TO
  • send SOCKET,MSG,FLAGS

    Sends a message on a socket. Attempts to send the scalar MSG to the SOCKET filehandle. Takes the same flags as the system call of the same name. On unconnected sockets, you must specify a destination to send to, in which case it does a sendto(2) syscall. Returns the number of characters sent, or the undefined value on error. The sendmsg(2) syscall is currently unimplemented. See UDP: Message Passing in perlipc for examples.

    Note the word "characters": depending on the status of the socket, either (8-bit) bytes or characters are sent. By default all sockets operate on bytes, but for example if the socket has been changed using binmode() to operate with the :encoding(utf8) I/O layer (see open, or the open pragma), the I/O will operate on UTF-8 encoded Unicode characters, not bytes. Similarly for the :encoding pragma: in that case pretty much any characters can be sent.

  • setpgrp PID,PGRP

    Sets the current process group for the specified PID, 0 for the current process. Raises an exception when used on a machine that doesn't implement POSIX setpgid(2) or BSD setpgrp(2). If the arguments are omitted, it defaults to 0,0 . Note that the BSD 4.2 version of setpgrp does not accept any arguments, so only setpgrp(0,0) is portable. See also POSIX::setsid() .

    Portability issues: setpgrp in perlport.

  • setpriority WHICH,WHO,PRIORITY

    Sets the current priority for a process, a process group, or a user. (See setpriority(2).) Raises an exception when used on a machine that doesn't implement setpriority(2).

    Portability issues: setpriority in perlport.

  • setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL

    Sets the socket option requested. Returns undef on error. Use integer constants provided by the Socket module for LEVEL and OPTNAME. Values for LEVEL can also be obtained from getprotobyname. OPTVAL might either be a packed string or an integer. An integer OPTVAL is shorthand for pack("i", OPTVAL).

    An example disabling Nagle's algorithm on a socket:

    use Socket qw(IPPROTO_TCP TCP_NODELAY);
    setsockopt($socket, IPPROTO_TCP, TCP_NODELAY, 1);

    Portability issues: setsockopt in perlport.

  • shift ARRAY
  • shift EXPR
  • shift

    Shifts the first value of the array off and returns it, shortening the array by 1 and moving everything down. If there are no elements in the array, returns the undefined value. If ARRAY is omitted, shifts the @_ array within the lexical scope of subroutines and formats, and the @ARGV array outside a subroutine and also within the lexical scopes established by the eval STRING , BEGIN {} , INIT {} , CHECK {} , UNITCHECK {} , and END {} constructs.

    Starting with Perl 5.14, shift can take a scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of shift is considered highly experimental. The exact behaviour may change in a future version of Perl.

    To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

    use 5.014;  # so push/pop/etc work on scalars (experimental)

    See also unshift, push, and pop. shift and unshift do the same thing to the left end of an array that pop and push do to the right end.
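
    A brief sketch of the common cases: shifting an explicit array, shifting an empty array, and the implicit @_ inside a subroutine. The subroutine greet is hypothetical.

```perl
use strict;
use warnings;

my @queue = qw(first second third);
my $head  = shift @queue;          # 'first'; @queue is now (second, third)

my @empty;
my $none = shift @empty;           # undef: nothing to shift

# Inside a subroutine, shift with no argument works on @_.
sub greet {
    my $name     = shift;
    my $greeting = shift;
    $greeting = 'Hello' unless defined $greeting;
    return "$greeting, $name";
}

print "$head | @queue | ", greet('world'), " | ", greet('Perl', 'Goodbye'), "\n";
# prints: first | second third | Hello, world | Goodbye, Perl
```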

  • shmctl ID,CMD,ARG

    Calls the System V IPC function shmctl. You'll probably have to say

    use IPC::SysV;

    first to get the correct constant definitions. If CMD is IPC_STAT , then ARG must be a variable that will hold the returned shmid_ds structure. Returns like ioctl: undef for error; "0 but true" for zero; and the actual return value otherwise. See also SysV IPC in perlipc and IPC::SysV documentation.

    Portability issues: shmctl in perlport.

  • shmget KEY,SIZE,FLAGS

    Calls the System V IPC function shmget. Returns the shared memory segment id, or undef on error. See also SysV IPC in perlipc and IPC::SysV documentation.

    Portability issues: shmget in perlport.

  • shmread ID,VAR,POS,SIZE
  • shmwrite ID,STRING,POS,SIZE

    Reads or writes the System V shared memory segment ID starting at position POS for size SIZE by attaching to it, copying in/out, and detaching from it. When reading, VAR must be a variable that will hold the data read. When writing, if STRING is too long, only SIZE bytes are used; if STRING is too short, nulls are written to fill out SIZE bytes. Return true if successful, false on error. shmread() taints the variable. See also SysV IPC in perlipc, IPC::SysV , and the IPC::Shareable module from CPAN.

    Portability issues: shmread in perlport and shmwrite in perlport.

  • shutdown SOCKET,HOW

    Shuts down a socket connection in the manner indicated by HOW, which has the same interpretation as in the syscall of the same name.

    shutdown(SOCKET, 0);  # I/we have stopped reading data
    shutdown(SOCKET, 1);  # I/we have stopped writing data
    shutdown(SOCKET, 2);  # I/we have stopped using this socket

    This is useful with sockets when you want to tell the other side you're done writing but not done reading, or vice versa. It's also a more insistent form of close because it also disables the file descriptor in any forked copies in other processes.

    Returns 1 for success; on error, returns undef if the first argument is not a valid filehandle, or returns 0 and sets $! for any other failure.

  • sin EXPR
  • sin

    Returns the sine of EXPR (expressed in radians). If EXPR is omitted, returns sine of $_ .

    For the inverse sine operation, you may use the Math::Trig::asin function, or use this relation:

    sub asin { atan2($_[0], sqrt(1 - $_[0] * $_[0])) }
  • sleep EXPR
  • sleep

    Causes the script to sleep for (integer) EXPR seconds, or forever if no argument is given. Returns the integer number of seconds actually slept.

    May be interrupted if the process receives a signal such as SIGALRM .

    eval {
        local $SIG{ALRM} = sub { die "Alarm!\n" };
        sleep;
    };
    die $@ unless $@ eq "Alarm!\n";

    You probably cannot mix alarm and sleep calls, because sleep is often implemented using alarm.

    On some older systems, it may sleep up to a full second less than what you requested, depending on how it counts seconds. Most modern systems always sleep the full amount. They may appear to sleep longer than that, however, because your process might not be scheduled right away in a busy multitasking system.

    For delays of finer granularity than one second, the Time::HiRes module (from CPAN, and starting from Perl 5.8 part of the standard distribution) provides usleep(). You may also use Perl's four-argument version of select() leaving the first three arguments undefined, or you might be able to use the syscall interface to access setitimer(2) if your system supports it. See perlfaq8 for details.
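
    Two of the sub-second options mentioned above can be sketched as follows. The 100-millisecond delays are arbitrary; Time::HiRes is assumed to be installed, which it is in standard distributions from Perl 5.8 on.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep time);    # time() now returns fractional seconds

my $start = time();

usleep(100_000);                    # 100 ms, given in microseconds
select(undef, undef, undef, 0.1);   # another 100 ms via four-argument select

my $elapsed = time() - $start;
printf "slept for about %.2f seconds\n", $elapsed;
```

    The elapsed time is normally at least the requested total, but it can be longer on a busy system, so no exact figure is shown.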

    See also the POSIX module's pause function.

  • socket SOCKET,DOMAIN,TYPE,PROTOCOL

    Opens a socket of the specified kind and attaches it to filehandle SOCKET. DOMAIN, TYPE, and PROTOCOL are specified the same as for the syscall of the same name. You should use Socket first to get the proper definitions imported. See the examples in Sockets: Client/Server Communication in perlipc.

    On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptor, as determined by the value of $^F. See $^F in perlvar.

  • socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL

    Creates an unnamed pair of sockets in the specified domain, of the specified type. DOMAIN, TYPE, and PROTOCOL are specified the same as for the syscall of the same name. If unimplemented, raises an exception. Returns true if successful.

    On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptors, as determined by the value of $^F. See $^F in perlvar.

    Some systems defined pipe in terms of socketpair, in which a call to pipe(Rdr, Wtr) is essentially:

    use Socket;
    socketpair(Rdr, Wtr, AF_UNIX, SOCK_STREAM, PF_UNSPEC);
    shutdown(Rdr, 1);  # no more writing for reader
    shutdown(Wtr, 0);  # no more reading for writer

    See perlipc for an example of socketpair use. Perl 5.8 and later will emulate socketpair using IP sockets to localhost if your system implements sockets but not socketpair.
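
    A minimal single-process sketch of a bidirectional exchange over a socket pair, assuming a system with AF_UNIX support. Real uses typically fork and give one end to each process; that part is omitted here.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# Either end can both read and write.
syswrite($left, "ping") or die "syswrite failed: $!";
sysread($right, my $request, 4);

syswrite($right, "pong") or die "syswrite failed: $!";
sysread($left, my $reply, 4);

# Half-close: $left is done writing, so $right sees end of file.
shutdown($left, 1) or die "shutdown failed: $!";
my $at_eof = sysread($right, my $rest, 1);    # returns 0 at EOF

print "$request $reply $at_eof\n";            # prints: ping pong 0
```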

    Portability issues: socketpair in perlport.

  • sort SUBNAME LIST
  • sort BLOCK LIST
  • sort LIST

    In list context, this sorts the LIST and returns the sorted list value. In scalar context, the behaviour of sort() is undefined.

    If SUBNAME or BLOCK is omitted, sorts in standard string comparison order. If SUBNAME is specified, it gives the name of a subroutine that returns an integer less than, equal to, or greater than 0 , depending on how the elements of the list are to be ordered. (The <=> and cmp operators are extremely useful in such routines.) SUBNAME may be a scalar variable name (unsubscripted), in which case the value provides the name of (or a reference to) the actual subroutine to use. In place of a SUBNAME, you can provide a BLOCK as an anonymous, in-line sort subroutine.

    If the subroutine's prototype is ($$) , the elements to be compared are passed by reference in @_ , as for a normal subroutine. This is slower than unprototyped subroutines, where the elements to be compared are passed into the subroutine as the package global variables $a and $b (see example below). Note that in the latter case, it is usually highly counter-productive to declare $a and $b as lexicals.

    If the subroutine is an XSUB, the elements to be compared are pushed on to the stack, the way arguments are usually passed to XSUBs. $a and $b are not set.

    The values to be compared are always passed by reference and should not be modified.

    You also cannot exit out of the sort block or subroutine using any of the loop control operators described in perlsyn or with goto.

    When use locale (but not use locale 'not_characters' ) is in effect, sort LIST sorts LIST according to the current collation locale. See perllocale.

    sort() returns aliases into the original list, much as a for loop's index variable aliases the list elements. That is, modifying an element of a list returned by sort() (for example, in a foreach , map or grep) actually modifies the element in the original list. This is usually something to be avoided when writing clear code.

    Perl 5.6 and earlier used a quicksort algorithm to implement sort. That algorithm was not stable, and could go quadratic. (A stable sort preserves the input order of elements that compare equal. Although quicksort's run time is O(NlogN) when averaged over all arrays of length N, the time can be O(N**2), quadratic behavior, for some inputs.) In 5.7, the quicksort implementation was replaced with a stable mergesort algorithm whose worst-case behavior is O(NlogN). But benchmarks indicated that for some inputs, on some platforms, the original quicksort was faster. 5.8 has a sort pragma for limited control of the sort. Its rather blunt control of the underlying algorithm may not persist into future Perls, but the ability to characterize the input or output in implementation independent ways quite probably will. See the sort pragma.

    Examples:

    # sort lexically
    @articles = sort @files;

    # same thing, but with explicit sort routine
    @articles = sort {$a cmp $b} @files;

    # now case-insensitively
    @articles = sort {fc($a) cmp fc($b)} @files;

    # same thing in reversed order
    @articles = sort {$b cmp $a} @files;

    # sort numerically ascending
    @articles = sort {$a <=> $b} @files;

    # sort numerically descending
    @articles = sort {$b <=> $a} @files;

    # this sorts the %age hash by value instead of key
    # using an in-line function
    @eldest = sort { $age{$b} <=> $age{$a} } keys %age;

    # sort using explicit subroutine name
    sub byage {
        $age{$a} <=> $age{$b};  # presuming numeric
    }
    @sortedclass = sort byage @class;

    sub backwards { $b cmp $a }
    @harry  = qw(dog cat x Cain Abel);
    @george = qw(gone chased yz Punished Axed);
    print sort @harry;
        # prints AbelCaincatdogx
    print sort backwards @harry;
        # prints xdogcatCainAbel
    print sort @george, 'to', @harry;
        # prints AbelAxedCainPunishedcatchaseddoggonetoxyz

    # inefficiently sort by descending numeric compare using
    # the first integer after the first = sign, or the
    # whole record case-insensitively otherwise
    my @new = sort {
        ($b =~ /=(\d+)/)[0] <=> ($a =~ /=(\d+)/)[0]
                            ||
                    fc($a)  cmp  fc($b)
    } @old;

    # same thing, but much more efficiently;
    # we'll build auxiliary indices instead
    # for speed
    my @nums = @caps = ();
    for (@old) {
        push @nums, ( /=(\d+)/ ? $1 : undef );
        push @caps, fc($_);
    }
    my @new = @old[ sort {
                        $nums[$b] <=> $nums[$a]
                                  ||
                        $caps[$a] cmp $caps[$b]
                    } 0..$#old
                  ];

    # same thing, but without any temps
    @new = map { $_->[0] }
           sort { $b->[1] <=> $a->[1]
                           ||
                  $a->[2] cmp $b->[2]
           } map { [$_, /=(\d+)/, fc($_)] } @old;

    # using a prototype allows you to use any comparison subroutine
    # as a sort subroutine (including other package's subroutines)
    package other;
    sub backwards ($$) { $_[1] cmp $_[0]; }  # $a and $b are
                                             # not set here
    package main;
    @new = sort other::backwards @old;

    # guarantee stability, regardless of algorithm
    use sort 'stable';
    @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old;

    # force use of mergesort (not portable outside Perl 5.8)
    use sort '_mergesort';  # note discouraging _
    @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old;
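
    The map-sort-map idiom from the examples above can be run on concrete data as a sketch. Two details are assumptions made for this illustration and differ from the fragment above: lc is used in place of fc so that no feature needs enabling, and records without a number are given the key 0 explicitly.

```perl
use strict;
use warnings;

my @old = ('a=3', 'b=10', 'C=3', 'plain');

my @new =
    map  { $_->[0] }                       # 3. unwrap the original record
    sort {
        $b->[1] <=> $a->[1]                # 2. number, descending
                ||
        $a->[2] cmp $b->[2]                #    then case-insensitive text
    }
    map  { [ $_, (/=(\d+)/ ? $1 : 0), lc($_) ] }   # 1. precompute the keys
    @old;

print "@new\n";    # prints: b=10 a=3 C=3 plain
```

    Precomputing the keys means the regular expression runs once per record rather than once per comparison.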

    Warning: syntactical care is required when sorting the list returned from a function. If you want to sort the list returned by the function call find_records(@key) , you can use:

    @contact = sort { $a cmp $b } find_records @key;
    @contact = sort +find_records(@key);
    @contact = sort &find_records(@key);
    @contact = sort(find_records(@key));

    If instead you want to sort the array @key with the comparison routine find_records() then you can use:

    @contact = sort { find_records() } @key;
    @contact = sort find_records(@key);
    @contact = sort(find_records @key);
    @contact = sort(find_records (@key));

    If you're using strict, you must not declare $a and $b as lexicals. They are package globals. That means that if you're in the main package and type

    @articles = sort {$b <=> $a} @files;

    then $a and $b are $main::a and $main::b (or $::a and $::b ), but if you're in the FooPack package, it's the same as typing

    @articles = sort {$FooPack::b <=> $FooPack::a} @files;

    The comparison function is required to behave. If it returns inconsistent results (sometimes saying $x[1] is less than $x[2] and sometimes saying the opposite, for example) the results are not well-defined.

    Because <=> returns undef when either operand is NaN (not-a-number), be careful when sorting with a comparison function like $a <=> $b any lists that might contain a NaN . The following example takes advantage that NaN != NaN to eliminate any NaN s from the input list.

    @result = sort { $a <=> $b } grep { $_ == $_ } @input;
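
    As a runnable sketch of that filter, the code below manufactures a NaN by subtracting infinity from itself. This assumes IEEE 754 floating point, where 9**9**9 overflows to infinity; on other platforms the NaN may have to be obtained differently.

```perl
use strict;
use warnings;

my $inf = 9**9**9;        # overflows to infinity (IEEE 754 assumed)
my $nan = $inf - $inf;    # infinity minus infinity is NaN

my @input = (3, $nan, 1, 2);

# NaN is the only value not equal to itself, so the grep drops it.
my @result = sort { $a <=> $b } grep { $_ == $_ } @input;

print "@result\n";        # prints: 1 2 3
```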
  • splice ARRAY or EXPR,OFFSET,LENGTH,LIST
  • splice ARRAY or EXPR,OFFSET,LENGTH
  • splice ARRAY or EXPR,OFFSET
  • splice ARRAY or EXPR

    Removes the elements designated by OFFSET and LENGTH from an array, and replaces them with the elements of LIST, if any. In list context, returns the elements removed from the array. In scalar context, returns the last element removed, or undef if no elements are removed. The array grows or shrinks as necessary. If OFFSET is negative then it starts that far from the end of the array. If LENGTH is omitted, removes everything from OFFSET onward. If LENGTH is negative, removes the elements from OFFSET onward except for -LENGTH elements at the end of the array. If both OFFSET and LENGTH are omitted, removes everything. If OFFSET is past the end of the array, Perl issues a warning, and splices at the end of the array.

    The following equivalences hold (assuming $#a >= $i )

    push(@a,$x,$y)      splice(@a,@a,0,$x,$y)
    pop(@a)             splice(@a,-1)
    shift(@a)           splice(@a,0,1)
    unshift(@a,$x,$y)   splice(@a,0,0,$x,$y)
    $a[$i] = $y         splice(@a,$i,1,$y)
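
    The OFFSET and LENGTH rules can be traced on a small array; this is a sketch with arbitrary values, and the comments show the array after each call.

```perl
use strict;
use warnings;

my @a = (1, 2, 3, 4, 5);

my @removed = splice(@a, 1, 2);        # removes (2, 3); @a is (1, 4, 5)

splice(@a, 1, 0, 'x', 'y');            # inserts;        @a is (1, x, y, 4, 5)

my $last = splice(@a, -1);             # scalar context: 5; @a is (1, x, y, 4)

# Negative LENGTH: remove from offset 1 onward, keeping 1 element at the end.
my @middle = splice(@a, 1, -1);        # removes (x, y); @a is (1, 4)

print "@removed | $last | @middle | @a\n";
# prints: 2 3 | 5 | x y | 1 4
```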

    Example, assuming array lengths are passed before arrays:

    sub aeq {  # compare two list values
        my(@a) = splice(@_,0,shift);
        my(@b) = splice(@_,0,shift);
        return 0 unless @a == @b;  # same len?
        while (@a) {
            return 0 if pop(@a) ne pop(@b);
        }
        return 1;
    }
    if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }

    Starting with Perl 5.14, splice can take scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of splice is considered highly experimental. The exact behaviour may change in a future version of Perl.

    To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

    use 5.014;  # so push/pop/etc work on scalars (experimental)
  • split /PATTERN/,EXPR,LIMIT
  • split /PATTERN/,EXPR
  • split /PATTERN/
  • split

    Splits the string EXPR into a list of strings and returns the list in list context, or the size of the list in scalar context.

    If only PATTERN is given, EXPR defaults to $_ .

    Anything in EXPR that matches PATTERN is taken to be a separator that separates the EXPR into substrings (called "fields") that do not include the separator. Note that a separator may be longer than one character or even have no characters at all (the empty string, which is a zero-width match).

    The PATTERN need not be constant; an expression may be used to specify a pattern that varies at runtime.

    If PATTERN matches the empty string, the EXPR is split at the match position (between characters). As an example, the following:

    print join(':', split('b', 'abc')), "\n";

    uses the 'b' in 'abc' as a separator to produce the output 'a:c'. However, this:

    print join(':', split('', 'abc')), "\n";

    uses empty string matches as separators to produce the output 'a:b:c'; thus, the empty string may be used to split EXPR into a list of its component characters.

    As a special case for split, the empty pattern given in match operator syntax (// ) specifically matches the empty string, which is contrary to its usual interpretation as the last successful match.

    If PATTERN is /^/ , then it is treated as if it used the multiline modifier (/^/m ), since it isn't much use otherwise.

    As another special case, split emulates the default behavior of the command line tool awk when the PATTERN is either omitted or a literal string composed of a single space character (such as ' ' or "\x20", but not e.g. / /). In this case, any leading whitespace in EXPR is removed before splitting occurs, and the PATTERN is instead treated as if it were /\s+/; in particular, this means that any contiguous whitespace (not just a single space character) is used as a separator. However, this special treatment can be avoided by specifying the pattern / / instead of the string " ", thereby allowing only a single space character to be a separator. In earlier Perls this special case was restricted to the use of a plain " " as the pattern argument to split; in Perl 5.18.0 and later it is triggered by any expression that evaluates to the simple string " ".

    If omitted, PATTERN defaults to a single space, " " , triggering the previously described awk emulation.
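
    As a rough illustration of the difference between the awk-style string " " and the regex / /, the following sketch splits the same string both ways. The sample string and variable names are made up for the example.

    ```perl
    use strict;
    use warnings;

    my $line = "  alpha   beta gamma ";

    # A single-space string triggers awk emulation: leading whitespace is
    # dropped and each run of whitespace acts as one separator.
    my @awk = split ' ', $line;      # ('alpha', 'beta', 'gamma')

    # The regex / / splits at every individual space, so empty fields
    # appear (trailing empty fields are still stripped).
    my @single = split / /, $line;   # ('', '', 'alpha', '', '', 'beta', 'gamma')

    print scalar(@awk), " vs ", scalar(@single), " fields\n";   # 3 vs 7 fields
    ```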

    If LIMIT is specified and positive, it represents the maximum number of fields into which the EXPR may be split; in other words, LIMIT is one greater than the maximum number of times EXPR may be split. Thus, the LIMIT value 1 means that EXPR may be split a maximum of zero times, producing a maximum of one field (namely, the entire value of EXPR). For instance:

    1. print join(':', split(//, 'abc', 1)), "\n";

    produces the output 'abc', and this:

    1. print join(':', split(//, 'abc', 2)), "\n";

    produces the output 'a:bc', and each of these:

    1. print join(':', split(//, 'abc', 3)), "\n";
    2. print join(':', split(//, 'abc', 4)), "\n";

    produces the output 'a:b:c'.

    If LIMIT is negative, it is treated as if it were instead arbitrarily large; as many fields as possible are produced.

    If LIMIT is omitted (or, equivalently, zero), then it is usually treated as if it were instead negative but with the exception that trailing empty fields are stripped (empty leading fields are always preserved); if all fields are empty, then all fields are considered to be trailing (and are thus stripped in this case). Thus, the following:

    1. print join(':', split(',', 'a,b,c,,,')), "\n";

    produces the output 'a:b:c', but the following:

    1. print join(':', split(',', 'a,b,c,,,', -1)), "\n";

    produces the output 'a:b:c:::'.

    In time-critical applications, it is worthwhile to avoid splitting into more fields than necessary. Thus, when assigning to a list, if LIMIT is omitted (or zero), then LIMIT is treated as though it were one larger than the number of variables in the list; for the following, LIMIT is implicitly 3:

    1. ($login, $passwd) = split(/:/);
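
    The effect of the implicit LIMIT is not visible in the assigned values, but an explicit LIMIT is. A small sketch, using an invented passwd-style record:

    ```perl
    use strict;
    use warnings;

    my $record = "root:x:0:0:root:/root:/bin/sh";

    # Two variables on the left, so split stops after three fields; the
    # unsplit remainder is discarded by the list assignment.
    my ($login, $passwd) = split /:/, $record;

    # An explicit LIMIT of 2 keeps the remainder in the last field instead.
    my ($name, $rest) = split /:/, $record, 2;

    print "$login $passwd\n";   # root x
    print "$rest\n";            # x:0:0:root:/root:/bin/sh
    ```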

    Note that splitting an EXPR that evaluates to the empty string always produces zero fields, regardless of the LIMIT specified.

    An empty leading field is produced when there is a positive-width match at the beginning of EXPR. For instance:

    1. print join(':', split(/ /, ' abc')), "\n";

    produces the output ':abc'. However, a zero-width match at the beginning of EXPR never produces an empty field, so that:

    1. print join(':', split(//, ' abc'));

    produces the output ' :a:b:c' (rather than ': :a:b:c').

    An empty trailing field, on the other hand, is produced when there is a match at the end of EXPR, regardless of the length of the match (of course, unless a non-zero LIMIT is given explicitly, such fields are removed, as in the last example). Thus:

    1. print join(':', split(//, ' abc', -1)), "\n";

    produces the output ' :a:b:c:'.

    If the PATTERN contains capturing groups, then for each separator, an additional field is produced for each substring captured by a group (in the order in which the groups are specified, as per backreferences); if any group does not match, then it captures the undef value instead of a substring. Also, note that any such additional field is produced whenever there is a separator (that is, whenever a split occurs), and such an additional field does not count towards the LIMIT. Consider the following expressions evaluated in list context (each returned list is provided in the associated comment):

    1. split(/-|,/, "1-10,20", 3)
    2. # ('1', '10', '20')
    3. split(/(-|,)/, "1-10,20", 3)
    4. # ('1', '-', '10', ',', '20')
    5. split(/-|(,)/, "1-10,20", 3)
    6. # ('1', undef, '10', ',', '20')
    7. split(/(-)|,/, "1-10,20", 3)
    8. # ('1', '-', '10', undef, '20')
    9. split(/(-)|(,)/, "1-10,20", 3)
    10. # ('1', '-', undef, '10', undef, ',', '20')
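
    One common use of a capturing group is to keep the separators when tokenizing. The sketch below assumes a toy arithmetic string; it is not a full tokenizer.

    ```perl
    use strict;
    use warnings;

    # Capturing the operator keeps it in the output as its own field.
    my @tokens = split /([+-])/, "12+7-3";

    print join(' ', @tokens), "\n";   # 12 + 7 - 3
    ```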
  • sprintf FORMAT, LIST

    Returns a string formatted by the usual printf conventions of the C library function sprintf. See below for more details and see sprintf(3) or printf(3) on your system for an explanation of the general principles.

    For example:

    1. # Format number with up to 8 leading zeroes
    2. $result = sprintf("%08d", $number);
    3. # Round number to 3 digits after decimal point
    4. $rounded = sprintf("%.3f", $number);

    Perl does its own sprintf formatting: it emulates the C function sprintf(3), but doesn't use it except for floating-point numbers, and even then only standard modifiers are allowed. Non-standard extensions in your local sprintf(3) are therefore unavailable from Perl.

    Unlike printf, sprintf does not do what you probably mean when you pass it an array as your first argument. The array is given scalar context, and instead of using the 0th element of the array as the format, Perl will use the count of elements in the array as the format, which is almost never useful.
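
    A minimal sketch of the pitfall and one way around it; the array contents are invented for the example:

    ```perl
    use strict;
    use warnings;

    my @spec = ("%s-%s", "x", "y");

    # The array is evaluated in scalar context, so the format is "3".
    my $wrong = sprintf(@spec);

    # Peel the format off first, then pass the rest as the argument list.
    my ($format, @args) = @spec;
    my $right = sprintf($format, @args);

    print "$wrong $right\n";   # 3 x-y
    ```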

    Perl's sprintf permits the following universally-known conversions:

    1. %% a percent sign
    2. %c a character with the given number
    3. %s a string
    4. %d a signed integer, in decimal
    5. %u an unsigned integer, in decimal
    6. %o an unsigned integer, in octal
    7. %x an unsigned integer, in hexadecimal
    8. %e a floating-point number, in scientific notation
    9. %f a floating-point number, in fixed decimal notation
    10. %g a floating-point number, in %e or %f notation

    In addition, Perl permits the following widely-supported conversions:

    %X    like %x, but using upper-case letters
    %E    like %e, but using an upper-case "E"
    %G    like %g, but with an upper-case "E" (if applicable)
    %b    an unsigned integer, in binary
    %B    like %b, but using an upper-case "B" with the # flag
    %p    a pointer (outputs the Perl value's address in hexadecimal)
    %n    special: *stores* the number of characters output so far
          into the next argument in the parameter list

    Finally, for backward (and we do mean "backward") compatibility, Perl permits these unnecessary but widely-supported conversions:

    1. %i a synonym for %d
    2. %D a synonym for %ld
    3. %U a synonym for %lu
    4. %O a synonym for %lo
    5. %F a synonym for %f

    Note that the number of exponent digits in the scientific notation produced by %e, %E, %g and %G for numbers with the modulus of the exponent less than 100 is system-dependent: it may be three or fewer (zero-padded as necessary). In other words, 1.23 times ten to the 99th may be formatted by %g as either "1.23e+99" or "1.23e+099".

    Between the % and the format letter, you may specify several additional attributes controlling the interpretation of the format. In order, these are:

    • format parameter index

      An explicit format parameter index, such as 2$. By default sprintf will format the next unused argument in the list, but this allows you to take the arguments out of order:

      1. printf '%2$d %1$d', 12, 34; # prints "34 12"
      2. printf '%3$d %d %1$d', 1, 2, 3; # prints "3 1 1"
    • flags

      one or more of:

      space   prefix non-negative number with a space
      +       prefix non-negative number with a plus sign
      -       left-justify within the field
      0       use zeros, not spaces, to right-justify
      #       ensure the leading "0" for any octal,
              prefix non-zero hexadecimal with "0x" or "0X",
              prefix non-zero binary with "0b" or "0B"

      For example:

      printf '<% d>',  12;   # prints "< 12>"
      printf '<%+d>',  12;   # prints "<+12>"
      printf '<%6s>',  12;   # prints "<    12>"
      printf '<%-6s>', 12;   # prints "<12    >"
      printf '<%06s>', 12;   # prints "<000012>"
      printf '<%#o>',  12;   # prints "<014>"
      printf '<%#x>',  12;   # prints "<0xc>"
      printf '<%#X>',  12;   # prints "<0XC>"
      printf '<%#b>',  12;   # prints "<0b1100>"
      printf '<%#B>',  12;   # prints "<0B1100>"

      When a space and a plus sign are given as the flags at once, a plus sign is used to prefix a positive number.

      1. printf '<%+ d>', 12; # prints "<+12>"
      2. printf '<% +d>', 12; # prints "<+12>"

      When the # flag and a precision are given in the %o conversion, the precision is incremented if it's necessary for the leading "0".

      1. printf '<%#.5o>', 012; # prints "<00012>"
      2. printf '<%#.5o>', 012345; # prints "<012345>"
      3. printf '<%#.0o>', 0; # prints "<0>"
    • vector flag

      This flag tells Perl to interpret the supplied string as a vector of integers, one for each character in the string. Perl applies the format to each integer in turn, then joins the resulting strings with a separator (a dot . by default). This can be useful for displaying ordinal values of characters in arbitrary strings:

      1. printf "%vd", "AB\x{100}"; # prints "65.66.256"
      2. printf "version is v%vd\n", $^V; # Perl's version

      Put an asterisk * before the v to override the string to use to separate the numbers:

      1. printf "address is %*vX\n", ":", $addr; # IPv6 address
      2. printf "bits are %0*v8b\n", " ", $bits; # random bitstring

      You can also explicitly specify the argument number to use for the join string using something like *2$v; for example:

      1. printf '%*4$vX %*4$vX %*4$vX', # 3 IPv6 addresses
      2. @addr[1..3], ":";
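
      A short sketch of the vector flag on strings whose character values are known in advance; the packed address is an arbitrary example value:

      ```perl
      use strict;
      use warnings;

      # Each character's ordinal is formatted and joined with ".".
      my $dotted = sprintf("%vd", "AB");                 # 65.66

      # "*" takes the join string from the argument list.
      my $hex = sprintf("%*vX", ":", "AB");              # 41:42

      # Four packed bytes print as a dotted-quad address.
      my $ip = sprintf("%vd", pack("C4", 192, 168, 0, 1));

      print "$dotted $hex $ip\n";   # 65.66 41:42 192.168.0.1
      ```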
    • (minimum) width

      Arguments are usually formatted to be only as wide as required to display the given value. You can override the width by putting a number here, or get the width from the next argument (with * ) or from a specified argument (e.g., with *2$):

      printf "<%s>", "a";        # prints "<a>"
      printf "<%6s>", "a";       # prints "<     a>"
      printf "<%*s>", 6, "a";    # prints "<     a>"
      printf '<%*2$s>', "a", 6;  # prints "<     a>"
      printf "<%2s>", "long";    # prints "<long>" (does not truncate)

      If a field width obtained through * is negative, it has the same effect as the - flag: left-justification.

    • precision, or maximum width

      You can specify a precision (for numeric conversions) or a maximum width (for string conversions) by specifying a . followed by a number. For floating-point formats except g and G , this specifies how many places right of the decimal point to show (the default being 6). For example:

      1. # these examples are subject to system-specific variation
      2. printf '<%f>', 1; # prints "<1.000000>"
      3. printf '<%.1f>', 1; # prints "<1.0>"
      4. printf '<%.0f>', 1; # prints "<1>"
      5. printf '<%e>', 10; # prints "<1.000000e+01>"
      6. printf '<%.1e>', 10; # prints "<1.0e+01>"

      For "g" and "G", this specifies the maximum number of digits to show, including those prior to the decimal point and those after it; for example:

      1. # These examples are subject to system-specific variation.
      2. printf '<%g>', 1; # prints "<1>"
      3. printf '<%.10g>', 1; # prints "<1>"
      4. printf '<%g>', 100; # prints "<100>"
      5. printf '<%.1g>', 100; # prints "<1e+02>"
      6. printf '<%.2g>', 100.01; # prints "<1e+02>"
      7. printf '<%.5g>', 100.01; # prints "<100.01>"
      8. printf '<%.4g>', 100.01; # prints "<100>"

      For integer conversions, specifying a precision implies that the output of the number itself should be zero-padded to this width, where the 0 flag is ignored:

      printf '<%.6d>', 1;      # prints "<000001>"
      printf '<%+.6d>', 1;     # prints "<+000001>"
      printf '<%-10.6d>', 1;   # prints "<000001    >"
      printf '<%10.6d>', 1;    # prints "<    000001>"
      printf '<%010.6d>', 1;   # prints "<    000001>"
      printf '<%+10.6d>', 1;   # prints "<   +000001>"
      printf '<%.6x>', 1;      # prints "<000001>"
      printf '<%#.6x>', 1;     # prints "<0x000001>"
      printf '<%-10.6x>', 1;   # prints "<000001    >"
      printf '<%10.6x>', 1;    # prints "<    000001>"
      printf '<%010.6x>', 1;   # prints "<    000001>"
      printf '<%#10.6x>', 1;   # prints "<  0x000001>"

      For string conversions, specifying a precision truncates the string to fit the specified width:

      printf '<%.5s>', "truncated";     # prints "<trunc>"
      printf '<%10.5s>', "truncated";   # prints "<     trunc>"
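
      Combining a minimum width with a precision gives fixed-width columns. The following is only a sketch with made-up data and an arbitrary "|" column separator:

      ```perl
      use strict;
      use warnings;

      # %-8.8s: left-justify, pad to 8 and truncate to 8 characters.
      # %6.2f:  right-justify in 6 columns with 2 decimal places.
      my $row = sprintf("%-8.8s|%6.2f|", "truncated-name", 3.14159);

      print "$row\n";   # truncate|  3.14|
      ```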

      You can also get the precision from the next argument using .*:

      1. printf '<%.6x>', 1; # prints "<000001>"
      2. printf '<%.*x>', 6, 1; # prints "<000001>"

      If a precision obtained through * is negative, it counts as having no precision at all.

      1. printf '<%.*s>', 7, "string"; # prints "<string>"
      2. printf '<%.*s>', 3, "string"; # prints "<str>"
      3. printf '<%.*s>', 0, "string"; # prints "<>"
      4. printf '<%.*s>', -1, "string"; # prints "<string>"
      5. printf '<%.*d>', 1, 0; # prints "<0>"
      6. printf '<%.*d>', 0, 0; # prints "<>"
      7. printf '<%.*d>', -1, 0; # prints "<0>"

      You cannot currently get the precision from a specified number, but it is intended that this will be possible in the future, for example using .*2$:

      1. printf '<%.*2$x>', 1, 6; # INVALID, but in future will print
      2. # "<000001>"
    • size

      For numeric conversions, you can specify the size to interpret the number as using l , h , V , q, L , or ll . For integer conversions (d u o x X b i D U O ), numbers are usually assumed to be whatever the default integer size is on your platform (usually 32 or 64 bits), but you can override this to use instead one of the standard C types, as supported by the compiler used to build Perl:

      hh           interpret integer as C type "char" or "unsigned
                   char" on Perl 5.14 or later
      h            interpret integer as C type "short" or
                   "unsigned short"
      j            interpret integer as C type "intmax_t" on Perl
                   5.14 or later, and only with a C99 compiler
                   (unportable)
      l            interpret integer as C type "long" or
                   "unsigned long"
      q, L, or ll  interpret integer as C type "long long",
                   "unsigned long long", or "quad" (typically
                   64-bit integers)
      t            interpret integer as C type "ptrdiff_t" on Perl
                   5.14 or later
      z            interpret integer as C type "size_t" on Perl 5.14
                   or later

      As of 5.14, none of these raises an exception if they are not supported on your platform. However, if warnings are enabled, a warning of the printf warning class is issued on an unsupported conversion flag. Should you instead prefer an exception, do this:

      1. use warnings FATAL => "printf";

      If you would like to know about a version dependency before you start running the program, put something like this at its top:

      1. use 5.014; # for hh/j/t/z/ printf modifiers

      You can find out whether your Perl supports quads via Config:

      1. use Config;
      2. if ($Config{use64bitint} eq "define"
      3. || $Config{longsize} >= 8) {
      4. print "Nice quads!\n";
      5. }

      For floating-point conversions (e f g E F G ), numbers are usually assumed to be the default floating-point size on your platform (double or long double), but you can force "long double" with q, L , or ll if your platform supports them. You can find out whether your Perl supports long doubles via Config:

      1. use Config;
      2. print "long doubles\n" if $Config{d_longdbl} eq "define";

      You can find out whether Perl considers "long double" to be the default floating-point size to use on your platform via Config:

      1. use Config;
      2. if ($Config{uselongdouble} eq "define") {
      3. print "long doubles by default\n";
      4. }

      It can also be that long doubles and doubles are the same thing:

      1. use Config;
      2. ($Config{doublesize} == $Config{longdblsize}) &&
      3. print "doubles are long doubles\n";

      The size specifier V has no effect for Perl code, but is supported for compatibility with XS code. It means "use the standard size for a Perl integer or floating-point number", which is the default.

    • order of arguments

      Normally, sprintf() takes the next unused argument as the value to format for each format specification. If the format specification uses * to require additional arguments, these are consumed from the argument list in the order they appear in the format specification before the value to format. Where an argument is specified by an explicit index, this does not affect the normal order for the arguments, even when the explicitly specified index would have been the next argument.

      So:

      1. printf "<%*.*s>", $a, $b, $c;

      uses $a for the width, $b for the precision, and $c as the value to format; while:

      1. printf '<%*1$.*s>', $a, $b;

      would use $a for the width and precision, and $b as the value to format.

      Here are some more examples; be aware that when using an explicit index, the $ may need escaping:

      1. printf "%2\$d %d\n", 12, 34; # will print "34 12\n"
      2. printf "%2\$d %d %d\n", 12, 34; # will print "34 12 34\n"
      3. printf "%3\$d %d %d\n", 12, 34, 56; # will print "56 12 34\n"
      4. printf "%2\$*3\$d %d\n", 12, 34, 3; # will print " 34 12\n"
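
      Explicit indexes are handy when the same arguments must appear in a different order depending on the format, for instance in translated messages. In this sketch the %template hash and its keys are hypothetical; single quotes avoid having to escape the $:

      ```perl
      use strict;
      use warnings;

      # Hypothetical per-language layouts for the same two arguments.
      my %template = (
          given_first  => '%1$s %2$s',
          family_first => '%2$s %1$s',
      );

      my @name = ("Ada", "Lovelace");

      my $western = sprintf($template{given_first},  @name);
      my $eastern = sprintf($template{family_first}, @name);

      print "$western / $eastern\n";   # Ada Lovelace / Lovelace Ada
      ```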

    If use locale (including use locale 'not_characters' ) is in effect and POSIX::setlocale() has been called, the character used for the decimal separator in formatted floating-point numbers is affected by the LC_NUMERIC locale. See perllocale and POSIX.

  • sqrt EXPR
  • sqrt

    Return the positive square root of EXPR. If EXPR is omitted, uses $_ . Works only for non-negative operands unless you've loaded the Math::Complex module.

    1. use Math::Complex;
    2. print sqrt(-4); # prints 2i
  • srand EXPR
  • srand

    Sets and returns the random number seed for the rand operator.

    The point of the function is to "seed" the rand function so that rand can produce a different sequence each time you run your program. When called with a parameter, srand uses that for the seed; otherwise it (semi-)randomly chooses a seed. In either case, starting with Perl 5.14, it returns the seed. To signal that your code will work only on Perls of a recent vintage:

    1. use 5.014; # so srand returns the seed

    If srand() is not called explicitly, it is called implicitly without a parameter at the first use of the rand operator. However, there are a few situations where programs are likely to want to call srand. One is for generating predictable results, generally for testing or debugging. There, you use srand($seed), with the same $seed each time. Another case is that you may want to call srand() after a fork() to avoid child processes sharing the same seed value as the parent (and consequently each other).

    Do not call srand() (i.e., without an argument) more than once per process. The internal state of the random number generator should contain more entropy than can be provided by any seed, so calling srand() again actually loses randomness.

    Most implementations of srand take an integer and will silently truncate decimal numbers. This means srand(42) will usually produce the same results as srand(42.1). To be safe, always pass srand an integer.

    A typical use of the returned seed is for a test program which has too many combinations to test comprehensively in the time available to it each run. It can test a random subset each time, and should there be a failure, log the seed used for that run so that it can later be used to reproduce the same results.
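
    The seed-logging idea might be sketched as follows, assuming Perl 5.14 or later so that srand returns the seed. TEST_SEED is a hypothetical environment variable, and the second srand call exists only to demonstrate the replay within one process:

    ```perl
    use strict;
    use warnings;

    # Use a seed from the environment if given, else let srand pick one.
    my $seed = defined $ENV{TEST_SEED} ? srand($ENV{TEST_SEED}) : srand();
    print "seed for this run: $seed\n";

    my @first = map { int rand 1000 } 1 .. 5;

    # Reseeding with the logged value replays the same sequence.
    srand($seed);
    my @replay = map { int rand 1000 } 1 .. 5;

    print "replayed identically\n" if "@first" eq "@replay";
    ```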

    rand() is not cryptographically secure. You should not rely on it in security-sensitive situations. As of this writing, a number of third-party CPAN modules offer random number generators intended by their authors to be cryptographically secure, including: Data::Entropy, Crypt::Random, Math::Random::Secure, and Math::TrulyRandom.

  • stat FILEHANDLE
  • stat EXPR
  • stat DIRHANDLE
  • stat

    Returns a 13-element list giving the status info for a file, either the file opened via FILEHANDLE or DIRHANDLE, or named by EXPR. If EXPR is omitted, it stats $_ (not _ !). Returns the empty list if stat fails. Typically used as follows:

    1. ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
    2. $atime,$mtime,$ctime,$blksize,$blocks)
    3. = stat($filename);

    Not all fields are supported on all filesystem types. Here are the meanings of the fields:

     0 dev      device number of filesystem
     1 ino      inode number
     2 mode     file mode (type and permissions)
     3 nlink    number of (hard) links to the file
     4 uid      numeric user ID of file's owner
     5 gid      numeric group ID of file's owner
     6 rdev     the device identifier (special files only)
     7 size     total size of file, in bytes
     8 atime    last access time in seconds since the epoch
     9 mtime    last modify time in seconds since the epoch
    10 ctime    inode change time in seconds since the epoch (*)
    11 blksize  preferred I/O size in bytes for interacting with the
                file (may vary from file to file)
    12 blocks   actual number of system-specific blocks allocated
                on disk (often, but not always, 512 bytes each)

    (The epoch was at 00:00 January 1, 1970 GMT.)

    (*) Not all fields are supported on all filesystem types. Notably, the ctime field is non-portable. In particular, you cannot expect it to be a "creation time"; see Files and Filesystems in perlport for details.

    If stat is passed the special filehandle consisting of an underline, no stat is done, but the current contents of the stat structure from the last stat, lstat, or filetest are returned. Example:

    1. if (-x $file && (($d) = stat(_)) && $d < 0) {
    2. print "$file is executable NFS file\n";
    3. }

    (This works only on machines for which the device number is negative under NFS.)

    Because the mode contains both the file type and its permissions, you should mask off the file type portion and (s)printf using a "%o" if you want to see the real permissions.

    1. $mode = (stat($filename))[2];
    2. printf "Permissions are %04o\n", $mode & 07777;

    In scalar context, stat returns a boolean value indicating success or failure, and, if successful, sets the information associated with the special filehandle _ .
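
    The list and scalar forms can be compared on a scratch file. This sketch uses File::Temp purely to have a file to inspect:

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} "hello";
    close $fh or die "close failed: $!";

    # List context: pick fields by index (2 = mode, 7 = size).
    my ($mode, $size) = (stat $path)[2, 7];
    printf "%d bytes, permissions %04o\n", $size, $mode & 07777;

    # Scalar context: stat only reports success, but it fills the "_"
    # cache, so the filetest below does not stat the file again.
    my $is_regular = 0;
    if (stat $path) {
        $is_regular = 1 if -f _;
    }
    print "regular file\n" if $is_regular;
    ```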

    The File::stat module provides a convenient, by-name access mechanism:

    1. use File::stat;
    2. $sb = stat($filename);
    3. printf "File is %s, size is %s, perm %04o, mtime %s\n",
    4. $filename, $sb->size, $sb->mode & 07777,
    5. scalar localtime $sb->mtime;

    You can import symbolic mode constants (S_IF* ) and functions (S_IS* ) from the Fcntl module:

    1. use Fcntl ':mode';
    2. $mode = (stat($filename))[2];
    3. $user_rwx = ($mode & S_IRWXU) >> 6;
    4. $group_read = ($mode & S_IRGRP) >> 3;
    5. $other_execute = $mode & S_IXOTH;
    6. printf "Permissions are %04o\n", S_IMODE($mode);
    7. $is_setuid = $mode & S_ISUID;
    8. $is_directory = S_ISDIR($mode);

    You could write the last two using the -u and -d operators. Commonly available S_IF* constants are:

    # Permissions: read, write, execute, for user, group, others.
    S_IRWXU S_IRUSR S_IWUSR S_IXUSR
    S_IRWXG S_IRGRP S_IWGRP S_IXGRP
    S_IRWXO S_IROTH S_IWOTH S_IXOTH

    # Setuid/Setgid/Stickiness/SaveText.
    # Note that the exact meaning of these is system-dependent.
    S_ISUID S_ISGID S_ISVTX S_ISTXT

    # File types.  Not all are necessarily available on
    # your system.
    S_IFREG S_IFDIR S_IFLNK S_IFBLK S_IFCHR
    S_IFIFO S_IFSOCK S_IFWHT S_ENFMT

    # The following are compatibility aliases for S_IRUSR,
    # S_IWUSR, and S_IXUSR.
    S_IREAD S_IWRITE S_IEXEC

    and the S_IF* functions are

    S_IMODE($mode)    the part of $mode containing the permission
                      bits and the setuid/setgid/sticky bits
    S_IFMT($mode)     the part of $mode containing the file type
                      which can be bit-anded with (for example)
                      S_IFREG or with the following functions

    # The operators -f, -d, -l, -b, -c, -p, and -S.
    S_ISREG($mode) S_ISDIR($mode) S_ISLNK($mode)
    S_ISBLK($mode) S_ISCHR($mode) S_ISFIFO($mode) S_ISSOCK($mode)

    # No direct -X operator counterpart, but for the first one
    # the -g operator is often equivalent.  The ENFMT stands for
    # record flocking enforcement, a platform-dependent feature.
    S_ISENFMT($mode) S_ISWHT($mode)

    See your native chmod(2) and stat(2) documentation for more details about the S_* constants. To get status info for a symbolic link instead of the target file behind the link, use the lstat function.

    Portability issues: stat in perlport.

  • state EXPR
  • state TYPE EXPR
  • state EXPR : ATTRS
  • state TYPE EXPR : ATTRS

    state declares a lexically scoped variable, just like my. However, those variables will never be reinitialized, contrary to lexical variables that are reinitialized each time their enclosing block is entered. See Persistent Private Variables in perlsub for details.

    state variables are enabled only when the use feature "state" pragma is in effect, unless the keyword is written as CORE::state . See also feature.
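
    A minimal sketch of a state variable acting as a private counter; next_id is a name invented for the example:

    ```perl
    use strict;
    use warnings;
    use feature 'state';

    sub next_id {
        state $n = 0;   # initialized once, on the first call only
        return ++$n;
    }

    my @ids = map { next_id() } 1 .. 3;
    print "@ids\n";   # 1 2 3
    ```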

  • study SCALAR
  • study

    Takes extra time to study SCALAR ($_ if unspecified) in anticipation of doing many pattern matches on the string before it is next modified. This may or may not save time, depending on the nature and number of patterns you are searching and the distribution of character frequencies in the string to be searched; you probably want to compare run times with and without it to see which is faster. Those loops that scan for many short constant strings (including the constant parts of more complex patterns) will benefit most. (The way study works is this: a linked list of every character in the string to be searched is made, so we know, for example, where all the 'k' characters are. From each search string, the rarest character is selected, based on some static frequency tables constructed from some C programs and English text. Only those places that contain this "rarest" character are examined.)

    For example, here is a loop that inserts index-producing entries before any line containing a certain pattern:

    1. while (<>) {
    2. study;
    3. print ".IX foo\n" if /\bfoo\b/;
    4. print ".IX bar\n" if /\bbar\b/;
    5. print ".IX blurfl\n" if /\bblurfl\b/;
    6. # ...
    7. print;
    8. }

    In searching for /\bfoo\b/ , only locations in $_ that contain f will be looked at, because f is rarer than o . In general, this is a big win except in pathological cases. The only question is whether it saves you more time than it took to build the linked list in the first place.

    Note that if you have to look for strings that you don't know till runtime, you can build an entire loop as a string and eval that to avoid recompiling all your patterns all the time. Together with undefining $/ to input entire files as one record, this can be quite fast, often faster than specialized programs like fgrep(1). The following scans a list of files (@files ) for a list of words (@words ), and prints out the names of those files that contain a match:

    1. $search = 'while (<>) { study;';
    2. foreach $word (@words) {
    3. $search .= "++\$seen{\$ARGV} if /\\b$word\\b/;\n";
    4. }
    5. $search .= "}";
    6. @ARGV = @files;
    7. undef $/;
    8. eval $search; # this screams
    9. $/ = "\n"; # put back to normal input delimiter
    10. foreach $file (sort keys(%seen)) {
    11. print $file, "\n";
    12. }
  • sub NAME BLOCK
  • sub NAME (PROTO) BLOCK
  • sub NAME : ATTRS BLOCK
  • sub NAME (PROTO) : ATTRS BLOCK

    This is a subroutine definition, not a real function per se. Without a BLOCK it's just a forward declaration. Without a NAME, it's an anonymous function declaration, so it does return a value: the CODE ref of the closure just created.

    See perlsub and perlref for details about subroutines and references; see attributes and Attribute::Handlers for more information about attributes.

  • __SUB__

    A special token that returns a reference to the current subroutine, or undef outside of a subroutine.

    The behaviour of __SUB__ within a regex code block (such as /(?{...})/ ) is subject to change.

    This token is only available under use v5.16 or the "current_sub" feature. See feature.
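
    One use of __SUB__ is recursion inside an anonymous subroutine without a named variable that refers back to itself. A small sketch, assuming Perl 5.16 or later:

    ```perl
    use strict;
    use warnings;
    use feature 'current_sub';

    my $factorial = sub {
        my ($n) = @_;
        # __SUB__ is a reference to the sub that is currently running.
        return $n <= 1 ? 1 : $n * __SUB__->($n - 1);
    };

    print $factorial->(5), "\n";   # 120
    ```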

  • substr EXPR,OFFSET,LENGTH,REPLACEMENT
  • substr EXPR,OFFSET,LENGTH
  • substr EXPR,OFFSET

    Extracts a substring out of EXPR and returns it. First character is at offset zero. If OFFSET is negative, starts that far back from the end of the string. If LENGTH is omitted, returns everything through the end of the string. If LENGTH is negative, leaves that many characters off the end of the string.

    1. my $s = "The black cat climbed the green tree";
    2. my $color = substr $s, 4, 5; # black
    3. my $middle = substr $s, 4, -11; # black cat climbed the
    4. my $end = substr $s, 14; # climbed the green tree
    5. my $tail = substr $s, -4; # tree
    6. my $z = substr $s, -4, 2; # tr

    You can use the substr() function as an lvalue, in which case EXPR must itself be an lvalue. If you assign something shorter than LENGTH, the string will shrink, and if you assign something longer than LENGTH, the string will grow to accommodate it. To keep the string the same length, you may need to pad or chop your value using sprintf.
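
    For instance, padding the replacement with sprintf keeps the overall length unchanged. The strings here are arbitrary sample data:

    ```perl
    use strict;
    use warnings;

    my $s = "Hello, world";

    # Pad or truncate the replacement to exactly 5 characters so that
    # the overall length of $s does not change.
    substr($s, 7, 5) = sprintf("%-5.5s", "Bob");

    print "[$s]\n";           # [Hello, Bob  ]
    print length($s), "\n";   # 12
    ```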

    If OFFSET and LENGTH specify a substring that is partly outside the string, only the part within the string is returned. If the substring is beyond either end of the string, substr() returns the undefined value and produces a warning. When used as an lvalue, specifying a substring that is entirely outside the string raises an exception. Here's an example showing the behavior for boundary cases:

    1. my $name = 'fred';
    2. substr($name, 4) = 'dy'; # $name is now 'freddy'
    3. my $null = substr $name, 6, 2; # returns "" (no warning)
    4. my $oops = substr $name, 7; # returns undef, with warning
    5. substr($name, 7) = 'gap'; # raises an exception

    An alternative to using substr() as an lvalue is to specify the replacement string as the 4th argument. This allows you to replace parts of the EXPR and return what was there before in one operation, just as you can with splice().

    1. my $s = "The black cat climbed the green tree";
    2. my $z = substr $s, 14, 7, "jumped from"; # climbed
    3. # $s is now "The black cat jumped from the green tree"

    Note that the lvalue returned by the three-argument version of substr() acts as a 'magic bullet'; each time it is assigned to, it remembers which part of the original string is being modified; for example:

    1. $x = '1234';
    2. for (substr($x,1,2)) {
    3. $_ = 'a'; print $x,"\n"; # prints 1a4
    4. $_ = 'xyz'; print $x,"\n"; # prints 1xyz4
    5. $x = '56789';
    6. $_ = 'pq'; print $x,"\n"; # prints 5pq9
    7. }

    With negative offsets, it remembers its position from the end of the string when the target string is modified:

    1. $x = '1234';
    2. for (substr($x, -3, 2)) {
    3. $_ = 'a'; print $x,"\n"; # prints 1a4, as above
    4. $x = 'abcdefg';
    5. print $_,"\n"; # prints f
    6. }

    Prior to Perl version 5.10, the result of using an lvalue multiple times was unspecified. Prior to 5.16, the result with negative offsets was unspecified.

  • symlink OLDFILE,NEWFILE

    Creates a new filename symbolically linked to the old filename. Returns 1 for success, 0 otherwise. On systems that don't support symbolic links, raises an exception. To check for that, use eval:

        $symlink_exists = eval { symlink("",""); 1 };

    Portability issues: symlink in perlport.

  • syscall NUMBER, LIST

    Calls the system call specified as the first element of the list, passing the remaining elements as arguments to the system call. If unimplemented, raises an exception. The arguments are interpreted as follows: if a given argument is numeric, the argument is passed as an int. If not, the pointer to the string value is passed. You are responsible to make sure a string is pre-extended long enough to receive any result that might be written into a string. You can't use a string literal (or other read-only string) as an argument to syscall because Perl has to assume that any string pointer might be written through. If your integer arguments are not literals and have never been interpreted in a numeric context, you may need to add 0 to them to force them to look like numbers. This emulates the syswrite function (or vice versa):

        require 'syscall.ph'; # may need to run h2ph
        $s = "hi there\n";
        syscall(&SYS_write, fileno(STDOUT), $s, length $s);

    Note that Perl supports passing of up to only 14 arguments to your syscall, which in practice should (usually) suffice.

    syscall returns whatever value is returned by the system call it invokes. If the system call fails, syscall returns -1 and sets $! (errno). Note that some system calls can legitimately return -1. The proper way to handle such calls is to assign $!=0 before the call, then check the value of $! if syscall returns -1.

    There's a problem with syscall(&SYS_pipe): it returns the file number of the read end of the pipe it creates, but there is no way to retrieve the file number of the other end. You can avoid this problem by using pipe instead.

    Portability issues: syscall in perlport.

  • sysopen FILEHANDLE,FILENAME,MODE
  • sysopen FILEHANDLE,FILENAME,MODE,PERMS

    Opens the file whose filename is given by FILENAME, and associates it with FILEHANDLE. If FILEHANDLE is an expression, its value is used as the real filehandle wanted; an undefined scalar will be suitably autovivified. This function calls the underlying operating system's open(2) function with the parameters FILENAME, MODE, and PERMS.

    The possible values and flag bits of the MODE parameter are system-dependent; they are available via the standard module Fcntl. See the documentation of your operating system's open(2) syscall to see which values and flag bits are available. You may combine several flags using the | operator.

    Some of the most common values are O_RDONLY for opening the file in read-only mode, O_WRONLY for opening the file in write-only mode, and O_RDWR for opening the file in read-write mode.

    For historical reasons, some values work on almost every system supported by Perl: 0 means read-only, 1 means write-only, and 2 means read/write. We know that these values do not work under OS/390 and on the Macintosh; you probably don't want to use them in new code.

    If the file named by FILENAME does not exist and the open call creates it (typically because MODE includes the O_CREAT flag), then the value of PERMS specifies the permissions of the newly created file. If you omit the PERMS argument to sysopen, Perl uses the octal value 0666. These permission values need to be in octal, and are modified by your process's current umask.

    In many systems the O_EXCL flag is available for opening files in exclusive mode. This is not locking: exclusiveness means here that if the file already exists, sysopen() fails. O_EXCL may not work on network filesystems, and has no effect unless the O_CREAT flag is set as well. Setting O_CREAT|O_EXCL prevents the file from being opened if it is a symbolic link. It does not protect against symbolic links in the file's path.
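
    A minimal sketch of the exclusive-create behavior described above, assuming a local filesystem on which O_EXCL is honored. The file name is hypothetical and the temporary directory is used only to keep the example self-contained.

    ```perl
    use strict;
    use warnings;
    use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
    use File::Temp qw(tempdir);

    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/lockfile";    # hypothetical file name

    # First attempt creates the file. PERMS is omitted, so 0666 modified by
    # the umask applies.
    my $first = sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL);
    print "first:  ", ($first ? "created" : "failed: $!"), "\n";

    # Second attempt fails because the file now exists.
    my $second = sysopen(my $fh2, $path, O_WRONLY | O_CREAT | O_EXCL);
    print "second: ", ($second ? "created" : "failed: $!"), "\n";

    close $fh if $first;
    ```

    The second call is expected to fail with $! set to the system's "file exists" error, which is how a program can tell that another process got there first.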

    Sometimes you may want to truncate an already-existing file. This can be done using the O_TRUNC flag. The behavior of O_TRUNC with O_RDONLY is undefined.

    You should seldom if ever use 0644 as argument to sysopen, because that takes away the user's option to have a more permissive umask. Better to omit it. See the perlfunc(1) entry on umask for more on this.

    Note that sysopen depends on the fdopen() C library function. On many Unix systems, fdopen() is known to fail when file descriptors exceed a certain value, typically 255. If you need more file descriptors than that, consider rebuilding Perl to use the sfio library, or perhaps using the POSIX::open() function.

    See perlopentut for a kinder, gentler explanation of opening files.

    Portability issues: sysopen in perlport.

  • sysread FILEHANDLE,SCALAR,LENGTH,OFFSET
  • sysread FILEHANDLE,SCALAR,LENGTH

    Attempts to read LENGTH bytes of data into variable SCALAR from the specified FILEHANDLE, using read(2). It bypasses buffered IO, so mixing this with other kinds of reads, print, write, seek, tell, or eof can cause confusion because the perlio or stdio layers usually buffer data. Returns the number of bytes actually read, 0 at end of file, or undef if there was an error (in the latter case $! is also set). SCALAR will be grown or shrunk so that the last byte actually read is the last byte of the scalar after the read.

    An OFFSET may be specified to place the read data at some place in the string other than the beginning. A negative OFFSET specifies placement at that many characters counting backwards from the end of the string. A positive OFFSET greater than the length of SCALAR results in the string being padded to the required size with "\0" bytes before the result of the read is appended.

    There is no syseof() function, which is ok, since eof() doesn't work well on device files (like ttys) anyway. Use sysread() and check for a return value of 0 to decide whether you're done.
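
    The end-of-file test described above can be sketched as a read loop. The temporary file, the 4096-byte chunk size, and the variable names are assumptions made for the example.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Create a 10,000-byte file to read back.
    my ($tmp, $path) = tempfile(UNLINK => 1);
    binmode $tmp;
    syswrite($tmp, 'x' x 10_000) or die "syswrite failed: $!";
    close $tmp or die "close failed: $!";

    open(my $in, '<:raw', $path) or die "open failed: $!";
    my $data = '';
    while (1) {
        # An OFFSET of length($data) appends each chunk to what was read so far.
        my $n = sysread($in, $data, 4096, length $data);
        die "sysread failed: $!" unless defined $n;    # undef signals an error
        last if $n == 0;                               # 0 signals end of file
    }
    close $in;
    print length($data), " bytes read\n";              # 10000 bytes read
    ```

    Note that the loop distinguishes undef (error) from 0 (end of file); treating both as "false" would hide read errors.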

    Note that if the filehandle has been marked as :utf8 Unicode characters are read instead of bytes (the LENGTH, OFFSET, and the return value of sysread() are in Unicode characters). The :encoding(...) layer implicitly introduces the :utf8 layer. See binmode, open, and the open pragma, open.

  • sysseek FILEHANDLE,POSITION,WHENCE

    Sets FILEHANDLE's system position in bytes using lseek(2). FILEHANDLE may be an expression whose value gives the name of the filehandle. The values for WHENCE are 0 to set the new position to POSITION; 1 to set it to the current position plus POSITION; and 2 to set it to EOF plus POSITION, typically negative.

    Note the phrase "in bytes": even if the filehandle has been set to operate on characters (for example by using the :encoding(utf8) I/O layer), sysseek() works with byte offsets, not character offsets (because implementing character offsets would render sysseek() unacceptably slow).

    sysseek() bypasses normal buffered IO, so mixing it with reads other than sysread (for example <> or read()), print, write, seek, tell, or eof may cause confusion.

    For WHENCE, you may also use the constants SEEK_SET, SEEK_CUR, and SEEK_END (start of the file, current position, end of the file) from the Fcntl module. Use of the constants is also more portable than relying on 0, 1, and 2. For example, to define a "systell" function:

        use Fcntl 'SEEK_CUR';
        sub systell { sysseek($_[0], 0, SEEK_CUR) }

    Returns the new position, or the undefined value on failure. A position of zero is returned as the string "0 but true"; thus sysseek returns true on success and false on failure, yet you can still easily determine the new position.
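
    A short sketch of the "0 but true" return value, using a temporary file created only for the illustration:

    ```perl
    use strict;
    use warnings;
    use Fcntl qw(SEEK_SET SEEK_END);
    use File::Temp qw(tempfile);

    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;
    syswrite($fh, "0123456789") or die "syswrite failed: $!";

    my $start = sysseek($fh, 0, SEEK_SET);   # position zero comes back as "0 but true"
    my $end   = sysseek($fh, 0, SEEK_END);   # 10, the size of the file

    print "start is true\n" if $start;
    printf "start=%d end=%d\n", $start, $end;   # start=0 end=10
    close $fh;
    ```

    The string "0 but true" is true in boolean context and 0 in numeric context, so the same return value serves both as a success flag and as a position.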

  • system LIST
  • system PROGRAM LIST

    Does exactly the same thing as exec LIST, except that a fork is done first and the parent process waits for the child process to exit. Note that argument processing varies depending on the number of arguments. If there is more than one argument in LIST, or if LIST is an array with more than one value, starts the program given by the first element of the list with arguments given by the rest of the list. If there is only one scalar argument, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system's command shell for parsing (this is /bin/sh -c on Unix platforms, but varies on other platforms). If there are no shell metacharacters in the argument, it is split into words and passed directly to execvp, which is more efficient.

    Perl will attempt to flush all files opened for output before any operation that may do a fork, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles.

    The return value is the exit status of the program as returned by the wait call. To get the actual exit value, shift right by eight (see below). See also exec. This is not what you want to use to capture the output from a command; for that you should simply use backticks or qx//, as described in `STRING` in perlop. A return value of -1 indicates a failure to start the program or an error of the wait(2) system call (inspect $! for the reason).

    If you'd like to make system (and many other bits of Perl) die on error, have a look at the autodie pragma.

    Like exec, system allows you to lie to a program about its name if you use the system PROGRAM LIST syntax. Again, see exec.

    Since SIGINT and SIGQUIT are ignored during the execution of system, if you expect your program to terminate on receipt of these signals you will need to arrange to do so yourself based on the return value.

        @args = ("command", "arg1", "arg2");
        system(@args) == 0
            or die "system @args failed: $?";

    If you'd like to manually inspect system's failure, you can check all possible failure modes by inspecting $? like this:

        if ($? == -1) {
            print "failed to execute: $!\n";
        }
        elsif ($? & 127) {
            printf "child died with signal %d, %s coredump\n",
                ($? & 127), ($? & 128) ? 'with' : 'without';
        }
        else {
            printf "child exited with value %d\n", $? >> 8;
        }

    Alternatively, you may inspect the value of ${^CHILD_ERROR_NATIVE} with the W*() calls from the POSIX module.
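
    A sketch of that alternative, assuming a POSIX-like system where the W*() macros are available. The child here is simply the running perl ($^X) asked to exit with status 3.

    ```perl
    use strict;
    use warnings;
    use POSIX qw(WIFEXITED WEXITSTATUS WIFSIGNALED WTERMSIG);

    # Run a child that exits with a known status.
    system($^X, '-e', 'exit 3');
    die "failed to execute: $!" if $? == -1;

    my $native = ${^CHILD_ERROR_NATIVE};
    my $exit;
    if (WIFEXITED($native)) {
        $exit = WEXITSTATUS($native);
        print "child exited with value $exit\n";    # child exited with value 3
    }
    elsif (WIFSIGNALED($native)) {
        print "child died with signal ", WTERMSIG($native), "\n";
    }
    ```

    On such systems WEXITSTATUS yields the same number as $? >> 8, but the W*() calls avoid hard-coding the layout of the status word.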

    When system's arguments are executed indirectly by the shell, results and return codes are subject to its quirks. See `STRING` in perlop and exec for details.

    Since system does a fork and wait it may affect a SIGCHLD handler. See perlipc for details.

    Portability issues: system in perlport.

  • syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET
  • syswrite FILEHANDLE,SCALAR,LENGTH
  • syswrite FILEHANDLE,SCALAR

    Attempts to write LENGTH bytes of data from variable SCALAR to the specified FILEHANDLE, using write(2). If LENGTH is not specified, writes whole SCALAR. It bypasses buffered IO, so mixing this with reads (other than sysread()), print, write, seek, tell, or eof may cause confusion because the perlio and stdio layers usually buffer data. Returns the number of bytes actually written, or undef if there was an error (in this case the errno variable $! is also set). If the LENGTH is greater than the data available in the SCALAR after the OFFSET, only as much data as is available will be written.

    An OFFSET may be specified to write the data from some part of the string other than the beginning. A negative OFFSET specifies writing that many characters counting backwards from the end of the string. If SCALAR is of length zero, you can only use an OFFSET of 0.

    WARNING: If the filehandle is marked :utf8, Unicode characters encoded in UTF-8 are written instead of bytes, and the LENGTH, OFFSET, and return value of syswrite() are in (UTF8-encoded Unicode) characters. The :encoding(...) layer implicitly introduces the :utf8 layer. Alternately, if the handle is not marked with an encoding but you attempt to write characters with code points over 255, raises an exception. See binmode, open, and the open pragma, open.
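
    Because syswrite returns the number of bytes actually written, callers that must write a whole buffer often loop, using OFFSET to resume. The helper name write_all below is hypothetical, and this sketch assumes a byte-oriented handle and does not retry on interrupted calls.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical helper: call syswrite until the whole buffer has been written.
    sub write_all {
        my ($fh, $buf) = @_;
        my $done = 0;
        while ($done < length $buf) {
            # Write the remaining bytes, starting at OFFSET $done.
            my $n = syswrite($fh, $buf, length($buf) - $done, $done);
            return unless defined $n;    # error; $! is set
            $done += $n;
        }
        return $done;
    }

    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;
    my $written = write_all($fh, "hello, world\n") // die "write failed: $!";
    close $fh or die "close failed: $!";
    print "$written bytes written, file size ", -s $path, "\n";   # 13 bytes written, file size 13
    ```

    On a regular file a single syswrite normally writes everything; the loop matters mostly for pipes and sockets.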

  • tell FILEHANDLE
  • tell

    Returns the current position in bytes for FILEHANDLE, or -1 on error. FILEHANDLE may be an expression whose value gives the name of the actual filehandle. If FILEHANDLE is omitted, assumes the file last read.

    Note the phrase "in bytes": even if the filehandle has been set to operate on characters (for example by using the :encoding(utf8) open layer), tell() will return byte offsets, not character offsets (because that would render seek() and tell() rather slow).

    The return value of tell() for standard streams such as STDIN depends on the operating system: it may return -1 or something else. tell() on pipes, fifos, and sockets usually returns -1.

    There is no systell function. Use sysseek(FH, 0, 1) for that.

    Do not use tell() (or other buffered I/O operations) on a filehandle that has been manipulated by sysread(), syswrite(), or sysseek(). Those functions ignore the buffering, while tell() does not.
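
    The difference between byte offsets and character counts can be seen in a small sketch. The file contents and variable names are assumptions made for the example; the first line holds four characters that occupy six bytes in UTF-8.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my ($out, $path) = tempfile(UNLINK => 1);
    binmode $out, ':encoding(UTF-8)';
    # e-acute, t, e-acute, newline: 4 characters, 6 bytes once encoded
    print {$out} "\x{e9}t\x{e9}\n", "second line\n";
    close $out or die "close failed: $!";

    open(my $in, '<:encoding(UTF-8)', $path) or die "open failed: $!";
    my $line = <$in>;
    my $pos  = tell($in);
    printf "%d characters read, tell reports %d\n", length($line), $pos;
    close $in;
    ```

    Here length($line) counts characters while tell($in) counts bytes consumed from the file, so the two numbers differ whenever the text contains multi-byte characters.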

  • telldir DIRHANDLE

    Returns the current position of the readdir routines on DIRHANDLE. Value may be given to seekdir to access a particular location in a directory. telldir has the same caveats about possible directory compaction as the corresponding system library routine.

  • tie VARIABLE,CLASSNAME,LIST

    This function binds a variable to a package class that will provide the implementation for the variable. VARIABLE is the name of the variable to be enchanted. CLASSNAME is the name of a class implementing objects of correct type. Any additional arguments are passed to the appropriate constructor method of the class (meaning TIESCALAR, TIEHANDLE, TIEARRAY, or TIEHASH). Typically these are arguments such as might be passed to the dbm_open() function of C. The object returned by the constructor is also returned by the tie function, which would be useful if you want to access other methods in CLASSNAME.

    Note that functions such as keys and values may return huge lists when used on large objects, like DBM files. You may prefer to use the each function to iterate over such. Example:

        # print out history file offsets
        use NDBM_File;
        tie(%HIST, 'NDBM_File', '/usr/lib/news/history', 1, 0);
        while (($key,$val) = each %HIST) {
            print $key, ' = ', unpack('L',$val), "\n";
        }
        untie(%HIST);

    A class implementing a hash should have the following methods:

        TIEHASH classname, LIST
        FETCH this, key
        STORE this, key, value
        DELETE this, key
        CLEAR this
        EXISTS this, key
        FIRSTKEY this
        NEXTKEY this, lastkey
        SCALAR this
        DESTROY this
        UNTIE this

    A class implementing an ordinary array should have the following methods:

        TIEARRAY classname, LIST
        FETCH this, key
        STORE this, key, value
        FETCHSIZE this
        STORESIZE this, count
        CLEAR this
        PUSH this, LIST
        POP this
        SHIFT this
        UNSHIFT this, LIST
        SPLICE this, offset, length, LIST
        EXTEND this, count
        DELETE this, key
        EXISTS this, key
        DESTROY this
        UNTIE this

    A class implementing a filehandle should have the following methods:

        TIEHANDLE classname, LIST
        READ this, scalar, length, offset
        READLINE this
        GETC this
        WRITE this, scalar, length, offset
        PRINT this, LIST
        PRINTF this, format, LIST
        BINMODE this
        EOF this
        FILENO this
        SEEK this, position, whence
        TELL this
        OPEN this, mode, LIST
        CLOSE this
        DESTROY this
        UNTIE this

    A class implementing a scalar should have the following methods:

        TIESCALAR classname, LIST
        FETCH this
        STORE this, value
        DESTROY this
        UNTIE this

    Not all methods indicated above need be implemented. See perltie, Tie::Hash, Tie::Array, Tie::Scalar, and Tie::Handle.

    Unlike dbmopen, the tie function will not use or require a module for you; you need to do that explicitly yourself. See DB_File or the Config module for interesting tie implementations.

    For further details see perltie, tied VARIABLE.
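
    As an illustration of the scalar interface listed above, here is a minimal sketch of a tied scalar. The class name CountingScalar and its reads method are hypothetical; only TIESCALAR, FETCH, and STORE are implemented, since not all methods are required.

    ```perl
    use strict;
    use warnings;

    # Hypothetical class: a scalar that counts how often it is read.
    package CountingScalar;

    sub TIESCALAR {
        my ($class, $initial) = @_;
        return bless { value => $initial, reads => 0 }, $class;
    }
    sub FETCH { my $self = shift; $self->{reads}++; return $self->{value} }
    sub STORE { my ($self, $value) = @_; $self->{value} = $value }
    sub reads { $_[0]{reads} }

    package main;

    # tie returns the object created by TIESCALAR.
    my $obj = tie my $counter, 'CountingScalar', 10;

    my $first  = $counter;    # one FETCH
    my $second = $counter;    # another FETCH
    $counter   = 42;          # one STORE

    my $reads    = $obj->reads;
    my $same_obj = tied($counter) == $obj;    # tied returns the same object
    print "first=$first reads=$reads same_obj=", ($same_obj ? "yes" : "no"), "\n";
    # first=10 reads=2 same_obj=yes

    undef $obj;               # drop the extra reference before untying
    untie $counter;
    ```

    Holding on to the object returned by tie (or fetched later with tied) is what gives access to extra methods such as reads.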

  • tied VARIABLE

    Returns a reference to the object underlying VARIABLE (the same value that was originally returned by the tie call that bound the variable to a package.) Returns the undefined value if VARIABLE isn't tied to a package.

  • time

    Returns the number of non-leap seconds since whatever time the system considers to be the epoch, suitable for feeding to gmtime and localtime. On most systems the epoch is 00:00:00 UTC, January 1, 1970; a prominent exception being Mac OS Classic which uses 00:00:00, January 1, 1904 in the current local time zone for its epoch.

    For measuring time in better granularity than one second, use the Time::HiRes module from Perl 5.8 onwards (or from CPAN before then), or, if you have gettimeofday(2), you may be able to use the syscall interface of Perl. See perlfaq8 for details.

    For date and time processing look at the many related modules on CPAN. For a comprehensive date and time representation look at the DateTime module.

  • times

    Returns a four-element list giving the user and system times in seconds for this process and any exited children of this process.

        ($user,$system,$cuser,$csystem) = times;

    In scalar context, times returns $user.

    Children's times are only included for terminated children.
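
    A small sketch of using times to measure the CPU time consumed by a stretch of code. The busy loop is an arbitrary stand-in for real work, and the printed figures will vary from run to run.

    ```perl
    use strict;
    use warnings;

    my ($user0, $system0) = times;

    my $x = 0;
    $x += sqrt($_) for 1 .. 200_000;    # some CPU-bound work

    my ($user1, $system1) = times;
    printf "user %.2fs, system %.2fs\n", $user1 - $user0, $system1 - $system0;
    ```

    Unlike time, these figures count processor time charged to the process, not elapsed wall-clock time.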

    Portability issues: times in perlport.

  • tr///

    The transliteration operator. Same as y///. See Quote and Quote-like Operators in perlop.

  • truncate FILEHANDLE,LENGTH
  • truncate EXPR,LENGTH

    Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified length. Raises an exception if truncate isn't implemented on your system. Returns true if successful, undef on error.

    The behavior is undefined if LENGTH is greater than the length of the file.

    The position in the file of FILEHANDLE is left unchanged. You may want to call seek before writing to the file.
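
    The following sketch shows why the seek matters when a file is emptied and rewritten through the same handle. The temporary file and its contents are assumptions made for the example.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} "old contents that are fairly long\n";

    # truncate leaves the file position where it was (after the old data),
    # so rewind before writing the replacement.
    truncate($fh, 0) or die "truncate failed: $!";
    seek($fh, 0, 0)  or die "seek failed: $!";
    print {$fh} "new\n";
    close $fh or die "close failed: $!";

    print "size is now ", -s $path, " bytes\n";    # size is now 4 bytes
    ```

    Without the seek, the new data would be written at the old offset, well past the new end of the file.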

    Portability issues: truncate in perlport.

  • uc EXPR
  • uc

    Returns an uppercased version of EXPR. This is the internal function implementing the \U escape in double-quoted strings. It does not attempt to do titlecase mapping on initial letters. See ucfirst for that.

    If EXPR is omitted, uses $_.

    This function behaves the same way under various pragmas, such as in a locale, as lc does.

  • ucfirst EXPR
  • ucfirst

    Returns the value of EXPR with the first character in uppercase (titlecase in Unicode). This is the internal function implementing the \u escape in double-quoted strings.

    If EXPR is omitted, uses $_.

    This function behaves the same way under various pragmas, such as in a locale, as lc does.

  • umask EXPR
  • umask

    Sets the umask for the process to EXPR and returns the previous value. If EXPR is omitted, merely returns the current umask.

    The Unix permission rwxr-x--- is represented as three sets of three bits, or three octal digits: 0750 (the leading 0 indicates octal and isn't one of the digits). The umask value is such a number representing disabled permissions bits. The permission (or "mode") values you pass mkdir or sysopen are modified by your umask, so even if you tell sysopen to create a file with permissions 0777, if your umask is 0022, then the file will actually be created with permissions 0755. If your umask were 0027 (group can't write; others can't read, write, or execute), then passing sysopen 0666 would create a file with mode 0640 (because 0666 &~ 027 is 0640).

    Here's some advice: supply a creation mode of 0666 for regular files (in sysopen) and one of 0777 for directories (in mkdir) and executable files. This gives users the freedom of choice: if they want protected files, they might choose process umasks of 022, 027, or even the particularly antisocial mask of 077. Programs should rarely if ever make policy decisions better left to the user. The exception to this is when writing files that should be kept private: mail files, web browser cookies, .rhosts files, and so on.

    If umask(2) is not implemented on your system and you are trying to restrict access for yourself (i.e., (EXPR & 0700) > 0), raises an exception. If umask(2) is not implemented and you are not trying to restrict access for yourself, returns undef.

    Remember that a umask is a number, usually given in octal; it is not a string of octal digits. See also oct, if all you have is a string.
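
    The arithmetic above, and the need for oct when the umask arrives as a string, can be sketched as follows. The variable names are illustrative only.

    ```perl
    use strict;
    use warnings;

    my $requested = 0666;                   # mode passed to sysopen
    my $mask      = oct('027');             # a umask held as a string must go through oct
    my $effective = $requested & ~$mask;    # bits set in the umask are cleared

    printf "%04o & ~%04o = %04o\n", $requested, $mask, $effective;
    # 0666 & ~0027 = 0640
    ```

    Using the string '027' directly in the expression would treat it as decimal 27, which is a different set of bits.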

    Portability issues: umask in perlport.

  • undef EXPR
  • undef

    Undefines the value of EXPR, which must be an lvalue. Use only on a scalar value, an array (using @), a hash (using %), a subroutine (using &), or a typeglob (using *). Saying undef $hash{$key} will probably not do what you expect on most predefined variables or DBM list values, so don't do that; see delete. Always returns the undefined value. You can omit the EXPR, in which case nothing is undefined, but you still get an undefined value that you could, for instance, return from a subroutine, assign to a variable, or pass as a parameter. Examples:

        undef $foo;
        undef $bar{'blurfl'}; # Compare to: delete $bar{'blurfl'};
        undef @ary;
        undef %hash;
        undef &mysub;
        undef *xyz; # destroys $xyz, @xyz, %xyz, &xyz, etc.
        return (wantarray ? (undef, $errmsg) : undef) if $they_blew_it;
        select undef, undef, undef, 0.25;
        ($a, $b, undef, $c) = &foo; # Ignore third value returned

    Note that this is a unary operator, not a list operator.

  • unlink LIST
  • unlink

    Deletes a list of files. On success, it returns the number of files it successfully deleted. On failure, it returns false and sets $! (errno):

        my $unlinked = unlink 'a', 'b', 'c';
        unlink @goners;
        unlink glob "*.bak";

    On error, unlink will not tell you which files it could not remove. If you want to know which files you could not remove, try them one at a time:

        foreach my $file ( @goners ) {
            unlink $file or warn "Could not unlink $file: $!";
        }

    Note: unlink will not attempt to delete directories unless you are superuser and the -U flag is supplied to Perl. Even if these conditions are met, be warned that unlinking a directory can inflict damage on your filesystem. Finally, using unlink on directories is not supported on many operating systems. Use rmdir instead.

    If LIST is omitted, unlink uses $_.

  • unpack TEMPLATE,EXPR
  • unpack TEMPLATE

    unpack does the reverse of pack: it takes a string and expands it out into a list of values. (In scalar context, it returns merely the first value produced.)

    If EXPR is omitted, unpacks the $_ string. See perlpacktut for an introduction to this function.

    The string is broken into chunks described by the TEMPLATE. Each chunk is converted separately to a value. Typically, either the string is a result of pack, or the characters of the string represent a C structure of some kind.

    The TEMPLATE has the same format as in the pack function. Here's a subroutine that does substring:

        sub substr {
            my($what,$where,$howmuch) = @_;
            unpack("x$where a$howmuch", $what);
        }

    and then there's

        sub ordinal { unpack("W",$_[0]); } # same as ord()

    In addition to fields allowed in pack(), you may prefix a field with a %<number> to indicate that you want a <number>-bit checksum of the items instead of the items themselves. Default is a 16-bit checksum. Checksum is calculated by summing numeric values of expanded values (for string fields the sum of ord($char) is taken; for bit fields the sum of zeroes and ones).

    For example, the following computes the same number as the System V sum program:

        $checksum = do {
            local $/; # slurp!
            unpack("%32W*",<>) % 65535;
        };

    The following efficiently counts the number of set bits in a bit vector:

        $setbits = unpack("%32b*", $selectmask);
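
    A self-contained sketch of both kinds of checksum, using a bit vector built with vec and a short byte string. The data and variable names are chosen only for the illustration.

    ```perl
    use strict;
    use warnings;

    my $vector = '';
    vec($vector, $_, 1) = 1 for (1, 3, 4, 9);    # set four bits in a bit vector

    my $setbits = unpack("%32b*", $vector);      # sum of all the 0/1 bits
    print "set bits: $setbits\n";                # set bits: 4

    my $bytesum = unpack("%16C*", "abc");        # 97 + 98 + 99
    print "byte sum: $bytesum\n";                # byte sum: 294
    ```

    In both calls the % prefix makes unpack return a single sum instead of the list of unpacked items.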

    The p and P formats should be used with care. Since Perl has no way of checking whether the value passed to unpack() corresponds to a valid memory location, passing a pointer value that's not known to be valid is likely to have disastrous consequences.

    If there are more pack codes or if the repeat count of a field or a group is larger than what the remainder of the input string allows, the result is not well defined: the repeat count may be decreased, or unpack() may produce empty strings or zeros, or it may raise an exception. If the input string is longer than one described by the TEMPLATE, the remainder of that input string is ignored.

    See pack for more examples and notes.

  • unshift ARRAY,LIST
  • unshift EXPR,LIST

    Does the opposite of a shift. Or the opposite of a push, depending on how you look at it. Prepends list to the front of the array and returns the new number of elements in the array.

        unshift(@ARGV, '-e') unless $ARGV[0] =~ /^-/;

    Note the LIST is prepended whole, not one element at a time, so the prepended elements stay in the same order. Use reverse to do the reverse.
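
    The ordering rule can be seen by comparing one unshift of a list with repeated single-element calls. The array names are illustrative only.

    ```perl
    use strict;
    use warnings;

    my @queue = qw(c d);
    my $count = unshift @queue, 'a', 'b';       # LIST goes on as a block
    print "@queue ($count)\n";                  # a b c d (4)

    my @one_at_a_time = qw(c d);
    unshift @one_at_a_time, $_ for 'a', 'b';    # each call lands in front of the previous one
    print "@one_at_a_time\n";                   # b a c d
    ```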

    Starting with Perl 5.14, unshift can take a scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of unshift is considered highly experimental. The exact behaviour may change in a future version of Perl.

    To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

        use 5.014; # so push/pop/etc work on scalars (experimental)

  • untie VARIABLE

    Breaks the binding between a variable and a package. (See tie.) Has no effect if the variable is not tied.

  • use Module VERSION LIST
  • use Module VERSION
  • use Module LIST
  • use Module
  • use VERSION

    Imports some semantics into the current package from the named module, generally by aliasing certain subroutine or variable names into your package. It is exactly equivalent to

        BEGIN { require Module; Module->import( LIST ); }

    except that Module must be a bareword. The importation can be made conditional by using the if module.

    In the peculiar use VERSION form, VERSION may be either a positive decimal fraction such as 5.006, which will be compared to $], or a v-string of the form v5.6.1, which will be compared to $^V (aka $PERL_VERSION). An exception is raised if VERSION is greater than the version of the current Perl interpreter; Perl will not attempt to parse the rest of the file. Compare with require, which can do a similar check at run time. Symmetrically, no VERSION allows you to specify that you want a version of Perl older than the specified one.

    Specifying VERSION as a literal of the form v5.6.1 should generally be avoided, because it leads to misleading error messages under earlier versions of Perl (that is, prior to 5.6.0) that do not support this syntax. The equivalent numeric version should be used instead.

        use v5.6.1; # compile time version check
        use 5.6.1; # ditto
        use 5.006_001; # ditto; preferred for backwards compatibility

    This is often useful if you need to check the current Perl version before useing library modules that won't work with older versions of Perl. (We try not to do this more than we have to.)

    use VERSION also enables all features available in the requested version as defined by the feature pragma, disabling any features not in the requested version's feature bundle. See feature. Similarly, if the specified Perl version is greater than or equal to 5.12.0, strictures are enabled lexically as with use strict. Any explicit use of use strict or no strict overrides use VERSION, even if it comes before it. In both cases, the feature.pm and strict.pm files are not actually loaded.

    The BEGIN forces the require and import to happen at compile time. The require makes sure the module is loaded into memory if it hasn't been yet. The import is not a builtin; it's just an ordinary static method call into the Module package to tell the module to import the list of features back into the current package. The module can implement its import method any way it likes, though most modules just choose to derive their import method via inheritance from the Exporter class that is defined in the Exporter module. See Exporter. If no import method can be found then the call is skipped, even if there is an AUTOLOAD method.

    If you do not want to call the package's import method (for instance, to stop your namespace from being altered), explicitly supply the empty list:

        use Module ();

    That is exactly equivalent to

        BEGIN { require Module }

    If the VERSION argument is present between Module and LIST, then the use will call the VERSION method in class Module with the given version as an argument. The default VERSION method, inherited from the UNIVERSAL class, croaks if the given version is larger than the value of the variable $Module::VERSION.

    Again, there is a distinction between omitting LIST (import called with no arguments) and an explicit empty LIST () (import not called). Note that there is no comma after VERSION!
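
    A brief sketch of these forms using two core modules. The choice of List::Util and File::Basename, and the version number 1.00, are assumptions made for the example.

    ```perl
    use strict;
    use warnings;
    use List::Util 1.00 qw(max);    # VERSION check, then import of max only (no comma after VERSION)
    use File::Basename ();          # module loaded, import not called

    my $largest = max(3, 7, 5);
    my $base    = File::Basename::basename('/tmp/x.txt');   # still callable by its full name
    my $status  = defined(&basename) ? "imported" : "not imported";

    print "$largest $base $status\n";    # 7 x.txt not imported
    ```

    With the empty list, the module's functions remain reachable through their package-qualified names, but nothing is added to the current namespace.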

    Because this is a wide-open interface, pragmas (compiler directives) are also implemented this way. Currently implemented pragmas are:

        use constant;
        use diagnostics;
        use integer;
        use sigtrap qw(SEGV BUS);
        use strict qw(subs vars refs);
        use subs qw(afunc blurfl);
        use warnings qw(all);
        use sort qw(stable _quicksort _mergesort);

    Some of these pseudo-modules import semantics into the current block scope (like strict or integer), unlike ordinary modules, which import symbols into the current package (which are effective through the end of the file).

    Because use takes effect at compile time, it doesn't respect the ordinary flow control of the code being compiled. In particular, putting a use inside the false branch of a conditional doesn't prevent it from being processed. If a module or pragma only needs to be loaded conditionally, this can be done using the if pragma:

        use if $] < 5.008, "utf8";
        use if WANT_WARNINGS, warnings => qw(all);

    There's a corresponding no declaration that unimports meanings imported by use, i.e., it calls unimport Module LIST instead of import. It behaves just as import does with VERSION, an omitted or empty LIST, or no unimport method being found.

        no integer;
        no strict 'refs';
        no warnings;

    Care should be taken when using the no VERSION form of no. It is only meant to be used to assert that the running Perl is of an earlier version than its argument and not to undo the feature-enabling side effects of use VERSION.

    See perlmodlib for a list of standard modules and pragmas. See perlrun for the -M and -m command-line options to Perl that give use functionality from the command-line.

  • utime LIST

    Changes the access and modification times on each file of a list of files. The first two elements of the list must be the NUMERIC access and modification times, in that order. Returns the number of files successfully changed. The inode change time of each file is set to the current time. For example, this code has the same effect as the Unix touch(1) command when the files already exist and belong to the user running the program:

        #!/usr/bin/perl
        $atime = $mtime = time;
        utime $atime, $mtime, @ARGV;

    Since Perl 5.8.0, if the first two elements of the list are undef, the utime(2) syscall from your C library is called with a null second argument. On most systems, this will set the file's access and modification times to the current time (i.e., equivalent to the example above) and will work even on files you don't own provided you have write permission:

    for $file (@ARGV) {
        utime(undef, undef, $file)
            || warn "couldn't touch $file: $!";
    }

    Under NFS this will use the time of the NFS server, not the time of the local machine. If there is a time synchronization problem, the NFS server and local machine will have different times. The Unix touch(1) command will in fact normally use this form instead of the one shown in the first example.

    Passing only one of the first two elements as undef is equivalent to passing a 0 and will not have the effect described when both are undef. This also triggers an uninitialized warning.

    On systems that support futimes(2), you may pass filehandles among the files. On systems that don't support futimes(2), passing filehandles raises an exception. Filehandles must be passed as globs or glob references to be recognized; barewords are considered filenames.
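
    As a minimal sketch of the explicit-timestamp form, the following sets both times on a scratch file and reads them back with stat. The temporary file from File::Temp and the epoch value chosen are illustrative assumptions, not part of the description above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file to operate on (illustrative only).
my ($fh, $filename) = tempfile(UNLINK => 1);
close $fh;

# Set both access and modification time to a fixed epoch value.
my $when    = 1_000_000_000;    # arbitrary timestamp in seconds since the epoch
my $changed = utime $when, $when, $filename;
warn "couldn't change times on $filename: $!" unless $changed == 1;

# stat returns atime and mtime in elements 8 and 9 of its list.
my ($atime, $mtime) = (stat $filename)[8, 9];
print "atime=$atime mtime=$mtime\n";
```

    Checking the return value matters because utime reports only how many files were changed; the reason for a failure is left in $!.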

    Portability issues: utime in perlport.

  • values HASH
  • values ARRAY
  • values EXPR

    In list context, returns a list consisting of all the values of the named hash. In Perl 5.12 or later only, will also return a list of the values of an array; prior to that release, attempting to use an array argument will produce a syntax error. In scalar context, returns the number of values.

    Hash entries are returned in an apparently random order. The actual random order is specific to a given hash; the exact same series of operations on two hashes may result in a different order for each hash. Any insertion into the hash may change the order, as will any deletion, with the exception that the most recent key returned by each or keys may be deleted without changing the order. So long as a given hash is unmodified you may rely on keys, values and each to repeatedly return the same order as each other. See Algorithmic Complexity Attacks in perlsec for details on why hash order is randomized. Aside from the guarantees provided here the exact details of Perl's hash algorithm and the hash traversal order are subject to change in any release of Perl.
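
    The ordering guarantee can be sketched as follows, assuming a small hash whose names and numbers are invented for illustration. The check relies only on keys and values agreeing with each other, never on any particular order.

```perl
use strict;
use warnings;

my %age = (alice => 31, bob => 27, carol => 45);

# keys and values walk the unmodified hash in the same order,
# so the two lists line up element by element.
my @k = keys %age;
my @v = values %age;

my $aligned = 1;
for my $i (0 .. $#k) {
    $aligned = 0 unless $age{ $k[$i] } == $v[$i];
}
print "keys and values are aligned\n" if $aligned;

# In scalar context, values returns the number of values.
my $count = values %age;
print "count: $count\n";
```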

    As a side effect, calling values() resets the HASH or ARRAY's internal iterator; see each. (In particular, calling values() in void context resets the iterator with no other overhead.) Apart from resetting the iterator, values @array in list context is the same as plain @array. (We recommend that you use void context keys @array for this, but reasoned that taking values @array out would require more documentation than leaving it in.)

    Note that the values are not copied, which means modifying them will modify the contents of the hash:

    for (values %hash) { s/foo/bar/g }          # modifies %hash values
    for (@hash{keys %hash}) { s/foo/bar/g }     # same

    Starting with Perl 5.14, values can take a scalar EXPR, which must hold a reference to an unblessed hash or array. The argument will be dereferenced automatically. This aspect of values is considered highly experimental. The exact behaviour may change in a future version of Perl.

    for (values $hashref) { ... }
    for (values $obj->get_arrayref) { ... }

    To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

    use 5.012;  # so keys/values/each work on arrays
    use 5.014;  # so keys/values/each work on scalars (experimental)

    See also keys, each, and sort.

  • vec EXPR,OFFSET,BITS

    Treats the string in EXPR as a bit vector made up of elements of width BITS and returns the value of the element specified by OFFSET as an unsigned integer. BITS therefore specifies the number of bits that are reserved for each element in the bit vector. This must be a power of two from 1 to 32 (or 64, if your platform supports that).

    If BITS is 8, "elements" coincide with bytes of the input string.

    If BITS is 16 or more, bytes of the input string are grouped into chunks of size BITS/8, and each group is converted to a number as with pack()/unpack() with big-endian formats n/N (and analogously for BITS==64). See pack for details.

    If BITS is 4 or less, the string is broken into bytes, then the bits of each byte are broken into 8/BITS groups. Bits of a byte are numbered in a little-endian-ish way, as in 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80. For example, breaking the single input byte chr(0x36) into two groups gives a list (0x6, 0x3); breaking it into 4 groups gives (0x2, 0x1, 0x3, 0x0).
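
    The chr(0x36) example can be checked directly. The sketch below reads the same byte with BITS of 4 and of 2; the variable names are illustrative.

```perl
use strict;
use warnings;

my $byte = chr(0x36);    # bit pattern 0011 0110

# BITS == 4: two elements per byte, low nibble first.
my @nibbles = map { vec($byte, $_, 4) } 0 .. 1;
print "nibbles: ", join(', ', map { sprintf '0x%X', $_ } @nibbles), "\n";

# BITS == 2: four elements per byte, lowest pair of bits first.
my @pairs = map { vec($byte, $_, 2) } 0 .. 3;
print "pairs:   ", join(', ', map { sprintf '0x%X', $_ } @pairs), "\n";
```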

    vec may also be assigned to, in which case parentheses are needed to give the expression the correct precedence as in

    vec($image, $max_x * $x + $y, 8) = 3;

    If the selected element is outside the string, the value 0 is returned. If an element off the end of the string is written to, Perl will first extend the string with sufficiently many zero bytes. It is an error to try to write off the beginning of the string (i.e., negative OFFSET).
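
    A short sketch of both out-of-range cases, starting from an empty string. The offsets are arbitrary choices for illustration.

```perl
use strict;
use warnings;

my $vector = '';

# Reading past the end yields 0 and leaves the string untouched.
my $missing = vec($vector, 100, 8);
print "missing element: $missing, length is ", length($vector), "\n";

# Writing past the end first pads the string with zero bytes.
vec($vector, 3, 8) = 0xFF;
print "length after write: ", length($vector), "\n";
print "bytes: ", join(' ', map { sprintf '%02X', ord($_) } split //, $vector), "\n";
```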

    If the string happens to be encoded as UTF-8 internally (and thus has the UTF8 flag set), this is ignored by vec, and it operates on the internal byte string, not the conceptual character string, even if you only have characters with values less than 256.

    Strings created with vec can also be manipulated with the logical operators |, &, ^, and ~. These operators will assume a bit vector operation is desired when both operands are strings. See Bitwise String Operators in perlop.
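
    As an illustration, two small bit sets built with vec can be combined with | and &. The set names and member numbers here are assumptions made for the example, and the code relies on both operands being strings, as described above.

```perl
use strict;
use warnings;

# Build two bit sets; a set bit at offset N means "N is a member".
my ($set_a, $set_b) = ('', '');
vec($set_a, $_, 1) = 1 for 1, 3, 5;
vec($set_b, $_, 1) = 1 for 3, 4, 5;

# Both operands are strings, so these are bitwise string operations.
my $union        = $set_a | $set_b;
my $intersection = $set_a & $set_b;

# unpack "b*" lists bits in ascending order, matching vec's numbering.
print "union:        ", unpack("b*", $union), "\n";
print "intersection: ", unpack("b*", $intersection), "\n";
```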

    The following code will build up an ASCII string saying 'PerlPerlPerl'. The comments show the string after each step. Note that this code works in the same way on big-endian or little-endian machines.

    my $foo = '';
    vec($foo,  0, 32) = 0x5065726C; # 'Perl'

    # $foo eq "Perl" eq "\x50\x65\x72\x6C", 32 bits
    print vec($foo, 0, 8);          # prints 80 == 0x50 == ord('P')

    vec($foo,  2, 16) = 0x5065;     # 'PerlPe'
    vec($foo,  3, 16) = 0x726C;     # 'PerlPerl'
    vec($foo,  8,  8) = 0x50;       # 'PerlPerlP'
    vec($foo,  9,  8) = 0x65;       # 'PerlPerlPe'
    vec($foo, 20,  4) = 2;          # 'PerlPerlPe'   . "\x02"
    vec($foo, 21,  4) = 7;          # 'PerlPerlPer'
                                    # 'r' is "\x72"
    vec($foo, 45,  2) = 3;          # 'PerlPerlPer'  . "\x0c"
    vec($foo, 93,  1) = 1;          # 'PerlPerlPer'  . "\x2c"
    vec($foo, 94,  1) = 1;          # 'PerlPerlPerl'
                                    # 'l' is "\x6c"

    To transform a bit vector into a string or list of 0's and 1's, use these:

    $bits = unpack("b*", $vector);
    @bits = split(//, unpack("b*", $vector));

    If you know the exact length in bits, it can be used in place of the *.

    Here is an example to illustrate how the bits actually fall in place:

    #!/usr/bin/perl -wl

    print <<'EOT';
                                      0         1         2         3
                       unpack("V",$_) 01234567890123456789012345678901
    ------------------------------------------------------------------
    EOT

    for $w (0..3) {
        $width = 2**$w;
        for ($shift=0; $shift < $width; ++$shift) {
            for ($off=0; $off < 32/$width; ++$off) {
                $str = pack("B*", "0"x32);
                $bits = (1<<$shift);
                vec($str, $off, $width) = $bits;
                $res = unpack("b*",$str);
                $val = unpack("V", $str);
                write;
            }
        }
    }

    format STDOUT =
    vec($_,@#,@#) = @<< == @######### @>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
    $off, $width, $bits, $val, $res
    .
    __END__

    Regardless of the machine architecture on which it runs, the example above should print the following table:

    1. 0 1 2 3
    2. unpack("V",$_) 01234567890123456789012345678901
    3. ------------------------------------------------------------------
    4. vec($_, 0, 1) = 1 == 1 10000000000000000000000000000000
    5. vec($_, 1, 1) = 1 == 2 01000000000000000000000000000000
    6. vec($_, 2, 1) = 1 == 4 00100000000000000000000000000000
    7. vec($_, 3, 1) = 1 == 8 00010000000000000000000000000000
    8. vec($_, 4, 1) = 1 == 16 00001000000000000000000000000000
    9. vec($_, 5, 1) = 1 == 32 00000100000000000000000000000000
    10. vec($_, 6, 1) = 1 == 64 00000010000000000000000000000000
    11. vec($_, 7, 1) = 1 == 128 00000001000000000000000000000000
    12. vec($_, 8, 1) = 1 == 256 00000000100000000000000000000000
    13. vec($_, 9, 1) = 1 == 512 00000000010000000000000000000000
    14. vec($_,10, 1) = 1 == 1024 00000000001000000000000000000000
    15. vec($_,11, 1) = 1 == 2048 00000000000100000000000000000000
    16. vec($_,12, 1) = 1 == 4096 00000000000010000000000000000000
    17. vec($_,13, 1) = 1 == 8192 00000000000001000000000000000000
    18. vec($_,14, 1) = 1 == 16384 00000000000000100000000000000000
    19. vec($_,15, 1) = 1 == 32768 00000000000000010000000000000000
    20. vec($_,16, 1) = 1 == 65536 00000000000000001000000000000000
    21. vec($_,17, 1) = 1 == 131072 00000000000000000100000000000000
    22. vec($_,18, 1) = 1 == 262144 00000000000000000010000000000000
    23. vec($_,19, 1) = 1 == 524288 00000000000000000001000000000000
    24. vec($_,20, 1) = 1 == 1048576 00000000000000000000100000000000
    25. vec($_,21, 1) = 1 == 2097152 00000000000000000000010000000000
    26. vec($_,22, 1) = 1 == 4194304 00000000000000000000001000000000
    27. vec($_,23, 1) = 1 == 8388608 00000000000000000000000100000000
    28. vec($_,24, 1) = 1 == 16777216 00000000000000000000000010000000
    29. vec($_,25, 1) = 1 == 33554432 00000000000000000000000001000000
    30. vec($_,26, 1) = 1 == 67108864 00000000000000000000000000100000
    31. vec($_,27, 1) = 1 == 134217728 00000000000000000000000000010000
    32. vec($_,28, 1) = 1 == 268435456 00000000000000000000000000001000
    33. vec($_,29, 1) = 1 == 536870912 00000000000000000000000000000100
    34. vec($_,30, 1) = 1 == 1073741824 00000000000000000000000000000010
    35. vec($_,31, 1) = 1 == 2147483648 00000000000000000000000000000001
    36. vec($_, 0, 2) = 1 == 1 10000000000000000000000000000000
    37. vec($_, 1, 2) = 1 == 4 00100000000000000000000000000000
    38. vec($_, 2, 2) = 1 == 16 00001000000000000000000000000000
    39. vec($_, 3, 2) = 1 == 64 00000010000000000000000000000000
    40. vec($_, 4, 2) = 1 == 256 00000000100000000000000000000000
    41. vec($_, 5, 2) = 1 == 1024 00000000001000000000000000000000
    42. vec($_, 6, 2) = 1 == 4096 00000000000010000000000000000000
    43. vec($_, 7, 2) = 1 == 16384 00000000000000100000000000000000
    44. vec($_, 8, 2) = 1 == 65536 00000000000000001000000000000000
    45. vec($_, 9, 2) = 1 == 262144 00000000000000000010000000000000
    46. vec($_,10, 2) = 1 == 1048576 00000000000000000000100000000000
    47. vec($_,11, 2) = 1 == 4194304 00000000000000000000001000000000
    48. vec($_,12, 2) = 1 == 16777216 00000000000000000000000010000000
    49. vec($_,13, 2) = 1 == 67108864 00000000000000000000000000100000
    50. vec($_,14, 2) = 1 == 268435456 00000000000000000000000000001000
    51. vec($_,15, 2) = 1 == 1073741824 00000000000000000000000000000010
    52. vec($_, 0, 2) = 2 == 2 01000000000000000000000000000000
    53. vec($_, 1, 2) = 2 == 8 00010000000000000000000000000000
    54. vec($_, 2, 2) = 2 == 32 00000100000000000000000000000000
    55. vec($_, 3, 2) = 2 == 128 00000001000000000000000000000000
    56. vec($_, 4, 2) = 2 == 512 00000000010000000000000000000000
    57. vec($_, 5, 2) = 2 == 2048 00000000000100000000000000000000
    58. vec($_, 6, 2) = 2 == 8192 00000000000001000000000000000000
    59. vec($_, 7, 2) = 2 == 32768 00000000000000010000000000000000
    60. vec($_, 8, 2) = 2 == 131072 00000000000000000100000000000000
    61. vec($_, 9, 2) = 2 == 524288 00000000000000000001000000000000
    62. vec($_,10, 2) = 2 == 2097152 00000000000000000000010000000000
    63. vec($_,11, 2) = 2 == 8388608 00000000000000000000000100000000
    64. vec($_,12, 2) = 2 == 33554432 00000000000000000000000001000000
    65. vec($_,13, 2) = 2 == 134217728 00000000000000000000000000010000
    66. vec($_,14, 2) = 2 == 536870912 00000000000000000000000000000100
    67. vec($_,15, 2) = 2 == 2147483648 00000000000000000000000000000001
    68. vec($_, 0, 4) = 1 == 1 10000000000000000000000000000000
    69. vec($_, 1, 4) = 1 == 16 00001000000000000000000000000000
    70. vec($_, 2, 4) = 1 == 256 00000000100000000000000000000000
    71. vec($_, 3, 4) = 1 == 4096 00000000000010000000000000000000
    72. vec($_, 4, 4) = 1 == 65536 00000000000000001000000000000000
    73. vec($_, 5, 4) = 1 == 1048576 00000000000000000000100000000000
    74. vec($_, 6, 4) = 1 == 16777216 00000000000000000000000010000000
    75. vec($_, 7, 4) = 1 == 268435456 00000000000000000000000000001000
    76. vec($_, 0, 4) = 2 == 2 01000000000000000000000000000000
    77. vec($_, 1, 4) = 2 == 32 00000100000000000000000000000000
    78. vec($_, 2, 4) = 2 == 512 00000000010000000000000000000000
    79. vec($_, 3, 4) = 2 == 8192 00000000000001000000000000000000
    80. vec($_, 4, 4) = 2 == 131072 00000000000000000100000000000000
    81. vec($_, 5, 4) = 2 == 2097152 00000000000000000000010000000000
    82. vec($_, 6, 4) = 2 == 33554432 00000000000000000000000001000000
    83. vec($_, 7, 4) = 2 == 536870912 00000000000000000000000000000100
    84. vec($_, 0, 4) = 4 == 4 00100000000000000000000000000000
    85. vec($_, 1, 4) = 4 == 64 00000010000000000000000000000000
    86. vec($_, 2, 4) = 4 == 1024 00000000001000000000000000000000
    87. vec($_, 3, 4) = 4 == 16384 00000000000000100000000000000000
    88. vec($_, 4, 4) = 4 == 262144 00000000000000000010000000000000
    89. vec($_, 5, 4) = 4 == 4194304 00000000000000000000001000000000
    90. vec($_, 6, 4) = 4 == 67108864 00000000000000000000000000100000
    91. vec($_, 7, 4) = 4 == 1073741824 00000000000000000000000000000010
    92. vec($_, 0, 4) = 8 == 8 00010000000000000000000000000000
    93. vec($_, 1, 4) = 8 == 128 00000001000000000000000000000000
    94. vec($_, 2, 4) = 8 == 2048 00000000000100000000000000000000
    95. vec($_, 3, 4) = 8 == 32768 00000000000000010000000000000000
    96. vec($_, 4, 4) = 8 == 524288 00000000000000000001000000000000
    97. vec($_, 5, 4) = 8 == 8388608 00000000000000000000000100000000
    98. vec($_, 6, 4) = 8 == 134217728 00000000000000000000000000010000
    99. vec($_, 7, 4) = 8 == 2147483648 00000000000000000000000000000001
    100. vec($_, 0, 8) = 1 == 1 10000000000000000000000000000000
    101. vec($_, 1, 8) = 1 == 256 00000000100000000000000000000000
    102. vec($_, 2, 8) = 1 == 65536 00000000000000001000000000000000
    103. vec($_, 3, 8) = 1 == 16777216 00000000000000000000000010000000
    104. vec($_, 0, 8) = 2 == 2 01000000000000000000000000000000
    105. vec($_, 1, 8) = 2 == 512 00000000010000000000000000000000
    106. vec($_, 2, 8) = 2 == 131072 00000000000000000100000000000000
    107. vec($_, 3, 8) = 2 == 33554432 00000000000000000000000001000000
    108. vec($_, 0, 8) = 4 == 4 00100000000000000000000000000000
    109. vec($_, 1, 8) = 4 == 1024 00000000001000000000000000000000
    110. vec($_, 2, 8) = 4 == 262144 00000000000000000010000000000000
    111. vec($_, 3, 8) = 4 == 67108864 00000000000000000000000000100000
    112. vec($_, 0, 8) = 8 == 8 00010000000000000000000000000000
    113. vec($_, 1, 8) = 8 == 2048 00000000000100000000000000000000
    114. vec($_, 2, 8) = 8 == 524288 00000000000000000001000000000000
    115. vec($_, 3, 8) = 8 == 134217728 00000000000000000000000000010000
    116. vec($_, 0, 8) = 16 == 16 00001000000000000000000000000000
    117. vec($_, 1, 8) = 16 == 4096 00000000000010000000000000000000
    118. vec($_, 2, 8) = 16 == 1048576 00000000000000000000100000000000
    119. vec($_, 3, 8) = 16 == 268435456 00000000000000000000000000001000
    120. vec($_, 0, 8) = 32 == 32 00000100000000000000000000000000
    121. vec($_, 1, 8) = 32 == 8192 00000000000001000000000000000000
    122. vec($_, 2, 8) = 32 == 2097152 00000000000000000000010000000000
    123. vec($_, 3, 8) = 32 == 536870912 00000000000000000000000000000100
    124. vec($_, 0, 8) = 64 == 64 00000010000000000000000000000000
    125. vec($_, 1, 8) = 64 == 16384 00000000000000100000000000000000
    126. vec($_, 2, 8) = 64 == 4194304 00000000000000000000001000000000
    127. vec($_, 3, 8) = 64 == 1073741824 00000000000000000000000000000010
    128. vec($_, 0, 8) = 128 == 128 00000001000000000000000000000000
    129. vec($_, 1, 8) = 128 == 32768 00000000000000010000000000000000
    130. vec($_, 2, 8) = 128 == 8388608 00000000000000000000000100000000
    131. vec($_, 3, 8) = 128 == 2147483648 00000000000000000000000000000001
  • wait

    Behaves like wait(2) on your system: it waits for a child process to terminate and returns the pid of the deceased process, or -1 if there are no child processes. The status is returned in $? and ${^CHILD_ERROR_NATIVE}. Note that a return value of -1 could mean that child processes are being automatically reaped, as described in perlipc.

    If you use wait in your handler for $SIG{CHLD}, it may accidentally wait for the child created by qx() or system(). See perlipc for details.

    Portability issues: wait in perlport.

  • waitpid PID,FLAGS

    Waits for a particular child process to terminate and returns the pid of the deceased process, or -1 if there is no such child process. On some systems, a value of 0 indicates that there are processes still running. The status is returned in $? and ${^CHILD_ERROR_NATIVE}. If you say

    use POSIX ":sys_wait_h";
    #...
    do {
        $kid = waitpid(-1, WNOHANG);
    } while $kid > 0;

    then you can do a non-blocking wait for all pending zombie processes. Non-blocking wait is available on machines supporting either the waitpid(2) or wait4(2) syscalls. However, waiting for a particular pid with FLAGS of 0 is implemented everywhere. (Perl emulates the system call by remembering the status values of processes that have exited but have not been harvested by the Perl script yet.)

    Note that on some systems, a return value of -1 could mean that child processes are being automatically reaped. See perlipc for details, and for other examples.
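
    The blocking form, with FLAGS of 0, might be used as in the sketch below. It assumes a platform with a working fork, and the exit status 3 is an arbitrary value chosen for illustration.

```perl
use strict;
use warnings;

defined(my $pid = fork) or die "fork failed: $!";

if ($pid == 0) {
    # Child: exit immediately with an arbitrary, illustrative status.
    exit 3;
}

# Parent: block until this particular child has terminated.
my $reaped    = waitpid($pid, 0);
my $exit_code = $? >> 8;    # the high byte of $? holds the exit status
print "reaped pid $reaped, exit code $exit_code\n";
```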

    Portability issues: waitpid in perlport.

  • wantarray

    Returns true if the context of the currently executing subroutine or eval is looking for a list value. Returns false if the context is looking for a scalar. Returns the undefined value if the context is looking for no value (void context).

    return unless defined wantarray; # don't bother doing more
    my @a = complex_calculation();
    return wantarray ? @a : "@a";

    wantarray()'s result is unspecified in the top level of a file, in a BEGIN, UNITCHECK, CHECK, INIT or END block, or in a DESTROY method.
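
    The three possible results can be seen side by side in a small sketch. The context() helper below is hypothetical and exists only to record how it was called.

```perl
use strict;
use warnings;

my @seen;

# context() is a hypothetical helper that reports its calling context.
sub context {
    my $ctx = !defined(wantarray) ? 'void'
            : wantarray           ? 'list'
            :                       'scalar';
    push @seen, $ctx;
    return $ctx;
}

my @list   = context();    # list context
my $scalar = context();    # scalar context
context();                 # void context

print "@seen\n";
```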

    This function should have been named wantlist() instead.

  • warn LIST

    Prints the value of LIST to STDERR. If the last element of LIST does not end in a newline, it appends the same file/line number text as die does.

    If the output is empty and $@ already contains a value (typically from a previous eval), that value is used after appending "\t...caught" to $@. This keeps the behavior of warn almost, but not entirely, similar to that of die.

    If $@ is empty then the string "Warning: Something's wrong" is used.

    No message is printed if there is a $SIG{__WARN__} handler installed. It is the handler's responsibility to deal with the message as it sees fit (like, for instance, converting it into a die). Most handlers must therefore arrange to actually display the warnings that they are not prepared to deal with, by calling warn again in the handler. Note that this is quite safe and will not produce an endless loop, since __WARN__ hooks are not called from inside one.

    You will find this behavior is slightly different from that of $SIG{__DIE__} handlers (which don't suppress the error text, but can instead call die again to change it).
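
    A handler that records warnings instead of printing them could look like the sketch below. The @captured array and the /fatal/ filter are illustrative assumptions; the point is that calling warn inside the handler re-issues the message without recursing.

```perl
use strict;
use warnings;

my @captured;
{
    # Install the handler only for the duration of this block.
    local $SIG{__WARN__} = sub {
        my ($message) = @_;
        push @captured, $message;                 # record every warning
        warn $message if $message =~ /fatal/i;    # re-issue selected ones
    };

    warn "disk almost full\n";    # captured as given
    warn "cache miss";            # captured with file and line appended
}

print "captured ", scalar(@captured), " warning(s)\n";
print $captured[0];
```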

    Using a __WARN__ handler provides a powerful way to silence all warnings (even the so-called mandatory ones). An example:

    # wipe out *all* compile-time warnings
    BEGIN { $SIG{'__WARN__'} = sub { warn $_[0] if $DOWARN } }
    my $foo = 10;
    my $foo = 20;          # no warning about duplicate my $foo,
                           # but hey, you asked for it!
    # no compile-time or run-time warnings before here
    $DOWARN = 1;

    # run-time warnings enabled after here
    warn "\$foo is alive and $foo!";     # does show up

    See perlvar for details on setting %SIG entries and for more examples. See the Carp module for other kinds of warnings using its carp() and cluck() functions.

  • write FILEHANDLE
  • write EXPR
  • write

    Writes a formatted record (possibly multi-line) to the specified FILEHANDLE, using the format associated with that file. By default the format for a file is the one having the same name as the filehandle, but the format for the current output channel (see the select function) may be set explicitly by assigning the name of the format to the $~ variable.

    Top of form processing is handled automatically: if there is insufficient room on the current page for the formatted record, the page is advanced by writing a form feed, and a special top-of-page format is used to format the new page header before the record is written. By default, the top-of-page format is the name of the filehandle with "_TOP" appended. This would be a problem with autovivified filehandles, but it may be dynamically set to the format of your choice by assigning the name to the $^ variable while that filehandle is selected. The number of lines remaining on the current page is in variable $-, which can be set to 0 to force a new page.

    If FILEHANDLE is unspecified, output goes to the current default output channel, which starts out as STDOUT but may be changed by the select operator. If the FILEHANDLE is an EXPR, then the expression is evaluated and the resulting string is used to look up the name of the FILEHANDLE at run time. For more on formats, see perlform.
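
    A minimal sketch of selecting a format by name and writing records with it follows. The ITEM format, the $name and $qty variables, and the use of an in-memory filehandle to collect the output are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Variables named in a format must be visible where the format is declared.
our ($name, $qty);

# ITEM is a hypothetical format: a left-justified 10-column field
# followed by a right-justified 5-column field.
format ITEM =
@<<<<<<<<< @>>>>
$name,     $qty
.

# Write to an in-memory filehandle so the result can be inspected.
open(my $fh, '>', \my $buffer) or die "can't open in-memory file: $!";

# $~ applies to the currently selected handle, so select $fh briefly.
my $previous = select($fh);
$~ = 'ITEM';
select($previous);

for my $row (['apple', 3], ['pear', 12]) {
    ($name, $qty) = @$row;
    write($fh);
}
close $fh;

print $buffer;
```

    Setting $~ explicitly avoids relying on the default format name, which is derived from the filehandle's name and is not useful for a lexical filehandle.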

    Note that write is not the opposite of read. Unfortunately.

  • y///

    The transliteration operator. Same as tr///. See Quote and Quote-like Operators in perlop.

Non-function Keywords by Cross-reference

perldata

perlmod

perlobj

perlop

  • and
  • cmp
  • eq
  • ge
  • gt
  • if
  • le
  • lt
  • ne
  • not
  • or
  • x
  • xor

    These operators are documented in perlop.

perlsub

perlsyn

  • default
  • given
  • when

    These flow-control keywords related to the experimental switch feature are documented in Switch Statements in perlsyn.

 
perlglossary

Perl 5 version 18.2 documentation

NAME

perlglossary - Perl Glossary

DESCRIPTION

A glossary of terms (technical and otherwise) used in the Perl documentation, derived from the Glossary of Programming Perl, Fourth Edition. Words or phrases in bold are defined elsewhere in this glossary.

Other useful sources include the Unicode Glossary http://unicode.org/glossary/, the Free On-Line Dictionary of Computing http://foldoc.org/, the Jargon File http://catb.org/~esr/jargon/, and Wikipedia http://www.wikipedia.org/.

A

  • accessor methods

    A method used to indirectly inspect or update an object’s state (its instance variables).

  • actual arguments

    The scalar values that you supply to a function or subroutine when you call it. For instance, when you call power("puff"), the string "puff" is the actual argument. See also argument and formal arguments.

  • address operator

    Some languages work directly with the memory addresses of values, but this can be like playing with fire. Perl provides a set of asbestos gloves for handling all memory management. The closest to an address operator in Perl is the backslash operator, but it gives you a hard reference, which is much safer than a memory address.

  • algorithm

    A well-defined sequence of steps, explained clearly enough that even a computer could do them.

  • alias

    A nickname for something, which behaves in all ways as though you’d used the original name instead of the nickname. Temporary aliases are implicitly created in the loop variable for foreach loops, in the $_ variable for map or grep operators, in $a and $b during sort’s comparison function, and in each element of @_ for the actual arguments of a subroutine call. Permanent aliases are explicitly created in packages by importing symbols or by assignment to typeglobs. Lexically scoped aliases for package variables are explicitly created by the our declaration.

  • alphabetic

    The sort of characters we put into words. In Unicode, this is all letters including all ideographs and certain diacritics, letter numbers like Roman numerals, and various combining marks.

  • alternatives

    A list of possible choices from which you may select only one, as in, “Would you like door A, B, or C?” Alternatives in regular expressions are separated with a single vertical bar: |. Alternatives in normal Perl expressions are separated with a double vertical bar: ||. Logical alternatives in Boolean expressions are separated with either || or or.

  • anonymous

    Used to describe a referent that is not directly accessible through a named variable. Such a referent must be indirectly accessible through at least one hard reference. When the last hard reference goes away, the anonymous referent is destroyed without pity.

  • application

    A bigger, fancier sort of program with a fancier name so people don’t realize they are using a program.

  • architecture

    The kind of computer you’re working on, where one “kind of computer” means all those computers sharing a compatible machine language. Since Perl programs are (typically) simple text files, not executable images, a Perl program is much less sensitive to the architecture it’s running on than programs in other languages, such as C, that are compiled into machine code. See also platform and operating system.

  • argument

    A piece of data supplied to a program, subroutine, function, or method to tell it what it’s supposed to do. Also called a “parameter”.

  • ARGV

    The name of the array containing the argument vector from the command line. If you use the empty <> operator, ARGV is the name of both the filehandle used to traverse the arguments and the scalar containing the name of the current input file.

  • arithmetical operator

    A symbol such as + or / that tells Perl to do the arithmetic you were supposed to learn in grade school.

  • array

    An ordered sequence of values, stored such that you can easily access any of the values using an integer subscript that specifies the value’s offset in the sequence.

  • array context

    An archaic expression for what is more correctly referred to as list context.

  • Artistic License

    The open source license that Larry Wall created for Perl, maximizing Perl’s usefulness, availability, and modifiability. The current version is 2. (http://www.opensource.org/licenses/artistic-license.php).

  • ASCII

    The American Standard Code for Information Interchange (a 7-bit character set adequate only for poorly representing English text). Often used loosely to describe the lowest 128 values of the various ISO-8859-X character sets, a bunch of mutually incompatible 8-bit codes best described as half ASCII. See also Unicode.

  • assertion

    A component of a regular expression that must be true for the pattern to match but does not necessarily match any characters itself. Often used specifically to mean a zero-width assertion.

  • assignment

    An operator whose assigned mission in life is to change the value of a variable.

  • assignment operator

    Either a regular assignment or a compound operator composed of an ordinary assignment and some other operator, that changes the value of a variable in place; that is, relative to its old value. For example, $a += 2 adds 2 to $a.

  • associative array

    See hash. Please. The term associative array is the old Perl 4 term for a hash. Some languages call it a dictionary.

  • associativity

    Determines whether you do the left operator first or the right operator first when you have “A operator B operator C”, and the two operators are of the same precedence. Operators like + are left associative, while operators like ** are right associative. See Camel chapter 3, “Unary and Binary Operators” for a list of operators and their associativity.

  • asynchronous

    Said of events or activities whose relative temporal ordering is indeterminate because too many things are going on at once. Hence, an asynchronous event is one you didn’t know when to expect.

  • atom

    A regular expression component potentially matching a substring containing one or more characters and treated as an indivisible syntactic unit by any following quantifier. (Contrast with an assertion that matches something of zero width and may not be quantified.)

  • atomic operation

    When Democritus gave the word “atom” to the indivisible bits of matter, he meant literally something that could not be cut: ἀ- (not) + -τομος (cuttable). An atomic operation is an action that can’t be interrupted, not one forbidden in a nuclear-free zone.

  • attribute

    A new feature that allows the declaration of variables and subroutines with modifiers, as in sub foo : locked method . Also another name for an instance variable of an object.

  • autogeneration

    A feature of operator overloading of objects, whereby the behavior of certain operators can be reasonably deduced using more fundamental operators. This assumes that the overloaded operators will often have the same relationships as the regular operators. See Camel chapter 13, “Overloading”.

  • autoincrement

    To add one to something automatically, hence the name of the ++ operator. To instead subtract one from something automatically is known as an “autodecrement”.

  • autoload

    To load on demand. (Also called “lazy” loading.) Specifically, to call an AUTOLOAD subroutine on behalf of an undefined subroutine.

  • autosplit

    To split a string automatically, as the -a switch does when running under -p or -n in order to emulate awk. (See also the AutoSplit module, which has nothing to do with the -a switch but a lot to do with autoloading.)

  • autovivification

    A Graeco-Roman word meaning “to bring oneself to life”. In Perl, storage locations (lvalues) spontaneously generate themselves as needed, including the creation of any hard reference values to point to the next level of storage. The assignment $a[5][5][5][5][5] = "quintet" potentially creates five scalar storage locations, plus four references (in the first four scalar locations) pointing to four new anonymous arrays (to hold the last four scalar locations). But the point of autovivification is that you don’t have to worry about it.

  • AV

    Short for “array value”, which refers to one of Perl’s internal data types that holds an array. The AV type is a subclass of SV.

  • awk

    Descriptive editing term—short for “awkward”. Also coincidentally refers to a venerable text-processing language from which Perl derived some of its high-level ideas.

B

  • backreference

    A substring captured by a subpattern within unadorned parentheses in a regex. Backslashed decimal numbers (\1, \2, etc.) later in the same pattern refer back to the corresponding subpattern in the current match. Outside the pattern, the numbered variables ($1, $2, etc.) continue to refer to these same values, as long as the pattern was the last successful match of the current dynamic scope.

  • backtracking

    The practice of saying, “If I had to do it all over, I’d do it differently,” and then actually going back and doing it all over differently. Mathematically speaking, it’s returning from an unsuccessful recursion on a tree of possibilities. Perl backtracks when it attempts to match patterns with a regular expression, and its earlier attempts don’t pan out. See the section “The Little Engine That /Could(n’t)?/” in Camel chapter 5, “Pattern Matching”.

  • backward compatibility

    Means you can still run your old program because we didn’t break any of the features or bugs it was relying on.

  • bareword

    A word sufficiently ambiguous to be deemed illegal under use strict 'subs'. In the absence of that stricture, a bareword is treated as if quotes were around it.

  • base class

    A generic object type; that is, a class from which other, more specific classes are derived genetically by inheritance. Also called a “superclass” by people who respect their ancestors.

  • big-endian

    From Swift: someone who eats eggs big end first. Also used of computers that store the most significant byte of a word at a lower byte address than the least significant byte. Often considered superior to little-endian machines. See also little-endian.

  • binary

    Having to do with numbers represented in base 2. That means there’s basically two numbers: 0 and 1. Also used to describe a file of “nontext”, presumably because such a file makes full use of all the binary bits in its bytes. With the advent of Unicode, this distinction, already suspect, loses even more of its meaning.

  • binary operator

    An operator that takes two operands.

  • bind

    To assign a specific network address to a socket.

  • bit

    An integer in the range from 0 to 1, inclusive. The smallest possible unit of information storage. An eighth of a byte or of a dollar. (The term “Pieces of Eight” comes from being able to split the old Spanish dollar into 8 bits, each of which still counted for money. That’s why a 25-cent piece today is still “two bits”.)

  • bit shift

    The movement of bits left or right in a computer word, which has the effect of multiplying or dividing by a power of 2.

  • bit string

    A sequence of bits that is actually being thought of as a sequence of bits, for once.

  • bless

    In corporate life, to grant official approval to a thing, as in, “The VP of Engineering has blessed our WebCruncher project.” Similarly, in Perl, to grant official approval to a referent so that it can function as an object, such as a WebCruncher object. See the bless function in Camel chapter 27, “Functions”.

  • block

    What a process does when it has to wait for something: “My process blocked waiting for the disk.” As an unrelated noun, it refers to a large chunk of data, of a size that the operating system likes to deal with (normally a power of 2 such as 512 or 8192). Typically refers to a chunk of data that’s coming from or going to a disk file.

  • BLOCK

    A syntactic construct consisting of a sequence of Perl statements that is delimited by braces. The if and while statements are defined in terms of BLOCKs, for instance. Sometimes we also say “block” to mean a lexical scope; that is, a sequence of statements that acts like a BLOCK, such as within an eval or a file, even though the statements aren’t delimited by braces.

  • block buffering

    A method of making input and output efficient by passing one block at a time. By default, Perl does block buffering to disk files. See buffer and command buffering.

  • Boolean

    A value that is either true or false.

  • Boolean context

    A special kind of scalar context used in conditionals to decide whether the scalar value returned by an expression is true or false. Does not evaluate as either a string or a number. See context.

  • breakpoint

    A spot in your program where you’ve told the debugger to stop execution so you can poke around and see whether anything is wrong yet.

  • broadcast

    To send a datagram to multiple destinations simultaneously.

  • BSD

    A psychoactive drug, popular in the ’80s, probably developed at UC Berkeley or thereabouts. Similar in many ways to the prescription-only medication called “System V”, but infinitely more useful. (Or, at least, more fun.) The full chemical name is “Berkeley Software Distribution”.

  • bucket

    A location in a hash table containing (potentially) multiple entries whose keys “hash” to the same hash value according to its hash function. (As internal policy, you don’t have to worry about it unless you’re into internals, or policy.)

  • buffer

    A temporary holding location for data. Block buffering means that the data is passed on to its destination whenever the buffer is full. Line buffering means that it’s passed on whenever a complete line is received. Command buffering means that it’s passed every time you do a print command (or equivalent). If your output is unbuffered, the system processes it one byte at a time without the use of a holding area. This can be rather inefficient.

  • built-in

    A function that is predefined in the language. Even when hidden by overriding, you can always get at a built-in function by qualifying its name with the CORE:: pseudopackage.

  • bundle

    A group of related modules on CPAN. (Also sometimes refers to a group of command-line switches grouped into one switch cluster.)

  • byte

    A piece of data worth eight bits in most places.

  • bytecode

    A pidgin-like lingo spoken among ’droids when they don’t wish to reveal their orientation (see endian). Named after some similar languages spoken (for similar reasons) between compilers and interpreters in the late 20ᵗʰ century. These languages are characterized by representing everything as a nonarchitecture-dependent sequence of bytes.
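
    Three of the entries in this section (backreference, bit shift, and bless) lend themselves to a short sketch. The class name WebCruncher is borrowed from the example in the bless entry. Its constructor, its name method, and the sample text are assumptions made for illustration.

```perl
use strict;
use warnings;

# backreference: \1 inside the pattern, $1 after a successful match
my $text = "this is is a test";
my $doubled;
if ($text =~ /\b(\w+)\s+\1\b/) {
    $doubled = $1;
}
print "doubled word: $doubled\n" if defined $doubled;

# bit shift: multiply or divide by a power of 2
my $times_eight = 1 << 3;     # 8
my $quarter     = 40 >> 2;    # 10
print "$times_eight $quarter\n";

# bless: turn a plain hash reference into an object of a class
package WebCruncher {
    sub new {
        my ($class, %args) = @_;
        my $self = { %args };
        return bless $self, $class;
    }
    sub name { return $_[0]{name} }
}

my $obj = WebCruncher->new(name => "demo");
print ref($obj), " ", $obj->name, "\n";
```

    The package NAME BLOCK form used here needs Perl 5.14 or later; on older perls, a plain package statement inside a bare block does the same job.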

C

  • C

    A language beloved by many for its inside-out type definitions, inscrutable precedence rules, and heavy overloading of the function-call mechanism. (Well, actually, people first switched to C because they found lowercase identifiers easier to read than upper.) Perl is written in C, so it’s not surprising that Perl borrowed a few ideas from it.

  • cache

    A data repository. Instead of computing an expensive answer several times, compute it once and save the result.

  • callback

    A handler that you register with some other part of your program in the hope that the other part of your program will trigger your handler when some event of interest transpires.

  • call by reference

    An argument-passing mechanism in which the formal arguments refer directly to the actual arguments, and the subroutine can change the actual arguments by changing the formal arguments. That is, the formal argument is an alias for the actual argument. See also call by value.

  • call by value

    An argument-passing mechanism in which the formal arguments refer to a copy of the actual arguments, and the subroutine cannot change the actual arguments by changing the formal arguments. See also call by reference.

  • canonical

    Reduced to a standard form to facilitate comparison.

  • capture variables

    The variables, such as $1 and $2, and %+ and %-, that hold the text remembered in a pattern match. See Camel chapter 5, “Pattern Matching”.

  • capturing

    The use of parentheses around a subpattern in a regular expression to store the matched substring as a backreference. (Captured strings are also returned as a list in list context.) See Camel chapter 5, “Pattern Matching”.

  • cargo cult

    Copying and pasting code without understanding it, while superstitiously believing in its value. This term originated from preindustrial cultures dealing with the detritus of explorers and colonizers of technologically advanced cultures. See The Gods Must Be Crazy.

  • case

    A property of certain characters. Originally, typesetters stored capital letters in the upper of two cases and small letters in the lower one. Unicode recognizes three cases: lowercase (character property \p{lower}), titlecase (\p{title}), and uppercase (\p{upper}). A fourth casemapping called foldcase is not itself a distinct case, but it is used internally to implement casefolding. Not all letters have case, and some nonletters have case.

  • casefolding

    Comparing or matching a string case-insensitively. In Perl, it is implemented with the /i pattern modifier, the fc function, and the \F double-quote translation escape.

  • casemapping

    The process of converting a string to one of the four Unicode casemaps; in Perl, it is implemented with the fc, lc, ucfirst, and uc functions.

  • character

    The smallest individual element of a string. Computers store characters as integers, but Perl lets you operate on them as text. The integer used to represent a particular character is called that character’s codepoint.

  • character class

    A square-bracketed list of characters used in a regular expression to indicate that any character of the set may occur at a given point. Loosely, any predefined set of characters so used.

  • character property

    A predefined character class matchable by the \p or \P metasymbol. Unicode defines hundreds of standard properties for every possible codepoint, and Perl defines a few of its own, too.

  • circumfix operator

    An operator that surrounds its operand, like the angle operator, or parentheses, or a hug.

  • class

    A user-defined type, implemented in Perl via a package that provides (either directly or by inheritance) methods (that is, subroutines) to handle instances of the class (its objects). See also inheritance.

  • class method

    A method whose invocant is a package name, not an object reference. A method associated with the class as a whole. Also see instance method.

  • client

    In networking, a process that initiates contact with a server process in order to exchange data and perhaps receive a service.

  • closure

    An anonymous subroutine that, when a reference to it is generated at runtime, keeps track of the identities of externally visible lexical variables, even after those lexical variables have supposedly gone out of scope. They’re called “closures” because this sort of behavior gives mathematicians a sense of closure.

  • cluster

    A parenthesized subpattern used to group parts of a regular expression into a single atom.

  • CODE

    The word returned by the ref function when you apply it to a reference to a subroutine. See also CV.

  • code generator

    A system that writes code for you in a low-level language, such as code to implement the backend of a compiler. See program generator.

  • codepoint

    The integer a computer uses to represent a given character. ASCII codepoints are in the range 0 to 127; Unicode codepoints are in the range 0 to 0x10_FFFF; and Perl codepoints are in the range 0 to 2³²−1 or 0 to 2⁶⁴−1, depending on your native integer size. In Perl Culture, sometimes called ordinals.

  • code subpattern

    A regular expression subpattern whose real purpose is to execute some Perl code—for example, the (?{...}) and (??{...}) subpatterns.

  • collating sequence

    The order into which characters sort. This is used by string comparison routines to decide, for example, where in this glossary to put “collating sequence”.

  • co-maintainer

    A person with permissions to index a namespace in PAUSE. Anyone can upload any namespace, but only primary and co-maintainers get their contributions indexed.

  • combining character

    Any character with the General Category of Combining Mark (\p{GC=M}), which may be spacing or nonspacing. Some are even invisible. A sequence of combining characters following a grapheme base character together make up a single user-visible character called a grapheme. Most but not all diacritics are combining characters, and vice versa.

  • command

    In shell programming, the syntactic combination of a program name and its arguments. More loosely, anything you type to a shell (a command interpreter) that starts it doing something. Even more loosely, a Perl statement, which might start with a label and typically ends with a semicolon.

  • command buffering

    A mechanism in Perl that lets you store up the output of each Perl command and then flush it out as a single request to the operating system. It’s enabled by setting the $| variable ($OUTPUT_AUTOFLUSH if you use the English module) to a true value. It’s used when you don’t want data sitting around, not going where it’s supposed to, which may happen because the default on a file or pipe is to use block buffering.

  • command-line arguments

    The values you supply along with a program name when you tell a shell to execute a command. These values are passed to a Perl program through @ARGV.

  • command name

    The name of the program currently executing, as typed on the command line. In C, the command name is passed to the program as the first command-line argument. In Perl, it comes in separately as $0.

  • comment

    A remark that doesn’t affect the meaning of the program. In Perl, a comment is introduced by a # character and continues to the end of the line.

  • compilation unit

    The file (or string, in the case of eval) that is currently being compiled.

  • compile

    The process of turning source code into a machine-usable form. See compile phase.

  • compile phase

    Any time before Perl starts running your main program. See also run phase. Compile phase is mostly spent in compile time, but may also be spent in runtime when BEGIN blocks, use or no declarations, or constant subexpressions are being evaluated. The startup and import code of any use declaration is also run during compile phase.

  • compiler

    Strictly speaking, a program that munches up another program and spits out yet another file containing the program in a “more executable” form, typically containing native machine instructions. The perl program is not a compiler by this definition, but it does contain a kind of compiler that takes a program and turns it into a more executable form (syntax trees) within the perl process itself, which the interpreter then interprets. There are, however, extension modules to get Perl to act more like a “real” compiler. See Camel chapter 16, “Compiling”.

  • compile time

    The time when Perl is trying to make sense of your code, as opposed to when it thinks it knows what your code means and is merely trying to do what it thinks your code says to do, which is runtime.

  • composer

    A “constructor” for a referent that isn’t really an object, like an anonymous array or a hash (or a sonata, for that matter). For example, a pair of braces acts as a composer for a hash, and a pair of brackets acts as a composer for an array. See the section “Creating References” in Camel chapter 8, “References”.

  • concatenation

    The process of gluing one cat’s nose to another cat’s tail. Also a similar operation on two strings.

  • conditional

    Something “iffy”. See Boolean context.

  • connection

    In telephony, the temporary electrical circuit between the caller’s and the callee’s phone. In networking, the same kind of temporary circuit between a client and a server.

  • construct

    As a noun, a piece of syntax made up of smaller pieces. As a transitive verb, to create an object using a constructor.

  • constructor

    Any class method, instance method, or subroutine that composes, initializes, blesses, and returns an object. Sometimes we use the term loosely to mean a composer.

  • context

    The surroundings or environment. The context given by the surrounding code determines what kind of data a particular expression is expected to return. The three primary contexts are list context, scalar context, and void context. Scalar context is sometimes subdivided into Boolean context, numeric context, string context, and void context. There’s also a “don’t care” context (which is dealt with in Camel chapter 2, “Bits and Pieces”, if you care).

  • continuation

    The treatment of more than one physical line as a single logical line. Makefile lines are continued by putting a backslash before the newline. Mail headers, as defined by RFC 822, are continued by putting a space or tab after the newline. In general, lines in Perl do not need any form of continuation mark, because whitespace (including newlines) is gleefully ignored. Usually.

  • core dump

    The corpse of a process, in the form of a file left in the working directory of the process, usually as a result of certain kinds of fatal errors.

  • CPAN

    The Comprehensive Perl Archive Network. (See the Camel Preface and Camel chapter 19, “CPAN” for details.)

  • C preprocessor

    The typical C compiler’s first pass, which processes lines beginning with # for conditional compilation and macro definition, and does various manipulations of the program text based on the current definitions. Also known as cpp(1).

  • cracker

    Someone who breaks security on computer systems. A cracker may be a true hacker or only a script kiddie.

  • currently selected output channel

    The last filehandle that was designated with select(FILEHANDLE); STDOUT, if no filehandle has been selected.

  • current package

    The package in which the current statement is compiled. Scan backward in the text of your program through the current lexical scope or any enclosing lexical scopes until you find a package declaration. That’s your current package name.

  • current working directory

    See working directory.

  • CV

    In academia, a curriculum vitæ, a fancy kind of résumé. In Perl, an internal “code value” typedef holding a subroutine. The CV type is a subclass of SV.
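
    The closure, call by reference, and context entries above describe behavior that is easier to see in code. What follows is a minimal sketch; the subroutine and variable names (make_counter, double_in_place, and so on) are hypothetical.

```perl
use strict;
use warnings;

# closure: the anonymous sub keeps $count alive after make_counter returns
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}
my $next = make_counter(5);
my @seen = ($next->(), $next->());    # 5, then 6

# call by reference: elements of @_ are aliases for the caller's variables
sub double_in_place {
    $_ *= 2 for @_;
    return;
}
my @numbers = (1, 2, 3);
double_in_place(@numbers);            # @numbers is now (2, 4, 6)

# context: the same array yields different values in scalar and list context
my $how_many = @numbers;              # scalar context: element count
my ($first)  = @numbers;              # list context: first element

print "@seen | @numbers | $how_many | $first\n";
```

    Copying @_ into a my list at the top of a subroutine gives the effect of call by value, which is why most Perl subroutines begin that way.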

D

  • dangling statement

    A bare, single statement, without any braces, hanging off an if or while conditional. C allows them. Perl doesn’t.

  • datagram

    A packet of data, such as a UDP message, that (from the viewpoint of the programs involved) can be sent independently over the network. (In fact, all packets are sent independently at the IP level, but stream protocols such as TCP hide this from your program.)

  • data structure

    How your various pieces of data relate to each other and what shape they make when you put them all together, as in a rectangular table or a triangular tree.

  • data type

    A set of possible values, together with all the operations that know how to deal with those values. For example, a numeric data type has a certain set of numbers that you can work with, as well as various mathematical operations that you can do on the numbers, but would make little sense on, say, a string such as "Kilroy". Strings have their own operations, such as concatenation. Compound types made of a number of smaller pieces generally have operations to compose and decompose them, and perhaps to rearrange them. Objects that model things in the real world often have operations that correspond to real activities. For instance, if you model an elevator, your elevator object might have an open_door method.

  • DBM

    Stands for “Database Management” routines, a set of routines that emulate an associative array using disk files. The routines use a dynamic hashing scheme to locate any entry with only two disk accesses. DBM files allow a Perl program to keep a persistent hash across multiple invocations. You can tie your hash variables to various DBM implementations.

  • declaration

    An assertion that states something exists and perhaps describes what it’s like, without giving any commitment as to how or where you’ll use it. A declaration is like the part of your recipe that says, “two cups flour, one large egg, four or five tadpoles…” See statement for its opposite. Note that some declarations also function as statements. Subroutine declarations also act as definitions if a body is supplied.

  • declarator

    Something that tells your program what sort of variable you’d like. Perl doesn’t require you to declare variables, but you can use my, our, or state to denote that you want something other than the default.

  • decrement

    To subtract a value from a variable, as in “decrement $x” (meaning to remove 1 from its value) or “decrement $x by 3”.

  • default

    A value chosen for you if you don’t supply a value of your own.

  • defined

    Having a meaning. Perl thinks that some of the things people try to do are devoid of meaning; in particular, making use of variables that have never been given a value and performing certain operations on data that isn’t there. For example, if you try to read data past the end of a file, Perl will hand you back an undefined value. See also false and the defined entry in Camel chapter 27, “Functions”.

  • delimiter

    A character or string that sets bounds to an arbitrarily sized textual object, not to be confused with a separator or terminator. “To delimit” really just means “to surround” or “to enclose” (like these parentheses are doing).

  • dereference

    A fancy computer science term meaning “to follow a reference to what it points to”. The “de” part of it refers to the fact that you’re taking away one level of indirection.

  • derived class

    A class that defines some of its methods in terms of a more generic class, called a base class. Note that classes aren’t classified exclusively into base classes or derived classes: a class can function as both a derived class and a base class simultaneously, which is kind of classy.

  • descriptor

    See file descriptor.

  • destroy

    To deallocate the memory of a referent (first triggering its DESTROY method, if it has one).

  • destructor

    A special method that is called when an object is thinking about destroying itself. A Perl program’s DESTROY method doesn’t do the actual destruction; Perl just triggers the method in case the class wants to do any associated cleanup.

  • device

    A whiz-bang hardware gizmo (like a disk or tape drive or a modem or a joystick or a mouse) attached to your computer, which the operating system tries to make look like a file (or a bunch of files). Under Unix, these fake files tend to live in the /dev directory.

  • directive

    A pod directive. See Camel chapter 23, “Plain Old Documentation”.

  • directory

    A special file that contains other files. Some operating systems call these “folders”, “drawers”, “catalogues”, or “catalogs”.

  • directory handle

    A name that represents a particular instance of opening a directory to read it, until you close it. See the opendir function.

  • discipline

    Some people need this and some people avoid it. For Perl, it’s an old way to say I/O layer.

  • dispatch

    To send something to its correct destination. Often used metaphorically to indicate a transfer of programmatic control to a destination selected algorithmically, often by lookup in a table of function references or, in the case of object methods, by traversing the inheritance tree looking for the most specific definition for the method.

  • distribution

    A standard, bundled release of a system of software. The default usage implies source code is included. If that is not the case, it will be called a “binary-only” distribution.

  • dual-lived

    Some modules live both in the Standard Library and on CPAN. These modules might be developed on two tracks as people modify either version. The trend currently is to untangle these situations.

  • dweomer

    An enchantment, illusion, phantasm, or jugglery. Said when Perl’s magical dwimmer effects don’t do what you expect, but rather seem to be the product of arcane dweomercraft, sorcery, or wonder working. [From Middle English.]

  • dwimmer

    DWIM is an acronym for “Do What I Mean”, the principle that something should just do what you want it to do without an undue amount of fuss. A bit of code that does “dwimming” is a “dwimmer”. Dwimming can require a great deal of behind-the-scenes magic, which (if it doesn’t stay properly behind the scenes) is called a dweomer instead.

  • dynamic scoping

    Dynamic scoping works over a dynamic scope, making variables visible throughout the rest of the block in which they are first used and in any subroutines that are called by the rest of the block. Dynamically scoped variables can have their values temporarily changed (and implicitly restored later) by a local operator. (Compare lexical scoping.) Used more loosely to mean how a subroutine that is in the middle of calling another subroutine “contains” that subroutine at runtime.
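
    A minimal sketch of three entries from this section: dynamic scoping with local, dispatch through a table of code references, and dereference. All names here ($level, %dispatch, and the subroutines) are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# dynamic scoping: local changes a package variable for the duration of a call
our $level = "outer";
sub current_level { return $level }
sub with_inner {
    local $level = "inner";
    return current_level();           # the callee sees "inner"
}
my @levels = (with_inner(), current_level());

# dispatch: pick a subroutine by looking it up in a table of code references
my %dispatch = (
    add => sub { return $_[0] + $_[1] },
    mul => sub { return $_[0] * $_[1] },
);
my $sum     = $dispatch{add}->(2, 3);
my $product = $dispatch{mul}->(2, 3);

# dereference: follow a reference to the thing it points to
my $aref = [10, 20, 30];
my @copy = @$aref;
my $last = $aref->[-1];

print "@levels | $sum $product | @copy | $last\n";
```

    Had $level been declared with my instead of our, current_level would be unaffected by anything done in with_inner, which is the contrast that the lexical scoping entry draws.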

E

  • eclectic

    Derived from many sources. Some would say too many.

  • element

    A basic building block. When you’re talking about an array, it’s one of the items that make up the array.

  • embedding

    When something is contained in something else, particularly when that might be considered surprising: “I’ve embedded a complete Perl interpreter in my editor!”

  • empty subclass test

    The notion that an empty derived class should behave exactly like its base class.

  • encapsulation

    The veil of abstraction separating the interface from the implementation (whether enforced or not), which mandates that all access to an object’s state be through methods alone.

  • endian

    See little-endian and big-endian.

  • en passant

    When you change a value as it is being copied. [From French “in passing”, as in the exotic pawn-capturing maneuver in chess.]

  • environment

    The collective set of environment variables your process inherits from its parent. Accessed via %ENV.

  • environment variable

    A mechanism by which some high-level agent such as a user can pass its preferences down to its future offspring (child processes, grandchild processes, great-grandchild processes, and so on). Each environment variable is a key/value pair, like one entry in a hash.

  • EOF

    End of File. Sometimes used metaphorically as the terminating string of a here document.

  • errno

    The error number returned by a syscall when it fails. Perl refers to the error by the name $! (or $OS_ERROR if you use the English module).

  • error

    See exception or fatal error.

  • escape sequence

    See metasymbol.

  • exception

    A fancy term for an error. See fatal error.

  • exception handling

    The way a program responds to an error. The exception-handling mechanism in Perl is the eval operator.

  • exec

    To throw away the current process’s program and replace it with another, without exiting the process or relinquishing any resources held (apart from the old memory image).

  • executable file

    A file that is specially marked to tell the operating system that it’s okay to run this file as a program. Usually shortened to “executable”.

  • execute

    To run a program or subroutine. (Has nothing to do with the kill built-in, unless you’re trying to run a signal handler.)

  • execute bit

    The special mark that tells the operating system it can run this program. There are actually three execute bits under Unix, and which bit gets used depends on whether you own the file singularly, collectively, or not at all.

  • exit status

    See status.

  • exploit

    Used as a noun in this case, this refers to a known way to compromise a program to get it to do something the author didn’t intend. Your task is to write unexploitable programs.

  • export

    To make symbols from a module available for import by other modules.

  • expression

    Anything you can legally say in a spot where a value is required. Typically composed of literals, variables, operators, functions, and subroutine calls, not necessarily in that order.

  • extension

    A Perl module that also pulls in compiled C or C++ code. More generally, any experimental option that can be compiled into Perl, such as multithreading.
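
    The exception handling and en passant entries above can be sketched briefly. The error text and variable names are assumptions. The block form of eval is used with a trailing true value so that success and failure can be told apart.

```perl
use strict;
use warnings;

# exception handling: eval traps the die and leaves the message in $@
my $ok = eval {
    die "disk full\n";
    1;                                # reached only if nothing died
};
my $error = $ok ? "" : $@;

# en passant: modify a value in the same statement that copies it
my $original = "hello world";
(my $shouted = $original) =~ tr/a-z/A-Z/;

print "caught: $error";
print "$original -> $shouted\n";
```

    Testing the return value of eval rather than $@ alone is a common precaution, since $@ can be reset by code that runs while the stack unwinds.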

F

  • false

    In Perl, any value that would look like "" or "0" if evaluated in a string context. Since undefined values evaluate to "", all undefined values are false, but not all false values are undefined.

  • FAQ

    Frequently Asked Question (although not necessarily frequently answered, especially if the answer appears in the Perl FAQ shipped standard with Perl).

  • fatal error

    An uncaught exception, which causes termination of the process after printing a message on your standard error stream. Errors that happen inside an eval are not fatal. Instead, the eval terminates after placing the exception message in the $@ ($EVAL_ERROR) variable. You can try to provoke a fatal error with the die operator (known as throwing or raising an exception), but this may be caught by a dynamically enclosing eval. If not caught, the die becomes a fatal error.

  • feeping creaturism

    A spoonerism of “creeping featurism”, noting the biological urge to add just one more feature to a program.

  • field

    A single piece of numeric or string data that is part of a longer string, record, or line. Variable-width fields are usually split up by separators (so use split to extract the fields), while fixed-width fields are usually at fixed positions (so use unpack). Instance variables are also known as “fields”.

  • FIFO

    First In, First Out. See also LIFO. Also a nickname for a named pipe.

  • file

    A named collection of data, usually stored on disk in a directory in a filesystem. Roughly like a document, if you’re into office metaphors. In modern filesystems, you can actually give a file more than one name. Some files have special properties, like directories and devices.

  • file descriptor

    The little number the operating system uses to keep track of which opened file you’re talking about. Perl hides the file descriptor inside a standard I/O stream and then attaches the stream to a filehandle.

  • fileglob

    A “wildcard” match on filenames. See the glob function.

  • filehandle

    An identifier (not necessarily related to the real name of a file) that represents a particular instance of opening a file, until you close it. If you’re going to open and close several different files in succession, it’s fine to open each of them with the same filehandle, so you don’t have to write out separate code to process each file.

  • filename

    One name for a file. This name is listed in a directory. You can use it in an open to tell the operating system exactly which file you want to open, and associate the file with a filehandle, which will carry the subsequent identity of that file in your program, until you close it.

  • filesystem

    A set of directories and files residing on a partition of the disk. Sometimes known as a “partition”. You can change the file’s name or even move a file around from directory to directory within a filesystem without actually moving the file itself, at least under Unix.

  • file test operator

    A built-in unary operator that you use to determine whether something is true about a file, such as -o $filename to test whether you’re the owner of the file.

  • filter

    A program designed to take a stream of input and transform it into a stream of output.

  • first-come

    The first PAUSE author to upload a namespace automatically becomes the primary maintainer for that namespace. The “first come” permissions distinguish a primary maintainer who was assigned that role from one who received it automatically.

  • flag

    We tend to avoid this term because it means so many things. It may mean a command-line switch that takes no argument itself (such as Perl’s -n and -p flags) or, less frequently, a single-bit indicator (such as the O_CREAT and O_EXCL flags used in sysopen). Sometimes informally used to refer to certain regex modifiers.

  • floating point

    A method of storing numbers in “scientific notation”, such that the precision of the number is independent of its magnitude (the decimal point “floats”). Perl does its numeric work with floating-point numbers (sometimes called “floats”) when it can’t get away with using integers. Floating-point numbers are mere approximations of real numbers.

  • flush

    The act of emptying a buffer, often before it’s full.

  • FMTEYEWTK

    Far More Than Everything You Ever Wanted To Know. An exhaustive treatise on one narrow topic, something of a super-FAQ. See Tom for far more.

  • foldcase

    The casemap used in Unicode when comparing or matching without regard to case. Comparing the lower-, title-, or uppercase forms of two strings is unreliable due to Unicode’s complex, one-to-many case mappings. Foldcase is a lowercase variant (using a partially decomposed normalization form for certain codepoints) created specifically to resolve this.

  • fork

    To create a child process identical to the parent process at its moment of conception, at least until it gets ideas of its own. A thread with protected memory.

  • formal arguments

    The generic names by which a subroutine knows its arguments. In many languages, formal arguments are always given individual names; in Perl, the formal arguments are just the elements of an array. The formal arguments to a Perl program are $ARGV[0], $ARGV[1], and so on. Similarly, the formal arguments to a Perl subroutine are $_[0], $_[1], and so on. You may give the arguments individual names by assigning the values to a my list. See also actual arguments.
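
    A small sketch of both styles: naming the arguments with a my list, and working on the elements of @_ directly. The subroutine names are made up for this example.

```perl
use strict;
use warnings;

# Give the formal arguments individual names by copying them into a my list.
sub area {
    my ($width, $height) = @_;
    return $width * $height;
}

# The elements of @_ are aliases for the actual arguments,
# so modifying $_[0] changes the caller's variable.
sub double_in_place {
    $_[0] *= 2;
    return;
}

my $area = area(3, 4);
my $n    = 21;
double_in_place($n);

print "area=$area n=$n\n";
```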

  • format

    A specification of how many spaces and digits and things to put somewhere so that whatever you’re printing comes out nice and pretty.

  • freely available

    Means you don’t have to pay money to get it, but the copyright on it may still belong to someone else (like Larry).

  • freely redistributable

    Means you’re not in legal trouble if you give a bootleg copy of it to your friends and we find out about it. In fact, we’d rather you gave a copy to all your friends.

  • freeware

    Historically, any software that you give away, particularly if you make the source code available as well. Now often called open source software. Recently there has been a trend to use the term in contradistinction to open source software, to refer only to free software released under the Free Software Foundation’s GPL (General Public License), but this is difficult to justify etymologically.

  • function

    Mathematically, a mapping of each of a set of input values to a particular output value. In computers, refers to a subroutine or operator that returns a value. It may or may not have input values (called arguments).

  • funny character

    Someone like Larry, or one of his peculiar friends. Also refers to the strange prefixes that Perl requires as noun markers on its variables.

G

  • garbage collection

    A misnamed feature—it should be called, “expecting your mother to pick up after you”. Strictly speaking, Perl doesn’t do this, but it relies on a reference-counting mechanism to keep things tidy. However, we rarely speak strictly and will often refer to the reference-counting scheme as a form of garbage collection. (If it’s any comfort, when your interpreter exits, a “real” garbage collector runs to make sure everything is cleaned up if you’ve been messy with circular references and such.)
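
    The reference-counting behavior can be sketched as follows. Tracked is a hypothetical class invented for this example, and weaken comes from the core Scalar::Util module; it is one common way to keep a circular reference from holding its objects alive.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my @log;

{
    package Tracked;   # hypothetical class that records its own destruction
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { my ($self) = @_; push @log, "destroyed $self->{name}" }
}

{
    my $temp = Tracked->new("temp");
}   # the last reference goes away here, so DESTROY runs immediately
my $after_scope = scalar @log;

{
    my $parent = Tracked->new("parent");
    my $child  = Tracked->new("child");
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken($child->{parent});   # the back-reference no longer counts
}
my $after_cycle = scalar @log;

print "$_\n" for @log;
```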

  • GID

    Group ID—in Unix, the numeric group ID that the operating system uses to identify you and members of your group.

  • glob

    Strictly, the shell’s * character, which will match a “glob” of characters when you’re trying to generate a list of filenames. Loosely, the act of using globs and similar symbols to do pattern matching. See also fileglob and typeglob.

  • global

    Something you can see from anywhere, usually used of variables and subroutines that are visible everywhere in your program. In Perl, only certain special variables are truly global—most variables (and all subroutines) exist only in the current package. Global variables can be declared with our. See “Global Declarations” in Camel chapter 4, “Statements and Declarations”.

  • global destruction

    The garbage collection of globals (and the running of any associated object destructors) that takes place when a Perl interpreter is being shut down. Global destruction should not be confused with the Apocalypse, except perhaps when it should.

  • glue language

    A language such as Perl that is good at hooking things together that weren’t intended to be hooked together.

  • granularity

    The size of the pieces you’re dealing with, mentally speaking.

  • grapheme

    A graphene is an allotrope of carbon arranged in a hexagonal crystal lattice one atom thick. A grapheme, or more fully, a grapheme cluster string is a single user-visible character, which may in turn be several characters (codepoints) long. For example, a carriage return plus a line feed is a single grapheme but two characters, while a “ȫ” is a single grapheme but one, two, or even three characters, depending on normalization.

  • greedy

    A subpattern whose quantifier wants to match as many things as possible.
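
    A quick comparison of a greedy quantifier with its minimal counterpart. The sample string is arbitrary.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ grabs as much as it can while still letting the final > match.
my ($greedy) = $html =~ /(<.+>)/;

# Minimal: .+? stops at the first > it can.
my ($minimal) = $html =~ /(<.+?>)/;

print "greedy:  $greedy\n";
print "minimal: $minimal\n";
```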

  • grep

    Originally from the old Unix editor command for “Globally search for a Regular Expression and Print it”, now used in the general sense of any kind of search, especially text searches. Perl has a built-in grep function that searches a list for elements matching any given criterion, whereas the grep(1) program searches for lines matching a regular expression in one or more files.
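
    Perl's built-in grep accepts any criterion, not just a pattern, as this sketch shows. The lists are invented for the example.

```perl
use strict;
use warnings;

my @numbers = (1 .. 10);
my @animals = qw(camel llama eel alpaca);

# Any expression can serve as the criterion.
my @even = grep { $_ % 2 == 0 } @numbers;

# A regular expression works too, much like grep(1).
my @with_el = grep { /el/ } @animals;

# In scalar context, grep returns the number of matching elements.
my $long_names = grep { length($_) > 4 } @animals;

print "even: @even\n";
print "with 'el': @with_el\n";
print "long names: $long_names\n";
```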

  • group

    A set of users of which you are a member. In some operating systems (like Unix), you can give certain file access permissions to other members of your group.

  • GV

    An internal “glob value” typedef, holding a typeglob. The GV type is a subclass of SV.

H

  • hacker

    Someone who is brilliantly persistent in solving technical problems, whether these involve golfing, fighting orcs, or programming. Hacker is a neutral term, morally speaking. Good hackers are not to be confused with evil crackers or clueless script kiddies. If you confuse them, we will presume that you are either evil or clueless.

  • handler

    A subroutine or method that Perl calls when your program needs to respond to some internal event, such as a signal, or an encounter with an operator subject to operator overloading. See also callback.

  • hard reference

    A scalar value containing the actual address of a referent, such that the referent’s reference count accounts for it. (Some hard references are held internally, such as the implicit reference from one of a typeglob’s variable slots to its corresponding referent.) A hard reference is different from a symbolic reference.

  • hash

    An unordered association of key/value pairs, stored such that you can easily use a string key to look up its associated data value. This glossary is like a hash, where the word to be defined is the key and the definition is the value. A hash is also sometimes septisyllabically called an “associative array”, which is a pretty good reason for simply calling it a “hash” instead.

  • hash table

    A data structure used internally by Perl for implementing associative arrays (hashes) efficiently. See also bucket.

  • header file

    A file containing certain required definitions that you must include “ahead” of the rest of your program to do certain obscure operations. A C header file has a .h extension. Perl doesn’t really have header files, though historically Perl has sometimes used translated .h files with a .ph extension. See require in Camel chapter 27, “Functions”. (Header files have been superseded by the module mechanism.)

  • here document

    So called because of a similar construct in shells that pretends that the lines following the command are a separate file to be fed to the command, up to some terminating string. In Perl, however, it’s just a fancy form of quoting.
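
    As a form of quoting, a here document follows the same interpolation rules as the quotes around its terminator. A minimal sketch, with END_TEXT as an arbitrary terminating string:

```perl
use strict;
use warnings;

my $name = "Camel";

# Double-quoted terminator: variables are interpolated.
my $interpolated = <<"END_TEXT";
Hello, $name!
END_TEXT

# Single-quoted terminator: the text is taken literally.
my $literal = <<'END_TEXT';
Hello, $name!
END_TEXT

print $interpolated, $literal;
```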

  • hexadecimal

    A number in base 16, “hex” for short. The digits for 10 through 15 are customarily represented by the letters a through f. Hexadecimal constants in Perl start with 0x. See also the hex function in Camel chapter 27, “Functions”.
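
    A brief sketch of going between hexadecimal text and numbers, using the hex and sprintf built-ins:

```perl
use strict;
use warnings;

my $literal  = 0xff;                 # hexadecimal constant in source code
my $parsed   = hex("ff");            # string of hex digits to number
my $prefixed = hex("0xFF");          # a leading 0x is accepted as well
my $text     = sprintf("%x", 255);   # number back to hex digits

print "$literal $parsed $prefixed $text\n";
```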

  • home directory

    The directory you are put into when you log in. On a Unix system, the name is often placed into $ENV{HOME} or $ENV{LOGDIR} by login, but you can also find it with (getpwuid($<))[7]. (Some platforms do not have a concept of a home directory.)

  • host

    The computer on which a program or other data resides.

  • hubris

    Excessive pride, the sort of thing for which Zeus zaps you. Also the quality that makes you write (and maintain) programs that other people won’t want to say bad things about. Hence, the third great virtue of a programmer. See also laziness and impatience.

  • HV

    Short for a “hash value” typedef, which holds Perl’s internal representation of a hash. The HV type is a subclass of SV.

I

  • identifier

    A legally formed name for most anything in which a computer program might be interested. Many languages (including Perl) allow identifiers to start with an alphabetic character, and then contain alphabetics and digits. Perl also allows connector punctuation like the underscore character wherever it allows alphabetics. (Perl also has more complicated names, like qualified names.)

  • impatience

    The anger you feel when the computer is being lazy. This makes you write programs that don’t just react to your needs, but actually anticipate them. Or at least that pretend to. Hence, the second great virtue of a programmer. See also laziness and hubris.

  • implementation

    How a piece of code actually goes about doing its job. Users of the code should not count on implementation details staying the same unless they are part of the published interface.

  • import

    To gain access to symbols that are exported from another module. See use in Camel chapter 27, “Functions”.

  • increment

    To increase the value of something by 1 (or by some other number, if so specified).

  • indexing

    In olden days, the act of looking up a key in an actual index (such as a phone book). But now it’s merely the act of using any kind of key or position to find the corresponding value, even if no index is involved. Things have degenerated to the point that Perl’s index function merely locates the position (index) of one string in another.

  • indirect filehandle

    An expression that evaluates to something that can be used as a filehandle: a string (filehandle name), a typeglob, a typeglob reference, or a low-level IO object.

  • indirection

    If something in a program isn’t the value you’re looking for but indicates where the value is, that’s indirection. This can be done with either symbolic references or hard.

  • indirect object

    In English grammar, a short noun phrase between a verb and its direct object indicating the beneficiary or recipient of the action. In Perl, print STDOUT "$foo\n"; can be understood as “verb indirect-object object”, where STDOUT is the recipient of the print action, and "$foo" is the object being printed. Similarly, when invoking a method, you might place the invocant in the dative slot between the method and its arguments:

        $gollum = new Pathetic::Creature "Sméagol";
        give $gollum "Fisssssh!";
        give $gollum "Precious!";

  • indirect object slot

    The syntactic position falling between a method call and its arguments when using the indirect object invocation syntax. (The slot is distinguished by the absence of a comma between it and the next argument.) STDERR is in the indirect object slot here:

        print STDERR "Awake! Awake! Fear, Fire, Foes! Awake!\n";

  • infix

    An operator that comes in between its operands, such as multiplication in 24 * 7.

  • inheritance

    What you get from your ancestors, genetically or otherwise. If you happen to be a class, your ancestors are called base classes and your descendants are called derived classes. See single inheritance and multiple inheritance.

  • instance

    Short for “an instance of a class”, meaning an object of that class.

  • instance data

    See instance variable.

  • instance method

    A method of an object, as opposed to a class method.

    A method whose invocant is an object, not a package name. Every object of a class shares all the methods of that class, so an instance method applies to all instances of the class, rather than applying to a particular instance. Also see class method.

  • instance variable

    An attribute of an object; data stored with the particular object rather than with the class as a whole.

  • integer

    A number with no fractional (decimal) part. A counting number, like 1, 2, 3, and so on, but including 0 and the negatives.

  • interface

    The services a piece of code promises to provide forever, in contrast to its implementation, which it should feel free to change whenever it likes.

  • interpolation

    The insertion of a scalar or list value somewhere in the middle of another value, such that it appears to have been there all along. In Perl, variable interpolation happens in double-quoted strings and patterns, and list interpolation occurs when constructing the list of values to pass to a list operator or other such construct that takes a LIST.
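
    Both kinds of interpolation in one small sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $name  = "Camel";
my @humps = (1, 2);

# Variable interpolation in double-quoted strings.
my $scalar_interp = "Hello, $name";
my $array_interp  = "humps: @humps";   # elements are joined with a space by default

# Single quotes suppress interpolation.
my $not_interp = 'Hello, $name';

# List interpolation: @humps is flattened into the surrounding list.
my @more = (0, @humps, 3);

print "$scalar_interp | $array_interp | $not_interp | @more\n";
```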

  • interpreter

    Strictly speaking, a program that reads a second program and does what the second program says directly without turning the program into a different form first, which is what compilers do. Perl is not an interpreter by this definition, because it contains a kind of compiler that takes a program and turns it into a more executable form (syntax trees) within the perl process itself, which the Perl runtime system then interprets.

  • invocant

    The agent on whose behalf a method is invoked. In a class method, the invocant is a package name. In an instance method, the invocant is an object reference.

  • invocation

    The act of calling up a deity, daemon, program, method, subroutine, or function to get it to do what you think it’s supposed to do. We usually “call” subroutines but “invoke” methods, since it sounds cooler.

  • I/O

    Input from, or output to, a file or device.

  • IO

    An internal I/O object. Can also mean indirect object.

  • I/O layer

    One of the filters between the data and what you get as input or what you end up with as output.

  • IPA

    India Pale Ale. Also the International Phonetic Alphabet, the standard alphabet used for phonetic notation worldwide. Draws heavily on Unicode, including many combining characters.

  • IP

    Internet Protocol, or Intellectual Property.

  • IPC

    Interprocess Communication.

  • is-a

    A relationship between two objects in which one object is considered to be a more specific version of the other, generic object: “A camel is a mammal.” Since the generic object really only exists in a Platonic sense, we usually add a little abstraction to the notion of objects and think of the relationship as being between a generic base class and a specific derived class. Oddly enough, Platonic classes don’t always have Platonic relationships—see inheritance.

  • iteration

    Doing something repeatedly.

  • iterator

    A special programming gizmo that keeps track of where you are in something that you’re trying to iterate over. The foreach loop in Perl contains an iterator; so does a hash, allowing you to each through it.
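
    A sketch of both iterators mentioned above: the one hidden inside a hash, driven by each, and the one inside a foreach loop. The hash contents are made up.

```perl
use strict;
use warnings;

my %legs = (camel => 4, llama => 4, ostrich => 2);

# each advances the hash's internal iterator and returns one pair per call.
my $total = 0;
while (my ($animal, $count) = each %legs) {
    $total += $count;
}

# foreach keeps its own iterator over the list it was given.
my @seen;
foreach my $animal (sort keys %legs) {
    push @seen, $animal;
}

print "total legs: $total; animals: @seen\n";
```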

  • IV

    The integer four, not to be confused with six, Tom’s favorite editor. IV also means an internal Integer Value of the type a scalar can hold, not to be confused with an NV.

J

  • JAPH

    “Just Another Perl Hacker”, a clever but cryptic bit of Perl code that, when executed, evaluates to that string. Often used to illustrate a particular Perl feature, and something of an ongoing Obfuscated Perl Contest seen in USENET signatures.

K

  • key

    The string index to a hash, used to look up the value associated with that key.

  • keyword

    See reserved words.

L

  • label

    A name you give to a statement so that you can talk about that statement elsewhere in the program.

  • laziness

    The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful, and then document what you wrote so you don’t have to answer so many questions about it. Hence, the first great virtue of a programmer. Also hence, this book. See also impatience and hubris.

  • leftmost longest

    The preference of the regular expression engine to match the leftmost occurrence of a pattern, then given a position at which a match will occur, the preference for the longest match (presuming the use of a greedy quantifier). See Camel chapter 5, “Pattern Matching” for much more on this subject.

  • left shift

    A bit shift that multiplies the number by some power of 2.

  • lexeme

    Fancy term for a token.

  • lexer

    Fancy term for a tokener.

  • lexical analysis

    Fancy term for tokenizing.

  • lexical scoping

    Looking at your Oxford English Dictionary through a microscope. (Also known as static scoping, because dictionaries don’t change very fast.) Similarly, looking at variables stored in a private dictionary (namespace) for each scope, which are visible only from their point of declaration down to the end of the lexical scope in which they are declared. —Syn. static scoping. —Ant. dynamic scoping.

  • lexical variable

    A variable subject to lexical scoping, declared by my. Often just called a “lexical”. (The our declaration declares a lexically scoped name for a global variable, which is not itself a lexical variable.)

  • library

    Generally, a collection of procedures. In ancient days, referred to a collection of subroutines in a .pl file. In modern times, refers more often to the entire collection of Perl modules on your system.

  • LIFO

    Last In, First Out. See also FIFO. A LIFO is usually called a stack.

  • line

    In Unix, a sequence of zero or more nonnewline characters terminated with a newline character. On non-Unix machines, this is emulated by the C library even if the underlying operating system has different ideas.

  • linebreak

    A grapheme consisting of either a carriage return followed by a line feed or any character with the Unicode Vertical Space character property.

  • line buffering

    Used by a standard I/O output stream that flushes its buffer after every newline. Many standard I/O libraries automatically set up line buffering on output that is going to the terminal.

  • line number

    The number of lines read previous to this one, plus 1. Perl keeps a separate line number for each source or input file it opens. The current source file’s line number is represented by __LINE__. The current input line number (for the file that was most recently read via <FH>) is represented by the $. ($INPUT_LINE_NUMBER) variable. Many error messages report both values, if available.
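
    The two kinds of line number can be seen side by side in this sketch, which reads from an in-memory filehandle so that it needs no external file.

```perl
use strict;
use warnings;

my $text = "first\nsecond\nthird\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my @numbered;
while (my $line = <$fh>) {
    chomp $line;
    push @numbered, "$.: $line";   # $. is the current input line number
}
close $fh;

my $source_line = __LINE__;        # line number within this source file

print "$_\n" for @numbered;
print "this statement's neighbor is on source line $source_line\n";
```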

  • link

    Used as a noun, a name in a directory that represents a file. A given file can have multiple links to it. It’s like having the same phone number listed in the phone directory under different names. As a verb, to resolve a partially compiled file’s unresolved symbols into a (nearly) executable image. Linking can generally be static or dynamic, which has nothing to do with static or dynamic scoping.

  • LIST

    A syntactic construct representing a comma-separated list of expressions, evaluated to produce a list value. Each expression in a LIST is evaluated in list context and interpolated into the list value.

  • list

    An ordered set of scalar values.

  • list context

    The situation in which an expression is expected by its surroundings (the code calling it) to return a list of values rather than a single value. Functions that want a LIST of arguments tell those arguments that they should produce a list value. See also context.
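
    The same expression can behave differently depending on the context its surroundings supply. A brief sketch, where context_name is a helper invented for this example:

```perl
use strict;
use warnings;

my @animals = qw(camel llama alpaca);

my @copy  = @animals;   # list context: the elements
my $count = @animals;   # scalar context: the number of elements

# A pattern match with /g returns all matches in list context.
my @numbers = "a1b22c333" =~ /(\d+)/g;

# Assigning to an empty list forces list context; the outer scalar
# assignment then yields the number of values produced.
my $how_many = () = "a1b22c333" =~ /\d+/g;

# wantarray lets a subroutine find out which context it was called in.
sub context_name { return wantarray ? "list" : "scalar" }
my ($in_list) = context_name();
my $in_scalar = context_name();

print "@copy | $count | @numbers | $how_many | $in_list | $in_scalar\n";
```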

  • list operator

    An operator that does something with a list of values, such as join or grep. Usually used for named built-in operators (such as print, unlink, and system) that do not require parentheses around their argument list.

  • list value

    An unnamed list of temporary scalar values that may be passed around within a program from any list-generating function to any function or construct that provides a list context.

  • literal

    A token in a programming language, such as a number or string, that gives you an actual value instead of merely representing possible values as a variable does.

  • little-endian

    From Swift: someone who eats eggs little end first. Also used of computers that store the least significant byte of a word at a lower byte address than the most significant byte. Often considered superior to big-endian machines. See also big-endian.
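
    Byte order can be made visible with pack, which offers explicit little-endian ("V") and big-endian ("N") 32-bit formats regardless of the machine you run on. The value 0x01020304 is arbitrary.

```perl
use strict;
use warnings;

my $value = 0x01020304;

# "V" packs a 32-bit unsigned integer least significant byte first.
my $little = unpack("H*", pack("V", $value));

# "N" packs the same integer most significant byte first.
my $big = unpack("H*", pack("N", $value));

print "little-endian bytes: $little\n";
print "big-endian bytes:    $big\n";
```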

  • local

    Not meaning the same thing everywhere. A global variable in Perl can be localized inside a dynamic scope via the local operator.
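
    A sketch contrasting local with my. The subroutine names are hypothetical; the point is that show() sees a localized value but never sees another subroutine's lexical.

```perl
use strict;
use warnings;

our $level = "global";

sub show { return $level }   # always reads the package variable

sub with_local {
    local $level = "localized";   # temporary value for this dynamic scope
    return show();
}

sub with_my {
    my $level = "lexical";        # a new, private variable; show() cannot see it
    return show();
}

my $from_local = with_local();
my $from_my    = with_my();
my $afterwards = $level;          # the old value is restored on scope exit

print "$from_local $from_my $afterwards\n";
```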

  • logical operator

    Symbols representing the concepts “and”, “or”, “xor”, and “not”.

  • lookahead

    An assertion that peeks at the string to the right of the current match location.

  • lookbehind

    An assertion that peeks at the string to the left of the current match location.
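
    Lookahead and lookbehind both check their surroundings without consuming them. A small sketch on an invented string:

```perl
use strict;
use warnings;

my $text = 'price: $42 or 17 dollars';

# Lookahead: digits only if " dollars" follows, without including it in the match.
my ($before_word) = $text =~ /(\d+)(?= dollars)/;

# Lookbehind: digits only if a dollar sign precedes them.
my ($after_sign) = $text =~ /(?<=\$)(\d+)/;

print "lookahead found $before_word, lookbehind found $after_sign\n";
```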

  • loop

    A construct that performs something repeatedly, like a roller coaster.

  • loop control statement

    Any statement within the body of a loop that can make a loop prematurely stop looping or skip an iteration. Generally, you shouldn’t try this on roller coasters.

  • loop label

    A kind of key or name attached to a loop (or roller coaster) so that loop control statements can talk about which loop they want to control.
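
    Here is a sketch of loop control statements using a loop label to reach past the inner loop. ROW is an arbitrary label name.

```perl
use strict;
use warnings;

my @pairs;

ROW: for my $row (1 .. 3) {
    for my $col (1 .. 3) {
        next ROW if $col > $row;   # skip the rest of this row
        last ROW if $row == 3;     # leave both loops entirely
        push @pairs, "$row,$col";
    }
}

print "@pairs\n";
```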

  • lowercase

    In Unicode, not just characters with the General Category of Lowercase Letter, but any character with the Lowercase property, including Modifier Letters, Letter Numbers, some Other Symbols, and one Combining Mark.

  • lvaluable

    Able to serve as an lvalue.

  • lvalue

    Term used by language lawyers for a storage location you can assign a new value to, such as a variable or an element of an array. The “l” is short for “left”, as in the left side of an assignment, a typical place for lvalues. An lvaluable function or expression is one to which a value may be assigned, as in pos($x) = 10.
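
    A few lvaluable expressions in action; the strings are arbitrary.

```perl
use strict;
use warnings;

# substr is lvaluable: assigning to it replaces that part of the string.
my $greeting = "hello world";
substr($greeting, 0, 5) = "HELLO";

# pos is lvaluable: assigning to it moves where the next /g match starts.
my $text = "aaa bbb ccc";
pos($text) = 4;
my $word = "";
if ($text =~ /\G(\w+)/g) {
    $word = $1;
}

# A list of variables can be an lvalue too.
my ($first, $second) = (10, 20);
($first, $second) = ($second, $first);

print "$greeting | $word | $first $second\n";
```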

  • lvalue modifier

    An adjectival pseudofunction that warps the meaning of an lvalue in some declarative fashion. Currently there are three lvalue modifiers: my, our, and local.

M

  • magic

    Technically speaking, any extra semantics attached to a variable such as $!, $0, %ENV, or %SIG, or to any tied variable. Magical things happen when you diddle those variables.

  • magical increment

    An increment operator that knows how to bump up ASCII alphabetics as well as numbers.
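
    A sketch of the magic at work. It applies only to variables that hold a string of ASCII letters optionally followed by digits and that have been used only as strings.

```perl
use strict;
use warnings;

my @before = ("az", "Zz", "a9", "zz");
my @after;

for my $string (@before) {
    my $copy = $string;
    $copy++;                 # carries across letters and digits, preserving case
    push @after, $copy;
}

print "$before[$_] -> $after[$_]\n" for 0 .. $#before;
```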

  • magical variables

    Special variables that have side effects when you access them or assign to them. For example, in Perl, changing elements of the %ENV hash also changes the corresponding environment variables that subprocesses will use. Reading the $! variable gives you the current system error number or message.

  • Makefile

    A file that controls the compilation of a program. Perl programs don’t usually need a Makefile because the Perl compiler has plenty of self-control.

  • man

    The Unix program that displays online documentation (manual pages) for you.

  • manpage

    A “page” from the manuals, typically accessed via the man(1) command. A manpage contains a SYNOPSIS, a DESCRIPTION, a list of BUGS, and so on, and is typically longer than a page. There are manpages documenting commands, syscalls, library functions, devices, protocols, files, and such. In this book, we call any piece of standard Perl documentation (like perlop or perldelta) a manpage, no matter what format it’s installed in on your system.

  • matching

    See pattern matching.

  • member data

    See instance variable.

  • memory

    This always means your main memory, not your disk. Clouding the issue is the fact that your machine may implement virtual memory; that is, it will pretend that it has more memory than it really does, and it’ll use disk space to hold inactive bits. This can make it seem like you have a little more memory than you really do, but it’s not a substitute for real memory. The best thing that can be said about virtual memory is that it lets your performance degrade gradually rather than suddenly when you run out of real memory. But your program can die when you run out of virtual memory, too—if you haven’t thrashed your disk to death first.

  • metacharacter

    A character that is not supposed to be treated normally. Which characters are to be treated specially as metacharacters varies greatly from context to context. Your shell will have certain metacharacters, double-quoted Perl strings have other metacharacters, and regular expression patterns have all the double-quote metacharacters plus some extra ones of their own.

  • metasymbol

    Something we’d call a metacharacter except that it’s a sequence of more than one character. Generally, the first character in the sequence must be a true metacharacter to get the other characters in the metasymbol to misbehave along with it.

  • method

    A kind of action that an object can take if you tell it to. See Camel chapter 12, “Objects”.

  • method resolution order

    The path Perl takes through @ISA. By default, this is a double depth-first search, once looking for defined methods and once for AUTOLOAD. However, Perl lets you configure this with mro.
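
    Method lookup follows each class's @ISA array, and the core mro module can report the resulting order. The sketch below builds a hypothetical diamond-shaped hierarchy (the class names are invented) and compares the default depth-first order with the alternative C3 order.

```perl
use strict;
use warnings;
use mro;

# A hypothetical diamond: Cama inherits from Camel and Llama,
# both of which inherit from Animal.
{ package Animal; sub speak { return "generic noise" } }
{ package Camel;  our @ISA = ('Animal'); }
{ package Llama;  our @ISA = ('Animal'); }
{ package Cama;   our @ISA = ('Camel', 'Llama'); }

# Default order: depth first, left to right.
my @dfs = @{ mro::get_linear_isa('Cama') };

# C3 order: a class never appears before all of its subclasses have.
my @c3 = @{ mro::get_linear_isa('Cama', 'c3') };

print "dfs: @dfs\n";
print "c3:  @c3\n";
```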

  • minicpan

    A CPAN mirror that includes just the latest versions for each distribution, probably created with CPAN::Mini. See Camel chapter 19, “CPAN”.

  • minimalism

    The belief that “small is beautiful”. Paradoxically, if you say something in a small language, it turns out big, and if you say it in a big language, it turns out small. Go figure.

  • mode

    In the context of the stat(2) syscall, refers to the field holding the permission bits and the type of the file.

  • modifier

    See statement modifier, regular expression, and lvalue, not necessarily in that order.

  • module

    A file that defines a package of (almost) the same name, which can either export symbols or function as an object class. (A module’s main .pm file may also load in other files in support of the module.) See the use built-in.

  • modulus

    An integer divisor when you’re interested in the remainder instead of the quotient.
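
    A sketch of Perl's % operator. Note that with a negative operand the result takes the sign of the right operand, which differs from some other languages.

```perl
use strict;
use warnings;

my $remainder = 17 % 5;    # 17 = 3 * 5 + 2
my $quotient  = int(17 / 5);

# With a negative left operand and positive modulus, the result is still non-negative.
my $negative_left = -7 % 3;

# With a negative modulus, the result is zero or negative.
my $negative_right = 7 % -3;

print "$quotient r $remainder; $negative_left; $negative_right\n";
```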

  • mojibake

    When you speak one language and the computer thinks you’re speaking another. You’ll see odd translations when you send UTF‑8, for instance, but the computer thinks you sent Latin-1, showing all sorts of weird characters instead. The term is written 「文字化け」in Japanese and means “character rot”, an apt description. Pronounced [modʑibake] in standard IPA phonetics, or approximately “moh-jee-bah-keh”.

  • monger

    Short for one member of Perl mongers, a purveyor of Perl.

  • mortal

    A temporary value scheduled to die when the current statement finishes.

  • mro

    See method resolution order.

  • multidimensional array

    An array with multiple subscripts for finding a single element. Perl implements these using references—see Camel chapter 9, “Data Structures”.
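
    A minimal sketch of a two-dimensional array built from array references; the numbers are arbitrary.

```perl
use strict;
use warnings;

# Each row is a reference to an anonymous array.
my @matrix = (
    [1, 2, 3],
    [4, 5, 6],
);

my $element  = $matrix[1][2];     # the arrow between subscripts is optional
my $explicit = $matrix[1]->[2];   # same element, arrow written out

my $rows = @matrix;               # number of rows
my $cols = @{ $matrix[0] };       # number of columns in the first row

print "element=$element rows=$rows cols=$cols\n";
```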

  • multiple inheritance

    The features you got from your mother and father, mixed together unpredictably. (See also inheritance and single inheritance.) In computer languages (including Perl), it is the notion that a given class may have multiple direct ancestors or base classes.

N

  • named pipe

    A pipe with a name embedded in the filesystem so that it can be accessed by two unrelated processes.

  • namespace

    A domain of names. You needn’t worry about whether the names in one such domain have been used in another. See package.

  • NaN

    Not a number. The value Perl uses for certain invalid or inexpressible floating-point operations.
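
    One way to produce a NaN, assuming a platform with IEEE 754 floating point (nearly all of them): overflow to infinity, then subtract infinity from itself. The defining oddity is that a NaN is not equal to anything, itself included.

```perl
use strict;
use warnings;

my $inf = 9**9**9;      # overflows to infinity
my $nan = $inf - $inf;  # infinity minus infinity is not a number

my $equals_itself  = ($nan == $nan) ? 1 : 0;
my $differs_itself = ($nan != $nan) ? 1 : 0;

print "nan == nan: $equals_itself, nan != nan: $differs_itself\n";
```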

  • network address

    The most important attribute of a socket, like your telephone’s telephone number. Typically an IP address. See also port.

  • newline

    A single character that represents the end of a line, with the ASCII value of 012 octal under Unix (but 015 on classic Mac OS), and represented by \n in Perl strings. For Windows machines writing text files, and for certain physical devices like terminals, the single newline gets automatically translated by your C library into a line feed and a carriage return, but normally, no translation is done.

  • NFS

    Network File System, which allows you to mount a remote filesystem as if it were local.

  • normalization

    Converting a text string into an alternate but equivalent canonical (or compatible) representation that can then be compared for equivalence. Unicode recognizes four different normalization forms: NFD, NFC, NFKD, and NFKC.
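
    A short sketch using the core Unicode::Normalize module. It compares the precomposed and decomposed spellings of "é".

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

my $composed   = "\x{E9}";     # LATIN SMALL LETTER E WITH ACUTE
my $decomposed = "e\x{301}";   # "e" followed by COMBINING ACUTE ACCENT

# The raw strings differ even though they look identical when displayed.
my $raw_equal = ($composed eq $decomposed) ? 1 : 0;

# After normalizing both to the same form, they compare equal.
my $nfc_equal = (NFC($composed) eq NFC($decomposed)) ? 1 : 0;

my $nfd_length = length(NFD($composed));   # decomposed form has two codepoints
my $nfc_length = length(NFC($decomposed)); # composed form has one

print "raw=$raw_equal nfc=$nfc_equal nfd_len=$nfd_length nfc_len=$nfc_length\n";
```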

  • null character

    A character with the numeric value of zero. It’s used by C to terminate strings, but Perl allows strings to contain a null.

  • null list

    A list value with zero elements, represented in Perl by ().

  • null string

    A string containing no characters, not to be confused with a string containing a null character, which has a positive length and is true.
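
    The distinction between a null string and a string holding a null character, as a quick sketch:

```perl
use strict;
use warnings;

my $null_string = "";
my $null_char   = "\0";

my $empty_length = length($null_string);   # no characters at all
my $char_length  = length($null_char);     # one character, with value zero

my $empty_truth = $null_string ? "true" : "false";
my $char_truth  = $null_char   ? "true" : "false";

print "null string: length $empty_length, $empty_truth\n";
print "null character: length $char_length, $char_truth\n";
```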

  • numeric context

    The situation in which an expression is expected by its surroundings (the code calling it) to return a number. See also context and string context.

  • numification

    (Sometimes spelled nummification and nummify.) Perl lingo for implicit conversion into a number; the related verb is numify. Numification is intended to rhyme with mummification, and numify with mummify. It is unrelated to English numen, numina, numinous. We originally forgot the extra m a long time ago, and some people got used to our funny spelling, and so just as with HTTP_REFERER’s own missing letter, our weird spelling has stuck around.

  • NV

    Short for Nevada, no part of which will ever be confused with civilization. NV also means an internal floating-point Numeric Value of the type a scalar can hold, not to be confused with an IV.

  • nybble

    Half a byte, equivalent to one hexadecimal digit, and worth four bits.

O

  • object

    An instance of a class. Something that “knows” what user-defined type (class) it is, and what it can do because of what class it is. Your program can request an object to do things, but the object gets to decide whether it wants to do them or not. Some objects are more accommodating than others.

  • octal

    A number in base 8. Only the digits 0 through 7 are allowed. Octal constants in Perl start with 0, as in 013. See also the oct function.
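
    A sketch of octal constants and the oct function. Note that oct also understands the 0x and 0b prefixes.

```perl
use strict;
use warnings;

my $literal = 013;                 # octal constant: 1 * 8 + 3
my $parsed  = oct("013");          # string of octal digits to number
my $as_hex  = oct("0x1f");         # oct also handles a hexadecimal prefix
my $as_bin  = oct("0b101");        # and a binary prefix
my $text    = sprintf("%o", 11);   # number back to octal digits

print "$literal $parsed $as_hex $as_bin $text\n";
```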

  • offset

    How many things you have to skip over when moving from the beginning of a string or array to a specific position within it. Thus, the minimum offset is zero, not one, because you don’t skip anything to get to the first item.

  • one-liner

    An entire computer program crammed into one line of text.

  • open source software

    Programs for which the source code is freely available and freely redistributable, with no commercial strings attached. For a more detailed definition, see http://www.opensource.org/osd.html.

  • operand

    An expression that yields a value that an operator operates on. See also precedence.

  • operating system

    A special program that runs on the bare machine and hides the gory details of managing processes and devices. Usually used in a looser sense to indicate a particular culture of programming. The loose sense can be used at varying levels of specificity. At one extreme, you might say that all versions of Unix and Unix-lookalikes are the same operating system (upsetting many people, especially lawyers and other advocates). At the other extreme, you could say this particular version of this particular vendor’s operating system is different from any other version of this or any other vendor’s operating system. Perl is much more portable across operating systems than many other languages. See also architecture and platform.

  • operator

    A gizmo that transforms some number of input values to some number of output values, often built into a language with a special syntax or symbol. A given operator may have specific expectations about what types of data you give as its arguments (operands) and what type of data you want back from it.

  • operator overloading

    A kind of overloading that you can do on built-in operators to make them work on objects as if the objects were ordinary scalar values, but with the actual semantics supplied by the object class. This is set up with the overload pragma—see Camel chapter 13, “Overloading”.
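
    A minimal sketch of the overload pragma. Vec2 is a hypothetical class invented for this example; it overloads addition and stringification.

```perl
use strict;
use warnings;

{
    package Vec2;   # hypothetical two-dimensional vector class
    use overload
        '+'  => \&add,
        '""' => \&to_string;

    sub new {
        my ($class, $x, $y) = @_;
        return bless { x => $x, y => $y }, $class;
    }

    # Called for $left + $right when either operand is a Vec2.
    sub add {
        my ($left, $right) = @_;
        return Vec2->new($left->{x} + $right->{x}, $left->{y} + $right->{y});
    }

    # Called whenever the object is used as a string.
    sub to_string {
        my ($self) = @_;
        return "($self->{x}, $self->{y})";
    }
}

my $sum  = Vec2->new(1, 2) + Vec2->new(3, 4);
my $text = "$sum";

print "$text\n";
```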

  • options

    See either switches or regular expression modifiers.

  • ordinal

    An abstract character’s integer value. Same thing as codepoint.

  • overloading

    Giving additional meanings to a symbol or construct. Actually, all languages do overloading to one extent or another, since people are good at figuring out things from context.

  • overriding

    Hiding or invalidating some other definition of the same name. (Not to be confused with overloading, which adds definitions that must be disambiguated some other way.) To confuse the issue further, we use the word with two overloaded definitions: to describe how you can define your own subroutine to hide a built-in function of the same name (see the section “Overriding Built-in Functions” in Camel chapter 11, “Modules”), and to describe how you can define a replacement method in a derived class to hide a base class’s method of the same name (see Camel chapter 12, “Objects”).

  • owner

    The one user (apart from the superuser) who has absolute control over a file. A file may also have a group of users who may exercise joint ownership if the real owner permits it. See permission bits.
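
As an illustration of the operator overloading entry above, here is a minimal sketch using the overload pragma. The Point2D class, its fields, and its method names are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

package Point2D;   # hypothetical class, for illustration only

# Map the built-in + operator and string conversion onto methods.
use overload
    '+'  => \&add,
    '""' => \&to_string;

sub new {
    my ($class, $x, $y) = @_;
    return bless { x => $x, y => $y }, $class;
}

# Called for $p + $q; returns a new object rather than changing either operand.
sub add {
    my ($self, $other) = @_;
    return Point2D->new($self->{x} + $other->{x}, $self->{y} + $other->{y});
}

# Called whenever the object is used as a string, e.g. when interpolated.
sub to_string {
    my ($self) = @_;
    return "($self->{x}, $self->{y})";
}

package main;

my $sum = Point2D->new(1, 2) + Point2D->new(3, 4);
print "$sum\n";   # prints "(4, 6)"
```

The objects behave like ordinary scalar values in the expression, while the class supplies the actual semantics.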

P

  • package

    A namespace for global variables, subroutines, and the like, such that they can be kept separate from like-named symbols in other namespaces. In a sense, only the package is global, since the symbols in the package’s symbol table are only accessible from code compiled outside the package by naming the package. But in another sense, all package symbols are also globals—they’re just well-organized globals.

  • pad

    Short for scratchpad.

  • parameter

    See argument.

  • parent class

    See base class.

  • parse tree

    See syntax tree.

  • parsing

    The subtle but sometimes brutal art of attempting to turn your possibly malformed program into a valid syntax tree.

  • patch

    To fix by applying one, as it were. In the realm of hackerdom, a listing of the differences between two versions of a program as might be applied by the patch(1) program when you want to fix a bug or upgrade your old version.

  • PATH

    The list of directories the system searches to find a program you want to execute. The list is stored as one of your environment variables, accessible in Perl as $ENV{PATH}.

  • pathname

    A fully qualified filename such as /usr/bin/perl. Sometimes confused with PATH.

  • pattern

    A template used in pattern matching.

  • pattern matching

    Taking a pattern, usually a regular expression, and trying the pattern various ways on a string to see whether there’s any way to make it fit. Often used to pick interesting tidbits out of a file.

  • PAUSE

    The Perl Authors Upload SErver (http://pause.perl.org), the gateway for modules on their way to CPAN.

  • Perl mongers

    A Perl user group, taking the form of its name from the New York Perl mongers, the first Perl user group. Find one near you at http://www.pm.org.

  • permission bits

    Bits that the owner of a file sets or unsets to allow or disallow access to other people. These flag bits are part of the mode word returned by the stat built-in when you ask about a file. On Unix systems, you can check the ls(1) manpage for more information.

  • Pern

    What you get when you do Perl++ twice. Doing it only once will curl your hair. You have to increment it eight times to shampoo your hair. Lather, rinse, iterate.

  • pipe

    A direct connection that carries the output of one process to the input of another without an intermediate temporary file. Once the pipe is set up, the two processes in question can read and write as if they were talking to a normal file, with some caveats.

  • pipeline

    A series of processes all in a row, linked by pipes, where each passes its output stream to the next.

  • platform

    The entire hardware and software context in which a program runs. A program written in a platform-dependent language might break if you change any of the following: machine, operating system, libraries, compiler, or system configuration. The perl interpreter has to be compiled differently for each platform because it is implemented in C, but programs written in the Perl language are largely platform independent.

  • pod

    The markup used to embed documentation into your Perl code. Pod stands for “Plain old documentation”. See Camel chapter 23, “Plain Old Documentation”.

  • pod command

    A sequence, such as =head1, that denotes the start of a pod section.

  • pointer

    A variable in a language like C that contains the exact memory location of some other item. Perl handles pointers internally so you don’t have to worry about them. Instead, you just use symbolic pointers in the form of keys and variable names, or hard references, which aren’t pointers (but act like pointers and do in fact contain pointers).

  • polymorphism

    The notion that you can tell an object to do something generic, and the object will interpret the command in different ways depending on its type. [< Greek πολυ- + μορφή, many forms.]

  • port

    The part of the address of a TCP or UDP socket that directs packets to the correct process after finding the right machine, something like the phone extension you give when you reach the company operator. Also the result of converting code to run on a different platform than originally intended, or the verb denoting this conversion.

  • portable

    Once upon a time, C code compilable under both BSD and SysV. In general, code that can be easily converted to run on another platform, where “easily” can be defined however you like, and usually is. Anything may be considered portable if you try hard enough, such as a mobile home or London Bridge.

  • porter

    Someone who “carries” software from one platform to another. Porting programs written in platform-dependent languages such as C can be difficult work, but porting programs like Perl is very much worth the agony.

  • possessive

    Said of quantifiers and groups in patterns that refuse to give up anything once they’ve gotten their mitts on it. Catchier and easier to say than the even more formal nonbacktrackable.

  • POSIX

    The Portable Operating System Interface specification.

  • postfix

    An operator that follows its operand, as in $x++.

  • pp

    An internal shorthand for a “push-pop” code; that is, C code implementing Perl’s stack machine.

  • pragma

    A standard module whose practical hints and suggestions are received (and possibly ignored) at compile time. Pragmas are named in all lowercase.

  • precedence

    The rules of conduct that, in the absence of other guidance, determine what should happen first. For example, in the absence of parentheses, you always do multiplication before addition.

  • prefix

    An operator that precedes its operand, as in ++$x.

  • preprocessing

    What some helper process did to transform the incoming data into a form more suitable for the current process. Often done with an incoming pipe. See also C preprocessor.

  • primary maintainer

    The author that PAUSE allows to assign co-maintainer permissions to a namespace. A primary maintainer can give up this distinction by assigning it to another PAUSE author. See Camel chapter 19, “CPAN”.

  • procedure

    A subroutine.

  • process

    An instance of a running program. Under multitasking systems like Unix, two or more separate processes could be running the same program independently at the same time—in fact, the fork function is designed to bring about this happy state of affairs. Under other operating systems, processes are sometimes called “threads”, “tasks”, or “jobs”, often with slight nuances in meaning.

  • program

    See script.

  • program generator

    A system that algorithmically writes code for you in a high-level language. See also code generator.

  • progressive matching

    Pattern matching that picks up where it left off before.

  • property

    See either instance variable or character property.

  • protocol

    In networking, an agreed-upon way of sending messages back and forth so that neither correspondent will get too confused.

  • prototype

    An optional part of a subroutine declaration telling the Perl compiler how many and what flavor of arguments may be passed as actual arguments, so you can write subroutine calls that parse much like built-in functions. (Or don’t parse, as the case may be.)

  • pseudofunction

    A construct that sometimes looks like a function but really isn’t. Usually reserved for lvalue modifiers like my, for context modifiers like scalar, and for the pick-your-own-quotes constructs, q//, qq//, qx//, qw//, qr//, m//, s///, y///, and tr///.

  • pseudohash

    Formerly, a reference to an array whose initial element happens to hold a reference to a hash. You used to be able to treat a pseudohash reference as either an array reference or a hash reference. Pseudohashes are no longer supported.

  • pseudoliteral

    An operator that looks something like a literal, such as the output-grabbing operator, `command`.

  • public domain

    Something not owned by anybody. Perl is copyrighted and is thus not in the public domain—it’s just freely available and freely redistributable.

  • pumpkin

    A notional “baton” handed around the Perl community indicating who is the lead integrator in some arena of development.

  • pumpking

    A pumpkin holder, the person in charge of pumping the pump, or at least priming it. Must be willing to play the part of the Great Pumpkin now and then.

  • PV

    A “pointer value”, which is Perl Internals Talk for a char*.
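
The progressive matching entry above can be sketched as a tiny tokenizer. This is only an illustration: the input string and the token labels are made up. Each match is anchored with \G at the position where the previous match ended, and the /c modifier keeps that position when an alternative fails.

```perl
use strict;
use warnings;

my $input = "x = 42 + y";   # sample text, chosen for this example
my @tokens;

while (1) {
    # Each pattern starts at pos($input), i.e. where the last match stopped.
    if    ($input =~ /\G\s+/gc)             { next }                      # skip whitespace
    elsif ($input =~ /\G(\d+)/gc)           { push @tokens, "NUM:$1" }
    elsif ($input =~ /\G([A-Za-z_]\w*)/gc)  { push @tokens, "ID:$1" }
    elsif ($input =~ /\G([=+])/gc)          { push @tokens, "OP:$1" }
    else                                    { last }                      # end of input or unknown character
}

print join(" ", @tokens), "\n";   # prints "ID:x OP:= NUM:42 OP:+ ID:y"
```

Without /c, the first failed alternative would reset the match position to the start of the string and the loop would never advance.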

Q

  • qualified

    Possessing a complete name. The symbol $Ent::moot is qualified; $moot is unqualified. A fully qualified filename is specified from the top-level directory.

  • quantifier

    A component of a regular expression specifying how many times the foregoing atom may occur.
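
As a small illustration of the quantifier entry above, the following sketch applies three common quantifiers to the same atom. The sample string and variable names are chosen only for this example.

```perl
use strict;
use warnings;

my $text = "aaaa";

my ($greedy)  = $text =~ /(a+)/;       # one or more, as many as possible
my ($minimal) = $text =~ /(a+?)/;      # one or more, as few as possible
my ($counted) = $text =~ /(a{2,3})/;   # at least two, at most three

print "$greedy $minimal $counted\n";   # prints "aaaa a aaa"
```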

R

  • race condition

    A race condition exists when the result of several interrelated events depends on the ordering of those events, but that order cannot be guaranteed due to nondeterministic timing effects. If two or more programs, or parts of the same program, try to go through the same series of events, one might interrupt the work of the other. This is a good way to find an exploit.

  • readable

    With respect to files, one that has the proper permission bit set to let you access the file. With respect to computer programs, one that’s written well enough that someone has a chance of figuring out what it’s trying to do.

  • reaping

    The last rites performed by a parent process on behalf of a deceased child process so that it doesn’t remain a zombie. See the wait and waitpid function calls.

  • record

    A set of related data values in a file or stream, often associated with a unique key field. In Unix, often commensurate with a line, or a blank-line–terminated set of lines (a “paragraph”). Each line of the /etc/passwd file is a record, keyed on login name, containing information about that user.

  • recursion

    The art of defining something (at least partly) in terms of itself, which is a naughty no-no in dictionaries but often works out okay in computer programs if you’re careful not to recurse forever (which is like an infinite loop with more spectacular failure modes).

  • reference

    Where you look to find a pointer to information somewhere else. (See indirection.) References come in two flavors: symbolic references and hard references.

  • referent

    Whatever a reference refers to, which may or may not have a name. Common types of referents include scalars, arrays, hashes, and subroutines.

  • regex

    See regular expression.

  • regular expression

    A single entity with various interpretations, like an elephant. To a computer scientist, it’s a grammar for a little language in which some strings are legal and others aren’t. To normal people, it’s a pattern you can use to find what you’re looking for when it varies from case to case. Perl’s regular expressions are far from regular in the theoretical sense, but in regular use they work quite well. Here’s a regular expression: /Oh s.*t./. This will match strings like “Oh say can you see by the dawn's early light” and “Oh sit!”. See Camel chapter 5, “Pattern Matching”.

  • regular expression modifier

    An option on a pattern or substitution, such as /i to render the pattern case-insensitive.

  • regular file

    A file that’s not a directory, a device, a named pipe or socket, or a symbolic link. Perl uses the -f file test operator to identify regular files. Sometimes called a “plain” file.

  • relational operator

    An operator that says whether a particular ordering relationship is true about a pair of operands. Perl has both numeric and string relational operators. See collating sequence.

  • reserved words

    A word with a specific, built-in meaning to a compiler, such as if or delete. In many languages (not Perl), it’s illegal to use reserved words to name anything else. (Which is why they’re reserved, after all.) In Perl, you just can’t use them to name labels or filehandles. Also called “keywords”.

  • return value

    The value produced by a subroutine or expression when evaluated. In Perl, a return value may be either a list or a scalar.

  • RFC

    Request For Comment, which despite the timid connotations is the name of a series of important standards documents.

  • right shift

    A bit shift that divides a number by some power of 2.

  • role

    A name for a concrete set of behaviors. A role is a way to add behavior to a class without inheritance.

  • root

    The superuser (UID == 0). Also the top-level directory of the filesystem.

  • RTFM

    What you are told when someone thinks you should Read The Fine Manual.

  • run phase

    Any time after Perl starts running your main program. See also compile phase. Run phase is mostly spent in runtime but may also be spent in compile time when require, do FILE, or eval STRING operators are executed, or when a substitution uses the /ee modifier.

  • runtime

    The time when Perl is actually doing what your code says to do, as opposed to the earlier period of time when it was trying to figure out whether what you said made any sense whatsoever, which is compile time.

  • runtime pattern

    A pattern that contains one or more variables to be interpolated before parsing the pattern as a regular expression, and that therefore cannot be analyzed at compile time, but must be reanalyzed each time the pattern match operator is evaluated. Runtime patterns are useful but expensive.

  • RV

    A recreational vehicle, not to be confused with vehicular recreation. RV also means an internal Reference Value of the type a scalar can hold. See also IV and NV if you’re not confused yet.

  • rvalue

    A value that you might find on the right side of an assignment. See also lvalue.
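
The reference and referent entries above can be illustrated with a short sketch of hard references to the common referent types. All variable names and data here are hypothetical.

```perl
use strict;
use warnings;

my @langs = ("awk", "sed", "perl");
my %port  = (http => 80, ssh => 22);

my $array_ref = \@langs;                    # referent: a named array
my $hash_ref  = \%port;                     # referent: a named hash
my $code_ref  = sub { return scalar @_ };   # referent: an anonymous subroutine

# Dereference with the arrow operator to reach the referent.
my $third    = $array_ref->[2];
my $ssh_port = $hash_ref->{ssh};
my $argc     = $code_ref->("a", "b");
my $kind     = ref $array_ref;              # the referent type, as a string

print "$third $ssh_port $argc $kind\n";     # prints "perl 22 2 ARRAY"
```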

S

  • sandbox

    A walled-off area that’s not supposed to affect anything beyond its walls. You let kids play in the sandbox instead of running in the road. See Camel chapter 20, “Security”.

  • scalar

    A simple, singular value; a number, string, or reference.

  • scalar context

    The situation in which an expression is expected by its surroundings (the code calling it) to return a single value rather than a list of values. See also context and list context. A scalar context sometimes imposes additional constraints on the return value—see string context and numeric context. Sometimes we talk about a Boolean context inside conditionals, but this imposes no additional constraints, since any scalar value, whether numeric or string, is already true or false.

  • scalar literal

    A number or quoted string—an actual value in the text of your program, as opposed to a variable.

  • scalar value

    A value that happens to be a scalar as opposed to a list.

  • scalar variable

    A variable prefixed with $ that holds a single value.

  • scope

    From how far away you can see a variable, looking through one. Perl has two visibility mechanisms. It does dynamic scoping of local variables, meaning that the rest of the block, and any subroutines that are called by the rest of the block, can see the variables that are local to the block. Perl does lexical scoping of my variables, meaning that the rest of the block can see the variable, but other subroutines called by the block cannot see the variable.

  • scratchpad

    The area in which a particular invocation of a particular file or subroutine keeps some of its temporary values, including any lexically scoped variables.

  • script

    A text file that is a program intended to be executed directly rather than compiled to another form of file before execution.

    Also, in the context of Unicode, a writing system for a particular language or group of languages, such as Greek, Bengali, or Tengwar.

  • script kiddie

    A cracker who is not a hacker but knows just enough to run canned scripts. A cargo-cult programmer.

  • sed

    A venerable Stream EDitor from which Perl derives some of its ideas.

  • semaphore

    A fancy kind of interlock that prevents multiple threads or processes from using up the same resources simultaneously.

  • separator

    A character or string that keeps two surrounding strings from being confused with each other. The split function works on separators. Not to be confused with delimiters or terminators. The “or” in the previous sentence separated the two alternatives.

  • serialization

    Putting a fancy data structure into linear order so that it can be stored as a string in a disk file or database, or sent through a pipe. Also called marshalling.

  • server

    In networking, a process that either advertises a service or just hangs around at a known location and waits for clients who need service to get in touch with it.

  • service

    Something you do for someone else to make them happy, like giving them the time of day (or of their life). On some machines, well-known services are listed by the getservent function.

  • setgid

    Same as setuid, only having to do with giving away group privileges.

  • setuid

    Said of a program that runs with the privileges of its owner rather than (as is usually the case) the privileges of whoever is running it. Also describes the bit in the mode word (permission bits) that controls the feature. This bit must be explicitly set by the owner to enable this feature, and the program must be carefully written not to give away more privileges than it ought to.

  • shared memory

    A piece of memory accessible by two different processes who otherwise would not see each other’s memory.

  • shebang

    Irish for the whole McGillicuddy. In Perl culture, a portmanteau of “sharp” and “bang”, meaning the #! sequence that tells the system where to find the interpreter.

  • shell

    A command-line interpreter. The program that interactively gives you a prompt, accepts one or more lines of input, and executes the programs you mentioned, feeding each of them their proper arguments and input data. Shells can also execute scripts containing such commands. Under Unix, typical shells include the Bourne shell (/bin/sh), the C shell (/bin/csh), and the Korn shell (/bin/ksh). Perl is not strictly a shell because it’s not interactive (although Perl programs can be interactive).

  • side effects

    Something extra that happens when you evaluate an expression. Nowadays it can refer to almost anything. For example, evaluating a simple assignment statement typically has the “side effect” of assigning a value to a variable. (And you thought assigning the value was your primary intent in the first place!) Likewise, assigning a value to the special variable $| ($OUTPUT_AUTOFLUSH) has the side effect of forcing a flush after every write or print on the currently selected filehandle.

  • sigil

    A glyph used in magic. Or, for Perl, the symbol in front of a variable name, such as $, @, and %.

  • signal

    A bolt out of the blue; that is, an event triggered by the operating system, probably when you’re least expecting it.

  • signal handler

    A subroutine that, instead of being content to be called in the normal fashion, sits around waiting for a bolt out of the blue before it will deign to execute. Under Perl, bolts out of the blue are called signals, and you send them with the kill built-in. See the %SIG hash in Camel chapter 25, “Special Names” and the section “Signals” in Camel chapter 15, “Interprocess Communication”.

  • single inheritance

    The features you got from your mother, if she told you that you don’t have a father. (See also inheritance and multiple inheritance.) In computer languages, the idea that classes reproduce asexually so that a given class can only have one direct ancestor or base class. Perl supplies no such restriction, though you may certainly program Perl that way if you like.

  • slice

    A selection of any number of elements from a list, array, or hash.

  • slurp

    To read an entire file into a string in one operation.

  • socket

    An endpoint for network communication among multiple processes that works much like a telephone or a post office box. The most important thing about a socket is its network address (like a phone number). Different kinds of sockets have different kinds of addresses—some look like filenames, and some don’t.

  • soft reference

    See symbolic reference.

  • source filter

    A special kind of module that does preprocessing on your script just before it gets to the tokener.

  • stack

    A device you can put things on the top of, and later take them back off in the opposite order in which you put them on. See LIFO.

  • standard

    Included in the official Perl distribution, as in a standard module, a standard tool, or a standard Perl manpage.

  • standard error

    The default output stream for nasty remarks that don’t belong in standard output. Represented within a Perl program by the filehandle STDERR. You can use this stream explicitly, but the die and warn built-ins write to your standard error stream automatically (unless trapped or otherwise intercepted).

  • standard input

    The default input stream for your program, which if possible shouldn’t care where its data is coming from. Represented within a Perl program by the filehandle STDIN.

  • standard I/O

    A standard C library for doing buffered input and output to the operating system. (The “standard” of standard I/O is at most marginally related to the “standard” of standard input and output.) In general, Perl relies on whatever implementation of standard I/O a given operating system supplies, so the buffering characteristics of a Perl program on one machine may not exactly match those on another machine. Normally this only influences efficiency, not semantics. If your standard I/O package is doing block buffering and you want it to flush the buffer more often, just set the $| variable to a true value.

  • Standard Library

    Everything that comes with the official perl distribution. Some vendor versions of perl change their distributions, leaving out some parts or including extras. See also dual-lived.

  • standard output

    The default output stream for your program, which if possible shouldn’t care where its data is going. Represented within a Perl program by the filehandle STDOUT.

  • statement

    A command to the computer about what to do next, like a step in a recipe: “Add marmalade to batter and mix until mixed.” A statement is distinguished from a declaration, which doesn’t tell the computer to do anything, but just to learn something.

  • statement modifier

    A conditional or loop that you put after the statement instead of before, if you know what we mean.

  • static

    Varying slowly compared to something else. (Unfortunately, everything is relatively stable compared to something else, except for certain elementary particles, and we’re not so sure about them.) In computers, where things are supposed to vary rapidly, “static” has a derogatory connotation, indicating a slightly dysfunctional variable, subroutine, or method. In Perl culture, the word is politely avoided.

    If you’re a C or C++ programmer, you might be looking for Perl’s state keyword.

  • static method

    No such thing. See class method.

  • static scoping

    No such thing. See lexical scoping.

  • static variable

    No such thing. Just use a lexical variable in a scope larger than your subroutine, or declare it with state instead of with my.

  • stat structure

    A special internal spot in which Perl keeps the information about the last file on which you requested information.

  • status

    The value returned to the parent process when one of its child processes dies. This value is placed in the special variable $?. Its upper eight bits are the exit status of the defunct process, and its lower eight bits identify the signal (if any) that the process died from. On Unix systems, this status value is the same as the status word returned by wait(2). See system in Camel chapter 27, “Functions”.

  • STDERR

    See standard error.

  • STDIN

    See standard input.

  • STDIO

    See standard I/O.

  • STDOUT

    See standard output.

  • stream

    A flow of data into or out of a process as a steady sequence of bytes or characters, without the appearance of being broken up into packets. This is a kind of interface—the underlying implementation may well break your data up into separate packets for delivery, but this is hidden from you.

  • string

    A sequence of characters such as “He said !@#*&%@#*?!”. A string does not have to be entirely printable.

  • string context

    The situation in which an expression is expected by its surroundings (the code calling it) to return a string. See also context and numeric context.

  • stringification

    The process of producing a string representation of an abstract object.

  • struct

    C keyword introducing a structure definition or name.

  • structure

    See data structure.

  • subclass

    See derived class.

  • subpattern

    A component of a regular expression pattern.

  • subroutine

    A named or otherwise accessible piece of program that can be invoked from elsewhere in the program in order to accomplish some subgoal of the program. A subroutine is often parameterized to accomplish different but related things depending on its input arguments. If the subroutine returns a meaningful value, it is also called a function.

  • subscript

    A value that indicates the position of a particular array element in an array.

  • substitution

    Changing parts of a string via the s/// operator. (We avoid use of this term to mean variable interpolation.)

  • substring

    A portion of a string, starting at a certain character position (offset) and proceeding for a certain number of characters.

  • superclass

    See base class.

  • superuser

    The person whom the operating system will let do almost anything. Typically your system administrator or someone pretending to be your system administrator. On Unix systems, the root user. On Windows systems, usually the Administrator user.

  • SV

    Short for “scalar value”. But within the Perl interpreter, every referent is treated as a member of a class derived from SV, in an object-oriented sort of way. Every value inside Perl is passed around as a C language SV* pointer. The SV struct knows its own “referent type”, and the code is smart enough (we hope) not to try to call a hash function on a subroutine.

  • switch

    An option you give on a command line to influence the way your program works, usually introduced with a minus sign. The word is also used as a nickname for a switch statement.

  • switch cluster

    The combination of multiple command-line switches (e.g., -a -b -c) into one switch (e.g., -abc). Any switch with an additional argument must be the last switch in a cluster.

  • switch statement

    A program technique that lets you evaluate an expression and then, based on the value of the expression, do a multiway branch to the appropriate piece of code for that value. Also called a “case structure”, named after the similar Pascal construct. Most switch statements in Perl are spelled given. See “The given statement” in Camel chapter 4, “Statements and Declarations”.

  • symbol

    Generally, any token or metasymbol. Often used more specifically to mean the sort of name you might find in a symbol table.

  • symbolic debugger

    A program that lets you step through the execution of your program, stopping or printing things out here and there to see whether anything has gone wrong, and, if so, what. The “symbolic” part just means that you can talk to the debugger using the same symbols with which your program is written.

  • symbolic link

    An alternate filename that points to the real filename, which in turn points to the real file. Whenever the operating system is trying to parse a pathname containing a symbolic link, it merely substitutes the new name and continues parsing.

  • symbolic reference

    A variable whose value is the name of another variable or subroutine. By dereferencing the first variable, you can get at the second one. Symbolic references are illegal under use strict "refs".

  • symbol table

    Where a compiler remembers symbols. A program like Perl must somehow remember all the names of all the variables, filehandles, and subroutines you’ve used. It does this by placing the names in a symbol table, which is implemented in Perl using a hash table. There is a separate symbol table for each package to give each package its own namespace.

  • synchronous

    Programming in which the orderly sequence of events can be determined; that is, when things happen one after the other, not at the same time.

  • syntactic sugar

    An alternative way of writing something more easily; a shortcut.

  • syntax

    From Greek σύνταξις, “with-arrangement”. How things (particularly symbols) are put together with each other.

  • syntax tree

    An internal representation of your program wherein lower-level constructs dangle off the higher-level constructs enclosing them.

  • syscall

    A function call directly to the operating system. Many of the important subroutines and functions you use aren’t direct system calls, but are built up in one or more layers above the system call level. In general, Perl programmers don’t need to worry about the distinction. However, if you do happen to know which Perl functions are really syscalls, you can predict which of these will set the $! ($ERRNO) variable on failure. Unfortunately, beginning programmers often confusingly employ the term “system call” to mean what happens when you call the Perl system function, which actually involves many syscalls. To avoid any confusion, we nearly always say “syscall” for something you could call indirectly via Perl’s syscall function, and never for something you would call with Perl’s system function.
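
The difference between dynamic and lexical scoping described in the scope entry above can be seen in a minimal sketch. The variable and subroutine names are hypothetical and serve only to contrast local with my.

```perl
use strict;
use warnings;

our $dynamic = "global";       # package variable, can be localized
my  $lexical = "file-level";   # lexical variable, seen by show() below

sub show { return "$dynamic/$lexical" }

sub caller_sub {
    local $dynamic = "localized";   # temporary value, visible to subs called from here
    my    $lexical = "inner";       # a new variable, visible only inside this block
    return show();
}

my $during = caller_sub();   # show() sees the localized $dynamic but not the inner $lexical
my $after  = show();         # the old value of $dynamic is restored on return

print "$during\n";   # prints "localized/file-level"
print "$after\n";    # prints "global/file-level"
```

The called subroutine picks up the local value because dynamic scope follows the call chain, whereas the my variable stays confined to the text of the block that declares it.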

T

  • taint checks

    The special bookkeeping Perl does to track the flow of external data through your program and disallow their use in system commands.

  • tainted

    Said of data derived from the grubby hands of a user, and thus unsafe for a secure program to rely on. Perl does taint checks if you run a setuid (or setgid) program, or if you use the -T switch.

  • taint mode

    Running under the -T switch, marking all external data as suspect and refusing to use it with system commands. See Camel chapter 20, “Security”.

  • TCP

    Short for Transmission Control Protocol. A protocol wrapped around the Internet Protocol to make an unreliable packet transmission mechanism appear to the application program to be a reliable stream of bytes. (Usually.)

  • term

    Short for a “terminal”—that is, a leaf node of a syntax tree. A thing that functions grammatically as an operand for the operators in an expression.

  • terminator

    A character or string that marks the end of another string. The $/ variable contains the string that terminates a readline operation, which chomp deletes from the end. Not to be confused with delimiters or separators. The period at the end of this sentence is a terminator.

  • ternary

    An operator taking three operands. Sometimes pronounced trinary.

  • text

    A string or file containing primarily printable characters.

  • thread

    Like a forked process, but without fork’s inherent memory protection. A thread is lighter weight than a full process, in that a process could have multiple threads running around in it, all fighting over the same process’s memory space unless steps are taken to protect threads from one another.

  • tie

    The bond between a magical variable and its implementation class. See the tie function in Camel chapter 27, “Functions” and Camel chapter 14, “Tied Variables”.

  • titlecase

    The case used for capitals that are followed by lowercase characters instead of by more capitals. Sometimes called sentence case or headline case. English doesn’t use Unicode titlecase, but casing rules for English titles are more complicated than simply capitalizing each word’s first character.

  • TMTOWTDI

    There’s More Than One Way To Do It, the Perl Motto. The notion that there can be more than one valid path to solving a programming problem in context. (This doesn’t mean that more ways are always better or that all possible paths are equally desirable—just that there need not be One True Way.)

  • token

    A morpheme in a programming language, the smallest unit of text with semantic significance.

  • tokener

    A module that breaks a program text into a sequence of tokens for later analysis by a parser.

  • tokenizing

    Splitting up a program text into tokens. Also known as “lexing”, in which case you get “lexemes” instead of tokens.

  • toolbox approach

    The notion that, with a complete set of simple tools that work well together, you can build almost anything you want. Which is fine if you’re assembling a tricycle, but if you’re building a defranishizing comboflux regurgalator, you really want your own machine shop in which to build special tools. Perl is sort of a machine shop.

  • topic

    The thing you’re working on. Structures like while(<>), for, foreach, and given set the topic for you by assigning to $_, the default (topic) variable.

  • transliterate

    To turn one string representation into another by mapping each character of the source string to its corresponding character in the result string. Not to be confused with translation: for example, Greek πολύχρωμος transliterates into polychromos but translates into many-colored. See the tr/// operator in Camel chapter 5, “Pattern Matching”.

  • trigger

    An event that causes a handler to be run.

  • trinary

    Not a stellar system with three stars, but an operator taking three operands. Sometimes pronounced ternary.

  • troff

    A venerable typesetting language from which Perl derives the name of its $% variable and which is secretly used in the production of Camel books.

  • true

    Any scalar value that doesn’t evaluate to 0 or "".

  • truncating

    Emptying a file of existing contents, either automatically when opening a file for writing or explicitly via the truncate function.

  • type

    See data type and class.

  • type casting

    Converting data from one type to another. C permits this. Perl does not need it. Nor want it.

  • typedef

    A type definition in the C and C++ languages.

  • typed lexical

    A lexical variable that is declared with a class type: my Pony $bill.

  • typeglob

    Use of a single identifier, prefixed with *. For example, *name stands for any or all of $name, @name, %name, &name, or just name. How you use it determines whether it is interpreted as all or only one of them. See “Typeglobs and Filehandles” in Camel chapter 2, “Bits and Pieces”.

  • typemap

    A description of how C types may be transformed to and from Perl types within an extension module written in XS.
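Several of the T entries above (topic, transliterate, true, and typeglob) can be seen side by side in a short sketch. The variable and subroutine names are invented for the example.

```perl
use strict;
use warnings;

# topic: foreach aliases each element to $_, the default variable.
my @shouted;
for (qw(camel llama)) {
    push @shouted, uc;          # uc with no argument works on $_
}

# transliterate: tr/// maps characters one for one (here, rot13).
(my $rot13 = 'Perl') =~ tr/A-Za-z/N-ZA-Mn-za-m/;

# true: a scalar is true unless it is 0, "0", "" or undef.
my @truthy = grep { $_ } (0, 1, '', '0', '0.0', 'a');

# typeglob: assigning a code reference to a glob fills only its code slot,
# so the scalar of the same name is left alone.
our $greet = 'scalar slot';
*greet = sub { 'hello from the code slot' };

print "@shouted | $rot13 | @truthy | ", greet(), " | $greet\n";
```

Note that the string '0.0' is true even though it is numerically zero, because truth is decided on the string form first.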

U

  • UDP

    User Datagram Protocol, the typical way to send datagrams over the Internet.

  • UID

    A user ID. Often used in the context of file or process ownership.

  • umask

    A mask of those permission bits that should be forced off when creating files or directories, in order to establish a policy of whom you’ll ordinarily deny access to. See the umask function.

  • unary operator

    An operator with only one operand, like ! or chdir. Unary operators are usually prefix operators; that is, they precede their operand. The ++ and -- operators can be either prefix or postfix. (Their position does change their meanings.)

  • Unicode

    A character set comprising all the major character sets of the world, more or less. See http://www.unicode.org.

  • Unix

    A very large and constantly evolving language with several alternative and largely incompatible syntaxes, in which anyone can define anything any way they choose, and usually do. Speakers of this language think it’s easy to learn because it’s so easily twisted to one’s own ends, but dialectical differences make tribal intercommunication nearly impossible, and travelers are often reduced to a pidgin-like subset of the language. To be universally understood, a Unix shell programmer must spend years of study in the art. Many have abandoned this discipline and now communicate via an Esperanto-like language called Perl.

    In ancient times, Unix was also used to refer to some code that a couple of people at Bell Labs wrote to make use of a PDP-7 computer that wasn’t doing much of anything else at the time.

  • uppercase

    In Unicode, not just characters with the General Category of Uppercase Letter, but any character with the Uppercase property, including some Letter Numbers and Symbols. Not to be confused with titlecase.

V

  • value

    An actual piece of data, in contrast to all the variables, references, keys, indices, operators, and whatnot that you need to access the value.

  • variable

    A named storage location that can hold any of various kinds of value, as your program sees fit.

  • variable interpolation

    The interpolation of a scalar or array variable into a string.

  • variadic

    Said of a function that happily receives an indeterminate number of actual arguments.

  • vector

    Mathematical jargon for a list of scalar values.

  • virtual

    Providing the appearance of something without the reality, as in: virtual memory is not real memory. (See also memory.) The opposite of “virtual” is “transparent”, which means providing the reality of something without the appearance, as in: Perl handles the variable-length UTF‑8 character encoding transparently.

  • void context

    A form of scalar context in which an expression is not expected to return any value at all and is evaluated for its side effects alone.

  • v-string

    A “version” or “vector” string specified with a v followed by a series of decimal integers in dot notation, for instance, v1.20.300.4000. Each number turns into a character with the specified ordinal value. (The v is optional when there are at least three integers.)
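The v-string and variable interpolation entries above can be sketched as follows. The variable names are invented for the example; the "%vd" format is the standard sprintf flag for printing a string's ordinals joined by dots.

```perl
use strict;
use warnings;

# v-string: each dotted integer becomes one character with that ordinal.
my $vs       = v1.20.300.4000;
my @ordinals = map { ord } split //, $vs;

# "%vd" prints the ordinal of every character, joined by dots.
my $dotted = sprintf '%vd', $vs;

# variable interpolation: scalars and arrays expand inside double quotes;
# array elements are joined with a space by default.
my @parts = (5, 18, 2);
my $text  = "parts: @parts";

print "$dotted has ", scalar(@ordinals), " characters; $text\n";
```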

W

  • warning

    A message printed to the STDERR stream to the effect that something might be wrong but isn’t worth blowing up over. See warn in Camel chapter 27, “Functions” and the warnings pragma in Camel chapter 28, “Pragmantic Modules”.

  • watch expression

    An expression which, when its value changes, causes a breakpoint in the Perl debugger.

  • weak reference

    A reference that doesn’t get counted normally. When all the normal references to data disappear, the data disappears. These are useful for circular references that would never disappear otherwise.

  • whitespace

    A character that moves your cursor but doesn’t otherwise put anything on your screen. Typically refers to any of: space, tab, line feed, carriage return, or form feed. In Unicode, matches many other characters that Unicode considers whitespace, including the no-break space.

  • word

    In normal “computerese”, the piece of data of the size most efficiently handled by your computer, typically 32 bits or so, give or take a few powers of 2. In Perl culture, it more often refers to an alphanumeric identifier (including underscores), or to a string of nonwhitespace characters bounded by whitespace or string boundaries.

  • working directory

    Your current directory, from which relative pathnames are interpreted by the operating system. The operating system knows your current directory because you told it with a chdir, or because you started out in the place where your parent process was when you were born.

  • wrapper

    A program or subroutine that runs some other program or subroutine for you, modifying some of its input or output to better suit your purposes.

  • WYSIWYG

    What You See Is What You Get. Usually used when something that appears on the screen matches how it will eventually look, like Perl’s format declarations. Also used to mean the opposite of magic because everything works exactly as it appears, as in the three-argument form of open.
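The weak reference entry above is easiest to see with a reference cycle. This is a minimal sketch using weaken and isweak from the core Scalar::Util module; the parent and child hashes are invented for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Two hashes that point at each other form a reference cycle.
my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };
$parent->{child} = $child;

# Weaken the back-pointer so it no longer counts toward the parent's
# reference count.
weaken($child->{parent});
my $was_weak = isweak($child->{parent});

# Drop the only strong reference: the parent is freed, and the weak
# reference is set to undef automatically.
undef $parent;
my $gone = !defined $child->{parent};

print "weak: ", ($was_weak ? 'yes' : 'no'),
      ", parent freed: ", ($gone ? 'yes' : 'no'), "\n";
```

Without the weaken call, each hash would keep the other alive after both lexical variables went away.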

X

  • XS

    An extraordinarily exported, expeditiously excellent, expressly eXternal Subroutine, executed in existing C or C++ or in an exciting extension language called (exasperatingly) XS.

  • XSUB

    An external subroutine defined in XS.

Y

  • yacc

    Yet Another Compiler Compiler. A parser generator without which Perl probably would not have existed. See the file perly.y in the Perl source distribution.

Z

  • zero width

    A subpattern assertion matching the null string between characters.

  • zombie

    A process that has died (exited) but whose parent has not yet received proper notification of its demise by virtue of having called wait or waitpid. If you fork, you must clean up after your child processes when they exit; otherwise, the process table will fill up and your system administrator will Not Be Happy with you.
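The clean-up described in the zombie entry above can be sketched as follows, assuming a Unix-like system where fork is available. The exit status 3 is arbitrary, and POSIX::_exit is used in the child only so that it leaves without running any of the parent's clean-up code.

```perl
use strict;
use warnings;
use POSIX ();

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: leave immediately with a recognizable status.
    POSIX::_exit(3);
}

# Parent: once the child has exited it stays a zombie until it is
# reaped here by waitpid.
my $reaped = waitpid($pid, 0);
my $status = $? >> 8;          # exit status is in the high byte of $?

print "reaped child $reaped with exit status $status\n";
```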

AUTHOR AND COPYRIGHT

Based on the Glossary of Programming Perl, Fourth Edition, by Tom Christiansen, brian d foy, Larry Wall, & Jon Orwant. Copyright (c) 2000, 1996, 1991, 2012 O'Reilly Media, Inc. This document may be distributed under the same terms as Perl itself.

 

perlgpl

SYNOPSIS

    You can refer to this document in Pod via "L<perlgpl>"
    Or you can see this document by entering "perldoc perlgpl"

DESCRIPTION

Perl is free software; you can redistribute it and/or modify it under the terms of either:

    a) the GNU General Public License as published by the Free
    Software Foundation; either version 1, or (at your option) any
    later version, or

    b) the "Artistic License" which comes with this Kit.

This is the "GNU General Public License, version 1". It's here so that modules, programs, etc., that want to declare this as their distribution license can link to it.

For the Perl Artistic License, see perlartistic.

GNU GENERAL PUBLIC LICENSE

    GNU GENERAL PUBLIC LICENSE
    Version 1, February 1989

    Copyright (C) 1989 Free Software Foundation, Inc.
    51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

    Everyone is permitted to copy and distribute verbatim copies
    of this license document, but changing it is not allowed.

    Preamble

    The license agreements of most software companies try to keep users
    at the mercy of those companies. By contrast, our General Public
    License is intended to guarantee your freedom to share and change free
    software--to make sure the software is free for all its users. The
    General Public License applies to the Free Software Foundation's
    software and to any other program whose authors commit to using it.
    You can use it for your programs, too.

    When we speak of free software, we are referring to freedom, not
    price. Specifically, the General Public License is designed to make
    sure that you have the freedom to give away or sell copies of free
    software, that you receive source code or can get it if you want it,
    that you can change the software or use pieces of it in new free
    programs; and that you know you can do these things.

    To protect your rights, we need to make restrictions that forbid
    anyone to deny you these rights or to ask you to surrender the rights.
    These restrictions translate to certain responsibilities for you if you
    distribute copies of the software, or if you modify it.

    For example, if you distribute copies of a such a program, whether
    gratis or for a fee, you must give the recipients all the rights that
    you have. You must make sure that they, too, receive or can get the
    source code. And you must tell them their rights.

    We protect your rights with two steps: (1) copyright the software, and
    (2) offer you this license which gives you legal permission to copy,
    distribute and/or modify the software.

    Also, for each author's protection and ours, we want to make certain
    that everyone understands that there is no warranty for this free
    software. If the software is modified by someone else and passed on, we
    want its recipients to know that what they have is not the original, so
    that any problems introduced by others will not reflect on the original
    authors' reputations.

    The precise terms and conditions for copying, distribution and
    modification follow.

    GNU GENERAL PUBLIC LICENSE
    TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

    0. This License Agreement applies to any program or other work which
    contains a notice placed by the copyright holder saying it may be
    distributed under the terms of this General Public License. The
    "Program", below, refers to any such program or work, and a "work based
    on the Program" means either the Program or any work containing the
    Program or a portion of it, either verbatim or with modifications. Each
    licensee is addressed as "you".

    1. You may copy and distribute verbatim copies of the Program's source
    code as you receive it, in any medium, provided that you conspicuously and
    appropriately publish on each copy an appropriate copyright notice and
    disclaimer of warranty; keep intact all the notices that refer to this
    General Public License and to the absence of any warranty; and give any
    other recipients of the Program a copy of this General Public License
    along with the Program. You may charge a fee for the physical act of
    transferring a copy.

    2. You may modify your copy or copies of the Program or any portion of
    it, and copy and distribute such modifications under the terms of Paragraph
    1 above, provided that you also do the following:

        a) cause the modified files to carry prominent notices stating that
        you changed the files and the date of any change; and

        b) cause the whole of any work that you distribute or publish, that
        in whole or in part contains the Program or any part thereof, either
        with or without modifications, to be licensed at no charge to all
        third parties under the terms of this General Public License (except
        that you may choose to grant warranty protection to some or all
        third parties, at your option).

        c) If the modified program normally reads commands interactively when
        run, you must cause it, when started running for such interactive use
        in the simplest and most usual way, to print or display an
        announcement including an appropriate copyright notice and a notice
        that there is no warranty (or else, saying that you provide a
        warranty) and that users may redistribute the program under these
        conditions, and telling the user how to view a copy of this General
        Public License.

        d) You may charge a fee for the physical act of transferring a
        copy, and you may at your option offer warranty protection in
        exchange for a fee.

    Mere aggregation of another independent work with the Program (or its
    derivative) on a volume of a storage or distribution medium does not bring
    the other work under the scope of these terms.

    3. You may copy and distribute the Program (or a portion or derivative of
    it, under Paragraph 2) in object code or executable form under the terms of
    Paragraphs 1 and 2 above provided that you also do one of the following:

        a) accompany it with the complete corresponding machine-readable
        source code, which must be distributed under the terms of
        Paragraphs 1 and 2 above; or,

        b) accompany it with a written offer, valid for at least three
        years, to give any third party free (except for a nominal charge
        for the cost of distribution) a complete machine-readable copy of the
        corresponding source code, to be distributed under the terms of
        Paragraphs 1 and 2 above; or,

        c) accompany it with the information you received as to where the
        corresponding source code may be obtained. (This alternative is
        allowed only for noncommercial distribution and only if you
        received the program in object code or executable form alone.)

    Source code for a work means the preferred form of the work for making
    modifications to it. For an executable file, complete source code means
    all the source code for all modules it contains; but, as a special
    exception, it need not include source code for modules which are standard
    libraries that accompany the operating system on which the executable
    file runs, or for standard header files or definitions files that
    accompany that operating system.

    4. You may not copy, modify, sublicense, distribute or transfer the
    Program except as expressly provided under this General Public License.
    Any attempt otherwise to copy, modify, sublicense, distribute or transfer
    the Program is void, and will automatically terminate your rights to use
    the Program under this License. However, parties who have received
    copies, or rights to use copies, from you under this General Public
    License will not have their licenses terminated so long as such parties
    remain in full compliance.

    5. By copying, distributing or modifying the Program (or any work based
    on the Program) you indicate your acceptance of this license to do so,
    and all its terms and conditions.

    6. Each time you redistribute the Program (or any work based on the
    Program), the recipient automatically receives a license from the original
    licensor to copy, distribute or modify the Program subject to these
    terms and conditions. You may not impose any further restrictions on the
    recipients' exercise of the rights granted herein.

    7. The Free Software Foundation may publish revised and/or new versions
    of the General Public License from time to time. Such new versions will
    be similar in spirit to the present version, but may differ in detail to
    address new problems or concerns.

    Each version is given a distinguishing version number. If the Program
    specifies a version number of the license which applies to it and "any
    later version", you have the option of following the terms and conditions
    either of that version or of any later version published by the Free
    Software Foundation. If the Program does not specify a version number of
    the license, you may choose any version ever published by the Free Software
    Foundation.

    8. If you wish to incorporate parts of the Program into other free
    programs whose distribution conditions are different, write to the author
    to ask for permission. For software which is copyrighted by the Free
    Software Foundation, write to the Free Software Foundation; we sometimes
    make exceptions for this. Our decision will be guided by the two goals
    of preserving the free status of all derivatives of our free software and
    of promoting the sharing and reuse of software generally.

    NO WARRANTY

    9. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
    FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
    OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
    PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
    OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
    MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
    TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
    PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
    REPAIR OR CORRECTION.

    10. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
    WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
    REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
    INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
    OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
    TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
    YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
    PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
    POSSIBILITY OF SUCH DAMAGES.

    END OF TERMS AND CONDITIONS

    Appendix: How to Apply These Terms to Your New Programs

    If you develop a new program, and you want it to be of the greatest
    possible use to humanity, the best way to achieve this is to make it
    free software which everyone can redistribute and change under these
    terms.

    To do so, attach the following notices to the program. It is safest to
    attach them to the start of each source file to most effectively convey
    the exclusion of warranty; and each file should have at least the
    "copyright" line and a pointer to where the full notice is found.

        <one line to give the program's name and a brief idea of what it does.>
        Copyright (C) 19yy <name of author>

        This program is free software; you can redistribute it and/or modify
        it under the terms of the GNU General Public License as published by
        the Free Software Foundation; either version 1, or (at your option)
        any later version.

        This program is distributed in the hope that it will be useful,
        but WITHOUT ANY WARRANTY; without even the implied warranty of
        MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
        GNU General Public License for more details.

        You should have received a copy of the GNU General Public License
        along with this program; if not, write to the Free Software
        Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
        02110-1301 USA

    Also add information on how to contact you by electronic and paper mail.

    If the program is interactive, make it output a short notice like this
    when it starts in an interactive mode:

        Gnomovision version 69, Copyright (C) 19xx name of author
        Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type 'show w'.
        This is free software, and you are welcome to redistribute it
        under certain conditions; type 'show c' for details.

    The hypothetical commands 'show w' and 'show c' should show the
    appropriate parts of the General Public License. Of course, the
    commands you use may be called something other than 'show w' and 'show
    c'; they could even be mouse-clicks or menu items--whatever suits your
    program.

    You should also get your employer (if you work as a programmer) or your
    school, if any, to sign a "copyright disclaimer" for the program, if
    necessary. Here a sample; alter the names:

        Yoyodyne, Inc., hereby disclaims all copyright interest in the
        program 'Gnomovision' (a program to direct compilers to make passes
        at assemblers) written by James Hacker.

        <signature of Ty Coon>, 1 April 1989
        Ty Coon, President of Vice

    That's all there is to it!
 

perlguts

NAME

perlguts - Introduction to the Perl API

DESCRIPTION

This document attempts to describe how to use the Perl API, as well as to provide some info on the basic workings of the Perl core. It is far from complete and probably contains many errors. Please refer any questions or comments to the author below.

Variables

Datatypes

Perl has three typedefs that handle Perl's three main data types:

    SV  Scalar Value
    AV  Array Value
    HV  Hash Value

Each typedef has specific routines that manipulate the various data types.

What is an "IV"?

Perl uses a special typedef IV which is a simple signed integer type that is guaranteed to be large enough to hold a pointer (as well as an integer). Additionally, there is the UV, which is simply an unsigned IV.

Perl also uses two special typedefs, I32 and I16, which will always be at least 32 bits and 16 bits long, respectively. (Again, there are U32 and U16, as well.) They will usually be exactly 32 and 16 bits long, but on Crays they will both be 64 bits.

Working with SVs

An SV can be created and loaded with one command. There are five types of values that can be loaded: an integer value (IV), an unsigned integer value (UV), a double (NV), a string (PV), and another scalar (SV). ("PV" stands for "Pointer Value". You might think that it is misnamed because it is described as pointing only to strings. However, it is possible to have it point to other things. For example, inversion lists, used in regular expression data structures, are scalars, each consisting of an array of UVs which are accessed through PVs. But, using it for non-strings requires care, as the underlying assumption of much of the internals is that PVs are just for strings. Often, for example, a trailing NUL is tacked on automatically. The non-string use is documented only in this paragraph.)

The seven routines are:

    SV*  newSViv(IV);
    SV*  newSVuv(UV);
    SV*  newSVnv(double);
    SV*  newSVpv(const char*, STRLEN);
    SV*  newSVpvn(const char*, STRLEN);
    SV*  newSVpvf(const char*, ...);
    SV*  newSVsv(SV*);

STRLEN is an integer type (Size_t, usually defined as size_t in config.h) guaranteed to be large enough to represent the size of any string that perl can handle.

In the unlikely case of an SV requiring more complex initialisation, you can create an empty SV with newSV(len). If len is 0, an empty SV of type NULL is returned; otherwise an SV of type PV is returned with len + 1 (for the NUL) bytes of storage allocated, accessible via SvPVX. In both cases the SV has the undef value.

    SV *sv = newSV(0);   /* no storage allocated */
    SV *sv = newSV(10);  /* 10 (+1) bytes of uninitialised storage
                          * allocated */

To change the value of an already-existing SV, there are eight routines:

    void  sv_setiv(SV*, IV);
    void  sv_setuv(SV*, UV);
    void  sv_setnv(SV*, double);
    void  sv_setpv(SV*, const char*);
    void  sv_setpvn(SV*, const char*, STRLEN);
    void  sv_setpvf(SV*, const char*, ...);
    void  sv_vsetpvfn(SV*, const char*, STRLEN, va_list *,
                      SV **, I32, bool *);
    void  sv_setsv(SV*, SV*);

Notice that you can choose to specify the length of the string to be assigned by using sv_setpvn, newSVpvn, or newSVpv, or you may allow Perl to calculate the length by using sv_setpv or by specifying 0 as the second argument to newSVpv. Be warned, though, that Perl will determine the string's length by using strlen, which depends on the string terminating with a NUL character, and not otherwise containing NULs.

The arguments of sv_setpvf are processed like sprintf, and the formatted output becomes the value.

sv_vsetpvfn is an analogue of vsprintf, but it allows you to specify either a pointer to a variable argument list or the address and length of an array of SVs. The last argument points to a boolean; on return, if that boolean is true, then locale-specific information has been used to format the string, and the string's contents are therefore untrustworthy (see perlsec). This pointer may be NULL if that information is not important. Note that this function requires you to specify the length of the format.

The sv_set*() functions are not generic enough to operate on values that have "magic". See Magic Virtual Tables later in this document.

All SVs that contain strings should be terminated with a NUL character. If it is not NUL-terminated there is a risk of core dumps and corruptions from code which passes the string to C functions or system calls which expect a NUL-terminated string. Perl's own functions typically add a trailing NUL for this reason. Nevertheless, you should be very careful when you pass a string stored in an SV to a C function or system call.

To access the actual value that an SV points to, you can use the macros:

    SvIV(SV*)
    SvUV(SV*)
    SvNV(SV*)
    SvPV(SV*, STRLEN len)
    SvPV_nolen(SV*)

which will automatically coerce the actual scalar type into an IV, UV, double, or string.

In the SvPV macro, the length of the string returned is placed into the variable len (this is a macro, so you do not use &len). If you do not care what the length of the data is, use the SvPV_nolen macro. Historically the SvPV macro with the global variable PL_na has been used in this case. But that can be quite inefficient because PL_na must be accessed in thread-local storage in threaded Perl. In any case, remember that Perl allows arbitrary strings of data that may both contain NULs and might not be terminated by a NUL.

Also remember that C doesn't allow you to safely say foo(SvPV(s, len), len): the order in which function arguments are evaluated is unspecified, so the second argument may be read before SvPV has stored the length in len. It might work with your compiler, but it won't work for everyone. Break this sort of statement up into separate assignments:

    SV *s;
    STRLEN len;
    char *ptr;
    ptr = SvPV(s, len);
    foo(ptr, len);

If you want to know if the scalar value is TRUE, you can use:

    SvTRUE(SV*)

Although Perl will automatically grow strings for you, if you need to force Perl to allocate more memory for your SV, you can use the macro

    SvGROW(SV*, STRLEN newlen)

which will determine if more memory needs to be allocated. If so, it will call the function sv_grow. Note that SvGROW can only increase, not decrease, the allocated memory of an SV and that it does not automatically add space for the trailing NUL byte (perl's own string functions typically do SvGROW(sv, len + 1)).

If you have an SV and want to know what kind of data Perl thinks is stored in it, you can use the following macros to check the type of SV you have.

    SvIOK(SV*)
    SvNOK(SV*)
    SvPOK(SV*)
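These macros test flag bits on the SV. As a rough way to watch the same flags from Perl space rather than from C, the sketch below uses the core B module, which exposes the flag word and the SVf_IOK, SVf_NOK, and SVf_POK constants. The helper flags_of is invented for the example, and only freshly assigned values are shown, because the flags on a scalar that has been used in several contexts can differ between perl versions.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK SVf_NOK SVf_POK);

# Hypothetical helper: report which of the public OK flags are set on
# the caller's own scalar ($_[0] is an alias, so nothing is copied).
sub flags_of {
    my $flags = svref_2object(\$_[0])->FLAGS;
    return join ',',
        ($flags & SVf_IOK() ? 'IOK' : ()),
        ($flags & SVf_NOK() ? 'NOK' : ()),
        ($flags & SVf_POK() ? 'POK' : ());
}

my $int = 42;
my $num = 1.5;
my $str = 'camel';

my %seen = (
    int => flags_of($int),
    num => flags_of($num),
    str => flags_of($str),
);
print "$_: $seen{$_}\n" for sort keys %seen;
```

This is only an observation aid; XS code should keep using the SvIOK, SvNOK, and SvPOK macros directly.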

You can get and set the current length of the string stored in an SV with the following macros:

    SvCUR(SV*)
    SvCUR_set(SV*, I32 val)

You can also get a pointer to the end of the string stored in the SV with the macro:

    SvEND(SV*)

But note that these last three macros are valid only if SvPOK() is true.

If you want to append something to the end of the string stored in an SV*, you can use the following functions:

    void  sv_catpv(SV*, const char*);
    void  sv_catpvn(SV*, const char*, STRLEN);
    void  sv_catpvf(SV*, const char*, ...);
    void  sv_vcatpvfn(SV*, const char*, STRLEN, va_list *, SV **,
                      I32, bool);
    void  sv_catsv(SV*, SV*);

The first function calculates the length of the string to be appended by using strlen . In the second, you specify the length of the string yourself. The third function processes its arguments like sprintf and appends the formatted output. The fourth function works like vsprintf . You can specify the address and length of an array of SVs instead of the va_list argument. The fifth function extends the string stored in the first SV with the string stored in the second SV. It also forces the second SV to be interpreted as a string.

The sv_cat*() functions are not generic enough to operate on values that have "magic". See Magic Virtual Tables later in this document.

If you know the name of a scalar variable, you can get a pointer to its SV by using the following:

  1. SV* get_sv("package::varname", 0);

This returns NULL if the variable does not exist.

If you want to know if this variable (or any other SV) is actually defined, you can call:

  1. SvOK(SV*)

The scalar undef value is stored in an SV instance called PL_sv_undef .

Its address can be used whenever an SV* is needed. Do not, however, test whether an arbitrary SV is undefined by comparing its address with &PL_sv_undef. For example, when interfacing with Perl code, such a comparison will work correctly for:

  1. foo(undef);

But won't work when called as:

  1. $x = undef;
  2. foo($x);

So, to repeat: always use SvOK() to check whether an SV is defined.

Also you have to be careful when using &PL_sv_undef as a value in AVs or HVs (see AVs, HVs and undefined values).

There are also the two values PL_sv_yes and PL_sv_no , which contain boolean TRUE and FALSE values, respectively. Like PL_sv_undef , their addresses can be used whenever an SV* is needed.

Do not be fooled into thinking that (SV *) 0 is the same as &PL_sv_undef . Take this code:

    SV* sv = (SV*) 0;
    if (I-am-to-return-a-real-value) {
            sv = sv_2mortal(newSViv(42));
    }
    sv_setsv(ST(0), sv);

This code tries to return a new SV (which contains the value 42) if it should return a real value, or undef otherwise. Instead it has returned a NULL pointer which, somewhere down the line, will cause a segmentation violation, bus error, or just weird results. Change the zero to &PL_sv_undef in the first line and all will be well.

To free an SV that you've created, call SvREFCNT_dec(SV*) . Normally this call is not necessary (see Reference Counts and Mortality).

Offsets

Perl provides the function sv_chop to efficiently remove characters from the beginning of a string; you give it an SV and a pointer to somewhere inside the PV, and it discards everything before the pointer. The efficiency comes by means of a little hack: instead of actually removing the characters, sv_chop sets the flag OOK (offset OK) to signal to other functions that the offset hack is in effect, and it puts the number of bytes chopped off into the IV field of the SV. It then moves the PV pointer (called SvPVX ) forward that many bytes, and adjusts SvCUR and SvLEN .

Hence, at this point, the start of the buffer that we allocated lives at SvPVX(sv) - SvIV(sv) in memory and the PV pointer is pointing into the middle of this allocated storage.

This is best demonstrated by example:

    % ./perl -Ilib -MDevel::Peek -le '$a="12345"; $a=~s/.//; Dump($a)'
    SV = PVIV(0x8128450) at 0x81340f0
      REFCNT = 1
      FLAGS = (POK,OOK,pPOK)
      IV = 1  (OFFSET)
      PV = 0x8135781 ( "1" . ) "2345"\0
      CUR = 4
      LEN = 5

Here the number of bytes chopped off (1) is put into IV, and Devel::Peek::Dump helpfully reminds us that this is an offset. The portion of the string between the "real" and the "fake" beginnings is shown in parentheses, and the values of SvCUR and SvLEN reflect the fake beginning, not the real one.
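The bookkeeping behind the offset hack can be modelled in a few lines of plain C. This is a toy sketch, not the code in sv.c: ToyStr, toy_chop and toy_real_start are hypothetical names, and the fields only mirror what SvPVX, SvCUR, SvLEN and the IV slot hold once OOK is set.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the string part of an SV; not Perl's real layout. */
typedef struct {
    char  *pv;      /* visible start of the string (like SvPVX) */
    size_t cur;     /* visible length, excluding the NUL (like SvCUR) */
    size_t len;     /* bytes available from pv onward (like SvLEN) */
    size_t offset;  /* bytes chopped so far (the IV slot under OOK) */
    int    ook;     /* nonzero once the offset trick is in effect */
} ToyStr;

/* Discard everything before ptr, which must point inside the string. */
void toy_chop(ToyStr *s, char *ptr)
{
    size_t delta;

    assert(ptr >= s->pv && ptr <= s->pv + s->cur);
    delta = (size_t)(ptr - s->pv);

    s->pv     += delta;   /* no bytes are moved, only the pointer */
    s->cur    -= delta;
    s->len    -= delta;
    s->offset += delta;
    s->ook     = 1;
}

/* Recover the address that was originally allocated. */
char *toy_real_start(const ToyStr *s)
{
    return s->pv - s->offset;
}
```

Chopping one byte from a six-byte buffer holding "12345" leaves cur at 4, len at 5 and offset at 1, the same figures as in the Dump output above, while toy_real_start still yields the address that would eventually have to be freed.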

Something similar to the offset hack is performed on AVs to enable efficient shifting and splicing off the beginning of the array; while AvARRAY points to the first element in the array that is visible from Perl, AvALLOC points to the real start of the C array. These are usually the same, but a shift operation can be carried out by increasing AvARRAY by one and decreasing AvFILL and AvMAX . Again, the location of the real start of the C array only comes into play when freeing the array. See av_shift in av.c.

What's Really Stored in an SV?

Recall that the usual method of determining the type of scalar you have is to use the Sv*OK macros. Because a scalar can be both a number and a string, these macros will usually return TRUE, and calling the Sv*V macros will do the appropriate conversion of string to integer/double or integer/double to string.

If you really need to know if you have an integer, double, or string pointer in an SV, you can use the following three macros instead:

  1. SvIOKp(SV*)
  2. SvNOKp(SV*)
  3. SvPOKp(SV*)

These will tell you if you truly have an integer, double, or string pointer stored in your SV. The "p" stands for private.

There are various ways in which the private and public flags may differ. For example, a tied SV may have a valid underlying value in the IV slot (so SvIOKp is true), but the data should be accessed via the FETCH routine rather than directly, so SvIOK is false. Another is when numeric conversion has occurred and precision has been lost: only the private flag is set on 'lossy' values. So when an NV is converted to an IV with loss, SvIOKp, SvNOKp and SvNOK will be set, while SvIOK won't be.
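The lossy-conversion case can be illustrated with a toy model in plain C. Everything here is hypothetical (ToyNum, the TOY_* flag names, toy_set_nv and toy_iv), and the sketch assumes the double fits in a long; it only mimics the rule that an inexact integer view gets the private flag but not the public one.

```c
#include <assert.h>
#include <stddef.h>

enum {
    TOY_IOK  = 1 << 0,  /* public: the integer slot may be used freely */
    TOY_NOK  = 1 << 1,  /* public: the double slot may be used freely */
    TOY_pIOK = 1 << 2,  /* private: the integer slot holds something */
    TOY_pNOK = 1 << 3   /* private: the double slot holds something */
};

typedef struct {
    unsigned flags;
    long     iv;
    double   nv;
} ToyNum;

/* Store a double; any cached integer view is forgotten. */
void toy_set_nv(ToyNum *n, double value)
{
    assert(n != NULL);
    n->nv    = value;
    n->flags = TOY_NOK | TOY_pNOK;
}

/* Return an integer view of the value, caching it on first use.
 * Assumes the double is within the range of long. */
long toy_iv(ToyNum *n)
{
    assert(n != NULL);
    if (!(n->flags & TOY_pIOK)) {
        n->iv = (long)n->nv;          /* truncates toward zero */
        n->flags |= TOY_pIOK;
        if ((double)n->iv == n->nv)
            n->flags |= TOY_IOK;      /* exact, so publicly an integer too */
    }
    return n->iv;
}
```

After toy_iv runs on 3.5, the private integer flag and both double flags are set but the public integer flag is not, which is the combination the paragraph above describes.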

In general, though, it's best to use the Sv*V macros.

Working with AVs

There are two ways to create and load an AV. The first method creates an empty AV:

  1. AV* newAV();

The second method both creates the AV and initially populates it with SVs:

  1. AV* av_make(I32 num, SV **ptr);

The second argument points to an array containing num SV* 's. Once the AV has been created, the SVs can be destroyed, if so desired.

Once the AV has been created, the following operations are possible on it:

  1. void av_push(AV*, SV*);
  2. SV* av_pop(AV*);
  3. SV* av_shift(AV*);
  4. void av_unshift(AV*, I32 num);

These should be familiar operations, with the exception of av_unshift . This routine adds num elements at the front of the array with the undef value. You must then use av_store (described below) to assign values to these new elements.

Here are some other functions:

  1. I32 av_top_index(AV*);
  2. SV** av_fetch(AV*, I32 key, I32 lval);
  3. SV** av_store(AV*, I32 key, SV* val);

The av_top_index function returns the highest index value in an array (just like $#array in Perl). If the array is empty, -1 is returned. The av_fetch function returns the value at index key , but if lval is non-zero, then av_fetch will store an undef value at that index. The av_store function stores the value val at index key , and does not increment the reference count of val . Thus the caller is responsible for taking care of that, and if av_store returns NULL, the caller will have to decrement the reference count to avoid a memory leak. Note that av_fetch and av_store both return SV** 's, not SV* 's as their return value.

A few more:

  1. void av_clear(AV*);
  2. void av_undef(AV*);
  3. void av_extend(AV*, I32 key);

The av_clear function deletes all the elements in the AV* array, but does not actually delete the array itself. The av_undef function will delete all the elements in the array plus the array itself. The av_extend function extends the array so that it contains at least key+1 elements. If key+1 is less than the currently allocated length of the array, then nothing is done.

If you know the name of an array variable, you can get a pointer to its AV by using the following:

  1. AV* get_av("package::varname", 0);

This returns NULL if the variable does not exist.

See Understanding the Magic of Tied Hashes and Arrays for more information on how to use the array access functions on tied arrays.

Working with HVs

To create an HV, you use the following routine:

  1. HV* newHV();

Once the HV has been created, the following operations are possible on it:

  1. SV** hv_store(HV*, const char* key, U32 klen, SV* val, U32 hash);
  2. SV** hv_fetch(HV*, const char* key, U32 klen, I32 lval);

The klen parameter is the length of the key being passed in. (Note that you cannot pass 0 as the value of klen to tell Perl to measure the length of the key.) The val argument contains the SV pointer to the scalar being stored, and hash is the precomputed hash value (zero if you want hv_store to calculate it for you). The lval parameter indicates whether this fetch is actually a part of a store operation, in which case a new undefined value will be added to the HV with the supplied key and hv_fetch will return as if the value had already existed.

Remember that hv_store and hv_fetch return SV** 's and not just SV* . To access the scalar value, you must first dereference the return value. However, you should check to make sure that the return value is not NULL before dereferencing it.

The first of these two functions checks if a hash table entry exists, and the second deletes it.

  1. bool hv_exists(HV*, const char* key, U32 klen);
  2. SV* hv_delete(HV*, const char* key, U32 klen, I32 flags);

If flags does not include the G_DISCARD flag then hv_delete will create and return a mortal copy of the deleted value.

And more miscellaneous functions:

  1. void hv_clear(HV*);
  2. void hv_undef(HV*);

Like their AV counterparts, hv_clear deletes all the entries in the hash table but does not actually delete the hash table. The hv_undef function deletes both the entries and the hash table itself.

Perl keeps the actual data in a linked list of structures with a typedef of HE. These contain the actual key and value pointers (plus extra administrative overhead). The key is a string pointer; the value is an SV* . However, once you have an HE* , to get the actual key and value, use the routines specified below.

    I32    hv_iterinit(HV*);
            /* Prepares starting point to traverse hash table */
    HE*    hv_iternext(HV*);
            /* Get the next entry, and return a pointer to a
               structure that has both the key and value */
    char*  hv_iterkey(HE* entry, I32* retlen);
            /* Get the key from an HE structure and also return
               the length of the key string */
    SV*    hv_iterval(HV*, HE* entry);
            /* Return an SV pointer to the value of the HE
               structure */
    SV*    hv_iternextsv(HV*, char** key, I32* retlen);
            /* This convenience routine combines hv_iternext,
               hv_iterkey, and hv_iterval.  The key and retlen
               arguments are return values for the key and its
               length.  The value is returned as the function's
               SV* result */

If you know the name of a hash variable, you can get a pointer to its HV by using the following:

  1. HV* get_hv("package::varname", 0);

This returns NULL if the variable does not exist.

The hash algorithm is defined in the PERL_HASH macro:

  1. PERL_HASH(hash, key, klen)

The exact implementation of this macro varies by architecture and version of perl, and the return value may change per invocation, so the value is only valid for the duration of a single perl process.

See Understanding the Magic of Tied Hashes and Arrays for more information on how to use the hash access functions on tied hashes.

Hash API Extensions

Beginning with version 5.004, the following functions are also supported:

  1. HE* hv_fetch_ent (HV* tb, SV* key, I32 lval, U32 hash);
  2. HE* hv_store_ent (HV* tb, SV* key, SV* val, U32 hash);
  3. bool hv_exists_ent (HV* tb, SV* key, U32 hash);
  4. SV* hv_delete_ent (HV* tb, SV* key, I32 flags, U32 hash);
  5. SV* hv_iterkeysv (HE* entry);

Note that these functions take SV* keys, which simplifies writing of extension code that deals with hash structures. These functions also allow passing of SV* keys to tie functions without forcing you to stringify the keys (unlike the previous set of functions).

They also return and accept whole hash entries (HE* ), making their use more efficient (since the hash number for a particular string doesn't have to be recomputed every time). See perlapi for detailed descriptions.

The following macros must always be used to access the contents of hash entries. Note that the arguments to these macros must be simple variables, since they may get evaluated more than once. See perlapi for detailed descriptions of these macros.

  1. HePV(HE* he, STRLEN len)
  2. HeVAL(HE* he)
  3. HeHASH(HE* he)
  4. HeSVKEY(HE* he)
  5. HeSVKEY_force(HE* he)
  6. HeSVKEY_set(HE* he, SV* sv)

These two lower level macros are defined, but must only be used when dealing with keys that are not SV* s:

  1. HeKEY(HE* he)
  2. HeKLEN(HE* he)

Note that neither hv_store nor hv_store_ent increments the reference count of the stored val; that is the caller's responsibility. If these functions return a NULL value, the caller will usually have to decrement the reference count of val to avoid a memory leak.

AVs, HVs and undefined values

Sometimes you have to store undefined values in AVs or HVs. Although this may be a rare case, it can be tricky. That's because you're used to using &PL_sv_undef if you need an undefined SV.

For example, intuition tells you that this XS code:

  1. AV *av = newAV();
  2. av_store( av, 0, &PL_sv_undef );

is equivalent to this Perl code:

  1. my @av;
  2. $av[0] = undef;

Unfortunately, this isn't true. AVs use &PL_sv_undef as a marker for indicating that an array element has not yet been initialized. Thus, exists $av[0] would be true for the above Perl code, but false for the array generated by the XS code.

Other problems can occur when storing &PL_sv_undef in HVs:

  1. hv_store( hv, "key", 3, &PL_sv_undef, 0 );

This will indeed make the value undef, but if you try to modify the value of key , you'll get the following error:

  1. Modification of non-creatable hash value attempted

In perl 5.8.0, &PL_sv_undef was also used to mark placeholders in restricted hashes. This caused such hash entries not to appear when iterating over the hash or when checking for the keys with the hv_exists function.

You can run into similar problems when you store &PL_sv_yes or &PL_sv_no into AVs or HVs. Trying to modify such elements will give you the following error:

  1. Modification of a read-only value attempted

To make a long story short, you can use the special variables &PL_sv_undef , &PL_sv_yes and &PL_sv_no with AVs and HVs, but you have to make sure you know what you're doing.

Generally, if you want to store an undefined value in an AV or HV, you should not use &PL_sv_undef , but rather create a new undefined value using the newSV function, for example:

  1. av_store( av, 42, newSV(0) );
  2. hv_store( hv, "foo", 3, newSV(0), 0 );

References

References are a special type of scalar that point to other data types (including other references).

To create a reference, use either of the following functions:

  1. SV* newRV_inc((SV*) thing);
  2. SV* newRV_noinc((SV*) thing);

The thing argument can be any of an SV* , AV* , or HV* . The functions are identical except that newRV_inc increments the reference count of the thing , while newRV_noinc does not. For historical reasons, newRV is a synonym for newRV_inc .

Once you have a reference, you can use the following macro to dereference the reference:

  1. SvRV(SV*)

then call the appropriate routines, casting the returned SV* to either an AV* or HV* , if required.

To determine if an SV is a reference, you can use the following macro:

  1. SvROK(SV*)

To discover what type of value the reference refers to, use the following macro and then check the return value.

  1. SvTYPE(SvRV(SV*))

The most useful types that will be returned are:

    < SVt_PVAV  Scalar
    SVt_PVAV    Array
    SVt_PVHV    Hash
    SVt_PVCV    Code
    SVt_PVGV    Glob (possibly a file handle)

See svtype in perlapi for more details.

Blessed References and Class Objects

References are also used to support object-oriented programming. In perl's OO lexicon, an object is simply a reference that has been blessed into a package (or class). Once blessed, the programmer may now use the reference to access the various methods in the class.

A reference can be blessed into a package with the following function:

  1. SV* sv_bless(SV* sv, HV* stash);

The sv argument must be a reference value. The stash argument specifies which class the reference will belong to. See Stashes and Globs for information on converting class names into stashes.

/* Still under construction */

The following function upgrades rv to a reference if it is not one already, and creates a new SV for rv to point to. If classname is non-null, the SV is blessed into the specified class. The new SV is returned.

  1. SV* newSVrv(SV* rv, const char* classname);

The following three functions copy integer, unsigned integer or double into an SV whose reference is rv . SV is blessed if classname is non-null.

    SV* sv_setref_iv(SV* rv, const char* classname, IV iv);
    SV* sv_setref_uv(SV* rv, const char* classname, UV uv);
    SV* sv_setref_nv(SV* rv, const char* classname, NV nv);

The following function copies the pointer value (the address, not the string!) into an SV whose reference is rv. SV is blessed if classname is non-null.

  1. SV* sv_setref_pv(SV* rv, const char* classname, void* pv);

The following function copies a string into an SV whose reference is rv . Set length to 0 to let Perl calculate the string length. SV is blessed if classname is non-null.

    SV* sv_setref_pvn(SV* rv, const char* classname, char* pv,
                      STRLEN length);

The following function tests whether the SV is blessed into the specified class. It does not check inheritance relationships.

  1. int sv_isa(SV* sv, const char* name);

The following function tests whether the SV is a reference to a blessed object.

  1. int sv_isobject(SV* sv);

The following function tests whether the SV is derived from the specified class. SV can be either a reference to a blessed object or a string containing a class name. This is the function implementing the UNIVERSAL::isa functionality.

  1. bool sv_derived_from(SV* sv, const char* name);

To check if you've got an object derived from a specific class you have to write:

  1. if (sv_isobject(sv) && sv_derived_from(sv, class)) { ... }

Creating New Variables

To create a new Perl variable with an undef value which can be accessed from your Perl script, use the following routines, depending on the variable type.

  1. SV* get_sv("package::varname", GV_ADD);
  2. AV* get_av("package::varname", GV_ADD);
  3. HV* get_hv("package::varname", GV_ADD);

Notice the use of GV_ADD as the second parameter. The new variable can now be set, using the routines appropriate to the data type.

There are additional macros whose values may be bitwise OR'ed with the GV_ADD argument to enable certain extra features. Those bits are:

  • GV_ADDMULTI

    Marks the variable as multiply defined, thus preventing the:

    1. Name <varname> used only once: possible typo

    warning.

  • GV_ADDWARN

    Issues the warning:

    1. Had to create <varname> unexpectedly

    if the variable did not exist before the function was called.

If you do not specify a package name, the variable is created in the current package.

Reference Counts and Mortality

Perl uses a reference count-driven garbage collection mechanism. SVs, AVs, or HVs (xV for short in the following) start their life with a reference count of 1. If the reference count of an xV ever drops to 0, then it will be destroyed and its memory made available for reuse.

This normally doesn't happen at the Perl level unless a variable is undef'ed or the last variable holding a reference to it is changed or overwritten. At the internal level, however, reference counts can be manipulated with the following macros:

  1. int SvREFCNT(SV* sv);
  2. SV* SvREFCNT_inc(SV* sv);
  3. void SvREFCNT_dec(SV* sv);

However, there is one other function which manipulates the reference count of its argument. The newRV_inc function, you will recall, creates a reference to the specified argument. As a side effect, it increments the argument's reference count. If this is not what you want, use newRV_noinc instead.

For example, imagine you want to return a reference from an XSUB function. Inside the XSUB routine, you create an SV which initially has a reference count of one. Then you call newRV_inc , passing it the just-created SV. This returns the reference as a new SV, but the reference count of the SV you passed to newRV_inc has been incremented to two. Now you return the reference from the XSUB routine and forget about the SV. But Perl hasn't! Whenever the returned reference is destroyed, the reference count of the original SV is decreased to one and nothing happens. The SV will hang around without any way to access it until Perl itself terminates. This is a memory leak.

The correct procedure, then, is to use newRV_noinc instead of newRV_inc . Then, if and when the last reference is destroyed, the reference count of the SV will go to zero and it will be destroyed, stopping any memory leak.
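The difference between the two constructors can be traced with a toy reference counter. This is a plain C sketch rather than Perl's code: ToyVal, toy_live, toy_new, toy_dec, toy_ref_inc and toy_ref_noinc are hypothetical names that only imitate the counting rules described above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyVal {
    int            refcnt;  /* starts at 1, like a new SV */
    struct ToyVal *target;  /* non-NULL when this value is a reference */
} ToyVal;

int toy_live = 0;           /* values allocated and not yet destroyed */

ToyVal *toy_new(void)
{
    ToyVal *v = calloc(1, sizeof *v);
    if (v == NULL)
        abort();
    v->refcnt = 1;
    toy_live++;
    return v;
}

/* Drop one count; destroy the value when none are left. */
void toy_dec(ToyVal *v)
{
    assert(v->refcnt > 0);
    if (--v->refcnt == 0) {
        if (v->target != NULL)
            toy_dec(v->target);  /* a dying reference releases its target */
        free(v);
        toy_live--;
    }
}

/* Shaped like newRV_inc: the new reference takes an extra count. */
ToyVal *toy_ref_inc(ToyVal *thing)
{
    ToyVal *rv = toy_new();
    rv->target = thing;
    thing->refcnt++;
    return rv;
}

/* Shaped like newRV_noinc: the new reference adopts the caller's count. */
ToyVal *toy_ref_noinc(ToyVal *thing)
{
    ToyVal *rv = toy_new();
    rv->target = thing;
    return rv;
}
```

If a value is created, wrapped with toy_ref_inc, and then only the reference is released, the value is left behind with a count of one. With toy_ref_noinc, the same sequence destroys both.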

There are some convenience functions available that can help with the destruction of xVs. These functions introduce the concept of "mortality". An xV that is mortal has had its reference count marked to be decremented, but not actually decremented, until "a short time later". Generally the term "short time later" means a single Perl statement, such as a call to an XSUB function. The actual determinant for when mortal xVs have their reference count decremented depends on two macros, SAVETMPS and FREETMPS. See perlcall and perlxs for more details on these macros.

"Mortalization" then is at its simplest a deferred SvREFCNT_dec . However, if you mortalize a variable twice, the reference count will later be decremented twice.

"Mortal" SVs are mainly used for SVs that are placed on perl's stack. For example an SV which is created just to pass a number to a called sub is made mortal to have it cleaned up automatically when it's popped off the stack. Similarly, results returned by XSUBs (which are pushed on the stack) are often made mortal.

To create a mortal variable, use the functions:

  1. SV* sv_newmortal()
  2. SV* sv_2mortal(SV*)
  3. SV* sv_mortalcopy(SV*)

The first call creates a mortal SV (with no value), the second converts an existing SV to a mortal SV (and thus defers a call to SvREFCNT_dec ), and the third creates a mortal copy of an existing SV. Because sv_newmortal gives the new SV no value, it must normally be given one via sv_setpv , sv_setiv , etc. :

  1. SV *tmp = sv_newmortal();
  2. sv_setiv(tmp, an_integer);

As that is multiple C statements, it is quite common to see this idiom instead:

  1. SV *tmp = sv_2mortal(newSViv(an_integer));

You should be careful about creating mortal variables. Strange things can happen if you make the same value mortal within multiple contexts, or if you make a variable mortal multiple times. Thinking of "Mortalization" as a deferred SvREFCNT_dec should help to minimize such problems. For example, if you are passing an SV which you know has a high enough REFCNT to survive its use on the stack, you need not do any mortalization. If you are not sure, then doing an SvREFCNT_inc and sv_2mortal, or making an sv_mortalcopy, is safer.
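The "deferred SvREFCNT_dec" view can be made concrete with a toy temporaries stack. This is a sketch in plain C, loosely modelled on the description above and not on Perl's source: ToyVal, toy_2mortal, toy_save_tmps and toy_free_tmps are hypothetical, and a freed flag stands in for real destruction so the effect can be inspected.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TMPS_MAX 16

typedef struct {
    int refcnt;
    int freed;   /* set when the count reaches zero */
} ToyVal;

static ToyVal *toy_tmps[TOY_TMPS_MAX];  /* pending deferred decrements */
static size_t  toy_tmps_top = 0;

void toy_dec(ToyVal *v)
{
    assert(v->refcnt > 0);
    if (--v->refcnt == 0)
        v->freed = 1;
}

/* Shaped like sv_2mortal: schedule one decrement for later. */
ToyVal *toy_2mortal(ToyVal *v)
{
    assert(toy_tmps_top < TOY_TMPS_MAX);
    toy_tmps[toy_tmps_top++] = v;
    return v;
}

/* Plays the role of SAVETMPS: note where this scope's temporaries begin. */
size_t toy_save_tmps(void)
{
    return toy_tmps_top;
}

/* Plays the role of FREETMPS: run every decrement scheduled since floor. */
void toy_free_tmps(size_t floor)
{
    while (toy_tmps_top > floor)
        toy_dec(toy_tmps[--toy_tmps_top]);
}
```

Nothing happens to a value when it is made mortal; the decrement runs only when the temporaries are freed. A value mortalized twice is decremented twice at that point, so it needs a count of at least two beforehand to be handled safely.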

The mortal routines are not just for SVs; AVs and HVs can be made mortal by passing their address (type-casted to SV* ) to the sv_2mortal or sv_mortalcopy routines.

Stashes and Globs

A stash is a hash that contains all variables that are defined within a package. Each key of the stash is a symbol name (shared by all the different types of objects that have the same name), and each value in the hash table is a GV (Glob Value). This GV in turn contains references to the various objects of that name, including (but not limited to) the following:

  1. Scalar Value
  2. Array Value
  3. Hash Value
  4. I/O Handle
  5. Format
  6. Subroutine

There is a single stash called PL_defstash that holds the items that exist in the main package. To get at the items in other packages, append the string "::" to the package name. The items in the Foo package are in the stash Foo:: in PL_defstash. The items in the Bar::Baz package are in the stash Baz:: in Bar:: 's stash.

To get the stash pointer for a particular package, use the function:

  1. HV* gv_stashpv(const char* name, I32 flags)
  2. HV* gv_stashsv(SV*, I32 flags)

The first function takes a literal string, the second uses the string stored in the SV. Remember that a stash is just a hash table, so you get back an HV*. The flags argument will cause a new package to be created if it is set to GV_ADD.

The name that gv_stash*v wants is the name of the package whose symbol table you want. The default package is called main . If you have multiply nested packages, pass their names to gv_stash*v , separated by :: as in the Perl language itself.

Alternately, if you have an SV that is a blessed reference, you can find out the stash pointer by using:

  1. HV* SvSTASH(SvRV(SV*));

then use the following to get the package name itself:

  1. char* HvNAME(HV* stash);

If you need to bless or re-bless an object you can use the following function:

  1. SV* sv_bless(SV*, HV* stash)

where the first argument, an SV* , must be a reference, and the second argument is a stash. The returned SV* can now be used in the same way as any other SV.

For more information on references and blessings, consult perlref.

Double-Typed SVs

Scalar variables normally contain only one type of value, an integer, double, pointer, or reference. Perl will automatically convert the actual scalar data from the stored type into the requested type.

Some scalar variables contain more than one type of scalar data. For example, the variable $! contains either the numeric value of errno or its string equivalent from either strerror or sys_errlist[] .

To force multiple data values into an SV, you must do two things: use the sv_set*v routines to add the additional scalar type, then set a flag so that Perl will believe it contains more than one type of data. The four macros to set the flags are:

  1. SvIOK_on
  2. SvNOK_on
  3. SvPOK_on
  4. SvROK_on

The particular macro you must use depends on which sv_set*v routine you called first. This is because every sv_set*v routine turns on only the bit for the particular type of data being set, and turns off all the rest.

For example, to create a new Perl variable called "dberror" that contains both the numeric and descriptive string error values, you could use the following code:

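    extern int  dberror;
    extern char *dberror_list[];   /* one message per error number */

    SV* sv = get_sv("dberror", GV_ADD);
    sv_setiv(sv, (IV) dberror);           /* numeric value; sets IOK only */
    sv_setpv(sv, dberror_list[dberror]);  /* string value; sets POK only */
    SvIOK_on(sv);                         /* turn the integer flag back on */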

If the order of sv_setiv and sv_setpv had been reversed, then the macro SvPOK_on would need to be called instead of SvIOK_on .
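Why the order of the calls decides which flag has to be restored can be seen in a toy model. This is a plain C sketch with hypothetical names (ToyDual, toy_setiv, toy_setpv, toy_iok_on); it imitates only the rule stated above, that each setter turns on its own flag and turns off the rest.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_IOK = 1, TOY_POK = 2 };

typedef struct {
    unsigned    flags;
    long        iv;
    const char *pv;
} ToyDual;

/* Each setter fills its own slot and claims the value for its type alone. */
void toy_setiv(ToyDual *d, long iv)
{
    assert(d != NULL);
    d->iv    = iv;
    d->flags = TOY_IOK;
}

void toy_setpv(ToyDual *d, const char *pv)
{
    assert(d != NULL);
    d->pv    = pv;
    d->flags = TOY_POK;
}

/* Re-assert that the integer slot, which was left intact, is still valid. */
void toy_iok_on(ToyDual *d)
{
    assert(d != NULL);
    d->flags |= TOY_IOK;
}
```

After toy_setiv followed by toy_setpv, only the string flag is set even though the integer slot still holds its value, so it is the integer flag that must be switched back on.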

Magic Variables

[This section still under construction. Ignore everything here. Post no bills. Everything not permitted is forbidden.]

Any SV may be magical, that is, it has special features that a normal SV does not have. These features are stored in the SV structure in a linked list of struct magic 's, typedef'ed to MAGIC .

    struct magic {
        MAGIC*      mg_moremagic;
        MGVTBL*     mg_virtual;
        U16         mg_private;
        char        mg_type;
        U8          mg_flags;
        I32         mg_len;
        SV*         mg_obj;
        char*       mg_ptr;
    };

Note this is current as of patchlevel 0, and could change at any time.

Assigning Magic

Perl adds magic to an SV using the sv_magic function:

  1. void sv_magic(SV* sv, SV* obj, int how, const char* name, I32 namlen);

The sv argument is a pointer to the SV that is to acquire a new magical feature.

If sv is not already magical, Perl uses the SvUPGRADE macro to convert sv to type SVt_PVMG . Perl then continues by adding new magic to the beginning of the linked list of magical features. Any prior entry of the same type of magic is deleted. Note that this can be overridden, and multiple instances of the same type of magic can be associated with an SV.

The name and namlen arguments are used to associate a string with the magic, typically the name of a variable. namlen is stored in the mg_len field and if name is non-null then either a savepvn copy of name or name itself is stored in the mg_ptr field, depending on whether namlen is greater than zero or equal to zero respectively. As a special case, if (name && namlen == HEf_SVKEY) then name is assumed to contain an SV* and is stored as-is with its REFCNT incremented.

The sv_magic function uses how to determine which, if any, predefined "Magic Virtual Table" should be assigned to the mg_virtual field. See the Magic Virtual Tables section below. The how argument is also stored in the mg_type field. The value of how should be chosen from the set of macros PERL_MAGIC_foo found in perl.h. Note that before these macros were added, Perl internals used to directly use character literals, so you may occasionally come across old code or documentation referring to 'U' magic rather than PERL_MAGIC_uvar for example.

The obj argument is stored in the mg_obj field of the MAGIC structure. If it is not the same as the sv argument, the reference count of the obj object is incremented. If it is the same, or if the how argument is PERL_MAGIC_arylen , or if it is a NULL pointer, then obj is merely stored, without the reference count being incremented.

See also sv_magicext in perlapi for a more flexible way to add magic to an SV.

There is also a function to add magic to an HV :

  1. void hv_magic(HV *hv, GV *gv, int how);

This simply calls sv_magic and coerces the gv argument into an SV .

To remove the magic from an SV, call the function sv_unmagic:

  1. int sv_unmagic(SV *sv, int type);

The type argument should be equal to the how value when the SV was initially made magical.

However, note that sv_unmagic removes all magic of a certain type from the SV . If you want to remove only certain magic of a type based on the magic virtual table, use sv_unmagicext instead:

  1. int sv_unmagicext(SV *sv, int type, MGVTBL *vtbl);

Magic Virtual Tables

The mg_virtual field in the MAGIC structure is a pointer to an MGVTBL (the name stands for "Magic Virtual Table"), which is a structure of function pointers that handle the various operations that might be applied to that variable.

The MGVTBL has five (or sometimes eight) pointers to the following routine types:

    int  (*svt_get)(SV* sv, MAGIC* mg);
    int  (*svt_set)(SV* sv, MAGIC* mg);
    U32  (*svt_len)(SV* sv, MAGIC* mg);
    int  (*svt_clear)(SV* sv, MAGIC* mg);
    int  (*svt_free)(SV* sv, MAGIC* mg);
    int  (*svt_copy)(SV *sv, MAGIC* mg, SV *nsv,
                     const char *name, I32 namlen);
    int  (*svt_dup)(MAGIC *mg, CLONE_PARAMS *param);
    int  (*svt_local)(SV *nsv, MAGIC *mg);

This MGVTBL structure is set at compile-time in perl.h and there are currently 32 types. These different structures contain pointers to various routines that perform additional actions depending on which function is being called.

    Function pointer    Action taken
    ----------------    ------------
    svt_get             Do something before the value of the SV is
                        retrieved.
    svt_set             Do something after the SV is assigned a value.
    svt_len             Report on the SV's length.
    svt_clear           Clear something the SV represents.
    svt_free            Free any extra storage associated with the SV.
    svt_copy            Copy tied variable magic to a tied element.
    svt_dup             Duplicate a magic structure during thread cloning.
    svt_local           Copy magic to local value during 'local'.

For instance, the MGVTBL structure called vtbl_sv (which corresponds to an mg_type of PERL_MAGIC_sv) contains:

    { magic_get, magic_set, magic_len, 0, 0 }

Thus, when an SV is determined to be magical and of type PERL_MAGIC_sv, if a get operation is being performed, the routine magic_get is called. All the various routines for the various magical types begin with magic_. NOTE: the magic routines are not considered part of the Perl API, and may not be exported by the Perl library.

The last three slots are a recent addition, and for source code compatibility they are only checked for if one of the three flags MGf_COPY, MGf_DUP or MGf_LOCAL is set in mg_flags. This means that most code can continue declaring a vtable as a 5-element value. These three are currently used exclusively by the threading code, and are highly subject to change.
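The dispatch through a vtable described above can be sketched in plain C, without the Perl headers. Everything in this sketch is an assumption made for illustration: toy_sv, toy_magic and toy_vtbl are hypothetical stand-ins for SV, MAGIC and MGVTBL, only the get and set slots are modelled, and the real interpreter does considerably more work around each call.

```c
#include <stddef.h>

/* Hypothetical stand-ins for SV, MAGIC and MGVTBL; not the real Perl types. */
typedef struct toy_sv    toy_sv;
typedef struct toy_magic toy_magic;

typedef struct {
    int (*get)(toy_sv *sv, toy_magic *mg);  /* plays the role of svt_get */
    int (*set)(toy_sv *sv, toy_magic *mg);  /* plays the role of svt_set */
} toy_vtbl;

struct toy_magic {
    toy_magic      *next;  /* next entry in the SV's magic chain */
    const toy_vtbl *virt;  /* plays the role of mg_virtual */
    char            type;  /* plays the role of mg_type */
};

struct toy_sv {
    int        value;
    toy_magic *magic;      /* head of the magic chain, or NULL */
    int        get_calls;  /* bookkeeping for the demonstration only */
    int        set_calls;
};

/* Run every non-NULL "get" hook before the value is retrieved. */
int toy_fetch(toy_sv *sv) {
    for (toy_magic *mg = sv->magic; mg; mg = mg->next)
        if (mg->virt && mg->virt->get)
            mg->virt->get(sv, mg);
    return sv->value;
}

/* Assign the value, then run every non-NULL "set" hook. */
void toy_store(toy_sv *sv, int value) {
    sv->value = value;
    for (toy_magic *mg = sv->magic; mg; mg = mg->next)
        if (mg->virt && mg->virt->set)
            mg->virt->set(sv, mg);
}

/* Example hooks that only count how often they were invoked. */
static int toy_count_get(toy_sv *sv, toy_magic *mg) {
    (void)mg;
    sv->get_calls++;
    return 0;
}

static int toy_count_set(toy_sv *sv, toy_magic *mg) {
    (void)mg;
    sv->set_calls++;
    return 0;
}

/* A vtable with both slots filled, and one with all slots zero. */
const toy_vtbl toy_counting_vtbl = { toy_count_get, toy_count_set };
const toy_vtbl toy_empty_vtbl    = { NULL, NULL };
```

A zero slot simply means "no action for this operation", which is why a vtable such as vtbl_sv can leave its clear and free entries as 0.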

The current kinds of Magic Virtual Tables are:

    mg_type
    (old-style char and macro)   MGVTBL          Type of magic
    --------------------------   ------          -------------
    \0 PERL_MAGIC_sv             vtbl_sv         Special scalar variable
    #  PERL_MAGIC_arylen         vtbl_arylen     Array length ($#ary)
    %  PERL_MAGIC_rhash          (none)          extra data for restricted
                                                 hashes
    &  PERL_MAGIC_proto          (none)          my sub prototype CV
    .  PERL_MAGIC_pos            vtbl_pos        pos() lvalue
    :  PERL_MAGIC_symtab         (none)          extra data for symbol
                                                 tables
    <  PERL_MAGIC_backref        vtbl_backref    for weak ref data
    @  PERL_MAGIC_arylen_p       (none)          to move arylen out of XPVAV
    B  PERL_MAGIC_bm             vtbl_regexp     Boyer-Moore
                                                 (fast string search)
    c  PERL_MAGIC_overload_table vtbl_ovrld      Holds overload table
                                                 (AMT) on stash
    D  PERL_MAGIC_regdata        vtbl_regdata    Regex match position data
                                                 (@+ and @- vars)
    d  PERL_MAGIC_regdatum       vtbl_regdatum   Regex match position data
                                                 element
    E  PERL_MAGIC_env            vtbl_env        %ENV hash
    e  PERL_MAGIC_envelem        vtbl_envelem    %ENV hash element
    f  PERL_MAGIC_fm             vtbl_regexp     Formline
                                                 ('compiled' format)
    g  PERL_MAGIC_regex_global   vtbl_mglob      m//g target
    H  PERL_MAGIC_hints          vtbl_hints      %^H hash
    h  PERL_MAGIC_hintselem      vtbl_hintselem  %^H hash element
    I  PERL_MAGIC_isa            vtbl_isa        @ISA array
    i  PERL_MAGIC_isaelem        vtbl_isaelem    @ISA array element
    k  PERL_MAGIC_nkeys          vtbl_nkeys      scalar(keys()) lvalue
    L  PERL_MAGIC_dbfile         (none)          Debugger %_<filename
    l  PERL_MAGIC_dbline         vtbl_dbline     Debugger %_<filename
                                                 element
    N  PERL_MAGIC_shared         (none)          Shared between threads
    n  PERL_MAGIC_shared_scalar  (none)          Shared between threads
    o  PERL_MAGIC_collxfrm       vtbl_collxfrm   Locale transformation
    P  PERL_MAGIC_tied           vtbl_pack       Tied array or hash
    p  PERL_MAGIC_tiedelem       vtbl_packelem   Tied array or hash element
    q  PERL_MAGIC_tiedscalar     vtbl_packelem   Tied scalar or handle
    r  PERL_MAGIC_qr             vtbl_regexp     precompiled qr// regex
    S  PERL_MAGIC_sig            (none)          %SIG hash
    s  PERL_MAGIC_sigelem        vtbl_sigelem    %SIG hash element
    t  PERL_MAGIC_taint          vtbl_taint      Taintedness
    U  PERL_MAGIC_uvar           vtbl_uvar       Available for use by
                                                 extensions
    u  PERL_MAGIC_uvar_elem      (none)          Reserved for use by
                                                 extensions
    V  PERL_MAGIC_vstring        (none)          SV was vstring literal
    v  PERL_MAGIC_vec            vtbl_vec        vec() lvalue
    w  PERL_MAGIC_utf8           vtbl_utf8       Cached UTF-8 information
    x  PERL_MAGIC_substr         vtbl_substr     substr() lvalue
    y  PERL_MAGIC_defelem        vtbl_defelem    Shadow "foreach" iterator
                                                 variable / smart parameter
                                                 vivification
    ]  PERL_MAGIC_checkcall      vtbl_checkcall  inlining/mutation of call
                                                 to this CV
    ~  PERL_MAGIC_ext            (none)          Available for use by
                                                 extensions

When an uppercase and lowercase letter both exist in the table, then the uppercase letter is typically used to represent some kind of composite type (a list or a hash), and the lowercase letter is used to represent an element of that composite type. Some internals code makes use of this case relationship. However, 'v' and 'V' (vec and v-string) are in no way related.

The PERL_MAGIC_ext and PERL_MAGIC_uvar magic types are defined specifically for use by extensions and will not be used by perl itself. Extensions can use PERL_MAGIC_ext magic to 'attach' private information to variables (typically objects). This is especially useful because there is no way for normal perl code to corrupt this private information (unlike using extra elements of a hash object).

Similarly, PERL_MAGIC_uvar magic can be used much like tie() to call a C function any time a scalar's value is used or changed. The MAGIC's mg_ptr field points to a ufuncs structure:

    struct ufuncs {
        I32 (*uf_val)(pTHX_ IV, SV*);
        I32 (*uf_set)(pTHX_ IV, SV*);
        IV uf_index;
    };

When the SV is read from or written to, the uf_val or uf_set function will be called with uf_index as the first arg and a pointer to the SV as the second. A simple example of how to add PERL_MAGIC_uvar magic is shown below. Note that the ufuncs structure is copied by sv_magic, so you can safely allocate it on the stack.

    void
    Umagic(sv)
        SV *sv;
    PREINIT:
        struct ufuncs uf;
    CODE:
        uf.uf_val   = &my_get_fn;
        uf.uf_set   = &my_set_fn;
        uf.uf_index = 0;
        sv_magic(sv, 0, PERL_MAGIC_uvar, (char*)&uf, sizeof(uf));

Attaching PERL_MAGIC_uvar to arrays is permissible but has no effect.

For hashes there is a specialized hook that gives control over hash keys (but not values). This hook calls PERL_MAGIC_uvar 'get' magic if the "set" function in the ufuncs structure is NULL. The hook is activated whenever the hash is accessed with a key specified as an SV through the functions hv_store_ent, hv_fetch_ent, hv_delete_ent, and hv_exists_ent. Accessing the key as a string through the functions without the ..._ent suffix circumvents the hook. See GUTS in Hash::Util::FieldHash for a detailed description.

Note that because multiple extensions may be using PERL_MAGIC_ext or PERL_MAGIC_uvar magic, it is important for extensions to take extra care to avoid conflict. Typically only using the magic on objects blessed into the same class as the extension is sufficient. For PERL_MAGIC_ext magic, it is usually a good idea to define an MGVTBL, even if all its fields will be 0, so that individual MAGIC pointers can be identified as a particular kind of magic using their magic virtual table. mg_findext provides an easy way to do that:

    STATIC MGVTBL my_vtbl = { 0, 0, 0, 0, 0, 0, 0, 0 };

    MAGIC *mg;
    if ((mg = mg_findext(sv, PERL_MAGIC_ext, &my_vtbl))) {
        /* this is really ours, not another module's PERL_MAGIC_ext */
        my_priv_data_t *priv = (my_priv_data_t *)mg->mg_ptr;
        ...
    }

Also note that the sv_set*() and sv_cat*() functions described earlier do not invoke 'set' magic on their targets. This must be done by the user either by calling the SvSETMAGIC() macro after calling these functions, or by using one of the sv_set*_mg() or sv_cat*_mg() functions. Similarly, generic C code must call the SvGETMAGIC() macro to invoke any 'get' magic if they use an SV obtained from external sources in functions that don't handle magic. See perlapi for a description of these functions. For example, calls to the sv_cat*() functions typically need to be followed by SvSETMAGIC(), but they don't need a prior SvGETMAGIC() since their implementation handles 'get' magic.

Finding Magic

    MAGIC *mg_find(SV *sv, int type); /* Finds the magic pointer of that
                                       * type */

This routine returns a pointer to a MAGIC structure stored in the SV. If the SV does not have that magical feature, NULL is returned. If the SV has multiple instances of that magical feature, the first one will be returned. mg_findext can be used to find a MAGIC structure of an SV based on both its magic type and its magic virtual table:

    MAGIC *mg_findext(SV *sv, int type, MGVTBL *vtbl);

Also, if the SV passed to mg_find or mg_findext is not of type SVt_PVMG, Perl may core dump.

    int mg_copy(SV* sv, SV* nsv, const char* key, STRLEN klen);

This routine checks to see what types of magic sv has. If the mg_type field is an uppercase letter, then the mg_obj is copied to nsv, but the mg_type field is changed to be the lowercase letter.

Understanding the Magic of Tied Hashes and Arrays

Tied hashes and arrays are magical beasts of the PERL_MAGIC_tied magic type.

WARNING: As of the 5.004 release, proper usage of the array and hash access functions requires understanding a few caveats. Some of these caveats are actually considered bugs in the API, to be fixed in later releases, and are bracketed with [MAYCHANGE] below. If you find yourself actually applying such information in this section, be aware that the behavior may change in the future, umm, without warning.

The perl tie function associates a variable with an object that implements the various GET, SET, etc methods. To perform the equivalent of the perl tie function from an XSUB, you must mimic this behaviour. The code below carries out the necessary steps - firstly it creates a new hash, and then creates a second hash which it blesses into the class which will implement the tie methods. Lastly it ties the two hashes together, and returns a reference to the new tied hash. Note that the code below does NOT call the TIEHASH method in the MyTie class - see Calling Perl Routines from within C Programs for details on how to do this.

    SV*
    mytie()
    PREINIT:
        HV *hash;
        HV *stash;
        SV *tie;
    CODE:
        hash = newHV();                       /* the hash to be tied */
        tie = newRV_noinc((SV*)newHV());      /* object implementing the tie */
        stash = gv_stashpv("MyTie", GV_ADD);
        sv_bless(tie, stash);
        hv_magic(hash, (GV*)tie, PERL_MAGIC_tied);
        RETVAL = newRV_noinc((SV*)hash);      /* cast: newRV_noinc takes an SV* */
    OUTPUT:
        RETVAL

The av_store function, when given a tied array argument, merely copies the magic of the array onto the value to be "stored", using mg_copy. It may also return NULL, indicating that the value did not actually need to be stored in the array. [MAYCHANGE] After a call to av_store on a tied array, the caller will usually need to call mg_set(val) to actually invoke the perl level "STORE" method on the TIEARRAY object. If av_store did return NULL, a call to SvREFCNT_dec(val) will also be usually necessary to avoid a memory leak. [/MAYCHANGE]

The previous paragraph is applicable verbatim to tied hash access using the hv_store and hv_store_ent functions as well.

av_fetch and the corresponding hash functions hv_fetch and hv_fetch_ent actually return an undefined mortal value whose magic has been initialized using mg_copy. Note the value so returned does not need to be deallocated, as it is already mortal. [MAYCHANGE] But you will need to call mg_get() on the returned value in order to actually invoke the perl level "FETCH" method on the underlying TIE object. Similarly, you may also call mg_set() on the return value after possibly assigning a suitable value to it using sv_setsv, which will invoke the "STORE" method on the TIE object. [/MAYCHANGE]

[MAYCHANGE] In other words, the array or hash fetch/store functions don't really fetch and store actual values in the case of tied arrays and hashes. They merely call mg_copy to attach magic to the values that were meant to be "stored" or "fetched". Later calls to mg_get and mg_set actually do the job of invoking the TIE methods on the underlying objects. Thus the magic mechanism currently implements a kind of lazy access to arrays and hashes.

Currently (as of perl version 5.004), use of the hash and array access functions requires the user to be aware of whether they are operating on "normal" hashes and arrays, or on their tied variants. The API may be changed to provide more transparent access to both tied and normal data types in future versions. [/MAYCHANGE]

You would do well to understand that the TIEARRAY and TIEHASH interfaces are mere sugar to invoke some perl method calls while using the uniform hash and array syntax. The use of this sugar imposes some overhead (typically about two to four extra opcodes per FETCH/STORE operation, in addition to the creation of all the mortal variables required to invoke the methods). This overhead will be comparatively small if the TIE methods are themselves substantial, but if they are only a few statements long, the overhead will not be insignificant.

Localizing changes

Perl has a very handy construction

    {
        local $var = 2;
        ...
    }

This construction is approximately equivalent to

    {
        my $oldvar = $var;
        $var = 2;
        ...
        $var = $oldvar;
    }

The biggest difference is that the first construction would reinstate the initial value of $var, irrespective of how control exits the block: goto, return, die/eval, etc. It is a little bit more efficient as well.

There is a way to achieve a similar task from C via Perl API: create a pseudo-block, and arrange for some changes to be automatically undone at the end of it, either explicitly, or via a non-local exit (via die()). A block-like construct is created by a pair of ENTER/LEAVE macros (see Returning a Scalar in perlcall). Such a construct may be created specially for some important localized task, or an existing one (like boundaries of enclosing Perl subroutine/block, or an existing pair for freeing TMPs) may be used. (In the second case the overhead of additional localization must be almost negligible.) Note that any XSUB is automatically enclosed in an ENTER/LEAVE pair.

Inside such a pseudo-block the following services are available:

  • SAVEINT(int i)
  • SAVEIV(IV i)
  • SAVEI32(I32 i)
  • SAVELONG(long i)

    These macros arrange things to restore the value of integer variable i at the end of the enclosing pseudo-block.

  • SAVESPTR(s)
  • SAVEPPTR(p)

    These macros arrange things to restore the value of pointers s and p. s must be a pointer of a type which survives conversion to SV* and back, p should be able to survive conversion to char* and back.

  • SAVEFREESV(SV *sv)

    The refcount of sv would be decremented at the end of pseudo-block. This is similar to sv_2mortal in that it is also a mechanism for doing a delayed SvREFCNT_dec. However, while sv_2mortal extends the lifetime of sv until the beginning of the next statement, SAVEFREESV extends it until the end of the enclosing scope. These lifetimes can be wildly different.

    Also compare SAVEMORTALIZESV.

  • SAVEMORTALIZESV(SV *sv)

    Just like SAVEFREESV, but mortalizes sv at the end of the current scope instead of decrementing its reference count. This usually has the effect of keeping sv alive until the statement that called the currently live scope has finished executing.

  • SAVEFREEOP(OP *op)

    The OP * is op_free()ed at the end of pseudo-block.

  • SAVEFREEPV(p)

    The chunk of memory which is pointed to by p is Safefree()ed at the end of pseudo-block.

  • SAVECLEARSV(SV *sv)

    Clears a slot in the current scratchpad which corresponds to sv at the end of pseudo-block.

  • SAVEDELETE(HV *hv, char *key, I32 length)

    The key key of hv is deleted at the end of pseudo-block. The string pointed to by key is Safefree()ed. If one has a key in short-lived storage, the corresponding string may be reallocated like this:

        SAVEDELETE(PL_defstash, savepv(tmpbuf), strlen(tmpbuf));

  • SAVEDESTRUCTOR(DESTRUCTORFUNC_NOCONTEXT_t f, void *p)

    At the end of pseudo-block the function f is called with the only argument p.

  • SAVEDESTRUCTOR_X(DESTRUCTORFUNC_t f, void *p)

    At the end of pseudo-block the function f is called with the implicit context argument (if any), and p.

  • SAVESTACK_POS()

    The current offset on the Perl internal stack (cf. SP) is restored at the end of pseudo-block.
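The save-and-restore discipline behind ENTER, SAVEINT and LEAVE can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not the interpreter's implementation: toy_enter, toy_saveint and toy_leave are hypothetical names, only int values are handled, and the fixed-size arrays stand in for stacks that the real interpreter manages itself.

```c
#include <stddef.h>

#define TOY_SAVE_MAX  64
#define TOY_SCOPE_MAX 16

/* One saved (address, old value) pair. */
typedef struct {
    int *addr;
    int  old;
} toy_save_entry;

static toy_save_entry toy_savestack[TOY_SAVE_MAX];
static size_t         toy_save_ix = 0;
static size_t         toy_scopestack[TOY_SCOPE_MAX];
static size_t         toy_scope_ix = 0;

/* Models ENTER: remember where this pseudo-block's saved entries begin.
 * Returns 0 on success, -1 if the scope stack is full. */
int toy_enter(void) {
    if (toy_scope_ix == TOY_SCOPE_MAX)
        return -1;
    toy_scopestack[toy_scope_ix++] = toy_save_ix;
    return 0;
}

/* Models SAVEINT(i): record the current value so that it can be restored.
 * Returns 0 on success, -1 if the save stack is full. */
int toy_saveint(int *p) {
    if (toy_save_ix == TOY_SAVE_MAX)
        return -1;
    toy_savestack[toy_save_ix].addr = p;
    toy_savestack[toy_save_ix].old  = *p;
    toy_save_ix++;
    return 0;
}

/* Models LEAVE: undo, newest first, everything saved since the matching
 * toy_enter(). */
void toy_leave(void) {
    if (toy_scope_ix == 0)
        return;
    size_t mark = toy_scopestack[--toy_scope_ix];
    while (toy_save_ix > mark) {
        toy_save_ix--;
        *toy_savestack[toy_save_ix].addr = toy_savestack[toy_save_ix].old;
    }
}
```

Because restoration is driven by the scope mark rather than by the code that changed the variable, the same unwinding can be triggered from a normal exit or from a non-local one, which is the property that distinguishes local from the hand-written save/restore shown earlier.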

The following API list contains functions, thus one needs to provide pointers to the modifiable data explicitly (either C pointers, or Perlish GV *s). Where the above macros take int, a similar function takes int *.

  • SV* save_scalar(GV *gv)

    Equivalent to Perl code local $gv.

  • AV* save_ary(GV *gv)
  • HV* save_hash(GV *gv)

    Similar to save_scalar, but localize @gv and %gv.

  • void save_item(SV *item)

    Duplicates the current value of the SV; on exit from the current ENTER/LEAVE pseudo-block, the value of the SV is restored from the stored copy. It doesn't handle magic. Use save_scalar if magic is affected.

  • void save_list(SV **sarg, I32 maxsarg)

    A variant of save_item which takes multiple arguments via an array sarg of SV* of length maxsarg.

  • SV* save_svref(SV **sptr)

    Similar to save_scalar, but will reinstate an SV *.

  • void save_aptr(AV **aptr)
  • void save_hptr(HV **hptr)

    Similar to save_svref, but localize AV * and HV *.

The Alias module implements localization of the basic types within the caller's scope. People who are interested in how to localize things in the containing scope should take a look there too.

Subroutines

XSUBs and the Argument Stack

The XSUB mechanism is a simple way for Perl programs to access C subroutines. An XSUB routine will have a stack that contains the arguments from the Perl program, and a way to map from the Perl data structures to a C equivalent.

The stack arguments are accessible through the ST(n) macro, which returns the n'th stack argument. Argument 0 is the first argument passed in the Perl subroutine call. These arguments are SV*, and can be used anywhere an SV* is used.

Most of the time, output from the C routine can be handled through use of the RETVAL and OUTPUT directives. However, there are some cases where the argument stack is not already long enough to handle all the return values. An example is the POSIX tzname() call, which takes no arguments, but returns two, the local time zone's standard and summer time abbreviations.

To handle this situation, the PPCODE directive is used and the stack is extended using the macro:

    EXTEND(SP, num);

where SP is the macro that represents the local copy of the stack pointer, and num is the number of elements the stack should be extended by.

Now that there is room on the stack, values can be pushed on it using the PUSHs macro. The pushed values will often need to be "mortal" (see Reference Counts and Mortality):

    PUSHs(sv_2mortal(newSViv(an_integer)))
    PUSHs(sv_2mortal(newSVuv(an_unsigned_integer)))
    PUSHs(sv_2mortal(newSVnv(a_double)))
    PUSHs(sv_2mortal(newSVpv("Some String",0)))
    /* Although the last example is better written as the more
     * efficient: */
    PUSHs(newSVpvs_flags("Some String", SVs_TEMP))

Now, in the Perl program calling tzname, the two values will be assigned as in:

    ($standard_abbrev, $summer_abbrev) = POSIX::tzname;

An alternate (and possibly simpler) method to pushing values on the stack is to use the macro:

    XPUSHs(SV*)

This macro automatically adjusts the stack for you, if needed. Thus, you do not need to call EXTEND to extend the stack.

Despite what earlier versions of this document suggested, the macros (X)PUSH[iunp] are not suited to XSUBs which return multiple results. For that, either stick to the (X)PUSHs macros shown above, or use the new m(X)PUSH[iunp] macros instead; see Putting a C value on Perl stack.

For more information, consult perlxs and perlxstut.

Autoloading with XSUBs

If an AUTOLOAD routine is an XSUB, as with Perl subroutines, Perl puts the fully-qualified name of the autoloaded subroutine in the $AUTOLOAD variable of the XSUB's package.

But it also puts the same information in certain fields of the XSUB itself:

    HV *stash           = CvSTASH(cv);
    const char *subname = SvPVX(cv);
    STRLEN name_length  = SvCUR(cv); /* in bytes */
    U32 is_utf8         = SvUTF8(cv);

SvPVX(cv) contains just the sub name itself, not including the package. For an AUTOLOAD routine in UNIVERSAL or one of its superclasses, CvSTASH(cv) returns NULL during a method call on a nonexistent package.

Note: Setting $AUTOLOAD stopped working in 5.6.1, which did not support XS AUTOLOAD subs at all. Perl 5.8.0 introduced the use of fields in the XSUB itself. Perl 5.16.0 restored the setting of $AUTOLOAD. If you need to support 5.8-5.14, use the XSUB's fields.

Calling Perl Routines from within C Programs

There are four routines that can be used to call a Perl subroutine from within a C program. These four are:

    I32  call_sv(SV*, I32);
    I32  call_pv(const char*, I32);
    I32  call_method(const char*, I32);
    I32  call_argv(const char*, I32, char**);

The routine most often used is call_sv. The SV* argument contains either the name of the Perl subroutine to be called, or a reference to the subroutine. The second argument consists of flags that control the context in which the subroutine is called, whether or not the subroutine is being passed arguments, how errors should be trapped, and how to treat return values.

All four routines return the number of arguments that the subroutine returned on the Perl stack.

These routines used to be called perl_call_sv, etc., before Perl v5.6.0, but those names are now deprecated; macros of the same name are provided for compatibility.

When using any of these routines (except call_argv), the programmer must manipulate the Perl stack. The macros and functions used for this include the following:

    dSP
    SP
    PUSHMARK()
    PUTBACK
    SPAGAIN
    ENTER
    SAVETMPS
    FREETMPS
    LEAVE
    XPUSH*()
    POP*()

For a detailed description of calling conventions from C to Perl, consult perlcall.

Memory Allocation

Allocation

All memory meant to be used with the Perl API functions should be manipulated using the macros described in this section. The macros provide the necessary transparency between differences in the actual malloc implementation that is used within perl.

It is suggested that you enable the version of malloc that is distributed with Perl. It keeps pools of various sizes of unallocated memory in order to satisfy allocation requests more quickly. However, on some platforms, it may cause spurious malloc or free errors.

The following three macros are used to initially allocate memory:

    Newx(pointer, number, type);
    Newxc(pointer, number, type, cast);
    Newxz(pointer, number, type);

The first argument pointer should be the name of a variable that will point to the newly allocated memory.

The second and third arguments number and type specify how many of the specified type of data structure should be allocated. The argument type is passed to sizeof. The final argument to Newxc, cast, should be used if the pointer argument is different from the type argument.

Unlike the Newx and Newxc macros, the Newxz macro calls memzero to zero out all the newly allocated memory.

Reallocation

    Renew(pointer, number, type);
    Renewc(pointer, number, type, cast);
    Safefree(pointer)

These three macros are used to change a memory buffer size or to free a piece of memory no longer needed. The arguments to Renew and Renewc match those of Newx and Newxc above (that is, those of the older New and Newc macros without their "magic cookie" argument).

Moving

    Move(source, dest, number, type);
    Copy(source, dest, number, type);
    Zero(dest, number, type);

These three macros are used to move, copy, or zero out previously allocated memory. The source and dest arguments point to the source and destination starting points. Perl will move, copy, or zero out number instances of the size of the type data structure (using the sizeof function).
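As a rough mental model, the three macros can be pictured as thin wrappers over the standard C memory functions. The TOY_ macros below are hypothetical stand-ins written for this illustration, on the assumption that Move tolerates overlapping regions (as memmove does) while Copy does not (as memcpy does not); the real macros add casts and checks that are omitted here.

```c
#include <string.h>

/* Hypothetical stand-ins for Move, Copy and Zero.  Note the argument
 * order: source first, then destination, the reverse of memcpy(). */
#define TOY_MOVE(src, dst, n, type) memmove((dst), (src), (n) * sizeof(type))
#define TOY_COPY(src, dst, n, type) memcpy((dst), (src), (n) * sizeof(type))
#define TOY_ZERO(dst, n, type)      memset((dst), 0, (n) * sizeof(type))

/* Open a gap at index 0 of an int array holding `used` elements by
 * shifting them up one place.  The regions overlap, so the memmove-like
 * variant is the one to use.  The caller must ensure room for used + 1. */
void toy_shift_up(int *buf, size_t used) {
    TOY_MOVE(buf, buf + 1, used, int);
    TOY_ZERO(buf, 1, int);
}
```

The number argument counts elements, not bytes; the multiplication by sizeof(type) happens inside the macro.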

PerlIO

The most recent development releases of Perl have been experimenting with removing Perl's dependency on the "normal" standard I/O suite and allowing other stdio implementations to be used. This involves creating a new abstraction layer that then calls whichever implementation of stdio Perl was compiled with. All XSUBs should now use the functions in the PerlIO abstraction layer and not make any assumptions about what kind of stdio is being used.

For a complete description of the PerlIO abstraction, consult perlapio.

Putting a C value on Perl stack

A lot of opcodes (this is an elementary operation in the internal perl stack machine) put an SV* on the stack. However, as an optimization the corresponding SV is (usually) not recreated each time. The opcodes reuse specially assigned SVs (targets) which are (as a corollary) not constantly freed/created.

Each of the targets is created only once (but see Scratchpads and recursion below), and when an opcode needs to put an integer, a double, or a string on stack, it just sets the corresponding parts of its target and puts the target on stack.

The macro to put this target on stack is PUSHTARG, and it is directly used in some opcodes, as well as indirectly in zillions of others, which use it via (X)PUSH[iunp].

Because the target is reused, you must be careful when pushing multiple values on the stack. The following code will not do what you think:

    XPUSHi(10);
    XPUSHi(20);

This translates as "set TARG to 10, push a pointer to TARG onto the stack; set TARG to 20, push a pointer to TARG onto the stack". At the end of the operation, the stack does not contain the values 10 and 20, but actually contains two pointers to TARG, which we have set to 20.
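The aliasing effect is easy to reproduce outside of Perl. In the sketch below, toy_stack, toy_push_targ and toy_push_fresh are hypothetical names chosen for this illustration: the first function reuses a single target in the way described for XPUSHi, and the second gives every pushed value its own storage, which is the property the (X)PUSHs and m(X)PUSH[iunp] approaches rely on.

```c
#include <stddef.h>

#define TOY_STACK_MAX 8

typedef struct { long iv; } toy_sv;

typedef struct {
    toy_sv *slot[TOY_STACK_MAX];   /* the stack holds pointers, not values */
    size_t  top;
    toy_sv  targ;                  /* the single, reused target */
    toy_sv  fresh[TOY_STACK_MAX];  /* storage for one-off values */
    size_t  fresh_used;
} toy_stack;

/* Reuses the shared target: overwrite it, then push its address.
 * Returns 0 on success, -1 if the stack is full. */
int toy_push_targ(toy_stack *st, long v) {
    if (st->top == TOY_STACK_MAX)
        return -1;
    st->targ.iv = v;
    st->slot[st->top++] = &st->targ;
    return 0;
}

/* Gives each pushed value its own storage.
 * Returns 0 on success, -1 if no room is left. */
int toy_push_fresh(toy_stack *st, long v) {
    if (st->top == TOY_STACK_MAX || st->fresh_used == TOY_STACK_MAX)
        return -1;
    toy_sv *sv = &st->fresh[st->fresh_used++];
    sv->iv = v;
    st->slot[st->top++] = sv;
    return 0;
}
```

After two calls to toy_push_targ with 10 and 20, both stack slots point at the same object, so both read back as 20; with toy_push_fresh the slots stay distinct.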

If you need to push multiple different values then you should either use the (X)PUSHs macros, or else use the new m(X)PUSH[iunp] macros, none of which make use of TARG. The (X)PUSHs macros simply push an SV* on the stack, which, as noted under XSUBs and the Argument Stack, will often need to be "mortal". The new m(X)PUSH[iunp] macros make this a little easier to achieve by creating a new mortal for you (via (X)PUSHmortal), pushing that onto the stack (extending it if necessary in the case of the mXPUSH[iunp] macros), and then setting its value. Thus, instead of writing this to "fix" the example above:

    XPUSHs(sv_2mortal(newSViv(10)))
    XPUSHs(sv_2mortal(newSViv(20)))

you can simply write:

    mXPUSHi(10)
    mXPUSHi(20)

On a related note, if you do use (X)PUSH[iunp], then you're going to need a dTARG in your variable declarations so that the *PUSH* macros can make use of the local variable TARG. See also dTARGET and dXSTARG.

Scratchpads

The question remains on when the SVs which are targets for opcodes are created. The answer is that they are created when the current unit--a subroutine or a file (for opcodes for statements outside of subroutines)--is compiled. During this time a special anonymous Perl array is created, which is called a scratchpad for the current unit.

A scratchpad keeps SVs which are lexicals for the current unit and are targets for opcodes. One can deduce that an SV lives on a scratchpad by looking at its flags: lexicals have SVs_PADMY set, and targets have SVs_PADTMP set.

The correspondence between OPs and targets is not 1-to-1. Different OPs in the compile tree of the unit can use the same target, if this would not conflict with the expected life of the temporary.

Scratchpads and recursion

In fact it is not 100% true that a compiled unit contains a pointer to the scratchpad AV. In fact it contains a pointer to an AV of (initially) one element, and this element is the scratchpad AV. Why do we need an extra level of indirection?

The answer is recursion, and maybe threads. Both these can create several execution pointers going into the same subroutine. For the subroutine-child not to write over the temporaries for the subroutine-parent (lifespan of which covers the call to the child), the parent and the child should have different scratchpads. (And the lexicals should be separate anyway!)

So each subroutine is born with an array of scratchpads (of length 1). On each entry to the subroutine it is checked that the current depth of the recursion is not more than the length of this array, and if it is, a new scratchpad is created and pushed into the array.

The targets on this scratchpad are undefs, but they are already marked with correct flags.
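The growth rule for the array of scratchpads can be sketched as follows. This is a plain C model built on assumptions: toy_sub and toy_pad are hypothetical types, a pad is reduced to a few int slots, and zero-filled memory stands in for "undef" targets.

```c
#include <stdlib.h>

#define TOY_PAD_SLOTS 4

typedef struct { int slot[TOY_PAD_SLOTS]; } toy_pad;

typedef struct {
    toy_pad **pads;   /* pads[d - 1] is the scratchpad used at depth d */
    size_t    npads;  /* length of the array of scratchpads */
    size_t    depth;  /* current recursion depth, 0 when not running */
} toy_sub;

/* A subroutine is born with an array holding one scratchpad.
 * Returns 0 on success, -1 on allocation failure. */
int toy_sub_init(toy_sub *sub) {
    sub->pads = malloc(sizeof *sub->pads);
    if (!sub->pads)
        return -1;
    sub->pads[0] = calloc(1, sizeof(toy_pad));
    if (!sub->pads[0]) {
        free(sub->pads);
        sub->pads = NULL;
        return -1;
    }
    sub->npads = 1;
    sub->depth = 0;
    return 0;
}

/* On entry: if the new depth exceeds the array length, create a new
 * scratchpad and push it.  Returns the pad for this call, or NULL on
 * allocation failure. */
toy_pad *toy_sub_enter(toy_sub *sub) {
    size_t depth = sub->depth + 1;
    if (depth > sub->npads) {
        toy_pad **grown = realloc(sub->pads, depth * sizeof *grown);
        if (!grown)
            return NULL;
        sub->pads = grown;
        sub->pads[depth - 1] = calloc(1, sizeof(toy_pad));
        if (!sub->pads[depth - 1])
            return NULL;
        sub->npads = depth;
    }
    sub->depth = depth;
    return sub->pads[depth - 1];
}

/* On exit the pad is kept for reuse by a later call at the same depth. */
void toy_sub_leave(toy_sub *sub) {
    if (sub->depth > 0)
        sub->depth--;
}

void toy_sub_free(toy_sub *sub) {
    for (size_t i = 0; i < sub->npads; i++)
        free(sub->pads[i]);
    free(sub->pads);
    sub->pads  = NULL;
    sub->npads = 0;
    sub->depth = 0;
}
```

A recursive call therefore writes into a different pad than its caller, and a non-recursive subroutine never pays for more than the single pad it was created with.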

Compiled code

Code tree

Here we describe the internal form your code is converted to by Perl. Start with a simple example:

    $a = $b + $c;

This is converted to a tree similar to this one:

             assign-to
           /           \
          +             $a
        /   \
      $b     $c

(but slightly more complicated). This tree reflects the way Perl parsed your code, but has nothing to do with the execution order. There is an additional "thread" going through the nodes of the tree which shows the order of execution of the nodes. In our simplified example above it looks like:

    $b ---> $c ---> + ---> $a ---> assign-to

But with the actual compile tree for $a = $b + $c it is different: some nodes are optimized away. As a corollary, though the actual tree contains more nodes than our simplified example, the execution order is the same as in our example.

Examining the tree

If you have your perl compiled for debugging (usually done with -DDEBUGGING on the Configure command line), you may examine the compiled tree by specifying -Dx on the Perl command line. The output takes several lines per node, and for $b+$c it looks like this:

    5           TYPE = add  ===> 6
                TARG = 1
                FLAGS = (SCALAR,KIDS)
                {
                    TYPE = null  ===> (4)
                      (was rv2sv)
                    FLAGS = (SCALAR,KIDS)
                    {
    3                   TYPE = gvsv  ===> 4
                        FLAGS = (SCALAR)
                        GV = main::b
                    }
                }
                {
                    TYPE = null  ===> (5)
                      (was rv2sv)
                    FLAGS = (SCALAR,KIDS)
                    {
    4                   TYPE = gvsv  ===> 5
                        FLAGS = (SCALAR)
                        GV = main::c
                    }
                }

This tree has 5 nodes (one per TYPE specifier), of which only 3 are not optimized away (one per number in the left column). The immediate children of the given node correspond to {} pairs on the same level of indentation, thus this listing corresponds to the tree:

                   add
                 /     \
               null    null
                |       |
               gvsv    gvsv

The execution order is indicated by ===> marks, thus it is 3 4 5 6 (node 6 is not included into above listing), i.e., gvsv gvsv add whatever.

Each of these nodes represents an op, a fundamental operation inside the Perl core. The code which implements each operation can be found in the pp*.c files; the function which implements the op with type gvsv is pp_gvsv, and so on. As the tree above shows, different ops have different numbers of children: add is a binary operator, as one would expect, and so has two children. To accommodate the various different numbers of children, there are various types of op data structure, and they link together in different ways.

The simplest type of op structure is OP: this has no children. Unary operators, UNOPs, have one child, and this is pointed to by the op_first field. Binary operators (BINOPs) have not only an op_first field but also an op_last field. The most complex type of op is a LISTOP, which has any number of children. In this case, the first child is pointed to by op_first and the last child by op_last. The children in between can be found by iteratively following the op_sibling pointer from the first child to the last.

There are also two other op types: a PMOP holds a regular expression, and has no children, and a LOOP may or may not have children. If the op_children field is non-zero, it behaves like a LISTOP. To complicate matters, if a UNOP is actually a null op after optimization (see Compile pass 2: context propagation) it will still have children in accordance with its former type.
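The two kinds of links, the tree structure (first child and sibling) and the execution-order thread, can be modelled in a few lines of C. The toy_op type below is a hypothetical simplification, not the real OP structure, and toy_thread assumes a plain children-before-parent order, which matches the simplified $a = $b + $c example from the Code tree section but ignores the extra work needed for conditionals and loops.

```c
#include <stddef.h>

typedef struct toy_op toy_op;
struct toy_op {
    const char *name;
    toy_op     *first;    /* plays the role of op_first */
    toy_op     *sibling;  /* plays the role of op_sibling */
    toy_op     *next;     /* execution-order thread */
};

/* Children are found by following the sibling chain from the first child. */
size_t toy_count_kids(const toy_op *op) {
    size_t n = 0;
    for (const toy_op *kid = op->first; kid; kid = kid->sibling)
        n++;
    return n;
}

/* Visit children before their parent, appending each op to the thread. */
static void toy_link(toy_op *op, toy_op **tail) {
    for (toy_op *kid = op->first; kid; kid = kid->sibling)
        toy_link(kid, tail);
    if (*tail)
        (*tail)->next = op;
    *tail = op;
}

/* Thread a whole tree and return the op to execute first. */
toy_op *toy_thread(toy_op *root) {
    toy_op *tail = NULL;
    toy_link(root, &tail);
    root->next = NULL;            /* the root is executed last */
    toy_op *start = root;
    while (start->first)          /* the leftmost leaf is executed first */
        start = start->first;
    return start;
}
```

For the simplified tree, with assign-to having the children + and $a, and + having the children $b and $c, following next from the returned op visits $b, $c, +, $a and assign-to in that order.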

Another way to examine the tree is to use a compiler back-end module, such as B::Concise.

Compile pass 1: check routines

The tree is created by the compiler while yacc code feeds it the constructions it recognizes. Since yacc works bottom-up, so does the first pass of perl compilation.

What makes this pass interesting for perl developers is that some optimization may be performed on this pass. This is optimization by so-called "check routines". The correspondence between node names and corresponding check routines is described in opcode.pl (do not forget to run make regen_headers if you modify this file).

A check routine is called when the node is fully constructed except for the execution-order thread. Since at this time there are no back-links to the currently constructed node, one can do most any operation to the top-level node, including freeing it and/or creating new nodes above/below it.

The check routine returns the node which should be inserted into the tree (if the top-level node was not modified, check routine returns its argument).

By convention, check routines have names ck_*. They are usually called from new*OP subroutines (or convert), which in turn are called from perly.y.

Compile pass 1a: constant folding

Immediately after the check routine is called the returned node is checked for being compile-time executable. If it is (the value is judged to be constant) it is immediately executed, and a constant node with the "return value" of the corresponding subtree is substituted instead. The subtree is deleted.

If constant folding was not performed, the execution-order thread is created.
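As a rough illustration of the idea, here is a toy fold over a tiny expression tree. Everything in it (toy_node, toy_new, toy_fold) is hypothetical and far simpler than what the core does; it only shows the shape of "if the freshly built node is constant, run it now and substitute a constant node".

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { TOY_CONST, TOY_ADD, TOY_VAR } toy_kind;

typedef struct toy_node {
    toy_kind kind;
    long value;                     /* meaningful only for TOY_CONST */
    struct toy_node *left, *right;  /* meaningful only for TOY_ADD */
} toy_node;

toy_node *toy_new(toy_kind kind, long value, toy_node *l, toy_node *r)
{
    toy_node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->kind = kind;
    n->value = value;
    n->left = l;
    n->right = r;
    return n;
}

/* Called on a freshly built node. If it is an ADD of two constants,
 * execute it immediately, free the old subtree and return a constant
 * node holding the result. Otherwise return the node unchanged. */
toy_node *toy_fold(toy_node *n)
{
    if (n->kind == TOY_ADD
        && n->left->kind == TOY_CONST && n->right->kind == TOY_CONST) {
        toy_node *folded =
            toy_new(TOY_CONST, n->left->value + n->right->value, NULL, NULL);
        free(n->left);
        free(n->right);
        free(n);
        return folded;
    }
    return n;
}
```

Because the fold runs as each node is completed, bottom-up, a nested constant expression collapses one level at a time.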

Compile pass 2: context propagation

When the context for part of the compile tree is known, it is propagated down through the tree. At this time the context can have 5 values (instead of 2 for runtime context): void, boolean, scalar, list, and lvalue. In contrast with pass 1, this pass is processed from top to bottom: a node's context determines the context for its children.

Additional context-dependent optimizations are performed at this time. Since at this moment the compile tree contains back-references (via "thread" pointers), nodes cannot be free()d now. So that nodes can still be optimized away at this stage, such nodes are null()ified instead of being free()d (i.e. their type is changed to OP_NULL).

Compile pass 3: peephole optimization

After the compile tree for a subroutine (or for an eval or a file) is created, an additional pass over the code is performed. This pass is neither top-down nor bottom-up, but in the execution order (with additional complications for conditionals). Optimizations performed at this stage are subject to the same restrictions as in pass 2.

Peephole optimizations are done by calling the function pointed to by the global variable PL_peepp. By default, PL_peepp just calls the function pointed to by the global variable PL_rpeepp. By default, that performs some basic op fixups and optimisations along the execution-order op chain, and recursively calls PL_rpeepp for each side chain of ops (resulting from conditionals). Extensions may provide additional optimisations or fixups, hooking into either the per-subroutine or recursive stage, like this:

    static peep_t prev_peepp;
    static void my_peep(pTHX_ OP *o)
    {
        /* custom per-subroutine optimisation goes here */
        prev_peepp(aTHX_ o);
        /* custom per-subroutine optimisation may also go here */
    }
    BOOT:
        prev_peepp = PL_peepp;
        PL_peepp = my_peep;

    static peep_t prev_rpeepp;
    static void my_rpeep(pTHX_ OP *o)
    {
        OP *orig_o = o;
        for(; o; o = o->op_next) {
            /* custom per-op optimisation goes here */
        }
        prev_rpeepp(aTHX_ orig_o);
    }
    BOOT:
        prev_rpeepp = PL_rpeepp;
        PL_rpeepp = my_rpeep;
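The save-then-replace pattern used in both hooks can be tried outside of Perl. In this sketch toy_peepp plays the role of PL_peepp and an int array stands in for the op chain; every name here is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_peep_t)(int *ops, size_t n);

/* Stand-in for the default optimiser: clamps negative entries to 0. */
void toy_default_peep(int *ops, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (ops[i] < 0)
            ops[i] = 0;
}

toy_peep_t toy_peepp = toy_default_peep;   /* plays the role of PL_peepp */

static toy_peep_t prev_peepp;              /* whatever was installed before us */

void my_peep(int *ops, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)                /* custom pass runs first ... */
        ops[i] *= 2;
    prev_peepp(ops, n);                    /* ... then the previous hook */
}

/* What the BOOT: section does: remember the old hook, install ours. */
void toy_install_hook(void)
{
    assert(toy_peepp != my_peep);          /* installing twice would recurse forever */
    prev_peepp = toy_peepp;
    toy_peepp = my_peep;
}
```

Calling the previous pointer rather than the default function directly is what lets several extensions chain their hooks without knowing about each other.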

Pluggable runops

The compile tree is executed in a runops function. There are two runops functions, in run.c and in dump.c. Perl_runops_debug is used with DEBUGGING and Perl_runops_standard is used otherwise. For fine control over the execution of the compile tree it is possible to provide your own runops function.

It's probably best to copy one of the existing runops functions and change it to suit your needs. Then, in the BOOT section of your XS file, add the line:

    PL_runops = my_runops;

This function should be as efficient as possible to keep your programs running as fast as possible.
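To give a feel for what such a function does, here is a toy run loop. It is only a sketch: the real functions in run.c and dump.c work on the interpreter's current op and take the interpreter context, while toy_op, toy_pp_add and toy_runops below are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_op toy_op;
struct toy_op {
    /* executes this op and returns the next op to run, or NULL to stop */
    toy_op *(*op_ppaddr)(toy_op *self, long *acc);
    toy_op *op_next;
    long operand;
};

/* A single "pp function": add the operand to the accumulator. */
toy_op *toy_pp_add(toy_op *self, long *acc)
{
    *acc += self->operand;
    return self->op_next;
}

/* Minimal run loop: keep calling the current op until one returns NULL. */
long toy_runops(toy_op *start)
{
    long acc = 0;
    toy_op *op = start;

    while (op != NULL) {
        assert(op->op_ppaddr != NULL);
        op = op->op_ppaddr(op, &acc);
    }
    return acc;
}
```

A custom loop would add its extra work (tracing, counting, and so on) inside the while body, which is why the text stresses keeping it cheap.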

Compile-time scope hooks

As of perl 5.14 it is possible to hook into the compile-time lexical scope mechanism using Perl_blockhook_register. This is used like this:

    STATIC void my_start_hook(pTHX_ int full);
    STATIC BHK my_hooks;

    BOOT:
        BhkENTRY_set(&my_hooks, bhk_start, my_start_hook);
        Perl_blockhook_register(aTHX_ &my_hooks);

This will arrange to have my_start_hook called at the start of compiling every lexical scope. The available hooks are:

  • void bhk_start(pTHX_ int full)

    This is called just after starting a new lexical scope. Note that Perl code like

        if ($x) { ... }

    creates two scopes: the first starts at the ( and has full == 1, the second starts at the { and has full == 0. Both end at the }, so calls to start and pre/post_end will match. Anything pushed onto the save stack by this hook will be popped just before the scope ends (between the pre_ and post_end hooks, in fact).

  • void bhk_pre_end(pTHX_ OP **o)

    This is called at the end of a lexical scope, just before unwinding the stack. o is the root of the optree representing the scope; it is a double pointer so you can replace the OP if you need to.

  • void bhk_post_end(pTHX_ OP **o)

    This is called at the end of a lexical scope, just after unwinding the stack. o is as above. Note that it is possible for calls to pre_ and post_end to nest, if there is something on the save stack that calls string eval.

  • void bhk_eval(pTHX_ OP *const o)

    This is called just before starting to compile an eval STRING, do FILE, require or use, after the eval has been set up. o is the OP that requested the eval, and will normally be an OP_ENTEREVAL, OP_DOFILE or OP_REQUIRE.

Once you have your hook functions, you need a BHK structure to put them in. It's best to allocate it statically, since there is no way to free it once it's registered. The function pointers should be inserted into this structure using the BhkENTRY_set macro, which will also set flags indicating which entries are valid. If you do need to allocate your BHK dynamically for some reason, be sure to zero it before you start.

Once registered, there is no mechanism to switch these hooks off, so if that is necessary you will need to do this yourself. An entry in %^H is probably the best way, so the effect is lexically scoped; however it is also possible to use the BhkDISABLE and BhkENABLE macros to temporarily switch entries on and off. You should also be aware that generally speaking at least one scope will have opened before your extension is loaded, so you will see some pre/post_end pairs that didn't have a matching start.

Examining internal data structures with the dump functions

To aid debugging, the source file dump.c contains a number of functions which produce formatted output of internal data structures.

The most commonly used of these functions is Perl_sv_dump; it's used for dumping SVs, AVs, HVs, and CVs. The Devel::Peek module calls sv_dump to produce debugging output from Perl-space, so users of that module should already be familiar with its format.

Perl_op_dump can be used to dump an OP structure or any of its derivatives, and produces output similar to perl -Dx; in fact, Perl_dump_eval will dump the main root of the code being evaluated, exactly like -Dx.

Other useful functions are Perl_dump_sub, which turns a GV into an op tree; Perl_dump_packsubs, which calls Perl_dump_sub on all the subroutines in a package, like so (thankfully, these are all xsubs, so there is no op tree):

    (gdb) print Perl_dump_packsubs(PL_defstash)

    SUB attributes::bootstrap = (xsub 0x811fedc 0)
    SUB UNIVERSAL::can = (xsub 0x811f50c 0)
    SUB UNIVERSAL::isa = (xsub 0x811f304 0)
    SUB UNIVERSAL::VERSION = (xsub 0x811f7ac 0)
    SUB DynaLoader::boot_DynaLoader = (xsub 0x805b188 0)

and Perl_dump_all, which dumps all the subroutines in the stash and the op tree of the main root.

How multiple interpreters and concurrency are supported

Background and PERL_IMPLICIT_CONTEXT

The Perl interpreter can be regarded as a closed box: it has an API for feeding it code or otherwise making it do things, but it also has functions for its own use. This smells a lot like an object, and there are ways for you to build Perl so that you can have multiple interpreters, with one interpreter represented either as a C structure, or inside a thread-specific structure. These structures contain all the context, the state of that interpreter.

One macro controls the major Perl build flavor: MULTIPLICITY. The MULTIPLICITY build has a C structure that packages all the interpreter state. With multiplicity-enabled perls, PERL_IMPLICIT_CONTEXT is also normally defined, and enables the support for passing in a "hidden" first argument that represents all three data structures. MULTIPLICITY makes multi-threaded perls possible (with the ithreads threading model, related to the macro USE_ITHREADS.)

Two other "encapsulation" macros are PERL_GLOBAL_STRUCT and PERL_GLOBAL_STRUCT_PRIVATE (the latter turns on the former, and the former turns on MULTIPLICITY). PERL_GLOBAL_STRUCT causes all the internal variables of Perl to be wrapped inside a single global struct, struct perl_vars, accessible as (globals) &PL_Vars or PL_VarsPtr or the function Perl_GetVars(). PERL_GLOBAL_STRUCT_PRIVATE goes one step further: there is still a single struct (allocated in main() either from heap or from stack) but there are no global data symbols pointing to it. In either case the global struct should be initialised as the very first thing in main() using Perl_init_global_struct() and correspondingly torn down after perl_free() using Perl_free_global_struct(); see miniperlmain.c for usage details. You may also need to use dVAR in your coding to "declare the global variables" when you are using them. dTHX does this for you automatically.

To see whether you have non-const data you can use a BSD-compatible nm:

    nm libperl.a | grep -v ' [TURtr] '

If this displays any D or d symbols, you have non-const data.

For backward compatibility reasons defining just PERL_GLOBAL_STRUCT doesn't actually hide all symbols inside a big global struct: some PerlIO_xxx vtables are left visible. The PERL_GLOBAL_STRUCT_PRIVATE then hides everything (see how the PERLIO_FUNCS_DECL is used).

All this obviously requires a way for the Perl internal functions to be either subroutines taking some kind of structure as the first argument, or subroutines taking nothing as the first argument. To enable these two very different ways of building the interpreter, the Perl source (as it does in so many other situations) makes heavy use of macros and subroutine naming conventions.

First problem: deciding which functions will be public API functions and which will be private. All functions whose names begin S_ are private (think "S" for "secret" or "static"). All other functions begin with "Perl_", but just because a function begins with "Perl_" does not mean it is part of the API. (See Internal Functions.) The easiest way to be sure a function is part of the API is to find its entry in perlapi. If it exists in perlapi, it's part of the API. If it doesn't, and you think it should be (i.e., you need it for your extension), send mail via perlbug explaining why you think it should be.

Second problem: there must be a syntax so that the same subroutine declarations and calls can pass a structure as their first argument, or pass nothing. To solve this, the subroutines are named and declared in a particular way. Here's a typical start of a static function used within the Perl guts:

    STATIC void
    S_incline(pTHX_ char *s)

STATIC becomes "static" in C, and may be #define'd to nothing in some configurations in the future.

A public function (i.e. part of the internal API, but not necessarily sanctioned for use in extensions) begins like this:

    void
    Perl_sv_setiv(pTHX_ SV* dsv, IV num)

pTHX_ is one of a number of macros (in perl.h) that hide the details of the interpreter's context. THX stands for "thread", "this", or "thingy", as the case may be. (And no, George Lucas is not involved. :-) The first character could be 'p' for a prototype, 'a' for argument, or 'd' for declaration, so we have pTHX, aTHX and dTHX, and their variants.

When Perl is built without options that set PERL_IMPLICIT_CONTEXT, there is no first argument containing the interpreter's context. The trailing underscore in the pTHX_ macro indicates that the macro expansion needs a comma after the context argument because other arguments follow it. If PERL_IMPLICIT_CONTEXT is not defined, pTHX_ will be ignored, and the subroutine is not prototyped to take the extra argument. The form of the macro without the trailing underscore is used when there are no additional explicit arguments.

When a core function calls another, it must pass the context. This is normally hidden via macros. Consider sv_setiv. It expands into something like this:

    #ifdef PERL_IMPLICIT_CONTEXT
      #define sv_setiv(a,b)      Perl_sv_setiv(aTHX_ a, b)
      /* can't do this for vararg functions, see below */
    #else
      #define sv_setiv           Perl_sv_setiv
    #endif

This works well, and means that XS authors can gleefully write:

    sv_setiv(foo, bar);

and still have it work under all the modes Perl could have been compiled with.

This doesn't work so cleanly for varargs functions, though, as macros imply that the number of arguments is known in advance. Instead we either need to spell them out fully, passing aTHX_ as the first argument (the Perl core tends to do this with functions like Perl_warner), or use a context-free version.

The context-free version of Perl_warner is called Perl_warner_nocontext, and does not take the extra argument. Instead it does dTHX; to get the context from thread-local storage. We #define warner Perl_warner_nocontext so that extensions get source compatibility at the expense of performance. (Passing an arg is cheaper than grabbing it from thread-local storage.)

You can ignore [pad]THXx when browsing the Perl headers/sources. Those are strictly for use within the core. Extensions and embedders need only be aware of [pad]THX.

So what happened to dTHR?

dTHR was introduced in perl 5.005 to support the older thread model. The older thread model now uses the THX mechanism to pass context pointers around, so dTHR is not useful any more. Perl 5.6.0 and later still have it for backward source compatibility, but it is defined to be a no-op.

How do I use all this in extensions?

When Perl is built with PERL_IMPLICIT_CONTEXT, extensions that call any functions in the Perl API will need to pass the initial context argument somehow. The kicker is that you will need to write it in such a way that the extension still compiles when Perl hasn't been built with PERL_IMPLICIT_CONTEXT enabled.

There are three ways to do this. First, the easy but inefficient way, which is also the default, in order to maintain source compatibility with extensions: whenever XSUB.h is #included, it redefines the aTHX and aTHX_ macros to call a function that will return the context. Thus, something like:

    sv_setiv(sv, num);

in your extension will translate to this when PERL_IMPLICIT_CONTEXT is in effect:

    Perl_sv_setiv(Perl_get_context(), sv, num);

or to this otherwise:

    Perl_sv_setiv(sv, num);

You don't have to do anything new in your extension to get this; since the Perl library provides Perl_get_context(), it will all just work.

The second, more efficient way is to use the following template for your Foo.xs:

    #define PERL_NO_GET_CONTEXT     /* we want efficiency */
    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    STATIC void my_private_function(int arg1, int arg2);

    STATIC void
    my_private_function(int arg1, int arg2)
    {
        dTHX;       /* fetch context */
        ... call many Perl API functions ...
    }

    [... etc ...]

    MODULE = Foo            PACKAGE = Foo

    /* typical XSUB */

    void
    my_xsub(arg)
            int arg
        CODE:
            my_private_function(arg, 10);

Note that the only two changes from the normal way of writing an extension are the addition of a #define PERL_NO_GET_CONTEXT before including the Perl headers, followed by a dTHX; declaration at the start of every function that will call the Perl API. (You'll know which functions need this, because the C compiler will complain that there's an undeclared identifier in those functions.) No changes are needed for the XSUBs themselves, because the XS() macro is correctly defined to pass in the implicit context if needed.

The third, even more efficient way is to ape how it is done within the Perl guts:

    #define PERL_NO_GET_CONTEXT     /* we want efficiency */
    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* pTHX_ only needed for functions that call Perl API */
    STATIC void my_private_function(pTHX_ int arg1, int arg2);

    STATIC void
    my_private_function(pTHX_ int arg1, int arg2)
    {
        /* dTHX; not needed here, because THX is an argument */
        ... call Perl API functions ...
    }

    [... etc ...]

    MODULE = Foo            PACKAGE = Foo

    /* typical XSUB */

    void
    my_xsub(arg)
            int arg
        CODE:
            my_private_function(aTHX_ arg, 10);

This implementation never has to fetch the context using a function call, since it is always passed as an extra argument. Depending on your needs for simplicity or efficiency, you may mix the previous two approaches freely.

Never add a comma after pTHX yourself--always use the form of the macro with the underscore for functions that take explicit arguments, or the form without the underscore for functions with no explicit arguments.

If one is compiling Perl with the -DPERL_GLOBAL_STRUCT the dVAR definition is needed if the Perl global variables (see perlvars.h or globvar.sym) are accessed in the function and dTHX is not used (the dTHX includes the dVAR if necessary). One notices the need for dVAR only with the said compile-time define, because otherwise the Perl global variables are visible as-is.

Should I do anything special if I call perl from multiple threads?

If you create interpreters in one thread and then proceed to call them in another, you need to make sure perl's own Thread Local Storage (TLS) slot is initialized correctly in each of those threads.

The perl_alloc and perl_clone API functions will automatically set the TLS slot to the interpreter they created, so that there is no need to do anything special if the interpreter is always accessed in the same thread that created it, and that thread did not create or call any other interpreters afterwards. If that is not the case, you have to set the TLS slot of the thread before calling any functions in the Perl API on that particular interpreter. This is done by calling the PERL_SET_CONTEXT macro in that thread as the first thing you do:

    /* do this before doing anything else with some_perl */
    PERL_SET_CONTEXT(some_perl);

    ... other Perl API calls on some_perl go here ...

Future Plans and PERL_IMPLICIT_SYS

Just as PERL_IMPLICIT_CONTEXT provides a way to bundle up everything that the interpreter knows about itself and pass it around, so too are there plans to allow the interpreter to bundle up everything it knows about the environment it's running on. This is enabled with the PERL_IMPLICIT_SYS macro. Currently it only works with USE_ITHREADS on Windows.

This allows the ability to provide an extra pointer (called the "host" environment) for all the system calls. This makes it possible for all the system stuff to maintain their own state, broken down into seven C structures. These are thin wrappers around the usual system calls (see win32/perllib.c) for the default perl executable, but for a more ambitious host (like the one that would do fork() emulation) all the extra work needed to pretend that different interpreters are actually different "processes", would be done here.

The Perl engine/interpreter and the host are orthogonal entities. There could be one or more interpreters in a process, and one or more "hosts", with free association between them.

Internal Functions

All of Perl's internal functions which will be exposed to the outside world are prefixed by Perl_ so that they will not conflict with XS functions or functions used in a program in which Perl is embedded. Similarly, all global variables begin with PL_. (By convention, static functions start with S_.)

Inside the Perl core (PERL_CORE defined), you can get at the functions either with or without the Perl_ prefix, thanks to a bunch of defines that live in embed.h. Note that extension code should not set PERL_CORE; this exposes the full perl internals, and is likely to cause breakage of the XS in each new perl release.

The file embed.h is generated automatically from embed.pl and embed.fnc. embed.pl also creates the prototyping header files for the internal functions, generates the documentation and a lot of other bits and pieces. It's important that when you add a new function to the core or change an existing one, you change the data in the table in embed.fnc as well. Here's a sample entry from that table:

    Apd |SV**   |av_fetch   |AV* ar|I32 key|I32 lval

The second column is the return type, the third column the name. Columns after that are the arguments. The first column is a set of flags:

  • A

    This function is a part of the public API. All such functions should also have 'd'; very few do not.

  • p

    This function has a Perl_ prefix; i.e. it is defined as Perl_av_fetch.

  • d

    This function has documentation using the apidoc feature which we'll look at in a second. Some functions have 'd' but not 'A'; docs are good.

Other available flags are:

  • s

    This is a static function and is defined as STATIC S_whatever, and usually called within the sources as whatever(...).

  • n

    This does not need an interpreter context, so the definition has no pTHX, and it follows that callers don't use aTHX. (See Background and PERL_IMPLICIT_CONTEXT.)

  • r

    This function never returns; croak, exit and friends.

  • f

    This function takes a variable number of arguments, printf style. The argument list should end with ..., like this:

        Afprd   |void   |croak   |const char* pat|...

  • M

    This function is part of the experimental development API, and may change or disappear without notice.

  • o

    This function should not have a compatibility macro to define, say, Perl_parse to parse. It must be called as Perl_parse.

  • x

    This function isn't exported out of the Perl core.

  • m

    This is implemented as a macro.

  • X

    This function is explicitly exported.

  • E

    This function is visible to extensions included in the Perl core.

  • b

    Binary backward compatibility; this function is a macro but also has a Perl_ implementation (which is exported).

  • others

    See the comments at the top of embed.fnc for others.

If you edit embed.pl or embed.fnc, you will need to run make regen_headers to force a rebuild of embed.h and other auto-generated files.

Formatted Printing of IVs, UVs, and NVs

If you are printing IVs, UVs, or NVs, then instead of the stdio(3) style formatting codes like %d, %ld, %f, you should use the following macros for portability:

    IVdf    IV in decimal
    UVuf    UV in decimal
    UVof    UV in octal
    UVxf    UV in hexadecimal
    NVef    NV %e-like
    NVff    NV %f-like
    NVgf    NV %g-like

These will take care of 64-bit integers and long doubles. For example:

    printf("IV is %"IVdf"\n", iv);

The IVdf will expand to whatever is the correct format for the IVs.

If you are printing addresses of pointers, use UVxf combined with PTR2UV(); do not use %lx or %p.
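The same trick, a macro that expands to a string literal and is pasted into the format by adjacent-literal concatenation, is what standard C offers in inttypes.h. The sketch below uses only the standard PRId64 and PRIxPTR macros rather than Perl's, so treat it as an analogy; the helper names are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* PRId64 expands to the right conversion for this platform
 * ("ld", "lld", ...), much as IVdf does for an IV. */
int toy_format_i64(char *buf, size_t size, int64_t value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "IV is %" PRId64 "\n", value);
}

/* Print an address by converting the pointer to an integer type that is
 * guaranteed wide enough (uintptr_t), the role PTR2UV() plays with UVxf. */
int toy_format_ptr(char *buf, size_t size, const void *p)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "0x%" PRIxPTR, (uintptr_t)p);
}
```

Hard-coding %ld in the first function would be wrong on any platform where the 64-bit type is long long, which is exactly the portability trap the macros exist to avoid.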

Pointer-To-Integer and Integer-To-Pointer

Because pointer size does not necessarily equal integer size, use the following macros to do it right.

    PTR2UV(pointer)
    PTR2IV(pointer)
    PTR2NV(pointer)
    INT2PTR(pointertotype, integer)

For example:

    IV iv = ...;
    SV *sv = INT2PTR(SV*, iv);

and

    AV *av = ...;
    UV uv = PTR2UV(av);

Exception Handling

There are a couple of macros to do very basic exception handling in XS modules. You have to define NO_XSLOCKS before including XSUB.h to be able to use these macros:

    #define NO_XSLOCKS
    #include "XSUB.h"

You can use these macros if you call code that may croak, but you need to do some cleanup before giving control back to Perl. For example:

    dXCPT;    /* set up necessary variables */

    XCPT_TRY_START {
      code_that_may_croak();
    } XCPT_TRY_END

    XCPT_CATCH
    {
      /* do cleanup here */
      XCPT_RETHROW;
    }

Note that you always have to rethrow an exception that has been caught. Using these macros, it is not possible to just catch the exception and ignore it. If you have to ignore the exception, you have to use the call_* functions.

The advantage of using the above macros is that you don't have to set up an extra function for call_*, and that using these macros is faster than using call_*.
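The catch, clean up, rethrow shape can be demonstrated with plain setjmp/longjmp. This is a generic sketch of that shape only; it makes no claim about how the XCPT_* macros are built, and all toy_* names are invented for the example.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf *toy_top_env;    /* innermost active handler, or NULL */

/* Plays the role of croak(): unwind to the innermost handler. */
void toy_croak(void)
{
    assert(toy_top_env != NULL);
    longjmp(*toy_top_env, 1);
}

void toy_fine(void)    { }
void toy_failing(void) { toy_croak(); }

/* Run fn(); if it croaks, release the resource, then rethrow outwards. */
void toy_with_cleanup(void (*fn)(void), int *cleanups)
{
    jmp_buf env;
    jmp_buf *volatile outer = toy_top_env;
    void *volatile res = malloc(16);    /* something that must not leak */

    toy_top_env = &env;
    if (setjmp(env) == 0) {
        fn();                           /* "try" */
        toy_top_env = outer;
        free(res);
    } else {
        toy_top_env = outer;            /* "catch": restore outer handler */
        free(res);                      /* do cleanup here */
        (*cleanups)++;
        toy_croak();                    /* rethrow */
    }
}

/* Outermost handler: returns 1 if fn croaked, 0 otherwise. */
int toy_try(void (*fn)(void), int *cleanups)
{
    jmp_buf env;

    toy_top_env = &env;
    if (setjmp(env) == 0) {
        toy_with_cleanup(fn, cleanups);
        toy_top_env = NULL;
        return 0;
    }
    toy_top_env = NULL;
    return 1;
}
```

The middle function never swallows the failure: it restores the outer handler, cleans up, and jumps again, which mirrors the rule that a caught exception must always be rethrown.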

Source Documentation

There's an effort going on to document the internal functions and automatically produce reference manuals from them - perlapi is one such manual which details all the functions which are available to XS writers. perlintern is the autogenerated manual for the functions which are not part of the API and are supposedly for internal use only.

Source documentation is created by putting POD comments into the C source, like this:

    /*
    =for apidoc sv_setiv

    Copies an integer into the given SV.  Does not handle 'set' magic.  See
    C<sv_setiv_mg>.

    =cut
    */

Please try and supply some documentation if you add functions to the Perl core.

Backwards compatibility

The Perl API changes over time. New functions are added or the interfaces of existing functions are changed. The Devel::PPPort module tries to provide compatibility code for some of these changes, so XS writers don't have to code it themselves when supporting multiple versions of Perl.

Devel::PPPort generates a C header file ppport.h that can also be run as a Perl script. To generate ppport.h, run:

    perl -MDevel::PPPort -eDevel::PPPort::WriteFile

Besides checking existing XS code, the script can also be used to retrieve compatibility information for various API calls using the --api-info command line switch. For example:

    % perl ppport.h --api-info=sv_magicext

For details, see perldoc ppport.h.

Unicode Support

Perl 5.6.0 introduced Unicode support. It's important for porters and XS writers to understand this support and make sure that the code they write does not corrupt Unicode data.

What is Unicode, anyway?

In the olden, less enlightened times, we all used to use ASCII. Most of us did, anyway. The big problem with ASCII is that it's American. Well, no, that's not actually the problem; the problem is that it's not particularly useful for people who don't use the Roman alphabet. What used to happen was that particular languages would stick their own alphabet in the upper range of the sequence, between 128 and 255. Of course, we then ended up with plenty of variants that weren't quite ASCII, and the whole point of it being a standard was lost.

Worse still, if you've got a language like Chinese or Japanese that has hundreds or thousands of characters, then you really can't fit them into a mere 256, so they had to forget about ASCII altogether, and build their own systems using pairs of numbers to refer to one character.

To fix this, some people formed Unicode, Inc. and produced a new character set containing all the characters you can possibly think of and more. There are several ways of representing these characters, and the one Perl uses is called UTF-8. UTF-8 uses a variable number of bytes to represent a character. You can learn more about Unicode and Perl's Unicode model in perlunicode.

How can I recognise a UTF-8 string?

You can't. This is because UTF-8 data is stored in bytes just like non-UTF-8 data. The Unicode character 200 (0xC8 for you hex types), capital E with a grave accent, is represented by the two bytes v195.136. Unfortunately, the non-Unicode string chr(195).chr(136) has that byte sequence as well. So you can't tell just by looking - this is what makes Unicode input an interesting problem.

In general, you either have to know what you're dealing with, or you have to guess. The API function is_utf8_string can help; it'll tell you if a string contains only valid UTF-8 characters. However, it can't do the work for you. On a character-by-character basis, is_utf8_char_buf will tell you whether the current character in a string is valid UTF-8.

How does UTF-8 represent Unicode characters?

As mentioned above, UTF-8 uses a variable number of bytes to store a character. Characters with values 0...127 are stored in one byte, just like good ol' ASCII. Character 128 is stored as v194.128; this continues up to character 191, which is v194.191. Now we've run out of bits (191 is binary 10111111) so we move on; 192 is v195.128. And so it goes on, moving to three bytes at character 2048.
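The byte values quoted above fall straight out of the bit layout. Here is a minimal standalone encoder for standard UTF-8 (up to U+10FFFF) that you can use to check them; toy_encode_utf8 is a hypothetical helper, not a Perl API function, and it makes no attempt to reject surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point into out[] (room for 4 bytes required).
 * Returns the number of bytes written, or 0 if cp is above U+10FFFF. */
size_t toy_encode_utf8(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {                        /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                     /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 11110xxx and three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Each continuation byte carries six payload bits, which is why the second byte wraps back to 128 every 64 characters and the lead byte moves up by one.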

Assuming you know you're dealing with a UTF-8 string, you can find out how long the first character in it is with the UTF8SKIP macro:

    char *utf = "\305\233\340\240\201";
    I32 len;

    len = UTF8SKIP(utf); /* len is 2 here */
    utf += len;
    len = UTF8SKIP(utf); /* len is 3 here */

Another way to skip over characters in a UTF-8 string is to use utf8_hop, which takes a string and a number of characters to skip over. You're on your own about bounds checking, though, so don't use it lightly.
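If you want to see what skipping with bounds checking looks like, here is a small sketch in plain C. toy_utf8_skip and toy_utf8_length are hypothetical helpers that handle standard UTF-8 lead bytes only; they are not the UTF8SKIP macro or utf8_hop.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes of the sequence introduced by lead byte b.
 * Returns 1 for ASCII and also for bytes that cannot start a sequence,
 * so that a caller always makes progress. */
size_t toy_utf8_skip(unsigned char b)
{
    if (b < 0x80)           return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Count the characters in [s, end) without ever stepping past end. */
size_t toy_utf8_length(const unsigned char *s, const unsigned char *end)
{
    size_t n = 0;

    assert(s <= end);
    while (s < end) {
        size_t skip = toy_utf8_skip(*s);
        if (skip > (size_t)(end - s))
            skip = (size_t)(end - s);   /* truncated final sequence */
        s += skip;
        n++;
    }
    return n;
}
```

The clamp against end is the part you have to supply yourself when a skipping routine does no bounds checking for you.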

All bytes in a multi-byte UTF-8 character will have the high bit set, so you can test if you need to do something special with this character like this (the UTF8_IS_INVARIANT() is a macro that tests whether the byte can be encoded as a single byte even in UTF-8):

    U8 *utf;
    U8 *utf_end; /* 1 beyond buffer pointed to by utf */
    UV uv;       /* Note: a UV, not a U8, not a char */
    STRLEN len;  /* length of character in bytes */

    if (!UTF8_IS_INVARIANT(*utf))
        /* Must treat this as UTF-8 */
        uv = utf8_to_uvchr_buf(utf, utf_end, &len);
    else
        /* OK to treat this character as a byte */
        uv = *utf;

You can also see in that example that we use utf8_to_uvchr_buf to get the value of the character; the inverse function uvchr_to_utf8 is available for putting a UV into UTF-8:

    if (!UTF8_IS_INVARIANT(uv))
        /* Must treat this as UTF8 */
        utf8 = uvchr_to_utf8(utf8, uv);
    else
        /* OK to treat this character as a byte */
        *utf8++ = uv;

You must convert characters to UVs using the above functions if you're ever in a situation where you have to match UTF-8 and non-UTF-8 characters. You may not skip over UTF-8 characters in this case. If you do this, you'll lose the ability to match hi-bit non-UTF-8 characters; for instance, if your UTF-8 string contains v195.136, and you skip that character, you can never match a chr(200) in a non-UTF-8 string. So don't do that!
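Here is a sketch of that kind of matching done by hand: decode the UTF-8 side to code points and compare each against the corresponding byte of the non-UTF-8 side. The functions are hypothetical, handle standard UTF-8 only, and do not reject overlong encodings; real code should use the API functions named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one character from [s, end). Sets *len to the bytes consumed,
 * or to 0 if the sequence is truncated or malformed. */
uint32_t toy_decode_utf8(const unsigned char *s, const unsigned char *end,
                         size_t *len)
{
    size_t need, i;
    uint32_t cp;

    assert(s < end && len != NULL);
    if (s[0] < 0x80)                { *len = 1; return s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { need = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; cp = s[0] & 0x07; }
    else                            { *len = 0; return 0; }

    if ((size_t)(end - s) < need)   { *len = 0; return 0; }
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)  { *len = 0; return 0; }
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    *len = need;
    return cp;
}

/* Compare a UTF-8 string with a one-byte-per-character string,
 * character by character rather than byte by byte. */
int toy_utf8_eq_latin1(const unsigned char *u, size_t ulen,
                       const unsigned char *l, size_t llen)
{
    const unsigned char *uend = u + ulen;
    size_t i = 0;

    while (u < uend) {
        size_t len;
        uint32_t cp = toy_decode_utf8(u, uend, &len);
        if (len == 0 || i >= llen || cp != l[i])
            return 0;
        u += len;
        i++;
    }
    return i == llen;
}
```

For instance, the UTF-8 bytes 0xC3 0xA9 compare equal to the single byte 233 (e with an acute accent), but not to the two raw bytes 0xC3 0xA9 taken as a non-UTF-8 string.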

How does Perl store UTF-8 strings?

Currently, Perl deals with Unicode strings and non-Unicode strings slightly differently. A flag in the SV, SVf_UTF8, indicates that the string is internally encoded as UTF-8. Without it, the byte value is the codepoint number and vice versa (in other words, the string is encoded as iso-8859-1, but use feature 'unicode_strings' is needed to get iso-8859-1 semantics). You can check and manipulate this flag with the following macros:

    SvUTF8(sv)
    SvUTF8_on(sv)
    SvUTF8_off(sv)

This flag has an important effect on Perl's treatment of the string: if Unicode data is not properly distinguished, regular expressions, length, substr and other string handling operations will have undesirable results.

The problem comes when you have, for instance, a string that isn't flagged as UTF-8, and contains a byte sequence that could be UTF-8 - especially when combining non-UTF-8 and UTF-8 strings.

Never forget that the SVf_UTF8 flag is separate from the PV value; you need to be sure you don't accidentally knock it off while you're manipulating SVs. More specifically, you cannot expect to do this:

    SV *sv;
    SV *nsv;
    STRLEN len;
    char *p;

    p = SvPV(sv, len);
    frobnicate(p);
    nsv = newSVpvn(p, len);

The char* string does not tell you the whole story, and you can't copy or reconstruct an SV just by copying the string value. Check if the old SV has the UTF8 flag set, and act accordingly:

  p = SvPV(sv, len);
  frobnicate(p);
  nsv = newSVpvn(p, len);
  if (SvUTF8(sv))
      SvUTF8_on(nsv);

In fact, your frobnicate function should be made aware of whether or not it's dealing with UTF-8 data, so that it can handle the string appropriately.

Because copying the string data of an SV passed to an XS function does not copy the UTF8 flag, passing a bare char * to an XS function is even less adequate.

How do I convert a string to UTF-8?

If you're mixing UTF-8 and non-UTF-8 strings, it is necessary to upgrade one of the strings to UTF-8. If you've got an SV, the easiest way to do this is:

  sv_utf8_upgrade(sv);

However, you must not do this, for example:

  if (!SvUTF8(left))
      sv_utf8_upgrade(left);

If you do this in a binary operator, you will actually change one of the strings that came into the operator, and, while it shouldn't be noticeable by the end user, it can cause problems in deficient code.

Instead, bytes_to_utf8 will give you a UTF-8-encoded copy of its string argument. This is useful for having the data available for comparisons and so on, without harming the original SV. There's also utf8_to_bytes to go the other way, but naturally, this will fail if the string contains any characters above 255 that can't be represented in a single byte.
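The same "upgrade a copy, not the original" idea can be sketched at the Perl level. This is only an analogy to the C code discussed above: it uses the built-in utf8::upgrade function, and upgraded_copy is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: return an upgraded copy of a string, leaving the
# caller's scalar untouched.
sub upgraded_copy {
    my ($str) = @_;         # copies the caller's value
    utf8::upgrade($str);    # changes the internal encoding of the copy only
    return $str;
}

my $latin1 = "caf\xE9";     # four characters, UTF8 flag off
my $copy   = upgraded_copy($latin1);

printf "original flagged: %s\n", utf8::is_utf8($latin1) ? "yes" : "no";
printf "copy flagged:     %s\n", utf8::is_utf8($copy)   ? "yes" : "no";
printf "still equal:      %s\n", $latin1 eq $copy       ? "yes" : "no";
```

The copy is flagged and the original is not, yet both compare equal as strings, so the copy can be used for comparisons without altering the value that was passed in.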

Is there anything else I need to know?

Not really. Just remember these things:

  • There's no way to tell if a char * string is UTF-8 or not. You can tell if an SV is UTF-8 by looking at its SvUTF8 flag. Don't forget to set the flag if something should be UTF-8. Treat the flag as part of the PV, even though it's not - if you pass on the PV to somewhere, pass on the flag too.

  • If a string is UTF-8, always use utf8_to_uvchr_buf to get at the value, unless UTF8_IS_INVARIANT(*s) in which case you can use *s.

  • When writing a character uv to a UTF-8 string, always use uvchr_to_utf8, unless UTF8_IS_INVARIANT(uv) in which case you can use *s = uv.

  • Mixing UTF-8 and non-UTF-8 strings is tricky. Use bytes_to_utf8 to get a new string which is UTF-8 encoded, and then combine them.

Custom Operators

Custom operator support is an experimental feature that allows you to define your own ops. This is primarily to allow the building of interpreters for other languages in the Perl core, but it also allows optimizations through the creation of "macro-ops" (ops which perform the functions of multiple ops which are usually executed together, such as gvsv, gvsv, add).

This feature is implemented as a new op type, OP_CUSTOM. The Perl core does not "know" anything special about this op type, and so it will not be involved in any optimizations. This also means that you can define your custom ops to be any op structure - unary, binary, list and so on - you like.

It's important to know what custom operators won't do for you. They won't let you add new syntax to Perl, directly. They won't even let you add new keywords, directly. In fact, they won't change the way Perl compiles a program at all. You have to do those changes yourself, after Perl has compiled the program. You do this either by manipulating the op tree using a CHECK block and the B::Generate module, or by adding a custom peephole optimizer with the optimize module.

When you do this, you replace ordinary Perl ops with custom ops by creating ops with the type OP_CUSTOM and the op_ppaddr of your own PP function. This should be defined in XS code, and should look like the PP ops in pp_*.c. You are responsible for ensuring that your op takes the appropriate number of values from the stack, and you are responsible for adding stack marks if necessary.

You should also "register" your op with the Perl interpreter so that it can produce sensible error and warning messages. Since it is possible to have multiple custom ops within the one "logical" op type OP_CUSTOM, Perl uses the value of o->op_ppaddr to determine which custom op it is dealing with. You should create an XOP structure for each ppaddr you use, set the properties of the custom op with XopENTRY_set, and register the structure against the ppaddr using Perl_custom_op_register. A trivial example might look like:

  static XOP my_xop;
  static OP *my_pp(pTHX);

  BOOT:
      XopENTRY_set(&my_xop, xop_name, "myxop");
      XopENTRY_set(&my_xop, xop_desc, "Useless custom op");
      Perl_custom_op_register(aTHX_ my_pp, &my_xop);

The available fields in the structure are:

  • xop_name

    A short name for your op. This will be included in some error messages, and will also be returned as $op->name by the B module, so it will appear in the output of modules like B::Concise.

  • xop_desc

    A short description of the function of the op.

  • xop_class

    Which of the various *OP structures this op uses. This should be one of the OA_* constants from op.h, namely

    • OA_BASEOP
    • OA_UNOP
    • OA_BINOP
    • OA_LOGOP
    • OA_LISTOP
    • OA_PMOP
    • OA_SVOP
    • OA_PADOP
    • OA_PVOP_OR_SVOP

      This should be interpreted as 'PVOP' only. The _OR_SVOP is because the only core PVOP, OP_TRANS, can sometimes be an SVOP instead.

    • OA_LOOP
    • OA_COP

    The other OA_* constants should not be used.

  • xop_peep

    This member is of type Perl_cpeep_t, which expands to void (*Perl_cpeep_t)(aTHX_ OP *o, OP *oldop). If it is set, this function will be called from Perl_rpeep when ops of this type are encountered by the peephole optimizer. o is the OP that needs optimizing; oldop is the previous OP optimized, whose op_next points to o.

B::Generate directly supports the creation of custom ops by name.

AUTHORS

Until May 1997, this document was maintained by Jeff Okamoto <okamoto@corp.hp.com>. It is now maintained as part of Perl itself by the Perl 5 Porters <perl5-porters@perl.org>.

With lots of help and suggestions from Dean Roehrich, Malcolm Beattie, Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, Neil Bowers, Matthew Green, Tim Bunce, Spider Boardman, Ulrich Pfeifer, Stephen McCamant, and Gurusamy Sarathy.

SEE ALSO

perlapi, perlintern, perlxs, perlembed

perlhack

Perl 5 version 18.2 documentation

NAME

perlhack - How to hack on Perl

DESCRIPTION

This document explains how Perl development works. It includes details about the Perl 5 Porters email list, the Perl repository, the Perlbug bug tracker, patch guidelines, and commentary on Perl development philosophy.

SUPER QUICK PATCH GUIDE

If you just want to submit a single small patch like a pod fix, a test for a bug, comment fixes, etc., it's easy! Here's how:

  • Check out the source repository

    The perl source is in a git repository. You can clone the repository with the following command:

    % git clone git://perl5.git.perl.org/perl.git perl

  • Ensure you're following the latest advice

    In case the advice in this guide has been updated recently, read the latest version directly from the perl source:

    % perldoc pod/perlhack.pod

  • Make your change

    Hack, hack, hack.

  • Test your change

    You can run all the tests with the following commands:

    % ./Configure -des -Dusedevel
    % make test

    Keep hacking until the tests pass.

  • Commit your change

    Committing your work will save the change on your local system:

    % git commit -a -m 'Commit message goes here'

    Make sure the commit message describes your change in a single sentence. For example, "Fixed spelling errors in perlhack.pod".

  • Send your change to perlbug

    The next step is to submit your patch to the Perl core ticket system via email.

    If your changes are in a single git commit, run the following commands to write the file as a MIME attachment and send it with a meaningful subject:

    % git format-patch -1 --attach
    % ./perl -Ilib utils/perlbug -s "[PATCH] $(
          git log -1 --oneline HEAD)" -f 0001-*.patch

    The perlbug program will ask you a few questions about your email address and the patch you're submitting. Once you've answered them it will submit your patch via email.

    If your changes are in multiple commits, generate a patch file containing them all, and attach that:

    % git format-patch origin/blead --attach --stdout > patches
    % ./perl -Ilib utils/perlbug -f patches

    When prompted, pick a subject that summarizes your changes overall and has "[PATCH]" at the beginning.

  • Thank you

    The porters appreciate the time you spent helping to make Perl better. Thank you!

  • Next time

    The next time you wish to make a patch, you need to start from the latest perl in a pristine state. Check you don't have any local changes or added files in your perl check-out which you wish to keep, then run these commands:

    % git pull
    % git reset --hard origin/blead
    % git clean -dxf

BUG REPORTING

If you want to report a bug in Perl, you must use the perlbug command line tool. This tool will ensure that your bug report includes all the relevant system and configuration information.

To browse existing Perl bugs and patches, you can use the web interface at http://rt.perl.org/.

Please check the archive of the perl5-porters list (see below) and/or the bug tracking system before submitting a bug report. Often, you'll find that the bug has been reported already.

You can log in to the bug tracking system and comment on existing bug reports. If you have additional information regarding an existing bug, please add it. This will help the porters fix the bug.

PERL 5 PORTERS

The perl5-porters (p5p) mailing list is where the Perl standard distribution is maintained and developed. The people who maintain Perl are also referred to as the "Perl 5 Porters", "p5p" or just the "porters".

A searchable archive of the list is available at http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/. There is also another archive at http://archive.develooper.com/perl5-porters@perl.org/.

perl5-changes mailing list

The perl5-changes mailing list receives a copy of each patch that gets submitted to the maintenance and development branches of the perl repository. See http://lists.perl.org/list/perl5-changes.html for subscription and archive information.

#p5p on IRC

Many porters are also active on the irc://irc.perl.org/#p5p channel. Feel free to join the channel and ask questions about hacking on the Perl core.

GETTING THE PERL SOURCE

All of Perl's source code is kept centrally in a Git repository at perl5.git.perl.org. The repository contains many Perl revisions from Perl 1 onwards and all the revisions from Perforce, the previous version control system.

For much more detail on using git with the Perl repository, please see perlgit.

Read access via Git

You will need a copy of Git for your computer. You can fetch a copy of the repository using the git protocol:

  % git clone git://perl5.git.perl.org/perl.git perl

This clones the repository and makes a local copy in the perl directory.

If you cannot use the git protocol for firewall reasons, you can also clone via http, though this is much slower:

  % git clone http://perl5.git.perl.org/perl.git perl

Read access via the web

You may access the repository over the web. This allows you to browse the tree, see recent commits, subscribe to RSS feeds for the changes, search for particular commits and more. You may access it at http://perl5.git.perl.org/perl.git. A mirror of the repository is found at http://github.com/mirrors/perl.

Read access via rsync

You can also choose to use rsync to get a copy of the current source tree for the bleadperl branch and all maintenance branches:

  % rsync -avz rsync://perl5.git.perl.org/perl-current .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.12.x .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.10.x .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.8.x .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.6.x .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.005xx .

(Add the --delete option to remove leftover files.)

To get a full list of the available sync points:

  % rsync perl5.git.perl.org::

Write access via git

If you have a commit bit, please see perlgit for more details on using git.

PATCHING PERL

If you're planning to do more extensive work than a single small fix, we encourage you to read the documentation below. This will help you focus your work and make your patches easier to incorporate into the Perl source.

Submitting patches

If you have a small patch to submit, please submit it via perlbug. You can also send email directly to perlbug@perl.org. Please note that messages sent to perlbug may be held in a moderation queue, so you won't receive a response immediately.

You'll know your submission has been processed when you receive an email from our ticket tracking system. This email will give you a ticket number. Once your patch has made it to the ticket tracking system, it will also be sent to the perl5-porters@perl.org list.

Patches are reviewed and discussed on the p5p list. Simple, uncontroversial patches will usually be applied without any discussion. When the patch is applied, the ticket will be updated and you will receive email. In addition, an email will be sent to the p5p list.

In other cases, the patch will need more work or discussion. That will happen on the p5p list.

You are encouraged to participate in the discussion and advocate for your patch. Sometimes your patch may get lost in the shuffle. It's appropriate to send a reminder email to p5p if no action has been taken in a month. Please remember that the Perl 5 developers are all volunteers, and be polite.

Changes are always applied directly to the main development branch, called "blead". Some patches may be backported to a maintenance branch. If you think your patch is appropriate for the maintenance branch (see MAINTENANCE BRANCHES in perlpolicy), please explain why when you submit it.

Getting your patch accepted

If you are submitting a code patch there are several things that you can do to help the Perl 5 Porters accept your patch.

Patch style

If you used git to check out the Perl source, then using git format-patch will produce a patch in a style suitable for Perl. The format-patch command produces one patch file for each commit you made. If you prefer to send a single patch for all commits, you can use git diff.

  % git checkout blead
  % git pull
  % git diff blead my-branch-name

This produces a patch based on the difference between blead and your branch. It's important that blead is up to date before you produce the diff, which is why git pull is run first.

We strongly recommend that you use git if possible. It will make your life easier, and ours as well.

However, if you're not using git, you can still produce a suitable patch. You'll need a pristine copy of the Perl source to diff against. The porters prefer unified diffs. Using GNU diff, you can produce a diff like this:

  % diff -Npurd perl.pristine perl.mine

Make sure that you run make realclean in your copy of Perl to remove any build artifacts, or you may get a confusing result.

Commit message

As you craft each patch you intend to submit to the Perl core, it's important to write a good commit message. This is especially important if your submission will consist of a series of commits.

The first line of the commit message should be a short description without a period. It should be no longer than the subject line of an email, 50 characters being a good rule of thumb.

A lot of Git tools (Gitweb, GitHub, git log --pretty=oneline, ...) will only display the first line (cut off at 50 characters) when presenting commit summaries.
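The first-line rules above are easy to check mechanically. The following is a minimal sketch of such a check; subject_problems is a hypothetical helper, not a tool shipped with Perl, and the 50-character limit is the rule of thumb quoted above rather than a hard requirement.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list of complaints about the first line
# of a commit message. An empty list means the subject looks fine.
sub subject_problems {
    my ($message) = @_;
    my ($subject) = split /\n/, $message, 2;
    $subject = '' unless defined $subject;

    my @problems;
    push @problems, 'subject is empty'              if $subject eq '';
    push @problems, 'subject exceeds 50 characters' if length($subject) > 50;
    push @problems, 'subject ends with a period'    if $subject =~ /\.\z/;
    return @problems;
}

my $message  = "Fix spelling errors in perlhack.pod\n\nSeveral words were misspelled.\n";
my @problems = subject_problems($message);
print @problems ? join("\n", @problems) . "\n" : "subject looks fine\n";
```

Only the first line is inspected; the body of the message is free to be as long as the change requires.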

The commit message should include a description of the problem that the patch corrects or new functionality that the patch adds.

As a general rule of thumb, your commit message should help a programmer who knows the Perl core quickly understand what you were trying to do, how you were trying to do it, and why the change matters to Perl.

  • Why

    Your commit message should describe why the change you are making is important. When someone looks at your change in six months or six years, your intent should be clear.

    If you're deprecating a feature with the intent of later simplifying another bit of code, say so. If you're fixing a performance problem or adding a new feature to support some other bit of the core, mention that.

  • What

    Your commit message should describe what part of the Perl core you're changing and what you expect your patch to do.

  • How

    While it's not necessary for documentation changes, new tests or trivial patches, it's often worth explaining how your change works. Even if it's clear to you today, it may not be clear to a porter next month or next year.

A commit message isn't intended to take the place of comments in your code. Commit messages should describe the change you made, while code comments should describe the current state of the code.

If you've just implemented a new feature, complete with doc, tests and well-commented code, a brief commit message will often suffice. If, however, you've just changed a single character deep in the parser or lexer, you might need to write a small novel to ensure that future readers understand what you did and why you did it.

Comments, Comments, Comments

Be sure to adequately comment your code. While commenting every line is unnecessary, anything that takes advantage of side effects of operators, that creates changes that will be felt outside of the function being patched, or that others may find confusing should be documented. If you are going to err, it is better to err on the side of adding too many comments than too few.

The best comments explain why the code does what it does, not what it does.

Style

In general, please follow the particular style of the code you are patching.

In particular, follow these general guidelines for patching Perl sources:

  • 8-wide tabs (no exceptions!)

  • 4-wide indents for code, 2-wide indents for nested CPP #defines

  • Try hard not to exceed 79-columns

  • ANSI C prototypes

  • Uncuddled elses and "K&R" style for indenting control constructs

  • No C++ style (//) comments

  • Mark places that need to be revisited with XXX (and revisit often!)

  • Opening brace lines up with "if" when conditional spans multiple lines; should be at end-of-line otherwise

  • In function definitions, name starts in column 0 (return value is on previous line)

  • Single space after keywords that are followed by parens, no space between function name and following paren

  • Avoid assignments in conditionals, but if they're unavoidable, use extra paren, e.g. "if (a && (b = c)) ..."

  • "return foo;" rather than "return(foo);"

  • "if (!foo) ..." rather than "if (foo == FALSE) ..." etc.

  • Do not declare variables using "register". It may be counterproductive with modern compilers, and is deprecated in C++, under which the Perl source is regularly compiled.

  • In-line functions that are in headers that are accessible to XS code need to be able to compile without warnings with commonly used extra compilation flags, such as gcc's -Wswitch-default which warns whenever a switch statement does not have a "default" case. The use of these extra flags is to catch potential problems in legal C code, and is often used by Perl aggregators, such as Linux distributors.

Test suite

If your patch changes code (rather than just changing documentation), you should also include one or more test cases which illustrate the bug you're fixing or validate the new functionality you're adding. In general, you should update an existing test file rather than create a new one.

Your test suite additions should generally follow these guidelines (courtesy of Gurusamy Sarathy <gsar@activestate.com>):

  • Know what you're testing. Read the docs, and the source.

  • Tend to fail, not succeed.

  • Interpret results strictly.

  • Use unrelated features (this will flush out bizarre interactions).

  • Use non-standard idioms (otherwise you are not testing TIMTOWTDI).

  • Avoid using hardcoded test numbers whenever possible (the EXPECTED/GOT found in t/op/tie.t is much more maintainable, and gives better failure reports).

  • Give meaningful error messages when a test fails.

  • Avoid using qx// and system() unless you are testing for them. If you do use them, make sure that you cover _all_ perl platforms.

  • Unlink any temporary files you create.

  • Promote unforeseen warnings to errors with $SIG{__WARN__}.

  • Be sure to use the libraries and modules shipped with the version being tested, not those that were already installed.

  • Add comments to the code explaining what you are testing for.

  • Make updating the '1..42' string unnecessary. Or make sure that you update it.

  • Test _all_ behaviors of a given operator, library, or function.

    Test all optional arguments.

    Test return values in various contexts (boolean, scalar, list, lvalue).

    Use both global and lexical variables.

    Don't forget the exceptional, pathological cases.
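Several of these guidelines can be combined in a few lines. The sketch below is illustrative only: tap_ok is a hypothetical helper rather than part of t/test.pl or Test::More, and reverse is chosen merely as a built-in whose result depends on context.

```perl
use strict;
use warnings;

# Promote unforeseen warnings to errors so they cannot pass unnoticed.
local $SIG{__WARN__} = sub { die "Unexpected warning: $_[0]" };

my $test = 0;    # running counter, so no test number is hardcoded

# Hypothetical helper: print one TAP line with a meaningful description.
sub tap_ok {
    my ($pass, $name) = @_;
    $test++;
    print +($pass ? "ok" : "not ok"), " $test - $name\n";
    return $pass;
}

# Test return values in more than one context.
my @list   = reverse("ab", "cd");
my $scalar = reverse("ab", "cd");

tap_ok("@list" eq "cd ab", "reverse in list context reverses the list");
tap_ok($scalar eq "dcba",  "reverse in scalar context reverses the string");

# Emit the plan last, so it never needs updating by hand.
print "1..$test\n";
```

Printing the plan after the tests is one way to make updating the '1..42' string unnecessary; a real core test would normally get this behaviour from t/test.pl or Test::More instead.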

Patching a core module

This works just like patching anything else, with one extra consideration.

Modules in the cpan/ directory of the source tree are maintained outside of the Perl core. When the author updates the module, the updates are simply copied into the core. See that module's documentation or its listing on http://search.cpan.org/ for more information on reporting bugs and submitting patches.

In most cases, patches to modules in cpan/ should be sent upstream and should not be applied to the Perl core individually. If a patch to a file in cpan/ absolutely cannot wait for the fix to be made upstream, released to CPAN and copied to blead, you must add (or update) a CUSTOMIZED entry in the "Porting/Maintainers.pl" file to flag that a local modification has been made. See "Porting/Maintainers.pl" for more details.

In contrast, modules in the dist/ directory are maintained in the core.

Updating perldelta

For changes significant enough to warrant a pod/perldelta.pod entry, the porters will greatly appreciate it if you submit a delta entry along with your actual change. Significant changes include, but are not limited to:

  • Adding, deprecating, or removing core features

  • Adding, deprecating, removing, or upgrading core or dual-life modules

  • Adding new core tests

  • Fixing security issues and user-visible bugs in the core

  • Changes that might break existing code, either on the perl or C level

  • Significant performance improvements

  • Adding, removing, or significantly changing documentation in the pod/ directory

  • Important platform-specific changes

Please make sure you add the perldelta entry to the right section within pod/perldelta.pod. More information on how to write good perldelta entries is available in the Style section of Porting/how_to_write_a_perldelta.pod.

What makes for a good patch?

New features and extensions to the language can be contentious. There is no specific set of criteria which determine what features get added, but here are some questions to consider when developing a patch:

Does the concept match the general goals of Perl?

Our goals include, but are not limited to:

  1. Keep it fast, simple, and useful.

  2. Keep features/concepts as orthogonal as possible.

  3. No arbitrary limits (platforms, data sizes, cultures).

  4. Keep it open and exciting to use/patch/advocate Perl everywhere.

  5. Either assimilate new technologies, or build bridges to them.

Where is the implementation?

All the talk in the world is useless without an implementation. In almost every case, the person or people who argue for a new feature will be expected to be the ones who implement it. Porters capable of coding new features have their own agendas, and are not available to implement your (possibly good) idea.

Backwards compatibility

It's a cardinal sin to break existing Perl programs. New warnings can be contentious--some say that a program that emits warnings is not broken, while others say it is. Adding keywords has the potential to break programs; changing the meaning of existing token sequences or functions might break programs too.

The Perl 5 core includes mechanisms to help porters make backwards incompatible changes more compatible such as the feature and deprecate modules. Please use them when appropriate.

Could it be a module instead?

Perl 5 has extension mechanisms, modules and XS, specifically to avoid the need to keep changing the Perl interpreter. You can write modules that export functions, you can give those functions prototypes so they can be called like built-in functions, you can even write XS code to mess with the runtime data structures of the Perl interpreter if you want to implement really complicated things.

Whenever possible, new features should be prototyped in a CPAN module before they will be considered for the core.
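As a small illustration of how far a plain module can go, the sketch below defines a function with a prototype so that it can be called with a bare block, much like the built-in grep or map. The name apply_to_each is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# The (&@) prototype lets callers pass a bare block as the first argument,
# without the sub keyword or a trailing comma.
sub apply_to_each(&@) {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

my @doubled = apply_to_each { $_[0] * 2 } 1, 2, 3;
print "@doubled\n";
```

Nothing here required a change to the interpreter, which is the point of prototyping a feature as a module first.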

Is the feature generic enough?

Is this something that only the submitter wants added to the language, or is it broadly useful? Sometimes, instead of adding a feature with a tight focus, the porters might decide to wait until someone implements the more generalized feature.

Does it potentially introduce new bugs?

Radical rewrites of large chunks of the Perl interpreter have the potential to introduce new bugs.

How big is it?

The smaller and more localized the change, the better. Similarly, a series of small patches is greatly preferred over a single large patch.

Does it preclude other desirable features?

A patch is likely to be rejected if it closes off future avenues of development. For instance, a patch that placed a true and final interpretation on prototypes is likely to be rejected because there are still options for the future of prototypes that haven't been addressed.

Is the implementation robust?

Good patches (tight code, complete, correct) stand more chance of going in. Sloppy or incorrect patches might be placed on the back burner until the pumpking has time to fix them, or might be discarded altogether without further notice.

Is the implementation generic enough to be portable?

The worst patches make use of system-specific features. It's highly unlikely that non-portable additions to the Perl language will be accepted.

Is the implementation tested?

Patches which change behaviour (fixing bugs or introducing new features) must include regression tests to verify that everything works as expected.

Without tests provided by the original author, how can anyone else changing perl in the future be sure that they haven't unwittingly broken the behaviour the patch implements? And without tests, how can the patch's author be confident that his/her hard work put into the patch won't be accidentally thrown away by someone in the future?

Is there enough documentation?

Patches without documentation are probably ill-thought out or incomplete. No features can be added or changed without documentation, so submitting a patch for the appropriate pod docs as well as the source code is important.

Is there another way to do it?

Larry said "Although the Perl Slogan is There's More Than One Way to Do It, I hesitate to make 10 ways to do something". This is a tricky heuristic to navigate, though--one man's essential addition is another man's pointless cruft.

Does it create too much work?

Work for the pumpking, work for Perl programmers, work for module authors, ... Perl is supposed to be easy.

Patches speak louder than words

Working code is always preferred to pie-in-the-sky ideas. A patch to add a feature stands a much higher chance of making it to the language than does a random feature request, no matter how fervently argued the request might be. This ties into "Will it be useful?", as the fact that someone took the time to make the patch demonstrates a strong desire for the feature.

TESTING

The core uses the same testing style as the rest of Perl, a simple "ok/not ok" run through Test::Harness, but there are a few special considerations.

There are three ways to write a test in the core: Test::More, t/test.pl and ad hoc print $test ? "ok 42\n" : "not ok 42\n". The decision of which to use depends on what part of the test suite you're working on. This is a measure to prevent a high-level failure (such as Config.pm breaking) from causing basic functionality tests to fail.

The t/test.pl library provides some of the features of Test::More, but avoids loading most modules and uses as few core features as possible.

If you write your own test, use the Test Anything Protocol.

  • t/base, t/comp and t/opbasic

    Since we don't know if require works, or even subroutines, use ad hoc tests for these three. Step carefully to avoid using the feature being tested. Tests in t/opbasic, for instance, have been placed there rather than in t/op because they test functionality which t/test.pl presumes has already been demonstrated to work.

  • t/cmd, t/run, t/io and t/op

    Now that basic require() and subroutines are tested, you can use the t/test.pl library.

    You can also use certain libraries like Config conditionally, but be sure to skip the test gracefully if it's not there.

  • Everything else

    Now that the core of Perl is tested, Test::More can and should be used. You can also use the full suite of core modules in the tests.
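For the lowest tier, an ad hoc test might look like the following minimal sketch. It loads no modules and defines no subroutines, and the two checks are placeholders chosen for illustration rather than taken from an actual file under t/base.

```perl
# Ad hoc style for the most basic tests: no modules, no subroutines.
print "1..2\n";

my $sum = 1 + 1;
print(($sum == 2 ? "ok 1" : "not ok 1"), " - addition\n");

my $joined = "a" . "b";
print(($joined eq "ab" ? "ok 2" : "not ok 2"), " - concatenation\n");
```

Test numbers are hardcoded here because the style deliberately avoids anything more elaborate; higher tiers can rely on t/test.pl or Test::More to do the counting.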

When you say "make test", Perl uses the t/TEST program to run the test suite (except under Win32 where it uses t/harness instead). All tests are run from the t/ directory, not the directory which contains the test. This causes some problems with the tests in lib/, so here's some opportunity for some patching.

You must be triply conscious of cross-platform concerns. This usually boils down to using File::Spec and avoiding things like fork() and system() unless absolutely necessary.

Special make test targets

There are various special make targets that can be used to test Perl slightly differently than the standard "test" target. Not all of them are expected to give a 100% success rate. Many of them have several aliases, and many of them are not available on certain operating systems.

  • test_porting

    This runs some basic sanity tests on the source tree and helps catch basic errors before you submit a patch.

  • minitest

    Run miniperl on t/base, t/comp, t/cmd, t/run, t/io, t/op, t/uni and t/mro tests.

  • test.valgrind check.valgrind

    (Only in Linux) Run all the tests using the memory leak + naughty memory access tool "valgrind". The log files will be named testname.valgrind.

  • test_harness

    Run the test suite with the t/harness controlling program, instead of t/TEST. t/harness is more sophisticated, and uses the Test::Harness module, thus using this test target supposes that perl mostly works. The main advantage for our purposes is that it prints a detailed summary of failed tests at the end. Also, unlike t/TEST, it doesn't redirect stderr to stdout.

    Note that under Win32 t/harness is always used instead of t/TEST, so there is no special "test_harness" target.

    Under Win32's "test" target you may use the TEST_SWITCHES and TEST_FILES environment variables to control the behaviour of t/harness. This means you can say

    nmake test TEST_FILES="op/*.t"
    nmake test TEST_SWITCHES="-torture" TEST_FILES="op/*.t"

  • test-notty test_notty

    Sets PERL_SKIP_TTY_TEST to true before running normal test.

Parallel tests

The core distribution can now run its regression tests in parallel on Unix-like platforms. Instead of running make test , set TEST_JOBS in your environment to the number of tests to run in parallel, and run make test_harness . On a Bourne-like shell, this can be done as

    TEST_JOBS=3 make test_harness # Run 3 tests in parallel

An environment variable is used, rather than parallel make itself, because TAP::Harness needs to be able to schedule individual non-conflicting test scripts itself, and there is no standard interface to make utilities to interact with their job schedulers.

Note that currently some test scripts may fail when run in parallel (most notably ext/IO/t/io_dir.t). If necessary, run just the failing scripts again sequentially and see if the failures go away.

Running tests by hand

You can run part of the test suite by hand by using one of the following commands from the t/ directory:

    ./perl -I../lib TEST list-of-.t-files

or

    ./perl -I../lib harness list-of-.t-files

(If you don't specify test scripts, the whole test suite will be run.)

Using t/harness for testing

If you use harness for testing, you have several command line options available to you. The arguments are as follows, and are in the order that they must appear if used together.

    harness -v -torture -re=pattern LIST OF FILES TO TEST
    harness -v -torture -re LIST OF PATTERNS TO MATCH

If LIST OF FILES TO TEST is omitted, the file list is obtained from the manifest. The file list may include shell wildcards which will be expanded out.

  • -v

    Run the tests under verbose mode so you can see what tests were run, and debug output.

  • -torture

    Run the torture tests as well as the normal set.

  • -re=PATTERN

    Filter the file list so that all the test files run match PATTERN. Note that this form is distinct from the -re LIST OF PATTERNS form below in that it allows the file list to be provided as well.

  • -re LIST OF PATTERNS

    Filter the file list so that all the test files run match /(LIST|OF|PATTERNS)/. Note that with this form the patterns are joined by '|' and you cannot supply a list of files; instead, the test files are obtained from the MANIFEST.

You can run an individual test by a command similar to

    ./perl -I../lib path/to/foo.t

except that the harnesses set up some environment variables that may affect the execution of the test:

  • PERL_CORE=1

    indicates that we're running this test as part of the perl core test suite. This is useful for modules that have a dual life on CPAN.

  • PERL_DESTRUCT_LEVEL=2

    is set to 2 if it isn't set already (see PERL_DESTRUCT_LEVEL in perlhacktips).

  • PERL

    (used only by t/TEST) if set, overrides the path to the perl executable that should be used to run the tests (the default being ./perl).

  • PERL_SKIP_TTY_TEST

    if set, tells the harness to skip the tests that need a terminal. It's actually set automatically by the Makefile, but can also be forced artificially by running 'make test_notty'.

Other environment variables that may influence tests

  • PERL_TEST_Net_Ping

    Setting this variable runs all of the Net::Ping module's tests; otherwise some tests that interact with the outside world are skipped. See perl58delta.

  • PERL_TEST_NOVREXX

    Setting this variable skips the vrexx.t tests for OS2::REXX.

  • PERL_TEST_NUMCONVERTS

    This sets a variable in op/numconvert.t.

  • PERL_TEST_MEMORY

    Setting this variable includes the tests in t/bigmem/. This should be set to the number of gigabytes of memory available for testing, e.g. PERL_TEST_MEMORY=4 indicates that tests that require 4GiB of available memory can be run safely.

See also the documentation for the Test and Test::Harness modules, for more environment variables that affect testing.

MORE READING FOR GUTS HACKERS

To hack on the Perl guts, you'll need to read the following things:

  • perlsource

    An overview of the Perl source tree. This will help you find the files you're looking for.

  • perlinterp

    An overview of the Perl interpreter source code and some details on how Perl does what it does.

  • perlhacktut

    This document walks through the creation of a small patch to Perl's C code. If you're just getting started with Perl core hacking, this will help you understand how it works.

  • perlhacktips

    More details on hacking the Perl core. This document focuses on lower level details such as how to write tests, compilation issues, portability, debugging, etc.

    If you plan on doing serious C hacking, make sure to read this.

  • perlguts

    This is of paramount importance, since it's the documentation of what goes where in the Perl source. Read it over a couple of times and it might start to make sense - don't worry if it doesn't yet, because the best way to study it is to read it in conjunction with poking at Perl source, and we'll do that later on.

    Gisle Aas's "illustrated perlguts", also known as illguts, has very helpful pictures:

    http://search.cpan.org/dist/illguts/

  • perlxstut and perlxs

    A working knowledge of XSUB programming is incredibly useful for core hacking; XSUBs use techniques drawn from the PP code, the portion of the guts that actually executes a Perl program. It's a lot gentler to learn those techniques from simple examples and explanation than from the core itself.

  • perlapi

    The documentation for the Perl API explains what some of the internal functions do, as well as the many macros used in the source.

  • Porting/pumpkin.pod

    This is a collection of words of wisdom for a Perl porter; some of it is only useful to the pumpkin holder, but most of it applies to anyone wanting to go about Perl development.

CPAN TESTERS AND PERL SMOKERS

The CPAN testers ( http://testers.cpan.org/ ) are a group of volunteers who test CPAN modules on a variety of platforms.

Perl Smokers ( http://www.nntp.perl.org/group/perl.daily-build/ and http://www.nntp.perl.org/group/perl.daily-build.reports/ ) automatically test Perl source releases on platforms with various configurations.

Both efforts welcome volunteers. To get involved in smoke testing of perl itself, visit http://search.cpan.org/dist/Test-Smoke/. To start smoke testing CPAN modules, visit http://search.cpan.org/dist/CPANPLUS-YACSmoke/ or http://search.cpan.org/dist/minismokebox/ or http://search.cpan.org/dist/CPAN-Reporter/.

WHAT NEXT?

If you've read all the documentation in this document and the ones listed above, you're more than ready to hack on Perl.

Here are some more recommendations:

  • Subscribe to perl5-porters, follow the patches and try and understand them; don't be afraid to ask if there's a portion you're not clear on - who knows, you may unearth a bug in the patch...

  • Do read the README associated with your operating system, e.g. README.aix on the IBM AIX OS. Don't hesitate to supply patches to that README if you find anything missing or changed over a new OS release.

  • Find an area of Perl that seems interesting to you, and see if you can work out how it works. Scan through the source, and step over it in the debugger. Play, poke, investigate, fiddle! You'll probably get to understand not just your chosen area but a much wider range of perl's activity as well, and probably sooner than you'd think.

"The Road goes ever on and on, down from the door where it began."

If you can do these things, you've started on the long road to Perl porting. Thanks for wanting to help make Perl better - and happy hacking!

Metaphoric Quotations

If you recognized the quote about the Road above, you're in luck.

Most software projects begin each file with a literal description of each file's purpose. Perl instead begins each with a literary allusion to that file's purpose.

Like chapters in many books, all top-level Perl source files (along with a few others here and there) begin with an epigrammatic inscription that alludes, indirectly and metaphorically, to the material you're about to read.

Quotations are taken from writings of J.R.R. Tolkien pertaining to his Legendarium, almost always from The Lord of the Rings. Chapters and page numbers are given using the following editions:

  • The Hobbit, by J.R.R. Tolkien. The hardcover, 70th-anniversary edition of 2007 was used, published in the UK by Harper Collins Publishers and in the US by the Houghton Mifflin Company.

  • The Lord of the Rings, by J.R.R. Tolkien. The hardcover, 50th-anniversary edition of 2004 was used, published in the UK by Harper Collins Publishers and in the US by the Houghton Mifflin Company.

  • The Lays of Beleriand, by J.R.R. Tolkien and published posthumously by his son and literary executor, C.J.R. Tolkien, being the 3rd of the 12 volumes in Christopher's mammoth History of Middle-earth. Page numbers derive from the hardcover edition, first published in 1983 by George Allen & Unwin; no page numbers changed for the special 3-volume omnibus edition of 2002 or the various trade-paper editions, all again now by Harper Collins or Houghton Mifflin.

Other JRRT books fair game for quotes would thus include The Adventures of Tom Bombadil, The Silmarillion, Unfinished Tales, and The Tale of the Children of Hurin, all but the first posthumously assembled by CJRT. But The Lord of the Rings itself is perfectly fine and probably best to quote from, provided you can find a suitable quote there.

So if you were to supply a new, complete, top-level source file to add to Perl, you should conform to this peculiar practice by yourself selecting an appropriate quotation from Tolkien, retaining the original spelling and punctuation and using the same format the rest of the quotes are in. Indirect and oblique is just fine; remember, it's a metaphor, so being meta is, after all, what it's for.

AUTHOR

This document was originally written by Nathan Torkington, and is maintained by the perl5-porters mailing list.

 
perlhacktips

NAME

perlhacktips - Tips for Perl core C code hacking

DESCRIPTION

This document will help you learn the best way to go about hacking on the Perl core C code. It covers common problems, debugging, profiling, and more.

If you haven't read perlhack and perlhacktut yet, you might want to do that first.

COMMON PROBLEMS

Perl source plays by ANSI C89 rules: no C99 (or C++) extensions. In some cases we have to take pre-ANSI requirements into consideration. You don't care about some particular platform having broken Perl? I hear there is still a strong demand for J2EE programmers.

Perl environment problems

  • Not compiling with threading

    Compiling with threading (-Duseithreads) completely rewrites the function prototypes of Perl, so you had better test your changes with it. Related to this is the difference between "Perl_-less" and "Perl_-ly" APIs, for example:

        Perl_sv_setiv(aTHX_ ...);
        sv_setiv(...);

    The first one explicitly passes in the context, which is needed for e.g. threaded builds. The second one does that implicitly; do not get them mixed. If you are not passing in an aTHX_, you will need to do a dTHX (or a dVAR) as the first thing in the function.

    See How multiple interpreters and concurrency are supported in perlguts for further discussion about context.

  • Not compiling with -DDEBUGGING

    The DEBUGGING define exposes more code to the compiler, therefore more ways for things to go wrong. You should try it.

  • Introducing (non-read-only) globals

    Do not introduce any modifiable globals, truly global or file static. They are bad form and complicate multithreading and other forms of concurrency. The right way is to introduce them as new interpreter variables, see intrpvar.h (at the very end for binary compatibility).

    Introducing read-only (const) globals is okay, as long as you verify with e.g. nm libperl.a|egrep -v ' [TURtr] ' (if your nm has BSD-style output) that the data you added really is read-only. (If it is, it shouldn't show up in the output of that command.)

    If you want to have static strings, make them constant:

        static const char etc[] = "...";

    If you want to have arrays of constant strings, note carefully the right combination of consts:

        static const char * const yippee[] =
            {"hi", "ho", "silver"};

    There is a way to completely hide any modifiable globals (they are all moved to the heap): the compilation setting -DPERL_GLOBAL_STRUCT_PRIVATE. It is not normally used, but can be used for testing; read more about it in Background and PERL_IMPLICIT_CONTEXT in perlguts.

  • Not exporting your new function

    Some platforms (Win32, AIX, VMS, OS/2, to name a few) require any function that is part of the public API (the shared Perl library) to be explicitly marked as exported. See the discussion about embed.pl in perlguts.

  • Exporting your new function

    The new shiny result of either genuine new functionality or your arduous refactoring is now ready and correctly exported. So what could possibly go wrong?

    Maybe simply that your function did not need to be exported in the first place. Perl has a long and not so glorious history of exporting functions that it should not have.

    If the function is used only inside one source code file, make it static. See the discussion about embed.pl in perlguts.

    If the function is used across several files, but intended only for Perl's internal use (and this should be the common case), do not export it to the public API. See the discussion about embed.pl in perlguts.
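
The advice above on read-only data and on keeping helpers file-local can be combined in one small sketch. This is a standalone illustration, not code from the Perl source: the names greetings, clamp_index, and greeting_at are hypothetical, and a real core function would additionally be declared through embed.pl as discussed in perlguts.

```c
#include <stddef.h>

/* Both the pointers and the characters they point to are const,
 * so the whole table can live in read-only storage. */
static const char * const greetings[] = { "hi", "ho", "silver" };

#define GREETING_COUNT (sizeof(greetings) / sizeof(greetings[0]))

/* Used only inside this file, so it is static and never exported. */
static size_t clamp_index(size_t i)
{
    return i < GREETING_COUNT ? i : GREETING_COUNT - 1;
}

/* The one function that other files are meant to call. */
const char *greeting_at(size_t i)
{
    return greetings[clamp_index(i)];
}
```

Only greeting_at() has external linkage here; the table and the helper stay invisible to the linker outside this file.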

Portability problems

The following are common causes of compilation and/or execution failures, not common to Perl as such. The C FAQ is good bedtime reading. Please test your changes with as many C compilers and platforms as possible; we will, anyway, and it's nice to save oneself from public embarrassment.

If using gcc, you can add the -std=c89 option which will hopefully catch most of these unportabilities. (However it might also catch incompatibilities in your system's header files.)

Use the Configure -Dgccansipedantic flag to enable the gcc -ansi -pedantic flags which enforce stricter ANSI rules.

If using gcc -Wall, note that not all the possible warnings (like -Wuninitialized) are given unless you also compile with -O.

Note that if using gcc, starting from Perl 5.9.5 the Perl core source code files (the ones at the top level of the source code distribution, but not e.g. the extensions under ext/) are automatically compiled with as many as possible of the -std=c89, -ansi, -pedantic, and a selection of -W flags (see cflags.SH).

Also study perlport carefully to avoid any bad assumptions about the operating system, filesystems, and so forth.

You may once in a while try a "make microperl" to see whether we can still compile Perl with just the bare minimum of interfaces. (See README.micro.)

Do not assume an operating system indicates a certain compiler.

  • Casting pointers to integers or casting integers to pointers

        void castaway(U8* p)
        {
            IV i = p;

    or

        void castaway(U8* p)
        {
            IV i = (IV)p;

    Both are bad, and broken, and unportable. Use the PTR2IV() macro that does it right. (Likewise, there are PTR2UV(), PTR2NV(), INT2PTR(), and NUM2PTR().)

  • Casting between function pointers and data pointers

    Technically speaking, casting between function pointers and data pointers is unportable and undefined. Practically speaking it seems to work, but you should still use the FPTR2DPTR() and DPTR2FPTR() macros. Sometimes you can also play games with unions.

  • Assuming sizeof(int) == sizeof(long)

    There are platforms where longs are 64 bits, and platforms where ints are 64 bits, and while we are out to shock you, even platforms where shorts are 64 bits. This is all legal according to the C standard. (In other words, "long long" is not a portable way to specify 64 bits, and "long long" is not even guaranteed to be any wider than "long".)

    Instead, use the definitions IV, UV, IVSIZE, I32SIZE, and so forth. Avoid things like I32 because they are not guaranteed to be exactly 32 bits (they are at least 32 bits), nor are they guaranteed to be int or long. If you really explicitly need 64-bit variables, use I64 and U64, but only if guarded by HAS_QUAD.

  • Assuming one can dereference any type of pointer for any type of data

        char *p = ...;
        long pony = *p; /* BAD */

    Many platforms, quite rightly so, will give you a core dump instead of a pony if the p happens not to be correctly aligned.

  • Lvalue casts

        (int)*p = ...; /* BAD */

    Simply not portable. Get your lvalue to be of the right type, or maybe use temporary variables, or dirty tricks with unions.

  • Assuming anything about structs (especially the ones you don't control, like the ones coming from the system headers)

    • That a certain field exists in a struct

    • That no other fields exist besides the ones you know of

    • That a field is of certain signedness, sizeof, or type

    • That the fields are in a certain order

      • While C guarantees the ordering specified in the struct definition, between different platforms the definitions might differ

    • That the sizeof(struct) or the alignments are the same everywhere

      • There might be padding bytes between the fields to align the fields - the bytes can be anything

      • Structs are required to be aligned to the maximum alignment required by the fields - which for native types is usually equivalent to sizeof() of the field

  • Assuming the character set is ASCIIish

    Perl can compile and run under EBCDIC platforms. See perlebcdic. This is transparent for the most part, but because the character sets differ, you shouldn't use numeric (decimal, octal, nor hex) constants to refer to characters. You can safely say 'A', but not 0x41. You can safely say '\n', but not \012. If a character doesn't have a trivial input form, you should add it to the list in regen/unicode_constants.pl, and have Perl create #defines for you, based on the current platform.

    Also, the range 'A' - 'Z' in ASCII is an unbroken sequence of 26 upper case alphabetic characters. That is not true in EBCDIC. Nor for 'a' to 'z'. But '0' - '9' is an unbroken range in both systems. Don't assume anything about other ranges.

    Many of the comments in the existing code ignore the possibility of EBCDIC, and may be wrong therefore, even if the code works. This is actually a tribute to the successful transparent insertion of being able to handle EBCDIC without having to change pre-existing code.

    UTF-8 and UTF-EBCDIC are two different encodings used to represent Unicode code points as sequences of bytes. Macros with the same names (but different definitions) in utf8.h and utfebcdic.h are used to allow the calling code to think that there is only one such encoding. This is almost always referred to as utf8, but it means the EBCDIC version as well. Again, comments in the code may well be wrong even if the code itself is right. For example, the concept of invariant characters differs between ASCII and EBCDIC. On ASCII platforms, only characters that do not have the high-order bit set (i.e. whose ordinals are strict ASCII, 0 - 127) are invariant, and the documentation and comments in the code may assume that, often referring to something like, say, hibit. The situation differs and is not so simple on EBCDIC machines, but as long as the code itself uses the NATIVE_IS_INVARIANT() macro appropriately, it works, even if the comments are wrong.

  • Assuming the character set is just ASCII

    ASCII is a 7 bit encoding, but bytes have 8 bits in them. The 128 extra characters have different meanings depending on the locale. Absent a locale, currently these extra characters are generally considered to be unassigned, and this has presented some problems. This is being changed starting in 5.12 so that these characters will be considered to be Latin-1 (ISO-8859-1).

  • Mixing #define and #ifdef

        #define BURGLE(x) ... \
        #ifdef BURGLE_OLD_STYLE /* BAD */
        ... do it the old way ... \
        #else
        ... do it the new way ... \
        #endif

    You cannot portably "stack" cpp directives. For example in the above you need two separate BURGLE() #defines, one for each #ifdef branch.

  • Adding non-comment stuff after #endif or #else

        #ifdef SNOSH
        ...
        #else !SNOSH /* BAD */
        ...
        #endif SNOSH /* BAD */

    The #endif and #else cannot portably have anything non-comment after them. If you want to document what is going on (which is a good idea especially if the branches are long), use (C) comments:

        #ifdef SNOSH
        ...
        #else /* !SNOSH */
        ...
        #endif /* SNOSH */

    The gcc option -Wendif-labels warns about the bad variant (by default on starting from Perl 5.9.4).

  • Having a comma after the last element of an enum list

        enum color {
            CERULEAN,
            CHARTREUSE,
            CINNABAR, /* BAD */
        };

    is not portable. Leave out the last comma.

    Also note that whether enums are implicitly morphable to ints varies between compilers; you might need an explicit (int) cast.

  • Using //-comments

        // This function bamfoodles the zorklator. /* BAD */

    That is C99 or C++. Perl is C89. Using the //-comments is silently allowed by many C compilers but cranking up the ANSI C89 strictness (which we like to do) causes the compilation to fail.

  • Mixing declarations and code

        void zorklator()
        {
            int n = 3;
            set_zorkmids(n); /* BAD */
            int q = 4;

    That is C99 or C++. Some C compilers allow that, but you shouldn't.

    The gcc option -Wdeclaration-after-statement scans for such problems (by default on starting from Perl 5.9.4).

  • Introducing variables inside for()

        for(int i = ...; ...; ...) { /* BAD */

    That is C99 or C++. While it would indeed be awfully nice to have that also in C89, to limit the scope of the loop variable, alas, we cannot.

  • Mixing signed char pointers with unsigned char pointers

        int foo(char *s) { ... }
        ...
        unsigned char *t = ...; /* Or U8* t = ... */
        foo(t); /* BAD */

    While this is legal practice, it is certainly dubious, and downright fatal on at least one platform: for example VMS cc considers this a fatal error. One cause for people often making this mistake is that a "naked char" and therefore dereferencing a "naked char pointer" have an undefined signedness: it depends on the compiler and the flags of the compiler and the underlying platform whether the result is signed or unsigned. For this very same reason using a 'char' as an array index is bad.

  • Macros that have string constants and their arguments as substrings of the string constants

        #define FOO(n) printf("number = %d\n", n) /* BAD */
        FOO(10);

    Pre-ANSI semantics for that was equivalent to

        printf("10umber = %d\10");

    which is probably not what you were expecting. Unfortunately at least one reasonably common and modern C compiler does "real backward compatibility" here, in AIX that is what still happens even though the rest of the AIX compiler is very happily C89.

  • Using printf formats for non-basic C types

        IV i = ...;
        printf("i = %d\n", i); /* BAD */

    While this might by accident work on some platform (where IV happens to be an int), in general it cannot. IV might be something larger. The situation is even worse with more specific types (defined by Perl's configuration step in config.h):

        Uid_t who = ...;
        printf("who = %d\n", who); /* BAD */

    The problem here is that Uid_t might be not only not int-wide but it might also be unsigned, in which case large uids would be printed as negative values.

    There is no simple solution to this because of printf()'s limited intelligence, but for many types the right format is available as a macro with either an 'f' or '_f' suffix, for example:

        IVdf    /* IV in decimal */
        UVxf    /* UV in hexadecimal */

        printf("i = %"IVdf"\n", i); /* The IVdf is a string constant. */

        Uid_t_f /* Uid_t in decimal */

        printf("who = %"Uid_t_f"\n", who);

    Or you can try casting to a "wide enough" type:

        printf("i = %"IVdf"\n", (IV)something_very_small_and_signed);

    Also remember that the %p format really does require a void pointer:

        U8* p = ...;
        printf("p = %p\n", (void*)p);

    The gcc option -Wformat scans for such problems.

  • Blindly using variadic macros

    gcc has had them for a while with its own syntax, and C99 brought them with a standardized syntax. Don't use the former, and use the latter only if the HAS_C99_VARIADIC_MACROS is defined.

  • Blindly passing va_list

    Not all platforms support passing va_list to further varargs (stdarg) functions. The right thing to do is to copy the va_list using the Perl_va_copy() if the NEED_VA_COPY is defined.

  • Using gcc statement expressions

        val = ({...;...;...}); /* BAD */

    While a nice extension, it's not portable. The Perl code does admittedly use them if available to gain some extra speed (essentially as a funky form of inlining), but you shouldn't.

  • Binding together several statements in a macro

    Use the macros STMT_START and STMT_END.

        STMT_START {
            ...
        } STMT_END

  • Testing for operating systems or versions when you should be testing for features

        #ifdef __FOONIX__ /* BAD */
        foo = quux();
        #endif

    Unless you know with 100% certainty that quux() is only ever available for the "Foonix" operating system and that it is available and correctly working for all past, present, and future versions of "Foonix", the above is very wrong. This is more correct (though still not perfect, because the below is a compile-time check):

        #ifdef HAS_QUUX
        foo = quux();
        #endif

    How does the HAS_QUUX become defined where it needs to be? Well, if Foonix happens to be Unixy enough to be able to run the Configure script, and Configure has been taught about detecting and testing quux(), the HAS_QUUX will be correctly defined. In other platforms, the corresponding configuration step will hopefully do the same.

    In a pinch, if you cannot wait for Configure to be educated, or if you have a good hunch of where quux() might be available, you can temporarily try the following:

        #if (defined(__FOONIX__) || defined(__BARNIX__))
        # define HAS_QUUX
        #endif
        ...
        #ifdef HAS_QUUX
        foo = quux();
        #endif

    But in any case, try to keep the features and operating systems separate.
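
A slightly fuller sketch of that last point follows: the platform-to-feature mapping lives in exactly one place, and all other code tests only the feature symbol. HAS_QUUX, quux(), __FOONIX__ and __BARNIX__ are the made-up names from the text; quux_fallback() and get_foo() are additional hypothetical names introduced here for illustration.

```c
/* 1. Map platforms to features in one place only. This is the
 *    "in a pinch" variant; normally Configure would define HAS_QUUX. */
#if !defined(HAS_QUUX) && (defined(__FOONIX__) || defined(__BARNIX__))
#  define HAS_QUUX
#endif

#ifdef HAS_QUUX
extern int quux(void);          /* provided by the platform */
#endif

/* Hypothetical fallback for platforms without quux(). */
static int quux_fallback(void)
{
    return -1;                  /* means "feature unavailable" */
}

/* 2. Everywhere else, test the feature, never the operating system. */
int get_foo(void)
{
#ifdef HAS_QUUX
    return quux();
#else
    return quux_fallback();
#endif
}
```

When a new platform gains quux(), only the first block (or, better, the configuration step) needs to change.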

Problematic System Interfaces

  • malloc(0), realloc(0), calloc(0, 0) are non-portable. To be portable allocate at least one byte. (In general you should rarely need to work at this low level, but instead use the various malloc wrappers.)

  • snprintf() - the return type is unportable. Use my_snprintf() instead.
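
For the zero-size allocation problem, one common workaround is a thin wrapper that never asks the C library for zero bytes. The sketch below is a generic illustration under that assumption; alloc_at_least_one() is a hypothetical name, and inside the Perl core you would normally reach for the existing malloc wrappers instead.

```c
#include <stdlib.h>

/* Hypothetical helper: round a zero-byte request up to one byte so the
 * result is a unique, freeable pointer on every platform (or NULL only
 * when the system is genuinely out of memory). */
void *alloc_at_least_one(size_t nbytes)
{
    return malloc(nbytes ? nbytes : 1);
}
```

The caller releases the memory with free() as usual.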

Security problems

Last but not least, here are various tips for safer coding.

  • Do not use gets()

    Or we will publicly ridicule you. Seriously.

  • Do not use strcpy() or strcat() or strncpy() or strncat()

    Use my_strlcpy() and my_strlcat() instead: they either use the native implementation, or Perl's own implementation (borrowed from the public domain implementation of INN).

  • Do not use sprintf() or vsprintf()

    If you really want just plain byte strings, use my_snprintf() and my_vsnprintf() instead, which will try to use snprintf() and vsnprintf() if those safer APIs are available. If you want something fancier than a plain byte string, use SVs and Perl_sv_catpvf().
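
To show what a bounded copy buys you over strcpy(), here is a standalone sketch with strlcpy-like semantics. It is not Perl's my_strlcpy(); bounded_copy() is a hypothetical stand-in written only to illustrate the contract: never write past the destination, always NUL-terminate, and report the source length so truncation can be detected.

```c
#include <stddef.h>

/* Copies at most size - 1 bytes of src into dst, NUL-terminates dst
 * whenever size > 0, and returns strlen(src). A return value >= size
 * tells the caller that the copy was truncated. */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len = 0;
    size_t i;

    while (src[len] != '\0')
        len++;

    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        for (i = 0; i < n; i++)
            dst[i] = src[i];
        dst[n] = '\0';
    }
    return len;
}
```

A caller would typically write `if (bounded_copy(buf, s, sizeof buf) >= sizeof buf)` and treat that case as an error rather than silently continuing with a truncated string.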

DEBUGGING

You can compile a special debugging version of Perl, which allows you to use the -D option of Perl to tell more about what Perl is doing. But sometimes there is no alternative than to dive in with a debugger, either to see the stack trace of a core dump (very useful in a bug report), or to figure out what went wrong before the core dump happened, or how we ended up with wrong or unexpected results.

Poking at Perl

To really poke around with Perl, you'll probably want to build Perl for debugging, like this:

    ./Configure -d -D optimize=-g
    make

-g is a flag to the C compiler to have it produce debugging information which will allow us to step through a running program, and to see which C function we are in (without the debugging information we might see only the numerical addresses of the functions, which is not very helpful).

Configure will also turn on the DEBUGGING compilation symbol which enables all the internal debugging code in Perl. There are a whole bunch of things you can debug with this: perlrun lists them all, and the best way to find out about them is to play about with them. The most useful options are probably

    l  Context (loop) stack processing
    t  Trace execution
    o  Method and overloading resolution
    c  String/numeric conversions

Some of the functionality of the debugging code can be achieved using XS modules.

    -Dr => use re 'debug'
    -Dx => use O 'Debug'

Using a source-level debugger

If the debugging output of -D doesn't help you, it's time to step through perl's execution with a source-level debugger.

  • We'll use gdb for our examples here; the principles will apply to any debugger (many vendors call their debugger dbx), but check the manual of the one you're using.

To fire up the debugger, type

    gdb ./perl

Or if you have a core dump:

    gdb ./perl core

You'll want to do that in your Perl source tree so the debugger can read the source code. You should see the copyright message, followed by the prompt.

    (gdb)

help will get you into the documentation, but here are the most useful commands:

  • run [args]

    Run the program with the given arguments.

  • break function_name
  • break source.c:xxx

    Tells the debugger that we'll want to pause execution when we reach either the named function (but see Internal Functions in perlguts!) or the given line in the named source file.

  • step

    Steps through the program a line at a time.

  • next

    Steps through the program a line at a time, without descending into functions.

  • continue

    Run until the next breakpoint.

  • finish

    Run until the end of the current function, then stop again.

  • 'enter'

    Just pressing Enter will do the most recent operation again - it's a blessing when stepping through miles of source code.

  • print

    Execute the given C code and print its results. WARNING: Perl makes heavy use of macros, and gdb does not necessarily support macros (see later gdb macro support). You'll have to substitute them yourself, or to invoke cpp on the source code files (see The .i Targets). So, for instance, you can't say

        print SvPV_nolen(sv)

    but you have to say

        print Perl_sv_2pv_nolen(sv)

You may find it helpful to have a "macro dictionary", which you can produce by saying cpp -dM perl.c | sort. Even then, cpp won't recursively apply those macros for you.

gdb macro support

Recent versions of gdb have fairly good macro support, but in order to use it you'll need to compile perl with macro definitions included in the debugging information. Using gcc version 3.1, this means configuring with -Doptimize=-g3. Other compilers might use a different switch (if they support debugging macros at all).

Dumping Perl Data Structures

One way to get around this macro hell is to use the dumping functions in dump.c; these work a little like an internal Devel::Peek, but they also cover OPs and other structures that you can't get at from Perl. Let's take an example. We'll use the $a = $b + $c we used before, but give it a bit of context: $b = "6XXXX"; $c = 2.3; Where's a good place to stop and poke around?

What about pp_add, the function we examined earlier to implement the + operator:

    (gdb) break Perl_pp_add
    Breakpoint 1 at 0x46249f: file pp_hot.c, line 309.

Notice we use Perl_pp_add and not pp_add - see Internal Functions in perlguts. With the breakpoint in place, we can run our program:

  1. (gdb) run -e '$b = "6XXXX"; $c = 2.3; $a = $b + $c'

Lots of junk will go past as gdb reads in the relevant source files and libraries, and then:

  Breakpoint 1, Perl_pp_add () at pp_hot.c:309
  309         dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
  (gdb) step
  311           dPOPTOPnnrl_ul;
  (gdb)

We looked at this bit of code before, and we said that dPOPTOPnnrl_ul arranges for two NV s to be placed into left and right - let's slightly expand it:

  #define dPOPTOPnnrl_ul  NV right = POPn; \
                          SV *leftsv = TOPs; \
                          NV left = USE_LEFT(leftsv) ? SvNV(leftsv) : 0.0

POPn takes the SV from the top of the stack and obtains its NV either directly (if SvNOK is set) or by calling the sv_2nv function. TOPs takes the next SV from the top of the stack - yes, POPn uses TOPs - but doesn't remove it. We then use SvNV to get the NV from leftsv in the same way as before - yes, POPn uses SvNV .
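The difference between popping and peeking can be sketched with a toy stack. This is a simplified model and not Perl's actual macros: the toy_stack type and the toy_push, toy_top, toy_pop and toy_add names are invented for illustration, and plain doubles stand in for SVs.

```c
#include <assert.h>

/* Toy model only: doubles stand in for SVs. */
typedef struct {
    double items[8];
    int    top;              /* index of the top element, -1 when empty */
} toy_stack;

static void toy_push(toy_stack *s, double v) {
    assert(s->top < 7);      /* no room left otherwise */
    s->items[++s->top] = v;
}

/* In the spirit of TOPs: look at the top element but leave it in place. */
static double toy_top(const toy_stack *s) {
    assert(s->top >= 0);
    return s->items[s->top];
}

/* In the spirit of POPn: read the top element, then remove it. */
static double toy_pop(toy_stack *s) {
    double v = toy_top(s);   /* popping is built on peeking */
    s->top--;
    return v;
}

/* Same shape as dPOPTOPnnrl_ul: pop the right operand, peek at the left.
 * In this toy the sum then overwrites the slot the left operand occupied. */
static double toy_add(toy_stack *s) {
    double right = toy_pop(s);
    double left  = toy_top(s);
    s->items[s->top] = left + right;
    return s->items[s->top];
}
```

After pushing two values and calling toy_add, the stack holds one element, the sum, which is why only one of the two operands needs to be removed.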

Since we don't have an NV for $b , we'll have to use sv_2nv to convert it. If we step again, we'll find ourselves there:

  Perl_sv_2nv (sv=0xa0675d0) at sv.c:1669
  1669        if (!sv)
  (gdb)

We can now use Perl_sv_dump to investigate the SV:

  SV = PV(0xa057cc0) at 0xa0675d0
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0xa06a510 "6XXXX"\0
  CUR = 5
  LEN = 6
  $1 = void

We know we're going to get 6 from this, so let's finish the subroutine:

  (gdb) finish
  Run till exit from #0  Perl_sv_2nv (sv=0xa0675d0) at sv.c:1671
  0x462669 in Perl_pp_add () at pp_hot.c:311
  311           dPOPTOPnnrl_ul;
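Why 6? The string "6XXXX" begins with something numeric followed by junk, and the numeric conversion keeps only the leading numeric part. Perl has its own number-parsing code, so the following is only an analogy built on the standard C strtod function, not what sv_2nv actually calls; leading_number is a name made up for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Return the value of the longest numeric prefix of str, or 0.0 if
 * there is none. If junk is non-NULL, *junk is set to the first
 * character that was not part of the number. */
static double leading_number(const char *str, const char **junk) {
    char *end;
    double value;

    assert(str != NULL);
    value = strtod(str, &end);   /* stops at the first non-numeric char */
    if (junk)
        *junk = end;
    return value;
}
```

Note that strtod also accepts forms such as hexadecimal floats, so it is not a drop-in model of Perl's rules; it only illustrates the "use the numeric prefix" idea.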

We can also dump out this op: the current op is always stored in PL_op , and we can dump it with Perl_op_dump . This'll give us similar output to B::Debug.

  {
  13  TYPE = add  ===> 14
      TARG = 1
      FLAGS = (SCALAR,KIDS)
      {
          TYPE = null  ===> (12)
            (was rv2sv)
          FLAGS = (SCALAR,KIDS)
          {
  11          TYPE = gvsv  ===> 12
              FLAGS = (SCALAR)
              GV = main::b
          }
      }

# finish this later #

SOURCE CODE STATIC ANALYSIS

Various tools exist for analysing C source code statically, that is, without executing the code, as opposed to dynamically. It is possible to detect resource leaks, undefined behaviour, type mismatches, portability problems, code paths that would cause illegal memory accesses, and other similar problems just by parsing the C code and examining what the resulting graph says about the execution and data flows. As a matter of fact, this is exactly how C compilers know to give warnings about dubious code.

lint, splint

The good old C code quality inspector, lint , is available on several platforms, but please be aware that there are several different implementations of it by different vendors, which means that the flags are not identical across different platforms.

There is a lint variant called splint (Secure Programming Lint) available from http://www.splint.org/ that should compile on any Unix-like platform.

There are lint and splint targets in the Makefile, but you may have to diddle with the flags (see above).

Coverity

Coverity (http://www.coverity.com/) is a product similar to lint. As a testbed for their product, they periodically check several open source projects, and they give open source developers accounts to the defect databases.

cpd (cut-and-paste detector)

The cpd tool detects cut-and-paste coding. If one instance of the cut-and-pasted code changes, all the other spots should probably be changed, too. Therefore such code should probably be turned into a subroutine or a macro.

cpd (http://pmd.sourceforge.net/cpd.html) is part of the pmd project (http://pmd.sourceforge.net/). pmd was originally written for static analysis of Java code, but later the cpd part of it was extended to also parse C and C++.

Download the pmd-bin-X.Y.zip from the SourceForge site, extract the pmd-X.Y.jar from it, and then run that on source code thusly:

  1. java -cp pmd-X.Y.jar net.sourceforge.pmd.cpd.CPD \
  2. --minimum-tokens 100 --files /some/where/src --language c > cpd.txt

You may run into memory limits, in which case you should use the -Xmx option:

  1. java -Xmx512M ...

gcc warnings

Though much can be written about the inconsistency and coverage problems of gcc warnings (like -Wall not meaning "all the warnings", or some common portability problems not being covered by -Wall , or -ansi and -pedantic both being a poorly defined collection of warnings, and so forth), gcc is still a useful tool in keeping our coding nose clean.

-Wall is on by default.

The -ansi (and its sidekick, -pedantic ) would be nice to have on always, but unfortunately they are not safe on all platforms: they can, for example, cause fatal conflicts with the system headers (Solaris being a prime example). If Configure -Dgccansipedantic is used, the cflags frontend selects -ansi -pedantic for the platforms where they are known to be safe.

Starting from Perl 5.9.4 the following extra flags are added:

  • -Wendif-labels

  • -Wextra

  • -Wdeclaration-after-statement

The following flags would be nice to have but they would first need their own Augean stablemaster:

  • -Wpointer-arith

  • -Wshadow

  • -Wstrict-prototypes

The -Wtraditional is another example of the annoying tendency of gcc to bundle a lot of warnings under one switch (it would be impossible to deploy in practice because it would complain a lot) but it does contain some warnings that would be beneficial to have available on their own, such as the warning about string constants inside macros containing the macro arguments: this behaved differently pre-ANSI than it does in ANSI, and some C compilers are still in transition, AIX being an example.

Warnings of other C compilers

Other C compilers (yes, there are C compilers other than gcc) often have their "strict ANSI" or "strict ANSI with some portability extensions" modes on. For example, the Sun Workshop compiler has its -Xa mode on (though implicitly), and the DEC (these days, HP...) compiler has its -std1 mode on.

MEMORY DEBUGGERS

NOTE 1: Running under older memory debuggers such as Purify, valgrind or Third Degree greatly slows down the execution: seconds become minutes, minutes become hours. For example, as of Perl 5.8.1, ext/Encode/t/Unicode.t takes extraordinarily long to complete under e.g. Purify, Third Degree, and valgrind. Under valgrind it takes more than six hours, even on a snappy computer. The said test must be doing something that is quite unfriendly for memory debuggers. If you don't feel like waiting, you can simply kill the perl process. Roughly, valgrind slows down execution by a factor of 10, AddressSanitizer by a factor of 2.

NOTE 2: To minimize the number of memory leak false alarms (see PERL_DESTRUCT_LEVEL for more information), you have to set the environment variable PERL_DESTRUCT_LEVEL to 2.

For csh-like shells:

  1. setenv PERL_DESTRUCT_LEVEL 2

For Bourne-type shells:

  1. PERL_DESTRUCT_LEVEL=2
  2. export PERL_DESTRUCT_LEVEL

In Unixy environments you can also use the env command:

  1. env PERL_DESTRUCT_LEVEL=2 valgrind ./perl -Ilib ...

NOTE 3: There are known memory leaks when there are compile-time errors within eval or require; seeing S_doeval in the call stack is a good sign of these. Fixing these leaks is non-trivial, unfortunately, but they must be fixed eventually.

NOTE 4: DynaLoader will not clean up after itself completely unless Perl is built with the Configure option -Accflags=-DDL_UNLOAD_ALL_AT_EXIT .

Rational Software's Purify

Purify is a commercial tool that is helpful in identifying memory overruns, wild pointers, memory leaks and other such badness. Perl must be compiled in a specific way for optimal testing with Purify. Purify is available under Windows NT, Solaris, HP-UX, SGI, and Siemens Unix.

Purify on Unix

On Unix, Purify creates a new Perl binary. To get the most benefit out of Purify, you should create the perl to Purify using:

  1. sh Configure -Accflags=-DPURIFY -Doptimize='-g' \
  2. -Uusemymalloc -Dusemultiplicity

where these arguments mean:

  • -Accflags=-DPURIFY

    Disables Perl's arena memory allocation functions, as well as forcing use of memory allocation functions derived from the system malloc.

  • -Doptimize='-g'

    Adds debugging information so that you see the exact source statements where the problem occurs. Without this flag, all you will see is the source filename of where the error occurred.

  • -Uusemymalloc

    Disable Perl's malloc so that Purify can more closely monitor allocations and leaks. Using Perl's malloc will make Purify report most leaks in the "potential" leaks category.

  • -Dusemultiplicity

    Enabling the multiplicity option allows perl to clean up thoroughly when the interpreter shuts down, which reduces the number of bogus leak reports from Purify.

Once you've compiled a perl suitable for Purify'ing, then you can just:

  1. make pureperl

which creates a binary named 'pureperl' that has been Purify'ed. This binary is used in place of the standard 'perl' binary when you want to debug Perl memory problems.

As an example, to show any memory leaks produced during the standard Perl testset you would create and run the Purify'ed perl as:

  1. make pureperl
  2. cd t
  3. ../pureperl -I../lib harness

which would run Perl on test.pl and report any memory problems.

Purify outputs messages in "Viewer" windows by default. If you don't have a windowing environment or if you simply want the Purify output to unobtrusively go to a log file instead of to the interactive window, use the following options to output to the log file "perl.log":

  1. setenv PURIFYOPTIONS "-chain-length=25 -windows=no \
  2. -log-file=perl.log -append-logfile=yes"

If you plan to use the "Viewer" windows, then you only need this option:

  1. setenv PURIFYOPTIONS "-chain-length=25"

In Bourne-type shells:

  1. PURIFYOPTIONS="..."
  2. export PURIFYOPTIONS

or if you have the "env" utility:

  1. env PURIFYOPTIONS="..." ../pureperl ...

Purify on NT

Purify on Windows NT instruments the Perl binary 'perl.exe' on the fly. There are several options in the makefile you should change to get the most use out of Purify:

  • DEFINES

    You should add -DPURIFY to the DEFINES line so the DEFINES line looks something like:

    1. DEFINES = -DWIN32 -D_CONSOLE -DNO_STRICT $(CRYPT_FLAG) -DPURIFY=1

    to disable Perl's arena memory allocation functions, as well as to force use of memory allocation functions derived from the system malloc.

  • USE_MULTI = define

    Enabling the multiplicity option allows perl to clean up thoroughly when the interpreter shuts down, which reduces the number of bogus leak reports from Purify.

  • #PERL_MALLOC = define

    Disable Perl's malloc so that Purify can more closely monitor allocations and leaks. Using Perl's malloc will make Purify report most leaks in the "potential" leaks category.

  • CFG = Debug

    Adds debugging information so that you see the exact source statements where the problem occurs. Without this flag, all you will see is the source filename of where the error occurred.

As an example, to show any memory leaks produced during the standard Perl testset you would create and run Purify as:

  1. cd win32
  2. make
  3. cd ../t
  4. purify ../perl -I../lib harness

which would instrument Perl in memory, run Perl on test.pl, then finally report any memory problems.

valgrind

The valgrind tool can be used to find both memory leaks and illegal heap memory accesses. As of version 3.3.0, Valgrind only supports Linux on x86, x86-64 and PowerPC, and Darwin (OS X) on x86 and x86-64. The special "test.valgrind" target can be used to run the tests under valgrind. Found errors and memory leaks are logged in files named testfile.valgrind.

Valgrind also provides a cachegrind tool, invoked on perl as:

  1. VG_OPTS=--tool=cachegrind make test.valgrind

As system libraries (most notably glibc) also trigger errors, valgrind lets you suppress such errors using suppression files. The default suppression file that comes with valgrind already catches a lot of them. Some additional suppressions are defined in t/perl.supp.

To get valgrind and for more information see

  1. http://valgrind.org/

AddressSanitizer

AddressSanitizer is a clang extension, included in clang since v3.1. It checks illegal heap pointers, global pointers, stack pointers and use after free errors, and is fast enough that you can easily compile your debugging or optimized perl with it. It does not check memory leaks though. AddressSanitizer is available for linux, Mac OS X and soon on Windows.

To build perl with AddressSanitizer, your Configure invocation should look like:

  1. sh Configure -des -Dcc=clang \
  2. -Accflags=-faddress-sanitizer -Aldflags=-faddress-sanitizer \
  3. -Alddlflags=-shared\ -faddress-sanitizer

where these arguments mean:

  • -Dcc=clang

    This should be replaced by the full path to your clang executable if it is not in your path.

  • -Accflags=-faddress-sanitizer

    Compile perl and extensions sources with AddressSanitizer.

  • -Aldflags=-faddress-sanitizer

    Link the perl executable with AddressSanitizer.

  • -Alddlflags=-shared\ -faddress-sanitizer

    Link dynamic extensions with AddressSanitizer. You must manually specify -shared because using -Alddlflags=-shared will prevent Configure from setting a default value for lddlflags , which usually contains -shared (at least on linux).

See also http://code.google.com/p/address-sanitizer/wiki/AddressSanitizer.

PROFILING

Depending on your platform there are various ways of profiling Perl.

There are two commonly used techniques of profiling executables: statistical time-sampling and basic-block counting.

The first method periodically samples the CPU program counter, and since the program counter can be correlated with the code generated for functions, we get a statistical view of which functions the program is spending its time in. The caveats are that very small/fast functions have a lower probability of showing up in the profile, and that periodically interrupting the program (this is usually done rather frequently, on the scale of milliseconds) imposes an additional overhead that may skew the results. The first problem can be alleviated by running the code for longer (in general this is a good idea for profiling); the second problem is usually kept in check by the profiling tools themselves.

The second method divides up the generated code into basic blocks. Basic blocks are sections of code that are entered only at the beginning and exited only at the end. For example, a conditional jump starts a basic block. Basic block profiling usually works by instrumenting the code, adding "enter basic block #nnnn" book-keeping code to the generated code. During the execution of the code the basic block counters are then updated appropriately. The caveat is that the added extra code can skew the results: again, the profiling tools usually try to factor their own effects out of the results.
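To make the book-keeping idea concrete, here is a hand-instrumented function. It is only a sketch of the technique: a real profiler inserts the counters automatically into generated code, and the block names, the ENTER_BB macro and the count_odd function are all invented for this example.

```c
#include <stddef.h>

/* Hand-written stand-ins for the counters a profiler would insert. */
enum { BB_ENTRY, BB_LOOP_BODY, BB_ODD_BRANCH, BB_EXIT, BB_COUNT };
static unsigned long bb_hits[BB_COUNT];

/* "enter basic block #id" book-keeping */
#define ENTER_BB(id) (bb_hits[(id)]++)

/* Count the odd numbers in values[0..n-1], recording block entries. */
static int count_odd(const int *values, size_t n) {
    int odd = 0;
    size_t i;

    ENTER_BB(BB_ENTRY);
    for (i = 0; i < n; i++) {
        ENTER_BB(BB_LOOP_BODY);
        if (values[i] % 2 != 0) {
            ENTER_BB(BB_ODD_BRANCH);
            odd++;
        }
    }
    ENTER_BB(BB_EXIT);
    return odd;
}
```

After a run, bb_hits shows how often each block was entered. The increments themselves cost time, which is the skew the paragraph above warns about.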

Gprof Profiling

gprof is a profiling tool available on many Unix platforms. It uses statistical time-sampling.

You can build a profiled version of perl called "perl.gprof" by invoking the make target "perl.gprof". (What is required is that Perl must be compiled using the -pg flag; you may need to re-Configure.) Running the profiled version of Perl will create an output file called gmon.out, which contains the profiling data collected during the execution.

The gprof tool can then display the collected data in various ways. Usually gprof understands the following options:

  • -a

    Suppress statically defined functions from the profile.

  • -b

    Suppress the verbose descriptions in the profile.

  • -e routine

    Exclude the given routine and its descendants from the profile.

  • -f routine

    Display only the given routine and its descendants in the profile.

  • -s

    Generate a summary file called gmon.sum which then may be given to subsequent gprof runs to accumulate data over several runs.

  • -z

    Display routines that have zero usage.

For more detailed explanation of the available commands and output formats, see your own local documentation of gprof.

quick hint:

  1. $ sh Configure -des -Dusedevel -Doptimize='-pg' && make perl.gprof
  2. $ ./perl.gprof someprog # creates gmon.out in current directory
  3. $ gprof ./perl.gprof > out
  4. $ view out

GCC gcov Profiling

Starting from GCC 3.0 basic block profiling is officially available for the GNU CC.

You can build a profiled version of perl called perl.gcov by invoking the make target "perl.gcov" (what is required is that Perl must be compiled using gcc with the flags -fprofile-arcs -ftest-coverage ; you may need to re-Configure).

Running the profiled version of Perl will cause profile output to be generated. For each source file an accompanying ".da" file will be created.

To display the results you use the "gcov" utility (which should be installed if you have gcc 3.0 or newer installed). gcov is run on source code files, like this

  1. gcov sv.c

which will cause sv.c.gcov to be created. The .gcov files contain the source code annotated with relative frequencies of execution indicated by "#" markers.

Useful options of gcov include -b which will summarise the basic block, branch, and function call coverage, and -c which instead of relative frequencies will use the actual counts. For more information on the use of gcov and basic block profiling with gcc, see the latest GNU CC manual, as of GCC 3.0 see

  1. http://gcc.gnu.org/onlinedocs/gcc-3.0/gcc.html

and its section titled "8. gcov: a Test Coverage Program"

  1. http://gcc.gnu.org/onlinedocs/gcc-3.0/gcc_8.html#SEC132

quick hint:

  1. $ sh Configure -des -Dusedevel -Doptimize='-g' \
  2. -Accflags='-fprofile-arcs -ftest-coverage' \
  3. -Aldflags='-fprofile-arcs -ftest-coverage' && make perl.gcov
  4. $ rm -f regexec.c.gcov regexec.gcda
  5. $ ./perl.gcov
  6. $ gcov regexec.c
  7. $ view regexec.c.gcov

MISCELLANEOUS TRICKS

PERL_DESTRUCT_LEVEL

If you want to run any of the tests yourself manually using e.g. valgrind, or the pureperl or perl.third executables, please note that by default perl does not explicitly cleanup all the memory it has allocated (such as global memory arenas) but instead lets the exit() of the whole program "take care" of such allocations, also known as "global destruction of objects".

There is a way to tell perl to do complete cleanup: set the environment variable PERL_DESTRUCT_LEVEL to a non-zero value. The t/TEST wrapper does set this to 2, and this is what you need to do too, if you don't want to see the "global leaks": For example, for "third-degreed" Perl:

  1. env PERL_DESTRUCT_LEVEL=2 ./perl.third -Ilib t/foo/bar.t

(Note: the mod_perl apache module also uses this environment variable for its own purposes and extends its semantics. Refer to the mod_perl documentation for more information. Also, spawned threads do the equivalent of setting this variable to the value 1.)

If, at the end of a run, you get the message N scalars leaked, you can recompile with -DDEBUG_LEAKING_SCALARS , which will cause the addresses of all those leaked SVs to be dumped along with details as to where each SV was originally allocated. This information is also displayed by Devel::Peek. Note that the extra details recorded with each SV increase memory usage, so it shouldn't be used in production environments. It also converts new_SV() from a macro into a real function, so you can use your favourite debugger to discover where those pesky SVs were allocated.

If you see that you're leaking memory at runtime, but neither valgrind nor -DDEBUG_LEAKING_SCALARS will find anything, you're probably leaking SVs that are still reachable and will be properly cleaned up during destruction of the interpreter. In such cases, using the -Dm switch can point you to the source of the leak. If the executable was built with -DDEBUG_LEAKING_SCALARS , -Dm will output SV allocations in addition to memory allocations. Each SV allocation has a distinct serial number that will be written on creation and destruction of the SV. So if you're executing the leaking code in a loop, you need to look for SVs that are created, but never destroyed between each cycle. If such an SV is found, set a conditional breakpoint within new_SV() and make it break only when PL_sv_serial is equal to the serial number of the leaking SV. Then you will catch the interpreter in exactly the state where the leaking SV is allocated, which is sufficient in many cases to find the source of the leak.
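The "created, but never destroyed" search is easy to automate once the serial numbers have been extracted from the log. The sketch below assumes the log has already been parsed into an array of events; the sv_event struct and the find_leaked function are hypothetical helpers, and the real -Dm output format is not modelled here.

```c
#include <stddef.h>

/* One parsed log event (hypothetical representation). */
typedef struct {
    unsigned long serial;    /* SV serial number taken from the log */
    int created;             /* 1 for an allocation, 0 for a free */
} sv_event;

/* Store up to max_out serials that were created but never freed later
 * in the log. Returns the total number of such serials found. */
static size_t find_leaked(const sv_event *ev, size_t n,
                          unsigned long *out, size_t max_out) {
    size_t i, j, found = 0;

    for (i = 0; i < n; i++) {
        int freed = 0;
        if (!ev[i].created)
            continue;
        for (j = i + 1; j < n; j++) {
            if (!ev[j].created && ev[j].serial == ev[i].serial) {
                freed = 1;
                break;
            }
        }
        if (!freed) {
            if (found < max_out)
                out[found] = ev[i].serial;
            found++;
        }
    }
    return found;
}
```

A serial reported by such a scan is the value to compare PL_sv_serial against in the conditional breakpoint. The quadratic scan is fine for one loop iteration's worth of events; a large log would want a hash table instead.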

As -Dm is using the PerlIO layer for output, it will by itself allocate quite a bunch of SVs, which are hidden to avoid recursion. You can bypass the PerlIO layer if you use the SV logging provided by -DPERL_MEM_LOG instead.

PERL_MEM_LOG

If compiled with -DPERL_MEM_LOG , both memory and SV allocations go through logging functions, which is handy for breakpoint setting.

Unless -DPERL_MEM_LOG_NOIMPL is also compiled, the logging functions read $ENV{PERL_MEM_LOG} to determine whether to log the event, and if so how:

  $ENV{PERL_MEM_LOG} =~ /m/           Log all memory ops
  $ENV{PERL_MEM_LOG} =~ /s/           Log all SV ops
  $ENV{PERL_MEM_LOG} =~ /t/           include timestamp in Log
  $ENV{PERL_MEM_LOG} =~ /^(\d+)/      write to FD given (default is 2)

Memory logging is somewhat similar to -Dm but is independent of -DDEBUGGING , and at a higher level; all uses of Newx(), Renew(), and Safefree() are logged with the caller's source code file and line number (and C function name, if supported by the C compiler). In contrast, -Dm is directly at the point of malloc() . SV logging is similar.

Since the logging doesn't use PerlIO, all SV allocations are logged and no extra SV allocations are introduced by enabling the logging. If compiled with -DDEBUG_LEAKING_SCALARS , the serial number for each SV allocation is also logged.
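The PERL_MEM_LOG patterns listed earlier can be read as a small option parser. The following is a sketch of that reading, not the code Perl itself uses; the mem_log_opts struct and the parse_mem_log function are names invented for illustration.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int log_mem;      /* value contains 'm': log memory ops */
    int log_sv;       /* value contains 's': log SV ops */
    int timestamp;    /* value contains 't': add timestamps */
    int fd;           /* leading digits, defaulting to 2 */
} mem_log_opts;

/* Interpret a PERL_MEM_LOG-style string; value may be NULL (unset). */
static mem_log_opts parse_mem_log(const char *value) {
    mem_log_opts o = { 0, 0, 0, 2 };

    if (value == NULL)
        return o;
    if (isdigit((unsigned char)value[0]))
        o.fd = (int)strtol(value, NULL, 10);   /* matches /^(\d+)/ */
    o.log_mem   = strchr(value, 'm') != NULL;
    o.log_sv    = strchr(value, 's') != NULL;
    o.timestamp = strchr(value, 't') != NULL;
    return o;
}
```

A caller would typically pass getenv("PERL_MEM_LOG"). With the value "3mst", for example, this sketch selects memory and SV logging with timestamps on file descriptor 3.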

DDD over gdb

Those debugging perl with the DDD frontend over gdb may find the following useful:

You can extend the data conversion shortcuts menu, so for example you can display an SV's IV value with one click, without doing any typing. To do that, simply edit the ~/.ddd/init file and add after:

  1. ! Display shortcuts.
  2. Ddd*gdbDisplayShortcuts: \
  3. /t () // Convert to Bin\n\
  4. /d () // Convert to Dec\n\
  5. /x () // Convert to Hex\n\
  6. /o () // Convert to Oct\n\

the following two lines:

  1. ((XPV*) (())->sv_any )->xpv_pv // 2pvx\n\
  2. ((XPVIV*) (())->sv_any )->xiv_iv // 2ivx

so now you can do ivx and pvx lookups, or you can plug in the sv_peek "conversion":

  1. Perl_sv_peek(my_perl, (SV*)()) // sv_peek

(The my_perl is for threaded builds.) Just remember that every line, but the last one, should end with \n\

Alternatively edit the init file interactively via: 3rd mouse button -> New Display -> Edit Menu

Note: you can define up to 20 conversion shortcuts in the gdb section.

Poison

If you see in a debugger a memory area mysteriously full of 0xABABABAB or 0xEFEFEFEF, you may be seeing the effect of the Poison() macros, see perlclib.
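The idea behind poisoning is simply to fill memory with a recognisable byte pattern so that use of uninitialised or freed memory stands out in a debugger. Here is a minimal sketch of that technique in plain C; toy_poison, toy_is_poisoned and the TOY_ constants are invented names, and the actual macros and their arguments are described in perlclib.

```c
#include <stddef.h>
#include <string.h>

#define TOY_POISON_NEW  0xAB   /* pattern for fresh, uninitialised memory */
#define TOY_POISON_FREE 0xEF   /* pattern for memory that has been freed */

/* Scribble the given byte pattern over a region of memory. */
static void toy_poison(void *mem, size_t nbytes, unsigned char pattern) {
    memset(mem, pattern, nbytes);
}

/* Return 1 if every byte of the region still carries the pattern. */
static int toy_is_poisoned(const void *mem, size_t nbytes,
                           unsigned char pattern) {
    const unsigned char *p = mem;
    size_t i;

    for (i = 0; i < nbytes; i++)
        if (p[i] != pattern)
            return 0;
    return 1;
}
```

Four consecutive 0xAB bytes read as a 32-bit word give 0xABABABAB, which is the value you would see in the debugger.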

Read-only optrees

Under ithreads the optree is read only. If you want to enforce this, to check for write accesses from buggy code, compile with -DPERL_DEBUG_READONLY_OPS to enable code that allocates op memory via mmap , and sets it read-only when it is attached to a subroutine. Any write access to an op results in a SIGBUS and abort.

This code is intended for development only, and may not be portable even to all Unix variants. Also, it is an 80% solution, in that it isn't able to make all ops read only. Specifically it does not apply to op slabs belonging to BEGIN blocks.

However, as an 80% solution it is still effective, as it has caught bugs in the past.

The .i Targets

You can expand the macros in a foo.c file by saying

  1. make foo.i

which will expand the macros using cpp. Don't be scared by the results.

AUTHOR

This document was originally written by Nathan Torkington, and is maintained by the perl5-porters mailing list.

 
perlhacktut

NAME

perlhacktut - Walk through the creation of a simple C code patch

DESCRIPTION

This document takes you through a simple patch example.

If you haven't read perlhack yet, go do that first! You might also want to read through perlsource too.

Once you're done here, check out perlhacktips next.

EXAMPLE OF A SIMPLE PATCH

Let's take a simple patch from start to finish.

Here's something Larry suggested: if a U is the first active format during a pack, (for example, pack "U3C8", @stuff ) then the resulting string should be treated as UTF-8 encoded.

If you are working with a git clone of the Perl repository, you will want to create a branch for your changes. This will make creating a proper patch much simpler. See perlgit for details on how to do this.

Writing the patch

How do we prepare to fix this up? First we locate the code in question - the pack happens at runtime, so it's going to be in one of the pp files. Sure enough, pp_pack is in pp.c. Since we're going to be altering this file, let's copy it to pp.c~.

[Well, it was in pp.c when this tutorial was written. It has now been split off with pp_unpack to its own file, pp_pack.c]

Now let's look over pp_pack : we take a pattern into pat , and then loop over the pattern, taking each format character in turn into datumtype . Then for each possible format character, we swallow up the other arguments in the pattern (a field width, an asterisk, and so on) and convert the next chunk of input into the specified format, adding it onto the output SV cat .

How do we know if the U is the first format in the pat ? Well, if we have a pointer to the start of pat then, if we see a U we can test whether we're still at the start of the string. So, here's where pat is set up:

  STRLEN fromlen;
  char *pat = SvPVx(*++MARK, fromlen);
  char *patend = pat + fromlen;
  I32 len;
  I32 datumtype;
  SV *fromstr;

We'll have another string pointer in there:

    STRLEN fromlen;
    char *pat = SvPVx(*++MARK, fromlen);
    char *patend = pat + fromlen;
  + char *patcopy;
    I32 len;
    I32 datumtype;
    SV *fromstr;

And just before we start the loop, we'll set patcopy to be the start of pat :

    items = SP - MARK;
    MARK++;
    sv_setpvn(cat, "", 0);
  + patcopy = pat;
    while (pat < patend) {

Now if we see a U which was at the start of the string, we turn on the UTF8 flag for the output SV, cat :

  + if (datumtype == 'U' && pat==patcopy+1)
  +     SvUTF8_on(cat);
    if (datumtype == '#') {
        while (pat < patend && *pat != '\n')
            pat++;

Remember that it has to be patcopy+1 because the first character of the string is the U which has been swallowed into datumtype!

Oops, we forgot one thing: what if there are spaces at the start of the pattern? pack(" U*", @stuff) will have U as the first active character, even though it's not the first thing in the pattern. In this case, we have to advance patcopy along with pat when we see spaces:

  if (isSPACE(datumtype))
      continue;

needs to become

  if (isSPACE(datumtype)) {
      patcopy++;
      continue;
  }
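To check the pointer arithmetic outside of perl, the same logic can be modelled as a small standalone function. This is a simplified sketch and not the code in pp_pack: first_active_is_U is a made-up name, the standard isspace stands in for isSPACE, and every non-space character is treated as a format (a real pack would also consume counts and arguments).

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if the first non-space character of pattern is 'U'. */
static int first_active_is_U(const char *pattern) {
    const char *pat     = pattern;
    const char *patend  = pattern + strlen(pattern);
    const char *patcopy = pat;   /* where the first active format starts */
    int utf8 = 0;

    while (pat < patend) {
        int datumtype = (unsigned char)*pat++;

        if (isspace(datumtype)) {
            patcopy++;           /* skip over spaces along with pat */
            continue;
        }
        /* pat has already moved past the character, hence the +1 */
        if (datumtype == 'U' && pat == patcopy + 1)
            utf8 = 1;
    }
    return utf8;
}
```

Under these assumptions the function returns 1 for "U3C8" and " U*", and 0 for "C0U*", matching the behaviour the patch is aiming for.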

OK. That's the C part done. Now we must do two additional things before this patch is ready to go: we've changed the behaviour of Perl, and so we must document that change. We must also provide some more regression tests to make sure our patch works and doesn't create a bug somewhere else along the line.

Testing the patch

The regression tests for each operator live in t/op/, and so we make a copy of t/op/pack.t to t/op/pack.t~. Now we can add our tests to the end. First, we'll test that the U does indeed create Unicode strings.

t/op/pack.t has a sensible ok() function, but if it didn't we could use the one from t/test.pl.

  1. require './test.pl';
  2. plan( tests => 159 );

so instead of this:

  1. print 'not ' unless "1.20.300.4000" eq sprintf "%vd",
  2. pack("U*",1,20,300,4000);
  3. print "ok $test\n"; $test++;

we can write the more sensible (see Test::More for a full explanation of is() and other testing functions).

  1. is( "1.20.300.4000", sprintf "%vd", pack("U*",1,20,300,4000),
  2. "U* produces Unicode" );

Now we'll test that we got that space-at-the-beginning business right:

  1. is( "1.20.300.4000", sprintf "%vd", pack(" U*",1,20,300,4000),
  2. " with spaces at the beginning" );

And finally we'll test that we don't make Unicode strings if U is not the first active format:

  1. isnt( v1.20.300.4000, sprintf "%vd", pack("C0U*",1,20,300,4000),
  2. "U* not first isn't Unicode" );

Mustn't forget to change the number of tests which appears at the top, or else the automated tester will get confused. This will either look like this:

  1. print "1..156\n";

or this:

  1. plan( tests => 156 );

We now compile up Perl, and run it through the test suite. Our new tests pass, hooray!

Documenting the patch

Finally, the documentation. The job is never done until the paperwork is over, so let's describe the change we've just made. The relevant place is pod/perlfunc.pod; again, we make a copy, and then we'll insert this text in the description of pack:

  1. =item *
  2. If the pattern begins with a C<U>, the resulting string will be treated
  3. as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a string
  4. with an initial C<U0>, and the bytes that follow will be interpreted as
  5. Unicode characters. If you don't want this to happen, you can begin
  6. your pattern with C<C0> (or anything else) to force Perl not to UTF-8
  7. encode your string, and then follow this with a C<U*> somewhere in your
  8. pattern.

Submit

See perlhack for details on how to submit this patch.

AUTHOR

This document was originally written by Nathan Torkington, and is maintained by the perl5-porters mailing list.

 
perlhaiku

NAME

perlhaiku - Perl version 5.10+ on Haiku

DESCRIPTION

This file contains instructions on how to build Perl for Haiku and lists known problems.

BUILD AND INSTALL

The build procedure is completely standard:

  1. ./Configure -de
  2. make
  3. make install

Make perl executable and create a symlink for libperl:

  1. chmod a+x /boot/common/bin/perl
  2. cd /boot/common/lib; ln -s perl5/5.18.2/BePC-haiku/CORE/libperl.so .

Replace 5.18.2 with your respective version of Perl.

KNOWN PROBLEMS

The following problems are encountered with Haiku revision 28311:

  • Perl cannot be compiled with threading support at the moment.

  • The ext/Socket/t/socketpair.t test fails. More precisely: the subtests using datagram sockets fail. Unix datagram sockets aren't implemented in Haiku yet.

  • A subtest of the ext/Sys/Syslog/t/syslog.t test fails. This is due to Haiku not implementing /dev/log support yet.

  • The tests lib/Net/Ping/t/450_service.t and lib/Net/Ping/t/510_ping_udp.t fail. This is due to bugs in Haiku's network stack implementation.

CONTACT

For Haiku specific problems contact the HaikuPorts developers: http://ports.haiku-files.org/

The initial Haiku port was done by Ingo Weinhold <ingo_weinhold@gmx.de>.

Last update: 2008-10-29

 
perlhist

NAME

perlhist - the Perl history records

DESCRIPTION

This document aims to record the Perl source code releases.

INTRODUCTION

Perl history in brief, by Larry Wall:

  Perl 0 introduced Perl to my officemates.
  Perl 1 introduced Perl to the world, and changed /\(...\|...\)/ to
      /(...|...)/. \(Dan Faigin still hasn't forgiven me. :-\)
  Perl 2 introduced Henry Spencer's regular expression package.
  Perl 3 introduced the ability to handle binary data (embedded nulls).
  Perl 4 introduced the first Camel book. Really. We mostly just
      switched version numbers so the book could refer to 4.000.
  Perl 5 introduced everything else, including the ability to
      introduce everything else.

THE KEEPERS OF THE PUMPKIN

Larry Wall, Andy Dougherty, Tom Christiansen, Charles Bailey, Nick Ing-Simmons, Chip Salzenberg, Tim Bunce, Malcolm Beattie, Gurusamy Sarathy, Graham Barr, Jarkko Hietaniemi, Hugo van der Sanden, Michael Schwern, Rafael Garcia-Suarez, Nicholas Clark, Richard Clamp, Leon Brocard, Dave Mitchell, Jesse Vincent, Ricardo Signes, Steve Hay, Matt S Trout, David Golden, Florian Ragwitz, Tatsuhiko Miyagawa, Chris BinGOs Williams, Zefram, Ævar Arnfjörð Bjarmason, Stevan Little, Dave Rolsky, Max Maischein, Abigail, Jesse Luehrs, Tony Cook, Dominic Hargreaves, Aaron Crane and Aristotle Pagaltzis.

PUMPKIN?

[from Porting/pumpkin.pod in the Perl source code distribution]

Chip Salzenberg gets credit for that, with a nod to his cow orker, David Croy. We had passed around various names (baton, token, hot potato) but none caught on. Then, Chip asked:

[begin quote]

  Who has the patch pumpkin?

To explain: David Croy once told me that at a previous job, there was one tape drive and multiple systems that used it for backups. But instead of some high-tech exclusion software, they used a low-tech method to prevent multiple simultaneous backups: a stuffed pumpkin. No one was allowed to make backups unless they had the "backup pumpkin".

[end quote]

The name has stuck. The holder of the pumpkin is sometimes called the pumpking (keeping the source afloat?) or the pumpkineer (pulling the strings?).

THE RECORDS

  Pump-      Release         Date            Notes
  king                                       (by no means
                                              comprehensive,
                                              see Changes*
                                              for details)
  ======================================================================
  7. Larry 0 Classified. Don't ask.
  8. Larry 1.000 1987-Dec-18
  9. 1.001..10 1988-Jan-30
  10. 1.011..14 1988-Feb-02
  11. Schwern 1.0.15 2002-Dec-18 Modernization
  12. Richard 1.0_16 2003-Dec-18
  13. Larry 2.000 1988-Jun-05
  14. 2.001 1988-Jun-28
  15. Larry 3.000 1989-Oct-18
  16. 3.001 1989-Oct-26
  17. 3.002..4 1989-Nov-11
  18. 3.005 1989-Nov-18
  19. 3.006..8 1989-Dec-22
  20. 3.009..13 1990-Mar-02
  21. 3.014 1990-Mar-13
  22. 3.015 1990-Mar-14
  23. 3.016..18 1990-Mar-28
  24. 3.019..27 1990-Aug-10 User subs.
  25. 3.028 1990-Aug-14
  26. 3.029..36 1990-Oct-17
  27. 3.037 1990-Oct-20
  28. 3.040 1990-Nov-10
  29. 3.041 1990-Nov-13
  30. 3.042..43 1991-Jan-??
  31. 3.044 1991-Jan-12
  32. Larry 4.000 1991-Mar-21
  33. 4.001..3 1991-Apr-12
  34. 4.004..9 1991-Jun-07
  35. 4.010 1991-Jun-10
  36. 4.011..18 1991-Nov-05
  37. 4.019 1991-Nov-11 Stable.
  38. 4.020..33 1992-Jun-08
  39. 4.034 1992-Jun-11
  40. 4.035 1992-Jun-23
  41. Larry 4.036 1993-Feb-05 Very stable.
  42. 5.000alpha1 1993-Jul-31
  43. 5.000alpha2 1993-Aug-16
  44. 5.000alpha3 1993-Oct-10
  45. 5.000alpha4 1993-???-??
  46. 5.000alpha5 1993-???-??
  47. 5.000alpha6 1994-Mar-18
  48. 5.000alpha7 1994-Mar-25
  49. Andy 5.000alpha8 1994-Apr-04
  50. Larry 5.000alpha9 1994-May-05 ext appears.
  51. 5.000alpha10 1994-Jun-11
  52. 5.000alpha11 1994-Jul-01
  53. Andy 5.000a11a 1994-Jul-07 To fit 14.
  54. 5.000a11b 1994-Jul-14
  55. 5.000a11c 1994-Jul-19
  56. 5.000a11d 1994-Jul-22
  57. Larry 5.000alpha12 1994-Aug-04
  58. Andy 5.000a12a 1994-Aug-08
  59. 5.000a12b 1994-Aug-15
  60. 5.000a12c 1994-Aug-22
  61. 5.000a12d 1994-Aug-22
  62. 5.000a12e 1994-Aug-22
  63. 5.000a12f 1994-Aug-24
  64. 5.000a12g 1994-Aug-24
  65. 5.000a12h 1994-Aug-24
  66. Larry 5.000beta1 1994-Aug-30
  67. Andy 5.000b1a 1994-Sep-06
  68. Larry 5.000beta2 1994-Sep-14 Core slushified.
  69. Andy 5.000b2a 1994-Sep-14
  70. 5.000b2b 1994-Sep-17
  71. 5.000b2c 1994-Sep-17
  72. Larry 5.000beta3 1994-Sep-??
  73. Andy 5.000b3a 1994-Sep-18
  74. 5.000b3b 1994-Sep-22
  75. 5.000b3c 1994-Sep-23
  76. 5.000b3d 1994-Sep-27
  77. 5.000b3e 1994-Sep-28
  78. 5.000b3f 1994-Sep-30
  79. 5.000b3g 1994-Oct-04
  80. Andy 5.000b3h 1994-Oct-07
  81. Larry? 5.000gamma 1994-Oct-13?
  82. Larry 5.000 1994-Oct-17
  83. Andy 5.000a 1994-Dec-19
  84. 5.000b 1995-Jan-18
  85. 5.000c 1995-Jan-18
  86. 5.000d 1995-Jan-18
  87. 5.000e 1995-Jan-18
  88. 5.000f 1995-Jan-18
  89. 5.000g 1995-Jan-18
  90. 5.000h 1995-Jan-18
  91. 5.000i 1995-Jan-26
  92. 5.000j 1995-Feb-07
  93. 5.000k 1995-Feb-11
  94. 5.000l 1995-Feb-21
  95. 5.000m 1995-Feb-28
  96. 5.000n 1995-Mar-07
  97. 5.000o 1995-Mar-13?
  98. Larry 5.001 1995-Mar-13
  99. Andy 5.001a 1995-Mar-15
  100. 5.001b 1995-Mar-31
  101. 5.001c 1995-Apr-07
  102. 5.001d 1995-Apr-14
  103. 5.001e 1995-Apr-18 Stable.
  104. 5.001f 1995-May-31
  105. 5.001g 1995-May-25
  106. 5.001h 1995-May-25
  107. 5.001i 1995-May-30
  108. 5.001j 1995-Jun-05
  109. 5.001k 1995-Jun-06
  110. 5.001l 1995-Jun-06 Stable.
  111. 5.001m 1995-Jul-02 Very stable.
  112. 5.001n 1995-Oct-31 Very unstable.
  113. 5.002beta1 1995-Nov-21
  114. 5.002b1a 1995-Dec-04
  115. 5.002b1b 1995-Dec-04
  116. 5.002b1c 1995-Dec-04
  117. 5.002b1d 1995-Dec-04
  118. 5.002b1e 1995-Dec-08
  119. 5.002b1f 1995-Dec-08
  120. Tom 5.002b1g 1995-Dec-21 Doc release.
  121. Andy 5.002b1h 1996-Jan-05
  122. 5.002b2 1996-Jan-14
  123. Larry 5.002b3 1996-Feb-02
  124. Andy 5.002gamma 1996-Feb-11
  125. Larry 5.002delta 1996-Feb-27
  126. Larry 5.002 1996-Feb-29 Prototypes.
  127. Charles 5.002_01 1996-Mar-25
  128. 5.003 1996-Jun-25 Security release.
  129. 5.003_01 1996-Jul-31
  130. Nick 5.003_02 1996-Aug-10
  131. Andy 5.003_03 1996-Aug-28
  132. 5.003_04 1996-Sep-02
  133. 5.003_05 1996-Sep-12
  134. 5.003_06 1996-Oct-07
  135. 5.003_07 1996-Oct-10
  136. Chip 5.003_08 1996-Nov-19
  137. 5.003_09 1996-Nov-26
  138. 5.003_10 1996-Nov-29
  139. 5.003_11 1996-Dec-06
  140. 5.003_12 1996-Dec-19
  141. 5.003_13 1996-Dec-20
  142. 5.003_14 1996-Dec-23
  143. 5.003_15 1996-Dec-23
  144. 5.003_16 1996-Dec-24
  145. 5.003_17 1996-Dec-27
  146. 5.003_18 1996-Dec-31
  147. 5.003_19 1997-Jan-04
  148. 5.003_20 1997-Jan-07
  149. 5.003_21 1997-Jan-15
  150. 5.003_22 1997-Jan-16
  151. 5.003_23 1997-Jan-25
  152. 5.003_24 1997-Jan-29
  153. 5.003_25 1997-Feb-04
  154. 5.003_26 1997-Feb-10
  155. 5.003_27 1997-Feb-18
  156. 5.003_28 1997-Feb-21
  157. 5.003_90 1997-Feb-25 Ramping up to the 5.004 release.
  158. 5.003_91 1997-Mar-01
  159. 5.003_92 1997-Mar-06
  160. 5.003_93 1997-Mar-10
  161. 5.003_94 1997-Mar-22
  162. 5.003_95 1997-Mar-25
  163. 5.003_96 1997-Apr-01
  164. 5.003_97 1997-Apr-03 Fairly widely used.
  165. 5.003_97a 1997-Apr-05
  166. 5.003_97b 1997-Apr-08
  167. 5.003_97c 1997-Apr-10
  168. 5.003_97d 1997-Apr-13
  169. 5.003_97e 1997-Apr-15
  170. 5.003_97f 1997-Apr-17
  171. 5.003_97g 1997-Apr-18
  172. 5.003_97h 1997-Apr-24
  173. 5.003_97i 1997-Apr-25
  174. 5.003_97j 1997-Apr-28
  175. 5.003_98 1997-Apr-30
  176. 5.003_99 1997-May-01
  177. 5.003_99a 1997-May-09
  178. p54rc1 1997-May-12 Release Candidates.
  179. p54rc2 1997-May-14
  180. Chip 5.004 1997-May-15 A major maintenance release.
  181. Tim 5.004_01-t1 1997-???-?? The 5.004 maintenance track.
  182. 5.004_01-t2 1997-Jun-11 aka perl5.004m1t2
  183. 5.004_01 1997-Jun-13
  184. 5.004_01_01 1997-Jul-29 aka perl5.004m2t1
  185. 5.004_01_02 1997-Aug-01 aka perl5.004m2t2
  186. 5.004_01_03 1997-Aug-05 aka perl5.004m2t3
  187. 5.004_02 1997-Aug-07
  188. 5.004_02_01 1997-Aug-12 aka perl5.004m3t1
  189. 5.004_03-t2 1997-Aug-13 aka perl5.004m3t2
  190. 5.004_03 1997-Sep-05
  191. 5.004_04-t1 1997-Sep-19 aka perl5.004m4t1
  192. 5.004_04-t2 1997-Sep-23 aka perl5.004m4t2
  193. 5.004_04-t3 1997-Oct-10 aka perl5.004m4t3
  194. 5.004_04-t4 1997-Oct-14 aka perl5.004m4t4
  195. 5.004_04 1997-Oct-15
  196. 5.004_04-m1 1998-Mar-04 (5.004m5t1) Maint. trials for 5.004_05.
  197. 5.004_04-m2 1998-May-01
  198. 5.004_04-m3 1998-May-15
  199. 5.004_04-m4 1998-May-19
  200. 5.004_05-MT5 1998-Jul-21
  201. 5.004_05-MT6 1998-Oct-09
  202. 5.004_05-MT7 1998-Nov-22
  203. 5.004_05-MT8 1998-Dec-03
  204. Chip 5.004_05-MT9 1999-Apr-26
  205. 5.004_05 1999-Apr-29
  206. Malcolm 5.004_50 1997-Sep-09 The 5.005 development track.
  207. 5.004_51 1997-Oct-02
  208. 5.004_52 1997-Oct-15
  209. 5.004_53 1997-Oct-16
  210. 5.004_54 1997-Nov-14
  211. 5.004_55 1997-Nov-25
  212. 5.004_56 1997-Dec-18
  213. 5.004_57 1998-Feb-03
  214. 5.004_58 1998-Feb-06
  215. 5.004_59 1998-Feb-13
  216. 5.004_60 1998-Feb-20
  217. 5.004_61 1998-Feb-27
  218. 5.004_62 1998-Mar-06
  219. 5.004_63 1998-Mar-17
  220. 5.004_64 1998-Apr-03
  221. 5.004_65 1998-May-15
  222. 5.004_66 1998-May-29
  223. Sarathy 5.004_67 1998-Jun-15
  224. 5.004_68 1998-Jun-23
  225. 5.004_69 1998-Jun-29
  226. 5.004_70 1998-Jul-06
  227. 5.004_71 1998-Jul-09
  228. 5.004_72 1998-Jul-12
  229. 5.004_73 1998-Jul-13
  230. 5.004_74 1998-Jul-14 5.005 beta candidate.
  231. 5.004_75 1998-Jul-15 5.005 beta1.
  232. 5.004_76 1998-Jul-21 5.005 beta2.
  233. Sarathy 5.005 1998-Jul-22 Oneperl.
  234. Sarathy 5.005_01 1998-Jul-27 The 5.005 maintenance track.
  235. 5.005_02-T1 1998-Aug-02
  236. 5.005_02-T2 1998-Aug-05
  237. 5.005_02 1998-Aug-08
  238. Graham 5.005_03-MT1 1998-Nov-30
  239. 5.005_03-MT2 1999-Jan-04
  240. 5.005_03-MT3 1999-Jan-17
  241. 5.005_03-MT4 1999-Jan-26
  242. 5.005_03-MT5 1999-Jan-28
  243. 5.005_03-MT6 1999-Mar-05
  244. 5.005_03 1999-Mar-28
  245. Leon 5.005_04-RC1 2004-Feb-05
  246. 5.005_04-RC2 2004-Feb-18
  247. 5.005_04 2004-Feb-23
  248. 5.005_05-RC1 2009-Feb-16
  249. Sarathy 5.005_50 1998-Jul-26 The 5.6 development track.
  250. 5.005_51 1998-Aug-10
  251. 5.005_52 1998-Sep-25
  252. 5.005_53 1998-Oct-31
  253. 5.005_54 1998-Nov-30
  254. 5.005_55 1999-Feb-16
  255. 5.005_56 1999-Mar-01
  256. 5.005_57 1999-May-25
  257. 5.005_58 1999-Jul-27
  258. 5.005_59 1999-Aug-02
  259. 5.005_60 1999-Aug-02
  260. 5.005_61 1999-Aug-20
  261. 5.005_62 1999-Oct-15
  262. 5.005_63 1999-Dec-09
  263. 5.5.640 2000-Feb-02
  264. 5.5.650 2000-Feb-08 beta1
  265. 5.5.660 2000-Feb-22 beta2
  266. 5.5.670 2000-Feb-29 beta3
  267. 5.6.0-RC1 2000-Mar-09 Release candidate 1.
  268. 5.6.0-RC2 2000-Mar-14 Release candidate 2.
  269. 5.6.0-RC3 2000-Mar-21 Release candidate 3.
  270. Sarathy 5.6.0 2000-Mar-22
  271. Sarathy 5.6.1-TRIAL1 2000-Dec-18 The 5.6 maintenance track.
  272. 5.6.1-TRIAL2 2001-Jan-31
  273. 5.6.1-TRIAL3 2001-Mar-19
  274. 5.6.1-foolish 2001-Apr-01 The "fools-gold" release.
  275. 5.6.1 2001-Apr-08
  276. Rafael 5.6.2-RC1 2003-Nov-08
  277. 5.6.2 2003-Nov-15 Fix new build issues
  278. Jarkko 5.7.0 2000-Sep-02 The 5.7 track: Development.
  279. 5.7.1 2001-Apr-09
  280. 5.7.2 2001-Jul-13 Virtual release candidate 0.
  281. 5.7.3 2002-Mar-05
  282. 5.8.0-RC1 2002-Jun-01
  283. 5.8.0-RC2 2002-Jun-21
  284. 5.8.0-RC3 2002-Jul-13
  285. Jarkko 5.8.0 2002-Jul-18
  286. Jarkko 5.8.1-RC1 2003-Jul-10 The 5.8 maintenance track
  287. 5.8.1-RC2 2003-Jul-11
  288. 5.8.1-RC3 2003-Jul-30
  289. 5.8.1-RC4 2003-Aug-01
  290. 5.8.1-RC5 2003-Sep-22
  291. 5.8.1 2003-Sep-25
  292. Nicholas 5.8.2-RC1 2003-Oct-27
  293. 5.8.2-RC2 2003-Nov-03
  294. 5.8.2 2003-Nov-05
  295. 5.8.3-RC1 2004-Jan-07
  296. 5.8.3 2004-Jan-14
  297. 5.8.4-RC1 2004-Apr-05
  298. 5.8.4-RC2 2004-Apr-15
  299. 5.8.4 2004-Apr-21
  300. 5.8.5-RC1 2004-Jul-06
  301. 5.8.5-RC2 2004-Jul-08
  302. 5.8.5 2004-Jul-19
  303. 5.8.6-RC1 2004-Nov-11
  304. 5.8.6 2004-Nov-27
  305. 5.8.7-RC1 2005-May-18
  306. 5.8.7 2005-May-30
  307. 5.8.8-RC1 2006-Jan-20
  308. 5.8.8 2006-Jan-31
  309. 5.8.9-RC1 2008-Nov-10
  310. 5.8.9-RC2 2008-Dec-06
  311. 5.8.9 2008-Dec-14
  312. Hugo 5.9.0 2003-Oct-27 The 5.9 development track
  313. Rafael 5.9.1 2004-Mar-16
  314. 5.9.2 2005-Apr-01
  315. 5.9.3 2006-Jan-28
  316. 5.9.4 2006-Aug-15
  317. 5.9.5 2007-Jul-07
  318. 5.10.0-RC1 2007-Nov-17
  319. 5.10.0-RC2 2007-Nov-25
  320. Rafael 5.10.0 2007-Dec-18
  321. David M 5.10.1-RC1 2009-Aug-06 The 5.10 maintenance track
  322. 5.10.1-RC2 2009-Aug-18
  323. 5.10.1 2009-Aug-22
  324. Jesse 5.11.0 2009-Oct-02 The 5.11 development track
  325. 5.11.1 2009-Oct-20
  326. Leon 5.11.2 2009-Nov-20
  327. Jesse 5.11.3 2009-Dec-20
  328. Ricardo 5.11.4 2010-Jan-20
  329. Steve 5.11.5 2010-Feb-20
  330. Jesse 5.12.0-RC0 2010-Mar-21
  331. 5.12.0-RC1 2010-Mar-29
  332. 5.12.0-RC2 2010-Apr-01
  333. 5.12.0-RC3 2010-Apr-02
  334. 5.12.0-RC4 2010-Apr-06
  335. 5.12.0-RC5 2010-Apr-09
  336. Jesse 5.12.0 2010-Apr-12
  337. Jesse 5.12.1-RC2 2010-May-13 The 5.12 maintenance track
  338. 5.12.1-RC1 2010-May-09
  339. 5.12.1 2010-May-16
  340. 5.12.2-RC2 2010-Aug-31
  341. 5.12.2 2010-Sep-06
  342. Ricardo 5.12.3-RC1 2011-Jan-09
  343. Ricardo 5.12.3-RC2 2011-Jan-14
  344. Ricardo 5.12.3-RC3 2011-Jan-17
  345. Ricardo 5.12.3 2011-Jan-21
  346. Leon 5.12.4-RC1 2011-Jun-08
  347. Leon 5.12.4 2011-Jun-20
  348. Dominic 5.12.5 2012-Nov-10
  349. Leon 5.13.0 2010-Apr-20 The 5.13 development track
  350. Ricardo 5.13.1 2010-May-20
  351. Matt 5.13.2 2010-Jun-22
  352. David G 5.13.3 2010-Jul-20
  353. Florian 5.13.4 2010-Aug-20
  354. Steve 5.13.5 2010-Sep-19
  355. Miyagawa 5.13.6 2010-Oct-20
  356. BinGOs 5.13.7 2010-Nov-20
  357. Zefram 5.13.8 2010-Dec-20
  358. Jesse 5.13.9 2011-Jan-20
  359. Ævar 5.13.10 2011-Feb-20
  360. Florian 5.13.11 2011-Mar-20
  361. Jesse 5.14.0RC1 2011-Apr-20
  362. Jesse 5.14.0RC2 2011-May-04
  363. Jesse 5.14.0RC3 2011-May-11
  364. Jesse 5.14.0 2011-May-14 The 5.14 maintenance track
  365. Jesse 5.14.1 2011-Jun-16
  366. Florian 5.14.2-RC1 2011-Sep-19
  367. 5.14.2 2011-Sep-26
  368. Dominic 5.14.3 2012-Oct-12
  369. David M 5.14.4-RC1 2013-Mar-05
  370. David M 5.14.4-RC2 2013-Mar-07
  371. David M 5.14.4 2013-Mar-10
  372. David G 5.15.0 2011-Jun-20 The 5.15 development track
  373. Zefram 5.15.1 2011-Jul-20
  374. Ricardo 5.15.2 2011-Aug-20
  375. Stevan 5.15.3 2011-Sep-20
  376. Florian 5.15.4 2011-Oct-20
  377. Steve 5.15.5 2011-Nov-20
  378. Dave R 5.15.6 2011-Dec-20
  379. BinGOs 5.15.7 2012-Jan-20
  380. Max M 5.15.8 2012-Feb-20
  381. Abigail 5.15.9 2012-Mar-20
  382. Ricardo 5.16.0-RC0 2012-May-10
  383. Ricardo 5.16.0-RC1 2012-May-14
  384. Ricardo 5.16.0-RC2 2012-May-15
  385. Ricardo 5.16.0 2012-May-20 The 5.16 maintenance track
  386. Ricardo 5.16.1 2012-Aug-08
  387. Ricardo 5.16.2 2012-Nov-01
  388. Ricardo 5.16.3-RC1 2013-Mar-06
  389. Ricardo 5.16.3 2013-Mar-11
  390. Zefram 5.17.0 2012-May-26 The 5.17 development track
  391. Jesse L 5.17.1 2012-Jun-20
  392. TonyC 5.17.2 2012-Jul-20
  393. Steve 5.17.3 2012-Aug-20
  394. Florian 5.17.4 2012-Sep-20
  395. Florian 5.17.5 2012-Oct-20
  396. Ricardo 5.17.6 2012-Nov-20
  397. Dave R 5.17.7 2012-Dec-18
  398. Aaron 5.17.8 2013-Jan-20
  399. BinGOs 5.17.9 2013-Feb-20
  400. Max M 5.17.10 2013-Mar-21
  401. Ricardo 5.18.0-RC1 2013-May-11 The 5.18 maintenance track
  402. Ricardo 5.18.0-RC2 2013-May-12
  403. Ricardo 5.18.0-RC3 2013-May-13
  404. Ricardo 5.18.0-RC4 2013-May-15
  405. Ricardo 5.18.0 2013-May-18
  406. Ricardo 5.18.1-RC1 2013-Aug-01
  407. Ricardo 5.18.1-RC2 2013-Aug-03
  408. Ricardo 5.18.1-RC3 2013-Aug-08
  409. Ricardo 5.18.1 2013-Aug-12
  410. Ricardo 5.18.2 2014-Jan-06
  411. Ricardo 5.19.0 2013-May-20 The 5.19 development track
  412. David G 5.19.1 2013-Jun-21
  413. Aristotle 5.19.2 2013-Jul-22

SELECTED RELEASE SIZES

For example, the notation "core: 212 29" for release 1.000 means that the core comprised 212 kilobytes in 29 files. The categories "core" through "doc" are explained below the table.

  release            core         lib         ext           t         doc
                  kB    #     kB    #     kB    #     kB    #     kB    #
  =======================================================================
  1.000          212   29      -    -      -    -     38   51     62    3
  1.014          219   29      -    -      -    -     39   52     68    4
  2.000          309   31      2    3      -    -     55   57     92    4
  2.001          312   31      2    3      -    -     55   57     94    4
  3.000          508   36     24   11      -    -     79   73    156    5
  3.044          645   37     61   20      -    -     90   74    190    6
  4.000          635   37     59   20      -    -     91   75    198    4
  4.019          680   37     85   29      -    -     98   76    199    4
  4.036          709   37     89   30      -    -     98   76    208    5
  5.000alpha2    785   50    114   32      -    -    112   86    209    5
  5.000alpha3    801   50    117   33      -    -    121   87    209    5
  5.000alpha9   1022   56    149   43    116   29    125   90    217    6
  5.000a12h      978   49    140   49    205   46    152   97    228    9
  5.000b3h      1035   53    232   70    216   38    162   94    218   21
  5.000         1038   53    250   76    216   38    154   92    536   62
  5.001m        1071   54    388   82    240   38    159   95    544   29
  5.002         1121   54    661  101    287   43    155   94    847   35
  5.003         1129   54    680  102    291   43    166  100    853   35
  5.003_07      1231   60    748  106    396   53    213  137    976   39
  5.004         1351   60   1230  136    408   51    355  161   1587   55
  5.004_01      1356   60   1258  138    410   51    358  161   1587   55
  5.004_04      1375   60   1294  139    413   51    394  162   1629   55
  5.004_05      1463   60   1435  150    394   50    445  175   1855   59
  5.004_51      1401   61   1260  140    413   53    358  162   1594   56
  5.004_53      1422   62   1295  141    438   70    394  162   1637   56
  5.004_56      1501   66   1301  140    447   74    408  165   1648   57
  5.004_59      1555   72   1317  142    448   74    424  171   1678   58
  5.004_62      1602   77   1327  144    629   92    428  173   1674   58
  5.004_65      1626   77   1358  146    615   92    446  179   1698   60
  5.004_68      1856   74   1382  152    619   92    463  187   1784   60
  5.004_70      1863   75   1456  154    675   92    494  194   1809   60
  5.004_73      1874   76   1467  152    762  102    506  196   1883   61
  5.004_75      1877   76   1467  152    770  103    508  196   1896   62
  5.005         1896   76   1469  152    795  103    509  197   1945   63
  5.005_03      1936   77   1541  153    813  104    551  201   2176   72
  5.005_50      1969   78   1842  301    795  103    514  198   1948   63
  5.005_53      1999   79   1885  303    806  104    602  224   2002   67
  5.005_56      2086   79   1970  307    866  113    672  238   2221   75
  5.6.0         2820   79   2626  364   1096  129    863  280   2840   93
  5.6.1         2946   78   2921  430   1171  132   1024  304   3330  102
  5.6.2         2947   78   3143  451   1247  127   1303  387   3406  102
  5.7.0         2977   80   2801  425   1250  132    975  307   3206  100
  5.7.1         3351   84   3442  455   1944  167   1334  357   3698  124
  5.7.2         3491   87   4858  618   3290  298   1598  449   3910  139
  5.7.3         3299   85   4295  537   2196  300   2176  626   4171  120
  5.8.0         3489   87   4533  585   2437  331   2588  726   4368  125
  5.8.1         3674   90   5104  623   2604  353   2983  836   4625  134
  5.8.2         3633   90   5111  623   2623  357   3019  848   4634  135
  5.8.3         3625   90   5141  624   2660  363   3083  869   4669  136
  5.8.4         3653   90   5170  634   2684  368   3148  885   4689  137
  5.8.5         3664   90   4260  303   2707  369   3208  898   4689  138
  5.8.6         3690   90   4271  303   3141  396   3411  925   4709  139
  5.8.7         3788   90   4322  307   3297  401   3485  964   4744  141
  5.8.8         3895   90   4357  314   3409  431   3622 1017   4979  144
  5.8.9         4132   93   5508  330   3826  529   4364 1234   5348  152
  5.9.0         3657   90   4951  626   2603  354   3011  841   4609  135
  5.9.1         3580   90   5196  634   2665  367   3186  889   4725  138
  5.9.2         3863   90   4654  312   3283  403   3551  973   4800  142
  5.9.3         4096   91   5318  381   4806  597   4272 1214   5139  147
  5.9.4         4393   94   5718  415   4578  642   4646 1310   5335  153
  5.9.5         4681   96   6849  479   4827  671   5155 1490   5572  159
  5.10.0        4710   97   7050  486   4899  673   5275 1503   5673  160
  5.10.1        4858   98   7440  519   6195  921   6147 1751   5151  163
  5.12.0        4999  100   1146  121  15227 2176   6400 1843   5342  168
  5.12.1        5000  100   1146  121  15283 2178   6407 1846   5354  169
  5.12.2        5003  100   1146  121  15404 2178   6413 1846   5376  170
  5.12.3        5004  100   1146  121  15529 2180   6417 1848   5391  171
  5.14.0        5328  104   1100  114  17779 2479   7697 2130   5871  188
  5.16.0        5562  109   1077   80  20504 2702   8750 2375   4815  152
  5.18.0        5892  113   1088   79  20077 2760   9365 2439   4943  154
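To make the row notation concrete, here is a minimal sketch in Perl that splits one row of the table above into its release name and five (kilobytes, files) pairs. The subroutine name parse_size_row and the hash layout are illustrative assumptions, not part of any Perl distribution tooling; a "-" entry is treated as "category not present" and stored as undef.

```perl
use strict;
use warnings;

# Category names in the order used by the table header.
my @categories = qw(core lib ext t doc);

# Hypothetical helper: split one table row into the release name and a
# hash of per-category { kb, files } pairs. "-" becomes undef.
sub parse_size_row {
    my ($row) = @_;

    # Drop a leading listing number such as "3. " if one is present.
    # Release names always have a digit after the dot, so they are safe.
    $row =~ s/^\s*\d+\.\s+//;

    my ($release, @fields) = split ' ', $row;
    die "expected 10 fields after the release, got " . scalar(@fields) . "\n"
        unless @fields == 2 * @categories;

    my %sizes;
    for my $i (0 .. $#categories) {
        my ($kb, $files) = @fields[2 * $i, 2 * $i + 1];
        $sizes{ $categories[$i] } = {
            kb    => $kb    eq '-' ? undef : $kb,
            files => $files eq '-' ? undef : $files,
        };
    }
    return ($release, \%sizes);
}

my ($release, $sizes) = parse_size_row('1.000 212 29 - - - - 38 51 62 3');
printf "%s core: %d kB in %d files\n",
    $release, $sizes->{core}{kb}, $sizes->{core}{files};
```

For release 1.000 this reports the same reading as the example in the text: 212 kilobytes in 29 files for "core".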

The categories "core" through "doc" refer to the following files in the Perl source code distribution. In the glob notation, ** means recursive matching and (.) means regular files only.

  core  *.[hcy]
  lib   lib/**/*.p[ml]
  ext   ext/**/*.{[hcyt],xs,pm} (for -5.10.1) or
        {dist,ext,cpan}/**/*.{[hcyt],xs,pm} (for 5.12.0-)
  t     t/**/*(.) (for 1-5.005_56) or **/*.t (for 5.6.0-5.7.3)
  doc   {README*,INSTALL,*[_.]man{,.?},pod/**/*.pod}

Here are some statistics for the other subdirectories and one file in the Perl source distribution for somewhat more selected releases.

  1. ======================================================================
  2. Legend: kB #
  3. 1.014 2.001 3.044
  4. Configure 31 1 37 1 62 1
  5. eg - - 34 28 47 39
  6. h2pl - - - - 12 12
  7. msdos - - - - 41 13
  8. os2 - - - - 63 22
  9. usub - - - - 21 16
  10. x2p 103 17 104 17 137 17
  11. ======================================================================
  12. 4.000 4.019 4.036
  13. atarist - - - - 113 31
  14. Configure 73 1 83 1 86 1
  15. eg 47 39 47 39 47 39
  16. emacs 67 4 67 4 67 4
  17. h2pl 12 12 12 12 12 12
  18. hints - - 5 42 11 56
  19. msdos 57 15 58 15 60 15
  20. os2 81 29 81 29 113 31
  21. usub 25 7 43 8 43 8
  22. x2p 147 18 152 19 154 19
  23. ======================================================================
  24. 5.000a2 5.000a12h 5.000b3h 5.000 5.001m
  25. apollo 8 3 8 3 8 3 8 3 8 3
  26. atarist 113 31 113 31 - - - - - -
  27. bench - - 0 1 - - - - - -
  28. Bugs 2 5 26 1 - - - - - -
  29. dlperl 40 5 - - - - - - - -
  30. do 127 71 - - - - - - - -
  31. Configure - - 153 1 159 1 160 1 180 1
  32. Doc - - 26 1 75 7 11 1 11 1
  33. eg 79 58 53 44 51 43 54 44 54 44
  34. emacs 67 4 104 6 104 6 104 1 104 6
  35. h2pl 12 12 12 12 12 12 12 12 12 12
  36. hints 11 56 12 46 18 48 18 48 44 56
  37. msdos 60 15 60 15 - - - - - -
  38. os2 113 31 113 31 - - - - - -
  39. U - - 62 8 112 42 - - - -
  40. usub 43 8 - - - - - - - -
  41. vms - - 80 7 123 9 184 15 304 20
  42. x2p 171 22 171 21 162 20 162 20 279 20
  43. ======================================================================
  44. 5.002 5.003 5.003_07
  45. Configure 201 1 201 1 217 1
  46. eg 54 44 54 44 54 44
  47. emacs 108 1 108 1 143 1
  48. h2pl 12 12 12 12 12 12
  49. hints 73 59 77 60 90 62
  50. os2 84 17 56 10 117 42
  51. plan9 - - - - 79 15
  52. Porting - - - - 51 1
  53. utils 87 7 88 7 97 7
  54. vms 500 24 475 26 505 27
  55. x2p 280 20 280 20 280 19
  56. ======================================================================
  57. 5.004 5.004_04 5.004_62 5.004_65 5.004_68
  58. beos - - - - - - 1 1 1 1
  59. Configure 225 1 225 1 240 1 248 1 256 1
  60. cygwin32 23 5 23 5 23 5 24 5 24 5
  61. djgpp - - - - 14 5 14 5 14 5
  62. eg 81 62 81 62 81 62 81 62 81 62
  63. emacs 194 1 204 1 212 2 212 2 212 2
  64. h2pl 12 12 12 12 12 12 12 12 12 12
  65. hints 129 69 132 71 144 72 151 74 155 74
  66. os2 121 42 127 42 127 44 129 44 129 44
  67. plan9 82 15 82 15 82 15 82 15 82 15
  68. Porting 94 2 109 4 203 6 234 8 241 9
  69. qnx 1 2 1 2 1 2 1 2 1 2
  70. utils 112 8 118 8 124 8 156 9 159 9
  71. vms 518 34 524 34 538 34 569 34 569 34
  72. win32 285 33 378 36 470 39 493 39 575 41
  73. x2p 281 19 281 19 281 19 282 19 281 19
  74. ======================================================================
  75. 5.004_70 5.004_73 5.004_75 5.005 5.005_03
  76. apollo - - - - - - - - 0 1
  77. beos 1 1 1 1 1 1 1 1 1 1
  78. Configure 256 1 256 1 264 1 264 1 270 1
  79. cygwin32 24 5 24 5 24 5 24 5 24 5
  80. djgpp 14 5 14 5 14 5 14 5 15 5
  81. eg 86 65 86 65 86 65 86 65 86 65
  82. emacs 262 2 262 2 262 2 262 2 274 2
  83. h2pl 12 12 12 12 12 12 12 12 12 12
  84. hints 157 74 157 74 159 74 160 74 179 77
  85. mint - - - - - - - - 4 7
  86. mpeix - - - - 5 3 5 3 5 3
  87. os2 129 44 139 44 142 44 143 44 148 44
  88. plan9 82 15 82 15 82 15 82 15 82 15
  89. Porting 241 9 253 9 259 10 264 12 272 13
  90. qnx 1 2 1 2 1 2 1 2 1 2
  91. utils 160 9 160 9 160 9 160 9 164 9
  92. vms 570 34 572 34 573 34 575 34 583 34
  93. vos - - - - - - - - 156 10
  94. win32 577 41 585 41 585 41 587 41 600 42
  95. x2p 281 19 281 19 281 19 281 19 281 19
  96. ======================================================================
  97. 5.6.0 5.6.1 5.6.2 5.7.3
  98. apollo 8 3 8 3 8 3 8 3
  99. beos 5 2 5 2 5 2 6 4
  100. Configure 346 1 361 1 363 1 394 1
  101. Cross - - - - - - 4 2
  102. djgpp 19 6 19 6 19 6 21 7
  103. eg 112 71 112 71 112 71 - -
  104. emacs 303 4 319 4 319 4 319 4
  105. epoc 29 8 35 8 35 8 36 8
  106. h2pl 24 15 24 15 24 15 24 15
  107. hints 242 83 250 84 321 89 272 87
  108. mint 11 9 11 9 11 9 11 9
  109. mpeix 9 4 9 4 9 4 9 4
  110. NetWare - - - - - - 423 57
  111. os2 214 59 224 60 224 60 357 66
  112. plan9 92 17 92 17 92 17 85 15
  113. Porting 361 15 390 16 390 16 425 21
  114. qnx 5 3 5 3 5 3 5 3
  115. utils 228 12 221 11 222 11 267 13
  116. uts - - - - - - 12 3
  117. vmesa 25 4 25 4 25 4 25 4
  118. vms 686 38 627 38 627 38 649 36
  119. vos 227 12 249 15 248 15 281 17
  120. win32 755 41 782 42 801 42 1006 50
  121. x2p 307 20 307 20 307 20 345 20
  122. ======================================================================
  123. 5.8.0 5.8.1 5.8.2 5.8.3 5.8.4
  124. apollo 8 3 8 3 8 3 8 3 8 3
  125. beos 6 4 6 4 6 4 6 4 6 4
  126. Configure 472 1 493 1 493 1 493 1 494 1
  127. Cross 4 2 45 10 45 10 45 10 45 10
  128. djgpp 21 7 21 7 21 7 21 7 21 7
  129. emacs 319 4 329 4 329 4 329 4 329 4
  130. epoc 33 8 33 8 33 8 33 8 33 8
  131. h2pl 24 15 24 15 24 15 24 15 24 15
  132. hints 294 88 321 89 321 89 321 89 348 91
  133. mint 11 9 11 9 11 9 11 9 11 9
  134. mpeix 24 5 25 5 25 5 25 5 25 5
  135. NetWare 488 61 490 61 490 61 490 61 488 61
  136. os2 361 66 445 67 450 67 488 67 488 67
  137. plan9 85 15 325 17 325 17 325 17 321 17
  138. Porting 479 22 537 32 538 32 539 32 538 33
  139. qnx 5 3 5 3 5 3 5 3 5 3
  140. utils 275 15 258 16 258 16 263 19 263 19
  141. uts 12 3 12 3 12 3 12 3 12 3
  142. vmesa 25 4 25 4 25 4 25 4 25 4
  143. vms 648 36 654 36 654 36 656 36 656 36
  144. vos 330 20 335 20 335 20 335 20 335 20
  145. win32 1062 49 1125 49 1127 49 1126 49 1181 56
  146. x2p 347 20 348 20 348 20 348 20 348 20
  147. ======================================================================
  148. 5.8.5 5.8.6 5.8.7 5.8.8 5.8.9
  149. apollo 8 3 8 3 8 3 8 3 8 3
  150. beos 6 4 6 4 8 4 8 4 8 4
  151. Configure 494 1 494 1 495 1 506 1 520 1
  152. Cross 45 10 45 10 45 10 45 10 46 10
  153. djgpp 21 7 21 7 21 7 21 7 21 7
  154. emacs 329 4 329 4 329 4 329 4 406 4
  155. epoc 33 8 33 8 33 8 34 8 35 8
  156. h2pl 24 15 24 15 24 15 24 15 24 15
  157. hints 350 91 352 91 355 94 360 94 387 99
  158. mint 11 9 11 9 11 9 11 9 11 9
  159. mpeix 25 5 25 5 25 5 49 6 49 6
  160. NetWare 488 61 488 61 488 61 490 61 491 61
  161. os2 488 67 488 67 488 67 488 67 552 70
  162. plan9 321 17 321 17 321 17 322 17 324 17
  163. Porting 538 34 548 35 549 35 564 37 625 41
  164. qnx 5 3 5 3 5 3 5 3 5 3
  165. utils 265 19 265 19 266 19 267 19 281 21
  166. uts 12 3 12 3 12 3 12 3 12 3
  167. vmesa 25 4 25 4 25 4 25 4 25 4
  168. vms 657 36 658 36 662 36 664 36 716 35
  169. vos 335 20 335 20 335 20 336 21 345 22
  170. win32 1183 56 1190 56 1199 56 1219 56 1484 68
  171. x2p 349 20 349 20 349 20 349 19 350 19
  172. ======================================================================
  173. 5.9.0 5.9.1 5.9.2 5.9.3 5.9.4
  174. apollo 8 3 8 3 8 3 8 3 8 3
  175. beos 6 4 6 4 8 4 8 4 8 4
  176. Configure 493 1 493 1 495 1 508 1 512 1
  177. Cross 45 10 45 10 45 10 45 10 46 10
  178. djgpp 21 7 21 7 21 7 21 7 21 7
  179. emacs 329 4 329 4 329 4 329 4 329 4
  180. epoc 33 8 33 8 33 8 34 8 34 8
  181. h2pl 24 15 24 15 24 15 24 15 24 15
  182. hints 321 89 346 91 355 94 359 94 366 96
  183. mad - - - - - - - - 174 6
  184. mint 11 9 11 9 11 9 11 9 11 9
  185. mpeix 25 5 25 5 25 5 49 6 49 6
  186. NetWare 489 61 487 61 487 61 489 61 489 61
  187. os2 444 67 488 67 488 67 488 67 488 67
  188. plan9 325 17 321 17 321 17 322 17 323 17
  189. Porting 537 32 536 33 549 36 564 38 576 38
  190. qnx 5 3 5 3 5 3 5 3 5 3
  191. symbian - - - - - - 293 53 293 53
  192. utils 258 16 263 19 268 20 273 23 275 24
  193. uts 12 3 12 3 12 3 12 3 12 3
  194. vmesa 25 4 25 4 25 4 25 4 25 4
  195. vms 660 36 547 33 553 33 661 33 696 33
  196. vos 11 7 11 7 11 7 11 7 11 7
  197. win32 1120 49 1124 51 1191 56 1209 56 1719 90
  198. x2p 348 20 348 20 349 20 349 19 349 19
  199. ======================================================================
  200. 5.9.5 5.10.0 5.10.1 5.12.0 5.12.1
  201. apollo 8 3 8 3 0 3 0 3 0 3
  202. beos 8 4 8 4 4 4 4 4 4 4
  203. Configure 518 1 518 1 533 1 536 1 536 1
  204. Cross 122 15 122 15 119 15 118 15 118 15
  205. djgpp 21 7 21 7 17 7 17 7 17 7
  206. emacs 329 4 406 4 402 4 402 4 402 4
  207. epoc 34 8 35 8 31 8 31 8 31 8
  208. h2pl 24 15 24 15 12 15 12 15 12 15
  209. hints 377 98 381 98 385 100 368 97 368 97
  210. mad 182 8 182 8 174 8 174 8 174 8
  211. mint 11 9 11 9 3 9 - - - -
  212. mpeix 49 6 49 6 45 6 45 6 45 6
  213. NetWare 489 61 489 61 465 61 466 61 466 61
  214. os2 552 70 552 70 507 70 507 70 507 70
  215. plan9 324 17 324 17 316 17 316 17 316 17
  216. Porting 627 40 632 40 933 53 749 54 749 54
  217. qnx 5 3 5 4 1 4 1 4 1 4
  218. symbian 300 54 300 54 290 54 288 54 288 54
  219. utils 260 26 264 27 268 27 269 27 269 27
  220. uts 12 3 12 3 8 3 8 3 8 3
  221. vmesa 25 4 25 4 21 4 21 4 21 4
  222. vms 690 32 722 32 693 30 645 18 645 18
  223. vos 19 8 19 8 16 8 16 8 16 8
  224. win32 1482 68 1485 68 1497 70 1841 73 1841 73
  225. x2p 349 19 349 19 345 19 345 19 345 19
  226. ======================================================================
  227. 5.12.2 5.12.3 5.14.0 5.16.0 5.18.0
  228. apollo 0 3 0 3 - - - - - -
  229. beos 4 4 4 4 5 4 5 4 - -
  230. Configure 536 1 536 1 539 1 547 1 550 1
  231. Cross 118 15 118 15 118 15 118 15 118 15
  232. djgpp 17 7 17 7 18 7 18 7 18 7
  233. emacs 402 4 402 4 - - - - - -
  234. epoc 31 8 31 8 32 8 30 8 - -
  235. h2pl 12 15 12 15 15 15 15 15 13 15
  236. hints 368 97 368 97 370 96 371 96 354 91
  237. mad 174 8 174 8 176 8 176 8 174 8
  238. mpeix 45 6 45 6 46 6 46 6 - -
  239. NetWare 466 61 466 61 473 61 472 61 469 61
  240. os2 507 70 507 70 518 70 519 70 510 70
  241. plan9 316 17 316 17 319 17 319 17 318 17
  242. Porting 750 54 750 54 855 60 1093 69 1149 70
  243. qnx 1 4 1 4 2 4 2 4 1 4
  244. symbian 288 54 288 54 292 54 292 54 290 54
  245. utils 269 27 269 27 249 29 245 30 246 31
  246. uts 8 3 8 3 9 3 9 3 - -
  247. vmesa 21 4 21 4 22 4 22 4 - -
  248. vms 646 18 644 18 639 17 571 15 564 15
  249. vos 16 8 16 8 17 8 9 7 8 7
  250. win32 1841 73 1841 73 1833 72 1655 67 1157 62
  251. x2p 345 19 345 19 346 19 345 19 344 20

SELECTED PATCH SIZES

The "diff lines kB" column means that, for example, the patch 5.003_08, to be applied on top of 5.003_07 (or whatever preceded 5.003_08), added 110 kilobytes of lines, removed 19 kilobytes of lines, and changed 424 kilobytes of lines. Only the lines themselves are counted, not their context. The "+ - !" markers come from the diff(1) context diff output format.

  Pumpking Release     Date          diff lines kB
                                     -------------
                                         +     -     !
  ==================================================
  Chip     5.003_08    1996-Nov-19   110    19   424
           5.003_09    1996-Nov-26    38     9   248
           5.003_10    1996-Nov-29    29     2    27
           5.003_11    1996-Dec-06    73    12   165
           5.003_12    1996-Dec-19   275     6   436
           5.003_13    1996-Dec-20    95     1    56
           5.003_14    1996-Dec-23    23     7   333
           5.003_15    1996-Dec-23     0     0     1
           5.003_16    1996-Dec-24    12     3    50
           5.003_17    1996-Dec-27    19     1    14
           5.003_18    1996-Dec-31    21     1    32
           5.003_19    1997-Jan-04    80     3    85
           5.003_20    1997-Jan-07    18     1   146
           5.003_21    1997-Jan-15    38    10   221
           5.003_22    1997-Jan-16     4     0    18
           5.003_23    1997-Jan-25    71    15   119
           5.003_24    1997-Jan-29   426     1    20
           5.003_25    1997-Feb-04    21     8   169
           5.003_26    1997-Feb-10    16     1    15
           5.003_27    1997-Feb-18    32    10    38
           5.003_28    1997-Feb-21    58     4    66
           5.003_90    1997-Feb-25    22     2    34
           5.003_91    1997-Mar-01    37     1    39
           5.003_92    1997-Mar-06    16     3    69
           5.003_93    1997-Mar-10    12     3    15
           5.003_94    1997-Mar-22   407     7   200
           5.003_95    1997-Mar-25    41     1    37
           5.003_96    1997-Apr-01   283     5   261
           5.003_97    1997-Apr-03    13     2    34
           5.003_97a   1997-Apr-05    57     1    27
           5.003_97b   1997-Apr-08    14     1    20
           5.003_97c   1997-Apr-10    20     1    16
           5.003_97d   1997-Apr-13     8     0    16
           5.003_97e   1997-Apr-15    15     4    46
           5.003_97f   1997-Apr-17     7     1    33
           5.003_97g   1997-Apr-18     6     1    42
           5.003_97h   1997-Apr-24    23     3    68
           5.003_97i   1997-Apr-25    23     1    31
           5.003_97j   1997-Apr-28    36     1    49
           5.003_98    1997-Apr-30   171    12   539
           5.003_99    1997-May-01     6     0     7
           5.003_99a   1997-May-09    36     2    61
           p54rc1      1997-May-12     8     1    11
           p54rc2      1997-May-14     6     0    40
           5.004       1997-May-15     4     0     4
  Tim      5.004_01    1997-Jun-13   222    14    57
           5.004_02    1997-Aug-07   112    16   119
           5.004_03    1997-Sep-05   109     0    17
           5.004_04    1997-Oct-15    66     8   173

The patch-free era

In more modern times, named releases don't come as often, and as progress can be followed (nearly) instantly (with rsync, and since late 2008, git), patches between versions are no longer provided. However, that doesn't keep us from calculating how large a patch could have been, which is shown in the table below. Unless noted otherwise, the size mentioned is the patch to bring version x.y.z to x.y.z+1.

  Sarathy   5.6.1   2001-Apr-08    531   44    651
  Rafael    5.6.2   2003-Nov-15     20   11   1819
  Jarkko    5.8.0   2002-Jul-18   1205   31    471  From 5.7.3
            5.8.1   2003-Sep-25    243  102   6162
  Nicholas  5.8.2   2003-Nov-05     10   50    788
            5.8.3   2004-Jan-14     31   13    360
            5.8.4   2004-Apr-21     33    8    299
            5.8.5   2004-Jul-19     11   19    255
            5.8.6   2004-Nov-27     35    3    192
            5.8.7   2005-May-30     75   34    778
            5.8.8   2006-Jan-31    131   42   1251
            5.8.9   2008-Dec-14    340  132  12988
  Hugo      5.9.0   2003-Oct-27    281  168   7132  From 5.8.0
  Rafael    5.9.1   2004-Mar-16     57  250   2107
            5.9.2   2005-Apr-01    720   57    858
            5.9.3   2006-Jan-28   1124  102   1906
            5.9.4   2006-Aug-15    896   60    862
            5.9.5   2007-Jul-07   1149  128   1062
            5.10.0  2007-Dec-18     50   31  13111  From 5.9.5

THE KEEPERS OF THE RECORDS

Jarkko Hietaniemi <jhi@iki.fi>.

Thanks to the collective memory of the Perlfolk. In addition to the Keepers of the Pumpkin also Alan Champion, Mark Dominus, Andreas König, John Macdonald, Matthias Neeracher, Jeff Okamoto, Michael Peppler, Randal Schwartz, and Paul D. Smith sent corrections and additions. Abigail added file and patch size data for the 5.6.0 - 5.10 era.

 

perlhpux

NAME

perlhpux - Perl version 5 on Hewlett-Packard Unix (HP-UX) systems

DESCRIPTION

This document describes various features of HP's Unix operating system (HP-UX) that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

Using perl as shipped with HP-UX

The September 2001 Application release for HP-UX 11.00 was the first to ship with Perl. At that time it was perl-5.6.1 in /opt/perl. The first occurrence is on CD 5012-7954 and can be installed using

  swinstall -s /cdrom perl

assuming you have mounted that CD on /cdrom.

That build was a portable hppa-1.1 multithread build that supports large files compiled with gcc-2.9-hppa-991112.

If you perform a new installation, then (a newer) Perl will be installed automatically. Pre-installed HP-UX systems now have more recent versions of Perl and the updated modules.

The official (threaded) builds from HP, as they are shipped on the Application DVD/CDs, are available on http://www.software.hp.com/portal/swdepot/displayProductInfo.do?productNumber=PERL for both PA-RISC and IPF (Itanium Processor Family). They are built with the HP ANSI-C compiler. Up to 5.8.8 that was done by ActiveState.

To see what version is included on the DVD (assumed here to be mounted on /cdrom), issue this command:

  # swlist -s /cdrom perl
  # perl                  D.5.8.8.B  5.8.8 Perl Programming Language
    perl.Perl5-32         D.5.8.8.B  32-bit 5.8.8 Perl Programming Language with Extensions
    perl.Perl5-64         D.5.8.8.B  64-bit 5.8.8 Perl Programming Language with Extensions

To see what is installed on your system:

  # swlist -R perl
  # perl                    E.5.8.8.J  Perl Programming Language
  # perl.Perl5-32           E.5.8.8.J  32-bit Perl Programming Language with Extensions
    perl.Perl5-32.PERL-MAN  E.5.8.8.J  32-bit Perl Man Pages for IA
    perl.Perl5-32.PERL-RUN  E.5.8.8.J  32-bit Perl Binaries for IA
  # perl.Perl5-64           E.5.8.8.J  64-bit Perl Programming Language with Extensions
    perl.Perl5-64.PERL-MAN  E.5.8.8.J  64-bit Perl Man Pages for IA
    perl.Perl5-64.PERL-RUN  E.5.8.8.J  64-bit Perl Binaries for IA

Using perl from HP's porting centre

The HP porting centre tries to keep up with customer demand and release updates from the Open Source community. Having precompiled Perl binaries available is an obvious part of that, though "up-to-date" is relative. At the time of writing only perl-5.10.1 was available (with 5.16.3 being the latest stable release from the porters' point of view).

The HP porting centres are limited in what systems they are allowed to port to and they usually choose the two most recent OS versions available.

HP has asked the porting centre to move Open Source binaries from /opt to /usr/local, so binaries produced since the start of July 2002 are located in /usr/local.

One of the HP porting centres' URLs is http://hpux.connect.org.uk/. The port currently available is built with GNU gcc.

Other prebuilt perl binaries

To get even more recent perl depots for the whole range of HP-UX, visit H.Merijn Brand's site at http://mirrors.develooper.com/hpux/#Perl. Carefully read the notes to see if the available versions suit your needs.

Compiling Perl 5 on HP-UX

When compiling Perl, you must use an ANSI C compiler. The C compiler that ships with all HP-UX systems is a K&R compiler that should only be used to build new kernels.

Perl can be compiled with either HP's ANSI C compiler or with gcc. The former is recommended, as not only can it compile Perl with no difficulty, but also can take advantage of features listed later that require the use of HP compiler-specific command-line flags.

If you decide to use gcc, make sure your installation is recent and complete, and be sure to read the Perl INSTALL file for more gcc-specific details.

PA-RISC

HP's HP9000 Unix systems run on HP's own Precision Architecture (PA-RISC) chip. HP-UX used to run on the Motorola MC68000 family of chips, but any machine with this chip in it is quite obsolete and this document will not attempt to address issues for compiling Perl on the Motorola chipset.

The version of PA-RISC at the time of this document's last update is 2.0, which is also the last there will be. HP PA-RISC systems are usually referred to with model description "HP 9000". The last CPU in this series is the PA-8900. Support for PA-RISC based machines officially ends as shown in the following table:

  PA-RISC End-of-Life Roadmap

  +--------+----------------+----------------+-----------------+
  | HP9000 | Superdome      | PA-8700        | Spring 2011     |
  | 4-128  |                | PA-8800/sx1000 | Summer 2012     |
  | cores  |                | PA-8900/sx1000 | 2014            |
  |        |                | PA-8900/sx2000 | 2015            |
  +--------+----------------+----------------+-----------------+
  | HP9000 | rp7410, rp8400 | PA-8700        | Spring 2011     |
  | 2-32   | rp7420, rp8420 | PA-8800/sx1000 | 2012            |
  | cores  | rp7440, rp8440 | PA-8900/sx1000 | Autumn 2013     |
  |        |                | PA-8900/sx2000 | 2015            |
  +--------+----------------+----------------+-----------------+
  | HP9000 | rp44x0         | PA-8700        | Spring 2011     |
  | 1-8    |                | PA-8800/rp44x0 | 2012            |
  | cores  |                | PA-8900/rp44x0 | 2014            |
  +--------+----------------+----------------+-----------------+
  | HP9000 | rp34x0         | PA-8700        | Spring 2011     |
  | 1-4    |                | PA-8800/rp34x0 | 2012            |
  | cores  |                | PA-8900/rp34x0 | 2014            |
  +--------+----------------+----------------+-----------------+

From http://www.hp.com/products1/evolution/9000/faqs.html

  The last order date for HP 9000 systems was December 31, 2008.

A complete list of models at the time the OS was built is in the file /usr/sam/lib/mo/sched.models. The first column corresponds to the last part of the output of the "model" command. The second column is the PA-RISC version and the third column is the exact chip type used. (Start browsing at the bottom to prevent confusion ;-)

  # model
  9000/800/L1000-44
  # grep L1000-44 /usr/sam/lib/mo/sched.models
  L1000-44        2.0     PA8500
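
The lookup above can be wrapped in a small shell function. This is only a sketch: the name lookup_model is made up for this illustration, and it assumes the models file has the three whitespace-separated columns described above (model, PA-RISC version, chip type).

```shell
# Look up the PA-RISC version and chip type for a model string.
# $1: output of the "model" command, e.g. 9000/800/L1000-44
# $2: path to the models file (on HP-UX: /usr/sam/lib/mo/sched.models)
lookup_model() {
    key=${1##*/}    # keep only the part after the last slash
    awk -v key="$key" '
        $1 == key { print $2, $3; found = 1; exit }
        END       { exit !found }   # non-zero status if no line matched
    ' "$2"
}
```

On an HP-UX system it would be called as lookup_model "$(model)" /usr/sam/lib/mo/sched.models. The exit status is non-zero when the model is not listed, so the call can be used directly in an if statement.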

Portability Between PA-RISC Versions

An executable compiled on a PA-RISC 2.0 platform will not execute on a PA-RISC 1.1 platform, even if they are running the same version of HP-UX. If you are building Perl on a PA-RISC 2.0 platform and want that Perl to also run on a PA-RISC 1.1, the compiler flags +DAportable and +DS32 should be used.

It is no longer possible to compile PA-RISC 1.0 executables on either the PA-RISC 1.1 or 2.0 platforms. The command-line flags are accepted, but the resulting executable will not run when transferred to a PA-RISC 1.0 system.

PA-RISC 1.0

This is the original version of PA-RISC. HP no longer sells any system with this chip.

The following systems contained PA-RISC 1.0 chips:

  600, 635, 645, 808, 815, 822, 825, 832, 834, 835, 840, 842, 845, 850,
  852, 855, 860, 865, 870, 890

PA-RISC 1.1

An upgrade to the PA-RISC design, it shipped for many years in many different systems.

The following systems contain PA-RISC 1.1 chips:

  705, 710, 712, 715, 720, 722, 725, 728, 730, 735, 742, 743, 744, 745,
  747, 750, 755, 770, 777, 778, 779, 800, 801, 803, 806, 807, 809, 811,
  813, 816, 817, 819, 821, 826, 827, 829, 831, 837, 839, 841, 847, 849,
  851, 856, 857, 859, 867, 869, 877, 887, 891, 892, 897, A180, A180C,
  B115, B120, B132L, B132L+, B160L, B180L, C100, C110, C115, C120,
  C160L, D200, D210, D220, D230, D250, D260, D310, D320, D330, D350,
  D360, D410, DX0, DX5, DXO, E25, E35, E45, E55, F10, F20, F30, G30,
  G40, G50, G60, G70, H20, H30, H40, H50, H60, H70, I30, I40, I50, I60,
  I70, J200, J210, J210XC, K100, K200, K210, K220, K230, K400, K410,
  K420, S700i, S715, S744, S760, T500, T520

PA-RISC 2.0

The most recent upgrade to the PA-RISC design, it added support for 64-bit integer data.

As of the date of this document's last update, the following systems contain PA-RISC 2.0 chips:

  700, 780, 781, 782, 783, 785, 802, 804, 810, 820, 861, 871, 879, 889,
  893, 895, 896, 898, 899, A400, A500, B1000, B2000, C130, C140, C160,
  C180, C180+, C180-XP, C200+, C400+, C3000, C360, C3600, CB260, D270,
  D280, D370, D380, D390, D650, J220, J2240, J280, J282, J400, J410,
  J5000, J5500XM, J5600, J7000, J7600, K250, K260, K260-EG, K270, K360,
  K370, K380, K450, K460, K460-EG, K460-XP, K470, K570, K580, L1000,
  L2000, L3000, N4000, R380, R390, SD16000, SD32000, SD64000, T540,
  T600, V2000, V2200, V2250, V2500, V2600

Just before HP took over Compaq, some systems were renamed. The link that contained the explanation is dead, so here's a short summary:

  HP 9000 A-Class servers, now renamed HP Server rp2400 series.
  HP 9000 L-Class servers, now renamed HP Server rp5400 series.
  HP 9000 N-Class servers, now renamed HP Server rp7400.

  rp2400, rp2405, rp2430, rp2450, rp2470, rp3410, rp3440, rp4410,
  rp4440, rp5400, rp5405, rp5430, rp5450, rp5470, rp7400, rp7405,
  rp7410, rp7420, rp7440, rp8400, rp8420, rp8440, Superdome

The current naming convention is:

  aadddd
  ||||`+- 00 - 99 relative capacity & newness (upgrades, etc.)
  |||`--- unique number for each architecture to ensure different
  |||     systems do not have the same numbering across
  |||     architectures
  ||`---- 1 - 9 identifies family and/or relative positioning
  ||
  |`----- c = ia32 (cisc)
  |       p = pa-risc
  |       x = ia-64 (Itanium & Itanium 2)
  |       h = housing
  `------ t = tower
          r = rack optimized
          s = super scalable
          b = blade
          sa = appliance
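
As an illustration, the two leading letters can be decoded with a pair of case statements. This is a sketch only: the function name describe_model is hypothetical, and treating "sa" as a prefix that carries no separate architecture letter is an assumption, since the scheme above does not say how it combines with the second position.

```shell
# Decode the letter prefix of a model name such as rp7410.
describe_model() {
    case $1 in
        sa*) echo "appliance"; return 0 ;;   # assumption: no arch letter follows
        t?*) form="tower" ;;
        r?*) form="rack optimized" ;;
        s?*) form="super scalable" ;;
        b?*) form="blade" ;;
        *)   form="unknown form factor" ;;
    esac
    rest=${1#?}                              # drop the form-factor letter
    case $rest in
        c*) arch="ia32 (cisc)" ;;
        p*) arch="pa-risc" ;;
        x*) arch="ia-64" ;;
        h*) arch="housing" ;;
        *)  arch="unknown architecture" ;;
    esac
    printf '%s, %s\n' "$form" "$arch"
}
```

Names that predate or sidestep the convention, such as cx2600 or BL60p, do not start with one of the listed form-factor letters and are reported as having an unknown form factor.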

Itanium Processor Family (IPF) and HP-UX

HP-UX also runs on the new Itanium processor. This requires the use of a different version of HP-UX (currently 11.23 or 11i v2), and with the exception of a few differences detailed below and in later sections, Perl should compile with no problems.

Although PA-RISC binaries can run on Itanium systems, you should not attempt to use a PA-RISC version of Perl on an Itanium system. This is because shared libraries created on an Itanium system cannot be loaded while running a PA-RISC executable.

HP Itanium 2 systems are usually referred to with model description "HP Integrity".

Itanium, Itanium 2 & Madison 6

HP also ships servers with the 128-bit Itanium processor(s). The cx26x0 is said to have Madison 6. As of the date of this document's last update, the following systems contain Itanium or Itanium 2 chips (this is likely to be out of date):

  BL60p, BL860c, BL870c, BL890c, cx2600, cx2620, rx1600, rx1620, rx2600,
  rx2600hptc, rx2620, rx2660, rx2800, rx3600, rx4610, rx4640, rx5670,
  rx6600, rx7420, rx7620, rx7640, rx8420, rx8620, rx8640, rx9610,
  sx1000, sx2000

To see all about your machine, type

  # model
  ia64 hp server rx2600
  # /usr/contrib/bin/machinfo

HP-UX versions

Not all architectures (PA = PA-RISC, IPF = Itanium Processor Family) support all versions of HP-UX. Here is a short list:

  HP-UX version  Kernel  Architecture  End-of-factory support
  -------------  ------  ------------  ----------------------------------
  10.20          32 bit  PA            30-Jun-2003
  11.00          32/64   PA            31-Dec-2006
  11.11  11i v1  32/64   PA            31-Dec-2015
  11.22  11i v2  64      IPF           30-Apr-2004
  11.23  11i v2  64      PA & IPF      31-Dec-2015
  11.31  11i v3  64      PA & IPF      31-Dec-2020 (PA) 31-Dec-2022 (IPF)

For the full list of hardware/OS support and expected end-of-life, see http://www.hp.com/go/hpuxservermatrix

Building Dynamic Extensions on HP-UX

HP-UX supports dynamically loadable libraries (shared libraries). Shared libraries end with the suffix .sl. On Itanium systems, they end with the suffix .so.

Shared libraries created on a platform using a particular PA-RISC version are not usable on platforms using an earlier PA-RISC version by default. However, this backwards compatibility may be enabled using the same +DAportable compiler flag (with the same PA-RISC 1.0 caveat mentioned above).

Shared libraries created on an Itanium platform cannot be loaded on a PA-RISC platform. Shared libraries created on a PA-RISC platform can only be loaded on an Itanium platform if it is a PA-RISC executable that is attempting to load the PA-RISC library. A PA-RISC shared library cannot be loaded into an Itanium executable nor vice-versa.

To create a shared library, the following steps must be performed:

  1. Compile source modules with +z or +Z flag to create a .o module
     which contains Position-Independent Code (PIC). The linker will
     tell you in the next step if +Z was needed.
     (For gcc, the appropriate flag is -fpic or -fPIC.)

  2. Link the shared library using the -b flag. If the code calls
     any functions in other system libraries (e.g., libm), it must
     be included on this line.

(Note that these steps are usually handled automatically by the extension's Makefile).

If these dependent libraries are not listed at shared library creation time, you will get fatal "Unresolved symbol" errors at run time when the library is loaded.
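
The two steps can be sketched as a dry run that only prints the commands. The function name shlib_commands and the file names are made up for this illustration; the flags (+Z, -fPIC, ld -b) are the ones named above, and the .sl suffix is the PA-RISC one.

```shell
# Print (do not run) the commands that build lib<name>.sl from <name>.c.
# $1: library base name; $2: "cc" (HP ANSI C) or "gcc";
# remaining arguments: dependent libraries such as -lm
shlib_commands() {
    name=$1
    compiler=$2
    shift 2
    case $compiler in
        gcc) pic="-fPIC" ;;
        *)   pic="+Z" ;;
    esac
    # Step 1: compile to position-independent code
    echo "$compiler $pic -c $name.c"
    # Step 2: link with -b, listing every dependent library
    link="ld -b -o lib$name.sl $name.o"
    if [ $# -gt 0 ]; then
        link="$link $*"
    fi
    echo "$link"
}
```

For example, shlib_commands foo cc -lm prints a cc +Z compile line followed by an ld -b link line that ends in -lm. Leaving -lm off the link line is exactly the omission that leads to unresolved symbols at load time.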

You may create a shared library that refers to another library, which may be either an archive library or a shared library. If this second library is a shared library, this is called a "dependent library". The dependent library's name is recorded in the main shared library, but it is not linked into the shared library. Instead, it is loaded when the main shared library is loaded. This can cause problems if you build an extension on one system and move it to another system where the libraries may not be located in the same place as on the first system.

If the referred library is an archive library, then it is treated as a simple collection of .o modules (all of which must contain PIC). These modules are then linked into the shared library.

Note that it is okay to create a library which contains a dependent library that is already linked into perl.

Some extensions, like DB_File and Compress::Zlib use/require prebuilt libraries for the perl extensions/modules to work. If these libraries are built using the default configuration, it might happen that you run into an error like "invalid loader fixup" during load phase. HP is aware of this problem. Search the HP-UX cxx-dev forums for discussions about the subject. The short answer is that everything (all libraries, everything) must be compiled with +z or +Z to be PIC (position independent code). (For gcc, that would be -fpic or -fPIC ). In HP-UX 11.00 or newer the linker error message should tell the name of the offending object file.

A more general approach is to intervene manually, as with an example for the DB_File module, which requires SleepyCat's libdb.sl:

  # cd .../db-3.2.9/build_unix
  # vi Makefile
  ... add +Z to all cflags to create shared objects
  CFLAGS=         -c $(CPPFLAGS) +Z -Ae +O2 +Onolimit \
                  -I/usr/local/include -I/usr/include/X11R6
  CXXFLAGS=       -c $(CPPFLAGS) +Z -Ae +O2 +Onolimit \
                  -I/usr/local/include -I/usr/include/X11R6

  # make clean
  # make
  # mkdir tmp
  # cd tmp
  # ar x ../libdb.a
  # ld -b -o libdb-3.2.sl *.o
  # mv libdb-3.2.sl /usr/local/lib
  # rm *.o
  # cd /usr/local/lib
  # rm -f libdb.sl
  # ln -s libdb-3.2.sl libdb.sl

  # cd .../DB_File-1.76
  # make distclean
  # perl Makefile.PL
  # make
  # make test
  # make install

As of db-4.2.x it is no longer necessary to do this by hand. Sleepycat has changed the configuration process to add +z on HP-UX automatically.

  # cd .../db-4.2.25/build_unix
  # env CFLAGS=+DD64 LDFLAGS=+DD64 ../dist/configure

should work to generate 64bit shared libraries for HP-UX 11.00 and 11i.

It is no longer possible to link PA-RISC 1.0 shared libraries (even though the command-line flags are still present).

PA-RISC and Itanium object files are not interchangeable. Although you may be able to use ar to create an archive library of PA-RISC object files on an Itanium system, you cannot link against it using an Itanium link editor.

The HP ANSI C Compiler

When using this compiler to build Perl, you should make sure that the flag -Aa is added to the cpprun and cppstdin variables in the config.sh file (though see the section on 64-bit perl below). If you are using a recent version of the Perl distribution, these flags are set automatically.

Even though HP-UX 10.20 and 11.00 are not actively maintained by HP anymore, updates for the HP ANSI C compiler are still available from time to time, and it might be advisable to see if updates are applicable. At the time of writing, the latest available patches for 11.00 that should be applied are PHSS_35098, PHSS_35175, PHSS_35100, PHSS_33036, and PHSS_33902. If you have a SUM account, you can use it to search for updates/patches. Enter "ANSI" as keyword.

The GNU C Compiler

When you are going to use the GNU C compiler (gcc), and you don't have gcc yet, you can either build it yourself from the sources (available from e.g. http://gcc.gnu.org/mirrors.html) or fetch a prebuilt binary from the HP porting center at http://hpux.connect.org.uk/hppd/cgi-bin/search?term=gcc&Search=Search or from the DSPP (you need to be a member) at http://h21007.www2.hp.com/portal/site/dspp/menuitem.863c3e4cbcdc3f3515b49c108973a801?ciid=2a08725cc2f02110725cc2f02110275d6e10RCRD&jumpid=reg_r1002_usen_c-001_title_r0001 (Browse through the list, because there are often multiple versions of the same package available).

Most mentioned distributions are depots. H.Merijn Brand has made prebuilt gcc binaries available on http://mirrors.develooper.com/hpux/ and/or http://www.cmve.net/~merijn/ for HP-UX 10.20 (only 32bit), HP-UX 11.00, HP-UX 11.11 (HP-UX 11i v1), and HP-UX 11.23 (HP-UX 11i v2 PA-RISC) in both 32- and 64-bit versions. For HP-UX 11.23 IPF and HP-UX 11.31 IPF depots are available too. The IPF versions do not need two versions of GNU gcc.

On PA-RISC you need a different compiler for 32-bit applications and for 64-bit applications. On PA-RISC, 32-bit objects and 64-bit objects do not mix. Period. There is no different behaviour for HP C-ANSI-C or GNU gcc. So if you require your perl binary to use 64-bit libraries, like Oracle-64bit, you MUST build a 64-bit perl.

Building a 64-bit capable gcc on PA-RISC from source is possible only when you have the HP C-ANSI C compiler or an already working 64-bit binary of gcc available. Best performance for perl is achieved with HP's native compiler.

Using Large Files with Perl on HP-UX

Beginning with HP-UX version 10.20, files larger than 2GB (2^31 bytes) may be created and manipulated. Three separate methods of doing this are available. Of these methods, the best method for Perl is to compile using the -Duselargefiles flag to Configure. This causes Perl to be compiled using structures and functions in which these are 64 bits wide, rather than 32 bits wide. (Note that this will only work with HP's ANSI C compiler. If you want to compile Perl using gcc, you will have to get a version of the compiler that supports 64-bit operations. See above for where to find it.)

There are some drawbacks to this approach. One is that any extension which calls any file-manipulating C function will need to be recompiled (just follow the usual "perl Makefile.PL; make; make test; make install" procedure).

The list of functions that will need to be recompiled is: creat, fgetpos, fopen, freopen, fsetpos, fstat, fstatvfs, fstatvfsdev, ftruncate, ftw, lockf, lseek, lstat, mmap, nftw, open, prealloc, stat, statvfs, statvfsdev, tmpfile, truncate, getrlimit, setrlimit

Another drawback is only valid for Perl versions before 5.6.0. This drawback is that the seek and tell functions (both the builtin version and POSIX module version) will not perform correctly.

It is strongly recommended that you use this flag when you run Configure. If you do not do this, but later answer the question about large files when Configure asks you, you may get a configuration that cannot be compiled, or that does not function as expected.

Threaded Perl on HP-UX

It is possible to compile a version of threaded Perl on any version of HP-UX before 10.30, but it is strongly suggested that you be running on HP-UX 11.00 at least.

To compile Perl with threads, add -Dusethreads to the arguments of Configure. Verify that the -D_POSIX_C_SOURCE=199506L compiler flag is automatically added to the list of flags. Also make sure that -lpthread is listed before -lc in the list of libraries to link Perl with. The hints provided for HP-UX during Configure will try very hard to get this right for you.
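
One way to check the library order is sketched below. The function name pthread_before_c is made up for this illustration; it only inspects a string of linker flags, such as the value of the libs variable in config.sh.

```shell
# Succeed if -lpthread appears before -lc in a list of linker flags.
# $1: the flags as one string, e.g. "-lnsl -lpthread -lc"
pthread_before_c() {
    seen_pthread=no
    for flag in $1; do
        case $flag in
            -lpthread) seen_pthread=yes ;;
            # first -lc decides: status of the test is the return status
            -lc) [ "$seen_pthread" = yes ]; return ;;
        esac
    done
    # -lc is not listed explicitly, so there is no order to get wrong
    return 0
}
```

On a built perl the flags can be obtained with eval "$(perl -V:libs)", after which pthread_before_c "$libs" reports whether the order is the one asked for above.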

HP-UX versions before 10.30 require a separate installation of a POSIX threads library package. Two examples are the HP DCE package, available on "HP-UX Hardware Extensions 3.0, Install and Core OS, Release 10.20, April 1999 (B3920-13941)", and the freely available PTH package, available on H.Merijn's site (http://mirrors.develooper.com/hpux/). The use of PTH will be unsupported in perl-5.12 and up and is rather buggy in 5.11.x.

If you are going to use the HP DCE package, the library used for threading is /usr/lib/libcma.sl, but there have been multiple updates of that library over time. Perl will build with the first version, but it will not pass the test suite. Older Oracle versions might be a compelling reason not to update that library, otherwise please find a newer version in one of the following patches: PHSS_19739, PHSS_20608, or PHSS_23672

reformatted output:

  d3:/usr/lib 106 > what libcma-*.1
  libcma-00000.1:
     HP DCE/9000 1.5               Module: libcma.sl (Export)
                                   Date: Apr 29 1996 22:11:24
  libcma-19739.1:
     HP DCE/9000 1.5 PHSS_19739-40 Module: libcma.sl (Export)
                                   Date: Sep  4 1999 01:59:07
  libcma-20608.1:
     HP DCE/9000 1.5 PHSS_20608    Module: libcma.1 (Export)
                                   Date: Dec  8 1999 18:41:23
  libcma-23672.1:
     HP DCE/9000 1.5 PHSS_23672    Module: libcma.1 (Export)
                                   Date: Apr  9 2001 10:01:06
  d3:/usr/lib 107 >

If you choose the PTH package, use swinstall to install pth in the default location (/opt/pth), and then make symbolic links to the libraries from /usr/lib:

  # cd /usr/lib
  # ln -s /opt/pth/lib/libpth* .

For building perl to support Oracle, it needs to be linked with libcl and libpthread. So even if your perl is an unthreaded build, these libraries might be required. See "Oracle on HP-UX" below.

64-bit Perl on HP-UX

Beginning with HP-UX 11.00, programs compiled under HP-UX can take advantage of the LP64 programming environment (LP64 means Longs and Pointers are 64 bits wide), in which scalar variables will be able to hold numbers larger than 2^32 with complete precision. Perl has proven to be consistent and reliable in 64bit mode since 5.8.1 on all HP-UX 11.xx.

As of the date of this document, Perl is fully 64-bit compliant on HP-UX 11.00 and up for both cc- and gcc builds. If you are about to build a 64-bit perl with GNU gcc, please read the gcc section carefully.

Should a user have the need for compiling Perl in the LP64 environment, use the -Duse64bitall flag to Configure. This will force Perl to be compiled in a pure LP64 environment (with the +DD64 flag for HP C-ANSI-C, with no additional options for GNU gcc 64-bit on PA-RISC, and with -mlp64 for GNU gcc on Itanium). (If you want to compile Perl using gcc, you will have to get a version of the compiler that supports 64-bit operations.)

You can also use the -Duse64bitint flag to Configure. Although there are some minor differences between compiling Perl with this flag versus the -Duse64bitall flag, they should not be noticeable from a Perl user's perspective. When configuring -Duse64bitint using a 64bit gcc on a pa-risc architecture, -Duse64bitint is silently promoted to -Duse64bitall.

In both cases, it is strongly recommended that you use these flags when you run Configure. If you do not do this, but later answer the questions about 64-bit numbers when Configure asks you, you may get a configuration that cannot be compiled, or that does not function as expected.
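
Whether an existing perl was built with one of these flags can be read back from its configuration. The helper below is a sketch; the name config_defined is made up, and it assumes the usual one-line output format of perl -V:name, for example use64bitall='define';

```shell
# Succeed if a "perl -V:name" output line reports the value 'define'.
# $1: the output line, e.g. use64bitall='define';
config_defined() {
    case $1 in
        *"='define';"*) return 0 ;;
        *)              return 1 ;;
    esac
}
```

It would be used as config_defined "$(perl -V:use64bitall)" && echo "pure LP64 build", and in the same way for use64bitint.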

Oracle on HP-UX

Using perl to connect to Oracle databases through DBI and DBD::Oracle has caused a lot of people many headaches. Read README.hpux in the DBD::Oracle distribution for much more information. The reason to mention it here is that Oracle requires a perl built with libcl and libpthread, the latter even when perl is built without threads. Building perl using all defaults, while still being able to build DBD::Oracle later on, can be achieved using

  Configure -A prepend:libswanted='cl pthread ' ...

Do not forget the space before the trailing quote.
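
The reason for the trailing space can be illustrated with plain string concatenation. This sketch assumes that prepend: joins the new value and the existing one with nothing in between; the function name prepend and the existing list "nsl nm" are made up for the example.

```shell
# Join a new value in front of an existing one, adding no separator.
# $1: value to prepend; $2: existing value of the variable
prepend() {
    printf '%s%s\n' "$1" "$2"
}

prepend 'cl pthread ' 'nsl nm'   # library names stay separate words
prepend 'cl pthread' 'nsl nm'    # "pthread" and "nsl" run together
```

Without the space, the last prepended name fuses with the first existing name into a single word that matches no library.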

Also note that this does not (yet) work with all configurations; it is known to fail with 64-bit versions of GCC.

GDBM and Threads on HP-UX

If you attempt to compile Perl with (POSIX) threads on an 11.X system and also link in the GDBM library, then Perl will immediately core dump when it starts up. The only workaround at this point is to relink the GDBM library under 11.X, then relink it into Perl.

The error might look something like:

  Pthread internal error: message: __libc_reinit() failed, file: ../pthreads/pthread.c, line: 1096
  Return Pointer is 0xc082bf33
  sh: 5345 Quit(coredump)

and Configure will give up.

NFS filesystems and utime(2) on HP-UX

If you are compiling Perl on a remotely-mounted NFS filesystem, the test io/fs.t may fail on test #18. This appears to be a bug in HP-UX and no fix is currently available.

HP-UX Kernel Parameters (maxdsiz) for Compiling Perl

By default, HP-UX comes configured with a maximum data segment size of 64MB. This is too small to correctly compile Perl with the maximum optimization levels. You can increase the size of the maxdsiz kernel parameter through the use of SAM.

When using the GUI version of SAM, click on the Kernel Configuration icon, then the Configurable Parameters icon. Scroll down and select the maxdsiz line. From the Actions menu, select the Modify Configurable Parameter item. Insert the new formula into the Formula/Value box. Then follow the instructions to rebuild your kernel and reboot your system.

In general, a value of 256MB (or "256*1024*1024") is sufficient for Perl to compile at maximum optimization.
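
The byte values behind these sizes are simple to compute. A minimal sketch, with a made-up function name:

```shell
# Convert a size in megabytes to bytes.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mb_to_bytes 64     # the default limit mentioned above
mb_to_bytes 256    # the suggested value, i.e. 256*1024*1024
```

The suggested 256MB comes to 268435456 bytes, four times the 67108864 bytes of the default.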

nss_delete core dump from op/pwent or op/grent

You may get a bus error core dump from the op/pwent or op/grent tests. If compiled with -g you will see a stack trace much like the following:

  #0  0xc004216c in  () from /usr/lib/libc.2
  #1  0xc00d7550 in __nss_src_state_destr () from /usr/lib/libc.2
  #2  0xc00d7768 in __nss_src_state_destr () from /usr/lib/libc.2
  #3  0xc00d78a8 in nss_delete () from /usr/lib/libc.2
  #4  0xc01126d8 in endpwent () from /usr/lib/libc.2
  #5  0xd1950 in Perl_pp_epwent () from ./perl
  #6  0x94d3c in Perl_runops_standard () from ./perl
  #7  0x23728 in S_run_body () from ./perl
  #8  0x23428 in perl_run () from ./perl
  #9  0x2005c in main () from ./perl

The key here is the nss_delete call. One workaround for this bug seems to be to add to the file /etc/nsswitch.conf (at least) the following lines:

  group: files
  passwd: files

Whether you are using NIS does not matter. Amazingly enough, the same bug also affects Solaris.
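
A quick check for the two lines can be sketched as follows. The function name nsswitch_ok is made up for this illustration, and it only looks for entries that name "files" as the first source.

```shell
# Succeed if the file has "group: files" and "passwd: files" entries.
# $1: path to the file to check (normally /etc/nsswitch.conf)
nsswitch_ok() {
    for db in group passwd; do
        grep -Eq "^[[:space:]]*$db:[[:space:]]*files([[:space:]]|\$)" "$1" || return 1
    done
}
```

Running nsswitch_ok /etc/nsswitch.conf || echo "workaround not in place" before the test suite shows whether the op/pwent and op/grent tests are likely to hit the core dump.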

error: pasting ")" and "l" does not give a valid preprocessing token

There seems to be a broken system header file in HP-UX 11.00 that breaks perl building in 32bit mode with GNU gcc-4.x causing this error. The same file for HP-UX 11.11 (even though the file is older) does not show this failure, and has the correct definition, so the best fix is to patch the header to match:

  --- /usr/include/inttypes.h  2001-04-20 18:42:14 +0200
  +++ /usr/include/inttypes.h  2000-11-14 09:00:00 +0200
  @@ -72,7 +72,7 @@
   #define UINT32_C(__c)                   __CONCAT_U__(__c)
   #else /* __LP64 */
   #define INT32_C(__c)                    __CONCAT__(__c,l)
  -#define UINT32_C(__c)                   __CONCAT__(__CONCAT_U__(__c),l)
  +#define UINT32_C(__c)                   __CONCAT__(__c,ul)
   #endif /* __LP64 */
   #define INT64_C(__c)                    __CONCAT_L__(__c,l)

Miscellaneous

HP-UX 11 Y2K patch "Y2K-1100 B.11.00.B0125 HP-UX Core OS Year 2000 Patch Bundle" has been reported to break the io/fs test #18 which tests whether utime() can change timestamps. The Y2K patch seems to break utime() so that over NFS the timestamps do not get changed (on local filesystems utime() still works). This has probably been fixed on your system by now.

AUTHOR

H.Merijn Brand <h.m.brand@xs4all.nl>
Jeff Okamoto <okamoto@corp.hp.com>

With much assistance regarding shared libraries from Marc Sabatella.

 

perlhurd

NAME

perlhurd - Perl version 5 on Hurd

DESCRIPTION

If you want to use Perl on the Hurd, I recommend using the Debian GNU/Hurd distribution ( see http://www.debian.org/ ), even if an official, stable release has not yet been made. The old "gnu-0.2" binary distribution will most certainly have additional problems.

Known Problems with Perl on Hurd

The Perl test suite may still report some errors on the Hurd. The "lib/anydbm" and "pragma/warnings" tests will almost certainly fail. Neither failure is really specific to the Hurd, as indicated by the test suite output.

The socket tests may fail if the network is not configured. You have to make "/hurd/pfinet" the translator for "/servers/socket/2", giving it the right arguments. Try "/hurd/pfinet --help" for more information.

Here are the statistics for Perl 5.005_62 on my system:

    Failed Test      Status Wstat Total Fail  Failed  List of failed
    -------------------------------------------------------------------------
    lib/anydbm.t                     12    1   8.33%  12
    pragma/warnings                 333    1   0.30%  215

    8 tests and 24 subtests skipped.
    Failed 2/229 test scripts, 99.13% okay. 2/10850 subtests failed, 99.98% okay.

There are quite a few systems out there that do worse!

However, since I am running a very recent Hurd snapshot, in which a lot of bugs that were exposed by the Perl test suite have been fixed, you may encounter more failures. Likely candidates are: "op/stat", "lib/io_pipe", "lib/io_sock", "lib/io_udp" and "lib/time".

In any case, if you're seeing failures beyond those mentioned in this document, please consider upgrading to the latest Hurd before reporting the failure as a bug.

AUTHOR

Mark Kettenis <kettenis@gnu.org>

Last Updated: Fri, 29 Oct 1999 22:50:30 +0200

 

perlintern

NAME

perlintern - autogenerated documentation of purely internal Perl functions

DESCRIPTION

This file is the autogenerated documentation of functions in the Perl interpreter that are documented using Perl's internal documentation format but are not marked as part of the Perl API. In other words, they are not for use in extensions!

Compile-time scope hooks

  • BhkENTRY

    Return an entry from the BHK structure. The which argument is a preprocessor token indicating which entry to return. If the appropriate flag is not set, this will return NULL. The type of the return value depends on which entry you ask for.

    NOTE: this function is experimental and may change or be removed without notice.

    1. void * BhkENTRY(BHK *hk, which)
  • BhkFLAGS

    Return the BHK's flags.

    NOTE: this function is experimental and may change or be removed without notice.

    1. U32 BhkFLAGS(BHK *hk)
  • CALL_BLOCK_HOOKS

    Call all the registered block hooks for type which. Here which is a preprocessing token; the type of arg depends on which.

    NOTE: this function is experimental and may change or be removed without notice.

    1. void CALL_BLOCK_HOOKS(which, arg)

CV reference counts and CvOUTSIDE

  • CvWEAKOUTSIDE

    Each CV has a pointer, CvOUTSIDE(), to its lexically enclosing CV (if any). Because pointers to anonymous sub prototypes are stored in & pad slots, it is possible to get a circular reference, with the parent pointing to the child and vice versa. To avoid the ensuing memory leak, we do not increment the reference count of the CV pointed to by CvOUTSIDE in the one specific instance that the parent has a & pad slot pointing back to us. In this case, we set the CvWEAKOUTSIDE flag in the child. This allows us to determine under what circumstances we should decrement the refcount of the parent when freeing the child.

    There is a further complication with non-closure anonymous subs (i.e. those that do not refer to any lexicals outside that sub). In this case, the anonymous prototype is shared rather than being cloned. This has the consequence that the parent may be freed while there are still active children, e.g.:

    1. BEGIN { $a = sub { eval '$x' } }

    In this case, the BEGIN is freed immediately after execution since there are no active references to it: the anon sub prototype has CvWEAKOUTSIDE set since it's not a closure, and $a points to the same CV, so it doesn't contribute to BEGIN's refcount either. When $a is executed, the eval '$x' causes the chain of CvOUTSIDEs to be followed, and the freed BEGIN is accessed.

    To avoid this, whenever a CV and its associated pad is freed, any & entries in the pad are explicitly removed from the pad, and if the refcount of the pointed-to anon sub is still positive, then that child's CvOUTSIDE is set to point to its grandparent. This will only occur in the single specific case of a non-closure anon prototype having one or more active references (such as $a above).

    One other thing to consider is that a CV may be merely undefined rather than freed, e.g. by undef &foo. In this case, its refcount may not have reached zero, but we still delete its pad and its CvROOT etc. Since various children may still have their CvOUTSIDE pointing at this undefined CV, we keep its own CvOUTSIDE for the time being, so that the chain of lexical scopes is unbroken. For example, the following should print 123:

    my $x = 123;
    sub tmp { sub { eval '$x' } }
    my $a = tmp();
    undef &tmp;
    print $a->();

    bool CvWEAKOUTSIDE(CV *cv)
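
The weak back-pointer scheme described for CvWEAKOUTSIDE can be illustrated with a small self-contained C model. This is only a sketch under stated assumptions: ToyCV, toy_new, toy_dec, toy_demo and toy_live are hypothetical names that do not exist in the Perl source, each toy CV has a single & pad slot, and the real interpreter handles many more cases. The model shows that a counted (strong) outside link keeps parent and child alive forever, while an uncounted (weak) link lets both be freed, and that a child which outlives its parent is re-pointed at its grandparent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a CV. These names are illustrative, not Perl's own. */
typedef struct ToyCV {
    int refcnt;
    struct ToyCV *outside;    /* enclosing CV, playing the role of CvOUTSIDE */
    bool weak_outside;        /* true: the outside link holds no reference */
    struct ToyCV *anon_slot;  /* one "&" pad slot owning a child prototype */
} ToyCV;

static int toy_live = 0;      /* ToyCVs currently allocated */

static ToyCV *toy_new(ToyCV *outside, bool weak) {
    ToyCV *cv = calloc(1, sizeof *cv);
    if (!cv) abort();
    cv->refcnt = 1;
    cv->outside = outside;
    cv->weak_outside = weak;
    if (outside && !weak) outside->refcnt++;   /* a strong link is counted */
    toy_live++;
    return cv;
}

static void toy_dec(ToyCV *cv) {
    if (!cv || --cv->refcnt > 0) return;
    ToyCV *child = cv->anon_slot;
    if (child) {
        if (child->refcnt > 1 && child->outside == cv) {
            /* The child outlives us: re-point it at its grandparent. */
            child->outside = cv->outside;
            child->weak_outside = false;
            if (cv->outside) cv->outside->refcnt++;
        }
        toy_dec(child);                        /* drop the pad's reference */
    }
    if (cv->outside && !cv->weak_outside) toy_dec(cv->outside);
    free(cv);
    toy_live--;
}

/* Returns how many ToyCVs are still allocated after the parent's only
 * external reference has been dropped. */
static int toy_demo(bool weak) {
    ToyCV *parent = toy_new(NULL, false);
    ToyCV *child = toy_new(parent, weak);
    parent->anon_slot = child;   /* the pad slot owns the child's reference */
    toy_dec(parent);
    int leaked = toy_live;
    if (leaked) {                /* strong case: break the cycle by hand */
        parent->anon_slot = NULL;
        toy_dec(child);          /* frees the child, which releases the parent */
    }
    return leaked;
}
```

Calling toy_demo(true) models the CvWEAKOUTSIDE case and leaves nothing allocated; toy_demo(false) models a counted back-pointer, where both objects survive until the cycle is broken manually.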

Embedding Functions

  • cv_dump

    dump the contents of a CV

    1. void cv_dump(CV *cv, const char *title)
  • cv_forget_slab

    When a CV has a reference count on its slab (CvSLABBED), it is responsible for making sure it is freed. (Hence, no two CVs should ever have a reference count on the same slab.) The CV only needs to reference the slab during compilation. Once it is compiled and CvROOT attached, it has finished its job, so it can forget the slab.

    1. void cv_forget_slab(CV *cv)
  • do_dump_pad

    Dump the contents of a padlist

    1. void do_dump_pad(I32 level, PerlIO *file,
    2. PADLIST *padlist, int full)
  • intro_my

    "Introduce" my variables to visible status. This is called during parsing at the end of each statement to make lexical variables visible to subsequent statements.

    1. U32 intro_my()
  • padlist_dup

    Duplicates a pad.

    1. PADLIST * padlist_dup(PADLIST *srcpad,
    2. CLONE_PARAMS *param)
  • pad_alloc_name

    Allocates a place in the currently-compiling pad (via pad_alloc in perlapi) and then stores a name for that entry. namesv is adopted and becomes the name entry; it must already contain the name string and be sufficiently upgraded. typestash and ourstash and the padadd_STATE flag get added to namesv. None of the other processing of pad_add_name_pvn in perlapi is done. Returns the offset of the allocated pad slot.

    1. PADOFFSET pad_alloc_name(SV *namesv, U32 flags,
    2. HV *typestash, HV *ourstash)
  • pad_block_start

    Update the pad compilation state variables on entry to a new block.

    1. void pad_block_start(int full)
  • pad_check_dup

    Check for duplicate declarations: report any of:

    * a my in the current scope with the same name;
    * an our (anywhere in the pad) with the same name and the
      same stash as ourstash

    is_our indicates that the name to check is an 'our' declaration.

    1. void pad_check_dup(SV *name, U32 flags,
    2. const HV *ourstash)
  • pad_findlex

    Find a named lexical anywhere in a chain of nested pads. Add fake entries in the inner pads if it's found in an outer one.

    Returns the offset in the bottom pad of the lex or the fake lex. cv is the CV in which to start the search, and seq is the current cop_seq to match against. If warn is true, print appropriate warnings. The out_* vars return values, and so are pointers to where the returned values should be stored. out_capture, if non-null, requests that the innermost instance of the lexical is captured; out_name_sv is set to the innermost matched namesv or fake namesv; out_flags returns the flags normally associated with the IVX field of a fake namesv.

    Note that pad_findlex() is recursive; it recurses up the chain of CVs, then comes back down, adding fake entries as it goes. It has to be this way because fake namesvs in anon prototypes have to store in xlow the index into the parent pad.

    1. PADOFFSET pad_findlex(const char *namepv,
    2. STRLEN namelen, U32 flags,
    3. const CV* cv, U32 seq, int warn,
    4. SV** out_capture,
    5. SV** out_name_sv, int *out_flags)
  • pad_fixup_inner_anons

    For any anon CVs in the pad, change CvOUTSIDE of that CV from old_cv to new_cv if necessary. Needed when a newly-compiled CV has to be moved to a pre-existing CV struct.

    1. void pad_fixup_inner_anons(PADLIST *padlist,
    2. CV *old_cv, CV *new_cv)
  • pad_free

    Free the SV at offset po in the current pad.

    1. void pad_free(PADOFFSET po)
  • pad_leavemy

    Cleanup at end of scope during compilation: set the max seq number for lexicals in this scope and warn of any lexicals that never got introduced.

    1. void pad_leavemy()
  • pad_push

    Push a new pad frame onto the padlist, unless there's already a pad at this depth, in which case don't bother creating a new one. Then give the new pad an @_ in slot zero.

    1. void pad_push(PADLIST *padlist, int depth)
  • pad_reset

    Mark all the current temporaries for reuse

    1. void pad_reset()
  • pad_swipe

    Abandon the tmp in the current pad at offset po and replace with a new one.

    1. void pad_swipe(PADOFFSET po, bool refadjust)
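
The recursive lookup described under pad_findlex (search outwards, then add fake entries on the way back in) can be sketched with a self-contained C model. Everything here is an assumption made for illustration: ToyPad, ToyEntry, toy_add, toy_findlex and toy_findlex_demo are hypothetical names, pads are fixed-size arrays, and sequence numbers, warnings and captures are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 8
#define TOY_NOT_FOUND (-1)

/* One name slot in a toy pad. Illustrative only, not Perl's layout. */
typedef struct ToyEntry {
    const char *name;
    int is_fake;        /* 1 if this entry only forwards to an outer pad */
    int parent_index;   /* for fake entries: index in the enclosing pad */
} ToyEntry;

typedef struct ToyPad {
    struct ToyPad *outer;          /* enclosing scope, or NULL */
    ToyEntry entries[TOY_MAX];
    int count;
} ToyPad;

static int toy_add(ToyPad *pad, const char *name, int is_fake, int parent_index) {
    assert(name != NULL);
    if (pad->count >= TOY_MAX) abort();
    pad->entries[pad->count].name = name;
    pad->entries[pad->count].is_fake = is_fake;
    pad->entries[pad->count].parent_index = parent_index;
    return pad->count++;
}

/* Recurse outwards until the name is found, then add a fake entry in each
 * inner pad on the way back, recording the index in the parent pad. */
static int toy_findlex(ToyPad *pad, const char *name) {
    if (!pad) return TOY_NOT_FOUND;
    for (int i = pad->count - 1; i >= 0; i--)
        if (strcmp(pad->entries[i].name, name) == 0) return i;
    int outer_index = toy_findlex(pad->outer, name);
    if (outer_index == TOY_NOT_FOUND) return TOY_NOT_FOUND;
    return toy_add(pad, name, 1, outer_index);
}

/* Three nested pads; "$x" lives only in the outermost one. Returns 1 if
 * the lookup behaves as described. */
static int toy_findlex_demo(void) {
    ToyPad outer = {0}, middle = {0}, inner = {0};
    middle.outer = &outer;
    inner.outer = &middle;
    toy_add(&outer, "$a", 0, -1);
    toy_add(&outer, "$b", 0, -1);
    toy_add(&outer, "$x", 0, -1);
    toy_add(&middle, "$m", 0, -1);
    int idx = toy_findlex(&inner, "$x");
    return idx == 0
        && inner.entries[0].is_fake && inner.entries[0].parent_index == 1
        && middle.entries[1].is_fake && middle.entries[1].parent_index == 2
        && toy_findlex(&inner, "$x") == 0 && inner.count == 1
        && toy_findlex(&inner, "$nope") == TOY_NOT_FOUND;
}
```

In the demo, the first lookup from the innermost pad creates one fake entry in each intermediate pad; the second lookup finds the fake entry directly and adds nothing.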

Functions in file op.c

  • core_prototype

    This function assigns the prototype of the named core function to sv , or to a new mortal SV if sv is NULL. It returns the modified sv , or NULL if the core function has no prototype. code is a code as returned by keyword() . It must not be equal to 0 or -KEY_CORE.

    1. SV * core_prototype(SV *sv, const char *name,
    2. const int code,
    3. int * const opnum)

Functions in file pp_ctl.c

  • docatch

    Check for the cases 0 or 3 of cur_env.je_ret, only used inside an eval context.

    0 is used as continue inside eval,

    3 is used for a die caught by an inner eval - continue inner loop

    See cop.h: je_mustcatch, when set at any runlevel to TRUE, means eval ops must establish a local jmpenv to handle exception traps.

    1. OP* docatch(OP *o)

GV Functions

  • gv_try_downgrade

    If the typeglob gv can be expressed more succinctly, by having something other than a real GV in its place in the stash, replace it with the optimised form. Basic requirements for this are that gv is a real typeglob, is sufficiently ordinary, and is only referenced from its package. This function is meant to be used when a GV has been looked up in part to see what was there, causing upgrading, but based on what was found it turns out that the real GV isn't required after all.

    If gv is a completely empty typeglob, it is deleted from the stash.

    If gv is a typeglob containing only a sufficiently-ordinary constant sub, the typeglob is replaced with a scalar-reference placeholder that more compactly represents the same thing.

    NOTE: this function is experimental and may change or be removed without notice.

    1. void gv_try_downgrade(GV* gv)

Hash Manipulation Functions

  • hv_ename_add

    Adds a name to a stash's internal list of effective names. See hv_ename_delete .

    This is called when a stash is assigned to a new location in the symbol table.

    1. void hv_ename_add(HV *hv, const char *name, U32 len,
    2. U32 flags)
  • hv_ename_delete

    Removes a name from a stash's internal list of effective names. If this is the name returned by HvENAME , then another name in the list will take its place (HvENAME will use it).

    This is called when a stash is deleted from the symbol table.

    1. void hv_ename_delete(HV *hv, const char *name,
    2. U32 len, U32 flags)
  • refcounted_he_chain_2hv

    Generates and returns a HV * representing the content of a refcounted_he chain. flags is currently unused and must be zero.

    1. HV * refcounted_he_chain_2hv(
    2. const struct refcounted_he *c, U32 flags
    3. )
  • refcounted_he_fetch_pv

    Like refcounted_he_fetch_pvn, but takes a nul-terminated string instead of a string/length pair.

    1. SV * refcounted_he_fetch_pv(
    2. const struct refcounted_he *chain,
    3. const char *key, U32 hash, U32 flags
    4. )
  • refcounted_he_fetch_pvn

    Search along a refcounted_he chain for an entry with the key specified by keypv and keylen. If flags has the REFCOUNTED_HE_KEY_UTF8 bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. hash is a precomputed hash of the key string, or zero if it has not been precomputed. Returns a mortal scalar representing the value associated with the key, or &PL_sv_placeholder if there is no value associated with the key.

    1. SV * refcounted_he_fetch_pvn(
    2. const struct refcounted_he *chain,
    3. const char *keypv, STRLEN keylen, U32 hash,
    4. U32 flags
    5. )
  • refcounted_he_fetch_pvs

    Like refcounted_he_fetch_pvn, but takes a literal string instead of a string/length pair, and no precomputed hash.

    1. SV * refcounted_he_fetch_pvs(
    2. const struct refcounted_he *chain,
    3. const char *key, U32 flags
    4. )
  • refcounted_he_fetch_sv

    Like refcounted_he_fetch_pvn, but takes a Perl scalar instead of a string/length pair.

    1. SV * refcounted_he_fetch_sv(
    2. const struct refcounted_he *chain, SV *key,
    3. U32 hash, U32 flags
    4. )
  • refcounted_he_free

    Decrements the reference count of a refcounted_he by one. If the reference count reaches zero the structure's memory is freed, which (recursively) causes a reduction of its parent refcounted_he 's reference count. It is safe to pass a null pointer to this function: no action occurs in this case.

    1. void refcounted_he_free(struct refcounted_he *he)
  • refcounted_he_inc

    Increment the reference count of a refcounted_he . The pointer to the refcounted_he is also returned. It is safe to pass a null pointer to this function: no action occurs and a null pointer is returned.

    1. struct refcounted_he * refcounted_he_inc(
    2. struct refcounted_he *he
    3. )
  • refcounted_he_new_pv

    Like refcounted_he_new_pvn, but takes a nul-terminated string instead of a string/length pair.

    1. struct refcounted_he * refcounted_he_new_pv(
    2. struct refcounted_he *parent,
    3. const char *key, U32 hash,
    4. SV *value, U32 flags
    5. )
  • refcounted_he_new_pvn

    Creates a new refcounted_he . This consists of a single key/value pair and a reference to an existing refcounted_he chain (which may be empty), and thus forms a longer chain. When using the longer chain, the new key/value pair takes precedence over any entry for the same key further along the chain.

    The new key is specified by keypv and keylen. If flags has the REFCOUNTED_HE_KEY_UTF8 bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. hash is a precomputed hash of the key string, or zero if it has not been precomputed.

    value is the scalar value to store for this key. value is copied by this function, which thus does not take ownership of any reference to it, and later changes to the scalar will not be reflected in the value visible in the refcounted_he . Complex types of scalar will not be stored with referential integrity, but will be coerced to strings. value may be either null or &PL_sv_placeholder to indicate that no value is to be associated with the key; this, as with any non-null value, takes precedence over the existence of a value for the key further along the chain.

    parent points to the rest of the refcounted_he chain to be attached to the new refcounted_he . This function takes ownership of one reference to parent, and returns one reference to the new refcounted_he .

    1. struct refcounted_he * refcounted_he_new_pvn(
    2. struct refcounted_he *parent,
    3. const char *keypv,
    4. STRLEN keylen, U32 hash,
    5. SV *value, U32 flags
    6. )
  • refcounted_he_new_pvs

    Like refcounted_he_new_pvn, but takes a literal string instead of a string/length pair, and no precomputed hash.

    1. struct refcounted_he * refcounted_he_new_pvs(
    2. struct refcounted_he *parent,
    3. const char *key, SV *value,
    4. U32 flags
    5. )
  • refcounted_he_new_sv

    Like refcounted_he_new_pvn, but takes a Perl scalar instead of a string/length pair.

    1. struct refcounted_he * refcounted_he_new_sv(
    2. struct refcounted_he *parent,
    3. SV *key, U32 hash, SV *value,
    4. U32 flags
    5. )
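
The ownership rules described for the refcounted_he functions can be illustrated with a self-contained C model of a shared, reference-counted chain. This is a sketch under stated assumptions, not the interpreter's code: ToyHe, toy_he_new, toy_he_inc, toy_he_free, toy_he_fetch and toy_he_demo are hypothetical names, keys and values are plain C strings, hashes and UTF-8 flags are ignored, and a NULL value stands in for the placeholder.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One link in a toy chain. Illustrative only, not struct refcounted_he. */
typedef struct ToyHe {
    struct ToyHe *parent;   /* rest of the chain, possibly shared */
    int refcnt;
    char *key;
    char *value;            /* NULL models "no value for this key" */
} ToyHe;

static int toy_he_live = 0; /* links currently allocated */

static char *toy_strdup(const char *s) {
    char *copy = malloc(strlen(s) + 1);
    if (!copy) abort();
    strcpy(copy, s);
    return copy;
}

/* Takes ownership of one reference to parent and returns one reference
 * to the new link. The value is copied, so later changes are not seen. */
static ToyHe *toy_he_new(ToyHe *parent, const char *key, const char *value) {
    ToyHe *he = malloc(sizeof *he);
    if (!he) abort();
    he->parent = parent;
    he->refcnt = 1;
    he->key = toy_strdup(key);
    he->value = value ? toy_strdup(value) : NULL;
    toy_he_live++;
    return he;
}

static ToyHe *toy_he_inc(ToyHe *he) {
    if (he) he->refcnt++;
    return he;
}

/* Drops one reference; freeing a link releases its parent in turn. */
static void toy_he_free(ToyHe *he) {
    while (he && --he->refcnt == 0) {
        ToyHe *parent = he->parent;
        free(he->key);
        free(he->value);
        free(he);
        toy_he_live--;
        he = parent;
    }
}

/* The first match along the chain wins, so newer entries take precedence. */
static const char *toy_he_fetch(const ToyHe *chain, const char *key) {
    for (; chain; chain = chain->parent)
        if (strcmp(chain->key, key) == 0) return chain->value;
    return NULL;
}

/* Two chains sharing one tail. Returns 1 if lookups and cleanup behave. */
static int toy_he_demo(void) {
    ToyHe *base = toy_he_new(NULL, "strict", "1");
    ToyHe *a = toy_he_new(toy_he_inc(base), "strict", "0");
    ToyHe *b = toy_he_new(base, "utf8", "1");  /* consumes our ref to base */
    int ok = strcmp(toy_he_fetch(a, "strict"), "0") == 0
          && strcmp(toy_he_fetch(b, "strict"), "1") == 0
          && toy_he_fetch(a, "utf8") == NULL;
    toy_he_free(a);
    toy_he_free(b);
    return ok && toy_he_live == 0;
}
```

Unlike the functions documented above, this model returns NULL both for a missing key and for a placeholder entry; the point is only to show how a newer link shadows older ones and how the shared tail is freed once its last owner goes away.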

IO Functions

  • start_glob

    Function called by do_readline to spawn a glob (or do the glob inside perl on VMS). This code used to be inline, but now that perl uses File::Glob, this glob starter is only used by miniperl during the build process. Moving it away shrinks pp_hot.c; shrinking pp_hot.c helps speed perl up.

    NOTE: this function is experimental and may change or be removed without notice.

    1. PerlIO* start_glob(SV *tmpglob, IO *io)

Magical Functions

  • magic_clearhint

    Triggered by a delete from %^H, records the key to PL_compiling.cop_hints_hash .

    1. int magic_clearhint(SV* sv, MAGIC* mg)
  • magic_clearhints

    Triggered by clearing %^H, resets PL_compiling.cop_hints_hash .

    1. int magic_clearhints(SV* sv, MAGIC* mg)
  • magic_methcall

    Invoke a magic method (like FETCH).

    sv and mg are the tied thingy and the tie magic.

    meth is the name of the method to call.

    argc is the number of args (in addition to $self) to pass to the method.

    The flags can be:

    G_DISCARD     invoke method with G_DISCARD flag and don't
                  return a value
    G_UNDEF_FILL  fill the stack with argc pointers to
                  PL_sv_undef

    The arguments themselves are any values following the flags argument.

    Returns the SV (if any) returned by the method, or NULL on failure.

    1. SV* magic_methcall(SV *sv, const MAGIC *mg,
    2. const char *meth, U32 flags,
    3. U32 argc, ...)
  • magic_sethint

    Triggered by a store to %^H, records the key/value pair to PL_compiling.cop_hints_hash . It is assumed that hints aren't storing anything that would need a deep copy. Maybe we should warn if we find a reference.

    1. int magic_sethint(SV* sv, MAGIC* mg)
  • mg_localize

    Copy some of the magic from an existing SV to a new localized version of that SV. Container magic (e.g. %ENV, $1, tie) gets copied; value magic (e.g. taint, pos) doesn't.

    If setmagic is false then no set magic will be called on the new (empty) SV. This typically means that assignment will soon follow (e.g. 'local $x = $y'), and that will handle the magic.

    1. void mg_localize(SV* sv, SV* nsv, bool setmagic)

MRO Functions

  • mro_get_linear_isa_dfs

    Returns the Depth-First Search linearization of @ISA for the given stash. The return value is a read-only AV*. level should be 0 (it is used internally in this function's recursion).

    You are responsible for SvREFCNT_inc() on the return value if you plan to store it anywhere semi-permanently (otherwise it might be deleted out from under you the next time the cache is invalidated).

    1. AV* mro_get_linear_isa_dfs(HV* stash, U32 level)
  • mro_isa_changed_in

    Takes the necessary steps (cache invalidations, mostly) when the @ISA of the given package has changed. It is invoked by the setisa magic; you should not need to invoke it directly.

    1. void mro_isa_changed_in(HV* stash)
  • mro_package_moved

    Call this function to signal to a stash that it has been assigned to another spot in the stash hierarchy. stash is the stash that has been assigned. oldstash is the stash it replaces, if any. gv is the glob that is actually being assigned to.

    This can also be called with a null first argument to indicate that oldstash has been deleted.

    This function invalidates isa caches on the old stash, on all subpackages nested inside it, and on the subclasses of all those, including non-existent packages that have corresponding entries in stash .

    It also sets the effective names (HvENAME ) on all the stashes as appropriate.

    If the gv is present and is not in the symbol table, then this function simply returns. This check will be skipped if flags & 1.

    1. void mro_package_moved(HV * const stash,
    2. HV * const oldstash,
    3. const GV * const gv,
    4. U32 flags)
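
The depth-first linearization that mro_get_linear_isa_dfs is described as returning can be sketched in self-contained C. This is a model under stated assumptions rather than the interpreter's implementation: ToyClass, toy_seen, toy_mro_dfs and toy_mro_demo are hypothetical names, classes are plain structs with a fixed-size parent list, and there is no caching or error reporting for recursive inheritance (the model simply skips classes it has already visited).

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_ISA 4
#define TOY_MAX_MRO 16

/* A toy class with an ordered parent list, standing in for @ISA. */
typedef struct ToyClass {
    const char *name;
    const struct ToyClass *isa[TOY_MAX_ISA];
    int isa_count;
} ToyClass;

static int toy_seen(const ToyClass **out, int n, const ToyClass *c) {
    for (int i = 0; i < n; i++)
        if (out[i] == c) return 1;
    return 0;
}

/* Appends c and then each parent's linearization, left to right, keeping
 * only the first occurrence of every class. Returns the new length. */
static int toy_mro_dfs(const ToyClass *c, const ToyClass **out, int n) {
    if (toy_seen(out, n, c) || n >= TOY_MAX_MRO) return n;
    out[n++] = c;
    for (int i = 0; i < c->isa_count; i++)
        n = toy_mro_dfs(c->isa[i], out, n);
    return n;
}

/* Diamond hierarchy: D inherits from B and C, which both inherit from A.
 * Writes the space-separated linearization of D into buf. */
static void toy_mro_demo(char *buf, size_t buflen) {
    const ToyClass a = { "A", { NULL }, 0 };
    const ToyClass b = { "B", { &a }, 1 };
    const ToyClass c = { "C", { &a }, 1 };
    const ToyClass d = { "D", { &b, &c }, 2 };
    const ToyClass *mro[TOY_MAX_MRO];
    int n = toy_mro_dfs(&d, mro, 0);
    buf[0] = '\0';
    for (int i = 0; i < n; i++) {
        if (i) strncat(buf, " ", buflen - strlen(buf) - 1);
        strncat(buf, mro[i]->name, buflen - strlen(buf) - 1);
    }
}
```

For the diamond in toy_mro_demo the result is D, B, A, C: the shared ancestor A appears as soon as the first branch reaches it, which is the characteristic difference between depth-first and C3 ordering.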

Optree Manipulation Functions

  • finalize_optree

    This function finalizes the optree. Should be called directly after the complete optree is built. It does some additional checking which can't be done in the normal ck_xxx functions and makes the tree thread-safe.

    1. void finalize_optree(OP* o)

Pad Data Structures

  • CX_CURPAD_SAVE

    Save the current pad in the given context block structure.

    1. void CX_CURPAD_SAVE(struct context)
  • CX_CURPAD_SV

    Access the SV at offset po in the saved current pad in the given context block structure (can be used as an lvalue).

    1. SV * CX_CURPAD_SV(struct context, PADOFFSET po)
  • PadnameIsOUR

    Whether this is an "our" variable.

    1. bool PadnameIsOUR(PADNAME pn)
  • PadnameIsSTATE

    Whether this is a "state" variable.

    1. bool PadnameIsSTATE(PADNAME pn)
  • PadnameOURSTASH

    The stash in which this "our" variable was declared.

    1. HV * PadnameOURSTASH()
  • PadnameOUTER

    Whether this entry belongs to an outer pad.

    1. bool PadnameOUTER(PADNAME pn)
  • PadnameTYPE

    The stash associated with a typed lexical. This returns the %Foo:: hash for my Foo $bar .

    1. HV * PadnameTYPE(PADNAME pn)
  • PAD_BASE_SV

    Get the value from slot po in the base (DEPTH=1) pad of a padlist

    1. SV * PAD_BASE_SV(PADLIST padlist, PADOFFSET po)
  • PAD_CLONE_VARS

    Clone the state variables associated with running and compiling pads.

    1. void PAD_CLONE_VARS(PerlInterpreter *proto_perl,
    2. CLONE_PARAMS* param)
  • PAD_COMPNAME_FLAGS

    Return the flags for the current compiling pad name at offset po . Assumes a valid slot entry.

    1. U32 PAD_COMPNAME_FLAGS(PADOFFSET po)
  • PAD_COMPNAME_GEN

    The generation number of the name at offset po in the current compiling pad (lvalue). Note that SvUVX is hijacked for this purpose.

    1. STRLEN PAD_COMPNAME_GEN(PADOFFSET po)
  • PAD_COMPNAME_GEN_set

    Sets the generation number of the name at offset po in the current compiling pad (lvalue) to gen. Note that SvUV_set is hijacked for this purpose.

    1. STRLEN PAD_COMPNAME_GEN_set(PADOFFSET po, int gen)
  • PAD_COMPNAME_OURSTASH

    Return the stash associated with an our variable. Assumes the slot entry is a valid our lexical.

    1. HV * PAD_COMPNAME_OURSTASH(PADOFFSET po)
  • PAD_COMPNAME_PV

    Return the name of the current compiling pad name at offset po . Assumes a valid slot entry.

    1. char * PAD_COMPNAME_PV(PADOFFSET po)
  • PAD_COMPNAME_TYPE

    Return the type (stash) of the current compiling pad name at offset po . Must be a valid name. Returns null if not typed.

    1. HV * PAD_COMPNAME_TYPE(PADOFFSET po)
  • pad_peg

    When PERL_MAD is enabled, this is a small no-op function that gets called at the start of each pad-related function. It can be breakpointed to track all pad operations. The parameter is a string indicating the type of pad operation being performed.

    NOTE: this function is experimental and may change or be removed without notice.

    1. void pad_peg(const char *s)
  • PAD_RESTORE_LOCAL

    Restore the old pad saved into the local variable opad by PAD_SAVE_LOCAL()

    1. void PAD_RESTORE_LOCAL(PAD *opad)
  • PAD_SAVE_LOCAL

    Save the current pad to the local variable opad, then make the current pad equal to npad

    1. void PAD_SAVE_LOCAL(PAD *opad, PAD *npad)
  • PAD_SAVE_SETNULLPAD

    Save the current pad then set it to null.

    1. void PAD_SAVE_SETNULLPAD()
  • PAD_SETSV

    Set the slot at offset po in the current pad to sv

    1. SV * PAD_SETSV(PADOFFSET po, SV* sv)
  • PAD_SET_CUR

    Set the current pad to be pad n in the padlist, saving the previous current pad. NB currently this macro expands to a string too long for some compilers, so it's best to replace it with

    SAVECOMPPAD();
    PAD_SET_CUR_NOSAVE(padlist,n);

    void PAD_SET_CUR(PADLIST padlist, I32 n)
  • PAD_SET_CUR_NOSAVE

    like PAD_SET_CUR, but without the save

    1. void PAD_SET_CUR_NOSAVE(PADLIST padlist, I32 n)
  • PAD_SV

    Get the value at offset po in the current pad

    1. void PAD_SV(PADOFFSET po)
  • PAD_SVl

    Lightweight and lvalue version of PAD_SV . Get or set the value at offset po in the current pad. Unlike PAD_SV , does not print diagnostics with -DX. For internal use only.

    1. SV * PAD_SVl(PADOFFSET po)
  • SAVECLEARSV

    Clear the pointed to pad value on scope exit. (i.e. the runtime action of 'my')

    1. void SAVECLEARSV(SV **svp)
  • SAVECOMPPAD

    save PL_comppad and PL_curpad

    1. void SAVECOMPPAD()
  • SAVEPADSV

    Save a pad slot (used to restore after an iteration)

    XXX DAPM it would make more sense to make the arg a PADOFFSET

    void SAVEPADSV(PADOFFSET po)

Per-Interpreter Variables

  • PL_DBsingle

    When Perl is run in debugging mode, with the -d switch, this SV is a boolean which indicates whether subs are being single-stepped. Single-stepping is automatically turned on after every step. This is the C variable which corresponds to Perl's $DB::single variable. See PL_DBsub .

    1. SV * PL_DBsingle
  • PL_DBsub

    When Perl is run in debugging mode, with the -d switch, this GV contains the SV which holds the name of the sub being debugged. This is the C variable which corresponds to Perl's $DB::sub variable. See PL_DBsingle .

    1. GV * PL_DBsub
  • PL_DBtrace

    Trace variable used when Perl is run in debugging mode, with the -d switch. This is the C variable which corresponds to Perl's $DB::trace variable. See PL_DBsingle .

    1. SV * PL_DBtrace
  • PL_dowarn

    The C variable which corresponds to Perl's $^W warning variable.

    1. bool PL_dowarn
  • PL_last_in_gv

    The GV which was last used for a filehandle input operation. (<FH> )

    1. GV* PL_last_in_gv
  • PL_ofsgv

    The glob containing the output field separator, i.e. *, in Perl space (the typeglob of the $, variable).

    1. GV* PL_ofsgv
  • PL_rs

    The input record separator - $/ in Perl space.

    1. SV* PL_rs

Stack Manipulation Macros

  • djSP

    Declare Just SP . This is actually identical to dSP , and declares a local copy of perl's stack pointer, available via the SP macro. See SP . (Available for backward source code compatibility with the old (Perl 5.005) thread model.)

    1. djSP;
  • LVRET

    True if this op will be the return value of an lvalue subroutine

SV Manipulation Functions

  • SvTHINKFIRST

    A quick flag check to see whether an sv should be passed to sv_force_normal to be "downgraded" before SvIVX or SvPVX can be modified directly.

    For example, if your scalar is a reference and you want to modify the SvIVX slot, you can't just do SvROK_off, as that will leak the referent.

    This is used internally by various sv-modifying functions, such as sv_setsv, sv_setiv and sv_pvn_force.

    One case that this does not handle is a gv without SvFAKE set. After

    1. if (SvTHINKFIRST(gv)) sv_force_normal(gv);

    it will still be a gv.

    SvTHINKFIRST sometimes produces false positives. In those cases sv_force_normal does nothing.

    1. U32 SvTHINKFIRST(SV *sv)
  • sv_add_arena

    Given a chunk of memory, link it to the head of the list of arenas, and split it into a list of free SVs.

    1. void sv_add_arena(char *const ptr, const U32 size,
    2. const U32 flags)
  • sv_clean_all

    Decrement the refcnt of each remaining SV, possibly triggering a cleanup. This function may have to be called multiple times to free SVs which are in complex self-referential hierarchies.

    1. I32 sv_clean_all()
  • sv_clean_objs

    Attempt to destroy all objects not yet freed.

    1. void sv_clean_objs()
  • sv_free_arenas

    Deallocate the memory used by all arenas. Note that all the individual SV heads and bodies within the arenas must already have been freed.

    1. void sv_free_arenas()
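
The idea behind sv_add_arena (take a chunk of memory, link it into the list of arenas, and split the rest into a free list) can be sketched with a self-contained C model. This is an illustration under stated assumptions, not the interpreter's allocator: ToyHead, toy_add_arena, toy_new_head, toy_del_head, toy_free_count and toy_arena_demo are hypothetical names, the first slot of each chunk is assumed to serve as the arena header, and the flags argument is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* A toy SV head. While a head is free, next links it into the free list;
 * in slot 0 of an arena, next links to the previous arena instead. */
typedef struct ToyHead {
    struct ToyHead *next;
    size_t count;            /* slot 0 only: number of heads in the arena */
} ToyHead;

static ToyHead *toy_arena_root = NULL;  /* head of the list of arenas */
static ToyHead *toy_free_root = NULL;   /* head of the list of free heads */

/* ptr must be suitably aligned for ToyHead; size is in bytes. */
static void toy_add_arena(void *ptr, size_t size) {
    ToyHead *heads = ptr;
    size_t n = size / sizeof(ToyHead);
    if (n < 2) return;                   /* no room for header plus one head */
    heads[0].next = toy_arena_root;      /* link the arena at the list head */
    heads[0].count = n;
    toy_arena_root = heads;
    for (size_t i = 1; i + 1 < n; i++)   /* chain the remaining slots */
        heads[i].next = &heads[i + 1];
    heads[n - 1].next = toy_free_root;
    toy_free_root = &heads[1];
}

static ToyHead *toy_new_head(void) {     /* pop from the free list */
    ToyHead *h = toy_free_root;
    if (h) toy_free_root = h->next;
    return h;
}

static void toy_del_head(ToyHead *h) {   /* push back onto the free list */
    h->next = toy_free_root;
    toy_free_root = h;
}

static size_t toy_free_count(void) {
    size_t n = 0;
    for (ToyHead *h = toy_free_root; h; h = h->next) n++;
    return n;
}

/* Adds one eight-slot arena and reports the free heads. Call it only
 * once, since it always hands over the same static storage. */
static size_t toy_arena_demo(void) {
    static ToyHead storage[8];
    toy_add_arena(storage, sizeof storage);
    return toy_free_count();
}
```

With eight slots, one is consumed as the arena header and seven end up on the free list; allocating and releasing a head then only touches the free-list pointer.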

SV-Body Allocation

  • sv_2num

    Return an SV with the numeric value of the source SV, doing any necessary reference or overload conversion. You must use the SvNUM(sv) macro to access this function.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV* sv_2num(SV *const sv)
  • sv_copypv

    Copies a stringified representation of the source SV into the destination SV. Automatically performs any necessary mg_get and coercion of numeric values into strings. Guaranteed to preserve UTF8 flag even from overloaded objects. Similar in nature to sv_2pv[_flags] but operates directly on an SV instead of just the string. Mostly uses sv_2pv_flags to do its work, except when that would lose the UTF-8'ness of the PV.

    1. void sv_copypv(SV *const dsv, SV *const ssv)
  • sv_ref

    Returns a SV describing what the SV passed in is a reference to.

    1. SV* sv_ref(SV *dst, const SV *const sv,
    2. const int ob)

Unicode Support

  • find_uninit_var

    Find the name of the undefined variable (if any) that caused the operator to issue a "Use of uninitialized value" warning. If match is true, only return a name if its value matches uninit_sv. So roughly speaking, if a unary operator (such as OP_COS) generates a warning, then following the direct child of the op may yield an OP_PADSV or OP_GV that gives the name of the undefined variable. On the other hand, with OP_ADD there are two branches to follow, so we only print the variable name if we get an exact match.

    The name is returned as a mortal SV.

    Assumes that PL_op is the op that originally triggered the error, and that PL_comppad/PL_curpad points to the currently executing pad.

    NOTE: this function is experimental and may change or be removed without notice.

    1. SV* find_uninit_var(const OP *const obase,
    2. const SV *const uninit_sv,
    3. bool top)
  • report_uninit

    Print appropriate "Use of uninitialized variable" warning.

    1. void report_uninit(const SV *uninit_sv)

Undocumented functions

The following functions are currently undocumented. If you use one of them, you may wish to consider creating and submitting documentation for it.

  • Perl_croak_memory_wrap
  • Slab_Alloc
  • Slab_Free
  • Slab_to_ro
  • Slab_to_rw
  • _add_range_to_invlist
  • _core_swash_init
  • _get_invlist_len_addr
  • _get_swash_invlist
  • _invlist_array_init
  • _invlist_contains_cp
  • _invlist_contents
  • _invlist_intersection
  • _invlist_intersection_maybe_complement_2nd
  • _invlist_invert
  • _invlist_invert_prop
  • _invlist_len
  • _invlist_populate_swatch
  • _invlist_search
  • _invlist_subtract
  • _invlist_union
  • _invlist_union_maybe_complement_2nd
  • _new_invlist
  • _swash_inversion_hash
  • _swash_to_invlist
  • _to_fold_latin1
  • _to_upper_title_latin1
  • aassign_common_vars
  • add_cp_to_invlist
  • addmad
  • alloc_maybe_populate_EXACT
  • allocmy
  • amagic_is_enabled
  • append_madprops
  • apply
  • av_extend_guts
  • av_reify
  • bind_match
  • block_end
  • block_start
  • boot_core_PerlIO
  • boot_core_UNIVERSAL
  • boot_core_mro
  • cando
  • check_utf8_print
  • ck_entersub_args_core
  • compute_EXACTish
  • convert
  • coresub_op
  • create_eval_scope
  • croak_no_mem
  • croak_popstack
  • current_re_engine
  • cv_ckproto_len_flags
  • cv_clone_into
  • cvgv_set
  • cvstash_set
  • deb_stack_all
  • delete_eval_scope
  • die_unwind
  • do_aexec
  • do_aexec5
  • do_eof
  • do_exec
  • do_exec3
  • do_execfree
  • do_ipcctl
  • do_ipcget
  • do_msgrcv
  • do_msgsnd
  • do_ncmp
  • do_op_xmldump
  • do_pmop_xmldump
  • do_print
  • do_readline
  • do_seek
  • do_semop
  • do_shmio
  • do_sysseek
  • do_tell
  • do_trans
  • do_vecget
  • do_vecset
  • do_vop
  • dofile
  • dump_all_perl
  • dump_packsubs_perl
  • dump_sub_perl
  • dump_sv_child
  • emulate_cop_io
  • feature_is_enabled
  • find_lexical_cv
  • find_runcv_where
  • find_rundefsv2
  • find_script
  • free_tied_hv_pool
  • get_and_check_backslash_N_name
  • get_db_sub
  • get_debug_opts
  • get_hash_seed
  • get_invlist_iter_addr
  • get_invlist_previous_index_addr
  • get_invlist_version_id_addr
  • get_invlist_zero_addr
  • get_no_modify
  • get_opargs
  • get_re_arg
  • getenv_len
  • grok_bslash_x
  • hfree_next_entry
  • hv_backreferences_p
  • hv_kill_backrefs
  • hv_undef_flags
  • init_argv_symbols
  • init_constants
  • init_dbargs
  • init_debugger
  • invert
  • invlist_array
  • invlist_clone
  • invlist_highest
  • invlist_is_iterating
  • invlist_iterfinish
  • invlist_iterinit
  • invlist_max
  • invlist_previous_index
  • invlist_set_len
  • invlist_set_previous_index
  • invlist_trim
  • io_close
  • isALNUM_lazy
  • isIDFIRST_lazy
  • is_utf8_char_slow
  • is_utf8_common
  • jmaybe
  • keyword
  • keyword_plugin_standard
  • list
  • localize
  • mad_free
  • madlex
  • madparse
  • magic_clear_all_env
  • magic_cleararylen_p
  • magic_clearenv
  • magic_clearisa
  • magic_clearpack
  • magic_clearsig
  • magic_copycallchecker
  • magic_existspack
  • magic_freearylen_p
  • magic_freeovrld
  • magic_get
  • magic_getarylen
  • magic_getdefelem
  • magic_getnkeys
  • magic_getpack
  • magic_getpos
  • magic_getsig
  • magic_getsubstr
  • magic_gettaint
  • magic_getuvar
  • magic_getvec
  • magic_killbackrefs
  • magic_nextpack
  • magic_regdata_cnt
  • magic_regdatum_get
  • magic_regdatum_set
  • magic_scalarpack
  • magic_set
  • magic_set_all_env
  • magic_setarylen
  • magic_setcollxfrm
  • magic_setdbline
  • magic_setdefelem
  • magic_setenv
  • magic_setisa
  • magic_setmglob
  • magic_setnkeys
  • magic_setpack
  • magic_setpos
  • magic_setregexp
  • magic_setsig
  • magic_setsubstr
  • magic_settaint
  • magic_setutf8
  • magic_setuvar
  • magic_setvec
  • magic_sizepack
  • magic_wipepack
  • malloc_good_size
  • malloced_size
  • mem_collxfrm
  • mode_from_discipline
  • more_bodies
  • mro_meta_dup
  • mro_meta_init
  • my_attrs
  • my_betoh16
  • my_betoh32
  • my_betoh64
  • my_betohi
  • my_betohl
  • my_betohs
  • my_clearenv
  • my_htobe16
  • my_htobe32
  • my_htobe64
  • my_htobei
  • my_htobel
  • my_htobes
  • my_htole16
  • my_htole32
  • my_htole64
  • my_htolei
  • my_htolel
  • my_htoles
  • my_letoh16
  • my_letoh32
  • my_letoh64
  • my_letohi
  • my_letohl
  • my_letohs
  • my_lstat_flags
  • my_stat_flags
  • my_swabn
  • my_unexec
  • newATTRSUB_flags
  • newGP
  • newMADPROP
  • newMADsv
  • newSTUB
  • newTOKEN
  • newXS_len_flags
  • new_warnings_bitfield
  • nextargv
  • oopsAV
  • oopsHV
  • op_clear
  • op_const_sv
  • op_getmad
  • op_getmad_weak
  • op_integerize
  • op_lvalue_flags
  • op_refcnt_dec
  • op_refcnt_inc
  • op_std_init
  • op_unscope
  • op_xmldump
  • opslab_force_free
  • opslab_free
  • opslab_free_nopad
  • package
  • package_version
  • padlist_store
  • parse_unicode_opts
  • parser_free
  • parser_free_nexttoke_ops
  • peep
  • pmop_xmldump
  • pmruntime
  • populate_isa
  • prepend_madprops
  • qerror
  • re_op_compile
  • reg_named_buff
  • reg_named_buff_iter
  • reg_numbered_buff_fetch
  • reg_numbered_buff_length
  • reg_numbered_buff_store
  • reg_qr_package
  • reg_temp_copy
  • regcurly
  • regpposixcc
  • regprop
  • report_evil_fh
  • report_redefined_cv
  • report_wrongway_fh
  • rpeep
  • rsignal_restore
  • rsignal_save
  • rxres_save
  • same_dirent
  • sawparens
  • scalar
  • scalarvoid
  • sighandler
  • softref2xv
  • sub_crush_depth
  • sv_add_backref
  • sv_catxmlpv
  • sv_catxmlpvn
  • sv_catxmlsv
  • sv_del_backref
  • sv_free2
  • sv_kill_backrefs
  • sv_len_utf8_nomg
  • sv_mortalcopy_flags
  • sv_resetpvn
  • sv_sethek
  • sv_setsv_cow
  • sv_unglob
  • sv_xmlpeek
  • tied_method
  • token_free
  • token_getmad
  • translate_substr_offsets
  • try_amagic_bin
  • try_amagic_un
  • unshare_hek
  • utilize
  • varname
  • vivify_defelem
  • vivify_ref
  • wait4pid
  • was_lvalue_sub
  • watch
  • win32_croak_not_implemented
  • write_to_stderr
  • xmldump_all
  • xmldump_all_perl
  • xmldump_eval
  • xmldump_form
  • xmldump_indent
  • xmldump_packsubs
  • xmldump_packsubs_perl
  • xmldump_sub
  • xmldump_sub_perl
  • xmldump_vindent
  • xs_apiversion_bootcheck
  • xs_version_bootcheck
  • yyerror
  • yyerror_pv
  • yyerror_pvn
  • yylex
  • yyparse
  • yyunlex

AUTHORS

The autodocumentation system was originally added to the Perl core by Benjamin Stuhl. Documentation is by whoever was kind enough to document their functions.

SEE ALSO

perlguts, perlapi

 
perlinterp

NAME

perlinterp - An overview of the Perl interpreter

DESCRIPTION

This document provides an overview of how the Perl interpreter works at the level of C code, along with pointers to the relevant C source code files.

ELEMENTS OF THE INTERPRETER

The work of the interpreter has two main stages: compiling the code into the internal representation, or bytecode, and then executing it. Compiled code in perlguts explains exactly how the compilation stage happens.

Here is a short breakdown of perl's operation:

Startup

The action begins in perlmain.c (or miniperlmain.c for miniperl). This is very high-level code, enough to fit on a single screen, and it resembles the code found in perlembed; most of the real action takes place in perl.c.

perlmain.c is generated by ExtUtils::Miniperl from miniperlmain.c at make time, so you should make perl to follow this along.

First, perlmain.c allocates some memory and constructs a Perl interpreter, along these lines:

    1 PERL_SYS_INIT3(&argc,&argv,&env);
    2
    3 if (!PL_do_undump) {
    4     my_perl = perl_alloc();
    5     if (!my_perl)
    6         exit(1);
    7     perl_construct(my_perl);
    8     PL_perl_destruct_level = 0;
    9 }

Line 1 is a macro, and its definition is dependent on your operating system. Line 3 references PL_do_undump, a global variable - all global variables in Perl start with PL_. This tells you whether the current running program was created with the -u flag to perl and then undump, which means it's going to be false in any sane context.

Line 4 calls a function in perl.c to allocate memory for a Perl interpreter. It's quite a simple function, and the guts of it looks like this:

    my_perl = (PerlInterpreter*)PerlMem_malloc(sizeof(PerlInterpreter));

Here you see an example of Perl's system abstraction, which we'll see later: PerlMem_malloc is either your system's malloc, or Perl's own malloc as defined in malloc.c if you selected that option at configure time.

Next, in line 7, we construct the interpreter using perl_construct, also in perl.c; this sets up all the special variables that Perl needs, the stacks, and so on.

Now we pass Perl the command line options, and tell it to go:

    exitstatus = perl_parse(my_perl, xs_init, argc, argv, (char **)NULL);
    if (!exitstatus)
        perl_run(my_perl);
    exitstatus = perl_destruct(my_perl);
    perl_free(my_perl);

perl_parse is actually a wrapper around S_parse_body, as defined in perl.c, which processes the command line options, sets up any statically linked XS modules, opens the program and calls yyparse to parse it.

Parsing

The aim of this stage is to take the Perl source, and turn it into an op tree. We'll see what one of those looks like later. Strictly speaking, there are three things going on here.

yyparse, the parser, lives in perly.c, although you're better off reading the original YACC input in perly.y. (Yes, Virginia, there is a YACC grammar for Perl!) The job of the parser is to take your code and "understand" it, splitting it into sentences, deciding which operands go with which operators and so on.

The parser is nobly assisted by the lexer, which chunks up your input into tokens, and decides what type of thing each token is: a variable name, an operator, a bareword, a subroutine, a core function, and so on. The main point of entry to the lexer is yylex, and that and its associated routines can be found in toke.c. Perl isn't much like other computer languages; it's highly context sensitive at times, and it can be tricky to work out what sort of token something is, or where a token ends. As such, there's a lot of interplay between the tokeniser and the parser, which can get pretty frightening if you're not used to it.

As the parser understands a Perl program, it builds up a tree of operations for the interpreter to perform during execution. The routines which construct and link together the various operations are to be found in op.c, and will be examined later.

Optimization

Now the parsing stage is complete, and the finished tree represents the operations that the Perl interpreter needs to perform to execute our program. Next, Perl does a dry run over the tree looking for optimisations: constant expressions such as 3 + 4 will be computed now, and the optimizer will also see if any multiple operations can be replaced with a single one. For instance, to fetch the variable $foo, instead of grabbing the glob *foo and looking at the scalar component, the optimizer fiddles the op tree to use a function which directly looks up the scalar in question. The main optimizer is peep in op.c, and many ops have their own optimizing functions.

Running

Now we're finally ready to go: we have compiled Perl byte code, and all that's left to do is run it. The actual execution is done by the runops_standard function in run.c; more specifically, it's done by these three innocent looking lines:

    while ((PL_op = PL_op->op_ppaddr(aTHX))) {
        PERL_ASYNC_CHECK();
    }

You may be more comfortable with the Perl version of that:

    PERL_ASYNC_CHECK() while $Perl::op = &{$Perl::op->{function}};

Well, maybe not. Anyway, each op contains a function pointer, which stipulates the function which will actually carry out the operation. This function will return the next op in the sequence - this allows for things like if which choose the next op dynamically at run time. The PERL_ASYNC_CHECK makes sure that things like signals interrupt execution if required.

The actual functions called are known as PP code, and they're spread between four files: pp_hot.c contains the "hot" code, which is most often used and highly optimized, pp_sys.c contains all the system-specific functions, pp_ctl.c contains the functions which implement control structures (if, while and the like) and pp.c contains everything else. These are, if you like, the C code for Perl's built-in functions and operators.

Note that each pp_ function is expected to return a pointer to the next op. Calls to perl subs (and eval blocks) are handled within the same runops loop, and do not consume extra space on the C stack. For example, pp_entersub and pp_entertry just push a CxSUB or CxEVAL block struct onto the context stack; that struct contains the address of the op following the sub call or eval. They then return the first op of that sub or eval block, and so execution continues with that sub or block. Later, a pp_leavesub or pp_leavetry op pops the CxSUB or CxEVAL, retrieves the return op from it, and returns it.
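The shape of that loop can be tried out in isolation. The following is a minimal sketch, not Perl's code: the toy_op and toy_interp types, their fields and the three toy pp functions are all hypothetical names invented for illustration. It only shows the idea that every op carries a function pointer, and that the function returns the next op, which lets a conditional op pick its successor at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not Perl's real types. */
typedef struct toy_op toy_op;

typedef struct toy_interp {
    double stack[16];   /* miniature argument stack */
    int    sp;          /* number of items on the stack */
} toy_interp;

struct toy_op {
    toy_op *(*ppaddr)(toy_interp *, toy_op *);  /* function for this op */
    toy_op  *next;      /* default successor */
    toy_op  *other;     /* alternative successor for conditional ops */
    double   value;     /* payload for constant ops */
};

/* Push a constant and continue with the default successor. */
static toy_op *toy_pp_const(toy_interp *in, toy_op *op) {
    assert(in->sp < 16);
    in->stack[in->sp++] = op->value;
    return op->next;
}

/* Pop two values and push their sum. */
static toy_op *toy_pp_add(toy_interp *in, toy_op *op) {
    assert(in->sp >= 2);
    double right = in->stack[--in->sp];
    double left  = in->stack[--in->sp];
    in->stack[in->sp++] = left + right;
    return op->next;
}

/* Pop a condition and choose the next op at run time. */
static toy_op *toy_pp_cond(toy_interp *in, toy_op *op) {
    assert(in->sp >= 1);
    double cond = in->stack[--in->sp];
    return cond != 0.0 ? op->next : op->other;
}

/* The loop itself: each op returns its successor; NULL ends the run. */
void toy_runops(toy_interp *in, toy_op *op) {
    while (op != NULL)
        op = op->ppaddr(in, op);
}

/* Runs "cond ? 1 + 2 : 10" and returns the value left on the stack. */
double toy_demo(double cond) {
    toy_op else_c = { toy_pp_const, NULL,    NULL,    10.0 };
    toy_op add    = { toy_pp_add,   NULL,    NULL,    0.0  };
    toy_op two    = { toy_pp_const, &add,    NULL,    2.0  };
    toy_op one    = { toy_pp_const, &two,    NULL,    1.0  };
    toy_op branch = { toy_pp_cond,  &one,    &else_c, 0.0  };
    toy_op start  = { toy_pp_const, &branch, NULL,    cond };
    toy_interp in = { {0}, 0 };

    toy_runops(&in, &start);
    return in.stack[in.sp - 1];
}
```

Calling toy_demo with a non-zero argument follows the add branch, and calling it with zero follows the other pointer. Note that no C stack depth is consumed per op: the loop is flat, which is the property the paragraph above describes for sub calls and eval blocks.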

Exception handling

Perl's exception handling (i.e. die etc.) is built on top of the low-level setjmp()/longjmp() C-library functions. These basically provide a way to capture the current PC and SP registers and later restore them; i.e. a longjmp() continues at the point in code where a previous setjmp() was done, with anything further up on the C stack being lost. This is why code should always save values using SAVE_FOO rather than in auto variables.

The perl core wraps setjmp() etc in the macros JMPENV_PUSH and JMPENV_JUMP. The basic rule of perl exceptions is that exit, and die (in the absence of eval) perform a JMPENV_JUMP(2), while die within eval does a JMPENV_JUMP(3).

Entry points to perl, such as perl_parse(), perl_run() and call_sv(cv, G_EVAL), each do a JMPENV_PUSH, then enter a runops loop or whatever, and handle possible exception returns. For a 2 return, final cleanup is performed, such as popping stacks and calling CHECK or END blocks. Amongst other things, this is how scope cleanup still occurs during an exit.

If a die can find a CxEVAL block on the context stack, then the stack is popped to that level and the return op in that block is assigned to PL_restartop; then a JMPENV_JUMP(3) is performed. This normally passes control back to the guard. In the case of perl_run and call_sv, a non-null PL_restartop triggers re-entry to the runops loop. This is the normal way that die or croak is handled within an eval.
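The guard-and-jump pattern can be illustrated with plain setjmp()/longjmp(). This is a rough sketch under stated assumptions: toy_jmpenv, toy_top_env and the functions below are hypothetical names, not the real JMPENV macros, and the only thing borrowed from the text is the convention that code 2 stands for an exit-style unwind and code 3 for a die inside an eval.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical miniature of a chain of jump environments. */
typedef struct toy_jmpenv {
    jmp_buf            buf;
    struct toy_jmpenv *prev;
} toy_jmpenv;

static toy_jmpenv *toy_top_env = NULL;

/* Unwind the C stack to the most recently pushed guard. */
static void toy_jump(int code) {
    assert(toy_top_env != NULL);
    longjmp(toy_top_env->buf, code);
}

/* Code that "dies" a couple of C frames below the guard. */
static void toy_inner(int code) {
    if (code != 0)
        toy_jump(code);
}

static void toy_body(int code) {
    toy_inner(code);
}

/* Guarded entry point: returns 0 on normal completion, else the jump code. */
int toy_guarded_run(int code) {
    toy_jmpenv env;
    int ret;

    env.prev = toy_top_env;
    toy_top_env = &env;              /* push the guard */
    switch (setjmp(env.buf)) {
    case 0:
        toy_body(code);              /* may longjmp back to the setjmp above */
        ret = 0;
        break;
    case 2:  ret = 2;  break;        /* exit-style unwind */
    case 3:  ret = 3;  break;        /* die-inside-eval-style unwind */
    default: ret = -1; break;
    }
    toy_top_env = env.prev;          /* pop the guard on every path */
    return ret;
}
```

The frames of toy_body and toy_inner are simply abandoned when the jump happens, which is why anything they kept only in automatic variables is lost.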

Sometimes ops are executed within an inner runops loop, such as tie, sort or overload code. In this case, something like

    sub FETCH { eval { die } }

would cause a longjmp right back to the guard in perl_run, popping both runops loops, which is clearly incorrect. One way to avoid this is for the tie code to do a JMPENV_PUSH before executing FETCH in the inner runops loop, but for efficiency reasons, perl in fact just sets a flag, using CATCH_SET(TRUE). The pp_require, pp_entereval and pp_entertry ops check this flag, and if true, they call docatch, which does a JMPENV_PUSH and starts a new runops level to execute the code, rather than doing it on the current loop.

As a further optimisation, on exit from the eval block in the FETCH, execution of the code following the block is still carried on in the inner loop. When an exception is raised, docatch compares the JMPENV level of the CxEVAL with PL_top_env and if they differ, just re-throws the exception. In this way any inner loops get popped.

Here's an example.

    1: eval { tie @a, 'A' };
    2: sub A::TIEARRAY {
    3:     eval { die };
    4:     die;
    5: }

To run this code, perl_run is called, which does a JMPENV_PUSH then enters a runops loop. This loop executes the eval and tie ops on line 1, with the eval pushing a CxEVAL onto the context stack.

The pp_tie does a CATCH_SET(TRUE), then starts a second runops loop to execute the body of TIEARRAY. When it executes the entertry op on line 3, CATCH_GET is true, so pp_entertry calls docatch which does a JMPENV_PUSH and starts a third runops loop, which then executes the die op. At this point the C call stack looks like this:

    Perl_pp_die
    Perl_runops      # third loop
    S_docatch_body
    S_docatch
    Perl_pp_entertry
    Perl_runops      # second loop
    S_call_body
    Perl_call_sv
    Perl_pp_tie
    Perl_runops      # first loop
    S_run_body
    perl_run
    main

and the context and data stacks, as shown by -Dstv, look like:

    STACK 0: MAIN
      CX 0: BLOCK  =>
      CX 1: EVAL   => AV()  PV("A"\0)
        retop=leave
    STACK 1: MAGIC
      CX 0: SUB    =>
        retop=(null)
      CX 1: EVAL   => *
        retop=nextstate

The die pops the first CxEVAL off the context stack, sets PL_restartop from it, does a JMPENV_JUMP(3), and control returns to the top docatch. This then starts another third-level runops loop, which executes the nextstate, pushmark and die ops on line 4. At the point that the second pp_die is called, the C call stack looks exactly like that above, even though we are no longer within an inner eval; this is because of the optimization mentioned earlier. However, the context stack now looks like this, i.e. with the top CxEVAL popped:

    STACK 0: MAIN
      CX 0: BLOCK  =>
      CX 1: EVAL   => AV()  PV("A"\0)
        retop=leave
    STACK 1: MAGIC
      CX 0: SUB    =>
        retop=(null)

The die on line 4 pops the context stack back down to the CxEVAL, leaving it as:

    STACK 0: MAIN
      CX 0: BLOCK  =>

As usual, PL_restartop is extracted from the CxEVAL, and a JMPENV_JUMP(3) done, which pops the C stack back to the docatch:

    S_docatch
    Perl_pp_entertry
    Perl_runops      # second loop
    S_call_body
    Perl_call_sv
    Perl_pp_tie
    Perl_runops      # first loop
    S_run_body
    perl_run
    main

In this case, because the JMPENV level recorded in the CxEVAL differs from the current one, docatch just does a JMPENV_JUMP(3) and the C stack unwinds to:

    perl_run
    main

Because PL_restartop is non-null, run_body starts a new runops loop and execution continues.

INTERNAL VARIABLE TYPES

You should by now have had a look at perlguts, which tells you about Perl's internal variable types: SVs, HVs, AVs and the rest. If not, do that now.

These variables are used not only to represent Perl-space variables, but also any constants in the code, as well as some structures completely internal to Perl. The symbol table, for instance, is an ordinary Perl hash. Your code is represented by an SV as it's read into the parser; any program files you call are opened via ordinary Perl filehandles, and so on.

The core Devel::Peek module lets us examine SVs from a Perl program. Let's see, for instance, how Perl treats the constant "hello".

    % perl -MDevel::Peek -e 'Dump("hello")'
    1 SV = PV(0xa041450) at 0xa04ecbc
    2   REFCNT = 1
    3   FLAGS = (POK,READONLY,pPOK)
    4   PV = 0xa0484e0 "hello"\0
    5   CUR = 5
    6   LEN = 6

Reading Devel::Peek output takes a bit of practise, so let's go through it line by line.

Line 1 tells us we're looking at an SV which lives at 0xa04ecbc in memory. SVs themselves are very simple structures, but they contain a pointer to a more complex structure. In this case, it's a PV, a structure which holds a string value, at location 0xa041450. Line 2 is the reference count; there are no other references to this data, so it's 1.

Line 3 gives the flags for this SV - it's OK to use it as a PV, it's a read-only SV (because it's a constant) and the data is a PV internally. Next, on line 4, we've got the contents of the string, starting at location 0xa0484e0.

Line 5 gives us the current length of the string - note that this does not include the null terminator. Line 6 is not the length of the string, but the length of the currently allocated buffer; as the string grows, Perl automatically extends the available storage via a routine called SvGROW.

You can get at any of these quantities from C very easily; just add Sv to the name of the field shown in the snippet, and you've got a macro which will return the value: SvCUR(sv) returns the current length of the string, SvREFCNT(sv) returns the reference count, SvPV(sv, len) returns the string itself with its length, and so on. More macros to manipulate these properties can be found in perlguts.

Let's take an example of manipulating a PV, from sv_catpvn, in sv.c:

     1  void
     2  Perl_sv_catpvn(pTHX_ SV *sv, const char *ptr, STRLEN len)
     3  {
     4      STRLEN tlen;
     5      char *junk;
     6      junk = SvPV_force(sv, tlen);
     7      SvGROW(sv, tlen + len + 1);
     8      if (ptr == junk)
     9          ptr = SvPVX(sv);
    10      Move(ptr,SvPVX(sv)+tlen,len,char);
    11      SvCUR(sv) += len;
    12      *SvEND(sv) = '\0';
    13      (void)SvPOK_only_UTF8(sv);          /* validate pointer */
    14      SvTAINT(sv);
    15  }

This is a function which adds a string, ptr, of length len onto the end of the PV stored in sv. The first thing we do in line 6 is make sure that the SV has a valid PV, by calling the SvPV_force macro to force a PV. As a side effect, tlen gets set to the current length of the PV, and the PV itself is returned to junk.

In line 7, we make sure that the SV will have enough room to accommodate the old string, the new string and the null terminator. If LEN isn't big enough, SvGROW will reallocate space for us.

Now, if junk is the same as the string we're trying to add, we can grab the string directly from the SV; SvPVX is the address of the PV in the SV.

Line 10 does the actual catenation: the Move macro moves a chunk of memory around: we move the string ptr to the end of the PV - that's the start of the PV plus its current length. We're moving len bytes of type char. After doing so, we need to tell Perl we've extended the string, by altering CUR to reflect the new length. SvEND is a macro which gives us the end of the string, so that needs to be a "\0".

Line 13 manipulates the flags; since we've changed the PV, any IV or NV values will no longer be valid: if we have $a=10; $a.="6"; we don't want to use the old IV of 10. SvPOK_only_UTF8 is a special UTF-8-aware version of SvPOK_only, a macro which turns off the IOK and NOK flags and turns on POK. The final SvTAINT is a macro which marks the SV as tainted if taint mode is turned on and tainted data was involved in the current expression.
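The same sequence of steps (grow the buffer, cope with the source being the buffer itself, copy, update the length, terminate) can be sketched without any Perl headers. Everything here is an assumption made for illustration: toy_str and its pv, cur and len fields merely echo the PV, CUR and LEN lines of the dump above, realloc stands in for SvGROW, and memmove stands in for Move. Flags and tainting are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string-holding SV. */
typedef struct toy_str {
    char  *pv;   /* buffer, NUL-terminated once allocated */
    size_t cur;  /* current string length, excluding the NUL */
    size_t len;  /* allocated buffer size */
} toy_str;

/* Ensure the buffer holds at least `want` bytes. */
static int toy_grow(toy_str *s, size_t want) {
    if (want > s->len) {
        char *p = realloc(s->pv, want);
        if (p == NULL)
            return -1;
        s->pv  = p;
        s->len = want;
    }
    return 0;
}

/* Append `n` bytes found at `ptr`; returns 0 on success, -1 on failure. */
int toy_catpvn(toy_str *s, const char *ptr, size_t n) {
    /* Decide before growing whether we are appending the string to itself. */
    int self = (ptr == s->pv);

    if (toy_grow(s, s->cur + n + 1) != 0)
        return -1;
    if (self)
        ptr = s->pv;                 /* the buffer may have moved */
    assert(s->cur + n + 1 <= s->len);
    memmove(s->pv + s->cur, ptr, n); /* copy onto the end of the old string */
    s->cur += n;
    s->pv[s->cur] = '\0';            /* keep the terminator in place */
    return 0;
}
```

Starting from an empty toy_str (all fields zero), appending "hello" and then appending the buffer to itself should leave "hellohello" with a cur of 10; the caller frees pv when done.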

AVs and HVs are more complicated, but SVs are by far the most common variable type being thrown around. Having seen something of how we manipulate these, let's go on and look at how the op tree is constructed.

OP TREES

First, what is the op tree, anyway? The op tree is the parsed representation of your program, as we saw in our section on parsing, and it's the sequence of operations that Perl goes through to execute your program, as we saw in Running.

An op is a fundamental operation that Perl can perform: all the built-in functions and operators are ops, and there is a series of ops which deal with concepts the interpreter needs internally - entering and leaving a block, ending a statement, fetching a variable, and so on.

The op tree is connected in two ways: you can imagine that there are two "routes" through it, two orders in which you can traverse the tree. First, parse order reflects how the parser understood the code, and secondly, execution order tells perl what order to perform the operations in.

The easiest way to examine the op tree is to stop Perl after it has finished parsing, and get it to dump out the tree. This is exactly what the compiler backends B::Terse, B::Concise and B::Debug do.

Let's have a look at how Perl sees $a = $b + $c:

     % perl -MO=Terse -e '$a=$b+$c'
     1  LISTOP (0x8179888) leave
     2      OP (0x81798b0) enter
     3      COP (0x8179850) nextstate
     4      BINOP (0x8179828) sassign
     5          BINOP (0x8179800) add [1]
     6              UNOP (0x81796e0) null [15]
     7                  SVOP (0x80fafe0) gvsv  GV (0x80fa4cc) *b
     8              UNOP (0x81797e0) null [15]
     9                  SVOP (0x8179700) gvsv  GV (0x80efeb0) *c
    10          UNOP (0x816b4f0) null [15]
    11              SVOP (0x816dcf0) gvsv  GV (0x80fa460) *a

Let's start in the middle, at line 4. This is a BINOP, a binary operator, which is at location 0x8179828. The specific operator in question is sassign - scalar assignment - and you can find the code which implements it in the function pp_sassign in pp_hot.c. As a binary operator, it has two children: the add operator, providing the result of $b+$c, is uppermost on line 5, and the left hand side is on line 10.

Line 10 is the null op: this does exactly nothing. What is that doing there? If you see the null op, it's a sign that something has been optimized away after parsing. As we mentioned in Optimization, the optimization stage sometimes converts two operations into one, for example when fetching a scalar variable. When this happens, instead of rewriting the op tree and cleaning up the dangling pointers, it's easier just to replace the redundant operation with the null op. Originally, the tree would have looked like this:

    10          SVOP (0x816b4f0) rv2sv [15]
    11              SVOP (0x816dcf0) gv  GV (0x80fa460) *a

That is, fetch the a entry from the main symbol table, and then look at the scalar component of it: gvsv (pp_gvsv in pp_hot.c) happens to do both these things.

The right hand side, starting at line 5, is similar to what we've just seen: we have the add op (pp_add, also in pp_hot.c) adding together two gvsvs.

Now, what's this about?

     1  LISTOP (0x8179888) leave
     2      OP (0x81798b0) enter
     3      COP (0x8179850) nextstate

enter and leave are scoping ops, and their job is to perform any housekeeping every time you enter and leave a block: lexical variables are tidied up, unreferenced variables are destroyed, and so on. Every program will have those first three lines: leave is a list, and its children are all the statements in the block. Statements are delimited by nextstate, so a block is a collection of nextstate ops, with the ops to be performed for each statement being the children of nextstate. enter is a single op which functions as a marker.

That's how Perl parsed the program, from top to bottom:

                        Program
                           |
                       Statement
                           |
                           =
                          / \
                         /   \
                        $a    +
                             / \
                           $b   $c

However, it's impossible to perform the operations in this order: you have to find the values of $b and $c before you add them together, for instance. So, the other thread that runs through the op tree is the execution order: each op has a field op_next which points to the next op to be run, so following these pointers tells us how perl executes the code. We can traverse the tree in this order using the exec option to B::Terse:

    % perl -MO=Terse,exec -e '$a=$b+$c'
    OP (0x8179928) enter
    COP (0x81798c8) nextstate
    SVOP (0x81796c8) gvsv  GV (0x80fa4d4) *b
    SVOP (0x8179798) gvsv  GV (0x80efeb0) *c
    BINOP (0x8179878) add [1]
    SVOP (0x816dd38) gvsv  GV (0x80fa468) *a
    BINOP (0x81798a0) sassign
    LISTOP (0x8179900) leave

This probably makes more sense for a human: enter a block, start a statement. Get the values of $b and $c, and add them together. Find $a, and assign one to the other. Then leave.
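One way to picture how a single tree can carry both orders is a post-order walk that fills in a next pointer on each node. The sketch below is an illustration only, under the assumption of a hypothetical toy_node type with at most two children; it is not the routine Perl itself uses to link ops, and it leaves out enter, nextstate and leave.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy op: `kid` gives parse order, `next` gives execution order. */
typedef struct toy_node {
    const char      *name;
    struct toy_node *kid[2];   /* up to two children, NULL when absent */
    struct toy_node *next;     /* filled in by toy_thread() */
} toy_node;

/* Post-order walk: thread the children first, then the node itself.
 * `tail` points at the slot that should receive the next op to run. */
static toy_node **toy_thread(toy_node *n, toy_node **tail) {
    for (int i = 0; i < 2; i++)
        if (n->kid[i] != NULL)
            tail = toy_thread(n->kid[i], tail);
    *tail = n;
    n->next = NULL;
    return &n->next;
}

/* Builds the tree for $a = $b + $c, threads it, and writes the op names in
 * execution order into `out`, separated by single spaces. */
void toy_exec_order(char *out, size_t outsz) {
    toy_node b   = { "gvsv(b)", { NULL, NULL }, NULL };
    toy_node c   = { "gvsv(c)", { NULL, NULL }, NULL };
    toy_node a   = { "gvsv(a)", { NULL, NULL }, NULL };
    toy_node add = { "add",     { &b, &c },     NULL };
    toy_node sas = { "sassign", { &add, &a },   NULL };
    toy_node *first = NULL;

    toy_thread(&sas, &first);

    assert(outsz > 0);
    out[0] = '\0';
    for (toy_node *op = first; op != NULL; op = op->next) {
        /* room for an optional space, the name and the terminator */
        assert(strlen(out) + strlen(op->name) + 2 <= outsz);
        if (out[0] != '\0')
            strcat(out, " ");
        strcat(out, op->name);
    }
}
```

With a 64-byte buffer, the resulting string lists the two operand fetches, then add, then the fetch of the target, then sassign, matching the middle of the exec listing above.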

The way Perl builds up these op trees in the parsing process can be unravelled by examining perly.y, the YACC grammar. Let's take the piece we need to construct the tree for $a = $b + $c:

    1 term    :   term ASSIGNOP term
    2                { $$ = newASSIGNOP(OPf_STACKED, $1, $2, $3); }
    3         |   term ADDOP term
    4                { $$ = newBINOP($2, 0, scalar($1), scalar($3)); }

If you're not used to reading BNF grammars, this is how it works: You're fed certain things by the tokeniser, which generally end up in upper case. Here, ADDOP is provided when the tokeniser sees + in your code. ASSIGNOP is provided when = is used for assigning. These are "terminal symbols", because you can't get any simpler than them.

The grammar, lines one and three of the snippet above, tells you how to build up more complex forms. These complex forms, "non-terminal symbols" are generally placed in lower case. term here is a non-terminal symbol, representing a single expression.

The grammar gives you the following rule: you can make the thing on the left of the colon if you see all the things on the right in sequence. This is called a "reduction", and the aim of parsing is to completely reduce the input. There are several different ways you can perform a reduction, separated by vertical bars: so, term followed by = followed by term makes a term, and term followed by + followed by term can also make a term.

So, if you see two terms with an = or + between them, you can turn them into a single expression. When you do this, you execute the code in the block on the next line: if you see =, you'll do the code in line 2. If you see +, you'll do the code in line 4. It's this code which contributes to the op tree.

            |   term ADDOP term
                { $$ = newBINOP($2, 0, scalar($1), scalar($3)); }

What this does is create a new binary op, and feed it a number of variables. The variables refer to the tokens: $1 is the first token in the input, $2 the second, and so on - think regular expression backreferences. $$ is the op returned from this reduction. So, we call newBINOP to create a new binary operator. The first parameter to newBINOP, a function in op.c, is the op type. It's an addition operator, so we want the type to be ADDOP. We could specify this directly, but it's right there as the second token in the input, so we use $2. The second parameter is the op's flags: 0 means "nothing special". Then the things to add: the left and right hand side of our expression, in scalar context.

STACKS

When perl executes something like addop, how does it pass on its results to the next op? The answer is, through the use of stacks. Perl has a number of stacks to store things it's currently working on, and we'll look at the three most important ones here.

Argument stack

Arguments are passed to PP code and returned from PP code using the argument stack, ST. The typical way to handle arguments is to pop them off the stack, deal with them how you wish, and then push the result back onto the stack. This is how, for instance, the cosine operator works:

    NV value;
    value = POPn;
    value = Perl_cos(value);
    XPUSHn(value);

We'll see a more tricky example of this when we consider Perl's macros below. POPn gives you the NV (floating point value) of the top SV on the stack: the $x in cos($x). Then we compute the cosine, and push the result back as an NV. The X in XPUSHn means that the stack should be extended if necessary - it can't be necessary here, because we know there's room for one more item on the stack, since we've just removed one! The XPUSH* macros at least guarantee safety.

Alternatively, you can fiddle with the stack directly: SP gives you the first element in your portion of the stack, and TOP* gives you the top SV/IV/NV/etc. on the stack. So, for instance, to do unary negation of an integer:

    SETi(-TOPi);

Just set the integer value of the top stack entry to its negation.

Argument stack manipulation in the core is exactly the same as it is in XSUBs - see perlxstut, perlxs and perlguts for a longer description of the macros used in stack manipulation.
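To make the two styles concrete, here is a small self-contained sketch. It assumes a hypothetical toy_stack of plain integers with a fixed capacity (Perl's real argument stack holds SV pointers and can be extended), and the helper functions are invented stand-ins for the POP, XPUSH, TOP and SET macro families rather than the macros themselves.

```c
#include <assert.h>

#define TOY_STACK_MAX 8

/* Hypothetical stack of integers with a fixed capacity. */
typedef struct toy_stack {
    long item[TOY_STACK_MAX];
    int  top;                 /* index of the top item, -1 when empty */
} toy_stack;

/* Remove and return the top item. */
static long toy_pop(toy_stack *st) {
    assert(st->top >= 0);
    return st->item[st->top--];
}

/* Add an item; a real XPUSH would extend the stack instead of asserting. */
static void toy_push(toy_stack *st, long v) {
    assert(st->top + 1 < TOY_STACK_MAX);
    st->item[++st->top] = v;
}

/* Read the top item without removing it. */
static long toy_top(const toy_stack *st) {
    assert(st->top >= 0);
    return st->item[st->top];
}

/* Overwrite the top item in place. */
static void toy_set(toy_stack *st, long v) {
    assert(st->top >= 0);
    st->item[st->top] = v;
}

/* Pop-compute-push style, as in the cosine example (here: squaring). */
void toy_pp_square(toy_stack *st) {
    long value = toy_pop(st);
    toy_push(st, value * value);
}

/* In-place style, as in the unary negation example. */
void toy_pp_negate(toy_stack *st) {
    toy_set(st, -toy_top(st));
}
```

Both functions leave the stack depth unchanged; the second simply avoids moving the stack pointer at all.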

Mark stack

I say "your portion of the stack" above because PP code doesn't necessarily get the whole stack to itself: if your function calls another function, you'll only want to expose the arguments aimed for the called function, and not (necessarily) let it get at your own data. The way we do this is to have a "virtual" bottom-of-stack, exposed to each function. The mark stack keeps bookmarks to locations in the argument stack usable by each function. For instance, when dealing with a tied variable (internally, something with "P" magic), Perl has to call methods for accesses to the tied variables. However, we need to separate the arguments exposed to the method from the arguments exposed to the original function - the store or fetch or whatever it may be. Here's roughly how the tied push is implemented; see av_push in av.c:

    PUSHMARK(SP);
    EXTEND(SP,2);
    PUSHs(SvTIED_obj((SV*)av, mg));
    PUSHs(val);
    PUTBACK;
    ENTER;
    call_method("PUSH", G_SCALAR|G_DISCARD);
    LEAVE;

Let's examine the whole implementation, for practice:

    PUSHMARK(SP);

Push the current state of the stack pointer onto the mark stack. This is so that when we've finished adding items to the argument stack, Perl knows how many things we've added recently.

    EXTEND(SP,2);
    PUSHs(SvTIED_obj((SV*)av, mg));
    PUSHs(val);

We're going to add two more items onto the argument stack: when you have a tied array, the PUSH subroutine receives the object and the value to be pushed, and that's exactly what we have here - the tied object, retrieved with SvTIED_obj, and the value, the SV val.

    PUTBACK;

Next we tell Perl to update the global stack pointer from our internal variable: dSP only gave us a local copy, not a reference to the global.

    ENTER;
    call_method("PUSH", G_SCALAR|G_DISCARD);
    LEAVE;

ENTER and LEAVE localise a block of code - they make sure that all variables are tidied up, everything that has been localised gets its previous value returned, and so on. Think of them as the { and } of a Perl block.

To actually do the magic method call, we have to call a subroutine in Perl space: call_method takes care of that, and it's described in perlcall. We call the PUSH method in scalar context, and we're going to discard its return value. The call_method() function removes the top element of the mark stack, so there is nothing for the caller to clean up.
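The bookkeeping behind the mark stack can be sketched in a few lines. This is a toy under stated assumptions: toy_marked and the three functions are hypothetical, the arguments are plain ints, and the callee returns a sum instead of discarding its result. What it shares with the description above is that the caller records a mark before pushing arguments, and the callee removes that mark and sees only what lies above it.

```c
#include <assert.h>

#define TOY_MAX 16

/* Hypothetical miniature: an argument stack plus bookmarks into it. */
typedef struct toy_marked {
    int args[TOY_MAX];
    int sp;               /* number of arguments pushed */
    int marks[TOY_MAX];
    int markp;            /* number of marks pushed */
} toy_marked;

/* Record where the callee's arguments will start. */
void toy_mark_push(toy_marked *s) {
    assert(s->markp < TOY_MAX);
    s->marks[s->markp++] = s->sp;
}

/* Push one argument. */
void toy_arg_push(toy_marked *s, int v) {
    assert(s->sp < TOY_MAX);
    s->args[s->sp++] = v;
}

/* A callee: pops the top mark, looks only at the arguments above it,
 * consumes them, and leaves their sum as its single result. */
void toy_call_sum(toy_marked *s) {
    assert(s->markp > 0);
    int base = s->marks[--s->markp];
    int sum = 0;

    for (int i = base; i < s->sp; i++)
        sum += s->args[i];
    s->sp = base;           /* drop the arguments */
    toy_arg_push(s, sum);   /* push the result in their place */
}
```

If the caller already had a value of its own on the stack before pushing the mark, that value is untouched after the call, and the mark stack is back to the depth it had before.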

Save stack

C doesn't have a concept of local scope, so perl provides one. We've seen that ENTER and LEAVE are used as scoping braces; the save stack implements the C equivalent of, for example:

    {
        local $foo = 42;
        ...
    }

See Localizing changes in perlguts for how to use the save stack.
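As a rough picture of the idea, the following sketch keeps a stack of (address, old value) pairs and restores them when a scope closes. All names are hypothetical, it handles only int variables, and unlike the real ENTER and LEAVE, which take no arguments, the toy version hands the scope's starting depth back to the caller.

```c
#include <assert.h>

#define TOY_SAVE_MAX 16

/* Hypothetical save-stack entry: where the value lived and what it was. */
typedef struct toy_saved {
    int *addr;
    int  old;
} toy_saved;

static toy_saved toy_savestack[TOY_SAVE_MAX];
static int toy_save_ix = 0;

/* Open a scope: remember how deep the save stack currently is. */
int toy_enter(void) {
    return toy_save_ix;
}

/* Record the current value of *var so it can be restored later. */
void toy_save_int(int *var) {
    assert(toy_save_ix < TOY_SAVE_MAX);
    toy_savestack[toy_save_ix].addr = var;
    toy_savestack[toy_save_ix].old  = *var;
    toy_save_ix++;
}

/* Close a scope: undo, newest first, everything saved since toy_enter(). */
void toy_leave(int base) {
    while (toy_save_ix > base) {
        toy_save_ix--;
        *toy_savestack[toy_save_ix].addr = toy_savestack[toy_save_ix].old;
    }
}

/* The C shape of { local $foo = 42; ... }; returns the value seen inside. */
int toy_scoped_demo(int *foo) {
    int base = toy_enter();
    int inside;

    toy_save_int(foo);
    *foo = 42;
    inside = *foo;      /* the body of the block would run here */
    toy_leave(base);    /* *foo gets its previous value back */
    return inside;
}
```

Restoring in reverse order matters when the same variable is saved more than once in a scope: the oldest saved value is the one that ends up back in place.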

MILLIONS OF MACROS

One thing you'll notice about the Perl source is that it's full of macros. Some have called the pervasive use of macros the hardest thing to understand, others find it adds to clarity. Let's take an example, the code which implements the addition operator:

    1 PP(pp_add)
    2 {
    3     dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
    4     {
    5       dPOPTOPnnrl_ul;
    6       SETn( left + right );
    7       RETURN;
    8     }
    9 }

Every line here (apart from the braces, of course) contains a macro. The first line sets up the function declaration as Perl expects for PP code; line 3 sets up variable declarations for the argument stack and the target, the return value of the operation. Finally, it tries to see if the addition operation is overloaded; if so, the appropriate subroutine is called.

Line 5 is another variable declaration - all variable declarations start with d - which pops from the top of the argument stack two NVs (hence nn) and puts them into the variables right and left, hence the rl. These are the two operands to the addition operator. Next, we call SETn to set the NV of the return value to the result of adding the two values. This done, we return - the RETURN macro makes sure that our return value is properly handled, and we pass the next operator to run back to the main run loop.
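
To get a feel for how such macros hide the stack handling, here is a toy stack machine in plain C. It is a sketch under loud assumptions: the toy_ macros only imitate the style of dSP, the dPOPTOP family, SETn and PUTBACK, they are not the definitions from the Perl source, and the stack holds bare doubles rather than SVs.

```c
#include <assert.h>

/* Toy model only: these macros mimic the *style* of the Perl macros
 * discussed above; none of them is taken from the Perl source. */

static double  toy_stack[16];
static double *toy_stack_sp = toy_stack;  /* "global" pointer to the top item;
                                             slot 0 is an unused sentinel */

#define toy_dSP        double *sp = toy_stack_sp         /* local copy        */
#define toy_PUSHn(n)   (*++sp = (n))                     /* push a number     */
#define toy_dPOPTOPnn  double right = *sp--; double left = *sp
#define toy_SETn(n)    (*sp = (n))                       /* overwrite the top */
#define toy_PUTBACK    (toy_stack_sp = sp)               /* publish local sp  */

static void toy_push(double n)
{
    toy_dSP;
    assert(sp < toy_stack + 15);   /* room for one more item */
    toy_PUSHn(n);
    toy_PUTBACK;
}

static void toy_pp_add(void)
{
    toy_dSP;
    toy_dPOPTOPnn;                 /* pop right, leave left on top */
    toy_SETn(left + right);        /* replace left with the sum    */
    toy_PUTBACK;
}

static double toy_top(void)
{
    return *toy_stack_sp;
}
```

After toy_push(2.0) and toy_push(3.0), a call to toy_pp_add() leaves a single value, the sum, on top of the toy stack. The point is only that the body of toy_pp_add reads like the body of pp_add: declarations first, then one macro that does the work.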

Most of these macros are explained in perlapi, and some of the more important ones are explained in perlxs as well. Pay special attention to Background and PERL_IMPLICIT_CONTEXT in perlguts for information on the [pad]THX_? macros.

FURTHER READING

For more information on the Perl internals, please see the documents listed at Internals and C Language Interface in perl.

 
perlintro

NAME

perlintro -- a brief introduction and overview of Perl

DESCRIPTION

This document is intended to give you a quick overview of the Perl programming language, along with pointers to further documentation. It is intended as a "bootstrap" guide for those who are new to the language, and provides just enough information for you to be able to read other people's Perl and understand roughly what it's doing, or write your own simple scripts.

This introductory document does not aim to be complete. It does not even aim to be entirely accurate. In some cases perfection has been sacrificed in the goal of getting the general idea across. You are strongly advised to follow this introduction with more information from the full Perl manual, the table of contents to which can be found in perltoc.

Throughout this document you'll see references to other parts of the Perl documentation. You can read that documentation using the perldoc command or whatever method you're using to read this document.

Throughout Perl's documentation, you'll find numerous examples intended to help explain the discussed features. Please keep in mind that many of them are code fragments rather than complete programs.

These examples often reflect the style and preference of the author of that piece of the documentation, and may be briefer than a corresponding line of code in a real program. Except where otherwise noted, you should assume that use strict and use warnings statements appear earlier in the "program", and that any variables used have already been declared, even if those declarations have been omitted to make the example easier to read.

Do note that the examples have been written by many different authors over a period of several decades. Styles and techniques will therefore differ, although some effort has been made to not vary styles too widely in the same sections. Do not consider one style to be better than others - "There's More Than One Way To Do It" is one of Perl's mottos. After all, in your journey as a programmer, you are likely to encounter different styles.

What is Perl?

Perl is a general-purpose programming language originally developed for text manipulation and now used for a wide range of tasks including system administration, web development, network programming, GUI development, and more.

The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal). Its major features are that it's easy to use, supports both procedural and object-oriented (OO) programming, has powerful built-in support for text processing, and has one of the world's most impressive collections of third-party modules.

Different definitions of Perl are given in perl, perlfaq1 and no doubt other places. From this we can determine that Perl is different things to different people, but that lots of people think it's at least worth writing about.

Running Perl programs

To run a Perl program from the Unix command line:

    perl progname.pl

Alternatively, put this as the first line of your script:

    #!/usr/bin/env perl

... and run the script as /path/to/script.pl. Of course, it'll need to be executable first, so chmod 755 script.pl (under Unix).

(This start line assumes you have the env program. You can also put the path to your perl executable there directly, as in #!/usr/bin/perl.)

For more information, including instructions for other platforms such as Windows and Mac OS, read perlrun.

Safety net

Perl by default is very forgiving. In order to make it more robust it is recommended to start every program with the following lines:

    #!/usr/bin/perl
    use strict;
    use warnings;

The two additional lines ask perl to catch various common problems in your code. They check different things, so you need both. A potential problem caught by use strict; will cause your code to stop immediately when it is encountered, while use warnings; will merely give a warning (like the command-line switch -w) and let your code run. To read more about them, check their respective manual pages at strict and warnings.

Basic syntax overview

A Perl script or program consists of one or more statements. These statements are simply written in the script in a straightforward fashion. There is no need to have a main() function or anything of that kind.

Perl statements end in a semi-colon:

    print "Hello, world";

Comments start with a hash symbol and run to the end of the line:

    # This is a comment

Whitespace is irrelevant:

    print
        "Hello, world"
        ;

... except inside quoted strings:

    # this would print with a linebreak in the middle
    print "Hello
    world";

Double quotes or single quotes may be used around literal strings:

    print "Hello, world";
    print 'Hello, world';

However, only double quotes "interpolate" variables and special characters such as newlines (\n):

    print "Hello, $name\n";     # works fine
    print 'Hello, $name\n';     # prints $name\n literally

Numbers don't need quotes around them:

    print 42;

You can use parentheses for functions' arguments or omit them according to your personal taste. They are only required occasionally to clarify issues of precedence.

    print("Hello, world\n");
    print "Hello, world\n";

More detailed information about Perl syntax can be found in perlsyn.

Perl variable types

Perl has three main variable types: scalars, arrays, and hashes.

  • Scalars

    A scalar represents a single value:

        my $animal = "camel";
        my $answer = 42;

    Scalar values can be strings, integers or floating point numbers, and Perl will automatically convert between them as required. There is no need to pre-declare your variable types, but you have to declare them using the my keyword the first time you use them. (This is one of the requirements of use strict;.)

    Scalar values can be used in various ways:

        print $animal;
        print "The animal is $animal\n";
        print "The square of $answer is ", $answer * $answer, "\n";

    There are a number of "magic" scalars with names that look like punctuation or line noise. These special variables are used for all kinds of purposes, and are documented in perlvar. The only one you need to know about for now is $_ which is the "default variable". It's used as the default argument to a number of functions in Perl, and it's set implicitly by certain looping constructs.

        print;          # prints contents of $_ by default

  • Arrays

    An array represents a list of values:

        my @animals = ("camel", "llama", "owl");
        my @numbers = (23, 42, 69);
        my @mixed   = ("camel", 42, 1.23);

    Arrays are zero-indexed. Here's how you get at elements in an array:

        print $animals[0];              # prints "camel"
        print $animals[1];              # prints "llama"

    The special variable $#array tells you the index of the last element of an array:

        print $mixed[$#mixed];          # last element, prints 1.23

    You might be tempted to use $#array + 1 to tell you how many items there are in an array. Don't bother. As it happens, using @array where Perl expects to find a scalar value ("in scalar context") will give you the number of elements in the array:

        if (@animals < 5) { ... }

    The elements we're getting from the array start with a $ because we're getting just a single value out of the array; you ask for a scalar, you get a scalar.

    To get multiple values from an array:

        @animals[0,1];                  # gives ("camel", "llama");
        @animals[0..2];                 # gives ("camel", "llama", "owl");
        @animals[1..$#animals];         # gives all except the first element

    This is called an "array slice".

    You can do various useful things to lists:

        my @sorted    = sort @animals;
        my @backwards = reverse @numbers;

    There are a couple of special arrays too, such as @ARGV (the command line arguments to your script) and @_ (the arguments passed to a subroutine). These are documented in perlvar.

  • Hashes

    A hash represents a set of key/value pairs:

        my %fruit_color = ("apple", "red", "banana", "yellow");

    You can use whitespace and the => operator to lay them out more nicely:

        my %fruit_color = (
            apple  => "red",
            banana => "yellow",
        );

    To get at hash elements:

        $fruit_color{"apple"};          # gives "red"

    You can get at lists of keys and values with keys() and values().

        my @fruits = keys %fruit_color;
        my @colors = values %fruit_color;

    Hashes have no particular internal order, though you can sort the keys and loop through them.

    Just like special scalars and arrays, there are also special hashes. The most well known of these is %ENV which contains environment variables. Read all about it (and other special variables) in perlvar.

Scalars, arrays and hashes are documented more fully in perldata.

More complex data types can be constructed using references, which allow you to build lists and hashes within lists and hashes.

A reference is a scalar value and can refer to any other Perl data type. So by storing a reference as the value of an array or hash element, you can easily create lists and hashes within lists and hashes. The following example shows a 2 level hash of hash structure using anonymous hash references.

    my $variables = {
        scalar => {
            description => "single item",
            sigil       => '$',
        },
        array => {
            description => "ordered list of items",
            sigil       => '@',
        },
        hash => {
            description => "key/value pairs",
            sigil       => '%',
        },
    };

    print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n";

Exhaustive information on the topic of references can be found in perlreftut, perllol, perlref and perldsc.

Variable scoping

Throughout the previous section all the examples have used the syntax:

    my $var = "value";

The my is actually not required; you could just use:

    $var = "value";

However, the above usage will create global variables throughout your program, which is bad programming practice. my creates lexically scoped variables instead. The variables are scoped to the block (i.e. a bunch of statements surrounded by curly-braces) in which they are defined.

    my $x = "foo";
    my $some_condition = 1;
    if ($some_condition) {
        my $y = "bar";
        print $x;           # prints "foo"
        print $y;           # prints "bar"
    }
    print $x;               # prints "foo"
    print $y;               # prints nothing; $y has fallen out of scope

Using my in combination with a use strict; at the top of your Perl scripts means that the interpreter will pick up certain common programming errors. For instance, in the example above, the final print $y would cause a compile-time error and prevent you from running the program. Using strict is highly recommended.

Conditional and looping constructs

Perl has most of the usual conditional and looping constructs. As of Perl 5.10, it even has a case/switch statement (spelled given/when). See Switch Statements in perlsyn for more details.

The conditions can be any Perl expression. See the list of operators in the next section for information on comparison and boolean logic operators, which are commonly used in conditional statements.

  • if

        if ( condition ) {
            ...
        } elsif ( other condition ) {
            ...
        } else {
            ...
        }

    There's also a negated version of it:

        unless ( condition ) {
            ...
        }

    This is provided as a more readable version of if (!condition).

    Note that the braces are required in Perl, even if you've only got one line in the block. However, there is a clever way of making your one-line conditional blocks more English-like:

        # the traditional way
        if ($zippy) {
            print "Yow!";
        }

        # the Perlish post-condition way
        print "Yow!" if $zippy;
        print "We have no bananas" unless $bananas;

  • while

        while ( condition ) {
            ...
        }

    There's also a negated version, for the same reason we have unless:

        until ( condition ) {
            ...
        }

    You can also use while in a post-condition:

        print "LA LA LA\n" while 1;          # loops forever

  • for

    Exactly like C:

        for ($i = 0; $i <= $max; $i++) {
            ...
        }

    The C style for loop is rarely needed in Perl since Perl provides the more friendly list scanning foreach loop.

  • foreach

        foreach (@array) {
            print "This element is $_\n";
        }

        print $list[$_] foreach 0 .. $max;

        # you don't have to use the default $_ either...
        foreach my $key (keys %hash) {
            print "The value of $key is $hash{$key}\n";
        }

    The foreach keyword is actually a synonym for the for keyword. See Foreach Loops in perlsyn.

For more detail on looping constructs (and some that weren't mentioned in this overview) see perlsyn.

Builtin operators and functions

Perl comes with a wide selection of builtin functions. Some of the ones we've already seen include print, sort and reverse. A list of them is given at the start of perlfunc and you can easily read about any given function by using perldoc -f functionname.

Perl operators are documented in full in perlop, but here are a few of the most common ones:

  • Arithmetic

        +   addition
        -   subtraction
        *   multiplication
        /   division

  • Numeric comparison

        ==  equality
        !=  inequality
        <   less than
        >   greater than
        <=  less than or equal
        >=  greater than or equal

  • String comparison

        eq  equality
        ne  inequality
        lt  less than
        gt  greater than
        le  less than or equal
        ge  greater than or equal

    (Why do we have separate numeric and string comparisons? Because we don't have special variable types, and Perl needs to know whether to sort numerically (where 99 is less than 100) or alphabetically (where 100 comes before 99).)

  • Boolean logic

        &&  and
        ||  or
        !   not

    (and, or and not aren't just in the above table as descriptions of the operators. They're also supported as operators in their own right. They're more readable than the C-style operators, but have different precedence to && and friends. Check perlop for more detail.)

  • Miscellaneous

        =   assignment
        .   string concatenation
        x   string multiplication
        ..  range operator (creates a list of numbers)

Many operators can be combined with a = as follows:

    $a += 1;        # same as $a = $a + 1
    $a -= 1;        # same as $a = $a - 1
    $a .= "\n";     # same as $a = $a . "\n";

Files and I/O

You can open a file for input or output using the open() function. It's documented in extravagant detail in perlfunc and perlopentut, but in short:

    open(my $in,  "<",  "input.txt")  or die "Can't open input.txt: $!";
    open(my $out, ">",  "output.txt") or die "Can't open output.txt: $!";
    open(my $log, ">>", "my.log")     or die "Can't open my.log: $!";

You can read from an open filehandle using the <> operator. In scalar context it reads a single line from the filehandle, and in list context it reads the whole file in, assigning each line to an element of the list:

    my $line  = <$in>;
    my @lines = <$in>;

Reading in the whole file at one time is called slurping. It can be useful but it may be a memory hog. Most text file processing can be done a line at a time with Perl's looping constructs.

The <> operator is most often seen in a while loop:

    while (<$in>) {     # assigns each line in turn to $_
        print "Just read in this line: $_";
    }

We've already seen how to print to standard output using print(). However, print() can also take an optional first argument specifying which filehandle to print to:

    print STDERR "This is your final warning.\n";
    print $out $record;
    print $log $logmessage;

When you're done with your filehandles, you should close() them (though to be honest, Perl will clean up after you if you forget):

    close $in or die "$in: $!";

Regular expressions

Perl's regular expression support is both broad and deep, and is the subject of lengthy documentation in perlrequick, perlretut, and elsewhere. However, in short:

  • Simple matching

        if (/foo/)       { ... }  # true if $_ contains "foo"
        if ($a =~ /foo/) { ... }  # true if $a contains "foo"

    The // matching operator is documented in perlop. It operates on $_ by default, or can be bound to another variable using the =~ binding operator (also documented in perlop).

  • Simple substitution

        s/foo/bar/;               # replaces foo with bar in $_
        $a =~ s/foo/bar/;         # replaces foo with bar in $a
        $a =~ s/foo/bar/g;        # replaces ALL INSTANCES of foo with bar
                                  # in $a

    The s/// substitution operator is documented in perlop.

  • More complex regular expressions

    You don't just have to match on fixed strings. In fact, you can match on just about anything you could dream of by using more complex regular expressions. These are documented at great length in perlre, but for the meantime, here's a quick cheat sheet:

        .                   a single character
        \s                  a whitespace character (space, tab, newline, ...)
        \S                  non-whitespace character
        \d                  a digit (0-9)
        \D                  a non-digit
        \w                  a word character (a-z, A-Z, 0-9, _)
        \W                  a non-word character
        [aeiou]             matches a single character in the given set
        [^aeiou]            matches a single character outside the given set
        (foo|bar|baz)       matches any of the alternatives specified
        ^                   start of string
        $                   end of string

    Quantifiers can be used to specify how many of the previous thing you want to match on, where "thing" means either a literal character, one of the metacharacters listed above, or a group of characters or metacharacters in parentheses.

        *                   zero or more of the previous thing
        +                   one or more of the previous thing
        ?                   zero or one of the previous thing
        {3}                 matches exactly 3 of the previous thing
        {3,6}               matches between 3 and 6 of the previous thing
        {3,}                matches 3 or more of the previous thing

    Some brief examples:

        /^\d+/              string starts with one or more digits
        /^$/                nothing in the string (start and end are adjacent)
        /(\d\s){3}/         three digits, each followed by a whitespace
                            character (eg "3 4 5 ")
        /(a.)+/             matches a string in which every odd-numbered
                            letter is a (eg "abacadaf")

        # This loop reads from STDIN, and prints non-blank lines:
        while (<>) {
            next if /^$/;
            print;
        }

  • Parentheses for capturing

    As well as grouping, parentheses serve a second purpose. They can be used to capture the results of parts of the regexp match for later use. The results end up in $1, $2 and so on.

        # a cheap and nasty way to break an email address up into parts
        if ($email =~ /([^@]+)@(.+)/) {
            print "Username is $1\n";
            print "Hostname is $2\n";
        }

  • Other regexp features

    Perl regexps also support backreferences, lookaheads, and all kinds of other complex details. Read all about them in perlrequick, perlretut, and perlre.

Writing subroutines

Writing subroutines is easy:

    sub logger {
        my $logmessage = shift;
        open my $logfile, ">>", "my.log" or die "Could not open my.log: $!";
        print $logfile $logmessage;
    }

Now we can use the subroutine just as any other built-in function:

    logger("We have a logger subroutine!");

What's that shift? Well, the arguments to a subroutine are available to us as a special array called @_ (see perlvar for more on that). The default argument to the shift function just happens to be @_. So my $logmessage = shift; shifts the first item off the list of arguments and assigns it to $logmessage.

We can manipulate @_ in other ways too:

    my ($logmessage, $priority) = @_;       # common
    my $logmessage = $_[0];                 # uncommon, and ugly

Subroutines can also return values:

    sub square {
        my $num = shift;
        my $result = $num * $num;
        return $result;
    }

Then use it like:

    $sq = square(8);

For more information on writing subroutines, see perlsub.

OO Perl

OO Perl is relatively simple and is implemented using references which know what sort of object they are based on Perl's concept of packages. However, OO Perl is largely beyond the scope of this document. Read perlootut and perlobj.

As a beginning Perl programmer, your most common use of OO Perl will be in using third-party modules, which are documented below.

Using Perl modules

Perl modules provide a range of features to help you avoid reinventing the wheel, and can be downloaded from CPAN ( http://www.cpan.org/ ). A number of popular modules are included with the Perl distribution itself.

Categories of modules range from text manipulation to network protocols to database integration to graphics. A categorized list of modules is also available from CPAN.

To learn how to install modules you download from CPAN, read perlmodinstall.

To learn how to use a particular module, use perldoc Module::Name. Typically you will want to use Module::Name, which will then give you access to exported functions or an OO interface to the module.

perlfaq contains questions and answers related to many common tasks, and often provides suggestions for good CPAN modules to use.

perlmod describes Perl modules in general. perlmodlib lists the modules which came with your Perl installation.

If you feel the urge to write Perl modules, perlnewmod will give you good advice.

AUTHOR

Kirrily "Skud" Robert <skud@cpan.org>

 
perliol

NAME

perliol - C API for Perl's implementation of IO in Layers.

SYNOPSIS

    /* Defining a layer ... */
    #include <perliol.h>

DESCRIPTION

This document describes the behavior and implementation of the PerlIO abstraction described in perlapio when USE_PERLIO is defined (and USE_SFIO is not).

History and Background

The PerlIO abstraction was introduced in perl5.003_02 but languished as just an abstraction until perl5.7.0. However during that time a number of perl extensions switched to using it, so the API is mostly fixed to maintain (source) compatibility.

The aim of the implementation is to provide the PerlIO API in a flexible and platform neutral manner. It is also a trial of an "Object Oriented C, with vtables" approach which may be applied to Perl 6.

Basic Structure

PerlIO is a stack of layers.

The low levels of the stack work with the low-level operating system calls (file descriptors in C) getting bytes in and out, the higher layers of the stack buffer, filter, and otherwise manipulate the I/O, and return characters (or bytes) to Perl. Terms above and below are used to refer to the relative positioning of the stack layers.

A layer contains a "vtable", the table of I/O operations (at C level a table of function pointers), and status flags. The functions in the vtable implement operations like "open", "read", and "write".

When I/O, for example "read", is requested, the request goes from Perl first down the stack using "read" functions of each layer, then at the bottom the input is requested from the operating system services, then the result is returned up the stack, finally being interpreted as Perl data.

The requests do not necessarily go always all the way down to the operating system: that's where PerlIO buffering comes into play.

When you do an open() and specify extra PerlIO layers to be deployed, the layers you specify are "pushed" on top of the already existing default stack. One way to see it is that "operating system is on the left" and "Perl is on the right".

What exact layers are in this default stack depends on a lot of things: your operating system, Perl version, Perl compile time configuration, and Perl runtime configuration. See PerlIO, PERLIO in perlrun, and open for more information.

binmode() operates similarly to open(): by default the specified layers are pushed on top of the existing stack.

However, note that even as the specified layers are "pushed on top" for open() and binmode(), this doesn't mean that the effects are limited to the "top": PerlIO layers can be very 'active' and inspect and affect layers also deeper in the stack. As an example there is a layer called "raw" which repeatedly "pops" layers until it reaches the first layer that has declared itself capable of handling binary data. The "pushed" layers are processed in left-to-right order.

sysopen() operates (unsurprisingly) at a lower level in the stack than open(). For example in Unix or Unix-like systems sysopen() operates directly at the level of file descriptors: in the terms of PerlIO layers, it uses only the "unix" layer, which is a rather thin wrapper on top of the Unix file descriptors.

Layers vs Disciplines

Initial discussion of the ability to modify IO streams behaviour used the term "discipline" for the entities which were added. This came (I believe) from the use of the term in "sfio", which in turn borrowed it from "line disciplines" on Unix terminals. However, this document (and the C code) uses the term "layer".

This is, I hope, a natural term given the implementation, and should avoid connotations that are inherent in earlier uses of "discipline" for things which are rather different.

Data Structures

The basic data structure is a PerlIOl:

    typedef struct _PerlIO PerlIOl;
    typedef struct _PerlIO_funcs PerlIO_funcs;
    typedef PerlIOl *PerlIO;

    struct _PerlIO
    {
        PerlIOl *      next;   /* Lower layer */
        PerlIO_funcs * tab;    /* Functions for this layer */
        IV             flags;  /* Various flags for state */
    };

A PerlIOl * is a pointer to the struct, and the application level PerlIO * is a pointer to a PerlIOl * - i.e. a pointer to a pointer to the struct. This allows the application level PerlIO * to remain constant while the actual PerlIOl * underneath changes. (Compare perl's SV * which remains constant while its sv_any field changes as the scalar's type changes.) An IO stream is then in general represented as a pointer to this linked-list of "layers".

It should be noted that because of the double indirection in a PerlIO *, a &(perlio->next) "is" a PerlIO *, and so to some degree at least one layer can use the "standard" API on the next layer down.
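
The double indirection can be tried out with a stripped-down model. The sketch below is illustrative only: the toy_ types are simplified stand-ins for the structs above (the function table is reduced to a name), and toy_push_layer and toy_depth are hypothetical helpers, not part of the PerlIO API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for PerlIOl and PerlIO; the toy_ prefix marks
 * everything here as illustrative rather than real PerlIO code. */
typedef struct toy_PerlIO toy_PerlIOl;
typedef toy_PerlIOl *toy_PerlIO;

struct toy_PerlIO {
    toy_PerlIOl *next;   /* lower layer */
    const char  *name;   /* stands in for the "tab" function table */
};

/* Push a layer on top of handle f. The caller's toy_PerlIO * keeps its
 * value; only the toy_PerlIOl * it points at changes. */
static void toy_push_layer(toy_PerlIO *f, toy_PerlIOl *layer)
{
    assert(f != NULL && layer != NULL);
    layer->next = *f;
    *f = layer;
}

/* Count layers by repeatedly taking &(layer->next). Each such address is
 * itself a toy_PerlIO *, so it can be handed to anything that expects a
 * handle, including this function. */
static int toy_depth(toy_PerlIO *f)
{
    int n = 0;

    assert(f != NULL);
    while (*f != NULL) {
        n++;
        f = &(*f)->next;
    }
    return n;
}
```

With a "unix" layer and then a "perlio" layer pushed onto one slot, toy_depth(f) reports two layers, while toy_depth(&(*f)->next) treats the lower layer as a handle in its own right and reports one.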

A "layer" is composed of two parts:

1.

The functions and attributes of the "layer class".

2.

The per-instance data for a particular handle.

Functions and Attributes

The functions and attributes are accessed via the "tab" (for table) member of PerlIOl. The functions (methods of the layer "class") are fixed, and are defined by the PerlIO_funcs type. They are broadly the same as the public PerlIO_xxxxx functions:

    struct _PerlIO_funcs
    {
        Size_t     fsize;
        char *     name;
        Size_t     size;
        IV         kind;
        IV         (*Pushed)(pTHX_ PerlIO *f, const char *mode, SV *arg,
                             PerlIO_funcs *tab);
        IV         (*Popped)(pTHX_ PerlIO *f);
        PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                           PerlIO_list_t *layers, IV n,
                           const char *mode,
                           int fd, int imode, int perm,
                           PerlIO *old,
                           int narg, SV **args);
        IV         (*Binmode)(pTHX_ PerlIO *f);
        SV *       (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags);
        IV         (*Fileno)(pTHX_ PerlIO *f);
        PerlIO *   (*Dup)(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param,
                          int flags);
        /* Unix-like functions - cf sfio line disciplines */
        SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
        SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
        IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
        Off_t      (*Tell)(pTHX_ PerlIO *f);
        IV         (*Close)(pTHX_ PerlIO *f);
        /* Stdio-like buffered IO functions */
        IV         (*Flush)(pTHX_ PerlIO *f);
        IV         (*Fill)(pTHX_ PerlIO *f);
        IV         (*Eof)(pTHX_ PerlIO *f);
        IV         (*Error)(pTHX_ PerlIO *f);
        void       (*Clearerr)(pTHX_ PerlIO *f);
        void       (*Setlinebuf)(pTHX_ PerlIO *f);
        /* Perl's snooping functions */
        STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
        Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
        STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
        SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
        void       (*Set_ptrcnt)(pTHX_ PerlIO *f, STDCHAR *ptr, SSize_t cnt);
    };

The first few members of the struct give a function table size for compatibility check, the "name" for the layer, the size to malloc for the per-instance data, and some flags which are attributes of the class as a whole (such as whether it is a buffering layer). The functions then follow; they fall into four basic groups:

1.

Opening and setup functions

2.

Basic IO operations

3.

Stdio class buffering options.

4.

Functions to support Perl's traditional "fast" access to the buffer.

A layer does not have to implement all the functions, but the whole table has to be present. Unimplemented slots can be NULL (which will result in an error when called) or can be filled in with stubs to "inherit" behaviour from a "base class". This "inheritance" is fixed for all instances of the layer, but as the layer chooses which stubs to populate the table, limited "multiple inheritance" is possible.
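
As a rough illustration of NULL slots and "inherited" stubs, consider the following standalone sketch. It assumes a two-slot table instead of the full PerlIO_funcs, and every name in it (toy_funcs, toy_fileno, toy_base_fileno and so on) is hypothetical; returning -1 for a NULL slot is simply this sketch's way of signalling the error.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_layer toy_layer;

/* A two-slot function table; the real PerlIO_funcs has many more slots. */
typedef struct {
    const char *name;
    int (*Fileno)(toy_layer *l);
    int (*Flush)(toy_layer *l);
} toy_funcs;

struct toy_layer {
    toy_layer       *next;  /* lower layer */
    const toy_funcs *tab;   /* functions for this layer */
    int              fd;    /* only meaningful in the bottom layer */
};

/* Dispatchers: calling through a NULL slot is reported as an error (-1). */
static int toy_fileno(toy_layer *l)
{
    if (l == NULL || l->tab->Fileno == NULL)
        return -1;
    return l->tab->Fileno(l);
}

static int toy_flush(toy_layer *l)
{
    if (l == NULL || l->tab->Flush == NULL)
        return -1;
    return l->tab->Flush(l);
}

/* "Base class" stub: a layer with nothing of its own to say about the
 * file number defers to the layer below it. */
static int toy_base_fileno(toy_layer *l)
{
    assert(l != NULL);
    return toy_fileno(l->next);
}

static int toy_unix_fileno(toy_layer *l) { return l->fd; }
static int toy_buf_flush(toy_layer *l)   { (void)l; return 0; }

/* The whole table is always present; the "unix" toy leaves Flush NULL,
 * the buffering toy fills Fileno with the inherited stub. */
static const toy_funcs toy_unix_tab = { "unix",   toy_unix_fileno, NULL };
static const toy_funcs toy_buf_tab  = { "perlio", toy_base_fileno, toy_buf_flush };
```

Stacking a toy_buf_tab layer on a toy_unix_tab layer whose fd is 7, toy_fileno() on the top layer yields 7 via the stub, while toy_flush() on the bottom layer hits the NULL slot and fails.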

Per-instance Data

The per-instance data are held in memory beyond the basic PerlIOl struct, by making a PerlIOl the first member of the layer's struct thus:

    typedef struct
    {
        struct _PerlIO base;       /* Base "class" info */
        STDCHAR *      buf;        /* Start of buffer */
        STDCHAR *      end;        /* End of valid part of buffer */
        STDCHAR *      ptr;        /* Current position in buffer */
        Off_t          posn;       /* Offset of buf into the file */
        Size_t         bufsiz;     /* Real size of buffer */
        IV             oneword;    /* Emergency buffer */
    } PerlIOBuf;

In this way (as for perl's scalars) a pointer to a PerlIOBuf can be treated as a pointer to a PerlIOl.
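
The "first member" trick is ordinary C and can be shown in isolation. The sketch below uses hypothetical toy_ types with only a couple of fields each; it is meant to show the pointer conversion, not the real PerlIOBuf layout.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the base struct shared by every layer. */
typedef struct toy_base {
    struct toy_base *next;
    long             flags;
} toy_base;

/* Stand-in for a buffering layer: the base struct comes first, the
 * per-instance data follows it in the same allocation. */
typedef struct {
    toy_base base;      /* must be the first member */
    char    *buf;       /* start of buffer */
    size_t   bufsiz;    /* real size of buffer */
} toy_buf_layer;

/* Generic code sees only the base part. */
static void toy_set_flag(toy_base *l, long bit)
{
    assert(l != NULL);
    l->flags |= bit;
}

/* Layer-specific code converts the pointer back to reach its own data.
 * This is valid because a pointer to a struct, suitably converted,
 * points to its first member, and vice versa. */
static size_t toy_bufsiz(toy_base *l)
{
    assert(l != NULL);
    return ((toy_buf_layer *)l)->bufsiz;
}
```

Given a toy_buf_layer b, the expression (toy_base *)&b can be passed to toy_set_flag(), and the same pointer passed to toy_bufsiz() recovers b.bufsiz.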

Layers in action.

                 table           perlio          unix
             |           |
             +-----------+    +----------+    +--------+
    PerlIO ->|           |--->|  next    |--->|  NULL  |
             +-----------+    +----------+    +--------+
             |           |    |  buffer  |    |   fd   |
             +-----------+    |          |    +--------+
             |           |    +----------+

The above attempts to show how the layer scheme works in a simple case. The application's PerlIO * points to an entry in the table(s) representing open (allocated) handles. For example, the first three slots in the table correspond to stdin, stdout and stderr. The table in turn points to the current "top" layer for the handle - in this case an instance of the generic buffering layer "perlio". That layer in turn points to the next layer down - in this case the low-level "unix" layer.

The above is roughly equivalent to a "stdio" buffered stream, but with much more flexibility:

  • If Unix level read/write/lseek is not appropriate for (say) sockets then the "unix" layer can be replaced (at open time or even dynamically) with a "socket" layer.

  • Different handles can have different buffering schemes. The "top" layer could be the "mmap" layer if reading disk files was quicker using mmap than read. An "unbuffered" stream can be implemented simply by not having a buffer layer.

  • Extra layers can be inserted to process the data as it flows through. This was the driving need for including the scheme in perl 5.7.0+ - we needed a mechanism to allow data to be translated between perl's internal encoding (conceptually at least Unicode as UTF-8), and the "native" format used by the system. This is provided by the ":encoding(xxxx)" layer which typically sits above the buffering layer.

  • A layer can be added that does "\n" to CRLF translation. This layer can be used on any platform, not just those that normally do such things.
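
To make the stacking concrete, the following sketch models two layers in plain C. Everything here is hypothetical (`Layer`, `SinkLayer`, `crlf_write`, `stack_write`); the bottom layer stores bytes in memory instead of writing to a file descriptor, and the top layer does the "\n" to CRLF translation described above.

```c
#include <stddef.h>
#include <string.h>

typedef struct Layer Layer;
struct Layer {
    Layer  *next;                                   /* next layer down, or NULL */
    size_t (*write)(Layer *self, const char *p, size_t n);
};

/* Bottom layer: stands in for "unix", but keeps the bytes in memory. */
typedef struct {
    Layer  base;
    char   data[64];
    size_t len;
} SinkLayer;

static size_t sink_write(Layer *self, const char *p, size_t n)
{
    SinkLayer *s = (SinkLayer *)self;
    if (n > sizeof(s->data) - s->len)
        n = sizeof(s->data) - s->len;               /* truncate when full */
    memcpy(s->data + s->len, p, n);
    s->len += n;
    return n;
}

/* Translating layer: each "\n" becomes CR,LF on the way down. */
static size_t crlf_write(Layer *self, const char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (p[i] == '\n')
            self->next->write(self->next, "\r", 1);
        self->next->write(self->next, &p[i], 1);
    }
    return n;
}

/* Write to whichever layer is currently on top of the stack. */
static size_t stack_write(Layer *top, const char *p, size_t n)
{
    return top->write(top, p, n);
}
```

Writing through the translating layer and writing straight to the sink use the same `stack_write` call; only the choice of top layer differs.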

Per-instance flag bits

The generic flag bits are a hybrid of O_XXXXX style flags deduced from the mode string passed to PerlIO_open() , and state bits for typical buffer layers.

  • PERLIO_F_EOF

    End of file.

  • PERLIO_F_CANWRITE

    Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.

  • PERLIO_F_CANREAD

    Reads are permitted i.e. opened "r" or "w+" (or even "a+" - ick).

  • PERLIO_F_ERROR

    An error has occurred (for PerlIO_error() ).

  • PERLIO_F_TRUNCATE

    Truncate file suggested by open mode.

  • PERLIO_F_APPEND

    All writes should be appends.

  • PERLIO_F_CRLF

    Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF mapped to "\n" for input. Normally the provided "crlf" layer is the only layer that need bother about this. PerlIO_binmode() will mess with this flag rather than add/remove layers if the PERLIO_K_CANCRLF bit is set for the layer's class.

  • PERLIO_F_UTF8

    Data written to this layer should be UTF-8 encoded; data provided by this layer should be considered UTF-8 encoded. Can be set on any layer by ":utf8" dummy layer. Also set on ":encoding" layer.

  • PERLIO_F_UNBUF

    Layer is unbuffered - i.e. write to next layer down should occur for each write to this layer.

  • PERLIO_F_WRBUF

    The buffer for this layer currently holds data written to it but not sent to next layer.

  • PERLIO_F_RDBUF

    The buffer for this layer currently holds unconsumed data read from layer below.

  • PERLIO_F_LINEBUF

    Layer is line buffered. Write data should be passed to next layer down whenever a "\n" is seen. Any data beyond the "\n" should then be processed.

  • PERLIO_F_TEMP

    File has been unlink()ed, or should be deleted on close().

  • PERLIO_F_OPEN

    Handle is open.

  • PERLIO_F_FASTGETS

    This instance of this layer supports the "fast gets" interface. Normally set based on PERLIO_K_FASTGETS for the class and by the existence of the function(s) in the table. However a class that normally provides that interface may need to avoid it on a particular instance. The "pending" layer needs to do this when it is pushed above a layer which does not support the interface. (Perl's sv_gets() does not expect the stream's fast gets behaviour to change during one "get".)
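
The way the open-mode part of these flags is deduced can be sketched as follows. This is an assumption-laden model, not the PerlIOBase_pushed() code: the `F_*` values are invented, only the read/write/truncate/append bits are covered, and the special mode prefixes are not handled.

```c
#include <stddef.h>

/* Illustrative bit values only; the real PERLIO_F_* constants differ. */
enum {
    F_CANREAD  = 1 << 0,
    F_CANWRITE = 1 << 1,
    F_TRUNCATE = 1 << 2,
    F_APPEND   = 1 << 3
};

/* Map an fopen()-like mode string to flag bits; returns -1 if malformed. */
static int flags_from_mode(const char *mode)
{
    int flags;
    if (mode == NULL)
        return -1;
    switch (*mode++) {
    case 'r': flags = F_CANREAD;               break;
    case 'w': flags = F_CANWRITE | F_TRUNCATE; break;
    case 'a': flags = F_CANWRITE | F_APPEND;   break;
    default:  return -1;
    }
    if (*mode == '+') {                 /* "+" permits both directions */
        flags |= F_CANREAD | F_CANWRITE;
        mode++;
    }
    if (*mode == 'b' || *mode == 't')   /* binary/text suffix sets no flag here */
        mode++;
    return *mode == '\0' ? flags : -1;
}
```

With this mapping "a+" yields the read, write and append bits together, which is the combination the PERLIO_F_CANREAD entry calls "ick".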

Methods in Detail

  • fsize
        Size_t fsize;

    Size of the function table. This is compared against the value PerlIO code "knows" as a compatibility check. Future versions may be able to tolerate layers compiled against an old version of the headers.

  • name
        char * name;

    The name of the layer whose open() method Perl should invoke on open(). For example if the layer is called APR, you will call:

        open $fh, ">:APR", ...

    and Perl knows that it has to invoke the PerlIOAPR_open() method implemented by the APR layer.

  • size
        Size_t size;

    The size of the per-instance data structure, e.g.:

        sizeof(PerlIOAPR)

    If this field is zero then PerlIO_pushed does not malloc anything and assumes the layer's Pushed function will do any required layer stack manipulation; this is used to avoid malloc/free overhead for dummy layers. If the field is non-zero it must be at least the size of PerlIOl; PerlIO_pushed will allocate memory for the layer's data structures and link the new layer onto the stream's stack. (If the layer's Pushed method returns an error indication the layer is popped again.)

  • kind
        IV kind;
    • PERLIO_K_BUFFERED

      The layer is buffered.

    • PERLIO_K_RAW

      The layer is acceptable to have in a binmode(FH) stack - i.e. it does not (or will configure itself not to) transform bytes passing through it.

    • PERLIO_K_CANCRLF

      Layer can translate between "\n" and CRLF line ends.

    • PERLIO_K_FASTGETS

      Layer allows buffer snooping.

    • PERLIO_K_MULTIARG

      Used when the layer's open() accepts more arguments than usual. The extra arguments should not come before the MODE argument. When this flag is used it's up to the layer to validate the args.

  • Pushed
        IV (*Pushed)(pTHX_ PerlIO *f, const char *mode, SV *arg);

    The only absolutely mandatory method. Called when the layer is pushed onto the stack. The mode argument may be NULL if this occurs post-open. The arg will be non-NULL if an argument string was passed. In most cases this should call PerlIOBase_pushed() to convert mode into the appropriate PERLIO_F_XXXXX flags in addition to any actions the layer itself takes. If a layer is not expecting an argument it need neither save the one passed to it, nor provide Getarg() (it could perhaps Perl_warn that the argument was un-expected).

    Returns 0 on success. On failure returns -1 and should set errno.

  • Popped
        IV (*Popped)(pTHX_ PerlIO *f);

    Called when the layer is popped from the stack. A layer will normally be popped after Close() is called. But a layer can be popped without being closed if the program is dynamically managing layers on the stream. In such cases Popped() should free any resources (buffers, translation tables, ...) not held directly in the layer's struct. It should also Unread() any unconsumed data that has been read and buffered from the layer below back to that layer, so that it can be re-provided to what ever is now above.

    Returns 0 on both success and failure. If Popped() returns true then perlio.c assumes that either the layer has popped itself, or the layer is super special and needs to be retained for other reasons. In most cases it should return false.

  • Open
        PerlIO * (*Open)(...);

    The Open() method has lots of arguments because it combines the functions of perl's open, PerlIO_open , perl's sysopen, PerlIO_fdopen and PerlIO_reopen . The full prototype is as follows:

        PerlIO * (*Open)(pTHX_ PerlIO_funcs *tab,
                         PerlIO_list_t *layers, IV n,
                         const char *mode,
                         int fd, int imode, int perm,
                         PerlIO *old,
                         int narg, SV **args);

    Open should (perhaps indirectly) call PerlIO_allocate() to allocate a slot in the table and associate it with the layers information for the opened file, by calling PerlIO_push . The layers is an array of all the layers destined for the PerlIO * , and any arguments passed to them, n is the index into that array of the layer being called. The macro PerlIOArg will return a (possibly NULL ) SV * for the argument passed to the layer.

    The mode string is an "fopen() -like" string which would match the regular expression /^[I#]?[rwa]\+?[bt]?$/ .

    The 'I' prefix is used during creation of stdin ..stderr via special PerlIO_fdopen calls; the '#' prefix means that this is sysopen and that imode and perm should be passed to PerlLIO_open3 ; 'r' means read, 'w' means write and 'a' means append. The '+' suffix means that both reading and writing/appending are permitted. The 'b' suffix means file should be binary, and 't' means it is text. (Almost all layers should do the IO in binary mode, and ignore the b/t bits. The :crlf layer should be pushed to handle the distinction.)

    If old is not NULL then this is a PerlIO_reopen . Perl itself does not use this (yet?) and semantics are a little vague.

    If fd is not negative then it is the numeric file descriptor fd, which will be open in a manner compatible with the supplied mode string; the call is thus equivalent to PerlIO_fdopen. In this case narg will be zero.

    If narg is greater than zero then it gives the number of arguments passed to open, otherwise it will be 1 if for example PerlIO_open was called. In simple cases SvPV_nolen(*args) is the pathname to open.

    If a layer provides Open() it should normally call the Open() method of next layer down (if any) and then push itself on top if that succeeds. PerlIOBase_open is provided to do exactly that, so in most cases you don't have to write your own Open() method. If this method is not defined, other layers may have difficulty pushing themselves on top of it during open.

    If PerlIO_push was performed and the open has failed, the layer must PerlIO_pop itself; otherwise it stays on the stack and may cause serious problems.

    Returns NULL on failure.

  • Binmode
        IV (*Binmode)(pTHX_ PerlIO *f);

    Optional. Used when :raw layer is pushed (explicitly or as a result of binmode(FH)). If not present layer will be popped. If present should configure layer as binary (or pop itself) and return 0. If it returns -1 for error binmode will fail with layer still on the stack.

  • Getarg
        SV * (*Getarg)(pTHX_ PerlIO *f,
                       CLONE_PARAMS *param, int flags);

    Optional. If present should return an SV * representing the string argument passed to the layer when it was pushed. e.g. ":encoding(ascii)" would return an SvPV with value "ascii". (param and flags arguments can be ignored in most cases)

    Dup uses Getarg to retrieve the argument originally passed to Pushed , so you must implement this function if your layer has an extra argument to Pushed and will ever be Dup ed.

  • Fileno
        IV (*Fileno)(pTHX_ PerlIO *f);

    Returns the Unix/Posix numeric file descriptor for the handle. Normally PerlIOBase_fileno() (which just asks next layer down) will suffice for this.

    Returns -1 on error, which is considered to include the case where the layer cannot provide such a file descriptor.

  • Dup
        PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
                        CLONE_PARAMS *param, int flags);

    XXX: Needs more docs.

    Used as part of the "clone" process when a thread is spawned (in which case param will be non-NULL) and when a stream is being duplicated via '&' in the open.

    Similar to Open , returns PerlIO* on success, NULL on failure.

  • Read
        SSize_t (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);

    Basic read operation.

    Typically will call Fill and manipulate pointers (possibly via the API). PerlIOBuf_read() may be suitable for derived classes which provide "fast gets" methods.

    Returns actual bytes read, or -1 on an error.

  • Unread
        SSize_t (*Unread)(pTHX_ PerlIO *f,
                          const void *vbuf, Size_t count);

    A superset of stdio's ungetc() . Should arrange for future reads to see the bytes in vbuf . If there is no obviously better implementation then PerlIOBase_unread() provides the function by pushing a "fake" "pending" layer above the calling layer.

    Returns the number of unread chars.

  • Write
        SSize_t (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);

    Basic write operation.

    Returns bytes written or -1 on an error.

  • Seek
        IV (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);

    Position the file pointer. Should normally call its own Flush method and then the Seek method of next layer down.

    Returns 0 on success, -1 on failure.

  • Tell
        Off_t (*Tell)(pTHX_ PerlIO *f);

    Return the file pointer. May be based on the layer's cached concept of position to avoid overhead.

    Returns -1 on failure to get the file pointer.

  • Close
        IV (*Close)(pTHX_ PerlIO *f);

    Close the stream. Should normally call PerlIOBase_close() to flush itself and close layers below, and then deallocate any data structures (buffers, translation tables, ...) not held directly in the data structure.

    Returns 0 on success, -1 on failure.

  • Flush
        IV (*Flush)(pTHX_ PerlIO *f);

    Should make stream's state consistent with layers below. That is, any buffered write data should be written, and file position of lower layers adjusted for data read from below but not actually consumed. (Should perhaps Unread() such data to the lower layer.)

    Returns 0 on success, -1 on failure.

  • Fill
        IV (*Fill)(pTHX_ PerlIO *f);

    The buffer for this layer should be filled (for read) from layer below. When you "subclass" PerlIOBuf layer, you want to use its _read method and to supply your own fill method, which fills the PerlIOBuf's buffer.

    Returns 0 on success, -1 on failure.

  • Eof
        IV (*Eof)(pTHX_ PerlIO *f);

    Return end-of-file indicator. PerlIOBase_eof() is normally sufficient.

    Returns a true (non-zero) value at end-of-file, 0 if not at end-of-file, -1 on error.

  • Error
        IV (*Error)(pTHX_ PerlIO *f);

    Return error indicator. PerlIOBase_error() is normally sufficient.

    Returns 1 if there is an error (usually when PERLIO_F_ERROR is set), 0 otherwise.

  • Clearerr
        void (*Clearerr)(pTHX_ PerlIO *f);

    Clear end-of-file and error indicators. Should call PerlIOBase_clearerr() to set the PERLIO_F_XXXXX flags, which may suffice.

  • Setlinebuf
        void (*Setlinebuf)(pTHX_ PerlIO *f);

    Mark the stream as line buffered. PerlIOBase_setlinebuf() sets the PERLIO_F_LINEBUF flag and is normally sufficient.

  • Get_base
        STDCHAR * (*Get_base)(pTHX_ PerlIO *f);

    Allocate (if not already done so) the read buffer for this layer and return pointer to it. Return NULL on failure.

  • Get_bufsiz
        Size_t (*Get_bufsiz)(pTHX_ PerlIO *f);

    Return the number of bytes that last Fill() put in the buffer.

  • Get_ptr
        STDCHAR * (*Get_ptr)(pTHX_ PerlIO *f);

    Return the current read pointer relative to this layer's buffer.

  • Get_cnt
        SSize_t (*Get_cnt)(pTHX_ PerlIO *f);

    Return the number of bytes left to be read in the current buffer.

  • Set_ptrcnt
        void (*Set_ptrcnt)(pTHX_ PerlIO *f,
                           STDCHAR *ptr, SSize_t cnt);

    Adjust the read pointer and count of bytes to match ptr and/or cnt . The application (or layer above) must ensure they are consistent. (Checking is allowed by the paranoid.)
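
The relationship between Read, Fill and the buffer "snooping" methods can be sketched with a toy reader. None of this is the real PerlIOBuf code: the `Reader` struct and the `rd_*` functions are invented for illustration, and the "layer below" is simply a string in memory.

```c
#include <stddef.h>
#include <string.h>

/* Toy buffered reader; field names loosely follow PerlIOBuf but this
 * is not the real structure. */
typedef struct {
    const char *src;        /* stands in for the layer below */
    size_t      src_left;
    char        buf[4];     /* deliberately tiny to force refills */
    char       *ptr;        /* current read position in buf */
    size_t      cnt;        /* bytes left to read in buf */
} Reader;

/* Fill: refill buf from "below"; 0 on success, -1 when nothing is left. */
static int rd_fill(Reader *r)
{
    size_t n = r->src_left < sizeof(r->buf) ? r->src_left : sizeof(r->buf);
    if (n == 0)
        return -1;
    memcpy(r->buf, r->src, n);
    r->src += n;
    r->src_left -= n;
    r->ptr = r->buf;
    r->cnt = n;
    return 0;
}

/* The accessors a caller uses to look inside the buffer. */
static char  *rd_get_ptr(Reader *r) { return r->ptr; }
static size_t rd_get_cnt(Reader *r) { return r->cnt; }
static void   rd_set_ptrcnt(Reader *r, char *ptr, size_t cnt)
{
    r->ptr = ptr;
    r->cnt = cnt;
}

/* Read implemented purely in terms of Fill and the accessors. */
static long rd_read(Reader *r, char *out, size_t count)
{
    size_t done = 0;
    while (done < count) {
        size_t avail = rd_get_cnt(r);
        size_t take;
        if (avail == 0) {
            if (rd_fill(r) != 0)
                break;              /* end of data */
            avail = rd_get_cnt(r);
        }
        take = avail < count - done ? avail : count - done;
        memcpy(out + done, rd_get_ptr(r), take);
        rd_set_ptrcnt(r, rd_get_ptr(r) + take, avail - take);
        done += take;
    }
    return (long)done;
}
```

A derived layer in this model would only need to replace `rd_fill`; the read loop and the accessors stay the same, which is the division of labour the Fill entry describes.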

Utilities

To ask for the next layer down use PerlIONext(PerlIO *f).

To check that a PerlIO* is valid use PerlIOValid(PerlIO *f). (All this does is really just to check that the pointer is non-NULL and that the pointer behind that is non-NULL.)

PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words, the PerlIOl* pointer.

PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either calls the callback from the functions of the layer f (just by the name of the IO function, like "Read") with the args, or, if there is no such callback, calls the base version of the callback with the same args. If f is invalid, it sets errno to EBADF and returns failure.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls the callback of the functions of the layer f with the args, or, if there is no such callback, sets errno to EINVAL. If f is invalid, it sets errno to EBADF and returns failure.

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls the callback of the functions of the layer f with the args, or, if there is no such callback, calls the base version of the callback with the same args. If f is invalid, it sets errno to EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the callback of the functions of the layer f with the args, or, if there is no such callback, sets errno to EINVAL. If f is invalid, it sets errno to EBADF.
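
The dispatch pattern behind these helpers can be modelled in a few lines. The macros below are a simplified sketch written for this example (`CALL_OR_BASE`, `CALL_OR_FAIL`, `Stream` and `Funcs` are invented names); they take no argument list and assume that a missing callback in the "or fail" case also returns the failure value.

```c
#include <stddef.h>
#include <errno.h>

typedef struct Stream Stream;

typedef struct {
    int (*Flush)(Stream *s);        /* may be NULL */
    int (*Seek)(Stream *s);         /* may be NULL */
} Funcs;

struct Stream {
    const Funcs *tab;               /* NULL models an invalid handle */
    int          flushed;
};

/* The "base" implementation used when the layer has no Flush of its own. */
static int base_flush(Stream *s) { s->flushed = 1; return 0; }

/* Use the layer's callback if present, otherwise the base version;
 * an invalid handle sets EBADF and yields the failure value. */
#define CALL_OR_BASE(s, slot, basefn, failure)                   \
    (((s) != NULL && (s)->tab != NULL)                           \
        ? ((s)->tab->slot ? (s)->tab->slot(s) : basefn(s))       \
        : (errno = EBADF, (failure)))

/* No base fallback: a missing callback sets EINVAL instead. */
#define CALL_OR_FAIL(s, slot, failure)                           \
    (((s) != NULL && (s)->tab != NULL)                           \
        ? ((s)->tab->slot ? (s)->tab->slot(s)                    \
                          : (errno = EINVAL, (failure)))         \
        : (errno = EBADF, (failure)))

static int stream_flush(Stream *s) { return CALL_OR_BASE(s, Flush, base_flush, -1); }
static int stream_seek(Stream *s)  { return CALL_OR_FAIL(s, Seek, -1); }
```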

Implementing PerlIO Layers

If you find the implementation document unclear or not sufficient, look at the existing PerlIO layer implementations, which include:

  • C implementations

    The perlio.c and perliol.h in the Perl core implement the "unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending" layers, and also the "mmap" and "win32" layers if applicable. (The "win32" layer is currently unfinished and unused; to see what is used instead in Win32, see Querying the layers of filehandles in PerlIO.)

    PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.

    PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.

  • Perl implementations

    PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.

If you are creating a PerlIO layer, you may want to be lazy, in other words, implement only the methods that interest you. The other methods you can either replace with the "blank" methods

    PerlIOBase_noop_ok
    PerlIOBase_noop_fail

(which do nothing, and return zero and -1, respectively) or for certain methods you may assume a default behaviour by using a NULL method. The Open method looks for help in the 'parent' layer. The following table summarizes the behaviour:

    method      behaviour with NULL

    Clearerr    PerlIOBase_clearerr
    Close       PerlIOBase_close
    Dup         PerlIOBase_dup
    Eof         PerlIOBase_eof
    Error       PerlIOBase_error
    Fileno      PerlIOBase_fileno
    Fill        FAILURE
    Flush       SUCCESS
    Getarg      SUCCESS
    Get_base    FAILURE
    Get_bufsiz  FAILURE
    Get_cnt     FAILURE
    Get_ptr     FAILURE
    Open        INHERITED
    Popped      SUCCESS
    Pushed      SUCCESS
    Read        PerlIOBase_read
    Seek        FAILURE
    Set_cnt     FAILURE
    Set_ptrcnt  FAILURE
    Setlinebuf  PerlIOBase_setlinebuf
    Tell        FAILURE
    Unread      PerlIOBase_unread
    Write       FAILURE

    FAILURE     Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS) and
                return -1 (for numeric return values) or NULL (for pointers)
    INHERITED   Inherited from the layer below
    SUCCESS     Return 0 (for numeric return values) or a pointer

Core Layers

The file perlio.c provides the following layers:

  • "unix"

    A basic non-buffered layer which calls Unix/POSIX read(), write(), lseek() , close(). No buffering. Even on platforms that distinguish between O_TEXT and O_BINARY this layer is always O_BINARY.

  • "perlio"

    A very complete generic buffering layer which provides the whole of the PerlIO API. It is also intended to be used as a "base class" for other layers. (For example its Read() method is implemented in terms of the Get_cnt()/Get_ptr()/Set_ptrcnt() methods.)

    "perlio" over "unix" provides a complete replacement for stdio as seen via the PerlIO API. This is the default for USE_PERLIO on systems whose stdio does not permit perl's "fast gets" access and which do not distinguish between O_TEXT and O_BINARY.

  • "stdio"

    A layer which provides the PerlIO API via the layer scheme, but implements it by calling the system's stdio. This is (currently) the default on systems whose stdio provides sufficient access to allow perl's "fast gets" access and which do not distinguish between O_TEXT and O_BINARY.

  • "crlf"

    A layer derived using "perlio" as a base class. It provides Win32-like "\n" to CR,LF translation. Can either be applied above "perlio" or serve as the buffer layer itself. "crlf" over "unix" is the default if system distinguishes between O_TEXT and O_BINARY opens. (At some point "unix" will be replaced by a "native" Win32 IO layer on that platform, as Win32's read/write layer has various drawbacks.) The "crlf" layer is a reasonable model for a layer which transforms data in some way.

  • "mmap"

    If Configure detects mmap() functions this layer is provided (with "perlio" as a "base") which does "read" operations by mmap()ing the file. Performance improvement is marginal on modern systems, so it is mainly there as a proof of concept. It is likely to be unbundled from the core at some point. The "mmap" layer is a reasonable model for a minimalist "derived" layer.

  • "pending"

    An "internal" derivative of "perlio" which can be used to provide Unread() function for layers which have no buffer or cannot be bothered. (Basically this layer's Fill() pops itself off the stack and so resumes reading from layer below.)

  • "raw"

    A dummy layer which never exists on the layer stack. Instead when "pushed" it actually pops the stack removing itself, it then calls Binmode function table entry on all the layers in the stack - normally this (via PerlIOBase_binmode) removes any layers which do not have PERLIO_K_RAW bit set. Layers can modify that behaviour by defining their own Binmode entry.

  • "utf8"

    Another dummy layer. When pushed it pops itself and sets the PERLIO_F_UTF8 flag on the layer which was (and now is once more) the top of the stack.
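
The default effect of pushing "raw" can be pictured with a small model. The sketch below is not the perlio.c implementation: `Layer`, `K_RAW` and `apply_raw` are invented, per-layer Binmode overrides are ignored, and the layers are simply unlinked rather than properly popped and freed.

```c
#include <stddef.h>

enum { K_RAW = 1 << 0 };        /* illustrative value, not PERLIO_K_RAW's */

typedef struct Layer Layer;
struct Layer {
    Layer      *next;           /* next layer down */
    const char *name;
    int         kind;           /* class attribute bits */
};

/* Default policy: unlink every layer whose class lacks the raw bit.
 * Returns the new top of the stack. */
static Layer *apply_raw(Layer *top)
{
    Layer **link = &top;
    while (*link != NULL) {
        if (((*link)->kind & K_RAW) == 0)
            *link = (*link)->next;      /* drop this layer from the chain */
        else
            link = &(*link)->next;
    }
    return top;
}

/* Count the layers remaining on the stack. */
static size_t stack_depth(const Layer *top)
{
    size_t n = 0;
    for (; top != NULL; top = top->next)
        n++;
    return n;
}
```

In this model a stack of two hypothetical filter layers over "perlio" over "unix" collapses to the bottom two layers, leaving a stack that passes bytes through untransformed.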

In addition perlio.c also provides a number of PerlIOBase_xxxx() functions which are intended to be used in the table slots of classes which do not need to do anything special for a particular method.

Extension Layers

Layers can be made available by extension modules. When an unknown layer is encountered the PerlIO code will perform the equivalent of :

    use PerlIO 'layer';

Where layer is the unknown layer. PerlIO.pm will then attempt to:

    require PerlIO::layer;

If after that process the layer is still not defined then the open will fail.

The following extension layers are bundled with perl:

  • ":encoding"
        use PerlIO::encoding;

    makes this layer available, although PerlIO.pm "knows" where to find it. It is an example of a layer which takes an argument as it is called thus:

        open( $fh, "<:encoding(iso-8859-7)", $pathname );

  • ":scalar"

    Provides support for reading data from and writing data to a scalar.

        open( $fh, "+<:scalar", \$scalar );

    When a handle is so opened, then reads get bytes from the string value of $scalar, and writes change the value. In both cases the position in $scalar starts as zero but can be altered via seek, and determined via tell.

    Please note that this layer is implied when calling open() thus:

        open( $fh, "+<", \$scalar );

  • ":via"

    Provided to allow layers to be implemented as Perl code. For instance:

        use PerlIO::via::StripHTML;
        open( my $fh, "<:via(StripHTML)", "index.html" );

    See PerlIO::via for details.

TODO

Things that need to be done to improve this document.

  • Explain how to make a valid fh without going through open() (i.e. apply a layer). For example if the file is not opened through perl, but we want to get back a fh, like it was opened by Perl.

    How does PerlIO_apply_layera fit in, where are its docs, and was it made public?

    Currently the example could be something like this:

        PerlIO *foo_to_PerlIO(pTHX_ char *mode, ...)
        {
            /* mode is "w", "r", etc */
            const char *layers = ":APR"; /* the layer name */
            PerlIO *f = PerlIO_allocate(aTHX);
            if (!f) {
                return NULL;
            }
            PerlIO_apply_layers(aTHX_ f, mode, layers);
            if (f) {
                PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR);
                /* fill in the st struct, as in _open() */
                st->file = file;
                PerlIOBase(f)->flags |= PERLIO_F_OPEN;
                return f;
            }
            return NULL;
        }

  • fix/add the documentation in places marked as XXX.

  • The handling of errors by the layer is not specified. e.g. when $! should be set explicitly, when the error handling should be just delegated to the top layer.

    Probably give some hints on using SETERRNO() or pointers to where they can be found.

  • I think it would help to give some concrete examples to make it easier to understand the API. Of course I agree that the API has to be concise, but since there is no second document that is more of a guide, I think that it'd make it easier to start with the doc which is an API, but has examples in it in places where things are unclear, to a person who is not a PerlIO guru (yet).

 
perlipc

Perl 5 version 18.2 documentation

NAME

perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)

DESCRIPTION

The basic IPC facilities of Perl are built out of the good old Unix signals, named pipes, pipe opens, the Berkeley socket routines, and SysV IPC calls. Each is used in slightly different situations.

Signals

Perl uses a simple signal handling model: the %SIG hash contains names or references of user-installed signal handlers. These handlers will be called with an argument which is the name of the signal that triggered it. A signal may be generated intentionally from a particular keyboard sequence like control-C or control-Z, sent to you from another process, or triggered automatically by the kernel when special events transpire, like a child process exiting, your own process running out of stack space, or hitting a process file-size limit.

For example, to trap an interrupt signal, set up a handler like this:

    our $shucks;

    sub catch_zap {
        my $signame = shift;
        $shucks++;
        die "Somebody sent me a SIG$signame";
    }
    $SIG{INT} = __PACKAGE__ . "::catch_zap";
    $SIG{INT} = \&catch_zap;  # best strategy

Prior to Perl 5.8.0 it was necessary to do as little as you possibly could in your handler; notice how all we do is set a global variable and then raise an exception. That's because on most systems, libraries are not re-entrant; particularly, memory allocation and I/O routines are not. That meant that doing nearly anything in your handler could in theory trigger a memory fault and subsequent core dump - see Deferred Signals (Safe Signals) below.

The names of the signals are the ones listed out by kill -l on your system, or you can retrieve them using the CPAN module IPC::Signal.

You may also choose to assign the strings "IGNORE" or "DEFAULT" as the handler, in which case Perl will try to discard the signal or do the default thing.

On most Unix platforms, the CHLD (sometimes also known as CLD ) signal has special behavior with respect to a value of "IGNORE" . Setting $SIG{CHLD} to "IGNORE" on such a platform has the effect of not creating zombie processes when the parent process fails to wait() on its child processes (i.e., child processes are automatically reaped). Calling wait() with $SIG{CHLD} set to "IGNORE" usually returns -1 on such platforms.

Some signals can be neither trapped nor ignored, such as the KILL and STOP (but not the TSTP) signals. Note that ignoring signals makes them disappear. If you only want them blocked temporarily without them getting lost you'll have to use POSIX' sigprocmask.

Sending a signal to a negative process ID means that you send the signal to the entire Unix process group. This code sends a hang-up signal to all processes in the current process group, and also sets $SIG{HUP} to "IGNORE" so it doesn't kill itself:

    # block scope for local
    {
        local $SIG{HUP} = "IGNORE";
        kill HUP => -$$;
        # snazzy writing of: kill("HUP", -$$)
    }

Another interesting signal to send is signal number zero. This doesn't actually affect a child process, but instead checks whether it's alive or has changed its UIDs.

    unless (kill 0 => $kid_pid) {
        warn "something wicked happened to $kid_pid";
    }

Signal number zero may fail because you lack permission to send the signal when directed at a process whose real or saved UID is not identical to the real or effective UID of the sending process, even though the process is alive. You may be able to determine the cause of failure using $! or %! .

    unless (kill(0 => $pid) || $!{EPERM}) {
        warn "$pid looks dead";
    }

You might also want to employ anonymous functions for simple signal handlers:

    $SIG{INT} = sub { die "\nOutta here!\n" };

SIGCHLD handlers require some special care. If a second child dies while in the signal handler caused by the first death, we won't get another signal. So the handler must loop, or else we will leave the unreaped child as a zombie. And the next time two children die we get another zombie. And so on.

    use POSIX ":sys_wait_h";
    $SIG{CHLD} = sub {
        while ((my $child = waitpid(-1, WNOHANG)) > 0) {
            $Kid_Status{$child} = $?;
        }
    };
    # do something that forks...

Be careful: qx(), system(), and some modules for calling external commands do a fork(), then wait() for the result. Thus, your signal handler will be called. Because wait() was already called by system() or qx(), the wait() in the signal handler will see no more zombies and will therefore block.

The best way to prevent this issue is to use waitpid(), as in the following example:

    use POSIX ":sys_wait_h"; # for nonblocking read

    my %children;

    $SIG{CHLD} = sub {
        # don't change $! and $? outside handler
        local ($!, $?);
        my $pid = waitpid(-1, WNOHANG);
        return if $pid == -1;
        return unless defined $children{$pid};
        delete $children{$pid};
        cleanup_child($pid, $?);
    };

    while (1) {
        my $pid = fork();
        die "cannot fork" unless defined $pid;
        if ($pid == 0) {
            # ...
            exit 0;
        } else {
            $children{$pid}=1;
            # ...
            system($command);
            # ...
        }
    }

Signal handling is also used for timeouts in Unix. While safely protected within an eval{} block, you set a signal handler to trap alarm signals and then schedule to have one delivered to you in some number of seconds. Then try your blocking operation, clearing the alarm when it's done but not before you've exited your eval{} block. If it goes off, you'll use die() to jump out of the block.

Here's an example:

    my $ALARM_EXCEPTION = "alarm clock restart";
    eval {
        local $SIG{ALRM} = sub { die $ALARM_EXCEPTION };
        alarm 10;
        flock(FH, 2)    # blocking write lock
                        || die "cannot flock: $!";
        alarm 0;
    };
    if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die }

If the operation being timed out is system() or qx(), this technique is liable to generate zombies. If this matters to you, you'll need to do your own fork() and exec(), and kill the errant child process.

For more complex signal handling, you might see the standard POSIX module. Lamentably, this is almost entirely undocumented, but the t/lib/posix.t file from the Perl source distribution has some examples in it.

Handling the SIGHUP Signal in Daemons

A process that usually starts when the system boots and shuts down when the system is shut down is called a daemon (Disk And Execution MONitor). If a daemon process has a configuration file which is modified after the process has been started, there should be a way to tell that process to reread its configuration file without stopping the process. Many daemons provide this mechanism using a SIGHUP signal handler. When you want to tell the daemon to reread the file, simply send it the SIGHUP signal.

The following example implements a simple daemon, which restarts itself every time the SIGHUP signal is received. The actual code is located in the subroutine code() , which just prints some debugging info to show that it works; it should be replaced with the real code.

    #!/usr/bin/perl -w
    use POSIX ();
    use FindBin ();
    use File::Basename ();
    use File::Spec::Functions;
    $| = 1;
    # make the daemon cross-platform, so exec always calls the script
    # itself with the right path, no matter how the script was invoked.
    my $script = File::Basename::basename($0);
    my $SELF = catfile($FindBin::Bin, $script);
    # POSIX unmasks the sigprocmask properly
    $SIG{HUP} = sub {
        print "got SIGHUP\n";
        exec($SELF, @ARGV) || die "$0: couldn't restart: $!";
    };
    code();
    sub code {
        print "PID: $$\n";
        print "ARGV: @ARGV\n";
        my $count = 0;
        while (++$count) {
            sleep 2;
            print "$count\n";
        }
    }
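Restarting with exec() is not the only option. If the daemon can reload its settings in place, the handler can merely set a flag that the main loop checks. The following is a sketch under that assumption; load_config() and main_loop_iteration() are hypothetical stand-ins, and the kill() call only simulates an administrator sending the signal.

```perl
use strict;
use warnings;

my $reload_requested = 0;
$SIG{HUP} = sub { $reload_requested = 1 };   # do as little as possible here

# Hypothetical stand-in for parsing a real configuration file.
my $generation = 0;
sub load_config { return { generation => ++$generation } }

my $config = load_config();

# One pass of the daemon's main loop: reload if asked, then do the work.
sub main_loop_iteration {
    if ($reload_requested) {
        $reload_requested = 0;
        $config = load_config();
    }
    # ... real work using $config goes here ...
    return $config->{generation};
}

main_loop_iteration();     # runs with the initial configuration
kill "HUP", $$;            # simulate an administrator sending SIGHUP
main_loop_iteration();     # notices the flag and reloads
```

The trade-off is that a reload in place keeps open connections and the process ID, whereas the exec() approach above guarantees a completely fresh state.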

Deferred Signals (Safe Signals)

Before Perl 5.8.0, installing Perl code to deal with signals exposed you to danger from two things. First, few system library functions are re-entrant. If the signal interrupts while Perl is executing one function (like malloc(3) or printf(3)), and your signal handler then calls the same function again, you could get unpredictable behavior--often, a core dump. Second, Perl isn't itself re-entrant at the lowest levels. If the signal interrupts Perl while Perl is changing its own internal data structures, similarly unpredictable behavior may result.

There were two things you could do, knowing this: be paranoid or be pragmatic. The paranoid approach was to do as little as possible in your signal handler. Set an existing integer variable that already has a value, and return. This doesn't help you if you're in a slow system call, which will just restart. That means you have to die to longjmp(3) out of the handler. Even this is a little cavalier for the true paranoiac, who avoids die in a handler because the system is out to get you. The pragmatic approach was to say "I know the risks, but prefer the convenience", and to do anything you wanted in your signal handler, and be prepared to clean up core dumps now and again.

Perl 5.8.0 and later avoid these problems by "deferring" signals. That is, when the signal is delivered to the process by the system (to the C code that implements Perl) a flag is set, and the handler returns immediately. Then at strategic "safe" points in the Perl interpreter (e.g. when it is about to execute a new opcode) the flags are checked and the Perl level handler from %SIG is executed. The "deferred" scheme allows much more flexibility in the coding of signal handlers as we know the Perl interpreter is in a safe state, and that we are not in a system library function when the handler is called. However the implementation does differ from previous Perls in the following ways:

  • Long-running opcodes

    As the Perl interpreter looks at signal flags only when it is about to execute a new opcode, a signal that arrives during a long-running opcode (e.g. a regular expression operation on a very large string) will not be seen until the current opcode completes.

    If a signal of any given type fires multiple times during an opcode (such as from a fine-grained timer), the handler for that signal will be called only once, after the opcode completes; all other instances will be discarded. Furthermore, if your system's signal queue gets flooded to the point that there are signals that have been raised but not yet caught (and thus not deferred) at the time an opcode completes, those signals may well be caught and deferred during subsequent opcodes, with sometimes surprising results. For example, you may see alarms delivered even after calling alarm(0) as the latter stops the raising of alarms but does not cancel the delivery of alarms raised but not yet caught. Do not depend on the behaviors described in this paragraph as they are side effects of the current implementation and may change in future versions of Perl.

  • Interrupting IO

    When a signal is delivered (e.g., SIGINT from a control-C) the operating system breaks into IO operations like read(2), which is used to implement Perl's readline() function, the <> operator. On older Perls the handler was called immediately (and as read is not "unsafe", this worked well). With the "deferred" scheme the handler is not called immediately, and if Perl is using the system's stdio library that library may restart the read without returning to Perl to give it a chance to call the %SIG handler. If this happens on your system the solution is to use the :perlio layer to do IO--at least on those handles that you want to be able to break into with signals. (The :perlio layer checks the signal flags and calls %SIG handlers before resuming IO operation.)

    The default in Perl 5.8.0 and later is to automatically use the :perlio layer.

    Note that it is not advisable to access a file handle within a signal handler where that signal has interrupted an I/O operation on that same handle. While perl will at least try hard not to crash, there are no guarantees of data integrity; for example, some data might get dropped or written twice.

    Some networking library functions like gethostbyname() are known to have their own implementations of timeouts which may conflict with your timeouts. If you have problems with such functions, try using the POSIX sigaction() function, which bypasses Perl safe signals. Be warned that this does subject you to possible memory corruption, as described above.

    Instead of setting $SIG{ALRM}:

        local $SIG{ALRM} = sub { die "alarm" };

    try something like the following:

        use POSIX qw(SIGALRM);
        POSIX::sigaction(SIGALRM, POSIX::SigAction->new(sub { die "alarm" }))
            || die "Error setting SIGALRM handler: $!\n";

    Another way to disable the safe signal behavior locally is to use the Perl::Unsafe::Signals module from CPAN, which affects all signals.

  • Restartable system calls

    On systems that supported it, older versions of Perl used the SA_RESTART flag when installing %SIG handlers. This meant that restartable system calls would continue rather than returning when a signal arrived. In order to deliver deferred signals promptly, Perl 5.8.0 and later do not use SA_RESTART. Consequently, restartable system calls can fail (with $! set to EINTR) in places where they previously would have succeeded.

    The default :perlio layer retries read, write and close as described above; interrupted wait and waitpid calls will always be retried.

  • Signals as "faults"

    Certain signals like SEGV, ILL, and BUS are generated by virtual memory addressing errors and similar "faults". These are normally fatal: there is little a Perl-level handler can do with them. So Perl delivers them immediately rather than attempting to defer them.

  • Signals triggered by operating system state

    On some operating systems certain signal handlers are supposed to "do something" before returning. One example can be CHLD or CLD, which indicates a child process has completed. On some operating systems the signal handler is expected to wait for the completed child process. On such systems the deferred signal scheme will not work for those signals: it does not do the wait. Again the failure will look like a loop as the operating system will reissue the signal because there are completed child processes that have not yet been waited for.

If you want the old signal behavior back despite possible memory corruption, set the environment variable PERL_SIGNALS to "unsafe". This feature first appeared in Perl 5.8.1.
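As a sketch of coping with the EINTR failures described under "Restartable system calls" above, code that uses the unbuffered sysread() can retry by hand. The helper name sysread_retrying is hypothetical.

```perl
use strict;
use warnings;
use Errno ();   # provides the %! hash

# Hypothetical helper: sysread() that retries when a signal interrupts it.
# Returns the data read, or the empty string at end of file.
sub sysread_retrying {
    my ($fh, $length) = @_;
    my $buffer = "";
    while (1) {
        my $got = sysread($fh, $buffer, $length);
        return $buffer if defined $got;   # data, or "" at end of file
        next if $!{EINTR};                # interrupted by a signal: try again
        die "sysread failed: $!";
    }
}
```

Any %SIG handler for the interrupting signal has already run by the time the loop retries, so a handler that calls die() still breaks out of the read.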

Named Pipes

A named pipe (often referred to as a FIFO) is an old Unix IPC mechanism for processes communicating on the same machine. It works just like regular anonymous pipes, except that the processes rendezvous using a filename and need not be related.

To create a named pipe, use the POSIX::mkfifo() function.

    use POSIX qw(mkfifo);
    mkfifo($path, 0700) || die "mkfifo $path failed: $!";

You can also use the Unix command mknod(1), or on some systems, mkfifo(1). These may not be in your normal path, though.

    # system return val is backwards, so && not ||
    #
    $ENV{PATH} .= ":/etc:/usr/etc";
    if (    system("mknod",  $path, "p")
         && system("mkfifo", $path) )
    {
        die "mk{nod,fifo} $path failed";
    }

A fifo is convenient when you want to connect a process to an unrelated one. When you open a fifo, the program will block until there's something on the other end.

For example, let's say you'd like to have your .signature file be a named pipe that has a Perl program on the other end. Now every time any program (like a mailer, news reader, finger program, etc.) tries to read from that file, the reading program will read the new signature from your program. We'll use the pipe-checking file-test operator, -p, to find out whether anyone (or anything) has accidentally removed our fifo.

    chdir();    # go home
    my $FIFO = ".signature";
    while (1) {
        unless (-p $FIFO) {
            unlink $FIFO;   # discard any failure, will catch later
            require POSIX;  # delayed loading of heavy module
            POSIX::mkfifo($FIFO, 0700)
                || die "can't mkfifo $FIFO: $!";
        }
        # next line blocks till there's a reader
        open (FIFO, "> $FIFO") || die "can't open $FIFO: $!";
        print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
        close(FIFO) || die "can't close $FIFO: $!";
        sleep 2;    # to avoid dup signals
    }
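The blocking rendezvous itself can be seen in a small self-contained sketch. It assumes a system with mkfifo support; the temporary directory and the fifo name are illustrative only.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $fifo = "$dir/demo.fifo";              # illustrative path
mkfifo($fifo, 0700) || die "mkfifo $fifo failed: $!";

defined(my $pid = fork()) || die "cannot fork: $!";
if ($pid == 0) {                          # child: the writer
    # this open blocks until the parent opens the fifo for reading
    open(my $out, ">", $fifo) || die "can't open $fifo for writing: $!";
    print $out "hello from the writer\n";
    close($out) || die "can't close $fifo: $!";
    POSIX::_exit(0);                      # leave cleanup to the parent
}

# parent: the reader; this open blocks until the child opens for writing
open(my $in, "<", $fifo) || die "can't open $fifo for reading: $!";
my $line = <$in>;
close($in) || die "can't close $fifo: $!";
waitpid($pid, 0);
print "read: $line";
```

Here the two processes happen to be parent and child, but nothing in the fifo mechanism requires that; any two processes that can reach the path would do.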

Using open() for IPC

Perl's basic open() statement can also be used for unidirectional interprocess communication by either appending or prepending a pipe symbol to the second argument to open(). Here's how to start something up in a child process you intend to write to:

    open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
        || die "can't fork: $!";
    local $SIG{PIPE} = sub { die "spooler pipe broke" };
    print SPOOLER "stuff\n";
    close(SPOOLER) || die "bad spool: $! $?";

And here's how to start up a child process you intend to read from:

    open(STATUS, "netstat -an 2>&1 |")
        || die "can't fork: $!";
    while (<STATUS>) {
        next if /^(tcp|udp)/;
        print;
    }
    close(STATUS) || die "bad netstat: $! $?";

If one can be sure that a particular program is a Perl script expecting filenames in @ARGV, the clever programmer can write something like this:

    % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

and no matter which sort of shell it's called from, the Perl program will read from the file f1, the process cmd1, standard input (tmpfile in this case), the f2 file, the cmd2 command, and finally the f3 file. Pretty nifty, eh?
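On the Perl side nothing special is required: an ordinary <> loop does the work, because each element of @ARGV is opened as if by two-argument open(). The sketch below fakes the command line by assigning to @ARGV and assumes a Unix shell with an echo command.

```perl
use strict;
use warnings;

# Simulate being invoked as:  program "echo alpha; echo beta |"
local @ARGV = ("echo alpha; echo beta |");

my @lines;
while (my $line = <>) {    # the trailing "|" makes this a pipe from a command
    chomp $line;
    push @lines, $line;
}
print "$_\n" for @lines;
```

The same behavior means a filename that happens to end in "|" is run as a command, so this idiom is only appropriate when the arguments come from a trusted source.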

You might notice that you could use backticks for much the same effect as opening a pipe for reading:

    print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
    die "bad netstatus ($?)" if $?;

While this is true on the surface, it's much more efficient to process the file one line or record at a time because then you don't have to read the whole thing into memory at once. It also gives you finer control of the whole process, letting you kill off the child process early if you'd like.
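As a sketch of that finer control, the following reads from a child one line at a time and stops it as soon as the wanted line has been seen. The producer is simply another perl standing in for a long-running command.

```perl
use strict;
use warnings;

# Stand-in for a long-running command: perl printing a million lines.
my @producer = ($^X, "-e", 'print "line $_\n" for 1 .. 1_000_000');

my $pid = open(my $kid, "-|", @producer) || die "can't fork: $!";
my $found;
while (my $line = <$kid>) {
    if ($line =~ /^line 42$/) {
        $found = $line;
        last;                 # we have what we need
    }
}
kill "TERM", $pid;            # stop the child early...
close($kid);                  # ...and reap it; a false return is expected here
print "found: $found";
```

With backticks, the parent would have had to wait for all million lines and hold them in memory before looking at the first one.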

Be careful to check the return values from both open() and close(). If you're writing to a pipe, you should also trap SIGPIPE. Otherwise, think of what happens when you start up a pipe to a command that doesn't exist: the open() will in all likelihood succeed (it only reflects the fork()'s success), but then your output will fail--spectacularly. Perl can't know whether the command worked, because your command is actually running in a separate process whose exec() might have failed. Therefore, while readers of bogus commands return just a quick EOF, writers to bogus commands will get hit with a signal, which they'd best be prepared to handle. Consider:

    open(FH, "|bogus") || die "can't fork: $!";
    print FH "bang\n";  # neither necessary nor sufficient
                        # to check print retval!
    close(FH) || die "can't close: $!";

The return value from print() is not worth checking here because of pipe buffering: physical writes are delayed. The failure won't show up until the close, and it will arrive as a SIGPIPE. To catch it, you could use this:

    $SIG{PIPE} = "IGNORE";
    open(FH, "|bogus") || die "can't fork: $!";
    print FH "bang\n";
    close(FH) || die "can't close: status=$?";
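When close() on a pipe returns false, $! and $? together say why: if the only problem was a non-zero exit from the child, $! is set to 0 and the details are in $?. The following sketch decodes the possibilities; the child is another perl that reads its input and then exits with status 3, purely for illustration.

```perl
use strict;
use warnings;

local $SIG{PIPE} = "IGNORE";   # get an error back instead of a fatal signal

# Illustrative child: reads everything we send, then exits with status 3.
my @consumer = ($^X, "-e", 'my @in = <STDIN>; exit 3');

my $pid = open(my $fh, "|-", @consumer) || die "can't fork: $!";
print $fh "bang\n";

my $report;
if (close($fh)) {
    $report = "child succeeded";
}
elsif ($! != 0) {
    $report = "error closing pipe: $!";
}
elsif ($? & 127) {
    $report = "child died from signal " . ($? & 127);
}
else {
    $report = "child exited with status " . ($? >> 8);
}
print "$report\n";
```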

Filehandles

Both the main process and any child processes it forks share the same STDIN, STDOUT, and STDERR filehandles. If both processes try to access them at once, strange things can happen. You may also want to close or reopen the filehandles for the child. You can get around this by opening your pipe with open(), but on some systems this means that the child process cannot outlive the parent.

Background Processes

You can run a command in the background with:

    system("cmd &");

The command's STDOUT and STDERR (and possibly STDIN, depending on your shell) will be the same as the parent's. You won't need to catch SIGCHLD because of the double-fork taking place; see below for details.

Complete Dissociation of Child from Parent

In some cases (starting server processes, for instance) you'll want to completely dissociate the child process from the parent. This is often called daemonization. A well-behaved daemon will also chdir() to the root directory so it doesn't prevent unmounting the filesystem containing the directory from which it was launched, and redirect its standard file descriptors from and to /dev/null so that random output doesn't wind up on the user's terminal.

    use POSIX "setsid";
    sub daemonize {
        chdir("/")                  || die "can't chdir to /: $!";
        open(STDIN,  "< /dev/null") || die "can't read /dev/null: $!";
        open(STDOUT, "> /dev/null") || die "can't write to /dev/null: $!";
        defined(my $pid = fork())   || die "can't fork: $!";
        exit if $pid;               # non-zero now means I am the parent
        (setsid() != -1)            || die "Can't start a new session: $!";
        open(STDERR, ">&STDOUT")    || die "can't dup stdout: $!";
    }

The fork() has to come before the setsid() to ensure you aren't a process group leader; the setsid() will fail if you are. If your system doesn't have the setsid() function, open /dev/tty and use the TIOCNOTTY ioctl() on it instead. See tty(4) for details.

Non-Unix users should check their Your_OS::Process module for other possible solutions.

Safe Pipe Opens

Another interesting approach to IPC is making your single program go multiprocess and communicate between--or even amongst--yourselves. The open() function will accept a file argument of either "-|" or "|-" to do a very interesting thing: it forks a child connected to the filehandle you've opened. The child is running the same program as the parent. This is useful for safely opening a file when running under an assumed UID or GID, for example. If you open a pipe to minus, you can write to the filehandle you opened and your kid will find it in his STDIN. If you open a pipe from minus, you can read from the filehandle you opened whatever your kid writes to his STDOUT.

    use English qw[ -no_match_vars ];
    my $PRECIOUS = "/path/to/some/safe/file";
    my $sleep_count;
    my $pid;
    do {
        $pid = open(KID_TO_WRITE, "|-");
        unless (defined $pid) {
            warn "cannot fork: $!";
            die "bailing out" if $sleep_count++ > 6;
            sleep 10;
        }
    } until defined $pid;
    if ($pid) {                 # I am the parent
        print KID_TO_WRITE @some_data;
        close(KID_TO_WRITE) || warn "kid exited $?";
    } else {                    # I am the child
        # drop permissions in setuid and/or setgid programs:
        ($EUID, $EGID) = ($UID, $GID);
        open (OUTFILE, "> $PRECIOUS")
            || die "can't open $PRECIOUS: $!";
        while (<STDIN>) {
            print OUTFILE;      # child's STDIN is parent's KID_TO_WRITE
        }
        close(OUTFILE) || die "can't close $PRECIOUS: $!";
        exit(0);                # don't forget this!!
    }

Another common use for this construct is when you need to execute something without the shell's interference. With system(), it's straightforward, but you can't use a pipe open or backticks safely. That's because there's no way to stop the shell from getting its hands on your arguments. Instead, use lower-level control to call exec() directly.
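For the system() case, a minimal sketch of the list form is shown here. The child is another perl that merely checks it received the awkward argument intact; that child script is an illustration, not part of any real program.

```perl
use strict;
use warnings;

# An argument a shell would mangle: spaces, a semicolon, a wildcard.
my $tricky = "two words; *";

# Illustrative child: exits 0 only if it sees exactly that one argument.
my @cmd = ($^X, "-e",
           'exit(@ARGV == 1 && $ARGV[0] eq "two words; *" ? 0 : 1)',
           $tricky);

# List form with an explicit program: no shell ever sees the arguments.
my $status = system { $cmd[0] } @cmd;
die "child saw mangled arguments (status $status)" if $status != 0;
```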

Here's a safe backtick or pipe open for read:

    my $pid = open(KID_TO_READ, "-|");
    defined($pid) || die "can't fork: $!";
    if ($pid) {     # parent
        while (<KID_TO_READ>) {
            # do something interesting
        }
        close(KID_TO_READ) || warn "kid exited $?";
    } else {        # child
        ($EUID, $EGID) = ($UID, $GID);  # suid only
        exec($program, @options, @args)
            || die "can't exec program: $!";
        # NOTREACHED
    }

And here's a safe pipe open for writing:

    my $pid = open(KID_TO_WRITE, "|-");
    defined($pid) || die "can't fork: $!";
    $SIG{PIPE} = sub { die "whoops, $program pipe broke" };
    if ($pid) {     # parent
        print KID_TO_WRITE @data;
        close(KID_TO_WRITE) || warn "kid exited $?";
    } else {        # child
        ($EUID, $EGID) = ($UID, $GID);
        exec($program, @options, @args)
            || die "can't exec program: $!";
        # NOTREACHED
    }

It is very easy to dead-lock a process using this form of open(), or indeed with any use of pipe() with multiple subprocesses. The example above is "safe" because it is simple and calls exec(). See Avoiding Pipe Deadlocks for general safety principles, but there are extra gotchas with Safe Pipe Opens.

In particular, if you opened the pipe using open FH, "|-", then you cannot simply use close() in the parent process to close an unwanted writer. Consider this code:

    my $pid = open(WRITER, "|-");        # fork open a kid
    defined($pid) || die "first fork failed: $!";
    if ($pid) {
        my $sub_pid = fork();
        defined($sub_pid) || die "second fork failed: $!";
        if ($sub_pid) {
            close(WRITER) || die "couldn't close WRITER: $!";
            # now do something else...
        }
        else {
            # first write to WRITER
            # ...
            # then when finished
            close(WRITER) || die "couldn't close WRITER: $!";
            exit(0);
        }
    }
    else {
        # first do something with STDIN, then
        exit(0);
    }

In the example above, the true parent does not want to write to the WRITER filehandle, so it closes it. However, because WRITER was opened using open FH, "|-", it has a special behavior: closing it calls waitpid() (see waitpid), which waits for the subprocess to exit. If the child process ends up waiting for something happening in the section marked "do something else", you have deadlock.

This can also be a problem with intermediate subprocesses in more complicated code, which will call waitpid() on all open filehandles during global destruction--in no predictable order.

To solve this, you must manually use pipe(), fork(), and the form of open() which sets one file descriptor to another, as shown below:

    pipe(READER, WRITER) || die "pipe failed: $!";
    $pid = fork();
    defined($pid) || die "first fork failed: $!";
    if ($pid) {
        close READER;
        my $sub_pid = fork();
        defined($sub_pid) || die "second fork failed: $!";
        if ($sub_pid) {
            close(WRITER) || die "can't close WRITER: $!";
        }
        else {
            # write to WRITER...
            # ...
            # then when finished
            close(WRITER) || die "can't close WRITER: $!";
            exit(0);
        }
        # now do something else...
    }
    else {
        open(STDIN, "<&READER") || die "can't reopen STDIN: $!";
        close(WRITER) || die "can't close WRITER: $!";
        # do something...
        exit(0);
    }

Since Perl 5.8.0, you can also use the list form of open for pipes. This is preferred when you wish to avoid having the shell interpret metacharacters that may be in your command string.

So for example, instead of using:

    open(PS_PIPE, "ps aux|") || die "can't open ps pipe: $!";

One would use either of these:

    open(PS_PIPE, "-|", "ps", "aux")
        || die "can't open ps pipe: $!";

    @ps_args = qw[ ps aux ];
    open(PS_PIPE, "-|", @ps_args)
        || die "can't open @ps_args|: $!";

Because there are more than three arguments to open(), it forks the ps(1) command without spawning a shell and reads its standard output via the PS_PIPE filehandle. The corresponding syntax to write to command pipes is to use "|-" in place of "-|".

This was admittedly a rather silly example, because you're using string literals whose content is perfectly safe. There is therefore no cause to resort to the harder-to-read, multi-argument form of pipe open(). However, whenever you cannot be assured that the program arguments are free of shell metacharacters, the fancier form of open() should be used. For example:

    @grep_args = ("egrep", "-i", $some_pattern, @many_files);
    open(GREP_PIPE, "-|", @grep_args)
        || die "can't open @grep_args|: $!";

Here the multi-argument form of pipe open() is preferred because the pattern and indeed even the filenames themselves might hold metacharacters.
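The write direction, using "|-", can be sketched the same way. The child here is another perl that upper-cases its input into a file named by its argument; both the child script and the temporary file are assumptions made so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close($tmp_fh);

# Illustrative child: upper-cases STDIN into the file named by its argument.
my @upcase = ($^X, "-e",
    'open(my $o, ">", shift) || die "open: $!"; print {$o} uc($_) while <STDIN>;',
    $tmp_name);

# More than three arguments: no shell is involved, so $tmp_name is passed
# to the child untouched whatever characters it contains.
open(my $to_kid, "|-", @upcase) || die "can't fork: $!";
print $to_kid "hello\n";
close($to_kid) || die "child failed: errno=$! status=$?";

open(my $in, "<", $tmp_name) || die "can't read $tmp_name: $!";
my $result = <$in>;
close($in);
print $result;
```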

Be aware that these operations are full Unix forks, which means they may not be correctly implemented on all alien systems. Additionally, these are not true multithreading. To learn more about threading, see the modules file mentioned below in the SEE ALSO section.

Avoiding Pipe Deadlocks

Whenever you have more than one subprocess, you must be careful that each closes whichever half of any pipes created for interprocess communication it is not using. This is because any child process reading from the pipe and expecting an EOF will never receive it, and therefore never exit. A single process closing a pipe is not enough to close it; the last process with the pipe open must close it for it to read EOF.

Certain built-in Unix features help prevent this most of the time. For instance, filehandles have a "close on exec" flag, which is set en masse under control of the $^F variable. This is so any filehandles you didn't explicitly route to the STDIN, STDOUT or STDERR of a child program will be automatically closed.

Always explicitly and immediately call close() on the writable end of any pipe, unless that process is actually writing to it. Even if you don't explicitly call close(), Perl will still close() all filehandles during global destruction. As previously discussed, if those filehandles have been opened with Safe Pipe Open, this will result in calling waitpid(), which may again deadlock.
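A minimal sketch of the rule follows, with one pipe and one child. Each process closes the end it does not use, and the child sees EOF only once the last writer has closed. Reporting the line count through the exit status is just a convenience for the example.

```perl
use strict;
use warnings;
use POSIX ();

pipe(my $reader, my $writer) || die "pipe failed: $!";
defined(my $pid = fork()) || die "cannot fork: $!";

if ($pid == 0) {                 # child: reads until EOF
    close($writer);              # otherwise the child itself keeps the pipe
                                 # open for writing and never sees EOF
    my $count = 0;
    $count++ while <$reader>;
    POSIX::_exit($count);        # report the line count as the exit status
}

close($reader);                  # parent does not read
print $writer "line $_\n" for 1 .. 3;
close($writer) || die "can't close writer: $!";   # last writer gone: EOF
waitpid($pid, 0);
my $lines_seen = $? >> 8;
print "child saw $lines_seen lines\n";
```

Removing the close($writer) in the child makes the program hang, because the child's own copy of the write end keeps the pipe open.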

Bidirectional Communication with Another Process

While this works reasonably well for unidirectional communication, what about bidirectional communication? The most obvious approach doesn't work:

    # THIS DOES NOT WORK!!
    open(PROG_FOR_READING_AND_WRITING, "| some program |")

If you forget to use warnings, you'll miss out entirely on the helpful diagnostic message:

    Can't do bidirectional pipe at -e line 1.

If you really want to, you can use the standard open2() from the IPC::Open2 module to catch both ends. There's also an open3() in IPC::Open3 for tridirectional I/O so you can also catch your child's STDERR, but doing so would then require an awkward select() loop and wouldn't allow you to use normal Perl input operations.

If you look at its source, you'll see that open2() uses low-level primitives like the pipe() and exec() syscalls to create all the connections. Although using socketpair() might have been more efficient, that would have made it even less portable than it already is. The open2() and open3() functions are unlikely to work anywhere except on a Unix system, or at least one purporting POSIX compliance.

Here's an example of using open2():

    use FileHandle;
    use IPC::Open2;
    $pid = open2(*Reader, *Writer, "cat -un");
    print Writer "stuff\n";
    $got = <Reader>;

The problem with this is that buffering is really going to ruin your day. Even though your Writer filehandle is auto-flushed so the process on the other end gets your data in a timely manner, you can't usually do anything to force that process to give its data to you in a similarly quick fashion. In this special case, we could actually do so, because we gave cat a -u flag to make it unbuffered. But very few commands are designed to operate over pipes, so this seldom works unless you yourself wrote the program on the other end of the double-ended pipe.

A solution to this is to use a library which uses pseudottys to make your program behave more reasonably. This way you don't have to have control over the source code of the program you're using. The Expect module from CPAN also addresses this kind of thing. This module requires two other modules from CPAN, IO::Pty and IO::Stty. It sets up a pseudo terminal to interact with programs that insist on talking to the terminal device driver. If your system is supported, this may be your best bet.
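When no running conversation is needed, a simpler way around the buffering problem is to send everything, close the writing end so the child sees EOF, and only then read the reply. The sketch below assumes exactly that pattern; the filter is another perl standing in for a program that prints nothing until it has read all of its input.

```perl
use strict;
use warnings;
use IPC::Open2;

# Stand-in for a buffering filter: sorts all of STDIN, then prints it.
my @filter = ($^X, "-e", 'print sort <STDIN>');

my $pid = open2(my $from_kid, my $to_kid, @filter);
print $to_kid "pear\n", "apple\n", "fig\n";
close($to_kid);                 # send EOF so the filter can finish
my @sorted = <$from_kid>;       # now it is safe to read everything back
waitpid($pid, 0);               # open2() does not reap the child for us
print @sorted;
```

This is not a general cure: a child that emits large amounts of output before it has consumed all of its input can still fill the return pipe and stall both sides.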

Bidirectional Communication with Yourself

If you want, you may make low-level pipe() and fork() syscalls to stitch this together by hand. This example only talks to itself, but you could reopen the appropriate handles to STDIN and STDOUT and call other processes. (The following example lacks proper error checking.)

    #!/usr/bin/perl -w
    # pipe1 - bidirectional communication using two pipe pairs
    #         designed for the socketpair-challenged
    use IO::Handle;               # thousands of lines just for autoflush :-(
    pipe(PARENT_RDR, CHILD_WTR);  # XXX: check failure?
    pipe(CHILD_RDR,  PARENT_WTR); # XXX: check failure?
    CHILD_WTR->autoflush(1);
    PARENT_WTR->autoflush(1);
    if ($pid = fork()) {
        close PARENT_RDR;
        close PARENT_WTR;
        print CHILD_WTR "Parent Pid $$ is sending this\n";
        chomp($line = <CHILD_RDR>);
        print "Parent Pid $$ just read this: '$line'\n";
        close CHILD_RDR; close CHILD_WTR;
        waitpid($pid, 0);
    } else {
        die "cannot fork: $!" unless defined $pid;
        close CHILD_RDR;
        close CHILD_WTR;
        chomp($line = <PARENT_RDR>);
        print "Child Pid $$ just read this: '$line'\n";
        print PARENT_WTR "Child Pid $$ is sending this\n";
        close PARENT_RDR;
        close PARENT_WTR;
        exit(0);
    }

But you don't actually have to make two pipe calls. If you have the socketpair() system call, it will do this all for you.

    #!/usr/bin/perl -w
    # pipe2 - bidirectional communication using socketpair
    #   "the best ones always go both ways"
    use Socket;
    use IO::Handle;  # thousands of lines just for autoflush :-(
    # We say AF_UNIX because although *_LOCAL is the
    # POSIX 1003.1g form of the constant, many machines
    # still don't have it.
    socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        || die "socketpair: $!";
    CHILD->autoflush(1);
    PARENT->autoflush(1);
    if ($pid = fork()) {
        close PARENT;
        print CHILD "Parent Pid $$ is sending this\n";
        chomp($line = <CHILD>);
        print "Parent Pid $$ just read this: '$line'\n";
        close CHILD;
        waitpid($pid, 0);
    } else {
        die "cannot fork: $!" unless defined $pid;
        close CHILD;
        chomp($line = <PARENT>);
        print "Child Pid $$ just read this: '$line'\n";
        print PARENT "Child Pid $$ is sending this\n";
        close PARENT;
        exit(0);
    }

Sockets: Client/Server Communication

While not entirely limited to Unix-derived operating systems (e.g., WinSock on PCs provides socket support, as do some VMS libraries), you might not have sockets on your system, in which case this section probably isn't going to do you much good. With sockets, you can do both virtual circuits like TCP streams and datagrams like UDP packets. You may be able to do even more depending on your system.

The Perl functions for dealing with sockets have the same names as the corresponding system calls in C, but their arguments tend to differ for two reasons. First, Perl filehandles work differently than C file descriptors. Second, Perl already knows the length of its strings, so you don't need to pass that information.

One of the major problems with ancient, antemillennial socket code in Perl was that it used hard-coded values for some of the constants, which severely hurt portability. If you ever see code that does anything like explicitly setting $AF_INET = 2, you know you're in for big trouble. An immeasurably superior approach is to use the Socket module, which more reliably grants access to the various constants and functions you'll need.

If you're not writing a server/client for an existing protocol like NNTP or SMTP, you should give some thought to how your server will know when the client has finished talking, and vice-versa. Most protocols are based on one-line messages and responses (so one party knows the other has finished when a "\n" is received) or multi-line messages and responses that end with a period on an empty line ("\n.\n" terminates a message/response).
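As a sketch of the second convention, the reader below collects lines until it sees a lone period. The helper name read_dot_terminated is hypothetical, and an in-memory filehandle stands in for a socket so that the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: read lines up to a lone "." and return them as an
# array reference, or undef if the stream ends before the terminator.
sub read_dot_terminated {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        $line =~ s/\015?\012\z//;          # accept CRLF or a bare LF
        return \@lines if $line eq ".";
        push @lines, $line;
    }
    return;
}

# An in-memory filehandle stands in for a socket here.
my $wire = "first\015\012second\012.\015\012ignored\015\012";
open(my $fh, "<", \$wire) || die "can't open in-memory handle: $!";
my $message = read_dot_terminated($fh);
print "$_\n" for @$message;
```

A real protocol built this way also needs a rule for message lines that consist of a single period; SMTP, for instance, has the sender double a leading period.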

Internet Line Terminators

The Internet line terminator is "\015\012". Under ASCII variants of Unix, that could usually be written as "\r\n", but under other systems, "\r\n" might at times be "\015\015\012", "\012\012\015", or something completely different. The standards specify writing "\015\012" to be conformant (be strict in what you provide), but they also recommend accepting a lone "\012" on input (be lenient in what you require). We haven't always been very good about that in the code in this manpage, but unless you're on a Mac from way back in its pre-Unix dark ages, you'll probably be ok.

Internet TCP Clients and Servers

Use Internet-domain sockets when you want to do client-server communication that might extend to machines outside of your own system.

Here's a sample TCP client using Internet-domain sockets:

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    my ($remote, $port, $iaddr, $paddr, $proto, $line);
    $remote = shift || "localhost";
    $port   = shift || 2345;  # random port
    if ($port =~ /\D/) { $port = getservbyname($port, "tcp") }
    die "No port" unless $port;
    $iaddr = inet_aton($remote) || die "no host: $remote";
    $paddr = sockaddr_in($port, $iaddr);
    $proto = getprotobyname("tcp");
    socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    connect(SOCK, $paddr) || die "connect: $!";
    while ($line = <SOCK>) {
        print $line;
    }
    close (SOCK) || die "close: $!";
    exit(0);

And here's a corresponding server to go along with it. We'll leave the address as INADDR_ANY so that the kernel can choose the appropriate interface on multihomed hosts. If you want to sit on a particular interface (like the external side of a gateway or firewall machine), fill this in with your real address instead.

    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
    use Socket;
    use Carp;
    my $EOL = "\015\012";
    sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
    my $port = shift || 2345;
    die "invalid port" unless $port =~ /^ \d+ $/x;
    my $proto = getprotobyname("tcp");
    socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
        || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
    listen(Server, SOMAXCONN) || die "listen: $!";
    logmsg "server started on port $port";
    my $paddr;
    # this server never forks, so no SIGCHLD handler is needed
    for ( ; $paddr = accept(Client, Server); close Client) {
        my($port, $iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr, AF_INET);
        logmsg "connection from $name [",
               inet_ntoa($iaddr),
               "] at port $port";
        print Client "Hello there, $name, it's now ",
                     scalar localtime(), $EOL;
    }

And here's a multitasking version. It's multitasked in that, like most typical servers, it spawns (fork()s) a slave server to handle the client request so that the master server can quickly go back to service a new client. It uses separate processes, not threads.

    #!/usr/bin/perl -Tw
    use strict;
    BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
    use Socket;
    use Carp;
    my $EOL = "\015\012";
    sub spawn;  # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
    my $port = shift || 2345;
    die "invalid port" unless $port =~ /^ \d+ $/x;
    my $proto = getprotobyname("tcp");
    socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
        || die "setsockopt: $!";
    bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
    listen(Server, SOMAXCONN) || die "listen: $!";
    logmsg "server started on port $port";
    my $paddr;
    use POSIX ":sys_wait_h";
    use Errno;
    sub REAPER {
        local $!;   # don't let waitpid() overwrite current error
        while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) {
            logmsg "reaped $pid" . ($? ? " with exit $?" : "");
        }
        $SIG{CHLD} = \&REAPER;  # loathe SysV
    }
    $SIG{CHLD} = \&REAPER;
    while (1) {
        $paddr = accept(Client, Server) || do {
            # try again if accept() returned because got a signal
            next if $!{EINTR};
            die "accept: $!";
        };
        my ($port, $iaddr) = sockaddr_in($paddr);
        my $name = gethostbyaddr($iaddr, AF_INET);
        logmsg "connection from $name [",
               inet_ntoa($iaddr),
               "] at port $port";
        spawn sub {
            $| = 1;
            print "Hello there, $name, it's now ", scalar localtime(), $EOL;
            exec "/usr/games/fortune"       # XXX: "wrong" line terminators
                or confess "can't exec fortune: $!";
        };
        close Client;
    }
    sub spawn {
        my $coderef = shift;
        unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
            confess "usage: spawn CODEREF";
        }
        my $pid;
        unless (defined($pid = fork())) {
            logmsg "cannot fork: $!";
            return;
        }
        elsif ($pid) {
            logmsg "begat $pid";
            return; # I'm the parent
        }
        # else I'm the child -- go spawn
        open(STDIN,  "<&Client") || die "can't dup client to stdin";
        open(STDOUT, ">&Client") || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit($coderef->());
    }

This server takes the trouble to clone off a child version via fork() for each incoming request. That way it can handle many requests at once, which you might not always want. Even if you don't fork(), the listen() will allow that many pending connections. Forking servers have to be particularly careful about cleaning up their dead children (called "zombies" in Unix parlance), because otherwise you'll quickly fill up your process table. The REAPER subroutine is used here to call waitpid() for any child processes that have finished, thereby ensuring that they terminate cleanly and don't join the ranks of the living dead.

Within the while loop we call accept() and check to see if it returns a false value. This would normally indicate a system error needs to be reported. However, the introduction of safe signals (see Deferred Signals (Safe Signals) above) in Perl 5.8.0 means that accept() might also be interrupted when the process receives a signal. This typically happens when one of the forked subprocesses exits and notifies the parent process with a CHLD signal.

If accept() is interrupted by a signal, $! will be set to EINTR. If this happens, we can safely continue to the next iteration of the loop and another call to accept(). It is important that your signal handling code not modify the value of $!, or else this test will likely fail. In the REAPER subroutine we create a local version of $! before calling waitpid(). When waitpid() sets $! to ECHILD as it inevitably does when it has no more children waiting, it updates the local copy and leaves the original unchanged.
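The effect of localizing $! can be checked in isolation. The following is a minimal sketch rather than part of the server above: the helper name reap_quietly is hypothetical, and the script assumes it has no child processes, so waitpid() fails and sets $! to ECHILD inside the helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Errno qw(EINTR);
use POSIX ":sys_wait_h";

# Hypothetical stand-in for a CHLD handler: it calls waitpid(),
# which sets $! to ECHILD when there is nothing left to reap.
sub reap_quietly {
    local $!;    # changes to $! are undone when this sub returns
    1 while waitpid(-1, WNOHANG) > 0;
    return;
}

$! = EINTR;      # pretend accept() was just interrupted by a signal
reap_quietly();
my $still_eintr = $!{EINTR} ? 1 : 0;
print "EINTR preserved: $still_eintr\n";
```

Removing the local $! line from the helper is a quick way to see the test in the accept() loop go wrong.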

You should use the -T flag to enable taint checking (see perlsec) even if you aren't running setuid or setgid. This is always a good idea for servers or any program run on behalf of someone else (like CGI scripts), because it lessens the chances that people from the outside will be able to compromise your system.

Let's look at another TCP client. This one connects to the TCP "time" service on a number of different machines and shows how far their clocks differ from the system on which it's being run:

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    my $SECS_OF_70_YEARS = 2208988800;
    sub ctime { scalar localtime(shift() || time()) }
    my $iaddr = gethostbyname("localhost");
    my $proto = getprotobyname("tcp");
    my $port = getservbyname("time", "tcp");
    my $paddr = sockaddr_in(0, $iaddr);
    my($host);
    $| = 1;
    printf "%-24s %8s %s\n", "localhost", 0, ctime();
    foreach $host (@ARGV) {
        printf "%-24s ", $host;
        my $hisiaddr = inet_aton($host) || die "unknown host";
        my $hispaddr = sockaddr_in($port, $hisiaddr);
        socket(SOCKET, PF_INET, SOCK_STREAM, $proto)
            || die "socket: $!";
        connect(SOCKET, $hispaddr) || die "connect: $!";
        my $rtime = pack("C4", ());
        read(SOCKET, $rtime, 4);
        close(SOCKET);
        my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
        printf "%8d %s\n", $histime - time(), ctime($histime);
    }
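The arithmetic in that client may be easier to follow on its own. The "time" service (RFC 868) replies with a 32-bit big-endian count of seconds since 1900-01-01, so subtracting the seconds in the 70 intervening years gives a Unix timestamp. Here is a small sketch of that conversion; the helper name rfc868_to_epoch is an assumption made for illustration, and the reply is fabricated locally rather than read from a network.

```perl
use strict;
use warnings;

# Seconds between 1900-01-01 (RFC 868 epoch) and 1970-01-01 (Unix epoch).
my $SECS_OF_70_YEARS = 2_208_988_800;

# Hypothetical helper: convert the 4-byte reply to a Unix timestamp.
# Returns undef if the reply is not exactly four bytes long.
sub rfc868_to_epoch {
    my ($reply) = @_;
    return unless defined $reply && length($reply) == 4;
    return unpack("N", $reply) - $SECS_OF_70_YEARS;
}

# A made-up reply: one day after the Unix epoch.
my $sample = pack("N", $SECS_OF_70_YEARS + 86_400);
my $epoch  = rfc868_to_epoch($sample);
print "$epoch\n";
```

Checking the length first guards against a short read, which the client above does not test for.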

Unix-Domain TCP Clients and Servers

That's fine for Internet-domain clients and servers, but what about local communications? While you can use the same setup, sometimes you don't want to. Unix-domain sockets are local to the current host, and are often used internally to implement pipes. Unlike Internet domain sockets, Unix domain sockets can show up in the file system with an ls(1) listing.

    % ls -l /dev/log
    srw-rw-rw- 1 root 0 Oct 31 07:23 /dev/log

You can test for these with Perl's -S file test:

    unless (-S "/dev/log") {
        die "something's wicked with the log system";
    }
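To see the -S test succeed without depending on /dev/log, you can bind a throwaway socket yourself. This is only a sketch: the scratch directory and the name demo.sock are assumptions made for the example.

```perl
use strict;
use warnings;
use Socket;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);      # scratch directory, removed at exit
my $path = "$dir/demo.sock";           # hypothetical socket name

socket(my $server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
bind($server, sockaddr_un($path))           || die "bind: $!";

# bind() creates the file system entry that ls(1) and -S can see.
my $is_socket = (-S $path) ? 1 : 0;
my $is_plain  = (-f $path) ? 1 : 0;
print "socket: $is_socket, plain file: $is_plain\n";

close($server);
unlink($path);
```

The entry outlives the socket, which is why servers usually unlink() the name before binding to it again.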

Here's a sample Unix-domain client:

    #!/usr/bin/perl -w
    use Socket;
    use strict;
    my ($rendezvous, $line);
    $rendezvous = shift || "catsock";
    socket(SOCK, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
    connect(SOCK, sockaddr_un($rendezvous)) || die "connect: $!";
    while (defined($line = <SOCK>)) {
        print $line;
    }
    exit(0);

And here's a corresponding server. You don't have to worry about silly network terminators here because Unix domain sockets are guaranteed to be on the localhost, and thus everything works right.

    #!/usr/bin/perl -Tw
    use strict;
    use Socket;
    use Carp;
    BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
    sub spawn;  # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }
    my $NAME = "catsock";
    my $uaddr = sockaddr_un($NAME);
    socket(Server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
    unlink($NAME);
    bind  (Server, $uaddr) || die "bind: $!";
    listen(Server, SOMAXCONN) || die "listen: $!";
    logmsg "server started on $NAME";
    my $waitedpid;
    use POSIX ":sys_wait_h";
    sub REAPER {
        while (($waitedpid = waitpid(-1, WNOHANG)) > 0) {
            logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
        }
        $SIG{CHLD} = \&REAPER;  # loathe SysV
    }
    $SIG{CHLD} = \&REAPER;
    for ( $waitedpid = 0;
          accept(Client, Server) || $waitedpid;
          $waitedpid = 0, close Client)
    {
        next if $waitedpid;
        logmsg "connection on $NAME";
        spawn sub {
            print "Hello there, it's now ", scalar localtime(), "\n";
            exec("/usr/games/fortune") || die "can't exec fortune: $!";
        };
    }
    sub spawn {
        my $coderef = shift();
        unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
            confess "usage: spawn CODEREF";
        }
        my $pid;
        unless (defined($pid = fork())) {
            logmsg "cannot fork: $!";
            return;
        }
        elsif ($pid) {
            logmsg "begat $pid";
            return; # I'm the parent
        }
        # else I'm the child -- go spawn
        open(STDIN,  "<&Client") || die "can't dup client to stdin";
        open(STDOUT, ">&Client") || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit($coderef->());
    }

As you see, it's remarkably similar to the Internet domain TCP server, so much so, in fact, that the spawn(), logmsg(), and REAPER() functions are essentially the same as in the other server.

So why would you ever want to use a Unix domain socket instead of a simpler named pipe? Because a named pipe doesn't give you sessions. You can't tell one process's data from another's. With socket programming, you get a separate session for each client; that's why accept() takes two arguments.

For example, let's say that you have a long-running database server daemon that you want folks to be able to access from the Web, but only if they go through a CGI interface. You'd have a small, simple CGI program that does whatever checks and logging you feel like, and then acts as a Unix-domain client and connects to your private server.

TCP Clients with IO::Socket

For those preferring a higher-level interface to socket programming, the IO::Socket module provides an object-oriented approach. If for some reason you lack this module, you can just fetch IO::Socket from CPAN, where you'll also find modules providing easy interfaces to the following systems: DNS, FTP, Ident (RFC 931), NIS and NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay, Telnet, and Time--to name just a few.

A Simple Client

Here's a client that creates a TCP connection to the "daytime" service at port 13 of the host name "localhost" and prints out everything that the server there cares to provide.

    #!/usr/bin/perl -w
    use IO::Socket;
    $remote = IO::Socket::INET->new(
                        Proto    => "tcp",
                        PeerAddr => "localhost",
                        PeerPort => "daytime(13)",
                    )
                  || die "can't connect to daytime service on localhost";
    while (<$remote>) { print }

When you run this program, you should get something back that looks like this:

    Wed May 14 08:40:46 MDT 1997

Here are what those parameters to the new() constructor mean:

  • Proto

    This is which protocol to use. In this case, the socket handle returned will be connected to a TCP socket, because we want a stream-oriented connection, that is, one that acts pretty much like a plain old file. Not all sockets are of this type. For example, the UDP protocol can be used to make a datagram socket, used for message-passing.

  • PeerAddr

    This is the name or Internet address of the remote host the server is running on. We could have specified a longer name like "www.perl.com" , or an address like "207.171.7.72" . For demonstration purposes, we've used the special hostname "localhost" , which should always mean the current machine you're running on. The corresponding Internet address for localhost is "127.0.0.1" , if you'd rather use that.

  • PeerPort

    This is the service name or port number we'd like to connect to. We could have gotten away with using just "daytime" on systems with a well-configured system services file,[FOOTNOTE: The system services file is found in /etc/services under Unixy systems.] but here we've specified the port number (13) in parentheses. Using just the number would have also worked, but numeric literals make careful programmers nervous.
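The "name(number)" form is a convenience of IO::Socket::INET. With the built-in functions, the same fallback can be sketched by hand, as below. The helper name port_for is hypothetical; the only behavior relied on is that getservbyname() returns the port number in scalar context and undef when the service is unknown.

```perl
use strict;
use warnings;

# Look up a service by name, falling back to a known number when the
# system services file has no entry for it.
sub port_for {
    my ($service, $proto, $fallback) = @_;
    my $port = getservbyname($service, $proto);   # scalar context: port
    return defined $port ? $port : $fallback;
}

my $daytime_port = port_for("daytime", "tcp", 13);
print "daytime/tcp is port $daytime_port\n";
```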

Notice how the return value from the new constructor is used as a filehandle in the while loop? That's what's called an indirect filehandle, a scalar variable containing a filehandle. You can use it the same way you would a normal filehandle. For example, you can read one line from it this way:

    $line = <$handle>;

all remaining lines from it this way:

    @lines = <$handle>;

and send a line of data to it this way:

    print $handle "some data\n";
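Those three operations can be tried without a network by using a connected pair of local sockets. This is a sketch under that assumption; socketpair() stands in for a real client and server, and the variable names are invented for the example.

```perl
use strict;
use warnings;
use Socket;

# Two connected handles, both held in ordinary scalar variables.
socketpair(my $writer, my $reader, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    || die "socketpair: $!";

print $writer "first\n", "second\n", "third\n";
close($writer) || die "close: $!";        # flushes and signals end-of-file

my $line  = <$reader>;                    # one line
my @lines = <$reader>;                    # everything that is left
close($reader);

print "got: $line";
print "rest: ", scalar(@lines), " lines\n";
```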

A Webget Client

Here's a simple client that takes a remote host to fetch a document from, and then a list of files to get from that host. This is a more interesting client than the previous one because it first sends something to the server before fetching the server's response.

    #!/usr/bin/perl -w
    use IO::Socket;
    unless (@ARGV > 1) { die "usage: $0 host url ..." }
    $host = shift(@ARGV);
    $EOL = "\015\012";
    $BLANK = $EOL x 2;
    for my $document (@ARGV) {
        $remote = IO::Socket::INET->new( Proto    => "tcp",
                                         PeerAddr => $host,
                                         PeerPort => "http(80)",
                  ) || die "cannot connect to httpd on $host";
        $remote->autoflush(1);
        print $remote "GET $document HTTP/1.0" . $BLANK;
        while ( <$remote> ) { print }
        close $remote;
    }

The web server handling the HTTP service is assumed to be at its standard port, number 80. If the server you're trying to connect to is at a different port, like 1080 or 8080, you should specify it as the named-parameter pair, PeerPort => 8080 . The autoflush method is used on the socket because otherwise the system would buffer up the output we sent it. (If you're on a prehistoric Mac, you'll also need to change every "\n" in your code that sends data over the network to be a "\015\012" instead.)

Connecting to the server is only the first part of the process: once you have the connection, you have to use the server's language. Each server on the network has its own little command language that it expects as input. The string that we send to the server starting with "GET" is in HTTP syntax. In this case, we simply request each specified document. Yes, we really are making a new connection for each document, even though it's the same host. That's the way you always used to have to speak HTTP. Recent versions of web browsers may request that the remote server leave the connection open a little while, but the server doesn't have to honor such a request.
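To make the shape of that request concrete, here is a sketch that only builds the text and prints it. The helper name http_get_request is hypothetical, and the Host header is an addition that webget itself does not send; it is included on the assumption that the target server hosts more than one site.

```perl
use strict;
use warnings;

my $EOL = "\015\012";

# Hypothetical helper: build the text of an HTTP/1.0 GET request.
sub http_get_request {
    my ($host, $document) = @_;
    return join $EOL,
        "GET $document HTTP/1.0",
        "Host: $host",          # not sent by webget; an assumption here
        "", "";                 # two empty fields yield the blank line
}

my $request = http_get_request("www.perl.com", "/guanaco.html");
print $request;
```

The request ends with an empty line, which is how the server knows the header section is complete.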

Here's an example of running that program, which we'll call webget:

    % webget www.perl.com /guanaco.html
    HTTP/1.1 404 File Not Found
    Date: Thu, 08 May 1997 18:02:32 GMT
    Server: Apache/1.2b6
    Connection: close
    Content-type: text/html
    <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
    <BODY><H1>File Not Found</H1>
    The requested URL /guanaco.html was not found on this server.<P>
    </BODY>

Ok, so that's not very interesting, because it didn't find that particular document. But a long response wouldn't have fit on this page.

For a more featureful version of this program, you should look to the lwp-request program included with the LWP modules from CPAN.

Interactive Client with IO::Socket

Well, that's all fine if you want to send one command and get one answer, but what about setting up something fully interactive, somewhat like the way telnet works? That way you can type a line, get the answer, type a line, get the answer, etc.

This client is more complicated than the two we've done so far, but if you're on a system that supports the powerful fork call, the solution isn't that rough. Once you've made the connection to whatever service you'd like to chat with, call fork to clone your process. Each of these two identical processes has a very simple job to do: the parent copies everything from the socket to standard output, while the child simultaneously copies everything from standard input to the socket. To accomplish the same thing using just one process would be much harder, because it's easier to code two processes to do one thing than it is to code one process to do two things. (This keep-it-simple principle is a cornerstone of the Unix philosophy, and good software engineering as well, which is probably why it's spread to other systems.)

Here's the code:

    #!/usr/bin/perl -w
    use strict;
    use IO::Socket;
    my ($host, $port, $kidpid, $handle, $line);
    unless (@ARGV == 2) { die "usage: $0 host port" }
    ($host, $port) = @ARGV;
    # create a tcp connection to the specified host and port
    $handle = IO::Socket::INET->new(Proto    => "tcp",
                                    PeerAddr => $host,
                                    PeerPort => $port)
              || die "can't connect to port $port on $host: $!";
    $handle->autoflush(1);      # so output gets there right away
    print STDERR "[Connected to $host:$port]\n";
    # split the program into two processes, identical twins
    die "can't fork: $!" unless defined($kidpid = fork());
    # the if{} block runs only in the parent process
    if ($kidpid) {
        # copy the socket to standard output
        while (defined ($line = <$handle>)) {
            print STDOUT $line;
        }
        kill("TERM", $kidpid);  # send SIGTERM to child
    }
    # the else{} block runs only in the child process
    else {
        # copy standard input to the socket
        while (defined ($line = <STDIN>)) {
            print $handle $line;
        }
        exit(0);                # just in case
    }

The kill function in the parent's if block is there to send a signal to our child process, currently running in the else block, as soon as the remote server has closed its end of the connection.

If the remote server sends data a byte at a time, and you need that data immediately without waiting for a newline (which might not happen), you may wish to replace the while loop in the parent with the following:

    my $byte;
    while (sysread($handle, $byte, 1) == 1) {
        print STDOUT $byte;
    }

Making a system call for each byte you want to read is not very efficient (to put it mildly) but is the simplest to explain and works reasonably well.
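A middle ground is to ask sysread() for a larger chunk. On a pipe or socket it returns as soon as some data is available rather than waiting for the full count, so a prompt without a newline still comes through promptly. The sketch below demonstrates this with a local pipe; the helper name copy_unbuffered and the 4096-byte size are assumptions, and it makes no attempt to retry a read interrupted by a signal.

```perl
use strict;
use warnings;

# Copy everything from $in to $out using unbuffered reads of up to
# 4096 bytes. Returns the number of bytes copied.
sub copy_unbuffered {
    my ($in, $out) = @_;
    my $total = 0;
    while (1) {
        my $got = sysread($in, my $chunk, 4096);
        die "sysread: $!" unless defined $got;   # undef means an error
        last if $got == 0;                       # 0 means end-of-file
        print {$out} $chunk;
        $total += $got;
    }
    return $total;
}

pipe(my $reader, my $writer) || die "pipe: $!";
syswrite($writer, "Command? ");                  # a prompt with no newline
close($writer);
my $copied = copy_unbuffered($reader, \*STDOUT);
print "\n[$copied bytes]\n";
```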

TCP Servers with IO::Socket

As always, setting up a server is a little bit more involved than running a client. The model is that the server creates a special kind of socket that does nothing but listen on a particular port for incoming connections. It does this by calling the IO::Socket::INET->new() method with slightly different arguments than the client did.

  • Proto

    This is which protocol to use. Like our clients, we'll still specify "tcp" here.

  • LocalPort

    We specify a local port in the LocalPort argument, which we didn't do for the client. This is the service name or port number for which you want to be the server. (Under Unix, ports under 1024 are restricted to the superuser.) In our sample, we'll use port 9000, but you can use any port that's not currently in use on your system. If you try to use one already in use, you'll get an "Address already in use" message. Under Unix, the netstat -a command will show which services currently have servers.

  • Listen

    The Listen parameter is set to the maximum number of pending connections we can accept until we turn away incoming clients. Think of it as a call-waiting queue for your telephone. The low-level Socket module has a special symbol for the system maximum, which is SOMAXCONN.

  • Reuse

    The Reuse parameter is needed so that we can restart our server manually without waiting a few minutes to allow system buffers to clear out.

Once the generic server socket has been created using the parameters listed above, the server then waits for a new client to connect to it. The server blocks in the accept method, which eventually accepts a bidirectional connection from the remote client. (Make sure to autoflush this handle to circumvent buffering.)
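The parameters described above can be tried without committing to a fixed port. In this sketch, which is an illustration rather than the server that follows, LocalPort is set to 0 so that the kernel picks a free port, and LocalAddr restricts the socket to the loopback interface; both choices are assumptions made for the demonstration. ReuseAddr is the newer spelling of Reuse in IO::Socket::INET.

```perl
use strict;
use warnings;
use IO::Socket;      # also imports the Socket constants such as SOMAXCONN

my $server = IO::Socket::INET->new(
    Proto     => "tcp",
    LocalAddr => "127.0.0.1",   # loopback only, for the demonstration
    LocalPort => 0,             # 0 asks the kernel for any free port
    Listen    => SOMAXCONN,
    ReuseAddr => 1,
) || die "can't set up server: $@";

my $port = $server->sockport;   # the port the kernel actually assigned
print "listening on port $port\n";
close($server);
```

A real server would go on to call $server->accept() in a loop instead of closing the socket.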

To add to user-friendliness, our server prompts the user for commands. Most servers don't do this. Because of the prompt without a newline, you'll have to use the sysread variant of the interactive client above.

This server accepts one of five different commands, sending output back to the client. Unlike most network servers, this one handles only one incoming client at a time. Multithreaded servers are covered in Chapter 16 of the Camel.

Here's the code.

    #!/usr/bin/perl -w
    use IO::Socket;
    use Net::hostent;      # for OOish version of gethostbyaddr
    $PORT = 9000;          # pick something not in use
    $server = IO::Socket::INET->new( Proto     => "tcp",
                                     LocalPort => $PORT,
                                     Listen    => SOMAXCONN,
                                     Reuse     => 1);
    die "can't setup server" unless $server;
    print "[Server $0 accepting clients]\n";
    while ($client = $server->accept()) {
        $client->autoflush(1);
        print $client "Welcome to $0; type help for command list.\n";
        $hostinfo = gethostbyaddr($client->peeraddr);
        printf "[Connect from %s]\n",
               $hostinfo ? $hostinfo->name : $client->peerhost;
        print $client "Command? ";
        while ( <$client>) {
            next unless /\S/;     # blank line
            if    (/quit|exit/i)  { last }
            elsif (/date|time/i)  { printf $client "%s\n", scalar localtime() }
            elsif (/who/i )       { print $client `who 2>&1` }
            elsif (/cookie/i )    { print $client `/usr/games/fortune 2>&1` }
            elsif (/motd/i )      { print $client `cat /etc/motd 2>&1` }
            else {
                print $client "Commands: quit date who cookie motd\n";
            }
        } continue {
            print $client "Command? ";
        }
        close $client;
    }

UDP: Message Passing

Another kind of client-server setup is one that uses not connections, but messages. UDP communications involve much lower overhead but also provide less reliability, as there are no promises that messages will arrive at all, let alone in order and unmangled. Still, UDP offers some advantages over TCP, including being able to "broadcast" or "multicast" to a whole bunch of destination hosts at once (usually on your local subnet). If you find yourself overly concerned about reliability and start building checks into your message system, then you probably should use just TCP to start with.

UDP datagrams are not a bytestream and should not be treated as such. This makes using I/O mechanisms with internal buffering like stdio (i.e. print() and friends) especially cumbersome. Use syswrite(), or better send(), like in the example below.
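As a minimal sketch of message passing with send() and recv(), the following exchanges a single datagram between two sockets on the loopback interface. The two-second timeout and the variable names are assumptions; passing 0 as the protocol selects the default for a datagram socket, which is UDP.

```perl
use strict;
use warnings;
use Socket;

socket(my $rx, PF_INET, SOCK_DGRAM, 0) || die "socket: $!";
socket(my $tx, PF_INET, SOCK_DGRAM, 0) || die "socket: $!";

# Bind the receiver to loopback and let the kernel choose the port.
bind($rx, sockaddr_in(0, INADDR_LOOPBACK)) || die "bind: $!";
my $rx_addr = getsockname($rx);

# One send() is one datagram; no stdio buffering is involved.
defined(send($tx, "ping", 0, $rx_addr)) || die "send: $!";

# Wait up to two seconds for the datagram, then read it in one recv().
my $received = "";
my $rin = "";
vec($rin, fileno($rx), 1) = 1;
if (select(my $rout = $rin, undef, undef, 2.0)) {
    defined(recv($rx, $received, 64, 0)) || die "recv: $!";
}
print "received: '$received'\n";
```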

Here's a UDP program similar to the sample Internet TCP client given earlier. However, instead of checking one host at a time, the UDP version will check many of them asynchronously by simulating a multicast and then using select() to do a timed-out wait for I/O. To do something similar with TCP, you'd have to use a different socket handle for each host.

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    use Sys::Hostname;
    my ( $count, $hisiaddr, $hispaddr, $histime,
         $host, $iaddr, $paddr, $port, $proto,
         $rin, $rout, $rtime, $SECS_OF_70_YEARS);
    $SECS_OF_70_YEARS = 2_208_988_800;
    $iaddr = gethostbyname(hostname());
    $proto = getprotobyname("udp");
    $port = getservbyname("time", "udp");
    $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick
    socket(SOCKET, PF_INET, SOCK_DGRAM, $proto) || die "socket: $!";
    bind(SOCKET, $paddr) || die "bind: $!";
    $| = 1;
    printf "%-12s %8s %s\n", "localhost", 0, scalar localtime();
    $count = 0;
    for $host (@ARGV) {
        $count++;
        $hisiaddr = inet_aton($host) || die "unknown host";
        $hispaddr = sockaddr_in($port, $hisiaddr);
        defined(send(SOCKET, 0, 0, $hispaddr)) || die "send $host: $!";
    }
    $rin = "";
    vec($rin, fileno(SOCKET), 1) = 1;
    # timeout after 10.0 seconds
    while ($count && select($rout = $rin, undef, undef, 10.0)) {
        $rtime = "";
        $hispaddr = recv(SOCKET, $rtime, 4, 0) || die "recv: $!";
        ($port, $hisiaddr) = sockaddr_in($hispaddr);
        $host = gethostbyaddr($hisiaddr, AF_INET);
        $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
        printf "%-12s ", $host;
        printf "%8d %s\n", $histime - time(), scalar localtime($histime);
        $count--;
    }

This example does not include any retries and may consequently fail to contact a reachable host. The most prominent reason for this is congestion of the queues on the sending host if the number of hosts to contact is sufficiently large.

SysV IPC

While System V IPC isn't so widely used as sockets, it still has some interesting uses. However, you cannot use SysV IPC or Berkeley mmap() to have a variable shared amongst several processes. That's because Perl would reallocate your string when you weren't wanting it to. You might look into the IPC::Shareable or threads::shared modules for that.

Here's a small example showing shared memory usage.

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);
    $size = 2000;
    $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR);
    defined($id) || die "shmget: $!";
    print "shm key $id\n";
    $message = "Message #1";
    shmwrite($id, $message, 0, 60) || die "shmwrite: $!";
    print "wrote: '$message'\n";
    shmread($id, $buff, 0, 60) || die "shmread: $!";
    print "read : '$buff'\n";
    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = "";
    print "un" unless $buff eq $message;
    print "swell\n";
    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0) || die "shmctl: $!";
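The substr() and index() idiom used to trim the padding relies on the buffer containing at least one NUL. If the message filled the buffer completely, index() would return -1 and the last character would be cut off instead. Here is a sketch of a more defensive trim, using simulated buffers rather than real shared memory; the helper name strip_nul_padding is hypothetical.

```perl
use strict;
use warnings;

# Simulate what shmread() hands back: the data followed by NUL padding.
my $padded   = "Message #1" . ("\0" x 50);
my $unpadded = "exactly full";             # no NUL anywhere

# Hypothetical helper: drop the first NUL and everything after it.
# A buffer without any NUL is returned unchanged.
sub strip_nul_padding {
    my ($buf) = @_;
    $buf =~ s/\0.*\z//s;
    return $buf;
}

my $trimmed   = strip_nul_padding($padded);
my $untouched = strip_nul_padding($unpadded);
print "'$trimmed'\n";
print "'$untouched'\n";
```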

Here's an example of a semaphore:

    use IPC::SysV qw(IPC_CREAT);
    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
    defined($id) || die "semget: $!";
    print "sem id $id\n";

Put this code in a separate file to be run in more than one process. Call the file take:

    # look up the existing semaphore set
    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    defined($id) || die "semget: $!";
    $semnum = 0;
    $semflag = 0;
    # "take" semaphore
    # wait for semaphore to be zero
    $semop = 0;
    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);
    # Increment the semaphore count
    $semop = 1;
    $opstring2 = pack("s!s!s!", $semnum, $semop, $semflag);
    $opstring = $opstring1 . $opstring2;
    semop($id, $opstring) || die "semop: $!";
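The buffer handed to semop() in the take script is simply two operations laid end to end, each made of three native shorts: semaphore number, operation, and flags. The system applies the operations in one buffer as a single atomic group, which is what makes "wait for zero, then increment" safe. The sketch below only builds and measures such a buffer without touching any semaphore; the helper name pack_semops is an assumption.

```perl
use strict;
use warnings;

# Pack a list of [semnum, op, flags] triples into a semop() buffer.
sub pack_semops {
    my @ops = @_;
    return join "", map { pack("s!3", @$_) } @ops;
}

my $take = pack_semops(
    [0, 0, 0],      # wait until semaphore 0 is zero
    [0, 1, 0],      # then increment it
);

my $short_size = length(pack("s!", 0));            # size of a native short
my $op_count   = length($take) / (3 * $short_size);
print "buffer holds $op_count operations\n";
```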

Put this code in a separate file to be run in more than one process. Call this file give:

    # "give" the semaphore
    # run this in the original process and you will see
    # that the second process continues
    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    die unless defined($id);
    $semnum = 0;
    $semflag = 0;
    # Decrement the semaphore count
    $semop = -1;
    $opstring = pack("s!s!s!", $semnum, $semop, $semflag);
    semop($id, $opstring) || die "semop: $!";

The SysV IPC code above was written long ago, and it's definitely clunky looking. For a more modern look, see the IPC::SysV module.

A small example demonstrating SysV message queues:

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);
    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR);
    defined($id) || die "msgget failed: $!";
    my $sent = "message";
    my $type_sent = 1234;
    msgsnd($id, pack("l! a*", $type_sent, $sent), 0)
        || die "msgsnd failed: $!";
    msgrcv($id, my $rcvd_buf, 60, 0, 0)
        || die "msgrcv failed: $!";
    my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf);
    if ($rcvd eq $sent) {
        print "okay\n";
    } else {
        print "not okay\n";
    }
    msgctl($id, IPC_RMID, 0) || die "msgctl failed: $!\n";
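The only framing in that example is the "l! a*" template: a native long holding the message type, followed by the message bytes. As a small sketch, the round trip can be checked without creating a queue at all; the variable names here are invented for the illustration.

```perl
use strict;
use warnings;

my $type = 1234;
my $body = "message";

my $wire = pack("l! a*", $type, $body);           # what msgsnd() is given
my ($type_back, $body_back) = unpack("l! a*", $wire);

my $header_size = length(pack("l!", 0));          # size of a native long
printf "type=%d body=%s header=%d bytes\n",
    $type_back, $body_back, $header_size;
```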

NOTES

Most of these routines quietly but politely return undef when they fail instead of causing your program to die right then and there due to an uncaught exception. (Actually, some of the new Socket conversion functions do croak() on bad arguments.) It is therefore essential to check return values from these functions. Always begin your socket programs this way for optimal success, and don't forget to add the -T taint-checking flag to the #! line for servers:

    #!/usr/bin/perl -Tw
    use strict;
    use sigtrap;
    use Socket;

BUGS

These routines all create system-specific portability problems. As noted elsewhere, Perl is at the mercy of your C libraries for much of its system behavior. It's probably safest to assume broken SysV semantics for signals and to stick with simple TCP and UDP socket operations; e.g., don't try to pass open file descriptors over a local UDP datagram socket if you want your code to stand a chance of being portable.

AUTHOR

Tom Christiansen, with occasional vestiges of Larry Wall's original version and suggestions from the Perl Porters.

SEE ALSO

There's a lot more to networking than this, but this should get you started.

For intrepid programmers, the indispensable textbook is Unix Network Programming, 2nd Edition, Volume 1 by W. Richard Stevens (published by Prentice-Hall). Most books on networking address the subject from the perspective of a C programmer; translation to Perl is left as an exercise for the reader.

The IO::Socket(3) manpage describes the object library, and the Socket(3) manpage describes the low-level interface to sockets. Besides the obvious functions in perlfunc, you should also check out the modules file at your nearest CPAN site, especially http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_. See perlmodlib or best yet, the Perl FAQ for a description of what CPAN is and where to get it if the previous link doesn't work for you.

Section 5 of CPAN's modules file is devoted to "Networking, Device Control (modems), and Interprocess Communication", and contains numerous unbundled modules for networking, Chat and Expect operations, CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet, Threads, and ToolTalk--to name just a few.

 
perlirix

NAME

perlirix - Perl version 5 on Irix systems

DESCRIPTION

This document describes various features of Irix that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

Building 32-bit Perl in Irix

Use

    sh Configure -Dcc='cc -n32'

to compile Perl 32-bit. Don't bother with -n32 unless you have 7.1 or later compilers (use cc -version to check).

(Building 'cc -n32' is the default.)

Building 64-bit Perl in Irix

Use

    sh Configure -Dcc='cc -64' -Duse64bitint

This requires a 64-bit MIPS CPU (R8000, R10000, ...).

You can also use

    sh Configure -Dcc='cc -64' -Duse64bitall

but that makes no difference compared with the -Duse64bitint because of the cc -64 .

You can also do

    sh Configure -Dcc='cc -n32' -Duse64bitint

to use long longs for the 64-bit integer type, in case you don't have a 64-bit CPU.

If you are using gcc, just

    sh Configure -Dcc=gcc -Duse64bitint

should be enough; Configure should automatically probe for the correct 64-bit settings.
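Once Perl is built and installed, the standard Config module reports what Configure actually chose. The following is a small sketch for checking the integer size of the perl that runs it; it is not specific to Irix, and the variable names are made up for the example.

```perl
use strict;
use warnings;
use Config;

my $ivsize      = $Config{ivsize};                  # bytes in a Perl integer
my $use64bitint = $Config{use64bitint} || "undef";
my $use64bitall = $Config{use64bitall} || "undef";

print "ivsize=$ivsize use64bitint=$use64bitint use64bitall=$use64bitall\n";
print "64-bit integers are ", ($ivsize >= 8 ? "" : "not "), "available\n";
```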

About Compiler Versions of Irix

Some Irix cc versions, e.g. 7.3.1.1m (try cc -version) have been known to have issues (coredumps) when compiling perl.c. If you've used -OPT:fast_io=ON and this happens, try removing it. If that fails, or you didn't use that, then try adjusting other optimization options (-LNO, -INLINE, -O3 to -O2, etcetera). The compiler bug has been reported to SGI. (Allen Smith <easmith@beatrice.rutgers.edu>)

Linker Problems in Irix

If you get complaints about so_locations then search in the file hints/irix_6.sh for "lddflags" and do the suggested adjustments. (David Billinghurst <David.Billinghurst@riotinto.com.au>)

Malloc in Irix

Do not try to use Perl's malloc; this will lead to very mysterious errors (especially with -Duse64bitall).

Building with threads in Irix

Run Configure with -Duseithreads which will configure Perl with the Perl 5.8.0 "interpreter threads", see threads.

For Irix 6.2 with perl threads, you have to have the following patches installed:

    1404 Irix 6.2 Posix 1003.1b man pages
    1645 Irix 6.2 & 6.3 POSIX header file updates
    2000 Irix 6.2 Posix 1003.1b support modules
    2254 Pthread library fixes
    2401 6.2 all platform kernel rollup

IMPORTANT: Without patch 2401, a kernel bug in Irix 6.2 will cause your machine to panic and crash when running threaded perl. Irix 6.3 and later are okay.

Thanks to Hannu Napari <Hannu.Napari@hut.fi> for the IRIX pthreads patches information.

Irix 5.3

While running Configure and when building, you are likely to get quite a few of these warnings:

  1. ld:
  2. The shared object /usr/lib/libm.so did not resolve any symbols.
  3. You may want to remove it from your link line.

Ignore them: in IRIX 5.3 there is no way to quieten ld about this.

During compilation you will see this warning from toke.c:

  1. uopt: Warning: Perl_yylex: this procedure not optimized because it
  2. exceeds size threshold; to optimize this procedure, use -Olimit option
  3. with value >= 4252.

Ignore the warning.

In IRIX 5.3 and with Perl 5.8.1 (Perl 5.8.0 didn't compile in IRIX 5.3) the following failures are known.

  Failed Test                  Stat Wstat Total Fail  Failed  List of Failed
  --------------------------------------------------------------------------
  ../ext/List/Util/t/shuffle.t    0   139    ??   ??       %  ??
  ../lib/Math/Trig.t            255 65280    29   12  41.38%  24-29
  ../lib/sort.t                   0   138   119   72  60.50%  48-119
  56 tests and 474 subtests skipped.
  Failed 3/811 test scripts, 99.63% okay. 78/75813 subtests failed, 99.90% okay.

They are suspected to be compiler errors (at least the shuffle.t failure is known from some IRIX 6 setups) and math library errors (the Trig.t failure), but since IRIX 5 is long since end-of-lifed, further fixes for the IRIX are unlikely. If you can get gcc for 5.3, you could try that, too, since gcc in IRIX 6 is a known workaround for at least the shuffle.t and sort.t failures.

AUTHOR

Jarkko Hietaniemi <jhi@iki.fi>

Please report any errors, updates, or suggestions to perlbug@perl.org.

 
perlivp

NAME

perlivp - Perl Installation Verification Procedure

SYNOPSIS

perlivp [-p] [-v] [-h]

DESCRIPTION

The perlivp program is set up at Perl source code build time to test the Perl version it was built under. It can be used after running:

  1. make install

(or your platform's equivalent procedure) to verify that perl and its libraries have been installed correctly. A correct installation is verified by output that looks like:

  1. ok 1
  2. ok 2

etc.

OPTIONS

  • -h help

    Prints out a brief help message.

  • -p print preface

    Gives a description of each test prior to performing it.

  • -v verbose

    Gives more detailed information about each test, after it has been performed. Note that any failed tests ought to print out some extra information whether or not -v is thrown.

DIAGNOSTICS

  • print "# Perl binary '$perlpath' does not appear executable.\n";

    Likely to occur for a perl binary that was not properly installed. Correct by conducting a proper installation.

  • print "# Perl version '$]' installed, expected $ivp_VERSION.\n";

    Likely to occur for a perl that was not properly installed. Correct by conducting a proper installation.

  • print "# Perl \@INC directory '$_' does not appear to exist.\n";

    Likely to occur for a perl library tree that was not properly installed. Correct by conducting a proper installation.

  • print "# Needed module '$_' does not appear to be properly installed.\n";

    One of the two modules that is used by perlivp was not present in the installation. This is a serious error since it adversely affects perlivp's ability to function. You may be able to correct this by performing a proper perl installation.

  • print "# Required module '$_' does not appear to be properly installed.\n";

    An attempt to eval "require $module" failed, even though the list of extensions indicated that it should succeed. Correct by conducting a proper installation.

  • print "# Unnecessary module 'bLuRfle' appears to be installed.\n";

    This test not coming out ok could indicate that you have in fact installed a bLuRfle.pm module or that the eval " require \"$module_name.pm\"; " test may give misleading results with your installation of perl. If yours is the latter case then please let the author know.

  • print "# file",+($#missing == 0) ? '' : 's'," missing from installation:\n";

    One or more files turned up missing according to a run of ExtUtils::Installed -> validate() over your installation. Correct by conducting a proper installation.

For further information on how to conduct a proper installation consult the INSTALL file that comes with the perl source and the README file for your platform.
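
The first and third diagnostics above correspond to checks that are easy to approximate by hand. The sketch below is an illustration only and is not perlivp's actual code: it uses $^X (the path perl was invoked with) as a stand-in for the installed binary path, and it skips any references in @INC. Some configured @INC directories may legitimately be absent on a healthy system, so treat the output as informational.

```perl
use strict;
use warnings;

# Is the running perl binary executable? ($^X may be a relative name on
# some platforms, in which case this check is only an approximation.)
my $perl_ok = -x $^X ? 1 : 0;
print $perl_ok
    ? "ok - '$^X' appears executable\n"
    : "# Perl binary '$^X' does not appear executable.\n";

# Which plain-string @INC entries do not exist as directories?
my @missing_dirs = grep { !ref($_) && !-d $_ } @INC;
print "# Perl \@INC directory '$_' does not appear to exist.\n"
    for @missing_dirs;
print "ok - all \@INC directories exist\n" unless @missing_dirs;
```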

AUTHOR

Peter Prymmer

 
perllexwarn

NAME

perllexwarn - Perl Lexical Warnings

DESCRIPTION

The use warnings pragma lets you control precisely which warnings are enabled in which parts of a Perl program. It's a more flexible alternative to both the command line flag -w and the equivalent Perl variable, $^W.

This pragma works just like the strict pragma. This means that the scope of the warning pragma is limited to the enclosing block. It also means that the pragma setting will not leak across files (via use, require or do). This allows authors to independently define the degree of warning checks that will be applied to their module.

By default, optional warnings are disabled, so any legacy code that doesn't attempt to control the warnings will work unchanged.

All warnings are enabled in a block by either of these:

  1. use warnings;
  2. use warnings 'all';

Similarly all warnings are disabled in a block by either of these:

  1. no warnings;
  2. no warnings 'all';

For example, consider the code below:

  use warnings;
  my @a;
  {
      no warnings;
      my $b = @a[0];
  }
  my $c = @a[0];

The code in the enclosing block has warnings enabled, but the inner block has them disabled. In this case that means the assignment to the scalar $c will trip the "Scalar value @a[0] better written as $a[0]" warning, but the assignment to the scalar $b will not.

Default Warnings and Optional Warnings

Before the introduction of lexical warnings, Perl had two classes of warnings: mandatory and optional.

As its name suggests, if your code tripped a mandatory warning, you would get a warning whether you wanted it or not. For example, the code below would always produce an "isn't numeric" warning about the "2:".

  1. my $a = "2:" + 3;

With the introduction of lexical warnings, mandatory warnings become default warnings. The difference is that although the previously mandatory warnings are still enabled by default, they can subsequently be enabled or disabled with the lexical warning pragma. For example, in the code below, an "isn't numeric" warning will only be reported for the $a variable.

  1. my $a = "2:" + 3;
  2. no warnings;
  3. my $b = "2:" + 3;

Note that neither the -w flag nor $^W can be used to disable/enable default warnings. They are still mandatory in this case.

What's wrong with -w and $^W

Although very useful, the big problem with using -w on the command line to enable warnings is that it is all or nothing. Take the typical scenario when you are writing a Perl program. Parts of the code you will write yourself, but it's very likely that you will make use of pre-written Perl modules. If you use the -w flag in this case, you end up enabling warnings in pieces of code that you haven't written.

Similarly, using $^W to either disable or enable blocks of code is fundamentally flawed. For a start, say you want to disable warnings in a block of code. You might expect this to be enough to do the trick:

  {
      local ($^W) = 0;
      my $a =+ 2;
      my $b; chop $b;
  }

When this code is run with the -w flag, a warning will be produced for the $a line: "Reversed += operator".

The problem is that Perl has both compile-time and run-time warnings. To disable compile-time warnings you need to rewrite the code like this:

  {
      BEGIN { $^W = 0 }
      my $a =+ 2;
      my $b; chop $b;
  }

The other big problem with $^W is the way you can inadvertently change the warning setting in unexpected places in your code. For example, when the code below is run (without the -w flag), the second call to doit will trip a "Use of uninitialized value" warning, whereas the first will not.

  sub doit
  {
      my $b; chop $b;
  }

  doit();

  {
      local ($^W) = 1;
      doit()
  }

This is a side-effect of $^W being dynamically scoped.

Lexical warnings get around these limitations by allowing finer control over where warnings can or can't be tripped.
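
The contrast between the two scoping models can be observed directly. The following is a minimal sketch: the sub name chop_undef and the @caught array are made up for illustration, warnings are collected through a $SIG{__WARN__} handler, and the script is assumed to be run without -w or -W.

```perl
use strict;

$^W = 0;    # start from a known state (the script is assumed to run without -w)

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

# Compiled with no lexical warnings in effect, so it obeys $^W at run time.
sub chop_undef { my $v; chop $v; }

chop_undef();           # $^W is false: silent

{
    local $^W = 1;
    chop_undef();       # dynamic scope reaches into the sub: warns
}

{
    use warnings;
    chop_undef();       # lexical scope stops at this block: silent
}

print scalar(@caught), " warning(s) caught\n";
print $caught[0] if @caught;
```

Only the call made while $^W is localized to 1 trips the "Use of uninitialized value" warning; the use warnings in the last block affects code compiled inside that block, not the body of the sub it calls.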

Controlling Warnings from the Command Line

There are three Command Line flags that can be used to control when warnings are (or aren't) produced:

  • -w

    This is the existing flag. If the lexical warnings pragma is not used in any of your code, or any of the modules that you use, this flag will enable warnings everywhere. See Backward Compatibility for details of how this flag interacts with lexical warnings.

  • -W

    If the -W flag is used on the command line, it will enable all warnings throughout the program regardless of whether warnings were disabled locally using no warnings or $^W = 0. This includes all files that get included via use, require or do. Think of it as the Perl equivalent of the "lint" command.

  • -X

    Does the exact opposite to the -W flag, i.e. it disables all warnings.

Backward Compatibility

If you are used to working with a version of Perl prior to the introduction of lexically scoped warnings, or have code that uses both lexical warnings and $^W, this section will describe how they interact.

How Lexical Warnings interact with -w/$^W:

1.

If none of the three command line flags (-w, -W or -X) that control warnings is used and neither $^W nor the warnings pragma are used, then default warnings will be enabled and optional warnings disabled. This means that legacy code that doesn't attempt to control the warnings will work unchanged.

2.

The -w flag just sets the global $^W variable as in 5.005. This means that any legacy code that currently relies on manipulating $^W to control warning behavior will still work as is.

3.

Apart from now being a boolean, the $^W variable operates in exactly the same horrible uncontrolled global way, except that it cannot disable/enable default warnings.

4.

If a piece of code is under the control of the warnings pragma, both the $^W variable and the -w flag will be ignored for the scope of the lexical warning.

5.

The only way to override a lexical warnings setting is with the -W or -X command line flags.

The combined effect of 3 & 4 is that code which uses the warnings pragma can control the warning behavior of $^W-type code (using a local $^W=0) if it really wants to, but not vice versa.
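
Rule 4 can be checked with a small experiment. This is a sketch under the same assumptions as the rest of this section (no -W or -X on the command line); the sub names quiet_chop and legacy_chop are hypothetical.

```perl
use strict;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

{
    no warnings;
    # Compiled under the warnings pragma: $^W is ignored inside this sub.
    sub quiet_chop { my $v; chop $v; }
}

# Compiled outside any lexical warnings scope: still obeys $^W.
sub legacy_chop { my $v; chop $v; }

{
    local $^W = 1;
    quiet_chop();       # silent, the lexical setting wins
    legacy_chop();      # warns, $^W is in control here
}

print scalar(@caught), " warning(s) caught\n";
```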

Category Hierarchy

A hierarchy of "categories" has been defined to allow groups of warnings to be enabled/disabled in isolation.

The current hierarchy is:

  all -+
       |
       +- closure
       |
       +- deprecated
       |
       +- exiting
       |
       +- experimental --+
       |                 |
       |                 +- experimental::lexical_subs
       |
       +- glob
       |
       +- imprecision
       |
       +- io ------------+
       |                 |
       |                 +- closed
       |                 |
       |                 +- exec
       |                 |
       |                 +- layer
       |                 |
       |                 +- newline
       |                 |
       |                 +- pipe
       |                 |
       |                 +- unopened
       |
       +- misc
       |
       +- numeric
       |
       +- once
       |
       +- overflow
       |
       +- pack
       |
       +- portable
       |
       +- recursion
       |
       +- redefine
       |
       +- regexp
       |
       +- severe --------+
       |                 |
       |                 +- debugging
       |                 |
       |                 +- inplace
       |                 |
       |                 +- internal
       |                 |
       |                 +- malloc
       |
       +- signal
       |
       +- substr
       |
       +- syntax --------+
       |                 |
       |                 +- ambiguous
       |                 |
       |                 +- bareword
       |                 |
       |                 +- digit
       |                 |
       |                 +- illegalproto
       |                 |
       |                 +- parenthesis
       |                 |
       |                 +- precedence
       |                 |
       |                 +- printf
       |                 |
       |                 +- prototype
       |                 |
       |                 +- qw
       |                 |
       |                 +- reserved
       |                 |
       |                 +- semicolon
       |
       +- taint
       |
       +- threads
       |
       +- uninitialized
       |
       +- unpack
       |
       +- untie
       |
       +- utf8 ----------+
       |                 |
       |                 +- non_unicode
       |                 |
       |                 +- nonchar
       |                 |
       |                 +- surrogate
       |
       +- void

Just like the "strict" pragma, any of these categories can be combined:

  1. use warnings qw(void redefine);
  2. no warnings qw(io syntax untie);

Also like the "strict" pragma, if there is more than one instance of the warnings pragma in a given scope the cumulative effect is additive.

  1. use warnings qw(void); # only "void" warnings enabled
  2. ...
  3. use warnings qw(io); # only "void" & "io" warnings enabled
  4. ...
  5. no warnings qw(void); # only "io" warnings enabled

To determine which category a specific warning has been assigned to, see perldiag.

Note: In Perl 5.6.1, the lexical warnings category "deprecated" was a sub-category of the "syntax" category. It is now a top-level category in its own right.
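
To see the additive behavior in action, the sketch below enables two categories, switches one of them off again, and records what is actually reported. The variable names and the $SIG{__WARN__} collector are illustrative assumptions, not part of the pragma.

```perl
use strict;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

{
    use warnings qw(numeric uninitialized);   # both categories on
    no warnings qw(uninitialized);            # only "numeric" remains

    my ($text, $unset) = ("2:");
    my $sum    = $text + 3;       # "numeric" is enabled: warns
    my $joined = "x" . $unset;    # "uninitialized" is disabled: silent
}

print scalar(@caught), " warning(s) caught\n";
print $caught[0] if @caught;
```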

Fatal Warnings

The presence of the word "FATAL" in the category list will escalate any warnings detected from the categories specified in the lexical scope into fatal errors. In the code below, the use of time, length and join can all produce a "Useless use of xxx in void context" warning.

  use warnings;

  time;

  {
      use warnings FATAL => qw(void);
      length "abc";
  }

  join "", 1,2,3;

  print "done\n";

When run it produces this output

  1. Useless use of time in void context at fatal line 3.
  2. Useless use of length in void context at fatal line 7.

The scope where length is used has escalated the void warnings category into a fatal error, so the program terminates as soon as it encounters the warning.

To explicitly turn off a "FATAL" warning you just disable the warning it is associated with. So, for example, to disable the "void" warning in the example above, either of these will do the trick:

  1. no warnings qw(void);
  2. no warnings FATAL => qw(void);

If you want to downgrade a warning that has been escalated into a fatal error back to a normal warning, you can use the "NONFATAL" keyword. For example, the code below will promote all warnings into fatal errors, except for those in the "syntax" category.

  1. use warnings FATAL => 'all', NONFATAL => 'syntax';
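
Because a fatal warning is raised as an ordinary exception, it can be trapped with eval. The following is a minimal sketch using the "numeric" category; the variable names are illustrative, and the operand is kept in a variable so that the addition is evaluated at run time.

```perl
use strict;

my $result = eval {
    use warnings FATAL => qw(numeric);
    my $text = "2:";
    my $sum  = $text + 3;     # dies here instead of merely warning
    "not reached";
};
my $error = $@;

print "eval returned: ", (defined $result ? $result : "(undef)"), "\n";
print "error was: $error";
```

Outside the eval block the FATAL setting is no longer in effect, since it is lexically scoped like any other use of the pragma.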

Reporting Warnings from a Module

The warnings pragma provides a number of functions that are useful for module authors. These are used when you want to report a module-specific warning to a calling module that has enabled warnings via the warnings pragma.

Consider the module MyMod::Abc below.

  package MyMod::Abc;

  use warnings::register;

  sub open {
      my $path = shift;
      if ($path !~ m#^/#) {
          warnings::warn("changing relative path to /var/abc")
              if warnings::enabled();
          $path = "/var/abc/$path";
      }
  }

  1;

The call to warnings::register will create a new warnings category called "MyMod::Abc", i.e. the new category name matches the current package name. The open function in the module will display a warning message if it gets given a relative path as a parameter. This warning will only be displayed if the code that uses MyMod::Abc has actually enabled it with the warnings pragma like below.

  use MyMod::Abc;
  use warnings 'MyMod::Abc';
  ...
  MyMod::Abc::open("../fred.txt");

It is also possible to test whether the pre-defined warnings categories are set in the calling module with the warnings::enabled function. Consider this snippet of code:

  package MyMod::Abc;

  sub open {
      warnings::warnif("deprecated",
                       "open is deprecated, use new instead");
      new(@_);
  }

  sub new
  ...
  1;

The function open has been deprecated, so code has been included to display a warning message whenever the calling module has (at least) the "deprecated" warnings category enabled. Something like this, say.

  1. use warnings 'deprecated';
  2. use MyMod::Abc;
  3. ...
  4. MyMod::Abc::open($filename);

Either the warnings::warn or warnings::warnif function should be used to actually display the warnings message. This is because they can make use of the feature that allows warnings to be escalated into fatal errors. So in this case

  1. use MyMod::Abc;
  2. use warnings FATAL => 'MyMod::Abc';
  3. ...
  4. MyMod::Abc::open('../fred.txt');

the warnings::warnif function will detect this and die after displaying the warning message.

The three warnings functions, warnings::warn, warnings::warnif and warnings::enabled, can optionally take an object reference in place of a category name. In this case the functions will use the class name of the object as the warnings category.

Consider this example:

  package Original;

  no warnings;
  use warnings::register;

  sub new
  {
      my $class = shift;
      bless [], $class;
  }

  sub check
  {
      my $self = shift;
      my $value = shift;

      if ($value % 2 && warnings::enabled($self))
        { warnings::warn($self, "Odd numbers are unsafe") }
  }

  sub doit
  {
      my $self = shift;
      my $value = shift;
      $self->check($value);
      # ...
  }

  1;

  package Derived;

  use warnings::register;
  use Original;
  our @ISA = qw( Original );
  sub new
  {
      my $class = shift;
      bless [], $class;
  }

  1;

The code below makes use of both modules, but it only enables warnings from Derived.

  use Original;
  use Derived;
  use warnings 'Derived';
  my $a = Original->new();
  $a->doit(1);
  my $b = Derived->new();
  $b->doit(1);

When this code is run only the Derived object, $b, will generate a warning.

  1. Odd numbers are unsafe at main.pl line 7

Notice also that the warning is reported at the line where the object is first used.

When registering new categories of warning, you can supply more names to warnings::register like this:

  1. package MyModule;
  2. use warnings::register qw(format precision);
  3. ...
  4. warnings::warnif('MyModule::format', '...');
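
Put together in a single file, the registration of an extra sub-category and its use from a caller might look like the sketch below. The check_format function, its message text and the @seen collector are hypothetical; only warnings::register, warnings::warnif and the pragma itself come from the documentation above. The package block is placed first so that the category is already registered when the caller's use warnings line is compiled.

```perl
use strict;

{
    package MyModule;
    use warnings::register qw(format precision);

    # Hypothetical helper: complain about non-numeric input if the
    # caller has enabled the MyModule::format category.
    sub check_format {
        my ($value) = @_;
        warnings::warnif('MyModule::format', "unexpected format '$value'")
            unless $value =~ /^\d+$/;
        return $value;
    }
}

my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

{
    no warnings 'MyModule::format';
    MyModule::check_format('abc');      # category off in the caller: silent
}

{
    use warnings 'MyModule::format';
    MyModule::check_format('abc');      # category on in the caller: warns
}

print scalar(@seen), " warning(s) reported\n";
print $seen[0] if @seen;
```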

SEE ALSO

warnings, perldiag.

AUTHOR

Paul Marquess

 
perllinux

NAME

perllinux - Perl version 5 on Linux systems

DESCRIPTION

This document describes various features of Linux that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

Experimental Support for Sun Studio Compilers for Linux OS

Sun Microsystems has released a port of their Sun Studio compilers for Linux. As of November 2005, only an alpha version has been released. Until a release of these compilers is made, support for compiling Perl with these compilers is experimental.

There are also some special instructions for building Perl with Sun Studio on Linux. Following the normal Configure, you have to run make as follows:

  1. LDLOADLIBS=-lc make

LDLOADLIBS is an environment variable used by the linker to link /ext modules to glibc. Currently, that environment variable is not getting populated by a combination of Config entries and ExtUtils::MakeMaker. While there may be a bug somewhere in Perl's configuration or ExtUtils::MakeMaker causing the problem, the most likely cause is an incomplete understanding of Sun Studio by this author. Further investigation is needed to get this working better.

AUTHOR

Steve Peters <steve@fisharerojo.org>

Please report any errors, updates, or suggestions to perlbug@perl.org.

 
perllocale

NAME

perllocale - Perl locale handling (internationalization and localization)

DESCRIPTION

In the beginning there was ASCII, the "American Standard Code for Information Interchange", which works quite well for Americans with their English alphabet and dollar-denominated currency. But it doesn't work so well even for other English speakers, who may use different currencies, such as the pound sterling (as the symbol for that currency is not in ASCII); and it's hopelessly inadequate for many of the thousands of the world's other languages.

To address these deficiencies, the concept of locales was invented (formally the ISO C, XPG4, POSIX 1.c "locale system"). And applications were and are being written that use the locale mechanism. The process of making such an application take account of its users' preferences in these kinds of matters is called internationalization (often abbreviated as i18n); telling such an application about a particular set of preferences is known as localization (l10n).

Perl was extended to support the locale system. This is controlled per application by using one pragma, one function call, and several environment variables.

Unfortunately, there are quite a few deficiencies with the design (and often, the implementations) of locales, and their use for character sets has mostly been supplanted by Unicode (see perlunitut for an introduction to that, and keep on reading here for how Unicode interacts with locales in Perl).

Perl continues to support the old locale system, and starting in v5.16, provides a hybrid way to use the Unicode character set, along with the other portions of locales that may not be so problematic. (Unicode is also creating CLDR , the "Common Locale Data Repository", http://cldr.unicode.org/ which includes more types of information than are available in the POSIX locale system. At the time of this writing, there was no CPAN module that provides access to this XML-encoded data. However, many of its locales have the POSIX-only data extracted, and are available at http://unicode.org/Public/cldr/latest/.)

WHAT IS A LOCALE

A locale is a set of data that describes various aspects of how various communities in the world categorize their world. These categories are broken down into the following types (some of which include a brief note here):

  • Category LC_NUMERIC: Numeric formatting

    This indicates how numbers should be formatted for human readability, for example the character used as the decimal point.

  • Category LC_MONETARY: Formatting of monetary amounts

     

  • Category LC_TIME: Date/Time formatting

     

  • Category LC_MESSAGES: Error and other messages

    This is, for the most part, beyond the scope of Perl.

  • Category LC_COLLATE: Collation

    This indicates the ordering of letters for comparison and sorting. In Latin alphabets, for example, "b" generally follows "a".

  • Category LC_CTYPE: Character Types

    This indicates, for example, whether a character is an uppercase letter.

More details on the categories are given below in LOCALE CATEGORIES.

Together, these categories go a long way towards being able to customize a single program to run in many different locations. But there are deficiencies, so keep reading.

PREPARING TO USE LOCALES

Perl will not use locales unless specifically requested to (see NOTES below for the partial exception of write()). But even if there is such a request, all of the following must be true for it to work properly:

  • Your operating system must support the locale system. If it does, you should find that the setlocale() function is a documented part of its C library.

  • Definitions for locales that you use must be installed. You, or your system administrator, must make sure that this is the case. The available locales, the location in which they are kept, and the manner in which they are installed all vary from system to system. Some systems provide only a few, hard-wired locales and do not allow more to be added. Others allow you to add "canned" locales provided by the system supplier. Still others allow you or the system administrator to define and add arbitrary locales. (You may have to ask your supplier to provide canned locales that are not delivered with your operating system.) Read your system documentation for further illumination.

  • Perl must believe that the locale system is supported. If it does, perl -V:d_setlocale will say that the value for d_setlocale is define.
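
The last condition can also be tested from inside a program rather than from the command line. A minimal sketch, assuming only the standard Config module:

```perl
use strict;
use warnings;
use Config;

# d_setlocale is 'define' when Configure found a usable setlocale().
my $d_setlocale = $Config{d_setlocale};

if (defined $d_setlocale && $d_setlocale eq 'define') {
    print "this perl believes setlocale() is supported\n";
}
else {
    print "this perl was built without setlocale() support\n";
}
```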

If you want a Perl application to process and present your data according to a particular locale, the application code should include the use locale pragma (see The use locale pragma) where appropriate, and at least one of the following must be true:

1

The locale-determining environment variables (see ENVIRONMENT) must be correctly set up at the time the application is started, either by yourself or by whoever set up your system account; or

2

The application must set its own locale using the method described in The setlocale function.

USING LOCALES

The use locale pragma

By default, Perl ignores the current locale. The use locale pragma tells Perl to use the current locale for some operations. Starting in v5.16, there is an optional parameter to this pragma:

  1. use locale ':not_characters';

This parameter allows better mixing of locales and Unicode, and is described fully in Unicode and UTF-8, but briefly, it tells Perl to not use the character portions of the locale definition, that is the LC_CTYPE and LC_COLLATE categories. Instead it will use the native (extended by Unicode) character set. When using this parameter, you are responsible for getting the external character set translated into the native/Unicode one (which it already will be if it is one of the increasingly popular UTF-8 locales). There are convenient ways of doing this, as described in Unicode and UTF-8.

The current locale is set at execution time by setlocale() described below. If that function hasn't yet been called in the course of the program's execution, the current locale is that which was determined by the ENVIRONMENT in effect at the start of the program, except that LC_NUMERIC is always initialized to the C locale (mentioned under Finding locales). If there is no valid environment, the current locale is undefined. It is likely, but not necessarily, the "C" locale.

The operations that are affected by locale are:

  • Under use locale ':not_characters';
    • Format declarations (format()) use LC_NUMERIC.

    • The POSIX date formatting function (strftime()) uses LC_TIME.

     

  • Under just plain use locale;

    The above operations are affected, as well as the following:

    • The comparison operators (lt, le, cmp, ge, and gt) and the POSIX string collation functions strcoll() and strxfrm() use LC_COLLATE. sort() is also affected if used without an explicit comparison function, because it uses cmp by default.

      Note: eq and ne are unaffected by locale: they always perform a char-by-char comparison of their scalar operands. What's more, if cmp finds that its operands are equal according to the collation sequence specified by the current locale, it goes on to perform a char-by-char comparison, and only returns 0 (equal) if the operands are char-for-char identical. If you really want to know whether two strings--which eq and cmp may consider different--are equal as far as collation in the locale is concerned, see the discussion in Category LC_COLLATE: Collation.

    • Regular expressions and case-modification functions (uc(), lc(), ucfirst(), and lcfirst()) use LC_CTYPE.

The default behavior is restored with the no locale pragma, or upon reaching the end of the block enclosing use locale. Note that use locale and use locale ':not_characters' may be nested, and that what is in effect within an inner scope will revert to the outer scope's rules at the end of the inner scope.

The string result of any operation that uses locale information is tainted, as it is possible for a locale to be untrustworthy. See SECURITY.

The setlocale function

You can switch locales as often as you wish at run time with the POSIX::setlocale() function:

  # Import locale-handling tool set from POSIX module.
  # This example uses: setlocale -- the function call
  #                    LC_CTYPE -- explained below
  use POSIX qw(locale_h);

  # query and save the old locale
  $old_locale = setlocale(LC_CTYPE);

  setlocale(LC_CTYPE, "fr_CA.ISO8859-1");
  # LC_CTYPE now in locale "French, Canada, codeset ISO 8859-1"

  setlocale(LC_CTYPE, "");
  # LC_CTYPE now reset to default defined by LC_ALL/LC_CTYPE/LANG
  # environment variables.  See below for documentation.

  # restore the old locale
  setlocale(LC_CTYPE, $old_locale);

The first argument of setlocale() gives the category, the second the locale. The category tells in what aspect of data processing you want to apply locale-specific rules. Category names are discussed in LOCALE CATEGORIES and ENVIRONMENT. The locale is the name of a collection of customization information corresponding to a particular combination of language, country or territory, and codeset. Read on for hints on the naming of locales: not all systems name locales as in the example.

If no second argument is provided and the category is something other than LC_ALL, the function returns a string naming the current locale for the category. You can use this value as the second argument in a subsequent call to setlocale().

If no second argument is provided and the category is LC_ALL, the result is implementation-dependent. It may be a string of concatenated locale names (separator also implementation-dependent) or a single locale name. Please consult your setlocale(3) man page for details.

If a second argument is given and it corresponds to a valid locale, the locale for the category is set to that value, and the function returns the now-current locale value. You can then use this in yet another call to setlocale(). (In some implementations, the return value may sometimes differ from the value you gave as the second argument--think of it as an alias for the value you gave.)

As the example shows, if the second argument is an empty string, the category's locale is returned to the default specified by the corresponding environment variables. Generally, this results in a return to the default that was in force when Perl started up: changes to the environment made by the application after startup may or may not be noticed, depending on your system's C library.

If the second argument does not correspond to a valid locale, the locale for the category is not changed, and the function returns undef.

Note that Perl ignores the current LC_CTYPE and LC_COLLATE locales within the scope of a use locale ':not_characters'.

For further information about the categories, consult setlocale(3).
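
Since setlocale() signals failure by returning undef, it is worth checking its return value. The sketch below is illustrative only: the locale name "xx_NOSUCH.nowhere" is a made-up name that most systems will reject, although some C libraries accept arbitrary names, so no particular outcome is assumed for it.

```perl
use strict;
use warnings;
use POSIX qw(locale_h);

# Remember the current setting so it can be restored afterwards.
my $old = setlocale(LC_CTYPE);

# The "C" locale is expected to be available everywhere.
my $now = setlocale(LC_CTYPE, "C");
die "could not switch LC_CTYPE to the C locale\n" unless defined $now;
print "LC_CTYPE is now '$now'\n";

# A made-up name: on failure the locale is left unchanged and undef
# is returned.
my $try = setlocale(LC_CTYPE, "xx_NOSUCH.nowhere");
print defined $try
    ? "accepted as '$try'\n"
    : "rejected; LC_CTYPE left unchanged\n";

# Put things back the way they were.
setlocale(LC_CTYPE, $old) if defined $old;
```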

Finding locales

For locales available in your system, consult also setlocale(3) to see whether it leads to the list of available locales (search for the SEE ALSO section). If that fails, try the following command lines:

  1. locale -a
  2. nlsinfo
  3. ls /usr/lib/nls/loc
  4. ls /usr/lib/locale
  5. ls /usr/lib/nls
  6. ls /usr/share/locale

and see whether they list something resembling these

  en_US.ISO8859-1     de_DE.ISO8859-1     ru_RU.ISO8859-5
  en_US.iso88591      de_DE.iso88591      ru_RU.iso88595
  en_US               de_DE               ru_RU
  en                  de                  ru
  english             german              russian
  english.iso88591    german.iso88591     russian.iso88595
  english.roman8                          russian.koi8r

Sadly, even though the calling interface for setlocale() has been standardized, names of locales and the directories where the configuration resides have not been. The basic form of the name is language_territory.codeset, but the latter parts after language are not always present. The language and country usually come from the standards ISO 639 and ISO 3166, the two-letter abbreviations for the languages and the countries of the world, respectively. The codeset part often mentions some ISO 8859 character set, the Latin codesets. For example, ISO 8859-1 is the so-called "Western European codeset" that can be used to encode most Western European languages adequately. Again, there are several ways to write even the name of that one standard. Lamentably.
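
The basic language_territory.codeset form can be taken apart with a regular expression. The helper below, parse_locale_name, is a hypothetical function written for illustration; it handles only the basic form described above, and any other decoration a system may append to a name simply ends up in whichever part it follows.

```perl
use strict;
use warnings;

# Split a locale name of the form language[_territory][.codeset].
# Returns a hash reference, or nothing if the name is empty.
sub parse_locale_name {
    my ($name) = @_;
    my ($language, $territory, $codeset) =
        $name =~ /^([^_.]+)(?:_([^.]+))?(?:\.(.+))?$/
        or return;
    return {
        language  => $language,
        territory => $territory,    # undef when absent
        codeset   => $codeset,      # undef when absent
    };
}

for my $name (qw(en_US.ISO8859-1 de_DE english.iso88591 ru)) {
    my $parts = parse_locale_name($name) or next;
    printf "%-18s language=%s territory=%s codeset=%s\n",
        $name,
        $parts->{language},
        $parts->{territory} // '(none)',
        $parts->{codeset}   // '(none)';
}
```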

Two special locales are worth particular mention: "C" and "POSIX". Currently these are effectively the same locale: the difference is mainly that the first one is defined by the C standard, the second by the POSIX standard. They define the default locale in which every program starts in the absence of locale information in its environment. (The default default locale, if you will.) Its language is (American) English and its character codeset ASCII. Warning. The C locale delivered by some vendors may not actually exactly match what the C standard calls for. So beware.

NOTE: Not all systems have the "POSIX" locale (not all systems are POSIX-conformant), so use "C" when you need explicitly to specify this default locale.

LOCALE PROBLEMS

You may encounter the following warning message at Perl startup:

    perl: warning: Setting locale failed.
    perl: warning: Please check that your locale settings:
            LC_ALL = "En_US",
            LANG = (unset)
        are supported and installed on your system.
    perl: warning: Falling back to the standard locale ("C").

This means that your locale settings had LC_ALL set to "En_US" and LANG was not set at all. Perl tried to believe you but could not. Instead, Perl gave up and fell back to the "C" locale, the default locale that is supposed to work no matter what. This usually means one of three things: your locale settings were wrong, they mention locales your system has never heard of, or the locale installation in your system has problems (for example, some system files are broken or missing). There are quick and temporary fixes to these problems, as well as more thorough and lasting fixes.

Temporarily fixing locale problems

The two quickest fixes are either to render Perl silent about any locale inconsistencies or to run Perl under the default locale "C".

Perl's moaning about locale problems can be silenced by setting the environment variable PERL_BADLANG to a zero value, for example "0". This method really just sweeps the problem under the carpet: you tell Perl to shut up even when Perl sees that something is wrong. Do not be surprised if later something locale-dependent misbehaves.

Perl can be run under the "C" locale by setting the environment variable LC_ALL to "C". This method is perhaps a bit more civilized than the PERL_BADLANG approach, but setting LC_ALL (or other locale variables) may affect other programs as well, not just Perl. In particular, external programs run from within Perl will see these changes. If you make the new settings permanent (read on), all programs you run see the changes. See ENVIRONMENT for the full list of relevant environment variables and USING LOCALES for their effects in Perl. Effects in other programs are easily deducible. For example, the variable LC_COLLATE may well affect your sort program (or whatever the program that arranges "records" alphabetically in your system is called).

You can test out changing these variables temporarily, and if the new settings seem to help, put those settings into your shell startup files. Consult your local documentation for the exact details. For example, in Bourne-like shells (sh, ksh, bash, zsh):

    LC_ALL=en_US.ISO8859-1
    export LC_ALL

This assumes that we saw the locale "en_US.ISO8859-1" using the commands discussed above, and decided to try that instead of the faulty locale "En_US". In csh-like shells (csh, tcsh):

    setenv LC_ALL en_US.ISO8859-1

Or, if you have the "env" application, you can do the following in any shell:

    env LC_ALL=en_US.ISO8859-1 perl ...

If you do not know what shell you have, consult your local helpdesk or the equivalent.
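The same temporary override can be made from inside a Perl program for the external programs it starts. The sketch below assumes a Unix-like system where the list form of a piped open is available, and uses a child perl (found through $^X) as the external program; child_lc_all is a hypothetical helper name.

```perl
use strict;
use warnings;

# Ask a child perl (via $^X, the running interpreter) what it sees.
sub child_lc_all {
    open(my $child, '-|', $^X, '-e', 'print $ENV{LC_ALL} // "(unset)"')
        or die "Cannot run child perl: $!";
    my $seen = <$child>;
    close $child;
    return $seen;
}

my $seen_inside;
{
    # local() confines the override to this block and to any
    # programs started from within it.
    local $ENV{LC_ALL} = "C";
    $seen_inside = child_lc_all();
}

print "child inside the block saw LC_ALL=$seen_inside\n";
```

Because the assignment is made with local, the previous value of LC_ALL (or its absence) is restored when the block ends.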

Permanently fixing locale problems

The slower but superior fixes correct the misconfiguration itself. You may be able to fix your own environment variables yourself. Fixing the mis(sing)configuration of the whole system's locales usually requires the help of your friendly system administrator.

First, see earlier in this document about Finding locales. That tells how to find which locales are really supported--and more importantly, installed--on your system. In our example error message, environment variables affecting the locale are listed in the order of decreasing importance (and unset variables do not matter). Therefore, having LC_ALL set to "En_US" must have been the bad choice, as shown by the error message. First try fixing locale settings listed first.

Second, if using the listed commands you see something exactly (prefix matches do not count and case usually counts) like "En_US" without the quotes, then you should be okay because you are using a locale name that should be installed and available in your system. In this case, see Permanently fixing your system's locale configuration.

Permanently fixing your system's locale configuration

This is when you see something like:

    perl: warning: Please check that your locale settings:
            LC_ALL = "En_US",
            LANG = (unset)
        are supported and installed on your system.

but then cannot see "En_US" listed by the above-mentioned commands. You may see things like "en_US.ISO8859-1", but that isn't the same. In this case, try running under a locale that you can list and which somehow matches what you tried. The rules for matching locale names are a bit vague because standardization is weak in this area. See Finding locales again for the general rules.

Fixing system locale configuration

Contact a system administrator (preferably your own) and report the exact error message you get, and ask them to read this same documentation you are now reading. They should be able to check whether there is something wrong with the locale configuration of the system. The Finding locales section is unfortunately a bit vague about the exact commands and places because these things are not that standardized.

The localeconv function

The POSIX::localeconv() function allows you to get particulars of the locale-dependent numeric formatting information specified by the current LC_NUMERIC and LC_MONETARY locales. (If you just want the name of the current locale for a particular category, use POSIX::setlocale() with a single parameter--see The setlocale function.)

    use POSIX qw(locale_h);

    # Get a reference to a hash of locale-dependent info
    my $locale_values = localeconv();

    # Output sorted list of the values
    for (sort keys %$locale_values) {
        printf "%-20s = %s\n", $_, $locale_values->{$_};
    }

localeconv() takes no arguments, and returns a reference to a hash. The keys of this hash are variable names for formatting, such as decimal_point and thousands_sep . The values are the corresponding, er, values. See localeconv in POSIX for a longer example listing the categories an implementation might be expected to provide; some provide more and others fewer. You don't need an explicit use locale , because localeconv() always observes the current locale.

Here's a simple-minded example program that rewrites its command-line parameters as integers correctly formatted in the current locale:

    use POSIX qw(locale_h);

    # Get some of locale's numeric formatting parameters
    my ($thousands_sep, $grouping) =
        @{localeconv()}{'thousands_sep', 'grouping'};

    # Apply defaults if values are missing
    $thousands_sep = ',' unless $thousands_sep;

    # grouping and mon_grouping are packed lists
    # of small integers (characters) telling the
    # grouping (thousands_sep and mon_thousands_sep
    # being the group dividers) of numbers and
    # monetary quantities.  The integers' meanings:
    # 255 means no more grouping, 0 means repeat
    # the previous grouping, 1-254 means use that
    # as the current grouping.  Grouping goes from
    # right to left (low to high digits).  In the
    # below we cheat slightly by never using anything
    # else than the first grouping (whatever that is).
    my @grouping;
    if ($grouping) {
        @grouping = unpack("C*", $grouping);
    } else {
        @grouping = (3);
    }

    # Format command line params for current locale
    for (@ARGV) {
        $_ = int;    # Chop non-integer part
        # \Q...\E keeps a separator such as "." from acting
        # as a regular expression metacharacter
        1 while
            s/(\d)(\d{$grouping[0]}($|\Q$thousands_sep\E))/$1$thousands_sep$2/;
        print "$_";
    }
    print "\n";
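The same idea can be packaged as a subroutine that does not need the separator inside a pattern at all. This is a sketch, not part of the POSIX module: group_digits is a hypothetical helper, and like the program above it honors only the first grouping value (values outside 1..126 are simply ignored here, which is a simplification).

```perl
use strict;
use warnings;
use POSIX qw(localeconv);

# Hypothetical helper: insert $sep between groups of $size digits,
# counting from the right. Like the program above, it applies a
# single group size to the whole number.
sub group_digits {
    my ($number, $sep, $size) = @_;
    my $int  = int $number;                 # chop non-integer part
    my $sign = $int < 0 ? '-' : '';
    my $text = abs $int;
    # Each pass splits off the rightmost still-unseparated group.
    1 while $text =~ s/^(\d+)(\d{$size})/$1$sep$2/;
    return $sign . $text;
}

my $lc  = localeconv();
my $sep = $lc->{thousands_sep};
$sep = ',' unless defined $sep && length $sep;

my $size = 3;                               # default group size
if (defined $lc->{grouping} && length $lc->{grouping}) {
    my $first = (unpack "C*", $lc->{grouping})[0];
    $size = $first if $first > 0 && $first < 127;
}

print group_digits(1234567, $sep, $size), "\n";
```

Since the separator appears only on the replacement side of the substitution, it needs no quoting, whatever characters the locale supplies.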

I18N::Langinfo

Another interface for querying locale-dependent information is the I18N::Langinfo::langinfo() function, available at least in Unix-like systems and VMS.

The following example will import the langinfo() function itself and three constants to be used as arguments to langinfo(): a constant for the abbreviated first day of the week (the numbering starts from Sunday = 1) and two more constants for the affirmative and negative answers for a yes/no question in the current locale.

    use I18N::Langinfo qw(langinfo ABDAY_1 YESSTR NOSTR);

    my ($abday_1, $yesstr, $nostr)
        = map { langinfo($_) } (ABDAY_1, YESSTR, NOSTR);

    print "$abday_1? [$yesstr/$nostr] ";

In other words, in the "C" (or English) locale the above will probably print something like:

    Sun? [yes/no]

See I18N::Langinfo for more information.

LOCALE CATEGORIES

The following subsections describe basic locale categories. Beyond these, some combination categories allow manipulation of more than one basic category at a time. See ENVIRONMENT for a discussion of these.

Category LC_COLLATE: Collation

In the scope of use locale (but not a use locale ':not_characters' ), Perl looks to the LC_COLLATE environment variable to determine the application's notions on collation (ordering) of characters. For example, "b" follows "a" in Latin alphabets, but where do "á" and "å" belong? And while "color" follows "chocolate" in English, what about in traditional Spanish?

The following collations all make sense and you may meet any of them if you "use locale".

    A B C D E a b c d e
    A a B b C c D d E e
    a A b B c C d D e E
    a b c d e A B C D E

Here is a code snippet to tell what "word" characters are in the current locale, in that locale's order:

    use locale;
    print +(sort grep /\w/, map { chr } 0..255), "\n";

Compare this with the characters that you see and their order if you state explicitly that the locale should be ignored:

    no locale;
    print +(sort grep /\w/, map { chr } 0..255), "\n";

This machine-native collation (which is what you get unless use locale has appeared earlier in the same block) must be used for sorting raw binary data, whereas the locale-dependent collation of the first example is useful for natural text.
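Because use locale is lexically scoped, both collations can be used side by side in one program. A minimal sketch, with a made-up word list; the second line of output depends on the LC_COLLATE locale in the environment:

```perl
use strict;
use warnings;

my @words = qw(zebra Apple apple Zebra);

# Outside the scope of "use locale": code point order.
my @native = sort @words;

my @by_locale;
{
    use locale;    # lexically scoped: applies to this block only
    # sort's default comparison is cmp, so it now follows the
    # LC_COLLATE locale inherited from the environment.
    @by_locale = sort @words;
}

print "native: @native\n";
print "locale: @by_locale\n";
```

On an ASCII platform the native order puts every uppercase letter before every lowercase one, so the first line reads "Apple Zebra apple zebra".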

As noted in USING LOCALES, cmp compares according to the current collation locale when use locale is in effect, but falls back to a char-by-char comparison for strings that the locale says are equal. You can use POSIX::strcoll() if you don't want this fall-back:

    use POSIX qw(strcoll);
    $equal_in_locale =
        !strcoll("space and case ignored", "SpaceAndCaseIgnored");

$equal_in_locale will be true if the collation locale specifies a dictionary-like ordering that ignores space characters completely and which folds case.

If you have a single string that you want to check for "equality in locale" against several others, you might think you could gain a little efficiency by using POSIX::strxfrm() in conjunction with eq :

    use POSIX qw(strxfrm);
    $xfrm_string = strxfrm("Mixed-case string");
    print "locale collation ignores spaces\n"
        if $xfrm_string eq strxfrm("Mixed-casestring");
    print "locale collation ignores hyphens\n"
        if $xfrm_string eq strxfrm("Mixedcase string");
    print "locale collation ignores case\n"
        if $xfrm_string eq strxfrm("mixed-case string");

strxfrm() takes a string and maps it into a transformed string for use in char-by-char comparisons against other transformed strings during collation. "Under the hood", locale-affected Perl comparison operators call strxfrm() for both operands, then do a char-by-char comparison of the transformed strings. By calling strxfrm() explicitly and using a non locale-affected comparison, the example attempts to save a couple of transformations. But in fact, it doesn't save anything: Perl magic (see Magic Variables in perlguts) creates the transformed version of a string the first time it's needed in a comparison, then keeps this version around in case it's needed again. The example rewritten the easy way with cmp runs just about as fast. It also copes with null characters embedded in strings; if you call strxfrm() directly, it treats the first null it finds as a terminator. Don't expect the transformed strings it produces to be portable across systems--or even from one revision of your operating system to the next. In short, don't call strxfrm() directly: let Perl do it for you.

Note: use locale isn't shown in some of these examples because it isn't needed: strcoll() and strxfrm() exist only to generate locale-dependent results, and so always obey the current LC_COLLATE locale.

Category LC_CTYPE: Character Types

In the scope of use locale (but not a use locale ':not_characters' ), Perl obeys the LC_CTYPE locale setting. This controls the application's notion of which characters are alphabetic. This affects Perl's \w regular expression metanotation, which stands for alphanumeric characters--that is, alphabetic and numeric characters, plus the underscore. (Consult perlre for more information about regular expressions.) Thanks to LC_CTYPE , depending on your locale setting, characters like "æ", "ð", "ß", and "ø" may be understood as \w characters.

The LC_CTYPE locale also provides the map used in transliterating characters between lower and uppercase. This affects the case-mapping functions--lc(), lcfirst(), uc(), and ucfirst(); case-mapping interpolation with \l , \L , \u , or \U in double-quoted strings and s/// substitutions; and case-independent regular expression pattern matching using the i modifier.

Finally, LC_CTYPE affects the POSIX character-class test functions--isalpha(), islower(), and so on. For example, if you move from the "C" locale to a 7-bit Scandinavian one, you may find--possibly to your surprise--that "|" moves from the ispunct() class to isalpha(). Unfortunately, this creates big problems for regular expressions. "|" still means alternation even though it matches \w .

Note that there are quite a few things that are unaffected by the current locale. All the escape sequences for particular characters, \n for example, always mean the platform's native one. This means, for example, that \N in regular expressions (every character but new-line) works on the platform character set.

Note: A broken or malicious LC_CTYPE locale definition may result in clearly ineligible characters being considered to be alphanumeric by your application. For strict matching of (mundane) ASCII letters and digits--for example, in command strings--locale-aware applications should use \w with the /a regular expression modifier. See SECURITY.
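A minimal sketch of such a strict check follows. The subroutine name is_plain_word is hypothetical, and the /a modifier requires Perl 5.14 or later.

```perl
use strict;
use warnings;
use locale;

# Hypothetical validator for a command word. The /a modifier
# (Perl 5.14 and later) restricts \w to [A-Za-z0-9_], whatever
# the LC_CTYPE locale claims.
sub is_plain_word {
    my ($text) = @_;
    return $text =~ /\A\w+\z/a ? 1 : 0;
}

print is_plain_word("list_2"),  "\n";   # ASCII letters, digit, underscore
print is_plain_word("caf\xe9"), "\n";   # contains a non-ASCII byte
print is_plain_word("rm|ls"),   "\n";   # contains punctuation
```

An explicit /a on the pattern takes precedence over the locale rules that use locale would otherwise apply to it, so the rest of the program can remain locale-aware.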

Category LC_NUMERIC: Numeric Formatting

After a proper POSIX::setlocale() call, Perl obeys the LC_NUMERIC locale information, which controls an application's idea of how numbers should be formatted for human readability by the printf(), sprintf(), and write() functions. String-to-numeric conversion by the POSIX::strtod() function is also affected. In most implementations the only effect is to change the character used for the decimal point--perhaps from "." to ",". These functions aren't aware of such niceties as thousands separation and so on. (See The localeconv function if you care about these things.)

Output produced by print() is also affected by the current locale: it corresponds to what you'd get from printf() in that locale. The same is true for Perl's internal conversions between numeric and string formats:

    use POSIX qw(strtod setlocale LC_NUMERIC);

    setlocale LC_NUMERIC, "";

    $n = 5/2;   # Assign numeric 2.5 to $n

    $a = " $n"; # Locale-dependent conversion to string

    print "half five is $n\n";       # Locale-dependent output

    printf "half five is %g\n", $n;  # Locale-dependent output

    print "DECIMAL POINT IS COMMA\n"
        if $n == (strtod("2,5"))[0]; # Locale-dependent conversion

See also I18N::Langinfo and RADIXCHAR .
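As a sketch of that alternative, the decimal point character can be asked for directly. The example pins LC_NUMERIC to "C" only so that the answer is predictable; that choice is an assumption of the example, not a requirement of langinfo().

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_NUMERIC);
use I18N::Langinfo qw(langinfo RADIXCHAR);

# Pin LC_NUMERIC to "C" so the answer is predictable; pass ""
# instead to ask about the locale named in the environment.
setlocale(LC_NUMERIC, "C");

my $radix = langinfo(RADIXCHAR);
print "decimal point character: '$radix'\n";
```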

Category LC_MONETARY: Formatting of monetary amounts

The C standard defines the LC_MONETARY category, but not a function that is affected by its contents. (Those with experience of standards committees will recognize that the working group decided to punt on the issue.) Consequently, Perl takes no notice of it. If you really want to use LC_MONETARY , you can query its contents--see The localeconv function--and use the information that it returns in your application's own formatting of currency amounts. However, you may well find that the information, voluminous and complex though it may be, still does not quite meet your requirements: currency formatting is a hard nut to crack.

See also I18N::Langinfo and CRNCYSTR .

LC_TIME

Output produced by POSIX::strftime(), which builds a formatted human-readable date/time string, is affected by the current LC_TIME locale. Thus, in a French locale, the output produced by the %B format element (full month name) for the first month of the year would be "janvier". Here's how to get a list of long month names in the current locale:

    use POSIX qw(strftime);
    for (0..11) {
        $long_month_name[$_] =
            strftime("%B", 0, 0, 0, 1, $_, 96);
    }

Note: use locale isn't needed in this example: as a function that exists only to generate locale-dependent results, strftime() always obeys the current LC_TIME locale.
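The same technique yields the names of the days of the week with the %A format element. This sketch pins LC_TIME to "C" so that the result is predictable, and relies on POSIX::strftime() working out the weekday from the date it is given.

```perl
use strict;
use warnings;
use POSIX qw(strftime setlocale LC_TIME);

# Pin LC_TIME to "C" for a predictable result; use "" to follow
# the environment instead.
setlocale(LC_TIME, "C");

# 7 January 1996 was a Sunday, so days 7..13 of that month cover
# one week starting on Sunday. The month is 0-based and the year
# is counted from 1900.
my @day_names = map { strftime("%A", 0, 0, 0, $_, 0, 96) } 7 .. 13;

print "@day_names\n";
```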

See also I18N::Langinfo and ABDAY_1 ..ABDAY_7 , DAY_1 ..DAY_7 , ABMON_1 ..ABMON_12 , and MON_1 ..MON_12 .

Other categories

The remaining locale category, LC_MESSAGES (possibly supplemented by others in particular implementations) is not currently used by Perl--except possibly to affect the behavior of library functions called by extensions outside the standard Perl distribution and by the operating system and its utilities. Note especially that the string value of $! and the error messages given by external utilities may be changed by LC_MESSAGES . If you want to have portable error codes, use %! . See Errno.
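A short sketch of testing an error through %! instead of through the message text. The path is made up for illustration and is assumed not to exist.

```perl
use strict;
use warnings;
use Errno;    # loading Errno makes the %! hash available

# The path is made up for illustration and is assumed not to exist.
my $path = "/no/such/dir/no_such_file";

my $missing = 0;
if (!open(my $fh, '<', $path)) {
    # The text of "$!" may be translated; the key in %! is not.
    $missing = $!{ENOENT} ? 1 : 0;
    print "open failed: ", ($missing ? "no such file" : "other error"), "\n";
}
```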

SECURITY

Although the main discussion of Perl security issues can be found in perlsec, a discussion of Perl's locale handling would be incomplete if it did not draw your attention to locale-dependent security issues. Locales--particularly on systems that allow unprivileged users to build their own locales--are untrustworthy. A malicious (or just plain broken) locale can make a locale-aware application give unexpected results. Here are a few possibilities:

  • Regular expression checks for safe file names or mail addresses using \w may be spoofed by an LC_CTYPE locale that claims that characters such as ">" and "|" are alphanumeric.

  • String interpolation with case-mapping, as in, say, $dest = "C:\U$name.$ext" , may produce dangerous results if a bogus LC_CTYPE case-mapping table is in effect.

  • A sneaky LC_COLLATE locale could result in the names of students with "D" grades appearing ahead of those with "A"s.

  • An application that takes the trouble to use information in LC_MONETARY may format debits as if they were credits and vice versa if that locale has been subverted. Or it might make payments in US dollars instead of Hong Kong dollars.

  • The date and day names in dates formatted by strftime() could be manipulated to advantage by a malicious user able to subvert the LC_TIME locale. ("Look--it says I wasn't in the building on Sunday.")

Such dangers are not peculiar to the locale system: any aspect of an application's environment which may be modified maliciously presents similar challenges. Similarly, they are not specific to Perl: any programming language that allows you to write programs that take account of their environment exposes you to these issues.

Perl cannot protect you from all possibilities shown in the examples--there is no substitute for your own vigilance--but, when use locale is in effect, Perl uses the tainting mechanism (see perlsec) to mark string results that become locale-dependent, and which may be untrustworthy in consequence. Here is a summary of the tainting behavior of operators and functions that may be affected by the locale:

  • Comparison operators (lt , le , ge , gt and cmp ):

    Scalar true/false (or less/equal/greater) result is never tainted.

  • Case-mapping interpolation (with \l , \L , \u or \U )

    Result string containing interpolated material is tainted if use locale (but not use locale ':not_characters' ) is in effect.

  • Matching operator (m//):

    Scalar true/false result never tainted.

    Subpatterns, either delivered as a list-context result or as $1 etc., are tainted if use locale (but not use locale ':not_characters' ) is in effect, and the subpattern regular expression contains \w (to match an alphanumeric character), \W (non-alphanumeric character), \s (whitespace character), or \S (non-whitespace character). The matched-pattern variables, $&, $` (pre-match), $' (post-match), and $+ (last match) are also tainted if use locale is in effect and the regular expression contains \w , \W , \s, or \S .

  • Substitution operator (s///):

    Has the same behavior as the match operator. Also, the left operand of =~ becomes tainted when use locale (but not use locale ':not_characters' ) is in effect if modified as a result of a substitution based on a regular expression match involving \w , \W , \s, or \S ; or of case-mapping with \l , \L ,\u or \U .

  • Output formatting functions (printf() and write()):

    Results are never tainted because otherwise even output from print, for example print(1/7), should be tainted if use locale is in effect.

  • Case-mapping functions (lc(), lcfirst(), uc(), ucfirst()):

    Results are tainted if use locale (but not use locale ':not_characters' ) is in effect.

  • POSIX locale-dependent functions (localeconv(), strcoll(), strftime(), strxfrm()):

    Results are never tainted.

  • POSIX character class tests (isalnum(), isalpha(), isdigit(), isgraph(), islower(), isprint(), ispunct(), isspace(), isupper(), isxdigit()):

    True/false results are never tainted.

Three examples illustrate locale-dependent tainting. The first program, which ignores its locale, won't run: a value taken directly from the command line may not be used to name an output file when taint checks are enabled.

    #!/usr/local/bin/perl -T
    # Run with taint checking

    # Command line sanity check omitted...
    $tainted_output_file = shift;

    open(F, ">$tainted_output_file")
        or warn "Open of $tainted_output_file failed: $!\n";

The program can be made to run by "laundering" the tainted value through a regular expression: the second example--which still ignores locale information--runs, creating the file named on its command line if it can.

    #!/usr/local/bin/perl -T

    $tainted_output_file = shift;
    $tainted_output_file =~ m%[\w/]+%;
    $untainted_output_file = $&;

    open(F, ">$untainted_output_file")
        or warn "Open of $untainted_output_file failed: $!\n";

Compare this with a similar but locale-aware program:

    #!/usr/local/bin/perl -T

    $tainted_output_file = shift;
    use locale;
    $tainted_output_file =~ m%[\w/]+%;
    $localized_output_file = $&;

    open(F, ">$localized_output_file")
        or warn "Open of $localized_output_file failed: $!\n";

This third program fails to run because $& is tainted: it is the result of a match involving \w while use locale is in effect.
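One way to launder a file name in a locale-aware program is to keep the locale out of that one pattern. The sketch below rests on an assumption: that the /a modifier (Perl 5.14 or later) makes \w independent of the locale, so the capture is not marked as locale-dependent. The subroutine launder_path and the default file name are hypothetical.

```perl
use strict;
use warnings;
use locale;

# Hypothetical helper: return the file name if it consists only
# of ASCII word characters and slashes, or undef otherwise.
sub launder_path {
    my ($raw) = @_;
    # /a keeps \w independent of the locale; \A and \z make the
    # pattern cover the whole string instead of a fragment of it.
    return $raw =~ m{\A([\w/]+)\z}a ? $1 : undef;
}

# Use the first command-line argument, or a made-up default.
my $arg  = @ARGV ? $ARGV[0] : "out/report_1";
my $name = launder_path($arg);

print defined $name ? "would open $name\n" : "rejected\n";
```

Anchoring the pattern is a deliberate difference from the programs above, which accept the first run of matching characters and silently drop the rest of the argument.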

ENVIRONMENT

  • PERL_BADLANG

    A string that can suppress Perl's warning about failed locale settings at startup. Failure can occur if the locale support in the operating system is lacking (broken) in some way--or if you mistyped the name of a locale when you set up your environment. If this environment variable is absent, or has a value that does not evaluate to integer zero--that is, "0" or ""-- Perl will complain about locale setting failures.

    NOTE: PERL_BADLANG only gives you a way to hide the warning message. The message tells about some problem in your system's locale support, and you should investigate what the problem is.

The following environment variables are not specific to Perl: They are part of the standardized (ISO C, XPG4, POSIX 1.c) setlocale() method for controlling an application's opinion on data.

  • LC_ALL

    LC_ALL is the "override-all" locale environment variable. If set, it overrides all the rest of the locale environment variables.

  • LANGUAGE

    NOTE: LANGUAGE is a GNU extension; it affects you only if you are using the GNU libc. This is the case if you are using, for example, Linux. If you are using "commercial" Unixes you are most probably not using GNU libc and you can ignore LANGUAGE .

    However, in the case you are using LANGUAGE : it affects the language of informational, warning, and error messages output by commands (in other words, it's like LC_MESSAGES ) but it has higher priority than LC_ALL . Moreover, it's not a single value but instead a "path" (":"-separated list) of languages (not locales). See the GNU gettext library documentation for more information.

  • LC_CTYPE

    In the absence of LC_ALL , LC_CTYPE chooses the character type locale. In the absence of both LC_ALL and LC_CTYPE , LANG chooses the character type locale.

  • LC_COLLATE

    In the absence of LC_ALL , LC_COLLATE chooses the collation (sorting) locale. In the absence of both LC_ALL and LC_COLLATE , LANG chooses the collation locale.

  • LC_MONETARY

    In the absence of LC_ALL , LC_MONETARY chooses the monetary formatting locale. In the absence of both LC_ALL and LC_MONETARY , LANG chooses the monetary formatting locale.

  • LC_NUMERIC

    In the absence of LC_ALL , LC_NUMERIC chooses the numeric format locale. In the absence of both LC_ALL and LC_NUMERIC , LANG chooses the numeric format.

  • LC_TIME

    In the absence of LC_ALL , LC_TIME chooses the date and time formatting locale. In the absence of both LC_ALL and LC_TIME , LANG chooses the date and time formatting locale.

  • LANG

    LANG is the "catch-all" locale environment variable. If it is set, it is used as the last resort after the overall LC_ALL and the category-specific LC_... .
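The precedence among these variables can be summarized in a few lines of code. This is a sketch of the rule, not of what any C library actually executes: effective_locale is a hypothetical helper, it ignores the GNU-specific LANGUAGE variable, and it assumes the POSIX convention that a variable set to the empty string counts as unset.

```perl
use strict;
use warnings;

# Hypothetical helper: name the locale a category would get from
# an environment (a hash reference), following the precedence
# described above. Empty values are treated as unset, and "C" is
# the fallback when nothing is set.
sub effective_locale {
    my ($category, $env) = @_;
    for my $var ('LC_ALL', $category, 'LANG') {
        my $value = $env->{$var};
        return $value if defined $value && length $value;
    }
    return 'C';
}

print "LC_COLLATE would be ", effective_locale('LC_COLLATE', \%ENV), "\n";
```

The result is only the name that would be requested; whether that locale is actually installed is a separate question, answered by setlocale().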

Examples

The LC_NUMERIC controls the numeric output:

    use locale;
    use POSIX qw(locale_h); # Imports setlocale() and the LC_ constants.
    setlocale(LC_NUMERIC, "fr_FR") or die "Pardon";
    printf "%g\n", 1.23; # If the "fr_FR" succeeded, probably shows 1,23.

and also how strings are parsed by POSIX::strtod() as numbers:

    use locale;
    use POSIX qw(locale_h strtod);
    setlocale(LC_NUMERIC, "de_DE") or die "Entschuldigung";
    my $x = strtod("2,34") + 5;
    print $x, "\n"; # Probably shows 7,34.

NOTES

Backward compatibility

Versions of Perl prior to 5.004 mostly ignored locale information, generally behaving as if something similar to the "C" locale were always in force, even if the program environment suggested otherwise (see The setlocale function). By default, Perl still behaves this way for backward compatibility. If you want a Perl application to pay attention to locale information, you must use the use locale pragma (see The use locale pragma) or, in the unlikely event that you want to do so for just pattern matching, the /l regular expression modifier (see Character set modifiers in perlre) to instruct it to do so.

Versions of Perl from 5.002 to 5.003 did use the LC_CTYPE information if available; that is, \w did understand what were the letters according to the locale environment variables. The problem was that the user had no control over the feature: if the C library supported locales, Perl used them.

I18N::Collate obsolete

In versions of Perl prior to 5.004, per-locale collation was possible using the I18N::Collate library module. This module is now mildly obsolete and should be avoided in new applications. The LC_COLLATE functionality is now integrated into the Perl core language: One can use locale-specific scalar data completely normally with use locale , so there is no longer any need to juggle with the scalar references of I18N::Collate .

Sort speed and memory use impacts

Comparing and sorting by locale is usually slower than the default sorting; slow-downs of two to four times have been observed. It will also consume more memory: once a Perl scalar variable has participated in any string comparison or sorting operation obeying the locale collation rules, it will take 3-15 times more memory than before. (The exact multiplier depends on the string's contents, the operating system and the locale.) These downsides are dictated more by the operating system's implementation of the locale system than by Perl.

write() and LC_NUMERIC

If a program's environment specifies an LC_NUMERIC locale and use locale is in effect when the format is declared, the locale is used to specify the decimal point character in formatted output. Formatted output cannot be controlled by use locale at the time when write() is called.

Freely available locale definitions

The Unicode CLDR project extracts the POSIX portion of many of its locales, available at

    http://unicode.org/Public/cldr/latest/

There is a large collection of locale definitions at:

    http://std.dkuug.dk/i18n/WG15-collection/locales/

You should be aware that it is unsupported, and is not claimed to be fit for any purpose. If your system allows installation of arbitrary locales, you may find the definitions useful as they are, or as a basis for the development of your own locales.

I18n and l10n

"Internationalization" is often abbreviated as i18n because its first and last letters are separated by eighteen others. (You may guess why the internalin ... internaliti ... i18n tends to get abbreviated.) In the same way, "localization" is often abbreviated to l10n.

An imperfect standard

Internationalization, as defined in the C and POSIX standards, can be criticized as incomplete, ungainly, and having too large a granularity. (Locales apply to a whole process, when it would arguably be more useful to have them apply to a single thread, window group, or whatever.) They also have a tendency, like standards groups, to divide the world into nations, when we all know that the world can equally well be divided into bankers, bikers, gamers, and so on.

Unicode and UTF-8

The support of Unicode is new starting from Perl version v5.6, and more fully implemented in version v5.8 and later. See perluniintro. It is strongly recommended that when combining Unicode and locale (starting in v5.16), you use

    use locale ':not_characters';

When this form of the pragma is used, only the non-character portions of locales are used by Perl, for example LC_NUMERIC . Perl assumes that you have translated all the characters it is to operate on into Unicode (actually the platform's native character set (ASCII or EBCDIC) plus Unicode). For data in files, this can conveniently be done by also specifying

    use open ':locale';

This pragma arranges for all inputs from files to be translated into Unicode from the current locale as specified in the environment (see ENVIRONMENT), and all outputs to files to be translated back into the locale. (See open). On a per-filehandle basis, you can instead use the PerlIO::locale module, or the Encode::Locale module, both available from CPAN. The latter module also has methods to ease the handling of ARGV and environment variables, and can be used on individual strings. Also, if you know that all your locales will be UTF-8, as many are these days, you can use the -C command line switch.

This form of the pragma allows essentially seamless handling of locales with Unicode. The collation order will be Unicode's. It is strongly recommended that when you need to order and sort strings, you use the standard module Unicode::Collate , which gives much better results in many instances than you can get with the old-style locale handling.
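A minimal sketch of sorting with Unicode::Collate, using a collator with default settings and a made-up word list:

```perl
use strict;
use warnings;
use Unicode::Collate;

# A collator with default settings implements the Unicode
# Collation Algorithm; it does not consult the locale at all.
my $collator = Unicode::Collate->new;

my @sorted = $collator->sort(qw(banana Apple cherry apple Banana));
print "@sorted\n";
```

With the default settings, words that differ only in case sort next to each other, lowercase first, so this prints "apple Apple banana Banana cherry".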

For pre-v5.16 Perls, or if you use the locale pragma without the :not_characters parameter, Perl tries to work with both Unicode and locales--but there are problems.

In this case Perl does not handle multi-byte locales such as Big5 or Shift JIS, which have been used for various Asian languages. However, the increasingly common multi-byte UTF-8 locales, if properly implemented, may work reasonably well (depending on your C library implementation) in this form of the locale pragma, simply because both they and Perl store characters that take up multiple bytes the same way. However, some, if not most, C library implementations may not process the characters in the upper half of the Latin-1 range (128 - 255) properly under LC_CTYPE. To see if a character is a particular type under a locale, Perl uses functions like isalnum(). Your C library may not work for UTF-8 locales with those functions, instead only working under the newer wide library functions like iswalnum().

Perl generally takes the tack to use locale rules on code points that can fit in a single byte, and Unicode rules for those that can't (though this isn't uniformly applied, see the note at the end of this section). This prevents many problems in locales that aren't UTF-8. Suppose the locale is ISO8859-7, Greek. The character at 0xD7 there is a capital Chi. But in the ISO8859-1 locale, Latin1, it is a multiplication sign. The POSIX regular expression character class [[:alpha:]] will magically match 0xD7 in the Greek locale but not in the Latin one.

However, there are places where this breaks down. Certain constructs are for Unicode only, such as \p{Alpha}. They assume that 0xD7 always has its Unicode meaning (or the equivalent on EBCDIC platforms). Since Latin1 is a subset of Unicode and 0xD7 is the multiplication sign in both Latin1 and Unicode, \p{Alpha} will never match it, regardless of locale. A similar issue occurs with \N{...}. It is therefore a bad idea to use \p{} or \N{} under plain use locale--unless you can guarantee that the locale will be an ISO8859-1. Use POSIX character classes instead.
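The fixed Unicode meaning of \p{Alpha} can be seen without any locale in effect. This sketch deliberately leaves out use locale, so it shows only the Unicode side of the comparison; the variable names are illustrative.

```perl
use strict;
use warnings;

# No "use locale" here, so nothing below depends on the environment.
my $times   = chr(0xD7);   # MULTIPLICATION SIGN in Latin-1 and Unicode
my $e_acute = chr(0xE9);   # LATIN SMALL LETTER E WITH ACUTE

# \p{} always applies Unicode rules to the code point.
my $times_is_alpha  = $times   =~ /\p{Alpha}/ ? 1 : 0;
my $eacute_is_alpha = $e_acute =~ /\p{Alpha}/ ? 1 : 0;

print "0xD7 alphabetic? $times_is_alpha\n";
print "0xE9 alphabetic? $eacute_is_alpha\n";
```

Under a Greek locale the byte 0xD7 is a letter, yet \p{Alpha} would still reject it, which is why the POSIX classes are the safer choice under plain use locale.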

Another problem with this approach is that operations that cross the single byte/multiple byte boundary are not well-defined, and so are disallowed. (This boundary is between the code points at 255/256.) For example, lower casing LATIN CAPITAL LETTER Y WITH DIAERESIS (U+0178) should return LATIN SMALL LETTER Y WITH DIAERESIS (U+00FF). But in the Greek locale, for example, there is no character at 0xFF, and Perl has no way of knowing what the character at 0xFF is really supposed to represent. Thus it disallows the operation. In this mode, the lowercase of U+0178 is itself.

The same problems ensue if you enable automatic UTF-8-ification of your standard file handles, default open() layer, and @ARGV on non-ISO8859-1, non-UTF-8 locales (by using either the -C command line switch or the PERL_UNICODE environment variable; see perlrun). Things are read in as UTF-8, which would normally imply a Unicode interpretation, but the presence of a locale causes them to be interpreted in that locale instead. For example, a 0xD7 code point in the Unicode input, which should mean the multiplication sign, won't be interpreted by Perl that way under the Greek locale. This is not a problem provided you make certain that all locales will always and only be either an ISO8859-1, or, if you don't have a deficient C library, a UTF-8 locale.

Vendor locales are notoriously buggy, and it is difficult for Perl to test its locale-handling code because this interacts with code that Perl has no control over; therefore the locale-handling code in Perl may be buggy as well. (However, the Unicode-supplied locales should be better, and there is a feedback mechanism to correct any problems. See Freely available locale definitions.)

If you have Perl v5.16, the problems mentioned above go away if you use the :not_characters parameter to the locale pragma (except for vendor bugs in the non-character portions). If you don't have v5.16, and you do have locales that work, using them may be worthwhile for certain specific purposes, as long as you keep in mind the gotchas already mentioned. For example, if the collation for your locales works, it runs faster under locales than under Unicode::Collate; and you gain access to such things as the local currency symbol and the names of the months and days of the week. (But to hammer home the point, in v5.16, you get this access without the downsides of locales by using the :not_characters form of the pragma.)

Note: The policy of using locale rules for code points that can fit in a byte, and Unicode rules for those that can't is not uniformly applied. Pre-v5.12, it was somewhat haphazard; in v5.12 it was applied fairly consistently to regular expression matching except for bracketed character classes; in v5.14 it was extended to all regex matches; and in v5.16 to the casing operations such as "\L" and uc(). For collation, in all releases, the system's strxfrm() function is called, and whatever it does is what you get.

BUGS

Broken systems

In certain systems, the operating system's locale support is broken and cannot be fixed or used by Perl. Such deficiencies can and will result in mysterious hangs and/or Perl core dumps when use locale is in effect. When confronted with such a system, please report in excruciating detail to <perlbug@perl.org>, and also contact your vendor: bug fixes may exist for these problems in your operating system. Sometimes such bug fixes are called an operating system upgrade.

SEE ALSO

I18N::Langinfo, perluniintro, perlunicode, open, isalnum in POSIX, isalpha in POSIX, isdigit in POSIX, isgraph in POSIX, islower in POSIX, isprint in POSIX, ispunct in POSIX, isspace in POSIX, isupper in POSIX, isxdigit in POSIX, localeconv in POSIX, setlocale in POSIX, strcoll in POSIX, strftime in POSIX, strtod in POSIX, strxfrm in POSIX.

HISTORY

Jarkko Hietaniemi's original perli18n.pod heavily hacked by Dominic Dunlop, assisted by the perl5-porters. Prose worked over a bit by Tom Christiansen, and updated by Perl 5 porters.

 
perllol

NAME

perllol - Manipulating Arrays of Arrays in Perl

DESCRIPTION

Declaration and Access of Arrays of Arrays

The simplest two-level data structure to build in Perl is an array of arrays, sometimes casually called a list of lists. It's reasonably easy to understand, and almost everything that applies here will also be applicable later on with the fancier data structures.

An array of an array is just a regular old array @AoA that you can get at with two subscripts, like $AoA[3][2] . Here's a declaration of the array:

    use 5.010;  # so we can use say()
    # assign to our array, an array of array references
    @AoA = (
        [ "fred", "barney", "pebbles", "bambam", "dino", ],
        [ "george", "jane", "elroy", "judy", ],
        [ "homer", "bart", "marge", "maggie", ],
    );
    say $AoA[2][1];
  bart

Now you should be very careful that the outer bracket type is a round one, that is, a parenthesis. That's because you're assigning to an @array, so you need parentheses. If you wanted there not to be an @AoA, but rather just a reference to it, you could do something more like this:

    # assign a reference to array of array references
    $ref_to_AoA = [
        [ "fred", "barney", "pebbles", "bambam", "dino", ],
        [ "george", "jane", "elroy", "judy", ],
        [ "homer", "bart", "marge", "maggie", ],
    ];
    say $ref_to_AoA->[2][1];
  bart

Notice that the outer bracket type has changed, and so our access syntax has also changed. That's because unlike C, in perl you can't freely interchange arrays and references thereto. $ref_to_AoA is a reference to an array, whereas @AoA is an array proper. Likewise, $AoA[2] is not an array, but an array ref. So how come you can write these:

    $AoA[2][2]
    $ref_to_AoA->[2][2]

instead of having to write these:

    $AoA[2]->[2]
    $ref_to_AoA->[2]->[2]

Well, that's because the rule is that on adjacent brackets only (whether square or curly), you are free to omit the pointer dereferencing arrow. But you cannot do so for the very first one if it's a scalar containing a reference, which means that $ref_to_AoA always needs it.
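To make the arrow rule concrete, here is a small sketch listing the equivalent spellings side by side. The data is a cut-down version of the table above, and the last form (block dereference) is added for comparison only.

```perl
use strict;
use warnings;

my @AoA = (
    [ "fred",   "barney", "pebbles" ],
    [ "george", "jane",   "elroy"   ],
    [ "homer",  "bart",   "marge"   ],
);
my $ref_to_AoA = \@AoA;

my @forms = (
    $AoA[2][2],              # arrow omitted between adjacent subscripts
    $AoA[2]->[2],            # arrow written out
    $ref_to_AoA->[2][2],     # first arrow required, second omitted
    $ref_to_AoA->[2]->[2],   # both arrows written out
    ${$ref_to_AoA}[2][2],    # block dereference instead of the first arrow
);
print "$_\n" for @forms;     # every form names the same element
```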

Growing Your Own

That's all well and good for declaration of a fixed data structure, but what if you wanted to add new elements on the fly, or build it up entirely from scratch?

First, let's look at reading it in from a file. This is something like adding a row at a time. We'll assume that there's a flat file in which each line is a row and each word an element. If you're trying to develop an @AoA array containing all these, here's the right way to do that:

    while (<>) {
        @tmp = split;
        push @AoA, [ @tmp ];
    }

You might also have loaded that from a function:

    for $i ( 1 .. 10 ) {
        $AoA[$i] = [ somefunc($i) ];
    }

Or you might have had a temporary variable sitting around with the array in it.

    for $i ( 1 .. 10 ) {
        @tmp = somefunc($i);
        $AoA[$i] = [ @tmp ];
    }

It's important you make sure to use the [ ] array reference constructor. That's because this wouldn't work:

    $AoA[$i] = @tmp;  # WRONG!

The reason that doesn't do what you want is that assigning a named array like that to a scalar evaluates the array in scalar context, which just counts the number of elements in @tmp.
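A short sketch of the two assignments side by side shows what actually ends up in each slot. The names in @tmp are illustrative.

```perl
use strict;
use warnings;

my @tmp = qw(fred barney wilma);
my @AoA;

$AoA[0] = @tmp;        # scalar context: stores the element count
$AoA[1] = [ @tmp ];    # stores a reference to a new array holding a copy

print "row 0: $AoA[0]\n";          # a plain number, not a row
print "row 1: @{ $AoA[1] }\n";     # the three names
```

Because [ @tmp ] copies the elements, later changes to @tmp do not affect the stored row.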

If you are running under use strict (and if you aren't, why in the world aren't you?), you'll have to add some declarations to make it happy:

    use strict;
    my(@AoA, @tmp);
    while (<>) {
        @tmp = split;
        push @AoA, [ @tmp ];
    }

Of course, you don't need the temporary array to have a name at all:

    while (<>) {
        push @AoA, [ split ];
    }

You also don't have to use push(). You could just make a direct assignment if you knew where you wanted to put it:

    my (@AoA, $i, $line);
    for $i ( 0 .. 10 ) {
        $line = <>;
        $AoA[$i] = [ split " ", $line ];
    }

or even just

    my (@AoA, $i);
    for $i ( 0 .. 10 ) {
        $AoA[$i] = [ split " ", <> ];
    }

You should in general be leery of using functions that could potentially return lists in scalar context without explicitly stating such. This would be clearer to the casual reader:

    my (@AoA, $i);
    for $i ( 0 .. 10 ) {
        $AoA[$i] = [ split " ", scalar(<>) ];
    }

If you wanted to have a $ref_to_AoA variable as a reference to an array, you'd have to do something like this:

    while (<>) {
        push @$ref_to_AoA, [ split ];
    }

Now you can add new rows. What about adding new columns? If you're dealing with just matrices, it's often easiest to use simple assignment:

    for $x (1 .. 10) {
        for $y (1 .. 10) {
            $AoA[$x][$y] = func($x, $y);
        }
    }

    for $x ( 3, 7, 9 ) {
        $AoA[$x][20] += func2($x);
    }

It doesn't matter whether those elements are already there or not: it'll gladly create them for you, setting intervening elements to undef as need be.
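The following sketch shows that behavior on an empty array; the subscripts are chosen arbitrarily.

```perl
use strict;
use warnings;

my @AoA;
$AoA[2][3] = "x";    # neither row 2 nor column 3 existed before

my $rows      = @AoA;            # rows 0 and 1 now exist, holding undef
my $cols      = @{ $AoA[2] };    # columns 0..2 of row 2 hold undef
my $row0_kind = defined $AoA[0] ? "defined" : "undef";

print "rows: $rows, columns in row 2: $cols, row 0 is $row0_kind\n";
```

Only row 2 was autovivified into an array reference; the skipped rows stay undef until something is assigned to them.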

If you wanted just to append to a row, you'd have to do something a bit funnier looking:

    # add new columns to an existing row
    push @{ $AoA[0] }, "wilma", "betty";   # explicit deref

Prior to Perl 5.14, this wouldn't even compile:

    push $AoA[0], "wilma", "betty";        # implicit deref

How come? Because once upon a time, the argument to push() had to be a real array, not just a reference to one. That's no longer true. In fact, the line marked "implicit deref" above works just fine--in this instance--to do what the one that says explicit deref did.

The reason I said "in this instance" is because that only works because $AoA[0] already held an array reference. If you try that on an undefined variable, you'll take an exception. That's because the implicit dereference will never autovivify an undefined variable the way @{ } always will:

    my $aref = undef;
    push $aref,  qw(some more values);  # WRONG!
    push @$aref, qw(a few more);        # ok

If you want to take advantage of this new implicit dereferencing behavior, go right ahead: it makes code easier on the eye and wrist. Just understand that older releases will choke on it during compilation. Whenever you make use of something that works only in some given release of Perl and later, but not earlier, you should place a prominent

    use v5.14;  # needed for implicit deref of array refs by array ops

directive at the top of the file that needs it. That way when somebody tries to run the new code under an old perl, rather than getting an error like

    Type of arg 1 to push must be array (not array element) at /tmp/a line 8, near ""betty";"
    Execution of /tmp/a aborted due to compilation errors.

they'll be politely informed that

    Perl v5.14.0 required--this is only v5.12.3, stopped at /tmp/a line 1.
    BEGIN failed--compilation aborted at /tmp/a line 1.

Access and Printing

Now it's time to print your data structure out. How are you going to do that? Well, if you want only one of the elements, it's trivial:

    print $AoA[0][0];

If you want to print the whole thing, though, you can't say

    print @AoA;  # WRONG

because you'll get just references listed, and perl will never automatically dereference things for you. Instead, you have to roll yourself a loop or two. This prints the whole structure, using the shell-style for() construct to loop across the outer set of subscripts.

    for $aref ( @AoA ) {
        say "\t [ @$aref ],";
    }

If you wanted to keep track of subscripts, you might do this:

    for $i ( 0 .. $#AoA ) {
        say "\t elt $i is [ @{$AoA[$i]} ],";
    }

or maybe even this. Notice the inner loop.

    for $i ( 0 .. $#AoA ) {
        for $j ( 0 .. $#{$AoA[$i]} ) {
            say "elt $i $j is $AoA[$i][$j]";
        }
    }

As you can see, it's getting a bit complicated. That's why sometimes it's easier to take a temporary on your way through:

    for $i ( 0 .. $#AoA ) {
        $aref = $AoA[$i];
        for $j ( 0 .. $#{$aref} ) {
            say "elt $i $j is $aref->[$j]";
        }
    }

Hmm... that's still a bit ugly. How about this:

    for $i ( 0 .. $#AoA ) {
        $aref = $AoA[$i];
        $n = @$aref - 1;
        for $j ( 0 .. $n ) {
            say "elt $i $j is $aref->[$j]";
        }
    }

When you get tired of writing a custom print for your data structures, you might look at the standard Dumpvalue or Data::Dumper modules. The former is what the Perl debugger uses, while the latter generates parsable Perl code. For example:

    use v5.14;     # using the + prototype, new to v5.14
    sub show(+) {
        require Dumpvalue;
        state $prettily = new Dumpvalue::
                            tick        => q("),
                            compactDump => 1,  # comment these two lines out
                            veryCompact => 1,  # if you want a bigger dump
                          ;
        dumpValue $prettily @_;
    }
    # Assign a list of array references to an array.
    my @AoA = (
        [ "fred", "barney" ],
        [ "george", "jane", "elroy" ],
        [ "homer", "marge", "bart" ],
    );
    push $AoA[0], "wilma", "betty";
    show @AoA;

will print out:

    0  0..3  "fred" "barney" "wilma" "betty"
    1  0..2  "george" "jane" "elroy"
    2  0..2  "homer" "marge" "bart"

Whereas if you comment out the two lines I said you might wish to, then it shows it to you this way instead:

    0  ARRAY(0x8031d0)
       0  "fred"
       1  "barney"
       2  "wilma"
       3  "betty"
    1  ARRAY(0x803d40)
       0  "george"
       1  "jane"
       2  "elroy"
    2  ARRAY(0x803e10)
       0  "homer"
       1  "marge"
       2  "bart"
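Data::Dumper, the other module mentioned above, can be sketched the same way. This example assumes nothing beyond the core module; the Terse and Indent settings are one reasonable choice rather than a requirement, and the data is a shortened copy of the table used earlier.

```perl
use strict;
use warnings;
use Data::Dumper;

my @AoA = (
    [ "fred",   "barney" ],
    [ "george", "jane", "elroy" ],
);

local $Data::Dumper::Terse  = 1;   # emit just the value, without "$VAR1 = "
local $Data::Dumper::Indent = 1;   # compact, fixed indentation

my $dumped = Dumper(\@AoA);
print $dumped;

# The dump is Perl source, so eval rebuilds an independent copy.
my $copy = eval $dumped;
die "could not re-evaluate dump: $@" if $@;
```

Evaluating a dump is only sensible for text your own program produced; never eval a dump that came from an untrusted source.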

Slices

If you want to get at a slice (part of a row) in a multidimensional array, you're going to have to do some fancy subscripting. That's because while we have a nice synonym for single elements via the pointer arrow for dereferencing, no such convenience exists for slices.

Here's how to do one operation using a loop. We'll assume an @AoA variable as before.

    @part = ();
    $x = 4;
    for ($y = 7; $y < 13; $y++) {
        push @part, $AoA[$x][$y];
    }

That same loop could be replaced with a slice operation:

    @part = @{$AoA[4]}[7..12];

or spaced out a bit:

    @part = @{ $AoA[4] } [ 7..12 ];

But as you might well imagine, this can get pretty rough on the reader.
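To check that the loop and the slice really select the same elements, here is a sketch on a table small enough to read at a glance. The indices differ from the 7..12 used above only because the sample rows are short.

```perl
use strict;
use warnings;

# Two illustrative rows of five letters each.
my @AoA = ( [ "a" .. "e" ], [ "f" .. "j" ] );

# Element-by-element copy of columns 1..3 of row 1.
my @loop_part;
for my $y ( 1 .. 3 ) {
    push @loop_part, $AoA[1][$y];
}

# The same columns taken with a single array slice.
my @slice_part = @{ $AoA[1] }[ 1 .. 3 ];

print "loop:  @loop_part\n";
print "slice: @slice_part\n";
```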

Ah, but what if you wanted a two-dimensional slice, such as having $x run from 4..8 and $y run from 7 to 12? Hmm... here's the simple way:

    @newAoA = ();
    for ($startx = $x = 4; $x <= 8; $x++) {
        for ($starty = $y = 7; $y <= 12; $y++) {
            $newAoA[$x - $startx][$y - $starty] = $AoA[$x][$y];
        }
    }

We can reduce some of the looping through slices

    for ($x = 4; $x <= 8; $x++) {
        push @newAoA, [ @{ $AoA[$x] } [ 7..12 ] ];
    }

If you were into Schwartzian Transforms, you would probably have selected map for that

    @newAoA = map { [ @{ $AoA[$_] } [ 7..12 ] ] } 4 .. 8;

Although if your manager accused you of seeking job security (or rapid insecurity) through inscrutable code, it would be hard to argue. :-) If I were you, I'd put that in a function:

    @newAoA = splice_2D( \@AoA, 4 => 8, 7 => 12 );

    sub splice_2D {
        my $lrr = shift;  # ref to array of array refs!
        my ($x_lo, $x_hi,
            $y_lo, $y_hi) = @_;
        return map {
            [ @{ $lrr->[$_] } [ $y_lo .. $y_hi ] ]
        } $x_lo .. $x_hi;
    }
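As a worked example, the same function can be run under use strict on a small grid whose values encode their own row and column. The grid and the chosen bounds are illustrative only.

```perl
use strict;
use warnings;

sub splice_2D {
    my $lrr = shift;                          # ref to array of array refs
    my ($x_lo, $x_hi, $y_lo, $y_hi) = @_;
    return map { [ @{ $lrr->[$_] }[ $y_lo .. $y_hi ] ] } $x_lo .. $x_hi;
}

my @grid = (
    [ 11, 12, 13, 14 ],
    [ 21, 22, 23, 24 ],
    [ 31, 32, 33, 34 ],
);

# Rows 1..2, columns 1..2: the lower middle 2x2 block.
my @sub = splice_2D( \@grid, 1 => 2, 1 => 2 );
print "@$_\n" for @sub;
```

Each returned row is a fresh anonymous array, so modifying @sub leaves @grid untouched.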

SEE ALSO

perldata, perlref, perldsc

AUTHOR

Tom Christiansen <tchrist@perl.com>

Last update: Tue Apr 26 18:30:55 MDT 2011

 
perlmacos

NAME

perlmacos - Perl under Mac OS (Classic)

SYNOPSIS

For Mac OS X see README.macosx

Perl under Mac OS Classic has not been supported since before Perl 5.10 (April 2004).

When we say "Mac OS" below, we mean Mac OS 7, 8, and 9, and not Mac OS X.

DESCRIPTION

The port of Perl to Mac OS was officially removed as of Perl 5.12, though the last official production release of MacPerl corresponded to Perl 5.6. While Perl 5.10 included the port to Mac OS, ExtUtils::MakeMaker, a core part of Perl's module installation infrastructure, officially dropped support for Mac OS in April 2004.

AUTHOR

Perl was ported to Mac OS by Matthias Neeracher <neeracher@mac.com>. Chris Nandor <pudge@pobox.com> continued development and maintenance for the duration of the port's life.

 
perlmacosx

NAME

perlmacosx - Perl under Mac OS X

SYNOPSIS

This document briefly describes Perl under Mac OS X.

    curl http://www.cpan.org/src/perl-5.18.2.tar.gz > perl-5.18.2.tar.gz
    tar -xzf perl-5.18.2.tar.gz
    cd perl-5.18.2
    ./Configure -des -Dprefix=/usr/local/
    make
    make test
    sudo make install

DESCRIPTION

The latest Perl release (5.18.2 as of this writing) builds without changes under all versions of Mac OS X from 10.3 "Panther" onwards.

In order to build your own version of Perl you will need 'make', which is part of Apple's developer tools - also known as Xcode. From Mac OS X 10.7 "Lion" onwards, it can be downloaded separately as the 'Command Line Tools' bundle directly from https://developer.apple.com/downloads/ (you will need a free account to log in), or as a part of the Xcode suite, freely available at the App Store. Xcode is a pretty big app, so unless you already have it or really want it, you are advised to get the 'Command Line Tools' bundle separately from the link above. If you want to do it from within Xcode, go to Xcode -> Preferences -> Downloads and select the 'Command Line Tools' option.

Between Mac OS X 10.3 "Panther" and 10.6 "Snow Leopard", the 'Command Line Tools' bundle was called 'unix tools', and was usually supplied with Mac OS install DVDs.

Earlier Mac OS X releases (10.2 "Jaguar" and older) did not include a completely thread-safe libc, so threading is not fully supported. Also, earlier releases included a buggy libdb, so some of the DB_File tests are known to fail on those releases.

Installation Prefix

The default installation location for this release uses the traditional UNIX directory layout under /usr/local. This is the recommended location for most users, and will leave the Apple-supplied Perl and its modules undisturbed.

Using an installation prefix of '/usr' will result in a directory layout that mirrors that of Apple's default Perl, with core modules stored in '/System/Library/Perl/${version}', CPAN modules stored in '/Library/Perl/${version}', and the addition of '/Network/Library/Perl/${version}' to @INC for modules that are stored on a file server and used by many Macs.

SDK support

First, export the path to the SDK into the build environment:

    export SDK=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk

Please make sure the SDK version (i.e. the numbers right before '.sdk') matches your system's (in this case, Mac OS X 10.8 "Mountain Lion"), as it is possible to have more than one SDK installed. Also make sure the path exists in your system, and if it doesn't please make sure the SDK is properly installed, as it should come with the 'Command Line Tools' bundle mentioned above. Finally, if you have an older Mac OS X (10.6 "Snow Leopard" and below) running Xcode 4.2 or lower, the SDK path might be something like '/Developer/SDKs/MacOSX10.3.9.sdk' .

You can use the SDK by exporting some additions to Perl's 'ccflags' and '..flags' config variables:

    ./Configure -Accflags="-nostdinc -B$SDK/usr/include/gcc \
                           -B$SDK/usr/lib/gcc -isystem$SDK/usr/include \
                           -F$SDK/System/Library/Frameworks" \
                -Aldflags="-Wl,-syslibroot,$SDK" \
                -de

Universal Binary support

Note: From Mac OS X 10.6 "Snow Leopard" onwards, Apple only supports Intel-based hardware. This means you can safely skip this section unless you have an older Apple computer running on ppc or wish to create a perl binary with backwards compatibility.

You can compile perl as a universal binary (built for both ppc and intel). In Mac OS X 10.4 "Tiger", you must export the 'u' variant of the SDK:

    export SDK=/Developer/SDKs/MacOSX10.4u.sdk

Mac OS X 10.5 "Leopard" and above do not require the 'u' variant.

In addition to the compiler flags used to select the SDK, also add the flags for creating a universal binary:

    ./Configure -Accflags="-arch i686 -arch ppc -nostdinc -B$SDK/usr/include/gcc \
                           -B$SDK/usr/lib/gcc -isystem$SDK/usr/include \
                           -F$SDK/System/Library/Frameworks" \
                -Aldflags="-arch i686 -arch ppc -Wl,-syslibroot,$SDK" \
                -de

Keep in mind that these compiler and linker settings will also be used when building CPAN modules. For XS modules to be compiled as a universal binary, any libraries it links to must also be universal binaries. The system libraries that Apple includes with the 10.4u SDK are all universal, but user-installed libraries may need to be re-installed as universal binaries.

64-bit PPC support

Follow the instructions in INSTALL to build perl with support for 64-bit integers (use64bitint ) or both 64-bit integers and 64-bit addressing (use64bitall ). In the latter case, the resulting binary will run only on G5-based hosts.
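Once a perl has been built, the core Config module reports which of these options it was configured with. The following is a minimal sketch that only reads %Config; the values printed depend entirely on the perl that runs it.

```perl
use strict;
use warnings;
use Config;

my $ivsize  = $Config{ivsize};    # size of a Perl integer, in bytes
my $ptrsize = $Config{ptrsize};   # size of a pointer, in bytes

# use64bitint and use64bitall are "define" when enabled and may be
# unset otherwise, so guard against undef before printing.
printf "use64bitint=%s use64bitall=%s\n",
    map { defined $Config{$_} ? $Config{$_} : "undef" }
    qw(use64bitint use64bitall);

printf "integers: %d bits, pointers: %d bits\n", 8 * $ivsize, 8 * $ptrsize;
```

The same information is available from the command line with perl -V:ivsize and perl -V:ptrsize.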

Support for 64-bit addressing is experimental: some aspects of Perl may be omitted or buggy. Note the messages output by Configure for further information. Please use perlbug to submit a problem report in the event that you encounter difficulties.

When building 64-bit modules, it is your responsibility to ensure that linked external libraries and frameworks provide 64-bit support: if they do not, module building may appear to succeed, but attempts to use the module will result in run-time dynamic linking errors, and subsequent test failures. You can use file to discover the architectures supported by a library:

    $ file libgdbm.3.0.0.dylib
    libgdbm.3.0.0.dylib: Mach-O fat file with 2 architectures
    libgdbm.3.0.0.dylib (for architecture ppc): Mach-O dynamically linked shared library ppc
    libgdbm.3.0.0.dylib (for architecture ppc64): Mach-O 64-bit dynamically linked shared library ppc64

Note that this issue precludes the building of many Macintosh-specific CPAN modules (Mac::* ), as the required Apple frameworks do not provide PPC64 support. Similarly, downloads from Fink or Darwinports are unlikely to provide 64-bit support; the libraries must be rebuilt from source with the appropriate compiler and linker flags. For further information, see Apple's 64-Bit Transition Guide at http://developer.apple.com/documentation/Darwin/Conceptual/64bitPorting/index.html.

libperl and Prebinding

Mac OS X ships with a dynamically-loaded libperl, but the default for this release is to compile a static libperl. The reason for this is pre-binding. Dynamic libraries can be pre-bound to a specific address in memory in order to decrease load time. To do this, one needs to be aware of the location and size of all previously-loaded libraries. Apple collects this information as part of their overall OS build process, and thus has easy access to it when building Perl, but ordinary users would need to go to a great deal of effort to obtain the information needed for pre-binding.

You can override the default and build a shared libperl if you wish (Configure ... -Duseshrplib).
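To see which kind of libperl a given perl was built with, you can ask the core Config module. This is a sketch only; it reports what Configure recorded at build time and changes nothing.

```perl
use strict;
use warnings;
use Config;

# useshrplib is recorded by Configure as "true" or "false".
my $shared  = defined $Config{useshrplib} ? $Config{useshrplib} : "unknown";

# libperl is the file name of the library that was built.
my $libperl = defined $Config{libperl} ? $Config{libperl} : "unknown";

print "useshrplib=$shared libperl=$libperl\n";
```

Running it with both /usr/bin/perl and /usr/local/bin/perl is a quick way to compare Apple's build with your own.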

With Mac OS X 10.4 "Tiger" and newer, there is almost no performance penalty for non-prebound libraries. Earlier releases will suffer a greater load time than either the static library, or Apple's pre-bound dynamic library.

Updating Apple's Perl

In a word - don't, at least not without a *very* good reason. Your scripts can just as easily begin with "#!/usr/local/bin/perl" as with "#!/usr/bin/perl". Scripts supplied by Apple and other third parties as part of installation packages and such have generally only been tested with the /usr/bin/perl that's installed by Apple.

If you find that you do need to update the system Perl, one issue worth keeping in mind is the question of static vs. dynamic libraries. If you upgrade using the default static libperl, you will find that the dynamic libperl supplied by Apple will not be deleted. If both libraries are present when an application that links against libperl is built, ld will link against the dynamic library by default. So, if you need to replace Apple's dynamic libperl with a static libperl, you need to be sure to delete the older dynamic library after you've installed the update.

Known problems

If you have installed extra libraries such as GDBM through Fink (in other words, you have libraries under /sw/lib), or libdlcompat to /usr/local/lib, you may need to be extra careful when running Configure not to confuse Configure and Perl about which libraries to use. Such confusion shows up, for example, as "dyld" errors about symbol problems during "make test". The safest bet is to run Configure as

    Configure ... -Uloclibpth -Dlibpth=/usr/lib

to make Configure look only into the system libraries. If you have some extra library directories that you really want to use (such as newer Berkeley DB libraries in pre-Panther systems), add those to the libpth:

    Configure ... -Uloclibpth -Dlibpth='/usr/lib /opt/lib'

The default of building Perl statically may cause problems with complex applications like Tk: in that case consider building shared Perl

    Configure ... -Duseshrplib

but remember that there's a startup cost to pay in that case (see above "libperl and Prebinding").

Starting with Tiger (Mac OS X 10.4), Apple shipped broken locale files for the eu_ES locale (Basque-Spain). In previous releases of Perl, this resulted in failures in the lib/locale test. These failures have been suppressed in the current release of Perl by making the test ignore the broken locale. If you need to use the eu_ES locale, you should contact Apple support.

Cocoa

There are two ways to use Cocoa from Perl. Apple's PerlObjCBridge module, included with Mac OS X, can be used by standalone scripts to access Foundation (i.e. non-GUI) classes and objects.

An alternative is CamelBones, a framework that allows access to both Foundation and AppKit classes and objects, so that full GUI applications can be built in Perl. CamelBones can be found on SourceForge, at http://www.sourceforge.net/projects/camelbones/.

Starting From Scratch

Unfortunately it is not that difficult to somehow manage to break one's Mac OS X Perl rather severely. If all else fails and you want to really, REALLY, start from scratch and remove even your Apple Perl installation (which has become corrupted somehow), the following instructions should do it. Please think twice before following these instructions: they are much like conducting brain surgery on yourself. Without anesthesia. We will not come to fix your system if you do this.

First, get rid of the libperl.dylib:

    # cd /System/Library/Perl/darwin/CORE
    # rm libperl.dylib

Then delete every .bundle file found anywhere in the folders:

    /System/Library/Perl
    /Library/Perl

You can find them for example by

    # find /System/Library/Perl /Library/Perl -name '*.bundle' -print

After this you can either copy Perl from your operating system media (you will need at least the /System/Library/Perl and /usr/bin/perl), or rebuild Perl from the source code with Configure -Dprefix=/usr -Duseshrplib. NOTE: using -Dprefix=/usr to replace the system Perl works much better with Perl 5.8.1 and later; in Perl 5.8.0 the settings were not quite right.

"Pacifist" from CharlesSoft (http://www.charlessoft.com/) is a nice way to extract the Perl binaries from the OS media, without having to reinstall the entire OS.

AUTHOR

This README was written by Sherm Pendley <sherm@dot-app.org>, and subsequently updated by Dominic Dunlop <domo@computer.org> and Breno G. de Oliveira <garu@cpan.org>. The "Starting From Scratch" recipe was contributed by John Montbriand <montbriand@apple.com>.

DATE

Last modified 2013-04-29.

 
perlmod

NAME

perlmod - Perl modules (packages and symbol tables)

DESCRIPTION

Packages

Perl provides a mechanism for alternative namespaces to protect packages from stomping on each other's variables. In fact, there's really no such thing as a global variable in Perl. The package statement declares the compilation unit as being in the given namespace. The scope of the package declaration is from the declaration itself through the end of the enclosing block, eval, or file, whichever comes first (the same scope as the my() and local() operators). Unqualified dynamic identifiers will be in this namespace, except for those few identifiers that if unqualified, default to the main package instead of the current one as described below. A package statement affects only dynamic variables--including those you've used local() on--but not lexical variables created with my(). Typically it would be the first declaration in a file included by the do, require, or use operators. You can switch into a package in more than one place; it merely influences which symbol table is used by the compiler for the rest of that block. You can refer to variables and filehandles in other packages by prefixing the identifier with the package name and a double colon: $Package::Variable . If the package name is null, the main package is assumed. That is, $::sail is equivalent to $main::sail .
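The scoping rules above can be sketched in a few lines. The package name Boat and the variable names are hypothetical, chosen to match the $::sail example in the text.

```perl
use strict;
use warnings;

our $sail = "mainsail";    # $main::sail, since we start in package main

{
    package Boat;          # in effect only to the end of this block
    our $sail = "jib";     # $Boat::sail, a different variable
    my  $crew = 3;         # lexical: not stored in any symbol table
}

# Qualified names reach into either package from anywhere.
my @seen = ( $main::sail, $::sail, $Boat::sail );

# The symbol table of package Boat is the hash %Boat::.
my $sail_in_table = exists $Boat::{sail} ? 1 : 0;
my $crew_in_table = exists $Boat::{crew} ? 1 : 0;

print "@seen\n";
print "sail in symbol table: $sail_in_table, crew: $crew_in_table\n";
```

The lexical $crew never appears in %Boat::, which is what "a package statement affects only dynamic variables" means in practice.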

The old package delimiter was a single quote, but double colon is now the preferred delimiter, in part because it's more readable to humans, and in part because it's more readable to emacs macros. It also makes C++ programmers feel like they know what's going on--as opposed to using the single quote as separator, which was there to make Ada programmers feel like they knew what was going on. Because the old-fashioned syntax is still supported for backwards compatibility, if you try to use a string like "This is $owner's house" , you'll be accessing $owner::s ; that is, the $s variable in package owner , which is probably not what you meant. Use braces to disambiguate, as in "This is ${owner}'s house" .
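A minimal sketch of the brace form, assuming a hypothetical variable $owner. Only the braced form is shown because the treatment of the apostrophe differs between Perl versions:

```perl
use strict;
use warnings;

my $owner = 'Fred';    # hypothetical value for illustration

# The braces end the variable name before the apostrophe, so Perl
# interpolates $owner instead of looking for $owner::s.
my $sign = "This is ${owner}'s house";

print "$sign\n";
```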

Packages may themselves contain package separators, as in $OUTER::INNER::var . This implies nothing about the order of name lookups, however. There are no relative packages: all symbols are either local to the current package, or must be fully qualified from the outer package name down. For instance, there is nowhere within package OUTER that $INNER::var refers to $OUTER::INNER::var . INNER refers to a totally separate global package.
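The lack of relative packages can be illustrated with a short sketch. The package names mirror the text above; the values are invented for the example:

```perl
use strict;
use warnings;

$OUTER::INNER::var = 'nested package';
$INNER::var        = 'separate top-level package';

package OUTER;

# Even inside OUTER, $INNER::var names the top-level INNER package.
my $short = $INNER::var;
my $full  = $OUTER::INNER::var;

package main;

print "short name: $short\n";
print "full name:  $full\n";
```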

Only identifiers starting with letters (or underscore) are stored in a package's symbol table. All other symbols are kept in package main , including all punctuation variables, like $_. In addition, when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC, and SIG are forced to be in package main , even when used for other purposes than their built-in ones. If you have a package called m, s, or y, then you can't use the qualified form of an identifier because it would be instead interpreted as a pattern match, a substitution, or a transliteration.

Variables beginning with underscore used to be forced into package main, but we decided it was more useful for package writers to be able to use leading underscore to indicate private variables and method names. However, variables and functions named with a single _ , such as $_ and sub _ , are still forced into the package main . See also The Syntax of Variable Names in perlvar.

evaled strings are compiled in the package in which the eval() was compiled. (Assignments to $SIG{} , however, assume the signal handler specified is in the main package. Qualify the signal handler name if you wish to have a signal handler in a package.) For an example, examine perldb.pl in the Perl library. It initially switches to the DB package so that the debugger doesn't interfere with variables in the program you are trying to debug. At various points, however, it temporarily switches back to the main package to evaluate various expressions in the context of the main package (or wherever you came from). See perldebug.

The special symbol __PACKAGE__ contains the current package, but cannot (easily) be used to construct variable names.
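As a sketch of why this is not easy: __PACKAGE__ is not interpolated in strings, so building a variable name from it needs an expression and a symbolic reference. The package Some::Counter and its variable are hypothetical:

```perl
use strict;
use warnings;

package Some::Counter;    # hypothetical package
our $count = 42;

my $package = __PACKAGE__;             # 'Some::Counter'
my $literal = "__PACKAGE__::count";    # no interpolation happens here

# Building the name requires an expression and a symbolic reference.
my $value = do {
    no strict 'refs';
    ${ __PACKAGE__ . '::count' };
};

package main;

print "$package has count $value\n";
```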

See perlsub for other scoping issues related to my() and local(), and perlref regarding closures.

Symbol Tables

The symbol table for a package happens to be stored in the hash of that name with two colons appended. The main symbol table's name is thus %main:: , or %:: for short. Likewise the symbol table for the nested package mentioned earlier is named %OUTER::INNER:: .

The value in each entry of the hash is what you are referring to when you use the *name typeglob notation.

    local *main::foo = *main::bar;

You can use this to print out all the variables in a package, for instance. The standard but antiquated dumpvar.pl library and the CPAN module Devel::Symdump make use of this.
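A rough sketch of such a listing, assuming a hypothetical package Demo. It only reads existing entries, looking each symbol up by name rather than touching the stash values directly:

```perl
use strict;
use warnings;

{
    package Demo;    # hypothetical package to inspect
    our $count  = 3;
    our @list   = (1, 2);
    our %lookup = (a => 1);
    sub func { return 'hi' }
}

my %report;
for my $name (sort keys %Demo::) {
    next if $name =~ /::\z/;    # skip nested packages
    no strict 'refs';           # symbols are looked up by name
    my @kinds;
    push @kinds, 'scalar' if defined ${"Demo::$name"};
    push @kinds, 'array'  if defined *{"Demo::$name"}{ARRAY};
    push @kinds, 'hash'   if defined *{"Demo::$name"}{HASH};
    push @kinds, 'code'   if defined *{"Demo::$name"}{CODE};
    $report{$name} = \@kinds;
}

print "$_: @{ $report{$_} }\n" for sort keys %report;
```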

The results of creating new symbol table entries directly or modifying any entries that are not already typeglobs are undefined and subject to change between releases of perl.

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles accessible via the identifier richard also to be accessible via the identifier dick. If you want to alias only a particular variable or subroutine, assign a reference instead:

    *dick = \$richard;

This makes $richard and $dick the same variable, but leaves @richard and @dick as separate arrays. Tricky, eh?
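The scalar-only alias can be checked with a small sketch. The names original and alias are hypothetical stand-ins for the identifiers used above:

```perl
use strict;
use warnings;

our ($original, @original, $alias, @alias);

*alias = \$original;    # alias only the scalar slot

$original = 'shared scalar';
@original = (1, 2, 3);

print "scalar via alias: $alias\n";
print "elements in \@alias: ", scalar(@alias), "\n";
```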

There is one subtle difference between the following statements:

    *foo = *bar;
    *foo = \$bar;

*foo = *bar makes the typeglobs themselves synonymous while *foo = \$bar makes the SCALAR portions of two distinct typeglobs refer to the same scalar value. This means that the following code:

    $bar = 1;
    *foo = \$bar;       # Make $foo an alias for $bar

    {
        local $bar = 2; # Restrict changes to block
        print $foo;     # Prints '1'!
    }

would print '1', because $foo holds a reference to the original $bar, the one that was stuffed away by local() and which will be restored when the block ends. Because variables are accessed through the typeglob, you can use *foo = *bar to create an alias which can be localized. (But be aware that this means you can't have a separate @foo and @bar, etc.)
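The two kinds of alias can be compared side by side. This is a sketch with hypothetical names by_ref and by_glob:

```perl
use strict;
use warnings;

our ($bar, $by_ref, $by_glob);

$bar = 1;
*by_ref  = \$bar;    # shares only the scalar that exists right now
*by_glob = *bar;     # shares the whole typeglob

my ($ref_sees, $glob_sees);
{
    local $bar = 2;
    ($ref_sees, $glob_sees) = ($by_ref, $by_glob);
}

print "reference alias saw $ref_sees, glob alias saw $glob_sees\n";
```

Once the block exits, both aliases see the restored value again.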

What makes all of this important is that the Exporter module uses glob aliasing as the import/export mechanism. Whether or not you can properly localize a variable that has been exported from a module depends on how it was exported:

    @EXPORT = qw($FOO); # Usual form, can't be localized
    @EXPORT = qw(*FOO); # Can be localized

You can work around the first case by using the fully qualified name ($Package::FOO ) where you need a local value, or by overriding it by saying *FOO = *Package::FOO in your script.

The *x = \$y mechanism may be used to pass and return cheap references into or from subroutines if you don't want to copy the whole thing. It only works when assigning to dynamic variables, not lexicals.

    %some_hash = ();                    # can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = (); # do what you want
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the symbol table specified by the *some_hash typeglob. This is a somewhat tricky way of passing around references cheaply when you don't want to have to remember to dereference variables explicitly.

Another use of symbol tables is for making "constant" scalars.

    *PI = \3.14159265358979;

Now you cannot alter $PI , which is probably a good thing all in all. This isn't the same as a constant subroutine, which is subject to optimization at compile-time. A constant subroutine is one prototyped to take no arguments and to return a constant expression. See perlsub for details on these. The use constant pragma is a convenient shorthand for these.
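A brief sketch of both approaches. The constant name TWO_PI is an assumption made for the example; the eval is there only to capture the error from the attempted assignment:

```perl
use strict;
use warnings;

our $PI;
*PI = \3.14159265358979;    # $PI now refers to a read-only value

my $assigned = eval { $PI = 3; 1 };    # the assignment dies
my $error    = $@;

# A constant subroutine, created through the constant pragma.
use constant TWO_PI => 2 * 3.14159265358979;

print "assignment failed: $error" unless $assigned;
print "TWO_PI is ", TWO_PI, "\n";
```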

You can say *foo{PACKAGE} and *foo{NAME} to find out what name and package the *foo symbol table entry comes from. This may be useful in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE}, '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The *foo{THING} notation can also be used to obtain references to the individual elements of *foo. See perlref.
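For instance, here is a sketch using a hypothetical identifier queue that has an array, a hash, and a subroutine behind it:

```perl
use strict;
use warnings;

our @queue = (1, 2, 3);
our %queue = (label => 'jobs');
sub queue { return scalar @queue }

my $array_ref = *queue{ARRAY};    # same as \@queue
my $hash_ref  = *queue{HASH};     # same as \%queue
my $code_ref  = *queue{CODE};     # same as \&queue

push @$array_ref, 4;              # modifies @queue itself

print "queue '$hash_ref->{label}' holds ", $code_ref->(), " items\n";
```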

Subroutine definitions (and declarations, for that matter) need not necessarily be situated in the package whose symbol table they occupy. You can define a subroutine outside its package by explicitly qualifying the name of the subroutine:

    package main;
    sub Some_package::foo { ... }   # &foo defined in Some_package

This is just a shorthand for a typeglob assignment at compile time:

    BEGIN { *Some_package::foo = sub { ... } }

and is not the same as writing:

    {
        package Some_package;
        sub foo { ... }
    }

In the first two versions, the body of the subroutine is lexically in the main package, not in Some_package. So something like this:

    package main;

    $Some_package::name = "fred";
    $main::name = "barney";

    sub Some_package::foo {
        print "in ", __PACKAGE__, ": \$name is '$name'\n";
    }

    Some_package::foo();

prints:

    in main: $name is 'barney'

rather than:

    in Some_package: $name is 'fred'

This also has implications for the use of the SUPER:: qualifier (see perlobj).

BEGIN, UNITCHECK, CHECK, INIT and END

Five specially named code blocks are executed at the beginning and at the end of a running Perl program. These are the BEGIN , UNITCHECK , CHECK , INIT , and END blocks.

These code blocks can be prefixed with sub to give the appearance of a subroutine (although this is not considered good style). Note that these code blocks don't really exist as named subroutines, despite their appearance. What gives this away is that you can have more than one of each in a program, and they will all be executed at the appropriate moment. So you can't execute any of these code blocks by name.

A BEGIN code block is executed as soon as possible, that is, the moment it is completely defined, even before the rest of the containing file (or string) is parsed. You may have multiple BEGIN blocks within a file (or eval'ed string); they will execute in order of definition. Because a BEGIN code block executes immediately, it can pull in definitions of subroutines and such from other files in time to be visible to the rest of the compile and run time. Once a BEGIN has run, it is immediately undefined and any code it used is returned to Perl's memory pool.
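A small sketch of the ordering. The array @order is a hypothetical package variable used only to record what ran when:

```perl
use strict;
use warnings;

our @order;

push @order, 'ordinary statement';    # runs at run time

# Appears later in the file, but runs as soon as it is compiled.
BEGIN { push @order, 'BEGIN block' }

print join(', ', @order), "\n";
```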

An END code block is executed as late as possible, that is, after perl has finished running the program and just before the interpreter is being exited, even if it is exiting as a result of a die() function. (But not if it's morphing into another program via exec, or being blown out of the water by a signal--you have to trap that yourself (if you can).) You may have multiple END blocks within a file--they will execute in reverse order of definition; that is: last in, first out (LIFO). END blocks are not executed when you run perl with the -c switch, or if compilation fails.

Note that END code blocks are not executed at the end of a string eval(): if any END code blocks are created in a string eval(), they will be executed just as any other END code block of that package in LIFO order just before the interpreter is being exited.

Inside an END code block, $? contains the value that the program is going to pass to exit(). You can modify $? to change the exit value of the program. Beware of changing $? by accident (e.g. by running something via system).
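The effect on the exit value is easiest to observe from outside the program. This sketch runs a one-line child program with the current interpreter ($^X); the exit value 3 is arbitrary:

```perl
use strict;
use warnings;

# The child calls exit 0, but its END block overrides the status.
my @child = ($^X, '-e', 'END { $? = 3 } exit 0;');

system(@child);
my $status = $? >> 8;    # exit value as seen by the parent

print "child exit status: $status\n";
```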

Inside of an END block, the value of ${^GLOBAL_PHASE} will be "END".

UNITCHECK , CHECK and INIT code blocks are useful to catch the transition between the compilation phase and the execution phase of the main program.

UNITCHECK blocks are run just after the unit which defined them has been compiled. The main program file and each module it loads are compilation units, as are string evals, run-time code compiled using the (?{ }) construct in a regex, calls to do FILE , require FILE , and code after the -e switch on the command line.

BEGIN and UNITCHECK blocks are not directly related to the phase of the interpreter. They can be created and executed during any phase.

CHECK code blocks are run just after the initial Perl compile phase ends and before the run time begins, in LIFO order. CHECK code blocks are used in the Perl compiler suite to save the compiled state of the program.

Inside of a CHECK block, the value of ${^GLOBAL_PHASE} will be "CHECK" .

INIT blocks are run just before the Perl runtime begins execution, in "first in, first out" (FIFO) order.

Inside of an INIT block, the value of ${^GLOBAL_PHASE} will be "INIT" .
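The phases can be observed with a sketch that runs a small child program and collects what each block reports. It assumes a perl recent enough to provide ${^GLOBAL_PHASE} and a platform where list-form pipe open is available:

```perl
use strict;
use warnings;

# Child program: each block reports the phase it runs in.
my $program = <<'EOP';
print "main code: ${^GLOBAL_PHASE}\n";
END   { print "END block: ${^GLOBAL_PHASE}\n" }
INIT  { print "INIT block: ${^GLOBAL_PHASE}\n" }
CHECK { print "CHECK block: ${^GLOBAL_PHASE}\n" }
BEGIN { print "BEGIN block: ${^GLOBAL_PHASE}\n" }
EOP

open(my $child, '-|', $^X, '-e', $program)
    or die "cannot run $^X: $!";
my @lines = <$child>;
close $child;
chomp @lines;

print "$_\n" for @lines;
```

The child is a separate program on purpose: CHECK and INIT blocks compiled at run time would not be executed.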

The CHECK and INIT blocks in code compiled by require, string do, or string eval will not be executed if they occur after the end of the main compilation phase; that can be a problem in mod_perl and other persistent environments which use those functions to load code at runtime.

When you use the -n and -p switches to Perl, BEGIN and END work just as they do in awk, as a degenerate case. Both BEGIN and CHECK blocks are run when you use the -c switch for a compile-only syntax check, although your main code is not.

The begincheck program makes it all clear, eventually:

    #!/usr/bin/perl

    # begincheck

    print "10. Ordinary code runs at runtime.\n";

    END { print "16. So this is the end of the tale.\n" }
    INIT { print " 7. INIT blocks run FIFO just before runtime.\n" }
    UNITCHECK {
        print " 4. And therefore before any CHECK blocks.\n"
    }
    CHECK { print " 6. So this is the sixth line.\n" }

    print "11. It runs in order, of course.\n";

    BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" }
    END { print "15. Read perlmod for the rest of the story.\n" }
    CHECK { print " 5. CHECK blocks run LIFO after all compilation.\n" }
    INIT { print " 8. Run this again, using Perl's -c switch.\n" }

    print "12. This is anti-obfuscated code.\n";

    END { print "14. END blocks run LIFO at quitting time.\n" }
    BEGIN { print " 2. So this line comes out second.\n" }
    UNITCHECK {
        print " 3. UNITCHECK blocks run LIFO after each file is compiled.\n"
    }
    INIT { print " 9. You'll see the difference right away.\n" }

    print "13. It merely _looks_ like it should be confusing.\n";

    __END__

Perl Classes

There is no special class syntax in Perl, but a package may act as a class if it provides subroutines to act as methods. Such a package may also derive some of its methods from another class (package) by listing the other package name(s) in its global @ISA array (which must be a package global, not a lexical).

For more on this, see perlootut and perlobj.
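As a minimal sketch of a package acting as a class and inheriting through @ISA (the class names Animal and Dog are hypothetical):

```perl
use strict;
use warnings;

package Animal;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

sub speak {
    my ($self) = @_;
    return ref($self) . ' makes a sound';
}

package Dog;

our @ISA = ('Animal');    # package global, not a lexical

sub fetch { return 'fetching' }

package main;

my $dog = Dog->new(name => 'Rex');
print $dog->speak, "\n";    # method found through @ISA
```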

Perl Modules

A module is just a set of related functions in a library file, i.e., a Perl package with the same name as the file. It is specifically designed to be reusable by other modules or programs. It may do this by providing a mechanism for exporting some of its symbols into the symbol table of any package using it, or it may function as a class definition and make its semantics available implicitly through method calls on the class and its objects, without explicitly exporting anything. Or it can do a little of both.

For example, to start a traditional, non-OO module called Some::Module, create a file called Some/Module.pm and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;
    use warnings;

    BEGIN {
        require Exporter;

        # set the version for version checking
        our $VERSION     = 1.00;

        # Inherit from Exporter to export functions and variables
        our @ISA         = qw(Exporter);

        # Functions and variables which are exported by default
        our @EXPORT      = qw(func1 func2);

        # Functions and variables which can be optionally exported
        our @EXPORT_OK   = qw($Var1 %Hashit func3);
    }

    # exported package globals go here
    our $Var1    = '';
    our %Hashit  = ();

    # non-exported package globals go here
    # (they are still accessible as $Some::Module::stuff)
    our @more    = ();
    our $stuff   = '';

    # file-private lexicals go here, before any functions which use them
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as $priv_func->();
    my $priv_func = sub {
        ...
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      { ... }
    sub func2      { ... }

    # this one isn't exported, but could be called directly
    # as Some::Module::func3()
    sub func3      { ... }

    END { ... }       # module clean-up code here (global destructor)

    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without any qualifications. See Exporter and the perlmodlib for details on mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require 'Module.pm'; 'Module'->import; }

or

    BEGIN { require 'Module.pm'; 'Module'->import( LIST ); }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require 'Module.pm'; }
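The equivalence can be tried with a core module. This sketch uses List::Util and its sum function as the stand-in for Module and LIST:

```perl
use strict;
use warnings;

# Spelled-out form of:  use List::Util 'sum';
BEGIN { require List::Util; List::Util->import('sum') }

my $total = sum(1, 2, 3);    # sum was imported at compile time

print "total: $total\n";
```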

All Perl module files have the extension .pm. The use operator assumes this so you don't have to spell out "Module.pm" in quotes. This also helps to differentiate new modules from old .pl and .ph files. Module names are also capitalized unless they're functioning as pragmas; pragmas are in effect compiler directives, and are sometimes called "pragmatic modules" (or even "pragmata" if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways. In the first case, any double colons in the module name, such as Some::Module , are translated into your system's directory separator, usually "/". The second case does not, and would have to be specified literally. The other difference is that seeing the first require clues in the compiler that uses of indirect object notation involving "SomeModule", as in $ob = purge SomeModule , are method calls, not function calls. (Yes, this really can make a difference.)
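A sketch of the first difference, using the core module File::Spec as an example. Both forms end up loading the same file, recorded in %INC under its path form:

```perl
use strict;
use warnings;

require File::Spec;         # bareword: "::" is translated to the separator
require "File/Spec.pm";     # string: path spelled out literally (already loaded)

my @loaded = grep { $_ eq 'File/Spec.pm' } keys %INC;

print "loaded as: @loaded\n";
```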

Because the use statement implies a BEGIN block, the importing of semantics happens as soon as the use statement is compiled, before the rest of the file is compiled. This is how it is able to function as a pragma mechanism, and also how modules are able to declare subroutines that are then visible as list or unary operators for the rest of the current file. This will not work if you use require instead of use. With require you can get into this problem:

    require Cwd;                # make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;                    # import names from Cwd::
    $here = getcwd();

    require Cwd;                # make Cwd:: accessible
    $here = getcwd();           # oops! no main::getcwd()

In general, use Module () is recommended over require Module , because it determines module availability at compile time, not in the middle of your program's execution. An exception would be if two modules each tried to use each other, and each also called a function from that other module. In that case, it's easy to use require instead.

Perl packages may be nested inside other package names, so we can have package names containing :: . But if we used that package name directly as a filename it would make for unwieldy or impossible filenames on some systems. Therefore, if a module's name is, say, Text::Soundex , then its definition is actually found in the library file Text/Soundex.pm.

Perl modules always have a .pm file, but there may also be dynamically linked executables (often ending in .so) or autoloaded subroutine definitions (often ending in .al) associated with the module. If so, these will be entirely transparent to the user of the module. It is the responsibility of the .pm file to load (or arrange to autoload) any additional functionality. For example, although the POSIX module happens to do both dynamic loading and autoloading, the user can say just use POSIX to get it all.

Making your module threadsafe

Perl supports a type of threads called interpreter threads (ithreads). These threads can be used explicitly and implicitly.

Ithreads work by cloning the data tree so that no data is shared between different threads. These threads can be used by using the threads module or by doing fork() on win32 (fake fork() support). When a thread is cloned all Perl data is cloned, however non-Perl data cannot be cloned automatically. Perl after 5.8.0 has support for the CLONE special subroutine. In CLONE you can do whatever you need to do, like for example handle the cloning of non-Perl data, if necessary. CLONE will be called once as a class method for every package that has it defined (or inherits it). It will be called in the context of the new thread, so all modifications are made in the new area. Currently CLONE is called with no parameters other than the invocant package name, but code should not assume that this will remain unchanged, as it is likely that in future extra parameters will be passed in to give more information about the state of cloning.

If you want to CLONE all objects you will need to keep track of them per package. This is simply done using a hash and Scalar::Util::weaken().
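One way to sketch such a registry follows. The class name My::Handle, the cloned flag, and the helper live_objects are all hypothetical, and CLONE is called by hand here to stand in for what a new thread would trigger:

```perl
use strict;
use warnings;

package My::Handle;    # hypothetical class name
use Scalar::Util qw(weaken refaddr);

my %REGISTRY;          # address => weak reference to each live object

sub new {
    my ($class, %args) = @_;
    my $self = bless { %args }, $class;
    weaken( $REGISTRY{ refaddr($self) } = $self );
    return $self;
}

sub DESTROY {
    my ($self) = @_;
    delete $REGISTRY{ refaddr($self) };
}

sub live_objects { return grep { defined $_ } values %REGISTRY }

# Called once per package in the new thread, after the data is cloned.
sub CLONE {
    my ($class) = @_;
    my @objects = live_objects();
    %REGISTRY = ();                # the old addresses are stale
    for my $object (@objects) {
        $object->{cloned} = 1;     # placeholder for handling non-Perl data
        weaken( $REGISTRY{ refaddr($object) } = $object );
    }
    return;
}

package main;

my $handle = My::Handle->new(name => 'log');
My::Handle->CLONE;                 # simulate the call made for a new thread
print "live objects: ", scalar(My::Handle::live_objects()), "\n";
```

The weak references keep the registry from holding objects alive, so DESTROY still runs when the last real reference goes away.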

Perl after 5.8.7 has support for the CLONE_SKIP special subroutine. Like CLONE , CLONE_SKIP is called once per package; however, it is called just before cloning starts, and in the context of the parent thread. If it returns a true value, then no objects of that class will be cloned; or rather, they will be copied as unblessed, undef values. For example: if in the parent there are two references to a single blessed hash, then in the child there will be two references to a single undefined scalar value instead. This provides a simple mechanism for making a module threadsafe; just add sub CLONE_SKIP { 1 } at the top of the class, and DESTROY() will now only be called once per object. Of course, if the child thread needs to make use of the objects, then a more sophisticated approach is needed.

Like CLONE , CLONE_SKIP is currently called with no parameters other than the invocant package name, although that may change. Similarly, to allow for future expansion, the return value should be a single 0 or 1 value.
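A minimal sketch of a class that opts out of cloning. My::Resource and its release counter are hypothetical, and no thread is created here, so the example only shows the shape of the class:

```perl
use strict;
use warnings;

package My::Resource;    # hypothetical class wrapping non-Perl data

my $released = 0;

sub CLONE_SKIP { 1 }     # objects of this class are not cloned into new threads

sub new {
    my ($class, $id) = @_;
    return bless { id => $id }, $class;
}

sub DESTROY  { $released++ }    # stands in for freeing the underlying resource
sub released { return $released }

package main;

{
    my $resource = My::Resource->new(7);
}    # object goes out of scope here

print "released: ", My::Resource::released(), "\n";
```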

SEE ALSO

See perlmodlib for general style issues related to building Perl modules and classes, as well as descriptions of the standard library and CPAN, Exporter for how Perl's standard import/export mechanism works, perlootut and perlobj for in-depth information on creating classes, perlobj for a hard-core reference document on objects, perlsub for an explanation of functions and scoping, and perlxstut and perlguts for more information on writing extension modules.

 
Perl 5 version 18.2 documentation

perlmodinstall

NAME

perlmodinstall - Installing CPAN Modules

DESCRIPTION

You can think of a module as the fundamental unit of reusable Perl code; see perlmod for details. Whenever anyone creates a chunk of Perl code that they think will be useful to the world, they register as a Perl developer at http://www.cpan.org/modules/04pause.html so that they can then upload their code to the CPAN. The CPAN is the Comprehensive Perl Archive Network and can be accessed at http://www.cpan.org/ , and searched at http://search.cpan.org/ .

This documentation is for people who want to download CPAN modules and install them on their own computer.

PREAMBLE

First, are you sure that the module isn't already on your system? Try perl -MFoo -e 1 . (Replace "Foo" with the name of the module; for instance, perl -MCGI::Carp -e 1 .)

If you don't see an error message, you have the module. (If you do see an error message, it's still possible you have the module, but that it's not in your path, which you can display with perl -e "print qq(@INC)" .) For the remainder of this document, we'll assume that you really honestly truly lack an installed module, but have found it on the CPAN.
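The same check can be made from inside a program. This is a sketch with a hypothetical helper module_installed; note that it loads the module as a side effect when it is present:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded.
sub module_installed {
    my ($name) = @_;
    (my $file = "$name.pm") =~ s{::}{/}g;    # Some::Module -> Some/Module.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module ('File::Spec', 'No::Such::Module::Hopefully') {
    printf "%-30s %s\n", $module,
        module_installed($module) ? 'installed' : 'missing';
}
```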

So now you have a file ending in .tar.gz (or, less often, .zip). You know there's a tasty module inside. There are four steps you must now take:

  • DECOMPRESS the file
  • UNPACK the file into a directory
  • BUILD the module (sometimes unnecessary)
  • INSTALL the module.

Here's how to perform each step for each operating system. This is not a substitute for reading the README and INSTALL files that might have come with your module!

Also note that these instructions are tailored for installing the module into your system's repository of Perl modules, but you can install modules into any directory you wish. For instance, where I say perl Makefile.PL, you can substitute perl Makefile.PL PREFIX=/my/perl_directory to install the modules into /my/perl_directory. Then you can use the modules from your Perl programs with use lib "/my/perl_directory/lib/site_perl"; or sometimes just use lib "/my/perl_directory";. If you're on a system that requires superuser/root access to install modules into the directories you see when you type perl -e "print qq(@INC)", you'll want to install them into a local directory (such as your home directory) and use this approach.
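A short sketch of what use lib does, with the example directory from the paragraph above. The directory need not exist for the path to be added to @INC:

```perl
use strict;
use warnings;

# Added to the front of @INC at compile time.
use lib '/my/perl_directory/lib/site_perl';

my $found = grep { $_ eq '/my/perl_directory/lib/site_perl' } @INC;

print "search path now includes the local directory\n" if $found;
```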

  • If you're on a Unix or Unix-like system,

    You can use Andreas Koenig's CPAN module ( http://www.cpan.org/modules/by-module/CPAN ) to automate the following steps, from DECOMPRESS through INSTALL.

    A. DECOMPRESS

    Decompress the file with gzip -d yourmodule.tar.gz

    You can get gzip from ftp://prep.ai.mit.edu/pub/gnu/

    Or, you can combine this step with the next to save disk space:

        gzip -dc yourmodule.tar.gz | tar -xof -

    B. UNPACK

    Unpack the result with tar -xof yourmodule.tar

    C. BUILD

    Go into the newly-created directory and type:

        perl Makefile.PL
        make test

    or

        perl Makefile.PL PREFIX=/my/perl_directory

    to install it locally. (Remember that if you do this, you'll have to put use lib "/my/perl_directory"; near the top of the program that is to use this module.)

    D. INSTALL

    While still in that directory, type:

        make install

    Make sure you have the appropriate permissions to install the module in your Perl 5 library directory. Often, you'll need to be root.

    That's all you need to do on Unix systems with dynamic linking. Most Unix systems have dynamic linking. If yours doesn't, or if for another reason you have a statically-linked perl, and the module requires compilation, you'll need to build a new Perl binary that includes the module. Again, you'll probably need to be root.

  • If you're running ActivePerl (Win95/98/2K/NT/XP, Linux, Solaris),

    First, type ppm from a shell and see whether ActiveState's PPM repository has your module. If so, you can install it with ppm and you won't have to bother with any of the other steps here. You might be able to use the CPAN instructions from the "Unix or Unix-like" section above as well; give it a try. Otherwise, you'll have to follow the steps below.

    A. DECOMPRESS

    You can use the shareware Winzip ( http://www.winzip.com ) to decompress and unpack modules.

    B. UNPACK

    If you used WinZip, this was already done for you.

    C. BUILD

    You'll need the nmake utility, available at http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/nmake15.exe or dmake, available on CPAN. http://search.cpan.org/dist/dmake/

    Does the module require compilation (i.e. does it have files that end in .xs, .c, .h, .y, .cc, .cxx, or .C)? If it does, life is now officially tough for you, because you have to compile the module yourself (no easy feat on Windows). You'll need a compiler such as Visual C++. Alternatively, you can download a pre-built PPM package from ActiveState. http://aspn.activestate.com/ASPN/Downloads/ActivePerl/PPM/

    Go into the newly-created directory and type:

        perl Makefile.PL
        nmake test

    D. INSTALL

    While still in that directory, type:

        nmake install

  • If you're using a Macintosh with "Classic" MacOS and MacPerl,

    A. DECOMPRESS

    First, make sure you have the latest cpan-mac distribution ( http://www.cpan.org/authors/id/CNANDOR/ ), which has utilities for doing all of the steps. Read the cpan-mac directions carefully and install it. If you choose not to use cpan-mac for some reason, there are alternatives listed here.

    After installing cpan-mac, drop the module archive on the untarzipme droplet, which will decompress and unpack for you.

    Or, you can either use the shareware StuffIt Expander program ( http://my.smithmicro.com/mac/stuffit/ ) or the freeware MacGzip program ( http://persephone.cps.unizar.es/general/gente/spd/gzip/gzip.html ).

    B. UNPACK

    If you're using untarzipme or StuffIt, the archive should be extracted now. Or, you can use the freeware suntar or Tar ( http://hyperarchive.lcs.mit.edu/HyperArchive/Archive/cmp/ ).

    C. BUILD

    Check the contents of the distribution. Read the module's documentation, looking for reasons why you might have trouble using it with MacPerl. Look for .xs and .c files, which normally denote that the distribution must be compiled, and you cannot install it "out of the box." (See PORTABILITY.)

    D. INSTALL

    If you are using cpan-mac, just drop the folder on the installme droplet, and use the module.

    Or, if you aren't using cpan-mac, do some manual labor.

    Make sure the newlines for the modules are in Mac format, not Unix format. If they are not then you might have decompressed them incorrectly. Check your decompression and unpacking utilities settings to make sure they are translating text files properly.

    As a last resort, you can use the perl one-liner:

        perl -i.bak -pe 's/(?:\015)?\012/\015/g' <filenames>

    on the source files.

    Then move the files (probably just the .pm files, though there may be some additional ones, too; check the module documentation) to their final destination: This will most likely be in $ENV{MACPERL}site_lib: (i.e., HD:MacPerl folder:site_lib:). You can add new paths to the default @INC in the Preferences menu item in the MacPerl application ($ENV{MACPERL}site_lib: is added automagically). Create whatever directory structures are required (i.e., for Some::Module , create $ENV{MACPERL}site_lib:Some: and put Module.pm in that directory).

    Then run the following script (or something like it):

        #!perl -w
        use AutoSplit;
        my $dir = "$ENV{MACPERL}site_lib";
        autosplit("$dir:Some:Module.pm", "$dir:auto", 0, 1, 1);

  • If you're on the DJGPP port of DOS,

    A. DECOMPRESS

    djtarx ( ftp://ftp.delorie.com/pub/djgpp/current/v2/ ) will both uncompress and unpack.

    B. UNPACK

    See above.

    C. BUILD

    Go into the newly-created directory and type:

        perl Makefile.PL
        make test

    You will need the packages mentioned in README.dos in the Perl distribution.

    D. INSTALL

    While still in that directory, type:

        make install

    You will need the packages mentioned in README.dos in the Perl distribution.

  • If you're on OS/2,

    Get the EMX development suite and gzip/tar, from either Hobbes ( http://hobbes.nmsu.edu ) or Leo ( http://www.leo.org ), and then follow the instructions for Unix.

  • If you're on VMS,

    When downloading from CPAN, save your file with a .tgz extension instead of .tar.gz. All other periods in the filename should be replaced with underscores. For example, Your-Module-1.33.tar.gz should be downloaded as Your-Module-1_33.tgz.

    A. DECOMPRESS

    Type

        gzip -d Your-Module.tgz

    or, for zipped modules, type

        unzip Your-Module.zip

    Executables for gzip, zip, and VMStar:

        http://www.hp.com/go/openvms/freeware/

    and their source code:

        http://www.fsf.org/order/ftp.html

    Note that GNU's gzip/gunzip is not the same as Info-ZIP's zip/unzip package. The former is a simple compression tool; the latter permits creation of multi-file archives.

    B. UNPACK

    If you're using VMStar:

        VMStar xf Your-Module.tar

    Or, if you're fond of VMS command syntax:

        tar/extract/verbose Your_Module.tar

    C. BUILD

    Make sure you have MMS (from Digital) or the freeware MMK ( available from MadGoat at http://www.madgoat.com ). Then type this to create the DESCRIP.MMS for the module:

        perl Makefile.PL

    Now you're ready to build:

        mms test

    Substitute mmk for mms above if you're using MMK.

    D. INSTALL

    Type

        mms install

    Substitute mmk for mms above if you're using MMK.

  • If you're on MVS,

    Introduce the .tar.gz file into an HFS as binary; don't translate from ASCII to EBCDIC.

    A. DECOMPRESS

    Decompress the file with gzip -d yourmodule.tar.gz

    You can get gzip from http://www.s390.ibm.com/products/oe/bpxqp1.html

    B. UNPACK

    Unpack the result with

    1. pax -o to=IBM-1047,from=ISO8859-1 -r < yourmodule.tar

    The BUILD and INSTALL steps are identical to those for Unix. Some modules generate Makefiles that work better with GNU make, which is available from http://www.mks.com/s390/gnu/
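The VMS file-naming rule described above (save as .tgz, and turn every other period into an underscore) can be sketched in Perl. This is only an illustration; vms_archive_name() is a hypothetical helper name, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: map a CPAN archive name to the name
# suggested for VMS downloads.
sub vms_archive_name {
    my ($name) = @_;

    # Leave anything that is not a .tar.gz archive untouched.
    return $name unless $name =~ s/\.tar\.gz\z//;

    $name =~ tr/./_/;       # remaining periods become underscores
    return "$name.tgz";
}

print vms_archive_name('Your-Module-1.33.tar.gz'), "\n";
```

With the example name from the VMS section, this prints Your-Module-1_33.tgz.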

PORTABILITY

Note that not all modules will work on all platforms. See perlport for more information on portability issues. Read the documentation to see if the module will work on your system. There are basically three categories of modules that will not work "out of the box" on all platforms (with some possibility of overlap):

  • Those that should, but don't. These need to be fixed; consider contacting the author and possibly writing a patch.

  • Those that need to be compiled, where the target platform doesn't have compilers readily available. (These modules contain .xs or .c files, usually.) You might be able to find existing binaries on the CPAN or elsewhere, or you might want to try getting compilers and building it yourself, and then release the binary for other poor souls to use.

  • Those that are targeted at a specific platform. (Such as the Win32:: modules.) If the module is targeted specifically at a platform other than yours, you're out of luck, most likely.

Check the CPAN Testers if a module should work with your platform but it doesn't behave as you'd expect, or you aren't sure whether or not a module will work under your platform. If the module you want isn't listed there, you can test it yourself and let CPAN Testers know, you can join CPAN Testers, or you can request it be tested.

  1. http://testers.cpan.org/

HEY

If you have any suggested changes for this page, let me know. Please don't send me mail asking for help on how to install your modules. There are too many modules, and too few Orwants, for me to be able to answer or even acknowledge all your questions. Contact the module author instead, or post to comp.lang.perl.modules, or ask someone familiar with Perl on your operating system.

AUTHOR

Jon Orwant

orwant@medita.mit.edu

with invaluable help from Chris Nandor, and valuable help from Brandon Allbery, Charles Bailey, Graham Barr, Dominic Dunlop, Jarkko Hietaniemi, Ben Holzman, Tom Horsley, Nick Ing-Simmons, Tuomas J. Lukka, Laszlo Molnar, Alan Olsen, Peter Prymmer, Gurusamy Sarathy, Christoph Spalinger, Dan Sugalski, Larry Virden, and Ilya Zakharevich.

First version July 22, 1998; last revised November 21, 2001.

COPYRIGHT

Copyright (C) 1998, 2002, 2003 Jon Orwant. All Rights Reserved.

This document may be distributed under the same terms as Perl itself.

 
perlmodlib

NAME

perlmodlib - constructing new Perl modules and finding existing ones

THE PERL MODULE LIBRARY

Many modules are included in the Perl distribution. These are described below, and all end in .pm. You may discover compiled library files (usually ending in .so) or small pieces of modules to be autoloaded (ending in .al); these were automatically generated by the installation process. You may also discover files in the library directory that end in either .pl or .ph. These are old libraries supplied so that old programs that use them still run. The .pl files will all eventually be converted into standard modules, and the .ph files made by h2ph will probably end up as extension modules made by h2xs. (Some .ph values may already be available through the POSIX, Errno, or Fcntl modules.) The pl2pm file in the distribution may help in your conversion, but it's just a mechanical process and therefore far from bulletproof.
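As a small illustration of the remark about .ph values, the sketch below takes an open(2) flag from Fcntl and an errno value from Errno instead of requiring fcntl.ph or errno.ph. The path is a made-up name that is assumed not to exist on your system.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);   # open(2) flags, instead of fcntl.ph
use Errno qw(ENOENT);     # errno values, instead of errno.ph

# Hypothetical path, assumed not to exist.
my $path = '/no/such/file/for/this/example';

my $opened  = sysopen(my $fh, $path, O_RDONLY);
my $missing = !$opened && $! == ENOENT;

print $missing ? "ENOENT as expected\n" : "unexpected result\n";
```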

Pragmatic Modules

Pragmatic modules work somewhat like compiler directives (pragmata) in that they tend to affect the compilation of your program, and thus will usually work well only when used within a use or no declaration. Most of these are lexically scoped, so an inner BLOCK may countermand them by saying:

  1. no integer;
  2. no strict 'refs';
  3. no warnings;

which lasts until the end of that BLOCK.
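A minimal sketch of that scoping, using no strict 'refs': the symbolic reference is accepted inside the block and rejected again once the block has ended. The variable names are illustrative only.

```perl
use strict;
use warnings;

our $greeting = 'hello';    # package variable, reachable by name
my  $name     = 'greeting';
my  $inner;

{
    no strict 'refs';       # symbolic references allowed only in this block
    $inner = ${$name};      # resolves to $main::greeting
}

# Outside the block, strict refs is back in force, so the same
# expression dies at run time; eval traps the error for inspection.
my $outer = eval { ${$name} };
my $error = $@;

print "inner: $inner\n";
print "outer failed: ", ($error ? 'yes' : 'no'), "\n";
```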

Some pragmas are lexically scoped--typically those that affect the $^H hints variable. Others affect the current package instead, like use vars and use subs, which allow you to predeclare variables or subroutines within a particular file rather than just a block. Such declarations are effective for the entire file for which they were declared. You cannot rescind them with no vars or no subs.
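By way of contrast with the lexically scoped pragmas, here is a small sketch of the package-scoped behaviour of use vars and use subs. Both declarations are made inside a block, yet the names remain usable after it. The names $counter and report are invented for the example.

```perl
use strict;
use warnings;

{
    # Declared inside a block, but these pragmas act on the current
    # package, so the names stay usable after the block ends.
    use vars qw($counter);
    use subs qw(report);
    $counter = 1;
}

$counter++;                                 # still compiles under strict
my $line = report "counter is $counter";    # call without parentheses

sub report {
    my ($text) = @_;
    print "$text\n";
    return $text;
}
```

In new code an our declaration is the usual way to get a similar effect for variables, though our is itself lexically scoped.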

The following pragmas are defined (and have their own documentation).

  • arybase

    Set indexing base via $[

  • attributes

    Get/set subroutine or variable attributes

  • autodie

    Replace functions with ones that succeed or die with lexical scope

  • autodie::exception

    Exceptions from autodying functions.

  • autodie::exception::system

    Exceptions from autodying system().

  • autodie::hints

    Provide hints about user subroutines to autodie

  • autouse

    Postpone load of modules until a function is used

  • base

    Establish an ISA relationship with base classes at compile time

  • bigint

    Transparent BigInteger support for Perl

  • bignum

    Transparent BigNumber support for Perl

  • bigrat

    Transparent BigNumber/BigRational support for Perl

  • blib

    Use MakeMaker's uninstalled version of a package

  • bytes

    Force byte semantics rather than character semantics

  • charnames

    Access to Unicode character names and named character sequences; also define character names

  • constant

    Declare constants

  • deprecate

    Perl pragma for deprecating the core version of a module

  • diagnostics

    Produce verbose warning diagnostics

  • encoding

    Allows you to write your script in non-ASCII or non-UTF-8 encodings

  • encoding::warnings

    Warn on implicit encoding conversions

  • feature

    Enable new features

  • fields

    Compile-time class fields

  • filetest

    Control the filetest permission operators

  • if

    Use a Perl module if a condition holds

  • inc::latest

    Use modules bundled in inc/ if they are newer than installed ones

  • integer

    Use integer arithmetic instead of floating point

  • less

    Request less of something

  • lib

    Manipulate @INC at compile time

  • locale

    Use or avoid POSIX locales for built-in operations

  • mro

    Method Resolution Order

  • open

    Set default PerlIO layers for input and output

  • ops

    Restrict unsafe operations when compiling

  • overload

    Package for overloading Perl operations

  • overloading

    Lexically control overloading

  • parent

    Establish an ISA relationship with base classes at compile time

  • perldoc

    Look up Perl documentation in Pod format.

  • perlfaq

    Frequently asked questions about Perl

  • perlfaq1

    General Questions About Perl

  • perlfaq2

    Obtaining and Learning about Perl

  • perlfaq3

    Programming Tools

  • perlfaq4

    Data Manipulation

  • perlfaq5

    Files and Formats

  • perlfaq6

    Regular Expressions

  • perlfaq7

    General Perl Language Issues

  • perlfaq8

    System Interaction

  • perlfaq9

    Web, Email and Networking

  • perlfunc

    Perl builtin functions

  • perlglossary

    Perl Glossary

  • perlpodspeccopy

    Plain Old Documentation: format specification and notes

  • perlvarcopy

    Perl predefined variables

  • perlxs

    XS language reference manual

  • perlxstut

    Tutorial for writing XSUBs

  • perlxstypemap

    Perl XS C/Perl type mapping

  • re

    Alter regular expression behaviour

  • sigtrap

    Enable simple signal handling

  • sort

    Control sort() behaviour

  • strict

    Restrict unsafe constructs

  • subs

    Predeclare sub names

  • threads

    Perl interpreter-based threads

  • threads::shared

    Perl extension for sharing data structures between threads

  • utf8

    Enable/disable UTF-8 (or UTF-EBCDIC) in source code

  • vars

    Predeclare global variable names

  • version

    Perl extension for Version Objects

  • vmsish

    Control VMS-specific language features

  • warnings

    Control optional warnings

  • warnings::register

    Warnings import function

Standard Modules

Standard, bundled modules are all expected to behave in a well-defined manner with respect to namespace pollution because they use the Exporter module. See their own documentation for details.

It's possible that not all modules listed below are installed on your system. For example, the GDBM_File module will not be installed if you don't have the gdbm library.
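One way to find out whether an optional module is present is to try loading it inside an eval. The sketch below does that for GDBM_File and SDBM_File; module_available() is a hypothetical helper name, not something Perl provides.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded here.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(GDBM_File SDBM_File)) {
    printf "%-10s %s\n", $module,
        module_available($module) ? 'available' : 'not installed';
}
```

The output depends on how your perl was built, so none is shown here.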

  • AnyDBM_File

    Provide framework for multiple DBMs

  • App::Cpan

    Easily interact with CPAN from the command line

  • App::Prove

    Implements the prove command.

  • App::Prove::State

    State storage for the prove command.

  • App::Prove::State::Result

    Individual test suite results.

  • App::Prove::State::Result::Test

    Individual test results.

  • Archive::Extract

    A generic archive extracting mechanism

  • Archive::Tar

    Module for manipulations of tar archives

  • Archive::Tar::File

    A subclass for in-memory extracted file from Archive::Tar

  • Attribute::Handlers

    Simpler definition of attribute handlers

  • AutoLoader

    Load subroutines only on demand

  • AutoSplit

    Split a package for autoloading

  • B

    The Perl Compiler Backend

  • B::Concise

    Walk Perl syntax tree, printing concise info about ops

  • B::Debug

    Walk Perl syntax tree, printing debug info about ops

  • B::Deparse

    Perl compiler backend to produce perl code

  • B::Lint

    Perl lint

  • B::Lint::Debug

    Adds debugging stringification to B::

  • B::Showlex

    Show lexical variables used in functions or files

  • B::Terse

    Walk Perl syntax tree, printing terse info about ops

  • B::Xref

    Generates cross reference reports for Perl programs

  • Benchmark

    Benchmark running times of Perl code

  • Socket

    Networking constants and support functions

  • CGI

    Handle Common Gateway Interface requests and responses

  • CGI::Apache

    Backward compatibility module for CGI.pm

  • CGI::Carp

    CGI routines for writing to the HTTPD (or other) error log

  • CGI::Cookie

    Interface to HTTP Cookies

  • CGI::Fast

    CGI Interface for Fast CGI

  • CGI::Pretty

    Module to produce nicely formatted HTML code

  • CGI::Push

    Simple Interface to Server Push

  • CGI::Switch

    Backward compatibility module for defunct CGI::Switch

  • CGI::Util

    Internal utilities used by CGI module

  • CORE

    Namespace for Perl's core routines

  • CPAN

    Query, download and build perl modules from CPAN sites

  • CPAN::API::HOWTO

    A recipe book for programming with CPAN.pm

  • CPAN::Debug

    Internal debugging for CPAN.pm

  • CPAN::Distroprefs

    Read and match distroprefs

  • CPAN::FirstTime

    Utility for CPAN::Config file Initialization

  • CPAN::HandleConfig

    Internal configuration handling for CPAN.pm

  • CPAN::Kwalify

    Interface between CPAN.pm and Kwalify.pm

  • CPAN::Meta

    The distribution metadata for a CPAN dist

  • CPAN::Meta::Converter

    Convert CPAN distribution metadata structures

  • CPAN::Meta::Feature

    An optional feature provided by a CPAN distribution

  • CPAN::Meta::History

    History of CPAN Meta Spec changes

  • CPAN::Meta::Prereqs

    A set of distribution prerequisites by phase and type

  • CPAN::Meta::Requirements

    A set of version requirements for a CPAN dist

  • CPAN::Meta::Spec

    Specification for CPAN distribution metadata

  • CPAN::Meta::Validator

    Validate CPAN distribution metadata structures

  • CPAN::Meta::YAML

    Read and write a subset of YAML for CPAN Meta files

  • CPAN::Nox

    Wrapper around CPAN.pm without using any XS module

  • CPAN::Queue

    Internal queue support for CPAN.pm

  • CPAN::Tarzip

    Internal handling of tar archives for CPAN.pm

  • CPAN::Version

    Utility functions to compare CPAN versions

  • CPANPLUS

    API & CLI access to the CPAN mirrors

  • CPANPLUS::Backend

    Programmer's interface to CPANPLUS

  • CPANPLUS::Backend::RV

    Return value objects

  • CPANPLUS::Config

    Configuration defaults and heuristics for CPANPLUS

  • CPANPLUS::Config::HomeEnv

    Set the environment for the CPANPLUS base dir

  • CPANPLUS::Configure

    Configuration for CPANPLUS

  • CPANPLUS::Dist

    Base class for plugins

  • CPANPLUS::Dist::Autobundle

    Distribution class for installation snapshots

  • CPANPLUS::Dist::Base

    Base class for custom distribution classes

  • CPANPLUS::Dist::Build

    CPANPLUS plugin to install packages that use Build.PL

  • CPANPLUS::Dist::Build::Constants

    Constants for CPANPLUS::Dist::Build

  • CPANPLUS::Dist::MM

    Distribution class for MakeMaker related modules

  • CPANPLUS::Dist::Sample

    Sample code to create your own Dist::* plugin

  • CPANPLUS::Error

    Error handling for CPANPLUS

  • CPANPLUS::FAQ

    CPANPLUS Frequently Asked Questions

  • CPANPLUS::Hacking

    Developing CPANPLUS

  • CPANPLUS::Internals

    CPANPLUS internals

  • CPANPLUS::Internals::Extract

    Internals for archive extraction

  • CPANPLUS::Internals::Fetch

    Internals for fetching files

  • CPANPLUS::Internals::Report

    Internals for sending test reports

  • CPANPLUS::Internals::Search

    Internals for searching for modules

  • CPANPLUS::Internals::Source

    Internals for updating source files

  • CPANPLUS::Internals::Source::Memory

    In memory implementation

  • CPANPLUS::Internals::Source::SQLite

    SQLite implementation

  • CPANPLUS::Internals::Utils

    Convenience functions for CPANPLUS

  • CPANPLUS::Module

    CPAN module objects for CPANPLUS

  • CPANPLUS::Module::Author

    CPAN author object for CPANPLUS

  • CPANPLUS::Module::Author::Fake

    Dummy author object for CPANPLUS

  • CPANPLUS::Module::Checksums

    Checking the checksum of a distribution

  • CPANPLUS::Module::Fake

    Fake module object for internal use

  • CPANPLUS::Selfupdate

    Self-updating for CPANPLUS

  • CPANPLUS::Shell

    Base class for CPANPLUS shells

  • CPANPLUS::Shell::Classic

    CPAN.pm emulation for CPANPLUS

  • CPANPLUS::Shell::Default

    The default CPANPLUS shell

  • CPANPLUS::Shell::Default::Plugins::CustomSource

    Add custom sources to CPANPLUS

  • CPANPLUS::Shell::Default::Plugins::HOWTO

    Documentation on how to write your own plugins

  • CPANPLUS::Shell::Default::Plugins::Remote

    Connect to a remote CPANPLUS

  • CPANPLUS::Shell::Default::Plugins::Source

    Read in CPANPLUS commands

  • Carp

    Alternative warn and die for modules

  • Class::Struct

    Declare struct-like datatypes as Perl classes

  • Compress::Raw::Bzip2

    Low-Level Interface to bzip2 compression library

  • Compress::Raw::Zlib

    Low-Level Interface to zlib compression library

  • Compress::Zlib

    Interface to zlib compression library

  • Config

    Access Perl configuration information

  • Config::Perl::V

    Structured data retrieval of perl -V output

  • Cwd

    Get pathname of current working directory

  • DB

    Programmatic interface to the Perl debugging API

  • DBM_Filter

    Filter DBM keys/values

  • DBM_Filter::compress

    Filter for DBM_Filter

  • DBM_Filter::encode

    Filter for DBM_Filter

  • DBM_Filter::int32

    Filter for DBM_Filter

  • DBM_Filter::null

    Filter for DBM_Filter

  • DBM_Filter::utf8

    Filter for DBM_Filter

  • DB_File

    Perl5 access to Berkeley DB version 1.x

  • Data::Dumper

    Stringified perl data structures, suitable for both printing and eval

  • Devel::InnerPackage

    Find all the inner packages of a package

  • Devel::PPPort

    Perl/Pollution/Portability

  • Devel::Peek

    A data debugging tool for the XS programmer

  • Devel::SelfStubber

    Generate stubs for a SelfLoading module

  • Digest

    Modules that calculate message digests

  • Digest::MD5

    Perl interface to the MD5 Algorithm

  • Digest::SHA

    Perl extension for SHA-1/224/256/384/512

  • Digest::base

    Digest base class

  • Digest::file

    Calculate digests of files

  • DirHandle

    Supply object methods for directory handles

  • Dumpvalue

    Provides screen dump of Perl data.

  • DynaLoader

    Dynamically load C libraries into Perl code

  • Encode

    Character encodings in Perl

  • Encode::Alias

    Alias definitions to encodings

  • Encode::Byte

    Single Byte Encodings

  • Encode::CJKConstants

    Internally used by Encode::??::ISO_2022_*

  • Encode::CN

    China-based Chinese Encodings

  • Encode::CN::HZ

    Internally used by Encode::CN

  • Encode::Config

    Internally used by Encode

  • Encode::EBCDIC

    EBCDIC Encodings

  • Encode::Encoder

    Object Oriented Encoder

  • Encode::Encoding

    Encode Implementation Base Class

  • Encode::GSM0338

    ETSI GSM 03.38 Encoding

  • Encode::Guess

    Guesses encoding from data

  • Encode::JP

    Japanese Encodings

  • Encode::JP::H2Z

    Internally used by Encode::JP::2022_JP*

  • Encode::JP::JIS7

    Internally used by Encode::JP

  • Encode::KR

    Korean Encodings

  • Encode::KR::2022_KR

    Internally used by Encode::KR

  • Encode::MIME::Header

    MIME 'B' and 'Q' header encoding

  • Encode::MIME::Name

    Internally used by Encode

  • Encode::PerlIO

    A detailed document on Encode and PerlIO

  • Encode::Supported

    Encodings supported by Encode

  • Encode::Symbol

    Symbol Encodings

  • Encode::TW

    Taiwan-based Chinese Encodings

  • Encode::Unicode

    Various Unicode Transformation Formats

  • Encode::Unicode::UTF7

    UTF-7 encoding

  • English

    Use nice English (or awk) names for ugly punctuation variables

  • Env

    Perl module that imports environment variables as scalars or arrays

  • Errno

    System errno constants

  • Exporter

    Implements default import method for modules

  • Exporter::Heavy

    Exporter guts

  • ExtUtils::CBuilder

    Compile and link C code for Perl modules

  • ExtUtils::CBuilder::Platform::Windows

    Builder class for Windows platforms

  • ExtUtils::Command

    Utilities to replace common UNIX commands in Makefiles etc.

  • ExtUtils::Command::MM

    Commands for the MM's to use in Makefiles

  • ExtUtils::Constant

    Generate XS code to import C header constants

  • ExtUtils::Constant::Base

    Base class for ExtUtils::Constant objects

  • ExtUtils::Constant::Utils

    Helper functions for ExtUtils::Constant

  • ExtUtils::Constant::XS

    Generate C code for XS modules' constants.

  • ExtUtils::Embed

    Utilities for embedding Perl in C/C++ applications

  • ExtUtils::Install

    Install files from here to there

  • ExtUtils::Installed

    Inventory management of installed modules

  • ExtUtils::Liblist

    Determine libraries to use and how to use them

  • ExtUtils::MM

    OS adjusted ExtUtils::MakeMaker subclass

  • ExtUtils::MM_AIX

    AIX specific subclass of ExtUtils::MM_Unix

  • ExtUtils::MM_Any

    Platform-agnostic MM methods

  • ExtUtils::MM_BeOS

    Methods to override UN*X behaviour in ExtUtils::MakeMaker

  • ExtUtils::MM_Cygwin

    Methods to override UN*X behaviour in ExtUtils::MakeMaker

  • ExtUtils::MM_DOS

    DOS specific subclass of ExtUtils::MM_Unix

  • ExtUtils::MM_Darwin

    Special behaviors for OS X

  • ExtUtils::MM_MacOS

    Once produced Makefiles for MacOS Classic

  • ExtUtils::MM_NW5

    Methods to override UN*X behaviour in ExtUtils::MakeMaker

  • ExtUtils::MM_OS2

    Methods to override UN*X behaviour in ExtUtils::MakeMaker

  • ExtUtils::MM_QNX

    QNX specific subclass of ExtUtils::MM_Unix

  • ExtUtils::MM_UWIN

    U/WIN specific subclass of ExtUtils::MM_Unix

  • ExtUtils::MM_Unix

    Methods used by ExtUtils::MakeMaker

  • ExtUtils::MM_VMS

    Methods to override UN*X behaviour in ExtUtils::MakeMaker

  • ExtUtils::MM_VOS

    VOS specific subclass of ExtUtils::MM_Unix

  • ExtUtils::MM_Win32

    Methods to override UN*X behaviour in ExtUtils::MakeMaker

  • ExtUtils::MM_Win95

    Method to customize MakeMaker for Win9X

  • ExtUtils::MY

    ExtUtils::MakeMaker subclass for customization

  • ExtUtils::MakeMaker

    Create a module Makefile

  • ExtUtils::MakeMaker::Config

    Wrapper around Config.pm

  • ExtUtils::MakeMaker::FAQ

    Frequently Asked Questions About MakeMaker

  • ExtUtils::MakeMaker::Tutorial

    Writing a module with MakeMaker

  • ExtUtils::Manifest

    Utilities to write and check a MANIFEST file

  • ExtUtils::Mkbootstrap

    Make a bootstrap file for use by DynaLoader

  • ExtUtils::Mksymlists

    Write linker options files for dynamic extension

  • ExtUtils::Packlist

    Manage .packlist files

  • ExtUtils::ParseXS

    Converts Perl XS code into C code

  • ExtUtils::ParseXS::Constants

    Initialization values for some globals

  • ExtUtils::ParseXS::Utilities

    Subroutines used with ExtUtils::ParseXS

  • ExtUtils::Typemaps

    Read/Write/Modify Perl/XS typemap files

  • ExtUtils::Typemaps::Cmd

    Quick commands for handling typemaps

  • ExtUtils::Typemaps::InputMap

    Entry in the INPUT section of a typemap

  • ExtUtils::Typemaps::OutputMap

    Entry in the OUTPUT section of a typemap

  • ExtUtils::Typemaps::Type

    Entry in the TYPEMAP section of a typemap

  • ExtUtils::XSSymSet

    Keep sets of symbol names palatable to the VMS linker

  • ExtUtils::testlib

    Add blib/* directories to @INC

  • Fatal

    Replace functions with equivalents which succeed or die

  • Fcntl

    Load the C Fcntl.h defines

  • File::Basename

    Parse file paths into directory, filename and suffix.

  • File::CheckTree

    Run many filetest checks on a tree

  • File::Compare

    Compare files or filehandles

  • File::Copy

    Copy files or filehandles

  • File::DosGlob

    DOS like globbing and then some

  • File::Fetch

    A generic file fetching mechanism

  • File::Find

    Traverse a directory tree.

  • File::Glob

    Perl extension for BSD glob routine

  • File::GlobMapper

    Extend File Glob to Allow Input and Output Files

  • File::Path

    Create or remove directory trees

  • File::Spec

    Portably perform operations on file names

  • File::Spec::Cygwin

    Methods for Cygwin file specs

  • File::Spec::Epoc

    Methods for Epoc file specs

  • File::Spec::Functions

    Portably perform operations on file names

  • File::Spec::Mac

    File::Spec for Mac OS (Classic)

  • File::Spec::OS2

    Methods for OS/2 file specs

  • File::Spec::Unix

    File::Spec for Unix, base for other File::Spec modules

  • File::Spec::VMS

    Methods for VMS file specs

  • File::Spec::Win32

    Methods for Win32 file specs

  • File::Temp

    Return name and handle of a temporary file safely

  • File::stat

    By-name interface to Perl's built-in stat() functions

  • FileCache

    Keep more files open than the system permits

  • FileHandle

    Supply object methods for filehandles

  • Filter::Simple

    Simplified source filtering

  • Filter::Util::Call

    Perl Source Filter Utility Module

  • FindBin

    Locate directory of original perl script

  • GDBM_File

    Perl5 access to the gdbm library.

  • Getopt::Long

    Extended processing of command line options

  • Getopt::Std

    Process single-character switches with switch clustering

  • HTTP::Tiny

    A small, simple, correct HTTP/1.1 client

  • Hash::Util

    A selection of general-utility hash subroutines

  • Hash::Util::FieldHash

    Support for Inside-Out Classes

  • I18N::Collate

    Compare 8-bit scalar data according to the current locale

  • I18N::LangTags

    Functions for dealing with RFC3066-style language tags

  • I18N::LangTags::Detect

    Detect the user's language preferences

  • I18N::LangTags::List

    Tags and names for human languages

  • I18N::Langinfo

    Query locale information

  • IO

    Load various IO modules

  • IO::Compress::Base

    Base Class for IO::Compress modules

  • IO::Compress::Bzip2

    Write bzip2 files/buffers

  • IO::Compress::Deflate

    Write RFC 1950 files/buffers

  • IO::Compress::FAQ

    Frequently Asked Questions about IO::Compress

  • IO::Compress::Gzip

    Write RFC 1952 files/buffers

  • IO::Compress::RawDeflate

    Write RFC 1951 files/buffers

  • IO::Compress::Zip

    Write zip files/buffers

  • IO::Dir

    Supply object methods for directory handles

  • IO::File

    Supply object methods for filehandles

  • IO::Handle

    Supply object methods for I/O handles

  • IO::Pipe

    Supply object methods for pipes

  • IO::Poll

    Object interface to system poll call

  • IO::Seekable

    Supply seek based methods for I/O objects

  • IO::Select

    OO interface to the select system call

  • IO::Socket

    Object interface to socket communications

  • IO::Socket::INET

    Object interface for AF_INET domain sockets

  • IO::Socket::UNIX

    Object interface for AF_UNIX domain sockets

  • IO::Uncompress::AnyInflate

    Uncompress zlib-based (zip, gzip) file/buffer

  • IO::Uncompress::AnyUncompress

    Uncompress gzip, zip, bzip2 or lzop file/buffer

  • IO::Uncompress::Base

    Base Class for IO::Uncompress modules

  • IO::Uncompress::Bunzip2

    Read bzip2 files/buffers

  • IO::Uncompress::Gunzip

    Read RFC 1952 files/buffers

  • IO::Uncompress::Inflate

    Read RFC 1950 files/buffers

  • IO::Uncompress::RawInflate

    Read RFC 1951 files/buffers

  • IO::Uncompress::Unzip

    Read zip files/buffers

  • IO::Zlib

    IO:: style interface to Compress::Zlib

  • IPC::Cmd

    Finding and running system commands made easy

  • IPC::Msg

    SysV Msg IPC object class

  • IPC::Open2

    Open a process for both reading and writing using open2()

  • IPC::Open3

    Open a process for reading, writing, and error handling using open3()

  • IPC::Semaphore

    SysV Semaphore IPC object class

  • IPC::SharedMem

    SysV Shared Memory IPC object class

  • IPC::SysV

    System V IPC constants and system calls

  • JSON::PP

    JSON::XS compatible pure-Perl module.

  • JSON::PP::Boolean

    Dummy module providing JSON::PP::Boolean

  • List::Util

    A selection of general-utility list subroutines

  • List::Util::XS

    Indicate if List::Util was compiled with a C compiler

  • Locale::Codes

    A distribution of modules to handle locale codes

  • Locale::Codes::API

    A description of the callable function in each module

  • Locale::Codes::Changes

    Details changes to Locale::Codes

  • Locale::Codes::Constants

    Constants for Locale codes

  • Locale::Codes::Country

    Standard codes for country identification

  • Locale::Codes::Country_Codes

    Country codes for the Locale::Codes::Country module

  • Locale::Codes::Country_Retired

    Retired country codes for the Locale::Codes::Country module

  • Locale::Codes::Currency

    Standard codes for currency identification

  • Locale::Codes::Currency_Codes

    Currency codes for the Locale::Codes::Currency module

  • Locale::Codes::Currency_Retired

    Retired currency codes for the Locale::Codes::Currency module

  • Locale::Codes::LangExt

    Standard codes for language extension identification

  • Locale::Codes::LangExt_Codes

    Langext codes for the Locale::Codes::LangExt module

  • Locale::Codes::LangExt_Retired

    Retired langext codes for the Locale::Codes::LangExt module

  • Locale::Codes::LangFam

    Standard codes for language family identification

  • Locale::Codes::LangFam_Codes

    Langfam codes for the Locale::Codes::LangFam module

  • Locale::Codes::LangFam_Retired

    Retired langfam codes for the Locale::Codes::LangFam module

  • Locale::Codes::LangVar

    Standard codes for language variation identification

  • Locale::Codes::LangVar_Codes

    Langvar codes for the Locale::Codes::LangVar module

  • Locale::Codes::LangVar_Retired

    Retired langvar codes for the Locale::Codes::LangVar module

  • Locale::Codes::Language

    Standard codes for language identification

  • Locale::Codes::Language_Codes

    Language codes for the Locale::Codes::Language module

  • Locale::Codes::Language_Retired

    Retired language codes for the Locale::Codes::Language module

  • Locale::Codes::Script

    Standard codes for script identification

  • Locale::Codes::Script_Codes

    Script codes for the Locale::Codes::Script module

  • Locale::Codes::Script_Retired

    Retired script codes for the Locale::Codes::Script module

  • Locale::Country

    Standard codes for country identification

  • Locale::Currency

    Standard codes for currency identification

  • Locale::Language

    Standard codes for language identification

  • Locale::Maketext

    Framework for localization

  • Locale::Maketext::Cookbook

    Recipes for using Locale::Maketext

  • Locale::Maketext::Guts

    Deprecated module to load Locale::Maketext utf8 code

  • Locale::Maketext::GutsLoader

    Deprecated module to load Locale::Maketext utf8 code

  • Locale::Maketext::Simple

    Simple interface to Locale::Maketext::Lexicon

  • Locale::Maketext::TPJ13

    Article about software localization

  • Locale::Script

    Standard codes for script identification

  • Log::Message

    A generic message storing mechanism

  • Log::Message::Config

    Configuration options for Log::Message

  • Log::Message::Handlers

    Message handlers for Log::Message

  • Log::Message::Item

    Message objects for Log::Message

  • Log::Message::Simple

    Simplified interface to Log::Message

  • MIME::Base64

    Encoding and decoding of base64 strings

  • MIME::QuotedPrint

    Encoding and decoding of quoted-printable strings

  • Math::BigFloat

    Arbitrary size floating point math package

  • Math::BigInt

    Arbitrary size integer/float math package

  • Math::BigInt::Calc

    Pure Perl module to support Math::BigInt

  • Math::BigInt::CalcEmu

    Emulate low-level math with BigInt code

  • Math::BigInt::FastCalc

    Math::BigInt::Calc with some XS for more speed

  • Math::BigRat

    Arbitrary big rational numbers

  • Math::Complex

    Complex numbers and associated mathematical functions

  • Math::Trig

    Trigonometric functions

  • Memoize

    Make functions faster by trading space for time

  • Memoize::AnyDBM_File

    Glue to provide EXISTS for AnyDBM_File for Storable use

  • Memoize::Expire

    Plug-in module for automatic expiration of memoized values

  • Memoize::ExpireFile

    Test for Memoize expiration semantics

  • Memoize::ExpireTest

    Test for Memoize expiration semantics

  • Memoize::NDBM_File

    Glue to provide EXISTS for NDBM_File for Storable use

  • Memoize::SDBM_File

    Glue to provide EXISTS for SDBM_File for Storable use

  • Memoize::Storable

    Store Memoized data in Storable database

  • Module::Build

    Build and install Perl modules

  • Module::Build::API

    API Reference for Module Authors

  • Module::Build::Authoring

    Authoring Module::Build modules

  • Module::Build::Base

    Default methods for Module::Build

  • Module::Build::Bundling

    How to bundle Module::Build with a distribution

  • Module::Build::Compat

    Compatibility with ExtUtils::MakeMaker

  • Module::Build::ConfigData

    Configuration for Module::Build

  • Module::Build::Cookbook

    Examples of Module::Build Usage

  • Module::Build::ModuleInfo

    DEPRECATED

  • Module::Build::Notes

    Create persistent distribution configuration modules

  • Module::Build::PPMMaker

    Perl Package Manager file creation

  • Module::Build::Platform::Amiga

    Builder class for Amiga platforms

  • Module::Build::Platform::Default

    Stub class for unknown platforms

  • Module::Build::Platform::EBCDIC

    Builder class for EBCDIC platforms

  • Module::Build::Platform::MPEiX

    Builder class for MPEiX platforms

  • Module::Build::Platform::MacOS

    Builder class for MacOS platforms

  • Module::Build::Platform::RiscOS

    Builder class for RiscOS platforms

  • Module::Build::Platform::Unix

    Builder class for Unix platforms

  • Module::Build::Platform::VMS

    Builder class for VMS platforms

  • Module::Build::Platform::VOS

    Builder class for VOS platforms

  • Module::Build::Platform::Windows

    Builder class for Windows platforms

  • Module::Build::Platform::aix

    Builder class for AIX platform

  • Module::Build::Platform::cygwin

    Builder class for Cygwin platform

  • Module::Build::Platform::darwin

    Builder class for Mac OS X platform

  • Module::Build::Platform::os2

    Builder class for OS/2 platform

  • Module::Build::Version

    DEPRECATED

  • Module::Build::YAML

    DEPRECATED

  • Module::CoreList

    What modules shipped with versions of perl

  • Module::CoreList::Utils

    What utilities shipped with versions of perl

  • Module::Load

    Runtime require of both modules and files

  • Module::Load::Conditional

    Looking up module information / loading at runtime

  • Module::Loaded

    Mark modules as loaded or unloaded

  • Module::Metadata

    Gather package and POD information from perl module files

  • Module::Pluggable

    Automatically give your module the ability to have plugins

  • Module::Pluggable::Object

    Automatically give your module the ability to have plugins

  • NDBM_File

    Tied access to ndbm files

  • NEXT

    Provide a pseudo-class NEXT (et al) that allows method redispatch

  • Net::Cmd

    Network Command class (as used by FTP, SMTP etc)

  • Net::Config

    Local configuration data for libnet

  • Net::Domain

    Attempt to evaluate the current host's internet name and domain

  • Net::FTP

    FTP Client class

  • Net::NNTP

    NNTP Client class

  • Net::Netrc

    OO interface to user's netrc file

  • Net::POP3

    Post Office Protocol 3 Client class (RFC1939)

  • Net::Ping

    Check a remote host for reachability

  • Net::SMTP

    Simple Mail Transfer Protocol Client

  • Net::Time

    Time and daytime network client interface

  • Net::hostent

    By-name interface to Perl's built-in gethost*() functions

  • Net::libnetFAQ

    Libnet Frequently Asked Questions

  • Net::netent

    By-name interface to Perl's built-in getnet*() functions

  • Net::protoent

    By-name interface to Perl's built-in getproto*() functions

  • Net::servent

    By-name interface to Perl's built-in getserv*() functions

  • O

    Generic interface to Perl Compiler backends

  • ODBM_File

    Tied access to odbm files

  • Object::Accessor

    Interface to create per object accessors

  • Opcode

    Disable named opcodes when compiling perl code

  • POSIX

    Perl interface to IEEE Std 1003.1

  • Package::Constants

    List all constants declared in a package

  • Params::Check

    A generic input parsing/checking mechanism.

  • Parse::CPAN::Meta

    Parse META.yml and META.json CPAN metadata files

  • Perl::OSType

    Map Perl operating system names to generic types

  • PerlIO

    On demand loader for PerlIO layers and root of PerlIO::* name space

  • PerlIO::encoding

    Encoding layer

  • PerlIO::mmap

    Memory mapped IO

  • PerlIO::scalar

    In-memory IO, scalar IO

  • PerlIO::via

    Helper class for PerlIO layers implemented in perl

  • PerlIO::via::QuotedPrint

    PerlIO layer for quoted-printable strings

  • Pod::Escapes

    For resolving Pod E<...> sequences

  • Pod::Functions

    Group Perl's functions a la perlfunc.pod

  • Pod::Html

    Module to convert pod files to HTML

  • Pod::LaTeX

    Convert Pod data to formatted Latex

  • Pod::Man

    Convert POD data to formatted *roff input

  • Pod::ParseLink

    Parse an L<> formatting code in POD text

  • Pod::Perldoc

    Look up Perl documentation in Pod format.

  • Pod::Perldoc::BaseTo

    Base for Pod::Perldoc formatters

  • Pod::Perldoc::GetOptsOO

    Customized option parser for Pod::Perldoc

  • Pod::Perldoc::ToANSI

    Render Pod with ANSI color escapes

  • Pod::Perldoc::ToChecker

    Let Perldoc check Pod for errors

  • Pod::Perldoc::ToMan

    Let Perldoc render Pod as man pages

  • Pod::Perldoc::ToNroff

    Let Perldoc convert Pod to nroff

  • Pod::Perldoc::ToPod

    Let Perldoc render Pod as ... Pod!

  • Pod::Perldoc::ToRtf

    Let Perldoc render Pod as RTF

  • Pod::Perldoc::ToTerm

    Render Pod with terminal escapes

  • Pod::Perldoc::ToText

    Let Perldoc render Pod as plaintext

  • Pod::Perldoc::ToTk

    Let Perldoc use Tk::Pod to render Pod

  • Pod::Perldoc::ToXml

    Let Perldoc render Pod as XML

  • Pod::Simple

    Framework for parsing Pod

  • Pod::Simple::Checker

    Check the Pod syntax of a document

  • Pod::Simple::Debug

    Put Pod::Simple into trace/debug mode

  • Pod::Simple::DumpAsText

    Dump Pod-parsing events as text

  • Pod::Simple::DumpAsXML

    Turn Pod into XML

  • Pod::Simple::HTML

    Convert Pod to HTML

  • Pod::Simple::HTMLBatch

    Convert several Pod files to several HTML files

  • Pod::Simple::LinkSection

    Represent "section" attributes of L codes

  • Pod::Simple::Methody

    Turn Pod::Simple events into method calls

  • Pod::Simple::PullParser

    A pull-parser interface to parsing Pod

  • Pod::Simple::PullParserEndToken

    End-tokens from Pod::Simple::PullParser

  • Pod::Simple::PullParserStartToken

    Start-tokens from Pod::Simple::PullParser

  • Pod::Simple::PullParserTextToken

    Text-tokens from Pod::Simple::PullParser

  • Pod::Simple::PullParserToken

    Tokens from Pod::Simple::PullParser

  • Pod::Simple::RTF

    Format Pod as RTF

  • Pod::Simple::Search

    Find POD documents in directory trees

  • Pod::Simple::SimpleTree

    Parse Pod into a simple parse tree

  • Pod::Simple::Subclassing

    Write a formatter as a Pod::Simple subclass

  • Pod::Simple::Text

    Format Pod as plaintext

  • Pod::Simple::TextContent

    Get the text content of Pod

  • Pod::Simple::XHTML

    Format Pod as validating XHTML

  • Pod::Simple::XMLOutStream

    Turn Pod into XML

  • Pod::Text

    Convert POD data to formatted ASCII text

  • Pod::Text::Color

    Convert POD data to formatted color ASCII text

  • Pod::Text::Termcap

    Convert POD data to ASCII text with format escapes

  • SDBM_File

    Tied access to sdbm files

  • Safe

    Compile and execute code in restricted compartments

  • Scalar::Util

    A selection of general-utility scalar subroutines

  • Search::Dict

    Search for key in dictionary file

  • SelectSaver

    Save and restore selected file handle

  • SelfLoader

    Load functions only on demand

  • Storable

    Persistence for Perl data structures

  • Symbol

    Manipulate Perl symbols and their names

  • Sys::Hostname

    Try every conceivable way to get hostname

  • Sys::Syslog

    Perl interface to the UNIX syslog(3) calls

  • Sys::Syslog::Win32

    Win32 support for Sys::Syslog

  • TAP::Base

    Base class that provides common functionality to TAP::Parser

  • TAP::Formatter::Base

    Base class for harness output delegates

  • TAP::Formatter::Color

    Run Perl test scripts with color

  • TAP::Formatter::Console

    Harness output delegate for default console output

  • TAP::Formatter::Console::ParallelSession

    Harness output delegate for parallel console output

  • TAP::Formatter::Console::Session

    Harness output delegate for default console output

  • TAP::Formatter::File

    Harness output delegate for file output

  • TAP::Formatter::File::Session

    Harness output delegate for file output

  • TAP::Formatter::Session

    Abstract base class for harness output delegate

  • TAP::Harness

    Run test scripts with statistics

  • TAP::Object

    Base class that provides common functionality to all TAP::* modules

  • TAP::Parser

    Parse TAP output

  • TAP::Parser::Aggregator

    Aggregate TAP::Parser results

  • TAP::Parser::Grammar

    A grammar for the Test Anything Protocol.

  • TAP::Parser::Iterator

    Base class for TAP source iterators

  • TAP::Parser::Iterator::Array

    Iterator for array-based TAP sources

  • TAP::Parser::Iterator::Process

    Iterator for process-based TAP sources

  • TAP::Parser::Iterator::Stream

    Iterator for filehandle-based TAP sources

  • TAP::Parser::IteratorFactory

    Figures out which SourceHandler objects to use for a given Source

  • TAP::Parser::Multiplexer

    Multiplex multiple TAP::Parsers

  • TAP::Parser::Result

    Base class for TAP::Parser output objects

  • TAP::Parser::Result::Bailout

    Bailout result token.

  • TAP::Parser::Result::Comment

    Comment result token.

  • TAP::Parser::Result::Plan

    Plan result token.

  • TAP::Parser::Result::Pragma

    TAP pragma token.

  • TAP::Parser::Result::Test

    Test result token.

  • TAP::Parser::Result::Unknown

    Unknown result token.

  • TAP::Parser::Result::Version

    TAP syntax version token.

  • TAP::Parser::Result::YAML

    YAML result token.

  • TAP::Parser::ResultFactory

    Factory for creating TAP::Parser output objects

  • TAP::Parser::Scheduler

    Schedule tests during parallel testing

  • TAP::Parser::Scheduler::Job

    A single testing job.

  • TAP::Parser::Scheduler::Spinner

    A no-op job.

  • TAP::Parser::Source

    A TAP source & meta data about it

  • TAP::Parser::SourceHandler

    Base class for different TAP source handlers

  • TAP::Parser::SourceHandler::Executable

    Stream output from an executable TAP source

  • TAP::Parser::SourceHandler::File

    Stream TAP from a text file.

  • TAP::Parser::SourceHandler::Handle

    Stream TAP from an IO::Handle or a GLOB.

  • TAP::Parser::SourceHandler::Perl

    Stream TAP from a Perl executable

  • TAP::Parser::SourceHandler::RawTAP

    Stream output from raw TAP in a scalar/array ref.

  • TAP::Parser::Utils

    Internal TAP::Parser utilities

  • TAP::Parser::YAMLish::Reader

    Read YAMLish data from iterator

  • TAP::Parser::YAMLish::Writer

    Write YAMLish data

  • Term::ANSIColor

    Color screen output using ANSI escape sequences

  • Term::Cap

    Perl termcap interface

  • Term::Complete

    Perl word completion module

  • Term::ReadLine

    Perl interface to various readline packages.

  • Term::UI

    Term::ReadLine UI made easy

  • Term::UI::History

    History function

  • Test

    Provides a simple framework for writing test scripts

  • Test::Builder

    Backend for building test libraries

  • Test::Builder::Module

    Base class for test modules

  • Test::Builder::Tester

    Test testsuites that have been built with Test::Builder

  • Test::Builder::Tester::Color

    Turn on colour in Test::Builder::Tester

  • Test::Harness

    Run Perl standard test scripts with statistics

  • Test::More

    Yet another framework for writing test scripts

  • Test::Simple

    Basic utilities for writing tests.

  • Test::Tutorial

    A tutorial about writing really basic tests

  • Text::Abbrev

    Create an abbreviation table from a list

  • Text::Balanced

    Extract delimited text sequences from strings.

  • Text::ParseWords

    Parse text into an array of tokens or array of arrays

  • Text::Soundex

    Implementation of the soundex algorithm.

  • Text::Tabs

    Expand and unexpand tabs like unix expand(1) and unexpand(1)

  • Text::Wrap

    Line wrapping to form simple paragraphs

  • Thread

    Manipulate threads in Perl (for old code only)

  • Thread::Queue

    Thread-safe queues

  • Thread::Semaphore

    Thread-safe semaphores

  • Tie::Array

    Base class for tied arrays

  • Tie::File

    Access the lines of a disk file via a Perl array

  • Tie::Handle

    Base class definitions for tied handles

  • Tie::Hash

    Base class definitions for tied hashes

  • Tie::Hash::NamedCapture

    Named regexp capture buffers

  • Tie::Memoize

    Add data to hash when needed

  • Tie::RefHash

    Use references as hash keys

  • Tie::Scalar

    Base class definitions for tied scalars

  • Tie::StdHandle

    Base class definitions for tied handles

  • Tie::SubstrHash

    Fixed-table-size, fixed-key-length hashing

  • Time::HiRes

    High resolution alarm, sleep, gettimeofday, interval timers

  • Time::Local

    Efficiently compute time from local and GMT time

  • Time::Piece

    Object Oriented time objects

  • Time::Seconds

    A simple API to convert seconds to other date values

  • Time::gmtime

    By-name interface to Perl's built-in gmtime() function

  • Time::localtime

    By-name interface to Perl's built-in localtime() function

  • Time::tm

    Internal object used by Time::gmtime and Time::localtime

  • UNIVERSAL

    Base class for ALL classes (blessed references)

  • Unicode::Collate

    Unicode Collation Algorithm

  • Unicode::Collate::CJK::Big5

    Weighting CJK Unified Ideographs

  • Unicode::Collate::CJK::GB2312

    Weighting CJK Unified Ideographs

  • Unicode::Collate::CJK::JISX0208

    Weighting JIS KANJI for Unicode::Collate

  • Unicode::Collate::CJK::Korean

    Weighting CJK Unified Ideographs

  • Unicode::Collate::CJK::Pinyin

    Weighting CJK Unified Ideographs

  • Unicode::Collate::CJK::Stroke

    Weighting CJK Unified Ideographs

  • Unicode::Collate::CJK::Zhuyin

    Weighting CJK Unified Ideographs

  • Unicode::Collate::Locale

    Linguistic tailoring for DUCET via Unicode::Collate

  • Unicode::Normalize

    Unicode Normalization Forms

  • Unicode::UCD

    Unicode character database

  • User::grent

    By-name interface to Perl's built-in getgr*() functions

  • User::pwent

    By-name interface to Perl's built-in getpw*() functions

  • VMS::DCLsym

    Perl extension to manipulate DCL symbols

  • VMS::Stdio

    Standard I/O functions via VMS extensions

  • Win32

    Interfaces to some Win32 API Functions

  • Win32API::File

    Low-level access to Win32 system API calls for files/dirs.

  • Win32CORE

    Win32 CORE function stubs

  • XS::APItest

    Test the perl C API

  • XS::Typemap

    Module to test the XS typemaps distributed with perl

  • XSLoader

    Dynamically load C libraries into Perl code

  • version::Internals

    Perl extension for Version Objects

To find out all modules installed on your system, including those without documentation or outside the standard release, just use the following command (under the default win32 shell, double quotes should be used instead of single quotes).

    % perl -MFile::Find=find -MFile::Spec::Functions -Tlwe \
      'find { wanted => sub { print canonpath $_ if /\.pm\z/ },
      no_chdir => 1 }, @INC'

(The -T is here to prevent '.' from being listed in @INC.) The modules should all have their own documentation installed and accessible via your system man(1) command. If you do not have a find program, you can use the Perl find2perl program instead, which generates Perl code as output that you can run through perl. If you have a man program but it doesn't find your modules, you'll have to fix your manpath. See the perl manpage for details. If you have no system man command, you might try the perldoc program.
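The one-liner above can also be written as a small script, which may be easier to adapt. The following is a minimal sketch, not part of the standard distribution: the subroutine name module_files is made up for illustration, and the script skips '.' explicitly because it is not assumed to run under -T.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Spec::Functions qw(canonpath);

# Return the canonical paths of all *.pm files below the given directories.
# (module_files is a hypothetical helper name, not a standard function.)
sub module_files {
    # Skip @INC hooks (references) and directories that do not exist.
    my @dirs = grep { !ref($_) && -d $_ } @_;
    my @found;
    return @found unless @dirs;
    find(
        {
            wanted   => sub { push @found, canonpath($_) if /\.pm\z/ },
            no_chdir => 1,    # $_ holds the full path, as in the one-liner
        },
        @dirs,
    );
    return @found;
}

# Leave out '.' by hand, since this script does not rely on -T.
print "$_\n" for module_files(grep { $_ ne '.' } @INC);
```

Collecting the paths in a list rather than printing them directly makes it straightforward to sort, filter, or count the results afterwards.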

Note also that the command perldoc perllocal gives you a (possibly incomplete) list of the modules that have been further installed on your system. (The perllocal.pod file is updated by the standard MakeMaker install process.)

Extension Modules

Extension modules are written in C (or a mix of Perl and C). They are usually dynamically loaded into Perl if and when you need them, but may also be linked in statically. Supported extension modules include Socket, Fcntl, and POSIX.
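From the caller's side, an extension module is loaded like any other module. As a rough illustration, the sketch below uses constants from Fcntl and a function from POSIX; the subroutine create_exclusive and the file name example.lock are invented for this example.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);   # constants supplied by the C extension
use POSIX qw(floor);                     # C library function exposed to Perl
use File::Temp qw(tempdir);

# Create a file only if it does not already exist.
# Returns a filehandle on success and nothing on failure.
# (create_exclusive is a hypothetical helper for this sketch.)
sub create_exclusive {
    my ($path) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0644) or return;
    return $fh;
}

my $dir  = tempdir(CLEANUP => 1);        # scratch directory, removed at exit
my $path = "$dir/example.lock";          # hypothetical file name

# The first attempt creates the file; the second finds it already there.
print create_exclusive($path) ? "created\n" : "already exists\n";
print create_exclusive($path) ? "created\n" : "already exists\n";

# floor() comes straight from the C library via the POSIX extension.
print floor(3.7), "\n";
```

Whether the extension was loaded dynamically or linked statically makes no difference to code like this.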

Many popular C extension modules do not come bundled (at least, not completely) because of their size, volatility, or simply a lack of time for adequate testing and configuration across the multitude of platforms on which Perl was beta-tested. You are encouraged to look for them on CPAN (described below), or by using web search engines like Alta Vista or Google.
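Before searching for a module, it can be worth checking whether it is already installed. One possible approach, sketched here, uses check_install from Module::Load::Conditional (listed among the standard modules above); the subroutine describe_install and the module name Hypothetical::Missing::Module are made up for the example.

```perl
use strict;
use warnings;
use Module::Load::Conditional qw(check_install);

# Describe where a module is installed, or report that it is missing.
# (describe_install is a hypothetical helper for this sketch.)
sub describe_install {
    my ($module) = @_;
    # check_install returns a hash reference on success, undef otherwise.
    my $info = check_install(module => $module)
        or return "$module: not installed";
    my $version = defined $info->{version} ? $info->{version} : 'unknown';
    return "$module $version at $info->{file}";
}

print describe_install($_), "\n"
    for qw(POSIX Socket Hypothetical::Missing::Module);
```

check_install only inspects the module file and does not load it, so the check has no side effects on the running program.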

CPAN

CPAN stands for Comprehensive Perl Archive Network; it's a globally replicated trove of Perl materials, including documentation, style guides, tricks and traps, alternate ports to non-Unix systems and occasional binary distributions for these. Search engines for CPAN can be found at http://www.cpan.org/

Most importantly, CPAN includes around a thousand unbundled modules, some of which require a C compiler to build. Major categories of modules are:

  • Language Extensions and Documentation Tools

  • Development Support

  • Operating System Interfaces

  • Networking, Device Control (modems) and InterProcess Communication

  • Data Types and Data Type Utilities

  • Database Interfaces

  • User Interfaces

  • Interfaces to / Emulations of Other Programming Languages

  • File Names, File Systems and File Locking (see also File Handles)

  • String Processing, Language Text Processing, Parsing, and Searching

  • Option, Argument, Parameter, and Configuration File Processing

  • Internationalization and Locale

  • Authentication, Security, and Encryption

  • World Wide Web, HTML, HTTP, CGI, MIME

  • Server and Daemon Utilities

  • Archiving and Compression

  • Images, Pixmap and Bitmap Manipulation, Drawing, and Graphing

  • Mail and Usenet News

  • Control Flow Utilities (callbacks and exceptions etc)

  • File Handle and Input/Output Stream Utilities

  • Miscellaneous Modules

The list of the registered CPAN sites follows. Please note that the sorting order is alphabetical on fields:

    Continent
       |
       |-->Country
             |
             |-->[state/province]
                       |
                       |-->ftp
                       |
                       |-->[http]

and thus the North American servers happen to be listed between the European and the South American sites.

Registered CPAN sites

Africa

  • South Africa
    1. http://cpan.mirror.ac.za/
    2. ftp://cpan.mirror.ac.za/
    3. http://mirror.is.co.za/pub/cpan/
    4. ftp://ftp.is.co.za/pub/cpan/
    5. ftp://ftp.saix.net/pub/CPAN/

Asia

  • China
    1. http://cpan.wenzk.com/
  • Hong Kong
    1. http://ftp.cuhk.edu.hk/pub/packages/perl/CPAN/
    2. ftp://ftp.cuhk.edu.hk/pub/packages/perl/CPAN/
    3. http://mirrors.geoexpat.com/cpan/
  • India
    1. http://perlmirror.indialinks.com/
  • Indonesia
    1. http://cpan.biz.net.id/
    2. http://komo.vlsm.org/CPAN/
    3. ftp://komo.vlsm.org/CPAN/
    4. http://cpan.cermin.lipi.go.id/
    5. ftp://cermin.lipi.go.id/pub/CPAN/
    6. http://cpan.pesat.net.id/
  • Japan
    1. ftp://ftp.u-aizu.ac.jp/pub/CPAN
    2. ftp://ftp.kddilabs.jp/CPAN/
    3. http://ftp.nara.wide.ad.jp/pub/CPAN/
    4. ftp://ftp.nara.wide.ad.jp/pub/CPAN/
    5. http://ftp.jaist.ac.jp/pub/CPAN/
    6. ftp://ftp.jaist.ac.jp/pub/CPAN/
    7. ftp://ftp.dti.ad.jp/pub/lang/CPAN/
    8. ftp://ftp.ring.gr.jp/pub/lang/perl/CPAN/
    9. http://ftp.riken.jp/lang/CPAN/
    10. ftp://ftp.riken.jp/lang/CPAN/
    11. http://ftp.yz.yamagata-u.ac.jp/pub/lang/cpan/
    12. ftp://ftp.yz.yamagata-u.ac.jp/pub/lang/cpan/
  • Republic of Korea
    1. http://ftp.kaist.ac.kr/pub/CPAN
    2. ftp://ftp.kaist.ac.kr/pub/CPAN
    3. http://cpan.mirror.cdnetworks.com/
    4. ftp://cpan.mirror.cdnetworks.com/CPAN/
    5. http://cpan.sarang.net/
    6. ftp://cpan.sarang.net/CPAN/
  • Russia
    1. http://cpan.tomsk.ru/
    2. ftp://cpan.tomsk.ru/
  • Singapore
    1. http://mirror.averse.net/pub/CPAN
    2. ftp://mirror.averse.net/pub/CPAN
    3. http://cpan.mirror.choon.net/
    4. http://cpan.oss.eznetsols.org
    5. ftp://ftp.oss.eznetsols.org/cpan
  • Taiwan
    1. http://ftp.cse.yzu.edu.tw/pub/CPAN/
    2. ftp://ftp.cse.yzu.edu.tw/pub/CPAN/
    3. http://cpan.nctu.edu.tw/
    4. ftp://cpan.nctu.edu.tw/
    5. ftp://ftp.ncu.edu.tw/CPAN/
    6. http://cpan.cdpa.nsysu.edu.tw/
    7. ftp://cpan.cdpa.nsysu.edu.tw/Unix/Lang/CPAN/
    8. http://cpan.stu.edu.tw
    9. ftp://ftp.stu.edu.tw/CPAN
    10. http://ftp.stu.edu.tw/CPAN
    11. ftp://ftp.stu.edu.tw/pub/CPAN
    12. http://cpan.cs.pu.edu.tw/
    13. ftp://cpan.cs.pu.edu.tw/pub/CPAN
  • Thailand
    1. http://mirrors.issp.co.th/cpan/
    2. ftp://mirrors.issp.co.th/cpan/
    3. http://mirror.yourconnect.com/CPAN/
    4. ftp://mirror.yourconnect.com/CPAN/
  • Turkey
    1. http://cpan.gazi.edu.tr/

Central America

  • Costa Rica
    1. http://mirrors.ucr.ac.cr/CPAN/
    2. ftp://mirrors.ucr.ac.cr/CPAN/

Europe

  • Austria
    1. http://cpan.inode.at/
    2. ftp://cpan.inode.at
    3. http://gd.tuwien.ac.at/languages/perl/CPAN/
    4. ftp://gd.tuwien.ac.at/pub/CPAN/
  • Belgium
    1. http://ftp.belnet.be/mirror/ftp.cpan.org/
    2. ftp://ftp.belnet.be/mirror/ftp.cpan.org/
    3. http://ftp.easynet.be/pub/CPAN/
    4. http://cpan.weepee.org/
  • Bosnia and Herzegovina
    1. http://cpan.blic.net/
  • Bulgaria
    1. http://cpan.cbox.biz/
    2. ftp://cpan.cbox.biz/cpan/
    3. http://cpan.digsys.bg/
    4. ftp://ftp.digsys.bg/pub/CPAN
  • Croatia
    1. http://ftp.carnet.hr/pub/CPAN/
    2. ftp://ftp.carnet.hr/pub/CPAN/
  • Czech Republic
    1. ftp://ftp.fi.muni.cz/pub/CPAN/
    2. http://archive.cpan.cz/
  • Denmark
    1. http://mirrors.dotsrc.org/cpan
    2. ftp://mirrors.dotsrc.org/cpan/
    3. http://www.cpan.dk/
    4. http://mirror.uni-c.dk/pub/CPAN/
  • Finland
    1. ftp://ftp.funet.fi/pub/languages/perl/CPAN/
    2. http://mirror.eunet.fi/CPAN
  • France
    1. http://cpan.enstimac.fr/
    2. ftp://ftp.inria.fr/pub/CPAN/
    3. http://distrib-coffee.ipsl.jussieu.fr/pub/mirrors/cpan/
    4. ftp://distrib-coffee.ipsl.jussieu.fr/pub/mirrors/cpan/
    5. ftp://ftp.lip6.fr/pub/perl/CPAN/
    6. http://mir2.ovh.net/ftp.cpan.org
    7. ftp://mir1.ovh.net/ftp.cpan.org
    8. ftp://ftp.oleane.net/pub/CPAN/
    9. http://ftp.crihan.fr/mirrors/ftp.cpan.org/
    10. ftp://ftp.crihan.fr/mirrors/ftp.cpan.org/
    11. http://ftp.u-strasbg.fr/CPAN
    12. ftp://ftp.u-strasbg.fr/CPAN
    13. http://cpan.cict.fr/
    14. ftp://cpan.cict.fr/pub/CPAN/
  • Germany
    1. ftp://ftp.fu-berlin.de/unix/languages/perl/
    2. http://mirrors.softliste.de/cpan/
    3. ftp://ftp.rub.de/pub/CPAN/
    4. http://www.planet-elektronik.de/CPAN/
    5. http://ftp.hosteurope.de/pub/CPAN/
    6. ftp://ftp.hosteurope.de/pub/CPAN/
    7. http://www.mirrorspace.org/cpan/
    8. http://mirror.netcologne.de/cpan/
    9. ftp://mirror.netcologne.de/cpan/
    10. ftp://ftp.freenet.de/pub/ftp.cpan.org/pub/CPAN/
    11. http://ftp-stud.hs-esslingen.de/pub/Mirrors/CPAN/
    12. ftp://ftp-stud.hs-esslingen.de/pub/Mirrors/CPAN/
    13. http://mirrors.zerg.biz/cpan/
    14. http://ftp.gwdg.de/pub/languages/perl/CPAN/
    15. ftp://ftp.gwdg.de/pub/languages/perl/CPAN/
    16. http://dl.ambiweb.de/mirrors/ftp.cpan.org/
    17. http://cpan.mirror.clusters.kg/
    18. http://cpan.mirror.iphh.net/
    19. ftp://cpan.mirror.iphh.net/pub/CPAN/
    20. http://cpan.mirroring.de/
    21. http://mirror.informatik.uni-mannheim.de/pub/mirrors/CPAN/
    22. ftp://mirror.informatik.uni-mannheim.de/pub/mirrors/CPAN/
    23. http://www.chemmedia.de/mirrors/CPAN/
    24. http://ftp.cw.net/pub/CPAN/
    25. ftp://ftp.cw.net/pub/CPAN/
    26. http://cpan.cpantesters.org/
    27. ftp://cpan.cpantesters.org/CPAN/
    28. http://cpan.mirrored.de/
    29. ftp://mirror.petamem.com/CPAN/
    30. http://cpan.noris.de/
    31. ftp://cpan.noris.de/pub/CPAN/
    32. ftp://ftp.mpi-sb.mpg.de/pub/perl/CPAN/
    33. ftp://ftp.gmd.de/mirrors/CPAN/
  • Greece
    1. ftp://ftp.forthnet.gr/pub/languages/perl/CPAN
    2. ftp://ftp.ntua.gr/pub/lang/perl/
    3. http://cpan.cc.uoc.gr/
    4. ftp://ftp.cc.uoc.gr/mirrors/CPAN/
  • Hungary
    1. http://cpan.mirrors.enexis.hu/
    2. ftp://cpan.mirrors.enexis.hu/mirrors/cpan/
    3. http://cpan.hu/
  • Iceland
    1. http://ftp.rhnet.is/pub/CPAN/
    2. ftp://ftp.rhnet.is/pub/CPAN/
  • Ireland
    1. http://ftp.esat.net/pub/languages/perl/CPAN/
    2. ftp://ftp.esat.net/pub/languages/perl/CPAN/
    3. http://ftp.heanet.ie/mirrors/ftp.perl.org/pub/CPAN
    4. ftp://ftp.heanet.ie/mirrors/ftp.perl.org/pub/CPAN
  • Italy
    1. http://bo.mirror.garr.it/mirrors/CPAN/
    2. http://cpan.panu.it/
    3. ftp://ftp.panu.it/pub/mirrors/perl/CPAN/
  • Latvia
    1. http://kvin.lv/pub/CPAN/
  • Lithuania
    1. http://ftp.litnet.lt/pub/CPAN/
    2. ftp://ftp.litnet.lt/pub/CPAN/
  • Malta
    1. http://cpan.waldonet.net.mt/
  • Netherlands
    1. ftp://ftp.quicknet.nl/pub/CPAN/
    2. http://mirror.hostfuss.com/CPAN/
    3. ftp://mirror.hostfuss.com/CPAN/
    4. http://mirrors3.kernel.org/cpan/
    5. ftp://mirrors3.kernel.org/pub/CPAN/
    6. http://cpan.mirror.versatel.nl/
    7. ftp://ftp.mirror.versatel.nl/cpan/
    8. ftp://download.xs4all.nl/pub/mirror/CPAN/
    9. http://mirror.leaseweb.com/CPAN/
    10. ftp://mirror.leaseweb.com/CPAN/
    11. ftp://ftp.cpan.nl/pub/CPAN/
    12. http://archive.cs.uu.nl/mirror/CPAN/
    13. ftp://ftp.cs.uu.nl/mirror/CPAN/
    14. http://luxitude.net/cpan/
  • Norway
    1. ftp://ftp.uninett.no/pub/languages/perl/CPAN
    2. ftp://ftp.uit.no/pub/languages/perl/cpan/
  • Poland
    1. http://piotrkosoft.net/pub/mirrors/CPAN/
    2. ftp://ftp.piotrkosoft.net/pub/mirrors/CPAN/
    3. http://ftp.man.poznan.pl/pub/CPAN
    4. ftp://ftp.man.poznan.pl/pub/CPAN
    5. ftp://ftp.ps.pl/pub/CPAN/
    6. ftp://sunsite.icm.edu.pl/pub/CPAN/
    7. ftp://ftp.tpnet.pl/d4/CPAN/
  • Portugal
    1. http://cpan.dei.uc.pt/
    2. ftp://ftp.dei.uc.pt/pub/CPAN
    3. ftp://ftp.ist.utl.pt/pub/CPAN/
    4. http://cpan.perl.pt/
    5. http://cpan.ip.pt/
    6. ftp://cpan.ip.pt/pub/cpan/
    7. http://mirrors.nfsi.pt/CPAN/
    8. ftp://mirrors.nfsi.pt/pub/CPAN/
    9. http://cpan.dcc.fc.up.pt/
  • Romania
    1. http://ftp.astral.ro/pub/CPAN/
    2. ftp://ftp.astral.ro/pub/CPAN/
    3. ftp://ftp.lug.ro/CPAN
    4. http://mirrors.xservers.ro/CPAN/
    5. http://mirrors.hostingromania.ro/ftp.cpan.org/
    6. ftp://ftp.hostingromania.ro/mirrors/ftp.cpan.org/
    7. ftp://ftp.iasi.roedu.net/pub/mirrors/ftp.cpan.org/
  • Russia
    1. ftp://ftp.aha.ru/CPAN/
    2. http://cpan.rinet.ru/
    3. ftp://cpan.rinet.ru/pub/mirror/CPAN/
    4. ftp://ftp.SpringDaemons.com/pub/CPAN/
    5. http://mirror.rol.ru/CPAN/
    6. http://ftp.silvernet.ru/CPAN/
    7. http://ftp.spbu.ru/CPAN/
    8. ftp://ftp.spbu.ru/CPAN/
  • Slovakia
    1. http://cpan.fyxm.net/
  • Slovenia
    1. http://www.klevze.si/cpan
  • Spain
    1. http://osl.ugr.es/CPAN/
    2. ftp://ftp.rediris.es/mirror/CPAN/
    3. http://ftp.gui.uva.es/sites/cpan.org/
    4. ftp://ftp.gui.uva.es/sites/cpan.org/
  • Sweden
    1. http://mirrors4.kernel.org/cpan/
    2. ftp://mirrors4.kernel.org/pub/CPAN/
  • Switzerland
    1. http://cpan.mirror.solnet.ch/
    2. ftp://ftp.solnet.ch/mirror/CPAN/
    3. ftp://ftp.adwired.ch/CPAN/
    4. http://mirror.switch.ch/ftp/mirror/CPAN/
    5. ftp://mirror.switch.ch/mirror/CPAN/
  • Ukraine
    1. http://cpan.makeperl.org/
    2. ftp://cpan.makeperl.org/pub/CPAN
    3. http://cpan.org.ua/
    4. http://cpan.gafol.net/
    5. ftp://ftp.gafol.net/pub/cpan/
  • United Kingdom
    1. http://www.mirrorservice.org/sites/ftp.funet.fi/pub/languages/perl/CPAN/
    2. ftp://ftp.mirrorservice.org/sites/ftp.funet.fi/pub/languages/perl/CPAN/
    3. http://mirror.tje.me.uk/pub/mirrors/ftp.cpan.org/
    4. ftp://mirror.tje.me.uk/pub/mirrors/ftp.cpan.org/
    5. http://www.mirror.8086.net/sites/CPAN/
    6. ftp://ftp.mirror.8086.net/sites/CPAN/
    7. http://cpan.mirror.anlx.net/
    8. ftp://ftp.mirror.anlx.net/CPAN/
    9. http://mirror.bytemark.co.uk/CPAN/
    10. ftp://mirror.bytemark.co.uk/CPAN/
    11. http://cpan.etla.org/
    12. ftp://cpan.etla.org/pub/CPAN
    13. ftp://ftp.demon.co.uk/pub/CPAN/
    14. http://mirror.sov.uk.goscomb.net/CPAN/
    15. ftp://mirror.sov.uk.goscomb.net/pub/CPAN/
    16. http://ftp.plig.net/pub/CPAN/
    17. ftp://ftp.plig.net/pub/CPAN/
    18. http://ftp.ticklers.org/pub/CPAN/
    19. ftp://ftp.ticklers.org/pub/CPAN/
    20. http://cpan.mirrors.uk2.net/
    21. ftp://mirrors.uk2.net/pub/CPAN/
    22. http://mirror.ox.ac.uk/sites/www.cpan.org/
    23. ftp://mirror.ox.ac.uk/sites/www.cpan.org/

North America

  • Bahamas
    1. http://www.securehost.com/mirror/CPAN/
  • Canada
    1. http://cpan.arcticnetwork.ca
    2. ftp://mirror.arcticnetwork.ca/pub/CPAN
    3. http://cpan.sunsite.ualberta.ca/
    4. ftp://cpan.sunsite.ualberta.ca/pub/CPAN/
    5. http://theoryx5.uwinnipeg.ca/pub/CPAN/
    6. ftp://theoryx5.uwinnipeg.ca/pub/CPAN/
    7. http://arwen.cs.dal.ca/mirror/CPAN/
    8. ftp://arwen.cs.dal.ca/pub/mirror/CPAN/
    9. http://CPAN.mirror.rafal.ca/
    10. ftp://CPAN.mirror.rafal.ca/pub/CPAN/
    11. ftp://ftp.nrc.ca/pub/CPAN/
    12. http://mirror.csclub.uwaterloo.ca/pub/CPAN/
    13. ftp://mirror.csclub.uwaterloo.ca/pub/CPAN/
  • Mexico
    1. http://www.msg.com.mx/CPAN/
    2. ftp://ftp.msg.com.mx/pub/CPAN/
  • United States
    • Alabama
      1. http://mirror.hiwaay.net/CPAN/
      2. ftp://mirror.hiwaay.net/CPAN/
    • Arizona
      1. http://cpan.ezarticleinformation.com/
    • California
      1. http://cpan.knowledgematters.net/
      2. http://cpan.binkerton.com/
      3. http://cpan.develooper.com/
      4. http://mirrors.gossamer-threads.com/CPAN
      5. http://cpan.schatt.com/
      6. http://mirrors.kernel.org/cpan/
      7. ftp://mirrors.kernel.org/pub/CPAN
      8. http://mirrors2.kernel.org/cpan/
      9. ftp://mirrors2.kernel.org/pub/CPAN/
      10. http://cpan.mirror.facebook.net/
      11. http://mirrors1.kernel.org/cpan/
      12. ftp://mirrors1.kernel.org/pub/CPAN/
      13. http://cpan-sj.viaverio.com/
      14. ftp://cpan-sj.viaverio.com/pub/CPAN/
      15. http://www.perl.com/CPAN/
    • Florida
      1. ftp://ftp.cise.ufl.edu/pub/mirrors/CPAN/
      2. http://mirror.atlantic.net/pub/CPAN/
      3. ftp://mirror.atlantic.net/pub/CPAN/
    • Idaho
      1. http://mirror.its.uidaho.edu/pub/cpan/
      2. ftp://mirror.its.uidaho.edu/cpan/
    • Illinois
      1. http://cpan.mirrors.hoobly.com/
      2. http://cpan.uchicago.edu/pub/CPAN/
      3. ftp://cpan.uchicago.edu/pub/CPAN/
      4. http://mirrors.servercentral.net/CPAN/
      5. http://www.stathy.com/CPAN/
      6. ftp://www.stathy.com/CPAN/
    • Indiana
      1. ftp://ftp.uwsg.iu.edu/pub/perl/CPAN/
      2. http://cpan.netnitco.net/
      3. ftp://cpan.netnitco.net/pub/mirrors/CPAN/
      4. http://ftp.ndlug.nd.edu/pub/perl/
      5. ftp://ftp.ndlug.nd.edu/pub/perl/
    • Massachusetts
      1. http://mirrors.ccs.neu.edu/CPAN/
    • Michigan
      1. http://ftp.wayne.edu/cpan/
      2. ftp://ftp.wayne.edu/cpan/
    • Minnesota
      1. http://cpan.msi.umn.edu/
    • New Jersey
      1. http://mirror.datapipe.net/CPAN/
      2. ftp://mirror.datapipe.net/pub/CPAN/
    • New York
      1. http://mirrors.24-7-solutions.net/pub/CPAN/
      2. ftp://mirrors.24-7-solutions.net/pub/CPAN/
      3. http://mirror.cc.columbia.edu/pub/software/cpan/
      4. ftp://mirror.cc.columbia.edu/pub/software/cpan/
      5. http://cpan.belfry.net/
      6. http://cpan.erlbaum.net/
      7. ftp://cpan.erlbaum.net/CPAN/
      8. http://cpan.hexten.net/
      9. ftp://cpan.hexten.net/
      10. ftp://mirror.nyi.net/CPAN/
      11. http://mirror.rit.edu/CPAN/
      12. ftp://mirror.rit.edu/CPAN/
    • North Carolina
      1. http://www.ibiblio.org/pub/mirrors/CPAN
      2. ftp://ftp.ncsu.edu/pub/mirror/CPAN/
    • Oregon
      1. http://ftp.osuosl.org/pub/CPAN/
      2. ftp://ftp.osuosl.org/pub/CPAN/
    • Pennsylvania
      1. http://ftp.epix.net/CPAN/
      2. ftp://ftp.epix.net/pub/languages/perl/
      3. http://cpan.pair.com/
      4. ftp://cpan.pair.com/pub/CPAN/
    • South Carolina
      1. http://cpan.mirror.clemson.edu/
    • Tennessee
      1. http://mira.sunsite.utk.edu/CPAN/
    • Texas
      1. http://mirror.uta.edu/CPAN
    • Utah
      1. ftp://mirror.xmission.com/CPAN/
    • Virginia
      1. http://cpan-du.viaverio.com/
      2. ftp://cpan-du.viaverio.com/pub/CPAN/
      3. http://perl.secsup.org/
      4. ftp://perl.secsup.org/pub/perl/
      5. ftp://mirror.cogentco.com/pub/CPAN/
    • Washington
      1. http://cpan.llarian.net/
      2. ftp://cpan.llarian.net/pub/CPAN/
      3. ftp://ftp-mirror.internap.com/pub/CPAN/
    • Wisconsin
      1. http://cpan.mirrors.tds.net
      2. ftp://cpan.mirrors.tds.net/pub/CPAN
      3. http://mirror.sit.wisc.edu/pub/CPAN/
      4. ftp://mirror.sit.wisc.edu/pub/CPAN/

Oceania

  • Australia
    1. http://mirror.internode.on.net/pub/cpan/
    2. ftp://mirror.internode.on.net/pub/cpan/
    3. http://cpan.mirror.aussiehq.net.au/
    4. http://mirror.as24220.net/cpan/
    5. ftp://mirror.as24220.net/cpan/
  • New Zealand
    1. ftp://ftp.auckland.ac.nz/pub/perl/CPAN/
    2. http://cpan.inspire.net.nz
    3. ftp://cpan.inspire.net.nz/cpan
    4. http://cpan.catalyst.net.nz/CPAN/
    5. ftp://cpan.catalyst.net.nz/pub/CPAN/

South America

  • Argentina
    1. http://cpan.patan.com.ar/
    2. http://cpan.localhost.net.ar
    3. ftp://mirrors.localhost.net.ar/pub/mirrors/CPAN
  • Brazil
    1. ftp://cpan.pop-mg.com.br/pub/CPAN/
    2. http://ftp.pucpr.br/CPAN
    3. ftp://ftp.pucpr.br/CPAN
    4. http://cpan.kinghost.net/
  • Chile
    1. http://cpan.dcc.uchile.cl/
    2. ftp://cpan.dcc.uchile.cl/pub/lang/cpan/
  • Colombia
    1. http://www.laqee.unal.edu.co/CPAN/

RSYNC Mirrors

  1. mirror.as24220.net::cpan
  2. cpan.inode.at::CPAN
  3. gd.tuwien.ac.at::CPAN
  4. ftp.belnet.be::packages/cpan
  5. rsync.linorg.usp.br::CPAN
  6. rsync.arcticnetwork.ca::CPAN
  7. CPAN.mirror.rafal.ca::CPAN
  8. mirror.csclub.uwaterloo.ca::CPAN
  9. theoryx5.uwinnipeg.ca::CPAN
  10. www.laqee.unal.edu.co::CPAN
  11. mirror.uni-c.dk::CPAN
  12. rsync.nic.funet.fi::CPAN
  13. rsync://distrib-coffee.ipsl.jussieu.fr/pub/mirrors/cpan/
  14. mir1.ovh.net::CPAN
  15. miroir-francais.fr::cpan
  16. ftp.crihan.fr::CPAN
  17. rsync://mirror.cict.fr/cpan/
  18. rsync://mirror.netcologne.de/cpan/
  19. ftp-stud.hs-esslingen.de::CPAN/
  20. ftp.gwdg.de::FTP/languages/perl/CPAN/
  21. cpan.mirror.iphh.net::CPAN
  22. cpan.cpantesters.org::cpan
  23. cpan.hu::CPAN
  24. komo.vlsm.org::CPAN
  25. mirror.unej.ac.id::cpan
  26. ftp.esat.net::/pub/languages/perl/CPAN
  27. ftp.heanet.ie::mirrors/ftp.perl.org/pub/CPAN
  28. rsync.panu.it::CPAN
  29. cpan.fastbull.org::CPAN
  30. ftp.kddilabs.jp::cpan
  31. ftp.nara.wide.ad.jp::cpan/
  32. rsync://ftp.jaist.ac.jp/pub/CPAN/
  33. rsync://ftp.riken.jp/cpan/
  34. mirror.linuxiso.kz::CPAN
  35. rsync://mirrors3.kernel.org/mirrors/CPAN/
  36. rsync://rsync.osmirror.nl/cpan/
  37. mirror.leaseweb.com::CPAN
  38. cpan.nautile.nc::CPAN
  39. mirror.icis.pcz.pl::CPAN
  40. piotrkosoft.net::mirrors/CPAN
  41. rsync://cpan.perl.pt/
  42. ftp.kaist.ac.kr::cpan
  43. cpan.sarang.net::CPAN
  44. mirror.averse.net::cpan
  45. rsync.oss.eznetsols.org
  46. mirror.ac.za::cpan
  47. ftp.is.co.za::IS-Mirror/ftp.cpan.org/
  48. rsync://ftp.gui.uva.es/cpan/
  49. rsync://mirrors4.kernel.org/mirrors/CPAN/
  50. ftp.solnet.ch::CPAN
  51. ftp.ulak.net.tr::CPAN
  52. gafol.net::cpan
  53. rsync.mirrorservice.org::ftp.funet.fi/pub/
  54. rsync://rsync.mirror.8086.net/CPAN/
  55. rsync.mirror.anlx.net::CPAN
  56. mirror.bytemark.co.uk::CPAN
  57. ftp.plig.net::CPAN
  58. rsync://ftp.ticklers.org:CPAN/
  59. mirrors.ibiblio.org::CPAN
  60. cpan-du.viaverio.com::CPAN
  61. mirror.hiwaay.net::CPAN
  62. rsync://mira.sunsite.utk.edu/CPAN/
  63. cpan.mirrors.tds.net::CPAN
  64. mirror.its.uidaho.edu::cpan
  65. rsync://mirror.cc.columbia.edu::cpan/
  66. ftp.fxcorporate.com::CPAN
  67. rsync.atlantic.net::CPAN
  68. mirrors.kernel.org::mirrors/CPAN
  69. rsync://mirrors2.kernel.org/mirrors/CPAN/
  70. cpan.pair.com::CPAN
  71. rsync://mirror.rit.edu/CPAN/
  72. rsync://mirror.facebook.net/cpan/
  73. rsync://mirrors1.kernel.org/mirrors/CPAN/
  74. cpan-sj.viaverio.com::CPAN

For an up-to-date listing of CPAN sites, see http://www.cpan.org/SITES or ftp://www.cpan.org/SITES .

Modules: Creation, Use, and Abuse

(The following section is borrowed directly from Tim Bunce's modules file, available at your nearest CPAN site.)

Perl implements a class using a package, but the presence of a package doesn't imply the presence of a class. A package is just a namespace. A class is a package that provides subroutines that can be used as methods. A method is just a subroutine that expects, as its first argument, either the name of a package (for "static" methods), or a reference to something (for "virtual" methods).

A module is a file that (by convention) provides a class of the same name (sans the .pm), plus an import method in that class that can be called to fetch exported symbols. This module may implement some of its methods by loading dynamic C or C++ objects, but that should be totally transparent to the user of the module. Likewise, the module might set up an AUTOLOAD function to slurp in subroutine definitions on demand, but this is also transparent. Only the .pm file is required to exist. See perlsub, perlobj, and AutoLoader for details about the AUTOLOAD mechanism.
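
The distinction between a package, a class, and the two kinds of method can be sketched in a few lines. This is only an illustration: `Counter`, `new`, `increment`, and the `start` argument are hypothetical names, and the package is written inline rather than in its own .pm file so that the sketch runs as a single script.

```perl
use strict;
use warnings;

# Counter is a hypothetical class: a package whose subroutines are
# meant to be called as methods.
{
    package Counter;

    # "Static" (class) method: the first argument is the package name.
    sub new {
        my ($class, %args) = @_;
        return bless { count => $args{start} || 0 }, $class;
    }

    # "Virtual" (instance) method: the first argument is a blessed reference.
    sub increment {
        my ($self) = @_;
        return ++$self->{count};
    }
}

my $counter = Counter->new(start => 5);   # calls Counter::new('Counter', start => 5)
$counter->increment;                      # calls Counter::increment($counter)
print $counter->increment, "\n";          # prints 7
```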

Guidelines for Module Creation

  • Do similar modules already exist in some form?

    If so, please try to reuse the existing modules either in whole or by inheriting useful features into a new class. If this is not practical try to get together with the module authors to work on extending or enhancing the functionality of the existing modules. A perfect example is the plethora of packages in perl4 for dealing with command line options.

    If you are writing a module to expand an already existing set of modules, please coordinate with the author of the package. It helps if you follow the same naming scheme and module interaction scheme as the original author.

  • Try to design the new module to be easy to extend and reuse.

    Try to use warnings; (or use warnings qw(...);). Remember that you can add no warnings qw(...); to individual blocks of code that need fewer warnings.

    Use blessed references. Use the two argument form of bless to bless into the class name given as the first parameter of the constructor, e.g.:

        sub new {
            my $class = shift;
            return bless {}, $class;
        }

    or even this if you'd like it to be used as either a static or a virtual method.

        sub new {
            my $self = shift;
            my $class = ref($self) || $self;
            return bless {}, $class;
        }

    Pass arrays as references so more parameters can be added later (it's also faster). Convert functions into methods where appropriate. Split large methods into smaller more flexible ones. Inherit methods from other modules if appropriate.
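
    As a minimal sketch of the first point, a routine that takes its list as an array reference can later grow extra parameters without disturbing existing callers. The function `total` and its `scale` option are hypothetical, invented only for this illustration.

    ```perl
    use strict;
    use warnings;

    # total() is a hypothetical routine. Because the numbers arrive as a
    # single array reference, optional name => value pairs can follow it.
    sub total {
        my ($numbers, %opt) = @_;
        my $sum = 0;
        $sum += $_ for @$numbers;
        return $opt{scale} ? $sum * $opt{scale} : $sum;
    }

    my @values = (1, 2, 3);
    my $plain  = total(\@values);                # 6
    my $scaled = total(\@values, scale => 10);   # 60
    ```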

    Avoid class name tests like: die "Invalid" unless ref $ref eq 'FOO' . Generally you can delete the eq 'FOO' part with no harm at all. Let the objects look after themselves! Generally, avoid hard-wired class names as far as possible.

    Avoid $r->Class::func() where using @ISA=qw(... Class ...) and $r->func() would work.

    Use autosplit so little used or newly added functions won't be a burden to programs that don't use them. Add test functions to the module after __END__ either using AutoSplit or by saying:

        eval join('',<main::DATA>) || die $@ unless caller();

    Does your module pass the 'empty subclass' test? If you say @SUBCLASS::ISA = qw(YOURCLASS); your applications should be able to use SUBCLASS in exactly the same way as YOURCLASS. For example, does your application still work if you change: $obj = YOURCLASS->new(); into: $obj = SUBCLASS->new(); ?
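
    The 'empty subclass' test can be tried in a few lines. In this sketch the `describe` method is a hypothetical addition, and both packages are written inline so that the example runs as one script. It passes only because the constructor uses the two argument form of bless and nothing hard-wires the class name.

    ```perl
    use strict;
    use warnings;

    {
        package YOURCLASS;

        sub new {
            my $class = shift;
            return bless {}, $class;    # blesses into whatever class was asked for
        }

        # Hypothetical method, here only to show inheritance at work.
        sub describe {
            my $self = shift;
            return 'I am a ' . ref($self);
        }
    }

    {
        package SUBCLASS;
        our @ISA = qw(YOURCLASS);       # the whole subclass: nothing but @ISA
    }

    my $obj         = SUBCLASS->new();
    my $description = $obj->describe(); # found through @ISA, no Class::func() needed
    ```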

    Avoid keeping any state information in your packages. It makes it difficult for multiple other packages to use yours. Keep state information in objects.

    Always use -w.

    Try to use strict; (or use strict qw(...); ). Remember that you can add no strict qw(...); to individual blocks of code that need less strictness.
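
    One common case where a block needs less strictness is installing subroutines by name. The following is a sketch under assumed names (`Point`, with accessors `x` and `y`); strict 'refs' is relaxed only inside the loop body and stays in force everywhere else.

    ```perl
    use strict;
    use warnings;

    {
        package Point;    # hypothetical class

        sub new {
            my ($class, %args) = @_;
            return bless {%args}, $class;
        }

        for my $field (qw(x y)) {
            no strict 'refs';    # symbolic references allowed in this block only
            *{"Point::$field"} = sub { $_[0]{$field} };
        }
    }

    my $point = Point->new(x => 3, y => 4);
    my $sum   = $point->x() + $point->y();   # 7
    ```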

    Always use -w.

    Follow the guidelines in perlstyle.

    Always use -w.

  • Some simple style guidelines

    The perlstyle manual supplied with Perl has many helpful points.

    Coding style is a matter of personal taste. Many people evolve their style over several years as they learn what helps them write and maintain good code. Here's one set of assorted suggestions that seem to be widely used by experienced developers:

    Use underscores to separate words. It is generally easier to read $var_names_like_this than $VarNamesLikeThis, especially for non-native speakers of English. It's also a simple rule that works consistently with VAR_NAMES_LIKE_THIS.

    Package/Module names are an exception to this rule. Perl informally reserves lowercase module names for 'pragma' modules like integer and strict. Other modules normally begin with a capital letter and use mixed case with no underscores (need to be short and portable).

    You may find it helpful to use letter case to indicate the scope or nature of a variable. For example:

        $ALL_CAPS_HERE   constants only (beware clashes with Perl vars)
        $Some_Caps_Here  package-wide global/static
        $no_caps_here    function scope my() or local() variables

    Function and method names seem to work best as all lowercase. e.g., $obj->as_string() .

    You can use a leading underscore to indicate that a variable or function should not be used outside the package that defined it.

  • Select what to export.

    Do NOT export method names!

    Do NOT export anything else by default without a good reason!

    Exports pollute the namespace of the module user. If you must export try to use @EXPORT_OK in preference to @EXPORT and avoid short or common names to reduce the risk of name clashes.

    Generally anything not exported is still accessible from outside the module using the ModuleName::item_name (or $blessed_ref->method ) syntax. By convention you can use a leading underscore on names to indicate informally that they are 'internal' and not for public use.

    (It is actually possible to get private functions by saying: my $subref = sub { ... }; &$subref; . But there's no way to call that directly as a method, because a method must have a name in the symbol table.)
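
    A sketch of that technique, assuming a hypothetical `Temperature` class: the helper lives in a lexical variable, so it has no entry in the symbol table and cannot be reached from outside the enclosing block.

    ```perl
    use strict;
    use warnings;

    {
        package Temperature;    # hypothetical class

        # Private helper: a code reference held in a lexical variable.
        my $to_fahrenheit = sub {
            my ($celsius) = @_;
            return $celsius * 9 / 5 + 32;
        };

        sub new {
            my ($class, $celsius) = @_;
            return bless { celsius => $celsius }, $class;
        }

        sub fahrenheit {
            my ($self) = @_;
            return $to_fahrenheit->($self->{celsius});
        }
    }

    my $boiling = Temperature->new(100)->fahrenheit();   # 212
    ```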

    As a general rule, if the module is trying to be object oriented then export nothing. If it's just a collection of functions then @EXPORT_OK anything but use @EXPORT with caution.
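
    For a function-collection module, the export advice might look like the sketch below. `Text::Squeeze` and `squeeze_spaces` are hypothetical names. The module is defined inline in a BEGIN block and registered in %INC purely so that the example runs as a single file; a real module would live in its own .pm file.

    ```perl
    use strict;
    use warnings;

    BEGIN {
        package Text::Squeeze;                 # hypothetical module
        use Exporter 'import';                 # borrow Exporter's import method
        our @EXPORT    = ();                   # nothing exported by default
        our @EXPORT_OK = qw(squeeze_spaces);   # available on request only

        sub squeeze_spaces {
            my ($text) = @_;
            $text =~ s/\s+/ /g;
            return $text;
        }

        $INC{'Text/Squeeze.pm'} = __FILE__;    # inline stand-in for a real file
    }

    use Text::Squeeze qw(squeeze_spaces);      # the user asks for the name explicitly

    my $clean = squeeze_spaces("too    many   spaces");
    my $same  = Text::Squeeze::squeeze_spaces("a  b");   # full name always works
    ```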

  • Select a name for the module.

    This name should be as descriptive, accurate, and complete as possible. Avoid any risk of ambiguity. Always try to use two or more whole words. Generally the name should reflect what is special about what the module does rather than how it does it. Please use nested module names to group informally or categorize a module. There should be a very good reason for a module not to have a nested name. Module names should begin with a capital letter.

    Having 57 modules all called Sort will not make life easy for anyone (though having 23 called Sort::Quick is only marginally better :-). Imagine someone trying to install your module alongside many others. If in any doubt ask for suggestions in comp.lang.perl.misc.

    If you are developing a suite of related modules/classes it's good practice to use nested classes with a common prefix as this will avoid namespace clashes. For example: Xyz::Control, Xyz::View, Xyz::Model etc. Use the modules in this list as a naming guide.

    If adding a new module to a set, follow the original author's standards for naming modules and the interface to methods in those modules.

    If developing modules for private internal or project specific use, that will never be released to the public, then you should ensure that their names will not clash with any future public module. You can do this either by using the reserved Local::* category or by using a category name that includes an underscore like Foo_Corp::*.

    To be portable each component of a module name should be limited to 11 characters. If it might be used on MS-DOS then try to ensure each is unique in the first 8 characters. Nested modules make this easier.

  • Have you got it right?

    How do you know that you've made the right decisions? Have you picked an interface design that will cause problems later? Have you picked the most appropriate name? Do you have any questions?

    The best way to know for sure, and pick up many helpful suggestions, is to ask someone who knows. Comp.lang.perl.misc is read by just about all the people who develop modules and it's the best place to ask.

    All you need to do is post a short summary of the module, its purpose and interfaces. A few lines on each of the main methods is probably enough. (If you post the whole module it might be ignored by busy people - generally the very people you want to read it!)

    Don't worry about posting if you can't say when the module will be ready - just say so in the message. It might be worth inviting others to help you; they may be able to complete it for you!

  • README and other Additional Files.

    It's well known that software developers usually fully document the software they write. If, however, the world is in urgent need of your software and there is not enough time to write the full documentation please at least provide a README file containing:

    • A description of the module/package/extension etc.

    • A copyright notice - see below.

    • Prerequisites - what else you may need to have.

    • How to build it - possible changes to Makefile.PL etc.

    • How to install it.

    • Recent changes in this release, especially incompatibilities

    • Changes / enhancements you plan to make in the future.

    If the README file seems to be getting too large you may wish to split out some of the sections into separate files: INSTALL, Copying, ToDo etc.

    • Adding a Copyright Notice.

      How you choose to license your work is a personal decision. The general mechanism is to assert your Copyright and then make a declaration of how others may copy/use/modify your work.

      Perl, for example, is supplied with two types of licence: The GNU GPL and The Artistic Licence (see the files README, Copying, and Artistic, or perlgpl and perlartistic). Larry has good reasons for NOT just using the GNU GPL.

      My personal recommendation, out of respect for Larry, Perl, and the Perl community at large is to state something simply like:

          Copyright (c) 1995 Your Name. All rights reserved.
          This program is free software; you can redistribute it and/or
          modify it under the same terms as Perl itself.

      This statement should at least appear in the README file. You may also wish to include it in a Copying file and your source files. Remember to include the other words in addition to the Copyright.

    • Give the module a version/issue/release number.

      To be fully compatible with the Exporter and MakeMaker modules you should store your module's version number in a non-my package variable called $VERSION. This should be a positive floating point number with at least two digits after the decimal point (i.e., hundredths; e.g., $VERSION = "0.01"). Don't use a "1.3.2" style version. See Exporter for details.

      It may be handy to add a function or method to retrieve the number. Use the number in announcements and archive file names when releasing the module (ModuleName-1.02.tar.Z). See perldoc ExtUtils::MakeMaker.pm for details.
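
      A small sketch of the version variable in use, assuming a hypothetical module `My::Widget` written inline. Every package already inherits a VERSION method from UNIVERSAL, which returns $VERSION and, when given a number, dies if the module is older than that.

      ```perl
      use strict;
      use warnings;

      {
          package My::Widget;       # hypothetical module
          our $VERSION = '1.02';    # a package variable, not a "my" variable
      }

      my $version = My::Widget->VERSION;                       # '1.02'
      my $ok      = eval { My::Widget->VERSION(1.01); 1 };     # new enough
      my $too_old = !eval { My::Widget->VERSION(2.00); 1 };    # dies: 1.02 < 2.00
      ```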

    • How to release and distribute a module.

      It's a good idea to post an announcement of the availability of your module (or the module itself if small) to the comp.lang.perl.announce Usenet newsgroup. This will at least ensure very wide once-off distribution.

      If possible, register the module with CPAN. You should include details of its location in your announcement.

      Some notes about ftp archives: Please use a long descriptive file name that includes the version number. Most incoming directories will not be readable/listable, i.e., you won't be able to see your file after uploading it. Remember to send your email notification message as soon as possible after uploading else your file may get deleted automatically. Allow time for the file to be processed and/or check the file has been processed before announcing its location.

      FTP Archives for Perl Modules:

      Follow the instructions and links on:

          http://www.cpan.org/modules/00modlist.long.html
          http://www.cpan.org/modules/04pause.html

      or upload to one of these sites:

          https://pause.kbx.de/pause/
          http://pause.perl.org/

      and notify <modules@perl.org>.

      By using the WWW interface you can ask the Upload Server to mirror your modules from your ftp or WWW site into your own directory on CPAN!

      Please remember to send me an updated entry for the Module list!

    • Take care when changing a released module.

      Always strive to remain compatible with previous released versions. Otherwise try to add a mechanism to revert to the old behavior if people rely on it. Document incompatible changes.

Guidelines for Converting Perl 4 Library Scripts into Modules

  • There is no requirement to convert anything.

    If it ain't broke, don't fix it! Perl 4 library scripts should continue to work with no problems. You may need to make some minor changes (like escaping non-array @'s in double quoted strings) but there is no need to convert a .pl file into a Module for just that.

  • Consider the implications.

    All Perl applications that make use of the script will need to be changed (slightly) if the script is converted into a module. Is it worth it unless you plan to make other changes at the same time?

  • Make the most of the opportunity.

    If you are going to convert the script to a module you can use the opportunity to redesign the interface. The guidelines for module creation above include many of the issues you should consider.

  • The pl2pm utility will get you started.

    This utility will read *.pl files (given as parameters) and write corresponding *.pm files. The pl2pm utility does the following:

    • Adds the standard Module prologue lines

    • Converts package specifiers from ' to ::

    • Converts die(...) to croak(...)

    • Several other minor changes

    Being a mechanical process pl2pm is not bullet proof. The converted code will need careful checking, especially any package statements. Don't delete the original .pl file till the new .pm one works!
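
    To give a feel for the result of such a conversion, here is a hand-written sketch of a tiny converted library. It is not actual pl2pm output; the module name `Shout` and its `shout` routine are hypothetical, and the package is written inline so the example runs as one script. The point is only the die-to-croak change, which makes the error appear at the caller's line.

    ```perl
    use strict;
    use warnings;

    {
        package Shout;    # hypothetical module, imagined as converted from shout.pl
        use Carp;         # provides croak()

        sub shout {
            my ($text) = @_;
            # A library script would typically have used die() here.
            croak 'shout() needs some text' unless defined $text;
            return uc $text;
        }
    }

    my $loud  = Shout::shout('hello');                  # 'HELLO'
    my $error = eval { Shout::shout(); 1 } ? '' : $@;   # message from croak()
    ```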

Guidelines for Reusing Application Code

  • Complete applications rarely belong in the Perl Module Library.

  • Many applications contain some Perl code that could be reused.

    Help save the world! Share your code in a form that makes it easy to reuse.

  • Break-out the reusable code into one or more separate module files.

  • Take the opportunity to reconsider and redesign the interfaces.

  • In some cases the 'application' can then be reduced to a small fragment of code built on top of the reusable modules. In these cases the application could be invoked as:

        % perl -e 'use Module::Name; method(@ARGV)' ...

    or

        % perl -mModule::Name ... (in perl5.002 or higher)

NOTE

Perl does not enforce private and public parts of its modules as you may have been used to in other languages like C++, Ada, or Modula-17. Perl doesn't have an infatuation with enforced privacy. It would prefer that you stayed out of its living room because you weren't invited, not because it has a shotgun.

The module and its user have a contract, part of which is common law, and part of which is "written". Part of the common law contract is that a module doesn't pollute any namespace it wasn't asked to. The written contract for the module (A.K.A. documentation) may make other provisions. But then you know when you use RedefineTheWorld that you're redefining the world and willing to take the consequences.

 
perlmodstyle

NAME

perlmodstyle - Perl module style guide

INTRODUCTION

This document attempts to describe the Perl Community's "best practice" for writing Perl modules. It extends the recommendations found in perlstyle , which should be considered required reading before reading this document.

While this document is intended to be useful to all module authors, it is particularly aimed at authors who wish to publish their modules on CPAN.

The focus is on elements of style which are visible to the users of a module, rather than those parts which are only seen by the module's developers. However, many of the guidelines presented in this document can be extrapolated and applied successfully to a module's internals.

This document differs from perlnewmod in that it is a style guide rather than a tutorial on creating CPAN modules. It provides a checklist against which modules can be compared to determine whether they conform to best practice, without necessarily describing in detail how to achieve this.

All the advice contained in this document has been gleaned from extensive conversations with experienced CPAN authors and users. Every piece of advice given here is the result of previous mistakes. This information is here to help you avoid the same mistakes and the extra work that would inevitably be required to fix them.

The first section of this document provides an itemized checklist; subsequent sections provide a more detailed discussion of the items on the list. The final section, "Common Pitfalls", describes some of the most popular mistakes made by CPAN authors.

QUICK CHECKLIST

For more detail on each item in this checklist, see below.

Before you start

  • Don't re-invent the wheel

  • Patch, extend or subclass an existing module where possible

  • Do one thing and do it well

  • Choose an appropriate name

The API

  • API should be understandable by the average programmer

  • Simple methods for simple tasks

  • Separate functionality from output

  • Consistent naming of subroutines or methods

  • Use named parameters (a hash or hashref) when there are more than two parameters

Stability

  • Ensure your module works under use strict and -w

  • Stable modules should maintain backwards compatibility

Documentation

  • Write documentation in POD

  • Document purpose, scope and target applications

  • Document each publicly accessible method or subroutine, including params and return values

  • Give examples of use in your documentation

  • Provide a README file and perhaps also release notes, changelog, etc

  • Provide links to further information (URL, email)

Release considerations

  • Specify pre-requisites in Makefile.PL or Build.PL

  • Specify Perl version requirements with use

  • Include tests with your module

  • Choose a sensible and consistent version numbering scheme (X.YY is the common Perl module numbering scheme)

  • Increment the version number for every change, no matter how small

  • Package the module using "make dist"

  • Choose an appropriate license (GPL/Artistic is a good default)

BEFORE YOU START WRITING A MODULE

Try not to launch headlong into developing your module without spending some time thinking first. A little forethought may save you a vast amount of effort later on.

Has it been done before?

You may not even need to write the module. Check whether it's already been done in Perl, and avoid re-inventing the wheel unless you have a good reason.

Good ways to find pre-existing modules include searching http://search.cpan.org/ and asking on modules@perl.org.

If an existing module almost does what you want, consider writing a patch, writing a subclass, or otherwise extending the existing module rather than rewriting it.

Do one thing and do it well

At the risk of stating the obvious, modules are intended to be modular. A Perl developer should be able to use modules to put together the building blocks of their application. However, it's important that the blocks are the right shape, and that the developer doesn't have to use a big block when all they need is a small one.

Your module should have a clearly defined scope which can be described in no more than a single sentence. Can your module be broken down into a family of related modules?

Bad example:

"FooBar.pm provides an implementation of the FOO protocol and the related BAR standard."

Good example:

"Foo.pm provides an implementation of the FOO protocol. Bar.pm implements the related BAR protocol."

This means that if a developer only needs a module for the BAR standard, they should not be forced to install libraries for FOO as well.

What's in a name?

Make sure you choose an appropriate name for your module early on. This will help people find and remember your module, and make programming with your module more intuitive.

When naming your module, consider the following:

  • Be descriptive (i.e. accurately describe the purpose of the module).

  • Be consistent with existing modules.

  • Reflect the functionality of the module, not the implementation.

  • Avoid starting a new top-level hierarchy, especially if a suitable hierarchy already exists under which you could place your module.

You should contact modules@perl.org to ask them about your module name before publishing your module. You should also try to ask people who are already familiar with the module's application domain and the CPAN naming system. Authors of similar modules, or modules with similar names, may be a good place to start.

DESIGNING AND WRITING YOUR MODULE

Considerations for module design and coding:

To OO or not to OO?

Your module may be object oriented (OO) or not, or it may have both kinds of interfaces available. There are pros and cons of each technique, which should be considered when you design your API.

In Perl Best Practices (copyright 2004, Published by O'Reilly Media, Inc.), Damian Conway provides a list of criteria to use when deciding if OO is the right fit for your problem:

  • The system being designed is large, or is likely to become large.

  • The data can be aggregated into obvious structures, especially if there's a large amount of data in each aggregate.

  • The various types of data aggregate form a natural hierarchy that facilitates the use of inheritance and polymorphism.

  • You have a piece of data on which many different operations are applied.

  • You need to perform the same general operations on related types of data, but with slight variations depending on the specific type of data the operations are applied to.

  • It's likely you'll have to add new data types later.

  • The typical interactions between pieces of data are best represented by operators.

  • The implementation of individual components of the system is likely to change over time.

  • The system design is already object-oriented.

  • Large numbers of other programmers will be using your code modules.

Think carefully about whether OO is appropriate for your module. Gratuitous object orientation results in complex APIs which are difficult for the average module user to understand or use.

Designing your API

Your interfaces should be understandable by an average Perl programmer. The following guidelines may help you judge whether your API is sufficiently straightforward:

  • Write simple routines to do simple things.

    It's better to have numerous simple routines than a few monolithic ones. If your routine changes its behaviour significantly based on its arguments, it's a sign that you should have two (or more) separate routines.

  • Separate functionality from output.

    Return your results in the most generic form possible and allow the user to choose how to use them. The most generic form possible is usually a Perl data structure which can then be used to generate a text report, HTML, XML, a database query, or whatever else your users require.

    If your routine iterates through some kind of list (such as a list of files, or records in a database) you may consider providing a callback so that users can manipulate each element of the list in turn. File::Find provides an example of this with its find(\&wanted, $dir) syntax.
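
    Both ideas can be sketched briefly. The routines `word_counts` and `each_word` below are hypothetical: the first returns a plain data structure and leaves all formatting to the caller, and the second hands each element to a caller-supplied callback, loosely in the spirit of File::Find.

    ```perl
    use strict;
    use warnings;

    # Returns a hash reference of word => count; prints nothing itself.
    sub word_counts {
        my ($text) = @_;
        my %count;
        $count{lc $_}++ for $text =~ /(\w+)/g;
        return \%count;
    }

    # Calls the supplied code reference once per word.
    sub each_word {
        my ($callback, $text) = @_;
        $callback->($_) for $text =~ /(\w+)/g;
        return;
    }

    # The caller decides how to present the data.
    my $counts = word_counts('the cat saw the dog');
    my $report = join ', ', map { "$_=$counts->{$_}" } sort keys %$counts;

    # The caller decides what to do with each element.
    my @seen;
    each_word(sub { push @seen, uc $_[0] }, 'one two');
    ```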

  • Provide sensible shortcuts and defaults.

    Don't require every module user to jump through the same hoops to achieve a simple result. You can always include optional parameters or routines for more complex or non-standard behaviour. If most of your users have to type a few almost identical lines of code when they start using your module, it's a sign that you should have made that behaviour a default. Another good indicator that you should use defaults is if most of your users call your routines with the same arguments.

  • Naming conventions

    Your naming should be consistent. For instance, it's better to have:

        display_day();
        display_week();
        display_year();

    than

        display_day();
        week_display();
        show_year();

    This applies equally to method names, parameter names, and anything else which is visible to the user (and most things that aren't!)

  • Parameter passing

    Use named parameters. It's easier to use a hash like this:

        $obj->do_something(
            name => "wibble",
            type => "text",
            size => 1024,
        );

    ... than to have a long list of unnamed parameters like this:

        $obj->do_something("wibble", "text", 1024);

    While the list of arguments might work fine for one, two or even three arguments, any more arguments become hard for the module user to remember, and hard for the module author to manage. If you want to add a new parameter you will have to add it to the end of the list for backward compatibility, and this will probably make your list order unintuitive. Also, if many elements may be undefined you may see the following unattractive method calls:

        $obj->do_something(undef, undef, undef, undef, undef, undef, 1024);

    Provide sensible defaults for parameters which have them. Don't make your users specify parameters which will almost always be the same.

    The issue of whether to pass the arguments in a hash or a hashref is largely a matter of personal style.

    The use of hash keys starting with a hyphen (-name ) or entirely in upper case (NAME ) is a relic of older versions of Perl in which ordinary lower case strings were not handled correctly by the => operator. While some modules retain uppercase or hyphenated argument keys for historical reasons or as a matter of personal style, most new modules should use simple lower case keys. Whatever you choose, be consistent!
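
    On the receiving side, named parameters and defaults combine naturally. The sketch below shows one possible way to do it, written as plain functions rather than methods for brevity; the default values for type and size, and the name `do_something_ref`, are assumptions made for the illustration.

    ```perl
    use strict;
    use warnings;

    # Hash style: defaults come first, so the caller's pairs override them.
    sub do_something {
        my %args = (
            type => 'text',
            size => 1024,
            @_,
        );
        return "$args{name}:$args{type}:$args{size}";
    }

    # Hashref style: the same idea with a single reference argument.
    sub do_something_ref {
        my ($args) = @_;
        my %args = (type => 'text', size => 1024, %{ $args || {} });
        return "$args{name}:$args{type}:$args{size}";
    }

    my $with_defaults = do_something(name => 'wibble');
    my $overridden    = do_something_ref({ name => 'wibble', size => 2048 });
    ```

    Either way, a new optional parameter can be added later without changing any existing call.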

Strictness and warnings

Your module should run successfully under the strict pragma and should run without generating any warnings. Your module should also handle taint-checking where appropriate, though this can cause difficulties in many cases.

Backwards compatibility

Modules which are "stable" should not break backwards compatibility without at least a long transition phase and a major change in version number.

Error handling and messages

When your module encounters an error it should do one or more of:

  • Return an undefined value.

  • Set $Module::errstr or similar (errstr is a common name used by DBI and other popular modules; if you choose something else, be sure to document it clearly).

  • warn() or carp() a message to STDERR.

  • croak() only when your module absolutely cannot figure out what to do. (croak() is a better version of die() for use within modules, which reports its errors from the perspective of the caller. See Carp for details of croak() , carp() and other useful routines.)

  • As an alternative to the above, you may prefer to throw exceptions using the Error module.

Configurable error handling can be very useful to your users. Consider offering a choice of levels for warning and debug messages, an option to send messages to a separate file, a way to specify an error-handling routine, or other such features. Be sure to default all these options to the commonest use.
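
As a rough sketch of how the options listed above can be combined, the hypothetical module below returns undef and sets an errstr variable for recoverable problems, and reserves croak() for misuse it cannot work around. The names My::Parser and parse_number are assumptions made for illustration only.

```perl
use strict;
use warnings;

package My::Parser;    # hypothetical module name
use Carp qw(croak);

our $errstr = '';      # documented home of the last error message

sub parse_number {
    my ($text) = @_;

    # A missing argument is a caller bug we cannot work around: croak.
    croak "parse_number() requires an argument" unless defined $text;

    # Bad input is recoverable: record the error and return undef.
    my ($digits) = $text =~ /^\s*(-?\d+)\s*$/;
    unless (defined $digits) {
        $errstr = "not an integer: '$text'";
        return;
    }

    $errstr = '';
    return $digits + 0;
}

package main;

my $n = My::Parser::parse_number('forty-two');
print "error: $My::Parser::errstr\n" unless defined $n;
```

The caller decides what to do with a failed parse, while a call with no argument at all is reported at the caller's own line.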

DOCUMENTING YOUR MODULE

POD

Your module should include documentation aimed at Perl developers. You should use Perl's "plain old documentation" (POD) for your general technical documentation, though you may wish to write additional documentation (white papers, tutorials, etc) in some other format. You need to cover the following subjects:

  • A synopsis of the common uses of the module

  • The purpose, scope and target applications of your module

  • Use of each publicly accessible method or subroutine, including parameters and return values

  • Examples of use

  • Sources of further information

  • A contact email address for the author/maintainer

The level of detail in Perl module documentation generally goes from less detailed to more detailed. Your SYNOPSIS section should contain a minimal example of use (perhaps as little as one line of code; skip the unusual use cases or anything not needed by most users); the DESCRIPTION should describe your module in broad terms, generally in just a few paragraphs; more detail of the module's routines or methods, lengthy code examples, or other in-depth material should be given in subsequent sections.

Ideally, someone who's slightly familiar with your module should be able to refresh their memory without hitting "page down". As your reader continues through the document, they should receive a progressively greater amount of knowledge.

The recommended order of sections in Perl module documentation is:

  • NAME

  • SYNOPSIS

  • DESCRIPTION

  • One or more sections or subsections giving greater detail of available methods and routines and any other relevant information.

  • BUGS/CAVEATS/etc

  • AUTHOR

  • SEE ALSO

  • COPYRIGHT and LICENSE

Keep your documentation near the code it documents ("inline" documentation). Include POD for a given method right above that method's subroutine. This makes it easier to keep the documentation up to date, and avoids having to document each piece of code twice (once in POD and once in comments).
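
A minimal sketch of the recommended section order with inline POD might look like this. The module name My::Module, the add function, and the author details are placeholders, not taken from any real distribution.

```perl
package My::Module;    # hypothetical module name
use strict;
use warnings;

our $VERSION = '1.00';

=head1 NAME

My::Module - one-line description of what the module does

=head1 SYNOPSIS

    use My::Module;
    my $total = My::Module::add(2, 3);

=head1 DESCRIPTION

A few paragraphs describing the module in broad terms.

=head1 FUNCTIONS

=head2 add($x, $y)

Returns the sum of C<$x> and C<$y>.

=cut

sub add {
    my ($x, $y) = @_;
    return $x + $y;
}

=head1 AUTHOR

A. N. Author E<lt>author@example.comE<gt>

=cut

1;
```

Because the POD for add sits directly above the subroutine, a change to the code and a change to its documentation naturally land in the same edit.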

README, INSTALL, release notes, changelogs

Your module should also include a README file describing the module and giving pointers to further information (website, author email).

An INSTALL file should be included, and should contain simple installation instructions. When using ExtUtils::MakeMaker this will usually be:

  • perl Makefile.PL
  • make
  • make test
  • make install

When using Module::Build, this will usually be:

  • perl Build.PL
  • perl Build
  • perl Build test
  • perl Build install

Release notes or changelogs should be produced for each release of your software describing user-visible changes to your module, in terms relevant to the user.

RELEASE CONSIDERATIONS

Version numbering

Version numbers should indicate at least major and minor releases, and possibly sub-minor releases. A major release is one in which most of the functionality has changed, or in which major new functionality is added. A minor release is one in which a small amount of functionality has been added or changed. Sub-minor version numbers are usually used for changes which do not affect functionality, such as documentation patches.

The most common CPAN version numbering scheme looks like this:

    1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32

A correct CPAN version number is a floating point number with at least 2 digits after the decimal. You can test whether it conforms to CPAN by using

    perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' 'Foo.pm'

If you want to release a 'beta' or 'alpha' version of a module but don't want CPAN.pm to list it as most recent, use an '_' after the regular version number followed by at least 2 digits, e.g. 1.20_01. If you do this, the following idiom is recommended:

    $VERSION = "1.12_01";
    $XS_VERSION = $VERSION; # only needed if you have XS code
    $VERSION = eval $VERSION;

With that trick MakeMaker will only read the first line and thus read the underscore, while the perl interpreter will evaluate the $VERSION and convert the string into a number. Later operations that treat $VERSION as a number will then be able to do so without provoking a warning about $VERSION not being a number.
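
The effect of the eval step can be observed in isolation. The following is a small demonstration rather than part of the idiom itself; the variable names are illustrative.

```perl
use strict;
use warnings;

# The string form keeps the underscore that marks a developer release.
my $version_string = "1.12_01";

# Perl ignores underscores in numeric literals, so evaluating the string
# as code yields an ordinary number.
my $version_number = eval $version_string;

print "string: $version_string\n";
print "number: $version_number\n";

# Numeric comparison now works without an "isn't numeric" warning.
print "newer than 1.12\n" if $version_number > 1.12;
```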

Never release anything without incrementing the version number. Even a one-word documentation patch should result in a change in version at the sub-minor level.

Pre-requisites

Module authors should carefully consider whether to rely on other modules, and which modules to rely on.

Most importantly, choose modules which are as stable as possible. In order of preference:

  • Core Perl modules

  • Stable CPAN modules

  • Unstable CPAN modules

  • Modules not available from CPAN

Specify version requirements for other Perl modules in the pre-requisites in your Makefile.PL or Build.PL.

Be sure to specify Perl version requirements both in Makefile.PL or Build.PL and with require 5.6.1 or similar. See the section on use VERSION of require in perlfunc for details.
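
For ExtUtils::MakeMaker, declaring these requirements could look roughly like the Makefile.PL below. The distribution name, version, and the particular prerequisite versions are illustrative assumptions; MIN_PERL_VERSION is only understood by reasonably recent versions of ExtUtils::MakeMaker.

```perl
# Makefile.PL for a hypothetical distribution My-Module
use 5.006001;                 # Perl version requirement, checked at startup
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME             => 'My::Module',
    VERSION          => '1.00',
    MIN_PERL_VERSION => '5.006001',
    PREREQ_PM        => {
        'Carp'       => 0,        # any version will do
        'Test::More' => '0.47',   # this version or later
    },
);
```

A real distribution would more commonly use VERSION_FROM pointing at the main module file, so that the version number is maintained in one place.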

Testing

All modules should be tested before distribution (using "make disttest"), and the tests should also be available to people installing the modules (using "make test"). For Module::Build, the equivalent of make test is perl Build test.

The importance of these tests is proportional to the alleged stability of a module. A module which purports to be stable or which hopes to achieve wide use should adhere to as strict a testing regime as possible.

Useful modules to help you write tests (with minimum impact on your development process or your time) include Test::Simple, Carp::Assert and Test::Inline. For more sophisticated test suites there are Test::More and Test::MockObject.
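
A minimal test script using Test::More might look like the sketch below. The add function is a stand-in defined inline so that the example is self-contained; in a real distribution the script would live under t/ and load the module being tested instead.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Stand-in for "use My::Module;" (hypothetical); defined inline so the
# script runs on its own.
sub add { return $_[0] + $_[1] }

is( add(2, 3), 5, 'adds two positive integers' );
is( add(-1, 1), 0, 'handles a negative operand' );
ok( defined add(0, 0), 'returns a defined value for zeros' );
```

Declaring the number of tests up front lets the harness notice a script that dies partway through.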

Packaging

Modules should be packaged using one of the standard packaging tools. Currently you have the choice between ExtUtils::MakeMaker and the more platform independent Module::Build, allowing modules to be installed in a consistent manner. When using ExtUtils::MakeMaker, you can use "make dist" to create your package. Tools exist to help you to build your module in a MakeMaker-friendly style. These include ExtUtils::ModuleMaker and h2xs. See also perlnewmod.

Licensing

Make sure that your module has a license, and that the full text of it is included in the distribution (unless it's a common one and the terms of the license don't require you to include it).

If you don't know what license to use, dual licensing under the GPL and Artistic licenses (the same as Perl itself) is a good idea. See perlgpl and perlartistic.

COMMON PITFALLS

Reinventing the wheel

There are certain application spaces which are already very, very well served by CPAN. One example is templating systems, another is date and time modules, and there are many more. While it is a rite of passage to write your own version of these things, please consider carefully whether the Perl world really needs you to publish it.

Trying to do too much

Your module will be part of a developer's toolkit. It will not, in itself, form the entire toolkit. It's tempting to add extra features until your code is a monolithic system rather than a set of modular building blocks.

Inappropriate documentation

Don't fall into the trap of writing for the wrong audience. Your primary audience is a reasonably experienced developer with at least a moderate understanding of your module's application domain, who's just downloaded your module and wants to start using it as quickly as possible.

Tutorials, end-user documentation, research papers, FAQs etc are not appropriate in a module's main documentation. If you really want to write these, include them as sub-documents such as My::Module::Tutorial or My::Module::FAQ and provide a link in the SEE ALSO section of the main documentation.

SEE ALSO

AUTHOR

Kirrily "Skud" Robert <skud@cpan.org>

 

perlmroapi

NAME

perlmroapi - Perl method resolution plugin interface

DESCRIPTION

As of Perl 5.10.1 there is a new interface for plugging and using method resolution orders other than the default (linear depth first search). The C3 method resolution order added in 5.10.0 has been re-implemented as a plugin, without changing its Perl-space interface.
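
The difference between the two built-in orders can be seen from Perl space with mro::get_linear_isa(). The diamond-shaped hierarchy below (Top, Left, Right, Bottom) is invented for illustration.

```perl
use strict;
use warnings;
use mro;

# A diamond: Bottom inherits from Left and Right, which both inherit from Top.
{ package Top;    sub name { return 'Top' } }
{ package Left;   our @ISA = ('Top'); }
{ package Right;  our @ISA = ('Top'); sub name { return 'Right' } }
{ package Bottom; our @ISA = ('Left', 'Right'); }

# Ask for the linearised ISA under each resolution order.
my $dfs = mro::get_linear_isa('Bottom', 'dfs');
my $c3  = mro::get_linear_isa('Bottom', 'c3');

print "dfs: @$dfs\n";   # depth first: Top is reached before Right
print "c3:  @$c3\n";    # C3: Top comes after both of its subclasses
```

A plugin written against the interface described here supplies its own function that produces such a list for a given stash.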

Each plugin should register itself by providing the following structure

    struct mro_alg {
        AV *(*resolve)(pTHX_ HV *stash, U32 level);
        const char *name;
        U16 length;
        U16 kflags;
        U32 hash;
    };

and calling Perl_mro_register:

    Perl_mro_register(aTHX_ &my_mro_alg);

  • resolve

    Pointer to the linearisation function, described below.

  • name

    Name of the MRO, either in ISO-8859-1 or UTF-8.

  • length

    Length of the name.

  • kflags

    If the name is given in UTF-8, set this to HVhek_UTF8. The value is passed directly as the parameter kflags to hv_common().

  • hash

    A precomputed hash value for the MRO's name, or 0.

Callbacks

The resolve function is called to generate a linearised ISA for the given stash, using this MRO. It is called with a pointer to the stash, and a level of 0. The core always sets level to 0 when it calls your function - the parameter is provided to allow your implementation to track depth if it needs to recurse.

The function should return a reference to an array containing the parent classes in order. The names of the classes should be the result of calling HvENAME() on the stash. In those cases where HvENAME() returns null, HvNAME() should be used instead.

The caller is responsible for incrementing the reference count of the array returned if it wants to keep the structure. Hence, if you have created a temporary value that you keep no pointer to, call sv_2mortal() on it to ensure that it is disposed of correctly. If you have cached your return value, then return a pointer to it without changing the reference count.

Caching

Computing MROs can be expensive. The implementation provides a cache, in which you can store a single SV *, or anything that can be cast to SV *, such as AV *. To read your private value, use the macro MRO_GET_PRIVATE_DATA(), passing it the mro_meta structure from the stash, and a pointer to your mro_alg structure:

    meta = HvMROMETA(stash);
    private_sv = MRO_GET_PRIVATE_DATA(meta, &my_mro_alg);

To set your private value, call Perl_mro_set_private_data():

    Perl_mro_set_private_data(aTHX_ meta, &c3_alg, private_sv);

The private data cache will take ownership of a reference to private_sv, much the same way that hv_store() takes ownership of a reference to the value that you pass it.

Examples

For examples of MRO implementations, see S_mro_get_linear_isa_c3() and the BOOT: section of mro/mro.xs, and S_mro_get_linear_isa_dfs() in mro.c.

AUTHORS

The implementation of the C3 MRO and switchable MROs within the perl core was written by Brandon L Black. Nicholas Clark created the pluggable interface, refactored Brandon's implementation to work with it, and wrote this document.

 

perlnetware

NAME

perlnetware - Perl for NetWare

DESCRIPTION

This file gives instructions for building Perl 5.7 and above, and also Perl modules for NetWare. Before you start, you may want to read the README file found in the top level directory into which the Perl source code distribution was extracted. Make sure you read and understand the terms under which the software is being distributed.

BUILD

This section describes the steps to be performed to build a Perl NLM and other associated NLMs.

Tools & SDK

The build requires the CodeWarrior compiler and linker. In addition, the "NetWare SDK", "NLM & NetWare Libraries for C" and "NetWare Server Protocol Libraries for C", all available at http://developer.novell.com/wiki/index.php/Category:Novell_Developer_Kit, are required. Microsoft Visual C++ version 4.2 or later is also required.

Setup

The build process is dependent on the location of the NetWare SDK. Once the Tools & SDK are installed, the build environment has to be set up. The following batch files set up the environment.

  • SetNWBld.bat

    This batch file takes 2 parameters as input: the first is the NetWare SDK path, the second is the path for the CodeWarrior compiler and tools. Executing this file sets these paths and also sets the build type to Release by default.

  • Buildtype.bat

    This is used to set the build type to debug or release. Change the build type only after executing SetNWBld.bat.

    Example:

    1. Typing "buildtype d on" at the command prompt causes the buildtype to be set to Debug type with D2 flag set.

    2. Typing "buildtype d off" or "buildtype d" at the command prompt causes the buildtype to be set to Debug type with D1 flag set.

    3. Typing "buildtype r" at the command prompt sets it to Release Build type.

Make

The make process runs only under the WinNT shell. The NetWare makefile is located under the NetWare folder. It makes use of miniperl.exe to run some of the Perl scripts. To create miniperl.exe, first set the required paths for the Visual C++ compiler (specify the vcvars32 location) at the command prompt. Then run nmake from the win32 folder through the WinNT command prompt. The build process can be stopped after miniperl.exe is created. Then run nmake from the NetWare folder through the WinNT command prompt.

Currently the following two build types are tested on NetWare:

  • USE_MULTI, USE_ITHREADS & USE_IMP_SYS defined

  • USE_MULTI & USE_IMP_SYS defined and USE_ITHREADS not defined

Interpreter

Once miniperl.exe has been created, run nmake from the NetWare folder. This will build the Perl interpreter for NetWare as perl.nlm. It is copied under the Release folder for a release build, or under the Debug folder for a debug build.

Extensions

The make process also creates the Perl extensions as <Extension.nlm>.

INSTALL

To install NetWare Perl onto a NetWare server, first map the Sys volume of a NetWare server to i:. This is because the makefile by default sets the drive letter to i:. Type nmake nwinstall from the NetWare folder on a WinNT command prompt. This will copy the binaries and module files onto the NetWare server under the sys:\Perl folder. The Perl interpreter, perl.nlm, is copied under the sys:\perl\system folder. Copy this to the sys:\system folder.

Example: At the command prompt, type "nmake nwinstall". This will install NetWare Perl on the NetWare server. Similarly, typing "nmake install" causes the binaries to be installed on the local machine (typically under the c:\perl folder).

BUILD NEW EXTENSIONS

To build extensions other than the standard extensions, NetWare Perl has to be installed on Windows along with Windows Perl. Perl for Windows can either be downloaded from the CPAN site and built from the sources, or the binaries can be downloaded directly from the ActiveState site. NetWare Perl can be installed by invoking nmake install from the NetWare folder on a WinNT command prompt after building NetWare Perl by following the steps given above. This will copy all the *.pm files and other required files. Documentation files are not copied. Thus one must first install Windows Perl, then install NetWare Perl.

Once this is done, do the following to build any extension:

  • Change to the extension directory where its source files are present.

  • Run the following command at the command prompt:

        perl -I<path to NetWare lib dir> -I<path to lib> Makefile.PL

    Example:

        perl -Ic:/perl/5.6.1/lib/NetWare-x86-multi-thread -Ic:\perl\5.6.1\lib Makefile.PL

    or

        perl -Ic:/perl/5.8.0/lib/NetWare-x86-multi-thread -Ic:\perl\5.8.0\lib Makefile.PL

  • nmake

  • nmake install

    Install will copy the files onto the Windows machine where NetWare Perl is installed, and these files may have to be copied to the NetWare server manually. Alternatively, pass INSTALLSITELIB=i:\perl\lib as an argument to Makefile.PL above. Here i: is the drive mapped to the sys: volume of the server where Perl on NetWare is installed. Typing nmake install will then copy the files onto the NetWare server.

    Example: You can execute the following on the command prompt.

        perl -Ic:/perl/5.6.1/lib/NetWare-x86-multi-thread -Ic:\perl\5.6.1\lib Makefile.PL INSTALLSITELIB=i:\perl\lib

    or

        perl -Ic:/perl/5.8.0/lib/NetWare-x86-multi-thread -Ic:\perl\5.8.0\lib Makefile.PL INSTALLSITELIB=i:\perl\lib

  • Note: Some modules downloaded from CPAN may require NetWare related API in order to build on NetWare. Other modules may however build smoothly with or without minor changes depending on the type of module.

ACKNOWLEDGEMENTS

The makefile for Win32 is used as a reference to create the makefile for NetWare. Also, the make process for NetWare port uses miniperl.exe to run scripts during the make and installation process.

AUTHORS

Anantha Kesari H Y (hyanantha@novell.com), Aditya C (caditya@novell.com)

DATE

  • Created - 18 Jan 2001

  • Modified - 25 June 2001

  • Modified - 13 July 2001

  • Modified - 28 May 2002

 

perlnewmod

NAME

perlnewmod - preparing a new module for distribution

DESCRIPTION

This document gives you some suggestions about how to go about writing Perl modules, preparing them for distribution, and making them available via CPAN.

One of the things that makes Perl really powerful is the fact that Perl hackers tend to want to share the solutions to problems they've faced, so you and I don't have to battle with the same problem again.

The main way they do this is by abstracting the solution into a Perl module. If you don't know what one of these is, the rest of this document isn't going to be much use to you. You're also missing out on an awful lot of useful code; consider having a look at perlmod, perlmodlib and perlmodinstall before coming back here.

When you've found that there isn't a module available for what you're trying to do, and you've had to write the code yourself, consider packaging up the solution into a module and uploading it to CPAN so that others can benefit.

Warning

We're going to primarily concentrate on Perl-only modules here, rather than XS modules. XS modules serve a rather different purpose, and you should consider different things before distributing them - the popularity of the library you are gluing, the portability to other operating systems, and so on. However, the notes on preparing the Perl side of the module and packaging and distributing it will apply equally well to an XS module as a pure-Perl one.

What should I make into a module?

You should make a module out of any code that you think is going to be useful to others. Anything that's likely to fill a hole in the communal library and which someone else can slot directly into their program. Any part of your code which you can isolate and extract and plug into something else is a likely candidate.

Let's take an example. Suppose you're reading in data from a local format into a hash-of-hashes in Perl, turning that into a tree, walking the tree and then piping each node to an Acme Transmogrifier Server.

Now, quite a few people have the Acme Transmogrifier, and you've had to write something to talk the protocol from scratch - you'd almost certainly want to make that into a module. The level at which you pitch it is up to you: you might want protocol-level modules analogous to Net::SMTP which then talk to higher level modules analogous to Mail::Send. The choice is yours, but you do want to get a module out for that server protocol.

Nobody else on the planet is going to talk your local data format, so we can ignore that. But what about the thing in the middle? Building tree structures from Perl variables and then traversing them is a nice, general problem, and if nobody's already written a module that does that, you might want to modularise that code too.

So hopefully you've now got a few ideas about what's good to modularise. Let's now see how it's done.

Step-by-step: Preparing the ground

Before we even start scraping out the code, there are a few things we'll want to do in advance.

  • Look around

    Dig into a bunch of modules to see how they're written. I'd suggest starting with Text::Tabs, since it's in the standard library and is nice and simple, and then looking at something a little more complex like File::Copy. For object oriented code, WWW::Mechanize or the Email::* modules provide some good examples.

    These should give you an overall feel for how modules are laid out and written.

  • Check it's new

    There are a lot of modules on CPAN, and it's easy to miss one that's similar to what you're planning on contributing. Have a good plough through http://search.cpan.org and make sure you're not the one reinventing the wheel!

  • Discuss the need

    You might love it. You might feel that everyone else needs it. But there might not actually be any real demand for it out there. If you're unsure about the demand your module will have, consider sending out feelers on the comp.lang.perl.modules newsgroup, or as a last resort, ask the modules list at modules@perl.org . Remember that this is a closed list with a very long turn-around time - be prepared to wait a good while for a response from them.

  • Choose a name

    Perl modules included on CPAN have a naming hierarchy you should try to fit in with. See perlmodlib for more details on how this works, and browse around CPAN and the modules list to get a feel of it. At the very least, remember this: modules should be title capitalised (This::Thing), fit in with a category, and explain their purpose succinctly.

  • Check again

    While you're doing that, make really sure you haven't missed a module similar to the one you're about to write.

    When you've got your name sorted out and you're sure that your module is wanted and not currently available, it's time to start coding.

Step-by-step: Making the module

  • Start with module-starter or h2xs

    The module-starter utility is distributed as part of the Module::Starter CPAN package. It creates a directory with stubs of all the necessary files to start a new module, according to recent "best practice" for module development, and is invoked from the command line, thus:

        module-starter --module=Foo::Bar \
            --author="Your Name" --email=yourname@cpan.org

    If you do not wish to install the Module::Starter package from CPAN, h2xs is an older tool, originally intended for the development of XS modules, which comes packaged with the Perl distribution.

    A typical invocation of h2xs for a pure Perl module is:

        h2xs -AX --skip-exporter --use-new-tests -n Foo::Bar

    The -A omits the Autoloader code, -X omits XS elements, --skip-exporter omits the Exporter code, --use-new-tests sets up a modern testing environment, and -n specifies the name of the module.

  • Use strict and warnings

    A module's code has to be warning and strict-clean, since you can't guarantee the conditions that it'll be used under. Besides, you wouldn't want to distribute code that wasn't warning or strict-clean anyway, right?

  • Use Carp

    The Carp module allows you to present your error messages from the caller's perspective; this gives you a way to signal a problem with the caller and not your module. For instance, if you say this:

        warn "No hostname given";

    the user will see something like this:

        No hostname given at /usr/local/lib/perl5/site_perl/5.6.0/Net/Acme.pm line 123.

    which looks like your module is doing something wrong. Instead, you want to put the blame on the user, and say this:

        No hostname given at bad_code, line 10.

    You do this by using Carp and replacing your warns with carps. If you need to die, say croak instead. However, keep warn and die in place for your sanity checks - where it really is your module at fault.

  • Use Exporter - wisely!

    Exporter gives you a standard way of exporting symbols and subroutines from your module into the caller's namespace. For instance, saying use Net::Acme qw(&frob) would import the frob subroutine.

    The package variable @EXPORT will determine which symbols will get exported when the caller simply says use Net::Acme - you will hardly ever want to put anything in there. @EXPORT_OK, on the other hand, specifies which symbols you're willing to export. If you do want to export a bunch of symbols, use %EXPORT_TAGS and define a standard export set - look at Exporter for more details.

  • Use plain old documentation

    The work isn't over until the paperwork is done, and you're going to need to put in some time writing some documentation for your module. module-starter or h2xs will provide a stub for you to fill in; if you're not sure about the format, look at perlpod for an introduction. Provide a good synopsis of how your module is used in code, a description, and then notes on the syntax and function of the individual subroutines or methods. Use Perl comments for developer notes and POD for end-user notes.

  • Write tests

    You're encouraged to create self-tests for your module to ensure it's working as intended on the myriad platforms Perl supports; if you upload your module to CPAN, a host of testers will build your module and send you the results of the tests. Again, module-starter and h2xs provide a test framework which you can extend - you should do something more than just checking your module will compile. Test::Simple and Test::More are good places to start when writing a test suite.

  • Write the README

    If you're uploading to CPAN, the automated gremlins will extract the README file and place that in your CPAN directory. It'll also appear in the main by-module and by-category directories if you make it onto the modules list. It's a good idea to put here what the module actually does in detail, and the user-visible changes since the last release.
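
The Carp and Exporter advice in the steps above can be sketched together in one small module. Net::Acme and frob come from the examples in this document; the twiddle function and the behaviour of frob are invented for illustration, and the package is defined inline only so that the example runs as a single file.

```perl
use strict;
use warnings;

package Net::Acme;
use Carp qw(carp);
use Exporter 'import';         # borrow Exporter's import() method

our @EXPORT_OK   = qw(frob twiddle);        # exported only on request
our %EXPORT_TAGS = ( all => \@EXPORT_OK );  # enables qw(:all)

sub frob {
    my ($host) = @_;
    unless (defined $host) {
        carp "No hostname given";   # reported at the caller's line
        return;
    }
    return "frobbed $host";
}

sub twiddle { return "twiddled" }   # hypothetical second function

package main;

# With the package in its own file this would be: use Net::Acme qw(frob);
Net::Acme->import('frob');

print frob('acme.example.com'), "\n";
```

Nothing is placed in @EXPORT, so a plain use Net::Acme leaves the caller's namespace untouched.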

Step-by-step: Distributing your module

  • Get a CPAN user ID

    Every developer publishing modules on CPAN needs a CPAN ID. Visit http://pause.perl.org/, select "Request PAUSE Account", and wait for your request to be approved by the PAUSE administrators.

  • perl Makefile.PL; make test; make dist

    Once again, module-starter or h2xs has done all the work for you. They produce the standard Makefile.PL you see when you download and install modules, and this produces a Makefile with a dist target.

    Once you've ensured that your module passes its own tests - always a good thing to make sure - you can make dist, and the Makefile will hopefully produce you a nice tarball of your module, ready for upload.

  • Upload the tarball

    The email you got when you received your CPAN ID will tell you how to log in to PAUSE, the Perl Authors Upload SErver. From the menus there, you can upload your module to CPAN.

  • Announce to the modules list

    Once uploaded, it'll sit unnoticed in your author directory. If you want it connected to the rest of the CPAN, you'll need to go to "Register Namespace" on PAUSE. Once registered, your module will appear in the by-module and by-category listings on CPAN.

  • Announce to clpa

    If you have a burning desire to tell the world about your release, post an announcement to the moderated comp.lang.perl.announce newsgroup.

  • Fix bugs!

    Once you start accumulating users, they'll send you bug reports. If you're lucky, they'll even send you patches. Welcome to the joys of maintaining a software project...

AUTHOR

Simon Cozens, simon@cpan.org

Updated by Kirrily "Skud" Robert, skud@cpan.org

SEE ALSO

perlmod, perlmodlib, perlmodinstall, h2xs, strict, Carp, Exporter, perlpod, Test::Simple, Test::More, ExtUtils::MakeMaker, Module::Build, Module::Starter, http://www.cpan.org/, Ken Williams's tutorial on building your own module at http://mathforum.org/~ken/perl_modules.html

 

perlnumber

NAME

perlnumber - semantics of numbers and numeric operations in Perl

SYNOPSIS

    $n = 1234;              # decimal integer
    $n = 0b1110011;         # binary integer
    $n = 01234;             # octal integer
    $n = 0x1234;            # hexadecimal integer
    $n = 12.34e-56;         # exponential notation
    $n = "-12.34e56";       # number specified as a string
    $n = "1234";            # number specified as a string

DESCRIPTION

This document describes how Perl internally handles numeric values.

Perl's operator overloading facility is completely ignored here. Operator overloading allows user-defined behaviors for numbers, such as operations over arbitrarily large integers, floating point numbers with arbitrary precision, operations over "exotic" numbers such as modular arithmetic or p-adic arithmetic, and so on. See overload for details.

Storing numbers

Perl can internally represent numbers in 3 different ways: as native integers, as native floating point numbers, and as decimal strings. Decimal strings may have an exponential notation part, as in "12.34e-56" . Native here means "a format supported by the C compiler which was used to build perl".

The term "native" does not mean quite as much when we talk about native integers, as it does when native floating point numbers are involved. The only implication of the term "native" on integers is that the limits for the maximal and the minimal supported true integral quantities are close to powers of 2. However, "native" floats have a most fundamental restriction: they may represent only those numbers which have a relatively "short" representation when converted to a binary fraction. For example, 0.9 cannot be represented by a native float, since the binary fraction for 0.9 is infinite:

    binary 0.1110011001100...

with the sequence 1100 repeating again and again. In addition to this limitation, the exponent of the binary number is also restricted when it is represented as a floating point number. On typical hardware, floating point values can store numbers with up to 53 binary digits, and with binary exponents between -1024 and 1024. In decimal representation this is close to 16 decimal digits and decimal exponents in the range of -304..304. The upshot of all this is that Perl cannot store a number like 12345678901234567 as a floating point number on such architectures without loss of information.
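
The 0.9 example can be checked directly. The sketch below assumes nothing beyond core Perl; the exact digits printed depend on the floating point format your perl was built with, so no particular output is claimed here.

```perl
use strict;
use warnings;

# 0.9 has no finite binary expansion, so the stored value is only close.
my $stored = sprintf "%.20f", 0.9;
print "0.9 is stored as $stored\n";

# Accumulated representation error can show up in exact comparisons.
my $sum = 0;
$sum += 0.1 for 1 .. 9;
print "nine times 0.1 ",
    ($sum == 0.9 ? "equals" : "does not equal"), " 0.9\n";

# Comparing with a tolerance avoids depending on the last binary digit.
print "close enough\n" if abs($sum - 0.9) < 1e-9;
```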

Similarly, decimal strings can represent only those numbers which have a finite decimal expansion. Being strings, and thus of arbitrary length, there is no practical limit for the exponent or number of decimal digits for these numbers. (But realize that what we are discussing is just the rules for the storage of these numbers. The fact that you can store such "large" numbers does not mean that the operations over these numbers will use all of the significant digits. See Numeric operators and numeric conversions for details.)

In fact numbers stored in the native integer format may be stored either in the signed native form, or in the unsigned native form. Thus the limits for Perl numbers stored as native integers would typically be -2**31..2**32-1, with appropriate modifications in the case of 64-bit integers. Again, this does not mean that Perl can do operations only over integers in this range: it is possible to store many more integers in floating point format.

Summing up, Perl numeric values can store only those numbers which have a finite decimal expansion or a "short" binary expansion.

Numeric operators and numeric conversions

As mentioned earlier, Perl can store a number in any one of three formats, but most operators typically understand only one of those formats. When a numeric value is passed as an argument to such an operator, it will be converted to the format understood by the operator.

Six such conversions are possible:

  1. native integer --> native floating point (*)
  2. native integer --> decimal string
  3. native floating_point --> native integer (*)
  4. native floating_point --> decimal string (*)
  5. decimal string --> native integer
  6. decimal string --> native floating point (*)

These conversions are governed by the following general rules:

  • If the source number can be represented in the target form, that representation is used.

  • If the source number is outside of the limits representable in the target form, a representation of the closest limit is used. (Loss of information)

  • If the source number is between two numbers representable in the target form, a representation of one of these numbers is used. (Loss of information)

  • In native floating point --> native integer conversions the magnitude of the result is less than or equal to the magnitude of the source. ("Rounding to zero".)

  • If the decimal string --> native integer conversion cannot be done without loss of information, the result is compatible with the conversion sequence decimal_string --> native_floating_point --> native_integer . In particular, rounding is strongly biased to 0, though a number like "0.99999999999999999999" has a chance of being rounded to 1.

RESTRICTION: The conversions marked with (*) above involve steps performed by the C compiler. In particular, bugs/features of the compiler used may lead to breakage of some of the above rules.
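The "rounding to zero" rule can be observed with int(), which performs the floating point to integer conversion. This is only an illustration; the sample values are our own:

```perl
use strict;
use warnings;

# The magnitude of the result never exceeds the magnitude of the source.
# The last value starts as a decimal string and is converted on the way.
my @truncated = map { int($_) } ( 2.9, -2.9, '7.5e0' );
print "@truncated\n";    # prints "2 -2 7"
```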

Flavors of Perl numeric operations

Perl operations which take a numeric argument treat that argument in one of four different ways: they may force it to one of the integer/floating/string formats, or they may behave differently depending on the format of the operand. Forcing a numeric value to a particular format does not change the number stored in the value.

All the operators which need an argument in the integer format treat the argument as in modular arithmetic, e.g., mod 2**32 on a 32-bit architecture. sprintf "%u", -1 therefore provides the same result as sprintf "%u", ~0 .

  • Arithmetic operators

    The binary operators + - * / % == != > < >= <= and the unary operators - abs and -- will attempt to convert arguments to integers. If both conversions are possible without loss of precision, and the operation can be performed without loss of precision then the integer result is used. Otherwise arguments are converted to floating point format and the floating point result is used. The caching of conversions (as described below) means that the integer conversion does not throw away fractional parts on floating point numbers.

  • ++

    ++ behaves as the other operators above, except that if it is a string matching the format /^[a-zA-Z]*[0-9]*\z/ the string increment described in perlop is used.

  • Arithmetic operators during use integer

    In scopes where use integer; is in force, nearly all the operators listed above will force their argument(s) into integer format, and return an integer result. The exceptions, abs, ++ and -- , do not change their behavior with use integer; .

  • Other mathematical operators

    Operators such as ** , sin and exp force arguments to floating point format.

  • Bitwise operators

    Arguments are forced into the integer format if not strings.

  • Bitwise operators during use integer

    Arguments are forced into the integer format. Also, shift operations internally use signed integers rather than the default unsigned.

  • Operators which expect an integer

    force the argument into the integer format. This is applicable to the third and fourth arguments of sysread, for example.

  • Operators which expect a string

    force the argument into the string format. For example, this is applicable to printf "%s", $value .

Though forcing an argument into a particular form does not change the stored number, Perl remembers the result of such conversions. In particular, though the first such conversion may be time-consuming, repeated operations will not need to redo the conversion.
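A few of the flavors listed above can be seen side by side in a short sketch. The variable names are ours, and the values were picked only for illustration:

```perl
use strict;
use warnings;

# Ordinary division falls back to floating point; under "use integer"
# the same expression is computed in integer format.
my $float_quotient   = 10 / 3;
my $integer_quotient = do { use integer; 10 / 3 };

# ++ on a string matching /^[a-zA-Z]*[0-9]*\z/ is a string increment.
my $label = 'az';
$label++;    # now 'ba'

# A string that looks like a number is incremented numerically.
my $count = '9';
$count++;    # now 10

# Bitwise operators work on integers unless both operands are strings.
my $mask = 0xF0 | 0x0F;    # 255
my $text = 'AB' | '  ';    # 'ab': the operation is done per character

# Integer-format operators treat their argument as in modular arithmetic.
my $wrapped  = sprintf '%u', -1;
my $all_bits = sprintf '%u', ~0;

print "$float_quotient $integer_quotient $label $count $mask $text\n";
print "same\n" if $wrapped eq $all_bits;
```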

AUTHOR

Ilya Zakharevich ilya@math.ohio-state.edu

Editorial adjustments by Gurusamy Sarathy <gsar@ActiveState.com>

Updates for 5.8.0 by Nicholas Clark <nick@ccl4.org>

SEE ALSO

overload, perlop

 
perlobj

Perl 5 version 18.2 documentation

NAME

perlobj - Perl object reference

DESCRIPTION

This document provides a reference for Perl's object orientation features. If you're looking for an introduction to object-oriented programming in Perl, please see perlootut.

In order to understand Perl objects, you first need to understand references in Perl. See perlref for details.

This document describes all of Perl's object-oriented (OO) features from the ground up. If you're just looking to write some object-oriented code of your own, you are probably better served by using one of the object systems from CPAN described in perlootut.

If you're looking to write your own object system, or you need to maintain code which implements objects from scratch then this document will help you understand exactly how Perl does object orientation.

There are a few basic principles which define object oriented Perl:

1.

An object is simply a data structure that knows to which class it belongs.

2.

A class is simply a package. A class provides methods that expect to operate on objects.

3.

A method is simply a subroutine that expects a reference to an object (or a package name, for class methods) as the first argument.

Let's look at each of these principles in depth.

An Object is Simply a Data Structure

Unlike many other languages which support object orientation, Perl does not provide any special syntax for constructing an object. Objects are merely Perl data structures (hashes, arrays, scalars, filehandles, etc.) that have been explicitly associated with a particular class.

That explicit association is created by the built-in bless function, which is typically used within the constructor subroutine of the class.

Here is a simple constructor:

    package File;
    sub new {
        my $class = shift;
        return bless {}, $class;
    }

The name new isn't special. We could name our constructor something else:

    package File;
    sub load {
        my $class = shift;
        return bless {}, $class;
    }

The modern convention for OO modules is to always use new as the name for the constructor, but there is no requirement to do so. Any subroutine that blesses a data structure into a class is a valid constructor in Perl.

In the previous examples, the {} code creates a reference to an empty anonymous hash. The bless function then takes that reference and associates the hash with the class in $class . In the simplest case, the $class variable will end up containing the string "File".

We can also use a variable to store a reference to the data structure that is being blessed as our object:

    sub new {
        my $class = shift;
        my $self = {};
        bless $self, $class;
        return $self;
    }

Once we've blessed the hash referred to by $self we can start calling methods on it. This is useful if you want to put object initialization in its own separate method:

    sub new {
        my $class = shift;
        my $self = {};
        bless $self, $class;
        $self->_initialize();
        return $self;
    }

Since the object is also a hash, you can treat it as one, using it to store data associated with the object. Typically, code inside the class can treat the hash as an accessible data structure, while code outside the class should always treat the object as opaque. This is called encapsulation. Encapsulation means that the user of an object does not have to know how it is implemented. The user simply calls documented methods on the object.

Note, however, that (unlike most other OO languages) Perl does not ensure or enforce encapsulation in any way. If you want objects to actually be opaque you need to arrange for that yourself. This can be done in a variety of ways, including using Inside-Out objects or modules from CPAN.
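Putting these pieces together, here is a small sketch of a constructor that delegates to a separate initialization method and stores its data in the blessed hash. The Counter class and its start parameter are hypothetical names used only for this example:

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub new {
    my $class = shift;
    my $self  = {};
    bless $self, $class;
    $self->_initialize(@_);    # $self is already an object here
    return $self;
}

sub _initialize {
    my ( $self, %args ) = @_;
    # Code inside the class may treat the object as a plain hash.
    $self->{count} = defined $args{start} ? $args{start} : 0;
    return;
}

sub increment { my $self = shift; return ++$self->{count} }
sub count     { my $self = shift; return $self->{count} }

package main;

# Code outside the class only uses the documented methods.
my $counter = Counter->new( start => 5 );
$counter->increment();
print $counter->count(), "\n";    # prints 6
```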

Objects Are Blessed; Variables Are Not

When we bless something, we are not blessing the variable which contains a reference to that thing, nor are we blessing the reference that the variable stores; we are blessing the thing that the variable refers to (sometimes known as the referent). This is best demonstrated with this code:

    use Scalar::Util 'blessed';
    my $foo = {};
    my $bar = $foo;
    bless $foo, 'Class';
    print blessed( $bar );      # prints "Class"
    $bar = "some other value";
    print blessed( $bar );      # blessed() now returns undef

When we call bless on a variable, we are actually blessing the underlying data structure that the variable refers to. We are not blessing the reference itself, nor the variable that contains that reference. That's why the second call to blessed( $bar ) returns false. At that point $bar is no longer storing a reference to an object.

You will sometimes see older books or documentation mention "blessing a reference" or describe an object as a "blessed reference", but this is incorrect. It isn't the reference that is blessed as an object; it's the thing the reference refers to (i.e. the referent).

A Class is Simply a Package

Perl does not provide any special syntax for class definitions. A package is simply a namespace containing variables and subroutines. The only difference is that in a class, the subroutines may expect a reference to an object or the name of a class as the first argument. This is purely a matter of convention, so a class may contain both methods and subroutines which don't operate on an object or class.

Each package contains a special array called @ISA . The @ISA array contains a list of that class's parent classes, if any. This array is examined when Perl does method resolution, which we will cover later.

It is possible to manually set @ISA , and you may see this in older Perl code. Much older code also uses the base pragma. For new code, we recommend that you use the parent pragma to declare your parents. This pragma will take care of setting @ISA . It will also load the parent classes and make sure that the package doesn't inherit from itself.

However the parent classes are set, the package's @ISA variable will contain a list of those parents. This is simply a list of scalars, each of which is a string that corresponds to a package name.

All classes inherit from the UNIVERSAL class implicitly. The UNIVERSAL class is implemented by the Perl core, and provides several default methods, such as isa() , can() , and VERSION() . The UNIVERSAL class will never appear in a package's @ISA variable.

Perl only provides method inheritance as a built-in feature. Attribute inheritance is left up to the class to implement. See the Writing Accessors section for details.
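As a rough illustration of the points above, the following sketch declares a parent with the parent pragma and then inspects @ISA. The Animal and Dog classes are made up for this example; -norequire is used because both packages live in the same file:

```perl
use strict;
use warnings;

package Animal;    # hypothetical parent class
sub new   { return bless {}, shift }
sub sound { return 'generic noise' }

package Dog;       # hypothetical child class
use parent -norequire, 'Animal';    # sets @Dog::ISA = ('Animal')

package main;

my $dog = Dog->new();               # new() is inherited from Animal
print "parents: @Dog::ISA\n";       # prints "parents: Animal"
print $dog->sound(), "\n";          # prints "generic noise"

# UNIVERSAL methods are available although UNIVERSAL is not in @ISA.
print "can isa\n" if Dog->can('isa');
```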

A Method is Simply a Subroutine

Perl does not provide any special syntax for defining a method. A method is simply a regular subroutine, and is declared with sub. What makes a method special is that it expects to receive either an object or a class name as its first argument.

Perl does provide special syntax for method invocation, the -> operator. We will cover this in more detail later.

Most methods you write will expect to operate on objects:

    sub save {
        my $self = shift;
        open my $fh, '>', $self->path() or die $!;
        print {$fh} $self->data() or die $!;
        close $fh or die $!;
    }

Method Invocation

Calling a method on an object is written as $object->method .

The left hand side of the method invocation (or arrow) operator is the object (or class name), and the right hand side is the method name.

    my $pod = File->new( 'perlobj.pod', $data );
    $pod->save();

The -> syntax is also used when dereferencing a reference. It looks like the same operator, but these are two different operations.

When you call a method, the thing on the left side of the arrow is passed as the first argument to the method. That means when we call Critter->new() , the new() method receives the string "Critter" as its first argument. When we call $fred->speak() , the $fred variable is passed as the first argument to speak() .

Just as with any Perl subroutine, all of the arguments passed in @_ are aliases to the original argument. This includes the object itself. If you assign directly to $_[0] you will change the contents of the variable that holds the reference to the object. We recommend that you don't do this unless you know exactly what you're doing.
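The aliasing described above can be demonstrated with a deliberately bad method. This is a sketch with made-up names, shown only so that the effect is recognizable when it happens by accident:

```perl
use strict;
use warnings;

package Thing;    # hypothetical class

sub new { return bless {}, shift }

sub clobber_invocant {
    # $_[0] is an alias to the caller's variable, not a copy of it.
    $_[0] = 'not an object any more';
    return;
}

package main;

my $thing = Thing->new();
$thing->clobber_invocant();
print "$thing\n";    # prints "not an object any more"
```

Copying the invocant first, with my $self = shift; as the other examples do, avoids this problem.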

Perl knows what package the method is in by looking at the left side of the arrow. If the left hand side is a package name, it looks for the method in that package. If the left hand side is an object, then Perl looks for the method in the package that the object has been blessed into.

If the left hand side is neither a package name nor an object, then the method call will cause an error, but see the section on Method Call Variations for more nuances.
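For instance, calling a method on a reference that was never blessed fails at run time. The sketch below traps the error with eval so that the message can be inspected:

```perl
use strict;
use warnings;

my $plain = {};    # a hash reference that was never blessed

# The method call dies; eval traps the error and leaves it in $@.
my $ok    = eval { $plain->save(); 1 };
my $error = $ok ? '' : $@;
print "error: $error";
```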

Inheritance

We already talked about the special @ISA array and the parent pragma.

When a class inherits from another class, any methods defined in the parent class are available to the child class. If you attempt to call a method on an object that isn't defined in its own class, Perl will also look for that method in any parent classes it may have.

    package File::MP3;
    use parent 'File';    # sets @File::MP3::ISA = ('File');
    my $mp3 = File::MP3->new( 'Andvari.mp3', $data );
    $mp3->save();

Since we didn't define a save() method in the File::MP3 class, Perl will look at the File::MP3 class's parent classes to find the save() method. If Perl cannot find a save() method anywhere in the inheritance hierarchy, it will die.

In this case, it finds a save() method in the File class. Note that the object passed to save() in this case is still a File::MP3 object, even though the method is found in the File class.

We can override a parent's method in a child class. When we do so, we can still call the parent class's method with the SUPER pseudo-class.

    sub save {
        my $self = shift;
        say 'Prepare to rock';
        $self->SUPER::save();
    }

The SUPER modifier can only be used for method calls. You can't use it for regular subroutine calls or class methods:

    SUPER::save($thing);     # FAIL: looks for save() sub in package SUPER
    SUPER->save($thing);     # FAIL: looks for save() method in class
                             #       SUPER
    $thing->SUPER::save();   # Okay: looks for save() method in parent
                             #       classes

How SUPER is Resolved

The SUPER pseudo-class is resolved from the package where the call is made. It is not resolved based on the object's class. This is important, because it lets methods at different levels within a deep inheritance hierarchy each correctly call their respective parent methods.

    use feature 'say';
    package A;
    sub new {
        return bless {}, shift;
    }
    sub speak {
        my $self = shift;
        say 'A';
    }
    package B;
    use parent -norequire, 'A';
    sub speak {
        my $self = shift;
        $self->SUPER::speak();
        say 'B';
    }
    package C;
    use parent -norequire, 'B';
    sub speak {
        my $self = shift;
        $self->SUPER::speak();
        say 'C';
    }
    my $c = C->new();
    $c->speak();

In this example, we will get the following output:

    A
    B
    C

This demonstrates how SUPER is resolved. Even though the object is blessed into the C class, the speak() method in the B class can still call SUPER::speak() and expect it to correctly look in the parent class of B (i.e. the class the method call is in), not in the parent class of C (i.e. the class the object belongs to).

There are rare cases where this package-based resolution can be a problem. If you copy a subroutine from one package to another, SUPER resolution will be done based on the original package.
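The following sketch shows one such case. All four package names are invented for the example; the subroutine is copied by assigning a code reference to a glob, which is one common way of copying a subroutine between packages:

```perl
use strict;
use warnings;

package Base1;  sub hello { return 'Base1' }
package Base2;  sub hello { return 'Base2' }

package Origin;
use parent -norequire, 'Base1';
sub hello {
    my $self = shift;
    return 'via ' . $self->SUPER::hello();
}

package Copy;
use parent -norequire, 'Base2';

package main;

{
    no warnings 'once';
    *Copy::hello = \&Origin::hello;    # copy the subroutine into Copy
}

# SUPER is resolved from Origin, where the sub was compiled, so the
# call ends up in Base1 even though Copy inherits from Base2.
my $result = Copy->hello();
print "$result\n";    # prints "via Base1"
```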

Multiple Inheritance

Multiple inheritance often indicates a design problem, but Perl always gives you enough rope to hang yourself with if you ask for it.

To declare multiple parents, you simply need to pass multiple class names to use parent :

    package MultiChild;
    use parent 'Parent1', 'Parent2';

Method Resolution Order

Method resolution order only matters in the case of multiple inheritance. In the case of single inheritance, Perl simply looks up the inheritance chain to find a method:

    Grandparent
        |
      Parent
        |
      Child

If we call a method on a Child object and that method is not defined in the Child class, Perl will look for that method in the Parent class and then, if necessary, in the Grandparent class.

If Perl cannot find the method in any of these classes, it will die with an error message.

When a class has multiple parents, the method lookup order becomes more complicated.

By default, Perl does a depth-first left-to-right search for a method. That means it starts with the first parent in the @ISA array, and then searches all of its parents, grandparents, etc. If it fails to find the method, it then goes to the next parent in the original class's @ISA array and searches from there.

              SharedGreatGrandParent
              /                    \
    PaternalGrandparent       MaternalGrandparent
              \                    /
             Father            Mother
                   \          /
                      Child

So given the diagram above, Perl will search Child , Father , PaternalGrandparent , SharedGreatGrandParent , Mother , and finally MaternalGrandparent . This may be a problem because now we're looking in SharedGreatGrandParent before we've checked all its derived classes (i.e. before we tried Mother and MaternalGrandparent ).

It is possible to ask for a different method resolution order with the mro pragma.

    package Child;
    use mro 'c3';
    use parent 'Father', 'Mother';

This pragma lets you switch to the "C3" resolution order. In simple terms, "C3" order ensures that shared parent classes are never searched before child classes, so Perl will now search: Child , Father , PaternalGrandparent , Mother , MaternalGrandparent , and finally SharedGreatGrandParent . Note however that this is not "breadth-first" searching: All the Father ancestors (except the common ancestor) are searched before any of the Mother ancestors are considered.

The C3 order also lets you call methods in sibling classes with the next pseudo-class. See the mro documentation for more details on this feature.
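Both orders can be inspected without calling any method. The sketch below rebuilds the hierarchy from the diagram with empty packages and asks mro::get_linear_isa for each order; passing the order name as the second argument is our choice here, made so that both results can be compared in one program:

```perl
use strict;
use warnings;
use mro;

package SharedGreatGrandParent;
package PaternalGrandparent;  use parent -norequire, 'SharedGreatGrandParent';
package MaternalGrandparent;  use parent -norequire, 'SharedGreatGrandParent';
package Father;               use parent -norequire, 'PaternalGrandparent';
package Mother;               use parent -norequire, 'MaternalGrandparent';
package Child;                use parent -norequire, 'Father', 'Mother';

package main;

# Default depth-first order: the shared ancestor comes before Mother.
my @dfs = @{ mro::get_linear_isa( 'Child', 'dfs' ) };

# C3 order: the shared ancestor comes last.
my @c3 = @{ mro::get_linear_isa( 'Child', 'c3' ) };

print "dfs: @dfs\n";
print "c3:  @c3\n";
```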

Method Resolution Caching

When Perl searches for a method, it caches the lookup so that future calls to the method do not need to search for it again. Changing a class's parent class or adding subroutines to a class will invalidate the cache for that class.

The mro pragma provides some functions for manipulating the method cache directly.

Writing Constructors

As we mentioned earlier, Perl provides no special constructor syntax. This means that a class must implement its own constructor. A constructor is simply a class method that returns a reference to a new object.

The constructor can also accept additional parameters that define the object. Let's write a real constructor for the File class we used earlier:

    package File;
    sub new {
        my $class = shift;
        my ( $path, $data ) = @_;
        my $self = bless {
            path => $path,
            data => $data,
        }, $class;
        return $self;
    }

As you can see, we've stored the path and file data in the object itself. Remember, under the hood, this object is still just a hash. Later, we'll write accessors to manipulate this data.

For our File::MP3 class, we can check to make sure that the path we're given ends with ".mp3":

    package File::MP3;
    sub new {
        my $class = shift;
        my ( $path, $data ) = @_;
        die "You cannot create a File::MP3 without an mp3 extension\n"
            unless $path =~ /\.mp3\z/;
        return $class->SUPER::new(@_);
    }

This constructor lets its parent class do the actual object construction.
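The two constructors can be combined into one runnable sketch. The use parent line, the path accessor, and the sample arguments are additions made for this illustration:

```perl
use strict;
use warnings;

package File;

sub new {
    my $class = shift;
    my ( $path, $data ) = @_;
    return bless { path => $path, data => $data }, $class;
}

sub path { my $self = shift; return $self->{path} }

package File::MP3;
use parent -norequire, 'File';    # File is defined in this same file

sub new {
    my $class = shift;
    my ( $path, $data ) = @_;
    die "You cannot create a File::MP3 without an mp3 extension\n"
        unless $path =~ /\.mp3\z/;
    return $class->SUPER::new(@_);    # File::new blesses into $class
}

package main;

my $mp3 = File::MP3->new( 'Andvari.mp3', 'fake audio data' );
print ref($mp3), "\n";    # prints "File::MP3"

# A wrong extension is rejected before any object is built.
my $error = eval { File::MP3->new( 'notes.txt', '' ); 1 } ? '' : $@;
print $error;
```

Because File::new blesses into whatever class name it receives, the object comes back as a File::MP3 even though the parent did the construction.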

Attributes

An attribute is a piece of data belonging to a particular object. Unlike most object-oriented languages, Perl provides no special syntax or support for declaring and manipulating attributes.

Attributes are often stored in the object itself. For example, if the object is an anonymous hash, we can store the attribute values in the hash using the attribute name as the key.

While it's possible to refer directly to these hash keys outside of the class, it's considered a best practice to wrap all access to the attribute with accessor methods.

This has several advantages. Accessors make it easier to change the implementation of an object later while still preserving the original API.

An accessor lets you add additional code around attribute access. For example, you could apply a default to an attribute that wasn't set in the constructor, or you could validate that a new value for the attribute is acceptable.

Finally, using accessors makes inheritance much simpler. Subclasses can use the accessors rather than having to know how a parent class is implemented internally.

Writing Accessors

As with constructors, Perl provides no special accessor declaration syntax, so classes must provide explicitly written accessor methods. There are two common types of accessors, read-only and read-write.

A simple read-only accessor simply gets the value of a single attribute:

    sub path {
        my $self = shift;
        return $self->{path};
    }

A read-write accessor will allow the caller to set the value as well as get it:

    sub path {
        my $self = shift;
        if (@_) {
            $self->{path} = shift;
        }
        return $self->{path};
    }

An Aside About Smarter and Safer Code

Our constructor and accessors are not very smart. They don't check that a $path is defined, nor do they check that a $path is a valid filesystem path.

Doing these checks by hand can quickly become tedious. Writing a bunch of accessors by hand is also incredibly tedious. There are a lot of modules on CPAN that can help you write safer and more concise code, including the modules we recommend in perlootut.
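To give an idea of what such a hand-written check looks like, here is a sketch of a read-write accessor that validates its argument. The Document class and the exact rule (defined and non-empty) are assumptions made for the example, not a recommendation:

```perl
use strict;
use warnings;

package Document;    # hypothetical class name
use Carp 'croak';

sub new {
    my ( $class, %args ) = @_;
    my $self = bless {}, $class;
    $self->path( $args{path} );    # reuse the accessor's validation
    return $self;
}

sub path {
    my $self = shift;
    if (@_) {
        my $path = shift;
        croak 'path must be a defined, non-empty string'
            unless defined $path && length $path;
        $self->{path} = $path;
    }
    return $self->{path};
}

package main;

my $doc = Document->new( path => 'notes.txt' );
$doc->path('todo.txt');
print $doc->path(), "\n";    # prints "todo.txt"

# An invalid value is rejected and the old value is kept.
my $rejected = !eval { $doc->path(undef); 1 };
print "rejected\n" if $rejected;
```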

Method Call Variations

Perl supports several other ways to call methods besides the $object->method() usage we've seen so far.

Method Names as Strings

Perl lets you use a scalar variable containing a string as a method name:

    my $file = File->new( $path, $data );
    my $method = 'save';
    $file->$method();

This works exactly like calling $file->save() . This can be very useful for writing dynamic code. For example, it allows you to pass a method name to be called as a parameter to another method.

Class Names as Strings

Perl also lets you use a scalar containing a string as a class name:

    my $class = 'File';
    my $file = $class->new( $path, $data );

Again, this allows for very dynamic code.

Subroutine References as Methods

You can also use a subroutine reference as a method:

    my $sub = sub {
        my $self = shift;
        $self->save();
    };
    $file->$sub();

This is exactly equivalent to writing $sub->($file) . You may see this idiom in the wild combined with a call to can :

    if ( my $meth = $object->can('foo') ) {
        $object->$meth();
    }

Dereferencing Method Call

Perl also lets you use a dereferenced scalar reference in a method call. That's a mouthful, so let's look at some code:

    $file->${ \'save' };
    $file->${ returns_scalar_ref() };
    $file->${ \( returns_scalar() ) };
    $file->${ returns_ref_to_sub_ref() };

This works if the dereference produces a string or a subroutine reference.
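The variations covered so far can be compared in one place. In this sketch the Greeter class is hypothetical, and every call ends up in the same hello method:

```perl
use strict;
use warnings;

package Greeter;    # hypothetical class
sub new   { return bless {}, shift }
sub hello { return 'hello' }

package main;

my $greeter = Greeter->new();

# Method name held in a scalar.
my $method  = 'hello';
my $by_name = $greeter->$method();

# Class name held in a scalar.
my $class    = 'Greeter';
my $by_class = $class->new()->hello();

# Subroutine reference used as a method.
my $code    = sub { my $self = shift; return uc $self->hello() };
my $by_code = $greeter->$code();

# Dereferenced scalar reference that yields the method name.
my $by_deref = $greeter->${ \'hello' };

# The can() idiom: call the method only if it exists.
my $by_can;
if ( my $found = $greeter->can('hello') ) {
    $by_can = $greeter->$found();
}

print join( ' ', $by_name, $by_class, $by_code, $by_deref, $by_can ), "\n";
# prints "hello hello HELLO hello hello"
```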

Method Calls on Filehandles

Under the hood, Perl filehandles are instances of the IO::Handle or IO::File class. Once you have an open filehandle, you can call methods on it. Additionally, you can call methods on the STDIN , STDOUT , and STDERR filehandles.

    open my $fh, '>', 'path/to/file';
    $fh->autoflush();
    $fh->print('content');
    STDOUT->autoflush();
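A self-contained variant of the same idea, written as a sketch that prints to an in-memory filehandle so that it does not touch the file system. Loading IO::Handle explicitly is a precaution on our part; Perl 5.14 and later load it on demand:

```perl
use strict;
use warnings;
use IO::Handle;    # provides autoflush() and print() as methods

my $buffer = '';
open my $fh, '>', \$buffer or die "Cannot open in-memory file: $!";

$fh->autoflush(1);
$fh->print('content');
close $fh or die "Cannot close in-memory file: $!";

print "buffer holds: $buffer\n";    # prints "buffer holds: content"
```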

Invoking Class Methods

Because Perl allows you to use barewords for package names and subroutine names, it sometimes interprets a bareword's meaning incorrectly. For example, the construct Class->new() can be interpreted as either 'Class'->new() or Class()->new() . In English, that second interpretation reads as "call a subroutine named Class(), then call new() as a method on the return value of Class()". If there is a subroutine named Class() in the current namespace, Perl will always interpret Class->new() as the second alternative: a call to new() on the object returned by a call to Class().

You can force Perl to use the first interpretation (i.e. as a method call on the class named "Class") in two ways. First, you can append a :: to the class name:

    Class::->new()

Perl will always interpret this as a method call.

Alternatively, you can quote the class name:

    'Class'->new()

Of course, if the class name is in a scalar Perl will do the right thing as well:

    my $class = 'Class';
    $class->new();
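The ambiguity is easiest to see when a subroutine really does share its name with a class. In this sketch Widget and Gadget are invented classes, and the Widget() subroutine exists only to trigger the problem:

```perl
use strict;
use warnings;

package Widget;  sub new { return bless {}, shift }
package Gadget;  sub new { return bless {}, shift }

package main;

# A subroutine in the current namespace named like the class.
sub Widget { return 'Gadget' }

my $hijacked = Widget->new();      # parsed as Widget()->new()
my $forced   = Widget::->new();    # always a method call on Widget
my $quoted   = 'Widget'->new();    # likewise

print ref($hijacked), ' ', ref($forced), ' ', ref($quoted), "\n";
# prints "Gadget Widget Widget"
```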

Indirect Object Syntax

Outside of the file handle case, use of this syntax is discouraged as it can confuse the Perl interpreter. See below for more details.

Perl supports another method invocation syntax called "indirect object" notation. This syntax is called "indirect" because the method comes before the object it is being invoked on.

This syntax can be used with any class or object method:

    my $file = new File $path, $data;
    save $file;

We recommend that you avoid this syntax, for several reasons.

First, it can be confusing to read. In the above example, it's not clear if save is a method provided by the File class or simply a subroutine that expects a file object as its first argument.

When used with class methods, the problem is even worse. Because Perl allows subroutine names to be written as barewords, Perl has to guess whether the bareword after the method is a class name or subroutine name. In other words, Perl can resolve the syntax as either File->new( $path, $data ) or new( File( $path, $data ) ) .

To parse this code, Perl uses a heuristic based on what package names it has seen, what subroutines exist in the current package, what barewords it has previously seen, and other input. Needless to say, heuristics can produce very surprising results!

Older documentation (and some CPAN modules) encouraged this syntax, particularly for constructors, so you may still find it in the wild. However, we encourage you to avoid using it in new code.

You can force Perl to interpret the bareword as a class name by appending "::" to it, like we saw earlier:

    my $file = new File:: $path, $data;

bless, blessed , and ref

As we saw earlier, an object is simply a data structure that has been blessed into a class via the bless function. The bless function can take either one or two arguments:

    my $object = bless {}, $class;
    my $object = bless {};

In the first form, the anonymous hash is being blessed into the class in $class . In the second form, the anonymous hash is blessed into the current package.

The second form is strongly discouraged, because it breaks the ability of a subclass to reuse the parent's constructor, but you may still run across it in existing code.

If you want to know whether a particular scalar refers to an object, you can use the blessed function exported by Scalar::Util, which is shipped with the Perl core.

    use Scalar::Util 'blessed';
    if ( defined blessed($thing) ) { ... }

If $thing refers to an object, then this function returns the name of the package the object has been blessed into. If $thing doesn't contain a reference to a blessed object, the blessed function returns undef.

Note that blessed($thing) will also return false if $thing has been blessed into a class named "0". This is possible, but quite pathological. Don't create a class named "0" unless you know what you're doing.

Similarly, Perl's built-in ref function treats a reference to a blessed object specially. If you call ref($thing) and $thing holds a reference to an object, it will return the name of the class that the object has been blessed into.

If you simply want to check that a variable contains an object reference, we recommend that you use defined blessed($object) , since ref returns true values for all references, not just objects.
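The difference between the two functions shows up as soon as a plain reference is involved. A short sketch, with a hypothetical Token class:

```perl
use strict;
use warnings;

package Token;    # hypothetical class
sub new { return bless {}, shift }

package main;
use Scalar::Util 'blessed';

my $object = Token->new();
my $plain  = {};

my $object_class = blessed($object);    # 'Token'
my $plain_class  = blessed($plain);     # undef: a reference, not an object
my $object_ref   = ref $object;         # 'Token'
my $plain_ref    = ref $plain;          # 'HASH': true, yet not an object

print "object\n"     if defined blessed($object);
print "not object\n" if !defined blessed($plain);
```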

The UNIVERSAL Class

All classes automatically inherit from the UNIVERSAL class, which is built into the Perl core. This class provides a number of methods, all of which can be called on either a class or an object. You can also choose to override some of these methods in your class. If you do so, we recommend that you follow the built-in semantics described below.

  • isa($class)

    The isa method returns true if the object is a member of the class in $class , or a member of a subclass of $class .

    If you override this method, it should never throw an exception.

  • DOES($role)

    The DOES method returns true if its object claims to perform the role $role . By default, this is equivalent to isa . This method is provided for use by object system extensions that implement roles, like Moose and Role::Tiny .

    You can also override DOES directly in your own classes. If you override this method, it should never throw an exception.

  • can($method)

    The can method checks to see if the class or object it was called on has a method named $method . This checks for the method in the class and all of its parents. If the method exists, then a reference to the subroutine is returned. If it does not then undef is returned.

    If your class responds to method calls via AUTOLOAD , you may want to override can to return a subroutine reference for methods which your AUTOLOAD method handles.

    If you override this method, it should never throw an exception.

  • VERSION($need)

    The VERSION method returns the version number of the class (package).

    If the $need argument is given then it will check that the current version (as defined by the $VERSION variable in the package) is greater than or equal to $need ; it will die if this is not the case. This method is called automatically by the VERSION form of use.

        use Package 1.2 qw(some imported subs);
        # implies:
        Package->VERSION(1.2);

    We recommend that you use this method to access another package's version, rather than looking directly at $Package::VERSION . The package you are looking at could have overridden the VERSION method.

    We also recommend using this method to check whether a module has a sufficient version. The internal implementation uses the version module to make sure that different types of version numbers are compared correctly.
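A brief sketch of these methods in use. The Shape and Circle classes and the version number are invented for the example:

```perl
use strict;
use warnings;

package Shape;    # hypothetical parent class
our $VERSION = '1.02';
sub new  { return bless {}, shift }
sub area { return 0 }

package Circle;   # hypothetical child class
use parent -norequire, 'Shape';

package main;

my $circle = Circle->new();

my $is_shape    = $circle->isa('Shape');       # true: Circle inherits from Shape
my $area_method = $circle->can('area');        # code reference
my $no_method   = Circle->can('perimeter');    # undef: no such method
my $version     = Shape->VERSION();            # '1.02'

# With an argument, VERSION dies if the package is too old.
my $too_old = !eval { Shape->VERSION(2); 1 };

print "version $version\n";
print "Shape is older than 2\n" if $too_old;
```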

AUTOLOAD

If you call a method that doesn't exist in a class, Perl will throw an error. However, if that class or any of its parent classes defines an AUTOLOAD method, that AUTOLOAD method is called instead.

AUTOLOAD is called as a regular method, and the caller will not know the difference. Whatever value your AUTOLOAD method returns is returned to the caller.

The fully qualified method name that was called is available in the $AUTOLOAD package global for your class. Since this is a global, if you want to refer to it without a package name prefix under strict 'vars', you need to declare it.

  # XXX - this is a terrible way to implement accessors, but it makes
  # for a simple example.
  our $AUTOLOAD;
  sub AUTOLOAD {
      my $self = shift;
      # Remove qualifier from original method name...
      my $called = $AUTOLOAD =~ s/.*:://r;
      # Is there an attribute of that name?
      die "No such attribute: $called"
          unless exists $self->{$called};
      # If so, return it...
      return $self->{$called};
  }
  sub DESTROY { } # see below

Without the our $AUTOLOAD declaration, this code will not compile under the strict pragma.

As the comment says, this is not a good way to implement accessors. It's slow and too clever by far. However, you may see this as a way to provide accessors in older Perl code. See perlootut for recommendations on OO coding in Perl.

If your class does have an AUTOLOAD method, we strongly recommend that you override can in your class as well. Your overridden can method should return a subroutine reference for any method that your AUTOLOAD responds to.
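One way to pair AUTOLOAD with an overridden can is sketched below. The Record class and its attribute names are hypothetical. The sketch assumes that "responds to" means "the object's hash has a key of that name", which matches the accessor example above but is not the only possible policy.

```perl
use strict;
use warnings;

{
    package Record;    # hypothetical class for illustration
    our $AUTOLOAD;

    sub new { my ($class, %args) = @_; return bless {%args}, $class }

    # Report a code reference for real methods, and for any attribute
    # that AUTOLOAD would answer for this particular object.
    sub can {
        my ($self, $method) = @_;
        my $code = $self->SUPER::can($method);    # falls back to UNIVERSAL::can
        return $code if $code;
        return unless ref($self) && exists $self->{$method};
        return sub { my $obj = shift; return $obj->{$method} };
    }

    sub AUTOLOAD {
        my $self   = shift;
        my $called = $AUTOLOAD =~ s/.*:://r;
        die "No such attribute: $called"
            unless ref($self) && exists $self->{$called};
        return $self->{$called};
    }

    sub DESTROY { }    # keep AUTOLOAD from handling destruction
}

my $rec       = Record->new(name => 'alice');
my $name      = $rec->name;                    # handled by AUTOLOAD
my $has_name  = $rec->can('name')  ? 1 : 0;
my $has_email = $rec->can('email') ? 1 : 0;
print "$name $has_name $has_email\n";
```

The overridden can returns undef or an empty list rather than dying, in line with the advice that can should never throw an exception.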

Destructors

When the last reference to an object goes away, the object is destroyed. If you only have one reference to an object stored in a lexical scalar, the object is destroyed when that scalar goes out of scope. If you store the object in a package global, that object may not go out of scope until the program exits.

If you want to do something when the object is destroyed, you can define a DESTROY method in your class. This method will always be called by Perl at the appropriate time, unless the method is empty.

This is called just like any other method, with the object as the first argument. It does not receive any additional arguments. However, the $_[0] variable will be read-only in the destructor, so you cannot assign a value to it.

If your DESTROY method throws an error, this error will be ignored. It will not be sent to STDERR and it will not cause the program to die. However, if your destructor is running inside an eval {} block, then the error will change the value of $@ .

Because DESTROY methods can be called at any time, you should localize any global variables you might update in your DESTROY . In particular, if you use eval {} you should localize $@ , and if you use system or backticks you should localize $? .
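The effect of localizing can be seen in a small sketch. The TempResource class and the @log array are hypothetical and exist only to make the destructor observable. The failing cleanup step is simulated with die.

```perl
use strict;
use warnings;

my @log;    # records destructor calls for this demonstration

{
    package TempResource;    # hypothetical class for illustration

    sub new { my ($class, $name) = @_; return bless { name => $name }, $class }

    sub DESTROY {
        my $self = shift;
        # Protect the caller's view of these globals.
        local $@;
        local $?;
        eval { die "cleanup failed\n" };    # simulated failing cleanup step
        push @log, "destroyed $self->{name}";
    }
}

$@ = '';
{
    my $res = TempResource->new('scratch');
    # $res holds the only reference, so the object is destroyed
    # when this block ends.
}
my $error_after = $@;    # still empty: the destructor localized $@
print "$log[0]; error after scope exit: '$error_after'\n";
```

Without the local $@ line, the eval inside the destructor would overwrite whatever value the surrounding code had in $@.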

If you define an AUTOLOAD in your class, then Perl will call your AUTOLOAD to handle the DESTROY method. You can prevent this by defining an empty DESTROY , like we did in the autoloading example. You can also check the value of $AUTOLOAD and return without doing anything when called to handle DESTROY .

Global Destruction

The order in which objects are destroyed during the global destruction before the program exits is unpredictable. This means that any objects contained by your object may already have been destroyed. You should check that a contained object is defined before calling a method on it:

  sub DESTROY {
      my $self = shift;
      $self->{handle}->close() if $self->{handle};
  }

You can use the ${^GLOBAL_PHASE} variable to detect if you are currently in the global destruction phase:

  sub DESTROY {
      my $self = shift;
      return if ${^GLOBAL_PHASE} eq 'DESTRUCT';
      $self->{handle}->close();
  }

Note that this variable was added in Perl 5.14.0. If you want to detect the global destruction phase on older versions of Perl, you can use the Devel::GlobalDestruction module on CPAN.

If your DESTROY method issues a warning during global destruction, the Perl interpreter will append the string " during global destruction" to the warning.

During global destruction, Perl will always garbage collect objects before unblessed references. See PERL_DESTRUCT_LEVEL in perlhacktips for more information about global destruction.

Non-Hash Objects

All the examples so far have shown objects based on a blessed hash. However, it's possible to bless any type of data structure or referent, including scalars, globs, and subroutines. You may see this sort of thing when looking at code in the wild.

Here's an example of a module as a blessed scalar:

  package Time;
  use strict;
  use warnings;
  sub new {
      my $class = shift;
      my $time = time;
      return bless \$time, $class;
  }
  sub epoch {
      my $self = shift;
      return ${ $self };
  }
  my $time = Time->new();
  print $time->epoch();

Inside-Out objects

In the past, the Perl community experimented with a technique called "inside-out objects". An inside-out object stores its data outside of the object's reference, indexed on a unique property of the object, such as its memory address, rather than in the object itself. This has the advantage of enforcing the encapsulation of object attributes, since their data is not stored in the object itself.

This technique was popular for a while (and was recommended in Damian Conway's Perl Best Practices), but never achieved universal adoption. The Object::InsideOut module on CPAN provides a comprehensive implementation of this technique, and you may see it or other inside-out modules in the wild.

Here is a simple example of the technique, using the Hash::Util::FieldHash core module. This module was added to the core to support inside-out object implementations.

  package Time;
  use strict;
  use warnings;
  use Hash::Util::FieldHash 'fieldhash';
  fieldhash my %time_for;
  sub new {
      my $class = shift;
      my $self = bless \( my $object ), $class;
      $time_for{$self} = time;
      return $self;
  }
  sub epoch {
      my $self = shift;
      return $time_for{$self};
  }
  my $time = Time->new;
  print $time->epoch;

Pseudo-hashes

The pseudo-hash feature was an experimental feature introduced in earlier versions of Perl and removed in 5.10.0. A pseudo-hash is an array reference which can be accessed using named keys like a hash. You may run into some code in the wild which uses it. See the fields pragma for more information.

SEE ALSO

A kinder, gentler tutorial on object-oriented programming in Perl can be found in perlootut. You should also check out perlmodlib for some style guides on constructing both modules and classes.

 
perlootut

Perl 5 version 18.2 documentation

NAME

perlootut - Object-Oriented Programming in Perl Tutorial

DATE

This document was created in February, 2011, and the last major revision was in February, 2013.

If you are reading this in the future then it's possible that the state of the art has changed. We recommend you start by reading the perlootut document in the latest stable release of Perl, rather than this version.

DESCRIPTION

This document provides an introduction to object-oriented programming in Perl. It begins with a brief overview of the concepts behind object-oriented design. Then it introduces several different OO systems from CPAN which build on top of what Perl provides.

By default, Perl's built-in OO system is very minimal, leaving you to do most of the work. This minimalism made a lot of sense in 1994, but in the years since Perl 5.0 we've seen a number of common patterns emerge in Perl OO. Fortunately, Perl's flexibility has allowed a rich ecosystem of Perl OO systems to flourish.

If you want to know how Perl OO works under the hood, the perlobj document explains the nitty gritty details.

This document assumes that you already understand the basics of Perl syntax, variable types, operators, and subroutine calls. If you don't understand these concepts yet, please read perlintro first. You should also read the perlsyn, perlop, and perlsub documents.

OBJECT-ORIENTED FUNDAMENTALS

Most object systems share a number of common concepts. You've probably heard terms like "class", "object", "method", and "attribute" before. Understanding the concepts will make it much easier to read and write object-oriented code. If you're already familiar with these terms, you should still skim this section, since it explains each concept in terms of Perl's OO implementation.

Perl's OO system is class-based. Class-based OO is fairly common. It's used by Java, C++, C#, Python, Ruby, and many other languages. There are other object orientation paradigms as well. JavaScript is the most popular language to use another paradigm. JavaScript's OO system is prototype-based.

Object

An object is a data structure that bundles together data and subroutines which operate on that data. An object's data is called attributes, and its subroutines are called methods. An object can be thought of as a noun (a person, a web service, a computer).

An object represents a single discrete thing. For example, an object might represent a file. The attributes for a file object might include its path, content, and last modification time. If we created an object to represent /etc/hostname on a machine named "foo.example.com", that object's path would be "/etc/hostname", its content would be "foo\n", and its last modification time would be 1304974868 seconds since the beginning of the epoch.

The methods associated with a file might include rename() and write().

In Perl most objects are hashes, but the OO systems we recommend keep you from having to worry about this. In practice, it's best to consider an object's internal data structure opaque.

Class

A class defines the behavior of a category of objects. A class is a name for a category (like "File"), and a class also defines the behavior of objects in that category.

All objects belong to a specific class. For example, our /etc/hostname object belongs to the File class. When we want to create a specific object, we start with its class, and construct or instantiate an object. A specific object is often referred to as an instance of a class.

In Perl, any package can be a class. The difference between a package which is a class and one which isn't is based on how the package is used. Here's our "class declaration" for the File class:

  package File;

In Perl, there is no special keyword for constructing an object. However, most OO modules on CPAN use a method named new() to construct a new object:

  my $hostname = File->new(
      path          => '/etc/hostname',
      content       => "foo\n",
      last_mod_time => 1304974868,
  );

(Don't worry about that -> operator, it will be explained later.)

Blessing

As we said earlier, most Perl objects are hashes, but an object can be an instance of any Perl data type (scalar, array, etc.). Turning a plain data structure into an object is done by blessing that data structure using Perl's bless function.

While we strongly suggest you don't build your objects from scratch, you should know the term bless. A blessed data structure (aka "a referent") is an object. We sometimes say that an object has been "blessed into a class".

Once a referent has been blessed, the blessed function from the Scalar::Util core module can tell us its class name. This subroutine returns an object's class when passed an object, and false otherwise.

  use Scalar::Util 'blessed';
  print blessed($hash);     # undef
  print blessed($hostname); # File

Constructor

A constructor creates a new object. In Perl, a class's constructor is just another method, unlike some other languages, which provide syntax for constructors. Most Perl classes use new as the name for their constructor:

  my $file = File->new(...);

Methods

You already learned that a method is a subroutine that operates on an object. You can think of a method as the things that an object can do. If an object is a noun, then methods are its verbs (save, print, open).

In Perl, methods are simply subroutines that live in a class's package. Methods are always written to receive the object as their first argument:

  sub print_info {
      my $self = shift;
      print "This file is at ", $self->path, "\n";
  }
  $file->print_info;
  # This file is at /etc/hostname

What makes a method special is how it's called. The arrow operator (-> ) tells Perl that we are calling a method.

When we make a method call, Perl arranges for the method's invocant to be passed as the first argument. Invocant is a fancy name for the thing on the left side of the arrow. The invocant can either be a class name or an object. We can also pass additional arguments to the method:

  sub print_info {
      my $self = shift;
      my $prefix = shift // "This file is at ";
      print $prefix, $self->path, "\n";
  }
  $file->print_info("The file is located at ");
  # The file is located at /etc/hostname

Attributes

Each class can define its attributes. When we instantiate an object, we assign values to those attributes. For example, every File object has a path. Attributes are sometimes called properties.

Perl has no special syntax for attributes. Under the hood, attributes are often stored as keys in the object's underlying hash, but don't worry about this.

We recommend that you only access attributes via accessor methods. These are methods that can get or set the value of each attribute. We saw this earlier in the print_info() example, which calls $self->path .

You might also see the terms getter and setter. These are two types of accessors. A getter gets the attribute's value, while a setter sets it. Another term for a setter is mutator.

Attributes are typically defined as read-only or read-write. Read-only attributes can only be set when the object is first created, while read-write attributes can be altered at any time.
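The difference between a read-only and a read-write attribute can be sketched by hand. The OO systems recommended later generate accessors like these for you, so this is only an illustration. The hash keys and the combined getter/setter style are assumptions of the sketch.

```perl
use strict;
use warnings;

{
    package File;    # hand-written sketch; the hash layout is an assumption

    sub new {
        my ($class, %args) = @_;
        return bless { path => $args{path}, content => $args{content} }, $class;
    }

    # Read-only attribute: a getter that ignores any extra arguments.
    sub path { my $self = shift; return $self->{path} }

    # Read-write attribute: acts as a setter when given a value,
    # and as a getter otherwise.
    sub content {
        my $self = shift;
        $self->{content} = shift if @_;
        return $self->{content};
    }
}

my $file = File->new(path => '/etc/hostname', content => "foo\n");
$file->content("bar\n");      # setter (mutator)
print $file->path, "\n";      # getter
```

Here path can only be set through the constructor, while content can be changed at any time.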

The value of an attribute may itself be another object. For example, instead of returning its last mod time as a number, the File class could return a DateTime object representing that value.

It's possible to have a class that does not expose any publicly settable attributes. Not every class has attributes and methods.

Polymorphism

Polymorphism is a fancy way of saying that objects from two different classes share an API. For example, we could have File and WebPage classes which both have a print_content() method. This method might produce different output for each class, but they share a common interface.

While the two classes may differ in many ways, when it comes to the print_content() method, they are the same. This means that we can try to call the print_content() method on an object of either class, and we don't have to know what class the object belongs to!
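A minimal sketch of this idea follows. Both classes are hand-written stand-ins, and the attribute names content and body are assumptions made for the example.

```perl
use strict;
use warnings;

{
    package File;       # minimal stand-in for illustration
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub print_content { my $self = shift; print $self->{content} }
}
{
    package WebPage;    # minimal stand-in for illustration
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
    sub print_content { my $self = shift; print "<html>$self->{body}</html>\n" }
}

my @things = (
    File->new(content => "foo\n"),
    WebPage->new(body => 'hello'),
);

# The caller never checks which class it has; it relies on the shared API.
$_->print_content for @things;
```

The loop works for any object whose class provides print_content, whether or not the classes are otherwise related.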

Polymorphism is one of the key concepts of object-oriented design.

Inheritance

Inheritance lets you create a specialized version of an existing class. Inheritance lets the new class reuse the methods and attributes of another class.

For example, we could create a File::MP3 class which inherits from File. A File::MP3 is-a more specific type of File. All mp3 files are files, but not all files are mp3 files.

We often refer to inheritance relationships as parent-child or superclass/subclass relationships. Sometimes we say that the child has an is-a relationship with its parent class.

File is a superclass of File::MP3 , and File::MP3 is a subclass of File .

  package File::MP3;
  use parent 'File';

The parent module is one of several ways that Perl lets you define inheritance relationships.

Perl allows multiple inheritance, which means that a class can inherit from multiple parents. While this is possible, we strongly recommend against it. Generally, you can use roles to do everything you can do with multiple inheritance, but in a cleaner way.

Note that there's nothing wrong with defining multiple subclasses of a given class. This is both common and safe. For example, we might define File::MP3::FixedBitrate and File::MP3::VariableBitrate classes to distinguish between different types of mp3 file.

Overriding methods and method resolution

Inheritance allows two classes to share code. By default, every method in the parent class is also available in the child. The child can explicitly override a parent's method to provide its own implementation. For example, if we have a File::MP3 object, it has the print_info() method from File:

  my $cage = File::MP3->new(
      path          => 'mp3s/My-Body-Is-a-Cage.mp3',
      content       => $mp3_data,
      last_mod_time => 1304974868,
      title         => 'My Body Is a Cage',
  );
  $cage->print_info;
  # This file is at mp3s/My-Body-Is-a-Cage.mp3

If we wanted to include the mp3's title in the output, we could override the method:

  package File::MP3;
  use parent 'File';
  sub print_info {
      my $self = shift;
      print "This file is at ", $self->path, "\n";
      print "Its title is ", $self->title, "\n";
  }
  $cage->print_info;
  # This file is at mp3s/My-Body-Is-a-Cage.mp3
  # Its title is My Body Is a Cage

The process of determining what method should be used is called method resolution. What Perl does is look at the object's class first (File::MP3 in this case). If that class defines the method, then that class's version of the method is called. If not, Perl looks at each parent class in turn. For File::MP3 , its only parent is File . If File::MP3 does not define the method, but File does, then Perl calls the method in File .

If File inherited from DataSource , which inherited from Thing , then Perl would keep looking "up the chain" if necessary.
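To see the chain Perl walks, you can ask the core mro module for a class's linearized ancestry. The sketch below assumes the hypothetical hierarchy from the paragraph above, with all classes defined in one file. That is why it passes -norequire to parent, so that Perl does not try to load each class from disk. The describe method is invented for the example.

```perl
use strict;
use warnings;
use mro;

{ package Thing;      sub describe { return 'a thing' } }
{ package DataSource; use parent -norequire, 'Thing'; }
{ package File;       use parent -norequire, 'DataSource'; }
{ package File::MP3;  use parent -norequire, 'File'; }

# The classes Perl searches, in order, when resolving a method.
my @chain = @{ mro::get_linear_isa('File::MP3') };
print join(' -> ', @chain), "\n";

# describe() is only defined in Thing, three levels up the chain.
my $description = File::MP3->describe;
print "$description\n";
```

With single inheritance the chain is simply child, parent, grandparent, and so on. Multiple inheritance is what makes this order harder to predict.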

It is possible to explicitly call a parent method from a child:

  package File::MP3;
  use parent 'File';
  sub print_info {
      my $self = shift;
      $self->SUPER::print_info();
      print "Its title is ", $self->title, "\n";
  }

The SUPER:: bit tells Perl to look for the print_info() in the File::MP3 class's inheritance chain. When it finds the parent class that implements this method, the method is called.

We mentioned multiple inheritance earlier. The main problem with multiple inheritance is that it greatly complicates method resolution. See perlobj for more details.

Encapsulation

Encapsulation is the idea that an object is opaque. When another developer uses your class, they don't need to know how it is implemented, they just need to know what it does.

Encapsulation is important for several reasons. First, it allows you to separate the public API from the private implementation. This means you can change that implementation without breaking the API.

Second, when classes are well encapsulated, they become easier to subclass. Ideally, a subclass uses the same APIs to access object data that its parent class uses. In reality, subclassing sometimes involves violating encapsulation, but a good API can minimize the need to do this.

We mentioned earlier that most Perl objects are implemented as hashes under the hood. The principle of encapsulation tells us that we should not rely on this. Instead, we should use accessor methods to access the data in that hash. The object systems that we recommend below all automate the generation of accessor methods. If you use one of them, you should never have to access the object as a hash directly.

Composition

In object-oriented code, we often find that one object references another object. This is called composition, or a has-a relationship.

Earlier, we mentioned that the File class's last_mod_time accessor could return a DateTime object. This is a perfect example of composition. We could go even further, and make the path and content accessors return objects as well. The File class would then be composed of several other objects.
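A rough sketch of this has-a relationship follows. To stay free of CPAN dependencies it uses a small hypothetical Timestamp class as a stand-in for DateTime. The year method and the hash layout are assumptions of the example.

```perl
use strict;
use warnings;

{
    package Timestamp;    # hypothetical stand-in for a DateTime-like class
    sub new   { my ($class, $epoch) = @_; return bless { epoch => $epoch }, $class }
    sub epoch { my $self = shift; return $self->{epoch} }
    sub year  {
        my $self = shift;
        my $years_since_1900 = (gmtime($self->{epoch}))[5];
        return $years_since_1900 + 1900;
    }
}
{
    package File;
    sub new {
        my ($class, %args) = @_;
        my $self = bless { path => $args{path} }, $class;
        # The File object has-a Timestamp object.
        $self->{last_mod_time} = Timestamp->new($args{last_mod_time});
        return $self;
    }
    sub path          { my $self = shift; return $self->{path} }
    sub last_mod_time { my $self = shift; return $self->{last_mod_time} }
}

my $file = File->new(path => '/etc/hostname', last_mod_time => 1304974868);
my $year = $file->last_mod_time->year;    # method call on the contained object
print "$year\n";
```

Callers chain method calls through the accessor, so File can later swap in a richer date class without changing its own API.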

Roles

Roles are something that a class does, rather than something that it is. Roles are relatively new to Perl, but have become rather popular. Roles are applied to classes. Sometimes we say that classes consume roles.

Roles are an alternative to inheritance for providing polymorphism. Let's assume we have two classes, Radio and Computer . Both of these things have on/off switches. We want to model that in our class definitions.

We could have both classes inherit from a common parent, like Machine , but not all machines have on/off switches. We could create a parent class called HasOnOffSwitch , but that is very artificial. Radios and computers are not specializations of this parent. This parent is really a rather ridiculous creation.

This is where roles come in. It makes a lot of sense to create a HasOnOffSwitch role and apply it to both classes. This role would define a known API like providing turn_on() and turn_off() methods.

Perl does not have any built-in way to express roles. In the past, people just bit the bullet and used multiple inheritance. Nowadays, there are several good choices on CPAN for using roles.

When to Use OO

Object Orientation is not the best solution to every problem. In Perl Best Practices (copyright 2004, Published by O'Reilly Media, Inc.), Damian Conway provides a list of criteria to use when deciding if OO is the right fit for your problem:

  • The system being designed is large, or is likely to become large.

  • The data can be aggregated into obvious structures, especially if there's a large amount of data in each aggregate.

  • The various types of data aggregate form a natural hierarchy that facilitates the use of inheritance and polymorphism.

  • You have a piece of data on which many different operations are applied.

  • You need to perform the same general operations on related types of data, but with slight variations depending on the specific type of data the operations are applied to.

  • It's likely you'll have to add new data types later.

  • The typical interactions between pieces of data are best represented by operators.

  • The implementation of individual components of the system is likely to change over time.

  • The system design is already object-oriented.

  • Large numbers of other programmers will be using your code modules.

PERL OO SYSTEMS

As we mentioned before, Perl's built-in OO system is very minimal, but also quite flexible. Over the years, many people have developed systems which build on top of Perl's built-in system to provide more features and convenience.

We strongly recommend that you use one of these systems. Even the most minimal of them eliminates a lot of repetitive boilerplate. There's really no good reason to write your classes from scratch in Perl.

If you are interested in the guts underlying these systems, check out perlobj.

Moose

Moose bills itself as a "postmodern object system for Perl 5". Don't be scared, the "postmodern" label is a callback to Larry's description of Perl as "the first postmodern computer language".

Moose provides a complete, modern OO system. Its biggest influence is the Common Lisp Object System, but it also borrows ideas from Smalltalk and several other languages. Moose was created by Stevan Little, and draws heavily from his work on the Perl 6 OO design.

Here is our File class using Moose :

  package File;
  use Moose;
  has path          => ( is => 'ro' );
  has content       => ( is => 'ro' );
  has last_mod_time => ( is => 'ro' );
  sub print_info {
      my $self = shift;
      print "This file is at ", $self->path, "\n";
  }

Moose provides a number of features:

  • Declarative sugar

    Moose provides a layer of declarative "sugar" for defining classes. That sugar is just a set of exported functions that make declaring how your class works simpler and more palatable. This lets you describe what your class is, rather than having to tell Perl how to implement your class.

    The has() subroutine declares an attribute, and Moose automatically creates accessors for these attributes. It also takes care of creating a new() method for you. This constructor knows about the attributes you declared, so you can set them when creating a new File .

  • Roles built-in

    Moose lets you define roles the same way you define classes:

    package HasOnOffSwitch;
    use Moose::Role;
    has is_on => (
        is  => 'rw',
        isa => 'Bool',
    );
    sub turn_on {
        my $self = shift;
        $self->is_on(1);
    }
    sub turn_off {
        my $self = shift;
        $self->is_on(0);
    }

  • A miniature type system

    In the example above, you can see that we passed isa => 'Bool' to has() when creating our is_on attribute. This tells Moose that this attribute must be a boolean value. If we try to set it to an invalid value, our code will throw an error.

  • Full introspection and manipulation

    Perl's built-in introspection features are fairly minimal. Moose builds on top of them and creates a full introspection layer for your classes. This lets you ask questions like "what methods does the File class implement?" It also lets you modify your classes programmatically.

  • Self-hosted and extensible

    Moose describes itself using its own introspection API. Besides being a cool trick, this means that you can extend Moose using Moose itself.

  • Rich ecosystem

    There is a rich ecosystem of Moose extensions on CPAN under the MooseX namespace. In addition, many modules on CPAN already use Moose , providing you with lots of examples to learn from.

  • Many more features

    Moose is a very powerful tool, and we can't cover all of its features here. We encourage you to learn more by reading the Moose documentation, starting with Moose::Manual.

Of course, Moose isn't perfect.

Moose can make your code slower to load. Moose itself is not small, and it does a lot of code generation when you define your class. This code generation means that your runtime code is as fast as it can be, but you pay for this when your modules are first loaded.

This load time hit can be a problem when startup speed is important, such as with a command-line script or a "plain vanilla" CGI script that must be loaded each time it is executed.

Before you panic, know that many people do use Moose for command-line tools and other startup-sensitive code. We encourage you to try Moose out first before worrying about startup speed.

Moose also has several dependencies on other modules. Most of these are small stand-alone modules, a number of which have been spun off from Moose . Moose itself, and some of its dependencies, require a compiler. If you need to install your software on a system without a compiler, or if having any dependencies is a problem, then Moose may not be right for you.

Moo

If you try Moose and find that one of these issues is preventing you from using Moose , we encourage you to consider Moo next. Moo implements a subset of Moose 's functionality in a simpler package. For most features that it does implement, the end-user API is identical to Moose , meaning you can switch from Moo to Moose quite easily.

Moo does not implement most of Moose 's introspection API, so it's often faster when loading your modules. Additionally, none of its dependencies require XS, so it can be installed on machines without a compiler.

One of Moo 's most compelling features is its interoperability with Moose . When someone tries to use Moose 's introspection API on a Moo class or role, it is transparently inflated into a Moose class or role. This makes it easier to incorporate Moo -using code into a Moose code base and vice versa.

For example, a Moose class can subclass a Moo class using extends or consume a Moo role using with .

The Moose authors hope that one day Moo can be made obsolete by improving Moose enough, but for now it provides a worthwhile alternative to Moose .

Class::Accessor

Class::Accessor is the polar opposite of Moose. It provides very few features, and it is not self-hosting.

It is, however, very simple, pure Perl, and it has no non-core dependencies. It also provides a "Moose-like" API on demand for the features it supports.

Even though it doesn't do much, it is still preferable to writing your own classes from scratch.

Here's our File class with Class::Accessor :

  package File;
  use Class::Accessor 'antlers';
  has path          => ( is => 'ro' );
  has content       => ( is => 'ro' );
  has last_mod_time => ( is => 'ro' );
  sub print_info {
      my $self = shift;
      print "This file is at ", $self->path, "\n";
  }

The antlers import flag tells Class::Accessor that you want to define your attributes using Moose -like syntax. The only parameter that you can pass to has is is . We recommend that you use this Moose-like syntax if you choose Class::Accessor since it means you will have a smoother upgrade path if you later decide to move to Moose .

Like Moose , Class::Accessor generates accessor methods and a constructor for your class.

Object::Tiny

Finally, we have Object::Tiny. This module truly lives up to its name. It has an incredibly minimal API and absolutely no dependencies (core or not). Still, we think it's a lot easier to use than writing your own OO code from scratch.

Here's our File class once more:

  package File;
  use Object::Tiny qw( path content last_mod_time );
  sub print_info {
      my $self = shift;
      print "This file is at ", $self->path, "\n";
  }

That's it!

With Object::Tiny , all accessors are read-only. It generates a constructor for you, as well as the accessors you define.

Role::Tiny

As we mentioned before, roles provide an alternative to inheritance, but Perl does not have any built-in role support. If you choose to use Moose, it comes with a full-fledged role implementation. However, if you use one of our other recommended OO modules, you can still use roles with Role::Tiny.

Role::Tiny provides some of the same features as Moose's role system, but in a much smaller package. Most notably, it doesn't support any sort of attribute declaration, so you have to do that by hand. Still, it's useful, and works well with Class::Accessor and Object::Tiny.

OO System Summary

Here's a brief recap of the options we covered:

  • Moose

    Moose is the maximal option. It has a lot of features, a big ecosystem, and a thriving user base. We also covered Moo briefly. Moo is Moose lite, and a reasonable alternative when Moose doesn't work for your application.

  • Class::Accessor

    Class::Accessor does a lot less than Moose , and is a nice alternative if you find Moose overwhelming. It's been around a long time and is well battle-tested. It also has a minimal Moose compatibility mode which makes moving from Class::Accessor to Moose easy.

  • Object::Tiny

    Object::Tiny is the absolute minimal option. It has no dependencies, and almost no syntax to learn. It's a good option for a super minimal environment and for throwing something together quickly without having to worry about details.

  • Role::Tiny

    Use Role::Tiny with Class::Accessor or Object::Tiny if you find yourself considering multiple inheritance. If you go with Moose , it comes with its own role implementation.

Other OO Systems

There are literally dozens of other OO-related modules on CPAN besides those covered here, and you're likely to run across one or more of them if you work with other people's code.

In addition, plenty of code in the wild does all of its OO "by hand", using just the Perl built-in OO features. If you need to maintain such code, you should read perlobj to understand exactly how Perl's built-in OO works.

CONCLUSION

As we said before, Perl's minimal OO system has led to a profusion of OO systems on CPAN. While you can still drop down to the bare metal and write your classes by hand, there's really no reason to do that with modern Perl.

For small systems, Object::Tiny and Class::Accessor both provide minimal object systems that take care of basic boilerplate for you.

For bigger projects, Moose provides a rich set of features that will let you focus on implementing your business logic.

We encourage you to play with and evaluate Moose, Class::Accessor, and Object::Tiny to see which OO system is right for you.

 
perlop

Perl 5 version 18.2 documentation

NAME

perlop - Perl operators and precedence

DESCRIPTION

Operator Precedence and Associativity

Operator precedence and associativity work in Perl more or less like they do in mathematics.

Operator precedence means some operators are evaluated before others. For example, in 2 + 4 * 5, the multiplication has higher precedence, so 4 * 5 is evaluated first, yielding 2 + 20 == 22 and not 6 * 5 == 30.

Operator associativity defines what happens if a sequence of the same operators is used one after another: whether the evaluator will evaluate the left operations first or the right. For example, in 8 - 4 - 2, subtraction is left associative so Perl evaluates the expression left to right. 8 - 4 is evaluated first, making the expression 4 - 2 == 2 and not 8 - 2 == 6.
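The two rules can be checked directly. The following is a small sketch, not part of the original text; the variable names are purely illustrative, and it relies on the fact that ** is right associative in Perl.

```perl
use strict;
use warnings;

# Precedence: * binds more tightly than +.
my $sum    = 2 + 4 * 5;      # parsed as 2 + (4 * 5)
my $forced = (2 + 4) * 5;    # parentheses override precedence

# Associativity: - groups left to right, ** groups right to left.
my $diff  = 8 - 4 - 2;       # parsed as (8 - 4) - 2
my $power = 2 ** 3 ** 2;     # parsed as 2 ** (3 ** 2)

print "$sum $forced $diff $power\n";   # prints "22 30 2 512"
```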

Perl operators have the following associativity and precedence, listed from highest precedence to lowest. Operators borrowed from C keep the same precedence relationship with each other, even where C's precedence is slightly screwy. (This makes learning Perl easier for C folks.) With very few exceptions, these all operate on scalar values only, not array values.

    left        terms and list operators (leftward)
    left        ->
    nonassoc    ++ --
    right       **
    right       ! ~ \ and unary + and -
    left        =~ !~
    left        * / % x
    left        + - .
    left        << >>
    nonassoc    named unary operators
    nonassoc    < > <= >= lt gt le ge
    nonassoc    == != <=> eq ne cmp ~~
    left        &
    left        | ^
    left        &&
    left        || //
    nonassoc    ..  ...
    right       ?:
    right       = += -= *= etc. goto last next redo dump
    left        , =>
    nonassoc    list operators (rightward)
    right       not
    left        and
    left        or xor

In the following sections, these operators are covered in precedence order.

Many operators can be overloaded for objects. See overload.

Terms and List Operators (Leftward)

A TERM has the highest precedence in Perl. TERMs include variables, quote and quote-like operators, any expression in parentheses, and any function whose arguments are parenthesized. Actually, there aren't really functions in this sense, just list operators and unary operators behaving as functions because you put parentheses around the arguments. These are all documented in perlfunc.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.) is followed by a left parenthesis as the next token, the operator and arguments within parentheses are taken to be of highest precedence, just like a normal function call.

In the absence of parentheses, the precedence of list operators such as print, sort, or chmod is either very high or very low depending on whether you are looking at the left side or the right side of the operator. For example, in

    @ary = (1, 3, sort 4, 2);
    print @ary;         # prints 1324

the commas on the right of the sort are evaluated before the sort, but the commas on the left are evaluated after. In other words, list operators tend to gobble up all arguments that follow, and then act like a simple TERM with regard to the preceding expression. Be careful with parentheses:

    # These evaluate exit before doing the print:
    print($foo, exit);  # Obviously not what you want.
    print $foo, exit;   # Nor is this.

    # These do the print before evaluating exit:
    (print $foo), exit; # This is what you want.
    print($foo), exit;  # Or this.
    print ($foo), exit; # Or even this.

Also note that

    print ($foo & 255) + 1, "\n";

probably doesn't do what you expect at first glance. The parentheses enclose the argument list for print which is evaluated (printing the result of $foo & 255). Then one is added to the return value of print (usually 1). The result is something like this:

    1 + 1, "\n";    # Obviously not what you meant.

To do what you meant properly, you must write:

    print(($foo & 255) + 1, "\n");

See Named Unary Operators for more discussion of this.

Also parsed as terms are the do {} and eval {} constructs, as well as subroutine and method calls, and the anonymous constructors [] and {}.

See also Quote and Quote-like Operators toward the end of this section, as well as I/O Operators.

The Arrow Operator

"->" is an infix dereference operator, just as it is in C and C++. If the right side is either a [...], {...}, or a (...) subscript, then the left side must be either a hard or symbolic reference to an array, a hash, or a subroutine respectively. (Or technically speaking, a location capable of holding a hard reference, if it's an array or hash reference being used for assignment.) See perlreftut and perlref.

Otherwise, the right side is a method name or a simple scalar variable containing either the method name or a subroutine reference, and the left side must be either an object (a blessed reference) or a class name (that is, a package name). See perlobj.
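As a rough illustration of both uses of the arrow, the sketch below dereferences an array, a hash, and a code reference, then makes method calls on a class name and on an object. The Counter package and all variable names are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

my $aref = [10, 20, 30];
my $href = { color => 'red' };
my $cref = sub { return join '-', @_ };

my $second = $aref->[1];          # array subscript
my $color  = $href->{color};      # hash subscript
my $joined = $cref->(1, 2, 3);    # subroutine call

{
    package Counter;              # hypothetical class for this sketch
    sub new  { my ($class) = @_; return bless { n => 0 }, $class }
    sub bump { my ($self)  = @_; return ++$self->{n} }
}

my $counter = Counter->new;       # left side is a class name
$counter->bump;                   # left side is an object
my $method = 'bump';
my $count  = $counter->$method;   # method name held in a scalar

print "$second $color $joined $count\n";   # prints "20 red 1-2-3 2"
```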

Auto-increment and Auto-decrement

"++" and "--" work as in C. That is, if placed before a variable, they increment or decrement the variable by one before returning the value, and if placed after, increment or decrement after returning the value.

    $i = 0;  $j = 0;
    print $i++;  # prints 0
    print ++$j;  # prints 1

Note that just as in C, Perl doesn't define when the variable is incremented or decremented. You just know it will be done sometime before or after the value is returned. This also means that modifying a variable twice in the same statement will lead to undefined behavior. Avoid statements like:

    $i = $i ++;
    print ++ $i + $i ++;

Perl will not guarantee what the result of the above statements is.

The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^[a-zA-Z]*[0-9]*\z/, the increment is done as a string, preserving each character within its range, with carry:

    print ++($foo = "99");      # prints "100"
    print ++($foo = "a0");      # prints "a1"
    print ++($foo = "Az");      # prints "Ba"
    print ++($foo = "zz");      # prints "aaa"

undef is always treated as numeric, and in particular is changed to 0 before incrementing (so that a post-increment of an undef value will return 0 rather than undef).

The auto-decrement operator is not magical.
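The contrast between the magical increment and the plain decrement is easy to see side by side. This is only a sketch with made-up variable names; the numeric warning is switched off around the decrement because "aa" is not a number.

```perl
use strict;
use warnings;

my $tag = "a9";
$tag++;                      # string increment with carry: "b0"

my $label = "Zz";
$label++;                    # carries into a new leading character: "AAa"

my $unset;
my $before = $unset++;       # undef is treated as 0, so $before is 0

my $name = "aa";
{
    no warnings 'numeric';   # "aa" is not numeric, so -- would warn
    $name--;                 # no string magic: 0 - 1 gives -1
}

print "$tag $label $before $unset $name\n";   # prints "b0 AAa 0 1 -1"
```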

Exponentiation

Binary "**" is the exponentiation operator. It binds even more tightly than unary minus, so -2**4 is -(2**4), not (-2)**4. (This is implemented using C's pow(3) function, which actually works on doubles internally.)
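A short sketch of how "**" interacts with unary minus; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $negated = -2 ** 4;       # -(2 ** 4)
my $grouped = (-2) ** 4;     # parentheses make the base negative
my $root    = 2 ** 0.5;      # fractional exponents work, since pow() uses doubles

printf "%d %d %.4f\n", $negated, $grouped, $root;   # prints "-16 16 1.4142"
```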

Symbolic Unary Operators

Unary "!" performs logical negation, that is, "not". See also not for a lower precedence version of this.

Unary "-" performs arithmetic negation if the operand is numeric, including any string that looks like a number. If the operand is an identifier, a string consisting of a minus sign concatenated with the identifier is returned. Otherwise, if the string starts with a plus or minus, a string starting with the opposite sign is returned. One effect of these rules is that -bareword is equivalent to the string "-bareword". If, however, the string begins with a non-alphabetic character (excluding "+" or "-"), Perl will attempt to convert the string to a numeric and the arithmetic negation is performed. If the string cannot be cleanly converted to a numeric, Perl will give the warning Argument "the string" isn't numeric in negation (-) at ....
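The string rules for unary minus can be sketched as follows. Quoted strings are used instead of barewords so that the example runs under use strict; the variable names are arbitrary.

```perl
use strict;
use warnings;

my $number   = -"10";      # looks like a number: arithmetic negation
my $switch   = -"foo";     # identifier: a minus sign is prepended
my $flipped  = -"-foo";    # a leading minus becomes a plus
my $restored = -"+foo";    # a leading plus becomes a minus

print "$number $switch $flipped $restored\n";   # prints "-10 -foo +foo -foo"
```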

Unary "~" performs bitwise negation, that is, 1's complement. For example, 0666 & ~027 is 0640. (See also Integer Arithmetic and Bitwise String Operators.) Note that the width of the result is platform-dependent: ~0 is 32 bits wide on a 32-bit platform, but 64 bits wide on a 64-bit platform, so if you are expecting a certain bit width, remember to use the "&" operator to mask off the excess bits.
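Masking the complement is what keeps results portable across integer widths. A minimal sketch, with illustrative names:

```perl
use strict;
use warnings;

# Clear the bits named in a umask-style value.
my $mode = 0666 & ~027;

# Mask the complement to a fixed width so the result is the same on
# 32-bit and 64-bit builds.
my $low16 = ~0 & 0xFFFF;
my $byte  = ~0x0F & 0xFF;

printf "%04o %d 0x%02X\n", $mode, $low16, $byte;   # prints "0640 65535 0xF0"
```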

When complementing strings, if all characters have ordinal values under 256, then their complements will, also. But if they do not, all characters will be in either 32- or 64-bit complements, depending on your architecture. So for example, ~"\x{3B1}" is "\x{FFFF_FC4E}" on 32-bit machines and "\x{FFFF_FFFF_FFFF_FC4E}" on 64-bit machines.

Unary "+" has no effect whatsoever, even on strings. It is useful syntactically for separating a function name from a parenthesized expression that would otherwise be interpreted as the complete list of function arguments. (See examples above under Terms and List Operators (Leftward).)

Unary "\" creates a reference to whatever follows it. See perlreftut and perlref. Do not confuse this behavior with the behavior of backslash within a string, although both forms do convey the notion of protecting the next thing from interpolation.

Binding Operators

Binary "=~" binds a scalar expression to a pattern match. Certain operations search or modify the string $_ by default. This operator makes that kind of operation work on some other string. The right argument is a search pattern, substitution, or transliteration. The left argument is what is supposed to be searched, substituted, or transliterated instead of the default $_. When used in scalar context, the return value generally indicates the success of the operation. The exceptions are substitution (s///) and transliteration (y///) with the /r (non-destructive) option, which cause the return value to be the result of the substitution. Behavior in list context depends on the particular operator. See Regexp Quote-Like Operators for details and perlretut for examples using these operators.

If the right argument is an expression rather than a search pattern, substitution, or transliteration, it is interpreted as a search pattern at run time. Note that this means that its contents will be interpolated twice, so

    '\\' =~ q'\\';

is not ok, as the regex engine will end up trying to compile the pattern \, which it will consider a syntax error.

Binary "!~" is just like "=~" except the return value is negated in the logical sense.

Binary "!~" with a non-destructive substitution (s///r) or transliteration (y///r) is a syntax error.
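The sketch below exercises the binding operators on a string other than $_. It assumes Perl 5.14 or later because of the /r option; the variable names are made up for the example.

```perl
use v5.14;       # s///r needs Perl 5.14 or later
use warnings;

my $text = "hello world";

my $found  = $text =~ /world/;            # true: match against $text, not $_
my $absent = $text !~ /perl/;             # true: negated match
my $copy   = $text =~ s/world/Perl/r;     # returns the new string; $text is untouched

my $pattern = 'w.rld';                    # a plain string on the right side...
my $dynamic = $text =~ $pattern;          # ...is compiled as a pattern at run time

say "$copy | $text";                      # prints "hello Perl | hello world"
```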

Multiplicative Operators

Binary "*" multiplies two numbers.

Binary "/" divides two numbers.

Binary "%" is the modulo operator, which computes the division remainder of its first argument with respect to its second argument. Given integer operands $a and $b: If $b is positive, then $a % $b is $a minus the largest multiple of $b less than or equal to $a. If $b is negative, then $a % $b is $a minus the smallest multiple of $b that is not less than $a (that is, the result will be less than or equal to zero).

If the operands $a and $b are floating point values and the absolute value of $b (that is, abs($b)) is less than (UV_MAX + 1), only the integer portion of $a and $b will be used in the operation. (Note: here UV_MAX means the maximum of the unsigned integer type.) If the absolute value of the right operand (abs($b)) is greater than or equal to (UV_MAX + 1), "%" computes the floating-point remainder $r in the equation ($r = $a - $i*$b) where $i is a certain integer that makes $r have the same sign as the right operand $b (not as the left operand $a like the C function fmod()) and the absolute value less than that of $b.

Note that when use integer is in scope, "%" gives you direct access to the modulo operator as implemented by your C compiler. This operator is not as well defined for negative operands, but it will execute faster.
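The sign rules for "%" are easiest to verify with small numbers. This sketch does not use the integer pragma, so the results follow the rules described above rather than the C compiler's behavior.

```perl
use strict;
use warnings;

my @results = (
     7 %  3,     #  1
    -7 %  3,     #  2: -7 minus -9, the largest multiple of 3 not above -7
     7 % -3,     # -2: 7 minus 9, the smallest multiple of -3 not below 7
    -7 % -3,     # -1
    7.9 % 3.9,   #  1: only the integer parts 7 and 3 are used
);

print "@results\n";   # prints "1 2 -2 -1 1"
```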

Binary "x" is the repetition operator. In scalar context or if the left operand is not enclosed in parentheses, it returns a string consisting of the left operand repeated the number of times specified by the right operand. In list context, if the left operand is enclosed in parentheses or is a list formed by qw/STRING/, it repeats the list. If the right operand is zero or negative, it returns an empty string or an empty list, depending on the context.

    print '-' x 80;             # print row of dashes
    print "\t" x ($tab/8), ' ' x ($tab%8);      # tab over
    @ones = (1) x 80;           # a list of 80 1's
    @ones = (5) x @ones;        # set all elements to 5

Additive Operators

Binary + returns the sum of two numbers.

Binary - returns the difference of two numbers.

Binary . concatenates two strings.

Shift Operators

Binary << returns the value of its left argument shifted left by the number of bits specified by the right argument. Arguments should be integers. (See also Integer Arithmetic.)

Binary >> returns the value of its left argument shifted right by the number of bits specified by the right argument. Arguments should be integers. (See also Integer Arithmetic.)

Note that both << and >> in Perl are implemented directly using << and >> in C. If use integer (see Integer Arithmetic) is in force then signed C integers are used, else unsigned C integers are used. Either way, the implementation isn't going to generate results larger than the size of the integer type Perl was built with (32 bits or 64 bits).

The result of overflowing the range of the integers is undefined because it is undefined also in C. In other words, using 32-bit integers, 1 << 32 is undefined. Shifting by a negative number of bits is also undefined.

If you get tired of being subject to your platform's native integers, the use bigint pragma neatly sidesteps the issue altogether:

    print 20 << 20;     # 20971520
    print 20 << 40;     # 5120 on 32-bit machines,
                        # 21990232555520 on 64-bit machines
    use bigint;
    print 20 << 100;    # 25353012004564588029934064107520

Named Unary Operators

The various named unary operators are treated as functions with one argument, with optional parentheses.

If any list operator (print(), etc.) or any unary operator (chdir(), etc.) is followed by a left parenthesis as the next token, the operator and arguments within parentheses are taken to be of highest precedence, just like a normal function call. For example, because named unary operators are higher precedence than ||:

    chdir $foo    || die;       # (chdir $foo) || die
    chdir($foo)   || die;       # (chdir $foo) || die
    chdir ($foo)  || die;       # (chdir $foo) || die
    chdir +($foo) || die;       # (chdir $foo) || die

but, because * is higher precedence than named operators:

    chdir $foo * 20;    # chdir ($foo * 20)
    chdir($foo) * 20;   # (chdir $foo) * 20
    chdir ($foo) * 20;  # (chdir $foo) * 20
    chdir +($foo) * 20; # chdir ($foo * 20)

    rand 10 * 20;       # rand (10 * 20)
    rand(10) * 20;      # (rand 10) * 20
    rand (10) * 20;     # (rand 10) * 20
    rand +(10) * 20;    # rand (10 * 20)

Regarding precedence, the filetest operators, like -f, -M, etc. are treated like named unary operators, but they don't follow this functional parenthesis rule. That means, for example, that -f($file).".bak" is equivalent to -f "$file.bak".

See also Terms and List Operators (Leftward).

Relational Operators

Perl operators that return true or false generally return values that can be safely used as numbers. For example, the relational operators in this section and the equality operators in the next one return 1 for true and, for false, a special version of the defined empty string, "", which counts as a zero but is exempt from warnings about improper numeric conversions, just as "0 but true" is.
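Because the false value is exempt from numeric warnings, comparison results can be summed directly. The sketch below counts matching elements that way and collects any warnings to show that none are raised; the data and variable names are invented for the example.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $true  = (2 < 3);           # 1
my $false = (3 < 2);           # the special empty string

my @temps = (12, 31, 27, 35);
my $hot   = 0;
$hot += ($_ > 30) for @temps;  # adds 1 for true, 0 for false

print "true=$true false=[$false] hot=$hot warnings=", scalar(@warnings), "\n";
# prints "true=1 false=[] hot=2 warnings=0"
```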

Binary "<" returns true if the left argument is numerically less than the right argument.

Binary ">" returns true if the left argument is numerically greater than the right argument.

Binary "<=" returns true if the left argument is numerically less than or equal to the right argument.

Binary ">=" returns true if the left argument is numerically greater than or equal to the right argument.

Binary "lt" returns true if the left argument is stringwise less than the right argument.

Binary "gt" returns true if the left argument is stringwise greater than the right argument.

Binary "le" returns true if the left argument is stringwise less than or equal to the right argument.

Binary "ge" returns true if the left argument is stringwise greater than or equal to the right argument.

Equality Operators

Binary "==" returns true if the left argument is numerically equal to the right argument.

Binary "!=" returns true if the left argument is numerically not equal to the right argument.

Binary "<=>" returns -1, 0, or 1 depending on whether the left argument is numerically less than, equal to, or greater than the right argument. If your platform supports NaNs (not-a-numbers) as numeric values, using them with "<=>" returns undef. NaN is not "<", "==", ">", "<=" or ">=" anything (even NaN), so those 5 return false. NaN != NaN returns true, as does NaN != anything else. If your platform doesn't support NaNs then NaN is just a string with numeric value 0.

    $ perl -le '$a = "NaN"; print "No NaN support here" if $a == $a'
    $ perl -le '$a = "NaN"; print "NaN support here" if $a != $a'

(Note that the bigint, bigrat, and bignum pragmas all support "NaN".)

Binary "eq" returns true if the left argument is stringwise equal to the right argument.

Binary "ne" returns true if the left argument is stringwise not equal to the right argument.

Binary "cmp" returns -1, 0, or 1 depending on whether the left argument is stringwise less than, equal to, or greater than the right argument.
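The three-way operators are most often seen inside sort blocks. A small sketch, with arbitrary data, showing how the numeric and stringwise forms order the same list differently:

```perl
use strict;
use warnings;

my @values = (10, 9, 2, 33);

my @numeric = sort { $a <=> $b } @values;   # compare as numbers
my @string  = sort { $a cmp $b } @values;   # compare as strings

print "@numeric\n";   # prints "2 9 10 33"
print "@string\n";    # prints "10 2 33 9"
```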

Binary "~~" does a smartmatch between its arguments. Smart matching is described in the next section.

"lt", "le", "ge", "gt" and "cmp" use the collation (sort) order specified by the current locale if a legacy use locale (but not use locale ':not_characters') is in effect. See perllocale. Do not mix these with Unicode, only with legacy binary encodings. The standard Unicode::Collate and Unicode::Collate::Locale modules offer much more powerful solutions to collation issues.

Smartmatch Operator

First available in Perl 5.10.1 (the 5.10.0 version behaved differently), binary ~~ does a "smartmatch" between its arguments. This is mostly used implicitly in the when construct described in perlsyn, although not all when clauses call the smartmatch operator. Unique among all of Perl's operators, the smartmatch operator can recurse.

It is also unique in that all other Perl operators impose a context (usually string or numeric context) on their operands, autoconverting those operands to those imposed contexts. In contrast, smartmatch infers contexts from the actual types of its operands and uses that type information to select a suitable comparison mechanism.

The ~~ operator compares its operands "polymorphically", determining how to compare them according to their actual types (numeric, string, array, hash, etc.). Like the equality operators with which it shares the same precedence, ~~ returns 1 for true and "" for false. It is often best read aloud as "in", "inside of", or "is contained in", because the left operand is often looked for inside the right operand. That makes the order of the operands to the smartmatch operator often opposite that of the regular match operator. In other words, the "smaller" thing is usually placed in the left operand and the larger one in the right.

The behavior of a smartmatch depends on what type of things its arguments are, as determined by the following table. The first row of the table whose types apply determines the smartmatch behavior. Because what actually happens is mostly determined by the type of the second operand, the table is sorted on the right operand instead of on the left.

    Left      Right      Description and pseudocode
    ===============================================================
    Any       undef      check whether Any is undefined
                         like: !defined Any
    Any       Object     invoke ~~ overloading on Object, or die

Right operand is an ARRAY:

    Left      Right      Description and pseudocode
    ===============================================================
    ARRAY1    ARRAY2     recurse on paired elements of ARRAY1 and ARRAY2[2]
                         like: (ARRAY1[0] ~~ ARRAY2[0])
                                 && (ARRAY1[1] ~~ ARRAY2[1]) && ...
    HASH      ARRAY      any ARRAY elements exist as HASH keys
                         like: grep { exists HASH->{$_} } ARRAY
    Regexp    ARRAY      any ARRAY elements pattern match Regexp
                         like: grep { /Regexp/ } ARRAY
    undef     ARRAY      undef in ARRAY
                         like: grep { !defined } ARRAY
    Any       ARRAY      smartmatch each ARRAY element[3]
                         like: grep { Any ~~ $_ } ARRAY

Right operand is a HASH:

    Left      Right      Description and pseudocode
    ===============================================================
    HASH1     HASH2      all same keys in both HASHes
                         like: keys HASH1 ==
                                  grep { exists HASH2->{$_} } keys HASH1
    ARRAY     HASH       any ARRAY elements exist as HASH keys
                         like: grep { exists HASH->{$_} } ARRAY
    Regexp    HASH       any HASH keys pattern match Regexp
                         like: grep { /Regexp/ } keys HASH
    undef     HASH       always false (undef can't be a key)
                         like: 0 == 1
    Any       HASH       HASH key existence
                         like: exists HASH->{Any}

Right operand is CODE:

    Left      Right      Description and pseudocode
    ===============================================================
    ARRAY     CODE       sub returns true on all ARRAY elements[1]
                         like: !grep { !CODE->($_) } ARRAY
    HASH      CODE       sub returns true on all HASH keys[1]
                         like: !grep { !CODE->($_) } keys HASH
    Any       CODE       sub passed Any returns true
                         like: CODE->(Any)

Right operand is a Regexp:

    Left      Right      Description and pseudocode
    ===============================================================
    ARRAY     Regexp     any ARRAY elements match Regexp
                         like: grep { /Regexp/ } ARRAY
    HASH      Regexp     any HASH keys match Regexp
                         like: grep { /Regexp/ } keys HASH
    Any       Regexp     pattern match
                         like: Any =~ /Regexp/

Other:

    Left      Right      Description and pseudocode
    ===============================================================
    Object    Any        invoke ~~ overloading on Object,
                         or fall back to...

    Any       Num        numeric equality
                         like: Any == Num
    Num       nummy[4]   numeric equality
                         like: Num == nummy
    undef     Any        check whether undefined
                         like: !defined(Any)
    Any       Any        string equality
                         like: Any eq Any

Notes:

[1] Empty hashes or arrays match.

[2] That is, each element smartmatches the element of the same index in the other array.[3]

[3] If a circular reference is found, fall back to referential equality.

[4] Either an actual number, or a string that looks like one.

The smartmatch implicitly dereferences any non-blessed hash or array reference, so the HASH and ARRAY entries apply in those cases. For blessed references, the Object entries apply. Smartmatches involving hashes only consider hash keys, never hash values.

The "like" code entry is not always an exact rendition. For example, the smartmatch operator short-circuits whenever possible, but grep does not. Also, grep in scalar context returns the number of matches, but ~~ returns only true or false.

Unlike most operators, the smartmatch operator knows to treat undef specially:

    use v5.10.1;
    @array = (1, 2, 3, undef, 4, 5);
    say "some elements undefined" if undef ~~ @array;

Each operand is considered in a modified scalar context, the modification being that array and hash variables are passed by reference to the operator, which implicitly dereferences them. Both elements of each pair are the same:

    use v5.10.1;

    my %hash = (red    => 1, blue   => 2, green  => 3,
                orange => 4, yellow => 5, purple => 6,
                black  => 7, grey   => 8, white  => 9);

    my @array = qw(red blue green);

    say "some array elements in hash keys" if  @array ~~  %hash;
    say "some array elements in hash keys" if \@array ~~ \%hash;

    say "red in array" if "red" ~~  @array;
    say "red in array" if "red" ~~ \@array;

    say "some keys end in e" if /e$/ ~~  %hash;
    say "some keys end in e" if /e$/ ~~ \%hash;

Two arrays smartmatch if each element in the first array smartmatches (that is, is "in") the corresponding element in the second array, recursively.

    use v5.10.1;
    my @little = qw(red blue green);
    my @bigger = ("red", "blue", [ "orange", "green" ] );
    if (@little ~~ @bigger) {  # true!
        say "little is contained in bigger";
    }

Because the smartmatch operator recurses on nested arrays, this will still report that "red" is in the array.

    use v5.10.1;
    my @array = qw(red blue green);
    my $nested_array = [[[[[[[ @array ]]]]]]];
    say "red in array" if "red" ~~ $nested_array;

If two arrays smartmatch each other, then they are deep copies of each other's values, as this example reports:

    use v5.12.0;
    my @a = (0, 1, 2, [3, [4, 5], 6], 7);
    my @b = (0, 1, 2, [3, [4, 5], 6], 7);

    if (@a ~~ @b && @b ~~ @a) {
        say "a and b are deep copies of each other";
    }
    elsif (@a ~~ @b) {
        say "a smartmatches in b";
    }
    elsif (@b ~~ @a) {
        say "b smartmatches in a";
    }
    else {
        say "a and b don't smartmatch each other at all";
    }

If you were to set $b[3] = 4, then instead of reporting that "a and b are deep copies of each other", it now reports that "b smartmatches in a". That's because the corresponding position in @a contains an array that (eventually) has a 4 in it.

Smartmatching one hash against another reports whether both contain the same keys, no more and no less. This could be used to see whether two records have the same field names, without caring what values those fields might have. For example:

    use v5.10.1;
    sub make_dogtag {
        state $REQUIRED_FIELDS = { name=>1, rank=>1, serial_num=>1 };

        my ($class, $init_fields) = @_;

        die "Must supply (only) name, rank, and serial number"
            unless $init_fields ~~ $REQUIRED_FIELDS;

        ...
    }

or, if other non-required fields are allowed, use ARRAY ~~ HASH:

    use v5.10.1;
    sub make_dogtag {
        state $REQUIRED_FIELDS = { name=>1, rank=>1, serial_num=>1 };

        my ($class, $init_fields) = @_;

        die "Must supply (at least) name, rank, and serial number"
            unless [keys %{$init_fields}] ~~ $REQUIRED_FIELDS;

        ...
    }

The smartmatch operator is most often used as the implicit operator of a when clause. See the section on "Switch Statements" in perlsyn.

Smartmatching of Objects

To avoid relying on an object's underlying representation, if the smartmatch's right operand is an object that doesn't overload ~~, it raises the exception "Smartmatching a non-overloaded object breaks encapsulation". That's because one has no business digging around to see whether something is "in" an object. These are all illegal on objects without a ~~ overload:

    %hash ~~ $object
       42 ~~ $object
   "fred" ~~ $object

However, you can change the way an object is smartmatched by overloading the ~~ operator. This is allowed to extend the usual smartmatch semantics. For objects that do have an ~~ overload, see overload.

Using an object as the left operand is allowed, although not very useful. Smartmatching rules take precedence over overloading, so even if the object in the left operand has smartmatch overloading, this will be ignored. A left operand that is a non-overloaded object falls back on a string or numeric comparison of whatever the ref operator returns. That means that

    $object ~~ X

does not invoke the overload method with X as an argument. Instead the above table is consulted as normal, and based on the type of X, overloading may or may not be invoked. For simple strings or numbers, it becomes equivalent to this:

    $object ~~ $number        ref($object) == $number
    $object ~~ $string        ref($object) eq $string

For example, this reports that the handle smells IOish (but please don't really do this!):

    use v5.10.1;    # for say and ~~
    use IO::Handle;
    my $fh = IO::Handle->new();
    if ($fh ~~ /\bIO\b/) {
        say "handle smells IOish";
    }

That's because it treats $fh as a string like "IO::Handle=GLOB(0x8039e0)", then pattern matches against that.

Bitwise And

Binary "&" returns its operands ANDed together bit by bit. (See also Integer Arithmetic and Bitwise String Operators.)

Note that "&" has lower priority than relational operators, so for example the parentheses are essential in a test like

    print "Even\n" if ($x & 1) == 0;
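To see why the parentheses matter, compare the two parses. This is a sketch with an arbitrary value of $x; the precedence warning category is disabled here only because Perl itself flags the unparenthesized form.

```perl
use strict;
use warnings;
no warnings 'precedence';   # Perl warns about the unparenthesized form below

my $x = 4;

my $with_parens    = ($x & 1) == 0;   # tests the low bit: true for 4
my $without_parens = $x & 1 == 0;     # parsed as $x & (1 == 0), which is 0

print "even\n"      if $with_parens;
print "misparsed\n" unless $without_parens;
```

With $x set to 4, both lines are printed: the parenthesized test correctly reports an even number, while the unparenthesized expression evaluates to 0 regardless of the low bit.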

Bitwise Or and Exclusive Or

Binary "|" returns its operands ORed together bit by bit. (See also Integer Arithmetic and Bitwise String Operators.)

Binary "^" returns its operands XORed together bit by bit. (See also Integer Arithmetic and Bitwise String Operators.)

Note that "|" and "^" have lower priority than relational operators, so for example the brackets are essential in a test like

    print "false\n" if (8 | 2) != 10;

C-style Logical And

Binary "&&" performs a short-circuit logical AND operation. That is, if the left operand is false, the right operand is not even evaluated. Scalar or list context propagates down to the right operand if it is evaluated.

C-style Logical Or

Binary "||" performs a short-circuit logical OR operation. That is, if the left operand is true, the right operand is not even evaluated. Scalar or list context propagates down to the right operand if it is evaluated.

Logical Defined-Or

Although it has no direct equivalent in C, Perl's // operator is related to its C-style or. In fact, it's exactly the same as ||, except that it tests the left hand side's definedness instead of its truth. Thus, EXPR1 // EXPR2 returns the value of EXPR1 if it's defined; otherwise, the value of EXPR2 is returned. (EXPR1 is evaluated in scalar context, EXPR2 in the context of // itself.) Usually, this is the same result as defined(EXPR1) ? EXPR1 : EXPR2 (except that the ternary-operator form can be used as an lvalue, while EXPR1 // EXPR2 cannot). This is very useful for providing default values for variables. If you actually want to test if at least one of $a and $b is defined, use defined($a // $b).
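The difference from || shows up as soon as a legitimate value happens to be false. In this sketch the %opt hash and its keys are hypothetical settings chosen for illustration.

```perl
use v5.10;       # the // operator needs Perl 5.10 or later
use strict;
use warnings;

my %opt = (retries => 0);             # hypothetical settings hash

my $with_or  = $opt{retries} || 3;    # 3: 0 is false, so the default wins
my $with_dor = $opt{retries} // 3;    # 0: 0 is defined, so it is kept
my $timeout  = $opt{timeout} // 30;   # 30: the key is missing, so undef

say "$with_or $with_dor $timeout";    # prints "3 0 30"
```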

The ||, // and && operators return the last value evaluated (unlike C's || and &&, which return 0 or 1). Thus, a reasonably portable way to find out the home directory might be:

    $home =  $ENV{HOME}
          // $ENV{LOGDIR}
          // (getpwuid($<))[7]
          // die "You're homeless!\n";

In particular, this means that you shouldn't use this for selecting between two aggregates for assignment:

    @a = @b || @c;              # this is wrong
    @a = scalar(@b) || @c;      # really meant this
    @a = @b ? @b : @c;          # this works fine, though
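Running the wrong and right forms side by side makes the scalar-context effect visible. The arrays here are arbitrary sample data.

```perl
use strict;
use warnings;

my @b = (4, 5, 6);
my @c = (7, 8);

my @wrong = @b || @c;        # @b is evaluated in scalar context: (3)
my @right = @b ? @b : @c;    # (4, 5, 6)

my @empty    = ();
my @fallback = @empty ? @empty : @c;   # (7, 8)

print "@wrong | @right | @fallback\n";   # prints "3 | 4 5 6 | 7 8"
```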

As alternatives to && and || when used for control flow, Perl provides the and and or operators (see below). The short-circuit behavior is identical. The precedence of "and" and "or" is much lower, however, so that you can safely use them after a list operator without the need for parentheses:

    unlink "alpha", "beta", "gamma"
            or gripe(), next LINE;

With the C-style operators that would have been written like this:

    unlink("alpha", "beta", "gamma")
            || (gripe(), next LINE);

It would be even more readable to write that this way:

    unless(unlink("alpha", "beta", "gamma")) {
        gripe();
        next LINE;
    }

Using "or" for assignment is unlikely to do what you want; see below.

Range Operators

Binary ".." is the range operator, which is really two different operators depending on the context. In list context, it returns a list of values counting (up by ones) from the left value to the right value. If the left value is greater than the right value then it returns the empty list. The range operator is useful for writing foreach (1..10) loops and for doing slice operations on arrays. In the current implementation, no temporary array is created when the range operator is used as the expression in foreach loops, but older versions of Perl might burn a lot of memory when you write something like this:

    for (1 .. 1_000_000) {
        # code
    }

The range operator also works on strings, using the magical auto-increment, see below.

In scalar context, ".." returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each ".." operator maintains its own boolean state, even across calls to a subroutine that contains it. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again. It doesn't become false till the next time the range operator is evaluated. It can test the right operand and become false on the same evaluation it became true (as in awk), but it still returns true once. If you don't want it to test the right operand until the next evaluation, as in sed, just use three dots ("...") instead of two. In all other regards, "..." behaves just like ".." does.

The right operand is not evaluated while the operator is in the "false" state, and the left operand is not evaluated while the operator is in the "true" state. The precedence is a little lower than || and &&. The value returned is either the empty string for false, or a sequence number (beginning with 1) for true. The sequence number is reset for each range encountered. The final sequence number in a range has the string "E0" appended to it, which doesn't affect its numeric value, but gives you something to search for if you want to exclude the endpoint. You can exclude the beginning point by waiting for the sequence number to be greater than 1.
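A small sketch of the sequence numbers and the "E0" marker described above. The input list and the START/END markers are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical input; START and END delimit the range of interest.
my @input = qw(a START b c END d);
my (@seen, @inner);

for (@input) {
    my $seq = /START/ .. /END/;     # scalar context: flip-flop operator
    next unless $seq;               # empty string while outside the range
    push @seen, "$_=$seq";
    # Skip the first element (sequence number 1) and the last (ends in E0).
    push @inner, $_ if $seq > 1 && $seq !~ /E0\z/;
}
# @seen  is ("START=1", "b=2", "c=3", "END=4E0")
# @inner is ("b", "c")
```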

If either operand of scalar ".." is a constant expression, that operand is considered true if it is equal (== ) to the current input line number (the $. variable).

To be pedantic, the comparison is actually int(EXPR) == int(EXPR) , but that is only an issue if you use a floating point expression; when implicitly using $. as described in the previous paragraph, the comparison is int(EXPR) == int($.) which is only an issue when $. is set to a floating point value and you are not reading from a file. Furthermore, "span" .. "spat" or 2.18 .. 3.14 will not do what you want in scalar context because each of the operands is evaluated using its integer representation.

Examples:

As a scalar operator:

    if (101 .. 200) { print; } # print 2nd hundred lines, short for
                               #   if ($. == 101 .. $. == 200) { print; }

    next LINE if (1 .. /^$/);  # skip header lines, short for
                               #   next LINE if ($. == 1 .. /^$/);
                               # (typically in a loop labeled LINE)

    s/^/> / if (/^$/ .. eof()); # quote body

    # parse mail messages
    while (<>) {
        $in_header =   1  .. /^$/;
        $in_body   = /^$/ .. eof;
        if ($in_header) {
            # do something
        } else { # in body
            # do something else
        }
    } continue {
        close ARGV if eof;     # reset $. each file
    }

Here's a simple example to illustrate the difference between the two range operators:

    @lines = ("   - Foo",
              "01 - Bar",
              "1  - Baz",
              "   - Quux");

    foreach (@lines) {
        if (/0/ .. /1/) {
            print "$_\n";
        }
    }

This program will print only the line containing "Bar". If the range operator is changed to ... , it will also print the "Baz" line.

And now some examples as a list operator:

    for (101 .. 200) { print }        # print $_ 100 times
    @foo = @foo[0 .. $#foo];          # an expensive no-op
    @foo = @foo[$#foo-4 .. $#foo];    # slice last 5 items

The range operator (in list context) makes use of the magical auto-increment algorithm if the operands are strings. You can say

    @alphabet = ("A" .. "Z");

to get all normal letters of the English alphabet, or

    $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];

to get a hexadecimal digit, or

    @z2 = ("01" .. "31");
    print $z2[$mday];

to get dates with leading zeros.

If the final value specified is not in the sequence that the magical increment would produce, the sequence goes until the next value would be longer than the final value specified.
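The stopping rule can be sketched with a few short string ranges. The variable names are illustrative only:

```perl
use strict;
use warnings;

my @simple  = ("aa" .. "ad");   # aa ab ac ad
my @carry   = ("x"  .. "ab");   # x y z aa ab (the increment carries into two letters)
my @cut_off = ("y"  .. "A");    # y z ("A" is never produced, and "aa" would be longer)
```

In the last case the sequence stops after "z" because the next value, "aa", is longer than the final value "A".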

If the initial value specified isn't part of a magical increment sequence (that is, a non-empty string matching /^[a-zA-Z]*[0-9]*\z/ ), only the initial value will be returned. So the following will only return an alpha:

    use charnames "greek";
    my @greek_small = ("\N{alpha}" .. "\N{omega}");

To get the 25 traditional lowercase Greek letters, including both sigmas, you could use this instead:

    use charnames "greek";
    my @greek_small = map { chr } ( ord("\N{alpha}")
                                    ..
                                    ord("\N{omega}")
                                  );

However, because there are many other lowercase Greek characters than just those, to match lowercase Greek characters in a regular expression, you would use the pattern /(?:(?=\p{Greek})\p{Lower})+/ .

Because each operand is evaluated in integer form, 2.18 .. 3.14 will return two elements in list context.

    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);

Conditional Operator

Ternary "?:" is the conditional operator, just as in C. It works much like an if-then-else. If the argument before the ? is true, the argument before the : is returned, otherwise the argument after the : is returned. For example:

    printf "I have %d dog%s.\n", $n,
            ($n == 1) ? "" : "s";

Scalar or list context propagates downward into the 2nd or 3rd argument, whichever is selected.

    $a = $ok ? $b : $c;  # get a scalar
    @a = $ok ? @b : @c;  # get an array
    $a = $ok ? @b : @c;  # oops, that's just a count!

The operator may be assigned to if both the 2nd and 3rd arguments are legal lvalues (meaning that you can assign to them):

    ($a_or_b ? $a : $b) = $c;

Because this operator produces an assignable result, using assignments without parentheses will get you in trouble. For example, this:

    $a % 2 ? $a += 10 : $a += 2

really means this:

    (($a % 2) ? ($a += 10) : $a) += 2

rather than this:

    ($a % 2) ? ($a += 10) : ($a += 2)

That should probably be written more simply as:

    $a += ($a % 2) ? 10 : 2;

Assignment Operators

"=" is the ordinary assignment operator.

Assignment operators work as in C. That is,

    $a += 2;

is equivalent to

    $a = $a + 2;

although without duplicating any side effects that dereferencing the lvalue might trigger, such as from tie(). Other assignment operators work similarly. The following are recognized:

    **=    +=    *=    &=    <<=    &&=
           -=    /=    |=    >>=    ||=
           .=    %=    ^=           //=
                 x=

Although these are grouped by family, they all have the precedence of assignment.

Unlike in C, the scalar assignment operator produces a valid lvalue. Modifying an assignment is equivalent to doing the assignment and then modifying the variable that was assigned to. This is useful for modifying a copy of something, like this:

    ($tmp = $global) =~ tr/13579/24680/;

Although as of 5.14, that can also be accomplished this way:

    use v5.14;
    $tmp = ($global =~ tr/13579/24680/r);

Likewise,

    ($a += 2) *= 3;

is equivalent to

    $a += 2;
    $a *= 3;

Similarly, a list assignment in list context produces the list of lvalues assigned to, and a list assignment in scalar context returns the number of elements produced by the expression on the right hand side of the assignment.
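Both halves of that rule can be seen in a short sketch. All variable names here are illustrative:

```perl
use strict;
use warnings;

# Scalar context: the result is the number of elements on the right-hand
# side, even when fewer variables are assigned to.
my ($first, $second);
my $rhs_count = (($first, $second) = (10, 20, 30));

# The same rule gives the "count the matches" idiom with an empty list.
my $text    = "a1b22c333";
my $matches = () = $text =~ /\d+/g;

# List context: the assignment yields the lvalues assigned to, so the
# elements of the copy can be modified in place.
my @raw = ("  x", " y");
my @trimmed;
s/^\s+// for (@trimmed = @raw);
```

After this runs, $rhs_count and $matches are both 3, @trimmed holds the stripped strings, and @raw is unchanged.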

Comma Operator

Binary "," is the comma operator. In scalar context it evaluates its left argument, throws that value away, then evaluates its right argument and returns that value. This is just like C's comma operator.

In list context, it's just the list argument separator, and inserts both its arguments into the list. These arguments are also evaluated from left to right.

The => operator is a synonym for the comma except that it causes a word on its left to be interpreted as a string if it begins with a letter or underscore and is composed only of letters, digits and underscores. This includes operands that might otherwise be interpreted as operators, constants, single number v-strings or function calls. If in doubt about this behavior, the left operand can be quoted explicitly.

Otherwise, the => operator behaves exactly as the comma operator or list argument separator, according to context.

For example:

    use constant FOO => "something";

    my %h = ( FOO => 23 );

is equivalent to:

    my %h = ("FOO", 23);

It is NOT:

    my %h = ("something", 23);

The => operator is helpful in documenting the correspondence between keys and values in hashes, and other paired elements in lists.

    %hash = ( $key => $value );
    login( $username => $password );

The special quoting behavior ignores precedence, and hence may apply to part of the left operand:

    print time.shift => "bbb";

That example prints something like "1314363215shiftbbb", because the => implicitly quotes the shift immediately on its left, ignoring the fact that time.shift is the entire left operand.

List Operators (Rightward)

On the right side of a list operator, the comma has very low precedence, such that it controls all comma-separated expressions found there. The only operators with lower precedence are the logical operators "and", "or", and "not", which may be used to evaluate calls to list operators without the need for parentheses:

    open HANDLE, "< :utf8", "filename" or die "Can't open: $!\n";

However, some people find that code harder to read than writing it with parentheses:

    open(HANDLE, "< :utf8", "filename") or die "Can't open: $!\n";

in which case you might as well just use the more customary "||" operator:

    open(HANDLE, "< :utf8", "filename") || die "Can't open: $!\n";

See also discussion of list operators in Terms and List Operators (Leftward).

Logical Not

Unary "not" returns the logical negation of the expression to its right. It's the equivalent of "!" except for the very low precedence.

Logical And

Binary "and" returns the logical conjunction of the two surrounding expressions. It's equivalent to && except for the very low precedence. This means that it short-circuits: the right expression is evaluated only if the left expression is true.

Logical or and Exclusive Or

Binary "or" returns the logical disjunction of the two surrounding expressions. It's equivalent to || except for the very low precedence. This makes it useful for control flow:

    print FH $data or die "Can't write to FH: $!";

This means that it short-circuits: the right expression is evaluated only if the left expression is false. Due to its precedence, you must be careful to avoid using it as a replacement for the || operator. It usually works out better for flow control than in assignments:

    $a = $b or $c;              # bug: this is wrong
    ($a = $b) or $c;            # really means this
    $a = $b || $c;              # better written this way

However, when it's a list-context assignment and you're trying to use || for control flow, you probably need "or" so that the assignment takes higher precedence.

    @info = stat($file) || die;     # oops, scalar sense of stat!
    @info = stat($file) or die;     # better, now @info gets its due

Then again, you could always use parentheses.

Binary xor returns the exclusive-OR of the two surrounding expressions. It cannot short-circuit (of course).

There is no low precedence operator for defined-OR.

C Operators Missing From Perl

Here is what C has that Perl doesn't:

  • unary &

    Address-of operator. (But see the "\" operator for taking a reference.)

  • unary *

    Dereference-address operator. (Perl's prefix dereferencing operators are typed: $, @, %, and &.)

  • (TYPE)

    Type-casting operator.

Quote and Quote-like Operators

While we usually think of quotes as literal values, in Perl they function as operators, providing various kinds of interpolating and pattern matching capabilities. Perl provides customary quote characters for these behaviors, but also provides a way for you to choose your quote character for any of them. In the following table, a {} represents any pair of delimiters you choose.

    Customary  Generic        Meaning        Interpolates
        ''       q{}          Literal             no
        ""      qq{}          Literal             yes
        ``      qx{}          Command             yes*
                qw{}         Word list            no
        //       m{}       Pattern match          yes*
                qr{}          Pattern             yes*
                 s{}{}      Substitution          yes*
                tr{}{}    Transliteration         no (but see below)
                 y{}{}    Transliteration         no (but see below)
        <<EOF                 here-doc            yes*

        * unless the delimiter is ''.

Non-bracketing delimiters use the same character fore and aft, but the four sorts of ASCII brackets (round, angle, square, curly) all nest, which means that

    q{foo{bar}baz}

is the same as

    'foo{bar}baz'

Note, however, that this does not always work for quoting Perl code:

    $s = q{ if($a eq "}") ... }; # WRONG

is a syntax error. The Text::Balanced module (standard as of v5.8, and from CPAN before then) is able to do this properly.

There can be whitespace between the operator and the quoting characters, except when # is being used as the quoting character. q#foo# is parsed as the string foo , while q #foo# is the operator q followed by a comment. Its argument will be taken from the next line. This allows you to write:

    s {foo}  # Replace foo
      {bar}  # with bar.

The following escape sequences are available in constructs that interpolate, and in transliterations:

    Sequence     Note  Description
    \t                  tab               (HT, TAB)
    \n                  newline           (NL)
    \r                  return            (CR)
    \f                  form feed         (FF)
    \b                  backspace         (BS)
    \a                  alarm (bell)      (BEL)
    \e                  escape            (ESC)
    \x{263A}     [1,8]  hex char          (example: SMILEY)
    \x1b         [2,8]  restricted range hex char (example: ESC)
    \N{name}     [3]    named Unicode character or character sequence
    \N{U+263D}   [4,8]  Unicode character (example: FIRST QUARTER MOON)
    \c[          [5]    control char      (example: chr(27))
    \o{23072}    [6,8]  octal char        (example: SMILEY)
    \033         [7,8]  restricted range octal char  (example: ESC)

  • [1]

    The result is the character specified by the hexadecimal number between the braces. See [8] below for details on which character.

    Only hexadecimal digits are valid between the braces. If an invalid character is encountered, a warning will be issued and the invalid character and all subsequent characters (valid or invalid) within the braces will be discarded.

    If there are no valid digits between the braces, the generated character is the NULL character (\x{00} ). However, an explicit empty brace (\x{} ) will not cause a warning (currently).

  • [2]

    The result is the character specified by the hexadecimal number in the range 0x00 to 0xFF. See [8] below for details on which character.

    Only hexadecimal digits are valid following \x . When \x is followed by fewer than two valid digits, any valid digits will be zero-padded. This means that \x7 will be interpreted as \x07 , and a lone \x will be interpreted as \x00 . Except at the end of a string, having fewer than two valid digits will result in a warning. Note that although the warning says the illegal character is ignored, it is only ignored as part of the escape and will still be used as the subsequent character in the string. For example:

        Original    Result    Warns?
        "\x7"       "\x07"    no
        "\x"        "\x00"    no
        "\x7q"      "\x07q"   yes
        "\xq"       "\x00q"   yes

  • [3]

    The result is the Unicode character or character sequence given by name. See charnames.

  • [4]

    \N{U+hexadecimal number} means the Unicode character whose Unicode code point is hexadecimal number.

  • [5]

    The character following \c is mapped to some other character as shown in the table:

        Sequence   Value
          \c@      chr(0)
          \cA      chr(1)
          \ca      chr(1)
          \cB      chr(2)
          \cb      chr(2)
          ...
          \cZ      chr(26)
          \cz      chr(26)
          \c[      chr(27)
          \c]      chr(29)
          \c^      chr(30)
          \c?      chr(127)

    In other words, it's the character whose code point has had 64 xor'd with its uppercase. \c? is DELETE because ord("?") ^ 64 is 127, and \c@ is NULL because the ord of "@" is 64, so xor'ing 64 itself produces 0.

    Also, \c\X yields chr(28) . "X" for any X, but cannot come at the end of a string, because the backslash would be parsed as escaping the end quote.

    On ASCII platforms, the resulting characters from the list above are the complete set of ASCII controls. This isn't the case on EBCDIC platforms; see OPERATOR DIFFERENCES in perlebcdic for the complete list of what these sequences mean on both ASCII and EBCDIC platforms.

    Use of any other character following the "c" besides those listed above is discouraged, and some are deprecated with the intention of removing those in a later Perl version. What happens for any of these other characters currently though, is that the value is derived by xor'ing with the seventh bit, which is 64.

    To get platform independent controls, you can use \N{...} .

  • [6]

    The result is the character specified by the octal number between the braces. See [8] below for details on which character.

    If a character that isn't an octal digit is encountered, a warning is raised, and the value is based on the octal digits before it, discarding it and all following characters up to the closing brace. It is a fatal error if there are no octal digits at all.

  • [7]

    The result is the character specified by the three-digit octal number in the range 000 to 777 (but best to not use above 077, see next paragraph). See [8] below for details on which character.

    Some contexts allow 2 or even 1 digit, but any usage without exactly three digits, the first being a zero, may give unintended results. (For example, in a regular expression it may be confused with a backreference; see Octal escapes in perlrebackslash.) Starting in Perl 5.14, you may use \o{} instead, which avoids all these problems. Otherwise, it is best to use this construct only for ordinals \077 and below, remembering to pad to the left with zeros to make three digits. For larger ordinals, either use \o{} , or convert to something else, such as to hex and use \x{} instead.

    Having fewer than 3 digits may lead to a misleading warning message that says that what follows is ignored. For example, "\128" in the ASCII character set is equivalent to the two characters "\n8" , but the warning Illegal octal digit '8' ignored will be thrown. If "\n8" is what you want, you can avoid this warning by padding your octal number with 0 's: "\0128" .

  • [8]

    Several constructs above specify a character by a number. That number gives the character's position in the character set encoding (indexed from 0). This is called synonymously its ordinal, code position, or code point. Perl works on platforms that have a native encoding currently of either ASCII/Latin1 or EBCDIC, each of which allow specification of 256 characters. In general, if the number is 255 (0xFF, 0377) or below, Perl interprets this in the platform's native encoding. If the number is 256 (0x100, 0400) or above, Perl interprets it as a Unicode code point and the result is the corresponding Unicode character. For example \x{50} and \o{120} both are the number 80 in decimal, which is less than 256, so the number is interpreted in the native character set encoding. In ASCII the character in the 80th position (indexed from 0) is the letter "P", and in EBCDIC it is the ampersand symbol "&". \x{100} and \o{400} are both 256 in decimal, so the number is interpreted as a Unicode code point no matter what the native encoding is. The name of the character in the 256th position (indexed by 0) in Unicode is LATIN CAPITAL LETTER A WITH MACRON .

    There are a couple of exceptions to the above rule. \N{U+hex number} is always interpreted as a Unicode code point, so that \N{U+0050} is "P" even on EBCDIC platforms. And if use encoding is in effect, the number is considered to be in that encoding, and is translated from that into the platform's native encoding if there is a corresponding native character; otherwise to Unicode.

NOTE: Unlike C and other languages, Perl has no \v escape sequence for the vertical tab (VT, which is 11 in both ASCII and EBCDIC), but you may use \ck or \x0b . (\v does have meaning in regular expression patterns in Perl, see perlre.)
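The equivalences among these notations can be checked directly. The following is only a sketch: it assumes an ASCII platform and Perl 5.14 or later (needed for \o{}), and the variable names are illustrative:

```perl
use strict;
use warnings;

# Three spellings of the same character, U+263A (SMILEY).
my $same_smiley = ("\x{263A}" eq "\N{U+263A}"
                && "\x{263A}" eq "\o{23072}") ? 1 : 0;

# Four spellings of ESC, chr(27) on ASCII platforms.
my $same_esc = ("\e" eq "\x1b"
             && "\e" eq "\033"
             && "\e" eq "\c[") ? 1 : 0;

# There is no \v in strings, but \ck and \x0b both give the vertical tab.
my $same_vt = ("\ck" eq "\x0b" && ord("\ck") == 11) ? 1 : 0;
```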

The following escape sequences are available in constructs that interpolate, but not in transliterations.

    \l          lowercase next character only
    \u          titlecase (not uppercase!) next character only
    \L          lowercase all characters till \E or end of string
    \U          uppercase all characters till \E or end of string
    \F          foldcase all characters till \E or end of string
    \Q          quote (disable) pattern metacharacters till \E or
                end of string
    \E          end either case modification or quoted section
                (whichever was last seen)

See quotemeta for the exact definition of characters that are quoted by \Q .

\L , \U , \F , and \Q can stack, in which case you need one \E for each. For example:

    say "This \Qquoting \ubusiness \Uhere isn't quite\E done yet,\E is it?";
    # prints: This quoting\ Business\ HERE\ ISN\'T\ QUITE\ done\ yet\, is it?

If use locale is in effect (but not use locale ':not_characters' ), the case map used by \l , \L , \u , and \U is taken from the current locale. See perllocale. If Unicode (for example, \N{} or code points of 0x100 or beyond) is being used, the case map used by \l , \L , \u , and \U is as defined by Unicode. That means that case-mapping a single character can sometimes produce several characters. Under use locale , \F produces the same results as \L .
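A short sketch of the case-modifying escapes on plain ASCII data, with no locale in effect. The variable names are illustrative:

```perl
use strict;
use warnings;

my $name = "jOHN smith";

my $title       = "\u\L$name\E";   # lowercase everything, then titlecase the first character
my $shout       = "\U$name\E!";    # uppercase up to \E; the "!" is outside the \U span
my $lower_first = "\lABC";         # lowercase the next character only
```

This yields "John smith", "JOHN SMITH!" and "aBC" respectively.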

All systems use the virtual "\n" to represent a line terminator, called a "newline". There is no such thing as an unvarying, physical newline character. It is only an illusion that the operating system, device drivers, C libraries, and Perl all conspire to preserve. Not all systems read "\r" as ASCII CR and "\n" as ASCII LF. For example, on the ancient Macs (pre-MacOS X) of yesteryear, these used to be reversed, and on systems without line terminator, printing "\n" might emit no actual data. In general, use "\n" when you mean a "newline" for your system, but use the literal ASCII when you need an exact character. For example, most networking protocols expect and prefer a CR+LF ("\015\012" or "\cM\cJ" ) for line terminators, and although they often accept just "\012" , they seldom tolerate just "\015" . If you get in the habit of using "\n" for networking, you may be burned some day.

For constructs that do interpolate, variables beginning with "$" or "@" are interpolated. Subscripted variables such as $a[3] or $href->{key}[0] are also interpolated, as are array and hash slices. But method calls such as $obj->meth are not.

Interpolating an array or slice interpolates the elements in order, separated by the value of $" , so is equivalent to interpolating join $", @array . "Punctuation" arrays such as @* are usually interpolated only if the name is enclosed in braces @{*}, but the arrays @_ , @+ , and @- are interpolated even without braces.
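For instance, the following sketch shows an array, a slice, and a subscripted reference being interpolated. The data and variable names are made up for illustration:

```perl
use strict;
use warnings;

my @parts = ("a", "b", "c");
my $href  = { k => [10, 20] };

my $default = "@parts";              # elements joined with the default $" (a space)

my $csv;
{
    local $" = ",";                  # change the separator for this block only
    $csv = "@parts[0,1]";            # array slices interpolate too
}

my $subscript = "v=$href->{k}[1]";   # subscripted variables interpolate
```

The results are "a b c", "a,b" and "v=20".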

For double-quoted strings, the quoting from \Q is applied after interpolation and escapes are processed.

    "abc\Qfoo\tbar$s\Exyz"

is equivalent to

    "abc" . quotemeta("foo\tbar$s") . "xyz"

For the pattern of regex operators (qr//, m// and s///), the quoting from \Q is applied after interpolation is processed, but before escapes are processed. This allows the pattern to match literally (except for $ and @ ). For example, the following matches:

    '\s\t' =~ /\Q\s\t/

Because $ or @ trigger interpolation, you'll need to use something like /\Quser\E\@\Qhost/ to match them literally.

Patterns are subject to an additional level of interpretation as a regular expression. This is done as a second pass, after variables are interpolated, so that regular expressions may be incorporated into the pattern from the variables. If this is not what you want, use \Q to interpolate a variable literally.
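To illustrate the second pass, here is a sketch in which the same variable is interpolated with and without \Q. The $input value is a hypothetical piece of user-supplied text:

```perl
use strict;
use warnings;

my $input = "a.b";   # hypothetical user-supplied text containing a metacharacter

my $as_regex   = ("axb" =~ /\A$input\z/)     ? 1 : 0;  # "." matches any character
my $as_literal = ("axb" =~ /\A\Q$input\E\z/) ? 1 : 0;  # "." must now match a dot
my $exact      = ("a.b" =~ /\A\Q$input\E\z/) ? 1 : 0;  # the literal text still matches
```

Only the first and third tests succeed.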

Apart from the behavior described above, Perl does not expand multiple levels of interpolation. In particular, contrary to the expectations of shell programmers, back-quotes do NOT interpolate within double quotes, nor do single quotes impede evaluation of variables when used within double quotes.

Regexp Quote-Like Operators

Here are the quote-like operators that apply to pattern matching and related activities.

  • qr/STRING/msixpodual

    This operator quotes (and possibly compiles) its STRING as a regular expression. STRING is interpolated the same way as PATTERN in m/PATTERN/. If "'" is used as the delimiter, no interpolation is done. Returns a Perl value which may be used instead of the corresponding /STRING/msixpodual expression. The returned value is a normalized version of the original pattern. It magically differs from a string containing the same characters: ref(qr/x/) returns "Regexp"; however, dereferencing it is not well defined (you currently get the normalized version of the original pattern, but this may change).

    For example,

        $rex = qr/my.STRING/is;
        print $rex;                 # prints (?^si:my.STRING) on 5.14 and later
                                    # (older perls print (?si-xm:my.STRING))
        s/$rex/foo/;

    is equivalent to

        s/my.STRING/foo/is;

    The result may be used as a subpattern in a match:

        $re = qr/$pattern/;
        $string =~ /foo${re}bar/;   # can be interpolated in other
                                    # patterns
        $string =~ $re;             # or used standalone
        $string =~ /$re/;           # or this way

    Since Perl may compile the pattern at the moment of execution of the qr() operator, using qr() may have speed advantages in some situations, notably if the result of qr() is used standalone:

        sub match {
            my $patterns = shift;
            my @compiled = map qr/$_/i, @$patterns;
            grep {
                my $success = 0;
                foreach my $pat (@compiled) {
                    $success = 1, last if /$pat/;
                }
                $success;
            } @_;
        }

    Precompilation of the pattern into an internal representation at the moment of qr() avoids a need to recompile the pattern every time a match /$pat/ is attempted. (Perl has many other internal optimizations, but none would be triggered in the above example if we did not use qr() operator.)

    Options (specified by the following modifiers) are:

        m   Treat string as multiple lines.
        s   Treat string as single line. (Make . match a newline)
        i   Do case-insensitive pattern matching.
        x   Use extended regular expressions.
        p   When matching preserve a copy of the matched string so
            that ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be
            defined.
        o   Compile pattern only once.
        a   ASCII-restrict: Use ASCII for \d, \s, \w; specifying two
            a's further restricts /i matching so that no ASCII
            character will match a non-ASCII one.
        l   Use the locale.
        u   Use Unicode rules.
        d   Use Unicode or native charset, as in 5.12 and earlier.

    If a precompiled pattern is embedded in a larger pattern then the effect of "msixpluad" will be propagated appropriately. The effect the "o" modifier has is not propagated, being restricted to those patterns explicitly using it.

    The last four modifiers listed above, added in Perl 5.14, control the character set semantics, but /a is the only one you are likely to want to specify explicitly; the other three are selected automatically by various pragmas.

    See perlre for additional information on valid syntax for STRING, and for a detailed look at the semantics of regular expressions. In particular, all modifiers except the largely obsolete /o are further explained in Modifiers in perlre. /o is described in the next section.

  • m/PATTERN/msixpodualgc
  • /PATTERN/msixpodualgc

    Searches a string for a pattern match, and in scalar context returns true if it succeeds, false if it fails. If no string is specified via the =~ or !~ operator, the $_ string is searched. (The string specified with =~ need not be an lvalue--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.) See also perlre.

    Options are as described in qr// above; in addition, the following match process modifiers are available:

        g   Match globally, i.e., find all occurrences.
        c   Do not reset search position on a failed match when /g is
            in effect.

    If "/" is the delimiter then the initial m is optional. With the m you can use any pair of non-whitespace (ASCII) characters as delimiters. This is particularly useful for matching path names that contain "/", to avoid LTS (leaning toothpick syndrome). If "?" is the delimiter, then a match-only-once rule applies, described in m?PATTERN? below. If "'" is the delimiter, no interpolation is performed on the PATTERN. When using a character valid in an identifier, whitespace is required after the m.

    PATTERN may contain variables, which will be interpolated every time the pattern search is evaluated, except for when the delimiter is a single quote. (Note that $( , $) , and $| are not interpolated because they look like end-of-string tests.) Perl will not recompile the pattern unless an interpolated variable that it contains changes. You can force Perl to skip the test and never recompile by adding a /o (which stands for "once") after the trailing delimiter. Once upon a time, Perl would recompile regular expressions unnecessarily, and this modifier was useful to tell it not to do so, in the interests of speed. But now, the only reasons to use /o are one of the following:

    1

    The variables are thousands of characters long and you know that they don't change, and you need to wring out the last little bit of speed by having Perl skip testing for that. (There is a maintenance penalty for doing this, as mentioning /o constitutes a promise that you won't change the variables in the pattern. If you do change them, Perl won't even notice.)

    2

    You want the pattern to use the initial values of the variables regardless of whether they change or not. (But there are saner ways of accomplishing this than using /o.)

    3

    If the pattern contains embedded code, such as

        use re 'eval';
        $code = 'foo(?{ $x })';
        /$code/

    then perl will recompile each time, even though the pattern string hasn't changed, to ensure that the current value of $x is seen each time. Use /o if you want to avoid this.

    The bottom line is that using /o is almost never a good idea.

  • The empty pattern //

    If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead. In this case, only the g and c flags on the empty pattern are honored; the other flags are taken from the original pattern. If no match has previously succeeded, this will (silently) act instead as a genuine empty pattern (which will always match).
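    The following sketch shows the reuse in action; the strings are arbitrary. It also uses /(?:)/ , which is not an empty pattern string and so is one way to write a pattern that really does match the empty string:

```perl
use strict;
use warnings;

"abc" =~ /b/ or die "setup match failed";        # last successful pattern is now /b/

my $reused_miss = ("xyz" =~ //)     ? 1 : 0;     # behaves like /b/: no match
my $reused_hit  = ("cab" =~ //)     ? 1 : 0;     # behaves like /b/: match
my $truly_empty = ("xyz" =~ /(?:)/) ? 1 : 0;     # matches the empty string anywhere
```

    Here $reused_miss is 0, while $reused_hit and $truly_empty are both 1.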

    Note that it's possible to confuse Perl into thinking // (the empty regex) is really // (the defined-or operator). Perl is usually pretty good about this, but some pathological cases might trigger this, such as $a/// (is that ($a) / (//) or $a // /?) and print $fh // (is that print $fh(// or print($fh //?). In all of these examples, Perl will assume you meant defined-or. If you meant the empty regex, just use parentheses or spaces to disambiguate, or even prefix the empty regex with an m (so // becomes m//).

  • Matching in list context

    If the /g option is not used, m// in list context returns a list consisting of the subexpressions matched by the parentheses in the pattern, that is, ($1 , $2 , $3 ...) (Note that here $1 etc. are also set). When there are no parentheses in the pattern, the return value is the list (1) for success. With or without parentheses, an empty list is returned upon failure.

    Examples:

        open(TTY, "+</dev/tty")
            || die "can't access /dev/tty: $!";

        <TTY> =~ /^y/i && foo();    # do foo if desired

        if (/Version: *([0-9.]*)/) { $version = $1; }

        next if m#^/usr/spool/uucp#;

        # poor man's grep
        $arg = shift;
        while (<>) {
            print if /$arg/o; # compile only once (no longer needed!)
        }

        if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))

    This last example splits $foo into the first two words and the remainder of the line, and assigns those three fields to $F1, $F2, and $Etc. The conditional is true if any variables were assigned; that is, if the pattern matched.

    The /g modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern.

    In scalar context, each execution of m//g finds the next match, returning true if it matches, and false if there is no further match. The position after the last match can be read or set using the pos() function; see pos. A failed match normally resets the search position to the beginning of the string, but you can avoid that by adding the /c modifier (for example, m//gc). Modifying the target string also resets the search position.
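
    The context rules above can be sketched as follows. The sample strings and variable names are assumptions chosen for illustration:

```perl
use strict;
use warnings;

# Without /g, list context returns the captured groups.
my ($key, $value) = "colour=mauve" =~ /^(\w+)=(\w+)$/;

# With /g, list context returns every match at once.
my $s = "a1b22c333";
my @numbers = $s =~ /(\d+)/g;

# In scalar context, each m//g resumes where the previous one stopped,
# and pos() reports the offset just past the latest match.
my @ends;
while ($s =~ /\d+/g) {
    push @ends, pos($s);
}

print "$key => $value\n";
print "@numbers\n";
print "@ends\n";
```

    The inner loop ends when the match fails, which also resets pos($s) because /c was not used.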

  • \G assertion

    You can intermix m//g matches with m/\G.../g, where \G is a zero-width assertion that matches the exact position where the previous m//g, if any, left off. Without the /g modifier, the \G assertion still anchors at pos() as it was at the start of the operation (see pos), but the match is of course only attempted once. Using \G without /g on a target string that has not previously had a /g match applied to it is the same as using the \A assertion to match the beginning of the string. Note also that, currently, \G is only properly supported when anchored at the very beginning of the pattern.

    Examples:

    # list context
    ($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);

    # scalar context
    local $/ = "";
    while ($paragraph = <>) {
        while ($paragraph =~ /\p{Ll}['")]*[.!?]+['")]*\s/g) {
            $sentences++;
        }
    }
    say $sentences;

    Here's another way to check for sentences in a paragraph:

    my $sentence_rx = qr{
        (?: (?<= ^ ) | (?<= \s ) )  # after start-of-string or
                                    # whitespace
        \p{Lu}                      # capital letter
        .*?                         # a bunch of anything
        (?<= \S )                   # that ends in non-
                                    # whitespace
        (?<! \b [DMS]r  )           # but isn't a common abbr.
        (?<! \b Mrs )
        (?<! \b Sra )
        (?<! \b St  )
        [.?!]                       # followed by a sentence
                                    # ender
        (?= $ | \s )                # in front of end-of-string
                                    # or whitespace
    }sx;
    local $/ = "";
    while (my $paragraph = <>) {
        say "NEW PARAGRAPH";
        my $count = 0;
        while ($paragraph =~ /($sentence_rx)/g) {
            printf "\tgot sentence %d: <%s>\n", ++$count, $1;
        }
    }

    Here's how to use m//gc with \G :

    $_ = "ppooqppqq";
    while ($i++ < 2) {
        print "1: '";
        print $1 while /(o)/gc; print "', pos=", pos, "\n";
        print "2: '";
        print $1 if /\G(q)/gc;  print "', pos=", pos, "\n";
        print "3: '";
        print $1 while /(p)/gc; print "', pos=", pos, "\n";
    }
    print "Final: '$1', pos=",pos,"\n" if /\G(.)/;

    The last example should print:

    1: 'oo', pos=4
    2: 'q', pos=5
    3: 'pp', pos=7
    1: '', pos=7
    2: 'q', pos=8
    3: '', pos=8
    Final: 'q', pos=8

    Notice that the final match matched q instead of p , which a match without the \G anchor would have done. Also note that the final match did not update pos. pos is only updated on a /g match. If the final match did indeed match p , it's a good bet that you're running a very old (pre-5.6.0) version of Perl.

    A useful idiom for lex -like scanners is /\G.../gc . You can combine several regexps like this to process a string part-by-part, doing different actions depending on which regexp matched. Each regexp tries to match where the previous one leaves off.

    $_ = <<'EOL';
       $url = URI::URL->new( "http://example.com/" );
       die if $url eq "xXx";
    EOL

    LOOP: {
        print(" digits"),       redo LOOP if /\G\d+\b[,.;]?\s*/gc;
        print(" lowercase"),    redo LOOP
                                       if /\G\p{Ll}+\b[,.;]?\s*/gc;
        print(" UPPERCASE"),    redo LOOP
                                       if /\G\p{Lu}+\b[,.;]?\s*/gc;
        print(" Capitalized"),  redo LOOP
                                 if /\G\p{Lu}\p{Ll}+\b[,.;]?\s*/gc;
        print(" MiXeD"),        redo LOOP if /\G\pL+\b[,.;]?\s*/gc;
        print(" alphanumeric"), redo LOOP
                               if /\G[\p{Alpha}\pN]+\b[,.;]?\s*/gc;
        print(" line-noise"),   redo LOOP if /\G\W+/gc;
        print ". That's all!\n";
    }

    Here is the output (split into several lines):

     line-noise lowercase line-noise UPPERCASE line-noise UPPERCASE
     line-noise lowercase line-noise lowercase line-noise lowercase
     lowercase line-noise lowercase lowercase line-noise lowercase
     lowercase line-noise MiXeD line-noise. That's all!
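
    The /\G.../gc idiom can also be sketched as a tiny tokenizer. The token names and the input string here are hypothetical, chosen only to show each regexp resuming where the previous one left off:

```perl
use strict;
use warnings;

my $src = "x = 42 + y1";
my @tokens;

for ($src) {                      # alias $_ to $src
    while (1) {
        if    (/\G\s+/gc)            { next }                      # skip blanks
        elsif (/\G(\d+)/gc)          { push @tokens, "NUM($1)" }
        elsif (/\G([A-Za-z_]\w*)/gc) { push @tokens, "ID($1)" }
        elsif (/\G([=+])/gc)         { push @tokens, "OP($1)" }
        else                         { last }                      # nothing matched
    }
}

print "@tokens\n";
```

    Because every pattern carries /c, a failed alternative leaves pos() untouched, so the next alternative is tried at the same position.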
  • m?PATTERN?msixpodualgc
  • ?PATTERN?msixpodualgc

    This is just like the m/PATTERN/ search, except that it matches only once between calls to the reset() operator. This is a useful optimization when you want to see only the first occurrence of something in each file of a set of files, for instance. Only m?? patterns local to the current package are reset.

    while (<>) {
        if (m?^$?) {
            # blank line between header and body
        }
    } continue {
        reset if eof;       # clear m?? status for next file
    }

    Another example switches the first "latin1" encoding it finds to "utf8" in a pod file:

    s//utf8/ if m? ^ =encoding \h+ \K latin1 ?x;

    The match-once behavior is controlled by the match delimiter being ?; with any other delimiter this is the normal m// operator.

    For historical reasons, the leading m in m?PATTERN? is optional, but the resulting ?PATTERN? syntax is deprecated, will warn on usage and might be removed from a future stable release of Perl (without further notice!).
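
    A minimal sketch of the match-once behavior, assuming a made-up list of lines: the same m?? operator is executed six times, but it succeeds only once per call to reset().

```perl
use strict;
use warnings;

my @hits;
for my $round (1, 2) {
    for my $line (qw(x1 x2 x3)) {
        # Succeeds only for the first matching line until reset() is called.
        push @hits, "$round:$line" if $line =~ m?^x?;
    }
    reset;    # re-arm the m?? operators of the current package
}

print "@hits\n";
```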

  • s/PATTERN/REPLACEMENT/msixpodualgcer

    Searches a string for a pattern, and if found, replaces that pattern with the replacement text and returns the number of substitutions made. Otherwise it returns false (specifically, the empty string).

    If the /r (non-destructive) option is used then it runs the substitution on a copy of the string and instead of returning the number of substitutions, it returns the copy whether or not a substitution occurred. The original string is never changed when /r is used. The copy will always be a plain string, even if the input is an object or a tied variable.

    If no string is specified via the =~ or !~ operator, the $_ variable is searched and modified. Unless the /r option is used, the string specified must be a scalar variable, an array element, a hash element, or an assignment to one of those; that is, some sort of scalar lvalue.

    If the delimiter chosen is a single quote, no interpolation is done on either the PATTERN or the REPLACEMENT. Otherwise, if the PATTERN contains a $ that looks like a variable rather than an end-of-string test, the variable will be interpolated into the pattern at run-time. If you want the pattern compiled only once the first time the variable is interpolated, use the /o option. If the pattern evaluates to the empty string, the last successfully executed regular expression is used instead. See perlre for further explanation on these.

    Options are as with m// with the addition of the following replacement specific options:

    e   Evaluate the right side as an expression.
    ee  Evaluate the right side as a string then eval the
        result.
    r   Return substitution and leave the original string
        untouched.

    Any non-whitespace delimiter may replace the slashes. Add space after the s when using a character allowed in identifiers. If single quotes are used, no interpretation is done on the replacement string (the /e modifier overrides this, however). Note that Perl treats backticks as normal delimiters; the replacement text is not evaluated as a command. If the PATTERN is delimited by bracketing quotes, the REPLACEMENT has its own pair of quotes, which may or may not be bracketing quotes, for example, s(foo)(bar) or s<foo>/bar/. A /e will cause the replacement portion to be treated as a full-fledged Perl expression and evaluated right then and there. It is, however, syntax checked at compile-time. A second e modifier will cause the replacement portion to be evaled before being run as a Perl expression.

    Examples:

    s/\bgreen\b/mauve/g;                # don't change wintergreen

    $path =~ s|/usr/bin|/usr/local/bin|;

    s/Login: $foo/Login: $bar/;         # run-time pattern

    ($foo = $bar) =~ s/this/that/;      # copy first, then
                                        # change
    ($foo = "$bar") =~ s/this/that/;    # convert to string,
                                        # copy, then change
    $foo = $bar =~ s/this/that/r;       # Same as above using /r
    $foo = $bar =~ s/this/that/r
                =~ s/that/the other/r;  # Chained substitutes
                                        # using /r
    @foo = map { s/this/that/r } @bar;  # /r is very useful in
                                        # maps

    $count = ($paragraph =~ s/Mister\b/Mr./g);  # get change-cnt

    $_ = 'abc123xyz';
    s/\d+/$&*2/e;                       # yields 'abc246xyz'
    s/\d+/sprintf("%5d",$&)/e;          # yields 'abc  246xyz'
    s/\w/$& x 2/eg;                     # yields 'aabbcc  224466xxyyzz'

    s/%(.)/$percent{$1}/g;              # change percent escapes; no /e
    s/%(.)/$percent{$1} || $&/ge;       # expr now, so /e
    s/^=(\w+)/pod($1)/ge;               # use function call

    $_ = 'abc123xyz';
    $a = s/abc/def/r;                   # $a is 'def123xyz' and
                                        # $_ remains 'abc123xyz'.

    # expand variables in $_, but dynamics only, using
    # symbolic dereferencing
    s/\$(\w+)/${$1}/g;

    # Add one to the value of any numbers in the string
    s/(\d+)/1 + $1/eg;

    # Titlecase words in the last 30 characters only
    substr($str, -30) =~ s/\b(\p{Alpha}+)\b/\u\L$1/g;

    # This will expand any embedded scalar variable
    # (including lexicals) in $_ : First $1 is interpolated
    # to the variable name, and then evaluated
    s/(\$\w+)/$1/eeg;

    # Delete (most) C comments.
    $program =~ s {
        /\*     # Match the opening delimiter.
        .*?     # Match a minimal number of characters.
        \*/     # Match the closing delimiter.
    } []gsx;

    s/^\s*(.*?)\s*$/$1/;                # trim whitespace in $_,
                                        # expensively

    for ($variable) {                   # trim whitespace in $variable,
                                        # cheap
        s/^\s+//;
        s/\s+$//;
    }

    s/([^ ]*) *([^ ]*)/$2 $1/;          # reverse 1st two fields

    Note the use of $ instead of \ in the last example. Unlike sed, we use the \<digit> form in only the left hand side. Anywhere else it's $<digit>.

    Occasionally, you can't use just a /g to get all the changes to occur that you might want. Here are two common cases:

    # put commas in the right places in an integer
    1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g;

    # expand tabs to 8-column spacing
    1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
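
    To see why a single /g pass is not enough, here is a sketch that applies both loops to sample inputs (the inputs are assumptions for illustration). Each pass of the comma loop can insert only the rightmost missing comma, so the substitution is repeated until it fails:

```perl
use strict;
use warnings;

my $n = "1234567";
# Pass 1 gives "1234,567", pass 2 gives "1,234,567", pass 3 fails.
1 while $n =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/g;

my $line = "ab\tc";
# The tab is replaced by enough spaces to reach the next multiple of 8.
1 while $line =~ s/\t+/' ' x (length($&)*8 - length($`)%8)/e;

print "$n\n";
print length($line), "\n";
```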

Quote-Like Operators

  • q/STRING/
  • 'STRING'

    A single-quoted, literal string. A backslash represents a backslash unless followed by the delimiter or another backslash, in which case the delimiter or backslash is interpolated.

    $foo = q!I said, "You said, 'She said it.'"!;
    $bar = q('This is it.');
    $baz = '\n';                # a two-character string
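
    A short sketch of the backslash rule for single-quoted strings, using illustrative variable names: only an escaped backslash or an escaped delimiter loses its backslash.

```perl
use strict;
use warnings;

my $plain   = q(a\\b);     # backslash-backslash collapses to one backslash
my $delim   = q!it\!s!;    # an escaped delimiter becomes the bare delimiter
my $literal = '\n';        # any other backslash stays: two characters
my $nested  = q(f(x));     # balanced brackets need no escaping

print join("|", $plain, $delim, $literal, $nested), "\n";
```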
  • qq/STRING/
  • "STRING"

    A double-quoted, interpolated string.

    $_ .= qq
     (*** The previous line contains the naughty word "$1".\n)
                if /\b(tcl|java|python)\b/i;      # :-)
    $baz = "\n";                # a one-character string
  • qx/STRING/
  • `STRING`

    A string which is (possibly) interpolated and then executed as a system command with /bin/sh or its equivalent. Shell wildcards, pipes, and redirections will be honored. The collected standard output of the command is returned; standard error is unaffected. In scalar context, it comes back as a single (potentially multi-line) string, or undef if the command failed. In list context, returns a list of lines (however you've defined lines with $/ or $INPUT_RECORD_SEPARATOR), or an empty list if the command failed.

    Because backticks do not affect standard error, use shell file descriptor syntax (assuming the shell supports this) if you care to address this. To capture a command's STDERR and STDOUT together:

    $output = `cmd 2>&1`;

    To capture a command's STDOUT but discard its STDERR:

    $output = `cmd 2>/dev/null`;

    To capture a command's STDERR but discard its STDOUT (ordering is important here):

    $output = `cmd 2>&1 1>/dev/null`;

    To exchange a command's STDOUT and STDERR in order to capture the STDERR but leave its STDOUT to come out the old STDERR:

    $output = `cmd 3>&1 1>&2 2>&3 3>&-`;

    To read both a command's STDOUT and its STDERR separately, it's easiest to redirect them separately to files, and then read from those files when the program is done:

    system("program args 1>program.stdout 2>program.stderr");
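
    The redirections above can be tried with a stand-in command. This sketch assumes a POSIX-style /bin/sh; the braces group is a hypothetical command that writes one line to each stream:

```perl
use strict;
use warnings;

# Assumed test command: writes "out" to STDOUT and "err" to STDERR.
my $cmd = q({ echo out; echo err 1>&2; });

my $both        = `$cmd 2>&1`;              # both streams captured
my $stdout_only = `$cmd 2>/dev/null`;       # STDERR discarded
my $stderr_only = `$cmd 2>&1 1>/dev/null`;  # order of redirections matters

print $both, $stdout_only, $stderr_only;
```

    On shells that do not support this file descriptor syntax the results will differ, as the surrounding text warns.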

    The STDIN filehandle used by the command is inherited from Perl's STDIN. For example:

    open(SPLAT, "stuff")   || die "can't open stuff: $!";
    open(STDIN, "<&SPLAT") || die "can't dupe SPLAT: $!";
    print STDOUT `sort`;

    will print the sorted contents of the file named "stuff".

    Using single-quote as a delimiter protects the command from Perl's double-quote interpolation, passing it on to the shell instead:

    $perl_info  = qx(ps $$);            # that's Perl's $$
    $shell_info = qx'ps $$';            # that's the new shell's $$

    How that string gets evaluated is entirely subject to the command interpreter on your system. On most platforms, you will have to protect shell metacharacters if you want them treated literally. This is in practice difficult to do, as it's unclear how to escape which characters. See perlsec for a clean and safe example of a manual fork() and exec() to emulate backticks safely.

    On some platforms (notably DOS-like ones), the shell may not be capable of dealing with multiline commands, so putting newlines in the string may not get you what you want. You may be able to evaluate multiple commands in a single line by separating them with the command separator character, if your shell supports that (for example, ; on many Unix shells and & on the Windows NT cmd shell).

    Perl will attempt to flush all files opened for output before starting the child process, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles.

    Beware that some command shells may place restrictions on the length of the command line. You must ensure your strings don't exceed this limit after any necessary interpolations. See the platform-specific release notes for more details about your particular environment.

    Using this operator can lead to programs that are difficult to port, because the shell commands called vary between systems, and may in fact not be present at all. As one example, the type command under the POSIX shell is very different from the type command under DOS. That doesn't mean you should go out of your way to avoid backticks when they're the right way to get something done. Perl was made to be a glue language, and one of the things it glues together is commands. Just understand what you're getting yourself into.

    See I/O Operators for more discussion.

  • qw/STRING/

    Evaluates to a list of the words extracted out of STRING, using embedded whitespace as the word delimiters. It can be understood as being roughly equivalent to:

    split(" ", q/STRING/);

    the differences being that it generates a real list at compile time, and in scalar context it returns the last element in the list. So this expression:

    qw(foo bar baz)

    is semantically equivalent to the list:

    "foo", "bar", "baz"

    Some frequently seen examples:

    use POSIX qw( setlocale localeconv )
    @EXPORT = qw( foo bar baz );

    A common mistake is to try to separate the words with commas or to put comments into a multi-line qw-string. For this reason, the use warnings pragma and the -w switch (that is, the $^W variable) produce warnings if the STRING contains the "," or the "#" character.
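
    As a small sketch of the equivalence with split (the word lists are illustrative), note that any run of whitespace, including newlines, separates the words:

```perl
use strict;
use warnings;

my @words = qw(foo bar baz);
my @split = split(" ", q/foo bar baz/);

# Newlines and indentation count as separators too.
my @multi = qw(
    alpha
    beta
);

print "@words\n@split\n@multi\n";
```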

  • tr/SEARCHLIST/REPLACEMENTLIST/cdsr
  • y/SEARCHLIST/REPLACEMENTLIST/cdsr

    Transliterates all occurrences of the characters found in the search list with the corresponding character in the replacement list. It returns the number of characters replaced or deleted. If no string is specified via the =~ or !~ operator, the $_ string is transliterated.

    If the /r (non-destructive) option is present, a new copy of the string is made and its characters transliterated, and this copy is returned no matter whether it was modified or not: the original string is always left unchanged. The new copy is always a plain string, even if the input string is an object or a tied variable.

    Unless the /r option is used, the string specified with =~ must be a scalar variable, an array element, a hash element, or an assignment to one of those; in other words, an lvalue.

    A character range may be specified with a hyphen, so tr/A-J/0-9/ does the same replacement as tr/ACEGIBDFHJ/0246813579/. For sed devotees, y is provided as a synonym for tr. If the SEARCHLIST is delimited by bracketing quotes, the REPLACEMENTLIST has its own pair of quotes, which may or may not be bracketing quotes; for example, tr[aeiouy][yuoiea] or tr(+\-*/)/ABCD/.

    Note that tr does not do regular expression character classes such as \d or \pL . The tr operator is not equivalent to the tr(1) utility. If you want to map strings between lower/upper cases, see lc and uc, and in general consider using the s operator if you need regular expressions. The \U , \u , \L , and \l string-interpolation escapes on the right side of a substitution operator will perform correct case-mappings, but tr[a-z][A-Z] will not (except sometimes on legacy 7-bit data).

    Note also that the whole range idea is rather unportable between character sets--and even within character sets they may cause results you probably didn't expect. A sound principle is to use only ranges that begin from and end at either alphabets of equal case (a-e, A-E), or digits (0-4). Anything else is unsafe. If in doubt, spell out the character sets in full.

    Options:

    c   Complement the SEARCHLIST.
    d   Delete found but unreplaced characters.
    s   Squash duplicate replaced characters.
    r   Return the modified string and leave the original string
        untouched.

    If the /c modifier is specified, the SEARCHLIST character set is complemented. If the /d modifier is specified, any characters specified by SEARCHLIST not found in REPLACEMENTLIST are deleted. (Note that this is slightly more flexible than the behavior of some tr programs, which delete anything they find in the SEARCHLIST, period.) If the /s modifier is specified, sequences of characters that were transliterated to the same character are squashed down to a single instance of the character.

    If the /d modifier is used, the REPLACEMENTLIST is always interpreted exactly as specified. Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST, the final character is replicated till it is long enough. If the REPLACEMENTLIST is empty, the SEARCHLIST is replicated. This latter is useful for counting characters in a class or for squashing character sequences in a class.

    Examples:

    $ARGV[1] =~ tr/A-Z/a-z/;    # canonicalize to lower case ASCII
    $cnt = tr/*/*/;             # count the stars in $_
    $cnt = $sky =~ tr/*/*/;     # count the stars in $sky
    $cnt = tr/0-9//;            # count the digits in $_
    tr/a-zA-Z//s;               # bookkeeper -> bokeper
    ($HOST = $host) =~ tr/a-z/A-Z/;
    $HOST = $host =~ tr/a-z/A-Z/r;   # same thing
    $HOST = $host =~ tr/a-z/A-Z/r    # chained with s///r
                  =~ s/:/ -p/r;
    tr/a-zA-Z/ /cs;             # change non-alphas to single space
    @stripped = map tr/a-zA-Z/ /csr, @original;
                                # /r with map
    tr [\200-\377]
       [\000-\177];             # wickedly delete 8th bit
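
    The option rules can be checked against concrete strings. This is a sketch with made-up inputs; each line exercises one of the behaviors described above:

```perl
use strict;
use warnings;

my $str    = "a1b22c333";
my $digits = ($str =~ tr/0-9//);                   # empty REPLACEMENTLIST: count only

(my $deleted  = "hello world") =~ tr/lo//d;        # /d deletes unreplaced characters
(my $squashed = "bookkeeper")  =~ tr/a-zA-Z//s;    # /s squashes repeated characters
(my $spaced   = "a1!!b")       =~ tr/a-zA-Z/ /cs;  # /c complements the SEARCHLIST
(my $padded   = "abc")         =~ tr/abc/x/;       # short list: final character replicated
my $upper     = "abc" =~ tr/a-z/A-Z/r;             # /r returns a modified copy

print join("|", $digits, $deleted, $squashed, $spaced, $padded, $upper), "\n";
```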

    If multiple transliterations are given for a character, only the first one is used:

    tr/AAA/XYZ/

    will transliterate any A to X.

    Because the transliteration table is built at compile time, neither the SEARCHLIST nor the REPLACEMENTLIST is subjected to double quote interpolation. That means that if you want to use variables, you must use an eval():

    eval "tr/$oldlist/$newlist/";
    die $@ if $@;

    eval "tr/$oldlist/$newlist/, 1" or die $@;
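
    A runnable sketch of the eval() approach, with hypothetical list values. The lists are pasted into source code, so this assumes they come from a trusted source:

```perl
use strict;
use warnings;

my ($oldlist, $newlist) = ("a-c", "A-C");   # assumed, trusted values
my $text = "abcabc-xyz";

# The string eval compiles tr/a-c/A-C/ at run time and applies it to $text.
my $count = eval "\$text =~ tr/$oldlist/$newlist/";
die $@ if $@;

print "$text ($count characters changed)\n";
```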
  • <<EOF

    A line-oriented form of quoting is based on the shell "here-document" syntax. Following a << you specify a string to terminate the quoted material, and all lines following the current line down to the terminating string are the value of the item.

    The terminating string may be either an identifier (a word), or some quoted text. An unquoted identifier works like double quotes. There may not be a space between the << and the identifier, unless the identifier is explicitly quoted. (If you put a space it will be treated as a null identifier, which is valid, and matches the first empty line.) The terminating string must appear by itself (unquoted and with no surrounding whitespace) on the terminating line.

    If the terminating string is quoted, the type of quotes used determines the treatment of the text.

    • Double Quotes

      Double quotes indicate that the text will be interpolated using exactly the same rules as normal double quoted strings.

      print <<EOF;
      The price is $Price.
      EOF

      print << "EOF"; # same as above
      The price is $Price.
      EOF
    • Single Quotes

      Single quotes indicate the text is to be treated literally with no interpolation of its content. This is similar to single quoted strings except that backslashes have no special meaning, with \\ being treated as two backslashes and not one as they would in every other quoting construct.

      Just as in the shell, a backslashed bareword following the << means the same thing as a single-quoted string does:

      $cost = <<'VISTA';  # hasta la ...
      That'll be $10 please, ma'am.
      VISTA

      $cost = <<\VISTA;   # Same thing!
      That'll be $10 please, ma'am.
      VISTA

      This is the only form of quoting in perl where there is no need to worry about escaping content, something that code generators can and do make good use of.

    • Backticks

      The content of the here doc is treated just as it would be if the string were embedded in backticks. Thus the content is interpolated as though it were double quoted and then executed via the shell, with the results of the execution returned.

      print << `EOC`; # execute command and get results
      echo hi there
      EOC
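
      The difference between the double-quoted and single-quoted forms can be sketched like this ($Price and the sample text are illustrative). The terminators sit in the first column, as required:

```perl
use strict;
use warnings;

my $Price = 42;

my $interpolated = <<"EOF";
The price is $Price.
EOF

my $literal = <<'EOF';
The price is $Price, and \\ stays two backslashes.
EOF

print $interpolated, $literal;
```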

    It is possible to stack multiple here-docs in a row:

       print <<"foo", <<"bar"; # you can stack them
    I said foo.
    foo
    I said bar.
    bar

       myfunc(<< "THIS", 23, <<'THAT');
    Here's a line
    or two.
    THIS
    and here's another.
    THAT

    Just don't forget that you have to put a semicolon on the end to finish the statement, as Perl doesn't know you're not going to try to do this:

       print <<ABC
    179231
    ABC
       + 20;

    If you want to remove the line terminator from your here-docs, use chomp().

    chomp($string = <<'END');
    This is a string.
    END

    If you want your here-docs to be indented with the rest of the code, you'll need to remove leading whitespace from each line manually:

    ($quote = <<'FINIS') =~ s/^\s+//gm;
       The Road goes ever on and on,
       down from the door where it began.
    FINIS
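
    Both techniques, chomp() and stripping leading whitespace, can be sketched together. Note that the terminating strings themselves must stay in the first column:

```perl
use strict;
use warnings;

# chomp() removes the trailing newline of the here-doc.
chomp(my $string = <<'END');
This is a string.
END

# s/^\s+//gm strips the indentation from every line of the body.
(my $quote = <<'FINIS') =~ s/^\s+//gm;
    The Road goes ever on and on,
    down from the door where it began.
FINIS

print "[$string]\n$quote";
```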

    If you use a here-doc within a delimited construct, such as in s///eg, the quoted material must still come on the line following the <<FOO marker, which means it may be inside the delimited construct:

    s/this/<<E . 'that'
    the other
    E
     . 'more '/eg;

    It works this way as of Perl 5.18. Historically, it was inconsistent, and you would have to write

    s/this/<<E . 'that'
     . 'more '/eg;
    the other
    E

    outside of string evals.

    Additionally, quoting rules for the end-of-string identifier are unrelated to Perl's quoting rules. q(), qq(), and the like are not supported in place of '' and "" , and the only interpolation is for backslashing the quoting character:

    print << "abc\"def";
    testing...
    abc"def

    Finally, quoted strings cannot span multiple lines. The general rule is that the identifier must be a string literal. Stick with that, and you should be safe.

Gory details of parsing quoted constructs

When presented with something that might have several different interpretations, Perl uses the DWIM (that's "Do What I Mean") principle to pick the most probable interpretation. This strategy is so successful that Perl programmers often do not suspect the ambivalence of what they write. But from time to time, Perl's notions differ substantially from what the author honestly meant.

This section hopes to clarify how Perl handles quoted constructs. Although the most common reason to learn this is to unravel labyrinthine regular expressions, because the initial steps of parsing are the same for all quoting operators, they are all discussed together.

The most important Perl parsing rule is the first one discussed below: when processing a quoted construct, Perl first finds the end of that construct, then interprets its contents. If you understand this rule, you may skip the rest of this section on the first reading. The other rules are likely to contradict the user's expectations much less frequently than this first one.

Some passes discussed below are performed concurrently, but because their results are the same, we consider them individually. For different quoting constructs, Perl performs different numbers of passes, from one to four, but these passes are always performed in the same order.

  • Finding the end

    The first pass is finding the end of the quoted construct, where the information about the delimiters is used in parsing. During this search, text between the starting and ending delimiters is copied to a safe location. The copied text is delimiter-independent.

    If the construct is a here-doc, the ending delimiter is a line that has a terminating string as the content. Therefore <<EOF is terminated by EOF immediately followed by "\n" and starting from the first column of the terminating line. When searching for the terminating line of a here-doc, nothing is skipped. In other words, lines after the here-doc syntax are compared with the terminating string line by line.

    For all constructs except here-docs, single characters are used as starting and ending delimiters. If the starting delimiter is an opening punctuation character (that is (, [, {, or < ), the ending delimiter is the corresponding closing punctuation character (that is ), ], }, or >). If the starting delimiter is an unpaired character like / or a closing punctuation character, the ending delimiter is the same as the starting delimiter. Therefore a / terminates a qq// construct, while a ] terminates qq[] and qq]] constructs.

    When searching for single-character delimiters, escaped delimiters and \\ are skipped. For example, while searching for terminating /, combinations of \\ and \/ are skipped. If the delimiters are bracketing, nested pairs are also skipped. For example, while searching for closing ] paired with the opening [, combinations of \\ , \], and \[ are all skipped, and nested [ and ] are skipped as well. However, when backslashes are used as the delimiters (like qq\\ and tr\\\), nothing is skipped. During the search for the end, backslashes that escape delimiters or other backslashes are removed (exactly speaking, they are not copied to the safe location).

    For constructs with three-part delimiters (s///, y///, and tr///), the search is repeated once more. If the first delimiter is not an opening punctuation character, all three delimiters must be the same, such as s!!! and tr))), in which case the second delimiter both terminates the left part and starts the right part. If the left part is delimited by bracketing punctuation (that is () , [] , {} , or <> ), the right part needs another pair of delimiters such as s(){} and tr[]//. In these cases, whitespace and comments are allowed between the two parts, though the comment must follow at least one whitespace character; otherwise a character expected as the start of the comment may be regarded as the starting delimiter of the right part.

    During this search no attention is paid to the semantics of the construct. Thus:

    "$hash{"$foo/$bar"}"

    or:

    m/
      bar       # NOT a comment, this slash / terminated m//!
     /x

    do not form legal quoted expressions. The quoted part ends on the first " and /, and the rest happens to be a syntax error. Because the slash that terminated m// was followed by a SPACE , the example above is not m//x, but rather m// with no /x modifier. So the embedded # is interpreted as a literal # .

    Also no attention is paid to \c\ (multichar control char syntax) during this search. Thus the second \ in qq/\c\/ is interpreted as a part of \/, and the following / is not recognized as a delimiter. Instead, use \034 or \x1c at the end of quoted constructs.
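
    The delimiter search can be observed directly. In this sketch (variable names are illustrative), nested brackets and escaped delimiters are skipped, and a comment sits between the two parts of a bracketed s///:

```perl
use strict;
use warnings;

my $nested  = q[a[b]c];      # nested bracket pairs are skipped
my $escaped = q/a\/b/;       # \/ is skipped and its backslash is removed

(my $subst = "foo") =~ s{foo}   # a comment is allowed after whitespace
                        {bar};

print join("|", $nested, $escaped, $subst), "\n";
```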

  • Interpolation

    The next step is interpolation in the text obtained, which is now delimiter-independent. There are multiple cases.

    • <<'EOF'

      No interpolation is performed. Note that the combination \\ is left intact, since escaped delimiters are not available for here-docs.

    • m'', the pattern of s'''

      No interpolation is performed at this stage. Any backslashed sequences, including \\ , are handled at the stage of parsing regular expressions.

    • '' , q//, tr''', y''', the replacement of s'''

      The only interpolation is removal of \ from pairs of \\ . Therefore - in tr''' and y''' is treated literally as a hyphen and no character range is available. \1 in the replacement of s''' does not work as $1 .

    • tr///, y///

      No variable interpolation occurs. String modifying combinations for case and quoting such as \Q , \U , and \E are not recognized. The other escape sequences such as \200 and \t and backslashed characters such as \\ and \- are converted to appropriate literals. The character - is treated specially and therefore \- is treated as a literal - .

    • "" , `` , qq//, qx//, <file*glob> , <<"EOF"

      \Q , \U , \u , \L , \l , \F (possibly paired with \E ) are converted to corresponding Perl constructs. Thus, "$foo\Qbaz$bar" is converted to $foo . (quotemeta("baz" . $bar)) internally. The other escape sequences such as \200 and \t and backslashed characters such as \\ and \- are replaced with appropriate expansions.

      Let it be stressed that whatever falls between \Q and \E is interpolated in the usual way. Something like "\Q\\E" has no \E inside. Instead, it has \Q , \\ , and E , so the result is the same as for "\\\\E" . As a general rule, backslashes between \Q and \E may lead to counterintuitive results. So, "\Q\t\E" is converted to quotemeta("\t"), which is the same as "\\\t" (since TAB is not alphanumeric). Note also that:

      $str = '\t';
      return "\Q$str";

      may be closer to the conjectural intention of the writer of "\Q\t\E" .

      Interpolated scalars and arrays are converted internally to the join and . catenation operations. Thus, "$foo XXX '@arr'" becomes:

      $foo . " XXX '" . (join $", @arr) . "'";

      All operations above are performed simultaneously, left to right.
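The equivalence between interpolation and explicit join/concatenation can be checked directly. This is a small sketch with made-up values for $foo and @arr; it relies on $" holding its default value, a single space:

```perl
use strict;
use warnings;

my $foo = "head";
my @arr = (1, 2, 3);

my $interpolated = "$foo XXX '@arr'";
my $explicit     = $foo . " XXX '" . join($", @arr) . "'";

print "$interpolated\n";                                    # prints "head XXX '1 2 3'"
print $interpolated eq $explicit ? "same\n" : "differ\n";   # prints "same"
```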

      Because the result of "\Q STRING \E" has all metacharacters quoted, there is no way to insert a literal $ or @ inside a \Q\E pair. If protected by \ , $ will be quoted to become "\\\$" ; if not, it is interpreted as the start of an interpolated scalar.

      Note also that the interpolation code needs to make a decision on where the interpolated scalar ends. For instance, whether "a $b -> {c}" really means:

        "a " . $b . " -> {c}";

      or:

        "a " . $b -> {c};

      Most of the time, the interpolated expression is taken to be the longest possible text that does not include spaces between components and that contains matching braces or brackets. Because the outcome may be determined by voting based on heuristic estimators, the result is not strictly predictable. Fortunately, it's usually correct for ambiguous cases.

    • the replacement of s///

      Processing of \Q , \U , \u , \L , \l , \F and interpolation happens as with qq// constructs.

      It is at this step that \1 is begrudgingly converted to $1 in the replacement text of s///, in order to correct the incorrigible sed hackers who haven't picked up the saner idiom yet. A warning is emitted if the use warnings pragma or the -w command-line flag (that is, the $^W variable) was set.

    • RE in ?RE? , /RE/ , m/RE/, s/RE/foo/

      Processing of \Q , \U , \u , \L , \l , \F , \E , and interpolation happens (almost) as with qq// constructs.

      Processing of \N{...} is also done here, and compiled into an intermediate form for the regex compiler. (This is because, as mentioned below, the regex compilation may be done at execution time, and \N{...} is a compile-time construct.)

      However any other combinations of \ followed by a character are not substituted but only skipped, in order to parse them as regular expressions at the following step. As \c is skipped at this step, @ of \c@ in RE is possibly treated as an array symbol (for example @foo ), even though the same text in qq// gives interpolation of \c@ .

      Code blocks such as (?{BLOCK}) are handled by temporarily passing control back to the perl parser, in a similar way that an interpolated array subscript expression such as "foo$array[1+f("[xyz")]bar" would be.

      Moreover, inside (?{BLOCK}), (?# comment ), and a # -comment in a //x -regular expression, no processing is performed whatsoever. This is the first step at which the presence of the //x modifier is relevant.

      Interpolation in patterns has several quirks: $| , $( , $) , @+ and @- are not interpolated, and constructs $var[SOMETHING] are voted (by several different estimators) to be either an array element or $var followed by an RE alternative. This is where the notation ${arr[$bar]} comes in handy: /${arr[0-9]}/ is interpreted as array element -9 , not as a regular expression from the variable $arr followed by a digit, which would be the interpretation of /$arr[0-9]/ . Since voting among different estimators may occur, the result is not predictable.

      The lack of processing of \\ creates specific restrictions on the post-processed text. If the delimiter is /, one cannot get the combination \/ into the result of this step. / will finish the regular expression, \/ will be stripped to / on the previous step, and \\/ will be left as is. Because / is equivalent to \/ inside a regular expression, this does not matter unless the delimiter happens to be a character special to the RE engine, such as in s*foo*bar*, m[foo], or ?foo? ; or an alphanumeric char, as in:

        m m ^ a \s* b mmx;

      In the RE above, which is intentionally obfuscated for illustration, the delimiter is m, the modifier is mx , and after delimiter-removal the RE is the same as for m/ ^ a \s* b /mx . There's more than one reason you're encouraged to restrict your delimiters to non-alphanumeric, non-whitespace choices.

    This step is the last one for all constructs except regular expressions, which are processed further.

  • parsing regular expressions

    Previous steps were performed during the compilation of Perl code, but this one happens at run time, although it may be optimized to be calculated at compile time if appropriate. After preprocessing described above, and possibly after evaluation if concatenation, joining, casing translation, or metaquoting are involved, the resulting string is passed to the RE engine for compilation.

    Whatever happens in the RE engine might be better discussed in perlre, but for the sake of continuity, we shall do so here.

    This is another step where the presence of the //x modifier is relevant. The RE engine scans the string from left to right and converts it to a finite automaton.

    Backslashed characters are either replaced with corresponding literal strings (as with \{), or else they generate special nodes in the finite automaton (as with \b ). Characters special to the RE engine (such as |) generate corresponding nodes or groups of nodes. (?#...) comments are ignored. All the rest is either converted to literal strings to match, or else is ignored (as is whitespace and # -style comments if //x is present).

    Parsing of the bracketed character class construct, [...] , is rather different than the rule used for the rest of the pattern. The terminator of this construct is found using the same rules as for finding the terminator of a {} -delimited construct, the only exception being that ] immediately following [ is treated as though preceded by a backslash.

    The terminator of runtime (?{...}) is found by temporarily switching control to the perl parser, which should stop at the point where the logically balancing terminating } is found.

    It is possible to inspect both the string given to the RE engine and the resulting finite automaton. See the arguments debug/debugcolor in the use re pragma, as well as Perl's -Dr command-line switch documented in Command Switches in perlrun.

  • Optimization of regular expressions

    This step is listed for completeness only. Since it does not change semantics, details of this step are not documented and are subject to change without notice. This step is performed over the finite automaton that was generated during the previous pass.

    It is at this stage that split() silently optimizes /^/ to mean /^/m .
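A quick way to observe this optimization (a sketch with an arbitrary sample string): splitting on /^/ breaks the text at the start of every line, as though /^/m had been written.

```perl
use strict;
use warnings;

my $text  = "one\ntwo\nthree\n";
my @lines = split /^/, $text;    # behaves like split /^/m, $text

print scalar(@lines), "\n";      # prints 3
```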

I/O Operators

There are several I/O operators you should know about.

A string enclosed by backticks (grave accents) first undergoes double-quote interpolation. It is then interpreted as an external command, and the output of that command is the value of the backtick string, like in a shell. In scalar context, a single string consisting of all output is returned. In list context, a list of values is returned, one per line of output. (You can set $/ to use a different line terminator.) The command is executed each time the pseudo-literal is evaluated. The status value of the command is returned in $? (see perlvar for the interpretation of $? ). Unlike in csh, no translation is done on the return data--newlines remain newlines. Unlike in any of the shells, single quotes do not hide variable names in the command from interpretation. To pass a literal dollar-sign through to the shell you need to hide it with a backslash. The generalized form of backticks is qx//. (Because backticks always undergo shell expansion as well, see perlsec for security concerns.)
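The scalar and list behaviors described above can be sketched as follows. This assumes a POSIX-style shell with an echo command, so it is not portable to every platform:

```perl
use strict;
use warnings;

# Assumption: a POSIX-style shell is available to run "echo".
my $all    = `echo one; echo two`;   # scalar context: all output as one string
my @lines  = `echo one; echo two`;   # list context: one element per line
my $status = $? >> 8;                # exit status of the most recent command

print scalar(@lines), " lines, exit status $status\n";
```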

In scalar context, evaluating a filehandle in angle brackets yields the next line from that file (the newline, if any, included), or undef at end-of-file or on error. When $/ is set to undef (sometimes known as file-slurp mode) and the file is empty, it returns '' the first time, followed by undef subsequently.

Ordinarily you must assign the returned value to a variable, but there is one situation where an automatic assignment happens. If and only if the input symbol is the only thing inside the conditional of a while statement (even if disguised as a for(;;) loop), the value is automatically assigned to the global variable $_, destroying whatever was there previously. (This may seem like an odd thing to you, but you'll use the construct in almost every Perl script you write.) The $_ variable is not implicitly localized. You'll have to put a local $_; before the loop if you want that to happen.

The following lines are equivalent:

    while (defined($_ = <STDIN>)) { print; }
    while ($_ = <STDIN>) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while defined($_ = <STDIN>);
    print while ($_ = <STDIN>);
    print while <STDIN>;

This also behaves similarly, but assigns to a lexical variable instead of to $_ :

    while (my $line = <STDIN>) { print $line }

In these loop constructs, the assigned value (whether assignment is automatic or explicit) is then tested to see whether it is defined. The defined test avoids problems where the line has a string value that would be treated as false by Perl; for example a "" or a "0" with no trailing newline. If you really mean for such values to terminate the loop, they should be tested for explicitly:

    while (($_ = <STDIN>) ne '0') { ... }
    while (<STDIN>) { last unless $_; ... }

In other boolean contexts, <FILEHANDLE> without an explicit defined test or comparison elicits a warning if the use warnings pragma or the -w command-line switch (the $^W variable) is in effect.
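To see why the implicit defined test matters, the following sketch reads the same data twice from an in-memory filehandle (the data and counter names are illustrative). The last line is "0" with no trailing newline, which is false as a string but still defined:

```perl
use strict;
use warnings;

my $data = "first\n0";    # the last line is "0" with no trailing newline

# Assignment inside while(): Perl adds the defined() test, so both lines are seen.
open(my $fh, '<', \$data) or die "can't open in-memory file: $!";
my $with_defined = 0;
while (my $line = <$fh>) { $with_defined++ }
close $fh;

# Plain truth test: the loop stops early at the false value "0".
open($fh, '<', \$data) or die "can't open in-memory file: $!";
my $with_truth = 0;
while (1) {
    my $line = <$fh>;
    last unless $line;
    $with_truth++;
}
close $fh;

print "$with_defined $with_truth\n";    # prints "2 1"
```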

The filehandles STDIN, STDOUT, and STDERR are predefined. (The filehandles stdin , stdout , and stderr will also work except in packages, where they would be interpreted as local identifiers rather than global.) Additional filehandles may be created with the open() function, amongst others. See perlopentut and open for details on this.

If a <FILEHANDLE> is used in a context that is looking for a list, a list comprising all input lines is returned, one line per list element. It's easy to grow to a rather large data space this way, so use with care.

<FILEHANDLE> may also be spelled readline(*FILEHANDLE). See readline.

The null filehandle <> is special: it can be used to emulate the behavior of sed and awk, and any other Unix filter program that takes a list of filenames, doing the same to each line of input from all of them. Input from <> comes either from standard input, or from each file listed on the command line. Here's how it works: the first time <> is evaluated, the @ARGV array is checked, and if it is empty, $ARGV[0] is set to "-", which when opened gives you standard input. The @ARGV array is then processed as a list of filenames. The loop

    while (<>) {
        ...             # code for each line
    }

is equivalent to the following Perl-like pseudo code:

    unshift(@ARGV, '-') unless @ARGV;
    while ($ARGV = shift) {
        open(ARGV, $ARGV);
        while (<ARGV>) {
            ...         # code for each line
        }
    }

except that it isn't so cumbersome to say, and will actually work. It really does shift the @ARGV array and put the current filename into the $ARGV variable. It also uses filehandle ARGV internally. <> is just a synonym for <ARGV>, which is magical. (The pseudo code above doesn't work because it treats <ARGV> as non-magical.)

Since the null filehandle uses the two argument form of open it interprets special characters, so if you have a script like this:

    while (<>) {
        print;
    }

and call it with perl dangerous.pl 'rm -rfv *|' , it actually opens a pipe, executes the rm command and reads rm 's output from that pipe. If you want all items in @ARGV to be interpreted as file names, you can use the module ARGV::readonly from CPAN.

You can modify @ARGV before the first <> as long as the array ends up containing the list of filenames you really want. Line numbers ($. ) continue as though the input were one big happy file. See the example in eof for how to reset line numbers on each file.
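The continuous line numbering can be observed with a small experiment. This sketch creates two temporary files with File::Temp (the file contents are invented for the example), assigns their names to @ARGV, and records $. for each line read through <>:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small input files to stand in for command-line arguments.
my @names;
for my $text ("a\nb\n", "c\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close($fh) or die "can't close $name: $!";
    push @names, $name;
}

@ARGV = @names;             # must happen before the first <>
my @seen;
while (<>) {
    chomp;
    push @seen, join(':', $., $_);
}

print "@seen\n";            # prints "1:a 2:b 3:c"
```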

If you want to set @ARGV to your own list of files, go right ahead. This sets @ARGV to all plain text files if no @ARGV was given:

    @ARGV = grep { -f && -T } glob('*') unless @ARGV;

You can even set them to pipe commands. For example, this automatically filters compressed arguments through gzip:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc < $_ |" : $_ } @ARGV;

If you want to pass switches into your script, you can use one of the Getopts modules or put a loop on the front like this:

    while ($_ = $ARGV[0], /^-/) {
        shift;
        last if /^--$/;
        if (/^-D(.*)/) { $debug = $1 }
        if (/^-v/)     { $verbose++  }
        # ...           # other switches
    }

    while (<>) {
        # ...           # code for each line
    }

The <> symbol will return undef for end-of-file only once. If you call it again after this, it will assume you are processing another @ARGV list, and if you haven't set @ARGV, will read input from STDIN.

If what the angle brackets contain is a simple scalar variable (for example, <$foo>), then that variable contains the name of the filehandle to input from, or its typeglob, or a reference to the same. For example:

    $fh = \*STDIN;
    $line = <$fh>;

If what's within the angle brackets is neither a filehandle nor a simple scalar variable containing a filehandle name, typeglob, or typeglob reference, it is interpreted as a filename pattern to be globbed, and either a list of filenames or the next filename in the list is returned, depending on context. This distinction is determined on syntactic grounds alone. That means <$x> is always a readline() from an indirect handle, but <$hash{key}> is always a glob(). That's because $x is a simple scalar variable, but $hash{key} is not--it's a hash element. Even <$x > (note the extra space) is treated as glob("$x ") , not readline($x).
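The syntactic distinction can be demonstrated with a short sketch. The in-memory filehandle and the hash contents are invented for the example; the file name deliberately contains no wildcard, in which case glob hands the pattern back unchanged:

```perl
use strict;
use warnings;

my $data = "line one\n";
open(my $x, '<', \$data) or die "can't open in-memory file: $!";
my $line = <$x>;                    # simple scalar variable: readline($x)

my %hash = (key => 'no_such_file.txt');
my ($name) = <$hash{key}>;          # hash element: glob($hash{key})

print $line;                        # prints "line one"
print "$name\n";                    # prints "no_such_file.txt"
```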

One level of double-quote interpretation is done first, but you can't say <$foo> because that's an indirect filehandle as explained in the previous paragraph. (In older versions of Perl, programmers would insert curly brackets to force interpretation as a filename glob: <${foo}> . These days, it's considered cleaner to call the internal function directly as glob($foo), which is probably the right way to have done it in the first place.) For example:

    while (<*.c>) {
        chmod 0644, $_;
    }

is roughly equivalent to:

    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
        chomp;
        chmod 0644, $_;
    }

except that the globbing is actually done internally using the standard File::Glob extension. Of course, the shortest way to do the above is:

    chmod 0644, <*.c>;

A (file)glob evaluates its (embedded) argument only when it is starting a new list. All values must be read before it will start over. In list context, this isn't important because you automatically get them all anyway. However, in scalar context the operator returns the next value each time it's called, or undef when the list has run out. As with filehandle reads, an automatic defined is generated when the glob occurs in the test part of a while , because legal glob returns (for example, a file called 0) would otherwise terminate the loop. Again, undef is returned only once. So if you're expecting a single value from a glob, it is much better to say

    ($file) = <blurch*>;

than

    $file = <blurch*>;

because the latter will alternate between returning a filename and returning false.
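The alternation is easy to reproduce. This sketch uses the glob() function form with a hypothetical wildcard-free name, which glob returns unchanged as a one-item list, and evaluates the same call site four times in scalar context:

```perl
use strict;
use warnings;

my @seen;
for my $try (1 .. 4) {
    my $file = glob('blurch.txt');      # scalar context, same call site each time
    push @seen, defined $file ? $file : 'undef';
}
print "@seen\n";    # prints "blurch.txt undef blurch.txt undef"

my ($first) = glob('blurch.txt');       # list context: always the first match
print "$first\n";   # prints "blurch.txt"
```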

If you're trying to do variable interpolation, it's definitely better to use the glob() function, because the older notation can cause people to become confused with the indirect filehandle notation.

    @files = glob("$dir/*.[ch]");
    @files = glob($files[$i]);

Constant Folding

Like C, Perl does a certain amount of expression evaluation at compile time whenever it determines that all arguments to an operator are static and have no side effects. In particular, string concatenation happens at compile time between literals that don't do variable substitution. Backslash interpolation also happens at compile time. You can say

    'Now is the time for all'
    . "\n"
    . 'good men to come to.'

and this all reduces to one string internally. Likewise, if you say

    foreach $file (@filenames) {
        if (-s $file > 5 + 100 * 2**16) {  }
    }

the compiler precomputes the number which that expression represents so that the interpreter won't have to.

No-ops

Perl doesn't officially have a no-op operator, but the bare constants 0 and 1 are special-cased not to produce a warning in void context, so you can for example safely do

    1 while foo();

Bitwise String Operators

Bitstrings of any size may be manipulated by the bitwise operators (~ | & ^).

If the operands to a binary bitwise op are strings of different sizes, | and ^ ops act as though the shorter operand had additional zero bits on the right, while the & op acts as though the longer operand were truncated to the length of the shorter. The granularity for such extension or truncation is one or more bytes.

    # ASCII-based examples
    print "j p \n" ^ " a h";            # prints "JAPH\n"
    print "JA" | "  ph\n";              # prints "japh\n"
    print "japh\nJunk" & '_____';       # prints "JAPH\n";
    print 'p N$' ^ " E<H\n";            # prints "Perl\n";
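The padding and truncation rules for operands of different lengths can be seen in isolation with a few one-character operands. This is an ASCII-based sketch with values chosen for illustration:

```perl
use strict;
use warnings;

my $or  = "AB"  | " ";   # shorter operand padded with zero bits: "aB"
my $xor = "ab"  ^ " ";   # same padding rule for ^: "Ab"
my $and = "abc" & "_";   # longer operand truncated to one character: "A"

print "$or $xor $and\n"; # prints "aB Ab A"
```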

If you are intending to manipulate bitstrings, be certain that you're supplying bitstrings: If an operand is a number, that will imply a numeric bitwise operation. You may explicitly show which type of operation you intend by using "" or 0+ , as in the examples below.

    $foo =  150  |  105;        # yields 255  (0x96 | 0x69 is 0xFF)
    $foo = '150' |  105;        # yields 255
    $foo =  150  | '105';       # yields 255
    $foo = '150' | '105';       # yields string '155' (under ASCII)

    $baz = 0+$foo & 0+$bar;     # both ops explicitly numeric
    $biz = "$foo" ^ "$bar";     # both ops explicitly stringy

See vec for information on how to manipulate individual bits in a bit vector.

Integer Arithmetic

By default, Perl assumes that it must do most of its arithmetic in floating point. But by saying

    use integer;

you may tell the compiler to use integer operations (see integer for a detailed explanation) from here to the end of the enclosing BLOCK. An inner BLOCK may countermand this by saying

    no integer;

which lasts until the end of that BLOCK. Note that this doesn't mean everything is an integer, merely that Perl will use integer operations for arithmetic, comparison, and bitwise operators. For example, even under use integer , if you take the sqrt(2), you'll still get 1.4142135623731 or so.

Used on numbers, the bitwise operators ("&", "|", "^", "~", "<<", and ">>") always produce integral results. (But see also Bitwise String Operators.) However, use integer still has meaning for them. By default, their results are interpreted as unsigned integers, but if use integer is in effect, their results are interpreted as signed integers. For example, ~0 usually evaluates to a large integral value. However, use integer; ~0 is -1 on two's-complement machines.
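A short sketch of the points above, assuming a two's-complement machine (the variable names are illustrative):

```perl
use strict;
use warnings;

my $unsigned = ~0;            # a large unsigned value by default
my ($signed, $quotient, $root);
{
    use integer;
    $signed   = ~0;           # interpreted as signed: -1
    $quotient = 10 / 3;       # integer division: 3
    $root     = sqrt(2);      # sqrt still returns a floating-point value
}

print "$signed $quotient\n";  # prints "-1 3"
```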

Floating-point Arithmetic

While use integer provides integer-only arithmetic, there is no analogous mechanism to provide automatic rounding or truncation to a certain number of decimal places. For rounding to a certain number of digits, sprintf() or printf() is usually the easiest route. See perlfaq4.
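For instance, rounding to a fixed number of decimal places with sprintf might look like this (a minimal sketch; the inputs are arbitrary and avoid exact halfway cases, whose rounding depends on the binary representation):

```perl
use strict;
use warnings;

my $two_places   = sprintf("%.2f", 3.14159);   # "3.14"
my $three_places = sprintf("%.3f", 2 / 3);     # "0.667"

print "$two_places $three_places\n";           # prints "3.14 0.667"
```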

Floating-point numbers are only approximations to what a mathematician would call real numbers. There are infinitely more reals than floats, so some corners must be cut. For example:

    printf "%.20g\n", 123456789123456789;
    #        produces 123456789123456784

Testing for exact floating-point equality or inequality is not a good idea. Here's a (relatively expensive) work-around to compare whether two floating-point numbers are equal to a particular number of decimal places. See Knuth, volume II, for a more robust treatment of this topic.

    sub fp_equal {
        # Compare $X and $Y after formatting both to $POINTS significant digits.
        my ($X, $Y, $POINTS) = @_;
        my ($tX, $tY);
        $tX = sprintf("%.${POINTS}g", $X);
        $tY = sprintf("%.${POINTS}g", $Y);
        return $tX eq $tY;
    }
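A different and cheaper approach, not the one shown above, is to compare against an absolute tolerance. The helper name nearly_equal and the tolerance 1e-9 are assumptions made for this sketch; an absolute tolerance only suits values of known, moderate magnitude:

```perl
use strict;
use warnings;

# Hypothetical helper: true when $x and $y differ by at most $eps.
sub nearly_equal {
    my ($x, $y, $eps) = @_;
    return abs($x - $y) <= $eps;
}

my $sum = 0.1 + 0.2;   # usually not exactly 0.3 in binary floating point
print nearly_equal($sum, 0.3, 1e-9) ? "close enough\n" : "different\n";
# prints "close enough"
```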

The POSIX module (part of the standard perl distribution) implements ceil(), floor(), and other mathematical and trigonometric functions. The Math::Complex module (part of the standard perl distribution) defines mathematical functions that work on both the reals and the imaginary numbers. Math::Complex is not as efficient as POSIX, but POSIX can't work with complex numbers.

Rounding in financial applications can have serious implications, and the rounding method used should be specified precisely. In these cases, it probably pays not to trust whichever system rounding is being used by Perl, but to instead implement the rounding function you need yourself.

Bigger Numbers

The standard Math::BigInt , Math::BigRat , and Math::BigFloat modules, along with the bignum , bigint , and bigrat pragmas, provide variable-precision arithmetic and overloaded operators, although they're currently pretty slow. At the cost of some space and considerable speed, they avoid the normal pitfalls associated with limited-precision representations.

    use 5.010;
    use bigint;  # easy interface to Math::BigInt
    $x = 123456789123456789;
    say $x * $x;
    # prints 15241578780673678515622620750190521

Or with rationals:

    use 5.010;
    use bigrat;
    $a = 3/22;
    $b = 4/6;
    say "a/b is ", $a/$b;
    say "a*b is ", $a*$b;
    # prints:
    #   a/b is 9/44
    #   a*b is 1/11

Several modules let you calculate with unlimited precision (bounded only by memory and CPU time) or with fixed precision. There are also some non-standard modules that provide faster implementations via external C libraries.

Here is a short, but incomplete summary:

    Math::String            treat string sequences like numbers
    Math::FixedPrecision    calculate with a fixed precision
    Math::Currency          for currency calculations
    Bit::Vector             manipulate bit vectors fast (uses C)
    Math::BigIntFast        Bit::Vector wrapper for big numbers
    Math::Pari              provides access to the Pari C library
    Math::Cephes            uses the external Cephes C library (no
                            big numbers)
    Math::Cephes::Fraction  fractions via the Cephes library
    Math::GMP               another one using an external C library
    Math::GMPz              an alternative interface to libgmp's big ints
    Math::GMPq              an interface to libgmp's fraction numbers
    Math::GMPf              an interface to libgmp's floating point numbers
Choose wisely.

 
perlopenbsd

NAME

perlopenbsd - Perl version 5 on OpenBSD systems

DESCRIPTION

This document describes various features of OpenBSD that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

OpenBSD core dumps from getprotobyname_r and getservbyname_r with ithreads

When Perl is configured to use ithreads, it will use re-entrant library calls in preference to non-re-entrant versions. There is an incompatibility in OpenBSD's getprotobyname_r and getservbyname_r functions in versions 3.7 and later that will cause a SEGV when they are called without first doing a bzero on their return structs. Current Perls should handle this problem correctly. Older threaded Perls (5.8.6 or earlier) will run into this problem. If you want to run a threaded Perl on OpenBSD 3.7 or higher, you will need to upgrade to at least Perl 5.8.7.

AUTHOR

Steve Peters <steve@fisharerojo.org>

Please report any errors, updates, or suggestions to perlbug@perl.org.

 
perlopentut

NAME

perlopentut - tutorial on opening things in Perl

DESCRIPTION

Perl has two simple, built-in ways to open files: the shell way for convenience, and the C way for precision. The shell way also has 2- and 3-argument forms, which have different semantics for handling the filename. The choice is yours.

Open à la shell

Perl's open function was designed to mimic the way command-line redirection in the shell works. Here are some basic examples from the shell:

    $ myprogram file1 file2 file3
    $ myprogram    <  inputfile
    $ myprogram    >  outputfile
    $ myprogram    >> outputfile
    $ myprogram    |  otherprogram
    $ otherprogram |  myprogram

And here are some more advanced examples:

    $ otherprogram      | myprogram f1 - f2
    $ otherprogram 2>&1 | myprogram -
    $ myprogram     <&3
    $ myprogram     >&4

Programmers accustomed to constructs like those above can take comfort in learning that Perl directly supports these familiar constructs using virtually the same syntax as the shell.

Simple Opens

The open function takes two arguments: the first is a filehandle, and the second is a single string comprising both what to open and how to open it. open returns true when it works, and when it fails, returns a false value and sets the special variable $! to reflect the system error. If the filehandle was previously opened, it will be implicitly closed first.

For example:

    open(INFO,    "datafile")    || die("can't open datafile: $!");
    open(INFO,    "< datafile")  || die("can't open datafile: $!");
    open(RESULTS, "> runstats")  || die("can't open runstats: $!");
    open(LOG,     ">> logfile ") || die("can't open logfile: $!");

If you prefer the low-punctuation version, you could write that this way:

    open INFO,    "< datafile"  or die "can't open datafile: $!";
    open RESULTS, "> runstats"  or die "can't open runstats: $!";
    open LOG,     ">> logfile " or die "can't open logfile: $!";

A few things to notice. First, the leading < is optional. If omitted, Perl assumes that you want to open the file for reading.

Note also that the first example uses the || logical operator, and the second uses or , which has lower precedence. Using || in the latter examples would effectively mean

    open INFO, ( "< datafile" || die "can't open datafile: $!" );

which is definitely not what you want.

The other important thing to notice is that, just as in the shell, any whitespace before or after the filename is ignored. This is good, because you wouldn't want these to do different things:

    open INFO, "<datafile"
    open INFO, "< datafile"
    open INFO, "<  datafile"

Ignoring surrounding whitespace also helps for when you read a filename in from a different file, and forget to trim it before opening:

    $filename = <INFO>;         # oops, \n still there
    open(EXTRA, "< $filename") || die "can't open $filename: $!";

This is not a bug, but a feature. Because open mimics the shell in its style of using redirection arrows to specify how to open the file, it also does so with respect to extra whitespace around the filename itself as well. For accessing files with naughty names, see Dispelling the Dweomer.

There is also a 3-argument version of open, which lets you put the special redirection characters into their own argument:

    open( INFO, ">", $datafile ) || die "Can't create $datafile: $!";

In this case, the filename to open is the actual string in $datafile , so you don't have to worry about $datafile containing characters that might influence the open mode, or whitespace at the beginning of the filename that would be absorbed in the 2-argument version. Also, any reduction of unnecessary string interpolation is a good thing.
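The practical difference shows up with a file name that ends in whitespace. The sketch below assumes a filesystem that permits such names (typical on Unix, not on Windows) and uses a hypothetical name, "report " with a trailing space, inside a temporary directory:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/report ";      # hypothetical name ending in a space

# 3-argument form: the name is used exactly as given.
open(my $out, '>', $path) or die "can't create '$path': $!";
print {$out} "data\n";
close($out) or die "can't close '$path': $!";

# 2-argument form strips the trailing space and so looks for "report".
my $two_arg_ok   = open(my $in2, "< $path")   ? 1 : 0;
my $three_arg_ok = open(my $in3, '<', $path)  ? 1 : 0;

print "$two_arg_ok $three_arg_ok\n";    # prints "0 1"
```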

Indirect Filehandles

open's first argument can be a reference to a filehandle. As of perl 5.6.0, if the argument is uninitialized, Perl will automatically create a filehandle and put a reference to it in the first argument, like so:

    open( my $in, $infile ) or die "Couldn't read $infile: $!";
    while ( <$in> ) {
        # do something with $_
    }
    close $in;

Indirect filehandles make namespace management easier. Since filehandles are global to the current package, two subroutines trying to open INFILE will clash. With two functions opening indirect filehandles like my $infile , there's no clash and no need to worry about future conflicts.

Another convenient behavior is that an indirect filehandle automatically closes when there are no more references to it:

    sub firstline {
        open( my $in, shift ) && return scalar <$in>;
        # no close() required
    }

Indirect filehandles also make it easy to pass filehandles to and return filehandles from subroutines:

    for my $file ( qw(this.conf that.conf) ) {
        my $fin = open_or_throw('<', $file);
        process_conf( $fin );
        # no close() needed
    }

    use Carp;
    sub open_or_throw {
        my ($mode, $filename) = @_;
        open my $h, $mode, $filename
            or croak "Could not open '$filename': $!";
        return $h;
    }

Pipe Opens

In C, when you want to open a file using the standard I/O library, you use the fopen function, but when opening a pipe, you use the popen function. But in the shell, you just use a different redirection character. That's also the case for Perl. The open call remains the same--just its argument differs.

If the leading character is a pipe symbol, open starts up a new command and opens a write-only filehandle leading into that command. This lets you write into that handle and have what you write show up on that command's standard input. For example:

    open(PRINTER, "| lpr -Plp1")    || die "can't run lpr: $!";
    print PRINTER "stuff\n";
    close(PRINTER)                  || die "can't close lpr: $!";

If the trailing character is a pipe, you start up a new command and open a read-only filehandle leading out of that command. This lets whatever that command writes to its standard output show up on your handle for reading. For example:

    open(NET, "netstat -i -n |")    || die "can't fork netstat: $!";
    while (<NET>) { }               # do something with input
    close(NET)                      || die "can't close netstat: $!";

What happens if you try to open a pipe to or from a non-existent command? If possible, Perl will detect the failure and set $! as usual. But if the command contains special shell characters, such as > or * , called 'metacharacters', Perl does not execute the command directly. Instead, Perl runs the shell, which then tries to run the command. This means that it's the shell that gets the error indication. In such a case, the open call will only indicate failure if Perl can't even run the shell. See How can I capture STDERR from an external command? in perlfaq8 to see how to cope with this. There's also an explanation in perlipc.

If you would like to open a bidirectional pipe, the IPC::Open2 library will handle this for you. Check out Bidirectional Communication with Another Process in perlipc.

perl-5.6.x introduced a version of piped open that executes a process based on its command line arguments without relying on the shell. (Similar to the system(@LIST) notation.) This is safer and faster than executing a single argument pipe-command, but does not allow special shell constructs. (It is also not supported on Microsoft Windows, Mac OS Classic or RISC OS.)

Here's an example of open '-|' , which prints a random Unix fortune cookie as uppercase:

    my $collection = shift(@ARGV);
    open my $fortune, '-|', 'fortune', $collection
        or die "Could not find fortune - $!";
    while (<$fortune>)
    {
        print uc($_);
    }
    close($fortune);

And this open '|-' pipes into lpr:

    open my $printer, '|-', 'lpr', '-Plp1'
        or die "can't run lpr: $!";
    print {$printer} "stuff\n";
    close($printer)
        or die "can't close lpr: $!";
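For a list-form pipe open that does not depend on fortune or lpr being installed, the running perl itself can serve as the child command. This is a sketch only; it assumes a platform where list-form pipe opens are supported, and the child program text is made up for the example:

```perl
use strict;
use warnings;

my $perl = $^X;     # path of the currently running perl

# List form: the command and its arguments are passed without a shell.
open(my $child, '-|', $perl, '-e', 'print "hello from child\n"')
    or die "can't run $perl: $!";
my $line = <$child>;
close($child) or die "child failed (status $?)";

print $line;        # prints "hello from child"
```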

The Minus File

Again following the lead of the standard shell utilities, Perl's open function treats a file whose name is a single minus, "-", in a special way. If you open minus for reading, it really means to access the standard input. If you open minus for writing, it really means to access the standard output.

If minus can be used as the default input or default output, what happens if you open a pipe into or out of minus? What's the default command it would run? The same script as you're currently running! This is actually a stealth fork hidden inside an open call. See Safe Pipe Opens in perlipc for details.
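The implicit fork can be sketched as follows. This assumes a platform with a real fork (for example Unix): open returns the child's PID in the parent, 0 in the child, and undef on failure, and whatever the child prints on STDOUT is read by the parent through the handle.

```perl
use strict;
use warnings;

my $pid = open(my $kid, '-|');          # forks; child's STDOUT feeds $kid
die "can't fork: $!" unless defined $pid;

if ($pid == 0) {                        # child process
    print "hello from the child\n";
    exit 0;
}

my $message = <$kid>;                   # parent process
close($kid) or die "child failed (status $?)";
print "parent read: $message";          # prints "parent read: hello from the child"
```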

Mixing Reads and Writes

It is possible to specify both read and write access. All you do is add a "+" symbol in front of the redirection. But as in the shell, using a less-than on a file never creates a new file; it only opens an existing one. On the other hand, using a greater-than always clobbers (truncates to zero length) an existing file, or creates a brand-new one if there isn't an old one. Adding a "+" for read-write doesn't affect whether it only works on existing files or always clobbers existing ones.

    open(WTMP, "+< /usr/adm/wtmp")
        || die "can't open /usr/adm/wtmp: $!";
    open(SCREEN, "+> lkscreen")
        || die "can't open lkscreen: $!";
    open(LOGFILE, "+>> /var/log/applog")
        || die "can't open /var/log/applog: $!";

The first one won't create a new file, and the second one will always clobber an old one. The third one will create a new file if necessary and not clobber an old one, and it will allow you to read at any point in the file, but all writes will always go to the end. In short, the first case is substantially more common than the second and third cases, which are almost always wrong. (If you know C, the plus in Perl's open is historically derived from the one in C's fopen(3S), which it ultimately calls.)
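Here's a small sketch of the common "+<" case: patching a few bytes in place in a file that already exists. It uses the three-argument form of open and a scratch file from File::Temp so that it can be run anywhere; the sub name patch_bytes and the sample data are invented for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: overwrite bytes at $offset in an existing file,
# then return the file's full contents.
sub patch_bytes {
    my ($path, $offset, $bytes) = @_;
    open(my $fh, "+<", $path)   or die "can't open $path for update: $!";
    binmode($fh);
    seek($fh, $offset, 0)       or die "can't seek in $path: $!";
    print {$fh} $bytes          or die "can't write to $path: $!";
    seek($fh, 0, 0)             or die "can't rewind $path: $!";
    local $/;                   # slurp mode: read the whole file at once
    my $content = <$fh>;
    close($fh)                  or die "can't close $path: $!";
    return $content;
}

# Build a scratch file, then patch its middle four bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "AAAABBBBCCCC";
close($tmp) or die "can't close $path: $!";

print patch_bytes($path, 4, "XXXX"), "\n";    # prints AAAAXXXXCCCC
```

Note the seek between writing and reading: when you switch direction on a read-write handle, you need a seek in between.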

In fact, when it comes to updating a file, unless you're working on a binary file as in the WTMP case above, you probably don't want to use this approach for updating. Instead, Perl's -i flag comes to the rescue. The following command takes all the C, C++, or yacc source or header files and changes all their foo's to bar's, leaving the old version in the original filename with a ".orig" tacked on the end:

    $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy]

This is a short cut for some renaming games that are really the best way to update textfiles. See the second question in perlfaq5 for more details.

Filters

One of the most common uses for open is one you never even notice. When you process the ARGV filehandle using <ARGV>, Perl actually does an implicit open on each file in @ARGV. Thus a program called like this:

    $ myprogram file1 file2 file3

can have all its files opened and processed one at a time using a construct no more complex than:

    while (<>) {
        # do something with $_
    }

If @ARGV is empty when the loop first begins, Perl pretends you've opened up minus, that is, the standard input. In fact, $ARGV, the currently open file during <ARGV> processing, is even set to "-" in these circumstances.

You are welcome to pre-process your @ARGV before starting the loop to make sure it's to your liking. One reason to do this might be to remove command options beginning with a minus. While you can always roll the simple ones by hand, the Getopt modules are good for this:

    use Getopt::Std;
    # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o
    getopts("vDo:");
    # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o}
    getopts("vDo:", \%args);

Or the standard Getopt::Long module to permit named arguments:

    use Getopt::Long;
    GetOptions( "verbose"  => \$verbose,    # --verbose
                "Debug"    => \$debug,      # --Debug
                "output=s" => \$output );
        # --output=somestring or --output somestring

Another reason for preprocessing arguments is to make an empty argument list default to all files:

    @ARGV = glob("*") unless @ARGV;

You could even filter out all but plain text files. This is a bit silent, of course, and you might prefer to mention them on the way.

    @ARGV = grep { -f && -T } @ARGV;
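A less silent variant of that filter might look like the sketch below. The sub name plain_text_files is hypothetical; it simply warns about each argument it drops before handing back the ones it keeps.

```perl
use strict;
use warnings;

# Hypothetical helper: keep plain text files, warn about everything else.
sub plain_text_files {
    my @keep;
    for my $file (@_) {
        if (-f $file && -T $file) {
            push @keep, $file;
        }
        else {
            warn "skipping $file: not a plain text file\n";
        }
    }
    return @keep;
}

@ARGV = plain_text_files(@ARGV);
```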

If you're using the -n or -p command-line options, you should put changes to @ARGV in a BEGIN{} block.

Remember that a normal open has special properties, in that it might call fopen(3S) or it might call popen(3S), depending on what its argument looks like; that's why it's sometimes called "magic open". Here's an example:

    $pwdinfo = `domainname` =~ /^(\(none\))?$/
                    ? '< /etc/passwd'
                    : 'ypcat passwd |';
    open(PWD, $pwdinfo)
        or die "can't open $pwdinfo: $!";

This sort of thing also comes into play in filter processing. Because <ARGV> processing employs the normal, shell-style Perl open, it respects all the special things we've already seen:

    $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

That program will read from the file f1, the process cmd1, standard input (tmpfile in this case), the f2 file, the cmd2 command, and finally the f3 file.

Yes, this also means that if you have files named "-" (and so on) in your directory, they won't be processed as literal files by open. You'll need to pass them as "./-", much as you would for the rm program, or you could use sysopen as described below.

One of the more interesting applications is to change files of a certain name into pipes. For example, to autoprocess gzipped or compressed files by decompressing them with gzip:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_ } @ARGV;

Or, if you have the GET program installed from LWP, you can fetch URLs before processing them:

    @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV;

It's not for nothing that this is called magic <ARGV>. Pretty nifty, eh?

Open à la C

If you want the convenience of the shell, then Perl's open is definitely the way to go. On the other hand, if you want finer precision than C's simplistic fopen(3S) provides you should look to Perl's sysopen, which is a direct hook into the open(2) system call. That does mean it's a bit more involved, but that's the price of precision.

sysopen takes 3 (or 4) arguments.

    sysopen HANDLE, PATH, FLAGS, [MASK]

The HANDLE argument is a filehandle just as with open. The PATH is a literal path, one that doesn't pay attention to any greater-thans or less-thans or pipes or minuses, nor ignore whitespace. If it's there, it's part of the path. The FLAGS argument contains one or more values derived from the Fcntl module that have been or'd together using the bitwise "|" operator. The final argument, the MASK, is optional; if present, it is combined with the user's current umask for the creation mode of the file. You should usually omit this.

Although the traditional values of read-only, write-only, and read-write are 0, 1, and 2 respectively, this is known not to hold true on some systems. Instead, it's best to load in the appropriate constants first from the Fcntl module, which supplies the following standard flags:

    O_RDONLY        Read only
    O_WRONLY        Write only
    O_RDWR          Read and write
    O_CREAT         Create the file if it doesn't exist
    O_EXCL          Fail if the file already exists
    O_APPEND        Append to the file
    O_TRUNC         Truncate the file
    O_NONBLOCK      Non-blocking access

Less common flags that are sometimes available on some operating systems include O_BINARY, O_TEXT, O_SHLOCK, O_EXLOCK, O_DEFER, O_SYNC, O_ASYNC, O_DSYNC, O_RSYNC, O_NOCTTY, O_NDELAY and O_LARGEFILE. Consult your open(2) manpage or its local equivalent for details. (Note: starting from Perl release 5.6 the O_LARGEFILE flag, if available, is automatically added to the sysopen() flags because large files are the default.)

Here's how to use sysopen to emulate the simple open calls we had before. We'll omit the || die $! checks for clarity, but make sure you always check the return values in real code. These aren't quite the same, since open will trim leading and trailing whitespace, but you'll get the idea.

To open a file for reading:

    open(FH, "< $path");
    sysopen(FH, $path, O_RDONLY);

To open a file for writing, creating a new file if needed or else truncating an old file:

    open(FH, "> $path");
    sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT);

To open a file for appending, creating one if necessary:

    open(FH, ">> $path");
    sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT);

To open a file for update, where the file must already exist:

    open(FH, "+< $path");
    sysopen(FH, $path, O_RDWR);

And here are things you can do with sysopen that you cannot do with a regular open. As you'll see, it's just a matter of controlling the flags in the third argument.

To open a file for writing, creating a new file which must not previously exist:

    sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT);

To open a file for appending, where that file must already exist:

    sysopen(FH, $path, O_WRONLY | O_APPEND);

To open a file for update, creating a new file if necessary:

    sysopen(FH, $path, O_RDWR | O_CREAT);

To open a file for update, where that file must not already exist:

    sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT);

To open a file without blocking, creating one if necessary:

    sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT);
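With the error checks put back in, the "must not previously exist" case might be wrapped up as in this sketch. The sub name create_new is hypothetical; the point is that O_EXCL lets you tell "somebody else got there first" (EEXIST) apart from a real failure.

```perl
use strict;
use warnings;
use Fcntl;                  # O_WRONLY, O_EXCL, O_CREAT
use Errno qw(EEXIST);

# Hypothetical helper: create $path only if it doesn't exist yet.
# Returns a write handle, or nothing if the file was already there.
sub create_new {
    my ($path) = @_;
    if (sysopen(my $fh, $path, O_WRONLY | O_EXCL | O_CREAT)) {
        return $fh;
    }
    return if $! == EEXIST;         # someone else created it first
    die "can't create $path: $!";   # any other failure is a real error
}
```

Because the existence check and the creation happen in a single system call, there is no window in which another process can sneak in between the two.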

Permissions à la mode

If you omit the MASK argument to sysopen, Perl uses the octal value 0666. The normal MASK to use for executables and directories should be 0777, and for anything else, 0666.

Why so permissive? Well, it isn't really. The MASK will be modified by your process's current umask. A umask is a number representing disabled permissions bits; that is, bits that will not be turned on in the created file's permissions field.

For example, if your umask were 027, then the 020 part would disable the group from writing, and the 007 part would disable others from reading, writing, or executing. Under these conditions, passing sysopen 0666 would create a file with mode 0640, since 0666 & ~027 is 0640.
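The arithmetic is easy to check for yourself. In this sketch the sub name effective_mode is made up; it just applies the same "MASK and-not umask" rule described above.

```perl
use strict;
use warnings;

# Hypothetical helper: the permissions a new file ends up with, given
# the MASK passed to sysopen and the process's umask.
sub effective_mode {
    my ($mask, $umask) = @_;
    return $mask & ~$umask & 07777;
}

printf "%04o\n", effective_mode(0666, 027);    # prints 0640
printf "%04o\n", effective_mode(0777, 022);    # prints 0755
```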

You should seldom use the MASK argument to sysopen(). That takes away the user's freedom to choose what permission new files will have. Denying choice is almost always a bad thing. One exception would be for cases where sensitive or private data is being stored, such as with mail folders, cookie files, and internal temporary files.

Obscure Open Tricks

Re-Opening Files (dups)

Sometimes you already have a filehandle open, and want to make another handle that's a duplicate of the first one. In the shell, we place an ampersand in front of a file descriptor number when doing redirections. For example, 2>&1 makes descriptor 2 (that's STDERR in Perl) be redirected into descriptor 1 (which is usually Perl's STDOUT). The same is essentially true in Perl: a filename that begins with an ampersand is treated instead as a file descriptor if a number, or as a filehandle if a string.

    open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!";
    open(MHCONTEXT, "<&4")     || die "couldn't dup fd4: $!";

That means that if a function is expecting a filename, but you don't want to give it a filename because you already have the file open, you can just pass the filehandle with a leading ampersand. It's best to use a fully qualified handle though, just in case the function happens to be in a different package:

    somefunction("&main::LOGFILE");

This way if somefunction() is planning on opening its argument, it can just use the already opened handle. This differs from passing a handle, because with a handle, you don't open the file. Here you have something you can pass to open.

If you have one of those tricky, newfangled I/O objects that the C++ folks are raving about, then this doesn't work because those aren't a proper filehandle in the native Perl sense. You'll have to use fileno() to pull out the proper descriptor number, assuming you can:

    use IO::Socket;
    $handle = IO::Socket::INET->new("www.perl.com:80");
    $fd = $handle->fileno;
    somefunction("&$fd");  # not an indirect function call

It can be easier (and certainly will be faster) just to use real filehandles though:

    use IO::Socket;
    local *REMOTE = IO::Socket::INET->new("www.perl.com:80");
    die "can't connect" unless defined(fileno(REMOTE));
    somefunction("&main::REMOTE");

If the filehandle or descriptor number is preceded not just with a simple "&" but rather with a "&=" combination, then Perl will not create a completely new descriptor opened to the same place using the dup(2) system call. Instead, it will just make something of an alias to the existing one using the fdopen(3S) library call. This is slightly more parsimonious of system resources, although this is less of a concern these days. Here's an example of that:

    $fd = $ENV{"MHCONTEXTFD"};
    open(MHCONTEXT, "<&=$fd") or die "couldn't fdopen $fd: $!";

If you're using magic <ARGV>, you could even pass in as a command line argument in @ARGV something like "<&=$MHCONTEXTFD", but we've never seen anyone actually do this.
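A typical use for a dup is to stash STDOUT away, point it somewhere else for a while, and then put it back. The following is a sketch only: the sub name with_stdout_to is invented, and it uses the three-argument ">&" form with a lexical handle rather than the bareword style above.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code with STDOUT sent to $path, then put
# the original STDOUT back.
sub with_stdout_to {
    my ($path, $code) = @_;

    # ">&" dups the descriptor, so $saved survives the reopen below.
    open(my $saved, ">&", \*STDOUT) or die "can't dup STDOUT: $!";
    open(STDOUT, ">", $path)        or die "can't redirect STDOUT to $path: $!";

    $code->();

    # Reopening STDOUT flushes and closes the file, then restores the dup.
    open(STDOUT, ">&", $saved)      or die "can't restore STDOUT: $!";
    close($saved)                   or die "can't close saved handle: $!";
    return;
}
```

You might call it as with_stdout_to("report.txt", sub { print "captured\n" }). If the code reference dies, this sketch leaves STDOUT redirected; a sturdier version would wrap the call in an eval.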

Dispelling the Dweomer

Perl is more of a DWIMmer language than something like Java--where DWIM is an acronym for "do what I mean". But this principle sometimes leads to more hidden magic than one knows what to do with. In this way, Perl is also filled with dweomer, an obscure word meaning an enchantment. Sometimes, Perl's DWIMmer is just too much like dweomer for comfort.

If magic open is a bit too magical for you, you don't have to turn to sysopen. To open a file with arbitrary weird characters in it, it's necessary to protect any leading and trailing whitespace. Leading whitespace is protected by inserting a "./" in front of a filename that starts with whitespace. Trailing whitespace is protected by appending an ASCII NUL byte ("\0") at the end of the string.

    $file =~ s#^(\s)#./$1#;
    open(FH, "< $file\0") || die "can't open $file: $!";

This assumes, of course, that your system considers dot the current working directory, slash the directory separator, and disallows ASCII NULs within a valid filename. Most systems follow these conventions, including all POSIX systems as well as proprietary Microsoft systems. The only vaguely popular system that doesn't work this way is the "Classic" Macintosh system, which uses a colon where the rest of us use a slash. Maybe sysopen isn't such a bad idea after all.
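Another way to sidestep the magic, not covered in the text above, is the three-argument form of open that appears later in the IO Layers section: because the mode is passed separately, the path is taken literally. A minimal sketch, with a made-up sub name open_literal:

```perl
use strict;
use warnings;

# Hypothetical helper: open $path for reading exactly as named.
sub open_literal {
    my ($path) = @_;
    # With the mode in its own argument, leading or trailing whitespace,
    # "-", "|", ">" and friends in $path are not interpreted.
    open(my $fh, "<", $path) or die "can't open '$path': $!";
    return $fh;
}
```

This needs no "./" prefix and no trailing NUL, so it doesn't depend on the path conventions of the system at hand.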

If you want to use <ARGV> processing in a totally boring and non-magical way, you could do this first:

    # "Sam sat on the ground and put his head in his hands.
    # 'I wish I had never come here, and I don't want to see
    # no more magic,' he said, and fell silent."
    for (@ARGV) {
        s#^([^./])#./$1#;
        $_ .= "\0";
    }
    while (<>) {
        # now process $_
    }

But be warned that users will not appreciate being unable to use "-" to mean standard input, per the standard convention.

Paths as Opens

You've probably noticed how Perl's warn and die functions can produce messages like:

    Some warning at scriptname line 29, <FH> line 7.

That's because you opened a filehandle FH, and had read in seven records from it. But what was the name of the file, rather than the handle?

If you aren't running with strict refs, or if you've turned them off temporarily, then all you have to do is this:

    open($path, "< $path") || die "can't open $path: $!";
    while (<$path>) {
        # whatever
    }

Since you're using the pathname of the file as its handle, you'll get warnings more like

    Some warning at scriptname line 29, </etc/motd> line 7.

Single Argument Open

Remember how we said that Perl's open took two arguments? That was a passive prevarication. You see, it can also take just one argument. If and only if the variable is a global variable, not a lexical, you can pass open just one argument, the filehandle, and it will get the path from the global scalar variable of the same name.

    $FILE = "/etc/motd";
    open FILE or die "can't open $FILE: $!";
    while (<FILE>) {
        # whatever
    }

Why is this here? Someone has to cater to the hysterical porpoises. It's something that's been in Perl since the very beginning, if not before.

Playing with STDIN and STDOUT

One clever move with STDOUT is to explicitly close it when you're done with the program.

    END { close(STDOUT) || die "can't close stdout: $!" }

If you don't do this, and your program fills up the disk partition due to a command line redirection, it won't report the error or exit with a failure status.

You don't have to accept the STDIN and STDOUT you were given. You are welcome to reopen them if you'd like.

    open(STDIN, "< datafile")
        || die "can't open datafile: $!";
    open(STDOUT, "> output")
        || die "can't open output: $!";

And then these can be accessed directly or passed on to subprocesses. This makes it look as though the program were initially invoked with those redirections from the command line.

It's probably more interesting to connect these to pipes. For example:

    $pager = $ENV{PAGER} || "(less || more)";
    open(STDOUT, "| $pager")
        || die "can't fork a pager: $!";

This makes it appear as though your program were called with its stdout already piped into your pager. You can also use this kind of thing in conjunction with an implicit fork to yourself. You might do this if you would rather handle the post processing in your own program, just in a different process:

    head(100);
    while (<>) {
        print;
    }
    sub head {
        my $lines = shift || 20;
        return if $pid = open(STDOUT, "|-");    # return if parent
        die "cannot fork: $!" unless defined $pid;
        while (<STDIN>) {
            last if --$lines < 0;
            print;
        }
        exit;
    }

This technique can be applied to repeatedly push as many filters on your output stream as you wish.

Other I/O Issues

These topics aren't really arguments related to open or sysopen, but they do affect what you do with your open files.

Opening Non-File Files

When is a file not a file? Well, you could say when it exists but isn't a plain file. We'll check whether it's a symbolic link first, just in case.

    if (-l $file || ! -f _) {
        print "$file is not a plain file\n";
    }

What other kinds of files are there than, well, files? Directories, symbolic links, named pipes, Unix-domain sockets, and block and character devices. Those are all files, too--just not plain files. This isn't the same issue as being a text file. Not all text files are plain files. Not all plain files are text files. That's why there are separate -f and -T file tests.

To open a directory, you should use the opendir function, then process it with readdir, carefully restoring the directory name if necessary:

    opendir(DIR, $dirname) or die "can't opendir $dirname: $!";
    while (defined($file = readdir(DIR))) {
        # do something with "$dirname/$file"
    }
    closedir(DIR);
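Here is the same loop as a small sketch with a lexical directory handle. The sub name list_dir is hypothetical; it restores the directory name on each entry and leaves out the "." and ".." entries that readdir also returns.

```perl
use strict;
use warnings;

# Hypothetical helper: full paths of the entries in $dirname,
# leaving out "." and "..".
sub list_dir {
    my ($dirname) = @_;
    opendir(my $dh, $dirname) or die "can't opendir $dirname: $!";
    my @names = grep { $_ ne "." && $_ ne ".." } readdir($dh);
    closedir($dh);

    # readdir returns bare names, so put the directory back in front.
    my @paths = map { "$dirname/$_" } @names;
    @paths = sort @paths;
    return @paths;
}
```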

If you want to process directories recursively, it's better to use the File::Find module. For example, this prints out all files recursively and adds a slash to their names if the file is a directory.

    @ARGV = qw(.) unless @ARGV;
    use File::Find;
    find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV;

This finds all bogus symbolic links beneath a particular directory:

    find sub { print "$File::Find::name\n" if -l && !-e }, $dir;

As you see, with symbolic links, you can just pretend that it is what it points to. Or, if you want to know what it points to, then readlink is called for:

    if (-l $file) {
        if (defined($whither = readlink($file))) {
            print "$file points to $whither\n";
        } else {
            print "$file points nowhere: $!\n";
        }
    }

Opening Named Pipes

Named pipes are a different matter. You pretend they're regular files, but their opens will normally block until there is both a reader and a writer. You can read more about them in Named Pipes in perlipc. Unix-domain sockets are rather different beasts as well; they're described in Unix-Domain TCP Clients and Servers in perlipc.

When it comes to opening devices, it can be easy and it can be tricky. We'll assume that if you're opening up a block device, you know what you're doing. The character devices are more interesting. These are typically used for modems, mice, and some kinds of printers. This is described in How do I read and write the serial port? in perlfaq8. It's often enough to open them carefully:

    sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY)
        # (O_NOCTTY no longer needed on POSIX systems)
        or die "can't open /dev/ttyS1: $!";
    open(TTYOUT, "+>&TTYIN")
        or die "can't dup TTYIN: $!";
    $ofh = select(TTYOUT); $| = 1; select($ofh);
    print TTYOUT "+++at\015";
    $answer = <TTYIN>;

With descriptors that you haven't opened using sysopen, such as sockets, you can set them to be non-blocking using fcntl:

    use Fcntl;
    my $old_flags = fcntl($handle, F_GETFL, 0)
        or die "can't get flags: $!";
    fcntl($handle, F_SETFL, $old_flags | O_NONBLOCK)
        or die "can't set non blocking: $!";

Rather than losing yourself in a morass of twisting, turning ioctls, all dissimilar, if you're going to manipulate ttys, it's best to make calls out to the stty(1) program if you have it, or else use the portable POSIX interface. To figure this all out, you'll need to read the termios(3) manpage, which describes the POSIX interface to tty devices, and then POSIX, which describes Perl's interface to POSIX. There are also some high-level modules on CPAN that can help you with these games. Check out Term::ReadKey and Term::ReadLine.

Opening Sockets

What else can you open? To open a connection using sockets, you won't use one of Perl's two open functions. See Sockets: Client/Server Communication in perlipc for that. Here's an example. Once you have it, you can use FH as a bidirectional filehandle.

    use IO::Socket;
    local *FH = IO::Socket::INET->new("www.perl.com:80");

For opening up a URL, the LWP modules from CPAN are just what the doctor ordered. There's no filehandle interface, but it's still easy to get the contents of a document:

    use LWP::Simple;
    $doc = get('http://www.cpan.org/');

Binary Files

On certain legacy systems with what could charitably be called terminally convoluted (some would say broken) I/O models, a file isn't a file--at least, not with respect to the C standard I/O library. On these old systems whose libraries (but not kernels) distinguish between text and binary streams, to get files to behave properly you'll have to bend over backwards to avoid nasty problems. On such infelicitous systems, sockets and pipes are already opened in binary mode, and there is currently no way to turn that off. With files, you have more options.

One option is to use the binmode function on the appropriate handles before doing regular I/O on them:

    binmode(STDIN);
    binmode(STDOUT);
    while (<STDIN>) { print }

Passing sysopen a non-standard flag option will also open the file in binary mode on those systems that support it. This is the equivalent of opening the file normally, then calling binmode on the handle.

    sysopen(BINDAT, "records.data", O_RDWR | O_BINARY)
        || die "can't open records.data: $!";

Now you can use read and print on that handle without worrying about the non-standard system I/O library breaking your data. It's not a pretty picture, but then, legacy systems seldom are. CP/M will be with us until the end of days, and after.

On systems with exotic I/O systems, it turns out that, astonishingly enough, even unbuffered I/O using sysread and syswrite might do sneaky data mutilation behind your back.

    while (sysread(WHENCE, $buf, 1024)) {
        syswrite(WHITHER, $buf, length($buf));
    }

Depending on the vicissitudes of your runtime system, even these calls may need binmode or O_BINARY first. Systems known to be free of such difficulties include Unix, the Mac OS, Plan 9, and Inferno.
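Putting binmode and the unbuffered calls together, a byte-for-byte copy might be sketched like this. The sub name copy_binary and the 64K buffer size are arbitrary choices for the illustration; on systems that don't distinguish text from binary the binmode calls do no harm.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $src to $dst byte for byte.
sub copy_binary {
    my ($src, $dst) = @_;
    open(my $in,  "<", $src) or die "can't open $src: $!";
    open(my $out, ">", $dst) or die "can't open $dst: $!";
    binmode($in);
    binmode($out);

    my $buf;
    while (1) {
        my $got = sysread($in, $buf, 64 * 1024);
        die "can't read $src: $!" unless defined $got;
        last if $got == 0;                      # end of file

        my $offset = 0;
        while ($offset < $got) {                # syswrite may write less than asked
            my $wrote = syswrite($out, $buf, $got - $offset, $offset);
            die "can't write $dst: $!" unless defined $wrote;
            $offset += $wrote;
        }
    }
    close($in)  or die "can't close $src: $!";
    close($out) or die "can't close $dst: $!";
    return;
}
```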

File Locking

In a multitasking environment, you may need to be careful not to collide with other processes who want to do I/O on the same files as you are working on. You'll often need shared or exclusive locks on files for reading and writing respectively. You might just pretend that only exclusive locks exist.

Never use the existence of a file -e $file as a locking indication, because there is a race condition between the test for the existence of the file and its creation. It's possible for another process to create a file in the slice of time between your existence check and your attempt to create the file. Atomicity is critical.

Perl's most portable locking interface is via the flock function, whose simplicity is emulated on systems that don't directly support it such as SysV or Windows. The underlying semantics may affect how it all works, so you should learn how flock is implemented on your system's port of Perl.

File locking does not lock out another process that would like to do I/O. A file lock only locks out others trying to get a lock, not processes trying to do I/O. Because locks are advisory, if one process uses locking and another doesn't, all bets are off.

By default, the flock call will block until a lock is granted. A request for a shared lock will be granted as soon as there is no exclusive locker. A request for an exclusive lock will be granted as soon as there is no locker of any kind. Locks are on file descriptors, not file names. You can't lock a file until you open it, and you can't hold on to a lock once the file has been closed.

Here's how to get a blocking shared lock on a file, typically used for reading:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename")  or die "can't open filename: $!";
    flock(FH, LOCK_SH)      or die "can't lock filename: $!";
    # now read from FH

You can get a non-blocking lock by using LOCK_NB.

    flock(FH, LOCK_SH | LOCK_NB)
        or die "can't lock filename: $!";

This can be useful for producing more user-friendly behaviour by warning if you're going to be blocking:

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    open(FH, "< filename")  or die "can't open filename: $!";
    unless (flock(FH, LOCK_SH | LOCK_NB)) {
        $| = 1;
        print "Waiting for lock...";
        flock(FH, LOCK_SH)  or die "can't lock filename: $!";
        print "got it.\n"
    }
    # now read from FH

To get an exclusive lock, typically used for writing, you have to be careful. We sysopen the file so it can be locked before it gets emptied. You can get a nonblocking version using LOCK_EX | LOCK_NB.

    use 5.004;
    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "filename", O_WRONLY | O_CREAT)
        or die "can't open filename: $!";
    flock(FH, LOCK_EX)
        or die "can't lock filename: $!";
    truncate(FH, 0)
        or die "can't truncate filename: $!";
    # now write to FH

Finally, due to the uncounted millions who cannot be dissuaded from wasting cycles on useless vanity devices called hit counters, here's how to increment a number in a file safely:

    use Fcntl qw(:DEFAULT :flock);
    sysopen(FH, "numfile", O_RDWR | O_CREAT)
        or die "can't open numfile: $!";
    # autoflush FH
    $ofh = select(FH); $| = 1; select ($ofh);
    flock(FH, LOCK_EX)
        or die "can't write-lock numfile: $!";
    $num = <FH> || 0;
    seek(FH, 0, 0)
        or die "can't rewind numfile : $!";
    print FH $num+1, "\n"
        or die "can't write numfile: $!";
    truncate(FH, tell(FH))
        or die "can't truncate numfile: $!";
    close(FH)
        or die "can't close numfile: $!";

IO Layers

In Perl 5.8.0 a new I/O framework called "PerlIO" was introduced. This is a new "plumbing" for all the I/O happening in Perl; for the most part everything will work just as it did, but PerlIO also brought in some new features such as the ability to think of I/O as "layers". One I/O layer may in addition to just moving the data also do transformations on the data. Such transformations may include compression and decompression, encryption and decryption, and transforming between various character encodings.

Full discussion about the features of PerlIO is out of scope for this tutorial, but here is how to recognize the layers being used:

  • The three-(or more)-argument form of open is being used and the second argument contains something else in addition to the usual '<', '>', '>>', '|' and their variants, for example:

        open(my $fh, "<:crlf", $fn);

  • The two-argument form of binmode is being used, for example:

        binmode($fh, ":encoding(utf16)");

For more detailed discussion about PerlIO see PerlIO; for more detailed discussion about Unicode and I/O see perluniintro.
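To see a layer doing a transformation, you can write some characters through an encoding layer and look at how many bytes land on disk. This is just a sketch; the sub name bytes_written_as_utf8 is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: write $text through a UTF-8 encoding layer and
# report how many bytes ended up on disk.
sub bytes_written_as_utf8 {
    my ($path, $text) = @_;
    open(my $out, ">:encoding(UTF-8)", $path)
        or die "can't open $path: $!";
    print {$out} $text  or die "can't write $path: $!";
    close($out)         or die "can't close $path: $!";
    return -s $path;
}
```

A single character such as "\x{e9}" (e with an acute accent) comes out as two bytes, and "\x{263A}" (a smiley) as three, because the layer encodes characters into UTF-8 on the way out.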

SEE ALSO

The open and sysopen functions in perlfunc(1); the system open(2), dup(2), fopen(3), and fdopen(3) manpages; the POSIX documentation.

AUTHOR and COPYRIGHT

Copyright 1998 Tom Christiansen.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.

HISTORY

First release: Sat Jan 9 08:09:11 MST 1999

 

Perl 5 version 18.2 documentation

perlos2

NAME

perlos2 - Perl under OS/2, DOS, Win0.3*, Win0.95 and WinNT.

SYNOPSIS

One can read this document in the following formats:

    man perlos2
    view perl perlos2
    explorer perlos2.html
    info perlos2

to list some (not all may be available simultaneously), or it may be read as is: either as README.os2, or pod/perlos2.pod.

To read the .INF version of the documentation (highly recommended) outside of OS/2, one needs IBM's reader (it may be available on IBM ftp sites (?) (URL anyone?)), or the one shipped with PC DOS 7.0 and IBM's Visual Age C++ 3.5.

A copy of a Win* viewer is contained in the "Just add OS/2 Warp" package

    ftp://ftp.software.ibm.com/ps/products/os2/tools/jaow/jaow.zip

in ?:\JUST_ADD\view.exe. This gives one access to EMX's .INF docs as well (a text form is available in /emx/doc in EMX's distribution). There is also a different viewer named xview.

Note that if you have lynx.exe or netscape.exe installed, you can follow WWW links from this document in .INF format. If you have EMX docs installed correctly, you can follow library links (you need to have view emxbook working by setting EMXBOOK environment variable as it is described in EMX docs).

DESCRIPTION

Target

The target is to make OS/2 one of the best supported platforms for using/building/developing Perl and Perl applications, as well as to make Perl the best language to use under OS/2. The secondary target is to try to make this work under DOS and Win* as well (but not too hard).

The current state is quite close to this target. Known limitations:

  • Some *nix programs use fork() a lot; with the most useful flavors of perl for OS/2 (there are several built simultaneously) this is supported; but some flavors do not support this (e.g., when Perl is called from inside REXX). Using fork() after useing dynamically loading extensions would not work with very old versions of EMX.

  • You need a separate perl executable perl__.exe (see perl__.exe) if you want to use PM code in your application (as Perl/Tk or OpenGL Perl modules do) without having a text-mode window present.

    While using the standard perl.exe from a text-mode window is possible too, I have seen cases when this causes degradation of the system stability. Using perl__.exe avoids such a degradation.

  • There is no simple way to access WPS objects. The only way I know is via OS2::REXX and SOM extensions (see OS2::REXX, SOM). However, we do not have access to convenience methods of Object-REXX. (Is it possible at all? I know of no Object-REXX API.) The SOM extension (currently in alpha-text) may eventually remove this shortcoming; however, due to the fact that DII is not supported by the SOM module, using SOM is not as convenient as one would like it.

Please keep this list up-to-date by informing me about other items.

Other OSes

Since OS/2 port of perl uses a remarkable EMX environment, it can run (and build extensions, and - possibly - be built itself) under any environment which can run EMX. The current list is DOS, DOS-inside-OS/2, Win0.3*, Win0.95 and WinNT. Out of many perl flavors, only one works, see perl_.exe.

Note that not all features of Perl are available under these environments. This depends on the features the extender - most probably RSX - decided to implement.

Cf. Prerequisites.

Prerequisites

  • EMX

    EMX runtime is required (may be substituted by RSX). Note that it is possible to make perl_.exe run under DOS without any external support by binding emx.exe/rsx.exe to it, see emxbind. Note that under DOS for best results one should use the RSX runtime, which has many more functions working (like fork, popen and so on). In fact RSX is required if there is no VCPI present. Note that RSX requires DPMI. Many implementations of DPMI are known to be very buggy, beware!

    Only the latest runtime is supported, currently 0.9d fix 03. Perl may run under earlier versions of EMX, but this is not tested.

    One can get different parts of EMX from, say

    1. ftp://crydee.sai.msu.ru/pub/comp/os/os2/leo/gnu/emx+gcc/
    2. http://hobbes.nmsu.edu/h-browse.php?dir=/pub/os2/dev/emx/v0.9d/

    The runtime component should have the name emxrt.zip.

    NOTE. When using emx.exe/rsx.exe, it is enough to have them on your path. One does not need to specify them explicitly (though this

    1. emx perl_.exe -de 0

    will work as well.)

  • RSX

    To run Perl on DPMI platforms one needs the RSX runtime. This is needed under DOS-inside-OS/2, Win0.3*, Win0.95 and WinNT (see Other OSes). RSX would not work with VCPI only, as EMX would; it requires DPMI.

    Having RSX and the latest sh.exe one gets a fully functional *nix-ish environment under DOS, say, fork, `` and pipe-open work. In fact, MakeMaker works (for static build), so one can have Perl development environment under DOS.

    One can get RSX from, say

    1. http://cd.textfiles.com/hobbesos29804/disk1/EMX09C/
    2. ftp://crydee.sai.msu.ru/pub/comp/os/os2/leo/gnu/emx+gcc/contrib/

    Contact the author on rainer@mathematik.uni-bielefeld.de .

    The latest sh.exe with DOS hooks is available in

    1. http://www.ilyaz.org/software/os2/

    as sh_dos.zip or under similar names starting with sh , pdksh etc.

  • HPFS

    Perl does not care about file systems, but the perl library contains many files with long names, so to install it intact one needs a file system which supports long file names.

    Note that if you do not plan to build the perl itself, it may be possible to fool EMX to truncate file names. This is not supported, read EMX docs to see how to do it.

  • pdksh

    To start external programs with complicated command lines (like with pipes in between, and/or quoting of arguments), Perl uses an external shell. With EMX port such shell should be named sh.exe, and located either in the wired-in-during-compile locations (usually F:/bin), or in configurable location (see PERL_SH_DIR).

    For best results use EMX pdksh. The standard binary (5.2.14 or later) runs under DOS (with RSX) as well, see

    1. http://www.ilyaz.org/software/os2/
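
The pdksh item above says that sh.exe is looked up either in a location wired in at compile time or in the directory named by PERL_SH_DIR. A minimal sketch of that choice follows; sh_location is a hypothetical helper, and the default directory passed to it only stands in for the compiled-in value (the text mentions F:/bin as the usual one).

```perl
use strict;
use warnings;

# Hypothetical helper: where would we expect sh.exe to be?
# $default_dir stands in for the location wired in at compile time;
# it is an assumption of this sketch, not something read from perl itself.
sub sh_location {
    my ($default_dir) = @_;
    my $dir = defined $ENV{PERL_SH_DIR} && length $ENV{PERL_SH_DIR}
            ? $ENV{PERL_SH_DIR}
            : $default_dir;
    $dir =~ s{[/\\]+\z}{};          # drop trailing slashes
    return "$dir/sh.exe";
}

my $sh = sh_location('f:/bin');
print "Expecting the shell at $sh", (-e $sh ? "" : " (not found)"), "\n";
```

Running this before trying backticks or pipe-open is a cheap way to see which sh.exe would be expected for the current environment.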

Starting Perl programs under OS/2 (and DOS and...)

Start your Perl program foo.pl with arguments arg1 arg2 arg3 the same way as on any other platform, by

  1. perl foo.pl arg1 arg2 arg3

If you want to specify perl options -my_opts to the perl itself (as opposed to your program), use

  1. perl -my_opts foo.pl arg1 arg2 arg3

Alternately, if you use OS/2-ish shell, like CMD or 4os2, put the following at the start of your perl script:

  1. extproc perl -S -my_opts

rename your program to foo.cmd, and start it by typing

  1. foo arg1 arg2 arg3

Note that because of stupid OS/2 limitations the full path of the perl script is not available when you use extproc , thus you are forced to use -S perl switch, and your script should be on the PATH . As a plus side, if you know a full path to your script, you may still start it with

  1. perl ../../blah/foo.cmd arg1 arg2 arg3

(note that the argument -my_opts is taken care of by the extproc line in your script, see extproc on the first line).

To understand what the above magic does, read perl docs about -S switch - see perlrun, and cmdref about extproc :

  1. view perl perlrun
  2. man perlrun
  3. view cmdref extproc
  4. help extproc

or whatever method you prefer.

There are also endless possibilities to use executable extensions of 4os2, associations of WPS and so on... However, if you use *nixish shell (like sh.exe supplied in the binary distribution), you need to follow the syntax specified in Command Switches in perlrun.

Note that -S switch supports scripts with additional extensions .cmd, .btm, .bat, .pl as well.

Starting OS/2 (and DOS) programs under Perl

This is what system() (see system), `` (see I/O Operators in perlop), and open pipe (see open) are for. (Avoid exec() (see exec) unless you know what you are doing).

Note however that to use some of these operators you need to have a sh-syntax shell installed (see Pdksh, Frequently asked questions), and perl should be able to find it (see PERL_SH_DIR).

The cases when the shell is used are:

1. One-argument system() (see system), exec() (see exec) with redirection or shell meta-characters;

2. Pipe-open (see open) with the command which contains redirection or shell meta-characters;

3. Backticks `` (see I/O Operators in perlop) with the command which contains redirection or shell meta-characters;

4. If the executable called by system()/exec()/pipe-open()/`` is a script with the "magic" #! line or extproc line which specifies shell;

5. If the executable called by system()/exec()/pipe-open()/`` is a script without "magic" line, and $ENV{EXECSHELL} is set to shell;

6. If the executable called by system()/exec()/pipe-open()/`` is not found (is not this remark obsolete?);

7. For globbing (see glob, I/O Operators in perlop) (obsolete? Perl uses builtin globbing nowadays...).

For the sake of speed for a common case, in the above algorithms backslashes in the command name are not considered as shell metacharacters.
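
Since only the one-argument forms with redirection or metacharacters go through the shell, passing a list to system() is the usual way to keep the shell out of the picture. The sketch below shows this; run_direct is a hypothetical helper name, and it uses $^X (the running perl) as the child so that it does not depend on any particular external program.

```perl
use strict;
use warnings;

# Run a program without any shell: the list form of system() passes
# each argument through unchanged.  Returns the child's exit status.
sub run_direct {
    my @cmd = @_;
    my $rc = system { $cmd[0] } @cmd;
    die "cannot start $cmd[0]: $!" if $rc == -1;
    return $? >> 8;
}

# $^X is the running perl binary, so this works wherever perl does.
my $status = run_direct($^X, '-e', 'exit 3');
print "child exited with status $status\n";   # child exited with status 3
```

Note that a script started this way may still cause a shell to be used if its first line asks for one, as in case 4 of the list.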

Perl starts scripts which begin with cookies extproc or #! directly, without an intervention of shell. Perl uses the same algorithm to find the executable as pdksh: if the path on #! line does not work, and contains /, then the directory part of the executable is ignored, and the executable is searched in . and on PATH . To find arguments for these scripts Perl uses a different algorithm than pdksh: up to 3 arguments are recognized, and trailing whitespace is stripped.

If a script does not contain such a cookie, then to avoid calling sh.exe, Perl uses the same algorithm as pdksh: if $ENV{EXECSHELL} is set, the script is given as the first argument to this command; if it is not set, then $ENV{COMSPEC} /c is used (or a hardwired guess if $ENV{COMSPEC} is not set).
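
The EXECSHELL/COMSPEC fallback can be pictured roughly as below. This is only an illustration of the rule as stated in the text, not the code Perl actually runs; launcher_for is a hypothetical name, and 'cmd.exe' stands in for the hardwired guess, whose real value is not given here.

```perl
use strict;
use warnings;

# Sketch of the fallback for scripts without a "#!" or "extproc"
# first line.  'cmd.exe' is an assumed stand-in for the hardwired guess.
sub launcher_for {
    my ($script, @args) = @_;
    if (defined $ENV{EXECSHELL} && length $ENV{EXECSHELL}) {
        # The script becomes the first argument of $ENV{EXECSHELL}.
        return ($ENV{EXECSHELL}, $script, @args);
    }
    my $comspec = $ENV{COMSPEC};
    $comspec = 'cmd.exe' unless defined $comspec && length $comspec;
    return ($comspec, '/c', $script, @args);
}

print join(' ', launcher_for('blah.cmd', 'arg1')), "\n";
```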

When starting scripts directly, Perl uses exactly the same algorithm as for the search of script given by -S command-line option: it will look in the current directory, then on components of $ENV{PATH} using the following order of appended extensions: no extension, .cmd, .btm, .bat, .pl.
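
The search order just described can be sketched in a few lines. The sketch assumes that all extensions are tried in one directory before moving on to the next one, which the text does not state explicitly; find_script is a hypothetical helper, and PATH is split on ';' as on OS/2 and DOS.

```perl
use strict;
use warnings;
use File::Spec;

# Extensions tried, in the order given in the text.
my @EXTENSIONS = ('', '.cmd', '.btm', '.bat', '.pl');

# Return the first existing file for $name, or nothing.
# @dirs defaults to the current directory followed by $ENV{PATH}.
sub find_script {
    my ($name, @dirs) = @_;
    @dirs = ('.', split /;/, ($ENV{PATH} // '')) unless @dirs;
    for my $dir (@dirs) {
        next unless length $dir;
        for my $ext (@EXTENSIONS) {
            my $candidate = File::Spec->catfile($dir, $name . $ext);
            return $candidate if -f $candidate;
        }
    }
    return;
}

my $found = find_script('foo');
print defined $found ? "would start $found\n" : "no script named foo found\n";
```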

Note that Perl will start to look for scripts only if OS/2 cannot start the specified application, thus system 'blah' will not look for a script if there is an executable file blah.exe anywhere on PATH . In other words, PATH is essentially searched twice: once by the OS for an executable, then by Perl for scripts.

Note also that executable files on OS/2 can have an arbitrary extension, but .exe will be automatically appended if no dot is present in the name. The workaround is simple: since blah. and blah denote the same file (at least on FAT and HPFS file systems), to start an executable residing in file n:/bin/blah (no extension) give an argument n:/bin/blah. (dot appended) to system().
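
If you start extensionless executables often, the dot trick is easy to wrap. The helper below (exact_program_name is a hypothetical name) appends a dot only when the last path component has none.

```perl
use strict;
use warnings;

# Hypothetical helper: append a dot to a program name without an
# extension, so that ".exe" is not added automatically on OS/2.
sub exact_program_name {
    my ($path) = @_;
    my ($base) = $path =~ m{([^/\\:]*)\z};   # last path component
    return $base =~ /\./ ? $path : "$path.";
}

print exact_program_name('n:/bin/blah'), "\n";      # n:/bin/blah.
print exact_program_name('n:/bin/blah.exe'), "\n";  # n:/bin/blah.exe
```

The result can be passed as the first argument of system().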

Perl will start PM programs from VIO (=text-mode) Perl process in a separate PM session; the opposite is not true: when you start a non-PM program from a PM Perl process, Perl would not run it in a separate session. If a separate session is desired, either ensure that shell will be used, as in system 'cmd /c myprog' , or start it using optional arguments to system() documented in OS2::Process module. This is considered to be a feature.

Frequently asked questions

"It does not work"

Perl binary distributions come with a testperl.cmd script which tries to detect common problems with misconfigured installations. There is a pretty large chance it will discover which step of the installation you managed to goof. ;-)

I cannot run external programs

  • Did you run your programs with -w switch? See Starting OS/2 (and DOS) programs under Perl.

  • Do you try to run internal shell commands, like `copy a b` (internal for cmd.exe), or `glob a*b` (internal for ksh)? You need to specify your shell explicitly, like `cmd /c copy a b` , since Perl cannot deduce which commands are internal to your shell.

I cannot embed perl into my program, or use perl.dll from my program.

  • Is your program EMX-compiled with -Zmt -Zcrtdll ?

    Well, nowadays Perl DLL should be usable from a differently compiled program too... If you can run Perl code from REXX scripts (see OS2::REXX), then there are some other aspects of interaction which are overlooked by the current hackish code to support differently-compiled principal programs.

    If everything else fails, you need to build a stand-alone DLL for perl. Contact me, I did it once. Sockets would not work, and neither would a lot of other stuff.

  • Did you use ExtUtils::Embed?

    Some time ago I had reports it does not work. Nowadays it is checked in the Perl test suite, so grep ./t subdirectory of the build tree (as well as *.t files in the ./lib subdirectory) to find how it should be done "correctly".

`` and pipe-open do not work under DOS.

This may be a variant of I cannot run external programs, or a deeper problem. Basically: you need RSX (see Prerequisites) for these commands to work, and you may need a port of sh.exe which understands command arguments. One such port is listed in Prerequisites under RSX. Do not forget to set variable PERL_SH_DIR as well.

DPMI is required for RSX.

Cannot start find.exe "pattern" file

The whole idea of the "standard C API to start applications" is that the forms foo and "foo" of program arguments are completely interchangeable. find breaks this paradigm;

  1. find "pattern" file
  2. find pattern file

are not equivalent; find cannot be started directly using the above API. One needs a way to surround the doublequotes in some other quoting construction, necessarily having an extra non-Unixish shell in between.

Use one of

  1. system 'cmd', '/c', 'find "pattern" file';
  2. `cmd /c 'find "pattern" file'`

This would start find.exe via cmd.exe via sh.exe via perl.exe , but this is the price to pay if you want to use a non-conforming program.

INSTALLATION

Automatic binary installation

The most convenient way of installing a binary distribution of perl is via perl installer install.exe. Just follow the instructions, and 99% of the installation blues would go away.

Note however, that you need to have unzip.exe on your path, and EMX environment running. The latter means that if you just installed EMX, and made all the needed changes to Config.sys, you may need to reboot in between. Check EMX runtime by running

  1. emxrev

Binary installer also creates a folder on your desktop with some useful objects. If you need to change some aspects of the work of the binary installer, feel free to edit the file Perl.pkg. This may be useful e.g., if you need to run the installer many times and do not want to make many interactive changes in the GUI.

Things not taken care of by automatic binary installation:

  • PERL_BADLANG

    may be needed if you change your codepage after perl installation, and the new value is not supported by EMX. See PERL_BADLANG.

  • PERL_BADFREE

    see PERL_BADFREE.

  • Config.pm

    This file resides somewhere deep in the location you installed your perl library, find it out by

    1. perl -MConfig -le "print $INC{'Config.pm'}"

    While most important values in this file are updated by the binary installer, some of them may need to be hand-edited. I know no such data, please keep me informed if you find one. Moreover, manual changes to the installed version may need to be accompanied by an edit of this file.

NOTE. Because of a typo the binary installer of 5.00305 would install a variable PERL_SHPATH into Config.sys. Please remove this variable and put PERL_SH_DIR instead.

Manual binary installation

As of version 5.00305, OS/2 perl binary distribution comes split into 11 components. Unfortunately, to enable configurable binary installation, the file paths in the zip files are not absolute, but relative to some directory.

Note that the extraction with the stored paths is still necessary (default with unzip, specify -d to pkunzip). However, you need to know where to extract the files. You also need to manually change entries in Config.sys to reflect where you put the files. Note that if you have some primitive unzipper (like pkunzip ), you may get a lot of warnings/errors during unzipping. Upgrade to (w)unzip.

Below is the sample of what to do to reproduce the configuration on my machine. In VIEW.EXE you can press Ctrl-Insert now, and cut-and-paste from the resulting file - created in the directory you started VIEW.EXE from.

For each component, we mention environment variables related to each installation directory. Either choose directories to match your values of the variables, or create/append-to variables to take into account the directories.

  • Perl VIO and PM executables (dynamically linked)
    1. unzip perl_exc.zip *.exe *.ico -d f:/emx.add/bin
    2. unzip perl_exc.zip *.dll -d f:/emx.add/dll

    (have the directories with *.exe on PATH, and *.dll on LIBPATH);

  • Perl_ VIO executable (statically linked)
    1. unzip perl_aou.zip -d f:/emx.add/bin

    (have the directory on PATH);

  • Executables for Perl utilities
    1. unzip perl_utl.zip -d f:/emx.add/bin

    (have the directory on PATH);

  • Main Perl library
    1. unzip perl_mlb.zip -d f:/perllib/lib

    If this directory is exactly the same as the prefix which was compiled into perl.exe, you do not need to change anything. However, for perl to find the library if you use a different path, you need to set PERLLIB_PREFIX in Config.sys, see PERLLIB_PREFIX.

  • Additional Perl modules
    1. unzip perl_ste.zip -d f:/perllib/lib/site_perl/5.18.2/

    Same remark as above applies. Additionally, if this directory is not one of directories on @INC (and @INC is influenced by PERLLIB_PREFIX ), you need to put this directory and subdirectory ./os2 in PERLLIB or PERL5LIB variable. Do not use PERL5LIB unless you have it set already. See ENVIRONMENT in perl.

    [Check whether this extraction directory is still applicable with the new directory structure layout!]

  • Tools to compile Perl modules
    1. unzip perl_blb.zip -d f:/perllib/lib

    Same remark as for perl_ste.zip.

  • Manpages for Perl and utilities
    1. unzip perl_man.zip -d f:/perllib/man

    This directory should better be on MANPATH . You need to have a working man to access these files.

  • Manpages for Perl modules
    1. unzip perl_mam.zip -d f:/perllib/man

    This directory should better be on MANPATH . You need to have a working man to access these files.

  • Source for Perl documentation
    1. unzip perl_pod.zip -d f:/perllib/lib

    This is used by the perldoc program (see perldoc), and may be used to generate HTML documentation usable by WWW browsers, and documentation in zillions of other formats: info , LaTeX , Acrobat , FrameMaker and so on. [Use programs such as pod2latex etc.]

  • Perl manual in .INF format
    1. unzip perl_inf.zip -d d:/os2/book

    This directory should better be on BOOKSHELF .

  • Pdksh
    1. unzip perl_sh.zip -d f:/bin

    This is used by perl to run external commands which explicitly require shell, like the commands using redirection and shell metacharacters. It is also used instead of explicit /bin/sh.

    Set PERL_SH_DIR (see PERL_SH_DIR) if you move sh.exe from the above location.

    Note. It may be possible to use some other sh-compatible shell (untested).
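
For the "Additional Perl modules" item above, it helps to check whether the extraction directory is already on @INC before touching PERLLIB. The sketch below does a slash- and case-insensitive comparison, which is an assumption that suits FAT and HPFS; on_inc and norm_dir are hypothetical helpers, and the directory name is just the one used in the example.

```perl
use strict;
use warnings;

# Normalise a directory name for comparison: forward slashes,
# lower case, no trailing slash (except for a bare root).
sub norm_dir {
    my ($dir) = @_;
    $dir =~ tr{\\}{/};
    $dir =~ s{/+\z}{} unless $dir =~ m{\A(?:[a-z]:)?/\z}i;
    return lc $dir;
}

# True if $dir is among @inc (defaults to the running perl's @INC).
sub on_inc {
    my ($dir, @inc) = @_;
    @inc = @INC unless @inc;
    my $want = norm_dir($dir);
    return scalar grep { !ref($_) && norm_dir($_) eq $want } @inc;
}

my $site = 'f:/perllib/lib/site_perl';
print "$site is ", (on_inc($site) ? "" : "not "), "on \@INC\n";
```

If the directory is reported as missing, add it (and its ./os2 subdirectory) to PERLLIB as described in that item.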

After you installed the components you needed and updated the Config.sys correspondingly, you need to hand-edit Config.pm. This file resides somewhere deep in the location you installed your perl library, find it out by

  1. perl -MConfig -le "print $INC{'Config.pm'}"

You need to correct all the entries which look like file paths (they currently start with f:/).
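
To see which entries are affected without reading the whole file, one can ask the Config module itself. The following is a small sketch; entries_with_prefix is a hypothetical helper, and 'f:/' is simply the prefix mentioned above.

```perl
use strict;
use warnings;
use Config;

# Return the names of entries whose value starts with $prefix
# (compared case-insensitively, since drive letters may differ in case).
sub entries_with_prefix {
    my ($config, $prefix) = @_;
    my @names = grep {
        defined $config->{$_} && $config->{$_} =~ /^\Q$prefix\E/i
    } keys %$config;
    return sort @names;
}

# 'f:/' is the prefix mentioned in the text; adjust as needed.
for my $key (entries_with_prefix(\%Config, 'f:/')) {
    print "$key = $Config{$key}\n";
}
```

The script only reports; the edits to Config.pm still have to be made by hand.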

Warning

The automatic and manual perl installation leave precompiled paths inside perl executables. While these paths are overwriteable (see PERLLIB_PREFIX, PERL_SH_DIR), some people may prefer binary editing of paths inside the executables/DLLs.

Accessing documentation

Depending on how you built/installed perl you may have (otherwise identical) Perl documentation in the following formats:

OS/2 .INF file

Most probably the most convenient form. Under OS/2 view it as

  1. view perl
  2. view perl perlfunc
  3. view perl less
  4. view perl ExtUtils::MakeMaker

(currently the last two may hit a wrong location, but this may improve soon). Under Win* see SYNOPSIS.

If you want to build the docs yourself, and have OS/2 toolkit, run

  1. pod2ipf > perl.ipf

in /perllib/lib/pod directory, then

  1. ipfc /inf perl.ipf

(Expect a lot of errors during both steps.) Now move the resulting .INF file onto your BOOKSHELF path.

Plain text

If you have perl documentation in the source form, perl utilities installed, and GNU groff installed, you may use

  1. perldoc perlfunc
  2. perldoc less
  3. perldoc ExtUtils::MakeMaker

to access the perl documentation in the text form (note that you may get better results using perl manpages).

Alternately, try running pod2text on .pod files.

Manpages

If you have man installed on your system, and you installed perl manpages, use something like this:

  1. man perlfunc
  2. man 3 less
  3. man ExtUtils.MakeMaker

to access documentation for different components of Perl. Start with

  1. man perl

Note that dot (.) is used as a package separator for documentation for packages, and as usual, sometimes you need to give the section - 3 above - to avoid shadowing by the less(1) manpage.

Make sure that the directory above the directory with manpages is on your MANPATH , like this

  1. set MANPATH=c:/man;f:/perllib/man

for Perl manpages in f:/perllib/man/man1/ etc.

HTML

If you have some WWW browser available, installed the Perl documentation in the source form, and Perl utilities, you can build HTML docs. Change to the directory with the .pod files, and run something like this

  1. cd f:/perllib/lib/pod
  2. pod2html

After this you can direct your browser to the file perl.html in this directory, and go ahead with reading docs, like this:

  1. explore file:///f:/perllib/lib/pod/perl.html

Alternatively you may be able to get these docs prebuilt from CPAN.

GNU info files

Users of Emacs would appreciate it very much, especially with CPerl mode loaded. You need to get latest pod2texi from CPAN , or, alternately, the prebuilt info pages.

PDF files

for Acrobat are available on CPAN (may be for slightly older version of perl).

LaTeX docs

can be constructed using pod2latex .

BUILD

Here we discuss how to build Perl under OS/2.

The short story

Assume that you are a seasoned porter, so are sure that all the necessary tools are already present on your system, and you know how to get the Perl source distribution. Untar it, change to the extract directory, and

  1. gnupatch -p0 < os2\diff.configure
  2. sh Configure -des -D prefix=f:/perllib
  3. make
  4. make test
  5. make install
  6. make aout_test
  7. make aout_install

This puts the executables in f:/perllib/bin. Manually move them to the PATH , manually move the built perl*.dll to LIBPATH (here for Perl DLL * is a not-very-meaningful hex checksum), and run

  1. make installcmd INSTALLCMDDIR=d:/ir/on/path

Assuming that the man -files were put on an appropriate location, this completes the installation of minimal Perl system. (The binary distribution contains also a lot of additional modules, and the documentation in INF format.)

What follows is a detailed guide through these steps.

Prerequisites

You need to have the latest EMX development environment, the full GNU tool suite (gawk renamed to awk, and GNU find.exe earlier on path than the OS/2 find.exe, same with sort.exe, to check use

  1. find --version
  2. sort --version

). You need the latest version of pdksh installed as sh.exe.

Check that you have BSD libraries and headers installed, and - optionally - Berkeley DB headers and libraries, and crypt.

Possible locations to get the files:

  1. ftp://ftp.uni-heidelberg.de/pub/os2/unix/
  2. http://hobbes.nmsu.edu/h-browse.php?dir=/pub/os2
  3. http://cd.textfiles.com/hobbesos29804/disk1/DEV32/
  4. http://cd.textfiles.com/hobbesos29804/disk1/EMX09C/

It is reported that the following archives contain enough utils to build perl: gnufutil.zip, gnusutil.zip, gnututil.zip, gnused.zip, gnupatch.zip, gnuawk.zip, gnumake.zip, gnugrep.zip, bsddev.zip and ksh527rt.zip (or a later version). Note that all these utilities are known to be available from LEO:

  1. ftp://crydee.sai.msu.ru/pub/comp/os/os2/leo/gnu/

Note also that the db.lib and db.a from the EMX distribution are not suitable for multi-threaded compile (even single-threaded flavor of Perl uses multi-threaded C RTL, for compatibility with XFree86-OS/2). Get a corrected one from

  1. http://www.ilyaz.org/software/os2/db_mt.zip

If you have exactly the same version of Perl installed already, make sure that no copies of perl are currently running. Later steps of the build may fail since an older version of perl.dll loaded into memory may be found. Running make test becomes meaningless, since the tests are checking a previous build of perl (this situation is detected and reported by lib/os2_base.t test). Do not forget to unset PERL_EMXLOAD_SEC in environment.

Also make sure that you have /tmp directory on the current drive, and . directory in your LIBPATH . One may try to correct the latter condition by

  1. set BEGINLIBPATH .\.

if you use something like CMD.EXE or latest versions of 4os2.exe. (Setting BEGINLIBPATH to just . is ignored by the OS/2 kernel.)

Make sure your gcc is good for -Zomf linking: run omflibs script in /emx/lib directory.

Check that you have link386 installed. It comes standard with OS/2, but may be not installed due to customization. If typing

  1. link386

shows you do not have it, do Selective install, and choose Link object modules in Optional system utilities/More. If you get into link386 prompts, press Ctrl-C to exit.

Getting perl source

You need to fetch the latest perl source (including developers releases). With some probability it is located in

  1. http://www.cpan.org/src/
  2. http://www.cpan.org/src/unsupported

If not, you may need to dig in the indices to find it in the directory of the current maintainer.

The quick cycle of developer releases may break the OS/2 build from time to time; looking into

  1. http://www.cpan.org/ports/os2/

may indicate the latest release which was publicly released by the maintainer. Note that the release may include some additional patches to apply to the current source of perl.

Extract it like this

  1. tar vzxf perl5.00409.tar.gz

You may see a message about errors while extracting Configure. This is because there is a conflict with a similarly-named file configure.

Change to the directory of extraction.

Application of the patches

You need to apply the patches in ./os2/diff.* like this:

  1. gnupatch -p0 < os2\diff.configure

You may also need to apply the patches supplied with the binary distribution of perl. It also makes sense to look on the perl5-porters mailing list for the latest OS/2-related patches (see http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/). Such patches usually contain strings /os2/ and patch , so it makes sense looking for these strings.

Hand-editing

You may look into the file ./hints/os2.sh and correct anything wrong you find there. I do not expect it is needed anywhere.

Making

  1. sh Configure -des -D prefix=f:/perllib

prefix means: where to install the resulting perl library. By giving the correct prefix you may avoid the need to specify PERLLIB_PREFIX , see PERLLIB_PREFIX.

Ignore the message about missing ln , and about -c option to tr. The latter is most probably already fixed; if you see it and can trace where this spurious warning comes from, please inform me.

Now

  1. make

At some moment the build may die, reporting a version mismatch or unable to run perl. This means that you do not have . in your LIBPATH, so perl.exe cannot find the needed perl67B2.dll (treat these hex digits as line noise). After this is fixed the build should finish without a lot of fuss.

Testing

Now run

  1. make test

All tests should succeed (with some of them skipped). If you have the same version of Perl installed, it is crucial that you have . early in your LIBPATH (or in BEGINLIBPATH), otherwise your tests will most probably test the wrong version of Perl.

Some tests may generate extra messages similar to

  • A lot of bad free

    in database tests related to Berkeley DB. This should be fixed already. If it persists, you may disable these warnings, see PERL_BADFREE.

  • Process terminated by SIGTERM/SIGINT

    This is a standard message issued by OS/2 applications. *nix applications die in silence. It is considered to be a feature. One can easily disable this by appropriate sighandlers.

    However the test engine bleeds these messages to the screen at unexpected moments. Two messages of this kind should be present during testing.

To get finer test reports, call

  1. perl t/harness

The report with io/pipe.t failing may look like this:

  1. Failed Test  Status Wstat Total Fail  Failed  List of failed
  2. ------------------------------------------------------------
  3. io/pipe.t                    12    1   8.33%  9
  4. 7 tests skipped, plus 56 subtests skipped.
  5. Failed 1/195 test scripts, 99.49% okay. 1/6542 subtests failed, 99.98% okay.
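
The percentages in such a report are plain ratios, which is handy to remember when comparing two runs. A tiny sketch reproducing the figures of the sample report above (pct is a hypothetical helper):

```perl
use strict;
use warnings;

# Percentage with two decimals, as printed in the report.
sub pct {
    my ($part, $whole) = @_;
    return sprintf '%.2f%%', 100 * $part / $whole;
}

print pct(1, 12), "\n";            # 8.33%   io/pipe.t: 1 of 12 subtests failed
print pct(195 - 1, 195), "\n";     # 99.49%  test scripts okay
print pct(6542 - 1, 6542), "\n";   # 99.98%  subtests okay
```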

The reasons for most important skipped tests are:

  • op/fs.t
    18

    Checks atime and mtime of stat() - unfortunately, HPFS provides only 2sec time granularity (for compatibility with FAT?).

    25

    Checks truncate() on a filehandle just opened for write - I do not know why this should or should not work.

  • op/stat.t

    Checks stat(). Tests:

    4

    Checks atime and mtime of stat() - unfortunately, HPFS provides only 2sec time granularity (for compatibility with FAT?).
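
If your own code compares file times on HPFS or FAT, it has to allow for the same granularity, or it will trip over the same problem as these tests. A minimal sketch, assuming that two timestamps should be treated as equal when they differ by less than the granularity (same_time is a hypothetical helper, and 2 seconds is the value quoted above):

```perl
use strict;
use warnings;

# True if two timestamps (seconds since the epoch) are equal once the
# file system's time granularity is taken into account.
sub same_time {
    my ($t1, $t2, $granularity) = @_;
    $granularity ||= 2;               # HPFS/FAT value quoted in the text
    return abs($t1 - $t2) < $granularity;
}

print same_time(1000, 1001) ? "same\n" : "different\n";   # same
print same_time(1000, 1002) ? "same\n" : "different\n";   # different
```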

Installing the built perl

If you haven't yet moved perl*.dll onto LIBPATH, do it now.

Run

  1. make install

It would put the generated files into needed locations. Manually put perl.exe, perl__.exe and perl___.exe to a location on your PATH, perl.dll to a location on your LIBPATH.

Run

  1. make installcmd INSTALLCMDDIR=d:/ir/on/path

to convert perl utilities to .cmd files and put them on PATH. You need to put .EXE-utilities on path manually. They are installed in $prefix/bin , here $prefix is what you gave to Configure, see Making.

If you use man , either move the installed */man/ directories to your MANPATH , or modify MANPATH to match the location. (One could have avoided this by providing a correct manpath option to ./Configure, or editing ./config.sh between configuring and making steps.)

a.out -style build

Proceed as above, but make perl_.exe (see perl_.exe) by

  1. make perl_

test and install by

  1. make aout_test
  2. make aout_install

Manually put perl_.exe to a location on your PATH.

Note. The build process for perl_ does not know about all the dependencies, so you should make sure that everything is up-to-date, say, by doing

  1. make perl_dll

first.

Building a binary distribution

[This section provides a short overview only...]

Building should proceed differently depending on whether the version of perl you install is already present and used on your system, or is a new version not yet used. The description below assumes that the version is new, so installing its DLLs and .pm files will not disrupt the operation of your system even if some intermediate steps are not yet fully working.

The other cases require a little bit more convoluted procedures. Below I suppose that the current version of Perl is 5.8.2 , so the executables are named accordingly.

1.

Fully build and test the Perl distribution. Make sure that no tests are failing with test and aout_test targets; fix the bugs in Perl and the Perl test suite detected by these tests. Make sure that all_test make target runs as clean as possible. Check that os2/perlrexx.cmd runs fine.

2.

Fully install Perl, including installcmd target. Copy the generated DLLs to LIBPATH ; copy the numbered Perl executables (as in perl5.8.2.exe) to PATH ; copy perl_.exe to PATH as perl_5.8.2.exe . Think whether you need backward-compatibility DLLs. In most cases you do not need to install them yet; but sometimes this may simplify the following steps.

3.

Make sure that CPAN.pm can download files from CPAN. If not, you may need to manually install Net::FTP .

4.

Install the bundle Bundle::OS2_default

  1. perl5.8.2 -MCPAN -e "install Bundle::OS2_default" < nul |& tee 00cpan_i_1

This may take a couple of hours on a 1GHz processor (when run the first time). This will not necessarily be a smooth procedure. Some modules may not specify required dependencies, so one may need to repeat this procedure several times until the results stabilize.

  1. perl5.8.2 -MCPAN -e "install Bundle::OS2_default" < nul |& tee 00cpan_i_2
  2. perl5.8.2 -MCPAN -e "install Bundle::OS2_default" < nul |& tee 00cpan_i_3

Even after they stabilize, some tests may fail.

Fix as many discovered bugs as possible. Document all the bugs which are not fixed, and all the failures with unknown reasons. Inspect the produced logs 00cpan_i_1 to find suspiciously skipped tests, and other fishy events.

Keep in mind that installation of some modules may fail too: for example, the DLLs to update may be already loaded by CPAN.pm. Inspect the install logs (in the example above 00cpan_i_1 etc) for errors, and install things manually, as in

  1. cd $CPANHOME/.cpan/build/Digest-MD5-2.31
  2. make install

Some distributions may fail some tests, but you may want to install them anyway (as above, or via force install command of CPAN.pm shell-mode).

Since this procedure may take quite a long time to complete, it makes sense to "freeze" your CPAN configuration by disabling periodic updates of the local copy of CPAN index: set index_expire to some big value (I use 365), then save the settings

  1. CPAN> o conf index_expire 365
  2. CPAN> o conf commit

Reset back to the default value 1 when you are finished.

5.

When satisfied with the results, rerun the installcmd target. Now you can copy perl5.8.2.exe to perl.exe , and install the other OMF-build executables: perl__.exe etc. They are ready to be used.

6.

Change to the ./pod directory of the build tree, download the Perl logo CamelGrayBig.BMP, and run

  1. ( perl2ipf > perl.ipf ) |& tee 00ipf
  2. ipfc /INF perl.ipf |& tee 00inf

This produces the Perl docs online book perl.INF . Install it on the BOOKSHELF path.

7.

Now is the time to build statically linked executable perl_.exe which includes newly-installed via Bundle::OS2_default modules. Doing testing via CPAN.pm is going to be painfully slow, since it statically links a new executable per XS extension.

Here is a possible workaround: create a toplevel Makefile.PL in $CPANHOME/.cpan/build/ with contents being (compare with Making executables with a custom collection of statically loaded extensions)

  1. use ExtUtils::MakeMaker;
  2. WriteMakefile NAME => 'dummy';

execute this as

  1. perl_5.8.2.exe Makefile.PL <nul |& tee 00aout_c1
  2. make -k all test <nul |& tee 00aout_t1

Again, this procedure should not be absolutely smooth. Some Makefile.PL 's in subdirectories may be buggy, and would not run as "child" scripts. The interdependency of modules can strike you; however, since non-XS modules are already installed, the prerequisites of most modules have a very good chance to be present.

If you discover some glitches, move directories of problematic modules to a different location; if these modules are non-XS modules, you may just ignore them - they are already installed; the remaining, XS, modules you need to install manually one by one.

After each such removal you need to rerun the Makefile.PL/make process; usually this procedure converges soon. (But be sure to convert all the necessary external C libraries from .lib format to .a format: run one of

  emxaout foo.lib
  emximp -o foo.a foo.lib

whichever is appropriate.) Also, make sure that the DLLs for external libraries are usable with executables compiled without -Zmtd options.

When you are sure that only a few subdirectories lead to failures, you may want to add the -j4 option to make to speed up skipping subdirectories with already finished build.

When you are satisfied with the results of tests, install the built C libraries for extensions:

  make install |& tee 00aout_i

Now you can rename the file ./perl.exe generated during the last phase to perl_5.8.2.exe; place it on PATH; if there is an inter-dependency between some XS modules, you may need to repeat the test/install loop with this new executable and some excluded modules - until the procedure converges.

Now you have all the necessary .a libraries for these Perl modules in the places where the Perl builder can find them. Use the Perl builder: change to an empty directory, create a "dummy" Makefile.PL again, and run

  perl_5.8.2.exe Makefile.PL |& tee 00c
  make perl |& tee 00p

This should create an executable ./perl.exe with all the statically loaded extensions built in. Compare the generated perlmain.c files to make sure that during the iterations the number of loaded extensions only increases. Rename ./perl.exe to perl_5.8.2.exe on PATH.

When it converges, you have a functional variant of perl_5.8.2.exe; copy it to perl_.exe. You are done with generation of the local Perl installation.

8.

Make sure that the installed modules are actually installed in the location of the new Perl, and are not inherited from entries of @INC given for inheritance from the older versions of Perl: set PERLLIB_582_PREFIX to redirect the new version of Perl to a new location, and copy the installed files to this new location. Redo the tests to make sure that the versions of modules inherited from older versions of Perl are not needed.

Actually, the log output of pod2ipf(1) during step 6 gives very detailed information about which modules are loaded from which place; so you may use it as an additional verification tool.

Check that no temporary files made it into the perl install tree. Run something like this

  pfind . -f "!(/\.(pm|pl|ix|al|h|a|lib|txt|pod|imp|bs|dll|ld|bs|inc|xbm|yml|cgi|uu|e2x|skip|packlist|eg|cfg|html|pub|enc|all|ini|po|pot)$/i or /^\w+$/") | less

in the install tree (both top one and sitelib one).

Compress all the DLLs with lxlite. The tiny .exe can be compressed with /c:max (the bug only appears when there is a fixup in the last 6 bytes of a page (?); since the tiny executables are much smaller than a page, the bug will not hit). Do not compress perl_.exe - it would not work under DOS.

9.

Now you can generate the binary distribution. This is done by running the test of the CPAN distribution OS2::SoftInstaller. Tune up the file test.pl to suit the layout of the current version of Perl first. Do not forget to pack the necessary external DLLs accordingly. Include the description of the bugs and test suite failures you could not fix. Include the small-stack versions of Perl executables from the Perl build directory.

Include perl5.def so that people can relink the perl DLL preserving the binary compatibility, or can create compatibility DLLs. Include the diff files (diff -pu old new) of fixes you did so that people can rebuild your version. Include perl5.map so that one can use remote debugging.

10.

Share what you did with other people. Relax. Enjoy the fruits of your work.

11.

Brace yourself for thanks, bug reports, hate mail and spam coming as a result of the previous step. No good deed should remain unpunished!

Building custom .EXE files

The Perl executables can be easily rebuilt at any moment. Moreover, one can use the embedding interface (see perlembed) to make very customized executables.

Making executables with a custom collection of statically loaded extensions

It is a little bit easier to do so while decreasing the list of statically loaded extensions. We discuss only this case here.

1.

Change to an empty directory, and create a placeholder Makefile.PL:

  use ExtUtils::MakeMaker;
  WriteMakefile NAME => 'dummy';

2.

Run it with the flavor of Perl (perl.exe or perl_.exe) you want to rebuild.

  perl_ Makefile.PL

3.

Ask it to create a new Perl executable:

  make perl

(you may need to manually add PERLTYPE=-DPERL_CORE to this commandline on some versions of Perl; the symptom is that the command-line globbing does not work from OS/2 shells with the newly-compiled executable; check with

  .\perl.exe -wle "print for @ARGV" *

).

4.

The previous step created perlmain.c which contains a list of newXS() calls near the end. Removing unnecessary calls, and rerunning

  make perl

will produce a customized executable.

Making executables with a custom search-paths

The default perl executable is flexible enough to support most usages. However, one may want something yet more flexible; for example, one may want to find the Perl DLL relative to the location of the EXE file; or one may want to ignore the environment when setting the Perl-library search path, etc.

If you feel comfortable with the embedding interface (see perlembed), such things are easy to do by repeating the steps outlined in Making executables with a custom collection of statically loaded extensions, and doing more comprehensive edits to main() of perlmain.c. People with little desire to understand Perl can just rename main(), and do the necessary modifications in a custom main() which calls the renamed function at the appropriate time.

However, there is a third way: perl DLL exports the main() function and several callbacks to customize the search path. Below is a complete example of a "Perl loader" which

1.

Looks for Perl DLL in the directory $exedir/../dll;

2.

Prepends the above directory to BEGINLIBPATH;

3.

Fails if the Perl DLL found via BEGINLIBPATH is different from what was loaded on step 1; e.g., another process could have loaded it from LIBPATH or from a different value of BEGINLIBPATH. In these cases one needs to modify the setting of the system so that this other process either does not run, or loads the DLL from BEGINLIBPATH with LIBPATHSTRICT=T (available with kernels after September 2000).

4.

Loads Perl library from $exedir/../dll/lib/.

5.

Uses Bourne shell from $exedir/../dll/sh/ksh.exe.

For best results compile the C file below with the same options as the Perl DLL. However, a lot of functionality will work even if the executable is not an EMX application, e.g., if compiled with

  gcc -Wall -DDOSISH -DOS2=1 -O2 -s -Zomf -Zsys perl-starter.c -DPERL_DLL_BASENAME=\"perl312F\" -Zstack 8192 -Zlinker /PM:VIO

Here is the sample C file:

  #define INCL_DOS
  #define INCL_NOPM
  /* These are needed for compile if os2.h includes os2tk.h, not os2emx.h */
  #define INCL_DOSPROCESS
  #include <os2.h>

  #include "EXTERN.h"
  #define PERL_IN_MINIPERLMAIN_C
  #include "perl.h"

  static char *me;
  HMODULE handle;

  static void
  die_with(char *msg1, char *msg2, char *msg3, char *msg4)
  {
      ULONG c;
      char *s = " error: ";

      DosWrite(2, me, strlen(me), &c);
      DosWrite(2, s, strlen(s), &c);
      DosWrite(2, msg1, strlen(msg1), &c);
      DosWrite(2, msg2, strlen(msg2), &c);
      DosWrite(2, msg3, strlen(msg3), &c);
      DosWrite(2, msg4, strlen(msg4), &c);
      DosWrite(2, "\r\n", 2, &c);
      exit(255);
  }

  typedef ULONG (*fill_extLibpath_t)(int type, char *pre, char *post, int replace, char *msg);
  typedef int (*main_t)(int type, char *argv[], char *env[]);
  typedef int (*handler_t)(void* data, int which);

  #ifndef PERL_DLL_BASENAME
  # define PERL_DLL_BASENAME "perl"
  #endif

  static HMODULE
  load_perl_dll(char *basename)
  {
      char buf[300], fail[260];
      STRLEN l, dirl;
      fill_extLibpath_t f;
      ULONG rc_fullname;
      HMODULE handle, handle1;

      if (_execname(buf, sizeof(buf) - 13) != 0)
          die_with("Can't find full path: ", strerror(errno), "", "");
      /* XXXX Fill 'me' with new value */
      l = strlen(buf);
      while (l && buf[l-1] != '/' && buf[l-1] != '\\')
          l--;
      dirl = l - 1;
      strcpy(buf + l, basename);
      l += strlen(basename);
      strcpy(buf + l, ".dll");
      if ( (rc_fullname = DosLoadModule(fail, sizeof fail, buf, &handle)) != 0
           && DosLoadModule(fail, sizeof fail, basename, &handle) != 0 )
          die_with("Can't load DLL ", buf, "", "");
      if (rc_fullname)
          return handle; /* was loaded with short name; all is fine */
      if (DosQueryProcAddr(handle, 0, "fill_extLibpath", (PFN*)&f))
          die_with(buf, ": DLL exports no symbol ", "fill_extLibpath", "");
      buf[dirl] = 0;
      if (f(0 /*BEGINLIBPATH*/, buf /* prepend */, NULL /* append */,
            0 /* keep old value */, me))
          die_with(me, ": prepending BEGINLIBPATH", "", "");
      if (DosLoadModule(fail, sizeof fail, basename, &handle1) != 0)
          die_with(me, ": finding perl DLL again via BEGINLIBPATH", "", "");
      buf[dirl] = '\\';
      if (handle1 != handle) {
          if (DosQueryModuleName(handle1, sizeof(fail), fail))
              strcpy(fail, "???");
          die_with(buf, ":\n\tperl DLL via BEGINLIBPATH is different: \n\t",
                   fail,
                   "\n\tYou may need to manipulate global BEGINLIBPATH and LIBPATHSTRICT"
                   "\n\tso that the other copy is loaded via BEGINLIBPATH.");
      }
      return handle;
  }

  int
  main(int argc, char **argv, char **env)
  {
      main_t f;
      handler_t h;

      me = argv[0];
      /**/
      handle = load_perl_dll(PERL_DLL_BASENAME);

      if (DosQueryProcAddr(handle, 0, "Perl_OS2_handler_install", (PFN*)&h))
          die_with(PERL_DLL_BASENAME, ": DLL exports no symbol ", "Perl_OS2_handler_install", "");
      if ( !h((void *)"~installprefix", Perlos2_handler_perllib_from)
           || !h((void *)"~dll", Perlos2_handler_perllib_to)
           || !h((void *)"~dll/sh/ksh.exe", Perlos2_handler_perl_sh) )
          die_with(PERL_DLL_BASENAME, ": Can't install @INC manglers", "", "");

      if (DosQueryProcAddr(handle, 0, "dll_perlmain", (PFN*)&f))
          die_with(PERL_DLL_BASENAME, ": DLL exports no symbol ", "dll_perlmain", "");
      return f(argc, argv, env);
  }

Build FAQ

Some / became \ in pdksh.

You have a very old pdksh. See Prerequisites.

'errno' - unresolved external

You do not have MT-safe db.lib. See Prerequisites.

Problems with tr or sed

Reported with very old versions of tr and sed.

Some problem (forget which ;-)

You have an older version of perl.dll on your LIBPATH, which broke the build of extensions.

Library ... not found

You did not run omflibs. See Prerequisites.

Segfault in make

You use an old version of GNU make. See Prerequisites.

op/sprintf test failure

This can result from a bug in emx sprintf which was fixed in 0.9d fix 03.

Specific (mis)features of OS/2 port

setpriority, getpriority

Note that these functions are compatible with *nix, not with the older ports of '94 - 95. The priorities are absolute, go from 32 to -95, lower is quicker. 0 is the default priority.

WARNING. Calling getpriority on a non-existing process could lock the system before Warp3 fixpak22. Starting with Warp3, Perl will use a workaround: it aborts getpriority() if the process is not present. This is not possible on older versions 2.* , and has a race condition anyway.

system()

Multi-argument form of system() allows an additional numeric argument. The meaning of this argument is described in OS2::Process.

When finding a program to run, Perl first asks the OS to look for executables on PATH (OS/2 adds extension .exe if no extension is present). If not found, it looks for a script with possible extensions added in this order: no extension, .cmd, .btm, .bat, .pl. If found, Perl checks the start of the file for magic strings "#!" and "extproc ". If found, Perl uses the rest of the first line as the beginning of the command line to run this script. The only mangling done to the first line is extraction of arguments (currently up to 3), and ignoring of the path-part of the "interpreter" name if it can't be found using the full path.

E.g., system 'foo', 'bar', 'baz' may lead Perl to finding C:/emx/bin/foo.cmd with the first line being

  extproc /bin/bash -x -c

If /bin/bash.exe is not found, then Perl looks for an executable bash.exe on PATH. If found in C:/emx.add/bin/bash.exe, then the above system() is translated to

  system qw(C:/emx.add/bin/bash.exe -x -c C:/emx/bin/foo.cmd bar baz)

One additional translation is performed: instead of /bin/sh Perl uses the hardwired-or-customized shell (see PERL_SH_DIR).

The above search for "interpreter" is recursive: if bash executable is not found, but bash.btm is found, Perl will investigate its first line etc. The only hardwired limit on the recursion depth is implicit: there is a limit of 4 on the number of additional arguments inserted before the actual arguments given to system(). In particular, if no additional arguments are specified on the "magic" first lines, then the limit on the depth is 4.
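The handling of the "magic" first line can be sketched in C. This is only an illustration of the rule described above, not the code Perl uses: the names parse_magic_line and struct magic_line and the buffer sizes are hypothetical, and the sketch simply stops after three arguments, since the text does not say what happens to further words on the line.

```c
#include <string.h>

#define MAX_MAGIC_ARGS 3   /* the text says "currently up to 3" arguments */

/* Hypothetical container for what is found on the first line of a script. */
struct magic_line {
    char interp[260];               /* interpreter named on the line */
    char args[MAX_MAGIC_ARGS][64];  /* switches that follow it */
    int  nargs;
};

static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* Copy one blank-delimited word into dst (truncating to dstlen - 1 chars);
 * return a pointer just past the word in the input. */
static const char *copy_word(const char *p, char *dst, size_t dstlen)
{
    size_t n = 0;

    while (*p && *p != ' ' && *p != '\t' && *p != '\r' && *p != '\n') {
        if (n + 1 < dstlen)
            dst[n++] = *p;
        p++;
    }
    dst[n] = '\0';
    return p;
}

/* Returns 1 and fills *out if line starts with "#!" or "extproc ";
 * returns 0 otherwise, or when no interpreter name follows the magic. */
int parse_magic_line(const char *line, struct magic_line *out)
{
    const char *p;

    if (strncmp(line, "#!", 2) == 0)
        p = line + 2;
    else if (strncmp(line, "extproc ", 8) == 0)
        p = line + 8;
    else
        return 0;

    p = copy_word(skip_blanks(p), out->interp, sizeof out->interp);
    if (out->interp[0] == '\0')
        return 0;

    out->nargs = 0;
    while (out->nargs < MAX_MAGIC_ARGS) {
        p = skip_blanks(p);
        if (*p == '\0' || *p == '\r' || *p == '\n')
            break;
        p = copy_word(p, out->args[out->nargs], sizeof out->args[0]);
        out->nargs++;
    }
    return 1;
}
```

For the first line extproc /bin/bash -x -c this yields the interpreter /bin/bash and the two arguments -x and -c, which is the prefix seen in the translated system() call above. Locating the interpreter on PATH and the recursion through further scripts are left out of the sketch.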

If Perl finds that the found executable is of PM type when the current session is not, it will start the new process in a separate session of necessary type. Call via OS2::Process to disable this magic.

WARNING. Due to the described logic, you need to explicitly specify .com extension if needed. Moreover, if the executable perl5.6.1 is requested, Perl will not look for perl5.6.1.exe. [This may change in the future.]

extproc on the first line

If the first chars of a Perl script are "extproc ", this line is treated as a #!-line, thus all the switches on this line are processed (twice if the script was started via cmd.exe). See DESCRIPTION in perlrun.

Additional modules:

OS2::Process, OS2::DLL, OS2::REXX, OS2::PrfDB, OS2::ExtAttr. These modules provide access to additional numeric argument for system and to the information about the running process, to DLLs having functions with REXX signature and to the REXX runtime, to OS/2 databases in the .INI format, and to Extended Attributes.

Two additional extensions by Andreas Kaiser, OS2::UPM and OS2::FTP, are included in the ILYAZ directory, mirrored on CPAN. Other OS/2-related extensions are available too.

Prebuilt methods:

  • File::Copy::syscopy

    used by File::Copy::copy, see File::Copy.

  • DynaLoader::mod2fname

    used by DynaLoader for DLL name mangling.

  • Cwd::current_drive()

    Self explanatory.

  • Cwd::sys_chdir(name)

    leaves drive as it is.

  • Cwd::change_drive(name)

    changes the "current" drive.

  • Cwd::sys_is_absolute(name)

    means has drive letter and is_rooted.

  • Cwd::sys_is_rooted(name)

    means has leading [/\\] (maybe after a drive-letter:).

  • Cwd::sys_is_relative(name)

    means changes with current dir.

  • Cwd::sys_cwd(name)

    Interface to cwd from EMX. Used by Cwd::cwd.

  • Cwd::sys_abspath(name, dir)

    Really really odious function to implement. Returns absolute name of file which would have name if CWD were dir. Dir defaults to the current dir.

  • Cwd::extLibpath([type])

    Get current value of extended library search path. If type is present and positive, works with END_LIBPATH, if negative, works with LIBPATHSTRICT, otherwise with BEGIN_LIBPATH.

  • Cwd::extLibpath_set( path [, type ] )

    Set current value of extended library search path. If type is present and positive, works with END_LIBPATH, if negative, works with LIBPATHSTRICT, otherwise with BEGIN_LIBPATH.

  • OS2::Error(do_harderror,do_exception)

    Returns undef if it was not called yet, otherwise bit 1 is set if on the previous call do_harderror was enabled, bit 2 is set if on previous call do_exception was enabled.

    This function enables/disables error popups associated with hardware errors (Disk not ready etc.) and software exceptions.

    I know of no way to find out the state of popups before the first call to this function.

  • OS2::Errors2Drive(drive)

    Returns undef if it was not called yet, otherwise return false if errors were not requested to be written to a hard drive, or the drive letter if this was requested.

    This function may redirect error popups associated with hardware errors (Disk not ready etc.) and software exceptions to the file POPUPLOG.OS2 at the root directory of the specified drive. Overrides OS2::Error() specified by individual programs. Given argument undef will disable redirection.

    Has global effect, persists after the application exits.

    I know of no way to find out the state of redirection of popups to the disk before the first call to this function.

  • OS2::SysInfo()

    Returns a hash with system information. The keys of the hash are

    MAX_PATH_LENGTH, MAX_TEXT_SESSIONS, MAX_PM_SESSIONS,
    MAX_VDM_SESSIONS, BOOT_DRIVE, DYN_PRI_VARIATION,
    MAX_WAIT, MIN_SLICE, MAX_SLICE, PAGE_SIZE,
    VERSION_MAJOR, VERSION_MINOR, VERSION_REVISION,
    MS_COUNT, TIME_LOW, TIME_HIGH, TOTPHYSMEM, TOTRESMEM,
    TOTAVAILMEM, MAXPRMEM, MAXSHMEM, TIMER_INTERVAL,
    MAX_COMP_LENGTH, FOREGROUND_FS_SESSION,
    FOREGROUND_PROCESS

  • OS2::BootDrive()

    Returns a letter without colon.

  • OS2::MorphPM(serve), OS2::UnMorphPM(serve)

    Transforms the current application into a PM application and back. The argument true means that a real message loop is going to be served. OS2::MorphPM() returns the PM message queue handle as an integer.

    See Centralized management of resources for additional details.

  • OS2::Serve_Messages(force)

    Fake on-demand retrieval of outstanding PM messages. If force is false, will not dispatch messages if a real message loop is known to be present. Returns number of messages retrieved.

    Dies with "QUITing..." if WM_QUIT message is obtained.

  • OS2::Process_Messages(force [, cnt])

    Retrieval of PM messages until window creation/destruction. If force is false, will not dispatch messages if a real message loop is known to be present.

    Returns change in number of windows. If cnt is given, it is incremented by the number of messages retrieved.

    Dies with "QUITing..." if WM_QUIT message is obtained.

  • OS2::_control87(new,mask)

    the same as _control87(3) of EMX. Takes integers as arguments, returns the previous coprocessor control word as an integer. Only bits in new which are present in mask are changed in the control word.

  • OS2::get_control87()

    gets the coprocessor control word as an integer.

  • OS2::set_control87_em(new=MCW_EM,mask=MCW_EM)

    The variant of OS2::_control87() with default values good for handling exception mask: if no mask, uses exception mask part of new only. If no new, disables all the floating point exceptions.

    See Misfeatures for details.

  • OS2::DLLname([how [, \&xsub]])

    Gives the information about the Perl DLL or the DLL containing the C function bound to by &xsub. The meaning of how is: default (2): full name; 0: handle; 1: module name.

(Note that some of these may be moved to different libraries - eventually).
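The one-line descriptions of Cwd::sys_is_absolute, Cwd::sys_is_rooted and Cwd::sys_is_relative above can be restated as small C predicates. This is a sketch of the descriptions only, not the code behind these functions: the sketch_ names are hypothetical, and treating "relative" as simply "not rooted" is an assumption made for the illustration.

```c
#include <ctype.h>

/* Either slash counts as a directory separator. */
static int is_sep(char c)
{
    return c == '/' || c == '\\';
}

/* True if the name starts with a drive letter followed by a colon. */
static int has_drive(const char *p)
{
    return isalpha((unsigned char)p[0]) && p[1] == ':';
}

/* "has leading [/\\] (maybe after a drive-letter:)" */
int sketch_is_rooted(const char *p)
{
    if (has_drive(p))
        p += 2;
    return is_sep(p[0]);
}

/* "has drive letter and is_rooted" */
int sketch_is_absolute(const char *p)
{
    return has_drive(p) && sketch_is_rooted(p);
}

/* "changes with current dir": assumed here to mean "not rooted" */
int sketch_is_relative(const char *p)
{
    return !sketch_is_rooted(p);
}
```

Under these definitions C:/emx/bin is absolute, \config.sys is rooted but not absolute (it still depends on the current drive), and both C:foo and lib/perl5 are relative.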

Prebuilt variables:

  • $OS2::emx_rev

    numeric value is the same as _emx_rev of EMX, a string value the same as _emx_vprt (similar to 0.9c).

  • $OS2::emx_env

    same as _emx_env of EMX, a number similar to 0x8001.

  • $OS2::os_ver

    a number OS_MAJOR + 0.001 * OS_MINOR.

  • $OS2::is_aout

    true if the Perl library was compiled in AOUT format.

  • $OS2::can_fork

    true if the current executable is an AOUT EMX executable, so Perl can fork. Do not use this, use the portable check for $Config::Config{d_fork}.

  • $OS2::nsyserror

    This variable (default is 1) controls whether to enforce the contents of $^E to start with a SYS0003-like id. If set to 0, then the string value of $^E is what is available from the OS/2 message file. (Some messages in this file have a SYS0003-like id prepended, some not.)

Misfeatures

  • Since flock(3) is present in EMX, but is not functional, it is emulated by perl. To disable the emulations, set environment variable USE_PERL_FLOCK=0.

  • Here is the list of things which may be "broken" on EMX (from EMX docs):

    • The functions recvmsg(3), sendmsg(3), and socketpair(3) are not implemented.

    • sock_init(3) is not required and not implemented.

    • flock(3) is not yet implemented (dummy function). (Perl has a workaround.)

    • kill(3): Special treatment of PID=0, PID=1 and PID=-1 is not implemented.

    • waitpid(3):

      WUNTRACED
          Not implemented.
      waitpid() is not implemented for negative values of PID.

    Note that kill -9 does not work with the current version of EMX.

  • See Text-mode filehandles.

  • Unix-domain sockets on OS/2 live in a pseudo-file-system /sockets/... . To avoid a failure to create a socket with a name of a different form, "/socket/" is prepended to the socket name (unless it starts with this already).

    This may lead to problems later in case the socket is accessed via the "usual" file-system calls using the "initial" name.

  • Apparently, IBM used a compiler (for some period of time around '95?) which changes FP mask right and left. This is not that bad for IBM's programs, but the same compiler was used for DLLs which are used with general-purpose applications. When these DLLs are used, the state of floating-point flags in the application is not predictable.

    What is much worse, some DLLs change the floating point flags when in _DLLInitTerm() (e.g., TCP32IP). This means that even if you do not call any function in the DLL, just the act of loading this DLL will reset your flags. What is worse, the same compiler was used to compile some HOOK DLLs. Given that HOOK DLLs are executed in the context of all the applications in the system, this means a complete unpredictability of floating point flags on systems using such HOOK DLLs. E.g., GAMESRVR.DLL of DIVE origin changes the floating point flags on each write to the TTY of a VIO (windowed text-mode) application.

    Some other (not completely debugged) situations when FP flags change include some video drivers (?), and some operations related to creation of the windows. People who code OpenGL may have more experience on this.

    Perl is generally used in the situation when all the floating-point exceptions are ignored, as is the default under EMX. If they are not ignored, some benign Perl programs would get a SIGFPE and would die a horrible death.

    To circumvent this, Perl uses two hacks. They help against one type of damage only: FP flags changed when loading a DLL.

    One of the hacks is to disable floating point exceptions on Perl startup (as is the default with EMX). This helps only with compile-time-linked DLLs changing the flags before main() had a chance to be called.

    The other hack is to restore FP flags after a call to dlopen(). This helps against similar damage done by the _DLLInitTerm() of DLLs at runtime. Currently no way to switch these hacks off is provided.
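The two ideas involved here, a masked update of the control word (as described for OS2::_control87) and restoring the flags after a call that may clobber them, can be sketched in portable C. The variable fake_cw stands in for the coprocessor control word so that the example runs anywhere; the names sketch_control87, call_preserving_fp_mask and simulated_dll_init, and the values 0x037f and 0x003f, are assumptions made for the illustration and are not taken from the Perl sources.

```c
/* Stand-in for the coprocessor control word (assumed initial value);
 * real code would go through _control87(3) of EMX instead. */
static unsigned int fake_cw = 0x037fu;

/* Assumed position of the exception-mask bits within the control word. */
#define SKETCH_MCW_EM 0x003fu

/* Masked update: only the bits of new_bits selected by mask are changed.
 * Returns the previous control word; a mask of 0 just reads it. */
unsigned int sketch_control87(unsigned int new_bits, unsigned int mask)
{
    unsigned int old = fake_cw;

    fake_cw = (old & ~mask) | (new_bits & mask);
    return old;
}

/* Simulates a DLL whose initialization unmasks all FP exceptions. */
void simulated_dll_init(void)
{
    sketch_control87(0u, SKETCH_MCW_EM);
}

/* Run fn, then put the exception-mask bits back as they were before,
 * mirroring the idea of restoring FP flags after a call to dlopen(). */
void call_preserving_fp_mask(void (*fn)(void))
{
    unsigned int saved = sketch_control87(0u, 0u);

    fn();
    sketch_control87(saved, SKETCH_MCW_EM);
}
```

Calling simulated_dll_init() directly leaves the exception-mask bits cleared, while call_preserving_fp_mask(simulated_dll_init) leaves the control word as it was before the call. Only the bits selected by the mask are restored, so any other change made by the callee would survive.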

Modifications

Perl modifies some standard C library calls in the following ways:

  • popen

    my_popen uses sh.exe if shell is required, cf. PERL_SH_DIR.

  • tmpnam

    is created using TMP or TEMP environment variable, via tempnam.

  • tmpfile

    If the current directory is not writable, file is created using modified tmpnam, so there may be a race condition.

  • ctermid

    a dummy implementation.

  • stat

    os2_stat special-cases /dev/tty and /dev/con.

  • mkdir, rmdir

    these EMX functions do not work if the path contains a trailing /. Perl contains a workaround for this.

  • flock

    Since flock(3) is present in EMX, but is not functional, it is emulated by perl. To disable the emulations, set environment variable USE_PERL_FLOCK=0.

Identifying DLLs

All the DLLs built with the current versions of Perl have ID strings identifying the name of the extension, its version, and the version of Perl required for this DLL. Run bldlevel DLL-name to find this info.

Centralized management of resources

Since calling certain OS/2 APIs requires a correctly initialized Win subsystem, OS/2-specific extensions may require getting HABs and HMQs. If an extension did this on its own, another extension could fail to initialize.

Perl provides a centralized management of these resources:

  • HAB

    To get the HAB, the extension should call hab = perl_hab_GET() in C. After this call is performed, hab may be accessed as Perl_hab. There is no need to release the HAB after it is used.

    If for some reason perl.h cannot be included, use

    extern int Perl_hab_GET(void);

    instead.

  • HMQ

    There are two cases:

    • the extension needs an HMQ only because some API will not work otherwise. Use serve = 0 below.

    • the extension needs an HMQ since it wants to engage in a PM event loop. Use serve = 1 below.

    To get an HMQ, the extension should call hmq = perl_hmq_GET(serve) in C. After this call is performed, hmq may be accessed as Perl_hmq.

    To signal to Perl that HMQ is not needed any more, call perl_hmq_UNSET(serve). Perl process will automatically morph/unmorph itself into/from a PM process if HMQ is needed/not-needed. Perl will automatically enable/disable WM_QUIT message during shutdown if the message queue is served/not-served.

    NOTE. If during a shutdown there is a message queue which did not disable WM_QUIT, and which did not process the received WM_QUIT message, the shutdown will be automatically cancelled. Do not call perl_hmq_GET(1) unless you are going to process messages on an orderly basis.

  • Treating errors reported by OS/2 API

    There are two principal conventions (it is useful to call them Dos* and Win* - though this part of the function signature is not always determined by the name of the API) of reporting the error conditions of the OS/2 API. Most Dos* APIs report the error code as the result of the call (so 0 means success, and there are many types of errors). Most Win* APIs report success/failure via the result being TRUE/FALSE; to find the reason for the failure one should call the WinGetLastError() API.

    Some Win* entry points also overload a "meaningful" return value with the error indicator; having a 0 return value indicates an error. Yet some other Win* entry points overload things even more, and 0 return value may mean a successful call returning a valid value 0, as well as an error condition; in the case of a 0 return value one should call WinGetLastError() API to distinguish a successful call from a failing one.

    By convention, all the calls to OS/2 API should indicate their failures by resetting $^E. All the Perl-accessible functions which call OS/2 API may be broken into two classes: some die()s when an API error is encountered, the other report the error via a false return value (of course, this does not concern Perl-accessible functions which expect a failure of the OS/2 API call, having some workarounds coded).

    Obviously, in the situation of the last type of the signature of an OS/2 API, it is much more convenient for the users if the failure is indicated by die()ing: one does not need to check $^E to know that something went wrong. If, however, this solution is not desirable for some reason, the code in question should reset $^E to 0 before making this OS/2 API call, so that the caller of this Perl-accessible function has a chance to distinguish a success-but-0-return value from a failure. (One may return undef as an alternative way of reporting an error.)

    The macros to simplify this type of error propagation are

    • CheckOSError(expr)

      Returns true on error, sets $^E. Expects expr() to be a call of Dos*-style API.

    • CheckWinError(expr)

      Returns true on error, sets $^E. Expects expr() to be a call of Win*-style API.

    • SaveWinError(expr)

      Returns expr, sets $^E from WinGetLastError() if expr is false.

    • SaveCroakWinError(expr,die,name1,name2)

      Returns expr, sets $^E from WinGetLastError() if expr is false, and die()s if die and $^E are true. The message to die is the concatenated strings name1 and name2, separated by ": " from the contents of $^E.

    • WinError_2_Perl_rc

      Sets Perl_rc to the return value of WinGetLastError().

    • FillWinError

      Sets Perl_rc to the return value of WinGetLastError(), and sets $^E to the corresponding value.

    • FillOSError(rc)

      Sets Perl_rc to rc, and sets $^E to the corresponding value.

  • Loading DLLs and ordinals in DLLs

    Some DLLs are only present in some versions of OS/2, or in some configurations of OS/2. Some exported entry points are present only in DLLs shipped with some versions of OS/2. If these DLLs and entry points were linked directly for a Perl executable/DLL or from a Perl extension, this binary would work only with the specified versions/setups. Even if these entry points were not needed, the load of the executable (or DLL) would fail.

    For example, many newer useful APIs are not present in OS/2 v2; many PM-related APIs require DLLs not available on floppy-boot setup.

    To make these calls fail only when the calls are executed, one should call these APIs via a dynamic linking API. There is a subsystem in Perl to simplify this type of call. A large number of entry points available for such linking is provided (see entries_ordinals - and also PMWIN_entries - in os2ish.h). These ordinals can be accessed via the APIs:

    CallORD(), DeclFuncByORD(), DeclVoidFuncByORD(),
    DeclOSFuncByORD(), DeclWinFuncByORD(), AssignFuncPByORD(),
    DeclWinFuncByORD_CACHE(), DeclWinFuncByORD_CACHE_survive(),
    DeclWinFuncByORD_CACHE_resetError_survive(),
    DeclWinFunc_CACHE(), DeclWinFunc_CACHE_resetError(),
    DeclWinFunc_CACHE_survive(), DeclWinFunc_CACHE_resetError_survive()

    See the header files and the C code in the supplied OS/2-related modules for the details on usage of these functions.

    Some of these functions also combine dynaloading semantic with the error-propagation semantic discussed above.
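The error conventions described under "Treating errors reported by OS/2 API" can be illustrated with a portable C sketch. Nothing below is a real OS/2 entry point or a real Perl macro: the mock_ functions are stand-ins for a Dos*-style call and for a Win*-style call whose 0 return value is ambiguous, the variable sketch_os_error plays the role of $^E, and the error numbers are arbitrary.

```c
/* Plays the role of $^E in this sketch. */
static unsigned long sketch_os_error = 0;

/* State behind the mock "get last error" call. */
static unsigned long mock_last_win_error = 0;

/* Dos*-style stand-in: the return value is the error code, 0 is success. */
unsigned long mock_dos_call(int should_fail)
{
    return should_fail ? 5UL : 0UL;
}

/* Win*-style stand-in with an overloaded result: 0 may be a valid value
 * or may signal a failure; only the last-error state tells them apart. */
long mock_win_query(int should_fail, long value)
{
    if (should_fail) {
        mock_last_win_error = 0x1001UL;
        return 0;
    }
    mock_last_win_error = 0;
    return value;
}

unsigned long mock_win_get_last_error(void)
{
    return mock_last_win_error;
}

/* Dos*-style check: true on error, and the error code is recorded. */
int check_dos_style(unsigned long rc)
{
    if (rc)
        sketch_os_error = rc;
    return rc != 0;
}

/* Ambiguous Win*-style call: reset the error before the call, so that
 * the caller can tell "succeeded and returned 0" from "failed". */
long query_with_error_reset(int should_fail, long value)
{
    long r;

    sketch_os_error = 0;
    r = mock_win_query(should_fail, value);
    if (r == 0)
        sketch_os_error = mock_win_get_last_error();
    return r;
}
```

After query_with_error_reset() returns 0, a caller inspects sketch_os_error: 0 means the call succeeded with the value 0, anything else means it failed. This is the reset-before-call discipline described above for functions which report errors through a return value instead of die()ing.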

Perl flavors

Because of idiosyncrasies of OS/2 one cannot have all the eggs in the same basket (though the EMX environment tries hard to overcome these limitations, so the situation may somehow improve). There are 4 executables for Perl provided by the distribution:

perl.exe

The main workhorse. This is a chimera executable: it is compiled as an a.out-style executable, but is linked with omf-style dynamic library perl.dll, and with dynamic CRT DLL. This executable is a VIO application.

It can load perl dynamic extensions, and it can fork().

Note. Keep in mind that fork() is needed to open a pipe to yourself.

perl_.exe

This is a statically linked a.out-style executable. It cannot load dynamic Perl extensions. The executable supplied in binary distributions has a lot of extensions prebuilt, thus the above restriction is important only if you use custom-built extensions. This executable is a VIO application.

This is the only executable which does not require OS/2. The friends locked into M$ world would appreciate the fact that this executable runs under DOS, Win0.3*, Win0.95 and WinNT with an appropriate extender. See Other OSes.

perl__.exe

This is the same executable as perl___.exe, but it is a PM application.

Note. Usually (unless explicitly redirected during the startup) STDIN, STDERR, and STDOUT of a PM application are redirected to nul. However, it is possible to see them if you start perl__.exe from a PM program which emulates a console window, like Shell mode of Emacs or EPM. Thus it is possible to use Perl debugger (see perldebug) to debug your PM application (but beware of the message loop lockups - this will not work if you have a message queue to serve, unless you hook the serving into the getc() function of the debugger).

Another way to see the output of a PM program is to run it as

  pm_prog args 2>&1 | cat -

with a shell different from cmd.exe, so that it does not create a link between a VIO session and the session of pm_prog. (Such a link closes the VIO window.) E.g., this works with sh.exe - or with Perl!

  open P, 'pm_prog args 2>&1 |' or die;
  print while <P>;

The flavor perl__.exe is required if you want to start your program without a VIO window present, but not detached (run help detach for more info). Very useful for extensions which use PM, like Perl/Tk or OpenGL.

Note also that the differences between PM and VIO executables are only in the default behaviour. One can start any executable in any kind of session by using the /fs, /pm or /win switches of the command start (of CMD.EXE or a similar shell). Alternatively, one can use the numeric first argument of the system Perl function (see OS2::Process).

perl___.exe

This is an omf-style executable which is dynamically linked to perl.dll and CRT DLL. I know no advantages of this executable over perl.exe, but it cannot fork() at all. Well, one advantage is that the build process is not so convoluted as with perl.exe.

It is a VIO application.

Why strange names?

Since Perl processes the #!-line (cf. DESCRIPTION in perlrun, Command Switches in perlrun, No Perl script found in input in perldiag), it should know when a program is a Perl. There is some naming convention which allows Perl to distinguish correct lines from wrong ones. The above names are almost the only names allowed by this convention which do not contain digits (which have absolutely different semantics).

Why dynamic linking?

Well, having several executables dynamically linked to the same huge library has its advantages, but this would not substantiate the additional work to make it compile. The reason is the complicated-to-developers but very quick and convenient-to-users "hard" dynamic linking used by OS/2.

There are two distinctive features of the dyna-linking model of OS/2: first, all the references to external functions are resolved at the compile time; second, there is no runtime fixup of the DLLs after they are loaded into memory. The first feature is an enormous advantage over other models: it avoids conflicts when several DLLs used by an application export entries with the same name. In such cases "other" models of dyna-linking just choose between these two entry points using some random criterion - with predictable disasters as results. But it is the second feature which requires the build of perl.dll.

The address tables of DLLs are patched only once, when they are loaded. The addresses of the entry points into DLLs are guaranteed to be the same for all the programs which use the same DLL. This removes the runtime fixup - once DLL is loaded, its code is read-only.

While this allows some (significant?) performance advantages, this makes life much harder for developers, since the above scheme makes it impossible for a DLL to be "linked" to a symbol in the .EXE file. Indeed, this would need a DLL to have different relocations tables for the (different) executables which use this DLL.

However, a dynamically loaded Perl extension is forced to use some symbols from the perl executable, e.g., to know how to find the arguments to the functions: the arguments live on the perl internal evaluation stack. The solution is to put the main code of the interpreter into a DLL, and make the .EXE file which just loads this DLL into memory and supplies command-arguments. The extension DLL cannot link to symbols in .EXE, but it has no problem linking to symbols in the .DLL.

This greatly increases the load time for the application (as well as complexity of the compilation). Since interpreter is in a DLL, the C RTL is basically forced to reside in a DLL as well (otherwise extensions would not be able to use CRT). There are some advantages if you use different flavors of perl, such as running perl.exe and perl__.exe simultaneously: they share the memory of perl.dll.

NOTE. There is one additional effect which makes DLLs more wasteful: DLLs are loaded in the shared memory region, which is a scarce resource given the 512M barrier of the "standard" OS/2 virtual memory. The code of .EXE files is also shared by all the processes which use the particular .EXE, but they are "shared in the private address space of the process"; this is possible because the address at which different sections of the .EXE file are loaded is decided at compile-time, thus all the processes have these sections loaded at same addresses, and no fixup of internal links inside the .EXE is needed.

Since DLLs may be loaded at run time, to have the same mechanism for DLLs one needs to have the address range of any of the loaded DLLs in the system to be available in all the processes which did not load a particular DLL yet. This is why the DLLs are mapped to the shared memory region.

Why chimera build?

Current EMX environment does not allow DLLs compiled using Unixish a.out format to export symbols for data (or at least some types of data). This forces omf-style compile of perl.dll.

Current EMX environment does not allow .EXE files compiled in omf format to fork(). fork() is needed for exactly three Perl operations:

  • explicit fork() in the script,

  • open FH, "|-"

  • open FH, "-|" , in other words, opening pipes to itself.

While these operations are not questions of life and death, they are needed for a lot of useful scripts. This forces a.out-style compile of perl.exe.
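To make the third operation concrete, here is a minimal sketch of a script opening a pipe to itself. It assumes a perl flavor that can fork() (such as perl.exe above); the sub name lines_from_child is hypothetical and not part of the port.

```perl
use strict;
use warnings;

# Hypothetical helper: fork a child via open(..., '-|') and collect
# everything the child prints on its STDOUT.
sub lines_from_child {
    my @messages = @_;
    my $pid = open(my $from_kid, '-|');    # implicit fork()
    die "cannot fork: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: STDOUT is connected to the parent's $from_kid handle.
        print "$_\n" for @messages;
        exit 0;
    }
    # Parent: read what the child wrote, then reap it.
    my @lines = <$from_kid>;
    close $from_kid or warn "child exited with status $?\n";
    chomp @lines;
    return @lines;
}

print "$_\n" for lines_from_child('one', 'two');
```

Under an executable that cannot fork(), such as perl___.exe, the open() call is expected to fail, so the die branch would be taken.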

ENVIRONMENT

Here we list environment variables which are either OS/2- and DOS- and Win*-specific, or are more important under OS/2 than under other OSes.

PERLLIB_PREFIX

Specific for EMX port. Should have the form

  path1;path2

or

  path1 path2

If the beginning of some prebuilt path matches path1, it is substituted with path2.

Should be used if the perl library is moved from the default location in preference to PERL(5)LIB, since this would not leave wrong entries in @INC. For example, if the compiled version of perl looks for @INC in f:/perllib/lib, and you want to install the library in h:/opt/gnu, do

  set PERLLIB_PREFIX=f:/perllib/lib;h:/opt/gnu

This will cause Perl with the prebuilt @INC of

  f:/perllib/lib/5.00553/os2
  f:/perllib/lib/5.00553
  f:/perllib/lib/site_perl/5.00553/os2
  f:/perllib/lib/site_perl/5.00553
  .

to use the following @INC:

  h:/opt/gnu/5.00553/os2
  h:/opt/gnu/5.00553
  h:/opt/gnu/site_perl/5.00553/os2
  h:/opt/gnu/site_perl/5.00553
  .
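The substitution can be modelled in a few lines of Perl. This is only a sketch of the rule as stated above, not the code used by the port: the sub name apply_perllib_prefix is hypothetical, and the comparison here is a plain case-sensitive string match, which is an assumption.

```perl
use strict;
use warnings;

# Hypothetical model of PERLLIB_PREFIX: split the setting into path1 and
# path2 (separated by ';' or whitespace), then replace a leading path1
# in each prebuilt path by path2.
sub apply_perllib_prefix {
    my ($setting, @inc) = @_;
    my ($from, $to) = split /;|\s+/, $setting, 2;
    return @inc unless defined $from && defined $to && length $from;
    return map {
        my $dir = $_;
        $dir =~ s/^\Q$from\E/$to/;    # only a match at the beginning counts
        $dir;
    } @inc;
}

my @prebuilt = (
    'f:/perllib/lib/5.00553/os2',
    'f:/perllib/lib/site_perl/5.00553',
    '.',
);
print "$_\n" for apply_perllib_prefix('f:/perllib/lib;h:/opt/gnu', @prebuilt);
```

Entries which do not start with path1, such as the trailing ".", pass through unchanged.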

PERL_BADLANG

If 0, perl ignores setlocale() failing. May be useful with some strange locales.

PERL_BADFREE

If 0, perl would not warn in case of unwarranted free(). With older perls this might be useful in conjunction with the module DB_File, which was buggy when dynamically linked and OMF-built.

Should not be set with newer Perls, since this may hide some real problems.

PERL_SH_DIR

Specific for EMX port. Gives the directory part of the location for sh.exe.

USE_PERL_FLOCK

Specific for EMX port. Since flock(3) is present in EMX, but is not functional, it is emulated by perl. To disable the emulations, set environment variable USE_PERL_FLOCK=0.

TMP or TEMP

Specific for EMX port. Used as storage place for temporary files.

Evolution

Here we list major changes which could take you by surprise.

Text-mode filehandles

Starting from version 5.8, Perl uses a builtin translation layer for text-mode files. This replaces the efficient well-tested EMX layer by some code which should be best characterized as a "quick hack".

In addition to possible bugs and an inability to follow changes to the translation policy with off/on switches of TERMIO translation, this introduces a serious incompatible change: before sysread() on text-mode filehandles would go through the translation layer, now it would not.

Priorities

setpriority and getpriority are not compatible with earlier ports by Andreas Kaiser. See "setpriority, getpriority".

DLL name mangling: pre 5.6.2

With the release 5.003_01 the dynamically loadable libraries should be rebuilt when a different version of Perl is compiled. In particular, DLLs (including perl.dll) are now created with the names which contain a checksum, thus allowing workaround for OS/2 scheme of caching DLLs.

It may be possible to code a simple workaround which would

  • find the old DLLs looking through the old @INC;

  • mangle the names according to the scheme of new perl and copy the DLLs to these names;

  • edit the internal LX tables of DLL to reflect the change of the name (probably not needed for Perl extension DLLs, since the internally coded names are not used for "specific" DLLs, they are used only for "global" DLLs).

  • edit the internal IMPORT tables and change the name of the "old" perl????.dll to the "new" perl????.dll.

DLL name mangling: 5.6.2 and beyond

In fact mangling of extension DLLs was done due to misunderstanding of the OS/2 dynaloading model. OS/2 (effectively) maintains two different tables of loaded DLLs:

  • Global DLLs

    those loaded by the base name from LIBPATH; including those associated at link time;

  • specific DLLs

    loaded by the full name.

When resolving a request for a global DLL, the table of already-loaded specific DLLs is (effectively) ignored; moreover, specific DLLs are always loaded from the prescribed path.

There is/was a minor twist which makes this scheme fragile: what to do with DLLs loaded from

  • BEGINLIBPATH and ENDLIBPATH

    (which depend on the process)

  • . from LIBPATH

    which effectively depends on the process (although LIBPATH is the same for all the processes).

Unless LIBPATHSTRICT is set to T (and the kernel is after 2000/09/01), such DLLs are considered to be global. When loading a global DLL it is first looked for in the table of already-loaded global DLLs. Because of this, the fact that one executable loaded a DLL from BEGINLIBPATH and ENDLIBPATH, or . from LIBPATH may affect which DLL is loaded when another executable requests a DLL with the same name. This is the reason for version-specific mangling of the DLL name for perl DLL.

Since the Perl extension DLLs are always loaded with the full path, there is no need to mangle their names in a version-specific way: their directory already reflects the corresponding version of perl, and @INC takes into account binary compatibility with older versions. Starting from 5.6.2 the name mangling scheme is fixed to be the same as for Perl 5.005_53 (same as in a popular binary release). Thus new Perls will be able to resolve the names of old extension DLLs if @INC allows finding their directories.

However, this still does not guarantee that these DLLs may be loaded. The reason is the mangling of the name of the Perl DLL. And since the extension DLLs link with the Perl DLL, extension DLLs for older versions would load an older Perl DLL, and would most probably segfault (since the data in this DLL is not properly initialized).

There is a partial workaround (which can be made complete with newer OS/2 kernels): create a forwarder DLL with the same name as the DLL of the older version of Perl, which forwards the entry points to the newer Perl's DLL. Make this DLL accessible on (say) the BEGINLIBPATH of the new Perl executable. When the new executable accesses old Perl's extension DLLs, they would request the old Perl's DLL by name, get the forwarder instead, so effectively will link with the currently running (new) Perl DLL.

This may break in two ways:

  • An old perl executable is started while a running new executable has loaded an extension compiled for the old executable (ouph!). In this case the old executable will get a forwarder DLL instead of the old perl DLL, so would link with the new perl DLL. While not directly fatal, it will behave the same as the new executable. This defeats the whole purpose of explicitly starting an old executable.

  • A new executable loads an extension compiled for the old executable when an old perl executable is running. In this case the extension will not pick up the forwarder - with fatal results.

With support for LIBPATHSTRICT this may be circumvented - unless one of DLLs is started from . from LIBPATH (I do not know whether LIBPATHSTRICT affects this case).

REMARK. Unless newer kernels allow . in BEGINLIBPATH (older do not), this mess cannot be completely cleaned. (It turns out that as of the beginning of 2002, . is not allowed, but .\. is - and it has the same effect.)

REMARK. LIBPATHSTRICT, BEGINLIBPATH and ENDLIBPATH are not environment variables, although cmd.exe emulates them on SET ... lines. From Perl they may be accessed by Cwd::extLibpath and Cwd::extLibpath_set.

DLL forwarder generation

Assume that the old DLL is named perlE0AC.dll (as is one for 5.005_53), and the new version is 5.6.1. Create a file perl5shim.def-leader with

  LIBRARY 'perlE0AC' INITINSTANCE TERMINSTANCE
  DESCRIPTION '@#perl5-porters@perl.org:5.006001#@ Perl module for 5.00553 -> Perl 5.6.1 forwarder'
  CODE LOADONCALL
  DATA LOADONCALL NONSHARED MULTIPLE
  EXPORTS

modifying the versions/names as needed. Run

  perl -wnle "next if 0../EXPORTS/; print qq( \"$1\") if /\"(\w+)\"/" perl5.def >lst

in the Perl build directory (to make the DLL smaller replace perl5.def with the definition file for the older version of Perl if present).

  cat perl5shim.def-leader lst >perl5shim.def
  gcc -Zomf -Zdll -o perlE0AC.dll perl5shim.def -s -llibperl

(ignore multiple warning L4085).

Threading

As of release 5.003_01 perl is linked to the multithreaded C RTL DLL. If perl itself is not compiled multithread-enabled, neither is perl's malloc(). However, extensions may use multiple threads at their own risk.

This was needed to compile Perl/Tk for XFree86-OS/2 out-of-the-box, and link with DLLs for other useful libraries, which typically are compiled with -Zmt -Zcrtdll.

Calls to external programs

Due to popular demand the perl external program calling has been changed wrt Andreas Kaiser's port. If perl needs to call an external program via shell, the f:/bin/sh.exe will be called, or whatever is the override, see PERL_SH_DIR.

This means that you need to get some copy of a sh.exe as well (I use one from pdksh). The path F:/bin above is set up automatically during the build to a correct value on the builder machine, but is overridable at runtime.

Reasons: a consensus on perl5-porters was that perl should use one non-overridable shell per platform. The obvious choices for OS/2 are cmd.exe and sh.exe. Having perl build itself would be impossible with cmd.exe as a shell, thus I picked up sh.exe. This assures almost 100% compatibility with the scripts coming from *nix. As an added benefit this works as well under DOS if you use DOS-enabled port of pdksh (see Prerequisites).

Disadvantages: currently sh.exe of pdksh calls external programs via fork()/exec(), and there is no functioning exec() on OS/2. exec() is emulated by EMX by an asynchronous call while the caller waits for child completion (to pretend that the pid did not change). This means that 1 extra copy of sh.exe is made active via fork()/exec(), which may lead to some resources taken from the system (even if we do not count extra work needed for fork()ing).

Note that this is a lesser issue now that we do not spawn sh.exe unless needed (metachars found).

One can always start cmd.exe explicitly via

  system 'cmd', '/c', 'mycmd', 'arg1', 'arg2', ...

If you need to use cmd.exe, and do not want to hand-edit thousands of your scripts, the long-term solution proposed on p5-p is to have a directive

  use OS2::Cmd;

which will override system(), exec(), ``, and open(,'...|'). With current perl you may override only system(), readpipe() - the explicit version of ``, and maybe exec(). The code will substitute the one-argument call to system() by CORE::system('cmd.exe', '/c', shift).

If you have some working code for OS2::Cmd, please send it to me, I will include it into distribution. I have no need for such a module, so cannot test it.
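The substitution described for the one-argument system() can be sketched as follows. This is not an implementation of OS2::Cmd, which does not exist in the distribution; the sub names cmd_system_args and system_via_cmd are hypothetical, and installing the override through *CORE::GLOBAL::system is an assumption about how such a module might work.

```perl
use strict;
use warnings;

# Hypothetical helper: compute the argument list which an OS2::Cmd-style
# override would hand to CORE::system().
sub cmd_system_args {
    my @args = @_;
    # Only the one-argument form is rerouted through cmd.exe.
    return ('cmd.exe', '/c', $args[0]) if @args == 1;
    return @args;
}

# Hypothetical override; it is NOT installed here. To activate it, assign it
# to *CORE::GLOBAL::system in a BEGIN block before the calling code is compiled.
sub system_via_cmd {
    return CORE::system(cmd_system_args(@_));
}

print join(' ', cmd_system_args('dir /w')), "\n";
```

Multi-argument calls are passed through untouched, since they bypass the shell anyway.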

For the details of the current situation with calling external programs, see "Starting OS/2 (and DOS) programs under Perl". Let us mention a couple of features:

  • External scripts may be called by their basename. Perl will try the same extensions as when processing -S command-line switch.

  • External scripts starting with #! or extproc will be executed directly, without calling the shell, by calling the program specified on the rest of the first line.

Memory allocation

Perl uses its own malloc() under OS/2 - interpreters are usually malloc-bound for speed, but perl is not, since its malloc is lightning-fast. Perl-memory-usage-tuned benchmarks show that Perl's malloc is 5 times quicker than EMX one. I do not have convincing data about memory footprint, but a (pretty random) benchmark showed that Perl's one is 5% better.

Combination of perl's malloc() and rigid DLL name resolution creates a special problem with library functions which expect their return value to be free()d by system's free(). To facilitate extensions which need to call such functions, system memory-allocation functions are still available with the prefix emx_ added. (Currently only DLL perl has this, it should propagate to perl_.exe shortly.)

Threads

One can build perl with thread support enabled by providing -D usethreads option to Configure. Currently OS/2 support of threads is very preliminary.

Most notable problems:

  • COND_WAIT

    may have a race condition (but probably does not due to edge-triggered nature of OS/2 Event semaphores). (Needs a reimplementation (in terms of chaining waiting threads, with the linked list stored in per-thread structure?)?)

  • os2.c

    has a couple of static variables used in OS/2-specific functions. (Need to be moved to per-thread structure, or serialized?)

Note that these problems should not discourage experimenting, since they have a low probability of affecting small programs.

BUGS

This description is not updated often (since 5.6.1?), see ./os2/Changes for more info.

AUTHOR

Ilya Zakharevich, cpan@ilyaz.org

SEE ALSO

perl(1).


perlos390

NAME

perlos390 - building and installing Perl for OS/390 and z/OS

SYNOPSIS

This document will help you Configure, build, test and install Perl on OS/390 (aka z/OS) Unix System Services.

DESCRIPTION

This is a fully ported Perl for OS/390 Version 2 Release 3, 5, 6, 7, 8, and 9. It may work on other versions or releases, but those are the ones we've tested it on.

You may need to carry out some system configuration tasks before running the Configure script for Perl.

Tools

The z/OS Unix Tools and Toys list may prove helpful and contains links to ports of much of the software helpful for building Perl. http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1toy.html

Unpacking Perl distribution on OS/390

If using ftp remember to transfer the distribution in binary format.

Gunzip/gzip for OS/390 is discussed at:

  http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1ty1.html

To extract an ASCII tar archive on OS/390, try this:

  pax -o to=IBM-1047,from=ISO8859-1 -r < latest.tar

or

  zcat latest.tar.Z | pax -o to=IBM-1047,from=ISO8859-1 -r

If you get lots of errors of the form

  tar: FSUM7171 ...: cannot set uid/gid: EDC5139I Operation not permitted.

then you didn't read the above and tried to use tar instead of pax. You'll first have to remove the (now corrupt) perl directory

  rm -rf perl-...

and then use pax.

Setup and utilities for Perl on OS/390

Be sure that your yacc installation is in place including any necessary parser template files. If you have not already done so then be sure to:

  cp /samples/yyparse.c /etc

This may also be a good time to ensure that your /etc/protocol file and either your /etc/resolv.conf or /etc/hosts files are in place. The IBM document that described such USS system setup issues was SC28-1890-07 "OS/390 UNIX System Services Planning", in particular Chapter 6 on customizing the OE shell.

GNU make for OS/390, which is recommended for the build of perl (as well as building CPAN modules and extensions), is available from the Tools.

Some people have reported encountering "Out of memory!" errors while trying to build Perl using GNU make binaries. If you encounter such trouble then try to download the source code kit and build GNU make from source to eliminate any such trouble. You might also find GNU make (as well as Perl and Apache) in the red-piece/book "Open Source Software for OS/390 UNIX", SG24-5944-00 from IBM.

If instead of the recommended GNU make you would like to use the system supplied make program then be sure to install the default rules file properly via the shell command:

  cp /samples/startup.mk /etc

and be sure to also set the environment variable _C89_CCMODE=1 (exporting _C89_CCMODE=1 is also a good idea for users of GNU make).

You might also want to have GNU groff for OS/390 installed before running the "make install" step for Perl.

There is a syntax error in the /usr/include/sys/socket.h header file that IBM supplies with USS V2R7, V2R8, and possibly V2R9. The problem with the header file is that near the definition of the SO_REUSEPORT constant there is a spurious extra '/' character outside of a comment like so:

  #define SO_REUSEPORT 0x0200 /* allow local address & port
  reuse */ /

You could edit that header yourself to remove that last '/', or you might note that Language Environment (LE) APAR PQ39997 describes the problem and PTF's UQ46272 and UQ46271 are the (R8 at least) fixes and apply them. If left unattended that syntax error will turn up as an inability for Perl to build its "Socket" extension.
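Before editing the header by hand it may help to locate the offending line. The following is a rough sketch only: the sub name stray_slash_lines is hypothetical, and the pattern merely looks for a lone '/' that trails a comment terminator at the end of a line, as in the excerpt above.

```perl
use strict;
use warnings;

# Hypothetical checker: return the 1-based numbers of the lines which end
# with a lone '/' right after a closing '*/'.
sub stray_slash_lines {
    my ($text) = @_;
    my @hits;
    my $n = 0;
    for my $line (split /\n/, $text) {
        $n++;
        push @hits, $n if $line =~ m{\*/\s*/\s*$};
    }
    return @hits;
}

my $excerpt = "#define SO_REUSEPORT 0x0200 /* allow local address & port\n"
            . "reuse */ /\n";
print "suspicious line: $_\n" for stray_slash_lines($excerpt);
```

To check the real file one would read /usr/include/sys/socket.h into a string and pass it to the sub instead of the excerpt.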

For successful testing you may need to turn on the sticky bit for your world readable /tmp directory if you have not already done so (see man chmod).
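Whether the sticky bit is already set can be checked from Perl itself. This is a minimal sketch, assuming the Fcntl module on the platform provides S_ISVTX and S_IWOTH; the sub name mode_is_unsafe is hypothetical.

```perl
use strict;
use warnings;
use Fcntl ':mode';    # S_ISVTX, S_IWOTH

# Hypothetical helper: true if a permission mode is world writable
# without the sticky bit.
sub mode_is_unsafe {
    my ($mode) = @_;
    return ( ($mode & S_IWOTH) && !($mode & S_ISVTX) ) ? 1 : 0;
}

my $dir = '/tmp';
my @st  = stat $dir;
if (@st) {
    printf "%s has mode %04o: %s\n", $dir, $st[2] & 07777,
        mode_is_unsafe($st[2]) ? 'sticky bit missing' : 'ok';
}
```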

Configure Perl on OS/390

Once you've unpacked the distribution, run "sh Configure" (see INSTALL for a full discussion of the Configure options). There is a "hints" file for os390 that specifies the correct values for most things. Some things to watch out for include:

  • A message of the form:

    (I see you are using the Korn shell. Some ksh's blow up on Configure,
    mainly on older exotic systems. If yours does, try the Bourne shell instead.)

    is nothing to worry about at all.

  • Some of the parser default template files in /samples are needed in /etc. In particular be sure that you at least copy /samples/yyparse.c to /etc before running Perl's Configure. This step ensures successful extraction of EBCDIC versions of parser files such as perly.c, perly.h, and x2p/a2p.c. This has to be done before running Configure the first time. If you failed to do so then the easiest way to re-Configure Perl is to delete your misconfigured build root and re-extract the source from the tar ball. Then you must ensure that /etc/yyparse.c is properly in place before attempting to re-run Configure.

  • This port will support dynamic loading, but it is not selected by default. If you would like to experiment with dynamic loading then be sure to specify -Dusedl in the arguments to the Configure script. See the comments in hints/os390.sh for more information on dynamic loading. If you build with dynamic loading then you will need to add the $archlibexp/CORE directory to your LIBPATH environment variable in order for perl to work. See the config.sh file for the value of $archlibexp. If in trying to use Perl you see an error message similar to:

    CEE3501S The module libperl.dll was not found.
    From entry point __dllstaticinit at compile unit offset +00000194 at

    then your LIBPATH does not have the location of libperl.x and either libperl.dll or libperl.so in it. Add that directory to your LIBPATH and proceed.

  • Do not turn on the compiler optimization flag "-O". There is a bug in either the optimizer or perl that causes perl to not work correctly when the optimizer is on.

  • Some of the configuration files in /etc used by the networking APIs are either missing or have the wrong names. In particular, make sure that there's either an /etc/resolv.conf or an /etc/hosts, so that gethostbyname() works, and make sure that the file /etc/proto has been renamed to /etc/protocol (NOT /etc/protocols, as used by other Unix systems). You may have to look for things like HOSTNAME and DOMAINORIGIN in the "//'SYS1.TCPPARMS(TCPDATA)'" PDS member in order to properly set up your /etc networking files.
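For the dynamic loading item above, the LIBPATH adjustment can be sketched as a small helper. The sub name libpath_with_core is hypothetical, the colon separator is an assumption about the USS LIBPATH syntax, and the directory in the example is made up; take the real value of $archlibexp from config.sh.

```perl
use strict;
use warnings;

# Hypothetical helper: append "$archlibexp/CORE" to a colon-separated
# LIBPATH value unless that directory is already listed.
sub libpath_with_core {
    my ($libpath, $archlibexp) = @_;
    my $core = "$archlibexp/CORE";
    my @dirs = grep { length } split /:/, (defined $libpath ? $libpath : '');
    push @dirs, $core unless grep { $_ eq $core } @dirs;
    return join ':', @dirs;
}

# The archlibexp value below is purely illustrative.
print 'export LIBPATH=', libpath_with_core('/lib:/usr/lib', '/usr/local/lib/perl5/os390'), "\n";
```

A dynamically linked perl cannot start until LIBPATH is right, so the resulting export line belongs in the shell profile; the helper only shows how the value is composed.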

Build, Test, Install Perl on OS/390

Simply put:

  sh Configure
  make
  make test

If everything looks ok (see the next section for test/IVP diagnosis) then:

  make install

This last step may or may not require UID=0 privileges depending on how you answered the questions that Configure asked and whether or not you have write access to the directories you specified.

Build Anomalies with Perl on OS/390

"Out of memory!" messages during the build of Perl are most often fixed by rebuilding the GNU make utility for OS/390 from a source code kit.

Another memory limiting item to check is your MAXASSIZE parameter in your 'SYS1.PARMLIB(BPXPRMxx)' data set (note too that as of V2R8 address space limits can be set on a per user ID basis in the USS segment of a RACF profile). People have reported successful builds of Perl with MAXASSIZE parameters as small as 503316480 (and it may be possible to build Perl with a MAXASSIZE smaller than that).

Within USS your /etc/profile or $HOME/.profile may limit your ulimit settings. Check that the following command returns reasonable values:

  ulimit -a

To conserve memory you should have your compiler modules loaded into the Link Pack Area (LPA/ELPA) rather than in a link list or step lib.

If the c89 compiler complains of syntax errors during the build of the Socket extension then be sure to fix the syntax error in the system header /usr/include/sys/socket.h.

Testing Anomalies with Perl on OS/390

The "make test" step runs a Perl Verification Procedure, usually before installation. You might encounter STDERR messages even during a successful run of "make test". Here is a guide to some of the more commonly seen anomalies:

  • A message of the form:

    io/openpid...........CEE5210S The signal SIGHUP was received.
    CEE5210S The signal SIGHUP was received.
    CEE5210S The signal SIGHUP was received.
    ok

    indicates that the t/io/openpid.t test of Perl has passed but done so with extraneous messages on stderr from CEE.

  • A message of the form:

    lib/ftmp-security....File::Temp::_gettemp: Parent directory (/tmp/) is not safe
    (sticky bit not set when world writable?) at lib/ftmp-security.t line 100
    File::Temp::_gettemp: Parent directory (/tmp/) is not safe (sticky bit not
    set when world writable?) at lib/ftmp-security.t line 100
    ok

    indicates a problem with the permissions on your /tmp directory within the HFS. To correct that problem issue the command:

    chmod a+t /tmp

    from an account with write access to the directory entry for /tmp.

  • Out of Memory!

    Recent perl test suite is quite memory hungry. In addition to the comments above on memory limitations it is also worth checking for _CEE_RUNOPTS in your environment. Perl now has (in miniperlmain.c) a C #pragma to set CEE run options, but the environment variable wins.

    The C code asks for:

    #pragma runopts(HEAP(2M,500K,ANYWHERE,KEEP,8K,4K) STACK(,,ANY,) ALL31(ON))

    The important parts of that are the second argument (the increment) to HEAP, and allowing the stack to be "Above the (16M) line". If the heap increment is too small then when perl (for example loading unicode/Name.pl) tries to create a "big" (400K+) string it cannot fit in a single segment and you get "Out of Memory!" - even if there is still plenty of memory available.

    A related issue is use with perl's malloc. Perl's malloc uses sbrk() to get memory, and sbrk() is limited to the first allocation so in this case something like:

    HEAP(8M,500K,ANYWHERE,KEEP,8K,4K)

    is needed to get through the test suite.
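Since a _CEE_RUNOPTS setting in the environment wins over the built-in pragma, it can be useful to see which heap increment it requests. The following sketch parses the first HEAP(...) clause of a run-options string. The sub names size_to_bytes and heap_sizes are hypothetical, and treating K and M as 1024 and 1024*1024 bytes is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a size such as "500K" or "2M" to bytes.
sub size_to_bytes {
    my ($size) = @_;
    return unless defined $size;
    my ($num, $unit) = $size =~ /^(\d+)([KM]?)$/i or return;
    my %mult = ('' => 1, K => 1024, M => 1024 * 1024);
    return $num * $mult{ uc $unit };
}

# Hypothetical helper: return (initial, increment) in bytes from the first
# HEAP(...) clause of a run-options string, or an empty list if none is found.
sub heap_sizes {
    my ($runopts) = @_;
    my ($args) = $runopts =~ /\bHEAP\(([^)]*)\)/i or return;
    my ($initial, $increment) = split /,/, $args;
    return (scalar size_to_bytes($initial), scalar size_to_bytes($increment));
}

if (defined $ENV{_CEE_RUNOPTS}) {
    my (undef, $inc) = heap_sizes($ENV{_CEE_RUNOPTS});
    warn "_CEE_RUNOPTS requests a heap increment below 400K\n"
        if defined $inc && $inc < 400 * 1024;
}
```

The 400K threshold mirrors the size of the "big" string mentioned above and is only a rule of thumb.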

Installation Anomalies with Perl on OS/390

The installman script will try to run on OS/390. There will be fewer errors if you have a roff utility installed. You can obtain GNU groff from the Redbook SG24-5944-00 ftp site.

Usage Hints for Perl on OS/390

When using perl on OS/390 please keep in mind that the EBCDIC and ASCII character sets are different. See perlebcdic.pod for more on such character set issues. Perl builtin functions that may behave differently under EBCDIC are also mentioned in the perlport.pod document.

Open Edition (UNIX System Services) from V2R8 onward does support #!/path/to/perl script invocation. There is a PTF available from IBM for V2R7 that will allow shell/kernel support for #!. USS releases prior to V2R7 did not support the #! means of script invocation. If you are running V2R6 or earlier then see:

  head `whence perldoc`

for an example of how to use the "eval exec" trick to ask the shell to have Perl run your scripts on those older releases of Unix System Services.
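For reference, the "eval exec" trick usually takes the following general shape. This is a sketch only: the perl path on the first line is an assumption and must match your installation, and the sub greeting is just a stand-in for the real script body.

```perl
#!/usr/local/bin/perl
# A shell which ignores the line above runs the next line and replaces
# itself with perl. Perl parses the same two lines as one statement with
# a false condition and skips them.
eval 'exec /usr/local/bin/perl -S $0 ${1+"$@"}'
    if 0;

# From here on, ordinary Perl.
sub greeting { return "running under perl $]" }
print greeting(), "\n";
```

The -S switch makes perl search PATH for the script, which matters when the shell passes only the script's base name in $0.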

If you are having trouble with square brackets then consider switching your rlogin or telnet client. Try to avoid older 3270 emulators and ISHELL for working with Perl on USS.

Floating Point Anomalies with Perl on OS/390

There appears to be a bug in the floating point implementation on S/390 systems such that calling int() on the product of a number and a small magnitude number is not the same as calling int() on the quotient of that number and a large magnitude number. For example, in the following Perl code:

  my $x = 100000.0;
  my $y = int($x * 1e-5) * 1e5; # '0'
  my $z = int($x / 1e+5) * 1e5; # '100000'
  print "\$y is $y and \$z is $z\n"; # $y is 0 and $z is 100000

Although one would expect the quantities $y and $z to be the same and equal to 100000 they will differ and instead will be 0 and 100000 respectively.

The problem can be further examined in a roughly equivalent C program:

  #include <stdio.h>
  #include <math.h>
  int main(void)
  {
      double r1, r2;      /* fractional parts returned by modf() (unused) */
      double x = 100000.0;
      double y = 0.0;
      double z = 0.0;
      x = 100000.0 * 1e-5;
      r1 = modf(x, &y);   /* y receives the integral part of x */
      x = 100000.0 / 1e+5;
      r2 = modf(x, &z);   /* z receives the integral part of x */
      printf("y is %e and z is %e\n", y*1e5, z*1e5);
      /* y is 0.000000e+00 and z is 1.000000e+05 (with c89) */
      return 0;
  }
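Code which must truncate such products can guard against the anomaly by snapping values that lie very close to a whole number before calling int(). This is a possible workaround sketch, not something the port provides: the sub name int_near and the default tolerance of 1e-9 are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: like int(), but a value within a small relative
# tolerance of a whole number is snapped to that whole number first.
sub int_near {
    my ($value, $eps) = @_;
    $eps = 1e-9 unless defined $eps;
    my $nearest = sprintf('%.0f', $value) + 0;
    my $scale   = abs($value) > 1 ? abs($value) : 1;
    return $nearest if abs($value - $nearest) <= $eps * $scale;
    return int($value);
}

my $x = 100000.0;
printf "%d %d\n", int_near($x * 1e-5) * 1e5, int_near($x / 1e+5) * 1e5;
```

The tolerance trades exactness for robustness: a value that is genuinely just below a whole number is rounded up as well, so choose $eps to suit the data.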

Modules and Extensions for Perl on OS/390

Pure Perl (that is, non-XS) modules may be installed via the usual:

  perl Makefile.PL
  make
  make test
  make install

If you built perl with dynamic loading capability then that would also be the way to build xs based extensions. However, if you built perl with the default static linking you can still build xs based extensions for OS/390 but you will need to follow the instructions in ExtUtils::MakeMaker for building statically linked perl binaries. In the simplest configurations building a static perl + xs extension boils down to:

  perl Makefile.PL
  make
  make perl
  make test
  make install
  make -f Makefile.aperl inst_perl MAP_TARGET=perl

In most cases people have reported better results with GNU make rather than the system's /bin/make program, whether for plain modules or for xs based extensions.

If the make process encounters trouble with either compilation or linking then try setting the _C89_CCMODE to 1. Assuming sh is your login shell then run:

  export _C89_CCMODE=1

If tcsh is your login shell then use the setenv command.

AUTHORS

David Fiander and Peter Prymmer with thanks to Dennis Longnecker and William Raffloer for valuable reports, LPAR and PTF feedback. Thanks to Mike MacIsaac and Egon Terwedow for SG24-5944-00. Thanks to Ignasi Roca for pointing out the floating point problems. Thanks to John Goodyear for dynamic loading help.

SEE ALSO

INSTALL, perlport, perlebcdic, ExtUtils::MakeMaker.

    http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1toy.html
    http://www.redbooks.ibm.com/redbooks/SG245944.html
    http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1ty1.html#opensrc
    http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/
    http://publibz.boulder.ibm.com:80/cgi-bin/bookmgr_OS390/BOOKS/ceea3030/
    http://publibz.boulder.ibm.com:80/cgi-bin/bookmgr_OS390/BOOKS/CBCUG030/

Mailing list for Perl on OS/390

If you are interested in the z/OS (formerly known as OS/390) and POSIX-BC (BS2000) ports of Perl then see the perl-mvs mailing list. To subscribe, send an empty message to perl-mvs-subscribe@perl.org.

See also:

    http://lists.perl.org/list/perl-mvs.html

There are web archives of the mailing list at:

    http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/
    http://archive.develooper.com/perl-mvs@perl.org/

HISTORY

This document was originally written by David Fiander for the 5.005 release of Perl.

This document was podified for the 5.005_03 release of Perl 11 March 1999.

Updated 28 November 2001 for broken URLs.

Updated 12 November 2000 for the 5.7.1 release of Perl.

Updated 15 January 2001 for the 5.7.1 release of Perl.

Updated 24 January 2001 to mention dynamic loading.

Updated 12 March 2001 to mention //'SYS1.TCPPARMS(TCPDATA)'.

 
perlos400

NAME

perlos400 - Perl version 5 on OS/400

DESCRIPTION

This document describes various features of IBM's OS/400 operating system that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

By far the easiest way to build Perl for OS/400 is to use PASE, the Portable Application Solutions Environment (see http://www.iseries.ibm.com/developer/factory/pase/index.html for more information). This environment allows one to use AIX APIs while programming, and it provides a runtime that allows AIX binaries to execute directly on the PowerPC iSeries.

Compiling Perl for OS/400 PASE

The recommended way to build Perl for the OS/400 PASE is to build the Perl 5 source code (release 5.8.1 or later) under AIX.

The trick is to give a special parameter to the Configure shell script when running it on AIX:

    sh Configure -DPASE ...

The default installation directory of Perl under PASE is /QOpenSys/perl. This can be modified if needed with Configure parameter -Dprefix=/some/dir.

Starting from OS/400 V5R2 the IBM Visual Age compiler is supported on OS/400 PASE, so it is possible to build Perl natively on OS/400. The easier way, however, is to compile in AIX, as just described.

If you don't want to install the compiled Perl in AIX into /QOpenSys (for packaging it before copying it to PASE), you can use a Configure parameter: -Dinstallprefix=/tmp/QOpenSys/perl. This will cause the "make install" to install everything into that directory, while the installed files still think they are (will be) in /QOpenSys/perl.

If building natively on PASE, please do the build under the /QOpenSys directory, since Perl is happier when built on a case sensitive filesystem.

Installing Perl in OS/400 PASE

If you are compiling on AIX, simply do a "make install" on the AIX box. Once the install finishes, tar up the /QOpenSys/perl directory. Transfer the tarball to the OS/400 using FTP with the following commands:

    > binary
    > site namefmt 1
    > put perl.tar /QOpenSys

Once the tarball is on the OS/400 system, simply bring up a PASE shell and extract it.

If you are compiling in PASE, then "make install" is the only thing you will need to do.

The default path for the perl binary is /QOpenSys/perl/bin/perl. You'll want to symlink /QOpenSys/usr/bin/perl to this file so you don't have to modify your path.

Using Perl in OS/400 PASE

Perl in PASE may be used in the same manner as you would use Perl on AIX.

Scripts starting with #!/usr/bin/perl should work if you have /QOpenSys/usr/bin/perl symlinked to your perl binary. This will not work if you've done a setuid/setgid or have environment variable PASE_EXEC_QOPENSYS="N". If you have V5R1, you'll need to get the latest PTFs to have this feature. Scripts starting with #!/QOpenSys/perl/bin/perl should always work.

Known Problems

When compiling in PASE, there is no "oslevel" command. Therefore, you may want to create a script called "oslevel" that echoes the level of AIX that your version of PASE runtime supports. If you're unsure, consult your documentation or use "4.3.3.0".

If you have test cases that fail, check for the existence of spool files. The test case may be trying to use a syscall that is not implemented in PASE. To avoid the SIGILL, try setting the PASE_SYSCALL_NOSIGILL environment variable or have a handler for the SIGILL. If you can compile programs for PASE, run the config script and edit config.sh when it gives you the option. If you want to remove fchdir(), which isn't implemented in V5R1, simply change the line that says:

d_fchdir='define'

to

d_fchdir='undef'

and then compile Perl. The places where fchdir() is used have alternatives for systems that do not have fchdir() available.

Perl on ILE

There exists a port of Perl to the ILE environment. This port, however, is based on quite an old release of Perl, Perl 5.00502 (August 1998). (As of July 2002 the latest release of Perl is 5.8.0, and even 5.6.1 has been out since April 2001.) If you need to run Perl on ILE, though, you may need this older port, available from http://www.cpan.org/ports/#os400 (no Perl release later than 5.00502 has been ported to ILE).

If you need to use Perl in the ILE environment, you may want to consider using Qp2RunPase() to call the PASE version of Perl.

AUTHORS

Jarkko Hietaniemi <jhi@iki.fi>, Bryan Logan <bryanlog@us.ibm.com>, David Larson <larson1@us.ibm.com>

 
perlpacktut

NAME

perlpacktut - tutorial on pack and unpack

DESCRIPTION

pack and unpack are two functions for transforming data according to a user-defined template, between the guarded way Perl stores values and some well-defined representation as might be required in the environment of a Perl program. Unfortunately, they're also two of the most misunderstood and most often overlooked functions that Perl provides. This tutorial will demystify them for you.

The Basic Principle

Most programming languages don't shelter the memory where variables are stored. In C, for instance, you can take the address of some variable, and the sizeof operator tells you how many bytes are allocated to the variable. Using the address and the size, you may access the storage to your heart's content.

In Perl, you just can't access memory at random, but the structural and representational conversion provided by pack and unpack is an excellent alternative. The pack function converts values to a byte sequence containing representations according to a given specification, the so-called "template" argument. unpack is the reverse process, deriving some values from the contents of a string of bytes. (Be cautioned, however, that not all that has been packed together can be neatly unpacked - a very common experience as seasoned travellers are likely to confirm.)

Why, you may ask, would you need a chunk of memory containing some values in binary representation? One good reason is input and output accessing some file, a device, or a network connection, whereby this binary representation is either forced on you or will give you some benefit in processing. Another cause is passing data to some system call that is not available as a Perl function: syscall requires you to provide parameters stored in the way it happens in a C program. Even text processing (as shown in the next section) may be simplified with judicious usage of these two functions.

To see how (un)packing works, we'll start with a simple template code where the conversion is in low gear: between the contents of a byte sequence and a string of hexadecimal digits. Let's use unpack, since this is likely to remind you of a dump program, or some desperate last message unfortunate programs are wont to throw at you before they expire into the wild blue yonder. Assuming that the variable $mem holds a sequence of bytes that we'd like to inspect without assuming anything about its meaning, we can write

    my( $hex ) = unpack( 'H*', $mem );
    print "$hex\n";

whereupon we might see something like this, with each pair of hex digits corresponding to a byte:

    41204d414e204120504c414e20412043414e414c2050414e414d41

What was in this chunk of memory? Numbers, characters, or a mixture of both? Assuming that we're on a computer where ASCII (or some similar) encoding is used: hexadecimal values in the range 0x41 - 0x5A indicate an uppercase letter, and 0x20 encodes a space. So we might assume it is a piece of text, which some are able to read like a tabloid; but others will have to get hold of an ASCII table and relive that first-grader feeling. Not caring too much about which way to read this, we note that unpack with the template code H converts the contents of a sequence of bytes into the customary hexadecimal notation. Since "a sequence of" is a pretty vague indication of quantity, H has been defined to convert just a single hexadecimal digit unless it is followed by a repeat count. An asterisk for the repeat count means to use whatever remains.

The inverse operation - packing byte contents from a string of hexadecimal digits - is just as easily written. For instance:

    my $s = pack( 'H2' x 10, 30..39 );
    print "$s\n";

Since we feed a list of ten 2-digit hexadecimal strings to pack, the pack template should contain ten pack codes. If this is run on a computer with ASCII character coding, it will print 0123456789.
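To see both directions side by side, here is a small sketch. It assumes an ASCII platform, and the sample text in $mem is simply the string that the hex dump above happens to decode to.

```perl
use strict;
use warnings;

# Assumption: ASCII character coding; $mem is sample data for illustration.
my $mem = 'A MAN A PLAN A CANAL PANAMA';

my ($hex)   = unpack( 'H*', $mem );   # every byte as two hex digits
my ($first) = unpack( 'H',  $mem );   # no repeat count: a single digit
my ($byte)  = unpack( 'H2', $mem );   # two digits, i.e. the first byte

my $back   = pack( 'H*', $hex );            # the inverse operation
my $digits = pack( 'H2' x 10, 30 .. 39 );   # ten 2-digit hex strings

print "$hex\n$first $byte\n$back\n$digits\n";
```

Note how H without a repeat count yields only the high nybble of the first byte.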

Packing Text

Let's suppose you've got to read in a data file like this:

    Date      |Description                | Income|Expenditure
    01/24/2001 Zed's Camel Emporium                    1147.99
    01/28/2001 Flea spray                                24.99
    01/29/2001 Camel rides to tourists     1235.00

How do we do it? You might think first to use split; however, since split collapses blank fields, you'll never know whether a record was income or expenditure. Oops. Well, you could always use substr:

    while (<>) {
        my $date   = substr($_,  0, 11);
        my $desc   = substr($_, 12, 27);
        my $income = substr($_, 40,  7);
        my $expend = substr($_, 52,  7);
        ...
    }

It's not really a barrel of laughs, is it? In fact, it's worse than it may seem; the eagle-eyed may notice that the first field should only be 10 characters wide, and the error has propagated right through the other numbers - which we've had to count by hand. So it's error-prone as well as horribly unfriendly.

Or maybe we could use regular expressions:

    while (<>) {
        my($date, $desc, $income, $expend) =
            m|(\d\d/\d\d/\d{4}) (.{27}) (.{7})(.*)|;
        ...
    }

Urgh. Well, it's a bit better, but - well, would you want to maintain that?

Hey, isn't Perl supposed to make this sort of thing easy? Well, it does, if you use the right tools. pack and unpack are designed to help you out when dealing with fixed-width data like the above. Let's have a look at a solution with unpack:

    while (<>) {
        my($date, $desc, $income, $expend) = unpack("A10xA27xA7xA*", $_);
        ...
    }

That looks a bit nicer; but we've got to take apart that weird template. Where did I pull that out of?

OK, let's have a look at some of our data again; in fact, we'll include the headers, and a handy ruler so we can keep track of where we are.

             1         2         3         4         5
    1234567890123456789012345678901234567890123456789012345678
    Date      |Description                | Income|Expenditure
    01/28/2001 Flea spray                                24.99
    01/29/2001 Camel rides to tourists     1235.00

From this, we can see that the date column stretches from column 1 to column 10 - ten characters wide. The pack-ese for "character" is A, and ten of them are A10. So if we just wanted to extract the dates, we could say this:

    my($date) = unpack("A10", $_);

OK, what's next? Between the date and the description is a blank column; we want to skip over that. The x template means "skip forward", so we want one of those. Next, we have another batch of characters, from 12 to 38. That's 27 more characters, hence A27. (Don't make the fencepost error - there are 27 characters between 12 and 38, not 26. Count 'em!)

Now we skip another character and pick up the next 7 characters:

    my($date,$description,$income) = unpack("A10xA27xA7", $_);

Now comes the clever bit. Lines in our ledger which are just income and not expenditure might end at column 46. Hence, we don't want to tell our unpack pattern that we need to find another 12 characters; we'll just say "if there's anything left, take it". As you might guess from regular expressions, that's what the * means: "use everything remaining".

  • Be warned, though, that unlike regular expressions, if the unpack template doesn't match the incoming data, Perl will scream and die.

Hence, putting it all together:

    my($date,$description,$income,$expend) = unpack("A10xA27xA7xA*", $_);
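Here is a self-contained sketch of that template at work. The two ledger lines are built with sprintf purely so that the column widths are visible in the code; they are sample data, and each keeps its trailing newline, as a line read with <> would.

```perl
use strict;
use warnings;

# Sample lines in the layout described above (assumption: columns
# 1-10 date, 12-38 description, 40-46 income, 48-58 expenditure).
my @lines = (
    sprintf( "%-10s %-27s %7s %11s\n",
             '01/24/2001', "Zed's Camel Emporium", '', '1147.99' ),
    sprintf( "%-10s %-27s %7s\n",
             '01/29/2001', 'Camel rides to tourists', '1235.00' ),
);

my @records;
for my $line (@lines) {
    my ( $date, $desc, $income, $expend ) =
        unpack( 'A10xA27xA7xA*', $line );
    $_ //= '' for $income, $expend;    # defensive: treat missing as empty
    $expend =~ s/^\s+//;               # A only drops trailing blanks
    push @records, [ $date, $desc, $income, $expend ];
    print join( '|', $date, $desc, $income, $expend ), "\n";
}
```

On the income-only line the final x steps over the newline, so a line that has already been chomped would need padding before this template could be applied to it.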

Now, that's our data parsed. I suppose what we might want to do now is total up our income and expenditure, and add another line to the end of our ledger - in the same format - saying how much we've brought in and how much we've spent:

    use POSIX ();    # provides POSIX::strftime

    while (<>) {
        my($date, $desc, $income, $expend) = unpack("A10xA27xA7xA*", $_);
        $tot_income += $income;
        $tot_expend += $expend;
    }
    $tot_income = sprintf("%.2f", $tot_income); # Get them into
    $tot_expend = sprintf("%.2f", $tot_expend); # "financial" format
    $date = POSIX::strftime("%m/%d/%Y", localtime);
    # OK, let's go:
    print pack("A10xA27xA7xA*", $date, "Totals", $tot_income, $tot_expend);

Oh, hmm. That didn't quite work. Let's see what happened:

    01/24/2001 Zed's Camel Emporium                    1147.99
    01/28/2001 Flea spray                                24.99
    01/29/2001 Camel rides to tourists     1235.00
    03/23/2001Totals                     1235.001172.98

OK, it's a start, but what happened to the spaces? We put x, didn't we? Shouldn't it skip forward? Let's look at what pack says:

    x   A null byte.

Urgh. No wonder. There's a big difference between "a null byte", character zero, and "a space", character 32. Perl's put something between the date and the description - but unfortunately, we can't see it!

What we actually need to do is expand the width of the fields. The A format pads any non-existent characters with spaces, so we can use the additional spaces to line up our fields, like this:

    print pack("A11 A28 A8 A*", $date, "Totals", $tot_income, $tot_expend);

(Note that you can put spaces in the template to make it more readable, but they don't translate to spaces in the output.) Here's what we got this time:

    01/24/2001 Zed's Camel Emporium                    1147.99
    01/28/2001 Flea spray                                24.99
    01/29/2001 Camel rides to tourists     1235.00
    03/23/2001 Totals                      1235.00 1172.98

That's a bit better, but we still have that last column which needs to be moved further over. There's an easy way to fix this up: unfortunately, we can't get pack to right-justify our fields, but we can get sprintf to do it:

    $tot_income = sprintf("%.2f", $tot_income);
    $tot_expend = sprintf("%11.2f", $tot_expend);   # columns 48 to 58
    $date = POSIX::strftime("%m/%d/%Y", localtime);
    print pack("A11 A28 A8 A*", $date, "Totals", $tot_income, $tot_expend);

This time we get the right answer:

    01/28/2001 Flea spray                                24.99
    01/29/2001 Camel rides to tourists     1235.00
    03/23/2001 Totals                      1235.00     1172.98

So that's how we consume and produce fixed-width data. Let's recap what we've seen of pack and unpack so far:

  • Use pack to go from several pieces of data to one fixed-width version; use unpack to turn a fixed-width-format string into several pieces of data.

  • The pack format A means "any character"; if you're packing and you've run out of things to pack, pack will fill the rest up with spaces.

  • x means "skip a byte" when unpacking; when packing, it means "introduce a null byte" - that's probably not what you mean if you're dealing with plain text.

  • You can follow the formats with numbers to say how many characters should be affected by that format: A12 means "take 12 characters"; x6 means "skip 6 bytes" or "character 0, 6 times".

  • Instead of a number, you can use * to mean "consume everything else left".

    Warning: when packing multiple pieces of data, * only means "consume all of the current piece of data". That's to say

        pack("A*A*", $one, $two)

    packs all of $one into the first A* and then all of $two into the second. This is a general principle: each format character corresponds to one piece of data to be packed.
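The points in this recap can be checked with a few one-liners. This is only a sketch with made-up strings; the variable names are not part of the ledger example.

```perl
use strict;
use warnings;

# x inserts a NUL byte when packing; A pads short values with spaces.
my $nul    = pack( 'A3xA3', 'ab', 'cd' );      # "ab \0cd "
my $spaced = pack( 'A4 A3', 'ab', 'cd' );      # "ab  cd "

# A count shorter than the value truncates it.
my $cut    = pack( 'A3', 'abcdef' );           # "abc"

# Each A* consumes exactly one item from the list.
my $both   = pack( 'A*A*', 'one', 'two' );     # "onetwo"

# When unpacking, x2 skips two bytes and A* takes whatever remains.
my ( $word, $rest ) = unpack( 'A3x2A*', 'abc--tail' );

print join( '|', $spaced, $cut, $both, $word, $rest ), "\n";
```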

Packing Numbers

So much for textual data. Let's get onto the meaty stuff that pack and unpack are best at: handling binary formats for numbers. There is, of course, not just one binary format - life would be too simple - but Perl will do all the finicky labor for you.

Integers

Packing and unpacking numbers implies conversion to and from some specific binary representation. Leaving floating point numbers aside for the moment, the salient properties of any such representation are:

  • the number of bytes used for storing the integer,

  • whether the contents are interpreted as a signed or unsigned number,

  • the byte ordering: whether the first byte is the least or most significant byte (or: little-endian or big-endian, respectively).

So, for instance, to pack 20302 to a signed 16 bit integer in your computer's representation you write

    my $ps = pack( 's', 20302 );

Again, the result is a string, now containing 2 bytes. If you print this string (which is, generally, not recommended) you might see ON or NO (depending on your system's byte ordering) - or something entirely different if your computer doesn't use ASCII character encoding. Unpacking $ps with the same template returns the original integer value:

    my( $s ) = unpack( 's', $ps );

This is true for all numeric template codes. But don't expect miracles: if the packed value exceeds the allotted byte capacity, high order bits are silently discarded, and unpack certainly won't be able to pull them back out of some magic hat. And, when you pack using a signed template code such as s, an excess value may result in the sign bit getting set, and unpacking this will smartly return a negative value.
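A short sketch of the round trip and of both kinds of overflow follows. To make the byte order predictable it borrows the codes n and v (explicit big- and little-endian 16-bit), which are only introduced further on; the values 70000 and 40000 are arbitrary picks that do not fit into 16 bits or into a signed 16-bit integer, respectively.

```perl
use strict;
use warnings;

my $ps  = pack( 's', 20302 );      # native signed 16 bit, 2 bytes
my ($s) = unpack( 's', $ps );      # round trip gives 20302 again

# 20302 is 0x4F4E, i.e. the ASCII letters 'O' and 'N'.
my $big    = pack( 'n', 20302 );   # "ON" (most significant byte first)
my $little = pack( 'v', 20302 );   # "NO" (least significant byte first)

# High order bits are silently discarded: 70000 - 65536 = 4464.
my ($wrapped) = unpack( 'S', pack( 'S', 70000 ) );

# 40000 sets the sign bit of a signed short: 40000 - 65536 = -25536.
my ($negative) = unpack( 's', pack( 's', 40000 ) );

print "$s $big $little $wrapped $negative\n";
```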

16 bits won't get you too far with integers, but there is l and L for signed and unsigned 32-bit integers. And if this is not enough and your system supports 64 bit integers you can push the limits much closer to infinity with pack codes q and Q. A notable exception is provided by pack codes i and I for signed and unsigned integers of the "local custom" variety: Such an integer will take up as many bytes as a local C compiler returns for sizeof(int), but it'll use at least 32 bits.

Each of the integer pack codes sSlLqQ results in a fixed number of bytes, no matter where you execute your program. This may be useful for some applications, but it does not provide for a portable way to pass data structures between Perl and C programs (bound to happen when you call XS extensions or the Perl function syscall), or when you read or write binary files. What you'll need in this case are template codes that depend on what your local C compiler compiles when you code short or unsigned long, for instance. These codes and their corresponding byte lengths are shown in the table below. Since the C standard leaves much leeway with respect to the relative sizes of these data types, actual values may vary, and that's why the values are given as expressions in C and Perl. (If you'd like to use values from %Config in your program you have to import it with use Config.)

    signed unsigned  byte length in C   byte length in Perl
      s!     S!      sizeof(short)      $Config{shortsize}
      i!     I!      sizeof(int)        $Config{intsize}
      l!     L!      sizeof(long)       $Config{longsize}
      q!     Q!      sizeof(long long)  $Config{longlongsize}

The i! and I! codes aren't different from i and I; they are tolerated for completeness' sake.
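If you want to see what these sizes are on your own machine, a sketch along the following lines compares the packed lengths with the entries in %Config. The q and Q codes are left out because they are not available on every build.

```perl
use strict;
use warnings;
use Config;

# Length in bytes of one packed zero for each template code.
my %len = map { $_ => length pack( $_, 0 ) } qw( s l i s! i! l! );

printf "%-3s %d\n", $_, $len{$_} for sort keys %len;
print "short=$Config{shortsize} int=$Config{intsize} ",
      "long=$Config{longsize}\n";
```

The plain s and l entries come out as 2 and 4 everywhere, while the others follow the local C compiler.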

Unpacking a Stack Frame

Requesting a particular byte ordering may be necessary when you work with binary data coming from some specific architecture whereas your program could run on a totally different system. As an example, assume you have 24 bytes containing a stack frame as it happens on an Intel 8086:

         +---------+        +----+----+               +---------+
    TOS: |   IP    |  TOS+4:| FL | FH | FLAGS  TOS+14:|   SI    |
         +---------+        +----+----+               +---------+
         |   CS    |        | AL | AH | AX            |   DI    |
         +---------+        +----+----+               +---------+
                            | BL | BH | BX            |   BP    |
                            +----+----+               +---------+
                            | CL | CH | CX            |   DS    |
                            +----+----+               +---------+
                            | DL | DH | DX            |   ES    |
                            +----+----+               +---------+

First, we note that this time-honored 16-bit CPU uses little-endian order, and that's why the low order byte is stored at the lower address. To unpack such a (unsigned) short we'll have to use code v. A repeat count unpacks all 12 shorts:

    my( $ip, $cs, $flags, $ax, $bx, $cx, $dx, $si, $di, $bp, $ds, $es ) =
        unpack( 'v12', $frame );

Alternatively, we could have used C to unpack the individually accessible byte registers FL, FH, AL, AH, etc.:

    my( $fl, $fh, $al, $ah, $bl, $bh, $cl, $ch, $dl, $dh ) =
        unpack( 'C10', substr( $frame, 4, 10 ) );

It would be nice if we could do this in one fell swoop: unpack a short, back up a little, and then unpack 2 bytes. Since Perl is nice, it proffers the template code X to back up one byte. Putting this all together, we may now write:

    my( $ip, $cs,
        $flags,$fl,$fh,
        $ax,$al,$ah, $bx,$bl,$bh, $cx,$cl,$ch, $dx,$dl,$dh,
        $si, $di, $bp, $ds, $es ) =
    unpack( 'v2' . ('vXXCC' x 5) . 'v5', $frame );

(The clumsy construction of the template can be avoided - just read on!)

We've taken some pains to construct the template so that it matches the contents of our frame buffer. Otherwise we'd either get undefined values, or unpack could not unpack all. If pack runs out of items, it will supply null strings (which are coerced into zeroes whenever the pack code says so).
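To try this without a real 8086 at hand, one can fabricate a frame with pack and take it apart again. The register values below are invented for the purpose of the sketch.

```perl
use strict;
use warnings;

# Made-up register contents, in frame order:
# IP CS FLAGS AX BX CX DX SI DI BP DS ES
my @regs = ( 0x0100, 0x2000, 0x0246, 0x1234, 0x5678, 0x9abc, 0xdef0,
             1, 2, 3, 4, 5 );
my $frame = pack( 'v12', @regs );    # 12 little-endian shorts, 24 bytes

# v reads a short, XX backs up two bytes, CC re-reads them one by one.
my ( $ip, $cs,
     $flags, $fl, $fh,
     $ax, $al, $ah, $bx, $bl, $bh, $cx, $cl, $ch, $dx, $dl, $dh,
     $si, $di, $bp, $ds, $es ) =
    unpack( 'v2' . ( 'vXXCC' x 5 ) . 'v5', $frame );

printf "AX=%04x AL=%02x AH=%02x\n", $ax, $al, $ah;
```

Because the frame is little-endian, the first C after backing up yields the low byte (AL) and the second the high byte (AH).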

How to Eat an Egg on a Net

The pack code for big-endian (high order byte at the lowest address) is n for 16 bit and N for 32 bit integers. You use these codes if you know that your data comes from a compliant architecture, but, surprisingly enough, you should also use these pack codes if you exchange binary data, across the network, with some system that you know next to nothing about. The simple reason is that this order has been chosen as the network order, and all standard-fearing programs ought to follow this convention. (This is, of course, a stern backing for one of the Lilliputian parties and may well influence the political development there.) So, if the protocol expects you to send a message by sending the length first, followed by just so many bytes, you could write:

    my $buf = pack( 'N', length( $msg ) ) . $msg;

or even:

    my $buf = pack( 'NA*', length( $msg ), $msg );

and pass $buf to your send routine. Some protocols demand that the count should include the length of the count itself: then just add 4 to the data length. (But make sure to read Lengths and Widths before you really code this!)
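The receiving side of such a protocol is not shown above, so here is a minimal sketch of both ends. It assumes $msg is a plain byte string (no wide characters), and the variable names are illustrative only.

```perl
use strict;
use warnings;

my $msg = 'hello';    # assumption: a byte string

# Sender: 32-bit big-endian length, then the payload.
my $buf  = pack( 'N', length($msg) ) . $msg;
my $same = pack( 'NA*', length($msg), $msg );

# Receiver: read the count, then take that many bytes after it.
my ($len)   = unpack( 'N', $buf );
my $payload = substr( $buf, 4, $len );

# Variant where the count includes its own four bytes.
my $incl = pack( 'N', 4 + length($msg) ) . $msg;
my ($total) = unpack( 'N', $incl );

print "$len $payload $total\n";
```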

Byte-order modifiers

In the previous sections we've learned how to use n, N, v and V to pack and unpack integers with big- or little-endian byte-order. While this is nice, it's still rather limited because it leaves out all kinds of signed integers as well as 64-bit integers. For example, if you wanted to unpack a sequence of signed big-endian 16-bit integers in a platform-independent way, you would have to write:

    my @data = unpack 's*', pack 'S*', unpack 'n*', $buf;

This is ugly. As of Perl 5.9.2, there's a much nicer way to express your desire for a certain byte-order: the > and < modifiers. > is the big-endian modifier, while < is the little-endian modifier. Using them, we could rewrite the above code as:

    my @data = unpack 's>*', $buf;

As you can see, the "big end" of the arrow touches the s, which is a nice way to remember that > is the big-endian modifier. The same obviously works for <, where the "little end" touches the code.

You will probably find these modifiers even more useful if you have to deal with big- or little-endian C structures. Be sure to read Packing and Unpacking C Structures for more on that.
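A quick way to convince yourself that the two spellings agree is to run them on the same buffer. The sketch below uses three arbitrary sample values and needs a Perl that supports the modifiers.

```perl
use strict;
use warnings;

# Three big-endian 16-bit values: bytes 00 01 ff ff 80 00
my $buf = pack( 'n*', 1, 0xFFFF, 0x8000 );

my @old = unpack 's*', pack 'S*', unpack 'n*', $buf;   # the long way
my @new = unpack 's>*', $buf;                          # with a modifier

# The same numbers stored little-endian, read back with <
my @le = unpack 's<*', pack( 'v*', 1, 0xFFFF, 0x8000 );

print "@old\n@new\n@le\n";    # each line: 1 -1 -32768
```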

Floating point Numbers

For packing floating point numbers you have the choice between the pack codes f, d, F and D. f and d pack into (or unpack from) single-precision or double-precision representation as it is provided by your system. If your system supports it, D can be used to pack and unpack extended-precision floating point values (long double), which can offer even more resolution than f or d. F packs an NV, which is the floating point type used by Perl internally. (There is no such thing as a network representation for reals, so if you want to send your real numbers across computer boundaries, you'd better stick to ASCII representation, unless you're absolutely sure what's on the other end of the line. For the even more adventuresome, you can use the byte-order modifiers from the previous section also on floating point codes.)
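As a small sketch of these codes, the following packs the value 1.5, which was chosen because it is exactly representable in binary floating point, and compares the packed lengths with %Config. The final sprintf shows the plain-text alternative recommended for transport; the %.17g format is one possible choice, not a requirement.

```perl
use strict;
use warnings;
use Config;

my $packed = pack( 'd', 1.5 );          # native double
my ($back) = unpack( 'd', $packed );    # 1.5 survives the round trip

my $d_len  = length $packed;            # $Config{doublesize}
my $nv_len = length pack( 'F', 1.5 );   # $Config{nvsize}

# Byte-order modifiers also apply to floating point codes.
my $be = pack( 'd>', 1.5 );
my $le = pack( 'd<', 1.5 );

# Text representation for sending to an unknown peer.
my $text = sprintf( '%.17g', 1.5 );

print "$back $d_len $nv_len $text\n";
```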

Exotic Templates

Bit Strings

Bits are the atoms in the memory world. Access to individual bits may have to be used either as a last resort or because it is the most convenient way to handle your data. Bit string (un)packing converts between strings containing a series of 0 and 1 characters and a sequence of bytes each containing a group of 8 bits. This is almost as simple as it sounds, except that there are two ways the contents of a byte may be written as a bit string. Let's have a look at an annotated byte:

      7 6 5 4 3 2 1 0
    +-----------------+
    | 1 0 0 0 1 1 0 0 |
    +-----------------+
     MSB           LSB

It's egg-eating all over again: Some think that as a bit string this should be written "10001100" i.e. beginning with the most significant bit, others insist on "00110001". Well, Perl isn't biased, so that's why we have two bit string codes:

    $byte = pack( 'B8', '10001100' ); # start with MSB
    $byte = pack( 'b8', '00110001' ); # start with LSB

It is not possible to pack or unpack bit fields - just integral bytes. pack always starts at the next byte boundary and "rounds up" to the next multiple of 8 by adding zero bits as required. (If you do want bit fields, there is vec. Or you could implement bit field handling at the character string level, using split, substr, and concatenation on unpacked bit strings.)
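Both codes describe the same byte, 0x8C, which the following sketch confirms. The last line illustrates the rounding up to a whole byte with a 4-bit string chosen for the example.

```perl
use strict;
use warnings;

my $msb_first = pack( 'B8', '10001100' );   # "\x8c"
my $lsb_first = pack( 'b8', '00110001' );   # "\x8c" as well

my ($as_B) = unpack( 'B8', "\x8c" );        # '10001100'
my ($as_b) = unpack( 'b8', "\x8c" );        # '00110001'

# Only four bits supplied: zero bits fill up the rest of the byte.
my $padded = pack( 'B4', '1010' );          # "\xa0"

print "$as_B $as_b ", unpack( 'H2', $padded ), "\n";
```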

To illustrate unpacking for bit strings, we'll decompose a simple status register (a "-" stands for a "reserved" bit):

    +-----------------+-----------------+
    | S Z - A - P - C | - - - - O D I T |
    +-----------------+-----------------+
     MSB           LSB MSB           LSB

Converting these two bytes to a string can be done with the unpack template 'b16'. To obtain the individual bit values from the bit string we use split with the "empty" separator pattern which dissects into individual characters. Bit values from the "reserved" positions are simply assigned to undef, a convenient notation for "I don't care where this goes".

    ($carry, undef, $parity, undef, $auxcarry, undef, $zero, $sign,
     $trace, $interrupt, $direction, $overflow) =
      split( //, unpack( 'b16', $status ) );

We could have used an unpack template 'b12' just as well, since the last 4 bits can be ignored anyway.
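For a concrete run, take a made-up status word in which only the carry, zero and interrupt flags are set. With the layout shown above that is 0x41 in the first byte (bits 0 and 6) and 0x02 in the second (bit 1).

```perl
use strict;
use warnings;

# Assumed sample value: C and Z set in byte one, I set in byte two.
my $status = "\x41\x02";

my $bits = unpack( 'b16', $status );    # LSB first within each byte

my ( $carry, undef, $parity, undef, $auxcarry, undef, $zero, $sign,
     $trace, $interrupt, $direction, $overflow ) = split( //, $bits );

print "bits=$bits C=$carry Z=$zero S=$sign I=$interrupt O=$overflow\n";
```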

Uuencoding

Another odd-man-out in the template alphabet is u, which packs an "uuencoded string". ("uu" is short for Unix-to-Unix.) Chances are that you won't ever need this encoding technique which was invented to overcome the shortcomings of old-fashioned transmission mediums that do not support other than simple ASCII data. The essential recipe is simple: Take three bytes, or 24 bits. Split them into 4 six-packs, adding a space (0x20) to each. Repeat until all of the data is blended. Fold groups of 4 bytes into lines no longer than 60 and garnish them in front with the original byte count (incremented by 0x20) and a "\n" at the end. The pack chef will prepare this for you, a la minute, when you select pack code u on the menu:

    my $uubuf = pack( 'u', $bindat );

A repeat count after u sets the number of bytes to put into an uuencoded line, which is the maximum of 45 by default, but could be set to some (smaller) integer multiple of three. unpack simply ignores the repeat count.
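The recipe is easy to follow by hand for three bytes. In this sketch the sample input "Cat" (0x43 0x61 0x74) splits into the six-bit values 16, 54, 5 and 52; adding 0x20 to each gives the characters 0, V, % and T, and the byte count 3 becomes #.

```perl
use strict;
use warnings;

my $uu    = pack( 'u', 'Cat' );        # "#0V%T\n"
my $plain = unpack( 'u', $uu );        # back to "Cat"

# A repeat count of 3 puts three input bytes on each output line.
my $short = pack( 'u3', 'CatCat' );    # "#0V%T\n#0V%T\n"

print $uu, $short, "$plain\n";
```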

Doing Sums

An even stranger template code is %<number>. First, because it's used as a prefix to some other template code. Second, because it cannot be used in pack at all, and third, because in unpack it doesn't return the data as defined by the template code it precedes. Instead it'll give you an integer of number bits that is computed from the data value by doing sums. For numeric unpack codes, no big feat is achieved:

    my $buf = pack( 'iii', 100, 20, 3 );
    print unpack( '%32i3', $buf ), "\n";  # prints 123

For string values, % returns the sum of the byte values saving you the trouble of a sum loop with substr and ord:

    print unpack( '%32A*', "\x01\x10" ), "\n";  # prints 17

Although the % code is documented as returning a "checksum": don't put your trust in such values! Even when applied to a small number of bytes, they won't guarantee a noticeable Hamming distance.

In connection with b or B, % simply adds bits, and this can be put to good use to count set bits efficiently:

    my $bitcount = unpack( '%32b*', $mask );

And an even parity bit can be determined like this:

    my $evenparity = unpack( '%1b*', $mask );
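The following sketch runs the % prefix over a few tiny, made-up inputs so that the sums can be checked by hand. The last line shows the effect of the bit width: with %8 the sum is kept modulo 256.

```perl
use strict;
use warnings;

my ($sum)    = unpack( '%32C*', 'abc' );        # 97 + 98 + 99 = 294
my ($bits)   = unpack( '%32b*', "\xff\x01" );   # 8 + 1 = 9 set bits
my ($parity) = unpack( '%1b*',  "\x07" );       # 3 set bits, so 1
my ($wrap)   = unpack( '%8C*',  "\xff\x02" );   # (255 + 2) mod 256 = 1

print "$sum $bits $parity $wrap\n";
```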

Unicode

Unicode is a character set that can represent most characters in most of the world's languages, providing room for over one million different characters. Unicode 3.1 specifies 94,140 characters: The Basic Latin characters are assigned to the numbers 0 - 127. The Latin-1 Supplement with characters that are used in several European languages is in the next range, up to 255. After some more Latin extensions we find the character sets from languages using non-Roman alphabets, interspersed with a variety of symbol sets such as currency symbols, Zapf Dingbats or Braille. (You might want to visit http://www.unicode.org/ for a look at some of them - my personal favourites are Telugu and Kannada.)

The Unicode character set associates characters with integers. Encoding these numbers in an equal number of bytes would more than double the requirements for storing texts written in Latin alphabets. The UTF-8 encoding avoids this by storing the most common (from a western point of view) characters in a single byte while encoding the rarer ones in three or more bytes.

Perl uses UTF-8, internally, for most Unicode strings.

So what has this got to do with pack? Well, if you want to compose a Unicode string (that is internally encoded as UTF-8), you can do so by using template code U. As an example, let's produce the Euro currency symbol (code number 0x20AC):

    $UTF8{Euro} = pack( 'U', 0x20AC );
    # Equivalent to: $UTF8{Euro} = "\x{20ac}";

Inspecting $UTF8{Euro} shows that it contains 3 bytes: "\xe2\x82\xac". However, it contains only 1 character, number 0x20AC. The round trip can be completed with unpack:

  1. $Unicode{Euro} = unpack( 'U', $UTF8{Euro} );
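The difference between characters and bytes can be checked directly. This sketch uses the built-in utf8::encode function to obtain the UTF-8 bytes of a copy; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $euro = pack( 'U', 0x20AC );      # one character, U+20AC

my $chars = length($euro);           # length counts characters here

# utf8::encode() converts its argument in place to UTF-8 bytes,
# so we work on a copy.
my $bytes = $euro;
utf8::encode($bytes);
my $octets = length($bytes);         # now length counts bytes

my $codepoint = unpack( 'U', $euro );

printf "%d character, %d bytes (%s), code point U+%04X\n",
    $chars, $octets, unpack( 'H*', $bytes ), $codepoint;
```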

Unpacking using the U template code also works on UTF-8 encoded byte strings.

Usually you'll want to pack or unpack UTF-8 strings:

  # pack and unpack the Hebrew alphabet
  my $alefbet = pack( 'U*', 0x05d0..0x05ea );
  my @hebrew = unpack( 'U*', $alefbet );

Please note: in the general case, you're better off using Encode::decode_utf8 to decode a UTF-8 encoded byte string to a Perl Unicode string, and Encode::encode_utf8 to encode a Perl Unicode string to UTF-8 bytes. These functions provide means of handling invalid byte sequences and generally have a friendlier interface.
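A minimal round trip through the Encode functions named above might look like this. The Hebrew alphabet string is reused as sample data; everything else is illustrative.

```perl
use strict;
use warnings;
use Encode qw( encode_utf8 decode_utf8 );

my $alefbet = pack( 'U*', 0x05d0 .. 0x05ea );   # 27 code points
my $octets  = encode_utf8($alefbet);            # character string -> UTF-8 bytes
my $decoded = decode_utf8($octets);             # UTF-8 bytes -> character string

printf "%d characters, %d bytes, round trip %s\n",
    length($alefbet), length($octets),
    $decoded eq $alefbet ? 'ok' : 'FAILED';
```

Each of these letters lies below U+0800, so UTF-8 needs two bytes per character.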

Another Portable Binary Encoding

The pack code w has been added to support a portable binary data encoding scheme that goes way beyond simple integers. (Details can be found at http://Casbah.org/, the Scarab project.) A BER (Binary Encoded Representation) compressed unsigned integer stores base 128 digits, most significant digit first, with as few digits as possible. Bit eight (the high bit) is set on each byte except the last. There is no size limit to BER encoding, but Perl won't go to extremes.

  1. my $berbuf = pack( 'w*', 1, 128, 128+1, 128*128+127 );

A hex dump of $berbuf , with spaces inserted at the right places, shows 01 8100 8101 81807F. Since the last byte is always less than 128, unpack knows where to stop.
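The dump and the reverse operation can be reproduced with a few lines. This is just a sketch around the pack call above; @values and @decoded are names invented here.

```perl
use strict;
use warnings;

my @values  = ( 1, 128, 128 + 1, 128 * 128 + 127 );
my $berbuf  = pack( 'w*', @values );

# H* renders the buffer as hex digits, two per byte.
my $hex     = unpack( 'H*', $berbuf );

# w* keeps decoding integers until the buffer is exhausted.
my @decoded = unpack( 'w*', $berbuf );

print "$hex\n";        # 018100810181807f
print "@decoded\n";    # 1 128 129 16511
```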

Template Grouping

Prior to Perl 5.8, repetitions of templates had to be made by x -multiplication of template strings. Now there is a better way as we may use the pack codes ( and ) combined with a repeat count. The unpack template from the Stack Frame example can simply be written like this:

  1. unpack( 'v2 (vXXCC)5 v5', $frame )

Let's explore this feature a little more. We'll begin with the equivalent of

  1. join( '', map( substr( $_, 0, 1 ), @str ) )

which returns a string consisting of the first character from each string. Using pack, we can write

  1. pack( '(A)'.@str, @str )

or, because a repeat count * means "repeat as often as required", simply

  1. pack( '(A)*', @str )

(Note that the template A* would only have packed $str[0] in full length.)

To pack dates stored as triplets ( day, month, year ) in an array @dates into a sequence of byte, byte, short integer we can write

  1. $pd = pack( '(CCS)*', map( @$_, @dates ) );

To swap pairs of characters in a string (with even length) one could use several techniques. First, let's use x and X to skip forward and back:

  1. $s = pack( '(A)*', unpack( '(xAXXAx)*', $s ) );

We can also use @ to jump to an offset, with 0 being the position where we were when the last ( was encountered:

  1. $s = pack( '(A)*', unpack( '(@1A @0A @2)*', $s ) );

Finally, there is also an entirely different approach by unpacking big endian shorts and packing them in the reverse byte order:

  $s = pack( '(v)*', unpack( '(n)*', $s ) );
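To convince yourself that the three techniques agree, you can run them side by side on a short sample string. The string and the variable names are chosen for this sketch only.

```perl
use strict;
use warnings;

my $s = 'abcdef';

# Skip forward and back with x and X.
my $by_skipping = pack( '(A)*', unpack( '(xAXXAx)*', $s ) );

# Jump to offsets relative to the start of the group with @.
my $by_offset   = pack( '(A)*', unpack( '(@1A @0A @2)*', $s ) );

# Read big-endian shorts, write them back little-endian.
my $by_shorts   = pack( '(v)*', unpack( '(n)*', $s ) );

print "$by_skipping $by_offset $by_shorts\n";
```

Keep in mind that unpacking with A strips trailing spaces, so the first two variants are only safe for strings without blanks.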

Lengths and Widths

String Lengths

In the previous section we've seen a network message that was constructed by prefixing the binary message length to the actual message. You'll find that packing a length followed by so many bytes of data is a frequently used recipe since appending a null byte won't work if a null byte may be part of the data. Here is an example where both techniques are used: after two null terminated strings with source and destination address, a Short Message (to a mobile phone) is sent after a length byte:

  1. my $msg = pack( 'Z*Z*CA*', $src, $dst, length( $sm ), $sm );

Unpacking this message can be done with the same template:

  1. ( $src, $dst, $len, $sm ) = unpack( 'Z*Z*CA*', $msg );

There's a subtle trap lurking in the offing: Adding another field after the Short Message (in variable $sm ) is all right when packing, but this cannot be unpacked naively:

  1. # pack a message
  2. my $msg = pack( 'Z*Z*CA*C', $src, $dst, length( $sm ), $sm, $prio );
  3. # unpack fails - $prio remains undefined!
  4. ( $src, $dst, $len, $sm, $prio ) = unpack( 'Z*Z*CA*C', $msg );

The pack code A* gobbles up all remaining bytes, and $prio remains undefined! Before we let disappointment dampen the morale: Perl's got the trump card to make this trick too, just a little further up the sleeve. Watch this:

  1. # pack a message: ASCIIZ, ASCIIZ, length/string, byte
  2. my $msg = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );
  3. # unpack
  4. ( $src, $dst, $sm, $prio ) = unpack( 'Z* Z* C/A* C', $msg );

Combining two pack codes with a slash (/) associates them with a single value from the argument list. In pack, the length of the argument is taken and packed according to the first code while the argument itself is added after being converted with the template code after the slash. This saves us the trouble of inserting the length call, but it is in unpack where we really score: The value of the length byte marks the end of the string to be taken from the buffer. Since this combination doesn't make sense except when the second pack code is a* , A* or Z* , Perl won't let you use anything else.

The pack code preceding / may be anything that's fit to represent a number: All the numeric binary pack codes, and even text codes such as A4 or Z* :

  # pack/unpack a string preceded by its length in ASCII
  my $buf = pack( 'A4/A*', "Humpty-Dumpty" );
  # unpack $buf: '13  Humpty-Dumpty'
  my $txt = unpack( 'A4/A*', $buf );
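Here is a runnable sketch of both slash templates. The message contents are invented sample values; only the templates come from the text.

```perl
use strict;
use warnings;

my ( $src, $dst, $sm, $prio ) = ( 'alice', 'bob', 'See you at 8', 2 );

# C/A* packs the length of $sm as one byte, then $sm itself.
my $msg    = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );
my @fields = unpack( 'Z* Z* C/A* C', $msg );

printf "%d bytes: %s\n", length($msg), join( '|', @fields );

# A4/A* writes the length as space-padded ASCII in four characters.
my $buf = pack( 'A4/A*', 'Humpty-Dumpty' );
my $txt = unpack( 'A4/A*', $buf );
print "[$buf] -> [$txt]\n";
```

Note that the length never shows up in the unpacked list: unpack consumes it to delimit the string.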

/ is not implemented in Perls before 5.6, so if your code is required to work on older Perls you'll need to unpack( 'Z* Z* C') to get the length, then use it to make a new unpack string. For example

  1. # pack a message: ASCIIZ, ASCIIZ, length, string, byte (5.005 compatible)
  2. my $msg = pack( 'Z* Z* C A* C', $src, $dst, length $sm, $sm, $prio );
  3. # unpack
  4. ( undef, undef, $len) = unpack( 'Z* Z* C', $msg );
  5. ($src, $dst, $sm, $prio) = unpack ( "Z* Z* x A$len C", $msg );

But that second unpack is rushing ahead. It isn't using a simple literal string for the template. So maybe we should introduce...

Dynamic Templates

So far, we've seen literals used as templates. If the list of pack items doesn't have fixed length, an expression constructing the template is required (whenever, for some reason, ()* cannot be used). Here's an example: To store named string values in a way that can be conveniently parsed by a C program, we create a sequence of names and null terminated ASCII strings, with = between the name and the value, followed by an additional delimiting null byte. Here's how:

  1. my $env = pack( '(A*A*Z*)' . keys( %Env ) . 'C',
  2. map( { ( $_, '=', $Env{$_} ) } keys( %Env ) ), 0 );

Let's examine the cogs of this byte mill, one by one. There's the map call, creating the items we intend to stuff into the $env buffer: to each key (in $_ ) it adds the = separator and the hash entry value. Each triplet is packed with the template code sequence A*A*Z* that is repeated according to the number of keys. (Yes, that's what the keys function returns in scalar context.) To get the very last null byte, we add a 0 at the end of the pack list, to be packed with C . (Attentive readers may have noticed that we could have omitted the 0.)

For the reverse operation, we'll have to determine the number of items in the buffer before we can let unpack rip it apart:

  1. my $n = $env =~ tr/\0// - 1;
  2. my %env = map( split( /=/, $_ ), unpack( "(Z*)$n", $env ) );

The tr counts the null bytes. The unpack call returns a list of name-value pairs each of which is taken apart in the map block.
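Putting the two halves together gives a complete round trip. The %Env hash below is a small lexical stand-in with invented values, not the real environment.

```perl
use strict;
use warnings;

my %Env = ( HOME => '/home/user', SHELL => '/bin/sh' );

# keys() in scalar context supplies the repeat count for the group.
my $env = pack( '(A*A*Z*)' . keys( %Env ) . 'C',
                map( { ( $_, '=', $Env{$_} ) } keys( %Env ) ), 0 );

# One null byte per entry plus the final delimiter.
my $n   = $env =~ tr/\0// - 1;
my %env = map( split( /=/, $_ ), unpack( "(Z*)$n", $env ) );

print "$_=$env{$_}\n" for sort keys %env;
```

A value that itself contains an = would confuse the split; giving split a limit of 2 is one way around that.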

Counting Repetitions

Rather than storing a sentinel at the end of a data item (or a list of items), we could precede the data with a count. Again, we pack keys and values of a hash, preceding each with an unsigned short length count, and up front we store the number of pairs:

  1. my $env = pack( 'S(S/A* S/A*)*', scalar keys( %Env ), %Env );

This simplifies the reverse operation as the number of repetitions can be unpacked with the / code:

  1. my %env = unpack( 'S/(S/A* S/A*)', $env );

Note that this is one of the rare cases where you cannot use the same template for pack and unpack because pack can't determine a repeat count for a () -group.
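The counted variant can be tried the same way. Again %Env is a lexical hash with made-up contents.

```perl
use strict;
use warnings;

my %Env = ( HOME => '/home/user', SHELL => '/bin/sh' );

# Pair count up front, then each key and value with its own length prefix.
my $env = pack( 'S(S/A* S/A*)*', scalar keys( %Env ), %Env );

# S/ pops the pair count and uses it as the repeat count of the group.
my %env = unpack( 'S/(S/A* S/A*)', $env );

printf "%d bytes, %d pairs\n", length($env), scalar keys %env;
```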

Intel HEX

Intel HEX is a file format for representing binary data, mostly for programming various chips, as a text file. (See http://en.wikipedia.org/wiki/.hex for a detailed description, and http://en.wikipedia.org/wiki/SREC_(file_format) for the Motorola S-record format, which can be unravelled using the same technique.) Each line begins with a colon (':') and is followed by a sequence of hexadecimal characters, specifying a byte count n (8 bit), an address (16 bit, big endian), a record type (8 bit), n data bytes and a checksum (8 bit) computed as the least significant byte of the two's complement sum of the preceding bytes. Example: :0300300002337A1E.

The first step of processing such a line is the conversion, to binary, of the hexadecimal data, to obtain the four fields, while checking the checksum. No surprise here: we'll start with a simple pack call to convert everything to binary:

  1. my $binrec = pack( 'H*', substr( $hexrec, 1 ) );

The resulting byte sequence is most convenient for checking the checksum. Don't slow your program down with a for loop adding the ord values of this string's bytes - the unpack code % is the thing to use for computing the 8-bit sum of all bytes, which must be equal to zero:

  1. die unless unpack( "%8C*", $binrec ) == 0;

Finally, let's get those four fields. By now, you shouldn't have any problems with the first three fields - but how can we use the byte count of the data in the first field as a length for the data field? Here the codes x and X come to the rescue, as they permit jumping back and forth in the string to unpack.

  my( $addr, $type, $data ) = unpack( "x n C X4 C x3 /a", $binrec );

Code x skips a byte, since we don't need the count yet. Code n takes care of the 16-bit big-endian integer address, and C unpacks the record type. Being at offset 4, where the data begins, we need the count. X4 brings us back to square one, which is the byte at offset 0. Now we pick up the count, and zoom forth to offset 4, where we are now fully furnished to extract the exact number of data bytes, leaving the trailing checksum byte alone.
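The three steps can be strung together and applied to the example record from above. The error message and the printf are additions for this sketch.

```perl
use strict;
use warnings;

my $hexrec = ':0300300002337A1E';

# Drop the leading colon and convert the hex digits to bytes.
my $binrec = pack( 'H*', substr( $hexrec, 1 ) );

# The 8-bit sum over all bytes, checksum included, must be zero.
die "bad checksum in $hexrec\n" unless unpack( '%8C*', $binrec ) == 0;

# Address, type, then back to the count and forward to the data.
my ( $addr, $type, $data ) = unpack( 'x n C X4 C x3 /a', $binrec );

printf "addr=0x%04X type=%d data=%s\n", $addr, $type, unpack( 'H*', $data );
```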

Packing and Unpacking C Structures

In previous sections we have seen how to pack numbers and character strings. If it were not for a couple of snags we could conclude this section right away with the terse remark that C structures don't contain anything else, and therefore you already know all there is to it. Sorry, no: read on, please.

If you have to deal with a lot of C structures, and don't want to hack all your template strings manually, you'll probably want to have a look at the CPAN module Convert::Binary::C . Not only can it parse your C source directly, but it also has built-in support for all the odds and ends described further on in this section.

The Alignment Pit

In the consideration of speed against memory requirements the balance has been tilted in favor of faster execution. This has influenced the way C compilers allocate memory for structures: On architectures where a 16-bit or 32-bit operand can be moved faster between places in memory, or to or from a CPU register, if it is aligned at an even or multiple-of-four or even at a multiple-of-eight address, a C compiler will give you this speed benefit by stuffing extra bytes into structures. If you don't cross the C shoreline this is not likely to cause you any grief (although you should care when you design large data structures, or you want your code to be portable between architectures (you do want that, don't you?)).

To see how this affects pack and unpack, we'll compare these two C structures:

  typedef struct {
      char  c1;
      short s;
      char  c2;
      long  l;
  } gappy_t;

  typedef struct {
      long  l;
      short s;
      char  c1;
      char  c2;
  } dense_t;

Typically, a C compiler allocates 12 bytes to a gappy_t variable, but requires only 8 bytes for a dense_t . After investigating this further, we can draw memory maps, showing where the extra 4 bytes are hidden:

  0           +4          +8          +12
  +--+--+--+--+--+--+--+--+--+--+--+--+
  |c1|xx|  s  |c2|xx|xx|xx|     l     |    xx = fill byte
  +--+--+--+--+--+--+--+--+--+--+--+--+
  gappy_t

  0           +4          +8
  +--+--+--+--+--+--+--+--+
  |     l     |  s  |c1|c2|
  +--+--+--+--+--+--+--+--+
  dense_t

And that's where the first quirk strikes: pack and unpack templates have to be stuffed with x codes to get those extra fill bytes.

The natural question: "Why can't Perl compensate for the gaps?" warrants an answer. One good reason is that C compilers might provide (non-ANSI) extensions permitting all sorts of fancy control over the way structures are aligned, even at the level of an individual structure field. And, if this were not enough, there is an insidious thing called union where the amount of fill bytes cannot be derived from the alignment of the next item alone.

OK, so let's bite the bullet. Here's one way to get the alignment right by inserting template codes x , which don't take a corresponding item from the list:

  1. my $gappy = pack( 'cxs cxxx l!', $c1, $s, $c2, $l );

Note the ! after l : We want to make sure that we pack a long integer as it is compiled by our C compiler. And even now, it will only work for the platforms where the compiler aligns things as above. And somebody somewhere has a platform where it doesn't. [Probably a Cray, where shorts, ints and longs are all 8 bytes. :-)]

Counting bytes and watching alignments in lengthy structures is bound to be a drag. Isn't there a way we can create the template with a simple program? Here's a C program that does the trick:

  #include <stdio.h>
  #include <stddef.h>

  typedef struct {
      char  fc1;
      short fs;
      char  fc2;
      long  fl;
  } gappy_t;

  /* Print "@<offset><code> " for one field. offsetof() yields a size_t,
     so it is cast to int to match the %d conversion. */
  #define Pt(type,field,tchar) \
      printf( "@%d%s ", (int)offsetof(type,field), # tchar );

  int main() {
      Pt( gappy_t, fc1, c  );
      Pt( gappy_t, fs,  s! );
      Pt( gappy_t, fc2, c  );
      Pt( gappy_t, fl,  l! );
      printf( "\n" );
      return 0;
  }

The output line can be used as a template in a pack or unpack call:

  1. my $gappy = pack( '@0c @2s! @4c @8l!', $c1, $s, $c2, $l );

Gee, yet another template code - as if we hadn't plenty. But @ saves our day by enabling us to specify the offset from the beginning of the pack buffer to the next item: This is just the value the offsetof macro (defined in <stddef.h> ) returns when given a struct type and one of its field names ("member-designator" in C standardese).

Neither using offsets nor adding x 's to bridge the gaps is satisfactory. (Just imagine what happens if the structure changes.) What we really need is a way of saying "skip as many bytes as required to the next multiple of N". In fluent Templatese, you say this with x!N where N is replaced by the appropriate value. Here's the next version of our struct packaging:

  1. my $gappy = pack( 'c x!2 s c x!4 l!', $c1, $s, $c2, $l );

That's certainly better, but we still have to know how long all the integers are, and portability is far away. Rather than 2 , for instance, we want to say "however long a short is". But this can be done by enclosing the appropriate pack code in brackets: [s]. So, here's the very best we can do:

  1. my $gappy = pack( 'c x![s] s c x![l!] l!', $c1, $s, $c2, $l );
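To watch the fill bytes appear, it helps to compare templates whose sizes do not depend on the platform. The sketch below therefore assumes fixed-width fields (s for 16 bits, l for 32 bits, both forced to little-endian) instead of the native s! and l!; the field values are arbitrary.

```perl
use strict;
use warnings;

my @fields = ( 1, 0x0203, 4, 0x05060708 );

# No padding at all: 1 + 2 + 1 + 4 = 8 bytes.
my $tight   = pack( 'c s< c l<', @fields );

# Fill bytes counted by hand.
my $manual  = pack( 'c x s< c xxx l<', @fields );

# x!N pads with null bytes up to the next multiple of N.
my $aligned = pack( 'c x!2 s< c x!4 l<', @fields );

print unpack( 'H*', $tight ),   "\n";    # 0103020408070605
print unpack( 'H*', $aligned ), "\n";    # 010003020400000008070605
```

The x!N version produces the same 12 bytes as the hand-counted one, without anybody having to count.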

Dealing with Endian-ness

Now, imagine that we want to pack the data for a machine with a different byte-order. First, we'll have to figure out how big the data types on the target machine really are. Let's assume that the longs are 32 bits wide and the shorts are 16 bits wide. You can then rewrite the template as:

  1. my $gappy = pack( 'c x![s] s c x![l] l', $c1, $s, $c2, $l );

If the target machine is little-endian, we could write:

  1. my $gappy = pack( 'c x![s] s< c x![l] l<', $c1, $s, $c2, $l );

This forces the short and the long members to be little-endian, and is just fine if you don't have too many struct members. But we could also use the byte-order modifier on a group and write the following:

  1. my $gappy = pack( '( c x![s] s c x![l] l )<', $c1, $s, $c2, $l );

This is not as short as before, but it makes it more obvious that we intend to have little-endian byte-order for a whole group, not only for individual template codes. It can also be more readable and easier to maintain.
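The effect of a byte-order modifier on a whole group is easy to inspect in a hex dump. This is a minimal sketch with two arbitrary values; it needs a Perl that supports the < and > modifiers (5.10 or later).

```perl
use strict;
use warnings;

# A 16-bit and a 32-bit integer, once in each byte order.
my $little = pack( '( s l )<', 1, 2 );
my $big    = pack( '( s l )>', 1, 2 );

print unpack( 'H*', $little ), "\n";    # 010002000000
print unpack( 'H*', $big ),    "\n";    # 000100000002

# The same grouped template reads the values back.
my @values = unpack( '( s l )>', $big );
print "@values\n";                      # 1 2
```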

Alignment, Take 2

I'm afraid that we're not quite through with the alignment catch yet. The hydra raises another ugly head when you pack arrays of structures:

  typedef struct {
      short count;
      char  glyph;
  } cell_t;

  typedef cell_t buffer_t[BUFLEN];

Where's the catch? Padding is neither required before the first field count , nor between this and the next field glyph , so why can't we simply pack like this:

  1. # something goes wrong here:
  2. pack( 's!a' x @buffer,
  3. map{ ( $_->{count}, $_->{glyph} ) } @buffer );

This packs 3*@buffer bytes, but it turns out that the size of buffer_t is four times BUFLEN ! The moral of the story is that the required alignment of a structure or array is propagated to the next higher level where we have to consider padding at the end of each component as well. Thus the correct template is:

  1. pack( 's!ax' x @buffer,
  2. map{ ( $_->{count}, $_->{glyph} ) } @buffer );
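The size difference between the two templates can be measured directly. The two-cell @buffer is sample data invented for this sketch, and the sizes in the comments assume a platform where a native short occupies 2 bytes.

```perl
use strict;
use warnings;

my @buffer = ( { count => 1, glyph => 'a' }, { count => 2, glyph => 'b' } );
my @items  = map { ( $_->{count}, $_->{glyph} ) } @buffer;

# Without trailing padding: 3 bytes per cell with 2-byte shorts.
my $tight  = pack( 's!a'  x @buffer, @items );

# With one fill byte after each cell: 4 bytes per cell.
my $padded = pack( 's!ax' x @buffer, @items );

printf "tight: %d bytes, padded: %d bytes\n",
    length($tight), length($padded);
```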

Alignment, Take 3

And even if you take all the above into account, ANSI still lets this:

  1. typedef struct {
  2. char foo[2];
  3. } foo_t;

vary in size. The alignment constraint of the structure can be greater than any of its elements. [And if you think that this doesn't affect anything common, dismember the next cellphone that you see. Many have ARM cores, and the ARM structure rules make sizeof (foo_t) == 4]

Pointers for How to Use Them

The title of this section indicates the second problem you may run into sooner or later when you pack C structures. If the function you intend to call expects a, say, void * value, you cannot simply take a reference to a Perl variable. (Although that value certainly is a memory address, it's not the address where the variable's contents are stored.)

Template code P promises to pack a "pointer to a fixed length string". Isn't this what we want? Let's try:

  1. # allocate some storage and pack a pointer to it
  2. my $memory = "\x00" x $size;
  3. my $memptr = pack( 'P', $memory );

But wait: doesn't pack just return a sequence of bytes? How can we pass this string of bytes to some C code expecting a pointer which is, after all, nothing but a number? The answer is simple: We have to obtain the numeric address from the bytes returned by pack.

  1. my $ptr = unpack( 'L!', $memptr );

Obviously this assumes that it is possible to typecast a pointer to an unsigned long and vice versa, which frequently works but should not be taken as a universal law. - Now that we have this pointer the next question is: How can we put it to good use? We need a call to some C function where a pointer is expected. The read(2) system call comes to mind:

  1. ssize_t read(int fd, void *buf, size_t count);

After reading perlfunc explaining how to use syscall we can write this Perl function copying a file to standard output:

  require 'syscall.ph';
  sub cat($){
      my $path = shift();
      my $size = -s $path;
      my $memory = "\x00" x $size;  # allocate some memory
      my $ptr = unpack( 'L!', pack( 'P', $memory ) );
      open( F, $path ) || die( "$path: cannot open ($!)\n" );
      my $fd = fileno(F);
      my $res = syscall( &SYS_read, $fd, $ptr, $size );
      print $memory;
      close( F );
  }

This is neither a specimen of simplicity nor a paragon of portability but it illustrates the point: We are able to sneak behind the scenes and access Perl's otherwise well-guarded memory! (Important note: Perl's syscall does not require you to construct pointers in this roundabout way. You simply pass a string variable, and Perl forwards the address.)

How does unpack with P work? Imagine some pointer in the buffer about to be unpacked: If it isn't the null pointer (which will smartly produce the undef value) we have a start address - but then what? Perl has no way of knowing how long this "fixed length string" is, so it's up to you to specify the actual size as an explicit length after P .

  1. my $mem = "abcdefghijklmn";
  2. print unpack( 'P5', pack( 'P', $mem ) ); # prints "abcde"

As a consequence, pack ignores any number or * after P .

Now that we have seen P at work, we might as well give p a whirl. Why do we need a second template code for packing pointers at all? The answer lies behind the simple fact that an unpack with p promises a null-terminated string starting at the address taken from the buffer, and that implies a length for the data item to be returned:

  1. my $buf = pack( 'p', "abc\x00efhijklmn" );
  2. print unpack( 'p', $buf ); # prints "abc"

This is apt to be confusing, though: because the length is implied by the string's terminating null byte, a number after pack code p is a repeat count, not a length as after P.
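Both pointer codes can be compared in one short sketch. The strings live in named variables that stay in scope, so the addresses remain valid while unpack follows them.

```perl
use strict;
use warnings;

# P: we must state how many bytes to fetch from the address.
my $mem    = "abcdefghijklmn";
my $first5 = unpack( 'P5', pack( 'P', $mem ) );

# p: the string at the address ends at the first null byte.
my $text   = "abc\x00efhijklmn";
my $cstr   = unpack( 'p', pack( 'p', $text ) );

print "$first5 $cstr\n";    # abcde abc
```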

Using pack(..., $x) with P or p to get the address where $x is actually stored must be used with circumspection. Perl's internal machinery considers the relation between a variable and that address as its very own private matter and doesn't really care that we have obtained a copy. Therefore:

  • Do not use pack with p or P to obtain the address of a variable that's bound to go out of scope (thereby freeing its memory) before you are done with using the memory at that address.

  • Be very careful with Perl operations that change the value of the variable. Appending something to the variable, for instance, might require reallocation of its storage, leaving you with a pointer into no-man's land.

  • Don't think that you can get the address of a Perl variable when it is stored as an integer or double number! pack('P', $x) will force the variable's internal representation to string, just as if you had written something like $x .= '' .

It's safe, however, to P- or p-pack a string literal, because Perl simply allocates an anonymous variable.

Pack Recipes

Here is a collection of (possibly) useful canned recipes for pack and unpack:

  # Convert IP address for socket functions
  pack( "C4", split /\./, "123.4.5.6" );

  # Count the bits in a chunk of memory (e.g. a select vector)
  unpack( '%32b*', $mask );

  # Determine the endianness of your system
  $is_little_endian = unpack( 'c', pack( 's', 1 ) );
  $is_big_endian = unpack( 'xc', pack( 's', 1 ) );

  # Determine the number of bits in a native integer
  # (pack first: unpack works on a string, not on a number)
  $bits = unpack( '%32b*', pack( 'I!', ~0 ) );

  # Prepare argument for the nanosleep system call
  my $timespec = pack( 'L!L!', $secs, $nanosecs );
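A few of these recipes can be checked against what Perl itself reports about the platform. The sketch below counts the bits of a native integer by packing an all-ones value and summing its set bits, which is one possible way to do it, and compares the result with the standard Config module.

```perl
use strict;
use warnings;
use Config;

# Four address bytes in network order.
my $ip = pack( 'C4', split /\./, '123.4.5.6' );

# Exactly one of these is true on any given machine.
my $is_little_endian = unpack( 'c',  pack( 's', 1 ) );
my $is_big_endian    = unpack( 'xc', pack( 's', 1 ) );

# Pack an all-ones native unsigned int, then count its set bits.
my $bits = unpack( '%32b*', pack( 'I!', ~0 ) );

printf "ip=%s little=%d big=%d int bits=%d (Config says %d)\n",
    unpack( 'H*', $ip ), $is_little_endian, $is_big_endian,
    $bits, 8 * $Config{intsize};
```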

For a simple memory dump we unpack some bytes into just as many pairs of hex digits, and use map to handle the traditional spacing - 16 bytes to a line:

  1. my $i;
  2. print map( ++$i % 16 ? "$_ " : "$_\n",
  3. unpack( 'H2' x length( $mem ), $mem ) ),
  4. length( $mem ) % 16 ? "\n" : '';
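If you need the dump as a string rather than printed output, the same idea can be wrapped in a subroutine. The helper name hexdump is hypothetical, and the grouping is done with splice instead of the counter used above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the dump as a string,
# 16 space-separated byte pairs per line.
sub hexdump {
    my ($mem) = @_;
    my @pairs = unpack( '(H2)*', $mem );
    my $out   = '';
    while ( my @row = splice( @pairs, 0, 16 ) ) {
        $out .= join( ' ', @row ) . "\n";
    }
    return $out;
}

print hexdump( "Hello, pack!\x00\xff" x 2 );
```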

Funnies Section

  1. # Pulling digits out of nowhere...
  2. print unpack( 'C', pack( 'x' ) ),
  3. unpack( '%B*', pack( 'A' ) ),
  4. unpack( 'H', pack( 'A' ) ),
  5. unpack( 'A', unpack( 'C', pack( 'A' ) ) ), "\n";
  6. # One for the road ;-)
  7. my $advice = pack( 'all u can in a van' );

Authors

Simon Cozens and Wolfgang Laun.

 
perlperf

Perl 5 version 18.2 documentation

NAME

perlperf - Perl Performance and Optimization Techniques

DESCRIPTION

This is an introduction to the use of performance and optimization techniques which can be used with particular reference to perl programs. While many perl developers have come from other languages, and can use their prior knowledge where appropriate, there are many other people who might benefit from a few perl specific pointers. If you want the condensed version, perhaps the best advice comes from the renowned Japanese Samurai, Miyamoto Musashi, who said:

  1. "Do Not Engage in Useless Activity"

in 1645.

OVERVIEW

Perhaps the most common mistake programmers make is to attempt to optimize their code before a program actually does anything useful - this is a bad idea. There's no point in having an extremely fast program that doesn't work. The first job is to get a program to correctly do something useful, (not to mention ensuring the test suite is fully functional), and only then to consider optimizing it. Having decided to optimize existing working code, there are several simple but essential steps to consider which are intrinsic to any optimization process.

ONE STEP SIDEWAYS

Firstly, you need to establish a baseline time for the existing code, and that timing needs to be reliable and repeatable. You'll probably want to use the Benchmark or Devel::NYTProf modules, or something similar, for this step, or perhaps the Unix system time utility, whichever is appropriate. See the base of this document for a longer list of benchmarking and profiling modules, and recommended further reading.

ONE STEP FORWARD

Next, having examined the program for hot spots (places where the code seems to run slowly), change the code with the intention of making it run faster. Using version control software, like subversion, will ensure no changes are irreversible. It's too easy to fiddle here and fiddle there - don't change too much at any one time or you might not discover which piece of code really was the slow bit.

ANOTHER STEP SIDEWAYS

It's not enough to say: "that will make it run faster", you have to check it. Rerun the code under control of the benchmarking or profiling modules, from the first step above, and check that the new code executed the same task in less time. Save your work and repeat...

GENERAL GUIDELINES

The critical thing when considering performance is to remember there is no such thing as a Golden Bullet, which is why there are no rules, only guidelines.

It is clear that inline code is going to be faster than subroutine or method calls, because there is less overhead, but this approach has the disadvantage of being less maintainable and comes at the cost of greater memory usage - there is no such thing as a free lunch. If you are searching for an element in a list, it can be more efficient to store the data in a hash structure, and then simply look to see whether the key is defined, rather than to loop through the entire array using grep() for instance. substr() may be (a lot) faster than grep() but not as flexible, so you have another trade-off to assess. Your code may contain a line which takes 0.01 of a second to execute. If you call it 1,000 times, which is quite likely in a program parsing even medium sized files, you already have a 10 second delay in just one single code location, and if you call that line 100,000 times, your entire program will slow down to an unbearable crawl.
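The list-versus-hash trade-off is easy to measure. Here is a sketch using cmpthese from the Benchmark module; the 1000-element list, the helper names in_list and in_hash, and the one CPU second per candidate are all arbitrary choices, and the numbers will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

my @list   = map { "item$_" } 1 .. 1000;
my %seen   = map { $_ => 1 } @list;
my $wanted = 'item1000';    # last element: worst case for a linear scan

# Linear scan: grep visits every element on every call.
sub in_list { return scalar grep { $_ eq $_[0] } @list }

# Hash lookup: a single key test.
sub in_hash { return exists $seen{ $_[0] } }

# A negative count means "run each candidate for that many CPU seconds".
cmpthese( -1, {
    grep => sub { in_list($wanted) },
    hash => sub { in_hash($wanted) },
} );
```

Building the hash costs one pass over the list, so the hash only pays off when you look things up more than once.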

Using a subroutine as part of your sort is a powerful way to get exactly what you want, but will usually be slower than the built-in alphabetic cmp and numeric <=> sort operators. It is possible to make multiple passes over your data, building indices to make the upcoming sort more efficient, and to use what is known as the OM (Orcish Maneuver) to cache the sort keys in advance. The cache lookup, while a good idea, can itself be a source of slowdown by enforcing a double pass over the data - once to set up the cache, and once to sort the data. Using pack() to extract the required sort key into a consistent string can be an efficient way to build a single string to compare, instead of using multiple sort keys, which makes it possible to use the standard, written in C and fast, perl sort() function on the output, and is the basis of the GRT (Guttman Rossler Transform). Some string combinations can slow the GRT down, by just being too plain complex for its own good.
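Both techniques fit in a few lines. The sketch below sorts some invented sample lines by their leading number, once with an Orcish Maneuver cache and once with a GRT-style packed key; it assumes the numbers are non-negative and fit in 32 bits.

```perl
use strict;
use warnings;

my @lines = ( '10 apples', '9 pears', '100 plums' );

# Orcish Maneuver: compute each sort key at most once and cache it.
# (A key of 0 would be recomputed, because ||= tests for truth.)
my %key;
my @om = sort {
    ( $key{$a} ||= ( split ' ', $a )[0] ) <=> ( $key{$b} ||= ( split ' ', $b )[0] )
} @lines;

# GRT: prefix each line with a fixed-width big-endian key, sort the
# strings with the default comparison, then strip the 4 key bytes.
my @keyed = map { pack( 'N', ( split ' ', $_ )[0] ) . $_ } @lines;
my @grt   = map { substr( $_, 4 ) } sort @keyed;

print join( ', ', @grt ), "\n";    # 9 pears, 10 apples, 100 plums
```

The big-endian N format is what makes plain string comparison agree with numeric order.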

For applications using database backends, the standard DBIx namespace tries to help with keeping things nippy, not least because it tries to not query the database until the latest possible moment, but always read the docs which come with your choice of libraries. Among the many issues developers dealing with databases should remain aware of are to always use SQL placeholders and to consider pre-fetching data sets when this might prove advantageous. Splitting up a large file by assigning multiple processes to parsing a single file, using say POE, threads or fork can also be a useful way of optimizing your usage of the available CPU resources, though this technique is fraught with concurrency issues and demands high attention to detail.

Every case has a specific application and one or more exceptions, and there is no replacement for running a few tests and finding out which method works best for your particular environment. This is why writing optimal code is not an exact science, and why we love using Perl so much - TMTOWTDI.

BENCHMARKS

Here are a few examples to demonstrate usage of Perl's benchmarking tools.

Assigning and Dereferencing Variables.

I'm sure most of us have seen code which looks like, (or worse than), this:

  1. if ( $obj->{_ref}->{_myscore} >= $obj->{_ref}->{_yourscore} ) {
  2. ...

This sort of code can be a real eyesore to read, as well as being very sensitive to typos, and it's much clearer to dereference the variable explicitly. We're side-stepping the issue of working with object-oriented programming techniques to encapsulate variable access via methods, only accessible through an object. Here we're just discussing the technical implementation of choice, and whether this has an effect on performance. We can see whether this dereferencing operation has any overhead by putting comparative code in a file and running a Benchmark test.

# dereference

  1. #!/usr/bin/perl
  2. use strict;
  3. use warnings;
  4. use Benchmark;
  5. my $ref = {
  6. 'ref' => {
  7. _myscore => '100 + 1',
  8. _yourscore => '102 - 1',
  9. },
  10. };
  11. timethese(1000000, {
  12. 'direct' => sub {
  13. my $x = $ref->{ref}->{_myscore} . $ref->{ref}->{_yourscore} ;
  14. },
  15. 'dereference' => sub {
  16. my $ref = $ref->{ref};
  17. my $myscore = $ref->{_myscore};
  18. my $yourscore = $ref->{_yourscore};
  19. my $x = $myscore . $yourscore;
  20. },
  21. });

It's essential to run any timing measurements a sufficient number of times so the numbers settle on a numerical average; otherwise each run will naturally fluctuate due to variations in the environment, such as contention for CPU resources and network bandwidth. Running the above code for one million iterations, we can take a look at the report output by the Benchmark module, to see which approach is the most effective.

  1. $> perl dereference
  2. Benchmark: timing 1000000 iterations of dereference, direct...
  3. dereference: 2 wallclock secs ( 1.59 usr + 0.00 sys = 1.59 CPU) @ 628930.82/s (n=1000000)
  4. direct: 1 wallclock secs ( 1.20 usr + 0.00 sys = 1.20 CPU) @ 833333.33/s (n=1000000)

The difference is clear to see and the dereferencing approach is slower. While it managed to execute an average of 628,930 times a second during our test, the direct approach managed to run an additional 204,403 times, unfortunately. Unfortunately, because there are many examples of code written using the multiple layer direct variable access, and it's usually horrible. It is, however, fractionally faster. The question remains whether the minute gain is actually worth the eyestrain, or the loss of maintainability.

Search and replace or tr

If we have a string which needs to be modified, while a regex will almost always be much more flexible, tr, an oft underused tool, can still be useful. One scenario might be to replace all vowels with another character. The regex solution might look like this:

  1. $str =~ s/[aeiou]/x/g

The tr alternative might look like this:

  1. $str =~ tr/aeiou/xxxxx/

We can put both into a test file and run it to check which approach is faster. A global $STR variable is assigned to the my $str variable on each iteration, so that perl cannot optimize any of the work away by noticing that the value is assigned only once.

# regex-transliterate

  1. #!/usr/bin/perl
  2. use strict;
  3. use warnings;
  4. use Benchmark;
  5. my $STR = "$$-this and that";
  6. timethese( 1000000, {
  7. 'sr' => sub { my $str = $STR; $str =~ s/[aeiou]/x/g; return $str; },
  8. 'tr' => sub { my $str = $STR; $str =~ tr/aeiou/xxxxx/; return $str; },
  9. });

Running the code gives us our results:

  1. $> perl regex-transliterate
  2. Benchmark: timing 1000000 iterations of sr, tr...
  3. sr: 2 wallclock secs ( 1.19 usr + 0.00 sys = 1.19 CPU) @ 840336.13/s (n=1000000)
  4. tr: 0 wallclock secs ( 0.49 usr + 0.00 sys = 0.49 CPU) @ 2040816.33/s (n=1000000)

The tr version is a clear winner. One solution is flexible, the other is fast - and it's appropriately the programmer's choice which to use.
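As a small illustration of what tr offers beyond speed, the sketch below shows two documented behaviours: a replacement list shorter than the search list has its last character repeated, and tr returns the number of characters it matched. The variable names are chosen for this example only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $str = "this and that";

# Regex substitution over a character class.
(my $via_regex = $str) =~ s/[aeiou]/x/g;

# tr repeats the final replacement character as needed,
# so /x/ behaves like /xxxxx/ here.
(my $via_tr = $str) =~ tr/aeiou/x/;

# With an empty replacement list (and no /d), tr leaves the
# string untouched and simply returns the number of matches.
my $vowels = ( $str =~ tr/aeiou// );

print "$via_regex\n$via_tr\n$vowels\n";
```

Both forms produce the same string here; the counting idiom has no equally terse regex equivalent.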

Check the Benchmark docs for further useful techniques.

PROFILING TOOLS

A slightly larger piece of code will provide something on which a profiler can produce more extensive reporting statistics. This example uses the simplistic wordmatch program which parses a given input file and spews out a short report on the contents.

# wordmatch

  1. #!/usr/bin/perl
  2. use strict;
  3. use warnings;
  4. =head1 NAME
  5. filewords - word analysis of input file
  6. =head1 SYNOPSIS
  7. filewords -f inputfilename [-d]
  8. =head1 DESCRIPTION
  9. This program parses the given filename, specified with C<-f>, and displays a
  10. simple analysis of the words found therein. Use the C<-d> switch to enable
  11. debugging messages.
  12. =cut
  13. use FileHandle;
  14. use Getopt::Long;
  15. my $debug = 0;
  16. my $file = '';
  17. my $result = GetOptions (
  18. 'debug' => \$debug,
  19. 'file=s' => \$file,
  20. );
  21. die("invalid args") unless $result;
  22. unless ( -f $file ) {
  23. die("Usage: $0 -f filename [-d]");
  24. }
  25. my $FH = FileHandle->new("< $file") or die("unable to open file($file): $!");
  26. my $i_LINES = 0;
  27. my $i_WORDS = 0;
  28. my %count = ();
  29. my @lines = <$FH>;
  30. foreach my $line ( @lines ) {
  31. $i_LINES++;
  32. $line =~ s/\n//;
  33. my @words = split(/ +/, $line);
  34. my $i_words = scalar(@words);
  35. $i_WORDS = $i_WORDS + $i_words;
  36. debug("line: $i_LINES supplying $i_words words: @words");
  37. my $i_word = 0;
  38. foreach my $word ( @words ) {
  39. $i_word++;
  40. $count{$i_LINES}{spec} += matches($i_word, $word, '[^a-zA-Z0-9]');
  41. $count{$i_LINES}{only} += matches($i_word, $word, '^[^a-zA-Z0-9]+$');
  42. $count{$i_LINES}{cons} += matches($i_word, $word, '^[(?i:bcdfghjklmnpqrstvwxyz)]+$');
  43. $count{$i_LINES}{vows} += matches($i_word, $word, '^[(?i:aeiou)]+$');
  44. $count{$i_LINES}{caps} += matches($i_word, $word, '^[(A-Z)]+$');
  45. }
  46. }
  47. print report( %count );
  48. sub matches {
  49. my $i_wd = shift;
  50. my $word = shift;
  51. my $regex = shift;
  52. my $has = 0;
  53. if ( $word =~ /($regex)/ ) {
  54. $has++ if $1;
  55. }
  56. debug("word: $i_wd ".($has ? 'matches' : 'does not match')." chars: /$regex/");
  57. return $has;
  58. }
  59. sub report {
  60. my %report = @_;
  61. my %rep;
  62. foreach my $line ( keys %report ) {
  63. foreach my $key ( keys %{ $report{$line} } ) {
  64. $rep{$key} += $report{$line}{$key};
  65. }
  66. }
  67. my $report = qq|
  68. $0 report for $file:
  69. lines in file: $i_LINES
  70. words in file: $i_WORDS
  71. words with special (non-word) characters: $rep{spec}
  72. words with only special (non-word) characters: $rep{only}
  73. words with only consonants: $rep{cons}
  74. words with only capital letters: $rep{caps}
  75. words with only vowels: $rep{vows}
  76. |;
  77. return $report;
  78. }
  79. sub debug {
  80. my $message = shift;
  81. if ( $debug ) {
  82. print STDERR "DBG: $message\n";
  83. }
  84. }
  85. exit 0;

Devel::DProf

This venerable module was the de-facto standard for Perl code profiling for more than a decade, but it has since been superseded by a number of other modules which have brought us back to the 21st century. You're encouraged to evaluate the several tools mentioned here, and those in the CPAN list at the base of this document (currently Devel::NYTProf seems to be the weapon of choice - see below), but we'll take a quick look at the output from Devel::DProf first, to set a baseline for Perl profiling tools. Run the above program under the control of Devel::DProf by using the -d switch on the command-line.

  1. $> perl -d:DProf wordmatch -f perl5db.pl
  2. <...multiple lines snipped...>
  3. wordmatch report for perl5db.pl:
  4. lines in file: 9428
  5. words in file: 50243
  6. words with special (non-word) characters: 20480
  7. words with only special (non-word) characters: 7790
  8. words with only consonants: 4801
  9. words with only capital letters: 1316
  10. words with only vowels: 1701

Devel::DProf produces a special file, called tmon.out by default, and this file is read by the dprofpp program, which is already installed as part of the Devel::DProf distribution. If you call dprofpp with no options, it will read the tmon.out file in the current directory and produce a human readable statistics report of the run of your program. Note that this may take a little time.

  1. $> dprofpp
  2. Total Elapsed Time = 2.951677 Seconds
  3. User+System Time = 2.871677 Seconds
  4. Exclusive Times
  5. %Time ExclSec CumulS #Calls sec/call Csec/c Name
  6. 102. 2.945 3.003 251215 0.0000 0.0000 main::matches
  7. 2.40 0.069 0.069 260643 0.0000 0.0000 main::debug
  8. 1.74 0.050 0.050 1 0.0500 0.0500 main::report
  9. 1.04 0.030 0.049 4 0.0075 0.0123 main::BEGIN
  10. 0.35 0.010 0.010 3 0.0033 0.0033 Exporter::as_heavy
  11. 0.35 0.010 0.010 7 0.0014 0.0014 IO::File::BEGIN
  12. 0.00 - -0.000 1 - - Getopt::Long::FindOption
  13. 0.00 - -0.000 1 - - Symbol::BEGIN
  14. 0.00 - -0.000 1 - - Fcntl::BEGIN
  15. 0.00 - -0.000 1 - - Fcntl::bootstrap
  16. 0.00 - -0.000 1 - - warnings::BEGIN
  17. 0.00 - -0.000 1 - - IO::bootstrap
  18. 0.00 - -0.000 1 - - Getopt::Long::ConfigDefaults
  19. 0.00 - -0.000 1 - - Getopt::Long::Configure
  20. 0.00 - -0.000 1 - - Symbol::gensym

dprofpp will produce some quite detailed reporting on the activity of the wordmatch program. The wallclock, user and system times are at the top of the analysis, followed by the main columns which define the report. Check the dprofpp docs for details of the many options it supports.

See also Apache::DProf which hooks Devel::DProf into mod_perl.

Devel::Profiler

Let's take a look at the same program using a different profiler: Devel::Profiler, a drop-in Perl-only replacement for Devel::DProf. The usage is very slightly different in that instead of using the special -d: flag, you pull Devel::Profiler in directly as a module using -M.

  1. $> perl -MDevel::Profiler wordmatch -f perl5db.pl
  2. <...multiple lines snipped...>
  3. wordmatch report for perl5db.pl:
  4. lines in file: 9428
  5. words in file: 50243
  6. words with special (non-word) characters: 20480
  7. words with only special (non-word) characters: 7790
  8. words with only consonants: 4801
  9. words with only capital letters: 1316
  10. words with only vowels: 1701

Devel::Profiler generates a tmon.out file which is compatible with the dprofpp program, thus saving the construction of a dedicated statistics reader program. dprofpp usage is therefore identical to the above example.

  1. $> dprofpp
  2. Total Elapsed Time = 20.984 Seconds
  3. User+System Time = 19.981 Seconds
  4. Exclusive Times
  5. %Time ExclSec CumulS #Calls sec/call Csec/c Name
  6. 49.0 9.792 14.509 251215 0.0000 0.0001 main::matches
  7. 24.4 4.887 4.887 260643 0.0000 0.0000 main::debug
  8. 0.25 0.049 0.049 1 0.0490 0.0490 main::report
  9. 0.00 0.000 0.000 1 0.0000 0.0000 Getopt::Long::GetOptions
  10. 0.00 0.000 0.000 2 0.0000 0.0000 Getopt::Long::ParseOptionSpec
  11. 0.00 0.000 0.000 1 0.0000 0.0000 Getopt::Long::FindOption
  12. 0.00 0.000 0.000 1 0.0000 0.0000 IO::File::new
  13. 0.00 0.000 0.000 1 0.0000 0.0000 IO::Handle::new
  14. 0.00 0.000 0.000 1 0.0000 0.0000 Symbol::gensym
  15. 0.00 0.000 0.000 1 0.0000 0.0000 IO::File::open

Interestingly, we get slightly different results, mostly because the algorithm which generates the report is different, even though the output file format is allegedly identical. The elapsed, user and system times clearly show the time it took Devel::Profiler to execute its own run, but the column listings feel more accurate somehow than the ones we had earlier from Devel::DProf. The 102% figure has disappeared, for example. This is where we have to use the tools at our disposal, and recognise their pros and cons, before using them. Note that the number of calls for each subroutine is identical in the two reports; it's the percentages which differ. As the author of Devel::Profiler writes:

  1. ...running HTML::Template's test suite under Devel::DProf shows output()
  2. taking NO time but Devel::Profiler shows around 10% of the time is in output().
  3. I don't know which to trust but my gut tells me something is wrong with
  4. Devel::DProf. HTML::Template::output() is a big routine that's called for
  5. every test. Either way, something needs fixing.

YMMV.

See also Devel::Apache::Profiler which hooks Devel::Profiler into mod_perl.

Devel::SmallProf

The Devel::SmallProf profiler examines the runtime of your Perl program and produces a line-by-line listing to show how many times each line was called, and how long each line took to execute. It is called by supplying the familiar -d flag to Perl at runtime.

  1. $> perl -d:SmallProf wordmatch -f perl5db.pl
  2. <...multiple lines snipped...>
  3. wordmatch report for perl5db.pl:
  4. lines in file: 9428
  5. words in file: 50243
  6. words with special (non-word) characters: 20480
  7. words with only special (non-word) characters: 7790
  8. words with only consonants: 4801
  9. words with only capital letters: 1316
  10. words with only vowels: 1701

Devel::SmallProf writes its output into a file called smallprof.out, by default. The format of the file looks like this:

  1. <num> <time> <ctime> <line>:<text>

When the program has terminated, the output may be examined and sorted using any standard text filtering utilities. Something like the following may be sufficient:

  1. $> cat smallprof.out | grep \d*: | sort -k3 | tac | head -n20
  2. 251215 1.65674 7.68000 75: if ( $word =~ /($regex)/ ) {
  3. 251215 0.03264 4.40000 79: debug("word: $i_wd ".($has ? 'matches' :
  4. 251215 0.02693 4.10000 81: return $has;
  5. 260643 0.02841 4.07000 128: if ( $debug ) {
  6. 260643 0.02601 4.04000 126: my $message = shift;
  7. 251215 0.02641 3.91000 73: my $has = 0;
  8. 251215 0.03311 3.71000 70: my $i_wd = shift;
  9. 251215 0.02699 3.69000 72: my $regex = shift;
  10. 251215 0.02766 3.68000 71: my $word = shift;
  11. 50243 0.59726 1.00000 59: $count{$i_LINES}{cons} =
  12. 50243 0.48175 0.92000 61: $count{$i_LINES}{spec} =
  13. 50243 0.00644 0.89000 56: my $i_cons = matches($i_word, $word,
  14. 50243 0.48837 0.88000 63: $count{$i_LINES}{caps} =
  15. 50243 0.00516 0.88000 58: my $i_caps = matches($i_word, $word, '^[(A-
  16. 50243 0.00631 0.81000 54: my $i_spec = matches($i_word, $word, '[^a-
  17. 50243 0.00496 0.80000 57: my $i_vows = matches($i_word, $word,
  18. 50243 0.00688 0.80000 53: $i_word++;
  19. 50243 0.48469 0.79000 62: $count{$i_LINES}{only} =
  20. 50243 0.48928 0.77000 60: $count{$i_LINES}{vows} =
  21. 50243 0.00683 0.75000 55: my $i_only = matches($i_word, $word, '^[^a-
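The same filtering can be sketched in Perl itself, which has the advantage of sorting the ctime column numerically rather than as text. The sub name top_by_ctime and the hash keys are hypothetical, and the pattern assumes lines exactly of the <num> <time> <ctime> <line>:<text> form described above; anything else is skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse "<num> <time> <ctime> <line>:<text>" lines and return the
# $limit rows with the largest ctime, compared numerically.
sub top_by_ctime {
    my ( $limit, @lines ) = @_;
    my @rows;
    foreach my $line (@lines) {
        # Skip headers and anything not in the expected format.
        next unless $line =~ /^\s*(\d+)\s+([\d.]+)\s+([\d.]+)\s+(\d+):(.*)$/;
        push @rows,
          { count => $1, time => $2, ctime => $3, line => $4, text => $5 };
    }
    my @sorted = sort { $b->{ctime} <=> $a->{ctime} } @rows;
    $limit = @sorted if $limit > @sorted;
    return @sorted[ 0 .. $limit - 1 ];
}

# Report the top 20 lines, if a smallprof.out exists in this directory.
if ( open( my $fh, '<', 'smallprof.out' ) ) {
    printf "%8d %10.5f %10.5f %s:%s\n",
      @{$_}{qw(count time ctime line text)}
      for top_by_ctime( 20, <$fh> );
    close $fh;
}
```

A numeric comparison matters once values reach double figures: as text, "10.5" sorts before "9.0".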

You can immediately see a slightly different focus from the subroutine profiling modules, and we start to see exactly which line of code is taking the most time. That regex line is looking a bit suspicious, for example. Remember that these tools are supposed to be used together: there is no single best way to profile your code, and you need to use the best tools for the job.
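One possible line of attack on a hot regex line like this, offered only as a sketch and not as what wordmatch actually does, is to compile each pattern once with qr// instead of interpolating a pattern string into a match on every call. The names %PATTERN and matches_qr are hypothetical, only three of the five patterns are shown, and the caps pattern is simplified; whether this helps in practice is something to measure with the profiler.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compile each pattern once, up front.
my %PATTERN = (
    spec => qr/[^a-zA-Z0-9]/,       # contains a special character
    only => qr/^[^a-zA-Z0-9]+$/,    # consists only of special characters
    caps => qr/^[A-Z]+$/,           # consists only of capital letters
);

# Return 1 if the word matches the precompiled pattern, else 0.
sub matches_qr {
    my ( $word, $regex ) = @_;
    return $word =~ $regex ? 1 : 0;
}

my %count;
foreach my $word (qw(Hello WORLD --- foo-bar)) {
    $count{$_} += matches_qr( $word, $PATTERN{$_} ) for keys %PATTERN;
}
print "$_: $count{$_}\n" for sort keys %count;
```

Rerunning the profiler after a change like this is the only way to know whether the line has actually become cheaper.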

See also Apache::SmallProf which hooks Devel::SmallProf into mod_perl.

Devel::FastProf

Devel::FastProf is another Perl line profiler. It was written with a view to getting a faster line profiler than is possible with, for example, Devel::SmallProf, because it's written in C. To use Devel::FastProf, supply the -d argument to Perl:

  1. $> perl -d:FastProf wordmatch -f perl5db.pl
  2. <...multiple lines snipped...>
  3. wordmatch report for perl5db.pl:
  4. lines in file: 9428
  5. words in file: 50243
  6. words with special (non-word) characters: 20480
  7. words with only special (non-word) characters: 7790
  8. words with only consonants: 4801
  9. words with only capital letters: 1316
  10. words with only vowels: 1701

Devel::FastProf writes statistics to the file fastprof.out in the current directory. The output file, which can be specified, can be interpreted by using the fprofpp command-line program.

  1. $> fprofpp | head -n20
  2. # fprofpp output format is:
  3. # filename:line time count: source
  4. wordmatch:75 3.93338 251215: if ( $word =~ /($regex)/ ) {
  5. wordmatch:79 1.77774 251215: debug("word: $i_wd ".($has ? 'matches' : 'does not match')." chars: /$regex/");
  6. wordmatch:81 1.47604 251215: return $has;
  7. wordmatch:126 1.43441 260643: my $message = shift;
  8. wordmatch:128 1.42156 260643: if ( $debug ) {
  9. wordmatch:70 1.36824 251215: my $i_wd = shift;
  10. wordmatch:71 1.36739 251215: my $word = shift;
  11. wordmatch:72 1.35939 251215: my $regex = shift;

Straightaway we can see that the number of times each line has been called is identical to the Devel::SmallProf output, and the sequence differs only very slightly, based on the ordering of the time each line took to execute: if ( $debug ) { and my $message = shift; swap places, for example. The differences in the actual times recorded might be due to the algorithm used internally, or to system resource limitations or contention.

See also DBIx::Profile, which will profile database queries running under the DBIx::* namespace.

Devel::NYTProf

Devel::NYTProf is the next generation of Perl code profiler, fixing many shortcomings in other tools and implementing many cool features. First of all, it can be used as a line profiler, a block profiler or a subroutine profiler, all at once. It can also use sub-microsecond (100ns) resolution on systems which provide clock_gettime(). It can be started and stopped even by the program being profiled. It's a one-line entry to profile mod_perl applications. It's written in C and is probably the fastest profiler available for Perl. The list of coolness just goes on. Enough of that, let's see how it works - just use the familiar -d switch to plug it in and run the code.

  1. $> perl -d:NYTProf wordmatch -f perl5db.pl
  2. wordmatch report for perl5db.pl:
  3. lines in file: 9427
  4. words in file: 50243
  5. words with special (non-word) characters: 20480
  6. words with only special (non-word) characters: 7790
  7. words with only consonants: 4801
  8. words with only capital letters: 1316
  9. words with only vowels: 1701

NYTProf will generate a report database into the file nytprof.out by default. Human readable reports can be generated from here by using the supplied nytprofhtml (HTML output) and nytprofcsv (CSV output) programs. We've used the Unix system html2text utility to convert the nytprof/index.html file for convenience here.

  1. $> html2text nytprof/index.html
  2. Performance Profile Index
  3. For wordmatch
  4. Run on Fri Sep 26 13:46:39 2008
  5. Reported on Fri Sep 26 13:47:23 2008
  6. Top 15 Subroutines -- ordered by exclusive time
  7. |Calls |P |F |Inclusive|Exclusive|Subroutine |
  8. | | | |Time |Time | |
  9. |251215|5 |1 |13.09263 |10.47692 |main:: |matches |
  10. |260642|2 |1 |2.71199 |2.71199 |main:: |debug |
  11. |1 |1 |1 |0.21404 |0.21404 |main:: |report |
  12. |2 |2 |2 |0.00511 |0.00511 |XSLoader:: |load (xsub) |
  13. |14 |14|7 |0.00304 |0.00298 |Exporter:: |import |
  14. |3 |1 |1 |0.00265 |0.00254 |Exporter:: |as_heavy |
  15. |10 |10|4 |0.00140 |0.00140 |vars:: |import |
  16. |13 |13|1 |0.00129 |0.00109 |constant:: |import |
  17. |1 |1 |1 |0.00360 |0.00096 |FileHandle:: |import |
  18. |3 |3 |3 |0.00086 |0.00074 |warnings::register::|import |
  19. |9 |3 |1 |0.00036 |0.00036 |strict:: |bits |
  20. |13 |13|13|0.00032 |0.00029 |strict:: |import |
  21. |2 |2 |2 |0.00020 |0.00020 |warnings:: |import |
  22. |2 |1 |1 |0.00020 |0.00020 |Getopt::Long:: |ParseOptionSpec|
  23. |7 |7 |6 |0.00043 |0.00020 |strict:: |unimport |
  24. For more information see the full list of 189 subroutines.

The first part of the report already shows the critical information regarding which subroutines are using the most time. The next gives some statistics about the source files profiled.

  1. Source Code Files -- ordered by exclusive time then name
  2. |Stmts |Exclusive|Avg. |Reports |Source File |
  3. | |Time | | | |
  4. |2699761|15.66654 |6e-06 |line . block . sub|wordmatch |
  5. |35 |0.02187 |0.00062|line . block . sub|IO/Handle.pm |
  6. |274 |0.01525 |0.00006|line . block . sub|Getopt/Long.pm |
  7. |20 |0.00585 |0.00029|line . block . sub|Fcntl.pm |
  8. |128 |0.00340 |0.00003|line . block . sub|Exporter/Heavy.pm |
  9. |42 |0.00332 |0.00008|line . block . sub|IO/File.pm |
  10. |261 |0.00308 |0.00001|line . block . sub|Exporter.pm |
  11. |323 |0.00248 |8e-06 |line . block . sub|constant.pm |
  12. |12 |0.00246 |0.00021|line . block . sub|File/Spec/Unix.pm |
  13. |191 |0.00240 |0.00001|line . block . sub|vars.pm |
  14. |77 |0.00201 |0.00003|line . block . sub|FileHandle.pm |
  15. |12 |0.00198 |0.00016|line . block . sub|Carp.pm |
  16. |14 |0.00175 |0.00013|line . block . sub|Symbol.pm |
  17. |15 |0.00130 |0.00009|line . block . sub|IO.pm |
  18. |22 |0.00120 |0.00005|line . block . sub|IO/Seekable.pm |
  19. |198 |0.00085 |4e-06 |line . block . sub|warnings/register.pm|
  20. |114 |0.00080 |7e-06 |line . block . sub|strict.pm |
  21. |47 |0.00068 |0.00001|line . block . sub|warnings.pm |
  22. |27 |0.00054 |0.00002|line . block . sub|overload.pm |
  23. |9 |0.00047 |0.00005|line . block . sub|SelectSaver.pm |
  24. |13 |0.00045 |0.00003|line . block . sub|File/Spec.pm |
  25. |2701595|15.73869 | |Total |
  26. |128647 |0.74946 | |Average |
  27. | |0.00201 |0.00003|Median |
  28. | |0.00121 |0.00003|Deviation |
  29. Report produced by the NYTProf 2.03 Perl profiler, developed by Tim Bunce and
  30. Adam Kaplan.

At this point, if you're using the HTML report, you can click through the various links to drill down into each subroutine and each line of code. Because we're using the text reporting here, and there's a whole directory full of reports built for each source file, we'll just display a part of the corresponding wordmatch-line.html file, sufficient to give an idea of the sort of output you can expect from this cool tool.

  1. $> html2text nytprof/wordmatch-line.html
  2. Performance Profile -- -block view-.-line view-.-sub view-
  3. For wordmatch
  4. Run on Fri Sep 26 13:46:39 2008
  5. Reported on Fri Sep 26 13:47:22 2008
  6. File wordmatch
  7. Subroutines -- ordered by exclusive time
  8. |Calls |P|F|Inclusive|Exclusive|Subroutine |
  9. | | | |Time |Time | |
  10. |251215|5|1|13.09263 |10.47692 |main::|matches|
  11. |260642|2|1|2.71199 |2.71199 |main::|debug |
  12. |1 |1|1|0.21404 |0.21404 |main::|report |
  13. |0 |0|0|0 |0 |main::|BEGIN |
  14. |Line|Stmts.|Exclusive|Avg. |Code |
  15. | | |Time | | |
  16. |1 | | | |#!/usr/bin/perl |
  17. |2 | | | | |
  18. | | | | |use strict; |
  19. |3 |3 |0.00086 |0.00029|# spent 0.00003s making 1 calls to strict:: |
  20. | | | | |import |
  21. | | | | |use warnings; |
  22. |4 |3 |0.01563 |0.00521|# spent 0.00012s making 1 calls to warnings:: |
  23. | | | | |import |
  24. |5 | | | | |
  25. |6 | | | |=head1 NAME |
  26. |7 | | | | |
  27. |8 | | | |filewords - word analysis of input file |
  28. <...snip...>
  29. |62 |1 |0.00445 |0.00445|print report( %count ); |
  30. | | | | |# spent 0.21404s making 1 calls to main::report|
  31. |63 | | | | |
  32. | | | | |# spent 23.56955s (10.47692+2.61571) within |
  33. | | | | |main::matches which was called 251215 times, |
  34. | | | | |avg 0.00005s/call: # 50243 times |
  35. | | | | |(2.12134+0.51939s) at line 57 of wordmatch, avg|
  36. | | | | |0.00005s/call # 50243 times (2.17735+0.54550s) |
  37. |64 | | | |at line 56 of wordmatch, avg 0.00005s/call # |
  38. | | | | |50243 times (2.10992+0.51797s) at line 58 of |
  39. | | | | |wordmatch, avg 0.00005s/call # 50243 times |
  40. | | | | |(2.12696+0.51598s) at line 55 of wordmatch, avg|
  41. | | | | |0.00005s/call # 50243 times (1.94134+0.51687s) |
  42. | | | | |at line 54 of wordmatch, avg 0.00005s/call |
  43. | | | | |sub matches { |
  44. <...snip...>
  45. |102 | | | | |
  46. | | | | |# spent 2.71199s within main::debug which was |
  47. | | | | |called 260642 times, avg 0.00001s/call: # |
  48. | | | | |251215 times (2.61571+0s) by main::matches at |
  49. |103 | | | |line 74 of wordmatch, avg 0.00001s/call # 9427 |
  50. | | | | |times (0.09628+0s) at line 50 of wordmatch, avg|
  51. | | | | |0.00001s/call |
  52. | | | | |sub debug { |
  53. |104 |260642|0.58496 |2e-06 |my $message = shift; |
  54. |105 | | | | |
  55. |106 |260642|1.09917 |4e-06 |if ( $debug ) { |
  56. |107 | | | |print STDERR "DBG: $message\n"; |
  57. |108 | | | |} |
  58. |109 | | | |} |
  59. |110 | | | | |
  60. |111 |1 |0.01501 |0.01501|exit 0; |
  61. |112 | | | | |

Oodles of very useful information in there - this seems to be the way forward.
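The ability to start and stop profiling from within the program, mentioned above, can be sketched as follows. DB::enable_profile() and DB::disable_profile() are provided by Devel::NYTProf when it is loaded; the wrapper name profiled is hypothetical, and the guard lets the script run unchanged when the profiler is absent. It assumes the script is started with profiling initially off, for example with NYTPROF=start=no in the environment and the usual -d:NYTProf switch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a code block with profiling switched on only for its duration.
# Without Devel::NYTProf loaded, the block is simply run as-is.
sub profiled {
    my ($code) = @_;
    my $have_nytprof = defined &DB::enable_profile;
    DB::enable_profile() if $have_nytprof;
    my @result = $code->();
    DB::disable_profile() if $have_nytprof;
    return @result;
}

# Only the work inside the block appears in the profile.
my ($total) = profiled(
    sub {
        my $sum = 0;
        $sum += $_ for 1 .. 1000;
        return $sum;
    }
);
print "total: $total\n";
```

Restricting the profile to one region keeps the report small when only a known hot spot is of interest.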

See also Devel::NYTProf::Apache which hooks Devel::NYTProf into mod_perl.

SORTING

Perl modules are not the only tools a performance analyst has at their disposal; system tools like time should not be overlooked, as the next example shows, where we take a quick look at sorting. Many books, theses and articles have been written about efficient sorting algorithms, and this is not the place to repeat such work. There are several good sorting modules which deserve a look too: Sort::Maker and Sort::Key spring to mind. However, it's still possible to make some observations on certain Perl-specific issues relating to sorting data sets, and to give an example or two of how sorting large data volumes can affect performance. Firstly, an often overlooked point when sorting large amounts of data: one can attempt to reduce the data set to be dealt with, and in many cases grep() can be quite useful as a simple filter:

  1. @data = sort grep { /$filter/ } @incoming

A command such as this can vastly reduce the volume of material to actually sort through in the first place, and should not be too lightly disregarded purely on the basis of its simplicity. The KISS principle is too often overlooked - the next example uses the simple system time utility to demonstrate. Let's take a look at an actual example of sorting the contents of a large file, an apache logfile would do. This one has over a quarter of a million lines, is 50M in size, and a snippet of it looks like this:

# logfile

  1. 188.209-65-87.adsl-dyn.isp.belgacom.be - - [08/Feb/2007:12:57:16 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
  2. 188.209-65-87.adsl-dyn.isp.belgacom.be - - [08/Feb/2007:12:57:16 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
  3. 151.56.71.198 - - [08/Feb/2007:12:57:41 +0000] "GET /suse-on-vaio.html HTTP/1.1" 200 2858 "http://www.linux-on-laptops.com/sony.html" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1"
  4. 151.56.71.198 - - [08/Feb/2007:12:57:42 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net/suse-on-vaio.html" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1"
  5. 151.56.71.198 - - [08/Feb/2007:12:57:43 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1"
  6. 217.113.68.60 - - [08/Feb/2007:13:02:15 +0000] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
  7. 217.113.68.60 - - [08/Feb/2007:13:02:16 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
  8. debora.to.isac.cnr.it - - [08/Feb/2007:13:03:58 +0000] "GET /suse-on-vaio.html HTTP/1.1" 200 2858 "http://www.linux-on-laptops.com/sony.html" "Mozilla/5.0 (compatible; Konqueror/3.4; Linux) KHTML/3.4.0 (like Gecko)"
  9. debora.to.isac.cnr.it - - [08/Feb/2007:13:03:58 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net/suse-on-vaio.html" "Mozilla/5.0 (compatible; Konqueror/3.4; Linux) KHTML/3.4.0 (like Gecko)"
  10. debora.to.isac.cnr.it - - [08/Feb/2007:13:03:58 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0 (compatible; Konqueror/3.4; Linux) KHTML/3.4.0 (like Gecko)"
  11. 195.24.196.99 - - [08/Feb/2007:13:26:48 +0000] "GET / HTTP/1.0" 200 3309 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9"
  12. 195.24.196.99 - - [08/Feb/2007:13:26:58 +0000] "GET /data/css HTTP/1.0" 404 206 "http://www.rfi.net/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9"
  13. 195.24.196.99 - - [08/Feb/2007:13:26:59 +0000] "GET /favicon.ico HTTP/1.0" 404 209 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9"
  14. crawl1.cosmixcorp.com - - [08/Feb/2007:13:27:57 +0000] "GET /robots.txt HTTP/1.0" 200 179 "-" "voyager/1.0"
  15. crawl1.cosmixcorp.com - - [08/Feb/2007:13:28:25 +0000] "GET /links.html HTTP/1.0" 200 3413 "-" "voyager/1.0"
  16. fhm226.internetdsl.tpnet.pl - - [08/Feb/2007:13:37:32 +0000] "GET /suse-on-vaio.html HTTP/1.1" 200 2858 "http://www.linux-on-laptops.com/sony.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
  17. fhm226.internetdsl.tpnet.pl - - [08/Feb/2007:13:37:34 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net/suse-on-vaio.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
  18. 80.247.140.134 - - [08/Feb/2007:13:57:35 +0000] "GET / HTTP/1.1" 200 3309 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
  19. 80.247.140.134 - - [08/Feb/2007:13:57:37 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
  20. pop.compuscan.co.za - - [08/Feb/2007:14:10:43 +0000] "GET / HTTP/1.1" 200 3309 "-" "www.clamav.net"
  21. livebot-207-46-98-57.search.live.com - - [08/Feb/2007:14:12:04 +0000] "GET /robots.txt HTTP/1.0" 200 179 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
  22. livebot-207-46-98-57.search.live.com - - [08/Feb/2007:14:12:04 +0000] "GET /html/oracle.html HTTP/1.0" 404 214 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
  23. dslb-088-064-005-154.pools.arcor-ip.net - - [08/Feb/2007:14:12:15 +0000] "GET / HTTP/1.1" 200 3309 "-" "www.clamav.net"
  24. 196.201.92.41 - - [08/Feb/2007:14:15:01 +0000] "GET / HTTP/1.1" 200 3309 "-" "MOT-L7/08.B7.DCR MIB/2.2.1 Profile/MIDP-2.0 Configuration/CLDC-1.1"

The specific task here is to sort the 286,525 lines of this file by Response Code, Query, Browser, Referring Url, and lastly Date. One solution might be to use the following code, which iterates over the files given on the command-line.

# sort-apache-log

  1. #!/usr/bin/perl -n
  2. use strict;
  3. use warnings;
  4. my @data;
  5. LINE:
  6. while ( <> ) {
  7. my $line = $_;
  8. if (
  9. $line =~ m/^(
  10. ([\w\.\-]+) # client
  11. \s*-\s*-\s*\[
  12. ([^]]+) # date
  13. \]\s*"\w+\s*
  14. (\S+) # query
  15. [^"]+"\s*
  16. (\d+) # status
  17. \s+\S+\s+"[^"]*"\s+"
  18. ([^"]*) # browser
  19. "
  20. .*
  21. )$/x
  22. ) {
  23. my @chunks = split(/ +/, $line);
  24. my $ip = $1;
  25. my $date = $2;
  26. my $query = $3;
  27. my $status = $4;
  28. my $browser = $5;
  29. push(@data, [$ip, $date, $query, $status, $browser, $line]);
  30. }
  31. }
  32. my @sorted = sort {
  33. $a->[3] cmp $b->[3]
  34. ||
  35. $a->[2] cmp $b->[2]
  36. ||
  37. $a->[0] cmp $b->[0]
  38. ||
  39. $a->[1] cmp $b->[1]
  40. ||
  41. $a->[4] cmp $b->[4]
  42. } @data;
  43. foreach my $data ( @sorted ) {
  44. print $data->[5];
  45. }
  46. exit 0;

When running this program, redirect STDOUT so that the output can be checked against that of the following test runs, and use the system time utility to check the overall runtime.

  1. $> time ./sort-apache-log logfile > out-sort
  2. real 0m17.371s
  3. user 0m15.757s
  4. sys 0m0.592s

The program took just over 17 wallclock seconds to run. Note the different values time outputs: it's important to always compare the same one, and not to confuse what each one means.

  • Elapsed Real Time

    The overall, or wallclock, time between when time was called, and when it terminates. The elapsed time includes both user and system times, and time spent waiting for other users and processes on the system. Inevitably, this is the most approximate of the measurements given.

  • User CPU Time

    The user time is the amount of time the entire process spent on behalf of the user on this system executing this program.

  • System CPU Time

    The system time is the amount of time the kernel itself spent executing routines, or system calls, on behalf of this process.
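The same three figures can be taken from inside a Perl program, which is handy when only part of a run is of interest. This is a minimal sketch using the built-in times function and the core Time::HiRes module; the sort being measured is just filler work invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Record the wallclock and CPU counters before the work starts.
my $wall_start = time();
my ( $user_start, $sys_start ) = times();

# Some CPU-bound work to measure.
my @sorted = sort { $a <=> $b } map { ( $_ * 7919 ) % 10007 } 1 .. 200_000;

# The differences give the real, user and system times for the work.
my ( $user_end, $sys_end ) = times();
my $real = time() - $wall_start;
my $user = $user_end - $user_start;
my $sys  = $sys_end - $sys_start;

printf "real %.3fs\nuser %.3fs\nsys  %.3fs\n", $real, $user, $sys;
```

As with the time utility, the real figure includes any time spent waiting on other processes, so the user and system figures are usually the more repeatable ones.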

Running this same process as a Schwartzian Transform, it is possible to eliminate the input and output arrays for storing all the data, and work on the input directly as it arrives too. Otherwise, the code looks fairly similar:

# sort-apache-log-schwarzian

  1. #!/usr/bin/perl -n
  2. use strict;
  3. use warnings;
  4. print
  5. map $_->[0] =>
  6. sort {
  7. $a->[4] cmp $b->[4]
  8. ||
  9. $a->[3] cmp $b->[3]
  10. ||
  11. $a->[1] cmp $b->[1]
  12. ||
  13. $a->[2] cmp $b->[2]
  14. ||
  15. $a->[5] cmp $b->[5]
  16. }
  17. map [ $_, m/^(
  18. ([\w\.\-]+) # client
  19. \s*-\s*-\s*\[
  20. ([^]]+) # date
  21. \]\s*"\w+\s*
  22. (\S+) # query
  23. [^"]+"\s*
  24. (\d+) # status
  25. \s+\S+\s+"[^"]*"\s+"
  26. ([^"]*) # browser
  27. "
  28. .*
  29. )$/xo ]
  30. => <>;
  31. exit 0;

Run the new code against the same logfile, as above, to check the new time.

  1. $> time ./sort-apache-log-schwarzian logfile > out-schwarz
  2. real 0m9.664s
  3. user 0m8.873s
  4. sys 0m0.704s

The time has been cut nearly in half, which is a respectable speed improvement by any standard. Naturally, it is important to check that the output is consistent with the first program run; this is where the Unix system cksum utility comes in.

  1. $> cksum out-sort out-schwarz
  2. 3044173777 52029194 out-sort
  3. 3044173777 52029194 out-schwarz
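The map-sort-map pattern used above can be seen in isolation in a much smaller sketch: each item is decorated with its sort key once, the list is sorted on the cached key, and the decoration is then stripped off again. The word list is invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @words = qw(pear fig banana kiwi apple);

# Read bottom to top: decorate, sort, undecorate.
my @by_length =
  map  { $_->[0] }                                        # keep the word only
  sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] }     # length, then alphabetical
  map  { [ $_, length $_ ] }                              # pair word with its length
  @words;

print "@by_length\n";
```

The benefit grows with the cost of computing the key: here length is cheap, whereas in the logfile example the key is a regex match against every line.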

BTW, beware too of pressure from managers who see you speed a program up by 50% of the runtime once, only to get a request one month later to do the same again (true story) - you'll just have to point out you're only human, even if you are a Perl programmer, and you'll see what you can do...

LOGGING

An essential part of any good development process is appropriate error handling with appropriately informative messages. However, there exists a school of thought which suggests that log files should be chatty, as if the chain of unbroken output somehow ensures the survival of the program. If speed is in any way an issue, this approach is wrong.

A common sight is code which looks something like this:

  1. logger->debug( "A logging message via process-id: $$ INC: " . Dumper(\%INC) )

The problem is that this code will always be parsed and executed, even when the debug level set in the logging configuration file is zero. Once the debug() subroutine has been entered, and the internal $debug variable confirmed to be zero, for example, the message which has been sent in will be discarded and the program will continue. In the example given though, the \%INC hash will already have been dumped, and the message string constructed, all of which work could be bypassed by a debug variable at the statement level, like this:

  1. logger->debug( "A logging message via process-id: $$ INC: " . Dumper(\%INC) ) if $DEBUG;

This effect can be demonstrated by setting up a test script with both forms, including a debug() subroutine to emulate typical logger() functionality.

# ifdebug

    #!/usr/bin/perl

    use strict;
    use warnings;

    use Benchmark;
    use Data::Dumper;
    my $DEBUG = 0;

    # Emulates a typical logger: the level is only checked inside the sub.
    sub debug {
        my $msg = shift;

        if ( $DEBUG ) {
            print "DEBUG: $msg\n";
        }
    };

    timethese(100000, {
        # The message is always built, then thrown away inside debug().
        'debug' => sub {
            debug( "A $0 logging message via process-id: $$" . Dumper(\%INC) )
        },
        # The message is only built when $DEBUG is true.
        'ifdebug' => sub {
            debug( "A $0 logging message via process-id: $$" . Dumper(\%INC) ) if $DEBUG
        },
    });

Let's see what Benchmark makes of this:

    $> perl ifdebug
    Benchmark: timing 100000 iterations of debug, ifdebug...
       ifdebug:  0 wallclock secs ( 0.01 usr +  0.00 sys =  0.01 CPU) @ 10000000.00/s (n=100000)
                (warning: too few iterations for a reliable count)
         debug: 14 wallclock secs (13.18 usr +  0.04 sys = 13.22 CPU) @ 7564.30/s (n=100000)

In the one case the code, which does exactly the same thing as far as outputting any debugging information is concerned, in other words nothing, takes 14 seconds, and in the other case the code takes one hundredth of a second. Looks fairly definitive. Use a $DEBUG variable BEFORE you call the subroutine, rather than relying on the smart functionality inside it.
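
If a trailing "if $DEBUG" on every call is unappealing, one alternative is to hand the logger a code reference and let it build the message only when debugging is switched on. This is a sketch rather than part of the benchmark above, and the names debug_lazy and $built are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

our $DEBUG = 0;
my $built  = 0;    # counts how often the expensive message was built

# Hypothetical logger: takes a code ref and only calls it when $DEBUG is true.
sub debug_lazy {
    my ($make_msg) = @_;
    return unless $DEBUG;
    print "DEBUG: ", $make_msg->(), "\n";
    return;
}

# The closure body (including the Dumper call) does not run while $DEBUG is 0.
debug_lazy( sub { $built++; "INC: " . Dumper(\%INC) } );
print "message built $built time(s)\n";
```

This still pays for one subroutine call and one closure per log statement, so the statement-level test remains the cheaper of the two.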

Logging if DEBUG (constant)

It's possible to take the previous idea a little further, by using a compile time DEBUG constant.

# ifdebug-constant

    #!/usr/bin/perl

    use strict;
    use warnings;

    use Benchmark;
    use Data::Dumper;
    use constant
        DEBUG => 0
    ;

    sub debug {
        if ( DEBUG ) {
            my $msg = shift;
            print "DEBUG: $msg\n";
        }
    };

    timethese(100000, {
        # The message is always built, then thrown away inside debug().
        'sub' => sub {
            debug( "A $0 logging message via process-id: $$" . Dumper(\%INC) )
        },
        # The whole statement is guarded by the compile-time constant.
        'constant' => sub {
            debug( "A $0 logging message via process-id: $$" . Dumper(\%INC) ) if DEBUG
        },
    });

Running this program produces the following output:

    $> perl ifdebug-constant
    Benchmark: timing 100000 iterations of constant, sub...
      constant:  0 wallclock secs (-0.00 usr +  0.00 sys = -0.00 CPU) @ -7205759403792793600000.00/s (n=100000)
                (warning: too few iterations for a reliable count)
           sub: 14 wallclock secs (13.09 usr +  0.00 sys = 13.09 CPU) @ 7639.42/s (n=100000)

The DEBUG constant wipes the floor with even the $DEBUG variable, clocking in at minus zero seconds, and generates a "warning: too few iterations for a reliable count" message into the bargain. To see what is really going on, and why we had too few iterations when we thought we asked for 100000, we can use the very useful B::Deparse to inspect the new code:

    $> perl -MO=Deparse ifdebug-constant
    use Benchmark;
    use Data::Dumper;
    use constant ('DEBUG', 0);
    sub debug {
        use warnings;
        use strict 'refs';
        0;
    }
    use warnings;
    use strict 'refs';
    timethese(100000, {'sub', sub {
        debug "A $0 logging message via process-id: $$" . Dumper(\%INC);
    }
    , 'constant', sub {
        0;
    }
    });
    ifdebug-constant syntax OK

The output shows the constant() subroutine we're testing being replaced with the value of the DEBUG constant: zero. The line to be tested has been completely optimized away, and you can't get much more efficient than that.
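
A hard-coded zero is not very flexible. The sketch below derives the constant from the environment instead, so it is still fixed at compile time and the guarded statement can still be folded away. The variable name MYAPP_DEBUG is an assumption made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: MYAPP_DEBUG is a made-up environment variable name.
# "use constant" runs at compile time, so DEBUG is already a plain 0 or 1
# by the time the statements below are compiled.
use constant DEBUG => $ENV{MYAPP_DEBUG} ? 1 : 0;

sub debug {
    my ($msg) = @_;
    print "DEBUG: $msg\n";
    return;
}

# When DEBUG is 0 this statement should compile down to a constant.
debug("expensive message for $0, pid $$") if DEBUG;

print "debugging is ", ( DEBUG ? "on" : "off" ), "\n";
```

Running it under perl -MO=Deparse with and without the variable set should show whether the debug() call survives compilation.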

POSTSCRIPT

This document has provided several ways to go about identifying hot-spots, and checking whether any modifications have improved the runtime of the code.

As a final thought, remember that it's not (at the time of writing) possible to produce a useful program which will run in zero or negative time. This basic principle can be written as: useful programs are slow by their very definition. It is of course possible to write a nearly instantaneous program, but it's not going to do very much. Here's a very efficient one:

  1. $> perl -e 0

Optimizing that any further is a job for p5p .

SEE ALSO

Further reading can be found using the modules and links below.

PERLDOCS

For example: perldoc -f sort .

perlfaq4.

perlfork, perlfunc, perlretut, perlthrtut.

threads.

MAN PAGES

time.

MODULES

It's not possible to individually showcase all the performance related code for Perl here, naturally, but here's a short list of modules from the CPAN which deserve further attention.

  1. Apache::DProf
  2. Apache::SmallProf
  3. Benchmark
  4. DBIx::Profile
  5. Devel::AutoProfiler
  6. Devel::DProf
  7. Devel::DProfLB
  8. Devel::FastProf
  9. Devel::GraphVizProf
  10. Devel::NYTProf
  11. Devel::NYTProf::Apache
  12. Devel::Profiler
  13. Devel::Profile
  14. Devel::Profit
  15. Devel::SmallProf
  16. Devel::WxProf
  17. POE::Devel::Profiler
  18. Sort::Key
  19. Sort::Maker

URLS

Very useful online reference material:

  1. http://www.ccl4.org/~nick/P/Fast_Enough/
  2. http://www-128.ibm.com/developerworks/library/l-optperl.html
  3. http://perlbuzz.com/2007/11/bind-output-variables-in-dbi-for-speed-and-safety.html
  4. http://en.wikipedia.org/wiki/Performance_analysis
  5. http://apache.perl.org/docs/1.0/guide/performance.html
  6. http://perlgolf.sourceforge.net/
  7. http://www.sysarch.com/Perl/sort_paper.html

AUTHOR

Richard Foley <richard.foley@rfi.net> Copyright (c) 2008

 
perlplan9

Perl 5 version 18.2 documentation

NAME

perlplan9 - Plan 9-specific documentation for Perl

DESCRIPTION

These are a few notes describing features peculiar to Plan 9 Perl. As such, it is not intended to be a replacement for the rest of the Perl 5 documentation (which is both copious and excellent). If you have any questions to which you can't find answers in these man pages, contact Luther Huffman at lutherh@stratcom.com and we'll try to answer them.

Invoking Perl

Perl is invoked from the command line as described in perl. Most perl scripts, however, do have a first line such as "#!/usr/local/bin/perl". This is known as a shebang (shell-bang) statement and tells the OS shell where to find the perl interpreter. In Plan 9 Perl this statement should be "#!/bin/perl" if you wish to be able to directly invoke the script by its name. Alternatively, you may invoke perl with the command "Perl" instead of "perl". This will produce Acme-friendly error messages of the form "filename:18".

Some scripts, usually identified with a *.PL extension, are self-configuring and are able to correctly create their own shebang path from config information located in Plan 9 Perl. These you won't need to be worried about.

What's in Plan 9 Perl

Although Plan 9 Perl currently only provides static loading, it is built with a number of useful extensions. These include Opcode, FileHandle, Fcntl, and POSIX. Expect to see others (and DynaLoading!) in the future.

What's not in Plan 9 Perl

As mentioned previously, dynamic loading isn't currently available nor is MakeMaker. Both are high-priority items.

Perl5 Functions not currently supported in Plan 9 Perl

Some, such as chown and umask, aren't provided because the concept does not exist within Plan 9. Others, such as some of the socket-related functions, simply haven't been written yet. Many in the latter category may be supported in the future.

The functions not currently implemented include:

  1. chown, chroot, dbmclose, dbmopen, getsockopt,
  2. setsockopt, recvmsg, sendmsg, getnetbyname,
  3. getnetbyaddr, getnetent, getprotoent, getservent,
  4. sethostent, setnetent, setprotoent, setservent,
  5. endservent, endnetent, endprotoent, umask

There may be several other functions that have undefined behavior so this list shouldn't be considered complete.
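
A portable script can guard against a missing builtin by probing it inside an eval block. The sketch below assumes that an unimplemented function raises a run-time error (perl's usual diagnostic is "The %s function is unimplemented"); whether the Plan 9 port behaves this way for every function listed above has not been checked here. The helper name is_usable is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run a code ref and report whether it completed
# without dying.
sub is_usable {
    my ($probe) = @_;
    my $ok = eval { $probe->(); 1 };
    return $ok ? 1 : 0;
}

# Calling umask with no arguments only reads the current mask.
if ( is_usable( sub { my $mask = umask; } ) ) {
    print "umask is available\n";
}
else {
    print "umask is not available: $@";
}
```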

Signals in Plan 9 Perl

For compatibility with perl scripts written for the Unix environment, Plan 9 Perl uses the POSIX signal emulation provided in Plan 9's ANSI POSIX Environment (APE). Signal stacking isn't supported. The signals provided are:

  1. SIGHUP, SIGINT, SIGQUIT, SIGILL, SIGABRT,
  2. SIGFPE, SIGKILL, SIGSEGV, SIGPIPE, SIGALRM,
  3. SIGTERM, SIGUSR1, SIGUSR2, SIGCHLD, SIGCONT,
  4. SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU
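
To see which signals a particular perl build knows about, the core Config module can be consulted. This is a minimal sketch; the helper name has_signal is hypothetical, and the result reflects whatever platform the script runs on, not Plan 9 specifically.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;

# $Config{sig_name} is a space-separated list of signal names known to this
# perl, without the "SIG" prefix. The pseudo-signal ZERO is skipped.
my @signals = grep { $_ ne 'ZERO' } split ' ', $Config{sig_name};

# Hypothetical helper: true if the named signal is known to this perl.
sub has_signal {
    my ($name) = @_;
    $name =~ s/^SIG//;
    my $count = grep { $_ eq $name } @signals;
    return $count > 0;
}

print "SIG$_\n" for @signals;
print "SIGUSR1 is ", ( has_signal('SIGUSR1') ? "" : "not " ), "available\n";
```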

COMPILING AND INSTALLING PERL ON PLAN 9

WELCOME to Plan 9 Perl, brave soul!

    This is a preliminary alpha version of Plan 9 Perl. Still to be
    implemented are MakeMaker and DynaLoader. Many perl commands are
    missing or currently behave in an inscrutable manner. These gaps will,
    with perseverance and a modicum of luck, be remedied in the near
    future.

To install this software:

1. Create the source directories and libraries for perl by running the plan9/setup.rc command (i.e., located in the plan9 subdirectory). Note: the setup routine assumes that you haven't dearchived these files into /sys/src/cmd/perl. After running setup.rc you may delete the copy of the source you originally detarred, as source code has now been installed in /sys/src/cmd/perl. If you plan on installing perl binaries for all architectures, run "setup.rc -a".

2. After making sure that you have adequate privileges to build system software, from /sys/src/cmd/perl/5.00301 (adjust version appropriately) run:

  1. mk install

If you wish to install perl versions for all architectures (68020, mips, sparc and 386) run:

  1. mk installall

3. Wait. The build process will take a *long* time because perl bootstraps itself. A 75MHz Pentium, 16MB RAM machine takes roughly 30 minutes to build the distribution from scratch.

Installing Perl Documentation on Plan 9

This perl distribution comes with a tremendous amount of documentation. To add these to the built-in manuals that come with Plan 9, from /sys/src/cmd/perl/5.00301 (adjust version appropriately) run:

  1. mk man

To begin your reading, start with:

  1. man perl

This is a good introduction and will direct you towards other man pages that may interest you.

(Note: "mk man" may produce some extraneous noise. Fear not.)

BUGS

"As many as there are grains of sand on all the beaches of the world . . ." - Carl Sagan

Revision date

This document was revised 09-October-1996 for Perl 5.003_7.

AUTHOR

Direct questions, comments, and the unlikely bug report (ahem) toward:

Luther Huffman, lutherh@stratcom.com, Strategic Computer Solutions, Inc.

 
perlpod

Perl 5 version 18.2 documentation

NAME

perlpod - the Plain Old Documentation format

DESCRIPTION

Pod is a simple-to-use markup language used for writing documentation for Perl, Perl programs, and Perl modules.

Translators are available for converting Pod to various formats like plain text, HTML, man pages, and more.

Pod markup consists of three basic kinds of paragraphs: ordinary, verbatim, and command.

Ordinary Paragraph

Most paragraphs in your documentation will be ordinary blocks of text, like this one. You can simply type in your text without any markup whatsoever, and with just a blank line before and after. When it gets formatted, it will undergo minimal formatting, like being rewrapped, probably put into a proportionally spaced font, and maybe even justified.

You can use formatting codes in ordinary paragraphs, for bold, italic, code-style , hyperlinks, and more. Such codes are explained in the "Formatting Codes" section, below.

Verbatim Paragraph

Verbatim paragraphs are usually used for presenting a codeblock or other text which does not require any special parsing or formatting, and which shouldn't be wrapped.

A verbatim paragraph is distinguished by having its first character be a space or a tab. (And commonly, all its lines begin with spaces and/or tabs.) It should be reproduced exactly, with tabs assumed to be on 8-column boundaries. There are no special formatting codes, so you can't italicize or anything like that. A \ means \, and nothing else.

Command Paragraph

A command paragraph is used for special treatment of whole chunks of text, usually as headings or parts of lists.

All command paragraphs (which are typically only one line long) start with "=", followed by an identifier, followed by arbitrary text that the command can use however it pleases. Currently recognized commands are

  1. =pod
  2. =head1 Heading Text
  3. =head2 Heading Text
  4. =head3 Heading Text
  5. =head4 Heading Text
  6. =over indentlevel
  7. =item stuff
  8. =back
  9. =begin format
  10. =end format
  11. =for format text...
  12. =encoding type
  13. =cut

To explain them each in detail:

  • =head1 Heading Text
  • =head2 Heading Text
  • =head3 Heading Text
  • =head4 Heading Text

    Head1 through head4 produce headings, head1 being the highest level. The text in the rest of this paragraph is the content of the heading. For example:

    1. =head2 Object Attributes

    The text "Object Attributes" comprises the heading there. The text in these heading commands can use formatting codes, as seen here:

    1. =head2 Possible Values for C<$/>

    Such commands are explained in the "Formatting Codes" section, below.

  • =over indentlevel
  • =item stuff...
  • =back

    Item, over, and back require a little more explanation: "=over" starts a region specifically for the generation of a list using "=item" commands, or for indenting (groups of) normal paragraphs. At the end of your list, use "=back" to end it. The indentlevel option to "=over" indicates how far over to indent, generally in ems (where one em is the width of an "M" in the document's base font) or roughly comparable units; if there is no indentlevel option, it defaults to four. (And some formatters may just ignore whatever indentlevel you provide.) In the stuff in =item stuff..., you may use formatting codes, as seen here:

    1. =item Using C<$|> to Control Buffering

    Such commands are explained in the "Formatting Codes" section, below.

    Note also that there are some basic rules to using "=over" ... "=back" regions:

    • Don't use "=item"s outside of an "=over" ... "=back" region.

    • The first thing after the "=over" command should be an "=item", unless there aren't going to be any items at all in this "=over" ... "=back" region.

    • Don't put "=headn" commands inside an "=over" ... "=back" region.

    • And perhaps most importantly, keep the items consistent: either use "=item *" for all of them, to produce bullets; or use "=item 1.", "=item 2.", etc., to produce numbered lists; or use "=item foo", "=item bar", etc.--namely, things that look nothing like bullets or numbers.

      If you start with bullets or numbers, stick with them, as formatters use the first "=item" type to decide how to format the list.

  • =cut

    To end a Pod block, use a blank line, then a line beginning with "=cut", and a blank line after it. This lets Perl (and the Pod formatter) know that this is where Perl code is resuming. (The blank line before the "=cut" is not technically necessary, but many older Pod processors require it.)

  • =pod

    The "=pod" command by itself doesn't do much of anything, but it signals to Perl (and Pod formatters) that a Pod block starts here. A Pod block starts with any command paragraph, so a "=pod" command is usually used just when you want to start a Pod block with an ordinary paragraph or a verbatim paragraph. For example:

        =item stuff()

        This function does stuff.

        =cut

        sub stuff {
          ...
        }

        =pod

        Remember to check its return value, as in:

          stuff() || die "Couldn't do stuff!";

        =cut

  • =begin formatname
  • =end formatname
  • =for formatname text...

    For, begin, and end will let you have regions of text/code/data that are not generally interpreted as normal Pod text, but are passed directly to particular formatters, or are otherwise special. A formatter that can use that format will use the region, otherwise it will be completely ignored.

    A command "=begin formatname", some paragraphs, and a command "=end formatname", mean that the text/data in between is meant for formatters that understand the special format called formatname. For example,

        =begin html

        <hr> <img src="thang.png">
        <p> This is a raw HTML paragraph </p>

        =end html

    The command "=for formatname text..." specifies that the remainder of just this paragraph (starting right after formatname) is in that special format.

    1. =for html <hr> <img src="thang.png">
    2. <p> This is a raw HTML paragraph </p>

    This means the same thing as the above "=begin html" ... "=end html" region.

    That is, with "=for", you can have only one paragraph's worth of text (i.e., the text in "=for formatname text..."), but with "=begin formatname" ... "=end formatname", you can have any amount of stuff in between. (Note that there still must be a blank line after the "=begin" command and a blank line before the "=end" command.)

    Here are some examples of how to use these:

        =begin html

        <br>Figure 1.<br><IMG SRC="figure1.png"><br>

        =end html

        =begin text

          ---------------
          |  foo        |
          |        bar  |
          ---------------

        ^^^^ Figure 1. ^^^^

        =end text

    Some format names that formatters currently are known to accept include "roff", "man", "latex", "tex", "text", and "html". (Some formatters will treat some of these as synonyms.)

    A format name of "comment" is common for just making notes (presumably to yourself) that won't appear in any formatted version of the Pod document:

    1. =for comment
    2. Make sure that all the available options are documented!

    Some formatnames will require a leading colon (as in "=for :formatname" , or "=begin :formatname" ... "=end :formatname" ), to signal that the text is not raw data, but instead is Pod text (i.e., possibly containing formatting codes) that's just not for normal formatting (e.g., may not be a normal-use paragraph, but might be for formatting as a footnote).

  • =encoding encodingname

    This command is used for declaring the encoding of a document. Most users won't need this; but if your encoding isn't US-ASCII or Latin-1, then put a =encoding encodingname command early in the document so that pod formatters will know how to decode the document. For encodingname, use a name recognized by the Encode::Supported module. Examples:

    1. =encoding utf8
    2. =encoding koi8-r
    3. =encoding ShiftJIS
    4. =encoding big5

=encoding affects the whole document, and must occur only once.

And don't forget, when using any other command, that the command lasts up until the end of its paragraph, not its line. So in the examples below, you can see that every command needs the blank line after it, to end its paragraph.

Some examples of lists include:

    =over

    =item *

    First item

    =item *

    Second item

    =back

    =over

    =item Foo()

    Description of Foo function

    =item Bar()

    Description of Bar function

    =back

Formatting Codes

In ordinary paragraphs and in some command paragraphs, various formatting codes (a.k.a. "interior sequences") can be used:

  • I<text> -- italic text

    Used for emphasis ("be I<careful!> ") and parameters ("redo I<LABEL> ")

  • B<text> -- bold text

    Used for switches ("perl's B<-n> switch "), programs ("some systems provide a B<chfn> for that "), emphasis ("be B<careful!> "), and so on ("and that feature is known as B<autovivification> ").

  • C<code> -- code text

    Renders code in a typewriter font, or gives some other indication that this represents program text ("C<gmtime($^T)> ") or some other form of computerese ("C<drwxr-xr-x> ").

  • L<name> -- a hyperlink

    There are various syntaxes, listed below. In the syntaxes given, text , name , and section cannot contain the characters '/' and '|'; and any '<' or '>' should be matched.

    • L<name>

      Link to a Perl manual page (e.g., L<Net::Ping> ). Note that name should not contain spaces. This syntax is also occasionally used for references to Unix man pages, as in L<crontab(5)> .

    • L<name/"sec"> or L<name/sec>

      Link to a section in other manual page. E.g., L<perlsyn/"For Loops">

    • L</"sec"> or L</sec>

      Link to a section in this manual page. E.g., L</"Object Methods">

    A section is started by the named heading or item. For example, L<perlvar/$.> or L<perlvar/"$."> both link to the section started by "=item $. " in perlvar. And L<perlsyn/For Loops> or L<perlsyn/"For Loops"> both link to the section started by "=head2 For Loops " in perlsyn.

    To control what text is used for display, you use "L<text|...>", as in:

    • L<text|name>

      Link this text to that manual page. E.g., L<Perl Error Messages|perldiag>

    • L<text|name/"sec"> or L<text|name/sec>

      Link this text to that section in that manual page. E.g., L<postfix "if"|perlsyn/"Statement Modifiers">

    • L<text|/"sec"> or L<text|/sec> or L<text|"sec">

      Link this text to that section in this manual page. E.g., L<the various attributes|/"Member Data">

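    Or you can link to a web page:

    • L<scheme:...> or L<text|scheme:...>

      Links to an absolute URL. For example, L<http://www.perl.org/> or L<The Perl Home Page|http://www.perl.org/>.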

  • E<escape> -- a character escape

    Very similar to HTML/XML &foo; "entity references":

    • E<lt> -- a literal < (less than)

    • E<gt> -- a literal > (greater than)

    • E<verbar> -- a literal | (vertical bar)

    • E<sol> -- a literal / (solidus)

      The above four are optional except in other formatting codes, notably L<...> , and when preceded by a capital letter.

    • E<htmlname>

      Some non-numeric HTML entity name, such as E<eacute> , meaning the same thing as &eacute; in HTML -- i.e., a lowercase e with an acute (/-shaped) accent.

    • E<number>

      The ASCII/Latin-1/Unicode character with that number. A leading "0x" means that number is hex, as in E<0x201E> . A leading "0" means that number is octal, as in E<075> . Otherwise number is interpreted as being in decimal, as in E<181> .

      Note that older Pod formatters might not recognize octal or hex numeric escapes, and that many formatters cannot reliably render characters above 255. (Some formatters may even have to use compromised renderings of Latin-1 characters, like rendering E<eacute> as just a plain "e".)

  • F<filename> -- used for filenames

    Typically displayed in italics. Example: "F<.cshrc> "

  • S<text> -- text contains non-breaking spaces

    This means that the words in text should not be broken across lines. Example: S<$x ? $y : $z> .

  • X<topic name> -- an index entry

    This is ignored by most formatters, but some may use it for building indexes. It always renders as empty-string. Example: X<absolutizing relative URLs>

  • Z<> -- a null (zero-effect) formatting code

    This is rarely used. It's one way to get around using an E<...> code sometimes. For example, instead of "NE<lt>3" (for "N<3") you could write "NZ<><3 " (the "Z<>" breaks up the "N" and the "<" so they can't be considered the part of a (fictitious) "N<...>" code).
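
The rules given above for E<number> (a leading "0x" means hex, a leading "0" means octal, anything else is decimal) can be sketched as a small Perl function. The name e_number_to_codepoint is hypothetical, and real formatters may be stricter or more lenient about what they accept.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map the content of an E<number> code to a code point,
# or return undef if the content is not a numeric escape.
sub e_number_to_codepoint {
    my ($number) = @_;
    return hex($number) if $number =~ /\A0x[0-9A-Fa-f]+\z/;    # E<0x201E>
    return oct($number) if $number =~ /\A0[0-7]+\z/;           # E<075>
    return $number + 0  if $number =~ /\A[0-9]+\z/;            # E<181>
    return undef;    # a name such as "eacute", not handled here
}

for my $content (qw(0x201E 075 181)) {
    printf "E<%s> is code point %d\n", $content, e_number_to_codepoint($content);
}
```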

Most of the time, you will need only a single set of angle brackets to delimit the beginning and end of formatting codes. However, sometimes you will want to put a real right angle bracket (a greater-than sign, '>') inside of a formatting code. This is particularly common when using a formatting code to provide a different font-type for a snippet of code. As with all things in Perl, there is more than one way to do it. One way is to simply escape the closing bracket using an E code:

  1. C<$a E<lt>=E<gt> $b>

This will produce: "$a <=> $b "

A more readable, and perhaps more "plain" way is to use an alternate set of delimiters that doesn't require a single ">" to be escaped. Doubled angle brackets ("<<" and ">>") may be used if and only if there is whitespace right after the opening delimiter and whitespace right before the closing delimiter! For example, the following will do the trick:

  1. C<< $a <=> $b >>

In fact, you can use as many repeated angle-brackets as you like so long as you have the same number of them in the opening and closing delimiters, and make sure that whitespace immediately follows the last '<' of the opening delimiter, and immediately precedes the first '>' of the closing delimiter. (The whitespace is ignored.) So the following will also work:

  1. C<<< $a <=> $b >>>
  2. C<<<< $a <=> $b >>>>

And they all mean exactly the same as this:

  1. C<$a E<lt>=E<gt> $b>

The multiple-bracket form does not affect the interpretation of the contents of the formatting code, only how it must end. That means that the examples above are also exactly the same as this:

  1. C<< $a E<lt>=E<gt> $b >>

As a further example, this means that if you wanted to put these bits of code in C (code) style:

  1. open(X, ">>thing.dat") || die $!
  2. $foo->bar();

you could do it like so:

  1. C<<< open(X, ">>thing.dat") || die $! >>>
  2. C<< $foo->bar(); >>

which is presumably easier to read than the old way:

  1. C<open(X, "E<gt>E<gt>thing.dat") || die $!>
  2. C<$foo-E<gt>bar();>

This is currently supported by pod2text (Pod::Text), pod2man (Pod::Man), and any other pod2xxx or Pod::Xxxx translators that use Pod::Parser 1.093 or later, or Pod::Tree 1.02 or later.
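
One way to convince yourself that the escaped and doubled-bracket forms are equivalent is to run both through a formatter. The sketch below uses Pod::Text and assumes a version built on Pod::Simple, which provides the output_string and parse_string_document methods. The helper name render_pod is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Pod::Text;

# Pod source built line by line; empty strings are the blank lines that
# separate paragraphs.
my $pod = join "\n",
    '=pod',
    '',
    'Escaped: C<$a E<lt>=E<gt> $b>',
    '',
    'Doubled: C<< $a <=> $b >>',
    '',
    '=cut',
    '';

# Hypothetical helper: format a Pod string as plain text and return it.
sub render_pod {
    my ($source) = @_;
    my $parser = Pod::Text->new;
    my $text   = '';
    $parser->output_string( \$text );
    $parser->parse_string_document($source);
    return $text;
}

print render_pod($pod);
```

Both paragraphs should come out with the same code text, however the formatter chooses to mark it up.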

The Intent

The intent is simplicity of use, not power of expression. Paragraphs look like paragraphs (block format), so that they stand out visually, and so that I could run them through fmt easily to reformat them (that's F7 in my version of vi, or Esc Q in my version of emacs). I wanted the translator to always leave the ' and ` and " quotes alone, in verbatim mode, so I could slurp in a working program, shift it over four spaces, and have it print out, er, verbatim. And presumably in a monospace font.

The Pod format is not necessarily sufficient for writing a book. Pod is just meant to be an idiot-proof common source for nroff, HTML, TeX, and other markup languages, as used for online documentation. Translators exist for pod2text, pod2html, pod2man (that's for nroff(1) and troff(1)), pod2latex, and pod2fm. Various others are available in CPAN.

Embedding Pods in Perl Modules

You can embed Pod documentation in your Perl modules and scripts. Start your documentation with an empty line, a "=head1" command at the beginning, and end it with a "=cut" command and an empty line. Perl will ignore the Pod text. See any of the supplied library modules for examples. If you're going to put your Pod at the end of the file, and you're using an __END__ or __DATA__ cut mark, make sure to put an empty line there before the first Pod command.

    __END__

    =head1 NAME

    Time::Local - efficiently compute time from local and GMT time

Without that empty line before the "=head1", many translators wouldn't have recognized the "=head1" as starting a Pod block.
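
Pod does not have to live after __END__; it can also sit between subroutines. Here is a minimal sketch of a script with its documentation interleaved. The script name greet.pl and the function greet are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

=head1 NAME

greet.pl - hypothetical script showing Pod mixed with code

=head1 FUNCTIONS

=over

=item greet($name)

Returns a greeting for C<$name>.

=back

=cut

# Perl resumes compiling code after the =cut paragraph above.
sub greet {
    my ($name) = @_;
    return "Hello, $name!";
}

print greet("world"), "\n";
```

Perl skips everything from the first "=head1" to the "=cut", while perldoc and the pod2xxx translators see only that part.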

Hints for Writing Pod

  • The podchecker command is provided for checking Pod syntax for errors and warnings. For example, it checks for completely blank lines in Pod blocks and for unknown commands and formatting codes. You should still also pass your document through one or more translators and proofread the result, or print out the result and proofread that. Some of the problems found may be bugs in the translators, which you may or may not wish to work around.

  • If you're more familiar with writing in HTML than with writing in Pod, you can try your hand at writing documentation in simple HTML, converting it to Pod with the experimental Pod::HTML2Pod module (available in CPAN), and looking at the resulting code. The experimental Pod::PXML module in CPAN might also be useful.

  • Many older Pod translators require the lines before every Pod command and after every Pod command (including "=cut"!) to be a blank line. Having something like this:

        # - - - - - - - - - - - -
        =item $firecracker->boom()

        This noisily detonates the firecracker object.
        =cut
        sub boom {
        ...

    ...will make such Pod translators completely fail to see the Pod block at all.

    Instead, have it like this:

        # - - - - - - - - - - - -

        =item $firecracker->boom()

        This noisily detonates the firecracker object.

        =cut

        sub boom {
        ...

  • Some older Pod translators require paragraphs (including command paragraphs like "=head2 Functions") to be separated by completely empty lines. If you have an apparently empty line with some spaces on it, this might not count as a separator for those translators, and that could cause odd formatting.

  • Older translators might add wording around an L<> link, so that L<Foo::Bar> may become "the Foo::Bar manpage", for example. So you shouldn't write things like the L<foo> documentation, if you want the translated document to read sensibly. Instead, write the L<Foo::Bar|Foo::Bar> documentation or L<the Foo::Bar documentation|Foo::Bar> , to control how the link comes out.

  • Lines that go past the 70th column in a verbatim block might be ungracefully wrapped by some formatters.
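
The check that the podchecker command performs can also be run from inside a program through the Pod::Checker module. The following sketch writes a Pod string to a temporary file and counts the errors reported. The helper name count_pod_errors is hypothetical, and the exact diagnostics vary between Pod::Checker versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Pod::Checker;
use File::Temp qw(tempfile);

# Hypothetical helper: check a Pod string and return the number of errors
# found (Pod::Checker reports -1 when the input contains no Pod at all).
sub count_pod_errors {
    my ($pod) = @_;

    my ( $fh, $path ) = tempfile( UNLINK => 1 );
    print {$fh} $pod;
    close $fh or die "Cannot close $path: $!";

    # Collect the diagnostics in memory instead of printing them.
    open my $report, '>', \my $messages
        or die "Cannot open in-memory file: $!";
    my $checker = Pod::Checker->new( -warnings => 0 );
    $checker->parse_from_file( $path, $report );
    close $report;

    return $checker->num_errors;
}

my $good = "=head1 NAME\n\nfoo - does foo\n\n=cut\n";
my $bad  = "=head1 NAME\n\nfoo - does foo\n\n=heda1 OOPS\n\ntext\n\n=cut\n";

print "good document: ", count_pod_errors($good), " error(s)\n";
print "bad document:  ", count_pod_errors($bad),  " error(s)\n";
```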

SEE ALSO

perlpodspec, PODs: Embedded Documentation in perlsyn, perlnewmod, perldoc, pod2html, pod2man, podchecker.

AUTHOR

Larry Wall, Sean M. Burke

 
perlpodspec

Perl 5 version 18.2 documentation

NAME

perlpodspec - Plain Old Documentation: format specification and notes

DESCRIPTION

This document is detailed notes on the Pod markup language. Most people will only have to read perlpod to know how to write in Pod, but this document may answer some incidental questions to do with parsing and rendering Pod.

In this document, "must" / "must not", "should" / "should not", and "may" have their conventional (cf. RFC 2119) meanings: "X must do Y" means that if X doesn't do Y, it's against this specification, and should really be fixed. "X should do Y" means that it's recommended, but X may fail to do Y, if there's a good reason. "X may do Y" is merely a note that X can do Y at will (although it is up to the reader to detect any connotation of "and I think it would be nice if X did Y" versus "it wouldn't really bother me if X did Y").

Notably, when I say "the parser should do Y", the parser may fail to do Y, if the calling application explicitly requests that the parser not do Y. I often phrase this as "the parser should, by default, do Y." This doesn't require the parser to provide an option for turning off whatever feature Y is (like expanding tabs in verbatim paragraphs), although it implies that such an option may be provided.

Pod Definitions

Pod is embedded in files, typically Perl source files, although you can write a file that's nothing but Pod.

A line in a file consists of zero or more non-newline characters, terminated by either a newline or the end of the file.

A newline sequence is usually a platform-dependent concept, but Pod parsers should understand it to mean any of CR (ASCII 13), LF (ASCII 10), or a CRLF (ASCII 13 followed immediately by ASCII 10), in addition to any other system-specific meaning. The first CR/CRLF/LF sequence in the file may be used as the basis for identifying the newline sequence for parsing the rest of the file.

A blank line is a line consisting entirely of zero or more spaces (ASCII 32) or tabs (ASCII 9), and terminated by a newline or end-of-file. A non-blank line is a line containing one or more characters other than space or tab (and terminated by a newline or end-of-file).

(Note: Many older Pod parsers did not accept a line consisting of spaces/tabs and then a newline as a blank line. The only lines they considered blank were lines consisting of no characters at all, terminated by a newline.)

Whitespace is used in this document as a blanket term for spaces, tabs, and newline sequences. (By itself, this term usually refers to literal whitespace. That is, sequences of whitespace characters in Pod source, as opposed to "E<32>", which is a formatting code that denotes a whitespace character.)

A Pod parser is a module meant for parsing Pod (regardless of whether this involves calling callbacks or building a parse tree or directly formatting it). A Pod formatter (or Pod translator) is a module or program that converts Pod to some other format (HTML, plaintext, TeX, PostScript, RTF). A Pod processor might be a formatter or translator, or might be a program that does something else with the Pod (like counting words, scanning for index points, etc.).

Pod content is contained in Pod blocks. A Pod block starts with a line that matches m/\A=[a-zA-Z]/, and continues up to the next line that matches m/\A=cut/ or up to the end of the file if there is no m/\A=cut/ line.
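A rough sketch of this rule, assuming the file has already been split into lines: the hypothetical pod_blocks routine below collects each block as a list of lines. Leaving the "=cut" line out of the block, and silently skipping a stray "=cut" outside any block, are choices made for this illustration only.

```perl
use strict;
use warnings;

# Given the lines of a file, return a list of Pod blocks, each an array
# reference holding the lines of one block (the "=cut" line is not included).
sub pod_blocks {
    my (@lines) = @_;
    my (@blocks, $current);
    for my $line (@lines) {
        if (defined $current) {
            if ($line =~ /\A=cut/) {        # end of the open block
                push @blocks, $current;
                undef $current;
            }
            else {
                push @$current, $line;
            }
        }
        elsif ($line =~ /\A=cut/) {
            next;   # stray "=cut" outside a block; a real parser should warn
        }
        elsif ($line =~ /\A=[a-zA-Z]/) {    # start of a new block
            $current = [$line];
        }
    }
    push @blocks, $current if defined $current;   # block ran to end of file
    return @blocks;
}
```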

Within a Pod block, there are Pod paragraphs. A Pod paragraph consists of non-blank lines of text, separated by one or more blank lines.
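As an illustration only, the lines of a Pod block could be grouped into paragraphs as follows. The name paragraphs is hypothetical; each returned paragraph is its lines joined with "\n", and the separating blank lines are dropped.

```perl
use strict;
use warnings;

# Group the lines of a Pod block into paragraphs. Blank lines (only spaces
# and tabs) separate paragraphs and are not part of any paragraph.
sub paragraphs {
    my (@lines) = @_;
    my (@paras, @current);
    for my $line (@lines) {
        if ($line =~ /\A[ \t]*\z/) {
            push @paras, join("\n", @current) if @current;
            @current = ();
        }
        else {
            push @current, $line;
        }
    }
    push @paras, join("\n", @current) if @current;
    return @paras;
}
```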

For purposes of Pod processing, there are four types of paragraphs in a Pod block:

  • A command paragraph (also called a "directive"). The first line of this paragraph must match m/\A=[a-zA-Z]/. Command paragraphs are typically one line, as in:

    =head1 NOTES

    =item *

    But they may span several (non-blank) lines:

    =for comment
    Hm, I wonder what it would look like if
    you tried to write a BNF for Pod from this.

    =head3 Dr. Strangelove, or: How I Learned to
    Stop Worrying and Love the Bomb

    Some command paragraphs allow formatting codes in their content (i.e., after the part that matches m/\A=[a-zA-Z]\S*\s*/), as in:

    =head1 Did You Remember to C<use strict;>?

    In other words, the Pod processing handler for "head1" will apply the same processing to "Did You Remember to C<use strict;>?" that it would to an ordinary paragraph (i.e., formatting codes like "C<...>" are parsed and presumably formatted appropriately, and whitespace in the form of literal spaces and/or tabs is not significant).

  • A verbatim paragraph. The first line of this paragraph must begin with a literal space or tab, and this paragraph must not be inside a "=begin identifier", ... "=end identifier" sequence unless "identifier" begins with a colon (":"). That is, if a paragraph starts with a literal space or tab, but is inside a "=begin identifier", ... "=end identifier" region, then it's a data paragraph, unless "identifier" begins with a colon.

    Whitespace is significant in verbatim paragraphs (although, in processing, tabs are probably expanded).

  • An ordinary paragraph. A paragraph is an ordinary paragraph if its first line matches neither m/\A=[a-zA-Z]/ nor m/\A[ \t]/, and if it's not inside a "=begin identifier", ... "=end identifier" sequence unless "identifier" begins with a colon (":").

  • A data paragraph. This is a paragraph that is inside a "=begin identifier" ... "=end identifier" sequence where "identifier" does not begin with a literal colon (":"). In some sense, a data paragraph is not part of Pod at all (i.e., effectively it's "out-of-band"), since it's not subject to most kinds of Pod parsing; but it is specified here, since Pod parsers need to be able to call an event for it, or store it in some form in a parse tree, or at least just parse around it.

For example: consider the following paragraphs:

  # <- that's the 0th column
  =head1 Foo

  Stuff

    $foo->bar

  =cut

Here, "=head1 Foo" and "=cut" are command paragraphs because the first line of each matches m/\A=[a-zA-Z]/. "[space][space]$foo->bar" is a verbatim paragraph, because its first line starts with a literal whitespace character (and there's no "=begin"..."=end" region around).

The "=begin identifier" ... "=end identifier" commands stop paragraphs that they surround from being parsed as ordinary or verbatim paragraphs, if identifier doesn't begin with a colon. This is discussed in detail in the section About Data Paragraphs and =begin/=end Regions.
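The four paragraph types described above can be told apart with a small classifier. The sketch below is illustrative rather than normative: the name classify is hypothetical, it assumes that the innermost open "=begin" region decides whether a paragraph is data, and it does no error checking on mismatched "=end" commands.

```perl
use strict;
use warnings;

# Classify each paragraph as 'command', 'verbatim', 'ordinary', or 'data'.
# @stack tracks open =begin regions; a region whose identifier does not
# start with a colon turns non-command paragraphs into data paragraphs.
sub classify {
    my (@paras) = @_;
    my (@stack, @types);
    for my $para (@paras) {
        my $type;
        if ($para =~ /\A=[a-zA-Z]/) {
            $type = 'command';
            if    ($para =~ /\A=begin\s+(\S+)/) { push @stack, $1 }
            elsif ($para =~ /\A=end\b/)         { pop @stack }
        }
        elsif (@stack && $stack[-1] !~ /\A:/)   { $type = 'data' }
        elsif ($para =~ /\A[ \t]/)              { $type = 'verbatim' }
        else                                    { $type = 'ordinary' }
        push @types, $type;
    }
    return @types;
}
```

Applied to the paragraphs of the example above, this reports "=head1 Foo" and "=cut" as command paragraphs, "Stuff" as an ordinary paragraph, and the indented "$foo->bar" as a verbatim paragraph.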

Pod Commands

This section is intended to supplement and clarify the discussion in Command Paragraph in perlpod. These are the currently recognized Pod commands:

  • "=head1", "=head2", "=head3", "=head4"

    This command indicates that the text in the remainder of the paragraph is a heading. That text may contain formatting codes. Examples:

    =head1 Object Attributes

    =head3 What B<Not> to Do!

  • "=pod"

    This command indicates that this paragraph begins a Pod block. (If we are already in the middle of a Pod block, this command has no effect at all.) If there is any text in this command paragraph after "=pod", it must be ignored. Examples:

    =pod

    This is a plain Pod paragraph.

    =pod This text is ignored.

  • "=cut"

    This command indicates that this line is the end of this previously started Pod block. If there is any text after "=cut" on the line, it must be ignored. Examples:

    =cut

    =cut The documentation ends here.

    =cut
    # This is the first line of program text.
    sub foo { # This is the second.

    It is an error to try to start a Pod block with a "=cut" command. In that case, the Pod processor must halt parsing of the input file, and must by default emit a warning.

  • "=over"

    This command indicates that this is the start of a list/indent region. If there is any text following the "=over", it must consist of only a nonzero positive numeral. The semantics of this numeral is explained in the About =over...=back Regions section, further below. Formatting codes are not expanded. Examples:

    =over 3

    =over 3.5

    =over

  • "=item"

    This command indicates that an item in a list begins here. Formatting codes are processed. The semantics of the (optional) text in the remainder of this paragraph are explained in the About =over...=back Regions section, further below. Examples:

    =item

    =item *

    =item *

    =item 14

    =item 3.

    =item C<< $thing->stuff(I<dodad>) >>

    =item For transporting us beyond seas to be tried for pretended
    offenses

    =item He is at this time transporting large armies of foreign
    mercenaries to complete the works of death, desolation and
    tyranny, already begun with circumstances of cruelty and perfidy
    scarcely paralleled in the most barbarous ages, and totally
    unworthy the head of a civilized nation.

  • "=back"

    This command indicates that this is the end of the region begun by the most recent "=over" command. It permits no text after the "=back" command.

  • "=begin formatname"
  • "=begin formatname parameter"

    This marks the following paragraphs (until the matching "=end formatname") as being for some special kind of processing. Unless "formatname" begins with a colon, the contained non-command paragraphs are data paragraphs. But if "formatname" does begin with a colon, then non-command paragraphs are ordinary paragraphs or verbatim paragraphs. This is discussed in detail in the section About Data Paragraphs and =begin/=end Regions.

    It is advised that formatnames match the regexp m/\A:?[-a-zA-Z0-9_]+\z/. Everything following whitespace after the formatname is a parameter that may be used by the formatter when dealing with this region. This parameter must not be repeated in the "=end" paragraph. Implementors should anticipate future expansion in the semantics and syntax of the first parameter to "=begin"/"=end"/"=for".

  • "=end formatname"

    This marks the end of the region opened by the matching "=begin formatname" region. If "formatname" is not the formatname of the most recent open "=begin formatname" region, then this is an error, and must generate an error message. This is discussed in detail in the section About Data Paragraphs and =begin/=end Regions.

  • "=for formatname text..."

    This is synonymous with:

    =begin formatname

    text...

    =end formatname

    That is, it creates a region consisting of a single paragraph; that paragraph is to be treated as an ordinary paragraph if "formatname" begins with a colon (":"); if "formatname" doesn't begin with a colon, then "text..." will constitute a data paragraph. There is no way to use "=for formatname text..." to express "text..." as a verbatim paragraph.

  • "=encoding encodingname"

    This command, which should occur early in the document (at least before any non-US-ASCII data!), declares that this document is encoded in the encoding encodingname, which must be an encoding name that Encode recognizes. (Encode's list of supported encodings, in Encode::Supported, is useful here.) If the Pod parser cannot decode the declared encoding, it should emit a warning and may abort parsing the document altogether.

    A document having more than one "=encoding" line should be considered an error. Pod processors may silently tolerate this if the not-first "=encoding" lines are just duplicates of the first one (e.g., if there's a "=encoding utf8" line, and later on another "=encoding utf8" line). But Pod processors should complain if there are contradictory "=encoding" lines in the same document (e.g., if there is a "=encoding utf8" early in the document and "=encoding big5" later). Pod processors that recognize BOMs may also complain if they see an "=encoding" line that contradicts the BOM (e.g., if a document with a UTF-16LE BOM has an "=encoding shiftjis" line).
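The equivalence between "=for" and a "=begin"/"=end" pair, described in the list above, can be sketched as a paragraph rewrite. The name expand_for is hypothetical; the routine takes one paragraph and returns either the three equivalent paragraphs or the original paragraph unchanged.

```perl
use strict;
use warnings;

# Rewrite a "=for formatname text..." paragraph as the three paragraphs
# "=begin formatname", "text...", "=end formatname". Any other paragraph
# is returned unchanged.
sub expand_for {
    my ($para) = @_;
    if ($para =~ /\A=for\s+(\S+)\s*(.*)\z/s) {
        my ($format, $text) = ($1, $2);
        return ("=begin $format", (length($text) ? $text : ()), "=end $format");
    }
    return ($para);
}
```

A parser may equally well keep "=for" paragraphs distinct and leave it to the formatter to treat the two forms alike.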

If a Pod processor sees any command other than the ones listed above (like "=head", or "=haed1", or "=stuff", or "=cuttlefish", or "=w123"), that processor must by default treat this as an error. It must not process the paragraph beginning with that command, must by default warn of this as an error, and may abort the parse. A Pod parser may allow a way for particular applications to add to the above list of known commands, and to stipulate, for each additional command, whether formatting codes should be processed.
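A sketch of such a check, assuming the command name is whatever follows "=" up to the first whitespace: the table and both function names below are hypothetical, and a real parser would also need a way for applications to extend the table.

```perl
use strict;
use warnings;

# The commands listed in this section.
my %KNOWN_COMMAND = map { $_ => 1 }
    qw(head1 head2 head3 head4 pod cut over item back begin end for encoding);

# Return the command name of a command paragraph, or undef if the paragraph
# is not a command paragraph at all.
sub command_name {
    my ($para) = @_;
    return $para =~ /\A=([a-zA-Z]\S*)/ ? $1 : undef;
}

# True if the paragraph is a command paragraph whose command is listed above.
sub is_known_command {
    my ($para) = @_;
    my $name = command_name($para);
    return defined $name && $KNOWN_COMMAND{$name} ? 1 : 0;
}
```

With this table, "=head1 NAME" is accepted, while "=head", "=haed1", and "=cuttlefish" are all reported as unknown.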

Future versions of this specification may add additional commands.

Pod Formatting Codes

(Note that in previous drafts of this document and of perlpod, formatting codes were referred to as "interior sequences", and this term may still be found in the documentation for Pod parsers, and in error messages from Pod processors.)

There are two syntaxes for formatting codes:

  • A formatting code starts with a capital letter (just US-ASCII [A-Z]) followed by a "<", any number of characters, and ending with the first matching ">". Examples:

    1. That's what I<you> think!
    2. What's C<dump()> for?
    3. X<C<chmod> and C<unlink()> Under Different Operating Systems>
  • A formatting code starts with a capital letter (just US-ASCII [A-Z]) followed by two or more "<"'s, one or more whitespace characters, any number of characters, one or more whitespace characters, and ending with the first matching sequence of two or more ">"'s, where the number of ">"'s equals the number of "<"'s in the opening of this formatting code. Examples:

    That's what I<< you >> think!
    C<<< open(X, ">>thing.dat") || die $! >>>
    B<< $foo->bar(); >>

    With this syntax, the whitespace character(s) after the "C<<" and before the ">>" (or whatever letter and number of angle brackets) are not renderable. They do not signify whitespace, but are merely part of the formatting codes themselves. That is, these are all synonymous:

    C<thing>
    C<< thing >>
    C<<     thing     >>
    C<<< thing >>>
    C<<<<
    thing
    >>>>

    and so on.

    Finally, the multiple-angle-bracket form does not alter the interpretation of nested formatting codes, meaning that the following four example lines are identical in meaning:

    B<example: C<$a E<lt>=E<gt> $b>>
    B<example: C<< $a <=> $b >>>
    B<example: C<< $a E<lt>=E<gt> $b >>>
    B<<< example: C<< $a E<lt>=E<gt> $b >> >>>

In parsing Pod, a notably tricky part is the correct parsing of (potentially nested!) formatting codes. Implementors should consult the code in the parse_text routine in Pod::Parser as an example of a correct implementation.
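To make the two syntaxes concrete, here is a small sketch of a parser for the formatting codes in one paragraph. It is not the Pod::Parser routine; the names parse_codes and dump_tree, and the tree representation, are hypothetical. It does not resolve E<...> codes or treat L<...> specially, and it simply closes any code still open at the end of the paragraph without warning.

```perl
use strict;
use warnings;

# Parse the formatting codes in one paragraph into a tree. Each node is
# either a plain string or an array reference [ $letter, @children ].
# Returns a reference to the list of top-level nodes.
sub parse_codes {
    my ($text) = @_;
    # Each frame holds the nodes collected so far and the pattern that
    # closes it; the root frame has no closer.
    my @stack = ({ letter => undef, close => undef, nodes => [] });
    pos($text) = 0;
    while (pos($text) < length($text)) {
        my $close = $stack[-1]{close};
        if (defined $close && $text =~ /\G$close/gc) {
            my $done = pop @stack;
            push @{ $stack[-1]{nodes} }, [ $done->{letter}, @{ $done->{nodes} } ];
        }
        elsif ($text =~ /\G([A-Z])(<<+)\s+/gc) {
            # Multiple-angle-bracket form: closed by whitespace followed by
            # the same number of ">" characters.
            my $n = length($2);
            push @stack, { letter => $1, close => qr/\s+>{$n}/, nodes => [] };
        }
        elsif ($text =~ /\G([A-Z])</gc) {
            # Single-bracket form: closed by the next ">" not claimed by a
            # nested code.
            push @stack, { letter => $1, close => qr/>/, nodes => [] };
        }
        else {
            # Ordinary character: append it to the current run of text.
            $text =~ /\G(.)/gcs;
            my $nodes = $stack[-1]{nodes};
            if (@$nodes && !ref($nodes->[-1])) { $nodes->[-1] .= $1 }
            else                               { push @$nodes, $1 }
        }
    }
    # Codes still open at the end of the paragraph are closed here.
    while (@stack > 1) {
        my $done = pop @stack;
        push @{ $stack[-1]{nodes} }, [ $done->{letter}, @{ $done->{nodes} } ];
    }
    return $stack[0]{nodes};
}

# Render a tree in a compact notation, letter(content), for comparison.
sub dump_tree {
    my ($nodes) = @_;
    my $out = '';
    for my $node (@$nodes) {
        if (ref($node)) {
            my ($letter, @kids) = @$node;
            $out .= "$letter(" . dump_tree(\@kids) . ")";
        }
        else {
            $out .= $node;
        }
    }
    return $out;
}
```

Under this sketch, "C<thing>" and "C<< thing >>" both dump as "C(thing)", and the three example lines above that spell the comparison operator with E<lt> and E<gt> produce identical trees.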

  • I<text> -- italic text

    See the brief discussion in Formatting Codes in perlpod.

  • B<text> -- bold text

    See the brief discussion in Formatting Codes in perlpod.

  • C<code> -- code text

    See the brief discussion in Formatting Codes in perlpod.

  • F<filename> -- style for filenames

    See the brief discussion in Formatting Codes in perlpod.

  • X<topic name> -- an index entry

    See the brief discussion in Formatting Codes in perlpod.

    This code is unusual in that most formatters completely discard this code and its content. Other formatters will render it with invisible codes that can be used in building an index of the current document.

  • Z<> -- a null (zero-effect) formatting code

    Discussed briefly in Formatting Codes in perlpod.

    This code is unusual in that it should have no content. That is, a processor may complain if it sees Z<potatoes>. Whether or not it complains, the potatoes text should be ignored.

  • L<name> -- a hyperlink

    The complicated syntaxes of this code are discussed at length in Formatting Codes in perlpod, and implementation details are discussed below, in About L<...> Codes. Parsing the contents of L<content> is tricky. Notably, the content has to be checked for whether it looks like a URL, or whether it has to be split on literal "|" and/or "/" (in the right order!), and so on, before E<...> codes are resolved.

  • E<escape> -- a character escape

    See Formatting Codes in perlpod, and several points in Notes on Implementing Pod Processors.

  • S<text> -- text contains non-breaking spaces

    This formatting code is syntactically simple, but semantically complex. What it means is that each space in the printable content of this code signifies a non-breaking space.

    Consider:

    C<$x ? $y : $z>
    S<C<$x ? $y : $z>>

    Both signify the monospace (c[ode] style) text consisting of "$x", one space, "?", one space, "$y", one space, ":", one space, "$z". The difference is that in the latter, with the S code, those spaces are not "normal" spaces, but instead are non-breaking spaces.
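For output formats that have a non-breaking space character, one simple way to honor S<...> is to substitute character 160 for each space in the printable content. This is a sketch of one option, not a required implementation; the name nonbreaking is hypothetical.

```perl
use strict;
use warnings;

# Replace each ordinary space (character 32) in the printable content of an
# S<...> code with a non-breaking space (character 160).
sub nonbreaking {
    my ($content) = @_;
    (my $out = $content) =~ tr/ /\x{A0}/;
    return $out;
}
```

Formats without such a character would instead wrap the content in whatever "don't break this across lines" construct they offer.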

If a Pod processor sees any formatting code other than the ones listed above (as in "N<...>", or "Q<...>", etc.), that processor must by default treat this as an error. A Pod parser may allow a way for particular applications to add to the above list of known formatting codes; a Pod parser might even allow a way to stipulate, for each additional formatting code, whether it requires some form of special processing, as L<...> does.

Future versions of this specification may add additional formatting codes.
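As an illustration of how the content of an E<...> code might be resolved, the sketch below handles a handful of names and numeric forms. The table is deliberately tiny and the name resolve_escape is hypothetical. The hexadecimal (leading "0x") and octal (leading "0") forms follow the description in perlpod; numbers are always taken as Unicode codepoints.

```perl
use strict;
use warnings;

# A deliberately small name table; a real parser would cover at least the
# Latin-1 entities (characters 160-255) as well.
my %NAME_TO_CODEPOINT = (
    lt => 60, gt => 62, sol => 47, verbar => 124,
    quot => 34, amp => 38, apos => 39,
    lchevron => 171, rchevron => 187, laquo => 171, raquo => 187,
    nbsp => 160, eacute => 233,
);

# Resolve the content of an E<...> code to a character. Returns undef if
# the content is syntactically invalid or is a name not in the table.
sub resolve_escape {
    my ($content) = @_;
    return undef unless $content =~ /\A\w+\z/;
    my $cp;
    if    ($content =~ /\A0[xX]([0-9a-fA-F]+)\z/) { $cp = hex($1) }
    elsif ($content =~ /\A0([0-7]+)\z/)           { $cp = oct($1) }
    elsif ($content =~ /\A[0-9]+\z/)              { $cp = $content + 0 }
    else                                          { $cp = $NAME_TO_CODEPOINT{$content} }
    return defined $cp ? chr($cp) : undef;
}
```

Here resolve_escape('eacute') and resolve_escape('233') give the same character, while an unknown name such as 'qacute' gives undef, leaving the caller to decide how to report it.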

Historical note: A few older Pod processors would not see a ">" as closing a "C<" code, if the ">" was immediately preceded by a "-". This was so that this:

  C<$foo->bar>

would parse as equivalent to this:

  C<$foo-E<gt>bar>

instead of as equivalent to a "C" formatting code containing only "$foo-", and then a "bar>" outside the "C" formatting code. This problem has since been solved by the addition of syntaxes like this:

  C<< $foo->bar >>

Compliant parsers must not treat "->" as special.

Formatting codes absolutely cannot span paragraphs. If a code is opened in one paragraph, and no closing code is found by the end of that paragraph, the Pod parser must close that formatting code, and should complain (as in "Unterminated I code in the paragraph starting at line 123: 'Time objects are not...'"). So these two paragraphs:

  I<I told you not to do this!

  Don't make me say it again!>

...must not be parsed as two paragraphs in italics (with the I code starting in one paragraph and ending in another). Instead, the first paragraph should generate a warning, but that aside, the above code must parse as if it were:

  I<I told you not to do this!>

  Don't make me say it again!E<gt>

(In SGMLish jargon, all Pod commands are like block-level elements, whereas all Pod formatting codes are like inline-level elements.)

Notes on Implementing Pod Processors

The following is a long section of miscellaneous requirements and suggestions to do with Pod processing.

  • Pod formatters should tolerate lines in verbatim blocks that are of any length, even if that means having to break them (possibly several times, for very long lines) to avoid text running off the side of the page. Pod formatters may warn of such line-breaking. Such warnings are particularly appropriate for lines that are over 100 characters long, which are usually not intentional.

  • Pod parsers must recognize all of the three well-known newline formats: CR, LF, and CRLF. See perlport.

  • Pod parsers should accept input lines that are of any length.

  • Since Perl recognizes a Unicode Byte Order Mark at the start of files as signaling that the file is Unicode encoded as in UTF-16 (whether big-endian or little-endian) or UTF-8, Pod parsers should do the same. Otherwise, the character encoding should be understood as being UTF-8 if the first highbit byte sequence in the file seems valid as a UTF-8 sequence, or otherwise as Latin-1.

    Future versions of this specification may specify how Pod can accept other encodings. Presumably treatment of other encodings in Pod parsing would be as in XML parsing: whatever the encoding declared by a particular Pod file, content is to be stored in memory as Unicode characters.

  • The well known Unicode Byte Order Marks are as follows: if the file begins with the two literal byte values 0xFE 0xFF, this is the BOM for big-endian UTF-16. If the file begins with the two literal byte values 0xFF 0xFE, this is the BOM for little-endian UTF-16. If the file begins with the three literal byte values 0xEF 0xBB 0xBF, this is the BOM for UTF-8.

  • A naive but sufficient heuristic for testing the first highbit byte-sequence in a BOM-less file (whether in code or in Pod!), to see whether that sequence is valid as UTF-8 (RFC 2279), is to check whether the first byte in the sequence is in the range 0xC0 - 0xFD and whether the next byte is in the range 0x80 - 0xBF. If so, the parser may conclude that this file is in UTF-8, and all highbit sequences in the file should be assumed to be UTF-8. Otherwise the parser should treat the file as being in Latin-1. In the unlikely circumstance that the first highbit sequence in a truly non-UTF-8 file happens to appear to be UTF-8, one can cater to our heuristic (as well as any more intelligent heuristic) by prefacing that line with a comment line containing a highbit sequence that is clearly not valid as UTF-8. A line consisting of simply "#", an e-acute, and any non-highbit byte, is sufficient to establish this file's encoding.
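The BOM rules and the first-highbit-sequence heuristic from the preceding items can be combined into one small routine. This is a sketch under stated assumptions: the name guess_encoding and the strings it returns are hypothetical, the input is taken to be the raw bytes of the file, and the 'US-ASCII' answer for a file with no highbit bytes at all is a choice made here, since in that case the UTF-8 and Latin-1 readings agree.

```perl
use strict;
use warnings;

# Guess the encoding of a file from its raw bytes, using the BOM rules and
# the naive first-highbit-sequence heuristic described above.
sub guess_encoding {
    my ($bytes) = @_;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    # Find the first byte with the high bit set, plus the byte after it.
    if ($bytes =~ /([\x80-\xFF])([\x00-\xFF])?/) {
        my ($first, $next) = ($1, $2);
        return 'UTF-8'
            if $first =~ /\A[\xC0-\xFD]\z/
            && defined $next
            && $next =~ /\A[\x80-\xBF]\z/;
        return 'Latin-1';
    }
    return 'US-ASCII';   # no highbit bytes; an assumption of this sketch
}
```

A file whose first highbit sequence is a lone e-acute byte followed by a space is therefore read as Latin-1, which is what the comment-line trick described above relies on.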

  • This document's requirements and suggestions about encodings do not apply to Pod processors running on non-ASCII platforms, notably EBCDIC platforms.

  • Pod processors must treat a "=for [label] [content...]" paragraph as meaning the same thing as a "=begin [label]" paragraph, content, and an "=end [label]" paragraph. (The parser may conflate these two constructs, or may leave them distinct, in the expectation that the formatter will nevertheless treat them the same.)

  • When rendering Pod to a format that allows comments (i.e., to nearly any format other than plaintext), a Pod formatter must insert comment text identifying its name and version number, and the name and version numbers of any modules it might be using to process the Pod. Minimal examples:

    %% POD::Pod2PS v3.14159, using POD::Parser v1.92
    <!-- Pod::HTML v3.14159, using POD::Parser v1.92 -->
    {\doccomm generated by Pod::Tree::RTF 3.14159 using Pod::Tree 1.08}
    .\" Pod::Man version 3.14159, using POD::Parser version 1.92

    Formatters may also insert additional comments, including: the release date of the Pod formatter program, the contact address for the author(s) of the formatter, the current time, the name of input file, the formatting options in effect, version of Perl used, etc.

    Formatters may also choose to note errors/warnings as comments, besides or instead of emitting them otherwise (as in messages to STDERR, or dying).

  • Pod parsers may emit warnings or error messages ("Unknown E code E<zslig>!") to STDERR (whether through printing to STDERR, or warning/carping, or dying/croaking), but must allow suppressing all such STDERR output, and instead allow an option for reporting errors/warnings in some other way, whether by triggering a callback, or noting errors in some attribute of the document object, or some similarly unobtrusive mechanism -- or even by appending a "Pod Errors" section to the end of the parsed form of the document.

  • In cases of exceptionally aberrant documents, Pod parsers may abort the parse. Even then, dying/croaking is to be avoided; where possible, the parser library may simply close the input file and add text like "*** Formatting Aborted ***" to the end of the (partial) in-memory document.

  • In paragraphs where formatting codes (like E<...>, B<...>) are understood (i.e., not verbatim paragraphs, but including ordinary paragraphs, and command paragraphs that produce renderable text, like "=head1"), literal whitespace should generally be considered "insignificant", in that one literal space has the same meaning as any (nonzero) number of literal spaces, literal newlines, and literal tabs (as long as this produces no blank lines, since those would terminate the paragraph). Pod parsers should compact literal whitespace in each processed paragraph, but may provide an option for overriding this (since some processing tasks do not require it), or may follow additional special rules (for example, specially treating period-space-space or period-newline sequences).

  • Pod parsers should not, by default, try to coerce apostrophe (') and quote (") into smart quotes (little 9's, 66's, 99's, etc), nor try to turn backtick (`) into anything else but a single backtick character (distinct from an open quote character!), nor "--" into anything but two minus signs. They must never do any of those things to text in C<...> formatting codes, and never ever to text in verbatim paragraphs.

  • When rendering Pod to a format that has two kinds of hyphens (-), one that's a non-breaking hyphen, and another that's a breakable hyphen (as in "object-oriented", which can be split across lines as "object-", newline, "oriented"), formatters are encouraged to generally translate "-" to non-breaking hyphen, but may apply heuristics to convert some of these to breaking hyphens.

  • Pod formatters should make reasonable efforts to keep words of Perl code from being broken across lines. For example, "Foo::Bar" in some formatting systems is seen as eligible for being broken across lines as "Foo::" newline "Bar" or even "Foo::-" newline "Bar". This should be avoided where possible, either by disabling all line-breaking in mid-word, or by wrapping particular words with internal punctuation in "don't break this across lines" codes (which in some formats may not be a single code, but might be a matter of inserting non-breaking zero-width spaces between every pair of characters in a word.)

  • Pod parsers should, by default, expand tabs in verbatim paragraphs as they are processed, before passing them to the formatter or other processor. Parsers may also allow an option for overriding this.

  • Pod parsers should, by default, remove newlines from the end of ordinary and verbatim paragraphs before passing them to the formatter. For example, while the paragraph you're reading now could be considered, in Pod source, to end with (and contain) the newline(s) that end it, it should be processed as ending with (and containing) the period character that ends this sentence.

  • Pod parsers, when reporting errors, should make some effort to report an approximate line number ("Nested E<>'s in Paragraph #52, near line 633 of Thing/Foo.pm!"), instead of merely noting the paragraph number ("Nested E<>'s in Paragraph #52 of Thing/Foo.pm!"). Where this is problematic, the paragraph number should at least be accompanied by an excerpt from the paragraph ("Nested E<>'s in Paragraph #52 of Thing/Foo.pm, which begins 'Read/write accessor for the C<interest rate> attribute...'").

  • Pod parsers, when processing a series of verbatim paragraphs one after another, should consider them to be one large verbatim paragraph that happens to contain blank lines. I.e., these two lines, which have a blank line between them:

    use Foo;

    print Foo->VERSION

    should be unified into one paragraph ("\tuse Foo;\n\n\tprint Foo->VERSION") before being passed to the formatter or other processor. Parsers may also allow an option for overriding this.

    While this might be too cumbersome to implement in event-based Pod parsers, it is straightforward for parsers that return parse trees.
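For a parser that already holds its paragraphs in a list, the merging step just described might look like the sketch below. The representation of a paragraph as a [ $type, $text ] pair and the name unify_verbatim are assumptions of this example.

```perl
use strict;
use warnings;

# Merge runs of adjacent verbatim paragraphs into single paragraphs joined
# by one blank line. Each input paragraph is a pair [ $type, $text ].
sub unify_verbatim {
    my (@paras) = @_;
    my @out;
    for my $para (@paras) {
        my ($type, $text) = @$para;
        if ($type eq 'verbatim' && @out && $out[-1][0] eq 'verbatim') {
            $out[-1][1] .= "\n\n" . $text;
        }
        else {
            push @out, [ $type, $text ];
        }
    }
    return @out;
}
```

Given the two verbatim paragraphs shown above, this yields the single paragraph "\tuse Foo;\n\n\tprint Foo->VERSION".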

  • Pod formatters, where feasible, are advised to avoid splitting short verbatim paragraphs (under twelve lines, say) across pages.

  • Pod parsers must treat a line with only spaces and/or tabs on it as a "blank line" such as separates paragraphs. (Some older parsers recognized only two adjacent newlines as a "blank line" but would not recognize a newline, a space, and a newline, as a blank line. This is noncompliant behavior.)

  • Authors of Pod formatters/processors should make every effort to avoid writing their own Pod parser. There are already several in CPAN, with a wide range of interface styles -- and one of them, Pod::Parser, comes with modern versions of Perl.

  • Characters in Pod documents may be conveyed either as literals, or by number in E<n> codes, or by an equivalent mnemonic, as in E<eacute> which is exactly equivalent to E<233>.

    Characters in the range 32-126 refer to those well known US-ASCII characters (also defined there by Unicode, with the same meaning), which all Pod formatters must render faithfully. Characters in the ranges 0-31 and 127-159 should not be used (neither as literals, nor as E<number> codes), except for the literal byte-sequences for newline (13, 13 10, or 10), and tab (9).

    Characters in the range 160-255 refer to Latin-1 characters (also defined there by Unicode, with the same meaning). Characters above 255 should be understood to refer to Unicode characters.

  • Be warned that some formatters cannot reliably render characters outside 32-126; and many are able to handle 32-126 and 160-255, but nothing above 255.

  • Besides the well-known "E<lt>" and "E<gt>" codes for less-than and greater-than, Pod parsers must understand "E<sol>" for "/" (solidus, slash), and "E<verbar>" for "|" (vertical bar, pipe). Pod parsers should also understand "E<lchevron>" and "E<rchevron>" as legacy codes for characters 171 and 187, i.e., "left-pointing double angle quotation mark" = "left pointing guillemet" and "right-pointing double angle quotation mark" = "right pointing guillemet". (These look like little "<<" and ">>", and they are now preferably expressed with the HTML/XHTML codes "E<laquo>" and "E<raquo>".)

  • Pod parsers should understand all "E<html>" codes as defined in the entity declarations in the most recent XHTML specification at www.W3.org. Pod parsers must understand at least the entities that define characters in the range 160-255 (Latin-1). Pod parsers, when faced with some unknown "E<identifier>" code, shouldn't simply replace it with the null string (by default, at least), but may pass it through as a string consisting of the literal characters E, less-than, identifier, greater-than. Or Pod parsers may offer the alternative option of processing such unknown "E<identifier>" codes by firing an event especially for such codes, or by adding a special node-type to the in-memory document tree. Such "E<identifier>" codes may have special meaning to some processors, or some processors may choose to add them to a special error report.

  • Pod parsers must also support the XHTML codes "E<quot>" for character 34 (doublequote, "), "E<amp>" for character 38 (ampersand, &), and "E<apos>" for character 39 (apostrophe, ').

  • Note that in all cases of "E<whatever>", whatever (whether an htmlname, or a number in any base) must consist only of alphanumeric characters -- that is, whatever must match m/\A\w+\z/. So "E< 0 1 2 3 >" is invalid, because it contains spaces, which aren't alphanumeric characters. This presumably does not need special treatment by a Pod processor; " 0 1 2 3 " doesn't look like a number in any base, so it would presumably be looked up in the table of HTML-like names. Since there isn't (and cannot be) an HTML-like entity called " 0 1 2 3 ", this will be treated as an error. However, Pod processors may treat "E< 0 1 2 3 >" or "E<e-acute>" as syntactically invalid, potentially earning a different error message than the error message (or warning, or event) generated by a merely unknown (but theoretically valid) htmlname, as in "E<qacute>" [sic]. However, Pod parsers are not required to make this distinction.

  • Note that E<number> must not be interpreted as simply "codepoint number in the current/native character set". It always means only "the character represented by codepoint number in Unicode." (This is identical to the semantics of &#number; in XML.)

    This will likely require many formatters to have tables mapping from treatable Unicode codepoints (such as the "\xE9" for the e-acute character) to the escape sequences or codes necessary for conveying such sequences in the target output format. A converter to *roff would, for example, know that "\xE9" (whether conveyed literally, or via an E<...> sequence) is to be conveyed as "e\\*'". Similarly, a program rendering Pod in a Mac OS application window would presumably need to know that "\xE9" maps to codepoint 142 in the MacRoman encoding that (at time of writing) is native for Mac OS. Such Unicode2whatever mappings are presumably already widely available for common output formats. (Such mappings may be incomplete! Implementers are not expected to bend over backwards in an attempt to render Cherokee syllabics, Etruscan runes, Byzantine musical symbols, or any of the other weird things that Unicode can encode.) And if a Pod document uses a character not found in such a mapping, the formatter should consider it an unrenderable character.

  • If, surprisingly, the implementor of a Pod formatter can't find a satisfactory pre-existing table mapping from Unicode characters to escapes in the target format (e.g., a decent table of Unicode characters to *roff escapes), it will be necessary to build such a table. If you are in this circumstance, you should begin with the characters in the range 0x00A0 - 0x00FF, which is mostly the heavily used accented characters. Then proceed (as patience permits and fastidiousness compels) through the characters that the (X)HTML standards groups judged important enough to merit mnemonics for. These are declared in the (X)HTML specifications at the www.W3.org site. At time of writing (September 2001), the most recent entity declaration files are:

    http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
    http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
    http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent

    Then you can progress through any remaining notable Unicode characters in the range 0x2000-0x204D (consult the character tables at www.unicode.org), and whatever else strikes your fancy. For example, in xhtml-symbol.ent, there is the entry:

    <!ENTITY infin "&#8734;"> <!-- infinity, U+221E ISOtech -->

    While the mapping "infin" to the character "\x{221E}" will (hopefully) have been already handled by the Pod parser, the presence of the character in this file means that it's reasonably important enough to include in a formatter's table that maps from notable Unicode characters to the codes necessary for rendering them. So for a Unicode-to-*roff mapping, for example, this would merit the entry:

    1. "\x{221E}" => '\(in',

    It is eagerly hoped that in the future, increasing numbers of formats (and formatters) will support Unicode characters directly (as (X)HTML does with &infin; , &#8734;, or &#x221E;), reducing the need for idiosyncratic mappings of Unicode-to-my_escapes.
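    As a rough illustration of such a table in use, here is a minimal sketch. The hash name %unicode_to_roff and the function to_roff are hypothetical, and the table holds only the two entries mentioned in the text; a real formatter's table would be far larger.

    ```perl
    use strict;
    use warnings;

    # Hypothetical, deliberately tiny Unicode-to-*roff table. The two
    # entries are the ones mentioned in the text; a real table is larger.
    my %unicode_to_roff = (
        "\x{E9}"   => "e\\*'",    # e-acute
        "\x{221E}" => '\(in',     # infinity
    );

    # Replace each non-ASCII character that has a table entry with its
    # *roff escape. Characters without an entry are left untouched so
    # the caller can treat them as unrenderable.
    sub to_roff {
        my ($text) = @_;
        $text =~ s/([^\x00-\x7F])/exists $unicode_to_roff{$1} ? $unicode_to_roff{$1} : $1/ge;
        return $text;
    }

    print to_roff("caf\x{E9} \x{221E}"), "\n";
    ```

    Leaving unmapped characters in place keeps the lookup separate from the policy for unrenderable characters, which a formatter may want to decide elsewhere.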

  • It is up to individual Pod formatters to display good judgement when confronted with an unrenderable character (which is distinct from an unknown E<thing> sequence that the parser couldn't resolve to anything, renderable or not). It is good practice to map Latin letters with diacritics (like "E<eacute>"/"E<233>") to the corresponding unaccented US-ASCII letters (like a simple character 101, "e"), but clearly this is often not feasible, and an unrenderable character may be represented as "?", or the like. In attempting a sane fallback (as from E<233> to "e"), Pod formatters may use the %Latin1Code_to_fallback table in Pod::Escapes, or Text::Unidecode, if available.

    For example, this Pod text:

    1. magic is enabled if you set C<$Currency> to 'E<euro>'.

    may be rendered as: "magic is enabled if you set $Currency to '?'" or as "magic is enabled if you set $Currency to '[euro]'", or as "magic is enabled if you set $Currency to '[x20AC]'", etc.

    A Pod formatter may also note, in a comment or warning, a list of what unrenderable characters were encountered.
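    One possible fallback policy can be sketched as follows. The %fallback hash and the function render_unrenderable are hypothetical names; the one-entry table merely stands in for a fuller source such as %Latin1Code_to_fallback in Pod::Escapes.

    ```perl
    use strict;
    use warnings;

    # Hypothetical stand-in for a real fallback table.
    my %fallback = ( "\x{E9}" => 'e' );

    # Render a character the output format cannot show: use an
    # unaccented fallback if one is known, otherwise a bracketed hex
    # code in the "[x20AC]" style mentioned in the text.
    sub render_unrenderable {
        my ($char) = @_;
        return $fallback{$char} if exists $fallback{$char};
        return sprintf '[x%04X]', ord $char;
    }

    print render_unrenderable("\x{E9}"),   "\n";   # e
    print render_unrenderable("\x{20AC}"), "\n";   # [x20AC]
    ```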

  • E<...> may freely appear in any formatting code (other than in another E<...> or in an Z<>). That is, "X<The E<euro>1,000,000 Solution>" is valid, as is "L<The E<euro>1,000,000 Solution|Million::Euros>".

  • Some Pod formatters output to formats that implement non-breaking spaces as an individual character (which I'll call "NBSP"), and others output to formats that implement non-breaking spaces just as spaces wrapped in a "don't break this across lines" code. Note that at the level of Pod, both sorts of codes can occur: Pod can contain a NBSP character (whether as a literal, or as a "E<160>" or "E<nbsp>" code); and Pod can contain "S<foo I<bar> baz>" codes, where "mere spaces" (character 32) in such codes are taken to represent non-breaking spaces. Pod parsers should consider supporting the optional parsing of "S<foo I<bar> baz>" as if it were "fooNBSPI<bar>NBSPbaz", and, going the other way, the optional parsing of groups of words joined by NBSP's as if each group were in a S<...> code, so that formatters may use the representation that maps best to what the output format demands.

  • Some processors may find that the S<...> code is easiest to implement by replacing each space in the parse tree under the content of the S, with an NBSP. But note: the replacement should apply not to spaces in all text, but only to spaces in printable text. (This distinction may or may not be evident in the particular tree/event model implemented by the Pod parser.) For example, consider this unusual case:

    1. S<L</Autoloaded Functions>>

    This means that the space in the middle of the visible link text must not be broken across lines. In other words, it's the same as this:

    1. L<"AutoloadedE<160>Functions"/Autoloaded Functions>

    However, a misapplied space-to-NBSP replacement could (wrongly) produce something equivalent to this:

    1. L<"AutoloadedE<160>Functions"/AutoloadedE<160>Functions>

    ...which is almost definitely not going to work as a hyperlink (assuming this formatter outputs a format supporting hypertext).

    Formatters may choose to just not support the S format code, especially in cases where the output format simply has no NBSP character/code and no code for "don't break this stuff across lines".
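    The distinction between printable text and link targets can be sketched like this. The hash layout (a "text" key for what the reader sees and a "section" key for the target) and the function apply_nbsp are assumptions made for illustration, not the data model of any particular Pod parser.

    ```perl
    use strict;
    use warnings;

    # Hypothetical parsed form of L</Autoloaded Functions>: "text" is
    # what the reader sees, "section" is the link target.
    my %link = (
        text    => 'Autoloaded Functions',
        section => 'Autoloaded Functions',
    );

    # Apply S<...> to a link: only the printable text gets NBSPs, the
    # target is copied unchanged so the link still resolves.
    sub apply_nbsp {
        my ($link) = @_;
        my %copy = %$link;
        $copy{text} =~ tr/ /\x{A0}/;
        return \%copy;
    }

    my $fixed = apply_nbsp(\%link);
    ```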

  • Besides the NBSP character discussed above, implementors are reminded of the existence of the other "special" character in Latin-1, the "soft hyphen" character, also known as "discretionary hyphen" (i.e., E<173> = E<0xAD> = E<shy>). This character expresses an optional hyphenation point. That is, it normally renders as nothing, but may render as a "-" if a formatter breaks the word at that point. Pod formatters should, as appropriate, do one of the following: 1) render this with a code with the same meaning (e.g., "\-" in RTF), 2) pass it through in the expectation that the formatter understands this character as such, or 3) delete it.

    For example:

    1. sigE<shy>action
    2. manuE<shy>script
    3. JarkE<shy>ko HieE<shy>taE<shy>nieE<shy>mi

    These signal to a formatter that if it is to hyphenate "sigaction" or "manuscript", then it should be done as "sig-[linebreak]action" or "manu-[linebreak]script" (and if it doesn't hyphenate it, then the E<shy> doesn't show up at all). And if it is to hyphenate "Jarkko" and/or "Hietaniemi", it can do so only at the points where there is a E<shy> code.

    In practice, it is anticipated that this character will not be used often, but formatters should either support it, or delete it.
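    Options 1 and 3 from the list above might look like the following sketch. Both function names are hypothetical.

    ```perl
    use strict;
    use warnings;

    # Option 3: delete every soft hyphen (character 0xAD).
    sub delete_shy {
        my ($text) = @_;
        $text =~ tr/\x{AD}//d;
        return $text;
    }

    # Option 1: replace each soft hyphen with the RTF code "\-".
    sub shy_to_rtf {
        my ($text) = @_;
        $text =~ s/\x{AD}/\\-/g;
        return $text;
    }

    print delete_shy("sig\x{AD}action"),  "\n";   # sigaction
    print shy_to_rtf("manu\x{AD}script"), "\n";   # manu\-script
    ```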

  • If you think that you want to add a new command to Pod (like, say, a "=biblio" command), consider whether you could get the same effect with a for or begin/end sequence: "=for biblio ..." or "=begin biblio" ... "=end biblio". Pod processors that don't understand "=for biblio", etc, will simply ignore it, whereas they may complain loudly if they see "=biblio".

  • Throughout this document, "Pod" has been the preferred spelling for the name of the documentation format. One may also use "POD" or "pod". For the documentation that is (typically) in the Pod format, you may use "pod", or "Pod", or "POD". Understanding these distinctions is useful; but obsessing over how to spell them, usually is not.

About L<...> Codes

As you can tell from a glance at perlpod, the L<...> code is the most complex of the Pod formatting codes. The points below will hopefully clarify what it means and how processors should deal with it.

  • In parsing an L<...> code, Pod parsers must distinguish at least four attributes:

    • First:

      The link-text. If there is none, this must be undef. (E.g., in "L<Perl Functions|perlfunc>", the link-text is "Perl Functions". In "L<Time::HiRes>" and even "L<|Time::HiRes>", there is no link text. Note that link text may contain formatting.)

    • Second:

      The possibly inferred link-text; i.e., if there was no real link text, then this is the text that we'll infer in its place. (E.g., for "L<Getopt::Std>", the inferred link text is "Getopt::Std".)

    • Third:

      The name or URL, or undef if none. (E.g., in "L<Perl Functions|perlfunc>", the name (also sometimes called the page) is "perlfunc". In "L</CAVEATS>", the name is undef.)

    • Fourth:

      The section (AKA "item" in older perlpods), or undef if none. E.g., in "L<Getopt::Std/DESCRIPTION>", "DESCRIPTION" is the section. (Note that this is not the same as a manpage section like the "5" in "man 5 crontab". "Section Foo" in the Pod sense means the part of the text that's introduced by the heading or item whose text is "Foo".)

    Pod parsers may also note additional attributes including:

    • Fifth:

      A flag for whether item 3 (if present) is a URL (like "http://lists.perl.org" is), in which case there should be no section attribute; a Pod name (like "perldoc" and "Getopt::Std" are); or possibly a man page name (like "crontab(5)" is).

    • Sixth:

      The raw original L<...> content, before text is split on "|", "/", etc, and before E<...> codes are expanded.

    (The above were numbered only for concise reference below. It is not a requirement that these be passed as an actual list or array.)

    For example:

    1. L<Foo::Bar>
    2. => undef, # link text
    3. "Foo::Bar", # possibly inferred link text
    4. "Foo::Bar", # name
    5. undef, # section
    6. 'pod', # what sort of link
    7. "Foo::Bar" # original content
    8. L<Perlport's section on NL's|perlport/Newlines>
    9. => "Perlport's section on NL's", # link text
    10. "Perlport's section on NL's", # possibly inferred link text
    11. "perlport", # name
    12. "Newlines", # section
    13. 'pod', # what sort of link
    14. "Perlport's section on NL's|perlport/Newlines" # orig. content
    15. L<perlport/Newlines>
    16. => undef, # link text
    17. '"Newlines" in perlport', # possibly inferred link text
    18. "perlport", # name
    19. "Newlines", # section
    20. 'pod', # what sort of link
    21. "perlport/Newlines" # original content
    22. L<crontab(5)/"DESCRIPTION">
    23. => undef, # link text
    24. '"DESCRIPTION" in crontab(5)', # possibly inferred link text
    25. "crontab(5)", # name
    26. "DESCRIPTION", # section
    27. 'man', # what sort of link
    28. 'crontab(5)/"DESCRIPTION"' # original content
    29. L</Object Attributes>
    30. => undef, # link text
    31. '"Object Attributes"', # possibly inferred link text
    32. undef, # name
    33. "Object Attributes", # section
    34. 'pod', # what sort of link
    35. "/Object Attributes" # original content
    36. L<http://www.perl.org/>
    37. => undef, # link text
    38. "http://www.perl.org/", # possibly inferred link text
    39. "http://www.perl.org/", # name
    40. undef, # section
    41. 'url', # what sort of link
    42. "http://www.perl.org/" # original content
    43. L<Perl.org|http://www.perl.org/>
    44. => "Perl.org", # link text
    45. "http://www.perl.org/", # possibly inferred link text
    46. "http://www.perl.org/", # name
    47. undef, # section
    48. 'url', # what sort of link
    49. "Perl.org|http://www.perl.org/" # original content

    Note that you can distinguish URL-links from anything else by the fact that they match m/\A\w+:[^:\s]\S*\z/. So L<http://www.perl.com> is a URL, but L<HTTP::Response> isn't.
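    The attributes in the examples above can be extracted roughly as follows. This is a simplified sketch, not a complete parser: the function parse_link and its hash keys are hypothetical, E<...> codes are not expanded, and a "|" or "/" hidden inside a nested formatting code is not handled. The URL test is the regular expression given in the text; treating a trailing parenthesised string as the mark of a man page is an assumption based on the "crontab(5)" example.

    ```perl
    use strict;
    use warnings;

    # Split raw L<...> content into link text, name, section and type.
    sub parse_link {
        my ($raw) = @_;
        my ($text, $target) =
            $raw =~ m/\A([^|]*)\|(.*)\z/s ? ($1, $2) : (undef, $raw);
        $text = undef if defined $text && $text eq '';

        my ($name, $section, $type);
        if ($target =~ m/\A\w+:[^:\s]\S*\z/) {    # the URL test from the text
            ($name, $type) = ($target, 'url');
        }
        else {
            ($name, $section) = split m{/}, $target, 2;
            $name = undef if defined $name && $name eq '';
            $section =~ s/\A"(.*)"\z/$1/s if defined $section;
            # Assumption: a trailing "(...)" marks a man page name.
            $type = (defined $name && $name =~ m/\(\S*\)\z/) ? 'man' : 'pod';
        }
        return { text => $text, name => $name, section => $section,
                 type => $type, raw  => $raw };
    }

    my $link = parse_link("Perlport's section on NL's|perlport/Newlines");
    print "$link->{name} / $link->{section} ($link->{type})\n";
    ```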

  • In case of L<...> codes with no "text|" part in them, older formatters have exhibited great variation in actually displaying the link or cross reference. For example, L<crontab(5)> would render as "the crontab(5) manpage", or "in the crontab(5) manpage" or just "crontab(5)".

    Pod processors must now treat "text|"-less links as follows:

    1. L<name> => L<name|name>
    2. L</section> => L<"section"|/section>
    3. L<name/section> => L<"section" in name|name/section>
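
    The three rules above amount to a small function. A minimal sketch follows; the name inferred_text is hypothetical, and it assumes the name and section have already been separated, with undef for a missing part.

    ```perl
    use strict;
    use warnings;

    # Build the link text to display when an L<...> code has no
    # "text|" part.
    sub inferred_text {
        my ($name, $section) = @_;
        return $name            unless defined $section;   # L<name>
        return qq{"$section"}   unless defined $name;      # L</section>
        return qq{"$section" in $name};                    # L<name/section>
    }

    print inferred_text('perlport', 'Newlines'), "\n";   # "Newlines" in perlport
    ```
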
  • Note that section names might contain markup. I.e., if a section starts with:

    1. =head2 About the C<-M> Operator

    or with:

    1. =item About the C<-M> Operator

    then a link to it would look like this:

    1. L<somedoc/About the C<-M> Operator>

    Formatters may choose to ignore the markup for purposes of resolving the link and use only the renderable characters in the section name, as in:

    1. <h1><a name="About_the_-M_Operator">About the <code>-M</code>
    2. Operator</h1>
    3. ...
    4. <a href="somedoc#About_the_-M_Operator">About the <code>-M</code>
    5. Operator" in somedoc</a>
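
    An anchor name like the one in this HTML could be derived as in the sketch below. The function section_anchor is hypothetical, it handles only single-angle-bracket codes such as C<-M>, and the space-to-underscore convention is simply the one visible in the example, not a requirement of this specification.

    ```perl
    use strict;
    use warnings;

    # Reduce a section name to its renderable characters and turn it
    # into an anchor name.
    sub section_anchor {
        my ($section) = @_;
        # Unwrap formatting codes innermost first, keeping their content.
        1 while $section =~ s/[A-Z]<([^<>]*)>/$1/;
        $section =~ tr/ /_/;
        return $section;
    }

    print section_anchor('About the C<-M> Operator'), "\n";   # About_the_-M_Operator
    ```
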
  • Previous versions of perlpod distinguished L<name/"section"> links from L<name/item> links (and their targets). These have been merged syntactically and semantically in the current specification, and section can refer either to a "=headn Heading Content" command or to a "=item Item Content" command. This specification does not specify what behavior should be in the case of a given document having several things all seeming to produce the same section identifier (e.g., in HTML, several things all producing the same anchorname in <a name="anchorname">...</a> elements). Where Pod processors can control this behavior, they should use the first such anchor. That is, L<Foo/Bar> refers to the first "Bar" section in Foo.

    But for some processors/formats this cannot be easily controlled; as with the HTML example, the behavior of multiple ambiguous <a name="anchorname">...</a> is most easily just left up to browsers to decide.

  • In a L<text|...> code, text may contain formatting codes for formatting or for E<...> escapes, as in:

    1. L<B<ummE<234>stuff>|...>

    For L<...> codes without a "text|" part, only E<...> and Z<> codes may occur. That is, authors should not use "L<B<Foo::Bar>>".

    Note, however, that formatting codes and Z<>'s can occur in any and all parts of an L<...> (i.e., in name, section, text, and url).

    Authors must not nest L<...> codes. For example, "L<The L<Foo::Bar> man page>" should be treated as an error.

  • Note that Pod authors may use formatting codes inside the "text" part of "L<text|name>" (and so on for L<text|/"sec">).

    In other words, this is valid:

    1. Go read L<the docs on C<$.>|perlvar/"$.">

    Some output formats that do allow rendering "L<...>" codes as hypertext, might not allow the link-text to be formatted; in that case, formatters will have to just ignore that formatting.

  • At time of writing, L<name> values are of two types: either the name of a Pod page like L<Foo::Bar> (which might be a real Perl module or program in an @INC / PATH directory, or a .pod file in those places); or the name of a Unix man page, like L<crontab(5)>. In theory, L<chmod> is ambiguous between a Pod page called "chmod", or the Unix man page "chmod" (in whatever man-section). However, the presence of a string in parens, as in "crontab(5)", is sufficient to signal that what is being discussed is not a Pod page, and so is presumably a Unix man page. The distinction is of no importance to many Pod processors, but some processors that render to hypertext formats may need to distinguish them in order to know how to render a given L<foo> code.

  • Previous versions of perlpod allowed for a L<section> syntax (as in L<Object Attributes> ), which was not easily distinguishable from L<name> syntax and for L<"section"> which was only slightly less ambiguous. This syntax is no longer in the specification, and has been replaced by the L</section> syntax (where the slash was formerly optional). Pod parsers should tolerate the L<"section"> syntax, for a while at least. The suggested heuristic for distinguishing L<section> from L<name> is that if it contains any whitespace, it's a section. Pod processors should warn about this being deprecated syntax.

About =over...=back Regions

"=over"..."=back" regions are used for various kinds of list-like structures. (I use the term "region" here simply as a collective term for everything from the "=over" to the matching "=back".)

  • The non-zero numeric indentlevel in "=over indentlevel" ... "=back" is used for giving the formatter a clue as to how many "spaces" (ems, or roughly equivalent units) it should tab over, although many formatters will have to convert this to an absolute measurement that may not exactly match with the size of spaces (or M's) in the document's base font. Other formatters may have to completely ignore the number. The lack of any explicit indentlevel parameter is equivalent to an indentlevel value of 4. Pod processors may complain if indentlevel is present but is not a positive number matching m/\A(\d*\.)?\d+\z/.
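
    A processor might read the indentlevel along these lines. This is only a sketch: the function over_indent is hypothetical, and falling back to the default of 4 after warning about a bad value is an assumption, since the text says only that processors may complain.

    ```perl
    use strict;
    use warnings;

    # Return the indent level for "=over indentlevel", where the
    # argument is whatever followed "=over" (undef or '' if nothing).
    sub over_indent {
        my ($arg) = @_;
        return 4 unless defined $arg && length $arg;
        return $arg + 0
            if $arg =~ m/\A(\d*\.)?\d+\z/ && $arg > 0;
        warn "=over: indentlevel '$arg' is not a positive number\n";
        return 4;    # assumption: use the default after complaining
    }

    print over_indent('8'), "\n";   # 8
    ```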

  • Authors of Pod formatters are reminded that "=over" ... "=back" may map to several different constructs in your output format. For example, in converting Pod to (X)HTML, it can map to any of <ul>...</ul>, <ol>...</ol>, <dl>...</dl>, or <blockquote>...</blockquote>. Similarly, "=item" can map to <li> or <dt>.

  • Each "=over" ... "=back" region should be one of the following:

    • An "=over" ... "=back" region containing only "=item *" commands, each followed by some number of ordinary/verbatim paragraphs, other nested "=over" ... "=back" regions, "=for..." paragraphs, and "=begin"..."=end" regions.

      (Pod processors must tolerate a bare "=item" as if it were "=item *".) Whether "*" is rendered as a literal asterisk, an "o", or as some kind of real bullet character, is left up to the Pod formatter, and may depend on the level of nesting.

    • An "=over" ... "=back" region containing only m/\A=item\s+\d+\.?\s*\z/ paragraphs, each one (or each group of them) followed by some number of ordinary/verbatim paragraphs, other nested "=over" ... "=back" regions, "=for..." paragraphs, and/or "=begin"..."=end" regions. Note that the numbers must start at 1 in each section, and must proceed in order and without skipping numbers.

      (Pod processors must tolerate lines like "=item 1" as if they were "=item 1.", with the period.)

    • An "=over" ... "=back" region containing only "=item [text]" commands, each one (or each group of them) followed by some number of ordinary/verbatim paragraphs, other nested "=over" ... "=back" regions, or "=for..." paragraphs, and "=begin"..."=end" regions.

      The "=item [text]" paragraph should not match m/\A=item\s+\d+\.?\s*\z/ or m/\A=item\s+\*\s*\z/, nor should it match just m/\A=item\s*\z/.

    • An "=over" ... "=back" region containing no "=item" paragraphs at all, and containing only some number of ordinary/verbatim paragraphs, and possibly also some nested "=over" ... "=back" regions, "=for..." paragraphs, and "=begin"..."=end" regions. Such an itemless "=over" ... "=back" region in Pod is equivalent in meaning to a "<blockquote>...</blockquote>" element in HTML.

    Note that with all the above cases, you can determine which type of "=over" ... "=back" you have, by examining the first (non-"=cut", non-"=pod") Pod paragraph after the "=over" command.
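
    That classification could be sketched as follows, using the patterns quoted in the list above. The function over_type and its four return values are hypothetical names; the input is assumed to be the first relevant paragraph after "=over" with any trailing newline removed.

    ```perl
    use strict;
    use warnings;

    # Classify an "=over" ... "=back" region from its first paragraph.
    sub over_type {
        my ($first_para) = @_;
        return 'block'  unless $first_para =~ m/\A=item\b/;
        # A bare "=item" is tolerated as if it were "=item *".
        return 'bullet' if $first_para =~ m/\A=item\s*\z/
                        or $first_para =~ m/\A=item\s+\*\s*\z/;
        return 'number' if $first_para =~ m/\A=item\s+\d+\.?\s*\z/;
        return 'text';
    }

    print over_type('=item 1'), "\n";   # number
    ```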

  • Pod formatters must tolerate arbitrarily large amounts of text in the "=item text..." paragraph. In practice, most such paragraphs are short, as in:

    1. =item For cutting off our trade with all parts of the world

    But they may be arbitrarily long:

    1. =item For transporting us beyond seas to be tried for pretended
    2. offenses
    3. =item He is at this time transporting large armies of foreign
    4. mercenaries to complete the works of death, desolation and
    5. tyranny, already begun with circumstances of cruelty and perfidy
    6. scarcely paralleled in the most barbarous ages, and totally
    7. unworthy the head of a civilized nation.
  • Pod processors should tolerate "=item *" / "=item number" commands with no accompanying paragraph. The middle item is an example:

    1. =over
    2. =item 1
    3. Pick up dry cleaning.
    4. =item 2
    5. =item 3
    6. Stop by the store. Get Abba Zabas, Stoli, and cheap lawn chairs.
    7. =back
  • No "=over" ... "=back" region can contain headings. Processors may treat such a heading as an error.

  • Note that an "=over" ... "=back" region should have some content. That is, authors should not have an empty region like this:

    1. =over
    2. =back

    Pod processors seeing such a contentless "=over" ... "=back" region, may ignore it, or may report it as an error.

  • Processors must tolerate an "=over" list that goes off the end of the document (i.e., which has no matching "=back"), but they may warn about such a list.

  • Authors of Pod formatters should note that this construct:

    1. =item Neque
    2. =item Porro
    3. =item Quisquam Est
    4. Qui dolorem ipsum quia dolor sit amet, consectetur, adipisci
    5. velit, sed quia non numquam eius modi tempora incidunt ut
    6. labore et dolore magnam aliquam quaerat voluptatem.
    7. =item Ut Enim

    is semantically ambiguous, in a way that makes formatting decisions a bit difficult. On the one hand, it could be mention of an item "Neque", mention of another item "Porro", and mention of another item "Quisquam Est", with just the last one requiring the explanatory paragraph "Qui dolorem ipsum quia dolor..."; and then an item "Ut Enim". In that case, you'd want to format it like so:

    Neque

    Porro

    Quisquam Est
    Qui dolorem ipsum quia dolor sit amet, consectetur, adipisci
    velit, sed quia non numquam eius modi tempora incidunt ut
    labore et dolore magnam aliquam quaerat voluptatem.

    Ut Enim

    But it could equally well be a discussion of three (related or equivalent) items, "Neque", "Porro", and "Quisquam Est", followed by a paragraph explaining them all, and then a new item "Ut Enim". In that case, you'd probably want to format it like so:

    Neque
    Porro
    Quisquam Est
    Qui dolorem ipsum quia dolor sit amet, consectetur, adipisci
    velit, sed quia non numquam eius modi tempora incidunt ut
    labore et dolore magnam aliquam quaerat voluptatem.

    Ut Enim

    But (for the foreseeable future), Pod does not provide any way for Pod authors to distinguish which grouping is meant by the above "=item"-cluster structure. So formatters should format it like so:

    Neque

    Porro

    Quisquam Est

    Qui dolorem ipsum quia dolor sit amet, consectetur, adipisci
    velit, sed quia non numquam eius modi tempora incidunt ut
    labore et dolore magnam aliquam quaerat voluptatem.

    Ut Enim

    That is, there should be (at least roughly) equal spacing between items as between paragraphs (although that spacing may well be less than the full height of a line of text). This leaves it to the reader to use (con)textual cues to figure out whether the "Qui dolorem ipsum..." paragraph applies to the "Quisquam Est" item or to all three items "Neque", "Porro", and "Quisquam Est". While not an ideal situation, this is preferable to providing formatting cues that may be actually contrary to the author's intent.

About Data Paragraphs and "=begin/=end" Regions

Data paragraphs are typically used for inlining non-Pod data that is to be used (typically passed through) when rendering the document to a specific format:

  1. =begin rtf
  2. \par{\pard\qr\sa4500{\i Printed\~\chdate\~\chtime}\par}
  3. =end rtf

The exact same effect could, incidentally, be achieved with a single "=for" paragraph:

  1. =for rtf \par{\pard\qr\sa4500{\i Printed\~\chdate\~\chtime}\par}

(Although that is not formally a data paragraph, it has the same meaning as one, and Pod parsers may parse it as one.)

Another example of a data paragraph:

  1. =begin html
  2. I like <em>PIE</em>!
  3. <hr>Especially pecan pie!
  4. =end html

If these were ordinary paragraphs, the Pod parser would try to expand the "E</em>" (in the first paragraph) as a formatting code, just like "E<lt>" or "E<eacute>". But since this is in a "=begin identifier"..."=end identifier" region and the identifier "html" doesn't begin with a ":" prefix, the contents of this region are stored as data paragraphs, instead of being processed as ordinary paragraphs (or, if they began with spaces and/or tabs, as verbatim paragraphs).

As a further example: At time of writing, no "biblio" identifier is supported, but suppose some processor were written to recognize it as a way of (say) denoting a bibliographic reference (necessarily containing formatting codes in ordinary paragraphs). The fact that "biblio" paragraphs were meant for ordinary processing would be indicated by prefacing each "biblio" identifier with a colon:

  1. =begin :biblio
  2. Wirth, Niklaus. 1976. I<Algorithms + Data Structures =
  3. Programs.> Prentice-Hall, Englewood Cliffs, NJ.
  4. =end :biblio

This would signal to the parser that paragraphs in this begin...end region are subject to normal handling as ordinary/verbatim paragraphs (while still tagged as meant only for processors that understand the "biblio" identifier). The same effect could be had with:

  1. =for :biblio
  2. Wirth, Niklaus. 1976. I<Algorithms + Data Structures =
  3. Programs.> Prentice-Hall, Englewood Cliffs, NJ.

The ":" on these identifiers means simply "process this stuff normally, even though the result will be for some special target". I suggest that parser APIs report "biblio" as the target identifier, but also report that it had a ":" prefix. (And similarly, with the above "html", report "html" as the target identifier, and note the lack of a ":" prefix.)
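
One way a parser API could report this is sketched below. The function parse_target and the keys "target" and "colon" are hypothetical names, not part of any existing module.

```perl
use strict;
use warnings;

# Split a "=begin" / "=for" identifier into the bare target name and
# a flag saying whether it carried a ":" prefix.
sub parse_target {
    my ($identifier) = @_;
    my $had_colon = ($identifier =~ s/\A://) ? 1 : 0;
    return { target => $identifier, colon => $had_colon };
}

my $biblio = parse_target(':biblio');   # target "biblio", colon 1
my $html   = parse_target('html');      # target "html",   colon 0
```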

Note that a "=begin identifier"..."=end identifier" region where identifier begins with a colon, can contain commands. For example:

  1. =begin :biblio
  2. Wirth's classic is available in several editions, including:
  3. =for comment
  4. hm, check abebooks.com for how much used copies cost.
  5. =over
  6. =item
  7. Wirth, Niklaus. 1975. I<Algorithmen und Datenstrukturen.>
  8. Teubner, Stuttgart. [Yes, it's in German.]
  9. =item
  10. Wirth, Niklaus. 1976. I<Algorithms + Data Structures =
  11. Programs.> Prentice-Hall, Englewood Cliffs, NJ.
  12. =back
  13. =end :biblio

Note, however, a "=begin identifier"..."=end identifier" region where identifier does not begin with a colon, should not directly contain "=head1" ... "=head4" commands, nor "=over", nor "=back", nor "=item". For example, this may be considered invalid:

  1. =begin somedata
  2. This is a data paragraph.
  3. =head1 Don't do this!
  4. This is a data paragraph too.
  5. =end somedata

A Pod processor may signal that the above (specifically the "=head1" paragraph) is an error. Note, however, that the following should not be treated as an error:

  1. =begin somedata
  2. This is a data paragraph.
  3. =cut
  4. # Yup, this isn't Pod anymore.
  5. sub excl { (rand() > .5) ? "hoo!" : "hah!" }
  6. =pod
  7. This is a data paragraph too.
  8. =end somedata

And this too is valid:

  1. =begin someformat
  2. This is a data paragraph.
  3. And this is a data paragraph.
  4. =begin someotherformat
  5. This is a data paragraph too.
  6. And this is a data paragraph too.
  7. =begin :yetanotherformat
  8. =head2 This is a command paragraph!
  9. This is an ordinary paragraph!
  10. And this is a verbatim paragraph!
  11. =end :yetanotherformat
  12. =end someotherformat
  13. Another data paragraph!
  14. =end someformat

The contents of the above "=begin :yetanotherformat" ... "=end :yetanotherformat" region aren't data paragraphs, because the immediately containing region's identifier (":yetanotherformat") begins with a colon. In practice, most regions that contain data paragraphs will contain only data paragraphs; however, the above nesting is syntactically valid as Pod, even if it is rare. However, the handlers for some formats, like "html", will accept only data paragraphs, not nested regions; and they may complain if they see (targeted for them) nested regions, or commands, other than "=end", "=pod", and "=cut".

Also consider this valid structure:

  1. =begin :biblio
  2. Wirth's classic is available in several editions, including:
  3. =over
  4. =item
  5. Wirth, Niklaus. 1975. I<Algorithmen und Datenstrukturen.>
  6. Teubner, Stuttgart. [Yes, it's in German.]
  7. =item
  8. Wirth, Niklaus. 1976. I<Algorithms + Data Structures =
  9. Programs.> Prentice-Hall, Englewood Cliffs, NJ.
  10. =back
  11. Buy buy buy!
  12. =begin html
  13. <img src='wirth_spokesmodeling_book.png'>
  14. <hr>
  15. =end html
  16. Now now now!
  17. =end :biblio

There, the "=begin html"..."=end html" region is nested inside the larger "=begin :biblio"..."=end :biblio" region. Note that the content of the "=begin html"..."=end html" region is data paragraph(s), because the immediately containing region's identifier ("html") doesn't begin with a colon.

Pod parsers, when processing a series of data paragraphs one after another (within a single region), should consider them to be one large data paragraph that happens to contain blank lines. So the content of the above "=begin html"..."=end html" may be stored as two data paragraphs (one consisting of "<img src='wirth_spokesmodeling_book.png'>\n" and another consisting of "<hr>\n"), but should be stored as a single data paragraph (consisting of "<img src='wirth_spokesmodeling_book.png'>\n\n<hr>\n").
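
The merge described above is a join on a single newline, since each data paragraph already ends with one. A minimal sketch, with the hypothetical function name merge_data_paragraphs:

```perl
use strict;
use warnings;

# Combine consecutive data paragraphs from one region into a single
# data paragraph that contains blank lines.
sub merge_data_paragraphs {
    my @paragraphs = @_;
    return join "\n", @paragraphs;
}

my $merged = merge_data_paragraphs(
    "<img src='wirth_spokesmodeling_book.png'>\n",
    "<hr>\n",
);
```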

Pod processors should tolerate empty "=begin something"..."=end something" regions, empty "=begin :something"..."=end :something" regions, and contentless "=for something" and "=for :something" paragraphs. I.e., these should be tolerated:

  1. =for html
  2. =begin html
  3. =end html
  4. =begin :biblio
  5. =end :biblio

Incidentally, note that there's no easy way to express a data paragraph starting with something that looks like a command. Consider:

  1. =begin stuff
  2. =shazbot
  3. =end stuff

There, "=shazbot" will be parsed as a Pod command "shazbot", not as a data paragraph "=shazbot\n". However, you can express a data paragraph consisting of "=shazbot\n" using this code:

  1. =for stuff =shazbot

The situation where this is necessary, is presumably quite rare.

Note that =end commands must match the currently open =begin command. That is, they must properly nest. For example, this is valid:

  1. =begin outer
  2. X
  3. =begin inner
  4. Y
  5. =end inner
  6. Z
  7. =end outer

while this is invalid:

  1. =begin outer
  2. X
  3. =begin inner
  4. Y
  5. =end outer
  6. Z
  7. =end inner

This latter is improper because when the "=end outer" command is seen, the currently open region has the formatname "inner", not "outer". (It just happens that "outer" is the format name of a higher-up region.) This is an error. Processors must by default report this as an error, and may halt processing the document containing that error. A corollary of this is that regions cannot "overlap". That is, the latter block above does not represent a region called "outer" which contains X and Y, overlapping a region called "inner" which contains Y and Z. But because it is invalid (as all apparently overlapping regions would be), it doesn't represent that, or anything at all.

Similarly, this is invalid:

  1. =begin thing
  2. =end hting

This is an error because the region is opened by "thing", and the "=end" tries to close "hting" [sic].

This is also invalid:

  1. =begin thing
  2. =end

This is invalid because every "=end" command must have a formatname parameter.
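
The nesting rules above can be checked with a stack. The following is a sketch under simplifying assumptions: the function check_regions is hypothetical, it sees only the command lines (not the paragraphs between them), and the wording of its messages is made up.

```perl
use strict;
use warnings;

# Check that "=end" commands match the currently open "=begin".
# Returns a list of error messages; an empty list means the regions
# nest properly.
sub check_regions {
    my @commands = @_;
    my (@open, @errors);
    for my $cmd (@commands) {
        if ($cmd =~ m/\A=begin\s+(\S+)/) {
            push @open, $1;
        }
        elsif ($cmd =~ m/\A=end(?:\s+(\S+))?/) {
            my $name = $1;
            if (!defined $name) {
                push @errors, '"=end" has no formatname';
            }
            elsif (!@open) {
                push @errors, qq{"=end $name" closes nothing};
            }
            elsif ($open[-1] ne $name) {
                push @errors,
                    qq{"=end $name" does not match "=begin $open[-1]"};
            }
            else {
                pop @open;
            }
        }
    }
    return @errors;
}

my @problems = check_regions('=begin thing', '=end hting');
print "$_\n" for @problems;
```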

SEE ALSO

perlpod, PODs: Embedded Documentation in perlsyn, podchecker

AUTHOR

Sean M. Burke

 
perlpodstyle

NAME

perlpodstyle - Perl POD style guide

DESCRIPTION

These are general guidelines for how to write POD documentation for Perl scripts and modules, based on general guidelines for writing good UNIX man pages. All of these guidelines are, of course, optional, but following them will make your documentation more consistent with other documentation on the system.

The name of the program being documented is conventionally written in bold (using B<>) wherever it occurs, as are all program options. Arguments should be written in italics (I<>). Function names are traditionally written in italics; if you write a function as function(), Pod::Man will take care of this for you. Literal code or commands should be in C<>. References to other man pages should be in the form manpage(section) or L<manpage(section)>, and Pod::Man will automatically format those appropriately. The second form, with L<>, is used to request that a POD formatter make a link to the man page if possible. As an exception, one normally omits the section when referring to module documentation since it's not clear what section module documentation will be in; use L<Module::Name> for module references instead.

References to other programs or functions are normally in the form of man page references so that cross-referencing tools can provide the user with links and the like. It's possible to overdo this, though, so be careful not to clutter your documentation with too much markup. References to other programs that are not given as man page references should be enclosed in B<>.

The major headers should be set out using a =head1 directive, and are historically written in the rather startling ALL UPPER CASE format; this is not mandatory, but it's strongly recommended so that sections have consistent naming across different software packages. Minor headers may be included using =head2, and are typically in mixed case.

The standard sections of a manual page are:

  • NAME

    Mandatory section; should be a comma-separated list of programs or functions documented by this POD page, such as:

        foo, bar - programs to do something

    Manual page indexers are often extremely picky about the format of this section, so don't put anything in it except this line. Every program or function documented by this POD page should be listed, separated by a comma and a space. For a Perl module, just give the module name. A single dash, and only a single dash, should separate the list of programs or functions from the description. Do not use any markup such as C<> or B<> anywhere in this line. Functions should not be qualified with () or the like. The description should ideally fit on a single line, even if a man program replaces the dash with a few tabs.

  • SYNOPSIS

    A short usage summary for programs and functions. This section is mandatory for section 3 pages. For Perl module documentation, it's usually convenient to have the contents of this section be a verbatim block showing some (brief) examples of typical ways the module is used.

  • DESCRIPTION

    Extended description and discussion of the program or functions, or the body of the documentation for man pages that document something else. If particularly long, it's a good idea to break this up into subsections using =head2 directives like:

        =head2 Normal Usage

        =head2 Advanced Features

        =head2 Writing Configuration Files

    or whatever is appropriate for your documentation.

    For a module, this is generally where the documentation of the interfaces provided by the module goes, usually in the form of a list with an =item for each interface. Depending on how many interfaces there are, you may want to put that documentation in separate METHODS, FUNCTIONS, CLASS METHODS, or INSTANCE METHODS sections instead and save the DESCRIPTION section for an overview.

  • OPTIONS

    Detailed description of each of the command-line options taken by the program. This should be separate from the description for the use of parsers like Pod::Usage. This is normally presented as a list, with each option as a separate =item. The specific option string should be enclosed in B<>. Any values that the option takes should be enclosed in I<>. For example, the section for the option --section=manext would be introduced with:

        =item B<--section>=I<manext>

    Synonymous options (like both the short and long forms) are separated by a comma and a space on the same =item line, or optionally listed as their own item with a reference to the canonical name. For example, since --section can also be written as -s, the above would be:

        =item B<-s> I<manext>, B<--section>=I<manext>

    Writing the short option first is recommended because it's easier to read. The long option is long enough to draw the eye to it anyway and the short option can otherwise get lost in visual noise.

  • RETURN VALUE

    What the program or function returns, if successful. This section can be omitted for programs whose precise exit codes aren't important, provided they return 0 on success and non-zero on failure as is standard. It should always be present for functions. For modules, it may be useful to summarize return values from the module interface here, or it may be more useful to discuss return values separately in the documentation of each function or method the module provides.

  • ERRORS

    Exceptions, error return codes, exit statuses, and errno settings. Typically used for function or module documentation; program documentation uses DIAGNOSTICS instead. The general rule of thumb is that errors printed to STDOUT or STDERR and intended for the end user are documented in DIAGNOSTICS while errors passed internal to the calling program and intended for other programmers are documented in ERRORS. When documenting a function that sets errno, a full list of the possible errno values should be given here.

  • DIAGNOSTICS

    All possible messages the program can print out and what they mean. You may wish to follow the same documentation style as the Perl documentation; see perldiag(1) for more details (and look at the POD source as well).

    If applicable, please include details on what the user should do to correct the error; documenting an error as indicating "the input buffer is too small" without telling the user how to increase the size of the input buffer (or at least telling them that it isn't possible) isn't very useful.

  • EXAMPLES

    Give some example uses of the program or function. Don't skimp; users often find this the most useful part of the documentation. The examples are generally given as verbatim paragraphs.

    Don't just present an example without explaining what it does. Adding a short paragraph saying what the example will do can increase the value of the example immensely.

  • ENVIRONMENT

    Environment variables that the program cares about, normally presented as a list using =over, =item, and =back. For example:

        =over 6

        =item HOME

        Used to determine the user's home directory. F<.foorc> in this
        directory is read for configuration details, if it exists.

        =back

    Since environment variables are normally in all uppercase, no additional special formatting is generally needed; they're glaring enough as it is.

  • FILES

    All files used by the program or function, normally presented as a list, and what it uses them for. File names should be enclosed in F<>. It's particularly important to document files that will be potentially modified.

  • CAVEATS

    Things to take special care with, sometimes called WARNINGS.

  • BUGS

    Things that are broken or just don't work quite right.

  • RESTRICTIONS

    Bugs you don't plan to fix. :-)

  • NOTES

    Miscellaneous commentary.

  • AUTHOR

    Who wrote it (use AUTHORS for multiple people). It's a good idea to include your current e-mail address (or some e-mail address to which bug reports should be sent) or some other contact information so that users have a way of contacting you. Remember that program documentation tends to roam the wild for far longer than you expect and pick a contact method that's likely to last.

  • HISTORY

    Programs derived from other sources sometimes have this. Some people keep a modification log here, but that usually gets long and is normally better maintained in a separate file.

  • COPYRIGHT AND LICENSE

    For copyright

        Copyright YEAR(s) YOUR NAME(s)

    (No, (C) is not needed. No, "all rights reserved" is not needed.)

    For licensing the easiest way is to use the same licensing as Perl itself:

        This library is free software; you may redistribute it and/or modify
        it under the same terms as Perl itself.

    This makes it easy for people to use your module with Perl. Note that this licensing example is neither an endorsement nor a requirement; you are of course free to choose any licensing.

  • SEE ALSO

    Other man pages to check out, like man(1), man(7), makewhatis(8), or catman(8). Normally a simple list of man pages separated by commas, or a paragraph giving the name of a reference work. Man page references, if they use the standard name(section) form, don't have to be enclosed in L<> (although it's recommended), but other things in this section probably should be when appropriate.

    If the package has a mailing list, include a URL or subscription instructions here.

    If the package has a web site, include a URL here.
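The formatting rules given above for the NAME line lend themselves to a mechanical check. The following is a minimal sketch, not part of any existing tool: check_name_line is a hypothetical helper, and it covers only the rules stated in this document (comma-and-space separated names, a single dash before the description, no POD markup, no trailing parentheses on function names).

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of problems found in a NAME line,
# or an empty list if the line follows the conventions described above.
sub check_name_line {
    my ($line) = @_;
    my @problems;

    # Formatting codes such as C<> or B<> must not appear anywhere.
    push @problems, 'contains POD markup'
        if $line =~ /[A-Z]<[^>]*>/;

    # Split at the first " - " into the name list and the description.
    my ($names, $desc) = split / - /, $line, 2;
    if (!defined $desc || $desc eq '') {
        push @problems, 'no " - " separator followed by a description';
        return @problems;
    }

    # Names must be separated by a comma and a space.
    my @names = split /, /, $names, -1;
    push @problems, 'no names before the dash' unless @names;
    for my $name (@names) {
        push @problems, "name '$name' is qualified with ()"
            if $name =~ /\(\)/;
        push @problems, "name '$name' is empty or contains spaces or commas"
            if $name eq '' || $name =~ /[\s,]/;
    }
    return @problems;
}

print "ok\n" unless check_name_line('foo, bar - programs to do something');
print "$_\n" for check_name_line('C<foo>,bar() - programs to do something');
```

The first call uses the example line from the NAME entry and reports no problems; the second line breaks several of the rules at once and prints one message per problem.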

Documentation of object-oriented libraries or modules may want to use CONSTRUCTORS and METHODS sections, or CLASS METHODS and INSTANCE METHODS sections, for detailed documentation of the parts of the library and save the DESCRIPTION section for an overview. Large modules with a function interface may want to use FUNCTIONS for similar reasons. Some people use OVERVIEW to summarize the description if it's quite long.

Section ordering varies, although NAME must always be the first section (you'll break some man page systems otherwise), and NAME, SYNOPSIS, DESCRIPTION, and OPTIONS generally always occur first and in that order if present. In general, SEE ALSO, AUTHOR, and similar material should be left for last. Some systems also move WARNINGS and NOTES to last. The order given above should be reasonable for most purposes.
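To make the ordering concrete, here is a skeleton for a hypothetical program, frob, with the sections arranged as suggested above. The program, its option, and the author details are placeholders, and only a subset of the standard sections is shown.

```perl
=head1 NAME

frob - rewrite files in place (hypothetical example program)

=head1 SYNOPSIS

B<frob> [B<-v>] I<file>

=head1 DESCRIPTION

B<frob> reads I<file> and rewrites it in place.

=head1 OPTIONS

=over 4

=item B<-v>, B<--verbose>

Report each change on standard error.

=back

=head1 BUGS

None known.

=head1 SEE ALSO

diff(1), patch(1)

=head1 AUTHOR

A. N. Author <author@example.com>

=head1 COPYRIGHT AND LICENSE

Copyright YEAR A. N. Author

This program is free software; you may redistribute it and/or modify
it under the same terms as Perl itself.

=cut
```

NAME comes first, SYNOPSIS, DESCRIPTION, and OPTIONS follow in that order, and SEE ALSO, AUTHOR, and the licensing material are left for last.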

Some systems use CONFORMING TO to note conformance to relevant standards and MT-LEVEL to note safeness for use in threaded programs or signal handlers. These headings are primarily useful when documenting parts of a C library.

Finally, as a general note, try not to use an excessive amount of markup. As documented here and in Pod::Man, you can safely leave Perl variables, function names, man page references, and the like unadorned by markup and the POD translators will figure it out for you. This makes it much easier to later edit the documentation. Note that many existing translators will do the wrong thing with e-mail addresses when wrapped in L<>, so don't do that.

SEE ALSO

For additional information that may be more accurate for your specific system, see either man(5) or man(7) depending on your system manual section numbering conventions.

This documentation is maintained as part of the podlators distribution. The current version is always available from its web site at <http://www.eyrie.org/~eagle/software/podlators/>.

AUTHOR

Russ Allbery <rra@stanford.edu>, with large portions of this documentation taken from the documentation of the original pod2man implementation by Larry Wall and Tom Christiansen.

COPYRIGHT AND LICENSE

Copyright 1999, 2000, 2001, 2004, 2006, 2008, 2010 Russ Allbery <rra@stanford.edu>.

This documentation is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

 
perlpolicy

NAME

perlpolicy - Various and sundry policies and commitments related to the Perl core

DESCRIPTION

This document is the master document which records all written policies about how the Perl 5 Porters collectively develop and maintain the Perl core.

GOVERNANCE

Perl 5 Porters

Subscribers to perl5-porters (the porters themselves) come in several flavours. Some are quiet curious lurkers, who rarely pitch in and instead watch the ongoing development to ensure they're forewarned of new changes or features in Perl. Some are representatives of vendors, who are there to make sure that Perl continues to compile and work on their platforms. Some patch any reported bug that they know how to fix, some are actively patching their pet area (threads, Win32, the regexp engine), while others seem to do nothing but complain. In other words, it's your usual mix of technical people.

Over this group of porters presides Larry Wall. He has the final word in what does and does not change in any of the Perl programming languages. These days, Larry spends most of his time on Perl 6, while Perl 5 is shepherded by a "pumpking", a porter responsible for deciding what goes into each release and ensuring that releases happen on a regular basis.

Larry sees Perl development along the lines of the US government: there's the Legislature (the porters), the Executive branch (the pumpking), and the Supreme Court (Larry). The legislature can discuss and submit patches to the executive branch all they like, but the executive branch is free to veto them. Rarely, the Supreme Court will side with the executive branch over the legislature, or the legislature over the executive branch. Mostly, however, the legislature and the executive branch are supposed to get along and work out their differences without impeachment or court cases.

You might sometimes see reference to Rule 1 and Rule 2. Larry's power as Supreme Court is expressed in The Rules:

1

Larry is always by definition right about how Perl should behave. This means he has final veto power on the core functionality.

2

Larry is allowed to change his mind about any matter at a later date, regardless of whether he previously invoked Rule 1.

Got that? Larry is always right, even when he was wrong. It's rare to see either Rule exercised, but they are often alluded to.

MAINTENANCE AND SUPPORT

Perl 5 is developed by a community, not a corporate entity. Every change contributed to the Perl core is the result of a donation. Typically, these donations are contributions of code or time by individual members of our community. On occasion, these donations come in the form of corporate or organizational sponsorship of a particular individual or project.

As a volunteer organization, the commitments we make are heavily dependent on the goodwill and hard work of individuals who have no obligation to contribute to Perl.

That being said, we value Perl's stability and security and have long had an unwritten covenant with the broader Perl community to support and maintain releases of Perl.

This document codifies the support and maintenance commitments that the Perl community should expect from Perl's developers:

  • We "officially" support the two most recent stable release series. 5.12.x and earlier are now out of support. As of the release of 5.18.0, we will "officially" end support for Perl 5.14.x, other than providing security updates as described below.

  • To the best of our ability, we will attempt to fix critical issues in the two most recent stable 5.x release series. Fixes for the current release series take precedence over fixes for the previous release series.

  • To the best of our ability, we will provide "critical" security patches / releases for any major version of Perl whose 5.x.0 release was within the past three years. We can only commit to providing these for the most recent .y release in any 5.x.y series.

  • We will not provide security updates or bug fixes for development releases of Perl.

  • We encourage vendors to ship the most recent supported release of Perl at the time of their code freeze.

  • As a vendor, you may have a requirement to backport security fixes beyond our 3 year support commitment. We can provide limited support and advice to you as you do so and, where possible, will try to apply those patches to the relevant -maint branches in git, though we may or may not choose to make numbered releases or "official" patches available. Contact us at <perl5-security-report@perl.org> to begin that process.

BACKWARD COMPATIBILITY AND DEPRECATION

Our community has a long-held belief that backward-compatibility is a virtue, even when the functionality in question is a design flaw.

We would all love to unmake some mistakes we've made over the past decades. Living with every design error we've ever made can lead to painful stagnation. Unwinding our mistakes is very, very difficult. Doing so without actively harming our users is nearly impossible.

Lately, ignoring or actively opposing compatibility with earlier versions of Perl has come into vogue. Sometimes, a change is proposed which wants to usurp syntax which previously had another meaning. Sometimes, a change wants to improve previously-crazy semantics.

Down this road lies madness.

Requiring end-user programmers to change just a few language constructs, even language constructs which no well-educated developer would ever intentionally use, is tantamount to saying "you should not upgrade to a new release of Perl unless you have 100% test coverage and can do a full manual audit of your codebase." If we were to have tools capable of reliably upgrading Perl source code from one version of Perl to another, this concern could be significantly mitigated.

We want to ensure that Perl continues to grow and flourish in the coming years and decades, but not at the expense of our user community.

Existing syntax and semantics should only be marked for destruction in very limited circumstances. If a given language feature's continued inclusion in the language will cause significant harm to the language or prevent us from making needed changes to the runtime, then it may be considered for deprecation.

Any language change which breaks backward-compatibility should be able to be enabled or disabled lexically. Unless code at a given scope declares that it wants the new behavior, that new behavior should be disabled. Which backward-incompatible changes are controlled implicitly by a 'use v5.x.y' is a decision which should be made by the pumpking in consultation with the community.
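The feature pragma is one existing mechanism that works this way. The sketch below uses the say feature (available since Perl 5.10) to illustrate lexical enabling; the sub names are made up for the example, and it is meant only as an illustration of scoping, not as a statement of how any particular change must be gated.

```perl
use strict;
use warnings;

sub new_style {
    # The "say" feature is enabled only inside this block.
    use feature 'say';
    say 'say is a built-in here';
    return 1;
}

# Outside that block, "say" is not a keyword, so older code that
# defines its own sub named say keeps working unchanged.
sub say { return "user-defined say: @_" }

new_style();
my $out = main::say('hello');
print "$out\n";
```

Code that never asks for the feature sees the old behaviour, while the block that declares it gets the new keyword.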

When a backward-incompatible change can't be toggled lexically, the decision to change the language must be considered very, very carefully. If it's possible to move the old syntax or semantics out of the core language and into XS-land, that XS module should be enabled by default unless the user declares that they want a newer revision of Perl.

Historically, we've held ourselves to a far higher standard than backward-compatibility -- bugward-compatibility. Any accident of implementation or unintentional side-effect of running some bit of code has been considered to be a feature of the language to be defended with the same zeal as any other feature or functionality. No matter how frustrating these unintentional features may be to us as we continue to improve Perl, these unintentional features often deserve our protection. It is very important that existing software written in Perl continue to work correctly. If end-user developers have adopted a bug as a feature, we need to treat it as such.

New syntax and semantics which don't break existing language constructs and syntax have a much lower bar. They merely need to prove themselves to be useful, elegant, well designed, and well tested.

Terminology

To make sure we're talking about the same thing when we discuss the removal of features or functionality from the Perl core, we have specific definitions for a few words and phrases.

  • experimental

    If something in the Perl core is marked as experimental, we may change its behaviour, deprecate or remove it without notice. While we'll always do our best to smooth the transition path for users of experimental features, you should contact the perl5-porters mailing list if you find an experimental feature useful and want to help shape its future.

  • deprecated

    If something in the Perl core is marked as deprecated, we may remove it from the core in the next stable release series, though we may not. As of Perl 5.12, deprecated features and modules warn the user as they're used. When a module is deprecated, it will also be made available on CPAN. Installing it from CPAN will silence deprecation warnings for that module.

    If you use a deprecated feature or module and believe that its removal from the Perl core would be a mistake, please contact the perl5-porters mailing list and plead your case. We don't deprecate things without a good reason, but sometimes there's a counterargument we haven't considered. Historically, we did not distinguish between "deprecated" and "discouraged" features.

  • discouraged

    From time to time, we may mark language constructs and features which we consider to have been mistakes as discouraged. Discouraged features aren't candidates for removal in the next major release series, but we may later deprecate them if they're found to stand in the way of a significant improvement to the Perl core.

  • removed

    Once a feature, construct or module has been marked as deprecated for a stable release cycle, we may remove it from the Perl core. Unsurprisingly, we say we've removed these things. When a module is removed, it will no longer ship with Perl, but will continue to be available on CPAN.

MAINTENANCE BRANCHES

  • New releases of maint should contain as few changes as possible. If there is any question about whether a given patch might merit inclusion in a maint release, then it almost certainly should not be included.

  • Portability fixes, such as changes to Configure and the files in hints/ are acceptable. Ports of Perl to a new platform, architecture or OS release that involve changes to the implementation are NOT acceptable.

  • Acceptable documentation updates are those that correct factual errors, explain significant bugs or deficiencies in the current implementation, or fix broken markup.

  • Patches that add new warnings or errors or deprecate features are not acceptable.

  • Patches that fix crashing bugs that do not otherwise change Perl's functionality or negatively impact performance are acceptable.

  • Patches that fix CVEs or security issues are acceptable, but should be run through the perl5-security-report@perl.org mailing list rather than applied directly.

  • Patches that fix regressions in perl's behavior relative to previous releases are acceptable.

  • Updates to dual-life modules should consist of minimal patches to fix crashing or security issues (as above).

  • Minimal patches that fix platform-specific test failures or installation issues are acceptable. When these changes are made to dual-life modules for which CPAN is canonical, any changes should be coordinated with the upstream author.

  • New versions of dual-life modules should NOT be imported into maint. Those belong in the next stable series.

  • Patches that add or remove features are not acceptable.

  • Patches that break binary compatibility are not acceptable. (Please talk to a pumpking.)

Getting changes into a maint branch

Historically, only the pumpking cherry-picked changes from bleadperl into maintperl. This has scaling problems. At the same time, maintenance branches of stable versions of Perl need to be treated with great care. To that end, as of Perl 5.12, we have a new process for maint branches.

Any committer may cherry-pick any commit from blead to a maint branch if they send mail to perl5-porters announcing their intent to cherry-pick a specific commit along with a rationale for doing so and at least two other committers respond to the list giving their assent. (This policy applies to current and former pumpkings, as well as other committers.)

CONTRIBUTED MODULES

A Social Contract about Artistic Control

What follows is a statement about artistic control, defined as the ability of authors of packages to guide the future of their code and maintain control over their work. It is a recognition that authors should have control over their work, and that it is a responsibility of the rest of the Perl community to ensure that they retain this control. It is an attempt to document the standards to which we, as Perl developers, intend to hold ourselves. It is an attempt to write down rough guidelines about the respect we owe each other as Perl developers.

This statement is not a legal contract. This statement is not a legal document in any way, shape, or form. Perl is distributed under the GNU General Public License and under the Artistic License; those are the precise legal terms. This statement isn't about the law or licenses. It's about community, mutual respect, trust, and good-faith cooperation.

We recognize that the Perl core, defined as the software distributed with the heart of Perl itself, is a joint project on the part of all of us. From time to time, a script, module, or set of modules (hereafter referred to simply as a "module") will prove so widely useful and/or so integral to the correct functioning of Perl itself that it should be distributed with the Perl core. This should never be done without the author's explicit consent, and a clear recognition on all parts that this means the module is being distributed under the same terms as Perl itself. A module author should realize that inclusion of a module into the Perl core will necessarily mean some loss of control over it, since changes may occasionally have to be made on short notice or for consistency with the rest of Perl.

Once a module has been included in the Perl core, however, everyone involved in maintaining Perl should be aware that the module is still the property of the original author unless the original author explicitly gives up their ownership of it. In particular:

  • The version of the module in the Perl core should still be considered the work of the original author. All patches, bug reports, and so forth should be fed back to them. Their development directions should be respected whenever possible.

  • Patches may be applied by the pumpkin holder without the explicit cooperation of the module author if and only if they are very minor, time-critical in some fashion (such as urgent security fixes), or if the module author cannot be reached. Those patches must still be given back to the author when possible, and if the author decides on an alternate fix in their version, that fix should be strongly preferred unless there is a serious problem with it. Any changes not endorsed by the author should be marked as such, and the contributor of the change acknowledged.

  • The version of the module distributed with Perl should, whenever possible, be the latest version of the module as distributed by the author (the latest non-beta version in the case of public Perl releases), although the pumpkin holder may hold off on upgrading the version of the module distributed with Perl to the latest version until the latest version has had sufficient testing.

In other words, the author of a module should be considered to have final say on modifications to their module whenever possible (bearing in mind that it's expected that everyone involved will work together and arrive at reasonable compromises when there are disagreements).

As a last resort, however:

If the author's vision of the future of their module is sufficiently different from the vision of the pumpkin holder and perl5-porters as a whole so as to cause serious problems for Perl, the pumpkin holder may choose to formally fork the version of the module in the Perl core from the one maintained by the author. This should not be done lightly and should always if at all possible be done only after direct input from Larry. If this is done, it must then be made explicit in the module as distributed with the Perl core that it is a forked version and that while it is based on the original author's work, it is no longer maintained by them. This must be noted in both the documentation and in the comments in the source of the module.

Again, this should be a last resort only. Ideally, this should never happen, and every possible effort at cooperation and compromise should be made before doing this. If it does prove necessary to fork a module for the overall health of Perl, proper credit must be given to the original author in perpetuity and the decision should be constantly re-evaluated to see if a remerging of the two branches is possible down the road.

In all dealings with contributed modules, everyone maintaining Perl should keep in mind that the code belongs to the original author, that they may not be on perl5-porters at any given time, and that a patch is not official unless it has been integrated into the author's copy of the module. To aid with this, and with the three points listed above, contact information for the authors of all contributed modules should be kept with the Perl distribution.

Finally, the Perl community as a whole recognizes that respect for ownership of code, respect for artistic control, proper credit, and active effort to prevent unintentional code skew or communication gaps is vital to the health of the community and Perl itself. Members of a community should not normally have to resort to rules and laws to deal with each other, and this document, although it contains rules so as to be clear, is about an attitude and general approach. The first step in any dispute should be open communication, respect for opposing views, and an attempt at a compromise. In nearly every circumstance nothing more will be necessary, and certainly no more drastic measure should be used until every avenue of communication and discussion has failed.

DOCUMENTATION

Perl's documentation is an important resource for our users. It's incredibly important for Perl's documentation to be reasonably coherent and to accurately reflect the current implementation.

Just as P5P collectively maintains the codebase, we collectively maintain the documentation. Writing a particular bit of documentation doesn't give an author control of the future of that documentation. At the same time, just as source code changes should match the style of their surrounding blocks, so should documentation changes.

Examples in documentation should be illustrative of the concept they're explaining. Sometimes, the best way to show how a language feature works is with a small program the reader can run without modification. More often, examples will consist of a snippet of code containing only the "important" bits. The definition of "important" varies from snippet to snippet. Sometimes it's important to declare use strict and use warnings, initialize all variables and fully catch every error condition. More often than not, though, those things obscure the lesson the example was intended to teach.

As Perl is developed by a global team of volunteers, our documentation often contains spellings which look funny to somebody. Choice of American/British/Other spellings is left as an exercise for the author of each bit of documentation. When patching documentation, try to emulate the documentation around you, rather than changing the existing prose.

In general, documentation should describe what Perl does "now" rather than what it used to do. It's perfectly reasonable to include notes in documentation about how behaviour has changed from previous releases, but, with very few exceptions, documentation isn't "dual-life" -- it doesn't need to fully describe how all old versions used to work.

CREDITS

"Social Contract about Contributed Modules" originally by Russ Allbery <rra@stanford.edu> and the perl5-porters.

 
perlport

NAME

perlport - Writing portable Perl

DESCRIPTION

Perl runs on numerous operating systems. While most of them share much in common, they also have their own unique features.

This document is meant to help you to find out what constitutes portable Perl code. That way once you make a decision to write portably, you know where the lines are drawn, and you can stay within them.

There is a tradeoff between taking full advantage of one particular type of computer and taking advantage of a full range of them. Naturally, as you broaden your range and become more diverse, the common factors drop, and you are left with an increasingly smaller area of common ground in which you can operate to accomplish a particular task. Thus, when you begin attacking a problem, it is important to consider under which part of the tradeoff curve you want to operate. Specifically, you must decide whether it is important that the task that you are coding have the full generality of being portable, or whether to just get the job done right now. This is the hardest choice to be made. The rest is easy, because Perl provides many choices, whichever way you want to approach your problem.

Looking at it another way, writing portable code is usually about willfully limiting your available choices. Naturally, it takes discipline and sacrifice to do that. The product of portability and convenience may be a constant. You have been warned.

Be aware of two important points:

  • Not all Perl programs have to be portable

    There is no reason you should not use Perl as a language to glue Unix tools together, or to prototype a Macintosh application, or to manage the Windows registry. If it makes no sense to aim for portability for one reason or another in a given program, then don't bother.

  • Nearly all of Perl already is portable

    Don't be fooled into thinking that it is hard to create portable Perl code. It isn't. Perl tries its level-best to bridge the gaps between what's available on different platforms, and all the means available to use those features. Thus almost all Perl code runs on any machine without modification. But there are some significant issues in writing portable code, and this document is entirely about those issues.

Here's the general rule: When you approach a task commonly done using a whole range of platforms, think about writing portable code. That way, you don't sacrifice much by way of the implementation choices you can avail yourself of, and at the same time you can give your users lots of platform choices. On the other hand, when you have to take advantage of some unique feature of a particular platform, as is often the case with systems programming (whether for Unix, Windows, VMS, etc.), consider writing platform-specific code.

When the code will run on only two or three operating systems, you may need to consider only the differences of those particular systems. The important thing is to decide where the code will run and to be deliberate in your decision.

The material below is separated into three main sections: main issues of portability (ISSUES), platform-specific issues (PLATFORMS), and built-in perl functions that behave differently on various ports (FUNCTION IMPLEMENTATIONS).

This information should not be considered complete; it includes possibly transient information about idiosyncrasies of some of the ports, almost all of which are in a state of constant evolution. Thus, this material should be considered a perpetual work in progress (<IMG SRC="yellow_sign.gif" ALT="Under Construction"> ).

ISSUES

Newlines

In most operating systems, lines in files are terminated by newlines. Just what is used as a newline may vary from OS to OS. Unix traditionally uses \012 , one type of DOSish I/O uses \015\012 , and Mac OS uses \015 .

Perl uses \n to represent the "logical" newline, where what is logical may depend on the platform in use. In MacPerl, \n always means \015 . In DOSish perls, \n usually means \012 , but when accessing a file in "text" mode, perl uses the :crlf layer that translates it to (or from) \015\012 , depending on whether you're reading or writing. Unix does the same thing on ttys in canonical mode. \015\012 is commonly referred to as CRLF.

To trim trailing newlines from text lines use chomp(). With default settings that function looks for a trailing \n character and thus trims in a portable way.

When dealing with binary files (or text files in binary mode) be sure to explicitly set $/ to the appropriate value for your file format before using chomp().

Because of the "text" mode translation, DOSish perls have limitations in using seek and tell on a file accessed in "text" mode. Stick to seek-ing to locations you got from tell (and no others), and you are usually free to use seek and tell even in "text" mode. Using seek or tell or other file operations may be non-portable. If you use binmode on a file, however, you can usually seek and tell with arbitrary values in safety.
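As a minimal sketch of the binmode advice above, the following writes a few bytes in binary mode and then seeks to an arbitrary offset. The scratch file and its contents are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; any seekable file would do.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh);                          # no newline translation on any platform
print {$fh} "abc\015\012def";          # 8 bytes, written exactly as given
close($fh) or die "close failed: $!";

open(my $in, '<', $path) or die "cannot open $path: $!";
binmode($in);
seek($in, 5, 0) or die "seek failed: $!";   # arbitrary byte offset from the start
my $pos  = tell($in);
my $rest = do { local $/; <$in> };          # read everything after the offset
close($in);
```

Without binmode, a DOSish perl would translate the CRLF pair on the way in and out, so byte offsets computed by hand would no longer be trustworthy.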

A common misconception in socket programming is that \n eq \012 everywhere. When using protocols such as common Internet protocols, \012 and \015 are called for specifically, and the values of the logical \n and \r (carriage return) are not reliable.

    print SOCKET "Hi there, client!\r\n";      # WRONG
    print SOCKET "Hi there, client!\015\012";  # RIGHT

However, using \015\012 (or \cM\cJ , or \x0D\x0A ) can be tedious and unsightly, as well as confusing to those maintaining the code. As such, the Socket module supplies the Right Thing for those who want it.

    use Socket qw(:DEFAULT :crlf);
    print SOCKET "Hi there, client!$CRLF";     # RIGHT

When reading from a socket, remember that the default input record separator $/ is \n , but robust socket code will recognize either \012 or \015\012 as end of line:

    while (<SOCKET>) {
        # ...
    }

Because both CRLF and LF end in LF, the input record separator can be set to LF and any CR stripped later. Better to write:

    use Socket qw(:DEFAULT :crlf);
    local($/) = LF;      # not needed if $/ is already \012
    while (<SOCKET>) {
        s/$CR?$LF/\n/;   # not sure if socket uses LF or CRLF, OK
        # s/\015?\012/\n/; # same thing
    }

This example is preferred over the previous one--even for Unix platforms--because now any \015 's (\cM 's) are stripped out (and there was much rejoicing).

Similarly, functions that return text data--such as a function that fetches a web page--should sometimes translate newlines before returning the data, if they've not yet been translated to the local newline representation. A single line of code will often suffice:

    $data =~ s/\015?\012/\n/g;
    return $data;

Some of this may be confusing. Here's a handy reference to the ASCII CR and LF characters. You can print it out and stick it in your wallet.

    LF  eq  \012  eq  \x0A  eq  \cJ  eq  chr(10)  eq  ASCII 10
    CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  ASCII 13

         | Unix | DOS  | Mac  |
    ---------------------------
    \n   |  LF  |  LF  |  CR  |
    \r   |  CR  |  CR  |  LF  |
    \n * |  LF  | CRLF |  CR  |
    \r * |  CR  |  CR  |  LF  |
    ---------------------------
    * text-mode STDIO

The Unix column assumes that you are not accessing a serial line (like a tty) in canonical mode. If you are, then CR on input becomes "\n", and "\n" on output becomes CRLF.

These are just the most common definitions of \n and \r in Perl. There may well be others. For example, on an EBCDIC implementation such as z/OS (OS/390) or OS/400 (using the ILE, the PASE is ASCII-based) the above material is similar to "Unix" but the code numbers change:

    LF  eq  \025  eq  \x15  eq  \cU  eq  chr(21)  eq  CP-1047 21
    LF  eq  \045  eq  \x25  eq           chr(37)  eq  CP-0037 37
    CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  CP-1047 13
    CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  CP-0037 13

         | z/OS | OS/400 |
    ----------------------
    \n   |  LF  |  LF    |
    \r   |  CR  |  CR    |
    \n * |  LF  |  LF    |
    \r * |  CR  |  CR    |
    ----------------------
    * text-mode STDIO

Numbers endianness and Width

Different CPUs store integers and floating point numbers in different orders (called endianness) and widths (32-bit and 64-bit being the most common today). This affects your programs when they attempt to transfer numbers in binary format from one CPU architecture to another, usually either "live" via network connection, or by storing the numbers to secondary storage such as a disk file or tape.

Conflicting storage orders make an utter mess out of the numbers. If a little-endian host (Intel, VAX) stores 0x12345678 (305419896 in decimal), a big-endian host (Motorola, Sparc, PA) reads it as 0x78563412 (2018915346 in decimal). Alpha and MIPS can be either: Digital/Compaq used/uses them in little-endian mode; SGI/Cray uses them in big-endian mode. To avoid this problem in network (socket) connections use the pack and unpack formats n and N , the "network" orders. These are guaranteed to be portable.

As of perl 5.10.0, you can also use the > and < modifiers to force big- or little-endian byte-order. This is useful if you want to store signed integers or 64-bit integers, for example.
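The following sketch illustrates the network-order format and the byte-order modifiers just described. The values are arbitrary examples, and the > and < modifiers assume perl 5.10.0 or later.

```perl
use strict;
use warnings;

# "N" is an unsigned 32-bit integer in network (big-endian) order.
my $wire  = pack("N", 0x12345678);
my $bytes = unpack("H*", $wire);           # hex dump, high nybble first

# "l>" is a signed 32-bit integer forced to big-endian byte order.
my $signed     = pack("l>", -2);
my $round_trip = unpack("l>", $signed);

# "<" forces little-endian byte order instead.
my $little = unpack("H*", pack("L<", 0x12345678));
```

Because the byte order is spelled out in the template, the packed strings are the same on every host, whatever its native endianness.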

You can explore the endianness of your platform by unpacking a data structure packed in native format such as:

    print unpack("h*", pack("s2", 1, 2)), "\n";
    # '10002000' on e.g. Intel x86 or Alpha 21064 in little-endian mode
    # '00100020' on e.g. Motorola 68040

If you need to distinguish between endian architectures you could use either of the variables set like so:

    $is_big_endian    = unpack("h*", pack("s", 1)) =~ /01/;
    $is_little_endian = unpack("h*", pack("s", 1)) =~ /^1/;

Differing widths can cause truncation even between platforms of equal endianness. The platform of shorter width loses the upper parts of the number. There is no good solution for this problem except to avoid transferring or storing raw binary numbers.

One can circumnavigate both these problems in two ways. Either transfer and store numbers always in text format, instead of raw binary, or else consider using modules like Data::Dumper and Storable (included as of perl 5.8). Keeping all data as text significantly simplifies matters.

The v-strings are portable only up to v2147483647 (0x7FFFFFFF); that's how far EBCDIC, or more precisely UTF-EBCDIC, will go.

Files and Filesystems

Most platforms these days structure files in a hierarchical fashion. So, it is reasonably safe to assume that all platforms support the notion of a "path" to uniquely identify a file on the system. How that path is really written, though, differs considerably.

Although similar, file path specifications differ between Unix, Windows, Mac OS, OS/2, VMS, VOS, RISC OS, and probably others. Unix, for example, is one of the few OSes that has the elegant idea of a single root directory.

DOS, OS/2, VMS, VOS, and Windows can work similarly to Unix with / as path separator, or in their own idiosyncratic ways (such as having several root directories and various "unrooted" device files such as NIL: and LPT:).

Mac OS 9 and earlier used : as a path separator instead of /.

The filesystem may support neither hard links (link) nor symbolic links (symlink, readlink, lstat).

The filesystem may support neither access timestamp nor change timestamp (meaning that about the only portable timestamp is the modification timestamp), or one second granularity of any timestamps (e.g. the FAT filesystem limits the time granularity to two seconds).

The "inode change timestamp" (the -C filetest) may really be the "creation timestamp" (which it is not in Unix).

VOS perl can emulate Unix filenames with / as path separator. The native pathname characters greater-than, less-than, number-sign, and percent-sign are always accepted.

RISC OS perl can emulate Unix filenames with / as path separator, or go native and use . for path separator and : to signal filesystems and disk names.

Don't assume Unix filesystem access semantics: that read, write, and execute are all the permissions there are, and even if they exist, that their semantics (for example what do r, w, and x mean on a directory) are the Unix ones. The various Unix/POSIX compatibility layers usually try to make interfaces like chmod() work, but sometimes there simply is no good mapping.

If all this is intimidating, have no (well, maybe only a little) fear. There are modules that can help. The File::Spec modules provide methods to do the Right Thing on whatever platform happens to be running the program.

    use File::Spec::Functions;
    chdir(updir());                # go up one directory
    my $file = catfile(curdir(), 'temp', 'file.txt');
    # on Unix and Win32, './temp/file.txt'
    # on Mac OS Classic, ':temp:file.txt'
    # on VMS, '[.temp]file.txt'

File::Spec is available in the standard distribution as of version 5.004_05. File::Spec::Functions is only in File::Spec 0.7 and later, and some versions of perl come with version 0.6. If File::Spec is not updated to 0.7 or later, you must use the object-oriented interface from File::Spec (or upgrade File::Spec).
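For installations that lack File::Spec::Functions, the same operations can be written with the object-oriented interface. This is only a sketch; the directory and file names are the same illustrative ones used above.

```perl
use strict;
use warnings;
use File::Spec;

# Class-method equivalents of updir(), curdir() and catfile().
my $up   = File::Spec->updir();
my $here = File::Spec->curdir();
my $file = File::Spec->catfile($here, 'temp', 'file.txt');
```

The exact strings returned depend on the platform, which is the point of going through File::Spec rather than joining path components by hand.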

In general, production code should not have file paths hardcoded. Making them user-supplied or read from a configuration file is better, keeping in mind that file path syntax varies on different machines.

This is especially noticeable in scripts like Makefiles and test suites, which often assume / as a path separator for subdirectories.

Also of use is File::Basename from the standard distribution, which splits a pathname into pieces (base filename, full path to directory, and file suffix).
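A short sketch of File::Basename in use follows. The pathname is a made-up Unix-style example, and the directory part in particular is reported according to the syntax rules of the platform the code runs on.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# fileparse() returns (base filename, directory, suffix); the trailing
# arguments are patterns describing what counts as a suffix.
my ($name, $dir, $suffix) = fileparse('/home/user/report.txt', qr/\.[^.]*/);
```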

Even when on a single platform (if you can call Unix a single platform), remember not to count on the existence or the contents of particular system-specific files or directories, like /etc/passwd, /etc/sendmail.conf, /etc/resolv.conf, or even /tmp/. For example, /etc/passwd may exist but not contain the encrypted passwords, because the system is using some form of enhanced security. Or it may not contain all the accounts, because the system is using NIS. If code does need to rely on such a file, include a description of the file and its format in the code's documentation, then make it easy for the user to override the default location of the file.

Don't assume a text file will end with a newline. They should, but people forget.

Do not have two files or directories of the same name with different case, like test.pl and Test.pl, as many platforms have case-insensitive (or at least case-forgiving) filenames. Also, try not to have non-word characters (except for .) in the names, and keep them to the 8.3 convention, for maximum portability, onerous a burden though this may appear.

Likewise, when using the AutoSplit module, try to keep your functions to 8.3 naming and case-insensitive conventions; or, at the least, make it so the resulting files have a unique (case-insensitively) first 8 characters.

Whitespace in filenames is tolerated on most systems, but not all, and even on systems where it might be tolerated, some utilities might become confused by such whitespace.

Many systems (DOS, VMS ODS-2) cannot have more than one . in their filenames.

Don't assume > won't be the first character of a filename. Always use < explicitly to open a file for reading, or even better, use the three-arg version of open, unless you want the user to be able to specify a pipe open.

    open(my $fh, '<', $existing_file) or die $!;

If filenames might use strange characters, it is safest to open them with sysopen instead of open. open is magic and can translate characters like >, < , and |, which may be the wrong thing to do. (Sometimes, though, it's the right thing.) Three-arg open can also help protect against this translation in cases where it is undesirable.
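As a sketch of the sysopen approach, the helper below (slurp_raw is a hypothetical name) opens a path exactly as given, with no interpretation of its characters. The file name with a leading space is made up for illustration; two-arg open would strip that space.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY O_WRONLY O_CREAT O_TRUNC);
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# Read a whole file, treating $path as a literal name.
sub slurp_raw {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDONLY) or die "cannot open '$path': $!";
    my $data = do { local $/; <$fh> };
    close($fh);
    return $data;
}

my $dir  = tempdir(CLEANUP => 1);
my $path = catfile($dir, ' data.txt');   # hypothetical name with a leading space

sysopen(my $out, $path, O_WRONLY | O_CREAT | O_TRUNC, 0666)
    or die "cannot create '$path': $!";
print {$out} "hello\n";
close($out) or die "close failed: $!";

my $content = slurp_raw($path);
```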

Don't use : as a part of a filename since many systems use that for their own semantics (Mac OS Classic for separating pathname components, many networking schemes and utilities for separating the nodename and the pathname, and so on). For the same reasons, avoid @ , ; and |.

Don't assume that in pathnames you can collapse two leading slashes // into one: some networking and clustering filesystems have special semantics for that. Let the operating system sort it out.

The portable filename characters as defined by ANSI C are

    a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    0 1 2 3 4 5 6 7 8 9
    . _ -

and the "-" shouldn't be the first character. If you want to be hypercorrect, stay case-insensitive and within the 8.3 naming convention (all the files and directories have to be unique within one directory if their names are lowercased and truncated to eight characters before the ., if any, and to three characters after the ., if any). (And do not use .s in directory names.)

System Interaction

Not all platforms provide a command line. These are usually platforms that rely primarily on a Graphical User Interface (GUI) for user interaction. A program requiring a command line interface might not work everywhere. This is probably for the user of the program to deal with, so don't stay up late worrying about it.

Some platforms can't delete or rename files held open by the system; this limitation may also apply to changing filesystem metainformation like file permissions or owners. Remember to close files when you are done with them. Don't unlink or rename an open file. Don't tie or open a file already tied or opened; untie or close it first.

Don't open the same file more than once at a time for writing, as some operating systems put mandatory locks on such files.

Don't assume that write/modify permission on a directory gives the right to add or delete files/directories in that directory. That is filesystem specific: in some filesystems you need write/modify permission also (or even just) in the file/directory itself. In some filesystems (AFS, DFS) the permission to add/delete directory entries is a completely separate permission.

Don't assume that a single unlink completely gets rid of the file: some platforms (most notably VMS) have versioned filesystems, and unlink() removes only the most recent version (it doesn't remove all the versions because by default the native tools on those platforms remove just the most recent version, too). The portable idiom to remove all the versions of a file is

    1 while unlink "file";

This will terminate if the file is undeletable for some reason (protected, not there, and so on).

Don't count on a specific environment variable existing in %ENV . Don't count on %ENV entries being case-sensitive, or even case-preserving. Don't try to clear %ENV by saying %ENV = (); , or, if you really have to, make it conditional on $^O ne 'VMS' since in VMS the %ENV table is much more than a per-process key-value string table.

On VMS, some entries in the %ENV hash are dynamically created when their key is used on a read if they did not previously exist. The values for $ENV{HOME} , $ENV{TERM} , $ENV{PATH} , and $ENV{USER} , are known to be dynamically generated. The specific names that are dynamically generated may vary with the version of the C library on VMS, and more may exist than is documented.

On VMS by default, changes to the %ENV hash are persistent after the process exits. This can cause unintended issues.

Don't count on signals or %SIG for anything.

Don't count on filename globbing. Use opendir, readdir, and closedir instead.
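A minimal sketch of replacing a glob such as *.txt with opendir, readdir, and closedir is shown below. The helper name list_matching and the file names are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile no_upwards);

# Return the sorted names in $dir that match $pattern, skipping the
# current- and parent-directory entries.
sub list_matching {
    my ($dir, $pattern) = @_;
    opendir(my $dh, $dir) or die "cannot open directory '$dir': $!";
    my @names = sort grep { /$pattern/ } no_upwards(readdir($dh));
    closedir($dh);
    return @names;
}

my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt c.log)) {
    open(my $fh, '>', catfile($dir, $name)) or die "cannot create $name: $!";
    close($fh);
}
my @txt = list_matching($dir, qr/\.txt\z/);
```

Note that readdir returns bare names, so they must be joined to the directory with catfile before being opened.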

Don't count on per-program environment variables, or per-program current directories.

Don't count on specific values of $! , neither the numeric values nor, especially, the string values. Users may switch their locales, causing error messages to be translated into their languages. If you can trust a POSIXish environment, you can portably use the symbols defined by the Errno module, like ENOENT. And don't trust the value of $! at all except immediately after a failed system call.
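The sketch below shows a symbolic comparison with Errno, assuming a POSIXish environment as described above. The path is deliberately one that does not exist.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

my $dir     = tempdir(CLEANUP => 1);
my $missing = catfile($dir, 'no-such-file');   # hypothetical missing file

my $not_found = 0;
if (!open(my $fh, '<', $missing)) {
    # Inspect $! immediately after the failed call, and compare it
    # symbolically rather than by literal number or message text.
    $not_found = 1 if $! == ENOENT;
}
```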

Command names versus file pathnames

Don't assume that the name used to invoke a command or program with system or exec can also be used to test for the existence of the file that holds the executable code for that command or program. First, many systems have "internal" commands that are built-in to the shell or OS and while these commands can be invoked, there is no corresponding file. Second, some operating systems (e.g., Cygwin, DJGPP, OS/2, and VOS) have required suffixes for executable files; these suffixes are generally permitted on the command name but are not required. Thus, a command like "perl" might exist in a file named "perl", "perl.exe", or "perl.pm", depending on the operating system. The variable "_exe" in the Config module holds the executable suffix, if any. Third, the VMS port carefully sets up $^X and $Config{perlpath} so that no further processing is required. This is just as well, because the matching regular expression used below would then have to deal with a possible trailing version number in the VMS file name.

To convert $^X to a file pathname, taking account of the requirements of the various operating system possibilities, say:

    use Config;
    my $thisperl = $^X;
    if ($^O ne 'VMS') {
        $thisperl .= $Config{_exe} unless $thisperl =~ m/$Config{_exe}$/i;
    }

To convert $Config{perlpath} to a file pathname, say:

    use Config;
    my $thisperl = $Config{perlpath};
    if ($^O ne 'VMS') {
        $thisperl .= $Config{_exe} unless $thisperl =~ m/$Config{_exe}$/i;
    }

Networking

Don't assume that you can reach the public Internet.

Don't assume that there is only one way to get through firewalls to the public Internet.

Don't assume that you can reach the outside world through any other port than 80, or some web proxy. ftp is blocked by many firewalls.

Don't assume that you can send email by connecting to the local SMTP port.

Don't assume that you can reach yourself or any node by the name 'localhost'. The same goes for '127.0.0.1'. You will have to try both.

Don't assume that the host has only one network card, or that it can't bind to many virtual IP addresses.

Don't assume a particular network device name.

Don't assume a particular set of ioctl()s will work.

Don't assume that you can ping hosts and get replies.

Don't assume that any particular port (service) will respond.

Don't assume that Sys::Hostname (or any other API or command) returns either a fully qualified hostname or a non-qualified hostname: it all depends on how the system had been configured. Also remember that for things such as DHCP and NAT, the hostname you get back might not be very useful.

All the above "don't"s may look daunting, and they are, but the key is to degrade gracefully if one cannot reach the particular network service one wants. Croaking or hanging does not look very professional.

Interprocess Communication (IPC)

In general, don't directly access the system in code meant to be portable. That means, no system, exec, fork, pipe, `` , qx//, open with a |, nor any of the other things that make being a perl hacker worth being.

Commands that launch external processes are generally supported on most platforms (though many of them do not support any type of forking). The problem with using them arises from what you invoke them on. External tools are often named differently on different platforms, may not be available in the same location, might accept different arguments, can behave differently, and often present their results in a platform-dependent way. Thus, you should seldom depend on them to produce consistent results. (Then again, if you're calling netstat -a, you probably don't expect it to run on both Unix and CP/M.)

One especially common bit of Perl code is opening a pipe to sendmail:

    open(MAIL, '|/usr/lib/sendmail -t')
        or die "cannot fork sendmail: $!";

This is fine for systems programming when sendmail is known to be available. But it is not fine for many non-Unix systems, and even some Unix systems that may not have sendmail installed. If a portable solution is needed, see the various distributions on CPAN that deal with it. Mail::Mailer and Mail::Send in the MailTools distribution are commonly used, and provide several mailing methods, including mail, sendmail, and direct SMTP (via Net::SMTP) if a mail transfer agent is not available. Mail::Sendmail is a standalone module that provides simple, platform-independent mailing.

The Unix System V IPC (msg*(), sem*(), shm*() ) is not available even on all Unix platforms.

Do not use either the bare result of pack("C4", 10, 20, 30, 40) or bare v-strings (such as v10.20.30.40 ) to represent IPv4 addresses: both forms just pack the four bytes into network order. That this would be equal to the C language in_addr struct (which is what the socket code internally uses) is not guaranteed. To be portable use the routines of the Socket extension, such as inet_aton() , inet_ntoa() , and sockaddr_in() .
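A brief sketch of the Socket routines follows. The address and port are arbitrary example values; the packed forms should be treated as opaque.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa pack_sockaddr_in unpack_sockaddr_in);

my $packed = inet_aton('10.20.30.40');   # opaque packed address
my $text   = inet_ntoa($packed);         # back to dotted-quad text

# Build and take apart a socket address; port 80 is only an example.
my $sockaddr        = pack_sockaddr_in(80, $packed);
my ($port, $ipaddr) = unpack_sockaddr_in($sockaddr);
```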

The rule of thumb for portable code is: Do it all in portable Perl, or use a module (that may internally implement it with platform-specific code, but expose a common interface).

External Subroutines (XS)

XS code can usually be made to work with any platform, but dependent libraries, header files, etc., might not be readily available or portable, or the XS code itself might be platform-specific, just as Perl code might be. If the libraries and headers are portable, then it is normally reasonable to make sure the XS code is portable, too.

A different type of portability issue arises when writing XS code: availability of a C compiler on the end-user's system. C brings with it its own portability issues, and writing XS code will expose you to some of those. Writing purely in Perl is an easier way to achieve portability.

Standard Modules

In general, the standard modules work across platforms. Notable exceptions are the CPAN module (which currently makes connections to external programs that may not be available), platform-specific modules (like ExtUtils::MM_VMS), and DBM modules.

There is no one DBM module available on all platforms. SDBM_File and the others are generally available on all Unix and DOSish ports, but not in MacPerl, where only NDBM_File and DB_File are available.

The good news is that at least some DBM module should be available, and AnyDBM_File will use whichever module it can find. Of course, then the code needs to be fairly strict, dropping to the greatest common factor (e.g., not exceeding 1K for each record), so that it will work with any DBM module. See AnyDBM_File for more details.
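As a minimal sketch, tying a hash through AnyDBM_File looks like this. The database name is hypothetical, and the code assumes at least one DBM implementation was built with the local perl.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_RDWR);
use AnyDBM_File;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

my $dir  = tempdir(CLEANUP => 1);
my $base = catfile($dir, 'cache');       # hypothetical database name

my %db;
tie(%db, 'AnyDBM_File', $base, O_CREAT | O_RDWR, 0644)
    or die "cannot tie $base: $!";
$db{greeting} = 'hello';                 # keep keys and values small
my $value = $db{greeting};
untie %db;
```

The files actually created on disk (and their suffixes) depend on which DBM module was picked, so code should not rely on their names.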

Time and Date

The system's notion of time of day and calendar date is controlled in widely different ways. Don't assume the timezone is stored in $ENV{TZ} , and even if it is, don't assume that you can control the timezone through that variable. Don't assume anything about the three-letter timezone abbreviations (for example, that MST means Mountain Standard Time; it has also been known to stand for Moscow Standard Time). If you need to use timezones, express them in some unambiguous format like the exact number of minutes offset from UTC, or the POSIX timezone format.

Don't assume that the epoch starts at 00:00:00, January 1, 1970, because that is OS- and implementation-specific. It is better to store a date in an unambiguous representation. The ISO 8601 standard defines YYYY-MM-DD as the date format, or YYYY-MM-DDTHH:MM:SS (that's a literal "T" separating the date from the time). Please do use the ISO 8601 instead of making us guess what date 02/03/04 might be. ISO 8601 even sorts nicely as-is. A text representation (like "1987-12-18") can be easily converted into an OS-specific value using a module like Date::Parse. An array of values, such as those returned by localtime, can be converted to an OS-specific representation using Time::Local.
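The sketch below round-trips an ISO 8601 date using only core modules (POSIX and Time::Local rather than Date::Parse, which is a separate CPAN distribution). The helper names iso8601_utc and parse_iso_date are made up for illustration, and times are handled in UTC only.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Format a time value as an ISO 8601 timestamp in UTC.
sub iso8601_utc {
    my ($time) = @_;
    return strftime('%Y-%m-%dT%H:%M:%S', gmtime($time));
}

# Convert "YYYY-MM-DD" into the local system's time value for midnight UTC.
sub parse_iso_date {
    my ($date) = @_;
    my ($y, $m, $d) = $date =~ /\A(\d{4})-(\d{2})-(\d{2})\z/
        or die "not an ISO 8601 date: $date";
    return timegm(0, 0, 0, $d, $m - 1, $y);
}

my $stamp = iso8601_utc(parse_iso_date('1987-12-18'));
```

Only the text form is stored or exchanged; the numeric time value stays internal to the running program, so its epoch does not matter.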

When calculating specific times, such as for tests in time or date modules, it may be appropriate to calculate an offset for the epoch.

    require Time::Local;
    my $offset = Time::Local::timegm(0, 0, 0, 1, 0, 70);

The value for $offset in Unix will be 0 , but in Mac OS Classic will be some large number. $offset can then be added to a Unix time value to get what should be the proper value on any system.

Character sets and character encoding

Assume very little about character sets.

Assume nothing about numerical values (ord, chr) of characters. Do not use explicit code point ranges (like \xHH-\xHH); use for example symbolic character classes like [:print:].
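For instance, a count of printable characters can be written with a symbolic class instead of a numeric range. The helper name count_printable is hypothetical.

```perl
use strict;
use warnings;

# Count printable characters using a POSIX class rather than a
# hard-coded range such as \x20-\x7E.
sub count_printable {
    my ($text) = @_;
    my $count = () = $text =~ /[[:print:]]/g;
    return $count;
}

my $printable = count_printable("tab\there");   # the tab is not printable
```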

Do not assume that the alphabetic characters are encoded contiguously (in the numeric sense). There may be gaps.

Do not assume anything about the ordering of the characters. The lowercase letters may come before or after the uppercase letters; the lowercase and uppercase may be interlaced so that both "a" and "A" come before "b"; the accented and other international characters may be interlaced so that ä comes before "b".

Internationalisation

If you may assume POSIX (a rather large assumption), you may read more about the POSIX locale system from perllocale. The locale system at least attempts to make things a little bit more portable, or at least more convenient and native-friendly for non-English users. The system affects character sets and encoding, and date and time formatting--amongst other things.

If you really want to be international, you should consider Unicode. See perluniintro and perlunicode for more information.

If you want to use non-ASCII bytes (outside the bytes 0x00..0x7f) in the "source code" of your code, to be portable you have to be explicit about what bytes they are. Someone might for example be using your code under a UTF-8 locale, in which case random native bytes might be illegal ("Malformed UTF-8 ...") This means that for example embedding ISO 8859-1 bytes beyond 0x7f into your strings might cause trouble later. If the bytes are native 8-bit bytes, you can use the bytes pragma. If the bytes are in a string (regular expression being a curious string), you can often also use the \xHH notation instead of embedding the bytes as-is. If you want to write your code in UTF-8, you can use the utf8 pragma.

System Resources

If your code is destined for systems with severely constrained (or missing!) virtual memory systems then you want to be especially mindful of avoiding wasteful constructs such as:

    my @lines = <$very_large_file>;            # bad

    while (<$fh>) {$file .= $_}                # sometimes bad
    my $file = join('', <$fh>);                # better

The last two constructs may appear unintuitive to most people. The first of them repeatedly grows a string, whereas the second allocates a large chunk of memory in one go. On some systems, the second is more efficient than the first.
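When the whole file is not needed at once, processing one line at a time keeps memory use bounded. The sketch below uses a hypothetical helper, count_lines, and an in-memory filehandle standing in for a very large file.

```perl
use strict;
use warnings;

# Count lines and characters while holding only one line in memory.
sub count_lines {
    my ($fh) = @_;
    my ($lines, $chars) = (0, 0);
    while (my $line = <$fh>) {
        $lines++;
        $chars += length $line;
    }
    return ($lines, $chars);
}

# An in-memory filehandle stands in for a very large file here.
my $text = "one\ntwo\nthree\n";
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
my ($lines, $chars) = count_lines($fh);
close($fh);
```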

Security

Most multi-user platforms provide basic levels of security, usually implemented at the filesystem level. Some, however, unfortunately do not. Thus the notion of user id, or "home" directory, or even the state of being logged-in, may be unrecognizable on many platforms. If you write programs that are security-conscious, it is usually best to know what type of system you will be running under so that you can write code explicitly for that platform (or class of platforms).

Don't assume the Unix filesystem access semantics: the operating system or the filesystem may be using some ACL systems, which are richer languages than the usual rwx. Even if the rwx exist, their semantics might be different.

(From a security viewpoint, testing for permissions before attempting to do something is silly anyway: if one tries this, there is potential for race conditions. Someone or something might change the permissions between the permissions check and the actual operation. Just try the operation.)
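A sketch of "just try the operation" follows. The helper append_line and the file names are made up; it returns nothing on success and an error message on failure, instead of testing -w beforehand.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# Attempt the operation and report failure, rather than checking
# permissions first and racing against a change.
sub append_line {
    my ($path, $line) = @_;
    open(my $fh, '>>', $path) or return "cannot append to '$path': $!";
    print {$fh} $line, "\n";
    close($fh) or return "cannot close '$path': $!";
    return;                              # empty return signals success
}

my $dir   = tempdir(CLEANUP => 1);
my $ok    = append_line(catfile($dir, 'log.txt'), 'started');
my $error = append_line(catfile($dir, 'missing-dir', 'log.txt'), 'started');
```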

Don't assume the Unix user and group semantics: especially, don't expect the $< and $> (or the $( and $) ) to work for switching identities (or memberships).

Don't assume set-uid and set-gid semantics. (And even if you do, think twice: set-uid and set-gid are a known can of security worms.)

Style

For those times when it is necessary to have platform-specific code, consider keeping the platform-specific code in one place, making porting to other platforms easier. Use the Config module and the special variable $^O to differentiate platforms, as described in PLATFORMS.
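One way to keep platform-specific choices in one place is a small lookup keyed on $^O, sketched below. The table and the helper null_device are purely illustrative (File::Spec already provides devnull for this particular case), and the entries are assumptions about the named platforms.

```perl
use strict;
use warnings;

# Hypothetical per-platform settings, kept together in one table.
my %null_device = (
    MSWin32 => 'NUL',
    VMS     => 'NL:',
);

# Look up the setting for the given OS name, defaulting to the
# running platform and to the Unix convention.
sub null_device {
    my ($os) = @_;
    $os = $^O unless defined $os;
    return exists $null_device{$os} ? $null_device{$os} : '/dev/null';
}

my $here = null_device();                # value for the running platform
```

Porting to a new platform then means adding one table entry rather than hunting for scattered conditionals.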

Be careful in the tests you supply with your module or programs. Module code may be fully portable, but its tests might not be. This often happens when tests spawn off other processes or call external programs to aid in the testing, or when (as noted above) the tests assume certain things about the filesystem and paths. Be careful not to depend on a specific output style for errors, such as when checking $! after a failed system call. Using $! for anything other than displaying it as output is doubtful (though see the Errno module for testing reasonably portably for error value). Some platforms expect a certain output format, and Perl on those platforms may have been adjusted accordingly. Most specifically, don't anchor a regex when testing an error value.

CPAN Testers

Modules uploaded to CPAN are tested by a variety of volunteers on different platforms. These CPAN testers are notified by mail of each new upload, and reply to the list with PASS, FAIL, NA (not applicable to this platform), or UNKNOWN (unknown), along with any relevant notations.

The purpose of the testing is twofold: one, to help developers fix any problems in their code that crop up because of lack of testing on other platforms; two, to provide users with information about whether a given module works on a given platform.

Also see:

PLATFORMS

Perl is built with a $^O variable that indicates the operating system it was built on. This was implemented to help speed up code that would otherwise have to use Config and use the value of $Config{osname} . Of course, to get more detailed information about the system, looking into %Config is certainly recommended.

%Config cannot always be trusted, however, because it was built at compile time. If perl was built in one place, then transferred elsewhere, some values may be wrong. The values may even have been edited after the fact.
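
A small sketch of the two sources of platform information discussed above: $^O is available without loading anything, while osname and archname come from the Config module. The helper name platform_summary is hypothetical.

```perl
use strict;
use warnings;
use Config;    # exports %Config

# Hypothetical helper: gather the identifiers discussed above in one place.
sub platform_summary {
    return {
        os       => $^O,                  # built into the interpreter
        osname   => $Config{osname},      # recorded when perl was built
        archname => $Config{archname},    # recorded when perl was built
    };
}

my $p = platform_summary();
printf "%s (osname=%s, archname=%s)\n", @{$p}{qw(os osname archname)};
```

Since the %Config values were fixed at build time, a script that needs certainty about the running system should treat them as hints rather than facts.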

Unix

Perl works on a bewildering variety of Unix and Unix-like platforms (see e.g. most of the files in the hints/ directory in the source code kit). On most of these systems, the value of $^O (hence $Config{'osname'}, too) is determined either by lowercasing and stripping punctuation from the first field of the string returned by typing uname -a (or a similar command) at the shell prompt or by testing the file system for the presence of uniquely named files such as a kernel or header file. Here, for example, are a few of the more popular Unix flavors:

    uname          $^O        $Config{'archname'}
    ---------------------------------------------
    AIX            aix        aix
    BSD/OS         bsdos      i386-bsdos
    Darwin         darwin     darwin
    dgux           dgux       AViiON-dgux
    DYNIX/ptx      dynixptx   i386-dynixptx
    FreeBSD        freebsd    freebsd-i386
    Haiku          haiku      BePC-haiku
    Linux          linux      arm-linux
    Linux          linux      i386-linux
    Linux          linux      i586-linux
    Linux          linux      ppc-linux
    HP-UX          hpux       PA-RISC1.1
    IRIX           irix       irix
    Mac OS X       darwin     darwin
    NeXT 3         next       next-fat
    NeXT 4         next       OPENSTEP-Mach
    openbsd        openbsd    i386-openbsd
    OSF1           dec_osf    alpha-dec_osf
    reliantunix-n  svr4       RM400-svr4
    SCO_SV         sco_sv     i386-sco_sv
    SINIX-N        svr4       RM400-svr4
    sn4609         unicos     CRAY_C90-unicos
    sn6521         unicosmk   t3e-unicosmk
    sn9617         unicos     CRAY_J90-unicos
    SunOS          solaris    sun4-solaris
    SunOS          solaris    i86pc-solaris
    SunOS4         sunos      sun4-sunos

Because the value of $Config{archname} may depend on the hardware architecture, it can vary more than the value of $^O.

DOS and Derivatives

Perl has long been ported to Intel-style microcomputers running under systems like PC-DOS, MS-DOS, OS/2, and most Windows platforms you can bring yourself to mention (except for Windows CE, if you count that). Users familiar with COMMAND.COM or CMD.EXE style shells should be aware that each of these file specifications may have subtle differences:

    my $filespec0 = "c:/foo/bar/file.txt";
    my $filespec1 = "c:\\foo\\bar\\file.txt";
    my $filespec2 = 'c:\foo\bar\file.txt';
    my $filespec3 = 'c:\\foo\\bar\\file.txt';

System calls accept either / or \ as the path separator. However, many command-line utilities of DOS vintage treat / as the option prefix, so may get confused by filenames containing /. Aside from calling any external programs, / will work just fine, and probably better, as it is more consistent with popular usage, and avoids the problem of remembering what to backwhack and what not to.
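
One way to sidestep the question of which separator to write is to let the core File::Spec module assemble paths from their components. This is a sketch only; the wrapper name portable_path is hypothetical, and the separator that appears in the result depends on the platform the code runs on.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical wrapper: join directory and file components using the
# conventions of the platform perl is running on.
sub portable_path {
    my @components = @_;
    return File::Spec->catfile(@components);
}

my $path = portable_path('foo', 'bar', 'file.txt');
print "$path\n";

# splitdir reverses the join, again using local conventions.
print join(' | ', File::Spec->splitdir($path)), "\n";
```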

The DOS FAT filesystem can accommodate only "8.3" style filenames. Under the "case-insensitive, but case-preserving" HPFS (OS/2) and NTFS (NT) filesystems you may have to be careful about case returned with functions like readdir or used with functions like open or opendir.

DOS also treats several filenames as special, such as AUX, PRN, NUL, CON, COM1, LPT1, LPT2, etc. Unfortunately, sometimes these filenames won't even work if you include an explicit directory prefix. It is best to avoid such filenames, if you want your code to be portable to DOS and its derivatives. It's hard to know what these all are, unfortunately.
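
A portable program could screen candidate filenames against the reserved names before creating files. The sketch below checks only the names listed above, so the table is knowingly incomplete. The helper name is_dos_reserved is hypothetical, and treating a reserved name followed by an extension (such as aux.txt) as reserved is a conservative assumption.

```perl
use strict;
use warnings;

# Only the device names mentioned in the text; the real set is larger.
my %DOS_RESERVED = map { $_ => 1 } qw(AUX PRN NUL CON COM1 LPT1 LPT2);

# Hypothetical helper: true if the last path component, minus any
# extension, is one of the known reserved names (case-insensitively).
sub is_dos_reserved {
    my ($path) = @_;
    my ($base) = $path =~ m{([^/\\:]+)\z};    # drop drive and directories
    return 0 unless defined $base;
    $base =~ s/\..*\z//s;                     # drop the extension, if any
    return $DOS_RESERVED{ uc $base } ? 1 : 0;
}

print "$_: ", (is_dos_reserved($_) ? "avoid" : "ok"), "\n"
    for qw(aux.txt c:/tmp/NUL auxiliary.txt);
```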

Users of these operating systems may also wish to make use of scripts such as pl2bat.bat or pl2cmd to put wrappers around your scripts.

Newline (\n) is translated as \015\012 by STDIO when reading from and writing to files (see Newlines). binmode(FILEHANDLE) will keep \n translated as \012 for that filehandle. Since it is a no-op on other systems, binmode should be used for cross-platform code that deals with binary data. That's assuming you realize in advance that your data is in binary. General-purpose programs should often assume nothing about their data.
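
The sketch below writes a byte string to a temporary file and reads it back, calling binmode on both handles so that no newline translation can alter the data on DOSish systems. The helper name roundtrip_binary is hypothetical; File::Temp is a core module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $bytes to a temporary file and read them
# back, with binmode on both handles to suppress newline translation.
sub roundtrip_binary {
    my ($bytes) = @_;

    my ($out, $filename) = tempfile(UNLINK => 1);
    binmode($out);
    print {$out} $bytes;
    close($out) or die "close failed: $!";

    open(my $in, '<', $filename) or die "open failed: $!";
    binmode($in);
    local $/;                 # slurp the whole file in one read
    my $copy = <$in>;
    close($in);
    return $copy;
}

my $data = "a\012b\015\012c\000";
print roundtrip_binary($data) eq $data ? "bytes preserved\n" : "bytes changed\n";
```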

The $^O variable and the $Config{archname} values for various DOSish perls are as follows:

    OS             $^O       $Config{archname}   ID   Version
    ---------------------------------------------------------
    MS-DOS         dos       ?
    PC-DOS         dos       ?
    OS/2           os2       ?
    Windows 3.1    ?         ?                   0    3 01
    Windows 95     MSWin32   MSWin32-x86         1    4 00
    Windows 98     MSWin32   MSWin32-x86         1    4 10
    Windows ME     MSWin32   MSWin32-x86         1    ?
    Windows NT     MSWin32   MSWin32-x86         2    4 xx
    Windows NT     MSWin32   MSWin32-ALPHA       2    4 xx
    Windows NT     MSWin32   MSWin32-ppc         2    4 xx
    Windows 2000   MSWin32   MSWin32-x86         2    5 00
    Windows XP     MSWin32   MSWin32-x86         2    5 01
    Windows 2003   MSWin32   MSWin32-x86         2    5 02
    Windows Vista  MSWin32   MSWin32-x86         2    6 00
    Windows 7      MSWin32   MSWin32-x86         2    6 01
    Windows 7      MSWin32   MSWin32-x64         2    6 01
    Windows 2008   MSWin32   MSWin32-x86         2    6 01
    Windows 2008   MSWin32   MSWin32-x64         2    6 01
    Windows CE     MSWin32   ?                   3
    Cygwin         cygwin    cygwin

The various MSWin32 Perls can distinguish the OS they are running on via the value of the fifth element of the list returned from Win32::GetOSVersion(). For example:

    if ($^O eq 'MSWin32') {
        my @os_version_info = Win32::GetOSVersion();
        print +('3.1','95','NT')[$os_version_info[4]],"\n";
    }

There are also Win32::IsWinNT() and Win32::IsWin95() (try perldoc Win32) and, as of libwin32 0.19 (not part of the core Perl distribution), Win32::GetOSName(). The very portable POSIX::uname() will work too:

    c:\> perl -MPOSIX -we "print join '|', uname"
    Windows NT|moonru|5.0|Build 2195 (Service Pack 2)|x86

VMS

Perl on VMS is discussed in perlvms in the perl distribution.

The official name of VMS as of this writing is OpenVMS.

Perl on VMS can accept either VMS- or Unix-style file specifications as in either of the following:

    $ perl -ne "print if /perl_setup/i" SYS$LOGIN:LOGIN.COM
    $ perl -ne "print if /perl_setup/i" /sys$login/login.com

but not a mixture of both as in:

    $ perl -ne "print if /perl_setup/i" sys$login:/login.com
    Can't open sys$login:/login.com: file specification syntax error

Interacting with Perl from the Digital Command Language (DCL) shell often requires a different set of quotation marks than Unix shells do. For example:

    $ perl -e "print ""Hello, world.\n"""
    Hello, world.

There are several ways to wrap your perl scripts in DCL .COM files, if you are so inclined. For example:

    $ write sys$output "Hello from DCL!"
    $ if p1 .eqs. ""
    $ then perl -x 'f$environment("PROCEDURE")
    $ else perl -x - 'p1 'p2 'p3 'p4 'p5 'p6 'p7 'p8
    $ deck/dollars="__END__"
    #!/usr/bin/perl
    print "Hello from Perl!\n";
    __END__
    $ endif

Do take care with $ ASSIGN/nolog/user SYS$COMMAND: SYS$INPUT if your perl-in-DCL script expects to do things like $read = <STDIN>; .

The VMS operating system has two filesystems, known as ODS-2 and ODS-5.

For ODS-2, filenames are in the format "name.extension;version". The maximum length for filenames is 39 characters, and the maximum length for extensions is also 39 characters. Version is a number from 1 to 32767. Valid characters are /[A-Z0-9$_-]/ .
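
The ODS-2 rules above can be expressed as a small validator. This is a sketch under several assumptions: the helper name is_valid_ods2 is hypothetical, input is uppercased first because ODS-2 is case-insensitive, the ";version" part is treated as optional, and empty name or extension parts are accepted.

```perl
use strict;
use warnings;

# Hypothetical helper: check a "name.extension;version" string against
# the ODS-2 limits quoted above (39 characters each for name and
# extension, version 1..32767, characters from [A-Z0-9$_-]).
sub is_valid_ods2 {
    my ($spec) = @_;
    return 0 unless defined $spec;

    my ($name, $ext, $version) =
        uc($spec) =~ /\A([A-Z0-9\$_-]{0,39})\.([A-Z0-9\$_-]{0,39})(?:;([0-9]+))?\z/
        or return 0;

    # The version, when present, must fall in the documented range.
    return 0 if defined $version && ($version < 1 || $version > 32767);
    return 1;
}

print "$_: ", (is_valid_ods2($_) ? "ok" : "not ODS-2"), "\n"
    for 'LOGIN.COM;1', 'A.;5', 'NAME.EXT;40000', 'bad name.txt';
```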

The ODS-2 filesystem is case-insensitive and does not preserve case. Perl simulates this by converting all filenames to lowercase internally.

For ODS-5, filenames may have almost any character in them and can include Unicode characters. Characters that could be misinterpreted by the DCL shell or file parsing utilities need to be prefixed with the ^ character, or replaced with hexadecimal characters prefixed with the ^ character. Such prefixing is only needed when the pathnames are in VMS format in applications. Programs that can accept the Unix format of pathnames do not need the escape characters. The maximum length for filenames is 255 characters. The ODS-5 file system can handle both a case preserved and a case sensitive mode.

ODS-5 is only available on OpenVMS for 64-bit platforms.

Support for the extended file specifications is being done as optional settings to preserve backward compatibility with Perl scripts that assume the previous VMS limitations.

In general routines on VMS that get a Unix format file specification should return it in a Unix format, and when they get a VMS format specification they should return a VMS format unless they are documented to do a conversion.

For routines that generate a file specification, VMS allows setting whether the C library on which Perl is built returns it in VMS format or in Unix format.

With the ODS-2 file system, there is not much difference in syntax of filenames without paths for VMS or Unix. With the extended character set available with ODS-5 there can be a significant difference.

Because of this, existing Perl scripts written for VMS were sometimes treating VMS and Unix filenames interchangeably. Without the extended character set enabled, this behavior will mostly be maintained for backwards compatibility.

When extended characters are enabled with ODS-5, the handling of Unix formatted file specifications is closer to that of a Unix system.

VMS file specifications without extensions have a trailing dot. An equivalent Unix file specification should not show the trailing dot.

The result of all of this is that, for portable scripts on VMS, you cannot depend on Perl to present filenames in lowercase or to be case sensitive, and filenames may be returned in either Unix or VMS format.

And if a routine returns a file specification, unless it is intended to convert it, it should return it in the same format as it found it.

readdir by default has traditionally returned lowercased filenames. When the ODS-5 support is enabled, it will return the exact case of the filename on the disk.

Files without extensions have a trailing period on them, so doing a readdir in the default mode with a file named A.;5 will return a. (though that file could be opened with open(FH, 'A')).

With support for extended file specifications, and if opendir was given a Unix format directory, a file named A.;5 will be returned as a, optionally in the exact case on the disk. When opendir is given a VMS format directory, readdir should return a., again optionally in the exact case.

RMS had an eight level limit on directory depths from any rooted logical (allowing 16 levels overall) prior to VMS 7.2, and even with versions of VMS on VAX up through 7.3. Hence PERL_ROOT:[LIB.2.3.4.5.6.7.8] is a valid directory specification but PERL_ROOT:[LIB.2.3.4.5.6.7.8.9] is not. Makefile.PL authors might have to take this into account, but at least they can refer to the former as /PERL_ROOT/lib/2/3/4/5/6/7/8/.

Pumpkings and module integrators can easily see whether files with too many directory levels have snuck into the core by running the following in the top-level source directory:

    $ perl -ne "$_=~s/\s+.*//; print if scalar(split /\//) > 8;" < MANIFEST

The VMS::Filespec module, which gets installed as part of the build process on VMS, is a pure Perl module that can easily be installed on non-VMS platforms and can be helpful for conversions to and from RMS native formats. It is also now the only way that you should check to see if VMS is in a case sensitive mode.

What \n represents depends on the type of file opened. It usually represents \012 but it could also be \015 , \012 , \015\012 , \000 , \040 , or nothing depending on the file organization and record format. The VMS::Stdio module provides access to the special fopen() requirements of files with unusual attributes on VMS.

TCP/IP stacks are optional on VMS, so socket routines might not be implemented. UDP sockets may not be supported.

The TCP/IP library support for all current versions of VMS is dynamically loaded if present, so even if the routines are configured, they may return a status indicating that they are not implemented.

The value of $^O on OpenVMS is "VMS". To determine the architecture that you are running on without resorting to loading all of %Config you can examine the content of the @INC array like so:

    if (grep(/VMS_AXP/, @INC)) {
        print "I'm on Alpha!\n";
    } elsif (grep(/VMS_VAX/, @INC)) {
        print "I'm on VAX!\n";
    } elsif (grep(/VMS_IA64/, @INC)) {
        print "I'm on IA64!\n";
    } else {
        print "I'm not so sure about where $^O is...\n";
    }

In general, the only significant difference is whether Perl is running on VMS_VAX or on one of the 64-bit OpenVMS platforms.

On VMS, perl determines the UTC offset from the SYS$TIMEZONE_DIFFERENTIAL logical name. Although the VMS epoch began at 17-NOV-1858 00:00:00.00, calls to localtime are adjusted to count offsets from 01-JAN-1970 00:00:00.00, just like Unix.

VOS

Perl on VOS (also known as OpenVOS) is discussed in README.vos in the perl distribution (installed as perlvos). Perl on VOS can accept either VOS- or Unix-style file specifications as in either of the following:

    $ perl -ne "print if /perl_setup/i" >system>notices
    $ perl -ne "print if /perl_setup/i" /system/notices

or even a mixture of both as in:

    $ perl -ne "print if /perl_setup/i" >system/notices

Even though VOS allows the slash character to appear in object names, because the VOS port of Perl interprets it as a pathname delimiting character, VOS files, directories, or links whose names contain a slash character cannot be processed. Such files must be renamed before they can be processed by Perl.

Older releases of VOS (prior to OpenVOS Release 17.0) limit file names to 32 or fewer characters, prohibit file names from starting with a - character, and prohibit file names from containing any character matching tr/ !#%&'()*;<=>?// .

Newer releases of VOS (OpenVOS Release 17.0 or later) support a feature known as extended names. On these releases, file names can contain up to 255 characters, are prohibited from starting with a - character, and the set of prohibited characters is reduced to any character matching tr/#%*<>?//. There are restrictions involving spaces and apostrophes: these characters must not begin or end a name, nor can they immediately precede or follow a period. Additionally, a space must not immediately precede another space or hyphen. Specifically, the following character combinations are prohibited: space-space, space-hyphen, period-space, space-period, period-apostrophe, apostrophe-period, leading or trailing space, and leading or trailing apostrophe. Although an extended file name is limited to 255 characters, a path name is still limited to 256 characters.
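
The extended-name rules for OpenVOS Release 17.0 and later translate fairly directly into a validator. The sketch below encodes only the restrictions listed above; the helper name is_valid_vos_extended_name is hypothetical, and rejecting an empty name is an added assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the extended-name restrictions listed above
# to a single file name (not a full path name).
sub is_valid_vos_extended_name {
    my ($name) = @_;
    return 0 if !defined $name || $name eq '';    # assumption: no empty names
    return 0 if length($name) > 255;              # at most 255 characters
    return 0 if $name =~ /\A-/;                   # must not start with a hyphen
    return 0 if $name =~ tr/#%*<>?//;             # prohibited characters
    return 0 if $name =~ /\A[ ']|[ ']\z/;         # leading/trailing space or apostrophe
    return 0 if $name =~ /  | -|\. | \.|\.'|'\./; # prohibited character pairs
    return 1;
}

print "'$_': ", (is_valid_vos_extended_name($_) ? "ok" : "rejected"), "\n"
    for 'notes.txt', 'my notes.txt', '-draft', 'a  b', 'what?';
```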

The value of $^O on VOS is "vos". To determine whether you are running on VOS without resorting to loading all of %Config, you can test $^O like so:

    if ($^O =~ /vos/) {
        print "I'm on a Stratus box!\n";
    } else {
        print "I'm not on a Stratus box!\n";
        die;
    }

Also see:

  • README.vos (installed as perlvos)

  • The VOS mailing list.

    There is no specific mailing list for Perl on VOS. You can contact the Stratus Technologies Customer Assistance Center (CAC) for your region, or you can use the contact information located in the distribution files on the Stratus Anonymous FTP site.

  • Stratus Technologies on the web at http://www.stratus.com

  • VOS Open-Source Software on the web at http://ftp.stratus.com/pub/vos/vos.html

EBCDIC Platforms

Recent versions of Perl have been ported to platforms such as OS/400 on AS/400 minicomputers as well as OS/390, VM/ESA, and BS2000 for S/390 Mainframes. Such computers use EBCDIC character sets internally (usually Character Code Set ID 0037 for OS/400 and either 1047 or POSIX-BC for S/390 systems). On the mainframe perl currently works under the "Unix system services for OS/390" (formerly known as OpenEdition), VM/ESA OpenEdition, or the BS2000 POSIX-BC system (BS2000 is supported in perl 5.6 and greater). See perlos390 for details. Note that for OS/400 there is also a port of Perl 5.8.1/5.10.0 or later to the PASE, which is ASCII-based (as opposed to ILE, which is EBCDIC-based); see perlos400.

As of R2.5 of USS for OS/390 and Version 2.3 of VM/ESA these Unix sub-systems do not support the #! shebang trick for script invocation. Hence, on OS/390 and VM/ESA perl scripts can be executed with a header similar to the following simple script:

    : # use perl
        eval 'exec /usr/local/bin/perl -S $0 ${1+"$@"}'
            if 0;
    #!/usr/local/bin/perl     # just a comment really
    print "Hello from perl!\n";

OS/390 will support the #! shebang trick in release 2.8 and beyond. Calls to system and backticks can use POSIX shell syntax on all S/390 systems.

On the AS/400, if PERL5 is in your library list, you may need to wrap your perl scripts in a CL procedure to invoke them like so:

    BEGIN
      CALL PGM(PERL5/PERL) PARM('/QOpenSys/hello.pl')
    ENDPGM

This will invoke the perl script hello.pl in the root of the QOpenSys file system. On the AS/400 calls to system or backticks must use CL syntax.

On these platforms, bear in mind that the EBCDIC character set may have an effect on what happens with some perl functions (such as chr, pack, print, printf, ord, sort, sprintf, unpack), as well as bit-fiddling with ASCII constants using operators like ^, & and |, not to mention dealing with socket interfaces to ASCII computers (see Newlines).

Fortunately, most web servers for the mainframe will correctly translate the \n in the following statement to its ASCII equivalent (\r is the same under both Unix and OS/390):

  1. print "Content-type: text/html\r\n\r\n";

The values of $^O on some of these platforms include:

    uname      $^O        $Config{'archname'}
    -----------------------------------------
    OS/390     os390      os390
    OS400      os400      os400
    POSIX-BC   posix-bc   BS2000-posix-bc

Some simple tricks for determining if you are running on an EBCDIC platform could include any of the following (perhaps all):

    if ("\t" eq "\005")   { print "EBCDIC may be spoken here!\n"; }
    if (ord('A') == 193)  { print "EBCDIC may be spoken here!\n"; }
    if (chr(169) eq 'z')  { print "EBCDIC may be spoken here!\n"; }

One thing you may not want to rely on is the EBCDIC encoding of punctuation characters since these may differ from code page to code page (and once your module or script is rumoured to work with EBCDIC, folks will want it to work with all EBCDIC character sets).

Also see:

  • perlos390, README.os390, perlbs2000, perlebcdic.

  • The perl-mvs@perl.org list is for discussion of porting issues as well as general usage issues for all EBCDIC Perls. Send a message body of "subscribe perl-mvs" to majordomo@perl.org.

  • AS/400 Perl information at http://as400.rochester.ibm.com/ as well as on CPAN in the ports/ directory.

Acorn RISC OS

Because Acorns use ASCII with newlines (\n) in text files as \012 like Unix, and because Unix filename emulation is turned on by default, most simple scripts will probably work "out of the box". The native filesystem is modular, and individual filesystems are free to be case-sensitive or insensitive, and are usually case-preserving. Some native filesystems have name length limits, and file and directory names are silently truncated to fit them. Scripts should be aware that the standard filesystem currently has a name length limit of 10 characters, with up to 77 items in a directory, but other filesystems may not impose such limitations.

Native filenames are of the form

    Filesystem#Special_Field::DiskName.$.Directory.Directory.File

where

    Special_Field is not usually present, but may contain . and $ .
    Filesystem =~ m|[A-Za-z0-9_]|
    DiskName   =~ m|[A-Za-z0-9_/]|
    $ represents the root directory
    . is the path separator
    @ is the current directory (per filesystem but machine global)
    ^ is the parent directory
    Directory and File =~ m|[^\0- "\.\$\%\&:\@\\^\|\177]+|

The default filename translation is roughly tr|/.|./|;

Note that "ADFS::HardDisk.$.File" ne 'ADFS::HardDisk.$.File' and that the second stage of $ interpolation in regular expressions will fall foul of the $. if scripts are not careful.

Logical paths specified by system variables containing comma-separated search lists are also allowed; hence System:Modules is a valid filename, and the filesystem will prefix Modules with each section of System$Path until a name is made that points to an object on disk. Writing to a new file System:Modules would be allowed only if System$Path contains a single item list. The filesystem will also expand system variables in filenames if enclosed in angle brackets, so <System$Dir>.Modules would look for the file $ENV{'System$Dir'} . 'Modules' . The obvious implication of this is that fully qualified filenames can start with <> and should be protected when open is used for input.

Because . was in use as a directory separator and filenames could not be assumed to be unique after 10 characters, Acorn implemented the C compiler to strip the trailing .c .h .s and .o suffix from filenames specified in source code and store the respective files in subdirectories named after the suffix. Hence files are translated:

    foo.h           h.foo
    C:foo.h         C:h.foo        (logical path variable)
    sys/os.h        sys.h.os       (C compiler groks Unix-speak)
    10charname.c    c.10charname
    10charname.o    o.10charname
    11charname_.c   c.11charname   (assuming filesystem truncates at 10)

The Unix emulation library's translation of filenames to native assumes that this sort of translation is required, and it allows a user-defined list of known suffixes that it will transpose in this fashion. This may seem transparent, but consider that with these rules foo/bar/baz.h and foo/bar/h/baz both map to foo.bar.h.baz, and that readdir and glob cannot and do not attempt to emulate the reverse mapping. Other .'s in filenames are translated to /.
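
The translation just described can be approximated in a few lines, which also makes the ambiguity easy to see. This is a sketch, not the emulation library's actual code: the helper name unix_to_riscos is hypothetical, the suffix list holds only the four suffixes mentioned above, and absolute paths, logical path variables such as C:, and name truncation are not handled.

```perl
use strict;
use warnings;

# Suffixes that get transposed into a directory name (from the text above).
my %KNOWN_SUFFIX = map { $_ => 1 } qw(c h s o);

# Hypothetical helper: translate a relative Unix-style path to native form.
sub unix_to_riscos {
    my ($path) = @_;
    my @parts = split m{/}, $path;
    my $file  = pop @parts;
    return '' unless defined $file;

    if ($file =~ /\A(.+)\.([^.]+)\z/ && $KNOWN_SUFFIX{$2}) {
        push @parts, $2, $1;       # baz.h becomes directory h, file baz
    } else {
        push @parts, $file;
    }

    tr{.}{/} for @parts;           # any remaining . becomes /
    return join '.', @parts;       # . is the native path separator
}

print "$_ => ", unix_to_riscos($_), "\n"
    for qw(foo.h sys/os.h foo/bar/baz.h foo/bar/h/baz foo/bar.txt);
```

The third and fourth inputs both yield foo.bar.h.baz, which is why a reverse mapping cannot be reconstructed from the native name alone.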

As implied above, the environment accessed through %ENV is global, and the convention is that program specific environment variables are of the form Program$Name . Each filesystem maintains a current directory, and the current filesystem's current directory is the global current directory. Consequently, sociable programs don't change the current directory but rely on full pathnames, and programs (and Makefiles) cannot assume that they can spawn a child process which can change the current directory without affecting its parent (and everyone else for that matter).

Because native operating system filehandles are global and are currently allocated down from 255, with 0 being a reserved value, the Unix emulation library emulates Unix filehandles. Consequently, you can't rely on passing STDIN , STDOUT , or STDERR to your children.

The desire of users to express filenames of the form <Foo$Dir>.Bar on the command line unquoted causes problems, too: `` command output capture has to perform a guessing game. It assumes that a string <[^<>]+\$[^<>]> is a reference to an environment variable, whereas anything else involving < or > is redirection, and generally manages to be 99% right. Of course, the problem remains that scripts cannot rely on any Unix tools being available, or that any tools found have Unix-like command line arguments.

Extensions and XS are, in theory, buildable by anyone using free tools. In practice, many don't, as users of the Acorn platform are used to binary distributions. MakeMaker does run, but no available make currently copes with MakeMaker's makefiles; even if and when this should be fixed, the lack of a Unix-like shell will cause problems with makefile rules, especially lines of the form cd sdbm && make all , and anything using quoting.

"RISC OS" is the proper name for the operating system, but the value in $^O is "riscos" (because we don't like shouting).

Other perls

Perl has been ported to many platforms that do not fit into any of the categories listed above. Some, such as AmigaOS, QNX, Plan 9, and VOS, have been well-integrated into the standard Perl source code kit. You may need to see the ports/ directory on CPAN for information, and possibly binaries, for the likes of: aos, Atari ST, lynxos, riscos, Novell Netware, Tandem Guardian, etc. (Yes, we know that some of these OSes may fall under the Unix category, but we are not a standards body.)

Some approximate operating system names and their $^O values in the "OTHER" category include:

    OS          $^O        $Config{'archname'}
    ------------------------------------------
    Amiga DOS   amigaos    m68k-amigos

See also:

  • Amiga, README.amiga (installed as perlamiga).

  • A free perl5-based PERL.NLM for Novell Netware is available in precompiled binary and source code form from http://www.novell.com/ as well as from CPAN.

  • Plan 9, README.plan9

FUNCTION IMPLEMENTATIONS

Listed below are functions that are either completely unimplemented or else have been implemented differently on various platforms. Following each description will be, in parentheses, a list of platforms that the description applies to.

The list may well be incomplete, or even wrong in some places. When in doubt, consult the platform-specific README files in the Perl source distribution, and any other documentation resources accompanying a given port.

Be aware, moreover, that even among Unix-ish systems there are variations.

For many functions, you can also query %Config , exported by default from the Config module. For example, to check whether the platform has the lstat call, check $Config{d_lstat} . See Config for a full description of available variables.
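
As a sketch of the %Config check described above, the helper below looks up a d_* entry and reports whether it was recorded as available when perl was built. The helper name platform_has is hypothetical, and the comparison with the string "define" reflects how Configure usually records available features; treat that as an assumption and see Config for the authoritative description.

```perl
use strict;
use warnings;
use Config;    # exports %Config

# Hypothetical helper: true if the build-time configuration records the
# named C library function as available (for example, "lstat").
sub platform_has {
    my ($function) = @_;
    my $value = $Config{"d_$function"};
    return (defined $value && $value eq 'define') ? 1 : 0;
}

print "lstat is ", (platform_has('lstat') ? "" : "not "), "available\n";
```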

Alphabetical Listing of Perl Functions

  • -X

    -w only inspects the read-only file attribute (FILE_ATTRIBUTE_READONLY), which determines whether the directory can be deleted, not whether it can be written to. Directories always have read and write access unless denied by discretionary access control lists (DACLs). (Win32)

    -r, -w, -x, and -o tell whether the file is accessible, which may not reflect UIC-based file protections. (VMS)

    -s by name on an open file will return the space reserved on disk, rather than the current extent. -s on an open filehandle returns the current size. (RISC OS)

    -R, -W, -X, -O are indistinguishable from -r, -w, -x, -o. (Win32, VMS, RISC OS)

    -g, -k, -l, -u, -A are not particularly meaningful. (Win32, VMS, RISC OS)

    -p is not particularly meaningful. (VMS, RISC OS)

    -d is true if passed a device spec without an explicit directory. (VMS)

    -x (or -X) determine if a file ends in one of the executable suffixes. -S is meaningless. (Win32)

    -x (or -X) determine if a file has an executable file type. (RISC OS)

  • alarm

    Emulated using timers that must be explicitly polled whenever Perl wants to dispatch "safe signals" and therefore cannot interrupt blocking system calls. (Win32)

  • atan2

    Due to issues with various CPUs, math libraries, compilers, and standards, results for atan2() may vary depending on any combination of the above. Perl attempts to conform to the Open Group/IEEE standards for the results returned from atan2(), but cannot force the issue if the system Perl is run on does not allow it. (Tru64, HP-UX 10.20)

    The current version of the standards for atan2() is available at http://www.opengroup.org/onlinepubs/009695399/functions/atan2.html.

  • binmode

    Meaningless. (RISC OS)

    Reopens file and restores pointer; if function fails, underlying filehandle may be closed, or pointer may be in a different position. (VMS)

    The value returned by tell may be affected after the call, and the filehandle may be flushed. (Win32)

  • chmod

    Only good for changing "owner" read-write access; "group" and "other" bits are meaningless. (Win32)

    Only good for changing "owner" and "other" read-write access. (RISC OS)

    Access permissions are mapped onto VOS access-control list changes. (VOS)

    The actual permissions set depend on the value of the CYGWIN in the SYSTEM environment settings. (Cygwin)

  • chown

    Not implemented. (Win32, Plan 9, RISC OS)

    Does nothing, but won't fail. (Win32)

    A little funky, because VOS's notion of ownership is a little funky. (VOS)

  • chroot

    Not implemented. (Win32, VMS, Plan 9, RISC OS, VOS)

  • crypt

    May not be available if library or source was not provided when building perl. (Win32)

  • dbmclose

    Not implemented. (VMS, Plan 9, VOS)

  • dbmopen

    Not implemented. (VMS, Plan 9, VOS)

  • dump

    Not useful. (RISC OS)

    Not supported. (Cygwin, Win32)

    Invokes VMS debugger. (VMS)

  • exec

    Does not automatically flush output handles on some platforms. (SunOS, Solaris, HP-UX)

    Not supported. (Symbian OS)

  • exit

    Emulates Unix exit() (which considers exit 1 to indicate an error) by mapping the 1 to SS$_ABORT (44). This behavior may be overridden with the pragma use vmsish 'exit'. As with the CRTL's exit() function, exit 0 is also mapped to an exit status of SS$_NORMAL (1); this mapping cannot be overridden. Any other argument to exit() is used directly as Perl's exit status. On VMS, unless the future POSIX_EXIT mode is enabled, the exit code should always be a valid VMS exit code and not a generic number. When the POSIX_EXIT mode is enabled, a generic number will be encoded in a method compatible with the C library _POSIX_EXIT macro so that it can be decoded by other programs, particularly ones written in C, like the GNV package. (VMS)

    exit() resets file pointers, which is a problem when called from a child process (created by fork()) in BEGIN. A workaround is to use POSIX::_exit. (Solaris)

        exit unless $Config{archname} =~ /\bsolaris\b/;
        require POSIX and POSIX::_exit(0);

  • fcntl

    Not implemented. (Win32)

    Some functions available based on the version of VMS. (VMS)

  • flock

    Not implemented. (VMS, RISC OS, VOS)

  • fork

    Not implemented. (AmigaOS, RISC OS, VMS)

    Emulated using multiple interpreters. See perlfork. (Win32)

    Does not automatically flush output handles on some platforms. (SunOS, Solaris, HP-UX)

  • getlogin

    Not implemented. (RISC OS)

  • getpgrp

    Not implemented. (Win32, VMS, RISC OS)

  • getppid

    Not implemented. (Win32, RISC OS)

  • getpriority

    Not implemented. (Win32, VMS, RISC OS, VOS)

  • getpwnam

    Not implemented. (Win32)

    Not useful. (RISC OS)

  • getgrnam

    Not implemented. (Win32, VMS, RISC OS)

  • getnetbyname

    Not implemented. (Win32, Plan 9)

  • getpwuid

    Not implemented. (Win32)

    Not useful. (RISC OS)

  • getgrgid

    Not implemented. (Win32, VMS, RISC OS)

  • getnetbyaddr

    Not implemented. (Win32, Plan 9)

  • getprotobynumber
  • getservbyport
  • getpwent

    Not implemented. (Win32)

  • getgrent

    Not implemented. (Win32, VMS)

  • gethostbyname

    gethostbyname('localhost') does not work everywhere: you may have to use gethostbyname('127.0.0.1'). (Irix 5)

  • gethostent

    Not implemented. (Win32)

  • getnetent

    Not implemented. (Win32, Plan 9)

  • getprotoent

    Not implemented. (Win32, Plan 9)

  • getservent

    Not implemented. (Win32, Plan 9)

  • sethostent

    Not implemented. (Win32, Plan 9, RISC OS)

  • setnetent

    Not implemented. (Win32, Plan 9, RISC OS)

  • setprotoent

    Not implemented. (Win32, Plan 9, RISC OS)

  • setservent

    Not implemented. (Plan 9, Win32, RISC OS)

  • endpwent

    Not implemented. (Win32)

  • endgrent

    Not implemented. (RISC OS, VMS, Win32)

  • endhostent

    Not implemented. (Win32)

  • endnetent

    Not implemented. (Win32, Plan 9)

  • endprotoent

    Not implemented. (Win32, Plan 9)

  • endservent

    Not implemented. (Plan 9, Win32)

  • getsockopt SOCKET,LEVEL,OPTNAME

    Not implemented. (Plan 9)

  • glob

    This operator is implemented via the File::Glob extension on most platforms. See File::Glob for portability information.

  • gmtime

    In theory, gmtime() is reliable from -2**63 to 2**63-1. However, because workarounds in the implementation use floating point numbers, it will become inaccurate as the time gets larger. This is a bug and will be fixed in the future.

    On VOS, time values are 32-bit quantities.

  • ioctl FILEHANDLE,FUNCTION,SCALAR

    Not implemented. (VMS)

    Available only for socket handles, and it does what the ioctlsocket() call in the Winsock API does. (Win32)

    Available only for socket handles. (RISC OS)

  • kill

    Not implemented, hence not useful for taint checking. (RISC OS)

    kill() doesn't have the semantics of raise(), i.e. it doesn't send a signal to the identified process like it does on Unix platforms. Instead kill($sig, $pid) terminates the process identified by $pid, and makes it exit immediately with exit status $sig. As in Unix, if $sig is 0 and the specified process exists, it returns true without actually terminating it. (Win32)

    kill(-9, $pid) will terminate the process specified by $pid and recursively all child processes owned by it. This is different from the Unix semantics, where the signal will be delivered to all processes in the same process group as the process specified by $pid. (Win32)

    Is not supported for process identification number of 0 or negative numbers. (VMS)

  • link

    Not implemented. (RISC OS, VOS)

    Link count not updated because hard links are not quite that hard (They are sort of half-way between hard and soft links). (AmigaOS)

    Hard links are implemented on Win32 under NTFS only. They are natively supported on Windows 2000 and later. On Windows NT they are implemented using the Windows POSIX subsystem support and the Perl process will need Administrator or Backup Operator privileges to create hard links.

    Available on 64 bit OpenVMS 8.2 and later. (VMS)

  • localtime

    localtime() has the same range as gmtime(), but because time zone rules change, its accuracy for historical and future times may degrade, usually by no more than an hour.

  • lstat

    Not implemented. (RISC OS)

    Return values (especially for device and inode) may be bogus. (Win32)

  • msgctl
  • msgget
  • msgsnd
  • msgrcv

    Not implemented. (Win32, VMS, Plan 9, RISC OS, VOS)

  • open

    open to |- and -| are unsupported. (Win32, RISC OS)

    Opening a process does not automatically flush output handles on some platforms. (SunOS, Solaris, HP-UX)

  • readlink

    Not implemented. (Win32, VMS, RISC OS)

  • rename

    Can't move directories between directories on different logical volumes. (Win32)

  • rewinddir

    Will not cause readdir() to re-read the directory stream. The entries already read before the rewinddir() call will just be returned again from a cache buffer. (Win32)

  • select

    Only implemented on sockets. (Win32, VMS)

    Only reliable on sockets. (RISC OS)

    Note that the select FILEHANDLE form is generally portable.

  • semctl
  • semget
  • semop

    Not implemented. (Win32, VMS, RISC OS)

  • setgrent

    Not implemented. (VMS, Win32, RISC OS)

  • setpgrp

    Not implemented. (Win32, VMS, RISC OS, VOS)

  • setpriority

    Not implemented. (Win32, VMS, RISC OS, VOS)

  • setpwent

    Not implemented. (Win32, RISC OS)

  • setsockopt

    Not implemented. (Plan 9)

  • shmctl
  • shmget
  • shmread
  • shmwrite

    Not implemented. (Win32, VMS, RISC OS)

  • sleep

    Emulated using synchronization functions such that it can be interrupted by alarm(), and limited to a maximum of 4294967 seconds, approximately 49 days. (Win32)

  • sockatmark

    A relatively recent addition to socket functions, may not be implemented even in Unix platforms.

  • socketpair

    Not implemented. (RISC OS)

    Available on 64 bit OpenVMS 8.2 and later. (VMS)

  • stat

    Platforms that do not have rdev, blksize, or blocks will return these as '', so numeric comparison or manipulation of these fields may cause 'not numeric' warnings.

    ctime not supported on UFS (Mac OS X).

    ctime is creation time instead of inode change time (Win32).

    device and inode are not meaningful. (Win32)

    device and inode are not necessarily reliable. (VMS)

    mtime, atime and ctime all return the last modification time. Device and inode are not necessarily reliable. (RISC OS)

    dev, rdev, blksize, and blocks are not available. inode is not meaningful and will differ between stat calls on the same file. (os2)

    Some versions of Cygwin, when stat("foo") does not find the file, may then attempt stat("foo.exe"). (Cygwin)

    On Win32 stat() needs to open the file to determine the link count and update attributes that may have been changed through hard links. Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up stat() by not performing this operation. (Win32)

  • symlink

    Not implemented. (Win32, RISC OS)

    Implemented on 64 bit VMS 8.3. VMS requires the symbolic link to be in Unix syntax if it is intended to resolve to a valid path.

  • syscall

    Not implemented. (Win32, VMS, RISC OS, VOS)

  • sysopen

    The traditional "0", "1", and "2" MODEs are implemented with different numeric values on some systems. The flags exported by Fcntl (O_RDONLY, O_WRONLY, O_RDWR) should work everywhere though. (Mac OS, OS/390)

  • system

    As an optimization, may not call the command shell specified in $ENV{PERL5SHELL}. system(1, @args) spawns an external process and immediately returns its process designator, without waiting for it to terminate. Return value may be used subsequently in wait or waitpid. Failure to spawn() a subprocess is indicated by setting $? to "255 << 8". $? is set in a way compatible with Unix (i.e. the exit status of the subprocess is obtained by "$? >> 8", as described in the documentation). (Win32)

    There is no shell to process metacharacters, and the native standard is to pass a command line terminated by "\n", "\r" or "\0" to the spawned program. Redirection such as > foo is performed (if at all) by the run time library of the spawned program. system list will call the Unix emulation library's exec emulation, which attempts to provide emulation of the stdin, stdout, stderr in force in the parent, providing the child program uses a compatible version of the emulation library. system scalar will call the native command line directly, and no such emulation of a child Unix program will exist. Mileage will vary. (RISC OS)

    Does not automatically flush output handles on some platforms. (SunOS, Solaris, HP-UX)

    The return value is POSIX-like (shifted up by 8 bits), which only allows room for a made-up value derived from the severity bits of the native 32-bit condition code (unless overridden by use vmsish 'status'). If the native condition code is one that has a POSIX value encoded, the POSIX value will be decoded to extract the expected exit value. For more details see $? in perlvms. (VMS)

  • times

    "cumulative" times will be bogus. On anything other than Windows NT or Windows 2000, "system" time will be bogus, and "user" time is actually the time returned by the clock() function in the C runtime library. (Win32)

    Not useful. (RISC OS)

  • truncate

    Not implemented. (Older versions of VMS)

    Truncation to same-or-shorter lengths only. (VOS)

    If a FILEHANDLE is supplied, it must be writable and opened in append mode (i.e., use open(FH, '>>filename') or sysopen(FH,...,O_APPEND|O_RDWR)). If a filename is supplied, it should not be held open elsewhere. (Win32)

  • umask

    Returns undef where unavailable.

    umask works but the correct permissions are set only when the file is finally closed. (AmigaOS)

  • utime

    Only the modification time is updated. (VMS, RISC OS)

    May not behave as expected. Behavior depends on the C runtime library's implementation of utime(), and the filesystem being used. The FAT filesystem typically does not support an "access time" field, and it may limit timestamps to a granularity of two seconds. (Win32)

  • wait
  • waitpid

    Can only be applied to process handles returned for processes spawned using system(1, ...) or pseudo processes created with fork(). (Win32)

    Not useful. (RISC OS)
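
Several of the notes above (sysopen, system, stat) suggest defensive habits rather than platform checks. The following is only a minimal sketch of those habits, not a complete portability layer; the file name example.txt and all variable names are made up for illustration, and the exit-status decoding assumes a platform where $? follows the Unix convention described above.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_TRUNC O_RDONLY);
use File::Temp qw(tempdir);
use File::Spec;

my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'example.txt');   # hypothetical file name

# Use the Fcntl constants rather than the literal 0/1/2 MODE values.
sysopen(my $out, $path, O_WRONLY | O_CREAT | O_TRUNC) or die "sysopen: $!";
print {$out} "hello\n";
close($out) or die "close: $!";

sysopen(my $in, $path, O_RDONLY) or die "sysopen: $!";
my $line = <$in>;
close($in);

# Run this same perl ($^X) as a child that exits with status 3, then
# recover the exit status with the ">> 8" shift.
my $rc          = system($^X, '-e', 'exit 3');
my $exit_status = $rc == -1 ? undef : $? >> 8;

# stat fields a platform lacks come back as '', so test before doing arithmetic.
my @st     = stat($path);
my $blocks = (defined $st[12] && $st[12] ne '') ? $st[12] + 0 : undef;

print "read back: $line";
print "child exit status: ", (defined $exit_status ? $exit_status : 'spawn failed'), "\n";
print "blocks: ", (defined $blocks ? $blocks : 'not available'), "\n";
```

Passing a list to system() avoids relying on a command shell, which the notes above say may be absent or bypassed on some platforms.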

Supported Platforms

The following platforms are known to build Perl 5.12 (as of April 2010, its release date) from the standard source code distribution available at http://www.cpan.org/src

  • Linux (x86, ARM, IA64)
  • HP-UX
  • AIX
  • Win32
    • Windows 2000
    • Windows XP
    • Windows Server 2003
    • Windows Vista
    • Windows Server 2008
    • Windows 7
  • Cygwin
  • Solaris (x86, SPARC)
  • OpenVMS
    • Alpha (7.2 and later)
    • I64 (8.2 and later)
  • Symbian
  • NetBSD
  • FreeBSD
  • Debian GNU/kFreeBSD
  • Haiku
  • Irix (6.5. What else?)
  • OpenBSD
  • Dragonfly BSD
  • Midnight BSD
  • QNX Neutrino RTOS (6.5.0)
  • MirOS BSD
  • Stratus OpenVOS (17.0 or later)

    Caveats:

    • time_t issues that may or may not be fixed
  • Symbian (Series 60 v3, 3.2 and 5 - what else?)
  • Stratus VOS / OpenVOS
  • AIX

EOL Platforms (Perl 5.14)

The following platforms were supported by a previous version of Perl but have been officially removed from Perl's source code as of 5.12:

  • Atari MiNT
  • Apollo Domain/OS
  • Apple Mac OS 8/9
  • Tenon Machten

The following platforms were supported up to 5.10. They may still have worked in 5.12, but supporting code has been removed for 5.14:

  • Windows 95
  • Windows 98
  • Windows ME
  • Windows NT4

Supported Platforms (Perl 5.8)

As of July 2002 (the Perl release 5.8.0), the following platforms were able to build Perl from the standard source code distribution available at http://www.cpan.org/src/

  • AIX
  • BeOS
  • BSD/OS (BSDi)
  • Cygwin
  • DG/UX
  • DOS DJGPP 1)
  • DYNIX/ptx
  • EPOC R5
  • FreeBSD
  • HI-UXMPP (Hitachi) (5.8.0 worked but we didn't know it)
  • HP-UX
  • IRIX
  • Linux
  • Mac OS Classic
  • Mac OS X (Darwin)
  • MPE/iX
  • NetBSD
  • NetWare
  • NonStop-UX
  • ReliantUNIX (formerly SINIX)
  • OpenBSD
  • OpenVMS (formerly VMS)
  • Open UNIX (Unixware) (since Perl 5.8.1/5.9.0)
  • OS/2
  • OS/400 (using the PASE) (since Perl 5.8.1/5.9.0)
  • PowerUX
  • POSIX-BC (formerly BS2000)
  • QNX
  • Solaris
  • SunOS 4
  • SUPER-UX (NEC)
  • Tru64 UNIX (formerly DEC OSF/1, Digital UNIX)
  • UNICOS
  • UNICOS/mk
  • UTS
  • VOS / OpenVOS
  • Win95/98/ME/2K/XP 2)
  • WinCE
  • z/OS (formerly OS/390)
  • VM/ESA

  1) in DOS mode either the DOS or OS/2 ports can be used
  2) compilers: Borland, MinGW (GCC), VC6

The following platforms worked with the previous releases (5.6 and 5.7), but we did not manage either to fix or to test these in time for the 5.8.0 release. There is a very good chance that many of these will work fine with the 5.8.0.

  • BSD/OS
  • DomainOS
  • Hurd
  • LynxOS
  • MachTen
  • PowerMAX
  • SCO SV
  • SVR4
  • Unixware
  • Windows 3.1

Known to be broken for 5.8.0 (but 5.6.1 and 5.7.2 can be used):

  • AmigaOS

The following platforms have been known to build Perl from source in the past (5.005_03 and earlier), but we haven't been able to verify their status for the current release, either because the hardware/software platforms are rare or because we don't have an active champion on these platforms--or both. They used to work, though, so go ahead and try compiling them, and let perlbug@perl.org know of any trouble.

  • 3b1
  • A/UX
  • ConvexOS
  • CX/UX
  • DC/OSx
  • DDE SMES
  • DOS EMX
  • Dynix
  • EP/IX
  • ESIX
  • FPS
  • GENIX
  • Greenhills
  • ISC
  • MachTen 68k
  • MPC
  • NEWS-OS
  • NextSTEP
  • OpenSTEP
  • Opus
  • Plan 9
  • RISC/os
  • SCO ODT/OSR
  • Stellar
  • SVR2
  • TI1500
  • TitanOS
  • Ultrix
  • Unisys Dynix

The following platforms have their own source code distributions and binaries available via http://www.cpan.org/ports/

                            Perl release

    OS/400 (ILE)            5.005_02
    Tandem Guardian         5.004

The following platforms have only binaries available via http://www.cpan.org/ports/index.html :

                            Perl release

    Acorn RISCOS            5.005_02
    AOS                     5.002
    LynxOS                  5.004_02

Although we do suggest that you always build your own Perl from the source code, both for maximal configurability and for security, in case you are in a hurry you can check http://www.cpan.org/ports/index.html for binary distributions.

SEE ALSO

perlaix, perlamiga, perlbs2000, perlce, perlcygwin, perldgux, perldos, perlebcdic, perlfreebsd, perlhurd, perlhpux, perlirix, perlmacos, perlmacosx, perlnetware, perlos2, perlos390, perlos400, perlplan9, perlqnx, perlsolaris, perltru64, perlunicode, perlvms, perlvos, perlwin32, and Win32.

AUTHORS / CONTRIBUTORS

Abigail <abigail@foad.org>, Charles Bailey <bailey@newman.upenn.edu>, Graham Barr <gbarr@pobox.com>, Tom Christiansen <tchrist@perl.com>, Nicholas Clark <nick@ccl4.org>, Thomas Dorner <Thomas.Dorner@start.de>, Andy Dougherty <doughera@lafayette.edu>, Dominic Dunlop <domo@computer.org>, Neale Ferguson <neale@vma.tabnsw.com.au>, David J. Fiander <davidf@mks.com>, Paul Green <Paul.Green@stratus.com>, M.J.T. Guy <mjtg@cam.ac.uk>, Jarkko Hietaniemi <jhi@iki.fi>, Luther Huffman <lutherh@stratcom.com>, Nick Ing-Simmons <nick@ing-simmons.net>, Andreas J. König <a.koenig@mind.de>, Markus Laker <mlaker@contax.co.uk>, Andrew M. Langmead <aml@world.std.com>, Larry Moore <ljmoore@freespace.net>, Paul Moore <Paul.Moore@uk.origin-it.com>, Chris Nandor <pudge@pobox.com>, Matthias Neeracher <neeracher@mac.com>, Philip Newton <pne@cpan.org>, Gary Ng <71564.1743@CompuServe.COM>, Tom Phoenix <rootbeer@teleport.com>, André Pirard <A.Pirard@ulg.ac.be>, Peter Prymmer <pvhp@forte.com>, Hugo van der Sanden <hv@crypt0.demon.co.uk>, Gurusamy Sarathy <gsar@activestate.com>, Paul J. Schinder <schinder@pobox.com>, Michael G Schwern <schwern@pobox.com>, Dan Sugalski <dan@sidhe.org>, Nathan Torkington <gnat@frii.com>, John Malmberg <wb8tyw@qsl.net>

 
perlpragma

NAME

perlpragma - how to write a user pragma

DESCRIPTION

A pragma is a module which influences some aspect of the compile time or run time behaviour of Perl, such as strict or warnings. With Perl 5.10 you are no longer limited to the built in pragmata; you can now create user pragmata that modify the behaviour of user functions within a lexical scope.

A basic example

For example, say you need to create a class implementing overloaded mathematical operators, and would like to provide your own pragma that functions much like use integer. You'd like this code

    use MyMaths;
    my $l = MyMaths->new(1.2);
    my $r = MyMaths->new(3.4);
    print "A: ", $l + $r, "\n";
    use myint;
    print "B: ", $l + $r, "\n";
    {
        no myint;
        print "C: ", $l + $r, "\n";
    }
    print "D: ", $l + $r, "\n";
    no myint;
    print "E: ", $l + $r, "\n";

to give the output

    A: 4.6
    B: 4
    C: 4.6
    D: 4
    E: 4.6

i.e., where use myint; is in effect, addition operations are forced to integer, whereas by default they are not, with the default behaviour being restored via no myint;

The minimal implementation of the package MyMaths would be something like this:

    package MyMaths;
    use warnings;
    use strict;
    use myint();
    use overload '+' => sub {
        my ($l, $r) = @_;
        # Pass 1 to check up one call level from here
        if (myint::in_effect(1)) {
            int($$l) + int($$r);
        } else {
            $$l + $$r;
        }
    };
    sub new {
        my ($class, $value) = @_;
        bless \$value, $class;
    }
    1;

Note how we load the user pragma myint with an empty list () to prevent its import being called.

The interaction with the Perl compilation happens inside package myint:

    package myint;
    use strict;
    use warnings;
    sub import {
        $^H{"myint/in_effect"} = 1;
    }
    sub unimport {
        $^H{"myint/in_effect"} = 0;
    }
    sub in_effect {
        my $level = shift // 0;
        my $hinthash = (caller($level))[10];
        return $hinthash->{"myint/in_effect"};
    }
    1;

As pragmata are implemented as modules, like any other module, use myint; becomes

    BEGIN {
        require myint;
        myint->import();
    }

and no myint; is

    BEGIN {
        require myint;
        myint->unimport();
    }

Hence the import and unimport routines are called at compile time for the user's code.

User pragmata store their state by writing to the magical hash %^H, hence these two routines manipulate it. The state information in %^H is stored in the optree, and can be retrieved read-only at runtime with caller(), at index 10 of the list of returned results. In the example pragma, retrieval is encapsulated into the routine in_effect(), which takes as parameter the number of call frames to go up to find the value of the pragma in the user's script. This uses caller() to determine the value of $^H{"myint/in_effect"} when each line of the user's script was called, and therefore provide the correct semantics in the subroutine implementing the overloaded addition.
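
The mechanism can also be tried in a single file. The sketch below is an illustration only: the pragma name mypragma and the helper check() are hypothetical, and the package is defined inline in a BEGIN block (with %INC set by hand) purely so that the example needs no separate module file.

```perl
use strict;
use warnings;

# Hypothetical pragma "mypragma", defined inline for illustration.
BEGIN {
    package mypragma;
    $INC{'mypragma.pm'} = 1;    # pretend the module file was already loaded

    sub import   { $^H{"mypragma/in_effect"} = 1 }
    sub unimport { $^H{"mypragma/in_effect"} = 0 }

    sub in_effect {
        my $level    = shift // 0;
        my $hinthash = (caller($level))[10];    # hints at the caller's call site
        return ($hinthash && $hinthash->{"mypragma/in_effect"}) ? 1 : 0;
    }
}

# check() asks whether the pragma was in effect where check() itself was called,
# hence the 1: go up one call level from here.
sub check { return mypragma::in_effect(1) }

my @seen;
push @seen, check();            # pragma not enabled yet
{
    use mypragma;
    push @seen, check();        # enabled in this block
    {
        no mypragma;
        push @seen, check();    # disabled in the inner block
    }
    push @seen, check();        # enabled again
}
push @seen, check();            # outer scope is unaffected

print "@seen\n";                # prints "0 1 0 1 0"
```

The state follows the lexical scope of each call site rather than the order in which statements run, which is the property the overloaded addition in MyMaths relies on.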

Key naming

There is only a single %^H , but arbitrarily many modules that want to use its scoping semantics. To avoid stepping on each other's toes, they need to be sure to use different keys in the hash. It is therefore conventional for a module to use only keys that begin with the module's name (the name of its main package) and a "/" character. After this module-identifying prefix, the rest of the key is entirely up to the module: it may include any characters whatsoever. For example, a module Foo::Bar should use keys such as Foo::Bar/baz and Foo::Bar/$%/_! . Modules following this convention all play nicely with each other.

The Perl core uses a handful of keys in %^H which do not follow this convention, because they predate it. Keys that follow the convention won't conflict with the core's historical keys.

Implementation details

The optree is shared between threads. This means there is a possibility that the optree will outlive the particular thread (and therefore the interpreter instance) that created it, so true Perl scalars cannot be stored in the optree. Instead a compact form is used, which can only store values that are integers (signed and unsigned), strings or undef - references and floating point values are stringified. If you need to store multiple values or complex structures, you should serialise them, for example with pack. The deletion of a hash key from %^H is recorded, and as ever can be distinguished from the existence of a key with value undef with exists.
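
As a sketch of the serialisation advice above, a pragma can pack several integers into one string and unpack them at run time. The pragma name mylimits, its key, and the helper current_limits() are hypothetical, and the "N*" template assumes the values are unsigned 32-bit integers.

```perl
use strict;
use warnings;

# Hypothetical pragma "mylimits", defined inline for illustration.
BEGIN {
    package mylimits;
    $INC{'mylimits.pm'} = 1;    # pretend the module file was already loaded

    sub import {
        my ($class, @numbers) = @_;
        # Only plain strings, integers and undef survive in %^H,
        # so flatten the list into a single packed string.
        $^H{"mylimits/values"} = pack("N*", @numbers);
    }

    sub values_at {
        my $level    = shift // 0;
        my $hinthash = (caller($level))[10];
        return () unless $hinthash;
        my $packed = $hinthash->{"mylimits/values"};
        return () unless defined $packed;
        return unpack("N*", $packed);
    }
}

sub current_limits { return mylimits::values_at(1) }

my (@inside, @outside);
{
    use mylimits qw(10 200 3000);
    @inside = current_limits();
}
@outside = current_limits();

print "inside: @inside\n";
print "outside has ", scalar(@outside), " values\n";
```

Storing an array reference instead would only preserve its stringified form, which is why the list is flattened before it goes into %^H.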

Don't attempt to store references to data structures as integers which are retrieved via caller and converted back, as this will not be threadsafe. Accesses would be to the structure without locking (which is not safe for Perl's scalars), and either the structure has to leak, or it has to be freed when its creating thread terminates, which may be before the optree referencing it is deleted, if other threads outlive it.

 
perlqnx

NAME

perlqnx - Perl version 5 on QNX

DESCRIPTION

As of perl5.7.2 all tests pass under:

    QNX 4.24G
    Watcom 10.6 with Beta/970211.wcc.update.tar.F
    socket3r.lib Nov21 1996.

As of perl5.8.1 there is at least one test still failing.

Some tests may complain under known circumstances.

See below and hints/qnx.sh for more information.

Under QNX 6.2.0 there are still a few tests which fail. See below and hints/qnx.sh for more information.

Required Software for Compiling Perl on QNX4

As with many unix ports, this one depends on a few "standard" unix utilities which are not necessarily standard for QNX4.

  • /bin/sh

    This is used heavily by Configure and then by perl itself. QNX4's version is fine, but Configure will choke on the 16-bit version, so if you are running QNX 4.22, link /bin/sh to /bin32/ksh

  • ar

    This is the standard unix library builder. We use wlib. With Watcom 10.6, when wlib is linked as "ar", it behaves like ar and all is fine. Under 9.5, a cover is required. One is included in ../qnx

  • nm

    This is used (optionally) by Configure to list the contents of libraries. Configure will generate a cover function on the fly in the UU directory.

  • cpp

    Configure and perl need a way to invoke a C preprocessor. I have created a simple cover for cc which does the right thing. Without this, Configure will create its own wrapper which works, but it doesn't handle some of the command line arguments that perl will throw at it.

  • make

    You really need GNU make to compile this. GNU make ships by default with QNX 4.23, but you can get it from quics for earlier versions.

Outstanding Issues with Perl on QNX4

There is no support for dynamically linked libraries in QNX4.

If you wish to compile with the Socket extension, you need to have the TCP/IP toolkit, and you need to make sure that -lsocket locates the correct copy of socket3r.lib. Beware that the Watcom compiler ships with a stub version of socket3r.lib which has very little functionality. Also beware the order in which wlink searches directories for libraries. You may have /usr/lib/socket3r.lib pointing to the correct library, but wlink may pick up /usr/watcom/10.6/usr/lib/socket3r.lib instead. Make sure they both point to the correct library, that is, /usr/tcptk/current/usr/lib/socket3r.lib.

The following tests may report errors under QNX4:

dist/Cwd/Cwd.t will complain if `pwd` and cwd don't give the same results. cwd calls `fullpath -t`, so if you cd `fullpath -t` before running the test, it will pass.

lib/File/Find/taint.t will complain if '.' is in your PATH. The PATH test is triggered because cwd calls `fullpath -t`.

ext/IO/lib/IO/t/io_sock.t: Subtests 14 and 22 are skipped due to the fact that the functionality to read back the non-blocking status of a socket is not implemented in QNX's TCP/IP. This has been reported to QNX and it may work with later versions of TCP/IP.

t/io/tell.t: Subtest 27 is failing. We are still investigating.

QNX auxiliary files

The files in the "qnx" directory are:

  • qnx/ar

    A script that emulates the standard unix archive (aka library) utility. Under Watcom 10.6, ar is linked to wlib and provides the expected interface. With Watcom 9.5, a cover function is required. This one is fairly crude but has proved adequate for compiling perl.

  • qnx/cpp

    A script that provides C preprocessing functionality. Configure can generate a similar cover, but it doesn't handle all the command-line options that perl throws at it. This might be reasonably placed in /usr/local/bin.

Outstanding issues with perl under QNX6

The following tests are still failing for Perl 5.8.1 under QNX 6.2.0:

    op/sprintf.........................FAILED at test 91
    lib/Benchmark......................FAILED at test 26

This is due to a bug in the C library's printf routine. printf("'%e'", 0. ) produces '0.000000e+0', but ANSI requires '0.000000e+00'. QNX has acknowledged the bug.
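
A quick way to see whether a given perl build shows this symptom is to format zero with %e and count the exponent digits. This is only a rough check sketched for illustration, not the actual test from op/sprintf; the variable names are made up.

```perl
use strict;
use warnings;

# Format 0 with %e, as in the printf example above.
my $formatted = sprintf("'%e'", 0.0);

# Capture the exponent digits that follow the sign.
my ($exponent_digits) = $formatted =~ /e[+-](\d+)'\z/;

# ANSI C requires at least two exponent digits.
my $conforms = defined $exponent_digits && length($exponent_digits) >= 2;

print "$formatted ", ($conforms ? "has" : "lacks"),
      " at least two exponent digits\n";
```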

AUTHOR

Norton T. Allen (allen@huarp.harvard.edu)

 
perlre

NAME

perlre - Perl regular expressions

DESCRIPTION

This page describes the syntax of regular expressions in Perl.

If you haven't used regular expressions before, a quick-start introduction is available in perlrequick, and a longer tutorial introduction is available in perlretut.

For reference on how regular expressions are used in matching operations, plus various examples of the same, see discussions of m//, s///, qr// and ?? in Regexp Quote-Like Operators in perlop.

Modifiers

Matching operations can have various modifiers. Modifiers that relate to the interpretation of the regular expression inside are listed below. Modifiers that alter the way a regular expression is used by Perl are detailed in Regexp Quote-Like Operators in perlop and Gory details of parsing quoted constructs in perlop.

  • m

    Treat string as multiple lines. That is, change "^" and "$" from matching the start or end of line only at the left and right ends of the string to matching them anywhere within the string.

  • s

    Treat string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.

    Used together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string.

  • i

    Do case-insensitive pattern matching.

    If locale matching rules are in effect, the case map is taken from the current locale for code points less than 256, and from Unicode rules for larger code points. However, matches that would cross the Unicode rules/non-Unicode rules boundary (ords 255/256) will not succeed. See perllocale.

    There are a number of Unicode characters that match multiple characters under /i. For example, LATIN SMALL LIGATURE FI should match the sequence fi . Perl is not currently able to do this when the multiple characters are in the pattern and are split between groupings, or when one or more are quantified. Thus

        "\N{LATIN SMALL LIGATURE FI}" =~ /fi/i;        # Matches
        "\N{LATIN SMALL LIGATURE FI}" =~ /[fi][fi]/i;  # Doesn't match!
        "\N{LATIN SMALL LIGATURE FI}" =~ /fi*/i;       # Doesn't match!
        # The below doesn't match, and it isn't clear what $1 and $2 would
        # be even if it did!!
        "\N{LATIN SMALL LIGATURE FI}" =~ /(f)(i)/i;    # Doesn't match!

    Perl doesn't match multiple characters in a bracketed character class unless the character that maps to them is explicitly mentioned, and it doesn't match them at all if the character class is inverted, which otherwise could be highly confusing. See Bracketed Character Classes in perlrecharclass, and Negation in perlrecharclass.

  • x

    Extend your pattern's legibility by permitting whitespace and comments. Details in /x

  • p

    Preserve the string matched such that ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} are available for use after matching.

  • g and c

    Global matching, and keep the Current position after failed matching. Unlike i, m, s and x, these two flags affect the way the regex is used rather than the regex itself. See Using regular expressions in Perl in perlretut for further explanation of the g and c modifiers.

  • a, d, l and u

    These modifiers, all new in 5.14, affect which character-set semantics (Unicode, etc.) are used, as described below in Character set modifiers.

Regular expression modifiers are usually written in documentation as e.g., "the /x modifier", even though the delimiter in question might not really be a slash. The modifiers /imsxadlup may also be embedded within the regular expression itself using the (?...) construct, see Extended Patterns below.
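
The effect of /m, /s and /i, and of the embedded (?i) form, can be seen in a short sketch. The sample strings and variable names here are made up for illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# /m: "^" matches at the start of every line, so /g collects both words.
my @line_starts = $text =~ /^(\w+)/mg;

# Without /s, "." stops at the newline; with /s it crosses it.
my $dot_plain  = $text =~ /first.*second/  ? 1 : 0;
my $dot_single = $text =~ /first.*second/s ? 1 : 0;

# /i as a trailing modifier and as an embedded (?i) behave the same here.
my $trailing = "HELLO" =~ /hello/i    ? 1 : 0;
my $embedded = "HELLO" =~ /(?i)hello/ ? 1 : 0;

print "@line_starts\n";                                  # prints "first second"
print "$dot_plain $dot_single $trailing $embedded\n";    # prints "0 1 1 1"
```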

/x

/x tells the regular expression parser to ignore most whitespace that is neither backslashed nor within a character class. You can use this to break up your regular expression into (slightly) more readable parts. The # character is also treated as a metacharacter introducing a comment, just as in ordinary Perl code. This also means that if you want real whitespace or # characters in the pattern (outside a character class, where they are unaffected by /x), then you'll either have to escape them (using backslashes or \Q...\E) or encode them using octal, hex, or \N{} escapes. Taken together, these features go a long way towards making Perl's regular expressions more readable.

Note that you have to be careful not to include the pattern delimiter in the comment--perl has no way of knowing you did not intend to close the pattern early. See the C-comment deletion code in perlop. Also note that anything inside a \Q...\E stays unaffected by /x.

And note that /x doesn't affect space interpretation within a single multi-character construct. For example in \x{...}, regardless of the /x modifier, there can be no spaces. Same for a quantifier such as {3} or {5,}. Similarly, (?:...) can't have a space between the (, ?, and :. Within any delimiters for such a construct, allowed spaces are not affected by /x, and depend on the construct. For example, \x{...} can't have spaces because hexadecimal numbers don't have spaces in them. But, Unicode properties can have spaces, so in \p{...} there can be spaces that follow the Unicode rules, for which see Properties accessible through \p{} and \P{} in perluniprops.
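
A small sketch of these rules follows; the date pattern and the variable names are made up for illustration. Note that none of the comments inside the pattern contains the delimiter.

```perl
use strict;
use warnings;

# Whitespace and comments inside the pattern are ignored under /x.
my $date_re = qr/
    (\d{4})    # year
    -
    (\d{2})    # month
    -
    (\d{2})    # day
/x;

my ($year, $month, $day) = "2014-02-09" =~ $date_re;

# An unescaped space in the pattern is ignored, so this matches "ab" ...
my $space_ignored = "ab"  =~ /^a b$/x   ? 1 : 0;
# ... and a literal space or "#" has to be escaped or put in a character class.
my $space_escaped = "a b" =~ /^a\ b$/x  ? 1 : 0;
my $space_class   = "a b" =~ /^a[ ]b$/x ? 1 : 0;
my $hash_escaped  = "a#b" =~ /^a\#b$/x  ? 1 : 0;

print "$year $month $day\n";    # prints "2014 02 09"
print "$space_ignored $space_escaped $space_class $hash_escaped\n";    # prints "1 1 1 1"
```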

Character set modifiers

/d, /u, /a, and /l, available starting in 5.14, are called the character set modifiers; they affect the character set semantics used for the regular expression.

The /d, /u , and /l modifiers are not likely to be of much use to you, and so you need not worry about them very much. They exist for Perl's internal use, so that complex regular expression data structures can be automatically serialized and later exactly reconstituted, including all their nuances. But, since Perl can't keep a secret, and there may be rare instances where they are useful, they are documented here.

The /a modifier, on the other hand, may be useful. Its purpose is to allow code that is to work mostly on ASCII data to not have to concern itself with Unicode.

Briefly, /l sets the character set to that of whatever Locale is in effect at the time of the execution of the pattern match.

/u sets the character set to Unicode.

/a also sets the character set to Unicode, BUT adds several restrictions for ASCII-safe matching.

/d is the old, problematic, pre-5.14 Default character set behavior. Its only use is to force that old behavior.

At any given time, exactly one of these modifiers is in effect. Their existence allows Perl to keep the originally compiled behavior of a regular expression, regardless of what rules are in effect when it is actually executed. And if it is interpolated into a larger regex, the original's rules continue to apply to it, and only it.

The /l and /u modifiers are automatically selected for regular expressions compiled within the scope of various pragmas, and we recommend that in general, you use those pragmas instead of specifying these modifiers explicitly. For one thing, the modifiers affect only pattern matching, and do not extend to even any replacement done, whereas using the pragmas give consistent results for all appropriate operations within their scopes. For example,

    s/foo/\Ubar/il

will match "foo" using the locale's rules for case-insensitive matching, but the /l does not affect how the \U operates. Most likely you want both of them to use locale rules. To do this, instead compile the regular expression within the scope of use locale. This both implicitly adds the /l and applies locale rules to the \U. The lesson is to use locale and not /l explicitly.

Similarly, it would be better to use use feature 'unicode_strings' instead of,

    s/foo/\Lbar/iu

to get Unicode rules, as the \L in the former (but not necessarily the latter) would also use Unicode rules.

More detail on each of the modifiers follows. Most likely you don't need to know this detail for /l, /u, and /d, and can skip ahead to /a.

/l

means to use the current locale's rules (see perllocale) when pattern matching. For example, \w will match the "word" characters of that locale, and "/i" case-insensitive matching will match according to the locale's case folding rules. The locale used will be the one in effect at the time of execution of the pattern match. This may not be the same as the compilation-time locale, and can differ from one match to another if there is an intervening call of the setlocale() function.

Perl only supports single-byte locales. This means that code points above 255 are treated as Unicode no matter what locale is in effect. Under Unicode rules, there are a few case-insensitive matches that cross the 255/256 boundary. These are disallowed under /l . For example, 0xFF (on ASCII platforms) does not caselessly match the character at 0x178, LATIN CAPITAL LETTER Y WITH DIAERESIS , because 0xFF may not be LATIN SMALL LETTER Y WITH DIAERESIS in the current locale, and Perl has no way of knowing if that character even exists in the locale, much less what code point it is.

This modifier may be specified to be the default by use locale , but see Which character set modifier is in effect?.

/u

means to use Unicode rules when pattern matching. On ASCII platforms, this means that the code points between 128 and 255 take on their Latin-1 (ISO-8859-1) meanings (which are the same as Unicode's). (Otherwise Perl considers their meanings to be undefined.) Thus, under this modifier, the ASCII platform effectively becomes a Unicode platform; and hence, for example, \w will match any of the more than 100_000 word characters in Unicode.

Unlike most locales, which are specific to a language and country pair, Unicode classifies all the characters that are letters somewhere in the world as \w. For example, your locale might not think that LATIN SMALL LETTER ETH is a letter (unless you happen to speak Icelandic), but Unicode does. Similarly, all the characters that are decimal digits somewhere in the world will match \d; this is hundreds, not 10, possible matches. And some of those digits look like some of the 10 ASCII digits, but mean a different number, so a human could easily think a number is a different quantity than it really is. For example, BENGALI DIGIT FOUR (U+09EA) looks very much like an ASCII DIGIT EIGHT (U+0038). And \d+ may match strings of digits that are a mixture from different writing systems, creating a security issue. num() in Unicode::UCD can be used to sort this out. Or the /a modifier can be used to force \d to match just the ASCII 0 through 9.
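The following sketch illustrates the digit issue described above. It assumes Perl 5.14 or later (for the /a and /u modifiers and for num() in Unicode::UCD); the variable names are illustrative.

```perl
use strict;
use warnings;
use Unicode::UCD qw(num);

my $bengali_four = "\x{09EA}";     # BENGALI DIGIT FOUR
my $mixed        = "1\x{09EA}";    # ASCII digit followed by a Bengali digit

my $digit_under_u = $bengali_four =~ /\A\d\z/u ? 1 : 0;   # Unicode rules
my $digit_under_a = $bengali_four =~ /\A\d\z/a ? 1 : 0;   # ASCII-restricted
my $mixed_under_u = $mixed =~ /\A\d+\z/u ? 1 : 0;         # mixed scripts pass \d+

my $value       = num($bengali_four);   # numeric value of the single digit
my $mixed_value = num($mixed);          # undef: digits come from two scripts

print "\\d under /u: $digit_under_u, under /a: $digit_under_a\n";
print "mixed string passes \\d+ under /u: $mixed_under_u\n";
print "num() of the Bengali digit: $value\n";
print "num() of the mixed string is ",
      (defined $mixed_value ? "defined" : "undef"), "\n";
```

Checking the result of num() is one way to reject strings that mix digits from different scripts even though \d+ accepts them.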

Also, under this modifier, case-insensitive matching works on the full set of Unicode characters. The KELVIN SIGN , for example matches the letters "k" and "K"; and LATIN SMALL LIGATURE FF matches the sequence "ff", which, if you're not prepared, might make it look like a hexadecimal constant, presenting another potential security issue. See http://unicode.org/reports/tr36 for a detailed discussion of Unicode security issues.

This modifier may be specified to be the default by use feature 'unicode_strings', use locale ':not_characters', or use 5.012 (or higher), but see Which character set modifier is in effect?.

/d

This modifier means to use the "Default" native rules of the platform except when there is cause to use Unicode rules instead, as follows:

1. the target string is encoded in UTF-8; or
2. the pattern is encoded in UTF-8; or
3. the pattern explicitly mentions a code point that is above 255 (say by \x{100}); or
4. the pattern uses a Unicode name (\N{...}); or
5. the pattern uses a Unicode property (\p{...}); or
6. the pattern uses (?[ ])

Another mnemonic for this modifier is "Depends", as the rules actually used depend on various things, and as a result you can get unexpected results. See The Unicode Bug in perlunicode. The Unicode Bug has become rather infamous, leading to yet another (printable) name for this modifier, "Dodgy".

Unless the pattern or the string is encoded in UTF-8, only ASCII characters can match positively.

Here are some examples of how that works on an ASCII platform:

  $str = "\xDF";      # $str is not in UTF-8 format.
  $str =~ /^\w/;      # No match, as $str isn't in UTF-8 format.
  $str .= "\x{0e0b}"; # Now $str is in UTF-8 format.
  $str =~ /^\w/;      # Match! $str is now in UTF-8 format.
  chop $str;
  $str =~ /^\w/;      # Still a match! $str remains in UTF-8 format.

This modifier is automatically selected by default when none of the others are, so yet another name for it is "Default".

Because of the unexpected behaviors associated with this modifier, you probably should only use it to maintain weird backward compatibilities.

/a (and /aa)

This modifier stands for ASCII-restrict (or ASCII-safe). This modifier, unlike the others, may be doubled-up to increase its effect.

When it appears singly, it causes the sequences \d , \s, \w , and the Posix character classes to match only in the ASCII range. They thus revert to their pre-5.6, pre-Unicode meanings. Under /a , \d always means precisely the digits "0" to "9" ; \s means the five characters [ \f\n\r\t] , and starting in Perl v5.18, experimentally, the vertical tab; \w means the 63 characters [A-Za-z0-9_] ; and likewise, all the Posix classes such as [[:print:]] match only the appropriate ASCII-range characters.

This modifier is useful for people who only incidentally use Unicode, and who do not wish to be burdened with its complexities and security concerns.

With /a , one can write \d with confidence that it will only match ASCII characters, and should the need arise to match beyond ASCII, you can instead use \p{Digit} (or \p{Word} for \w ). There are similar \p{...} constructs that can match beyond ASCII both white space (see Whitespace in perlrecharclass), and Posix classes (see POSIX Character Classes in perlrecharclass). Thus, this modifier doesn't mean you can't use Unicode, it means that to get Unicode matching you must explicitly use a construct (\p{} , \P{} ) that signals Unicode.

As you would expect, this modifier causes, for example, \D to mean the same thing as [^0-9] ; in fact, all non-ASCII characters match \D , \S , and \W . \b still means to match at the boundary between \w and \W , using the /a definitions of them (similarly for \B ).

Otherwise, /a behaves like the /u modifier, in that case-insensitive matching uses Unicode semantics; for example, "k" will match the Unicode \N{KELVIN SIGN} under /i matching, and code points in the Latin1 range above ASCII will follow Unicode rules when it comes to case-insensitive matching.

To forbid ASCII/non-ASCII matches (like "k" with \N{KELVIN SIGN} ), specify the "a" twice, for example /aai or /aia . (The first occurrence of "a" restricts the \d , etc., and the second occurrence adds the /i restrictions.) But, note that code points outside the ASCII range will use Unicode rules for /i matching, so the modifier doesn't really restrict things to just ASCII; it just forbids the intermixing of ASCII and non-ASCII.
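A short sketch of the difference between /a and /aa under /i, assuming Perl 5.14 or later and an ASCII platform (the variable names are illustrative):

```perl
use strict;
use warnings;

my $kelvin = "\x{212A}";    # KELVIN SIGN

my $single_a = $kelvin =~ /\Ak\z/ai  ? 1 : 0;   # Unicode case folding still applies
my $double_a = $kelvin =~ /\Ak\z/aai ? 1 : 0;   # ASCII "k" may not match non-ASCII

# Two non-ASCII characters can still fold to each other under /aa.
my $latin1_pair = "\xC9" =~ /\A\xE9\z/aai ? 1 : 0;

print "/ai: $single_a, /aai: $double_a, non-ASCII pair under /aai: $latin1_pair\n";
```

Only the ASCII/non-ASCII pairing is refused by /aa; E-acute and e-acute still match each other caselessly.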

To summarize, this modifier provides protection for applications that don't wish to be exposed to all of Unicode. Specifying it twice gives added protection.

This modifier may be specified to be the default by use re '/a' or use re '/aa'. If you do so, you may actually have occasion to use the /u modifier explicitly if there are a few regular expressions where you do want full Unicode rules (but even here, it's best if everything were under feature "unicode_strings", along with the use re '/aa'). Also see Which character set modifier is in effect?.

Which character set modifier is in effect?

Which of these modifiers is in effect at any given point in a regular expression depends on a fairly complex set of interactions. These have been designed so that in general you don't have to worry about it, but this section gives the gory details. As explained below in Extended Patterns it is possible to explicitly specify modifiers that apply only to portions of a regular expression. The innermost always has priority over any outer ones, and one applying to the whole expression has priority over any of the default settings that are described in the remainder of this section.

The use re '/foo' pragma can be used to set default modifiers (including these) for regular expressions compiled within its scope. This pragma has precedence over the other pragmas listed below that also change the defaults.

Otherwise, use locale sets the default modifier to /l; and use feature 'unicode_strings', or use 5.012 (or higher), sets the default to /u when not in the same scope as either use locale or use bytes. (use locale ':not_characters' also sets the default to /u, overriding any plain use locale.) Unlike the mechanisms mentioned above, these affect operations besides regular expression pattern matching, and so give more consistent results with other operators, including using \U, \l, etc. in substitution replacements.

If none of the above apply, for backwards compatibility reasons, the /d modifier is the one in effect by default. As this can lead to unexpected results, it is best to specify which other rule set should be used.
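One way to see which character set modifier a pattern ended up with is to stringify the compiled regex. The sketch below assumes Perl 5.14 or later, where the stringified form begins with (?^ followed by any non-default flags; the variable names are illustrative.

```perl
use strict;
use warnings;

my $default_re = qr/\d/;         # no pragma in scope, so /d applies

my ($ascii_re, $override_re, $unicode_re);
{
    use re '/a';
    $ascii_re    = qr/\d/;       # default taken from "use re"
    $override_re = qr/\d/u;      # an explicit modifier beats the pragma default
}
{
    use feature 'unicode_strings';
    $unicode_re = qr/\d/;        # default becomes /u
}

print "$default_re\n";    # (?^:\d)
print "$ascii_re\n";      # (?^a:\d)
print "$override_re\n";
print "$unicode_re\n";    # (?^u:\d)
```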

Character set modifier behavior prior to Perl 5.14

Prior to 5.14, there were no explicit modifiers, but /l was implied for regexes compiled within the scope of use locale , and /d was implied otherwise. However, interpolating a regex into a larger regex would ignore the original compilation in favor of whatever was in effect at the time of the second compilation. There were a number of inconsistencies (bugs) with the /d modifier, where Unicode rules would be used when inappropriate, and vice versa. \p{} did not imply Unicode rules, and neither did all occurrences of \N{} , until 5.12.

Regular Expressions

Metacharacters

The patterns used in Perl pattern matching evolved from those supplied in the Version 8 regex routines. (The routines are derived (distantly) from Henry Spencer's freely redistributable reimplementation of the V8 routines.) See Version 8 Regular Expressions for details.

In particular the following metacharacters have their standard egrep-ish meanings:

  \        Quote the next metacharacter
  ^        Match the beginning of the line
  .        Match any character (except newline)
  $        Match the end of the line (or before newline at the end)
  |        Alternation
  ()       Grouping
  []       Bracketed Character class

By default, the "^" character is guaranteed to match only the beginning of the string, the "$" character only the end (or before the newline at the end), and Perl does certain optimizations with the assumption that the string contains only one line. Embedded newlines will not be matched by "^" or "$". You may, however, wish to treat a string as a multi-line buffer, such that the "^" will match after any newline within the string (except if the newline is the last character in the string), and "$" will match before any newline. At the cost of a little more overhead, you can do this by using the /m modifier on the pattern match operator. (Older programs did this by setting $* , but this option was removed in perl 5.10.)

To simplify multi-line substitutions, the "." character never matches a newline unless you use the /s modifier, which in effect tells Perl to pretend the string is a single line--even if it isn't.
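A small sketch of how /m and /s change the anchors and the dot (the sample strings and variable names are illustrative):

```perl
use strict;
use warnings;

my $text = "alpha\nbeta\n";

my @single_line = $text =~ /^(\w+)$/g;    # no match: \w+ cannot span the newline
my @multi_line  = $text =~ /^(\w+)$/mg;   # "^" and "$" now match at each line

my $dot_default = "a\nb" =~ /a.b/  ? 1 : 0;   # "." stops at the newline
my $dot_single  = "a\nb" =~ /a.b/s ? 1 : 0;   # /s lets "." match it

print "without /m: ", scalar(@single_line), " matches\n";
print "with /m: @multi_line\n";
print "dot without /s: $dot_default, with /s: $dot_single\n";
```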

Quantifiers

The following standard quantifiers are recognized:

  *           Match 0 or more times
  +           Match 1 or more times
  ?           Match 1 or 0 times
  {n}         Match exactly n times
  {n,}        Match at least n times
  {n,m}       Match at least n but not more than m times

(If a curly bracket occurs in any other context and does not form part of a backslashed sequence like \x{...} , it is treated as a regular character. In particular, the lower quantifier bound is not optional, and a typo in a quantifier silently causes it to be treated as the literal characters. For example,

  /o{4,3}/

looks like a quantifier that matches 0 times, since 4 is greater than 3, but it really means to match the sequence of six characters "o { 4 , 3 }" . It is planned to eventually require literal uses of curly brackets to be escaped, say by preceding them with a backslash or enclosing them within square brackets, ("\{" or "[{]" ). This change will allow for future syntax extensions (like making the lower bound of a quantifier optional), and better error checking. In the meantime, you should get in the habit of escaping all instances where you mean a literal "{".)

The "*" quantifier is equivalent to {0,} , the "+" quantifier to {1,} , and the "?" quantifier to {0,1} . n and m are limited to non-negative integral values less than a preset limit defined when perl is built. This is usually 32766 on the most common platforms. The actual limit can be seen in the error message generated by code such as this:

  $_ **= $_ , / {$_} / for 2 .. 42;

By default, a quantified subpattern is "greedy", that is, it will match as many times as possible (given a particular starting location) while still allowing the rest of the pattern to match. If you want it to match the minimum number of times possible, follow the quantifier with a "?". Note that the meanings don't change, just the "greediness":

  *?        Match 0 or more times, not greedily
  +?        Match 1 or more times, not greedily
  ??        Match 0 or 1 time, not greedily
  {n}?      Match exactly n times, not greedily (redundant)
  {n,}?     Match at least n times, not greedily
  {n,m}?    Match at least n but not more than m times, not greedily
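For instance, in this sketch (the sample string and variable names are illustrative) the same quantifier captures very different text depending on its greediness:

```perl
use strict;
use warnings;

my $html = "<b>bold</b>";

my ($greedy) = $html =~ /<(.+)>/;     # runs to the last ">" in the string
my ($lazy)   = $html =~ /<(.+?)>/;    # stops at the first ">"

print "greedy: $greedy\n";    # greedy: b>bold</b
print "lazy:   $lazy\n";      # lazy:   b
```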

By default, when a quantified subpattern does not allow the rest of the overall pattern to match, Perl will backtrack. However, this behaviour is sometimes undesirable. Thus Perl provides the "possessive" quantifier form as well.

  *+     Match 0 or more times and give nothing back
  ++     Match 1 or more times and give nothing back
  ?+     Match 0 or 1 time and give nothing back
  {n}+   Match exactly n times and give nothing back (redundant)
  {n,}+  Match at least n times and give nothing back
  {n,m}+ Match at least n but not more than m times and give nothing back

For instance,

  'aaaa' =~ /a++a/

will never match, as the a++ will gobble up all the a 's in the string and won't leave any for the remaining part of the pattern. This feature can be extremely useful to give perl hints about where it shouldn't backtrack. For instance, the typical "match a double-quoted string" problem can be most efficiently performed when written as:

  /"(?:[^"\\]++|\\.)*+"/

as we know that if the final quote does not match, backtracking will not help. See the independent subexpression (?>pattern) for more details; possessive quantifiers are just syntactic sugar for that construct. For instance the above example could also be written as follows:

  /"(?>(?:(?>[^"\\]+)|\\.)*)"/
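The claims above can be checked with a short sketch, assuming Perl 5.10 or later for the possessive syntax (the sample strings and variable names are illustrative):

```perl
use strict;
use warnings;

my $possessive = 'aaaa' =~ /a++a/    ? 1 : 0;   # a++ keeps every "a"
my $greedy     = 'aaaa' =~ /a+a/     ? 1 : 0;   # a+ gives one "a" back
my $atomic     = 'aaaa' =~ /(?>a+)a/ ? 1 : 0;   # same as the possessive form

# The double-quoted-string pattern from the text, with escaped quotes inside.
my $quoted_re = qr/"(?:[^"\\]++|\\.)*+"/;
my ($quoted)  = 'say "a \"quoted\" word" now' =~ /($quoted_re)/;

print "possessive: $possessive, greedy: $greedy, atomic: $atomic\n";
print "quoted string: $quoted\n";
```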

Escape sequences

Because patterns are processed as double-quoted strings, the following also work:

  \t          tab                   (HT, TAB)
  \n          newline               (LF, NL)
  \r          return                (CR)
  \f          form feed             (FF)
  \a          alarm (bell)          (BEL)
  \e          escape (think troff)  (ESC)
  \cK         control char          (example: VT)
  \x{}, \x00  character whose ordinal is the given hexadecimal number
  \N{name}    named Unicode character or character sequence
  \N{U+263D}  Unicode character     (example: FIRST QUARTER MOON)
  \o{}, \000  character whose ordinal is the given octal number
  \l          lowercase next char (think vi)
  \u          uppercase next char (think vi)
  \L          lowercase till \E (think vi)
  \U          uppercase till \E (think vi)
  \Q          quote (disable) pattern metacharacters till \E
  \E          end either case modification or quoted section, think vi

Details are in Quote and Quote-like Operators in perlop.

Character Classes and other Special Escapes

In addition, Perl defines the following:

  Sequence   Note  Description
   [...]     [1]   Match a character according to the rules of the
                     bracketed character class defined by the "...".
                     Example: [a-z] matches "a" or "b" or "c" ... or "z"
   [[:...:]] [2]   Match a character according to the rules of the POSIX
                     character class "..." within the outer bracketed
                     character class.  Example: [[:upper:]] matches any
                     uppercase character.
   (?[...])  [8]   Extended bracketed character class
   \w        [3]   Match a "word" character (alphanumeric plus "_", plus
                     other connector punctuation chars plus Unicode
                     marks)
   \W        [3]   Match a non-"word" character
   \s        [3]   Match a whitespace character
   \S        [3]   Match a non-whitespace character
   \d        [3]   Match a decimal digit character
   \D        [3]   Match a non-digit character
   \pP       [3]   Match P, named property.  Use \p{Prop} for longer names
   \PP       [3]   Match non-P
   \X        [4]   Match Unicode "eXtended grapheme cluster"
   \C              Match a single C-language char (octet) even if that is
                     part of a larger UTF-8 character.  Thus it breaks up
                     characters into their UTF-8 bytes, so you may end up
                     with malformed pieces of UTF-8.  Unsupported in
                     lookbehind.
   \1        [5]   Backreference to a specific capture group or buffer.
                     '1' may actually be any positive integer.
   \g1       [5]   Backreference to a specific or previous group,
   \g{-1}    [5]   The number may be negative indicating a relative
                     previous group and may optionally be wrapped in
                     curly brackets for safer parsing.
   \g{name}  [5]   Named backreference
   \k<name>  [5]   Named backreference
   \K        [6]   Keep the stuff left of the \K, don't include it in $&
   \N        [7]   Any character but \n.  Not affected by /s modifier
   \v        [3]   Vertical whitespace
   \V        [3]   Not vertical whitespace
   \h        [3]   Horizontal whitespace
   \H        [3]   Not horizontal whitespace
   \R        [4]   Linebreak

Assertions

Perl defines the following zero-width assertions:

  \b  Match a word boundary
  \B  Match except at a word boundary
  \A  Match only at beginning of string
  \Z  Match only at end of string, or before newline at the end
  \z  Match only at end of string
  \G  Match only at pos() (e.g. at the end-of-match position
      of prior m//g)

A word boundary (\b ) is a spot between two characters that has a \w on one side of it and a \W on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a \W . (Within character classes \b represents backspace rather than a word boundary, just as it normally does in any double-quoted string.) The \A and \Z are just like "^" and "$", except that they won't match multiple times when the /m modifier is used, while "^" and "$" will match at every internal line boundary. To match the actual end of the string and not ignore an optional trailing newline, use \z .
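The distinctions between these assertions can be sketched as follows (the sample strings and variable names are illustrative):

```perl
use strict;
use warnings;

my $line = "foo\n";
my $big_z   = $line =~ /foo\Z/ ? 1 : 0;   # \Z tolerates the trailing newline
my $small_z = $line =~ /foo\z/ ? 1 : 0;   # \z demands the true end of string

my $text = "one\ntwo\n";
my $caret_dollar_m = $text =~ /^two$/m    ? 1 : 0;   # /m: anchors match per line
my $a_z_m          = $text =~ /\Atwo\Z/m  ? 1 : 0;   # \A and \Z ignore /m

# \b counts only standalone words; the "cat" inside "concat" is skipped.
my $whole_words = () = "cat concat cat" =~ /\bcat\b/g;

print "\\Z: $big_z, \\z: $small_z\n";
print "^...\$ with /m: $caret_dollar_m, \\A...\\Z with /m: $a_z_m\n";
print "standalone 'cat' count: $whole_words\n";
```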

The \G assertion can be used to chain global matches (using m//g), as described in Regexp Quote-Like Operators in perlop. It is also useful when writing lex-like scanners, when you have several patterns that you want to match against consecutive substrings of your string; see the previous reference. The actual location where \G will match can also be influenced by using pos() as an lvalue: see pos. Note that the rule for zero-length matches (see Repeated Patterns Matching a Zero-length Substring) is modified somewhat, in that contents to the left of \G are not counted when determining the length of the match. Thus the following will not match forever:

  my $string = 'ABC';
  pos($string) = 1;
  while ($string =~ /(.\G)/g) {
      print $1;
  }

It will print 'A' and then terminate, as it considers the match to be zero-width, and thus will not match at the same position twice in a row.

It is worth noting that \G improperly used can result in an infinite loop. Take care when using patterns that include \G in an alternation.
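A lex-like scanner of the kind mentioned above might be sketched as follows. The token names, the tiny grammar, and the input are all illustrative assumptions; the technique is simply several \G-anchored patterns tried in turn with the /gc modifiers.

```perl
use strict;
use warnings;

my $input = "total = 42";
my @tokens;

while (1) {
    # /c keeps pos() in place when a match fails, so the next pattern
    # is tried at the same position.
    if    ($input =~ /\G\s+/gc)            { next }
    elsif ($input =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, "IDENT($1)" }
    elsif ($input =~ /\G(\d+)/gc)          { push @tokens, "NUMBER($1)" }
    elsif ($input =~ /\G(=)/gc)            { push @tokens, "OP($1)" }
    else                                   { last }   # end of input or unknown char
}

print join(' ', @tokens), "\n";    # IDENT(total) OP(=) NUMBER(42)
```

Each pattern consumes at least one character, which avoids the infinite-loop hazard noted above.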

Capture groups

The bracketing construct ( ... ) creates capture groups (also referred to as capture buffers). To refer to the current contents of a group later on, within the same pattern, use \g1 (or \g{1} ) for the first, \g2 (or \g{2} ) for the second, and so on. This is called a backreference. There is no limit to the number of captured substrings that you may use. Groups are numbered with the leftmost open parenthesis being number 1, etc. If a group did not match, the associated backreference won't match either. (This can happen if the group is optional, or in a different branch of an alternation.) You can omit the "g" , and write "\1" , etc, but there are some issues with this form, described below.

You can also refer to capture groups relatively, by using a negative number, so that \g-1 and \g{-1} both refer to the immediately preceding capture group, and \g-2 and \g{-2} both refer to the group before it. For example:

  /
   (Y)            # group 1
   (              # group 2
      (X)         # group 3
      \g{-1}      # backref to group 3
      \g{-3}      # backref to group 1
   )
  /x

would match the same as /(Y) ( (X) \g3 \g1 )/x . This allows you to interpolate regexes into larger regexes and not have to worry about the capture groups being renumbered.
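The benefit for interpolation can be sketched like this, assuming Perl 5.10 or later (the variable names are illustrative). The relative form keeps pointing at its own group after being embedded, while the absolute form is silently renumbered:

```perl
use strict;
use warnings;

my $doubled_relative = qr/(\w)\g{-1}/;   # "the group just before me"
my $doubled_absolute = qr/(\w)\1/;       # "group number 1", wherever that is

# Embedded after another group, the relative reference still means (\w).
my $relative_ok = "xaa" =~ /(x)$doubled_relative/ ? 1 : 0;

# The absolute reference now points at (x) instead.
my $absolute_ok       = "xaa" =~ /(x)$doubled_absolute/ ? 1 : 0;
my $absolute_surprise = "xax" =~ /(x)$doubled_absolute/ ? 1 : 0;

print "relative: $relative_ok, absolute: $absolute_ok, ",
      "absolute on 'xax': $absolute_surprise\n";
```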

You can dispense with numbers altogether and create named capture groups. The notation is (?<name>...) to declare and \g{name} to reference. (To be compatible with .Net regular expressions, \g{name} may also be written as \k{name}, \k<name> or \k'name'.) name must not begin with a number, nor contain hyphens. When different groups within the same pattern have the same name, any reference to that name assumes the leftmost defined group. Named groups count in absolute and relative numbering, and so can also be referred to by those numbers. (It's possible to do things with named capture groups that would otherwise require (??{}) .)

Capture group contents are dynamically scoped and available to you outside the pattern until the end of the enclosing block or until the next successful match, whichever comes first. (See Compound Statements in perlsyn.) You can refer to them by absolute number (using "$1" instead of "\g1" , etc); or by name via the %+ hash, using "$+{name}".
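A brief sketch of reading the same captures by name and by number, assuming Perl 5.10 or later (the date string and the group names are illustrative):

```perl
use strict;
use warnings;

my $stamp = "released 2014-02-11";
my ($year, $month, $day, $first_group);

if ($stamp =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/) {
    # Named groups are read through the %+ hash ...
    ($year, $month, $day) = ($+{year}, $+{month}, $+{day});
    # ... and still count in the ordinary numbering.
    $first_group = $1;
}

print "year=$year month=$month day=$day (\$1 is $first_group)\n";
```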

Braces are required in referring to named capture groups, but are optional for absolute or relative numbered ones. Braces are safer when creating a regex by concatenating smaller strings. For example if you have qr/$a$b/, and $a contained "\g1" , and $b contained "37" , you would get /\g137/ which is probably not what you intended.

The \g and \k notations were introduced in Perl 5.10.0. Prior to that there were no named nor relative numbered capture groups. Absolute numbered groups were referred to using \1 , \2 , etc., and this notation is still accepted (and likely always will be). But it leads to some ambiguities if there are more than 9 capture groups, as \10 could mean either the tenth capture group, or the character whose ordinal in octal is 010 (a backspace in ASCII). Perl resolves this ambiguity by interpreting \10 as a backreference only if at least 10 left parentheses have opened before it. Likewise \11 is a backreference only if at least 11 left parentheses have opened before it. And so on. \1 through \9 are always interpreted as backreferences. There are several examples below that illustrate these perils. You can avoid the ambiguity by always using \g{} or \g if you mean capturing groups; and for octal constants always using \o{} , or for \077 and below, using 3 digits padded with leading zeros, since a leading zero implies an octal constant.

The \digit notation also works in certain circumstances outside the pattern. See Warning on \1 Instead of $1 below for details.

Examples:

  s/^([^ ]*) *([^ ]*)/$2 $1/;     # swap first two words

  /(.)\g1/                        # find first doubled char
       and print "'$1' is the first doubled character\n";

  /(?<char>.)\k<char>/            # ... a different way
       and print "'$+{char}' is the first doubled character\n";

  /(?'char'.)\g1/                 # ... mix and match
       and print "'$1' is the first doubled character\n";

  if (/Time: (..):(..):(..)/) {   # parse out values
      $hours = $1;
      $minutes = $2;
      $seconds = $3;
  }

  /(.)(.)(.)(.)(.)(.)(.)(.)(.)\g10/   # \g10 is a backreference
  /(.)(.)(.)(.)(.)(.)(.)(.)(.)\10/    # \10 is octal
  /((.)(.)(.)(.)(.)(.)(.)(.)(.))\10/  # \10 is a backreference
  /((.)(.)(.)(.)(.)(.)(.)(.)(.))\010/ # \010 is octal

  $a = '(.)\1';        # Creates problems when concatenated.
  $b = '(.)\g{1}';     # Avoids the problems.
  "aa" =~ /${a}/;      # True
  "aa" =~ /${b}/;      # True
  "aa0" =~ /${a}0/;    # False!
  "aa0" =~ /${b}0/;    # True
  "aa\x08" =~ /${a}0/;  # True!
  "aa\x08" =~ /${b}0/;  # False

Several special variables also refer back to portions of the previous match. $+ returns whatever the last bracket match matched. $& returns the entire matched string. (At one point $0 did also, but now it returns the name of the program.) $` returns everything before the matched string. $' returns everything after the matched string. And $^N contains whatever was matched by the most-recently closed group (submatch). $^N can be used in extended patterns (see below), for example to assign a submatch to a variable.

These special variables, like the %+ hash and the numbered match variables ($1 , $2 , $3 , etc.) are dynamically scoped until the end of the enclosing block or until the next successful match, whichever comes first. (See Compound Statements in perlsyn.)

NOTE: Failed matches in Perl do not reset the match variables, which makes it easier to write code that tests for a series of more specific cases and remembers the best match.

WARNING: Once Perl sees that you need one of $&, $`, or $' anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program. Perl uses the same mechanism to produce $1, $2, etc, so you also pay a price for each pattern that contains capturing parentheses. (To avoid this cost while retaining the grouping behaviour, use the extended regular expression (?: ... ) instead.) But if you never use $&, $` or $', then patterns without capturing parentheses will not be penalized. So avoid $&, $', and $` if you can, but if you can't (and some algorithms really appreciate them), once you've used them once, use them at will, because you've already paid the price. As of 5.17.4, the presence of each of the three variables in a program is recorded separately, and depending on circumstances, perl may be able to be more efficient knowing that only $&, rather than all three, has been seen, for example.

As a workaround for this problem, Perl 5.10.0 introduces ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}, which are equivalent to $`, $& and $', except that they are only guaranteed to be defined after a successful match that was executed with the /p (preserve) modifier. Unlike their punctuation-character equivalents, these variables incur no global performance penalty; the trade-off is that you have to tell perl when you want to use them.
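A minimal sketch of the /p variables, assuming Perl 5.10 or later (the sample string and variable names are illustrative):

```perl
use strict;
use warnings;

my $text = "hello world";
my ($before, $matched, $after);

if ($text =~ /o w/p) {
    ($before, $matched, $after) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "before='$before' matched='$matched' after='$after'\n";
# before='hell' matched='o w' after='orld'
```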

Quoting metacharacters

Backslashed metacharacters in Perl are alphanumeric, such as \b , \w , \n . Unlike some other regular expression languages, there are no backslashed symbols that aren't alphanumeric. So anything that looks like \\, \(, \), \[, \], \{, or \} is always interpreted as a literal character, not a metacharacter. This was once used in a common idiom to disable or quote the special meanings of regular expression metacharacters in a string that you want to use for a pattern. Simply quote all non-"word" characters:

  $pattern =~ s/(\W)/\\$1/g;

(If use locale is set, then this depends on the current locale.) Today it is more common to use the quotemeta() function or the \Q metaquoting escape sequence to disable all metacharacters' special meanings like this:

  /$unquoted\Q$quoted\E$unquoted/

Beware that if you put literal backslashes (those not inside interpolated variables) between \Q and \E , double-quotish backslash interpolation may lead to confusing results. If you need to use literal backslashes within \Q...\E , consult Gory details of parsing quoted constructs in perlop.

quotemeta() and \Q are fully described in quotemeta.
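The effect of quoting can be sketched as follows (the literal text and variable names are illustrative). The same string behaves very differently as a pattern depending on whether its metacharacters are disabled:

```perl
use strict;
use warnings;

my $literal = "a.b*";
my $escaped = quotemeta $literal;    # every non-"word" character is backslashed

# Unquoted, "." and "*" are metacharacters, so unrelated text matches.
my $as_pattern = "aXbbb" =~ /$literal/ ? 1 : 0;

# Quoted, only the literal four characters match.
my $as_text      = "aXbbb"         =~ /\Q$literal\E/ ? 1 : 0;
my $finds_itself = "see a.b* here" =~ /\Q$literal\E/ ? 1 : 0;

print "escaped form: $escaped\n";    # escaped form: a\.b\*
print "unquoted: $as_pattern, quoted: $as_text, quoted finds literal: $finds_itself\n";
```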

Extended Patterns

Perl also defines a consistent extension syntax for features not found in standard tools like awk and lex. The syntax for most of these is a pair of parentheses with a question mark as the first thing within the parentheses. The character after the question mark indicates the extension.

The stability of these extensions varies widely. Some have been part of the core language for many years. Others are experimental and may change without warning or be completely removed. Check the documentation on an individual feature to verify its current status.

A question mark was chosen for this and for the minimal-matching construct because 1) question marks are rare in older regular expressions, and 2) whenever you see one, you should stop and "question" exactly what is going on. That's psychology....

  • (?#text)

    A comment. The text is ignored. If the /x modifier enables whitespace formatting, a simple # will suffice. Note that Perl closes the comment as soon as it sees a ), so there is no way to put a literal ) in the comment.

  • (?adlupimsx-imsx)
  • (?^alupimsx)

    One or more embedded pattern-match modifiers, to be turned on (or turned off, if preceded by - ) for the remainder of the pattern or the remainder of the enclosing pattern group (if any).

    This is particularly useful for dynamic patterns, such as those read in from a configuration file, taken from an argument, or specified in a table somewhere. Consider the case where some patterns want to be case-sensitive and some do not: The case-insensitive ones merely need to include (?i) at the front of the pattern. For example:

    $pattern = "foobar";
    if ( /$pattern/i ) { }

    # more flexible:

    $pattern = "(?i)foobar";
    if ( /$pattern/ ) { }

    These modifiers are restored at the end of the enclosing group. For example,

    ( (?i) blah ) \s+ \g1

    will match blah in any case, some spaces, and an exact (including the case!) repetition of the previous word, assuming the /x modifier, and no /i modifier outside this group.

    These modifiers do not carry over into named subpatterns called in the enclosing group. In other words, a pattern such as ((?i)(?&NAME)) does not change the case-sensitivity of the "NAME" pattern.

    Any of these modifiers can be set to apply globally to all regular expressions compiled within the scope of a use re . See '/flags' mode in re.

    Starting in Perl 5.14, a "^" (caret or circumflex accent) immediately after the "?" is a shorthand equivalent to d-imsx . Flags (except "d" ) may follow the caret to override it. But a minus sign is not legal with it.

    Note that the a , d , l , p , and u modifiers are special in that they can only be enabled, not disabled, and the a , d , l , and u modifiers are mutually exclusive: specifying one de-specifies the others, and a maximum of one (or two a 's) may appear in the construct. Thus, for example, (?-p) will warn when compiled under use warnings ; (?-d:...) and (?dl:...) are fatal errors.

    Note also that the p modifier is special in that its presence anywhere in a pattern has a global effect.
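The scoping of embedded modifiers described above can be sketched like this (the sample strings, the pattern list, and the variable names are illustrative):

```perl
use strict;
use warnings;

# (?i) applies only inside group 1; the backreference is case-sensitive.
my $repeat = qr/ ( (?i) blah ) \s+ \g1 /x;

my $same_case  = "BLAH BLAH" =~ $repeat ? 1 : 0;
my $other_case = "BLAH blah" =~ $repeat ? 1 : 0;

# Patterns such as these might come from a configuration file; each one
# decides for itself whether it is case-insensitive.
my @patterns = ('(?i)foobar', 'FooBar');
my @hits     = grep { "FOOBAR" =~ /$_/ } @patterns;

print "same case: $same_case, different case: $other_case\n";
print "patterns matching FOOBAR: @hits\n";
```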

  • (?:pattern)
  • (?adluimsx-imsx:pattern)
  • (?^aluimsx:pattern)

    This is for clustering, not capturing; it groups subexpressions like "()", but doesn't make backreferences as "()" does. So

    @fields = split(/\b(?:a|b|c)\b/)

    is like

    @fields = split(/\b(a|b|c)\b/)

    but doesn't spit out extra fields. It's also cheaper not to capture characters if you don't need to.

    Any letters between ? and : act as flags modifiers as with (?adluimsx-imsx) . For example,

    /(?s-i:more.*than).*million/i

    is equivalent to the more verbose

    /(?:(?s-i)more.*than).*million/i

    Starting in Perl 5.14, a "^" (caret or circumflex accent) immediately after the "?" is a shorthand equivalent to d-imsx . Any positive flags (except "d" ) may follow the caret, so

    (?^x:foo)

    is equivalent to

    (?x-ims:foo)

    The caret tells Perl that this cluster doesn't inherit the flags of any surrounding pattern, but uses the system defaults (d-imsx ), modified by any flags specified.

    The caret allows for simpler stringification of compiled regular expressions. These look like

    (?^:pattern)

    with any non-default flags appearing between the caret and the colon. A test that looks at such stringification thus doesn't need to have the system default flags hard-coded in it, just the caret. If new flags are added to Perl, the meaning of the caret's expansion will change to include the default for those flags, so the test will still work, unchanged.

    Specifying a negative flag after the caret is an error, as the flag is redundant.

    Mnemonic for (?^...): a fresh beginning, since the usual use of a caret is to match at the beginning.
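
    A short sketch of both points, assuming Perl 5.14 or later and a script that enables neither locale nor the unicode_strings feature (either would add a flag to the stringified form):

    ```perl
    use strict;
    use warnings;

    # Stringifying a compiled pattern shows the caret form.
    my $re = qr/foo/i;
    print "$re\n";                                  # (?^i:foo)

    # The caret resets flags to the defaults, so the outer /i is not inherited.
    my $inherits = "FOO" =~ /(?:foo)/i  ? 1 : 0;    # 1
    my $resets   = "FOO" =~ /(?^:foo)/i ? 1 : 0;    # 0
    print "inherits=$inherits resets=$resets\n";
    ```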

  • (?|pattern)

    This is the "branch reset" pattern, which has the special property that the capture groups are numbered from the same starting point in each alternation branch. It is available starting from perl 5.10.0.

    Capture groups are numbered from left to right, but inside this construct the numbering is restarted for each branch.

    The numbering within each branch will be as normal, and any groups following this construct will be numbered as though the construct contained only one branch, that being the one with the most capture groups in it.

    This construct is useful when you want to capture one of a number of alternative matches.

    Consider the following pattern. The numbers underneath show in which group the captured content will be stored.

        # before  ---------------branch-reset----------- after
        / ( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z ) /x
        # 1            2         2  3        2     3     4

    Be careful when using the branch reset pattern in combination with named captures. Named captures are implemented as being aliases to numbered groups holding the captures, and that interferes with the implementation of the branch reset pattern. If you are using named captures in a branch reset pattern, it's best to use the same names, in the same order, in each of the alternations:

    1. /(?| (?<a> x ) (?<b> y )
    2. | (?<a> z ) (?<b> w )) /x

    Not doing so may lead to surprises:

        "12" =~ /(?| (?<a> \d+ ) | (?<b> \D+))/x;
        say $+{a};    # Prints '12'
        say $+{b};    # *Also* prints '12'.

    The problem here is that both the group named a and the group named b are aliases for the group belonging to $1.
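
    As an illustration of the intended use, the sketch below parses a hypothetical name=value syntax in which the value may be double-quoted, single-quoted, or bare. The syntax and variable names are assumptions made for the example; the point is that every branch stores its capture in $1.

    ```perl
    use strict;
    use warnings;

    # Each alternative has one capture group, and branch reset numbers
    # all three of them as group 1.
    my $value = qr/(?| "([^"]*)" | '([^']*)' | (\S+) )/x;

    my @got;
    for my $input (q{name="Larry Wall"}, q{name='Larry'}, q{name=Larry}) {
        push @got, $1 if $input =~ /^name=$value$/;
    }
    print join("|", @got), "\n";    # Larry Wall|Larry|Larry
    ```

    Without (?|...) the three groups would be numbered 1, 2 and 3, and the caller would have to test each one in turn.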

  • Look-Around Assertions

    Look-around assertions are zero-width patterns which match a specific pattern without including it in $&. Positive assertions match when their subpattern matches; negative assertions match when their subpattern fails. Look-behind matches text up to the current match position; look-ahead matches text following the current match position.

    • (?=pattern)

      A zero-width positive look-ahead assertion. For example, /\w+(?=\t)/ matches a word followed by a tab, without including the tab in $&.

    • (?!pattern)

      A zero-width negative look-ahead assertion. For example /foo(?!bar)/ matches any occurrence of "foo" that isn't followed by "bar". Note however that look-ahead and look-behind are NOT the same thing. You cannot use this for look-behind.

      If you are looking for a "bar" that isn't preceded by a "foo", /(?!foo)bar/ will not do what you want. That's because the (?!foo) is just saying that the next thing cannot be "foo"--and it's not, it's a "bar", so "foobar" will match. Use look-behind instead (see below).

    • (?<=pattern) \K

      A zero-width positive look-behind assertion. For example, /(?<=\t)\w+/ matches a word that follows a tab, without including the tab in $&. Works only for fixed-width look-behind.

      There is a special form of this construct, called \K, which causes the regex engine to "keep" everything it had matched prior to the \K and not include it in $&. This effectively provides variable-length look-behind. The use of \K inside of another look-around assertion is allowed, but the behaviour is currently not well defined.

      For various reasons \K may be significantly more efficient than the equivalent (?<=...) construct, and it is especially useful in situations where you want to efficiently remove something following something else in a string. For instance,

          s/(foo)bar/$1/g;

      can be rewritten as the much more efficient

          s/foo\Kbar//g;

    • (?<!pattern)

      A zero-width negative look-behind assertion. For example /(?<!bar)foo/ matches any occurrence of "foo" that does not follow "bar". Works only for fixed-width look-behind.
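
    The look-ahead versus look-behind distinction, and the \K substitution, can be tried out with a sketch like the following (the sample strings are invented for illustration):

    ```perl
    use strict;
    use warnings;

    # A look-ahead placed before "bar" only inspects "bar" itself, so it
    # cannot reject a preceding "foo"; a look-behind can.
    my $ahead  = "foobar" =~ /(?!foo)bar/  ? 1 : 0;    # 1
    my $behind = "foobar" =~ /(?<!foo)bar/ ? 1 : 0;    # 0

    # \K keeps "foo" out of $&, so only "bar" is replaced (here, deleted).
    (my $text = "foobar foobaz") =~ s/foo\Kbar//g;

    print "$ahead $behind $text\n";    # 1 0 foo foobaz
    ```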

  • (?'NAME'pattern)
  • (?<NAME>pattern)

    A named capture group. Identical in every respect to normal capturing parentheses () but for the additional fact that the group can be referred to by name in various regular expression constructs (like \g{NAME}) and can be accessed by name after a successful match via %+ or %-. See perlvar for more details on the %+ and %- hashes.

    If multiple distinct capture groups have the same name then the $+{NAME} will refer to the leftmost defined group in the match.

    The forms (?'NAME'pattern) and (?<NAME>pattern) are equivalent.

    NOTE: While the notation of this construct is the same as the similar function in .NET regexes, the behavior is not. In Perl the groups are numbered sequentially regardless of being named or not. Thus in the pattern

    1. /(x)(?<foo>y)(z)/

    $+{foo} will be the same as $2, and $3 will contain 'z' instead of the opposite which is what a .NET regex hacker might expect.

    Currently NAME is restricted to simple identifiers only. In other words, it must match /^[_A-Za-z][_A-Za-z0-9]*\z/ or its Unicode extension (see utf8), though it isn't extended by the locale (see perllocale).

    NOTE: In order to make things easier for programmers with experience with the Python or PCRE regex engines, the pattern (?P<NAME>pattern) may be used instead of (?<NAME>pattern); however this form does not support the use of single quotes as a delimiter for the name.

  • \k<NAME>
  • \k'NAME'

    Named backreference. Similar to numeric backreferences, except that the group is designated by name and not number. If multiple groups have the same name then it refers to the leftmost defined group in the current match.

    It is an error to refer to a name not defined by a (?<NAME>) earlier in the pattern.

    Both forms are equivalent.

    NOTE: In order to make things easier for programmers with experience with the Python or PCRE regex engines, the pattern (?P=NAME) may be used instead of \k<NAME>.
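
    A brief sketch combining named captures and named backreferences; the group names quote and body and the sample strings are assumptions for the example:

    ```perl
    use strict;
    use warnings;

    # Named groups are still numbered left to right, so "foo" is also group 2.
    my @parts = "xyz" =~ /(x)(?<foo>y)(z)/ ? ($+{foo}, $2, $3) : ();
    print "@parts\n";    # y y z

    # \k<quote> must match whatever the group named "quote" captured,
    # so the closing delimiter has to be the same kind as the opening one.
    my $quoted = qr/(?<quote>['"])(?<body>.*?)\k<quote>/;
    my $body = q{say "it's here"} =~ $quoted ? $+{body} : undef;
    print "$body\n";     # it's here
    ```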

  • (?{ code })

    WARNING: This extended regular expression feature is considered experimental, and may be changed without notice. Code executed that has side effects may not perform identically from version to version due to the effect of future optimisations in the regex engine. The implementation of this feature was radically overhauled for the 5.18.0 release, and its behaviour in earlier versions of perl was much buggier, especially in relation to parsing, lexical vars, scoping, recursion and reentrancy.

    This zero-width assertion executes any embedded Perl code. It always succeeds, and its return value is set as $^R.

    In literal patterns, the code is parsed at the same time as the surrounding code. While within the pattern, control is passed temporarily back to the perl parser, until the logically-balancing closing brace is encountered. This is similar to the way that an array index expression in a literal string is handled, for example

    1. "abc$array[ 1 + f('[') + g()]def"

    In particular, braces do not need to be balanced:

    1. s/abc(?{ f('{'); })/def/

    Even in a pattern that is interpolated and compiled at run-time, literal code blocks will be compiled once, at perl compile time; the following prints "ABCD":

    1. print "D";
    2. my $qr = qr/(?{ BEGIN { print "A" } })/;
    3. my $foo = "foo";
    4. /$foo$qr(?{ BEGIN { print "B" } })/;
    5. BEGIN { print "C" }

    In patterns where the text of the code is derived from run-time information rather than appearing literally in a source code /pattern/, the code is compiled at the same time that the pattern is compiled, and for reasons of security, use re 'eval' must be in scope. This is to stop user-supplied patterns containing code snippets from being executable.

    In situations where you need to enable this with use re 'eval', you should also have taint checking enabled. Better yet, use the carefully constrained evaluation within a Safe compartment. See perlsec for details about both these mechanisms.
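
    The following sketch shows the run-time restriction in action. The pattern text in $text stands in for data that arrives at run time; the variable names are illustrative only.

    ```perl
    use strict;
    use warnings;

    # Hypothetical pattern text obtained at run time, with an embedded code block.
    my $text = 'a(?{ 1 })';

    # Without "use re 'eval'" in scope, compiling the pattern dies.
    my $ok_without = eval { "a" =~ /$text/; 1 } ? 1 : 0;
    my $error      = $@;

    # With the pragma in scope the same pattern compiles and matches.
    my $ok_with = do {
        use re 'eval';
        "a" =~ /$text/ ? 1 : 0;
    };

    print "without=$ok_without with=$ok_with\n";    # without=0 with=1
    ```

    Keeping the pragma inside the smallest possible block, as here, limits the code that is allowed to compile such patterns.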

    From the viewpoint of parsing, lexical variable scope and closures,

    1. /AAA(?{ BBB })CCC/

    behaves approximately like

    1. /AAA/ && do { BBB } && /CCC/

    Similarly,

    1. qr/AAA(?{ BBB })CCC/

    behaves approximately like

    1. sub { /AAA/ && do { BBB } && /CCC/ }

    In particular:

    1. { my $i = 1; $r = qr/(?{ print $i })/ }
    2. my $i = 2;
    3. /$r/; # prints "1"

    Inside a (?{...}) block, $_ refers to the string the regular expression is matching against. You can also use pos() to know what is the current position of matching within this string.
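
    A minimal sketch of both facilities, assuming Perl 5.18 or later so that the lexical variables are visible inside the code block (the variable names are illustrative):

    ```perl
    use strict;
    use warnings;

    my (@positions, $target);

    # The block runs after each \w is matched. pos() reports how far the
    # engine has advanced, and $_ is the string being matched against.
    "abcd" =~ /^(?: \w (?{ push @positions, pos(); $target = $_ }) )+ $/x;

    print "target=$target positions=@positions\n";
    ```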

    The code block introduces a new scope from the perspective of lexical variable declarations, but not from the perspective of local and similar localizing behaviours. So later code blocks within the same pattern will still see the values which were localized in earlier blocks. These accumulated localizations are undone either at the end of a successful match, or if the assertion is backtracked (compare Backtracking). For example,

        $_ = 'a' x 8;
        m<
           (?{ $cnt = 0 })                   # Initialize $cnt.
           (
             a
             (?{
                 local $cnt = $cnt + 1;      # Update $cnt,
                                             # backtracking-safe.
             })
           )*
           aaaa
           (?{ $res = $cnt })                # On success copy to
                                             # non-localized location.
         >x;

    will initially increment $cnt up to 8; then during backtracking, its value will be unwound back to 4, which is the value assigned to $res. At the end of the regex execution, $cnt will be wound back to its initial value of 0.

    This assertion may be used as the condition in a

    1. (?(condition)yes-pattern|no-pattern)

    switch. If not used in this way, the result of evaluation of code is put into the special variable $^R. This happens immediately, so $^R can be used from other (?{ code }) assertions inside the same regular expression.

    The assignment to $^R above is properly localized, so the old value of $^R is restored if the assertion is backtracked; compare Backtracking.

    Note that the special variable $^N is particularly useful with code blocks to capture the results of submatches in variables without having to keep track of the number of nested parentheses. For example:

        $_ = "The brown fox jumps over the lazy dog";
        /the (\S+)(?{ $color = $^N }) (\S+)(?{ $animal = $^N })/i;
        print "color = $color, animal = $animal\n";

  • (??{ code })

    WARNING: This extended regular expression feature is considered experimental, and may be changed without notice. Code executed that has side effects may not perform identically from version to version due to the effect of future optimisations in the regex engine.

    This is a "postponed" regular subexpression. It behaves in exactly the same way as a (?{ code }) code block as described above, except that its return value, rather than being assigned to $^R, is treated as a pattern, compiled if it's a string (or used as-is if it's a qr// object), then matched as if it were inserted instead of this construct.

    During the matching of this sub-pattern, it has its own set of captures which are valid during the sub-match, but are discarded once control returns to the main pattern. For example, the following matches, with the inner pattern capturing "B" and matching "BB", while the outer pattern captures "A":

    1. my $inner = '(.)\1';
    2. "ABBA" =~ /^(.)(??{ $inner })\1/;
    3. print $1; # prints "A";

    Note that this means that there is no way for the inner pattern to refer to a capture group defined outside. (The code block itself can use $1, etc., to refer to the enclosing pattern's capture groups.) Thus, although

    1. ('a' x 100)=~/(??{'(.)' x 100})/

    will match, it will not set $1 on exit.

    The following pattern matches a parenthesized group:

        $re = qr{
                   \(
                   (?:
                      (?> [^()]+ )  # Non-parens without backtracking
                    |
                      (??{ $re })   # Group with matching parens
                   )*
                   \)
                }x;

    See also (?PARNO) for a different, more efficient way to accomplish the same task.

    Executing a postponed regular expression 50 times without consuming any input string will result in a fatal error. The maximum depth is compiled into perl, so changing it requires a custom build.

  • (?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)

    Similar to (??{ code }) except that it does not involve executing any code or potentially compiling a returned pattern string; instead it treats the part of the current pattern contained within a specified capture group as an independent pattern that must match at the current position. Capture groups contained by the pattern will have the value as determined by the outermost recursion.

    PARNO is a sequence of digits (not starting with 0) whose value reflects the paren-number of the capture group to recurse to. (?R) recurses to the beginning of the whole pattern. (?0) is an alternate syntax for (?R). If PARNO is preceded by a plus or minus sign then it is assumed to be relative, with negative numbers indicating preceding capture groups and positive ones following. Thus (?-1) refers to the most recently declared group, and (?+1) indicates the next group to be declared. Note that the counting for relative recursion differs from that of relative backreferences, in that with recursion unclosed groups are included.

    The following pattern matches a function foo() which may contain balanced parentheses as the argument.

        $re = qr{ (                   # paren group 1 (full function)
                    foo
                    (                 # paren group 2 (parens)
                      \(
                        (             # paren group 3 (contents of parens)
                        (?:
                         (?> [^()]+ ) # Non-parens without backtracking
                        |
                         (?2)         # Recurse to start of paren group 2
                        )*
                        )
                      \)
                    )
                  )
                }x;

    If the pattern was used as follows

    1. 'foo(bar(baz)+baz(bop))'=~/$re/
    2. and print "\$1 = $1\n",
    3. "\$2 = $2\n",
    4. "\$3 = $3\n";

    the output produced should be the following:

    1. $1 = foo(bar(baz)+baz(bop))
    2. $2 = (bar(baz)+baz(bop))
    3. $3 = bar(baz)+baz(bop)

    If there is no corresponding capture group defined, then it is a fatal error. Recursing deeper than 50 times without consuming any input string will also result in a fatal error. The maximum depth is compiled into perl, so changing it requires a custom build.

    The following shows how using negative indexing can make it easier to embed recursive patterns inside of a qr// construct for later use:

    1. my $parens = qr/(\((?:[^()]++|(?-1))*+\))/;
    2. if (/foo $parens \s+ \+ \s+ bar $parens/x) {
    3. # do something here...
    4. }

    Note that this pattern does not behave the same way as the equivalent PCRE or Python construct of the same form. In Perl you can backtrack into a recursed group; in PCRE and Python the recursed-into group is treated as atomic. Also, modifiers are resolved at compile time, so constructs like (?i:(?1)) or (?:(?i)(?1)) do not affect how the sub-pattern will be processed.
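
    To see the relative form at work, the $parens fragment shown earlier can be interpolated twice into one pattern. This is only a sketch; the input string is made up for the example.

    ```perl
    use strict;
    use warnings;

    # (?-1) recurses into the group it sits in, so the fragment keeps working
    # wherever it is interpolated and whatever number that group ends up with.
    my $parens = qr/(\((?:[^()]++|(?-1))*+\))/;

    my @groups = "foo(a(b)) + bar(c)" =~ /foo $parens \s+ \+ \s+ bar $parens/x;
    print "@groups\n";    # (a(b)) (c)
    ```

    Had the fragment used an absolute reference such as (?1), the second copy would have recursed into the first copy's group rather than its own.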

  • (?&NAME)

    Recurse to a named subpattern. Identical to (?PARNO) except that the parenthesis to recurse to is determined by name. If multiple parentheses have the same name, then it recurses to the leftmost.

    It is an error to refer to a name that is not declared somewhere in the pattern.

    NOTE: In order to make things easier for programmers with experience with the Python or PCRE regex engines the pattern (?P>NAME) may be used instead of (?&NAME).

  • (?(condition)yes-pattern|no-pattern)
  • (?(condition)yes-pattern)

    Conditional expression. Matches yes-pattern if condition yields a true value, matches no-pattern otherwise. A missing pattern always matches.

    (condition) should be one of: 1) an integer in parentheses (which is valid if the corresponding pair of parentheses matched); 2) a look-ahead/look-behind/evaluate zero-width assertion; 3) a name in angle brackets or single quotes (which is valid if a group with the given name matched); or 4) the special symbol (R) (true when evaluated inside of recursion or eval). Additionally, the R may be followed by a number (which will be true when evaluated when recursing inside of the appropriate group), or by &NAME, in which case it will be true only when evaluated during recursion in the named group.

    Here's a summary of the possible predicates:

    • (1) (2) ...

      Checks if the numbered capturing group has matched something.

    • (<NAME>) ('NAME')

      Checks if a group with the given name has matched something.

    • (?=...) (?!...) (?<=...) (?<!...)

      Checks whether the pattern matches (or does not match, for the '!' variants).

    • (?{ CODE })

      Treats the return value of the code block as the condition.

    • (R)

      Checks if the expression has been evaluated inside of recursion.

    • (R1) (R2) ...

      Checks if the expression has been evaluated while executing directly inside of the n-th capture group. This check is the regex equivalent of

      1. if ((caller(0))[3] eq 'subname') { ... }

      In other words, it does not check the full recursion stack.

    • (R&NAME)

      Similar to (R1), this predicate checks to see if we're executing directly inside of the leftmost group with a given name (this is the same logic used by (?&NAME) to disambiguate). It does not check the full stack, but only the name of the innermost active recursion.

    • (DEFINE)

      In this case, the yes-pattern is never directly executed, and no no-pattern is allowed. Similar in spirit to (?{0}) but more efficient. See below for details.

    For example:

        m{ ( \( )?
           [^()]+
           (?(1) \) )
         }x

    matches a chunk of non-parentheses, possibly included in parentheses themselves.
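
    An anchored variant of this pattern makes the behaviour easy to check. The anchors and the sample inputs are additions for the sake of the sketch:

    ```perl
    use strict;
    use warnings;

    # A closing parenthesis is required exactly when group 1 captured
    # an opening one.
    my $chunk = qr{ ^ ( \( )? [^()]+ (?(1) \) ) $ }x;

    my %matched;
    $matched{$_} = $_ =~ $chunk ? 1 : 0 for "(abc)", "abc", "(abc", "abc)";

    printf "%-6s %s\n", $_, $matched{$_} ? "matches" : "fails"
        for "(abc)", "abc", "(abc", "abc)";
    ```

    The balanced and the bare inputs match; the two inputs with a single unpaired parenthesis fail.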

    A special form is the (DEFINE) predicate, which never executes its yes-pattern directly, and does not allow a no-pattern. This allows one to define subpatterns which will be executed only by the recursion mechanism. This way, you can define a set of regular expression rules that can be bundled into any pattern you choose.

    It is recommended that for this usage you put the DEFINE block at the end of the pattern, and that you name any subpatterns defined within it.

    Also, it's worth noting that patterns defined this way probably will not be as efficient, as the optimiser is not very clever about handling them.

    An example of how this might be used is as follows:

        /(?<NAME>(?&NAME_PAT))(?<ADDR>(?&ADDRESS_PAT))
         (?(DEFINE)
           (?<NAME_PAT>....)
           (?<ADDRESS_PAT>....)
         )/x

    Note that capture groups matched inside of recursion are not accessible after the recursion returns, so the extra layer of capturing groups is necessary. Thus $+{NAME_PAT} would not be defined even though $+{NAME} would be.

    Finally, keep in mind that subpatterns created inside a DEFINE block count towards the absolute and relative number of captures, so this:

    1. my @captures = "a" =~ /(.) # First capture
    2. (?(DEFINE)
    3. (?<EXAMPLE> 1 ) # Second capture
    4. )/x;
    5. say scalar @captures;

    will output 2, not 1. This is particularly important if you intend to compile the definitions with the qr// operator, and later interpolate them in another pattern.
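
    A concrete sketch of the DEFINE technique follows. The rule names WORD and NUMBER, the key/value capture names, and the "key = value" input format are all hypothetical, chosen only to show rules being defined once and invoked by recursion.

    ```perl
    use strict;
    use warnings;

    my $setting = qr{
        ^ (?<key> (?&WORD) ) \s* = \s* (?<value> (?&NUMBER) ) $
        (?(DEFINE)
            (?<WORD>   [A-Za-z_]\w* )
            (?<NUMBER> -?\d+ (?: \.\d+ )? )
        )
    }x;

    # Copy %+ so the named captures survive later matches.
    my %got = "timeout = 2.5" =~ $setting ? %+ : ();

    print "key=$got{key} value=$got{value}\n";    # key=timeout value=2.5
    print "WORD is ", (exists $got{WORD} ? "set" : "not set"), "\n";
    ```

    Only the wrapping groups key and value are visible afterwards; the groups inside the DEFINE block were matched during recursion and so are not set.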

  • (?>pattern)

    An "independent" subexpression, one which matches the substring that a standalone pattern would match if anchored at the given position, and it matches nothing other than this substring. This construct is useful for optimizations of what would otherwise be "eternal" matches, because it will not backtrack (see Backtracking). It may also be useful in places where the "grab all you can, and do not give anything back" semantic is desirable.

    For example: ^(?>a*)ab will never match, since (?>a*) (anchored at the beginning of string, as above) will match all characters a at the beginning of string, leaving no a for ab to match. In contrast, a*ab will match the same as a+b , since the match of the subgroup a* is influenced by the following group ab (see Backtracking). In particular, a* inside a*ab will match fewer characters than a standalone a* , since this makes the tail match.

    (?>pattern) does not disable backtracking altogether once it has matched. It is still possible to backtrack past the construct, but not into it. So ((?>a*)|(?>b*))ar will still match "bar".
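
    The cases discussed above can be verified with a small sketch. The possessive form a*+ is included as well, on the assumption that Perl 5.10 or later is in use:

    ```perl
    use strict;
    use warnings;

    my %result = (
        # (?>a*) takes every "a" and gives none back, so "ab" cannot follow.
        atomic      => ("aaab" =~ /^(?>a*)ab/         ? 1 : 0),    # 0
        # Plain a* gives back one "a" to let "ab" match.
        greedy      => ("aaab" =~ /^a*ab/             ? 1 : 0),    # 1
        # The possessive quantifier behaves like the atomic group.
        possessive  => ("aaab" =~ /^a*+ab/            ? 1 : 0),    # 0
        # The engine can still backtrack past an atomic group to another branch.
        alternation => ("bar"  =~ /((?>a*)|(?>b*))ar/ ? 1 : 0),    # 1
    );

    print "$_=$result{$_}\n" for sort keys %result;
    ```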

    An effect similar to (?>pattern) may be achieved by writing (?=(pattern))\g{-1}. The look-ahead matches the same substring as a standalone pattern would, and the following \g{-1} eats the matched string; it therefore makes a zero-length assertion into an analogue of (?>...). (The difference between these two constructs is that the second one uses a capturing group, thus shifting ordinals of backreferences in the rest of a regular expression.)

    Consider this pattern:

    1. m{ \(
    2. (
    3. [^()]+ # x+
    4. |
    5. \( [^()]* \)
    6. )+
    7. \)
    8. }x

    That will efficiently match a nonempty group with matching parentheses two levels deep or less. However, if there is no such group, it will take virtually forever on a long string. That's because there are so many different ways to split a long string into several substrings. This is what (.+)+ is doing, and (.+)+ is similar to a subpattern of the above pattern. Consider how the pattern above detects no-match on ((()aaaaaaaaaaaaaaaaaa in several seconds, but that each extra letter doubles this time. This exponential performance will make it appear that your program has hung. However, a tiny change to this pattern

    1. m{ \(
    2. (
    3. (?> [^()]+ ) # change x+ above to (?> x+ )
    4. |
    5. \( [^()]* \)
    6. )+
    7. \)
    8. }x

    which uses (?>...) matches exactly when the one above does (verifying this yourself would be a productive exercise), but finishes in a quarter of the time when used on a similar string with 1000000 "a"s. Be aware, however, that when this construct is followed by a quantifier, it currently triggers a warning message under the use warnings pragma or -w switch saying it "matches null string many times in regex".

    On simple groups, such as the pattern (?> [^()]+ ), a comparable effect may be achieved by negative look-ahead, as in [^()]+ (?! [^()] ). This was only 4 times slower on a string with 1000000 "a"s.

    The "grab all you can, and do not give anything back" semantic is desirable in many situations where at first sight a simple ()* looks like the correct solution. Suppose we parse text with comments being delimited by # followed by some optional (horizontal) whitespace. Contrary to its appearance, #[ \t]* is not the correct subexpression to match the comment delimiter, because it may "give up" some whitespace if the remainder of the pattern can be made to match that way. The correct answer is either one of these:

    1. (?>#[ \t]*)
    2. #[ \t]*(?![ \t])

    For example, to grab non-empty comments into $1, one should use either one of these:

    1. / (?> \# [ \t]* ) ( .+ ) /x;
    2. / \# [ \t]* ( [^ \t] .* ) /x;

    Which one you pick depends on which of these expressions better reflects the above specification of comments.

    In some literature this construct is called "atomic matching" or "possessive matching".

    Possessive quantifiers are equivalent to putting the item they are applied to inside of one of these constructs. The following equivalences apply:

        Quantifier Form     Bracketing Form
        ---------------     ---------------
        PAT*+               (?>PAT*)
        PAT++               (?>PAT+)
        PAT?+               (?>PAT?)
        PAT{min,max}+       (?>PAT{min,max})

  • (?[ ])

    See Extended Bracketed Character Classes in perlrecharclass.

Special Backtracking Control Verbs

WARNING: These patterns are experimental and subject to change or removal in a future version of Perl. Their usage in production code should be noted to avoid problems during upgrades.

These special patterns are generally of the form (*VERB:ARG). Unless otherwise stated the ARG argument is optional; in some cases, it is forbidden.

Any pattern containing a special backtracking verb that allows an argument has the special behaviour that when executed it sets the current package's $REGERROR and $REGMARK variables. When doing so the following rules apply:

On failure, the $REGERROR variable will be set to the ARG value of the verb pattern, if the verb was involved in the failure of the match. If the ARG part of the pattern was omitted, then $REGERROR will be set to the name of the last (*MARK:NAME) pattern executed, or to TRUE if there was none. Also, the $REGMARK variable will be set to FALSE.

On a successful match, the $REGERROR variable will be set to FALSE, and the $REGMARK variable will be set to the name of the last (*MARK:NAME) pattern executed. See the explanation for the (*MARK:NAME) verb below for more details.

NOTE: $REGERROR and $REGMARK are not magic variables like $1 and most other regex-related variables. They are not local to a scope, nor readonly, but instead are volatile package variables similar to $AUTOLOAD. Use local to localize changes to them to a specific scope if necessary.

If a pattern does not contain a special backtracking verb that allows an argument, then $REGERROR and $REGMARK are not touched at all.

  • Verbs that take an argument
    • (*PRUNE) (*PRUNE:NAME)

      This zero-width pattern prunes the backtracking tree at the current point when backtracked into on failure. Consider the pattern A (*PRUNE) B, where A and B are complex patterns. Until the (*PRUNE) verb is reached, A may backtrack as necessary to match. Once it is reached, matching continues in B, which may also backtrack as necessary; however, should B not match, then no further backtracking will take place, and the pattern will fail outright at the current starting position.

      The following example counts all the possible matching strings in a pattern (without actually matching any of them).

      1. 'aaab' =~ /a+b?(?{print "$&\n"; $count++})(*FAIL)/;
      2. print "Count=$count\n";

      which produces:

      1. aaab
      2. aaa
      3. aa
      4. a
      5. aab
      6. aa
      7. a
      8. ab
      9. a
      10. Count=9

      If we add a (*PRUNE) before the count like the following

      1. 'aaab' =~ /a+b?(*PRUNE)(?{print "$&\n"; $count++})(*FAIL)/;
      2. print "Count=$count\n";

      we prevent backtracking and find the count of the longest matching string at each matching starting point like so:

      1. aaab
      2. aab
      3. ab
      4. Count=3

      Any number of (*PRUNE) assertions may be used in a pattern.

      See also (?>pattern) and possessive quantifiers for other ways to control backtracking. In some cases, the use of (*PRUNE) can be replaced with a (?>pattern) with no functional difference; however, (*PRUNE) can be used to handle cases that cannot be expressed using a (?>pattern) alone.

    • (*SKIP) (*SKIP:NAME)

      This zero-width pattern is similar to (*PRUNE), except that on failure it also signifies that whatever text was matched leading up to the (*SKIP) pattern being executed cannot be part of any match of this pattern. This effectively means that the regex engine "skips" forward to this position on failure and tries to match again (assuming that there is sufficient room to match).

      The name of the (*SKIP:NAME) pattern has special significance. If a (*MARK:NAME) was encountered while matching, then it is that position which is used as the "skip point". If no (*MARK) of that name was encountered, then the (*SKIP) operator has no effect. When used without a name the "skip point" is where the match point was when executing the (*SKIP) pattern.

      Compare the following to the examples in (*PRUNE); note the string is twice as long:

      1. 'aaabaaab' =~ /a+b?(*SKIP)(?{print "$&\n"; $count++})(*FAIL)/;
      2. print "Count=$count\n";

      outputs

      1. aaab
      2. aaab
      3. Count=2

      Once the 'aaab' at the start of the string has matched, and the (*SKIP) executed, the next starting point will be where the cursor was when the (*SKIP) was executed.

    • (*MARK:NAME) (*:NAME)

      This zero-width pattern can be used to mark the point reached in a string when a certain part of the pattern has been successfully matched. This mark may be given a name. A later (*SKIP) pattern will then skip forward to that point if backtracked into on failure. Any number of (*MARK) patterns are allowed, and the NAME portion may be duplicated.

      In addition to interacting with the (*SKIP) pattern, (*MARK:NAME) can be used to "label" a pattern branch, so that after matching, the program can determine which branches of the pattern were involved in the match.

      When a match is successful, the $REGMARK variable will be set to the name of the most recently executed (*MARK:NAME) that was involved in the match.

      This can be used to determine which branch of a pattern was matched without using a separate capture group for each branch, which in turn can result in a performance improvement, as perl cannot optimize /(?:(x)|(y)|(z))/ as efficiently as something like /(?:x(*MARK:x)|y(*MARK:y)|z(*MARK:z))/.

      When a match has failed, and unless another verb has been involved in failing the match and has provided its own name to use, the $REGERROR variable will be set to the name of the most recently executed (*MARK:NAME).

      See (*SKIP) for more details.

      As a shortcut (*MARK:NAME) can be written (*:NAME).
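
      The branch-labelling use can be sketched as follows. $REGMARK is declared with our because it is an ordinary package variable, and the three single-letter inputs are invented for the example:

      ```perl
      use strict;
      use warnings;

      our ($REGMARK, $REGERROR);    # plain package variables set by the engine

      my @branches;
      for my $input (qw(x y z)) {
          # After a successful match, $REGMARK names the branch that was taken.
          push @branches, $REGMARK
              if $input =~ /(?:x(*MARK:x)|y(*MARK:y)|z(*MARK:z))/;
      }
      print "@branches\n";    # x y z
      ```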

    • (*THEN) (*THEN:NAME)

      This is similar to the "cut group" operator :: from Perl 6. Like (*PRUNE), this verb always matches, and when backtracked into on failure, it causes the regex engine to try the next alternation in the innermost enclosing group (capturing or otherwise) that has alternations. The two branches of a (?(condition)yes-pattern|no-pattern) do not count as an alternation, as far as (*THEN) is concerned.

      Its name comes from the observation that this operation combined with the alternation operator (|) can be used to create what is essentially a pattern-based if/then/else block:

      1. ( COND (*THEN) FOO | COND2 (*THEN) BAR | COND3 (*THEN) BAZ )

      Note that if this operator is used and NOT inside of an alternation then it acts exactly like the (*PRUNE) operator.

      1. / A (*PRUNE) B /

      is the same as

      1. / A (*THEN) B /

      but

      1. / ( A (*THEN) B | C ) /

      is not the same as

      1. / ( A (*PRUNE) B | C ) /

      as after matching the A but failing on the B the (*THEN) verb will backtrack and try C; but the (*PRUNE) verb will simply fail.
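
      A concrete instance of that difference, with literal characters standing in for A, B and C (the patterns are a sketch, not taken from real code):

      ```perl
      use strict;
      use warnings;

      # "a" matches, then "b" fails against "c". (*THEN) moves on to the next
      # alternative, whereas (*PRUNE) abandons this starting position entirely.
      my $then  = "ac" =~ /^(?: a (*THEN)  b | a c )$/x ? 1 : 0;    # 1
      my $prune = "ac" =~ /^(?: a (*PRUNE) b | a c )$/x ? 1 : 0;    # 0

      print "then=$then prune=$prune\n";
      ```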

  • Verbs without an argument
    • (*COMMIT)

      This is the Perl 6 "commit pattern" <commit> or :::. It's a zero-width pattern similar to (*SKIP), except that when backtracked into on failure it causes the match to fail outright. No further attempts to find a valid match by advancing the start pointer will occur. For example,

      1. 'aaabaaab' =~ /a+b?(*COMMIT)(?{print "$&\n"; $count++})(*FAIL)/;
      2. print "Count=$count\n";

      outputs

      1. aaab
      2. Count=1

      In other words, once the (*COMMIT) has been entered, and if the pattern does not match, the regex engine will not try any further matching on the rest of the string.
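
      The counting examples for (*PRUNE), (*SKIP) and (*COMMIT) can be run side by side. This sketch replaces the print statements with counters held in a hash (an illustrative choice) and expects the same counts that those examples report: 9, 3, 2 and 1.

      ```perl
      use strict;
      use warnings;

      # Count how many times the code block is reached for each verb.
      my %count = (none => 0, prune => 0, skip => 0, commit => 0);

      'aaab'     =~ /a+b?(?{ $count{none}++ })(*FAIL)/;
      'aaab'     =~ /a+b?(*PRUNE)(?{ $count{prune}++ })(*FAIL)/;
      'aaabaaab' =~ /a+b?(*SKIP)(?{ $count{skip}++ })(*FAIL)/;
      'aaabaaab' =~ /a+b?(*COMMIT)(?{ $count{commit}++ })(*FAIL)/;

      print "$_=$count{$_}\n" for qw(none prune skip commit);
      ```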

    • (*FAIL) (*F)

      This pattern matches nothing and always fails. It can be used to force the engine to backtrack. It is equivalent to (?!), but easier to read. In fact, (?!) gets optimised into (*FAIL) internally.

      It is probably useful only when combined with (?{}) or (??{}).

    • (*ACCEPT)

      WARNING: This feature is highly experimental. It is not recommended for production code.

      This pattern matches nothing and causes the end of successful matching at the point at which the (*ACCEPT) pattern was encountered, regardless of whether there is actually more to match in the string. When inside of a nested pattern, such as recursion, or in a subpattern dynamically generated via (??{}), only the innermost pattern is ended immediately.

      If the (*ACCEPT) is inside of capturing groups then the groups are marked as ended at the point at which the (*ACCEPT) was encountered. For instance:

          'AB' =~ /(A (A|B(*ACCEPT)|C) D)(E)/x;

      will match, and $1 will be AB and $2 will be B; $3 will not be set. If another branch in the inner parentheses had matched, such as in the string 'ACDE', then the D and E would have to be matched as well.
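The difference between (*THEN) and (*PRUNE) inside an alternation, described earlier in this list, can be checked with a small script. This is only a sketch: the subject string 'AC' and the second alternative .C are made up for illustration, chosen so that the second alternative can match at the same starting position where the first one fails.

```perl
use strict;
use warnings;

my $string = 'AC';

# (*THEN): A matches, B fails, and backtracking into (*THEN) makes the
# engine try the next alternative (.C) at the same starting position.
my $then_matched = $string =~ /(?:A(*THEN)B|.C)/ ? 1 : 0;

# (*PRUNE): A matches, B fails, and backtracking into (*PRUNE) abandons
# this starting position altogether, so .C is never tried at offset 0.
my $prune_matched = $string =~ /(?:A(*PRUNE)B|.C)/ ? 1 : 0;

print "THEN:  ", ($then_matched  ? "matched" : "failed"), "\n";
print "PRUNE: ", ($prune_matched ? "matched" : "failed"), "\n";
```

With (*PRUNE) the engine still advances to the next starting position, but nothing in 'AC' matches from offset 1 onwards, so the whole match fails.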

Backtracking

NOTE: This section presents an abstract approximation of regular expression behavior. For a more rigorous (and complicated) view of the rules involved in selecting a match among possible alternatives, see Combining RE Pieces.

A fundamental feature of regular expression matching involves the notion called backtracking, which is currently used (when needed) by all non-possessive regular expression quantifiers, namely *, *?, +, +?, {n,m}, and {n,m}?. Backtracking is often optimized internally, but the general principle outlined here is valid.

For a regular expression to match, the entire regular expression must match, not just part of it. So if the beginning of a pattern containing a quantifier succeeds in a way that causes later parts in the pattern to fail, the matching engine backs up and recalculates the beginning part--that's why it's called backtracking.

Here is an example of backtracking: Let's say you want to find the word following "foo" in the string "Food is on the foo table.":

    $_ = "Food is on the foo table.";
    if ( /\b(foo)\s+(\w+)/i ) {
        print "$2 follows $1.\n";
    }

When the match runs, the first part of the regular expression (\b(foo) ) finds a possible match right at the beginning of the string, and loads up $1 with "Foo". However, as soon as the matching engine sees that there's no whitespace following the "Foo" that it had saved in $1, it realizes its mistake and starts over again one character after where it had the tentative match. This time it goes all the way until the next occurrence of "foo". The complete regular expression matches this time, and you get the expected output of "table follows foo."

Sometimes minimal matching can help a lot. Imagine you'd like to match everything between "foo" and "bar". Initially, you write something like this:

    $_ = "The food is under the bar in the barn.";
    if ( /foo(.*)bar/ ) {
        print "got <$1>\n";
    }

Which perhaps unexpectedly yields:

    got <d is under the bar in the >

That's because .* was greedy, so you get everything between the first "foo" and the last "bar". Here it's more effective to use minimal matching to make sure you get the text between a "foo" and the first "bar" thereafter.

    if ( /foo(.*?)bar/ ) { print "got <$1>\n" }
    got <d is under the >

Here's another example. Let's say you'd like to match a number at the end of a string, and you also want to keep the preceding part of the match. So you write this:

    $_ = "I have 2 numbers: 53147";
    if ( /(.*)(\d*)/ ) {                                # Wrong!
        print "Beginning is <$1>, number is <$2>.\n";
    }

That won't work at all, because .* was greedy and gobbled up the whole string. As \d* can match on an empty string the complete regular expression matched successfully.

    Beginning is <I have 2 numbers: 53147>, number is <>.

Here are some variants, most of which don't work:

    $_ = "I have 2 numbers: 53147";
    @pats = qw{
        (.*)(\d*)
        (.*)(\d+)
        (.*?)(\d*)
        (.*?)(\d+)
        (.*)(\d+)$
        (.*?)(\d+)$
        (.*)\b(\d+)$
        (.*\D)(\d+)$
    };

    for $pat (@pats) {
        printf "%-12s ", $pat;
        if ( /$pat/ ) {
            print "<$1> <$2>\n";
        } else {
            print "FAIL\n";
        }
    }

That will print out:

    (.*)(\d*)    <I have 2 numbers: 53147> <>
    (.*)(\d+)    <I have 2 numbers: 5314> <7>
    (.*?)(\d*)   <> <>
    (.*?)(\d+)   <I have > <2>
    (.*)(\d+)$   <I have 2 numbers: 5314> <7>
    (.*?)(\d+)$  <I have 2 numbers: > <53147>
    (.*)\b(\d+)$ <I have 2 numbers: > <53147>
    (.*\D)(\d+)$ <I have 2 numbers: > <53147>

As you see, this can be a bit tricky. It's important to realize that a regular expression is merely a set of assertions that gives a definition of success. There may be 0, 1, or several different ways that the definition might succeed against a particular string. And if there are multiple ways it might succeed, you need to understand backtracking to know which variety of success you will achieve.

When using look-ahead assertions and negations, this can all get even trickier. Imagine you'd like to find a sequence of non-digits not followed by "123". You might try to write that as

    $_ = "ABC123";
    if ( /^\D*(?!123)/ ) {                # Wrong!
        print "Yup, no 123 in $_\n";
    }

But that isn't going to match; at least, not the way you're hoping. It claims that there is no 123 in the string. Here's a clearer picture of why that pattern matches, contrary to popular expectations:

    $x = 'ABC123';
    $y = 'ABC445';

    print "1: got $1\n" if $x =~ /^(ABC)(?!123)/;
    print "2: got $1\n" if $y =~ /^(ABC)(?!123)/;

    print "3: got $1\n" if $x =~ /^(\D*)(?!123)/;
    print "4: got $1\n" if $y =~ /^(\D*)(?!123)/;

This prints

    2: got ABC
    3: got AB
    4: got ABC

You might have expected test 3 to fail because it seems to be a more general-purpose version of test 1. The important difference between them is that test 3 contains a quantifier (\D*) and so can use backtracking, whereas test 1 cannot. What's happening is that you've asked "Is it true that at the start of $x, following 0 or more non-digits, you have something that's not 123?" If the pattern matcher had let \D* expand to "ABC", this would have caused the whole pattern to fail.

The search engine will initially match \D* with "ABC". Then it will try to match (?!123) with "123", which fails. But because a quantifier (\D* ) has been used in the regular expression, the search engine can backtrack and retry the match differently in the hope of matching the complete regular expression.

The pattern really, really wants to succeed, so it uses the standard pattern back-off-and-retry and lets \D* expand to just "AB" this time. Now there's indeed something following "AB" that is not "123". It's "C123", which suffices.

We can deal with this by using both an assertion and a negation. We'll say that the first part in $1 must be followed both by a digit and by something that's not "123". Remember that the look-aheads are zero-width expressions--they only look, but don't consume any of the string in their match. So rewriting this way produces what you'd expect; that is, case 5 will fail, but case 6 succeeds:

    print "5: got $1\n" if $x =~ /^(\D*)(?=\d)(?!123)/;
    print "6: got $1\n" if $y =~ /^(\D*)(?=\d)(?!123)/;

    6: got ABC

In other words, the two zero-width assertions next to each other work as though they're ANDed together, just as you'd use any built-in assertions: /^$/ matches only if you're at the beginning of the line AND the end of the line simultaneously. The deeper underlying truth is that juxtaposition in regular expressions always means AND, except when you write an explicit OR using the vertical bar. /ab/ means match "a" AND (then) match "b", although the attempted matches are made at different positions because "a" is not a zero-width assertion, but a one-width assertion.

WARNING: Particularly complicated regular expressions can take exponential time to solve because of the immense number of possible ways they can use backtracking to try for a match. For example, without internal optimizations done by the regular expression engine, this will take a painfully long time to run:

    'aaaaaaaaaaaa' =~ /((a{0,5}){0,5})*[c]/

And if you used *'s in the internal groups instead of limiting them to 0 through 5 matches, then it would take forever--or until you ran out of stack space. Moreover, these internal optimizations are not always applicable. For example, if you put {0,5} instead of * on the external group, no current optimization is applicable, and the match takes a long time to finish.

A powerful tool for optimizing such beasts is what is known as an "independent group", which does not backtrack (see (?>pattern)). Note also that zero-length look-ahead/look-behind assertions will not backtrack to make the tail match, since they are in "logical" context: only whether they match is considered relevant. For an example where side-effects of look-ahead might have influenced the following match, see (?>pattern).
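As a minimal sketch of what "does not backtrack" means for an independent group, the following compares an ordinary quantifier with the same quantifier wrapped in (?>...). The subject string 'aaab' and the variable names are illustrative only.

```perl
use strict;
use warnings;

my $text = 'aaab';

# Ordinary a+ first takes "aaa", then gives one "a" back so that the
# literal "ab" can match.
my ($plain) = $text =~ /(a+ab)/;

# (?>a+) also takes "aaa" but never gives anything back, so the literal
# "a" that follows can never match, whatever the starting position.
my ($atomic) = $text =~ /((?>a+)ab)/;

print "plain:  ", (defined $plain  ? $plain  : 'no match'), "\n";
print "atomic: ", (defined $atomic ? $atomic : 'no match'), "\n";
```

The same property that makes the second pattern fail is what removes the explosion of retries in patterns such as the one shown in the warning above.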

Version 8 Regular Expressions

In case you're not familiar with the "regular" Version 8 regex routines, here are the pattern-matching rules not described above.

Any single character matches itself, unless it is a metacharacter with a special meaning described here or above. You can cause characters that normally function as metacharacters to be interpreted literally by prefixing them with a "\" (e.g., "\." matches a ".", not any character; "\\" matches a "\"). This escape mechanism is also required for the character used as the pattern delimiter.

A series of characters matches that series of characters in the target string, so the pattern blurfl would match "blurfl" in the target string.

You can specify a character class, by enclosing a list of characters in [] , which will match any character from the list. If the first character after the "[" is "^", the class matches any character not in the list. Within a list, the "-" character specifies a range, so that a-z represents all characters between "a" and "z", inclusive. If you want either "-" or "]" itself to be a member of a class, put it at the start of the list (possibly after a "^"), or escape it with a backslash. "-" is also taken literally when it is at the end of the list, just before the closing "]". (The following all specify the same class of three characters: [-az] , [az-] , and [a\-z] . All are different from [a-z] , which specifies a class containing twenty-six characters, even on EBCDIC-based character sets.) Also, if you try to use the character classes \w , \W , \s, \S , \d , or \D as endpoints of a range, the "-" is understood literally.
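The difference between a literal "-" and a range can be seen by filtering a few sample characters through both kinds of class. This is a small sketch; the sample list is arbitrary.

```perl
use strict;
use warnings;

my @samples = qw(a - m z);

# "-" at the start of the class is literal: the class holds "-", "a", "z".
my @three = grep { /[-az]/ } @samples;

# "-" between two characters forms a range: the class holds "a" through "z".
my @range = grep { /[a-z]/ } @samples;

print "[-az] accepts: @three\n";
print "[a-z] accepts: @range\n";
```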

Note also that the whole range idea is rather unportable between character sets--and even within character sets they may cause results you probably didn't expect. A sound principle is to use only ranges that begin from and end at either alphabetics of equal case ([a-e], [A-E]), or digits ([0-9]). Anything else is unsafe. If in doubt, spell out the character sets in full.

Characters may be specified using a metacharacter syntax much like that used in C: "\n" matches a newline, "\t" a tab, "\r" a carriage return, "\f" a form feed, etc. More generally, \nnn, where nnn is a string of three octal digits, matches the character whose coded character set value is nnn. Similarly, \xnn, where nn are hexadecimal digits, matches the character whose ordinal is nn. The expression \cx matches the character control-x. Finally, the "." metacharacter matches any character except "\n" (unless you use /s).

You can specify a series of alternatives for a pattern using "|" to separate them, so that fee|fie|foe will match any of "fee", "fie", or "foe" in the target string (as would f(e|i|o)e). The first alternative includes everything from the last pattern delimiter ("(", "(?:", etc. or the beginning of the pattern) up to the first "|", and the last alternative contains everything from the last "|" to the next closing pattern delimiter. That's why it's common practice to include alternatives in parentheses: to minimize confusion about where they start and end.

Alternatives are tried from left to right, so the first alternative found for which the entire expression matches, is the one that is chosen. This means that alternatives are not necessarily greedy. For example: when matching foo|foot against "barefoot", only the "foo" part will match, as that is the first alternative tried, and it successfully matches the target string. (This might not seem important, but it is important when you are capturing matched text using parentheses.)
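A short sketch of the "barefoot" case, using capturing parentheses to show which alternative was taken. Swapping the order of the alternatives changes the captured text.

```perl
use strict;
use warnings;

# "foo" is listed first, so it wins even though "foot" would also match.
my ($first) = 'barefoot' =~ /(foo|foot)/;

# With "foot" listed first, the longer text is captured.
my ($second) = 'barefoot' =~ /(foot|foo)/;

print "foo|foot captured: $first\n";
print "foot|foo captured: $second\n";
```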

Also remember that "|" is interpreted as a literal within square brackets, so if you write [fee|fie|foe] you're really only matching [feio|] .

Within a pattern, you may designate subpatterns for later reference by enclosing them in parentheses, and you may refer back to the nth subpattern later in the pattern using the metacharacter \n or \gn. Subpatterns are numbered based on the left to right order of their opening parenthesis. A backreference matches whatever actually matched the subpattern in the string being examined, not the rules for that subpattern. Therefore, (0|0x)\d*\s\g1\d* will match "0x1234 0x4321", but not "0x1234 01234", because subpattern 1 matched "0x", even though the rule 0|0x could potentially match the leading 0 in the second number.
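The backreference example above can be run directly. The sketch below records, for each of the two sample strings from the text, whether the pattern matched; the %matched hash is just a convenience for this illustration.

```perl
use strict;
use warnings;

my $re      = qr/(0|0x)\d*\s\g1\d*/;
my @samples = ('0x1234 0x4321', '0x1234 01234');

my %matched;
for my $sample (@samples) {
    # \g1 must repeat the text that group 1 actually captured ("0x"),
    # not merely something that the rule 0|0x could match.
    $matched{$sample} = $sample =~ $re ? 1 : 0;
    printf "%-15s %s\n", $sample, $matched{$sample} ? 'matches' : 'does not match';
}
```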

Warning on \1 Instead of $1

Some people get too used to writing things like:

    $pattern =~ s/(\W)/\\\1/g;

This is grandfathered (for \1 to \9) for the RHS of a substitute to avoid shocking the sed addicts, but it's a dirty habit to get into. That's because in PerlThink, the righthand side of an s/// is a double-quoted string. \1 in the usual double-quoted string means a control-A. The customary Unix meaning of \1 is kludged in for s///. However, if you get into the habit of doing that, you get yourself into trouble if you then add an /e modifier.

    s/(\d+)/ \1 + 1 /eg;            # causes warning under -w

Or if you try to do

    s/(\d+)/\1000/;

You can't disambiguate that by saying \{1}000 , whereas you can fix it with ${1}000. The operation of interpolation should not be confused with the operation of matching a backreference. Certainly they mean two different things on the left side of the s///.
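A small sketch of the recommended forms: ${1} to separate the capture variable from digits that follow it, and plain $1 inside an /e expression. The input string is invented for the example.

```perl
use strict;
use warnings;

my $price = 'cost: 42';

# ${1}000 is the first capture followed by the literal text "000".
(my $fixed = $price) =~ s/(\d+)/${1}000/;

# With /e the right-hand side is Perl code, so $1 is an ordinary variable.
(my $incremented = $price) =~ s/(\d+)/$1 + 1/e;

print "$fixed\n";
print "$incremented\n";
```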

Repeated Patterns Matching a Zero-length Substring

WARNING: Difficult material (and prose) ahead. This section needs a rewrite.

Regular expressions provide a terse and powerful programming language. As with most other power tools, power comes together with the ability to wreak havoc.

A common abuse of this power stems from the ability to make infinite loops using regular expressions, with something as innocuous as:

    'foo' =~ m{ ( o? )* }x;

The o? matches at the beginning of 'foo' , and since the position in the string is not moved by the match, o? would match again and again because of the * quantifier. Another common way to create a similar cycle is with the looping modifier //g :

    @matches = ( 'foo' =~ m{ o? }xg );

or

    print "match: <$&>\n" while 'foo' =~ m{ o? }xg;

or the loop implied by split().

However, long experience has shown that many programming tasks may be significantly simplified by using repeated subexpressions that may match zero-length substrings. Here's a simple example:

    @chars = split //, $string;             # // is not magic in split
    ($whitewashed = $string) =~ s/()/ /g;   # parens avoid magic s// /

Thus Perl allows such constructs, by forcefully breaking the infinite loop. The rules for this are different for lower-level loops given by the greedy quantifiers *+{} , and for higher-level ones like the /g modifier or split() operator.

The lower-level loops are interrupted (that is, the loop is broken) when Perl detects that a repeated expression matched a zero-length substring. Thus

    m{ (?: NON_ZERO_LENGTH | ZERO_LENGTH )* }x;

is made equivalent to

    m{ (?: NON_ZERO_LENGTH )* (?: ZERO_LENGTH )? }x;

For example, this program

    #!perl -l
    "aaaaab" =~ /
      (?:
         a                 # non-zero
         |                 # or
        (?{print "hello"}) # print hello whenever this
                           #    branch is tried
        (?=(b))            # zero-width assertion
      )*  # any number of times
     /x;
    print $&;
    print $1;

prints

    hello
    aaaaa
    b

Notice that "hello" is only printed once, as when Perl sees that the sixth iteration of the outermost (?:)* matches a zero-length string, it stops the * .

The higher-level loops preserve an additional state between iterations: whether the last match was zero-length. To break the loop, the following match after a zero-length match is prohibited to have a length of zero. This prohibition interacts with backtracking (see Backtracking), and so the second best match is chosen if the best match is of zero length.

For example:

    $_ = 'bar';
    s/\w??/<$&>/g;

results in <><b><><a><><r><> . At each position of the string the best match given by non-greedy ?? is the zero-length match, and the second best match is what is matched by \w . Thus zero-length matches alternate with one-character-long matches.

Similarly, for repeated m/()/g the second-best match is the match at the position one notch further in the string.

The additional state of being matched with zero-length is associated with the matched string, and is reset by each assignment to pos(). Zero-length matches at the end of the previous match are ignored during split.

Combining RE Pieces

Each of the elementary pieces of regular expressions which were described before (such as ab or \Z ) could match at most one substring at the given position of the input string. However, in a typical regular expression these elementary pieces are combined into more complicated patterns using combining operators ST , S|T , S* etc. (in these examples S and T are regular subexpressions).

Such combinations can include alternatives, leading to a problem of choice: if we match a regular expression a|ab against "abc" , will it match substring "a" or "ab" ? One way to describe which substring is actually matched is the concept of backtracking (see Backtracking). However, this description is too low-level and makes you think in terms of a particular implementation.

Another description starts with notions of "better"/"worse". All the substrings which may be matched by the given regular expression can be sorted from the "best" match to the "worst" match, and it is the "best" match which is chosen. This substitutes the question of "what is chosen?" by the question of "which matches are better, and which are worse?".

Again, for elementary pieces there is no such question, since at most one match at a given position is possible. This section describes the notion of better/worse for combining operators. In the description below S and T are regular subexpressions.

  • ST

    Consider two possible matches, AB and A'B', A and A' are substrings which can be matched by S , B and B' are substrings which can be matched by T .

    If A is a better match for S than A', AB is a better match than A'B'.

    If A and A' coincide: AB is a better match than AB' if B is a better match for T than B'.

  • S|T

    When S can match, it is a better match than when only T can match.

    The ordering of two matches for S is the same as it is for S alone. The same holds for two matches for T.

  • S{REPEAT_COUNT}

    Matches as SSS...S (repeated as many times as necessary).

  • S{min,max}

    Matches as S{max}|S{max-1}|...|S{min+1}|S{min}.

  • S{min,max}?

    Matches as S{min}|S{min+1}|...|S{max-1}|S{max}.

  • S?, S* , S+

    Same as S{0,1} , S{0,BIG_NUMBER} , S{1,BIG_NUMBER} respectively.

  • S?? , S*?, S+?

    Same as S{0,1}?, S{0,BIG_NUMBER}?, S{1,BIG_NUMBER}? respectively.

  • (?>S)

    Matches the best match for S and only that.

  • (?=S), (?<=S)

    Only the best match for S is considered. (This is important only if S has capturing parentheses, and backreferences are used somewhere else in the whole regular expression.)

  • (?!S), (?<!S)

    For this grouping operator there is no need to describe the ordering, since only whether or not S can match is important.

  • (??{ EXPR }) , (?PARNO)

    The ordering is the same as for the regular expression which is the result of EXPR, or the pattern contained by capture group PARNO.

  • (?(condition)yes-pattern|no-pattern)

    Recall that which of yes-pattern or no-pattern actually matches is already determined. The ordering of the matches is the same as for the chosen subexpression.

The above recipes describe the ordering of matches at a given position. One more rule is needed to understand how a match is determined for the whole regular expression: a match at an earlier position is always better than a match at a later position.
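A few of the ordering rules above can be observed directly by capturing what each pattern matches. This is only an illustrative sketch with made-up subject strings; it covers S|T, S{min,max}, S{min,max}? and the "earlier position is better" rule.

```perl
use strict;
use warnings;

# S|T: a match for the left alternative is better than one for the right.
my ($alt_a)  = 'abc' =~ /(a|ab)/;
my ($alt_ab) = 'abc' =~ /(ab|a)/;

# An earlier starting position beats a later one, whatever the order of
# the alternatives: "ab" starts at offset 1, "b" alone at offset 2.
my ($earliest) = 'xab' =~ /(b|ab)/;

# S{min,max} prefers more repetitions, S{min,max}? prefers fewer.
my ($greedy) = 'aaa' =~ /(a{1,2})/;
my ($lazy)   = 'aaa' =~ /(a{1,2}?)/;

print "a|ab     -> $alt_a\n";
print "ab|a     -> $alt_ab\n";
print "b|ab     -> $earliest\n";
print "a{1,2}   -> $greedy\n";
print "a{1,2}?  -> $lazy\n";
```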

Creating Custom RE Engines

As of Perl 5.10.0, one can create custom regular expression engines. This is not for the faint of heart, as they have to plug in at the C level. See perlreapi for more details.

As an alternative, overloaded constants (see overload) provide a simple way to extend the functionality of the RE engine, by substituting one pattern for another.

Suppose that we want to enable a new RE escape-sequence \Y| which matches at a boundary between whitespace characters and non-whitespace characters. Note that (?=\S)(?<!\S)|(?!\S)(?<=\S) matches exactly at these positions, so we want to have each \Y| in the place of the more complicated version. We can create a module customre to do this:

    package customre;
    use overload;

    sub import {
        shift;
        die "No argument to customre::import allowed" if @_;
        overload::constant 'qr' => \&convert;
    }

    sub invalid { die "/$_[0]/: invalid escape '\\$_[1]'"}

    # We must also take care of not escaping the legitimate \\Y|
    # sequence, hence the presence of '\\' in the conversion rules.
    my %rules = ( '\\' => '\\\\',
                  'Y|' => qr/(?=\S)(?<!\S)|(?!\S)(?<=\S)/ );
    sub convert {
        my $re = shift;
        $re =~ s{
                  \\ ( \\ | Y . )
                }
                { $rules{$1} or invalid($re,$1) }sgex;
        return $re;
    }

Now use customre enables the new escape in constant regular expressions, i.e., those without any runtime variable interpolations. As documented in overload, this conversion will work only over literal parts of regular expressions. For \Y|$re\Y| the variable part of this regular expression needs to be converted explicitly (but only if the special meaning of \Y| should be enabled inside $re):

    use customre;
    $re = <>;
    chomp $re;
    $re = customre::convert $re;
    /\Y|$re\Y|/;

PCRE/Python Support

As of Perl 5.10.0, Perl supports several Python/PCRE-specific extensions to the regex syntax. While Perl programmers are encouraged to use the Perl-specific syntax, the following are also accepted:

  • (?P<NAME>pattern)

    Define a named capture group. Equivalent to (?<NAME>pattern).

  • (?P=NAME)

    Backreference to a named capture group. Equivalent to \g{NAME} .

  • (?P>NAME)

    Subroutine call to a named capture group. Equivalent to (?&NAME).
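The three constructs can be tried side by side with their Perl-native spellings. The sketch below is illustrative only: the group names year and group, the sample strings, and the balanced-parentheses pattern are all made up for the example.

```perl
use strict;
use warnings;

# (?P<NAME>...) and (?P=NAME): capture four digits, then require the
# same four digits again.
my $py_year;
if ( '2014-2014' =~ /(?P<year>\d{4})-(?P=year)/ ) {
    $py_year = $+{year};
}

# The equivalent Perl-native spelling.
my $perl_year;
if ( '2014-2014' =~ /(?<year>\d{4})-\g{year}/ ) {
    $perl_year = $+{year};
}

# (?P>NAME): call the group recursively to match nested parentheses.
my $paren_re   = qr/\A(?P<group>\((?:[^()]++|(?P>group))*\))\z/;
my $balanced   = '(a(b)c)' =~ $paren_re ? 1 : 0;
my $unbalanced = '(a(b)c'  =~ $paren_re ? 1 : 0;

print "Python-style year: $py_year\n";
print "Perl-style year:   $perl_year\n";
print "balanced: $balanced, unbalanced: $unbalanced\n";
```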

BUGS

Many regular expression constructs don't work on EBCDIC platforms.

There are a number of issues with regard to case-insensitive matching in Unicode rules. See i under Modifiers above.

This document varies from difficult to understand to completely and utterly opaque. The wandering prose riddled with jargon is hard to fathom in several places.

This document needs a rewrite that separates the tutorial content from the reference content.

SEE ALSO

perlrequick.

perlretut.

Regexp Quote-Like Operators in perlop.

Gory details of parsing quoted constructs in perlop.

perlfaq6.

pos.

perllocale.

perlebcdic.

Mastering Regular Expressions by Jeffrey Friedl, published by O'Reilly and Associates.

 

perlreapi

NAME

perlreapi - Perl regular expression plugin interface

DESCRIPTION

As of Perl 5.9.5 there is a new interface for plugging and using regular expression engines other than the default one.

Each engine is supposed to provide access to a constant structure of the following format:

    typedef struct regexp_engine {
        REGEXP* (*comp) (pTHX_
                         const SV * const pattern, const U32 flags);
        I32     (*exec) (pTHX_
                         REGEXP * const rx,
                         char* stringarg,
                         char* strend, char* strbeg,
                         I32 minend, SV* screamer,
                         void* data, U32 flags);
        char*   (*intuit) (pTHX_
                           REGEXP * const rx, SV *sv,
                           char *strpos, char *strend, U32 flags,
                           struct re_scream_pos_data_s *data);
        SV*     (*checkstr) (pTHX_ REGEXP * const rx);
        void    (*free) (pTHX_ REGEXP * const rx);
        void    (*numbered_buff_FETCH) (pTHX_
                                        REGEXP * const rx,
                                        const I32 paren,
                                        SV * const sv);
        void    (*numbered_buff_STORE) (pTHX_
                                        REGEXP * const rx,
                                        const I32 paren,
                                        SV const * const value);
        I32     (*numbered_buff_LENGTH) (pTHX_
                                         REGEXP * const rx,
                                         const SV * const sv,
                                         const I32 paren);
        SV*     (*named_buff) (pTHX_
                               REGEXP * const rx,
                               SV * const key,
                               SV * const value,
                               U32 flags);
        SV*     (*named_buff_iter) (pTHX_
                                    REGEXP * const rx,
                                    const SV * const lastkey,
                                    const U32 flags);
        SV*     (*qr_package)(pTHX_ REGEXP * const rx);
    #ifdef USE_ITHREADS
        void*   (*dupe) (pTHX_ REGEXP * const rx, CLONE_PARAMS *param);
    #endif
        REGEXP* (*op_comp) (...);
    } regexp_engine;

When a regexp is compiled, its engine field is then set to point at the appropriate structure, so that when it needs to be used Perl can find the right routines to do so.

In order to install a new regexp handler, $^H{regcomp} is set to an integer which (when cast appropriately) resolves to one of these structures. When compiling, the comp method is executed, and the resulting regexp structure's engine field is expected to point back at the same structure.

The pTHX_ symbol in the definition is a macro used by Perl under threading to provide an extra argument to the routine holding a pointer back to the interpreter that is executing the regexp. So under threading all routines get an extra argument.

Callbacks

comp

    REGEXP* comp(pTHX_ const SV * const pattern, const U32 flags);

Compile the pattern stored in pattern using the given flags and return a pointer to a prepared REGEXP structure that can perform the match. See The REGEXP structure below for an explanation of the individual fields in the REGEXP struct.

The pattern parameter is the scalar that was used as the pattern. Previous versions of Perl would pass two char* indicating the start and end of the stringified pattern; the following snippet can be used to get the old parameters:

    STRLEN plen;
    char*  exp = SvPV(pattern, plen);
    char* xend = exp + plen;

Since any scalar can be passed as a pattern, it's possible to implement an engine that does something with an array ("ook" =~ [ qw/ eek hlagh / ] ) or with the non-stringified form of a compiled regular expression ("ook" =~ qr/eek/ ). Perl's own engine will always stringify everything using the snippet above, but that doesn't mean other engines have to.

The flags parameter is a bitfield which indicates which of the msixp flags the regex was compiled with. It also contains additional info, such as if use locale is in effect.

The eogc flags are stripped out before being passed to the comp routine. The regex engine does not need to know if any of these are set, as those flags should only affect what Perl does with the pattern and its match variables, not how it gets compiled and executed.

By the time the comp callback is called, some of these flags have already had effect (noted below where applicable). However most of their effect occurs after the comp callback has run, in routines that read the rx->extflags field which it populates.

In general the flags should be preserved in rx->extflags after compilation, although the regex engine might want to add or delete some of them to invoke or disable some special behavior in Perl. The flags along with any special behavior they cause are documented below:

The pattern modifiers:

  • /m - RXf_PMf_MULTILINE

    If this is in rx->extflags it will be passed to Perl_fbm_instr by pp_split which will treat the subject string as a multi-line string.

  • /s - RXf_PMf_SINGLELINE
  • /i - RXf_PMf_FOLD
  • /x - RXf_PMf_EXTENDED

    If present on a regex, "#" comments will be handled differently by the tokenizer in some cases.

    TODO: Document those cases.

  • /p - RXf_PMf_KEEPCOPY

    TODO: Document this

  • Character set

    The character set semantics are determined by an enum that is contained in this field. This is still experimental and subject to change, but the current interface returns the rules by use of the in-line function get_regex_charset(const U32 flags) . The only currently documented value returned from it is REGEX_LOCALE_CHARSET, which is set if use locale is in effect. If present in rx->extflags , split will use the locale dependent definition of whitespace when RXf_SKIPWHITE or RXf_WHITE is in effect. ASCII whitespace is defined as per isSPACE, and by the internal macros is_utf8_space under UTF-8, and isSPACE_LC under use locale .

Additional flags:

  • RXf_SPLIT

    This flag was removed in perl 5.18.0. split ' ' is now special-cased solely in the parser. RXf_SPLIT is still #defined, so you can test for it. This is how it used to work:

    If split is invoked as split ' ' or with no arguments (which really means split(' ', $_) , see split), Perl will set this flag. The regex engine can then check for it and set the SKIPWHITE and WHITE extflags. To do this, the Perl engine does:

        if (flags & RXf_SPLIT && r->prelen == 1 && r->precomp[0] == ' ')
            r->extflags |= (RXf_SKIPWHITE|RXf_WHITE);

These flags can be set during compilation to enable optimizations in the split operator.

  • RXf_SKIPWHITE

    This flag was removed in perl 5.18.0. It is still #defined, so you can set it, but doing so will have no effect. This is how it used to work:

    If the flag is present in rx->extflags split will delete whitespace from the start of the subject string before it's operated on. What is considered whitespace depends on if the subject is a UTF-8 string and if the RXf_PMf_LOCALE flag is set.

    If RXf_WHITE is set in addition to this flag, split will behave like split " " under the Perl engine.

  • RXf_START_ONLY

    Tells the split operator to split the target string on newlines (\n ) without invoking the regex engine.

    Perl's engine sets this if the pattern is /^/ (plen == 1 && *exp == '^' ), even under /^/s ; see split. Of course a different regex engine might want to use the same optimizations with a different syntax.

  • RXf_WHITE

    Tells the split operator to split the target string on whitespace without invoking the regex engine. The definition of whitespace varies depending on if the target string is a UTF-8 string and on if RXf_PMf_LOCALE is set.

    Perl's engine sets this flag if the pattern is \s+.

  • RXf_NULL

    Tells the split operator to split the target string on characters. The definition of character varies depending on if the target string is a UTF-8 string.

    Perl's engine sets this flag on empty patterns; this optimization makes split // much faster than it would otherwise be. It's even faster than unpack.

  • RXf_NO_INPLACE_SUBST

    Added in perl 5.18.0, this flag indicates that a regular expression might perform an operation that would interfere with in-place substitution. For instance it might contain lookbehind, or assign to non-magical variables (such as $REGMARK and $REGERROR) during matching. s/// will skip certain optimisations when this is set.
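The split-related flags above are set at the C level, but the behaviour they optimize is visible from Perl. The following sketch shows only that Perl-level behaviour for the patterns named above (/^/, \s+, the empty pattern, and split ' '); it does not inspect the flags themselves, and the sample strings are invented.

```perl
use strict;
use warnings;

# /^/ (the RXf_START_ONLY case): split at the start of every line.
my @lines = split /^/, "one\ntwo\nthree\n";

# Empty pattern (the RXf_NULL case): split into single characters.
my @chars = split //, 'abc';

# split ' ' skips leading whitespace; split /\s+/ (the RXf_WHITE case)
# keeps an empty leading field instead.
my $padded = "  leading and   trailing  ";
my @awk    = split ' ',   $padded;
my @white  = split /\s+/, $padded;

print scalar(@lines), " lines, ", scalar(@chars), " characters\n";
print "split ' '   gives ", scalar(@awk),   " fields\n";
print "split /\\s+/ gives ", scalar(@white), " fields\n";
```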

exec

    I32 exec(pTHX_ REGEXP * const rx,
             char *stringarg, char* strend, char* strbeg,
             I32 minend, SV* screamer,
             void* data, U32 flags);

Execute a regexp. The arguments are

  • rx

    The regular expression to execute.

  • screamer

    This strangely-named arg is the SV to be matched against. Note that the actual char array to be matched against is supplied by the arguments described below; the SV is just used to determine UTF8ness, pos() etc.

  • strbeg

    Pointer to the physical start of the string.

  • strend

    Pointer to the character following the physical end of the string (i.e. the \0 ).

  • stringarg

    Pointer to the position in the string where matching should start; it might not be equal to strbeg (for example in a later iteration of /.../g ).

  • minend

    Minimum length of string (measured in bytes from stringarg ) that must match; if the engine reaches the end of the match but hasn't reached this position in the string, it should fail.

  • data

    Optimisation data; subject to change.

  • flags

    Optimisation flags; subject to change.

intuit

  char* intuit(pTHX_ REGEXP * const rx,
               SV *sv, char *strpos, char *strend,
               const U32 flags, struct re_scream_pos_data_s *data);

Find the start position where a regex match should be attempted, or possibly if the regex engine should not be run because the pattern can't match. This is called, as appropriate, by the core, depending on the values of the extflags member of the regexp structure.

checkstr

  SV* checkstr(pTHX_ REGEXP * const rx);

Return a SV containing a string that must appear in the pattern. Used by split for optimising matches.

free

  void free(pTHX_ REGEXP * const rx);

Called by Perl when it is freeing a regexp pattern so that the engine can release any resources pointed to by the pprivate member of the regexp structure. This is only responsible for freeing private data; Perl will handle releasing anything else contained in the regexp structure.

Numbered capture callbacks

Called to get/set the value of $` , $' , $& and their named equivalents, ${^PREMATCH}, ${^POSTMATCH} and ${^MATCH}, as well as the numbered capture groups ($1 , $2 , ...).

The paren parameter will be 1 for $1 , 2 for $2 and so forth, and have these symbolic values for the special variables:

  ${^PREMATCH}   RX_BUFF_IDX_CARET_PREMATCH
  ${^POSTMATCH}  RX_BUFF_IDX_CARET_POSTMATCH
  ${^MATCH}      RX_BUFF_IDX_CARET_FULLMATCH
  $`             RX_BUFF_IDX_PREMATCH
  $'             RX_BUFF_IDX_POSTMATCH
  $&             RX_BUFF_IDX_FULLMATCH

Note that in Perl 5.17.3 and earlier, the last three constants were also used for the caret variants of the variables.

The names have been chosen by analogy with Tie::Scalar method names, with an additional LENGTH callback for efficiency. However named capture variables are currently not tied internally but implemented via magic.
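From the Perl side, the variables served by these callbacks look as follows. This is a small sketch with an arbitrary sample string; it uses the /p modifier so that the caret variants are populated.

```perl
use strict;
use warnings;

my $text = "hello world";
my ($pre, $match, $post, $first);

if ($text =~ /o (w)/p) {
    # Each of these reads goes through the engine's numbered_buff_FETCH.
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
    $first = $1;
    print "pre=[$pre] match=[$match] post=[$post] \$1=[$first]\n";
}
```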

numbered_buff_FETCH

  void numbered_buff_FETCH(pTHX_ REGEXP * const rx, const I32 paren,
                           SV * const sv);

Fetch a specified numbered capture. sv should be set to the scalar to return. The scalar is passed as an argument rather than being returned from the function because Perl already has a scalar to store the value when the callback is called, so creating another one would be redundant. The scalar can be set with sv_setsv , sv_setpvn and friends; see perlapi.

This callback is where Perl untaints its own capture variables under taint mode (see perlsec). See the Perl_reg_numbered_buff_fetch function in regcomp.c for how to untaint capture variables if that's something you'd like your engine to do as well.

numbered_buff_STORE

  void (*numbered_buff_STORE) (pTHX_
                               REGEXP * const rx,
                               const I32 paren,
                               SV const * const value);

Set the value of a numbered capture variable. value is the scalar that is to be used as the new value. It's up to the engine to make sure this is used as the new value (or reject it).

Example:

  if ("ook" =~ /(o*)/) {
      # 'paren' will be '1' and 'value' will be 'ee'
      $1 =~ tr/o/e/;
  }

Perl's own engine will croak on any attempt to modify the capture variables. To do this in another engine, use the following callback (copied from Perl_reg_numbered_buff_store ):

  void
  Example_reg_numbered_buff_store(pTHX_
                                  REGEXP * const rx,
                                  const I32 paren,
                                  SV const * const value)
  {
      PERL_UNUSED_ARG(rx);
      PERL_UNUSED_ARG(paren);
      PERL_UNUSED_ARG(value);

      if (!PL_localizing)
          Perl_croak(aTHX_ PL_no_modify);
  }

Actually Perl will not always croak in a statement that looks like it would modify a numbered capture variable. This is because the STORE callback will not be called if Perl can determine that it doesn't have to modify the value. This is exactly how tied variables behave in the same situation:

  package CaptureVar;
  use base 'Tie::Scalar';

  sub TIESCALAR { bless [] }
  sub FETCH { undef }
  sub STORE { die "This doesn't get called" }

  package main;

  tie my $sv => "CaptureVar";
  $sv =~ y/a/b/;

Because $sv is undef when the y/// operator is applied to it, the transliteration won't actually execute and the program won't die. This is different to how 5.8 and earlier versions behaved since the capture variables were READONLY variables then; now they'll just die when assigned to in the default engine.
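The croak described above can be observed from Perl with the default engine. The sketch below assigns directly to $1 inside an eval and records the error; the exact wording of the message is whatever the running perl produces.

```perl
use strict;
use warnings;

my $error = '';
if ("ook" =~ /(o*)/) {
    # A direct assignment always needs to store a value, so the STORE
    # callback is reached and the default engine croaks.
    eval { $1 = "ee"; 1 } or $error = $@;
}
print "Caught: $error" if $error;
```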

numbered_buff_LENGTH

  I32 numbered_buff_LENGTH (pTHX_
                            REGEXP * const rx,
                            const SV * const sv,
                            const I32 paren);

Get the length of a capture variable. There's a special callback for this so that Perl doesn't have to do a FETCH and run length on the result. Since the length is (in Perl's case) known from an offset stored in rx->offs , this is much more efficient:

  I32 s1  = rx->offs[paren].start;
  I32 s2  = rx->offs[paren].end;
  I32 len = s2 - s1;

This is a little bit more complex in the case of UTF-8, see what Perl_reg_numbered_buff_length does with is_utf8_string_loclen.
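The UTF-8 complication is visible from Perl: length on a capture reports characters, while the underlying buffer offsets are in bytes. A small sketch, using three Greek letters given as code points (an arbitrary choice):

```perl
use strict;
use warnings;

my $str = "\x{3b1}\x{3b2}\x{3b3}";   # alpha, beta, gamma
my ($chars, $bytes);

if ($str =~ /(.+)/) {
    $chars = length $1;              # length in characters

    my $copy = $1;
    utf8::encode($copy);             # turn the copy into its UTF-8 bytes
    $bytes = length $copy;           # length in bytes
}
print "chars=$chars bytes=$bytes\n";
```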

Named capture callbacks

Called to get/set the value of %+ and %- , as well as by some utility functions in re.

There are two callbacks: named_buff is called in all the cases where the FETCH, STORE, DELETE, CLEAR, EXISTS and SCALAR Tie::Hash callbacks would be called on changes to %+ and %- , and named_buff_iter is called in the same cases as FIRSTKEY and NEXTKEY.

The flags parameter can be used to determine which of these operations the callbacks should respond to. The following flags are currently defined:

Which Tie::Hash operation is being performed from the Perl level on %+ or %- , if any:

  RXapif_FETCH
  RXapif_STORE
  RXapif_DELETE
  RXapif_CLEAR
  RXapif_EXISTS
  RXapif_SCALAR
  RXapif_FIRSTKEY
  RXapif_NEXTKEY

Whether %+ or %- is being operated on, if either:

  RXapif_ONE /* %+ */
  RXapif_ALL /* %- */

Whether this is being called as re::regname , re::regnames or re::regnames_count , if any. The first two will be combined with RXapif_ONE or RXapif_ALL .

  RXapif_REGNAME
  RXapif_REGNAMES
  RXapif_REGNAMES_COUNT

Internally %+ and %- are implemented with a real tied interface via Tie::Hash::NamedCapture. The methods in that package will call back into these functions. However the usage of Tie::Hash::NamedCapture for this purpose might change in future releases. For instance this might be implemented by magic instead (would need an extension to mgvtbl).
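The difference between %+ (RXapif_ONE) and %- (RXapif_ALL) is easiest to see from Perl. In this sketch the group names c and d are arbitrary; c is deliberately used twice and d is an optional group that does not take part in the match.

```perl
use strict;
use warnings;

my ($first, @all, @plus_keys, @minus_keys);

if ("ab" =~ /(?<c>a)(?<c>b)(?<d>x)?/) {
    $first      = $+{c};          # %+ gives the leftmost defined group named c
    @all        = @{ $-{c} };     # %- gives every group named c, in order
    @plus_keys  = sort keys %+;   # only names that captured something
    @minus_keys = sort keys %-;   # every name in the pattern
}
print "first=$first all=@all plus=@plus_keys minus=@minus_keys\n";
```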

named_buff

  SV* (*named_buff) (pTHX_ REGEXP * const rx, SV * const key,
                     SV * const value, U32 flags);

named_buff_iter

  SV* (*named_buff_iter) (pTHX_
                          REGEXP * const rx,
                          const SV * const lastkey,
                          const U32 flags);

qr_package

  SV* qr_package(pTHX_ REGEXP * const rx);

The package the qr// magic object is blessed into (as seen by ref qr// ). It is recommended that engines change this to their package name for identification regardless of if they implement methods on the object.

The package this method returns should also have the internal Regexp package in its @ISA . qr//->isa("Regexp") should always be true regardless of what engine is being used.

Example implementation might be:

  SV*
  Example_qr_package(pTHX_ REGEXP * const rx)
  {
      PERL_UNUSED_ARG(rx);
      return newSVpvs("re::engine::Example");
  }

Any method calls on an object created with qr// will be dispatched to the package as a normal object.

  use re::engine::Example;
  my $re = qr//;
  $re->meth; # dispatched to re::engine::Example::meth()

To retrieve the REGEXP object from the scalar in an XS function use the SvRX macro, see REGEXP Functions in perlapi.

  void meth(SV * rv)
  PPCODE:
      REGEXP * re = SvRX(rv);
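The requirement that qr// objects stay in the Regexp hierarchy can be checked from pure Perl. The sketch below uses the default engine and a hypothetical package My::Pattern, re-blessing by hand purely to imitate what an engine's qr_package would arrange; a real engine would return its package name from the callback instead.

```perl
use strict;
use warnings;

my $re        = qr/o+/;
my $class     = ref $re;                      # "Regexp" with the default engine
my $is_regexp = $re->isa("Regexp") ? 1 : 0;

{
    # Hypothetical package standing in for an engine's own class.
    package My::Pattern;
    our @ISA = ('Regexp');
    sub describe { return "custom method called" }
}

my $obj           = bless qr/o+/, 'My::Pattern';
my $still_matches = "foo" =~ $obj ? 1 : 0;    # still usable as a pattern
my $description   = $obj->describe;           # normal method dispatch

print "$class $is_regexp $still_matches $description\n";
```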

dupe

  void* dupe(pTHX_ REGEXP * const rx, CLONE_PARAMS *param);

On threaded builds a regexp may need to be duplicated so that the pattern can be used by multiple threads. This routine is expected to handle the duplication of any private data pointed to by the pprivate member of the regexp structure. It will be called with the preconstructed new regexp structure as an argument, the pprivate member will point at the old private structure, and it is this routine's responsibility to construct a copy and return a pointer to it (which Perl will then use to overwrite the field as passed to this routine.)

This allows the engine to dupe its private data but also if necessary modify the final structure if it really must.

On unthreaded builds this field doesn't exist.

op_comp

This is private to the Perl core and subject to change. Should be left null.

The REGEXP structure

The REGEXP struct is defined in regexp.h. All regex engines must be able to correctly build such a structure in their comp routine.

The REGEXP structure contains all the data that Perl needs to be aware of to properly work with the regular expression. It includes data about optimisations that Perl can use to determine if the regex engine should really be used, and various other control info that is needed to properly execute patterns in various contexts, such as whether the pattern is anchored in some way, what flags were used during the compile, or whether the program contains special constructs that Perl needs to be aware of.

In addition it contains two fields that are intended for the private use of the regex engine that compiled the pattern. These are the intflags and pprivate members. pprivate is a void pointer to an arbitrary structure, whose use and management is the responsibility of the compiling engine. Perl will never modify either of these values.

  typedef struct regexp {
      /* what engine created this regexp? */
      const struct regexp_engine* engine;

      /* what re is this a lightweight copy of? */
      struct regexp* mother_re;

      /* Information about the match that the Perl core uses to manage
       * things */
      U32 extflags;    /* Flags used both externally and internally */
      I32 minlen;      /* minimum possible number of chars in
                          string to match */
      I32 minlenret;   /* minimum possible number of chars in $& */
      U32 gofs;        /* chars left of pos that we search from */

      /* substring data about strings that must appear
         in the final match, used for optimisations */
      struct reg_substr_data *substrs;

      U32 nparens;     /* number of capture groups */

      /* private engine specific data */
      U32 intflags;    /* Engine Specific Internal flags */
      void *pprivate;  /* Data private to the regex engine which
                          created this object. */

      /* Data about the last/current match. These are modified during
       * matching */
      U32 lastparen;            /* highest close paren matched ($+) */
      U32 lastcloseparen;       /* last close paren matched ($^N) */
      regexp_paren_pair *swap;  /* Swap copy of *offs */
      regexp_paren_pair *offs;  /* Array of offsets for (@-) and
                                   (@+) */

      char *subbeg;    /* saved or original string so \digit works
                          forever. */
      SV_SAVED_COPY    /* If non-NULL, SV which is COW from original */
      I32 sublen;      /* Length of string pointed by subbeg */
      I32 suboffset;   /* byte offset of subbeg from logical start of
                          str */
      I32 subcoffset;  /* suboffset equiv, but in chars (for @-/@+) */

      /* Information about the match that isn't often used */
      I32 prelen;           /* length of precomp */
      const char *precomp;  /* pre-compilation regular expression */

      char *wrapped;   /* wrapped version of the pattern */
      I32 wraplen;     /* length of wrapped */

      I32 seen_evals;  /* number of eval groups in the pattern - for
                          security checks */
      HV *paren_names; /* Optional hash of paren names */

      /* Refcount of this regexp */
      I32 refcnt;
  } regexp;

The fields are discussed in more detail below:

engine

This field points at a regexp_engine structure which contains pointers to the subroutines that are to be used for performing a match. It is the compiling routine's responsibility to populate this field before returning the regexp object.

Internally this is set to NULL unless a custom engine is specified in $^H{regcomp} . Perl's own set of callbacks can be accessed in the struct pointed to by RE_ENGINE_PTR .

mother_re

TODO, see http://www.mail-archive.com/perl5-changes@perl.org/msg17328.html

extflags

Perl uses this field to see what flags the regexp was compiled with. It will normally be set to the value of the flags parameter by the comp callback. See the comp documentation for valid flags.

minlen minlenret

The minimum string length (in characters) required for the pattern to match. This is used to prune the search space by not bothering to match any closer to the end of a string than would allow a match. For instance there is no point in even starting the regex engine if the minlen is 10 but the string is only 5 characters long. There is no way that the pattern can match.

minlenret is the minimum length (in characters) of the string that would be found in $& after a match.

The difference between minlen and minlenret can be seen in the following pattern:

  /ns(?=\d)/

where the minlen would be 3 but minlenret would only be 2 as the \d is required to match but is not actually included in the matched content. This distinction is particularly important as the substitution logic uses the minlenret to tell if it can do in-place substitutions (these can result in considerable speed-up).
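The same distinction can be observed at the Perl level: the lookahead forces the subject to be at least three characters long, but only two of them end up in the match. A minimal sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $pattern = qr/ns(?=\d)/;
my $subject = "ns5";
my $matched;

if ($subject =~ $pattern) {
    # Extract the matched text from the offsets rather than using $&.
    $matched = substr($subject, $-[0], $+[0] - $-[0]);
}

# A two-character subject cannot match, because the digit is still required.
my $too_short = "ns" =~ $pattern ? 1 : 0;

print "matched=[$matched] length=", length($matched), " too_short=$too_short\n";
```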

gofs

Left offset from pos() to start match at.

substrs

Substring data about strings that must appear in the final match. This is currently only used internally by Perl's engine, but might be used in the future for all engines for optimisations.

nparens , lastparen , and lastcloseparen

These fields are used to keep track of how many paren groups could be matched in the pattern, which was the last open paren to be entered, and which was the last close paren to be entered.

intflags

The engine's private copy of the flags the pattern was compiled with. Usually this is the same as extflags unless the engine chose to modify one of them.

pprivate

A void* pointing to an engine-defined data structure. The Perl engine uses the regexp_internal structure (see Base Structures in perlreguts) but a custom engine should use something else.

swap

Unused. Left in for compatibility with Perl 5.10.0.

offs

A regexp_paren_pair structure which defines offsets into the string being matched which correspond to the $& and $1 , $2 etc. captures. The regexp_paren_pair struct is defined as follows:

  typedef struct regexp_paren_pair {
      I32 start;
      I32 end;
  } regexp_paren_pair;

If ->offs[num].start or ->offs[num].end is -1 then that capture group did not match. ->offs[0].start/end represents $& (or ${^MATCH} under //p ) and ->offs[paren].end matches $$paren where $paren >= 1.
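At the Perl level these offsets surface as the @- and @+ arrays, with undef standing in for a group that did not match. A short sketch with an arbitrary subject string and an optional second group that fails to participate:

```perl
use strict;
use warnings;

my (@whole, @group1, $group2_start);

if ("abcdef" =~ /b(c)(x)?d/) {
    @whole        = ($-[0], $+[0]);   # offsets of the whole match
    @group1       = ($-[1], $+[1]);   # offsets of $1
    $group2_start = $-[2];            # group 2 did not match
}
print "whole=@whole group1=@group1 group2=",
      defined $group2_start ? $group2_start : "undef", "\n";
```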

precomp prelen

Used for optimisations. precomp holds a copy of the pattern that was compiled and prelen its length. When a new pattern is to be compiled (such as inside a loop) the internal regcomp operator checks if the last compiled REGEXP 's precomp and prelen are equivalent to the new one, and if so uses the old pattern instead of compiling a new one.

The relevant snippet from Perl_pp_regcomp :

  if (!re || !re->precomp || re->prelen != (I32)len ||
      memNE(re->precomp, t, len))
      /* Compile a new pattern */

paren_names

This is a hash used internally to track named capture groups and their offsets. The keys are the names of the buffers; the values are dualvars, with the IV slot holding the number of buffers with the given name and the PV being an embedded array of I32. The values may also be contained independently in the data array in cases where named backreferences are used.

substrs

Holds information on the longest string that must occur at a fixed offset from the start of the pattern, and the longest string that must occur at a floating offset from the start of the pattern. Used to do Fast-Boyer-Moore searches on the string to find out if it's worth using the regex engine at all, and if so where in the string to search.

subbeg sublen saved_copy suboffset subcoffset

Used during the execution phase for managing search and replace patterns, and for providing the text for $& , $1 etc. subbeg points to a buffer (either the original string, or a copy in the case of RX_MATCH_COPIED(rx) ), and sublen is the length of the buffer. The RX_OFFS start and end indices index into this buffer.

In the presence of the REXEC_COPY_STR flag, but with the addition of the REXEC_COPY_SKIP_PRE or REXEC_COPY_SKIP_POST flags, an engine can choose not to copy the full buffer (although it must still do so in the presence of RXf_PMf_KEEPCOPY or the relevant bits being set in PL_sawampersand ). In this case, it may set suboffset to indicate the number of bytes from the logical start of the buffer to the physical start (i.e. subbeg ). It should also set subcoffset , the number of characters in the offset. The latter is needed to support @- and @+ which work in characters, not bytes.

wrapped wraplen

Stores the string qr// stringifies to. The Perl engine for example stores (?^:eek) in the case of qr/eek/.

When using a custom engine that doesn't support the (?:) construct for inline modifiers, it's probably best to have qr// stringify to the supplied pattern. Note that this will create undesired patterns in cases such as:

  my $x = qr/a|b/;  # "a|b"
  my $y = qr/c/i;   # "c"
  my $z = qr/$x$y/; # "a|bc"

There's no solution for this problem other than making the custom engine understand a construct like (?:).
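For comparison, this is what the default engine does with the same three patterns. It is a sketch that assumes a perl new enough to stringify with the (?^...) form, as in the (?^:eek) example above.

```perl
use strict;
use warnings;

my $x = qr/a|b/;
my $y = qr/c/i;
my $z = qr/$x$y/;

# The wrapped forms keep each piece grouped and carry its own modifiers.
print "$x\n$y\n$z\n";

my $ac_upper = "aC" =~ /^$z$/ ? 1 : 0;  # $y is still case-insensitive
my $a_alone  = "a"  =~ /^$z$/ ? 1 : 0;  # a naive "a|bc" would accept this
```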

seen_evals

This stores the number of eval groups in the pattern. This is used for security purposes when embedding compiled regexes into larger patterns with qr//.

refcnt

The number of times the structure is referenced. When this falls to 0, the regexp is automatically freed by a call to pregfree. This should be set to 1 in each engine's comp routine.

HISTORY

Originally part of perlreguts.

AUTHORS

Originally written by Yves Orton, expanded by Ævar Arnfjörð Bjarmason.

LICENSE

Copyright 2006 Yves Orton and 2007 Ævar Arnfjörð Bjarmason.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

 
perlrebackslash

NAME

perlrebackslash - Perl Regular Expression Backslash Sequences and Escapes

DESCRIPTION

The top level documentation about Perl regular expressions is found in perlre.

This document describes all backslash and escape sequences. After explaining the role of the backslash, it lists all the sequences that have a special meaning in Perl regular expressions (in alphabetical order), then describes each of them.

Most sequences are described in detail in different documents; the primary purpose of this document is to have a quick reference guide describing all backslash and escape sequences.

The backslash

In a regular expression, the backslash can perform one of two tasks: it either takes away the special meaning of the character following it (for instance, \| matches a vertical bar, it's not an alternation), or it is the start of a backslash or escape sequence.

The rules determining what it is are quite simple: if the character following the backslash is an ASCII punctuation (non-word) character (that is, anything that is not a letter, digit, or underscore), then the backslash just takes away any special meaning of the character following it.

If the character following the backslash is an ASCII letter or an ASCII digit, then the sequence may be special; if so, it's listed below. A few letters have not been used yet, so escaping them with a backslash doesn't change them to be special. A future version of Perl may assign a special meaning to them, so if you have warnings turned on, Perl issues a warning if you use such a sequence [1].

It is however guaranteed that backslash or escape sequences never have a punctuation character following the backslash, not now, and not in a future version of Perl 5. So it is safe to put a backslash in front of a non-word character.

Note that the backslash itself is special; if you want to match a backslash, you have to escape the backslash with a backslash: /\\/ matches a single backslash.

  • [1]

    There is one exception. If you use an alphanumeric character as the delimiter of your pattern (which you probably shouldn't do for readability reasons), you have to escape the delimiter if you want to match it. Perl won't warn then. See also Gory details of parsing quoted constructs in perlop.
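The two roles of the backslash described in this section can be checked with a few one-line matches. This is a minimal sketch; the sample strings are arbitrary.

```perl
use strict;
use warnings;

# A backslash before punctuation removes any special meaning.
my $literal_bar = "a|b" =~ /^a\|b$/ ? 1 : 0;   # \| is a literal vertical bar
my $not_alt     = "a"   =~ /^a\|b$/ ? 1 : 0;   # so this is not an alternation

# Escaping punctuation that was never special is harmless.
my $harmless    = "x-y" =~ /^x\-y$/ ? 1 : 0;

# A literal backslash has to be escaped with a backslash.
my $backslash   = 'C:\dir' =~ /\\/ ? 1 : 0;

print "$literal_bar $not_alt $harmless $backslash\n";
```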

All the sequences and escapes

Those not usable within a bracketed character class (like [\da-z] ) are marked as Not in [].

  \000              Octal escape sequence.  See also \o{}.
  \1                Absolute backreference.  Not in [].
  \a                Alarm or bell.
  \A                Beginning of string.  Not in [].
  \b                Word/non-word boundary.  (Backspace in []).
  \B                Not a word/non-word boundary.  Not in [].
  \cX               Control-X.
  \C                Single octet, even under UTF-8.  Not in [].
  \d                Character class for digits.
  \D                Character class for non-digits.
  \e                Escape character.
  \E                Turn off \Q, \L and \U processing.  Not in [].
  \f                Form feed.
  \F                Foldcase till \E.  Not in [].
  \g{}, \g1         Named, absolute or relative backreference.  Not in [].
  \G                Pos assertion.  Not in [].
  \h                Character class for horizontal whitespace.
  \H                Character class for non horizontal whitespace.
  \k{}, \k<>, \k''  Named backreference.  Not in [].
  \K                Keep the stuff left of \K.  Not in [].
  \l                Lowercase next character.  Not in [].
  \L                Lowercase till \E.  Not in [].
  \n                (Logical) newline character.
  \N                Any character but newline.  Not in [].
  \N{}              Named or numbered (Unicode) character or sequence.
  \o{}              Octal escape sequence.
  \p{}, \pP         Character with the given Unicode property.
  \P{}, \PP         Character without the given Unicode property.
  \Q                Quote (disable) pattern metacharacters till \E.  Not in [].
  \r                Return character.
  \R                Generic new line.  Not in [].
  \s                Character class for whitespace.
  \S                Character class for non whitespace.
  \t                Tab character.
  \u                Titlecase next character.  Not in [].
  \U                Uppercase till \E.  Not in [].
  \v                Character class for vertical whitespace.
  \V                Character class for non vertical whitespace.
  \w                Character class for word characters.
  \W                Character class for non-word characters.
  \x{}, \x00        Hexadecimal escape sequence.
  \X                Unicode "extended grapheme cluster".  Not in [].
  \z                End of string.  Not in [].
  \Z                End of string.  Not in [].

Character Escapes

Fixed characters

A handful of characters have a dedicated character escape. The following table shows them, along with their ASCII code points (in decimal and hex), their ASCII name, the control escape on ASCII platforms and a short description. (For EBCDIC platforms, see OPERATOR DIFFERENCES in perlebcdic.)

  Seq.  Code Point  ASCII  Cntrl  Description
        Dec   Hex
  \a      7    07    BEL   \cG    alarm or bell
  \b      8    08    BS    \cH    backspace [1]
  \e     27    1B    ESC   \c[    escape character
  \f     12    0C    FF    \cL    form feed
  \n     10    0A    LF    \cJ    line feed [2]
  \r     13    0D    CR    \cM    carriage return
  \t      9    09    TAB   \cI    tab

  • [1]

    \b is the backspace character only inside a character class. Outside a character class, \b is a word/non-word boundary.

  • [2]

    \n matches a logical newline. Perl converts between \n and your OS's native newline character when reading from or writing to text files.

Example

  $str =~ /\t/; # Matches if $str contains a (horizontal) tab.
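The code points in the table, and the double life of \b from footnote [1], can be verified with a short script. This sketch assumes an ASCII platform, as the table does.

```perl
use strict;
use warnings;

# Code points of the dedicated escapes, as seen in double-quoted strings.
my %code_point = (
    '\a' => ord("\a"),
    '\b' => ord("\b"),   # in a string, \b is always a backspace
    '\e' => ord("\e"),
    '\f' => ord("\f"),
    '\n' => ord("\n"),
    '\r' => ord("\r"),
    '\t' => ord("\t"),
);
printf "%-3s %3d 0x%02X\n", $_, $code_point{$_}, $code_point{$_}
    for sort keys %code_point;

my $backspace     = "\x08";
my $in_class      = $backspace =~ /[\b]/ ? 1 : 0;  # backspace inside []
my $outside_class = $backspace =~ /\b/   ? 1 : 0;  # word boundary outside []
```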

Control characters

\c is used to denote a control character; the character following \c determines the value of the construct. For example the value of \cA is chr(1), and the value of \cb is chr(2), etc. The gory details are in Regexp Quote-Like Operators in perlop. A complete list of what chr(1), etc. means for ASCII and EBCDIC platforms is in OPERATOR DIFFERENCES in perlebcdic.

Note that \c\ alone at the end of a regular expression (or double-quoted string) is not valid. The backslash must be followed by another character. That is, \c\X means chr(28) . 'X' for all characters X.

To write platform-independent code, you must use \N{NAME} instead, like \N{ESCAPE} or \N{U+001B} , see charnames.

Mnemonic: control character.

Example

  $str =~ /\cK/; # Matches if $str contains a vertical tab (control-K).
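The values of a few \c constructs can be printed directly. The sketch below assumes an ASCII platform; on EBCDIC the numbers differ, which is why \N{} is recommended above for portable code.

```perl
use strict;
use warnings;

# Control escapes and the code points they produce on ASCII platforms.
my %control = (
    'cA' => ord("\cA"),
    'cb' => ord("\cb"),   # the letter's case does not matter
    'cG' => ord("\cG"),
    'cK' => ord("\cK"),
    'c[' => ord("\c["),
);
print "\\$_ = chr($control{$_})\n" for sort keys %control;

# The same escape used inside a pattern.
my $has_vertical_tab = "a\x0Bb" =~ /\cK/ ? 1 : 0;
```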

Named or numbered characters and character sequences

Unicode characters have a Unicode name and numeric code point (ordinal) value. Use the \N{} construct to specify a character by either of these values. Certain sequences of characters also have names.

To specify by name, the name of the character or character sequence goes between the curly braces.

To specify a character by Unicode code point, use the form \N{U+code point}, where code point is a number in hexadecimal that gives the code point that Unicode has assigned to the desired character. It is customary but not required to use leading zeros to pad the number to 4 digits. Thus \N{U+0041} means LATIN CAPITAL LETTER A , and you will rarely see it written without the two leading zeros. \N{U+0041} means "A" even on EBCDIC machines (where the ordinal value of "A" is not 0x41).

It is even possible to give your own names to characters and character sequences. For details, see charnames.

(There is an expanded internal form that you may see in debug output: \N{U+code point.code point...}. The ... means any number of these code points separated by dots. This represents the sequence formed by the characters. This is an internal form only, subject to change, and you should not try to use it yourself.)

Mnemonic: Named character.

Note that a character or character sequence expressed as a named or numbered character is considered a character without special meaning by the regex engine, and will match "as is".

Example

  $str =~ /\N{THAI CHARACTER SO SO}/; # Matches the Thai SO SO character

  use charnames 'Cyrillic';           # Loads Cyrillic names.
  $str =~ /\N{ZHE}\N{KA}/;            # Match "ZHE" followed by "KA".
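The following sketch exercises both the by-name and the by-code-point forms, and shows that a named character matches "as is" even when it would otherwise be a metacharacter. The sample characters are arbitrary choices.

```perl
use strict;
use warnings;
use charnames ':full';

my $by_name  = "A" =~ /^\N{LATIN CAPITAL LETTER A}$/ ? 1 : 0;
my $by_point = "A" =~ /^\N{U+0041}$/                 ? 1 : 0;
my $thai     = "\x{0E0B}" =~ /^\N{THAI CHARACTER SO SO}$/ ? 1 : 0;

# U+002E is FULL STOP; written this way it is a literal dot, not "any character".
my $dot_literal = "." =~ /^\N{U+002E}$/ ? 1 : 0;
my $dot_any     = "a" =~ /^\N{U+002E}$/ ? 1 : 0;

print "$by_name $by_point $thai $dot_literal $dot_any\n";
```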

Octal escapes

There are two forms of octal escapes. Each is used to specify a character by its code point specified in octal notation.

One form, available starting in Perl 5.14 looks like \o{...} , where the dots represent one or more octal digits. It can be used for any Unicode character.

It was introduced to avoid the potential problems with the other form, available in all Perls. That form consists of a backslash followed by three octal digits. One problem with this form is that it can look exactly like an old-style backreference (see Disambiguation rules between old-style octal escapes and backreferences below.) You can avoid this by making the first of the three digits always a zero, but that makes \077 the largest code point specifiable.

In some contexts, a backslash followed by two or even one octal digits may be interpreted as an octal escape, sometimes with a warning, and because of some bugs, sometimes with surprising results. Also, if you are creating a regex out of smaller snippets concatenated together, and you use fewer than three digits, the beginning of one snippet may be interpreted as adding digits to the ending of the snippet before it. See Absolute referencing for more discussion and examples of the snippet problem.

Note that a character expressed as an octal escape is considered a character without special meaning by the regex engine, and will match "as is".

To summarize, the \o{} form is always safe to use, and the other form is safe to use for code points through \077 when you use exactly three digits to specify them.

Mnemonic: 0ctal or octal.

Examples (assuming an ASCII platform)

  $str = "Perl";
  $str =~ /\o{120}/;   # Match, "\120" is "P".
  $str =~ /\120/;      # Same.
  $str =~ /\o{120}+/;  # Match, "\120" is "P",
                       # it's repeated at least once.
  $str =~ /\120+/;     # Same.
  $str =~ /P\053/;     # No match, "\053" is "+" and taken literally.
  /\o{23073}/          # Black foreground, white background smiling face.
  /\o{4801234567}/     # Raises a warning, and yields chr(4).
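The snippet-concatenation problem mentioned above can be reproduced in a few lines. In this sketch the variable names are illustrative; a two-digit octal escape intended as a line feed silently absorbs the digit that follows it, while the \o{} form does not.

```perl
use strict;
use warnings;

my $head = '\12';   # intended as a line feed, written with only two digits
my $tail = '3';     # a literal digit from the next snippet

# Concatenated, the pattern source reads \123, which is "S".
my $joined         = $head . $tail;
my $matches_s      = "S"   =~ /^$joined$/ ? 1 : 0;
my $matches_lf_3   = "\n3" =~ /^$joined$/ ? 1 : 0;

# The braced form keeps its digits to itself.
my $safe           = '\o{12}' . $tail;
my $safe_matches   = "\n3" =~ /^$safe$/ ? 1 : 0;

print "$matches_s $matches_lf_3 $safe_matches\n";
```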

Disambiguation rules between old-style octal escapes and backreferences

Octal escapes of the \000 form outside of bracketed character classes potentially clash with old-style backreferences (see Absolute referencing below). They both consist of a backslash followed by numbers. So Perl has to use heuristics to determine whether it is a backreference or an octal escape. Perl uses the following rules to disambiguate:

1

If the backslash is followed by a single digit, it's a backreference.

2

If the first digit following the backslash is a 0, it's an octal escape.

3

If the number following the backslash is N (in decimal), and Perl already has seen N capture groups, Perl considers this a backreference. Otherwise, it considers it an octal escape. If N has more than three digits, Perl takes only the first three for the octal escape; the rest are matched as is.

  my $pat = "(" x 999;
  $pat .= "a";
  $pat .= ")" x 999;
  /^($pat)\1000$/; # Matches 'aa'; there are 1000 capture groups.
  /^$pat\1000$/;   # Matches 'a@0'; there are 999 capture groups
                   # and \1000 is seen as \100 (a '@') and a '0'.

You can force a backreference interpretation always by using the \g{...} form. You can force an octal interpretation always by using the \o{...} form, or for numbers up through \077 (= 63 decimal), by using three digits, beginning with a "0".
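A smaller illustration of the rules above, using a pattern with a single capture group. It is a sketch only; the subject strings were chosen so that each interpretation can succeed in exactly one way.

```perl
use strict;
use warnings;

# Rule 1: a single digit is always a backreference.
my $single = "aa" =~ /^(a)\1$/ ? 1 : 0;

# Rule 3: \10 with only one group is the octal escape for chr(8).
my $octal  = "a\x08" =~ /^(a)\10$/ ? 1 : 0;

# \g{1} forces the backreference; the 0 that follows is a literal digit.
my $forced_backref = "aa0" =~ /^(a)\g{1}0$/ ? 1 : 0;

# \o{10} forces the octal reading regardless of the number of groups.
my $forced_octal   = "a\x08" =~ /^(a)\o{10}$/ ? 1 : 0;

print "$single $octal $forced_backref $forced_octal\n";
```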

Hexadecimal escapes

Like octal escapes, there are two forms of hexadecimal escapes, but both start with the same thing, \x . This is followed by either exactly two hexadecimal digits forming a number, or a hexadecimal number of arbitrary length surrounded by curly braces. The hexadecimal number is the code point of the character you want to express.

Note that a character expressed as one of these escapes is considered a character without special meaning by the regex engine, and will match "as is".

Mnemonic: hexadecimal.

Examples (assuming an ASCII platform)

  $str = "Perl";
  $str =~ /\x50/;      # Match, "\x50" is "P".
  $str =~ /\x50+/;     # Match, "\x50" is "P", it is repeated at least once
  $str =~ /P\x2B/;     # No match, "\x2B" is "+" and taken literally.
  /\x{2603}\x{2602}/   # Snowman with an umbrella.
                       # The Unicode character 2603 is a snowman,
                       # the Unicode character 2602 is an umbrella.
  /\x{263B}/           # Black smiling face.
  /\x{263b}/           # Same, the hex digits A - F are case insensitive.

Modifiers

A number of backslash sequences have to do with changing the character, or characters following them. \l will lowercase the character following it, while \u will uppercase (or, more accurately, titlecase) the character following it. They provide functionality similar to the functions lcfirst and ucfirst.

To uppercase or lowercase several characters, one might want to use \L or \U , which will lowercase/uppercase all characters following them, until either the end of the pattern or the next occurrence of \E , whichever comes first. They provide functionality similar to what the functions lc and uc provide.

\Q is used to quote (disable) pattern metacharacters, up to the next \E or the end of the pattern. \Q adds a backslash to any character that could have special meaning to Perl. In the ASCII range, it quotes every character that isn't a letter, digit, or underscore. See quotemeta for details on what gets quoted for non-ASCII code points. Using this ensures that any character between \Q and \E will be matched literally, not interpreted as a metacharacter by the regex engine.

\F can be used to casefold all characters following, up to the next \E or the end of the pattern. It provides the functionality similar to the fc function.

Mnemonic: Lowercase, Uppercase, Fold-case, Quotemeta, End.

Examples

  $sid     = "sid";
  $greg    = "GrEg";
  $miranda = "(Miranda)";
  $str     =~ /\u$sid/;          # Matches 'Sid'
  $str     =~ /\L$greg/;         # Matches 'greg'
  $str     =~ /\Q$miranda\E/;    # Matches '(Miranda)', as if the pattern
                                 # had been written as /\(Miranda\)/
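
The following sketch makes the examples above runnable by supplying concrete strings to match against. The target strings and result variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $sid     = "sid";
my $greg    = "GrEg";
my $miranda = "(Miranda)";

# \u titlecases only the first interpolated character.
my $titlecased = "Sid" =~ /^\u$sid$/ ? 1 : 0;

# \L lowercases everything up to \E.
my $lowercased = "greg" =~ /^\L$greg\E$/ ? 1 : 0;

# Without \Q the parentheses form a capture group, so the literal
# "(" in the target string is never matched.
my $unquoted = "(Miranda)" =~ /^$miranda$/     ? 1 : 0;
my $quoted   = "(Miranda)" =~ /^\Q$miranda\E$/ ? 1 : 0;

# quotemeta shows what \Q produces.
my $escaped = quotemeta $miranda;

print "titlecased=$titlecased lowercased=$lowercased ",
      "unquoted=$unquoted quoted=$quoted escaped=$escaped\n";
```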

Character classes

Perl regular expressions have a large range of character classes. Some of the character classes are written as a backslash sequence. We will briefly discuss those here; full details of character classes can be found in perlrecharclass.

\w is a character class that matches any single word character (letters, digits, Unicode marks, and connector punctuation (like the underscore)). \d is a character class that matches any decimal digit, while the character class \s matches any whitespace character. New in perl 5.10.0 are the classes \h and \v which match horizontal and vertical whitespace characters.

The exact set of characters matched by \d , \s, and \w varies depending on various pragma and regular expression modifiers. It is possible to restrict the match to the ASCII range by using the /a regular expression modifier. See perlrecharclass.

The uppercase variants (\W , \D , \S , \H , and \V ) are character classes that match, respectively, any character that isn't a word character, digit, whitespace, horizontal whitespace, or vertical whitespace.

Mnemonics: word, digit, space, horizontal, vertical.

Unicode classes

\pP (where P is a single letter) and \p{Property} are used to match a character that matches the given Unicode property; properties include things like "letter", or "thai character". Capitalizing the sequence to \PP and \P{Property} makes the sequence match a character that doesn't match the given Unicode property. For more details, see Backslash sequences in perlrecharclass and Unicode Character Properties in perlunicode.

Mnemonic: property.

Referencing

If capturing parentheses are used in a regular expression, we can refer to the part of the source string that was matched, and match exactly the same thing. There are three ways of referring to such a backreference: absolutely, relatively, and by name.

Absolute referencing

Either \gN (starting in Perl 5.10.0), or \N (old-style) where N is a positive (unsigned) decimal number of any length is an absolute reference to a capturing group.

N refers to the Nth set of parentheses, so \gN refers to whatever has been matched by that set of parentheses. Thus \g1 refers to the first capture group in the regex.

The \gN form can be equivalently written as \g{N} which avoids ambiguity when building a regex by concatenating shorter strings. Otherwise if you had a regex qr/$a$b/, and $a contained "\g1" , and $b contained "37" , you would get /\g137/ which is probably not what you intended.

In the \N form, N must not begin with a "0", and there must be at least N capturing groups, or else N is considered an octal escape (but something like \18 is the same as \0018 ; that is, the octal escape "\001" followed by a literal digit "8" ).

Mnemonic: group.

Examples

  /(\w+) \g1/;       # Finds a duplicated word, (e.g. "cat cat").
  /(\w+) \1/;        # Same thing; written old-style.
  /(.)(.)\g2\g1/;    # Match a four letter palindrome (e.g. "ABBA").
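
As a rough illustration, the patterns above can be run against a few sample strings. The sample strings, the anchors added to the patterns, and the variable names are assumptions made for this sketch; the last check shows how the braced form keeps the group number apart from digits that follow it.

```perl
use strict;
use warnings;

my @hits;
for my $text ("cat cat", "cat dog", "ABBA", "ABAB") {
    push @hits, "$text: doubled word" if $text =~ /^(\w+) \g{1}$/;
    push @hits, "$text: palindrome"   if $text =~ /^(.)(.)\g{2}\g{1}$/;
}

# \g{1}37 is group 1 followed by the literal digits "37".
my $braced = "5537" =~ /^(\d)\g{1}37$/ ? 1 : 0;

print "$_\n" for @hits;
print "braced=$braced\n";
```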

Relative referencing

\g-N (starting in Perl 5.10.0) is used for relative addressing. (It can be written as \g{-N}.) It refers to the Nth group before the \g{-N}.

The big advantage of this form is that it makes it much easier to write patterns with references that can be interpolated in larger patterns, even if the larger pattern also contains capture groups.

Examples

  /(A)              # Group 1
   (                # Group 2
     (B)            # Group 3
     \g{-1}         # Refers to group 3 (B)
     \g{-3}         # Refers to group 1 (A)
   )
  /x;               # Matches "ABBA".

  my $qr = qr /(.)(.)\g{-2}\g{-1}/;  # Matches 'abab', 'cdcd', etc.
  /$qr$qr/                           # Matches 'ababcdcd'.
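
To see why relative references interpolate more safely, the sketch below compares a relative and an absolute version of the same sub-pattern. The names $rel and $abs and the test strings are hypothetical, chosen only for this comparison.

```perl
use strict;
use warnings;

my $rel = qr/(.)(.)\g{-2}\g{-1}/;   # refers to the two nearest groups
my $abs = qr/(.)(.)\g{1}\g{2}/;     # always refers to groups 1 and 2

# Each copy of $rel checks its own pair of groups.
my $rel_twice = "ababcdcd" =~ /^$rel$rel$/ ? 1 : 0;

# The second copy of $abs still points at groups 1 and 2 ("a", "b"),
# so it rejects "cdcd" and instead demands a trailing "ab".
my $abs_twice = "ababcdcd" =~ /^$abs$abs$/ ? 1 : 0;
my $abs_drift = "ababcdab" =~ /^$abs$abs$/ ? 1 : 0;

print "rel_twice=$rel_twice abs_twice=$abs_twice abs_drift=$abs_drift\n";
```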

Named referencing

\g{name} (starting in Perl 5.10.0) can be used to back refer to a named capture group, dispensing completely with having to think about capture buffer positions.

To be compatible with .Net regular expressions, \g{name} may also be written as \k{name} , \k<name> or \k'name'.

To prevent any ambiguity, name must not start with a digit nor contain a hyphen.

Examples

  /(?<word>\w+) \g{word}/ # Finds duplicated word, (e.g. "cat cat")
  /(?<word>\w+) \k{word}/ # Same.
  /(?<word>\w+) \k<word>/ # Same.
  /(?<letter1>.)(?<letter2>.)\g{letter2}\g{letter1}/
                          # Match a four letter palindrome (e.g. "ABBA")
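
A named group can also be read back after the match through the %+ hash. The following is a small sketch; the sample sentence and the added \b anchors are assumptions for illustration.

```perl
use strict;
use warnings;

my $text = "the cat cat sat";

my $doubled;
if ($text =~ /\b(?<word>\w+) \k<word>\b/) {
    # %+ holds the text captured by each named group.
    $doubled = $+{word};
}

my $palindrome =
    "ABBA" =~ /^(?<letter1>.)(?<letter2>.)\g{letter2}\g{letter1}$/ ? 1 : 0;

print "doubled=$doubled palindrome=$palindrome\n";
```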

Assertions

Assertions are conditions that have to be true; they don't actually match parts of the substring. There are six assertions that are written as backslash sequences.

  • \A

    \A only matches at the beginning of the string. If the /m modifier isn't used, then /\A/ is equivalent to /^/. However, if the /m modifier is used, then /^/ matches after internal newlines, but the meaning of /\A/ isn't changed by the /m modifier. \A matches at the beginning of the string regardless of whether the /m modifier is used.

  • \z, \Z

    \z and \Z match at the end of the string. If the /m modifier isn't used, then /\Z/ is equivalent to /$/; that is, it matches at the end of the string, or just before the newline at the end of the string. If the /m modifier is used, then /$/ matches at internal newlines, but the meaning of /\Z/ isn't changed by the /m modifier. \Z matches at the end of the string (or just before a trailing newline) regardless of whether the /m modifier is used.

    \z is just like \Z , except that it does not match before a trailing newline. \z matches at the end of the string only, regardless of the modifiers used, and not just before a newline. It is how to anchor the match to the true end of the string under all conditions.

  • \G

    \G is usually used only in combination with the /g modifier. If the /g modifier is used and the match is done in scalar context, Perl remembers where in the source string the last match ended, and the next time, it will start the match from where it ended the previous time.

    \G matches the point where the previous match on that string ended, or the beginning of that string if there was no previous match.

    Mnemonic: Global.

  • \b, \B

    \b matches at any place between a word and a non-word character; \B matches at any place between characters where \b doesn't match. \b and \B assume there's a non-word character before the beginning and after the end of the source string; so \b will match at the beginning (or end) of the source string if the source string begins (or ends) with a word character. Otherwise, \B will match.

    Do not use something like \b=head\d\b and expect it to match the beginning of a line. It can't, because for there to be a boundary before the non-word "=", there must be a word character immediately previous. All boundary determinations look for word characters alone, not for non-word characters nor for string ends. It may help to understand how \b and \B work by equating them as follows:

      \b really means (?:(?<=\w)(?!\w)|(?<!\w)(?=\w))
      \B really means (?:(?<=\w)(?=\w)|(?<!\w)(?!\w))

    Mnemonic: boundary.
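
The lookaround equivalence given for \b can be checked empirically. This sketch collects every position where \b matches in a sample string and compares it with the positions found by the expanded form; the sample string and variable names are assumptions.

```perl
use strict;
use warnings;

my $str = "=head1 NAME";

# The expansion of \b quoted in the text above.
my $expanded_b = qr/(?:(?<=\w)(?!\w)|(?<!\w)(?=\w))/;

# Walk the string with /g and record where each zero-length match lands.
my (@builtin, @expanded);
push @builtin,  pos($str) while $str =~ /\b/g;
push @expanded, pos($str) while $str =~ /$expanded_b/g;

# There is no boundary before the leading "=", so this cannot match.
my $leading = $str =~ /\b=head\d\b/ ? 1 : 0;

print "builtin:  @builtin\n";
print "expanded: @expanded\n";
print "leading=$leading\n";
```

Offset 0 is absent from both lists, which is why \b=head\d\b fails at the start of a line.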

Examples

  "cat"   =~ /\Acat/;     # Match.
  "cat"   =~ /cat\Z/;     # Match.
  "cat\n" =~ /cat\Z/;     # Match.
  "cat\n" =~ /cat\z/;     # No match.

  "cat"   =~ /\bcat\b/;   # Matches.
  "cats"  =~ /\bcat\b/;   # No match.
  "cat"   =~ /\bcat\B/;   # No match.
  "cats"  =~ /\bcat\B/;   # Match.

  while ("cat dog" =~ /(\w+)/g) {
      print $1;           # Prints 'catdog'
  }
  while ("cat dog" =~ /\G(\w+)/g) {
      print $1;           # Prints 'cat'
  }
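
A common use of \G with /g in scalar context is a simple tokenizer, where each pattern must match exactly where the previous one stopped. The sketch below is one way this might look; the input, the token labels, and the use of the /c modifier (which keeps the position when a match fails) are choices made for this illustration.

```perl
use strict;
use warnings;

my $input = "12+34*5";
my @tokens;

pos($input) = 0;
while (pos($input) < length $input) {
    # /c preserves pos() on failure so the next alternative can try
    # again at the same offset.
    if    ($input =~ /\G(\d+)/gc)      { push @tokens, "NUM($1)" }
    elsif ($input =~ /\G([-+*\/])/gc)  { push @tokens, "OP($1)"  }
    else  { die "unexpected character at offset " . pos($input) }
}

print "@tokens\n";
```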

Misc

Here we document the backslash sequences that don't fall in one of the categories above. These are:

  • \C

    \C always matches a single octet, even if the source string is encoded in UTF-8 format, and the character to be matched is a multi-octet character. This is very dangerous, because it violates the logical character abstraction and can cause UTF-8 sequences to become malformed.

    Mnemonic: oCtet.

  • \K

    This appeared in perl 5.10.0. Anything matched left of \K is not included in $& , and will not be replaced if the pattern is used in a substitution. This lets you write s/PAT1 \K PAT2/REPL/x instead of s/(PAT1) PAT2/${1}REPL/x or s/(?<=PAT1) PAT2/REPL/x .

    Mnemonic: Keep.

  • \N

    This feature, available starting in v5.12, matches any character that is not a newline. It is a short-hand for writing [^\n], and is identical to the . metasymbol, except under the /s flag, which changes the meaning of ., but not \N .

    Note that \N{...} can mean a named or numbered character.

    Mnemonic: Complement of \n.

  • \R

    \R matches a generic newline; that is, anything considered a linebreak sequence by Unicode. This includes all characters matched by \v (vertical whitespace), and the multi character sequence "\x0D\x0A" (carriage return followed by a line feed, sometimes called the network newline; it's the end of line sequence used in Microsoft text files opened in binary mode). \R is equivalent to (?>\x0D\x0A|\v) . (The reason it doesn't backtrack is that the sequence is considered inseparable. That means that

      "\x0D\x0A" =~ /^\R\x0A$/   # No match

    fails, because the \R matches the entire string, and won't backtrack to match just the "\x0D" .) Since \R can match a sequence of more than one character, it cannot be put inside a bracketed character class; /[\R]/ is an error; use \v instead. \R was introduced in perl 5.10.0.

    Note that this does not respect any locale that might be in effect; it matches according to the platform's native character set.

    Mnemonic: none really. \R was picked because PCRE already uses \R , and more importantly because Unicode recommends such a regular expression metacharacter, and suggests \R as its notation.

  • \X

    This matches a Unicode extended grapheme cluster.

    \X matches quite well what normal (non-Unicode-programmer) usage would consider a single character. As an example, consider a G with some sort of diacritic mark, such as an arrow. There is no such single character in Unicode, but one can be composed by using a G followed by a Unicode "COMBINING UPWARDS ARROW BELOW", and would be displayed by Unicode-aware software as if it were a single character.

    Mnemonic: eXtended Unicode character.

Examples

  "\x{256}" =~ /^\C\C$/;      # Match as chr (0x256) takes
                              # 2 octets in UTF-8.

  $str =~ s/foo\Kbar/baz/g;   # Change any 'bar' following a 'foo' to 'baz'
  $str =~ s/(.)\K\g1//g;      # Delete duplicated characters.

  "\n"   =~ /^\R$/;           # Match, \n is a generic newline.
  "\r"   =~ /^\R$/;           # Match, \r is a generic newline.
  "\r\n" =~ /^\R$/;           # Match, \r\n is a generic newline.

  "P\x{307}" =~ /^\X$/        # \X matches a P with a dot above.
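
The \K, \R and \X examples can be turned into a runnable sketch by supplying concrete input. The sample strings and variable names below are assumptions; \C is left out of the sketch.

```perl
use strict;
use warnings;

# \K keeps "foo" out of the replaced text.
(my $swapped = "foobar bar foobar") =~ s/foo\Kbar/baz/g;

# Delete the second character of each doubled pair.
(my $deduped = "bookkeeper") =~ s/(.)\K\g1//g;

# \R treats LF, CRLF and CR alike when splitting lines.
my @lines = split /\R/, "one\ntwo\r\nthree\rfour";

# \R matches CRLF as a unit and will not give back the LF.
my $crlf_split = "\x0D\x0A" =~ /^\R\x0A$/ ? 1 : 0;

# "P" plus COMBINING DOT ABOVE is one grapheme cluster, "a" is another.
my @clusters = "P\x{307}a" =~ /\X/g;

print "swapped=$swapped deduped=$deduped\n";
print "lines=", scalar(@lines), " crlf_split=$crlf_split ",
      "clusters=", scalar(@clusters), "\n";
```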
 
perlrecharclass

NAME

perlrecharclass - Perl Regular Expression Character Classes

DESCRIPTION

The top level documentation about Perl regular expressions is found in perlre.

This manual page discusses the syntax and use of character classes in Perl regular expressions.

A character class is a way of denoting a set of characters in such a way that one character of the set is matched. It's important to remember that matching a character class consumes exactly one character in the source string. (The source string is the string the regular expression is matched against.)

There are three types of character classes in Perl regular expressions: the dot, backslash sequences, and the form enclosed in square brackets. Keep in mind, though, that often the term "character class" is used to mean just the bracketed form. Certainly, most Perl documentation does that.

The dot

The dot (or period), . is probably the most used, and certainly the most well-known character class. By default, a dot matches any character, except for the newline. That default can be changed to add matching the newline by using the single line modifier: either for the entire regular expression with the /s modifier, or locally with (?s). (The \N backslash sequence, described below, matches any character except newline without regard to the single line modifier.)

Here are some examples:

  "a"  =~ /./       # Match
  "."  =~ /./       # Match
  ""   =~ /./       # No match (dot has to match a character)
  "\n" =~ /./       # No match (dot does not match a newline)
  "\n" =~ /./s      # Match (global 'single line' modifier)
  "\n" =~ /(?s:.)/  # Match (local 'single line' modifier)
  "ab" =~ /^.$/     # No match (dot matches one character)

Backslash sequences

A backslash sequence is a sequence of characters, the first one of which is a backslash. Perl ascribes special meaning to many such sequences, and some of these are character classes. That is, they match a single character each, provided that the character belongs to the specific set of characters defined by the sequence.

Here's a list of the backslash sequences that are character classes. They are discussed in more detail below. (For the backslash sequences that aren't character classes, see perlrebackslash.)

  \d             Match a decimal digit character.
  \D             Match a non-decimal-digit character.
  \w             Match a "word" character.
  \W             Match a non-"word" character.
  \s             Match a whitespace character.
  \S             Match a non-whitespace character.
  \h             Match a horizontal whitespace character.
  \H             Match a character that isn't horizontal whitespace.
  \v             Match a vertical whitespace character.
  \V             Match a character that isn't vertical whitespace.
  \N             Match a character that isn't a newline.
  \pP, \p{Prop}  Match a character that has the given Unicode property.
  \PP, \P{Prop}  Match a character that doesn't have the Unicode property.

\N

\N , available starting in v5.12, like the dot, matches any character that is not a newline. The difference is that \N is not influenced by the single line regular expression modifier (see The dot above). Note that the form \N{...} may mean something completely different. When the {...} is a quantifier, it means to match a non-newline character that many times. For example, \N{3} means to match 3 non-newlines; \N{5,} means to match 5 or more non-newlines. But if {...} is not a legal quantifier, it is presumed to be a named character. See charnames for those. For example, none of \N{COLON} , \N{4F}, and \N{F4} contain legal quantifiers, so Perl will try to find characters whose names are respectively COLON , 4F, and F4 .
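
The difference between the dot and \N, and between the two readings of \N{...}, can be sketched as follows. The test strings and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# {3} is a legal quantifier, so \N{3} means three non-newlines.
my $three = "abc" =~ /^\N{3}$/ ? 1 : 0;

# Under /s the dot matches a newline, but \N still does not.
my $dot_s = "a\nc" =~ /^.{3}$/s   ? 1 : 0;
my $n_s   = "a\nc" =~ /^\N{3}$/s  ? 1 : 0;

# {COLON} is not a quantifier, so this is the character named COLON.
my $named = ":" =~ /^\N{COLON}$/ ? 1 : 0;

print "three=$three dot_s=$dot_s n_s=$n_s named=$named\n";
```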

Digits

\d matches a single character considered to be a decimal digit. If the /a regular expression modifier is in effect, it matches [0-9]. Otherwise, it matches anything that is matched by \p{Digit} , which includes [0-9]. (An unlikely possible exception is that under locale matching rules, the current locale might not have [0-9] matched by \d , and/or might match other characters whose code point is less than 256. Such a locale definition would be in violation of the C language standard, but Perl doesn't currently assume anything in regard to this.)

What this means is that unless the /a modifier is in effect \d not only matches the digits '0' - '9', but also Arabic, Devanagari, and digits from other languages. This may cause some confusion, and some security issues.

Some digits that \d matches look like some of the [0-9] ones, but have different values. For example, BENGALI DIGIT FOUR (U+09EA) looks very much like an ASCII DIGIT EIGHT (U+0038). An application that is expecting only the ASCII digits might be misled, or if the match is \d+ , the matched string might contain a mixture of digits from different writing systems that look like they signify a number different than they actually do. num() in Unicode::UCD can be used to safely calculate the value, returning undef if the input string contains such a mixture.

What \p{Digit} means (and hence \d except under the /a modifier) is \p{General_Category=Decimal_Number} , or synonymously, \p{General_Category=Digit} . Starting with Unicode version 4.1, this is the same set of characters matched by \p{Numeric_Type=Decimal} . But Unicode also has a different property with a similar name, \p{Numeric_Type=Digit} , which matches a completely different set of characters. These characters are things such as CIRCLED DIGIT ONE or subscripts, or are from writing systems that lack all ten digits.

The design intent is for \d to exactly match the set of characters that can safely be used with "normal" big-endian positional decimal syntax, where, for example 123 means one 'hundred', plus two 'tens', plus three 'ones'. This positional notation does not necessarily apply to characters that match the other type of "digit", \p{Numeric_Type=Digit} , and so \d doesn't match them.

The Tamil digits (U+0BE6 - U+0BEF) can also legally be used in old-style Tamil numbers in which they would appear no more than one in a row, separated by characters that mean "times 10", "times 100", etc. (See http://www.unicode.org/notes/tn21.)

Any character not matched by \d is matched by \D .
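
The effect of the /a modifier on \d can be seen with the Bengali digit mentioned above. This is a minimal sketch; the variable names are assumptions, and the strings are built from code point escapes so that no source-encoding pragma is needed.

```perl
use strict;
use warnings;

my $bengali_four = "\x{09EA}";          # BENGALI DIGIT FOUR
my $mixed        = "1\x{09EA}";         # ASCII digit + Bengali digit

# Without /a, \d accepts decimal digits from any script.
my $default_one   = $bengali_four =~ /^\d$/  ? 1 : 0;
my $default_mixed = $mixed        =~ /^\d+$/ ? 1 : 0;

# With /a, only [0-9] is accepted.
my $ascii_one   = $bengali_four =~ /^\d$/a  ? 1 : 0;
my $ascii_mixed = $mixed        =~ /^\d+$/a ? 1 : 0;

# CIRCLED DIGIT ONE has Numeric_Type=Digit, which \d does not match.
my $circled = "\x{2460}" =~ /^\d$/ ? 1 : 0;

print "default_one=$default_one default_mixed=$default_mixed ",
      "ascii_one=$ascii_one ascii_mixed=$ascii_mixed circled=$circled\n";
```

An application that expects only ASCII digits may therefore want /a, or an explicit [0-9], when validating input.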

Word characters

A \w matches a single alphanumeric character (an alphabetic character, or a decimal digit); or a connecting punctuation character, such as an underscore ("_"); or a "mark" character (like some sort of accent) that attaches to one of those. It does not match a whole word. To match a whole word, use \w+ . This isn't the same thing as matching an English word, but in the ASCII range it is the same as a string of Perl-identifier characters.

  • If the /a modifier is in effect ...

    \w matches the 63 characters [a-zA-Z0-9_].

  • otherwise ...
    • For code points above 255 ...

      \w matches the same as \p{Word} matches in this range. That is, it matches Thai letters, Greek letters, etc. This includes connector punctuation (like the underscore) which connect two words together, or diacritics, such as a COMBINING TILDE and the modifier letters, which are generally used to add auxiliary markings to letters.

    • For code points below 256 ...
      • if locale rules are in effect ...

        \w matches the platform's native underscore character plus whatever the locale considers to be alphanumeric.

      • if Unicode rules are in effect ...

        \w matches exactly what \p{Word} matches.

      • otherwise ...

        \w matches [a-zA-Z0-9_].

Which rules apply are determined as described in Which character set modifier is in effect? in perlre.

There are a number of security issues with the full Unicode list of word characters. See http://unicode.org/reports/tr36.

Also, for a somewhat finer-grained set of characters that are in programming language identifiers beyond the ASCII range, you may wish to instead use the more customized Unicode Properties, \p{ID_Start} , \p{ID_Continue} , \p{XID_Start} , and \p{XID_Continue} . See http://unicode.org/reports/tr31.

Any character not matched by \w is matched by \W .
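
The rules listed above for \w can be illustrated with one character below 256 and one above 255. This sketch assumes that no locale and no unicode_strings feature is in effect, and that the test string for U+00E9 is not stored in UTF-8 format; the variable names are hypothetical.

```perl
use strict;
use warnings;

my $e_acute = "\x{E9}";     # LATIN SMALL LETTER E WITH ACUTE, below 256
my $alpha   = "\x{3B1}";    # GREEK SMALL LETTER ALPHA, above 255

# Below 256 with neither locale nor Unicode rules: only [a-zA-Z0-9_].
my $native  = $e_acute =~ /^\w$/  ? 1 : 0;

# /u selects Unicode rules, /a restricts to ASCII.
my $unicode = $e_acute =~ /^\w$/u ? 1 : 0;
my $ascii   = $e_acute =~ /^\w$/a ? 1 : 0;

# Above 255, \w follows \p{Word} unless /a is in effect.
my $alpha_default = $alpha =~ /^\w$/  ? 1 : 0;
my $alpha_ascii   = $alpha =~ /^\w$/a ? 1 : 0;

# Under /a, exactly 63 characters match.
my @ascii_word = grep { chr($_) =~ /^\w$/a } 0 .. 127;

print "native=$native unicode=$unicode ascii=$ascii ",
      "alpha_default=$alpha_default alpha_ascii=$alpha_ascii ",
      "ascii_word=", scalar(@ascii_word), "\n";
```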

Whitespace

\s matches any single character considered whitespace.

  • If the /a modifier is in effect ...

    In all Perl versions, \s matches the 5 characters [\t\n\f\r ]; that is, the horizontal tab, the newline, the form feed, the carriage return, and the space. Starting in Perl v5.18, experimentally, it also matches the vertical tab, \cK . See note [1] below for a discussion of this.

  • otherwise ...
    • For code points above 255 ...

      \s matches exactly the code points above 255 shown with an "s" column in the table below.

    • For code points below 256 ...
      • if locale rules are in effect ...

        \s matches whatever the locale considers to be whitespace.

      • if Unicode rules are in effect ...

        \s matches exactly the characters shown with an "s" column in the table below.

      • otherwise ...

        \s matches [\t\n\f\r ] and, starting experimentally in Perl v5.18, the vertical tab, \cK. (See note [1] below for a discussion of this.) Note that this list doesn't include the non-breaking space.

Which rules apply are determined as described in Which character set modifier is in effect? in perlre.

Any character not matched by \s is matched by \S .

\h matches any character considered horizontal whitespace; this includes the platform's space and tab characters and several others listed in the table below. \H matches any character not considered horizontal whitespace. They use the platform's native character set, and do not consider any locale that may otherwise be in use.

\v matches any character considered vertical whitespace; this includes the platform's carriage return and line feed characters (newline) plus several other characters, all listed in the table below. \V matches any character not considered vertical whitespace. They use the platform's native character set, and do not consider any locale that may otherwise be in use.

\R matches anything that can be considered a newline under Unicode rules. It's not a character class, as it can match a multi-character sequence. Therefore, it cannot be used inside a bracketed character class; use \v instead (vertical whitespace). It uses the platform's native character set, and does not consider any locale that may otherwise be in use. Details are discussed in perlrebackslash.

Note that unlike \s (and \d and \w ), \h and \v always match the same characters, without regard to other factors, such as the active locale or whether the source string is in UTF-8 format.

One might think that \s is equivalent to [\h\v] . This is indeed true starting in Perl v5.18, but prior to that, the sole difference was that the vertical tab ("\cK" ) was not matched by \s.

The following table is a complete listing of characters matched by \s, \h and \v as of Unicode 6.0.

The first column gives the Unicode code point of the character (in hex format), the second column gives the (Unicode) name. The third column indicates by which class(es) the character is matched (assuming no locale is in effect that changes the \s matching).

  0x0009  CHARACTER TABULATION       h s
  0x000a  LINE FEED (LF)              vs
  0x000b  LINE TABULATION             vs  [1]
  0x000c  FORM FEED (FF)              vs
  0x000d  CARRIAGE RETURN (CR)        vs
  0x0020  SPACE                      h s
  0x0085  NEXT LINE (NEL)             vs  [2]
  0x00a0  NO-BREAK SPACE             h s  [2]
  0x1680  OGHAM SPACE MARK           h s
  0x180e  MONGOLIAN VOWEL SEPARATOR  h s
  0x2000  EN QUAD                    h s
  0x2001  EM QUAD                    h s
  0x2002  EN SPACE                   h s
  0x2003  EM SPACE                   h s
  0x2004  THREE-PER-EM SPACE         h s
  0x2005  FOUR-PER-EM SPACE          h s
  0x2006  SIX-PER-EM SPACE           h s
  0x2007  FIGURE SPACE               h s
  0x2008  PUNCTUATION SPACE          h s
  0x2009  THIN SPACE                 h s
  0x200a  HAIR SPACE                 h s
  0x2028  LINE SEPARATOR              vs
  0x2029  PARAGRAPH SEPARATOR         vs
  0x202f  NARROW NO-BREAK SPACE      h s
  0x205f  MEDIUM MATHEMATICAL SPACE  h s
  0x3000  IDEOGRAPHIC SPACE          h s
  • [1]

    Prior to Perl v5.18, \s did not match the vertical tab. The change in v5.18 is considered an experiment, which means it could be backed out in v5.20 or v5.22 if experience indicates that it breaks too much existing code. If this change adversely affects you, send email to perlbug@perl.org ; if it affects you positively, email perlthanks@perl.org . In the meantime, [^\S\cK] (obscurely) matches what \s traditionally did.

  • [2]

    NEXT LINE and NO-BREAK SPACE may or may not match \s depending on the rules in effect. See the beginning of this section.
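
A few rows of the table can be reproduced with a short script. This sketch deliberately avoids the code points marked [1] and [2] when testing \s, since their behavior depends on the Perl version and the rules in effect; the hash name and the choice of code points are assumptions.

```perl
use strict;
use warnings;

# Build the "h", "v", "s" flags for a handful of code points.
my %class;
for my $cp (0x09, 0x0A, 0x20, 0x2028, 0x3000) {
    my $ch = chr $cp;
    $class{$cp} = join "",
        ($ch =~ /\h/ ? "h" : ""),
        ($ch =~ /\v/ ? "v" : ""),
        ($ch =~ /\s/ ? "s" : "");
}

# \h always matches the same set, so NO-BREAK SPACE matches it
# regardless of the rules in effect.
my $nbsp_h = chr(0xA0) =~ /\h/ ? 1 : 0;

printf "0x%04x %s\n", $_, $class{$_} for sort { $a <=> $b } keys %class;
print "nbsp_h=$nbsp_h\n";
```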

Unicode Properties

\pP and \p{Prop} are character classes to match characters that fit given Unicode properties. One letter property names can be used in the \pP form, with the property name following the \p , otherwise, braces are required. When using braces, there is a single form, which is just the property name enclosed in the braces, and a compound form which looks like \p{name=value} , which means to match if the property "name" for the character has that particular "value". For instance, a match for a number can be written as /\pN/ or as /\p{Number}/ , or as /\p{Number=True}/ . Lowercase letters are matched by the property Lowercase_Letter which has the short form Ll. They need the braces, so are written as /\p{Ll}/ or /\p{Lowercase_Letter}/ , or /\p{General_Category=Lowercase_Letter}/ (the underscores are optional). /\pLl/ is valid, but means something different. It matches a two character string: a letter (Unicode property \pL ), followed by a lowercase l .

If locale rules are not in effect, the use of a Unicode property will force the regular expression into using Unicode rules, if it isn't already.

Note that almost all properties are immune to case-insensitive matching. That is, adding a /i regular expression modifier does not change what they match. There are two sets that are affected. The first set is Uppercase_Letter, Lowercase_Letter, and Titlecase_Letter, all of which match Cased_Letter under /i matching. The second set is Uppercase, Lowercase, and Titlecase, all of which match Cased under /i matching. (The difference between these sets is that some things, such as Roman numerals, come in both upper and lower case, so they are Cased, but aren't considered to be letters, so they aren't Cased_Letters. They're actually Letter_Numbers.) The second set also includes its subsets PosixUpper and PosixLower, both of which under /i match PosixAlpha.

For more details on Unicode properties, see Unicode Character Properties in perlunicode; for a complete list of possible properties, see Properties accessible through \p{} and \P{} in perluniprops, which notes all forms that have /i differences. It is also possible to define your own properties. This is discussed in User-Defined Character Properties in perlunicode.
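
The single-letter, braced, and compound forms described above, along with the /i exception for cased letters, can be sketched as follows. The test strings and variable names are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Single-letter and braced forms of the same property.
my $short_n = "7" =~ /^\pN$/        ? 1 : 0;
my $long_n  = "7" =~ /^\p{Number}$/ ? 1 : 0;

# Three spellings of "lowercase letter".
my $ll       = "a" =~ /^\p{Ll}$/                                ? 1 : 0;
my $ll_long  = "a" =~ /^\p{Lowercase_Letter}$/                  ? 1 : 0;
my $ll_compo = "a" =~ /^\p{General_Category=Lowercase_Letter}$/ ? 1 : 0;

# \pLl is \pL followed by a literal "l": two characters.
my $pl_l_two = "Al" =~ /^\pLl$/ ? 1 : 0;
my $pl_l_one = "a"  =~ /^\pLl$/ ? 1 : 0;

# Uppercase_Letter widens to Cased_Letter under /i.
my $lu   = "a" =~ /^\p{Lu}$/  ? 1 : 0;
my $lu_i = "a" =~ /^\p{Lu}$/i ? 1 : 0;

print "short_n=$short_n long_n=$long_n ll=$ll ll_long=$ll_long ",
      "ll_compo=$ll_compo pl_l_two=$pl_l_two pl_l_one=$pl_l_one ",
      "lu=$lu lu_i=$lu_i\n";
```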

Unicode properties are defined (surprise!) only on Unicode code points. A warning is raised and all matches fail on non-Unicode code points (those above the legal Unicode maximum of 0x10FFFF). This can be somewhat surprising:

  chr(0x110000) =~ /\p{ASCII_Hex_Digit=True}/   # Fails.
  chr(0x110000) =~ /\p{ASCII_Hex_Digit=False}/  # Also fails!

Even though these two matches might be thought of as complements, they are so only on Unicode code points.

Examples

  "a" =~ /\w/       # Match, "a" is a 'word' character.
  "7" =~ /\w/       # Match, "7" is a 'word' character as well.
  "a" =~ /\d/       # No match, "a" isn't a digit.
  "7" =~ /\d/       # Match, "7" is a digit.
  " " =~ /\s/       # Match, a space is whitespace.
  "a" =~ /\D/       # Match, "a" is a non-digit.
  "7" =~ /\D/       # No match, "7" is not a non-digit.
  " " =~ /\S/       # No match, a space is not non-whitespace.

  " "  =~ /\h/      # Match, space is horizontal whitespace.
  " "  =~ /\v/      # No match, space is not vertical whitespace.
  "\r" =~ /\v/      # Match, a return is vertical whitespace.

  "a" =~ /\pL/      # Match, "a" is a letter.
  "a" =~ /\p{Lu}/   # No match, /\p{Lu}/ matches upper case letters.

  "\x{0e0b}" =~ /\p{Thai}/  # Match, \x{0e0b} is the character
                            # 'THAI CHARACTER SO SO', and that's in
                            # Thai Unicode class.
  "a" =~ /\P{Lao}/          # Match, as "a" is not a Laotian character.

It is worth emphasizing that \d , \w , etc, match single characters, not complete numbers or words. To match a number (that consists of digits), use \d+ ; to match a word, use \w+ . But be aware of the security considerations in doing so, as mentioned above.

Bracketed Character Classes

The third form of character class you can use in Perl regular expressions is the bracketed character class. In its simplest form, it lists the characters that may be matched, surrounded by square brackets, like this: [aeiou] . This matches one of a , e , i , o or u . Like the other character classes, exactly one character is matched.* To match a longer string consisting of characters mentioned in the character class, follow the character class with a quantifier. For instance, [aeiou]+ matches one or more lowercase English vowels.

Repeating a character in a character class has no effect; it's considered to be in the set only once.

Examples:

  "e"  =~ /[aeiou]/     # Match, as "e" is listed in the class.
  "p"  =~ /[aeiou]/     # No match, "p" is not listed in the class.
  "ae" =~ /^[aeiou]$/   # No match, a character class only matches
                        # a single character.
  "ae" =~ /^[aeiou]+$/  # Match, due to the quantifier.

  -------

* There is an exception to a bracketed character class matching a single character only. When the class is to match caselessly under /i matching rules, and a character that is explicitly mentioned inside the class matches a multiple-character sequence caselessly under Unicode rules, the class (when not inverted) will also match that sequence. For example, Unicode says that the letter LATIN SMALL LETTER SHARP S should match the sequence ss under /i rules. Thus,

  'ss' =~ /\A\N{LATIN SMALL LETTER SHARP S}\z/i            # Matches
  'ss' =~ /\A[aeioust\N{LATIN SMALL LETTER SHARP S}]\z/i   # Matches

For this to happen, the character must be explicitly specified, and not be part of a multi-character range (not even as one of its endpoints). (Character Ranges will be explained shortly.) Therefore,

  'ss' =~ /\A[\0-\x{ff}]\z/i                         # Doesn't match
  'ss' =~ /\A[\0-\N{LATIN SMALL LETTER SHARP S}]\z/i # No match
  'ss' =~ /\A[\xDF-\xDF]\z/i   # Matches on ASCII platforms, since \xDF
                               # is LATIN SMALL LETTER SHARP S, and the
                               # range is just a single element

Note that it isn't a good idea to specify these types of ranges anyway.

Special Characters Inside a Bracketed Character Class

Most characters that are meta characters in regular expressions (that is, characters that carry a special meaning like ., * , or () lose their special meaning and can be used inside a character class without the need to escape them. For instance, [()] matches either an opening parenthesis, or a closing parenthesis, and the parens inside the character class don't group or capture.

Characters that may carry a special meaning inside a character class are: \ , ^, - , [ and ], and are discussed below. They can be escaped with a backslash, although this is sometimes not needed, in which case the backslash may be omitted.

The sequence \b is special inside a bracketed character class. While outside the character class, \b is an assertion indicating a point that does not have either two word characters or two non-word characters on either side, inside a bracketed character class, \b matches a backspace character.

The sequences \a , \c , \e , \f , \n , \N{NAME}, \N{U+hex char}, \r , \t , and \x are also special and have the same meanings as they do outside a bracketed character class. (However, inside a bracketed character class, if \N{NAME} expands to a sequence of characters, only the first one in the sequence is used, with a warning.)

Also, a backslash followed by two or three octal digits is considered an octal number.

A [ is not special inside a character class, unless it's the start of a POSIX character class (see POSIX Character Classes below). It normally does not need escaping.

A ] is normally either the end of a POSIX character class (see POSIX Character Classes below), or it signals the end of the bracketed character class. If you want to include a ] in the set of characters, you must generally escape it.

However, if the ] is the first (or the second if the first character is a caret) character of a bracketed character class, it does not denote the end of the class (as you cannot have an empty class) and is considered part of the set of characters that can be matched without escaping.

Examples:

  "+"   =~ /[+?*]/   # Match, "+" in a character class is not special.
  "\cH" =~ /[\b]/    # Match, \b inside a character class
                     # is equivalent to a backspace.
  "]"   =~ /[][]/    # Match, as the character class contains
                     # both [ and ].
  "[]"  =~ /[[]]/    # Match, the pattern contains a character class
                     # containing just [, and the character class is
                     # followed by a ].

Character Ranges

It is not uncommon to want to match a range of characters. Luckily, instead of listing all characters in the range, one may use the hyphen (- ). If inside a bracketed character class you have two characters separated by a hyphen, it's treated as if all characters between the two were in the class. For instance, [0-9] matches any ASCII digit, and [a-m] matches any lowercase letter from the first half of the ASCII alphabet.

Note that the two characters on either side of the hyphen are not necessarily both letters or both digits. Any character is possible, although not advisable. ['-?] contains a range of characters, but most people will not know which characters that means. Furthermore, such ranges may lead to portability problems if the code has to run on a platform that uses a different character set, such as EBCDIC.

If a hyphen in a character class cannot syntactically be part of a range, for instance because it is the first or the last character of the character class, or if it immediately follows a range, the hyphen isn't special, and so is considered a character to be matched literally. If you want a hyphen in your set of characters to be matched and its position in the class is such that it could be considered part of a range, you must escape that hyphen with a backslash.

Examples:

    [a-z]       # Matches a character that is a lower case ASCII letter.
    [a-fz]      # Matches any letter between 'a' and 'f' (inclusive) or
                # the letter 'z'.
    [-z]        # Matches either a hyphen ('-') or the letter 'z'.
    [a-f-m]     # Matches any letter between 'a' and 'f' (inclusive), the
                # hyphen ('-'), or the letter 'm'.
    ['-?]       # Matches any of the characters '()*+,-./0123456789:;<=>?
                # (But not on an EBCDIC platform).
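The hyphen rules above can be checked with a small script. This is only a sketch: the test strings and the @cases and @results names are chosen here for illustration, and the patterns are anchored so that each one tests exactly one character.

```perl
use strict;
use warnings;

# Each entry: [ description, pattern, test string, expected result ]
my @cases = (
    [ 'leading hyphen is literal',   qr/^[-z]$/,    '-', 1 ],
    [ 'hyphen after a range',        qr/^[a-f-m]$/, '-', 1 ],
    [ 'no implied range f..m',       qr/^[a-f-m]$/, 'h', 0 ],
    [ 'escaped hyphen is literal',   qr/^[a\-f]$/,  'c', 0 ],
    [ 'unescaped hyphen is a range', qr/^[a-f]$/,   'c', 1 ],
);

my @results;
for my $case (@cases) {
    my ($desc, $re, $str, $expect) = @$case;
    my $got = $str =~ $re ? 1 : 0;
    push @results, $got;
    printf "%-30s %s\n", $desc, $got == $expect ? 'ok' : 'NOT ok';
}
```

Escaping the hyphen, as in [a\-f], is the form that does not depend on where the hyphen sits in the class.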

Negation

It is also possible to instead list the characters you do not want to match. You can do so by using a caret (^) as the first character in the character class. For instance, [^a-z] matches any character that is not a lowercase ASCII letter, which therefore includes more than a million Unicode code points. The class is said to be "negated" or "inverted".

This syntax makes the caret a special character inside a bracketed character class, but only if it is the first character of the class. So if you want the caret as one of the characters to match, either escape the caret or else don't list it first.

In inverted bracketed character classes, Perl ignores the Unicode rules that normally say that certain characters should match a sequence of multiple characters under caseless /i matching. Following those rules could lead to highly confusing situations:

    "ss" =~ /^[^\xDF]+$/ui; # Matches!

This should match any sequence of characters that are neither \xDF nor what \xDF matches under /i. "s" isn't \xDF, but Unicode says that "ss" is what \xDF matches under /i. So which one "wins"? Do you fail the match because the string has ss, or accept it because it has an s followed by another s? Perl has chosen the latter.

Examples:

    "e" =~ /[^aeiou]/   # No match, the 'e' is listed.
    "x" =~ /[^aeiou]/   # Match, as 'x' isn't a lowercase vowel.
    "^" =~ /[^^]/       # No match, matches anything that isn't a caret.
    "^" =~ /[x^]/       # Match, caret is not special here.

Backslash Sequences

You can put any backslash sequence character class (with the exception of \N and \R ) inside a bracketed character class, and it will act just as if you had put all characters matched by the backslash sequence inside the character class. For instance, [a-f\d] matches any decimal digit, or any of the lowercase letters between 'a' and 'f' inclusive.

\N within a bracketed character class must be of the forms \N{name} or \N{U+hex char}, and NOT be the form that matches non-newlines, for the same reason that a dot . inside a bracketed character class loses its special meaning: it matches nearly anything, which generally isn't what you want to happen.

Examples:

    /[\p{Thai}\d]/      # Matches a character that is either a Thai
                        # character, or a digit.
    /[^\p{Arabic}()]/   # Matches a character that is neither an Arabic
                        # character, nor a parenthesis.

Backslash sequence character classes cannot form one of the endpoints of a range. Thus, you can't say:

    /[\p{Thai}-\d]/     # Wrong!

POSIX Character Classes

POSIX character classes have the form [:class:], where class is the name, and [: and :] are the delimiters. POSIX character classes only appear inside bracketed character classes, and are a convenient and descriptive way of listing a group of characters.

Be careful about the syntax:

    # Correct:
    $string =~ /[[:alpha:]]/

    # Incorrect (will warn):
    $string =~ /[:alpha:]/

The latter pattern would be a character class consisting of a colon, and the letters a , l , p and h . POSIX character classes can be part of a larger bracketed character class. For example,

    [01[:alpha:]%]

is valid and matches '0', '1', any alphabetic character, and the percent sign.
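The difference between the correct and incorrect forms can be seen in a short sketch. The variable names and test strings are made up for illustration; the regexp warning category is switched off only around the deliberately wrong pattern so that the script runs quietly.

```perl
use strict;
use warnings;

# Inside a bracketed class, [:alpha:] names the POSIX class.
my $posix = qr/^[01[:alpha:]%]+$/;

# Without the outer brackets the same text is an ordinary class listing
# the characters ':', 'a', 'l', 'p' and 'h'. Perl warns about this, so
# the warning is disabled for this one deliberate mistake.
my $not_posix = do { no warnings 'regexp'; qr/^[:alpha:]+$/ };

my @checks = (
    ('x10%'  =~ $posix     ? 1 : 0),   # letters, 0, 1 and % are all listed
    ('x2'    =~ $posix     ? 1 : 0),   # '2' is not in the class
    ('hal:p' =~ $not_posix ? 1 : 0),   # only ':', a, l, p, h appear
    ('z'     =~ $not_posix ? 1 : 0),   # 'z' is not one of those five
);
print "@checks\n";
```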

Perl recognizes the following POSIX character classes:

    alpha   Any alphabetical character ("[A-Za-z]").
    alnum   Any alphanumeric character ("[A-Za-z0-9]").
    ascii   Any character in the ASCII character set.
    blank   A GNU extension, equal to a space or a horizontal tab ("\t").
    cntrl   Any control character.  See Note [2] below.
    digit   Any decimal digit ("[0-9]"), equivalent to "\d".
    graph   Any printable character, excluding a space.  See Note [3] below.
    lower   Any lowercase character ("[a-z]").
    print   Any printable character, including a space.  See Note [4] below.
    punct   Any graphical character excluding "word" characters.  Note [5].
    space   Any whitespace character.  "\s" including the vertical tab
            ("\cK").
    upper   Any uppercase character ("[A-Z]").
    word    A Perl extension ("[A-Za-z0-9_]"), equivalent to "\w".
    xdigit  Any hexadecimal digit ("[0-9a-fA-F]").

Most POSIX character classes have two Unicode-style \p property counterparts. (They are not official Unicode properties, but Perl extensions derived from official Unicode properties.) The table below shows the relation between POSIX character classes and these counterparts.

One counterpart, in the column labelled "ASCII-range Unicode" in the table, matches only characters in the ASCII character set.

The other counterpart, in the column labelled "Full-range Unicode", matches any appropriate characters in the full Unicode character set. For example, \p{Alpha} matches not just the ASCII alphabetic characters, but any character in the entire Unicode character set considered alphabetic. An entry in the column labelled "backslash sequence" is a (short) equivalent.

    [[:...:]]   ASCII-range       Full-range          backslash   Note
                Unicode           Unicode             sequence
    ------------------------------------------------------------------
    alpha       \p{PosixAlpha}    \p{XPosixAlpha}
    alnum       \p{PosixAlnum}    \p{XPosixAlnum}
    ascii       \p{ASCII}
    blank       \p{PosixBlank}    \p{XPosixBlank}     \h          [1]
                                  or \p{HorizSpace}               [1]
    cntrl       \p{PosixCntrl}    \p{XPosixCntrl}                 [2]
    digit       \p{PosixDigit}    \p{XPosixDigit}     \d
    graph       \p{PosixGraph}    \p{XPosixGraph}                 [3]
    lower       \p{PosixLower}    \p{XPosixLower}
    print       \p{PosixPrint}    \p{XPosixPrint}                 [4]
    punct       \p{PosixPunct}    \p{XPosixPunct}                 [5]
                \p{PerlSpace}     \p{XPerlSpace}      \s          [6]
    space       \p{PosixSpace}    \p{XPosixSpace}                 [6]
    upper       \p{PosixUpper}    \p{XPosixUpper}
    word        \p{PosixWord}     \p{XPosixWord}      \w
    xdigit      \p{PosixXDigit}   \p{XPosixXDigit}
  • [1]

    \p{Blank} and \p{HorizSpace} are synonyms.

  • [2]

    Control characters don't produce output as such, but instead usually control the terminal somehow: for example, newline and backspace are control characters. In the ASCII range, characters whose code points are between 0 and 31 inclusive, plus 127 (DEL ) are control characters.

  • [3]

    Any character that is graphical, that is, visible. This class consists of all alphanumeric characters and all punctuation characters.

  • [4]

    All printable characters, which is the set of all graphical characters plus those whitespace characters which are not also controls.

  • [5]

    \p{PosixPunct} and [[:punct:]] in the ASCII range match all non-controls, non-alphanumeric, non-space characters: [-!"#$%&'()*+,./:;<=>?@[\\\]^_`{|}~] (although if a locale is in effect, it could alter the behavior of [[:punct:]]).

    The similarly named property, \p{Punct} , matches a somewhat different set in the ASCII range, namely [-!"#%&'()*,./:;?@[\\\]_{}] . That is, it is missing the nine characters [$+<=>^`|~] . This is because Unicode splits what POSIX considers to be punctuation into two categories, Punctuation and Symbols.

    \p{XPosixPunct} and (under Unicode rules) [[:punct:]], match what \p{PosixPunct} matches in the ASCII range, plus what \p{Punct} matches. This is different than strictly matching according to \p{Punct} . Another way to say it is that if Unicode rules are in effect, [[:punct:]] matches all characters that Unicode considers punctuation, plus all ASCII-range characters that Unicode considers symbols.

  • [6]

    \p{SpacePerl} and \p{Space} match identically starting with Perl v5.18. In earlier versions, these differ only in that in non-locale matching, \p{SpacePerl} does not match the vertical tab, \cK . Same for the two ASCII-only range forms.

There are various other synonyms that can be used besides the names listed in the table. For example, \p{PosixAlpha} can be written as \p{Alpha} . All are listed in Properties accessible through \p{} and \P{} in perluniprops, plus all characters matched by each ASCII-range property.

Both the \p counterparts always assume Unicode rules are in effect. On ASCII platforms, this means they assume that the code points from 128 to 255 are Latin-1, and that means that using them under locale rules is unwise unless the locale is guaranteed to be Latin-1 or UTF-8. In contrast, the POSIX character classes are useful under locale rules. They are affected by the actual rules in effect, as follows:

  • If the /a modifier is in effect ...

    Each of the POSIX classes matches exactly the same as their ASCII-range counterparts.

  • otherwise ...
    • For code points above 255 ...

      The POSIX class matches the same as its Full-range counterpart.

    • For code points below 256 ...
      • if locale rules are in effect ...

        The POSIX class matches according to the locale, except that word uses the platform's native underscore character, no matter what the locale is.

      • if Unicode rules are in effect ...

        The POSIX class matches the same as the Full-range counterpart.

      • otherwise ...

        The POSIX class matches the same as the ASCII range counterpart.

Which rules apply are determined as described in Which character set modifier is in effect? in perlre.
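As a rough illustration of the first two branches of the rules above, the sketch below tests one digit whose code point is above 255. It assumes Perl 5.14 or later (for the /a modifier) and that no locale is in effect; the variable names are invented for the example.

```perl
use strict;
use warnings;

# U+0663 ARABIC-INDIC DIGIT THREE: a decimal digit above code point 255.
my $arabic_three = "\x{0663}";

my $default = $arabic_three =~ /^[[:digit:]]$/  ? 1 : 0;  # Full-range rules
my $ascii   = $arabic_three =~ /^[[:digit:]]$/a ? 1 : 0;  # ASCII-range rules
my $plain   = "3"           =~ /^[[:digit:]]$/a ? 1 : 0;  # ASCII digit

print "default=$default ascii=$ascii plain=$plain\n";
```

Without /a the class behaves like \p{XPosixDigit} for this code point; with /a it behaves like \p{PosixDigit}.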

It is proposed to change this behavior in a future release of Perl so that whether or not Unicode rules are in effect would not change the behavior: Outside of locale, the POSIX classes would behave like their ASCII-range counterparts. If you wish to comment on this proposal, send email to perl5-porters@perl.org .

Negation of POSIX character classes

A Perl extension to the POSIX character class is the ability to negate it. This is done by prefixing the class name with a caret (^). Some examples:

    POSIX          ASCII-range       Full-range        backslash
                   Unicode           Unicode           sequence
    ------------------------------------------------------------
    [[:^digit:]]   \P{PosixDigit}    \P{XPosixDigit}   \D
    [[:^space:]]   \P{PosixSpace}    \P{XPosixSpace}
                   \P{PerlSpace}     \P{XPerlSpace}    \S
    [[:^word:]]    \P{PerlWord}      \P{XPosixWord}    \W

The backslash sequence can mean either ASCII- or Full-range Unicode, depending on various factors as described in Which character set modifier is in effect? in perlre.

[= =] and [. .]

Perl recognizes the POSIX character classes [=class=] and [.class.], but does not (yet?) support them. Any attempt to use either construct raises an exception.

Examples

    /[[:digit:]]/             # Matches a character that is a digit.
    /[01[:lower:]]/           # Matches a character that is either a
                              # lowercase letter, or '0' or '1'.
    /[[:digit:][:^xdigit:]]/  # Matches a character that can be anything
                              # except the letters 'a' to 'f' and 'A' to
                              # 'F'. This is because the main character
                              # class is composed of two POSIX character
                              # classes that are ORed together, one that
                              # matches any digit, and the other that
                              # matches anything that isn't a hex digit.
                              # The OR adds the digits, leaving only the
                              # letters 'a' to 'f' and 'A' to 'F' excluded.
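The last example can be verified directly. This is a minimal sketch; the five sample characters and the %got hash are chosen here only to cover each case.

```perl
use strict;
use warnings;

my $re = qr/^[[:digit:][:^xdigit:]]$/;

# Digits match via [:digit:]; anything that is not a hex digit matches
# via [:^xdigit:]; only the hex letters a-f and A-F are left out.
my %got = map { $_ => ($_ =~ $re ? 1 : 0) } qw(7 g Z a F);
print join(' ', map { "$_=$got{$_}" } sort keys %got), "\n";
```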

Extended Bracketed Character Classes

This is a fancy bracketed character class that can be used for more readable and less error-prone classes, and to perform set operations, such as intersection. An example is

    /(?[ \p{Thai} & \p{Digit} ])/

This will match all the digit characters that are in the Thai script.

This is an experimental feature available starting in 5.18, and is subject to change as we gain field experience with it. Any attempt to use it will raise a warning, unless disabled via

    no warnings "experimental::regex_sets";

Comments on this feature are welcome; send email to perl5-porters@perl.org .

We can extend the example above:

    /(?[ ( \p{Thai} + \p{Lao} ) & \p{Digit} ])/

This matches digits that are in either the Thai or Laotian scripts.

Notice the white space in these examples. This construct always has the /x modifier turned on.

The available binary operators are:

    &  intersection
    +  union
    |  another name for '+', hence means union
    -  subtraction (the result matches the set consisting of those
       code points matched by the first operand, excluding any that
       are also matched by the second operand)
    ^  symmetric difference (the union minus the intersection). This
       is like an exclusive or, in that the result is the set of code
       points that are matched by either, but not both, of the
       operands.

There is one unary operator:

    !  complement

All the binary operators left associate, and are of equal precedence. The unary operator right associates, and has higher precedence. Use parentheses to override the default associations. Some feedback we've received indicates a desire for intersection to have higher precedence than union. This is something that feedback from the field may cause us to change in future releases; you may want to parenthesize copiously to avoid such changes affecting your code, until this feature is no longer considered experimental.
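The operators can be tried out on small ASCII sets. The sketch below assumes Perl 5.18 or later; the %set, @probe, and %hits names are invented for the example. Each pattern uses a single operator, so the result does not depend on operator precedence.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # requires Perl 5.18 or later

my %set = (
    intersection => qr/^(?[ [a-m] & [h-z] ])$/,     # h through m
    union        => qr/^(?[ [a-c] + [x-z] ])$/,     # a-c or x-z
    subtraction  => qr/^(?[ [a-z] - [aeiou] ])$/,   # lowercase consonants
    symmetric    => qr/^(?[ [a-m] ^ [h-z] ])$/,     # a-g or n-z
    complement   => qr/^(?[ ! [a-z] ])$/,           # everything except a-z
);

# Record which of the probe characters each set matches.
my @probe = qw(c j q e 5);
my %hits;
for my $name (sort keys %set) {
    $hits{$name} = join ' ', grep { $_ =~ $set{$name} } @probe;
    print "$name: $hits{$name}\n";
}
```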

The main restriction is that everything is a metacharacter. Thus, you cannot refer to single characters by doing something like this:

    /(?[ a + b ])/ # Syntax error!

The easiest way to specify an individual typable character is to enclose it in brackets:

    /(?[ [a] + [b] ])/

(This is the same thing as [ab] .) You could also have said the equivalent:

    /(?[[ a b ]])/

(You can, of course, specify single characters by using \x{ }, \N{ }, etc.)

This last example shows the use of this construct to specify an ordinary bracketed character class without additional set operations. Note the white space within it; /x is turned on even within bracketed character classes, except you can't have comments inside them. Hence,

    (?[ [#] ])

matches the literal character "#". To specify a literal white space character, you can escape it with a backslash, like:

    /(?[ [ a e i o u \ ] ])/

This matches the English vowels plus the SPACE character. All the other escapes accepted by normal bracketed character classes are accepted here as well; but unrecognized escapes that generate warnings in normal classes are fatal errors here.

All warnings from these class elements are fatal, as well as some practices that don't currently warn. For example you cannot say

    /(?[ [ \xF ] ])/ # Syntax error!

You have to have two hex digits after a braceless \x (use a leading zero to make two). These restrictions are to lower the incidence of typos causing the class to not match what you thought it would.

The final difference between regular bracketed character classes and these is that it is not possible to get these to match a multi-character fold. Thus,

    /(?[ [\xDF] ])/iu

does not match the string ss .

You don't have to enclose POSIX class names inside double brackets, hence both of the following work:

    /(?[ [:word:] - [:lower:] ])/
    /(?[ [[:word:]] - [[:lower:]] ])/

Any contained POSIX character classes, including things like \w and \D, respect the /a (and /aa) modifiers.

(?[ ]) is a regex-compile-time construct. Any attempt to use something which isn't knowable at the time the containing regular expression is compiled is a fatal error. In practice, this means just three limitations:

1

This construct cannot be used within the scope of use locale (or the /l regex modifier).

2

Any user-defined property used must be already defined by the time the regular expression is compiled (but note that this construct can be used instead of such properties).

3

A regular expression that otherwise would compile using /d rules, and which uses this construct will instead use /u . Thus this construct tells Perl that you don't want /d rules for the entire regular expression containing it.

The /x processing within this class is an extended form. Besides the characters that are considered white space in normal /x processing, there are 5 others, recommended by the Unicode standard:

    U+0085 NEXT LINE
    U+200E LEFT-TO-RIGHT MARK
    U+200F RIGHT-TO-LEFT MARK
    U+2028 LINE SEPARATOR
    U+2029 PARAGRAPH SEPARATOR

Note that skipping white space applies only to the interior of this construct. There must not be any space between any of the characters that form the initial (?[ . Nor may there be space between the closing ]) characters.

Just as in all regular expressions, the pattern can be built up by including variables that are interpolated at regex compilation time. Care must be taken to ensure that you are getting what you expect. For example:

    my $thai_or_lao = '\p{Thai} + \p{Lao}';
    ...
    qr/(?[ \p{Digit} & $thai_or_lao ])/;

compiles to

    qr/(?[ \p{Digit} & \p{Thai} + \p{Lao} ])/;

But this does not have the effect that someone reading the code would likely expect: the intersection applies just to \p{Thai}, so the Laotian characters are added without being restricted to digits. Pitfalls like this can be avoided by parenthesizing the component pieces:

    my $thai_or_lao = '( \p{Thai} + \p{Lao} )';

But any modifiers will still apply to all the components:

    my $lower = '\p{Lower} + \p{Digit}';
    qr/(?[ \p{Greek} & $lower ])/i;

matches upper case things. You can avoid surprises by making the components into instances of this construct by compiling them:

    my $thai_or_lao = qr/(?[ \p{Thai} + \p{Lao} ])/;
    my $lower = qr/(?[ \p{Lower} + \p{Digit} ])/;

When these are embedded in another pattern, what they match does not change, regardless of parenthesization or what modifiers are in effect in that outer pattern.
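The precedence pitfall with interpolated strings can be made visible with three sample characters. This sketch assumes Perl 5.18 or later; the variable names and the choice of code points are illustrative only.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # requires Perl 5.18 or later

my $bare   = '\p{Thai} + \p{Lao}';        # no parentheses
my $parens = '( \p{Thai} + \p{Lao} )';    # grouped before interpolation

my $loose = qr/^(?[ \p{Digit} & $bare ])$/;
my $tight = qr/^(?[ \p{Digit} & $parens ])$/;

my $thai_digit = "\x{0E50}";   # THAI DIGIT ZERO
my $lao_digit  = "\x{0ED0}";   # LAO DIGIT ZERO
my $lao_letter = "\x{0E81}";   # LAO LETTER KO

# 1 means the character matched, 0 means it did not.
my @loose = map { $_ =~ $loose ? 1 : 0 } $thai_digit, $lao_digit, $lao_letter;
my @tight = map { $_ =~ $tight ? 1 : 0 } $thai_digit, $lao_digit, $lao_letter;
print "loose: @loose\n";
print "tight: @tight\n";
```

Only the unparenthesized form lets the Lao letter through, because there the intersection with \p{Digit} covers \p{Thai} alone.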

Due to the way that Perl parses things, your parentheses and brackets may need to be balanced, even including comments. If you run into any examples, please send them to perlbug@perl.org , so that we can have a concrete example for this man page.

We may change it so that things that remain legal uses in normal bracketed character classes might become illegal within this experimental construct. One proposal, for example, is to forbid adjacent uses of the same character, as in (?[ [aa] ]) . The motivation for such a change is that this usage is likely a typo, as the second "a" adds nothing.

 
perlref

NAME

perlref - Perl references and nested data structures

NOTE

This is complete documentation about all aspects of references. For a shorter, tutorial introduction to just the essential features, see perlreftut.

DESCRIPTION

Before release 5 of Perl it was difficult to represent complex data structures, because all references had to be symbolic--and even then it was difficult to refer to a variable instead of a symbol table entry. Perl now not only makes it easier to use symbolic references to variables, but also lets you have "hard" references to any piece of data or code. Any scalar may hold a hard reference. Because arrays and hashes contain scalars, you can now easily build arrays of arrays, arrays of hashes, hashes of arrays, arrays of hashes of functions, and so on.

Hard references are smart--they keep track of reference counts for you, automatically freeing the thing referred to when its reference count goes to zero. (Reference counts for values in self-referential or cyclic data structures may not go to zero without a little help; see Circular References for a detailed explanation.) If that thing happens to be an object, the object is destructed. See perlobj for more about objects. (In a sense, everything in Perl is an object, but we usually reserve the word for references to objects that have been officially "blessed" into a class package.)

Symbolic references are names of variables or other objects, just as a symbolic link in a Unix filesystem contains merely the name of a file. The *glob notation is something of a symbolic reference. (Symbolic references are sometimes called "soft references", but please don't call them that; references are confusing enough without useless synonyms.)

In contrast, hard references are more like hard links in a Unix file system: They are used to access an underlying object without concern for what its (other) name is. When the word "reference" is used without an adjective, as in the following paragraph, it is usually talking about a hard reference.

References are easy to use in Perl. There is just one overriding principle: in general, Perl does no implicit referencing or dereferencing. When a scalar is holding a reference, it always behaves as a simple scalar. It doesn't magically start being an array or hash or subroutine; you have to tell it explicitly to do so, by dereferencing it.

That said, be aware that Perl version 5.14 introduces an exception to the rule, for syntactic convenience. Experimental array and hash container function behavior allows array and hash references to be handled by Perl as if they had been explicitly syntactically dereferenced. See Syntactical Enhancements in perl5140delta and perlfunc for details.

Making References

References can be created in several ways.

1.

By using the backslash operator on a variable, subroutine, or value. (This works much like the & (address-of) operator in C.) This typically creates another reference to a variable, because there's already a reference to the variable in the symbol table. But the symbol table reference might go away, and you'll still have the reference that the backslash returned. Here are some examples:

    $scalarref = \$foo;
    $arrayref  = \@ARGV;
    $hashref   = \%ENV;
    $coderef   = \&handler;
    $globref   = \*foo;

It isn't possible to create a true reference to an IO handle (filehandle or dirhandle) using the backslash operator. The most you can get is a reference to a typeglob, which is actually a complete symbol table entry. But see the explanation of the *foo{THING} syntax below. However, you can still use type globs and globrefs as though they were IO handles.

2.

A reference to an anonymous array can be created using square brackets:

    $arrayref = [1, 2, ['a', 'b', 'c']];

Here we've created a reference to an anonymous array of three elements whose final element is itself a reference to another anonymous array of three elements. (The multidimensional syntax described later can be used to access this. For example, after the above, $arrayref->[2][1] would have the value "b".)

Taking a reference to an enumerated list is not the same as using square brackets--instead it's the same as creating a list of references!

    @list = (\$a, \@b, \%c);
    @list = \($a, @b, %c); # same thing!

As a special case, \(@foo) returns a list of references to the contents of @foo , not a reference to @foo itself. Likewise for %foo , except that the key references are to copies (since the keys are just strings rather than full-fledged scalars).
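The following sketch contrasts the anonymous array composer with the \(@foo) special case. The variable names and values are made up for the example.

```perl
use strict;
use warnings;

my $arrayref = [1, 2, ['a', 'b', 'c']];
my $inner    = $arrayref->[2][1];          # element 1 of the nested array

my @foo  = (10, 20, 30);
my @refs = \(@foo);                        # one scalar reference per element
${ $refs[1] } = 99;                        # writes through to $foo[1]

my $whole = \@foo;                         # a single reference to the array
print "inner=$inner count=", scalar(@refs), " foo=@foo\n";
print "whole is a ", ref($whole), " ref; elements are ", ref($refs[0]), " refs\n";
```

Because each element of @refs refers to the corresponding element of @foo itself, assigning through $refs[1] changes the original array.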

3.

A reference to an anonymous hash can be created using curly brackets:

    $hashref = {
        'Adam'  => 'Eve',
        'Clyde' => 'Bonnie',
    };

Anonymous hash and array composers like these can be intermixed freely to produce as complicated a structure as you want. The multidimensional syntax described below works for these too. The values above are literals, but variables and expressions would work just as well, because assignment operators in Perl (even within local() or my()) are executable statements, not compile-time declarations.

Because curly brackets (braces) are used for several other things including BLOCKs, you may occasionally have to disambiguate braces at the beginning of a statement by putting a + or a return in front so that Perl realizes the opening brace isn't starting a BLOCK. The economy and mnemonic value of using curlies is deemed worth this occasional extra hassle.

For example, if you wanted a function to make a new hash and return a reference to it, you have these options:

    sub hashem { { @_ } }         # silently wrong
    sub hashem { +{ @_ } }        # ok
    sub hashem { return { @_ } }  # ok

On the other hand, if you want the other meaning, you can do this:

    sub showem { { @_ } }         # ambiguous (currently ok, but may change)
    sub showem { {; @_ } }        # ok
    sub showem { { return @_ } }  # ok

The leading +{ and {; always serve to disambiguate the expression to mean either the HASH reference, or the BLOCK.
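A quick check of the two unambiguous hash-returning forms follows. The subroutine names hashem_plus and hashem_return are invented for this sketch so that both variants can coexist in one file.

```perl
use strict;
use warnings;

# The unary plus tells Perl the brace opens an anonymous hash, not a BLOCK.
sub hashem_plus   { +{ @_ } }
# An explicit return has the same effect.
sub hashem_return { return { @_ } }

my $h1 = hashem_plus(Adam => 'Eve');
my $h2 = hashem_return(Clyde => 'Bonnie');

print ref($h1), " $h1->{Adam}\n";
print ref($h2), " $h2->{Clyde}\n";
```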

4.

A reference to an anonymous subroutine can be created by using sub without a subname:

    $coderef = sub { print "Boink!\n" };

Note the semicolon. Except for the code inside not being immediately executed, a sub {} is not so much a declaration as it is an operator, like do{} or eval{}. (However, no matter how many times you execute that particular line (unless you're in an eval("...")), $coderef will still have a reference to the same anonymous subroutine.)

Anonymous subroutines act as closures with respect to my() variables, that is, variables lexically visible within the current scope. Closure is a notion out of the Lisp world that says if you define an anonymous function in a particular lexical context, it pretends to run in that context even when it's called outside the context.

In human terms, it's a funny way of passing arguments to a subroutine when you define it as well as when you call it. It's useful for setting up little bits of code to run later, such as callbacks. You can even do object-oriented stuff with it, though Perl already provides a different mechanism to do that--see perlobj.

You might also think of closure as a way to write a subroutine template without using eval(). Here's a small example of how closures work:

    sub newprint {
        my $x = shift;
        return sub { my $y = shift; print "$x, $y!\n"; };
    }
    $h = newprint("Howdy");
    $g = newprint("Greetings");

    # Time passes...

    &$h("world");
    &$g("earthlings");

This prints

    Howdy, world!
    Greetings, earthlings!

Note particularly that $x continues to refer to the value passed into newprint() despite "my $x" having gone out of scope by the time the anonymous subroutine runs. That's what a closure is all about.

This applies only to lexical variables, by the way. Dynamic variables continue to work as they have always worked. Closure is not something that most Perl programmers need trouble themselves about to begin with.
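As a further sketch of the same idea, each closure below keeps its own private copy of the lexical variables that were in scope when it was created. The make_counter subroutine and its variable names are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Returns a closure over its own $count and $step.
sub make_counter {
    my ($start, $step) = @_;
    my $count = $start;
    return sub { my $now = $count; $count += $step; return $now; };
}

my $by_one = make_counter(1, 1);
my $by_ten = make_counter(0, 10);

# The two counters advance independently of each other.
my @seen = (&$by_one(), &$by_one(), &$by_ten(), &$by_ten(), &$by_one());
print "@seen\n";
```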

5.

References are often returned by special subroutines called constructors. Perl objects are just references to a special type of object that happens to know which package it's associated with. Constructors are just special subroutines that know how to create that association. They do so by starting with an ordinary reference, and it remains an ordinary reference even while it's also being an object. Constructors are often named new() . You can call them indirectly:

    $objref = new Doggie( Tail => 'short', Ears => 'long' );

But that can produce ambiguous syntax in certain cases, so it's often better to use the direct method invocation approach:

    $objref = Doggie->new(Tail => 'short', Ears => 'long');

    use Term::Cap;
    $terminal = Term::Cap->Tgetent( { OSPEED => 9600 });

    use Tk;
    $main = MainWindow->new();
    $menubar = $main->Frame(-relief      => "raised",
                            -borderwidth => 2);

6.

References of the appropriate type can spring into existence if you dereference them in a context that assumes they exist. Because we haven't talked about dereferencing yet, we can't show you any examples yet.

7.

A reference can be created by using a special syntax, lovingly known as the *foo{THING} syntax. *foo{THING} returns a reference to the THING slot in *foo (which is the symbol table entry which holds everything known as foo).

    $scalarref = *foo{SCALAR};
    $arrayref  = *ARGV{ARRAY};
    $hashref   = *ENV{HASH};
    $coderef   = *handler{CODE};
    $ioref     = *STDIN{IO};
    $globref   = *foo{GLOB};
    $formatref = *foo{FORMAT};
    $globname  = *foo{NAME};    # "foo"
    $pkgname   = *foo{PACKAGE}; # "main"

Most of these are self-explanatory, but *foo{IO} deserves special attention. It returns the IO handle, used for file handles (open), sockets (socket and socketpair), and directory handles (opendir). For compatibility with previous versions of Perl, *foo{FILEHANDLE} is a synonym for *foo{IO} , though it is deprecated as of 5.8.0. If deprecation warnings are in effect, it will warn of its use.

*foo{THING} returns undef if that particular THING hasn't been used yet, except in the case of scalars. *foo{SCALAR} returns a reference to an anonymous scalar if $foo hasn't been used yet. This might change in a future release.

*foo{NAME} and *foo{PACKAGE} are the exception, in that they return strings, rather than references. These return the package and name of the typeglob itself, rather than one that has been assigned to it. So, after *foo=*Foo::bar , *foo will become "*Foo::bar" when used as a string, but *foo{PACKAGE} and *foo{NAME} will continue to produce "main" and "foo", respectively.

*foo{IO} is an alternative to the *HANDLE mechanism given in Typeglobs and Filehandles in perldata for passing filehandles into or out of subroutines, or storing into larger data structures. Its disadvantage is that it won't create a new filehandle for you. Its advantage is that you have less risk of clobbering more than you want to with a typeglob assignment. (It still conflates file and directory handles, though.) However, if you assign the incoming value to a scalar instead of a typeglob as we do in the examples below, there's no risk of that happening.

    splutter(*STDOUT);          # pass the whole glob
    splutter(*STDOUT{IO});      # pass both file and dir handles

    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(*STDIN);     # pass the whole glob
    $rec = get_rec(*STDIN{IO}); # pass both file and dir handles

    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }
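A self-contained variant of the examples above is sketched below. It writes to an in-memory filehandle instead of STDOUT so that the result can be inspected. The names @colors, OUT, and $buffer are assumptions made for the example.

```perl
use strict;
use warnings;

our @colors = ('red', 'green');

my $arrayref = *colors{ARRAY};      # same array as \@colors
my $name     = *colors{NAME};       # a plain string, not a reference
my $package  = *colors{PACKAGE};    # likewise a plain string

# OUT prints into the in-memory buffer $buffer.
my $buffer = '';
open(OUT, '>', \$buffer) or die "cannot open in-memory handle: $!";

sub splutter {
    my $fh = shift;
    print $fh "her um well a hmmm\n";
}
splutter(*OUT{IO});                 # pass only the IO slot of the glob
close(OUT);

print "$name in $package has ", scalar(@$arrayref), " colors\n";
print $buffer;
```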

Using References

That's it for creating references. By now you're probably dying to know how to use references to get back to your long-lost data. There are several basic methods.

1.

Anywhere you'd put an identifier (or chain of identifiers) as part of a variable or subroutine name, you can replace the identifier with a simple scalar variable containing a reference of the correct type:

    $bar = $$scalarref;
    push(@$arrayref, $filename);
    $$arrayref[0] = "January";
    $$hashref{"KEY"} = "VALUE";
    &$coderef(1,2,3);
    print $globref "output\n";

It's important to understand that we are specifically not dereferencing $arrayref[0] or $hashref{"KEY"} there. The dereference of the scalar variable happens before it does any key lookups. Anything more complicated than a simple scalar variable must use methods 2 or 3 below. However, a "simple scalar" includes an identifier that itself uses method 1 recursively. Therefore, the following prints "howdy".

    $refrefref = \\\"howdy";
    print $$$$refrefref;

2.

Anywhere you'd put an identifier (or chain of identifiers) as part of a variable or subroutine name, you can replace the identifier with a BLOCK returning a reference of the correct type. In other words, the previous examples could be written like this:

    $bar = ${$scalarref};
    push(@{$arrayref}, $filename);
    ${$arrayref}[0] = "January";
    ${$hashref}{"KEY"} = "VALUE";
    &{$coderef}(1,2,3);
    $globref->print("output\n");  # iff IO::Handle is loaded

Admittedly, it's a little silly to use the curlies in this case, but the BLOCK can contain any arbitrary expression, in particular, subscripted expressions:

    &{ $dispatch{$index} }(1,2,3);      # call correct routine

Because of being able to omit the curlies for the simple case of $$x , people often make the mistake of viewing the dereferencing symbols as proper operators, and wonder about their precedence. If they were, though, you could use parentheses instead of braces. That's not the case. Consider the difference below; case 0 is a short-hand version of case 1, not case 2:

    $$hashref{"KEY"}   = "VALUE";       # CASE 0
    ${$hashref}{"KEY"} = "VALUE";       # CASE 1
    ${$hashref{"KEY"}} = "VALUE";       # CASE 2
    ${$hashref->{"KEY"}} = "VALUE";     # CASE 3

Case 2 is also deceptive in that you're accessing a variable called %hashref, not dereferencing through $hashref to the hash it's presumably referencing. That would be case 3.
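
The difference between the cases is easier to see when a hash named %hashref actually exists next to the scalar $hashref. The following is only an illustrative sketch; the variable names $target and $other and all values are made up for the demonstration.

```perl
use strict;
use warnings;

my $hashref = { KEY => "old" };      # a reference to an anonymous hash
my $target;
my %hashref = ( KEY => \$target );   # an unrelated hash that shares the name

$$hashref{"KEY"} = "case 0";         # dereferences $hashref
print "$hashref->{KEY}\n";           # case 0

${$hashref}{"KEY"} = "case 1";       # same thing, written with a block
print "$hashref->{KEY}\n";           # case 1

${ $hashref{"KEY"} } = "case 2";     # looks up %hashref, then dereferences that element
print "$target\n";                   # case 2

my $other;
$hashref->{"KEY"} = \$other;         # now the anonymous hash holds a scalar reference
${ $hashref->{"KEY"} } = "case 3";   # goes through $hashref, then dereferences the element
print "$other\n";                    # case 3
```

Cases 0 and 1 write into the anonymous hash, case 2 never touches it, and case 3 writes through a reference stored inside it.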

3.

Subroutine calls and lookups of individual array elements arise often enough that it gets cumbersome to use method 2. As a form of syntactic sugar, the examples for method 2 may be written:

    $arrayref->[0] = "January";   # Array element
    $hashref->{"KEY"} = "VALUE";  # Hash element
    $coderef->(1,2,3);            # Subroutine call

The left side of the arrow can be any expression returning a reference, including a previous dereference. Note that $array[$x] is not the same thing as $array->[$x] here:

    $array[$x]->{"foo"}->[0] = "January";

This is one of the cases we mentioned earlier in which references could spring into existence when in an lvalue context. Before this statement, $array[$x] may have been undefined. If so, it's automatically defined with a hash reference so that we can look up {"foo"} in it. Likewise $array[$x]->{"foo"} will automatically get defined with an array reference so that we can look up [0] in it. This process is called autovivification.

One more thing here. The arrow is optional between bracket subscripts, so you can shrink the above down to

    $array[$x]{"foo"}[0] = "January";

Which, in the degenerate case of using only ordinary arrays, gives you multidimensional arrays just like C's:

    $score[$x][$y][$z] += 42;

Well, okay, not entirely like C's arrays, actually. C doesn't know how to grow its arrays on demand. Perl does.
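
Autovivification can be observed directly by starting from an empty array and inspecting what Perl created along the way. This is a minimal sketch; the index value 2 and the variable names are chosen only for illustration.

```perl
use strict;
use warnings;

my @array;                           # starts out empty
my $x = 2;
$array[$x]{"foo"}[0] = "January";    # intermediate references spring into existence

print ref($array[$x]), "\n";         # HASH
print ref($array[$x]{"foo"}), "\n";  # ARRAY
print scalar(@array), "\n";          # 3, the array grew to hold index 2

my @score;
$score[1][2][3] += 42;               # nested arrays are created on demand
print $score[1][2][3], "\n";         # 42
```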

4.

If a reference happens to be a reference to an object, then there are probably methods to access the things referred to, and you should probably stick to those methods unless you're in the class package that defines the object's methods. In other words, be nice, and don't violate the object's encapsulation without a very good reason. Perl does not enforce encapsulation. We are not totalitarians here. We do expect some basic civility though.

Using a string or number as a reference produces a symbolic reference, as explained above. Using a reference as a number produces an integer representing its storage location in memory. The only useful thing to be done with this is to compare two references numerically to see whether they refer to the same location.

    if ($ref1 == $ref2) {  # cheap numeric compare of references
        print "refs 1 and 2 refer to the same thing\n";
    }

Using a reference as a string produces both its referent's type, including any package blessing as described in perlobj, as well as the numeric address expressed in hex. The ref() operator returns just the type of thing the reference is pointing to, without the address. See ref for details and examples of its use.
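
A short sketch of the numeric, string, and ref() views of a reference follows. The class name Counter is hypothetical and exists only so that there is something to bless into; the hex addresses will differ from run to run.

```perl
use strict;
use warnings;

my @list  = (1, 2, 3);
my $ref1  = \@list;
my $ref2  = \@list;              # a second reference to the same array
my $other = [1, 2, 3];           # same contents, but a different array

print "same\n"      if $ref1 == $ref2;
print "different\n" if $ref1 != $other;

print "$ref1\n";                 # something like ARRAY(0x...)
print ref($ref1), "\n";          # ARRAY

my $obj = bless {}, 'Counter';   # 'Counter' is a made-up class name
print ref($obj), "\n";           # Counter
print "$obj\n";                  # something like Counter=HASH(0x...)
```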

The bless() operator may be used to associate the object a reference points to with a package functioning as an object class. See perlobj.

A typeglob may be dereferenced the same way a reference can, because the dereference syntax always indicates the type of reference desired. So ${*foo} and ${\$foo} both indicate the same scalar variable.

Here's a trick for interpolating a subroutine call into a string:

    print "My sub returned @{[mysub(1,2,3)]} that time.\n";

The way it works is that when the @{...} is seen in the double-quoted string, it's evaluated as a block. The block creates a reference to an anonymous array containing the results of the call to mysub(1,2,3) . So the whole block returns a reference to an array, which is then dereferenced by @{...} and stuck into the double-quoted string. This chicanery is also useful for arbitrary expressions:

    print "That yields @{[$n + 5]} widgets\n";

Similarly, an expression that returns a reference to a scalar can be dereferenced via ${...} . Thus, the above expression may be written as:

    print "That yields ${\($n + 5)} widgets\n";
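
Both interpolation tricks can be tried side by side. In this sketch mysub() is a stand-in that simply doubles its arguments, and $n is an arbitrary value; neither comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each argument doubled
sub mysub { return map { $_ * 2 } @_ }

my $n = 7;

# @{[ ... ]} builds an anonymous array and dereferences it inside the string
my $list = "My sub returned @{[ mysub(1,2,3) ]} that time.";

# ${\( ... )} takes a reference to a scalar result and dereferences it
my $scalar = "That yields ${\($n + 5)} widgets";

print "$list\n";     # My sub returned 2 4 6 that time.
print "$scalar\n";   # That yields 12 widgets
```

The list form joins its elements with the value of $" (a single space by default).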

Circular References

It is possible to create a "circular reference" in Perl, which can lead to memory leaks. A circular reference occurs when two references contain a reference to each other, like this:

    my $foo = {};
    my $bar = { foo => $foo };
    $foo->{bar} = $bar;

You can also create a circular reference with a single variable:

    my $foo;
    $foo = \$foo;

In this case, the reference count for the variables will never reach 0, and the references will never be garbage-collected. This can lead to memory leaks.

Because objects in Perl are implemented as references, it's possible to have circular references with objects as well. Imagine a TreeNode class where each node references its parent and child nodes. Any node with a parent will be part of a circular reference.

You can break circular references by creating a "weak reference". A weak reference does not increment the reference count for a variable, which means that the object can go out of scope and be destroyed. You can weaken a reference with the weaken function exported by the Scalar::Util module.

Here's how we can make the first example safer:

    use Scalar::Util 'weaken';

    my $foo = {};
    my $bar = { foo => $foo };
    $foo->{bar} = $bar;

    weaken $foo->{bar};

The reference from $foo to $bar has been weakened. When the $bar variable goes out of scope, it will be garbage-collected. The next time you look at the value of the $foo->{bar} key, it will be undef.

This action at a distance can be confusing, so you should be careful with your use of weaken. You should weaken the reference in the variable that will go out of scope first. That way, the longer-lived variable will contain the expected reference until it goes out of scope.
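
The effect of weakening can be watched by letting $bar go out of scope in an inner block. This is a sketch only; it additionally uses isweak, which Scalar::Util also exports, to confirm the state of the reference.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $foo = {};
{
    my $bar = { foo => $foo };
    $foo->{bar} = $bar;
    weaken $foo->{bar};              # $foo no longer keeps $bar's hash alive

    print isweak($foo->{bar}) ? "weak\n" : "strong\n";       # weak
    print defined $foo->{bar} ? "defined\n" : "undef\n";     # defined
}
# $bar is out of scope here: its hash has been freed,
# and the weakened slot in $foo has become undef
print defined $foo->{bar} ? "defined\n" : "undef\n";         # undef
```

Note that the key bar still exists in the hash; only its value has turned into undef.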

Symbolic references

We said that references spring into existence as necessary if they are undefined, but we didn't say what happens if a value used as a reference is already defined, but isn't a hard reference. If you use it as a reference, it'll be treated as a symbolic reference. That is, the value of the scalar is taken to be the name of a variable, rather than a direct link to a (possibly) anonymous value.

People frequently expect it to work like this. So it does.

    $name = "foo";
    $$name = 1;                 # Sets $foo
    ${$name} = 2;               # Sets $foo
    ${$name x 2} = 3;           # Sets $foofoo
    $name->[0] = 4;             # Sets $foo[0]
    @$name = ();                # Clears @foo
    &$name();                   # Calls &foo()
    $pack = "THAT";
    ${"${pack}::$name"} = 5;    # Sets $THAT::foo without eval

This is powerful, and slightly dangerous, in that it's possible to intend (with the utmost sincerity) to use a hard reference, and accidentally use a symbolic reference instead. To protect against that, you can say

    use strict 'refs';

and then only hard references will be allowed for the rest of the enclosing block. An inner block may countermand that with

    no strict 'refs';

Only package variables (globals, even if localized) are visible to symbolic references. Lexical variables (declared with my()) aren't in a symbol table, and thus are invisible to this mechanism. For example:

    local $value = 10;
    $ref = "value";
    {
        my $value = 20;
        print $$ref;
    }

This will still print 10, not 20. Remember that local() affects package variables, which are all "global" to the package.
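
The same behaviour can be reproduced under strict by declaring the package variable with our and switching strict refs off only where the symbolic dereference happens. This is a sketch under those assumptions; the variable names $name, $seen, and $ok are illustrative.

```perl
use strict;
use warnings;

our $value = 10;        # package variable, lives in the symbol table
my  $name  = "value";
my  $seen;

{
    my $value = 20;     # lexical; invisible to symbolic references
    no strict 'refs';   # permit the symbolic dereference in this block only
    $seen = $$name;     # finds the package variable $main::value
}
print "$seen\n";        # 10

# Outside the block strict refs is back in force,
# so the same dereference now dies at run time.
my $ok = eval { my $v = $$name; 1 };
print "refused: $@" unless $ok;
```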

Not-so-symbolic references

Brackets around a symbolic reference can simply serve to isolate an identifier or variable name from the rest of an expression, just as they always have within a string. For example,

    $push = "pop on ";
    print "${push}over";

has always meant to print "pop on over", even though push is a reserved word. This is generalized to work the same without the enclosing double quotes, so that

    print ${push} . "over";

and even

    print ${ push } . "over";

will have the same effect. This construct is not considered to be a symbolic reference when you're using strict refs:

    use strict 'refs';
    ${ bareword };      # Okay, means $bareword.
    ${ "bareword" };    # Error, symbolic reference.

Similarly, because of all the subscripting that is done using single words, the same rule applies to any bareword that is used for subscripting a hash. So now, instead of writing

    $array{ "aaa" }{ "bbb" }{ "ccc" }

you can write just

    $array{ aaa }{ bbb }{ ccc }

and not worry about whether the subscripts are reserved words. In the rare event that you do wish to do something like

    $array{ shift }

you can force interpretation as a reserved word by adding anything that makes it more than a bareword:

    $array{ shift() }
    $array{ +shift }
    $array{ shift @_ }

The use warnings pragma or the -w switch will warn you if it interprets a reserved word as a string. But it will no longer warn you about using lowercase words, because the string is effectively quoted.

Pseudo-hashes: Using an array as a hash

Pseudo-hashes have been removed from Perl. The 'fields' pragma remains available.

Function Templates

As explained above, an anonymous function with access to the lexical variables visible when that function was compiled, creates a closure. It retains access to those variables even though it doesn't get run until later, such as in a signal handler or a Tk callback.

Using a closure as a function template allows us to generate many functions that act similarly. Suppose you wanted functions named after the colors that generated HTML font changes for the various colors:

    print "Be ", red("careful"), " with that ", green("light");

The red() and green() functions would be similar. To create these, we'll assign a closure to a typeglob of the name of the function we're trying to build.

    @colors = qw(red blue green yellow orange purple violet);
    for my $name (@colors) {
        no strict 'refs';       # allow symbol table manipulation
        *$name = *{uc $name} = sub { "<FONT COLOR='$name'>@_</FONT>" };
    }

Now all those different functions appear to exist independently. You can call red(), RED(), blue(), BLUE(), green(), etc. This technique saves on both compile time and memory use, and is less error-prone as well, since syntax checks happen at compile time. It's critical that any variables in the anonymous subroutine be lexicals in order to create a proper closure. That's the reason for the my on the loop iteration variable.
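
A reduced version of the template technique that runs under use strict is sketched below. It installs only the lowercase names and uses a shorter color list; both simplifications are choices made for this example.

```perl
use strict;
use warnings;

my @colors = qw(red blue green);
for my $name (@colors) {
    no strict 'refs';       # allow symbol table manipulation by name
    # Each pass through the loop creates a new closure over this $name
    *$name = sub { "<FONT COLOR='$name'>@_</FONT>" };
}

# The generated functions are called like any other sub
my $text = red("careful");
print "$text\n";            # <FONT COLOR='red'>careful</FONT>

my $more = green("light");
print "$more\n";            # <FONT COLOR='green'>light</FONT>
```

Because the subs are installed at run time, they must be called with parentheses (or predeclared) for the calls to compile.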

This is one of the only places where giving a prototype to a closure makes much sense. If you wanted to impose scalar context on the arguments of these functions (probably not a wise idea for this particular example), you could have written it this way instead:

    *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" };

However, since prototype checking happens at compile time, the assignment above happens too late to be of much use. You could address this by putting the whole loop of assignments within a BEGIN block, forcing it to occur during compilation.

Access to lexicals that change over time--like those in the for loop above, basically aliases to elements from the surrounding lexical scopes-- only works with anonymous subs, not with named subroutines. Generally said, named subroutines do not nest properly and should only be declared in the main package scope.

This is because named subroutines are created at compile time so their lexical variables get assigned to the parent lexicals from the first execution of the parent block. If a parent scope is entered a second time, its lexicals are created again, while the nested subs still reference the old ones.

Anonymous subroutines get to capture each time you execute the sub operator, as they are created on the fly. If you are accustomed to using nested subroutines in other programming languages with their own private variables, you'll have to work at it a bit in Perl. The intuitive coding of this type of thing incurs mysterious warnings about "will not stay shared" due to the reasons explained above. For example, this won't work:

    sub outer {
        my $x = $_[0] + 35;
        sub inner { return $x * 19 }   # WRONG
        return $x + inner();
    }

A work-around is the following:

    sub outer {
        my $x = $_[0] + 35;
        local *inner = sub { return $x * 19 };
        return $x + inner();
    }

Now inner() can only be called from within outer(), because of the temporary assignment of the anonymous subroutine. But when it is called, it has normal access to the lexical variable $x from the scope of outer() at the time outer() is invoked.

This has the interesting effect of creating a function local to another function, something not normally supported in Perl.
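
A variation on the work-around, not taken from the text above, keeps the anonymous subroutine in a lexical variable instead of a localized typeglob. It is sketched here only to show that the closure captures a fresh $x on every call to outer().

```perl
use strict;
use warnings;

sub outer {
    my $x = $_[0] + 35;
    my $inner = sub { return $x * 19 };   # captures this call's $x
    return $x + $inner->();
}

print outer(0), "\n";   # 35 + 35*19 = 700
print outer(1), "\n";   # 36 + 36*19 = 720
```

The second call sees its own $x, which is exactly what the named inner() in the WRONG example fails to do.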

WARNING

You may not (usefully) use a reference as the key to a hash. It will be converted into a string:

    $x{ \$a } = $a;

If you try to dereference the key, it won't do a hard dereference, and you won't accomplish what you're attempting. You might want to do something more like

    $r = \@a;
    $x{ $r } = $r;

And then at least you can use the values(), which will be real refs, instead of the keys(), which won't.

The standard Tie::RefHash module provides a convenient workaround to this.
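
The difference between a plain hash and one tied to Tie::RefHash can be seen by inspecting what keys() hands back. This is a minimal sketch; the hash names %plain and %byref are illustrative.

```perl
use strict;
use warnings;
use Tie::RefHash;

my @a = (1, 2, 3);
my $r = \@a;

my %plain;
$plain{$r} = "x";                          # key is stringified
my ($key) = keys %plain;
print ref($key) ? "ref\n" : "string\n";    # string

tie my %byref, 'Tie::RefHash';
$byref{$r} = "x";                          # key stays a real reference
my ($rkey) = keys %byref;
print ref($rkey) ? "ref\n" : "string\n";   # ref
print "@$rkey\n";                          # 1 2 3
```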

SEE ALSO

Besides the obvious documents, source code can be instructive. Some pathological examples of the use of references can be found in the t/op/ref.t regression test in the Perl source directory.

See also perldsc and perllol for how to use references to create complex data structures, and perlootut and perlobj for how to use them to create objects.

 
perlreftut

Perl 5 version 18.2 documentation

NAME

perlreftut - Mark's very short tutorial about references

DESCRIPTION

One of the most important new features in Perl 5 was the capability to manage complicated data structures like multidimensional arrays and nested hashes. To enable these, Perl 5 introduced a feature called 'references', and using references is the key to managing complicated, structured data in Perl. Unfortunately, there's a lot of funny syntax to learn, and the main manual page can be hard to follow. The manual is quite complete, and sometimes people find that a problem, because it can be hard to tell what is important and what isn't.

Fortunately, you only need to know 10% of what's in the main page to get 90% of the benefit. This page will show you that 10%.

Who Needs Complicated Data Structures?

One problem that comes up all the time is needing a hash whose values are lists. Perl has hashes, of course, but the values have to be scalars; they can't be lists.

Why would you want a hash of lists? Let's take a simple example: You have a file of city and country names, like this:

    Chicago, USA
    Frankfurt, Germany
    Berlin, Germany
    Washington, USA
    Helsinki, Finland
    New York, USA

and you want to produce an output like this, with each country mentioned once, and then an alphabetical list of the cities in that country:

    Finland: Helsinki.
    Germany: Berlin, Frankfurt.
    USA: Chicago, New York, Washington.

The natural way to do this is to have a hash whose keys are country names. Associated with each country name key is a list of the cities in that country. Each time you read a line of input, split it into a country and a city, look up the list of cities already known to be in that country, and append the new city to the list. When you're done reading the input, iterate over the hash as usual, sorting each list of cities before you print it out.

If hash values couldn't be lists, you lose. You'd probably have to combine all the cities into a single string somehow, and then when time came to write the output, you'd have to break the string into a list, sort the list, and turn it back into a string. This is messy and error-prone. And it's frustrating, because Perl already has perfectly good lists that would solve the problem if only you could use them.

The Solution

By the time Perl 5 rolled around, we were already stuck with this design: Hash values must be scalars. The solution to this is references.

A reference is a scalar value that refers to an entire array or an entire hash (or to just about anything else). Names are one kind of reference that you're already familiar with. Think of the President of the United States: a messy, inconvenient bag of blood and bones. But to talk about him, or to represent him in a computer program, all you need is the easy, convenient scalar string "Barack Obama".

References in Perl are like names for arrays and hashes. They're Perl's private, internal names, so you can be sure they're unambiguous. Unlike "Barack Obama", a reference only refers to one thing, and you always know what it refers to. If you have a reference to an array, you can recover the entire array from it. If you have a reference to a hash, you can recover the entire hash. But the reference is still an easy, compact scalar value.

You can't have a hash whose values are arrays; hash values can only be scalars. We're stuck with that. But a single reference can refer to an entire array, and references are scalars, so you can have a hash of references to arrays, and it'll act a lot like a hash of arrays, and it'll be just as useful as a hash of arrays.

We'll come back to this city-country problem later, after we've seen some syntax for managing references.

Syntax

There are just two ways to make a reference, and just two ways to use it once you have it.

Making References

Make Rule 1

If you put a \ in front of a variable, you get a reference to that variable.

    $aref = \@array;         # $aref now holds a reference to @array
    $href = \%hash;          # $href now holds a reference to %hash
    $sref = \$scalar;        # $sref now holds a reference to $scalar

Once the reference is stored in a variable like $aref or $href, you can copy it or store it just the same as any other scalar value:

    $xy = $aref;             # $xy now holds a reference to @array
    $p[3] = $href;           # $p[3] now holds a reference to %hash
    $z = $p[3];              # $z now holds a reference to %hash

These examples show how to make references to variables with names. Sometimes you want to make an array or a hash that doesn't have a name. This is analogous to the way you like to be able to use the string "\n" or the number 80 without having to store it in a named variable first.

Make Rule 2

[ ITEMS ] makes a new, anonymous array, and returns a reference to that array. { ITEMS } makes a new, anonymous hash, and returns a reference to that hash.

    $aref = [ 1, "foo", undef, 13 ];
    # $aref now holds a reference to an array

    $href = { APR => 4, AUG => 8 };
    # $href now holds a reference to a hash

The references you get from rule 2 are the same kind of references that you get from rule 1:

    # This:
    $aref = [ 1, 2, 3 ];

    # Does the same as this:
    @array = (1, 2, 3);
    $aref = \@array;

The first line is an abbreviation for the following two lines, except that it doesn't create the superfluous array variable @array .

If you write just [] , you get a new, empty anonymous array. If you write just {} , you get a new, empty anonymous hash.

Using References

What can you do with a reference once you have it? It's a scalar value, and we've seen that you can store it as a scalar and get it back again just like any scalar. There are just two more ways to use it:

Use Rule 1

You can always use an array reference, in curly braces, in place of the name of an array. For example, @{$aref} instead of @array .

Here are some examples of that:

Arrays:

    @a              @{$aref}            An array
    reverse @a      reverse @{$aref}    Reverse the array
    $a[3]           ${$aref}[3]         An element of the array
    $a[3] = 17;     ${$aref}[3] = 17    Assigning an element

On each line are two expressions that do the same thing. The left-hand versions operate on the array @a . The right-hand versions operate on the array that is referred to by $aref . Once they find the array they're operating on, both versions do the same things to the arrays.

Using a hash reference is exactly the same:

    %h                  %{$href}              A hash
    keys %h             keys %{$href}         Get the keys from the hash
    $h{'red'}           ${$href}{'red'}       An element of the hash
    $h{'red'} = 17      ${$href}{'red'} = 17  Assigning an element

Whatever you want to do with a reference, Use Rule 1 tells you how to do it. You just write the Perl code that you would have written for doing the same thing to a regular array or hash, and then replace the array or hash name with {$reference} . "How do I loop over an array when all I have is a reference?" Well, to loop over an array, you would write

    for my $element (@array) {
        ...
    }

so replace the array name, @array , with the reference:

    for my $element (@{$aref}) {
        ...
    }

"How do I print out the contents of a hash when all I have is a reference?" First write the code for printing out a hash:

    for my $key (keys %hash) {
        print "$key => $hash{$key}\n";
    }

And then replace the hash name with the reference:

    for my $key (keys %{$href}) {
        print "$key => ${$href}{$key}\n";
    }

Use Rule 2

Use Rule 1 is all you really need, because it tells you how to do absolutely everything you ever need to do with references. But the most common thing to do with an array or a hash is to extract a single element, and the Use Rule 1 notation is cumbersome. So there is an abbreviation.

${$aref}[3] is too hard to read, so you can write $aref->[3] instead.

${$href}{red} is too hard to read, so you can write $href->{red} instead.

If $aref holds a reference to an array, then $aref->[3] is the fourth element of the array. Don't confuse this with $aref[3] , which is the fourth element of a totally different array, one deceptively named @aref . $aref and @aref are unrelated the same way that $item and @item are.

Similarly, $href->{'red'} is part of the hash referred to by the scalar variable $href , perhaps even one with no name. $href{'red'} is part of the deceptively named %href hash. It's easy to leave out the -> by mistake, and if you do, you'll get bizarre results when your program gets array and hash elements out of totally unexpected hashes and arrays that weren't the ones you wanted to use.
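
The confusion is easy to reproduce when an array named @aref exists alongside the scalar $aref. The sketch below uses made-up contents purely to make the two arrays distinguishable.

```perl
use strict;
use warnings;

my $aref = [ 10, 20, 30, 40 ];   # reference to an anonymous array
my @aref = qw(a b c d);          # unrelated array with a confusingly similar name

print ${$aref}[3], "\n";   # 40, Use Rule 1
print $aref->[3], "\n";    # 40, Use Rule 2, the same element
print $aref[3], "\n";      # d, an element of @aref instead
```

Under use strict, leaving out the arrow when no @aref has been declared is caught at compile time, which is one more reason to enable it.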

An Example

Let's see a quick example of how all this is useful.

First, remember that [1, 2, 3] makes an anonymous array containing (1, 2, 3) , and gives you a reference to that array.

Now think about

    @a = ( [1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]
         );

@a is an array with three elements, and each one is a reference to another array.

$a[1] is one of these references. It refers to an array, the array containing (4, 5, 6) , and because it is a reference to an array, Use Rule 2 says that we can write $a[1]->[2] to get the third element from that array. $a[1]->[2] is the 6. Similarly, $a[0]->[1] is the 2. What we have here is like a two-dimensional array; you can write $a[ROW]->[COLUMN] to get or set the element in any row and any column of the array.

The notation still looks a little cumbersome, so there's one more abbreviation:

Arrow Rule

In between two subscripts, the arrow is optional.

Instead of $a[1]->[2] , we can write $a[1][2] ; it means the same thing. Instead of $a[0]->[1] = 23 , we can write $a[0][1] = 23 ; it means the same thing.

Now it really looks like two-dimensional arrays!

You can see why the arrows are important. Without them, we would have had to write ${$a[1]}[2] instead of $a[1][2] . For three-dimensional arrays, they let us write $x[2][3][5] instead of the unreadable ${${$x[2]}[3]}[5] .
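
The three notations can be checked against each other on the small grid from the example. This is a sketch; the final loop that prints the rows is an addition for illustration.

```perl
use strict;
use warnings;

my @a = ( [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9],
        );

print ${$a[1]}[2], "\n";    # 6, Use Rule 1
print $a[1]->[2], "\n";     # 6, Use Rule 2
print $a[1][2], "\n";       # 6, Arrow Rule

$a[0][1] = 23;              # set row 0, column 1

# Each element of @a is an array reference, so dereference it to get a row
for my $row (@a) {
    print join(" ", @{$row}), "\n";
}
```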

Solution

Here's the answer to the problem I posed earlier, of reformatting a file of city and country names.

    1   my %table;

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

    8   foreach $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13   }

The program has two pieces: Lines 2-7 read the input and build a data structure, and lines 8-13 analyze the data and print out the report. We're going to have a hash, %table , whose keys are country names, and whose values are references to arrays of city names. The data structure will look like this:

           %table
          +-------+---+
          |       |   |   +-----------+--------+
          |Germany| *---->| Frankfurt | Berlin |
          |       |   |   +-----------+--------+
          +-------+---+
          |       |   |   +----------+
          |Finland| *---->| Helsinki |
          |       |   |   +----------+
          +-------+---+
          |       |   |   +---------+------------+----------+
          |  USA  | *---->| Chicago | Washington | New York |
          |       |   |   +---------+------------+----------+
          +-------+---+

We'll look at output first. Supposing we already have this structure, how do we print it out?

    8   foreach $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13   }

%table is an ordinary hash, and we get a list of keys from it, sort the keys, and loop over the keys as usual. The only use of references is in line 10. $table{$country} looks up the key $country in the hash and gets the value, which is a reference to an array of cities in that country. Use Rule 1 says that we can recover the array by saying @{$table{$country}} . Line 10 is just like

    @cities = @array;

except that the name array has been replaced by the reference {$table{$country}} . The @ tells Perl to get the entire array. Having gotten the list of cities, we sort it, join it, and print it out as usual.

Lines 2-7 are responsible for building the structure in the first place. Here they are again:

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

Lines 2-4 acquire a city and country name. Line 5 looks to see if the country is already present as a key in the hash. If it's not, the program uses the [] notation (Make Rule 2) to manufacture a new, empty anonymous array of cities, and installs a reference to it into the hash under the appropriate key.

Line 6 installs the city name into the appropriate array. $table{$country} now holds a reference to the array of cities seen in that country so far. Line 6 is exactly like

    push @array, $city;

except that the name array has been replaced by the reference {$table{$country}} . The push adds a city name to the end of the referred-to array.

There's one fine point I skipped. Line 5 is unnecessary, and we can get rid of it.

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5   ####  $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

If there's already an entry in %table for the current $country , then nothing is different. Line 6 will locate the value in $table{$country} , which is a reference to an array, and push $city into the array. But what does it do when $country holds a key, say Greece , that is not yet in %table ?

This is Perl, so it does the exact right thing. It sees that you want to push Athens onto an array that doesn't exist, so it helpfully makes a new, empty, anonymous array for you, installs it into %table , and then pushes Athens onto it. This is called 'autovivification'--bringing things to life automatically. Perl saw that the key wasn't in the hash, so it created a new hash entry automatically. Perl saw that you wanted to use the hash value as an array, so it created a new empty array and installed a reference to it in the hash automatically. And as usual, Perl made the array one element longer to hold the new city name.
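
The shortened program can be exercised without an input file by feeding it the sample lines from an array. This is a sketch of the same technique; the @lines and @report arrays are additions made so the example is self-contained.

```perl
use strict;
use warnings;

# Sample input, standing in for the lines read from <>
my @lines = (
    "Chicago, USA",
    "Frankfurt, Germany",
    "Berlin, Germany",
    "Washington, USA",
    "Helsinki, Finland",
    "New York, USA",
);

my %table;
for my $line (@lines) {
    my ($city, $country) = split /, /, $line;
    # No explicit initialization: the array is autovivified on first push
    push @{ $table{$country} }, $city;
}

my @report;
for my $country (sort keys %table) {
    my @cities = sort @{ $table{$country} };
    push @report, "$country: " . join(', ', @cities) . ".";
}
print "$_\n" for @report;
```

The output matches the three report lines shown at the start of the tutorial.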

The Rest

I promised to give you 90% of the benefit with 10% of the details, and that means I left out 90% of the details. Now that you have an overview of the important parts, it should be easier to read the perlref manual page, which discusses 100% of the details.

Some of the highlights of perlref:

  • You can make references to anything, including scalars, functions, and other references.

  • In Use Rule 1, you can omit the curly brackets whenever the thing inside them is an atomic scalar variable like $aref . For example, @$aref is the same as @{$aref} , and $$aref[1] is the same as ${$aref}[1] . If you're just starting out, you may want to adopt the habit of always including the curly brackets.

  • This doesn't copy the underlying array:

        $aref2 = $aref1;

    You get two references to the same array. If you modify $aref1->[23] and then look at $aref2->[23] you'll see the change.

    To copy the array, use

        $aref2 = [@{$aref1}];

    This uses [...] notation to create a new anonymous array, and $aref2 is assigned a reference to the new array. The new array is initialized with the contents of the array referred to by $aref1 .

    Similarly, to copy an anonymous hash, you can use

        $href2 = {%{$href1}};

  • To see if a variable contains a reference, use the ref function. It returns true if its argument is a reference. Actually it's a little better than that: It returns HASH for hash references and ARRAY for array references.

  • If you try to use a reference like a string, you get strings like

        ARRAY(0x80f5dec)   or   HASH(0x826afc0)

    If you ever see a string that looks like this, you'll know you printed out a reference by mistake.

    A side effect of this representation is that you can use eq to see if two references refer to the same thing. (But you should usually use == instead because it's much faster.)

  • You can use a string as if it were a reference. If you use the string "foo" as an array reference, it's taken to be a reference to the array @foo . This is called a soft reference or symbolic reference. The declaration use strict 'refs' disables this feature, which can cause all sorts of trouble if you use it by accident.
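
The point in the list above about copying references versus copying arrays can be checked with a short sketch. The variable names $alias and $copy are illustrative, and the nested array is included only to show that the [@{...}] copy is shallow.

```perl
use strict;
use warnings;

my $aref1 = [ 1, 2, [ 'x', 'y' ] ];
my $alias = $aref1;            # second reference to the same array
my $copy  = [ @{$aref1} ];     # new array holding the same top-level elements

$aref1->[0] = 'changed';
print $alias->[0], "\n";       # changed, because it is the same array
print $copy->[0], "\n";        # 1, because the copy is independent

# The copy is shallow: the nested array is still shared by both
$aref1->[2][0] = 'shared';
print $copy->[2][0], "\n";     # shared
```

Copying nested structures completely requires copying each level, for example with dclone from the core Storable module.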

You might prefer to go on to perllol instead of perlref; it discusses lists of lists and multidimensional arrays in detail. After that, you should move on to perldsc; it's a Data Structure Cookbook that shows recipes for using and printing out arrays of hashes, hashes of arrays, and other kinds of data.

Summary

Everyone needs compound data structures, and in Perl the way you get them is with references. There are four important rules for managing references: Two for making references and two for using them. Once you know these rules you can do most of the important things you need to do with references.

Credits

Author: Mark Jason Dominus, Plover Systems (mjd-perl-ref+@plover.com )

This article originally appeared in The Perl Journal ( http://www.tpj.com/ ) volume 3, #2. Reprinted with permission.

The original title was Understand References Today.

Distribution Conditions

Copyright 1998 The Perl Journal.

This documentation is free; you can redistribute it and/or modify it under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.

 
perlreguts

Perl 5 version 18.2 documentation

NAME

perlreguts - Description of the Perl regular expression engine.

DESCRIPTION

This document is an attempt to shine some light on the guts of the regex engine and how it works. The regex engine represents a significant chunk of the perl codebase, but is relatively poorly understood. This document is a meagre attempt at addressing this situation. It is derived from the author's experience, comments in the source code, other papers on the regex engine, feedback on the perl5-porters mail list, and no doubt other places as well.

NOTICE! It should be clearly understood that the behavior and structures discussed in this document represent the state of the engine as the author understood it at the time of writing. It is NOT an API definition; it is purely an internals guide for those who want to hack the regex engine, or understand how the regex engine works. Readers of this document are expected to understand perl's regex syntax and its usage in detail. If you want to learn about the basics of Perl's regular expressions, see perlre. And if you want to replace the regex engine with your own, see perlreapi.

OVERVIEW

A quick note on terms

There is some debate as to whether to say "regexp" or "regex". In this document we will use the term "regex" unless there is a special reason not to, in which case we will explain why.

When speaking about regexes we need to distinguish between their source code form and their internal form. In this document we will use the term "pattern" when we speak of their textual, source code form, and the term "program" when we speak of their internal representation. These correspond to the terms S-regex and B-regex that Mark Jason Dominus employs in his paper on "Rx" ([1] in REFERENCES).

What is a regular expression engine?

A regular expression engine is a program that takes a set of constraints specified in a mini-language, and then applies those constraints to a target string, and determines whether or not the string satisfies the constraints. See perlre for a full definition of the language.

In less grandiose terms, the first part of the job is to turn a pattern into something the computer can efficiently use to find the matching point in the string, and the second part is performing the search itself.

To do this we need to produce a program by parsing the text. We then need to execute the program to find the point in the string that matches. And we need to do the whole thing efficiently.

Structure of a Regexp Program

High Level

Although it is a bit confusing and some people object to the terminology, it is worth taking a look at a comment that has been in regexp.h for years:

This is essentially a linear encoding of a nondeterministic finite-state machine (aka syntax charts or "railroad normal form" in parsing technology).

The term "railroad normal form" is a bit esoteric, with "syntax diagram/charts", or "railroad diagram/charts" being more common terms. Nevertheless it provides a useful mental image of a regex program: each node can be thought of as a unit of track, with a single entry and in most cases a single exit point (there are pieces of track that fork, but statistically not many), and the whole forms a layout with a single entry and single exit point. The matching process can be thought of as a car that moves along the track, with the particular route through the system being determined by the character read at each possible connector point. A car can fall off the track at any point but it may only proceed as long as it matches the track.

Thus the pattern /foo(?:\w+|\d+|\s+)bar/ can be thought of as the following chart:

                      [start]
                         |
                       <foo>
                         |
                   +-----+-----+
                   |     |     |
                 <\w+> <\d+> <\s+>
                   |     |     |
                   +-----+-----+
                         |
                       <bar>
                         |
                       [end]

The truth of the matter is that perl's regular expressions these days are much more complex than this kind of structure, but visualising it this way can help when trying to get your bearings, and it matches the current implementation pretty closely.

To be more precise, we will say that a regex program is an encoding of a graph. Each node in the graph corresponds to part of the original regex pattern, such as a literal string or a branch, and has a pointer to the nodes representing the next component to be matched. Since "node" and "opcode" already have other meanings in the perl source, we will call the nodes in a regex program "regops".

The program is represented by an array of regnode structures, one or more of which represent a single regop of the program. Struct regnode is the smallest struct needed, and has a field structure which is shared with all the other larger structures.

The "next" pointers of all regops except BRANCH implement concatenation; a "next" pointer with a BRANCH on both ends of it is connecting two alternatives. [Here we have one of the subtle syntax dependencies: an individual BRANCH (as opposed to a collection of them) is never concatenated with anything because of operator precedence.]

The operand of some types of regop is a literal string; for others, it is a regop leading into a sub-program. In particular, the operand of a BRANCH node is the first regop of the branch.

NOTE: As the railroad metaphor suggests, this is not a tree structure: the tail of the branch connects to the thing following the set of BRANCHes. It is like a single line of railway track that splits as it goes into a station or railway yard and rejoins as it comes out the other side.

Regops

The base structure of a regop is defined in regexp.h as follows:

    struct regnode {
        U8  flags;    /* Various purposes, sometimes overridden */
        U8  type;     /* Opcode value as specified by regnodes.h */
        U16 next_off; /* Offset in size regnode */
    };

Other larger regnode-like structures are defined in regcomp.h. They are almost like subclasses in that they have the same fields as regnode, possibly with additional fields following in the structure, and in some cases the specific meaning (and name) of some of the base fields is overridden. The following is a more complete description.

  • regnode_1
  • regnode_2

    regnode_1 structures have the same header, followed by a single four-byte argument; regnode_2 structures contain two two-byte arguments instead:

        regnode_1                U32 arg1;
        regnode_2                U16 arg1;  U16 arg2;

  • regnode_string

    regnode_string structures, used for literal strings, follow the header with a one-byte length and then the string data. Strings are padded on the end with zero bytes so that the total length of the node is a multiple of four bytes:

        regnode_string           char string[1];
                                 U8 str_len; /* overrides flags */

  • regnode_charclass

    Character classes are represented by regnode_charclass structures, which have a four-byte argument and then a 32-byte (256-bit) bitmap indicating which characters are included in the class.

        regnode_charclass        U32 arg1;
                                 char bitmap[ANYOF_BITMAP_SIZE];

  • regnode_charclass_class

    There is also a larger form of a char class structure used to represent POSIX char classes called regnode_charclass_class which has an additional 4-byte (32-bit) bitmap indicating which POSIX char classes have been included.

        regnode_charclass_class  U32 arg1;
                                 char bitmap[ANYOF_BITMAP_SIZE];
                                 char classflags[ANYOF_CLASSBITMAP_SIZE];

regnodes.h defines an array called regarglen[] which gives the size of each opcode in units of size regnode (4-byte). A macro is used to calculate the size of an EXACT node based on its str_len field.
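As a rough illustration of that size calculation, the sketch below computes how many 4-byte regnode slots a literal-string node would occupy: one slot for the header plus enough slots to hold the padded string. The names toy_regnode, toy_str_slots and toy_exact_node_size are invented for this example; they are not the macros perl itself uses, and the sketch assumes the three-field header packs into four bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct regnode: same three fields. */
typedef struct {
    uint8_t  flags;     /* holds str_len in string nodes */
    uint8_t  type;
    uint16_t next_off;
} toy_regnode;

/* Number of regnode-sized slots needed to hold len bytes of string
 * data, rounded up so the node stays a whole number of slots. */
static size_t toy_str_slots(size_t len)
{
    return (len + sizeof(toy_regnode) - 1) / sizeof(toy_regnode);
}

/* Total size of an EXACT-style node: one header slot plus the string. */
static size_t toy_exact_node_size(size_t str_len)
{
    assert(str_len <= UINT8_MAX);  /* length must fit the one-byte field */
    return 1 + toy_str_slots(str_len);
}
```

Under these assumptions a three-byte string such as foo takes two slots, which agrees with the debug dumps later in this document, where EXACT <foo> sits at position 1 and the following regop at position 3.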

The regops are defined in regnodes.h which is generated from regcomp.sym by regcomp.pl. Currently the maximum possible number of distinct regops is restricted to 256, with about a quarter already used.

A set of macros makes accessing the fields easier and more consistent. These include OP() , which is used to determine the type of a regnode -like structure; NEXT_OFF() , which is the offset to the next node (more on this later); ARG() , ARG1() , ARG2() , ARG_SET() , and equivalents for reading and setting the arguments; and STR_LEN() , STRING() and OPERAND() for manipulating strings and regop bearing types.

What regop is next?

There are three distinct concepts of "next" in the regex engine, and it is important to keep them clear.

  • There is the "next regnode" from a given regnode, a value which is rarely useful except that sometimes it matches up in terms of value with one of the others, and that sometimes the code assumes this to always be so.

  • There is the "next regop" from a given regop/regnode. This is the regop physically located after the current one, as determined by the size of the current regop. This is often useful, such as when dumping the structure we use this order to traverse. Sometimes the code assumes that the "next regnode" is the same as the "next regop", or in other words assumes that the sizeof a given regop type is always going to be one regnode large.

  • There is the "regnext" from a given regop. This is the regop which is reached by jumping forward by the value of NEXT_OFF() , or in a few cases for longer jumps by the arg1 field of the regnode_1 structure. The subroutine regnext() handles this transparently. This is the logical successor of the node, which in some cases, like that of the BRANCH regop, has special meaning.
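The three notions of "next" can be sketched over a toy array of fixed-size nodes. This is only a simplified model: toy_regnode and the toy_ functions are hypothetical names, the regop size is passed in by the caller instead of being looked up per opcode, and the long-jump case that uses arg1 is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified node: real regops vary in size and type. */
typedef struct {
    uint8_t  flags;
    uint8_t  type;
    uint16_t next_off;   /* distance to the logical successor, 0 = none */
} toy_regnode;

/* "Next regnode": simply the following array slot. */
static toy_regnode *toy_next_regnode(toy_regnode *p)
{
    return p + 1;
}

/* "Next regop": skip over the whole current regop, whose size in
 * regnode slots is supplied by the caller in this sketch. */
static toy_regnode *toy_next_regop(toy_regnode *p, size_t size_in_slots)
{
    assert(size_in_slots >= 1);
    return p + size_in_slots;
}

/* "regnext": follow the stored offset; an offset of 0 means there is
 * no logical successor. */
static toy_regnode *toy_regnext(toy_regnode *p)
{
    if (p == NULL || p->next_off == 0)
        return NULL;
    return p + p->next_off;
}
```

For a node that occupies two slots and whose next_off is 2, the "next regop" and the "regnext" coincide, while the "next regnode" points into the middle of the node's string data.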

Process Overview

Broadly speaking, performing a match of a string against a pattern involves the following steps:

  • A. Compilation
    1. Parsing for size
    2. Parsing for construction
    3. Peep-hole optimisation and analysis
  • B. Execution
    4. Start position and no-match optimisations
    5. Program execution

Where these steps occur in the actual execution of a perl program is determined by whether the pattern involves interpolating any string variables. If interpolation occurs, then compilation happens at run time. If it does not, then compilation is performed at compile time. (The /o modifier changes this, as does qr// to a certain extent.) The engine doesn't really care that much.

Compilation

This code resides primarily in regcomp.c, along with the header files regcomp.h, regexp.h and regnodes.h.

Compilation starts with pregcomp() , which is mostly an initialisation wrapper which farms work out to two other routines for the heavy lifting: the first is reg() , which is the start point for parsing; the second, study_chunk() , is responsible for optimisation.

Initialisation in pregcomp() mostly involves the creation and data-filling of a special structure, RExC_state_t (defined in regcomp.c). Almost all internally-used routines in regcomp.h take a pointer to one of these structures as their first argument, with the name pRExC_state . This structure is used to store the compilation state and contains many fields. Likewise there are many macros which operate on this variable: anything that looks like RExC_xxxx is a macro that operates on this pointer/structure.

Parsing for size

In this pass the input pattern is parsed in order to calculate how much space is needed for each regop we would need to emit. The size is also used to determine whether long jumps will be required in the program.

This stage is controlled by the macro SIZE_ONLY being set.

The parse proceeds pretty much exactly as it does during the construction phase, except that most routines are short-circuited to change the size field RExC_size and not do anything else.
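The two-pass idea can be sketched in a few lines. In this toy version the same parse routine runs twice: while the output buffer is NULL it only counts slots, and once a buffer of the counted size exists it writes into it. The names toy_state, toy_emit, toy_parse and toy_compile are hypothetical, and the "program" is just one byte per pattern character; the real passes are driven by SIZE_ONLY and RExC_size as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical compile state: buf is NULL during the sizing pass. */
typedef struct {
    size_t   size;   /* slots counted (pass 1) or written (pass 2) */
    uint8_t *buf;    /* output program, one byte per slot in this toy */
} toy_state;

/* Emit one slot: always count it, store it only when constructing. */
static void toy_emit(toy_state *st, uint8_t op)
{
    if (st->buf != NULL)
        st->buf[st->size] = op;
    st->size++;
}

/* The same parse routine serves both passes: every pattern byte
 * becomes one slot, followed by a terminating 0 slot. */
static void toy_parse(toy_state *st, const char *pattern)
{
    for (const char *p = pattern; *p != '\0'; p++)
        toy_emit(st, (uint8_t)*p);
    toy_emit(st, 0);
}

/* Pass 1 sizes the program, pass 2 fills a buffer of exactly that
 * size. Returns a malloc'd program (caller frees) or NULL on failure. */
static uint8_t *toy_compile(const char *pattern, size_t *out_size)
{
    toy_state st = { 0, NULL };
    assert(pattern != NULL && out_size != NULL);

    toy_parse(&st, pattern);          /* sizing pass */
    st.buf = malloc(st.size);
    if (st.buf == NULL)
        return NULL;

    *out_size = st.size;
    st.size = 0;
    toy_parse(&st, pattern);          /* construction pass */
    return st.buf;
}
```

Sharing one parse routine between the passes keeps the counted size and the emitted size in step, which is the property the real design relies on.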

Parsing for construction

Once the size of the program has been determined, the pattern is parsed again, but this time for real. Now SIZE_ONLY will be false, and the actual construction can occur.

reg() is the start of the parse process. It is responsible for parsing an arbitrary chunk of pattern up to either the end of the string, or the first closing parenthesis it encounters in the pattern. This means it can be used to parse the top-level regex, or any section inside of a grouping parenthesis. It also handles the "special parens" that perl's regexes have. For instance when parsing /x(?:foo)y/ reg() will at one point be called to parse from the "?" symbol up to and including the ")".

Additionally, reg() is responsible for parsing the one or more branches from the pattern, and for "finishing them off" by correctly setting their next pointers. In order to do the parsing, it repeatedly calls out to regbranch() , which is responsible for handling up to the first | symbol it sees.

regbranch() in turn calls regpiece() which handles "things" followed by a quantifier. In order to parse the "things", regatom() is called. This is the lowest level routine, which parses out constant strings, character classes, and the various special symbols like $ . If regatom() encounters a "(" character it in turn calls reg() .

The routine regtail() is called by both reg() and regbranch() in order to "set the tail pointer" correctly. When executing and we get to the end of a branch, we need to go to the node following the grouping parens. When parsing, however, we don't know where the end will be until we get there, so when we do we must go back and update the offsets as appropriate. regtail is used to make this easier.
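The back-patching described above might look roughly like this on a toy node array. The sketch walks the chain of logical successors from a starting node until it finds one with no successor, then points that node at the target. toy_regnode, toy_regnext and toy_regtail are hypothetical names; the real regtail() also has to cope with long offsets and with the sizing pass, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified node with a relative "next" offset. */
typedef struct {
    uint8_t  flags;
    uint8_t  type;
    uint16_t next_off;   /* 0 means "not yet set" */
} toy_regnode;

/* Follow the stored offset, or return NULL if none is set. */
static toy_regnode *toy_regnext(toy_regnode *p)
{
    if (p == NULL || p->next_off == 0)
        return NULL;
    return p + p->next_off;
}

/* Find the last node on the chain that starts at p and make its
 * logical successor the node at target. */
static void toy_regtail(toy_regnode *p, toy_regnode *target)
{
    toy_regnode *scan = p;
    toy_regnode *next;

    assert(p != NULL && target != NULL);
    while ((next = toy_regnext(scan)) != NULL)
        scan = next;

    assert(target > scan);           /* offsets only point forwards here */
    scan->next_off = (uint16_t)(target - scan);
}
```

Only the final node of the chain is modified; offsets that were already set earlier in the chain are left alone.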

A subtlety of the parsing process means that a regex like /foo/ is originally parsed into an alternation with a single branch. It is only afterwards that the optimiser converts single branch alternations into the simpler form.

Parse Call Graph and a Grammar

The call graph looks like this:

    reg()                        # parse a top level regex, or inside of
                                 # parens
        regbranch()              # parse a single branch of an alternation
            regpiece()           # parse a pattern followed by a quantifier
                regatom()        # parse a simple pattern
                    regclass()   #   used to handle a class
                    reg()        #   used to handle a parenthesised
                                 #   subpattern
                    ....
            ...
            regtail()            # finish off the branch
        ...
        regtail()                # finish off the branch sequence. Tie each
                                 # branch's tail to the tail of the
                                 # sequence
                                 # (NEW) In Debug mode this is
                                 # regtail_study().

A grammar form might be something like this:

    atom  : constant | class
    quant : '*' | '+' | '?' | '{min,max}'
    _branch: piece
           | piece _branch
           | nothing
    branch: _branch
          | _branch '|' branch
    group : '(' branch ')'
    _piece: atom | group
    piece : _piece
          | _piece quant
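To make the shape of the call graph and grammar concrete, here is a minimal recursive-descent recognizer that mirrors the reg / regbranch / regpiece / regatom layering. It is a sketch under heavy simplifying assumptions: it only answers whether a pattern is well formed, it treats every character other than ( ) | * + ? as a constant, and it knows nothing about classes, {min,max}, escapes or special parens. All toy_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool toy_reg(const char **p, int depth);

/* atom: a single literal character, or a parenthesised group. */
static bool toy_regatom(const char **p, int depth)
{
    if (**p == '(') {
        (*p)++;                             /* consume '(' */
        if (!toy_reg(p, depth + 1))
            return false;
        if (**p != ')')
            return false;                   /* unbalanced group */
        (*p)++;                             /* consume ')' */
        return true;
    }
    if (**p == '\0' || strchr("*+?|)", **p) != NULL)
        return false;                       /* quantifier follows nothing */
    (*p)++;
    return true;
}

/* piece: an atom optionally followed by one quantifier. */
static bool toy_regpiece(const char **p, int depth)
{
    if (!toy_regatom(p, depth))
        return false;
    if (**p != '\0' && strchr("*+?", **p) != NULL)
        (*p)++;
    return true;
}

/* branch: zero or more pieces, up to '|', ')' or end of pattern. */
static bool toy_regbranch(const char **p, int depth)
{
    while (**p != '\0' && **p != '|' && **p != ')') {
        if (!toy_regpiece(p, depth))
            return false;
    }
    return true;
}

/* reg: one or more branches separated by '|'. */
static bool toy_reg(const char **p, int depth)
{
    if (!toy_regbranch(p, depth))
        return false;
    while (**p == '|') {
        (*p)++;
        if (!toy_regbranch(p, depth))
            return false;
    }
    /* A ')' is only acceptable when we are inside a group. */
    return **p == ')' ? depth > 0 : true;
}

/* Entry point: true if the whole pattern parses under the toy grammar. */
static bool toy_pattern_ok(const char *pattern)
{
    const char *p = pattern;
    assert(pattern != NULL);
    return toy_reg(&p, 0) && *p == '\0';
}
```

As in the real parser, the atom routine calls back into the top-level routine when it meets an opening parenthesis, so nested groups produce the reg, regbranch, regpiece, regatom, reg cycle.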

Parsing complications

The implication of the above description is that a pattern containing nested parentheses will result in a call graph which cycles through reg(), regbranch(), regpiece(), regatom(), reg(), regbranch() and so on multiple times, until the deepest level of nesting is reached. All the above routines return a pointer to a regnode, which is usually the last regnode added to the program. However, one complication is that reg() returns NULL for parsing (?:) syntax for embedded modifiers, setting the flag TRYAGAIN. The TRYAGAIN propagates upwards until it is captured, in some cases by regatom(), but otherwise unconditionally by regbranch(). Hence it will never be returned by regbranch() to reg(). This flag permits patterns such as (?i)+ to be detected as errors (Quantifier follows nothing in regex; marked by <-- HERE in m/(?i)+ <-- HERE /).

Another complication is that the representation used for the program differs if it needs to store Unicode, but it's not always possible to know for sure whether it does until midway through parsing. The Unicode representation for the program is larger, and cannot be matched as efficiently. (See Unicode and Localisation Support below for more details as to why.) If the pattern contains literal Unicode, it's obvious that the program needs to store Unicode. Otherwise, the parser optimistically assumes that the more efficient representation can be used, and starts sizing on this basis. However, if it then encounters something in the pattern which must be stored as Unicode, such as an \x{...} escape sequence representing a character literal, then this means that all previously calculated sizes need to be redone, using values appropriate for the Unicode representation. Currently, all regular expression constructions which can trigger this are parsed by code in regatom() .

To avoid wasted work when a restart is needed, the sizing pass is abandoned: regatom() immediately returns NULL, setting the flag RESTART_UTF8. (This action is encapsulated using the macro REQUIRE_UTF8.) This restart request is propagated up the call chain in a similar fashion, until it is "caught" in Perl_re_op_compile(), which marks the pattern as containing Unicode, and restarts the sizing pass. It is also possible for constructions within run-time code blocks to turn out to need Unicode representation, which is signalled by S_compile_runtime_code() returning false to Perl_re_op_compile().

The restart was previously implemented using a longjmp in regatom() back to a setjmp in Perl_re_op_compile() , but this proved to be problematic as the latter is a large function containing many automatic variables, which interact badly with the emergent control flow of setjmp .

Debug Output

In the 5.9.x development version of perl you can use re Debug => 'PARSE' to see some trace information about the parse process. We will start with some simple patterns and build up to more complex patterns.

So when we parse /foo/ we see something like the following table. The left shows what is being parsed, and the number indicates where the next regop would go. The stuff on the right is the trace output of the graph. The names are chosen to be short to make it less dense on the screen. 'tsdy' is a special form of regtail() which does some extra analysis.

    >foo<             1    reg
                             brnc
                               piec
                                 atom
    ><                4      tsdy~ EXACT <foo> (EXACT) (1)
                                 ~ attach to END (3) offset to 2

The resulting program then looks like:

       1: EXACT <foo>(3)
       3: END(0)

As you can see, even though we parsed out a branch and a piece, it was ultimately only an atom. The final program shows us how things work. We have an EXACT regop, followed by an END regop. The number in parens indicates where the regnext of the node goes. The regnext of an END regop is unused, as END regops mean we have successfully matched. The number on the left indicates the position of the regop in the regnode array.

Now let's try a harder pattern. We will add a quantifier, so now we have the pattern /foo+/ . We will see that regbranch() calls regpiece() twice.

    >foo+<            1    reg
                             brnc
                               piec
                                 atom
    >o+<              3        piec
                                 atom
    ><                6        tail~ EXACT <fo> (1)
                      7      tsdy~ EXACT <fo> (EXACT) (1)
                                 ~ PLUS (END) (3)
                                 ~ attach to END (6) offset to 3

And we end up with the program:

       1: EXACT <fo>(3)
       3: PLUS(6)
       4:   EXACT <o>(0)
       6: END(0)

Now we have a special case. The EXACT regop has a regnext of 0. This is because if it matches it should try to match itself again. The PLUS regop handles the actual failure of the EXACT regop and acts appropriately (going to regnode 6 if the EXACT matched at least once, or failing if it didn't).

Now for something much more complex: /x(?:foo*|b[a][rR])(foo|bar)$/

    >x(?:foo*|b...    1    reg
                             brnc
                               piec
                                 atom
    >(?:foo*|b[...    3        piec
                                 atom
    >?:foo*|b[a...                 reg
    >foo*|b[a][...                   brnc
                                       piec
                                         atom
    >o*|b[a][rR...    5                piec
                                         atom
    >|b[a][rR])...    8                tail~ EXACT <fo> (3)
    >b[a][rR])(...    9              brnc
                     10                piec
                                         atom
    >[a][rR])(f...   12                piec
                                         atom
    >a][rR])(fo...                         clas
    >[rR])(foo|...   14                tail~ EXACT <b> (10)
                                       piec
                                         atom
    >rR])(foo|b...                         clas
    >)(foo|bar)...   25                tail~ EXACT <a> (12)
                                     tail~ BRANCH (3)
                     26              tsdy~ BRANCH (END) (9)
                                         ~ attach to TAIL (25) offset to 16
                                     tsdy~ EXACT <fo> (EXACT) (4)
                                         ~ STAR (END) (6)
                                         ~ attach to TAIL (25) offset to 19
                                     tsdy~ EXACT <b> (EXACT) (10)
                                         ~ EXACT <a> (EXACT) (12)
                                         ~ ANYOF[Rr] (END) (14)
                                         ~ attach to TAIL (25) offset to 11
    >(foo|bar)$<               tail~ EXACT <x> (1)
                               piec
                                 atom
    >foo|bar)$<                    reg
                     28              brnc
                                       piec
                                         atom
    >|bar)$<         31              tail~ OPEN1 (26)
    >bar)$<                          brnc
                     32                piec
                                         atom
    >)$<             34              tail~ BRANCH (28)
                     36              tsdy~ BRANCH (END) (31)
                                         ~ attach to CLOSE1 (34) offset to 3
                                     tsdy~ EXACT <foo> (EXACT) (29)
                                         ~ attach to CLOSE1 (34) offset to 5
                                     tsdy~ EXACT <bar> (EXACT) (32)
                                         ~ attach to CLOSE1 (34) offset to 2
    >$<                        tail~ BRANCH (3)
                                   ~ BRANCH (9)
                                   ~ TAIL (25)
                               piec
                                 atom
    ><               37        tail~ OPEN1 (26)
                                   ~ BRANCH (28)
                                   ~ BRANCH (31)
                                   ~ CLOSE1 (34)
                     38        tsdy~ EXACT <x> (EXACT) (1)
                                   ~ BRANCH (END) (3)
                                   ~ BRANCH (END) (9)
                                   ~ TAIL (END) (25)
                                   ~ OPEN1 (END) (26)
                                   ~ BRANCH (END) (28)
                                   ~ BRANCH (END) (31)
                                   ~ CLOSE1 (END) (34)
                                   ~ EOL (END) (36)
                                   ~ attach to END (37) offset to 1

Resulting in the program

       1: EXACT <x>(3)
       3: BRANCH(9)
       4:   EXACT <fo>(6)
       6:   STAR(26)
       7:     EXACT <o>(0)
       9: BRANCH(25)
      10:   EXACT <ba>(14)
      12:   OPTIMIZED (2 nodes)
      14:   ANYOF[Rr](26)
      25: TAIL(26)
      26: OPEN1(28)
      28:   TRIE-EXACT(34)
            [StS:1 Wds:2 Cs:6 Uq:5 #Sts:7 Mn:3 Mx:3 Stcls:bf]
              <foo>
              <bar>
      30:   OPTIMIZED (4 nodes)
      34: CLOSE1(36)
      36: EOL(37)
      37: END(0)

Here we can see a much more complex program, with various optimisations in play. At regnode 10 we see an example where a character class with only one character in it was turned into an EXACT node. We can also see where an entire alternation was turned into a TRIE-EXACT node. As a consequence, some of the regnodes have been marked as optimised away. We can see that the $ symbol has been converted into an EOL regop, a special piece of code that looks for \n or the end of the string.

The next pointer for BRANCHes is interesting in that it points at where execution should go if the branch fails. When executing, if the engine tries to traverse from a branch to a regnext that isn't a branch then the engine will know that the entire set of branches has failed.

Peep-hole Optimisation and Analysis

The regular expression engine can be a weighty tool to wield. On long strings and complex patterns it can end up having to do a lot of work to find a match, and even more to decide that no match is possible. Consider a situation like the following pattern.

    'ababababababababababab' =~ /(a|b)*z/

The (a|b)* part can match at every char in the string, and then fail every time because there is no z in the string. So obviously we can avoid using the regex engine unless there is a z in the string. Likewise in a pattern like:

    /foo(\w+)bar/

In this case we know that the string must contain a foo which must be followed by bar . We can use Fast Boyer-Moore matching as implemented in fbm_instr() to find the location of these strings. If they don't exist then we don't need to resort to the much more expensive regex engine. Even better, if they do exist then we can use their positions to reduce the search space that the regex engine needs to cover to determine if the entire pattern matches.
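A minimal sketch of that kind of pre-check for /foo(\w+)bar/ follows. It uses the standard strstr() as a stand-in for fbm_instr(), so it shows only the idea of rejecting strings cheaply, not the Boyer-Moore search itself. The function name toy_may_match_foo_bar is hypothetical. A true result only means the full engine still has to run; a false result means no match is possible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Cheap necessary-condition test for /foo(\w+)bar/: the string must
 * contain "foo" and, at least one character later, "bar". */
static bool toy_may_match_foo_bar(const char *s)
{
    const char *foo;

    assert(s != NULL);
    foo = strstr(s, "foo");
    if (foo == NULL)
        return false;

    /* \w+ needs at least one character, so "bar" cannot start
     * earlier than four bytes after the first "foo". */
    if (foo[3] == '\0')
        return false;
    return strstr(foo + 4, "bar") != NULL;
}
```

Searching from the first occurrence of foo is enough for a rejection test: if any later foo has a suitable bar after it, that same bar also lies far enough past the first foo.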

There are various aspects of the pattern that can be used to facilitate optimisations along these lines:

  • anchored fixed strings
  • floating fixed strings
  • minimum and maximum length requirements
  • start class
  • Beginning/End of line positions

Another form of optimisation that can occur is the post-parse "peep-hole" optimisation, where inefficient constructs are replaced by more efficient constructs. The TAIL regops which are used during parsing to mark the end of branches and the end of groups are examples of this. These regops are used as place-holders during construction and "always match" so they can be "optimised away" by making the things that point to the TAIL point to the thing that TAIL points to, thus "skipping" the node.
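The TAIL-skipping step can be sketched on a toy node array as follows. If a node's logical successor is a TAIL placeholder, the node is repointed at whatever the TAIL leads to. The opcode values and the toy_ names are hypothetical, and the real optimiser does this as part of a much larger analysis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_END, TOY_EXACT, TOY_TAIL };   /* hypothetical opcodes */

/* Hypothetical, simplified node with a relative "next" offset. */
typedef struct {
    uint8_t  flags;
    uint8_t  type;
    uint16_t next_off;   /* 0 means no logical successor */
} toy_regnode;

/* Follow the stored offset, or return NULL if there is none. */
static toy_regnode *toy_regnext(toy_regnode *p)
{
    if (p == NULL || p->next_off == 0)
        return NULL;
    return p + p->next_off;
}

/* Make p point past any chain of TAIL placeholders that follows it. */
static void toy_skip_tail(toy_regnode *p)
{
    toy_regnode *next;

    assert(p != NULL);
    next = toy_regnext(p);
    while (next != NULL && next->type == TOY_TAIL) {
        toy_regnode *after = toy_regnext(next);
        if (after == NULL)
            break;               /* TAIL with no successor: keep it */
        next = after;
    }
    if (next != NULL)
        p->next_off = (uint16_t)(next - p);
}
```

This mirrors what can be seen in the dump above: the ANYOF[Rr] regop at position 14 was first attached to the TAIL at 25, but in the final program its regnext is 26.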

Another optimisation that can occur is that of "EXACT merging" which is where two consecutive EXACT nodes are merged into a single regop. An even more aggressive form of this is that a branch sequence of the form EXACT BRANCH ... EXACT can be converted into a TRIE-EXACT regop.

All of this occurs in the routine study_chunk() which uses a special structure scan_data_t to store the analysis that it has performed, and does the "peep-hole" optimisations as it goes.

The code involved in study_chunk() is extremely cryptic. Be careful. :-)

Execution

Execution of a regex generally involves two phases, the first being finding the start point in the string where we should match from, and the second being running the regop interpreter.

If we can tell that there is no valid start point then we don't bother running the interpreter at all. Likewise, if we know from the analysis phase that we cannot detect a short-cut to the start position, we go straight to the interpreter.

The two entry points are re_intuit_start() and pregexec() . These routines have a somewhat incestuous relationship with overlap between their functions, and pregexec() may even call re_intuit_start() on its own. Nevertheless other parts of the perl source code may call into either, or both.

Execution of the interpreter itself used to be recursive, but thanks to the efforts of Dave Mitchell in the 5.9.x development track, that has changed: now an internal stack is maintained on the heap and the routine is fully iterative. This can make it tricky as the code is quite conservative about what state it stores, with the result that two consecutive lines in the code can actually be running in totally different contexts due to the simulated recursion.

Start position and no-match optimisations

re_intuit_start() is responsible for handling start points and no-match optimisations as determined by the results of the analysis done by study_chunk() (and described in Peep-hole Optimisation and Analysis).

The basic structure of this routine is to try to find the start- and/or end-points of where the pattern could match, and to ensure that the string is long enough to match the pattern. It tries to use more efficient methods over less efficient methods and may involve considerable cross-checking of constraints to find the place in the string that matches. For instance it may try to determine that a given fixed string must be not only present but a certain number of chars before the end of the string, or whatever.

It calls several other routines, such as fbm_instr() which does Fast Boyer Moore matching and find_byclass() which is responsible for finding the start using the first mandatory regop in the program.

When the optimisation criteria have been satisfied, regtry() is called to perform the match.

Program execution

pregexec() is the main entry point for running a regex. It contains support for initialising the regex interpreter's state, running re_intuit_start() if needed, and running the interpreter on the string from various start positions as needed. When it is necessary to use the regex interpreter pregexec() calls regtry() .

regtry() is the entry point into the regex interpreter. It expects as arguments a pointer to a regmatch_info structure and a pointer to a string. It returns an integer 1 for success and a 0 for failure. It is basically a set-up wrapper around regmatch() .

regmatch() is the main "recursive loop" of the interpreter. It is basically a giant switch statement that implements a state machine, where the possible states are the regops themselves, plus a number of additional intermediate and failure states. A few of the states are implemented as subroutines but the bulk are inline code.

MISCELLANEOUS

Unicode and Localisation Support

When dealing with strings containing characters that cannot be represented using an eight-bit character set, perl uses an internal representation that is a permissive version of Unicode's UTF-8 encoding[2]. This uses single bytes to represent characters from the ASCII character set, and sequences of two or more bytes for all other characters. (See perlunitut for more information about the relationship between UTF-8 and perl's encoding, utf8. The difference isn't important for this discussion.)

No matter how you look at it, Unicode support is going to be a pain in a regex engine. Tricks that might be fine when you have 256 possible characters often won't scale to handle the size of the UTF-8 character set. Things you can take for granted with ASCII may not be true with Unicode. For instance, in ASCII, it is safe to assume that sizeof(char1) == sizeof(char2) , but in UTF-8 it isn't. Unicode case folding is vastly more complex than the simple rules of ASCII, and even when not using Unicode but only localised single byte encodings, things can get tricky (for example, LATIN SMALL LETTER SHARP S (U+00DF, ß) should match 'SS' in localised case-insensitive matching).

Making things worse is that UTF-8 support was a later addition to the regex engine (as it was to perl) and this necessarily made things a lot more complicated. Obviously it is easier to design a regex engine with Unicode support in mind from the beginning than it is to retrofit it to one that wasn't.

Nearly all regops that involve looking at the input string have two cases, one for UTF-8, and one not. In fact, it's often more complex than that, as the pattern may be UTF-8 as well.

Care must be taken when making changes to make sure that you handle UTF-8 properly, both at compile time and at execution time, including when the string and pattern are mismatched.

The following comment in regcomp.h gives an example of exactly how tricky this can be:

    Two problematic code points in Unicode casefolding of EXACT nodes:

    U+0390 - GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
    U+03B0 - GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS

    which casefold to

    Unicode                      UTF-8

    U+03B9 U+0308 U+0301         0xCE 0xB9 0xCC 0x88 0xCC 0x81
    U+03C5 U+0308 U+0301         0xCF 0x85 0xCC 0x88 0xCC 0x81

    This means that in case-insensitive matching (or "loose matching",
    as Unicode calls it), an EXACTF of length six (the UTF-8 encoded
    byte length of the above casefolded versions) can match a target
    string of length two (the byte length of UTF-8 encoded U+0390 or
    U+03B0). This would rather mess up the minimum length computation.

    What we'll do is to look for the tail four bytes, and then peek
    at the preceding two bytes to see whether we need to decrease
    the minimum length by four (six minus two).

    Thanks to the design of UTF-8, there cannot be false matches:
    A sequence of valid UTF-8 bytes cannot be a subsequence of
    another valid sequence of UTF-8 bytes.
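
The length mismatch described in that comment can be observed from Perl itself. The following is a minimal sketch, assuming Perl 5.16 or later (for the fc function) and the core Encode module; the variable names are illustrative and are not taken from the engine source.

```perl
use strict;
use warnings;
use feature 'fc';          # fc() needs Perl 5.16 or later
use Encode qw(encode);

my $char   = "\x{0390}";   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
my $folded = fc($char);    # full case folding: U+03B9 U+0308 U+0301

# Compare UTF-8 byte counts before and after folding.
my $char_bytes   = length encode('UTF-8', $char);
my $folded_bytes = length encode('UTF-8', $folded);

printf "original: %d char(s), %d bytes\n", length($char),   $char_bytes;
printf "folded:   %d char(s), %d bytes\n", length($folded), $folded_bytes;
```

The folded form is three characters and six bytes long, while the original is one character and two bytes, which is the four-byte difference the minimum length computation has to allow for.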

Base Structures

The regexp structure described in perlreapi is common to all regex engines. Two of its fields are intended for the private use of the regex engine that compiled the pattern. These are the intflags and pprivate members. The pprivate is a void pointer to an arbitrary structure whose use and management is the responsibility of the compiling engine. perl will never modify either of these values. In the case of the stock engine the structure pointed to by pprivate is called regexp_internal.

Its pprivate and intflags fields contain data specific to each engine.

There are two structures used to store a compiled regular expression. One, the regexp structure described in perlreapi, is populated by the engine currently being used, and some of its fields are read by perl to implement things such as the stringification of qr//.

The other structure is pointed to by the regexp struct's pprivate member and, together with intflags in the same struct, is considered to be the property of the regex engine which compiled the regular expression.

The regexp structure contains all the data that perl needs to be aware of to properly work with the regular expression. It includes data about optimisations that perl can use to determine if the regex engine should really be used, and various other control info that is needed to properly execute patterns in various contexts, such as whether the pattern is anchored in some way, what flags were used during the compile, or whether the program contains special constructs that perl needs to be aware of.

In addition it contains two fields that are intended for the private use of the regex engine that compiled the pattern. These are the intflags and pprivate members. The pprivate is a void pointer to an arbitrary structure whose use and management is the responsibility of the compiling engine. perl will never modify either of these values.

As mentioned earlier, in the case of the default engines, the pprivate will be a pointer to a regexp_internal structure which holds the compiled program and any additional data that is private to the regex engine implementation.

Perl's pprivate structure

The following structure is used as the pprivate struct by perl's regex engine. Since it is specific to perl it is only of curiosity value to other engine implementations.

    typedef struct regexp_internal {
            U32 *offsets;           /* offset annotations 20001228 MJD
                                     * data about mapping the program to
                                     * the string */
            regnode *regstclass;    /* Optional startclass as identified or
                                     * constructed by the optimiser */
            struct reg_data *data;  /* Additional miscellaneous data used
                                     * by the program.  Used to make it
                                     * easier to clone and free arbitrary
                                     * data that the regops need. Often the
                                     * ARG field of a regop is an index
                                     * into this structure */
            regnode program[1];     /* Unwarranted chumminess with
                                     * compiler. */
    } regexp_internal;

  • offsets

    Offsets holds a mapping of offset in the program to offset in the precomp string. This is only used by ActiveState's visual regex debugger.

  • regstclass

    Special regop that is used by re_intuit_start() to check if a pattern can match at a certain position. For instance if the regex engine knows that the pattern must start with a 'Z' then it can scan the string until it finds one and then launch the regex engine from there. The routine that handles this is called find_by_class() . Sometimes this field points at a regop embedded in the program, and sometimes it points at an independent synthetic regop that has been constructed by the optimiser.

  • data

    This field points at a reg_data structure, which is defined as follows:

        struct reg_data {
            U32 count;
            U8 *what;
            void* data[1];
        };

    This structure is used for handling data structures that the regex engine needs to handle specially during a clone or free operation on the compiled product. Each element in the data array has a corresponding element in the what array. During compilation regops that need special structures stored will add an element to each array using the add_data() routine and then store the index in the regop.

  • program

    Compiled program. Inlined into the structure so the entire struct can be treated as a single blob.

SEE ALSO

perlreapi

perlre

perlunitut

AUTHOR

by Yves Orton, 2006.

With excerpts from Perl, and contributions and suggestions from Ronald J. Kimball, Dave Mitchell, Dominic Dunlop, Mark Jason Dominus, Stephen McCamant, and David Landgren.

LICENCE

Same terms as Perl.

REFERENCES

[1] http://perl.plover.com/Rx/paper/

[2] http://www.unicode.org

 
perlrepository

NAME

perlrepository - Links to current information on the Perl source repository

DESCRIPTION

Perl's source code is stored in a Git repository.

See perlhack for an explanation of Perl development, including the Super Quick Patch Guide for making and submitting a small patch.

See perlgit for detailed information about Perl's Git repository.

(The above documents supersede the information that was formerly here in perlrepository.)

perlrequick

NAME

perlrequick - Perl regular expressions quick start

DESCRIPTION

This page covers the very basics of understanding, creating and using regular expressions ('regexes') in Perl.

The Guide

Simple word matching

The simplest regex is simply a word, or more generally, a string of characters. A regex consisting of a word matches any string that contains that word:

    "Hello World" =~ /World/;  # matches

In this statement, World is a regex and the // enclosing /World/ tells Perl to search a string for a match. The operator =~ associates the string with the regex match and produces a true value if the regex matched, or false if the regex did not match. In our case, World matches the second word in "Hello World", so the expression is true. This idea has several variations.

Expressions like this are useful in conditionals:

    print "It matches\n" if "Hello World" =~ /World/;

The sense of the match can be reversed by using the !~ operator:

    print "It doesn't match\n" if "Hello World" !~ /World/;

The literal string in the regex can be replaced by a variable:

    $greeting = "World";
    print "It matches\n" if "Hello World" =~ /$greeting/;

If you're matching against $_, the $_ =~ part can be omitted:

    $_ = "Hello World";
    print "It matches\n" if /World/;

Finally, the // default delimiters for a match can be changed to arbitrary delimiters by putting an 'm' out front:

    "Hello World" =~ m!World!;      # matches, delimited by '!'
    "Hello World" =~ m{World};      # matches, note the matching '{}'
    "/usr/bin/perl" =~ m"/perl";    # matches after '/usr/bin',
                                    # '/' becomes an ordinary char

Regexes must match a part of the string exactly in order for the statement to be true:

    "Hello World" =~ /world/;   # doesn't match, case sensitive
    "Hello World" =~ /o W/;     # matches, ' ' is an ordinary char
    "Hello World" =~ /World /;  # doesn't match, no ' ' at end

Perl will always match at the earliest possible point in the string:

    "Hello World" =~ /o/;        # matches 'o' in 'Hello'
    "That hat is red" =~ /hat/;  # matches 'hat' in 'That'

Not all characters can be used 'as is' in a match. Some characters, called metacharacters, are reserved for use in regex notation. The metacharacters are:

    {}[]()^$.|*+?\

A metacharacter can be matched by putting a backslash before it:

    "2+2=4" =~ /2+2/;                        # doesn't match, + is a metacharacter
    "2+2=4" =~ /2\+2/;                       # matches, \+ is treated like an ordinary +
    'C:\WIN32' =~ /C:\\WIN/;                 # matches
    "/usr/bin/perl" =~ /\/usr\/bin\/perl/;   # matches

In the last regex, the forward slash '/' is also backslashed, because it is used to delimit the regex.

Non-printable ASCII characters are represented by escape sequences. Common examples are \t for a tab, \n for a newline, and \r for a carriage return. Arbitrary bytes are represented by octal escape sequences, e.g., \033, or hexadecimal escape sequences, e.g., \x1B:

    "1000\t2000" =~ m(0\t2);     # matches
    "cat" =~ /\143\x61\x74/;     # matches in ASCII, but a weird way to spell cat

Regexes are treated mostly as double-quoted strings, so variable substitution works:

    $foo = 'house';
    'cathouse' =~ /cat$foo/;    # matches
    'housecat' =~ /${foo}cat/;  # matches
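
Because an interpolated variable is treated as part of the pattern, any metacharacters it contains keep their special meaning. The sketch below shows one way to match such a value literally using the \Q...\E escape pair listed in perlreref; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $literal = '2+2';    # contains the metacharacter '+'
my $text    = '2+2=4';

# Interpolated as-is, '+' acts as a quantifier, so this does not match.
my $raw_match = $text =~ /$literal/ ? 1 : 0;

# \Q...\E backslashes the metacharacters in the interpolated value.
my $quoted_match = $text =~ /\Q$literal\E/ ? 1 : 0;

print "raw: $raw_match, quoted: $quoted_match\n";
```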

With all of the regexes above, if the regex matched anywhere in the string, it was considered a match. To specify where it should match, we would use the anchor metacharacters ^ and $. The anchor ^ means match at the beginning of the string and the anchor $ means match at the end of the string, or before a newline at the end of the string. Some examples:

    "housekeeper" =~ /keeper/;         # matches
    "housekeeper" =~ /^keeper/;        # doesn't match
    "housekeeper" =~ /keeper$/;        # matches
    "housekeeper\n" =~ /keeper$/;      # matches
    "housekeeper" =~ /^housekeeper$/;  # matches

Using character classes

A character class allows a set of possible characters, rather than just a single character, to match at a particular point in a regex. Character classes are denoted by brackets [...], with the set of characters to be possibly matched inside. Here are some examples:

    /cat/;             # matches 'cat'
    /[bcr]at/;         # matches 'bat', 'cat', or 'rat'
    "abc" =~ /[cab]/;  # matches 'a'

In the last statement, even though 'c' is the first character in the class, the earliest point at which the regex can match is 'a'.

    /[yY][eE][sS]/;  # match 'yes' in a case-insensitive way
                     # 'yes', 'Yes', 'YES', etc.
    /yes/i;          # also match 'yes' in a case-insensitive way

The last example shows a match with an 'i' modifier, which makes the match case-insensitive.

Character classes also have ordinary and special characters, but the sets of ordinary and special characters inside a character class are different from those outside a character class. The special characters for a character class are -]\^$ and are matched using an escape:

    /[\]c]def/;   # matches ']def' or 'cdef'
    $x = 'bcr';
    /[$x]at/;     # matches 'bat', 'cat', or 'rat'
    /[\$x]at/;    # matches '$at' or 'xat'
    /[\\$x]at/;   # matches '\at', 'bat', 'cat', or 'rat'

The special character '-' acts as a range operator within character classes, so that the unwieldy [0123456789] and [abc...xyz] become the svelte [0-9] and [a-z]:

    /item[0-9]/;     # matches 'item0' or ... or 'item9'
    /[0-9a-fA-F]/;   # matches a hexadecimal digit

If '-' is the first or last character in a character class, it is treated as an ordinary character.

The special character ^ in the first position of a character class denotes a negated character class, which matches any character but those in the brackets. Both [...] and [^...] must match a character, or the match fails. For example:

    /[^a]at/;   # doesn't match 'aat' or 'at', but matches
                # all other 'bat', 'cat', '0at', '%at', etc.
    /[^0-9]/;   # matches a non-numeric character
    /[a^]at/;   # matches 'aat' or '^at'; here '^' is ordinary

Perl has several abbreviations for common character classes. (These definitions are those that Perl uses in ASCII-safe mode with the /a modifier. Otherwise they could match many more non-ASCII Unicode characters as well. See Backslash sequences in perlrecharclass for details.)

  • \d is a digit and represents

        [0-9]

  • \s is a whitespace character and represents

        [\ \t\r\n\f]

  • \w is a word character (alphanumeric or _) and represents

        [0-9a-zA-Z_]

  • \D is a negated \d; it represents any character but a digit

        [^0-9]

  • \S is a negated \s; it represents any non-whitespace character

        [^\s]

  • \W is a negated \w; it represents any non-word character

        [^\w]

  • The period '.' matches any character but "\n"

The \d \s \w \D \S \W abbreviations can be used both inside and outside of character classes. Here are some in use:

    /\d\d:\d\d:\d\d/;  # matches a hh:mm:ss time format
    /[\d\s]/;          # matches any digit or whitespace character
    /\w\W\w/;          # matches a word char, followed by a
                       # non-word char, followed by a word char
    /..rt/;            # matches any two chars, followed by 'rt'
    /end\./;           # matches 'end.'
    /end[.]/;          # same thing, matches 'end.'
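
The difference between the ASCII-only definitions given above and the default Unicode-aware behaviour can be seen with a non-ASCII digit. This is a small sketch, assuming Perl 5.14 or later (where the /a modifier is available); the test character and variable names are chosen purely for illustration.

```perl
use strict;
use warnings;

my $arabic_three = "\x{0663}";   # ARABIC-INDIC DIGIT THREE

# Without /a, \d follows Unicode rules for this string and accepts the digit.
my $unicode_digit = $arabic_three =~ /^\d$/  ? 1 : 0;

# With /a, \d is restricted to [0-9], so the same string is rejected.
my $ascii_digit   = $arabic_three =~ /^\d$/a ? 1 : 0;

print "Unicode rules: $unicode_digit, ASCII-only rules: $ascii_digit\n";
```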

The word anchor \b matches a boundary between a word character and a non-word character, \w\W or \W\w:

    $x = "Housecat catenates house and cat";
    $x =~ /\bcat/;    # matches cat in 'catenates'
    $x =~ /cat\b/;    # matches cat in 'Housecat'
    $x =~ /\bcat\b/;  # matches 'cat' at end of string

In the last example, the end of the string is considered a word boundary.

Matching this or that

We can match different character strings with the alternation metacharacter '|'. To match dog or cat, we form the regex dog|cat. As before, Perl will try to match the regex at the earliest possible point in the string. At each character position, Perl will first try to match the first alternative, dog. If dog doesn't match, Perl will then try the next alternative, cat. If cat doesn't match either, then the match fails and Perl moves to the next position in the string. Some examples:

    "cats and dogs" =~ /cat|dog|bird/;  # matches "cat"
    "cats and dogs" =~ /dog|cat|bird/;  # matches "cat"

Even though dog is the first alternative in the second regex, cat is able to match earlier in the string.

    "cats" =~ /c|ca|cat|cats/;  # matches "c"
    "cats" =~ /cats|cat|ca|c/;  # matches "cats"

At a given character position, the first alternative that allows the regex match to succeed will be the one that matches. Here, all the alternatives match at the first string position, so the first matches.
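
The phrase "allows the regex match to succeed" matters once something follows the alternation. The sketch below, with illustrative variable names, captures the chosen alternative with and without anchors to show that Perl moves on to later alternatives when an earlier one prevents the rest of the regex from matching.

```perl
use strict;
use warnings;

# With nothing after the alternation, the first alternative is enough.
my ($unanchored) = "cats" =~ /(c|ca|cat|cats)/;

# With $ after the group, 'c', 'ca' and 'cat' each fail to reach the end
# of the string, so Perl falls through to 'cats'.
my ($anchored) = "cats" =~ /^(c|ca|cat|cats)$/;

print "unanchored: $unanchored, anchored: $anchored\n";
```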

Grouping things and hierarchical matching

The grouping metacharacters () allow a part of a regex to be treated as a single unit. Parts of a regex are grouped by enclosing them in parentheses. The regex house(cat|keeper) means match house followed by either cat or keeper. Some more examples are:

    /(a|b)b/;            # matches 'ab' or 'bb'
    /(^a|b)c/;           # matches 'ac' at start of string or 'bc' anywhere
    /house(cat|)/;       # matches either 'housecat' or 'house'
    /house(cat(s|)|)/;   # matches either 'housecats' or 'housecat' or
                         # 'house'. Note groups can be nested.
    "20" =~ /(19|20|)\d\d/;  # matches the null alternative '()\d\d',
                             # because '20\d\d' can't match

Extracting matches

The grouping metacharacters () also allow the extraction of the parts of a string that matched. For each grouping, the part that matched inside goes into the special variables $1, $2, etc. They can be used just as ordinary variables:

    # extract hours, minutes, seconds
    $time =~ /(\d\d):(\d\d):(\d\d)/;  # match hh:mm:ss format
    $hours = $1;
    $minutes = $2;
    $seconds = $3;

In list context, a match /regex/ with groupings will return the list of matched values ($1,$2,...). So we could rewrite it as

    ($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/);
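
In practice it is worth checking that the match succeeded before using the captured values, since a failed match in list context returns an empty list. The following is a minimal sketch of that pattern; parse_time is a hypothetical helper name introduced only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (hours, minutes, seconds) on success,
# or an empty list if the string does not contain hh:mm:ss.
sub parse_time {
    my ($time) = @_;
    return $time =~ /(\d\d):(\d\d):(\d\d)/;
}

if (my ($hours, $minutes, $seconds) = parse_time("12:34:56")) {
    print "h=$hours m=$minutes s=$seconds\n";
}

my @nothing = parse_time("no time here");
print "no match\n" unless @nothing;
```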

If the groupings in a regex are nested, $1 gets the group with the leftmost opening parenthesis, $2 the next opening parenthesis, etc. For example, here is a complex regex and the matching variables indicated below it:

    /(ab(cd|ef)((gi)|j))/;
     1  2      34
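
To see the numbering in action, the sketch below applies that regex to a sample string chosen for illustration and collects the groups in list context. A group that did not take part in the match is returned as undef.

```perl
use strict;
use warnings;

my @groups = "abcdj" =~ /(ab(cd|ef)((gi)|j))/;

# Print each group, marking the ones that did not participate.
for my $i (0 .. $#groups) {
    my $value = defined $groups[$i] ? "'$groups[$i]'" : 'undef';
    printf "\$%d = %s\n", $i + 1, $value;
}
```

Here group 4, (gi), is undefined because the 'j' alternative of group 3 was the one that matched.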

Associated with the matching variables $1, $2, ... are the backreferences \g1, \g2, ... Backreferences are matching variables that can be used inside a regex:

    /(\w\w\w)\s\g1/;  # find sequences like 'the the' in string

$1, $2, ... should only be used outside of a regex, and \g1, \g2, ... only inside a regex.

Matching repetitions

The quantifier metacharacters ?, *, +, and {} allow us to determine the number of repeats of a portion of a regex we consider to be a match. Quantifiers are put immediately after the character, character class, or grouping that we want to specify. They have the following meanings:

  • a? = match 'a' 1 or 0 times

  • a* = match 'a' 0 or more times, i.e., any number of times

  • a+ = match 'a' 1 or more times, i.e., at least once

  • a{n,m} = match at least n times, but not more than m times

  • a{n,} = match at least n times

  • a{n} = match exactly n times

Here are some examples:

    /[a-z]+\s+\d*/;  # match a lowercase word, at least some space, and
                     # any number of digits
    /(\w+)\s+\g1/;   # match doubled words of arbitrary length
    $year =~ /^\d{2,4}$/;        # make sure year is at least 2 but not more
                                 # than 4 digits
    $year =~ /^\d{4}$|^\d{2}$/;  # better match; throw out 3 digit dates

These quantifiers will try to match as much of the string as possible, while still allowing the regex to match. So we have

    $x = 'the cat in the hat';
    $x =~ /^(.*)(at)(.*)$/;  # matches,
                             # $1 = 'the cat in the h'
                             # $2 = 'at'
                             # $3 = '' (0 matches)

The first quantifier .* grabs as much of the string as possible while still having the regex match. The second quantifier .* has no string left to it, so it matches 0 times.
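
For comparison, the minimal form .*? (listed under QUANTIFIERS in perlreref, and not otherwise covered in this quick start) takes as little as possible. The sketch below reruns the same match with a minimal first quantifier; the variable names are illustrative.

```perl
use strict;
use warnings;

my $x = 'the cat in the hat';

# Greedy: the first group extends to the last possible 'at'.
my @greedy = $x =~ /^(.*)(at)(.*)$/;

# Minimal: the first group stops at the first possible 'at'.
my @minimal = $x =~ /^(.*?)(at)(.*)$/;

print "greedy:  [$greedy[0]] [$greedy[1]] [$greedy[2]]\n";
print "minimal: [$minimal[0]] [$minimal[1]] [$minimal[2]]\n";
```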

More matching

There are a few more things you might want to know about matching operators. The global modifier //g allows the matching operator to match within a string as many times as possible. In scalar context, successive matches against a string will have //g jump from match to match, keeping track of position in the string as it goes along. You can get or set the position with the pos() function. For example,

    $x = "cat dog house";  # 3 words
    while ($x =~ /(\w+)/g) {
        print "Word is $1, ends at position ", pos $x, "\n";
    }

prints

    Word is cat, ends at position 3
    Word is dog, ends at position 7
    Word is house, ends at position 13

A failed match or changing the target string resets the position. If you don't want the position reset after failure to match, add the //c modifier, as in /regex/gc.
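
One common use of /gc is to try several patterns in turn at the current position. The sketch below is a tiny tokenizer written under that assumption; it also relies on the \G anchor (described in perlreref) to pin each attempt to where the previous match left off. The token labels are invented for the example.

```perl
use strict;
use warnings;

my $x = "12ab";
my @tokens;

while (1) {
    if    ($x =~ /\G(\d+)/gc)    { push @tokens, "NUM:$1"  }
    elsif ($x =~ /\G([a-z]+)/gc) { push @tokens, "WORD:$1" }
    else                         { last }   # nothing matches at pos($x)
}

print join(", ", @tokens), "\n";
```

Without the c modifier, the failed digit match on the second pass would reset the position to the start of the string and the loop would never terminate.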

In list context, //g returns a list of matched groupings, or if there are no groupings, a list of matches to the whole regex. So

    @words = ($x =~ /(\w+)/g);  # matches,
                                # $words[0] = 'cat'
                                # $words[1] = 'dog'
                                # $words[2] = 'house'

Search and replace

Search and replace is performed using s/regex/replacement/modifiers. The replacement is a Perl double-quoted string that replaces in the string whatever is matched with the regex. The operator =~ is also used here to associate a string with s///. If matching against $_, the $_ =~ can be dropped. If there is a match, s/// returns the number of substitutions made; otherwise it returns false. Here are a few examples:

    $x = "Time to feed the cat!";
    $x =~ s/cat/hacker/;    # $x contains "Time to feed the hacker!"
    $y = "'quoted words'";
    $y =~ s/^'(.*)'$/$1/;   # strip single quotes,
                            # $y contains "quoted words"

With the s/// operator, the matched variables $1, $2, etc. are immediately available for use in the replacement expression. With the global modifier, s///g will search and replace all occurrences of the regex in the string:

    $x = "I batted 4 for 4";
    $x =~ s/4/four/;   # $x contains "I batted four for 4"
    $x = "I batted 4 for 4";
    $x =~ s/4/four/g;  # $x contains "I batted four for four"

The non-destructive modifier s///r causes the result of the substitution to be returned instead of modifying $_ (or whatever variable the substitute was bound to with =~):

    $x = "I like dogs.";
    $y = $x =~ s/dogs/cats/r;
    print "$x $y\n";  # prints "I like dogs. I like cats."

    $x = "Cats are great.";
    print $x =~ s/Cats/Dogs/r =~ s/Dogs/Frogs/r =~ s/Frogs/Hedgehogs/r, "\n";
    # prints "Hedgehogs are great."

    @foo = map { s/[a-z]/X/r } qw(a b c 1 2 3);
    # @foo is now qw(X X X 1 2 3)

The evaluation modifier s///e wraps an eval{...} around the replacement string and the evaluated result is substituted for the matched substring. Some examples:

    # reverse all the words in a string
    $x = "the cat in the hat";
    $x =~ s/(\w+)/reverse $1/ge;  # $x contains "eht tac ni eht tah"

    # convert percentage to decimal
    $x = "A 39% hit rate";
    $x =~ s!(\d+)%!$1/100!e;      # $x contains "A 0.39 hit rate"

The last example shows that s/// can use other delimiters, such as s!!! and s{}{}, and even s{}//. If single quotes are used s''', then the regex and replacement are treated as single-quoted strings.
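
As a further illustration, the sketch below combines the s{}{} delimiters with the g and e modifiers to double every number in a string while leaving the original untouched. The sample text and variable names are made up for the example.

```perl
use strict;
use warnings;

my $x = "3 apples and 5 pears";

# Copy $x first, then substitute in the copy; the replacement part is
# evaluated as Perl code because of the e modifier.
(my $doubled = $x) =~ s{(\d+)}{$1 * 2}ge;

print "$x\n";
print "$doubled\n";
```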

The split operator

split /regex/, string splits string into a list of substrings and returns that list. The regex determines the character sequence that string is split with respect to. For example, to split a string into words, use

    $x = "Calvin and Hobbes";
    @word = split /\s+/, $x;  # $word[0] = 'Calvin'
                              # $word[1] = 'and'
                              # $word[2] = 'Hobbes'

To extract a comma-delimited list of numbers, use

    $x = "1.618,2.718, 3.142";
    @const = split /,\s*/, $x;  # $const[0] = '1.618'
                                # $const[1] = '2.718'
                                # $const[2] = '3.142'

If the empty regex // is used, the string is split into individual characters. If the regex has groupings, then the list produced contains the matched substrings from the groupings as well:

    $x = "/usr/bin";
    @parts = split m!(/)!, $x;  # $parts[0] = ''
                                # $parts[1] = '/'
                                # $parts[2] = 'usr'
                                # $parts[3] = '/'
                                # $parts[4] = 'bin'

Since the first character of $x matched the regex, split prepended an empty initial element to the list.
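
The empty-regex case mentioned above has no example of its own, so here is a short sketch of it; the sample string is arbitrary.

```perl
use strict;
use warnings;

# An empty regex splits between every pair of characters.
my @chars = split //, "abc";

print scalar(@chars), " characters: @chars\n";
```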

BUGS

None.

SEE ALSO

This is just a quick start guide. For a more in-depth tutorial on regexes, see perlretut and for the reference page, see perlre.

AUTHOR AND COPYRIGHT

Copyright (c) 2000 Mark Kvale All rights reserved.

This document may be distributed under the same terms as Perl itself.

Acknowledgments

The author would like to thank Mark-Jason Dominus, Tom Christiansen, Ilya Zakharevich, Brad Hughes, and Mike Giroux for all their helpful comments.

 
perlreref

NAME

perlreref - Perl Regular Expressions Reference

DESCRIPTION

This is a quick reference to Perl's regular expressions. For full information see perlre and perlop, as well as the SEE ALSO section in this document.

OPERATORS

=~ determines to which variable the regex is applied. In its absence, $_ is used.

    $var =~ /foo/;

!~ determines to which variable the regex is applied, and negates the result of the match; it returns false if the match succeeds, and true if it fails.

    $var !~ /foo/;

m/pattern/msixpogcdual searches a string for a pattern match, applying the given options.

    m  Multiline mode - ^ and $ match internal lines
    s  match as a Single line - . matches \n
    i  case-Insensitive
    x  eXtended legibility - free whitespace and comments
    p  Preserve a copy of the matched string -
       ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be defined.
    o  compile pattern Once
    g  Global - all occurrences
    c  don't reset pos on failed matches when using /g
    a  restrict \d, \s, \w and [:posix:] to match ASCII only
    aa (two a's) also /i matches exclude ASCII/non-ASCII
    l  match according to current locale
    u  match according to Unicode rules
    d  match according to native rules unless something indicates
       Unicode

If 'pattern' is an empty string, the last successfully matched regex is used. Delimiters other than '/' may be used for both this operator and the following ones. The leading m can be omitted if the delimiter is '/'.

qr/pattern/msixpodual lets you store a regex in a variable, or pass one around. Modifiers as for m//, and are stored within the regex.
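
As a brief illustration of how these pieces fit together, the sketch below stores a pattern with qr// and uses the x modifier to lay it out with whitespace and comments. The date format and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# The x modifier lets the pattern span lines and carry comments;
# the modifier is stored inside the compiled regex object.
my $date_re = qr/
    ^ (\d{4})   # year
    - (\d{2})   # month
    - (\d{2})   # day
    $
/x;

my ($year, $month, $day) = "2014-02-09" =~ $date_re;
print "year=$year month=$month day=$day\n";
```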

s/pattern/replacement/msixpogcedual substitutes matches of 'pattern' with 'replacement'. Modifiers as for m//, with two additions:

    e  Evaluate 'replacement' as an expression
    r  Return substitution and leave the original string untouched.

'e' may be specified multiple times. 'replacement' is interpreted as a double quoted string unless a single-quote (') is the delimiter.

?pattern? is like m/pattern/ but matches only once. No alternate delimiters can be used. Must be reset with reset().

SYNTAX

    \       Escapes the character immediately following it
    .       Matches any single character except a newline (unless /s is
              used)
    ^       Matches at the beginning of the string (or line, if /m is used)
    $       Matches at the end of the string (or line, if /m is used)
    *       Matches the preceding element 0 or more times
    +       Matches the preceding element 1 or more times
    ?       Matches the preceding element 0 or 1 times
    {...}   Specifies a range of occurrences for the element preceding it
    [...]   Matches any one of the characters contained within the brackets
    (...)   Groups subexpressions for capturing to $1, $2...
    (?:...) Groups subexpressions without capturing (cluster)
    |       Matches either the subexpression preceding or following it
    \g1 or \g{1}, \g2 ...    Matches the text from the Nth group
    \1, \2, \3 ...           Matches the text from the Nth group
    \g-1 or \g{-1}, \g-2 ... Matches the text from the Nth previous group
    \g{name}     Named backreference
    \k<name>     Named backreference
    \k'name'     Named backreference
    (?P=name)    Named backreference (python syntax)

ESCAPE SEQUENCES

These work as in normal strings.

    \a       Alarm (beep)
    \e       Escape
    \f       Formfeed
    \n       Newline
    \r       Carriage return
    \t       Tab
    \037     Char whose ordinal is the 3 octal digits, max \777
    \o{2307} Char whose ordinal is the octal number, unrestricted
    \x7f     Char whose ordinal is the 2 hex digits, max \xFF
    \x{263a} Char whose ordinal is the hex number, unrestricted
    \cx      Control-x
    \N{name} A named Unicode character or character sequence
    \N{U+263D} A Unicode character by hex ordinal

    \l  Lowercase next character
    \u  Titlecase next character
    \L  Lowercase until \E
    \U  Uppercase until \E
    \F  Foldcase until \E
    \Q  Disable pattern metacharacters until \E
    \E  End modification

For Titlecase, see Titlecase.

This one works differently from normal strings:

    \b  An assertion, not backspace, except in a character class

CHARACTER CLASSES

    [amy]    Match 'a', 'm' or 'y'
    [f-j]    Dash specifies "range"
    [f-j-]   Dash escaped or at start or end means 'dash'
    [^f-j]   Caret indicates "match any character _except_ these"

The following sequences (except \N ) work within or without a character class. The first six are locale aware, all are Unicode aware. See perllocale and perlunicode for details.

    \d      A digit
    \D      A nondigit
    \w      A word character
    \W      A non-word character
    \s      A whitespace character
    \S      A non-whitespace character
    \h      A horizontal whitespace
    \H      A non horizontal whitespace
    \N      A non newline (when not followed by '{NAME}';
            not valid in a character class; equivalent to [^\n]; it's
            like '.' without /s modifier)
    \v      A vertical whitespace
    \V      A non vertical whitespace
    \R      A generic newline           (?>\v|\x0D\x0A)

    \C      Match a byte (with Unicode, '.' matches a character)
    \pP     Match P-named (Unicode) property
    \p{...} Match Unicode property with name longer than 1 character
    \PP     Match non-P
    \P{...} Match lack of Unicode property with name longer than 1 char
    \X      Match Unicode extended grapheme cluster

POSIX character classes and their Unicode and Perl equivalents:

               ASCII-           Full-
      POSIX    range            range        backslash
    [[:...:]]  \p{...}          \p{...}      sequence  Description
    -----------------------------------------------------------------------
    alnum      PosixAlnum       XPosixAlnum            Alpha plus Digit
    alpha      PosixAlpha       XPosixAlpha            Alphabetic characters
    ascii      ASCII                                   Any ASCII character
    blank      PosixBlank       XPosixBlank  \h        Horizontal whitespace;
                                                         full-range also
                                                         written as
                                                         \p{HorizSpace} (GNU
                                                         extension)
    cntrl      PosixCntrl       XPosixCntrl            Control characters
    digit      PosixDigit       XPosixDigit  \d        Decimal digits
    graph      PosixGraph       XPosixGraph            Alnum plus Punct
    lower      PosixLower       XPosixLower            Lowercase characters
    print      PosixPrint       XPosixPrint            Graph plus Space, but
                                                         not any Cntrls
    punct      PosixPunct       XPosixPunct            Punctuation and Symbols
                                                         in ASCII-range; just
                                                         punct outside it
    space      PosixSpace       XPosixSpace            [\s\cK]
               PerlSpace        XPerlSpace   \s        Perl's whitespace def'n
    upper      PosixUpper       XPosixUpper            Uppercase characters
    word       PosixWord        XPosixWord   \w        Alnum + Unicode marks +
                                                         connectors, like '_'
                                                         (Perl extension)
    xdigit     ASCII_Hex_Digit  XPosixXDigit           Hexadecimal digit,
                                                         ASCII-range is
                                                         [0-9A-Fa-f]

There are also various synonyms, such as \p{Alpha} for \p{XPosixAlpha}; all are listed in "Properties accessible through \p{} and \P{}" in perluniprops.

Within a character class:

    POSIX       traditional   Unicode
    [:digit:]       \d        \p{Digit}
    [:^digit:]      \D        \P{Digit}

ANCHORS

All are zero-width assertions.

    ^   Match string start (or line, if /m is used)
    $   Match string end (or line, if /m is used) or before newline
    \b  Match word boundary (between \w and \W)
    \B  Match except at word boundary (between \w and \w or \W and \W)
    \A  Match string start (regardless of /m)
    \Z  Match string end (before optional newline)
    \z  Match absolute string end
    \G  Match where previous m//g left off
    \K  Keep the stuff left of the \K, don't include it in $&

QUANTIFIERS

Quantifiers are greedy by default and match the longest leftmost.

    Maximal Minimal Possessive Allowed range
    ------- ------- ---------- -------------
    {n,m}   {n,m}?  {n,m}+     Must occur at least n times
                               but no more than m times
    {n,}    {n,}?   {n,}+      Must occur at least n times
    {n}     {n}?    {n}+       Must occur exactly n times
    *       *?      *+         0 or more times (same as {0,})
    +       +?      ++         1 or more times (same as {1,})
    ?       ??      ?+         0 or 1 time (same as {0,1})

The possessive forms (new in Perl 5.10) prevent backtracking: what gets matched by a pattern with a possessive quantifier will not be backtracked into, even if that causes the whole match to fail.

There is no quantifier {,n} . That's interpreted as a literal string.
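
The effect of a possessive quantifier is easiest to see on a pattern that only matches if the quantifier gives something back. The sketch below, assuming Perl 5.10 or later and using illustrative variable names, compares the greedy and possessive forms on the same string.

```perl
use strict;
use warnings;

my $str = "aaa";

# Greedy: a+ first takes all three 'a's, then backtracks one so the
# final 'a' in the pattern can match.
my $greedy = $str =~ /^a+a$/ ? 1 : 0;

# Possessive: a++ takes all three 'a's and never gives any back, so the
# final 'a' has nothing left to match and the whole match fails.
my $possessive = $str =~ /^a++a$/ ? 1 : 0;

print "greedy: $greedy, possessive: $possessive\n";
```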

EXTENDED CONSTRUCTS

    (?#text)          A comment
    (?:...)           Groups subexpressions without capturing (cluster)
    (?pimsx-imsx:...) Enable/disable option (as per m// modifiers)
    (?=...)           Zero-width positive lookahead assertion
    (?!...)           Zero-width negative lookahead assertion
    (?<=...)          Zero-width positive lookbehind assertion
    (?<!...)          Zero-width negative lookbehind assertion
    (?>...)           Grab what we can, prohibit backtracking
    (?|...)           Branch reset
    (?<name>...)      Named capture
    (?'name'...)      Named capture
    (?P<name>...)     Named capture (python syntax)
    (?{ code })       Embedded code, return value becomes $^R
    (??{ code })      Dynamic regex, return value used as regex
    (?N)              Recurse into subpattern number N
    (?-N), (?+N)      Recurse into Nth previous/next subpattern
    (?R), (?0)        Recurse at the beginning of the whole pattern
    (?&name)          Recurse into a named subpattern
    (?P>name)         Recurse into a named subpattern (python syntax)
    (?(cond)yes|no)
    (?(cond)yes)      Conditional expression, where "cond" can be:
                      (?=pat)   look-ahead
                      (?!pat)   negative look-ahead
                      (?<=pat)  look-behind
                      (?<!pat)  negative look-behind
                      (N)       subpattern N has matched something
                      (<name>)  named subpattern has matched something
                      ('name')  named subpattern has matched something
                      (?{code}) code condition
                      (R)       true if recursing
                      (RN)      true if recursing into Nth subpattern
                      (R&name)  true if recursing into named subpattern
                      (DEFINE)  always false, no no-pattern allowed

VARIABLES

    $_            Default variable for operators to use
    $`            Everything prior to matched string
    $&            Entire matched string
    $'            Everything after matched string
    ${^PREMATCH}  Everything prior to matched string
    ${^MATCH}     Entire matched string
    ${^POSTMATCH} Everything after matched string

The use of $`, $& or $' will slow down all regex use within your program. Consult perlvar for @- to see equivalent expressions that won't cause a slowdown. See also Devel::SawAmpersand. Starting with Perl 5.10, you can also use the equivalent variables ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}, but for them to be defined, you have to specify the /p (preserve) modifier on your regular expression.
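
The two alternatives mentioned above can be sketched side by side. This is only an illustration with made-up variable names: the first match uses the /p modifier, the second rebuilds the same three pieces from the offsets in @- and @+.

```perl
use strict;
use warnings;

my $text = "Hello World";

# With /p, the ${^...} variables are defined after a successful match.
my ($pre, $match, $post);
if ($text =~ /o W/p) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

# The same pieces, computed from the start and end offsets of the match.
my ($pre2, $match2, $post2);
if ($text =~ /o W/) {
    $pre2   = substr($text, 0, $-[0]);
    $match2 = substr($text, $-[0], $+[0] - $-[0]);
    $post2  = substr($text, $+[0]);
}

print "[$pre][$match][$post]\n";    # prints "[Hell][o W][orld]"
```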

    $1, $2 ...  hold the Xth captured expr
    $+          Last parenthesized pattern match
    $^N         Holds the most recently closed capture
    $^R         Holds the result of the last (?{...}) expr
    @-          Offsets of starts of groups. $-[0] holds start of whole match
    @+          Offsets of ends of groups. $+[0] holds end of whole match
    %+          Named capture groups
    %-          Named capture groups, as array refs

Captured groups are numbered according to their opening paren.
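
A short sketch of how %+, @- and @+ line up after one match. The date string and the group names year, month and day are invented for the example.

```perl
use strict;
use warnings;

my $date = "Released 2014-01-07";
my (%parts, @starts, @ends);

if ($date =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/) {
    %parts  = %+;    # named captures: year, month, day
    @starts = @-;    # $starts[0] is the whole match, $starts[1] is group 1, ...
    @ends   = @+;    # same layout, but end offsets
}

print "year=$parts{year} month=$parts{month} day=$parts{day}\n";
print "whole match spans offsets $starts[0] to $ends[0]\n";
```

Named groups still receive numbers according to their opening parenthesis, so here $starts[2] is the start offset of the month group.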

FUNCTIONS

    lc          Lowercase a string
    lcfirst     Lowercase first char of a string
    uc          Uppercase a string
    ucfirst     Titlecase first char of a string
    fc          Foldcase a string
    pos         Return or set current match position
    quotemeta   Quote metacharacters
    reset       Reset ?pattern? status
    study       Analyze string for optimizing matching
    split       Use a regex to split a string into parts

The first five of these are like the escape sequences \L, \l, \U, \u, and \F. For Titlecase, see Titlecase; for Foldcase, see Foldcase.
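
A brief sketch of a few of these functions in use; the strings and variable names are arbitrary. quotemeta is shown protecting a '+' that would otherwise act as a quantifier.

```perl
use strict;
use warnings;

my $literal = "2+2";
my $quoted  = quotemeta($literal);    # puts a backslash before the '+'

# Unquoted, '+' means "one or more 2s", so the pattern needs "22".
my $raw_matches    = "2+2=4" =~ /$literal/ ? 1 : 0;
# Quoted, '+' is an ordinary character.
my $quoted_matches = "2+2=4" =~ /$quoted/  ? 1 : 0;

my $lower = lc("Hello World");        # "hello world"
my $title = ucfirst(lc("hELLO"));     # "Hello"

print "raw: $raw_matches, quoted: $quoted_matches\n";
```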

TERMINOLOGY

Titlecase

Unicode concept which most often is equal to uppercase, but for certain characters like the German "sharp s" there is a difference.

Foldcase

Unicode form that is useful when comparing strings regardless of case, as certain characters have complex one-to-many case mappings. Primarily a variant of lowercase.

AUTHOR

Iain Truskett. Updated by the Perl 5 Porters.

This document may be distributed under the same terms as Perl itself.

SEE ALSO

THANKS

David P.C. Wollmann, Richard Soderberg, Sean M. Burke, Tom Christiansen, Jim Cromie, and Jeffrey Goff for useful advice.

 
perlretut

Perl 5 version 18.2 documentation

NAME

perlretut - Perl regular expressions tutorial

DESCRIPTION

This page provides a basic tutorial on understanding, creating and using regular expressions in Perl. It serves as a complement to the reference page on regular expressions, perlre. Regular expressions are an integral part of the m//, s///, qr// and split operators, so this tutorial also overlaps with Regexp Quote-Like Operators in perlop and with split.

Perl is widely renowned for excellence in text processing, and regular expressions are one of the big factors behind this fame. Perl regular expressions display an efficiency and flexibility unknown in most other computer languages. Mastering even the basics of regular expressions will allow you to manipulate text with surprising ease.

What is a regular expression? A regular expression is simply a string that describes a pattern. Patterns are in common use these days; examples are the patterns typed into a search engine to find web pages and the patterns used to list files in a directory, e.g., ls *.txt or dir *.*. In Perl, the patterns described by regular expressions are used to search strings, extract desired parts of strings, and to do search and replace operations.

Regular expressions have the undeserved reputation of being abstract and difficult to understand. Regular expressions are constructed using simple concepts like conditionals and loops and are no more difficult to understand than the corresponding if conditionals and while loops in the Perl language itself. In fact, the main challenge in learning regular expressions is just getting used to the terse notation used to express these concepts.

This tutorial flattens the learning curve by discussing regular expression concepts, along with their notation, one at a time and with many examples. The first part of the tutorial will progress from the simplest word searches to the basic regular expression concepts. If you master the first part, you will have all the tools needed to solve about 98% of your needs. The second part of the tutorial is for those comfortable with the basics and hungry for more power tools. It discusses the more advanced regular expression operators and introduces the latest cutting-edge innovations.

A note: to save time, 'regular expression' is often abbreviated as regexp or regex. Regexp is a more natural abbreviation than regex, but is harder to pronounce. The Perl pod documentation is evenly split on regexp vs regex; in Perl, there is more than one way to abbreviate it. We'll use regexp in this tutorial.

Part 1: The basics

Simple word matching

The simplest regexp is simply a word, or more generally, a string of characters. A regexp consisting of a word matches any string that contains that word:

    "Hello World" =~ /World/;  # matches

What is this Perl statement all about? "Hello World" is a simple double-quoted string. World is the regular expression and the // enclosing /World/ tells Perl to search a string for a match. The operator =~ associates the string with the regexp match and produces a true value if the regexp matched, or false if the regexp did not match. In our case, World matches the second word in "Hello World" , so the expression is true. Expressions like this are useful in conditionals:

    if ("Hello World" =~ /World/) {
        print "It matches\n";
    }
    else {
        print "It doesn't match\n";
    }

There are useful variations on this theme. The sense of the match can be reversed by using the !~ operator:

    if ("Hello World" !~ /World/) {
        print "It doesn't match\n";
    }
    else {
        print "It matches\n";
    }

The literal string in the regexp can be replaced by a variable:

    $greeting = "World";
    if ("Hello World" =~ /$greeting/) {
        print "It matches\n";
    }
    else {
        print "It doesn't match\n";
    }

If you're matching against the special default variable $_ , the $_ =~ part can be omitted:

    $_ = "Hello World";
    if (/World/) {
        print "It matches\n";
    }
    else {
        print "It doesn't match\n";
    }

And finally, the // default delimiters for a match can be changed to arbitrary delimiters by putting an 'm' out front:

    "Hello World" =~ m!World!;   # matches, delimited by '!'
    "Hello World" =~ m{World};   # matches, note the matching '{}'
    "/usr/bin/perl" =~ m"/perl"; # matches after '/usr/bin',
                                 # '/' becomes an ordinary char

/World/, m!World!, and m{World} all represent the same thing. When, e.g., the quote (") is used as a delimiter, the forward slash '/' becomes an ordinary character and can be used in this regexp without trouble.

Let's consider how different regexps would match "Hello World" :

    "Hello World" =~ /world/;  # doesn't match
    "Hello World" =~ /o W/;    # matches
    "Hello World" =~ /oW/;     # doesn't match
    "Hello World" =~ /World /; # doesn't match

The first regexp world doesn't match because regexps are case-sensitive. The second regexp matches because the substring 'o W' occurs in the string "Hello World" . The space character ' ' is treated like any other character in a regexp and is needed to match in this case. The lack of a space character is the reason the third regexp 'oW' doesn't match. The fourth regexp 'World ' doesn't match because there is a space at the end of the regexp, but not at the end of the string. The lesson here is that regexps must match a part of the string exactly in order for the statement to be true.

If a regexp matches in more than one place in the string, Perl will always match at the earliest possible point in the string:

    "Hello World" =~ /o/;       # matches 'o' in 'Hello'
    "That hat is red" =~ /hat/; # matches 'hat' in 'That'

With respect to character matching, there are a few more points you need to know about. First of all, not all characters can be used 'as is' in a match. Some characters, called metacharacters, are reserved for use in regexp notation. The metacharacters are

    {}[]()^$.|*+?\

The significance of each of these will be explained in the rest of the tutorial, but for now, it is important only to know that a metacharacter can be matched by putting a backslash before it:

    "2+2=4" =~ /2+2/;    # doesn't match, + is a metacharacter
    "2+2=4" =~ /2\+2/;   # matches, \+ is treated like an ordinary +
    "The interval is [0,1)." =~ /[0,1)./     # is a syntax error!
    "The interval is [0,1)." =~ /\[0,1\)\./  # matches
    "#!/usr/bin/perl" =~ /#!\/usr\/bin\/perl/;  # matches

In the last regexp, the forward slash '/' is also backslashed, because it is used to delimit the regexp. This can lead to LTS (leaning toothpick syndrome), however, and it is often more readable to change delimiters.

    "#!/usr/bin/perl" =~ m!#\!/usr/bin/perl!;  # easier to read

The backslash character '\' is a metacharacter itself and needs to be backslashed:

    'C:\WIN32' =~ /C:\\WIN/;   # matches

In addition to the metacharacters, there are some ASCII characters which don't have printable character equivalents and are instead represented by escape sequences. Common examples are \t for a tab, \n for a newline, \r for a carriage return and \a for a bell (or alert). If your string is better thought of as a sequence of arbitrary bytes, the octal escape sequence, e.g., \033 , or hexadecimal escape sequence, e.g., \x1B may be a more natural representation for your bytes. Here are some examples of escapes:

    "1000\t2000" =~ m(0\t2)     # matches
    "1000\n2000" =~ /0\n20/     # matches
    "1000\t2000" =~ /\000\t2/   # doesn't match, "0" ne "\000"
    "cat" =~ /\o{143}\x61\x74/  # matches in ASCII, but a weird way
                                # to spell cat

If you've been around Perl a while, all this talk of escape sequences may seem familiar. Similar escape sequences are used in double-quoted strings and in fact the regexps in Perl are mostly treated as double-quoted strings. This means that variables can be used in regexps as well. Just like double-quoted strings, the values of the variables in the regexp will be substituted in before the regexp is evaluated for matching purposes. So we have:

    $foo = 'house';
    'housecat' =~ /$foo/;      # matches
    'cathouse' =~ /cat$foo/;   # matches
    'housecat' =~ /${foo}cat/; # matches

So far, so good. With the knowledge above you can already perform searches with just about any literal string regexp you can dream up. Here is a very simple emulation of the Unix grep program:

    % cat > simple_grep
    #!/usr/bin/perl
    $regexp = shift;
    while (<>) {
        print if /$regexp/;
    }
    ^D

    % chmod +x simple_grep

    % simple_grep abba /usr/dict/words
    Babbage
    cabbage
    cabbages
    sabbath
    Sabbathize
    Sabbathizes
    sabbatical
    scabbard
    scabbards
This program is easy to understand. #!/usr/bin/perl is the standard way to invoke a perl program from the shell. $regexp = shift; saves the first command line argument as the regexp to be used, leaving the rest of the command line arguments to be treated as files. while (<>) loops over all the lines in all the files. For each line, print if /$regexp/; prints the line if the regexp matches the line. In this line, both print and /$regexp/ use the default variable $_ implicitly.
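
The same loop can be tried without a words file. The sketch below wraps it in a hypothetical grep_lines subroutine (not part of the original program) and feeds it a small made-up word list.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines that match the regexp string.
sub grep_lines {
    my ($regexp, @lines) = @_;
    my @hits;
    for (@lines) {                     # each line is aliased to $_
        push @hits, $_ if /$regexp/;   # /$regexp/ tests $_ implicitly
    }
    return @hits;
}

my @words = qw(Babbage cabbage rabbit sabbath abbey);
my @found = grep_lines('abba', @words);
print "$_\n" for @found;
```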

With all of the regexps above, if the regexp matched anywhere in the string, it was considered a match. Sometimes, however, we'd like to specify where in the string the regexp should try to match. To do this, we would use the anchor metacharacters ^ and $ . The anchor ^ means match at the beginning of the string and the anchor $ means match at the end of the string, or before a newline at the end of the string. Here is how they are used:

    "housekeeper" =~ /keeper/;    # matches
    "housekeeper" =~ /^keeper/;   # doesn't match
    "housekeeper" =~ /keeper$/;   # matches
    "housekeeper\n" =~ /keeper$/; # matches

The second regexp doesn't match because ^ constrains keeper to match only at the beginning of the string, but "housekeeper" has keeper starting in the middle. The third regexp does match, since the $ constrains keeper to match only at the end of the string.

When both ^ and $ are used at the same time, the regexp has to match both the beginning and the end of the string, i.e., the regexp matches the whole string. Consider

    "keeper" =~ /^keep$/;      # doesn't match
    "keeper" =~ /^keeper$/;    # matches
    ""       =~ /^$/;          # ^$ matches an empty string

The first regexp doesn't match because the string has more to it than keep . Since the second regexp is exactly the string, it matches. Using both ^ and $ in a regexp forces the complete string to match, so it gives you complete control over which strings match and which don't. Suppose you are looking for a fellow named bert, off in a string by himself:

    "dogbert" =~ /bert/;   # matches, but not what you want

    "dilbert" =~ /^bert/;  # doesn't match, but ..
    "bertram" =~ /^bert/;  # matches, so still not good enough

    "bertram" =~ /^bert$/; # doesn't match, good
    "dilbert" =~ /^bert$/; # doesn't match, good
    "bert"    =~ /^bert$/; # matches, perfect

Of course, in the case of a literal string, one could just as easily use the string comparison $string eq 'bert' and it would be more efficient. The ^...$ regexp really becomes useful when we add in the more powerful regexp tools below.

Using character classes

Although one can already do quite a lot with the literal string regexps above, we've only scratched the surface of regular expression technology. In this and subsequent sections we will introduce regexp concepts (and associated metacharacter notations) that will allow a regexp to represent not just a single character sequence, but a whole class of them.

One such concept is that of a character class. A character class allows a set of possible characters, rather than just a single character, to match at a particular point in a regexp. Character classes are denoted by brackets [...] , with the set of characters to be possibly matched inside. Here are some examples:

    /cat/;               # matches 'cat'
    /[bcr]at/;           # matches 'bat', 'cat', or 'rat'
    /item[0123456789]/;  # matches 'item0' or ... or 'item9'
    "abc" =~ /[cab]/;    # matches 'a'

In the last statement, even though 'c' is the first character in the class, 'a' matches because the first character position in the string is the earliest point at which the regexp can match.

    /[yY][eE][sS]/;      # match 'yes' in a case-insensitive way
                         # 'yes', 'Yes', 'YES', etc.

This regexp displays a common task: perform a case-insensitive match. Perl provides a way of avoiding all those brackets by simply appending an 'i' to the end of the match. Then /[yY][eE][sS]/; can be rewritten as /yes/i; . The 'i' stands for case-insensitive and is an example of a modifier of the matching operation. We will meet other modifiers later in the tutorial.

We saw in the section above that there were ordinary characters, which represented themselves, and special characters, which needed a backslash \ to represent themselves. The same is true in a character class, but the sets of ordinary and special characters inside a character class are different than those outside a character class. The special characters for a character class are -]\^$ (and the pattern delimiter, whatever it is). ] is special because it denotes the end of a character class. $ is special because it denotes a scalar variable. \ is special because it is used in escape sequences, just like above. Here is how the special characters ]$\ are handled:

    /[\]c]def/; # matches ']def' or 'cdef'
    $x = 'bcr';
    /[$x]at/;   # matches 'bat', 'cat', or 'rat'
    /[\$x]at/;  # matches '$at' or 'xat'
    /[\\$x]at/; # matches '\at', 'bat', 'cat', or 'rat'

The last two are a little tricky. In [\$x] , the backslash protects the dollar sign, so the character class has two members $ and x . In [\\$x] , the backslash is protected, so $x is treated as a variable and substituted in double quote fashion.

The special character '-' acts as a range operator within character classes, so that a contiguous set of characters can be written as a range. With ranges, the unwieldy [0123456789] and [abc...xyz] become the svelte [0-9] and [a-z] . Some examples are

    /item[0-9]/;    # matches 'item0' or ... or 'item9'
    /[0-9bx-z]aa/;  # matches '0aa', ..., '9aa',
                    # 'baa', 'xaa', 'yaa', or 'zaa'
    /[0-9a-fA-F]/;  # matches a hexadecimal digit
    /[0-9a-zA-Z_]/; # matches a "word" character,
                    # like those in a Perl variable name

If '-' is the first or last character in a character class, it is treated as an ordinary character; [-ab] , [ab-] and [a\-b] are all equivalent.

The special character ^ in the first position of a character class denotes a negated character class, which matches any character but those in the brackets. Both [...] and [^...] must match a character, or the match fails. Then

    /[^a]at/;  # doesn't match 'aat' or 'at', but matches
               # all other 'bat', 'cat', '0at', '%at', etc.
    /[^0-9]/;  # matches a non-numeric character
    /[a^]at/;  # matches 'aat' or '^at'; here '^' is ordinary

Now, even [0-9] can be a bother to write multiple times, so in the interest of saving keystrokes and making regexps more readable, Perl has several abbreviations for common character classes, as shown below. Since the introduction of Unicode, unless the //a modifier is in effect, these character classes match more than just a few characters in the ASCII range.

  • \d matches a digit, not just [0-9] but also digits from non-roman scripts

  • \s matches a whitespace character, the set [\ \t\r\n\f] and others

  • \w matches a word character (alphanumeric or _), not just [0-9a-zA-Z_] but also digits and characters from non-roman scripts

  • \D is a negated \d; it represents any other character than a digit, or [^\d]

  • \S is a negated \s; it represents any non-whitespace character [^\s]

  • \W is a negated \w; it represents any non-word character [^\w]

  • The period '.' matches any character but "\n" (unless the modifier //s is in effect, as explained below).

  • \N, like the period, matches any character but "\n", but it does so regardless of whether the modifier //s is in effect.

The //a modifier, available starting in Perl 5.14, is used to restrict the matches of \d, \s, and \w to just those in the ASCII range. It is useful to keep your program from being needlessly exposed to full Unicode (and its accompanying security considerations) when all you want is to process English-like text. (The "a" may be doubled, //aa , to provide even more restrictions, preventing case-insensitive matching of ASCII with non-ASCII characters; otherwise a Unicode "Kelvin Sign" would caselessly match a "k" or "K".)
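
A minimal sketch of the effect of //a and //aa, assuming Perl 5.14 or later. The two non-ASCII characters are written as code points so the example does not depend on the source file encoding; the variable names are illustrative.

```perl
use strict;
use warnings;

my $arabic_three = "\x{0663}";    # ARABIC-INDIC DIGIT THREE
my $kelvin       = "\x{212A}";    # KELVIN SIGN

# Without //a, \d accepts digits from any script.
my $unicode_digit = $arabic_three =~ /^\d$/  ? 1 : 0;
# With //a, \d is restricted to the ASCII digits 0-9.
my $ascii_digit   = $arabic_three =~ /^\d$/a ? 1 : 0;

# Case-insensitive matching lets the Kelvin sign match 'k' ...
my $kelvin_i  = $kelvin =~ /^k$/i   ? 1 : 0;
# ... unless //aa forbids ASCII/non-ASCII caseless matches.
my $kelvin_aa = $kelvin =~ /^k$/iaa ? 1 : 0;

print "digit: $unicode_digit/$ascii_digit, kelvin: $kelvin_i/$kelvin_aa\n";
```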

The \d\s\w\D\S\W abbreviations can be used both inside and outside of character classes. Here are some in use:

    /\d\d:\d\d:\d\d/; # matches a hh:mm:ss time format
    /[\d\s]/;         # matches any digit or whitespace character
    /\w\W\w/;         # matches a word char, followed by a
                      # non-word char, followed by a word char
    /..rt/;           # matches any two chars, followed by 'rt'
    /end\./;          # matches 'end.'
    /end[.]/;         # same thing, matches 'end.'

Because a period is a metacharacter, it needs to be escaped to match as an ordinary period. Because, for example, \d and \w are sets of characters, it is incorrect to think of [^\d\w] as [\D\W]; in fact [^\d\w] is the same as [^\w], which is the same as [\W]. Think De Morgan's laws.
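
The claim about [^\d\w] can be checked by brute force. This sketch tries every ASCII character against the three classes; the counter names are made up for the example.

```perl
use strict;
use warnings;

my ($same, $differ) = (0, 0);
for my $code (0 .. 127) {
    my $char       = chr $code;
    my $neither    = $char =~ /[^\d\w]/ ? 1 : 0;  # not a digit and not a word char
    my $non_word   = $char =~ /[\W]/    ? 1 : 0;  # not a word char
    my $either_not = $char =~ /[\D\W]/  ? 1 : 0;  # a non-digit or a non-word char

    $same++   if $neither == $non_word;
    $differ++ if $neither != $either_not;
}

print "[^\\d\\w] agrees with [\\W] on $same of 128 characters\n";
print "[^\\d\\w] disagrees with [\\D\\W] on $differ characters\n";
```

The disagreements are the 52 ASCII letters and the underscore: each is a non-digit, so [\D\W] accepts it, but it is a word character, so [^\d\w] rejects it.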

An anchor useful in basic regexps is the word anchor \b . This matches a boundary between a word character and a non-word character \w\W or \W\w :

    $x = "Housecat catenates house and cat";
    $x =~ /cat/;     # matches cat in 'Housecat'
    $x =~ /\bcat/;   # matches cat in 'catenates'
    $x =~ /cat\b/;   # matches cat in 'Housecat'
    $x =~ /\bcat\b/; # matches 'cat' at end of string

Note in the last example, the end of the string is considered a word boundary.
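
To see where each of these matches lands, the sketch below reports the start offset of the match, taken from $-[0]. The match_start helper is hypothetical and returns -1 when there is no match.

```perl
use strict;
use warnings;

# Hypothetical helper: offset at which the regexp first matches, or -1.
sub match_start {
    my ($string, $regexp) = @_;
    return $string =~ $regexp ? $-[0] : -1;
}

my $x = "Housecat catenates house and cat";
my @starts = map { match_start($x, $_) }
             (qr/cat/, qr/\bcat/, qr/cat\b/, qr/\bcat\b/);

print "@starts\n";    # prints "5 9 5 29"
```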

You might wonder why '.' matches everything but "\n" - why not every character? The reason is that often one is matching against lines and would like to ignore the newline characters. For instance, while the string "\n" represents one line, we would like to think of it as empty. Then

    ""    =~ /^$/;   # matches
    "\n"  =~ /^$/;   # matches, $ anchors before "\n"

    ""    =~ /./;    # doesn't match; it needs a char
    ""    =~ /^.$/;  # doesn't match; it needs a char
    "\n"  =~ /^.$/;  # doesn't match; it needs a char other than "\n"
    "a"   =~ /^.$/;  # matches
    "a\n" =~ /^.$/;  # matches, $ anchors before "\n"

This behavior is convenient, because we usually want to ignore newlines when we count and match characters in a line. Sometimes, however, we want to keep track of newlines. We might even want ^ and $ to anchor at the beginning and end of lines within the string, rather than just the beginning and end of the string. Perl allows us to choose between ignoring and paying attention to newlines by using the //s and //m modifiers. //s and //m stand for single line and multi-line and they determine whether a string is to be treated as one continuous string, or as a set of lines. The two modifiers affect two aspects of how the regexp is interpreted: 1) how the '.' character class is defined, and 2) where the anchors ^ and $ are able to match. Here are the four possible combinations:

  • no modifiers (//): Default behavior. '.' matches any character except "\n" . ^ matches only at the beginning of the string and $ matches only at the end or before a newline at the end.

  • s modifier (//s): Treat string as a single long line. '.' matches any character, even "\n" . ^ matches only at the beginning of the string and $ matches only at the end or before a newline at the end.

  • m modifier (//m): Treat string as a set of multiple lines. '.' matches any character except "\n" . ^ and $ are able to match at the start or end of any line within the string.

  • both s and m modifiers (//sm): Treat string as a single long line, but detect multiple lines. '.' matches any character, even "\n" . ^ and $ , however, are able to match at the start or end of any line within the string.

Here are examples of //s and //m in action:

    $x = "There once was a girl\nWho programmed in Perl\n";

    $x =~ /^Who/;   # doesn't match, "Who" not at start of string
    $x =~ /^Who/s;  # doesn't match, "Who" not at start of string
    $x =~ /^Who/m;  # matches, "Who" at start of second line
    $x =~ /^Who/sm; # matches, "Who" at start of second line

    $x =~ /girl.Who/;   # doesn't match, "." doesn't match "\n"
    $x =~ /girl.Who/s;  # matches, "." matches "\n"
    $x =~ /girl.Who/m;  # doesn't match, "." doesn't match "\n"
    $x =~ /girl.Who/sm; # matches, "." matches "\n"

Most of the time, the default behavior is what is wanted, but //s and //m are occasionally very useful. If //m is being used, the start of the string can still be matched with \A and the end of the string can still be matched with the anchors \Z (matches both the end and the newline before, like $ ), and \z (matches only the end):

    $x =~ /^Who/m;   # matches, "Who" at start of second line
    $x =~ /\AWho/m;  # doesn't match, "Who" is not at start of string

    $x =~ /girl$/m;  # matches, "girl" at end of first line
    $x =~ /girl\Z/m; # doesn't match, "girl" is not at end of string

    $x =~ /Perl\Z/m; # matches, "Perl" is at newline before end
    $x =~ /Perl\z/m; # doesn't match, "Perl" is not at end of string
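
The comments above can be verified by recording each result as 1 or 0. This is just a sketch on the same two-line string, with illustrative variable names.

```perl
use strict;
use warnings;

my $x = "There once was a girl\nWho programmed in Perl\n";

my $dollar_default = $x =~ /girl$/   ? 1 : 0;  # $ only anchors at the very end
my $dollar_m       = $x =~ /girl$/m  ? 1 : 0;  # //m lets $ anchor at any line end
my $start_a        = $x =~ /\AWho/m  ? 1 : 0;  # \A ignores //m
my $big_z          = $x =~ /Perl\Z/m ? 1 : 0;  # \Z allows a final newline
my $small_z        = $x =~ /Perl\z/m ? 1 : 0;  # \z does not

print "$dollar_default $dollar_m $start_a $big_z $small_z\n";  # prints "0 1 0 1 0"
```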

We now know how to create choices among classes of characters in a regexp. What about choices among words or character strings? Such choices are described in the next section.

Matching this or that

Sometimes we would like our regexp to be able to match different possible words or character strings. This is accomplished by using the alternation metacharacter |. To match dog or cat , we form the regexp dog|cat . As before, Perl will try to match the regexp at the earliest possible point in the string. At each character position, Perl will first try to match the first alternative, dog . If dog doesn't match, Perl will then try the next alternative, cat . If cat doesn't match either, then the match fails and Perl moves to the next position in the string. Some examples:

    "cats and dogs" =~ /cat|dog|bird/;  # matches "cat"
    "cats and dogs" =~ /dog|cat|bird/;  # matches "cat"

Even though dog is the first alternative in the second regexp, cat is able to match earlier in the string.

    "cats" =~ /c|ca|cat|cats/; # matches "c"
    "cats" =~ /cats|cat|ca|c/; # matches "cats"

Here, all the alternatives match at the first string position, so the first alternative is the one that matches. If some of the alternatives are truncations of the others, put the longest ones first to give them a chance to match.

    "cab" =~ /a|b|c/ # matches "c"
                     # /a|b|c/ == /[abc]/

The last example points out that character classes are like alternations of characters. At a given character position, the first alternative that allows the regexp match to succeed will be the one that matches.
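
To watch the ordering rules at work, the following sketch returns the text that actually matched. The first_match helper is hypothetical; it relies on the /p modifier and ${^MATCH}, so it assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: the substring matched by the regexp, or undef.
sub first_match {
    my ($string, $regexp) = @_;
    return $string =~ /$regexp/p ? ${^MATCH} : undef;
}

# The earliest position wins, whatever the order of the alternatives.
my $earliest = first_match("cats and dogs", 'dog|cat|bird');   # "cat"

# At a single position, the first alternative that works wins.
my $short = first_match("cats", 'c|ca|cat|cats');              # "c"
my $long  = first_match("cats", 'cats|cat|ca|c');              # "cats"

print "$earliest $short $long\n";
```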

Grouping things and hierarchical matching

Alternation allows a regexp to choose among alternatives, but by itself it is unsatisfying. The reason is that each alternative is a whole regexp, but sometimes we want alternatives for just part of a regexp. For instance, suppose we want to search for housecats or housekeepers. The regexp housecat|housekeeper fits the bill, but is inefficient because we had to type house twice. It would be nice to have parts of the regexp be constant, like house, and some parts have alternatives, like cat|keeper.

The grouping metacharacters () solve this problem. Grouping allows parts of a regexp to be treated as a single unit. Parts of a regexp are grouped by enclosing them in parentheses. Thus we could solve the housecat|housekeeper problem by forming the regexp as house(cat|keeper). The regexp house(cat|keeper) means match house followed by either cat or keeper. Some more examples are

    /(a|b)b/;    # matches 'ab' or 'bb'
    /(ac|b)b/;   # matches 'acb' or 'bb'
    /(^a|b)c/;   # matches 'ac' at start of string or 'bc' anywhere
    /(a|[bc])d/; # matches 'ad', 'bd', or 'cd'

    /house(cat|)/;      # matches either 'housecat' or 'house'
    /house(cat(s|)|)/;  # matches either 'housecats' or 'housecat' or
                        # 'house'. Note groups can be nested.

    /(19|20|)\d\d/;  # match years 19xx, 20xx, or the Y2K problem, xx
    "20" =~ /(19|20|)\d\d/;  # matches the null alternative '()\d\d',
                             # because '20\d\d' can't match

Alternations behave the same way in groups as out of them: at a given string position, the leftmost alternative that allows the regexp to match is taken. So in the last example at the first string position, "20" matches the second alternative, but there is nothing left over to match the next two digits \d\d . So Perl moves on to the next alternative, which is the null alternative and that works, since "20" is two digits.

Backtracking is the process of trying one alternative, seeing whether it matches, and, if it doesn't, going back in the string to the point where that alternative was first tried and moving on to the next alternative. The term 'backtracking' comes from the idea that matching a regexp is like a walk in the woods. Successfully matching a regexp is like arriving at a destination. There are many possible trailheads, one for each string position, and each one is tried in order, left to right. From each trailhead there may be many paths, some of which get you there, and some of which are dead ends. When you walk along a trail and hit a dead end, you have to backtrack along the trail to an earlier point to try another trail. If you hit your destination, you stop immediately and forget about trying all the other trails. You are persistent, and only if you have tried all the trails from all the trailheads and not arrived at your destination do you declare failure.

To be concrete, here is a step-by-step analysis of what Perl does when it tries to match the regexp

    "abcde" =~ /(abd|abc)(df|d|de)/;

  0. Start with the first letter in the string 'a'.

  1. Try the first alternative in the first group 'abd'.

  2. Match 'a' followed by 'b'. So far so good.

  3. 'd' in the regexp doesn't match 'c' in the string - a dead end. So backtrack two characters and pick the second alternative in the first group 'abc'.

  4. Match 'a' followed by 'b' followed by 'c'. We are on a roll and have satisfied the first group. Set $1 to 'abc'.

  5. Move on to the second group and pick the first alternative 'df'.

  6. Match the 'd'.

  7. 'f' in the regexp doesn't match 'e' in the string, so a dead end. Backtrack one character and pick the second alternative in the second group 'd'.

  8. 'd' matches. The second grouping is satisfied, so set $2 to 'd'.

  9. We are at the end of the regexp, so we are done! We have matched 'abcd' out of the string "abcde".

There are a couple of things to note about this analysis. First, the third alternative in the second group 'de' also allows a match, but we stopped before we got to it - at a given character position, leftmost wins. Second, we were able to get a match at the first character position of the string 'a'. If there were no matches at the first position, Perl would move to the second character position 'b' and attempt the match all over again. Only when all possible paths at all possible character positions have been exhausted does Perl give up and declare $string =~ /(abd|abc)(df|d|de)/; to be false.
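
The outcome of the analysis can be confirmed directly. The sketch below uses a hypothetical trace_match helper that returns $1, $2 and the whole matched text; the second call reorders the alternatives in the second group to show that 'de' wins only when it is tried before 'd'.

```perl
use strict;
use warnings;

# Hypothetical helper: ($1, $2, whole match) on success, empty list on failure.
sub trace_match {
    my ($string, $regexp) = @_;
    return unless $string =~ $regexp;
    return ($1, $2, substr($string, $-[0], $+[0] - $-[0]));
}

my @leftmost  = trace_match("abcde", qr/(abd|abc)(df|d|de)/);
my @reordered = trace_match("abcde", qr/(abd|abc)(df|de|d)/);

print "@leftmost\n";     # prints "abc d abcd"
print "@reordered\n";    # prints "abc de abcde"
```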

Even with all this work, regexp matching happens remarkably fast. To speed things up, Perl compiles the regexp into a compact sequence of opcodes that can often fit inside a processor cache. When the code is executed, these opcodes can then run at full throttle and search very quickly.

Extracting matches

The grouping metacharacters () also serve another completely different function: they allow the extraction of the parts of a string that matched. This is very useful to find out what matched and for text processing in general. For each grouping, the part that matched inside goes into the special variables $1 , $2 , etc. They can be used just as ordinary variables:

    # extract hours, minutes, seconds
    if ($time =~ /(\d\d):(\d\d):(\d\d)/) {    # match hh:mm:ss format
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }

Now, we know that in scalar context, $time =~ /(\d\d):(\d\d):(\d\d)/ returns a true or false value. In list context, however, it returns the list of matched values ($1,$2,$3) . So we could write the code more compactly as

    # extract hours, minutes, seconds
    ($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/);
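
A runnable version of the list-context form, as a sketch with an invented input string. It also shows that a failed match in list context yields an empty list.

```perl
use strict;
use warnings;

my $time = "The time is 12:34:56 now";

# In list context the match returns ($1, $2, $3).
my ($hours, $minutes, $seconds) = $time =~ /(\d\d):(\d\d):(\d\d)/;

# A failed match returns the empty list, so the array stays empty.
my @none = "no time here" =~ /(\d\d):(\d\d):(\d\d)/;

print "$hours h, $minutes m, $seconds s\n";
print "failed match returned ", scalar(@none), " values\n";
```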

If the groupings in a regexp are nested, $1 gets the group with the leftmost opening parenthesis, $2 the next opening parenthesis, etc. Here is a regexp with nested groups:

    /(ab(cd|ef)((gi)|j))/;
     1  2      34

If this regexp matches, $1 contains a string starting with 'ab' , $2 is either set to 'cd' or 'ef' , $3 equals either 'gi' or 'j' , and $4 is either set to 'gi' , just like $3 , or it remains undefined.
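The numbering can be checked with a small sketch. The two sample strings are made up for illustration; in list context the match returns the values of $1 through $4, with undef for a group that took no part in the match.

```perl
use strict;
use warnings;

my $re = qr/(ab(cd|ef)((gi)|j))/;

# Groups are numbered by the position of their opening parenthesis.
my @first  = ("abcdgi" =~ $re);   # the (gi) branch is taken, so $4 is set
my @second = ("abefj"  =~ $re);   # the j branch is taken, so $4 stays undefined

for my $groups (\@first, \@second) {
    print join(", ", map { defined $_ ? "'$_'" : "undef" } @$groups), "\n";
}
```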

For convenience, Perl sets $+ to the string held by the highest numbered $1 , $2 ,... that got assigned (and, somewhat related, $^N to the value of the $1 , $2 ,... most-recently assigned; i.e. the $1 , $2 ,... associated with the rightmost closing parenthesis used in the match).

Backreferences

Closely associated with the matching variables $1 , $2 , ... are the backreferences \g1 , \g2 ,... Backreferences are simply matching variables that can be used inside a regexp. This is a really nice feature; what matches later in a regexp is made to depend on what matched earlier in the regexp. Suppose we wanted to look for doubled words in a text, like 'the the'. The following regexp finds all 3-letter doubles with a space in between:

    /\b(\w\w\w)\s\g1\b/;

The grouping assigns a value to \g1, so that the same 3-letter sequence is used for both parts.

A similar task is to find words consisting of two identical parts:

    % simple_grep '^(\w\w\w\w|\w\w\w|\w\w|\w)\g1$' /usr/dict/words
    beriberi
    booboo
    coco
    mama
    murmur
    papa

The regexp has a single grouping which considers 4-letter combinations, then 3-letter combinations, etc., and uses \g1 to look for a repeat. Although $1 and \g1 represent the same thing, care should be taken to use matched variables $1 , $2 ,... only outside a regexp and backreferences \g1 , \g2 ,... only inside a regexp; not doing so may lead to surprising and unsatisfactory results.
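As a rough illustration of a backreference used inside a regexp and the matching variable used outside it, the sketch below wraps the 3-letter doubled-word pattern in a helper. The name doubled_word and the sample strings are assumptions made for this example.

```perl
use strict;
use warnings;

# doubled_word() is a hypothetical helper: it returns the repeated
# 3-letter word, or undef if the text contains no such double.
sub doubled_word {
    my ($text) = @_;
    # \g1 (inside the regexp) must match the same text as group 1;
    # $1 (outside the regexp) hands that text back to the caller.
    return $text =~ /\b(\w\w\w)\s\g1\b/ ? $1 : undef;
}

my $found = doubled_word("Paris in the the spring");
my $none  = doubled_word("the then");   # final \b rejects 'the' inside 'then'

print "found: $found\n";
print "nothing doubled in 'the then'\n" unless defined $none;
```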

Relative backreferences

Counting the opening parentheses to get the correct number for a backreference is error-prone as soon as there is more than one capturing group. A more convenient technique became available with Perl 5.10: relative backreferences. To refer to the immediately preceding capture group one may now write \g{-1}, the one before that is available via \g{-2}, and so on.

Another good reason in addition to readability and maintainability for using relative backreferences is illustrated by the following example, where a simple pattern for matching peculiar strings is used:

    $a99a = '([a-z])(\d)\g2\g1';   # matches a11a, g22g, x33x, etc.

Now that we have this pattern stored as a handy string, we might feel tempted to use it as a part of some other pattern:

    $line = "code=e99e";
    if ($line =~ /^(\w+)=$a99a$/){   # unexpected behavior!
        print "$1 is valid\n";
    } else {
        print "bad line: '$line'\n";
    }

But this doesn't match, at least not the way one might expect. Only after inserting the interpolated $a99a and looking at the resulting full text of the regexp is it obvious that the backreferences have backfired. The subexpression (\w+) has snatched number 1 and demoted the groups in $a99a by one rank. This can be avoided by using relative backreferences:

    $a99a = '([a-z])(\d)\g{-1}\g{-2}';    # safe for being interpolated
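The difference can be seen side by side in the following sketch. The variable names $absolute and $relative are chosen here for illustration; both hold the patterns discussed above and are interpolated into the same enclosing regexp.

```perl
use strict;
use warnings;

my $line = "code=e99e";

my $absolute = '([a-z])(\d)\g2\g1';         # group numbers shift once embedded
my $relative = '([a-z])(\d)\g{-1}\g{-2}';   # counts back from its own position

# Inside the larger regexp, (\w+) becomes group 1, so \g1 in $absolute
# now refers to 'code' instead of the letter.
my $abs_ok = $line =~ /^(\w+)=$absolute$/ ? 1 : 0;
my $rel_ok = $line =~ /^(\w+)=$relative$/ ? 1 : 0;

print "absolute: $abs_ok, relative: $rel_ok\n";
```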

Named backreferences

Perl 5.10 also introduced named capture groups and named backreferences. To attach a name to a capturing group, you write either (?<name>...) or (?'name'...). The backreference may then be written as \g{name} . It is permissible to attach the same name to more than one group, but then only the leftmost one of the eponymous set can be referenced. Outside of the pattern a named capture group is accessible through the %+ hash.

Assuming that we have to match calendar dates which may be given in one of the three formats yyyy-mm-dd, mm/dd/yyyy or dd.mm.yyyy, we can write three suitable patterns where we use 'd', 'm' and 'y' respectively as the names of the groups capturing the pertaining components of a date. The matching operation combines the three patterns as alternatives:

    $fmt1 = '(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)';
    $fmt2 = '(?<m>\d\d)/(?<d>\d\d)/(?<y>\d\d\d\d)';
    $fmt3 = '(?<d>\d\d)\.(?<m>\d\d)\.(?<y>\d\d\d\d)';
    for my $d (qw( 2006-10-21 15.01.2007 10/31/2005 )) {
        if ( $d =~ m{$fmt1|$fmt2|$fmt3} ){
            print "day=$+{d} month=$+{m} year=$+{y}\n";
        }
    }

If any of the alternatives matches, the hash %+ is bound to contain the three key-value pairs.
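Because every alternative uses the same three names, code that consumes the match does not need to know which format matched. The sketch below uses that to normalize dates; to_iso is a hypothetical helper introduced only for this example.

```perl
use strict;
use warnings;

my $fmt1 = '(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)';
my $fmt2 = '(?<m>\d\d)/(?<d>\d\d)/(?<y>\d\d\d\d)';
my $fmt3 = '(?<d>\d\d)\.(?<m>\d\d)\.(?<y>\d\d\d\d)';

# to_iso() is a hypothetical helper: it rewrites a date in any of the
# three formats as yyyy-mm-dd, reading the pieces from the %+ hash.
sub to_iso {
    my ($date) = @_;
    return unless $date =~ m{$fmt1|$fmt2|$fmt3};
    return "$+{y}-$+{m}-$+{d}";
}

my @iso = map { to_iso($_) } qw( 2006-10-21 15.01.2007 10/31/2005 );
print "$_\n" for @iso;
```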

Alternative capture group numbering

Yet another capturing group numbering technique (also as from Perl 5.10) deals with the problem of referring to groups within a set of alternatives. Consider a pattern for matching a time of the day, civil or military style:

    if ( $time =~ /(\d\d|\d):(\d\d)|(\d\d)(\d\d)/ ){
        # process hour and minute
    }

Processing the results requires an additional if statement to determine whether $1 and $2 or $3 and $4 contain the goodies. It would be easier if we could use group numbers 1 and 2 in second alternative as well, and this is exactly what the parenthesized construct (?|...), set around an alternative achieves. Here is an extended version of the previous pattern:

    if ( $time =~ /(?|(\d\d|\d):(\d\d)|(\d\d)(\d\d))\s+([A-Z][A-Z][A-Z])/ ){
        print "hour=$1 minute=$2 zone=$3\n";
    }

Within the alternative numbering group, group numbers start at the same position for each alternative. After the group, numbering continues with one higher than the maximum reached across all the alternatives.
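A short sketch, with two made-up sample times, shows both alternatives delivering their results in the same group numbers:

```perl
use strict;
use warnings;

my $re = qr/(?|(\d\d|\d):(\d\d)|(\d\d)(\d\d))\s+([A-Z][A-Z][A-Z])/;

# In list context the match returns ($1, $2, $3).
my @civil    = ("9:05 EST" =~ $re);   # first alternative fills groups 1 and 2
my @military = ("1430 UTC" =~ $re);   # second alternative reuses groups 1 and 2

for my $t (\@civil, \@military) {
    print "hour=$t->[0] minute=$t->[1] zone=$t->[2]\n";
}
```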

Position information

In addition to what was matched, Perl also provides the positions of what was matched as contents of the @- and @+ arrays. $-[0] is the position of the start of the entire match and $+[0] is the position of the end. Similarly, $-[n] is the position of the start of the $n match and $+[n] is the position of the end. If $n is undefined, so are $-[n] and $+[n] . Then this code

    $x = "Mmm...donut, thought Homer";
    $x =~ /^(Mmm|Yech)\.\.\.(donut|peas)/; # matches
    foreach $expr (1..$#-) {
        print "Match $expr: '${$expr}' at position ($-[$expr],$+[$expr])\n";
    }

prints

    Match 1: 'Mmm' at position (0,3)
    Match 2: 'donut' at position (6,11)

Even if there are no groupings in a regexp, it is still possible to find out what exactly matched in a string. If you use them, Perl will set $` to the part of the string before the match, will set $& to the part of the string that matched, and will set $' to the part of the string after the match. An example:

    $x = "the cat caught the mouse";
    $x =~ /cat/;  # $` = 'the ', $& = 'cat', $' = ' caught the mouse'
    $x =~ /the/;  # $` = '', $& = 'the', $' = ' cat caught the mouse'

In the second match, $` equals '' because the regexp matched at the first character position in the string and stopped; it never saw the second 'the'. It is important to note that using $` and $' slows down regexp matching quite a bit, while $& slows it down to a lesser extent, because if they are used in one regexp in a program, they are generated for all regexps in the program. So if raw performance is a goal of your application, they should be avoided. If you need to extract the corresponding substrings, use @- and @+ instead:

    $` is the same as substr( $x, 0, $-[0] )
    $& is the same as substr( $x, $-[0], $+[0]-$-[0] )
    $' is the same as substr( $x, $+[0] )

As of Perl 5.10, the ${^PREMATCH} , ${^MATCH} and ${^POSTMATCH} variables may be used. These are only set if the /p modifier is present. Consequently they do not penalize the rest of the program.
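Both alternatives can be tried on the earlier example string. This is only a sketch; the variable names are chosen for illustration.

```perl
use strict;
use warnings;

my $x = "the cat caught the mouse";

# Rebuild the three pieces from the offsets in @- and @+.
$x =~ /cat/ or die "no match";
my $before = substr($x, 0, $-[0]);               # what $` would hold
my $match  = substr($x, $-[0], $+[0] - $-[0]);   # what $& would hold
my $after  = substr($x, $+[0]);                  # what $' would hold

# With the /p modifier the same pieces are available by name.
$x =~ /cat/p or die "no match";
my @named = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});

print join("|", $before, $match, $after), "\n";
print join("|", @named), "\n";
```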

Non-capturing groupings

A group that is required to bundle a set of alternatives may or may not be useful as a capturing group. If it isn't, it just creates a superfluous addition to the set of available capture group values, inside as well as outside the regexp. Non-capturing groupings, denoted by (?:regexp), still allow the regexp to be treated as a single unit, but don't establish a capturing group at the same time. Both capturing and non-capturing groupings are allowed to co-exist in the same regexp. Because there is no extraction, non-capturing groupings are faster than capturing groupings. Non-capturing groupings are also handy for choosing exactly which parts of a regexp are to be extracted to matching variables:

    # match a number, $1-$4 are set, but we only want $1
    /([+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?)/;

    # match a number faster, only $1 is set
    /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?)/;

    # match a number, get $1 = whole number, $2 = exponent
    /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE]([+-]?\d+))?)/;

Non-capturing groupings are also useful for removing nuisance elements gathered from a split operation where parentheses are required for some reason:

    $x = '12aba34ba5';
    @num = split /(a|b)+/, $x;    # @num = ('12','a','34','a','5')
    @num = split /(?:a|b)+/, $x;  # @num = ('12','34','5')

Matching repetitions

The examples in the previous section display an annoying weakness. We were only matching 3-letter words, or chunks of words of 4 letters or less. We'd like to be able to match words or, more generally, strings of any length, without writing out tedious alternatives like \w\w\w\w|\w\w\w|\w\w|\w .

This is exactly the problem the quantifier metacharacters ?, * , + , and {} were created for. They allow us to delimit the number of repeats for a portion of a regexp we consider to be a match. Quantifiers are put immediately after the character, character class, or grouping that we want to specify. They have the following meanings:

  • a? means: match 'a' 1 or 0 times

  • a* means: match 'a' 0 or more times, i.e., any number of times

  • a+ means: match 'a' 1 or more times, i.e., at least once

  • a{n,m} means: match at least n times, but not more than m times.

  • a{n,} means: match at least n times

  • a{n} means: match exactly n times

Here are some examples:

    /[a-z]+\s+\d*/;  # match a lowercase word, at least one space, and
                     # any number of digits
    /(\w+)\s+\g1/;   # match doubled words of arbitrary length
    /y(es)?/i;       # matches 'y', 'Y', or a case-insensitive 'yes'
    $year =~ /^\d{2,4}$/;  # make sure year is at least 2 but not more
                           # than 4 digits
    $year =~ /^\d{4}$|^\d{2}$/;  # better match; throw out 3-digit dates
    $year =~ /^\d{2}(\d{2})?$/;  # same thing written differently. However,
                                 # this captures the last two digits in $1
                                 # and the other does not.

    % simple_grep '^(\w+)\g1$' /usr/dict/words   # isn't this easier?
    beriberi
    booboo
    coco
    mama
    murmur
    papa

For all of these quantifiers, Perl will try to match as much of the string as possible, while still allowing the regexp to succeed. Thus with /a?.../ , Perl will first try to match the regexp with the a present; if that fails, Perl will try to match the regexp without the a present. For the quantifier * , we get the following:

    $x = "the cat in the hat";
    $x =~ /^(.*)(cat)(.*)$/; # matches,
                             # $1 = 'the '
                             # $2 = 'cat'
                             # $3 = ' in the hat'

This is what we might expect: the match finds the only cat in the string and locks onto it. Consider, however, this regexp:

    $x =~ /^(.*)(at)(.*)$/; # matches,
                            # $1 = 'the cat in the h'
                            # $2 = 'at'
                            # $3 = ''   (0 characters match)

One might initially guess that Perl would find the at in cat and stop there, but that wouldn't give the longest possible string to the first quantifier .*. Instead, the first quantifier .* grabs as much of the string as possible while still having the regexp match. In this example, that means matching the at element against the final at in the string. The other important principle illustrated here is that, when there are two or more elements in a regexp, the leftmost quantifier, if there is one, gets to grab as much of the string as possible, leaving the rest of the regexp to fight over scraps. Thus in our example, the first quantifier .* grabs most of the string, while the second quantifier .* gets the empty string. Quantifiers that grab as much of the string as possible are called maximal match or greedy quantifiers.

When a regexp can match a string in several different ways, we can use the principles above to predict which way the regexp will match:

  • Principle 0: Taken as a whole, any regexp will be matched at the earliest possible position in the string.

  • Principle 1: In an alternation a|b|c... , the leftmost alternative that allows a match for the whole regexp will be the one used.

  • Principle 2: The maximal matching quantifiers ?, * , + and {n,m} will in general match as much of the string as possible while still allowing the whole regexp to match.

  • Principle 3: If there are two or more elements in a regexp, the leftmost greedy quantifier, if any, will match as much of the string as possible while still allowing the whole regexp to match. The next leftmost greedy quantifier, if any, will try to match as much of the string remaining available to it as possible, while still allowing the whole regexp to match. And so on, until all the regexp elements are satisfied.

As we have seen above, Principle 0 overrides the others. The regexp will be matched as early as possible, with the other principles determining how the regexp matches at that earliest character position.

Here is an example of these principles in action:

    $x = "The programming republic of Perl";
    $x =~ /^(.+)(e|r)(.*)$/;  # matches,
                              # $1 = 'The programming republic of Pe'
                              # $2 = 'r'
                              # $3 = 'l'

This regexp matches at the earliest string position, 'T' . One might think that e , being leftmost in the alternation, would be matched, but r produces the longest string in the first quantifier.

    $x =~ /(m{1,2})(.*)$/;  # matches,
                            # $1 = 'mm'
                            # $2 = 'ing republic of Perl'

Here, the earliest possible match is at the first 'm' in programming. m{1,2} is the first quantifier, so it gets to match a maximal mm.

    $x =~ /.*(m{1,2})(.*)$/;  # matches,
                              # $1 = 'm'
                              # $2 = 'ing republic of Perl'

Here, the regexp matches at the start of the string. The first quantifier .* grabs as much as possible, leaving just a single 'm' for the second quantifier m{1,2}.

    $x =~ /(.?)(m{1,2})(.*)$/;  # matches,
                                # $1 = 'a'
                                # $2 = 'mm'
                                # $3 = 'ing republic of Perl'

Here, .? eats its maximal one character at the earliest possible position in the string, 'a' in programming , leaving m{1,2} the opportunity to match both m's. Finally,

    "aXXXb" =~ /(X*)/; # matches with $1 = ''

because it can match zero copies of 'X' at the beginning of the string. If you definitely want to match at least one 'X' , use X+ , not X* .

Sometimes greed is not good. At times, we would like quantifiers to match a minimal piece of string, rather than a maximal piece. For this purpose, Larry Wall created the minimal match or non-greedy quantifiers ?? , *? , +?, and {}?. These are the usual quantifiers with a ? appended to them. They have the following meanings:

  • a?? means: match 'a' 0 or 1 times. Try 0 first, then 1.

  • a*? means: match 'a' 0 or more times, i.e., any number of times, but as few times as possible

  • a+? means: match 'a' 1 or more times, i.e., at least once, but as few times as possible

  • a{n,m}? means: match at least n times, not more than m times, as few times as possible

  • a{n,}? means: match at least n times, but as few times as possible

  • a{n}? means: match exactly n times. Because we match exactly n times, a{n}? is equivalent to a{n} and is just there for notational consistency.

Let's look at the example above, but with minimal quantifiers:

    $x = "The programming republic of Perl";
    $x =~ /^(.+?)(e|r)(.*)$/; # matches,
                              # $1 = 'Th'
                              # $2 = 'e'
                              # $3 = ' programming republic of Perl'

The minimal string that will allow both the start of the string ^ and the alternation to match is Th , with the alternation e|r matching e . The second quantifier .* is free to gobble up the rest of the string.

    $x =~ /(m{1,2}?)(.*?)$/;  # matches,
                              # $1 = 'm'
                              # $2 = 'ming republic of Perl'

The first string position that this regexp can match is at the first 'm' in programming . At this position, the minimal m{1,2}? matches just one 'm' . Although the second quantifier .*? would prefer to match no characters, it is constrained by the end-of-string anchor $ to match the rest of the string.

    $x =~ /(.*?)(m{1,2}?)(.*)$/;  # matches,
                                  # $1 = 'The progra'
                                  # $2 = 'm'
                                  # $3 = 'ming republic of Perl'

In this regexp, you might expect the first minimal quantifier .*? to match the empty string, because it is not constrained by a ^ anchor to match the beginning of the word. Principle 0 applies here, however. Because it is possible for the whole regexp to match at the start of the string, it will match at the start of the string. Thus the first quantifier has to match everything up to the first m. The second minimal quantifier matches just one m and the third quantifier matches the rest of the string.

    $x =~ /(.??)(m{1,2})(.*)$/;  # matches,
                                 # $1 = 'a'
                                 # $2 = 'mm'
                                 # $3 = 'ing republic of Perl'

Just as in the previous regexp, the first quantifier .?? can match earliest at position 'a' , so it does. The second quantifier is greedy, so it matches mm , and the third matches the rest of the string.

We can modify principle 3 above to take into account non-greedy quantifiers:

  • Principle 3: If there are two or more elements in a regexp, the leftmost greedy (non-greedy) quantifier, if any, will match as much (little) of the string as possible while still allowing the whole regexp to match. The next leftmost greedy (non-greedy) quantifier, if any, will try to match as much (little) of the string remaining available to it as possible, while still allowing the whole regexp to match. And so on, until all the regexp elements are satisfied.
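A common place where the choice between greedy and non-greedy matters is text enclosed in delimiters. The sketch below uses a made-up string with angle-bracket tags to contrast .+ with .+? under otherwise identical conditions.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ runs to the end of the string and backs off only as far
# as the last '>'.
my ($greedy) = $html =~ /<(.+)>/;

# Non-greedy: .+? stops at the first '>' that lets the regexp match.
my ($minimal) = $html =~ /<(.+?)>/;

print "greedy:  $greedy\n";
print "minimal: $minimal\n";
```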

Just like alternation, quantifiers are also susceptible to backtracking. Here is a step-by-step analysis of the example

    $x = "the cat in the hat";
    $x =~ /^(.*)(at)(.*)$/; # matches,
                            # $1 = 'the cat in the h'
                            # $2 = 'at'
                            # $3 = ''   (0 matches)

  0. Start with the first letter in the string 't'.

  1. The first quantifier '.*' starts out by matching the whole string 'the cat in the hat'.

  2. 'a' in the regexp element 'at' doesn't match the end of the string. Backtrack one character.

  3. 'a' in the regexp element 'at' still doesn't match the last letter of the string 't', so backtrack one more character.

  4. Now we can match the 'a' and the 't'.

  5. Move on to the third element '.*'. Since we are at the end of the string and '.*' can match 0 times, assign it the empty string.

  6. We are done!

Most of the time, all this moving forward and backtracking happens quickly and searching is fast. There are some pathological regexps, however, whose execution time exponentially grows with the size of the string. A typical structure that blows up in your face is of the form

    /(a|b+)*/;

The problem is the nested indeterminate quantifiers. There are many different ways of partitioning a string of length n between the + and * : one repetition with b+ of length n, two repetitions with the first b+ length k and the second with length n-k, m repetitions whose bits add up to length n, etc. In fact there are an exponential number of ways to partition a string as a function of its length. A regexp may get lucky and match early in the process, but if there is no match, Perl will try every possibility before giving up. So be careful with nested * 's, {n,m}'s, and + 's. The book Mastering Regular Expressions by Jeffrey Friedl gives a wonderful discussion of this and other efficiency issues.

Possessive quantifiers

Backtracking during the relentless search for a match may be a waste of time, particularly when the match is bound to fail. Consider the simple pattern

    /^\w+\s+\w+$/; # a word, spaces, a word

Whenever this is applied to a string which doesn't quite meet the pattern's expectations such as "abc " or "abc def " , the regex engine will backtrack, approximately once for each character in the string. But we know that there is no way around taking all of the initial word characters to match the first repetition, that all spaces must be eaten by the middle part, and the same goes for the second word.

With the introduction of the possessive quantifiers in Perl 5.10, we have a way of instructing the regex engine not to backtrack, with the usual quantifiers with a + appended to them. This makes them greedy as well as stingy; once they succeed they won't give anything back to permit another solution. They have the following meanings:

  • a{n,m}+ means: match at least n times, not more than m times, as many times as possible, and don't give anything up. a?+ is short for a{0,1}+

  • a{n,}+ means: match at least n times, but as many times as possible, and don't give anything up. a*+ is short for a{0,}+ and a++ is short for a{1,}+ .

  • a{n}+ means: match exactly n times. It is just there for notational consistency.

These possessive quantifiers represent a special case of a more general concept, the independent subexpression, see below.

As an example where a possessive quantifier is suitable we consider matching a quoted string, as it appears in several programming languages. The backslash is used as an escape character that indicates that the next character is to be taken literally, as another character for the string. Therefore, after the opening quote, we expect a (possibly empty) sequence of alternatives: either some character except an unescaped quote or backslash or an escaped character.

    /"(?:[^"\\]++|\\.)*+"/;
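To try the quoted-string pattern, the sketch below applies it to two made-up strings: one containing escaped quotes and one whose closing quote is missing. The enclosing capture group is added here only to extract the matched text.

```perl
use strict;
use warnings;

my $quoted_re = qr/"(?:[^"\\]++|\\.)*+"/;

# Single quotes keep the backslashes literal, so the string really
# contains \" sequences.
my $text = 'say "a \"quoted\" word" now';
my ($quoted) = $text =~ /($quoted_re)/;

# No closing quote: the possessive quantifiers give nothing back,
# so the attempt at the opening quote fails without retrying.
my $unterminated = 'say "unterminated' =~ $quoted_re ? 1 : 0;

print "matched: $quoted\n";
print "unterminated string matched: $unterminated\n";
```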

Building a regexp

At this point, we have all the basic regexp concepts covered, so let's give a more involved example of a regular expression. We will build a regexp that matches numbers.

The first task in building a regexp is to decide what we want to match and what we want to exclude. In our case, we want to match both integers and floating point numbers and we want to reject any string that isn't a number.

The next task is to break the problem down into smaller problems that are easily converted into a regexp.

The simplest case is integers. These consist of a sequence of digits, with an optional sign in front. The digits we can represent with \d+ and the sign can be matched with [+-] . Thus the integer regexp is

    /[+-]?\d+/;  # matches integers

A floating point number potentially has a sign, an integral part, a decimal point, a fractional part, and an exponent. One or more of these parts is optional, so we need to check out the different possibilities. Floating point numbers which are in proper form include 123., 0.345, .34, -1e6, and 25.4E-72. As with integers, the sign out front is completely optional and can be matched by [+-]?. We can see that if there is no exponent, floating point numbers must have a decimal point, otherwise they are integers. We might be tempted to model these with \d*\.\d*, but this would also match just a single decimal point, which is not a number. So the three cases of floating point number without exponent are

    /[+-]?\d+\./;     # 1., 321., etc.
    /[+-]?\.\d+/;     # .1, .234, etc.
    /[+-]?\d+\.\d+/;  # 1.0, 30.56, etc.

These can be combined into a single regexp with a three-way alternation:

    /[+-]?(\d+\.\d+|\d+\.|\.\d+)/;  # floating point, no exponent

In this alternation, it is important to put '\d+\.\d+' before '\d+\.' . If '\d+\.' were first, the regexp would happily match that and ignore the fractional part of the number.

Now consider floating point numbers with exponents. The key observation here is that both integers and numbers with decimal points are allowed in front of an exponent. Then exponents, like the overall sign, are independent of whether we are matching numbers with or without decimal points, and can be 'decoupled' from the mantissa. The overall form of the regexp now becomes clear:

    /^(optional sign)(integer | f.p. mantissa)(optional exponent)$/;

The exponent is an e or E , followed by an integer. So the exponent regexp is

    /[eE][+-]?\d+/;  # exponent

Putting all the parts together, we get a regexp that matches numbers:

    /^[+-]?(\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$/;  # Ta da!

Long regexps like this may impress your friends, but can be hard to decipher. In complex situations like this, the //x modifier for a match is invaluable. It allows one to put nearly arbitrary whitespace and comments into a regexp without affecting its meaning. Using it, we can rewrite our 'extended' regexp in the more pleasing form

    /^
       [+-]?         # first, match an optional sign
       (             # then match integers or f.p. mantissas:
           \d+\.\d+  # mantissa of the form a.b
          |\d+\.     # mantissa of the form a.
          |\.\d+     # mantissa of the form .b
          |\d+       # integer of the form a
       )
       ([eE][+-]?\d+)?  # finally, optionally match an exponent
    $/x;

If whitespace is mostly irrelevant, how does one include space characters in an extended regexp? The answer is to backslash it '\ ' or put it in a character class [ ] . The same thing goes for pound signs: use \# or [#]. For instance, Perl allows a space between the sign and the mantissa or integer, and we could add this to our regexp as follows:

    /^
       [+-]?\ *      # first, match an optional sign *and space*
       (             # then match integers or f.p. mantissas:
           \d+\.\d+  # mantissa of the form a.b
          |\d+\.     # mantissa of the form a.
          |\.\d+     # mantissa of the form .b
          |\d+       # integer of the form a
       )
       ([eE][+-]?\d+)?  # finally, optionally match an exponent
    $/x;

In this form, it is easier to see a way to simplify the alternation. Alternatives 1, 2, and 4 all start with \d+ , so it could be factored out:

    /^
       [+-]?\ *      # first, match an optional sign
       (             # then match integers or f.p. mantissas:
           \d+       # start out with a ...
           (
               \.\d* # mantissa of the form a.b or a.
           )?        # ? takes care of integers of the form a
          |\.\d+     # mantissa of the form .b
       )
       ([eE][+-]?\d+)?  # finally, optionally match an exponent
    $/x;

or written in the compact form,

    /^[+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$/;

This is our final regexp. To recap, we built a regexp by

  • specifying the task in detail,

  • breaking down the problem into smaller parts,

  • translating the small parts into regexps,

  • combining the regexps,

  • and optimizing the final combined regexp.

These are also the typical steps involved in writing a computer program. This makes perfect sense, because regular expressions are essentially programs written in a little computer language that specifies patterns.
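As with any program, the result is worth testing. The sketch below runs the final regexp over a handful of candidate strings; the sample values come from the discussion above plus a few made-up rejects.

```perl
use strict;
use warnings;

my $number = qr/^[+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$/;

my @should_match  = qw( 123. 0.345 .34 -1e6 25.4E-72 42 );
my @should_reject = qw( . abc 1e 1.2.3 );

# Keep only the candidates that behave as intended.
my @matched  = grep { $_ =~ $number } @should_match;
my @rejected = grep { $_ !~ $number } @should_reject;

print "matched:  @matched\n";
print "rejected: @rejected\n";
```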

Using regular expressions in Perl

The last topic of Part 1 briefly covers how regexps are used in Perl programs. Where do they fit into Perl syntax?

We have already introduced the matching operator in its default /regexp/ and arbitrary delimiter m!regexp! forms. We have used the binding operator =~ and its negation !~ to test for string matches. Associated with the matching operator, we have discussed the single line //s , multi-line //m , case-insensitive //i and extended //x modifiers. There are a few more things you might want to know about matching operators.

Prohibiting substitution

When variables in a regexp are substituted only once (as with the //o modifier), Perl ignores any change made to $pattern after that first substitution. If you don't want any substitutions at all, use the special delimiter m'':

    @pattern = ('Seuss');
    while (<>) {
        print if m'@pattern';  # matches literal '@pattern', not 'Seuss'
    }

Similar to strings, m'' acts like apostrophes on a regexp; all other m delimiters act like quotes. If the regexp evaluates to the empty string, the regexp in the last successful match is used instead. So we have

    "dog" =~ /d/;  # 'd' matches
    "dogbert" =~ //;  # this matches the 'd' regexp used before

Global matching

The final two modifiers we will discuss here, //g and //c , concern multiple matches. The modifier //g stands for global matching and allows the matching operator to match within a string as many times as possible. In scalar context, successive invocations against a string will have //g jump from match to match, keeping track of position in the string as it goes along. You can get or set the position with the pos() function.

The use of //g is shown in the following example. Suppose we have a string that consists of words separated by spaces. If we know how many words there are in advance, we could extract the words using groupings:

    $x = "cat dog house"; # 3 words
    $x =~ /^\s*(\w+)\s+(\w+)\s+(\w+)\s*$/; # matches,
                                           # $1 = 'cat'
                                           # $2 = 'dog'
                                           # $3 = 'house'

But what if we had an indeterminate number of words? This is the sort of task //g was made for. To extract all words, form the simple regexp (\w+) and loop over all matches with /(\w+)/g :

    while ($x =~ /(\w+)/g) {
        print "Word is $1, ends at position ", pos $x, "\n";
    }

prints

    Word is cat, ends at position 3
    Word is dog, ends at position 7
    Word is house, ends at position 13

A failed match or changing the target string resets the position. If you don't want the position reset after failure to match, add the //c , as in /regexp/gc . The current position in the string is associated with the string, not the regexp. This means that different strings have different positions and their respective positions can be set or read independently.
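The effect of //c on the stored position can be observed directly with pos(). The following is a sketch with made-up variable names; each match is assigned to a scalar so that it runs in scalar context.

```perl
use strict;
use warnings;

my $x = "cat dog house";

my $ok = $x =~ /\w+/g;          # matches 'cat'
my $after_match = pos($x);      # position just past 'cat'

$ok = $x =~ /\d+/g;             # fails, so the position is reset
my $after_fail = pos($x);       # undefined again

$ok = $x =~ /\w+/g;             # starts over and matches 'cat' again
$ok = $x =~ /\d+/gc;            # fails, but //c keeps the position
my $after_fail_c = pos($x);     # unchanged

print "after match: $after_match\n";
print "after failed //g: ", defined $after_fail ? $after_fail : "undef", "\n";
print "after failed //gc: $after_fail_c\n";
```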

In list context, //g returns a list of matched groupings, or if there are no groupings, a list of matches to the whole regexp. So if we wanted just the words, we could use

    @words = ($x =~ /(\w+)/g);  # matches,
                                # $words[0] = 'cat'
                                # $words[1] = 'dog'
                                # $words[2] = 'house'

Closely associated with the //g modifier is the \G anchor. The \G anchor matches at the point where the previous //g match left off. \G allows us to easily do context-sensitive matching:

    $metric = 1;  # use metric units
    ...
    $x = <FILE>;  # read in measurement
    $x =~ /^([+-]?\d+)\s*/g;  # get magnitude
    $weight = $1;
    if ($metric) { # error checking
        print "Units error!" unless $x =~ /\Gkg\./g;
    }
    else {
        print "Units error!" unless $x =~ /\Glbs\./g;
    }
    $x =~ /\G\s+(widget|sprocket)/g;  # continue processing

The combination of //g and \G allows us to process the string a bit at a time and use arbitrary Perl logic to decide what to do next. Currently, the \G anchor is only fully supported when used to anchor to the start of the pattern.
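One way to picture this bit-at-a-time processing is a tiny tokenizer. The sketch below is an illustration only: the token labels and the input string are invented, and each branch uses \G with //gc so that a failed attempt leaves the position where it was for the next branch to try.

```perl
use strict;
use warnings;

my $input = "12 kg widget";
my @tokens;

while (1) {
    if    ($input =~ /\G\s+/gc)      { next }                       # skip whitespace
    elsif ($input =~ /\G(\d+)/gc)    { push @tokens, "NUMBER:$1" }  # a run of digits
    elsif ($input =~ /\G([a-z]+)/gc) { push @tokens, "WORD:$1" }    # a lowercase word
    else                             { last }                       # end of input or unknown character
}

print "@tokens\n";
```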

\G is also invaluable in processing fixed-length records with regexps. Suppose we have a snippet of coding region DNA, encoded as base pair letters ATCGTTGAAT... and we want to find all the stop codons TGA . In a coding region, codons are 3-letter sequences, so we can think of the DNA snippet as a sequence of 3-letter records. The naive regexp

    # expanded, this is "ATC GTT GAA TGC AAA TGA CAT GAC"
    $dna = "ATCGTTGAATGCAAATGACATGAC";
    $dna =~ /TGA/;

doesn't work; it may match a TGA , but there is no guarantee that the match is aligned with codon boundaries, e.g., the substring GTT GAA gives a match. A better solution is

    while ($dna =~ /(\w\w\w)*?TGA/g) {  # note the minimal *?
        print "Got a TGA stop codon at position ", pos $dna, "\n";
    }

which prints

    Got a TGA stop codon at position 18
    Got a TGA stop codon at position 23

Position 18 is good, but position 23 is bogus. What happened?

The answer is that our regexp works well until we get past the last real match. Then the regexp will fail to match a synchronized TGA and start stepping ahead one character position at a time, not what we want. The solution is to use \G to anchor the match to the codon alignment:

    while ($dna =~ /\G(\w\w\w)*?TGA/g) {
        print "Got a TGA stop codon at position ", pos $dna, "\n";
    }

This prints

    Got a TGA stop codon at position 18

which is the correct answer. This example illustrates that it is important not only to match what is desired, but to reject what is not desired.

(Other regexp modifiers, such as //o, are available, but their specialized uses are beyond the scope of this introduction.)

Search and replace

Regular expressions also play a big role in search and replace operations in Perl. Search and replace is accomplished with the s/// operator. The general form is s/regexp/replacement/modifiers, with everything we know about regexps and modifiers applying in this case as well. The replacement is a Perl double-quoted string that replaces in the string whatever is matched with the regexp . The operator =~ is also used here to associate a string with s///. If matching against $_ , the $_ =~ can be dropped. If there is a match, s/// returns the number of substitutions made; otherwise it returns false. Here are a few examples:

  1. $x = "Time to feed the cat!";
  2. $x =~ s/cat/hacker/; # $x contains "Time to feed the hacker!"
  3. if ($x =~ s/^(Time.*hacker)!$/$1 now!/) {
  4. $more_insistent = 1;
  5. }
  6. $y = "'quoted words'";
  7. $y =~ s/^'(.*)'$/$1/; # strip single quotes,
  8. # $y contains "quoted words"

In the last example, the whole string was matched, but only the part inside the single quotes was grouped. With the s/// operator, the matched variables $1 , $2 , etc. are immediately available for use in the replacement expression, so we use $1 to replace the quoted string with just what was quoted. With the global modifier, s///g will search and replace all occurrences of the regexp in the string:

  1. $x = "I batted 4 for 4";
  2. $x =~ s/4/four/; # doesn't do it all:
  3. # $x contains "I batted four for 4"
  4. $x = "I batted 4 for 4";
  5. $x =~ s/4/four/g; # does it all:
  6. # $x contains "I batted four for four"

If you prefer 'regex' over 'regexp' in this tutorial, you could use the following program to replace it:

  1. % cat > simple_replace
  2. #!/usr/bin/perl
  3. $regexp = shift;
  4. $replacement = shift;
  5. while (<>) {
  6. s/$regexp/$replacement/g;
  7. print;
  8. }
  9. ^D
  10. % simple_replace regexp regex perlretut.pod

In simple_replace we used the s///g modifier to replace all occurrences of the regexp on each line. (Even though the regular expression appears in a loop, Perl is smart enough to compile it only once.) As with simple_grep , both the print and the s/$regexp/$replacement/g use $_ implicitly.

If you don't want s/// to change your original variable you can use the non-destructive substitute modifier, s///r. This changes the behavior so that s///r returns the final substituted string (instead of the number of substitutions):

  1. $x = "I like dogs.";
  2. $y = $x =~ s/dogs/cats/r;
  3. print "$x $y\n";

That example prints "I like dogs. I like cats." Notice that the original $x variable has not been affected. The overall result of the substitution is instead stored in $y. If the substitution doesn't affect anything then the original string is returned:

  1. $x = "I like dogs.";
  2. $y = $x =~ s/elephants/cougars/r;
  3. print "$x $y\n"; # prints "I like dogs. I like dogs."

One other interesting thing that the s///r flag allows is chaining substitutions:

  1. $x = "Cats are great.";
  2. print $x =~ s/Cats/Dogs/r =~ s/Dogs/Frogs/r =~ s/Frogs/Hedgehogs/r, "\n";
  3. # prints "Hedgehogs are great."

A modifier available specifically to search and replace is the s///e evaluation modifier. s///e treats the replacement text as Perl code, rather than a double-quoted string. The value that the code returns is substituted for the matched substring. s///e is useful if you need to do a bit of computation in the process of replacing text. This example counts character frequencies in a line:

  1. $x = "Bill the cat";
  2. $x =~ s/(.)/$chars{$1}++;$1/eg; # final $1 replaces char with itself
  3. print "frequency of '$_' is $chars{$_}\n"
  4. foreach (sort {$chars{$b} <=> $chars{$a}} keys %chars);

This prints

  1. frequency of ' ' is 2
  2. frequency of 't' is 2
  3. frequency of 'l' is 2
  4. frequency of 'B' is 1
  5. frequency of 'c' is 1
  6. frequency of 'e' is 1
  7. frequency of 'h' is 1
  8. frequency of 'i' is 1
  9. frequency of 'a' is 1

As with the match m// operator, s/// can use other delimiters, such as s!!! and s{}{}, and even s{}//. If single quotes are used s''', then the regexp and replacement are treated as single-quoted strings and there are no variable substitutions. s/// in list context returns the same thing as in scalar context, i.e., the number of matches.
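As a small illustration of the delimiter rules just described, the sketch below applies three substitutions to a copy of the same string. The path and the variable names are made up for the example; only the s{}{}, s!!! and s''' forms come from the text.

```perl
use strict;
use warnings;

my $path = "/usr/local/bin/perl";   # hypothetical sample value

# Alternate delimiters avoid escaping the slashes in a path.
(my $braces = $path) =~ s{/usr/local}{/opt};
(my $bangs  = $path) =~ s!/bin/!/sbin/!;

# With single-quote delimiters nothing is interpolated: the
# replacement is the literal five characters '$HOME'.
(my $quoted = $path) =~ s'/usr/local'$HOME';

print "$braces\n";   # /opt/bin/perl
print "$bangs\n";    # /usr/local/sbin/perl
print "$quoted\n";   # $HOME/bin/perl
```

The `(my $copy = $orig) =~ s///` idiom is used here only so that each substitution starts from the unmodified string.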

The split function

The split() function is another place where a regexp is used. split /regexp/, string, limit separates the string operand into a list of substrings and returns that list. The regexp must be designed to match whatever constitutes the separators for the desired substrings. The limit , if present, constrains splitting into no more than limit number of strings. For example, to split a string into words, use

  1. $x = "Calvin and Hobbes";
  2. @words = split /\s+/, $x; # $words[0] = 'Calvin'
  3. # $words[1] = 'and'
  4. # $words[2] = 'Hobbes'

If the empty regexp // is used, the regexp always matches and the string is split into individual characters. If the regexp has groupings, then the resulting list contains the matched substrings from the groupings as well. For instance,

  1. $x = "/usr/bin/perl";
  2. @dirs = split m!/!, $x; # $dirs[0] = ''
  3. # $dirs[1] = 'usr'
  4. # $dirs[2] = 'bin'
  5. # $dirs[3] = 'perl'
  6. @parts = split m!(/)!, $x; # $parts[0] = ''
  7. # $parts[1] = '/'
  8. # $parts[2] = 'usr'
  9. # $parts[3] = '/'
  10. # $parts[4] = 'bin'
  11. # $parts[5] = '/'
  12. # $parts[6] = 'perl'

Since the first character of $x matched the regexp, split prepended an empty initial element to the list.
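The limit argument and the empty regexp are described above without an example. The following sketch shows both; the colon-separated record is an invented sample, not something from this tutorial.

```perl
use strict;
use warnings;

# Hypothetical colon-separated record, used only for illustration.
my $record = "root:x:0:0:System Administrator:/root:/bin/sh";

# A limit of 3 stops splitting after two separators; the rest of
# the string is returned unsplit as the third element.
my @head = split /:/, $record, 3;

# The empty regexp matches between every pair of characters.
my @chars = split //, "perl";

print scalar(@head), " fields, last is '$head[2]'\n";
print join('-', @chars), "\n";   # p-e-r-l
```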

If you have read this far, congratulations! You now have all the basic tools needed to use regular expressions to solve a wide range of text processing problems. If this is your first time through the tutorial, why not stop here and play around with regexps a while.... Part 2 concerns the more esoteric aspects of regular expressions and those concepts certainly aren't needed right at the start.

Part 2: Power tools

OK, you know the basics of regexps and you want to know more. If matching regular expressions is analogous to a walk in the woods, then the tools discussed in Part 1 are analogous to topo maps and a compass, basic tools we use all the time. Most of the tools in part 2 are analogous to flare guns and satellite phones. They aren't used too often on a hike, but when we are stuck, they can be invaluable.

What follows are the more advanced, less used, or sometimes esoteric capabilities of Perl regexps. In Part 2, we will assume you are comfortable with the basics and concentrate on the advanced features.

More on characters, strings, and character classes

There are a number of escape sequences and character classes that we haven't covered yet.

There are several escape sequences that convert characters or strings between upper and lower case, and they are also available within patterns. \l and \u convert the next character to lower or upper case, respectively:

  1. $x = "perl";
  2. $string =~ /\u$x/; # matches 'Perl' in $string
  3. $x = "M(rs?|s)\\."; # note the double backslash
  4. $string =~ /\l$x/; # matches 'mr.', 'mrs.', and 'ms.'

A \L or \U indicates a lasting conversion of case, until terminated by \E or overridden by another \U or \L:

  1. $x = "This word is in lower case:\L SHOUT\E";
  2. $x =~ /shout/; # matches
  3. $x = "I STILL KEYPUNCH CARDS FOR MY 360";
  4. $x =~ /\Ukeypunch/; # matches punch card string

If there is no \E , case is converted until the end of the string. The regexps \L\u$word or \u\L$word convert the first character of $word to uppercase and the rest of the characters to lowercase.
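To make the \u\L combination concrete, here is a minimal sketch that normalizes the case of a few words and then uses the same escapes inside a pattern. The sample words are invented for the example.

```perl
use strict;
use warnings;

my @names = ("mCDONALD", "o'BRIEN", "perl");   # hypothetical input

# \L lowercases everything up to \E (or the end of the string), and
# \u then uppercases the first character of the result.
my @fixed;
push @fixed, "\u\L$_" for @names;

print "$_\n" for @fixed;   # Mcdonald, O'brien, Perl

# The same escapes work inside a pattern: this builds /Perl/.
my $word  = "pERL";
my $found = "I like Perl" =~ /\u\L$word/ ? 1 : 0;
print "found\n" if $found;
```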

Control characters can be escaped with \c , so that a control-Z character would be matched with \cZ . The escape sequence \Q ...\E quotes, or protects most non-alphabetic characters. For instance,

  1. $x = "That !^*&%~& cat!";
  2. $x =~ /\Q!^*&%~&\E/; # check for rough language

It does not protect $ or @ , so that variables can still be substituted.

\Q , \L , \l , \U , \u and \E are actually part of double-quotish syntax, and not part of regexp syntax proper. They will work if they appear in a regular expression embedded directly in a program, but not when contained in a string that is interpolated in a pattern.
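One practical consequence is that text arriving in a variable has to be protected at the point where it is interpolated, or beforehand with the built-in quotemeta() function. The sketch below contrasts the three cases; the variable names and sample strings are assumptions made for the illustration.

```perl
use strict;
use warnings;

my $input = 'a.c';    # hypothetical user-supplied text
my $text  = 'abc';

# Interpolated as-is, the dot is a metacharacter and matches 'b'.
my $as_regexp = $text =~ /$input/ ? 1 : 0;            # 1

# \Q...\E written in the pattern itself protects the interpolated value.
my $as_literal = $text =~ /\Q$input\E/ ? 1 : 0;       # 0

# quotemeta() does the same job on a string before interpolation.
my $quoted = quotemeta $input;                        # a\.c
my $via_function = 'a.c' =~ /\A$quoted\z/ ? 1 : 0;    # 1

print "$as_regexp $as_literal $via_function\n";
```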

Perl regexps can handle more than just the standard ASCII character set. Perl supports Unicode, a standard for representing the alphabets from virtually all of the world's written languages, and a host of symbols. Perl's text strings are Unicode strings, so they can contain characters with a value (codepoint or character number) higher than 255.

What does this mean for regexps? Well, regexp users don't need to know much about Perl's internal representation of strings. But they do need to know 1) how to represent Unicode characters in a regexp and 2) that a matching operation will treat the string to be searched as a sequence of characters, not bytes. The answer to 1) is that Unicode characters greater than chr(255) are represented using the \x{hex} notation, because \x hex (without curly braces) doesn't go further than 255. (Starting in Perl 5.14, if you're an octal fan, you can also use \o{oct} .)

  1. /\x{263a}/; # match a Unicode smiley face :)

NOTE: In Perl 5.6.0 it used to be that one needed to say use utf8 to use any Unicode features. This is no longer the case: for almost all Unicode processing, the explicit utf8 pragma is not needed. (The only case where it matters is if your Perl script is in Unicode and encoded in UTF-8; then an explicit use utf8 is needed.)

Figuring out the hexadecimal sequence of a Unicode character you want or deciphering someone else's hexadecimal Unicode regexp is about as much fun as programming in machine code. So another way to specify Unicode characters is to use the named character escape sequence \N{name}. name is a name for the Unicode character, as specified in the Unicode standard. For instance, if we wanted to represent or match the astrological sign for the planet Mercury, we could use

  1. $x = "abc\N{MERCURY}def";
  2. $x =~ /\N{MERCURY}/; # matches

One can also use "short" names:

  1. print "\N{GREEK SMALL LETTER SIGMA} is called sigma.\n";
  2. print "\N{greek:Sigma} is an upper-case sigma.\n";

You can also restrict names to a certain alphabet by specifying the charnames pragma:

  1. use charnames qw(greek);
  2. print "\N{sigma} is Greek sigma\n";

An index of character names is available on-line from the Unicode Consortium, http://www.unicode.org/charts/charindex.html; explanatory material with links to other resources at http://www.unicode.org/standard/where.

The answer to requirement 2) is that a regexp (mostly) uses Unicode characters. The "mostly" is for messy backward compatibility reasons, but starting in Perl 5.14, any regex compiled in the scope of a use feature 'unicode_strings' (which is automatically turned on within the scope of a use 5.012 or higher) will turn that "mostly" into "always". If you want to handle Unicode properly, you should ensure that 'unicode_strings' is turned on. Internally, this is encoded to bytes using either UTF-8 or a native 8 bit encoding, depending on the history of the string, but conceptually it is a sequence of characters, not bytes. See perlunitut for a tutorial about that.
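The "characters, not bytes" point can be checked directly. This sketch assumes Perl 5.14 or later for the \o{} escape. It prints only numbers, so no output encoding layer is needed.

```perl
use strict;
use warnings;

my $smiley = "\x{263a}";   # one character with a code point above 255

# Conceptually the string holds one character, however many bytes
# Perl uses to store it internally.
my $len = length $smiley;                               # 1

# A single '.' therefore matches the whole character.
my $dot_matches = $smiley =~ /^.$/ ? 1 : 0;             # 1

# \o{...} (Perl 5.14 and later) names the same code point in octal.
my $octal_matches = $smiley =~ /\o{23072}/ ? 1 : 0;     # 1

print "$len $dot_matches $octal_matches\n";
```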

Let us now discuss Unicode character classes. Just as with Unicode characters, there are named Unicode character classes represented by the \p{name} escape sequence. Closely associated is the \P{name} character class, which is the negation of the \p{name} class. For example, to match lower and uppercase characters,

  1. $x = "BOB";
  2. $x =~ /^\p{IsUpper}/; # matches, uppercase char class
  3. $x =~ /^\P{IsUpper}/; # doesn't match, char class sans uppercase
  4. $x =~ /^\p{IsLower}/; # doesn't match, lowercase char class
  5. $x =~ /^\P{IsLower}/; # matches, char class sans lowercase

(The "Is" is optional.)

Here is the association between some Perl named classes and the traditional Unicode classes:

  1. Perl class name   Unicode class name or regular expression
  2. IsAlpha           /^[LM]/
  3. IsAlnum           /^[LMN]/
  4. IsASCII           $code <= 127
  5. IsCntrl           /^C/
  6. IsBlank           $code =~ /^(0020|0009)$/ || /^Z[^lp]/
  7. IsDigit           Nd
  8. IsGraph           /^([LMNPS]|Co)/
  9. IsLower           Ll
  10. IsPrint          /^([LMNPS]|Co|Zs)/
  11. IsPunct          /^P/
  12. IsSpace          /^Z/ || ($code =~ /^(0009|000A|000B|000C|000D)$/)
  13. IsSpacePerl      /^Z/ || ($code =~ /^(0009|000A|000C|000D|0085|2028|2029)$/)
  14. IsUpper          /^L[ut]/
  15. IsWord           /^[LMN]/ || $code eq "005F"
  16. IsXDigit         $code =~ /^00(3[0-9]|[46][1-6])$/

You can also use the official Unicode class names with \p and \P , like \p{L} for Unicode 'letters', \p{Lu} for uppercase letters, or \P{Nd} for non-digits. If a name is just one letter, the braces can be dropped. For instance, \pM is the character class of Unicode 'marks', for example accent marks. For the full list see perlunicode.

Unicode has also been separated into various sets of characters which you can test with \p{...} (in) and \P{...} (not in). To test whether a character is (or is not) an element of a script you would use the script name, for example \p{Latin} , \p{Greek} , or \P{Katakana} .

What we have described so far is the single form of the \p{...} character classes. There is also a compound form which you may run into. These look like \p{name=value} or \p{name:value} (the equals sign and colon can be used interchangeably). These are more general than the single form, and in fact most of the single forms are just Perl-defined shortcuts for common compound forms. For example, the script examples in the previous paragraph could be written equivalently as \p{Script=Latin} , \p{Script:Greek} , and \P{script=katakana} (case is irrelevant between the {} braces). You may never have to use the compound forms, but sometimes it is necessary, and their use can make your code easier to understand.
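The following sketch shows the single and compound forms side by side. The two test characters are chosen for the example and written as \x{} escapes so that the source file stays plain ASCII.

```perl
use strict;
use warnings;

my $alpha = "\x{03b1}";    # GREEK SMALL LETTER ALPHA
my $ka    = "\x{30ab}";    # KATAKANA LETTER KA

# Single form and compound form are interchangeable.
my $single   = $alpha =~ /\p{Greek}/        ? 1 : 0;   # 1
my $compound = $alpha =~ /\p{Script=Greek}/ ? 1 : 0;   # 1
my $colon    = $alpha =~ /\p{script:greek}/ ? 1 : 0;   # 1, case is ignored

# \P{...} negates: alpha is not Katakana, but KA is.
my $alpha_not_kata = $alpha =~ /\P{Katakana}/ ? 1 : 0; # 1
my $ka_not_kata    = $ka    =~ /\P{Katakana}/ ? 1 : 0; # 0

print "$single $compound $colon $alpha_not_kata $ka_not_kata\n";
```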

\X is an abbreviation for a character class that comprises a Unicode extended grapheme cluster. This represents a "logical character": what appears to be a single character, but may be represented internally by more than one. As an example, using the Unicode full names, e.g., A + COMBINING RING is a grapheme cluster with base character A and combining character COMBINING RING , which translates in Danish to A with the circle atop it, as in the word Angstrom.
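A short sketch of the difference between counting characters and counting grapheme clusters follows. It uses U+030A, whose full Unicode name is COMBINING RING ABOVE; the sample string is invented for the example.

```perl
use strict;
use warnings;

# 'A' followed by COMBINING RING ABOVE (U+030A), then a plain 'b'.
my $str = "A\x{030A}b";

my $code_points = length $str;        # 3 characters as Perl counts them
my @clusters    = $str =~ /(\X)/g;    # 2 logical characters

print "code points: $code_points, grapheme clusters: ",
      scalar(@clusters), "\n";
```

The first cluster is two characters long, which is why length and the number of \X matches disagree.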

For the full and latest information about Unicode see the latest Unicode standard, or the Unicode Consortium's website http://www.unicode.org

As if all those classes weren't enough, Perl also defines POSIX-style character classes. These have the form [:name:], with name the name of the POSIX class. The POSIX classes are alpha , alnum , ascii , cntrl , digit , graph , lower , print, punct , space , upper , and xdigit , and two extensions, word (a Perl extension to match \w ), and blank (a GNU extension). The //a modifier restricts these to matching just in the ASCII range; otherwise they can match the same as their corresponding Perl Unicode classes: [:upper:] is the same as \p{IsUpper} , etc. (There are some exceptions and gotchas with this; see perlrecharclass for a full discussion.) The [:digit:], [:word:], and [:space:] correspond to the familiar \d , \w , and \s character classes. To negate a POSIX class, put a ^ in front of the name, so that, e.g., [:^digit:] corresponds to \D and, under Unicode, \P{IsDigit} . The Unicode and POSIX character classes can be used just like \d , with the exception that POSIX character classes can only be used inside of a character class:

  1. /\s+[abc[:digit:]xyz]\s*/; # match a,b,c,x,y,z, or a digit
  2. /^=item\s[[:digit:]]/; # match '=item',
  3. # followed by a space and a digit
  4. /\s+[abc\p{IsDigit}xyz]\s+/; # match a,b,c,x,y,z, or a digit
  5. /^=item\s\p{IsDigit}/; # match '=item',
  6. # followed by a space and a digit

Whew! That is all the rest of the characters and character classes.

Compiling and saving regular expressions

In Part 1 we mentioned that Perl compiles a regexp into a compact sequence of opcodes. Thus, a compiled regexp is a data structure that can be stored once and used again and again. The regexp quote qr// does exactly that: qr/string/ compiles the string as a regexp and transforms the result into a form that can be assigned to a variable:

  1. $reg = qr/foo+bar?/; # $reg contains a compiled regexp

Then $reg can be used as a regexp:

  1. $x = "fooooba";
  2. $x =~ $reg; # matches, just like /foo+bar?/
  3. $x =~ /$reg/; # same thing, alternate form

$reg can also be interpolated into a larger regexp:

  1. $x =~ /(abc)?$reg/; # still matches

As with the matching operator, the regexp quote can use different delimiters, e.g., qr!!, qr{} or qr~~. Apostrophes as delimiters (qr'') inhibit any interpolation.

Pre-compiled regexps are useful for creating dynamic matches that don't need to be recompiled each time they are encountered. Using pre-compiled regexps, we write a grep_step program which greps for a sequence of patterns, advancing to the next pattern as soon as one has been satisfied.

  1. % cat > grep_step
  2. #!/usr/bin/perl
  3. # grep_step - match <number> regexps, one after the other
  4. # usage: grep_step <number> regexp1 regexp2 ... file1 file2 ...
  5. $number = shift;
  6. $regexp[$_] = shift foreach (0..$number-1);
  7. @compiled = map qr/$_/, @regexp;
  8. while ($line = <>) {
  9. if ($line =~ /$compiled[0]/) {
  10. print $line;
  11. shift @compiled;
  12. last unless @compiled;
  13. }
  14. }
  15. ^D
  16. % grep_step 3 shift print last grep_step
  17. $number = shift;
  18. print $line;
  19. last unless @compiled;

Storing pre-compiled regexps in an array @compiled allows us to simply loop through the regexps without any recompilation, thus gaining flexibility without sacrificing speed.

Composing regular expressions at runtime

Backtracking is more efficient than repeated tries with different regular expressions. If there are several regular expressions and a match with any of them is acceptable, then it is possible to combine them into a set of alternatives. If the individual expressions are input data, this can be done by programming a join operation. We'll exploit this idea in an improved version of the simple_grep program: a program that matches multiple patterns:

  1. % cat > multi_grep
  2. #!/usr/bin/perl
  3. # multi_grep - match any of <number> regexps
  4. # usage: multi_grep <number> regexp1 regexp2 ... file1 file2 ...
  5. $number = shift;
  6. $regexp[$_] = shift foreach (0..$number-1);
  7. $pattern = join '|', @regexp;
  8. while ($line = <>) {
  9. print $line if $line =~ /$pattern/;
  10. }
  11. ^D
  12. % multi_grep 2 shift for multi_grep
  13. $number = shift;
  14. $regexp[$_] = shift foreach (0..$number-1);

Sometimes it is advantageous to construct a pattern from the input that is to be analyzed and use the permissible values on the left hand side of the matching operations. As an example for this somewhat paradoxical situation, let's assume that our input contains a command verb which should match one out of a set of available command verbs, with the additional twist that commands may be abbreviated as long as the given string is unique. The program below demonstrates the basic algorithm.

  1. % cat > keymatch
  2. #!/usr/bin/perl
  3. $kwds = 'copy compare list print';
  4. while( $command = <> ){
  5. $command =~ s/^\s+|\s+$//g; # trim leading and trailing spaces
  6. if( ( @matches = $kwds =~ /\b$command\w*/g ) == 1 ){
  7. print "command: '@matches'\n";
  8. } elsif( @matches == 0 ){
  9. print "no such command: '$command'\n";
  10. } else {
  11. print "not unique: '$command' (could be one of: @matches)\n";
  12. }
  13. }
  14. ^D
  15. % keymatch
  16. li
  17. command: 'list'
  18. co
  19. not unique: 'co' (could be one of: copy compare)
  20. printer
  21. no such command: 'printer'

Rather than trying to match the input against the keywords, we match the combined set of keywords against the input. The pattern matching operation $kwds =~ /\b$command\w*/g does several things at the same time. It makes sure that the given command begins where a keyword begins (\b). It tolerates abbreviations due to the added \w*. It tells us the number of matches (scalar @matches) and all the keywords that were actually matched. You could hardly ask for more.

Embedding comments and modifiers in a regular expression

Starting with this section, we will be discussing Perl's set of extended patterns. These are extensions to the traditional regular expression syntax that provide powerful new tools for pattern matching. We have already seen extensions in the form of the minimal matching constructs ?? , *? , +?, {n,m}?, and {n,}?. Most of the extensions below have the form (?char...), where the char is a character that determines the type of extension.

The first extension is an embedded comment (?#text). This embeds a comment into the regular expression without affecting its meaning. The comment should not have any closing parentheses in the text. An example is

  1. /(?# Match an integer:)[+-]?\d+/;

This style of commenting has been largely superseded by the raw, freeform commenting that is allowed with the //x modifier.

Most modifiers, such as //i , //m , //s and //x (or any combination thereof) can also be embedded in a regexp using (?i), (?m), (?s), and (?x). For instance,

  1. /(?i)yes/; # match 'yes' case insensitively
  2. /yes/i; # same thing
  3. /(?x)( # freeform version of an integer regexp
  4. [+-]? # match an optional sign
  5. \d+ # match a sequence of digits
  6. )
  7. /x;

Embedded modifiers can have two important advantages over the usual modifiers. Embedded modifiers allow a custom set of modifiers for each regexp pattern. This is great for matching an array of regexps that must have different modifiers:

  1. $pattern[0] = '(?i)doctor';
  2. $pattern[1] = 'Johnson';
  3. ...
  4. while (<>) {
  5. foreach $patt (@pattern) {
  6. print if /$patt/;
  7. }
  8. }

The second advantage is that embedded modifiers (except //p , which modifies the entire regexp) only affect the regexp inside the group the embedded modifier is contained in. So grouping can be used to localize the modifier's effects:

  1. /Answer: ((?i)yes)/; # matches 'Answer: yes', 'Answer: YES', etc.

Embedded modifiers can also turn off any modifiers already present by using, e.g., (?-i). Modifiers can also be combined into a single expression, e.g., (?s-i) turns on single line mode and turns off case insensitivity.

Embedded modifiers may also be added to a non-capturing grouping. (?i-m:regexp) is a non-capturing grouping that matches regexp case insensitively and turns off multi-line mode.
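Turning a modifier off again, and scoping it with a non-capturing group, can be sketched as follows. The test strings are invented for the example.

```perl
use strict;
use warnings;

# (?i) is switched off again by (?-i) part way through the pattern,
# so only the first letter is matched case insensitively.
my @first = map { /^(?i)y(?-i)es$/ ? 1 : 0 } qw(yes Yes YES);

# A non-capturing group with modifiers confines them to the group.
my @scoped = map { /^(?i:yes), sir$/ ? 1 : 0 } ('YES, sir', 'yes, SIR');

print "@first\n";    # 1 1 0
print "@scoped\n";   # 1 0
```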

Looking ahead and looking behind

This section concerns the lookahead and lookbehind assertions. First, a little background.

In Perl regular expressions, most regexp elements 'eat up' a certain amount of string when they match. For instance, the regexp element [abc] eats up one character of the string when it matches, in the sense that Perl moves to the next character position in the string after the match. There are some elements, however, that don't eat up characters (advance the character position) if they match. The examples we have seen so far are the anchors. The anchor ^ matches the beginning of the line, but doesn't eat any characters. Similarly, the word boundary anchor \b matches wherever a character matching \w is next to a character that doesn't, but it doesn't eat up any characters itself. Anchors are examples of zero-width assertions: zero-width, because they consume no characters, and assertions, because they test some property of the string. In the context of our walk in the woods analogy to regexp matching, most regexp elements move us along a trail, but anchors have us stop a moment and check our surroundings. If the local environment checks out, we can proceed forward. But if the local environment doesn't satisfy us, we must backtrack.

Checking the environment entails either looking ahead on the trail, looking behind, or both. ^ looks behind, to see that there are no characters before. $ looks ahead, to see that there are no characters after. \b looks both ahead and behind, to see if the characters on either side differ in their "word-ness".

The lookahead and lookbehind assertions are generalizations of the anchor concept. Lookahead and lookbehind are zero-width assertions that let us specify which characters we want to test for. The lookahead assertion is denoted by (?=regexp) and the lookbehind assertion is denoted by (?<=fixed-regexp). Some examples are

  1. $x = "I catch the housecat 'Tom-cat' with catnip";
  2. $x =~ /cat(?=\s)/; # matches 'cat' in 'housecat'
  3. @catwords = ($x =~ /(?<=\s)cat\w+/g); # matches,
  4. # $catwords[0] = 'catch'
  5. # $catwords[1] = 'catnip'
  6. $x =~ /\bcat\b/; # matches 'cat' in 'Tom-cat'
  7. $x =~ /(?<=\s)cat(?=\s)/; # doesn't match; no isolated 'cat' in
  8. # middle of $x

Note that the parentheses in (?=regexp) and (?<=regexp) are non-capturing, since these are zero-width assertions. Thus in the second regexp, the substrings captured are those of the whole regexp itself. Lookahead (?=regexp) can match arbitrary regexps, but lookbehind (?<=fixed-regexp) only works for regexps of fixed width, i.e., a fixed number of characters long. Thus (?<=(ab|bc)) is fine, but (?<=(ab)*) is not. The negated versions of the lookahead and lookbehind assertions are denoted by (?!regexp) and (?<!fixed-regexp) respectively. They evaluate true if the regexps do not match:

  1. $x = "foobar";
  2. $x =~ /foo(?!bar)/; # doesn't match, 'bar' follows 'foo'
  3. $x =~ /foo(?!baz)/; # matches, 'baz' doesn't follow 'foo'
  4. $x =~ /(?<!\s)foo/; # matches, there is no \s before 'foo'

The \C is unsupported in lookbehind, because the already treacherous definition of \C would become even more so when going backwards.

Here is an example where a string containing blank-separated words, numbers and single dashes is to be split into its components. Using /\s+/ alone won't work, because spaces are not required between dashes, or between a word and a dash. Additional places for a split are established by looking ahead and behind:

  1. $str = "one two - --6-8";
  2. @toks = split / \s+ # a run of spaces
  3. | (?<=\S) (?=-) # any non-space followed by '-'
  4. | (?<=-) (?=\S) # a '-' followed by any non-space
  5. /x, $str; # @toks = qw(one two - - - 6 - 8)

Using independent subexpressions to prevent backtracking

Independent subexpressions are regular expressions, in the context of a larger regular expression, that function independently of the larger regular expression. That is, they consume as much or as little of the string as they wish without regard for the ability of the larger regexp to match. Independent subexpressions are represented by (?>regexp). We can illustrate their behavior by first considering an ordinary regexp:

  1. $x = "ab";
  2. $x =~ /a*ab/; # matches

This obviously matches, but in the process of matching, the subexpression a* first grabbed the a . Doing so, however, wouldn't allow the whole regexp to match, so after backtracking, a* eventually gave back the a and matched the empty string. Here, what a* matched was dependent on what the rest of the regexp matched.

Contrast that with an independent subexpression:

  1. $x =~ /(?>a*)ab/; # doesn't match!

The independent subexpression (?>a*) doesn't care about the rest of the regexp, so it sees an a and grabs it. Then the rest of the regexp ab cannot match. Because (?>a*) is independent, there is no backtracking and the independent subexpression does not give up its a . Thus the match of the regexp as a whole fails. A similar behavior occurs with completely independent regexps:

  1. $x = "ab";
  2. $x =~ /a*/g; # matches, eats an 'a'
  3. $x =~ /\Gab/g; # doesn't match, no 'a' available

Here //g and \G create a 'tag team' handoff of the string from one regexp to the other. Regexps with an independent subexpression are much like this, with a handoff of the string to the independent subexpression, and a handoff of the string back to the enclosing regexp.

The ability of an independent subexpression to prevent backtracking can be quite useful. Suppose we want to match a non-empty string enclosed in parentheses up to two levels deep. Then the following regexp matches:

  1. $x = "abc(de(fg)h"; # unbalanced parentheses
  2. $x =~ /\( ( [^()]+ | \([^()]*\) )+ \)/x;

The regexp matches an open parenthesis, one or more copies of an alternation, and a close parenthesis. The alternation is two-way, with the first alternative [^()]+ matching a substring with no parentheses and the second alternative \([^()]*\) matching a substring delimited by parentheses. The problem with this regexp is that it is pathological: it has nested indeterminate quantifiers of the form (a+|b)+. We discussed in Part 1 how nested quantifiers like this could take an exponentially long time to execute if there was no match possible. To prevent the exponential blowup, we need to prevent useless backtracking at some point. This can be done by enclosing the inner quantifier as an independent subexpression:

  1. $x =~ /\( ( (?>[^()]+) | \([^()]*\) )+ \)/x;

Here, (?>[^()]+) breaks the degeneracy of string partitioning by gobbling up as much of the string as possible and keeping it. Then match failures fail much more quickly.

Conditional expressions

A conditional expression is a form of if-then-else statement that allows one to choose which patterns are to be matched, based on some condition. There are two types of conditional expression: (?(condition)yes-regexp) and (?(condition)yes-regexp|no-regexp). (?(condition)yes-regexp) is like an 'if () {}' statement in Perl. If the condition is true, the yes-regexp will be matched. If the condition is false, the yes-regexp will be skipped and Perl will move on to the next regexp element. The second form is like an 'if () {} else {}' statement in Perl. If the condition is true, the yes-regexp will be matched, otherwise the no-regexp will be matched.

The condition can have several forms. The first form is simply an integer in parentheses (integer) . It is true if the corresponding backreference \integer matched earlier in the regexp. The same thing can be done with a name associated with a capture group, written as (<name>) or ('name') . The second form is a bare zero-width assertion (?...), either a lookahead, a lookbehind, or a code assertion (discussed in the next section). The third set of forms provides tests that return true if the expression is executed within a recursion ((R) ) or is being called from some capturing group, referenced either by number ((R1) , (R2) ,...) or by name ((R&name) ).

The integer or name form of the condition allows us to choose, with more flexibility, what to match based on what matched earlier in the regexp. This searches for words of the form "$x$x" or "$x$y$y$x" :

  1. % simple_grep '^(\w+)(\w+)?(?(2)\g2\g1|\g1)$' /usr/dict/words
  2. beriberi
  3. coco
  4. couscous
  5. deed
  6. ...
  7. toot
  8. toto
  9. tutu
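If no dictionary file is at hand, the same pattern can be tried against a short list inside a program. This is only a sketch: the word list is an arbitrary sample, and qr// is used simply to give the pattern a name.

```perl
use strict;
use warnings;

# Same pattern as in the simple_grep call above, precompiled with qr//.
my $re = qr/^(\w+)(\w+)?(?(2)\g2\g1|\g1)$/;

my @words = qw(beriberi coco deed toto perl regexp);   # arbitrary sample
my @hits  = grep { $_ =~ $re } @words;

print "@hits\n";   # beriberi coco deed toto
```

Here "deed" is matched through the yes-branch (group 2 participated, so \g2\g1 is required), while "coco" is matched through the no-branch.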

The lookbehind condition allows, along with backreferences, an earlier part of the match to influence a later part of the match. For instance,

  1. /[ATGC]+(?(?<=AA)G|C)$/;

matches a DNA sequence such that it either ends in AAG , or some other base pair combination and C . Note that the form is (?(?<=AA)G|C) and not (?((?<=AA))G|C); for the lookahead, lookbehind or code assertions, the parentheses around the conditional are not needed.
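A quick way to see the lookbehind condition at work is to run the pattern over a few short sequences. The sequences below are made up for the illustration.

```perl
use strict;
use warnings;

my $dna_re = qr/[ATGC]+(?(?<=AA)G|C)$/;

my @seqs = qw(ATGAAG ATGC ATGAAC ATGG);   # hypothetical test sequences

for my $seq (@seqs) {
    printf "%-7s %s\n", $seq,
        $seq =~ $dna_re ? "matches" : "does not match";
}

my @matched = grep { $_ =~ $dna_re } @seqs;   # ATGAAG and ATGC
```

ATGAAC fails because the final base is preceded by AA, so the condition demands a G. ATGG fails because its final base is not preceded by AA, so the condition demands a C.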

Defining named patterns

Some regular expressions use identical subpatterns in several places. Starting with Perl 5.10, it is possible to define named subpatterns in a section of the pattern so that they can be called up by name anywhere in the pattern. The syntax for this definition group is (?(DEFINE)(?<name>pattern)...). An insertion of a named pattern is written as (?&name).

The example below illustrates this feature using the pattern for floating point numbers that was presented earlier on. The three subpatterns that are used more than once are the optional sign, the digit sequence for an integer and the decimal fraction. The DEFINE group at the end of the pattern contains their definition. Notice that the decimal fraction pattern is the first place where we can reuse the integer pattern.

    /^ (?&osg)\ * ( (?&int)(?&dec)? | (?&dec) )
       (?: [eE](?&osg)(?&int) )?
     $
     (?(DEFINE)
       (?<osg>[-+]?)         # optional sign
       (?<int>\d++)          # integer
       (?<dec>\.(?&int))     # decimal fraction
     )/x
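To see the definition group at work, here is a small usage sketch that stores the pattern above in a variable and tries it on a handful of strings. The variable names and the sample inputs are assumptions made for illustration; the pattern itself is the one shown above.

```perl
use strict;
use warnings;

# The floating point pattern from the text, precompiled with qr//.
my $float = qr/^ (?&osg)\ * ( (?&int)(?&dec)? | (?&dec) )
                 (?: [eE](?&osg)(?&int) )?
               $
               (?(DEFINE)
                 (?<osg>[-+]?)         # optional sign
                 (?<int>\d++)          # integer
                 (?<dec>\.(?&int))     # decimal fraction
               )/x;

# Illustrative inputs: the first four should match, the last two should not.
my @samples = ('3.14', '-1e10', '.5', '+ 42', '1.', 'abc');
my %is_float = map { $_ => ($_ =~ $float ? 1 : 0) } @samples;

printf "%-6s %s\n", "'$_'", $is_float{$_} ? "matches" : "does not match"
    for @samples;
```

Note that '1.' is rejected: the decimal fraction subpattern requires at least one digit after the point, because it reuses the integer subpattern.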

Recursive patterns

This feature (introduced in Perl 5.10) significantly extends the power of Perl's pattern matching. By referring to some other capture group anywhere in the pattern with the construct (?group-ref), the pattern within the referenced group is used as an independent subpattern in place of the group reference itself. Because the group reference may be contained within the group it refers to, it is now possible to apply pattern matching to tasks that hitherto required a recursive parser.

To illustrate this feature, we'll design a pattern that matches if a string contains a palindrome. (This is a word or a sentence that, while ignoring spaces, punctuation and case, reads the same backwards as forwards.) We begin by observing that the empty string or a string containing just one word character is a palindrome. Otherwise it must have a word character up front and the same at its end, with another palindrome in between.

    /(?: (\w) (?...Here be a palindrome...) \g{-1} | \w? )/x

Adding \W* at either end to eliminate what is to be ignored, we already have the full pattern:

    my $pp = qr/^(\W* (?: (\w) (?1) \g{-1} | \w? ) \W*)$/ix;
    for $s ( "saippuakauppias", "A man, a plan, a canal: Panama!" ){
        print "'$s' is a palindrome\n" if $s =~ /$pp/;
    }

In (?...) both absolute and relative backreferences may be used. The entire pattern can be reinserted with (?R) or (?0). If you prefer to name your groups, you can use (?&name) to recurse into that group.
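Another task that traditionally needed a recursive parser is checking for balanced parentheses. The following is a minimal sketch, not taken from the tutorial: the variable name and test strings are made up, and the pattern recurses into group 1 with (?1) wherever a nested parenthesized section may appear.

```perl
use strict;
use warnings;

# Group 1 is one balanced parenthesized section: '(' followed by any mix
# of non-parenthesis runs and nested sections (recursion via (?1)), then ')'.
my $balanced = qr/^( \( (?: [^()]++ | (?1) )* \) )$/x;

my %ok;
for my $s ('()', '(a(b)c)', '((x)(y))', '(a(b)c', ')(') {
    $ok{$s} = $s =~ $balanced ? 1 : 0;
    print "'$s' is ", ($ok{$s} ? "balanced" : "not balanced"), "\n";
}
```

The possessive quantifier in [^()]++ is a design choice rather than a requirement: it stops the engine from re-splitting runs of ordinary characters when a match is going to fail anyway.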

A bit of magic: executing Perl code in a regular expression

Normally, regexps are a part of Perl expressions. Code evaluation expressions turn that around by allowing arbitrary Perl code to be a part of a regexp. A code evaluation expression is denoted (?{code}), with code a string of Perl statements.

Be warned that this feature is considered experimental, and may be changed without notice.

Code expressions are zero-width assertions, and the value they return depends on their environment. There are two possibilities: either the code expression is used as a conditional in a conditional expression (?(condition)...), or it is not. If the code expression is a conditional, the code is evaluated and the result (i.e., the result of the last statement) is used to determine truth or falsehood. If the code expression is not used as a conditional, the assertion always evaluates true and the result is put into the special variable $^R. The variable $^R can then be used in code expressions later in the regexp. Here are some silly examples:

    $x = "abcdef";
    $x =~ /abc(?{print "Hi Mom!";})def/; # matches,
                                         # prints 'Hi Mom!'
    $x =~ /aaa(?{print "Hi Mom!";})def/; # doesn't match,
                                         # no 'Hi Mom!'

Pay careful attention to the next example:

    $x =~ /abc(?{print "Hi Mom!";})ddd/; # doesn't match,
                                         # no 'Hi Mom!'
                                         # but why not?

At first glance, you'd think that it shouldn't print, because obviously the ddd isn't going to match the target string. But look at this example:

    $x =~ /abc(?{print "Hi Mom!";})[dD]dd/; # doesn't match,
                                            # but _does_ print

Hmm. What happened here? If you've been following along, you know that the above pattern should be effectively (almost) the same as the last one; enclosing the d in a character class isn't going to change what it matches. So why does the first not print while the second one does?

The answer lies in the optimizations the regex engine makes. In the first case, all the engine sees are plain old characters (aside from the ?{} construct). It's smart enough to realize that the string 'ddd' doesn't occur in our target string before actually running the pattern through. But in the second case, we've tricked it into thinking that our pattern is more complicated. It takes a look, sees our character class, and decides that it will have to actually run the pattern to determine whether or not it matches, and in the process of running it hits the print statement before it discovers that we don't have a match.

To take a closer look at how the engine does optimizations, see the section Pragmas and debugging below.

More fun with ?{}:

    $x =~ /(?{print "Hi Mom!";})/;          # matches,
                                            # prints 'Hi Mom!'
    $x =~ /(?{$c = 1;})(?{print "$c";})/;   # matches,
                                            # prints '1'
    $x =~ /(?{$c = 1;})(?{print "$^R";})/;  # matches,
                                            # prints '1'

The bit of magic mentioned in the section title occurs when the regexp backtracks in the process of searching for a match. If the regexp backtracks over a code expression and if the variables used within are localized using local, the changes in the variables produced by the code expression are undone! Thus, if we wanted to count how many times a character got matched inside a group, we could use, e.g.,

    $x = "aaaa";
    $count = 0;  # initialize 'a' count
    $c = "bob";  # test if $c gets clobbered
    $x =~ /(?{local $c = 0;})         # initialize count
           ( a                        # match 'a'
             (?{local $c = $c + 1;})  # increment count
           )*                         # do this any number of times,
           aa                         # but match 'aa' at the end
           (?{$count = $c;})          # copy local $c var into $count
          /x;
    print "'a' count is $count, \$c variable is '$c'\n";

This prints

    'a' count is 2, $c variable is 'bob'

If we replace the (?{local $c = $c + 1;}) with (?{$c = $c + 1;}), the variable changes are not undone during backtracking, and we get

    'a' count is 4, $c variable is 'bob'

Note that only localized variable changes are undone. Other side effects of code expression execution are permanent. Thus

    $x = "aaaa";
    $x =~ /(a(?{print "Yow\n";}))*aa/;

produces

    Yow
    Yow
    Yow
    Yow

The result $^R is automatically localized, so that it will behave properly in the presence of backtracking.
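Because $^R is restored on backtracking, the counting example above can also be written without local, by threading the count through $^R itself. This is a sketch under the assumption that each non-conditional code expression stores its result in $^R as described earlier; the package variable $count is introduced here only to carry the result out of the match.

```perl
use strict;
use warnings;

our $count = 0;          # receives the final result
my $x = "aaaa";

$x =~ /(?{ 0 })                # seed the result with 0
       (?: a (?{ $^R + 1 }) )* # every matched 'a' yields the previous result plus one
       aa                      # force the loop to give back two letters
       (?{ $count = $^R })     # keep the value that survived backtracking
      /x;

print "'a' count is $count\n";
```

The loop first consumes all four letters, then backtracks twice so that the trailing aa can match; each backtracking step puts the earlier value of $^R back, so the count reported is the number of letters that remain inside the loop.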

This example uses a code expression in a conditional to match a definite article, either 'the' in English or 'der|die|das' in German:

    $lang = 'DE';  # use German
    ...
    $text = "das";
    print "matched\n"
        if $text =~ /(?(?{
                          $lang eq 'EN'; # is the language English?
                         })
                       the |             # if so, then match 'the'
                       (der|die|das)     # else, match 'der|die|das'
                     )
                    /xi;

Note that the syntax here is (?(?{...})yes-regexp|no-regexp), not (?((?{...}))yes-regexp|no-regexp). In other words, in the case of a code expression, we don't need the extra parentheses around the conditional.

If you try to use code expressions where the code text is contained within an interpolated variable, rather than appearing literally in the pattern, Perl may surprise you:

    $bar = 5;
    $pat = '(?{ 1 })';
    /foo(?{ $bar })bar/; # compiles ok, $bar not interpolated
    /foo(?{ 1 })$bar/;   # compiles ok, $bar interpolated
    /foo${pat}bar/;      # compile error!

    $pat = qr/(?{ $foo = 1 })/;  # precompile code regexp
    /foo${pat}bar/;      # compiles ok

If a regexp has a variable that interpolates a code expression, Perl treats the regexp as an error. If the code expression is precompiled into a variable, however, interpolating is ok. The question is, why is this an error?

The reason is that variable interpolation and code expressions together pose a security risk. The combination is dangerous because many programmers who write search engines often take user input and plug it directly into a regexp:

    $regexp = <>;       # read user-supplied regexp
    chomp $regexp;      # get rid of possible newline
    $text =~ /$regexp/; # search $text for the $regexp

If the $regexp variable contains a code expression, the user could then execute arbitrary Perl code. For instance, some joker could search for system('rm -rf *'); to erase your files. In this sense, the combination of interpolation and code expressions taints your regexp. So by default, using both interpolation and code expressions in the same regexp is not allowed. If you're not concerned about malicious users, it is possible to bypass this security check by invoking use re 'eval':

    use re 'eval';       # throw caution out the door
    $bar = 5;
    $pat = '(?{ 1 })';
    /foo${pat}bar/;      # compiles ok
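The difference can be observed directly. The sketch below (variable names are illustrative) traps the error that Perl raises when a plain string containing a code expression is interpolated, then repeats the match inside a block where use re 'eval' is in effect. Since the pattern is only assembled when the match runs, the error surfaces at run time and can be caught with eval {}.

```perl
use strict;
use warnings;

my $pat = '(?{ 1 })';      # code expression held in a plain string

# Without "use re 'eval'" the interpolated pattern is rejected.
my $refused = !eval { my $m = "foobar" =~ /foo${pat}bar/; 1 };
my $error   = $@;
print "refused: $error" if $refused;

# The pragma is lexically scoped, so the opt-in covers only this block.
my $matched;
{
    use re 'eval';
    $matched = "foobar" =~ /foo${pat}bar/ ? 1 : 0;
}
print "matched with use re 'eval'\n" if $matched;
```

Keeping the pragma inside the smallest possible block limits the places where interpolated text is allowed to run code.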

Another form of code expression is the pattern code expression. The pattern code expression is like a regular code expression, except that the result of the code evaluation is treated as a regular expression and matched immediately. A simple example is

    $length = 5;
    $char = 'a';
    $x = 'aaaaabb';
    $x =~ /(??{$char x $length})/x; # matches, there are 5 of 'a'
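A pattern code expression can also build its subpattern from something captured earlier in the same match. As a sketch, assume a made-up "length:payload" format such as 3:abc, where the number says how many characters follow the colon. The helper name payload and the format are hypothetical:

```perl
use strict;
use warnings;

# Return the payload of a hypothetical "length:payload" string,
# or undef when the declared length does not fit the data.
sub payload {
    my ($s) = @_;
    # (??{ ... }) builds ".{N}" from the digits captured in group 1
    # and matches it on the spot.
    return $s =~ /^(\d+):((??{ ".{$1}" }))$/ ? $2 : undef;
}

for my $s ('3:abc', '5:hello', '3:abcd') {
    my $p = payload($s);
    print "'$s' -> ", (defined $p ? "'$p'" : "no match"), "\n";
}
```

The subpattern is generated anew each time the engine reaches it, so it always reflects the current value of $1. Very large declared lengths would exceed the quantifier limit, which this sketch does not guard against.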

This final example contains both ordinary and pattern code expressions. It detects whether a binary string 1101010010001... has a Fibonacci spacing 0,1,1,2,3,5,... of the 1's:

    $x = "1101010010001000001";
    $z0 = ''; $z1 = '0';   # initial conditions
    print "It is a Fibonacci sequence\n"
        if $x =~ /^1         # match an initial '1'
            (?:
               ((??{ $z0 })) # match some '0'
               1             # and then a '1'
               (?{ $z0 = $z1; $z1 .= $^N; })
            )+   # repeat as needed
          $      # that is all there is
         /x;
    printf "Largest sequence matched was %d\n", length($z1)-length($z0);

Remember that $^N is set to whatever was matched by the last completed capture group. This prints

    It is a Fibonacci sequence
    Largest sequence matched was 5

Ha! Try that with your garden variety regexp package...

Note that the variables $z0 and $z1 are not substituted when the regexp is compiled, as happens for ordinary variables outside a code expression. Rather, the whole code block is parsed as perl code at the same time as perl is compiling the code containing the literal regexp pattern.

The regexp without the //x modifier is

    /^1(?:((??{ $z0 }))1(?{ $z0 = $z1; $z1 .= $^N; }))+$/

which shows that spaces are still possible in the code parts. Nevertheless, when working with code and conditional expressions, the extended form of regexps is almost necessary in creating and debugging regexps.

Backtracking control verbs

Perl 5.10 introduced a number of control verbs intended to provide detailed control over the backtracking process, by directly influencing the regexp engine and by providing monitoring techniques. As all the features in this group are experimental and subject to change or removal in a future version of Perl, the interested reader is referred to Special Backtracking Control Verbs in perlre for a detailed description.

Below is just one example, illustrating the control verb (*FAIL), which may be abbreviated as (*F). If this is inserted in a regexp it will cause it to fail, just as it would at some mismatch between the pattern and the string. Processing of the regexp continues as it would after any "normal" failure, so that, for instance, the next position in the string or another alternative will be tried. As failing to match doesn't preserve capture groups or produce results, it may be necessary to use this in combination with embedded code.

    %count = ();
    "supercalifragilisticexpialidocious" =~
        /([aeiou])(?{ $count{$1}++; })(*FAIL)/i;
    printf "%3d '%s'\n", $count{$_}, $_ for (sort keys %count);

The pattern begins with a class matching a subset of letters. Whenever this matches, a statement like $count{'a'}++; is executed, incrementing the letter's counter. Then (*FAIL) does what it says, and the regexp engine proceeds according to the book: as long as the end of the string hasn't been reached, the position is advanced before looking for another vowel. Thus, match or no match makes no difference, and the regexp engine proceeds until the entire string has been inspected. (It's remarkable that an alternative solution using something like

    $count{lc($_)}++ for split('', "supercalifragilisticexpialidocious");
    printf "%3d '%s'\n", $count{$_}, $_ for ( qw{ a e i o u } );

is considerably slower.)

Pragmas and debugging

Speaking of debugging, there are several pragmas available to control and debug regexps in Perl. We have already encountered one pragma in the previous section, use re 'eval', which allows variable interpolation and code expressions to coexist in a regexp. The other pragmas are

    use re 'taint';
    $tainted = <>;
    @parts = ($tainted =~ /(\w+)\s+(\w+)/); # @parts is now tainted

The taint pragma causes any substrings from a match with a tainted variable to be tainted as well. This is not normally the case, as regexps are often used to extract the safe bits from a tainted variable. Use taint when you are not extracting safe bits, but are performing some other processing. Both taint and eval pragmas are lexically scoped, which means they are in effect only until the end of the block enclosing the pragmas.

    use re '/m';  # or any other flags
    $multiline_string =~ /^foo/; # /m is implied

The re '/flags' pragma (introduced in Perl 5.14) turns on the given regular expression flags until the end of the lexical scope. See '/flags' mode in re for more detail.
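The lexical scoping can be checked with a short sketch (assuming Perl 5.14 or later; the variable names are illustrative). The same literal pattern is case-insensitive inside the block and case-sensitive outside it:

```perl
use strict;
use warnings;

my ($inside, $outside);
{
    use re '/i';                          # applies until the end of this block
    $inside = "HELLO" =~ /hello/ ? 1 : 0;
}
$outside = "HELLO" =~ /hello/ ? 1 : 0;    # default flags again

print "inside the block:  ", ($inside  ? "match" : "no match"), "\n";
print "outside the block: ", ($outside ? "match" : "no match"), "\n";
```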

    use re 'debug';
    /^(.*)$/s;       # output debugging info

    use re 'debugcolor';
    /^(.*)$/s;       # output debugging info in living color

The global debug and debugcolor pragmas allow one to get detailed debugging info about regexp compilation and execution. debugcolor is the same as debug, except the debugging information is displayed in color on terminals that can display termcap color sequences. Here is example output:

    % perl -e 'use re "debug"; "abc" =~ /a*b+c/;'
    Compiling REx 'a*b+c'
    size 9 first at 1
       1: STAR(4)
       2:   EXACT <a>(0)
       4: PLUS(7)
       5:   EXACT <b>(0)
       7: EXACT <c>(9)
       9: END(0)
    floating 'bc' at 0..2147483647 (checking floating) minlen 2
    Guessing start of match, REx 'a*b+c' against 'abc'...
    Found floating substr 'bc' at offset 1...
    Guessed: match at offset 0
    Matching REx 'a*b+c' against 'abc'
      Setting an EVAL scope, savestack=3
       0 <> <abc>             |  1:  STAR
                               EXACT <a> can match 1 times out of 32767...
      Setting an EVAL scope, savestack=3
       1 <a> <bc>             |  4:    PLUS
                               EXACT <b> can match 1 times out of 32767...
      Setting an EVAL scope, savestack=3
       2 <ab> <c>             |  7:      EXACT <c>
       3 <abc> <>             |  9:      END
    Match successful!
    Freeing REx: 'a*b+c'

If you have gotten this far into the tutorial, you can probably guess what the different parts of the debugging output tell you. The first part

    Compiling REx 'a*b+c'
    size 9 first at 1
       1: STAR(4)
       2:   EXACT <a>(0)
       4: PLUS(7)
       5:   EXACT <b>(0)
       7: EXACT <c>(9)
       9: END(0)

describes the compilation stage. STAR(4) means that there is a starred object, in this case 'a', and if it matches, goto line 4, i.e., PLUS(7). The middle lines describe some heuristics and optimizations performed before a match:

    floating 'bc' at 0..2147483647 (checking floating) minlen 2
    Guessing start of match, REx 'a*b+c' against 'abc'...
    Found floating substr 'bc' at offset 1...
    Guessed: match at offset 0

Then the match is executed and the remaining lines describe the process:

    Matching REx 'a*b+c' against 'abc'
      Setting an EVAL scope, savestack=3
       0 <> <abc>             |  1:  STAR
                               EXACT <a> can match 1 times out of 32767...
      Setting an EVAL scope, savestack=3
       1 <a> <bc>             |  4:    PLUS
                               EXACT <b> can match 1 times out of 32767...
      Setting an EVAL scope, savestack=3
       2 <ab> <c>             |  7:      EXACT <c>
       3 <abc> <>             |  9:      END
    Match successful!
    Freeing REx: 'a*b+c'

Each step is of the form n <x> <y>, with <x> the part of the string matched and <y> the part not yet matched. The | 1: STAR says that Perl is at line number 1 in the compilation list above. See Debugging Regular Expressions in perldebguts for much more detail.

An alternative method of debugging regexps is to embed print statements within the regexp. This provides a blow-by-blow account of the backtracking in an alternation:

    "that this" =~ m@(?{print "Start at position ", pos, "\n";})
                     t(?{print "t1\n";})
                     h(?{print "h1\n";})
                     i(?{print "i1\n";})
                     s(?{print "s1\n";})
                         |
                     t(?{print "t2\n";})
                     h(?{print "h2\n";})
                     a(?{print "a2\n";})
                     t(?{print "t2\n";})
                     (?{print "Done at position ", pos, "\n";})
                    @x;

prints

    Start at position 0
    t1
    h1
    t2
    h2
    a2
    t2
    Done at position 4

BUGS

Code expressions, conditional expressions, and independent expressions are experimental. Don't use them in production code. Yet.

SEE ALSO

This is just a tutorial. For the full story on Perl regular expressions, see the perlre regular expressions reference page.

For more information on the matching m// and substitution s/// operators, see Regexp Quote-Like Operators in perlop. For information on the split operation, see split.

For an excellent all-around resource on the care and feeding of regular expressions, see the book Mastering Regular Expressions by Jeffrey Friedl (published by O'Reilly, ISBN 1-56592-257-3).

AUTHOR AND COPYRIGHT

Copyright (c) 2000 Mark Kvale. All rights reserved.

This document may be distributed under the same terms as Perl itself.

Acknowledgments

The inspiration for the stop codon DNA example came from the ZIP code example in chapter 7 of Mastering Regular Expressions.

The author would like to thank Jeff Pinyan, Andrew Johnson, Peter Haworth, Ronald J Kimball, and Joe Smith for all their helpful comments.

 
perlriscos

NAME

perlriscos - Perl version 5 for RISC OS

DESCRIPTION

This document gives instructions for building Perl for RISC OS. It is complicated by the need to cross compile. There is a binary version of perl available from http://www.cp15.org/perl/ which you may wish to use instead of trying to compile it yourself.

BUILD

You need an installed and working gccsdk cross compiler http://gccsdk.riscos.info/ and REXEN http://www.cp15.org/programming/

Firstly, copy the source and build a native copy of perl for your host system. Then, in the source to be cross compiled:

1.

    $ ./Configure

2.

Select the riscos hint file. The default answers for the rest of the questions are usually sufficient.

Note that, if you wish to run Configure non-interactively (see the INSTALL document for details), to have it select the correct hint file, you'll need to provide the argument -Dhintfile=riscos on the Configure command-line.

3.

    $ make miniperl

4.

This should build miniperl and then fail when it tries to run it.

5.

Copy the miniperl executable from the native build done earlier to replace the cross compiled miniperl.

6.

    $ make

7.

This will use miniperl to complete the rest of the build.

AUTHOR

Alex Waugh <alex@alexwaugh.com>

 
perlrun

NAME

perlrun - how to execute the Perl interpreter

SYNOPSIS

perl [ -sTtuUWX ] [ -hv ] [ -V[:configvar] ] [ -cw ] [ -d[t][:debugger] ] [ -D[number/list] ] [ -pna ] [ -Fpattern ] [ -l[octal] ] [ -0[octal/hexadecimal] ] [ -Idir ] [ -m[-]module ] [ -M[-]'module...' ] [ -f ] [ -C [number/list] ] [ -S ] [ -x[dir] ] [ -i[extension] ] [ [-e|-E] 'command' ] [ -- ] [ programfile ] [ argument ]...

DESCRIPTION

The normal way to run a Perl program is by making it directly executable, or else by passing the name of the source file as an argument on the command line. (An interactive Perl environment is also possible--see perldebug for details on how to do that.) Upon startup, Perl looks for your program in one of the following places:

1.

Specified line by line via -e or -E switches on the command line.

2.

Contained in the file specified by the first filename on the command line. (Note that systems supporting the #! notation invoke interpreters this way. See Location of Perl.)

3.

Passed in implicitly via standard input. This works only if there are no filename arguments--to pass arguments to a STDIN-read program you must explicitly specify a "-" for the program name.

With methods 2 and 3, Perl starts parsing the input file from the beginning, unless you've specified a -x switch, in which case it scans for the first line starting with #! and containing the word "perl", and starts there instead. This is useful for running a program embedded in a larger message. (In this case you would indicate the end of the program using the __END__ token.)

The #! line is always examined for switches as the line is being parsed. Thus, if you're on a machine that allows only one argument with the #! line, or worse, doesn't even recognize the #! line, you still can get consistent switch behaviour regardless of how Perl was invoked, even if -x was used to find the beginning of the program.

Because historically some operating systems silently chopped off kernel interpretation of the #! line after 32 characters, some switches may be passed in on the command line, and some may not; you could even get a "-" without its letter, if you're not careful. You probably want to make sure that all your switches fall either before or after that 32-character boundary. Most switches don't actually care if they're processed redundantly, but getting a "-" instead of a complete switch could cause Perl to try to execute standard input instead of your program. And a partial -I switch could also cause odd results.

Some switches do care if they are processed twice, for instance combinations of -l and -0. Either put all the switches after the 32-character boundary (if applicable), or replace the use of -0digits by BEGIN{ $/ = "\0digits"; } .

Parsing of the #! switches starts wherever "perl" is mentioned in the line. The sequences "-*" and "- " are specifically ignored so that you could, if you were so inclined, say

    #!/bin/sh
    #! -*- perl -*- -p
    eval 'exec perl -x -wS $0 ${1+"$@"}'
        if 0;

to let Perl see the -p switch.

A similar trick involves the env program, if you have it.

    #!/usr/bin/env perl

The examples above use a relative path to the perl interpreter, getting whatever version is first in the user's path. If you want a specific version of Perl, say, perl5.14.1, you should place that directly in the #! line's path.

If the #! line does not contain the word "perl" nor the word "indir", the program named after the #! is executed instead of the Perl interpreter. This is slightly bizarre, but it helps people on machines that don't do #!, because they can tell a program that their SHELL is /usr/bin/perl, and Perl will then dispatch the program to the correct interpreter for them.

After locating your program, Perl compiles the entire program to an internal form. If there are any compilation errors, execution of the program is not attempted. (This is unlike the typical shell script, which might run part-way through before finding a syntax error.)

If the program is syntactically correct, it is executed. If the program runs off the end without hitting an exit() or die() operator, an implicit exit(0) is provided to indicate successful completion.

#! and quoting on non-Unix systems

Unix's #! technique can be simulated on other systems:

  • OS/2

    Put

        extproc perl -S -your_switches

    as the first line in *.cmd file (-S due to a bug in cmd.exe's `extproc' handling).

  • MS-DOS

    Create a batch file to run your program, and codify it in ALTERNATE_SHEBANG (see the dosish.h file in the source distribution for more information).

  • Win95/NT

    The Win95/NT installation, when using the ActiveState installer for Perl, will modify the Registry to associate the .pl extension with the perl interpreter. If you install Perl by other means (including building from the sources), you may have to modify the Registry yourself. Note that this means you can no longer tell the difference between an executable Perl program and a Perl library file.

  • VMS

    Put

        $ perl -mysw 'f$env("procedure")' 'p1' 'p2' 'p3' 'p4' 'p5' 'p6' 'p7' 'p8' !
        $ exit++ + ++$status != 0 and $exit = $status = undef;

    at the top of your program, where -mysw are any command line switches you want to pass to Perl. You can now invoke the program directly, by saying perl program, or as a DCL procedure, by saying @program (or implicitly via DCL$PATH by just using the name of the program).

    This incantation is a bit much to remember, but Perl will display it for you if you say perl "-V:startperl".

Command-interpreters on non-Unix systems have rather different ideas on quoting than Unix shells. You'll need to learn the special characters in your command-interpreter (*, \ and " are common) and how to protect whitespace and these characters to run one-liners (see -e below).

On some systems, you may have to change single-quotes to double ones, which you must not do on Unix or Plan 9 systems. You might also have to change a single % to a %%.

For example:

    # Unix
    perl -e 'print "Hello world\n"'

    # MS-DOS, etc.
    perl -e "print \"Hello world\n\""

    # VMS
    perl -e "print ""Hello world\n"""

The problem is that none of this is reliable: it depends on the command and it is entirely possible neither works. If 4DOS were the command shell, this would probably work better:

    perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""

CMD.EXE in Windows NT slipped a lot of standard Unix functionality in when nobody was looking, but just try to find documentation for its quoting rules.

There is no general solution to all of this. It's just a mess.

Location of Perl

It may seem obvious to say, but Perl is useful only when users can easily find it. When possible, it's good for both /usr/bin/perl and /usr/local/bin/perl to be symlinks to the actual binary. If that can't be done, system administrators are strongly encouraged to put (symlinks to) perl and its accompanying utilities into a directory typically found along a user's PATH, or in some other obvious and convenient place.

In this documentation, #!/usr/bin/perl on the first line of the program will stand in for whatever method works on your system. You are advised to use a specific path if you care about a specific version.

    #!/usr/local/bin/perl5.14

or, if you just want to be running at least a particular version, place a statement like this at the top of your program:

    use 5.014;

Command Switches

As with all standard commands, a single-character switch may be clustered with the following switch, if any.

    #!/usr/bin/perl -spi.orig   # same as -s -p -i.orig

Switches include:

  • -0[octal/hexadecimal]

    specifies the input record separator ($/) as an octal or hexadecimal number. If there are no digits, the null character is the separator. Other switches may precede or follow the digits. For example, if you have a version of find which can print filenames terminated by the null character, you can say this:

        find . -name '*.orig' -print0 | perl -n0e unlink

    The special value 00 will cause Perl to slurp files in paragraph mode. Any value 0400 or above will cause Perl to slurp files whole, but by convention the value 0777 is the one normally used for this purpose.

    You can also specify the separator character using hexadecimal notation: -0xHHH..., where the H are valid hexadecimal digits. Unlike the octal form, this one may be used to specify any Unicode character, even those beyond 0xFF. So if you really want a record separator of 0777, specify it as -0x1FF. (This means that you cannot use the -x option with a directory name that consists of hexadecimal digits, or else Perl will think you have specified a hex number to -0.)
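Inside a program, the same record-separator behaviours can be selected by assigning to $/ directly. The following is a minimal sketch rather than a description of how the switch is implemented: the helper records and the in-memory sample data are made up for illustration, and the mapping to the switch values (a NUL separator for -0, the empty string for paragraph mode as with -00, undef for whole-file slurping as with -0777) is stated as the usual Perl-level equivalent.

```perl
use strict;
use warnings;

# Read every record from an in-memory string using the given separator.
sub records {
    my ($data, $sep) = @_;
    local $/ = $sep;                  # undef means "read the whole file at once"
    open my $fh, '<', \$data or die "cannot open in-memory file: $!";
    my @recs = <$fh>;
    close $fh;
    return @recs;
}

my @nul   = records("one\0two\0three\0", "\0");   # NUL-terminated, like -0
my @paras = records("a\nb\n\n\nc\n", "");          # paragraph mode, like -00
my @whole = records("a\nb\n\n\nc\n", undef);       # slurp mode, like -0777

printf "%d NUL-terminated records, %d paragraphs, %d whole-file record\n",
    scalar @nul, scalar @paras, scalar @whole;
```

Using local keeps the change to $/ confined to the helper, so the rest of the program still reads line by line.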

  • -a

    turns on autosplit mode when used with a -n or -p. An implicit split command to the @F array is done as the first thing inside the implicit while loop produced by the -n or -p.

        perl -ane 'print pop(@F), "\n";'

    is equivalent to

        while (<>) {
            @F = split(' ');
            print pop(@F), "\n";
        }

    An alternate delimiter may be specified using -F.

  • -C [number/list]

    The -C flag controls some of the Perl Unicode features.

    As of 5.8.1, the -C can be followed either by a number or a list of option letters. The letters, their numeric values, and effects are as follows; listing the letters is equal to summing the numbers.

        I     1   STDIN is assumed to be in UTF-8
        O     2   STDOUT will be in UTF-8
        E     4   STDERR will be in UTF-8
        S     7   I + O + E
        i     8   UTF-8 is the default PerlIO layer for input streams
        o    16   UTF-8 is the default PerlIO layer for output streams
        D    24   i + o
        A    32   the @ARGV elements are expected to be strings encoded
                  in UTF-8
        L    64   normally the "IOEioA" are unconditional, the L makes
                  them conditional on the locale environment variables
                  (the LC_ALL, LC_CTYPE, and LANG, in the order of
                  decreasing precedence) -- if the variables indicate
                  UTF-8, then the selected "IOEioA" are in effect
        a   256   Set ${^UTF8CACHE} to -1, to run the UTF-8 caching
                  code in debugging mode.

    For example, -COE and -C6 will both turn on UTF-8-ness on both STDOUT and STDERR. Repeating letters is just redundant, not cumulative nor toggling.
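The arithmetic behind the letters can be sketched in a few lines of Perl. The hash below simply copies the values from the table, and the helper name c_number is made up for this illustration. It combines the letters with a bitwise OR, which gives the same result as summing distinct flags and also reflects that repeated or overlapping letters add nothing.

```perl
use strict;
use warnings;

# Numeric values copied from the -C table.
my %value = (I => 1, O => 2, E => 4, S => 7, i => 8, o => 16,
             D => 24, A => 32, L => 64, a => 256);

# Combine a string of option letters into the equivalent number.
sub c_number {
    my ($letters) = @_;
    my $n = 0;
    for my $letter (split //, $letters) {
        die "unknown -C letter '$letter'\n" unless exists $value{$letter};
        $n |= $value{$letter};        # OR, so repeats and overlaps add nothing
    }
    return $n;
}

printf "-C%s is the same as -C%d\n", $_, c_number($_) for qw(OE SD SDL);
```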

    The io options mean that any subsequent open() (or similar I/O operations) in the current file scope will have the :utf8 PerlIO layer implicitly applied to them, in other words, UTF-8 is expected from any input stream, and UTF-8 is produced to any output stream. This is just the default, with explicit layers in open() and with binmode() one can manipulate streams as usual.

    -C on its own (not followed by any number or option list), or the empty string "" for the PERL_UNICODE environment variable, has the same effect as -CSDL. In other words, the standard I/O handles and the default open() layer are UTF-8-fied but only if the locale environment variables indicate a UTF-8 locale. This behaviour follows the implicit (and problematic) UTF-8 behaviour of Perl 5.8.0. (See UTF-8 no longer default under UTF-8 locales in perl581delta.)

    You can use -C0 (or "0" for PERL_UNICODE) to explicitly disable all the above Unicode features.

    The read-only magic variable ${^UNICODE} reflects the numeric value of this setting. This variable is set during Perl startup and is thereafter read-only. If you want runtime effects, use the three-arg open() (see open), the two-arg binmode() (see binmode), and the open pragma (see open).

    (In Perls earlier than 5.8.1 the -C switch was a Win32-only switch that enabled the use of Unicode-aware "wide system call" Win32 APIs. This feature was practically unused, however, and the command line switch was therefore "recycled".)

    Note: Since perl 5.10.1, if the -C option is used on the #! line, it must be specified on the command line as well, since the standard streams are already set up at this point in the execution of the perl interpreter. You can also use binmode() to set the encoding of an I/O stream.
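When the switch cannot be used, the encoding can be chosen per handle at run time. The sketch below shows the two mechanisms mentioned above, a three-argument open() with an explicit layer and binmode() on a handle that is already open. The in-memory file and the sample character are assumptions made only to keep the example self-contained.

```perl
use strict;
use warnings;

my $smiley = "\x{263A}";               # a single character, U+263A

# Three-argument open with an explicit layer, writing to an in-memory file.
my $bytes = '';
open my $out, '>:encoding(UTF-8)', \$bytes or die "cannot open: $!";
print {$out} $smiley;
close $out;

# binmode() adds a layer to a handle that is already open.
binmode(STDOUT, ':encoding(UTF-8)');
printf "%d character was written as %d bytes\n",
    length($smiley), length($bytes);
```

Setting layers explicitly like this affects only the handles named, whereas -C and PERL_UNICODE change the defaults for the whole program.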

  • -c

    causes Perl to check the syntax of the program and then exit without executing it. Actually, it will execute any BEGIN, UNITCHECK, or CHECK blocks and any use statements: these are considered as occurring outside the execution of your program. INIT and END blocks, however, will be skipped.

  • -d
  • -dt

    runs the program under the Perl debugger. See perldebug. If t is specified, it indicates to the debugger that threads will be used in the code being debugged.

  • -d:MOD[=bar,baz]
  • -dt:MOD[=bar,baz]

    runs the program under the control of a debugging, profiling, or tracing module installed as Devel::MOD. E.g., -d:DProf executes the program using the Devel::DProf profiler. As with the -M flag, options may be passed to the Devel::MOD package where they will be received and interpreted by the Devel::MOD::import routine. Again, like -M, use -d:-MOD to call Devel::MOD::unimport instead of import. The comma-separated list of options must follow a = character. If t is specified, it indicates to the debugger that threads will be used in the code being debugged. See perldebug.

  • -Dletters
  • -Dnumber

    sets debugging flags. To watch how it executes your program, use -Dtls. (This works only if debugging is compiled into your Perl.) Another nice value is -Dx, which lists your compiled syntax tree. And -Dr displays compiled regular expressions; the format of the output is explained in perldebguts.

    As an alternative, specify a number instead of a list of letters (e.g., -D14 is equivalent to -Dtls):

               1  p  Tokenizing and parsing (with v, displays parse stack)
               2  s  Stack snapshots (with v, displays all stacks)
               4  l  Context (loop) stack processing
               8  t  Trace execution
              16  o  Method and overloading resolution
              32  c  String/numeric conversions
              64  P  Print profiling info, source file input state
             128  m  Memory and SV allocation
             256  f  Format processing
             512  r  Regular expression parsing and execution
            1024  x  Syntax tree dump
            2048  u  Tainting checks
            4096  U  Unofficial, User hacking (reserved for private,
                     unreleased use)
            8192  H  Hash dump -- usurps values()
           16384  X  Scratchpad allocation
           32768  D  Cleaning up
           65536  S  Op slab allocation
          131072  T  Tokenizing
          262144  R  Include reference counts of dumped variables (eg when
                     using -Ds)
          524288  J  show s,t,P-debug (don't Jump over) on opcodes within
                     package DB
         1048576  v  Verbose: use in conjunction with other flags
         2097152  C  Copy On Write
         4194304  A  Consistency checks on internal structures
         8388608  q  quiet - currently only suppresses the "EXECUTING"
                     message
        16777216  M  trace smart match resolution
        33554432  B  dump suBroutine definitions, including special Blocks
                     like BEGIN
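
    The numeric form is simply the sum of the values of the chosen letters. As a quick check of the -Dtls example, using the values from the table (the shell variable names are only illustrative):

```shell
# Values taken from the table: t = 8, l = 4, s = 2.
t=8; l=4; s=2
tls=$((t + l + s))
echo "-Dtls is the same as -D$tls"
```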

    All these flags require -DDEBUGGING when you compile the Perl executable (but see :opd in Devel::Peek or 'debug' mode in re which may change this). See the INSTALL file in the Perl source distribution for how to do this. This flag is automatically set if you include the -g option when Configure asks you about optimizer/debugger flags.

    If you're just trying to get a printout of each line of Perl code as it executes, the way that sh -x provides for shell scripts, you can't use Perl's -D switch. Instead, do this:

        # If you have "env" utility
        env PERLDB_OPTS="NonStop=1 AutoTrace=1 frame=2" perl -dS program
        # Bourne shell syntax
        $ PERLDB_OPTS="NonStop=1 AutoTrace=1 frame=2" perl -dS program
        # csh syntax
        % (setenv PERLDB_OPTS "NonStop=1 AutoTrace=1 frame=2"; perl -dS program)

    See perldebug for details and variations.

  • -e commandline

    may be used to enter one line of program. If -e is given, Perl will not look for a filename in the argument list. Multiple -e commands may be given to build up a multi-line script. Make sure to use semicolons where you would in a normal program.

  • -E commandline

    behaves just like -e, except that it implicitly enables all optional features (in the main compilation unit). See feature.

  • -f

    Disable executing $Config{sitelib}/sitecustomize.pl at startup.

    Perl can be built so that it by default will try to execute $Config{sitelib}/sitecustomize.pl at startup (in a BEGIN block). This is a hook that allows the sysadmin to customize how Perl behaves. It can for instance be used to add entries to the @INC array to make Perl find modules in non-standard locations.

    Perl actually inserts the following code:

        BEGIN {
            do { local $!; -f "$Config{sitelib}/sitecustomize.pl"; }
                && do "$Config{sitelib}/sitecustomize.pl";
        }

    Since it is an actual do (not a require), sitecustomize.pl doesn't need to return a true value. The code is run in package main , in its own lexical scope. However, if the script dies, $@ will not be set.

    The value of $Config{sitelib} is also determined in C code and not read from Config.pm , which is not loaded.

    The code is executed very early. For example, any changes made to @INC will show up in the output of `perl -V`. Of course, END blocks will be likewise executed very late.

    To determine at runtime if this capability has been compiled in your perl, you can check the value of $Config{usesitecustomize} .

  • -Fpattern

    specifies the pattern to split on if -a is also in effect. The pattern may be surrounded by // , "" , or '' , otherwise it will be put in single quotes. You can't use literal whitespace in the pattern.

  • -h

    prints a summary of the options.

  • -i[extension]

    specifies that files processed by the <> construct are to be edited in-place. It does this by renaming the input file, opening the output file by the original name, and selecting that output file as the default for print() statements. The extension, if supplied, is used to modify the name of the old file to make a backup copy, following these rules:

    If no extension is supplied, and your system supports it, the original file is kept open without a name while the output is redirected to a new file with the original filename. When perl exits, cleanly or not, the original file is unlinked.

    If the extension doesn't contain a * , then it is appended to the end of the current filename as a suffix. If the extension does contain one or more * characters, then each * is replaced with the current filename. In Perl terms, you could think of this as:

        ($backup = $extension) =~ s/\*/$file_name/g;

    This allows you to add a prefix to the backup file, instead of (or in addition to) a suffix:

        $ perl -pi'orig_*' -e 's/bar/baz/' fileA    # backup to 'orig_fileA'

    Or even to place backup copies of the original files into another directory (provided the directory already exists):

        $ perl -pi'old/*.orig' -e 's/bar/baz/' fileA    # backup to 'old/fileA.orig'

    These sets of one-liners are equivalent:

        $ perl -pi -e 's/bar/baz/' fileA            # overwrite current file
        $ perl -pi'*' -e 's/bar/baz/' fileA         # overwrite current file
        $ perl -pi'.orig' -e 's/bar/baz/' fileA     # backup to 'fileA.orig'
        $ perl -pi'*.orig' -e 's/bar/baz/' fileA    # backup to 'fileA.orig'

    From the shell, saying

        $ perl -p -i.orig -e "s/foo/bar/; ... "

    is the same as using the program:

        #!/usr/bin/perl -pi.orig
        s/foo/bar/;

    which is equivalent to

        #!/usr/bin/perl
        $extension = '.orig';
        LINE: while (<>) {
            if ($ARGV ne $oldargv) {
                if ($extension !~ /\*/) {
                    $backup = $ARGV . $extension;
                }
                else {
                    ($backup = $extension) =~ s/\*/$ARGV/g;
                }
                rename($ARGV, $backup);
                open(ARGVOUT, ">$ARGV");
                select(ARGVOUT);
                $oldargv = $ARGV;
            }
            s/foo/bar/;
        }
        continue {
            print;  # this prints to original filename
        }
        select(STDOUT);

    except that the -i form doesn't need to compare $ARGV to $oldargv to know when the filename has changed. It does, however, use ARGVOUT for the selected filehandle. Note that STDOUT is restored as the default output filehandle after the loop.

    As shown above, Perl creates the backup file whether or not any output is actually changed. So this is just a fancy way to copy files:

        $ perl -p -i'/some/file/path/*' -e 1 file1 file2 file3...

    or

        $ perl -p -i'.orig' -e 1 file1 file2 file3...

    You can use eof without parentheses to locate the end of each input file, in case you want to append to each file, or reset line numbering (see example in eof).

    If, for a given file, Perl is unable to create the backup file as specified in the extension then it will skip that file and continue on with the next one (if it exists).

    For a discussion of issues surrounding file permissions and -i, see Why does Perl let me delete read-only files? Why does -i clobber protected files? Isn't this a bug in Perl? in perlfaq5.

    You cannot use -i to create directories or to strip extensions from files.

    Perl does not expand ~ in filenames, which is good, since some folks use it for their backup files:

        $ perl -pi~ -e 's/foo/bar/' file1 file2 file3...

    Note that because -i renames or deletes the original file before creating a new file of the same name, Unix-style soft and hard links will not be preserved.

    Finally, the -i switch does not impede execution when no files are given on the command line. In this case, no backup is made (the original file cannot, of course, be determined) and processing proceeds from STDIN to STDOUT as might be expected.

  • -Idirectory

    Directories specified by -I are prepended to the search path for modules (@INC ).

  • -l[octnum]

    enables automatic line-ending processing. It has two separate effects. First, it automatically chomps $/ (the input record separator) when used with -n or -p. Second, it assigns $\ (the output record separator) to have the value of octnum so that any print statements will have that separator added back on. If octnum is omitted, sets $\ to the current value of $/ . For instance, to trim lines to 80 columns:

        perl -lpe 'substr($_, 80) = ""'

    Note that the assignment $\ = $/ is done when the switch is processed, so the input record separator can be different than the output record separator if the -l switch is followed by a -0 switch:

        gnufind / -print0 | perl -ln0e 'print "found $_" if -p'

    This sets $\ to newline and then sets $/ to the null character.

  • -m[-]module
  • -M[-]module
  • -M[-]'module ...'
  • -[mM][-]module=arg[,arg]...

    -mmodule executes use module (); before executing your program.

    -Mmodule executes use module ; before executing your program. You can use quotes to add extra code after the module name, e.g., '-MMODULE qw(foo bar)'.

    If the first character after the -M or -m is a dash (-) then the 'use' is replaced with 'no'.

    A little builtin syntactic sugar means you can also say -mMODULE=foo,bar or -MMODULE=foo,bar as a shortcut for '-MMODULE qw(foo bar)'. This avoids the need to use quotes when importing symbols. The actual code generated by -MMODULE=foo,bar is use module split(/,/,q{foo,bar}) . Note that the = form removes the distinction between -m and -M.

    A consequence of this is that -MMODULE=number never does a version check, unless MODULE::import() itself is set up to do a version check, which could happen for example if MODULE inherits from Exporter.

  • -n

    causes Perl to assume the following loop around your program, which makes it iterate over filename arguments somewhat like sed -n or awk:

        LINE:
          while (<>) {
              ...             # your program goes here
          }

    Note that the lines are not printed by default. See -p to have lines printed. If a file named by an argument cannot be opened for some reason, Perl warns you about it and moves on to the next file.

    Also note that <> passes command line arguments to open, which doesn't necessarily interpret them as file names. See perlop for possible security implications.

    Here is an efficient way to delete all files that haven't been modified for at least a week:

        find . -mtime +7 -print | perl -nle unlink

    This is faster than using the -exec switch of find because you don't have to start a process on every filename found. It does suffer from the bug of mishandling newlines in pathnames, which you can fix if you follow the example under -0.

    BEGIN and END blocks may be used to capture control before or after the implicit program loop, just as in awk.

  • -p

    causes Perl to assume the following loop around your program, which makes it iterate over filename arguments somewhat like sed:

        LINE:
          while (<>) {
              ...             # your program goes here
          } continue {
              print or die "-p destination: $!\n";
          }

    If a file named by an argument cannot be opened for some reason, Perl warns you about it, and moves on to the next file. Note that the lines are printed automatically. An error occurring during printing is treated as fatal. To suppress printing use the -n switch. A -p overrides a -n switch.

    BEGIN and END blocks may be used to capture control before or after the implicit loop, just as in awk.

  • -s

    enables rudimentary switch parsing for switches on the command line after the program name but before any filename arguments (or before an argument of --). Any switch found there is removed from @ARGV and sets the corresponding variable in the Perl program. The following program prints "1" if the program is invoked with a -xyz switch, and "abc" if it is invoked with -xyz=abc.

        #!/usr/bin/perl -s
        if ($xyz) { print "$xyz\n" }

    Do note that a switch like --help creates the variable ${-help} , which is not compliant with use strict "refs" . Also, when using this option on a script with warnings enabled you may get a lot of spurious "used only once" warnings.

  • -S

    makes Perl use the PATH environment variable to search for the program unless the name of the program contains path separators.

    On some platforms, this also makes Perl append suffixes to the filename while searching for it. For example, on Win32 platforms, the ".bat" and ".cmd" suffixes are appended if a lookup for the original name fails, and if the name does not already end in one of those suffixes. If your Perl was compiled with DEBUGGING turned on, using the -Dp switch to Perl shows how the search progresses.

    Typically this is used to emulate #! startup on platforms that don't support #! . It's also convenient when debugging a script that uses #! , and is thus normally found by the shell's $PATH search mechanism.

    This example works on many platforms that have a shell compatible with Bourne shell:

        #!/usr/bin/perl
        eval 'exec /usr/bin/perl -wS $0 ${1+"$@"}'
            if $running_under_some_shell;

    The system ignores the first line and feeds the program to /bin/sh, which proceeds to try to execute the Perl program as a shell script. The shell executes the second line as a normal shell command, and thus starts up the Perl interpreter. On some systems $0 doesn't always contain the full pathname, so the -S tells Perl to search for the program if necessary. After Perl locates the program, it parses the lines and ignores them because the variable $running_under_some_shell is never true. If the program will be interpreted by csh, you will need to replace ${1+"$@"} with $* , even though that doesn't understand embedded spaces (and such) in the argument list. To start up sh rather than csh, some systems may have to replace the #! line with a line containing just a colon, which will be politely ignored by Perl. Other systems can't control that, and need a totally devious construct that will work under any of csh, sh, or Perl, such as the following:

        eval '(exit $?0)' && eval 'exec perl -wS $0 ${1+"$@"}'
        & eval 'exec /usr/bin/perl -wS $0 $argv:q'
            if $running_under_some_shell;

    If the filename supplied contains directory separators (and so is an absolute or relative pathname), and if that file is not found, platforms that append file extensions will do so and try to look for the file with those extensions added, one by one.

    On DOS-like platforms, if the program does not contain directory separators, it will first be searched for in the current directory before being searched for on the PATH. On Unix platforms, the program will be searched for strictly on the PATH.

  • -t

    Like -T, but taint checks will issue warnings rather than fatal errors. These warnings can now be controlled normally with no warnings qw(taint) .

    Note: This is not a substitute for -T ! This is meant to be used only as a temporary development aid while securing legacy code: for real production code and for new secure code written from scratch, always use the real -T.

  • -T

    turns on "taint" checks so you can test them. Ordinarily these checks are done only when running setuid or setgid. It's a good idea to turn them on explicitly for programs that run on behalf of someone else whom you might not necessarily trust, such as CGI programs or any internet servers you might write in Perl. See perlsec for details. For security reasons, this option must be seen by Perl quite early; usually this means it must appear early on the command line or in the #! line for systems which support that construct.

  • -u

    This switch causes Perl to dump core after compiling your program. You can then in theory take this core dump and turn it into an executable file by using the undump program (not supplied). This speeds startup at the expense of some disk space (which you can minimize by stripping the executable). (Still, a "hello world" executable comes out to about 200K on my machine.) If you want to execute a portion of your program before dumping, use the dump() operator instead. Note: availability of undump is platform specific and may not be available for a specific port of Perl.

  • -U

    allows Perl to do unsafe operations. Currently the only "unsafe" operations are attempting to unlink directories while running as superuser and running setuid programs with fatal taint checks turned into warnings. Note that warnings must be enabled along with this option to actually generate the taint-check warnings.

  • -v

    prints the version and patchlevel of your perl executable.

  • -V

    prints summary of the major perl configuration values and the current values of @INC.

  • -V:configvar

    Prints to STDOUT the value of the named configuration variable(s), with multiples when your configvar argument looks like a regex (has non-letters). For example:

        $ perl -V:libc
            libc='/lib/libc-2.2.4.so';
        $ perl -V:lib.
            libs='-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc';
            libc='/lib/libc-2.2.4.so';
        $ perl -V:lib.*
            libpth='/usr/local/lib /lib /usr/lib';
            libs='-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc';
            lib_ext='.a';
            libc='/lib/libc-2.2.4.so';
            libperl='libperl.a';
            ....

    Additionally, extra colons can be used to control formatting. A trailing colon suppresses the linefeed and terminator ";", allowing you to embed queries into shell commands. (mnemonic: PATH separator ":".)

        $ echo "compression-vars: " `perl -V:z.*: ` " are here !"
        compression-vars: zcat='' zip='zip' are here !

    A leading colon removes the "name=" part of the response; this allows you to map to the name you need. (mnemonic: empty label)

        $ echo "goodvfork="`./perl -Ilib -V::usevfork`
        goodvfork=false;

    Leading and trailing colons can be used together if you need positional parameter values without the names. Note that in the case below, the PERL_API params are returned in alphabetical order.

        $ echo building_on `perl -V::osname: -V::PERL_API_.*:` now
        building_on 'linux' '5' '1' '9' now

  • -w

    prints warnings about dubious constructs, such as variable names mentioned only once and scalar variables used before being set; redefined subroutines; references to undefined filehandles; filehandles opened read-only that you are attempting to write on; values used as a number that don't look like numbers; using an array as though it were a scalar; if your subroutines recurse more than 100 deep; and innumerable other things.

    This switch really just enables the global $^W variable; normally, the lexically scoped use warnings pragma is preferred. You can disable or promote into fatal errors specific warnings using __WARN__ hooks, as described in perlvar and warn. See also perldiag and perltrap. A fine-grained warning facility is also available if you want to manipulate entire classes of warnings; see warnings or perllexwarn.

  • -W

    Enables all warnings regardless of no warnings or $^W . See perllexwarn.

  • -X

    Disables all warnings regardless of use warnings or $^W . See perllexwarn.

  • -x
  • -xdirectory

    tells Perl that the program is embedded in a larger chunk of unrelated text, such as in a mail message. Leading garbage will be discarded until the first line that starts with #! and contains the string "perl". Any meaningful switches on that line will be applied.

    All references to line numbers by the program (warnings, errors, ...) will treat the #! line as the first line. Thus a warning on the 2nd line of the program, which is on the 100th line in the file will be reported as line 2, not as line 100. This can be overridden by using the #line directive. (See Plain Old Comments (Not!) in perlsyn)

    If a directory name is specified, Perl will switch to that directory before running the program. The -x switch controls only the disposal of leading garbage. The program must be terminated with __END__ if there is trailing garbage to be ignored; the program can process any or all of the trailing garbage via the DATA filehandle if desired.

    The directory, if specified, must appear immediately following the -x with no intervening whitespace.

ENVIRONMENT

  • HOME

    Used if chdir has no argument.

  • LOGDIR

    Used if chdir has no argument and HOME is not set.

  • PATH

    Used in executing subprocesses, and in finding the program if -S is used.

  • PERL5LIB

    A list of directories in which to look for Perl library files before looking in the standard library and the current directory. Any architecture-specific and version-specific directories, such as version/archname/, version/, or archname/ under the specified locations are automatically included if they exist, with this lookup done at interpreter startup time. In addition, any directories matching the entries in $Config{inc_version_list} are added. (These typically would be for older compatible perl versions installed in the same directory tree.)

    If PERL5LIB is not defined, PERLLIB is used. Directories are separated (like in PATH) by a colon on Unixish platforms and by a semicolon on Windows (the proper path separator being given by the command perl -V:path_sep).

    When running taint checks, either because the program was running setuid or setgid, or the -T or -t switch was specified, neither PERL5LIB nor PERLLIB is consulted. The program should instead say:

        use lib "/my/directory";

  • PERL5OPT

    Command-line options (switches). Switches in this variable are treated as if they were on every Perl command line. Only the -[CDIMUdmtwW] switches are allowed. When running taint checks (either because the program was running setuid or setgid, or because the -T or -t switch was used), this variable is ignored. If PERL5OPT begins with -T, tainting will be enabled and subsequent options ignored. If PERL5OPT begins with -t, tainting will be enabled, a writable dot removed from @INC, and subsequent options honored.

  • PERLIO

    A space (or colon) separated list of PerlIO layers. If perl is built to use PerlIO system for IO (the default) these layers affect Perl's IO.

    It is conventional to start layer names with a colon (for example, :perlio ) to emphasize their similarity to variable "attributes". But the code that parses layer specification strings, which is also used to decode the PERLIO environment variable, treats the colon as a separator.

    An unset or empty PERLIO is equivalent to the default set of layers for your platform; for example, :unix:perlio on Unix-like systems and :unix:crlf on Windows and other DOS-like systems.

    The list becomes the default for all Perl's IO. Consequently only built-in layers can appear in this list, as external layers (such as :encoding() ) need IO in order to load them! See open pragma for how to add external encodings as defaults.

    Layers it makes sense to include in the PERLIO environment variable are briefly summarized below. For more details see PerlIO.

    • :bytes

      A pseudolayer that turns the :utf8 flag off for the layer below; unlikely to be useful on its own in the global PERLIO environment variable. You perhaps were thinking of :crlf:bytes or :perlio:bytes.

    • :crlf

      A layer which does CRLF to "\n" translation distinguishing "text" and "binary" files in the manner of MS-DOS and similar operating systems. (It currently does not mimic MS-DOS as far as treating Control-Z as an end-of-file marker.)

    • :mmap

      A layer that implements "reading" of files by using mmap(2) to make an entire file appear in the process's address space, and then using that as PerlIO's "buffer".

    • :perlio

      This is a re-implementation of stdio-like buffering written as a PerlIO layer. As such it will call whatever layer is below it for its operations, typically :unix .

    • :pop

      An experimental pseudolayer that removes the topmost layer. Use with the same care as is reserved for nitroglycerine.

    • :raw

      A pseudolayer that manipulates other layers. Applying the :raw layer is equivalent to calling binmode($fh). It makes the stream pass each byte as-is without translation. In particular, both CRLF translation and intuiting :utf8 from the locale are disabled.

      Unlike in earlier versions of Perl, :raw is not just the inverse of :crlf : other layers which would affect the binary nature of the stream are also removed or disabled.

    • :stdio

      This layer provides a PerlIO interface by wrapping system's ANSI C "stdio" library calls. The layer provides both buffering and IO. Note that the :stdio layer does not do CRLF translation even if that is the platform's normal behaviour. You will need a :crlf layer above it to do that.

    • :unix

      Low-level layer that calls read, write, lseek , etc.

    • :utf8

      A pseudolayer that enables a flag in the layer below to tell Perl that output should be in utf8 and that input should be regarded as already in valid utf8 form. WARNING: It does not check for validity and as such should be handled with extreme caution for input, because security violations can occur with non-shortest UTF-8 encodings, etc. Generally :encoding(utf8) is the best option when reading UTF-8 encoded data.

    • :win32

      On Win32 platforms this experimental layer uses native "handle" IO rather than a Unix-like numeric file descriptor layer. Known to be buggy in this release (5.14).

    The default set of layers should give acceptable results on all platforms.

    For Unix platforms that will be the equivalent of "unix perlio" or "stdio". Configure is set up to prefer the "stdio" implementation if the system's library provides for fast access to the buffer; otherwise, it uses the "unix perlio" implementation.

    On Win32 the default in this release (5.14) is "unix crlf". Win32's "stdio" has a number of bugs/mis-features for Perl IO which depend somewhat on the version and vendor of the C compiler. Using our own crlf layer as the buffer avoids those issues and makes things more uniform. The crlf layer provides CRLF conversion as well as buffering.

    This release (5.14) uses unix as the bottom layer on Win32, and so still uses the C compiler's numeric file descriptor routines. There is an experimental native win32 layer, which is expected to be enhanced and should eventually become the default under Win32.

    The PERLIO environment variable is completely ignored when Perl is run in taint mode.

  • PERLIO_DEBUG

    If set to the name of a file or device, certain operations of PerlIO subsystem will be logged to that file, which is opened in append mode. Typical uses are in Unix:

        % env PERLIO_DEBUG=/dev/tty perl script ...

    and under Win32, the approximately equivalent:

        > set PERLIO_DEBUG=CON
        perl script ...

    This functionality is disabled for setuid scripts and for scripts run with -T.

  • PERLLIB

    A list of directories in which to look for Perl library files before looking in the standard library and the current directory. If PERL5LIB is defined, PERLLIB is not used.

    The PERLLIB environment variable is completely ignored when Perl is run in taint mode.

  • PERL5DB

    The command used to load the debugger code. The default is:

        BEGIN { require "perl5db.pl" }

    The PERL5DB environment variable is only used when Perl is started with a bare -d switch.

  • PERL5DB_THREADED

    If set to a true value, indicates to the debugger that the code being debugged uses threads.

  • PERL5SHELL (specific to the Win32 port)

    On Win32 ports only, may be set to an alternative shell that Perl must use internally for executing "backtick" commands or system(). Default is cmd.exe /x/d/c on WindowsNT and command.com /c on Windows95. The value is considered space-separated. Precede any character that needs to be protected, like a space or backslash, with another backslash.

    Note that Perl doesn't use COMSPEC for this purpose because COMSPEC has a high degree of variability among users, leading to portability concerns. Besides, Perl can use a shell that may not be fit for interactive use, and setting COMSPEC to such a shell may interfere with the proper functioning of other programs (which usually look in COMSPEC to find a shell fit for interactive use).

    Before Perl 5.10.0 and 5.8.8, PERL5SHELL was not taint checked when running external commands. It is recommended that you explicitly set (or delete) $ENV{PERL5SHELL} when running in taint mode under Windows.

  • PERL_ALLOW_NON_IFS_LSP (specific to the Win32 port)

    Set to 1 to allow the use of non-IFS compatible LSPs (Layered Service Providers). Perl normally searches for an IFS-compatible LSP because this is required for its emulation of Windows sockets as real filehandles. However, this may cause problems if you have a firewall such as McAfee Guardian, which requires that all applications use its LSP but which is not IFS-compatible, because clearly Perl will normally avoid using such an LSP.

    Setting this environment variable to 1 means that Perl will simply use the first suitable LSP enumerated in the catalog, which keeps McAfee Guardian happy--and in that particular case Perl still works too because McAfee Guardian's LSP actually plays other games which allow applications requiring IFS compatibility to work.

  • PERL_DEBUG_MSTATS

    Relevant only if Perl is compiled with the malloc included with the Perl distribution; that is, if perl -V:d_mymalloc is "define".

    If set, this dumps out memory statistics after execution. If set to an integer greater than one, also dumps out memory statistics after compilation.

  • PERL_DESTRUCT_LEVEL

    Relevant only if your Perl executable was built with -DDEBUGGING, this controls the behaviour of global destruction of objects and other references. See PERL_DESTRUCT_LEVEL in perlhacktips for more information.

  • PERL_DL_NONLAZY

    Set to "1" to have Perl resolve all undefined symbols when it loads a dynamic library. The default behaviour is to resolve symbols when they are used. Setting this variable is useful during testing of extensions, as it ensures that you get an error on misspelled function names even if the test suite doesn't call them.

  • PERL_ENCODING

    If using the use encoding pragma without an explicit encoding name, the PERL_ENCODING environment variable is consulted for an encoding name.

  • PERL_HASH_SEED

    (Since Perl 5.8.1, new semantics in Perl 5.18.0) Used to override the randomization of Perl's internal hash function. The value is expressed in hexadecimal, and may include a leading 0x. Truncated patterns are treated as though they are suffixed with sufficient 0's as required.

    If the option is provided, and PERL_PERTURB_KEYS is NOT set, then a value of '0' implies PERL_PERTURB_KEYS=0 and any other value implies PERL_PERTURB_KEYS=2.

    PLEASE NOTE: The hash seed is sensitive information. Hashes are randomized to protect against local and remote attacks against Perl code. By manually setting a seed, this protection may be partially or completely lost.

    See Algorithmic Complexity Attacks in perlsec, PERL_PERTURB_KEYS, and PERL_HASH_SEED_DEBUG for more information.

  • PERL_PERTURB_KEYS

    (Since Perl 5.18.0) When set to "0" or "NO", traversing keys will be repeatable from run to run for the same PERL_HASH_SEED. Insertion into a hash will not change the order, except to provide for more space in the hash. When combined with setting PERL_HASH_SEED, this mode is as close to pre-5.18 behavior as you can get.

    When set to "1" or "RANDOM", traversing keys will be randomized. Every time a hash is inserted into, the key order will change in a random fashion. The order may not be repeatable in a following program run even if the PERL_HASH_SEED has been specified. This is the default mode for perl.

    When set to "2" or "DETERMINISTIC", inserting keys into a hash will cause the key order to change, but in a way that is repeatable from program run to program run.

    NOTE: Use of this option is considered insecure, and is intended only for debugging non-deterministic behavior in Perl's hash function. Do not use it in production.

    See Algorithmic Complexity Attacks in perlsec and PERL_HASH_SEED and PERL_HASH_SEED_DEBUG for more information. You can get and set the key traversal mask for a specific hash by using the hash_traversal_mask() function from Hash::Util.

  • PERL_HASH_SEED_DEBUG

    (Since Perl 5.8.1.) Set to "1" to display (to STDERR) information about the hash function, seed, and what type of key traversal randomization is in effect at the beginning of execution. This, combined with PERL_HASH_SEED and PERL_PERTURB_KEYS is intended to aid in debugging nondeterministic behaviour caused by hash randomization.

    Note that any information about the hash function, especially the hash seed, is sensitive information: by knowing it, one can craft a denial-of-service attack against Perl code, even remotely; see Algorithmic Complexity Attacks in perlsec for more information. Do not disclose the hash seed to people who don't need to know it. See also hash_seed() and hash_traversal_mask() in Hash::Util.

    An example output might be:

        HASH_FUNCTION = ONE_AT_A_TIME_HARD HASH_SEED = 0x652e9b9349a7a032 PERTURB_KEYS = 1 (RANDOM)

  • PERL_MEM_LOG

    If your Perl was configured with -Accflags=-DPERL_MEM_LOG, setting the environment variable PERL_MEM_LOG enables logging debug messages. The value has the form <number>[m][s][t], where number is the file descriptor number you want to write to (2 is default), and the combination of letters specifies that you want information about (m)emory and/or (s)v, optionally with (t)imestamps. For example, PERL_MEM_LOG=1mst logs all information to stdout. You can write to other opened file descriptors in a variety of ways:

        $ 3>foo3 PERL_MEM_LOG=3m perl ...

  • PERL_ROOT (specific to the VMS port)

    A translation-concealed rooted logical name that contains Perl and the logical device for the @INC path on VMS only. Other logical names that affect Perl on VMS include PERLSHR, PERL_ENV_TABLES, and SYS$TIMEZONE_DIFFERENTIAL, but are optional and discussed further in perlvms and in README.vms in the Perl source distribution.

  • PERL_SIGNALS

    Available in Perls 5.8.1 and later. If set to "unsafe", the pre-Perl-5.8.0 signal behaviour (which is immediate but unsafe) is restored. If set to "safe", then safe (but deferred) signals are used. See Deferred Signals (Safe Signals) in perlipc.

  • PERL_UNICODE

    Equivalent to the -C command-line switch. Note that this is not a boolean variable. Setting this to "1" is not the right way to "enable Unicode" (whatever that would mean). You can use "0" to "disable Unicode", though (or alternatively unset PERL_UNICODE in your shell before starting Perl). See the description of the -C switch for more information.

  • SYS$LOGIN (specific to the VMS port)

    Used if chdir has no argument and HOME and LOGDIR are not set.
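
As an illustration of the PERL_HASH_SEED and PERL_PERTURB_KEYS entries above, the following sketch runs a child perl twice with a fixed seed and key perturbation disabled, then compares the key order that each run reports. The helper name key_order_in_child and the seed value 0x1234 are assumptions made for this example only, and the sketch assumes a platform on which the list form of a pipe open is available. As the entries above warn, such settings are intended for debugging and not for production use.

```perl
use strict;
use warnings;

# Run a child perl with a fixed hash seed and key perturbation disabled,
# and return the order in which it iterates over a small hash.
# The seed value is arbitrary and chosen only for illustration.
sub key_order_in_child {
    local $ENV{PERL_HASH_SEED}    = '0x1234';
    local $ENV{PERL_PERTURB_KEYS} = '0';
    my $code = 'my %h; @h{"a" .. "j"} = (); print join ",", keys %h';
    open(my $fh, '-|', $^X, '-e', $code) or die "Can't run $^X: $!";
    my $order = <$fh>;
    close $fh;
    return $order;
}

my $first  = key_order_in_child();
my $second = key_order_in_child();
print $first eq $second ? "same order\n" : "different order\n";
```

Removing the two local assignments lets the default randomization apply, in which case the two runs are free to report different orders.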

Perl also has environment variables that control how Perl handles data specific to particular natural languages; see perllocale.

Perl and its various modules and components, including its test frameworks, may sometimes make use of certain other environment variables. Some of these are specific to a particular platform. Please consult the appropriate module documentation and any documentation for your platform (like perlsolaris, perllinux, perlmacosx, perlwin32, etc) for variables peculiar to those specific situations.

Perl makes all environment variables available to the program being executed, and passes these along to any child processes it starts. However, programs running setuid would do well to execute the following lines before doing anything else, just to keep people honest:

    $ENV{PATH}  = "/bin:/usr/bin";    # or whatever you need
    $ENV{SHELL} = "/bin/sh" if exists $ENV{SHELL};
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
 
Perl 5 version 18.2 documentation

perlsec

NAME

perlsec - Perl security

DESCRIPTION

Perl is designed to make it easy to program securely even when running with extra privileges, like setuid or setgid programs. Unlike most command line shells, which are based on multiple substitution passes on each line of the script, Perl uses a more conventional evaluation scheme with fewer hidden snags. Additionally, because the language has more builtin functionality, it can rely less upon external (and possibly untrustworthy) programs to accomplish its purposes.

SECURITY VULNERABILITY CONTACT INFORMATION

If you believe you have found a security vulnerability in Perl, please email perl5-security-report@perl.org with details. This points to a closed subscription, unarchived mailing list. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

SECURITY MECHANISMS AND CONCERNS

Taint mode

Perl automatically enables a set of special security checks, called taint mode, when it detects its program running with differing real and effective user or group IDs. The setuid bit in Unix permissions is mode 04000, the setgid bit mode 02000; either or both may be set. You can also enable taint mode explicitly by using the -T command line flag. This flag is strongly suggested for server programs and any program run on behalf of someone else, such as a CGI script. Once taint mode is on, it's on for the remainder of your script.

While in this mode, Perl takes special precautions called taint checks to prevent both obvious and subtle traps. Some of these checks are reasonably simple, such as verifying that path directories aren't writable by others; careful programmers have always used checks like these. Other checks, however, are best supported by the language itself, and it is these checks especially that contribute to making a set-id Perl program more secure than the corresponding C program.

You may not use data derived from outside your program to affect something else outside your program--at least, not by accident. All command line arguments, environment variables, locale information (see perllocale), results of certain system calls (readdir(), readlink(), the variable of shmread(), the messages returned by msgrcv(), the password, gcos and shell fields returned by the getpwxxx() calls), and all file input are marked as "tainted". Tainted data may not be used directly or indirectly in any command that invokes a sub-shell, nor in any command that modifies files, directories, or processes, with the following exceptions:

  • Arguments to print and syswrite are not checked for taintedness.

  • Symbolic methods

        $obj->$method(@args);

    and symbolic sub references

        &{$foo}(@args);
        $foo->(@args);

    are not checked for taintedness. This requires extra carefulness unless you want external data to affect your control flow. Unless you carefully limit what these symbolic values are, people are able to call functions outside your Perl code, such as POSIX::system, in which case they are able to run arbitrary external code.

  • Hash keys are never tainted.
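
For the symbolic method and sub reference case in the list above, one way to limit what externally supplied names can reach is a dispatch table. The following is a minimal sketch of that idea; the names do_start, do_stop, %ACTION_FOR, and dispatch are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Hypothetical actions; a real program would do actual work here.
sub do_start { return "started" }
sub do_stop  { return "stopped" }

# Only names listed in this table can ever be called, whatever the
# caller supplies.
my %ACTION_FOR = (
    start => \&do_start,
    stop  => \&do_stop,
);

sub dispatch {
    my ($name, @args) = @_;
    my $code = $ACTION_FOR{$name}
        or die "Unknown action '$name'\n";
    return $code->(@args);
}

print dispatch('start'), "\n";

# A name outside the table is refused instead of being resolved symbolically.
my $ok = eval { dispatch('POSIX::system'); 1 };
print "rejected: $@" unless $ok;
```

Because the lookup goes through a hash of code references, an unexpected string never becomes a function call, even though taint mode would not have objected to it.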

For efficiency reasons, Perl takes a conservative view of whether data is tainted. If an expression contains tainted data, any subexpression may be considered tainted, even if the value of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some elements of an array or hash can be tainted and others not. The keys of a hash are never tainted.

For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted
    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Considered insecure
                                # (Perl doesn't know about /bin/echo)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set
    $path = $ENV{'PATH'};       # $path now tainted
    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};
    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!
    open(FOO, "< $arg");        # OK - read-only file
    open(FOO, "> $arg");        # Not OK - trying to write
    open(FOO,"echo $arg|");     # Not OK
    open(FOO,"-|")
        or exec 'echo', $arg;   # Also not OK
    $shout = `echo $arg`;       # Insecure, $shout now tainted
    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure
    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Insecure
    exec "sh", '-c', $arg;      # Very insecure!
    @files = <*.c>;             # insecure (uses readdir() or similar)
    @files = glob('*.c');       # insecure (uses readdir() or similar)
    # In either case, the results of glob are tainted, since the list of
    # filenames comes from outside of the program.
    $bad = ($arg, 23);          # $bad will be tainted
    $arg, `true`;               # Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying something like "Insecure dependency" or "Insecure $ENV{PATH}".

The exception to the principle of "one tainted value taints the whole expression" is with the ternary conditional operator ?:. Since code with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for $result to be tainted.

Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, whose use would thus trigger an "Insecure dependency" message, you can use the tainted() function of the Scalar::Util module, available from CPAN and included in Perl since release 5.8.0. Alternatively, you may be able to use the following is_tainted() function.

    sub is_tainted {
        local $@;   # Don't pollute caller's value.
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data anywhere within an expression renders the entire expression tainted. It would be inefficient for every operator to test every argument for taintedness. Instead, the slightly more efficient and conservative approach is used that if any tainted value has been accessed within the same expression, the whole expression is considered tainted.
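
The Scalar::Util route mentioned above can be tried without putting the whole program into taint mode. The sketch below starts a child perl with -T and asks it whether an environment value and a string literal are tainted. The variable name DEMO_VALUE is an assumption invented for this example, and the sketch assumes a perl built with taint support and a platform with list-form pipe opens.

```perl
use strict;
use warnings;

# Program for the child, which runs under -T so that taint mode is on.
my $child_code = <<'END_CHILD';
use Scalar::Util qw(tainted);
my $from_env = $ENV{DEMO_VALUE};   # environment data is tainted
my $literal  = 'abc';              # a literal in the source is not
print join(" ", map { tainted($_) ? "tainted" : "clean" } $from_env, $literal), "\n";
END_CHILD

# DEMO_VALUE is a made-up variable name used only by this example.
local $ENV{DEMO_VALUE} = 'from the environment';

open(my $fh, '-|', $^X, '-T', '-e', $child_code) or die "Can't run $^X: $!";
my $report = <$fh>;
close $fh;
print $report;
```

The child reports the environment value first and the literal second, so the two words can be compared directly against the rules described in this section.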

But testing for taintedness gets you only so far. Sometimes you have just to clear your data's taintedness. Values may be untainted by using them as keys in a hash; otherwise the only way to bypass the tainting mechanism is by referencing subpatterns from a regular expression match. Perl presumes that if you reference a substring using $1, $2, etc., that you knew what you were doing when you wrote the pattern. That means using a bit of thought--don't just blindly untaint anything, or you defeat the entire mechanism. It's better to verify that the variable has only good characters (for certain values of "good") rather than checking whether it has any bad characters. That's because it's far too easy to miss bad characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word" characters (alphabetics, numerics, and underscores), a hyphen, an at sign, or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in '$data'";      # log this somewhere
    }

This is fairly secure because /\w+/ doesn't normally match shell metacharacters, nor are dot, dash, or at going to mean something special to the shell. Use of /.+/ would have been insecure in theory because it lets everything through, but Perl doesn't check for that. The lesson is that when untainting, you must be exceedingly careful with your patterns. Laundering data using a regular expression is the only mechanism for untainting dirty data, unless you use the strategy detailed below to fork a child of lesser privilege.

The example does not untaint $data if use locale is in effect, because the characters matched by \w are determined by the locale. Perl considers that locale definitions are untrustworthy because they contain data from outside the program. If you are writing a locale-aware program, and want to launder data with a regular expression containing \w, put no locale ahead of the expression in the same block. See SECURITY in perllocale for further discussion and examples.
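
The allow-list test above can be packaged as a small reusable function. This is a sketch only: the name launder_name is hypothetical, and anchoring with \A and \z in place of ^ and $ is a choice made for this example so that a value with a trailing newline is rejected too.

```perl
use strict;
use warnings;

# Return a laundered copy of $value if it consists solely of word
# characters, hyphens, at signs, and dots; otherwise return undef.
sub launder_name {
    my ($value) = @_;
    return undef unless defined $value;
    return $value =~ /\A([-\@\w.]+)\z/ ? $1 : undef;
}

for my $input ('report-2014.txt', 'x; rm -rf /') {
    my $clean = launder_name($input);
    print "'$input' ", (defined $clean ? "accepted" : "rejected"), "\n";
}
```

Returning the captured $1 rather than the original argument is what matters under taint mode, since only the captured substring is considered clean. The caveat about use locale described above applies to this function as well.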

Switches On the "#!" Line

When you make a script executable, in order to make it usable as a command, the system will pass switches to perl from the script's #! line. Perl checks that any command line switches given to a setuid (or setgid) script actually match the ones set on the #! line. Some Unix and Unix-like environments impose a one-switch limit on the #! line, so you may need to use something like -wU instead of -w -U under such systems. (This issue should arise only in Unix or Unix-like environments that support #! and setuid or setgid scripts.)

Taint mode and @INC

When the taint mode (-T) is in effect, the "." directory is removed from @INC, and the environment variables PERL5LIB and PERLLIB are ignored by Perl. You can still adjust @INC from outside the program by using the -I command line option as explained in perlrun. The two environment variables are ignored because they are obscured, and a user running a program could be unaware that they are set, whereas the -I option is clearly visible and therefore permitted.

Another way to modify @INC without modifying the program, is to use the lib pragma, e.g.:

    perl -Mlib=/foo program

The benefit of using -Mlib=/foo over -I/foo is that the former will automagically remove any duplicated directories, while the latter will not.

Note that if a tainted string is added to @INC, the following problem will be reported:

    Insecure dependency in require while running with -T switch
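
The treatment of PERL5LIB and -I described in this section can be observed with a short experiment. The sketch below creates a temporary directory, names it in PERL5LIB, and asks a child perl whether that directory appears in @INC, first without -T, then with -T, then with -T plus an explicit -I. The helper name child_sees_dir and the use of a temporary directory are assumptions made for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A scratch directory standing in for a private library path.
my $extra_dir = tempdir(CLEANUP => 1);

# Report "yes" or "no" depending on whether $extra_dir shows up in the
# @INC of a child perl started with the given switches.
sub child_sees_dir {
    my (@switches) = @_;
    local $ENV{PERL5LIB} = $extra_dir;
    my $code = 'my $n = grep { $_ eq $ARGV[0] } @INC; print $n ? "yes" : "no"';
    open(my $fh, '-|', $^X, @switches, '-e', $code, $extra_dir)
        or die "Can't run $^X: $!";
    my $answer = <$fh>;
    close $fh;
    return $answer;
}

print "without -T:     ", child_sees_dir(), "\n";
print "with -T:        ", child_sees_dir('-T'), "\n";
print "with -T and -I: ", child_sees_dir('-T', "-I$extra_dir"), "\n";
```

On a typical build the three answers are expected to be yes, no, and yes, matching the rule that the environment variable is ignored in taint mode while the visible -I switch is still honoured.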

Cleaning Up Your Path

For "Insecure $ENV{PATH}" messages, you need to set $ENV{'PATH'} to a known value, and each directory in the path must be absolute and non-writable by others than its owner and group. You may be surprised to get this message even if the pathname to your executable is fully qualified. This is not generated because you didn't supply a full path to the program; instead, it's generated because you never set your PATH environment variable, or you didn't set it to something that was safe. Because Perl can't guarantee that the executable in question isn't itself going to turn around and execute some other program that is dependent on your PATH, it makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems. Because some shells may use the variables IFS, CDPATH, ENV, and BASH_ENV, Perl checks that those are either empty or untainted when starting subprocesses. You may wish to add something like this to your set-id and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

It's also possible to get into trouble with other operations that don't care whether they use tainted values. Make judicious use of the file tests in dealing with any user-supplied filenames. When possible, do opens and such after properly dropping any special user (or group!) privileges. Perl doesn't prevent you from opening tainted filenames for reading, so be careful what you print out. The tainting mechanism is intended to prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass system and exec explicit parameter lists instead of strings with possible shell wildcards in them. Unfortunately, the open, glob, and backtick functions provide no such alternate calling convention, so more subterfuge will be required.

Perl provides a reasonably safe way to open a file or pipe from a setuid or setgid program: just create a child process with reduced privilege who does the dirty work for you. First, fork a child using the special open syntax that connects the parent and child by a pipe. Now the child resets its ID set and any other per-process attributes, like environment variables, umasks, current working directories, back to the originals or known safe values. Then the child process, which no longer has any special permissions, does the open or other system call. Finally, the child passes the data it managed to access back to the parent. Because the file or pipe was opened in the child while running under less privilege than the parent, it's not apt to be tricked into doing something it shouldn't.

Here's a way to do backticks reasonably safely. Notice how the exec is not called with a string that the shell could expand. This is by far the best way to call something that might be subjected to shell escapes: just never call the shell at all.

    use English '-no_match_vars';
    die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
    if ($pid) {           # parent
        while (<KID>) {
            # do something
        }
        close KID;
    } else {
        my @temp     = ($EUID, $EGID);
        my $orig_uid = $UID;
        my $orig_gid = $GID;
        $EUID = $UID;
        $EGID = $GID;
        # Drop privileges
        $UID  = $orig_uid;
        $GID  = $orig_gid;
        # Make sure privs are really gone
        ($EUID, $EGID) = @temp;
        die "Can't drop privileges"
            unless $UID == $EUID && $GID eq $EGID;
        $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
        # Consider sanitizing the environment even more.
        exec 'myprog', 'arg1', 'arg2'
            or die "can't exec myprog: $!";
    }

A similar strategy would work for wildcard expansion via glob, although you can use readdir instead.
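
The fork-and-exec pattern above can be wrapped in a helper that behaves roughly like backticks but never involves a shell. The following is a minimal sketch that leaves out the privilege-dropping steps to stay short; the name run_capture is hypothetical, and the sketch assumes a platform with a real fork.

```perl
use strict;
use warnings;

# Run @cmd without a shell and return its output lines.
sub run_capture {
    my (@cmd) = @_;
    # Implicit fork: the child's STDOUT is connected to $kid in the parent.
    my $pid = open(my $kid, '-|');
    die "Can't fork: $!" unless defined $pid;
    if ($pid) {                     # parent
        my @lines = <$kid>;
        close $kid;
        return @lines;
    }
    # Child: the indirect-object form of exec never goes through a shell,
    # even when @cmd holds a single element.
    exec { $cmd[0] } @cmd
        or die "can't exec $cmd[0]: $!";
}

# The argument contains shell metacharacters, yet reaches the child verbatim.
my @out = run_capture($^X, '-e', 'print "arg=$ARGV[0]\n"', '*.c; echo pwned');
print @out;
```

Because no shell is started, the wildcard and the semicolon in the last argument are passed through as ordinary characters rather than being interpreted.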

Taint checking is most useful when, although you trust yourself not to have written a program that gives away the farm, you don't necessarily trust those who end up using it not to try to trick it into doing something bad. This is the kind of security checking that's useful for set-id programs and programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the code not to try to do something evil. That's the kind of trust needed when someone hands you a program you've never seen before and says, "Here, run this." For that kind of safety, you might want to check out the Safe module, included standard in the Perl distribution. This module allows the programmer to set up special compartments in which all system operations are trapped and namespace access is carefully controlled. Safe should not be considered bullet-proof, though: it will not prevent the foreign code from setting up infinite loops, allocating gigabytes of memory, or even abusing perl bugs to make the host interpreter crash or behave in unpredictable ways. In any case it's better avoided completely if you're really concerned about security.

Security Bugs

Beyond the obvious problems that stem from giving special privileges to systems as flexible as scripts, on many versions of Unix, set-id scripts are inherently insecure right from the start. The problem is a race condition in the kernel. Between the time the kernel opens the file to see which interpreter to run and when the (now-set-id) interpreter turns around and reopens the file to interpret it, the file in question may have changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled. Unfortunately, there are two ways to disable it. The system can simply outlaw scripts with any set-id bit set, which doesn't help much. Alternately, it can simply ignore the set-id bits on scripts.

However, if the kernel set-id script feature isn't disabled, Perl will complain loudly that your set-id script is insecure. You'll need to either disable the kernel set-id script feature, or put a C wrapper around the script. A C wrapper is just a compiled program that does nothing except call your Perl program. Compiled programs are not subject to the kernel bug that plagues set-id scripts. Here's a simple wrapper, written in C:

    #include <unistd.h>

    #define REAL_PATH "/path/to/script"

    int main(int ac, char **av)
    {
        execv(REAL_PATH, av);
        return 1;   /* reached only if execv() fails */
    }

Compile this wrapper into a binary executable and then make it rather than your script setuid or setgid.

In recent years, vendors have begun to supply systems free of this inherent security bug. On such systems, when the kernel passes the name of the set-id script to open to the interpreter, rather than using a pathname subject to meddling, it instead passes /dev/fd/3. This is a special file already opened on the script, so that there can be no race condition for evil scripts to exploit. On these systems, Perl should be compiled with -DSETUID_SCRIPTS_ARE_SECURE_NOW. The Configure program that builds Perl tries to figure this out for itself, so you should never have to specify this yourself. Most modern releases of SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.

Protecting Your Programs

There are a number of ways to hide the source to your Perl programs, with varying levels of "security".

First of all, however, you can't take away read permission, because the source code has to be readable in order to be compiled and interpreted. (That doesn't mean that a CGI script's source is readable by people on the web, though.) So you have to leave the permissions at the socially friendly 0755 level. This lets only people on your local system see your source.

Some people mistakenly regard this as a security problem. If your program does insecure things, and relies on people not knowing how to exploit those insecurities, it is not secure. It is often possible for someone to determine the insecure things and exploit them without viewing the source. Security through obscurity, the name for hiding your bugs instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN, or Filter::Util::Call and Filter::Simple since Perl 5.8). But crackers might be able to decrypt it. You can try using the byte code compiler and interpreter described below, but crackers might be able to de-compile it. You can try using the native-code compiler described below, but crackers might be able to disassemble it. These pose varying degrees of difficulty to people wanting to get at your code, but none can definitively conceal it (this is true of every language, not just Perl).

If you're concerned about people profiting from your code, then the bottom line is that nothing but a restrictive license will give you legal security. License your software and pepper it with threatening statements like "This is unpublished proprietary software of XYZ Corp. Your access to it does not give you permission to use it blah blah blah." You should see a lawyer to be sure your license's wording will stand up in court.

Unicode

Unicode is a new and complex technology and one may easily overlook certain security pitfalls. See perluniintro for an overview and perlunicode for details, and Security Implications of Unicode in perlunicode for security implications in particular.

Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can be attacked by choosing the input carefully to consume large amounts of either time or space or both. This can lead to so-called Denial of Service (DoS) attacks.

  • Hash Algorithm - Hash algorithms like the one used in Perl are well known to be vulnerable to collision attacks on their hash function. Such attacks involve constructing a set of keys which collide into the same bucket, producing inefficient behavior. Such attacks often depend on discovering the seed of the hash function used to map the keys to buckets. That seed is then used to brute-force a key set which can be used to mount a denial of service attack. In Perl 5.8.1 changes were introduced to harden Perl against such attacks, and then later in Perl 5.18.0 these features were enhanced and additional protections added.

    At the time of this writing, Perl 5.18.0 is considered to be well-hardened against algorithmic complexity attacks on its hash implementation. This is largely owing to the following measures, which mitigate attacks:

    • Hash Seed Randomization

      In order to make it impossible to know what seed to generate an attack key set for, this seed is randomly initialized at process start. This may be overridden by using the PERL_HASH_SEED environment variable, see PERL_HASH_SEED in perlrun. This environment variable controls how items are actually stored, not how they are presented via keys, values and each.

    • Hash Traversal Randomization

      Independent of which seed is used in the hash function, keys, values, and each return items in a per-hash randomized order. Modifying a hash by insertion will change the iteration order of that hash. This behavior can be overridden by using hash_traversal_mask() from Hash::Util or by using the PERL_PERTURB_KEYS environment variable, see PERL_PERTURB_KEYS in perlrun. Note that this feature controls the "visible" order of the keys, and not the actual order they are stored in.

    • Bucket Order Perturbance

      When items collide into a given hash bucket, the order they are stored in the chain is no longer predictable in Perl 5.18. This is intended to make it harder to observe collisions. This behavior can be overridden by using the PERL_PERTURB_KEYS environment variable, see PERL_PERTURB_KEYS in perlrun.

    • New Default Hash Function

      The default hash function has been modified with the intention of making it harder to infer the hash seed.

    • Alternative Hash Functions

      The source code includes multiple hash algorithms to choose from. While we believe that the default perl hash is robust to attack, we have included the hash function Siphash as a fall-back option. At the time of release of Perl 5.18.0 Siphash is believed to be of cryptographic strength. This is not the default as it is much slower than the default hash.

    Without compiling a special Perl, there is no way to get the exact same behavior of any versions prior to Perl 5.18.0. The closest one can get is by setting PERL_PERTURB_KEYS to 0 and setting the PERL_HASH_SEED to a known value. We do not advise those settings for production use due to the above security considerations.

    Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order and the history of changes made to the hash over its lifetime.

    Also note that while the order of the hash elements might be randomized, this "pseudo-ordering" should not be used for applications like shuffling a list randomly (use List::Util::shuffle() for that, see List::Util, a standard core module since Perl 5.8.0; or the CPAN module Algorithm::Numerical::Shuffle), or for generating permutations (use e.g. the CPAN modules Algorithm::Permute or Algorithm::FastPermute), or for any cryptographic applications.

  • Regular expressions - Perl's regular expression engine is a so-called NFA (Non-deterministic Finite Automaton), which among other things means that it can rather easily consume large amounts of both time and space if the regular expression may match in several ways. Careful crafting of the regular expressions can help but quite often there really isn't much one can do (the book "Mastering Regular Expressions" is required reading, see perlfaq2). Running out of space manifests itself by Perl running out of memory.

  • Sorting - the quicksort algorithm used in Perls before 5.8.0 to implement the sort() function is very easy to trick into misbehaving so that it consumes a lot of time. Starting from Perl 5.8.0 a different sorting algorithm, mergesort, is used by default. Mergesort cannot misbehave on any input.
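
For the hash ordering points in the list above, the usual approach is to impose an order explicitly whenever one is needed. The sketch below sorts the keys for a repeatable order and uses List::Util::shuffle for a random one; the hash %port_for and its contents are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Example data, invented for this sketch.
my %port_for = (http => 80, https => 443, ssh => 22);

# Sort explicitly whenever the order matters; keys() alone promises nothing.
my @stable = sort keys %port_for;

# For a random order, shuffle a list rather than relying on hash order.
my @random = shuffle(@stable);

print "stable: @stable\n";
print "random: @random\n";
```

Neither result depends on PERL_HASH_SEED or PERL_PERTURB_KEYS, so code written this way behaves the same across Perl versions with different hash implementations.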

See http://www.cs.rice.edu/~scrosby/hash/ for more information, and any computer science textbook on algorithmic complexity.

SEE ALSO

perlrun for its description of cleaning up environment variables.

 
Perl 5 version 18.2 documentation

perlsolaris

NAME

perlsolaris - Perl version 5 on Solaris systems

DESCRIPTION

This document describes various features of Sun's Solaris operating system that will affect how Perl version 5 (hereafter just perl) is compiled and/or runs. Some issues relating to the older SunOS 4.x are also discussed, though they may be out of date.

For the most part, everything should just work.

Starting with Solaris 8, perl5.00503 (or higher) is supplied with the operating system, so you might not even need to build a newer version of perl at all. The Sun-supplied version is installed in /usr/perl5 with /usr/bin/perl pointing to /usr/perl5/bin/perl. Do not disturb that installation unless you really know what you are doing. If you remove the perl supplied with the OS, you will render some bits of your system inoperable. If you wish to install a newer version of perl, install it under a different prefix from /usr/perl5. Common prefixes to use are /usr/local and /opt/perl.

You may wish to put your version of perl in the PATH of all users by changing the link /usr/bin/perl. This is probably OK, as most perl scripts shipped with Solaris use an explicit path. (There are a few exceptions, such as /usr/bin/rpm2cpio and /etc/rcm/scripts/README, but these are also sufficiently generic that the actual version of perl probably doesn't matter too much.)

Solaris ships with a range of Solaris-specific modules. If you choose to install your own version of perl you will find the source of many of these modules is available on CPAN under the Sun::Solaris:: namespace.

Solaris may include two versions of perl, e.g. Solaris 9 includes both 5.005_03 and 5.6.1. This is to provide stability across Solaris releases, in cases where a later perl version has incompatibilities with the version included in the preceding Solaris release. The default perl version will always be the most recent, and in general the old version will only be retained for one Solaris release. Note also that the default perl will NOT be configured to search for modules in the older version, again due to compatibility/stability concerns. As a consequence if you upgrade Solaris, you will have to rebuild/reinstall any additional CPAN modules that you installed for the previous Solaris version. See the CPAN manpage under 'autobundle' for a quick way of doing this.

As an interim measure, you may either change the #! line of your scripts to specifically refer to the old perl version, e.g. on Solaris 9 use #!/usr/perl5/5.00503/bin/perl to use the perl version that was the default for Solaris 8, or if you have a large number of scripts it may be more convenient to make the old version of perl the default on your system. You can do this by changing the appropriate symlinks under /usr/perl5 as follows (example for Solaris 9):

    # cd /usr/perl5
    # rm bin man pod
    # ln -s ./5.00503/bin
    # ln -s ./5.00503/man
    # ln -s ./5.00503/lib/pod
    # rm /usr/bin/perl
    # ln -s ../perl5/5.00503/bin/perl /usr/bin/perl

In both cases this should only be considered to be a temporary measure - you should upgrade to the later version of perl as soon as is practicable.

Note also that the perl command-line utilities (e.g. perldoc) and any that are added by modules that you install will be under /usr/perl5/bin, so that directory should be added to your PATH.
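One way to add /usr/perl5/bin to PATH without listing it twice is sketched below. The helper name path_with is made up for this illustration, and the sketch assumes a POSIX-style shell such as /bin/ksh or bash.

```shell
# path_with DIR LIST: print LIST with DIR appended, unless DIR is already in it.
# (path_with is a hypothetical helper name, not something shipped with Solaris or perl.)
path_with() {
    pw_dir=$1
    pw_list=$2
    case ":$pw_list:" in
        *":$pw_dir:"*) printf '%s\n' "$pw_list" ;;                     # already present
        *)             printf '%s\n' "${pw_list:+$pw_list:}$pw_dir" ;; # append at the end
    esac
}

# Typical use in a login script:
#   PATH=$(path_with /usr/perl5/bin "$PATH"); export PATH
```

Appending rather than prepending keeps any perl you installed under another prefix ahead of the Sun-supplied utilities; whether that ordering is what you want depends on your setup.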

Solaris Version Numbers.

For consistency with common usage, perl's Configure script performs some minor manipulations on the operating system name and version number as reported by uname. Here's a partial translation table:

    Sun:                            perl's Configure:
    uname    uname -r   Name           osname     osvers
    SunOS    4.1.3      Solaris 1.1    sunos      4.1.3
    SunOS    5.6        Solaris 2.6    solaris    2.6
    SunOS    5.8        Solaris 8      solaris    2.8
    SunOS    5.9        Solaris 9      solaris    2.9
    SunOS    5.10       Solaris 10     solaris    2.10

The complete table can be found in the Sun Managers' FAQ ftp://ftp.cs.toronto.edu/pub/jdd/sunmanagers/faq under "9.1) Which Sun models run which versions of SunOS?".
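The pattern in the rows above can be written as a small shell function. This is only a sketch of the mapping visible in the table, not a copy of what Configure does internally; the function name solaris_osname_osvers is an assumption made for the example.

```shell
# Translate `uname -s` and `uname -r` values into the osname/osvers pair
# shown in the table. Only the SunOS 4.x and 5.x patterns from the table
# are handled; anything else is reported as "unknown".
solaris_osname_osvers() {
    so_sysname=$1
    so_release=$2
    case "$so_sysname:$so_release" in
        SunOS:5.*) printf 'solaris 2.%s\n' "${so_release#5.}" ;;  # 5.8 -> solaris 2.8
        SunOS:4.*) printf 'sunos %s\n' "$so_release" ;;           # 4.1.3 -> sunos 4.1.3
        *)         printf 'unknown %s\n' "$so_release" ;;
    esac
}

# Typical use:
#   solaris_osname_osvers "$(uname -s)" "$(uname -r)"
```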

RESOURCES

There are many, many sources for Solaris information. A few of the important ones for perl:

SETTING UP

File Extraction Problems on Solaris.

Be sure to use a tar program compiled under Solaris (not SunOS 4.x) to extract the perl-5.x.x.tar.gz file. Do not use GNU tar compiled for SunOS4 on Solaris. (GNU tar compiled for Solaris should be fine.) When you run SunOS4 binaries on Solaris, the run-time system magically alters pathnames matching m#lib/locale# so that when tar tries to create lib/locale.pm, a file named lib/oldlocale.pm gets created instead. If you found this advice too late and used a SunOS4-compiled tar anyway, you must find the incorrectly renamed file and move it back to lib/locale.pm.
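If the source was already extracted with a SunOS4-compiled tar, the repair described above might look like the following sketch. The helper name restore_locale_pm is hypothetical, and the sketch only handles the one file named in this section; other paths matching lib/locale may need the same treatment by hand.

```shell
# restore_locale_pm SRCDIR: move lib/oldlocale.pm back to lib/locale.pm
# inside an extracted perl source tree, but only when the misnamed file
# exists and the correct name is still free.
restore_locale_pm() {
    rl_src=$1
    if [ -f "$rl_src/lib/oldlocale.pm" ] && [ ! -e "$rl_src/lib/locale.pm" ]; then
        mv "$rl_src/lib/oldlocale.pm" "$rl_src/lib/locale.pm"
        echo "restored $rl_src/lib/locale.pm"
    fi
}

# Typical use, from the directory where the tarball was unpacked:
#   restore_locale_pm ./perl-5.x.x
```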

Compiler and Related Tools on Solaris.

You must use an ANSI C compiler to build perl. Perl can be compiled with either Sun's add-on C compiler or with gcc. The C compiler that shipped with SunOS4 will not do.

Include /usr/ccs/bin/ in your PATH.

Several tools needed to build perl are located in /usr/ccs/bin/: ar, as, ld, and make. Make sure that /usr/ccs/bin/ is in your PATH.

On all the released versions of Solaris (8, 9 and 10) you need to make sure the following packages are installed (this info is extracted from the Solaris FAQ):

for tools (sccs, lex, yacc, make, nm, truss, ld, as): SUNWbtool, SUNWsprot, SUNWtoo

for libraries & headers: SUNWhea, SUNWarc, SUNWlibm, SUNWlibms, SUNWdfbh, SUNWcg6h, SUNWxwinc

Additionally, on Solaris 8 and 9 you also need:

for 64 bit development: SUNWarcx, SUNWbtoox, SUNWdplx, SUNWscpux, SUNWsprox, SUNWtoox, SUNWlmsx, SUNWlmx, SUNWlibCx

And only on Solaris 8 you also need:

for libraries & headers: SUNWolinc

If you are in doubt which package contains a file you are missing, try to find an installation that has that file. Then do a

    $ grep /my/missing/file /var/sadm/install/contents

This will display a line like this:

/usr/include/sys/errno.h f none 0644 root bin 7471 37605 956241356 SUNWhea

The last item listed (SUNWhea in this example) is the package you need.
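To print just the package name, the grep can be combined with awk. The sketch below assumes the line layout shown in the example above, with the package as the last whitespace-separated field; pkg_for_file is a made-up helper name, and its optional second argument exists only so the function can be tried against a copy of the contents file.

```shell
# pkg_for_file PATH [CONTENTS]: print the last field of every line in the
# contents database that mentions PATH. CONTENTS defaults to the standard
# Solaris location used in the text.
pkg_for_file() {
    pf_file=$1
    pf_contents=${2:-/var/sadm/install/contents}
    # -F treats the path as a fixed string, so dots are not regex wildcards.
    grep -F -- "$pf_file" "$pf_contents" | awk '{ print $NF }'
}

# Typical use:
#   pkg_for_file /usr/include/sys/errno.h
```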

Avoid /usr/ucb/cc.

You don't need to have /usr/ucb/ in your PATH to build perl. If you want /usr/ucb/ in your PATH anyway, make sure that /usr/ucb/ is NOT in your PATH before the directory containing the right C compiler.

Sun's C Compiler

If you use Sun's C compiler, make sure the correct directory (usually /opt/SUNWspro/bin/) is in your PATH (before /usr/ucb/).

GCC

If you use gcc, make sure your installation is recent and complete. perl versions since 5.6.0 build fine with gcc > 2.8.1 on Solaris >= 2.6.

You must Configure perl with

    $ sh Configure -Dcc=gcc

If you don't, you may experience strange build errors.

If you have updated your Solaris version, you may also have to update your gcc. For example, if you are running Solaris 2.6 and your gcc is installed under /usr/local, check in /usr/local/lib/gcc-lib and make sure you have the appropriate directory, sparc-sun-solaris2.6/ or i386-pc-solaris2.6/. If gcc's directory is for a different version of Solaris than you are running, then you will need to rebuild gcc for your new version of Solaris.

You can get a precompiled version of gcc from http://www.sunfreeware.com/ or http://www.blastwave.org/. Make sure you pick up the package for your Solaris release.

If you wish to use gcc to build add-on modules for use with the perl shipped with Solaris, you should use the Solaris::PerlGcc module which is available from CPAN. The perl shipped with Solaris is configured and built with the Sun compilers, and the compiler configuration information stored in Config.pm is therefore only relevant to the Sun compilers. The Solaris::PerlGcc module contains a replacement Config.pm that is correct for gcc - see the module for details.

GNU as and GNU ld

The following information applies to gcc version 2. Volunteers to update it as appropriate for gcc version 3 would be appreciated.

The versions of as and ld supplied with Solaris work fine for building perl. There is normally no need to install the GNU versions to compile perl.

If you decide to ignore this advice and use the GNU versions anyway, then be sure that they are relatively recent. Versions newer than 2.7 are apparently new enough. Older versions may have trouble with dynamic loading.

If you wish to use GNU ld, then you need to pass it the -Wl,-E flag. The hints/solaris_2.sh file tries to do this automatically by setting the following Configure variables:

    ccdlflags="$ccdlflags -Wl,-E"
    lddlflags="$lddlflags -Wl,-E -G"

However, over the years, changes in gcc, GNU ld, and Solaris ld have made it difficult to automatically detect which ld ultimately gets called. You may have to manually edit config.sh and add the -Wl,-E flags yourself, or else run Configure interactively and add the flags at the appropriate prompts.

If your gcc is configured to use GNU as and ld but you want to use the Solaris ones instead to build perl, then you'll need to add -B/usr/ccs/bin/ to the gcc command line. One convenient way to do that is with

    $ sh Configure -Dcc='gcc -B/usr/ccs/bin/'

Note that the trailing slash is required. This will result in some harmless warnings as Configure is run:

    gcc: file path prefix `/usr/ccs/bin/' never used

These messages may safely be ignored. (Note that for a SunOS4 system, you must use -B/bin/ instead.)

Alternatively, you can use the GCC_EXEC_PREFIX environment variable to ensure that Sun's as and ld are used. Consult your gcc documentation for further information on the -B option and the GCC_EXEC_PREFIX variable.

Sun and GNU make

The make under /usr/ccs/bin works fine for building perl. If you have the Sun C compilers, you will also have a parallel version of make (dmake). This works fine to build perl, but can sometimes cause problems when running 'make test' due to underspecified dependencies between the different test harness files. The same problem can also affect the building of some add-on modules, so in those cases either specify '-m serial' on the dmake command line, or use /usr/ccs/bin/make instead. If you wish to use GNU make, be sure that the set-group-id bit is not set. If it is, then arrange your PATH so that /usr/ccs/bin/make is before GNU make or else have the system administrator disable the set-group-id bit on GNU make.
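Whether a particular make binary has the set-group-id bit set can be checked with the shell's -g file test. The sketch below is a minimal illustration; is_setgid is a hypothetical helper name, and the gmake command name in the usage comment is an assumption about how GNU make is installed on your system.

```shell
# is_setgid FILE: succeed (exit status 0) when FILE has its set-group-id bit set.
is_setgid() {
    [ -g "$1" ]
}

# Typical use:
#   gmake_path=$(command -v gmake) && is_setgid "$gmake_path" &&
#       echo "warning: $gmake_path is set-group-id; use /usr/ccs/bin/make instead"
```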

Avoid libucb.

Solaris provides some BSD-compatibility functions in /usr/ucblib/libucb.a. Perl will not build and run correctly if linked against -lucb since it contains routines that are incompatible with the standard Solaris libc. Normally this is not a problem since the solaris hints file prevents Configure from even looking in /usr/ucblib for libraries, and also explicitly omits -lucb.

Environment for Compiling perl on Solaris

PATH

Make sure your PATH includes the compiler (/opt/SUNWspro/bin/ if you're using Sun's compiler) as well as /usr/ccs/bin/ to pick up the other development tools (such as make, ar, as, and ld). Make sure your path either doesn't include /usr/ucb or that it includes it after the compiler and compiler tools and other standard Solaris directories. You definitely don't want /usr/ucb/cc.
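The ordering rule can be checked mechanically. The following sketch walks a PATH-style string and reports whether /usr/ucb is reached before the compiler directory; the function name ucb_shadows_compiler is an assumption made for this example.

```shell
# ucb_shadows_compiler CCDIR LIST: succeed (exit status 0) when /usr/ucb
# appears in the colon-separated LIST before CCDIR, or when /usr/ucb is
# listed and CCDIR is missing. The body runs in a subshell so that the
# IFS and globbing changes do not leak into the caller.
ucb_shadows_compiler() (
    cc_dir=${1%/}
    IFS=:
    set -f                                   # no filename expansion while splitting
    for entry in $2; do
        entry=${entry%/}                     # ignore a trailing slash
        [ "$entry" = /usr/ucb ] && exit 0    # /usr/ucb found first
        [ "$entry" = "$cc_dir" ] && exit 1   # compiler directory found first
    done
    exit 1                                   # /usr/ucb is not listed at all
)

# Typical use:
#   if ucb_shadows_compiler /opt/SUNWspro/bin "$PATH"; then
#       echo "move /usr/ucb after the compiler directory in PATH"
#   fi
```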

LD_LIBRARY_PATH

If you have the LD_LIBRARY_PATH environment variable set, be sure that it does NOT include /lib or /usr/lib. If you will be building extensions that call third-party shared libraries (e.g. Berkeley DB) then make sure that your LD_LIBRARY_PATH environment variable includes the directory with that library (e.g. /usr/local/lib).

If you get an error message

    dlopen: stub interception failed

it is probably because your LD_LIBRARY_PATH environment variable includes a directory which is a symlink to /usr/lib (such as /lib). The reason this causes a problem is quite subtle. The file libdl.so.1.0 actually *only* contains functions which generate 'stub interception failed' errors! The runtime linker intercepts links to "/usr/lib/libdl.so.1.0" and links in internal implementations of those functions instead. [Thanks to Tim Bunce for this explanation.]
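A quick check for the literal entries /lib and /usr/lib is sketched below. It compares names only and does not resolve symlinks, so another directory that happens to be a symlink to /usr/lib will not be caught; ld_path_has_system_lib is a hypothetical helper name.

```shell
# ld_path_has_system_lib LIST: succeed (exit status 0) when the
# colon-separated LIST contains /lib or /usr/lib, with or without a
# trailing slash. Runs in a subshell so the IFS change stays local.
ld_path_has_system_lib() (
    IFS=:
    set -f                          # no filename expansion while splitting
    for entry in $1; do
        case "${entry%/}" in
            /lib|/usr/lib) exit 0 ;;
        esac
    done
    exit 1
)

# Typical use:
#   if ld_path_has_system_lib "$LD_LIBRARY_PATH"; then
#       echo "remove /lib and /usr/lib from LD_LIBRARY_PATH before building perl"
#   fi
```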

RUN CONFIGURE.

See the INSTALL file for general information regarding Configure. Only Solaris-specific issues are discussed here. Usually, the defaults should be fine.

64-bit perl on Solaris.

See the INSTALL file for general information regarding 64-bit compiles. In general, the defaults should be fine for most people.

By default, perl-5.6.0 (or later) is compiled as a 32-bit application with largefile and long-long support.

General 32-bit vs. 64-bit issues.

Solaris 7 and above will run in either 32 bit or 64 bit mode on SPARC CPUs, via a reboot. You can build 64 bit apps whilst running 32 bit mode and vice-versa. 32 bit apps will run under Solaris running in either 32 or 64 bit mode. 64 bit apps require Solaris to be running 64 bit mode.

Existing 32 bit apps are properly known as LP32, i.e. Longs and Pointers are 32 bit. 64-bit apps are more properly known as LP64. The discriminating feature of an LP64 app is its ability to utilise a 64-bit address space. It is perfectly possible to have an LP32 app that supports both 64-bit integers (long long) and largefiles (> 2GB), and this is the default for perl-5.6.0.

For a more complete explanation of 64-bit issues, see the "Solaris 64-bit Developer's Guide" at http://docs.sun.com/

You can detect the OS mode using "isainfo -v", e.g.

    $ isainfo -v   # Ultra 30 in 64 bit mode
    64-bit sparcv9 applications
    32-bit sparc applications

By default, perl will be compiled as a 32-bit application. Unless you want to allocate more than ~ 4GB of memory inside perl, or unless you need more than 255 open file descriptors, you probably don't need perl to be a 64-bit app.
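In a script, the isainfo output can be tested rather than read by eye. The sketch below assumes the output format shown above, where a line beginning with "64-bit" is present only when the OS is running in 64 bit mode; has_64bit_mode is a made-up helper name.

```shell
# has_64bit_mode: read `isainfo -v` style output on standard input and
# succeed (exit status 0) when a line starting with "64-bit " is present.
has_64bit_mode() {
    grep '^64-bit ' >/dev/null
}

# Typical use:
#   if isainfo -v | has_64bit_mode; then
#       echo "this system can run 64 bit apps"
#   fi
```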

Large File Support

For Solaris 2.6 and onwards, there are two different ways for 32-bit applications to manipulate large files (files whose size is > 2GByte). (A 64-bit application automatically has largefile support built in by default.)

First is the "transitional compilation environment", described in lfcompile64(5). According to the man page,

    The transitional compilation environment exports all the
    explicit 64-bit functions (xxx64()) and types in addition to
    all the regular functions (xxx()) and types. Both xxx() and
    xxx64() functions are available to the program source. A
    32-bit application must use the xxx64() functions in order
    to access large files. See the lf64(5) manual page for a
    complete listing of the 64-bit transitional interfaces.

The transitional compilation environment is obtained with the following compiler and linker flags:

    getconf LFS64_CFLAGS     -D_LARGEFILE64_SOURCE
    getconf LFS64_LDFLAGS    # nothing special needed
    getconf LFS64_LIBS       # nothing special needed

Second is the "large file compilation environment", described in lfcompile(5). According to the man page,

    Each interface named xxx() that needs to access 64-bit entities
    to access large files maps to a xxx64() call in the
    resulting binary. All relevant data types are defined to be
    of correct size (for example, off_t has a typedef definition
    for a 64-bit entity).
    An application compiled in this environment is able to use
    the xxx() source interfaces to access both large and small
    files, rather than having to explicitly utilize the transitional
    xxx64() interface calls to access large files.

Two exceptions are fseek() and ftell(). 32-bit applications should use fseeko(3C) and ftello(3C). These will get automatically mapped to fseeko64() and ftello64().

The large file compilation environment is obtained with

    getconf LFS_CFLAGS     -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
    getconf LFS_LDFLAGS    # nothing special needed
    getconf LFS_LIBS       # nothing special needed

By default, perl uses the large file compilation environment and relies on Solaris to do the underlying mapping of interfaces.

Building an LP64 perl

To compile a 64-bit application on an UltraSparc with a recent Sun Compiler, you need to use the flag "-xarch=v9". getconf(1) will tell you this, e.g.

    $ getconf -a | grep v9
    XBS5_LP64_OFF64_CFLAGS:         -xarch=v9
    XBS5_LP64_OFF64_LDFLAGS:        -xarch=v9
    XBS5_LP64_OFF64_LINTFLAGS:      -xarch=v9
    XBS5_LPBIG_OFFBIG_CFLAGS:       -xarch=v9
    XBS5_LPBIG_OFFBIG_LDFLAGS:      -xarch=v9
    XBS5_LPBIG_OFFBIG_LINTFLAGS:    -xarch=v9
    _XBS5_LP64_OFF64_CFLAGS:        -xarch=v9
    _XBS5_LP64_OFF64_LDFLAGS:       -xarch=v9
    _XBS5_LP64_OFF64_LINTFLAGS:     -xarch=v9
    _XBS5_LPBIG_OFFBIG_CFLAGS:      -xarch=v9
    _XBS5_LPBIG_OFFBIG_LDFLAGS:     -xarch=v9
    _XBS5_LPBIG_OFFBIG_LINTFLAGS:   -xarch=v9

This flag is supported in Sun WorkShop Compilers 5.0 and onwards (now marketed under the name Forte) when used on Solaris 7 or later on UltraSparc systems.

If you are using gcc, you would need to use -mcpu=v9 -m64 instead. This option is not yet supported as of gcc 2.95.2; from install/SPECIFIC in that release:

    GCC version 2.95 is not able to compile code correctly for sparc64
    targets. Users of the Linux kernel, at least, can use the sparc32
    program to start up a new shell invocation with an environment that
    causes configure to recognize (via uname -a) the system as sparc-*-*
    instead.

All this should be handled automatically by the hints file, if requested.
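For build scripts outside Configure, the compiler-specific flags named in this section can be selected with a small lookup. This is only a sketch that restates the two cases given above; lp64_cflags is a hypothetical helper name, and it assumes the Sun compiler is invoked as cc and GNU C as gcc.

```shell
# lp64_cflags COMPILER: print the 64-bit flags this section names for the
# given compiler command. Returns non-zero for compilers it does not know.
lp64_cflags() {
    case "$1" in
        cc|*/cc)   printf '%s\n' '-xarch=v9' ;;       # Sun WorkShop 5.0 or later
        gcc|*/gcc) printf '%s\n' '-mcpu=v9 -m64' ;;   # needs a gcc with sparc64 support
        *)         return 1 ;;
    esac
}

# Typical use:
#   CFLAGS="$CFLAGS $(lp64_cflags /opt/SUNWspro/bin/cc)"
```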

Long Doubles.

As of 5.8.1, long doubles are working if you use the Sun compilers (needed for additional math routines not included in libm).

Threads in perl on Solaris.

It is possible to build a threaded version of perl on Solaris. The entire perl thread implementation is still experimental, however, so beware.

Malloc Issues with perl on Solaris.

Starting from perl 5.7.1 perl uses the Solaris malloc, since the perl malloc breaks when dealing with more than 2GB of memory, and the Solaris malloc also seems to be faster.

If you for some reason (such as binary backward compatibility) really need to use perl's malloc, you can rebuild perl from the sources and Configure the build with

    $ sh Configure -Dusemymalloc

You should not use perl's malloc if you are building with gcc. There are reports of core dumps, especially in the PDL module. The problem appears to go away under -DDEBUGGING, so it has been difficult to track down. Sun's compiler appears to be okay with or without perl's malloc. [XXX further investigation is needed here.]

MAKE PROBLEMS.

  • Dynamic Loading Problems With GNU as and GNU ld

    If you have problems with dynamic loading using gcc on SunOS or Solaris, and you are using GNU as and GNU ld, see the section GNU as and GNU ld above.

  • ld.so.1: ./perl: fatal: relocation error:

    If you get this message on SunOS or Solaris, and you're using gcc, it's probably the GNU as or GNU ld problem in the previous item GNU as and GNU ld.

  • dlopen: stub interception failed

    The primary cause of the 'dlopen: stub interception failed' message is that the LD_LIBRARY_PATH environment variable includes a directory which is a symlink to /usr/lib (such as /lib). See LD_LIBRARY_PATH above.

  • #error "No DATAMODEL_NATIVE specified"

    This is a common error when trying to build perl on Solaris 2.6 with a gcc installation from Solaris 2.5 or 2.5.1. The Solaris header files changed, so you need to update your gcc installation. You can either rerun the fixincludes script from gcc or take the opportunity to update your gcc installation.

  • sh: ar: not found

    This is a message from your shell telling you that the command 'ar' was not found. You need to check your PATH environment variable to make sure that it includes the directory with the 'ar' command. This is a common problem on Solaris, where 'ar' is in the /usr/ccs/bin/ directory.

MAKE TEST

op/stat.t test 4 in Solaris

op/stat.t test 4 may fail if you are on a tmpfs of some sort. Building in /tmp sometimes shows this behavior. The test suite detects if you are building in /tmp, but it may not be able to catch all tmpfs situations.

nss_delete core dump from op/pwent or op/grent

See nss_delete core dump from op/pwent or op/grent in perlhpux.

PREBUILT BINARIES OF PERL FOR SOLARIS.

You can pick up prebuilt binaries for Solaris from http://www.sunfreeware.com/, http://www.blastwave.org, ActiveState http://www.activestate.com/, and http://www.perl.com/ under the Binaries list at the top of the page. There are probably other sources as well. Please note that these sites are under the control of their respective owners, not the perl developers.

RUNTIME ISSUES FOR PERL ON SOLARIS.

Limits on Numbers of Open Files on Solaris.

The stdio(3C) manpage notes that for LP32 applications, only 255 files may be opened using fopen(), and only file descriptors 0 through 255 can be used in a stream. Since perl calls open() and then fdopen(3C) with the resulting file descriptor, perl is limited to 255 simultaneous open files, even if sysopen() is used. If this proves to be an insurmountable problem, you can compile perl as an LP64 application; see Building an LP64 perl for details. Note also that the default resource limit for open file descriptors on Solaris is 255, so you will have to modify your ulimit or rctl (Solaris 9 onwards) appropriately.
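Before starting a program that needs many open files, the current soft limit can be compared with what the program expects. The sketch below is illustrative only: fd_limit_allows is a made-up helper, its optional second argument exists so the comparison can be exercised without changing the real limit, and raising the limit does not lift the stdio restriction on LP32 builds described above.

```shell
# fd_limit_allows WANTED [LIMIT]: succeed (exit status 0) when LIMIT,
# which defaults to the shell's current `ulimit -n` value, is at least
# WANTED or is reported as "unlimited".
fd_limit_allows() {
    fl_wanted=$1
    fl_limit=${2:-$(ulimit -n)}
    [ "$fl_limit" = unlimited ] && return 0
    [ "$fl_limit" -ge "$fl_wanted" ]
}

# Typical use:
#   fd_limit_allows 1024 || ulimit -n 1024
```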

SOLARIS-SPECIFIC MODULES.

See the modules under the Solaris:: and Sun::Solaris:: namespaces on CPAN at http://www.cpan.org/modules/by-module/Solaris/ and http://www.cpan.org/modules/by-module/Sun/.

SOLARIS-SPECIFIC PROBLEMS WITH MODULES.

Proc::ProcessTable on Solaris

Proc::ProcessTable does not compile on Solaris with perl5.6.0 and higher if you have LARGEFILES defined. Since largefile support is the default in 5.6.0 and later, you have to take special steps to use this module.

The problem is that various structures visible via procfs use off_t, and if you compile with largefile support these change from 32 bits to 64 bits. Thus what you get back from procfs doesn't match up with the structures in perl, resulting in garbage. See proc(4) for further discussion.

A fix for Proc::ProcessTable is to edit Makefile to explicitly remove the largefile flags from the ones MakeMaker picks up from Config.pm. This will result in Proc::ProcessTable being built under the correct environment. Everything should then be OK as long as Proc::ProcessTable doesn't try to share off_t's with the rest of perl, or if it does they should be explicitly specified as off64_t.

BSD::Resource on Solaris

BSD::Resource versions earlier than 1.09 do not compile on Solaris with perl 5.6.0 and higher, for the same reasons as Proc::ProcessTable. BSD::Resource versions starting from 1.09 have a workaround for the problem.

Net::SSLeay on Solaris

Net::SSLeay requires a /dev/urandom to be present. This device is available from Solaris 9 onwards. For earlier Solaris versions you can either get the package SUNWski (packaged with several Sun software products, for example the Sun WebServer, which is part of the Solaris Server Intranet Extension, or the Sun Directory Services, part of Solaris for ISPs) or download the ANDIrand package from http://www.cosy.sbg.ac.at/~andi/. If you use SUNWski, make a symbolic link /dev/urandom pointing to /dev/random. For more details, see Document ID27606 entitled "Differing /dev/random support requirements within Solaris[TM] Operating Environments", available at http://sunsolve.sun.com .

It may be possible to use the Entropy Gathering Daemon (written in Perl!), available from http://www.lothar.com/tech/crypto/.

SunOS 4.x

In SunOS 4.x you most probably want to use the SunOS ld, /usr/bin/ld, since the more recent versions of GNU ld (like 2.13) do not seem to work for building Perl anymore. When linking the extensions, the GNU ld gets very unhappy and spews a lot of errors like this

    ... relocation truncated to fit: BASE13 ...

and dies. Therefore the SunOS 4.1 hints file explicitly sets the ld to be /usr/bin/ld.

As of Perl 5.8.1 the dynamic loading of libraries (DynaLoader, XSLoader) also seems to have become broken in SunOS 4.x. Therefore the default is to build Perl statically.

Running the test suite in SunOS 4.1 is a bit tricky since the lib/Tie/File/t/09_gen_rs test hangs (subtest #51, FWIW) for some unknown reason. Just stop the test and kill that particular Perl process.

There are various other failures, that as of SunOS 4.1.4 and gcc 3.2.2 look a lot like gcc bugs. Many of the failures happen in the Encode tests, where for example when the test expects "0" you get "&#48;" which should after a little squinting look very odd indeed. Another example is earlier in t/run/fresh_perl where chr(0xff) is expected but the test fails because the result is chr(0xff). Exactly.

This is the "make test" result from the said combination:

    Failed 27 test scripts out of 745, 96.38% okay.

Running the harness is painful because the many failing Unicode-related tests will output megabytes of failure messages, but if one patiently waits, one gets these results:

    Failed Test                     Stat Wstat Total Fail  Failed  List of Failed
    -----------------------------------------------------------------------------
    ...
    ../ext/Encode/t/at-cn.t            4  1024    29    4  13.79%  14-17
    ../ext/Encode/t/at-tw.t           10  2560    17   10  58.82%  2 4 6 8 10 12
                                                                   14-17
    ../ext/Encode/t/enc_data.t        29  7424    ??   ??       %  ??
    ../ext/Encode/t/enc_eucjp.t       29  7424    ??   ??       %  ??
    ../ext/Encode/t/enc_module.t      29  7424    ??   ??       %  ??
    ../ext/Encode/t/encoding.t        29  7424    ??   ??       %  ??
    ../ext/Encode/t/grow.t            12  3072    24   12  50.00%  2 4 6 8 10 12 14
                                                                   16 18 20 22 24
    Failed Test                     Stat Wstat Total Fail  Failed  List of Failed
    ------------------------------------------------------------------------------
    ../ext/Encode/t/guess.t          255 65280    29   40 137.93%  10-29
    ../ext/Encode/t/jperl.t           29  7424    15   30 200.00%  1-15
    ../ext/Encode/t/mime-header.t      2   512    10    2  20.00%  2-3
    ../ext/Encode/t/perlio.t          22  5632    38   22  57.89%  1-4 9-16 19-20
                                                                   23-24 27-32
    ../ext/List/Util/t/shuffle.t       0   139    ??   ??       %  ??
    ../ext/PerlIO/t/encoding.t                    14    1   7.14%  11
    ../ext/PerlIO/t/fallback.t                     9    2  22.22%  3 5
    ../ext/Socket/t/socketpair.t       0     2    45   70 155.56%  11-45
    ../lib/CPAN/t/vcmp.t                          30    1   3.33%  25
    ../lib/Tie/File/t/09_gen_rs.t      0    15    ??   ??       %  ??
    ../lib/Unicode/Collate/t/test.t              199   30  15.08%  7 26-27 71-75
                                                                   81-88 95 101
                                                                   103-104 106 108-
                                                                   109 122 124 161
                                                                   169-172
    ../lib/sort.t                      0   139   119   26  21.85%  107-119
    op/alarm.t                                     4    1  25.00%  4
    op/utfhash.t                                  97    1   1.03%  31
    run/fresh_perl.t                              91    1   1.10%  32
    uni/tr_7jis.t                                 ??   ??       %  ??
    uni/tr_eucjp.t                    29  7424     6   12 200.00%  1-6
    uni/tr_sjis.t                     29  7424     6   12 200.00%  1-6
    56 tests and 467 subtests skipped.
    Failed 27/811 test scripts, 96.67% okay. 1383/75399 subtests failed, 98.17% okay.

The alarm() test failure is caused by system() apparently blocking alarm(). That is probably a libc bug, and given that SunOS 4.x was end-of-lifed years ago, don't hold your breath for a fix. In addition to that, don't try anything too Unicode-y, especially with Encode, and you should be fine in SunOS 4.x.

AUTHOR

The original was written by Andy Dougherty doughera@lafayette.edu drawing heavily on advice from Alan Burlison, Nick Ing-Simmons, Tim Bunce, and many other Solaris users over the years.

Please report any errors, updates, or suggestions to perlbug@perl.org.

 
perlsource

NAME

perlsource - A guide to the Perl source tree

DESCRIPTION

This document describes the layout of the Perl source tree. If you're hacking on the Perl core, this will help you find what you're looking for.

FINDING YOUR WAY AROUND

The Perl source tree is big. Here are some of the things you'll find in it:

C code

The C source code and header files mostly live in the root of the source tree. There are a few platform-specific directories which contain C code. In addition, some of the modules shipped with Perl include C or XS code.

See perlinterp for more details on the files that make up the Perl interpreter, as well as details on how it works.

Core modules

Modules shipped as part of the Perl core live in four subdirectories. Two of these directories contain modules that live in the core, and two contain modules that can also be released separately on CPAN. Modules which can be released on CPAN are known as "dual-life" modules.

  • lib/

    This directory contains pure-Perl modules which are only released as part of the core. This directory contains all of the modules and their tests, unlike other core modules.

  • ext/

    This directory contains XS-using modules which are only released as part of the core. These modules generally have their own Makefile.PL and are laid out more like a typical CPAN module.

  • dist/

    This directory is for dual-life modules where the blead source is canonical. Note that some modules in this directory may not yet have been released separately on CPAN.

  • cpan/

    This directory contains dual-life modules where the CPAN module is canonical. Do not patch these modules directly! Changes to these modules should be submitted to the maintainer of the CPAN module. Once those changes are applied and released, the new version of the module will be incorporated into the core.

For some dual-life modules, it has not yet been determined if the CPAN version or the blead source is canonical. Until that is done, those modules should be in cpan/.

Tests

The Perl core has an extensive test suite. If you add new tests (or new modules with tests), you may need to update the t/TEST file so that the tests are run.

  • Module tests

    Tests for core modules in the lib/ directory are right next to the module itself. For example, we have lib/strict.pm and lib/strict.t.

    Tests for modules in ext/ and the dual-life modules are in t/ subdirectories for each module, like a standard CPAN distribution.

  • t/base/

    Tests for the absolute basic functionality of Perl. This includes if, basic file reads and writes, simple regexes, etc. These are run first in the test suite and if any of them fail, something is really broken.

  • t/cmd/

    Tests for basic control structures, if/else, while, subroutines, etc.

  • t/comp/

    Tests for basic issues of how Perl parses and compiles itself.

  • t/io/

    Tests for built-in IO functions, including command line arguments.

  • t/mro/

    Tests for perl's method resolution order implementations (see mro).

  • t/op/

    Tests for perl's built in functions that don't fit into any of the other directories.

  • t/opbasic/

    Tests for perl's built in functions which, like those in t/op/, do not fit into any of the other directories, but which, in addition, cannot use t/test.pl, as that program depends on functionality which the test file itself is testing.

  • t/re/

    Tests for regex related functions or behaviour. (These used to live in t/op).

  • t/run/

    Tests for features of how perl actually runs, including exit codes and handling of PERL* environment variables.

  • t/uni/

    Tests for the core support of Unicode.

  • t/win32/

    Windows-specific tests.

  • t/porting/

    Tests the state of the source tree for various common errors. For example, it tests that everyone who is listed in the git log has a corresponding entry in the AUTHORS file.

  • t/lib/

    The old home for the module tests, you shouldn't put anything new in here. There are still some bits and pieces hanging around in here that need to be moved. Perhaps you could move them? Thanks!

  • t/x2p

    A test suite for the s2p converter.

Documentation

All of the core documentation intended for end users lives in pod/. Individual modules in lib/, ext/, dist/, and cpan/ usually have their own documentation, either in the Module.pm file or an accompanying Module.pod file.

Finally, documentation intended for core Perl developers lives in the Porting/ directory.

Hacking tools and documentation

The Porting directory contains a grab bag of code and documentation intended to help porters work on Perl. Some of the highlights include:

  • check*

    These are scripts which will check the source for things like ANSI C violations, POD encoding issues, etc.

  • Maintainers, Maintainers.pl, and Maintainers.pm

    These files contain information on who maintains which modules. Run perl Porting/Maintainers -M Module::Name to find out more information about a dual-life module.

  • podtidy

    Tidies a pod file. It's a good idea to run this on a pod file you've patched.

Build system

The Perl build system starts with the Configure script in the root directory.

Platform-specific pieces of the build system also live in platform-specific directories like win32/, vms/, etc.

The Configure script is ultimately responsible for generating a Makefile.

The build system that Perl uses is called metaconfig. This system is maintained separately from the Perl core.

The metaconfig system has its own git repository. Please see its README file in http://perl5.git.perl.org/metaconfig.git/ for more details.

The Cross directory contains various files related to cross-compiling Perl. See Cross/README for more details.

AUTHORS

This file lists everyone who's contributed to Perl. If you submit a patch, you should add your name to this file as part of the patch.

MANIFEST

The MANIFEST file in the root of the source tree contains a list of every file in the Perl core, as well as a brief description of each file.

You can get an overview of the C source and header files in the root of the tree with this command:

  % perl -lne 'print if /^[^\/]+\.[ch]\s+/' MANIFEST
 

perlstyle

NAME

perlstyle - Perl style guide

DESCRIPTION

Each programmer will, of course, have his or her own preferences with regard to formatting, but there are some general guidelines that will make your programs easier to read, understand, and maintain.

The most important thing is to run your programs under the -w flag at all times. You may turn it off explicitly for particular portions of code via the no warnings pragma or the $^W variable if you must. You should also always run under use strict or know the reason why not. The use sigtrap and even use diagnostics pragmas may also prove useful.

Regarding aesthetics of code layout, about the only thing Larry cares strongly about is that the closing curly bracket of a multi-line BLOCK should line up with the keyword that started the construct. Beyond that, he has other preferences that aren't so strong:

  • 4-column indent.

  • Opening curly on same line as keyword, if possible, otherwise line up.

  • Space before the opening curly of a multi-line BLOCK.

  • One-line BLOCK may be put on one line, including curlies.

  • No space before the semicolon.

  • Semicolon omitted in "short" one-line BLOCK.

  • Space around most operators.

  • Space around a "complex" subscript (inside brackets).

  • Blank lines between chunks that do different things.

  • Uncuddled elses.

  • No space between function name and its opening parenthesis.

  • Space after each comma.

  • Long lines broken after an operator (except "and" and "or").

  • Space after last parenthesis matching on current line.

  • Line up corresponding items vertically.

  • Omit redundant punctuation as long as clarity doesn't suffer.

Larry has his reasons for each of these things, but he doesn't claim that everyone else's mind works the same as his does.

Here are some other more substantive style issues to think about:

  • Just because you CAN do something a particular way doesn't mean that you SHOULD do it that way. Perl is designed to give you several ways to do anything, so consider picking the most readable one. For instance

    open(FOO,$foo) || die "Can't open $foo: $!";

    is better than

    die "Can't open $foo: $!" unless open(FOO,$foo);

    because the second way hides the main point of the statement in a modifier. On the other hand

    print "Starting analysis\n" if $verbose;

    is better than

    $verbose && print "Starting analysis\n";

    because the main point isn't whether the user typed -v or not.

    Similarly, just because an operator lets you assume default arguments doesn't mean that you have to make use of the defaults. The defaults are there for lazy systems programmers writing one-shot programs. If you want your program to be readable, consider supplying the argument.

    Along the same lines, just because you CAN omit parentheses in many places doesn't mean that you ought to:

    return print reverse sort num values %array;
    return print(reverse(sort num (values(%array))));

    When in doubt, parenthesize. At the very least it will let some poor schmuck bounce on the % key in vi.

    Even if you aren't in doubt, consider the mental welfare of the person who has to maintain the code after you, and who will probably put parentheses in the wrong place.

  • Don't go through silly contortions to exit a loop at the top or the bottom, when Perl provides the last operator so you can exit in the middle. Just "outdent" it a little to make it more visible:

    LINE:
        for (;;) {
            statements;
          last LINE if $foo;
            next LINE if /^#/;
            statements;
        }

  • Don't be afraid to use loop labels--they're there to enhance readability as well as to allow multilevel loop breaks. See the previous example.

  • Avoid using grep() (or map()) or `backticks` in a void context, that is, when you just throw away their return values. Those functions all have return values, so use them. Otherwise use a foreach() loop or the system() function instead.

  • For portability, when using features that may not be implemented on every machine, test the construct in an eval to see if it fails. If you know the version or patchlevel in which a particular feature was implemented, you can test $] ($PERL_VERSION in English) to see if it will be there. The Config module will also let you interrogate values determined by the Configure program when Perl was installed.

  • Choose mnemonic identifiers. If you can't remember what mnemonic means, you've got a problem.

  • While short identifiers like $gotit are probably ok, use underscores to separate words in longer identifiers. It is generally easier to read $var_names_like_this than $VarNamesLikeThis, especially for non-native speakers of English. It's also a simple rule that works consistently with VAR_NAMES_LIKE_THIS.

    Package names are sometimes an exception to this rule. Perl informally reserves lowercase module names for "pragma" modules like integer and strict . Other modules should begin with a capital letter and use mixed case, but probably without underscores due to limitations in primitive file systems' representations of module names as files that must fit into a few sparse bytes.

  • You may find it helpful to use letter case to indicate the scope or nature of a variable. For example:

    $ALL_CAPS_HERE   constants only (beware clashes with perl vars!)
    $Some_Caps_Here  package-wide global/static
    $no_caps_here    function scope my() or local() variables

    Function and method names seem to work best as all lowercase. E.g., $obj->as_string() .

    You can use a leading underscore to indicate that a variable or function should not be used outside the package that defined it.

  • If you have a really hairy regular expression, use the /x modifier and put in some whitespace to make it look a little less like line noise. Don't use slash as a delimiter when your regexp has slashes or backslashes.

  • Use the new and and or operators to avoid having to parenthesize list operators so much, and to reduce the incidence of punctuation operators like && and ||. Call your subroutines as if they were functions or list operators to avoid excessive ampersands and parentheses.

  • Use here documents instead of repeated print() statements.

  • Line up corresponding things vertically, especially if it'd be too long to fit on one line anyway.

    $IDX = $ST_MTIME;
    $IDX = $ST_ATIME       if $opt_u;
    $IDX = $ST_CTIME       if $opt_c;
    $IDX = $ST_SIZE        if $opt_s;

    mkdir $tmpdir, 0700 or die "can't mkdir $tmpdir: $!";
    chdir($tmpdir)      or die "can't chdir $tmpdir: $!";
    mkdir 'tmp',   0777 or die "can't mkdir $tmpdir/tmp: $!";

  • Always check the return codes of system calls. Good error messages should go to STDERR, include which program caused the problem, what the failed system call and arguments were, and (VERY IMPORTANT) should contain the standard system error message for what went wrong. Here's a simple but sufficient example:

    opendir(D, $dir) or die "can't opendir $dir: $!";

  • Line up your transliterations when it makes sense:

    tr [abc]
       [xyz];

  • Think about reusability. Why waste brainpower on a one-shot when you might want to do something like it again? Consider generalizing your code. Consider writing a module or object class. Consider making your code run cleanly with use strict and use warnings (or -w) in effect. Consider giving away your code. Consider changing your whole world view. Consider... oh, never mind.

  • Try to document your code and use Pod formatting in a consistent way. Here are commonly expected conventions:

    • use C<> for function, variable and module names (and more generally anything that can be considered part of code, like filehandles or specific values). Note that function names are considered more readable with parentheses after their name, that is function() .

    • use B<> for command names like cat or grep.

    • use F<> or C<> for file names. F<> should be the only Pod code for file names, but as most Pod formatters render it as italic, Unix and Windows paths with their slashes and backslashes may be less readable, and better rendered with C<> .

  • Be consistent.

  • Be nice.
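
Two of the guidelines above (the /x modifier for hairy patterns, and a here document in place of repeated print() statements) can be sketched together. The input line and its format are assumptions made up for this illustration, not anything perlstyle prescribes.

```perl
use strict;
use warnings;

# Hypothetical input line; the format is an assumption for illustration.
my $line = "2014-02-10 ERROR disk full";

# /x lets the pattern carry whitespace and comments.
my ($date, $level, $message) = $line =~ m{
    ^ (\d{4}-\d{2}-\d{2})   # ISO-style date
    \s+ (\w+)               # severity level
    \s+ (.*) \z             # the rest is the message
}x;

# One here document instead of several print() statements.
my $report = <<"END_REPORT";
Date:    $date
Level:   $level
Message: $message
END_REPORT

print $report;
```

Using braces as the match delimiter also follows the advice about avoiding slash delimiters when a pattern is full of backslashes.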


perlsub

NAME

perlsub - Perl subroutines

SYNOPSIS

To declare subroutines:

  sub NAME;                       # A "forward" declaration.
  sub NAME(PROTO);                #  ditto, but with prototypes
  sub NAME : ATTRS;               #  with attributes
  sub NAME(PROTO) : ATTRS;        #  with attributes and prototypes
  sub NAME BLOCK                  # A declaration and a definition.
  sub NAME(PROTO) BLOCK           #  ditto, but with prototypes
  sub NAME : ATTRS BLOCK          #  with attributes
  sub NAME(PROTO) : ATTRS BLOCK   #  with prototypes and attributes

To define an anonymous subroutine at runtime:

  $subref = sub BLOCK;                 # no proto
  $subref = sub (PROTO) BLOCK;         # with proto
  $subref = sub : ATTRS BLOCK;         # with attributes
  $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes

To import subroutines:

  use MODULE qw(NAME1 NAME2 NAME3);

To call subroutines:

  NAME(LIST);    # & is optional with parentheses.
  NAME LIST;     # Parentheses optional if predeclared/imported.
  &NAME(LIST);   # Circumvent prototypes.
  &NAME;         # Makes current @_ visible to called subroutine.

DESCRIPTION

Like many languages, Perl provides for user-defined subroutines. These may be located anywhere in the main program, loaded in from other files via the do, require, or use keywords, or generated on the fly using eval or anonymous subroutines. You can even call a function indirectly using a variable containing its name or a CODE reference.

The Perl model for function call and return values is simple: all functions are passed as parameters one single flat list of scalars, and all functions likewise return to their caller one single flat list of scalars. Any arrays or hashes in these call and return lists will collapse, losing their identities--but you may always use pass-by-reference instead to avoid this. Both call and return lists may contain as many or as few scalar elements as you'd like. (Often a function without an explicit return statement is called a subroutine, but there's really no difference from Perl's perspective.)

Any arguments passed in show up in the array @_ . Therefore, if you called a function with two arguments, those would be stored in $_[0] and $_[1] . The array @_ is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element $_[0] is updated, the corresponding argument is updated (or an error occurs if it is not updatable). If an argument is an array or hash element which did not exist when the function was called, that element is created only when (and if) it is modified or a reference to it is taken. (Some earlier versions of Perl created the element whether or not the element was assigned to.) Assigning to the whole array @_ removes that aliasing, and does not update any arguments.
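
The aliasing described above can be seen in a minimal sketch. The subroutine names double_in_place() and reassign() are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# double_in_place() is a hypothetical name used only for this sketch.
sub double_in_place {
    $_ *= 2 for @_;     # each element of @_ aliases the caller's scalar
}

my ($x, $y) = (1, 2);
double_in_place($x, $y);
print "$x $y\n";        # $x and $y were changed through the aliases

# reassign() is likewise hypothetical.
sub reassign {
    @_ = (100, 200);    # assigning to the whole array removes the aliasing
}

reassign($x, $y);
print "$x $y\n";        # unchanged by reassign()
```

Both print statements show the doubled values, because only the element-wise update in double_in_place() reaches the caller's variables.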

A return statement may be used to exit a subroutine, optionally specifying the returned value, which will be evaluated in the appropriate context (list, scalar, or void) depending on the context of the subroutine call. If you specify no return value, the subroutine returns an empty list in list context, the undefined value in scalar context, or nothing in void context. If you return one or more aggregates (arrays and hashes), these will be flattened together into one large indistinguishable list.

If no return is found and if the last statement is an expression, its value is returned. If the last statement is a loop control structure like a foreach or a while , the returned value is unspecified. The empty sub returns the empty list.
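
As a rough illustration of the return rules above, the following sketch calls a bare return in list and scalar context and then returns two arrays at once. The names bare_return() and two_arrays() are made up for the example.

```perl
use strict;
use warnings;

# bare_return() is a hypothetical name for this sketch.
sub bare_return { return }

my @list   = bare_return();   # list context: empty list
my $scalar = bare_return();   # scalar context: undefined value

printf "list has %d elements\n", scalar @list;
print  "scalar is ", (defined $scalar ? "defined" : "undefined"), "\n";

# Returning two arrays flattens them into one list.
sub two_arrays {
    my @a = (1, 2);
    my @b = (3, 4);
    return (@a, @b);
}

my @flat = two_arrays();
print "@flat\n";
```

The caller of two_arrays() has no way to tell where one array ended and the other began.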

Perl does not have named formal parameters. In practice all you do is assign to a my() list of these. Variables that aren't declared to be private are global variables. For gory details on creating private variables, see Private Variables via my() and Temporary Values via local(). To create protected environments for a set of functions in a separate package (and probably a separate file), see Packages in perlmod.

Example:

  sub max {
      my $max = shift(@_);
      foreach $foo (@_) {
          $max = $foo if $max < $foo;
      }
      return $max;
  }
  $bestday = max($mon,$tue,$wed,$thu,$fri);

Example:

  # get a line, combining continuation lines
  #  that start with whitespace

  sub get_line {
      $thisline = $lookahead;  # global variables!
      LINE: while (defined($lookahead = <STDIN>)) {
          if ($lookahead =~ /^[ \t]/) {
              $thisline .= $lookahead;
          }
          else {
              last LINE;
          }
      }
      return $thisline;
  }

  $lookahead = <STDIN>;       # get first line
  while (defined($line = get_line())) {
      ...
  }

Assigning to a list of private variables to name your arguments:

  sub maybeset {
      my($key, $value) = @_;
      $Foo{$key} = $value unless $Foo{$key};
  }

Because the assignment copies the values, this also has the effect of turning call-by-reference into call-by-value. Otherwise a function is free to do in-place modifications of @_ and change its caller's values.

  upcase_in($v1, $v2);  # this changes $v1 and $v2
  sub upcase_in {
      for (@_) { tr/a-z/A-Z/ }
  }

You aren't allowed to modify constants in this way, of course. If an argument were actually literal and you tried to change it, you'd take a (presumably fatal) exception. For example, this won't work:

  upcase_in("frederick");

It would be much safer if the upcase_in() function were written to return a copy of its parameters instead of changing them in place:

  ($v3, $v4) = upcase($v1, $v2);  # this doesn't change $v1 and $v2
  sub upcase {
      return unless defined wantarray;  # void context, do nothing
      my @parms = @_;
      for (@parms) { tr/a-z/A-Z/ }
      return wantarray ? @parms : $parms[0];
  }

Notice how this (unprototyped) function doesn't care whether it was passed real scalars or arrays. Perl sees all arguments as one big, long, flat parameter list in @_ . This is one area where Perl's simple argument-passing style shines. The upcase() function would work perfectly well without changing the upcase() definition even if we fed it things like this:

  @newlist = upcase(@list1, @list2);
  @newlist = upcase( split /:/, $var );

Do not, however, be tempted to do this:

  (@a, @b) = upcase(@list1, @list2);

Like the flattened incoming parameter list, the return list is also flattened on return. So all you have managed to do here is store everything in @a and make @b empty. See Pass by Reference for alternatives.
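
One possible alternative, sketched here under assumptions, is to pass and return array references so that each list keeps its identity. The name upcase_lists() is hypothetical, and the sketch uses uc where the text above uses tr/a-z/A-Z/.

```perl
use strict;
use warnings;

# upcase_lists() is a hypothetical variant that takes and returns
# array references, so the two lists are not merged.
sub upcase_lists {
    my @results;
    for my $aref (@_) {
        push @results, [ map { uc } @$aref ];   # new array per input list
    }
    return @results;
}

my @list1 = qw(a b);
my @list2 = qw(c d e);
my ($first, $second) = upcase_lists(\@list1, \@list2);

print "@$first\n";
print "@$second\n";
```

Each returned reference points to a fresh array, so @list1 and @list2 are left untouched.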

A subroutine may be called using an explicit & prefix. The & is optional in modern Perl, as are parentheses if the subroutine has been predeclared. The & is not optional when just naming the subroutine, such as when it's used as an argument to defined() or undef(). Nor is it optional when you want to do an indirect subroutine call with a subroutine name or reference using the &$subref() or &{$subref}() constructs, although the $subref->() notation solves that problem. See perlref for more about all that.

Subroutines may be called recursively. If a subroutine is called using the & form, the argument list is optional, and if omitted, no @_ array is set up for the subroutine: the @_ array at the time of the call is visible to the subroutine instead. This is an efficiency mechanism that new users may wish to avoid.

  &foo(1,2,3);        # pass three arguments
  foo(1,2,3);         # the same

  foo();              # pass a null list
  &foo();             # the same

  &foo;               # foo() gets current args, like foo(@_) !!
  foo;                # like foo() IFF sub foo predeclared, else "foo"

Not only does the & form make the argument list optional, it also disables any prototype checking on arguments you do provide. This is partly for historical reasons, and partly for having a convenient way to cheat if you know what you're doing. See Prototypes below.
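
The difference between &foo; and foo() can be checked with a small sketch. All three subroutine names below are hypothetical.

```perl
use strict;
use warnings;

# show_args(), outer_shared() and outer_empty() are hypothetical names.
sub show_args { return scalar @_ }       # number of arguments seen

sub outer_shared { return &show_args }   # no parens: callee sees our @_
sub outer_empty  { return show_args() }  # explicit empty argument list

my $shared = outer_shared(1, 2, 3);
my $empty  = outer_empty(1, 2, 3);
print "shared: $shared, empty: $empty\n";
```

Here outer_shared() reports three arguments, because show_args() sees the caller's @_, while outer_empty() reports none.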

Since Perl 5.16.0, the __SUB__ token is available under use feature 'current_sub' and use 5.16.0 . It will evaluate to a reference to the currently-running sub, which allows for recursive calls without knowing your subroutine's name.

  use 5.16.0;
  my $factorial = sub {
      my ($x) = @_;
      return 1 if $x == 1;
      return($x * __SUB__->( $x - 1 ) );
  };

The behaviour of __SUB__ within a regex code block (such as /(?{...})/ ) is subject to change.

Subroutines whose names are in all upper case are reserved to the Perl core, as are modules whose names are in all lower case. A subroutine in all capitals is a loosely-held convention meaning it will be called indirectly by the run-time system itself, usually due to a triggered event. Subroutines that do special, pre-defined things include AUTOLOAD , CLONE , DESTROY plus all functions mentioned in perltie and PerlIO::via.

The BEGIN, UNITCHECK, CHECK, INIT and END subroutines are not so much subroutines as named special code blocks, of which you can have more than one in a package, and which you cannot call explicitly. See BEGIN, UNITCHECK, CHECK, INIT and END in perlmod.

Private Variables via my()

Synopsis:

  my $foo;            # declare $foo lexically local
  my (@wid, %get);    # declare list of variables local
  my $foo = "flurp";  # declare $foo lexical, and init it
  my @oof = @bar;     # declare @oof lexical, and init it
  my $x : Foo = $y;   # similar, with an attribute applied

WARNING: The use of attribute lists on my declarations is still evolving. The current semantics and interface are subject to change. See attributes and Attribute::Handlers.

The my operator declares the listed variables to be lexically confined to the enclosing block, conditional (if/unless/elsif/else), loop (for/foreach/while/until/continue), subroutine, eval, or do/require/use'd file. If more than one value is listed, the list must be placed in parentheses. All listed elements must be legal lvalues. Only alphanumeric identifiers may be lexically scoped--magical built-ins like $/ must currently be localized with local instead.

Unlike dynamic variables created by the local operator, lexical variables declared with my are totally hidden from the outside world, including any called subroutines. This is true if it's the same subroutine called from itself or elsewhere--every call gets its own copy.

This doesn't mean that a my variable declared in a statically enclosing lexical scope would be invisible. Only dynamic scopes are cut off. For example, the bumpx() function below has access to the lexical $x variable because both the my and the sub occurred at the same scope, presumably file scope.

  my $x = 10;
  sub bumpx { $x++ }

An eval(), however, can see lexical variables of the scope it is being evaluated in, so long as the names aren't hidden by declarations within the eval() itself. See perlref.
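
A minimal sketch of that behaviour, assuming a string eval run at file scope:

```perl
use strict;
use warnings;

my $x = 10;

# The string is compiled at run time, in the scope of this statement,
# so it can see the lexical $x declared above.
my $seen = eval '$x + 1';
die $@ if $@;

# A declaration inside the eval hides the outer name for the eval only.
my $hidden = eval 'my $x = 1; $x + 1';
die $@ if $@;

print "$seen $hidden $x\n";
```

The second eval works on its own $x, so the outer $x keeps its value afterwards.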

The parameter list to my() may be assigned to if desired, which allows you to initialize your variables. (If no initializer is given for a particular variable, it is created with the undefined value.) Commonly this is used to name input parameters to a subroutine. Examples:

  $arg = "fred";          # "global" variable
  $n = cube_root(27);
  print "$arg thinks the root is $n\n";
  # prints: fred thinks the root is 3

  sub cube_root {
      my $arg = shift;    # name doesn't matter
      $arg **= 1/3;
      return $arg;
  }

The my is simply a modifier on something you might assign to. So when you do assign to variables in its argument list, my doesn't change whether those variables are viewed as a scalar or an array. So

  my ($foo) = <STDIN>;                # WRONG?
  my @FOO = <STDIN>;

both supply a list context to the right-hand side, while

  my $foo = <STDIN>;

supplies a scalar context. But the following declares only one variable:

  my $foo, $bar = 1;                  # WRONG

That has the same effect as

  my $foo;
  $bar = 1;

The declared variable is not introduced (is not visible) until after the current statement. Thus,

  my $x = $x;

can be used to initialize a new $x with the value of the old $x, and the expression

  my $x = 123 and $x == 123

is false unless the old $x happened to have the value 123 .

Lexical scopes of control structures are not bounded precisely by the braces that delimit their controlled blocks; control expressions are part of that scope, too. Thus in the loop

  while (my $line = <>) {
      $line = lc $line;
  } continue {
      print $line;
  }

the scope of $line extends from its declaration throughout the rest of the loop construct (including the continue clause), but not beyond it. Similarly, in the conditional

  if ((my $answer = <STDIN>) =~ /^yes$/i) {
      user_agrees();
  } elsif ($answer =~ /^no$/i) {
      user_disagrees();
  } else {
      chomp $answer;
      die "'$answer' is neither 'yes' nor 'no'";
  }

the scope of $answer extends from its declaration through the rest of that conditional, including any elsif and else clauses, but not beyond it. See Simple Statements in perlsyn for information on the scope of variables in statements with modifiers.

The foreach loop defaults to scoping its index variable dynamically in the manner of local. However, if the index variable is prefixed with the keyword my, or if there is already a lexical by that name in scope, then a new lexical is created instead. Thus in the loop

  for my $i (1, 2, 3) {
      some_function();
  }

the scope of $i extends to the end of the loop, but not beyond it, rendering the value of $i inaccessible within some_function() .
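
To make the point concrete, here is a sketch in which some_function() reads a package variable that happens to share the loop variable's name. The package variable $main::i and its value are assumptions added for the illustration.

```perl
use strict;
use warnings;

# A package variable, fully qualified so it is legal under strict.
$main::i = "package value";

sub some_function { return $main::i }   # reads the package variable

my @seen;
for my $i (1, 2, 3) {
    # The loop's lexical $i is not visible inside some_function().
    push @seen, some_function();
}
print "@seen\n";
```

On every iteration some_function() returns the package variable's value, never 1, 2 or 3.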

Some users may wish to encourage the use of lexically scoped variables. As an aid to catching implicit uses to package variables, which are always global, if you say

  use strict 'vars';

then any variable mentioned from there to the end of the enclosing block must either refer to a lexical variable, be predeclared via our or use vars , or else must be fully qualified with the package name. A compilation error results otherwise. An inner block may countermand this with no strict 'vars' .

A my has both a compile-time and a run-time effect. At compile time, the compiler takes notice of it. The principal usefulness of this is to quiet use strict 'vars' , but it is also essential for generation of closures as detailed in perlref. Actual initialization is delayed until run time, though, so it gets executed at the appropriate time, such as each time through a loop, for example.
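
The run-time half of that behaviour matters for closures: each pass through a loop body creates a fresh variable. A minimal sketch, with variable names chosen only for the example:

```perl
use strict;
use warnings;

my @subs;
for my $n (1 .. 3) {
    my $square = $n * $n;               # initialized afresh on every iteration
    push @subs, sub { return $square }; # each closure captures its own $square
}

print join(" ", map { $_->() } @subs), "\n";
```

Because every closure holds a different $square, the three subroutines return three different values.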

Variables declared with my are not part of any package and are therefore never fully qualified with the package name. In particular, you're not allowed to try to make a package variable (or other global) lexical:

  my $pack::var;              # ERROR!  Illegal syntax

In fact, dynamic variables (also known as package or global variables) are still accessible using the fully qualified :: notation even while a lexical of the same name is also visible:

  package main;
  local $x = 10;
  my    $x = 20;
  print "$x and $::x\n";

That will print out 20 and 10 .

You may declare my variables at the outermost scope of a file to hide any such identifiers from the world outside that file. This is similar in spirit to C's static variables when they are used at the file level. To do this with a subroutine requires the use of a closure (an anonymous function that accesses enclosing lexicals). If you want to create a private subroutine that cannot be called from outside that block, you can declare a lexical variable containing an anonymous sub reference:

  my $secret_version = '1.001-beta';
  my $secret_sub = sub { print $secret_version };
  &$secret_sub();

As long as the reference is never returned by any function within the module, no outside module can see the subroutine, because its name is not in any package's symbol table. Remember that it's not REALLY called $some_pack::secret_version or anything; it's just $secret_version, unqualified and unqualifiable.

This does not work with object methods, however; all object methods have to be in the symbol table of some package to be found. See Function Templates in perlref for something of a work-around to this.

Persistent Private Variables

There are two ways to build persistent private variables in Perl 5.10. First, you can simply use the state feature. Or, you can use closures, if you want to stay compatible with releases older than 5.10.

Persistent variables via state()

Beginning with Perl 5.10.0, you can declare variables with the state keyword in place of my. For that to work, though, you must have enabled that feature beforehand, either by using the feature pragma, or by using -E on one-liners (see feature). Beginning with Perl 5.16, the CORE::state form does not require the feature pragma.

The state keyword creates a lexical variable (following the same scoping rules as my) that persists from one subroutine call to the next. If a state variable resides inside an anonymous subroutine, then each copy of the subroutine has its own copy of the state variable. However, the value of the state variable will still persist between calls to the same copy of the anonymous subroutine. (Don't forget that sub { ... } creates a new subroutine each time it is executed.)

For example, the following code maintains a private counter, incremented each time the gimme_another() function is called:

  use feature 'state';
  sub gimme_another { state $x; return ++$x }

And this example uses anonymous subroutines to create separate counters:

  use feature 'state';
  sub create_counter {
      return sub { state $x; return ++$x }
  }

Also, since $x is lexical, it can't be reached or modified by any Perl code outside.

When combined with variable declaration, simple scalar assignment to state variables (as in state $x = 42 ) is executed only the first time. When such statements are evaluated subsequent times, the assignment is ignored. The behavior of this sort of assignment to non-scalar variables is undefined.
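
A short sketch of both points, assuming Perl 5.10 or later: the initializer runs only once per copy of the anonymous subroutine, and each copy keeps its own counter. The starting value of 10 is arbitrary.

```perl
use strict;
use warnings;
use feature 'state';

sub create_counter {
    # The assignment to $x happens only on the first call of each copy.
    return sub { state $x = 10; return ++$x };
}

my $first  = create_counter();
my $second = create_counter();

# Two calls to the first counter, then one call to the second.
my @values = ($first->(), $first->(), $second->());
print "@values\n";
```

The second counter starts again from the initial value, since it is a separate copy with a separate state variable.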

Persistent variables with closures

Just because a lexical variable is lexically (also called statically) scoped to its enclosing block, eval, or do FILE, this doesn't mean that within a function it works like a C static. It normally works more like a C auto, but with implicit garbage collection.

Unlike local variables in C or C++, Perl's lexical variables don't necessarily get recycled just because their scope has exited. If something more permanent is still aware of the lexical, it will stick around. So long as something else references a lexical, that lexical won't be freed--which is as it should be. You wouldn't want memory being freed until you were done using it, or kept around once you were done. Automatic garbage collection takes care of this for you.

This means that you can pass back or save away references to lexical variables, whereas to return a pointer to a C auto is a grave error. It also gives us a way to simulate C's function statics. Here's a mechanism for giving a function private variables with both lexical scoping and a static lifetime. If you do want to create something like C's static variables, just enclose the whole function in an extra block, and put the static variable outside the function but in the block.

  {
      my $secret_val = 0;
      sub gimme_another {
          return ++$secret_val;
      }
  }
  # $secret_val now becomes unreachable by the outside
  # world, but retains its value between calls to gimme_another

If this function is being sourced in from a separate file via require or use, then this is probably just fine. If it's all in the main program, you'll need to arrange for the my to be executed early, either by putting the whole block above your main program, or more likely, placing merely a BEGIN code block around it to make sure it gets executed before your program starts to run:

  BEGIN {
      my $secret_val = 0;
      sub gimme_another {
          return ++$secret_val;
      }
  }

See BEGIN, UNITCHECK, CHECK, INIT and END in perlmod about the special triggered code blocks, BEGIN , UNITCHECK , CHECK , INIT and END .

If declared at the outermost scope (the file scope), then lexicals work somewhat like C's file statics. They are available to all functions in that same file declared below them, but are inaccessible from outside that file. This strategy is sometimes used in modules to create private variables that the whole module can see.

Temporary Values via local()

WARNING: In general, you should be using my instead of local, because it's faster and safer. Exceptions to this include the global punctuation variables, global filehandles and formats, and direct manipulation of the Perl symbol table itself. local is mostly used when the current value of a variable must be visible to called subroutines.

Synopsis:

  # localization of values

  local $foo;                # make $foo dynamically local
  local (@wid, %get);        # make list of variables local
  local $foo = "flurp";      # make $foo dynamic, and init it
  local @oof = @bar;         # make @oof dynamic, and init it

  local $hash{key} = "val";  # sets a local value for this hash entry
  delete local $hash{key};   # delete this entry for the current block
  local ($cond ? $v1 : $v2); # several types of lvalues support
                             # localization

  # localization of symbols

  local *FH;                 # localize $FH, @FH, %FH, &FH  ...
  local *merlyn = *randal;   # now $merlyn is really $randal, plus
                             #     @merlyn is really @randal, etc
  local *merlyn = 'randal';  # SAME THING: promote 'randal' to *randal
  local *merlyn = \$randal;  # just alias $merlyn, not @merlyn etc

A local modifies its listed variables to be "local" to the enclosing block, eval, or do FILE --and to any subroutine called from within that block. A local just gives temporary values to global (meaning package) variables. It does not create a local variable. This is known as dynamic scoping. Lexical scoping is done with my, which works more like C's auto declarations.

Some types of lvalues can be localized as well: hash and array elements and slices, conditionals (provided that their result is always localizable), and symbolic references. As for simple variables, this creates new, dynamically scoped values.

If more than one variable or expression is given to local, they must be placed in parentheses. This operator works by saving the current values of those variables in its argument list on a hidden stack and restoring them upon exiting the block, subroutine, or eval. This means that called subroutines can also reference the local variable, but not the global one. The argument list may be assigned to if desired, which allows you to initialize your local variables. (If no initializer is given for a particular variable, it is created with an undefined value.)
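
The contrast between dynamic and lexical scoping can be sketched as follows. The variable $level and the three subroutine names are hypothetical.

```perl
use strict;
use warnings;

our $level = "global";

sub report { return $level }     # reads the package variable

sub with_local {
    local $level = "localized";  # temporary value, seen by callees
    return report();
}

sub with_my {
    my $level = "lexical";       # new variable, invisible to callees
    return report();
}

my @results = (with_local(), with_my(), report());
print "@results\n";
```

Only with_local() changes what report() sees, and the original value is back in place once with_local() returns.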

Because local is a run-time operator, it gets executed each time through a loop. Consequently, it's more efficient to localize your variables outside the loop.

Grammatical note on local()

A local is simply a modifier on an lvalue expression. When you assign to a localized variable, the local doesn't change whether its list is viewed as a scalar or an array. So

    local($foo) = <STDIN>;
    local @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    local $foo = <STDIN>;

supplies a scalar context.
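The context rule can be checked with wantarray. This is only a sketch; the subroutine names context and probe, and the variables $x, $y, and @z, are made up for the illustration.

```perl
use strict;
use warnings;

our ($x, $y, @z);                 # package variables, so local applies

sub context { return wantarray ? 'list' : 'scalar' }

sub probe {
    local $x   = context();       # scalar assignment
    local ($y) = context();       # parentheses make it a list assignment
    local @z   = context();       # array on the left: list assignment
    return ($x, $y, $z[0]);
}

print join(' ', probe()), "\n";   # prints: scalar list list
```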

Localization of special variables

If you localize a special variable, you'll be giving a new value to it, but its magic won't go away. That means that all side-effects related to this magic still work with the localized value.

This feature allows code like this to work:

    # Read the whole contents of FILE into $slurp
    { local $/ = undef; $slurp = <FILE>; }

Note, however, that this restricts localization of some values; for example, the following statement dies, as of perl 5.10.0, with the error "Modification of a read-only value attempted", because the $1 variable is magical and read-only:

    local $1 = 2;

One exception is the default scalar variable: starting with perl 5.14 local($_) will always strip all magic from $_, to make it possible to safely reuse $_ in a subroutine.

WARNING: Localization of tied arrays and hashes does not currently work as described. This will be fixed in a future release of Perl; in the meantime, avoid code that relies on any particular behaviour of localising tied arrays or hashes (localising individual elements is still okay). See Localising Tied Arrays and Hashes Is Broken in perl58delta for more details.

Localization of globs

The construct

    local *name;

creates a whole new symbol table entry for the glob name in the current package. That means that all variables in its glob slot ($name, @name, %name, &name, and the name filehandle) are dynamically reset.

This implies, among other things, that any magic those variables may carry is locally lost. In other words, saying local */ will not have any effect on the internal value of the input record separator.

Localization of elements of composite types

It's also worth taking a moment to explain what happens when you localize a member of a composite type (i.e. an array or hash element). In this case, the element is localized by name. This means that when the scope of the local() ends, the saved value will be restored to the hash element whose key was named in the local(), or the array element whose index was named in the local(). If that element was deleted while the local() was in effect (e.g. by a delete() from a hash or a shift() of an array), it will spring back into existence, possibly extending an array and filling in the skipped elements with undef. For instance, if you say

    %hash = ( 'This' => 'is', 'a' => 'test' );
    @ary  = ( 0..5 );
    {
        local($ary[5]) = 6;
        local($hash{'a'}) = 'drill';
        while (my $e = pop(@ary)) {
            print "$e . . .\n";
            last unless $e > 3;
        }
        if (@ary) {
            $hash{'only a'} = 'test';
            delete $hash{'a'};
        }
    }
    print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
    print "The array has ",scalar(@ary)," elements: ",
          join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";

Perl will print

    6 . . .
    4 . . .
    3 . . .
    This is a test only a test.
    The array has 6 elements: 0, 1, 2, undef, undef, 5

The behavior of local() on non-existent members of composite types is subject to change in future.

Localized deletion of elements of composite types

You can use the delete local $array[$idx] and delete local $hash{key} constructs to delete a composite type entry for the current block and restore it when it ends. They return the array/hash value before the localization, which means that they are respectively equivalent to

    do {
        my $val = $array[$idx];
        local $array[$idx];
        delete $array[$idx];
        $val
    }

and

    do {
        my $val = $hash{key};
        local $hash{key};
        delete $hash{key};
        $val
    }

except that for those the local is scoped to the do block. Slices are also accepted.

    my %hash = (
        a => [ 7, 8, 9 ],
        b => 1,
    );
    {
        my $a = delete local $hash{a};
        # $a is [ 7, 8, 9 ]
        # %hash is (b => 1)
        {
            my @nums = delete local @$a[0, 2];
            # @nums is (7, 9)
            # $a is [ undef, 8 ]
            $a->[0] = 999; # will be erased when the scope ends
        }
        # $a is back to [ 7, 8, 9 ]
    }
    # %hash is back to its original state

Lvalue subroutines

WARNING: Lvalue subroutines are still experimental and the implementation may change in future versions of Perl.

It is possible to return a modifiable value from a subroutine. To do this, you have to declare the subroutine to return an lvalue.

    my $val;
    sub canmod : lvalue {
        $val;  # or:  return $val;
    }
    sub nomod {
        $val;
    }
    canmod() = 5;   # assigns to $val
    nomod()  = 5;   # ERROR

The scalar/list context for the subroutine and for the right-hand side of assignment is determined as if the subroutine call is replaced by a scalar. For example, consider:

    data(2,3) = get_data(3,4);

Both subroutines here are called in a scalar context, while in:

    (data(2,3)) = get_data(3,4);

and in:

    (data(2),data(3)) = get_data(3,4);

all the subroutines are called in a list context.

  • Lvalue subroutines are EXPERIMENTAL

    They appear to be convenient, but there is at least one reason to be circumspect.

    They violate encapsulation. A normal mutator can check the supplied argument before setting the attribute it is protecting; an lvalue subroutine never gets that chance. Consider:

        my $some_array_ref = []; # protected by mutators ??
        sub set_arr { # normal mutator
            my $val = shift;
            die("expected array, you supplied ", ref $val)
                unless ref $val eq 'ARRAY';
            $some_array_ref = $val;
        }
        sub set_arr_lv : lvalue { # lvalue mutator
            $some_array_ref;
        }
        # set_arr_lv cannot stop this !
        set_arr_lv() = { a => 1 };

Lexical Subroutines

WARNING: Lexical subroutines are still experimental. The feature may be modified or removed in future versions of Perl.

Lexical subroutines are only available under the use feature 'lexical_subs' pragma, which produces a warning unless the "experimental::lexical_subs" warnings category is disabled.

Beginning with Perl 5.18, you can declare a private subroutine with my or state. As with state variables, the state keyword is only available under use feature 'state' or use 5.010 or higher.

These subroutines are only visible within the block in which they are declared, and only after that declaration:

    no warnings "experimental::lexical_subs";
    use feature 'lexical_subs';
    foo();              # calls the package/global subroutine
    state sub foo {
        foo();          # also calls the package subroutine
    }
    foo();              # calls "state" sub
    my $ref = \&foo;    # take a reference to "state" sub
    my sub bar { ... }
    bar();              # calls "my" sub

To use a lexical subroutine from inside the subroutine itself, you must predeclare it. The sub foo {...} subroutine definition syntax respects any previous my sub; or state sub; declaration.

    my sub baz;     # predeclaration
    sub baz {       # define the "my" sub
        baz();      # recursive call
    }

state sub vs my sub

What is the difference between "state" subs and "my" subs? Each time that execution enters a block in which "my" subs are declared, a new copy of each sub is created. "State" subroutines persist from one execution of the containing block to the next.

So, in general, "state" subroutines are faster. But "my" subs are necessary if you want to create closures:

    no warnings "experimental::lexical_subs";
    use feature 'lexical_subs';
    sub whatever {
        my $x = shift;
        my sub inner {
            # ... do something with $x ...
        }
        inner();
    }

In this example, a new $x is created when whatever is called, and also a new inner, which can see the new $x. A "state" sub will only see the $x from the first call to whatever.
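A runnable variant of the closure case might look like the sketch below. It assumes Perl 5.18 or later, and the names describe and inner are placeholders chosen for the example.

```perl
use strict;
use warnings;
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';

sub describe {
    my $x = shift;
    my sub inner {               # recreated on each call to describe()
        return "x is $x";        # closes over the current $x
    }
    return inner();
}

print describe(1), "\n";   # prints: x is 1
print describe(2), "\n";   # prints: x is 2
```

Because inner is a "my" sub, each call to describe() gets a fresh copy bound to that call's $x.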

our subroutines

Like our $variable, our sub creates a lexical alias to the package subroutine of the same name.

The two main uses for this are to switch back to using the package sub inside an inner scope:

    no warnings "experimental::lexical_subs";
    use feature 'lexical_subs';
    sub foo { ... }
    sub bar {
        my sub foo { ... }
        {
            # need to use the outer foo here
            our sub foo;
            foo();
        }
    }

and to make a subroutine visible to other packages in the same scope:

    package MySneakyModule;
    no warnings "experimental::lexical_subs";
    use feature 'lexical_subs';
    our sub do_something { ... }
    sub do_something_with_caller {
        package DB;
        () = caller 1;          # sets @DB::args
        do_something(@args);    # uses MySneakyModule::do_something
    }

Passing Symbol Table Entries (typeglobs)

WARNING: The mechanism described in this section was originally the only way to simulate pass-by-reference in older versions of Perl. While it still works fine in modern versions, the new reference mechanism is generally easier to work with. See below.

Sometimes you don't want to pass the value of an array to a subroutine but rather the name of it, so that the subroutine can modify the global copy of it rather than working with a local copy. In perl you can refer to all objects of a particular name by prefixing the name with a star: *foo. This is often known as a "typeglob", because the star on the front can be thought of as a wildcard match for all the funny prefix characters on variables and subroutines and such.

When evaluated, the typeglob produces a scalar value that represents all the objects of that name, including any filehandle, format, or subroutine. When assigned to, it causes the name mentioned to refer to whatever * value was assigned to it. Example:

    sub doubleary {
        local(*someary) = @_;
        foreach $elem (@someary) {
            $elem *= 2;
        }
    }
    doubleary(*foo);
    doubleary(*bar);

Scalars are already passed by reference, so you can modify scalar arguments without using this mechanism by referring explicitly to $_[0] etc. You can modify all the elements of an array by passing all the elements as scalars, but you have to use the * mechanism (or the equivalent reference mechanism) to push, pop, or change the size of an array. It will certainly be faster to pass the typeglob (or reference).

Even if you don't want to modify an array, this mechanism is useful for passing multiple arrays in a single LIST, because normally the LIST mechanism will merge all the array values so that you can't extract out the individual arrays. For more on typeglobs, see Typeglobs and Filehandles in perldata.

When to Still Use local()

Despite the existence of my, there are three places where the local operator still shines. In fact, in these three places, you must use local instead of my.

1.

You need to give a global variable a temporary value, especially $_.

The global variables, like @ARGV or the punctuation variables, must be localized with local(). This block reads in /etc/motd, and splits it up into chunks separated by lines of equal signs, which are placed in @Fields.

    {
        local @ARGV = ("/etc/motd");
        local $/ = undef;
        local $_ = <>;
        @Fields = split /^\s*=+\s*$/;
    }

In particular, it's important to localize $_ in any routine that assigns to it. Look out for implicit assignments in while conditionals.

2.

You need to create a local file or directory handle or a local function.

A function that needs a filehandle of its own must use local() on a complete typeglob. This can be used to create new symbol table entries:

    sub ioqueue {
        local (*READER, *WRITER);    # not my!
        pipe (READER, WRITER) or die "pipe: $!";
        return (*READER, *WRITER);
    }
    ($head, $tail) = ioqueue();

See the Symbol module for a way to create anonymous symbol table entries.

Because assignment of a reference to a typeglob creates an alias, this can be used to create what is effectively a local function, or at least, a local alias.

    {
        local *grow = \&shrink; # only until this block exits
        grow();                 # really calls shrink()
        move();                 # if move() grow()s, it shrink()s too
    }
    grow();                     # get the real grow() again

See Function Templates in perlref for more about manipulating functions by name in this way.

3.

You want to temporarily change just one element of an array or hash.

You can localize just one element of an aggregate. Usually this is done on dynamics:

    {
        local $SIG{INT} = 'IGNORE';
        funct();                    # uninterruptible
    }
    # interruptibility automatically restored here

But it also works on lexically declared aggregates.
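For the lexical case, a minimal sketch could look like this. The hash %config and the subroutines verbosity and noisily are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

my %config = (verbose => 0);      # a lexical hash, not a package variable

sub verbosity { return $config{verbose} }

sub noisily {
    local $config{verbose} = 1;   # only this element gets a temporary value
    return verbosity();
}

print noisily(), "\n";     # prints: 1
print verbosity(), "\n";   # prints: 0
```

The element is restored when noisily() returns, even though %config itself was declared with my.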

Pass by Reference

If you want to pass more than one array or hash into a function--or return them from it--and have them maintain their integrity, then you're going to have to use an explicit pass-by-reference. Before you do that, you need to understand references as detailed in perlref. This section may not make much sense to you otherwise.

Here are a few simple examples. First, let's pass in several arrays to a function and have it pop all of them, returning a new list of all their former last elements:

    @tailings = popmany ( \@a, \@b, \@c, \@d );
    sub popmany {
        my $aref;
        my @retlist = ();
        foreach $aref ( @_ ) {
            push @retlist, pop @$aref;
        }
        return @retlist;
    }

Here's how you might write a function that returns a list of keys occurring in all the hashes passed to it:

    @common = inter( \%foo, \%bar, \%joe );
    sub inter {
        my ($k, $href, %seen); # locals
        foreach $href (@_) {
            while ( $k = each %$href ) {
                $seen{$k}++;
            }
        }
        return grep { $seen{$_} == @_ } keys %seen;
    }

So far, we're using just the normal list return mechanism. What happens if you want to pass or return a hash? Well, if you're using only one of them, or you don't mind them concatenating, then the normal calling convention is ok, although a little expensive.

Where people get into trouble is here:

    (@a, @b) = func(@c, @d);

or

    (%a, %b) = func(%c, %d);

That syntax simply won't work. It sets just @a or %a and clears @b or %b. Plus, the function didn't receive two separate arrays or hashes: it got one long list in @_, as always.
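The flattening is easy to observe. In this sketch, passthru and counts_after_assignment are invented helper names; the first simply returns its argument list unchanged.

```perl
use strict;
use warnings;

sub passthru { return @_ }        # @_ is already one flat list

sub counts_after_assignment {
    # The first array on the left swallows everything.
    my (@first, @second) = passthru(@_);
    return (scalar @first, scalar @second);
}

my @c = (1, 2);
my @d = (3, 4, 5);
print join(' ', counts_after_assignment(@c, @d)), "\n";   # prints: 5 0
```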

If you can arrange for everyone to deal with this through references, it's cleaner code, although not so nice to look at. Here's a function that takes two array references as arguments and returns both references, the one to the longer array first:

    ($aref, $bref) = func(\@c, \@d);
    print "@$aref has more than @$bref\n";
    sub func {
        my ($cref, $dref) = @_;
        if (@$cref > @$dref) {
            return ($cref, $dref);
        } else {
            return ($dref, $cref);
        }
    }

It turns out that you can actually do this also:

    (*a, *b) = func(\@c, \@d);
    print "@a has more than @b\n";
    sub func {
        local (*c, *d) = @_;
        if (@c > @d) {
            return (\@c, \@d);
        } else {
            return (\@d, \@c);
        }
    }

Here we're using the typeglobs to do symbol table aliasing. It's a tad subtle, though, and also won't work if you're using my variables, because only globals (even in disguise as locals) are in the symbol table.

If you're passing around filehandles, you could usually just use the bare typeglob, like *STDOUT, but typeglob references work, too. For example:

    splutter(\*STDOUT);
    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }
    $rec = get_rec(\*STDIN);
    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }

If you're planning on generating new filehandles, you could do this. Note that it passes back just the bare *FH, not a reference to it.

    sub openit {
        my $path = shift;
        local *FH;
        return open (FH, $path) ? *FH : undef;
    }

Prototypes

Perl supports a very limited kind of compile-time argument checking using function prototyping. If you declare

    sub mypush (+@)

then mypush() takes arguments exactly like push() does. The function declaration must be visible at compile time. The prototype affects only interpretation of new-style calls to the function, where new-style is defined as not using the & character. In other words, if you call it like a built-in function, then it behaves like a built-in function. If you call it like an old-fashioned subroutine, then it behaves like an old-fashioned subroutine. It naturally falls out from this rule that prototypes have no influence on subroutine references like \&foo or on indirect subroutine calls like &{$subref} or $subref->().
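The difference between new-style and old-style calls can be seen in a small sketch. The subroutine first_arg_kind and its two callers are hypothetical names; the (\@) prototype is used because it changes what arrives in @_.

```perl
use strict;
use warnings;

# With the prototype, a new-style call passes a reference to the array.
sub first_arg_kind (\@) {
    return ref $_[0] ? 'reference' : 'plain value';
}

sub new_style_call {
    my @list = (10, 20);
    return first_arg_kind(@list);     # prototype applied
}

sub old_style_call {
    my @list = (10, 20);
    return &first_arg_kind(@list);    # & bypasses the prototype
}

print new_style_call(), "\n";   # prints: reference
print old_style_call(), "\n";   # prints: plain value
```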

Method calls are not influenced by prototypes either, because the function to be called is indeterminate at compile time, since the exact code called depends on inheritance.

Because the intent of this feature is primarily to let you define subroutines that work like built-in functions, here are prototypes for some other functions that parse almost exactly like the corresponding built-in.

    Declared as                Called as

    sub mylink ($$)            mylink $old, $new
    sub myvec ($$$)            myvec $var, $offset, 1
    sub myindex ($$;$)         myindex &getstring, "substr"
    sub mysyswrite ($$$;$)     mysyswrite $buf, 0, length($buf) - $off, $off
    sub myreverse (@)          myreverse $a, $b, $c
    sub myjoin ($@)            myjoin ":", $a, $b, $c
    sub mypop (+)              mypop @array
    sub mysplice (+$$@)        mysplice @array, 0, 2, @pushme
    sub mykeys (+)             mykeys %{$hashref}
    sub myopen (*;$)           myopen HANDLE, $name
    sub mypipe (**)            mypipe READHANDLE, WRITEHANDLE
    sub mygrep (&@)            mygrep { /foo/ } $a, $b, $c
    sub myrand (;$)            myrand 42
    sub mytime ()              mytime

Any backslashed prototype character represents an actual argument that must start with that character (optionally preceded by my, our or local), with the exception of $, which will accept any scalar lvalue expression, such as $foo = 7 or my_function()->[0]. The value passed as part of @_ will be a reference to the actual argument given in the subroutine call, obtained by applying \ to that argument.

You can use the \[] backslash group notation to specify more than one allowed argument type. For example:

    sub myref (\[$@%&*])

will allow calling myref() as

    myref $var
    myref @array
    myref %hash
    myref &sub
    myref *glob

and the first argument of myref() will be a reference to a scalar, an array, a hash, a code, or a glob.

Unbackslashed prototype characters have special meanings. Any unbackslashed @ or % eats all remaining arguments, and forces list context. An argument represented by $ forces scalar context. An & requires an anonymous subroutine, which, if passed as the first argument, does not require the sub keyword or a subsequent comma.

A * allows the subroutine to accept a bareword, constant, scalar expression, typeglob, or a reference to a typeglob in that slot. The value will be available to the subroutine either as a simple scalar, or (in the latter two cases) as a reference to the typeglob. If you wish to always convert such arguments to a typeglob reference, use Symbol::qualify_to_ref() as follows:

    use Symbol 'qualify_to_ref';
    sub foo (*) {
        my $fh = qualify_to_ref(shift, caller);
        ...
    }

The + prototype is a special alternative to $ that will act like \[@%] when given a literal array or hash variable, but will otherwise force scalar context on the argument. This is useful for functions which should accept either a literal array or an array reference as the argument:

    sub mypush (+@) {
        my $aref = shift;
        die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
        push @$aref, @_;
    }

When using the + prototype, your function must check that the argument is of an acceptable type.

A semicolon (;) separates mandatory arguments from optional arguments. It is redundant before @ or %, which gobble up everything else.

As the last character of a prototype, or just before a semicolon, a @ or a %, you can use _ in place of $: if this argument is not provided, $_ will be used instead.
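A small sketch of the _ prototype follows. The names shout and demo are invented for the example, and it assumes a perl recent enough to support _ in prototypes.

```perl
use strict;
use warnings;

sub shout (_) { return uc $_[0] }

sub demo {
    my @out;
    push @out, shout('explicit');   # argument given explicitly
    for ('implicit') {              # aliases $_ to the string
        push @out, shout();         # no argument: $_ is passed instead
    }
    return @out;
}

print join(' ', demo()), "\n";   # prints: EXPLICIT IMPLICIT
```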

Note how the last three examples in the table above are treated specially by the parser. mygrep() is parsed as a true list operator, myrand() is parsed as a true unary operator with unary precedence the same as rand(), and mytime() is truly without arguments, just like time(). That is, if you say

    mytime +2;

you'll get mytime() + 2, not mytime(2), which is how it would be parsed without a prototype. If you want to force a unary function to have the same precedence as a list operator, add ; to the end of the prototype:

    sub mygetprotobynumber($;);
    mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)

The interesting thing about & is that you can generate new syntax with it, provided it's in the initial position:

    sub try (&@) {
        my($try,$catch) = @_;
        eval { &$try };
        if ($@) {
            local $_ = $@;
            &$catch;
        }
    }
    sub catch (&) { $_[0] }
    try {
        die "phooey";
    } catch {
        /phooey/ and print "unphooey\n";
    };

That prints "unphooey". (Yes, there are still unresolved issues having to do with visibility of @_. I'm ignoring that question for the moment. (But note that if we make @_ lexically scoped, those anonymous subroutines can act like closures... (Gee, is this sounding a little Lispish? (Never mind.))))

And here's a reimplementation of the Perl grep operator:

    sub mygrep (&@) {
        my $code = shift;
        my @result;
        foreach $_ (@_) {
            push(@result, $_) if &$code;
        }
        @result;
    }

Some folks would prefer full alphanumeric prototypes. Alphanumerics have been intentionally left out of prototypes for the express purpose of someday in the future adding named, formal parameters. The current mechanism's main goal is to let module writers provide better diagnostics for module users. Larry feels the notation quite understandable to Perl programmers, and that it will not intrude greatly upon the meat of the module, nor make it harder to read. The line noise is visually encapsulated into a small pill that's easy to swallow.

If you try to use an alphanumeric sequence in a prototype you will generate an optional warning - "Illegal character in prototype...". Unfortunately earlier versions of Perl allowed the prototype to be used as long as its prefix was a valid prototype. The warning may be upgraded to a fatal error in a future version of Perl once the majority of offending code is fixed.

It's probably best to prototype new functions, not retrofit prototyping into older ones. That's because you must be especially careful about silent impositions of differing list versus scalar contexts. For example, if you decide that a function should take just one parameter, like this:

    sub func ($) {
        my $n = shift;
        print "you gave me $n\n";
    }

and someone has been calling it with an array or expression returning a list:

    func(@foo);
    func( split /:/ );

Then you've just supplied an automatic scalar in front of their argument, which can be more than a bit surprising. The old @foo which used to hold one thing doesn't get passed in. Instead, func() now gets passed in a 1; that is, the number of elements in @foo. And the split gets called in scalar context so it starts scribbling on your @_ parameter list. Ouch!
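The silent change of context can be reproduced with two otherwise identical subroutines. In this sketch, unprototyped, prototyped, and compare are made-up names.

```perl
use strict;
use warnings;

sub unprototyped   { return $_[0] }
sub prototyped ($) { return $_[0] }   # ($) forces scalar context

sub compare {
    my @foo = ('only');
    # Same call syntax, different argument: the element versus the count.
    return (unprototyped(@foo), prototyped(@foo));
}

print join(' ', compare()), "\n";   # prints: only 1
```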

This is all very powerful, of course, and should be used only in moderation to make the world a better place.

Constant Functions

Functions with a prototype of () are potential candidates for inlining. If the result after optimization and constant folding is either a constant or a lexically-scoped scalar which has no other references, then it will be used in place of function calls made without &. Calls made using & are never inlined. (See constant.pm for an easy way to declare most constants.)

The following functions would all be inlined:

    sub pi ()           { 3.14159 }             # Not exact, but close.
    sub PI ()           { 4 * atan2 1, 1 }      # As good as it gets,
                                                # and it's inlined, too!
    sub ST_DEV ()       { 0 }
    sub ST_INO ()       { 1 }
    sub FLAG_FOO ()     { 1 << 8 }
    sub FLAG_BAR ()     { 1 << 9 }
    sub FLAG_MASK ()    { FLAG_FOO | FLAG_BAR }
    sub OPT_BAZ ()      { not (0x1B58 & FLAG_MASK) }
    sub N ()            { int(OPT_BAZ) / 3 }
    sub FOO_SET ()      { 1 if FLAG_MASK & FLAG_FOO }

Be aware that these will not be inlined; as they contain inner scopes, the constant folding doesn't reduce them to a single constant:

    sub foo_set () { if (FLAG_MASK & FLAG_FOO) { 1 } }
    sub baz_val () {
        if (OPT_BAZ) {
            return 23;
        }
        else {
            return 42;
        }
    }

If you redefine a subroutine that was eligible for inlining, you'll get a warning by default. (You can use this warning to tell whether or not a particular subroutine is considered constant.) The warning is considered severe enough not to be affected by the -w switch (or its absence) because previously compiled invocations of the function will still be using the old value of the function. If you need to be able to redefine the subroutine, you need to ensure that it isn't inlined, either by dropping the () prototype (which changes calling semantics, so beware) or by thwarting the inlining mechanism in some other way, such as

    sub not_inlined () {
        23 if $];
    }

Overriding Built-in Functions

Many built-in functions may be overridden, though this should be tried only occasionally and for good reason. Typically this might be done by a package attempting to emulate missing built-in functionality on a non-Unix system.

Overriding may be done only by importing the name from a module at compile time--ordinary predeclaration isn't good enough. However, the use subs pragma lets you, in effect, predeclare subs via the import syntax, and these names may then override built-in ones:

    use subs 'chdir', 'chroot', 'chmod', 'chown';
    chdir $somewhere;
    sub chdir { ... }

To unambiguously refer to the built-in form, precede the built-in name with the special package qualifier CORE::. For example, saying CORE::open() always refers to the built-in open(), even if the current package has imported some other subroutine called &open() from elsewhere. Even though it looks like a regular function call, it isn't: the CORE:: prefix in that case is part of Perl's syntax, and works for any keyword, regardless of what is in the CORE package. Taking a reference to it, that is, \&CORE::open, only works for some keywords. See CORE.
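As a sketch of how an override and CORE:: interact, the package below overrides hex for itself and still reaches the built-in. The package name Lenient and the subroutine demo are hypothetical, and hex was picked only because it has no side effects.

```perl
use strict;
use warnings;

package Lenient;
use subs 'hex';                   # predeclare via import, so it overrides

sub hex {
    # Delegate to the real built-in through the CORE:: qualifier.
    return 'override:' . CORE::hex($_[0]);
}

sub demo {
    return (hex('ff'), CORE::hex('ff'));
}

package main;

print join(' ', Lenient::demo()), "\n";   # prints: override:255 255
```

Code outside package Lenient is unaffected and continues to see the built-in hex.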

Library modules should not in general export built-in names like open or chdir as part of their default @EXPORT list, because these may sneak into someone else's namespace and change the semantics unexpectedly. Instead, if the module adds that name to @EXPORT_OK, then it's possible for a user to import the name explicitly, but not implicitly. That is, they could say

    use Module 'open';

and it would import the open override. But if they said

    use Module;

they would get the default imports without overrides.

The foregoing mechanism for overriding built-ins is restricted, quite deliberately, to the package that requests the import. There is a second method that is sometimes applicable when you wish to override a built-in everywhere, without regard to namespace boundaries. This is achieved by importing a sub into the special namespace CORE::GLOBAL::. Here is an example that quite brazenly replaces the glob operator with something that understands regular expressions.

    package REGlob;
    require Exporter;
    @ISA = 'Exporter';
    @EXPORT_OK = 'glob';
    sub import {
        my $pkg = shift;
        return unless @_;
        my $sym = shift;
        my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
        $pkg->export($where, $sym, @_);
    }
    sub glob {
        my $pat = shift;
        my @got;
        if (opendir my $d, '.') {
            @got = grep /$pat/, readdir $d;
            closedir $d;
        }
        return @got;
    }
    1;

And here's how it could be (ab)used:

    #use REGlob 'GLOBAL_glob';      # override glob() in ALL namespaces
    package Foo;
    use REGlob 'glob';              # override glob() in Foo:: only
    print for <^[a-z_]+\.pm\$>;     # show all pragmatic modules

The initial comment shows a contrived, even dangerous example. By overriding glob globally, you would be forcing the new (and subversive) behavior for the glob operator for every namespace, without the complete cognizance or cooperation of the modules that own those namespaces. Naturally, this should be done with extreme caution--if it must be done at all.

The REGlob example above does not implement all the support needed to cleanly override perl's glob operator. The built-in glob has different behaviors depending on whether it appears in a scalar or list context, but our REGlob doesn't. Indeed, many perl built-ins have such context-sensitive behaviors, and these must be adequately supported by a properly written override. For a fully functional example of overriding glob, study the implementation of File::DosGlob in the standard library.

When you override a built-in, your replacement should be consistent (if possible) with the built-in native syntax. You can achieve this by using a suitable prototype. To get the prototype of an overridable built-in, use the prototype function with an argument of "CORE::builtin_name" (see prototype).

Note however that some built-ins can't have their syntax expressed by a prototype (such as system or chomp). If you override them you won't be able to fully mimic their original syntax.

The built-ins do, require and glob can also be overridden, but due to special magic, their original syntax is preserved, and you don't have to define a prototype for their replacements. (You can't override the do BLOCK syntax, though).

require has special additional dark magic: if you invoke your require replacement as require Foo::Bar, it will actually receive the argument "Foo/Bar.pm" in @_. See require.

And, as you'll have noticed from the previous example, if you override glob, the <*> glob operator is overridden as well.

In a similar fashion, overriding the readline function also overrides the equivalent I/O operator <FILEHANDLE>. Likewise, overriding readpipe also overrides the operators `` and qx//.

Finally, some built-ins (e.g. exists or grep) can't be overridden.

Autoloading

If you call a subroutine that is undefined, you would ordinarily get an immediate, fatal error complaining that the subroutine doesn't exist. (Likewise for subroutines being used as methods, when the method doesn't exist in any base class of the class's package.) However, if an AUTOLOAD subroutine is defined in the package or packages used to locate the original subroutine, then that AUTOLOAD subroutine is called with the arguments that would have been passed to the original subroutine. The fully qualified name of the original subroutine magically appears in the global $AUTOLOAD variable of the same package as the AUTOLOAD routine. The name is not passed as an ordinary argument because, er, well, just because, that's why. (As an exception, a method call to a nonexistent import or unimport method is just skipped instead. Also, if the AUTOLOAD subroutine is an XSUB, there are other ways to retrieve the subroutine name. See Autoloading with XSUBs in perlguts for details.)

Many AUTOLOAD routines load in a definition for the requested subroutine using eval(), then execute that subroutine using a special form of goto() that erases the stack frame of the AUTOLOAD routine without a trace. (See the source to the standard module documented in AutoLoader, for example.) But an AUTOLOAD routine can also just emulate the routine and never define it. For example, let's pretend that a function that wasn't defined should just invoke system with those arguments. All you'd do is:

    sub AUTOLOAD {
        my $program = $AUTOLOAD;
        $program =~ s/.*:://;
        system($program, @_);
    }
    date();
    who('am', 'i');
    ls('-l');

In fact, if you predeclare functions you want to call that way, you don't even need parentheses:

    use subs qw(date who ls);
    date;
    who "am", "i";
    ls '-l';

A more complete example of this is the Shell module on CPAN, which can treat undefined subroutine calls as calls to external programs.

Mechanisms are available to help module writers split their modules into autoloadable files. See the standard AutoLoader module described in AutoLoader and in AutoSplit, the standard SelfLoader module in SelfLoader, and the document on adding C functions to Perl code in perlxs.
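The define-then-goto pattern described in this section can be sketched as follows. This is a minimal illustration, not how AutoLoader itself is implemented: the package Greeter, the counter $autoload_calls, and the generated subroutine body are assumptions made for the example.

```perl
use strict;
use warnings;

package Greeter;    # hypothetical package used only for this sketch

our $AUTOLOAD;
our $autoload_calls = 0;

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';    # never autoload destructors

    $autoload_calls++;

    no strict 'refs';
    # Install a real subroutine under the requested name ...
    *{$AUTOLOAD} = sub { return "$name says: @_" };
    # ... then jump to it, replacing AUTOLOAD's own stack frame.
    goto &{$AUTOLOAD};
}

package main;

my $first  = Greeter::hello('world');    # goes through AUTOLOAD
my $second = Greeter::hello('again');    # calls the installed sub directly

print "$first\n$second\n";
print "AUTOLOAD ran $Greeter::autoload_calls time(s)\n";
```

Because the subroutine is installed on the first call, later calls no longer pay the AUTOLOAD cost.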

Subroutine Attributes

A subroutine declaration or definition may have a list of attributes associated with it. If such an attribute list is present, it is broken up at space or colon boundaries and treated as though a use attributes had been seen. See attributes for details about what attributes are currently supported. Unlike the limitation with the obsolescent use attrs, the sub : ATTRLIST syntax works to associate the attributes with a pre-declaration, and not just with a subroutine definition.

The attributes must be valid as simple identifier names (without any punctuation other than the '_' character). They may have a parameter list appended, which is only checked for whether its parentheses ('(',')') nest properly.

Examples of valid syntax (even though the attributes are unknown):

    sub fnord (&\%) : switch(10,foo(7,3)) : expensive;
    sub plugh () : Ugly('\(") :Bad;
    sub xyzzy : _5x5 { ... }

Examples of invalid syntax:

    sub fnord : switch(10,foo();    # ()-string not balanced
    sub snoid : Ugly('(');          # ()-string not balanced
    sub xyzzy : 5x5;                # "5x5" not a valid identifier
    sub plugh : Y2::north;          # "Y2::north" not a simple identifier
    sub snurt : foo + bar;          # "+" not a colon or space

The attribute list is passed as a list of constant strings to the code which associates them with the subroutine. In particular, the second example of valid syntax above currently looks like this in terms of how it's parsed and invoked:

    use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';

For further details on attribute lists and their manipulation, see attributes and Attribute::Handlers.
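As a small runnable illustration of the attribute syntax, the sketch below uses the built-in lvalue attribute and reads it back with attributes::get. The names $stored and slot are invented for the example.

```perl
use strict;
use warnings;
use attributes;

my $stored = 0;

# A subroutine declared with the built-in "lvalue" attribute.
sub slot : lvalue { $stored }

slot() = 5;    # assigning through the lvalue subroutine updates $stored

# Ask the attributes module which attributes are attached to the sub.
my @attrs = attributes::get(\&slot);

print "stored value: $stored; attributes: @attrs\n";
```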

SEE ALSO

See Function Templates in perlref for more about references and closures. See perlxs if you'd like to learn about calling C subroutines from Perl. See perlembed if you'd like to learn about calling Perl subroutines from C. See perlmod to learn about bundling up your functions in separate files. See perlmodlib to learn what library modules come standard on your system. See perlootut to learn how to make object method calls.

 
perlsymbian

Perl 5 version 18.2 documentation

NAME

perlsymbian - Perl version 5 on Symbian OS

DESCRIPTION

This document describes various features of the Symbian operating system that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

NOTE: this port (as of 0.4.1) does not compile into a Symbian OS GUI application, but instead it results in a Symbian DLL. The DLL includes a C++ class called CPerlBase, which one can then (derive from and) use to embed Perl into applications, see symbian/README.

The base port of Perl to Symbian only implements the basic POSIX-like functionality; it does not implement any further Symbian or Series 60, Series 80, or UIQ bindings for Perl.

It is also possible to generate Symbian executables for "miniperl" and "perl", but since there is no standard command line interface for Symbian (nor full keyboards in the devices), these are useful mainly as demonstrations.

Compiling Perl on Symbian

(0) You need to have the appropriate Symbian SDK installed.

These instructions have been tested under various Nokia Series 60 Symbian SDKs (1.2 to 2.6, 2.8 should also work, 1.2 compiles but does not work), Series 80 2.0, and Nokia 7710 (Series 90) SDK. You can get the SDKs from Forum Nokia (http://www.forum.nokia.com/). A very rough port ("it compiles") to UIQ 2.1 has also been made.

A prerequisite for any of the SDKs is to install ActivePerl from ActiveState, http://www.activestate.com/Products/ActivePerl/

Having the SDK installed also means that you need to have either the Metrowerks CodeWarrior installed (2.8 and 3.0 were used in testing) or the Microsoft Visual C++ 6.0 installed (SP3 minimum, SP5 recommended).

Note that, for example, the Series 60 2.0 VC SDK installation talks about ActivePerl build 518, which no longer exists (as of mid-2005) at the ActiveState website. The ActivePerl 5.8.4 build 810 was used successfully for compiling Perl on Symbian. The 5.6.x ActivePerls do not work.

Other SDKs or compilers like Visual.NET, command-line-only Visual.NET, Borland, GnuPoc, or sdk2unix have not been tried.

These instructions almost certainly won't work with older Symbian releases or other SDKs. Patches to get this port running in other releases, SDKs, compilers, platforms, or devices are naturally welcome.

(1) Get a Perl source code distribution (for example the file perl-5.9.2.tar.gz is fine) from http://www.cpan.org/src/ and unpack it in the C:\Symbian directory of your Windows system.

(2) Change to the perl source directory.

    cd c:\Symbian\perl-5.x.x

(3) Run the following script using the perl coming with the SDK

    perl symbian\config.pl

You must use cmd.exe; the Cygwin shell will not work. The PATH must include the SDK tools, including a Perl, which should be the case under cmd.exe. If you do not have that, see the end of symbian\sdk.pl for notes on how your environment should be set up for Symbian compiles.

(4) Build the project, either by

    make all

in cmd.exe or by using either the Metrowerks CodeWarrior or the Visual C++ 6.0, or the Visual Studio 8 (the Visual C++ 2005 Express Edition works fine).

If you use the VC IDE, you will have to run symbian\config.pl first using cmd.exe, and then run 'make win.mf vc6.mf' to generate the VC6 makefiles and workspaces. "make vc6" will compile for the VC6, and "make cw" for the CodeWarrior.

The following SDK and compiler configurations and Nokia phones were tested at some point in time (+ = compiled and PerlApp run, - = not), both for Perl 5.8.x and 5.9.x:

    SDK     | VC | CW | Phone
    --------+----+----+--------------------------
    S60 1.2 | +  | +  | 3650 (*)
    S60 2.0 | +  | +  | 6600
    S60 2.1 | -  | +  | 6670
    S60 2.6 | +  | +  | 6630
    S60 2.8 | +  | +  | (not tested in a device)
    S80 2.6 | -  | +  | 9300
    S90 1.1 | +  | -  | 7710
    UIQ 2.1 | -  | +  | (not tested in a device)

    (*) Compiles but does not work, unfortunately, a problem with Symbian.

If you are using 'make' directly, it is the GNU make from the SDKs, and it will invoke the right make commands for the Windows emulator build and the Arm target builds ('thumb' by default) as necessary.

The build scripts assume the 'absolute style' SDK installs under C:, the 'subst style' will not work.

If using the VC IDE, to build use for example the File->Open Workspace-> C:\Symbian\8.0a\S60_2nd_FP2\epoc32\build\symbian\perl\perl\wins\perl.dsw. The emulator binaries will appear in the same directory.

If using the VC IDE, you will see a lot of warnings in the beginning of the build because a lot of headers mentioned by the source cannot be found, but this is not serious since those headers are not used.

The Metrowerks will give a lot of warnings about unused variables and empty declarations; you can ignore those.

When the Windows and Arm DLLs are built, do not be scared by the very long messages whizzing by: it is the "export freeze" phase where the whole (rather large) API of Perl is listed.

Once the build is completed you need to create the DLL SIS file by

    make perldll.sis

which will create the file perlXYZ.sis (the XYZ being the Perl version) which you can then install into your Symbian device: an easy way to do this is to send them via Bluetooth or infrared and just open the messages.

Since the total size of all Perl SIS files once installed is over 2 MB, it is recommended to do the installation into a memory card (drive E:) instead of the C: drive.

The size of the perlXYZ.SIS is about 370 kB but once it is in the device it is about 750 kB (according to the application manager).

The perlXYZ.sis includes only the Perl DLL: to create an additional SIS file which includes some of the standard (pure) Perl libraries, issue the command

    make perllib.sis

Some of the standard Perl libraries are included, but not all: see HISTORY or symbian\install.cfg for more details (250 kB -> 700 kB).

Some of the standard Perl XS extensions (see HISTORY) are also available:

    make perlext.sis

which will create perlXYZext.sis (290 kB -> 770 kB).

To compile the demonstration application PerlApp you need first to install the Perl headers under the SDK.

To install the Perl headers and the class CPerlBase documentation so that you no longer need the Perl sources around to compile Perl applications using the SDK:

    make sdkinstall

The destination directory is C:\Symbian\perl\X.Y.Z. For more details, see symbian\PerlBase.pod.

Once the headers have been installed, you can create a SIS for the PerlApp:

    make perlapp.sis

The perlapp.sis (11 kB -> 16 kB) will be built in the symbian subdirectory, but a copy will also be made to the main directory.

If you want to package the Perl DLLs (one for WINS, one for ARMI), the headers, and the documentation:

    make perlsdk.zip

which will create perlXYZsdk.zip that can be used in another Windows system with the SDK, without having to compile Perl in that system.

If you want to package the PerlApp sources:

    make perlapp.zip

If you want to package the perl.exe and miniperl.exe, you can use the perlexe.sis and miniperlexe.sis make targets. You also probably want the perllib.sis for the libraries and maybe even the perlapp.sis for the recognizer.

The make target 'allsis' combines all the above SIS targets.

To clean up after compilation you can use either of

    make clean
    make distclean

depending on how clean you want to be.

Compilation problems

If you see right after "make" this

    cat makefile.sh >makefile
    'cat' is not recognized as an internal or external command,
    operable program or batch file.

it means you need to (re)run the symbian\config.pl.

If you get the error

    'perl' is not recognized as an internal or external command,
    operable program or batch file.

you may need to reinstall the ActivePerl.

If you see this

    ren makedef.pl nomakedef.pl
    The system cannot find the file specified.
    C:\Symbian\...\make.exe: [rename_makedef] Error 1 (ignored)

please ignore it since it is nothing serious (the build process renames the Perl makedef.pl as nomakedef.pl to avoid confusing it with a makedef.pl of the SDK).

PerlApp

The PerlApp application demonstrates how to embed Perl interpreters into a Symbian application. The "Time" menu item runs the Perl code print "Running in ", $^O, "\n", scalar localtime. The "Oneliner" item allows one to type in Perl code, and the "Run" item opens a file chooser for selecting a Perl file to run.

The PerlApp also is started when the "Perl recognizer" (also included and installed) detects a Perl file being activated through the GUI, and offers either to install it under \Perl (if the Perl file is in the inbox of the messaging application) or to run it (if the Perl file is under \Perl).

sisify.pl

In the symbian subdirectory there is a sisify.pl utility which can be used to package Perl scripts and/or Perl library directories into SIS files, which can be installed to the device. To run the sisify.pl utility, you will need to have the 'makesis' and 'uidcrc' utilities already installed. If you don't have the Win32 SDKs, you may try for example http://gnupoc.sourceforge.net/ or http://symbianos.org/~andreh/.

Using Perl in Symbian

First of all note that you have full access to the Symbian device when using Perl: you can do a lot of damage to your device (like removing system files) unless you are careful. Please do take backups before doing anything.

The Perl port has been done for the most part using the Symbian standard POSIX-ish STDLIB library. It is a reasonably complete library, but certain corners of such emulation libraries that tend to be left unimplemented on non-UNIX platforms have been left unimplemented also this time: fork(), signals(), user/group ids, select() working for sockets, non-blocking sockets, and so forth. See the file symbian/config.sh and look for 'undef' to find the unsupported APIs (or from Perl use Config).
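The "use Config" suggestion above might look like the following sketch. The four keys chosen here are only an illustrative selection of the d_* entries in %Config, and the %status hash is a name invented for the example.

```perl
use strict;
use warnings;
use Config;

my %status;    # hypothetical summary of a few API checks

# An entry is 'define' when the API is available on this build of Perl.
for my $key (qw(d_fork d_alarm d_select d_seteuid)) {
    my $value = $Config{$key};
    $status{$key} = (defined $value && $value eq 'define')
                  ? 'available'
                  : 'missing';
    print "$key: $status{$key}\n";
}
```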

The filesystem of Symbian devices uses DOSish syntax, "drives" separated from paths by a colon, and backslashes for the path. The exact assignment of the drives probably varies between platforms, but for example in Series 60 you might see C: as the (flash) main memory, D: as the RAM drive, E: as the memory card (MMC), Z: as the ROM. In Series 80 D: is the memory card. As far as the devices go, NUL: is the bit bucket, the COMx: are the serial lines, IRCOMx: are the IR ports, and TMP: might be C:\System\Temp. Remember to double those backslashes in doublequoted strings.

The Perl DLL is installed in \System\Libs\. The Perl libraries and extension DLLs are installed in \System\Libs\Perl\X.Y.Z\. The PerlApp is installed in \System\Apps\, and the SIS also installs a couple of demo scripts in \Perl\ (C:\Mydocs\Perl\ on Nokia 7710).

Note that the Symbian filesystem is very picky: it strongly prefers the \ instead of the /.
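The advice about doubling backslashes can be illustrated with a short sketch. The path E:\Perl\hello.pl is a made-up example, not a file the port installs.

```perl
use strict;
use warnings;

# In a doublequoted string every backslash must be doubled ...
my $double = "E:\\Perl\\hello.pl";    # hypothetical script on the memory card

# ... while a single-quoted string keeps these backslashes as written.
my $single = 'E:\Perl\hello.pl';

print $double eq $single ? "same path: $single\n" : "paths differ\n";
```

Single-quoted strings are often the less error-prone choice for literal Symbian paths, since only \\ and \' are treated specially there.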

When doing XS / Symbian C++ programming include first the Symbian headers, then any standard C/POSIX headers, then Perl headers, and finally any application headers.

New() and Copy() are unfortunately used by both Symbian and Perl code so you'll have to play cpp games if you need them. PerlBase.h undefines the Perl definitions and redefines them as PerlNew() and PerlCopy().

TO DO

Lots. See symbian/TODO.

WARNING

As of Perl Symbian port version 0.4.1, no part of Perl's standard regression test suite has been run on a real Symbian device using the ported Perl, so innumerable bugs may lie in wait. Therefore there is absolutely no warranty.

NOTE

When creating and extending application programming interfaces (APIs) for Symbian or Series 60 or Series 80 or Series 90 it is suggested that trademarks, registered trademarks, or trade names are not used in the API names. Instead, developers should consider basing the API naming in the existing (C++, or maybe Java) public component and API naming, modified as appropriate by the rules of the programming language the new APIs are for.

Nokia is a registered trademark of Nokia Corporation. Nokia's product names are trademarks or registered trademarks of Nokia. Other product and company names mentioned herein may be trademarks or trade names of their respective owners.

AUTHOR

Jarkko Hietaniemi

COPYRIGHT

Copyright (c) 2004-2005 Nokia. All rights reserved.

Copyright (c) 2006-2007 Jarkko Hietaniemi.

LICENSE

The Symbian port is licensed under the same terms as Perl itself.

HISTORY

  • 0.1.0: April 2005

    (This will show as "0.01" in the Symbian Installer.)

      - The console window is a very simple console indeed: one can
        get the newline with "000" and the "C" button is a backspace.
        Do not expect a terminal capable of vt100 or ANSI sequences.
        The console is also "ASCII", you cannot input e.g. any accented
        letters. Because of obvious physical constraints the console is
        also very small: (in Nokia 6600) 22 columns, 17 rows.
      - The following libraries are available:
        AnyDBM_File AutoLoader base Carp Config Cwd constant
        DynaLoader Exporter File::Spec integer lib strict Symbol
        vars warnings XSLoader
      - The following extensions are available:
        attributes Compress::Zlib Cwd Data::Dumper Devel::Peek Digest::MD5 DynaLoader
        Fcntl File::Glob Filter::Util::Call IO List::Util MIME::Base64
        PerlIO::scalar PerlIO::via SDBM_File Socket Storable Time::HiRes
      - The following extensions are missing for various technical reasons:
        B ByteLoader Devel::DProf Devel::PPPort Encode GDBM_File
        I18N::Langinfo IPC::SysV NDBM_File Opcode PerlIO::encoding POSIX
        re Safe Sys::Hostname Sys::Syslog
        threads threads::shared Unicode::Normalize
      - Using MakeMaker or the Module::* to build and install modules
        is not supported.
      - Building XS other than the ones in the core is not supported.

    Since this is a 0.something release, any future releases are almost guaranteed to be binary incompatible. As a sign of this the Symbian symbol exports are kept unfrozen and the .def files fully rebuilt every time.

  • 0.2.0: October 2005

      - Perl 5.9.3 (patch level 25741)
      - Compress::Zlib and IO::Zlib supported
      - sisify.pl added

    We maintain the binary incompatibility.

  • 0.3.0: October 2005

      - Perl 5.9.3 (patch level 25911)
      - Series 80 2.0 and UIQ 2.1 support

    We maintain the binary incompatibility.

  • 0.4.0: November 2005

      - Perl 5.9.3 (patch level 26052)
      - adding a sample Symbian extension

    We maintain the binary incompatibility.

  • 0.4.1: December 2006

      - Perl 5.9.5-to-be (patch level 30002)
      - added extensions: Compress/Raw/Zlib, Digest/SHA,
        Hash/Util, Math/BigInt/FastCalc, Text/Soundex, Time/Piece
      - port to S90 1.1 by alexander smishlajev

    We maintain the binary incompatibility.

  • 0.4.2: March 2007

      - catchup with Perl 5.9.5-to-be (patch level 30812)
      - tested to build with Microsoft Visual C++ 2005 Express Edition
        (which uses Microsoft Visual C 8, instead of the old VC6),
        SDK used for testing S60_2nd_FP3 aka 8.1a

    We maintain the binary incompatibility.

 
perlsyn

Perl 5 version 18.2 documentation

NAME

perlsyn - Perl syntax

DESCRIPTION

A Perl program consists of a sequence of declarations and statements which run from the top to the bottom. Loops, subroutines, and other control structures allow you to jump around within the code.

Perl is a free-form language: you can format and indent it however you like. Whitespace serves mostly to separate tokens, unlike languages like Python where it is an important part of the syntax, or Fortran where it is immaterial.

Many of Perl's syntactic elements are optional. Rather than requiring you to put parentheses around every function call and declare every variable, you can often leave such explicit elements off and Perl will figure out what you meant. This is known as Do What I Mean, abbreviated DWIM. It allows programmers to be lazy and to code in a style with which they are comfortable.

Perl borrows syntax and concepts from many languages: awk, sed, C, Bourne Shell, Smalltalk, Lisp and even English. Other languages have borrowed syntax from Perl, particularly its regular expression extensions. So if you have programmed in another language you will see familiar pieces in Perl. They often work the same, but see perltrap for information about how they differ.

Declarations

The only things you need to declare in Perl are report formats and subroutines (and sometimes not even subroutines). A scalar variable holds the undefined value (undef) until it has been assigned a defined value, which is anything other than undef. When used as a number, undef is treated as 0 ; when used as a string, it is treated as the empty string, "" ; and when used as a reference that isn't being assigned to, it is treated as an error. If you enable warnings, you'll be notified of an uninitialized value whenever you treat undef as a string or a number. Well, usually. Boolean contexts, such as:

    if ($a) {}

are exempt from warnings (because they care about truth rather than definedness). Operators such as ++ , -- , += , -= , and .= , that operate on undefined variables such as:

    undef $a;
    $a++;

are also always exempt from such warnings.
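The warning rules above can be checked with a short sketch that collects warnings through a __WARN__ handler. The variable names are invented for the example.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };    # collect instead of printing

my $flag;
my $branch = $flag ? 'true' : 'false';    # boolean context: no warning

my $count;
$count++;                                 # ++ on undef: no warning

my $text;
$text .= 'abc';                           # .= on undef: no warning

my $unset;
my $sum = $unset + 1;                     # undef used as a number: one warning

print "collected ", scalar(@warnings), " warning(s)\n";
print $warnings[0] if @warnings;
```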

A declaration can be put anywhere a statement can, but has no effect on the execution of the primary sequence of statements: declarations all take effect at compile time. All declarations are typically put at the beginning or the end of the script. However, if you're using lexically-scoped private variables created with my(), state(), or our(), you'll have to make sure your format or subroutine definition is within the same block scope as the my if you expect to be able to access those private variables.

Declaring a subroutine allows a subroutine name to be used as if it were a list operator from that point forward in the program. You can declare a subroutine without defining it by saying sub name , thus:

    sub myname;
    $me = myname $0 or die "can't get myname";

A bare declaration like that declares the function to be a list operator, not a unary operator, so you have to be careful to use parentheses (or use or instead of ||). The || operator binds too tightly to use after list operators; it becomes part of the last element. You can always use parentheses around the list operator's arguments to turn the list operator back into something that behaves more like a function call. Alternatively, you can use the prototype ($) to turn the subroutine into a unary operator:

    sub myname ($);
    $me = myname $0 || die "can't get myname";

That now parses as you'd expect, but you still ought to get in the habit of using parentheses in that situation. For more on prototypes, see perlsub.
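The difference between the two parses can be made visible with a sketch in which both subroutines simply report the argument they received. The names list_op and unary_op are hypothetical.

```perl
use strict;
use warnings;

sub list_op      { return "got:$_[0]" }    # no prototype: a list operator
sub unary_op ($) { return "got:$_[0]" }    # ($) prototype: a unary operator

# Parsed as list_op(0 || 'fallback'): the || is swallowed into the argument.
my $from_list = list_op 0 || 'fallback';

# Parsed as unary_op(0) || 'fallback': the || applies to the return value.
my $from_unary = unary_op 0 || 'fallback';

print "list operator:  $from_list\n";
print "unary operator: $from_unary\n";
```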

Subroutine declarations can also be loaded up with the require statement or both loaded and imported into your namespace with a use statement. See perlmod for details on this.

A statement sequence may contain declarations of lexically-scoped variables, but apart from declaring a variable name, the declaration acts like an ordinary statement, and is elaborated within the sequence of statements as if it were an ordinary statement. That means it actually has both compile-time and run-time effects.

Comments

Text from a "#" character until the end of the line is a comment, and is ignored. Exceptions include "#" inside a string or regular expression.

Simple Statements

The only kind of simple statement is an expression evaluated for its side-effects. Every simple statement must be terminated with a semicolon, unless it is the final statement in a block, in which case the semicolon is optional. But put the semicolon in anyway if the block takes up more than one line, because you may eventually add another line. Note that there are operators like eval {} , sub {} , and do {} that look like compound statements, but aren't--they're just TERMs in an expression--and thus need an explicit termination when used as the last item in a statement.

Truth and Falsehood

The number 0, the strings '0' and "" , the empty list () , and undef are all false in a boolean context. All other values are true. Negation of a true value by ! or not returns a special false value. When evaluated as a string it is treated as "" , but as a number, it is treated as 0. Most Perl operators that return true or false behave this way.
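A brief sketch of these rules follows. The sample values in @true are chosen for illustration: they look "empty" or "zero-like" but are nevertheless true.

```perl
use strict;
use warnings;

my @false = (0, '0', '', undef);                # the false scalar values
my @true  = ('0.0', '00', ' ', 'false', -1);    # all true, despite appearances

my @unexpectedly_true  = grep {  $_ } @false;   # expected to stay empty
my @unexpectedly_false = grep { !$_ } @true;    # expected to stay empty

# The special false value returned by negation:
my $false     = !1;
my $as_string = "$false";     # the empty string
my $as_number = $false + 0;   # zero, and no "isn't numeric" warning

print "string form is empty\n" if $as_string eq '';
print "numeric form is $as_number\n";
```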

Statement Modifiers

Any simple statement may optionally be followed by a SINGLE modifier, just before the terminating semicolon (or block ending). The possible modifiers are:

    if EXPR
    unless EXPR
    while EXPR
    until EXPR
    for LIST
    foreach LIST
    when EXPR

The EXPR following the modifier is referred to as the "condition". Its truth or falsehood determines how the modifier will behave.

if executes the statement once if and only if the condition is true. unless is the opposite: it executes the statement unless the condition is true (that is, if the condition is false).

    print "Basset hounds got long ears" if length $ear >= 10;
    go_outside() and play() unless $is_raining;

The for(each) modifier is an iterator: it executes the statement once for each item in the LIST (with $_ aliased to each item in turn).

    print "Hello $_!\n" for qw(world Dolly nurse);
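Because $_ is an alias rather than a copy, a statement under the for modifier can change the list elements in place. A minimal sketch, with an invented @prices array:

```perl
use strict;
use warnings;

my @prices = (1, 2, 3);

# $_ is aliased to each element in turn, so this doubles the array in place.
$_ *= 2 for @prices;

print "@prices\n";    # prints "2 4 6"
```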

while repeats the statement while the condition is true. until does the opposite: it repeats the statement until the condition is true (or while the condition is false):

    # Both of these count from 0 to 10.
    print $i++ while $i <= 10;
    print $j++ until $j > 10;

The while and until modifiers have the usual "while loop" semantics (conditional evaluated first), except when applied to a do-BLOCK (or to the Perl4 do-SUBROUTINE statement), in which case the block executes once before the conditional is evaluated.

This is so that you can write loops like:

    do {
        $line = <STDIN>;
        ...
    } until !defined($line) || $line eq ".\n"
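The difference between a plain statement and a do-BLOCK under the while modifier can be seen in a sketch where the condition is false from the start. The counter names are invented for the example.

```perl
use strict;
use warnings;

my $n = 0;    # the condition "$n > 0" is false from the beginning

my $plain_runs = 0;
$plain_runs++ while $n > 0;           # condition checked first: never runs

my $block_runs = 0;
do { $block_runs++ } while $n > 0;    # block runs once before the check

print "plain statement ran $plain_runs time(s)\n";
print "do-BLOCK ran $block_runs time(s)\n";
```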

See do. Note also that the loop control statements described later will NOT work in this construct, because modifiers don't take loop labels. Sorry. You can always put another block inside of it (for next) or around it (for last) to do that sort of thing. For next, just double the braces:

    do {{
        next if $x == $y;
        # do something here
    }} until $x++ > $z;

For last, you have to be more elaborate:

    LOOP: {
        do {
            last if $x == $y**2;
            # do something here
        } while $x++ <= $z;
    }
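Both workarounds can be exercised with concrete numbers. The sketch below is a self-contained variation on the two fragments above, using invented counters and arrays.

```perl
use strict;
use warnings;

# "next" inside do-until: doubled braces give next an inner block to leave.
my @kept;
my $x = 0;
do {{
    next if $x == 2;       # skip the rest of this pass only
    push @kept, $x;
}} until $x++ >= 4;

# "last" inside do-while: an outer labelled block gives last something to exit.
my @before;
my $y = 0;
LOOP: {
    do {
        last LOOP if $y == 3;    # leave the whole construct
        push @before, $y;
    } while $y++ < 10;
}

print "kept:   @kept\n";      # prints "kept:   0 1 3 4"
print "before: @before\n";    # prints "before: 0 1 2"
```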

NOTE: The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ... ) is undefined. The value of the my variable may be undef, any previously assigned value, or possibly anything else. Don't rely on it. Future versions of perl might do something different from the version of perl you try it out on. Here be dragons.

The when modifier is an experimental feature that first appeared in Perl 5.14. To use it, you should include a use v5.14 declaration. (Technically, it requires only the switch feature, but that aspect of it was not available before 5.14.) Operative only from within a foreach loop or a given block, it executes the statement only if the smartmatch $_ ~~ EXPR is true. If the statement executes, it is followed by a next from inside a foreach and break from inside a given.

Under the current implementation, the foreach loop can be anywhere within the when modifier's dynamic scope, but must be within the given block's lexical scope. This restriction may be relaxed in a future release. See Switch Statements below.

Compound Statements

In Perl, a sequence of statements that defines a scope is called a block. Sometimes a block is delimited by the file containing it (in the case of a required file, or the program as a whole), and sometimes a block is delimited by the extent of a string (in the case of an eval).

But generally, a block is delimited by curly brackets, also known as braces. We will call this syntactic construct a BLOCK.

The following compound statements may be used to control flow:

    if (EXPR) BLOCK
    if (EXPR) BLOCK else BLOCK
    if (EXPR) BLOCK elsif (EXPR) BLOCK ...
    if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK

    unless (EXPR) BLOCK
    unless (EXPR) BLOCK else BLOCK
    unless (EXPR) BLOCK elsif (EXPR) BLOCK ...
    unless (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK

    given (EXPR) BLOCK

    LABEL while (EXPR) BLOCK
    LABEL while (EXPR) BLOCK continue BLOCK

    LABEL until (EXPR) BLOCK
    LABEL until (EXPR) BLOCK continue BLOCK

    LABEL for (EXPR; EXPR; EXPR) BLOCK
    LABEL for VAR (LIST) BLOCK
    LABEL for VAR (LIST) BLOCK continue BLOCK

    LABEL foreach (EXPR; EXPR; EXPR) BLOCK
    LABEL foreach VAR (LIST) BLOCK
    LABEL foreach VAR (LIST) BLOCK continue BLOCK

    LABEL BLOCK
    LABEL BLOCK continue BLOCK

    PHASE BLOCK

The experimental given statement is not automatically enabled; see Switch Statements below for how to do so, and the attendant caveats.

Unlike in C and Pascal, in Perl these are all defined in terms of BLOCKs, not statements. This means that the curly brackets are required--no dangling statements allowed. If you want to write conditionals without curly brackets, there are several other ways to do it. The following all do the same thing:

    if (!open(FOO)) { die "Can't open $FOO: $!" }
    die "Can't open $FOO: $!" unless open(FOO);
    open(FOO) || die "Can't open $FOO: $!";
    open(FOO) ? () : die "Can't open $FOO: $!";
        # a bit exotic, that last one

The if statement is straightforward. Because BLOCKs are always bounded by curly brackets, there is never any ambiguity about which if an else goes with. If you use unless in place of if , the sense of the test is reversed. Like if , unless can be followed by else . unless can even be followed by one or more elsif statements, though you may want to think twice before using that particular language construct, as everyone reading your code will have to think at least twice before they can understand what's going on.

The while statement executes the block as long as the expression is true. The until statement executes the block as long as the expression is false. The LABEL is optional, and if present, consists of an identifier followed by a colon. The LABEL identifies the loop for the loop control statements next, last, and redo. If the LABEL is omitted, the loop control statement refers to the innermost enclosing loop. This may include dynamically looking back your call-stack at run time to find the LABEL. Such desperate behavior triggers a warning if you use the use warnings pragma or the -w flag.

If there is a continue BLOCK, it is always executed just before the conditional is about to be evaluated again. Thus it can be used to increment a loop variable, even when the loop has been continued via the next statement.
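A small sketch shows why the continue BLOCK matters when next is involved: without it, skipping an iteration would also skip the increment and the loop would never end. The variable names are invented for the example.

```perl
use strict;
use warnings;

my @odd;
my $i = 0;

while ($i < 6) {
    next unless $i % 2;    # skip even numbers ...
    push @odd, $i;
} continue {
    $i++;                  # ... but still advance the counter on every pass
}

print "@odd\n";    # prints "1 3 5"
```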

When a block is preceded by a compilation phase keyword such as BEGIN, END, INIT, CHECK, or UNITCHECK, then the block will run only during the corresponding phase of execution. See perlmod for more details.

Extension modules can also hook into the Perl parser to define new kinds of compound statements. These are introduced by a keyword which the extension recognizes, and the syntax following the keyword is defined entirely by the extension. If you are an implementor, see PL_keyword_plugin in perlapi for the mechanism. If you are using such a module, see the module's documentation for details of the syntax that it defines.

Loop Control

The next command starts the next iteration of the loop:

    LINE: while (<STDIN>) {
        next LINE if /^#/;    # discard comments
        ...
    }

The last command immediately exits the loop in question. The continue block, if any, is not executed:

    LINE: while (<STDIN>) {
        last LINE if /^$/;    # exit when done with header
        ...
    }

The redo command restarts the loop block without evaluating the conditional again. The continue block, if any, is not executed. This command is normally used by programs that want to lie to themselves about what was just input.

For example, when processing a file like /etc/termcap, your input lines might end in backslashes to indicate continuation. In that case you want to skip ahead and get the next record.

  1. while (<>) {
  2. chomp;
  3. if (s/\\$//) {
  4. $_ .= <>;
  5. redo unless eof();
  6. }
  7. # now process $_
  8. }

which is Perl shorthand for the more explicitly written version:

    LINE: while (defined($line = <ARGV>)) {
        chomp($line);
        if ($line =~ s/\\$//) {
            $line .= <ARGV>;
            redo LINE unless eof();    # not eof(ARGV)!
        }
        # now process $line
    }

Note that if there were a continue block on the above code, it would be executed only on iterations that run to the end of the loop body, and not on those restarted by redo (since redo skips the continue block). A continue block is often used to reset line counters or m?pat? one-time matches:

    # inspired by :1,$g/fred/s//WILMA/
    while (<>) {
        m?(fred)? && s//WILMA $1 WILMA/;
        m?(barney)? && s//BETTY $1 BETTY/;
        m?(homer)? && s//MARGE $1 MARGE/;
    } continue {
        print "$ARGV $.: $_";
        close ARGV if eof;    # reset $.
        reset if eof;         # reset ?pat?
    }

If the word while is replaced by the word until , the sense of the test is reversed, but the conditional is still tested before the first iteration.
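A minimal sketch of the reversed test, assuming a hypothetical countdown helper: the until condition is checked before the first iteration, so a starting value that already satisfies it means the body never runs.

```perl
use strict;
use warnings;

# Hypothetical helper: count down from $start, returning the values visited.
sub countdown {
    my ($start) = @_;
    my @seen;
    my $n = $start;
    until ($n <= 0) {           # same as: while (!($n <= 0))
        push @seen, $n;
        $n--;
    }
    return @seen;
}

my @three = countdown(3);
my @none  = countdown(0);
print "@three\n";               # prints 3 2 1
print scalar(@none), "\n";      # prints 0, since the body never ran
```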

Loop control statements don't work in an if or unless, since they aren't loops. You can double the braces to make them such, though.

    if (/pattern/) {{
        last if /fred/;
        next if /barney/;    # same effect as "last",
                             # but doesn't document as well
        # do something here
    }}

This works because a block by itself acts as a loop that executes once; see Basic BLOCKs.

The form while/if BLOCK BLOCK, available in Perl 4, is no longer available. Replace any occurrence of if BLOCK by if (do BLOCK).

For Loops

Perl's C-style for loop works like the corresponding while loop; that means that this:

    for ($i = 1; $i < 10; $i++) {
        ...
    }

is the same as this:

    $i = 1;
    while ($i < 10) {
        ...
    } continue {
        $i++;
    }

There is one minor difference: if variables are declared with my in the initialization section of the for , the lexical scope of those variables is exactly the for loop (the body of the loop and the control sections).
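The scoping difference can be seen in a short sketch. The helper name sum_below is hypothetical; the point is only that the $i declared in the initialization section is a separate lexical from the $i declared outside the loop.

```perl
use strict;
use warnings;

# Hypothetical helper: sum the integers from 1 to $limit - 1.
sub sum_below {
    my ($limit) = @_;
    my $i   = "outer";          # a different variable from the loop's $i
    my $sum = 0;
    for (my $i = 1; $i < $limit; $i++) {    # this $i exists only for the loop
        $sum += $i;
    }
    return ($sum, $i);          # $i here is the outer one again
}

my ($sum, $after) = sum_below(4);
print "$sum $after\n";          # prints 6 outer
```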

Besides the normal array index looping, for can lend itself to many other interesting applications. Here's one that avoids the problem you get into if you explicitly test for end-of-file on an interactive file descriptor causing your program to appear to hang.

    $on_a_tty = -t STDIN && -t STDOUT;
    sub prompt { print "yes? " if $on_a_tty }
    for ( prompt(); <STDIN>; prompt() ) {
        # do something
    }

Using readline (or the operator form, <EXPR>) as the conditional of a for loop is shorthand for the following. This behaviour is the same as a while loop conditional.

    for ( prompt(); defined( $_ = <STDIN> ); prompt() ) {
        # do something
    }
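The implicit defined() test matters when a line of input is itself a false value. The sketch below reads from an in-memory filehandle whose last line is "0" with no trailing newline; the helper name read_lines is hypothetical, and an explicit assignment to a lexical is used in place of $_.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $text, read through a filehandle.
sub read_lines {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
    my @lines;
    for (; my $line = <$fh>; ) {    # tested as defined(my $line = <$fh>)
        chomp $line;
        push @lines, $line;
    }
    close $fh;
    return @lines;
}

my @got = read_lines("a\n0");
print scalar(@got), "\n";           # prints 2, so the final "0" line is not lost
```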

Foreach Loops

The foreach loop iterates over a normal list value and sets the variable VAR to be each element of the list in turn. If the variable is preceded with the keyword my, then it is lexically scoped, and is therefore visible only within the loop. Otherwise, the variable is implicitly local to the loop and regains its former value upon exiting the loop. If the variable was previously declared with my, it uses that variable instead of the global one, but it's still localized to the loop. This implicit localization occurs only in a foreach loop.
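A minimal sketch of the implicit localization, assuming a hypothetical package variable $count and helper loop_with_global: the global is used as the loop variable and gets its old value back once the loop exits.

```perl
use strict;
use warnings;

our $count = "before";          # package variable whose value should survive

# Hypothetical helper: iterate using the global $count as the loop variable.
sub loop_with_global {
    my @seen;
    for $count (1 .. 3) {       # $count is implicitly localized to the loop
        push @seen, $count;
    }
    return @seen;
}

my @seen = loop_with_global();
print "@seen / $count\n";       # prints 1 2 3 / before
```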

The foreach keyword is actually a synonym for the for keyword, so you can use either. If VAR is omitted, $_ is set to each value.

If any element of LIST is an lvalue, you can modify it by modifying VAR inside the loop. Conversely, if any element of LIST is NOT an lvalue, any attempt to modify that element will fail. In other words, the foreach loop index variable is an implicit alias for each item in the list that you're looping over.
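Both halves of the aliasing rule can be sketched as follows. The helper names double_all and try_modify_constants are hypothetical; the second wraps the loop in eval so that the resulting error can be inspected rather than ending the program.

```perl
use strict;
use warnings;

# Hypothetical helper: double every element of the referenced array in place.
sub double_all {
    my ($numbers) = @_;
    for my $n (@$numbers) {
        $n *= 2;                    # $n aliases the array element itself
    }
    return;
}

# Hypothetical helper: try to modify literal constants through the alias.
sub try_modify_constants {
    my $ok = eval {
        for my $n (1, 2) { $n++ }   # literals are not lvalues
        1;
    };
    return $ok ? "" : $@;
}

my @prices = (10, 20, 30);
double_all(\@prices);
print "@prices\n";                  # prints 20 40 60
print try_modify_constants();       # prints a "Modification of a read-only value" error
```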

If any part of LIST is an array, foreach will get very confused if you add or remove elements within the loop body, for example with splice. So don't do that.

foreach probably won't do what you expect if VAR is a tied or other special variable. Don't do that either.

Examples:

    for (@ary) { s/foo/bar/ }

    for my $elem (@elements) {
        $elem *= 2;
    }

    for $count (reverse(1..10), "BOOM") {
        print $count, "\n";
        sleep(1);
    }

    for (1..15) { print "Merry Christmas\n"; }

    foreach $item (split(/:[\\\n:]*/, $ENV{TERMCAP})) {
        print "Item: $item\n";
    }

Here's how a C programmer might code up a particular algorithm in Perl:

    for (my $i = 0; $i < @ary1; $i++) {
        for (my $j = 0; $j < @ary2; $j++) {
            if ($ary1[$i] > $ary2[$j]) {
                last;    # can't go to outer :-(
            }
            $ary1[$i] += $ary2[$j];
        }
        # this is where that last takes me
    }

Whereas here's how a Perl programmer more comfortable with the idiom might do it:

    OUTER: for my $wid (@ary1) {
        INNER: for my $jet (@ary2) {
            next OUTER if $wid > $jet;
            $wid += $jet;
        }
    }

See how much easier this is? It's cleaner, safer, and faster. It's cleaner because it's less noisy. It's safer because if code gets added between the inner and outer loops later on, the new code won't be accidentally executed. The next explicitly iterates the outer loop rather than merely terminating the inner one. And it's faster because Perl executes a foreach statement more rapidly than it would the equivalent for loop.

Basic BLOCKs

A BLOCK by itself (labeled or not) is semantically equivalent to a loop that executes once. Thus you can use any of the loop control statements in it to leave or restart the block. (Note that this is NOT true in eval{}, sub{}, or contrary to popular belief do{} blocks, which do NOT count as loops.) The continue block is optional.
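Because a bare block counts as a loop that runs once, redo can restart it. The sketch below, with a hypothetical helper halve_until_small, uses that to repeat a step until a condition holds, without writing an explicit loop keyword.

```perl
use strict;
use warnings;

# Hypothetical helper: halve $n (rounding down) until it is at most 10.
sub halve_until_small {
    my ($n) = @_;
    my $steps = 0;
    {
        if ($n > 10) {
            $n = int($n / 2);
            $steps++;
            redo;               # restart the bare block from the top
        }
    }
    return ($n, $steps);
}

my ($value, $steps) = halve_until_small(100);
print "$value after $steps steps\n";    # prints 6 after 4 steps
```

An ordinary while loop would usually be clearer here; the sketch only demonstrates that loop control statements are accepted in a bare block.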

The BLOCK construct can be used to emulate case structures.

    SWITCH: {
        if (/^abc/) { $abc = 1; last SWITCH; }
        if (/^def/) { $def = 1; last SWITCH; }
        if (/^xyz/) { $xyz = 1; last SWITCH; }
        $nothing = 1;
    }

You'll also find the foreach loop used to create a topicalizer and a switch:

    SWITCH:
    for ($var) {
        if (/^abc/) { $abc = 1; last SWITCH; }
        if (/^def/) { $def = 1; last SWITCH; }
        if (/^xyz/) { $xyz = 1; last SWITCH; }
        $nothing = 1;
    }

Such constructs are quite frequently used, both because older versions of Perl had no official switch statement, and also because the new version described immediately below remains experimental and can sometimes be confusing.

Switch Statements

Starting from Perl 5.10.1 (well, 5.10.0, but it didn't work right), you can say

    use feature "switch";

to enable an experimental switch feature. This is loosely based on an old version of a Perl 6 proposal, but it no longer resembles the Perl 6 construct. You also get the switch feature whenever you declare that your code prefers to run under a version of Perl that is 5.10 or later. For example:

    use v5.14;

Under the "switch" feature, Perl gains the experimental keywords given, when, default, continue, and break. Starting from Perl 5.16, one can prefix the switch keywords with CORE:: to access the feature without a use feature statement. The keywords given and when are analogous to switch and case in other languages, so the code in the previous section could be rewritten as

    use v5.10.1;
    for ($var) {
        when (/^abc/) { $abc = 1 }
        when (/^def/) { $def = 1 }
        when (/^xyz/) { $xyz = 1 }
        default { $nothing = 1 }
    }

The foreach is the non-experimental way to set a topicalizer. If you wish to use the highly experimental given, that could be written like this:

    use v5.10.1;
    given ($var) {
        when (/^abc/) { $abc = 1 }
        when (/^def/) { $def = 1 }
        when (/^xyz/) { $xyz = 1 }
        default { $nothing = 1 }
    }

As of 5.14, that can also be written this way:

    use v5.14;
    for ($var) {
        $abc = 1 when /^abc/;
        $def = 1 when /^def/;
        $xyz = 1 when /^xyz/;
        default { $nothing = 1 }
    }

Or if you don't care to play it safe, like this:

    use v5.14;
    given ($var) {
        $abc = 1 when /^abc/;
        $def = 1 when /^def/;
        $xyz = 1 when /^xyz/;
        default { $nothing = 1 }
    }

The arguments to given and when are in scalar context, and given assigns the $_ variable its topic value.

Exactly what the EXPR argument to when does is hard to describe precisely, but in general, it tries to guess what you want done. Sometimes it is interpreted as $_ ~~ EXPR, and sometimes it is not. It also behaves differently when lexically enclosed by a given block than it does when dynamically enclosed by a foreach loop. The rules are far too difficult to understand to be described here. See Experimental Details on given and when later on.

Due to an unfortunate bug in how given was implemented between Perl 5.10 and 5.16, under those implementations the version of $_ governed by given is merely a lexically scoped copy of the original, not a dynamically scoped alias to the original, as it would be if it were a foreach or under both the original and the current Perl 6 language specification. This bug was fixed in Perl 5.18. If you really want a lexical $_, specify that explicitly, but note that my $_ is now deprecated and will warn unless warnings have been disabled:

    given(my $_ = EXPR) { ... }

If your code still needs to run on older versions, stick to foreach for your topicalizer and you will be less unhappy.

Goto

Although not for the faint of heart, Perl does support a goto statement. There are three forms: goto-LABEL, goto-EXPR, and goto-&NAME. A loop's LABEL is not actually a valid target for a goto; it's just the name of the loop.

The goto-LABEL form finds the statement labeled with LABEL and resumes execution there. It may not be used to go into any construct that requires initialization, such as a subroutine or a foreach loop. It also can't be used to go into a construct that is optimized away. It can be used to go almost anywhere else within the dynamic scope, including out of subroutines, but it's usually better to use some other construct such as last or die. The author of Perl has never felt the need to use this form of goto (in Perl, that is--C is another matter).

The goto-EXPR form expects a label name, whose scope will be resolved dynamically. This allows for computed gotos per FORTRAN, but isn't necessarily recommended if you're optimizing for maintainability:

    goto(("FOO", "BAR", "GLARCH")[$i]);

The goto-&NAME form is highly magical, and substitutes a call to the named subroutine for the currently running subroutine. This is used by AUTOLOAD() subroutines that wish to load another subroutine and then pretend that the other subroutine had been called in the first place (except that any modifications to @_ in the current subroutine are propagated to the other subroutine.) After the goto, not even caller() will be able to tell that this routine was called first.
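A minimal sketch of the AUTOLOAD pattern described above. The package name Lazy and the shout_* naming scheme are hypothetical, invented for this example: the first call to a missing shout_* routine installs a real subroutine and then uses goto-&NAME to jump to it with the original @_.

```perl
use strict;
use warnings;

package Lazy;    # hypothetical package name for this sketch

our $AUTOLOAD;

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                  # strip the package qualifier
    return if $name eq 'DESTROY';       # never autoload destructors
    die "No such routine: $name" unless $name =~ /^shout_(\w+)$/;
    my $word = $1;
    no strict 'refs';
    # Install the real subroutine, then jump to it as if it had been
    # called in the first place; @_ is passed along unchanged.
    *{"Lazy::$name"} = sub { return uc("$word @_") };
    goto &{"Lazy::$name"};
}

package main;

print Lazy::shout_hello("world"), "\n";     # prints HELLO WORLD
print defined(&Lazy::shout_hello) ? "installed\n" : "missing\n";    # prints installed
```

After the first call the installed subroutine is found directly, so AUTOLOAD is not consulted again for that name.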

In almost all cases like this, it's usually a far, far better idea to use the structured control flow mechanisms of next, last, or redo instead of resorting to a goto. For certain applications, the catch and throw pair of eval{} and die() for exception processing can also be a prudent approach.
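As one sketch of the eval{} and die() pairing, the hypothetical helper below uses die with a hash reference to unwind out of two nested loops at once, and eval to catch it. The data layout (a list of array references) is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first negative number found in a list of
# array references, or nothing if there is none.
sub first_negative {
    my (@rows) = @_;
    my $completed = eval {
        for my $row (@rows) {
            for my $value (@$row) {
                die +{ value => $value } if $value < 0;    # "throw"
            }
        }
        1;    # reached only if nothing was thrown
    };
    return if $completed;
    my $error = $@;
    return $error->{value} if ref($error) eq 'HASH';       # "catch"
    die $error;                                            # re-throw anything else
}

my $found = first_negative([1, 2], [3, -4, -5]);
print "$found\n";    # prints -4
```

For this particular task a labeled last would do the same job more cheaply; die and eval earn their keep when the unwinding has to cross subroutine boundaries.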

The Ellipsis Statement

Beginning in Perl 5.12, Perl accepts an ellipsis, "...", as a placeholder for code that you haven't implemented yet. This form of ellipsis, the unimplemented statement, should not be confused with the binary flip-flop ... operator. One is a statement and the other an operator. (Perl doesn't usually confuse them because usually Perl can tell whether it wants an operator or a statement, but see below for exceptions.)

When Perl 5.12 or later encounters an ellipsis statement, it parses this without error, but if and when you should actually try to execute it, Perl throws an exception with the text Unimplemented:

    use v5.12;
    sub unimplemented { ... }
    eval { unimplemented() };
    if ($@ =~ /^Unimplemented at /) {
        say "I found an ellipsis!";
    }

You can only use the elliptical statement to stand in for a complete statement. These examples show how the ellipsis works:

    use v5.12;
    { ... }
    sub foo { ... }
    ...;
    eval { ... };
    sub somemeth {
        my $self = shift;
        ...;
    }
    $x = do {
        my $n;
        ...;
        say "Hurrah!";
        $n;
    };

The elliptical statement cannot stand in for an expression that is part of a larger statement, since the ... is also the three-dot version of the flip-flop operator (see Range Operators in perlop).

These examples of attempts to use an ellipsis are syntax errors:

    use v5.12;
    print ...;
    open(my $fh, ">", "/dev/passwd") or ...;
    if ($condition && ... ) { say "Howdy" };

There are some cases where Perl can't immediately tell the difference between an expression and a statement. For instance, the syntax for a block and an anonymous hash reference constructor look the same unless there's something in the braces to give Perl a hint. The ellipsis is a syntax error if Perl doesn't guess that the { ... } is a block. In that case, it doesn't think the ... is an ellipsis because it's expecting an expression instead of a statement:

    @transformed = map { ... } @input;    # syntax error

You can use a ; inside your block to denote that the { ... } is a block and not a hash reference constructor. Now the ellipsis works:

    @transformed = map {; ... } @input;    # ; disambiguates
    @transformed = map { ...; } @input;    # ; disambiguates

Note: Some folks colloquially refer to this bit of punctuation as a "yada-yada" or "triple-dot", but its true name is actually an ellipsis. Perl does not yet accept the Unicode version, U+2026 HORIZONTAL ELLIPSIS, as an alias for ..., but someday it may.

PODs: Embedded Documentation

Perl has a mechanism for intermixing documentation with source code. While it's expecting the beginning of a new statement, if the compiler encounters a line that begins with an equal sign and a word, like this

    =head1 Here There Be Pods!

then that text and all remaining text up through and including a line beginning with =cut will be ignored. The format of the intervening text is described in perlpod.

This allows you to intermix your source code and your documentation text freely, as in

    =item snazzle($)

    The snazzle() function will behave in the most spectacular
    form that you can possibly imagine, not even excepting
    cybernetic pyrotechnics.

    =cut back to the compiler, nuff of this pod stuff!

    sub snazzle($) {
        my $thingie = shift;
        .........
    }

Note that pod translators should look at only paragraphs beginning with a pod directive (it makes parsing easier), whereas the compiler actually knows to look for pod escapes even in the middle of a paragraph. This means that the following secret stuff will be ignored by both the compiler and the translators.

    $a=3;
    =secret stuff
     warn "Neither POD nor CODE!?"
    =cut back
    print "got $a\n";

You probably shouldn't rely upon the warn() being podded out forever. Not all pod translators are well-behaved in this regard, and perhaps the compiler will become pickier.

One may also use pod directives to quickly comment out a section of code.
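A small sketch of that trick follows. The =begin comment and =end comment pair is a common convention rather than a requirement; the compiler skips everything from the first directive through the =cut line. The @log array is a hypothetical stand-in for real work.

```perl
use strict;
use warnings;

my @log;
push @log, "step one";

=begin comment

push @log, "step two";    # skipped: the compiler treats this as pod

=end comment

=cut

push @log, "step three";
print join(", ", @log), "\n";    # prints step one, step three
```

The directives must begin in the first column, at a point where the compiler is expecting a new statement.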

Plain Old Comments (Not!)

Perl can process line directives, much like the C preprocessor. Using this, one can control Perl's idea of filenames and line numbers in error or warning messages (especially for strings that are processed with eval()). The syntax for this mechanism is almost the same as for most C preprocessors: it matches the regular expression

    # example: '# line 42 "new_filename.plx"'
    /^\#   \s*
      line \s+ (\d+)   \s*
      (?:\s("?)([^"]+)\g2)? \s*
     $/x

with $1 being the line number for the next line, and $3 being the optional filename (specified with or without quotes). Note that no whitespace may precede the # , unlike modern C preprocessors.
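One use of the directive is to make errors from generated code point back at the place the code came from. The sketch below assumes a hypothetical helper eval_as and a made-up filename template.tt; it prepends a line directive to a string before passing it to eval.

```perl
use strict;
use warnings;

# Hypothetical helper: compile and run $code as if it came from $file,
# with its first line numbered $line. Returns the result and any error.
sub eval_as {
    my ($file, $line, $code) = @_;
    # The directive must start in the first column, hence the leading newline.
    my $result = eval qq{\n#line $line "$file"\n$code};
    return ($result, $@);
}

my (undef, $error) = eval_as("template.tt", 10, "die 'oops'");
print $error;    # prints: oops at template.tt line 10.
```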

There is a fairly obvious gotcha included with the line directive: Debuggers and profilers will only show the last source line to appear at a particular line number in a given file. Care should be taken not to cause line number collisions in code you'd like to debug later.

Here are some examples that you should be able to type into your command shell:

    % perl
    # line 200 "bzzzt"
    # the '#' on the previous line must be the first char on line
    die 'foo';
    __END__
    foo at bzzzt line 201.

    % perl
    # line 200 "bzzzt"
    eval qq[\n#line 2001 ""\ndie 'foo']; print $@;
    __END__
    foo at - line 2001.

    % perl
    eval qq[\n#line 200 "foo bar"\ndie 'foo']; print $@;
    __END__
    foo at foo bar line 200.

    % perl
    # line 345 "goop"
    eval "\n#line " . __LINE__ . ' "' . __FILE__ ."\"\ndie 'foo'";
    print $@;
    __END__
    foo at goop line 345.

Experimental Details on given and when

As previously mentioned, the "switch" feature is considered highly experimental; it is subject to change with little notice. In particular, when has tricky behaviours that are expected to change to become less tricky in the future. Do not rely upon its current (mis)implementation. Before Perl 5.18, given also had tricky behaviours that you should still beware of if your code must run on older versions of Perl.

Here is a longer example of given:

    use feature ":5.10";
    given ($foo) {
        when (undef) {
            say '$foo is undefined';
        }
        when ("foo") {
            say '$foo is the string "foo"';
        }
        when ([1,3,5,7,9]) {
            say '$foo is an odd digit';
            continue;    # Fall through
        }
        when ($_ < 100) {
            say '$foo is numerically less than 100';
        }
        when (\&complicated_check) {
            say 'a complicated check for $foo is true';
        }
        default {
            die q(I don't know what to do with $foo);
        }
    }

Before Perl 5.18, given(EXPR) assigned the value of EXPR to merely a lexically scoped copy (!) of $_, not a dynamically scoped alias the way foreach does. That made it similar to

    do { my $_ = EXPR; ... }

except that the block was automatically broken out of by a successful when or an explicit break. Because it was only a copy, and because it was only lexically scoped, not dynamically scoped, you could not do the things with it that you are used to in a foreach loop. In particular, it did not work for arbitrary function calls if those functions might try to access $_. Best stick to foreach for that.

Most of the power comes from the implicit smartmatching that can sometimes apply. Most of the time, when(EXPR) is treated as an implicit smartmatch of $_, that is, $_ ~~ EXPR. (See Smartmatch Operator in perlop for more information on smartmatching.) But when EXPR is one of the 10 exceptional cases (or things like them) listed below, it is used directly as a boolean.

1.

A user-defined subroutine call or a method invocation.

2.

A regular expression match in the form of /REGEX/, $foo =~ /REGEX/, or $foo =~ EXPR. Also, a negated regular expression match in the form !/REGEX/, $foo !~ /REGEX/, or $foo !~ EXPR.

3.

A smart match that uses an explicit ~~ operator, such as EXPR ~~ EXPR.

4.

A boolean comparison operator such as $_ < 10 or $x eq "abc". The relational operators that this applies to are the six numeric comparisons (<, >, <=, >=, ==, and !=), and the six string comparisons (lt, gt, le, ge, eq, and ne).

NOTE: You will often have to use $c ~~ $_ because the default case uses $_ ~~ $c , which is frequently the opposite of what you want.

5.

At least the three builtin functions defined(...), exists(...), and eof(...). We might someday add more of these later if we think of them.

6.

A negated expression, whether !(EXPR) or not(EXPR), or a logical exclusive-or, (EXPR1) xor (EXPR2). The bitwise versions (~ and ^) are not included.

7.

A filetest operator, with exactly 4 exceptions: -s, -M, -A, and -C, as these return numerical values, not boolean ones. The -z filetest operator is not included in the exception list.

8.

The .. and ... flip-flop operators. Note that the ... flip-flop operator is completely different from the ... elliptical statement just described.

In those 8 cases above, the value of EXPR is used directly as a boolean, so no smartmatching is done. You may think of when as a smartsmartmatch.

Furthermore, Perl inspects the operands of logical operators to decide whether to use smartmatching for each one by applying the above test to the operands:

9.

If EXPR is EXPR1 && EXPR2 or EXPR1 and EXPR2, the test is applied recursively to both EXPR1 and EXPR2. Only if both operands also pass the test, recursively, will the expression be treated as boolean. Otherwise, smartmatching is used.

10.

If EXPR is EXPR1 || EXPR2, EXPR1 // EXPR2, or EXPR1 or EXPR2, the test is applied recursively to EXPR1 only (which might itself be a higher-precedence AND operator, for example, and thus subject to the previous rule), not to EXPR2. If EXPR1 is to use smartmatching, then EXPR2 also does so, no matter what EXPR2 contains. But if EXPR1 does not get to use smartmatching, then the second argument will not be either. This is quite different from the && case just described, so be careful.

These rules are complicated, but the goal is for them to do what you want (even if you don't quite understand why they are doing it). For example:

    when (/^\d+$/ && $_ < 75) { ... }

will be treated as a boolean match because the rules say both a regex match and an explicit test on $_ will be treated as boolean.

Also:

    when ([qw(foo bar)] && /baz/) { ... }

will use smartmatching because only one of the operands is a boolean: the other uses smartmatching, and that wins.

Further:

    when ([qw(foo bar)] || /^baz/) { ... }

will use smart matching (only the first operand is considered), whereas

    when (/^baz/ || [qw(foo bar)]) { ... }

will test only the regex, which causes both operands to be treated as boolean. Watch out for this one, then, because an arrayref is always a true value, which makes it effectively redundant. Not a good idea.

Tautologous boolean operators are still going to be optimized away. Don't be tempted to write

    when ("foo" or "bar") { ... }

This will optimize down to "foo", so "bar" will never be considered (even though the rules say to use a smartmatch on "foo"). For an alternation like this, an array ref will work, because this will instigate smartmatching:

    when ([qw(foo bar)]) { ... }

This is somewhat equivalent to the C-style switch statement's fallthrough functionality (not to be confused with Perl's fallthrough functionality--see below), wherein the same block is used for several case statements.

Another useful shortcut is that, if you use a literal array or hash as the argument to given, it is turned into a reference. So given(@foo) is the same as given(\@foo), for example.

default behaves exactly like when(1 == 1), which is to say that it always matches.

Breaking out

You can use the break keyword to break out of the enclosing given block. Every when block is implicitly ended with a break.

Fall-through

You can use the continue keyword to fall through from one case to the next:

    given($foo) {
        when (/x/) { say '$foo contains an x'; continue }
        when (/y/) { say '$foo contains a y' }
        default { say '$foo does not contain a y' }
    }

Return value

When a given statement is also a valid expression (for example, when it's the last statement of a block), it evaluates to:

  • An empty list as soon as an explicit break is encountered.

  • The value of the last evaluated expression of the successful when/default clause, if there happens to be one.

  • The value of the last evaluated expression of the given block if no condition is true.

In the last two cases, the last expression is evaluated in the context that was applied to the given block.

Note that, unlike if and unless, failed when statements always evaluate to an empty list.

    my $price = do {
        given ($item) {
            when (["pear", "apple"]) { 1 }
            break when "vote";    # My vote cannot be bought
            1e10 when /Mona Lisa/;
            "unknown";
        }
    };

Currently, given blocks can't always be used as proper expressions. This may be addressed in a future version of Perl.

Switching in a loop

Instead of using given(), you can use a foreach() loop. For example, here's one way to count how many times a particular string occurs in an array:

    use v5.10.1;
    my $count = 0;
    for (@array) {
        when ("foo") { ++$count }
    }
    print "\@array contains $count copies of 'foo'\n";

Or in a more recent version:

    use v5.14;
    my $count = 0;
    for (@array) {
        ++$count when "foo";
    }
    print "\@array contains $count copies of 'foo'\n";

At the end of all when blocks, there is an implicit next. You can override that with an explicit last if you're interested in only the first match.

This doesn't work if you explicitly specify a loop variable, as in for $item (@array). You have to use the default variable $_.

Differences from Perl 6

The Perl 5 smartmatch and given/when constructs are not compatible with their Perl 6 analogues. The most visible and least important difference is that, in Perl 5, parentheses are required around the argument to given() and when() (except when this last one is used as a statement modifier). Parentheses in Perl 6 are always optional in a control construct such as if(), while(), or when(); they can't be made optional in Perl 5 without a great deal of potential confusion, because Perl 5 would parse the expression

    given $foo {
        ...
    }

as though the argument to given were an element of the hash %foo, interpreting the braces as hash-element syntax.

However, there are many, many other differences. For example, this works in Perl 5:

    use v5.12;
    my @primary = ("red", "blue", "green");
    if (@primary ~~ "red") {
        say "primary smartmatches red";
    }
    if ("red" ~~ @primary) {
        say "red smartmatches primary";
    }
    say "that's all, folks!";

But it doesn't work at all in Perl 6. Instead, you should use the (parallelizable) any operator:

    if any(@primary) eq "red" {
        say "primary smartmatches red";
    }
    if "red" eq any(@primary) {
        say "red smartmatches primary";
    }

The table of smartmatches in Smartmatch Operator in perlop is not identical to that proposed by the Perl 6 specification, mainly due to differences between Perl 6's and Perl 5's data models, but also because the Perl 6 spec has changed since Perl 5 rushed into early adoption.

In Perl 6, when() will always do an implicit smartmatch with its argument, while in Perl 5 it is convenient (albeit potentially confusing) to suppress this implicit smartmatch in various rather loosely-defined situations, as roughly outlined above. (The difference is largely because Perl 5 does not have, even internally, a boolean type.)

 

Perl 5 version 18.2 documentation
perlthanks

NAME

perlbug - how to submit bug reports on Perl

SYNOPSIS

perlbug

perlbug [ -v ] [ -a address ] [ -s subject ] [ -b body | -f inputfile ] [ -F outputfile ] [ -r returnaddress ] [ -e editor ] [ -c adminaddress | -C ] [ -S ] [ -t ] [ -d ] [ -A ] [ -h ] [ -T ]

perlbug [ -v ] [ -r returnaddress ] [ -A ] [ -ok | -okay | -nok | -nokay ]

perlthanks

DESCRIPTION

This program is designed to help you generate and send bug reports (and thank-you notes) about perl5 and the modules which ship with it.

In most cases, you can just run it interactively from a command line without any special arguments and follow the prompts.

If you have found a bug with a non-standard port (one that was not part of the standard distribution), a binary distribution, or a non-core module (such as Tk, DBI, etc), then please see the documentation that came with that distribution to determine the correct place to report bugs.

If you are unable to send your report using perlbug (most likely because your system doesn't have a way to send mail that perlbug recognizes), you may be able to use this tool to compose your report and save it to a file which you can then send to perlbug@perl.org using your regular mail client.

In extreme cases, perlbug may not work well enough on your system to guide you through composing a bug report. In those cases, you may be able to use perlbug -d to get system configuration information to include in a manually composed bug report to perlbug@perl.org.

When reporting a bug, please run through this checklist:

  • What version of Perl are you running?

    Type perl -v at the command line to find out.

  • Are you running the latest released version of perl?

    Look at http://www.perl.org/ to find out. If you are not using the latest released version, please try to replicate your bug on the latest stable release.

    Note that reports about bugs in old versions of Perl, especially those which indicate you haven't also tested the current stable release of Perl, are likely to receive less attention from the volunteers who build and maintain Perl than reports about bugs in the current release.

    This tool isn't appropriate for reporting bugs in any version prior to Perl 5.0.

  • Are you sure what you have is a bug?

    A significant number of the bug reports we get turn out to be documented features in Perl. Make sure the issue you've run into isn't intentional by glancing through the documentation that comes with the Perl distribution.

    Given the sheer volume of Perl documentation, this isn't a trivial undertaking, but if you can point to documentation that suggests the behaviour you're seeing is wrong, your issue is likely to receive more attention. You may want to start with perldoc perltrap for pointers to common traps that new (and experienced) Perl programmers run into.

    If you're unsure of the meaning of an error message you've run across, run perldoc perldiag for an explanation. If the message isn't in perldiag, it probably isn't generated by Perl. You may have luck consulting your operating system documentation instead.

    If you are on a non-UNIX platform, see perldoc perlport, as some features may be unimplemented or work differently.

    You may be able to figure out what's going wrong using the Perl debugger. For information about how to use the debugger, see perldoc perldebug.

  • Do you have a proper test case?

    The easier it is to reproduce your bug, the more likely it will be fixed -- if nobody can duplicate your problem, it probably won't be addressed.

    A good test case has most of these attributes: short, simple code; few dependencies on external commands, modules, or libraries; no platform-dependent code (unless it's a platform-specific bug); clear, simple documentation.

    A good test case is almost always a good candidate to be included in Perl's test suite. If you have the time, consider writing your test case so that it can be easily included into the standard test suite.

  • Have you included all relevant information?

    Be sure to include the exact error messages, if any. "Perl gave an error" is not an exact error message.

    If you get a core dump (or equivalent), you may use a debugger (dbx, gdb, etc) to produce a stack trace to include in the bug report.

    NOTE: unless your Perl has been compiled with debug info (often -g), the stack trace is likely to be somewhat hard to use because it will most probably contain only the function names and not their arguments. If possible, recompile your Perl with debug info and reproduce the crash and the stack trace.

  • Can you describe the bug in plain English?

    The easier it is to understand a reproducible bug, the more likely it will be fixed. Any insight you can provide into the problem will help a great deal. In other words, try to analyze the problem (to the extent you can) and report your discoveries.

  • Can you fix the bug yourself?

    A bug report which includes a patch to fix it will almost definitely be fixed. When sending a patch, please use the diff program with the -u option to generate "unified" diff files. Bug reports with patches are likely to receive significantly more attention and interest than those without patches.

    Your patch may be returned with requests for changes, or requests for more detailed explanations about your fix.

    Here are a few hints for creating high-quality patches:

    Make sure the patch is not reversed (the first argument to diff is typically the original file, the second argument your changed file). Test your patch by applying it with the patch program before you send it on its way. Try to follow the same style as the code you are trying to patch. Make sure your patch really does work (run make test if the thing you're patching is covered by Perl's test suite).

  • Can you use perlbug to submit the report?

    perlbug will, amongst other things, ensure your report includes crucial information about your version of perl. If perlbug is unable to mail your report after you have typed it in, you may have to compose the message yourself, add the output produced by perlbug -d and email it to perlbug@perl.org. If, for some reason, you cannot run perlbug at all on your system, be sure to include the entire output produced by running perl -V (note the uppercase V).

    Whether you use perlbug or send the email manually, please make your Subject line informative. "a bug" is not informative. Neither is "perl crashes" nor is "HELP!!!". These don't help. A compact description of what's wrong is fine.

  • Can you use perlbug to submit a thank-you note?

    Yes, you can do this either by using the -T option or by invoking the program as perlthanks. Thank-you notes are good. They make people smile.

Having done your bit, please be prepared to wait, to be told the bug is in your code, or possibly to get no reply at all. The volunteers who maintain Perl are busy folks, so if your problem is an obvious bug in your own code, is difficult to understand or is a duplicate of an existing report, you may not receive a personal reply.

If it is important to you that your bug be fixed, do monitor the perl5-porters@perl.org mailing list and the commit logs to development versions of Perl, and encourage the maintainers with kind words or offers of frosty beverages. (Please do be kind to the maintainers. Harassing or flaming them is likely to have the opposite effect of the one you want.)

Feel free to update the ticket about your bug on http://rt.perl.org if a new version of Perl is released and your bug is still present.

OPTIONS

  • -a

    Address to send the report to. Defaults to perlbug@perl.org.

  • -A

    Don't send a bug-received acknowledgement to the reply address. Generally it is only sensible to use this option if you are a perl maintainer actively watching perl porters for your message to arrive.

  • -b

    Body of the report. If not included on the command line, or in a file with -f, you will get a chance to edit the message.

  • -C

    Don't send copy to administrator.

  • -c

    Address to send copy of report to. Defaults to the address of the local perl administrator (recorded when perl was built).

  • -d

    Data mode (the default if you redirect or pipe output). This prints out your configuration data, without mailing anything. You can use this with -v to get more complete data.

  • -e

    Editor to use.

  • -f

    File containing the body of the report. Use this to quickly send a prepared message.

  • -F

    File to output the results to instead of sending as an email. Useful particularly when running perlbug on a machine with no direct internet connection.

  • -h

    Prints a brief summary of the options.

  • -ok

    Report successful build on this system to perl porters. Forces -S and -C. Forces and supplies values for -s and -b. Only prompts for a return address if it cannot guess it (for use with make). Honors return address specified with -r. You can use this with -v to get more complete data. Only makes a report if this system is less than 60 days old.

  • -okay

    As -ok except it will report on older systems.

  • -nok

    Report unsuccessful build on this system. Forces -C. Forces and supplies a value for -s, then requires you to edit the report and say what went wrong. Alternatively, a prepared report may be supplied using -f. Only prompts for a return address if it cannot guess it (for use with make). Honors return address specified with -r. You can use this with -v to get more complete data. Only makes a report if this system is less than 60 days old.

  • -nokay

    As -nok except it will report on older systems.

  • -r

    Your return address. The program will ask you to confirm its default if you don't use this option.

  • -S

    Send without asking for confirmation.

  • -s

    Subject to include with the message. You will be prompted if you don't supply one on the command line.

  • -t

    Test mode. The target address defaults to perlbug-test@perl.org.

  • -T

    Send a thank-you note instead of a bug report.

  • -v

    Include verbose configuration data in the report.

AUTHORS

Kenneth Albanowski (<kjahds@kjahds.com>), subsequently doctored by Gurusamy Sarathy (<gsar@activestate.com>), Tom Christiansen (<tchrist@perl.com>), Nathan Torkington (<gnat@frii.com>), Charles F. Randall (<cfr@pobox.com>), Mike Guy (<mjtg@cam.ac.uk>), Dominic Dunlop (<domo@computer.org>), Hugo van der Sanden (<hv@crypt.org>), Jarkko Hietaniemi (<jhi@iki.fi>), Chris Nandor (<pudge@pobox.com>), Jon Orwant (<orwant@media.mit.edu>), Richard Foley (<richard.foley@rfi.net>), and Jesse Vincent (<jesse@bestpractical.com>).

SEE ALSO

perl(1), perldebug(1), perldiag(1), perlport(1), perltrap(1), diff(1), patch(1), dbx(1), gdb(1)

BUGS

None known (guess what must have been used to report them?)

 
perlthrtut

NAME

perlthrtut - Tutorial on threads in Perl

DESCRIPTION

This tutorial describes the use of Perl interpreter threads (sometimes referred to as ithreads). In this model, each thread runs in its own Perl interpreter, and any data sharing between threads must be explicit. The user-level interface for ithreads uses the threads class.

NOTE: There was another, older Perl threading flavor called the 5.005 model that used the Thread class. This old model was known to have problems, is deprecated, and was removed for release 5.10. You are strongly encouraged to migrate any existing 5.005 threads code to the new model as soon as possible.

You can see which (or neither) threading flavour you have by running perl -V and looking at the Platform section. If you have useithreads=define you have ithreads, if you have use5005threads=define you have 5.005 threads. If you have neither, you don't have any thread support built in. If you have both, you are in trouble.

The threads and threads::shared modules are included in the core Perl distribution. Additionally, they are maintained as separate modules on CPAN, so you can check there for any updates.

What Is A Thread Anyway?

A thread is a flow of control through a program with a single execution point.

Sounds an awful lot like a process, doesn't it? Well, it should. Threads are one of the pieces of a process. Every process has at least one thread and, up until now, every process running Perl had only one thread. With 5.8, though, you can create extra threads. We're going to show you how, when, and why.

Threaded Program Models

There are three basic ways that you can structure a threaded program. Which model you choose depends on what you need your program to do. For many non-trivial threaded programs, you'll need to choose different models for different pieces of your program.

Boss/Worker

The boss/worker model usually has one boss thread and one or more worker threads. The boss thread gathers or generates tasks that need to be done, then parcels those tasks out to the appropriate worker thread.

This model is common in GUI and server programs, where a main thread waits for some event and then passes that event to the appropriate worker threads for processing. Once the event has been passed on, the boss thread goes back to waiting for another event.

The boss thread does relatively little work. While tasks aren't necessarily performed faster than with any other method, it tends to have the best user-response times.
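As a rough illustration of the boss/worker model, the sketch below uses Thread::Queue (covered later in this tutorial) to hand tasks to three workers. The queue names, the worker() subroutine, and the squaring "task" are all made up for the example; an undef value is used as an assumed end-of-work marker.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $tasks   = Thread::Queue->new();   # boss -> workers
my $results = Thread::Queue->new();   # workers -> boss

# Each worker takes tasks until it sees the undef end-of-work marker.
sub worker {
    while (defined(my $n = $tasks->dequeue())) {
        $results->enqueue($n * $n);   # the "work": square the number
    }
    return;
}

my @workers = map { threads->create(\&worker) } 1 .. 3;

# The boss hands out the work, then queues one marker per worker.
$tasks->enqueue(1 .. 10);
$tasks->enqueue(undef) for @workers;
$_->join() for @workers;

# All workers are done, so a non-blocking drain of the results is safe.
my $sum = 0;
while (defined(my $r = $results->dequeue_nb())) {
    $sum += $r;
}
print "Sum of squares: $sum\n";   # 385
```

The boss never does the squaring itself; it only distributes tasks and collects results, which is what keeps it responsive.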

Work Crew

In the work crew model, several threads are created that do essentially the same thing to different pieces of data. It closely mirrors classical parallel processing and vector processors, where a large array of processors do the exact same thing to many pieces of data.

This model is particularly useful if the system running the program will distribute multiple threads across different processors. It can also be useful in ray tracing or rendering engines, where the individual threads can pass on interim results to give the user visual feedback.
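A minimal work-crew sketch follows, assuming a made-up task (summing the numbers 1 to 100) split evenly across four threads. The names sum_slice(), $crew, and $chunk are illustrative only.

```perl
use strict;
use warnings;
use threads;

my @data  = (1 .. 100);
my $crew  = 4;
my $chunk = @data / $crew;   # 25 items per thread in this sketch

# Every crew member runs the same code on its own slice of the data.
sub sum_slice {
    my @slice = @_;
    my $sum = 0;
    $sum += $_ for @slice;
    return $sum;
}

my @threads = map {
    my $first = $_ * $chunk;
    threads->create(\&sum_slice, @data[ $first .. $first + $chunk - 1 ]);
} 0 .. $crew - 1;

# Collect the partial sums as each thread finishes.
my $total = 0;
for my $thr (@threads) {
    my ($partial) = $thr->join();
    $total += $partial;
}
print "Total: $total\n";   # 5050
```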

Pipeline

The pipeline model divides up a task into a series of steps, and passes the results of one step on to the thread processing the next. Each thread does one thing to each piece of data and passes the results to the next thread in line.

This model makes the most sense if you have multiple processors so two or more threads will be executing in parallel, though it can often make sense in other contexts as well. It tends to keep the individual tasks small and simple, as well as allowing some parts of the pipeline to block (on I/O or system calls, for example) while other parts keep going. If you're running different parts of the pipeline on different processors you may also take advantage of the caches on each processor.

This model is also handy for a form of recursive programming where, rather than having a subroutine call itself, it instead creates another thread. Prime and Fibonacci generators both map well to this form of the pipeline model. (A version of a prime number generator is presented later on.)
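The pipeline idea can be sketched with two stages connected by Thread::Queue objects (described later in this tutorial). The stage bodies (double, then add one) and the queue names are invented for illustration, and undef is again an assumed end-of-stream marker that each stage forwards or consumes.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $raw     = Thread::Queue->new();   # input to stage 1
my $doubled = Thread::Queue->new();   # stage 1 -> stage 2
my $final   = Thread::Queue->new();   # output of stage 2

# Stage 1: double each number, then pass the end marker downstream.
my $stage1 = threads->create(sub {
    while (defined(my $n = $raw->dequeue())) {
        $doubled->enqueue($n * 2);
    }
    $doubled->enqueue(undef);
});

# Stage 2: add one to each number it receives.
my $stage2 = threads->create(sub {
    while (defined(my $n = $doubled->dequeue())) {
        $final->enqueue($n + 1);
    }
});

$raw->enqueue(1, 2, 3, undef);
$_->join() for $stage1, $stage2;

my @out;
while (defined(my $n = $final->dequeue_nb())) {
    push @out, $n;
}
print "@out\n";   # 3 5 7
```

Because each stage is a single thread reading from a single queue, items leave the pipeline in the order they entered it.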

What kind of threads are Perl threads?

If you have experience with other thread implementations, you might find that things aren't quite what you expect. It's very important to remember when dealing with Perl threads that Perl Threads Are Not X Threads for all values of X. They aren't POSIX threads, or DecThreads, or Java's Green threads, or Win32 threads. There are similarities, and the broad concepts are the same, but if you start looking for implementation details you're going to be either disappointed or confused. Possibly both.

This is not to say that Perl threads are completely different from everything that's ever come before. They're not. Perl's threading model owes a lot to other thread models, especially POSIX. Just as Perl is not C, though, Perl threads are not POSIX threads. So if you find yourself looking for mutexes, or thread priorities, it's time to step back a bit and think about what you want to do and how Perl can do it.

However, it is important to remember that Perl threads cannot magically do things unless your operating system's threads allow it. So if your system blocks the entire process on sleep(), Perl usually will, as well.

Perl Threads Are Different.

Thread-Safe Modules

The addition of threads has changed Perl's internals substantially. There are implications for people who write modules with XS code or external libraries. However, since Perl data is not shared among threads by default, Perl modules stand a high chance of being thread-safe or can be made thread-safe easily. Modules that are not tagged as thread-safe should be tested or code reviewed before being used in production code.

Not all modules that you might use are thread-safe, and you should always assume a module is unsafe unless the documentation says otherwise. This includes modules that are distributed as part of the core. Threads are a relatively new feature, and even some of the standard modules aren't thread-safe.

Even if a module is thread-safe, it doesn't mean that the module is optimized to work well with threads. A module could possibly be rewritten to utilize the new features in threaded Perl to increase performance in a threaded environment.

If you're using a module that's not thread-safe for some reason, you can protect yourself by using it from one, and only one, thread. If you need multiple threads to access such a module, you can use semaphores and lots of programming discipline to control access to it. Semaphores are covered in Basic semaphores.

See also Thread-Safety of System Libraries.

Thread Basics

The threads module provides the basic functions you need to write threaded programs. In the following sections, we'll cover the basics, showing you what you need to do to create a threaded program. After that, we'll go over some of the features of the threads module that make threaded programming easier.

Basic Thread Support

Thread support is a Perl compile-time option. It's something that's turned on or off when Perl is built at your site, rather than when your programs are compiled. If your Perl wasn't compiled with thread support enabled, then any attempt to use threads will fail.

Your programs can use the Config module to check whether threads are enabled. If your program can't run without them, you can say something like:

    use Config;
    $Config{useithreads} or die('Recompile Perl with threads to run this program.');

A possibly-threaded program using a possibly-threaded module might have code like this:

    use Config;
    use MyMod;
    BEGIN {
        if ($Config{useithreads}) {
            # We have threads
            require MyMod_threaded;
            import MyMod_threaded;
        } else {
            require MyMod_unthreaded;
            import MyMod_unthreaded;
        }
    }

Since code that runs both with and without threads is usually pretty messy, it's best to isolate the thread-specific code in its own module. In our example above, that's what MyMod_threaded is, and it's only imported if we're running on a threaded Perl.

A Note about the Examples

In a real situation, care should be taken that all threads are finished executing before the program exits. That care has not been taken in these examples in the interest of simplicity. Running these examples as is will produce error messages, usually caused by the fact that there are still threads running when the program exits. You should not be alarmed by this.
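For reference, one way to take that care is to join every outstanding thread before the main thread exits. The sketch below assumes a trivial work() subroutine and a shared counter (threads::shared is introduced later in this tutorial); threads->list() is the threads class method that returns the threads that are neither joined nor detached.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $finished :shared = 0;

# Hypothetical unit of work: just record that the thread ran.
sub work {
    lock($finished);
    $finished++;
    return;
}

threads->create(\&work) for 1 .. 3;

# Wait for every thread that has not been joined or detached.
$_->join() for threads->list();

print "$finished threads finished\n";   # 3 threads finished
```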

Creating Threads

The threads module provides the tools you need to create new threads. Like any other module, you need to tell Perl that you want to use it; use threads; imports all the pieces you need to create basic threads.

The simplest, most straightforward way to create a thread is with create():

    use threads;
    my $thr = threads->create(\&sub1);
    sub sub1 {
        print("In the thread\n");
    }

The create() method takes a reference to a subroutine and creates a new thread that starts executing in the referenced subroutine. Control then passes both to the subroutine and the caller.

If you need to, your program can pass parameters to the subroutine as part of the thread startup. Just include the list of parameters as part of the threads->create() call, like this:

    use threads;
    my $Param3 = 'foo';
    my $thr1 = threads->create(\&sub1, 'Param 1', 'Param 2', $Param3);
    my @ParamList = (42, 'Hello', 3.14);
    my $thr2 = threads->create(\&sub1, @ParamList);
    my $thr3 = threads->create(\&sub1, qw(Param1 Param2 Param3));
    sub sub1 {
        my @InboundParameters = @_;
        print("In the thread\n");
        print('Got parameters >', join('<>', @InboundParameters), "<\n");
    }

The last example illustrates another feature of threads. You can spawn off several threads using the same subroutine. Each thread executes the same subroutine, but in a separate thread with a separate environment and potentially separate arguments.

new() is a synonym for create().

Waiting For A Thread To Exit

Since threads are also subroutines, they can return values. To wait for a thread to exit and extract any values it might return, you can use the join() method:

    use threads;
    my ($thr) = threads->create(\&sub1);
    my @ReturnData = $thr->join();
    print('Thread returned ', join(', ', @ReturnData), "\n");
    sub sub1 { return ('Fifty-six', 'foo', 2); }

In the example above, the join() method returns as soon as the thread ends. In addition to waiting for a thread to finish and gathering up any values that the thread might have returned, join() also performs any OS cleanup necessary for the thread. That cleanup might be important, especially for long-running programs that spawn lots of threads. If you don't want the return values and don't want to wait for the thread to finish, you should call the detach() method instead, as described next.

NOTE: In the example above, the thread returns a list, thus necessitating that the thread creation call be made in list context (i.e., my ($thr)). See $thr->join() in threads and THREAD CONTEXT in threads for more details on thread context and return values.
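To make the effect of thread context concrete, here is a small sketch with a made-up answers() subroutine. It creates one thread in list context, one in scalar context, and one with the explicit 'context' option that the threads module accepts as a leading hash reference to create().

```perl
use strict;
use warnings;
use threads;

sub answers { return ('Fifty-six', 'foo', 2) }

# List context at creation: join() hands back the whole list.
my ($list_thr) = threads->create(\&answers);
my @all = $list_thr->join();

# Scalar context at creation: the subroutine runs in scalar context,
# so join() yields a single value.
my $scalar_thr = threads->create(\&answers);
my $one = $scalar_thr->join();

# Explicit context, independent of how create() itself is called.
my $explicit = threads->create({ 'context' => 'list' }, \&answers);
my @also_all = $explicit->join();

print scalar(@all), " values, then one value, then ",
      scalar(@also_all), " values\n";
```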

Ignoring A Thread

join() does three things: it waits for a thread to exit, cleans up after it, and returns any data the thread may have produced. But what if you're not interested in the thread's return values, and you don't really care when the thread finishes? All you want is for the thread to be cleaned up when it's done.

In this case, you use the detach() method. Once a thread is detached, it'll run until it's finished; then Perl will clean up after it automatically.

    use threads;
    my $thr = threads->create(\&sub1);   # Spawn the thread
    $thr->detach();                      # Now we officially don't care any more
    sleep(15);                           # Let thread run for a while
    sub sub1 {
        $a = 0;
        while (1) {
            $a++;
            print("\$a is $a\n");
            sleep(1);
        }
    }

Once a thread is detached, it may not be joined, and any return data that it might have produced (if it was done and waiting for a join) is lost.

detach() can also be called as a class method to allow a thread to detach itself:

    use threads;
    my $thr = threads->create(\&sub1);
    sub sub1 {
        threads->detach();
        # Do more work
    }

Process and Thread Termination

With threads one must be careful to make sure they all have a chance to run to completion, assuming that is what you want.

An action that terminates a process will terminate all running threads. die() and exit() have this property, and perl does an exit when the main thread exits, perhaps implicitly by falling off the end of your code, even if that's not what you want.

As an example of this case, this code prints the message "Perl exited with active threads: 2 running and unjoined":

    use threads;
    my $thr1 = threads->new(\&thrsub, "test1");
    my $thr2 = threads->new(\&thrsub, "test2");
    sub thrsub {
        my ($message) = @_;
        sleep 1;
        print "thread $message\n";
    }

But when the following lines are added at the end:

    $thr1->join();
    $thr2->join();

it prints two lines of output, a perhaps more useful outcome.

Threads And Data

Now that we've covered the basics of threads, it's time for our next topic: Data. Threading introduces a couple of complications to data access that non-threaded programs never need to worry about.

Shared And Unshared Data

The biggest difference between Perl ithreads and the old 5.005 style threading, or for that matter, to most other threading systems out there, is that by default, no data is shared. When a new Perl thread is created, all the data associated with the current thread is copied to the new thread, and is subsequently private to that new thread! This is similar in feel to what happens when a Unix process forks, except that in this case, the data is just copied to a different part of memory within the same process rather than a real fork taking place.

To make use of threading, however, one usually wants the threads to share at least some data between themselves. This is done with the threads::shared module and the :shared attribute:

    use threads;
    use threads::shared;
    my $foo :shared = 1;
    my $bar = 1;
    threads->create(sub { $foo++; $bar++; })->join();
    print("$foo\n");   # Prints 2 since $foo is shared
    print("$bar\n");   # Prints 1 since $bar is not shared

In the case of a shared array, all the array's elements are shared, and for a shared hash, all the keys and values are shared. This places restrictions on what may be assigned to shared array and hash elements: only simple values or references to shared variables are allowed - this is so that a private variable can't accidentally become shared. A bad assignment will cause the thread to die. For example:

    use threads;
    use threads::shared;
    my $var = 1;
    my $svar :shared = 2;
    my %hash :shared;
    # ... create some threads ...
    $hash{a} = 1;        # All threads see exists($hash{a}) and $hash{a} == 1
    $hash{a} = $var;     # okay - copy-by-value: same effect as previous
    $hash{a} = $svar;    # okay - copy-by-value: same effect as previous
    $hash{a} = \$svar;   # okay - a reference to a shared variable
    $hash{a} = \$var;    # This will die
    delete($hash{a});    # okay - all threads will see !exists($hash{a})

Note that a shared variable guarantees that if two or more threads try to modify it at the same time, the internal state of the variable will not become corrupted. However, there are no guarantees beyond this, as explained in the next section.
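The restriction on references can be seen in a short sketch. It assumes a threads::shared recent enough to provide shared_clone(), which makes a shared deep copy of a data structure; the %config and @private names are made up for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %config :shared;
my @private = (1, 2, 3);

# Storing a reference to an unshared array dies; eval traps the error.
my $ok = eval { $config{list} = \@private; 1 };
print "Rejected: $@" unless $ok;

# shared_clone() returns a shared copy, which may be stored.
$config{list} = shared_clone(\@private);

# A change made in another thread is visible in this one.
threads->create(sub { push @{ $config{list} }, 4 })->join();
print scalar(@{ $config{list} }), " elements\n";   # 4 elements
```

Note that the clone is a copy: later changes to @private are not reflected in $config{list}.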

Thread Pitfalls: Races

While threads bring a new set of useful tools, they also bring a number of pitfalls. One pitfall is the race condition:

    use threads;
    use threads::shared;
    my $a :shared = 1;
    my $thr1 = threads->create(\&sub1);
    my $thr2 = threads->create(\&sub2);
    $thr1->join();
    $thr2->join();
    print("$a\n");
    sub sub1 { my $foo = $a; $a = $foo + 1; }
    sub sub2 { my $bar = $a; $a = $bar + 1; }

What do you think $a will be? The answer, unfortunately, is "it depends." Both sub1() and sub2() access the shared variable $a, once to read and once to write. Depending on factors ranging from your thread implementation's scheduling algorithm to the phase of the moon, $a can be 2 or 3.

Race conditions are caused by unsynchronized access to shared data. Without explicit synchronization, there's no way to be sure that nothing has happened to the shared data between the time you access it and the time you update it. Even this simple code fragment has the possibility of error:

    use threads;
    use threads::shared;   # needed for the :shared attribute to take effect
    my $a :shared = 2;
    my $b :shared;
    my $c :shared;
    my $thr1 = threads->create(sub { $b = $a; $a = $b + 1; });
    my $thr2 = threads->create(sub { $c = $a; $a = $c + 1; });
    $thr1->join();
    $thr2->join();

Two threads both access $a. Each thread can potentially be interrupted at any point, or be executed in any order. At the end, $a could be 3 or 4, and both $b and $c could be 2 or 3.

Even $a += 5 or $a++ are not guaranteed to be atomic.

Whenever your program accesses data or resources that can be accessed by other threads, you must take steps to coordinate access or risk data inconsistency and race conditions. Note that Perl will protect its internals from your race conditions, but it won't protect you from you.

Synchronization and control

Perl provides a number of mechanisms to coordinate the interactions between threads and their data, to avoid race conditions and the like. Some of these are designed to resemble the common techniques used in thread libraries such as pthreads; others are Perl-specific. Often, the standard techniques are clumsy and difficult to get right (such as condition waits). Where possible, it is usually easier to use Perlish techniques such as queues, which remove some of the hard work involved.

Controlling access: lock()

The lock() function takes a shared variable and puts a lock on it. No other thread may lock the variable until the variable is unlocked by the thread holding the lock. Unlocking happens automatically when the locking thread exits the block that contains the call to the lock() function. Using lock() is straightforward. This example has several threads doing some calculations in parallel, and occasionally updating a running total:

    use threads;
    use threads::shared;
    my $total :shared = 0;
    sub calc {
        while (1) {
            my $result;
            # (... do some calculations and set $result ...)
            {
                lock($total);   # Block until we obtain the lock
                $total += $result;
            }                   # Lock implicitly released at end of scope
            last if $result == 0;
        }
    }
    my $thr1 = threads->create(\&calc);
    my $thr2 = threads->create(\&calc);
    my $thr3 = threads->create(\&calc);
    $thr1->join();
    $thr2->join();
    $thr3->join();
    print("total=$total\n");

lock() blocks the thread until the variable being locked is available. When lock() returns, your thread can be sure that no other thread can lock that variable until the block containing the lock exits.

It's important to note that locks don't prevent access to the variable in question, only lock attempts. This is in keeping with Perl's longstanding tradition of courteous programming, and the advisory file locking that flock() gives you.
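A complete, runnable variation on the same idea is sketched below, with a made-up bump() subroutine standing in for the calculation. Because every thread takes the lock before its read-modify-write of $counter, no increment is lost.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $counter :shared = 0;

sub bump {
    for (1 .. 1000) {
        lock($counter);   # released at the end of each loop iteration
        $counter++;
    }
    return;
}

my @thr = map { threads->create(\&bump) } 1 .. 4;
$_->join() for @thr;
print "counter=$counter\n";   # counter=4000
```

This only works because all four threads cooperate by calling lock(); a thread that skipped the call could still modify $counter at any time.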

You may lock arrays and hashes as well as scalars. Locking an array, though, will not block subsequent locks on array elements, just lock attempts on the array itself.

Locks are recursive, which means it's okay for a thread to lock a variable more than once. The lock will last until the outermost lock() on the variable goes out of scope. For example:

    my $x :shared;
    doit();
    sub doit {
        {
            {
                lock($x);   # Wait for lock
                lock($x);   # NOOP - we already have the lock
                {
                    lock($x);   # NOOP
                    {
                        lock($x);   # NOOP
                        lockit_some_more();
                    }
                }
            }   # *** Implicit unlock here ***
        }
    }
    sub lockit_some_more {
        lock($x);   # NOOP
    }   # Nothing happens here

Note that there is no unlock() function - the only way to unlock a variable is to allow it to go out of scope.

A lock can either be used to guard the data contained within the variable being locked, or it can be used to guard something else, like a section of code. In this latter case, the variable in question does not hold any useful data, and exists only for the purpose of being locked. In this respect, the variable behaves like the mutexes and basic semaphores of traditional thread libraries.
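As an illustration of the second use, the sketch below guards a two-step section of code with a variable that holds no data. The names $section_lock, @log, and report() are invented for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $section_lock :shared;   # holds no data; it exists only to be locked
my @log :shared;

sub report {
    my ($id) = @_;
    lock($section_lock);    # one thread at a time past this point
    push @log, "thread $id: begin";
    push @log, "thread $id: end";
    return;
}                           # lock released when report() returns

my @thr = map { threads->create(\&report, $_) } 1 .. 4;
$_->join() for @thr;
print "$_\n" for @log;
```

The order in which the threads appear in the log is not fixed, but each "begin" line is always followed directly by the matching "end" line.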

A Thread Pitfall: Deadlocks

Locks are a handy tool to synchronize access to data, and using them properly is the key to safe shared data. Unfortunately, locks aren't without their dangers, especially when multiple locks are involved. Consider the following code:

    use threads;
    use threads::shared;   # needed for the :shared attribute to take effect
    my $a :shared = 4;
    my $b :shared = 'foo';
    my $thr1 = threads->create(sub {
        lock($a);
        sleep(20);
        lock($b);
    });
    my $thr2 = threads->create(sub {
        lock($b);
        sleep(20);
        lock($a);
    });

This program will probably hang until you kill it. The only way it won't hang is if one of the two threads acquires both locks first. A guaranteed-to-hang version is more complicated, but the principle is the same.

The first thread will grab a lock on $a, then, after a pause during which the second thread has probably had time to do some work, try to grab a lock on $b. Meanwhile, the second thread grabs a lock on $b, then later tries to grab a lock on $a. The second lock attempt for both threads will block, each waiting for the other to release its lock.

This condition is called a deadlock, and it occurs whenever two or more threads are trying to get locks on resources that the others own. Each thread will block, waiting for the other to release a lock on a resource. That never happens, though, since the thread with the resource is itself waiting for a lock to be released.

There are a number of ways to handle this sort of problem. The best way is to always have all threads acquire locks in the exact same order. If, for example, you lock variables $a, $b, and $c, always lock $a before $b, and $b before $c. It's also best to hold on to locks for as short a period of time as possible to minimize the risks of deadlock.
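A sketch of the lock-ordering rule follows, using two made-up shared variables and a hypothetical transfer() subroutine. Every thread takes the lock on $first before the lock on $second, so no thread can hold one lock while waiting on a thread that holds the other.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $first  :shared = 0;
my $second :shared = 0;

sub transfer {
    my ($amount) = @_;
    lock($first);    # always taken first ...
    lock($second);   # ... and always taken second, in every thread
    $first  -= $amount;
    $second += $amount;
    return;
}                    # both locks released here

my @thr = map { threads->create(\&transfer, $_) } 1 .. 5;
$_->join() for @thr;
print "first=$first second=$second\n";   # first=-15 second=15
```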

The other synchronization primitives described below can suffer from similar problems.

Queues: Passing Data Around

A queue is a special thread-safe object that lets you put data in one end and take it out the other without having to worry about synchronization issues. They're pretty straightforward, and look like this:

    use threads;
    use Thread::Queue;
    my $DataQueue = Thread::Queue->new();
    my $thr = threads->create(sub {
        while (my $DataElement = $DataQueue->dequeue()) {
            print("Popped $DataElement off the queue\n");
        }
    });
    $DataQueue->enqueue(12);
    $DataQueue->enqueue("A", "B", "C");
    sleep(10);
    $DataQueue->enqueue(undef);
    $thr->join();

You create the queue with Thread::Queue->new(). Then you can add lists of scalars onto the end with enqueue(), and pop scalars off the front of it with dequeue(). A queue has no fixed size, and can grow as needed to hold everything pushed on to it.

If a queue is empty, dequeue() blocks until another thread enqueues something. This makes queues ideal for event loops and other communications between threads.
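One detail worth a sketch: a loop that tests the dequeued value for truth also stops on false-but-valid items such as 0 or the empty string. Assuming undef is reserved as the end-of-data marker, testing with defined() avoids that. The @seen array and the queued values are made up for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my $q = Thread::Queue->new();
my @seen :shared;

my $consumer = threads->create(sub {
    # defined() keeps the loop running for 0 and '' and stops only on undef.
    while (defined(my $item = $q->dequeue())) {
        push @seen, $item;
    }
});

$q->enqueue(0, '', 'last');
$q->enqueue(undef);           # end-of-data marker
$consumer->join();
print scalar(@seen), " items received\n";   # 3 items received
```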

Semaphores: Synchronizing Data Access

Semaphores are a kind of generic locking mechanism. In their most basic form, they behave very much like lockable scalars, except that they can't hold data, and that they must be explicitly unlocked. In their advanced form, they act like a kind of counter, and can allow multiple threads to have the lock at any one time.

Basic semaphores

Semaphores have two methods, down() and up(): down() decrements the resource count, while up() increments it. Calls to down() will block if the semaphore's current count would decrement below zero. This program gives a quick demonstration:

    use threads;
    use threads::shared;   # needed for the :shared attribute to take effect
    use Thread::Semaphore;
    my $semaphore = Thread::Semaphore->new();
    my $GlobalVariable :shared = 0;
    my $thr1 = threads->create(\&sample_sub, 1);
    my $thr2 = threads->create(\&sample_sub, 2);
    my $thr3 = threads->create(\&sample_sub, 3);
    sub sample_sub {
        my $SubNumber = shift(@_);
        my $TryCount = 10;
        my $LocalCopy;
        sleep(1);
        while ($TryCount--) {
            $semaphore->down();
            $LocalCopy = $GlobalVariable;
            print("$TryCount tries left for sub $SubNumber (\$GlobalVariable is $GlobalVariable)\n");
            sleep(2);
            $LocalCopy++;
            $GlobalVariable = $LocalCopy;
            $semaphore->up();
        }
    }
    $thr1->join();
    $thr2->join();
    $thr3->join();

The three invocations of the subroutine all operate in sync. The semaphore, though, makes sure that only one thread is accessing the global variable at once.

Advanced Semaphores

By default, semaphores behave like locks, letting only one thread down() them at a time. However, there are other uses for semaphores.

Each semaphore has a counter attached to it. By default, semaphores are created with the counter set to one, down() decrements the counter by one, and up() increments by one. However, we can override any or all of these defaults simply by passing in different values:

    use threads;
    use Thread::Semaphore;
    my $semaphore = Thread::Semaphore->new(5);
    # Creates a semaphore with the counter set to five
    my $thr1 = threads->create(\&sub1);
    my $thr2 = threads->create(\&sub1);
    sub sub1 {
        $semaphore->down(5);   # Decrements the counter by five
        # Do stuff here
        $semaphore->up(5);     # Increment the counter by five
    }
    $thr1->detach();
    $thr2->detach();
  12. $thr1->detach();
  13. $thr2->detach();

If down() attempts to decrement the counter below zero, it blocks until the counter is large enough. Note that while a semaphore can be created with a starting count of zero, any up() or down() always changes the counter by at least one, and so $semaphore->down(0) is the same as $semaphore->down(1).

The question, of course, is why would you do something like this? Why create a semaphore with a starting count that's not one, or why decrement or increment it by more than one? The answer is resource availability. Many resources that you want to manage access for can be safely used by more than one thread at once.

For example, let's take a GUI driven program. It has a semaphore that it uses to synchronize access to the display, so only one thread is ever drawing at once. Handy, but of course you don't want any thread to start drawing until things are properly set up. In this case, you can create a semaphore with a counter set to zero, and up it when things are ready for drawing.
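The "start at zero, up it when ready" idea can be sketched as follows. This is a minimal illustration rather than real GUI code: the names $display_ready and @events are invented for the example, and the "drawing" is just a string pushed onto a shared array.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

# Hypothetical gate: a counter of zero means "display not ready yet".
my $display_ready = Thread::Semaphore->new(0);
my @events :shared;

my $drawer = threads->create(sub {
    $display_ready->down();      # blocks until the main thread calls up()
    lock(@events);
    push @events, 'draw';
});

{
    lock(@events);
    push @events, 'setup';       # stand-in for initializing the display
}
$display_ready->up();            # setup is done, let the drawer proceed

$drawer->join();
print join(',', @events), "\n";
```

Because the drawing thread cannot get past down() until the main thread has called up(), the setup step is always recorded first.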

Semaphores with counters greater than one are also useful for establishing quotas. Say, for example, that you have a number of threads that can do I/O at once. You don't want all the threads reading or writing at once though, since that can potentially swamp your I/O channels, or deplete your process's quota of filehandles. You can use a semaphore initialized to the number of concurrent I/O requests (or open files) that you want at any one time, and have your threads quietly block and unblock themselves.

Larger increments or decrements are handy in those cases where a thread needs to check out or return a number of resources at once.
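As a rough sketch of the quota idea, the following program lets six threads compete for two "I/O slots". The slot count, the worker() routine and the $active and $peak counters are assumptions made for the illustration, and the short select() call merely stands in for real I/O.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

# Assumed quota: at most two threads may be "doing I/O" at once.
my $io_slots = Thread::Semaphore->new(2);

my $active :shared = 0;   # threads currently inside the guarded region
my $peak   :shared = 0;   # highest value $active has reached

sub worker {
    $io_slots->down();                    # take a slot, or block until one is free
    {
        lock($active);
        $active++;
        $peak = $active if $active > $peak;
    }
    select(undef, undef, undef, 0.05);    # pretend to do some I/O
    {
        lock($active);
        $active--;
    }
    $io_slots->up();                      # give the slot back
}

my @workers = map { threads->create(\&worker) } 1 .. 6;
$_->join() for @workers;

print "peak concurrency: $peak\n";
```

However the threads happen to be scheduled, $peak should never exceed the semaphore's initial count.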

Waiting for a Condition

The functions cond_wait() and cond_signal() can be used in conjunction with locks to notify co-operating threads that a resource has become available. They are very similar in use to the functions found in pthreads. However, for most purposes, queues are simpler to use and more intuitive. See threads::shared for more details.
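A minimal sketch of the pattern, assuming a single shared flag named $ready (an invented name) that the main thread sets once the resource is available:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $ready :shared = 0;

my $consumer = threads->create(sub {
    lock($ready);
    # cond_wait() releases the lock while waiting and re-acquires it
    # before returning; the loop re-checks the flag after every wakeup.
    cond_wait($ready) until $ready;
    return "consumer saw ready=$ready";
});

{
    lock($ready);
    $ready = 1;
    cond_signal($ready);    # wake one thread waiting on $ready
}

my $message = $consumer->join();
print "$message\n";
```

If the main thread sets the flag before the consumer ever takes the lock, the consumer simply finds $ready true and never waits, which is why the flag is tested in a loop rather than waiting unconditionally.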

Giving up control

There are times when you may find it useful to have a thread explicitly give up the CPU to another thread. For example, you may be doing something processor-intensive and want to make sure that the user-interface thread gets called frequently.

Perl's threading package provides the yield() function that does this. yield() is pretty straightforward, and works like this:

  use threads;

  sub loop {
      my $thread = shift;
      my $foo = 50;
      while($foo--) { print("In thread $thread\n"); }
      threads->yield();
      $foo = 50;
      while($foo--) { print("In thread $thread\n"); }
  }

  my $thr1 = threads->create(\&loop, 'first');
  my $thr2 = threads->create(\&loop, 'second');
  my $thr3 = threads->create(\&loop, 'third');

It is important to remember that yield() is only a hint to give up the CPU; what actually happens depends on your hardware, OS and threading libraries. On many operating systems, yield() is a no-op. Therefore, you should not build the scheduling of your threads around yield() calls. It might work on your platform, but it won't work on another.

General Thread Utility Routines

We've covered the workhorse parts of Perl's threading package, and with these tools you should be well on your way to writing threaded code and packages. There are a few useful little pieces that didn't really fit in anyplace else.

What Thread Am I In?

The threads->self() class method provides your program with a way to get an object representing the thread it's currently in. You can use this object in the same way as the ones returned from thread creation.

Thread IDs

tid() is a thread object method that returns the thread ID of the thread the object represents. Thread IDs are integers, with the main thread in a program being 0. Currently Perl assigns a unique TID to every thread ever created in your program, assigning the first thread to be created a TID of 1, and increasing the TID by 1 for each new thread that's created. When used as a class method, threads->tid() can be used by a thread to get its own TID.

Are These Threads The Same?

The equal() method takes two thread objects and returns true if the objects represent the same thread, and false if they don't.

Thread objects also have an overloaded == comparison so that you can do comparison on them as you would with normal objects.
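The following sketch exercises self(), tid(), equal() and the overloaded == together. The variable names are made up for the example; the only threads involved are the main thread and one child.

```perl
use strict;
use warnings;
use threads;

my $main     = threads->self();    # object for the thread we are in
my $main_tid = $main->tid();       # the main thread's TID

# The child reports its own TID by using tid() as a class method.
my $child = threads->create(sub { return threads->tid() });

my $tid_from_object = $child->tid();     # asked from the outside
my $tid_from_inside = $child->join();    # reported by the thread itself

my $same_thread  = $child->equal($child) ? 1 : 0;
my $other_thread = ($child == $main)     ? 1 : 0;

print "main=$main_tid child=$tid_from_object/$tid_from_inside ",
      "same=$same_thread other=$other_thread\n";
```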

What Threads Are Running?

threads->list() returns a list of thread objects, one for each thread that's currently running and not detached. Handy for a number of things, including cleaning up at the end of your program (from the main Perl thread, of course):

  # Loop through all the threads
  foreach my $thr (threads->list()) {
      $thr->join();
  }

If some threads have not finished running when the main Perl thread ends, Perl will warn you about it and die, since it is impossible for Perl to clean up itself while other threads are running.

NOTE: The main Perl thread (thread 0) is in a detached state, and so does not appear in the list returned by threads->list().

A Complete Example

Confused yet? It's time for an example program to show some of the things we've covered. This program finds prime numbers using threads.

   1 #!/usr/bin/perl
   2 # prime-pthread, courtesy of Tom Christiansen
   3
   4 use strict;
   5 use warnings;
   6
   7 use threads;
   8 use Thread::Queue;
   9
  10 sub check_num {
  11     my ($upstream, $cur_prime) = @_;
  12     my $kid;
  13     my $downstream = Thread::Queue->new();
  14     while (my $num = $upstream->dequeue()) {
  15         next unless ($num % $cur_prime);
  16         if ($kid) {
  17             $downstream->enqueue($num);
  18         } else {
  19             print("Found prime: $num\n");
  20             $kid = threads->create(\&check_num, $downstream, $num);
  21             if (! $kid) {
  22                 warn("Sorry. Ran out of threads.\n");
  23                 last;
  24             }
  25         }
  26     }
  27     if ($kid) {
  28         $downstream->enqueue(undef);
  29         $kid->join();
  30     }
  31 }
  32
  33 my $stream = Thread::Queue->new(3..1000, undef);
  34 check_num($stream, 2);

This program uses the pipeline model to generate prime numbers. Each thread in the pipeline has an input queue that feeds numbers to be checked, a prime number that it's responsible for, and an output queue into which it funnels numbers that have failed the check. If the thread has a number that's failed its check and there's no child thread, then the thread must have found a new prime number. In that case, a new child thread is created for that prime and stuck on the end of the pipeline.

This probably sounds a bit more confusing than it really is, so let's go through this program piece by piece and see what it does. (For those of you who might be trying to remember exactly what a prime number is, it's a number that's only evenly divisible by itself and 1.)

The bulk of the work is done by the check_num() subroutine, which takes a reference to its input queue and a prime number that it's responsible for. After pulling in the input queue and the prime that the subroutine is checking (line 11), we create a new queue (line 13) and reserve a scalar for the thread that we're likely to create later (line 12).

The while loop from line 14 to line 26 grabs a scalar off the input queue and checks against the prime this thread is responsible for. Line 15 checks to see if there's a remainder when we divide the number to be checked by our prime. If there is one, the number must not be evenly divisible by our prime, so we need to either pass it on to the next thread if we've created one (line 17) or create a new thread if we haven't.

The new thread creation is line 20. We pass on to it a reference to the queue we've created, and the prime number we've found. In lines 21 through 24, we check to make sure that our new thread got created, and if not, we stop checking any remaining numbers in the queue.

Finally, once the loop terminates (because we got a 0 or undef in the queue, which serves as a note to terminate), we pass on the notice to our child, and wait for it to exit if we've created a child (lines 27 through 30).

Meanwhile, back in the main thread, we first create a queue (line 33) and queue up all the numbers from 3 to 1000 for checking, plus a termination notice. Then all we have to do to get the ball rolling is pass the queue and the first prime to the check_num() subroutine (line 34).

That's how it works. It's pretty simple; as with many Perl programs, the explanation is much longer than the program.

Different implementations of threads

Here is some background on thread implementations from the operating system viewpoint. There are three basic categories of threads: user-mode threads, kernel threads, and multiprocessor kernel threads.

User-mode threads are threads that live entirely within a program and its libraries. In this model, the OS knows nothing about threads. As far as it's concerned, your process is just a process.

This is the easiest way to implement threads, and the way most OSes start. The big disadvantage is that, since the OS knows nothing about threads, if one thread blocks they all do. Typical blocking activities include most system calls, most I/O, and things like sleep().

Kernel threads are the next step in thread evolution. The OS knows about kernel threads, and makes allowances for them. The main difference between a kernel thread and a user-mode thread is blocking. With kernel threads, things that block a single thread don't block other threads. This is not the case with user-mode threads, where the kernel blocks at the process level and not the thread level.

This is a big step forward, and can give a threaded program quite a performance boost over non-threaded programs. Threads that block performing I/O, for example, won't block threads that are doing other things. Each process still has only one thread running at once, though, regardless of how many CPUs a system might have.

Since kernel threading can interrupt a thread at any time, they will uncover some of the implicit locking assumptions you may make in your program. For example, something as simple as $a = $a + 2 can behave unpredictably with kernel threads if $a is visible to other threads, as another thread may have changed $a between the time it was fetched on the right hand side and the time the new value is stored.
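One way to make such an update safe is to hold a lock on the shared variable across both the fetch and the store. The sketch below assumes a shared counter named $total (an invented name) that four threads each increase by two a thousand times.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $total :shared = 0;

sub add_two_many_times {
    for (1 .. 1000) {
        lock($total);           # held until the end of this loop iteration
        $total = $total + 2;    # fetch and store can no longer be interleaved
    }
}

my @adders = map { threads->create(\&add_two_many_times) } 1 .. 4;
$_->join() for @adders;

print "total: $total\n";   # 4 threads * 1000 iterations * 2 = 8000
```

Without the lock() call, two threads could both fetch the same old value and one of the additions would be lost, so the final total could come out lower.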

Multiprocessor kernel threads are the final step in thread support. With multiprocessor kernel threads on a machine with multiple CPUs, the OS may schedule two or more threads to run simultaneously on different CPUs.

This can give a serious performance boost to your threaded program, since more than one thread will be executing at the same time. As a tradeoff, though, any of those nagging synchronization issues that might not have shown with basic kernel threads will appear with a vengeance.

In addition to the different levels of OS involvement in threads, different OSes (and different thread implementations for a particular OS) allocate CPU cycles to threads in different ways.

Cooperative multitasking systems have running threads give up control if one of two things happen. If a thread calls a yield function, it gives up control. It also gives up control if the thread does something that would cause it to block, such as perform I/O. In a cooperative multitasking implementation, one thread can starve all the others for CPU time if it so chooses.

Preemptive multitasking systems interrupt threads at regular intervals while the system decides which thread should run next. In a preemptive multitasking system, one thread usually won't monopolize the CPU.

On some systems, there can be cooperative and preemptive threads running simultaneously. (Threads running with realtime priorities often behave cooperatively, for example, while threads running at normal priorities behave preemptively.)

Most modern operating systems support preemptive multitasking nowadays.

Performance considerations

The main thing to bear in mind when comparing Perl's ithreads to other threading models is the fact that for each new thread created, a complete copy of all the variables and data of the parent thread has to be taken. Thus, thread creation can be quite expensive, both in terms of memory usage and time spent in creation. The ideal way to reduce these costs is to have a relatively small number of long-lived threads, all created fairly early on (before the base thread has accumulated too much data). Of course, this may not always be possible, so compromises have to be made. However, after a thread has been created, its performance and extra memory usage should be little different from those of ordinary code.

Also note that under the current implementation, shared variables use a little more memory and are a little slower than ordinary variables.

Process-scope Changes

Note that while threads themselves are separate execution threads and Perl data is thread-private unless explicitly shared, the threads can affect process-scope state, affecting all the threads.

The most common example of this is changing the current working directory using chdir(). One thread calls chdir(), and the working directory of all the threads changes.

An even more drastic example of a process-scope change is chroot(): the root directory of all the threads changes, and no thread can undo it (as opposed to chdir()).

Further examples of process-scope changes include umask() and changing uids and gids.

Thinking of mixing fork() and threads? Please lie down and wait until the feeling passes. Be aware that the semantics of fork() vary between platforms. For example, some Unix systems copy all the current threads into the child process, while others only copy the thread that called fork(). You have been warned!

Similarly, mixing signals and threads may be problematic. Implementations are platform-dependent, and even the POSIX semantics may not be what you expect (and Perl doesn't even give you the full POSIX API). For example, there is no way to guarantee that a signal sent to a multi-threaded Perl application will get intercepted by any particular thread. (However, a recently added feature does provide the capability to send signals between threads. See THREAD SIGNALLING in threads for more details.)

Thread-Safety of System Libraries

Whether various library calls are thread-safe is outside the control of Perl. Calls often suffering from not being thread-safe include: localtime(), gmtime(), functions fetching user, group and network information (such as getgrent(), gethostent(), getnetent() and so on), readdir(), rand(), and srand(). In general, any call that depends on some global external state is suspect.

If the system Perl is compiled in has thread-safe variants of such calls, they will be used. Beyond that, Perl is at the mercy of the thread-safety or -unsafety of the calls. Please consult your C library call documentation.

On some platforms the thread-safe library interfaces may fail if the result buffer is too small (for example the user group databases may be rather large, and the reentrant interfaces may have to carry around a full snapshot of those databases). Perl will start with a small buffer, but keep retrying and growing the result buffer until the result fits. If this limitless growing sounds bad for security or memory consumption reasons you can recompile Perl with PERL_REENTRANT_MAXSIZE defined to the maximum number of bytes you will allow.

Conclusion

A complete thread tutorial could fill a book (and has, many times), but with what we've covered in this introduction, you should be well on your way to becoming a threaded Perl expert.

SEE ALSO

Annotated POD for threads: http://annocpan.org/?mode=search&field=Module&name=threads

Latest version of threads on CPAN: http://search.cpan.org/search?module=threads

Annotated POD for threads::shared: http://annocpan.org/?mode=search&field=Module&name=threads%3A%3Ashared

Latest version of threads::shared on CPAN: http://search.cpan.org/search?module=threads%3A%3Ashared

Perl threads mailing list: http://lists.perl.org/list/ithreads.html

Bibliography

Here's a short bibliography courtesy of Jürgen Christoffel:

Introductory Texts

Birrell, Andrew D. An Introduction to Programming with Threads. Digital Equipment Corporation, 1989, DEC-SRC Research Report #35 online as ftp://ftp.dec.com/pub/DEC/SRC/research-reports/SRC-035.pdf (highly recommended)

Robbins, Kay A., and Steven Robbins. Practical Unix Programming: A Guide to Concurrency, Communication, and Multithreading. Prentice-Hall, 1996.

Lewis, Bill, and Daniel J. Berg. Multithreaded Programming with Pthreads. Prentice Hall, 1997, ISBN 0-13-443698-9 (a well-written introduction to threads).

Nelson, Greg (editor). Systems Programming with Modula-3. Prentice Hall, 1991, ISBN 0-13-590464-1.

Nichols, Bradford, Dick Buttlar, and Jacqueline Proulx Farrell. Pthreads Programming. O'Reilly & Associates, 1996, ISBN 1-56592-115-1 (covers POSIX threads).

OS-Related References

Boykin, Joseph, David Kirschen, Alan Langerman, and Susan LoVerso. Programming under Mach. Addison-Wesley, 1994, ISBN 0-201-52739-1.

Tanenbaum, Andrew S. Distributed Operating Systems. Prentice Hall, 1995, ISBN 0-13-219908-4 (great textbook).

Silberschatz, Abraham, and Peter B. Galvin. Operating System Concepts, 4th ed. Addison-Wesley, 1995, ISBN 0-201-59292-4.

Other References

Arnold, Ken and James Gosling. The Java Programming Language, 2nd ed. Addison-Wesley, 1998, ISBN 0-201-31006-6.

comp.programming.threads FAQ, http://www.serpentine.com/~bos/threads-faq/

Le Sergent, T. and B. Berthomieu. "Incremental MultiThreaded Garbage Collection on Virtually Shared Memory Architectures" in Memory Management: Proc. of the International Workshop IWMM 92, St. Malo, France, September 1992, Yves Bekkers and Jacques Cohen, eds. Springer, 1992, ISBN 3-540-55940-X (real-life thread applications).

Artur Bergman, "Where Wizards Fear To Tread", June 11, 2002, http://www.perl.com/pub/a/2002/06/11/threads.html

Acknowledgements

Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel, Joshua Pritikin, and Alan Burlison, for their help in reality-checking and polishing this article. Big thanks to Tom Christiansen for his rewrite of the prime number generator.

AUTHOR

Dan Sugalski <dan@sidhe.org>

Slightly modified by Artur Bergman to fit the new thread model/module.

Reworked slightly by Jörg Walter <jwalt@cpan.org> to be more concise about thread-safety of Perl code.

Rearranged slightly by Elizabeth Mattijsen <liz@dijkmat.nl> to put less emphasis on yield().

Copyrights

This article originally appeared in The Perl Journal #10, and is copyright 1998 The Perl Journal. It appears courtesy of Jon Orwant and The Perl Journal. This document may be distributed under the same terms as Perl itself.

 
perltie

Perl 5 version 18.2 documentation

NAME

perltie - how to hide an object class in a simple variable

SYNOPSIS

  tie VARIABLE, CLASSNAME, LIST

  $object = tied VARIABLE

  untie VARIABLE

DESCRIPTION

Prior to release 5.0 of Perl, a programmer could use dbmopen() to connect an on-disk database in the standard Unix dbm(3x) format magically to a %HASH in their program. However, their Perl was either built with one particular dbm library or another, but not both, and you couldn't extend this mechanism to other packages or types of variables.

Now you can.

The tie() function binds a variable to a class (package) that will provide the implementation for access methods for that variable. Once this magic has been performed, accessing a tied variable automatically triggers method calls in the proper class. The complexity of the class is hidden behind magic method calls. The method names are in ALL CAPS, which is a convention that Perl uses to indicate that they're called implicitly rather than explicitly--just like the BEGIN() and END() functions.

In the tie() call, VARIABLE is the name of the variable to be enchanted. CLASSNAME is the name of a class implementing objects of the correct type. Any additional arguments in the LIST are passed to the appropriate constructor method for that class--meaning TIESCALAR(), TIEARRAY(), TIEHASH(), or TIEHANDLE(). (Typically these are arguments such as might be passed to the dbminit() function of C.) The object returned by the "new" method is also returned by the tie() function, which would be useful if you wanted to access other methods in CLASSNAME. (You don't actually have to return a reference to the right "type" (e.g., HASH or CLASSNAME) so long as it's a properly blessed object.) You can also retrieve a reference to the underlying object using the tied() function.

Unlike dbmopen(), the tie() function will not use or require a module for you--you need to do that explicitly yourself.

Tying Scalars

A class implementing a tied scalar should define the following methods: TIESCALAR, FETCH, STORE, and possibly UNTIE and/or DESTROY.

Let's look at each in turn, using as an example a tie class for scalars that allows the user to do something like:

  tie $his_speed, 'Nice', getppid();
  tie $my_speed, 'Nice', $$;

And now whenever either of those variables is accessed, its current system priority is retrieved and returned. If those variables are set, then the process's priority is changed!

We'll use Jarkko Hietaniemi <jhi@iki.fi>'s BSD::Resource class (not included) to access the PRIO_PROCESS, PRIO_MIN, and PRIO_MAX constants from your system, as well as the getpriority() and setpriority() system calls. Here's the preamble of the class.

  package Nice;
  use Carp;
  use BSD::Resource;
  use strict;
  $Nice::DEBUG = 0 unless defined $Nice::DEBUG;

  • TIESCALAR classname, LIST

    This is the constructor for the class. That means it is expected to return a blessed reference to a new scalar (probably anonymous) that it's creating. For example:

    sub TIESCALAR {
        my $class = shift;
        my $pid = shift || $$; # 0 means me

        if ($pid !~ /^\d+$/) {
            carp "Nice::Tie::Scalar got non-numeric pid $pid" if $^W;
            return undef;
        }

        unless (kill 0, $pid) { # EPERM or ESRCH, no doubt
            carp "Nice::Tie::Scalar got bad pid $pid: $!" if $^W;
            return undef;
        }

        return bless \$pid, $class;
    }

    This tie class has chosen to return an error rather than raising an exception if its constructor should fail. While this is how dbmopen() works, other classes may well not wish to be so forgiving. It checks the global variable $^W to see whether to emit a bit of noise anyway.

  • FETCH this

    This method will be triggered every time the tied variable is accessed (read). It takes no arguments beyond its self reference, which is the object representing the scalar we're dealing with. Because in this case we're using just a SCALAR ref for the tied scalar object, a simple $$self allows the method to get at the real value stored there. In our example below, that real value is the process ID to which we've tied our variable.

    sub FETCH {
        my $self = shift;
        confess "wrong type" unless ref $self;
        croak "usage error" if @_;
        my $nicety;
        local($!) = 0;
        $nicety = getpriority(PRIO_PROCESS, $$self);
        if ($!) { croak "getpriority failed: $!" }
        return $nicety;
    }

    This time we've decided to blow up (raise an exception) if the getpriority() call fails--there's no place for us to return an error otherwise, and it's probably the right thing to do.

  • STORE this, value

    This method will be triggered every time the tied variable is set (assigned). Beyond its self reference, it also expects one (and only one) argument: the new value the user is trying to assign. Don't worry about returning a value from STORE; the semantic of assignment returning the assigned value is implemented with FETCH.

    sub STORE {
        my $self = shift;
        confess "wrong type" unless ref $self;
        my $new_nicety = shift;
        croak "usage error" if @_;

        if ($new_nicety < PRIO_MIN) {
            carp sprintf
              "WARNING: priority %d less than minimum system priority %d",
                  $new_nicety, PRIO_MIN if $^W;
            $new_nicety = PRIO_MIN;
        }

        if ($new_nicety > PRIO_MAX) {
            carp sprintf
              "WARNING: priority %d greater than maximum system priority %d",
                  $new_nicety, PRIO_MAX if $^W;
            $new_nicety = PRIO_MAX;
        }

        unless (defined setpriority(PRIO_PROCESS, $$self, $new_nicety)) {
            confess "setpriority failed: $!";
        }
    }

  • UNTIE this

    This method will be triggered when the untie occurs. This can be useful if the class needs to know when no further calls will be made. (Except DESTROY of course.) See The untie Gotcha below for more details.

  • DESTROY this

    This method will be triggered when the tied variable needs to be destructed. As with other object classes, such a method is seldom necessary, because Perl deallocates its moribund object's memory for you automatically--this isn't C++, you know. We'll use a DESTROY method here for debugging purposes only.

    sub DESTROY {
        my $self = shift;
        confess "wrong type" unless ref $self;
        carp "[ Nice::DESTROY pid $$self ]" if $Nice::DEBUG;
    }

That's about all there is to it. Actually, it's more than all there is to it, because we've done a few nice things here for the sake of completeness, robustness, and general aesthetics. Simpler TIESCALAR classes are certainly possible.
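For comparison, here is a much smaller tied scalar, sketched under the assumption that all we want is a variable that upper-cases whatever is stored in it. The class name UpperCase and the variable names are invented for this illustration and are not part of any standard module.

```perl
use strict;
use warnings;

package UpperCase;    # hypothetical class for this sketch

sub TIESCALAR {
    my ($class, $initial) = @_;
    my $value = uc($initial // '');
    return bless \$value, $class;    # the object is just a SCALAR ref
}

sub FETCH {
    my $self = shift;
    return $$self;
}

sub STORE {
    my ($self, $value) = @_;
    $$self = uc $value;
}

package main;

my $obj = tie my $shout, 'UpperCase', 'hello';   # tie() returns the object

my $first = $shout;      # triggers FETCH
$shout = 'world';        # triggers STORE
my $second = $shout;     # triggers FETCH again

# tied() hands back the same underlying object that tie() returned.
my $same_object = (tied($shout) == $obj) ? 1 : 0;

print "$first $second $same_object\n";

undef $obj;              # drop our extra reference before untying
untie $shout;
```

The extra reference in $obj is released before the untie so that no inner references to the object remain when the tie is broken.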

Tying Arrays

A class implementing a tied ordinary array should define the following methods: TIEARRAY, FETCH, STORE, FETCHSIZE, STORESIZE and perhaps UNTIE and/or DESTROY.

FETCHSIZE and STORESIZE are used to provide $#array and equivalent scalar(@array) access.

The methods POP, PUSH, SHIFT, UNSHIFT, SPLICE, DELETE, and EXISTS are required if the perl operator with the corresponding (but lowercase) name is to operate on the tied array. The Tie::Array class can be used as a base class to implement the first five of these in terms of the basic methods above. The default implementations of DELETE and EXISTS in Tie::Array simply croak.
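As a sketch of the base-class approach, the class below defines only the constructor and the four basic methods, and inherits PUSH and POP from Tie::Array. The class name CountingArray, its stores() accessor and the field names are assumptions made for the example; the counter is there only to show that the inherited PUSH goes through our STORE.

```perl
use strict;
use warnings;

package CountingArray;    # hypothetical class for this sketch
use Tie::Array;
our @ISA = ('Tie::Array');

sub TIEARRAY {
    my $class = shift;
    return bless { data => [], stores => 0 }, $class;
}

sub FETCH     { my ($self, $index) = @_; return $self->{data}[$index] }
sub FETCHSIZE { my $self = shift; return scalar @{ $self->{data} } }
sub STORESIZE { my ($self, $count) = @_; $#{ $self->{data} } = $count - 1 }

sub STORE {
    my ($self, $index, $value) = @_;
    $self->{stores}++;                 # count every element store
    $self->{data}[$index] = $value;
}

sub stores { my $self = shift; return $self->{stores} }

package main;

my $obj = tie my @items, 'CountingArray';

push @items, 'a', 'b';      # inherited PUSH calls our STORE for each element
$items[2] = 'c';            # direct STORE
my $popped = pop @items;    # inherited POP uses FETCH and STORESIZE
my $size   = scalar @items; # FETCHSIZE
my $stores = $obj->stores();

print "popped=$popped size=$size stores=$stores\n";
```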

In addition EXTEND will be called when perl would have pre-extended allocation in a real array.

For this discussion, we'll implement an array whose elements are a fixed size at creation. If you try to create an element larger than the fixed size, you'll take an exception. For example:

  use FixedElem_Array;
  tie @array, 'FixedElem_Array', 3;
  $array[0] = 'cat';  # ok.
  $array[1] = 'dogs'; # exception, length('dogs') > 3.

The preamble code for the class is as follows:

  package FixedElem_Array;
  use Carp;
  use strict;

  • TIEARRAY classname, LIST

    This is the constructor for the class. That means it is expected to return a blessed reference through which the new array (probably an anonymous ARRAY ref) will be accessed.

    In our example, just to show you that you don't really have to return an ARRAY reference, we'll choose a HASH reference to represent our object. A HASH works out well as a generic record type: the {ELEMSIZE} field will store the maximum element size allowed, and the {ARRAY} field will hold the true ARRAY ref. If someone outside the class tries to dereference the object returned (doubtless thinking it an ARRAY ref), they'll blow up. This just goes to show you that you should respect an object's privacy.

    sub TIEARRAY {
        my $class = shift;
        my $elemsize = shift;
        if ( @_ || $elemsize =~ /\D/ ) {
            croak "usage: tie ARRAY, '" . __PACKAGE__ . "', elem_size";
        }
        return bless {
            ELEMSIZE => $elemsize,
            ARRAY => [],
        }, $class;
    }

  • FETCH this, index

    This method will be triggered every time an individual element of the tied array is accessed (read). It takes one argument beyond its self reference: the index whose value we're trying to fetch.

    sub FETCH {
        my $self = shift;
        my $index = shift;
        return $self->{ARRAY}->[$index];
    }

    If a negative array index is used to read from an array, the index will be translated to a positive one internally by calling FETCHSIZE before being passed to FETCH. You may disable this feature by assigning a true value to the variable $NEGATIVE_INDICES in the tied array class.

    As you may have noticed, the name of the FETCH method (et al.) is the same for all accesses, even though the constructors differ in names (TIESCALAR vs TIEARRAY). While in theory you could have the same class servicing several tied types, in practice this becomes cumbersome, and it's easiest to keep them at simply one tie type per class.

  • STORE this, index, value

    This method will be triggered every time an element in the tied array is set (written). It takes two arguments beyond its self reference: the index at which we're trying to store something and the value we're trying to put there.

    In our example, undef is really $self->{ELEMSIZE} number of spaces so we have a little more work to do here:

    sub STORE {
        my $self = shift;
        my( $index, $value ) = @_;
        if ( length $value > $self->{ELEMSIZE} ) {
            croak "length of $value is greater than $self->{ELEMSIZE}";
        }
        # fill in the blanks
        $self->EXTEND( $index ) if $index > $self->FETCHSIZE();
        # right justify to keep element size for smaller elements
        $self->{ARRAY}->[$index] = sprintf "%$self->{ELEMSIZE}s", $value;
    }

    Negative indexes are treated the same as with FETCH.

  • FETCHSIZE this

    Returns the total number of items in the tied array associated with object this. (Equivalent to scalar(@array)). For example:

    sub FETCHSIZE {
        my $self = shift;
        return scalar @{$self->{ARRAY}};
    }

  • STORESIZE this, count

    Sets the total number of items in the tied array associated with object this to be count. If this makes the array larger, then the class's mapping of undef should be returned for new positions. If the array becomes smaller, then entries beyond count should be deleted.

    In our example, 'undef' is really an element containing $self->{ELEMSIZE} number of spaces. Observe:

    sub STORESIZE {
        my $self = shift;
        my $count = shift;
        if ( $count > $self->FETCHSIZE() ) {
            # grow: pad the new positions, from the old size up to $count - 1
            foreach ( $self->FETCHSIZE() .. $count - 1 ) {
                $self->STORE( $_, '' );
            }
        } elsif ( $count < $self->FETCHSIZE() ) {
            # shrink: pop one element for each position beyond $count
            foreach ( 1 .. $self->FETCHSIZE() - $count ) {
                $self->POP();
            }
        }
    }

  • EXTEND this, count

    Informative call that array is likely to grow to have count entries. Can be used to optimize allocation. This method need do nothing.

    In our example, we want to make sure there are no blank (undef) entries, so EXTEND will make use of STORESIZE to fill elements as needed:

    sub EXTEND {
        my $self = shift;
        my $count = shift;
        $self->STORESIZE( $count );
    }

  • EXISTS this, key

    Verify that the element at index key exists in the tied array this.

    In our example, we will determine that if an element consists of $self->{ELEMSIZE} spaces only, it does not exist:

    sub EXISTS {
        my $self = shift;
        my $index = shift;
        return 0 if ! defined $self->{ARRAY}->[$index] ||
                    $self->{ARRAY}->[$index] eq ' ' x $self->{ELEMSIZE};
        return 1;
    }

  • DELETE this, key

    Delete the element at index key from the tied array this.

    In our example, a deleted item is $self->{ELEMSIZE} spaces:

    sub DELETE {
        my $self  = shift;
        my $index = shift;
        return $self->STORE( $index, '' );
    }
  • CLEAR this

    Clear (remove, delete, ...) all values from the tied array associated with object this. For example:

    sub CLEAR {
        my $self = shift;
        return $self->{ARRAY} = [];
    }
  • PUSH this, LIST

    Append elements of LIST to the array. For example:

    sub PUSH {
        my $self = shift;
        my @list = @_;
        my $last = $self->FETCHSIZE();
        $self->STORE( $last + $_, $list[$_] ) foreach 0 .. $#list;
        return $self->FETCHSIZE();
    }
  • POP this

    Remove last element of the array and return it. For example:

    sub POP {
        my $self = shift;
        return pop @{$self->{ARRAY}};
    }
  • SHIFT this

    Remove the first element of the array (shifting other elements down) and return it. For example:

    sub SHIFT {
        my $self = shift;
        return shift @{$self->{ARRAY}};
    }
  • UNSHIFT this, LIST

    Insert LIST elements at the beginning of the array, moving existing elements up to make room. For example:

    sub UNSHIFT {
        my $self = shift;
        my @list = @_;
        my $size = scalar( @list );
        # make room for our list
        @{$self->{ARRAY}}[ $size .. $#{$self->{ARRAY}} + $size ]
            = @{$self->{ARRAY}};
        $self->STORE( $_, $list[$_] ) foreach 0 .. $#list;
    }
  • SPLICE this, offset, length, LIST

    Perform the equivalent of splice on the array.

    offset is optional and defaults to zero, negative values count back from the end of the array.

    length is optional and defaults to rest of the array.

    LIST may be empty.

    Returns a list of the original length elements at offset.

    In our example, we'll use a little shortcut if there is a LIST:

    sub SPLICE {
        my $self   = shift;
        my $offset = shift || 0;
        my $length = shift || $self->FETCHSIZE() - $offset;
        my @list   = ();
        if ( @_ ) {
            tie @list, __PACKAGE__, $self->{ELEMSIZE};
            @list = @_;
        }
        return splice @{$self->{ARRAY}}, $offset, $length, @list;
    }
  • UNTIE this

    Will be called when untie happens. (See The untie Gotcha below.)

  • DESTROY this

    This method will be triggered when the tied variable needs to be destructed. As with the scalar tie class, this is almost never needed in a language that does its own garbage collection, so this time we'll just leave it out.
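
To see which of the array methods above Perl actually invokes, one option is to subclass Tie::StdArray (shipped in the Tie::Array module) and log a couple of calls. This is only a minimal sketch: the LoggedArray package and its @log variable are hypothetical names chosen for the illustration.

```perl
package LoggedArray;            # hypothetical class name
use strict;
use warnings;
require Tie::Array;
our @ISA = ('Tie::StdArray');   # supplies every method; we wrap two
our @log;                       # illustration only: records calls

sub PUSH {
    my $self = shift;
    push @log, 'PUSH ' . scalar(@_);
    return $self->SUPER::PUSH(@_);
}

sub STORESIZE {
    my ($self, $count) = @_;
    push @log, "STORESIZE $count";
    return $self->SUPER::STORESIZE($count);
}

package main;

our @a;
tie @a, 'LoggedArray';
push @a, 'x', 'y';              # triggers PUSH with two elements
$#a = 0;                        # triggers STORESIZE with a count of 1
print "$_\n" for @LoggedArray::log;
print scalar(@a), "\n";         # FETCHSIZE reports the new length
```

Wrapping the inherited methods with SUPER:: keeps the standard behavior intact, so the log shows only what Perl asked for.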

Tying Hashes

Hashes were the first Perl data type to be tied (see dbmopen()). A class implementing a tied hash should define the following methods: TIEHASH is the constructor. FETCH and STORE access the key and value pairs. EXISTS reports whether a key is present in the hash, and DELETE deletes one. CLEAR empties the hash by deleting all the key and value pairs. FIRSTKEY and NEXTKEY implement the keys() and each() functions to iterate over all the keys. SCALAR is triggered when the tied hash is evaluated in scalar context. UNTIE is called when untie happens, and DESTROY is called when the tied variable is garbage collected.

If this seems like a lot, then feel free to simply inherit from the standard Tie::StdHash module for most of your methods, redefining only the interesting ones. See Tie::Hash for details.
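
As a rough sketch of that inheritance approach, the class below redefines only the key-handling methods so that keys are case-insensitive. LowerKeyHash is a hypothetical name; Tie::StdHash itself is defined inside the Tie::Hash module.

```perl
package LowerKeyHash;           # hypothetical class name
use strict;
use warnings;
require Tie::Hash;              # Tie::StdHash is defined in Tie::Hash
our @ISA = ('Tie::StdHash');

# Only the key-handling methods are redefined; TIEHASH, CLEAR,
# FIRSTKEY and NEXTKEY are inherited unchanged.
sub STORE  { my ($self, $key, $value) = @_; $self->{ lc($key) } = $value }
sub FETCH  { my ($self, $key) = @_; return $self->{ lc($key) } }
sub EXISTS { my ($self, $key) = @_; return exists $self->{ lc($key) } }
sub DELETE { my ($self, $key) = @_; return delete $self->{ lc($key) } }

package main;

our %h;
tie %h, 'LowerKeyHash';
$h{Content_Type} = 'text/plain';
print $h{CONTENT_TYPE}, "\n";       # same entry, different case
print join(',', keys %h), "\n";     # keys are stored lowercased
```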

Remember that Perl distinguishes between a key not existing in the hash, and the key existing in the hash but having a corresponding value of undef. The two possibilities can be tested with the exists() and defined() functions.
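
The distinction can be seen on an ordinary hash. In this small sketch the %color hash and its keys are made up for the illustration:

```perl
use strict;
use warnings;

our %color = (sky => 'blue', void => undef);

# exists() asks about the key; defined() asks about the value.
for my $key (qw(sky void missing)) {
    printf "%-8s exists=%d defined=%d\n", $key,
        (exists  $color{$key} ? 1 : 0),
        (defined $color{$key} ? 1 : 0);
}
```

A tied hash class gets to decide both answers: EXISTS serves exists(), while defined() is applied to whatever FETCH returns.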

Here's an example of a somewhat interesting tied hash class: it gives you a hash representing a particular user's dot files. You index into the hash with the name of the file (minus the dot) and you get back that dot file's contents. For example:

  use DotFiles;
  tie %dot, 'DotFiles';
  if ( $dot{profile} =~ /MANPATH/ ||
       $dot{login}   =~ /MANPATH/ ||
       $dot{cshrc}   =~ /MANPATH/ )
  {
      print "you seem to set your MANPATH\n";
  }

Or here's another sample of using our tied class:

  tie %him, 'DotFiles', 'daemon';
  foreach $f ( keys %him ) {
      printf "daemon dot file %s is size %d\n",
          $f, length $him{$f};
  }

In our tied hash DotFiles example, we use a regular hash for the object containing several important fields, of which only the {LIST} field will be what the user thinks of as the real hash.

  • USER

    whose dot files this object represents

  • HOME

    where those dot files live

  • CLOBBER

    whether we should try to change or remove those dot files

  • LIST

    the hash of dot file names and content mappings

Here's the start of DotFiles.pm:

  package DotFiles;
  use Carp;
  sub whowasi { (caller(1))[3] . '()' }
  my $DEBUG = 0;
  sub debug { $DEBUG = @_ ? shift : 1 }

For our example, we want to be able to emit debugging info to help in tracing during development. We also keep one convenience function around internally to help print out warnings; whowasi() returns the name of the function that calls it.
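
How whowasi() finds its caller's name can be tried in isolation. In this sketch the Demo package and its stand-in FETCH are hypothetical; only the whowasi() body is taken from the text above.

```perl
package Demo;                   # hypothetical package for illustration
use strict;
use warnings;

# Element 3 of caller(1) is the fully qualified name of the
# subroutine that called whowasi().
sub whowasi { (caller(1))[3] . '()' }

sub FETCH { return whowasi() }  # stand-in for a real tie method

package main;

print Demo::FETCH(), "\n";      # prints "Demo::FETCH()"
```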

Here are the methods for the DotFiles tied hash.

  • TIEHASH classname, LIST

    This is the constructor for the class. That means it is expected to return a blessed reference through which the new object (probably but not necessarily an anonymous hash) will be accessed.

    Here's the constructor:

    sub TIEHASH {
        my $self = shift;
        my $user = shift || $>;
        my $dotdir = shift || '';
        croak "usage: @{[&whowasi]} [USER [DOTDIR]]" if @_;
        $user = getpwuid($user) if $user =~ /^\d+$/;
        my $dir = (getpwnam($user))[7]
            || croak "@{[&whowasi]}: no user $user";
        $dir .= "/$dotdir" if $dotdir;
        my $node = {
            USER    => $user,
            HOME    => $dir,
            LIST    => {},
            CLOBBER => 0,
        };
        opendir(DIR, $dir)
            || croak "@{[&whowasi]}: can't opendir $dir: $!";
        foreach $dot ( grep /^\./ && -f "$dir/$_", readdir(DIR)) {
            $dot =~ s/^\.//;
            $node->{LIST}{$dot} = undef;
        }
        closedir DIR;
        return bless $node, $self;
    }

    It's probably worth mentioning that if you're going to filetest the return values out of a readdir, you'd better prepend the directory in question. Otherwise, because we didn't chdir() there, it would have been testing the wrong file.

  • FETCH this, key

    This method will be triggered every time an element in the tied hash is accessed (read). It takes one argument beyond its self reference: the key whose value we're trying to fetch.

    Here's the fetch for our DotFiles example.

    sub FETCH {
        carp &whowasi if $DEBUG;
        my $self = shift;
        my $dot = shift;
        my $dir = $self->{HOME};
        my $file = "$dir/.$dot";
        unless (exists $self->{LIST}->{$dot} || -f $file) {
            carp "@{[&whowasi]}: no $dot file" if $DEBUG;
            return undef;
        }
        if (defined $self->{LIST}->{$dot}) {
            return $self->{LIST}->{$dot};
        } else {
            return $self->{LIST}->{$dot} = `cat $dir/.$dot`;
        }
    }

    It was easy to write by having it call the Unix cat(1) command, but it would probably be more portable to open the file manually (and somewhat more efficient). Of course, because dot files are a Unixy concept, we're not that concerned.

  • STORE this, key, value

    This method will be triggered every time an element in the tied hash is set (written). It takes two arguments beyond its self reference: the key under which we're trying to store something, and the value we're trying to put there.

    Here in our DotFiles example, we'll be careful not to let them try to overwrite the file unless they've called the clobber() method on the original object reference returned by tie().

    sub STORE {
        carp &whowasi if $DEBUG;
        my $self = shift;
        my $dot = shift;
        my $value = shift;
        my $file = $self->{HOME} . "/.$dot";
        my $user = $self->{USER};
        croak "@{[&whowasi]}: $file not clobberable"
            unless $self->{CLOBBER};
        open(my $f, '>', $file) || croak "can't open $file: $!";
        print $f $value;
        close($f);
    }

    If they wanted to clobber something, they might say:

    $ob = tie %daemon_dots, 'DotFiles', 'daemon';
    $ob->clobber(1);
    $daemon_dots{signature} = "A true daemon\n";

    Another way to lay hands on a reference to the underlying object is to use the tied() function, so they might alternately have set clobber using:

    tie %daemon_dots, 'DotFiles', 'daemon';
    tied(%daemon_dots)->clobber(1);

    The clobber method is simply:

    sub clobber {
        my $self = shift;
        $self->{CLOBBER} = @_ ? shift : 1;
    }
  • DELETE this, key

    This method is triggered when we remove an element from the hash, typically by using the delete() function. Again, we'll be careful to check whether they really want to clobber files.

    sub DELETE {
        carp &whowasi if $DEBUG;
        my $self = shift;
        my $dot = shift;
        my $file = $self->{HOME} . "/.$dot";
        croak "@{[&whowasi]}: won't remove file $file"
            unless $self->{CLOBBER};
        delete $self->{LIST}->{$dot};
        my $success = unlink($file);
        carp "@{[&whowasi]}: can't unlink $file: $!" unless $success;
        $success;
    }

    The value returned by DELETE becomes the return value of the call to delete(). If you want to emulate the normal behavior of delete(), you should return whatever FETCH would have returned for this key. In this example, we have chosen instead to return a value which tells the caller whether the file was successfully deleted.

  • CLEAR this

    This method is triggered when the whole hash is to be cleared, usually by assigning the empty list to it.

    In our example, that would remove all the user's dot files! It's such a dangerous thing that they'll have to set CLOBBER to something higher than 1 to make it happen.

    sub CLEAR {
        carp &whowasi if $DEBUG;
        my $self = shift;
        croak "@{[&whowasi]}: won't remove all dot files for $self->{USER}"
            unless $self->{CLOBBER} > 1;
        my $dot;
        foreach $dot ( keys %{$self->{LIST}}) {
            $self->DELETE($dot);
        }
    }
  • EXISTS this, key

    This method is triggered when the user uses the exists() function on a particular hash. In our example, we'll look at the {LIST} hash element for this:

    sub EXISTS {
        carp &whowasi if $DEBUG;
        my $self = shift;
        my $dot = shift;
        return exists $self->{LIST}->{$dot};
    }
  • FIRSTKEY this

    This method will be triggered when the user is going to iterate through the hash, such as via a keys() or each() call.

    sub FIRSTKEY {
        carp &whowasi if $DEBUG;
        my $self = shift;
        my $a = keys %{$self->{LIST}}; # reset each() iterator
        each %{$self->{LIST}}
    }
  • NEXTKEY this, lastkey

    This method gets triggered during a keys() or each() iteration. It has a second argument which is the last key that had been accessed. This is useful if you care about ordering, are calling the iterator from more than one sequence, or are not really storing things in a hash anywhere.

    For our example, we're using a real hash so we'll do just the simple thing, but we'll have to go through the LIST field indirectly.

    sub NEXTKEY {
        carp &whowasi if $DEBUG;
        my $self = shift;
        return each %{ $self->{LIST} }
    }
  • SCALAR this

    This is called when the hash is evaluated in scalar context. In order to mimic the behaviour of untied hashes, this method should return a false value when the tied hash is considered empty. If this method does not exist, perl will make some educated guesses and return true when the hash is inside an iteration. If this isn't the case, FIRSTKEY is called, and the result will be a false value if FIRSTKEY returns the empty list, true otherwise.

    However, you should not blindly rely on perl always doing the right thing. Particularly, perl will mistakenly return true when you clear the hash by repeatedly calling DELETE until it is empty. You are therefore advised to supply your own SCALAR method when you want to be absolutely sure that your hash behaves nicely in scalar context.

    In our example we can just call scalar on the underlying hash referenced by $self->{LIST} :

    sub SCALAR {
        carp &whowasi if $DEBUG;
        my $self = shift;
        return scalar %{ $self->{LIST} }
    }
  • UNTIE this

    This is called when untie occurs. See The untie Gotcha below.

  • DESTROY this

    This method is triggered when a tied hash is about to go out of scope. You don't really need it unless you're trying to add debugging or have auxiliary state to clean up. Here's a very simple function:

    sub DESTROY {
        carp &whowasi if $DEBUG;
    }
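
The advice on SCALAR above can be tried without touching any dot files. The following is only a sketch under assumptions: CountedHash is a hypothetical class built on Tie::StdHash, and returning the key count is one possible choice of true/false value, not the only one.

```perl
package CountedHash;            # hypothetical class name
use strict;
use warnings;
require Tie::Hash;
our @ISA = ('Tie::StdHash');

# Report the number of keys, so an empty hash is false and a
# non-empty one is true, however the entries were removed.
sub SCALAR {
    my $self = shift;
    return scalar keys %$self;
}

package main;

our %h;
tie %h, 'CountedHash';
%h = (a => 1, b => 2);
our $full = scalar(%h);         # answered by our SCALAR method
delete $h{$_} for qw(a b);      # empty it by repeated DELETE
our $state = %h ? 'non-empty' : 'empty';
print "before: $full, after: $state\n";
```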

Note that functions such as keys() and values() may return huge lists when used on large objects, like DBM files. You may prefer to use the each() function to iterate over such. Example:

  # print out history file offsets
  use NDBM_File;
  tie(%HIST, 'NDBM_File', '/usr/lib/news/history', 1, 0);
  while (($key,$val) = each %HIST) {
      print $key, ' = ', unpack('L',$val), "\n";
  }
  untie(%HIST);
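
For readers without a news history file, the same each() pattern can be tried against a scratch database. This sketch assumes SDBM_File (the DBM bundled with Perl) is available; the temporary directory and the key names are made up for the illustration.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # the DBM that ships with Perl
use File::Temp qw(tempdir);

# Assumption: a scratch database in a temporary directory stands
# in for a real, large DBM file.
my $dir = tempdir(CLEANUP => 1);
our %db;
tie(%db, 'SDBM_File', "$dir/demo", O_RDWR|O_CREAT, 0666)
    or die "Cannot tie SDBM file: $!";
$db{"key$_"} = $_ * 10 for 1 .. 5;

# each() fetches one pair per call instead of building a full list.
our $count = 0;
while (my ($key, $val) = each %db) {
    $count++;
    print "$key = $val\n";
}
untie %db;
```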

Tying FileHandles

This is partially implemented now.

A class implementing a tied filehandle should define the following methods: TIEHANDLE, at least one of PRINT, PRINTF, WRITE, READLINE, GETC, READ, and possibly CLOSE, UNTIE and DESTROY. The class can also provide: BINMODE, OPEN, EOF, FILENO, SEEK, TELL - if the corresponding perl operators are used on the handle.

When STDERR is tied, its PRINT method will be called to issue warnings and error messages. This feature is temporarily disabled during the call, which means you can use warn() inside PRINT without starting a recursive loop. And just like __WARN__ and __DIE__ handlers, STDERR's PRINT method may be called to report parser errors, so the caveats mentioned under %SIG in perlvar apply.

All of this is especially useful when perl is embedded in some other program, where output to STDOUT and STDERR may have to be redirected in some special way. See nvi and the Apache module for examples.

When tying a handle, the first argument to tie should begin with an asterisk. So, if you are tying STDOUT, use *STDOUT. If you have assigned it to a scalar variable, say $handle, use *$handle. tie $handle ties the scalar variable $handle, not the handle inside it.
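
As a minimal sketch of the asterisk rule, the following ties *STDOUT to a class that collects output in an array instead of printing it. The Capture class and the @captured array are hypothetical names used only here.

```perl
package Capture;                # hypothetical class name
use strict;
use warnings;

sub TIEHANDLE {
    my ($class, $buffer) = @_;  # $buffer: array ref that receives output
    return bless { buf => $buffer }, $class;
}

sub PRINT {
    my $self = shift;
    push @{ $self->{buf} }, join('', @_);
    return 1;
}

package main;

our @captured;
tie *STDOUT, 'Capture', \@captured;   # note the asterisk
print "hello", " world";              # goes to Capture::PRINT
untie *STDOUT;                        # STDOUT works normally again
print "captured: $captured[0]\n";
```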

In our example we're going to create a shouting handle.

  package Shout;
  • TIEHANDLE classname, LIST

    This is the constructor for the class. That means it is expected to return a blessed reference of some sort. The reference can be used to hold some internal information.

    sub TIEHANDLE { print "<shout>\n"; my $i; bless \$i, shift }
  • WRITE this, LIST

    This method will be called when the handle is written to via the syswrite function.

    sub WRITE {
        $r = shift;
        my($buf,$len,$offset) = @_;
        print "WRITE called, \$buf=$buf, \$len=$len, \$offset=$offset";
    }
  • PRINT this, LIST

    This method will be triggered every time the tied handle is printed to with the print() or say() functions. Beyond its self reference it also expects the list that was passed to the print function.

    sub PRINT { $r = shift; $$r++; print join($,,map(uc($_),@_)),$\ }

    say() acts just like print() except $\ will be localized to \n so you need do nothing special to handle say() in PRINT() .

  • PRINTF this, LIST

    This method will be triggered every time the tied handle is printed to with the printf() function. Beyond its self reference it also expects the format and list that was passed to the printf function.

    sub PRINTF {
        shift;
        my $fmt = shift;
        print sprintf($fmt, @_);
    }
  • READ this, LIST

    This method will be called when the handle is read from via the read or sysread functions.

    sub READ {
        my $self = shift;
        my $bufref = \$_[0];
        my(undef,$len,$offset) = @_;
        print "READ called, \$buf=$bufref, \$len=$len, \$offset=$offset";
        # add to $$bufref, set $len to number of characters read
        $len;
    }
  • READLINE this

    This method is called when the handle is read via <HANDLE> or readline HANDLE .

    As per readline, in scalar context it should return the next line, or undef for no more data. In list context it should return all remaining lines, or an empty list for no more data. The strings returned should include the input record separator $/ (see perlvar), unless it is undef (which means "slurp" mode).

    sub READLINE {
        my $r = shift;
        if (wantarray) {
            return ("all remaining\n",
                    "lines up\n",
                    "to eof\n");
        } else {
            return "READLINE called " . ++$$r . " times\n";
        }
    }
  • GETC this

    This method will be called when the getc function is called.

    sub GETC { print "Don't GETC, Get Perl"; return "a"; }
  • EOF this

    This method will be called when the eof function is called.

    Starting with Perl 5.12, an additional integer parameter will be passed. It will be zero if eof is called without parameter; 1 if eof is given a filehandle as a parameter, e.g. eof(FH); and 2 in the very special case that the tied filehandle is ARGV and eof is called with an empty parameter list, e.g. eof().

    sub EOF { not length $stringbuf }
  • CLOSE this

    This method will be called when the handle is closed via the close function.

    sub CLOSE { print "CLOSE called.\n" }
  • UNTIE this

    As with the other types of ties, this method will be called when untie happens. It may be appropriate to "auto CLOSE" when this occurs. See The untie Gotcha below.

  • DESTROY this

    As with the other types of ties, this method will be called when the tied handle is about to be destroyed. This is useful for debugging and possibly cleaning up.

    sub DESTROY { print "</shout>\n" }

Here's how to use our little example:

  tie(*FOO,'Shout');
  print FOO "hello\n";
  $a = 4; $b = 6;
  print FOO $a, " plus ", $b, " equals ", $a + $b, "\n";
  print <FOO>;
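
Shout's READLINE returns canned strings. As a complementary sketch, a read-only handle can serve lines from a list and honor both calling contexts as well as eof(). The LineSource class and the IN handle are hypothetical names for this illustration.

```perl
package LineSource;             # hypothetical class name
use strict;
use warnings;

sub TIEHANDLE {
    my ($class, @lines) = @_;
    return bless { lines => [@lines] }, $class;
}

sub READLINE {
    my $self = shift;
    # list context: hand over everything that is left
    return splice @{ $self->{lines} } if wantarray;
    # scalar context: next line, or undef when exhausted
    return shift @{ $self->{lines} };
}

sub EOF { my $self = shift; return !@{ $self->{lines} } }

package main;

tie *IN, 'LineSource', "one\n", "two\n", "three\n";
our $first = <IN>;              # scalar context: one line
our @rest  = <IN>;              # list context: all remaining lines
our $done  = eof(IN) ? 1 : 0;   # calls the EOF method
untie *IN;
print $first, @rest, "eof=$done\n";
```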

UNTIE this

You can define for all tie types an UNTIE method that will be called at untie(). See The untie Gotcha below.

The untie Gotcha

If you intend making use of the object returned from either tie() or tied(), and if the tie's target class defines a destructor, there is a subtle gotcha you must guard against.

As setup, consider this (admittedly rather contrived) example of a tie; all it does is use a file to keep a log of the values assigned to a scalar.

  package Remember;
  use strict;
  use warnings;
  use IO::File;
  sub TIESCALAR {
      my $class = shift;
      my $filename = shift;
      my $handle = IO::File->new( "> $filename" )
          or die "Cannot open $filename: $!\n";
      print $handle "The Start\n";
      bless {FH => $handle, Value => 0}, $class;
  }
  sub FETCH {
      my $self = shift;
      return $self->{Value};
  }
  sub STORE {
      my $self = shift;
      my $value = shift;
      my $handle = $self->{FH};
      print $handle "$value\n";
      $self->{Value} = $value;
  }
  sub DESTROY {
      my $self = shift;
      my $handle = $self->{FH};
      print $handle "The End\n";
      close $handle;
  }
  1;

Here is an example that makes use of this tie:

  use strict;
  use Remember;
  my $fred;
  tie $fred, 'Remember', 'myfile.txt';
  $fred = 1;
  $fred = 4;
  $fred = 5;
  untie $fred;
  system "cat myfile.txt";

This is the output when it is executed:

  The Start
  1
  4
  5
  The End

So far so good. Those of you who have been paying attention will have spotted that the tied object hasn't been used so far. So let's add an extra method to the Remember class to allow comments to be included in the file; say, something like this:

  sub comment {
      my $self = shift;
      my $text = shift;
      my $handle = $self->{FH};
      print $handle $text, "\n";
  }

And here is the previous example modified to use the comment method (which requires the tied object):

  use strict;
  use Remember;
  my ($fred, $x);
  $x = tie $fred, 'Remember', 'myfile.txt';
  $fred = 1;
  $fred = 4;
  comment $x "changing...";
  $fred = 5;
  untie $fred;
  system "cat myfile.txt";

When this code is executed there is no output. Here's why:

When a variable is tied, it is associated with the object which is the return value of the TIESCALAR, TIEARRAY, or TIEHASH function. This object normally has only one reference, namely, the implicit reference from the tied variable. When untie() is called, that reference is destroyed. Then, as in the first example above, the object's destructor (DESTROY) is called, which is normal for objects that have no more valid references; and thus the file is closed.

In the second example, however, we have stored another reference to the tied object in $x. That means that when untie() gets called there will still be a valid reference to the object in existence, so the destructor is not called at that time, and thus the file is not closed. There is no output because the file buffers have not been flushed to disk.

Now that you know what the problem is, what can you do to avoid it? Prior to the introduction of the optional UNTIE method the only way was the good old -w flag, which will spot any instances where you call untie() and there are still valid references to the tied object. If the second script above had use warnings 'untie' near the top or was run with the -w flag, Perl prints this warning message:

  untie attempted while 1 inner references still exist

To get the script to work properly and silence the warning make sure there are no valid references to the tied object before untie() is called:

  undef $x;
  untie $fred;
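
The effect of a lingering reference can be observed without any file I/O. In this sketch the Tracker class and its $destroyed counter are hypothetical, introduced only to make the timing of DESTROY visible.

```perl
package Tracker;                # hypothetical class name
use strict;
use warnings;

our $destroyed = 0;             # illustration only: counts DESTROY calls

sub TIESCALAR { my $class = shift; my $value; return bless \$value, $class }
sub FETCH     { my $self = shift; return $$self }
sub STORE     { my ($self, $new) = @_; $$self = $new }
sub DESTROY   { $destroyed++ }

package main;

our ($after_untie, $after_undef);
my $var;
my $obj = tie $var, 'Tracker';  # second reference to the tied object
{
    no warnings 'untie';        # silence the warning for this demo
    untie $var;
}
$after_untie = $Tracker::destroyed;   # still 0: $obj keeps it alive
undef $obj;
$after_undef = $Tracker::destroyed;   # now 1: last reference is gone
print "after untie: $after_untie, after undef: $after_undef\n";
```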

Now that UNTIE exists the class designer can decide which parts of the class functionality are really associated with untie and which with the object being destroyed. What makes sense for a given class depends on whether the inner references are being kept so that non-tie-related methods can be called on the object. But in most cases it probably makes sense to move the functionality that would have been in DESTROY to the UNTIE method.

If the UNTIE method exists then the warning above does not occur. Instead the UNTIE method is passed the count of "extra" references and can issue its own warning if appropriate. For example, to replicate the behavior of a class without UNTIE, this method can be used:

  sub UNTIE
  {
      my ($obj,$count) = @_;
      carp "untie attempted while $count inner references still exist" if $count;
  }
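
To watch the count that UNTIE receives, a small class can simply record it. This is a sketch under assumptions: Counter and $extra_refs are hypothetical names, and a real class would more likely flush or close resources at this point.

```perl
package Counter;                # hypothetical class name
use strict;
use warnings;

our $extra_refs;                # illustration only: last count seen

sub TIESCALAR { my $class = shift; my $value = 0; return bless \$value, $class }
sub FETCH     { my $self = shift; return $$self }
sub STORE     { my ($self, $new) = @_; $$self = $new }

# Perl passes the number of references that remain besides the
# one held by the tied variable itself.
sub UNTIE {
    my ($self, $count) = @_;
    $extra_refs = $count;
}

package main;

my $var;
my $obj = tie $var, 'Counter';  # $obj is one extra reference
untie $var;                     # no warning: UNTIE is defined
print "extra references at untie: $Counter::extra_refs\n";
```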

SEE ALSO

See DB_File or Config for some interesting tie() implementations. A good starting point for many tie() implementations is with one of the modules Tie::Scalar, Tie::Array, Tie::Hash, or Tie::Handle.

BUGS

The bucket usage information provided by scalar(%hash) is not available. What this means is that using %tied_hash in boolean context doesn't work right (currently this always tests false, regardless of whether the hash is empty or has elements).

Localizing tied arrays or hashes does not work. After exiting the scope the arrays or the hashes are not restored.

Counting the number of entries in a hash via scalar(keys(%hash)) or scalar(values(%hash)) is inefficient since it needs to iterate through all the entries with FIRSTKEY/NEXTKEY.

Tied hash/array slices cause multiple FETCH/STORE pairs, there are no tie methods for slice operations.

You cannot easily tie a multilevel data structure (such as a hash of hashes) to a dbm file. The first problem is that all but GDBM and Berkeley DB have size limitations, but beyond that, you also have problems with how references are to be represented on disk. One module that does attempt to address this need is DBM::Deep. Check your nearest CPAN site as described in perlmodlib for source code. Note that despite its name, DBM::Deep does not use dbm. Another earlier attempt at solving the problem is MLDBM, which is also available on the CPAN, but which has some fairly serious limitations.

Tied filehandles are still incomplete. sysopen(), truncate(), flock(), fcntl(), stat() and -X can't currently be trapped.

AUTHOR

Tom Christiansen

TIEHANDLE by Sven Verdoolaege <skimo@dns.ufsia.ac.be> and Doug MacEachern <dougm@osf.org>

UNTIE by Nick Ing-Simmons <nick@ing-simmons.net>

SCALAR by Tassilo von Parseval <tassilo.von.parseval@rwth-aachen.de>

Tying Arrays by Casey West <casey@geeknest.com>

 
perltodo

NAME

perltodo - Perl TO-DO List

DESCRIPTION

We no longer install the Perl 5 to-do list as a manpage, as installing a snapshot that becomes increasingly out of date isn't that useful to anyone. The current Perl 5 to-do list is maintained in the git repository, and can be viewed at http://perl5.git.perl.org/perl.git/blob/HEAD:/Porting/todo.pod

perltooc

NAME

perltooc - This document has been deleted

DESCRIPTION

For information on OO programming with Perl, please see perlootut and perlobj.

perltoot

NAME

perltoot - This document has been deleted

DESCRIPTION

For information on OO programming with Perl, please see perlootut and perlobj.

perltrap

NAME

perltrap - Perl traps for the unwary

DESCRIPTION

The biggest trap of all is forgetting to use warnings or use the -w switch; see perllexwarn and perlrun. The second biggest trap is not making your entire program runnable under use strict. The third biggest trap is not reading the list of changes in this version of Perl; see perldelta.

Awk Traps

Accustomed awk users should take special note of the following:

  • A Perl program executes only once, not once for each input line. You can do an implicit loop with -n or -p .

  • The English module, loaded via

    use English;

    allows you to refer to special variables (like $/ ) with names (like $RS), as though they were in awk; see perlvar for details.

  • Semicolons are required after all simple statements in Perl (except at the end of a block). Newline is not a statement delimiter.

  • Curly brackets are required on ifs and whiles.

  • Variables begin with "$", "@" or "%" in Perl.

  • Arrays index from 0. Likewise string positions in substr() and index().

  • You have to decide whether your array has numeric or string indices.

  • Hash values do not spring into existence upon mere reference.

  • You have to decide whether you want to use string or numeric comparisons.

  • Reading an input line does not split it for you. You get to split it to an array yourself. And the split() operator has different arguments than awk's.

  • The current input line is normally in $_, not $0. It generally does not have the newline stripped. ($0 is the name of the program executed.) See perlvar.

  • $<digit> does not refer to fields--it refers to substrings matched by the last match pattern.

  • The print() statement does not add field and record separators unless you set $, and $\ . You can set $OFS and $ORS if you're using the English module.

  • You must open your files before you print to them.

  • The range operator is "..", not comma. The comma operator works as in C.

  • The match operator is "=~", not "~". ("~" is the one's complement operator, as in C.)

  • The exponentiation operator is "**", not "^". "^" is the XOR operator, as in C. (You know, one could get the feeling that awk is basically incompatible with C.)

  • The concatenation operator is ".", not the null string. (Using the null string would render /pat/ /pat/ unparsable, because the third slash would be interpreted as a division operator--the tokenizer is in fact slightly context sensitive for operators like "/", "?", and ">". And in fact, "." itself can be the beginning of a number.)

  • The next, exit, and continue keywords work differently.

  • The following variables work differently:

    Awk       Perl
    ARGC      scalar @ARGV (compare with $#ARGV)
    ARGV[0]   $0
    FILENAME  $ARGV
    FNR       $. - something
    FS        (whatever you like)
    NF        $#Fld, or some such
    NR        $.
    OFMT      $#
    OFS       $,
    ORS       $\
    RLENGTH   length($&)
    RS        $/
    RSTART    length($`)
    SUBSEP    $;
  • You cannot set $RS to a pattern, only a string.

  • When in doubt, run the awk construct through a2p and see what it gives you.
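
Several of the points above (no automatic field splitting, newline kept, zero-based indexing) can be seen in a few lines. In this sketch a hard-coded string stands in for an input line, and the variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Assumption: a hard-coded record stands in for a line read from input.
my $line = "alpha beta  gamma\n";

chomp $line;                       # Perl keeps the newline; awk strips it
our @fields = split ' ', $line;    # ' ' splits on runs of whitespace, like awk
our $nf     = scalar @fields;      # the equivalent of awk's NF
print "NF=$nf first=$fields[0] last=$fields[-1]\n";  # indexes start at 0
```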

C/C++ Traps

Cerebral C and C++ programmers should take note of the following:

  • Curly brackets are required on if's and while's.

  • You must use elsif rather than else if .

  • The break and continue keywords from C become in Perl last and next, respectively. Unlike in C, these do not work within a do { } while construct. See Loop Control in perlsyn.

  • The switch statement is called given/when and only available in perl 5.10 or newer. See Switch Statements in perlsyn.

  • Variables begin with "$", "@" or "%" in Perl.

  • Comments begin with "#", not "/*" or "//". Perl may interpret C/C++ comments as division operators, unterminated regular expressions or the defined-or operator.

  • You can't take the address of anything, although a similar operator in Perl is the backslash, which creates a reference.

  • ARGV must be capitalized. $ARGV[0] is C's argv[1] , and argv[0] ends up in $0 .

  • System calls such as link(), unlink(), rename(), etc. return nonzero for success, not 0. (system(), however, returns zero for success.)

  • Signal handlers deal with signal names, not numbers. Use kill -l to find their names on your system.
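
A short sketch of a few of these differences follows: elsif, next and last in place of continue and break, and the true-on-success convention of builtins. The file path is deliberately nonexistent and purely illustrative.

```perl
use strict;
use warnings;

our @seen;
for my $n (1 .. 10) {
    if    ($n % 2) { next }            # C's continue
    elsif ($n > 6) { last }            # C's break; note elsif, not else if
    else           { push @seen, $n }
}
print "@seen\n";

# Builtins such as unlink() return a true value on success (here the
# number of files removed), so failure is 0 rather than nonzero.
our $removed = unlink '/no/such/file/for/this/demo';
print "unlink removed $removed file(s)\n";
```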

Sed Traps

Seasoned sed programmers should take note of the following:

  • A Perl program executes only once, not once for each input line. You can do an implicit loop with -n or -p .

  • Backreferences in substitutions use "$" rather than "\".

  • The pattern matching metacharacters "(", ")", and "|" do not have backslashes in front.

  • The range operator is ... , rather than comma.
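
The backreference and metacharacter points can be sketched as follows; the sample strings are made up for the illustration.

```perl
use strict;
use warnings;

our $text = "hello world";
# sed:  s/\(hello\) \(world\)/\2 \1/
# Perl: bare parentheses group, and the replacement uses $1 and $2.
$text =~ s/(hello) (world)/$2 $1/;
print "$text\n";

# Without -n or -p the program runs once, so loop explicitly.
our @out;
for my $word (qw(sed perl)) {
    (my $copy = $word) =~ s/a|e/_/g;   # "|" alternates without a backslash
    push @out, $copy;
}
print "@out\n";
```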

Shell Traps

Sharp shell programmers should take note of the following:

  • The backtick operator does variable interpolation without regard to the presence of single quotes in the command.

  • The backtick operator does no translation of the return value, unlike csh.

  • Shells (especially csh) do several levels of substitution on each command line. Perl does substitution in only certain constructs such as double quotes, backticks, angle brackets, and search patterns.

  • Shells interpret scripts a little bit at a time. Perl compiles the entire program before executing it (except for BEGIN blocks, which execute at compile time).

  • The arguments are available via @ARGV, not $1, $2, etc.

  • The environment is not automatically made available as separate scalar variables.

  • The shell's test uses "=", "!=", "<" etc. for string comparisons and "-eq", "-ne", "-lt" etc. for numeric comparisons. This is the reverse of Perl, which uses eq, ne, lt for string comparisons, and ==, !=, < etc. for numeric comparisons.
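
The difference between the two families of comparison operators is easy to see with values that compare differently as strings and as numbers. The following is a small sketch with made-up values:

```perl
use strict;
use warnings;

# String operators compare character by character.
my $string_eq  = ("1.0" eq "1") ? "true" : "false";   # different strings
my $string_lt  = ("10"  lt "9") ? "true" : "false";   # "1" sorts before "9"

# Numeric operators convert both operands to numbers first.
my $numeric_eq = ("1.0" == "1") ? "true" : "false";   # 1.0 equals 1
my $numeric_lt = ("10"  <  "9") ? "true" : "false";   # 10 is not below 9

print "eq: $string_eq, ==: $numeric_eq, lt: $string_lt, <: $numeric_lt\n";
```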

Perl Traps

Practicing Perl Programmers should take note of the following:

  • Remember that many operations behave differently in a list context than they do in a scalar one. See perldata for details.

  • Avoid barewords if you can, especially all lowercase ones. You can't tell by just looking at it whether a bareword is a function or a string. By using quotes on strings and parentheses on function calls, you won't ever get them confused.

  • You cannot discern from mere inspection which builtins are unary operators (like chop() and chdir()) and which are list operators (like print() and unlink()). (Unless prototyped, user-defined subroutines can only be list operators, never unary ones.) See perlop and perlsub.

  • People have a hard time remembering that some functions default to $_, or @ARGV, or whatever, but that others which you might expect to do not.

  • The <FH> construct is not the name of the filehandle, it is a readline operation on that handle. The data read is assigned to $_ only if the file read is the sole condition in a while loop:

    while (<FH>) { }
    while (defined($_ = <FH>)) { }
    <FH>; # data discarded!
  • Remember not to use = when you need =~ ; these two constructs are quite different:

    $x = /foo/;
    $x =~ /foo/;
  • The do {} construct isn't a real loop that you can use loop control on.

  • Use my() for local variables whenever you can get away with it (but see perlform for where you can't). Using local() actually gives a local value to a global variable, which leaves you open to unforeseen side-effects of dynamic scoping.

  • If you localize an exported variable in a module, its exported value will not change. The local name becomes an alias to a new value but the external name is still an alias for the original.
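
Two of the traps above, context sensitivity and the difference between = and =~, can be shown in a few lines. This is a sketch with made-up data, not an exhaustive demonstration:

```perl
use strict;
use warnings;

my @items = qw(apple banana cherry);
my $count   = @items;    # scalar context: the number of elements
my ($first) = @items;    # list context: the first element

$_ = "seafood";
my $x = "bar";
my $bound = ($x =~ /foo/) ? 1 : 0;   # tests $x against the pattern
$x = /foo/;                          # tests $_ and assigns the result to $x

print "count=$count first=$first bound=$bound x=$x\n";
```

After the last assignment $x no longer holds "bar": it holds the true value returned by matching $_ against /foo/.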

As always, if any of these are ever officially declared as bugs, they'll be fixed and removed.

 
perltru64

NAME

perltru64 - Perl version 5 on Tru64 (formerly known as Digital UNIX formerly known as DEC OSF/1) systems

DESCRIPTION

This document describes various features of HP's (formerly Compaq's, formerly Digital's) Unix operating system (Tru64) that will affect how Perl version 5 (hereafter just Perl) is configured, compiled and/or runs.

Compiling Perl 5 on Tru64

The recommended compiler to use in Tru64 is the native C compiler. The native compiler produces much faster code (the speed difference is noticeable: several tens of percent) and also more correct code. If you are considering using the GNU C compiler, you should use at the very least release 2.95.3, since all older gcc releases are known to produce broken code when compiling Perl. One manifestation of this brokenness is the lib/sdbm test dumping core; another is many of the op/regexp and op/pat, or ext/Storable tests dumping core (the exact pattern of failures depending on the GCC release and optimization flags).

gcc 3.2.1 is known to work okay with Perl 5.8.0. However, when optimizing toke.c, gcc needs a lot of memory; 256 megabytes seems to be enough. The default setting of the process data section in Tru64 should be one gigabyte, but some sites/setups might have lowered that. The configuration process of Perl checks for process limits that are too low, lowers the optimization for toke.c if necessary, and also gives advice on how to raise the process limits.

Also, Configure might abort with

    Build a threading Perl? [n]
    Configure[2437]: Syntax error at line 1 : 'config.sh' is not expected.

This indicates that Configure is being run with a broken Korn shell (even though you think you are using a Bourne shell by using "sh Configure" or "./Configure"). The Korn shell bug was reported to Compaq in February 1999. In the meantime, the reason ksh is being used is that you have the environment variable BIN_SH set to 'xpg4'. This causes /bin/sh to delegate its duties to /bin/posix/sh (a ksh). Unset the environment variable and rerun Configure.

Using Large Files with Perl on Tru64

In Tru64, Perl is automatically able to use large files, that is, files larger than 2 gigabytes. There is no need to use the Configure -Duselargefiles option as described in INSTALL (though using the option is harmless).

Threaded Perl on Tru64

If you want to use threads, you should primarily use the Perl 5.8.0 threads model by running Configure with -Duseithreads.

Perl threading is going to work only in Tru64 4.0 and newer releases; older operating system releases such as 3.2 probably won't work properly with threads.

In Tru64 V5 (at least V5.1A, V5.1B) you cannot build threaded Perl with gcc because the system header <pthread.h> explicitly checks for supported C compilers, gcc (at least 3.2.2) not being one of them. But the system C compiler should work just fine.

Long Doubles on Tru64

You cannot Configure Perl to use long doubles unless you have at least Tru64 V5.0; the long double support simply wasn't functional enough before that. Perl's Configure will override attempts to use long doubles (you can notice this by Configure finding out that the modfl() function does not work as it should).

At the time of this writing (June 2002), there is a known bug in the Tru64 libc printing of long doubles when not using "e" notation. The values are correct and usable, but you only get a limited number of digits displayed unless you force the issue by using printf "%.33e",$num or the like. For Tru64 versions V5.0A through V5.1A, a patch is expected sometime after perl 5.8.0 is released. If your libc has not yet been patched, you'll get a warning from Configure when selecting long doubles.
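
As a rough sketch of the workaround mentioned above, the following formats a value with an explicit "e" notation precision and reports whether this perl was configured with long doubles. The value 1/3 is just an example; how many of the printed digits are meaningful depends on whether long doubles are actually in use.

```perl
use strict;
use warnings;
use Config;

my $num = 1 / 3;

# $Config{uselongdouble} is set only if perl was built with long doubles.
my $uses_long_double = $Config{uselongdouble} ? "yes" : "no";

# Force "e" notation with 33 digits after the decimal point.
my $formatted = sprintf "%.33e", $num;

print "long doubles: $uses_long_double\n";
print "value: $formatted\n";
```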

DB_File tests failing on Tru64

The DB_File tests (db-btree.t, db-hash.t, db-recno.t) may fail if you have installed a newer version of Berkeley DB into the system and the -I and -L compiler and linker flags introduce version conflicts with the DB 1.85 headers and libraries that came with Tru64. For example, mixing a DB v2 library with the DB v1 headers is a bad idea. The first option is to avoid the conflict: watch out for the Configure options -Dlocincpth and -Dloclibpth, and check your /usr/local/include and /usr/local/lib, since they are included by default.

The second option is to explicitly instruct Configure to detect the newer Berkeley DB installation, by supplying the right directories with -Dlocincpth=/some/include and -Dloclibpth=/some/lib and before running "make test" setting your LD_LIBRARY_PATH to /some/lib.

The third option is to work around the problem by disabling DB_File completely when building Perl, by specifying -Ui_db to Configure, and then using the BerkeleyDB module from CPAN instead of DB_File. The BerkeleyDB module works with Berkeley DB versions 2.* or greater.

The Berkeley DB 4.1.25 has been tested with Tru64 V5.1A and found to work. The latest Berkeley DB can be found from http://www.sleepycat.com.

64-bit Perl on Tru64

In Tru64, Perl's integers are automatically 64 bits wide, so there is no need to use the Configure -Duse64bitint option as described in INSTALL. Similarly, there is no need for -Duse64bitall since pointers are automatically 64 bits wide.
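
One way to confirm the widths on a given build is to ask the Config module, as in this small sketch (the output naturally depends on the platform the perl binary was built for):

```perl
use strict;
use warnings;
use Config;

# ivsize and ptrsize are recorded in bytes when perl is configured.
my $iv_bits  = 8 * $Config{ivsize};
my $ptr_bits = 8 * $Config{ptrsize};

print "integers: $iv_bits bits, pointers: $ptr_bits bits\n";
```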

Warnings about floating-point overflow when compiling Perl on Tru64

When compiling Perl in Tru64 you may (depending on the compiler release) see two warnings like this

    cc: Warning: numeric.c, line 104: In this statement, floating-point overflow occurs in evaluating the expression "1.8e308". (floatoverfl)
    return HUGE_VAL;
    -----------^

and when compiling the POSIX extension

    cc: Warning: const-c.inc, line 2007: In this statement, floating-point overflow occurs in evaluating the expression "1.8e308". (floatoverfl)
    return HUGE_VAL;
    -------------------^

The exact line numbers may vary between Perl releases. The warnings are benign and can be ignored: in later C compiler releases the warnings should be gone.

When the file pp_sys.c is being compiled you may (depending on the operating system release) see an additional compiler flag being used: -DNO_EFF_ONLY_OK . This is normal and refers to a feature that is relevant only if you use the filetest pragma. In older releases of the operating system the feature was broken and the NO_EFF_ONLY_OK instructs Perl not to use the feature.

Testing Perl on Tru64

During "make test" the comp/cpp test will be skipped because on Tru64 it cannot be run before Perl has been installed. The test exercises the -P option of Perl.

ext/ODBM_File/odbm Test Failing With Static Builds

The ext/ODBM_File/odbm test is known to fail with static builds (Configure -Uusedl) due to a known bug in Tru64's static libdbm library. The good news is that you very probably never need to use the ODBM_File extension, since the more advanced NDBM_File works fine, not to mention the even more advanced DB_File.

Perl Fails Because Of Unresolved Symbol sockatmark

If you get an error like

    Can't load '.../OSF1/lib/perl5/5.8.0/alpha-dec_osf/auto/IO/IO.so' for module IO: Unresolved symbol in .../lib/perl5/5.8.0/alpha-dec_osf/auto/IO/IO.so: sockatmark at .../lib/perl5/5.8.0/alpha-dec_osf/XSLoader.pm line 75.

you need to either recompile your Perl in Tru64 4.0D or upgrade your Tru64 4.0D to at least 4.0F: the sockatmark() system call was added in Tru64 4.0F, and the IO extension refers to that symbol.

AUTHOR

Jarkko Hietaniemi <jhi@iki.fi>

 
perlunicode

NAME

perlunicode - Unicode support in Perl

DESCRIPTION

Important Caveats

Unicode support is an extensive requirement. While Perl does not implement the Unicode standard or the accompanying technical reports from cover to cover, Perl does support many Unicode features.

People who want to learn to use Unicode in Perl should probably read the Perl Unicode tutorials, perlunitut and perluniintro, before reading this reference document.

Also, the use of Unicode may present security issues that aren't obvious. Read Unicode Security Considerations.

  • Safest if you "use feature 'unicode_strings'"

    In order to preserve backward compatibility, Perl does not turn on full internal Unicode support unless the pragma use feature 'unicode_strings' is specified. (This is automatically selected if you say use 5.012 or higher.) Failure to do this can lead to surprises. See The Unicode Bug below.

    This pragma doesn't affect I/O. Nor does it change the internal representation of strings, only their interpretation. There are still several places where Unicode isn't fully supported, such as in filenames.

  • Input and Output Layers

    Perl knows when a filehandle uses Perl's internal Unicode encodings (UTF-8, or UTF-EBCDIC if in EBCDIC) if the filehandle is opened with the ":encoding(utf8)" layer. Other encodings can be converted to Perl's encoding on input or from Perl's encoding on output by use of the ":encoding(...)" layer. See open.

    To indicate that the Perl source itself is in UTF-8, say use utf8;.

  • use utf8 still needed to enable UTF-8/UTF-EBCDIC in scripts

    As a compatibility measure, the use utf8 pragma must be explicitly included to enable recognition of UTF-8 in the Perl scripts themselves (in string or regular expression literals, or in identifier names) on ASCII-based machines or to recognize UTF-EBCDIC on EBCDIC-based machines. These are the only times when an explicit use utf8 is needed. See utf8.

  • BOM-marked scripts and UTF-16 scripts autodetected

    If a Perl script begins with the Unicode BOM (UTF-16LE, UTF-16BE, or UTF-8), or if the script looks like non-BOM-marked UTF-16 of either endianness, Perl will correctly read in the script as Unicode. (BOMless UTF-8 cannot be effectively recognized or differentiated from ISO 8859-1 or other eight-bit encodings.)

  • use encoding needed to upgrade non-Latin-1 byte strings

    By default, there is a fundamental asymmetry in Perl's Unicode model: implicit upgrading from byte strings to Unicode strings assumes that they were encoded in ISO 8859-1 (Latin-1), but Unicode strings are downgraded with UTF-8 encoding. This happens because the first 256 code points in Unicode happen to agree with Latin-1.

    See Byte and Character Semantics for more details.
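
The effect of an encoding layer can be sketched without touching the file system by opening in-memory filehandles. The sample data here is made up; a real program would normally name a file instead of a scalar reference.

```perl
use strict;
use warnings;

# Five bytes of UTF-8 for the four characters "cafe" with an acute accent.
my $bytes = "caf\xC3\xA9";

# On input, the layer decodes the UTF-8 bytes into Perl characters.
open my $in, '<:encoding(UTF-8)', \$bytes or die "open for reading: $!";
my $text = <$in>;
close $in;
my $char_count = length $text;

# On output, the layer encodes characters back into UTF-8 bytes.
my $out_bytes = '';
open my $out, '>:encoding(UTF-8)', \$out_bytes or die "open for writing: $!";
print {$out} "\x{263A}";
close $out;
my $byte_count = length $out_bytes;

print "characters read: $char_count, bytes written: $byte_count\n";
```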

Byte and Character Semantics

Perl uses logically-wide characters to represent strings internally.

Starting in Perl 5.14, Perl-level operations work with characters rather than bytes within the scope of a use feature 'unicode_strings' (or equivalently use 5.012 or higher). (This is not true if bytes have been explicitly requested by use bytes, nor necessarily true for interactions with the platform's operating system.)

For earlier Perls, and when unicode_strings is not in effect, Perl provides a fairly safe environment that can handle both types of semantics in programs. For operations where Perl can unambiguously decide that the input data are characters, Perl switches to character semantics. For operations where this determination cannot be made without additional information from the user, Perl decides in favor of compatibility and chooses to use byte semantics.

When use locale (but not use locale ':not_characters' ) is in effect, Perl uses the semantics associated with the current locale. (use locale overrides use feature 'unicode_strings' in the same scope; while use locale ':not_characters' effectively also selects use feature 'unicode_strings' in its scope; see perllocale.) Otherwise, Perl uses the platform's native byte semantics for characters whose code points are less than 256, and Unicode semantics for those greater than 255. That means that non-ASCII characters are undefined except for their ordinal numbers. This means that none have case (upper and lower), nor are any a member of character classes, like [:alpha:] or \w . (But all do belong to the \W class or the Perl regular expression extension [:^alpha:].)

This behavior preserves compatibility with earlier versions of Perl, which allowed byte semantics in Perl operations only if none of the program's inputs were marked as being a source of Unicode character data. Such data may come from filehandles, from calls to external programs, from information provided by the system (such as %ENV), or from literals and constants in the source text.

The utf8 pragma is primarily a compatibility device that enables recognition of UTF-(8|EBCDIC) in literals encountered by the parser. Note that this pragma is only required while Perl defaults to byte semantics; when character semantics become the default, this pragma may become a no-op. See utf8.

If strings operating under byte semantics and strings with Unicode character data are concatenated, the new string will have character semantics. This can cause surprises: See BUGS, below. You can choose to be warned when this happens. See encoding::warnings.

Under character semantics, many operations that formerly operated on bytes now operate on characters. A character in Perl is logically just a number ranging from 0 to 2**31 or so. Larger characters may encode into longer sequences of bytes internally, but this internal detail is mostly hidden for Perl code. See perluniintro for more.

Effects of Character Semantics

Character semantics have the following effects:

  • Strings--including hash keys--and regular expression patterns may contain characters that have an ordinal value larger than 255.

    If you use a Unicode editor to edit your program, Unicode characters may occur directly within the literal strings in UTF-8 encoding, or UTF-16. (The former requires a BOM or use utf8 , the latter requires a BOM.)

    Unicode characters can also be added to a string by using the \N{U+...} notation. The Unicode code for the desired character, in hexadecimal, should be placed in the braces, after the U . For instance, a smiley face is \N{U+263A}.

    Alternatively, you can use the \x{...} notation for characters 0x100 and above. For characters below 0x100 you may get byte semantics instead of character semantics; see The Unicode Bug. On EBCDIC machines there is the additional problem that the value for such characters gives the EBCDIC character rather than the Unicode one, thus it is more portable to use \N{U+...} instead.

    Additionally, you can use the \N{...} notation and put the official Unicode character name within the braces, such as \N{WHITE SMILING FACE} . This automatically loads the charnames module with the :full and :short options. If you prefer different options for this module, you can instead, before the \N{...} , explicitly load it with your desired options; for example,

    use charnames ':loose';
  • If an appropriate encoding is specified, identifiers within the Perl script may contain Unicode alphanumeric characters, including ideographs. Perl does not currently attempt to canonicalize variable names.

  • Regular expressions match characters instead of bytes. "." matches a character instead of a byte.

  • Bracketed character classes in regular expressions match characters instead of bytes and match against the character properties specified in the Unicode properties database. \w can be used to match a Japanese ideograph, for instance.

  • Named Unicode properties, scripts, and block ranges may be used (like bracketed character classes) by using the \p{} "matches property" construct and the \P{} negation, "doesn't match property". See Unicode Character Properties for more details.

    You can define your own character properties and use them in the regular expression with the \p{} or \P{} construct. See User-Defined Character Properties for more details.

  • The special pattern \X matches a logical character, an "extended grapheme cluster" in Standardese. In Unicode what appears to the user to be a single character, for example an accented G , may in fact be composed of a sequence of characters, in this case a G followed by an accent character. \X will match the entire sequence.

  • The tr/// operator translates characters instead of bytes. Note that the tr///CU functionality has been removed. For similar functionality see pack('U0', ...) and pack('C0', ...).

  • Case translation operators use the Unicode case translation tables when character input is provided. Note that uc(), or \U in interpolated strings, translates to uppercase, while ucfirst, or \u in interpolated strings, translates to titlecase in languages that make the distinction (which is equivalent to uppercase in languages without the distinction).

  • Most operators that deal with positions or lengths in a string will automatically switch to using character positions, including chop(), chomp(), substr(), pos(), index(), rindex(), sprintf(), write(), and length(). An operator that specifically does not switch is vec(). Operators that really don't care include operators that treat strings as a bucket of bits such as sort(), and operators dealing with filenames.

  • The pack()/unpack() letter C does not change, since it is often used for byte-oriented formats. Again, think char in the C language.

    There is a new U specifier that converts between Unicode characters and code points. There is also a W specifier that is the equivalent of chr/ord and properly handles character values even if they are above 255.

  • The chr() and ord() functions work on characters, similar to pack("W") and unpack("W"), not pack("C") and unpack("C"). pack("C") and unpack("C") are methods for emulating byte-oriented chr() and ord() on Unicode strings. While these methods reveal the internal encoding of Unicode strings, that is not something one normally needs to care about at all.

  • The bit string operators, & | ^ ~, can operate on character data. However, for backward compatibility (as when bit string operations are applied to characters that are all less than 256 in ordinal value), one should not use ~ (the bit complement) on data that mixes characters with values less than 256 and characters with values greater than 255. Most importantly, DeMorgan's laws (~($x|$y) eq ~$x&~$y and ~($x&$y) eq ~$x|~$y) will not hold. The reason for this mathematical faux pas is that the complement cannot return both the 8-bit (byte-wide) bit complement and the full character-wide bit complement.

  • There is a CPAN module, Unicode::Casing, which allows you to define your own mappings to be used in lc(), lcfirst(), uc(), ucfirst(), and fc (or their double-quoted string inlined versions such as \U ). (Prior to Perl 5.16, this functionality was partially provided in the Perl core, but suffered from a number of insurmountable drawbacks, so the CPAN module was written instead.)

  • And finally, scalar reverse() reverses by character rather than by byte.
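
Several of the effects listed above can be observed with a string that contains a character above 255. This is a minimal sketch; the strings are made up, and only counts and code points are printed to avoid needing an output layer.

```perl
use strict;
use warnings;

my $smiley = "\N{U+263A}";            # WHITE SMILING FACE
my $text   = "ab" . $smiley;

my $length     = length $text;        # counted in characters, not bytes
my $reversed   = reverse $text;       # scalar reverse works by character
my $code_point = ord $smiley;         # ord() returns the full code point

# A "g" followed by COMBINING ACUTE ACCENT is two characters
# but a single extended grapheme cluster for \X.
my $accented_g    = "g\x{0301}";
my $char_count    = length $accented_g;
my $cluster_count = () = $accented_g =~ /\X/g;

printf "length=%d first-after-reverse=U+%04X ord=U+%04X chars=%d clusters=%d\n",
    $length, ord($reversed), $code_point, $char_count, $cluster_count;
```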

Unicode Character Properties

(The only time that Perl considers a sequence of individual code points as a single logical character is in the \X construct, already mentioned above. Therefore "character" in this discussion means a single Unicode code point.)

Very nearly all Unicode character properties are accessible through regular expressions by using the \p{} "matches property" construct and the \P{} "doesn't match property" for its negation.

For instance, \p{Uppercase} matches any single character with the Unicode "Uppercase" property, while \p{L} matches any character with a General_Category of "L" (letter) property. Brackets are not required for single letter property names, so \p{L} is equivalent to \pL .

More formally, \p{Uppercase} matches any single character whose Unicode Uppercase property value is True, and \P{Uppercase} matches any character whose Uppercase property value is False, and they could have been written as \p{Uppercase=True} and \p{Uppercase=False} , respectively.

This formality is needed when properties are not binary; that is, if they can take on more values than just True and False. For example, the Bidi_Class (see Bidirectional Character Types below), can take on several different values, such as Left, Right, Whitespace, and others. To match these, one needs to specify both the property name (Bidi_Class), AND the value being matched against (Left, Right, etc.). This is done, as in the examples above, by having the two components separated by an equal sign (or interchangeably, a colon), like \p{Bidi_Class: Left} .

All Unicode-defined character properties may be written in these compound forms of \p{property=value} or \p{property:value} , but Perl provides some additional properties that are written only in the single form, as well as single-form short-cuts for all binary properties and certain others described below, in which you may omit the property name and the equals or colon separator.

Most Unicode character properties have at least two synonyms (or aliases if you prefer): a short one that is easier to type and a longer one that is more descriptive and hence easier to understand. Thus the "L" and "Letter" properties above are equivalent and can be used interchangeably. Likewise, "Upper" is a synonym for "Uppercase", and we could have written \p{Uppercase} equivalently as \p{Upper} . Also, there are typically various synonyms for the values the property can be. For binary properties, "True" has 3 synonyms: "T", "Yes", and "Y"; and "False" correspondingly has "F", "No", and "N". But be careful. A short form of a value for one property may not mean the same thing as the same short form for another. Thus, for the General_Category property, "L" means "Letter", but for the Bidi_Class property, "L" means "Left". A complete list of properties and synonyms is in perluniprops.

Upper/lower case differences in property names and values are irrelevant; thus \p{Upper} means the same thing as \p{upper} or even \p{UpPeR} . Similarly, you can add or subtract underscores anywhere in the middle of a word, so that these are also equivalent to \p{U_p_p_e_r} . And white space is irrelevant adjacent to non-word characters, such as the braces and the equals or colon separators, so \p{ Upper } and \p{ Upper_case : Y } are equivalent to these as well. In fact, white space and even hyphens can usually be added or deleted anywhere. So even \p{ Up-per case = Yes} is equivalent. All this is called "loose-matching" by Unicode. The few places where stricter matching is used are in the middle of numbers, and in the Perl extension properties that begin or end with an underscore. Stricter matching cares about white space (except adjacent to non-word characters), hyphens, and non-interior underscores.

You can also use negation in both \p{} and \P{} by introducing a caret (^) between the first brace and the property name: \p{^Tamil} is equal to \P{Tamil} .
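
The equivalences described so far can be checked directly. The following sketch uses only forms mentioned above, applied to the made-up test characters "A" and "a"; every check is expected to succeed.

```perl
use strict;
use warnings;

my %checks = (
    'long name'      => scalar("A" =~ /\p{Uppercase}/),
    'short synonym'  => scalar("A" =~ /\p{Upper}/),
    'lower-cased'    => scalar("A" =~ /\p{upper}/),
    'compound form'  => scalar("A" =~ /\p{Uppercase=True}/),
    'negated \P{}'   => scalar("a" =~ /\P{Uppercase}/),
    'caret negation' => scalar("a" =~ /\p{^Uppercase}/),
    'single letter'  => scalar("a" =~ /\pL/),
);

# Count how many of the forms matched as expected.
my $passed = grep { $_ } values %checks;

for my $form (sort keys %checks) {
    printf "%-15s %s\n", $form, $checks{$form} ? "matches" : "no match";
}
```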

Almost all properties are immune to case-insensitive matching. That is, adding a /i regular expression modifier does not change what they match. There are two sets that are affected. The first set is Uppercase_Letter , Lowercase_Letter , and Titlecase_Letter , all of which match Cased_Letter under /i matching. And the second set is Uppercase , Lowercase , and Titlecase , all of which match Cased under /i matching. This set also includes its subsets PosixUpper and PosixLower, both of which match PosixAlpha under /i matching. (The difference between these sets is that some things, such as Roman numerals, come in both upper and lower case so they are Cased , but aren't considered letters, so they aren't in Cased_Letter .)

The result is undefined if you try to match a non-Unicode code point (that is, one above 0x10FFFF) against a Unicode property. Currently, a warning is raised, and the match will fail. In some cases, this is counterintuitive, as both these fail:

    chr(0x110000) =~ /\p{ASCII_Hex_Digit=True}/   # Fails.
    chr(0x110000) =~ /\p{ASCII_Hex_Digit=False}/  # Fails!

General_Category

Every Unicode character is assigned a general category, which is the "most usual categorization of a character" (from http://www.unicode.org/reports/tr44).

The compound way of writing these is like \p{General_Category=Number} (short, \p{gc:n} ). But Perl furnishes shortcuts in which everything up through the equal or colon separator is omitted. So you can instead just write \pN .

Here are the short and long forms of the General Category properties:

    Short       Long

    L           Letter
    LC, L&      Cased_Letter (that is: [\p{Ll}\p{Lu}\p{Lt}])
    Lu          Uppercase_Letter
    Ll          Lowercase_Letter
    Lt          Titlecase_Letter
    Lm          Modifier_Letter
    Lo          Other_Letter

    M           Mark
    Mn          Nonspacing_Mark
    Mc          Spacing_Mark
    Me          Enclosing_Mark

    N           Number
    Nd          Decimal_Number (also Digit)
    Nl          Letter_Number
    No          Other_Number

    P           Punctuation (also Punct)
    Pc          Connector_Punctuation
    Pd          Dash_Punctuation
    Ps          Open_Punctuation
    Pe          Close_Punctuation
    Pi          Initial_Punctuation
                (may behave like Ps or Pe depending on usage)
    Pf          Final_Punctuation
                (may behave like Ps or Pe depending on usage)
    Po          Other_Punctuation

    S           Symbol
    Sm          Math_Symbol
    Sc          Currency_Symbol
    Sk          Modifier_Symbol
    So          Other_Symbol

    Z           Separator
    Zs          Space_Separator
    Zl          Line_Separator
    Zp          Paragraph_Separator

    C           Other
    Cc          Control (also Cntrl)
    Cf          Format
    Cs          Surrogate
    Co          Private_Use
    Cn          Unassigned

Single-letter properties match all characters in any of the two-letter sub-properties starting with the same letter. LC and L& are special: both are aliases for the set consisting of everything matched by Ll , Lu , and Lt .
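
As an illustration of how the single-letter categories relate to their two-letter sub-properties, here is a brief sketch. The test characters are chosen for the example: the digit "7" (Nd), U+2167 ROMAN NUMERAL EIGHT (Nl), the letter "a" (Ll) and U+05D0 HEBREW LETTER ALEF (Lo).

```perl
use strict;
use warnings;

my $digit = "7";
my $roman = "\x{2167}";    # ROMAN NUMERAL EIGHT
my $alef  = "\x{05D0}";    # HEBREW LETTER ALEF

# Three spellings of the same test.
my $compound = ($digit =~ /\p{General_Category=Number}/) ? 1 : 0;
my $short    = ($digit =~ /\p{gc:n}/)                    ? 1 : 0;
my $shortcut = ($digit =~ /\pN/)                         ? 1 : 0;

# N covers both Nd and Nl, but a letter number is not a decimal digit.
my $roman_is_number = ($roman =~ /\pN/)     ? 1 : 0;
my $roman_is_letter_number = ($roman =~ /\p{Nl}/) ? 1 : 0;
my $roman_is_digit  = ($roman =~ /\p{Nd}/)  ? 1 : 0;

# LC matches cased letters only; ALEF is an Other_Letter.
my $a_is_cased    = ("a"   =~ /\p{LC}/) ? 1 : 0;
my $alef_is_cased = ($alef =~ /\p{LC}/) ? 1 : 0;

print "number forms: $compound $short $shortcut\n";
print "roman: N=$roman_is_number Nl=$roman_is_letter_number Nd=$roman_is_digit\n";
print "cased: a=$a_is_cased alef=$alef_is_cased\n";
```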

Bidirectional Character Types

Because scripts differ in their directionality (Hebrew and Arabic are written right to left, for example) Unicode supplies these properties in the Bidi_Class class:

    Property    Meaning

    L           Left-to-Right
    LRE         Left-to-Right Embedding
    LRO         Left-to-Right Override
    R           Right-to-Left
    AL          Arabic Letter
    RLE         Right-to-Left Embedding
    RLO         Right-to-Left Override
    PDF         Pop Directional Format
    EN          European Number
    ES          European Separator
    ET          European Terminator
    AN          Arabic Number
    CS          Common Separator
    NSM         Non-Spacing Mark
    BN          Boundary Neutral
    B           Paragraph Separator
    S           Segment Separator
    WS          Whitespace
    ON          Other Neutrals

This property is always written in the compound form. For example, \p{Bidi_Class:R} matches characters that are normally written right to left.
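
A short sketch of the compound form in use follows. The sample characters are assumptions made for the example: U+05D0 HEBREW LETTER ALEF, U+0627 ARABIC LETTER ALEF, the Latin letter "a" and the digit "7".

```perl
use strict;
use warnings;

my $hebrew_alef = "\x{05D0}";
my $arabic_alef = "\x{0627}";

my $hebrew_is_r  = ($hebrew_alef =~ /\p{Bidi_Class:R}/)  ? 1 : 0;
my $arabic_is_al = ($arabic_alef =~ /\p{Bidi_Class:AL}/) ? 1 : 0;
my $latin_is_l   = ("a" =~ /\p{Bidi_Class:L}/)           ? 1 : 0;
my $latin_is_r   = ("a" =~ /\p{Bidi_Class:R}/)           ? 1 : 0;
my $digit_is_en  = ("7" =~ /\p{Bidi_Class:EN}/)          ? 1 : 0;

print "R=$hebrew_is_r AL=$arabic_is_al L=$latin_is_l ",
      "Latin-as-R=$latin_is_r EN=$digit_is_en\n";
```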

Scripts

The world's languages are written in many different scripts. This sentence (unless you're reading it in translation) is written in Latin, while Russian is written in Cyrillic, and Greek is written in, well, Greek; Japanese mainly in Hiragana or Katakana. There are many more.

The Unicode Script and Script_Extensions properties give what script a given character is in. Either property can be specified with the compound form like \p{Script=Hebrew} (short: \p{sc=hebr} ), or \p{Script_Extensions=Javanese} (short: \p{scx=java} ). In addition, Perl furnishes shortcuts for all Script property names. You can omit everything up through the equals (or colon), and simply write \p{Latin} or \P{Cyrillic} . (This is not true for Script_Extensions , which is required to be written in the compound form.)

The difference between these two properties involves characters that are used in multiple scripts. For example the digits '0' through '9' are used in many parts of the world. These are placed in a script named Common . Other characters are used in just a few scripts. For example, the "KATAKANA-HIRAGANA DOUBLE HYPHEN" is used in both Japanese scripts, Katakana and Hiragana, but nowhere else. The Script property places all characters that are used in multiple scripts in the Common script, while the Script_Extensions property places those that are used in only a few scripts into each of those scripts; while still using Common for those used in many scripts. Thus both these match:

    "0" =~ /\p{sc=Common}/   # Matches
    "0" =~ /\p{scx=Common}/  # Matches

and only the first of these match:

    "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{sc=Common}/   # Matches
    "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{scx=Common}/  # No match

And only the last two of these match:

    "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{sc=Hiragana}/   # No match
    "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{sc=Katakana}/   # No match
    "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{scx=Hiragana}/  # Matches
    "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{scx=Katakana}/  # Matches

Script_Extensions is thus an improved Script , in which there are fewer characters in the Common script, and correspondingly more in other scripts. It is new in Unicode version 6.0, and its data are likely to change significantly in later releases, as things get sorted out.

(Actually, besides Common , the Inherited script, contains characters that are used in multiple scripts. These are modifier characters which modify other characters, and inherit the script value of the controlling character. Some of these are used in many scripts, and so go into Inherited in both Script and Script_Extensions . Others are used in just a few scripts, so are in Inherited in Script , but not in Script_Extensions .)

It is worth stressing that there are several different sets of digits in Unicode that are equivalent to 0-9 and are matchable by \d in a regular expression. If they are used in a single language only, they are in that language's Script and Script_Extensions . If they are used in more than one script, they will be in sc=Common , but only if they are used in many scripts should they be in scx=Common .

A complete list of scripts and their shortcuts is in perluniprops.

Use of "Is" Prefix

For backward compatibility (with Perl 5.6), all properties mentioned so far may have Is or Is_ prepended to their name, so \P{Is_Lu} , for example, is equal to \P{Lu} , and \p{IsScript:Arabic} is equal to \p{Arabic} .

Blocks

In addition to scripts, Unicode also defines blocks of characters. The difference between scripts and blocks is that the concept of scripts is closer to natural languages, while the concept of blocks is more of an artificial grouping based on groups of Unicode characters with consecutive ordinal values. For example, the "Basic Latin" block is all characters whose ordinals are between 0 and 127, inclusive; in other words, the ASCII characters. The "Latin" script contains some letters from this as well as several other blocks, like "Latin-1 Supplement", "Latin Extended-A", etc., but it does not contain all the characters from those blocks. It does not, for example, contain the digits 0-9, because those digits are shared across many scripts, and hence are in the Common script.

For more about scripts versus blocks, see UAX#24 "Unicode Script Property": http://www.unicode.org/reports/tr24

The Script or Script_Extensions properties are likely to be the ones you want to use when processing natural language; the Block property may occasionally be useful in working with the nuts and bolts of Unicode.
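As a rough illustration of the script/block distinction, the following sketch classifies three characters. The helper name classify is hypothetical, and the property spellings used are the compound forms listed in perluniprops.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash reference recording which of
# three properties the given character matches.
sub classify {
    my ($ch) = @_;
    return {
        latin_script  => ($ch =~ /\p{Script=Latin}/      ? 1 : 0),
        common_script => ($ch =~ /\p{Script=Common}/     ? 1 : 0),
        basic_latin   => ($ch =~ /\p{Block=Basic_Latin}/ ? 1 : 0),
    };
}

my $letter  = classify("A");        # U+0041
my $digit   = classify("7");        # U+0037
my $e_acute = classify("\x{E9}");   # U+00E9, in the Latin-1 Supplement block

for my $case ([A => $letter], [7 => $digit], ['U+00E9' => $e_acute]) {
    my ($label, $r) = @$case;
    printf "%-7s Latin script: %d  Common script: %d  Basic Latin block: %d\n",
        $label, $r->{latin_script}, $r->{common_script}, $r->{basic_latin};
}
```

The digit is in the Basic Latin block but not the Latin script, while U+00E9 is in the Latin script but outside the Basic Latin block.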

Block names are matched in the compound form, like \p{Block: Arrows} or \p{Blk=Hebrew}. Unlike most other properties, only a few block names have a Unicode-defined short name. But Perl does provide a (slight) shortcut: You can say, for example, \p{In_Arrows} or \p{In_Hebrew}. For backwards compatibility, the In prefix may be omitted if there is no naming conflict with a script or any other property, and you can even use an Is prefix instead in those cases. But it is not a good idea to do this, for a couple of reasons:

1

It is confusing. There are many naming conflicts, and you may forget some. For example, \p{Hebrew} means the script Hebrew, and NOT the block Hebrew. But would you remember that 6 months from now?

2

It is unstable. A new version of Unicode may pre-empt the current meaning by creating a property with the same name. There was a time in very early Unicode releases when \p{Hebrew} would have matched the block Hebrew; now it doesn't.

Some people prefer to always use \p{Block: foo} and \p{Script: bar} instead of the shortcuts, whether for clarity, because they can't remember the difference between 'In' and 'Is' anyway, or they aren't confident that those who eventually will read their code will know that difference.

A complete list of blocks and their shortcuts is in perluniprops.

Other Properties

There are many more properties than the very basic ones described here. A complete list is in perluniprops.

Unicode defines all its properties in the compound form, so all single-form properties are Perl extensions. Most of these are just synonyms for the Unicode ones, but some are genuine extensions, including several that are in the compound form. And quite a few of these are actually recommended by Unicode (in http://www.unicode.org/reports/tr18).

This section gives some details on all extensions that aren't just synonyms for compound-form Unicode properties (for those properties, you'll have to refer to the Unicode Standard).

  • \p{All}

    This matches any of the 1_114_112 Unicode code points. It is a synonym for \p{Any} .

  • \p{Alnum}

    This matches any \p{Alphabetic} or \p{Decimal_Number} character.

  • \p{Any}

    This matches any of the 1_114_112 Unicode code points. It is a synonym for \p{All} .

  • \p{ASCII}

    This matches any of the 128 characters in the US-ASCII character set, which is a subset of Unicode.

  • \p{Assigned}

    This matches any assigned code point; that is, any code point whose general category is not Unassigned (or equivalently, not Cn).

  • \p{Blank}

    This is the same as \h and \p{HorizSpace} : A character that changes the spacing horizontally.

  • \p{Decomposition_Type: Non_Canonical} (Short: \p{Dt=NonCanon} )

    Matches a character that has a non-canonical decomposition.

    To understand the use of this rarely used property=value combination, it is necessary to know some basics about decomposition. Consider a character, say H. It could appear with various marks around it, such as an acute accent, or a circumflex, or various hooks, circles, arrows, etc., above, below, to one side or the other, etc. There are many possibilities among the world's languages. The number of combinations is astronomical, and if there were a character for each combination, it would soon exhaust Unicode's more than a million possible characters. So Unicode took a different approach: there is a character for the base H, and a character for each of the possible marks, and these can be variously combined to get a final logical character. So a logical character, which appears to be a single character, can be a sequence of more than one individual character. This is called an "extended grapheme cluster"; Perl furnishes the \X regular expression construct to match such sequences.

    But Unicode's intent is to unify the existing character set standards and practices, and several pre-existing standards have single characters that mean the same thing as some of these combinations. An example is ISO-8859-1, which has quite a few of these in the Latin-1 range, an example being "LATIN CAPITAL LETTER E WITH ACUTE". Because this character was in this pre-existing standard, Unicode added it to its repertoire. But this character is considered by Unicode to be equivalent to the sequence consisting of the character "LATIN CAPITAL LETTER E" followed by the character "COMBINING ACUTE ACCENT".

    "LATIN CAPITAL LETTER E WITH ACUTE" is called a "pre-composed" character, and its equivalence with the sequence is called canonical equivalence. All pre-composed characters are said to have a decomposition (into the equivalent sequence), and the decomposition type is also called canonical.

    However, many more characters have a different type of decomposition, a "compatible" or "non-canonical" decomposition. The sequences that form these decompositions are not considered canonically equivalent to the pre-composed character. An example, again in the Latin-1 range, is the "SUPERSCRIPT ONE". It is somewhat like a regular digit 1, but not exactly; its decomposition into the digit 1 is called a "compatible" decomposition, specifically a "super" decomposition. There are several such compatibility decompositions (see http://www.unicode.org/reports/tr44), including one called "compat", which means some miscellaneous type of decomposition that doesn't fit into the decomposition categories that Unicode has chosen.

    Note that most Unicode characters don't have a decomposition, so their decomposition type is "None".

    For your convenience, Perl has added the Non_Canonical decomposition type to mean any of the several compatibility decompositions.

  • \p{Graph}

    Matches any character that is graphic. Theoretically, this means a character that on a printer would cause ink to be used.

  • \p{HorizSpace}

    This is the same as \h and \p{Blank} : a character that changes the spacing horizontally.

  • \p{In=*}

    This is a synonym for \p{Present_In=*}

  • \p{PerlSpace}

    This is the same as \s, restricted to ASCII, namely [ \f\n\r\t] and starting in Perl v5.18, experimentally, a vertical tab.

    Mnemonic: Perl's (original) space

  • \p{PerlWord}

    This is the same as \w , restricted to ASCII, namely [A-Za-z0-9_]

    Mnemonic: Perl's (original) word.

  • \p{Posix...}

    There are several of these, which are equivalents using the \p notation for Posix classes and are described in POSIX Character Classes in perlrecharclass.

  • \p{Present_In: *} (Short: \p{In=*})

    This property is used when you need to know in what Unicode version(s) a character is.

    The "*" above stands for some two digit Unicode version number, such as 1.1 or 4.0 ; or the "*" can also be Unassigned . This property will match the code points whose final disposition has been settled as of the Unicode release given by the version number; \p{Present_In: Unassigned} will match those code points whose meaning has yet to be assigned.

    For example, U+0041 "LATIN CAPITAL LETTER A" was present in the very first Unicode release available, which is 1.1 , so this property is true for all valid "*" versions. On the other hand, U+1EFF was not assigned until version 5.1 when it became "LATIN SMALL LETTER Y WITH LOOP", so the only "*" that would match it are 5.1, 5.2, and later.

    Unicode furnishes the Age property from which this is derived. The problem with Age is that a strict interpretation of it (which Perl takes) has it matching the precise release a code point's meaning is introduced in. Thus U+0041 would match only 1.1; and U+1EFF only 5.1. This is not usually what you want.

    Some non-Perl implementations of the Age property may change its meaning to be the same as the Perl Present_In property; just be aware of that.

    Another confusion with both these properties is that the definition is not that the code point has been assigned, but that the meaning of the code point has been determined. This is because 66 code points will always be unassigned, and so the Age for them is the Unicode version in which the decision to make them so was made. For example, U+FDD0 is to be permanently unassigned to a character, and the decision to do that was made in version 3.1, so \p{Age=3.1} matches this character, as also does \p{Present_In: 3.1} and up.

  • \p{Print}

    This matches any character that is graphical or blank, except controls.

  • \p{SpacePerl}

    This is the same as \s, including beyond ASCII.

    Mnemonic: Space, as modified by Perl. (It doesn't include the vertical tab which both the Posix standard and Unicode consider white space.)

  • \p{Title} and \p{Titlecase}

    Under case-sensitive matching, these both match the same code points as \p{General Category=Titlecase_Letter} (\p{gc=lt}). The difference is that under /i caseless matching, these match the same as \p{Cased}, whereas \p{gc=lt} matches \p{Cased_Letter}.

  • \p{VertSpace}

    This is the same as \v : A character that changes the spacing vertically.

  • \p{Word}

    This is the same as \w , including over 100_000 characters beyond ASCII.

  • \p{XPosix...}

    There are several of these, which are the standard Posix classes extended to the full Unicode range. They are described in POSIX Character Classes in perlrecharclass.
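Two of the extensions in the list above, \p{Decomposition_Type: Non_Canonical} and \p{Present_In: *}, are easier to follow with concrete characters. The following is a minimal sketch using the core module Unicode::Normalize; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

my $e_acute = "\x{C9}";   # LATIN CAPITAL LETTER E WITH ACUTE
my $sup_one = "\x{B9}";   # SUPERSCRIPT ONE

# Canonical decomposition: E WITH ACUTE becomes E + COMBINING ACUTE ACCENT.
my $decomposed = NFD($e_acute);
my @parts = map { sprintf("U+%04X", ord($_)) } split //, $decomposed;
print "NFD(U+00C9) = @parts\n";

# \X treats the two-code-point sequence as one extended grapheme cluster.
my @clusters = $decomposed =~ /(\X)/g;
print "clusters: ", scalar(@clusters), "\n";

# Only the compatibility ("super") decomposition is non-canonical.
my $sup_noncanon   = $sup_one =~ /\p{Dt=NonCanon}/ ? 1 : 0;
my $acute_noncanon = $e_acute =~ /\p{Dt=NonCanon}/ ? 1 : 0;
print "SUPERSCRIPT ONE non-canonical: $sup_noncanon\n";
print "E WITH ACUTE non-canonical:    $acute_noncanon\n";

# Present_In is cumulative; Age names only the introducing release.
my $y_loop = "\x{1EFF}";  # assigned in Unicode 5.1
my %seen = (
    a_present_51 => ("A"     =~ /\p{Present_In: 5.1}/ ? 1 : 0),
    a_age_51     => ("A"     =~ /\p{Age=5.1}/         ? 1 : 0),
    y_present_50 => ($y_loop =~ /\p{Present_In: 5.0}/ ? 1 : 0),
    y_present_51 => ($y_loop =~ /\p{Present_In: 5.1}/ ? 1 : 0),
);
print "$_: $seen{$_}\n" for sort keys %seen;
```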

User-Defined Character Properties

You can define your own binary character properties by defining subroutines whose names begin with "In" or "Is". (The experimental feature (?[ ]) in perlre provides an alternative which allows more complex definitions.) The subroutines can be defined in any package. The user-defined properties can be used in the regular expression \p and \P constructs; if you are using a user-defined property from a package other than the one you are in, you must specify its package in the \p or \P construct.

    # assuming property IsForeign defined in Lang::
    package main;  # property package name required
    if ($txt =~ /\p{Lang::IsForeign}+/) { ... }

    package Lang;  # property package name not required
    if ($txt =~ /\p{IsForeign}+/) { ... }

Note that the effect is compile-time and immutable once defined. However, the subroutines are passed a single parameter, which is 0 if case-sensitive matching is in effect and non-zero if caseless matching is in effect. The subroutine may return different values depending on the value of the flag, and one set of values will immutably be in effect for all case-sensitive matches, and the other set for all case-insensitive matches.

Note that if the regular expression is tainted, then Perl will die rather than calling the subroutine, where the name of the subroutine is determined by the tainted data.

The subroutines must return a specially-formatted string, with one or more newline-separated lines. Each line must be one of the following:

  • A single hexadecimal number denoting a Unicode code point to include.

  • Two hexadecimal numbers separated by horizontal whitespace (space or tab characters) denoting a range of Unicode code points to include.

  • Something to include, prefixed by "+": a built-in character property (prefixed by "utf8::") or a fully qualified (including package name) user-defined character property, to represent all the characters in that property; two hexadecimal code points for a range; or a single hexadecimal code point.

  • Something to exclude, prefixed by "-": an existing character property (prefixed by "utf8::") or a fully qualified (including package name) user-defined character property, to represent all the characters in that property; two hexadecimal code points for a range; or a single hexadecimal code point.

  • Something to negate, prefixed by "!": an existing character property (prefixed by "utf8::") or a fully qualified (including package name) user-defined character property, to represent all the characters except the characters in that property; two hexadecimal code points for a range; or a single hexadecimal code point.

  • Something to intersect with, prefixed by "&": an existing character property (prefixed by "utf8::") or a fully qualified (including package name) user-defined character property, to represent all the characters in that property; two hexadecimal code points for a range; or a single hexadecimal code point.

For example, to define a property that covers both the Japanese syllabaries (hiragana and katakana), you can define

    sub InKana {
        return <<END;
    3040\t309F
    30A0\t30FF
    END
    }

Imagine that the here-doc end marker is at the beginning of the line. Now you can use \p{InKana} and \P{InKana} .

You could also have used the existing block property names:

    sub InKana {
        return <<'END';
    +utf8::InHiragana
    +utf8::InKatakana
    END
    }

Suppose you wanted to match only the allocated characters, not the raw block ranges: in other words, you want to remove the unassigned code points:

    sub InKana {
        return <<'END';
    +utf8::InHiragana
    +utf8::InKatakana
    -utf8::IsCn
    END
    }

The negation is useful for defining (surprise!) negated classes.

    sub InNotKana {
        return <<'END';
    !utf8::InHiragana
    -utf8::InKatakana
    +utf8::IsCn
    END
    }

This will match all non-Unicode code points, since every one of them is not in Kana. You can use intersection to exclude these, if desired, as this modified example shows:

    sub InNotKana {
        return <<'END';
    !utf8::InHiragana
    -utf8::InKatakana
    +utf8::IsCn
    &utf8::Any
    END
    }

&utf8::Any must be the last line in the definition.

Intersection is used generally for getting the common characters matched by two (or more) classes. It's important to remember not to use "&" for the first set; that would be intersecting with nothing, resulting in an empty set.

(Note that official Unicode properties differ from these in that they automatically exclude non-Unicode code points and a warning is raised if a match is attempted on one of those.)
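To see the pieces of this section working together, here is a self-contained sketch that defines the block-based InKana from above alongside a range-based variant and tries both on a few sample characters. The name InKanaRange and the sample characters are chosen for illustration; they are not part of Perl.

```perl
use strict;
use warnings;

# Block-based definition: both kana blocks, minus unassigned code points.
sub InKana {
    return <<'END';
+utf8::InHiragana
+utf8::InKatakana
-utf8::IsCn
END
}

# Range-based definition covering the same two blocks, unassigned
# code points included.  Each line is "start<TAB>end" in hexadecimal.
sub InKanaRange {
    return "3040\t309F\n30A0\t30FF\n";
}

my %sample = (
    'HIRAGANA LETTER A'   => "\x{3042}",
    'KATAKANA LETTER KA'  => "\x{30AB}",
    'U+3040 (unassigned)' => "\x{3040}",
    'LATIN CAPITAL A'     => "A",
);

my %result;
for my $name (sort keys %sample) {
    my $ch = $sample{$name};
    $result{$name} = [
        ($ch =~ /\p{InKana}/      ? 1 : 0),
        ($ch =~ /\p{InKanaRange}/ ? 1 : 0),
    ];
    printf "%-20s InKana=%d InKanaRange=%d\n", $name, @{ $result{$name} };
}
```

The two definitions differ only on unassigned code points such as U+3040, which the "-utf8::IsCn" line removes from the block-based version.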

User-Defined Case Mappings (for serious hackers only)

This feature has been removed as of Perl 5.16. The CPAN module Unicode::Casing provides better functionality without the drawbacks that this feature had. If you are using a Perl earlier than 5.16, this feature was most fully documented in the 5.14 version of this pod: http://perldoc.perl.org/5.14.0/perlunicode.html#User-Defined-Case-Mappings-%28for-serious-hackers-only%29

Character Encodings for Input and Output

See Encode.

Unicode Regular Expression Support Level

The following list of Unicode supported features for regular expressions describes all features currently directly supported by core Perl. The references to "Level N" and the section numbers refer to the Unicode Technical Standard #18, "Unicode Regular Expressions", version 13, from August 2008.

  • Level 1 - Basic Unicode Support

        RL1.1   Hex Notation                  - done          [1]
        RL1.2   Properties                    - done          [2][3]
        RL1.2a  Compatibility Properties      - done          [4]
        RL1.3   Subtraction and Intersection  - experimental  [5]
        RL1.4   Simple Word Boundaries        - done          [6]
        RL1.5   Simple Loose Matches          - done          [7]
        RL1.6   Line Boundaries               - MISSING       [8][9]
        RL1.7   Supplementary Code Points     - done          [10]

    • [1]

      \x{...}

    • [2]

      \p{...} \P{...}

    • [3]

      supports not only minimal list, but all Unicode character properties (see Unicode Character Properties above)

    • [4]

      \d \D \s \S \w \W \X [:prop:] [:^prop:]

    • [5]

      The experimental feature in v5.18 "(?[...])" accomplishes this. See (?[ ]) in perlre. If you don't want to use an experimental feature, you can use one of the following:

      • Regular expression look-ahead

        You can mimic class subtraction using lookahead. For example, what UTS#18 might write as

            [{Block=Greek}-[{UNASSIGNED}]]

        in Perl can be written as either of:

            (?!\p{Unassigned})\p{Block=Greek}
            (?=\p{Assigned})\p{Block=Greek}

        But in this particular example, you probably really want

            \p{Greek}

        which will match assigned characters known to be part of the Greek script.

      • CPAN module Unicode::Regex::Set

        It does implement the full UTS#18 grouping, intersection, union, and removal (subtraction) syntax.

      • User-Defined Character Properties

        '+' for union, '-' for removal (set-difference), '&' for intersection

    • [6]

      \b \B

    • [7]

      Note that Perl does Full case-folding in matching (but with bugs), not Simple: for example U+1F88 is equivalent to U+1F00 U+03B9, instead of just U+1F80. This difference matters mainly for certain Greek capital letters with certain modifiers: the Full case-folding decomposes the letter, while the Simple case-folding would map it to a single character.

    • [8]

      Should do ^ and $ also on U+000B (\v in C), FF (\f), CR (\r), CRLF (\r\n), NEL (U+0085), LS (U+2028), and PS (U+2029); should also affect <>, $., and script line numbers; should not split lines within CRLF (i.e. there is no empty line between \r and \n). For CRLF, try the :crlf layer (see PerlIO).

    • [9]

      Linebreaking conformant with UAX#14 "Unicode Line Breaking Algorithm" is available through the Unicode::LineBreak module.

    • [10]

      UTF-8/UTF-EBCDIC used in Perl allows not only U+10000 to U+10FFFF but also beyond U+10FFFF

  • Level 2 - Extended Unicode Support

        RL2.1   Canonical Equivalents         - MISSING  [10][11]
        RL2.2   Default Grapheme Clusters     - MISSING  [12]
        RL2.3   Default Word Boundaries       - MISSING  [14]
        RL2.4   Default Loose Matches         - MISSING  [15]
        RL2.5   Name Properties               - DONE
        RL2.6   Wildcard Properties           - MISSING

        [10] see UAX#15 "Unicode Normalization Forms"
        [11] have Unicode::Normalize but not integrated to regexes
        [12] have \X but we don't have a "Grapheme Cluster Mode"
        [14] see UAX#29, Word Boundaries
        [15] This is covered in Chapter 3.13 (in Unicode 6.0)

  • Level 3 - Tailored Support

        RL3.1   Tailored Punctuation          - MISSING
        RL3.2   Tailored Grapheme Clusters    - MISSING  [17][18]
        RL3.3   Tailored Word Boundaries      - MISSING
        RL3.4   Tailored Loose Matches        - MISSING
        RL3.5   Tailored Ranges               - MISSING
        RL3.6   Context Matching              - MISSING  [19]
        RL3.7   Incremental Matches           - MISSING
      ( RL3.8   Unicode Set Sharing )
        RL3.9   Possible Match Sets           - MISSING
        RL3.10  Folded Matching               - MISSING  [20]
        RL3.11  Submatchers                   - MISSING

        [17] see UTS#10 "Unicode Collation Algorithm"
        [18] have Unicode::Collate but not integrated to regexes
        [19] have (?<=x) and (?=x), but look-aheads or look-behinds
             should see outside of the target substring
        [20] need insensitive matching for linguistic features other
             than case; for example, hiragana to katakana, wide and
             narrow, simplified Han to traditional Han (see UTR#30
             "Character Foldings")
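The look-ahead workaround described in note [5] under Level 1 can be tried as follows. This is only a sketch: it assumes U+03A2, a reserved hole inside the Greek block, is still unassigned in the Unicode version shipped with your Perl.

```perl
use strict;
use warnings;

# Greek block minus unassigned code points, written with a negative
# look-ahead in place of a set-difference operator.
my $assigned_greek = qr/(?!\p{Unassigned})\p{Block=Greek}/;

my $alpha = "\x{3B1}";   # GREEK SMALL LETTER ALPHA
my $hole  = "\x{3A2}";   # reserved code point inside the Greek block

my $alpha_in_block = $alpha =~ /\p{Block=Greek}/ ? 1 : 0;
my $hole_in_block  = $hole  =~ /\p{Block=Greek}/ ? 1 : 0;
my $alpha_assigned = $alpha =~ $assigned_greek   ? 1 : 0;
my $hole_assigned  = $hole  =~ $assigned_greek   ? 1 : 0;

printf "U+03B1  in block: %d  block minus unassigned: %d\n",
    $alpha_in_block, $alpha_assigned;
printf "U+03A2  in block: %d  block minus unassigned: %d\n",
    $hole_in_block, $hole_assigned;
```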

Unicode Encodings

Unicode characters are assigned to code points, which are abstract numbers. To use these numbers, various encodings are needed.

  • UTF-8

    UTF-8 is a variable-length (1 to 4 bytes), byte-order independent encoding. For ASCII (and we really do mean 7-bit ASCII, not another 8-bit encoding), UTF-8 is transparent.

    The following table is from Unicode 3.2.

        Code Points           1st Byte  2nd Byte  3rd Byte  4th Byte
          U+0000..U+007F      00..7F
          U+0080..U+07FF    * C2..DF    80..BF
          U+0800..U+0FFF      E0      * A0..BF    80..BF
          U+1000..U+CFFF      E1..EC    80..BF    80..BF
          U+D000..U+D7FF      ED        80..9F    80..BF
          U+D800..U+DFFF      +++++ utf16 surrogates, not legal utf8 +++++
          U+E000..U+FFFF      EE..EF    80..BF    80..BF
         U+10000..U+3FFFF     F0      * 90..BF    80..BF    80..BF
         U+40000..U+FFFFF     F1..F3    80..BF    80..BF    80..BF
        U+100000..U+10FFFF    F4        80..8F    80..BF    80..BF

    Note the gaps marked by "*" before several of the byte entries above. These are caused by legal UTF-8 avoiding non-shortest encodings: it is technically possible to UTF-8-encode a single code point in different ways, but that is explicitly forbidden, and the shortest possible encoding should always be used (and that is what Perl does).

    Another way to look at it is via bits:

                       Code Points    1st Byte  2nd Byte  3rd Byte  4th Byte
                          0aaaaaaa    0aaaaaaa
                  00000bbbbbaaaaaa    110bbbbb  10aaaaaa
                  ccccbbbbbbaaaaaa    1110cccc  10bbbbbb  10aaaaaa
        00000dddccccccbbbbbbaaaaaa    11110ddd  10cccccc  10bbbbbb  10aaaaaa

    As you can see, the continuation bytes all begin with "10", and the leading bits of the start byte tell how many bytes there are in the encoded character.

    The original UTF-8 specification allowed up to 6 bytes, to allow encoding of numbers up to 0x7FFF_FFFF. Perl continues to allow those, and has extended that up to 13 bytes to encode code points up to what can fit in a 64-bit word. However, Perl will warn if you output any of these as being non-portable; and under strict UTF-8 input protocols, they are forbidden.

    The Unicode non-character code points are also disallowed in UTF-8 in "open interchange". See Non-character code points.

  • UTF-EBCDIC

    Like UTF-8 but EBCDIC-safe, in the way that UTF-8 is ASCII-safe.

  • UTF-16, UTF-16BE, UTF-16LE, Surrogates, and BOMs (Byte Order Marks)

    The following items are mostly for reference and general Unicode knowledge; Perl doesn't use these constructs internally.

    Like UTF-8, UTF-16 is a variable-width encoding, but where UTF-8 uses 8-bit code units, UTF-16 uses 16-bit code units. All code points occupy either 2 or 4 bytes in UTF-16: code points U+0000..U+FFFF are stored in a single 16-bit unit, and code points U+10000..U+10FFFF in two 16-bit units. The latter case is using surrogates, the first 16-bit unit being the high surrogate, and the second being the low surrogate.

    Surrogates are code points set aside to encode the U+10000..U+10FFFF range of Unicode code points in pairs of 16-bit units. The high surrogates are the range U+D800..U+DBFF and the low surrogates are the range U+DC00..U+DFFF . The surrogate encoding is

        # int() is needed because Perl's "/" does not truncate
        $hi = int(($uni - 0x10000) / 0x400) + 0xD800;
        $lo = ($uni - 0x10000) % 0x400 + 0xDC00;

    and the decoding is

        $uni = 0x10000 + ($hi - 0xD800) * 0x400 + ($lo - 0xDC00);

    Because of the 16-bitness, UTF-16 is byte-order dependent. UTF-16 itself can be used for in-memory computations, but if storage or transfer is required either UTF-16BE (big-endian) or UTF-16LE (little-endian) encodings must be chosen.

    This introduces another problem: what if you just know that your data is UTF-16, but you don't know which endianness? Byte Order Marks, or BOMs, are a solution to this. A special character has been reserved in Unicode to function as a byte order marker: the character with the code point U+FEFF is the BOM.

    The trick is that if you read a BOM, you will know the byte order, since if it was written on a big-endian platform, you will read the bytes 0xFE 0xFF, but if it was written on a little-endian platform, you will read the bytes 0xFF 0xFE. (And if the originating platform was writing in UTF-8, you will read the bytes 0xEF 0xBB 0xBF.)

    The way this trick works is that the character with the code point U+FFFE is not supposed to be in input streams, so the sequence of bytes 0xFF 0xFE is unambiguously "BOM, represented in little-endian format" and cannot be "U+FFFE, represented in big-endian format".

    Surrogates have no meaning in Unicode outside their use in pairs to represent other code points. However, Perl allows them to be represented individually internally, for example by saying chr(0xD801), so that all code points, not just those valid for open interchange, are representable. Unicode does define semantics for them, such as their General Category is "Cs". But because their use is somewhat dangerous, Perl will warn (using the warning category "surrogate", which is a sub-category of "utf8") if an attempt is made to do things like take the lower case of one, or match case-insensitively, or to output them. (But don't try this on Perls before 5.14.)

  • UTF-32, UTF-32BE, UTF-32LE

    The UTF-32 family is pretty much like the UTF-16 family, except that the units are 32-bit, and therefore the surrogate scheme is not needed. UTF-32 is a fixed-width encoding. The BOM signatures are 0x00 0x00 0xFE 0xFF for BE and 0xFF 0xFE 0x00 0x00 for LE.

  • UCS-2, UCS-4

    Legacy, fixed-width encodings defined by the ISO 10646 standard. UCS-2 is a 16-bit encoding. Unlike UTF-16, UCS-2 is not extensible beyond U+FFFF , because it does not use surrogates. UCS-4 is a 32-bit encoding, functionally identical to UTF-32 (the difference being that UCS-4 forbids neither surrogates nor code points larger than 0x10_FFFF).

  • UTF-7

    A seven-bit safe (non-eight-bit) encoding, which is useful if the transport or storage is not eight-bit safe. Defined by RFC 2152.
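The UTF-8 bit layout and the UTF-16 surrogate arithmetic described in the list above can be worked through by hand. The sketch below does so with three hypothetical helpers (utf8_bytes, to_surrogates, from_surrogates) and compares the hand-packed UTF-8 bytes against the core Encode module. It is meant for study, not as a replacement for Encode.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: pack a code point (U+0000..U+10FFFF, excluding
# surrogates) into UTF-8 bytes by following the bit layout by hand.
sub utf8_bytes {
    my ($cp) = @_;
    return ($cp) if $cp < 0x80;
    return (0xC0 | ($cp >> 6), 0x80 | ($cp & 0x3F)) if $cp < 0x800;
    return (0xE0 | ($cp >> 12),
            0x80 | (($cp >> 6) & 0x3F),
            0x80 | ($cp & 0x3F)) if $cp < 0x10000;
    return (0xF0 | ($cp >> 18),
            0x80 | (($cp >> 12) & 0x3F),
            0x80 | (($cp >> 6) & 0x3F),
            0x80 | ($cp & 0x3F));
}

# Hypothetical helpers for the surrogate arithmetic; int() keeps the
# division integral.
sub to_surrogates {
    my ($uni) = @_;
    my $hi = int(($uni - 0x10000) / 0x400) + 0xD800;
    my $lo = ($uni - 0x10000) % 0x400 + 0xDC00;
    return ($hi, $lo);
}

sub from_surrogates {
    my ($hi, $lo) = @_;
    return 0x10000 + ($hi - 0xD800) * 0x400 + ($lo - 0xDC00);
}

for my $cp (0x41, 0xE9, 0x20AC, 0x1F600) {
    my $by_hand   = join ' ', map { sprintf('%02X', $_) } utf8_bytes($cp);
    my $by_encode = join ' ', map { sprintf('%02X', ord($_)) }
                    split //, encode('UTF-8', chr($cp));
    printf "U+%04X  by hand: %-12s Encode: %s\n", $cp, $by_hand, $by_encode;
}

my ($hi, $lo) = to_surrogates(0x1F600);
printf "U+1F600 -> %04X %04X -> U+%X\n", $hi, $lo, from_surrogates($hi, $lo);
```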

Non-character code points

66 code points are set aside in Unicode as "non-character code points". These all have the Unassigned (Cn) General Category, and they never will be assigned. These are never supposed to be in legal Unicode input streams, so that code can use them as sentinels that can be mixed in with character data, and they always will be distinguishable from that data. To keep them out of Perl input streams, strict UTF-8 should be specified, such as by using the layer :encoding('UTF-8') . The non-character code points are the 32 between U+FDD0 and U+FDEF, and the 34 code points U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ... U+10FFFE, U+10FFFF. Some people are under the mistaken impression that these are "illegal", but that is not true. An application or cooperating set of applications can legally use them at will internally; but these code points are "illegal for open interchange". Therefore, Perl will not accept these from input streams unless lax rules are being used, and will warn (using the warning category "nonchar", which is a sub-category of "utf8") if an attempt is made to output them.
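The membership rule stated above is simple enough to express as arithmetic. The following sketch uses a hypothetical helper, is_nonchar, and counts how many code points it accepts.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $cp is one of the non-character code points.
sub is_nonchar {
    my ($cp) = @_;
    return 0 if $cp > 0x10FFFF;                      # not a Unicode code point
    return 1 if $cp >= 0xFDD0 && $cp <= 0xFDEF;      # the contiguous 32
    return (($cp & 0xFFFE) == 0xFFFE) ? 1 : 0;       # last two of each plane
}

my $count = 0;
for my $cp (0 .. 0x10FFFF) {
    $count++ if is_nonchar($cp);
}
print "non-character code points: $count\n";
```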

Beyond Unicode code points

The maximum Unicode code point is U+10FFFF. But Perl accepts code points up to the maximum permissible unsigned number available on the platform. However, Perl will not accept these from input streams unless lax rules are being used, and will warn (using the warning category "non_unicode", which is a sub-category of "utf8") if an attempt is made to operate on or output them. For example, uc(chr(0x11_0000)) will generate this warning, returning the input parameter as its result, as the upper case of every non-Unicode code point is the code point itself.

Security Implications of Unicode

Read Unicode Security Considerations. Also, note the following:

  • Malformed UTF-8

    Unfortunately, the original specification of UTF-8 leaves some room for interpretation of how many bytes of encoded output one should generate from one input Unicode character. Strictly speaking, the shortest possible sequence of UTF-8 bytes should be generated, because otherwise there is potential for an input buffer overflow at the receiving end of a UTF-8 connection. Perl always generates the shortest length UTF-8, and with warnings on, Perl will warn about non-shortest length UTF-8 along with other malformations, such as the surrogates, which are not Unicode code points valid for interchange.

  • Regular expression pattern matching may surprise you if you're not accustomed to Unicode. Starting in Perl 5.14, several pattern modifiers are available to control this, called the character set modifiers. Details are given in Character set modifiers in perlre.

As discussed elsewhere, Perl has one foot (two hooves?) planted in each of two worlds: the old world of bytes and the new world of characters, upgrading from bytes to characters when necessary. If your legacy code does not explicitly use Unicode, no automatic switch-over to characters should happen. Characters shouldn't get downgraded to bytes, either. It is possible to accidentally mix bytes and characters, however (see perluniintro), in which case \w in regular expressions might start behaving differently (unless the /a modifier is in effect). Review your code. Use warnings and the strict pragma.
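One way to guard against the malformed input mentioned above is to decode with the strict "UTF-8" encoding of the core Encode module and treat any failure as a rejection. The sketch below assumes that approach; strict_decode is a hypothetical helper name.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: strictly decode $bytes, returning undef if they
# are not well-formed UTF-8.  LEAVE_SRC stops decode() from modifying
# its input while checking.
sub strict_decode {
    my ($bytes) = @_;
    my $text = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return $text;
}

my $shortest  = "\x2F";            # "/" in its only legal encoding
my $overlong  = "\xC0\xAF";        # the same code point padded to two bytes
my $surrogate = "\xED\xA0\x81";    # U+D801 encoded directly

for my $case ([shortest => $shortest],
              [overlong => $overlong],
              [surrogate => $surrogate]) {
    my ($name, $bytes) = @$case;
    printf "%-10s %s\n", $name,
        (defined(strict_decode($bytes)) ? 'accepted' : 'rejected');
}
```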

Unicode in Perl on EBCDIC

The way Unicode is handled on EBCDIC platforms is still experimental. On such platforms, references to UTF-8 encoding in this document and elsewhere should be read as meaning the UTF-EBCDIC specified in Unicode Technical Report 16, unless ASCII vs. EBCDIC issues are specifically discussed. There is no utfebcdic pragma or ":utfebcdic" layer; rather, "utf8" and ":utf8" are reused to mean the platform's "natural" 8-bit encoding of Unicode. See perlebcdic for more discussion of the issues.

Locales

See Unicode and UTF-8 in perllocale.

When Unicode Does Not Happen

While Perl does have extensive ways to input and output in Unicode, and a few other "entry points" like the @ARGV array (which can sometimes be interpreted as UTF-8), there are still many places where Unicode (in some encoding or another) could be given as arguments or received as results, or both, but it is not.

The following are such interfaces. Also, see The Unicode Bug. For all of these interfaces Perl currently (as of v5.16.0) simply assumes byte strings both as arguments and results, or UTF-8 strings if the (problematic) encoding pragma has been used.

One reason that Perl does not attempt to resolve the role of Unicode in these situations is that the answers are highly dependent on the operating system and the file system(s). For example, whether filenames can be in Unicode and in exactly what kind of encoding, is not exactly a portable concept. Similarly for qx and system: how well will the "command-line interface" (and which of them?) handle Unicode?

  • chdir, chmod, chown, chroot, exec, link, lstat, mkdir, rename, rmdir, stat, symlink, truncate, unlink, utime, -X

  • %ENV

  • glob (aka the <*>)

  • open, opendir, sysopen

  • qx (aka the backtick operator), system

  • readdir, readlink
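Because these interfaces deal in bytes, a common workaround is to encode names explicitly before handing them to the system and to decode what comes back. The sketch below only shows the conversion step. The choice of UTF-8 is an assumption about the file system, not something Perl can determine for you.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# ASSUMPTION: the file system stores names as UTF-8 bytes.
my $FS_ENCODING = 'UTF-8';

my $name  = "r\x{E9}sum\x{E9}-\x{263A}.txt";   # character string
my $bytes = encode($FS_ENCODING, $name);       # what open() etc. would receive
my $back  = decode($FS_ENCODING, $bytes);      # e.g. a readdir() result, decoded

printf "%d characters, %d bytes, round trip %s\n",
    length($name), length($bytes), ($back eq $name ? 'ok' : 'FAILED');
```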

The "Unicode Bug"

The term "Unicode bug" has been applied to an inconsistency on ASCII platforms with the Unicode code points in the Latin-1 Supplement block, that is, between 128 and 255. Without a locale specified, unlike all other characters or code points, these characters have very different semantics in byte semantics versus character semantics, unless use feature 'unicode_strings' is specified, directly or indirectly. (It is indirectly specified by a use v5.12 or higher.)

In character semantics these upper-Latin1 characters are interpreted as Unicode code points, which means they have the same semantics as Latin-1 (ISO-8859-1).

In byte semantics (without unicode_strings ), they are considered to be unassigned characters, meaning that the only semantics they have is their ordinal numbers, and that they are not members of various character classes. None are considered to match \w for example, but all match \W .

Perl 5.12.0 added unicode_strings to force character semantics on these code points in some circumstances, which fixed portions of the bug; Perl 5.14.0 fixed almost all of it; and Perl 5.16.0 fixed the remainder (so far as we know, anyway). The lesson here is to enable unicode_strings to avoid the headaches described below.

The old, problematic behavior affects these areas:

  • Changing the case of a scalar, that is, using uc(), ucfirst(), lc(), and lcfirst(), or \L , \U , \u and \l in double-quotish contexts, such as regular expression substitutions. Under unicode_strings starting in Perl 5.12.0, character semantics are generally used. See lc for details on how this works in combination with various other pragmas.

  • Using caseless (/i) regular expression matching. Starting in Perl 5.14.0, regular expressions compiled within the scope of unicode_strings use character semantics even when executed or compiled into larger regular expressions outside the scope.

  • Matching any of several properties in regular expressions, namely \b , \B , \s, \S , \w , \W , and all the Posix character classes except [[:ascii:]] . Starting in Perl 5.14.0, regular expressions compiled within the scope of unicode_strings use character semantics even when executed or compiled into larger regular expressions outside the scope.

  • In quotemeta or its inline equivalent \Q , no code points above 127 are quoted in UTF-8 encoded strings, but in byte encoded strings, code points between 128-255 are always quoted. Starting in Perl 5.16.0, consistent quoting rules are used within the scope of unicode_strings , as described in quotemeta.

This behavior can lead to unexpected results in which a string's semantics suddenly change if a code point above 255 is appended to or removed from it, which changes the string's semantics from byte to character or vice versa. As an example, consider the following program and its output:

    $ perl -le'
        no feature 'unicode_strings';
        $s1 = "\xC2";
        $s2 = "\x{2660}";
        for ($s1, $s2, $s1.$s2) {
            print /\w/ || 0;
        }
    '
    0
    0
    1

If there's no \w in s1 or in s2 , why does their concatenation have one?

This anomaly stems from Perl's attempt to not disturb older programs that didn't use Unicode, and hence had no semantics for characters outside of the ASCII range (except in a locale), along with Perl's desire to add Unicode support seamlessly. The result wasn't seamless: these characters were orphaned.

For Perls earlier than those described above, or when a string is passed to a function outside the subpragma's scope, a workaround is to always call utf8::upgrade($string), or to use the standard module Encode. Also, a scalar that has any characters whose ordinal is 0x100 or above, or which were specified using either of the \N{...} notations, will automatically have character semantics.
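The anomaly and both remedies can be compared side by side. This is a small sketch for an ASCII platform with no locale in effect; the variable names are only illustrative.

```perl
use strict;
use warnings;

my $s1 = "\xC2";        # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
my $s2 = "\x{2660}";    # BLACK SPADE SUIT, not a word character

my (@old, @new, @upgraded);
{
    no feature 'unicode_strings';
    @old = map { /\w/ ? 1 : 0 } $s1, $s2, $s1 . $s2;

    # Workaround: force the UTF-8 representation on a copy.
    my $copy = $s1;
    utf8::upgrade($copy);
    @upgraded = map { /\w/ ? 1 : 0 } $copy;
}
{
    use feature 'unicode_strings';
    @new = map { /\w/ ? 1 : 0 } $s1, $s2, $s1 . $s2;
}

print "without feature: @old\n";       # 0 0 1
print "with feature:    @new\n";       # 1 0 1
print "after upgrade:   @upgraded\n";  # 1
```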

Forcing Unicode in Perl (Or Unforcing Unicode in Perl)

Sometimes (see When Unicode Does Not Happen or The Unicode Bug) there are situations where you simply need to force a byte string into UTF-8, or vice versa. The low-level calls utf8::upgrade($bytestring) and utf8::downgrade($utf8string[, FAIL_OK]) are the answers.

Note that utf8::downgrade() can fail if the string contains characters that don't fit into a byte.

Calling either function on a string that already is in the desired state is a no-op.
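A short sketch of both calls follows. It uses utf8::is_utf8() only to make the internal state visible for the demonstration; ordinary code should not need to look at the flag.

```perl
use strict;
use warnings;

my $string = "caf\xE9";                  # four characters, all below 0x100

utf8::upgrade($string);                  # internal representation becomes UTF-8
my $upgraded_flag   = utf8::is_utf8($string) ? 1 : 0;
my $upgraded_length = length $string;    # still four characters

utf8::downgrade($string);                # back to one byte per character
my $downgraded_flag = utf8::is_utf8($string) ? 1 : 0;

# A character above 0xFF cannot fit into a byte. With a true FAIL_OK
# argument, utf8::downgrade() returns false instead of dying.
my $wide = "\x{263A}";
my $ok   = utf8::downgrade($wide, 1) ? 1 : 0;

print "upgraded: flag=$upgraded_flag length=$upgraded_length\n";
print "downgraded: flag=$downgraded_flag\n";
print "wide downgrade succeeded: $ok\n";
```

Neither call changes the characters the string contains; only the storage format changes.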

Using Unicode in XS

If you want to handle Perl Unicode in XS extensions, you may find the following C APIs useful. See also Unicode Support in perlguts for an explanation about Unicode at the XS level, and perlapi for the API details.

  • DO_UTF8(sv) returns true if the UTF8 flag is on and the bytes pragma is not in effect. SvUTF8(sv) returns true if the UTF8 flag is on; the bytes pragma is ignored. The UTF8 flag being on does not mean that there are any characters of code points greater than 255 (or 127) in the scalar or that there are even any characters in the scalar. What the UTF8 flag means is that the sequence of octets in the representation of the scalar is the sequence of UTF-8 encoded code points of the characters of a string. The UTF8 flag being off means that each octet in this representation encodes a single character with code point 0..255 within the string. Perl's Unicode model is not to use UTF-8 until it is absolutely necessary.

  • uvchr_to_utf8(buf, chr) writes a Unicode character code point into a buffer encoding the code point as UTF-8, and returns a pointer pointing after the UTF-8 bytes. It works appropriately on EBCDIC machines.

  • utf8_to_uvchr_buf(buf, bufend, lenp) reads UTF-8 encoded bytes from a buffer and returns the Unicode character code point and, optionally, the length of the UTF-8 byte sequence. It works appropriately on EBCDIC machines.

  • utf8_length(start, end) returns the length of the UTF-8 encoded buffer in characters. sv_len_utf8(sv) returns the length of the UTF-8 encoded scalar.

  • sv_utf8_upgrade(sv) converts the string of the scalar to its UTF-8 encoded form. sv_utf8_downgrade(sv) does the opposite, if possible. sv_utf8_encode(sv) is like sv_utf8_upgrade except that it does not set the UTF8 flag. sv_utf8_decode() does the opposite of sv_utf8_encode() . Note that none of these are to be used as general-purpose encoding or decoding interfaces: use Encode for that. sv_utf8_upgrade() is affected by the encoding pragma but sv_utf8_downgrade() is not (since the encoding pragma is designed to be a one-way street).

  • is_utf8_string(buf, len) returns true if len bytes of the buffer are valid UTF-8.

  • is_utf8_char_buf(buf, buf_end) returns true if the pointer points to a valid UTF-8 character.

  • UTF8SKIP(buf) will return the number of bytes in the UTF-8 encoded character in the buffer. UNISKIP(chr) will return the number of bytes required to UTF-8-encode the Unicode character code point. UTF8SKIP() is useful for example for iterating over the characters of a UTF-8 encoded buffer; UNISKIP() is useful, for example, in computing the size required for a UTF-8 encoded buffer.

  • utf8_distance(a, b) will tell the distance in characters between the two pointers pointing to the same UTF-8 encoded buffer.

  • utf8_hop(s, off) will return a pointer to a UTF-8 encoded buffer that is off (positive or negative) Unicode characters displaced from the UTF-8 buffer s. Be careful not to overstep the buffer: utf8_hop() will merrily run off the end or the beginning of the buffer if told to do so.

  • pv_uni_display(dsv, spv, len, pvlim, flags) and sv_uni_display(dsv, ssv, pvlim, flags) are useful for debugging the output of Unicode strings and scalars. By default they are useful only for debugging--they display all characters as hexadecimal code points--but with the flags UNI_DISPLAY_ISPRINT , UNI_DISPLAY_BACKSLASH , and UNI_DISPLAY_QQ you can make the output more readable.

  • foldEQ_utf8(s1, pe1, l1, u1, s2, pe2, l2, u2) can be used to compare two strings case-insensitively in Unicode. For case-sensitive comparisons you can just use memEQ() and memNE() as usual, except if one string is in utf8 and the other isn't.

For more information, see perlapi, and utf8.c and utf8.h in the Perl source code distribution.

Hacking Perl to work on earlier Unicode versions (for very serious hackers only)

Perl by default comes with the latest supported Unicode version built in, but you can change to use any earlier one.

Download the files in the desired version of Unicode from the Unicode web site (http://www.unicode.org). These should replace the existing files in lib/unicore in the Perl source tree. Follow the instructions in README.perl in that directory to change some of their names, and then build perl (see INSTALL).

BUGS

Interaction with Locales

See Unicode and UTF-8 in perllocale

Problems with characters in the Latin-1 Supplement range

See The Unicode Bug

Interaction with Extensions

When Perl exchanges data with an extension, the extension should be able to understand the UTF8 flag and act accordingly. If the extension doesn't recognize that flag, it's likely that the extension will return incorrectly-flagged data.

So if you're working with Unicode data, consult the documentation of every module you're using if there are any issues with Unicode data exchange. If the documentation does not talk about Unicode at all, suspect the worst and probably look at the source to learn how the module is implemented. Modules written completely in Perl shouldn't cause problems. Modules that directly or indirectly access code written in other programming languages are at risk.

For affected functions, the simple strategy to avoid data corruption is to always make the encoding of the exchanged data explicit. Choose an encoding that you know the extension can handle. Convert arguments passed to the extensions to that encoding and convert results back from that encoding. Write wrapper functions that do the conversions for you, so you can later change the functions when the extension catches up.

To provide an example, let's say the popular Foo::Bar::escape_html function doesn't deal with Unicode data yet. The wrapper function would convert the argument to raw UTF-8 and convert the result back to Perl's internal representation like so:

    sub my_escape_html ($) {
        my($what) = shift;
        return unless defined $what;
        Encode::decode_utf8(Foo::Bar::escape_html(
                                Encode::encode_utf8($what)));
    }

Sometimes, when the extension does not convert data but just stores and retrieves them, you will be able to use the otherwise dangerous Encode::_utf8_on() function. Let's say the popular Foo::Bar extension, written in C, provides a param method that lets you store and retrieve data according to these prototypes:

    $self->param($name, $value);            # set a scalar
    $value = $self->param($name);           # retrieve a scalar

If it does not yet provide support for any encoding, one could write a derived class with such a param method:

    sub param {
        my($self,$name,$value) = @_;
        utf8::upgrade($name);     # make sure it is UTF-8 encoded
        if (defined $value) {
            utf8::upgrade($value);  # make sure it is UTF-8 encoded
            return $self->SUPER::param($name,$value);
        } else {
            my $ret = $self->SUPER::param($name);
            Encode::_utf8_on($ret); # we know, it is UTF-8 encoded
            return $ret;
        }
    }

Some extensions provide filters on data entry/exit points, such as DB_File::filter_store_key and family. Look out for such filters in the documentation of your extensions; they can make the transition to Unicode data much easier.
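As a rough illustration of such filters, the sketch below installs the four DBM filter methods on a tied hash. It uses SDBM_File as a stand-in only because that module ships with Perl; the same filter_* method names are documented for DB_File. The choice of UTF-8 as the stored encoding is an assumption of this example.

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;
use Encode qw(encode decode);
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my %db;
my $handle = tie(%db, 'SDBM_File', "$dir/demo", O_RDWR | O_CREAT, 0666)
    or die "Cannot tie database: $!";

# Each filter receives the datum in $_ and modifies it in place.
$handle->filter_store_key(  sub { $_ = encode('UTF-8', $_) });
$handle->filter_store_value(sub { $_ = encode('UTF-8', $_) });
$handle->filter_fetch_key(  sub { $_ = decode('UTF-8', $_) });
$handle->filter_fetch_value(sub { $_ = decode('UTF-8', $_) });

# The program now works with text strings only.
$db{"\x{263A}"} = "white smiling face \x{263A}";

my ($key) = keys %db;
my $value = $db{$key};

undef $handle;      # drop the extra reference before untie
untie %db;

print "key and value survived the round trip\n"
    if $key eq "\x{263A}" && $value eq "white smiling face \x{263A}";
```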

Speed

Some functions are slower when working on UTF-8 encoded strings than on byte encoded strings. All functions that need to hop over characters such as length(), substr() or index(), or matching regular expressions can work much faster when the underlying data are byte-encoded.

In Perl 5.8.0 the slowness was often quite spectacular; in Perl 5.8.1 a caching scheme was introduced which will hopefully make the slowness somewhat less spectacular, at least for some operations. In general, operations with UTF-8 encoded strings are still slower. As an example, the Unicode properties (character classes) like \p{Nd} are known to be quite a bit slower (5-20 times) than their simpler counterparts like \d (then again, there are hundreds of Unicode characters matching Nd compared with the 10 ASCII characters matching \d).

Problems on EBCDIC platforms

There are several known problems with Perl on EBCDIC platforms. If you want to use Perl there, send email to perlbug@perl.org.

In earlier versions, when byte and character data were concatenated, the new string was sometimes created by decoding the byte strings as ISO 8859-1 (Latin-1), even if the old Unicode string used EBCDIC.

If you find any of these, please report them as bugs.

Porting code from perl-5.6.X

Perl 5.8 has a different Unicode model from 5.6. In 5.6 the programmer was required to use the utf8 pragma to declare that a given scope expected to deal with Unicode data and had to make sure that only Unicode data were reaching that scope. If you have code that is working with 5.6, you will need some of the following adjustments to your code. The examples are written such that the code will continue to work under 5.6, so you should be safe to try them out.

  • A filehandle that should read or write UTF-8

        if ($] > 5.008) {
            binmode $fh, ":encoding(utf8)";
        }

  • A scalar that is going to be passed to some extension

    Be it Compress::Zlib, Apache::Request or any extension that has no mention of Unicode in the manpage, you need to make sure that the UTF8 flag is stripped off. Note that at the time of this writing (January 2012) the mentioned modules are not UTF-8-aware. Please check the documentation to verify if this is still true.

        if ($] > 5.008) {
            require Encode;
            $val = Encode::encode_utf8($val); # make octets
        }

  • A scalar we got back from an extension

    If you believe the scalar comes back as UTF-8, you will most likely want the UTF8 flag restored:

        if ($] > 5.008) {
            require Encode;
            $val = Encode::decode_utf8($val);
        }

  • Same thing, if you are really sure it is UTF-8

        if ($] > 5.008) {
            require Encode;
            Encode::_utf8_on($val);
        }

  • A wrapper for fetchrow_array and fetchrow_hashref

    When the database contains only UTF-8, a wrapper function or method is a convenient way to replace all your fetchrow_array and fetchrow_hashref calls. A wrapper function will also make it easier to adapt to future enhancements in your database driver. Note that at the time of this writing (January 2012), the DBI has no standardized way to deal with UTF-8 data. Please check the documentation to verify if that is still true.

        sub fetchrow {
            # $what is one of fetchrow_{array,hashref}
            my($self, $sth, $what) = @_;
            if ($] < 5.008) {
                return $sth->$what;
            } else {
                require Encode;
                if (wantarray) {
                    my @arr = $sth->$what;
                    for (@arr) {
                        defined && /[^\000-\177]/ && Encode::_utf8_on($_);
                    }
                    return @arr;
                } else {
                    my $ret = $sth->$what;
                    if (ref $ret) {
                        for my $k (keys %$ret) {
                            defined
                            && /[^\000-\177]/
                            && Encode::_utf8_on($_) for $ret->{$k};
                        }
                        return $ret;
                    } else {
                        defined && /[^\000-\177]/ && Encode::_utf8_on($_) for $ret;
                        return $ret;
                    }
                }
            }
        }

  • A large scalar that you know can only contain ASCII

    Scalars that contain only ASCII and are marked as UTF-8 are sometimes a drag to your program. If you recognize such a situation, just remove the UTF8 flag:

        utf8::downgrade($val) if $] > 5.008;

SEE ALSO

perlunitut, perluniintro, perluniprops, Encode, open, utf8, bytes, perlretut, ${^UNICODE} in perlvar, http://www.unicode.org/reports/tr44.

 
Perl 5 version 18.2 documentation
perlunifaq

NAME

perlunifaq - Perl Unicode FAQ

Q and A

This is a list of questions and answers about Unicode in Perl, intended to be read after perlunitut.

perlunitut isn't really a Unicode tutorial, is it?

No, and this isn't really a Unicode FAQ.

Perl has an abstracted interface for all supported character encodings, so this is actually a generic Encode tutorial and Encode FAQ. But many people think that Unicode is special and magical, and I didn't want to disappoint them, so I decided to call the document a Unicode tutorial.

What character encodings does Perl support?

To find out which character encodings your Perl supports, run:

    perl -MEncode -le "print for Encode->encodings(':all')"

Which version of perl should I use?

Well, if you can, upgrade to the most recent, but certainly 5.8.1 or newer. The tutorial and FAQ assume the latest release.

You should also check your modules, and upgrade them if necessary. For example, HTML::Entities requires version >= 1.32 to function correctly, even though the changelog is silent about this.

What about binary data, like images?

Well, apart from a bare binmode $fh , you shouldn't treat them specially. (The binmode is needed because otherwise Perl may convert line endings on Win32 systems.)

Be careful, though, to never combine text strings with binary strings. If you need text in a binary stream, encode your text strings first using the appropriate encoding, then join them with binary strings. See also: "What if I don't encode?".
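For instance, a text field can be placed into a binary record by encoding it first. The record layout below (a 4-byte big-endian length followed by the UTF-8 payload) is a hypothetical format invented for this sketch.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text    = "na\x{EF}ve \x{263A}";          # 7 characters of text
my $payload = encode('UTF-8', $text);         # 10 bytes
my $record  = pack('N', length $payload) . $payload;   # length prefix + payload

# Reading it back: split the binary record, then decode the text part.
my ($size, $rest) = unpack('N a*', $record);
my $decoded = decode('UTF-8', substr($rest, 0, $size));

printf "%d characters, %d payload bytes, %d record bytes\n",
    length $text, length $payload, length $record;
```

Note that the length prefix counts bytes of the encoded payload, not characters of the text string.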

When should I decode or encode?

Whenever you're communicating text with anything that is external to your perl process, like a database, a text file, a socket, or another program. Even if the thing you're communicating with is also written in Perl.

What if I don't decode?

Whenever your encoded, binary string is used together with a text string, Perl will assume that your binary string was encoded with ISO-8859-1, also known as latin-1. If it wasn't latin-1, then your data is unpleasantly converted. For example, if it was UTF-8, the individual bytes of multibyte characters are seen as separate characters, and then again converted to UTF-8. Such double encoding can be compared to double HTML encoding (&amp;gt; ), or double URI encoding (%253E ).
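The double encoding described above can be reproduced in a few lines. This sketch assumes the incoming bytes are UTF-8, and compares combining them undecoded with decoding them first.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text  = "\x{263A}";                 # a text string (1 character)
my $bytes = encode('UTF-8', "\xE9");    # encoded data: 2 bytes, C3 A9

# Wrong: the bytes are implicitly read as ISO-8859-1 when concatenated.
my $wrong = encode('UTF-8', $text . $bytes);

# Right: decode first, then combine, then encode once.
my $right = encode('UTF-8', $text . decode('UTF-8', $bytes));

printf "wrong: %s\n", unpack('H*', $wrong);   # e298bac383c2a9
printf "right: %s\n", unpack('H*', $right);   # e298bac3a9
```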

This silent implicit decoding is known as "upgrading". That may sound positive, but it's best to avoid it.

What if I don't encode?

Your text string will be sent using the bytes in Perl's internal format. In some cases, Perl will warn you that you're doing something wrong, with a friendly warning:

    Wide character in print at example.pl line 2.

Because the internal format is often UTF-8, these bugs are hard to spot, because UTF-8 is usually the encoding you wanted! But don't be lazy, and don't use the fact that Perl's internal format is UTF-8 to your advantage. Encode explicitly to avoid weird bugs, and to show to maintenance programmers that you thought this through.

Is there a way to automatically decode or encode?

If all data that comes from a certain handle is encoded in exactly the same way, you can tell the PerlIO system to automatically decode everything, with the encoding layer. If you do this, you can't accidentally forget to decode or encode anymore, on things that use the layered handle.

You can provide this layer when opening the file:

    open my $fh, '>:encoding(UTF-8)', $filename;  # auto encoding on write
    open my $fh, '<:encoding(UTF-8)', $filename;  # auto decoding on read

Or if you already have an open filehandle:

    binmode $fh, ':encoding(UTF-8)';

Some database drivers for DBI can also automatically encode and decode, but that is sometimes limited to the UTF-8 encoding.
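The encoding layer on file handles can be tried out with a temporary file. The sketch below writes a text string through the layer, reads it back through the layer, and then reads the same file raw to show what was actually stored.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $filename) = tempfile(UNLINK => 1);
close $tmp;

my $text = "caf\x{E9} \x{263A}";    # 6 characters

# Writing: the layer encodes characters to UTF-8 bytes.
open my $out, '>:encoding(UTF-8)', $filename or die "Cannot write: $!";
print {$out} $text;
close $out or die "Cannot close: $!";

# Reading with the layer: bytes are decoded back to characters.
open my $in, '<:encoding(UTF-8)', $filename or die "Cannot read: $!";
my $decoded = do { local $/; <$in> };
close $in;

# Reading without a layer shows the encoded bytes as stored on disk.
open my $raw, '<:raw', $filename or die "Cannot read: $!";
my $bytes = do { local $/; <$raw> };
close $raw;

printf "%d characters, %d bytes on disk\n", length $decoded, length $bytes;
```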

What if I don't know which encoding was used?

Do whatever you can to find out, and if you have to: guess. (Don't forget to document your guess with a comment.)

You could open the document in a web browser, and change the character set or character encoding until you can visually confirm that all characters look the way they should.

There is no way to reliably detect the encoding automatically, so if people keep sending you data without charset indication, you may have to educate them.

Can I use Unicode in my Perl sources?

Yes, you can! If your sources are UTF-8 encoded, you can indicate that with the use utf8 pragma.

    use utf8;

This doesn't do anything to your input, or to your output. It only influences the way your sources are read. You can use Unicode in string literals, in identifiers (but they still have to be "word characters" according to \w ), and even in custom delimiters.
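The effect on string literals can be seen by counting characters. This sketch assumes the source file itself is saved as UTF-8; saved in any other encoding it will behave differently.

```perl
use strict;
use warnings;
use utf8;    # this source file is assumed to be saved as UTF-8

my $word = "naïve";
my $with_pragma = length $word;      # counted in characters

my $without_pragma;
{
    no utf8;                         # the same literal is now read as bytes
    $without_pragma = length "naïve";
}

print "with use utf8: $with_pragma, without: $without_pragma\n";
```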

Data::Dumper doesn't restore the UTF8 flag; is it broken?

No, Data::Dumper's Unicode abilities are as they should be. There have been some complaints that it should restore the UTF8 flag when the data is read again with eval. However, you should really not look at the flag, and nothing indicates that Data::Dumper should break this rule.

Here's what happens: when Perl reads in a string literal, it sticks to 8 bit encoding as long as it can. (But perhaps originally it was internally encoded as UTF-8, when you dumped it.) When it has to give that up because other characters are added to the text string, it silently upgrades the string to UTF-8.

If you properly encode your strings for output, none of this is of your concern, and you can just eval dumped data as always.
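A quick check along these lines: dump a string, eval the dump, and compare. The flag values are printed only to show that they are allowed to differ; the comparison with eq is what matters.

```perl
use strict;
use warnings;
use Data::Dumper;

local $Data::Dumper::Terse = 1;     # dump the bare value, without "$VAR1 ="

my $original = "caf\xE9";
utf8::upgrade($original);           # force the UTF-8 internal representation

my $restored = eval Dumper($original);
die "eval failed: $@" if $@;

printf "same text: %s\n", $restored eq $original ? 'yes' : 'no';
printf "flag before: %d, flag after: %d\n",
    utf8::is_utf8($original) ? 1 : 0,
    utf8::is_utf8($restored) ? 1 : 0;
```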

Why do regex character classes sometimes match only in the ASCII range?

Why do some characters not uppercase or lowercase correctly?

Starting in Perl 5.14 (and partially in Perl 5.12), just put a use feature 'unicode_strings' near the beginning of your program. Within its lexical scope you shouldn't have this problem. It also is automatically enabled under use feature ':5.12' or use v5.12 or using -E on the command line for Perl 5.12 or higher.

The rationale for requiring this is to not break older programs that rely on the way things worked before Unicode came along. Those older programs knew only about the ASCII character set, and so may not work properly for additional characters. When a string is encoded in UTF-8, Perl assumes that the program is prepared to deal with Unicode, but when the string isn't, Perl assumes that only ASCII is wanted, and so those characters that are not ASCII characters aren't recognized as to what they would be in Unicode. use feature 'unicode_strings' tells Perl to treat all characters as Unicode, whether the string is encoded in UTF-8 or not, thus avoiding the problem.

However, on earlier Perls, or if you pass strings to subroutines outside the feature's scope, you can force Unicode semantics by changing the encoding to UTF-8 by doing utf8::upgrade($string) . This can be used safely on any string, as it checks and does not change strings that have already been upgraded.
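The case-changing side of the problem, and both fixes, can be sketched as follows (ASCII platform, no locale in effect):

```perl
use strict;
use warnings;

my $e_acute = "\xE9";               # LATIN SMALL LETTER E WITH ACUTE

my ($plain, $upgraded, $featured);
{
    no feature 'unicode_strings';
    $plain = uc $e_acute;           # byte semantics: left unchanged

    my $copy = $e_acute;
    utf8::upgrade($copy);
    $upgraded = uc $copy;           # character semantics: U+00C9
}
{
    use feature 'unicode_strings';
    $featured = uc $e_acute;        # character semantics: U+00C9
}

printf "plain: %02X, upgraded: %02X, feature: %02X\n",
    ord $plain, ord $upgraded, ord $featured;
```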

For a more detailed discussion, see Unicode::Semantics on CPAN.

How can I determine if a string is a text string or a binary string?

You can't. Some use the UTF8 flag for this, but that's misuse, and makes well behaved modules like Data::Dumper look bad. The flag is useless for this purpose, because it's off when an 8 bit encoding (by default ISO-8859-1) is used to store the string.

This is something you, the programmer, have to keep track of; sorry. You could consider adopting a kind of "Hungarian notation" to help with this.

How do I convert from encoding FOO to encoding BAR?

By first converting the FOO-encoded byte string to a text string, and then the text string to a BAR-encoded byte string:

    use Encode qw(decode encode);
    my $text_string = decode('FOO', $foo_string);
    my $bar_string  = encode('BAR', $text_string);

or by skipping the text string part, and going directly from one binary encoding to the other:

    use Encode qw(from_to);
    from_to($string, 'FOO', 'BAR'); # changes contents of $string

or by letting automatic decoding and encoding do all the work:

    open my $foofh, '<:encoding(FOO)', 'example.foo.txt';
    open my $barfh, '>:encoding(BAR)', 'example.bar.txt';
    print { $barfh } $_ while <$foofh>;
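With real encoding names in place of FOO and BAR, the first two approaches look like this. ISO-8859-1 and UTF-8 are chosen here purely as an example pair.

```perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

my $latin1 = "caf\xE9";                       # ISO-8859-1 bytes

# Two steps: bytes to text, then text to bytes.
my $text = decode('ISO-8859-1', $latin1);
my $utf8 = encode('UTF-8', $text);

# One step: from_to() converts the byte string in place.
my $in_place = $latin1;
from_to($in_place, 'ISO-8859-1', 'UTF-8');

printf "%s -> %s\n", unpack('H*', $latin1), unpack('H*', $utf8);
```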

What are decode_utf8 and encode_utf8 ?

These are alternate syntaxes for decode('utf8', ...) and encode('utf8', ...) .

What is a "wide character"?

This is a term used, depending on the context, for characters with an ordinal value greater than 127, for characters with an ordinal value greater than 255, or for any character occupying more than one byte.

The Perl warning "Wide character in ..." is caused by a character with an ordinal value greater than 255. With no specified encoding layer, Perl tries to fit things in ISO-8859-1 for backward compatibility reasons. When it can't, it emits this warning (if warnings are enabled), and outputs UTF-8 encoded data instead.

To avoid this warning and to avoid having different output encodings in a single stream, always specify an encoding explicitly, for example with a PerlIO layer:

    binmode STDOUT, ":encoding(UTF-8)";
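The warning can be observed without cluttering the terminal by collecting warnings in an array. This sketch prints to a temporary file rather than STDOUT, and opens the first handle with an explicit :raw layer so that no default encoding layer from the environment gets in the way.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my ($tmp, $filename) = tempfile(UNLINK => 1);
close $tmp;

# No encoding layer: Perl warns and writes UTF-8 bytes anyway.
open my $plain, '>:raw', $filename or die "Cannot write: $!";
print {$plain} "\x{263A}\n";
close $plain;
my $after_plain = @warnings;

# Explicit layer: same character, no warning.
open my $layered, '>:encoding(UTF-8)', $filename or die "Cannot write: $!";
print {$layered} "\x{263A}\n";
close $layered;
my $after_layered = @warnings;

print "warnings without layer: $after_plain\n";
print "additional warnings with layer: ", $after_layered - $after_plain, "\n";
```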

INTERNALS

What is "the UTF8 flag"?

Please, unless you're hacking the internals, or debugging weirdness, don't think about the UTF8 flag at all. That means that you very probably shouldn't use is_utf8 , _utf8_on or _utf8_off at all.

The UTF8 flag, also called SvUTF8, is an internal flag that indicates that the current internal representation is UTF-8. Without the flag, it is assumed to be ISO-8859-1. Perl converts between these automatically. (Actually Perl usually assumes the representation is ASCII; see Why do regex character classes sometimes match only in the ASCII range? above.)

One of Perl's internal formats happens to be UTF-8. Unfortunately, Perl can't keep a secret, so everyone knows about this. That is the source of much confusion. It's better to pretend that the internal format is some unknown encoding, and that you always have to encode and decode explicitly.

What about the use bytes pragma?

Don't use it. It makes no sense to deal with bytes in a text string, and it makes no sense to deal with characters in a byte string. Do the proper conversions (by decoding/encoding), and things will work out well: you get character counts for decoded data, and byte counts for encoded data.

use bytes is usually a failed attempt to do something useful. Just forget about it.
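The character-versus-byte distinction needs no pragma at all, as this small sketch shows:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text  = "\x{263A} caf\xE9";            # decoded data: 6 characters
my $bytes = encode('UTF-8', $text);        # encoded data: 9 bytes

my $char_count = length $text;
my $byte_count = length $bytes;

# substr() follows the same rule: characters for text, bytes for encoded data.
my $first_char = substr($text,  0, 1);     # the whole smiley
my $first_byte = substr($bytes, 0, 1);     # only its first byte, 0xE2

print "$char_count characters, $byte_count bytes\n";
```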

What about the use encoding pragma?

Don't use it. Unfortunately, it assumes that the programmer's environment and that of the user will use the same encoding. It will use the same encoding for the source code and for STDIN and STDOUT. When a program is copied to another machine, the source code does not change, but the STDIO environment might.

If you need non-ASCII characters in your source code, make it a UTF-8 encoded file and use utf8.

If you need to set the encoding for STDIN, STDOUT, and STDERR, for example based on the user's locale, use open.

What is the difference between :encoding and :utf8 ?

Because UTF-8 is one of Perl's internal formats, you can often just skip the encoding or decoding step, and manipulate the UTF8 flag directly.

Instead of :encoding(UTF-8) , you can simply use :utf8 , which skips the encoding step if the data was already represented as UTF8 internally. This is widely accepted as good behavior when you're writing, but it can be dangerous when reading, because it causes internal inconsistency when you have invalid byte sequences. Using :utf8 for input can sometimes result in security breaches, so please use :encoding(UTF-8) instead.

Instead of decode and encode , you could use _utf8_on and _utf8_off , but this is considered bad style. Especially _utf8_on can be dangerous, for the same reason that :utf8 can.
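The difference between validating and merely flagging can be sketched with a deliberately malformed byte sequence. Here decode is called with the FB_CROAK check so that it dies on bad input, and utf8::valid() is used only to inspect the result of _utf8_on.

```perl
use strict;
use warnings;
use Encode ();

my $bad = "\xC3\x28";               # 0xC3 must be followed by a continuation byte

# decode() with FB_CROAK validates the input and dies on malformed data.
my $copy    = $bad;                 # decode() may modify its argument when CHECK is set
my $decoded = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK()) };
my $error   = $@;

# _utf8_on() performs no validation: it only flips the flag.
my $flagged = $bad;
Encode::_utf8_on($flagged);
my $consistent = utf8::valid($flagged) ? 1 : 0;

print "decode failed: ", (defined $decoded ? 'no' : 'yes'), "\n";
print "flagged string is well-formed: $consistent\n";
```

A string left in this inconsistent state should not be used any further, which is why _utf8_on is best avoided on data you have not validated yourself.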

There are some shortcuts for oneliners; see -C in perlrun.

What's the difference between UTF-8 and utf8 ?

UTF-8 is the official standard. utf8 is Perl's way of being liberal in what it accepts. If you have to communicate with things that aren't so liberal, you may want to consider using UTF-8 . If you have to communicate with things that are too liberal, you may have to use utf8 . The full explanation is in Encode.

UTF-8 is internally known as utf-8-strict . The tutorial uses UTF-8 consistently, even where utf8 is actually used internally, because the distinction can be hard to make, and is mostly irrelevant.

For example, utf8 can be used for code points that don't exist in Unicode, like 9999999, but if you encode that to UTF-8, you get a substitution character (by default; see Handling Malformed Data in Encode for more ways of dealing with this.)
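The two names can be compared directly. The sketch below looks up both encodings and encodes the out-of-range code point mentioned above with each; the exact bytes are printed rather than assumed.

```perl
use strict;
use warnings;
use Encode qw(encode find_encoding);

# The two names resolve to different encoding objects.
my $strict_name = find_encoding('UTF-8')->name;
my $lax_name    = find_encoding('utf8')->name;

# chr(9999999) is far above U+10FFFF, the highest Unicode code point.
my $char   = chr(9999999);
my $lax    = encode('utf8',  $char);
my $strict = encode('UTF-8', $char);

print "UTF-8 resolves to $strict_name, utf8 resolves to $lax_name\n";
printf "lax: %s\nstrict: %s\n", unpack('H*', $lax), unpack('H*', $strict);
```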

Okay, if you insist: the "internal format" is utf8, not UTF-8. (When it's not some other encoding.)

I lost track; what encoding is the internal format really?

It's good that you lost track, because you shouldn't depend on the internal format being any specific encoding. But since you asked: by default, the internal format is either ISO-8859-1 (latin-1), or utf8, depending on the history of the string. On EBCDIC platforms, this may even be different.

Perl knows how it stored the string internally, and will use that knowledge when you encode . In other words: don't try to find out what the internal encoding for a certain string is, but instead just encode it into the encoding that you want.
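To see that the internal storage does not affect the result, two strings with the same characters but different internal representations can be encoded and compared:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $native   = "caf\xE9";           # stored one byte per character
my $upgraded = $native;
utf8::upgrade($upgraded);           # same characters, stored as UTF-8

my $same_text  = $native eq $upgraded ? 1 : 0;
my $same_utf8  = encode('UTF-8', $native)
              eq encode('UTF-8', $upgraded) ? 1 : 0;
my $same_latin = encode('ISO-8859-1', $native)
              eq encode('ISO-8859-1', $upgraded) ? 1 : 0;

print "same text: $same_text, same UTF-8: $same_utf8, same Latin-1: $same_latin\n";
```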

AUTHOR

Juerd Waalboer <#####@juerd.nl>

SEE ALSO

perlunicode, perluniintro, Encode

 
Perl 5 version 18.2 documentation
perluniintro

NAME

perluniintro - Perl Unicode introduction

DESCRIPTION

This document gives a general idea of Unicode and how to use Unicode in Perl. See Further Resources for references to more in-depth treatments of Unicode.

Unicode

Unicode is a character set standard which plans to codify all of the writing systems of the world, plus many other symbols.

Unicode and ISO/IEC 10646 are coordinated standards that unify almost all other modern character set standards, covering more than 80 writing systems and hundreds of languages, including all commercially-important modern languages. All characters in the largest Chinese, Japanese, and Korean dictionaries are also encoded. The standards will eventually cover almost all characters in more than 250 writing systems and thousands of languages. Unicode 1.0 was released in October 1991, and 6.0 in October 2010.

A Unicode character is an abstract entity. It is not bound to any particular integer width, especially not to the C language char . Unicode is language-neutral and display-neutral: it does not encode the language of the text, and it does not generally define fonts or other graphical layout details. Unicode operates on characters and on text built from those characters.

Unicode defines characters like LATIN CAPITAL LETTER A or GREEK SMALL LETTER ALPHA and unique numbers for the characters, in this case 0x0041 and 0x03B1, respectively. These unique numbers are called code points. A code point is essentially the position of the character within the set of all possible Unicode characters, and thus in Perl, the term ordinal is often used interchangeably with it.

The Unicode standard prefers using hexadecimal notation for the code points. If numbers like 0x0041 are unfamiliar to you, take a peek at a later section, Hexadecimal Notation. The Unicode standard uses the notation U+0041 LATIN CAPITAL LETTER A, to give the hexadecimal code point and the normative name of the character.
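In Perl, the mapping between characters, code points, and names can be explored with ord, chr, and the charnames module. A brief sketch:

```perl
use strict;
use warnings;
use charnames ();

# Print the U+XXXX notation and the name for two characters.
for my $char ("A", "\x{3B1}") {
    my $code_point = ord $char;
    printf "U+%04X %s\n", $code_point, charnames::viacode($code_point);
}

# Going the other way: from a name to a code point to a character.
my $alpha = chr(charnames::vianame('GREEK SMALL LETTER ALPHA'));
```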

Unicode also defines various properties for the characters, like "uppercase" or "lowercase", "decimal digit", or "punctuation"; these properties are independent of the names of the characters. Furthermore, various operations on the characters like uppercasing, lowercasing, and collating (sorting) are defined.

A Unicode logical "character" can actually consist of more than one internal actual "character" or code point. For Western languages, this is adequately modelled by a base character (like LATIN CAPITAL LETTER A ) followed by one or more modifiers (like COMBINING ACUTE ACCENT ). This sequence of base character and modifiers is called a combining character sequence. Some non-western languages require more complicated models, so Unicode created the grapheme cluster concept, which was later further refined into the extended grapheme cluster. For example, a Korean Hangul syllable is considered a single logical character, but most often consists of three actual Unicode characters: a leading consonant followed by an interior vowel followed by a trailing consonant.

Whether to call these extended grapheme clusters "characters" depends on your point of view. If you are a programmer, you probably would tend towards seeing each element in the sequences as one unit, or "character". However from the user's point of view, the whole sequence could be seen as one "character" since that's probably what it looks like in the context of the user's language. In this document, we take the programmer's point of view: one "character" is one Unicode code point.

For some combinations of base character and modifiers, there are precomposed characters. There is a single character equivalent, for example, to the sequence LATIN CAPITAL LETTER A followed by COMBINING ACUTE ACCENT . It is called LATIN CAPITAL LETTER A WITH ACUTE . These precomposed characters are, however, only available for some combinations, and are mainly meant to support round-trip conversions between Unicode and legacy standards (like ISO 8859). Using sequences, as Unicode does, allows for needing fewer basic building blocks (code points) to express many more potential grapheme clusters. To support conversion between equivalent forms, various normalization forms are also defined. Thus, LATIN CAPITAL LETTER A WITH ACUTE is in Normalization Form Composed, (abbreviated NFC), and the sequence LATIN CAPITAL LETTER A followed by COMBINING ACUTE ACCENT represents the same character in Normalization Form Decomposed (NFD).
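The relationship between the precomposed character and the combining sequence can be checked from Perl. The following is a minimal sketch, assuming the core module Unicode::Normalize is installed; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

my $precomposed = "\x{C1}";     # LATIN CAPITAL LETTER A WITH ACUTE
my $sequence    = "A\x{301}";   # LATIN CAPITAL LETTER A + COMBINING ACUTE ACCENT

# Compared code point by code point, the two spellings differ ...
my $raw_equal = ($precomposed eq $sequence) ? 1 : 0;

# ... but they agree once both are brought to the same normalization form.
my $nfd_equal = (NFD($precomposed) eq NFD($sequence)) ? 1 : 0;
my $nfc_len   = length NFC($sequence);      # composed: one code point
my $nfd_len   = length NFD($precomposed);   # decomposed: two code points

printf "raw=%d nfd=%d nfc_len=%d nfd_len=%d\n",
    $raw_equal, $nfd_equal, $nfc_len, $nfd_len;
```

On an ASCII platform this should print raw=0 nfd=1 nfc_len=1 nfd_len=2.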

Because of backward compatibility with legacy encodings, the "a unique number for every character" idea breaks down a bit: instead, there is "at least one number for every character". The same character could be represented differently in several legacy encodings. The converse does not hold either: some code points do not have an assigned character. Firstly, there are unallocated code points within otherwise used blocks. Secondly, there are special Unicode control characters that do not represent true characters.

When Unicode was first conceived, it was thought that all the world's characters could be represented using a 16-bit word; that is a maximum of 0x10000 (or 65536) characters from 0x0000 to 0xFFFF would be needed. This soon proved to be false, and since Unicode 2.0 (July 1996), Unicode has been defined all the way up to 21 bits (0x10FFFF ), and Unicode 3.1 (March 2001) defined the first characters above 0xFFFF . The first 0x10000 characters are called the Plane 0, or the Basic Multilingual Plane (BMP). With Unicode 3.1, 17 (yes, seventeen) planes in all were defined--but they are nowhere near full of defined characters, yet.

When a new language is being encoded, Unicode generally will choose a block of consecutive unallocated code points for its characters. So far, the number of code points in these blocks has always been evenly divisible by 16. Extras in a block, not currently needed, are left unallocated, for future growth. But there have been occasions when a later release needed more code points than the available extras, and a new block had to be allocated somewhere else, not contiguous to the initial one, to handle the overflow. Thus, it became apparent early on that "block" wasn't an adequate organizing principle, and so the Script property was created. (Later an improved script property was added as well, the Script_Extensions property.) Those code points that are in overflow blocks can still have the same script as the original ones. The script concept fits more closely with natural language: there is Latin script, Greek script, and so on; and there are several artificial scripts, like Common for characters that are used in multiple scripts, such as mathematical symbols. Scripts usually span varied parts of several blocks. For more information about scripts, see Scripts in perlunicode. The division into blocks exists, but it is almost completely accidental--an artifact of how the characters have been and still are allocated. (Note that this paragraph has oversimplified things for the sake of this being an introduction. Unicode doesn't really encode languages, but the writing systems for them--their scripts; and one script can be used by many languages. Unicode also encodes things that aren't really about languages, such as symbols like BAGGAGE CLAIM .)

The Unicode code points are just abstract numbers. To input and output these abstract numbers, the numbers must be encoded or serialised somehow. Unicode defines several character encoding forms, of which UTF-8 is perhaps the most popular. UTF-8 is a variable-length encoding that encodes Unicode characters as 1 to 4 bytes. Other encodings include UTF-16 and UTF-32 and their big- and little-endian variants (UTF-8 is byte-order independent). ISO/IEC 10646 defines the UCS-2 and UCS-4 encoding forms.
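To make the variable-length nature of UTF-8 concrete, here is a small sketch that encodes a few code points with the Encode module and reports how many bytes each one needs. The choice of sample code points is an assumption made for illustration; code points up to 0x10FFFF need at most four bytes.

```perl
use strict;
use warnings;
use Encode qw(encode);

my %utf8_len;   # code point => number of bytes in its UTF-8 encoding
for my $cp (0x41, 0xE9, 0x263A, 0x1F600) {
    my $bytes = encode('UTF-8', chr($cp));   # characters in, bytes out
    $utf8_len{$cp} = length $bytes;
    printf "U+%04X -> %d byte(s)\n", $cp, $utf8_len{$cp};
}
```

The four sample code points should take 1, 2, 3, and 4 bytes respectively.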

For more information about encodings--for instance, to learn what surrogates and byte order marks (BOMs) are--see perlunicode.

Perl's Unicode Support

Starting from Perl v5.6.0, Perl has had the capacity to handle Unicode natively. Perl v5.8.0, however, is the first recommended release for serious Unicode work. The maintenance release 5.6.1 fixed many of the problems of the initial Unicode implementation, but, for example, regular expressions still do not work with Unicode in 5.6.1. Perl v5.14.0 is the first release where Unicode support is (almost) seamlessly integrated, with few remaining gotchas (the exception being some differences in quotemeta, which were fixed starting in Perl 5.16.0). To enable this seamless support, you should use feature 'unicode_strings' (which is automatically selected if you use 5.012 or higher). See feature. (5.14 also fixes a number of bugs and departures from the Unicode standard.)

Before Perl v5.8.0, use utf8 was used to declare that operations in the current block or file would be Unicode-aware. This model was found to be wrong, or at least clumsy: the "Unicodeness" is now carried with the data, instead of being attached to the operations. Starting with Perl v5.8.0, only one case remains where an explicit use utf8 is needed: if your Perl script itself is encoded in UTF-8, you can use UTF-8 in your identifier names, and in string and regular expression literals, by saying use utf8 . This is not the default because scripts with legacy 8-bit data in them would break. See utf8.

Perl's Unicode Model

Perl supports both pre-5.6 strings of eight-bit native bytes, and strings of Unicode characters. The general principle is that Perl tries to keep its data as eight-bit bytes for as long as possible, but as soon as Unicodeness cannot be avoided, the data is transparently upgraded to Unicode. Prior to Perl v5.14.0, the upgrade was not completely transparent (see The Unicode Bug in perlunicode), and for backwards compatibility, full transparency is not gained unless use feature 'unicode_strings' (see feature) or use 5.012 (or higher) is selected.

Internally, Perl currently uses either the native eight-bit character set of the platform (for example Latin-1) or UTF-8 to encode Unicode strings. Specifically, if all code points in the string are 0xFF or less, Perl uses the native eight-bit character set. Otherwise, it uses UTF-8.

A user of Perl does not normally need to know nor care how Perl happens to encode its internal strings, but it becomes relevant when outputting Unicode strings to a stream without a PerlIO layer (one with the "default" encoding). In such a case, the raw bytes used internally (the native character set or UTF-8, as appropriate for each string) will be used, and a "Wide character" warning will be issued if those strings contain a character beyond 0x00FF.

For example,

  perl -e 'print "\x{DF}\n", "\x{0100}\x{DF}\n"'

produces a fairly useless mixture of native bytes and UTF-8, as well as a warning:

  Wide character in print at ...

To output UTF-8, use the :encoding or :utf8 output layer. Prepending

  binmode(STDOUT, ":utf8");

to this sample program ensures that the output is completely UTF-8, and removes the program's warning.
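The effect of an output layer can be observed without touching the terminal by printing to an in-memory file. This is only a sketch: it assumes a PerlIO-enabled Perl (5.8 or later), where open() accepts a reference to a scalar, and the variable names are made up for the example.

```perl
use strict;
use warnings;

my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer)
    or die "Cannot open in-memory file: $!";
print {$out} "\x{DF}\x{0100}";      # one narrow and one wide character
close($out) or die "Cannot close in-memory file: $!";

# $buffer now holds the encoded bytes; show them in hexadecimal.
my $hex = join ' ', map { sprintf '%02x', ord } split //, $buffer;
print "$hex\n";
```

Both characters come out as UTF-8 (c3 9f c4 80 on an ASCII platform), and no "Wide character" warning is issued because the layer states the encoding explicitly.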

You can enable automatic UTF-8-ification of your standard file handles, default open() layer, and @ARGV by using either the -C command line switch or the PERL_UNICODE environment variable, see perlrun for the documentation of the -C switch.

Note that this means that Perl expects other software to work the same way: if Perl has been led to believe that STDIN should be UTF-8, but then STDIN coming in from another command is not UTF-8, Perl will likely complain about the malformed UTF-8.

All features that combine Unicode and I/O also require using the new PerlIO feature. Almost all Perl 5.8 platforms do use PerlIO, though: you can see whether yours is by running "perl -V" and looking for useperlio=define .

Unicode and EBCDIC

Perl 5.8.0 also supports Unicode on EBCDIC platforms. There, Unicode support is somewhat more complex to implement since additional conversions are needed at every step.

Later Perl releases have added code that will not work on EBCDIC platforms, and no one has complained, so the divergence has continued. If you want to run Perl on an EBCDIC platform, send email to perlbug@perl.org

On EBCDIC platforms, the internal Unicode encoding form is UTF-EBCDIC instead of UTF-8. The difference is that UTF-8 is "ASCII-safe", in that ASCII characters encode to UTF-8 as-is, while UTF-EBCDIC is "EBCDIC-safe".

Creating Unicode

To create Unicode characters in literals for code points above 0xFF , use the \x{...} notation in double-quoted strings:

  my $smiley = "\x{263a}";

Similarly, it can be used in regular expression literals

  $smiley =~ /\x{263a}/;

At run-time you can use chr():

  my $hebrew_alef = chr(0x05d0);

See Further Resources for how to find all these numeric codes.

Naturally, ord() will do the reverse: it turns a character into a code point.

Note that \x.. (no {} and only two hexadecimal digits), \x{...} , and chr(...) for arguments less than 0x100 (decimal 256) generate an eight-bit character for backward compatibility with older Perls. For arguments of 0x100 or more, Unicode characters are always produced. If you want to force the production of Unicode characters regardless of the numeric value, use pack("U", ...) instead of \x.. , \x{...} , or chr().

You can invoke characters by name in double-quoted strings:

  my $arabic_alef = "\N{ARABIC LETTER ALEF}";

And, as mentioned above, you can also pack() numbers into Unicode characters:

  my $georgian_an = pack("U", 0x10a0);

Note that both \x{...} and \N{...} are compile-time string constants: you cannot use variables in them. If you want similar run-time functionality, use chr() and charnames::string_vianame() .
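As a sketch of the run-time route, the snippet below looks a character up by a name held in a variable and then maps it back to its name. It assumes a Perl recent enough to provide charnames::string_vianame() (5.14 or later); the variable names are illustrative.

```perl
use strict;
use warnings;
use charnames ();   # load the run-time functions without enabling \N{...} aliases

my $name = 'ARABIC LETTER ALEF';               # could come from user input
my $char = charnames::string_vianame($name);   # name -> one-character string
my $back = charnames::viacode(ord $char);      # code point -> name

printf "U+%04X %s\n", ord($char), $back;
```

This should print U+0627 ARABIC LETTER ALEF.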

If you want to force the result to Unicode characters, use the special "U0" prefix. It consumes no arguments but causes the following bytes to be interpreted as the UTF-8 encoding of Unicode characters:

  my $chars = pack("U0W*", 0x80, 0x42);

Likewise, you can stop such UTF-8 interpretation by using the special "C0" prefix.

Handling Unicode

Handling Unicode is for the most part transparent: just use the strings as usual. Functions like index(), length(), and substr() will work on the Unicode characters; regular expressions will work on the Unicode characters (see perlunicode and perlretut).

Note that Perl treats each code point of a grapheme cluster as a separate character, so for example

  print length("\N{LATIN CAPITAL LETTER A}\N{COMBINING ACUTE ACCENT}"),
        "\n";

will print 2, not 1. The only exception is that regular expressions have \X for matching an extended grapheme cluster. (Thus \X in a regular expression would match the entire sequence of both the example characters.)
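The difference between counting code points and counting user-visible characters can be sketched with \X. The sample string is an assumption chosen for illustration.

```perl
use strict;
use warnings;

my $text = "A\x{301}bc";   # A + COMBINING ACUTE ACCENT, then b and c

my $code_points = length $text;           # counts code points
my $clusters    = () = $text =~ /\X/g;    # counts extended grapheme clusters

print "$code_points code points, $clusters grapheme clusters\n";
```

Here length() reports 4, while the \X count is 3, because the base letter and its accent form one cluster.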

Life is not quite so transparent, however, when working with legacy encodings, I/O, and certain special cases:

Legacy Encodings

When you combine legacy data and Unicode, the legacy data needs to be upgraded to Unicode. Normally the legacy data is assumed to be ISO 8859-1 (or EBCDIC, if applicable).

The Encode module knows about many encodings and has interfaces for doing conversions between those encodings:

  use Encode 'decode';
  $data = decode("iso-8859-3", $data); # convert legacy bytes to Unicode characters
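A round trip through Encode may make the direction of each call clearer: decode() turns bytes in a named encoding into Perl characters, and encode() turns characters back into bytes. This sketch uses KOI8-R purely as a sample legacy encoding; the variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $koi8_bytes = "\xC1";                       # CYRILLIC SMALL LETTER A in KOI8-R
my $chars      = decode('koi8-r', $koi8_bytes);   # bytes -> characters
my $utf8_bytes = encode('UTF-8', $chars);         # characters -> UTF-8 bytes

printf "U+%04X encodes to %d UTF-8 bytes\n", ord($chars), length $utf8_bytes;
```

The decoded character should be U+0430, which occupies two bytes in UTF-8.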

Unicode I/O

Normally, writing out Unicode data

  print FH $some_string_with_unicode, "\n";

produces raw bytes that Perl happens to use to internally encode the Unicode string. Perl's internal encoding depends on the system as well as what characters happen to be in the string at the time. If any of the characters are at code points 0x100 or above, you will get a warning. To ensure that the output is explicitly rendered in the encoding you desire--and to avoid the warning--open the stream with the desired encoding. Some examples:

  open FH, ">:utf8", "file";
  open FH, ">:encoding(ucs2)", "file";
  open FH, ">:encoding(UTF-8)", "file";
  open FH, ">:encoding(shift_jis)", "file";

and on already open streams, use binmode():

  binmode(STDOUT, ":utf8");
  binmode(STDOUT, ":encoding(ucs2)");
  binmode(STDOUT, ":encoding(UTF-8)");
  binmode(STDOUT, ":encoding(shift_jis)");

The matching of encoding names is loose: case does not matter, and many encodings have several aliases. Note that the :utf8 layer must always be specified exactly like that; it is not subject to the loose matching of encoding names. Also note that currently :utf8 is unsafe for input, because it accepts the data without validating that it is indeed valid UTF-8; you should instead use :encoding(utf-8) (with or without a hyphen).

See PerlIO for the :utf8 layer, PerlIO::encoding and Encode::PerlIO for the :encoding() layer, and Encode::Supported for many encodings supported by the Encode module.

Reading in a file that you know happens to be encoded in one of the Unicode or legacy encodings does not magically turn the data into Unicode in Perl's eyes. To do that, specify the appropriate layer when opening files

  # a file encoded in UTF-8
  open(my $utf8_fh, '<:encoding(utf8)', 'anything');
  my $line_of_unicode = <$utf8_fh>;

  # or a file encoded in Big5
  open(my $big5_fh, '<:encoding(Big5)', 'anything');
  my $another_line_of_unicode = <$big5_fh>;

The I/O layers can also be specified more flexibly with the open pragma. See open, or look at the following example.

  use open ':encoding(utf8)'; # input/output default encoding will be UTF-8
  open X, ">file";
  print X chr(0x100), "\n";
  close X;
  open Y, "<file";
  printf "%#x\n", ord(<Y>); # this should print 0x100
  close Y;

With the open pragma you can use the :locale layer

  BEGIN { $ENV{LC_ALL} = $ENV{LANG} = 'ru_RU.KOI8-R' }
  # the :locale will probe the locale environment variables like LC_ALL
  use open OUT => ':locale'; # russki parusski
  open(O, ">koi8");
  print O chr(0x430); # Unicode CYRILLIC SMALL LETTER A = KOI8-R 0xc1
  close O;
  open(I, "<koi8");
  printf "%#x\n", ord(<I>); # this should print 0xc1
  close I;

These methods install a transparent filter on the I/O stream that converts data from the specified encoding when it is read in from the stream. The result is always Unicode.

The open pragma affects all the open() calls after the pragma by setting default layers. If you want to affect only certain streams, use explicit layers directly in the open() call.

You can switch encodings on an already opened stream by using binmode(); see binmode.

The :locale does not currently work with open() and binmode(), only with the open pragma. The :utf8 and :encoding(...) methods do work with all of open(), binmode(), and the open pragma.

Similarly, you may use these I/O layers on output streams to automatically convert Unicode to the specified encoding when it is written to the stream. For example, the following snippet copies the contents of the file "text.jis" (encoded as ISO-2022-JP, aka JIS) to the file "text.utf8", encoded as UTF-8:

  open(my $nihongo, '<:encoding(iso-2022-jp)', 'text.jis');
  open(my $unicode, '>:utf8', 'text.utf8');
  while (<$nihongo>) { print $unicode $_ }

The naming of encodings, both by open() and by the open pragma, allows for flexible names: koi8-r and KOI8R will both be understood.

Common encodings recognized by ISO, MIME, IANA, and various other standardisation organisations are recognised; for a more detailed list see Encode::Supported.

read() reads characters and returns the number of characters. seek() and tell() operate on byte counts, as do sysread() and sysseek().
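That read() counts characters rather than bytes once a decoding layer is in place can be sketched with an in-memory file. This assumes a PerlIO-enabled Perl (5.8 or later); the byte string and variable names are chosen for illustration.

```perl
use strict;
use warnings;

my $bytes = "\xC4\x80\xC4\x81";   # UTF-8 encoding of U+0100 followed by U+0101
open(my $in, '<:encoding(UTF-8)', \$bytes)
    or die "Cannot open in-memory file: $!";

# Ask for one character; two bytes are consumed to produce it.
my $count = read($in, my $chunk, 1);
close($in);

printf "read %d character: U+%04X\n", $count, ord $chunk;
```

The call returns 1, and the single character read is U+0100.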

Notice that because of the default behaviour of not doing any conversion upon input if there is no default layer, it is easy to mistakenly write code that keeps on expanding a file by repeatedly encoding the data:

  # BAD CODE WARNING
  open F, "file";
  local $/; ## read in the whole file of 8-bit characters
  $t = <F>;
  close F;
  open F, ">:encoding(utf8)", "file";
  print F $t; ## convert to UTF-8 on output
  close F;

If you run this code twice, the contents of the file will be twice UTF-8 encoded. A use open ':encoding(utf8)' would have avoided the bug, as would explicitly opening the file for input as UTF-8 as well.
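The growth caused by encoding twice can be reproduced in memory, without any files. The following sketch uses Encode directly; it is an illustration of the failure mode under the assumption of a Latin-1 native character set, not a replacement for using I/O layers.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text  = "\x{E9}";                    # LATIN SMALL LETTER E WITH ACUTE
my $once  = encode('UTF-8', $text);      # correct: 2 bytes
my $twice = encode('UTF-8', $once);      # bug: the bytes are treated as characters

# Decoding the input first, as an input layer would, avoids the growth.
my $fixed = encode('UTF-8', decode('UTF-8', $once));

printf "once=%d twice=%d fixed=%d bytes\n",
    length $once, length $twice, length $fixed;
```

The doubly encoded string is 4 bytes long, while the properly decoded and re-encoded one stays at 2 bytes.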

NOTE: the :utf8 and :encoding features work only if your Perl has been built with the new PerlIO feature (which is the default on most systems).

Displaying Unicode As Text

Sometimes you might want to display Perl scalars containing Unicode as simple ASCII (or EBCDIC) text. The following subroutine converts its argument so that Unicode characters with code points greater than 255 are displayed as \x{...} , control characters (like \n ) are displayed as \x.. , and the rest of the characters as themselves:

  sub nice_string {
      join("",
        map { $_ > 255 ?                   # if wide character...
               sprintf("\\x{%04X}", $_) :  # \x{...}
               chr($_) =~ /[[:cntrl:]]/ ?  # else if control character...
                sprintf("\\x%02X", $_) :   # \x..
                quotemeta(chr($_))         # else quoted or as themselves
        } unpack("W*", $_[0]));            # unpack Unicode characters
  }

For example,

  nice_string("foo\x{100}bar\n")

returns the string

  'foo\x{0100}bar\x0A'

which is ready to be printed.

Special Cases

  • Bit Complement Operator ~ And vec()

    The bit complement operator ~ may produce surprising results if used on strings containing characters with ordinal values above 255. In such a case, the results are consistent with the internal encoding of the characters, but not with much else. So don't do that. Similarly for vec(): you will be operating on the internally-encoded bit patterns of the Unicode characters, not on the code point values, which is very probably not what you want.

  • Peeking At Perl's Internal Encoding

    Normal users of Perl should never care how Perl encodes any particular Unicode string (because the normal ways to get at the contents of a string with Unicode--via input and output--should always be via explicitly-defined I/O layers). But if you must, there are two ways of looking behind the scenes.

    One way of peeking inside the internal encoding of Unicode characters is to use unpack("C*", ...) to get the bytes of whatever the string encoding happens to be, or unpack("U0..", ...) to get the bytes of the UTF-8 encoding:

    # this prints c4 80 for the UTF-8 bytes 0xc4 0x80
    print join(" ", unpack("U0(H2)*", pack("U", 0x100))), "\n";

    Yet another way would be to use the Devel::Peek module:

    perl -MDevel::Peek -e 'Dump(chr(0x100))'

    That shows the UTF8 flag in FLAGS and both the UTF-8 bytes and Unicode characters in PV . See also later in this document the discussion about the utf8::is_utf8() function.
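As a further sketch of looking behind the scenes, the snippet below shows that the same one-character string can be stored in either internal form without changing its value. It relies on utf8::upgrade(), utf8::is_utf8(), and bytes::length(), and assumes an ASCII (non-EBCDIC) platform; the variable names are illustrative.

```perl
use strict;
use warnings;
use bytes ();   # makes bytes::length() available without enabling byte semantics

my $native = "\x{DF}";      # stored in the native eight-bit form
my $copy   = $native;
utf8::upgrade($copy);       # same character, now stored internally as UTF-8

my $same_text   = ($copy eq $native) ? 1 : 0;
my @flags       = map { utf8::is_utf8($_) ? 1 : 0 } $native, $copy;
my @byte_counts = (bytes::length($native), bytes::length($copy));

print "same=$same_text flags=@flags bytes=@byte_counts\n";
```

The two strings compare equal even though one occupies a single byte and the other two, which is why ordinary code should not depend on the internal representation.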

Advanced Topics

  • String Equivalence

    The question of string equivalence turns somewhat complicated in Unicode: what do you mean by "equal"?

    (Is LATIN CAPITAL LETTER A WITH ACUTE equal to LATIN CAPITAL LETTER A ?)

    The short answer is that by default Perl compares equivalence (eq , ne ) based only on code points of the characters. In the above case, the answer is no (because 0x00C1 != 0x0041). But sometimes, any CAPITAL LETTER A's should be considered equal, or even A's of any case.

    The long answer is that you need to consider character normalization and casing issues: see Unicode::Normalize, Unicode Technical Report #15, Unicode Normalization Forms and sections on case mapping in the Unicode Standard.

    As of Perl 5.8.0, the "Full" case-folding of Case Mappings/SpecialCasing is implemented, but bugs remain in qr//i with them, mostly fixed by 5.14.

  • String Collation

    People like to see their strings nicely sorted--or as Unicode parlance goes, collated. But again, what do you mean by collate?

    (Does LATIN CAPITAL LETTER A WITH ACUTE come before or after LATIN CAPITAL LETTER A WITH GRAVE ?)

    The short answer is that by default, Perl compares strings (lt , le , cmp , ge , gt ) based only on the code points of the characters. In the above case, the answer is "after", since 0x00C1 > 0x00C0 .

    The long answer is that "it depends", and a good answer cannot be given without knowing (at the very least) the language context. See Unicode::Collate, and Unicode Collation Algorithm http://www.unicode.org/unicode/reports/tr10/
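The looser notions of equality mentioned under String Equivalence can be sketched with the core modules Unicode::Normalize and Unicode::Collate. This is one possible approach under the assumption that ignoring accents and case is what the application wants; it is not the only definition of "equal", and the variable names are hypothetical.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);
use Unicode::Collate;

my $a_acute = "\x{C1}";   # LATIN CAPITAL LETTER A WITH ACUTE
my $plain_a = "A";

# Default comparison: by code point, so the two differ.
my $by_code_point = ($a_acute eq $plain_a) ? 1 : 0;

# Decompose, then drop nonspacing marks to compare base letters only.
(my $stripped = NFD($a_acute)) =~ s/\p{Mn}//g;
my $base_equal = ($stripped eq $plain_a) ? 1 : 0;

# A level-1 collator compares primary weights only (no accents, no case).
my $primary     = Unicode::Collate->new(level => 1);
my $loose_equal = $primary->eq($a_acute, "a") ? 1 : 0;

print "code point: $by_code_point, base: $base_equal, loose: $loose_equal\n";
```

For real sorting, the same collator object offers sort() and cmp() methods; which level and locale tailoring to use depends on the language context, as noted above.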

Miscellaneous

  • Character Ranges and Classes

    Character ranges in regular expression bracketed character classes ( e.g., /[a-z]/ ) and in the tr/// (also known as y///) operator are not magically Unicode-aware. What this means is that [A-Za-z] will not magically start to mean "all alphabetic letters" (not that it does mean that even for 8-bit characters; for those, if you are using locales (perllocale), use /[[:alpha:]]/ ; and if not, use the 8-bit-aware property \p{alpha} ).

    All the properties that begin with \p (and its inverse \P ) are actually character classes that are Unicode-aware. There are dozens of them, see perluniprops.

    You can use Unicode code points as the end points of character ranges, and the range will include all Unicode code points that lie between those end points.

  • String-To-Number Conversions

    Unicode does define several other decimal--and numeric--characters besides the familiar 0 to 9, such as the Arabic and Indic digits. Perl does not support string-to-number conversion for digits other than ASCII 0 to 9 (and ASCII a to f for hexadecimal). To get safe conversions from any Unicode string, use num() in Unicode::UCD.
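The difference between Perl's built-in numeric conversion and num() can be sketched as follows. The example assumes a Unicode::UCD recent enough to export num() (shipped with Perl 5.14 and later), and the sample digits are chosen for illustration.

```perl
use strict;
use warnings;
use Unicode::UCD qw(num);

my $arabic_indic = "\x{664}\x{662}";   # ARABIC-INDIC DIGIT FOUR, DIGIT TWO

# Built-in conversion only understands ASCII digits, so this yields 0.
my $naive = do { no warnings 'numeric'; $arabic_indic + 0 };

# num() returns the numeric value, or undef if the string is not a safe number.
my $value = num($arabic_indic);

print "naive=$naive value=$value\n";
```

This should print naive=0 value=42.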

Questions With Answers

  • Will My Old Scripts Break?

    Very probably not. Unless you are generating Unicode characters somehow, old behaviour should be preserved. About the only behaviour that has changed and which could start generating Unicode is the old behaviour of chr() where supplying an argument more than 255 produced a character modulo 255. chr(300), for example, was equal to chr(45) or "-" (in ASCII), now it is LATIN CAPITAL LETTER I WITH BREVE.

  • How Do I Make My Scripts Work With Unicode?

    Very little work should be needed since nothing changes until you generate Unicode data. The most important thing is getting input as Unicode; for that, see the earlier I/O discussion. To get full seamless Unicode support, add use feature 'unicode_strings' (or use 5.012 or higher) to your script.

  • How Do I Know Whether My String Is In Unicode?

    You shouldn't have to care. But you may if your Perl is before 5.14.0 or you haven't specified use feature 'unicode_strings' or use 5.012 (or higher) because otherwise the semantics of the code points in the range 128 to 255 are different depending on whether the string they are contained within is in Unicode or not. (See When Unicode Does Not Happen in perlunicode.)

    To determine if a string is in Unicode, use:

    print utf8::is_utf8($string) ? 1 : 0, "\n";

    But note that this doesn't mean that any of the characters in the string are necessarily UTF-8 encoded, or that any of the characters have code points greater than 0xFF (255) or even 0x80 (128), or that the string has any characters at all. All that is_utf8() does is return the value of the internal "utf8ness" flag attached to the $string . If the flag is off, the bytes in the scalar are interpreted as a single byte encoding. If the flag is on, the bytes in the scalar are interpreted as the (variable-length, potentially multi-byte) UTF-8 encoded code points of the characters. Bytes added to a UTF-8 encoded string are automatically upgraded to UTF-8. If mixed non-UTF-8 and UTF-8 scalars are merged (double-quoted interpolation, explicit concatenation, or printf/sprintf parameter substitution), the result will be UTF-8 encoded as if copies of the byte strings were upgraded to UTF-8: for example,

    $a = "ab\x80c";
    $b = "\x{100}";
    print "$a = $b\n";

    the output string will be UTF-8-encoded ab\x80c = \x{100}\n , but $a will stay byte-encoded.

    Sometimes you might really need to know the byte length of a string instead of the character length. For that use either the Encode::encode_utf8() function or the bytes pragma and the length() function:

    my $unicode = chr(0x100);
    print length($unicode), "\n"; # will print 1
    require Encode;
    print length(Encode::encode_utf8($unicode)),"\n"; # will print 2
    use bytes;
    print length($unicode), "\n"; # will also print 2
                                  # (the 0xC4 0x80 of the UTF-8)
    no bytes;

  • How Do I Find Out What Encoding a File Has?

    You might try Encode::Guess, but it has a number of limitations.

  • How Do I Detect Data That's Not Valid In a Particular Encoding?

    Use the Encode package to try converting it. For example,

    use Encode 'decode_utf8';
    if (eval { decode_utf8($string, Encode::FB_CROAK); 1 }) {
        # $string is valid utf8
    } else {
        # $string is not valid utf8
    }

    Or use unpack to try decoding it:

    use warnings;
    @chars = unpack("C0U*", $string_of_bytes_that_I_think_is_utf8);

    If invalid, a Malformed UTF-8 character warning is produced. The "C0" means "process the string character per character". Without that, the unpack("U*", ...) would work in U0 mode (the default if the format string starts with U ) and it would return the bytes making up the UTF-8 encoding of the target string, something that will always work.

  • How Do I Convert Binary Data Into a Particular Encoding, Or Vice Versa?

    This probably isn't as useful as you might think. Normally, you shouldn't need to.

    In one sense, what you are asking doesn't make much sense: encodings are for characters, and binary data are not "characters", so converting "data" into some encoding isn't meaningful unless you know what character set and encoding the binary data is in, in which case it's not just binary data, now is it?

    If you have a raw sequence of bytes that you know should be interpreted via a particular encoding, you can use Encode :

    use Encode 'from_to';
    from_to($data, "iso-8859-1", "utf-8"); # from latin-1 to utf-8

    The call to from_to() changes the bytes in $data , but nothing material about the nature of the string has changed as far as Perl is concerned. Both before and after the call, the string $data contains just a bunch of 8-bit bytes. As far as Perl is concerned, the encoding of the string remains as "system-native 8-bit bytes".

    You might relate this to a fictional 'Translate' module:

    use Translate;
    my $phrase = "Yes";
    Translate::from_to($phrase, 'english', 'deutsch');
    ## phrase now contains "Ja"

    The contents of the string change, but not the nature of the string. Perl doesn't know any more after the call than before that the contents of the string indicate the affirmative.

    Back to converting data. If you have (or want) data in your system's native 8-bit encoding (e.g. Latin-1, EBCDIC, etc.), you can use pack/unpack to convert to/from Unicode.

    $native_string = pack("W*", unpack("U*", $Unicode_string));
    $Unicode_string = pack("U*", unpack("W*", $native_string));

    If you have a sequence of bytes you know is valid UTF-8, but Perl doesn't know it yet, you can make Perl a believer, too:

    use Encode 'decode_utf8';
    $Unicode = decode_utf8($bytes);

    or:

    $Unicode = pack("U0a*", $bytes);

    You can find the bytes that make up a UTF-8 sequence with

    @bytes = unpack("C*", $Unicode_string)

    and you can create well-formed Unicode with

    $Unicode_string = pack("U*", 0xff, ...)

  • How Do I Display Unicode? How Do I Input Unicode?

    See http://www.alanwood.net/unicode/ and http://www.cl.cam.ac.uk/~mgk25/unicode.html

  • How Does Unicode Work With Traditional Locales?

    Starting in Perl 5.16, you can specify

    use locale ':not_characters';

    to get Perl to work well with traditional locales. The catch is that you have to translate from the locale character set to/from Unicode yourself. See Unicode I/O above for how to

    use open ':locale';

    to accomplish this, but full details are in Unicode and UTF-8 in perllocale, including gotchas that happen if you don't specify :not_characters .
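Returning to the earlier question of detecting data that is not valid in a particular encoding, the check can be wrapped in a small helper. This is a sketch only: is_valid_utf8 is a hypothetical name, not a function provided by Encode, and the sample byte strings are chosen for illustration.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true if the byte string decodes cleanly as UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;   # work on a copy; decode() with a CHECK may modify its input
    return eval { Encode::decode('UTF-8', $bytes, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

my $good = "\xC4\x80";   # UTF-8 encoding of U+0100
my $bad  = "\xFF\xFE";   # these bytes never occur in well-formed UTF-8

printf "good=%d bad=%d\n", is_valid_utf8($good), is_valid_utf8($bad);
```

The helper reports 1 for the well-formed sequence and 0 for the malformed one; the caller's strings are left untouched because the function decodes its own copy.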

Hexadecimal Notation

The Unicode standard prefers using hexadecimal notation because that more clearly shows the division of Unicode into blocks of 256 characters. Hexadecimal is also simply shorter than decimal. You can use decimal notation, too, but learning to use hexadecimal just makes life easier with the Unicode standard. The U+HHHH notation uses hexadecimal, for example.

The 0x prefix means a hexadecimal number, the digits are 0-9 and a-f (or A-F, case doesn't matter). Each hexadecimal digit represents four bits, or half a byte. print 0x..., "\n" will show a hexadecimal number in decimal, and printf "%x\n", $decimal will show a decimal number in hexadecimal. If you have just the "hex digits" of a hexadecimal number, you can use the hex() function.

  print 0x0009, "\n";    # 9
  print 0x000a, "\n";    # 10
  print 0x000f, "\n";    # 15
  print 0x0010, "\n";    # 16
  print 0x0011, "\n";    # 17
  print 0x0100, "\n";    # 256
  print 0x0041, "\n";    # 65
  printf "%x\n", 65;     # 41
  printf "%#x\n", 65;    # 0x41
  print hex("41"), "\n"; # 65

Further Resources

UNICODE IN OLDER PERLS

If you cannot upgrade your Perl to 5.8.0 or later, you can still do some Unicode processing by using the modules Unicode::String , Unicode::Map8 , and Unicode::Map , available from CPAN. If you have the GNU recode installed, you can also use the Perl front-end Convert::Recode for character conversions.

The following are fast conversions from ISO 8859-1 (Latin-1) bytes to UTF-8 bytes and back, the code works even with older Perl 5 versions.

  # ISO 8859-1 to UTF-8
  s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
  # UTF-8 to ISO 8859-1
  s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
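The two substitutions above can be tried on a short byte string. In this sketch they are wrapped in subroutines with made-up names (latin1_to_utf8 and utf8_to_latin1) so that they operate on an argument instead of $_; an ASCII platform is assumed.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the two substitutions shown above.
sub latin1_to_utf8 {
    my ($bytes) = @_;
    $bytes =~ s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
    return $bytes;
}

sub utf8_to_latin1 {
    my ($bytes) = @_;
    $bytes =~ s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
    return $bytes;
}

my $latin1     = "caf\xE9";                  # "cafe" with e-acute, in Latin-1
my $utf8       = latin1_to_utf8($latin1);    # the 0xE9 byte becomes 0xC3 0xA9
my $round_trip = utf8_to_latin1($utf8);

printf "%d bytes -> %d bytes -> %d bytes\n",
    length $latin1, length $utf8, length $round_trip;
```

The string grows from 4 to 5 bytes on the way to UTF-8 and returns to the original 4 bytes on the way back.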

SEE ALSO

perlunitut, perlunicode, Encode, open, utf8, bytes, perlretut, perlrun, Unicode::Collate, Unicode::Normalize, Unicode::UCD

ACKNOWLEDGMENTS

Thanks to the kind readers of the perl5-porters@perl.org, perl-unicode@perl.org, linux-utf8@nl.linux.org, and unicore@unicode.org mailing lists for their valuable feedback.

AUTHOR, COPYRIGHT, AND LICENSE

Copyright 2001-2011 Jarkko Hietaniemi <jhi@iki.fi>

This document may be distributed under the same terms as Perl itself.

 
perluniprops

NAME

perluniprops - Index of Unicode Version 6.2.0 character properties in Perl

DESCRIPTION

This document provides information about the portion of the Unicode database that deals with character properties, that is the portion that is defined on single code points. (Other information in the Unicode data base below briefly mentions other data that Unicode provides.)

Perl can provide access to all non-provisional Unicode character properties, though not all are enabled by default. The omitted ones are the Unihan properties (accessible via the CPAN module Unicode::Unihan) and certain deprecated or Unicode-internal properties. (An installation may choose to recompile Perl's tables to change this. See Unicode character properties that are NOT accepted by Perl.)

For most purposes, access to Unicode properties from the Perl core is through regular expression matches, as described in the next section. For some special purposes, and to access the properties that are not suitable for regular expression matching, all the Unicode character properties that Perl handles are accessible via the standard Unicode::UCD module, as described in the section "Properties accessible through Unicode::UCD".
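As a rough illustration of the second access route, the sketch below looks up one code point with charinfo() from Unicode::UCD. The code point U+03B1 is an arbitrary choice for the example; only the name and category keys of the returned hash are shown.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# U+03B1 is used purely as an illustrative code point.
my $info = charinfo(0x3B1);

print "name:     $info->{name}\n";        # GREEK SMALL LETTER ALPHA
print "category: $info->{category}\n";    # Ll
```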

Perl also provides some additional extensions and short-cut synonyms for Unicode properties.

This document merely lists all available properties and does not attempt to explain what each property really means. There is a brief description of each Perl extension; see Other Properties in perlunicode for more information on these. There is some detail about Blocks, Scripts, General_Category, and Bidi_Class in perlunicode, but to find out about the intricacies of the official Unicode properties, refer to the Unicode standard. A good starting place is http://www.unicode.org/reports/tr44/.

Note that you can define your own properties; see User-Defined Character Properties in perlunicode.

Properties accessible through \p{} and \P{}

The Perl regular expression \p{} and \P{} constructs give access to most of the Unicode character properties. The table below shows all these constructs, both single and compound forms.

Compound forms consist of two components, separated by an equals sign or a colon. The first component is the property name, and the second component is the particular value of the property to match against, for example, \p{Script: Greek} and \p{Script=Greek} both mean to match characters whose Script property is Greek.

Single forms, like \p{Greek} , are mostly Perl-defined shortcuts for their equivalent compound forms. The table shows these equivalences. (In our example, \p{Greek} is just a shortcut for \p{Script=Greek} .) There are also a few Perl-defined single forms that are not shortcuts for a compound form. One such is \p{Word} . These are also listed in the table.

In parsing these constructs, Perl always ignores upper/lower case differences everywhere within the {braces}. Thus \p{Greek} means the same thing as \p{greek} . But note that changing the case of the "p" or "P" before the left brace completely changes the meaning of the construct, from "match" (for \p{} ) to "doesn't match" (for \P{} ). Casing in this document is for improved legibility.
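The compound and single forms described above can be compared side by side. The following is a small sketch; the test character U+03B1 and the hash name %matched are assumptions made for the example.

```perl
use strict;
use warnings;

my $alpha = "\x{3B1}";    # GREEK SMALL LETTER ALPHA, chosen for illustration

my %matched = (
    equals  => ($alpha =~ /\p{Script=Greek}/  ? 1 : 0),   # compound form with '='
    colon   => ($alpha =~ /\p{Script: Greek}/ ? 1 : 0),   # compound form with ':'
    single  => ($alpha =~ /\p{Greek}/         ? 1 : 0),   # single-form shortcut
    lower   => ($alpha =~ /\p{greek}/         ? 1 : 0),   # case inside braces is ignored
    negated => ($alpha =~ /\P{Greek}/         ? 1 : 0),   # capital P inverts the match
);

print "$_: $matched{$_}\n" for sort keys %matched;
```

The first four entries should come out as 1 and the negated entry as 0, since only the case of the leading "p" is significant.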

Also, white space, hyphens, and underscores are normally ignored everywhere between the {braces}, and hence can be freely added or removed even if the /x modifier hasn't been specified on the regular expression. But a 'T' at the beginning of an entry in the table below means that tighter (stricter) rules are used for that entry:

  • Single form (\p{name} ) tighter rules:

    White space, hyphens, and underscores ARE significant except for:

    • white space adjacent to a non-word character
    • underscores separating digits in numbers

    That means, for example, that you can freely add or remove white space adjacent to (but within) the braces without affecting the meaning.

  • Compound form (\p{name=value} or \p{name:value} ) tighter rules:

    The tighter rules given above for the single form apply to everything to the right of the colon or equals; the looser rules still apply to everything to the left.

    That means, for example, that you can freely add or remove white space adjacent to (but within) the braces and the colon or equals sign.
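The default loose rules and the tighter 'T' rules can be tried out as follows. This is a sketch under stated assumptions: the sample characters (U+10300 and "A") and the particular spellings are illustrative, and the Age entries are used only because the table flags them with 'T'.

```perl
use strict;
use warnings;

my $old_italic_a = "\x{10300}";    # OLD ITALIC LETTER A, an illustrative choice

# Under the default (loose) rules these spellings should all name one property.
my @spellings = (
    qr/\p{Old_Italic}/,
    qr/\p{OldItalic}/,
    qr/\p{old-italic}/,
    qr/\p{ Old_Italic }/,
    qr/\p{Script = Old_Italic}/,
);
my $loose_hits = grep { $old_italic_a =~ $_ } @spellings;

# Age entries carry the 'T' flag; white space next to the colon is still allowed.
my $age_numeric = ("A" =~ /\p{Age: 1.1}/) ? 1 : 0;
my $age_named   = ("A" =~ /\p{Age=V1_1}/) ? 1 : 0;

print "loose spellings matched: $loose_hits of ", scalar(@spellings), "\n";
print "Age forms matched: $age_numeric $age_named\n";
```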

Some properties are considered obsolete by Unicode, but are still available. There are several varieties of obsolescence:

  • Stabilized

    A property may be stabilized. Such a determination does not indicate that the property should or should not be used; instead it is a declaration that the property will not be maintained nor extended for newly encoded characters. Such properties are marked with an 'S' in the table.

  • Deprecated

    A property may be deprecated, perhaps because its original intent has been replaced by another property, or because its specification was somehow defective. This means that its use is strongly discouraged, so much so that a warning will be issued if it is used, unless the regular expression is in the scope of a no warnings 'deprecated' statement. A 'D' flags each such entry in the table, and the entry there for the longest, most descriptive version of the property will give the reason it is deprecated, and perhaps advice. Perl may issue such a warning, even for properties that aren't officially deprecated by Unicode, when there used to be characters or code points that were matched by them but no longer are. This is to warn you that your program may not work like it did on earlier Unicode releases.

    Deprecated properties may be made unavailable in a future Perl version, so it is best to move away from them.

    A deprecated property may also be stabilized, but this fact is not shown.

  • Obsolete

    Properties marked with an 'O' in the table are considered (plain) obsolete. Generally this designation is given to properties that Unicode once used for internal purposes (but not any longer).

Some Perl extensions are present for backwards compatibility; their use is discouraged, but they are not obsolete. An 'X' flags each such entry in the table. Future Unicode versions may force some of these extensions to be removed without warning, replaced by another property with the same name that means something different. Use the equivalent shown instead.

Matches in the Block property have shortcuts that begin with "In_". For example, \p{Block=Latin1} can be written as \p{In_Latin1} . For backward compatibility, if there is no conflict with another shortcut, these may also be written as \p{Latin1} or \p{Is_Latin1} . But, N.B., there are numerous such conflicting shortcuts. Use of these forms for Block is discouraged, and they are flagged as such, not only because of the potential confusion as to what is meant, but also because a later release of Unicode may preempt the shortcut, and your program would no longer be correct. Use the "In_" form instead to avoid this, or even more clearly, use the compound form, e.g., \p{blk:latin1} . See Blocks in perlunicode for more information about this.
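The difference between a Block match and the same-named Script shortcut can be seen with two code points. This is a minimal sketch; the characters U+03E2 and U+1F00 were picked for illustration because one lies in the Greek_And_Coptic block without being a Greek-script character, and the other is the reverse.

```perl
use strict;
use warnings;

my $shei  = "\x{3E2}";     # COPTIC CAPITAL LETTER SHEI, in block Greek_And_Coptic
my $psili = "\x{1F00}";    # GREEK SMALL LETTER ALPHA WITH PSILI, in block Greek_Extended

my %result = (
    shei_in_block   => ($shei  =~ /\p{In_Greek}/   ? 1 : 0),   # "In_" shortcut for the block
    shei_blk_form   => ($shei  =~ /\p{Blk: Greek}/ ? 1 : 0),   # explicit compound form
    shei_is_script  => ($shei  =~ /\p{Greek}/      ? 1 : 0),   # script shortcut, not the block
    psili_in_block  => ($psili =~ /\p{In_Greek}/   ? 1 : 0),
    psili_is_script => ($psili =~ /\p{Greek}/      ? 1 : 0),
);

print "$_: $result{$_}\n" for sort keys %result;
```

Here \p{Greek} names the script, so it accepts the Greek Extended letter and rejects the Coptic one, while the block forms do the opposite.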

The table below has two columns. The left column contains the \p{} constructs to look up, possibly preceded by the flags mentioned above; and the right column contains information about them, like a description, or synonyms. It shows both the single and compound forms for each property that has them. If the left column is a short name for a property, the right column will give its longer, more descriptive name; and if the left column is the longest name, the right column will show any equivalent shortest name, in both single and compound forms if applicable.

The right column will also caution you if a property means something different than what might normally be expected.

All single forms are Perl extensions; a few compound forms are as well, and are noted as such.

Numbers in (parentheses) indicate the total number of code points matched by the property. For emphasis, those properties that match no code points at all are listed as well in a separate section following the table.
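The counts in parentheses can be spot-checked for small properties. The sketch below counts the matches of \p{AHex}; scanning only 0x00..0x7F is enough here because ASCII hex digits are by definition all ASCII. The variable name $count is an assumption of the example.

```perl
use strict;
use warnings;

# Count the code points in the ASCII range that match \p{AHex}.
my $count = grep { chr($_) =~ /\p{AHex}/ } 0 .. 0x7F;

print "AHex matches $count code points\n";    # 22: 0-9, A-F, a-f
```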

Most properties match the same code points regardless of whether "/i" case-insensitive matching is specified or not. But a few properties are affected. These are shown with the notation

  (/i= other_property)

in the second column. Under case-insensitive matching they match the same code points as the property "other_property".
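One property commonly listed with such a notation is \p{Lu}, whose table entry is assumed here to read (/i= General_Category=Cased_Letter). Under that assumption, a lowercase letter should fail to match without /i and match with it, as this sketch tries to show:

```perl
use strict;
use warnings;

# Without /i, \p{Lu} is restricted to uppercase letters.
my $without_i = ("a" =~ /\p{Lu}/)  ? 1 : 0;

# With /i, the property is assumed to widen to all cased letters.
my $with_i    = ("a" =~ /\p{Lu}/i) ? 1 : 0;

print "without /i: $without_i, with /i: $with_i\n";
```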

There is no description given for most non-Perl defined properties (See http://www.unicode.org/reports/tr44/ for that).

For compactness, '*' is used as a wildcard instead of showing all possible combinations. For example, entries like:

  \p{Gc: *}        \p{General_Category: *}

mean that 'Gc' is a synonym for 'General_Category', and anything that is valid for the latter is also valid for the former. Similarly,

  \p{Is_*}         \p{*}

means that if and only if, for example, \p{Foo} exists, then \p{Is_Foo} and \p{IsFoo} are also valid and all mean the same thing. And similarly, \p{Foo=Bar} means the same as \p{Is_Foo=Bar} and \p{IsFoo=Bar} . "*" here is restricted to something not beginning with an underscore.

Also, in binary properties, 'Yes', 'T', and 'True' are all synonyms for 'Y'. And 'No', 'F', and 'False' are all synonyms for 'N'. The table shows 'Y*' and 'N*' to indicate this, and doesn't have separate entries for the other possibilities. Note that not all properties which have values 'Yes' and 'No' are binary; the non-binary ones have all their values spelled out without using this wild card, and have a NOT clause in their description that highlights their not being binary. These also require the compound form to match them, whereas true binary properties have both single and compound forms available.
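The wildcard conventions above can be exercised with a single test character. This is a sketch only; the character "A", the array names, and the particular selection of spellings are assumptions made for illustration.

```perl
use strict;
use warnings;

my $char = "A";    # alphabetic and an uppercase letter

# Spellings that should all mean "Alphabetic is true".
my @yes_forms = (
    qr/\p{Alpha}/, qr/\p{Alpha=Y}/, qr/\p{Alphabetic=Yes}/,
    qr/\p{Alpha=T}/, qr/\p{Alphabetic: True}/,
    qr/\p{Is_Alpha}/, qr/\p{IsAlpha}/,
);

# Spellings that should all mean "Alphabetic is false".
my @no_forms = (
    qr/\p{Alpha=N}/, qr/\p{Alphabetic=No}/,
    qr/\p{Alpha=F}/, qr/\p{Alphabetic: False}/, qr/\P{Alpha}/,
);

# 'Gc' as a synonym for 'General_Category', with short and long values.
my @gc_forms = (
    qr/\p{Gc: Lu}/, qr/\p{General_Category: Lu}/, qr/\p{Gc=Uppercase_Letter}/,
);

my $yes_hits = grep { $char =~ $_ } @yes_forms;
my $no_hits  = grep { $char =~ $_ } @no_forms;
my $gc_hits  = grep { $char =~ $_ } @gc_forms;

print "yes forms: $yes_hits, no forms: $no_hits, gc forms: $gc_hits\n";
```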

Note that all non-essential underscores are removed in the display of the short names below.

Legend summary:

  • * is a wild-card
  • (\d+) in the info column gives the number of code points matched by this property.
  • D means this is deprecated.
  • O means this is obsolete.
  • S means this is stabilized.
  • T means tighter (stricter) name matching applies.
  • X means use of this form is discouraged, and may not be stable.
  1. NAME INFO
  2. X \p{Aegean_Numbers} \p{Block=Aegean_Numbers} (64)
  3. T \p{Age: 1.1} \p{Age=V1_1} (33_979)
  4. T \p{Age: 2.0} \p{Age=V2_0} (144_521)
  5. T \p{Age: 2.1} \p{Age=V2_1} (2)
  6. T \p{Age: 3.0} \p{Age=V3_0} (10_307)
  7. T \p{Age: 3.1} \p{Age=V3_1} (44_978)
  8. T \p{Age: 3.2} \p{Age=V3_2} (1016)
  9. T \p{Age: 4.0} \p{Age=V4_0} (1226)
  10. T \p{Age: 4.1} \p{Age=V4_1} (1273)
  11. T \p{Age: 5.0} \p{Age=V5_0} (1369)
  12. T \p{Age: 5.1} \p{Age=V5_1} (1624)
  13. T \p{Age: 5.2} \p{Age=V5_2} (6648)
  14. T \p{Age: 6.0} \p{Age=V6_0} (2088)
  15. T \p{Age: 6.1} \p{Age=V6_1} (732)
  16. T \p{Age: 6.2} \p{Age=V6_2} (1)
  17. \p{Age: NA} \p{Age=Unassigned} (864_348)
  18. \p{Age: Unassigned} Code point's usage has not been assigned
  19. in any Unicode release thus far. (Short:
  20. \p{Age=NA}) (864_348)
  21. \p{Age: V1_1} Code point's usage introduced in version
  22. 1.1 (33_979)
  23. \p{Age: V2_0} Code point's usage was introduced in
  24. version 2.0; See also Property
  25. 'Present_In' (144_521)
  26. \p{Age: V2_1} Code point's usage was introduced in
  27. version 2.1; See also Property
  28. 'Present_In' (2)
  29. \p{Age: V3_0} Code point's usage was introduced in
  30. version 3.0; See also Property
  31. 'Present_In' (10_307)
  32. \p{Age: V3_1} Code point's usage was introduced in
  33. version 3.1; See also Property
  34. 'Present_In' (44_978)
  35. \p{Age: V3_2} Code point's usage was introduced in
  36. version 3.2; See also Property
  37. 'Present_In' (1016)
  38. \p{Age: V4_0} Code point's usage was introduced in
  39. version 4.0; See also Property
  40. 'Present_In' (1226)
  41. \p{Age: V4_1} Code point's usage was introduced in
  42. version 4.1; See also Property
  43. 'Present_In' (1273)
  44. \p{Age: V5_0} Code point's usage was introduced in
  45. version 5.0; See also Property
  46. 'Present_In' (1369)
  47. \p{Age: V5_1} Code point's usage was introduced in
  48. version 5.1; See also Property
  49. 'Present_In' (1624)
  50. \p{Age: V5_2} Code point's usage was introduced in
  51. version 5.2; See also Property
  52. 'Present_In' (6648)
  53. \p{Age: V6_0} Code point's usage was introduced in
  54. version 6.0; See also Property
  55. 'Present_In' (2088)
  56. \p{Age: V6_1} Code point's usage was introduced in
  57. version 6.1; See also Property
  58. 'Present_In' (732)
  59. \p{Age: V6_2} Code point's usage was introduced in
  60. version 6.2; See also Property
  61. 'Present_In' (1)
  62. \p{AHex} \p{PosixXDigit} (= \p{ASCII_Hex_Digit=Y})
  63. (22)
  64. \p{AHex: *} \p{ASCII_Hex_Digit: *}
  65. X \p{Alchemical} \p{Alchemical_Symbols} (= \p{Block=
  66. Alchemical_Symbols}) (128)
  67. X \p{Alchemical_Symbols} \p{Block=Alchemical_Symbols} (Short:
  68. \p{InAlchemical}) (128)
  69. \p{All} \p{Any} (1_114_112)
  70. \p{Alnum} Alphabetic and (decimal) Numeric (102_619)
  71. \p{Alpha} \p{Alphabetic=Y} (102_159)
  72. \p{Alpha: *} \p{Alphabetic: *}
  73. \p{Alphabetic} \p{Alpha} (= \p{Alphabetic=Y}) (102_159)
  74. \p{Alphabetic: N*} (Short: \p{Alpha=N}, \P{Alpha}) (1_011_953)
  75. \p{Alphabetic: Y*} (Short: \p{Alpha=Y}, \p{Alpha}) (102_159)
  76. X \p{Alphabetic_PF} \p{Alphabetic_Presentation_Forms} (=
  77. \p{Block=Alphabetic_Presentation_Forms})
  78. (80)
  79. X \p{Alphabetic_Presentation_Forms} \p{Block=
  80. Alphabetic_Presentation_Forms} (Short:
  81. \p{InAlphabeticPF}) (80)
  82. X \p{Ancient_Greek_Music} \p{Ancient_Greek_Musical_Notation} (=
  83. \p{Block=
  84. Ancient_Greek_Musical_Notation}) (80)
  85. X \p{Ancient_Greek_Musical_Notation} \p{Block=
  86. Ancient_Greek_Musical_Notation} (Short:
  87. \p{InAncientGreekMusic}) (80)
  88. X \p{Ancient_Greek_Numbers} \p{Block=Ancient_Greek_Numbers} (80)
  89. X \p{Ancient_Symbols} \p{Block=Ancient_Symbols} (64)
  90. \p{Any} [\x{0000}-\x{10FFFF}] (1_114_112)
  91. \p{Arab} \p{Arabic} (= \p{Script=Arabic}) (NOT
  92. \p{Block=Arabic}) (1235)
  93. \p{Arabic} \p{Script=Arabic} (Short: \p{Arab}; NOT
  94. \p{Block=Arabic}) (1235)
  95. X \p{Arabic_Ext_A} \p{Arabic_Extended_A} (= \p{Block=
  96. Arabic_Extended_A}) (96)
  97. X \p{Arabic_Extended_A} \p{Block=Arabic_Extended_A} (Short:
  98. \p{InArabicExtA}) (96)
  99. X \p{Arabic_Math} \p{Arabic_Mathematical_Alphabetic_Symbols}
  100. (= \p{Block=
  101. Arabic_Mathematical_Alphabetic_Symbols})
  102. (256)
  103. X \p{Arabic_Mathematical_Alphabetic_Symbols} \p{Block=
  104. Arabic_Mathematical_Alphabetic_Symbols}
  105. (Short: \p{InArabicMath}) (256)
  106. X \p{Arabic_PF_A} \p{Arabic_Presentation_Forms_A} (=
  107. \p{Block=Arabic_Presentation_Forms_A})
  108. (688)
  109. X \p{Arabic_PF_B} \p{Arabic_Presentation_Forms_B} (=
  110. \p{Block=Arabic_Presentation_Forms_B})
  111. (144)
  112. X \p{Arabic_Presentation_Forms_A} \p{Block=
  113. Arabic_Presentation_Forms_A} (Short:
  114. \p{InArabicPFA}) (688)
  115. X \p{Arabic_Presentation_Forms_B} \p{Block=
  116. Arabic_Presentation_Forms_B} (Short:
  117. \p{InArabicPFB}) (144)
  118. X \p{Arabic_Sup} \p{Arabic_Supplement} (= \p{Block=
  119. Arabic_Supplement}) (48)
  120. X \p{Arabic_Supplement} \p{Block=Arabic_Supplement} (Short:
  121. \p{InArabicSup}) (48)
  122. \p{Armenian} \p{Script=Armenian} (Short: \p{Armn}; NOT
  123. \p{Block=Armenian}) (91)
  124. \p{Armi} \p{Imperial_Aramaic} (= \p{Script=
  125. Imperial_Aramaic}) (NOT \p{Block=
  126. Imperial_Aramaic}) (31)
  127. \p{Armn} \p{Armenian} (= \p{Script=Armenian}) (NOT
  128. \p{Block=Armenian}) (91)
  129. X \p{Arrows} \p{Block=Arrows} (112)
  130. \p{ASCII} \p{Block=Basic_Latin} [[:ASCII:]] (128)
  131. \p{ASCII_Hex_Digit} \p{PosixXDigit} (= \p{ASCII_Hex_Digit=Y})
  132. (22)
  133. \p{ASCII_Hex_Digit: N*} (Short: \p{AHex=N}, \P{AHex}) (1_114_090)
  134. \p{ASCII_Hex_Digit: Y*} (Short: \p{AHex=Y}, \p{AHex}) (22)
  135. \p{Assigned} All assigned code points (249_698)
  136. \p{Avestan} \p{Script=Avestan} (Short: \p{Avst}; NOT
  137. \p{Block=Avestan}) (61)
  138. \p{Avst} \p{Avestan} (= \p{Script=Avestan}) (NOT
  139. \p{Block=Avestan}) (61)
  140. \p{Bali} \p{Balinese} (= \p{Script=Balinese}) (NOT
  141. \p{Block=Balinese}) (121)
  142. \p{Balinese} \p{Script=Balinese} (Short: \p{Bali}; NOT
  143. \p{Block=Balinese}) (121)
  144. \p{Bamu} \p{Bamum} (= \p{Script=Bamum}) (NOT
  145. \p{Block=Bamum}) (657)
  146. \p{Bamum} \p{Script=Bamum} (Short: \p{Bamu}; NOT
  147. \p{Block=Bamum}) (657)
  148. X \p{Bamum_Sup} \p{Bamum_Supplement} (= \p{Block=
  149. Bamum_Supplement}) (576)
  150. X \p{Bamum_Supplement} \p{Block=Bamum_Supplement} (Short:
  151. \p{InBamumSup}) (576)
  152. X \p{Basic_Latin} \p{ASCII} (= \p{Block=Basic_Latin}) (128)
  153. \p{Batak} \p{Script=Batak} (Short: \p{Batk}; NOT
  154. \p{Block=Batak}) (56)
  155. \p{Batk} \p{Batak} (= \p{Script=Batak}) (NOT
  156. \p{Block=Batak}) (56)
  157. \p{Bc: *} \p{Bidi_Class: *}
  158. \p{Beng} \p{Bengali} (= \p{Script=Bengali}) (NOT
  159. \p{Block=Bengali}) (92)
  160. \p{Bengali} \p{Script=Bengali} (Short: \p{Beng}; NOT
  161. \p{Block=Bengali}) (92)
  162. \p{Bidi_C} \p{Bidi_Control} (= \p{Bidi_Control=Y}) (7)
  163. \p{Bidi_C: *} \p{Bidi_Control: *}
  164. \p{Bidi_Class: AL} \p{Bidi_Class=Arabic_Letter} (1438)
  165. \p{Bidi_Class: AN} \p{Bidi_Class=Arabic_Number} (49)
  166. \p{Bidi_Class: Arabic_Letter} (Short: \p{Bc=AL}) (1438)
  167. \p{Bidi_Class: Arabic_Number} (Short: \p{Bc=AN}) (49)
  168. \p{Bidi_Class: B} \p{Bidi_Class=Paragraph_Separator} (7)
  169. \p{Bidi_Class: BN} \p{Bidi_Class=Boundary_Neutral} (4015)
  170. \p{Bidi_Class: Boundary_Neutral} (Short: \p{Bc=BN}) (4015)
  171. \p{Bidi_Class: Common_Separator} (Short: \p{Bc=CS}) (15)
  172. \p{Bidi_Class: CS} \p{Bidi_Class=Common_Separator} (15)
  173. \p{Bidi_Class: EN} \p{Bidi_Class=European_Number} (131)
  174. \p{Bidi_Class: ES} \p{Bidi_Class=European_Separator} (12)
  175. \p{Bidi_Class: ET} \p{Bidi_Class=European_Terminator} (66)
  176. \p{Bidi_Class: European_Number} (Short: \p{Bc=EN}) (131)
  177. \p{Bidi_Class: European_Separator} (Short: \p{Bc=ES}) (12)
  178. \p{Bidi_Class: European_Terminator} (Short: \p{Bc=ET}) (66)
  179. \p{Bidi_Class: L} \p{Bidi_Class=Left_To_Right} (1_098_530)
  180. \p{Bidi_Class: Left_To_Right} (Short: \p{Bc=L}) (1_098_530)
  181. \p{Bidi_Class: Left_To_Right_Embedding} (Short: \p{Bc=LRE}) (1)
  182. \p{Bidi_Class: Left_To_Right_Override} (Short: \p{Bc=LRO}) (1)
  183. \p{Bidi_Class: LRE} \p{Bidi_Class=Left_To_Right_Embedding} (1)
  184. \p{Bidi_Class: LRO} \p{Bidi_Class=Left_To_Right_Override} (1)
  185. \p{Bidi_Class: Nonspacing_Mark} (Short: \p{Bc=NSM}) (1290)
  186. \p{Bidi_Class: NSM} \p{Bidi_Class=Nonspacing_Mark} (1290)
  187. \p{Bidi_Class: ON} \p{Bidi_Class=Other_Neutral} (4447)
  188. \p{Bidi_Class: Other_Neutral} (Short: \p{Bc=ON}) (4447)
  189. \p{Bidi_Class: Paragraph_Separator} (Short: \p{Bc=B}) (7)
  190. \p{Bidi_Class: PDF} \p{Bidi_Class=Pop_Directional_Format} (1)
  191. \p{Bidi_Class: Pop_Directional_Format} (Short: \p{Bc=PDF}) (1)
  192. \p{Bidi_Class: R} \p{Bidi_Class=Right_To_Left} (4086)
  193. \p{Bidi_Class: Right_To_Left} (Short: \p{Bc=R}) (4086)
  194. \p{Bidi_Class: Right_To_Left_Embedding} (Short: \p{Bc=RLE}) (1)
  195. \p{Bidi_Class: Right_To_Left_Override} (Short: \p{Bc=RLO}) (1)
  196. \p{Bidi_Class: RLE} \p{Bidi_Class=Right_To_Left_Embedding} (1)
  197. \p{Bidi_Class: RLO} \p{Bidi_Class=Right_To_Left_Override} (1)
  198. \p{Bidi_Class: S} \p{Bidi_Class=Segment_Separator} (3)
  199. \p{Bidi_Class: Segment_Separator} (Short: \p{Bc=S}) (3)
  200. \p{Bidi_Class: White_Space} (Short: \p{Bc=WS}) (18)
  201. \p{Bidi_Class: WS} \p{Bidi_Class=White_Space} (18)
  202. \p{Bidi_Control} \p{Bidi_Control=Y} (Short: \p{BidiC}) (7)
  203. \p{Bidi_Control: N*} (Short: \p{BidiC=N}, \P{BidiC}) (1_114_105)
  204. \p{Bidi_Control: Y*} (Short: \p{BidiC=Y}, \p{BidiC}) (7)
  205. \p{Bidi_M} \p{Bidi_Mirrored} (= \p{Bidi_Mirrored=Y})
  206. (545)
  207. \p{Bidi_M: *} \p{Bidi_Mirrored: *}
  208. \p{Bidi_Mirrored} \p{Bidi_Mirrored=Y} (Short: \p{BidiM})
  209. (545)
  210. \p{Bidi_Mirrored: N*} (Short: \p{BidiM=N}, \P{BidiM}) (1_113_567)
  211. \p{Bidi_Mirrored: Y*} (Short: \p{BidiM=Y}, \p{BidiM}) (545)
  212. \p{Blank} \h, Horizontal white space (19)
  213. \p{Blk: *} \p{Block: *}
  214. \p{Block: Aegean_Numbers} (Single: \p{InAegeanNumbers}) (64)
  215. \p{Block: Alchemical} \p{Block=Alchemical_Symbols} (128)
  216. \p{Block: Alchemical_Symbols} (Short: \p{Blk=Alchemical},
  217. \p{InAlchemical}) (128)
  218. \p{Block: Alphabetic_PF} \p{Block=Alphabetic_Presentation_Forms}
  219. (80)
  220. \p{Block: Alphabetic_Presentation_Forms} (Short: \p{Blk=
  221. AlphabeticPF}, \p{InAlphabeticPF}) (80)
  222. \p{Block: Ancient_Greek_Music} \p{Block=
  223. Ancient_Greek_Musical_Notation} (80)
  224. \p{Block: Ancient_Greek_Musical_Notation} (Short: \p{Blk=
  225. AncientGreekMusic},
  226. \p{InAncientGreekMusic}) (80)
  227. \p{Block: Ancient_Greek_Numbers} (Single:
  228. \p{InAncientGreekNumbers}) (80)
  229. \p{Block: Ancient_Symbols} (Single: \p{InAncientSymbols}) (64)
  230. \p{Block: Arabic} (Single: \p{InArabic}; NOT \p{Arabic} NOR
  231. \p{Is_Arabic}) (256)
  232. \p{Block: Arabic_Ext_A} \p{Block=Arabic_Extended_A} (96)
  233. \p{Block: Arabic_Extended_A} (Short: \p{Blk=ArabicExtA},
  234. \p{InArabicExtA}) (96)
  235. \p{Block: Arabic_Math} \p{Block=
  236. Arabic_Mathematical_Alphabetic_Symbols}
  237. (256)
  238. \p{Block: Arabic_Mathematical_Alphabetic_Symbols} (Short: \p{Blk=
  239. ArabicMath}, \p{InArabicMath}) (256)
  240. \p{Block: Arabic_PF_A} \p{Block=Arabic_Presentation_Forms_A} (688)
  241. \p{Block: Arabic_PF_B} \p{Block=Arabic_Presentation_Forms_B} (144)
  242. \p{Block: Arabic_Presentation_Forms_A} (Short: \p{Blk=ArabicPFA},
  243. \p{InArabicPFA}) (688)
  244. \p{Block: Arabic_Presentation_Forms_B} (Short: \p{Blk=ArabicPFB},
  245. \p{InArabicPFB}) (144)
  246. \p{Block: Arabic_Sup} \p{Block=Arabic_Supplement} (48)
  247. \p{Block: Arabic_Supplement} (Short: \p{Blk=ArabicSup},
  248. \p{InArabicSup}) (48)
  249. \p{Block: Armenian} (Single: \p{InArmenian}; NOT \p{Armenian}
  250. NOR \p{Is_Armenian}) (96)
  251. \p{Block: Arrows} (Single: \p{InArrows}) (112)
  252. \p{Block: ASCII} \p{Block=Basic_Latin} (128)
  253. \p{Block: Avestan} (Single: \p{InAvestan}; NOT \p{Avestan}
  254. NOR \p{Is_Avestan}) (64)
  255. \p{Block: Balinese} (Single: \p{InBalinese}; NOT \p{Balinese}
  256. NOR \p{Is_Balinese}) (128)
  257. \p{Block: Bamum} (Single: \p{InBamum}; NOT \p{Bamum} NOR
  258. \p{Is_Bamum}) (96)
  259. \p{Block: Bamum_Sup} \p{Block=Bamum_Supplement} (576)
  260. \p{Block: Bamum_Supplement} (Short: \p{Blk=BamumSup},
  261. \p{InBamumSup}) (576)
  262. \p{Block: Basic_Latin} (Short: \p{Blk=ASCII}, \p{ASCII}) (128)
  263. \p{Block: Batak} (Single: \p{InBatak}; NOT \p{Batak} NOR
  264. \p{Is_Batak}) (64)
  265. \p{Block: Bengali} (Single: \p{InBengali}; NOT \p{Bengali}
  266. NOR \p{Is_Bengali}) (128)
  267. \p{Block: Block_Elements} (Single: \p{InBlockElements}) (32)
  268. \p{Block: Bopomofo} (Single: \p{InBopomofo}; NOT \p{Bopomofo}
  269. NOR \p{Is_Bopomofo}) (48)
  270. \p{Block: Bopomofo_Ext} \p{Block=Bopomofo_Extended} (32)
  271. \p{Block: Bopomofo_Extended} (Short: \p{Blk=BopomofoExt},
  272. \p{InBopomofoExt}) (32)
  273. \p{Block: Box_Drawing} (Single: \p{InBoxDrawing}) (128)
  274. \p{Block: Brahmi} (Single: \p{InBrahmi}; NOT \p{Brahmi} NOR
  275. \p{Is_Brahmi}) (128)
  276. \p{Block: Braille} \p{Block=Braille_Patterns} (256)
  277. \p{Block: Braille_Patterns} (Short: \p{Blk=Braille},
  278. \p{InBraille}) (256)
  279. \p{Block: Buginese} (Single: \p{InBuginese}; NOT \p{Buginese}
  280. NOR \p{Is_Buginese}) (32)
  281. \p{Block: Buhid} (Single: \p{InBuhid}; NOT \p{Buhid} NOR
  282. \p{Is_Buhid}) (32)
  283. \p{Block: Byzantine_Music} \p{Block=Byzantine_Musical_Symbols}
  284. (256)
  285. \p{Block: Byzantine_Musical_Symbols} (Short: \p{Blk=
  286. ByzantineMusic}, \p{InByzantineMusic})
  287. (256)
  288. \p{Block: Canadian_Syllabics} \p{Block=
  289. Unified_Canadian_Aboriginal_Syllabics}
  290. (640)
  291. \p{Block: Carian} (Single: \p{InCarian}; NOT \p{Carian} NOR
  292. \p{Is_Carian}) (64)
  293. \p{Block: Chakma} (Single: \p{InChakma}; NOT \p{Chakma} NOR
  294. \p{Is_Chakma}) (80)
  295. \p{Block: Cham} (Single: \p{InCham}; NOT \p{Cham} NOR
  296. \p{Is_Cham}) (96)
  297. \p{Block: Cherokee} (Single: \p{InCherokee}; NOT \p{Cherokee}
  298. NOR \p{Is_Cherokee}) (96)
  299. \p{Block: CJK} \p{Block=CJK_Unified_Ideographs} (20_992)
  300. \p{Block: CJK_Compat} \p{Block=CJK_Compatibility} (256)
  301. \p{Block: CJK_Compat_Forms} \p{Block=CJK_Compatibility_Forms} (32)
  302. \p{Block: CJK_Compat_Ideographs} \p{Block=
  303. CJK_Compatibility_Ideographs} (512)
  304. \p{Block: CJK_Compat_Ideographs_Sup} \p{Block=
  305. CJK_Compatibility_Ideographs_Supplement}
  306. (544)
  307. \p{Block: CJK_Compatibility} (Short: \p{Blk=CJKCompat},
  308. \p{InCJKCompat}) (256)
  309. \p{Block: CJK_Compatibility_Forms} (Short: \p{Blk=CJKCompatForms},
  310. \p{InCJKCompatForms}) (32)
  311. \p{Block: CJK_Compatibility_Ideographs} (Short: \p{Blk=
  312. CJKCompatIdeographs},
  313. \p{InCJKCompatIdeographs}) (512)
  314. \p{Block: CJK_Compatibility_Ideographs_Supplement} (Short: \p{Blk=
  315. CJKCompatIdeographsSup},
  316. \p{InCJKCompatIdeographsSup}) (544)
  317. \p{Block: CJK_Ext_A} \p{Block=
  318. CJK_Unified_Ideographs_Extension_A}
  319. (6592)
  320. \p{Block: CJK_Ext_B} \p{Block=
  321. CJK_Unified_Ideographs_Extension_B}
  322. (42_720)
  323. \p{Block: CJK_Ext_C} \p{Block=
  324. CJK_Unified_Ideographs_Extension_C}
  325. (4160)
  326. \p{Block: CJK_Ext_D} \p{Block=
  327. CJK_Unified_Ideographs_Extension_D} (224)
  328. \p{Block: CJK_Radicals_Sup} \p{Block=CJK_Radicals_Supplement} (128)
  329. \p{Block: CJK_Radicals_Supplement} (Short: \p{Blk=CJKRadicalsSup},
  330. \p{InCJKRadicalsSup}) (128)
  331. \p{Block: CJK_Strokes} (Single: \p{InCJKStrokes}) (48)
  332. \p{Block: CJK_Symbols} \p{Block=CJK_Symbols_And_Punctuation} (64)
  333. \p{Block: CJK_Symbols_And_Punctuation} (Short: \p{Blk=CJKSymbols},
  334. \p{InCJKSymbols}) (64)
  335. \p{Block: CJK_Unified_Ideographs} (Short: \p{Blk=CJK}, \p{InCJK})
  336. (20_992)
  337. \p{Block: CJK_Unified_Ideographs_Extension_A} (Short: \p{Blk=
  338. CJKExtA}, \p{InCJKExtA}) (6592)
  339. \p{Block: CJK_Unified_Ideographs_Extension_B} (Short: \p{Blk=
  340. CJKExtB}, \p{InCJKExtB}) (42_720)
  341. \p{Block: CJK_Unified_Ideographs_Extension_C} (Short: \p{Blk=
  342. CJKExtC}, \p{InCJKExtC}) (4160)
  343. \p{Block: CJK_Unified_Ideographs_Extension_D} (Short: \p{Blk=
  344. CJKExtD}, \p{InCJKExtD}) (224)
  345. \p{Block: Combining_Diacritical_Marks} (Short: \p{Blk=
  346. Diacriticals}, \p{InDiacriticals}) (112)
  347. \p{Block: Combining_Diacritical_Marks_For_Symbols} (Short: \p{Blk=
  348. DiacriticalsForSymbols},
  349. \p{InDiacriticalsForSymbols}) (48)
  350. \p{Block: Combining_Diacritical_Marks_Supplement} (Short: \p{Blk=
  351. DiacriticalsSup}, \p{InDiacriticalsSup})
  352. (64)
  353. \p{Block: Combining_Half_Marks} (Short: \p{Blk=HalfMarks},
  354. \p{InHalfMarks}) (16)
  355. \p{Block: Combining_Marks_For_Symbols} \p{Block=
  356. Combining_Diacritical_Marks_For_Symbols}
  357. (48)
  358. \p{Block: Common_Indic_Number_Forms} (Short: \p{Blk=
  359. IndicNumberForms},
  360. \p{InIndicNumberForms}) (16)
  361. \p{Block: Compat_Jamo} \p{Block=Hangul_Compatibility_Jamo} (96)
  362. \p{Block: Control_Pictures} (Single: \p{InControlPictures}) (64)
  363. \p{Block: Coptic} (Single: \p{InCoptic}; NOT \p{Coptic} NOR
  364. \p{Is_Coptic}) (128)
  365. \p{Block: Counting_Rod} \p{Block=Counting_Rod_Numerals} (32)
  366. \p{Block: Counting_Rod_Numerals} (Short: \p{Blk=CountingRod},
  367. \p{InCountingRod}) (32)
  368. \p{Block: Cuneiform} (Single: \p{InCuneiform}; NOT
  369. \p{Cuneiform} NOR \p{Is_Cuneiform})
  370. (1024)
  371. \p{Block: Cuneiform_Numbers} \p{Block=
  372. Cuneiform_Numbers_And_Punctuation} (128)
  373. \p{Block: Cuneiform_Numbers_And_Punctuation} (Short: \p{Blk=
  374. CuneiformNumbers},
  375. \p{InCuneiformNumbers}) (128)
  376. \p{Block: Currency_Symbols} (Single: \p{InCurrencySymbols}) (48)
  377. \p{Block: Cypriot_Syllabary} (Single: \p{InCypriotSyllabary}) (64)
  378. \p{Block: Cyrillic} (Single: \p{InCyrillic}; NOT \p{Cyrillic}
  379. NOR \p{Is_Cyrillic}) (256)
  380. \p{Block: Cyrillic_Ext_A} \p{Block=Cyrillic_Extended_A} (32)
  381. \p{Block: Cyrillic_Ext_B} \p{Block=Cyrillic_Extended_B} (96)
  382. \p{Block: Cyrillic_Extended_A} (Short: \p{Blk=CyrillicExtA},
  383. \p{InCyrillicExtA}) (32)
  384. \p{Block: Cyrillic_Extended_B} (Short: \p{Blk=CyrillicExtB},
  385. \p{InCyrillicExtB}) (96)
  386. \p{Block: Cyrillic_Sup} \p{Block=Cyrillic_Supplement} (48)
  387. \p{Block: Cyrillic_Supplement} (Short: \p{Blk=CyrillicSup},
  388. \p{InCyrillicSup}) (48)
  389. \p{Block: Cyrillic_Supplementary} \p{Block=Cyrillic_Supplement}
  390. (48)
  391. \p{Block: Deseret} (Single: \p{InDeseret}) (80)
  392. \p{Block: Devanagari} (Single: \p{InDevanagari}; NOT
  393. \p{Devanagari} NOR \p{Is_Devanagari})
  394. (128)
  395. \p{Block: Devanagari_Ext} \p{Block=Devanagari_Extended} (32)
  396. \p{Block: Devanagari_Extended} (Short: \p{Blk=DevanagariExt},
  397. \p{InDevanagariExt}) (32)
  398. \p{Block: Diacriticals} \p{Block=Combining_Diacritical_Marks} (112)
  399. \p{Block: Diacriticals_For_Symbols} \p{Block=
  400. Combining_Diacritical_Marks_For_Symbols}
  401. (48)
  402. \p{Block: Diacriticals_Sup} \p{Block=
  403. Combining_Diacritical_Marks_Supplement}
  404. (64)
  405. \p{Block: Dingbats} (Single: \p{InDingbats}) (192)
  406. \p{Block: Domino} \p{Block=Domino_Tiles} (112)
  407. \p{Block: Domino_Tiles} (Short: \p{Blk=Domino}, \p{InDomino}) (112)
  408. \p{Block: Egyptian_Hieroglyphs} (Single:
  409. \p{InEgyptianHieroglyphs}; NOT
  410. \p{Egyptian_Hieroglyphs} NOR
  411. \p{Is_Egyptian_Hieroglyphs}) (1072)
  412. \p{Block: Emoticons} (Single: \p{InEmoticons}) (80)
  413. \p{Block: Enclosed_Alphanum} \p{Block=Enclosed_Alphanumerics} (160)
  414. \p{Block: Enclosed_Alphanum_Sup} \p{Block=
  415. Enclosed_Alphanumeric_Supplement} (256)
  416. \p{Block: Enclosed_Alphanumeric_Supplement} (Short: \p{Blk=
  417. EnclosedAlphanumSup},
  418. \p{InEnclosedAlphanumSup}) (256)
  419. \p{Block: Enclosed_Alphanumerics} (Short: \p{Blk=
  420. EnclosedAlphanum},
  421. \p{InEnclosedAlphanum}) (160)
  422. \p{Block: Enclosed_CJK} \p{Block=Enclosed_CJK_Letters_And_Months}
  423. (256)
  424. \p{Block: Enclosed_CJK_Letters_And_Months} (Short: \p{Blk=
  425. EnclosedCJK}, \p{InEnclosedCJK}) (256)
  426. \p{Block: Enclosed_Ideographic_Sup} \p{Block=
  427. Enclosed_Ideographic_Supplement} (256)
  428. \p{Block: Enclosed_Ideographic_Supplement} (Short: \p{Blk=
  429. EnclosedIdeographicSup},
  430. \p{InEnclosedIdeographicSup}) (256)
  431. \p{Block: Ethiopic} (Single: \p{InEthiopic}; NOT \p{Ethiopic}
  432. NOR \p{Is_Ethiopic}) (384)
  433. \p{Block: Ethiopic_Ext} \p{Block=Ethiopic_Extended} (96)
  434. \p{Block: Ethiopic_Ext_A} \p{Block=Ethiopic_Extended_A} (48)
  435. \p{Block: Ethiopic_Extended} (Short: \p{Blk=EthiopicExt},
  436. \p{InEthiopicExt}) (96)
  437. \p{Block: Ethiopic_Extended_A} (Short: \p{Blk=EthiopicExtA},
  438. \p{InEthiopicExtA}) (48)
  439. \p{Block: Ethiopic_Sup} \p{Block=Ethiopic_Supplement} (32)
  440. \p{Block: Ethiopic_Supplement} (Short: \p{Blk=EthiopicSup},
  441. \p{InEthiopicSup}) (32)
  442. \p{Block: General_Punctuation} (Short: \p{Blk=Punctuation},
  443. \p{InPunctuation}; NOT \p{Punct} NOR
  444. \p{Is_Punctuation}) (112)
  445. \p{Block: Geometric_Shapes} (Single: \p{InGeometricShapes}) (96)
  446. \p{Block: Georgian} (Single: \p{InGeorgian}; NOT \p{Georgian}
  447. NOR \p{Is_Georgian}) (96)
  448. \p{Block: Georgian_Sup} \p{Block=Georgian_Supplement} (48)
  449. \p{Block: Georgian_Supplement} (Short: \p{Blk=GeorgianSup},
  450. \p{InGeorgianSup}) (48)
  451. \p{Block: Glagolitic} (Single: \p{InGlagolitic}; NOT
  452. \p{Glagolitic} NOR \p{Is_Glagolitic})
  453. (96)
  454. \p{Block: Gothic} (Single: \p{InGothic}; NOT \p{Gothic} NOR
  455. \p{Is_Gothic}) (32)
  456. \p{Block: Greek} \p{Block=Greek_And_Coptic} (NOT \p{Greek}
  457. NOR \p{Is_Greek}) (144)
  458. \p{Block: Greek_And_Coptic} (Short: \p{Blk=Greek}, \p{InGreek};
  459. NOT \p{Greek} NOR \p{Is_Greek}) (144)
  460. \p{Block: Greek_Ext} \p{Block=Greek_Extended} (256)
  461. \p{Block: Greek_Extended} (Short: \p{Blk=GreekExt},
  462. \p{InGreekExt}) (256)
  463. \p{Block: Gujarati} (Single: \p{InGujarati}; NOT \p{Gujarati}
  464. NOR \p{Is_Gujarati}) (128)
  465. \p{Block: Gurmukhi} (Single: \p{InGurmukhi}; NOT \p{Gurmukhi}
  466. NOR \p{Is_Gurmukhi}) (128)
  467. \p{Block: Half_And_Full_Forms} \p{Block=
  468. Halfwidth_And_Fullwidth_Forms} (240)
  469. \p{Block: Half_Marks} \p{Block=Combining_Half_Marks} (16)
  470. \p{Block: Halfwidth_And_Fullwidth_Forms} (Short: \p{Blk=
  471. HalfAndFullForms},
  472. \p{InHalfAndFullForms}) (240)
  473. \p{Block: Hangul} \p{Block=Hangul_Syllables} (NOT \p{Hangul}
  474. NOR \p{Is_Hangul}) (11_184)
  475. \p{Block: Hangul_Compatibility_Jamo} (Short: \p{Blk=CompatJamo},
  476. \p{InCompatJamo}) (96)
  477. \p{Block: Hangul_Jamo} (Short: \p{Blk=Jamo}, \p{InJamo}) (256)
  478. \p{Block: Hangul_Jamo_Extended_A} (Short: \p{Blk=JamoExtA},
  479. \p{InJamoExtA}) (32)
  480. \p{Block: Hangul_Jamo_Extended_B} (Short: \p{Blk=JamoExtB},
  481. \p{InJamoExtB}) (80)
  482. \p{Block: Hangul_Syllables} (Short: \p{Blk=Hangul}, \p{InHangul};
  483. NOT \p{Hangul} NOR \p{Is_Hangul})
  484. (11_184)
  485. \p{Block: Hanunoo} (Single: \p{InHanunoo}; NOT \p{Hanunoo}
  486. NOR \p{Is_Hanunoo}) (32)
  487. \p{Block: Hebrew} (Single: \p{InHebrew}; NOT \p{Hebrew} NOR
  488. \p{Is_Hebrew}) (112)
  489. \p{Block: High_Private_Use_Surrogates} (Short: \p{Blk=
  490. HighPUSurrogates},
  491. \p{InHighPUSurrogates}) (128)
  492. \p{Block: High_PU_Surrogates} \p{Block=
  493. High_Private_Use_Surrogates} (128)
  494. \p{Block: High_Surrogates} (Single: \p{InHighSurrogates}) (896)
  495. \p{Block: Hiragana} (Single: \p{InHiragana}; NOT \p{Hiragana}
  496. NOR \p{Is_Hiragana}) (96)
  497. \p{Block: IDC} \p{Block=
  498. Ideographic_Description_Characters} (NOT
  499. \p{ID_Continue} NOR \p{Is_IDC}) (16)
  500. \p{Block: Ideographic_Description_Characters} (Short: \p{Blk=IDC},
  501. \p{InIDC}; NOT \p{ID_Continue} NOR
  502. \p{Is_IDC}) (16)
  503. \p{Block: Imperial_Aramaic} (Single: \p{InImperialAramaic}; NOT
  504. \p{Imperial_Aramaic} NOR
  505. \p{Is_Imperial_Aramaic}) (32)
  506. \p{Block: Indic_Number_Forms} \p{Block=Common_Indic_Number_Forms}
  507. (16)
  508. \p{Block: Inscriptional_Pahlavi} (Single:
  509. \p{InInscriptionalPahlavi}; NOT
  510. \p{Inscriptional_Pahlavi} NOR
  511. \p{Is_Inscriptional_Pahlavi}) (32)
  512. \p{Block: Inscriptional_Parthian} (Single:
  513. \p{InInscriptionalParthian}; NOT
  514. \p{Inscriptional_Parthian} NOR
  515. \p{Is_Inscriptional_Parthian}) (32)
  516. \p{Block: IPA_Ext} \p{Block=IPA_Extensions} (96)
  517. \p{Block: IPA_Extensions} (Short: \p{Blk=IPAExt}, \p{InIPAExt})
  518. (96)
  519. \p{Block: Jamo} \p{Block=Hangul_Jamo} (256)
  520. \p{Block: Jamo_Ext_A} \p{Block=Hangul_Jamo_Extended_A} (32)
  521. \p{Block: Jamo_Ext_B} \p{Block=Hangul_Jamo_Extended_B} (80)
  522. \p{Block: Javanese} (Single: \p{InJavanese}; NOT \p{Javanese}
  523. NOR \p{Is_Javanese}) (96)
  524. \p{Block: Kaithi} (Single: \p{InKaithi}; NOT \p{Kaithi} NOR
  525. \p{Is_Kaithi}) (80)
  526. \p{Block: Kana_Sup} \p{Block=Kana_Supplement} (256)
  527. \p{Block: Kana_Supplement} (Short: \p{Blk=KanaSup}, \p{InKanaSup})
  528. (256)
  529. \p{Block: Kanbun} (Single: \p{InKanbun}) (16)
  530. \p{Block: Kangxi} \p{Block=Kangxi_Radicals} (224)
  531. \p{Block: Kangxi_Radicals} (Short: \p{Blk=Kangxi}, \p{InKangxi})
  532. (224)
  533. \p{Block: Kannada} (Single: \p{InKannada}; NOT \p{Kannada}
  534. NOR \p{Is_Kannada}) (128)
  535. \p{Block: Katakana} (Single: \p{InKatakana}; NOT \p{Katakana}
  536. NOR \p{Is_Katakana}) (96)
  537. \p{Block: Katakana_Ext} \p{Block=Katakana_Phonetic_Extensions} (16)
  538. \p{Block: Katakana_Phonetic_Extensions} (Short: \p{Blk=
  539. KatakanaExt}, \p{InKatakanaExt}) (16)
  540. \p{Block: Kayah_Li} (Single: \p{InKayahLi}) (48)
  541. \p{Block: Kharoshthi} (Single: \p{InKharoshthi}; NOT
  542. \p{Kharoshthi} NOR \p{Is_Kharoshthi})
  543. (96)
  544. \p{Block: Khmer} (Single: \p{InKhmer}; NOT \p{Khmer} NOR
  545. \p{Is_Khmer}) (128)
  546. \p{Block: Khmer_Symbols} (Single: \p{InKhmerSymbols}) (32)
  547. \p{Block: Lao} (Single: \p{InLao}; NOT \p{Lao} NOR
  548. \p{Is_Lao}) (128)
  549. \p{Block: Latin_1} \p{Block=Latin_1_Supplement} (128)
  550. \p{Block: Latin_1_Sup} \p{Block=Latin_1_Supplement} (128)
  551. \p{Block: Latin_1_Supplement} (Short: \p{Blk=Latin1},
  552. \p{InLatin1}) (128)
  553. \p{Block: Latin_Ext_A} \p{Block=Latin_Extended_A} (128)
  554. \p{Block: Latin_Ext_Additional} \p{Block=
  555. Latin_Extended_Additional} (256)
  556. \p{Block: Latin_Ext_B} \p{Block=Latin_Extended_B} (208)
  557. \p{Block: Latin_Ext_C} \p{Block=Latin_Extended_C} (32)
  558. \p{Block: Latin_Ext_D} \p{Block=Latin_Extended_D} (224)
  559. \p{Block: Latin_Extended_A} (Short: \p{Blk=LatinExtA},
  560. \p{InLatinExtA}) (128)
  561. \p{Block: Latin_Extended_Additional} (Short: \p{Blk=
  562. LatinExtAdditional},
  563. \p{InLatinExtAdditional}) (256)
  564. \p{Block: Latin_Extended_B} (Short: \p{Blk=LatinExtB},
  565. \p{InLatinExtB}) (208)
  566. \p{Block: Latin_Extended_C} (Short: \p{Blk=LatinExtC},
  567. \p{InLatinExtC}) (32)
  568. \p{Block: Latin_Extended_D} (Short: \p{Blk=LatinExtD},
  569. \p{InLatinExtD}) (224)
  570. \p{Block: Lepcha} (Single: \p{InLepcha}; NOT \p{Lepcha} NOR
  571. \p{Is_Lepcha}) (80)
  572. \p{Block: Letterlike_Symbols} (Single: \p{InLetterlikeSymbols})
  573. (80)
  574. \p{Block: Limbu} (Single: \p{InLimbu}; NOT \p{Limbu} NOR
  575. \p{Is_Limbu}) (80)
  576. \p{Block: Linear_B_Ideograms} (Single: \p{InLinearBIdeograms})
  577. (128)
  578. \p{Block: Linear_B_Syllabary} (Single: \p{InLinearBSyllabary})
  579. (128)
  580. \p{Block: Lisu} (Single: \p{InLisu}) (48)
  581. \p{Block: Low_Surrogates} (Single: \p{InLowSurrogates}) (1024)
  582. \p{Block: Lycian} (Single: \p{InLycian}; NOT \p{Lycian} NOR
  583. \p{Is_Lycian}) (32)
  584. \p{Block: Lydian} (Single: \p{InLydian}; NOT \p{Lydian} NOR
  585. \p{Is_Lydian}) (32)
  586. \p{Block: Mahjong} \p{Block=Mahjong_Tiles} (48)
  587. \p{Block: Mahjong_Tiles} (Short: \p{Blk=Mahjong}, \p{InMahjong})
  588. (48)
  589. \p{Block: Malayalam} (Single: \p{InMalayalam}; NOT
  590. \p{Malayalam} NOR \p{Is_Malayalam}) (128)
  591. \p{Block: Mandaic} (Single: \p{InMandaic}; NOT \p{Mandaic}
  592. NOR \p{Is_Mandaic}) (32)
  593. \p{Block: Math_Alphanum} \p{Block=
  594. Mathematical_Alphanumeric_Symbols} (1024)
  595. \p{Block: Math_Operators} \p{Block=Mathematical_Operators} (256)
  596. \p{Block: Mathematical_Alphanumeric_Symbols} (Short: \p{Blk=
  597. MathAlphanum}, \p{InMathAlphanum}) (1024)
  598. \p{Block: Mathematical_Operators} (Short: \p{Blk=MathOperators},
  599. \p{InMathOperators}) (256)
  600. \p{Block: Meetei_Mayek} (Single: \p{InMeeteiMayek}; NOT
  601. \p{Meetei_Mayek} NOR
  602. \p{Is_Meetei_Mayek}) (64)
  603. \p{Block: Meetei_Mayek_Ext} \p{Block=Meetei_Mayek_Extensions} (32)
  604. \p{Block: Meetei_Mayek_Extensions} (Short: \p{Blk=MeeteiMayekExt},
  605. \p{InMeeteiMayekExt}) (32)
  606. \p{Block: Meroitic_Cursive} (Single: \p{InMeroiticCursive}; NOT
  607. \p{Meroitic_Cursive} NOR
  608. \p{Is_Meroitic_Cursive}) (96)
  609. \p{Block: Meroitic_Hieroglyphs} (Single:
  610. \p{InMeroiticHieroglyphs}) (32)
  611. \p{Block: Miao} (Single: \p{InMiao}; NOT \p{Miao} NOR
  612. \p{Is_Miao}) (160)
  613. \p{Block: Misc_Arrows} \p{Block=Miscellaneous_Symbols_And_Arrows}
  614. (256)
  615. \p{Block: Misc_Math_Symbols_A} \p{Block=
  616. Miscellaneous_Mathematical_Symbols_A}
  617. (48)
  618. \p{Block: Misc_Math_Symbols_B} \p{Block=
  619. Miscellaneous_Mathematical_Symbols_B}
  620. (128)
  621. \p{Block: Misc_Pictographs} \p{Block=
  622. Miscellaneous_Symbols_And_Pictographs}
  623. (768)
  624. \p{Block: Misc_Symbols} \p{Block=Miscellaneous_Symbols} (256)
  625. \p{Block: Misc_Technical} \p{Block=Miscellaneous_Technical} (256)
  626. \p{Block: Miscellaneous_Mathematical_Symbols_A} (Short: \p{Blk=
  627. MiscMathSymbolsA},
  628. \p{InMiscMathSymbolsA}) (48)
  629. \p{Block: Miscellaneous_Mathematical_Symbols_B} (Short: \p{Blk=
  630. MiscMathSymbolsB},
  631. \p{InMiscMathSymbolsB}) (128)
  632. \p{Block: Miscellaneous_Symbols} (Short: \p{Blk=MiscSymbols},
  633. \p{InMiscSymbols}) (256)
  634. \p{Block: Miscellaneous_Symbols_And_Arrows} (Short: \p{Blk=
  635. MiscArrows}, \p{InMiscArrows}) (256)
  636. \p{Block: Miscellaneous_Symbols_And_Pictographs} (Short: \p{Blk=
  637. MiscPictographs}, \p{InMiscPictographs})
  638. (768)
  639. \p{Block: Miscellaneous_Technical} (Short: \p{Blk=MiscTechnical},
  640. \p{InMiscTechnical}) (256)
  641. \p{Block: Modifier_Letters} \p{Block=Spacing_Modifier_Letters} (80)
  642. \p{Block: Modifier_Tone_Letters} (Single:
  643. \p{InModifierToneLetters}) (32)
  644. \p{Block: Mongolian} (Single: \p{InMongolian}; NOT
  645. \p{Mongolian} NOR \p{Is_Mongolian}) (176)
  646. \p{Block: Music} \p{Block=Musical_Symbols} (256)
  647. \p{Block: Musical_Symbols} (Short: \p{Blk=Music}, \p{InMusic})
  648. (256)
  649. \p{Block: Myanmar} (Single: \p{InMyanmar}; NOT \p{Myanmar}
  650. NOR \p{Is_Myanmar}) (160)
  651. \p{Block: Myanmar_Ext_A} \p{Block=Myanmar_Extended_A} (32)
  652. \p{Block: Myanmar_Extended_A} (Short: \p{Blk=MyanmarExtA},
  653. \p{InMyanmarExtA}) (32)
  654. \p{Block: NB} \p{Block=No_Block} (860_672)
  655. \p{Block: New_Tai_Lue} (Single: \p{InNewTaiLue}; NOT
  656. \p{New_Tai_Lue} NOR \p{Is_New_Tai_Lue})
  657. (96)
  658. \p{Block: NKo} (Single: \p{InNKo}; NOT \p{Nko} NOR
  659. \p{Is_NKo}) (64)
  660. \p{Block: No_Block} (Short: \p{Blk=NB}, \p{InNB}) (860_672)
  661. \p{Block: Number_Forms} (Single: \p{InNumberForms}) (64)
  662. \p{Block: OCR} \p{Block=Optical_Character_Recognition}
  663. (32)
  664. \p{Block: Ogham} (Single: \p{InOgham}; NOT \p{Ogham} NOR
  665. \p{Is_Ogham}) (32)
  666. \p{Block: Ol_Chiki} (Single: \p{InOlChiki}) (48)
  667. \p{Block: Old_Italic} (Single: \p{InOldItalic}; NOT
  668. \p{Old_Italic} NOR \p{Is_Old_Italic})
  669. (48)
  670. \p{Block: Old_Persian} (Single: \p{InOldPersian}; NOT
  671. \p{Old_Persian} NOR \p{Is_Old_Persian})
  672. (64)
  673. \p{Block: Old_South_Arabian} (Single: \p{InOldSouthArabian}) (32)
  674. \p{Block: Old_Turkic} (Single: \p{InOldTurkic}; NOT
  675. \p{Old_Turkic} NOR \p{Is_Old_Turkic})
  676. (80)
  677. \p{Block: Optical_Character_Recognition} (Short: \p{Blk=OCR},
  678. \p{InOCR}) (32)
  679. \p{Block: Oriya} (Single: \p{InOriya}; NOT \p{Oriya} NOR
  680. \p{Is_Oriya}) (128)
  681. \p{Block: Osmanya} (Single: \p{InOsmanya}; NOT \p{Osmanya}
  682. NOR \p{Is_Osmanya}) (48)
  683. \p{Block: Phags_Pa} (Single: \p{InPhagsPa}; NOT \p{Phags_Pa}
  684. NOR \p{Is_Phags_Pa}) (64)
  685. \p{Block: Phaistos} \p{Block=Phaistos_Disc} (48)
  686. \p{Block: Phaistos_Disc} (Short: \p{Blk=Phaistos}, \p{InPhaistos})
  687. (48)
  688. \p{Block: Phoenician} (Single: \p{InPhoenician}; NOT
  689. \p{Phoenician} NOR \p{Is_Phoenician})
  690. (32)
  691. \p{Block: Phonetic_Ext} \p{Block=Phonetic_Extensions} (128)
  692. \p{Block: Phonetic_Ext_Sup} \p{Block=
  693. Phonetic_Extensions_Supplement} (64)
  694. \p{Block: Phonetic_Extensions} (Short: \p{Blk=PhoneticExt},
  695. \p{InPhoneticExt}) (128)
  696. \p{Block: Phonetic_Extensions_Supplement} (Short: \p{Blk=
  697. PhoneticExtSup}, \p{InPhoneticExtSup})
  698. (64)
  699. \p{Block: Playing_Cards} (Single: \p{InPlayingCards}) (96)
  700. \p{Block: Private_Use} \p{Block=Private_Use_Area} (NOT
  701. \p{Private_Use} NOR \p{Is_Private_Use})
  702. (6400)
  703. \p{Block: Private_Use_Area} (Short: \p{Blk=PUA}, \p{InPUA}; NOT
  704. \p{Private_Use} NOR \p{Is_Private_Use})
  705. (6400)
  706. \p{Block: PUA} \p{Block=Private_Use_Area} (NOT
  707. \p{Private_Use} NOR \p{Is_Private_Use})
  708. (6400)
  709. \p{Block: Punctuation} \p{Block=General_Punctuation} (NOT
  710. \p{Punct} NOR \p{Is_Punctuation}) (112)
  711. \p{Block: Rejang} (Single: \p{InRejang}; NOT \p{Rejang} NOR
  712. \p{Is_Rejang}) (48)
  713. \p{Block: Rumi} \p{Block=Rumi_Numeral_Symbols} (32)
  714. \p{Block: Rumi_Numeral_Symbols} (Short: \p{Blk=Rumi}, \p{InRumi})
  715. (32)
  716. \p{Block: Runic} (Single: \p{InRunic}; NOT \p{Runic} NOR
  717. \p{Is_Runic}) (96)
  718. \p{Block: Samaritan} (Single: \p{InSamaritan}; NOT
  719. \p{Samaritan} NOR \p{Is_Samaritan}) (64)
  720. \p{Block: Saurashtra} (Single: \p{InSaurashtra}; NOT
  721. \p{Saurashtra} NOR \p{Is_Saurashtra})
  722. (96)
  723. \p{Block: Sharada} (Single: \p{InSharada}; NOT \p{Sharada}
  724. NOR \p{Is_Sharada}) (96)
  725. \p{Block: Shavian} (Single: \p{InShavian}) (48)
  726. \p{Block: Sinhala} (Single: \p{InSinhala}; NOT \p{Sinhala}
  727. NOR \p{Is_Sinhala}) (128)
  728. \p{Block: Small_Form_Variants} (Short: \p{Blk=SmallForms},
  729. \p{InSmallForms}) (32)
  730. \p{Block: Small_Forms} \p{Block=Small_Form_Variants} (32)
  731. \p{Block: Sora_Sompeng} (Single: \p{InSoraSompeng}; NOT
  732. \p{Sora_Sompeng} NOR
  733. \p{Is_Sora_Sompeng}) (48)
  734. \p{Block: Spacing_Modifier_Letters} (Short: \p{Blk=
  735. ModifierLetters}, \p{InModifierLetters})
  736. (80)
  737. \p{Block: Specials} (Single: \p{InSpecials}) (16)
  738. \p{Block: Sundanese} (Single: \p{InSundanese}; NOT
  739. \p{Sundanese} NOR \p{Is_Sundanese}) (64)
  740. \p{Block: Sundanese_Sup} \p{Block=Sundanese_Supplement} (16)
  741. \p{Block: Sundanese_Supplement} (Short: \p{Blk=SundaneseSup},
  742. \p{InSundaneseSup}) (16)
  743. \p{Block: Sup_Arrows_A} \p{Block=Supplemental_Arrows_A} (16)
  744. \p{Block: Sup_Arrows_B} \p{Block=Supplemental_Arrows_B} (128)
  745. \p{Block: Sup_Math_Operators} \p{Block=
  746. Supplemental_Mathematical_Operators}
  747. (256)
  748. \p{Block: Sup_PUA_A} \p{Block=Supplementary_Private_Use_Area_A}
  749. (65_536)
  750. \p{Block: Sup_PUA_B} \p{Block=Supplementary_Private_Use_Area_B}
  751. (65_536)
  752. \p{Block: Sup_Punctuation} \p{Block=Supplemental_Punctuation} (128)
  753. \p{Block: Super_And_Sub} \p{Block=Superscripts_And_Subscripts} (48)
  754. \p{Block: Superscripts_And_Subscripts} (Short: \p{Blk=
  755. SuperAndSub}, \p{InSuperAndSub}) (48)
  756. \p{Block: Supplemental_Arrows_A} (Short: \p{Blk=SupArrowsA},
  757. \p{InSupArrowsA}) (16)
  758. \p{Block: Supplemental_Arrows_B} (Short: \p{Blk=SupArrowsB},
  759. \p{InSupArrowsB}) (128)
  760. \p{Block: Supplemental_Mathematical_Operators} (Short: \p{Blk=
  761. SupMathOperators},
  762. \p{InSupMathOperators}) (256)
  763. \p{Block: Supplemental_Punctuation} (Short: \p{Blk=
  764. SupPunctuation}, \p{InSupPunctuation})
  765. (128)
  766. \p{Block: Supplementary_Private_Use_Area_A} (Short: \p{Blk=
  767. SupPUAA}, \p{InSupPUAA}) (65_536)
  768. \p{Block: Supplementary_Private_Use_Area_B} (Short: \p{Blk=
  769. SupPUAB}, \p{InSupPUAB}) (65_536)
  770. \p{Block: Syloti_Nagri} (Single: \p{InSylotiNagri}; NOT
  771. \p{Syloti_Nagri} NOR
  772. \p{Is_Syloti_Nagri}) (48)
  773. \p{Block: Syriac} (Single: \p{InSyriac}; NOT \p{Syriac} NOR
  774. \p{Is_Syriac}) (80)
  775. \p{Block: Tagalog} (Single: \p{InTagalog}; NOT \p{Tagalog}
  776. NOR \p{Is_Tagalog}) (32)
  777. \p{Block: Tagbanwa} (Single: \p{InTagbanwa}; NOT \p{Tagbanwa}
  778. NOR \p{Is_Tagbanwa}) (32)
  779. \p{Block: Tags} (Single: \p{InTags}) (128)
  780. \p{Block: Tai_Le} (Single: \p{InTaiLe}; NOT \p{Tai_Le} NOR
  781. \p{Is_Tai_Le}) (48)
  782. \p{Block: Tai_Tham} (Single: \p{InTaiTham}; NOT \p{Tai_Tham}
  783. NOR \p{Is_Tai_Tham}) (144)
  784. \p{Block: Tai_Viet} (Single: \p{InTaiViet}; NOT \p{Tai_Viet}
  785. NOR \p{Is_Tai_Viet}) (96)
  786. \p{Block: Tai_Xuan_Jing} \p{Block=Tai_Xuan_Jing_Symbols} (96)
  787. \p{Block: Tai_Xuan_Jing_Symbols} (Short: \p{Blk=TaiXuanJing},
  788. \p{InTaiXuanJing}) (96)
  789. \p{Block: Takri} (Single: \p{InTakri}; NOT \p{Takri} NOR
  790. \p{Is_Takri}) (80)
  791. \p{Block: Tamil} (Single: \p{InTamil}; NOT \p{Tamil} NOR
  792. \p{Is_Tamil}) (128)
  793. \p{Block: Telugu} (Single: \p{InTelugu}; NOT \p{Telugu} NOR
  794. \p{Is_Telugu}) (128)
  795. \p{Block: Thaana} (Single: \p{InThaana}; NOT \p{Thaana} NOR
  796. \p{Is_Thaana}) (64)
  797. \p{Block: Thai} (Single: \p{InThai}; NOT \p{Thai} NOR
  798. \p{Is_Thai}) (128)
  799. \p{Block: Tibetan} (Single: \p{InTibetan}; NOT \p{Tibetan}
  800. NOR \p{Is_Tibetan}) (256)
  801. \p{Block: Tifinagh} (Single: \p{InTifinagh}; NOT \p{Tifinagh}
  802. NOR \p{Is_Tifinagh}) (80)
  803. \p{Block: Transport_And_Map} \p{Block=Transport_And_Map_Symbols}
  804. (128)
  805. \p{Block: Transport_And_Map_Symbols} (Short: \p{Blk=
  806. TransportAndMap}, \p{InTransportAndMap})
  807. (128)
  808. \p{Block: UCAS} \p{Block=
  809. Unified_Canadian_Aboriginal_Syllabics}
  810. (640)
  811. \p{Block: UCAS_Ext} \p{Block=
  812. Unified_Canadian_Aboriginal_Syllabics_Extended}
  813. (80)
  814. \p{Block: Ugaritic} (Single: \p{InUgaritic}; NOT \p{Ugaritic}
  815. NOR \p{Is_Ugaritic}) (32)
  816. \p{Block: Unified_Canadian_Aboriginal_Syllabics} (Short: \p{Blk=
  817. UCAS}, \p{InUCAS}) (640)
  818. \p{Block: Unified_Canadian_Aboriginal_Syllabics_Extended} (Short:
  819. \p{Blk=UCASExt}, \p{InUCASExt}) (80)
  820. \p{Block: Vai} (Single: \p{InVai}; NOT \p{Vai} NOR
  821. \p{Is_Vai}) (320)
  822. \p{Block: Variation_Selectors} (Short: \p{Blk=VS}, \p{InVS}; NOT
  823. \p{Variation_Selector} NOR \p{Is_VS})
  824. (16)
  825. \p{Block: Variation_Selectors_Supplement} (Short: \p{Blk=VSSup},
  826. \p{InVSSup}) (240)
  827. \p{Block: Vedic_Ext} \p{Block=Vedic_Extensions} (48)
  828. \p{Block: Vedic_Extensions} (Short: \p{Blk=VedicExt},
  829. \p{InVedicExt}) (48)
  830. \p{Block: Vertical_Forms} (Single: \p{InVerticalForms}) (16)
  831. \p{Block: VS} \p{Block=Variation_Selectors} (NOT
  832. \p{Variation_Selector} NOR \p{Is_VS})
  833. (16)
  834. \p{Block: VS_Sup} \p{Block=Variation_Selectors_Supplement}
  835. (240)
  836. \p{Block: Yi_Radicals} (Single: \p{InYiRadicals}) (64)
  837. \p{Block: Yi_Syllables} (Single: \p{InYiSyllables}) (1168)
  838. \p{Block: Yijing} \p{Block=Yijing_Hexagram_Symbols} (64)
  839. \p{Block: Yijing_Hexagram_Symbols} (Short: \p{Blk=Yijing},
  840. \p{InYijing}) (64)
  841. X \p{Block_Elements} \p{Block=Block_Elements} (32)
  842. \p{Bopo} \p{Bopomofo} (= \p{Script=Bopomofo}) (NOT
  843. \p{Block=Bopomofo}) (70)
  844. \p{Bopomofo} \p{Script=Bopomofo} (Short: \p{Bopo}; NOT
  845. \p{Block=Bopomofo}) (70)
  846. X \p{Bopomofo_Ext} \p{Bopomofo_Extended} (= \p{Block=
  847. Bopomofo_Extended}) (32)
  848. X \p{Bopomofo_Extended} \p{Block=Bopomofo_Extended} (Short:
  849. \p{InBopomofoExt}) (32)
  850. X \p{Box_Drawing} \p{Block=Box_Drawing} (128)
  851. \p{Brah} \p{Brahmi} (= \p{Script=Brahmi}) (NOT
  852. \p{Block=Brahmi}) (108)
  853. \p{Brahmi} \p{Script=Brahmi} (Short: \p{Brah}; NOT
  854. \p{Block=Brahmi}) (108)
  855. \p{Brai} \p{Braille} (= \p{Script=Braille}) (256)
  856. \p{Braille} \p{Script=Braille} (Short: \p{Brai}) (256)
  857. X \p{Braille_Patterns} \p{Block=Braille_Patterns} (Short:
  858. \p{InBraille}) (256)
  859. \p{Bugi} \p{Buginese} (= \p{Script=Buginese}) (NOT
  860. \p{Block=Buginese}) (30)
  861. \p{Buginese} \p{Script=Buginese} (Short: \p{Bugi}; NOT
  862. \p{Block=Buginese}) (30)
  863. \p{Buhd} \p{Buhid} (= \p{Script=Buhid}) (NOT
  864. \p{Block=Buhid}) (20)
  865. \p{Buhid} \p{Script=Buhid} (Short: \p{Buhd}; NOT
  866. \p{Block=Buhid}) (20)
  867. X \p{Byzantine_Music} \p{Byzantine_Musical_Symbols} (= \p{Block=
  868. Byzantine_Musical_Symbols}) (256)
  869. X \p{Byzantine_Musical_Symbols} \p{Block=Byzantine_Musical_Symbols}
  870. (Short: \p{InByzantineMusic}) (256)
  871. \p{C} \p{Other} (= \p{General_Category=Other})
  872. (1_004_134)
  873. \p{Cakm} \p{Chakma} (= \p{Script=Chakma}) (NOT
  874. \p{Block=Chakma}) (67)
  875. \p{Canadian_Aboriginal} \p{Script=Canadian_Aboriginal} (Short:
  876. \p{Cans}) (710)
  877. X \p{Canadian_Syllabics} \p{Unified_Canadian_Aboriginal_Syllabics}
  878. (= \p{Block=
  879. Unified_Canadian_Aboriginal_Syllabics})
  880. (640)
  881. T \p{Canonical_Combining_Class: 0} \p{Canonical_Combining_Class=
  882. Not_Reordered} (1_113_459)
  883. T \p{Canonical_Combining_Class: 1} \p{Canonical_Combining_Class=
  884. Overlay} (26)
  885. T \p{Canonical_Combining_Class: 7} \p{Canonical_Combining_Class=
  886. Nukta} (13)
  887. T \p{Canonical_Combining_Class: 8} \p{Canonical_Combining_Class=
  888. Kana_Voicing} (2)
  889. T \p{Canonical_Combining_Class: 9} \p{Canonical_Combining_Class=
  890. Virama} (37)
  891. T \p{Canonical_Combining_Class: 10} \p{Canonical_Combining_Class=
  892. CCC10} (1)
  893. T \p{Canonical_Combining_Class: 11} \p{Canonical_Combining_Class=
  894. CCC11} (1)
  895. T \p{Canonical_Combining_Class: 12} \p{Canonical_Combining_Class=
  896. CCC12} (1)
  897. T \p{Canonical_Combining_Class: 13} \p{Canonical_Combining_Class=
  898. CCC13} (1)
  899. T \p{Canonical_Combining_Class: 14} \p{Canonical_Combining_Class=
  900. CCC14} (1)
  901. T \p{Canonical_Combining_Class: 15} \p{Canonical_Combining_Class=
  902. CCC15} (1)
  903. T \p{Canonical_Combining_Class: 16} \p{Canonical_Combining_Class=
  904. CCC16} (1)
  905. T \p{Canonical_Combining_Class: 17} \p{Canonical_Combining_Class=
  906. CCC17} (1)
  907. T \p{Canonical_Combining_Class: 18} \p{Canonical_Combining_Class=
  908. CCC18} (2)
  909. T \p{Canonical_Combining_Class: 19} \p{Canonical_Combining_Class=
  910. CCC19} (2)
  911. T \p{Canonical_Combining_Class: 20} \p{Canonical_Combining_Class=
  912. CCC20} (1)
  913. T \p{Canonical_Combining_Class: 21} \p{Canonical_Combining_Class=
  914. CCC21} (1)
  915. T \p{Canonical_Combining_Class: 22} \p{Canonical_Combining_Class=
  916. CCC22} (1)
  917. T \p{Canonical_Combining_Class: 23} \p{Canonical_Combining_Class=
  918. CCC23} (1)
  919. T \p{Canonical_Combining_Class: 24} \p{Canonical_Combining_Class=
  920. CCC24} (1)
  921. T \p{Canonical_Combining_Class: 25} \p{Canonical_Combining_Class=
  922. CCC25} (1)
  923. T \p{Canonical_Combining_Class: 26} \p{Canonical_Combining_Class=
  924. CCC26} (1)
  925. T \p{Canonical_Combining_Class: 27} \p{Canonical_Combining_Class=
  926. CCC27} (2)
  927. T \p{Canonical_Combining_Class: 28} \p{Canonical_Combining_Class=
  928. CCC28} (2)
  929. T \p{Canonical_Combining_Class: 29} \p{Canonical_Combining_Class=
  930. CCC29} (2)
  931. T \p{Canonical_Combining_Class: 30} \p{Canonical_Combining_Class=
  932. CCC30} (2)
  933. T \p{Canonical_Combining_Class: 31} \p{Canonical_Combining_Class=
  934. CCC31} (2)
  935. T \p{Canonical_Combining_Class: 32} \p{Canonical_Combining_Class=
  936. CCC32} (2)
  937. T \p{Canonical_Combining_Class: 33} \p{Canonical_Combining_Class=
  938. CCC33} (1)
  939. T \p{Canonical_Combining_Class: 34} \p{Canonical_Combining_Class=
  940. CCC34} (1)
  941. T \p{Canonical_Combining_Class: 35} \p{Canonical_Combining_Class=
  942. CCC35} (1)
  943. T \p{Canonical_Combining_Class: 36} \p{Canonical_Combining_Class=
  944. CCC36} (1)
  945. T \p{Canonical_Combining_Class: 84} \p{Canonical_Combining_Class=
  946. CCC84} (1)
  947. T \p{Canonical_Combining_Class: 91} \p{Canonical_Combining_Class=
  948. CCC91} (1)
  949. T \p{Canonical_Combining_Class: 103} \p{Canonical_Combining_Class=
  950. CCC103} (2)
  951. T \p{Canonical_Combining_Class: 107} \p{Canonical_Combining_Class=
  952. CCC107} (4)
  953. T \p{Canonical_Combining_Class: 118} \p{Canonical_Combining_Class=
  954. CCC118} (2)
  955. T \p{Canonical_Combining_Class: 122} \p{Canonical_Combining_Class=
  956. CCC122} (4)
  957. T \p{Canonical_Combining_Class: 129} \p{Canonical_Combining_Class=
  958. CCC129} (1)
  959. T \p{Canonical_Combining_Class: 130} \p{Canonical_Combining_Class=
  960. CCC130} (6)
  961. T \p{Canonical_Combining_Class: 132} \p{Canonical_Combining_Class=
  962. CCC132} (1)
  963. T \p{Canonical_Combining_Class: 133} \p{Canonical_Combining_Class=
  964. CCC133} (0)
  965. T \p{Canonical_Combining_Class: 200} \p{Canonical_Combining_Class=
  966. Attached_Below_Left} (0)
  967. T \p{Canonical_Combining_Class: 202} \p{Canonical_Combining_Class=
  968. Attached_Below} (5)
  969. T \p{Canonical_Combining_Class: 214} \p{Canonical_Combining_Class=
  970. Attached_Above} (1)
  971. T \p{Canonical_Combining_Class: 216} \p{Canonical_Combining_Class=
  972. Attached_Above_Right} (9)
  973. T \p{Canonical_Combining_Class: 218} \p{Canonical_Combining_Class=
  974. Below_Left} (1)
  975. T \p{Canonical_Combining_Class: 220} \p{Canonical_Combining_Class=
  976. Below} (129)
  977. T \p{Canonical_Combining_Class: 222} \p{Canonical_Combining_Class=
  978. Below_Right} (4)
  979. T \p{Canonical_Combining_Class: 224} \p{Canonical_Combining_Class=
  980. Left} (2)
  981. T \p{Canonical_Combining_Class: 226} \p{Canonical_Combining_Class=
  982. Right} (1)
  983. T \p{Canonical_Combining_Class: 228} \p{Canonical_Combining_Class=
  984. Above_Left} (3)
  985. T \p{Canonical_Combining_Class: 230} \p{Canonical_Combining_Class=
  986. Above} (349)
  987. T \p{Canonical_Combining_Class: 232} \p{Canonical_Combining_Class=
  988. Above_Right} (4)
  989. T \p{Canonical_Combining_Class: 233} \p{Canonical_Combining_Class=
  990. Double_Below} (4)
  991. T \p{Canonical_Combining_Class: 234} \p{Canonical_Combining_Class=
  992. Double_Above} (5)
  993. T \p{Canonical_Combining_Class: 240} \p{Canonical_Combining_Class=
  994. Iota_Subscript} (1)
  995. \p{Canonical_Combining_Class: A} \p{Canonical_Combining_Class=
  996. Above} (349)
  997. \p{Canonical_Combining_Class: Above} (Short: \p{Ccc=A}) (349)
  998. \p{Canonical_Combining_Class: Above_Left} (Short: \p{Ccc=AL}) (3)
  999. \p{Canonical_Combining_Class: Above_Right} (Short: \p{Ccc=AR}) (4)
  1000. \p{Canonical_Combining_Class: AL} \p{Canonical_Combining_Class=
  1001. Above_Left} (3)
  1002. \p{Canonical_Combining_Class: AR} \p{Canonical_Combining_Class=
  1003. Above_Right} (4)
  1004. \p{Canonical_Combining_Class: ATA} \p{Canonical_Combining_Class=
  1005. Attached_Above} (1)
  1006. \p{Canonical_Combining_Class: ATAR} \p{Canonical_Combining_Class=
  1007. Attached_Above_Right} (9)
  1008. \p{Canonical_Combining_Class: ATB} \p{Canonical_Combining_Class=
  1009. Attached_Below} (5)
  1010. \p{Canonical_Combining_Class: ATBL} \p{Canonical_Combining_Class=
  1011. Attached_Below_Left} (0)
  1012. \p{Canonical_Combining_Class: Attached_Above} (Short: \p{Ccc=ATA})
  1013. (1)
  1014. \p{Canonical_Combining_Class: Attached_Above_Right} (Short:
  1015. \p{Ccc=ATAR}) (9)
  1016. \p{Canonical_Combining_Class: Attached_Below} (Short: \p{Ccc=ATB})
  1017. (5)
  1018. \p{Canonical_Combining_Class: Attached_Below_Left} (Short: \p{Ccc=
  1019. ATBL}) (0)
  1020. \p{Canonical_Combining_Class: B} \p{Canonical_Combining_Class=
  1021. Below} (129)
  1022. \p{Canonical_Combining_Class: Below} (Short: \p{Ccc=B}) (129)
  1023. \p{Canonical_Combining_Class: Below_Left} (Short: \p{Ccc=BL}) (1)
  1024. \p{Canonical_Combining_Class: Below_Right} (Short: \p{Ccc=BR}) (4)
  1025. \p{Canonical_Combining_Class: BL} \p{Canonical_Combining_Class=
  1026. Below_Left} (1)
  1027. \p{Canonical_Combining_Class: BR} \p{Canonical_Combining_Class=
  1028. Below_Right} (4)
  1029. \p{Canonical_Combining_Class: CCC10} (Short: \p{Ccc=CCC10}) (1)
  1030. \p{Canonical_Combining_Class: CCC103} (Short: \p{Ccc=CCC103}) (2)
  1031. \p{Canonical_Combining_Class: CCC107} (Short: \p{Ccc=CCC107}) (4)
  1032. \p{Canonical_Combining_Class: CCC11} (Short: \p{Ccc=CCC11}) (1)
  1033. \p{Canonical_Combining_Class: CCC118} (Short: \p{Ccc=CCC118}) (2)
  1034. \p{Canonical_Combining_Class: CCC12} (Short: \p{Ccc=CCC12}) (1)
  1035. \p{Canonical_Combining_Class: CCC122} (Short: \p{Ccc=CCC122}) (4)
  1036. \p{Canonical_Combining_Class: CCC129} (Short: \p{Ccc=CCC129}) (1)
  1037. \p{Canonical_Combining_Class: CCC13} (Short: \p{Ccc=CCC13}) (1)
  1038. \p{Canonical_Combining_Class: CCC130} (Short: \p{Ccc=CCC130}) (6)
  1039. \p{Canonical_Combining_Class: CCC132} (Short: \p{Ccc=CCC132}) (1)
  1040. \p{Canonical_Combining_Class: CCC133} (Short: \p{Ccc=CCC133}) (0)
  1041. \p{Canonical_Combining_Class: CCC14} (Short: \p{Ccc=CCC14}) (1)
  1042. \p{Canonical_Combining_Class: CCC15} (Short: \p{Ccc=CCC15}) (1)
  1043. \p{Canonical_Combining_Class: CCC16} (Short: \p{Ccc=CCC16}) (1)
  1044. \p{Canonical_Combining_Class: CCC17} (Short: \p{Ccc=CCC17}) (1)
  1045. \p{Canonical_Combining_Class: CCC18} (Short: \p{Ccc=CCC18}) (2)
  1046. \p{Canonical_Combining_Class: CCC19} (Short: \p{Ccc=CCC19}) (2)
  1047. \p{Canonical_Combining_Class: CCC20} (Short: \p{Ccc=CCC20}) (1)
  1048. \p{Canonical_Combining_Class: CCC21} (Short: \p{Ccc=CCC21}) (1)
  1049. \p{Canonical_Combining_Class: CCC22} (Short: \p{Ccc=CCC22}) (1)
  1050. \p{Canonical_Combining_Class: CCC23} (Short: \p{Ccc=CCC23}) (1)
  1051. \p{Canonical_Combining_Class: CCC24} (Short: \p{Ccc=CCC24}) (1)
  1052. \p{Canonical_Combining_Class: CCC25} (Short: \p{Ccc=CCC25}) (1)
  1053. \p{Canonical_Combining_Class: CCC26} (Short: \p{Ccc=CCC26}) (1)
  1054. \p{Canonical_Combining_Class: CCC27} (Short: \p{Ccc=CCC27}) (2)
  1055. \p{Canonical_Combining_Class: CCC28} (Short: \p{Ccc=CCC28}) (2)
  1056. \p{Canonical_Combining_Class: CCC29} (Short: \p{Ccc=CCC29}) (2)
  1057. \p{Canonical_Combining_Class: CCC30} (Short: \p{Ccc=CCC30}) (2)
  1058. \p{Canonical_Combining_Class: CCC31} (Short: \p{Ccc=CCC31}) (2)
  1059. \p{Canonical_Combining_Class: CCC32} (Short: \p{Ccc=CCC32}) (2)
  1060. \p{Canonical_Combining_Class: CCC33} (Short: \p{Ccc=CCC33}) (1)
  1061. \p{Canonical_Combining_Class: CCC34} (Short: \p{Ccc=CCC34}) (1)
  1062. \p{Canonical_Combining_Class: CCC35} (Short: \p{Ccc=CCC35}) (1)
  1063. \p{Canonical_Combining_Class: CCC36} (Short: \p{Ccc=CCC36}) (1)
  1064. \p{Canonical_Combining_Class: CCC84} (Short: \p{Ccc=CCC84}) (1)
  1065. \p{Canonical_Combining_Class: CCC91} (Short: \p{Ccc=CCC91}) (1)
  1066. \p{Canonical_Combining_Class: DA} \p{Canonical_Combining_Class=
  1067. Double_Above} (5)
  1068. \p{Canonical_Combining_Class: DB} \p{Canonical_Combining_Class=
  1069. Double_Below} (4)
  1070. \p{Canonical_Combining_Class: Double_Above} (Short: \p{Ccc=DA}) (5)
  1071. \p{Canonical_Combining_Class: Double_Below} (Short: \p{Ccc=DB}) (4)
  1072. \p{Canonical_Combining_Class: Iota_Subscript} (Short: \p{Ccc=IS})
  1073. (1)
  1074. \p{Canonical_Combining_Class: IS} \p{Canonical_Combining_Class=
  1075. Iota_Subscript} (1)
  1076. \p{Canonical_Combining_Class: Kana_Voicing} (Short: \p{Ccc=KV}) (2)
  1077. \p{Canonical_Combining_Class: KV} \p{Canonical_Combining_Class=
  1078. Kana_Voicing} (2)
  1079. \p{Canonical_Combining_Class: L} \p{Canonical_Combining_Class=
  1080. Left} (2)
  1081. \p{Canonical_Combining_Class: Left} (Short: \p{Ccc=L}) (2)
  1082. \p{Canonical_Combining_Class: NK} \p{Canonical_Combining_Class=
  1083. Nukta} (13)
  1084. \p{Canonical_Combining_Class: Not_Reordered} (Short: \p{Ccc=NR})
  1085. (1_113_459)
  1086. \p{Canonical_Combining_Class: NR} \p{Canonical_Combining_Class=
  1087. Not_Reordered} (1_113_459)
  1088. \p{Canonical_Combining_Class: Nukta} (Short: \p{Ccc=NK}) (13)
  1089. \p{Canonical_Combining_Class: OV} \p{Canonical_Combining_Class=
  1090. Overlay} (26)
  1091. \p{Canonical_Combining_Class: Overlay} (Short: \p{Ccc=OV}) (26)
  1092. \p{Canonical_Combining_Class: R} \p{Canonical_Combining_Class=
  1093. Right} (1)
  1094. \p{Canonical_Combining_Class: Right} (Short: \p{Ccc=R}) (1)
  1095. \p{Canonical_Combining_Class: Virama} (Short: \p{Ccc=VR}) (37)
  1096. \p{Canonical_Combining_Class: VR} \p{Canonical_Combining_Class=
  1097. Virama} (37)
  1098. \p{Cans} \p{Canadian_Aboriginal} (= \p{Script=
  1099. Canadian_Aboriginal}) (710)
  1100. \p{Cari} \p{Carian} (= \p{Script=Carian}) (NOT
  1101. \p{Block=Carian}) (49)
  1102. \p{Carian} \p{Script=Carian} (Short: \p{Cari}; NOT
  1103. \p{Block=Carian}) (49)
  1104. \p{Case_Ignorable} \p{Case_Ignorable=Y} (Short: \p{CI}) (1799)
  1105. \p{Case_Ignorable: N*} (Short: \p{CI=N}, \P{CI}) (1_112_313)
  1106. \p{Case_Ignorable: Y*} (Short: \p{CI=Y}, \p{CI}) (1799)
  1107. \p{Cased} \p{Cased=Y} (3448)
  1108. \p{Cased: N*} (Single: \P{Cased}) (1_110_664)
  1109. \p{Cased: Y*} (Single: \p{Cased}) (3448)
  1110. \p{Cased_Letter} \p{General_Category=Cased_Letter} (Short:
  1111. \p{LC}) (3223)
  1112. \p{Category: *} \p{General_Category: *}
  1113. \p{Cc} \p{Cntrl} (= \p{General_Category=Control})
  1114. (65)
  1115. \p{Ccc: *} \p{Canonical_Combining_Class: *}
  1116. \p{CE} \p{Composition_Exclusion} (=
  1117. \p{Composition_Exclusion=Y}) (81)
  1118. \p{CE: *} \p{Composition_Exclusion: *}
  1119. \p{Cf} \p{Format} (= \p{General_Category=Format})
  1120. (139)
  1121. \p{Chakma} \p{Script=Chakma} (Short: \p{Cakm}; NOT
  1122. \p{Block=Chakma}) (67)
  1123. \p{Cham} \p{Script=Cham} (NOT \p{Block=Cham}) (83)
  1124. \p{Changes_When_Casefolded} \p{Changes_When_Casefolded=Y} (Short:
  1125. \p{CWCF}) (1107)
  1126. \p{Changes_When_Casefolded: N*} (Short: \p{CWCF=N}, \P{CWCF})
  1127. (1_113_005)
  1128. \p{Changes_When_Casefolded: Y*} (Short: \p{CWCF=Y}, \p{CWCF})
  1129. (1107)
  1130. \p{Changes_When_Casemapped} \p{Changes_When_Casemapped=Y} (Short:
  1131. \p{CWCM}) (2138)
  1132. \p{Changes_When_Casemapped: N*} (Short: \p{CWCM=N}, \P{CWCM})
  1133. (1_111_974)
  1134. \p{Changes_When_Casemapped: Y*} (Short: \p{CWCM=Y}, \p{CWCM})
  1135. (2138)
  1136. \p{Changes_When_Lowercased} \p{Changes_When_Lowercased=Y} (Short:
  1137. \p{CWL}) (1043)
  1138. \p{Changes_When_Lowercased: N*} (Short: \p{CWL=N}, \P{CWL})
  1139. (1_113_069)
  1140. \p{Changes_When_Lowercased: Y*} (Short: \p{CWL=Y}, \p{CWL}) (1043)
  1141. \p{Changes_When_NFKC_Casefolded} \p{Changes_When_NFKC_Casefolded=
  1142. Y} (Short: \p{CWKCF}) (9944)
  1143. \p{Changes_When_NFKC_Casefolded: N*} (Short: \p{CWKCF=N},
  1144. \P{CWKCF}) (1_104_168)
  1145. \p{Changes_When_NFKC_Casefolded: Y*} (Short: \p{CWKCF=Y},
  1146. \p{CWKCF}) (9944)
  1147. \p{Changes_When_Titlecased} \p{Changes_When_Titlecased=Y} (Short:
  1148. \p{CWT}) (1099)
  1149. \p{Changes_When_Titlecased: N*} (Short: \p{CWT=N}, \P{CWT})
  1150. (1_113_013)
  1151. \p{Changes_When_Titlecased: Y*} (Short: \p{CWT=Y}, \p{CWT}) (1099)
  1152. \p{Changes_When_Uppercased} \p{Changes_When_Uppercased=Y} (Short:
  1153. \p{CWU}) (1126)
  1154. \p{Changes_When_Uppercased: N*} (Short: \p{CWU=N}, \P{CWU})
  1155. (1_112_986)
  1156. \p{Changes_When_Uppercased: Y*} (Short: \p{CWU=Y}, \p{CWU}) (1126)
  1157. \p{Cher} \p{Cherokee} (= \p{Script=Cherokee}) (NOT
  1158. \p{Block=Cherokee}) (85)
  1159. \p{Cherokee} \p{Script=Cherokee} (Short: \p{Cher}; NOT
  1160. \p{Block=Cherokee}) (85)
  1161. \p{CI} \p{Case_Ignorable} (= \p{Case_Ignorable=
  1162. Y}) (1799)
  1163. \p{CI: *} \p{Case_Ignorable: *}
  1164. X \p{CJK} \p{CJK_Unified_Ideographs} (= \p{Block=
  1165. CJK_Unified_Ideographs}) (20_992)
  1166. X \p{CJK_Compat} \p{CJK_Compatibility} (= \p{Block=
  1167. CJK_Compatibility}) (256)
  1168. X \p{CJK_Compat_Forms} \p{CJK_Compatibility_Forms} (= \p{Block=
  1169. CJK_Compatibility_Forms}) (32)
  1170. X \p{CJK_Compat_Ideographs} \p{CJK_Compatibility_Ideographs} (=
  1171. \p{Block=CJK_Compatibility_Ideographs})
  1172. (512)
  1173. X \p{CJK_Compat_Ideographs_Sup}
  1174. \p{CJK_Compatibility_Ideographs_Supplement}
  1175. (= \p{Block=
  1176. CJK_Compatibility_Ideographs_Supplement})
  1177. (544)
  1178. X \p{CJK_Compatibility} \p{Block=CJK_Compatibility} (Short:
  1179. \p{InCJKCompat}) (256)
  1180. X \p{CJK_Compatibility_Forms} \p{Block=CJK_Compatibility_Forms}
  1181. (Short: \p{InCJKCompatForms}) (32)
  1182. X \p{CJK_Compatibility_Ideographs} \p{Block=
  1183. CJK_Compatibility_Ideographs} (Short:
  1184. \p{InCJKCompatIdeographs}) (512)
  1185. X \p{CJK_Compatibility_Ideographs_Supplement} \p{Block=
  1186. CJK_Compatibility_Ideographs_Supplement}
  1187. (Short: \p{InCJKCompatIdeographsSup})
  1188. (544)
  1189. X \p{CJK_Ext_A} \p{CJK_Unified_Ideographs_Extension_A} (=
  1190. \p{Block=
  1191. CJK_Unified_Ideographs_Extension_A})
  1192. (6592)
  1193. X \p{CJK_Ext_B} \p{CJK_Unified_Ideographs_Extension_B} (=
  1194. \p{Block=
  1195. CJK_Unified_Ideographs_Extension_B})
  1196. (42_720)
  1197. X \p{CJK_Ext_C} \p{CJK_Unified_Ideographs_Extension_C} (=
  1198. \p{Block=
  1199. CJK_Unified_Ideographs_Extension_C})
  1200. (4160)
  1201. X \p{CJK_Ext_D} \p{CJK_Unified_Ideographs_Extension_D} (=
  1202. \p{Block=
  1203. CJK_Unified_Ideographs_Extension_D})
  1204. (224)
  1205. X \p{CJK_Radicals_Sup} \p{CJK_Radicals_Supplement} (= \p{Block=
  1206. CJK_Radicals_Supplement}) (128)
  1207. X \p{CJK_Radicals_Supplement} \p{Block=CJK_Radicals_Supplement}
  1208. (Short: \p{InCJKRadicalsSup}) (128)
  1209. X \p{CJK_Strokes} \p{Block=CJK_Strokes} (48)
  1210. X \p{CJK_Symbols} \p{CJK_Symbols_And_Punctuation} (=
  1211. \p{Block=CJK_Symbols_And_Punctuation})
  1212. (64)
  1213. X \p{CJK_Symbols_And_Punctuation} \p{Block=
  1214. CJK_Symbols_And_Punctuation} (Short:
  1215. \p{InCJKSymbols}) (64)
  1216. X \p{CJK_Unified_Ideographs} \p{Block=CJK_Unified_Ideographs}
  1217. (Short: \p{InCJK}) (20_992)
  1218. X \p{CJK_Unified_Ideographs_Extension_A} \p{Block=
  1219. CJK_Unified_Ideographs_Extension_A}
  1220. (Short: \p{InCJKExtA}) (6592)
  1221. X \p{CJK_Unified_Ideographs_Extension_B} \p{Block=
  1222. CJK_Unified_Ideographs_Extension_B}
  1223. (Short: \p{InCJKExtB}) (42_720)
  1224. X \p{CJK_Unified_Ideographs_Extension_C} \p{Block=
  1225. CJK_Unified_Ideographs_Extension_C}
  1226. (Short: \p{InCJKExtC}) (4160)
  1227. X \p{CJK_Unified_Ideographs_Extension_D} \p{Block=
  1228. CJK_Unified_Ideographs_Extension_D}
  1229. (Short: \p{InCJKExtD}) (224)
  1230. \p{Close_Punctuation} \p{General_Category=Close_Punctuation}
  1231. (Short: \p{Pe}) (71)
  1232. \p{Cn} \p{Unassigned} (= \p{General_Category=
  1233. Unassigned}) (864_414)
  1234. \p{Cntrl} \p{General_Category=Control} Control
  1235. characters (Short: \p{Cc}) (65)
  1236. \p{Co} \p{Private_Use} (= \p{General_Category=
  1237. Private_Use}) (NOT \p{Private_Use_Area})
  1238. (137_468)
  1239. X \p{Combining_Diacritical_Marks} \p{Block=
  1240. Combining_Diacritical_Marks} (Short:
  1241. \p{InDiacriticals}) (112)
  1242. X \p{Combining_Diacritical_Marks_For_Symbols} \p{Block=
  1243. Combining_Diacritical_Marks_For_Symbols}
  1244. (Short: \p{InDiacriticalsForSymbols})
  1245. (48)
  1246. X \p{Combining_Diacritical_Marks_Supplement} \p{Block=
  1247. Combining_Diacritical_Marks_Supplement}
  1248. (Short: \p{InDiacriticalsSup}) (64)
  1249. X \p{Combining_Half_Marks} \p{Block=Combining_Half_Marks} (Short:
  1250. \p{InHalfMarks}) (16)
  1251. \p{Combining_Mark} \p{Mark} (= \p{General_Category=Mark})
  1252. (1645)
  1253. X \p{Combining_Marks_For_Symbols}
  1254. \p{Combining_Diacritical_Marks_For_Symbols}
  1255. (= \p{Block=
  1256. Combining_Diacritical_Marks_For_Symbols})
  1257. (48)
  1258. \p{Common} \p{Script=Common} (Short: \p{Zyyy}) (6413)
  1259. X \p{Common_Indic_Number_Forms} \p{Block=Common_Indic_Number_Forms}
  1260. (Short: \p{InIndicNumberForms}) (16)
  1261. \p{Comp_Ex} \p{Full_Composition_Exclusion} (=
  1262. \p{Full_Composition_Exclusion=Y}) (1120)
  1263. \p{Comp_Ex: *} \p{Full_Composition_Exclusion: *}
  1264. X \p{Compat_Jamo} \p{Hangul_Compatibility_Jamo} (= \p{Block=
  1265. Hangul_Compatibility_Jamo}) (96)
  1266. \p{Composition_Exclusion} \p{Composition_Exclusion=Y} (Short:
  1267. \p{CE}) (81)
  1268. \p{Composition_Exclusion: N*} (Short: \p{CE=N}, \P{CE}) (1_114_031)
  1269. \p{Composition_Exclusion: Y*} (Short: \p{CE=Y}, \p{CE}) (81)
  1270. \p{Connector_Punctuation} \p{General_Category=
  1271. Connector_Punctuation} (Short: \p{Pc})
  1272. (10)
  1273. \p{Control} \p{Cntrl} (= \p{General_Category=Control})
  1274. (65)
  1275. X \p{Control_Pictures} \p{Block=Control_Pictures} (64)
  1276. \p{Copt} \p{Coptic} (= \p{Script=Coptic}) (NOT
  1277. \p{Block=Coptic}) (137)
  1278. \p{Coptic} \p{Script=Coptic} (Short: \p{Copt}; NOT
  1279. \p{Block=Coptic}) (137)
  1280. X \p{Counting_Rod} \p{Counting_Rod_Numerals} (= \p{Block=
  1281. Counting_Rod_Numerals}) (32)
  1282. X \p{Counting_Rod_Numerals} \p{Block=Counting_Rod_Numerals} (Short:
  1283. \p{InCountingRod}) (32)
  1284. \p{Cprt} \p{Cypriot} (= \p{Script=Cypriot}) (55)
  1285. \p{Cs} \p{Surrogate} (= \p{General_Category=
  1286. Surrogate}) (2048)
  1287. \p{Cuneiform} \p{Script=Cuneiform} (Short: \p{Xsux}; NOT
  1288. \p{Block=Cuneiform}) (982)
  1289. X \p{Cuneiform_Numbers} \p{Cuneiform_Numbers_And_Punctuation} (=
  1290. \p{Block=
  1291. Cuneiform_Numbers_And_Punctuation}) (128)
  1292. X \p{Cuneiform_Numbers_And_Punctuation} \p{Block=
  1293. Cuneiform_Numbers_And_Punctuation}
  1294. (Short: \p{InCuneiformNumbers}) (128)
  1295. \p{Currency_Symbol} \p{General_Category=Currency_Symbol}
  1296. (Short: \p{Sc}) (49)
  1297. X \p{Currency_Symbols} \p{Block=Currency_Symbols} (48)
  1298. \p{CWCF} \p{Changes_When_Casefolded} (=
  1299. \p{Changes_When_Casefolded=Y}) (1107)
  1300. \p{CWCF: *} \p{Changes_When_Casefolded: *}
  1301. \p{CWCM} \p{Changes_When_Casemapped} (=
  1302. \p{Changes_When_Casemapped=Y}) (2138)
  1303. \p{CWCM: *} \p{Changes_When_Casemapped: *}
  1304. \p{CWKCF} \p{Changes_When_NFKC_Casefolded} (=
  1305. \p{Changes_When_NFKC_Casefolded=Y})
  1306. (9944)
  1307. \p{CWKCF: *} \p{Changes_When_NFKC_Casefolded: *}
  1308. \p{CWL} \p{Changes_When_Lowercased} (=
  1309. \p{Changes_When_Lowercased=Y}) (1043)
  1310. \p{CWL: *} \p{Changes_When_Lowercased: *}
  1311. \p{CWT} \p{Changes_When_Titlecased} (=
  1312. \p{Changes_When_Titlecased=Y}) (1099)
  1313. \p{CWT: *} \p{Changes_When_Titlecased: *}
  1314. \p{CWU} \p{Changes_When_Uppercased} (=
  1315. \p{Changes_When_Uppercased=Y}) (1126)
  1316. \p{CWU: *} \p{Changes_When_Uppercased: *}
  1317. \p{Cypriot} \p{Script=Cypriot} (Short: \p{Cprt}) (55)
  1318. X \p{Cypriot_Syllabary} \p{Block=Cypriot_Syllabary} (64)
  1319. \p{Cyrillic} \p{Script=Cyrillic} (Short: \p{Cyrl}; NOT
  1320. \p{Block=Cyrillic}) (417)
  1321. X \p{Cyrillic_Ext_A} \p{Cyrillic_Extended_A} (= \p{Block=
  1322. Cyrillic_Extended_A}) (32)
  1323. X \p{Cyrillic_Ext_B} \p{Cyrillic_Extended_B} (= \p{Block=
  1324. Cyrillic_Extended_B}) (96)
  1325. X \p{Cyrillic_Extended_A} \p{Block=Cyrillic_Extended_A} (Short:
  1326. \p{InCyrillicExtA}) (32)
  1327. X \p{Cyrillic_Extended_B} \p{Block=Cyrillic_Extended_B} (Short:
  1328. \p{InCyrillicExtB}) (96)
  1329. X \p{Cyrillic_Sup} \p{Cyrillic_Supplement} (= \p{Block=
  1330. Cyrillic_Supplement}) (48)
  1331. X \p{Cyrillic_Supplement} \p{Block=Cyrillic_Supplement} (Short:
  1332. \p{InCyrillicSup}) (48)
  1333. X \p{Cyrillic_Supplementary} \p{Cyrillic_Supplement} (= \p{Block=
  1334. Cyrillic_Supplement}) (48)
  1335. \p{Cyrl} \p{Cyrillic} (= \p{Script=Cyrillic}) (NOT
  1336. \p{Block=Cyrillic}) (417)
  1337. \p{Dash} \p{Dash=Y} (27)
  1338. \p{Dash: N*} (Single: \P{Dash}) (1_114_085)
  1339. \p{Dash: Y*} (Single: \p{Dash}) (27)
  1340. \p{Dash_Punctuation} \p{General_Category=Dash_Punctuation}
  1341. (Short: \p{Pd}) (23)
  1342. \p{Decimal_Number} \p{Digit} (= \p{General_Category=
  1343. Decimal_Number}) (460)
  1344. \p{Decomposition_Type: Can} \p{Decomposition_Type=Canonical}
  1345. (13_225)
  1346. \p{Decomposition_Type: Canonical} (Short: \p{Dt=Can}) (13_225)
  1347. \p{Decomposition_Type: Circle} (Short: \p{Dt=Enc}) (240)
  1348. \p{Decomposition_Type: Com} \p{Decomposition_Type=Compat} (720)
  1349. \p{Decomposition_Type: Compat} (Short: \p{Dt=Com}) (720)
  1350. \p{Decomposition_Type: Enc} \p{Decomposition_Type=Circle} (240)
  1351. \p{Decomposition_Type: Fin} \p{Decomposition_Type=Final} (240)
  1352. \p{Decomposition_Type: Final} (Short: \p{Dt=Fin}) (240)
  1353. \p{Decomposition_Type: Font} (Short: \p{Dt=Font}) (1184)
  1354. \p{Decomposition_Type: Fra} \p{Decomposition_Type=Fraction} (20)
  1355. \p{Decomposition_Type: Fraction} (Short: \p{Dt=Fra}) (20)
  1356. \p{Decomposition_Type: Init} \p{Decomposition_Type=Initial} (171)
  1357. \p{Decomposition_Type: Initial} (Short: \p{Dt=Init}) (171)
  1358. \p{Decomposition_Type: Iso} \p{Decomposition_Type=Isolated} (238)
  1359. \p{Decomposition_Type: Isolated} (Short: \p{Dt=Iso}) (238)
  1360. \p{Decomposition_Type: Med} \p{Decomposition_Type=Medial} (82)
  1361. \p{Decomposition_Type: Medial} (Short: \p{Dt=Med}) (82)
  1362. \p{Decomposition_Type: Nar} \p{Decomposition_Type=Narrow} (122)
  1363. \p{Decomposition_Type: Narrow} (Short: \p{Dt=Nar}) (122)
  1364. \p{Decomposition_Type: Nb} \p{Decomposition_Type=Nobreak} (5)
  1365. \p{Decomposition_Type: Nobreak} (Short: \p{Dt=Nb}) (5)
  1366. \p{Decomposition_Type: Non_Canon} \p{Decomposition_Type=
  1367. Non_Canonical} (Perl extension) (3655)
  1368. \p{Decomposition_Type: Non_Canonical} Union of all non-canonical
  1369. decompositions (Short: \p{Dt=NonCanon})
  1370. (Perl extension) (3655)
  1371. \p{Decomposition_Type: None} (Short: \p{Dt=None}) (1_097_232)
  1372. \p{Decomposition_Type: Small} (Short: \p{Dt=Sml}) (26)
  1373. \p{Decomposition_Type: Sml} \p{Decomposition_Type=Small} (26)
  1374. \p{Decomposition_Type: Sqr} \p{Decomposition_Type=Square} (284)
  1375. \p{Decomposition_Type: Square} (Short: \p{Dt=Sqr}) (284)
  1376. \p{Decomposition_Type: Sub} (Short: \p{Dt=Sub}) (38)
  1377. \p{Decomposition_Type: Sup} \p{Decomposition_Type=Super} (146)
  1378. \p{Decomposition_Type: Super} (Short: \p{Dt=Sup}) (146)
  1379. \p{Decomposition_Type: Vert} \p{Decomposition_Type=Vertical} (35)
  1380. \p{Decomposition_Type: Vertical} (Short: \p{Dt=Vert}) (35)
  1381. \p{Decomposition_Type: Wide} (Short: \p{Dt=Wide}) (104)
  1382. \p{Default_Ignorable_Code_Point} \p{Default_Ignorable_Code_Point=
  1383. Y} (Short: \p{DI}) (4167)
  1384. \p{Default_Ignorable_Code_Point: N*} (Short: \p{DI=N}, \P{DI})
  1385. (1_109_945)
  1386. \p{Default_Ignorable_Code_Point: Y*} (Short: \p{DI=Y}, \p{DI})
  1387. (4167)
  1388. \p{Dep} \p{Deprecated} (= \p{Deprecated=Y}) (111)
  1389. \p{Dep: *} \p{Deprecated: *}
  1390. \p{Deprecated} \p{Deprecated=Y} (Short: \p{Dep}) (111)
  1391. \p{Deprecated: N*} (Short: \p{Dep=N}, \P{Dep}) (1_114_001)
  1392. \p{Deprecated: Y*} (Short: \p{Dep=Y}, \p{Dep}) (111)
  1393. \p{Deseret} \p{Script=Deseret} (Short: \p{Dsrt}) (80)
  1394. \p{Deva} \p{Devanagari} (= \p{Script=Devanagari})
  1395. (NOT \p{Block=Devanagari}) (151)
  1396. \p{Devanagari} \p{Script=Devanagari} (Short: \p{Deva};
  1397. NOT \p{Block=Devanagari}) (151)
  1398. X \p{Devanagari_Ext} \p{Devanagari_Extended} (= \p{Block=
  1399. Devanagari_Extended}) (32)
  1400. X \p{Devanagari_Extended} \p{Block=Devanagari_Extended} (Short:
  1401. \p{InDevanagariExt}) (32)
  1402. \p{DI} \p{Default_Ignorable_Code_Point} (=
  1403. \p{Default_Ignorable_Code_Point=Y})
  1404. (4167)
  1405. \p{DI: *} \p{Default_Ignorable_Code_Point: *}
  1406. \p{Dia} \p{Diacritic} (= \p{Diacritic=Y}) (693)
  1407. \p{Dia: *} \p{Diacritic: *}
  1408. \p{Diacritic} \p{Diacritic=Y} (Short: \p{Dia}) (693)
  1409. \p{Diacritic: N*} (Short: \p{Dia=N}, \P{Dia}) (1_113_419)
  1410. \p{Diacritic: Y*} (Short: \p{Dia=Y}, \p{Dia}) (693)
  1411. X \p{Diacriticals} \p{Combining_Diacritical_Marks} (=
  1412. \p{Block=Combining_Diacritical_Marks})
  1413. (112)
  1414. X \p{Diacriticals_For_Symbols}
  1415. \p{Combining_Diacritical_Marks_For_Symbols}
  1416. (= \p{Block=
  1417. Combining_Diacritical_Marks_For_Symbols})
  1418. (48)
  1419. X \p{Diacriticals_Sup} \p{Combining_Diacritical_Marks_Supplement}
  1420. (= \p{Block=
  1421. Combining_Diacritical_Marks_Supplement})
  1422. (64)
  1423. \p{Digit} \p{General_Category=Decimal_Number} [0-9]
  1424. + all other decimal digits (Short:
  1425. \p{Nd}) (460)
  1426. X \p{Dingbats} \p{Block=Dingbats} (192)
  1427. X \p{Domino} \p{Domino_Tiles} (= \p{Block=
  1428. Domino_Tiles}) (112)
  1429. X \p{Domino_Tiles} \p{Block=Domino_Tiles} (Short:
  1430. \p{InDomino}) (112)
  1431. \p{Dsrt} \p{Deseret} (= \p{Script=Deseret}) (80)
  1432. \p{Dt: *} \p{Decomposition_Type: *}
  1433. \p{Ea: *} \p{East_Asian_Width: *}
  1434. \p{East_Asian_Width: A} \p{East_Asian_Width=Ambiguous} (138_746)
  1435. \p{East_Asian_Width: Ambiguous} (Short: \p{Ea=A}) (138_746)
  1436. \p{East_Asian_Width: F} \p{East_Asian_Width=Fullwidth} (104)
  1437. \p{East_Asian_Width: Fullwidth} (Short: \p{Ea=F}) (104)
  1438. \p{East_Asian_Width: H} \p{East_Asian_Width=Halfwidth} (123)
  1439. \p{East_Asian_Width: Halfwidth} (Short: \p{Ea=H}) (123)
  1440. \p{East_Asian_Width: N} \p{East_Asian_Width=Neutral} (801_894)
  1441. \p{East_Asian_Width: Na} \p{East_Asian_Width=Narrow} (111)
  1442. \p{East_Asian_Width: Narrow} (Short: \p{Ea=Na}) (111)
  1443. \p{East_Asian_Width: Neutral} (Short: \p{Ea=N}) (801_894)
  1444. \p{East_Asian_Width: W} \p{East_Asian_Width=Wide} (173_134)
  1445. \p{East_Asian_Width: Wide} (Short: \p{Ea=W}) (173_134)
  1446. \p{Egyp} \p{Egyptian_Hieroglyphs} (= \p{Script=
  1447. Egyptian_Hieroglyphs}) (NOT \p{Block=
  1448. Egyptian_Hieroglyphs}) (1071)
  1449. \p{Egyptian_Hieroglyphs} \p{Script=Egyptian_Hieroglyphs} (Short:
  1450. \p{Egyp}; NOT \p{Block=
  1451. Egyptian_Hieroglyphs}) (1071)
  1452. X \p{Emoticons} \p{Block=Emoticons} (80)
  1453. X \p{Enclosed_Alphanum} \p{Enclosed_Alphanumerics} (= \p{Block=
  1454. Enclosed_Alphanumerics}) (160)
  1455. X \p{Enclosed_Alphanum_Sup} \p{Enclosed_Alphanumeric_Supplement} (=
  1456. \p{Block=
  1457. Enclosed_Alphanumeric_Supplement}) (256)
  1458. X \p{Enclosed_Alphanumeric_Supplement} \p{Block=
  1459. Enclosed_Alphanumeric_Supplement}
  1460. (Short: \p{InEnclosedAlphanumSup}) (256)
  1461. X \p{Enclosed_Alphanumerics} \p{Block=Enclosed_Alphanumerics}
  1462. (Short: \p{InEnclosedAlphanum}) (160)
  1463. X \p{Enclosed_CJK} \p{Enclosed_CJK_Letters_And_Months} (=
  1464. \p{Block=
  1465. Enclosed_CJK_Letters_And_Months}) (256)
  1466. X \p{Enclosed_CJK_Letters_And_Months} \p{Block=
  1467. Enclosed_CJK_Letters_And_Months} (Short:
  1468. \p{InEnclosedCJK}) (256)
  1469. X \p{Enclosed_Ideographic_Sup} \p{Enclosed_Ideographic_Supplement}
  1470. (= \p{Block=
  1471. Enclosed_Ideographic_Supplement}) (256)
  1472. X \p{Enclosed_Ideographic_Supplement} \p{Block=
  1473. Enclosed_Ideographic_Supplement} (Short:
  1474. \p{InEnclosedIdeographicSup}) (256)
  1475. \p{Enclosing_Mark} \p{General_Category=Enclosing_Mark}
  1476. (Short: \p{Me}) (12)
  1477. \p{Ethi} \p{Ethiopic} (= \p{Script=Ethiopic}) (NOT
  1478. \p{Block=Ethiopic}) (495)
  1479. \p{Ethiopic} \p{Script=Ethiopic} (Short: \p{Ethi}; NOT
  1480. \p{Block=Ethiopic}) (495)
  1481. X \p{Ethiopic_Ext} \p{Ethiopic_Extended} (= \p{Block=
  1482. Ethiopic_Extended}) (96)
  1483. X \p{Ethiopic_Ext_A} \p{Ethiopic_Extended_A} (= \p{Block=
  1484. Ethiopic_Extended_A}) (48)
  1485. X \p{Ethiopic_Extended} \p{Block=Ethiopic_Extended} (Short:
  1486. \p{InEthiopicExt}) (96)
  1487. X \p{Ethiopic_Extended_A} \p{Block=Ethiopic_Extended_A} (Short:
  1488. \p{InEthiopicExtA}) (48)
  1489. X \p{Ethiopic_Sup} \p{Ethiopic_Supplement} (= \p{Block=
  1490. Ethiopic_Supplement}) (32)
  1491. X \p{Ethiopic_Supplement} \p{Block=Ethiopic_Supplement} (Short:
  1492. \p{InEthiopicSup}) (32)
  1493. \p{Ext} \p{Extender} (= \p{Extender=Y}) (31)
  1494. \p{Ext: *} \p{Extender: *}
  1495. \p{Extender} \p{Extender=Y} (Short: \p{Ext}) (31)
  1496. \p{Extender: N*} (Short: \p{Ext=N}, \P{Ext}) (1_114_081)
  1497. \p{Extender: Y*} (Short: \p{Ext=Y}, \p{Ext}) (31)
  1498. \p{Final_Punctuation} \p{General_Category=Final_Punctuation}
  1499. (Short: \p{Pf}) (10)
  1500. \p{Format} \p{General_Category=Format} (Short:
  1501. \p{Cf}) (139)
  1502. \p{Full_Composition_Exclusion} \p{Full_Composition_Exclusion=Y}
  1503. (Short: \p{CompEx}) (1120)
  1504. \p{Full_Composition_Exclusion: N*} (Short: \p{CompEx=N},
  1505. \P{CompEx}) (1_112_992)
  1506. \p{Full_Composition_Exclusion: Y*} (Short: \p{CompEx=Y},
  1507. \p{CompEx}) (1120)
  1508. \p{Gc: *} \p{General_Category: *}
  1509. \p{GCB: *} \p{Grapheme_Cluster_Break: *}
  1510. \p{General_Category: C} \p{General_Category=Other} (1_004_134)
  1511. \p{General_Category: Cased_Letter} [\p{Ll}\p{Lu}\p{Lt}] (Short:
  1512. \p{Gc=LC}, \p{LC}) (3223)
  1513. \p{General_Category: Cc} \p{General_Category=Control} (65)
  1514. \p{General_Category: Cf} \p{General_Category=Format} (139)
  1515. \p{General_Category: Close_Punctuation} (Short: \p{Gc=Pe}, \p{Pe})
  1516. (71)
  1517. \p{General_Category: Cn} \p{General_Category=Unassigned} (864_414)
  1518. \p{General_Category: Cntrl} \p{General_Category=Control} (65)
  1519. \p{General_Category: Co} \p{General_Category=Private_Use} (137_468)
  1520. \p{General_Category: Combining_Mark} \p{General_Category=Mark}
  1521. (1645)
  1522. \p{General_Category: Connector_Punctuation} (Short: \p{Gc=Pc},
  1523. \p{Pc}) (10)
  1524. \p{General_Category: Control} (Short: \p{Gc=Cc}, \p{Cc}) (65)
  1525. \p{General_Category: Cs} \p{General_Category=Surrogate} (2048)
  1526. \p{General_Category: Currency_Symbol} (Short: \p{Gc=Sc}, \p{Sc})
  1527. (49)
  1528. \p{General_Category: Dash_Punctuation} (Short: \p{Gc=Pd}, \p{Pd})
  1529. (23)
  1530. \p{General_Category: Decimal_Number} (Short: \p{Gc=Nd}, \p{Nd})
  1531. (460)
  1532. \p{General_Category: Digit} \p{General_Category=Decimal_Number}
  1533. (460)
  1534. \p{General_Category: Enclosing_Mark} (Short: \p{Gc=Me}, \p{Me})
  1535. (12)
  1536. \p{General_Category: Final_Punctuation} (Short: \p{Gc=Pf}, \p{Pf})
  1537. (10)
  1538. \p{General_Category: Format} (Short: \p{Gc=Cf}, \p{Cf}) (139)
  1539. \p{General_Category: Initial_Punctuation} (Short: \p{Gc=Pi},
  1540. \p{Pi}) (12)
  1541. \p{General_Category: L} \p{General_Category=Letter} (101_013)
  1542. X \p{General_Category: L&} \p{General_Category=Cased_Letter} (3223)
  1543. X \p{General_Category: L_} \p{General_Category=Cased_Letter} Note
  1544. the trailing '_' matters in spite of
  1545. loose matching rules. (3223)
  1546. \p{General_Category: LC} \p{General_Category=Cased_Letter} (3223)
  1547. \p{General_Category: Letter} (Short: \p{Gc=L}, \p{L}) (101_013)
  1548. \p{General_Category: Letter_Number} (Short: \p{Gc=Nl}, \p{Nl})
  1549. (224)
  1550. \p{General_Category: Line_Separator} (Short: \p{Gc=Zl}, \p{Zl}) (1)
  1551. \p{General_Category: Ll} \p{General_Category=Lowercase_Letter}
  1552. (/i= General_Category=Cased_Letter)
  1553. (1751)
  1554. \p{General_Category: Lm} \p{General_Category=Modifier_Letter} (237)
  1555. \p{General_Category: Lo} \p{General_Category=Other_Letter} (97_553)
  1556. \p{General_Category: Lowercase_Letter} (Short: \p{Gc=Ll}, \p{Ll};
  1557. /i= General_Category=Cased_Letter) (1751)
  1558. \p{General_Category: Lt} \p{General_Category=Titlecase_Letter}
  1559. (/i= General_Category=Cased_Letter) (31)
  1560. \p{General_Category: Lu} \p{General_Category=Uppercase_Letter}
  1561. (/i= General_Category=Cased_Letter)
  1562. (1441)
  1563. \p{General_Category: M} \p{General_Category=Mark} (1645)
  1564. \p{General_Category: Mark} (Short: \p{Gc=M}, \p{M}) (1645)
  1565. \p{General_Category: Math_Symbol} (Short: \p{Gc=Sm}, \p{Sm}) (952)
  1566. \p{General_Category: Mc} \p{General_Category=Spacing_Mark} (353)
  1567. \p{General_Category: Me} \p{General_Category=Enclosing_Mark} (12)
  1568. \p{General_Category: Mn} \p{General_Category=Nonspacing_Mark}
  1569. (1280)
  1570. \p{General_Category: Modifier_Letter} (Short: \p{Gc=Lm}, \p{Lm})
  1571. (237)
  1572. \p{General_Category: Modifier_Symbol} (Short: \p{Gc=Sk}, \p{Sk})
  1573. (115)
  1574. \p{General_Category: N} \p{General_Category=Number} (1148)
  1575. \p{General_Category: Nd} \p{General_Category=Decimal_Number} (460)
  1576. \p{General_Category: Nl} \p{General_Category=Letter_Number} (224)
  1577. \p{General_Category: No} \p{General_Category=Other_Number} (464)
  1578. \p{General_Category: Nonspacing_Mark} (Short: \p{Gc=Mn}, \p{Mn})
  1579. (1280)
  1580. \p{General_Category: Number} (Short: \p{Gc=N}, \p{N}) (1148)
  1581. \p{General_Category: Open_Punctuation} (Short: \p{Gc=Ps}, \p{Ps})
  1582. (72)
  1583. \p{General_Category: Other} (Short: \p{Gc=C}, \p{C}) (1_004_134)
  1584. \p{General_Category: Other_Letter} (Short: \p{Gc=Lo}, \p{Lo})
  1585. (97_553)
  1586. \p{General_Category: Other_Number} (Short: \p{Gc=No}, \p{No}) (464)
  1587. \p{General_Category: Other_Punctuation} (Short: \p{Gc=Po}, \p{Po})
  1588. (434)
  1589. \p{General_Category: Other_Symbol} (Short: \p{Gc=So}, \p{So})
  1590. (4404)
  1591. \p{General_Category: P} \p{General_Category=Punctuation} (632)
  1592. \p{General_Category: Paragraph_Separator} (Short: \p{Gc=Zp},
  1593. \p{Zp}) (1)
  1594. \p{General_Category: Pc} \p{General_Category=
  1595. Connector_Punctuation} (10)
  1596. \p{General_Category: Pd} \p{General_Category=Dash_Punctuation} (23)
  1597. \p{General_Category: Pe} \p{General_Category=Close_Punctuation}
  1598. (71)
  1599. \p{General_Category: Pf} \p{General_Category=Final_Punctuation}
  1600. (10)
  1601. \p{General_Category: Pi} \p{General_Category=Initial_Punctuation}
  1602. (12)
  1603. \p{General_Category: Po} \p{General_Category=Other_Punctuation}
  1604. (434)
  1605. \p{General_Category: Private_Use} (Short: \p{Gc=Co}, \p{Co})
  1606. (137_468)
  1607. \p{General_Category: Ps} \p{General_Category=Open_Punctuation} (72)
  1608. \p{General_Category: Punct} \p{General_Category=Punctuation} (632)
  1609. \p{General_Category: Punctuation} (Short: \p{Gc=P}, \p{P}) (632)
  1610. \p{General_Category: S} \p{General_Category=Symbol} (5520)
  1611. \p{General_Category: Sc} \p{General_Category=Currency_Symbol} (49)
  1612. \p{General_Category: Separator} (Short: \p{Gc=Z}, \p{Z}) (20)
  1613. \p{General_Category: Sk} \p{General_Category=Modifier_Symbol} (115)
  1614. \p{General_Category: Sm} \p{General_Category=Math_Symbol} (952)
  1615. \p{General_Category: So} \p{General_Category=Other_Symbol} (4404)
  1616. \p{General_Category: Space_Separator} (Short: \p{Gc=Zs}, \p{Zs})
  1617. (18)
  1618. \p{General_Category: Spacing_Mark} (Short: \p{Gc=Mc}, \p{Mc}) (353)
  1619. \p{General_Category: Surrogate} (Short: \p{Gc=Cs}, \p{Cs}) (2048)
  1620. \p{General_Category: Symbol} (Short: \p{Gc=S}, \p{S}) (5520)
  1621. \p{General_Category: Titlecase_Letter} (Short: \p{Gc=Lt}, \p{Lt};
  1622. /i= General_Category=Cased_Letter) (31)
  1623. \p{General_Category: Unassigned} (Short: \p{Gc=Cn}, \p{Cn})
  1624. (864_414)
  1625. \p{General_Category: Uppercase_Letter} (Short: \p{Gc=Lu}, \p{Lu};
  1626. /i= General_Category=Cased_Letter) (1441)
  1627. \p{General_Category: Z} \p{General_Category=Separator} (20)
  1628. \p{General_Category: Zl} \p{General_Category=Line_Separator} (1)
  1629. \p{General_Category: Zp} \p{General_Category=Paragraph_Separator}
  1630. (1)
  1631. \p{General_Category: Zs} \p{General_Category=Space_Separator} (18)
  1632. X \p{General_Punctuation} \p{Block=General_Punctuation} (Short:
  1633. \p{InPunctuation}) (112)
  1634. X \p{Geometric_Shapes} \p{Block=Geometric_Shapes} (96)
  1635. \p{Geor} \p{Georgian} (= \p{Script=Georgian}) (NOT
  1636. \p{Block=Georgian}) (127)
  1637. \p{Georgian} \p{Script=Georgian} (Short: \p{Geor}; NOT
  1638. \p{Block=Georgian}) (127)
  1639. X \p{Georgian_Sup} \p{Georgian_Supplement} (= \p{Block=
  1640. Georgian_Supplement}) (48)
  1641. X \p{Georgian_Supplement} \p{Block=Georgian_Supplement} (Short:
  1642. \p{InGeorgianSup}) (48)
  1643. \p{Glag} \p{Glagolitic} (= \p{Script=Glagolitic})
  1644. (NOT \p{Block=Glagolitic}) (94)
  1645. \p{Glagolitic} \p{Script=Glagolitic} (Short: \p{Glag};
  1646. NOT \p{Block=Glagolitic}) (94)
  1647. \p{Goth} \p{Gothic} (= \p{Script=Gothic}) (NOT
  1648. \p{Block=Gothic}) (27)
  1649. \p{Gothic} \p{Script=Gothic} (Short: \p{Goth}; NOT
  1650. \p{Block=Gothic}) (27)
  1651. \p{Gr_Base} \p{Grapheme_Base} (= \p{Grapheme_Base=Y})
  1652. (108_661)
  1653. \p{Gr_Base: *} \p{Grapheme_Base: *}
  1654. \p{Gr_Ext} \p{Grapheme_Extend} (= \p{Grapheme_Extend=
  1655. Y}) (1317)
  1656. \p{Gr_Ext: *} \p{Grapheme_Extend: *}
  1657. \p{Graph} Characters that are graphical (247_565)
  1658. \p{Grapheme_Base} \p{Grapheme_Base=Y} (Short: \p{GrBase})
  1659. (108_661)
  1660. \p{Grapheme_Base: N*} (Short: \p{GrBase=N}, \P{GrBase})
  1661. (1_005_451)
  1662. \p{Grapheme_Base: Y*} (Short: \p{GrBase=Y}, \p{GrBase}) (108_661)
  1663. \p{Grapheme_Cluster_Break: CN} \p{Grapheme_Cluster_Break=Control}
  1664. (6023)
  1665. \p{Grapheme_Cluster_Break: Control} (Short: \p{GCB=CN}) (6023)
  1666. \p{Grapheme_Cluster_Break: CR} (Short: \p{GCB=CR}) (1)
  1667. \p{Grapheme_Cluster_Break: EX} \p{Grapheme_Cluster_Break=Extend}
  1668. (1317)
  1669. \p{Grapheme_Cluster_Break: Extend} (Short: \p{GCB=EX}) (1317)
  1670. \p{Grapheme_Cluster_Break: L} (Short: \p{GCB=L}) (125)
  1671. \p{Grapheme_Cluster_Break: LF} (Short: \p{GCB=LF}) (1)
  1672. \p{Grapheme_Cluster_Break: LV} (Short: \p{GCB=LV}) (399)
  1673. \p{Grapheme_Cluster_Break: LVT} (Short: \p{GCB=LVT}) (10_773)
  1674. \p{Grapheme_Cluster_Break: Other} (Short: \p{GCB=XX}) (1_094_924)
  1675. \p{Grapheme_Cluster_Break: PP} \p{Grapheme_Cluster_Break=Prepend}
  1676. (0)
  1677. \p{Grapheme_Cluster_Break: Prepend} (Short: \p{GCB=PP}) (0)
  1678. \p{Grapheme_Cluster_Break: Regional_Indicator} (Short: \p{GCB=RI})
  1679. (26)
  1680. \p{Grapheme_Cluster_Break: RI} \p{Grapheme_Cluster_Break=
  1681. Regional_Indicator} (26)
  1682. \p{Grapheme_Cluster_Break: SM} \p{Grapheme_Cluster_Break=
  1683. SpacingMark} (291)
  1684. \p{Grapheme_Cluster_Break: SpacingMark} (Short: \p{GCB=SM}) (291)
  1685. \p{Grapheme_Cluster_Break: T} (Short: \p{GCB=T}) (137)
  1686. \p{Grapheme_Cluster_Break: V} (Short: \p{GCB=V}) (95)
  1687. \p{Grapheme_Cluster_Break: XX} \p{Grapheme_Cluster_Break=Other}
  1688. (1_094_924)
  1689. \p{Grapheme_Extend} \p{Grapheme_Extend=Y} (Short: \p{GrExt})
  1690. (1317)
  1691. \p{Grapheme_Extend: N*} (Short: \p{GrExt=N}, \P{GrExt}) (1_112_795)
  1692. \p{Grapheme_Extend: Y*} (Short: \p{GrExt=Y}, \p{GrExt}) (1317)
  1693. \p{Greek} \p{Script=Greek} (Short: \p{Grek}; NOT
  1694. \p{Greek_And_Coptic}) (511)
  1695. X \p{Greek_And_Coptic} \p{Block=Greek_And_Coptic} (Short:
  1696. \p{InGreek}) (144)
  1697. X \p{Greek_Ext} \p{Greek_Extended} (= \p{Block=
  1698. Greek_Extended}) (256)
  1699. X \p{Greek_Extended} \p{Block=Greek_Extended} (Short:
  1700. \p{InGreekExt}) (256)
  1701. \p{Grek} \p{Greek} (= \p{Script=Greek}) (NOT
  1702. \p{Greek_And_Coptic}) (511)
  1703. \p{Gujarati} \p{Script=Gujarati} (Short: \p{Gujr}; NOT
  1704. \p{Block=Gujarati}) (84)
  1705. \p{Gujr} \p{Gujarati} (= \p{Script=Gujarati}) (NOT
  1706. \p{Block=Gujarati}) (84)
  1707. \p{Gurmukhi} \p{Script=Gurmukhi} (Short: \p{Guru}; NOT
  1708. \p{Block=Gurmukhi}) (79)
  1709. \p{Guru} \p{Gurmukhi} (= \p{Script=Gurmukhi}) (NOT
  1710. \p{Block=Gurmukhi}) (79)
  1711. X \p{Half_And_Full_Forms} \p{Halfwidth_And_Fullwidth_Forms} (=
  1712. \p{Block=Halfwidth_And_Fullwidth_Forms})
  1713. (240)
  1714. X \p{Half_Marks} \p{Combining_Half_Marks} (= \p{Block=
  1715. Combining_Half_Marks}) (16)
  1716. X \p{Halfwidth_And_Fullwidth_Forms} \p{Block=
  1717. Halfwidth_And_Fullwidth_Forms} (Short:
  1718. \p{InHalfAndFullForms}) (240)
  1719. \p{Han} \p{Script=Han} (75_963)
  1720. \p{Hang} \p{Hangul} (= \p{Script=Hangul}) (NOT
  1721. \p{Hangul_Syllables}) (11_739)
  1722. \p{Hangul} \p{Script=Hangul} (Short: \p{Hang}; NOT
  1723. \p{Hangul_Syllables}) (11_739)
  1724. X \p{Hangul_Compatibility_Jamo} \p{Block=Hangul_Compatibility_Jamo}
  1725. (Short: \p{InCompatJamo}) (96)
  1726. X \p{Hangul_Jamo} \p{Block=Hangul_Jamo} (Short: \p{InJamo})
  1727. (256)
  1728. X \p{Hangul_Jamo_Extended_A} \p{Block=Hangul_Jamo_Extended_A}
  1729. (Short: \p{InJamoExtA}) (32)
  1730. X \p{Hangul_Jamo_Extended_B} \p{Block=Hangul_Jamo_Extended_B}
  1731. (Short: \p{InJamoExtB}) (80)
  1732. \p{Hangul_Syllable_Type: L} \p{Hangul_Syllable_Type=Leading_Jamo}
  1733. (125)
  1734. \p{Hangul_Syllable_Type: Leading_Jamo} (Short: \p{Hst=L}) (125)
  1735. \p{Hangul_Syllable_Type: LV} \p{Hangul_Syllable_Type=LV_Syllable}
  1736. (399)
  1737. \p{Hangul_Syllable_Type: LV_Syllable} (Short: \p{Hst=LV}) (399)
  1738. \p{Hangul_Syllable_Type: LVT} \p{Hangul_Syllable_Type=
  1739. LVT_Syllable} (10_773)
  1740. \p{Hangul_Syllable_Type: LVT_Syllable} (Short: \p{Hst=LVT})
  1741. (10_773)
  1742. \p{Hangul_Syllable_Type: NA} \p{Hangul_Syllable_Type=
  1743. Not_Applicable} (1_102_583)
  1744. \p{Hangul_Syllable_Type: Not_Applicable} (Short: \p{Hst=NA})
  1745. (1_102_583)
  1746. \p{Hangul_Syllable_Type: T} \p{Hangul_Syllable_Type=Trailing_Jamo}
  1747. (137)
  1748. \p{Hangul_Syllable_Type: Trailing_Jamo} (Short: \p{Hst=T}) (137)
  1749. \p{Hangul_Syllable_Type: V} \p{Hangul_Syllable_Type=Vowel_Jamo}
  1750. (95)
  1751. \p{Hangul_Syllable_Type: Vowel_Jamo} (Short: \p{Hst=V}) (95)
  1752. X \p{Hangul_Syllables} \p{Block=Hangul_Syllables} (Short:
  1753. \p{InHangul}) (11_184)
  1754. \p{Hani} \p{Han} (= \p{Script=Han}) (75_963)
  1755. \p{Hano} \p{Hanunoo} (= \p{Script=Hanunoo}) (NOT
  1756. \p{Block=Hanunoo}) (21)
  1757. \p{Hanunoo} \p{Script=Hanunoo} (Short: \p{Hano}; NOT
  1758. \p{Block=Hanunoo}) (21)
  1759. \p{Hebr} \p{Hebrew} (= \p{Script=Hebrew}) (NOT
  1760. \p{Block=Hebrew}) (133)
  1761. \p{Hebrew} \p{Script=Hebrew} (Short: \p{Hebr}; NOT
  1762. \p{Block=Hebrew}) (133)
  1763. \p{Hex} \p{XDigit} (= \p{Hex_Digit=Y}) (44)
  1764. \p{Hex: *} \p{Hex_Digit: *}
  1765. \p{Hex_Digit} \p{XDigit} (= \p{Hex_Digit=Y}) (44)
  1766. \p{Hex_Digit: N*} (Short: \p{Hex=N}, \P{Hex}) (1_114_068)
  1767. \p{Hex_Digit: Y*} (Short: \p{Hex=Y}, \p{Hex}) (44)
  1768. X \p{High_Private_Use_Surrogates} \p{Block=
  1769. High_Private_Use_Surrogates} (Short:
  1770. \p{InHighPUSurrogates}) (128)
  1771. X \p{High_PU_Surrogates} \p{High_Private_Use_Surrogates} (=
  1772. \p{Block=High_Private_Use_Surrogates})
  1773. (128)
  1774. X \p{High_Surrogates} \p{Block=High_Surrogates} (896)
  1775. \p{Hira} \p{Hiragana} (= \p{Script=Hiragana}) (NOT
  1776. \p{Block=Hiragana}) (91)
  1777. \p{Hiragana} \p{Script=Hiragana} (Short: \p{Hira}; NOT
  1778. \p{Block=Hiragana}) (91)
  1779. \p{HorizSpace} \p{Blank} (19)
  1780. \p{Hst: *} \p{Hangul_Syllable_Type: *}
  1781. D \p{Hyphen} \p{Hyphen=Y} (11)
  1782. D \p{Hyphen: N*} Supplanted by Line_Break property values;
  1783. see www.unicode.org/reports/tr14
  1784. (Single: \P{Hyphen}) (1_114_101)
  1785. D \p{Hyphen: Y*} Supplanted by Line_Break property values;
  1786. see www.unicode.org/reports/tr14
  1787. (Single: \p{Hyphen}) (11)
  1788. \p{ID_Continue} \p{ID_Continue=Y} (Short: \p{IDC}; NOT
  1789. \p{Ideographic_Description_Characters})
  1790. (103_355)
  1791. \p{ID_Continue: N*} (Short: \p{IDC=N}, \P{IDC}) (1_010_757)
  1792. \p{ID_Continue: Y*} (Short: \p{IDC=Y}, \p{IDC}) (103_355)
  1793. \p{ID_Start} \p{ID_Start=Y} (Short: \p{IDS}) (101_240)
  1794. \p{ID_Start: N*} (Short: \p{IDS=N}, \P{IDS}) (1_012_872)
  1795. \p{ID_Start: Y*} (Short: \p{IDS=Y}, \p{IDS}) (101_240)
  1796. \p{IDC} \p{ID_Continue} (= \p{ID_Continue=Y}) (NOT
  1797. \p{Ideographic_Description_Characters})
  1798. (103_355)
  1799. \p{IDC: *} \p{ID_Continue: *}
  1800. \p{Ideo} \p{Ideographic} (= \p{Ideographic=Y})
  1801. (75_633)
  1802. \p{Ideo: *} \p{Ideographic: *}
  1803. \p{Ideographic} \p{Ideographic=Y} (Short: \p{Ideo})
  1804. (75_633)
  1805. \p{Ideographic: N*} (Short: \p{Ideo=N}, \P{Ideo}) (1_038_479)
  1806. \p{Ideographic: Y*} (Short: \p{Ideo=Y}, \p{Ideo}) (75_633)
  1807. X \p{Ideographic_Description_Characters} \p{Block=
  1808. Ideographic_Description_Characters}
  1809. (Short: \p{InIDC}) (16)
  1810. \p{IDS} \p{ID_Start} (= \p{ID_Start=Y}) (101_240)
  1811. \p{IDS: *} \p{ID_Start: *}
  1812. \p{IDS_Binary_Operator} \p{IDS_Binary_Operator=Y} (Short:
  1813. \p{IDSB}) (10)
  1814. \p{IDS_Binary_Operator: N*} (Short: \p{IDSB=N}, \P{IDSB})
  1815. (1_114_102)
  1816. \p{IDS_Binary_Operator: Y*} (Short: \p{IDSB=Y}, \p{IDSB}) (10)
  1817. \p{IDS_Trinary_Operator} \p{IDS_Trinary_Operator=Y} (Short:
  1818. \p{IDST}) (2)
  1819. \p{IDS_Trinary_Operator: N*} (Short: \p{IDST=N}, \P{IDST})
  1820. (1_114_110)
  1821. \p{IDS_Trinary_Operator: Y*} (Short: \p{IDST=Y}, \p{IDST}) (2)
  1822. \p{IDSB} \p{IDS_Binary_Operator} (=
  1823. \p{IDS_Binary_Operator=Y}) (10)
  1824. \p{IDSB: *} \p{IDS_Binary_Operator: *}
  1825. \p{IDST} \p{IDS_Trinary_Operator} (=
  1826. \p{IDS_Trinary_Operator=Y}) (2)
  1827. \p{IDST: *} \p{IDS_Trinary_Operator: *}
  1828. \p{Imperial_Aramaic} \p{Script=Imperial_Aramaic} (Short:
  1829. \p{Armi}; NOT \p{Block=
  1830. Imperial_Aramaic}) (31)
  1831. \p{In: *} \p{Present_In: *} (Perl extension)
  1832. \p{In_*} \p{Block: *}
  1833. X \p{Indic_Number_Forms} \p{Common_Indic_Number_Forms} (= \p{Block=
  1834. Common_Indic_Number_Forms}) (16)
  1835. \p{Inherited} \p{Script=Inherited} (Short: \p{Zinh})
  1836. (523)
  1837. \p{Initial_Punctuation} \p{General_Category=Initial_Punctuation}
  1838. (Short: \p{Pi}) (12)
  1839. \p{Inscriptional_Pahlavi} \p{Script=Inscriptional_Pahlavi} (Short:
  1840. \p{Phli}; NOT \p{Block=
  1841. Inscriptional_Pahlavi}) (27)
  1842. \p{Inscriptional_Parthian} \p{Script=Inscriptional_Parthian}
  1843. (Short: \p{Prti}; NOT \p{Block=
  1844. Inscriptional_Parthian}) (30)
  1845. X \p{IPA_Ext} \p{IPA_Extensions} (= \p{Block=
  1846. IPA_Extensions}) (96)
  1847. X \p{IPA_Extensions} \p{Block=IPA_Extensions} (Short:
  1848. \p{InIPAExt}) (96)
  1849. \p{Is_*} \p{*} (Any exceptions are individually
  1850. noted beginning with the word NOT.) If
  1851. an entry has flag(s) at its beginning,
  1852. like "D", the "Is_" form has the same
  1853. flag(s).
  1854. \p{Ital} \p{Old_Italic} (= \p{Script=Old_Italic})
  1855. (NOT \p{Block=Old_Italic}) (35)
  1856. X \p{Jamo} \p{Hangul_Jamo} (= \p{Block=Hangul_Jamo})
  1857. (256)
  1858. X \p{Jamo_Ext_A} \p{Hangul_Jamo_Extended_A} (= \p{Block=
  1859. Hangul_Jamo_Extended_A}) (32)
  1860. X \p{Jamo_Ext_B} \p{Hangul_Jamo_Extended_B} (= \p{Block=
  1861. Hangul_Jamo_Extended_B}) (80)
  1862. \p{Java} \p{Javanese} (= \p{Script=Javanese}) (NOT
  1863. \p{Block=Javanese}) (91)
  1864. \p{Javanese} \p{Script=Javanese} (Short: \p{Java}; NOT
  1865. \p{Block=Javanese}) (91)
  1866. \p{Jg: *} \p{Joining_Group: *}
  1867. \p{Join_C} \p{Join_Control} (= \p{Join_Control=Y}) (2)
  1868. \p{Join_C: *} \p{Join_Control: *}
  1869. \p{Join_Control} \p{Join_Control=Y} (Short: \p{JoinC}) (2)
  1870. \p{Join_Control: N*} (Short: \p{JoinC=N}, \P{JoinC}) (1_114_110)
  1871. \p{Join_Control: Y*} (Short: \p{JoinC=Y}, \p{JoinC}) (2)
  1872. \p{Joining_Group: Ain} (Short: \p{Jg=Ain}) (7)
  1873. \p{Joining_Group: Alaph} (Short: \p{Jg=Alaph}) (1)
  1874. \p{Joining_Group: Alef} (Short: \p{Jg=Alef}) (10)
  1875. \p{Joining_Group: Beh} (Short: \p{Jg=Beh}) (20)
  1876. \p{Joining_Group: Beth} (Short: \p{Jg=Beth}) (2)
  1877. \p{Joining_Group: Burushaski_Yeh_Barree} (Short: \p{Jg=
  1878. BurushaskiYehBarree}) (2)
  1879. \p{Joining_Group: Dal} (Short: \p{Jg=Dal}) (14)
  1880. \p{Joining_Group: Dalath_Rish} (Short: \p{Jg=DalathRish}) (4)
  1881. \p{Joining_Group: E} (Short: \p{Jg=E}) (1)
  1882. \p{Joining_Group: Farsi_Yeh} (Short: \p{Jg=FarsiYeh}) (7)
  1883. \p{Joining_Group: Fe} (Short: \p{Jg=Fe}) (1)
  1884. \p{Joining_Group: Feh} (Short: \p{Jg=Feh}) (10)
  1885. \p{Joining_Group: Final_Semkath} (Short: \p{Jg=FinalSemkath}) (1)
  1886. \p{Joining_Group: Gaf} (Short: \p{Jg=Gaf}) (13)
  1887. \p{Joining_Group: Gamal} (Short: \p{Jg=Gamal}) (3)
  1888. \p{Joining_Group: Hah} (Short: \p{Jg=Hah}) (18)
  1889. \p{Joining_Group: Hamza_On_Heh_Goal} (Short: \p{Jg=
  1890. HamzaOnHehGoal}) (1)
  1891. \p{Joining_Group: He} (Short: \p{Jg=He}) (1)
  1892. \p{Joining_Group: Heh} (Short: \p{Jg=Heh}) (1)
  1893. \p{Joining_Group: Heh_Goal} (Short: \p{Jg=HehGoal}) (2)
  1894. \p{Joining_Group: Heth} (Short: \p{Jg=Heth}) (1)
  1895. \p{Joining_Group: Kaf} (Short: \p{Jg=Kaf}) (5)
  1896. \p{Joining_Group: Kaph} (Short: \p{Jg=Kaph}) (1)
  1897. \p{Joining_Group: Khaph} (Short: \p{Jg=Khaph}) (1)
  1898. \p{Joining_Group: Knotted_Heh} (Short: \p{Jg=KnottedHeh}) (2)
  1899. \p{Joining_Group: Lam} (Short: \p{Jg=Lam}) (7)
  1900. \p{Joining_Group: Lamadh} (Short: \p{Jg=Lamadh}) (1)
  1901. \p{Joining_Group: Meem} (Short: \p{Jg=Meem}) (4)
  1902. \p{Joining_Group: Mim} (Short: \p{Jg=Mim}) (1)
  1903. \p{Joining_Group: No_Joining_Group} (Short: \p{Jg=NoJoiningGroup})
  1904. (1_113_870)
  1905. \p{Joining_Group: Noon} (Short: \p{Jg=Noon}) (8)
  1906. \p{Joining_Group: Nun} (Short: \p{Jg=Nun}) (1)
  1907. \p{Joining_Group: Nya} (Short: \p{Jg=Nya}) (1)
  1908. \p{Joining_Group: Pe} (Short: \p{Jg=Pe}) (1)
  1909. \p{Joining_Group: Qaf} (Short: \p{Jg=Qaf}) (5)
  1910. \p{Joining_Group: Qaph} (Short: \p{Jg=Qaph}) (1)
  1911. \p{Joining_Group: Reh} (Short: \p{Jg=Reh}) (17)
  1912. \p{Joining_Group: Reversed_Pe} (Short: \p{Jg=ReversedPe}) (1)
  1913. \p{Joining_Group: Rohingya_Yeh} (Short: \p{Jg=RohingyaYeh}) (1)
  1914. \p{Joining_Group: Sad} (Short: \p{Jg=Sad}) (5)
  1915. \p{Joining_Group: Sadhe} (Short: \p{Jg=Sadhe}) (1)
  1916. \p{Joining_Group: Seen} (Short: \p{Jg=Seen}) (11)
  1917. \p{Joining_Group: Semkath} (Short: \p{Jg=Semkath}) (1)
  1918. \p{Joining_Group: Shin} (Short: \p{Jg=Shin}) (1)
  1919. \p{Joining_Group: Swash_Kaf} (Short: \p{Jg=SwashKaf}) (1)
  1920. \p{Joining_Group: Syriac_Waw} (Short: \p{Jg=SyriacWaw}) (1)
  1921. \p{Joining_Group: Tah} (Short: \p{Jg=Tah}) (4)
  1922. \p{Joining_Group: Taw} (Short: \p{Jg=Taw}) (1)
  1923. \p{Joining_Group: Teh_Marbuta} (Short: \p{Jg=TehMarbuta}) (3)
  1924. \p{Joining_Group: Teh_Marbuta_Goal} \p{Joining_Group=
  1925. Hamza_On_Heh_Goal} (1)
  1926. \p{Joining_Group: Teth} (Short: \p{Jg=Teth}) (2)
  1927. \p{Joining_Group: Waw} (Short: \p{Jg=Waw}) (16)
  1928. \p{Joining_Group: Yeh} (Short: \p{Jg=Yeh}) (10)
  1929. \p{Joining_Group: Yeh_Barree} (Short: \p{Jg=YehBarree}) (2)
  1930. \p{Joining_Group: Yeh_With_Tail} (Short: \p{Jg=YehWithTail}) (1)
  1931. \p{Joining_Group: Yudh} (Short: \p{Jg=Yudh}) (1)
  1932. \p{Joining_Group: Yudh_He} (Short: \p{Jg=YudhHe}) (1)
  1933. \p{Joining_Group: Zain} (Short: \p{Jg=Zain}) (1)
  1934. \p{Joining_Group: Zhain} (Short: \p{Jg=Zhain}) (1)
  1935. \p{Joining_Type: C} \p{Joining_Type=Join_Causing} (3)
  1936. \p{Joining_Type: D} \p{Joining_Type=Dual_Joining} (215)
  1937. \p{Joining_Type: Dual_Joining} (Short: \p{Jt=D}) (215)
  1938. \p{Joining_Type: Join_Causing} (Short: \p{Jt=C}) (3)
  1939. \p{Joining_Type: L} \p{Joining_Type=Left_Joining} (0)
  1940. \p{Joining_Type: Left_Joining} (Short: \p{Jt=L}) (0)
  1941. \p{Joining_Type: Non_Joining} (Short: \p{Jt=U}) (1_112_389)
  1942. \p{Joining_Type: R} \p{Joining_Type=Right_Joining} (82)
  1943. \p{Joining_Type: Right_Joining} (Short: \p{Jt=R}) (82)
  1944. \p{Joining_Type: T} \p{Joining_Type=Transparent} (1423)
  1945. \p{Joining_Type: Transparent} (Short: \p{Jt=T}) (1423)
  1946. \p{Joining_Type: U} \p{Joining_Type=Non_Joining} (1_112_389)
  1947. \p{Jt: *} \p{Joining_Type: *}
  1948. \p{Kaithi} \p{Script=Kaithi} (Short: \p{Kthi}; NOT
  1949. \p{Block=Kaithi}) (66)
  1950. \p{Kali} \p{Kayah_Li} (= \p{Script=Kayah_Li}) (48)
  1951. \p{Kana} \p{Katakana} (= \p{Script=Katakana}) (NOT
  1952. \p{Block=Katakana}) (300)
  1953. X \p{Kana_Sup} \p{Kana_Supplement} (= \p{Block=
  1954. Kana_Supplement}) (256)
  1955. X \p{Kana_Supplement} \p{Block=Kana_Supplement} (Short:
  1956. \p{InKanaSup}) (256)
  1957. X \p{Kanbun} \p{Block=Kanbun} (16)
  1958. X \p{Kangxi} \p{Kangxi_Radicals} (= \p{Block=
  1959. Kangxi_Radicals}) (224)
  1960. X \p{Kangxi_Radicals} \p{Block=Kangxi_Radicals} (Short:
  1961. \p{InKangxi}) (224)
  1962. \p{Kannada} \p{Script=Kannada} (Short: \p{Knda}; NOT
  1963. \p{Block=Kannada}) (86)
  1964. \p{Katakana} \p{Script=Katakana} (Short: \p{Kana}; NOT
  1965. \p{Block=Katakana}) (300)
  1966. X \p{Katakana_Ext} \p{Katakana_Phonetic_Extensions} (=
  1967. \p{Block=Katakana_Phonetic_Extensions})
  1968. (16)
  1969. X \p{Katakana_Phonetic_Extensions} \p{Block=
  1970. Katakana_Phonetic_Extensions} (Short:
  1971. \p{InKatakanaExt}) (16)
  1972. \p{Kayah_Li} \p{Script=Kayah_Li} (Short: \p{Kali}) (48)
  1973. \p{Khar} \p{Kharoshthi} (= \p{Script=Kharoshthi})
  1974. (NOT \p{Block=Kharoshthi}) (65)
  1975. \p{Kharoshthi} \p{Script=Kharoshthi} (Short: \p{Khar};
  1976. NOT \p{Block=Kharoshthi}) (65)
  1977. \p{Khmer} \p{Script=Khmer} (Short: \p{Khmr}; NOT
  1978. \p{Block=Khmer}) (146)
  1979. X \p{Khmer_Symbols} \p{Block=Khmer_Symbols} (32)
  1980. \p{Khmr} \p{Khmer} (= \p{Script=Khmer}) (NOT
  1981. \p{Block=Khmer}) (146)
  1982. \p{Knda} \p{Kannada} (= \p{Script=Kannada}) (NOT
  1983. \p{Block=Kannada}) (86)
  1984. \p{Kthi} \p{Kaithi} (= \p{Script=Kaithi}) (NOT
  1985. \p{Block=Kaithi}) (66)
  1986. \p{L} \p{Letter} (= \p{General_Category=Letter})
  1987. (101_013)
  1988. X \p{L&} \p{Cased_Letter} (= \p{General_Category=
  1989. Cased_Letter}) (3223)
  1990. X \p{L_} \p{Cased_Letter} (= \p{General_Category=
  1991. Cased_Letter}) Note the trailing '_'
  1992. matters in spite of loose matching
  1993. rules. (3223)
  1994. \p{Lana} \p{Tai_Tham} (= \p{Script=Tai_Tham}) (NOT
  1995. \p{Block=Tai_Tham}) (127)
  1996. \p{Lao} \p{Script=Lao} (NOT \p{Block=Lao}) (67)
  1997. \p{Laoo} \p{Lao} (= \p{Script=Lao}) (NOT \p{Block=
  1998. Lao}) (67)
  1999. \p{Latin} \p{Script=Latin} (Short: \p{Latn}) (1272)
  2000. X \p{Latin_1} \p{Latin_1_Supplement} (= \p{Block=
  2001. Latin_1_Supplement}) (128)
  2002. X \p{Latin_1_Sup} \p{Latin_1_Supplement} (= \p{Block=
  2003. Latin_1_Supplement}) (128)
  2004. X \p{Latin_1_Supplement} \p{Block=Latin_1_Supplement} (Short:
  2005. \p{InLatin1}) (128)
  2006. X \p{Latin_Ext_A} \p{Latin_Extended_A} (= \p{Block=
  2007. Latin_Extended_A}) (128)
  2008. X \p{Latin_Ext_Additional} \p{Latin_Extended_Additional} (=
  2009. \p{Block=Latin_Extended_Additional})
  2010. (256)
  2011. X \p{Latin_Ext_B} \p{Latin_Extended_B} (= \p{Block=
  2012. Latin_Extended_B}) (208)
  2013. X \p{Latin_Ext_C} \p{Latin_Extended_C} (= \p{Block=
  2014. Latin_Extended_C}) (32)
  2015. X \p{Latin_Ext_D} \p{Latin_Extended_D} (= \p{Block=
  2016. Latin_Extended_D}) (224)
  2017. X \p{Latin_Extended_A} \p{Block=Latin_Extended_A} (Short:
  2018. \p{InLatinExtA}) (128)
  2019. X \p{Latin_Extended_Additional} \p{Block=Latin_Extended_Additional}
  2020. (Short: \p{InLatinExtAdditional}) (256)
  2021. X \p{Latin_Extended_B} \p{Block=Latin_Extended_B} (Short:
  2022. \p{InLatinExtB}) (208)
  2023. X \p{Latin_Extended_C} \p{Block=Latin_Extended_C} (Short:
  2024. \p{InLatinExtC}) (32)
  2025. X \p{Latin_Extended_D} \p{Block=Latin_Extended_D} (Short:
  2026. \p{InLatinExtD}) (224)
  2027. \p{Latn} \p{Latin} (= \p{Script=Latin}) (1272)
  2028. \p{Lb: *} \p{Line_Break: *}
  2029. \p{LC} \p{Cased_Letter} (= \p{General_Category=
  2030. Cased_Letter}) (3223)
  2031. \p{Lepc} \p{Lepcha} (= \p{Script=Lepcha}) (NOT
  2032. \p{Block=Lepcha}) (74)
  2033. \p{Lepcha} \p{Script=Lepcha} (Short: \p{Lepc}; NOT
  2034. \p{Block=Lepcha}) (74)
  2035. \p{Letter} \p{General_Category=Letter} (Short: \p{L})
  2036. (101_013)
  2037. \p{Letter_Number} \p{General_Category=Letter_Number} (Short:
  2038. \p{Nl}) (224)
  2039. X \p{Letterlike_Symbols} \p{Block=Letterlike_Symbols} (80)
  2040. \p{Limb} \p{Limbu} (= \p{Script=Limbu}) (NOT
  2041. \p{Block=Limbu}) (66)
  2042. \p{Limbu} \p{Script=Limbu} (Short: \p{Limb}; NOT
  2043. \p{Block=Limbu}) (66)
  2044. \p{Linb} \p{Linear_B} (= \p{Script=Linear_B}) (211)
  2045. \p{Line_Break: AI} \p{Line_Break=Ambiguous} (687)
  2046. \p{Line_Break: AL} \p{Line_Break=Alphabetic} (15_355)
  2047. \p{Line_Break: Alphabetic} (Short: \p{Lb=AL}) (15_355)
  2048. \p{Line_Break: Ambiguous} (Short: \p{Lb=AI}) (687)
  2049. \p{Line_Break: B2} \p{Line_Break=Break_Both} (3)
  2050. \p{Line_Break: BA} \p{Line_Break=Break_After} (151)
  2051. \p{Line_Break: BB} \p{Line_Break=Break_Before} (19)
  2052. \p{Line_Break: BK} \p{Line_Break=Mandatory_Break} (4)
  2053. \p{Line_Break: Break_After} (Short: \p{Lb=BA}) (151)
  2054. \p{Line_Break: Break_Before} (Short: \p{Lb=BB}) (19)
  2055. \p{Line_Break: Break_Both} (Short: \p{Lb=B2}) (3)
  2056. \p{Line_Break: Break_Symbols} (Short: \p{Lb=SY}) (1)
  2057. \p{Line_Break: Carriage_Return} (Short: \p{Lb=CR}) (1)
  2058. \p{Line_Break: CB} \p{Line_Break=Contingent_Break} (1)
  2059. \p{Line_Break: CJ} \p{Line_Break=
  2060. Conditional_Japanese_Starter} (51)
  2061. \p{Line_Break: CL} \p{Line_Break=Close_Punctuation} (87)
  2062. \p{Line_Break: Close_Parenthesis} (Short: \p{Lb=CP}) (2)
  2063. \p{Line_Break: Close_Punctuation} (Short: \p{Lb=CL}) (87)
  2064. \p{Line_Break: CM} \p{Line_Break=Combining_Mark} (1628)
  2065. \p{Line_Break: Combining_Mark} (Short: \p{Lb=CM}) (1628)
  2066. \p{Line_Break: Complex_Context} (Short: \p{Lb=SA}) (665)
  2067. \p{Line_Break: Conditional_Japanese_Starter} (Short: \p{Lb=CJ})
  2068. (51)
  2069. \p{Line_Break: Contingent_Break} (Short: \p{Lb=CB}) (1)
  2070. \p{Line_Break: CP} \p{Line_Break=Close_Parenthesis} (2)
  2071. \p{Line_Break: CR} \p{Line_Break=Carriage_Return} (1)
  2072. \p{Line_Break: EX} \p{Line_Break=Exclamation} (34)
  2073. \p{Line_Break: Exclamation} (Short: \p{Lb=EX}) (34)
  2074. \p{Line_Break: GL} \p{Line_Break=Glue} (18)
  2075. \p{Line_Break: Glue} (Short: \p{Lb=GL}) (18)
  2076. \p{Line_Break: H2} (Short: \p{Lb=H2}) (399)
  2077. \p{Line_Break: H3} (Short: \p{Lb=H3}) (10_773)
  2078. \p{Line_Break: Hebrew_Letter} (Short: \p{Lb=HL}) (74)
  2079. \p{Line_Break: HL} \p{Line_Break=Hebrew_Letter} (74)
  2080. \p{Line_Break: HY} \p{Line_Break=Hyphen} (1)
  2081. \p{Line_Break: Hyphen} (Short: \p{Lb=HY}) (1)
  2082. \p{Line_Break: ID} \p{Line_Break=Ideographic} (162_700)
  2083. \p{Line_Break: Ideographic} (Short: \p{Lb=ID}) (162_700)
  2084. \p{Line_Break: IN} \p{Line_Break=Inseparable} (4)
  2085. \p{Line_Break: Infix_Numeric} (Short: \p{Lb=IS}) (13)
  2086. \p{Line_Break: Inseparable} (Short: \p{Lb=IN}) (4)
  2087. \p{Line_Break: Inseperable} \p{Line_Break=Inseparable} (4)
  2088. \p{Line_Break: IS} \p{Line_Break=Infix_Numeric} (13)
  2089. \p{Line_Break: JL} (Short: \p{Lb=JL}) (125)
  2090. \p{Line_Break: JT} (Short: \p{Lb=JT}) (137)
  2091. \p{Line_Break: JV} (Short: \p{Lb=JV}) (95)
  2092. \p{Line_Break: LF} \p{Line_Break=Line_Feed} (1)
  2093. \p{Line_Break: Line_Feed} (Short: \p{Lb=LF}) (1)
  2094. \p{Line_Break: Mandatory_Break} (Short: \p{Lb=BK}) (4)
  2095. \p{Line_Break: Next_Line} (Short: \p{Lb=NL}) (1)
  2096. \p{Line_Break: NL} \p{Line_Break=Next_Line} (1)
  2097. \p{Line_Break: Nonstarter} (Short: \p{Lb=NS}) (26)
  2098. \p{Line_Break: NS} \p{Line_Break=Nonstarter} (26)
  2099. \p{Line_Break: NU} \p{Line_Break=Numeric} (452)
  2100. \p{Line_Break: Numeric} (Short: \p{Lb=NU}) (452)
  2101. \p{Line_Break: OP} \p{Line_Break=Open_Punctuation} (81)
  2102. \p{Line_Break: Open_Punctuation} (Short: \p{Lb=OP}) (81)
  2103. \p{Line_Break: PO} \p{Line_Break=Postfix_Numeric} (28)
  2104. \p{Line_Break: Postfix_Numeric} (Short: \p{Lb=PO}) (28)
  2105. \p{Line_Break: PR} \p{Line_Break=Prefix_Numeric} (46)
  2106. \p{Line_Break: Prefix_Numeric} (Short: \p{Lb=PR}) (46)
  2107. \p{Line_Break: QU} \p{Line_Break=Quotation} (34)
  2108. \p{Line_Break: Quotation} (Short: \p{Lb=QU}) (34)
  2109. \p{Line_Break: Regional_Indicator} (Short: \p{Lb=RI}) (26)
  2110. \p{Line_Break: RI} \p{Line_Break=Regional_Indicator} (26)
  2111. \p{Line_Break: SA} \p{Line_Break=Complex_Context} (665)
  2112. D \p{Line_Break: SG} \p{Line_Break=Surrogate} (2048)
  2113. \p{Line_Break: SP} \p{Line_Break=Space} (1)
  2114. \p{Line_Break: Space} (Short: \p{Lb=SP}) (1)
  2115. D \p{Line_Break: Surrogate} Deprecated by Unicode because surrogates
  2116. should never appear in well-formed text,
  2117. and therefore shouldn't be the basis for
  2118. line breaking (Short: \p{Lb=SG}) (2048)
  2119. \p{Line_Break: SY} \p{Line_Break=Break_Symbols} (1)
  2120. \p{Line_Break: Unknown} (Short: \p{Lb=XX}) (918_337)
  2121. \p{Line_Break: WJ} \p{Line_Break=Word_Joiner} (2)
  2122. \p{Line_Break: Word_Joiner} (Short: \p{Lb=WJ}) (2)
  2123. \p{Line_Break: XX} \p{Line_Break=Unknown} (918_337)
  2124. \p{Line_Break: ZW} \p{Line_Break=ZWSpace} (1)
  2125. \p{Line_Break: ZWSpace} (Short: \p{Lb=ZW}) (1)
  2126. \p{Line_Separator} \p{General_Category=Line_Separator}
  2127. (Short: \p{Zl}) (1)
  2128. \p{Linear_B} \p{Script=Linear_B} (Short: \p{Linb}) (211)
  2129. X \p{Linear_B_Ideograms} \p{Block=Linear_B_Ideograms} (128)
  2130. X \p{Linear_B_Syllabary} \p{Block=Linear_B_Syllabary} (128)
  2131. \p{Lisu} \p{Script=Lisu} (48)
  2132. \p{Ll} \p{Lowercase_Letter} (=
  2133. \p{General_Category=Lowercase_Letter})
  2134. (/i= General_Category=Cased_Letter)
  2135. (1751)
  2136. \p{Lm} \p{Modifier_Letter} (=
  2137. \p{General_Category=Modifier_Letter})
  2138. (237)
  2139. \p{Lo} \p{Other_Letter} (= \p{General_Category=
  2140. Other_Letter}) (97_553)
  2141. \p{LOE} \p{Logical_Order_Exception} (=
  2142. \p{Logical_Order_Exception=Y}) (15)
  2143. \p{LOE: *} \p{Logical_Order_Exception: *}
  2144. \p{Logical_Order_Exception} \p{Logical_Order_Exception=Y} (Short:
  2145. \p{LOE}) (15)
  2146. \p{Logical_Order_Exception: N*} (Short: \p{LOE=N}, \P{LOE})
  2147. (1_114_097)
  2148. \p{Logical_Order_Exception: Y*} (Short: \p{LOE=Y}, \p{LOE}) (15)
  2149. X \p{Low_Surrogates} \p{Block=Low_Surrogates} (1024)
  2150. \p{Lower} \p{Lowercase=Y} (/i= Cased=Yes) (1934)
  2151. \p{Lower: *} \p{Lowercase: *}
  2152. \p{Lowercase} \p{Lower} (= \p{Lowercase=Y}) (/i= Cased=
  2153. Yes) (1934)
  2154. \p{Lowercase: N*} (Short: \p{Lower=N}, \P{Lower}; /i= Cased=
  2155. No) (1_112_178)
  2156. \p{Lowercase: Y*} (Short: \p{Lower=Y}, \p{Lower}; /i= Cased=
  2157. Yes) (1934)
  2158. \p{Lowercase_Letter} \p{General_Category=Lowercase_Letter}
  2159. (Short: \p{Ll}; /i= General_Category=
  2160. Cased_Letter) (1751)
  2161. \p{Lt} \p{Titlecase_Letter} (=
  2162. \p{General_Category=Titlecase_Letter})
  2163. (/i= General_Category=Cased_Letter) (31)
  2164. \p{Lu} \p{Uppercase_Letter} (=
  2165. \p{General_Category=Uppercase_Letter})
  2166. (/i= General_Category=Cased_Letter)
  2167. (1441)
  2168. \p{Lyci} \p{Lycian} (= \p{Script=Lycian}) (NOT
  2169. \p{Block=Lycian}) (29)
  2170. \p{Lycian} \p{Script=Lycian} (Short: \p{Lyci}; NOT
  2171. \p{Block=Lycian}) (29)
  2172. \p{Lydi} \p{Lydian} (= \p{Script=Lydian}) (NOT
  2173. \p{Block=Lydian}) (27)
  2174. \p{Lydian} \p{Script=Lydian} (Short: \p{Lydi}; NOT
  2175. \p{Block=Lydian}) (27)
  2176. \p{M} \p{Mark} (= \p{General_Category=Mark})
  2177. (1645)
  2178. X \p{Mahjong} \p{Mahjong_Tiles} (= \p{Block=
  2179. Mahjong_Tiles}) (48)
  2180. X \p{Mahjong_Tiles} \p{Block=Mahjong_Tiles} (Short:
  2181. \p{InMahjong}) (48)
  2182. \p{Malayalam} \p{Script=Malayalam} (Short: \p{Mlym}; NOT
  2183. \p{Block=Malayalam}) (98)
  2184. \p{Mand} \p{Mandaic} (= \p{Script=Mandaic}) (NOT
  2185. \p{Block=Mandaic}) (29)
  2186. \p{Mandaic} \p{Script=Mandaic} (Short: \p{Mand}; NOT
  2187. \p{Block=Mandaic}) (29)
  2188. \p{Mark} \p{General_Category=Mark} (Short: \p{M})
  2189. (1645)
  2190. \p{Math} \p{Math=Y} (2310)
  2191. \p{Math: N*} (Single: \P{Math}) (1_111_802)
  2192. \p{Math: Y*} (Single: \p{Math}) (2310)
  2193. X \p{Math_Alphanum} \p{Mathematical_Alphanumeric_Symbols} (=
  2194. \p{Block=
  2195. Mathematical_Alphanumeric_Symbols})
  2196. (1024)
  2197. X \p{Math_Operators} \p{Mathematical_Operators} (= \p{Block=
  2198. Mathematical_Operators}) (256)
  2199. \p{Math_Symbol} \p{General_Category=Math_Symbol} (Short:
  2200. \p{Sm}) (952)
  2201. X \p{Mathematical_Alphanumeric_Symbols} \p{Block=
  2202. Mathematical_Alphanumeric_Symbols}
  2203. (Short: \p{InMathAlphanum}) (1024)
  2204. X \p{Mathematical_Operators} \p{Block=Mathematical_Operators}
  2205. (Short: \p{InMathOperators}) (256)
  2206. \p{Mc} \p{Spacing_Mark} (= \p{General_Category=
  2207. Spacing_Mark}) (353)
  2208. \p{Me} \p{Enclosing_Mark} (= \p{General_Category=
  2209. Enclosing_Mark}) (12)
  2210. \p{Meetei_Mayek} \p{Script=Meetei_Mayek} (Short: \p{Mtei};
  2211. NOT \p{Block=Meetei_Mayek}) (79)
  2212. X \p{Meetei_Mayek_Ext} \p{Meetei_Mayek_Extensions} (= \p{Block=
  2213. Meetei_Mayek_Extensions}) (32)
  2214. X \p{Meetei_Mayek_Extensions} \p{Block=Meetei_Mayek_Extensions}
  2215. (Short: \p{InMeeteiMayekExt}) (32)
  2216. \p{Merc} \p{Meroitic_Cursive} (= \p{Script=
  2217. Meroitic_Cursive}) (NOT \p{Block=
  2218. Meroitic_Cursive}) (26)
  2219. \p{Mero} \p{Meroitic_Hieroglyphs} (= \p{Script=
  2220. Meroitic_Hieroglyphs}) (32)
  2221. \p{Meroitic_Cursive} \p{Script=Meroitic_Cursive} (Short:
  2222. \p{Merc}; NOT \p{Block=
  2223. Meroitic_Cursive}) (26)
  2224. \p{Meroitic_Hieroglyphs} \p{Script=Meroitic_Hieroglyphs} (Short:
  2225. \p{Mero}) (32)
  2226. \p{Miao} \p{Script=Miao} (NOT \p{Block=Miao}) (133)
  2227. X \p{Misc_Arrows} \p{Miscellaneous_Symbols_And_Arrows} (=
  2228. \p{Block=
  2229. Miscellaneous_Symbols_And_Arrows}) (256)
  2230. X \p{Misc_Math_Symbols_A} \p{Miscellaneous_Mathematical_Symbols_A}
  2231. (= \p{Block=
  2232. Miscellaneous_Mathematical_Symbols_A})
  2233. (48)
  2234. X \p{Misc_Math_Symbols_B} \p{Miscellaneous_Mathematical_Symbols_B}
  2235. (= \p{Block=
  2236. Miscellaneous_Mathematical_Symbols_B})
  2237. (128)
  2238. X \p{Misc_Pictographs} \p{Miscellaneous_Symbols_And_Pictographs}
  2239. (= \p{Block=
  2240. Miscellaneous_Symbols_And_Pictographs})
  2241. (768)
  2242. X \p{Misc_Symbols} \p{Miscellaneous_Symbols} (= \p{Block=
  2243. Miscellaneous_Symbols}) (256)
  2244. X \p{Misc_Technical} \p{Miscellaneous_Technical} (= \p{Block=
  2245. Miscellaneous_Technical}) (256)
  2246. X \p{Miscellaneous_Mathematical_Symbols_A} \p{Block=
  2247. Miscellaneous_Mathematical_Symbols_A}
  2248. (Short: \p{InMiscMathSymbolsA}) (48)
  2249. X \p{Miscellaneous_Mathematical_Symbols_B} \p{Block=
  2250. Miscellaneous_Mathematical_Symbols_B}
  2251. (Short: \p{InMiscMathSymbolsB}) (128)
  2252. X \p{Miscellaneous_Symbols} \p{Block=Miscellaneous_Symbols} (Short:
  2253. \p{InMiscSymbols}) (256)
  2254. X \p{Miscellaneous_Symbols_And_Arrows} \p{Block=
  2255. Miscellaneous_Symbols_And_Arrows}
  2256. (Short: \p{InMiscArrows}) (256)
  2257. X \p{Miscellaneous_Symbols_And_Pictographs} \p{Block=
  2258. Miscellaneous_Symbols_And_Pictographs}
  2259. (Short: \p{InMiscPictographs}) (768)
  2260. X \p{Miscellaneous_Technical} \p{Block=Miscellaneous_Technical}
  2261. (Short: \p{InMiscTechnical}) (256)
  2262. \p{Mlym} \p{Malayalam} (= \p{Script=Malayalam})
  2263. (NOT \p{Block=Malayalam}) (98)
  2264. \p{Mn} \p{Nonspacing_Mark} (=
  2265. \p{General_Category=Nonspacing_Mark})
  2266. (1280)
  2267. \p{Modifier_Letter} \p{General_Category=Modifier_Letter}
  2268. (Short: \p{Lm}) (237)
  2269. X \p{Modifier_Letters} \p{Spacing_Modifier_Letters} (= \p{Block=
  2270. Spacing_Modifier_Letters}) (80)
  2271. \p{Modifier_Symbol} \p{General_Category=Modifier_Symbol}
  2272. (Short: \p{Sk}) (115)
  2273. X \p{Modifier_Tone_Letters} \p{Block=Modifier_Tone_Letters} (32)
  2274. \p{Mong} \p{Mongolian} (= \p{Script=Mongolian})
  2275. (NOT \p{Block=Mongolian}) (153)
  2276. \p{Mongolian} \p{Script=Mongolian} (Short: \p{Mong}; NOT
  2277. \p{Block=Mongolian}) (153)
  2278. \p{Mtei} \p{Meetei_Mayek} (= \p{Script=
  2279. Meetei_Mayek}) (NOT \p{Block=
  2280. Meetei_Mayek}) (79)
  2281. X \p{Music} \p{Musical_Symbols} (= \p{Block=
  2282. Musical_Symbols}) (256)
  2283. X \p{Musical_Symbols} \p{Block=Musical_Symbols} (Short:
  2284. \p{InMusic}) (256)
  2285. \p{Myanmar} \p{Script=Myanmar} (Short: \p{Mymr}; NOT
  2286. \p{Block=Myanmar}) (188)
  2287. X \p{Myanmar_Ext_A} \p{Myanmar_Extended_A} (= \p{Block=
  2288. Myanmar_Extended_A}) (32)
  2289. X \p{Myanmar_Extended_A} \p{Block=Myanmar_Extended_A} (Short:
  2290. \p{InMyanmarExtA}) (32)
  2291. \p{Mymr} \p{Myanmar} (= \p{Script=Myanmar}) (NOT
  2292. \p{Block=Myanmar}) (188)
  2293. \p{N} \p{Number} (= \p{General_Category=Number})
  2294. (1148)
  2295. X \p{NB} \p{No_Block} (= \p{Block=No_Block})
  2296. (860_672)
  2297. \p{NChar} \p{Noncharacter_Code_Point} (=
  2298. \p{Noncharacter_Code_Point=Y}) (66)
  2299. \p{NChar: *} \p{Noncharacter_Code_Point: *}
  2300. \p{Nd} \p{Digit} (= \p{General_Category=
  2301. Decimal_Number}) (460)
  2302. \p{New_Tai_Lue} \p{Script=New_Tai_Lue} (Short: \p{Talu};
  2303. NOT \p{Block=New_Tai_Lue}) (83)
  2304. \p{NFC_QC: *} \p{NFC_Quick_Check: *}
  2305. \p{NFC_Quick_Check: M} \p{NFC_Quick_Check=Maybe} (104)
  2306. \p{NFC_Quick_Check: Maybe} (Short: \p{NFCQC=M}) (104)
  2307. \p{NFC_Quick_Check: N} \p{NFC_Quick_Check=No} (NOT
  2308. \P{NFC_Quick_Check} NOR \P{NFC_QC})
  2309. (1120)
  2310. \p{NFC_Quick_Check: No} (Short: \p{NFCQC=N}; NOT
  2311. \P{NFC_Quick_Check} NOR \P{NFC_QC})
  2312. (1120)
  2313. \p{NFC_Quick_Check: Y} \p{NFC_Quick_Check=Yes} (NOT
  2314. \p{NFC_Quick_Check} NOR \p{NFC_QC})
  2315. (1_112_888)
  2316. \p{NFC_Quick_Check: Yes} (Short: \p{NFCQC=Y}; NOT
  2317. \p{NFC_Quick_Check} NOR \p{NFC_QC})
  2318. (1_112_888)
  2319. \p{NFD_QC: *} \p{NFD_Quick_Check: *}
  2320. \p{NFD_Quick_Check: N} \p{NFD_Quick_Check=No} (NOT
  2321. \P{NFD_Quick_Check} NOR \P{NFD_QC})
  2322. (13_225)
  2323. \p{NFD_Quick_Check: No} (Short: \p{NFDQC=N}; NOT
  2324. \P{NFD_Quick_Check} NOR \P{NFD_QC})
  2325. (13_225)
  2326. \p{NFD_Quick_Check: Y} \p{NFD_Quick_Check=Yes} (NOT
  2327. \p{NFD_Quick_Check} NOR \p{NFD_QC})
  2328. (1_100_887)
  2329. \p{NFD_Quick_Check: Yes} (Short: \p{NFDQC=Y}; NOT
  2330. \p{NFD_Quick_Check} NOR \p{NFD_QC})
  2331. (1_100_887)
  2332. \p{NFKC_QC: *} \p{NFKC_Quick_Check: *}
  2333. \p{NFKC_Quick_Check: M} \p{NFKC_Quick_Check=Maybe} (104)
  2334. \p{NFKC_Quick_Check: Maybe} (Short: \p{NFKCQC=M}) (104)
  2335. \p{NFKC_Quick_Check: N} \p{NFKC_Quick_Check=No} (NOT
  2336. \P{NFKC_Quick_Check} NOR \P{NFKC_QC})
  2337. (4787)
  2338. \p{NFKC_Quick_Check: No} (Short: \p{NFKCQC=N}; NOT
  2339. \P{NFKC_Quick_Check} NOR \P{NFKC_QC})
  2340. (4787)
  2341. \p{NFKC_Quick_Check: Y} \p{NFKC_Quick_Check=Yes} (NOT
  2342. \p{NFKC_Quick_Check} NOR \p{NFKC_QC})
  2343. (1_109_221)
  2344. \p{NFKC_Quick_Check: Yes} (Short: \p{NFKCQC=Y}; NOT
  2345. \p{NFKC_Quick_Check} NOR \p{NFKC_QC})
  2346. (1_109_221)
  2347. \p{NFKD_QC: *} \p{NFKD_Quick_Check: *}
  2348. \p{NFKD_Quick_Check: N} \p{NFKD_Quick_Check=No} (NOT
  2349. \P{NFKD_Quick_Check} NOR \P{NFKD_QC})
  2350. (16_880)
  2351. \p{NFKD_Quick_Check: No} (Short: \p{NFKDQC=N}; NOT
  2352. \P{NFKD_Quick_Check} NOR \P{NFKD_QC})
  2353. (16_880)
  2354. \p{NFKD_Quick_Check: Y} \p{NFKD_Quick_Check=Yes} (NOT
  2355. \p{NFKD_Quick_Check} NOR \p{NFKD_QC})
  2356. (1_097_232)
  2357. \p{NFKD_Quick_Check: Yes} (Short: \p{NFKDQC=Y}; NOT
  2358. \p{NFKD_Quick_Check} NOR \p{NFKD_QC})
  2359. (1_097_232)
  2360. \p{Nko} \p{Script=Nko} (NOT \p{NKo}) (59)
  2361. \p{Nkoo} \p{Nko} (= \p{Script=Nko}) (NOT \p{NKo})
  2362. (59)
  2363. \p{Nl} \p{Letter_Number} (= \p{General_Category=
  2364. Letter_Number}) (224)
  2365. \p{No} \p{Other_Number} (= \p{General_Category=
  2366. Other_Number}) (464)
  2367. X \p{No_Block} \p{Block=No_Block} (Short: \p{InNB})
  2368. (860_672)
  2369. \p{Noncharacter_Code_Point} \p{Noncharacter_Code_Point=Y} (Short:
  2370. \p{NChar}) (66)
  2371. \p{Noncharacter_Code_Point: N*} (Short: \p{NChar=N}, \P{NChar})
  2372. (1_114_046)
  2373. \p{Noncharacter_Code_Point: Y*} (Short: \p{NChar=Y}, \p{NChar})
  2374. (66)
  2375. \p{Nonspacing_Mark} \p{General_Category=Nonspacing_Mark}
  2376. (Short: \p{Mn}) (1280)
  2377. \p{Nt: *} \p{Numeric_Type: *}
  2378. \p{Number} \p{General_Category=Number} (Short: \p{N})
  2379. (1148)
  2380. X \p{Number_Forms} \p{Block=Number_Forms} (64)
  2381. \p{Numeric_Type: De} \p{Numeric_Type=Decimal} (460)
  2382. \p{Numeric_Type: Decimal} (Short: \p{Nt=De}) (460)
  2383. \p{Numeric_Type: Di} \p{Numeric_Type=Digit} (128)
  2384. \p{Numeric_Type: Digit} (Short: \p{Nt=Di}) (128)
  2385. \p{Numeric_Type: None} (Short: \p{Nt=None}) (1_112_883)
  2386. \p{Numeric_Type: Nu} \p{Numeric_Type=Numeric} (641)
  2387. \p{Numeric_Type: Numeric} (Short: \p{Nt=Nu}) (641)
  2388. T \p{Numeric_Value: -1} (Short: \p{Nv=-1}) (2)
  2389. T \p{Numeric_Value: -1/2} (Short: \p{Nv=-1/2}) (1)
  2390. T \p{Numeric_Value: 0} (Short: \p{Nv=0}) (60)
  2391. T \p{Numeric_Value: 1/16} (Short: \p{Nv=1/16}) (3)
  2392. T \p{Numeric_Value: 1/10} (Short: \p{Nv=1/10}) (1)
  2393. T \p{Numeric_Value: 1/9} (Short: \p{Nv=1/9}) (1)
  2394. T \p{Numeric_Value: 1/8} (Short: \p{Nv=1/8}) (5)
  2395. T \p{Numeric_Value: 1/7} (Short: \p{Nv=1/7}) (1)
  2396. T \p{Numeric_Value: 1/6} (Short: \p{Nv=1/6}) (2)
  2397. T \p{Numeric_Value: 3/16} (Short: \p{Nv=3/16}) (3)
  2398. T \p{Numeric_Value: 1/5} (Short: \p{Nv=1/5}) (1)
  2399. T \p{Numeric_Value: 1/4} (Short: \p{Nv=1/4}) (9)
  2400. T \p{Numeric_Value: 1/3} (Short: \p{Nv=1/3}) (4)
  2401. T \p{Numeric_Value: 3/8} (Short: \p{Nv=3/8}) (1)
  2402. T \p{Numeric_Value: 2/5} (Short: \p{Nv=2/5}) (1)
  2403. T \p{Numeric_Value: 1/2} (Short: \p{Nv=1/2}) (10)
  2404. T \p{Numeric_Value: 3/5} (Short: \p{Nv=3/5}) (1)
  2405. T \p{Numeric_Value: 5/8} (Short: \p{Nv=5/8}) (1)
  2406. T \p{Numeric_Value: 2/3} (Short: \p{Nv=2/3}) (5)
  2407. T \p{Numeric_Value: 3/4} (Short: \p{Nv=3/4}) (6)
  2408. T \p{Numeric_Value: 4/5} (Short: \p{Nv=4/5}) (1)
  2409. T \p{Numeric_Value: 5/6} (Short: \p{Nv=5/6}) (2)
  2410. T \p{Numeric_Value: 7/8} (Short: \p{Nv=7/8}) (1)
  2411. T \p{Numeric_Value: 1} (Short: \p{Nv=1}) (97)
  2412. T \p{Numeric_Value: 3/2} (Short: \p{Nv=3/2}) (1)
  2413. T \p{Numeric_Value: 2} (Short: \p{Nv=2}) (100)
  2414. T \p{Numeric_Value: 5/2} (Short: \p{Nv=5/2}) (1)
  2415. T \p{Numeric_Value: 3} (Short: \p{Nv=3}) (102)
  2416. T \p{Numeric_Value: 7/2} (Short: \p{Nv=7/2}) (1)
  2417. T \p{Numeric_Value: 4} (Short: \p{Nv=4}) (93)
  2418. T \p{Numeric_Value: 9/2} (Short: \p{Nv=9/2}) (1)
  2419. T \p{Numeric_Value: 5} (Short: \p{Nv=5}) (90)
  2420. T \p{Numeric_Value: 11/2} (Short: \p{Nv=11/2}) (1)
  2421. T \p{Numeric_Value: 6} (Short: \p{Nv=6}) (82)
  2422. T \p{Numeric_Value: 13/2} (Short: \p{Nv=13/2}) (1)
  2423. T \p{Numeric_Value: 7} (Short: \p{Nv=7}) (81)
  2424. T \p{Numeric_Value: 15/2} (Short: \p{Nv=15/2}) (1)
  2425. T \p{Numeric_Value: 8} (Short: \p{Nv=8}) (77)
  2426. T \p{Numeric_Value: 17/2} (Short: \p{Nv=17/2}) (1)
  2427. T \p{Numeric_Value: 9} (Short: \p{Nv=9}) (81)
  2428. T \p{Numeric_Value: 10} (Short: \p{Nv=10}) (40)
  2429. T \p{Numeric_Value: 11} (Short: \p{Nv=11}) (6)
  2430. T \p{Numeric_Value: 12} (Short: \p{Nv=12}) (6)
  2431. T \p{Numeric_Value: 13} (Short: \p{Nv=13}) (4)
  2432. T \p{Numeric_Value: 14} (Short: \p{Nv=14}) (4)
  2433. T \p{Numeric_Value: 15} (Short: \p{Nv=15}) (4)
  2434. T \p{Numeric_Value: 16} (Short: \p{Nv=16}) (5)
  2435. T \p{Numeric_Value: 17} (Short: \p{Nv=17}) (5)
  2436. T \p{Numeric_Value: 18} (Short: \p{Nv=18}) (5)
  2437. T \p{Numeric_Value: 19} (Short: \p{Nv=19}) (5)
  2438. T \p{Numeric_Value: 20} (Short: \p{Nv=20}) (19)
  2439. T \p{Numeric_Value: 21} (Short: \p{Nv=21}) (1)
  2440. T \p{Numeric_Value: 22} (Short: \p{Nv=22}) (1)
  2441. T \p{Numeric_Value: 23} (Short: \p{Nv=23}) (1)
  2442. T \p{Numeric_Value: 24} (Short: \p{Nv=24}) (1)
  2443. T \p{Numeric_Value: 25} (Short: \p{Nv=25}) (1)
  2444. T \p{Numeric_Value: 26} (Short: \p{Nv=26}) (1)
  2445. T \p{Numeric_Value: 27} (Short: \p{Nv=27}) (1)
  2446. T \p{Numeric_Value: 28} (Short: \p{Nv=28}) (1)
  2447. T \p{Numeric_Value: 29} (Short: \p{Nv=29}) (1)
  2448. T \p{Numeric_Value: 30} (Short: \p{Nv=30}) (11)
  2449. T \p{Numeric_Value: 31} (Short: \p{Nv=31}) (1)
  2450. T \p{Numeric_Value: 32} (Short: \p{Nv=32}) (1)
  2451. T \p{Numeric_Value: 33} (Short: \p{Nv=33}) (1)
  2452. T \p{Numeric_Value: 34} (Short: \p{Nv=34}) (1)
  2453. T \p{Numeric_Value: 35} (Short: \p{Nv=35}) (1)
  2454. T \p{Numeric_Value: 36} (Short: \p{Nv=36}) (1)
  2455. T \p{Numeric_Value: 37} (Short: \p{Nv=37}) (1)
  2456. T \p{Numeric_Value: 38} (Short: \p{Nv=38}) (1)
  2457. T \p{Numeric_Value: 39} (Short: \p{Nv=39}) (1)
  2458. T \p{Numeric_Value: 40} (Short: \p{Nv=40}) (10)
  2459. T \p{Numeric_Value: 41} (Short: \p{Nv=41}) (1)
  2460. T \p{Numeric_Value: 42} (Short: \p{Nv=42}) (1)
  2461. T \p{Numeric_Value: 43} (Short: \p{Nv=43}) (1)
  2462. T \p{Numeric_Value: 44} (Short: \p{Nv=44}) (1)
  2463. T \p{Numeric_Value: 45} (Short: \p{Nv=45}) (1)
  2464. T \p{Numeric_Value: 46} (Short: \p{Nv=46}) (1)
  2465. T \p{Numeric_Value: 47} (Short: \p{Nv=47}) (1)
  2466. T \p{Numeric_Value: 48} (Short: \p{Nv=48}) (1)
  2467. T \p{Numeric_Value: 49} (Short: \p{Nv=49}) (1)
  2468. T \p{Numeric_Value: 50} (Short: \p{Nv=50}) (20)
  2469. T \p{Numeric_Value: 60} (Short: \p{Nv=60}) (6)
  2470. T \p{Numeric_Value: 70} (Short: \p{Nv=70}) (6)
  2471. T \p{Numeric_Value: 80} (Short: \p{Nv=80}) (6)
  2472. T \p{Numeric_Value: 90} (Short: \p{Nv=90}) (6)
  2473. T \p{Numeric_Value: 100} (Short: \p{Nv=100}) (20)
  2474. T \p{Numeric_Value: 200} (Short: \p{Nv=200}) (2)
  2475. T \p{Numeric_Value: 300} (Short: \p{Nv=300}) (3)
  2476. T \p{Numeric_Value: 400} (Short: \p{Nv=400}) (2)
  2477. T \p{Numeric_Value: 500} (Short: \p{Nv=500}) (12)
  2478. T \p{Numeric_Value: 600} (Short: \p{Nv=600}) (2)
  2479. T \p{Numeric_Value: 700} (Short: \p{Nv=700}) (2)
  2480. T \p{Numeric_Value: 800} (Short: \p{Nv=800}) (2)
  2481. T \p{Numeric_Value: 900} (Short: \p{Nv=900}) (3)
  2482. T \p{Numeric_Value: 1000} (Short: \p{Nv=1000}) (17)
  2483. T \p{Numeric_Value: 2000} (Short: \p{Nv=2000}) (1)
  2484. T \p{Numeric_Value: 3000} (Short: \p{Nv=3000}) (1)
  2485. T \p{Numeric_Value: 4000} (Short: \p{Nv=4000}) (1)
  2486. T \p{Numeric_Value: 5000} (Short: \p{Nv=5000}) (5)
  2487. T \p{Numeric_Value: 6000} (Short: \p{Nv=6000}) (1)
  2488. T \p{Numeric_Value: 7000} (Short: \p{Nv=7000}) (1)
  2489. T \p{Numeric_Value: 8000} (Short: \p{Nv=8000}) (1)
  2490. T \p{Numeric_Value: 9000} (Short: \p{Nv=9000}) (1)
  2491. T \p{Numeric_Value: 10000} (= 1.0e+04) (Short: \p{Nv=10000}) (7)
  2492. T \p{Numeric_Value: 20000} (= 2.0e+04) (Short: \p{Nv=20000}) (1)
  2493. T \p{Numeric_Value: 30000} (= 3.0e+04) (Short: \p{Nv=30000}) (1)
  2494. T \p{Numeric_Value: 40000} (= 4.0e+04) (Short: \p{Nv=40000}) (1)
  2495. T \p{Numeric_Value: 50000} (= 5.0e+04) (Short: \p{Nv=50000}) (4)
  2496. T \p{Numeric_Value: 60000} (= 6.0e+04) (Short: \p{Nv=60000}) (1)
  2497. T \p{Numeric_Value: 70000} (= 7.0e+04) (Short: \p{Nv=70000}) (1)
  2498. T \p{Numeric_Value: 80000} (= 8.0e+04) (Short: \p{Nv=80000}) (1)
  2499. T \p{Numeric_Value: 90000} (= 9.0e+04) (Short: \p{Nv=90000}) (1)
  2500. T \p{Numeric_Value: 100000} (= 1.0e+05) (Short: \p{Nv=100000}) (1)
  2501. T \p{Numeric_Value: 216000} (= 2.2e+05) (Short: \p{Nv=216000}) (1)
  2502. T \p{Numeric_Value: 432000} (= 4.3e+05) (Short: \p{Nv=432000}) (1)
  2503. T \p{Numeric_Value: 100000000} (= 1.0e+08) (Short: \p{Nv=100000000})
  2504. (2)
  2505. T \p{Numeric_Value: 1000000000000} (= 1.0e+12) (Short: \p{Nv=
  2506. 1000000000000}) (1)
  2507. \p{Numeric_Value: NaN} (Short: \p{Nv=NaN}) (1_112_883)
  2508. \p{Nv: *} \p{Numeric_Value: *}
  2509. X \p{OCR} \p{Optical_Character_Recognition} (=
  2510. \p{Block=Optical_Character_Recognition})
  2511. (32)
  2512. \p{Ogam} \p{Ogham} (= \p{Script=Ogham}) (NOT
  2513. \p{Block=Ogham}) (29)
  2514. \p{Ogham} \p{Script=Ogham} (Short: \p{Ogam}; NOT
  2515. \p{Block=Ogham}) (29)
  2516. \p{Ol_Chiki} \p{Script=Ol_Chiki} (Short: \p{Olck}) (48)
  2517. \p{Olck} \p{Ol_Chiki} (= \p{Script=Ol_Chiki}) (48)
  2518. \p{Old_Italic} \p{Script=Old_Italic} (Short: \p{Ital};
  2519. NOT \p{Block=Old_Italic}) (35)
  2520. \p{Old_Persian} \p{Script=Old_Persian} (Short: \p{Xpeo};
  2521. NOT \p{Block=Old_Persian}) (50)
  2522. \p{Old_South_Arabian} \p{Script=Old_South_Arabian} (Short:
  2523. \p{Sarb}) (32)
  2524. \p{Old_Turkic} \p{Script=Old_Turkic} (Short: \p{Orkh};
  2525. NOT \p{Block=Old_Turkic}) (73)
  2526. \p{Open_Punctuation} \p{General_Category=Open_Punctuation}
  2527. (Short: \p{Ps}) (72)
  2528. X \p{Optical_Character_Recognition} \p{Block=
  2529. Optical_Character_Recognition} (Short:
  2530. \p{InOCR}) (32)
  2531. \p{Oriya} \p{Script=Oriya} (Short: \p{Orya}; NOT
  2532. \p{Block=Oriya}) (90)
  2533. \p{Orkh} \p{Old_Turkic} (= \p{Script=Old_Turkic})
  2534. (NOT \p{Block=Old_Turkic}) (73)
  2535. \p{Orya} \p{Oriya} (= \p{Script=Oriya}) (NOT
  2536. \p{Block=Oriya}) (90)
  2537. \p{Osma} \p{Osmanya} (= \p{Script=Osmanya}) (NOT
  2538. \p{Block=Osmanya}) (40)
  2539. \p{Osmanya} \p{Script=Osmanya} (Short: \p{Osma}; NOT
  2540. \p{Block=Osmanya}) (40)
  2541. \p{Other} \p{General_Category=Other} (Short: \p{C})
  2542. (1_004_134)
  2543. \p{Other_Letter} \p{General_Category=Other_Letter} (Short:
  2544. \p{Lo}) (97_553)
  2545. \p{Other_Number} \p{General_Category=Other_Number} (Short:
  2546. \p{No}) (464)
  2547. \p{Other_Punctuation} \p{General_Category=Other_Punctuation}
  2548. (Short: \p{Po}) (434)
  2549. \p{Other_Symbol} \p{General_Category=Other_Symbol} (Short:
  2550. \p{So}) (4404)
  2551. \p{P} \p{Punct} (= \p{General_Category=
  2552. Punctuation}) (NOT
  2553. \p{General_Punctuation}) (632)
  2554. \p{Paragraph_Separator} \p{General_Category=Paragraph_Separator}
  2555. (Short: \p{Zp}) (1)
  2556. \p{Pat_Syn} \p{Pattern_Syntax} (= \p{Pattern_Syntax=
  2557. Y}) (2760)
  2558. \p{Pat_Syn: *} \p{Pattern_Syntax: *}
  2559. \p{Pat_WS} \p{Pattern_White_Space} (=
  2560. \p{Pattern_White_Space=Y}) (11)
  2561. \p{Pat_WS: *} \p{Pattern_White_Space: *}
  2562. \p{Pattern_Syntax} \p{Pattern_Syntax=Y} (Short: \p{PatSyn})
  2563. (2760)
  2564. \p{Pattern_Syntax: N*} (Short: \p{PatSyn=N}, \P{PatSyn})
  2565. (1_111_352)
  2566. \p{Pattern_Syntax: Y*} (Short: \p{PatSyn=Y}, \p{PatSyn}) (2760)
  2567. \p{Pattern_White_Space} \p{Pattern_White_Space=Y} (Short:
  2568. \p{PatWS}) (11)
  2569. \p{Pattern_White_Space: N*} (Short: \p{PatWS=N}, \P{PatWS})
  2570. (1_114_101)
  2571. \p{Pattern_White_Space: Y*} (Short: \p{PatWS=Y}, \p{PatWS}) (11)
  2572. \p{Pc} \p{Connector_Punctuation} (=
  2573. \p{General_Category=
  2574. Connector_Punctuation}) (10)
  2575. \p{Pd} \p{Dash_Punctuation} (=
  2576. \p{General_Category=Dash_Punctuation})
  2577. (23)
  2578. \p{Pe} \p{Close_Punctuation} (=
  2579. \p{General_Category=Close_Punctuation})
  2580. (71)
  2581. \p{PerlSpace} \s, restricted to ASCII = [ \f\n\r\t] plus
  2582. vertical tab (6)
  2583. \p{PerlWord} \w, restricted to ASCII = [A-Za-z0-9_] (63)
  2584. \p{Pf} \p{Final_Punctuation} (=
  2585. \p{General_Category=Final_Punctuation})
  2586. (10)
  2587. \p{Phag} \p{Phags_Pa} (= \p{Script=Phags_Pa}) (NOT
  2588. \p{Block=Phags_Pa}) (56)
  2589. \p{Phags_Pa} \p{Script=Phags_Pa} (Short: \p{Phag}; NOT
  2590. \p{Block=Phags_Pa}) (56)
  2591. X \p{Phaistos} \p{Phaistos_Disc} (= \p{Block=
  2592. Phaistos_Disc}) (48)
  2593. X \p{Phaistos_Disc} \p{Block=Phaistos_Disc} (Short:
  2594. \p{InPhaistos}) (48)
  2595. \p{Phli} \p{Inscriptional_Pahlavi} (= \p{Script=
  2596. Inscriptional_Pahlavi}) (NOT \p{Block=
  2597. Inscriptional_Pahlavi}) (27)
  2598. \p{Phnx} \p{Phoenician} (= \p{Script=Phoenician})
  2599. (NOT \p{Block=Phoenician}) (29)
  2600. \p{Phoenician} \p{Script=Phoenician} (Short: \p{Phnx};
  2601. NOT \p{Block=Phoenician}) (29)
  2602. X \p{Phonetic_Ext} \p{Phonetic_Extensions} (= \p{Block=
  2603. Phonetic_Extensions}) (128)
  2604. X \p{Phonetic_Ext_Sup} \p{Phonetic_Extensions_Supplement} (=
  2605. \p{Block=
  2606. Phonetic_Extensions_Supplement}) (64)
  2607. X \p{Phonetic_Extensions} \p{Block=Phonetic_Extensions} (Short:
  2608. \p{InPhoneticExt}) (128)
  2609. X \p{Phonetic_Extensions_Supplement} \p{Block=
  2610. Phonetic_Extensions_Supplement} (Short:
  2611. \p{InPhoneticExtSup}) (64)
  2612. \p{Pi} \p{Initial_Punctuation} (=
  2613. \p{General_Category=
  2614. Initial_Punctuation}) (12)
  2615. X \p{Playing_Cards} \p{Block=Playing_Cards} (96)
  2616. \p{Plrd} \p{Miao} (= \p{Script=Miao}) (NOT
  2617. \p{Block=Miao}) (133)
  2618. \p{Po} \p{Other_Punctuation} (=
  2619. \p{General_Category=Other_Punctuation})
  2620. (434)
  2621. \p{PosixAlnum} [A-Za-z0-9] (62)
  2622. \p{PosixAlpha} [A-Za-z] (52)
  2623. \p{PosixBlank} \t and ' ' (2)
  2624. \p{PosixCntrl} ASCII control characters: NUL, SOH, STX,
  2625. ETX, EOT, ENQ, ACK, BEL, BS, HT, LF, VT,
  2626. FF, CR, SO, SI, DLE, DC1, DC2, DC3, DC4,
  2627. NAK, SYN, ETB, CAN, EOM, SUB, ESC, FS,
  2628. GS, RS, US, and DEL (33)
  2629. \p{PosixDigit} [0-9] (10)
  2630. \p{PosixGraph} [-!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~0-9A-Za-
  2631. z] (94)
  2632. \p{PosixLower} [a-z] (/i= PosixAlpha) (26)
  2633. \p{PosixPrint} [- 0-9A-Za-
  2634. z!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~] (95)
  2635. \p{PosixPunct} [-!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~] (32)
  2636. \p{PosixSpace} \t, \n, \cK, \f, \r, and ' '. (\cK is
  2637. vertical tab) (6)
  2638. \p{PosixUpper} [A-Z] (/i= PosixAlpha) (26)
  2639. \p{PosixWord} \p{PerlWord} (63)
  2640. \p{PosixXDigit} \p{ASCII_Hex_Digit=Y} [0-9A-Fa-f] (Short:
  2641. \p{AHex}) (22)
  2642. T \p{Present_In: 1.1} \p{Age=V1_1} (Short: \p{In=1.1}) (Perl
  2643. extension) (33_979)
  2644. T \p{Present_In: 2.0} Code point's usage introduced in version
  2645. 2.0 or earlier (Short: \p{In=2.0}) (Perl
  2646. extension) (178_500)
  2647. T \p{Present_In: 2.1} Code point's usage introduced in version
  2648. 2.1 or earlier (Short: \p{In=2.1}) (Perl
  2649. extension) (178_502)
  2650. T \p{Present_In: 3.0} Code point's usage introduced in version
  2651. 3.0 or earlier (Short: \p{In=3.0}) (Perl
  2652. extension) (188_809)
  2653. T \p{Present_In: 3.1} Code point's usage introduced in version
  2654. 3.1 or earlier (Short: \p{In=3.1}) (Perl
  2655. extension) (233_787)
  2656. T \p{Present_In: 3.2} Code point's usage introduced in version
  2657. 3.2 or earlier (Short: \p{In=3.2}) (Perl
  2658. extension) (234_803)
  2659. T \p{Present_In: 4.0} Code point's usage introduced in version
  2660. 4.0 or earlier (Short: \p{In=4.0}) (Perl
  2661. extension) (236_029)
  2662. T \p{Present_In: 4.1} Code point's usage introduced in version
  2663. 4.1 or earlier (Short: \p{In=4.1}) (Perl
  2664. extension) (237_302)
  2665. T \p{Present_In: 5.0} Code point's usage introduced in version
  2666. 5.0 or earlier (Short: \p{In=5.0}) (Perl
  2667. extension) (238_671)
  2668. T \p{Present_In: 5.1} Code point's usage introduced in version
  2669. 5.1 or earlier (Short: \p{In=5.1}) (Perl
  2670. extension) (240_295)
  2671. T \p{Present_In: 5.2} Code point's usage introduced in version
  2672. 5.2 or earlier (Short: \p{In=5.2}) (Perl
  2673. extension) (246_943)
  2674. T \p{Present_In: 6.0} Code point's usage introduced in version
  2675. 6.0 or earlier (Short: \p{In=6.0}) (Perl
  2676. extension) (249_031)
  2677. T \p{Present_In: 6.1} Code point's usage introduced in version
  2678. 6.1 or earlier (Short: \p{In=6.1}) (Perl
  2679. extension) (249_763)
  2680. T \p{Present_In: 6.2} Code point's usage introduced in version
  2681. 6.2 or earlier (Short: \p{In=6.2}) (Perl
  2682. extension) (249_764)
  2683. \p{Present_In: Unassigned} \p{Age=Unassigned} (Short: \p{In=
  2684. Unassigned}) (Perl extension) (864_348)
  2685. \p{Print} Characters that are graphical plus space
  2686. characters (but no controls) (247_583)
  2687. \p{Private_Use} \p{General_Category=Private_Use} (Short:
  2688. \p{Co}; NOT \p{Private_Use_Area})
  2689. (137_468)
  2690. X \p{Private_Use_Area} \p{Block=Private_Use_Area} (Short:
  2691. \p{InPUA}) (6400)
  2692. \p{Prti} \p{Inscriptional_Parthian} (= \p{Script=
  2693. Inscriptional_Parthian}) (NOT \p{Block=
  2694. Inscriptional_Parthian}) (30)
  2695. \p{Ps} \p{Open_Punctuation} (=
  2696. \p{General_Category=Open_Punctuation})
  2697. (72)
  2698. X \p{PUA} \p{Private_Use_Area} (= \p{Block=
  2699. Private_Use_Area}) (6400)
  2700. \p{Punct} \p{General_Category=Punctuation} (Short:
  2701. \p{P}; NOT \p{General_Punctuation}) (632)
  2702. \p{Punctuation} \p{Punct} (= \p{General_Category=
  2703. Punctuation}) (NOT
  2704. \p{General_Punctuation}) (632)
  2705. \p{Qaac} \p{Coptic} (= \p{Script=Coptic}) (NOT
  2706. \p{Block=Coptic}) (137)
  2707. \p{Qaai} \p{Inherited} (= \p{Script=Inherited})
  2708. (523)
  2709. \p{QMark} \p{Quotation_Mark} (= \p{Quotation_Mark=
  2710. Y}) (29)
  2711. \p{QMark: *} \p{Quotation_Mark: *}
  2712. \p{Quotation_Mark} \p{Quotation_Mark=Y} (Short: \p{QMark})
  2713. (29)
  2714. \p{Quotation_Mark: N*} (Short: \p{QMark=N}, \P{QMark}) (1_114_083)
  2715. \p{Quotation_Mark: Y*} (Short: \p{QMark=Y}, \p{QMark}) (29)
  2716. \p{Radical} \p{Radical=Y} (329)
  2717. \p{Radical: N*} (Single: \P{Radical}) (1_113_783)
  2718. \p{Radical: Y*} (Single: \p{Radical}) (329)
  2719. \p{Rejang} \p{Script=Rejang} (Short: \p{Rjng}; NOT
  2720. \p{Block=Rejang}) (37)
  2721. \p{Rjng} \p{Rejang} (= \p{Script=Rejang}) (NOT
  2722. \p{Block=Rejang}) (37)
  2723. X \p{Rumi} \p{Rumi_Numeral_Symbols} (= \p{Block=
  2724. Rumi_Numeral_Symbols}) (32)
  2725. X \p{Rumi_Numeral_Symbols} \p{Block=Rumi_Numeral_Symbols} (Short:
  2726. \p{InRumi}) (32)
  2727. \p{Runic} \p{Script=Runic} (Short: \p{Runr}; NOT
  2728. \p{Block=Runic}) (78)
  2729. \p{Runr} \p{Runic} (= \p{Script=Runic}) (NOT
  2730. \p{Block=Runic}) (78)
  2731. \p{S} \p{Symbol} (= \p{General_Category=Symbol})
  2732. (5520)
  2733. \p{Samaritan} \p{Script=Samaritan} (Short: \p{Samr}; NOT
  2734. \p{Block=Samaritan}) (61)
  2735. \p{Samr} \p{Samaritan} (= \p{Script=Samaritan})
  2736. (NOT \p{Block=Samaritan}) (61)
  2737. \p{Sarb} \p{Old_South_Arabian} (= \p{Script=
  2738. Old_South_Arabian}) (32)
  2739. \p{Saur} \p{Saurashtra} (= \p{Script=Saurashtra})
  2740. (NOT \p{Block=Saurashtra}) (81)
  2741. \p{Saurashtra} \p{Script=Saurashtra} (Short: \p{Saur};
  2742. NOT \p{Block=Saurashtra}) (81)
  2743. \p{SB: *} \p{Sentence_Break: *}
  2744. \p{Sc} \p{Currency_Symbol} (=
  2745. \p{General_Category=Currency_Symbol})
  2746. (49)
  2747. \p{Sc: *} \p{Script: *}
  2748. \p{Script: Arab} \p{Script=Arabic} (1235)
  2749. \p{Script: Arabic} (Short: \p{Sc=Arab}, \p{Arab}) (1235)
  2750. \p{Script: Armenian} (Short: \p{Sc=Armn}, \p{Armn}) (91)
  2751. \p{Script: Armi} \p{Script=Imperial_Aramaic} (31)
  2752. \p{Script: Armn} \p{Script=Armenian} (91)
  2753. \p{Script: Avestan} (Short: \p{Sc=Avst}, \p{Avst}) (61)
  2754. \p{Script: Avst} \p{Script=Avestan} (61)
  2755. \p{Script: Bali} \p{Script=Balinese} (121)
  2756. \p{Script: Balinese} (Short: \p{Sc=Bali}, \p{Bali}) (121)
  2757. \p{Script: Bamu} \p{Script=Bamum} (657)
  2758. \p{Script: Bamum} (Short: \p{Sc=Bamu}, \p{Bamu}) (657)
  2759. \p{Script: Batak} (Short: \p{Sc=Batk}, \p{Batk}) (56)
  2760. \p{Script: Batk} \p{Script=Batak} (56)
  2761. \p{Script: Beng} \p{Script=Bengali} (92)
  2762. \p{Script: Bengali} (Short: \p{Sc=Beng}, \p{Beng}) (92)
  2763. \p{Script: Bopo} \p{Script=Bopomofo} (70)
  2764. \p{Script: Bopomofo} (Short: \p{Sc=Bopo}, \p{Bopo}) (70)
  2765. \p{Script: Brah} \p{Script=Brahmi} (108)
  2766. \p{Script: Brahmi} (Short: \p{Sc=Brah}, \p{Brah}) (108)
  2767. \p{Script: Brai} \p{Script=Braille} (256)
  2768. \p{Script: Braille} (Short: \p{Sc=Brai}, \p{Brai}) (256)
  2769. \p{Script: Bugi} \p{Script=Buginese} (30)
  2770. \p{Script: Buginese} (Short: \p{Sc=Bugi}, \p{Bugi}) (30)
  2771. \p{Script: Buhd} \p{Script=Buhid} (20)
  2772. \p{Script: Buhid} (Short: \p{Sc=Buhd}, \p{Buhd}) (20)
  2773. \p{Script: Cakm} \p{Script=Chakma} (67)
  2774. \p{Script: Canadian_Aboriginal} (Short: \p{Sc=Cans}, \p{Cans})
  2775. (710)
  2776. \p{Script: Cans} \p{Script=Canadian_Aboriginal} (710)
  2777. \p{Script: Cari} \p{Script=Carian} (49)
  2778. \p{Script: Carian} (Short: \p{Sc=Cari}, \p{Cari}) (49)
  2779. \p{Script: Chakma} (Short: \p{Sc=Cakm}, \p{Cakm}) (67)
  2780. \p{Script: Cham} (Short: \p{Sc=Cham}, \p{Cham}) (83)
  2781. \p{Script: Cher} \p{Script=Cherokee} (85)
  2782. \p{Script: Cherokee} (Short: \p{Sc=Cher}, \p{Cher}) (85)
  2783. \p{Script: Common} (Short: \p{Sc=Zyyy}, \p{Zyyy}) (6413)
  2784. \p{Script: Copt} \p{Script=Coptic} (137)
  2785. \p{Script: Coptic} (Short: \p{Sc=Copt}, \p{Copt}) (137)
  2786. \p{Script: Cprt} \p{Script=Cypriot} (55)
  2787. \p{Script: Cuneiform} (Short: \p{Sc=Xsux}, \p{Xsux}) (982)
  2788. \p{Script: Cypriot} (Short: \p{Sc=Cprt}, \p{Cprt}) (55)
  2789. \p{Script: Cyrillic} (Short: \p{Sc=Cyrl}, \p{Cyrl}) (417)
  2790. \p{Script: Cyrl} \p{Script=Cyrillic} (417)
  2791. \p{Script: Deseret} (Short: \p{Sc=Dsrt}, \p{Dsrt}) (80)
  2792. \p{Script: Deva} \p{Script=Devanagari} (151)
  2793. \p{Script: Devanagari} (Short: \p{Sc=Deva}, \p{Deva}) (151)
  2794. \p{Script: Dsrt} \p{Script=Deseret} (80)
  2795. \p{Script: Egyp} \p{Script=Egyptian_Hieroglyphs} (1071)
  2796. \p{Script: Egyptian_Hieroglyphs} (Short: \p{Sc=Egyp}, \p{Egyp})
  2797. (1071)
  2798. \p{Script: Ethi} \p{Script=Ethiopic} (495)
  2799. \p{Script: Ethiopic} (Short: \p{Sc=Ethi}, \p{Ethi}) (495)
  2800. \p{Script: Geor} \p{Script=Georgian} (127)
  2801. \p{Script: Georgian} (Short: \p{Sc=Geor}, \p{Geor}) (127)
  2802. \p{Script: Glag} \p{Script=Glagolitic} (94)
  2803. \p{Script: Glagolitic} (Short: \p{Sc=Glag}, \p{Glag}) (94)
  2804. \p{Script: Goth} \p{Script=Gothic} (27)
  2805. \p{Script: Gothic} (Short: \p{Sc=Goth}, \p{Goth}) (27)
  2806. \p{Script: Greek} (Short: \p{Sc=Grek}, \p{Grek}) (511)
  2807. \p{Script: Grek} \p{Script=Greek} (511)
  2808. \p{Script: Gujarati} (Short: \p{Sc=Gujr}, \p{Gujr}) (84)
  2809. \p{Script: Gujr} \p{Script=Gujarati} (84)
  2810. \p{Script: Gurmukhi} (Short: \p{Sc=Guru}, \p{Guru}) (79)
  2811. \p{Script: Guru} \p{Script=Gurmukhi} (79)
  2812. \p{Script: Han} (Short: \p{Sc=Han}, \p{Han}) (75_963)
  2813. \p{Script: Hang} \p{Script=Hangul} (11_739)
  2814. \p{Script: Hangul} (Short: \p{Sc=Hang}, \p{Hang}) (11_739)
  2815. \p{Script: Hani} \p{Script=Han} (75_963)
  2816. \p{Script: Hano} \p{Script=Hanunoo} (21)
  2817. \p{Script: Hanunoo} (Short: \p{Sc=Hano}, \p{Hano}) (21)
  2818. \p{Script: Hebr} \p{Script=Hebrew} (133)
  2819. \p{Script: Hebrew} (Short: \p{Sc=Hebr}, \p{Hebr}) (133)
  2820. \p{Script: Hira} \p{Script=Hiragana} (91)
  2821. \p{Script: Hiragana} (Short: \p{Sc=Hira}, \p{Hira}) (91)
  2822. \p{Script: Imperial_Aramaic} (Short: \p{Sc=Armi}, \p{Armi}) (31)
  2823. \p{Script: Inherited} (Short: \p{Sc=Zinh}, \p{Zinh}) (523)
  2824. \p{Script: Inscriptional_Pahlavi} (Short: \p{Sc=Phli}, \p{Phli})
  2825. (27)
  2826. \p{Script: Inscriptional_Parthian} (Short: \p{Sc=Prti}, \p{Prti})
  2827. (30)
  2828. \p{Script: Ital} \p{Script=Old_Italic} (35)
  2829. \p{Script: Java} \p{Script=Javanese} (91)
  2830. \p{Script: Javanese} (Short: \p{Sc=Java}, \p{Java}) (91)
  2831. \p{Script: Kaithi} (Short: \p{Sc=Kthi}, \p{Kthi}) (66)
  2832. \p{Script: Kali} \p{Script=Kayah_Li} (48)
  2833. \p{Script: Kana} \p{Script=Katakana} (300)
  2834. \p{Script: Kannada} (Short: \p{Sc=Knda}, \p{Knda}) (86)
  2835. \p{Script: Katakana} (Short: \p{Sc=Kana}, \p{Kana}) (300)
  2836. \p{Script: Kayah_Li} (Short: \p{Sc=Kali}, \p{Kali}) (48)
  2837. \p{Script: Khar} \p{Script=Kharoshthi} (65)
  2838. \p{Script: Kharoshthi} (Short: \p{Sc=Khar}, \p{Khar}) (65)
  2839. \p{Script: Khmer} (Short: \p{Sc=Khmr}, \p{Khmr}) (146)
  2840. \p{Script: Khmr} \p{Script=Khmer} (146)
  2841. \p{Script: Knda} \p{Script=Kannada} (86)
  2842. \p{Script: Kthi} \p{Script=Kaithi} (66)
  2843. \p{Script: Lana} \p{Script=Tai_Tham} (127)
  2844. \p{Script: Lao} (Short: \p{Sc=Lao}, \p{Lao}) (67)
  2845. \p{Script: Laoo} \p{Script=Lao} (67)
  2846. \p{Script: Latin} (Short: \p{Sc=Latn}, \p{Latn}) (1272)
  2847. \p{Script: Latn} \p{Script=Latin} (1272)
  2848. \p{Script: Lepc} \p{Script=Lepcha} (74)
  2849. \p{Script: Lepcha} (Short: \p{Sc=Lepc}, \p{Lepc}) (74)
  2850. \p{Script: Limb} \p{Script=Limbu} (66)
  2851. \p{Script: Limbu} (Short: \p{Sc=Limb}, \p{Limb}) (66)
  2852. \p{Script: Linb} \p{Script=Linear_B} (211)
  2853. \p{Script: Linear_B} (Short: \p{Sc=Linb}, \p{Linb}) (211)
  2854. \p{Script: Lisu} (Short: \p{Sc=Lisu}, \p{Lisu}) (48)
  2855. \p{Script: Lyci} \p{Script=Lycian} (29)
  2856. \p{Script: Lycian} (Short: \p{Sc=Lyci}, \p{Lyci}) (29)
  2857. \p{Script: Lydi} \p{Script=Lydian} (27)
  2858. \p{Script: Lydian} (Short: \p{Sc=Lydi}, \p{Lydi}) (27)
  2859. \p{Script: Malayalam} (Short: \p{Sc=Mlym}, \p{Mlym}) (98)
  2860. \p{Script: Mand} \p{Script=Mandaic} (29)
  2861. \p{Script: Mandaic} (Short: \p{Sc=Mand}, \p{Mand}) (29)
  2862. \p{Script: Meetei_Mayek} (Short: \p{Sc=Mtei}, \p{Mtei}) (79)
  2863. \p{Script: Merc} \p{Script=Meroitic_Cursive} (26)
  2864. \p{Script: Mero} \p{Script=Meroitic_Hieroglyphs} (32)
  2865. \p{Script: Meroitic_Cursive} (Short: \p{Sc=Merc}, \p{Merc}) (26)
  2866. \p{Script: Meroitic_Hieroglyphs} (Short: \p{Sc=Mero}, \p{Mero})
  2867. (32)
  2868. \p{Script: Miao} (Short: \p{Sc=Miao}, \p{Miao}) (133)
  2869. \p{Script: Mlym} \p{Script=Malayalam} (98)
  2870. \p{Script: Mong} \p{Script=Mongolian} (153)
  2871. \p{Script: Mongolian} (Short: \p{Sc=Mong}, \p{Mong}) (153)
  2872. \p{Script: Mtei} \p{Script=Meetei_Mayek} (79)
  2873. \p{Script: Myanmar} (Short: \p{Sc=Mymr}, \p{Mymr}) (188)
  2874. \p{Script: Mymr} \p{Script=Myanmar} (188)
  2875. \p{Script: New_Tai_Lue} (Short: \p{Sc=Talu}, \p{Talu}) (83)
  2876. \p{Script: Nko} (Short: \p{Sc=Nko}, \p{Nko}) (59)
  2877. \p{Script: Nkoo} \p{Script=Nko} (59)
  2878. \p{Script: Ogam} \p{Script=Ogham} (29)
  2879. \p{Script: Ogham} (Short: \p{Sc=Ogam}, \p{Ogam}) (29)
  2880. \p{Script: Ol_Chiki} (Short: \p{Sc=Olck}, \p{Olck}) (48)
  2881. \p{Script: Olck} \p{Script=Ol_Chiki} (48)
  2882. \p{Script: Old_Italic} (Short: \p{Sc=Ital}, \p{Ital}) (35)
  2883. \p{Script: Old_Persian} (Short: \p{Sc=Xpeo}, \p{Xpeo}) (50)
  2884. \p{Script: Old_South_Arabian} (Short: \p{Sc=Sarb}, \p{Sarb}) (32)
  2885. \p{Script: Old_Turkic} (Short: \p{Sc=Orkh}, \p{Orkh}) (73)
  2886. \p{Script: Oriya} (Short: \p{Sc=Orya}, \p{Orya}) (90)
  2887. \p{Script: Orkh} \p{Script=Old_Turkic} (73)
  2888. \p{Script: Orya} \p{Script=Oriya} (90)
  2889. \p{Script: Osma} \p{Script=Osmanya} (40)
  2890. \p{Script: Osmanya} (Short: \p{Sc=Osma}, \p{Osma}) (40)
  2891. \p{Script: Phag} \p{Script=Phags_Pa} (56)
  2892. \p{Script: Phags_Pa} (Short: \p{Sc=Phag}, \p{Phag}) (56)
  2893. \p{Script: Phli} \p{Script=Inscriptional_Pahlavi} (27)
  2894. \p{Script: Phnx} \p{Script=Phoenician} (29)
  2895. \p{Script: Phoenician} (Short: \p{Sc=Phnx}, \p{Phnx}) (29)
  2896. \p{Script: Plrd} \p{Script=Miao} (133)
  2897. \p{Script: Prti} \p{Script=Inscriptional_Parthian} (30)
  2898. \p{Script: Qaac} \p{Script=Coptic} (137)
  2899. \p{Script: Qaai} \p{Script=Inherited} (523)
  2900. \p{Script: Rejang} (Short: \p{Sc=Rjng}, \p{Rjng}) (37)
  2901. \p{Script: Rjng} \p{Script=Rejang} (37)
  2902. \p{Script: Runic} (Short: \p{Sc=Runr}, \p{Runr}) (78)
  2903. \p{Script: Runr} \p{Script=Runic} (78)
  2904. \p{Script: Samaritan} (Short: \p{Sc=Samr}, \p{Samr}) (61)
  2905. \p{Script: Samr} \p{Script=Samaritan} (61)
  2906. \p{Script: Sarb} \p{Script=Old_South_Arabian} (32)
  2907. \p{Script: Saur} \p{Script=Saurashtra} (81)
  2908. \p{Script: Saurashtra} (Short: \p{Sc=Saur}, \p{Saur}) (81)
  2909. \p{Script: Sharada} (Short: \p{Sc=Shrd}, \p{Shrd}) (83)
  2910. \p{Script: Shavian} (Short: \p{Sc=Shaw}, \p{Shaw}) (48)
  2911. \p{Script: Shaw} \p{Script=Shavian} (48)
  2912. \p{Script: Shrd} \p{Script=Sharada} (83)
  2913. \p{Script: Sinh} \p{Script=Sinhala} (80)
  2914. \p{Script: Sinhala} (Short: \p{Sc=Sinh}, \p{Sinh}) (80)
  2915. \p{Script: Sora} \p{Script=Sora_Sompeng} (35)
  2916. \p{Script: Sora_Sompeng} (Short: \p{Sc=Sora}, \p{Sora}) (35)
  2917. \p{Script: Sund} \p{Script=Sundanese} (72)
  2918. \p{Script: Sundanese} (Short: \p{Sc=Sund}, \p{Sund}) (72)
  2919. \p{Script: Sylo} \p{Script=Syloti_Nagri} (44)
  2920. \p{Script: Syloti_Nagri} (Short: \p{Sc=Sylo}, \p{Sylo}) (44)
  2921. \p{Script: Syrc} \p{Script=Syriac} (77)
  2922. \p{Script: Syriac} (Short: \p{Sc=Syrc}, \p{Syrc}) (77)
  2923. \p{Script: Tagalog} (Short: \p{Sc=Tglg}, \p{Tglg}) (20)
  2924. \p{Script: Tagb} \p{Script=Tagbanwa} (18)
  2925. \p{Script: Tagbanwa} (Short: \p{Sc=Tagb}, \p{Tagb}) (18)
  2926. \p{Script: Tai_Le} (Short: \p{Sc=Tale}, \p{Tale}) (35)
  2927. \p{Script: Tai_Tham} (Short: \p{Sc=Lana}, \p{Lana}) (127)
  2928. \p{Script: Tai_Viet} (Short: \p{Sc=Tavt}, \p{Tavt}) (72)
  2929. \p{Script: Takr} \p{Script=Takri} (66)
  2930. \p{Script: Takri} (Short: \p{Sc=Takr}, \p{Takr}) (66)
  2931. \p{Script: Tale} \p{Script=Tai_Le} (35)
  2932. \p{Script: Talu} \p{Script=New_Tai_Lue} (83)
  2933. \p{Script: Tamil} (Short: \p{Sc=Taml}, \p{Taml}) (72)
  2934. \p{Script: Taml} \p{Script=Tamil} (72)
  2935. \p{Script: Tavt} \p{Script=Tai_Viet} (72)
  2936. \p{Script: Telu} \p{Script=Telugu} (93)
  2937. \p{Script: Telugu} (Short: \p{Sc=Telu}, \p{Telu}) (93)
  2938. \p{Script: Tfng} \p{Script=Tifinagh} (59)
  2939. \p{Script: Tglg} \p{Script=Tagalog} (20)
  2940. \p{Script: Thaa} \p{Script=Thaana} (50)
  2941. \p{Script: Thaana} (Short: \p{Sc=Thaa}, \p{Thaa}) (50)
  2942. \p{Script: Thai} (Short: \p{Sc=Thai}, \p{Thai}) (86)
  2943. \p{Script: Tibetan} (Short: \p{Sc=Tibt}, \p{Tibt}) (207)
  2944. \p{Script: Tibt} \p{Script=Tibetan} (207)
  2945. \p{Script: Tifinagh} (Short: \p{Sc=Tfng}, \p{Tfng}) (59)
  2946. \p{Script: Ugar} \p{Script=Ugaritic} (31)
  2947. \p{Script: Ugaritic} (Short: \p{Sc=Ugar}, \p{Ugar}) (31)
  2948. \p{Script: Unknown} (Short: \p{Sc=Zzzz}, \p{Zzzz}) (1_003_930)
  2949. \p{Script: Vai} (Short: \p{Sc=Vai}, \p{Vai}) (300)
  2950. \p{Script: Vaii} \p{Script=Vai} (300)
  2951. \p{Script: Xpeo} \p{Script=Old_Persian} (50)
  2952. \p{Script: Xsux} \p{Script=Cuneiform} (982)
  2953. \p{Script: Yi} (Short: \p{Sc=Yi}, \p{Yi}) (1220)
  2954. \p{Script: Yiii} \p{Script=Yi} (1220)
  2955. \p{Script: Zinh} \p{Script=Inherited} (523)
  2956. \p{Script: Zyyy} \p{Script=Common} (6413)
  2957. \p{Script: Zzzz} \p{Script=Unknown} (1_003_930)
  2958. \p{Script_Extensions: Arab} \p{Script_Extensions=Arabic} (1262)
  2959. \p{Script_Extensions: Arabic} (Short: \p{Scx=Arab}) (1262)
  2960. \p{Script_Extensions: Armenian} (Short: \p{Scx=Armn}) (92)
  2961. \p{Script_Extensions: Armi} \p{Script_Extensions=Imperial_Aramaic}
  2962. (31)
  2963. \p{Script_Extensions: Armn} \p{Script_Extensions=Armenian} (92)
  2964. \p{Script_Extensions: Avestan} (Short: \p{Scx=Avst}) (61)
  2965. \p{Script_Extensions: Avst} \p{Script_Extensions=Avestan} (61)
  2966. \p{Script_Extensions: Bali} \p{Script_Extensions=Balinese} (121)
  2967. \p{Script_Extensions: Balinese} (Short: \p{Scx=Bali}) (121)
  2968. \p{Script_Extensions: Bamu} \p{Script_Extensions=Bamum} (657)
  2969. \p{Script_Extensions: Bamum} (Short: \p{Scx=Bamu}) (657)
  2970. \p{Script_Extensions: Batak} (Short: \p{Scx=Batk}) (56)
  2971. \p{Script_Extensions: Batk} \p{Script_Extensions=Batak} (56)
  2972. \p{Script_Extensions: Beng} \p{Script_Extensions=Bengali} (94)
  2973. \p{Script_Extensions: Bengali} (Short: \p{Scx=Beng}) (94)
  2974. \p{Script_Extensions: Bopo} \p{Script_Extensions=Bopomofo} (306)
  2975. \p{Script_Extensions: Bopomofo} (Short: \p{Scx=Bopo}) (306)
  2976. \p{Script_Extensions: Brah} \p{Script_Extensions=Brahmi} (108)
  2977. \p{Script_Extensions: Brahmi} (Short: \p{Scx=Brah}) (108)
  2978. \p{Script_Extensions: Brai} \p{Script_Extensions=Braille} (256)
  2979. \p{Script_Extensions: Braille} (Short: \p{Scx=Brai}) (256)
  2980. \p{Script_Extensions: Bugi} \p{Script_Extensions=Buginese} (30)
  2981. \p{Script_Extensions: Buginese} (Short: \p{Scx=Bugi}) (30)
  2982. \p{Script_Extensions: Buhd} \p{Script_Extensions=Buhid} (22)
  2983. \p{Script_Extensions: Buhid} (Short: \p{Scx=Buhd}) (22)
  2984. \p{Script_Extensions: Cakm} \p{Script_Extensions=Chakma} (67)
  2985. \p{Script_Extensions: Canadian_Aboriginal} (Short: \p{Scx=Cans})
  2986. (710)
  2987. \p{Script_Extensions: Cans} \p{Script_Extensions=
  2988. Canadian_Aboriginal} (710)
  2989. \p{Script_Extensions: Cari} \p{Script_Extensions=Carian} (49)
  2990. \p{Script_Extensions: Carian} (Short: \p{Scx=Cari}) (49)
  2991. \p{Script_Extensions: Chakma} (Short: \p{Scx=Cakm}) (67)
  2992. \p{Script_Extensions: Cham} (Short: \p{Scx=Cham}) (83)
  2993. \p{Script_Extensions: Cher} \p{Script_Extensions=Cherokee} (85)
  2994. \p{Script_Extensions: Cherokee} (Short: \p{Scx=Cher}) (85)
  2995. \p{Script_Extensions: Common} (Short: \p{Scx=Zyyy}) (6057)
  2996. \p{Script_Extensions: Copt} \p{Script_Extensions=Coptic} (137)
  2997. \p{Script_Extensions: Coptic} (Short: \p{Scx=Copt}) (137)
  2998. \p{Script_Extensions: Cprt} \p{Script_Extensions=Cypriot} (112)
  2999. \p{Script_Extensions: Cuneiform} (Short: \p{Scx=Xsux}) (982)
  3000. \p{Script_Extensions: Cypriot} (Short: \p{Scx=Cprt}) (112)
  3001. \p{Script_Extensions: Cyrillic} (Short: \p{Scx=Cyrl}) (419)
  3002. \p{Script_Extensions: Cyrl} \p{Script_Extensions=Cyrillic} (419)
  3003. \p{Script_Extensions: Deseret} (Short: \p{Scx=Dsrt}) (80)
  3004. \p{Script_Extensions: Deva} \p{Script_Extensions=Devanagari} (193)
  3005. \p{Script_Extensions: Devanagari} (Short: \p{Scx=Deva}) (193)
  3006. \p{Script_Extensions: Dsrt} \p{Script_Extensions=Deseret} (80)
  3007. \p{Script_Extensions: Egyp} \p{Script_Extensions=
  3008. Egyptian_Hieroglyphs} (1071)
  3009. \p{Script_Extensions: Egyptian_Hieroglyphs} (Short: \p{Scx=Egyp})
  3010. (1071)
  3011. \p{Script_Extensions: Ethi} \p{Script_Extensions=Ethiopic} (495)
  3012. \p{Script_Extensions: Ethiopic} (Short: \p{Scx=Ethi}) (495)
  3013. \p{Script_Extensions: Geor} \p{Script_Extensions=Georgian} (128)
  3014. \p{Script_Extensions: Georgian} (Short: \p{Scx=Geor}) (128)
  3015. \p{Script_Extensions: Glag} \p{Script_Extensions=Glagolitic} (94)
  3016. \p{Script_Extensions: Glagolitic} (Short: \p{Scx=Glag}) (94)
  3017. \p{Script_Extensions: Goth} \p{Script_Extensions=Gothic} (27)
  3018. \p{Script_Extensions: Gothic} (Short: \p{Scx=Goth}) (27)
  3019. \p{Script_Extensions: Greek} (Short: \p{Scx=Grek}) (515)
  3020. \p{Script_Extensions: Grek} \p{Script_Extensions=Greek} (515)
  3021. \p{Script_Extensions: Gujarati} (Short: \p{Scx=Gujr}) (94)
  3022. \p{Script_Extensions: Gujr} \p{Script_Extensions=Gujarati} (94)
  3023. \p{Script_Extensions: Gurmukhi} (Short: \p{Scx=Guru}) (91)
  3024. \p{Script_Extensions: Guru} \p{Script_Extensions=Gurmukhi} (91)
  3025. \p{Script_Extensions: Han} (Short: \p{Scx=Han}) (76_218)
  3026. \p{Script_Extensions: Hang} \p{Script_Extensions=Hangul} (11_971)
  3027. \p{Script_Extensions: Hangul} (Short: \p{Scx=Hang}) (11_971)
  3028. \p{Script_Extensions: Hani} \p{Script_Extensions=Han} (76_218)
  3029. \p{Script_Extensions: Hano} \p{Script_Extensions=Hanunoo} (23)
  3030. \p{Script_Extensions: Hanunoo} (Short: \p{Scx=Hano}) (23)
  3031. \p{Script_Extensions: Hebr} \p{Script_Extensions=Hebrew} (133)
  3032. \p{Script_Extensions: Hebrew} (Short: \p{Scx=Hebr}) (133)
  3033. \p{Script_Extensions: Hira} \p{Script_Extensions=Hiragana} (356)
  3034. \p{Script_Extensions: Hiragana} (Short: \p{Scx=Hira}) (356)
  3035. \p{Script_Extensions: Imperial_Aramaic} (Short: \p{Scx=Armi}) (31)
  3036. \p{Script_Extensions: Inherited} (Short: \p{Scx=Zinh}) (459)
  3037. \p{Script_Extensions: Inscriptional_Pahlavi} (Short: \p{Scx=Phli})
  3038. (27)
  3039. \p{Script_Extensions: Inscriptional_Parthian} (Short: \p{Scx=
  3040. Prti}) (30)
  3041. \p{Script_Extensions: Ital} \p{Script_Extensions=Old_Italic} (35)
  3042. \p{Script_Extensions: Java} \p{Script_Extensions=Javanese} (91)
  3043. \p{Script_Extensions: Javanese} (Short: \p{Scx=Java}) (91)
  3044. \p{Script_Extensions: Kaithi} (Short: \p{Scx=Kthi}) (76)
  3045. \p{Script_Extensions: Kali} \p{Script_Extensions=Kayah_Li} (48)
  3046. \p{Script_Extensions: Kana} \p{Script_Extensions=Katakana} (565)
  3047. \p{Script_Extensions: Kannada} (Short: \p{Scx=Knda}) (86)
  3048. \p{Script_Extensions: Katakana} (Short: \p{Scx=Kana}) (565)
  3049. \p{Script_Extensions: Kayah_Li} (Short: \p{Scx=Kali}) (48)
  3050. \p{Script_Extensions: Khar} \p{Script_Extensions=Kharoshthi} (65)
  3051. \p{Script_Extensions: Kharoshthi} (Short: \p{Scx=Khar}) (65)
  3052. \p{Script_Extensions: Khmer} (Short: \p{Scx=Khmr}) (146)
  3053. \p{Script_Extensions: Khmr} \p{Script_Extensions=Khmer} (146)
  3054. \p{Script_Extensions: Knda} \p{Script_Extensions=Kannada} (86)
  3055. \p{Script_Extensions: Kthi} \p{Script_Extensions=Kaithi} (76)
  3056. \p{Script_Extensions: Lana} \p{Script_Extensions=Tai_Tham} (127)
  3057. \p{Script_Extensions: Lao} (Short: \p{Scx=Lao}) (67)
  3058. \p{Script_Extensions: Laoo} \p{Script_Extensions=Lao} (67)
  3059. \p{Script_Extensions: Latin} (Short: \p{Scx=Latn}) (1289)
  3060. \p{Script_Extensions: Latn} \p{Script_Extensions=Latin} (1289)
  3061. \p{Script_Extensions: Lepc} \p{Script_Extensions=Lepcha} (74)
  3062. \p{Script_Extensions: Lepcha} (Short: \p{Scx=Lepc}) (74)
  3063. \p{Script_Extensions: Limb} \p{Script_Extensions=Limbu} (66)
  3064. \p{Script_Extensions: Limbu} (Short: \p{Scx=Limb}) (66)
  3065. \p{Script_Extensions: Linb} \p{Script_Extensions=Linear_B} (268)
  3066. \p{Script_Extensions: Linear_B} (Short: \p{Scx=Linb}) (268)
  3067. \p{Script_Extensions: Lisu} (Short: \p{Scx=Lisu}) (48)
  3068. \p{Script_Extensions: Lyci} \p{Script_Extensions=Lycian} (29)
  3069. \p{Script_Extensions: Lycian} (Short: \p{Scx=Lyci}) (29)
  3070. \p{Script_Extensions: Lydi} \p{Script_Extensions=Lydian} (27)
  3071. \p{Script_Extensions: Lydian} (Short: \p{Scx=Lydi}) (27)
  3072. \p{Script_Extensions: Malayalam} (Short: \p{Scx=Mlym}) (98)
  3073. \p{Script_Extensions: Mand} \p{Script_Extensions=Mandaic} (30)
  3074. \p{Script_Extensions: Mandaic} (Short: \p{Scx=Mand}) (30)
  3075. \p{Script_Extensions: Meetei_Mayek} (Short: \p{Scx=Mtei}) (79)
  3076. \p{Script_Extensions: Merc} \p{Script_Extensions=Meroitic_Cursive}
  3077. (26)
  3078. \p{Script_Extensions: Mero} \p{Script_Extensions=
  3079. Meroitic_Hieroglyphs} (32)
  3080. \p{Script_Extensions: Meroitic_Cursive} (Short: \p{Scx=Merc}) (26)
  3081. \p{Script_Extensions: Meroitic_Hieroglyphs} (Short: \p{Scx=Mero})
  3082. (32)
  3083. \p{Script_Extensions: Miao} (Short: \p{Scx=Miao}) (133)
  3084. \p{Script_Extensions: Mlym} \p{Script_Extensions=Malayalam} (98)
  3085. \p{Script_Extensions: Mong} \p{Script_Extensions=Mongolian} (156)
  3086. \p{Script_Extensions: Mongolian} (Short: \p{Scx=Mong}) (156)
  3087. \p{Script_Extensions: Mtei} \p{Script_Extensions=Meetei_Mayek} (79)
  3088. \p{Script_Extensions: Myanmar} (Short: \p{Scx=Mymr}) (188)
  3089. \p{Script_Extensions: Mymr} \p{Script_Extensions=Myanmar} (188)
  3090. \p{Script_Extensions: New_Tai_Lue} (Short: \p{Scx=Talu}) (83)
  3091. \p{Script_Extensions: Nko} (Short: \p{Scx=Nko}) (59)
  3092. \p{Script_Extensions: Nkoo} \p{Script_Extensions=Nko} (59)
  3093. \p{Script_Extensions: Ogam} \p{Script_Extensions=Ogham} (29)
  3094. \p{Script_Extensions: Ogham} (Short: \p{Scx=Ogam}) (29)
  3095. \p{Script_Extensions: Ol_Chiki} (Short: \p{Scx=Olck}) (48)
  3096. \p{Script_Extensions: Olck} \p{Script_Extensions=Ol_Chiki} (48)
  3097. \p{Script_Extensions: Old_Italic} (Short: \p{Scx=Ital}) (35)
  3098. \p{Script_Extensions: Old_Persian} (Short: \p{Scx=Xpeo}) (50)
  3099. \p{Script_Extensions: Old_South_Arabian} (Short: \p{Scx=Sarb}) (32)
  3100. \p{Script_Extensions: Old_Turkic} (Short: \p{Scx=Orkh}) (73)
  3101. \p{Script_Extensions: Oriya} (Short: \p{Scx=Orya}) (92)
  3102. \p{Script_Extensions: Orkh} \p{Script_Extensions=Old_Turkic} (73)
  3103. \p{Script_Extensions: Orya} \p{Script_Extensions=Oriya} (92)
  3104. \p{Script_Extensions: Osma} \p{Script_Extensions=Osmanya} (40)
  3105. \p{Script_Extensions: Osmanya} (Short: \p{Scx=Osma}) (40)
  3106. \p{Script_Extensions: Phag} \p{Script_Extensions=Phags_Pa} (59)
  3107. \p{Script_Extensions: Phags_Pa} (Short: \p{Scx=Phag}) (59)
  3108. \p{Script_Extensions: Phli} \p{Script_Extensions=
  3109. Inscriptional_Pahlavi} (27)
  3110. \p{Script_Extensions: Phnx} \p{Script_Extensions=Phoenician} (29)
  3111. \p{Script_Extensions: Phoenician} (Short: \p{Scx=Phnx}) (29)
  3112. \p{Script_Extensions: Plrd} \p{Script_Extensions=Miao} (133)
  3113. \p{Script_Extensions: Prti} \p{Script_Extensions=
  3114. Inscriptional_Parthian} (30)
  3115. \p{Script_Extensions: Qaac} \p{Script_Extensions=Coptic} (137)
  3116. \p{Script_Extensions: Qaai} \p{Script_Extensions=Inherited} (459)
  3117. \p{Script_Extensions: Rejang} (Short: \p{Scx=Rjng}) (37)
  3118. \p{Script_Extensions: Rjng} \p{Script_Extensions=Rejang} (37)
  3119. \p{Script_Extensions: Runic} (Short: \p{Scx=Runr}) (78)
  3120. \p{Script_Extensions: Runr} \p{Script_Extensions=Runic} (78)
  3121. \p{Script_Extensions: Samaritan} (Short: \p{Scx=Samr}) (61)
  3122. \p{Script_Extensions: Samr} \p{Script_Extensions=Samaritan} (61)
  3123. \p{Script_Extensions: Sarb} \p{Script_Extensions=
  3124. Old_South_Arabian} (32)
  3125. \p{Script_Extensions: Saur} \p{Script_Extensions=Saurashtra} (81)
  3126. \p{Script_Extensions: Saurashtra} (Short: \p{Scx=Saur}) (81)
  3127. \p{Script_Extensions: Sharada} (Short: \p{Scx=Shrd}) (83)
  3128. \p{Script_Extensions: Shavian} (Short: \p{Scx=Shaw}) (48)
  3129. \p{Script_Extensions: Shaw} \p{Script_Extensions=Shavian} (48)
  3130. \p{Script_Extensions: Shrd} \p{Script_Extensions=Sharada} (83)
  3131. \p{Script_Extensions: Sinh} \p{Script_Extensions=Sinhala} (80)
  3132. \p{Script_Extensions: Sinhala} (Short: \p{Scx=Sinh}) (80)
  3133. \p{Script_Extensions: Sora} \p{Script_Extensions=Sora_Sompeng} (35)
  3134. \p{Script_Extensions: Sora_Sompeng} (Short: \p{Scx=Sora}) (35)
  3135. \p{Script_Extensions: Sund} \p{Script_Extensions=Sundanese} (72)
  3136. \p{Script_Extensions: Sundanese} (Short: \p{Scx=Sund}) (72)
  3137. \p{Script_Extensions: Sylo} \p{Script_Extensions=Syloti_Nagri} (44)
  3138. \p{Script_Extensions: Syloti_Nagri} (Short: \p{Scx=Sylo}) (44)
  3139. \p{Script_Extensions: Syrc} \p{Script_Extensions=Syriac} (93)
  3140. \p{Script_Extensions: Syriac} (Short: \p{Scx=Syrc}) (93)
  3141. \p{Script_Extensions: Tagalog} (Short: \p{Scx=Tglg}) (22)
  3142. \p{Script_Extensions: Tagb} \p{Script_Extensions=Tagbanwa} (20)
  3143. \p{Script_Extensions: Tagbanwa} (Short: \p{Scx=Tagb}) (20)
  3144. \p{Script_Extensions: Tai_Le} (Short: \p{Scx=Tale}) (35)
  3145. \p{Script_Extensions: Tai_Tham} (Short: \p{Scx=Lana}) (127)
  3146. \p{Script_Extensions: Tai_Viet} (Short: \p{Scx=Tavt}) (72)
  3147. \p{Script_Extensions: Takr} \p{Script_Extensions=Takri} (78)
  3148. \p{Script_Extensions: Takri} (Short: \p{Scx=Takr}) (78)
  3149. \p{Script_Extensions: Tale} \p{Script_Extensions=Tai_Le} (35)
  3150. \p{Script_Extensions: Talu} \p{Script_Extensions=New_Tai_Lue} (83)
  3151. \p{Script_Extensions: Tamil} (Short: \p{Scx=Taml}) (72)
  3152. \p{Script_Extensions: Taml} \p{Script_Extensions=Tamil} (72)
  3153. \p{Script_Extensions: Tavt} \p{Script_Extensions=Tai_Viet} (72)
  3154. \p{Script_Extensions: Telu} \p{Script_Extensions=Telugu} (93)
  3155. \p{Script_Extensions: Telugu} (Short: \p{Scx=Telu}) (93)
  3156. \p{Script_Extensions: Tfng} \p{Script_Extensions=Tifinagh} (59)
  3157. \p{Script_Extensions: Tglg} \p{Script_Extensions=Tagalog} (22)
  3158. \p{Script_Extensions: Thaa} \p{Script_Extensions=Thaana} (65)
  3159. \p{Script_Extensions: Thaana} (Short: \p{Scx=Thaa}) (65)
  3160. \p{Script_Extensions: Thai} (Short: \p{Scx=Thai}) (86)
  3161. \p{Script_Extensions: Tibetan} (Short: \p{Scx=Tibt}) (207)
  3162. \p{Script_Extensions: Tibt} \p{Script_Extensions=Tibetan} (207)
  3163. \p{Script_Extensions: Tifinagh} (Short: \p{Scx=Tfng}) (59)
  3164. \p{Script_Extensions: Ugar} \p{Script_Extensions=Ugaritic} (31)
  3165. \p{Script_Extensions: Ugaritic} (Short: \p{Scx=Ugar}) (31)
  3166. \p{Script_Extensions: Unknown} (Short: \p{Scx=Zzzz}) (1_003_930)
  3167. \p{Script_Extensions: Vai} (Short: \p{Scx=Vai}) (300)
  3168. \p{Script_Extensions: Vaii} \p{Script_Extensions=Vai} (300)
  3169. \p{Script_Extensions: Xpeo} \p{Script_Extensions=Old_Persian} (50)
  3170. \p{Script_Extensions: Xsux} \p{Script_Extensions=Cuneiform} (982)
  3171. \p{Script_Extensions: Yi} (Short: \p{Scx=Yi}) (1246)
  3172. \p{Script_Extensions: Yiii} \p{Script_Extensions=Yi} (1246)
  3173. \p{Script_Extensions: Zinh} \p{Script_Extensions=Inherited} (459)
  3174. \p{Script_Extensions: Zyyy} \p{Script_Extensions=Common} (6057)
  3175. \p{Script_Extensions: Zzzz} \p{Script_Extensions=Unknown}
  3176. (1_003_930)
  3177. \p{Scx: *} \p{Script_Extensions: *}
  3178. \p{SD} \p{Soft_Dotted} (= \p{Soft_Dotted=Y}) (46)
  3179. \p{SD: *} \p{Soft_Dotted: *}
  3180. \p{Sentence_Break: AT} \p{Sentence_Break=ATerm} (4)
  3181. \p{Sentence_Break: ATerm} (Short: \p{SB=AT}) (4)
  3182. \p{Sentence_Break: CL} \p{Sentence_Break=Close} (177)
  3183. \p{Sentence_Break: Close} (Short: \p{SB=CL}) (177)
  3184. \p{Sentence_Break: CR} (Short: \p{SB=CR}) (1)
  3185. \p{Sentence_Break: EX} \p{Sentence_Break=Extend} (1649)
  3186. \p{Sentence_Break: Extend} (Short: \p{SB=EX}) (1649)
  3187. \p{Sentence_Break: FO} \p{Sentence_Break=Format} (137)
  3188. \p{Sentence_Break: Format} (Short: \p{SB=FO}) (137)
  3189. \p{Sentence_Break: LE} \p{Sentence_Break=OLetter} (97_841)
  3190. \p{Sentence_Break: LF} (Short: \p{SB=LF}) (1)
  3191. \p{Sentence_Break: LO} \p{Sentence_Break=Lower} (1933)
  3192. \p{Sentence_Break: Lower} (Short: \p{SB=LO}) (1933)
  3193. \p{Sentence_Break: NU} \p{Sentence_Break=Numeric} (452)
  3194. \p{Sentence_Break: Numeric} (Short: \p{SB=NU}) (452)
  3195. \p{Sentence_Break: OLetter} (Short: \p{SB=LE}) (97_841)
  3196. \p{Sentence_Break: Other} (Short: \p{SB=XX}) (1_010_273)
  3197. \p{Sentence_Break: SC} \p{Sentence_Break=SContinue} (26)
  3198. \p{Sentence_Break: SContinue} (Short: \p{SB=SC}) (26)
  3199. \p{Sentence_Break: SE} \p{Sentence_Break=Sep} (3)
  3200. \p{Sentence_Break: Sep} (Short: \p{SB=SE}) (3)
  3201. \p{Sentence_Break: Sp} (Short: \p{SB=Sp}) (21)
  3202. \p{Sentence_Break: ST} \p{Sentence_Break=STerm} (80)
  3203. \p{Sentence_Break: STerm} (Short: \p{SB=ST}) (80)
  3204. \p{Sentence_Break: UP} \p{Sentence_Break=Upper} (1514)
  3205. \p{Sentence_Break: Upper} (Short: \p{SB=UP}) (1514)
  3206. \p{Sentence_Break: XX} \p{Sentence_Break=Other} (1_010_273)
  3207. \p{Separator} \p{General_Category=Separator} (Short:
  3208. \p{Z}) (20)
  3209. \p{Sharada} \p{Script=Sharada} (Short: \p{Shrd}; NOT
  3210. \p{Block=Sharada}) (83)
  3211. \p{Shavian} \p{Script=Shavian} (Short: \p{Shaw}) (48)
  3212. \p{Shaw} \p{Shavian} (= \p{Script=Shavian}) (48)
  3213. \p{Shrd} \p{Sharada} (= \p{Script=Sharada}) (NOT
  3214. \p{Block=Sharada}) (83)
  3215. \p{Sinh} \p{Sinhala} (= \p{Script=Sinhala}) (NOT
  3216. \p{Block=Sinhala}) (80)
  3217. \p{Sinhala} \p{Script=Sinhala} (Short: \p{Sinh}; NOT
  3218. \p{Block=Sinhala}) (80)
  3219. \p{Sk} \p{Modifier_Symbol} (=
  3220. \p{General_Category=Modifier_Symbol})
  3221. (115)
  3222. \p{Sm} \p{Math_Symbol} (= \p{General_Category=
  3223. Math_Symbol}) (952)
  3224. X \p{Small_Form_Variants} \p{Block=Small_Form_Variants} (Short:
  3225. \p{InSmallForms}) (32)
  3226. X \p{Small_Forms} \p{Small_Form_Variants} (= \p{Block=
  3227. Small_Form_Variants}) (32)
  3228. \p{So} \p{Other_Symbol} (= \p{General_Category=
  3229. Other_Symbol}) (4404)
  3230. \p{Soft_Dotted} \p{Soft_Dotted=Y} (Short: \p{SD}) (46)
  3231. \p{Soft_Dotted: N*} (Short: \p{SD=N}, \P{SD}) (1_114_066)
  3232. \p{Soft_Dotted: Y*} (Short: \p{SD=Y}, \p{SD}) (46)
  3233. \p{Sora} \p{Sora_Sompeng} (= \p{Script=
  3234. Sora_Sompeng}) (NOT \p{Block=
  3235. Sora_Sompeng}) (35)
  3236. \p{Sora_Sompeng} \p{Script=Sora_Sompeng} (Short: \p{Sora};
  3237. NOT \p{Block=Sora_Sompeng}) (35)
  3238. \p{Space} \p{White_Space=Y} \s including beyond
  3239. ASCII and vertical tab (26)
  3240. \p{Space: *} \p{White_Space: *}
  3241. \p{Space_Separator} \p{General_Category=Space_Separator}
  3242. (Short: \p{Zs}) (18)
  3243. \p{SpacePerl} \p{XPerlSpace} (26)
  3244. \p{Spacing_Mark} \p{General_Category=Spacing_Mark} (Short:
  3245. \p{Mc}) (353)
  3246. X \p{Spacing_Modifier_Letters} \p{Block=Spacing_Modifier_Letters}
  3247. (Short: \p{InModifierLetters}) (80)
  3248. X \p{Specials} \p{Block=Specials} (16)
  3249. \p{STerm} \p{STerm=Y} (83)
  3250. \p{STerm: N*} (Single: \P{STerm}) (1_114_029)
  3251. \p{STerm: Y*} (Single: \p{STerm}) (83)
  3252. \p{Sund} \p{Sundanese} (= \p{Script=Sundanese})
  3253. (NOT \p{Block=Sundanese}) (72)
  3254. \p{Sundanese} \p{Script=Sundanese} (Short: \p{Sund}; NOT
  3255. \p{Block=Sundanese}) (72)
  3256. X \p{Sundanese_Sup} \p{Sundanese_Supplement} (= \p{Block=
  3257. Sundanese_Supplement}) (16)
  3258. X \p{Sundanese_Supplement} \p{Block=Sundanese_Supplement} (Short:
  3259. \p{InSundaneseSup}) (16)
  3260. X \p{Sup_Arrows_A} \p{Supplemental_Arrows_A} (= \p{Block=
  3261. Supplemental_Arrows_A}) (16)
  3262. X \p{Sup_Arrows_B} \p{Supplemental_Arrows_B} (= \p{Block=
  3263. Supplemental_Arrows_B}) (128)
  3264. X \p{Sup_Math_Operators} \p{Supplemental_Mathematical_Operators} (=
  3265. \p{Block=
  3266. Supplemental_Mathematical_Operators})
  3267. (256)
  3268. X \p{Sup_PUA_A} \p{Supplementary_Private_Use_Area_A} (=
  3269. \p{Block=
  3270. Supplementary_Private_Use_Area_A})
  3271. (65_536)
  3272. X \p{Sup_PUA_B} \p{Supplementary_Private_Use_Area_B} (=
  3273. \p{Block=
  3274. Supplementary_Private_Use_Area_B})
  3275. (65_536)
  3276. X \p{Sup_Punctuation} \p{Supplemental_Punctuation} (= \p{Block=
  3277. Supplemental_Punctuation}) (128)
  3278. X \p{Super_And_Sub} \p{Superscripts_And_Subscripts} (=
  3279. \p{Block=Superscripts_And_Subscripts})
  3280. (48)
  3281. X \p{Superscripts_And_Subscripts} \p{Block=
  3282. Superscripts_And_Subscripts} (Short:
  3283. \p{InSuperAndSub}) (48)
  3284. X \p{Supplemental_Arrows_A} \p{Block=Supplemental_Arrows_A} (Short:
  3285. \p{InSupArrowsA}) (16)
  3286. X \p{Supplemental_Arrows_B} \p{Block=Supplemental_Arrows_B} (Short:
  3287. \p{InSupArrowsB}) (128)
  3288. X \p{Supplemental_Mathematical_Operators} \p{Block=
  3289. Supplemental_Mathematical_Operators}
  3290. (Short: \p{InSupMathOperators}) (256)
  3291. X \p{Supplemental_Punctuation} \p{Block=Supplemental_Punctuation}
  3292. (Short: \p{InSupPunctuation}) (128)
  3293. X \p{Supplementary_Private_Use_Area_A} \p{Block=
  3294. Supplementary_Private_Use_Area_A}
  3295. (Short: \p{InSupPUAA}) (65_536)
  3296. X \p{Supplementary_Private_Use_Area_B} \p{Block=
  3297. Supplementary_Private_Use_Area_B}
  3298. (Short: \p{InSupPUAB}) (65_536)
  3299. \p{Surrogate} \p{General_Category=Surrogate} (Short:
  3300. \p{Cs}) (2048)
  3301. \p{Sylo} \p{Syloti_Nagri} (= \p{Script=
  3302. Syloti_Nagri}) (NOT \p{Block=
  3303. Syloti_Nagri}) (44)
  3304. \p{Syloti_Nagri} \p{Script=Syloti_Nagri} (Short: \p{Sylo};
  3305. NOT \p{Block=Syloti_Nagri}) (44)
  3306. \p{Symbol} \p{General_Category=Symbol} (Short: \p{S})
  3307. (5520)
  3308. \p{Syrc} \p{Syriac} (= \p{Script=Syriac}) (NOT
  3309. \p{Block=Syriac}) (77)
  3310. \p{Syriac} \p{Script=Syriac} (Short: \p{Syrc}; NOT
  3311. \p{Block=Syriac}) (77)
  3312. \p{Tagalog} \p{Script=Tagalog} (Short: \p{Tglg}; NOT
  3313. \p{Block=Tagalog}) (20)
  3314. \p{Tagb} \p{Tagbanwa} (= \p{Script=Tagbanwa}) (NOT
  3315. \p{Block=Tagbanwa}) (18)
  3316. \p{Tagbanwa} \p{Script=Tagbanwa} (Short: \p{Tagb}; NOT
  3317. \p{Block=Tagbanwa}) (18)
  3318. X \p{Tags} \p{Block=Tags} (128)
  3319. \p{Tai_Le} \p{Script=Tai_Le} (Short: \p{Tale}; NOT
  3320. \p{Block=Tai_Le}) (35)
  3321. \p{Tai_Tham} \p{Script=Tai_Tham} (Short: \p{Lana}; NOT
  3322. \p{Block=Tai_Tham}) (127)
  3323. \p{Tai_Viet} \p{Script=Tai_Viet} (Short: \p{Tavt}; NOT
  3324. \p{Block=Tai_Viet}) (72)
  3325. X \p{Tai_Xuan_Jing} \p{Tai_Xuan_Jing_Symbols} (= \p{Block=
  3326. Tai_Xuan_Jing_Symbols}) (96)
  3327. X \p{Tai_Xuan_Jing_Symbols} \p{Block=Tai_Xuan_Jing_Symbols} (Short:
  3328. \p{InTaiXuanJing}) (96)
  3329. \p{Takr} \p{Takri} (= \p{Script=Takri}) (NOT
  3330. \p{Block=Takri}) (66)
  3331. \p{Takri} \p{Script=Takri} (Short: \p{Takr}; NOT
  3332. \p{Block=Takri}) (66)
  3333. \p{Tale} \p{Tai_Le} (= \p{Script=Tai_Le}) (NOT
  3334. \p{Block=Tai_Le}) (35)
  3335. \p{Talu} \p{New_Tai_Lue} (= \p{Script=New_Tai_Lue})
  3336. (NOT \p{Block=New_Tai_Lue}) (83)
  3337. \p{Tamil} \p{Script=Tamil} (Short: \p{Taml}; NOT
  3338. \p{Block=Tamil}) (72)
  3339. \p{Taml} \p{Tamil} (= \p{Script=Tamil}) (NOT
  3340. \p{Block=Tamil}) (72)
  3341. \p{Tavt} \p{Tai_Viet} (= \p{Script=Tai_Viet}) (NOT
  3342. \p{Block=Tai_Viet}) (72)
  3343. \p{Telu} \p{Telugu} (= \p{Script=Telugu}) (NOT
  3344. \p{Block=Telugu}) (93)
  3345. \p{Telugu} \p{Script=Telugu} (Short: \p{Telu}; NOT
  3346. \p{Block=Telugu}) (93)
  3347. \p{Term} \p{Terminal_Punctuation} (=
  3348. \p{Terminal_Punctuation=Y}) (176)
  3349. \p{Term: *} \p{Terminal_Punctuation: *}
  3350. \p{Terminal_Punctuation} \p{Terminal_Punctuation=Y} (Short:
  3351. \p{Term}) (176)
  3352. \p{Terminal_Punctuation: N*} (Short: \p{Term=N}, \P{Term})
  3353. (1_113_936)
  3354. \p{Terminal_Punctuation: Y*} (Short: \p{Term=Y}, \p{Term}) (176)
  3355. \p{Tfng} \p{Tifinagh} (= \p{Script=Tifinagh}) (NOT
  3356. \p{Block=Tifinagh}) (59)
  3357. \p{Tglg} \p{Tagalog} (= \p{Script=Tagalog}) (NOT
  3358. \p{Block=Tagalog}) (20)
  3359. \p{Thaa} \p{Thaana} (= \p{Script=Thaana}) (NOT
  3360. \p{Block=Thaana}) (50)
  3361. \p{Thaana} \p{Script=Thaana} (Short: \p{Thaa}; NOT
  3362. \p{Block=Thaana}) (50)
  3363. \p{Thai} \p{Script=Thai} (NOT \p{Block=Thai}) (86)
  3364. \p{Tibetan} \p{Script=Tibetan} (Short: \p{Tibt}; NOT
  3365. \p{Block=Tibetan}) (207)
  3366. \p{Tibt} \p{Tibetan} (= \p{Script=Tibetan}) (NOT
  3367. \p{Block=Tibetan}) (207)
  3368. \p{Tifinagh} \p{Script=Tifinagh} (Short: \p{Tfng}; NOT
  3369. \p{Block=Tifinagh}) (59)
  3370. \p{Title} \p{Titlecase} (/i= Cased=Yes) (31)
  3371. \p{Titlecase} (= \p{Gc=Lt}) (Short: \p{Title}; /i=
  3372. Cased=Yes) (31)
  3373. \p{Titlecase_Letter} \p{General_Category=Titlecase_Letter}
  3374. (Short: \p{Lt}; /i= General_Category=
  3375. Cased_Letter) (31)
  3376. X \p{Transport_And_Map} \p{Transport_And_Map_Symbols} (= \p{Block=
  3377. Transport_And_Map_Symbols}) (128)
  3378. X \p{Transport_And_Map_Symbols} \p{Block=Transport_And_Map_Symbols}
  3379. (Short: \p{InTransportAndMap}) (128)
  3380. X \p{UCAS} \p{Unified_Canadian_Aboriginal_Syllabics}
  3381. (= \p{Block=
  3382. Unified_Canadian_Aboriginal_Syllabics})
  3383. (640)
  3384. X \p{UCAS_Ext} \p{Unified_Canadian_Aboriginal_Syllabics_-
  3385. Extended} (= \p{Block=
  3386. Unified_Canadian_Aboriginal_Syllabics_-
  3387. Extended}) (80)
  3388. \p{Ugar} \p{Ugaritic} (= \p{Script=Ugaritic}) (NOT
  3389. \p{Block=Ugaritic}) (31)
  3390. \p{Ugaritic} \p{Script=Ugaritic} (Short: \p{Ugar}; NOT
  3391. \p{Block=Ugaritic}) (31)
  3392. \p{UIdeo} \p{Unified_Ideograph} (=
  3393. \p{Unified_Ideograph=Y}) (74_617)
  3394. \p{UIdeo: *} \p{Unified_Ideograph: *}
  3395. \p{Unassigned} \p{General_Category=Unassigned} (Short:
  3396. \p{Cn}) (864_414)
  3397. X \p{Unified_Canadian_Aboriginal_Syllabics} \p{Block=
  3398. Unified_Canadian_Aboriginal_Syllabics}
  3399. (Short: \p{InUCAS}) (640)
  3400. X \p{Unified_Canadian_Aboriginal_Syllabics_Extended} \p{Block=
  3401. Unified_Canadian_Aboriginal_Syllabics_-
  3402. Extended} (Short: \p{InUCASExt}) (80)
  3403. \p{Unified_Ideograph} \p{Unified_Ideograph=Y} (Short: \p{UIdeo})
  3404. (74_617)
  3405. \p{Unified_Ideograph: N*} (Short: \p{UIdeo=N}, \P{UIdeo})
  3406. (1_039_495)
  3407. \p{Unified_Ideograph: Y*} (Short: \p{UIdeo=Y}, \p{UIdeo}) (74_617)
  3408. \p{Unknown} \p{Script=Unknown} (Short: \p{Zzzz})
  3409. (1_003_930)
  3410. \p{Upper} \p{Uppercase=Y} (/i= Cased=Yes) (1483)
  3411. \p{Upper: *} \p{Uppercase: *}
  3412. \p{Uppercase} \p{Upper} (= \p{Uppercase=Y}) (/i= Cased=
  3413. Yes) (1483)
  3414. \p{Uppercase: N*} (Short: \p{Upper=N}, \P{Upper}; /i= Cased=
  3415. No) (1_112_629)
  3416. \p{Uppercase: Y*} (Short: \p{Upper=Y}, \p{Upper}; /i= Cased=
  3417. Yes) (1483)
  3418. \p{Uppercase_Letter} \p{General_Category=Uppercase_Letter}
  3419. (Short: \p{Lu}; /i= General_Category=
  3420. Cased_Letter) (1441)
  3421. \p{Vai} \p{Script=Vai} (NOT \p{Block=Vai}) (300)
  3422. \p{Vaii} \p{Vai} (= \p{Script=Vai}) (NOT \p{Block=
  3423. Vai}) (300)
  3424. \p{Variation_Selector} \p{Variation_Selector=Y} (Short: \p{VS};
  3425. NOT \p{Variation_Selectors}) (259)
  3426. \p{Variation_Selector: N*} (Short: \p{VS=N}, \P{VS}) (1_113_853)
  3427. \p{Variation_Selector: Y*} (Short: \p{VS=Y}, \p{VS}) (259)
  3428. X \p{Variation_Selectors} \p{Block=Variation_Selectors} (Short:
  3429. \p{InVS}) (16)
  3430. X \p{Variation_Selectors_Supplement} \p{Block=
  3431. Variation_Selectors_Supplement} (Short:
  3432. \p{InVSSup}) (240)
  3433. X \p{Vedic_Ext} \p{Vedic_Extensions} (= \p{Block=
  3434. Vedic_Extensions}) (48)
  3435. X \p{Vedic_Extensions} \p{Block=Vedic_Extensions} (Short:
  3436. \p{InVedicExt}) (48)
  3437. X \p{Vertical_Forms} \p{Block=Vertical_Forms} (16)
  3438. \p{VertSpace} \v (7)
  3439. \p{VS} \p{Variation_Selector} (=
  3440. \p{Variation_Selector=Y}) (NOT
  3441. \p{Variation_Selectors}) (259)
  3442. \p{VS: *} \p{Variation_Selector: *}
  3443. X \p{VS_Sup} \p{Variation_Selectors_Supplement} (=
  3444. \p{Block=
  3445. Variation_Selectors_Supplement}) (240)
  3446. \p{WB: *} \p{Word_Break: *}
  3447. \p{White_Space} \p{White_Space=Y} (Short: \p{WSpace}) (26)
  3448. \p{White_Space: N*} (Short: \p{Space=N}, \P{WSpace})
  3449. (1_114_086)
  3450. \p{White_Space: Y*} (Short: \p{Space=Y}, \p{WSpace}) (26)
  3451. \p{Word} \w, including beyond ASCII; = \p{Alnum} +
  3452. \pM + \p{Pc} (103_406)
  3453. \p{Word_Break: ALetter} (Short: \p{WB=LE}) (24_941)
  3454. \p{Word_Break: CR} (Short: \p{WB=CR}) (1)
  3455. \p{Word_Break: EX} \p{Word_Break=ExtendNumLet} (10)
  3456. \p{Word_Break: Extend} (Short: \p{WB=Extend}) (1649)
  3457. \p{Word_Break: ExtendNumLet} (Short: \p{WB=EX}) (10)
  3458. \p{Word_Break: FO} \p{Word_Break=Format} (136)
  3459. \p{Word_Break: Format} (Short: \p{WB=FO}) (136)
  3460. \p{Word_Break: KA} \p{Word_Break=Katakana} (310)
  3461. \p{Word_Break: Katakana} (Short: \p{WB=KA}) (310)
  3462. \p{Word_Break: LE} \p{Word_Break=ALetter} (24_941)
  3463. \p{Word_Break: LF} (Short: \p{WB=LF}) (1)
  3464. \p{Word_Break: MB} \p{Word_Break=MidNumLet} (8)
  3465. \p{Word_Break: MidLetter} (Short: \p{WB=ML}) (8)
  3466. \p{Word_Break: MidNum} (Short: \p{WB=MN}) (15)
  3467. \p{Word_Break: MidNumLet} (Short: \p{WB=MB}) (8)
  3468. \p{Word_Break: ML} \p{Word_Break=MidLetter} (8)
  3469. \p{Word_Break: MN} \p{Word_Break=MidNum} (15)
  3470. \p{Word_Break: Newline} (Short: \p{WB=NL}) (5)
  3471. \p{Word_Break: NL} \p{Word_Break=Newline} (5)
  3472. \p{Word_Break: NU} \p{Word_Break=Numeric} (451)
  3473. \p{Word_Break: Numeric} (Short: \p{WB=NU}) (451)
  3474. \p{Word_Break: Other} (Short: \p{WB=XX}) (1_086_551)
  3475. \p{Word_Break: Regional_Indicator} (Short: \p{WB=RI}) (26)
  3476. \p{Word_Break: RI} \p{Word_Break=Regional_Indicator} (26)
  3477. \p{Word_Break: XX} \p{Word_Break=Other} (1_086_551)
  3478. \p{WSpace} \p{White_Space} (= \p{White_Space=Y}) (26)
  3479. \p{WSpace: *} \p{White_Space: *}
  3480. \p{XDigit} \p{Hex_Digit=Y} (Short: \p{Hex}) (44)
  3481. \p{XID_Continue} \p{XID_Continue=Y} (Short: \p{XIDC})
  3482. (103_336)
  3483. \p{XID_Continue: N*} (Short: \p{XIDC=N}, \P{XIDC}) (1_010_776)
  3484. \p{XID_Continue: Y*} (Short: \p{XIDC=Y}, \p{XIDC}) (103_336)
  3485. \p{XID_Start} \p{XID_Start=Y} (Short: \p{XIDS}) (101_217)
  3486. \p{XID_Start: N*} (Short: \p{XIDS=N}, \P{XIDS}) (1_012_895)
  3487. \p{XID_Start: Y*} (Short: \p{XIDS=Y}, \p{XIDS}) (101_217)
  3488. \p{XIDC} \p{XID_Continue} (= \p{XID_Continue=Y})
  3489. (103_336)
  3490. \p{XIDC: *} \p{XID_Continue: *}
  3491. \p{XIDS} \p{XID_Start} (= \p{XID_Start=Y}) (101_217)
  3492. \p{XIDS: *} \p{XID_Start: *}
  3493. \p{Xpeo} \p{Old_Persian} (= \p{Script=Old_Persian})
  3494. (NOT \p{Block=Old_Persian}) (50)
  3495. \p{XPerlSpace} \s, including beyond ASCII (Short:
  3496. \p{SpacePerl}) (26)
  3497. \p{XPosixAlnum} \p{Alnum} (102_619)
  3498. \p{XPosixAlpha} \p{Alpha} (= \p{Alphabetic=Y}) (102_159)
  3499. \p{XPosixBlank} \p{Blank} (19)
  3500. \p{XPosixCntrl} \p{Cntrl} (= \p{General_Category=Control})
  3501. (65)
  3502. \p{XPosixDigit} \p{Digit} (= \p{General_Category=
  3503. Decimal_Number}) (460)
  3504. \p{XPosixGraph} \p{Graph} (247_565)
  3505. \p{XPosixLower} \p{Lower} (= \p{Lowercase=Y}) (/i= Cased=
  3506. Yes) (1934)
  3507. \p{XPosixPrint} \p{Print} (247_583)
  3508. \p{XPosixPunct} \p{Punct} + ASCII-range \p{Symbol} (641)
  3509. \p{XPosixSpace} \p{Space} (= \p{White_Space=Y}) (26)
  3510. \p{XPosixUpper} \p{Upper} (= \p{Uppercase=Y}) (/i= Cased=
  3511. Yes) (1483)
  3512. \p{XPosixWord} \p{Word} (103_406)
  3513. \p{XPosixXDigit} \p{XDigit} (= \p{Hex_Digit=Y}) (44)
  3514. \p{Xsux} \p{Cuneiform} (= \p{Script=Cuneiform})
  3515. (NOT \p{Block=Cuneiform}) (982)
  3516. \p{Yi} \p{Script=Yi} (1220)
  3517. X \p{Yi_Radicals} \p{Block=Yi_Radicals} (64)
  3518. X \p{Yi_Syllables} \p{Block=Yi_Syllables} (1168)
  3519. \p{Yiii} \p{Yi} (= \p{Script=Yi}) (1220)
  3520. X \p{Yijing} \p{Yijing_Hexagram_Symbols} (= \p{Block=
  3521. Yijing_Hexagram_Symbols}) (64)
  3522. X \p{Yijing_Hexagram_Symbols} \p{Block=Yijing_Hexagram_Symbols}
  3523. (Short: \p{InYijing}) (64)
  3524. \p{Z} \p{Separator} (= \p{General_Category=
  3525. Separator}) (20)
  3526. \p{Zinh} \p{Inherited} (= \p{Script=Inherited})
  3527. (523)
  3528. \p{Zl} \p{Line_Separator} (= \p{General_Category=
  3529. Line_Separator}) (1)
  3530. \p{Zp} \p{Paragraph_Separator} (=
  3531. \p{General_Category=
  3532. Paragraph_Separator}) (1)
  3533. \p{Zs} \p{Space_Separator} (=
  3534. \p{General_Category=Space_Separator})
  3535. (18)
  3536. \p{Zyyy} \p{Common} (= \p{Script=Common}) (6413)
  3537. \p{Zzzz} \p{Unknown} (= \p{Script=Unknown})
  3538. (1_003_930)
  3539. TX\p{_CanonDCIJ} (For internal use by Perl, not necessarily
  3540. stable) (= \p{Soft_Dotted=Y}) (46)
  3541. TX\p{_Case_Ignorable} (For internal use by Perl, not necessarily
  3542. stable) (= \p{Case_Ignorable=Y}) (1799)
  3543. TX\p{_CombAbove} (For internal use by Perl, not necessarily
  3544. stable) (= \p{Canonical_Combining_Class=
  3545. Above}) (349)
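
As a minimal sketch of how the entries in the table above are used, the following matches a few sample characters against long, short, and negated forms. The sample characters and variable names are chosen purely for illustration.

```perl
use strict;
use warnings;

# Sample characters, written as escapes so the source stays ASCII.
my $alpha = "\x{3B1}";   # GREEK SMALL LETTER ALPHA
my $nbsp  = "\x{A0}";    # NO-BREAK SPACE

# The long and short spellings in the table name the same set.
my $long  = $alpha =~ /\p{Script_Extensions: Greek}/ ? 1 : 0;
my $short = $alpha =~ /\p{Scx=Grek}/                 ? 1 : 0;

# \P{...} (capital P) is the complement of \p{...}.
my $not_upper = "a" =~ /\P{Upper}/ ? 1 : 0;

# General_Category values have single-form shortcuts such as \p{Zs}.
my $is_space_sep = $nbsp =~ /\p{Zs}/ ? 1 : 0;

print "long=$long short=$short not_upper=$not_upper zs=$is_space_sep\n";
```

Each of the four tests should succeed, since the long and short forms are aliases for one another.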

Legal \p{} and \P{} constructs that match no characters

Some Unicode property-value pairs currently match nothing. Generally this is either because they are obsolete, or because they exist for symmetry with other forms but no encoded language yet uses them. In this version of Unicode, the following match zero code points:

  • \p{Canonical_Combining_Class=Attached_Below_Left}
  • \p{Canonical_Combining_Class=CCC133}
  • \p{Grapheme_Cluster_Break=Prepend}
  • \p{Joining_Type=Left_Joining}
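
One way to check this list against the Unicode version shipped with a particular Perl is to count the code points in each pair's inversion list. The sketch below assumes Unicode::UCD's prop_invlist() is available (Perl 5.16 or later); count_code_points is a hypothetical helper written for this example, not part of the module. The counts it prints depend on the installed Unicode version, so later releases may report non-zero values.

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_invlist);

# Count the code points covered by an inversion list: elements at even
# indices start a matching range, elements at odd indices end it.
# (Hypothetical helper, not part of Unicode::UCD.)
sub count_code_points {
    my @invlist = @_;
    my $count = 0;
    for (my $i = 0; $i < @invlist; $i += 2) {
        # A list with an odd number of elements runs to the last code point.
        my $end = $i + 1 < @invlist ? $invlist[$i + 1] : 0x110000;
        $count += $end - $invlist[$i];
    }
    return $count;
}

my @pairs = (
    'Canonical_Combining_Class=Attached_Below_Left',
    'Canonical_Combining_Class=CCC133',
    'Grapheme_Cluster_Break=Prepend',
    'Joining_Type=Left_Joining',
);

for my $pair (@pairs) {
    # In list context an unknown or empty property yields an empty list.
    my @invlist = prop_invlist($pair);
    printf "%-50s %d\n", $pair, count_code_points(@invlist);
}
```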

Properties accessible through Unicode::UCD

All the Unicode character properties mentioned above (except for those marked as for internal use by Perl) are also accessible by prop_invlist() in Unicode::UCD.
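
As an illustration, the sketch below fetches the inversion list for ASCII_Hex_Digit and checks that it agrees with the regular expression form over the ASCII range. in_invlist is a hypothetical helper written for this example, not a Unicode::UCD function.

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_invlist);

# True if $cp falls in a matching range of the inversion list.
# (Hypothetical helper, not part of Unicode::UCD.)
sub in_invlist {
    my ($cp, @invlist) = @_;
    # Count the range boundaries at or below $cp.
    my $boundaries_passed = grep { $_ <= $cp } @invlist;
    return $boundaries_passed % 2;   # odd means inside a matching range
}

my @ahex = prop_invlist('ASCII_Hex_Digit');
printf "boundaries: %s\n", join ' ', map { sprintf '%04X', $_ } @ahex;

# Compare the inversion list with \p{AHex} for every ASCII code point.
my $mismatches = 0;
for my $cp (0 .. 0x7F) {
    my $by_regex = chr($cp) =~ /\p{AHex}/ ? 1 : 0;
    $mismatches++ if $by_regex != in_invlist($cp, @ahex);
}
print "mismatches: $mismatches\n";
```

The linear scan in in_invlist keeps the example short; a binary search would be the usual choice for large lists.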

By their nature, some Unicode character properties are unsuitable for regular expression matching or for prop_invlist(). The remaining non-provisional, non-internal ones are accessible via prop_invmap() in Unicode::UCD (except for those that this Perl installation hasn't included; see below for which those are).

For compatibility with other parts of Perl, all the single forms given in the table in the section above are recognized. However, there are some ambiguities between Perl extensions and Unicode properties, all of which are silently resolved in favor of the official Unicode property. To avoid surprises, you should only use prop_invmap() for forms listed in the table below, which omits the non-recommended ones. The affected forms are the Perl single-form equivalents of Unicode properties. For example, \p{sc} is a single-form equivalent of \p{gc=sc}, but prop_invmap() treats sc as the Script property, whose short name is sc. The table indicates the current ambiguities in the INFO column, beginning with the word "NOT".
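
A minimal sketch of calling prop_invmap() with unambiguous full property names follows. It assumes Perl 5.16 or later; value_for is a hypothetical helper written for this example, and the exact strings stored in the map depend on the property and the Perl version, so none are shown here.

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_invmap);

# Look up the map value that applies to $cp. The i-th map entry covers
# code points from $list->[$i] up to (but not including) $list->[$i + 1].
# (Hypothetical helper, not part of Unicode::UCD.)
sub value_for {
    my ($cp, $list, $map) = @_;
    my $i = 0;
    $i++ while $i < $#$list && $list->[$i + 1] <= $cp;
    return $map->[$i];
}

# Full names avoid the single-form ambiguities described above.
for my $property (qw(Script General_Category)) {
    my ($list, $map, $format, $default) = prop_invmap($property);
    die "prop_invmap does not know '$property'\n" unless defined $list;
    printf "%-17s format=%s  'A' => %s\n",
        $property, $format, value_for(ord('A'), $list, $map);
}
```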

The standard Unicode properties listed below are documented in http://www.unicode.org/reports/tr44/; Perl_Decimal_Digit is documented in prop_invmap() in Unicode::UCD. The other Perl extensions are documented in Other Properties in perlunicode.

The first column in the table is a name for the property; the second column is an alternative name, if any, plus possibly some annotations. The alternative name is the property's full name, unless that would simply repeat the first column, in which case the second column indicates the property's short name (if different). The annotations are given only in the entry for the full name. If a property is obsolete, deprecated, or otherwise discouraged, the entry is flagged with the same characters used in the table in the section above, such as D or S.

  NAME                          INFO
  Age
  AHex                          ASCII_Hex_Digit
  All                           Any. (Perl extension)
  Alnum                         (Perl extension). Alphabetic and
                                (decimal) Numeric
  Alpha                         Alphabetic
  Alphabetic                    (Short: Alpha)
  Any                           (Perl extension). [\x{0000}-\x{10FFFF}]
  ASCII                         Block=ASCII. (Perl extension).
                                [[:ASCII:]]
  ASCII_Hex_Digit               (Short: AHex)
  Assigned                      (Perl extension). All assigned code points
  Bc                            Bidi_Class
  Bidi_C                        Bidi_Control
  Bidi_Class                    (Short: bc)
  Bidi_Control                  (Short: Bidi_C)
  Bidi_M                        Bidi_Mirrored
  Bidi_Mirrored                 (Short: Bidi_M)
  Bidi_Mirroring_Glyph          (Short: bmg)
  Blank                         (Perl extension). \h, Horizontal white
                                space
  Blk                           Block
  Block                         (Short: blk)
  Bmg                           Bidi_Mirroring_Glyph
  Canonical_Combining_Class     (Short: ccc)
  Case_Folding                  (Short: cf)
  Case_Ignorable                (Short: CI)
  Cased
  Category                      General_Category
  Ccc                           Canonical_Combining_Class
  CE                            Composition_Exclusion
  Cf                            Case_Folding; NOT 'cf' meaning
                                'General_Category=Format'
  Changes_When_Casefolded       (Short: CWCF)
  Changes_When_Casemapped       (Short: CWCM)
  Changes_When_Lowercased       (Short: CWL)
  Changes_When_NFKC_Casefolded  (Short: CWKCF)
  Changes_When_Titlecased       (Short: CWT)
  Changes_When_Uppercased       (Short: CWU)
  CI                            Case_Ignorable
  Cntrl                         General_Category=Cntrl. (Perl extension).
                                Control characters
  Comp_Ex                       Full_Composition_Exclusion
  Composition_Exclusion         (Short: CE)
  CWCF                          Changes_When_Casefolded
  CWCM                          Changes_When_Casemapped
  CWKCF                         Changes_When_NFKC_Casefolded
  CWL                           Changes_When_Lowercased
  CWT                           Changes_When_Titlecased
  CWU                           Changes_When_Uppercased
  Dash
  Decomposition_Mapping         (Short: dm)
  Decomposition_Type            (Short: dt)
  Default_Ignorable_Code_Point  (Short: DI)
  Dep                           Deprecated
  Deprecated                    (Short: Dep)
  DI                            Default_Ignorable_Code_Point
  Dia                           Diacritic
  Diacritic                     (Short: Dia)
  Digit                         General_Category=Digit. (Perl extension).
                                [0-9] + all other decimal digits
  Dm                            Decomposition_Mapping
  Dt                            Decomposition_Type
  Ea                            East_Asian_Width
  East_Asian_Width              (Short: ea)
  Ext                           Extender
  Extender                      (Short: Ext)
  Full_Composition_Exclusion    (Short: Comp_Ex)
  Gc                            General_Category
  GCB                           Grapheme_Cluster_Break
  General_Category              (Short: gc)
  Gr_Base                       Grapheme_Base
  Gr_Ext                        Grapheme_Extend
  Graph                         (Perl extension). Characters that are
                                graphical
  Grapheme_Base                 (Short: Gr_Base)
  Grapheme_Cluster_Break        (Short: GCB)
  Grapheme_Extend               (Short: Gr_Ext)
  Hangul_Syllable_Type          (Short: hst)
  Hex                           Hex_Digit
  Hex_Digit                     (Short: Hex)
  HorizSpace                    Blank. (Perl extension)
  Hst                           Hangul_Syllable_Type
D Hyphen                        Supplanted by Line_Break property values;
                                see www.unicode.org/reports/tr14
  ID_Continue                   (Short: IDC)
  ID_Start                      (Short: IDS)
  IDC                           ID_Continue
  Ideo                          Ideographic
  Ideographic                   (Short: Ideo)
  IDS                           ID_Start
  IDS_Binary_Operator           (Short: IDSB)
  IDS_Trinary_Operator          (Short: IDST)
  IDSB                          IDS_Binary_Operator
  IDST                          IDS_Trinary_Operator
  In                            Present_In. (Perl extension)
  Isc                           ISO_Comment; NOT 'isc' meaning
                                'General_Category=Other'
  ISO_Comment                   (Short: isc)
  Jg                            Joining_Group
  Join_C                        Join_Control
  Join_Control                  (Short: Join_C)
  Joining_Group                 (Short: jg)
  Joining_Type                  (Short: jt)
  Jt                            Joining_Type
  Lb                            Line_Break
  Lc                            Lowercase_Mapping; NOT 'lc' meaning
                                'General_Category=Cased_Letter'
  Line_Break                    (Short: lb)
  LOE                           Logical_Order_Exception
  Logical_Order_Exception       (Short: LOE)
  Lower                         Lowercase
  Lowercase                     (Short: Lower)
  Lowercase_Mapping             (Short: lc)
  Math
  Na                            Name
  Na1                           Unicode_1_Name
  Name                          (Short: na)
  Name_Alias
  NChar                         Noncharacter_Code_Point
  NFC_QC                        NFC_Quick_Check
  NFC_Quick_Check               (Short: NFC_QC)
  NFD_QC                        NFD_Quick_Check
  NFD_Quick_Check               (Short: NFD_QC)
  NFKC_Casefold                 (Short: NFKC_CF)
  NFKC_CF                       NFKC_Casefold
  NFKC_QC                       NFKC_Quick_Check
  NFKC_Quick_Check              (Short: NFKC_QC)
  NFKD_QC                       NFKD_Quick_Check
  NFKD_Quick_Check              (Short: NFKD_QC)
  Noncharacter_Code_Point       (Short: NChar)
  Nt                            Numeric_Type
  Numeric_Type                  (Short: nt)
  Numeric_Value                 (Short: nv)
  Nv                            Numeric_Value
  Pat_Syn                       Pattern_Syntax
  Pat_WS                        Pattern_White_Space
  Pattern_Syntax                (Short: Pat_Syn)
  Pattern_White_Space           (Short: Pat_WS)
  Perl_Decimal_Digit            (Perl extension)
  PerlSpace                     (Perl extension). \s, restricted to ASCII
                                = [ \f\n\r\t] plus vertical tab
  PerlWord                      (Perl extension). \w, restricted to ASCII
                                = [A-Za-z0-9_]
  PosixAlnum                    (Perl extension). [A-Za-z0-9]
  PosixAlpha                    (Perl extension). [A-Za-z]
  PosixBlank                    (Perl extension). \t and ' '
  PosixCntrl                    (Perl extension). ASCII control
                                characters: NUL, SOH, STX, ETX, EOT, ENQ,
                                ACK, BEL, BS, HT, LF, VT, FF, CR, SO, SI,
                                DLE, DC1, DC2, DC3, DC4, NAK, SYN, ETB,
                                CAN, EOM, SUB, ESC, FS, GS, RS, US, and DEL
  PosixDigit                    (Perl extension). [0-9]
  PosixGraph                    (Perl extension). [-
                                !"#$%&'()*+,./:;<>?@[\\]^_`{|}~0-9A-Za-z]
  PosixLower                    (Perl extension). [a-z]
  PosixPrint                    (Perl extension). [- 0-9A-Za-
                                z!"#$%&'()*+,./:;<>?@[\\]^_`{|}~]
  PosixPunct                    (Perl extension). [-
                                !"#$%&'()*+,./:;<>?@[\\]^_`{|}~]
  PosixSpace                    (Perl extension). \t, \n, \cK, \f, \r,
                                and ' '. (\cK is vertical tab)
  PosixUpper                    (Perl extension). [A-Z]
  PosixWord                     PerlWord. (Perl extension)
  PosixXDigit                   (Perl extension). [0-9A-Fa-f]
  Present_In                    (Short: In). (Perl extension)
  Print                         (Perl extension). Characters that are
                                graphical plus space characters (but no
                                controls)
  Punct                         General_Category=Punct. (Perl extension)
  QMark                         Quotation_Mark
  Quotation_Mark                (Short: QMark)
  Radical
  SB                            Sentence_Break
  Sc                            Script; NOT 'sc' meaning
                                'General_Category=Currency_Symbol'
  Scf                           Simple_Case_Folding
  Script                        (Short: sc)
  Script_Extensions             (Short: scx)
  Scx                           Script_Extensions
  SD                            Soft_Dotted
  Sentence_Break                (Short: SB)
  Sfc                           Simple_Case_Folding
  Simple_Case_Folding           (Short: scf)
  Simple_Lowercase_Mapping      (Short: slc)
  Simple_Titlecase_Mapping      (Short: stc)
  Simple_Uppercase_Mapping      (Short: suc)
  Slc                           Simple_Lowercase_Mapping
  Soft_Dotted                   (Short: SD)
  Space                         White_Space
  SpacePerl                     XPerlSpace. (Perl extension)
  Stc                           Simple_Titlecase_Mapping
  STerm
  Suc                           Simple_Uppercase_Mapping
  Tc                            Titlecase_Mapping
  Term                          Terminal_Punctuation
  Terminal_Punctuation          (Short: Term)
  Title                         Titlecase. (Perl extension)
  Titlecase                     (Short: Title). (Perl extension). (=
                                \p{Gc=Lt})
  Titlecase_Mapping             (Short: tc)
  Uc                            Uppercase_Mapping
  UIdeo                         Unified_Ideograph
  Unicode_1_Name                (Short: na1)
  Unified_Ideograph             (Short: UIdeo)
  Upper                         Uppercase
  Uppercase                     (Short: Upper)
  Uppercase_Mapping             (Short: uc)
  Variation_Selector            (Short: VS)
  VertSpace                     (Perl extension). \v
  VS                            Variation_Selector
  WB                            Word_Break
  White_Space                   (Short: WSpace)
  Word                          (Perl extension). \w, including beyond
                                ASCII; = \p{Alnum} + \pM + \p{Pc}
  Word_Break                    (Short: WB)
  WSpace                        White_Space
  XDigit                        (Perl extension)
  XID_Continue                  (Short: XIDC)
  XID_Start                     (Short: XIDS)
  XIDC                          XID_Continue
  XIDS                          XID_Start
  XPerlSpace                    (Perl extension). \s, including beyond
                                ASCII
  XPosixAlnum                   Alnum. (Perl extension)
  XPosixAlpha                   Alpha. (Perl extension)
  XPosixBlank                   Blank. (Perl extension)
  XPosixCntrl                   General_Category=Cntrl. (Perl extension)
  XPosixDigit                   General_Category=Digit. (Perl extension)
  XPosixGraph                   Graph. (Perl extension)
  XPosixLower                   Lower. (Perl extension)
  XPosixPrint                   Print. (Perl extension)
  XPosixPunct                   (Perl extension). \p{Punct} + ASCII-range
                                \p{Symbol}
  XPosixSpace                   Space. (Perl extension)
  XPosixUpper                   Upper. (Perl extension)
  XPosixWord                    Word. (Perl extension)
  XPosixXDigit                  XDigit. (Perl extension)

Properties accessible through other means

Certain properties are accessible also via core function calls. These are:

    Lowercase_Mapping    lc() and lcfirst()
    Titlecase_Mapping    ucfirst()
    Uppercase_Mapping    uc()

Also, Case_Folding is accessible through the /i modifier in regular expressions, the \F transliteration escape, and the fc operator.
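To make the correspondence concrete, here is a small sketch that exercises each mapping on characters where the three case forms differ. The sample characters are chosen for illustration only; fc() assumes Perl 5.16 or later.

```perl
use strict;
use warnings;
use feature 'fc';    # fc() needs Perl 5.16 or later

my $small_dz = "\x{1C6}";    # LATIN SMALL LETTER DZ WITH CARON

printf "uc:      U+%04X\n", ord(uc($small_dz));         # Uppercase_Mapping
printf "ucfirst: U+%04X\n", ord(ucfirst($small_dz));    # Titlecase_Mapping
printf "lc:      U+%04X\n", ord(lc("\x{1C4}"));         # Lowercase_Mapping

# Case_Folding: capital sigma and final small sigma fold to the same string.
my $same = fc("\x{3A3}") eq fc("\x{3C2}");
print "fc folds both sigmas alike: ", ($same ? "yes" : "no"), "\n";

# The /i modifier applies case folding while matching.
my $matched = "\x{3A3}" =~ /\x{3C2}/i;
print "/i match: ", ($matched ? "yes" : "no"), "\n";
```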

And, the Name and Name_Alias properties are accessible through the \N{} interpolation in double-quoted strings and regular expressions, and through the functions charnames::viacode(), charnames::vianame(), and charnames::string_vianame() (which require a use charnames (); to be specified).
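A minimal sketch of the three charnames functions follows; the characters looked up are arbitrary examples.

```perl
use strict;
use warnings;
use charnames ();    # makes the charnames:: functions available

my $name = charnames::viacode(0x263A);                               # code point to name
my $cp   = charnames::vianame("GREEK SMALL LETTER ALPHA");           # name to code point
my $char = charnames::string_vianame("GREEK SMALL LETTER ALPHA");    # name to string

printf "U+263A is %s\n", $name;
printf "GREEK SMALL LETTER ALPHA is U+%04X\n", $cp;
print  "string_vianame agrees with chr: ",
       ($char eq chr($cp) ? "yes" : "no"), "\n";
```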

Finally, most properties related to decomposition are accessible via Unicode::Normalize.
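For instance, canonical decomposition and recomposition can be observed with the NFD() and NFC() functions of Unicode::Normalize. This is only a sketch using one sample character.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFC);

my $composed   = "\x{E9}";          # LATIN SMALL LETTER E WITH ACUTE
my $decomposed = NFD($composed);    # "e" followed by U+0301 COMBINING ACUTE ACCENT

print "NFD: ", join(" ", map { sprintf "U+%04X", ord } split //, $decomposed), "\n";
print "NFC round trip ok: ", (NFC($decomposed) eq $composed ? "yes" : "no"), "\n";
```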

Unicode character properties that are NOT accepted by Perl

Perl will generate an error for a few character properties in Unicode when used in a regular expression. The non-Unihan ones are listed below, with the reasons they are not accepted, perhaps with work-arounds. The short names for the properties are listed enclosed in (parentheses). As described after the list, an installation can change the defaults and choose to accept any of these. The list is machine generated based on the choices made for the installation that generated this document.

  • Expands_On_NFC (XO_NFC)
  • Expands_On_NFD (XO_NFD)
  • Expands_On_NFKC (XO_NFKC)
  • Expands_On_NFKD (XO_NFKD)

    Deprecated by Unicode. These are characters that expand to more than one character in the specified normalization form, but whether they actually take up more bytes or not depends on the encoding being used. For example, a UTF-8 encoded character may expand to a different number of bytes than a UTF-32 encoded character.

  • Grapheme_Link (Gr_Link)

    Deprecated by Unicode: Duplicates ccc=vr (Canonical_Combining_Class=Virama)

  • Indic_Matra_Category (InMC)
  • Indic_Syllabic_Category (InSC)

    Provisional

  • Jamo_Short_Name (JSN)
  • Other_Alphabetic (OAlpha)
  • Other_Default_Ignorable_Code_Point (ODI)
  • Other_Grapheme_Extend (OGr_Ext)
  • Other_ID_Continue (OIDC)
  • Other_ID_Start (OIDS)
  • Other_Lowercase (OLower)
  • Other_Math (OMath)
  • Other_Uppercase (OUpper)

    Used by Unicode internally for generating other properties and not intended to be used stand-alone

  • Script=Katakana_Or_Hiragana (sc=Hrkt)

    Obsolete. All code points previously matched by this have been moved to "Script=Common". Consider instead using "Script_Extensions=Katakana" or "Script_Extensions=Hiragana" (or both)

  • Script_Extensions=Katakana_Or_Hiragana (scx=Hrkt)

    All code points that would be matched by this are matched by either "Script_Extensions=Katakana" or "Script_Extensions=Hiragana"

An installation can choose to allow any of these to be matched by downloading the Unicode database from http://www.unicode.org/Public/ to $Config{privlib}/unicore/ in the Perl source tree, changing the controlling lists contained in the program $Config{privlib}/unicore/mktables and then re-compiling and installing. (%Config is available from the Config module).

Other information in the Unicode data base

The Unicode data base is delivered in two different formats. The XML version is valid for more modern Unicode releases. The other version is a collection of files. The two are intended to give equivalent information. Perl uses the older form; this allows you to recompile Perl to use early Unicode releases.

The only non-character property that Perl currently supports is Named Sequences, in which a sequence of code points is given a name and generally treated as a single entity. (Perl supports these via the \N{...} double-quotish construct, charnames::string_vianame(name) in charnames, and namedseq() in Unicode::UCD.)
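As an illustration, the sketch below looks up one named sequence through both function interfaces. The sequence name is an example picked for this sketch; any name from NamedSequences.txt should behave the same way.

```perl
use strict;
use warnings;
use charnames ();
use Unicode::UCD qw(namedseq);

my $seq_name = "KATAKANA LETTER AINU P";    # a named sequence, not a single character

my $via_charnames = charnames::string_vianame($seq_name);
my $via_ucd       = namedseq($seq_name);

printf "%s is %d code points: %s\n", $seq_name, length($via_charnames),
    join(" ", map { sprintf "U+%04X", ord } split //, $via_charnames);
print "both interfaces agree\n" if $via_charnames eq $via_ucd;
```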

Below is a list of the files in the Unicode data base that Perl doesn't currently use, along with very brief descriptions of their purposes. Some of the names of the files have been shortened from those that Unicode uses, in order to allow them to be distinguishable from similarly named files on file systems for which only the first 8 characters of a name are significant.

  • auxiliary/GraphemeBreakTest.html
  • auxiliary/LineBreakTest.html
  • auxiliary/SentenceBreakTest.html
  • auxiliary/WordBreakTest.html

    Documentation of validation tests

  • auxiliary/LBTest.txt
  • auxiliary/SBTest.txt
  • auxiliary/WBTest.txt
  • BidiTest.txt
  • NormTest.txt

    Validation Tests

  • CJKRadicals.txt

    Maps the kRSUnicode property values to corresponding code points

  • EmojiSources.txt

    Maps certain Unicode code points to their legacy Japanese cell-phone values

  • Index.txt

    Alphabetical index of Unicode characters

  • IndicMatraCategory.txt
  • IndicSyllabicCategory.txt

    Provisional; for the analysis and processing of Indic scripts

  • NamedSqProv.txt

    Named sequences proposed for inclusion in a later version of the Unicode Standard; if you need them now, you can append this file to NamedSequences.txt and recompile perl

  • NamesList.txt

    Annotated list of characters

  • NormalizationCorrections.txt

    Documentation of corrections already incorporated into the Unicode data base

  • Props.txt

    Only in very early releases; is a subset of PropList.txt (which is used instead)

  • ReadMe.txt

    Documentation

  • StandardizedVariants.txt

    Certain glyph variations for character display are standardized. This lists the non-Unihan ones; the Unihan ones are also not used by Perl, and are in a separate Unicode data base http://www.unicode.org/ivd

  • USourceData.pdf
  • USourceData.txt

    Documentation of status and cross reference of proposals for encoding by Unicode of Unihan characters

SEE ALSO

http://www.unicode.org/reports/tr44/

perlrecharclass

perlunicode

 
perlunitut

NAME

perlunitut - Perl Unicode Tutorial

DESCRIPTION

The days of just flinging strings around are over. It's well established that modern programs need to be capable of communicating funny accented letters, and things like euro symbols. This means that programmers need new habits. It's easy to program Unicode capable software, but it does require discipline to do it right.

There's a lot to know about character sets, and text encodings. It's probably best to spend a full day learning all this, but the basics can be learned in minutes.

These are not the very basics, though. It is assumed that you already know the difference between bytes and characters, and realise (and accept!) that there are many different character sets and encodings, and that your program has to be explicit about them. Recommended reading is "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" by Joel Spolsky, at http://joelonsoftware.com/articles/Unicode.html.

This tutorial speaks in rather absolute terms, and provides only a limited view of the wealth of character string related features that Perl has to offer. For most projects, this information will probably suffice.

Definitions

It's important to set a few things straight first. This is the most important part of this tutorial. This view may conflict with other information that you may have found on the web, but that's mostly because many sources are wrong.

You may have to re-read this entire section a few times...

Unicode

Unicode is a character set with room for lots of characters. The ordinal value of a character is called a code point. (But in practice, the distinction between code point and character is blurred, so the terms often are used interchangeably.)

There are many, many code points, but computers work with bytes, and a byte has room for only 256 values. Unicode has many more characters than that, so you need a method to make these accessible.

Unicode is encoded using several competing encodings, of which UTF-8 is the most used. In a Unicode encoding, multiple subsequent bytes can be used to store a single code point, or simply: character.

UTF-8

UTF-8 is a Unicode encoding. Many people think that Unicode and UTF-8 are the same thing, but they're not. There are more Unicode encodings, but much of the world has standardized on UTF-8.

UTF-8 treats the first 128 code points, 0..127, the same as ASCII. They take only one byte per character. All other characters are encoded as two or more bytes using a more complex scheme: up to four for any code point in the Unicode range, although the original UTF-8 design allowed sequences of up to six. Fortunately, Perl handles this for us, so we don't have to worry about it.
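The variable width is easy to observe. The sketch below encodes four sample characters (chosen here for illustration) and prints how many bytes each one takes in UTF-8.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One sample character for each UTF-8 sequence length.
for my $char ("A", "\x{E9}", "\x{20AC}", "\x{1F600}") {
    my $bytes = encode('UTF-8', $char);
    printf "U+%04X -> %d byte(s): %s\n", ord($char), length($bytes),
        join(" ", map { sprintf "%02X", ord } split //, $bytes);
}
```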

Text strings (character strings)

Text strings, or character strings are made of characters. Bytes are irrelevant here, and so are encodings. Each character is just that: the character.

On a text string, you would do things like:

    $text =~ s/foo/bar/;
    if ($string =~ /^\d+$/) { ... }
    $text = ucfirst $text;
    my $character_count = length $text;

The value of a character (ord, chr) is the corresponding Unicode code point.

Binary strings (byte strings)

Binary strings, or byte strings are made of bytes. Here, you don't have characters, just bytes. All communication with the outside world (anything outside of your current Perl process) is done in binary.

On a binary string, you would do things like:

    my (@length_content) = unpack "(V/a)*", $binary;
    $binary =~ s/\x00\x0F/\xFF\xF0/; # for the brave :)
    print {$fh} $binary;
    my $byte_count = length $binary;
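The unpack template above reads a series of length-prefixed records. A self-contained sketch, with a made-up two-record byte string, might look like this:

```perl
use strict;
use warnings;

# Build a byte string of two records, each a 32-bit little-endian length
# followed by that many payload bytes. The payloads are arbitrary examples.
my $binary = pack("V", 2) . "ab" . pack("V", 3) . "cde";

my @length_content = unpack "(V/a)*", $binary;    # the payloads, without the lengths
my $byte_count     = length $binary;              # 4 + 2 + 4 + 3 bytes

print "records: @length_content\n";
print "bytes:   $byte_count\n";
```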

Encoding

Encoding (as a verb) is the conversion from text to binary. To encode, you have to supply the target encoding, for example iso-8859-1 or UTF-8. Some encodings, like the iso-8859 ("latin") range, do not support the full Unicode standard; characters that can't be represented are lost in the conversion.
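A short sketch of that loss, assuming the Encode module described later in this tutorial: the euro sign survives UTF-8 but is replaced by a substitution character in iso-8859-1, unless a stricter check is requested.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text = "price: \x{20AC}5";    # contains U+20AC EURO SIGN

my $utf8   = encode('UTF-8', $text);         # the euro sign becomes three bytes
my $latin1 = encode('iso-8859-1', $text);    # no euro sign in Latin-1: substituted

printf "UTF-8:      %d bytes\n", length($utf8);
printf "iso-8859-1: %s\n", $latin1;

# Passing Encode::FB_CROAK turns the loss into an error. encode() may modify
# its input when a check value is given, so work on a copy.
my $copy = $text;
my $ok = eval { encode('iso-8859-1', $copy, Encode::FB_CROAK()); 1 };
print "strict encode refused the euro sign\n" unless $ok;
```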

Decoding

Decoding is the conversion from binary to text. To decode, you have to know what encoding was used during the encoding phase. And most of all, it must be something decodable. It doesn't make much sense to decode a PNG image into a text string.
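The following sketch, again assuming the Encode module introduced later, decodes three UTF-8 bytes into one character and shows that bytes which are not valid UTF-8 are rejected when Encode::FB_CROAK is passed.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\xE2\x82\xAC";    # three bytes: the UTF-8 encoding of U+20AC
my $text  = decode('UTF-8', $bytes);

printf "%d bytes decoded to %d character (U+%04X)\n",
    length($bytes), length($text), ord($text);

# The first bytes of a PNG file are not valid UTF-8, so a strict decode dies.
my $png_magic = "\x89PNG\r\n\x1a\n";
my $ok = eval { decode('UTF-8', $png_magic, Encode::FB_CROAK()); 1 };
print "PNG header is not decodable as UTF-8\n" unless $ok;
```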

Internal format

Perl has an internal format, an encoding that it uses to encode text strings so it can store them in memory. All text strings are in this internal format. In fact, text strings are never in any other format!

You shouldn't worry about what this format is, because conversion is automatically done when you decode or encode.

Your new toolkit

Add to your standard heading the following line:

    use Encode qw(encode decode);

Or, if you're lazy, just:

    use Encode;

I/O flow (the actual 5 minute tutorial)

The typical input/output flow of a program is:

    1. Receive and decode
    2. Process
    3. Encode and output

If your input is binary, and is supposed to remain binary, you shouldn't decode it to a text string, of course. But in all other cases, you should decode it.

Decoding can't happen reliably if you don't know how the data was encoded. If you get to choose, it's a good idea to standardize on UTF-8.

    my $foo   = decode('UTF-8', get 'http://example.com/');
    my $bar   = decode('ISO-8859-1', readline STDIN);
    my $xyzzy = decode('Windows-1251', $cgi->param('foo'));

Processing happens as you knew before. The only difference is that you're now using characters instead of bytes. That's very useful if you use things like substr, or length.

It's important to realize that there are no bytes in a text string. Of course, Perl has its internal encoding to store the string in memory, but ignore that. If you have to do anything with the number of bytes, it's probably best to move that part to step 3, just after you've encoded the string. Then you know exactly how many bytes it will be in the destination string.

The syntax for encoding text strings to binary strings is as simple as decoding:

    $body = encode('UTF-8', $body);

If you needed to know the length of the string in bytes, now's the perfect time for that. Because $body is now a byte string, length will report the number of bytes, instead of the number of characters. The number of characters is no longer known, because characters only exist in text strings.

    my $byte_count = length $body;

And if the protocol you're using supports a way of letting the recipient know which character encoding you used, please help the receiving end by using that feature! For example, E-mail and HTTP support MIME headers, so you can use the Content-Type header. They can also have Content-Length to indicate the number of bytes, which is always a good idea to supply if the number is known.

    "Content-Type: text/plain; charset=UTF-8",
    "Content-Length: $byte_count"
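Put together, the three steps might look like the sketch below. The input bytes are hard-coded for illustration; in a real program they would arrive from a file, a socket, or similar, and the processing step would be whatever your program actually does.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# 1. Receive and decode. These are the UTF-8 bytes for "cafe" with an acute
#    accent on the final letter.
my $input = "caf\xC3\xA9";
my $text  = decode('UTF-8', $input);

# 2. Process, working in characters (there are four of them).
my $body = ucfirst $text;

# 3. Encode and output. Only now is the byte count meaningful.
$body = encode('UTF-8', $body);
my $byte_count = length $body;

print "Content-Type: text/plain; charset=UTF-8\r\n",
      "Content-Length: $byte_count\r\n",
      "\r\n",
      $body;
```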

SUMMARY

Decode everything you receive, encode everything you send out. (If it's text data.)

Q and A (or FAQ)

After reading this document, you ought to read perlunifaq too.

ACKNOWLEDGEMENTS

Thanks to Johan Vromans from Squirrel Consultancy. His UTF-8 rants during the Amsterdam Perl Mongers meetings got me interested and determined to find out how to use character encodings in Perl in ways that don't break easily.

Thanks to Gerard Goossen from TTY. His presentation "UTF-8 in the wild" (Dutch Perl Workshop 2006) inspired me to publish my thoughts and write this tutorial.

Thanks to the people who asked about this kind of stuff in several Perl IRC channels, and have constantly reminded me that a simpler explanation was needed.

Thanks to the people who reviewed this document for me, before it went public. They are: Benjamin Smith, Jan-Pieter Cornet, Johan Vromans, Lukas Mai, Nathan Gray.

AUTHOR

Juerd Waalboer <#####@juerd.nl>

SEE ALSO

perlunifaq, perlunicode, perluniintro, Encode

 
perlutil

NAME

perlutil - utilities packaged with the Perl distribution

DESCRIPTION

Along with the Perl interpreter itself, the Perl distribution installs a range of utilities on your system. There are also several utilities which are used by the Perl distribution itself as part of the install process. This document exists to list all of these utilities, explain what they are for and provide pointers to each utility's documentation, if appropriate.

LIST OF UTILITIES

Documentation

  • perldoc

    The main interface to Perl's documentation is perldoc, although if you're reading this, it's more than likely that you've already found it. perldoc will extract and format the documentation from any file in the current directory, any Perl module installed on the system, or any of the standard documentation pages, such as this one. Use perldoc <name> to get information on any of the utilities described in this document.

  • pod2man and pod2text

    If it's run from a terminal, perldoc will usually call pod2man to translate POD (Plain Old Documentation - see perlpod for an explanation) into a manpage, and then run man to display it; if man isn't available, pod2text will be used instead and the output piped through your favourite pager.

  • pod2html and pod2latex

    As well as these two, there are two other converters: pod2html produces HTML pages from POD, and pod2latex produces LaTeX files.

  • pod2usage

    If you just want to know how to use the utilities described here, pod2usage will just extract the "USAGE" section; some of the utilities will automatically call pod2usage on themselves when you call them with -help.

  • podselect

    pod2usage is a special case of podselect, a utility to extract named sections from documents written in POD. For instance, while utilities have "USAGE" sections, Perl modules usually have "SYNOPSIS" sections: podselect -s "SYNOPSIS" ... will extract this section for a given file.

  • podchecker

    If you're writing your own documentation in POD, the podchecker utility will look for errors in your markup.

  • splain

    splain is an interface to perldiag - paste in your error message to it, and it'll explain it for you.

  • roffitall

    The roffitall utility is not installed on your system but lives in the pod/ directory of your Perl source kit; it converts all the documentation from the distribution to *roff format, and produces a typeset PostScript or text file of the whole lot.

Converters

To help you convert legacy programs to Perl, we've included three conversion filters:

  • a2p

    a2p converts awk scripts to Perl programs; for example, a2p -F: on the simple awk script {print $2} will produce a Perl program based around this code:

        while (<>) {
            ($Fld1,$Fld2) = split(/[:\n]/, $_, -1);
            print $Fld2;
        }

  • s2p and psed

    Similarly, s2p converts sed scripts to Perl programs. s2p run on s/foo/bar will produce a Perl program based around this:

        while (<>) {
            chomp;
            s/foo/bar/g;
            print if $printit;
        }

    When invoked as psed, it behaves as a sed implementation, written in Perl.

  • find2perl

    Finally, find2perl translates find commands to Perl equivalents which use the File::Find module. As an example, find2perl . -user root -perm 4000 -print produces the following callback subroutine for File::Find:

        sub wanted {
            my ($dev,$ino,$mode,$nlink,$uid,$gid);

            # Each test must succeed before the next one is tried.
            (($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
            ($uid == $uid{'root'}) &&
            (($mode & 0777) == 04000) &&
            print("$name\n");
        }
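To show how such a callback is driven, here is a self-contained sketch in the same style. It is not find2perl output: the wanted() body is simplified to collect regular files, and the temporary directory and file name are invented for the example.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Set up a throwaway directory with one file so the sketch has something to find.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/example.txt") or die "cannot create file: $!";
close($fh);

# A simplified wanted() in the style of the find2perl output: it collects
# regular files instead of testing owner and permissions.
my @found;
sub wanted {
    my ($dev, $ino, $mode, $nlink, $uid, $gid);
    (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
    -f _ &&
    push(@found, $File::Find::name);
}

find(\&wanted, $dir);    # File::Find calls wanted() once per entry
print "$_\n" for sort @found;
```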

As well as these filters for converting other languages, the pl2pm utility will help you convert old-style Perl 4 libraries to new-style Perl 5 modules.

Administration

  • config_data

    Query or change configuration of Perl modules that use Module::Build-based configuration files for features and config data.

  • libnetcfg

    To display and change the libnet configuration run the libnetcfg command.

  • perlivp

    The perlivp program is set up at Perl source code build time to test the Perl version it was built under. It can be used after running make install (or your platform's equivalent procedure) to verify that perl and its libraries have been installed correctly.

Development

There is a set of utilities which help you in developing Perl programs, and in particular, extending Perl with C.

  • perlbug

    perlbug is the recommended way to report bugs in the perl interpreter itself or any of the standard library modules back to the developers; please read through the documentation for perlbug thoroughly before using it to submit a bug report.

  • perlthanks

    This program provides an easy way to send a thank-you message back to the authors and maintainers of perl. It's just perlbug installed under another name.

  • h2ph

    Back before Perl had the XS system for connecting with C libraries, programmers used to get library constants by reading through the C header files. You may still see require 'syscall.ph' or similar around - the .ph file should be created by running h2ph on the corresponding .h file. See the h2ph documentation for more on how to convert a whole bunch of header files at once.

  • c2ph and pstruct

    c2ph and pstruct, which are actually the same program but behave differently depending on how they are called, provide another way of getting at C with Perl - they'll convert C structures and union declarations to Perl code. This is deprecated in favour of h2xs these days.

  • h2xs

    h2xs converts C header files into XS modules, and will try and write as much glue between C libraries and Perl modules as it can. It's also very useful for creating skeletons of pure Perl modules.

  • enc2xs

    enc2xs builds a Perl extension for use by Encode from either Unicode Character Mapping files (.ucm) or Tcl Encoding Files (.enc). Besides being used internally during the build process of the Encode module, you can use enc2xs to add your own encoding to perl. No knowledge of XS is necessary.

  • xsubpp

    xsubpp is a compiler to convert Perl XS code into C code. It is typically run by the makefiles created by ExtUtils::MakeMaker.

    xsubpp will compile XS code into C code by embedding the constructs necessary to let C functions manipulate Perl values and creates the glue necessary to let Perl access those functions.

  • prove

    prove is a command-line interface to the test-running functionality of Test::Harness. It's an alternative to make test.

  • corelist

    A command-line front-end to Module::CoreList, to query which modules were shipped with given versions of perl.
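The same information that corelist reports can be queried from Perl through Module::CoreList. A minimal sketch, with two module names picked as examples:

```perl
use strict;
use warnings;
use Module::CoreList;

for my $module (qw(Encode File::Find)) {
    # first_release() returns the perl version that first shipped the module.
    my $version = Module::CoreList->first_release($module);
    print "$module first shipped with perl ",
          (defined $version ? $version : "(not in core)"), "\n";
}
```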

General tools

A few general-purpose tools are shipped with perl, mostly because they came along with modules included in the perl distribution.

  • piconv

    piconv is a Perl version of iconv, a character encoding converter widely available for various Unixen today. This script was primarily a technology demonstrator for Perl v5.8.0, but you can use piconv in the place of iconv for virtually any case.

  • ptar

    ptar is a tar-like program, written in pure Perl.

  • ptardiff

    ptardiff is a small utility that produces a diff between an extracted archive and an unextracted one. (Note that this utility requires the Text::Diff module to function properly; this module isn't distributed with perl, but is available from the CPAN.)

  • ptargrep

    ptargrep is a utility to apply pattern matching to the contents of files in a tar archive.

  • shasum

    This utility, which comes with the Digest::SHA module, is used to print or verify SHA checksums.

  • zipdetails

    zipdetails displays information about the internal record structure of the zip file. It is not concerned with displaying any details of the compressed data stored in the zip file.
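Several of these tools are thin wrappers around a module. For example, the checksums that shasum prints can also be computed directly with Digest::SHA, as in this small sketch (the input string is arbitrary):

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hex-encoded SHA-256 digest of a short byte string.
my $digest = sha256_hex("abc");
print "$digest\n";
```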

Installation

These utilities help manage extra Perl modules that don't come with the perl distribution.

  • cpan

    cpan is a command-line interface to CPAN.pm. It allows you to install modules or distributions from CPAN, or just get information about them, and a lot more. It is similar to the command line mode of the CPAN module,

        perl -MCPAN -e shell

  • cpanp

    cpanp is, like cpan, a command-line interface to the CPAN, using the CPANPLUS module as a back-end. It can be used interactively or imperatively.

  • cpan2dist

    cpan2dist is a tool to create distributions (or packages) from CPAN modules, suitable for your package manager of choice. Support for specific formats is available from CPAN as CPANPLUS::Dist::* modules.

  • instmodsh

    A little interface to ExtUtils::Installed to examine installed modules, validate your packlists and even create a tarball from an installed module.

SEE ALSO

perldoc, pod2man, perlpod, pod2html, pod2usage, podselect, podchecker, splain, perldiag, roffitall, a2p, s2p, find2perl, File::Find, pl2pm, perlbug, h2ph, c2ph, h2xs, enc2xs, xsubpp, cpan, cpanp, cpan2dist, instmodsh, piconv, prove, corelist, ptar, ptardiff, shasum, zipdetails

 
perlvar

NAME

perlvar - Perl predefined variables

DESCRIPTION

The Syntax of Variable Names

Variable names in Perl can have several formats. Usually, they must begin with a letter or underscore, in which case they can be arbitrarily long (up to an internal limit of 251 characters) and may contain letters, digits, underscores, or the special sequence :: or '. In this case, the part before the last :: or ' is taken to be a package qualifier; see perlmod.

Perl variable names may also be a sequence of digits or a single punctuation or control character. These names are all reserved for special uses by Perl; for example, the all-digits names are used to hold data captured by backreferences after a regular expression match. Perl has a special syntax for the single-control-character names: It understands ^X (caret X) to mean the control-X character. For example, the notation $^W (dollar-sign caret W) is the scalar variable whose name is the single character control-W. This is better than typing a literal control-W into your program.

Since Perl v5.6.0, Perl variable names may be alphanumeric strings that begin with control characters (or better yet, a caret). These variables must be written in the form ${^Foo}; the braces are not optional. ${^Foo} denotes the scalar variable whose name is a control-F followed by two o's. These variables are reserved for future special uses by Perl, except for the ones that begin with ^_ (control-underscore or caret-underscore). No control-character name that begins with ^_ will acquire a special meaning in any future version of Perl; such names may therefore be used safely in programs. $^_ itself, however, is reserved.

Perl identifiers that begin with digits, control characters, or punctuation characters are exempt from the effects of the package declaration and are always forced to be in package main; they are also exempt from strict 'vars' errors. A few other names are also exempt in these ways:

    ENV      STDIN
    INC      STDOUT
    ARGV     STDERR
    ARGVOUT
    SIG

In particular, the special ${^_XYZ} variables are always taken to be in package main , regardless of any package declarations presently in scope.
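
As a small illustration of that last point, the sketch below sets a caret-underscore variable in one package and reads it from another. The variable name ${^_APP_MODE} and the package name Other are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# ${^_APP_MODE} is a hypothetical name. Names starting with ^_ are
# left to programs, and they always live in package main.
${^_APP_MODE} = 'testing';

package Other;

# No "our" declaration or main:: qualifier is needed here, even under
# strict 'vars', because the variable is forced into package main.
my $seen = ${^_APP_MODE};
print "mode seen from package Other: $seen\n";
```

The same read would need an explicit $main::name qualifier if the variable had an ordinary name.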

SPECIAL VARIABLES

The following names have special meaning to Perl. Most punctuation names have reasonable mnemonics, or analogs in the shells. Nevertheless, if you wish to use long variable names, you need only say:

    use English;

at the top of your program. This aliases all the short names to the long names in the current package. Some even have medium names, generally borrowed from awk. To avoid a performance hit, if you don't need $PREMATCH, $MATCH, or $POSTMATCH, it's best to use the English module without them:

    use English '-no_match_vars';

Before you continue, note the sort order for variables. In general, we first list the variables in case-insensitive, almost-lexicographical order (ignoring the { or ^ preceding words, as in ${^UNICODE} or $^T), although $_ and @_ move up to the top of the pile. For variables with the same identifier, we list them in order of scalar, array, hash, and bareword.

General Variables

  • $ARG
  • $_

    The default input and pattern-searching space. The following pairs are equivalent:

        while (<>) {...}    # equivalent only in while!
        while (defined($_ = <>)) {...}

        /^Subject:/
        $_ =~ /^Subject:/

        tr/a-z/A-Z/
        $_ =~ tr/a-z/A-Z/

        chomp
        chomp($_)

    Here are the places where Perl will assume $_ even if you don't use it:

    • The following functions use $_ as a default argument:

      abs, alarm, chomp, chop, chr, chroot, cos, defined, eval, evalbytes, exp, fc, glob, hex, int, lc, lcfirst, length, log, lstat, mkdir, oct, ord, pos, print, printf, quotemeta, readlink, readpipe, ref, require, reverse (in scalar context only), rmdir, say, sin, split (for its second argument), sqrt, stat, study, uc, ucfirst, unlink, unpack.

    • All file tests (-f, -d) except for -t, which defaults to STDIN. See -X.

    • The pattern matching operations m//, s/// and tr/// (aka y///) when used without an =~ operator.

    • The default iterator variable in a foreach loop if no other variable is supplied.

    • The implicit iterator variable in the grep() and map() functions.

    • The implicit variable of given().

    • The default place to put the next value or input record when a <FH>, readline, readdir or each operation's result is tested by itself as the sole criterion of a while test. Outside a while test, this will not happen.

    $_ is by default a global variable. However, as of perl v5.10.0, you can use a lexical version of $_ by declaring it in a file or in a block with my. Moreover, declaring our $_ restores the global $_ in the current scope. Though this seemed like a good idea at the time it was introduced, lexical $_ actually causes more problems than it solves. If you call a function that expects to be passed information via $_, it may or may not work, depending on how the function is written, and there is no easy way to solve this. Just avoid lexical $_, unless you are feeling particularly masochistic. For this reason lexical $_ is still experimental and will produce a warning unless warnings have been disabled. As with other experimental features, the behavior of lexical $_ is subject to change without notice, including change into a fatal error.

    Mnemonic: underline is understood in certain operations.
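
    A few of the implicit uses listed above can be seen together in the short sketch below. The word list is made-up sample data; the point is only that grep, map, foreach, and the named unary functions all fall back to $_.

```perl
use strict;
use warnings;

my @words = ('apple', 'Banana', 'cherry');   # sample data

# grep and map alias each element to $_ in turn.
my @lower   = grep { /^[a-z]/ } @words;      # the match defaults to $_
my @lengths = map  { length } @words;        # length defaults to $_

# foreach without a loop variable also uses $_.
my @upper;
for (@words) {
    push @upper, uc;                         # uc defaults to $_
}

print "@lower\n";     # apple cherry
print "@lengths\n";   # 5 6 6
print "@upper\n";     # APPLE BANANA CHERRY
```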

  • @ARG
  • @_

    Within a subroutine the array @_ contains the parameters passed to that subroutine. Inside a subroutine, @_ is the default array for the array operators push, pop, shift, and unshift.

    See perlsub.

  • $LIST_SEPARATOR
  • $"

    When an array or an array slice is interpolated into a double-quoted string or a similar context such as /.../ , its elements are separated by this value. Default is a space. For example, this:

        print "The array is: @array\n";

    is equivalent to this:

        print "The array is: " . join($", @array) . "\n";

    Mnemonic: works in double-quoted context.
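
    Because $" is global, a common pattern is to change it with local inside a block so that other code still sees the default. The following is a minimal sketch of that pattern using a made-up array.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

my $default = "@array";          # joined with the default single space

my $custom;
{
    # The new separator is visible only until the end of this block.
    local $" = ', ';
    $custom = "@array";
}

my $restored = "@array";         # the default is back in effect

print "$default\n";    # 1 2 3
print "$custom\n";     # 1, 2, 3
print "$restored\n";   # 1 2 3
```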

  • $PROCESS_ID
  • $PID
  • $$

    The process number of the Perl running this script. Though you can set this variable, doing so is generally discouraged, although it can be invaluable for some testing purposes. It will be reset automatically across fork() calls.

    Note for Linux and Debian GNU/kFreeBSD users: Before Perl v5.16.0 perl would emulate POSIX semantics on Linux systems using LinuxThreads, a partial implementation of POSIX Threads that has since been superseded by the Native POSIX Thread Library (NPTL).

    LinuxThreads is now obsolete on Linux, and the caching of getpid() that this emulation relied on made embedding perl unnecessarily complex (since you'd have to manually update the value of $$), so now $$ and getppid() will always return the same values as the underlying C library.

    Debian GNU/kFreeBSD systems also used LinuxThreads up until and including the 6.0 release, but after that moved to FreeBSD thread semantics, which are POSIX-like.

    To see if your system is affected by this discrepancy, check whether getconf GNU_LIBPTHREAD_VERSION | grep -q NPTL returns a false value. NPTL threads preserve the POSIX semantics.

    Mnemonic: same as shells.

  • $PROGRAM_NAME
  • $0

    Contains the name of the program being executed.

    On some (but not all) operating systems assigning to $0 modifies the argument area that the ps program sees. On some platforms you may have to use special ps options or a different ps to see the changes. Modifying $0 is more useful as a way of indicating the current program state than it is for hiding the program you're running.

    Note that there are platform-specific limitations on the maximum length of $0. In the most extreme case it may be limited to the space occupied by the original $0.

    On some platforms there may be an arbitrary amount of padding, for example space characters, after the modified name as shown by ps. On some platforms this padding may extend all the way to the original length of the argument area, no matter what you do (this is the case for example with Linux 2.2).

    Note for BSD users: setting $0 does not completely remove "perl" from the ps(1) output. For example, setting $0 to "foobar" may result in "perl: foobar (perl)" (whether both the "perl: " prefix and the " (perl)" suffix are shown depends on your exact BSD variant and version). This is an operating system feature; Perl cannot help it.

    In multithreaded scripts Perl coordinates the threads so that any thread may modify its copy of $0 and the change becomes visible to ps(1) (assuming the operating system plays along). Note that the view of $0 the other threads have will not change since they have their own copies of it.

    If the program has been given to perl via the switches -e or -E, $0 will contain the string "-e".

    On Linux as of perl v5.14.0 the legacy process name will be set with prctl(2), in addition to altering the POSIX name via argv[0] as perl has done since version 4.000. Now system utilities that read the legacy process name such as ps, top and killall will recognize the name you set when assigning to $0. The string you supply will be cut off at 16 bytes; this is a limitation imposed by Linux.

    Mnemonic: same as sh and ksh.

  • $REAL_GROUP_ID
  • $GID
  • $(

    The real gid of this process. If you are on a machine that supports membership in multiple groups simultaneously, gives a space separated list of groups you are in. The first number is the one returned by getgid() , and the subsequent ones by getgroups() , one of which may be the same as the first number.

    However, a value assigned to $( must be a single number used to set the real gid. So the value given by $( should not be assigned back to $( without being forced numeric, such as by adding zero. Note that this is different to the effective gid ($) ) which does take a list.

    You can change both the real gid and the effective gid at the same time by using POSIX::setgid() . Changes to $( require a check to $! to detect any possible errors after an attempted change.

    Mnemonic: parentheses are used to group things. The real gid is the group you left, if you're running setgid.

  • $EFFECTIVE_GROUP_ID
  • $EGID
  • $)

    The effective gid of this process. If you are on a machine that supports membership in multiple groups simultaneously, gives a space separated list of groups you are in. The first number is the one returned by getegid() , and the subsequent ones by getgroups() , one of which may be the same as the first number.

    Similarly, a value assigned to $) must also be a space-separated list of numbers. The first number sets the effective gid, and the rest (if any) are passed to setgroups() . To get the effect of an empty list for setgroups() , just repeat the new effective gid; that is, to force an effective gid of 5 and an effectively empty setgroups() list, say $) = "5 5" .

    You can change both the effective gid and the real gid at the same time by using POSIX::setgid() (use only a single numeric argument). Changes to $) require a check to $! to detect any possible errors after an attempted change.

    $< , $> , $( and $) can be set only on machines that support the corresponding set[re][ug]id() routine. $( and $) can be swapped only on machines supporting setregid() .

    Mnemonic: parentheses are used to group things. The effective gid is the group that's right for you, if you're running setgid.

  • $REAL_USER_ID
  • $UID
  • $<

    The real uid of this process. You can change both the real uid and the effective uid at the same time by using POSIX::setuid() . Since changes to $< require a system call, check $! after a change attempt to detect any possible errors.

    Mnemonic: it's the uid you came from, if you're running setuid.

  • $EFFECTIVE_USER_ID
  • $EUID
  • $>

    The effective uid of this process. For example:

        $< = $>;                # set real to effective uid
        ($<,$>) = ($>,$<);      # swap real and effective uids

    You can change both the effective uid and the real uid at the same time by using POSIX::setuid() . Changes to $> require a check to $! to detect any possible errors after an attempted change.

    $< and $> can be swapped only on machines supporting setreuid() .

    Mnemonic: it's the uid you went to, if you're running setuid.

  • $SUBSCRIPT_SEPARATOR
  • $SUBSEP
  • $;

    The subscript separator for multidimensional array emulation. If you refer to a hash element as

        $foo{$a,$b,$c}

    it really means

        $foo{join($;, $a, $b, $c)}

    But don't put

        @foo{$a,$b,$c}    # a slice--note the @

    which means

        ($foo{$a},$foo{$b},$foo{$c})

    Default is "\034", the same as SUBSEP in awk. If your keys contain binary data there might not be any safe value for $; .

    Consider using "real" multidimensional arrays as described in perllol.

    Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
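
    To make the emulation concrete, here is a minimal sketch that stores one element under a two-part subscript and then recovers the parts by splitting the key. The hash name %grid and the variable $separator are invented for the example; it also assumes the keys do not themselves contain the separator.

```perl
use strict;
use warnings;

my %grid;
$grid{1, 2} = 'x';            # stored under the single key join($;, 1, 2)

my $separator = $;;           # copy of the current subscript separator
my @lines;
for my $key (keys %grid) {
    # Split the composite key to recover the original subscripts.
    my ($row, $col) = split /\Q$separator\E/, $key;
    push @lines, "row=$row col=$col value=$grid{$key}";
}

print "$_\n" for @lines;      # row=1 col=2 value=x
```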

  • $a
  • $b

    Special package variables when using sort(), see sort. Because of this specialness $a and $b don't need to be declared (using use vars , or our()) even when using the strict 'vars' pragma. Don't lexicalize them with my $a or my $b if you want to be able to use them in the sort() comparison block or function.
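
    For instance, the sketch below sorts the keys of a hash by descending value. The %score data is made up; note that $a and $b are used without any declaration even though strict is in effect.

```perl
use strict;
use warnings;

my %score = (alice => 90, bob => 72, carol => 85);   # sample data

# sort sets $a and $b for each comparison; no "our" or "my" is needed.
my @by_score = sort { $score{$b} <=> $score{$a} } keys %score;

print "@by_score\n";   # alice carol bob
```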

  • %ENV

    The hash %ENV contains your current environment. Setting a value in %ENV changes the environment for any child processes you subsequently fork() off.
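
    The effect on child processes can be sketched as follows. The environment variable name GREETING is hypothetical, and the example assumes a platform where the list form of a piped open is available; it re-invokes the running perl through $^X so that no shell quoting is involved.

```perl
use strict;
use warnings;

my $seen;
{
    # local restores the previous value when the block exits.
    local $ENV{GREETING} = 'hello';    # GREETING is a made-up name

    # The child perl inherits the modified environment.
    open my $child, '-|', $^X, '-e', 'print $ENV{GREETING}'
        or die "cannot run $^X: $!";
    $seen = <$child>;
    close $child;
}

print "child saw: $seen\n";            # child saw: hello
```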

  • $SYSTEM_FD_MAX
  • $^F

    The maximum system file descriptor, ordinarily 2. System file descriptors are passed to exec()ed processes, while higher file descriptors are not. Also, during an open(), system file descriptors are preserved even if the open() fails (ordinary file descriptors are closed before the open() is attempted). The close-on-exec status of a file descriptor will be decided according to the value of $^F when the corresponding file, pipe, or socket was opened, not the time of the exec().

  • @F

    The array @F contains the fields of each line read in when autosplit mode is turned on. See perlrun for the -a switch. This array is package-specific, and must be declared or given a full package name if not in package main when running under strict 'vars' .

  • @INC

    The array @INC contains the list of places that the do EXPR , require, or use constructs look for their library files. It initially consists of the arguments to any -I command-line switches, followed by the default Perl library, probably /usr/local/lib/perl, followed by ".", to represent the current directory. ("." will not be appended if taint checks are enabled, either by -T or by -t .) If you need to modify this at runtime, you should use the use lib pragma to get the machine-dependent library properly loaded also:

        use lib '/mypath/libdir/';
        use SomeMod;

    You can also insert hooks into the file inclusion system by putting Perl code directly into @INC . Those hooks may be subroutine references, array references or blessed objects. See require for details.
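
    As an illustration of the simplest kind of hook, a subroutine reference, the sketch below serves a module from a string held in memory. The module name My::Hypothetical and its answer() function are invented for the example, and only one of the return conventions described under require is shown (returning a filehandle).

```perl
use strict;
use warnings;

# My::Hypothetical is an invented module that exists only in this string.
my $source = <<'END_SOURCE';
package My::Hypothetical;
sub answer { 42 }
1;
END_SOURCE

# A subroutine-reference hook is called with itself and the requested path.
unshift @INC, sub {
    my ($self, $path) = @_;
    return unless $path eq 'My/Hypothetical.pm';    # decline other files
    open my $fh, '<', \$source
        or die "cannot open in-memory file: $!";
    return $fh;                                     # perl reads the code from this handle
};

require My::Hypothetical;
print My::Hypothetical::answer(), "\n";   # 42
```

    Returning an empty list from the hook tells require to carry on with the remaining elements of @INC.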

  • %INC

    The hash %INC contains entries for each filename included via the do, require, or use operators. The key is the filename you specified (with module names converted to pathnames), and the value is the location of the file found. The require operator uses this hash to determine whether a particular file has already been included.

    If the file was loaded via a hook (e.g. a subroutine reference, see require for a description of these hooks), this hook is by default inserted into %INC in place of a filename. Note, however, that the hook may have set the %INC entry by itself to provide some more specific info.

  • $INPLACE_EDIT
  • $^I

    The current value of the inplace-edit extension. Use undef to disable inplace editing.

    Mnemonic: value of -i switch.

  • $^M

    By default, running out of memory is an untrappable, fatal error. However, if suitably built, Perl can use the contents of $^M as an emergency memory pool after die()ing. Suppose that your Perl were compiled with -DPERL_EMERGENCY_SBRK and used Perl's malloc. Then

        $^M = 'a' x (1 << 16);

    would allocate a 64K buffer for use in an emergency. See the INSTALL file in the Perl distribution for information on how to add custom C compilation flags when compiling perl. To discourage casual use of this advanced feature, there is no English long name for this variable.

    This variable was added in Perl 5.004.

  • $OSNAME
  • $^O

    The name of the operating system under which this copy of Perl was built, as determined during the configuration process. For examples see PLATFORMS in perlport.

    The value is identical to $Config{'osname'} . See also Config and the -V command-line switch documented in perlrun.

    On Windows platforms, $^O is not very helpful: since it is always MSWin32, it does not distinguish between 95/98/ME/NT/2000/XP/CE/.NET. Use Win32::GetOSName() or Win32::GetOSVersion() (see Win32 and perlport) to distinguish between the variants.

    This variable was added in Perl 5.003.

  • %SIG

    The hash %SIG contains signal handlers for signals. For example:

        sub handler {   # 1st argument is signal name
            my($sig) = @_;
            print "Caught a SIG$sig--shutting down\n";
            close(LOG);
            exit(0);
        }

        $SIG{'INT'}  = \&handler;
        $SIG{'QUIT'} = \&handler;
        ...
        $SIG{'INT'}  = 'DEFAULT';   # restore default action
        $SIG{'QUIT'} = 'IGNORE';    # ignore SIGQUIT

    Using a value of 'IGNORE' usually has the effect of ignoring the signal, except for the CHLD signal. See perlipc for more about this special case.

    Here are some other examples:

        $SIG{"PIPE"} = "Plumber";   # assumes main::Plumber (not
                                    # recommended)
        $SIG{"PIPE"} = \&Plumber;   # just fine; assume current
                                    # Plumber
        $SIG{"PIPE"} = *Plumber;    # somewhat esoteric
        $SIG{"PIPE"} = Plumber();   # oops, what did Plumber()
                                    # return??

    Be sure not to use a bareword as the name of a signal handler, lest you inadvertently call it.

    If your system has the sigaction() function then signal handlers are installed using it. This means you get reliable signal handling.

    The default delivery policy of signals changed in Perl v5.8.0 from immediate (also known as "unsafe") to deferred, also known as "safe signals". See perlipc for more information.

    Certain internal hooks can be also set using the %SIG hash. The routine indicated by $SIG{__WARN__} is called when a warning message is about to be printed. The warning message is passed as the first argument. The presence of a __WARN__ hook causes the ordinary printing of warnings to STDERR to be suppressed. You can use this to save warnings in a variable, or turn warnings into fatal errors, like this:

        local $SIG{__WARN__} = sub { die $_[0] };
        eval $proggie;

    As the 'IGNORE' hook is not supported by __WARN__, you can disable warnings using the empty subroutine:

        local $SIG{__WARN__} = sub {};

    The routine indicated by $SIG{__DIE__} is called when a fatal exception is about to be thrown. The error message is passed as the first argument. When a __DIE__ hook routine returns, the exception processing continues as it would have in the absence of the hook, unless the hook routine itself exits via a goto &sub , a loop exit, or a die(). The __DIE__ handler is explicitly disabled during the call, so that you can die from a __DIE__ handler. Similarly for __WARN__ .

    Due to an implementation glitch, the $SIG{__DIE__} hook is called even inside an eval(). Do not use this to rewrite a pending exception in $@ , or as a bizarre substitute for overriding CORE::GLOBAL::die() . This strange action at a distance may be fixed in a future release so that $SIG{__DIE__} is only called if your program is about to exit, as was the original intent. Any other use is deprecated.

    __DIE__/__WARN__ handlers are very special in one respect: they may be called to report (probable) errors found by the parser. In such a case the parser may be in an inconsistent state, so any attempt to evaluate Perl code from such a handler will probably result in a segfault. This means that warnings or errors that result from parsing Perl should be handled with extreme caution, like this:

        require Carp if defined $^S;
        Carp::confess("Something wrong") if defined &Carp::confess;
        die "Something wrong, but could not load Carp to give "
          . "backtrace...\n\t"
          . "To see backtrace try starting Perl with -MCarp switch";

    Here the first line will load Carp unless it is the parser that called the handler. The second line will print a backtrace and die if Carp was available. The third line will be executed only if Carp was not available.

    Having to even think about the $^S variable in your exception handlers is simply wrong. $SIG{__DIE__} as currently implemented invites grievous and difficult to track down errors. Avoid it and use an END{} or CORE::GLOBAL::die override instead.

    See die, warn, eval, and warnings for additional information.
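
    The idea of saving warnings in a variable, mentioned above, might look like the following sketch. The array name @captured and the warning texts are made up for the example.

```perl
use strict;
use warnings;

my @captured;
{
    # While this hook is installed, warnings go to @captured, not STDERR.
    local $SIG{__WARN__} = sub { push @captured, $_[0] };
    warn "first problem\n";
    warn "second problem\n";
}

print "captured ", scalar(@captured), " warnings\n";   # captured 2 warnings
print $captured[0];                                     # first problem
```

    Using local keeps the hook confined to the block, so warnings raised elsewhere are printed as usual.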

  • $BASETIME
  • $^T

    The time at which the program began running, in seconds since the epoch (beginning of 1970). The values returned by the -M, -A, and -C filetests are based on this value.

  • $PERL_VERSION
  • $^V

    The revision, version, and subversion of the Perl interpreter, represented as a version object.

    This variable first appeared in perl v5.6.0; earlier versions of perl will see an undefined value. Before perl v5.10.0 $^V was represented as a v-string.

    $^V can be used to determine whether the Perl interpreter executing a script is in the right range of versions. For example:

        warn "Hashes not randomized!\n" if !$^V or $^V lt v5.8.1;

    To convert $^V into its string representation use sprintf()'s "%vd" conversion:

        printf "version is v%vd\n", $^V;  # Perl's version

    See the documentation of use VERSION and require VERSION for a convenient way to fail if the running Perl interpreter is too old.

    See also $] for an older representation of the Perl version.

    This variable was added in Perl v5.6.0.

    Mnemonic: use ^V for Version Control.

  • ${^WIN32_SLOPPY_STAT}

    If this variable is set to a true value, then stat() on Windows will not try to open the file. This means that the link count cannot be determined and file attributes may be out of date if additional hardlinks to the file exist. On the other hand, not opening the file is considerably faster, especially for files on network drives.

    This variable could be set in the sitecustomize.pl file to configure the local Perl installation to use "sloppy" stat() by default. See the documentation for -f in perlrun for more information about site customization.

    This variable was added in Perl v5.10.0.

  • $EXECUTABLE_NAME
  • $^X

    The name used to execute the current copy of Perl, from C's argv[0] or (where supported) /proc/self/exe.

    Depending on the host operating system, the value of $^X may be a relative or absolute pathname of the perl program file, or may be the string used to invoke perl but not the pathname of the perl program file. Also, most operating systems permit invoking programs that are not in the PATH environment variable, so there is no guarantee that the value of $^X is in PATH. For VMS, the value may or may not include a version number.

    You usually can use the value of $^X to re-invoke an independent copy of the same perl that is currently running, e.g.,

        @first_run = `$^X -le "print int rand 100 for 1..100"`;

    But recall that not all operating systems support forking or capturing of the output of commands, so this complex statement may not be portable.

    It is not safe to use the value of $^X as a path name of a file, as some operating systems that have a mandatory suffix on executable files do not require use of the suffix when invoking a command. To convert the value of $^X to a path name, use the following statements:

        # Build up a set of file names (not command names).
        use Config;
        my $this_perl = $^X;
        if ($^O ne 'VMS') {
            $this_perl .= $Config{_exe}
              unless $this_perl =~ m/$Config{_exe}$/i;
        }

    Because many operating systems permit anyone with read access to the Perl program file to make a copy of it, patch the copy, and then execute the copy, the security-conscious Perl programmer should take care to invoke the installed copy of perl, not the copy referenced by $^X . The following statements accomplish this goal, and produce a pathname that can be invoked as a command or referenced as a file.

        use Config;
        my $secure_perl_path = $Config{perlpath};
        if ($^O ne 'VMS') {
            $secure_perl_path .= $Config{_exe}
              unless $secure_perl_path =~ m/$Config{_exe}$/i;
        }

Variables related to regular expressions

Most of the special variables related to regular expressions are side effects. Perl sets these variables when it has a successful match, so you should check the match result before using them. For instance:

    if( /P(A)TT(ER)N/ ) {
        print "I found $1 and $2\n";
    }

These variables are read-only and dynamically-scoped, unless we note otherwise.

The dynamic nature of the regular expression variables means that their value is limited to the block that they are in, as demonstrated by this bit of code:

    my $outer = 'Wallace and Grommit';
    my $inner = 'Mutt and Jeff';

    my $pattern = qr/(\S+) and (\S+)/;

    sub show_n { print "\$1 is $1; \$2 is $2\n" }

    {
    OUTER:
        show_n() if $outer =~ m/$pattern/;

        INNER: {
            show_n() if $inner =~ m/$pattern/;
        }

        show_n();
    }

The output shows that while in the OUTER block, the values of $1 and $2 are from the match against $outer . Inside the INNER block, the values of $1 and $2 are from the match against $inner , but only until the end of the block (i.e. the dynamic scope). After the INNER block completes, the values of $1 and $2 return to the values for the match against $outer even though we have not made another match:

    $1 is Wallace; $2 is Grommit
    $1 is Mutt; $2 is Jeff
    $1 is Wallace; $2 is Grommit

Due to an unfortunate accident of Perl's implementation, use English imposes a considerable performance penalty on all regular expression matches in a program because it uses $`, $&, and $', regardless of whether they occur in the scope of use English. For that reason, saying use English in libraries is strongly discouraged unless you import it without the match variables:

    use English '-no_match_vars';

The Devel::NYTProf and Devel::FindAmpersand modules can help you find uses of these problematic match variables in your code.

Since Perl v5.10.0, you can use the /p match operator flag and the ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} variables instead, so that only the matches using /p incur the performance penalty.
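
A minimal sketch of the /p approach, assuming Perl v5.10.0 or later and a made-up input string, is shown here. It splits a string into the parts before, inside, and after the match without touching $`, $&, or $'.

```perl
use strict;
use warnings;

my $text = 'abcdefghi';     # sample input
my $summary = '';

# /p asks perl to preserve the pre-match, match, and post-match strings
# for this match only.
if ($text =~ /def/p) {
    $summary = join ':', ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH};
}

print "$summary\n";         # abc:def:ghi
```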

  • $<digits> ($1, $2, ...)

    Contains the subpattern from the corresponding set of capturing parentheses from the last successful pattern match, not counting patterns matched in nested blocks that have been exited already.

    These variables are read-only and dynamically-scoped.

    Mnemonic: like \digits.

  • $MATCH
  • $&

    The string matched by the last successful pattern match (not counting any matches hidden within a BLOCK or eval() enclosed by the current BLOCK).

    The use of this variable anywhere in a program imposes a considerable performance penalty on all regular expression matches. To avoid this penalty, you can extract the same substring by using @-. Starting with Perl v5.10.0, you can use the /p match flag and the ${^MATCH} variable to do the same thing for particular match operations.

    This variable is read-only and dynamically-scoped.

    Mnemonic: like & in some editors.

  • ${^MATCH}

    This is similar to $& ($MATCH ) except that it does not incur the performance penalty associated with that variable, and is only guaranteed to return a defined value when the pattern was compiled or executed with the /p modifier.

    This variable was added in Perl v5.10.0.

    This variable is read-only and dynamically-scoped.

  • $PREMATCH
  • $`

    The string preceding whatever was matched by the last successful pattern match, not counting any matches hidden within a BLOCK or eval enclosed by the current BLOCK.

    The use of this variable anywhere in a program imposes a considerable performance penalty on all regular expression matches. To avoid this penalty, you can extract the same substring by using @-. Starting with Perl v5.10.0, you can use the /p match flag and the ${^PREMATCH} variable to do the same thing for particular match operations.

    This variable is read-only and dynamically-scoped.

    Mnemonic: ` often precedes a quoted string.

  • ${^PREMATCH}

    This is similar to $` ($PREMATCH) except that it does not incur the performance penalty associated with that variable, and is only guaranteed to return a defined value when the pattern was compiled or executed with the /p modifier.

    This variable was added in Perl v5.10.0.

    This variable is read-only and dynamically-scoped.

  • $POSTMATCH
  • $'

    The string following whatever was matched by the last successful pattern match (not counting any matches hidden within a BLOCK or eval() enclosed by the current BLOCK). Example:

        local $_ = 'abcdefghi';
        /def/;
        print "$`:$&:$'\n";     # prints abc:def:ghi

    The use of this variable anywhere in a program imposes a considerable performance penalty on all regular expression matches. To avoid this penalty, you can extract the same substring by using @-. Starting with Perl v5.10.0, you can use the /p match flag and the ${^POSTMATCH} variable to do the same thing for particular match operations.

    This variable is read-only and dynamically-scoped.

    Mnemonic: ' often follows a quoted string.

  • ${^POSTMATCH}

    This is similar to $' ($POSTMATCH ) except that it does not incur the performance penalty associated with that variable, and is only guaranteed to return a defined value when the pattern was compiled or executed with the /p modifier.

    This variable was added in Perl v5.10.0.

    This variable is read-only and dynamically-scoped.

  • $LAST_PAREN_MATCH
  • $+

    The text matched by the last bracket of the last successful search pattern. This is useful if you don't know which one of a set of alternative patterns matched. For example:

        /Version: (.*)|Revision: (.*)/ && ($rev = $+);

    This variable is read-only and dynamically-scoped.

    Mnemonic: be positive and forward looking.

  • $LAST_SUBMATCH_RESULT
  • $^N

    The text matched by the used group most-recently closed (i.e. the group with the rightmost closing parenthesis) of the last successful search pattern.

    This is primarily used inside (?{...}) blocks for examining text recently matched. For example, to effectively capture text to a variable (in addition to $1 , $2 , etc.), replace (...) with

        (?:(...)(?{ $var = $^N }))

    Setting and then using $var in this way relieves you from having to worry about exactly which numbered set of parentheses it is.

    This variable was added in Perl v5.8.0.

    Mnemonic: the (possibly) Nested parenthesis that most recently closed.

  • @LAST_MATCH_END
  • @+

    This array holds the offsets of the ends of the last successful submatches in the currently active dynamic scope. $+[0] is the offset into the string of the end of the entire match. This is the same value as what the pos function returns when called on the variable that was matched against. The nth element of this array holds the offset of the nth submatch, so $+[1] is the offset past where $1 ends, $+[2] the offset past where $2 ends, and so on. You can use $#+ to determine how many subgroups were in the last successful match. See the examples given for the @- variable.

    This variable was added in Perl v5.6.0.

  • %LAST_PAREN_MATCH
  • %+

    Similar to @+ , the %+ hash allows access to the named capture buffers, should they exist, in the last successful match in the currently active dynamic scope.

    For example, $+{foo} is equivalent to $1 after the following match:

        'foo' =~ /(?<foo>foo)/;

    The keys of the %+ hash list only the names of buffers that have captured (and that are thus associated to defined values).

    The underlying behaviour of %+ is provided by the Tie::Hash::NamedCapture module.

    Note: %- and %+ are tied views into a common internal hash associated with the last successful regular expression. Therefore mixing iterative access to them via each may have unpredictable results. Likewise, if the last successful match changes, then the results may be surprising.

    This variable was added in Perl v5.10.0.

    This variable is read-only and dynamically-scoped.
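
Because %+ reflects only the last successful match, it is usually safest to copy the captures you need straight away. A small sketch, assuming a date string and capture names (year, month, day) invented for this example:

```perl
use strict;
use warnings;

my $stamp = "2014-02-09";   # sample input, chosen for illustration
my %date;
if ($stamp =~ /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/) {
    # Copy the captures out of %+ before another match can overwrite them.
    %date = %+;
}
print "$date{year}/$date{month}/$date{day}\n";
```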

  • @LAST_MATCH_START
  • @-

    $-[0] is the offset of the start of the last successful match. $-[n] is the offset of the start of the substring matched by n-th subpattern, or undef if the subpattern did not match.

    Thus, after a match against $_ , $& coincides with substr $_, $-[0], $+[0] - $-[0] . Similarly, $n coincides with substr $_, $-[n], $+[n] - $-[n] if $-[n] is defined, and $+ coincides with substr $_, $-[$#-], $+[$#-] - $-[$#-] . One can use $#- to find the last matched subgroup in the last successful match. Contrast with $#+ , the number of subgroups in the regular expression. Compare with @+ .

    This array holds the offsets of the beginnings of the last successful submatches in the currently active dynamic scope. $-[0] is the offset into the string of the beginning of the entire match. The nth element of this array holds the offset of the nth submatch, so $-[1] is the offset where $1 begins, $-[2] the offset where $2 begins, and so on.

    After a match against some variable $var :

    • $` is the same as substr($var, 0, $-[0])
    • $& is the same as substr($var, $-[0], $+[0] - $-[0])
    • $' is the same as substr($var, $+[0])
    • $1 is the same as substr($var, $-[1], $+[1] - $-[1])
    • $2 is the same as substr($var, $-[2], $+[2] - $-[2])
    • $3 is the same as substr($var, $-[3], $+[3] - $-[3])

    This variable was added in Perl v5.6.0.
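
The equivalences listed above can be checked without touching $` , $& or $' at all. This is a sketch under the assumption of a simple key=value input; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $var = "key=value;";
my ($pre, $match, $post, $first);
if ($var =~ /(\w+)=(\w+)/) {
    $pre   = substr($var, 0, $-[0]);               # what $` would hold
    $match = substr($var, $-[0], $+[0] - $-[0]);   # what $& would hold
    $post  = substr($var, $+[0]);                  # what $' would hold
    $first = substr($var, $-[1], $+[1] - $-[1]);   # same text as $1
}
print "pre='$pre' match='$match' post='$post' first='$first'\n";
```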

  • %LAST_MATCH_START
  • %-

    Similar to %+, this variable allows access to the named capture groups in the last successful match in the currently active dynamic scope. To each capture group name found in the regular expression, it associates a reference to an array containing the list of values captured by all buffers with that name (should there be several of them), in the order in which they appear.

    Here's an example:

        if ('1234' =~ /(?<A>1)(?<B>2)(?<A>3)(?<B>4)/) {
            foreach my $bufname (sort keys %-) {
                my $ary = $-{$bufname};
                foreach my $idx (0..$#$ary) {
                    print "\$-{$bufname}[$idx] : ",
                          (defined($ary->[$idx])
                              ? "'$ary->[$idx]'"
                              : "undef"),
                          "\n";
                }
            }
        }

    would print out:

        $-{A}[0] : '1'
        $-{A}[1] : '3'
        $-{B}[0] : '2'
        $-{B}[1] : '4'

    The keys of the %- hash correspond to all buffer names found in the regular expression.

    The behaviour of %- is implemented via the Tie::Hash::NamedCapture module.

    Note: %- and %+ are tied views into a common internal hash associated with the last successful regular expression. Therefore mixing iterative access to them via each may have unpredictable results. Likewise, if the last successful match changes, then the results may be surprising.

    This variable was added in Perl v5.10.0.

    This variable is read-only and dynamically-scoped.

  • $LAST_REGEXP_CODE_RESULT
  • $^R

    The result of evaluation of the last successful (?{ code }) regular expression assertion (see perlre). May be written to.

    This variable was added in Perl 5.005.

  • ${^RE_DEBUG_FLAGS}

    The current value of the regex debugging flags. Set to 0 for no debug output even when the re 'debug' module is loaded. See re for details.

    This variable was added in Perl v5.10.0.

  • ${^RE_TRIE_MAXBUF}

    Controls how certain regex optimisations are applied and how much memory they utilize. This value by default is 65536 which corresponds to a 512kB temporary cache. Set this to a higher value to trade memory for speed when matching large alternations. Set it to a lower value if you want the optimisations to be as conservative of memory as possible but still occur, and set it to a negative value to prevent the optimisation and conserve the most memory. Under normal situations this variable should be of no interest to you.

    This variable was added in Perl v5.10.0.

Variables related to filehandles

Variables that depend on the currently selected filehandle may be set by calling an appropriate object method on the IO::Handle object, although this is less efficient than using the regular built-in variables. (Summary lines below for this contain the word HANDLE.) First you must say

    use IO::Handle;

after which you may use either

    method HANDLE EXPR

or more safely,

    HANDLE->method(EXPR)

Each method returns the old value of the IO::Handle attribute. The methods each take an optional EXPR, which, if supplied, specifies the new value for the IO::Handle attribute in question. If not supplied, most methods do nothing to the current value--except for autoflush() , which will assume a 1 for you, just to be different.

Because loading in the IO::Handle class is an expensive operation, you should learn how to use the regular built-in variables.
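
As a rough comparison of the two styles, the sketch below turns on autoflush for STDERR with the method call and then reads the same setting back through the built-in variable. The variable names are invented for the example.

```perl
use strict;
use warnings;
use IO::Handle;

# Method form: affects STDERR directly, whatever handle is currently selected.
my $previous = STDERR->autoflush(1);

# Variable form: $| applies to the currently selected handle, so select
# the target first and restore the old selection afterwards.
my $old_fh = select(STDERR);
my $now = $|;
select($old_fh);

print "autoflush on STDERR is now ", ($now ? "on" : "off"), "\n";
```

The method form names the handle explicitly, which is why it does not need the select() dance that the variable form requires.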

A few of these variables are considered "read-only". This means that if you try to assign to one of them, either directly or indirectly through a reference, you'll raise a run-time exception.

You should be very careful when modifying the default values of most special variables described in this document. In most cases you want to localize these variables before changing them, since if you don't, the change may affect other modules which rely on the default values of the special variables that you have changed. This is one of the correct ways to read the whole file at once:

    open my $fh, "<", "foo" or die $!;
    local $/; # enable localized slurp mode
    my $content = <$fh>;
    close $fh;

But the following code is quite bad:

    open my $fh, "<", "foo" or die $!;
    undef $/; # enable slurp mode
    my $content = <$fh>;
    close $fh;

since some other module may want to read data from some file in the default "line mode". Once the code just presented has been executed, the global value of $/ is changed for any other code running inside the same Perl interpreter.

Usually when a variable is localized you want to make sure that this change affects the shortest scope possible. So unless you are already inside some short {} block, you should create one yourself. For example:

    my $content = '';
    open my $fh, "<", "foo" or die $!;
    {
        local $/;
        $content = <$fh>;
    }
    close $fh;

Here is an example of how your own code can break:

    for ( 1..3 ){
        $\ = "\r\n";
        nasty_break();
        print "$_";
    }
    sub nasty_break {
        $\ = "\f";
        # do something with $_
    }

You probably expect this code to print the equivalent of

    "1\r\n2\r\n3\r\n"

but instead you get:

    "1\f2\f3\f"

Why? Because nasty_break() modifies $\ without localizing it first. The value you set in nasty_break() is still there when you return. The fix is to add local() so the value doesn't leak out of nasty_break() :

    local $\ = "\f";

It's easy to notice the problem in such a short example, but in more complicated code you are looking for trouble if you don't localize changes to the special variables.

  • $ARGV

    Contains the name of the current file when reading from <> .

  • @ARGV

    The array @ARGV contains the command-line arguments intended for the script. $#ARGV is generally the number of arguments minus one, because $ARGV[0] is the first argument, not the program's command name itself. See $0 for the command name.

  • ARGV

    The special filehandle that iterates over command-line filenames in @ARGV . Usually written as the null filehandle in the angle operator <> . Note that currently ARGV only has its magical effect within the <> operator; elsewhere it is just a plain filehandle corresponding to the last file opened by <> . In particular, passing \*ARGV as a parameter to a function that expects a filehandle may not cause your function to automatically read the contents of all the files in @ARGV .

  • ARGVOUT

    The special filehandle that points to the currently open output file when doing edit-in-place processing with -i. Useful when you have to do a lot of inserting and don't want to keep modifying $_ . See perlrun for the -i switch.

  • IO::Handle->output_field_separator( EXPR )
  • $OUTPUT_FIELD_SEPARATOR
  • $OFS
  • $,

    The output field separator for the print operator. If defined, this value is printed between each of print's arguments. Default is undef.

    You cannot call output_field_separator() on a handle, only as a static method. See IO::Handle.

    Mnemonic: what is printed when there is a "," in your print statement.

  • HANDLE->input_line_number( EXPR )
  • $INPUT_LINE_NUMBER
  • $NR
  • $.

    Current line number for the last filehandle accessed.

    Each filehandle in Perl counts the number of lines that have been read from it. (Depending on the value of $/ , Perl's idea of what constitutes a line may not match yours.) When a line is read from a filehandle (via readline() or <> ), or when tell() or seek() is called on it, $. becomes an alias to the line counter for that filehandle.

    You can adjust the counter by assigning to $. , but this will not actually move the seek pointer. Localizing $. will not localize the filehandle's line count. Instead, it will localize perl's notion of which filehandle $. is currently aliased to.

    $. is reset when the filehandle is closed, but not when an open filehandle is reopened without an intervening close(). For more details, see I/O Operators in perlop. Because <> never does an explicit close, line numbers increase across ARGV files (but see examples in eof).

    You can also use HANDLE->input_line_number(EXPR) to access the line counter for a given filehandle without having to worry about which handle you last accessed.

    Mnemonic: many programs use "." to mean the current line number.
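
The aliasing behaviour of $. is easiest to see with two handles open at once. The sketch below assumes in-memory files (opened on string references) so that it is self-contained; the handle and variable names are invented for illustration.

```perl
use strict;
use warnings;
use IO::Handle;

# In-memory files stand in for real ones so the sketch is self-contained.
open my $fh_a, "<", \"a1\na2\na3\n" or die $!;
open my $fh_b, "<", \"b1\n"         or die $!;

my $line = <$fh_a>;
$line    = <$fh_a>;
$line    = <$fh_b>;            # $. now aliases $fh_b's counter

my $last_accessed = $.;                        # counter of $fh_b
my $count_a       = $fh_a->input_line_number;  # counter of $fh_a

print "last accessed: $last_accessed, fh_a: $count_a\n";
```

After two reads from the first handle and one from the second, $. reports 1 (the second handle), while the method call still reports 2 for the first.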

  • IO::Handle->input_record_separator( EXPR )
  • $INPUT_RECORD_SEPARATOR
  • $RS
  • $/

    The input record separator, newline by default. This influences Perl's idea of what a "line" is. Works like awk's RS variable, including treating empty lines as a terminator if set to the null string (an empty line cannot contain any spaces or tabs). You may set it to a multi-character string to match a multi-character terminator, or to undef to read through the end of file. Setting it to "\n\n" means something slightly different than setting to "" , if the file contains consecutive empty lines. Setting to "" will treat two or more consecutive empty lines as a single empty line. Setting to "\n\n" will blindly assume that the next input character belongs to the next paragraph, even if it's a newline.

        local $/; # enable "slurp" mode
        local $_ = <FH>; # whole file now here
        s/\n[ \t]+/ /g;

    Remember: the value of $/ is a string, not a regex. awk has to be better for something. :-)

    Setting $/ to a reference to an integer, scalar containing an integer, or scalar that's convertible to an integer will attempt to read records instead of lines, with the maximum record size being the referenced integer number of characters. So this:

        local $/ = \32768; # or \"32768", or \$var_containing_32768
        open my $fh, "<", $myfile or die $!;
        local $_ = <$fh>;

    will read a record of no more than 32768 characters from $fh. If you're not reading from a record-oriented file (or your OS doesn't have record-oriented files), then you'll likely get a full chunk of data with every read. If a record is larger than the record size you've set, you'll get the record back in pieces. Trying to set the record size to zero or less will cause reading in the (rest of the) whole file.

    On VMS only, record reads bypass PerlIO layers and any associated buffering, so you must not mix record and non-record reads on the same filehandle. Record mode mixes with line mode only when the same buffering layer is in use for both modes.

    You cannot call input_record_separator() on a handle, only as a static method. See IO::Handle.

    See also Newlines in perlport and the $. entry.

    Mnemonic: / delimits line boundaries when quoting poetry.
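
The difference between "" , "\n\n" and a record-size reference can be seen by reading the same data three times. This is a sketch only: the helper records_with() is a hypothetical name, and an in-memory file is assumed in place of a real one.

```perl
use strict;
use warnings;

my $text = "a\nb\n\n\n\nc\n";   # two paragraphs separated by three empty lines

# Helper (hypothetical name): read every record from $text under a given $/.
sub records_with {
    my ($separator) = @_;
    local $/ = $separator;       # restored automatically when the sub returns
    open my $fh, "<", \$text or die $!;
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @paragraph_mode = records_with("");      # runs of empty lines collapse
my @literal_mode   = records_with("\n\n");  # every "\n\n" ends a record
my @fixed_size     = records_with(\4);      # records of at most 4 characters

printf "paragraph: %d, literal: %d, fixed: %d\n",
    scalar @paragraph_mode, scalar @literal_mode, scalar @fixed_size;
```

Paragraph mode yields two records here, whereas the literal "\n\n" separator yields three, one of which consists only of newlines.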

  • IO::Handle->output_record_separator( EXPR )
  • $OUTPUT_RECORD_SEPARATOR
  • $ORS
  • $\

    The output record separator for the print operator. If defined, this value is printed after the last of print's arguments. Default is undef.

    You cannot call output_record_separator() on a handle, only as a static method. See IO::Handle.

    Mnemonic: you set $\ instead of adding "\n" at the end of the print. Also, it's just like $/ , but it's what you get "back" from Perl.
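
A short sketch of $, and $\ working together, localized so that the change does not leak. Output goes to an in-memory handle (an assumption made so the result can be inspected); the names are illustrative.

```perl
use strict;
use warnings;

my $captured = "";
open my $out, ">", \$captured or die $!;   # in-memory handle for illustration
{
    local $, = "-";     # printed between print's arguments
    local $\ = "\n";    # printed after the last argument
    print {$out} "a", "b", "c";
}
print {$out} "d", "e";  # both separators are back to undef here
close $out;
```

After this runs, $captured holds "a-b-c\nde": the separators apply only inside the block.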

  • HANDLE->autoflush( EXPR )
  • $OUTPUT_AUTOFLUSH
  • $|

    If set to nonzero, forces a flush right away and after every write or print on the currently selected output channel. Default is 0 (regardless of whether the channel is really buffered by the system or not; $| tells you only whether you've asked Perl explicitly to flush after each write). STDOUT will typically be line buffered if output is to the terminal and block buffered otherwise. Setting this variable is useful primarily when you are outputting to a pipe or socket, such as when you are running a Perl program under rsh and want to see the output as it's happening. This has no effect on input buffering. See getc for that. See select on how to select the output channel. See also IO::Handle.

    Mnemonic: when you want your pipes to be piping hot.

  • ${^LAST_FH}

    This read-only variable contains a reference to the last-read filehandle. This is set by <HANDLE> , readline, tell, eof and seek. This is the same handle that $. and tell and eof without arguments use. It is also the handle used when Perl appends ", <STDIN> line 1" to an error or warning message.

    This variable was added in Perl v5.18.0.

Variables related to formats

The special variables for formats are a subset of those for filehandles. See perlform for more information about Perl's formats.

  • $ACCUMULATOR
  • $^A

    The current value of the write() accumulator for format() lines. A format contains formline() calls that put their result into $^A . After calling its format, write() prints out the contents of $^A and empties it. So you never really see the contents of $^A unless you call formline() yourself and then look at it. See perlform and formline PICTURE,LIST.
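
Calling formline() directly is one way to look at the accumulator. A minimal sketch, assuming a picture with one left-justified and one right-justified four-column field:

```perl
use strict;
use warnings;

$^A = "";                              # start from an empty accumulator
formline('@<<< @>>>', "ab", "cd");     # left- and right-justified 4-column fields
my $formatted = $^A;                   # the formatted text now sits in $^A
$^A = "";                              # leave it clean for any later write()

print "[$formatted]\n";
```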

  • IO::Handle->format_formfeed(EXPR)
  • $FORMAT_FORMFEED
  • $^L

    What formats output as a form feed. The default is \f .

    You cannot call format_formfeed() on a handle, only as a static method. See IO::Handle.

  • HANDLE->format_page_number(EXPR)
  • $FORMAT_PAGE_NUMBER
  • $%

    The current page number of the currently selected output channel.

    Mnemonic: % is page number in nroff.

  • HANDLE->format_lines_left(EXPR)
  • $FORMAT_LINES_LEFT
  • $-

    The number of lines left on the page of the currently selected output channel.

    Mnemonic: lines_on_page - lines_printed.

  • IO::Handle->format_line_break_characters EXPR
  • $FORMAT_LINE_BREAK_CHARACTERS
  • $:

    The current set of characters after which a string may be broken to fill continuation fields (starting with ^) in a format. The default is " \n-", to break on a space, newline, or a hyphen.

    You cannot call format_line_break_characters() on a handle, only as a static method. See IO::Handle.

    Mnemonic: a "colon" in poetry is a part of a line.

  • HANDLE->format_lines_per_page(EXPR)
  • $FORMAT_LINES_PER_PAGE
  • $=

    The current page length (printable lines) of the currently selected output channel. The default is 60.

    Mnemonic: = has horizontal lines.

  • HANDLE->format_top_name(EXPR)
  • $FORMAT_TOP_NAME
  • $^

    The name of the current top-of-page format for the currently selected output channel. The default is the name of the filehandle with _TOP appended. For example, the default format top name for the STDOUT filehandle is STDOUT_TOP .

    Mnemonic: points to top of page.

  • HANDLE->format_name(EXPR)
  • $FORMAT_NAME
  • $~

    The name of the current report format for the currently selected output channel. The default format name is the same as the filehandle name. For example, the default format name for the STDOUT filehandle is just STDOUT .

    Mnemonic: brother to $^ .

Error Variables

The variables $@ , $! , $^E , and $? contain information about different types of error conditions that may appear during execution of a Perl program. The variables are shown ordered by the "distance" between the subsystem which reported the error and the Perl process. They correspond to errors detected by the Perl interpreter, C library, operating system, or an external program, respectively.

To illustrate the differences between these variables, consider the following Perl expression, which uses a single-quoted string. After execution of this statement, perl may have set all four special error variables:

    eval q{
        open my $pipe, "/cdrom/install |" or die $!;
        my @res = <$pipe>;
        close $pipe or die "bad pipe: $?, $!";
    };

When perl executes the eval() expression, it translates the open(), <$pipe>, and close calls into calls to the C run-time library and thence to the operating system kernel. perl sets $! to the C library's errno if one of these calls fails.

$@ is set if the string to be eval-ed did not compile (this may happen if open or close were imported with bad prototypes), or if Perl code executed during evaluation die()d. In these cases the value of $@ is the compile error, or the argument to die (which will interpolate $! and $? ). (See also Fatal, though.)

Under a few operating systems, $^E may contain a more verbose error indicator, such as in this case, "CDROM tray not closed." Systems that do not support extended error messages leave $^E the same as $! .

Finally, $? may be set to a nonzero value if the external program /cdrom/install fails. The upper eight bits reflect specific error conditions encountered by the program (the program's exit() value). The lower eight bits reflect the mode of failure, like signal death and core dump information. See wait(2) for details. In contrast to $! and $^E, which are set only if an error condition is detected, the variable $? is set on each wait or pipe close, overwriting the old value. This is more like $@, which on every eval() is always set on failure and cleared on success.

For more details, see the individual descriptions at $@ , $! , $^E , and $? .

  • ${^CHILD_ERROR_NATIVE}

    The native status returned by the last pipe close, backtick (`` ) command, successful call to wait() or waitpid(), or from the system() operator. On POSIX-like systems this value can be decoded with the WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG, WIFSTOPPED, WSTOPSIG and WIFCONTINUED functions provided by the POSIX module.

    Under VMS this reflects the actual VMS exit status; i.e. it is the same as $? when the pragma use vmsish 'status' is in effect.

    This variable was added in Perl v5.10.0.

  • $EXTENDED_OS_ERROR
  • $^E

    Error information specific to the current operating system. At the moment, this differs from $! under only VMS, OS/2, and Win32 (and for MacPerl). On all other platforms, $^E is always just the same as $! .

    Under VMS, $^E provides the VMS status value from the last system error. This is more specific information about the last system error than that provided by $! . This is particularly important when $! is set to EVMSERR.

    Under OS/2, $^E is set to the error code of the last call to OS/2 API either via CRT, or directly from perl.

    Under Win32, $^E always returns the last error information reported by the Win32 call GetLastError() which describes the last error from within the Win32 API. Most Win32-specific code will report errors via $^E . ANSI C and Unix-like calls set errno and so most portable Perl code will report errors via $! .

    Caveats mentioned in the description of $! generally apply to $^E , also.

    This variable was added in Perl 5.003.

    Mnemonic: Extra error explanation.

  • $EXCEPTIONS_BEING_CAUGHT
  • $^S

    Current state of the interpreter.

        $^S         State
        ---------   -------------------------------------
        undef       Parsing module, eval, or main program
        true (1)    Executing an eval
        false (0)   Otherwise

    The first state may happen in $SIG{__DIE__} and $SIG{__WARN__} handlers.

    The English name $EXCEPTIONS_BEING_CAUGHT is slightly misleading, because the undef value does not indicate whether exceptions are being caught, since compilation of the main program does not catch exceptions.

    This variable was added in Perl 5.004.

  • $WARNING
  • $^W

    The current value of the warning switch, initially true if -w was used, false otherwise, but directly modifiable.

    See also warnings.

    Mnemonic: related to the -w switch.

  • ${^WARNING_BITS}

    The current set of warning checks enabled by the use warnings pragma. It has the same scoping as the $^H and %^H variables. The exact values are considered internal to the warnings pragma and may change between versions of Perl.

    This variable was added in Perl v5.6.0.

  • $OS_ERROR
  • $ERRNO
  • $!

    When referenced, $! retrieves the current value of the C errno integer variable. If $! is assigned a numerical value, that value is stored in errno . When referenced as a string, $! yields the system error string corresponding to errno .

    Many system or library calls set errno if they fail, to indicate the cause of failure. They usually do not set errno to zero if they succeed. This means errno , hence $! , is meaningful only immediately after a failure:

        if (open my $fh, "<", $filename) {
            # Here $! is meaningless.
            ...
        }
        else {
            # ONLY here is $! meaningful.
            ...
            # Already here $! might be meaningless.
        }
        # Since here we might have either success or failure,
        # $! is meaningless.

    Here, meaningless means that $! may be unrelated to the outcome of the open() operator. Assignment to $! is similarly ephemeral. It can be used immediately before invoking the die() operator, to set the exit value, or to inspect the system error string corresponding to error n, or to restore $! to a meaningful state.

    Mnemonic: What just went bang?

  • %OS_ERROR
  • %ERRNO
  • %!

    Each element of %! has a true value only if $! is set to that value. For example, $!{ENOENT} is true if and only if the current value of $! is ENOENT ; that is, if the most recent error was "No such file or directory" (or its moral equivalent: not all operating systems give that exact error, and certainly not all languages). To check if a particular key is meaningful on your system, use exists $!{the_key} ; for a list of legal keys, use keys %! . See Errno for more information, and also see $!.

    This variable was added in Perl 5.005.
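
The sketch below shows one way to inspect $! and %! immediately after a failure, before anything else can disturb errno. The path is hypothetical and is assumed not to exist; the variable names are invented for the example.

```perl
use strict;
use warnings;

my $path = "/no/such/dir/hypothetical-file.txt";   # assumed not to exist
my ($errno, $errstr, $is_enoent);

if (open my $fh, "<", $path) {
    close $fh;
}
else {
    # Copy everything at once: any later system call may change errno.
    ($errno, $errstr) = ($! + 0, "$!");
    $is_enoent = $!{ENOENT} ? 1 : 0;
}

print "errno=$errno ($errstr), ENOENT? $is_enoent\n" if defined $errno;
```

Testing $!{ENOENT} instead of comparing the error string keeps the check independent of the system's language and exact wording.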

  • $CHILD_ERROR
  • $?

    The status returned by the last pipe close, backtick (``) command, successful call to wait() or waitpid(), or from the system() operator. This is just the 16-bit status word returned by the traditional Unix wait() system call (or else is made up to look like it). Thus, the exit value of the subprocess is really ($? >> 8), and $? & 127 gives which signal, if any, the process died from, and $? & 128 reports whether there was a core dump.

    Additionally, if the h_errno variable is supported in C, its value is returned via $? if any gethost*() function fails.

    If you have installed a signal handler for SIGCHLD , the value of $? will usually be wrong outside that handler.

    Inside an END subroutine $? contains the value that is going to be given to exit(). You can modify $? in an END subroutine to change the exit status of your program. For example:

        END {
            $? = 1 if $? == 255; # die would make it 255
        }

    Under VMS, the pragma use vmsish 'status' makes $? reflect the actual VMS exit status, instead of the default emulation of POSIX status; see $? in perlvms for details.

    Mnemonic: similar to sh and ksh.
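
Decoding the status word might look like the following sketch. It assumes that the running perl binary ( $^X ) can be launched as a child process, and uses a child that simply exits with status 3.

```perl
use strict;
use warnings;

# $^X is the running perl binary; the child simply exits with status 3.
system($^X, "-e", "exit 3");
my $status = $?;                 # copy at once; later calls overwrite $?

my $exit_value  = $status >> 8;    # the child's exit() value
my $signal      = $status & 127;   # signal number, 0 if none
my $core_dumped = $status & 128;   # nonzero if a core dump was reported

print "exit=$exit_value signal=$signal core=", ($core_dumped ? "yes" : "no"), "\n";
```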

  • $EVAL_ERROR
  • $@

    The Perl syntax error message from the last eval() operator. If $@ is the null string, the last eval() parsed and executed correctly (although the operations you invoked may have failed in the normal fashion).

    Warning messages are not collected in this variable. You can, however, set up a routine to process warnings by setting $SIG{__WARN__} as described in %SIG.

    Mnemonic: Where was the syntax error "at"?
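
Since every eval() resets $@ , a common habit is to copy it right after the eval. A minimal sketch, with an invented error message:

```perl
use strict;
use warnings;

my $ok = eval {
    die "disk full\n";   # trailing newline suppresses the "at ... line N" suffix
    1;
};
my $error = $@;          # copy at once; the next eval resets $@

my $ok2    = eval { 1 };
my $error2 = $@;         # empty string, because this eval succeeded

print "first eval failed: $error" unless $ok;
```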

Variables related to the interpreter state

These variables provide information about the current interpreter state.

  • $COMPILING
  • $^C

    The current value of the flag associated with the -c switch. Mainly of use with -MO=... to allow code to alter its behavior when being compiled, for example to AUTOLOAD at compile time rather than use normal, deferred loading. Setting $^C = 1 is similar to calling B::minus_c .

    This variable was added in Perl v5.6.0.

  • $DEBUGGING
  • $^D

    The current value of the debugging flags. May be read or set. Like its command-line equivalent, you can use numeric or symbolic values, e.g. $^D = 10 or $^D = "st" .

    Mnemonic: value of -D switch.

  • ${^ENCODING}

    The object reference to the Encode object that is used to convert the source code to Unicode. Thanks to this variable your Perl script does not have to be written in UTF-8. Default is undef. The direct manipulation of this variable is highly discouraged.

    This variable was added in Perl 5.8.2.

  • ${^GLOBAL_PHASE}

    The current phase of the perl interpreter.

    Possible values are:

    • CONSTRUCT

      The PerlInterpreter* is being constructed via perl_construct . This value is mostly there for completeness and for use via the underlying C variable PL_phase . It's not really possible for Perl code to be executed unless construction of the interpreter is finished.

    • START

      This is the global compile-time. That includes, basically, every BEGIN block executed directly or indirectly during the compile-time of the top-level program.

      This phase is not called "BEGIN" to avoid confusion with BEGIN blocks, as those are executed during the compile-time of any compilation unit, not just the top-level program. A new, localised compile-time entered at run-time, for example by a construct such as eval "use SomeModule", is not a global interpreter phase, and therefore isn't reflected by ${^GLOBAL_PHASE} .

    • CHECK

      Execution of any CHECK blocks.

    • INIT

      Similar to "CHECK", but for INIT -blocks, not CHECK blocks.

    • RUN

      The main run-time, i.e. the execution of PL_main_root .

    • END

      Execution of any END blocks.

    • DESTRUCT

      Global destruction.

    Also note that there's no value for UNITCHECK blocks. That's because those are run for each compilation unit individually, and therefore are not a global interpreter phase.

    Not every program has to go through each of the possible phases, but transition from one phase to another can only happen in the order described in the above list.

    An example of all of the phases Perl code can see:

        BEGIN { print "compile-time: ${^GLOBAL_PHASE}\n" }
        INIT  { print "init-time: ${^GLOBAL_PHASE}\n" }
        CHECK { print "check-time: ${^GLOBAL_PHASE}\n" }
        {
            package Print::Phase;
            sub new {
                my ($class, $time) = @_;
                return bless \$time, $class;
            }
            sub DESTROY {
                my $self = shift;
                print "$$self: ${^GLOBAL_PHASE}\n";
            }
        }
        print "run-time: ${^GLOBAL_PHASE}\n";
        my $runtime = Print::Phase->new(
            "lexical variables are garbage collected before END"
        );
        END { print "end-time: ${^GLOBAL_PHASE}\n" }
        our $destruct = Print::Phase->new(
            "package variables are garbage collected after END"
        );

    This will print out

        compile-time: START
        check-time: CHECK
        init-time: INIT
        run-time: RUN
        lexical variables are garbage collected before END: RUN
        end-time: END
        package variables are garbage collected after END: DESTRUCT

    This variable was added in Perl 5.14.0.

  • $^H

    WARNING: This variable is strictly for internal use only. Its availability, behavior, and contents are subject to change without notice.

    This variable contains compile-time hints for the Perl interpreter. At the end of compilation of a BLOCK the value of this variable is restored to the value when the interpreter started to compile the BLOCK.

    When perl begins to parse any block construct that provides a lexical scope (e.g., eval body, required file, subroutine body, loop body, or conditional block), the existing value of $^H is saved, but its value is left unchanged. When the compilation of the block is completed, it regains the saved value. Between the points where its value is saved and restored, code that executes within BEGIN blocks is free to change the value of $^H .

    This behavior provides the semantic of lexical scoping, and is used in, for instance, the use strict pragma.

    The contents should be an integer; different bits of it are used for different pragmatic flags. Here's an example:

        sub add_100 { $^H |= 0x100 }
        sub foo {
            BEGIN { add_100() }
            bar->baz($boon);
        }

    Consider what happens during execution of the BEGIN block. At this point the BEGIN block has already been compiled, but the body of foo() is still being compiled. The new value of $^H will therefore be visible only while the body of foo() is being compiled.

    Replacing the BEGIN { add_100() } block with:

        BEGIN { require strict; strict->import('vars') }

    demonstrates how use strict 'vars' is implemented. Here's a conditional version of the same lexical pragma:

        BEGIN {
            require strict; strict->import('vars') if $condition
        }

    This variable was added in Perl 5.003.

  • %^H

    The %^H hash provides the same scoping semantic as $^H . This makes it useful for implementation of lexically scoped pragmas. See perlpragma.

    When putting items into %^H , in order to avoid conflicting with other users of the hash there is a convention regarding which keys to use. A module should use only keys that begin with the module's name (the name of its main package) and a "/" character. For example, a module Foo::Bar should use keys such as Foo::Bar/baz .

    This variable was added in Perl v5.6.0.
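
A lexically scoped flag built on %^H might be sketched as follows. MyHint is a hypothetical package invented for this example, and the key follows the "module name plus slash" convention described above. See perlpragma for a fuller treatment.

```perl
use strict;
use warnings;

{
    package MyHint;   # hypothetical pragma-like package, for illustration only

    # Runs at the caller's compile time, so it writes the hints hash
    # of the scope currently being compiled.
    sub import { $^H{"MyHint/enabled"} = 1 }

    # Element 10 of caller() is the hints hash that was in effect
    # where the calling code was compiled.
    sub in_effect {
        my $hints = (caller(0))[10];
        return $hints && $hints->{"MyHint/enabled"} ? 1 : 0;
    }
}

my ($inside, $outside);
{
    BEGIN { MyHint->import }          # what "use MyHint;" would do
    $inside = MyHint::in_effect();    # compiled with the hint set
}
$outside = MyHint::in_effect();       # compiled after the block ended

print "inside=$inside outside=$outside\n";
```

The hint is visible only to code compiled inside the block, which is the same scoping that $^H gives to use strict.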

  • ${^OPEN}

    An internal variable used by PerlIO. A string in two parts, separated by a \0 byte, the first part describes the input layers, the second part describes the output layers.

    This variable was added in Perl v5.8.0.

  • $PERLDB
  • $^P

    The internal variable for debugging support. The meanings of the various bits are subject to change, but currently indicate:

    • 0x01

      Debug subroutine enter/exit.

    • 0x02

      Line-by-line debugging. Causes DB::DB() subroutine to be called for each statement executed. Also causes saving source code lines (like 0x400).

    • 0x04

      Switch off optimizations.

    • 0x08

      Preserve more data for future interactive inspections.

    • 0x10

      Keep info about source lines on which a subroutine is defined.

    • 0x20

      Start with single-step on.

    • 0x40

      Use subroutine address instead of name when reporting.

    • 0x80

      Report goto &subroutine as well.

    • 0x100

      Provide informative "file" names for evals based on the place they were compiled.

    • 0x200

      Provide informative names to anonymous subroutines based on the place they were compiled.

    • 0x400

      Save source code lines into @{"_<$filename"} .

    Some bits may be relevant at compile-time only, some at run-time only. This is a new mechanism and the details may change. See also perldebguts.
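As a rough illustration of how the $^P bits listed above combine, the following sketch decodes a flag value into readable labels. The bit values are the ones documented here; the short labels and the describe_perldb helper are names invented for this example, not part of Perl.

```perl
use strict;
use warnings;

# Bit values follow the list above; the labels are illustrative only.
my @PERLDB_BITS = (
    [ 0x01  => 'sub enter/exit' ],
    [ 0x02  => 'line-by-line' ],
    [ 0x04  => 'no optimizations' ],
    [ 0x08  => 'preserve more data' ],
    [ 0x10  => 'sub definition lines' ],
    [ 0x20  => 'single-step' ],
    [ 0x40  => 'sub address in reports' ],
    [ 0x80  => 'report goto &sub' ],
    [ 0x100 => 'named evals' ],
    [ 0x200 => 'named anonymous subs' ],
    [ 0x400 => 'save source lines' ],
);

# Return the labels of every bit that is set in a $^P-style value.
sub describe_perldb {
    my ($flags) = @_;
    return map { $_->[1] } grep { $flags & $_->[0] } @PERLDB_BITS;
}

print join(', ', describe_perldb(0x03)), "\n";
print "current \$^P: ",
      join(', ', describe_perldb($^P)) || 'no debugging bits set', "\n";
```

Because the meanings of the bits are subject to change, a table like this should be checked against perldebguts for the Perl version in use.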

  • ${^TAINT}

    Reflects if taint mode is on or off. 1 for on (the program was run with -T), 0 for off, -1 when only taint warnings are enabled (i.e. with -t or -TU).

    This variable is read-only.

    This variable was added in Perl v5.8.0.

  • ${^UNICODE}

    Reflects certain Unicode settings of Perl. See perlrun documentation for the -C switch for more information about the possible values.

    This variable is set during Perl startup and is thereafter read-only.

    This variable was added in Perl v5.8.2.

  • ${^UTF8CACHE}

    This variable controls the state of the internal UTF-8 offset caching code. 1 for on (the default), 0 for off, -1 to debug the caching code by checking all its results against linear scans, and panicking on any discrepancy.

    This variable was added in Perl v5.8.9. It is subject to change or removal without notice, but is currently used to avoid recalculating the boundaries of multi-byte UTF-8-encoded characters.

  • ${^UTF8LOCALE}

    This variable indicates whether a UTF-8 locale was detected by perl at startup. This information is used by perl when it's in adjust-utf8ness-to-locale mode (as when run with the -CL command-line switch); see perlrun for more info on this.

    This variable was added in Perl v5.8.8.

Deprecated and removed variables

Deprecating a variable announces the intent of the perl maintainers to eventually remove the variable from the language. It may still be available despite its status. Using a deprecated variable triggers a warning.

Once a variable is removed, its use triggers an error telling you the variable is unsupported.

See perldiag for details about error messages.

  • $OFMT
  • $#

    $# was a variable that could be used to format printed numbers. After a deprecation cycle, its magic was removed in Perl v5.10.0 and using it now triggers a warning: $# is no longer supported.

    This is not the sigil you use in front of an array name to get the last index, like $#array . That's still how you get the last index of an array in Perl. The two have nothing to do with each other.

    Deprecated in Perl 5.

    Removed in Perl v5.10.0.

  • $*

    $* was a variable that you could use to enable multiline matching. After a deprecation cycle, its magic was removed in Perl v5.10.0. Using it now triggers a warning: $* is no longer supported. You should use the /s and /m regexp modifiers instead.

    Deprecated in Perl 5.

    Removed in Perl v5.10.0.

  • $ARRAY_BASE
  • $[

    This variable stores the index of the first element in an array, and of the first character in a substring. The default is 0, but you could theoretically set it to 1 to make Perl behave more like awk (or Fortran) when subscripting and when evaluating the index() and substr() functions.

    As of release 5 of Perl, assignment to $[ is treated as a compiler directive, and cannot influence the behavior of any other file. (That's why you can only assign compile-time constants to it.) Its use is highly discouraged.

    Prior to Perl v5.10.0, assignment to $[ could be seen from outer lexical scopes in the same file, unlike other compile-time directives (such as strict). Using local() on it would bind its value strictly to a lexical block. Now it is always lexically scoped.

    As of Perl v5.16.0, it is implemented by the arybase module. See arybase for more details on its behaviour.

    Under use v5.16 , or no feature "array_base" , $[ no longer has any effect, and always contains 0. Assigning 0 to it is permitted, but any other value will produce an error.

    Mnemonic: [ begins subscripts.

    Deprecated in Perl v5.12.0.

  • $OLD_PERL_VERSION
  • $]

    See $^V for a more modern representation of the Perl version that allows accurate string comparisons.

    The version + patchlevel / 1000 of the Perl interpreter. This variable can be used to determine whether the Perl interpreter executing a script is in the right range of versions:

    warn "No checksumming!\n" if $] < 3.019;

    The floating point representation can sometimes lead to inaccurate numeric comparisons.

    See also the documentation of use VERSION and require VERSION for a convenient way to fail if the running Perl interpreter is too old.

    Mnemonic: Is this version of perl in the right bracket?
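One way to avoid the floating-point pitfalls mentioned above is to take the value of $] apart as a string. The sketch below assumes the 5.XXXYYY layout that Perl 5.6 and later use for $]; split_old_version is a hypothetical helper name.

```perl
use strict;
use warnings;

# Split a $]-style value such as "5.018002" into revision, version and
# subversion. Formatting to six decimals first gives a fixed-width
# fraction that can be cut up with substr instead of compared as a float.
sub split_old_version {
    my ($old) = @_;
    my ($rev, $frac) = split /\./, sprintf('%.6f', $old);
    return ($rev + 0, substr($frac, 0, 3) + 0, substr($frac, 3, 3) + 0);
}

printf "%d.%d.%d\n", split_old_version("5.018002");   # prints 5.18.2
printf "running perl %d.%d.%d\n", split_old_version($]);
```

For a plain "is this Perl new enough" check, use VERSION or a comparison against $^V remains the simpler choice.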

 
Perl 5 version 18.2 documentation

perlvms

NAME

perlvms - VMS-specific documentation for Perl

DESCRIPTION

Gathered below are notes describing details of Perl 5's behavior on VMS. They are a supplement to the regular Perl 5 documentation, so we have focussed on the ways in which Perl 5 functions differently under VMS than it does under Unix, and on the interactions between Perl and the rest of the operating system. We haven't tried to duplicate complete descriptions of Perl features from the main Perl documentation, which can be found in the [.pod] subdirectory of the Perl distribution.

We hope these notes will save you from confusion and lost sleep when writing Perl scripts on VMS. If you find we've missed something you think should appear here, please don't hesitate to drop a line to vmsperl@perl.org.

Installation

Directions for building and installing Perl 5 can be found in the file README.vms in the main source directory of the Perl distribution.

Organization of Perl Images

Core Images

During the installation process, three Perl images are produced. Miniperl.Exe is an executable image which contains all of the basic functionality of Perl, but cannot take advantage of Perl extensions. It is used to generate several files needed to build the complete Perl and various extensions. Once you've finished installing Perl, you can delete this image.

Most of the complete Perl resides in the shareable image PerlShr.Exe, which provides a core to which the Perl executable image and all Perl extensions are linked. You should place this image in Sys$Share, or define the logical name PerlShr to translate to the full file specification of this image. It should be world readable. (Remember that if a user has execute only access to PerlShr, VMS will treat it as if it were a privileged shareable image, and will therefore require all downstream shareable images to be INSTALLed, etc.)

Finally, Perl.Exe is an executable image containing the main entry point for Perl, as well as some initialization code. It should be placed in a public directory, and made world executable. In order to run Perl with command line arguments, you should define a foreign command to invoke this image.

Perl Extensions

Perl extensions are packages which provide both XS and Perl code to add new functionality to perl. (XS is a meta-language which simplifies writing C code which interacts with Perl, see perlxs for more details.) The Perl code for an extension is treated like any other library module - it's made available in your script through the appropriate use or require statement, and usually defines a Perl package containing the extension.

The portion of the extension provided by the XS code may be connected to the rest of Perl in either of two ways. In the static configuration, the object code for the extension is linked directly into PerlShr.Exe, and is initialized whenever Perl is invoked. In the dynamic configuration, the extension's machine code is placed into a separate shareable image, which is mapped by Perl's DynaLoader when the extension is used or required in your script. This allows you to maintain the extension as a separate entity, at the cost of keeping track of the additional shareable image. Most extensions can be set up as either static or dynamic.

The source code for an extension usually resides in its own directory. At least three files are generally provided: Extshortname.xs (where Extshortname is the portion of the extension's name following the last :: ), containing the XS code, Extshortname.pm, the Perl library module for the extension, and Makefile.PL, a Perl script which uses the MakeMaker library modules supplied with Perl to generate a Descrip.MMS file for the extension.

Installing static extensions

Since static extensions are incorporated directly into PerlShr.Exe, you'll have to rebuild Perl to incorporate a new extension. You should edit the main Descrip.MMS or Makefile you use to build Perl, adding the extension's name to the ext macro, and the extension's object file to the extobj macro. You'll also need to build the extension's object file, either by adding dependencies to the main Descrip.MMS, or using a separate Descrip.MMS for the extension. Then, rebuild PerlShr.Exe to incorporate the new code.

Finally, you'll need to copy the extension's Perl library module to the [.Extname] subdirectory under one of the directories in @INC , where Extname is the name of the extension, with all :: replaced by . (e.g. the library module for extension Foo::Bar would be copied to a [.Foo.Bar] subdirectory).
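The two naming rules used in this section (Extname with every :: replaced by a dot, and Extshortname as the part after the last ::) can be written down as a pair of small helpers. Both function names are hypothetical and exist only for this sketch.

```perl
use strict;
use warnings;

# Foo::Bar -> "[.Foo.Bar]": the subdirectory, relative to a directory
# in @INC, that receives the extension's library module.
sub ext_subdir {
    my ($ext) = @_;
    (my $dir = $ext) =~ s/::/./g;
    return "[.$dir]";
}

# Foo::Bar -> "Bar": the portion of the name after the last ::,
# referred to as Extshortname in the text.
sub ext_shortname {
    my ($ext) = @_;
    return (split /::/, $ext)[-1];
}

print ext_subdir('Foo::Bar'), "\n";      # prints [.Foo.Bar]
print ext_shortname('Foo::Bar'), "\n";   # prints Bar
```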

Installing dynamic extensions

In general, the distributed kit for a Perl extension includes a file named Makefile.PL, which is a Perl program which is used to create a Descrip.MMS file which can be used to build and install the files required by the extension. The kit should be unpacked into a directory tree not under the main Perl source directory, and the procedure for building the extension is simply

    $ perl Makefile.PL  ! Create Descrip.MMS
    $ mmk               ! Build necessary files
    $ mmk test          ! Run test code, if supplied
    $ mmk install       ! Install into public Perl tree

N.B. The procedure by which extensions are built and tested creates several levels (at least 4) under the directory in which the extension's source files live. For this reason if you are running a version of VMS prior to V7.1 you shouldn't nest the source directory too deeply in your directory structure lest you exceed RMS' maximum of 8 levels of subdirectory in a filespec. (You can use rooted logical names to get another 8 levels of nesting, if you can't place the files near the top of the physical directory structure.)

VMS support for this process in the current release of Perl is sufficient to handle most extensions. However, it does not yet recognize extra libraries required to build shareable images which are part of an extension, so these must be added to the linker options file for the extension by hand. For instance, if the PGPLOT extension to Perl requires the PGPLOTSHR.EXE shareable image in order to properly link the Perl extension, then the line PGPLOTSHR/Share must be added to the linker options file PGPLOT.Opt produced during the build process for the Perl extension.

By default, the shareable image for an extension is placed in the [.lib.site_perl.autoArch.Extname] directory of the installed Perl directory tree (where Arch is VMS_VAX or VMS_AXP, and Extname is the name of the extension, with each :: translated to .). (See the MakeMaker documentation for more details on installation options for extensions.) However, it can be manually placed in any of several locations:

  • the [.Lib.Auto.Arch$PVersExtname] subdirectory of one of the directories in @INC (where PVers is the version of Perl you're using, as supplied in $] , with '.' converted to '_'), or

  • one of the directories in @INC , or

  • a directory which the extension's Perl library module passes to the DynaLoader when asking it to map the shareable image, or

  • Sys$Share or Sys$Library.

If the shareable image isn't in any of these places, you'll need to define a logical name Extshortname, where Extshortname is the portion of the extension's name after the last :: , which translates to the full file specification of the shareable image.

File specifications

Syntax

We have tried to make Perl aware of both VMS-style and Unix-style file specifications wherever possible. You may use either style, or both, on the command line and in scripts, but you may not combine the two styles within a single file specification. VMS Perl interprets Unix pathnames in much the same way as the CRTL (e.g. the first component of an absolute path is read as the device name for the VMS file specification). There are a set of functions provided in the VMS::Filespec package for explicit interconversion between VMS and Unix syntax; its documentation provides more details.

We've tried to minimize the dependence of Perl library modules on Unix syntax, but you may find that some of these, as well as some scripts written for Unix systems, will require that you use Unix syntax, since they will assume that '/' is the directory separator, etc. If you find instances of this in the Perl distribution itself, please let us know, so we can try to work around them.

Also when working on Perl programs on VMS, if you need a syntax in a specific operating system format, then you need either to check the appropriate DECC$ feature logical, or call a conversion routine to force it to that format.

The feature logical name DECC$FILENAME_UNIX_REPORT modifies traditional Perl behavior in the conversion of file specifications from Unix to VMS format in order to follow the extended character handling rules now expected by the CRTL. Specifically, when this feature is in effect, the ./.../ in a Unix path is now translated to [.^.^.^.] instead of the traditional VMS [...] . To be compatible with what MakeMaker expects, if a VMS path cannot be translated to a Unix path, it is passed through unchanged, so unixify("[...]") will return [...] .
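A script that must run both on VMS and elsewhere can call the conversion routines defensively. The sketch below loads VMS::Filespec at run time, since the module is only present on VMS, and falls back to returning the input unchanged. That fallback and the to_unix and to_vms wrapper names are assumptions of this example, not behavior defined by the module.

```perl
use strict;
use warnings;

# VMS::Filespec ships only with Perl on VMS, so load it lazily and
# degrade gracefully on other platforms.
my $have_filespec = eval { require VMS::Filespec; 1 } ? 1 : 0;

# Convert a file specification to Unix syntax when the module is
# available; otherwise hand the input back untouched.
sub to_unix {
    my ($spec) = @_;
    return $spec unless $have_filespec;
    my $unix = VMS::Filespec::unixify($spec);
    return defined $unix ? $unix : $spec;
}

# The same idea in the other direction.
sub to_vms {
    my ($spec) = @_;
    return $spec unless $have_filespec;
    my $vms = VMS::Filespec::vmsify($spec);
    return defined $vms ? $vms : $spec;
}

print to_unix('perl_root:[lib]strict.pm'), "\n";
print to_vms('t/base/lex.t'), "\n";
```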

The handling of extended characters is largely complete in the VMS-specific C infrastructure of Perl, but more work is still needed to fully support extended syntax filenames in several core modules. In particular, at this writing PathTools has only partial support for directories containing some extended characters.

There are several ambiguous cases where a conversion routine cannot determine whether an input filename is in Unix format or in VMS format, since now both VMS and Unix file specifications may have characters in them that could be mistaken for syntax delimiters of the other type. So some pathnames simply cannot be used in a mode that allows either type of pathname to be present. Perl will tend to assume that an ambiguous filename is in Unix format.

Allowing "." as a version delimiter is simply incompatible with determining whether a pathname is in VMS format or in Unix format with extended file syntax. There is no way to know whether "perl-5.8.6" is a Unix "perl-5.8.6" or a VMS "perl-5.8;6" when passing it to unixify() or vmsify().

The DECC$FILENAME_UNIX_REPORT logical name controls how Perl interprets filenames to the extent that Perl uses the CRTL internally for many purposes, and attempts to follow CRTL conventions for reporting filenames. The DECC$FILENAME_UNIX_ONLY feature differs in that it expects all filenames passed to the C run-time to be already in Unix format. This feature is not yet supported in Perl since Perl uses traditional OpenVMS file specifications internally and in the test harness, and it is not yet clear whether this mode will be useful or useable. The feature logical name DECC$POSIX_COMPLIANT_PATHNAMES is new with the RMS Symbolic Link SDK and included with OpenVMS v8.3, but is not yet supported in Perl.

Filename Case

Perl follows VMS defaults and override settings in preserving (or not preserving) filename case. Case is not preserved on ODS-2 formatted volumes on any architecture. On ODS-5 volumes, filenames may be case preserved depending on process and feature settings. Perl now honors DECC$EFS_CASE_PRESERVE and DECC$ARGV_PARSE_STYLE on those systems where the CRTL supports these features. When these features are not enabled or the CRTL does not support them, Perl follows the traditional CRTL behavior of downcasing command-line arguments and returning file specifications in lower case only.

N. B. It is very easy to get tripped up using a mixture of other programs, external utilities, and Perl scripts that are in varying states of being able to handle case preservation. For example, a file created by an older version of an archive utility or a build utility such as MMK or MMS may generate a filename in all upper case even on an ODS-5 volume. If this filename is later retrieved by a Perl script or module in a case preserving environment, that upper case name may not match the mixed-case or lower-case expectations of the Perl code. Your best bet is to follow an all-or-nothing approach to case preservation: either don't use it at all, or make sure your entire toolchain and application environment support and use it.

OpenVMS Alpha v7.3-1 and later and all versions of OpenVMS I64 support case sensitivity as a process setting (see SET PROCESS /CASE_LOOKUP=SENSITIVE ). Perl does not currently support case sensitivity on VMS, but it may in the future, so Perl programs should use the File::Spec->case_tolerant method to determine the state, and not the $^O variable.
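A minimal sketch of the recommended check follows. File::Spec->case_tolerant is the method named above; the same_file_name helper is a hypothetical wrapper added for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Compare two file names the way the current platform would: ignore
# case only when File::Spec reports that the platform is case tolerant.
sub same_file_name {
    my ($x, $y) = @_;
    return File::Spec->case_tolerant ? lc($x) eq lc($y) : $x eq $y;
}

print "case tolerant: ", (File::Spec->case_tolerant ? "yes" : "no"), "\n";
print "PERL.C matches perl.c: ",
      (same_file_name('PERL.C', 'perl.c') ? "yes" : "no"), "\n";
```

Asking File::Spec rather than testing $^O means the code keeps working if a future Perl on VMS starts honoring the process case setting.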

Symbolic Links

When built on an ODS-5 volume with symbolic links enabled, Perl by default supports symbolic links when the requisite support is available in the filesystem and CRTL (generally 64-bit OpenVMS v8.3 and later). There are a number of limitations and caveats to be aware of when working with symbolic links on VMS. Most notably, the target of a valid symbolic link must be expressed as a Unix-style path and it must exist on a volume visible from your POSIX root (see the SHOW ROOT command in DCL help). For further details on symbolic link capabilities and requirements, see chapter 12 of the CRTL manual that ships with OpenVMS v8.3 or later.

Wildcard expansion

File specifications containing wildcards are allowed both on the command line and within Perl globs (e.g. <*.c> ). If the wildcard filespec uses VMS syntax, the resultant filespecs will follow VMS syntax; if a Unix-style filespec is passed in, Unix-style filespecs will be returned. Similar to the behavior of wildcard globbing for a Unix shell, one can escape command line wildcards with double quotation marks " around a perl program command line argument. However, owing to the stripping of " characters carried out by the C handling of argv you will need to escape a construct such as this one (in a directory containing the files PERL.C, PERL.EXE, PERL.H, and PERL.OBJ):

    $ perl -e "print join(' ',@ARGV)" perl.*
    perl.c perl.exe perl.h perl.obj

in the following triple quoted manner:

    $ perl -e "print join(' ',@ARGV)" """perl.*"""
    perl.*

VMS wildcard expansion is performed both for unquoted command-line arguments and for calls to glob(). (csh-style wildcard expansion is available if you use File::Glob::glob .) If the wildcard filespec contains a device or directory specification, then the resultant filespecs will also contain a device and directory; otherwise, device and directory information are removed. VMS-style resultant filespecs will contain a full device and directory, while Unix-style resultant filespecs will contain only as much of a directory path as was present in the input filespec. For example, if your default directory is Perl_Root:[000000], the expansion of [.t]*.* will yield filespecs like "perl_root:[t]base.dir", while the expansion of t/*/* will yield filespecs like "t/base.dir". (This is done to match the behavior of glob expansion performed by Unix shells.)

Similarly, the resultant filespec will contain the file version only if one was present in the input filespec.
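For csh-style expansion, File::Glob can be called directly. The sketch below uses bsd_glob from that module on a scratch directory containing the four files from the earlier example plus one unrelated file. The temporary directory, the extra file, and the use of Unix-style paths are assumptions of the demonstration.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Build a scratch directory holding the files from the example above.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(perl.c perl.exe perl.h perl.obj other.txt)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    close($fh);
}

# A Unix-style pattern goes in, so Unix-style filespecs come back.
my @matches = bsd_glob("$dir/perl.*");
print scalar(@matches), " files matched\n";
print "$_\n" for @matches;
```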

Pipes

Input and output pipes to Perl filehandles are supported; the "file name" is passed to lib$spawn() for asynchronous execution. You should be careful to close any pipes you have opened in a Perl script, lest you leave any "orphaned" subprocesses around when Perl exits.

You may also use backticks to invoke a DCL subprocess, whose output is used as the return value of the expression. The string between the backticks is handled as if it were the argument to the system operator (see below). In this case, Perl will wait for the subprocess to complete before continuing.

The mailbox (MBX) that perl can create to communicate with a pipe defaults to a buffer size of 8192 on 64-bit systems, 512 on VAX. The default buffer size is adjustable via the logical name PERL_MBX_SIZE provided that the value falls between 128 and the SYSGEN parameter MAXBUF inclusive. For example, to set the mailbox size to 32767 use $ENV{'PERL_MBX_SIZE'} = 32767; and then open and use pipe constructs. An alternative would be to issue the command:

    $ Define PERL_MBX_SIZE 32767

before running your wide record pipe program. A larger value may improve performance at the expense of the BYTLM UAF quota.
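The in-script variant described above might look like the following sketch. The child command is a made-up example that emits one 2000-character record, and the list form of a piped open is an assumption that holds on Unix-like systems and may need adjusting on other platforms. Away from VMS, setting PERL_MBX_SIZE has no effect.

```perl
use strict;
use warnings;

# Ask for a larger mailbox before any pipe is opened. Only Perl on VMS
# consults this name; elsewhere it is an ordinary environment variable.
$ENV{'PERL_MBX_SIZE'} = 32767;

# Read one wide record from a child process; $^X is the running perl.
open(my $pipe, '-|', $^X, '-e', 'print "x" x 2000, "\n"')
    or die "cannot start child: $!";
my $record = <$pipe>;

# Closing the pipe waits for the child, so no subprocess is orphaned.
close($pipe) or warn "child exited with status $?";

chomp $record;
print length($record), " characters read\n";
```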

PERL5LIB and PERLLIB

The PERL5LIB and PERLLIB logical names work as documented in perl, except that the element separator is '|' instead of ':'. The directory specifications may use either VMS or Unix syntax.

The Perl Forked Debugger

The Perl forked debugger places the debugger commands and output in a separate X-11 terminal window so that commands and output from multiple processes are not mixed together.

Perl on VMS supports an emulation of the forked debugger when Perl is run on a VMS system that has X11 support installed.

To use the forked debugger, you need to have the default display set to an X-11 Server and some environment variables set that Unix expects.

The forked debugger requires the environment variable TERM to be xterm , and the environment variable DISPLAY to exist. xterm must be in lower case.

    $define TERM "xterm"
    $define DISPLAY "hostname:0.0"

Currently the value of DISPLAY is ignored. It is recommended that it be set to be the hostname of the display, the server and screen in Unix notation. In the future the value of DISPLAY may be honored by Perl instead of using the default display.

It may be helpful to always use the forked debugger so that script I/O is separated from debugger I/O. You can force the debugger to be forked by assigning a value to the logical name PERLDB_PIDS that is not a process identification number.

    $define PERLDB_PIDS XXXX

PERL_VMS_EXCEPTION_DEBUG

Defining the logical name PERL_VMS_EXCEPTION_DEBUG as "ENABLE" causes the VMS debugger to be invoked if a fatal exception that is not otherwise handled is raised. The purpose of this is to allow debugging of internal Perl problems that would cause such a condition.

This allows the programmer to look at the execution stack and variables to find out the cause of the exception. As the debugger is being invoked as the Perl interpreter is about to do a fatal exit, continuing the execution in debug mode is usually not practical.

Starting Perl in the VMS debugger may change the program execution profile in a way that such problems are not reproduced.

The kill function can be used to test this functionality from within a program.

In typical VMS style, only the first letter of the value of this logical name is actually checked in a case insensitive mode, and it is considered enabled if it is the value "T","1" or "E".

This logical name must be defined before Perl is started.
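The first-letter rule described above can be mimicked in a few lines of Perl. This is only an illustration of the rule as stated here, not the code Perl itself uses, and exception_debug_enabled is a hypothetical name.

```perl
use strict;
use warnings;

# Treat a value as "enabled" when its first character, compared
# without regard to case, is T, 1 or E.
sub exception_debug_enabled {
    my ($value) = @_;
    return 0 unless defined $value && length $value;
    return uc(substr($value, 0, 1)) =~ /^[T1E]$/ ? 1 : 0;
}

for my $value (qw(ENABLE enable true 1 DISABLE 0)) {
    printf "%-8s => %s\n", $value,
        exception_debug_enabled($value) ? 'enabled' : 'not enabled';
}
```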

Command line

I/O redirection and backgrounding

Perl for VMS supports redirection of input and output on the command line, using a subset of Bourne shell syntax:

  • <file reads stdin from file ,

  • >file writes stdout to file ,

  • >>file appends stdout to file ,

  • 2>file writes stderr to file ,

  • 2>>file appends stderr to file , and

  • 2>&1 redirects stderr to stdout.

In addition, output may be piped to a subprocess, using the character '|'. Anything after this character on the command line is passed to a subprocess for execution; the subprocess takes the output of Perl as its input.

Finally, if the command line ends with '&', the entire command is run in the background as an asynchronous subprocess.

Command line switches

The following command line switches behave differently under VMS than described in perlrun. Note also that in order to pass uppercase switches to Perl, you need to enclose them in double-quotes on the command line, since the CRTL downcases all unquoted strings.

On newer 64 bit versions of OpenVMS, a process setting now controls if the quoting is needed to preserve the case of command line arguments.

  • -i

    If the -i switch is present but no extension for a backup copy is given, then inplace editing creates a new version of a file; the existing copy is not deleted. (Note that if an extension is given, an existing file is renamed to the backup file, as is the case under other operating systems, so it does not remain as a previous version under the original filename.)

  • -S

    If the "-S" or -"S" switch is present and the script name does not contain a directory, then Perl translates the logical name DCL$PATH as a searchlist, using each translation as a directory in which to look for the script. In addition, if no file type is specified, Perl looks in each directory for a file matching the name specified, with a blank type, a type of .pl, and a type of .com, in that order.

  • -u

    The -u switch causes the VMS debugger to be invoked after the Perl program is compiled, but before it has run. It does not create a core dump file.

Perl functions

As of the time this document was last revised, the following Perl functions were implemented in the VMS port of Perl (functions marked with * are discussed in more detail below):

    file tests*, abs, alarm, atan, backticks*, binmode*, bless,
    caller, chdir, chmod, chown, chomp, chop, chr,
    close, closedir, cos, crypt*, defined, delete, die, do, dump*,
    each, endgrent, endpwent, eof, eval, exec*, exists, exit, exp,
    fileno, flock, getc, getgrent*, getgrgid*, getgrnam, getlogin, getppid,
    getpwent*, getpwnam*, getpwuid*, glob, gmtime*, goto,
    grep, hex, ioctl, import, index, int, join, keys, kill*,
    last, lc, lcfirst, lchown*, length, link*, local, localtime, log, lstat, m//,
    map, mkdir, my, next, no, oct, open, opendir, ord, pack,
    pipe, pop, pos, print, printf, push, q//, qq//, qw//,
    qx//*, quotemeta, rand, read, readdir, readlink*, redo, ref, rename,
    require, reset, return, reverse, rewinddir, rindex,
    rmdir, s///, scalar, seek, seekdir, select(internal),
    select (system call)*, setgrent, setpwent, shift, sin, sleep,
    socketpair, sort, splice, split, sprintf, sqrt, srand, stat,
    study, substr, symlink*, sysread, system*, syswrite, tell,
    telldir, tie, time, times*, tr///, uc, ucfirst, umask,
    undef, unlink*, unpack, untie, unshift, use, utime*,
    values, vec, wait, waitpid*, wantarray, warn, write, y///

The following functions were not implemented in the VMS port, and calling them produces a fatal error (usually) or undefined behavior (rarely, we hope):

    chroot, dbmclose, dbmopen, fork*, getpgrp, getpriority,
    msgctl, msgget, msgsend, msgrcv, semctl,
    semget, semop, setpgrp, setpriority, shmctl, shmget,
    shmread, shmwrite, syscall

The following functions are available on Perls compiled with Dec C 5.2 or greater and running VMS 7.0 or greater:

    truncate

The following functions are available on Perls built on VMS 7.2 or greater:

    fcntl (without locking)

The following functions may or may not be implemented, depending on what type of socket support you've built into your copy of Perl:

    accept, bind, connect, getpeername,
    gethostbyname, getnetbyname, getprotobyname,
    getservbyname, gethostbyaddr, getnetbyaddr,
    getprotobynumber, getservbyport, gethostent,
    getnetent, getprotoent, getservent, sethostent,
    setnetent, setprotoent, setservent, endhostent,
    endnetent, endprotoent, endservent, getsockname,
    getsockopt, listen, recv, select(system call)*,
    send, setsockopt, shutdown, socket

The following function is available on Perls built on 64 bit OpenVMS v8.2 with hard links enabled on an ODS-5 formatted build disk. CRTL support is in principle available as of OpenVMS v7.3-1, and better configuration support could detect this.

    link

The following functions are available on Perls built on 64 bit OpenVMS v8.2 and later. CRTL support is in principle available as of OpenVMS v7.3-2, and better configuration support could detect this.

    getgrgid, getgrnam, getpwnam, getpwuid,
    setgrent, ttyname

The following functions are available on Perls built on 64 bit OpenVMS v8.2 and later.

    statvfs, socketpair

  • File tests

    The tests -b , -B , -c , -C , -d , -e , -f , -o , -M , -s , -S , -t , -T , and -z work as advertised. The return values for -r , -w , and -x tell you whether you can actually access the file; this may not reflect the UIC-based file protections. Since real and effective UIC don't differ under VMS, -O , -R , -W , and -X are equivalent to -o , -r , -w , and -x . Similarly, several other tests, including -A , -g , -k , -l , -p , and -u , aren't particularly meaningful under VMS, and the values returned by these tests reflect whatever your CRTL stat() routine does to the equivalent bits in the st_mode field. Finally, -d returns true if passed a device specification without an explicit directory (e.g. DUA1: ), as well as if passed a directory.

    There are DECC feature logical names AND ODS-5 volume attributes that also control what values are returned for the date fields.

    Note: Some sites have reported problems when using the file-access tests (-r , -w , and -x ) on files accessed via DEC's DFS. Specifically, since DFS does not currently provide access to the extended file header of files on remote volumes, attempts to examine the ACL fail, and the file tests will return false, with $! indicating that the file does not exist. You can use stat on these files, since that checks UIC-based protection only, and then manually check the appropriate bits, as defined by your C compiler's stat.h, in the mode value it returns, if you need an approximation of the file's protections.
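The manual check suggested in the note above can be sketched with the mode constants exported by Fcntl. Only the owner bits are examined here; owner_access is a hypothetical helper, and the result is an approximation that ignores ACLs.

```perl
use strict;
use warnings;
use Fcntl qw(:mode);

# Approximate a file's owner protections from the mode value that
# stat returns. Returns nothing if the file cannot be examined.
sub owner_access {
    my ($file) = @_;
    my @st = stat($file);
    return unless @st;
    my $mode = $st[2];
    return +{
        read    => ($mode & S_IRUSR) ? 1 : 0,
        write   => ($mode & S_IWUSR) ? 1 : 0,
        execute => ($mode & S_IXUSR) ? 1 : 0,
    };
}

if (my $access = owner_access($0)) {
    print "$0: ", join(', ', map { "$_=$access->{$_}" } sort keys %$access), "\n";
}
```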

  • backticks

    Backticks create a subprocess, and pass the enclosed string to it for execution as a DCL command. Since the subprocess is created directly via lib$spawn() , any valid DCL command string may be specified.

  • binmode FILEHANDLE

    The binmode operator will attempt to ensure that no translation of carriage control occurs on input from or output to this filehandle. Since this involves reopening the file and then restoring its file position indicator, if this function returns FALSE, the underlying filehandle may no longer point to an open file, or may point to a different position in the file than before binmode was called.

    Note that binmode is generally not necessary when using normal filehandles; it is provided so that you can control I/O to existing record-structured files when necessary. You can also use the vmsfopen function in the VMS::Stdio extension to gain finer control of I/O to files and devices with different record structures.

  • crypt PLAINTEXT, USER

    The crypt operator uses the sys$hash_password system service to generate the hashed representation of PLAINTEXT. If USER is a valid username, the algorithm and salt values are taken from that user's UAF record. If it is not, then the preferred algorithm and a salt of 0 are used. The quadword encrypted value is returned as an 8-character string.

    The value returned by crypt may be compared against the encrypted password from the UAF returned by the getpw* functions, in order to authenticate users. If you're going to do this, remember that the encrypted password in the UAF was generated using uppercase username and password strings; you'll have to upcase the arguments to crypt to ensure that you'll get the proper value:

    1. sub validate_passwd {
    2. my($user,$passwd) = @_;
    3. my($pwdhash);
    4. if ( !($pwdhash = (getpwnam($user))[1]) ||
    5. $pwdhash ne crypt("\U$passwd","\U$name") ) {
    6. intruder_alert($name);
    7. }
    8. return 1;
    9. }
  • die

    die will force the native VMS exit status to be an SS$_ABORT code if neither of the $! or $? status values are ones that would cause the native status to be interpreted as being what VMS classifies as SEVERE_ERROR severity for DCL error handling.

    When PERL_VMS_POSIX_EXIT is active (see $? below), the native VMS exit status value will have one of $!, $?, $^E, or the Unix value 255 encoded into it in a way that the effective original value can be decoded by other programs written in C, including Perl and the GNV package. As per the normal non-VMS behavior of die, if either $! or $? is non-zero, one of those values will be encoded into a native VMS status value. If both of the Unix status values are 0, and the $^E value is set to one of ERROR or SEVERE_ERROR severity, then the $^E value will be used as the exit code as is. If none of the above apply, the Unix value of 255 will be encoded into a native VMS exit status value.

    Please note a significant difference in the behavior of die in the PERL_VMS_POSIX_EXIT mode is that it does not force a VMS SEVERE_ERROR status on exit. The Unix exit values of 2 through 255 will be encoded in VMS status values with severity levels of SUCCESS. The Unix exit value of 1 will be encoded in a VMS status value with a severity level of ERROR. This is to be compatible with how the VMS C library encodes these values.

    The minimum severity level set by die in PERL_VMS_POSIX_EXIT mode may be changed to be ERROR or higher in the future depending on the results of testing and further review.

    See $? for a description of the encoding of the Unix value to produce a native VMS status containing it.

  • dump

    Rather than causing Perl to abort and dump core, the dump operator invokes the VMS debugger. If you continue to execute the Perl program under the debugger, control will be transferred to the label specified as the argument to dump, or, if no label was specified, back to the beginning of the program. All other state of the program (e.g. values of variables, open file handles) is not affected by calling dump.

  • exec LIST

    A call to exec will cause Perl to exit, and to invoke the command given as an argument to exec via lib$do_command . If the argument begins with '@' or '$' (other than as part of a filespec), then it is executed as a DCL command. Otherwise, the first token on the command line is treated as the filespec of an image to run, and an attempt is made to invoke it (using .Exe and the process defaults to expand the filespec) and pass the rest of exec's argument to it as parameters. If the token has no file type, and matches a file with null type, then an attempt is made to determine whether the file is an executable image which should be invoked using MCR or a text file which should be passed to DCL as a command procedure.

  • fork

    While in principle the fork operator could be implemented via (and with the same rather severe limitations as) the CRTL vfork() routine, and while some internal support to do just that is in place, the implementation has never been completed, making fork currently unavailable. A true kernel fork() is expected in a future version of VMS, and the pseudo-fork based on interpreter threads may be available in a future version of Perl on VMS (see perlfork). In the meantime, use system, backticks, or piped filehandles to create subprocesses.

  • getpwent
  • getpwnam
  • getpwuid

    These operators obtain the information described in perlfunc, if you have the privileges necessary to retrieve the named user's UAF information via sys$getuai . If not, then only the $name , $uid , and $gid items are returned. The $dir item contains the login directory in VMS syntax, while the $comment item contains the login directory in Unix syntax. The $gcos item contains the owner field from the UAF record. The $quota item is not used.

  • gmtime

    The gmtime operator will function properly if you have a working CRTL gmtime() routine, or if the logical name SYS$TIMEZONE_DIFFERENTIAL is defined as the number of seconds which must be added to UTC to yield local time. (This logical name is defined automatically if you are running a version of VMS with built-in UTC support.) If neither of these cases is true, a warning message is printed, and undef is returned.

  • kill

    In most cases, kill is implemented via the undocumented system service $SIGPRC , which has the same calling sequence as $FORCEX , but throws an exception in the target process rather than forcing it to call $EXIT . Generally speaking, kill follows the behavior of the CRTL's kill() function, but unlike that function can be called from within a signal handler. Also, unlike the kill in some versions of the CRTL, Perl's kill checks the validity of the signal passed in and returns an error rather than attempting to send an unrecognized signal.

    Also, negative signal values don't do anything special under VMS; they're just converted to the corresponding positive value.

  • qx//

    See the entry on backticks above.

  • select (system call)

    If Perl was not built with socket support, the system call version of select is not available at all. If socket support is present, then the system call version of select functions only for file descriptors attached to sockets. It will not provide information about regular files or pipes, since the CRTL select() routine does not provide this functionality.

  • stat EXPR

    Since VMS keeps track of files according to a different scheme than Unix, it's not really possible to represent the file's ID in the st_dev and st_ino fields of a struct stat . Perl tries its best, though, and the values it uses are pretty unlikely to be the same for two different files. We can't guarantee this, though, so caveat scriptor.
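
A common use of these two fields is to guess whether two paths name the same file. The sketch below shows that comparison; the function name `probably_same_file` is an assumption chosen to reflect the caveat above, since on VMS matching values are only a strong hint, not proof.

```perl
use strict;
use warnings;

# Compare the device and inode fields that stat reports for two paths.
# Returns 1 when both match, 0 when they differ, and an empty list
# (undef in scalar context) when either stat call fails.
sub probably_same_file {
    my ($path1, $path2) = @_;
    my @s1 = stat($path1);
    my @s2 = stat($path2);
    return unless @s1 && @s2;
    return ($s1[0] == $s2[0] && $s1[1] == $s2[1]) ? 1 : 0;
}
```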

  • system LIST

    The system operator creates a subprocess, and passes its arguments to the subprocess for execution as a DCL command. Since the subprocess is created directly via lib$spawn() , any valid DCL command string may be specified. If the string begins with '@', it is treated as a DCL command unconditionally. Otherwise, if the first token contains a character used as a delimiter in file specification (e.g. : or ]), an attempt is made to expand it using a default type of .Exe and the process defaults, and if successful, the resulting file is invoked via MCR . This allows you to invoke an image directly simply by passing the file specification to system, a common Unixish idiom. If the token has no file type, and matches a file with null type, then an attempt is made to determine whether the file is an executable image which should be invoked using MCR or a text file which should be passed to DCL as a command procedure.

    If LIST consists of the empty string, system spawns an interactive DCL subprocess, in the same fashion as typing SPAWN at the DCL prompt.

    Perl waits for the subprocess to complete before continuing execution in the current process. As described in perlfunc, the return value of system is a fake "status" which follows POSIX semantics unless the pragma use vmsish 'status' is in effect; see the description of $? in this document for more detail.
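
As a rough illustration of the default POSIX-style status, the following sketch runs a child and extracts its termination status from the second byte of $?. It assumes `use vmsish 'status'` is not in effect; the helper name `run_and_get_exit` is invented here, and the child is simply another copy of the running perl (`$^X`) so that the example does not depend on any particular DCL command.

```perl
use strict;
use warnings;

# Run a command and return the termination status held in bits 8..15 of $?.
# Returns an empty list (undef in scalar context) if the command could not
# be started at all.
sub run_and_get_exit {
    my @command = @_;
    my $rc = system(@command);
    return if $rc == -1;
    return $? >> 8;
}

my $exit_value = run_and_get_exit($^X, '-e', 'exit 3');
print defined $exit_value
    ? "child exit value: $exit_value\n"
    : "child could not be started: $!\n";
```

What value the child's `exit 3` produces on VMS depends on the encoding rules described under $? in this document, so treat the printed number as something to observe rather than assume.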

  • time

    The value returned by time is the offset in seconds from 01-JAN-1970 00:00:00 (just like the CRTL's time() routine), in order to make life easier for code coming in from the POSIX/Unix world.

  • times

    The array returned by the times operator is divided up according to the same rules as the CRTL times() routine. Therefore, the "system time" elements will always be 0, since there is no difference between "user time" and "system time" under VMS, and the time accumulated by a subprocess may or may not appear separately in the "child time" field, depending on whether times() keeps track of subprocesses separately. Note especially that the VAXCRTL (at least) keeps track only of subprocesses spawned using fork() and exec(); it will not accumulate the times of subprocesses spawned via pipes, system(), or backticks.

  • unlink LIST

    unlink will delete the highest version of a file only; in order to delete all versions, you need to say

    1 while unlink LIST;

    You may need to make this change to scripts written for a Unix system which expect that after a call to unlink, no files with the names passed to unlink will exist. (Note: This can be changed at compile time; if you use Config and $Config{'d_unlink_all_versions'} is define , then unlink will delete all versions of a file on the first call.)
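
For scripts that must behave the same way on Unix and VMS, the loop can be wrapped in a small helper. This is a sketch only; `unlink_all_versions` is a hypothetical name, and the return value counts successful unlink calls, which on VMS would be the number of versions removed.

```perl
use strict;
use warnings;

# Call unlink repeatedly on each name until it stops succeeding, so that
# no version of the file is left behind. Returns the number of successful
# unlink calls across all names.
sub unlink_all_versions {
    my @files = @_;
    my $deleted = 0;
    for my $file (@files) {
        $deleted++ while unlink $file;
    }
    return $deleted;
}
```

On a system without file versions, or when unlink already removes all versions in one call, the loop runs once per file and then stops, so the helper is harmless there.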

    unlink will delete a file if at all possible, even if it requires changing file protection (though it won't try to change the protection of the parent directory). You can tell whether you've got explicit delete access to a file by using the VMS::Filespec::candelete operator. For instance, in order to delete only files to which you have delete access, you could say something like

    sub safe_unlink {
        my($file,$num);
        foreach $file (@_) {
            next unless VMS::Filespec::candelete($file);
            $num += unlink $file;
        }
        $num;
    }

    (or you could just use VMS::Stdio::remove , if you've installed the VMS::Stdio extension distributed with Perl). If unlink has to change the file protection to delete the file, and you interrupt it in midstream, the file may be left intact, but with a changed ACL allowing you delete access.

    This behavior of unlink is to be compatible with POSIX behavior and not traditional VMS behavior.

  • utime LIST

    This operator changes only the modification time of the file (VMS revision date) on ODS-2 volumes and ODS-5 volumes without access dates enabled. On ODS-5 volumes with access dates enabled, the true access time is modified.

  • waitpid PID,FLAGS

    If PID is a subprocess started by a piped open() (see open), waitpid will wait for that subprocess, and return its final status value in $? . If PID is a subprocess created in some other way (e.g. SPAWNed before Perl was invoked), waitpid will simply check once per second whether the process has completed, and return when it has. (If PID specifies a process that isn't a subprocess of the current process, and you invoked Perl with the -w switch, a warning will be issued.)

    Returns PID on success, -1 on error. The FLAGS argument is ignored in all cases.

Perl variables

The following VMS-specific information applies to the indicated "special" Perl variables, in addition to the general information in perlvar. Where there is a conflict, this information takes precedence.

  • %ENV

    The operation of the %ENV array depends on the translation of the logical name PERL_ENV_TABLES. If defined, it should be a search list, each element of which specifies a location for %ENV elements. If you tell Perl to read or set the element $ENV{name}, then Perl uses the translations of PERL_ENV_TABLES as follows:

    • CRTL_ENV

      This string tells Perl to consult the CRTL's internal environ array of key-value pairs, using name as the key. In most cases, this contains only a few keys, but if Perl was invoked via the C exec[lv]e() function, as is the case for CGI processing by some HTTP servers, then the environ array may have been populated by the calling program.

    • CLISYM_[LOCAL]

      A string beginning with CLISYM_ tells Perl to consult the CLI's symbol tables, using name as the name of the symbol. When reading an element of %ENV , the local symbol table is scanned first, followed by the global symbol table. The characters following CLISYM_ are significant when an element of %ENV is set or deleted: if the complete string is CLISYM_LOCAL , the change is made in the local symbol table; otherwise the global symbol table is changed.

    • Any other string

      If an element of PERL_ENV_TABLES translates to any other string, that string is used as the name of a logical name table, which is consulted using name as the logical name. The normal search order of access modes is used.

    PERL_ENV_TABLES is translated once when Perl starts up; any changes you make while Perl is running do not affect the behavior of %ENV . If PERL_ENV_TABLES is not defined, then Perl defaults to consulting first the logical name tables specified by LNM$FILE_DEV, and then the CRTL environ array.

    In all operations on %ENV, the key string is treated as if it were entirely uppercase, regardless of the case actually specified in the Perl expression.

    When an element of %ENV is read, the locations to which PERL_ENV_TABLES points are checked in order, and the value obtained from the first successful lookup is returned. If the name of the %ENV element contains a semi-colon, it and any characters after it are removed. These are ignored when the CRTL environ array or a CLI symbol table is consulted. However, if the name is looked up in a logical name table, the suffix after the semi-colon is treated as the translation index to be used for the lookup. This lets you look up successive values for search list logical names. For instance, if you say

    $ Define STORY once,upon,a,time,there,was
    $ perl -e "for ($i = 0; $i <= 6; $i++) " -
    _$ -e "{ print $ENV{'story;'.$i},' '}"

    Perl will print ONCE UPON A TIME THERE WAS , assuming, of course, that PERL_ENV_TABLES is set up so that the logical name story is found, rather than a CLI symbol or CRTL environ element with the same name.
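
Inside a script, the same indexed lookup can be wrapped in a helper that collects every translation of a search list. This sketch rests on two assumptions that the text does not state outright: that an index past the last translation yields undef, and that a cap of 128 lookups is generous enough. The name `search_list_values` and the optional hash reference argument (used so the helper can be tried without a real logical name) are invented for the example.

```perl
use strict;
use warnings;

# Collect the translations of a search list logical name by reading
# "$name;0", "$name;1", ... until a lookup returns undef or the cap is hit.
sub search_list_values {
    my ($name, $limit, $env) = @_;
    $limit = 128 unless defined $limit;    # arbitrary cap, an assumption
    $env ||= \%ENV;                        # any hash reference will do
    my @values;
    for my $index (0 .. $limit - 1) {
        my $value = $env->{"$name;$index"};
        last unless defined $value;
        push @values, $value;
    }
    return @values;
}

# Stand-in data so the helper can be exercised away from VMS.
my %fake_env = ('STORY;0' => 'ONCE', 'STORY;1' => 'UPON', 'STORY;2' => 'A');
print join(' ', search_list_values('STORY', undef, \%fake_env)), "\n";
```

With the real %ENV the key is treated as uppercase automatically; the stand-in hash above has no such behavior, so its keys are written in uppercase.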

    When an element of %ENV is set to a defined string, the corresponding definition is made in the location to which the first translation of PERL_ENV_TABLES points. If this causes a logical name to be created, it is defined in supervisor mode. (The same is done if an existing logical name was defined in executive or kernel mode; an existing user or supervisor mode logical name is reset to the new value.) If the value is an empty string, the logical name's translation is defined as a single NUL (ASCII 00) character, since a logical name cannot translate to a zero-length string. (This restriction does not apply to CLI symbols or CRTL environ values; they are set to the empty string.) An element of the CRTL environ array can be set only if your copy of Perl knows about the CRTL's setenv() function. (This is present only in some versions of the DECCRTL; check $Config{d_setenv} to see whether your copy of Perl was built with a CRTL that has this function.)

    When an element of %ENV is set to undef, the element is looked up as if it were being read, and if it is found, it is deleted. (An item "deleted" from the CRTL environ array is set to the empty string; this can only be done if your copy of Perl knows about the CRTL setenv() function.) Using delete to remove an element from %ENV has a similar effect, but after the element is deleted, another attempt is made to look up the element, so an inner-mode logical name or a name in another location will replace the logical name just deleted. In either case, only the first value found searching PERL_ENV_TABLES is altered. It is not possible at present to define a search list logical name via %ENV.

    The element $ENV{DEFAULT} is special: when read, it returns Perl's current default device and directory, and when set, it resets them, regardless of the definition of PERL_ENV_TABLES. It cannot be cleared or deleted; attempts to do so are silently ignored.

    Note that if you want to pass on any elements of the C-local environ array to a subprocess which isn't started by fork/exec, or isn't running a C program, you can "promote" them to logical names in the current process, which will then be inherited by all subprocesses, by saying

    foreach my $key (qw[C-local keys you want promoted]) {
        my $temp = $ENV{$key};     # read from C-local array
        $ENV{$key} = $temp;        # and define as logical name
    }

    (You can't just say $ENV{$key} = $ENV{$key} , since the Perl optimizer is smart enough to elide the expression.)

    Don't try to clear %ENV by saying %ENV = (); it will throw a fatal error. This is equivalent to doing the following from DCL:

    DELETE/LOGICAL *

    You can imagine how bad things would be if, for example, the SYS$MANAGER or SYS$SYSTEM logical names were deleted.

    At present, the first time you iterate over %ENV using keys or values, you will incur a time penalty as all logical names are read, in order to fully populate %ENV. Subsequent iterations will not reread logical names, so they won't be as slow, but they also won't reflect any changes to logical name tables caused by other programs.

    You do need to be careful with the logical names representing process-permanent files, such as SYS$INPUT and SYS$OUTPUT . The translations for these logical names are prepended with a two-byte binary value (0x1B 0x00) that needs to be stripped off if you want to use it. (In previous versions of Perl it wasn't possible to get the values of these logical names, as the null byte acted as an end-of-string marker.)
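
Stripping that prefix can be done with a single anchored substitution. The sketch below takes the two-byte value quoted above at face value; if the translations on your system carry additional binary bytes after it, the pattern would need adjusting. The name `strip_ppf_prefix` is made up for the example.

```perl
use strict;
use warnings;

# Remove a leading 0x1B 0x00 pair from a logical name translation.
# Values without the prefix are returned unchanged.
sub strip_ppf_prefix {
    my ($translation) = @_;
    return unless defined $translation;
    (my $clean = $translation) =~ s/\A\x1B\x00//;
    return $clean;
}
```

A typical call would be `my $output_device = strip_ppf_prefix($ENV{'SYS$OUTPUT'});`, using single quotes so that Perl does not try to interpolate `$OUTPUT`.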

  • $!

    The string value of $! is that returned by the CRTL's strerror() function, so it will include the VMS message for VMS-specific errors. The numeric value of $! is the value of errno , except if errno is EVMSERR, in which case $! contains the value of vaxc$errno. Setting $! always sets errno to the value specified. If this value is EVMSERR, it also sets vaxc$errno to 4 (NONAME-F-NOMSG), so that the string value of $! won't reflect the VMS error message from before $! was set.

  • $^E

    This variable provides direct access to VMS status values in vaxc$errno, which are often more specific than the generic Unix-style error messages in $! . Its numeric value is the value of vaxc$errno, and its string value is the corresponding VMS message string, as retrieved by sys$getmsg(). Setting $^E sets vaxc$errno to the value specified.

    While Perl attempts to keep the vaxc$errno value current, if errno is not EVMSERR, it may not be from the current operation.

  • $?

    The "status value" returned in $? is synthesized from the actual exit status of the subprocess in a way that approximates POSIX wait(5) semantics, in order to allow Perl programs to portably test for successful completion of subprocesses. The low order 8 bits of $? are always 0 under VMS, since the termination status of a process may or may not have been generated by an exception.

    The next 8 bits contain the termination status of the program.

    If the child process follows the convention of C programs compiled with the _POSIX_EXIT macro set, the status value will contain the actual value of 0 to 255 returned by that program on a normal exit.

    With the _POSIX_EXIT macro set, the Unix exit value of zero is represented as a VMS native status of 1, and the Unix values from 2 to 255 are encoded by the equation:

    VMS_status = 0x35a000 + (unix_value * 8) + 1.

    And in the special case of Unix value 1 the encoding is:

    VMS_status = 0x35a000 + 8 + 2 + 0x10000000.
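
The two equations, together with the rule that Unix 0 maps to VMS status 1, can be written out as a pair of functions. The encoder follows the text directly. The decoder is an assumption: it is nothing more than the arithmetic inverse of those equations (mask off the top four bits and the low eleven bits, compare with 0x35a000, then shift right by three), not a documented interface. Both function names are invented.

```perl
use strict;
use warnings;
use constant POSIX_EXIT_BASE => 0x35a000;

# Encode a Unix exit value (0..255) as described by the equations above.
sub encode_posix_exit {
    my ($unix) = @_;
    die "exit value must be an integer from 0 to 255\n"
        unless defined $unix && $unix =~ /\A\d+\z/ && $unix <= 255;
    return 1 if $unix == 0;
    return POSIX_EXIT_BASE + 8 + 2 + 0x10000000 if $unix == 1;
    return POSIX_EXIT_BASE + ($unix * 8) + 1;
}

# Recover the Unix exit value, or return nothing if the status does not
# look like one produced by encode_posix_exit (an inferred inverse).
sub decode_posix_exit {
    my ($vms) = @_;
    return 0 if $vms == 1;
    return unless ($vms & 0x0FFFF800) == POSIX_EXIT_BASE;
    return ($vms >> 3) & 0xFF;
}

printf "Unix %3d -> VMS 0x%08x\n", $_, encode_posix_exit($_) for 0, 1, 2, 255;
```

Note that a native status of 1 is indistinguishable from an encoded Unix 0, which is consistent with success being the value 0 in all behaviors.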

    For other termination statuses, the severity portion of the subprocess's exit status is used: if the severity was success or informational, these bits are all 0; if the severity was warning, they contain a value of 1; if the severity was error or fatal error, they contain the actual severity bits, which turns out to be a value of 2 for error and 4 for severe_error. Fatal is another term for the severe_error status.

    As a result, $? will always be zero if the subprocess's exit status indicated successful completion, and non-zero if a warning or error occurred or a program compliant with encoding _POSIX_EXIT values was run and set a status.
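
For statuses that are not _POSIX_EXIT encoded, the mapping from severity to $? can be sketched as a lookup table. The severity codes used here (0 warning, 1 success, 2 error, 3 informational, 4 severe) are the standard VMS condition-value severities held in the low three bits of a status. The function name is hypothetical, and passing the reserved severities 5 to 7 through unchanged is an assumption, since the text does not cover them.

```perl
use strict;
use warnings;

# Value placed in the second byte of $? for each VMS severity code.
my %STATUS_FOR_SEVERITY = (
    0 => 1,    # warning
    1 => 0,    # success
    2 => 2,    # error
    3 => 0,    # informational
    4 => 4,    # severe (fatal) error
);

# Approximate the emulated $? for a native, non-_POSIX_EXIT VMS status.
sub emulated_child_status {
    my ($vms_status) = @_;
    my $severity = $vms_status & 7;
    my $bits = exists $STATUS_FOR_SEVERITY{$severity}
        ? $STATUS_FOR_SEVERITY{$severity}
        : $severity;          # reserved severities: an assumption
    return $bits << 8;        # the low-order 8 bits are always 0
}
```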

    How can you tell whether a non-zero status is the result of a VMS native error status or an encoded Unix status? You cannot, unless you look at the ${^CHILD_ERROR_NATIVE} value, which holds the actual VMS status value, and check its severity bits. If the severity bits are equal to 1 and the numeric value of $? is 0 or between 2 and 255, then $? accurately reflects a value passed back from a Unix application. If $? is 1 and the severity bits indicate a VMS error (2), then $? is from a Unix application exit value.

    In practice, Perl scripts that call programs that return _POSIX_EXIT type status values will be expecting those values, and programs that call traditional VMS programs will either be expecting the previous behavior or just checking for a non-zero status.

    And success is always the value 0 in all behaviors.

    When the actual VMS termination status of the child is an error, internally the $! value will be set to the closest Unix errno value to that error so that Perl scripts that test for error messages will see the expected Unix style error message instead of a VMS message.

    Conversely, when setting $? in an END block, an attempt is made to convert the POSIX value into a native status intelligible to the operating system upon exiting Perl. What this boils down to is that setting $? to zero results in the generic success value SS$_NORMAL, and setting $? to a non-zero value results in the generic failure status SS$_ABORT. See also exit in perlport.

    With the PERL_VMS_POSIX_EXIT logical name defined as "ENABLE", setting $? will cause the new value to be encoded into $^E so that either the original parent or child exit status values 0 to 255 can be automatically recovered by C programs expecting _POSIX_EXIT behavior. If both a parent and a child exit value are non-zero, then it will be assumed that this is actually a VMS native status value to be passed through. The special value of 0xFFFF is almost a NOOP as it will cause the current native VMS status in the C library to become the current native Perl VMS status, and is handled this way as it is known to not be a valid native VMS status value. It is recommended that only values in the range of normal Unix parent or child status numbers, 0 to 255, are used.

    The pragma use vmsish 'status' makes $? reflect the actual VMS exit status instead of the default emulation of POSIX status described above. This pragma also disables the conversion of non-zero values to SS$_ABORT when setting $? in an END block (but zero will still be converted to SS$_NORMAL).

    Do not use the pragma use vmsish 'status' with PERL_VMS_POSIX_EXIT enabled, as they are at times requesting conflicting actions and the consequence of ignoring this advice will be undefined to allow future improvements in the POSIX exit handling.

    In general, with PERL_VMS_POSIX_EXIT enabled, more detailed information will be available in the exit status for DCL scripts or other native VMS tools, and will give the expected information for POSIX programs. It has not been made the default in order to preserve backward compatibility.

    N.B. Setting DECC$FILENAME_UNIX_REPORT implicitly enables PERL_VMS_POSIX_EXIT .

  • $|

    Setting $| for an I/O stream causes data to be flushed all the way to disk on each write (i.e. not just to the underlying RMS buffers for a file). In other words, it's equivalent to calling fflush() and fsync() from C.
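
Since $| applies to the currently selected handle, a per-handle way to turn it on is the autoflush method from IO::Handle. The sketch below assumes that approach; `open_autoflushed` is an invented helper name. On VMS each write through such a handle would then carry the flush-to-disk cost described above, so it is best reserved for handles that really need it.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the autoflush method on file handles

# Open a file for writing with $| enabled for that handle only.
sub open_autoflushed {
    my ($path) = @_;
    open(my $fh, '>', $path) or die "Cannot open $path: $!\n";
    $fh->autoflush(1);    # equivalent to selecting $fh and setting $| = 1
    return $fh;
}
```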

Standard modules with VMS-specific differences

SDBM_File

SDBM_File works properly on VMS. It has, however, one minor difference. The database directory file created has a .sdbm_dir extension rather than a .dir extension. .dir files are VMS filesystem directory files, and using them for other purposes could cause unacceptable problems.

Revision date

Please see the git repository for revision history.

AUTHOR

Charles Bailey bailey@cor.newman.upenn.edu Craig Berry craigberry@mac.com Dan Sugalski dan@sidhe.org John Malmberg wb8tyw@qsl.net

 

perlvos

NAME

perlvos - Perl for Stratus OpenVOS

SYNOPSIS

This file contains notes for building perl on the Stratus OpenVOS operating system. Perl is a scripting or macro language that is popular on many systems. See perlbook for a number of good books on Perl.

These are instructions for building Perl from source. This version of Perl requires the dynamic linking support that is found in OpenVOS Release 17.1 and thus is not supported on OpenVOS Release 17.0 or earlier releases.

If you are running VOS Release 14.4.1 or later, you can obtain a pre-compiled, supported copy of perl by purchasing the GNU Tools product from Stratus Technologies.

BUILDING PERL FOR OPENVOS

To build perl from its source code on the Stratus V Series platform you must have OpenVOS Release 17.1.0 or later, GNU Tools Release 3.5 or later, and the C/POSIX Runtime Libraries.

Follow the normal instructions for building perl; e.g., enter bash, run the Configure script, then use "gmake" to build perl.

INSTALLING PERL IN OPENVOS

  1. After you have built perl using the Configure script, ensure that you have modify and default write permission to >system>ported and all subdirectories. Then type

         gmake install

  2. While there are currently no architecture-specific extensions or modules distributed with perl, the following directories can be used to hold such files (replace the string VERSION by the appropriate version number):

         >system>ported>lib>perl5>VERSION>i786

  3. Site-specific perl extensions and modules can be installed in one of two places. Put architecture-independent files into:

         >system>ported>lib>perl5>site_perl>VERSION

     Put site-specific architecture-dependent files into one of the following directories:

         >system>ported>lib>perl5>site_perl>VERSION>i786

  4. You can examine the @INC variable from within a perl program to see the order in which Perl searches these directories.
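
A short script along these lines prints the search order. It is only a sketch; the helper name `inc_listing` is made up, and entries that are code references (hooks that some modules place in @INC) are shown as a placeholder rather than a path.

```perl
use strict;
use warnings;

# Build a numbered listing of @INC in the order Perl searches it.
sub inc_listing {
    my $position = 0;
    return join '', map {
        sprintf "%2d  %s\n", $position++, (ref($_) ? '(code hook)' : $_)
    } @INC;
}

print inc_listing();
```

The same information is available without a script from `perl -V`, which lists @INC at the end of its output.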

USING PERL IN OPENVOS

Restrictions of Perl on OpenVOS

This port of Perl version 5 prefers Unix-style, slash-separated pathnames over OpenVOS-style greater-than-separated pathnames. OpenVOS-style pathnames should work in most contexts, but if you have trouble, replace all greater-than characters by slash characters. Because the slash character is used as a pathname delimiter, Perl cannot process OpenVOS pathnames containing a slash character in a directory or file name; these must be renamed.
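
The replacement of greater-than characters by slashes can be done with tr///. The following is a sketch under the assumption that a path mixing both separators signals the unsupported case of a slash inside a name; the function name `vos_to_unix_path` is invented, and the code only rewrites the string without checking that the resulting path exists.

```perl
use strict;
use warnings;

# Convert an OpenVOS-style pathname to the slash-separated form.
# Paths without '>' are returned unchanged; paths that contain both
# '>' and '/' are rejected because the slash may be part of a name.
sub vos_to_unix_path {
    my ($path) = @_;
    return $path unless $path =~ />/;
    die "cannot convert '$path': it contains a slash character\n"
        if $path =~ m{/};
    (my $unix = $path) =~ tr{>}{/};
    return $unix;
}

print vos_to_unix_path('>system>ported>lib>perl5'), "\n";
```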

This port of Perl also uses Unix-epoch date values internally. As long as you are dealing with ASCII character string representations of dates, this should not be an issue. The supported epoch is January 1, 1980 to January 17, 2038.

See the file pod/perlport.pod for more information about the OpenVOS port of Perl.

TEST STATUS

A number of the perl self-tests fail for various reasons; generally these are minor and due to subtle differences between common POSIX-based environments and the OpenVOS POSIX environment. Ensure that you conduct sufficient testing of your code to guarantee that it works properly in the OpenVOS environment.

SUPPORT STATUS

I'm offering this port "as is". You can ask me questions, but I can't guarantee I'll be able to answer them. There are some excellent books available on the Perl language; consult a book seller.

If you want a supported version of perl for OpenVOS, purchase the OpenVOS GNU Tools product from Stratus Technologies, along with a support contract (or from anyone else who will sell you support).

AUTHOR

Paul Green (Paul.Green@stratus.com)

LAST UPDATE

February 28, 2013

 

perlwin32

NAME

perlwin32 - Perl under Windows

SYNOPSIS

These are instructions for building Perl under Windows 2000 and later.

DESCRIPTION

Before you start, you should glance through the README file found in the top-level directory to which the Perl distribution was extracted. Make sure you read and understand the terms under which this software is being distributed.

Also make sure you read BUGS AND CAVEATS below for the known limitations of this port.

The INSTALL file in the perl top-level has much information that is only relevant to people building Perl on Unix-like systems. In particular, you can safely ignore any information that talks about "Configure".

You may also want to look at one other option for building a perl that will work on Windows: the README.cygwin file, which gives a different set of rules to build a perl for Windows. This method will probably enable you to build a more Unix-compatible perl, but you will also need to download and use various other build-time and run-time support software described in that file.

This set of instructions is meant to describe a so-called "native" port of Perl to the Windows platform. This includes both 32-bit and 64-bit Windows operating systems. The resulting Perl requires no additional software to run (other than what came with your operating system). Currently, this port is capable of using one of the following compilers on the Intel x86 architecture:

  1. Microsoft Visual C++ version 6.0 or later
  2. Gcc by mingw.org gcc version 3.2 or later
  3. Gcc by mingw-w64.sf.net gcc version 4.4.3 or later

Note that the last two of these are actually competing projects, both delivering a complete gcc toolchain for MS Windows:

  • http://mingw.org

    Delivers gcc toolchain targeting 32-bit Windows platform.

  • http://mingw-w64.sf.net

    Delivers gcc toolchain targeting both 64-bit Windows and 32-bit Windows platforms (despite the project name "mingw-w64" they are not only 64-bit oriented). They deliver the native gcc compilers and cross-compilers that are also supported by perl's makefile.

The Microsoft Visual C++ compilers are also now being given away free. They are available as "Visual C++ Toolkit 2003" or "Visual C++ 2005/2008/2010/2012 Express Edition" (and also as part of the ".NET Framework SDK") and are the same compilers that ship with "Visual C++ .NET 2003 Professional" or "Visual C++ 2005/2008/2010/2012 Professional" respectively.

This port can also be built on IA64/AMD64 using:

  1. Microsoft Platform SDK Nov 2001 (64-bit compiler and tools)
  2. MinGW64 compiler (gcc version 4.4.3 or later)

The Windows SDK can be downloaded from http://www.microsoft.com/. The MinGW64 compiler is available at http://sourceforge.net/projects/mingw-w64. The latter is actually a cross-compiler targeting Win64. There's also a trimmed down compiler (no java or gfortran) suitable for building perl available at: http://strawberryperl.com/package/kmx/64_gcctoolchain/

NOTE: If you're using a 32-bit compiler to build perl on a 64-bit Windows operating system, then you should set the WIN64 environment variable to "undef". Also, the trimmed down compiler only passes tests when USE_ITHREADS *= define (as opposed to undef) and when the CFG *= Debug line is commented out.

This port fully supports MakeMaker (the set of modules that is used to build extensions to perl). Therefore, you should be able to build and install most extensions found in the CPAN sites. See Usage Hints for Perl on Windows below for general hints about this.

Setting Up Perl on Windows

  • Make

    You need a "make" program to build the sources. If you are using Visual C++ or the Windows SDK tools, nmake will work. Builds using gcc need dmake.

    dmake is a freely available make that has very nice macro features and support for parallel builds.

    A port of dmake for Windows is available from:

    http://search.cpan.org/dist/dmake/

    Fetch and install dmake somewhere on your path.

  • Command Shell

    Use the default "cmd" shell that comes with Windows. Some versions of the popular 4DOS/NT shell have incompatibilities that may cause you trouble. If the build fails under that shell, try building again with the cmd shell.

    Make sure the path to the build directory does not contain spaces. The build usually works in this circumstance, but some tests will fail.

  • Microsoft Visual C++

    The nmake that comes with Visual C++ will suffice for building. You will need to run the VCVARS32.BAT file, usually found somewhere like C:\Program Files\Microsoft Visual Studio\VC98\Bin. This will set your build environment.

    You can also use dmake to build using Visual C++, provided that you set OSRELEASE to "microsft" (or whatever directory name the Visual C dmake configuration lives under) in your environment and edit win32/config.vc to change "make=nmake" into "make=dmake". The latter step is only essential if you want to use dmake as your default make for building extensions using MakeMaker.

  • Microsoft Visual C++ 2008/2010/2012 Express Edition

    These free versions of Visual C++ 2008/2010/2012 Professional contain the same compilers and linkers that ship with the full versions, and also contain everything necessary to build Perl, rather than requiring a separate download of the Windows SDK like previous versions did.

    These packages can be downloaded by searching in the Download Center at http://www.microsoft.com/downloads/search.aspx?displaylang=en. (Providing exact links to these packages has proven a pointless task because the links keep on changing so often.)

    Install Visual C++ 2008/2010/2012 Express, then set up your environment using, e.g.

        C:\Program Files\Microsoft Visual Studio 11.0\Common7\Tools\vsvars32.bat

    (assuming the default installation location was chosen).

    Perl should now build using the win32/Makefile. You will need to edit that file to set CCTYPE to MSVC90FREE or MSVC100FREE first.

  • Microsoft Visual C++ 2005 Express Edition

    This free version of Visual C++ 2005 Professional contains the same compiler and linker that ship with the full version, but doesn't contain everything necessary to build Perl.

    You will also need to download the "Windows SDK" (the "Core SDK" and "MDAC SDK" components are required) for more header files and libraries.

    These packages can both be downloaded by searching in the Download Center at http://www.microsoft.com/downloads/search.aspx?displaylang=en. (Providing exact links to these packages has proven a pointless task because the links keep on changing so often.)

    Try to obtain the latest version of the Windows SDK. Sometimes these packages contain a particular Windows OS version in their name, but actually work on other OS versions too. For example, the "Windows Server 2003 R2 Platform SDK" also runs on Windows XP SP2 and Windows 2000.

    Install Visual C++ 2005 first, then the Platform SDK. Set up your environment as follows (assuming default installation locations were chosen):

        SET PlatformSDKDir=C:\Program Files\Microsoft Platform SDK
        SET PATH=%SystemRoot%\system32;%SystemRoot%;C:\Program Files\Microsoft Visual Studio 8\Common7\IDE;C:\Program Files\Microsoft Visual Studio 8\VC\BIN;C:\Program Files\Microsoft Visual Studio 8\Common7\Tools;C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\bin;C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727;C:\Program Files\Microsoft Visual Studio 8\VC\VCPackages;%PlatformSDKDir%\Bin
        SET INCLUDE=C:\Program Files\Microsoft Visual Studio 8\VC\INCLUDE;%PlatformSDKDir%\include
        SET LIB=C:\Program Files\Microsoft Visual Studio 8\VC\LIB;C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\lib;%PlatformSDKDir%\lib
        SET LIBPATH=C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727

    (The PlatformSDKDir might need to be set differently depending on which version you are using. Earlier versions installed into "C:\Program Files\Microsoft SDK", while the latest versions install into version-specific locations such as "C:\Program Files\Microsoft Platform SDK for Windows Server 2003 R2".)

    Perl should now build using the win32/Makefile. You will need to edit that file to set

        CCTYPE = MSVC80FREE

    and to set CCHOME, CCINCDIR and CCLIBDIR as per the environment setup above.

  • Microsoft Visual C++ Toolkit 2003

    This free toolkit contains the same compiler and linker that ship with Visual C++ .NET 2003 Professional, but doesn't contain everything necessary to build Perl.

    You will also need to download the "Platform SDK" (the "Core SDK" and "MDAC SDK" components are required) for header files, libraries and rc.exe, and ".NET Framework SDK" for more libraries and nmake.exe. Note that the latter (which also includes the free compiler and linker) requires the ".NET Framework Redistributable" to be installed first. This can be downloaded and installed separately, but is included in the "Visual C++ Toolkit 2003" anyway.

    These packages can all be downloaded by searching in the Download Center at http://www.microsoft.com/downloads/search.aspx?displaylang=en. (Providing exact links to these packages has proven a pointless task because the links keep on changing so often.)

    Try to obtain the latest version of the Windows SDK. Sometimes these packages contain a particular Windows OS version in their name, but actually work on other OS versions too. For example, the "Windows Server 2003 R2 Platform SDK" also runs on Windows XP SP2 and Windows 2000.

    Install the Toolkit first, then the Platform SDK, then the .NET Framework SDK. Set up your environment as follows (assuming default installation locations were chosen):

        SET PlatformSDKDir=C:\Program Files\Microsoft Platform SDK
        SET PATH=%SystemRoot%\system32;%SystemRoot%;C:\Program Files\Microsoft Visual C++ Toolkit 2003\bin;%PlatformSDKDir%\Bin;C:\Program Files\Microsoft.NET\SDK\v1.1\Bin
        SET INCLUDE=C:\Program Files\Microsoft Visual C++ Toolkit 2003\include;%PlatformSDKDir%\include;C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\include
        SET LIB=C:\Program Files\Microsoft Visual C++ Toolkit 2003\lib;%PlatformSDKDir%\lib;C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\lib

    (The PlatformSDKDir might need to be set differently depending on which version you are using. Earlier versions installed into "C:\Program Files\Microsoft SDK", while the latest versions install into version-specific locations such as "C:\Program Files\Microsoft Platform SDK for Windows Server 2003 R2".)

    Several required files will still be missing:

    • cvtres.exe is required by link.exe when using a .res file. It is actually installed by the .NET Framework SDK, but into a location such as the following:

          C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322

      Copy it from there to %PlatformSDKDir%\Bin

    • lib.exe is normally used to build libraries, but link.exe with the /lib option also works, so change win32/config.vc to use it instead:

      Change the line reading:

          ar='lib'

      to:

          ar='link /lib'

      It may also be useful to create a batch file called lib.bat in C:\Program Files\Microsoft Visual C++ Toolkit 2003\bin containing:

          @echo off
          link /lib %*

      for the benefit of any naughty C extension modules that you might want to build later which explicitly reference "lib" rather than taking their value from $Config{ar}.

    • setargv.obj is required to build perlglob.exe (and perl.exe if the USE_SETARGV option is enabled). The Platform SDK supplies this object file in source form in %PlatformSDKDir%\src\crt. Copy setargv.c, cruntime.h and internal.h from there to some temporary location and build setargv.obj using

          cl.exe /c /I. /D_CRTBLD setargv.c

      Then copy setargv.obj to %PlatformSDKDir%\lib

      Alternatively, if you don't need perlglob.exe and don't need to enable the USE_SETARGV option then you can safely just remove all mention of $(GLOBEXE) from win32/Makefile and setargv.obj won't be required anyway.

    Perl should now build using the win32/Makefile. You will need to edit that file to set

        CCTYPE = MSVC70FREE

    and to set CCHOME, CCINCDIR and CCLIBDIR as per the environment setup above.

  • Microsoft Platform SDK 64-bit Compiler

    The nmake that comes with the Platform SDK will suffice for building Perl. Make sure you are building within one of the "Build Environment" shells available after you install the Platform SDK from the Start Menu.

  • MinGW release 3 with gcc

    Perl can be compiled with gcc from MinGW release 3 and later (using gcc 3.2.x and later). It can be downloaded here:

    http://www.mingw.org/

    You also need dmake. See Make above on how to get it.

Building

  • Make sure you are in the "win32" subdirectory under the perl toplevel. This directory contains a "Makefile" that will work with versions of nmake that come with Visual C++ or the Windows SDK, and a dmake "makefile.mk" that will work for all supported compilers. The defaults in the dmake makefile are set up to build using MinGW/gcc.

  • Edit the makefile.mk (or Makefile, if you're using nmake) and change the values of INST_DRV and INST_TOP. You can also enable various build flags. These are explained in the makefiles.

    Note that it is generally not a good idea to try to build a perl with INST_DRV and INST_TOP set to a path that already exists from a previous build. In particular, this may cause problems with the lib/ExtUtils/t/Embed.t test, which attempts to build a test program and may end up building against the installed perl's lib/CORE directory rather than the one being tested.

    You will have to make sure that CCTYPE is set correctly and that CCHOME points to wherever you installed your compiler.

    If building with the cross-compiler provided by mingw-w64.sourceforge.net you'll need to uncomment the line that sets GCCCROSS in the makefile.mk. Do this only if it's the cross-compiler, i.e. only if the bin folder doesn't contain a gcc.exe. (The cross-compiler does not provide a gcc.exe, g++.exe, ar.exe, etc. Instead, all of these executables are prefixed with 'x86_64-w64-mingw32-'.)

    The default value for CCHOME in the makefiles for Visual C++ may not be correct for some versions. Make sure the default exists and is valid.

    You may also need to comment out the DELAYLOAD = ... line in the Makefile if you're using VC++ 6.0 without the latest service pack and the linker reports an internal error.

    If you want to build some core extensions statically into perl's dll, specify them in the STATIC_EXT macro.

    Be sure to read the instructions near the top of the makefiles carefully.

  • Type "dmake" (or "nmake" if you are using that make).

    This should build everything. Specifically, it will create perl.exe and perl518.dll at the perl toplevel, and various other extension DLLs under the lib\auto directory. If the build fails for any reason, make sure you have done the previous steps correctly.

Testing Perl on Windows

Type "dmake test" (or "nmake test"). This will run most of the tests from the testsuite (many tests will be skipped).

There should be no test failures.

Some test failures may occur if you use a command shell other than the native "cmd.exe", or if you are building from a path that contains spaces. So don't do that.

If you are running the tests from an Emacs shell window, you may see failures in op/stat.t. Run "dmake test-notty" in that case.

If you run the tests on a FAT partition, you may see some failures for link() related tests (op/write.t, op/stat.t ...). Testing on NTFS avoids these errors.

Furthermore, you should make sure that during make test you do not have any GNU tool packages in your path: some toolkits like Unixutils include tools (type, for instance) which override the Windows ones and make tests fail. Remove them from your path while testing to avoid these errors.

Please report any other failures as described under BUGS AND CAVEATS.

Installation of Perl on Windows

Type "dmake install" (or "nmake install"). This will put the newly built perl and the libraries under whatever INST_TOP points to in the Makefile. It will also install the pod documentation under $INST_TOP\$INST_VER\lib\pod and HTML versions of the same under $INST_TOP\$INST_VER\lib\pod\html .

To use the Perl you just installed you will need to add a new entry to your PATH environment variable: $INST_TOP\bin , e.g.

    set PATH=c:\perl\bin;%PATH%

If you opted to uncomment INST_VER and INST_ARCH in the makefile then the installation structure is a little more complicated and you will need to add two new PATH components instead: $INST_TOP\$INST_VER\bin and $INST_TOP\$INST_VER\bin\$ARCHNAME , e.g.

    set PATH=c:\perl\5.6.0\bin;c:\perl\5.6.0\bin\MSWin32-x86;%PATH%

Usage Hints for Perl on Windows

  • Environment Variables

    The installation paths that you set during the build get compiled into perl, so you don't have to do anything additional to start using that perl (except add its location to your PATH variable).

    If you put extensions in unusual places, you can set PERL5LIB to a list of paths separated by semicolons where you want perl to look for libraries. Look for descriptions of other environment variables you can set in perlrun.

    You can also control the shell that perl uses to run system() and backtick commands via PERL5SHELL. See perlrun.

    Perl does not depend on the registry, but it can look up certain default values if you choose to put them there. Perl attempts to read entries from HKEY_CURRENT_USER\Software\Perl and HKEY_LOCAL_MACHINE\Software\Perl . Entries in the former override entries in the latter. One or more of the following entries (of type REG_SZ or REG_EXPAND_SZ) may be set:

        lib-$]          version-specific standard library path to add to @INC
        lib             standard library path to add to @INC
        sitelib-$]      version-specific site library path to add to @INC
        sitelib         site library path to add to @INC
        vendorlib-$]    version-specific vendor library path to add to @INC
        vendorlib       vendor library path to add to @INC
        PERL*           fallback for all %ENV lookups that begin with "PERL"

    Note the $] in the above is not literal. Substitute whatever version of perl you want to honor that entry, e.g. 5.6.0 . Paths must be separated with semicolons, as usual on Windows.
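
To check what perl actually picked up from PERL5LIB, a small script along the following lines may help. It is only a sketch: the helper name split_lib_path is made up for this example, and it simply compares strings, so an entry that perl has normalized differently may be reported as missing.

```perl
use strict;
use warnings;
use Config;

# Split a PERL5LIB-style string into directories using the platform's
# path separator (";" on Windows, ":" on most Unix systems).
# split_lib_path is a hypothetical helper, not part of perl.
sub split_lib_path {
    my ($value) = @_;
    return () unless defined $value && length $value;
    return grep { length } split /\Q$Config{path_sep}\E/, $value;
}

# Report each PERL5LIB entry and whether the same string appears in @INC.
for my $dir ( split_lib_path( $ENV{PERL5LIB} ) ) {
    my $in_inc = grep { $_ eq $dir } @INC;
    printf "%-40s %s\n", $dir, $in_inc ? 'in @INC' : 'not in @INC';
}
```

Running "perl -V" prints the full @INC as well, which is the quickest way to confirm the compiled-in paths mentioned above.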

  • File Globbing

    By default, perl handles file globbing using the File::Glob extension, which provides portable globbing.

    If you want perl to use globbing that emulates the quirks of DOS filename conventions, you might want to consider using File::DosGlob to override the internal glob() implementation. See File::DosGlob for details.

  • Using perl from the command line

    If you are accustomed to using perl from various command-line shells found in UNIX environments, you will be less than pleased with what Windows offers by way of a command shell.

    The crucial thing to understand about the Windows environment is that the command line you type in is processed twice before Perl sees it. First, your command shell (usually CMD.EXE) preprocesses the command line, to handle redirection, environment variable expansion, and location of the executable to run. Then, the perl executable splits the remaining command line into individual arguments, using the C runtime library upon which Perl was built.

    It is particularly important to note that neither the shell nor the C runtime does any wildcard expansion of command-line arguments (so wildcards need not be quoted). Also, the quoting behaviours of the shell and the C runtime are rudimentary at best (and may, if you are using a non-standard shell, be inconsistent). The only (useful) quote character is the double quote ("). It can be used to protect spaces and other special characters in arguments.

    The Windows documentation describes the shell parsing rules here: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/cmd.mspx?mfr=true and the C runtime parsing rules here: http://msdn.microsoft.com/en-us/library/17w5ykft%28v=VS.100%29.aspx.

    Here are some further observations based on experiments: The C runtime breaks arguments at spaces and passes them to programs in argc/argv. Double quotes can be used to prevent arguments with spaces in them from being split up. You can put a double quote in an argument by escaping it with a backslash and enclosing the whole argument within double quotes. The backslash and the pair of double quotes surrounding the argument will be stripped by the C runtime.

    The file redirection characters "<", ">", and "|" can be quoted by double quotes (although there are suggestions that this may not always be true). Single quotes are not treated as quotes by the shell or the C runtime; they don't get stripped by the shell (just to make this type of quoting completely useless). The caret "^" has also been observed to behave as a quoting character, but this appears to be a shell feature, and the caret is not stripped from the command line, so Perl still sees it (and the C runtime phase does not treat the caret as a quote character).

    Here are some examples of usage of the "cmd" shell:

    This prints two double quotes:

        perl -e "print '\"\"' "

    This does the same:

        perl -e "print \"\\\"\\\"\" "

    This prints "bar" and writes "foo" to the file "blurch":

        perl -e "print 'foo'; print STDERR 'bar'" > blurch

    This prints "foo" ("bar" disappears into nowhereland):

        perl -e "print 'foo'; print STDERR 'bar'" 2> nul

    This prints "bar" and writes "foo" into the file "blurch":

        perl -e "print 'foo'; print STDERR 'bar'" 1> blurch

    This pipes "foo" to the "less" pager and prints "bar" on the console:

        perl -e "print 'foo'; print STDERR 'bar'" | less

    This pipes "foo\nbar\n" to the less pager:

        perl -le "print 'foo'; print STDERR 'bar'" 2>&1 | less

    This pipes "foo" to the pager and writes "bar" in the file "blurch":

        perl -e "print 'foo'; print STDERR 'bar'" 2> blurch | less

    Discovering the usefulness of the "command.com" shell on Windows 9x is left as an exercise to the reader :)

    One particularly pernicious problem with the 4NT command shell for Windows is that it (nearly) always treats a % character as indicating that environment variable expansion is needed. Under this shell, it is therefore important to always double any % characters which you want Perl to see (for example, for hash variables), even when they are quoted.
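
Since the quoting rules above are easiest to learn by experiment, a tiny script that shows each argument as perl received it can be handy. The following is a sketch only; the file name showargs.pl and the helper show_arg are invented for this example.

```perl
# showargs.pl - print each argument exactly as perl received it
use strict;
use warnings;

# Wrap an argument in brackets so leading/trailing spaces are visible.
# show_arg is a hypothetical helper for this illustration.
sub show_arg {
    my ($index, $arg) = @_;
    return sprintf '%d: [%s]', $index, $arg;
}

print show_arg( $_, $ARGV[$_] ), "\n" for 0 .. $#ARGV;
```

Going by the rules described above, running it from "cmd" as perl showargs.pl "a b" 'c d' should show the double-quoted argument arriving as one piece, while the single-quoted one is split at the space with the quote characters left in place.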

  • Building Extensions

    The Comprehensive Perl Archive Network (CPAN) offers a wealth of extensions, some of which require a C compiler to build. Look in http://www.cpan.org/ for more information on CPAN.

    Note that not all of the extensions available from CPAN may work in the Windows environment; you should check the information at http://testers.cpan.org/ before investing too much effort into porting modules that don't readily build.

    Most extensions (whether they require a C compiler or not) can be built, tested and installed with the standard mantra:

        perl Makefile.PL
        $MAKE
        $MAKE test
        $MAKE install

    where $MAKE is whatever 'make' program you have configured perl to use. Use "perl -V:make" to find out what this is. Some extensions may not provide a testsuite (so "$MAKE test" may not do anything or fail), but most serious ones do.

    It is important that you use a supported 'make' program, and ensure Config.pm knows about it. If you don't have nmake, you can either get dmake from the location mentioned earlier or get an old version of nmake reportedly available from:

    http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/nmake15.exe

    Another option is to use the make written in Perl, available from CPAN.

    http://www.cpan.org/modules/by-module/Make/

    You may also use dmake. See Make above on how to get it.

    Note that MakeMaker actually emits makefiles with different syntax depending on what 'make' it thinks you are using. Therefore, it is important that one of the following values appears in Config.pm:

        make='nmake'        # MakeMaker emits nmake syntax
        make='dmake'        # MakeMaker emits dmake syntax
        any other value     # MakeMaker emits generic make syntax
                            #   (e.g. GNU make, or Perl make)

    If the value doesn't match the 'make' program you want to use, edit Config.pm to fix it.

    If a module implements XSUBs, you will need one of the supported C compilers. You must make sure you have set up the environment for the compiler for command-line compilation.

    If a module does not build for some reason, look carefully for why it failed, and report problems to the module author. If it looks like the extension building support is at fault, report that with full details of how the build failed using the perlbug utility.
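
The check that "perl -V:make" performs can also be done from a script, which is convenient when diagnosing a build that uses the wrong makefile syntax. The sketch below mirrors the table of make values given earlier; makefile_dialect is a hypothetical helper written for this example and is not MakeMaker's own detection code.

```perl
use strict;
use warnings;
use Config;

# Name the makefile dialect expected for a given make value, following
# the nmake / dmake / anything-else table in the text.
# makefile_dialect is an illustrative helper, not a MakeMaker function.
sub makefile_dialect {
    my ($make) = @_;
    return 'nmake' if $make =~ /\bnmake(?:\.exe)?$/i;
    return 'dmake' if $make =~ /\bdmake(?:\.exe)?$/i;
    return 'generic';
}

my $make = $Config{make} // 'make';
print "make='$make' => ", makefile_dialect($make), " syntax\n";
```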

  • Command-line Wildcard Expansion

    The default command shells on DOS descendant operating systems (such as they are) usually do not expand wildcard arguments supplied to programs. They consider it the application's job to handle that. This is commonly achieved by linking the application (in our case, perl) with startup code that the C runtime libraries usually provide. However, doing that results in incompatible perl versions (since the behavior of the argv expansion code differs depending on the compiler, and it is even buggy on some compilers). Besides, it may be a source of frustration if you use such a perl binary with an alternate shell that *does* expand wildcards.

    Instead, the following solution works rather well. The nice things about it are 1) you can start using it right away; 2) it is more powerful, because it will do the right thing with a pattern like */*/*.c; 3) you can decide whether you do/don't want to use it; and 4) you can extend the method to add any customizations (or even entirely different kinds of wildcard expansion).

        C:\> copy con c:\perl\lib\Wild.pm
        # Wild.pm - emulate shell @ARGV expansion on shells that don't
        use File::DosGlob;
        # Expand only the arguments that contain wildcard characters
        @ARGV = map {
            my @g = /[*?]/ ? File::DosGlob::glob($_) : ();
            @g ? @g : $_;
        } @ARGV;
        1;
        ^Z
        C:\> set PERL5OPT=-MWild
        C:\> perl -le "for (@ARGV) { print }" */*/perl*.c
        p4view/perl/perl.c
        p4view/perl/perlio.c
        p4view/perl/perly.c
        perl5.005/win32/perlglob.c
        perl5.005/win32/perllib.c
        perl5.005/win32/perlglob.c
        perl5.005/win32/perllib.c
        perl5.005/win32/perlglob.c
        perl5.005/win32/perllib.c

    Note there are two distinct steps there: 1) You'll have to create Wild.pm and put it in your perl lib directory. 2) You'll need to set the PERL5OPT environment variable. If you want argv expansion to be the default, just set PERL5OPT in your default startup environment.

    If you are using the Visual C compiler, you can get the C runtime's command line wildcard expansion built into the perl binary. The resulting binary will always expand unquoted command lines, which may not be what you want if you use a shell that does that for you. The expansion done is also somewhat less powerful than the approach suggested above.

  • Notes on 64-bit Windows

    Windows .NET Server supports the LLP64 data model on the Intel Itanium architecture.

    The LLP64 data model is different from the LP64 data model that is the norm on 64-bit Unix platforms. In the former, int and long are both 32-bit data types, while pointers are 64 bits wide. In addition, there is a separate 64-bit wide integral type, __int64 . In contrast, the LP64 data model that is pervasive on Unix platforms provides int as the 32-bit type, while both the long type and pointers are of 64-bit precision. Note that both models provide for 64 bits of addressability.

    64-bit Windows running on Itanium is capable of running 32-bit x86 binaries transparently. This means that you could use a 32-bit build of Perl on a 64-bit system. Given this, why would one want to build a 64-bit build of Perl? Here are some reasons why you would bother:

    • A 64-bit native application will run much more efficiently on Itanium hardware.

    • There is no 2GB limit on process size.

    • Perl automatically provides large file support when built under 64-bit Windows.

    • Embedding Perl inside a 64-bit application.
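
To see which data model a particular perl binary was built with, the sizes recorded in Config can be inspected. This is a rough sketch: data_model is a made-up helper, and the ILP32 label for ordinary 32-bit builds is an addition that the text above does not mention.

```perl
use strict;
use warnings;
use Config;

# Classify a C data model from the byte sizes of int, long and pointers.
# data_model is a hypothetical helper for this illustration.
sub data_model {
    my ($int, $long, $ptr) = @_;
    return 'LLP64' if $int == 4 && $long == 4 && $ptr == 8;   # 64-bit Windows
    return 'LP64'  if $int == 4 && $long == 8 && $ptr == 8;   # 64-bit Unix
    return 'ILP32' if $int == 4 && $long == 4 && $ptr == 4;   # 32-bit builds
    return 'other';
}

# intsize, longsize, ptrsize and ivsize are recorded when perl is built.
printf "int=%d long=%d ptr=%d ivsize=%d => %s\n",
    @Config{qw(intsize longsize ptrsize ivsize)},
    data_model( @Config{qw(intsize longsize ptrsize)} );
```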

Running Perl Scripts

Perl scripts on UNIX use the "#!" (a.k.a. "shebang") line to indicate to the OS that it should execute the file using perl. Windows has no comparable means to indicate arbitrary files are executables.

Instead, all available methods to execute plain text files on Windows rely on the file "extension". There are three methods to use this to execute perl scripts:

1

There is a facility called "file extension associations". This can be manipulated via the two commands "assoc" and "ftype" that come standard with Windows. Type "ftype /?" for a complete example of how to set this up for perl scripts (Say what? You thought Windows wasn't perl-ready? :).

2

Since file associations don't work everywhere, and there are reportedly bugs with file associations where it does work, the old method of wrapping the perl script to make it look like a regular batch file to the OS, may be used. The install process makes available the "pl2bat.bat" script which can be used to wrap perl scripts into batch files. For example:

    pl2bat foo.pl

will create the file "FOO.BAT". Note "pl2bat" strips any .pl suffix and adds a .bat suffix to the generated file.

If you use the 4DOS/NT or similar command shell, note that "pl2bat" uses the "%*" variable in the generated batch file to refer to all the command line arguments, so you may need to make sure that construct works in batch files. As of this writing, 4DOS/NT users will need a "ParameterChar = *" statement in their 4NT.INI file or will need to execute "setdos /p*" in the 4DOS/NT startup file to enable this to work.

3

Using "pl2bat" has a few problems: the file name gets changed, so scripts that rely on $0 to find what they must do may not run properly; running "pl2bat" replicates the contents of the original script, and so this process can be maintenance intensive if the originals get updated often. A different approach that avoids both problems is possible.

A script called "runperl.bat" is available that can be copied to any filename (along with the .bat suffix). For example, if you call it "foo.bat", it will run the file "foo" when it is executed. Since you can run batch files on Windows platforms simply by typing the name (without the extension), this effectively runs the file "foo", when you type either "foo" or "foo.bat". With this method, "foo.bat" can even be in a different location than the file "foo", as long as "foo" is available somewhere on the PATH. If your scripts are on a filesystem that allows symbolic links, you can even avoid copying "runperl.bat".

Here's a diversion: copy "runperl.bat" to "runperl", and type "runperl". Explain the observed behavior, or lack thereof. :) Hint: .gnidnats llits er'uoy fi ,"lrepnur" eteled :tniH

Miscellaneous Things

A full set of HTML documentation is installed, so you should be able to use it if you have a web browser installed on your system.

perldoc is also a useful tool for browsing information contained in the documentation, especially in conjunction with a pager like less (recent versions of which have Windows support). You may have to set the PAGER environment variable to use a specific pager. "perldoc -f foo" will print information about the perl operator "foo".

One common mistake when using this port with a GUI library like Tk is assuming that Perl's normal behavior of opening a command-line window will go away. This isn't the case. If you want to start a copy of perl without opening a command-line window, use the wperl executable built during the installation process. Usage is exactly the same as normal perl on Windows, except that options like -h don't work (since they need a command-line window to print to).

If you find bugs in perl, you can run perlbug to create a bug report (you may have to send it manually if perlbug cannot find a mailer on your system).

BUGS AND CAVEATS

Norton AntiVirus interferes with the build process, particularly if set to "AutoProtect, All Files, when Opened". Unlike large applications, the perl build process opens and modifies a lot of files. Having the AntiVirus scan each and every one slows the build process significantly. Worse, with PERLIO=stdio the build process fails with peculiar messages as the virus checker interacts badly with miniperl.exe writing configure files (it seems either to catch a file partly written and treat it as suspicious, or the virus checker may have it "locked" in a way which inhibits miniperl updating it). The build does complete with

    set PERLIO=perlio

but that may be just luck. Other AntiVirus software may have similar issues.

Some of the built-in functions do not act exactly as documented in perlfunc, and a few are not implemented at all. To avoid surprises, particularly if you have had prior exposure to Perl in other operating environments or if you intend to write code that will be portable to other environments, see perlport for a reasonably definitive list of these differences.

Not all extensions available from CPAN may build or work properly in the Windows environment. See Building Extensions.

Most socket() related calls are supported, but they may not behave as on Unix platforms. See perlport for the full list.

Signal handling may not behave as on Unix platforms (where it doesn't exactly "behave", either :). For instance, calling die() or exit() from signal handlers will cause an exception, since most implementations of signal() on Windows are severely crippled. Thus, signals may work only for simple things like setting a flag variable in the handler. Using signals under this port should currently be considered unsupported.
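
If you do experiment with signals despite this, the flag-setting pattern mentioned above might look like the following sketch. The names $interrupted and run_until_interrupted are invented for the example, and whether Ctrl-C is delivered to the handler at all depends on your build and shell.

```perl
use strict;
use warnings;

my $interrupted = 0;

# Keep the handler trivial: record the event and return.
# No die(), exit() or I/O in here.
$SIG{INT} = sub { $interrupted = 1 };

# Run up to $steps units of work, checking the flag between units.
# Returns the number of units that were completed.
sub run_until_interrupted {
    my ($steps, $work) = @_;
    for my $i ( 1 .. $steps ) {
        return $i - 1 if $interrupted;    # stop cleanly between units
        $work->($i);
    }
    return $steps;
}

my $done = run_until_interrupted( 3, sub { sleep 1 } );
print "completed $done step(s)", ( $interrupted ? ' before interrupt' : '' ), "\n";
```

All real work, including any cleanup and the decision to exit, happens in the main flow of the program after the flag has been seen.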

Please send detailed descriptions of any problems and solutions that you may find to <perlbug@perl.org>, along with the output produced by perl -V .

ACKNOWLEDGEMENTS

The use of a camel with the topic of Perl is a trademark of O'Reilly and Associates, Inc. Used with permission.

AUTHORS

  • Gary Ng <71564.1743@CompuServe.COM>
  • Gurusamy Sarathy <gsar@activestate.com>
  • Nick Ing-Simmons <nick@ing-simmons.net>
  • Jan Dubois <jand@activestate.com>
  • Steve Hay <steve.m.hay@googlemail.com>

This document is maintained by Jan Dubois.

SEE ALSO

perl

HISTORY

This port was originally contributed by Gary Ng around 5.003_24, and borrowed from the Hip Communications port that was available at the time. Various people have made numerous and sundry hacks since then.

GCC/mingw32 support was added in 5.005 (Nick Ing-Simmons).

Support for PERL_OBJECT was added in 5.005 (ActiveState Tool Corp).

Support for fork() emulation was added in 5.6 (ActiveState Tool Corp).

Win9x support was added in 5.6 (Benjamin Stuhl).

Support for 64-bit Windows added in 5.8 (ActiveState Corp).

Last updated: 02 January 2012

 
perlxs

NAME

perlxs - XS language reference manual

DESCRIPTION

Introduction

XS is an interface description file format used to create an extension interface between Perl and C code (or a C library) which one wishes to use with Perl. The XS interface is combined with the library to create a new library which can then be either dynamically loaded or statically linked into perl. The XS interface description is written in the XS language and is the core component of the Perl extension interface.

An XSUB forms the basic unit of the XS interface. After compilation by the xsubpp compiler, each XSUB amounts to a C function definition which will provide the glue between Perl calling conventions and C calling conventions.

The glue code pulls the arguments from the Perl stack, converts these Perl values to the formats expected by a C function, calls this C function, and transfers the return values of the C function back to Perl. Return values here may be a conventional C return value or any C function arguments that may serve as output parameters. These return values may be passed back to Perl either by putting them on the Perl stack, or by modifying the arguments supplied from the Perl side.

The above is a somewhat simplified view of what really happens. Since Perl allows more flexible calling conventions than C, XSUBs may do much more in practice, such as checking input parameters for validity, throwing exceptions (or returning undef/empty list) if the return value from the C function indicates failure, calling different C functions based on numbers and types of the arguments, providing an object-oriented interface, etc.

Of course, one could write such glue code directly in C. However, this would be a tedious task, especially if one needs to write glue for multiple C functions, and/or one is not familiar enough with the Perl stack discipline and other such arcana. XS comes to the rescue here: instead of writing this glue C code in long-hand, one can write a more concise short-hand description of what should be done by the glue, and let the XS compiler xsubpp handle the rest.

The XS language allows one to describe the mapping between how the C routine is used, and how the corresponding Perl routine is used. It also allows creation of Perl routines which are directly translated to C code and which are not related to a pre-existing C function. In cases when the C interface coincides with the Perl interface, the XSUB declaration is almost identical to a declaration of a C function (in K&R style). In such circumstances, there is another tool called h2xs that is able to translate an entire C header file into a corresponding XS file that will provide glue to the functions/macros described in the header file.

The XS compiler is called xsubpp. This compiler creates the constructs necessary to let an XSUB manipulate Perl values, and creates the glue necessary to let Perl call the XSUB. The compiler uses typemaps to determine how to map C function parameters and output values to Perl values and back. The default typemap (which comes with Perl) handles many common C types. A supplementary typemap may also be needed to handle any special structures and types for the library being linked. For more information on typemaps, see perlxstypemap.

A file in XS format starts with a C language section which goes until the first MODULE = directive. Other XS directives and XSUB definitions may follow this line. The "language" used in this part of the file is usually referred to as the XS language. xsubpp recognizes and skips POD (see perlpod) in both the C and XS language sections, which allows the XS file to contain embedded documentation.

See perlxstut for a tutorial on the whole extension creation process.

Note: For some extensions, Dave Beazley's SWIG system may provide a significantly more convenient mechanism for creating the extension glue code. See http://www.swig.org/ for more information.

On The Road

Many of the examples which follow will concentrate on creating an interface between Perl and the ONC+ RPC bind library functions. The rpcb_gettime() function is used to demonstrate many features of the XS language. This function has two parameters; the first is an input parameter and the second is an output parameter. The function also returns a status value.

    bool_t rpcb_gettime(const char *host, time_t *timep);

From C this function will be called with the following statements.

    #include <rpc/rpc.h>
    bool_t status;
    time_t timep;
    status = rpcb_gettime( "localhost", &timep );
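
The calling pattern above (one input parameter, one output parameter written through a pointer, and a status return value) can be tried without the RPC library. The following is only a sketch: fake_gettime() is a hypothetical stand-in for rpcb_gettime(), int stands in for bool_t, and the rule that only "localhost" succeeds is invented so that the failure path can be seen.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for rpcb_gettime(): same shape, no network access.
 * host is an input parameter, timep is an output parameter, and the
 * return value is a status (1 = success, 0 = failure). */
int fake_gettime(const char *host, time_t *timep)
{
    if (host == NULL || timep == NULL || strcmp(host, "localhost") != 0)
        return 0;            /* failure: *timep is left untouched */
    *timep = time(NULL);     /* write the result through the pointer */
    return 1;
}
```

A caller declares a time_t variable, passes its address, and checks the returned status before using the value, exactly as in the C statements shown above.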

If an XSUB is created to offer a direct translation between this function and Perl, then this XSUB will be used from Perl with the following code. The $status and $timep variables will contain the output of the function.

    use RPC;
    $status = rpcb_gettime( "localhost", $timep );

The following XS file shows an XS subroutine, or XSUB, which demonstrates one possible interface to the rpcb_gettime() function. This XSUB represents a direct translation between C and Perl and so preserves the interface even from Perl. This XSUB will be invoked from Perl with the usage shown above. Note that the first three #include statements, for EXTERN.h, perl.h, and XSUB.h, will always be present at the beginning of an XS file. This approach and others will be expanded later in this document.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"
    #include <rpc/rpc.h>

    MODULE = RPC  PACKAGE = RPC

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t &timep
      OUTPUT:
        timep

Any extension to Perl, including those containing XSUBs, should have a Perl module to serve as the bootstrap which pulls the extension into Perl. This module will export the extension's functions and variables to the Perl program and will cause the extension's XSUBs to be linked into Perl. The following module will be used for most of the examples in this document and should be used from Perl with the use command as shown earlier. Perl modules are explained in more detail later in this document.

    package RPC;

    require Exporter;
    require DynaLoader;
    @ISA = qw(Exporter DynaLoader);
    @EXPORT = qw( rpcb_gettime );

    bootstrap RPC;
    1;

Throughout this document a variety of interfaces to the rpcb_gettime() XSUB will be explored. The XSUBs will take their parameters in different orders or will take different numbers of parameters. In each case the XSUB is an abstraction between Perl and the real C rpcb_gettime() function, and the XSUB must always ensure that the real rpcb_gettime() function is called with the correct parameters. This abstraction will allow the programmer to create a more Perl-like interface to the C function.

The Anatomy of an XSUB

The simplest XSUBs consist of 3 parts: a description of the return value, the name of the XSUB routine and the names of its arguments, and a description of types or formats of the arguments.

The following XSUB allows a Perl program to access a C library function called sin(). The XSUB will imitate the C function which takes a single argument and returns a single value.

    double
    sin(x)
        double x

Optionally, one can merge the description of types and the list of argument names, rewriting this as

    double
    sin(double x)

This makes this XSUB look similar to an ANSI C declaration. An optional semicolon is allowed after the argument list, as in

    double
    sin(double x);

Parameters with C pointer types can have different semantics: C functions with similar declarations

    bool string_looks_as_a_number(char *s);
    bool make_char_uppercase(char *c);

are used in absolutely incompatible ways. Parameters to these functions could be described to xsubpp like this:

    char *  s
    char    &c

Both these XS declarations correspond to the char* C type, but they have different semantics; see The & Unary Operator.

It is convenient to think of the indirection operator * as part of the type and the address operator & as part of the variable. See perlxstypemap for more info about handling qualifiers and unary operators in C types.
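
To make the difference concrete, here is a sketch of how two such functions might be written in C. Only the declarations come from the text above; the bodies are invented for illustration (in particular, "looks like a number" is assumed to mean "consists only of decimal digits").

```c
#include <ctype.h>
#include <stdbool.h>

/* s points at a NUL-terminated string that is only read. */
bool string_looks_as_a_number(char *s)
{
    if (*s == '\0')
        return false;
    for (; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s))
            return false;
    return true;
}

/* c points at a single char that is modified in place. */
bool make_char_uppercase(char *c)
{
    if (!islower((unsigned char)*c))
        return false;        /* nothing to change */
    *c = (char)toupper((unsigned char)*c);
    return true;
}
```

The first function expects a whole string and never writes to it, which matches the XS declaration char * s. The second expects the address of one character and writes a new value through it, which matches char &c.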

The function name and the return type must be placed on separate lines and should be flush left-adjusted.

    INCORRECT                        CORRECT

    double sin(x)                    double
      double x                       sin(x)
                                       double x

The rest of the function description may be indented or left-adjusted. The following example shows a function with its body left-adjusted. Most examples in this document will indent the body for better readability.

    CORRECT

    double
    sin(x)
    double x

More complicated XSUBs may contain many other sections. Each section of an XSUB starts with the corresponding keyword, such as INIT: or CLEANUP:. However, the first two lines of an XSUB always contain the same data: descriptions of the return type and the names of the function and its parameters. Whatever immediately follows these is considered to be an INPUT: section unless explicitly marked with another keyword. (See The INPUT: Keyword.)

An XSUB section continues until another section-start keyword is found.

The Argument Stack

The Perl argument stack is used to store the values which are sent as parameters to the XSUB and to store the XSUB's return value(s). In reality all Perl functions (including non-XSUB ones) keep their values on this stack all at the same time, each limited to its own range of positions on the stack. In this document the first position on that stack which belongs to the active function will be referred to as position 0 for that function.

XSUBs refer to their stack arguments with the macro ST(x), where x refers to a position in this XSUB's part of the stack. Position 0 for that function would be known to the XSUB as ST(0). The XSUB's incoming parameters and outgoing return values always begin at ST(0). For many simple cases the xsubpp compiler will generate the code necessary to handle the argument stack by embedding code fragments found in the typemaps. In more complex cases the programmer must supply the code.
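
The idea that every function sees "position 0" at the start of its own range of one shared stack can be shown with a toy model. This is only a sketch: it uses plain ints instead of SV pointers and a fixed array instead of Perl's growable stack, and toy_stack, TOY_ST and toy_add are hypothetical names, not part of the Perl API.

```c
#include <stddef.h>

/* Toy model only: plain ints instead of SV pointers, and a fixed array
 * instead of Perl's growable stack. None of these names are Perl API. */
#define TOY_STACK_SIZE 16
static int toy_stack[TOY_STACK_SIZE];

/* Each call sees its own slice of the shared stack, starting at base. */
#define TOY_ST(base, x) (toy_stack[(base) + (x)])

/* Reads two arguments at positions 0 and 1 of its slice and leaves the
 * sum at position 0; returns the number of values it returned. */
int toy_add(size_t base)
{
    TOY_ST(base, 0) = TOY_ST(base, 0) + TOY_ST(base, 1);
    return 1;
}
```

In this model the incoming arguments and the outgoing return value both start at position 0 of the callee's slice, and entries below the base, which belong to callers, are left alone.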

The RETVAL Variable

The RETVAL variable is a special C variable that is declared automatically for you. The C type of RETVAL matches the return type of the C library function. The xsubpp compiler will declare this variable in each XSUB with non-void return type. By default the generated C function will use RETVAL to hold the return value of the C library function being called. In simple cases the value of RETVAL will be placed in ST(0) of the argument stack where it can be received by Perl as the return value of the XSUB.

If the XSUB has a return type of void then the compiler will not declare a RETVAL variable for that function. When using a PPCODE: section no manipulation of the RETVAL variable is required; the section may use direct stack manipulation to place output values on the stack.

If the PPCODE: directive is not used, a void return value should be used only for subroutines which do not return a value, even if a CODE: directive is used which sets ST(0) explicitly.

Older versions of this document recommended using a void return value in such cases. It was discovered that this could lead to segfaults in cases when the XSUB was truly void. This practice is now deprecated, and may not be supported in some future version. Use the return value SV * in such cases. (Currently xsubpp contains some heuristic code which tries to disambiguate between "truly-void" and "old-practice-declared-as-void" functions. Hence your code is at the mercy of this heuristic unless you use SV * as the return value.)

Returning SVs, AVs and HVs through RETVAL

When you're using RETVAL to return an SV *, there's some magic going on behind the scenes that should be mentioned. When you're manipulating the argument stack using the ST(x) macro, for example, you usually have to pay special attention to reference counts. (For more about reference counts, see perlguts.) To make your life easier, the typemap file automatically makes RETVAL mortal when you're returning an SV *. Thus, the following two XSUBs are more or less equivalent:

    void
    alpha()
      PPCODE:
        ST(0) = newSVpv("Hello World",0);
        sv_2mortal(ST(0));
        XSRETURN(1);

    SV *
    beta()
      CODE:
        RETVAL = newSVpv("Hello World",0);
      OUTPUT:
        RETVAL

This is quite useful as it usually improves readability. While this works fine for an SV *, it's unfortunately not as easy to have AV * or HV * as a return value. You should be able to write:

    AV *
    array()
      CODE:
        RETVAL = newAV();
        /* do something with RETVAL */
      OUTPUT:
        RETVAL

But due to an unfixable bug (fixing it would break lots of existing CPAN modules) in the typemap file, the reference count of the AV * is not properly decremented. Thus, the above XSUB would leak memory whenever it is called. The same problem exists for HV *, CV *, and SVREF (which indicates a scalar reference, not a general SV *). In XS code on perls starting with perl 5.16, you can override the typemaps for any of these types with a version that has proper handling of refcounts. In your TYPEMAP section, do

    AV*  T_AVREF_REFCOUNT_FIXED

to get the repaired variant. For backward compatibility with older versions of perl, you can instead decrement the reference count manually when you're returning one of the aforementioned types using sv_2mortal:

    AV *
    array()
      CODE:
        RETVAL = newAV();
        sv_2mortal((SV*)RETVAL);
        /* do something with RETVAL */
      OUTPUT:
        RETVAL

Remember that you don't have to do this for an SV *. The reference documentation for all core typemaps can be found in perlxstypemap.

The MODULE Keyword

The MODULE keyword is used to start the XS code and to specify the package of the functions which are being defined. All text preceding the first MODULE keyword is considered C code and is passed through to the output with POD stripped, but otherwise untouched. Every XS module will have a bootstrap function which is used to hook the XSUBs into Perl. The package name of this bootstrap function will match the value of the last MODULE statement in the XS source files. The value of MODULE should always remain constant within the same XS file, though this is not required.

The following example will start the XS code and will place all functions in a package named RPC.

    MODULE = RPC

The PACKAGE Keyword

When functions within an XS source file must be separated into packages the PACKAGE keyword should be used. This keyword is used with the MODULE keyword and must follow immediately after it when used.

    MODULE = RPC  PACKAGE = RPC

    [ XS code in package RPC ]

    MODULE = RPC  PACKAGE = RPCB

    [ XS code in package RPCB ]

    MODULE = RPC  PACKAGE = RPC

    [ XS code in package RPC ]

The same package name can be used more than once, allowing for non-contiguous code. This is useful if you have a stronger ordering principle than package names.

Although this keyword is optional and in some cases provides redundant information, it should always be used. It ensures that the XSUBs appear in the desired package.

The PREFIX Keyword

The PREFIX keyword designates prefixes which should be removed from the Perl function names. If the C function is rpcb_gettime() and the PREFIX value is rpcb_ then Perl will see this function as gettime().

This keyword should follow the PACKAGE keyword when used. If PACKAGE is not used then PREFIX should follow the MODULE keyword.

    MODULE = RPC  PREFIX = rpc_

    MODULE = RPC  PACKAGE = RPCB  PREFIX = rpcb_
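
The effect of PREFIX on the name that Perl sees amounts to removing a leading string when it is present. A minimal sketch of that renaming in C follows; strip_prefix() is a hypothetical helper written for this illustration and is not how xsubpp itself is implemented.

```c
#include <string.h>

/* Returns name with prefix removed when name starts with prefix,
 * otherwise returns name unchanged. The result points into name. */
const char *strip_prefix(const char *name, const char *prefix)
{
    size_t n = strlen(prefix);
    return strncmp(name, prefix, n) == 0 ? name + n : name;
}
```

With the prefix rpcb_, the name rpcb_gettime becomes gettime, while a name that does not start with the prefix is left as it is.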

The OUTPUT: Keyword

The OUTPUT: keyword indicates that certain function parameters should be updated (new values made visible to Perl) when the XSUB terminates or that certain values should be returned to the calling Perl function. For simple functions which have no CODE: or PPCODE: section, such as the sin() function above, the RETVAL variable is automatically designated as an output value. For more complex functions the xsubpp compiler will need help to determine which variables are output variables.

This keyword will normally be used to complement the CODE: keyword. The RETVAL variable is not recognized as an output variable when the CODE: keyword is present. The OUTPUT: keyword is used in this situation to tell the compiler that RETVAL really is an output variable.

The OUTPUT: keyword can also be used to indicate that function parameters are output variables. This may be necessary when a parameter has been modified within the function and the programmer would like the update to be seen by Perl.

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t &timep
      OUTPUT:
        timep

The OUTPUT: keyword will also allow an output parameter to be mapped to a matching piece of code rather than to a typemap.

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t &timep
      OUTPUT:
        timep sv_setnv(ST(1), (double)timep);

xsubpp emits an automatic SvSETMAGIC() for all parameters in the OUTPUT section of the XSUB, except RETVAL. This is the usually desired behavior, as it takes care of properly invoking 'set' magic on output parameters (needed for hash or array element parameters that must be created if they didn't exist). If for some reason, this behavior is not desired, the OUTPUT section may contain a SETMAGIC: DISABLE line to disable it for the remainder of the parameters in the OUTPUT section. Likewise, SETMAGIC: ENABLE can be used to reenable it for the remainder of the OUTPUT section. See perlguts for more details about 'set' magic.

The NO_OUTPUT Keyword

The NO_OUTPUT keyword can be placed as the first token of the XSUB. This keyword indicates that while the C subroutine we provide an interface to has a non-void return type, the return value of this C subroutine should not be returned from the generated Perl subroutine.

With this keyword present, The RETVAL Variable is created, and in the generated call to the subroutine this variable is assigned to, but the value of this variable is not going to be used in the auto-generated code.

This keyword makes sense only if RETVAL is going to be accessed by the user-supplied code. It is useful for making a function interface more Perl-like, especially when the C return value is just an error condition indicator. For example,

    NO_OUTPUT int
    delete_file(char *name)
      POSTCALL:
        if (RETVAL != 0)
            croak("Error %d while deleting file '%s'", RETVAL, name);

Here the generated XS function returns nothing on success, and will die() with a meaningful error message on error.

The CODE: Keyword

This keyword is used in more complicated XSUBs which require special handling for the C function. The RETVAL variable is still declared, but it will not be returned unless it is specified in the OUTPUT: section.

The following XSUB is for a C function which requires special handling of its parameters. The Perl usage is given first.

    $status = rpcb_gettime( "localhost", $timep );

The XSUB follows.

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t timep
      CODE:
        RETVAL = rpcb_gettime( host, &timep );
      OUTPUT:
        timep
        RETVAL

The INIT: Keyword

The INIT: keyword allows initialization to be inserted into the XSUB before the compiler generates the call to the C function. Unlike the CODE: keyword above, this keyword does not affect the way the compiler handles RETVAL.

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t &timep
      INIT:
        printf("# Host is %s\n", host );
      OUTPUT:
        timep

Another use for the INIT: section is to check for preconditions before making a call to the C function:

    long long
    lldiv(a,b)
        long long a
        long long b
      INIT:
        if (a == 0 && b == 0)
            XSRETURN_UNDEF;
        if (b == 0)
            croak("lldiv: cannot divide by 0");
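
The same precondition logic can be read as ordinary C. The sketch below mirrors the two INIT: checks with status codes in place of XSRETURN_UNDEF and croak(); checked_div() and enum div_status are hypothetical names, and overflow (LLONG_MIN divided by -1) is deliberately not handled.

```c
/* Outcome of the precondition checks performed before the division. */
enum div_status { DIV_OK, DIV_UNDEF, DIV_ERROR };

/* Mirrors the INIT: checks: 0/0 yields "undef", x/0 is an error,
 * anything else is divided and stored in *quotient. */
enum div_status checked_div(long long a, long long b, long long *quotient)
{
    if (a == 0 && b == 0)
        return DIV_UNDEF;    /* XSRETURN_UNDEF in the XSUB */
    if (b == 0)
        return DIV_ERROR;    /* croak() in the XSUB */
    *quotient = a / b;
    return DIV_OK;
}
```

The order of the checks matters: the 0/0 case has to be tested first, otherwise it would be reported as an ordinary division by zero.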

The NO_INIT Keyword

The NO_INIT keyword is used to indicate that a function parameter is being used only as an output value. The xsubpp compiler will normally generate code to read the values of all function parameters from the argument stack and assign them to C variables upon entry to the function. NO_INIT will tell the compiler that some parameters will be used for output rather than for input and that they will be handled before the function terminates.

The following example shows a variation of the rpcb_gettime() function. This function uses the timep variable only as an output variable and does not care about its initial contents.

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t &timep = NO_INIT
      OUTPUT:
        timep

The TYPEMAP: Keyword

Starting with Perl 5.16, you can embed typemaps into your XS code instead of, or in addition to, typemaps in a separate file. Multiple such embedded typemaps are processed in order of appearance in the XS code and, like local typemap files, take precedence over the default typemap. The embedded typemaps may overwrite previous definitions of TYPEMAP, INPUT, and OUTPUT stanzas. The syntax for embedded typemaps is

    TYPEMAP: <<HERE
    ... your typemap code here ...
    HERE

where the TYPEMAP keyword must appear in the first column of a new line.

Refer to perlxstypemap for details on writing typemaps.

Initializing Function Parameters

C function parameters are normally initialized with their values from the argument stack (which in turn contains the parameters that were passed to the XSUB from Perl). The typemaps contain the code segments which are used to translate the Perl values to the C parameters. The programmer, however, is allowed to override the typemaps and supply alternate (or additional) initialization code. Initialization code starts with the first =, ; or + on a line in the INPUT: section. The only exception is a ; that terminates the line, which is quietly ignored.

The following code demonstrates how to supply initialization code for function parameters. The initialization code is eval'ed within double quotes by the compiler before it is added to the output so anything which should be interpreted literally [mainly $, @, or \\] must be protected with backslashes. The variables $var, $arg, and $type can be used as in typemaps.

    bool_t
    rpcb_gettime(host,timep)
        char *host = (char *)SvPV_nolen($arg);
        time_t &timep = 0;
      OUTPUT:
        timep

This should not be used to supply default values for parameters. One would normally use this when a function parameter must be processed by another library function before it can be used. Default parameters are covered in the next section.

If the initialization begins with =, then it is output in the declaration for the input variable, replacing the initialization supplied by the typemap. If the initialization begins with ; or +, then it is performed after all of the input variables have been declared. In the ; case the initialization normally supplied by the typemap is not performed. For the + case, the declaration for the variable will include the initialization from the typemap. A global variable, %v, is available for the truly rare case where information from one initialization is needed in another initialization.

Here's a truly obscure example:

    bool_t
    rpcb_gettime(host,timep)
        time_t &timep; /* \$v{timep}=@{[$v{timep}=$arg]} */
        char *host + SvOK($v{timep}) ? SvPV_nolen($arg) : NULL;
      OUTPUT:
        timep

The construct \$v{timep}=@{[$v{timep}=$arg]} used in the above example has a two-fold purpose: first, when this line is processed by xsubpp, the Perl snippet $v{timep}=$arg is evaluated. Second, the text of the evaluated snippet is output into the generated C file (inside a C comment)! During the processing of the char *host line, $arg will evaluate to ST(0), and $v{timep} will evaluate to ST(1).

Default Parameter Values

Default values for XSUB arguments can be specified by placing an assignment statement in the parameter list. The default value may be a number, a string or the special string NO_INIT. Defaults should always be used on the right-most parameters only.

To allow the XSUB for rpcb_gettime() to have a default host value the parameters to the XSUB could be rearranged. The XSUB will then call the real rpcb_gettime() function with the parameters in the correct order. This XSUB can be called from Perl with either of the following statements:

    $status = rpcb_gettime( $timep, $host );

    $status = rpcb_gettime( $timep );

The XSUB will look like the code which follows. A CODE: block is used to call the real rpcb_gettime() function with the parameters in the correct order for that function.

    bool_t
    rpcb_gettime(timep,host="localhost")
        char *host
        time_t timep = NO_INIT
      CODE:
        RETVAL = rpcb_gettime( host, &timep );
      OUTPUT:
        timep
        RETVAL

The PREINIT: Keyword

The PREINIT: keyword allows extra variables to be declared immediately before or after the declarations of the parameters from the INPUT: section are emitted.

If a variable is declared inside a CODE: section it will follow any typemap code that is emitted for the input parameters. This may result in the declaration ending up after C code, which is a C syntax error. Similar errors may happen when an explicit ;-type or +-type initialization of parameters is used (see Initializing Function Parameters). Declaring these variables in an INIT: section will not help.

In such cases, to force an additional variable to be declared together with declarations of other variables, place the declaration into a PREINIT: section. The PREINIT: keyword may be used one or more times within an XSUB.

The following examples are equivalent, but if the code is using complex typemaps then the first example is safer.

    bool_t
    rpcb_gettime(timep)
        time_t timep = NO_INIT
      PREINIT:
        char *host = "localhost";
      CODE:
        RETVAL = rpcb_gettime( host, &timep );
      OUTPUT:
        timep
        RETVAL

For this particular case an INIT: keyword would generate the same C code as the PREINIT: keyword. Another correct, but error-prone example:

    bool_t
    rpcb_gettime(timep)
        time_t timep = NO_INIT
      CODE:
        char *host = "localhost";
        RETVAL = rpcb_gettime( host, &timep );
      OUTPUT:
        timep
        RETVAL

Another way to declare host is to use a C block in the CODE: section:

    bool_t
    rpcb_gettime(timep)
        time_t timep = NO_INIT
      CODE:
        {
          char *host = "localhost";
          RETVAL = rpcb_gettime( host, &timep );
        }
      OUTPUT:
        timep
        RETVAL

The ability to put additional declarations before the typemap entries are processed is very handy in the cases when typemap conversions manipulate some global state:

    MyObject
    mutate(o)
      PREINIT:
        MyState st = global_state;
      INPUT:
        MyObject o;
      CLEANUP:
        reset_to(global_state, st);

Here we suppose that conversion to MyObject in the INPUT: section and from MyObject when processing RETVAL will modify a global variable global_state. After these conversions are performed, we restore the old value of global_state (to avoid memory leaks, for example).

There is another way to trade clarity for compactness: INPUT sections allow declaration of C variables which do not appear in the parameter list of a subroutine. Thus the above code for mutate() can be rewritten as

    MyObject
    mutate(o)
        MyState st = global_state;
        MyObject o;
      CLEANUP:
        reset_to(global_state, st);

and the code for rpcb_gettime() can be rewritten as

    bool_t
    rpcb_gettime(timep)
        time_t timep = NO_INIT
        char *host = "localhost";
      C_ARGS:
        host, &timep
      OUTPUT:
        timep
        RETVAL

The SCOPE: Keyword

The SCOPE: keyword allows scoping to be enabled for a particular XSUB. If enabled, the XSUB will invoke ENTER and LEAVE automatically.

To support potentially complex type mappings, if a typemap entry used by an XSUB contains a comment like /*scope*/ then scoping will be automatically enabled for that XSUB.

To enable scoping:

    SCOPE: ENABLE

To disable scoping:

    SCOPE: DISABLE

The INPUT: Keyword

The XSUB's parameters are usually evaluated immediately after entering the XSUB. The INPUT: keyword can be used to force those parameters to be evaluated a little later. The INPUT: keyword can be used multiple times within an XSUB and can be used to list one or more input variables. This keyword is used with the PREINIT: keyword.

The following example shows how the input parameter timep can be evaluated late, after a PREINIT.

    bool_t
    rpcb_gettime(host,timep)
        char *host
      PREINIT:
        time_t tt;
      INPUT:
        time_t timep
      CODE:
        RETVAL = rpcb_gettime( host, &tt );
        timep = tt;
      OUTPUT:
        timep
        RETVAL

The next example shows each input parameter evaluated late.

    bool_t
    rpcb_gettime(host,timep)
      PREINIT:
        time_t tt;
      INPUT:
        char *host
      PREINIT:
        char *h;
      INPUT:
        time_t timep
      CODE:
        h = host;
        RETVAL = rpcb_gettime( h, &tt );
        timep = tt;
      OUTPUT:
        timep
        RETVAL

Since INPUT sections allow declaration of C variables which do not appear in the parameter list of a subroutine, this may be shortened to:

    bool_t
    rpcb_gettime(host,timep)
        time_t tt;
        char *host;
        char *h = host;
        time_t timep;
      CODE:
        RETVAL = rpcb_gettime( h, &tt );
        timep = tt;
      OUTPUT:
        timep
        RETVAL

(We used our knowledge that input conversion for char * is a "simple" one, thus host is initialized on the declaration line, and our assignment h = host is not performed too early. Otherwise one would need to have the assignment h = host in a CODE: or INIT: section.)

The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords

In the list of parameters for an XSUB, one can precede parameter names by the IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT keywords. The IN keyword is the default; the other keywords indicate how the Perl interface should differ from the C interface.

Parameters preceded by OUTLIST/IN_OUTLIST/OUT/IN_OUT keywords are considered to be used by the C subroutine via pointers. The OUTLIST/OUT keywords indicate that the C subroutine does not inspect the memory pointed to by this parameter, but will write through this pointer to provide additional return values.

Parameters preceded by the OUTLIST keyword do not appear in the usage signature of the generated Perl function.

Parameters preceded by IN_OUTLIST/IN_OUT/OUT do appear as parameters to the Perl function. With the exception of OUT parameters, these parameters are converted to the corresponding C type, then pointers to these data are given as arguments to the C function. It is expected that the C function will write through these pointers.

The return list of the generated Perl function consists of the C return value from the function (unless the XSUB is of void return type or The NO_OUTPUT Keyword was used) followed by all the OUTLIST and IN_OUTLIST parameters (in the order of appearance). On return from the XSUB, the IN_OUT/OUT Perl parameters will be modified to have the values written by the C function.

For example, an XSUB

    void
    day_month(OUTLIST day, IN unix_time, OUTLIST month)
        int day
        int unix_time
        int month

should be used from Perl as

    my ($day, $month) = day_month(time);

The C signature of the corresponding function should be

    void day_month(int *day, int unix_time, int *month);
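
For readers who want a concrete C function behind this signature, the following is one possible body. The text above gives only the declaration; the choice of UTC, the 1-based month, and the zero values on failure are assumptions made for this sketch.

```c
#include <time.h>

/* Writes the day of the month (1-31) and the month (1-12), in UTC, for
 * the given Unix time. Both outputs are set to 0 if conversion fails. */
void day_month(int *day, int unix_time, int *month)
{
    time_t t = (time_t)unix_time;
    struct tm *tm = gmtime(&t);

    *day = tm ? tm->tm_mday : 0;
    *month = tm ? tm->tm_mon + 1 : 0;
}
```

The function never reads *day or *month before writing them, which is exactly the property that the OUTLIST and OUT keywords describe.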

The IN/OUTLIST/IN_OUTLIST/IN_OUT/OUT keywords can be mixed with ANSI-style declarations, as in

    void
    day_month(OUTLIST int day, int unix_time, OUTLIST int month)

(here the optional IN keyword is omitted).

The IN_OUT parameters are identical to parameters introduced with The & Unary Operator and put into the OUTPUT: section (see The OUTPUT: Keyword). The IN_OUTLIST parameters are very similar, the only difference being that the value the C function writes through the pointer does not modify the Perl parameter, but is put in the output list.

The OUTLIST/OUT parameters differ from IN_OUTLIST/IN_OUT parameters only in that the initial value of the Perl parameter is not read (and is not given to the C function, which gets some garbage instead). For example, the same C function as above can be interfaced with as

    void day_month(OUT int day, int unix_time, OUT int month);

or

    void
    day_month(day, unix_time, month)
        int &day = NO_INIT
        int unix_time
        int &month = NO_INIT
      OUTPUT:
        day
        month

However, the generated Perl function is called in very C-ish style:

    my ($day, $month);
    day_month($day, time, $month);

The length(NAME) Keyword

If one of the input arguments to the C function is the length of a string argument NAME, one can substitute the name of the length-argument by length(NAME) in the XSUB declaration. This argument must be omitted when the generated Perl function is called. E.g.,

    void
    dump_chars(char *s, short l)
    {
        short n = 0;
        while (n < l) {
            printf("s[%d] = \"\\%#03o\"\n", n, (int)s[n]);
            n++;
        }
    }

    MODULE = x  PACKAGE = x

    void dump_chars(char *s, short length(s))

should be called as dump_chars($string) .

This directive is supported with ANSI-type function declarations only.

Variable-length Parameter Lists

XSUBs can have variable-length parameter lists by specifying an ellipsis (...) in the parameter list. This use of the ellipsis is similar to that found in ANSI C. The programmer is able to determine the number of arguments passed to the XSUB by examining the items variable which the xsubpp compiler supplies for all XSUBs. By using this mechanism one can create an XSUB which accepts a list of parameters of unknown length.

The host parameter for the rpcb_gettime() XSUB can be optional so the ellipsis can be used to indicate that the XSUB will take a variable number of parameters. Perl should be able to call this XSUB with either of the following statements.

    $status = rpcb_gettime( $timep, $host );
    $status = rpcb_gettime( $timep );

The XS code, with ellipsis, follows.

    bool_t
    rpcb_gettime(timep, ...)
        time_t timep = NO_INIT
      PREINIT:
        char *host = "localhost";
      CODE:
        if( items > 1 )
            host = (char *)SvPV_nolen(ST(1));
        RETVAL = rpcb_gettime( host, &timep );
      OUTPUT:
        timep
        RETVAL
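The optional-argument test in the CODE: section above can be modeled in plain C. This is only a sketch: the array args stands in for the Perl argument stack (ST(0), ST(1), ...), and the name choose_host is hypothetical and not part of the XS API.

```c
#include <stddef.h>

/* Plain-C model of the "items > 1" check: items is the number of
 * arguments actually passed, args[] holds them in call order. */
const char *choose_host(int items, const char **args)
{
    const char *host = "localhost";   /* default, as in PREINIT: */

    if (items > 1 && args[1] != NULL)
        host = args[1];               /* optional second argument */
    return host;
}
```

The default is set first and only overwritten when the caller actually supplied the extra argument, so a one-argument call never reads past the end of the argument list.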

The C_ARGS: Keyword

The C_ARGS: keyword allows creating XSUBs which have a different calling sequence from Perl than from C, without the need to write a CODE: or PPCODE: section. The contents of the C_ARGS: paragraph are passed as the arguments to the called C function without any change.

For example, suppose that a C function is declared as

    symbolic nth_derivative(int n, symbolic function, int flags);

and that the default flags are kept in a global C variable default_flags. Suppose that you want to create an interface which is called as

    $second_deriv = $function->nth_derivative(2);

To do this, declare the XSUB as

    symbolic
    nth_derivative(function, n)
        symbolic        function
        int             n
      C_ARGS:
        n, function, default_flags
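The reordering that C_ARGS: performs can be sketched in plain C. Everything below other than the nth_derivative() signature and the default_flags variable is an assumption made for illustration: the source does not define the symbolic type, so a small polynomial struct stands in for it, and xs_nth_derivative is a hypothetical name for the generated glue.

```c
/* Hypothetical stand-in for "symbolic": a cubic polynomial,
 * where coeff[i] multiplies x^i. */
typedef struct {
    double coeff[4];
} symbolic;

int default_flags = 0;     /* global default, as described in the text */

/* C-side function with the (n, function, flags) argument order. */
symbolic nth_derivative(int n, symbolic function, int flags)
{
    int i;

    (void)flags;           /* not used in this sketch */
    while (n-- > 0) {
        for (i = 0; i < 3; i++)
            function.coeff[i] = (i + 1) * function.coeff[i + 1];
        function.coeff[3] = 0.0;
    }
    return function;
}

/* Model of the glue: Perl passes (function, n); the C_ARGS: line
 * reorders the arguments and appends default_flags. */
symbolic xs_nth_derivative(symbolic function, int n)
{
    return nth_derivative(n, function, default_flags);
}
```

The Perl-facing argument order and the C argument order are decoupled, and no hand-written CODE: section is needed to bridge them.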

The PPCODE: Keyword

The PPCODE: keyword is an alternate form of the CODE: keyword and is used to tell the xsubpp compiler that the programmer is supplying the code to control the argument stack for the XSUB's return values. Occasionally one will want an XSUB to return a list of values rather than a single value. In these cases one must use PPCODE: and then explicitly push the list of values on the stack. The PPCODE: and CODE: keywords should not be used together within the same XSUB.

The actual difference between PPCODE: and CODE: sections is in the initialization of the SP macro (which stands for the current Perl stack pointer), and in the handling of data on the stack when returning from an XSUB. In CODE: sections SP preserves the value it had on entry to the XSUB: SP is on the function pointer (which follows the last parameter). In PPCODE: sections SP is moved backward to the beginning of the parameter list, which allows PUSH*() macros to place output values in the place Perl expects them to be when the XSUB returns back to Perl.

The generated trailer for a CODE: section ensures that the number of return values Perl will see is either 0 or 1 (depending on the voidness of the return value of the C function, and heuristics mentioned in The RETVAL Variable). The trailer generated for a PPCODE: section is based on the number of return values and on the number of times SP was updated by [X]PUSH*() macros.

Note that the macros ST(i), XST_m*() and XSRETURN*() work equally well in CODE: sections and PPCODE: sections.

The following XSUB will call the C rpcb_gettime() function and will return its two output values, timep and status, to Perl as a single list.

    void
    rpcb_gettime(host)
        char *host
      PREINIT:
        time_t  timep;
        bool_t  status;
      PPCODE:
        status = rpcb_gettime( host, &timep );
        EXTEND(SP, 2);
        PUSHs(sv_2mortal(newSViv(status)));
        PUSHs(sv_2mortal(newSViv(timep)));

Notice that the programmer must supply the C code necessary to have the real rpcb_gettime() function called and to have the return values properly placed on the argument stack.

The void return type for this function tells the xsubpp compiler that the RETVAL variable is not needed or used and that it should not be created. In most scenarios the void return type should be used with the PPCODE: directive.

The EXTEND() macro is used to make room on the argument stack for 2 return values. The PPCODE: directive causes the xsubpp compiler to create a stack pointer available as SP, and it is this pointer which is being used in the EXTEND() macro. The values are then pushed onto the stack with the PUSHs() macro.

Now the rpcb_gettime() function can be used from Perl with the following statement.

    ($status, $timep) = rpcb_gettime("localhost");

When handling output parameters with a PPCODE section, be sure to handle 'set' magic properly. See perlguts for details about 'set' magic.

Returning Undef And Empty Lists

Occasionally the programmer will want to return simply undef or an empty list if a function fails rather than a separate status value. The rpcb_gettime() function offers just this situation. If the function succeeds we would like to have it return the time and if it fails we would like to have undef returned. In the following Perl code the value of $timep will either be undef or it will be a valid time.

    $timep = rpcb_gettime( "localhost" );

The following XSUB uses the SV * return type as a mnemonic only, and uses a CODE: block to indicate to the compiler that the programmer has supplied all the necessary code. The sv_newmortal() call will initialize the return value to undef, making that the default return value.

    SV *
    rpcb_gettime(host)
        char *  host
      PREINIT:
        time_t  timep;
        bool_t  x;
      CODE:
        ST(0) = sv_newmortal();
        if( rpcb_gettime( host, &timep ) )
            sv_setnv( ST(0), (double)timep);

The next example demonstrates how one would place an explicit undef in the return value, should the need arise.

    SV *
    rpcb_gettime(host)
        char *  host
      PREINIT:
        time_t  timep;
        bool_t  x;
      CODE:
        if( rpcb_gettime( host, &timep ) ){
            ST(0) = sv_newmortal();
            sv_setnv( ST(0), (double)timep);
        }
        else{
            ST(0) = &PL_sv_undef;
        }

To return an empty list one must use a PPCODE: block and then not push return values on the stack.

    void
    rpcb_gettime(host)
        char *host
      PREINIT:
        time_t  timep;
      PPCODE:
        if( rpcb_gettime( host, &timep ) )
            PUSHs(sv_2mortal(newSViv(timep)));
        else{
            /* Nothing pushed on stack, so an empty
             * list is implicitly returned. */
        }

Some people may be inclined to include an explicit return in the above XSUB, rather than letting control fall through to the end. In those situations XSRETURN_EMPTY should be used instead. This will ensure that the XSUB stack is properly adjusted. Consult perlapi for other XSRETURN macros.

Since XSRETURN_* macros can be used with CODE blocks as well, one can rewrite this example as:

    int
    rpcb_gettime(host)
        char *host
      PREINIT:
        time_t  timep;
      CODE:
        RETVAL = rpcb_gettime( host, &timep );
        if (RETVAL == 0)
            XSRETURN_UNDEF;
      OUTPUT:
        RETVAL

In fact, one can put this check into a POSTCALL: section as well. Together with PREINIT: simplifications, this leads to:

    int
    rpcb_gettime(host)
        char *host
        time_t  timep;
      POSTCALL:
        if (RETVAL == 0)
            XSRETURN_UNDEF;

The REQUIRE: Keyword

The REQUIRE: keyword is used to indicate the minimum version of the xsubpp compiler needed to compile the XS module. An XS module which contains the following statement will compile only with xsubpp version 1.922 or greater:

    REQUIRE: 1.922

The CLEANUP: Keyword

This keyword can be used when an XSUB requires special cleanup procedures before it terminates. When the CLEANUP: keyword is used it must follow any CODE:, PPCODE:, or OUTPUT: blocks which are present in the XSUB. The code specified for the cleanup block will be added as the last statements in the XSUB.

The POSTCALL: Keyword

This keyword can be used when an XSUB requires special procedures executed after the C subroutine call is performed. When the POSTCALL: keyword is used it must precede OUTPUT: and CLEANUP: blocks which are present in the XSUB.

See examples in The NO_OUTPUT Keyword and Returning Undef And Empty Lists.

The POSTCALL: block does not make much sense when the C subroutine call is supplied by the user in a CODE: or PPCODE: section.

The BOOT: Keyword

The BOOT: keyword is used to add code to the extension's bootstrap function. The bootstrap function is generated by the xsubpp compiler and normally holds the statements necessary to register any XSUBs with Perl. With the BOOT: keyword the programmer can tell the compiler to add extra statements to the bootstrap function.

This keyword may be used any time after the first MODULE keyword and should appear on a line by itself. The first blank line after the keyword will terminate the code block.

    BOOT:
    # The following message will be printed when the
    # bootstrap function executes.
    printf("Hello from the bootstrap!\n");

The VERSIONCHECK: Keyword

The VERSIONCHECK: keyword corresponds to xsubpp's -versioncheck and -noversioncheck options. This keyword overrides the command line options. Version checking is enabled by default. When version checking is enabled the XS module will attempt to verify that its version matches the version of the PM module.

To enable version checking:

    VERSIONCHECK: ENABLE

To disable version checking:

    VERSIONCHECK: DISABLE

Note that if the version of the PM module is an NV (a floating point number), it will be stringified with a possible loss of precision (currently chopping to nine decimal places) so that it may not match the version of the XS module anymore. Quoting the $VERSION declaration to make it a string is recommended if long version numbers are used.

The PROTOTYPES: Keyword

The PROTOTYPES: keyword corresponds to xsubpp's -prototypes and -noprototypes options. This keyword overrides the command line options. Prototypes are enabled by default. When prototypes are enabled XSUBs will be given Perl prototypes. This keyword may be used multiple times in an XS module to enable and disable prototypes for different parts of the module.

To enable prototypes:

    PROTOTYPES: ENABLE

To disable prototypes:

    PROTOTYPES: DISABLE

The PROTOTYPE: Keyword

This keyword is similar to the PROTOTYPES: keyword above but can be used to force xsubpp to use a specific prototype for the XSUB. This keyword overrides all other prototype options and keywords but affects only the current XSUB. Consult Prototypes in perlsub for information about Perl prototypes.

    bool_t
    rpcb_gettime(timep, ...)
        time_t timep = NO_INIT
      PROTOTYPE: $;$
      PREINIT:
        char *host = "localhost";
      CODE:
        if( items > 1 )
            host = (char *)SvPV_nolen(ST(1));
        RETVAL = rpcb_gettime( host, &timep );
      OUTPUT:
        timep
        RETVAL

If prototypes are enabled, you can disable them locally for a given XSUB as in the following example:

    void
    rpcb_gettime_noproto()
      PROTOTYPE: DISABLE
    ...

The ALIAS: Keyword

The ALIAS: keyword allows an XSUB to have two or more unique Perl names and to know which of those names was used when it was invoked. The Perl names may be fully-qualified with package names. Each alias is given an index. The compiler will set up a variable called ix which contains the index of the alias which was used. When the XSUB is called with its declared name, ix will be 0.

The following example will create aliases FOO::gettime() and BAR::getit() for this function.

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t &timep
      ALIAS:
        FOO::gettime = 1
        BAR::getit = 2
      INIT:
        printf("# ix = %d\n", ix );
      OUTPUT:
        timep
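The way one XSUB body can tell its aliases apart can be modeled in plain C as a single function that branches on the alias index. This is a sketch only: called_as is a hypothetical name, and a real XSUB receives ix from the generated glue rather than as an explicit argument.

```c
/* Plain-C model of an aliased XSUB body: one function serves every
 * Perl name and uses the alias index to decide how it was called. */
const char *called_as(int ix)
{
    switch (ix) {
    case 0:  return "rpcb_gettime";   /* the declared name */
    case 1:  return "FOO::gettime";   /* ALIAS: FOO::gettime = 1 */
    case 2:  return "BAR::getit";     /* ALIAS: BAR::getit = 2 */
    default: return "unknown";
    }
}
```

Any behaviour that differs between the aliases has to be coded by hand in such a switch (or an equivalent chain of tests on ix).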

The OVERLOAD: Keyword

Instead of writing an overloaded interface using pure Perl, you can also use the OVERLOAD keyword to define additional Perl names for your functions (like the ALIAS: keyword above). However, the overloaded functions must be defined with three parameters (except for the nomethod() function which needs four parameters). If any function has the OVERLOAD: keyword, several additional lines will be defined in the C file generated by xsubpp in order to register with the overload magic.

Since blessed objects are actually stored as RVs, it is useful to use the typemap features to preprocess parameters and extract the actual SV stored within the blessed RV. See the sample for T_PTROBJ_SPECIAL below.

To use the OVERLOAD: keyword, create an XS function which takes three input parameters (or use the C-style '...' definition) like this:

    SV *
    cmp (lobj, robj, swap)
    My_Module_obj    lobj
    My_Module_obj    robj
    IV               swap
    OVERLOAD: cmp <=>
    { /* function defined here */}

In this case, the function will overload both of the three way comparison operators. For all overload operations using non-alpha characters, you must type the parameter without quoting, separating multiple overloads with whitespace. Note that "" (the stringify overload) should be entered as \"\" (i.e. escaped).

The FALLBACK: Keyword

In addition to the OVERLOAD keyword, if you need to control how Perl autogenerates missing overloaded operators, you can set the FALLBACK keyword in the module header section, like this:

    MODULE = RPC  PACKAGE = RPC

    FALLBACK: TRUE
    ...

where FALLBACK can take any of the three values TRUE, FALSE, or UNDEF. If you do not set any FALLBACK value when using OVERLOAD, it defaults to UNDEF. FALLBACK is not used except when one or more functions using OVERLOAD have been defined. Please see fallback in overload for more details.

The INTERFACE: Keyword

This keyword declares the current XSUB as a keeper of the given calling signature. If some text follows this keyword, it is considered as a list of functions which have this signature, and should be attached to the current XSUB.

For example, if you have 4 C functions multiply(), divide(), add(), subtract() all having the signature:

    symbolic f(symbolic, symbolic);

you can make them all use the same XSUB like this:

    symbolic
    interface_s_ss(arg1, arg2)
        symbolic        arg1
        symbolic        arg2
      INTERFACE:
        multiply divide
        add subtract

(This is the complete XSUB code for 4 Perl functions!) The four generated Perl functions share names with the corresponding C functions.

The advantage of this approach compared with the ALIAS: keyword is that there is no need to code a switch statement: each Perl function (which shares the same XSUB) knows which C function it should call. Additionally, one can attach an extra function remainder() at runtime by using

    CV *mycv = newXSproto("Symbolic::remainder",
                          XS_Symbolic_interface_s_ss, __FILE__, "$$");
    XSINTERFACE_FUNC_SET(mycv, remainder);

say, from another XSUB. (This example supposes that there was no INTERFACE_MACRO: section, otherwise one needs to use something else instead of XSINTERFACE_FUNC_SET; see the next section.)
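The idea that each Perl function "knows" which C function to call can be modeled in plain C with a function pointer stored next to each name. This is a sketch under stated assumptions: symbolic is replaced by double, and xsub_entry is a hypothetical stand-in for the Perl CV that carries the pointer.

```c
typedef double symbolic;                            /* hypothetical stand-in */
typedef symbolic (*binop_fn)(symbolic, symbolic);   /* the shared signature */

symbolic multiply(symbolic a, symbolic b) { return a * b; }
symbolic divide(symbolic a, symbolic b)   { return a / b; }
symbolic add(symbolic a, symbolic b)      { return a + b; }
symbolic subtract(symbolic a, symbolic b) { return a - b; }

/* Models one Perl-visible function: a name plus the C function
 * pointer that was attached when the function was registered. */
typedef struct {
    const char *name;
    binop_fn    func;
} xsub_entry;

/* The single shared body: it never switches on a name or an index,
 * it simply calls through the pointer stored in the entry. */
symbolic interface_s_ss(const xsub_entry *entry, symbolic arg1, symbolic arg2)
{
    return entry->func(arg1, arg2);
}
```

Attaching another function with the same signature only requires creating one more entry; the shared body stays unchanged.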

The INTERFACE_MACRO: Keyword

This keyword allows one to define an INTERFACE using a different way to extract a function pointer from an XSUB. The text which follows this keyword should give the names of the macros which extract and set a function pointer. The extractor macro is given the return type, CV*, and XSANY.any_dptr for this CV*. The setter macro is given cv and the function pointer.

The default values are XSINTERFACE_FUNC and XSINTERFACE_FUNC_SET. An INTERFACE keyword with an empty list of functions can be omitted if the INTERFACE_MACRO keyword is used.

Suppose that in the previous example the function pointers for multiply(), divide(), add(), subtract() are kept in a global C array fp[] with offsets being multiply_off, divide_off, add_off, subtract_off. Then one can use

    #define XSINTERFACE_FUNC_BYOFFSET(ret,cv,f) \
        ((XSINTERFACE_CVT_ANON(ret))fp[CvXSUBANY(cv).any_i32])
    #define XSINTERFACE_FUNC_BYOFFSET_set(cv,f) \
        CvXSUBANY(cv).any_i32 = CAT2( f, _off )

in the C section, and

    symbolic
    interface_s_ss(arg1, arg2)
        symbolic        arg1
        symbolic        arg2
      INTERFACE_MACRO:
        XSINTERFACE_FUNC_BYOFFSET
        XSINTERFACE_FUNC_BYOFFSET_set
      INTERFACE:
        multiply divide
        add subtract

in the XSUB section.
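The offset-based lookup can be modeled in plain C as follows. This is only a sketch: symbolic is replaced by double, the array fp[] and the *_off names follow the text, and call_by_offset is a hypothetical helper standing in for what the extractor macro does with the stored integer.

```c
typedef double symbolic;                            /* hypothetical stand-in */
typedef symbolic (*binop_fn)(symbolic, symbolic);

symbolic multiply(symbolic a, symbolic b) { return a * b; }
symbolic divide(symbolic a, symbolic b)   { return a / b; }
symbolic add(symbolic a, symbolic b)      { return a + b; }
symbolic subtract(symbolic a, symbolic b) { return a - b; }

/* Offsets into fp[], named as in the text. */
enum { multiply_off, divide_off, add_off, subtract_off };

/* Global table of function pointers, indexed by the offsets above. */
binop_fn fp[] = { multiply, divide, add, subtract };

/* Each Perl function would store only a small integer; the pointer
 * is recovered from the table at call time. */
symbolic call_by_offset(int off, symbolic a, symbolic b)
{
    return fp[off](a, b);
}
```

Storing an index instead of a pointer means the table can be replaced or extended without touching what each Perl function holds.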

The INCLUDE: Keyword

This keyword can be used to pull other files into the XS module. The other files may have XS code. INCLUDE: can also be used to run a command to generate the XS code to be pulled into the module.

The file Rpcb1.xsh contains our rpcb_gettime() function:

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t &timep
      OUTPUT:
        timep

The XS module can use INCLUDE: to pull that file into it.

    INCLUDE: Rpcb1.xsh

If the parameters to the INCLUDE: keyword are followed by a pipe (|) then the compiler will interpret the parameters as a command. This feature is mildly deprecated in favour of the INCLUDE_COMMAND: directive, as documented below.

    INCLUDE: cat Rpcb1.xsh |

Do not use this to run perl: INCLUDE: perl | will run the perl that happens to be the first in your path and not necessarily the same perl that is used to run xsubpp. See The INCLUDE_COMMAND: Keyword.

The INCLUDE_COMMAND: Keyword

Runs the supplied command and includes its output into the current XS document. INCLUDE_COMMAND assigns special meaning to the $^X token in that it runs the same perl interpreter that is running xsubpp:

    INCLUDE_COMMAND: cat Rpcb1.xsh

    INCLUDE_COMMAND: $^X -e ...

The CASE: Keyword

The CASE: keyword allows an XSUB to have multiple distinct parts with each part acting as a virtual XSUB. CASE: is greedy and if it is used then all other XS keywords must be contained within a CASE:. This means nothing may precede the first CASE: in the XSUB and anything following the last CASE: is included in that case.

A CASE: might switch via a parameter of the XSUB, via the ix ALIAS: variable (see The ALIAS: Keyword), or maybe via the items variable (see Variable-length Parameter Lists). The last CASE: becomes the default case if it is not associated with a conditional. The following example shows CASE switched via ix with a function rpcb_gettime() having an alias x_gettime(). When the function is called as rpcb_gettime() its parameters are the usual (char *host, time_t *timep), but when the function is called as x_gettime() its parameters are reversed, (time_t *timep, char *host).

    long
    rpcb_gettime(a,b)
      CASE: ix == 1
        ALIAS:
          x_gettime = 1
        INPUT:
          # 'a' is timep, 'b' is host
          char *b
          time_t a = NO_INIT
        CODE:
          RETVAL = rpcb_gettime( b, &a );
        OUTPUT:
          a
          RETVAL
      CASE:
          # 'a' is host, 'b' is timep
          char *a
          time_t &b = NO_INIT
        OUTPUT:
          b
          RETVAL

That function can be called with either of the following statements. Note the different argument lists.

    $status = rpcb_gettime( $host, $timep );
    $status = x_gettime( $timep, $host );

The EXPORT_XSUB_SYMBOLS: Keyword

The EXPORT_XSUB_SYMBOLS: keyword is likely something you will never need. In perl versions earlier than 5.16.0, this keyword does nothing. Starting with 5.16, XSUB symbols are no longer exported by default. That is, they are static functions. If you include

    EXPORT_XSUB_SYMBOLS: ENABLE

in your XS code, the XSUBs following this line will not be declared static. You can later disable this with

    EXPORT_XSUB_SYMBOLS: DISABLE

which, again, is the default that you should probably never change. You cannot use this keyword on versions of perl before 5.16 to make XSUBs static.

The & Unary Operator

The & unary operator in the INPUT: section is used to tell xsubpp that it should convert a Perl value to/from C using the C type to the left of &, but provide a pointer to this value when the C function is called.

This is useful to avoid a CODE: block for a C function which takes a parameter by reference. Typically, the parameter should not be a pointer type (an int or long but not an int* or long*).

The following XSUB will generate incorrect C code. The xsubpp compiler will turn this into code which calls rpcb_gettime() with parameters (char *host, time_t timep), but the real rpcb_gettime() wants the timep parameter to be of type time_t* rather than time_t.

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t timep
      OUTPUT:
        timep

That problem is corrected by using the & operator. The xsubpp compiler will now turn this into code which calls rpcb_gettime() correctly with parameters (char *host, time_t *timep). It does this by carrying the & through, so the function call looks like rpcb_gettime(host, &timep).

    bool_t
    rpcb_gettime(host,timep)
        char *host
        time_t &timep
      OUTPUT:
        timep
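The effect of the & operator can be modeled in plain C: the value is held in an ordinary local variable and handed to the C function by address. This is a sketch, not the code xsubpp emits. fake_gettime is a hypothetical stand-in for rpcb_gettime(), glue_gettime is a hypothetical name for the wrapper, and a long models the Perl-side value.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for rpcb_gettime(): fills *timep and returns
 * nonzero on success; it fails for a missing or empty host name. */
int fake_gettime(const char *host, time_t *timep)
{
    if (host == NULL || host[0] == '\0')
        return 0;
    *timep = (time_t)1000;    /* fixed value keeps the sketch deterministic */
    return 1;
}

/* Model of the glue for "time_t &timep": the variable is declared as
 * a plain time_t, but the C function receives its address. */
int glue_gettime(const char *host, long *perl_timep)
{
    time_t timep = (time_t)*perl_timep;        /* input conversion */
    int status = fake_gettime(host, &timep);   /* the & carried through */

    *perl_timep = (long)timep;                 /* OUTPUT: copy back */
    return status;
}
```

The type conversion is done for time_t, not for time_t *, which is why the parameter should be declared with the non-pointer type.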

Inserting POD, Comments and C Preprocessor Directives

C preprocessor directives are allowed within BOOT:, PREINIT:, INIT:, CODE:, PPCODE:, POSTCALL:, and CLEANUP: blocks, as well as outside the functions. Comments are allowed anywhere after the MODULE keyword. The compiler will pass the preprocessor directives through untouched and will remove the commented lines. POD documentation is allowed at any point, both in the C and XS language sections. POD must be terminated with a =cut command; xsubpp will exit with an error if it is not. It is very unlikely that human-generated C code will be mistaken for POD, as most indenting styles result in whitespace in front of any line starting with =. Machine-generated XS files may fall into this trap unless care is taken to ensure that a space breaks the sequence "\n=".

Comments can be added to XSUBs by placing a # as the first non-whitespace of a line. Care should be taken to avoid making the comment look like a C preprocessor directive, lest it be interpreted as such. The simplest way to prevent this is to put whitespace in front of the #.

If you use preprocessor directives to choose one of two versions of a function, use

    #if ... version1
    #else /* ... version2  */
    #endif

and not

    #if ... version1
    #endif
    #if ... version2
    #endif

because otherwise xsubpp will believe that you made a duplicate definition of the function. Also, put a blank line before the #else/#endif so it will not be seen as part of the function body.

Using XS With C++

If an XSUB name contains ::, it is considered to be a C++ method. The generated Perl function will assume that its first argument is an object pointer. The object pointer will be stored in a variable called THIS. The object should have been created by C++ with the new() function and should be blessed by Perl with the sv_setref_pv() macro. The blessing of the object by Perl can be handled by a typemap. An example typemap is shown at the end of this section.

If the return type of the XSUB includes static, the method is considered to be a static method. It will call the C++ function using the class::method() syntax. If the method is not static the function will be called using the THIS->method() syntax.

The next examples will use the following C++ class.

    class color {
        public:
        color();
        ~color();
        int blue();
        void set_blue( int );

        private:
        int c_blue;
    };

The XSUBs for the blue() and set_blue() methods are defined with the class name but the parameter for the object (THIS, or "self") is implicit and is not listed.

    int
    color::blue()

    void
    color::set_blue( val )
        int val

Both Perl functions will expect an object as the first parameter. In the generated C++ code the object is called THIS, and the method call will be performed on this object. So in the C++ code the blue() and set_blue() methods will be called like this:

    RETVAL = THIS->blue();

    THIS->set_blue( val );

You could also write a single get/set method using an optional argument:

    int
    color::blue( val = NO_INIT )
        int val
      PROTOTYPE: $;$
      CODE:
        if (items > 1)
            THIS->set_blue( val );
        RETVAL = THIS->blue();
      OUTPUT:
        RETVAL

If the function's name is DESTROY then the C++ delete function will be called and THIS will be given as its parameter. The generated C++ code for

    void
    color::DESTROY()

will look like this:

    color *THIS = ...;  // Initialized as in typemap

    delete THIS;

If the function's name is new then the C++ new function will be called to create a dynamic C++ object. The XSUB will expect the class name, which will be kept in a variable called CLASS, to be given as the first argument.

    color *
    color::new()

The generated C++ code will call new.

    RETVAL = new color();
The following is an example of a typemap that could be used for this C++ example.

    TYPEMAP
    color *  O_OBJECT

    OUTPUT
    # The Perl object is blessed into 'CLASS', which should be a
    # char* having the name of the package for the blessing.
    O_OBJECT
        sv_setref_pv( $arg, CLASS, (void*)$var );

    INPUT
    O_OBJECT
        if( sv_isobject($arg) && (SvTYPE(SvRV($arg)) == SVt_PVMG) )
            $var = ($type)SvIV((SV*)SvRV( $arg ));
        else{
            warn( \"${Package}::$func_name() -- $var is not a blessed SV reference\" );
            XSRETURN_UNDEF;
        }

Interface Strategy

When designing an interface between Perl and a C library a straight translation from C to XS (such as created by h2xs -x) is often sufficient. However, sometimes the interface will look very C-like and occasionally nonintuitive, especially when the C function modifies one of its parameters, or returns failure inband (as in "negative return values mean failure"). In cases where the programmer wishes to create a more Perl-like interface the following strategy may help to identify the more critical parts of the interface.

Identify the C functions with input/output or output parameters. The XSUBs for these functions may be able to return lists to Perl.

Identify the C functions which use some inband info as an indication of failure. They may be candidates to return undef or an empty list in case of failure. If the failure may be detected without a call to the C function, you may want to use an INIT: section to report the failure. For failures detectable after the C function returns one may want to use a POSTCALL: section to process the failure. In more complicated cases use CODE: or PPCODE: sections.

If many functions use the same failure indication based on the return value, you may want to create a special typedef to handle this situation. Put

    typedef int negative_is_failure;

near the beginning of the XS file, and create an OUTPUT typemap entry for negative_is_failure which converts negative values to undef, or maybe croak()s. After this, a return value of type negative_is_failure will produce a more Perl-like interface.
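The decision such an OUTPUT typemap entry would make can be modeled in plain C. This is a sketch of the conversion rule only, not typemap code: perl_result is a hypothetical struct that models "undef or an integer", and convert_negative_is_failure is a hypothetical name.

```c
typedef int negative_is_failure;

/* Models the Perl-side result: either undef or an integer value. */
typedef struct {
    int defined;   /* 0 models undef, 1 models a defined value */
    int value;     /* meaningful only when defined is 1 */
} perl_result;

/* The rule described in the text: a negative C return value means
 * failure and becomes undef; anything else is passed through. */
perl_result convert_negative_is_failure(negative_is_failure rc)
{
    perl_result r;

    r.defined = (rc >= 0);
    r.value = r.defined ? rc : 0;
    return r;
}
```

Because the rule is tied to the typedef name, every XSUB declared with that return type gets the same failure handling without repeating it.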

Identify which values are used only by the C and XSUB functions themselves, say, when a parameter to a function should be the contents of a global variable. If Perl does not need to access the contents of the value then it may not be necessary to provide a translation for that value from C to Perl.

Identify the pointers in the C function parameter lists and return values. Some pointers may be used to implement input/output or output parameters; they can be handled in XS with the & unary operator and, possibly, the NO_INIT keyword. Some others will require handling of types like int *, and one needs to decide what a useful Perl translation will do in such a case. When the semantics are clear, it is advisable to put the translation into a typemap file.

Identify the structures used by the C functions. In many cases it may be helpful to use the T_PTROBJ typemap for these structures so they can be manipulated by Perl as blessed objects. (This is handled automatically by h2xs -x.)

If the same C type is used in several different contexts which require different translations, typedef several new types mapped to this C type, and create separate typemap entries for these new types. Use these types in declarations of return type and parameters to XSUBs.

Perl Objects And C Structures

When dealing with C structures one should select either T_PTROBJ or T_PTRREF for the XS type. Both types are designed to handle pointers to complex objects. The T_PTRREF type will allow the Perl object to be unblessed while the T_PTROBJ type requires that the object be blessed. By using T_PTROBJ one can achieve a form of type-checking because the XSUB will attempt to verify that the Perl object is of the expected type.

The following XS code shows the getnetconfigent() function which is used with ONC+ TIRPC. The getnetconfigent() function will return a pointer to a C structure and has the C prototype shown below. The example will demonstrate how the C pointer will become a Perl reference. Perl will consider this reference to be a pointer to a blessed object and will attempt to call a destructor for the object. A destructor will be provided in the XS source to free the memory used by getnetconfigent(). Destructors in XS can be created by specifying an XSUB function whose name ends with the word DESTROY. XS destructors can be used to free memory which may have been malloc'd by another XSUB.

    struct netconfig *getnetconfigent(const char *netid);

A typedef will be created for struct netconfig. The Perl object will be blessed in a class matching the name of the C type, with the tag Ptr appended, and the name should not have embedded spaces if it will be a Perl package name. The destructor will be placed in a class corresponding to the class of the object and the PREFIX keyword will be used to trim the name to the word DESTROY as Perl will expect.

    typedef struct netconfig Netconfig;

    MODULE = RPC  PACKAGE = RPC

    Netconfig *
    getnetconfigent(netid)
        char *netid

    MODULE = RPC  PACKAGE = NetconfigPtr  PREFIX = rpcb_

    void
    rpcb_DESTROY(netconf)
        Netconfig *netconf
      CODE:
        printf("Now in NetconfigPtr::DESTROY\n");
        free( netconf );

This example requires the following typemap entry. Consult perlxstypemap for more information about adding new typemaps for an extension.

    TYPEMAP
    Netconfig *  T_PTROBJ

This example will be used with the following Perl statements.

    use RPC;
    $netconf = getnetconfigent("udp");

When Perl destroys the object referenced by $netconf it will send the object to the supplied XSUB DESTROY function. Perl cannot determine, and does not care, that this object is a C struct and not a Perl object. In this sense, there is no difference between the object created by the getnetconfigent() XSUB and an object created by a normal Perl subroutine.

Safely Storing Static Data in XS

Starting with Perl 5.8, a macro framework has been defined to allow static data to be safely stored in XS modules that will be accessed from a multi-threaded Perl.

Although primarily designed for use with multi-threaded Perl, the macros have been designed so that they will work with non-threaded Perl as well.

It is therefore strongly recommended that these macros be used by all XS modules that make use of static data.

The easiest way to get a template set of macros to use is by specifying the -g (--global) option with h2xs (see h2xs).

Below is an example module that makes use of the macros.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Global Data */

    #define MY_CXT_KEY "BlindMice::_guts" XS_VERSION

    typedef struct {
        int count;
        char name[3][100];
    } my_cxt_t;

    START_MY_CXT

    MODULE = BlindMice           PACKAGE = BlindMice

    BOOT:
    {
        MY_CXT_INIT;
        MY_CXT.count = 0;
        strcpy(MY_CXT.name[0], "None");
        strcpy(MY_CXT.name[1], "None");
        strcpy(MY_CXT.name[2], "None");
    }

    int
    newMouse(char * name)
      PREINIT:
        dMY_CXT;
      CODE:
        if (MY_CXT.count >= 3) {
            warn("Already have 3 blind mice");
            RETVAL = 0;
        }
        else {
            RETVAL = ++ MY_CXT.count;
            strcpy(MY_CXT.name[MY_CXT.count - 1], name);
        }
      OUTPUT:
        RETVAL

    char *
    get_mouse_name(index)
        int index
      PREINIT:
        dMY_CXT;
      CODE:
        /* index is 1-based; reject anything outside 1..count */
        if (index < 1 || index > MY_CXT.count)
            croak("There are only 3 blind mice.");
        else
            RETVAL = MY_CXT.name[index - 1];
      OUTPUT:
        RETVAL

    void
    CLONE(...)
      CODE:
        MY_CXT_CLONE;

REFERENCE

  • MY_CXT_KEY

    This macro is used to define a unique key to refer to the static data for an XS module. The suggested naming scheme, as used by h2xs, is to use a string that consists of the module name, the string "::_guts" and the module version number.

        #define MY_CXT_KEY "MyModule::_guts" XS_VERSION

  • typedef my_cxt_t

    This struct typedef must always be called my_cxt_t. The other CXT* macros assume the existence of the my_cxt_t typedef name.

    Declare a typedef named my_cxt_t that is a structure that contains all the data that needs to be interpreter-local.

        typedef struct {
            int some_value;
        } my_cxt_t;

  • START_MY_CXT

    Always place the START_MY_CXT macro directly after the declaration of my_cxt_t.

  • MY_CXT_INIT

    The MY_CXT_INIT macro initialises storage for the my_cxt_t struct.

    It must be called exactly once, typically in a BOOT: section. If you are maintaining multiple interpreters, it should be called once in each interpreter instance, except for interpreters cloned from existing ones. (But see MY_CXT_CLONE below.)

  • dMY_CXT

    Use the dMY_CXT macro (a declaration) in all the functions that access MY_CXT.

  • MY_CXT

    Use the MY_CXT macro to access members of the my_cxt_t struct. For example, if my_cxt_t is

        typedef struct {
            int index;
        } my_cxt_t;

    then use this to access the index member:

        dMY_CXT;
        MY_CXT.index = 2;

  • aMY_CXT/pMY_CXT

    dMY_CXT may be quite expensive to calculate. To avoid the overhead of invoking it in each function, it is possible to pass the declaration on to other functions using the aMY_CXT/pMY_CXT macros, e.g.

        void sub1() {
            dMY_CXT;
            MY_CXT.index = 1;
            sub2(aMY_CXT);
        }

        void sub2(pMY_CXT) {
            MY_CXT.index = 2;
        }

    Analogously to pTHX, there are equivalent forms for when the macro is the first or last in multiple arguments, where an underscore represents a comma, i.e. _aMY_CXT, aMY_CXT_, _pMY_CXT and pMY_CXT_.

  • MY_CXT_CLONE

    By default, when a new interpreter is created as a copy of an existing one (e.g. via threads->create()), both interpreters share the same physical my_cxt_t structure. Calling MY_CXT_CLONE (typically via the package's CLONE() function) causes a byte-for-byte copy of the structure to be taken, and any future dMY_CXT will cause the copy to be accessed instead.

  • MY_CXT_INIT_INTERP(my_perl)
  • dMY_CXT_INTERP(my_perl)

    These are versions of the macros which take an explicit interpreter as an argument.

Note that these macros will only work together within the same source file; that is, a dMY_CXT in one source file will access a different structure than a dMY_CXT in another source file.
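The idea behind the macros can be sketched in plain C, with the per-interpreter struct passed explicitly instead of being looked up by dMY_CXT. This is a simplified model, not how the macros are implemented: mice_cxt, mice_init, mice_add and mice_clone are hypothetical names. It shows all state living in one struct, a function receiving that struct as a parameter (the role aMY_CXT/pMY_CXT play), and a clone being a byte-for-byte copy.

```c
#include <assert.h>
#include <string.h>

#define MAX_MICE 3

/* Plays the role of my_cxt_t: all per-interpreter state in one struct. */
typedef struct {
    int  count;
    char name[MAX_MICE][100];
} mice_cxt;

/* Analogue of MY_CXT_INIT followed by the BOOT: section. */
void mice_init(mice_cxt *cxt)
{
    int i;

    assert(cxt != NULL);
    cxt->count = 0;
    for (i = 0; i < MAX_MICE; i++)
        strcpy(cxt->name[i], "None");
}

/* Returns the new mouse number, or 0 if the table is full
   or the name does not fit. */
int mice_add(mice_cxt *cxt, const char *name)
{
    if (cxt->count >= MAX_MICE || strlen(name) >= sizeof cxt->name[0])
        return 0;
    strcpy(cxt->name[cxt->count], name);
    return ++cxt->count;
}

/* Analogue of MY_CXT_CLONE: take a byte-for-byte copy of the struct. */
void mice_clone(mice_cxt *dst, const mice_cxt *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

After mice_clone(), changes made through one struct are invisible through the other. Because the copy is byte-for-byte, any pointer members would still point at shared data; a struct that holds pointers needs extra work after cloning.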

Thread-aware system interfaces

Starting from Perl 5.8, at the C/C++ level Perl knows how to wrap system/library interfaces that have thread-aware versions (e.g. getpwent_r()) into frontend macros (e.g. getpwent()) that correctly handle the multithreaded interaction with the Perl interpreter. This happens transparently; the only thing you need to do is to instantiate a Perl interpreter.

This wrapping always happens when compiling Perl core source (PERL_CORE is defined) or the Perl core extensions (PERL_EXT is defined). When compiling XS code outside of Perl core the wrapping does not take place. Note, however, that intermixing the _r-forms (as Perl compiled for multithreaded operation will do) and the _r-less forms is neither well-defined (inconsistent results, data corruption, or even crashes become more likely), nor is it very portable.
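The difference between an _r-less interface and its _r counterpart can be illustrated with a pair of made-up functions. format_id and format_id_r are hypothetical and exist only for this sketch; they mimic the usual convention in which the plain form returns a pointer into one shared static buffer and the _r form writes into storage supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical non-reentrant function: every call reuses one static buffer,
   so a later call overwrites the result of an earlier one. */
const char *format_id(int id)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "id-%d", id);
    return buf;
}

/* Hypothetical "_r" variant: the caller owns the storage, so results from
   different calls (or different threads) do not interfere. */
const char *format_id_r(int id, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "id-%d", id);
    return buf;
}
```

Two calls to format_id() return the same pointer, and only the most recent text survives; two calls to format_id_r() with separate buffers keep both results.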

EXAMPLES

File RPC.xs: Interface to some ONC+ RPC bind library functions.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    #include <rpc/rpc.h>

    typedef struct netconfig Netconfig;

    MODULE = RPC  PACKAGE = RPC

    SV *
    rpcb_gettime(host="localhost")
        char *host
      PREINIT:
        time_t  timep;
      CODE:
        ST(0) = sv_newmortal();
        if( rpcb_gettime( host, &timep ) )
            sv_setnv( ST(0), (double)timep );

    Netconfig *
    getnetconfigent(netid="udp")
        char *netid

    MODULE = RPC  PACKAGE = NetconfigPtr  PREFIX = rpcb_

    void
    rpcb_DESTROY(netconf)
        Netconfig *netconf
      CODE:
        printf("NetconfigPtr::DESTROY\n");
        free( netconf );

File typemap: Custom typemap for RPC.xs. (cf. perlxstypemap)

    TYPEMAP
    Netconfig *  T_PTROBJ

File RPC.pm: Perl module for the RPC extension.

    package RPC;

    require Exporter;
    require DynaLoader;
    @ISA = qw(Exporter DynaLoader);
    @EXPORT = qw(rpcb_gettime getnetconfigent);

    bootstrap RPC;
    1;

File rpctest.pl: Perl test program for the RPC extension.

    use RPC;

    $netconf = getnetconfigent();
    $a = rpcb_gettime();
    print "time = $a\n";
    print "netconf = $netconf\n";

    $netconf = getnetconfigent("tcp");
    $a = rpcb_gettime("poplar");
    print "time = $a\n";
    print "netconf = $netconf\n";

XS VERSION

This document covers features supported by ExtUtils::ParseXS (also known as xsubpp) 3.13_01.

AUTHOR

Originally written by Dean Roehrich <roehrich@cray.com>.

Maintained since 1996 by The Perl Porters <perlbug@perl.org>.

 
perlxstut

Perl 5 version 18.2 documentation

NAME

perlxstut - Tutorial for writing XSUBs

DESCRIPTION

This tutorial will educate the reader on the steps involved in creating a Perl extension. The reader is assumed to have access to perlguts, perlapi and perlxs.

This tutorial starts with very simple examples and becomes more complex, with each new example adding new features. Certain concepts may not be completely explained until later in the tutorial in order to slowly ease the reader into building extensions.

This tutorial was written from a Unix point of view. Where I know the details to differ on other platforms (e.g. Win32), I will note them. If you find something that was missed, please let me know.

SPECIAL NOTES

make

This tutorial assumes that the make program that Perl is configured to use is called make. Instead of running "make" in the examples that follow, you may have to substitute whatever make program Perl has been configured to use. Running perl -V:make should tell you what it is.

Version caveat

When writing a Perl extension for general consumption, one should expect that the extension will be used with versions of Perl different from the version available on your machine. Since you are reading this document, the version of Perl on your machine is probably 5.005 or later, but the users of your extension may have more ancient versions.

To understand what kinds of incompatibilities one may expect, and in the rare case that the version of Perl on your machine is older than this document, see the section on "Troubleshooting these Examples" for more information.

If your extension uses some features of Perl which are not available on older releases of Perl, your users would appreciate an early meaningful warning. You would probably put this information into the README file, but nowadays installation of extensions may be performed automatically, guided by the CPAN.pm module or other tools.

In MakeMaker-based installations, Makefile.PL provides the earliest opportunity to perform version checks. One can put something like this in Makefile.PL for this purpose:

    eval { require 5.007 }
        or die <<EOD;
    ############
    ### This module uses frobnication framework which is not available before
    ### version 5.007 of Perl.  Upgrade your Perl before installing Kara::Mba.
    ############
    EOD

Dynamic Loading versus Static Loading

It is commonly thought that if a system does not have the capability to dynamically load a library, you cannot build XSUBs. This is incorrect. You can build them, but you must link the XSUB subroutines with the rest of Perl, creating a new executable. This situation is similar to Perl 4.

This tutorial can still be used on such a system. The XSUB build mechanism will check the system and build a dynamically-loadable library if possible, or else a static library and then, optionally, a new statically-linked executable with that static library linked in.

Should you wish to build a statically-linked executable on a system which can dynamically load libraries, you may, in all the following examples, where the command "make" with no arguments is executed, run the command "make perl" instead.

If you have generated such a statically-linked executable by choice, then instead of saying "make test", you should say "make test_static". On systems that cannot build dynamically-loadable libraries at all, simply saying "make test" is sufficient.

TUTORIAL

Now let's go on with the show!

EXAMPLE 1

Our first extension will be very simple. When we call the routine in the extension, it will print out a well-known message and return.

Run "h2xs -A -n Mytest". This creates a directory named Mytest, possibly under ext/ if that directory exists in the current working directory. Several files will be created under the Mytest dir, including MANIFEST, Makefile.PL, lib/Mytest.pm, Mytest.xs, t/Mytest.t, and Changes.

The MANIFEST file contains the names of all the files just created in the Mytest directory.

The file Makefile.PL should look something like this:

    use ExtUtils::MakeMaker;
    # See lib/ExtUtils/MakeMaker.pm for details of how to influence
    # the contents of the Makefile that is written.
    WriteMakefile(
        NAME         => 'Mytest',
        VERSION_FROM => 'Mytest.pm', # finds $VERSION
        LIBS         => [''],   # e.g., '-lm'
        DEFINE       => '',     # e.g., '-DHAVE_SOMETHING'
        INC          => '',     # e.g., '-I/usr/include/other'
    );

The file Mytest.pm should start with something like this:

    package Mytest;

    use 5.008008;
    use strict;
    use warnings;

    require Exporter;

    our @ISA = qw(Exporter);
    our %EXPORT_TAGS = ( 'all' => [ qw(

    ) ] );

    our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );

    our @EXPORT = qw(

    );

    our $VERSION = '0.01';

    require XSLoader;
    XSLoader::load('Mytest', $VERSION);

    # Preloaded methods go here.

    1;
    __END__
    # Below is the stub of documentation for your module. You better edit it!

The rest of the .pm file contains sample code for providing documentation for the extension.

Finally, the Mytest.xs file should look something like this:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    #include "ppport.h"

    MODULE = Mytest  PACKAGE = Mytest

Let's edit the .xs file by adding this to the end of the file:

    void
    hello()
      CODE:
        printf("Hello, world!\n");

It is okay for the lines starting at the "CODE:" line to not be indented. However, for readability purposes, it is suggested that you indent CODE: one level and the lines following one more level.

Now we'll run "perl Makefile.PL". This will create a real Makefile, which make needs. Its output looks something like:

    % perl Makefile.PL
    Checking if your kit is complete...
    Looks good
    Writing Makefile for Mytest
    %

Now, running make will produce output that looks something like this (some long lines have been shortened for clarity and some extraneous lines have been deleted):

    % make
    cp lib/Mytest.pm blib/lib/Mytest.pm
    perl xsubpp -typemap typemap Mytest.xs > Mytest.xsc && mv Mytest.xsc Mytest.c
    Please specify prototyping behavior for Mytest.xs (see perlxs manual)
    cc -c Mytest.c
    Running Mkbootstrap for Mytest ()
    chmod 644 Mytest.bs
    rm -f blib/arch/auto/Mytest/Mytest.so
    cc -shared -L/usr/local/lib Mytest.o -o blib/arch/auto/Mytest/Mytest.so \
        \
    chmod 755 blib/arch/auto/Mytest/Mytest.so
    cp Mytest.bs blib/arch/auto/Mytest/Mytest.bs
    chmod 644 blib/arch/auto/Mytest/Mytest.bs
    Manifying blib/man3/Mytest.3pm
    %

You can safely ignore the line about "prototyping behavior" - it is explained in The PROTOTYPES: Keyword in perlxs.

Perl has its own special way of easily writing test scripts, but for this example only, we'll create our own test script. Create a file called hello that looks like this:

    #! /opt/perl5/bin/perl

    use ExtUtils::testlib;

    use Mytest;

    Mytest::hello();

Now we make the script executable (chmod +x hello), run the script and we should see the following output:

    % ./hello
    Hello, world!
    %

EXAMPLE 2

Now let's add to our extension a subroutine that will take a single numeric argument as input and return 1 if the number is even or 0 if the number is odd.

Add the following to the end of Mytest.xs:

    int
    is_even(input)
        int input
      CODE:
        RETVAL = (input % 2 == 0);
      OUTPUT:
        RETVAL

There does not need to be whitespace at the start of the "int input" line, but it is useful for improving readability. Placing a semi-colon at the end of that line is also optional. Any amount and kind of whitespace may be placed between the "int" and "input".

Now rebuild the shared library by performing the same steps as before: generate a Makefile from the Makefile.PL file, then run make.

In order to test that our extension works, we now need to look at the file Mytest.t. This file is set up to imitate the same kind of testing structure that Perl itself has. Within the test script, you perform a number of tests to confirm the behavior of the extension, printing "ok" when the test is correct, "not ok" when it is not.

    use Test::More tests => 4;
    BEGIN { use_ok('Mytest') };

    #########################

    # Insert your test code below, the Test::More module is use()ed here so read
    # its man page ( perldoc Test::More ) for help writing this test script.

    is(&Mytest::is_even(0), 1);
    is(&Mytest::is_even(1), 0);
    is(&Mytest::is_even(2), 1);

We will be calling the test script through the command "make test". You should see output that looks something like this:

    % make test
    PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
    t/Mytest....ok
    All tests successful.
    Files=1, Tests=4, 0 wallclock secs ( 0.03 cusr + 0.00 csys = 0.03 CPU)
    %

What has gone on?

The program h2xs is the starting point for creating extensions. In later examples we'll see how we can use h2xs to read header files and generate templates to connect to C routines.

h2xs creates a number of files in the extension directory. The file Makefile.PL is a perl script which will generate a true Makefile to build the extension. We'll take a closer look at it later.

The .pm and .xs files contain the meat of the extension. The .xs file holds the C routines that make up the extension. The .pm file contains routines that tell Perl how to load your extension.

Generating the Makefile and running make created a directory called blib (which stands for "build library") in the current working directory. This directory will contain the shared library that we will build. Once we have tested it, we can install it into its final location.

Invoking the test script via "make test" did something very important. It invoked perl with all those -I arguments so that it could find the various files that are part of the extension. It is very important that while you are still testing extensions you use "make test". If you try to run the test script all by itself, you will get a fatal error. Another reason it is important to use "make test" to run your test script is that if you are testing an upgrade to an already-existing version, using "make test" ensures that you will test your new extension, not the already-existing version.

When Perl sees a use extension; statement, it searches for a file with the same name as the use'd extension that has a .pm suffix. If that file cannot be found, Perl dies with a fatal error. The default search path is contained in the @INC array.

In our case, Mytest.pm tells perl that it will need the Exporter and XSLoader modules. It then sets the @ISA and @EXPORT arrays and the $VERSION scalar; finally it tells perl to load the module by calling XSLoader::load. Perl will call its dynamic loader routine (if there is one) and load the shared library.

The two arrays @ISA and @EXPORT are very important. The @ISA array contains a list of other packages in which to search for methods (or subroutines) that do not exist in the current package. This is usually only important for object-oriented extensions (which we will talk about much later), and so usually doesn't need to be modified.

The @EXPORT array tells Perl which of the extension's variables and subroutines should be placed into the calling package's namespace. Because you don't know if the user has already used your variable and subroutine names, it's vitally important to carefully select what to export. Do not export method or variable names by default without a good reason.

As a general rule, if the module is trying to be object-oriented then don't export anything. If it's just a collection of functions and variables, then you can export them via another array, called @EXPORT_OK. This array does not automatically place its subroutine and variable names into the namespace unless the user specifically requests that this be done.

See perlmod for more information.

The $VERSION variable is used to ensure that the .pm file and the shared library are "in sync" with each other. Any time you make changes to the .pm or .xs files, you should increment the value of this variable.

Writing good test scripts

The importance of writing good test scripts cannot be over-emphasized. You should closely follow the "ok/not ok" style that Perl itself uses, so that it is very easy and unambiguous to determine the outcome of each test case. When you find and fix a bug, make sure you add a test case for it.

By running "make test", you ensure that your Mytest.t script runs and uses the correct version of your extension. If you have many test cases, save your test files in the "t" directory and use the suffix ".t". When you run "make test", all of these test files will be executed.
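The "ok/not ok" convention is simple enough to sketch in a few lines of C. This is only an illustration of the output format, not a replacement for Test::More: tap_line is a hypothetical helper, and is_even here is a plain C copy of the logic from EXAMPLE 2.

```c
#include <assert.h>
#include <stdio.h>

/* Writes one "ok N" or "not ok N" line into buf.
   Returns whatever snprintf returns (the length that was needed). */
int tap_line(char *buf, size_t len, int test_num, int passed)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s %d\n", passed ? "ok" : "not ok", test_num);
}

/* Plain C copy of the is_even logic from EXAMPLE 2. */
int is_even(int input)
{
    return input % 2 == 0;
}
```

A driver would call tap_line() once per check, for example with `is_even(0) == 1` as the `passed` argument, and print each line; a harness can then decide the outcome of every test case by looking only at the first word of the line.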

EXAMPLE 3

Our third extension will take one argument as its input, round off that value, and set the argument to the rounded value.

Add the following to the end of Mytest.xs:

    void
    round(arg)
        double arg
      CODE:
        if (arg > 0.0) {
            arg = floor(arg + 0.5);
        } else if (arg < 0.0) {
            arg = ceil(arg - 0.5);
        } else {
            arg = 0.0;
        }
      OUTPUT:
        arg

Edit the Makefile.PL file so that the corresponding line looks like this:

    'LIBS' => ['-lm'],   # e.g., '-lm'

Generate the Makefile and run make. Change the test number in Mytest.t to "9" and add the following tests:

    $i = -1.5; &Mytest::round($i); is( $i, -2.0 );
    $i = -1.1; &Mytest::round($i); is( $i, -1.0 );
    $i = 0.0;  &Mytest::round($i); is( $i, 0.0 );
    $i = 0.5;  &Mytest::round($i); is( $i, 1.0 );
    $i = 1.2;  &Mytest::round($i); is( $i, 1.0 );

Running "make test" should now print out that all nine tests are okay.

Notice that in these new test cases, the argument passed to round was a scalar variable. You might be wondering if you can round a constant or literal. To see what happens, temporarily add the following line to Mytest.t:

    &Mytest::round(3);

Run "make test" and notice that Perl dies with a fatal error. Perl won't let you change the value of constants!

What's new here?

  • We've made some changes to Makefile.PL. In this case, we've specified an extra library to be linked into the extension's shared library, the math library libm in this case. We'll talk later about how to write XSUBs that can call every routine in a library.

  • The value of the function is not being passed back as the function's return value, but by changing the value of the variable that was passed into the function. You might have guessed that when you saw that the return value of round is of type "void".

Input and Output Parameters

You specify the parameters that will be passed into the XSUB on the line(s) after you declare the function's return value and name. Each input parameter line starts with optional whitespace, and may have an optional terminating semicolon.

The list of output parameters occurs at the very end of the function, just after the OUTPUT: directive. The use of RETVAL tells Perl that you wish to send this value back as the return value of the XSUB function. In Example 3, we wanted the "return value" placed in the original variable which we passed in, so we listed it (and not RETVAL) in the OUTPUT: section.

The XSUBPP Program

The xsubpp program takes the XS code in the .xs file and translates it into C code, placing it in a file whose suffix is .c. The C code created makes heavy use of the C functions within Perl.

The TYPEMAP file

The xsubpp program uses rules to convert from Perl's data types (scalar, array, etc.) to C's data types (int, char, etc.). These rules are stored in the typemap file ($PERLLIB/ExtUtils/typemap). There's a brief discussion below, but all the nitty-gritty details can be found in perlxstypemap. If you have a new-enough version of perl (5.16 and up) or an upgraded XS compiler (ExtUtils::ParseXS 3.13_01 or better), then you can inline typemaps in your XS instead of writing separate files. Either way, this typemap thing is split into three parts:

The first section maps various C data types to a name, which corresponds somewhat with the various Perl types. The second section contains C code which xsubpp uses to handle input parameters. The third section contains C code which xsubpp uses to handle output parameters.

Let's take a look at a portion of the .c file created for our extension. The file name is Mytest.c:

    XS(XS_Mytest_round)
    {
        dXSARGS;
        if (items != 1)
            Perl_croak(aTHX_ "Usage: Mytest::round(arg)");
        PERL_UNUSED_VAR(cv); /* -W */
        {
            double  arg = (double)SvNV(ST(0));  /* XXXXX */
            if (arg > 0.0) {
                arg = floor(arg + 0.5);
            } else if (arg < 0.0) {
                arg = ceil(arg - 0.5);
            } else {
                arg = 0.0;
            }
            sv_setnv(ST(0), (double)arg);       /* XXXXX */
            SvSETMAGIC(ST(0));
        }
        XSRETURN_EMPTY;
    }

Notice the two lines commented with "XXXXX". If you check the first part of the typemap file (or section), you'll see that doubles are of type T_DOUBLE. In the INPUT part of the typemap, an argument that is T_DOUBLE is assigned to the variable arg by calling the routine SvNV on something, then casting it to double, then assigned to the variable arg. Similarly, in the OUTPUT section, once arg has its final value, it is passed to the sv_setnv function to be passed back to the calling subroutine. These two functions are explained in perlguts; we'll talk more later about what that "ST(0)" means in the section on the argument stack.
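The shape of such a generated wrapper (convert the input, do the work, convert the output) can be mimicked with a toy model that needs no Perl headers. Everything here is hypothetical: toy_sv is a drastically simplified stand-in for a Perl scalar, toy_SvNV and toy_sv_setnv only imitate the roles of SvNV and sv_setnv, and halve is an arbitrary workhorse chosen for the sketch.

```c
#include <assert.h>

/* Toy stand-in for a Perl scalar; a real SV is far richer than this. */
typedef struct {
    double nv;
} toy_sv;

/* Toy counterparts of SvNV and sv_setnv (hypothetical names). */
double toy_SvNV(const toy_sv *sv)         { return sv->nv; }
void   toy_sv_setnv(toy_sv *sv, double v) { sv->nv = v; }

/* Workhorse with an ordinary C signature. */
double halve(double arg)
{
    return arg / 2.0;
}

/* Shape of a generated wrapper: usage check, INPUT conversion,
   the actual work, then OUTPUT conversion back onto the stack. */
void wrapper_halve(toy_sv *stack, int items)
{
    double arg;

    assert(items == 1);               /* like the "items != 1" usage check */
    arg = toy_SvNV(&stack[0]);        /* what an INPUT typemap entry emits  */
    arg = halve(arg);
    toy_sv_setnv(&stack[0], arg);     /* what an OUTPUT typemap entry emits */
}
```

In the real generated code the two conversion lines are not written by hand: xsubpp copies them from the INPUT and OUTPUT parts of the typemap for the C type named in the XSUB.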

Warning about Output Arguments

In general, it's not a good idea to write extensions that modify their input parameters, as in Example 3. Instead, you should probably return multiple values in an array and let the caller handle them (we'll do this in a later example). However, in order to better accommodate calling pre-existing C routines, which often do modify their input parameters, this behavior is tolerated.

EXAMPLE 4

In this example, we'll now begin to write XSUBs that will interact with pre-defined C libraries. To begin with, we will build a small library of our own, then let h2xs write our .pm and .xs files for us.

Create a new directory called Mytest2 at the same level as the directory Mytest. In the Mytest2 directory, create another directory called mylib, and cd into that directory.

Here we'll create some files that will generate a test library. These will include a C source file and a header file. We'll also create a Makefile.PL in this directory. Then we'll make sure that running make at the Mytest2 level will automatically run this Makefile.PL file and the resulting Makefile.

In the mylib directory, create a file mylib.h that looks like this:

    #define TESTVAL 4

    extern double foo(int, long, const char*);

Also create a file mylib.c that looks like this:

    #include <stdlib.h>
    #include "./mylib.h"

    double
    foo(int a, long b, const char *c)
    {
        return (a + b + atof(c) + TESTVAL);
    }

And finally create a file Makefile.PL that looks like this:

    use ExtUtils::MakeMaker;
    $Verbose = 1;
    WriteMakefile(
        NAME  => 'Mytest2::mylib',
        SKIP  => [qw(all static static_lib dynamic dynamic_lib)],
        clean => {'FILES' => 'libmylib$(LIB_EXT)'},
    );

    sub MY::top_targets {
    '
    all :: static

    pure_all :: static

    static :: libmylib$(LIB_EXT)

    libmylib$(LIB_EXT): $(O_FILES)
    	$(AR) cr libmylib$(LIB_EXT) $(O_FILES)
    	$(RANLIB) libmylib$(LIB_EXT)

    ';
    }

Make sure you use a tab and not spaces on the lines beginning with "$(AR)" and "$(RANLIB)". Make will not function properly if you use spaces. It has also been reported that the "cr" argument to $(AR) is unnecessary on Win32 systems.

We will now create the main top-level Mytest2 files. Change to the directory above Mytest2 and run the following command:

    % h2xs -O -n Mytest2 ./Mytest2/mylib/mylib.h

This will print out a warning about overwriting Mytest2, but that's okay. Our files are stored in Mytest2/mylib, and will be untouched.

The normal Makefile.PL that h2xs generates doesn't know about the mylib directory. We need to tell it that there is a subdirectory and that we will be generating a library in it. Let's add the argument MYEXTLIB to the WriteMakefile call so that it looks like this:

    WriteMakefile(
        'NAME'         => 'Mytest2',
        'VERSION_FROM' => 'Mytest2.pm', # finds $VERSION
        'LIBS'         => [''],   # e.g., '-lm'
        'DEFINE'       => '',     # e.g., '-DHAVE_SOMETHING'
        'INC'          => '',     # e.g., '-I/usr/include/other'
        'MYEXTLIB'     => 'mylib/libmylib$(LIB_EXT)',
    );

and then at the end add a subroutine (which will override the pre-existing subroutine). Remember to use a tab character to indent the line beginning with "cd"!

    sub MY::postamble {
    '
    $(MYEXTLIB): mylib/Makefile
    	cd mylib && $(MAKE) $(PASSTHRU)
    ';
    }

Let's also fix the MANIFEST file so that it accurately reflects the contents of our extension. The single line that says "mylib" should be replaced by the following three lines:

    mylib/Makefile.PL
    mylib/mylib.c
    mylib/mylib.h

To keep our namespace nice and unpolluted, edit the .pm file and change the variable @EXPORT to @EXPORT_OK. Finally, in the .xs file, edit the #include line to read:

    #include "mylib/mylib.h"

And also add the following function definition to the end of the .xs file:

    double
    foo(a,b,c)
        int             a
        long            b
        const char *    c
      OUTPUT:
        RETVAL

Now we also need to create a typemap because the default Perl typemap doesn't currently support the const char * type. Include a new TYPEMAP section in your XS code before the above function:

    TYPEMAP: <<END;
    const char *    T_PV
    END

Now run perl on the top-level Makefile.PL. Notice that it also created a Makefile in the mylib directory. Run make and watch that it does cd into the mylib directory and run make in there as well.

Now edit the Mytest2.t script and change the number of tests to "4", and add the following lines to the end of the script:

    is( &Mytest2::foo(1, 2, "Hello, world!"), 7 );
    is( &Mytest2::foo(1, 2, "0.0"), 7 );
    ok( abs(&Mytest2::foo(0, 0, "-3.4") - 0.6) <= 0.01 );

(When dealing with floating-point comparisons, it is best not to check for equality, but rather to check that the difference between the expected and actual result is below a certain amount (called epsilon), which is 0.01 in this case.)
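The same three checks can be written directly against the C function, which makes the epsilon comparison explicit. In this sketch foo and TESTVAL are copied from mylib.c and mylib.h as given earlier; nearly_equal is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define TESTVAL 4

/* foo() as defined in mylib.c. */
double foo(int a, long b, const char *c)
{
    return (a + b + atof(c) + TESTVAL);
}

/* True when |actual - expected| <= epsilon; epsilon must be non-negative. */
int nearly_equal(double actual, double expected, double epsilon)
{
    double diff = actual - expected;

    assert(epsilon >= 0.0);
    if (diff < 0.0)
        diff = -diff;
    return diff <= epsilon;
}
```

The first two tests can use exact equality because atof() yields 0.0 for both "Hello, world!" and "0.0", so the sum is the exactly representable value 7. The third involves -3.4, which has no exact binary representation, so it is compared with nearly_equal(foo(0, 0, "-3.4"), 0.6, 0.01) instead.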

Run "make test" and all should be well. There are some warnings on missing tests for the Mytest2::mylib extension, but you can ignore them.

What has happened here?

Unlike previous examples, we've now run h2xs on a real include file. This has caused some extra goodies to appear in both the .pm and .xs files.

  • In the .xs file, there's now a #include directive with the absolute path to the mylib.h header file. We changed this to a relative path so that we could move the extension directory if we wanted to.

  • There's now some new C code that's been added to the .xs file. The purpose of the constant routine is to make the values that are #define'd in the header file accessible by the Perl script (by calling either TESTVAL or &Mytest2::TESTVAL). There's also some XS code to allow calls to the constant routine.

  • The .pm file originally exported the name TESTVAL in the @EXPORT array. This could lead to name clashes. A good rule of thumb is that if the #define is only going to be used by the C routines themselves, and not by the user, they should be removed from the @EXPORT array. Alternately, if you don't mind using the "fully qualified name" of a variable, you could move most or all of the items from the @EXPORT array into the @EXPORT_OK array.

  • If our include file had contained #include directives, these would not have been processed by h2xs. There is no good solution to this right now.

  • We've also told Perl about the library that we built in the mylib subdirectory. That required only the addition of the MYEXTLIB variable to the WriteMakefile call and the replacement of the postamble subroutine to cd into the subdirectory and run make. The Makefile.PL for the library is a bit more complicated, but not excessively so. There we replaced the top_targets subroutine to insert our own code. This code simply specified that the library to be created here was a static archive library (as opposed to a dynamically loadable library) and provided the commands to build it.

Anatomy of .xs file

The .xs file of EXAMPLE 4 contained some new elements. To understand the meaning of these elements, pay attention to the line which reads

    MODULE = Mytest2  PACKAGE = Mytest2

Anything before this line is plain C code which describes which headers to include, and defines some convenience functions. No translations are performed on this part: apart from having embedded POD documentation skipped over (see perlpod), it goes into the generated output C file as is.

Anything after this line is the description of XSUB functions. These descriptions are translated by xsubpp into C code which implements these functions using Perl calling conventions, and which makes these functions visible from the Perl interpreter.

Pay special attention to the function constant. This name appears twice in the generated .xs file: once in the first part, as a static C function, then another time in the second part, when an XSUB interface to this static C function is defined.

This is quite typical for .xs files: usually the .xs file provides an interface to an existing C function. Then this C function is defined somewhere (either in an external library, or in the first part of .xs file), and a Perl interface to this function (i.e. "Perl glue") is described in the second part of .xs file. The situation in EXAMPLE 1, EXAMPLE 2, and EXAMPLE 3, when all the work is done inside the "Perl glue", is somewhat of an exception rather than the rule.

Getting the fat out of XSUBs

In EXAMPLE 4 the second part of .xs file contained the following description of an XSUB:

    double
    foo(a,b,c)
        int             a
        long            b
        const char *    c
      OUTPUT:
        RETVAL

Note that in contrast with EXAMPLE 1, EXAMPLE 2 and EXAMPLE 3, this description does not contain the actual code for what is done during a call to Perl function foo(). To understand what is going on here, one can add a CODE section to this XSUB:

    double
    foo(a,b,c)
        int             a
        long            b
        const char *    c
      CODE:
        RETVAL = foo(a,b,c);
      OUTPUT:
        RETVAL

However, these two XSUBs produce almost identical generated C code: the xsubpp compiler is smart enough to figure out the CODE: section from the first two lines of the description of the XSUB. What about the OUTPUT: section? In fact, it is exactly the same. The OUTPUT: section can be removed as well, as long as no CODE: or PPCODE: section is specified: xsubpp can see that it needs to generate a function call section, and will autogenerate the OUTPUT: section too. Thus one can shorten the XSUB to:

    double
    foo(a,b,c)
        int             a
        long            b
        const char *    c

Can we do the same with an XSUB

    int
    is_even(input)
        int     input
      CODE:
        RETVAL = (input % 2 == 0);
      OUTPUT:
        RETVAL

of EXAMPLE 2? To do this, one needs to define a C function int is_even(int input). As we saw in Anatomy of .xs file, the proper place for this definition is in the first part of the .xs file. In fact a C function

    int
    is_even(int arg)
    {
        return (arg % 2 == 0);
    }

is probably overkill for this. Something as simple as a #define will do too:

    #define is_even(arg)    ((arg) % 2 == 0)

After having this in the first part of .xs file, the "Perl glue" part becomes as simple as

    int
    is_even(input)
        int     input

This technique of separating the glue part from the workhorse part has an obvious tradeoff: if you want to change a Perl interface, you need to change two places in your code. However, it removes a lot of clutter, and makes the workhorse part independent of the idiosyncrasies of the Perl calling convention. (In fact, there is nothing Perl-specific in the above description; a different version of xsubpp might have translated this to Tcl glue or Python glue as well.)
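
Because the workhorse part is ordinary C, it can be compiled and checked without any Perl headers. The sketch below shows both forms discussed above side by side. The name is_even_fn is hypothetical and is used only so that the function and the macro can coexist in one file; in a real .xs file you would pick one form and call it is_even.

```c
/* Workhorse part: plain C, nothing Perl-specific. */

/* Function form (hypothetical name, so it does not clash
 * with the macro below). */
int is_even_fn(int arg)
{
    return (arg % 2 == 0);
}

/* Macro form. The argument is parenthesized so that an
 * expression such as is_even(n + 1) expands correctly. */
#define is_even(arg) ((arg) % 2 == 0)
```

Either form lets the glue part shrink to the three-line XSUB, since xsubpp only needs something callable as is_even(input).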

More about XSUB arguments

With the completion of Example 4, we now have an easy way to simulate some real-life libraries whose interfaces may not be the cleanest in the world. We shall now continue with a discussion of the arguments passed to the xsubpp compiler.

When you specify arguments to routines in the .xs file, you are really passing three pieces of information for each argument listed. The first piece is the order of that argument relative to the others (first, second, etc). The second is the type of argument, and consists of the type declaration of the argument (e.g., int, char*, etc). The third piece is the calling convention for the argument in the call to the library function.

While Perl passes arguments to functions by reference, C passes arguments by value; to implement a C function which modifies data of one of the "arguments", the actual argument of this C function would be a pointer to the data. Thus two C functions with declarations

    int string_length(char *s);
    int upper_case_char(char *cp);

may have completely different semantics: the first one may inspect an array of chars pointed to by s, and the second one may immediately dereference cp and manipulate *cp only (using the return value as, say, a success indicator). From Perl one would use these functions in a completely different manner.

One conveys this information to xsubpp by replacing the * before the argument with &. The & means that the argument should be passed to a library function by its address. The above two functions may be XSUB-ified as

    int
    string_length(s)
        char *  s

    int
    upper_case_char(cp)
        char    &cp

For example, consider:

    int
    foo(a,b)
        char    &a
        char *  b

The first Perl argument to this function would be treated as a char and assigned to the variable a, and its address would be passed into the function foo. The second Perl argument would be treated as a string pointer and assigned to the variable b. The value of b would be passed into the function foo. The actual call to the function foo that xsubpp generates would look like this:

    foo(&a, b);

xsubpp will parse the following function argument lists identically:

    char    &a
    char&a
    char    & a

However, to help ease understanding, it is suggested that you place a "&" next to the variable name and away from the variable type, and place a "*" near the variable type, but away from the variable name (as in the declaration of foo above). By doing so, it is easy to understand exactly what will be passed to the C function; it will be whatever is in the "last column".

You should take great pains to try to pass the function the type of variable it wants, when possible. It will save you a lot of trouble in the long run.
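
To make the difference between the two calling conventions concrete, here is a small plain-C sketch. The function bodies are hypothetical (the text above only gives the declarations), and call_by_address is an invented helper that imitates what a "char &cp" declaration arranges: the value lives in a local variable and its address is handed to the library function.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical body: inspects the whole array of chars
 * that s points to. */
int string_length(char *s)
{
    return (int)strlen(s);
}

/* Hypothetical body: dereferences cp and changes only *cp.
 * Returns 1 if the character was changed, 0 otherwise. */
int upper_case_char(char *cp)
{
    if (islower((unsigned char)*cp)) {
        *cp = (char)toupper((unsigned char)*cp);
        return 1;
    }
    return 0;
}

/* Imitates the effect of declaring the argument as "char &cp":
 * the value is held in a local char, its address is passed on,
 * and the possibly modified value is available afterwards. */
char call_by_address(char c, int *changed)
{
    char cp = c;
    *changed = upper_case_char(&cp);
    return cp;
}
```

Note that both library functions have the same C parameter type; only the XSUB declaration (char * versus char &) tells xsubpp which of the two meanings is intended.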

The Argument Stack

If you look at the C code generated by any of the examples except Example 1, you will notice a number of references to ST(n), where n is usually 0. "ST" is actually a macro that points to the n'th argument on the argument stack. ST(0) is thus the first argument on the stack and therefore the first argument passed to the XSUB, ST(1) is the second argument, and so on.

When you list the arguments to the XSUB in the .xs file, that tells xsubpp which argument corresponds to which position on the argument stack (i.e., the first one listed is the first argument, and so on). You invite disaster if you do not list them in the same order as the function expects them.

The actual values on the argument stack are pointers to the values passed in. When an argument is listed as being an OUTPUT value, its corresponding value on the stack (i.e., ST(0) if it was the first argument) is changed. You can verify this by looking at the C code generated for Example 3. The code for the round() XSUB routine contains lines that look like this:

    double  arg = (double)SvNV(ST(0));
    /* Round the contents of the variable arg */
    sv_setnv(ST(0), (double)arg);

The arg variable is initially set by taking the value from ST(0), then is stored back into ST(0) at the end of the routine.
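
The read, modify, write-back pattern can be imitated in plain C. In this sketch a double pointed to by slot stands in for the stack entry ST(0); the names round_slot and round_half_away, and the rounding rule itself, are assumptions made for illustration and are not taken from the generated code.

```c
/* Rounds half away from zero using integer truncation.
 * Assumed rule for illustration; only valid for values
 * that fit in a long. */
double round_half_away(double arg)
{
    if (arg > 0.0)
        return (double)(long)(arg + 0.5);
    if (arg < 0.0)
        return (double)(long)(arg - 0.5);
    return 0.0;
}

/* slot plays the role of ST(0). */
void round_slot(double *slot)
{
    double arg = *slot;           /* like: arg = SvNV(ST(0));    */
    arg = round_half_away(arg);   /* the work of the XSUB body   */
    *slot = arg;                  /* like: sv_setnv(ST(0), arg); */
}
```

The caller sees the change because the slot itself is updated, which is the behaviour an argument listed under OUTPUT: gets.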

XSUBs are also allowed to return lists, not just scalars. This must be done by manipulating stack values ST(0), ST(1), etc, in a subtly different way. See perlxs for details.

XSUBs are also allowed to avoid automatic conversion of Perl function arguments to C function arguments. See perlxs for details. Some people prefer manual conversion by inspecting ST(i) even in the cases when automatic conversion will do, arguing that this makes the logic of an XSUB call clearer. Compare with Getting the fat out of XSUBs for a similar tradeoff of a complete separation of "Perl glue" and "workhorse" parts of an XSUB.

While experts may argue about these idioms, a novice to Perl guts may prefer a way which is as little Perl-guts-specific as possible, meaning automatic conversion and automatic call generation, as in Getting the fat out of XSUBs. This approach has the additional benefit of protecting the XSUB writer from future changes to the Perl API.

Extending your Extension

Sometimes you might want to provide some extra methods or subroutines to assist in making the interface between Perl and your extension simpler or easier to understand. These routines should live in the .pm file. Whether they are automatically loaded when the extension itself is loaded or only loaded when called depends on where in the .pm file the subroutine definition is placed. You can also consult AutoLoader for an alternate way to store and load your extra subroutines.

Documenting your Extension

There is absolutely no excuse for not documenting your extension. Documentation belongs in the .pm file. This file will be fed to pod2man, and the embedded documentation will be converted to the manpage format, then placed in the blib directory. It will be copied to Perl's manpage directory when the extension is installed.

You may intersperse documentation and Perl code within the .pm file. In fact, if you want to use method autoloading, you must do this, as the comment inside the .pm file explains.

See perlpod for more information about the pod format.

Installing your Extension

Once your extension is complete and passes all its tests, installing it is quite simple: you simply run "make install". You will either need to have write permission into the directories where Perl is installed, or ask your system administrator to run the make for you.

Alternatively, you can specify the exact directory in which to place the extension's files by adding "PREFIX=/destination/directory" after the make install (or in between the make and install if you have a brain-dead version of make). This can be very useful if you are building an extension that will eventually be distributed to multiple systems. You can then just archive the files in the destination directory and distribute them to your destination systems.

EXAMPLE 5

In this example, we'll do some more work with the argument stack. The previous examples have all returned only a single value. We'll now create an extension that returns an array.

This extension is very Unix-oriented (struct statfs and the statfs system call). If you are not running on a Unix system, you can substitute for statfs any other function that returns multiple values, you can hard-code values to be returned to the caller (although this will make the error case a bit harder to test), or you can simply not do this example. If you change the XSUB, be sure to fix the test cases to match the changes.

Return to the Mytest directory and add the following code to the end of Mytest.xs:

    void
    statfs(path)
        char *  path
      INIT:
        int i;
        struct statfs buf;

      PPCODE:
        i = statfs(path, &buf);
        if (i == 0) {
            XPUSHs(sv_2mortal(newSVnv(buf.f_bavail)));
            XPUSHs(sv_2mortal(newSVnv(buf.f_bfree)));
            XPUSHs(sv_2mortal(newSVnv(buf.f_blocks)));
            XPUSHs(sv_2mortal(newSVnv(buf.f_bsize)));
            XPUSHs(sv_2mortal(newSVnv(buf.f_ffree)));
            XPUSHs(sv_2mortal(newSVnv(buf.f_files)));
            XPUSHs(sv_2mortal(newSVnv(buf.f_type)));
        } else {
            XPUSHs(sv_2mortal(newSVnv(errno)));
        }

You'll also need to add the following code to the top of the .xs file, just after the include of "XSUB.h":

    #include <sys/vfs.h>

Also add the following code segment to Mytest.t while incrementing the "9" tests to "11":

    @a = &Mytest::statfs("/blech");
    ok( scalar(@a) == 1 && $a[0] == 2 );
    @a = &Mytest::statfs("/");
    is( scalar(@a), 7 );
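
If you want to see what the XSUB has to work with before involving Perl, the following plain-C sketch calls statfs directly and collects the same fields in the same order. It assumes a Linux-style sys/vfs.h; the function name statfs_values and the out array are hypothetical stand-ins for the values the XSUB pushes onto the stack.

```c
#include <errno.h>
#include <sys/vfs.h>

/* Fills out[] in the order the XSUB pushes its values and
 * returns how many were written: 7 on success, or 1 on
 * failure with errno stored in out[0]. */
int statfs_values(const char *path, double out[7])
{
    struct statfs buf;

    if (statfs(path, &buf) != 0) {
        out[0] = (double)errno;
        return 1;
    }
    out[0] = (double)buf.f_bavail;
    out[1] = (double)buf.f_bfree;
    out[2] = (double)buf.f_blocks;
    out[3] = (double)buf.f_bsize;
    out[4] = (double)buf.f_ffree;
    out[5] = (double)buf.f_files;
    out[6] = (double)buf.f_type;
    return 7;
}
```

On a system where /blech does not exist, the failure case yields the single value ENOENT, which is the 2 that the first test above compares against.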

New Things in this Example

This example added quite a few new concepts. We'll take them one at a time.

  • The INIT: directive contains code that will be placed immediately after the argument stack is decoded. C does not allow variable declarations at arbitrary locations inside a function, so this is usually the best way to declare local variables needed by the XSUB. (Alternatively, one could put the whole PPCODE: section into braces, and put these declarations on top.)

  • This routine also returns a different number of arguments depending on the success or failure of the call to statfs. If there is an error, the error number is returned as a single-element array. If the call is successful, then a 7-element array is returned. Since only one argument is passed into this function, we need room on the stack to hold the 7 values which may be returned.

    We do this by using the PPCODE: directive, rather than the CODE: directive. This tells xsubpp that we will be managing the return values that will be put on the argument stack by ourselves.

  • When we want to place values to be returned to the caller onto the stack, we use the series of macros that begin with "XPUSH". There are five different versions, for placing integers, unsigned integers, doubles, strings, and Perl scalars on the stack. In our example, we placed a Perl scalar onto the stack. (In fact this is the only macro which can be used to return multiple values.)

    The XPUSH* macros will automatically extend the return stack to prevent it from being overrun. You push values onto the stack in the order you want them seen by the calling program.

  • The values pushed onto the return stack of the XSUB are actually mortal SV's. They are made mortal so that once the values are copied by the calling program, the SV's that held the returned values can be deallocated. If they were not mortal, then they would continue to exist after the XSUB routine returned, but would not be accessible. This is a memory leak.

  • If we were interested in performance, not in code compactness, in the success branch we would not use XPUSHs macros, but PUSHs macros, and would pre-extend the stack before pushing the return values:

        EXTEND(SP, 7);

    The tradeoff is that one needs to calculate the number of return values in advance (though overextending the stack will not typically hurt anything but memory consumption).

    Similarly, in the failure branch we could use PUSHs without extending the stack: the Perl function reference comes to an XSUB on the stack, thus the stack is always large enough to take one return value.
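
The tradeoff between checking on every push and extending once up front is not specific to Perl. The following sketch models it with a small growable array in plain C. The vstack type and its functions are invented for this illustration and are not part of the Perl API; they only mirror the roles of EXTEND, PUSHs and XPUSHs.

```c
#include <stdlib.h>

/* Hypothetical growable stack of doubles. */
struct vstack {
    double *data;
    size_t  len;
    size_t  cap;
};

/* Plays the role of EXTEND: make room for at least n more
 * values. Returns 0 on success, -1 if memory ran out. */
int vstack_extend(struct vstack *s, size_t n)
{
    if (s->len + n > s->cap) {
        size_t  ncap = s->len + n;
        double *p = realloc(s->data, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        s->data = p;
        s->cap  = ncap;
    }
    return 0;
}

/* Plays the role of PUSHs: no check, the caller must have
 * made room already. */
void vstack_push(struct vstack *s, double v)
{
    s->data[s->len++] = v;
}

/* Plays the role of XPUSHs: checks (and grows) on every push. */
int vstack_xpush(struct vstack *s, double v)
{
    if (vstack_extend(s, 1) != 0)
        return -1;
    vstack_push(s, v);
    return 0;
}
```

Calling vstack_extend once with the final count and then using vstack_push avoids a capacity check per value, at the price of having to know the count in advance.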

EXAMPLE 6

In this example, we will accept a reference to an array as an input parameter, and return a reference to an array of hashes. This will demonstrate manipulation of complex Perl data types from an XSUB.

This extension is somewhat contrived. It is based on the code in the previous example. It calls the statfs function multiple times, accepting a reference to an array of filenames as input, and returning a reference to an array of hashes containing the data for each of the filesystems.

Return to the Mytest directory and add the following code to the end of Mytest.xs:

    SV *
    multi_statfs(paths)
        SV * paths
      INIT:
        AV * results;
        I32 numpaths = 0;
        int i, n;
        struct statfs buf;

        SvGETMAGIC(paths);
        if ((!SvROK(paths))
            || (SvTYPE(SvRV(paths)) != SVt_PVAV)
            || ((numpaths = av_top_index((AV *)SvRV(paths))) < 0))
        {
            XSRETURN_UNDEF;
        }
        results = (AV *)sv_2mortal((SV *)newAV());
      CODE:
        for (n = 0; n <= numpaths; n++) {
            HV * rh;
            STRLEN l;
            char * fn = SvPV(*av_fetch((AV *)SvRV(paths), n, 0), l);

            i = statfs(fn, &buf);
            if (i != 0) {
                av_push(results, newSVnv(errno));
                continue;
            }

            rh = (HV *)sv_2mortal((SV *)newHV());

            hv_store(rh, "f_bavail", 8, newSVnv(buf.f_bavail), 0);
            hv_store(rh, "f_bfree",  7, newSVnv(buf.f_bfree),  0);
            hv_store(rh, "f_blocks", 8, newSVnv(buf.f_blocks), 0);
            hv_store(rh, "f_bsize",  7, newSVnv(buf.f_bsize),  0);
            hv_store(rh, "f_ffree",  7, newSVnv(buf.f_ffree),  0);
            hv_store(rh, "f_files",  7, newSVnv(buf.f_files),  0);
            hv_store(rh, "f_type",   6, newSVnv(buf.f_type),   0);

            av_push(results, newRV((SV *)rh));
        }
        RETVAL = newRV((SV *)results);
      OUTPUT:
        RETVAL

And add the following code to Mytest.t, while incrementing the "11" tests to "13":

    $results = Mytest::multi_statfs([ '/', '/blech' ]);
    ok( ref $results->[0] );
    ok( ! ref $results->[1] );

New Things in this Example

There are a number of new concepts introduced here, described below:

  • This function does not use a typemap. Instead, we declare it as accepting one SV* (scalar) parameter, and returning an SV* value, and we take care of populating these scalars within the code. Because we are only returning one value, we don't need a PPCODE: directive - instead, we use CODE: and OUTPUT: directives.

  • When dealing with references, it is important to handle them with caution. The INIT: block first calls SvGETMAGIC(paths), in case paths is a tied variable. Then it checks that SvROK returns true, which indicates that paths is a valid reference. (Simply checking SvROK won't trigger FETCH on a tied variable.) It then verifies that the object referenced by paths is an array, using SvRV to dereference paths, and SvTYPE to discover its type. As an added test, it checks that the array referenced by paths is non-empty, using the av_top_index function (which returns -1 if the array is empty). The XSRETURN_UNDEF macro is used to abort the XSUB and return the undefined value unless all three of these conditions are met.

  • We manipulate several arrays in this XSUB. Note that an array is represented internally by an AV* pointer. The functions and macros for manipulating arrays are similar to the functions in Perl: av_top_index returns the highest index in an AV*, much like $#array; av_fetch fetches a single scalar value from an array, given its index; av_push pushes a scalar value onto the end of the array, automatically extending the array as necessary.

    Specifically, we read pathnames one at a time from the input array, and store the results in an output array (results) in the same order. If statfs fails, the element pushed onto the return array is the value of errno after the failure. If statfs succeeds, though, the value pushed onto the return array is a reference to a hash containing some of the information in the statfs structure.

    As with the return stack, it would be possible (and a small performance win) to pre-extend the return array before pushing data into it, since we know how many elements we will return:

        av_extend(results, numpaths);

  • We are performing only one hash operation in this function, which is storing a new scalar under a key using hv_store . A hash is represented by an HV* pointer. Like arrays, the functions for manipulating hashes from an XSUB mirror the functionality available from Perl. See perlguts and perlapi for details.

  • To create a reference, we use the newRV function. Note that you can cast an AV* or an HV* to type SV* in this case (and many others). This allows you to take references to arrays, hashes and scalars with the same function. Conversely, the SvRV function always returns an SV*, which may need to be cast to the appropriate type if it is something other than a scalar (check with SvTYPE ).

  • At this point, xsubpp is doing very little work - the differences between Mytest.xs and Mytest.c are minimal.

EXAMPLE 7 (Coming Soon)

XPUSH args AND set RETVAL AND assign return value to array

EXAMPLE 8 (Coming Soon)

Setting $!

EXAMPLE 9 Passing open files to XSes

You would think passing files to an XS is difficult, with all the typeglobs and stuff. Well, it isn't.

Suppose that for some strange reason we need a wrapper around the standard C library function fputs() . This is all we need:

    #define PERLIO_NOT_STDIO 0
    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    #include <stdio.h>

    int
    fputs(s, stream)
        char *  s
        FILE *  stream

The real work is done in the standard typemap.

But you lose all the fine stuff done by the perlio layers. This calls the stdio function fputs() , which knows nothing about them.

The standard typemap offers three variants of PerlIO *: InputStream (T_IN), InOutStream (T_INOUT) and OutputStream (T_OUT). A bare PerlIO * is considered a T_INOUT. If it matters in your code (see below for why it might) #define or typedef one of the specific names and use that as the argument or result type in your XS file.

The standard typemap does not contain PerlIO * before perl 5.7, but it has the three stream variants. Using a PerlIO * directly is not backwards compatible unless you provide your own typemap.

For streams coming from perl the main difference is that OutputStream will get the output PerlIO * - which may make a difference on a socket. Like in our example...

For streams being handed to perl a new file handle is created (i.e. a reference to a new glob) and associated with the PerlIO * provided. If the read/write state of the PerlIO * is not correct then you may get errors or warnings when the file handle is used. So if you opened the PerlIO * as "w" it should really be an OutputStream; if it was opened as "r" it should be an InputStream.

Now, suppose you want to use perlio layers in your XS. We'll use the perlio PerlIO_puts() function as an example.

In the C part of the XS file (above the first MODULE line) you have

    #define OutputStream    PerlIO *

or

    typedef PerlIO *        OutputStream;

And this is the XS code:

    int
    perlioputs(s, stream)
        char *          s
        OutputStream    stream
      CODE:
        RETVAL = PerlIO_puts(stream, s);
      OUTPUT:
        RETVAL

We have to use a CODE section because PerlIO_puts() has the arguments reversed compared to fputs() , and we want to keep the arguments the same.
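
The same adapter idea can be shown in plain C. In this sketch stream_puts is a hypothetical routine standing in for PerlIO_puts(): it takes the stream first. The wrapper wrapped_puts (also a made-up name) keeps the fputs()-style order for its callers, which is the job the CODE: section does in the XSUB above.

```c
#include <stdio.h>

/* Hypothetical library routine whose parameters are in the
 * opposite order from fputs(); it stands in for PerlIO_puts(). */
int stream_puts(FILE *stream, const char *s)
{
    return fputs(s, stream);
}

/* Wrapper that presents the fputs()-style order (string first,
 * stream second) and swaps the arguments for the real call. */
int wrapped_puts(const char *s, FILE *stream)
{
    return stream_puts(stream, s);
}
```

Without an explicit body, xsubpp would generate a call with the arguments in the order they are listed, so a mismatch like this always needs a hand-written call.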

Wanting to explore this thoroughly, we want to use the stdio fputs() on a PerlIO *. This means we have to ask the perlio system for a stdio FILE * :

    int
    perliofputs(s, stream)
        char *          s
        OutputStream    stream
      PREINIT:
        FILE *fp = PerlIO_findFILE(stream);
      CODE:
        if (fp != (FILE*) 0) {
            RETVAL = fputs(s, fp);
        } else {
            RETVAL = -1;
        }
      OUTPUT:
        RETVAL

Note: PerlIO_findFILE() will search the layers for a stdio layer. If it can't find one, it will call PerlIO_exportFILE() to generate a new stdio FILE . Please only call PerlIO_exportFILE() if you want a new FILE . It will generate one on each call and push a new stdio layer. So don't call it repeatedly on the same file. PerlIO_findFILE() will retrieve the stdio layer once it has been generated by PerlIO_exportFILE() .

This applies to the perlio system only. For versions before 5.7, PerlIO_exportFILE() is equivalent to PerlIO_findFILE() .

Troubleshooting these Examples

As mentioned at the top of this document, if you are having problems with these example extensions, you might see if any of these help you.

  • In versions of 5.002 prior to the gamma version, the test script in Example 1 will not function properly. You need to change the "use lib" line to read:

        use lib './blib';

  • In versions of 5.002 prior to version 5.002b1h, the test.pl file was not automatically created by h2xs. This means that you cannot say "make test" to run the test script. You will need to add the following line before the "use extension" statement:

        use lib './blib';

  • In versions 5.000 and 5.001, instead of using the above line, you will need to use the following line:

        BEGIN { unshift(@INC, "./blib") }

  • This document assumes that the executable named "perl" is Perl version 5. Some systems may have installed Perl version 5 as "perl5".

See also

For more information, consult perlguts, perlapi, perlxs, perlmod, and perlpod.

Author

Jeff Okamoto <okamoto@corp.hp.com>

Reviewed and assisted by Dean Roehrich, Ilya Zakharevich, Andreas Koenig, and Tim Bunce.

PerlIO material contributed by Lupe Christoph, with some clarification by Nick Ing-Simmons.

Changes for h2xs as of Perl 5.8.x by Renee Baecker

Last Changed

2012-01-20

 
perlxstypemap

Perl 5 version 18.2 documentation

NAME

perlxstypemap - Perl XS C/Perl type mapping

DESCRIPTION

The more you think about interfacing between two languages, the more you'll realize that the majority of programmer effort has to go into converting between the data structures that are native to either of the languages involved. This trumps other matters such as differing calling conventions because the problem space is so much greater. There are simply more ways to shove data into memory than there are ways to implement a function call.

Perl XS' attempt at a solution to this is the concept of typemaps. At an abstract level, a Perl XS typemap is nothing but a recipe for converting from a certain Perl data structure to a certain C data structure and vice versa. Since there can be C types that are sufficiently similar to warrant converting with the same logic, XS typemaps are represented by a unique identifier, henceforth called an XS type in this document. You can then tell the XS compiler that multiple C types are to be mapped with the same XS typemap.

In your XS code, when you define an argument with a C type or when you are using a CODE: and an OUTPUT: section together with a C return type of your XSUB, it'll be the typemapping mechanism that makes this easy.

Anatomy of a typemap

In more practical terms, the typemap is a collection of code fragments which are used by the xsubpp compiler to map C function parameters and values to Perl values. The typemap file may consist of three sections labelled TYPEMAP , INPUT , and OUTPUT . An unlabelled initial section is assumed to be a TYPEMAP section. The INPUT section tells the compiler how to translate Perl values into variables of certain C types. The OUTPUT section tells the compiler how to translate the values from certain C types into values Perl can understand. The TYPEMAP section tells the compiler which of the INPUT and OUTPUT code fragments should be used to map a given C type to a Perl value. The section labels TYPEMAP , INPUT , or OUTPUT must begin in the first column on a line by themselves, and must be in uppercase.

Each type of section can appear an arbitrary number of times and does not have to appear at all. For example, a typemap may commonly lack INPUT and OUTPUT sections if all it needs to do is associate additional C types with core XS types like T_PTROBJ. Lines that start with a hash # are considered comments and ignored in the TYPEMAP section, but are considered significant in INPUT and OUTPUT . Blank lines are generally ignored.

Traditionally, typemaps needed to be written to a separate file, conventionally called typemap in a CPAN distribution. With ExtUtils::ParseXS (the XS compiler) version 3.12 or better which comes with perl 5.16, typemaps can also be embedded directly into XS code using a HERE-doc like syntax:

    TYPEMAP: <<HERE
    ...
    HERE

where HERE can be replaced by other identifiers, as with normal Perl HERE-docs. All details below about the typemap textual format remain valid.

The TYPEMAP section should contain one pair of C type and XS type per line as follows. An example from the core typemap file:

    TYPEMAP
    # all variants of char* are handled by the T_PV typemap
    char *          T_PV
    const char *    T_PV
    unsigned char * T_PV
    ...

The INPUT and OUTPUT sections have identical formats, that is, each unindented line starts a new in- or output map respectively. A new in- or output map must start with the name of the XS type to map on a line by itself, followed by the code that implements it indented on the following lines. Example:

    INPUT
    T_PV
        $var = ($type)SvPV_nolen($arg)
    T_PTR
        $var = INT2PTR($type,SvIV($arg))

We'll get to the meaning of those Perlish-looking variables in a little bit.

Finally, here's an example of the full typemap file for mapping C strings of the char * type to Perl scalars/strings:

    TYPEMAP
    char *  T_PV

    INPUT
    T_PV
        $var = ($type)SvPV_nolen($arg)

    OUTPUT
    T_PV
        sv_setpv((SV*)$arg, $var);

Here's a more complicated example: suppose that you wanted struct netconfig to be blessed into the class Net::Config . One way to do this is to use underscores (_) to separate package names, as follows:

    typedef struct netconfig * Net_Config;

And then provide a typemap entry T_PTROBJ_SPECIAL that maps underscores to double-colons (::), and declare Net_Config to be of that type:

    TYPEMAP
    Net_Config      T_PTROBJ_SPECIAL

    INPUT
    T_PTROBJ_SPECIAL
        if (sv_derived_from($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")){
            IV tmp = SvIV((SV*)SvRV($arg));
            $var = INT2PTR($type, tmp);
        }
        else
            croak(\"$var is not of type ${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")

    OUTPUT
    T_PTROBJ_SPECIAL
        sv_setref_pv($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\",
                     (void*)$var);

The INPUT and OUTPUT sections replace underscores with double-colons on the fly, giving the desired effect. This example demonstrates some of the power and versatility of the typemap facility.

The INT2PTR macro (defined in perl.h) casts an integer to a pointer of a given type, taking care of the possible different size of integers and pointers. There are also PTR2IV , PTR2UV , PTR2NV macros, to map the other way, which may be useful in OUTPUT sections.
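
The round trip that these macros perform can be sketched in standard C without perl.h. The example below uses intptr_t from stdint.h as an analogy only: the real macros work with Perl's IV, UV and NV types and take care of size mismatches for you. The struct netconfig_stub type and both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for a library structure. */
struct netconfig_stub {
    int id;
};

/* Pointer to integer, the direction an OUTPUT map needs
 * (compare PTR2IV). */
intptr_t ptr_to_int(struct netconfig_stub *p)
{
    return (intptr_t)p;
}

/* Integer back to a typed pointer, the direction an INPUT
 * map needs (compare INT2PTR(type, value)). */
struct netconfig_stub *int_to_ptr(intptr_t iv)
{
    return (struct netconfig_stub *)iv;
}
```

The integer is only a transport format: it is meaningful as long as the object it was taken from is still alive, which is why typemaps such as T_PTROBJ pair it with a class check on input.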

The Role of the typemap File in Your Distribution

The default typemap in the lib/ExtUtils directory of the Perl source contains many useful types which can be used by Perl extensions. Some extensions define additional typemaps which they keep in their own directory. These additional typemaps may reference INPUT and OUTPUT maps in the main typemap. The xsubpp compiler will allow the extension's own typemap to override any mappings which are in the default typemap. Instead of using an additional typemap file, typemaps may be embedded verbatim in XS with a heredoc-like syntax. See the documentation on the TYPEMAP: XS keyword.

For CPAN distributions, you can assume that the XS types defined by the perl core are already available. Additionally, the core typemap has default XS types for a large number of C types. For example, if you simply return a char * from your XSUB, the core typemap will have this C type associated with the T_PV XS type. That means your C string will be copied into the PV (pointer value) slot of a new scalar that will be returned from your XSUB to Perl.

If you're developing a CPAN distribution using XS, you may add your own file called typemap to the distribution. That file may contain typemaps that either map types that are specific to your code or that override the core typemap file's mappings for common C types.

Sharing typemaps Between CPAN Distributions

Starting with ExtUtils::ParseXS version 3.13_01 (which comes with perl 5.16 and better), it is rather easy to share typemap code between multiple CPAN distributions. The general idea is to share it as a module that offers a certain API, and to have the dependent modules declare that module as a build-time requirement and import the typemap into the XS. An example of such a typemap-sharing module on CPAN is ExtUtils::Typemaps::Basic. There are two steps to making that module's typemaps available in your code:

  • Declare ExtUtils::Typemaps::Basic as a build-time dependency in Makefile.PL (use BUILD_REQUIRES ), or in your Build.PL (use build_requires ).

  • Include the following line in the XS section of your XS file: (don't break the line)

        INCLUDE_COMMAND: $^X -MExtUtils::Typemaps::Cmd -e "print embeddable_typemap(q{Basic})"

Writing typemap Entries

Each INPUT or OUTPUT typemap entry is a double-quoted Perl string that will be evaluated in the presence of certain variables to get the final C code for mapping a certain C type.

This means that you can embed Perl code in your typemap (C) code using constructs such as ${ perl code that evaluates to scalar reference here } . A common use case is to generate error messages that refer to the true function name even when using the ALIAS XS feature:

    ${ $ALIAS ? \q[GvNAME(CvGV(cv))] : \qq[\"$pname\"] }

For many typemap examples, refer to the core typemap file that can be found in the perl source tree at lib/ExtUtils/typemap.

The Perl variables that are available for interpolation into typemaps are the following:

  • $var - the name of the input or output variable, eg. RETVAL for return values.

  • $type - the raw C type of the parameter, any : replaced with _.

  • $ntype - the supplied type with * replaced with Ptr. e.g. for a type of Foo::Bar, $ntype is Foo::Bar, and for a type of Foo *, $ntype is FooPtr

  • $arg - the stack entry that the parameter is input from or output to, e.g. ST(0)

  • $argoff - the argument stack offset of the argument, i.e. 0 for the first argument, etc.

  • $pname - the full name of the XSUB, including the PACKAGE name, with any PREFIX stripped. This is the non-ALIAS name.

  • $Package - the package specified by the most recent PACKAGE keyword.

  • $ALIAS - non-zero if the current XSUB has any aliases declared with ALIAS.
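
The $ntype normalization described in the list above can be modelled outside of xsubpp. The following plain-C sketch is illustrative only: the helper name normalize_ctype is hypothetical, and it assumes that any spaces directly before a * are dropped along with the star, so that foo_t * becomes foo_tPtr.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of ExtUtils::ParseXS): copy `type` into
 * `out`, replacing every '*' and any spaces just before it with "Ptr".
 * Returns 0 on success, -1 if `out` is too small. */
static int
normalize_ctype(const char *type, char *out, size_t outlen)
{
    size_t n = 0;

    if (outlen == 0)
        return -1;
    for (const char *p = type; *p != '\0'; p++) {
        if (*p == '*') {
            /* drop spaces that were copied just before the star */
            while (n > 0 && out[n - 1] == ' ')
                n--;
            if (n + 3 >= outlen)
                return -1;
            memcpy(out + n, "Ptr", 3);
            n += 3;
        } else {
            if (n + 1 >= outlen)
                return -1;
            out[n++] = *p;
        }
    }
    out[n] = '\0';
    return 0;
}
```

The same normalized name is what typemaps such as T_PACKED append to XS_pack_ and XS_unpack_ when they look for user-supplied conversion functions.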

Full Listing of Core Typemaps

Each C type is represented by an entry in the typemap file that is responsible for converting perl variables (SV, AV, HV, CV, etc.) to and from that type. The following sections list all XS types that come with perl by default.

  • T_SV

    This simply passes the C representation of the Perl variable (an SV*) in and out of the XS layer. This can be used if the C code wants to deal directly with the Perl variable.

  • T_SVREF

    Used to pass in and return a reference to an SV.

    Note that this typemap does not decrement the reference count when returning the reference to an SV*. See also: T_SVREF_REFCOUNT_FIXED

  • T_SVREF_REFCOUNT_FIXED

    Used to pass in and return a reference to an SV. This is a fixed variant of T_SVREF that decrements the refcount appropriately when returning a reference to an SV*. Introduced in perl 5.15.4.

  • T_AVREF

    From the perl level this is a reference to a perl array. From the C level this is a pointer to an AV.

    Note that this typemap does not decrement the reference count when returning an AV*. See also: T_AVREF_REFCOUNT_FIXED

  • T_AVREF_REFCOUNT_FIXED

    From the perl level this is a reference to a perl array. From the C level this is a pointer to an AV. This is a fixed variant of T_AVREF that decrements the refcount appropriately when returning an AV*. Introduced in perl 5.15.4.

  • T_HVREF

    From the perl level this is a reference to a perl hash. From the C level this is a pointer to an HV.

    Note that this typemap does not decrement the reference count when returning an HV*. See also: T_HVREF_REFCOUNT_FIXED

  • T_HVREF_REFCOUNT_FIXED

    From the perl level this is a reference to a perl hash. From the C level this is a pointer to an HV. This is a fixed variant of T_HVREF that decrements the refcount appropriately when returning an HV*. Introduced in perl 5.15.4.

  • T_CVREF

    From the perl level this is a reference to a perl subroutine (e.g. $sub = sub { 1 };). From the C level this is a pointer to a CV.

    Note that this typemap does not decrement the reference count when returning a CV*. See also: T_CVREF_REFCOUNT_FIXED

  • T_CVREF_REFCOUNT_FIXED

    From the perl level this is a reference to a perl subroutine (e.g. $sub = sub { 1 };). From the C level this is a pointer to a CV.

    This is a fixed variant of T_CVREF that decrements the refcount appropriately when returning a CV*. Introduced in perl 5.15.4.

  • T_SYSRET

    The T_SYSRET typemap is used to process return values from system calls. It is only meaningful when passing values from C to perl (there is no concept of passing a system return value from Perl to C).

    System calls return -1 on error (setting errno with the reason) and (usually) 0 on success. If the return value is -1 this typemap returns undef. If the return value is not -1, this typemap translates a 0 (perl false) to "0 but true" (which is perl true) or returns the value itself, to indicate that the command succeeded.

    The POSIX module makes extensive use of this type.
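
The T_SYSRET rule can be summarized with a small plain-C model. This is a sketch under stated assumptions, not the typemap's real code (which operates on SVs): the function name sysret_as_perl is hypothetical, a NULL return stands in for undef, and the returned string stands in for the value Perl would see.

```c
#include <stdio.h>

/* Plain-C model of the T_SYSRET output rule. Returns NULL to stand for
 * undef, otherwise a string holding the value Perl would receive. */
static const char *
sysret_as_perl(int rv, char *buf, size_t buflen)
{
    if (rv == -1)
        return NULL;             /* error: Perl gets undef */
    if (rv == 0)
        return "0 but true";     /* success, and true in boolean context */
    snprintf(buf, buflen, "%d", rv);
    return buf;                  /* any other value is passed through */
}
```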

  • T_UV

    An unsigned integer.

  • T_IV

    A signed integer. This is cast to the required integer type when passed to C and converted to an IV when passed back to Perl.

  • T_INT

    A signed integer. This typemap converts the Perl value to a native integer type (the int type on the current platform). When returning the value to perl it is processed in the same way as for T_IV.

    Its behaviour is identical to using an int type in XS with T_IV.

  • T_ENUM

    An enum value. Used to transfer an enum component from C. There is no reason to pass an enum value to C since it is stored as an IV inside perl.

  • T_BOOL

    A boolean type. This can be used to pass true and false values to and from C.

  • T_U_INT

    This is for unsigned integers. It is equivalent to using T_UV but explicitly casts the variable to type unsigned int. The default type for unsigned int is T_UV.

  • T_SHORT

    Short integers. This is equivalent to T_IV but explicitly casts the return to type short. The default typemap for short is T_IV.

  • T_U_SHORT

    Unsigned short integers. This is equivalent to T_UV but explicitly casts the return to type unsigned short. The default typemap for unsigned short is T_UV.

    T_U_SHORT is used for type U16 in the standard typemap.

  • T_LONG

    Long integers. This is equivalent to T_IV but explicitly casts the return to type long. The default typemap for long is T_IV.

  • T_U_LONG

    Unsigned long integers. This is equivalent to T_UV but explicitly casts the return to type unsigned long. The default typemap for unsigned long is T_UV.

    T_U_LONG is used for type U32 in the standard typemap.

  • T_CHAR

    Single 8-bit characters.

  • T_U_CHAR

    An unsigned byte.

  • T_FLOAT

    A floating point number. This typemap guarantees to return a variable cast to a float.

  • T_NV

    A Perl floating point number. Similar to T_IV and T_UV in that the return type is cast to the requested numeric type rather than to a specific type.

  • T_DOUBLE

    A double precision floating point number. This typemap guarantees to return a variable cast to a double.

  • T_PV

    A string (char *).

  • T_PTR

    A memory address (pointer). Typically associated with a void * type.

  • T_PTRREF

    Similar to T_PTR except that the pointer is stored in a scalar and the reference to that scalar is returned to the caller. This can be used to hide the actual pointer value from the programmer since it is usually not required directly from within perl.

    The typemap checks that a scalar reference is passed from perl to XS.

  • T_PTROBJ

    Similar to T_PTRREF except that the reference is blessed into a class. This allows the pointer to be used as an object. Most commonly used to deal with C structs. The typemap checks that the perl object passed into the XS routine is of the correct class (or part of a subclass).

    The pointer is blessed into a class that is derived from the name of the type of the pointer but with all '*' in the name replaced with 'Ptr'.

  • T_REF_IV_REF

    NOT YET

  • T_REF_IV_PTR

    Similar to T_PTROBJ in that the pointer is blessed into a scalar object. The difference is that when the object is passed back into XS it must be of the correct type (inheritance is not supported).

    The pointer is blessed into a class that is derived from the name of the type of the pointer but with all '*' in the name replaced with 'Ptr'.

  • T_PTRDESC

    NOT YET

  • T_REFREF

    Similar to T_PTRREF, except the pointer stored in the referenced scalar is dereferenced and copied to the output variable. This means that T_REFREF is to T_PTRREF as T_OPAQUE is to T_OPAQUEPTR. All clear?

    Only the INPUT part of this is implemented (Perl to XSUB) and there are no known users in core or on CPAN.

  • T_REFOBJ

    NOT YET

  • T_OPAQUEPTR

    This can be used to store bytes in the string component of the SV. Here the representation of the data is irrelevant to perl and the bytes themselves are just stored in the SV. It is assumed that the C variable is a pointer (the bytes are copied from that memory location). If the pointer is pointing to something that is represented by 8 bytes then those 8 bytes are stored in the SV (and length() will report a value of 8). This entry is similar to T_OPAQUE.

    In principle the unpack() command can be used to convert the bytes back to a number (if the underlying type is known to be a number).

    This entry can be used to store a C structure (the number of bytes to be copied is calculated using the C sizeof function) and can be used as an alternative to T_PTRREF without having to worry about a memory leak (since Perl will clean up the SV).

  • T_OPAQUE

    This can be used to store data from non-pointer types in the string part of an SV. It is similar to T_OPAQUEPTR except that the typemap retrieves the pointer directly rather than assuming it is being supplied. For example, if an integer is imported into Perl using T_OPAQUE rather than T_IV the underlying bytes representing the integer will be stored in the SV but the actual integer value will not be available. i.e. The data is opaque to perl.

    The data may be retrieved using the unpack function if the underlying type of the byte stream is known.

    T_OPAQUE supports input and output of simple types. T_OPAQUEPTR can be used to pass these bytes back into C if a pointer is acceptable.
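
What "opaque" storage means can be shown with a small plain-C sketch. It is only a model: struct byte_string is a hypothetical stand-in for the string buffer of an SV, and the two helper names are invented for illustration. The raw bytes of an int are copied in and the recorded length is sizeof(int), which is the value length() would report on the Perl side.

```c
#include <string.h>

/* Hypothetical stand-in for the string part of an SV. */
struct byte_string {
    unsigned char bytes[16];
    size_t        len;
};

/* Store the raw bytes of `value`, in the way T_OPAQUE stores simple types. */
static void
store_opaque_int(struct byte_string *s, int value)
{
    memcpy(s->bytes, &value, sizeof value);
    s->len = sizeof value;
}

/* Recover the integer; only valid if the caller knows the bytes are an int. */
static int
fetch_opaque_int(const struct byte_string *s)
{
    int value;
    memcpy(&value, s->bytes, sizeof value);
    return value;
}
```

Because only the bytes are kept, the stored data is tied to the byte order and integer size of the machine that produced it.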

  • Implicit array

    xsubpp supports a special syntax for returning packed C arrays to perl. If the XS return type is given as

    array(type, nelem)

    xsubpp will copy the contents of nelem * sizeof(type) bytes from RETVAL to an SV and push it onto the stack. This is only really useful if the number of items to be returned is known at compile time and you don't mind having a string of bytes in your SV. Use T_ARRAY to push a variable number of arguments onto the return stack (they won't be packed as a single string though).

    This is similar to using T_OPAQUEPTR but can be used to process more than one element.

  • T_PACKED

    Calls user-supplied functions for conversion. For OUTPUT (XSUB to Perl), a function named XS_pack_$ntype is called with the output Perl scalar and the C variable to convert from. $ntype is the normalized C type that is to be mapped to Perl. Normalized means that all * are replaced by the string Ptr. The return value of the function is ignored.

    Conversely for INPUT (Perl to XSUB) mapping, the function named XS_unpack_$ntype is called with the input Perl scalar as argument and the return value is cast to the mapped C type and assigned to the output C variable.

    An example conversion function for a typemapped struct foo_t * might be:

    static void
    XS_pack_foo_tPtr(SV *out, foo_t *in)
    {
        dTHX; /* alas, signature does not include pTHX_ */
        HV* hash = newHV();
        hv_stores(hash, "int_member", newSViv(in->int_member));
        hv_stores(hash, "float_member", newSVnv(in->float_member));
        /* ... */
        /* mortalize as thy stack is not refcounted */
        sv_setsv(out, sv_2mortal(newRV_noinc((SV*)hash)));
    }

    The conversion from Perl to C is left as an exercise to the reader, but the prototype would be:

    static foo_t *
    XS_unpack_foo_tPtr(SV *in);

    Instead of an actual C function that has to fetch the thread context using dTHX, you can define macros of the same name and avoid the overhead. Also, remember to free the memory allocated by XS_unpack_foo_tPtr where necessary.

  • T_PACKEDARRAY

    T_PACKEDARRAY is similar to T_PACKED. In fact, the INPUT (Perl to XSUB) typemap is identical, but the OUTPUT typemap passes an additional argument to the XS_pack_$ntype function. This third parameter indicates the number of elements in the output so that the function can handle C arrays sanely. The variable needs to be declared by the user and must have the name count_$ntype where $ntype is the normalized C type name as explained above. For the example above and a type of foo_t **, the signature of the function would be:

    static void
    XS_pack_foo_tPtrPtr(SV *out, foo_t **in, UV count_foo_tPtrPtr);

    The type of the third parameter is arbitrary as far as the typemap is concerned. It just has to be in line with the declared variable.

    Of course, unless you know the number of elements in the sometype ** C array within your XSUB, the return value from foo_t ** XS_unpack_foo_tPtrPtr(...) will be hard to decipher. Since the details are all up to the XS author (the typemap user), there are several solutions, none of which is particularly elegant. The most commonly seen solution has been to allocate memory for N+1 pointers and assign NULL to the (N+1)th to facilitate iteration.

    Alternatively, using a customized typemap for your purposes in the first place is probably preferable.
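
The N+1 convention mentioned above might look like the following plain-C sketch. It is independent of the Perl API; foo_t is given a single hypothetical member, and alloc_foo_list and count_foo_list are invented names, not functions that xsubpp provides or expects.

```c
#include <stdlib.h>

typedef struct { int int_member; } foo_t;   /* hypothetical struct */

/* Allocate room for n pointers plus one terminating NULL slot.
 * Every slot starts out as NULL; the caller fills slots 0..n-1. */
static foo_t **
alloc_foo_list(size_t n)
{
    foo_t **list = malloc((n + 1) * sizeof *list);
    if (list == NULL)
        return NULL;
    for (size_t i = 0; i <= n; i++)
        list[i] = NULL;          /* slot n stays NULL as the terminator */
    return list;
}

/* Count the elements by walking until the NULL terminator. */
static size_t
count_foo_list(foo_t *const *list)
{
    size_t n = 0;
    while (list[n] != NULL)
        n++;
    return n;
}
```

The cost of this design is that a NULL element cannot appear inside the list, since it would be mistaken for the terminator.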

  • T_DATAUNIT

    NOT YET

  • T_CALLBACK

    NOT YET

  • T_ARRAY

    This is used to convert the perl argument list to a C array and for pushing the contents of a C array onto the perl argument stack.

    The usual calling signature is

    @out = array_func( @in );

    Any number of arguments can occur in the list before the array but the input and output arrays must be the last elements in the list.

    When used to pass a perl list to C the XS writer must provide a function (named after the array type but with 'Ptr' substituted for '*') to allocate the memory required to hold the list. A pointer should be returned. It is up to the XS writer to free the memory on exit from the function. The variable ix_$var is set to the number of elements in the new array.

    When returning a C array to Perl the XS writer must provide an integer variable called size_$var containing the number of elements in the array. This is used to determine how many elements should be pushed onto the return argument stack. This is not required on input since Perl knows how many arguments are on the stack when the routine is called. Ordinarily this variable would be called size_RETVAL.

    Additionally, the type of each element is determined from the type of the array. If the array uses type intArray * xsubpp will automatically work out that it contains variables of type int and use that typemap entry to perform the copy of each element. All pointer '*' and 'Array' tags are removed from the name to determine the subtype.

  • T_STDIO

    This is used for passing perl filehandles to and from C using FILE * structures.

  • T_INOUT

    This is used for passing perl filehandles to and from C using PerlIO * structures. The file handle can be used for reading and writing. This corresponds to the +< mode, see also T_IN and T_OUT.

    See perliol for more information on the Perl IO abstraction layer. Perl must have been built with -Duseperlio.

    There is no check to assert that the filehandle passed from Perl to C was created with the right open() mode.

    Hint: The perlxstut tutorial covers the T_INOUT, T_IN, and T_OUT XS types nicely.

  • T_IN

    Same as T_INOUT, but the filehandle that is returned from C to Perl can only be used for reading (mode <).

  • T_OUT

    Same as T_INOUT, but the filehandle that is returned from C to Perl is set to use the open mode +>.

 
piconv

NAME

piconv -- iconv(1), reinvented in perl

SYNOPSIS

  piconv [-f from_encoding] [-t to_encoding] [-s string] [files...]
  piconv -l
  piconv [-C N|-c|-p]
  piconv -S scheme ...
  piconv -r encoding
  piconv -D ...
  piconv -h

DESCRIPTION

piconv is a Perl version of iconv, a character encoding converter widely available for various Unixen today. This script was primarily a technology demonstrator for Perl 5.8.0, but you can use piconv in place of iconv in virtually any case.

piconv converts the character encoding of either STDIN or files specified in the argument and prints out to STDOUT.

Here is the list of options. Each option can be in short format (-f) or long (--from).

  • -f,--from from_encoding

    Specifies the encoding you are converting from. Unlike iconv, this option can be omitted. In such cases, the current locale is used.

  • -t,--to to_encoding

    Specifies the encoding you are converting to. Unlike iconv, this option can be omitted. In such cases, the current locale is used.

    Therefore, when both -f and -t are omitted, piconv just acts like cat.

  • -s,--string string

    Uses string instead of a file as the source of text.

  • -l,--list

    Lists all available encodings, one per line, in case-insensitive order. Note that only the canonical names are listed; many aliases exist. For example, the names are case-insensitive, and many standard and common aliases work, such as "latin1" for "ISO-8859-1", or "ibm850" instead of "cp850", or "winlatin1" for "cp1252". See Encode::Supported for a full discussion.

  • -C,--check N

    Check the validity of the stream if N = 1. When N = -1, something interesting happens when it encounters an invalid character.

  • -c

    Same as -C 1.

  • -p,--perlqq
  • --htmlcref
  • --xmlcref

    Applies PERLQQ, HTMLCREF, XMLCREF, respectively. Try

    piconv -f utf8 -t ascii --perlqq

    to see what it does.

  • -h,--help

    Show usage.

  • -D,--debug

    Invokes debugging mode. Primarily for Encode hackers.

  • -S,--scheme scheme

    Selects which scheme is to be used for conversion. Available schemes are as follows:

    • from_to

      Uses Encode::from_to for conversion. This is the default.

    • decode_encode

      Input strings are decode()d then encode()d. A straight two-step implementation.

    • perlio

      The new perlIO layer is used. NI-S' favorite.

      You should use this option if you are using UTF-16 or another encoding whose linefeed is not $/.

    Like the -D option, this is also for Encode hackers.

SEE ALSO

iconv(1), locale(3), Encode, Encode::Supported, Encode::Alias, PerlIO

 
pod2html

NAME

pod2html - convert .pod files to .html files

SYNOPSIS

  pod2html --help --htmlroot=<name> --infile=<name> --outfile=<name>
           --podpath=<name>:...:<name> --podroot=<name>
           --recurse --norecurse --verbose
           --index --noindex --title=<name>

DESCRIPTION

Converts files from pod format (see perlpod) to HTML format.

ARGUMENTS

pod2html takes the following arguments:

  • help
    --help

    Displays the usage message.

  • htmlroot
    --htmlroot=name

    Sets the base URL for the HTML files. When cross-references are made, the HTML root is prepended to the URL.

  • infile
    --infile=name

    Specify the pod file to convert. Input is taken from STDIN if no infile is specified.

  • outfile
    --outfile=name

    Specify the HTML file to create. Output goes to STDOUT if no outfile is specified.

  • podroot
    --podroot=name

    Specify the base directory for finding library pods.

  • podpath
    --podpath=name:...:name

    Specify which subdirectories of the podroot contain pod files whose HTML converted forms can be linked-to in cross-references.

  • index
    --index

    Generate an index at the top of the HTML file (default behaviour).

  • noindex
    --noindex

    Do not generate an index at the top of the HTML file.

  • recurse
    --recurse

    Recurse into subdirectories specified in podpath (default behaviour).

  • norecurse
    --norecurse

    Do not recurse into subdirectories specified in podpath.

  • title
    --title=title

    Specify the title of the resulting HTML file.

  • verbose
    --verbose

    Display progress messages.

AUTHOR

Tom Christiansen, <tchrist@perl.com>.

BUGS

See Pod::Html for a list of known bugs in the translator.

SEE ALSO

perlpod, Pod::Html

COPYRIGHT

This program is distributed under the Artistic License.

 
pod2latex

NAME

pod2latex - convert pod documentation to latex format

SYNOPSIS

  pod2latex *.pm
  pod2latex -out mytex.tex *.pod
  pod2latex -full -sections 'DESCRIPTION|NAME' SomeDir
  pod2latex -prefile h.tex -postfile t.tex my.pod

DESCRIPTION

pod2latex is a program to convert POD format documentation (perlpod) into latex. It can process multiple input documents at a time and either generate a latex file per input document or a single combined output file.

OPTIONS AND ARGUMENTS

This section describes the supported command line options. Minimum matching is supported.

  • -out

    Name of the output file to be used. If there are multiple input pods it is assumed that the intention is to write all translated output into a single file. .tex is appended if not present. If the argument is not supplied, a single document will be created for each input file.

  • -full

    Creates a complete latex file that can be processed immediately (unless =for/=begin directives are used that rely on extra packages). Table of contents and index generation commands are included in the wrapper latex code.

  • -sections

    Specify pod sections to include (or remove if negated) in the translation. See SECTION SPECIFICATIONS in Pod::Select for the format to use for section-spec. This option may be given multiple times on the command line. This is identical to the similar option in the podselect() command.

  • -modify

    This option causes the output latex to be slightly modified from the input pod such that when a =head1 NAME is encountered a section is created containing the actual pod name (rather than NAME) and all subsequent =head1 directives are treated as subsections. This has the advantage that the description of a module will be in its own section which is helpful for including module descriptions in documentation. Also forces latex label and index entries to be prefixed by the name of the module.

  • -h1level

    Specifies the latex section that is equivalent to a H1 pod directive. This is an integer between 0 and 5 with 0 equivalent to a latex chapter, 1 equivalent to a latex section etc. The default is 1 (H1 equivalent to a latex section).
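
The -h1level mapping can be pictured as a lookup table. In the plain-C sketch below, only levels 0 (chapter) and 1 (section) come from the description above; the names for levels 2 to 5 are an assumption that the remaining values follow the standard LaTeX sectioning order, and h1level_to_latex is a hypothetical helper name.

```c
#include <stddef.h>

/* Sectioning commands in standard LaTeX order. Levels 0 and 1 are
 * documented; levels 2 to 5 are assumed to continue in this order. */
static const char *const latex_sections[] = {
    "chapter", "section", "subsection",
    "subsubsection", "paragraph", "subparagraph"
};

/* Return the LaTeX command name for an -h1level value, or NULL if the
 * value is outside the documented range 0..5. */
static const char *
h1level_to_latex(int level)
{
    if (level < 0 || level > 5)
        return NULL;
    return latex_sections[level];
}
```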

  • -help

    Print a brief help message and exit.

  • -man

    Print the manual page and exit.

  • -verbose

    Print information messages as each document is processed.

  • -preamble

    A user-supplied preamble for the LaTeX code. Multiple values are supported and appended in order separated by "\n". See -prefile for reading the preamble from a file.

  • -postamble

    A user-supplied postamble for the LaTeX code. Multiple values are supported and appended in order separated by "\n". See -postfile for reading the postamble from a file.

  • -prefile

    A user-supplied preamble for the LaTeX code to be read from the named file. Multiple values are supported and appended in order. See -preamble.

  • -postfile

    A user-supplied postamble for the LaTeX code to be read from the named file. Multiple values are supported and appended in order. See -postamble.

BUGS

Known bugs are:

  • Cross references between documents are not resolved when multiple pod documents are converted into a single output latex file.

  • Functions and variables are not automatically recognized and they will therefore not be marked up in any special way unless instructed by an explicit pod command.

SEE ALSO

Pod::LaTeX

AUTHOR

Tim Jenness <tjenness@cpan.org>

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

Copyright (C) 2000, 2003, 2004 Tim Jenness. All Rights Reserved.

 
pod2man

NAME

pod2man - Convert POD data to formatted *roff input

SYNOPSIS

pod2man [--center=string] [--date=string] [--errors=style] [--fixed=font] [--fixedbold=font] [--fixeditalic=font] [--fixedbolditalic=font] [--name=name] [--nourls] [--official] [--quotes=quotes] [--release[=version]] [--section=manext] [--stderr] [--utf8] [--verbose] [input [output] ...]

pod2man --help

DESCRIPTION

pod2man is a front-end for Pod::Man, using it to generate *roff input from POD source. The resulting *roff code is suitable for display on a terminal using nroff(1), normally via man(1), or printing using troff(1).

input is the file to read for POD source (the POD can be embedded in code). If input isn't given, it defaults to STDIN. output, if given, is the file to which to write the formatted output. If output isn't given, the formatted output is written to STDOUT. Several POD files can be processed in the same pod2man invocation (saving module load and compile times) by providing multiple pairs of input and output files on the command line.

--section, --release, --center, --date, and --official can be used to set the headers and footers to use; if not given, Pod::Man will assume various defaults. See below or Pod::Man for details.

pod2man assumes that your *roff formatters have a fixed-width font named CW. If yours is called something else (like CR), use --fixed to specify it. This generally only matters for troff output for printing. Similarly, you can set the fonts used for bold, italic, and bold italic fixed-width output.

Besides the obvious pod conversions, Pod::Man, and therefore pod2man also takes care of formatting func(), func(n), and simple variable references like $foo or @bar so you don't have to use code escapes for them; complex expressions like $fred{'stuff'} will still need to be escaped, though. It also translates dashes that aren't used as hyphens into en dashes, makes long dashes--like this--into proper em dashes, fixes "paired quotes," and takes care of several other troff-specific tweaks. See Pod::Man for complete information.

OPTIONS

  • -c string, --center=string

    Sets the centered page header to string. The default is "User Contributed Perl Documentation", but also see --official below.

  • -d string, --date=string

    Set the left-hand footer string to this value. By default, the modification date of the input file will be used, or the current date if input comes from STDIN.

  • --errors=style

    Set the error handling style. die says to throw an exception on any POD formatting error. stderr says to report errors on standard error, but not to throw an exception. pod says to include a POD ERRORS section in the resulting documentation summarizing the errors. none ignores POD errors entirely, as much as possible.

    The default is die.

  • --fixed=font

    The fixed-width font to use for verbatim text and code. Defaults to CW. Some systems may want CR instead. Only matters for troff(1) output.

  • --fixedbold=font

    Bold version of the fixed-width font. Defaults to CB. Only matters for troff(1) output.

  • --fixeditalic=font

    Italic version of the fixed-width font (actually, something of a misnomer, since most fixed-width fonts only have an oblique version, not an italic version). Defaults to CI. Only matters for troff(1) output.

  • --fixedbolditalic=font

    Bold italic (probably actually oblique) version of the fixed-width font. Pod::Man doesn't assume you have this, and defaults to CB. Some systems (such as Solaris) have this font available as CX. Only matters for troff(1) output.

  • -h, --help

    Print out usage information.

  • -l, --lax

    No longer used. pod2man used to check its input for validity as a manual page, but this should now be done by podchecker(1) instead. Accepted for backward compatibility; this option no longer does anything.

  • -n name, --name=name

    Set the name of the manual page to name. Without this option, the manual name is set to the uppercased base name of the file being converted unless the manual section is 3, in which case the path is parsed to see if it is a Perl module path. If it is, a path like .../lib/Pod/Man.pm is converted into a name like Pod::Man. This option, if given, overrides any automatic determination of the name.

    Note that this option is probably not useful when converting multiple POD files at once. The convention for Unix man pages for commands is for the man page title to be in all-uppercase even if the command isn't.

  • --nourls

    Normally, L<> formatting codes with a URL but anchor text are formatted to show both the anchor text and the URL. In other words:

    L<foo|http://example.com/>

    is formatted as:

    foo <http://example.com/>

    This flag, if given, suppresses the URL when anchor text is given, so this example would be formatted as just foo. This can produce less cluttered output in cases where the URLs are not particularly important.

  • -o, --official

    Set the default header to indicate that this page is part of the standard Perl release, if --center is not also given.

  • -q quotes, --quotes=quotes

    Sets the quote marks used to surround C<> text to quotes. If quotes is a single character, it is used as both the left and right quote; if quotes is two characters, the first character is used as the left quote and the second as the right quote; and if quotes is four characters, the first two are used as the left quote and the second two as the right quote.

    quotes may also be set to the special value none, in which case no quote marks are added around C<> text (but the font is still changed for troff output).
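
The length rules for --quotes can be written down as a small parser. The plain-C sketch below is a model of those rules only, not Pod::Man's own code: struct quote_pair and parse_quotes are hypothetical names, and because the sketch counts bytes it only covers single-byte quote characters.

```c
#include <string.h>

/* Left and right quote strings, at most two bytes each plus a NUL. */
struct quote_pair {
    char left[3];
    char right[3];
};

/* Hypothetical parser for a --quotes value. Returns 0 on success and
 * -1 for a length that the option does not describe. */
static int
parse_quotes(const char *spec, struct quote_pair *q)
{
    size_t len = strlen(spec);

    memset(q, 0, sizeof *q);
    if (strcmp(spec, "none") == 0)
        return 0;                          /* no quote marks at all */
    if (len == 1) {
        q->left[0] = q->right[0] = spec[0];
    } else if (len == 2) {
        q->left[0]  = spec[0];
        q->right[0] = spec[1];
    } else if (len == 4) {
        memcpy(q->left,  spec,     2);
        memcpy(q->right, spec + 2, 2);
    } else {
        return -1;
    }
    return 0;
}
```

Note that the special value none has to be checked before the four-character case, since it is itself four characters long.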

  • -r, --release

    Set the centered footer. By default, this is the version of Perl you run pod2man under. Note that the an macro sets on some systems assume that the centered footer will be a modification date and will prepend something like "Last modified: "; if this is the case, you may want to set --release to the last modified date and --date to the version number.

  • -s, --section

    Set the section for the .TH macro. The standard section numbering convention is to use 1 for user commands, 2 for system calls, 3 for functions, 4 for devices, 5 for file formats, 6 for games, 7 for miscellaneous information, and 8 for administrator commands. There is a lot of variation here, however; some systems (like Solaris) use 4 for file formats, 5 for miscellaneous information, and 7 for devices. Still others use 1m instead of 8, or some mix of both. About the only section numbers that are reliably consistent are 1, 2, and 3.

    By default, section 1 will be used unless the file ends in .pm, in which case section 3 will be selected.

  • --stderr

    By default, pod2man dies if any errors are detected in the POD input. If --stderr is given and no --errors flag is present, errors are sent to standard error, but pod2man does not abort. This is equivalent to --errors=stderr and is supported for backward compatibility.

  • -u, --utf8

    By default, pod2man produces the most conservative possible *roff output to try to ensure that it will work with as many different *roff implementations as possible. Many *roff implementations cannot handle non-ASCII characters, so this means all non-ASCII characters are converted either to a *roff escape sequence that tries to create a properly accented character (at least for troff output) or to X.

    This option says to instead output literal UTF-8 characters. If your *roff implementation can handle it, this is the best output format to use and avoids corruption of documents containing non-ASCII characters. However, be warned that *roff source with literal UTF-8 characters is not supported by many implementations and may even result in segfaults and other bad behavior.

    Be aware that, when using this option, the input encoding of your POD source must be properly declared unless it is US-ASCII or Latin-1. POD input without an =encoding command will be assumed to be in Latin-1, and if it's actually in UTF-8, the output will be double-encoded. See perlpod(1) for more information on the =encoding command.

  • -v, --verbose

    Print out the name of each output file as it is being generated.
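
The double-encoding warning under -u can be reproduced outside pod2man. The following Python sketch is not pod2man code; it only mimics the mistake described above, where UTF-8 bytes are misread as Latin-1 and then written out as UTF-8. The function name double_encode is made up for this illustration.

```python
def double_encode(text: str) -> bytes:
    """Mimic UTF-8 input that is misread as Latin-1 and then written as UTF-8."""
    raw = text.encode("utf-8")        # bytes as stored in a UTF-8 POD file
    misread = raw.decode("latin-1")   # wrong guess: no =encoding, so Latin-1 is assumed
    return misread.encode("utf-8")    # output forced to UTF-8


original = "é"
print(original.encode("utf-8"))   # the two bytes of the correctly encoded character
print(double_encode(original))    # four bytes: each original byte was encoded again
```

Declaring the input encoding with an =encoding command is what prevents the second, unwanted encoding step.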

EXIT STATUS

As long as all documents processed result in some output, even if that output includes errata (a POD ERRORS section generated with --errors=pod), pod2man will exit with status 0. If any of the documents being processed do not result in an output document, pod2man will exit with status 1. If there are syntax errors in a POD document being processed and the error handling style is set to the default of die, pod2man will abort immediately with exit status 255.

DIAGNOSTICS

If pod2man fails with errors, see Pod::Man and Pod::Simple for information about what those errors might mean.

EXAMPLES

    pod2man program > program.1
    pod2man SomeModule.pm /usr/perl/man/man3/SomeModule.3
    pod2man --section=7 note.pod > note.7

If you would like to print out a lot of man pages continuously, you probably want to set the C and D registers to set contiguous page numbering and even/odd paging, at least on some versions of man(7).

    troff -man -rC1 -rD1 perl.1 perldata.1 perlsyn.1 ...

To get index entries on STDERR, turn on the F register, as in:

    troff -man -rF1 perl.1

The indexing merely outputs messages via .tm for each major page, section, subsection, item, and any X<> directives. See Pod::Man for more details.

BUGS

Lots of this documentation is duplicated from Pod::Man.

SEE ALSO

Pod::Man, Pod::Simple, man(1), nroff(1), perlpod(1), podchecker(1), perlpodstyle(1), troff(1), man(7)

The man page documenting the an macro set may be man(5) instead of man(7) on your system.

The current version of this script is always available from its web site at http://www.eyrie.org/~eagle/software/podlators/. It is also part of the Perl core distribution as of 5.6.0.

AUTHOR

Russ Allbery <rra@stanford.edu>, based very heavily on the original pod2man by Larry Wall and Tom Christiansen.

COPYRIGHT AND LICENSE

Copyright 1999, 2000, 2001, 2004, 2006, 2008, 2010, 2012, 2013 Russ Allbery <rra@stanford.edu>.

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

 
pod2text

NAME

pod2text - Convert POD data to formatted ASCII text

SYNOPSIS

pod2text [-aclostu] [--code] [--errors=style] [-i indent] [-q quotes] [--nourls] [--stderr] [-w width] [input [output ...]]

pod2text -h

DESCRIPTION

pod2text is a front-end for Pod::Text and its subclasses. It uses them to generate formatted ASCII text from POD source. It can optionally use either termcap sequences or ANSI color escape sequences to format the text.

input is the file to read for POD source (the POD can be embedded in code). If input isn't given, it defaults to STDIN . output, if given, is the file to which to write the formatted output. If output isn't given, the formatted output is written to STDOUT . Several POD files can be processed in the same pod2text invocation (saving module load and compile times) by providing multiple pairs of input and output files on the command line.

OPTIONS

  • -a, --alt

    Use an alternate output format that, among other things, uses a different heading style and marks =item entries with a colon in the left margin.

  • --code

    Include any non-POD text from the input file in the output as well. Useful for viewing code documented with POD blocks with the POD rendered and the code left intact.

  • -c, --color

    Format the output with ANSI color escape sequences. Using this option requires that Term::ANSIColor be installed on your system.

  • -i indent, --indent=indent

    Set the number of spaces to indent regular text, and the default indentation for =over blocks. Defaults to 4 spaces if this option isn't given.

  • --errors=style

    Set the error handling style. die says to throw an exception on any POD formatting error. stderr says to report errors on standard error, but not to throw an exception. pod says to include a POD ERRORS section in the resulting documentation summarizing the errors. none ignores POD errors entirely, as much as possible.

    The default is die.

  • -h, --help

    Print out usage information and exit.

  • -l, --loose

    Print a blank line after a =head1 heading. Normally, no blank line is printed after =head1, although one is still printed after =head2, because this is the expected formatting for manual pages; if you're formatting arbitrary text documents, using this option is recommended.

  • -m width, --left-margin=width, --margin=width

    The width of the left margin in spaces. Defaults to 0. This is the margin for all text, including headings, not the amount by which regular text is indented; for the latter, see the -i option.

  • --nourls

    Normally, L<> formatting codes with both a URL and anchor text are formatted to show both the anchor text and the URL. In other words:

        L<foo|http://example.com/>

    is formatted as:

        foo <http://example.com/>

    This flag, if given, suppresses the URL when anchor text is given, so this example would be formatted as just foo. This can produce less cluttered output in cases where the URLs are not particularly important.

  • -o, --overstrike

    Format the output with overstrike printing. Bold text is rendered as character, backspace, character. Italics and file names are rendered as underscore, backspace, character. Many pagers, such as less, know how to convert this to bold or underlined text.

  • -q quotes, --quotes=quotes

    Sets the quote marks used to surround C<> text to quotes. If quotes is a single character, it is used as both the left and right quote; if quotes is two characters, the first character is used as the left quote and the second as the right quote; and if quotes is four characters, the first two are used as the left quote and the second two as the right quote.

    quotes may also be set to the special value none, in which case no quote marks are added around C<> text.

  • -s, --sentence

    Assume each sentence ends with two spaces and try to preserve that spacing. Without this option, all consecutive whitespace in non-verbatim paragraphs is compressed into a single space.

  • --stderr

    By default, pod2text dies if any errors are detected in the POD input. If --stderr is given and no --errors flag is present, errors are sent to standard error, but pod2text does not abort. This is equivalent to --errors=stderr and is supported for backward compatibility.

  • -t, --termcap

    Try to determine the width of the screen and the bold and underline sequences for the terminal from termcap, and use that information in formatting the output. Output will be wrapped at two columns less than the width of your terminal device. Using this option requires that your system have a termcap file somewhere where Term::Cap can find it and requires that your system support termios. With this option, the output of pod2text will contain terminal control sequences for your current terminal type.

  • -u, --utf8

    By default, pod2text tries to use the same output encoding as its input encoding (to be backward-compatible with older versions). This option says to instead force the output encoding to UTF-8.

    Be aware that, when using this option, the input encoding of your POD source must be properly declared unless it is US-ASCII or Latin-1. POD input without an =encoding command will be assumed to be in Latin-1, and if it's actually in UTF-8, the output will be double-encoded. See perlpod(1) for more information on the =encoding command.

  • -w, --width=width, -width

    The column at which to wrap text on the right-hand side. Defaults to 76, unless -t is given, in which case it's two columns less than the width of your terminal device.
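
The length-based rule for the -q option described above can be written out as a small function. This is a Python sketch of the documented rule only, not the code Pod::Text uses; the name parse_quotes and the choice to raise ValueError for any other length are assumptions made for the illustration.

```python
def parse_quotes(quotes: str):
    """Return (left, right) quote strings for C<> text, following the -q rules."""
    if quotes == "none":          # special value: no quote marks at all
        return ("", "")
    if len(quotes) == 1:          # one character: same mark on both sides
        return (quotes, quotes)
    if len(quotes) == 2:          # two characters: left mark, then right mark
        return (quotes[0], quotes[1])
    if len(quotes) == 4:          # four characters: two-character left and right marks
        return (quotes[:2], quotes[2:])
    # Assumption: any other length is treated as an error.
    raise ValueError(f"invalid quote specification: {quotes!r}")


print(parse_quotes("`'"))
print(parse_quotes("``''"))
```

Note that none is checked first, because it is itself four characters long and would otherwise be split into two-character quote marks.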

EXIT STATUS

As long as all documents processed result in some output, even if that output includes errata (a POD ERRORS section generated with --errors=pod), pod2text will exit with status 0. If any of the documents being processed do not result in an output document, pod2text will exit with status 1. If there are syntax errors in a POD document being processed and the error handling style is set to the default of die, pod2text will abort immediately with exit status 255.

DIAGNOSTICS

If pod2text fails with errors, see Pod::Text and Pod::Simple for information about what those errors might mean. Internally, it can also produce the following diagnostics:

  • -c (--color) requires Term::ANSIColor be installed

    (F) -c or --color were given, but Term::ANSIColor could not be loaded.

  • Unknown option: %s

    (F) An unknown command line option was given.

In addition, other Getopt::Long error messages may result from invalid command-line options.

ENVIRONMENT

  • COLUMNS

    If -t is given, pod2text will take the current width of your screen from this environment variable, if available. It overrides terminal width information in TERMCAP.

  • TERMCAP

    If -t is given, pod2text will use the contents of this environment variable if available to determine the correct formatting sequences for your current terminal device.
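
Taken together, the -w, -t, and COLUMNS descriptions imply a simple rule for the wrap column. The Python sketch below restates that rule under stated assumptions: the function name wrap_width, the termcap_columns parameter (standing in for whatever width the termcap entry reports), and its fallback of 80 are illustrative and not taken from this page. It does not model an explicit -w value.

```python
def wrap_width(use_termcap: bool, env: dict, termcap_columns: int = 80) -> int:
    """Pick the wrap column as described for -w, -t, and COLUMNS."""
    if not use_termcap:
        return 76                      # documented default without -t
    columns = env.get("COLUMNS")       # COLUMNS overrides the termcap width
    width = int(columns) if columns else termcap_columns
    return width - 2                   # wrap two columns short of the terminal width


print(wrap_width(False, {}))
print(wrap_width(True, {"COLUMNS": "100"}, termcap_columns=132))
```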

SEE ALSO

Pod::Text, Pod::Text::Color, Pod::Text::Overstrike, Pod::Text::Termcap, Pod::Simple, perlpod(1)

The current version of this script is always available from its web site at http://www.eyrie.org/~eagle/software/podlators/. It is also part of the Perl core distribution as of 5.6.0.

AUTHOR

Russ Allbery <rra@stanford.edu>.

COPYRIGHT AND LICENSE

Copyright 1999, 2000, 2001, 2004, 2006, 2008, 2010, 2012, 2013 Russ Allbery <rra@stanford.edu>.

This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

 
pod2usage

NAME

pod2usage - print usage messages from embedded pod docs in files

SYNOPSIS

  • pod2usage

    [-help] [-man] [-exit exitval] [-output outfile] [-verbose level] [-pathlist dirlist] [-formatter module] file

OPTIONS AND ARGUMENTS

  • -help

    Print a brief help message and exit.

  • -man

    Print this command's manual page and exit.

  • -exit exitval

    The exit status value to return.

  • -output outfile

    The output file to print to. If the special names "-" or ">&1" or ">&STDOUT" are used then standard output is used. If ">&2" or ">&STDERR" is used then standard error is used.

  • -verbose level

    The desired level of verbosity to use:

        1 : print SYNOPSIS only
        2 : print SYNOPSIS sections and any OPTIONS/ARGUMENTS sections
        3 : print the entire manpage (similar to running pod2text)

  • -pathlist dirlist

    Specifies one or more directories to search for the input file if it was not supplied with an absolute path. Each directory path in the given list should be separated by a ':' on Unix (';' on MSWin32 and DOS).

  • -formatter module

    Which text formatter to use. Default is Pod::Text, or for very old Perl versions Pod::PlainText. An alternative would be e.g. Pod::Text::Termcap.

  • file

    The pathname of a file containing pod documentation to be output in usage message format (defaults to standard input).
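
The -pathlist search described above amounts to trying each listed directory in turn when the input file is not given as an absolute path. The following Python sketch illustrates that idea only; it is not how Pod::Usage is implemented, and the function name find_input is hypothetical. It also does not model what happens when the relative name already exists in the current directory.

```python
import os


def find_input(filename, pathlist, sep=os.pathsep):
    """Return the first match for filename in the directories of pathlist, or None."""
    if os.path.isabs(filename):
        return filename                      # absolute paths are used as given
    for directory in pathlist.split(sep):    # ':' on Unix, ';' on Windows
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A call such as find_input("tool.pod", "/usr/local/doc:/opt/doc") would return the first of the two candidate paths that exists; both directory names here are placeholders.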

DESCRIPTION

pod2usage will read the given input file looking for pod documentation and will print the corresponding usage message. If no input file is specified then standard input is read.

pod2usage invokes the pod2usage() function in the Pod::Usage module. Please see pod2usage() in Pod::Usage.

SEE ALSO

Pod::Usage, pod2text(1)

AUTHOR

Please report bugs using http://rt.cpan.org.

Brad Appleton <bradapp@enteract.com>

Based on code for pod2text(1) written by Tom Christiansen <tchrist@mox.perl.com>

 
podchecker

NAME

podchecker - check the syntax of POD format documentation files

SYNOPSIS

podchecker [-help] [-man] [-(no)warnings] [file ...]

OPTIONS AND ARGUMENTS

  • -help

    Print a brief help message and exit.

  • -man

    Print the manual page and exit.

  • -warnings -nowarnings

    Turn on/off printing of warnings. Repeating -warnings increases the warning level, i.e. more warnings are printed. Currently increasing to level two causes flagging of unescaped "<,>" characters.

  • file

    The pathname of a POD file to syntax-check (defaults to standard input).

DESCRIPTION

podchecker will read the given input files looking for POD syntax errors in the POD documentation and will print any errors it finds to STDERR. At the end, it will print a status message indicating the number of errors found.

Directories are ignored, and an appropriate warning message is printed.

podchecker invokes the podchecker() function exported by Pod::Checker. Please see podchecker() in Pod::Checker for more details.

RETURN VALUE

podchecker returns a 0 (zero) exit status if all specified POD files are ok.

ERRORS

podchecker returns the exit status 1 if at least one of the given POD files has syntax errors.

The status 2 indicates that at least one of the specified files does not contain any POD commands.

Status 1 overrides status 2. If you want unambiguous results, call podchecker with one single argument only.
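
The precedence between the exit statuses can be stated as a few lines of code. This Python sketch only restates the rule above; the labels 'ok', 'errors', and 'no_pod' and the function name overall_status are invented for the illustration.

```python
def overall_status(per_file_results):
    """Combine per-file outcomes into one exit status.

    per_file_results holds one of the hypothetical labels
    'ok', 'errors', or 'no_pod' for each file checked.
    """
    results = list(per_file_results)
    if "errors" in results:     # status 1 overrides status 2
        return 1
    if "no_pod" in results:     # at least one file had no POD commands
        return 2
    return 0                    # every file was fine


print(overall_status(["ok", "no_pod", "errors"]))
```

With several files, a status of 1 therefore hides whether some other file lacked POD, which is why a single argument gives unambiguous results.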

SEE ALSO

Pod::Parser and Pod::Checker

AUTHORS

Please report bugs using http://rt.cpan.org.

Brad Appleton <bradapp@enteract.com>, Marek Rouchal <marekr@cpan.org>

Based on code for Pod::Text::pod2text(1) written by Tom Christiansen <tchrist@mox.perl.com>

 
podselect

NAME

podselect - print selected sections of pod documentation on standard output

SYNOPSIS

podselect [-help] [-man] [-section section-spec] [file ...]

OPTIONS AND ARGUMENTS

  • -help

    Print a brief help message and exit.

  • -man

    Print the manual page and exit.

  • -section section-spec

    Specify a section to include in the output. See SECTION SPECIFICATIONS in Pod::Parser for the format to use for section-spec. This option may be given multiple times on the command line.

  • file

    The pathname of a file from which to select sections of pod documentation (defaults to standard input).

DESCRIPTION

podselect will read the given input files looking for pod documentation and will print out (in raw pod format) all sections that match one or more of the given section specifications. If no section specifications are given then all pod sections encountered are output.

podselect invokes the podselect() function exported by Pod::Select. Please see podselect() in Pod::Select for more details.

SEE ALSO

Pod::Parser and Pod::Select

AUTHOR

Please report bugs using http://rt.cpan.org.

Brad Appleton <bradapp@enteract.com>

Based on code for Pod::Text::pod2text(1) written by Tom Christiansen <tchrist@mox.perl.com>

 
prove

NAME

prove - Run tests through a TAP harness.

USAGE

    prove [options] [files or directories]

OPTIONS

Boolean options:

 -v,  --verbose         Print all test lines.
 -l,  --lib             Add 'lib' to the path for your tests (-Ilib).
 -b,  --blib            Add 'blib/lib' and 'blib/arch' to the path for
                        your tests
 -s,  --shuffle         Run the tests in random order.
 -c,  --color           Colored test output (default).
      --nocolor         Do not color test output.
      --count           Show the X/Y test count when not verbose
                        (default)
      --nocount         Disable the X/Y test count.
 -D   --dry             Dry run. Show tests that would have run.
      --ext             Set the extension for tests (default '.t')
 -f,  --failures        Show failed tests.
 -o,  --comments        Show comments.
      --ignore-exit     Ignore exit status from test scripts.
 -m,  --merge           Merge test scripts' STDERR with their STDOUT.
 -r,  --recurse         Recursively descend into directories.
      --reverse         Run the tests in reverse order.
 -q,  --quiet           Suppress some test output while running tests.
 -Q,  --QUIET           Only print summary results.
 -p,  --parse           Show full list of TAP parse errors, if any.
      --directives      Only show results with TODO or SKIP directives.
      --timer           Print elapsed time after each test.
      --trap            Trap Ctrl-C and print summary on interrupt.
      --normalize       Normalize TAP output in verbose output
 -T                     Enable tainting checks.
 -t                     Enable tainting warnings.
 -W                     Enable fatal warnings.
 -w                     Enable warnings.
 -h,  --help            Display this help
 -?,                    Display this help
 -H,  --man             Longer manpage for prove
      --norc            Don't process default .proverc

Options that take arguments:

 -I                     Library paths to include.
 -P                     Load plugin (searches App::Prove::Plugin::*.)
 -M                     Load a module.
 -e,  --exec            Interpreter to run the tests ('' for compiled
                        tests.)
      --harness         Define test harness to use. See TAP::Harness.
      --formatter       Result formatter to use. See FORMATTERS.
      --source          Load and/or configure a SourceHandler. See
                        SOURCE HANDLERS.
 -a,  --archive out.tgz Store the resulting TAP in an archive file.
 -j,  --jobs N          Run N test jobs in parallel (try 9.)
      --state=opts      Control prove's persistent state.
      --rc=rcfile       Process options from rcfile

NOTES

.proverc

If ~/.proverc or ./.proverc exist they will be read and any options they contain processed before the command line options. Options in .proverc are specified in the same way as command line options:

    # .proverc
    --state=hot,fast,save
    -j9

Additional option files may be specified with the --rc option. Default option file processing is disabled by the --norc option.

Under Windows and VMS the option file is named _proverc rather than .proverc and is sought only in the current directory.
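
Because options from the rc file are processed before the command line options, the effective argument list can be pictured as the rc file's options followed by the ones typed on the command line. The Python sketch below shows that ordering; it is not prove's own parser. Using shlex to split lines and drop # comments is an assumption, and both function names are hypothetical.

```python
import shlex


def rc_options(text):
    """Split rc-file text into option words, ignoring # comments."""
    options = []
    for line in text.splitlines():
        # Assumption: shell-like word splitting is close enough for a sketch.
        options.extend(shlex.split(line, comments=True))
    return options


def effective_args(rc_text, command_line):
    """rc options come first, command line options after them."""
    return rc_options(rc_text) + list(command_line)


rc = "# .proverc\n--state=hot,fast,save\n-j9\n"
print(effective_args(rc, ["-v", "t/"]))
```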

Reading from STDIN

If you have a list of tests (or URLs, or anything else you want to test) in a file, you can add them to your tests by using a '-':

    prove - < my_list_of_things_to_test.txt

See the README in the examples directory of this distribution.

Default Test Directory

If no files or directories are supplied, prove looks for all files matching the pattern t/*.t.

Colored Test Output

Colored test output is the default, but if output is not to a terminal, color is disabled. You can override this by adding the --color switch.

Color support requires Term::ANSIColor on Unix-like platforms and Win32::Console on Windows. If the necessary module is not installed, colored output will not be available.

Exit Code

If the tests fail, prove will exit with non-zero status.

Arguments to Tests

It is possible to supply arguments to tests. To do so, separate them from prove's own arguments with the arisdottle, '::'. For example

    prove -v t/mytest.t :: --url http://example.com

would run t/mytest.t with the options '--url http://example.com'. When running multiple tests they will each receive the same arguments.
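
The effect of the '::' separator can be sketched as splitting the argument list in two. This Python fragment is an illustration of the behaviour described above, not prove's implementation, and the name split_test_args is made up.

```python
def split_test_args(argv):
    """Split argv at the first '::' into (prove arguments, test arguments)."""
    if "::" in argv:
        i = argv.index("::")
        return argv[:i], argv[i + 1:]
    return list(argv), []       # no separator: nothing is passed to the tests


prove_args, test_args = split_test_args(
    ["-v", "t/mytest.t", "::", "--url", "http://example.com"]
)
print(prove_args)
print(test_args)
```

Every test named in the first list would be run with the same second list as its arguments.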

--exec

Normally you can just pass a list of Perl tests and the harness will know how to execute them. However, if your tests are not written in Perl or if you want all tests invoked exactly the same way, use the -e or --exec switch:

    prove --exec '/usr/bin/ruby -w' t/
    prove --exec '/usr/bin/perl -Tw -mstrict -Ilib' t/
    prove --exec '/path/to/my/customer/exec'

--merge

If you need to make sure your diagnostics are displayed in the correct order relative to test results you can use the --merge option to merge the test scripts' STDERR into their STDOUT.

This guarantees that STDOUT (where the test results appear) and STDERR (where the diagnostics appear) will stay in sync. The harness will display any diagnostics your tests emit on STDERR.

Caveat: this is a bit of a kludge. In particular note that if anything that appears on STDERR looks like a test result the test harness will get confused. Use this option only if you understand the consequences and can live with the risk.

--trap

The --trap option will attempt to trap SIGINT (Ctrl-C) during a test run and display the test summary even if the run is interrupted.

--state

You can ask prove to remember the state of previous test runs and select and/or order the tests to be run based on that saved state.

The --state switch requires an argument which must be a comma separated list of one or more of the following options.

  • last

    Run the same tests as the last time the state was saved. This makes it possible, for example, to recreate the ordering of a shuffled test.

        # Run all tests in random order
        $ prove -b --state=save --shuffle
        # Run them again in the same order
        $ prove -b --state=last

  • failed

    Run only the tests that failed on the last run.

        # Run all tests
        $ prove -b --state=save
        # Run failures
        $ prove -b --state=failed

    If you also specify the save option newly passing tests will be excluded from subsequent runs.

        # Repeat until no more failures
        $ prove -b --state=failed,save

  • passed

    Run only the passed tests from last time. Useful to make sure that no new problems have been introduced.

  • all

    Run all tests in normal order. Multiple options may be specified, so to run all tests with the failures from last time first:

        $ prove -b --state=failed,all,save

  • hot

    Run the tests that most recently failed first. The last failure time of each test is stored. The hot option causes tests to be run in most-recent-failure order.

        $ prove -b --state=hot,save

    Tests that have never failed will not be selected. To run all tests with the most recently failed first use

        $ prove -b --state=hot,all,save

    This combination of options may also be specified thus

        $ prove -b --state=adrian

  • todo

    Run any tests with todos.

  • slow

    Run the tests in slowest to fastest order. This is useful in conjunction with the -j parallel testing switch to ensure that your slowest tests start running first.

        $ prove -b --state=slow -j9

  • fast

    Run the tests in fastest to slowest order.

  • new

    Run the tests in newest to oldest order based on the modification times of the test scripts.

  • old

    Run the tests in oldest to newest order.

  • fresh

    Run those test scripts that have been modified since the last test run.

  • save

    Save the state on exit. The state is stored in a file called .prove (_prove on Windows and VMS) in the current directory.
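
The ordering produced by the hot option, alone and combined with all, can be pictured with a short Python sketch. It is an illustration under assumptions: the last_fail_time mapping stands in for whatever prove keeps in its state file, the name hot_order is hypothetical, and appending the never-failed tests in their original order is one plausible reading of hot,all.

```python
def hot_order(tests, last_fail_time, include_all=False):
    """Order tests by most recent failure first.

    last_fail_time maps a test name to the time of its last failure;
    tests that never failed are absent from the mapping.
    """
    failed = [t for t in tests if t in last_fail_time]
    ordered = sorted(failed, key=lambda t: last_fail_time[t], reverse=True)
    if include_all:
        # hot,all: the remaining tests follow in their normal order.
        ordered += [t for t in tests if t not in last_fail_time]
    return ordered


tests = ["t/a.t", "t/b.t", "t/c.t"]
fails = {"t/a.t": 100, "t/c.t": 200}
print(hot_order(tests, fails))
print(hot_order(tests, fails, include_all=True))
```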

The --state switch may be used more than once.

    $ prove -b --state=hot --state=all,save
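
Since --state takes a comma separated list and may be repeated, the values from all occurrences can be thought of as one flat list of option names. The Python sketch below shows that flattening, including the adrian shorthand mentioned earlier; the function name and the ALIASES table are illustrative, not prove internals.

```python
# 'adrian' is documented as another way to write hot,all,save.
ALIASES = {"adrian": ["hot", "all", "save"]}


def collect_state_options(switch_values):
    """Flatten the values of every --state switch into one list of option names."""
    options = []
    for value in switch_values:
        for name in value.split(","):
            options.extend(ALIASES.get(name, [name]))
    return options


print(collect_state_options(["hot", "all,save"]))
print(collect_state_options(["adrian"]))
```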

@INC

prove introduces a separation between "options passed to the perl which runs prove" and "options passed to the perl which runs tests"; this distinction is by design. Thus the perl which is running a test starts with the default @INC. Additional library directories can be added via the PERL5LIB environment variable, via -Ifoo in PERL5OPT or via the -Ilib option to prove.

Taint Mode

Normally when a Perl program is run in taint mode the contents of the PERL5LIB environment variable do not appear in @INC.

Because PERL5LIB is often used during testing to add build directories to @INC, prove passes the names of any directories found in PERL5LIB as -I switches. The net effect of this is that PERL5LIB is honoured even when prove is run in taint mode.
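
The translation from PERL5LIB to -I switches is easy to picture. This Python sketch only illustrates the idea; it is not the code prove uses, and the function name perl5lib_switches is hypothetical.

```python
import os


def perl5lib_switches(env, sep=os.pathsep):
    """Turn each directory listed in PERL5LIB into a -I switch."""
    value = env.get("PERL5LIB", "")
    return ["-I" + directory for directory in value.split(sep) if directory]


print(perl5lib_switches({"PERL5LIB": "blib/lib:blib/arch"}, sep=":"))
```

Passing the directories explicitly is what keeps them in @INC for the test even though taint mode ignores the environment variable itself.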

FORMATTERS

You can load a custom TAP::Parser::Formatter:

    prove --formatter MyFormatter

SOURCE HANDLERS

You can load custom TAP::Parser::SourceHandlers, to change the way the parser interprets particular sources of TAP.

    prove --source MyHandler --source YetAnother t

If you want to provide config to the source you can use:

    prove --source MyCustom \
          --source Perl --perl-option 'foo=bar baz' --perl-option avg=0.278 \
          --source File --file-option extensions=.txt --file-option extensions=.tmp \
          --source pgTAP --pgtap-option pset=format=html --pgtap-option pset=border=2 t

Each --$source-option option must specify a key/value pair separated by an =. If an option can take multiple values, just specify it multiple times, as with the extensions= examples above. If the option should be a hash reference, specify the value as a second pair separated by a =, as in the pset= examples above (escape = with a backslash).

All --sources are combined into a hash, and passed to new in TAP::Harness's sources parameter.

See TAP::Parser::IteratorFactory for more details on how configuration is passed to SourceHandlers.
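
The three shapes a --$source-option value can take (single value, repeated value, nested pair) can be illustrated with a small parser. The Python sketch below follows the rules stated above and is not prove's own parsing code; the function name parse_source_options is hypothetical, and error handling for malformed or inconsistently used options is left out.

```python
import re

# An '=' that is not preceded by a backslash separates key from value.
_UNESCAPED_EQ = re.compile(r"(?<!\\)=")


def parse_source_options(pairs):
    """Build a config dict from key=value strings for one source."""
    config = {}
    for pair in pairs:
        key, value = _UNESCAPED_EQ.split(pair, maxsplit=1)
        if _UNESCAPED_EQ.search(value):
            # A second unescaped '=' means the value is itself a key/value pair.
            sub_key, sub_value = _UNESCAPED_EQ.split(value, maxsplit=1)
            config.setdefault(key, {})[sub_key] = sub_value
        else:
            value = value.replace("\\=", "=")   # drop the escaping backslash
            if key in config:
                if not isinstance(config[key], list):
                    config[key] = [config[key]]
                config[key].append(value)       # repeated option: collect values
            else:
                config[key] = value
    return config


print(parse_source_options(["extensions=.txt", "extensions=.tmp"]))
print(parse_source_options(["pset=format=html", "pset=border=2"]))
```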

PLUGINS

Plugins can be loaded using the -Pplugin syntax, e.g.:

    prove -PMyPlugin

This will search for a module named App::Prove::Plugin::MyPlugin or, failing that, MyPlugin. If the plugin can't be found, prove will complain and exit.

You can pass arguments to your plugin by appending =arg1,arg2,etc to the plugin name:

    prove -PMyPlugin=fou,du,fafa

Please check individual plugin documentation for more details.
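
The -P syntax packs a plugin name, an optional argument list, and an implied search order into one string. The Python sketch below unpacks such a string as described above; it is an illustration rather than prove's implementation, and the name parse_plugin_spec is made up.

```python
def parse_plugin_spec(spec):
    """Split 'Name=arg1,arg2' into candidate module names and arguments."""
    name, _, arg_text = spec.partition("=")
    args = arg_text.split(",") if arg_text else []
    # Search order: the App::Prove::Plugin:: namespace first, then the bare name.
    candidates = ["App::Prove::Plugin::" + name, name]
    return candidates, args


print(parse_plugin_spec("MyPlugin=fou,du,fafa"))
```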

Available Plugins

For an up-to-date list of plugins available, please check CPAN:

http://search.cpan.org/search?query=App%3A%3AProve+Plugin

Writing Plugins

Please see PLUGINS in App::Prove.

 
psed

NAME

psed - a stream editor

SYNOPSIS

    psed [-an] script [file ...]
    psed [-an] [-e script] [-f script-file] [file ...]
    s2p [-an] [-e script] [-f script-file]

DESCRIPTION

A stream editor reads the input stream consisting of the specified files (or standard input, if none are given), processes it line by line by applying a script consisting of edit commands, and writes resulting lines to standard output. The filename '-' may be used to read standard input.

The edit script is composed from arguments of -e options and script-files, in the given order. A single script argument may be specified as the first parameter.

If this program is invoked with the name s2p, it will act as a sed-to-Perl translator. See SED SCRIPT TRANSLATION.

sed returns an exit code of 0 on success or >0 if an error occurred.

OPTIONS

  • -a

    A file specified as argument to the w edit command is by default opened before input processing starts. Using -a, opening of such files is delayed until the first line is actually written to the file.

  • -e script

    The editing commands defined by script are appended to the script. Multiple commands must be separated by newlines.

  • -f script-file

    Editing commands from the specified script-file are read and appended to the script.

  • -n

    By default, a line is written to standard output after the editing script has been applied to it. The -n option suppresses automatic printing.

COMMANDS

sed command syntax is defined as

[address[,address]][!]function[argument]

with whitespace being permitted before or after addresses, and between the function character and the argument. The addresses and the address inverter (!) are used to restrict the application of a command to the selected line(s) of input.

Each command must be on a line of its own, except where noted in the synopses below.

The edit cycle performed on each input line consists of reading the line (without its trailing newline character) into the pattern space, applying the applicable commands of the edit script, and writing the final contents of the pattern space and a newline to the standard output. A hold space is provided for saving the contents of the pattern space for later use.
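
The edit cycle can be modelled in a few lines. The following Python sketch is a toy model of the cycle described above, not psed's implementation: commands are plain Python functions acting on a state dictionary, addresses are ignored, and all names are invented for the illustration. The two sample commands follow the descriptions of the h and G functions given under Functions.

```python
def run_script(lines, commands, auto_print=True):
    """Apply commands to each input line, following the edit cycle."""
    output = []
    state = {"pattern": "", "hold": ""}          # pattern space and hold space
    for line in lines:
        state["pattern"] = line.rstrip("\n")     # read line without its newline
        for command in commands:                 # apply the edit script
            command(state)
        if auto_print:                           # suppressed by the -n option
            output.append(state["pattern"] + "\n")
    return "".join(output)


def cmd_h(state):
    """h: replace the hold space with the pattern space."""
    state["hold"] = state["pattern"]


def cmd_G(state):
    """G: append a newline and the hold space to the pattern space."""
    state["pattern"] += "\n" + state["hold"]


# With an empty hold space, G alone double-spaces the input.
print(run_script(["a\n", "b\n"], [cmd_G]), end="")
```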

Addresses

A sed address is either a line number or a pattern, which may be combined arbitrarily to construct ranges. Lines are numbered across all input files.

Any address may be followed by an exclamation mark ('!'), selecting all lines not matching that address.

  • number

    The line with the given number is selected.

  • $

    A dollar sign ($) is the line number of the last line of the input stream.

  • /regular expression/

    A pattern address is a basic regular expression (see BASIC REGULAR EXPRESSIONS), between the delimiting character /. Any other character except \ or newline may be used to delimit a pattern address when the initial delimiter is prefixed with a backslash ('\').

If no address is given, the command selects every line.

If one address is given, it selects the line (or lines) matching the address.

Two addresses select a range that begins whenever the first address matches, and ends (including that line) when the second address matches. If the first address is a matching pattern, the second address is not applied to the very same line to determine the end of the range. Likewise, if the second address is a matching pattern, the first address is not applied to the very same line to determine the beginning of another range. If both addresses are line numbers, and the second line number is less than the first line number, then only the first line is selected.
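
The two-address rules are easier to follow when written as a loop. The Python sketch below models them under stated assumptions: an address is either an int (line number) or a compiled regular expression, the $ address is not modelled, Python regular expressions stand in for sed's basic regular expressions, and a numeric second address at or before the starting line is treated as selecting that one line only. The function names are hypothetical.

```python
import re


def _matches(address, number, line):
    """An address is either a line number (int) or a compiled pattern."""
    if isinstance(address, int):
        return number == address
    return address.search(line) is not None


def select_range(lines, first, second):
    """Return the 1-based numbers of the lines selected by 'first,second'."""
    selected = []
    in_range = False
    for number, line in enumerate(lines, start=1):
        if in_range:
            selected.append(number)
            if _matches(second, number, line):   # end of range, inclusive
                in_range = False
        elif _matches(first, number, line):
            selected.append(number)
            # The second address is not tested on the starting line itself;
            # a numeric end that is not later than this line selects only this line.
            in_range = not (isinstance(second, int) and second <= number)
    return selected


lines = ["a", "START", "b", "END", "c"]
print(select_range(lines, re.compile("START"), re.compile("END")))
print(select_range(lines, 4, 2))
```

Because the first address is only consulted outside a range, the line that ends one range can never start another, matching the rule stated above.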

Functions

The maximum permitted number of addresses is indicated with each function synopsis below.

The argument text consists of one or more lines following the command. Embedded newlines in text must be preceded with a backslash. Other backslashes in text are deleted and the following character is taken literally.

  • [1addr]a\ text

    Write text (which must start on the line following the command) to standard output immediately before reading the next line of input, either by executing the N function or by beginning a new cycle.

  • [2addr]b [label]

    Branch to the : function with the specified label. If no label is given, branch to the end of the script.

  • [2addr]c\ text

    The line, or range of lines, selected by the address is deleted. The text (which must start on the line following the command) is written to standard output. With an address range, this occurs at the end of the range.

  • [2addr]d

    Deletes the pattern space and starts the next cycle.

  • [2addr]D

    Deletes the pattern space through the first embedded newline or to the end. If the pattern space becomes empty, a new cycle is started, otherwise execution of the script is restarted.

  • [2addr]g

    Replace the contents of the pattern space with the hold space.

  • [2addr]G

    Append a newline and the contents of the hold space to the pattern space.

  • [2addr]h

    Replace the contents of the hold space with the pattern space.

  • [2addr]H

    Append a newline and the contents of the pattern space to the hold space.

  • [1addr]i\ text

    Write the text (which must start on the line following the command) to standard output.

  • [2addr]l

    Print the contents of the pattern space: non-printable characters are shown in C-style escaped form; long lines are split and have a trailing '\' at the point of the split; the true end of a line is marked with a '$'. Escapes are: '\a', '\t', '\n', '\f', '\r', '\e' for BEL, HT, LF, FF, CR, ESC, respectively, and '\' followed by a three-digit octal number for all other non-printable characters.

  • [2addr]n

    If automatic printing is enabled, write the pattern space to the standard output. Replace the pattern space with the next line of input. If there is no more input, processing is terminated.

  • [2addr]N

    Append a newline and the next line of input to the pattern space. If there is no more input, processing is terminated.

  • [2addr]p

    Print the pattern space to the standard output. (Use the -n option to suppress automatic printing at the end of a cycle if you want to avoid double printing of lines.)

  • [2addr]P

    Prints the pattern space through the first embedded newline or to the end.

  • [1addr]q

    Branch to the end of the script and quit without starting a new cycle.

  • [1addr]r file

    Copy the contents of the file to standard output immediately before the next attempt to read a line of input. Any error encountered while reading file is silently ignored.

  • [2addr]s/regular expression/replacement/flags

    Substitute the replacement string for the first substring in the pattern space that matches the regular expression. Any character other than backslash or newline can be used instead of a slash to delimit the regular expression and the replacement. To use the delimiter as a literal character within the regular expression and the replacement, precede the character by a backslash ('\').

    Literal newlines may be embedded in the replacement string by preceding a newline with a backslash.

    Within the replacement, an ampersand ('&') is replaced by the string matching the regular expression. The strings '\1' through '\9' are replaced by the corresponding subpattern (see BASIC REGULAR EXPRESSIONS). To get a literal '&' or '\' in the replacement text, precede it by a backslash.

    The following flags modify the behaviour of the s command:

    • g

      The replacement is performed for all matching, non-overlapping substrings of the pattern space.

    • 1..9

      Replace only the n-th matching substring of the pattern space.

    • p

      If the substitution was made, print the new value of the pattern space.

    • w file

      If the substitution was made, write the new value of the pattern space to the specified file.

  • [2addr]t [label]

    Branch to the : function with the specified label if any s substitutions have been made since the most recent reading of an input line or execution of a t function. If no label is given, branch to the end of the script.

  • [2addr]w file

    The contents of the pattern space are written to the file.

  • [2addr]x

    Swap the contents of the pattern space and the hold space.

  • [2addr]y/string1/string2/

    In the pattern space, replace all characters occurring in string1 by the character at the corresponding position in string2. It is possible to use any character (other than a backslash or newline) instead of a slash to delimit the strings. Within string1 and string2, a backslash followed by any character other than a newline is that literal character, and a backslash followed by an 'n' is replaced by a newline character.

  • [1addr]=

    Prints the current line number on the standard output.

  • [0addr]: [label]

    The command specifies the position of the label. It has no other effect.

  • [2addr]{ [command]
  • [0addr]}

    These two commands begin and end a command list. The first command may be given on the same line as the opening { command. The commands within the list are jointly selected by the address(es) given on the { command (but may still have individual addresses).

  • [0addr]# [comment]

    The entire line is ignored (treated as a comment). If, however, the first two characters in the script are '#n', automatic printing of output is suppressed, as if the -n option were given on the command line.
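
The interplay of the pattern space and the hold space described above can be traced with a short Perl sketch. This is not psed's own code: reverse_lines is a hypothetical helper that mimics what the sed script 1!G;h;$!d does, assuming the usual sed meaning of the addresses 1 and $ and of the ! negation.

```perl
use strict;
use warnings;

# Hypothetical helper: emulate the sed script  1!G; h; $!d
# which emits the input lines in reverse order.
sub reverse_lines {
    my @input = @_;
    my $hold  = '';                       # hold space starts out empty
    my @output;
    for my $i (0 .. $#input) {
        my $pattern = $input[$i];         # new cycle: read a line into the pattern space
        $pattern .= "\n" . $hold if $i > 0;   # 1!G : append newline and hold space
        $hold = $pattern;                 # h   : copy pattern space to hold space
        next if $i < $#input;             # $!d : delete, start the next cycle
        push @output, $pattern;           # automatic print at the end of the cycle
    }
    return @output;
}

my @result = reverse_lines(qw(one two three));
print "$_\n" for @result;
```

Only the last cycle reaches the automatic print, by which time the pattern space holds every line, newest first, joined by embedded newlines.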

BASIC REGULAR EXPRESSIONS

A Basic Regular Expression (BRE), as defined in POSIX 1003.2, consists of atoms, for matching parts of a string, and bounds, specifying repetitions of a preceding atom.

Atoms

The possible atoms of a BRE are: ., matching any single character; ^ and $, matching the null string at the beginning or end of a string, respectively; a bracket expression, enclosed in [ and ] (see below); and any single character with no other significance (matching that character). A \ before one of ., ^, $, [, *, or \ matches the character after the backslash. A sequence of atoms enclosed in \( and \) becomes an atom and establishes the target for a backreference, consisting of the substring that actually matches the enclosed atoms. Finally, \ followed by one of the digits 0 through 9 is a backreference.

A ^ that is not first, or a $ that is not last, does not have a special significance and need not be preceded by a backslash to become literal. The same is true for a ] that does not terminate a bracket expression.

An unescaped backslash cannot be last in a BRE.

Bounds

The BRE bounds are: *, specifying 0 or more matches of the preceding atom; \{count\}, specifying that many repetitions; \{minimum,\}, giving a lower limit; and \{minimum,maximum\}, giving both a lower and an upper limit.

A bound appearing as the first item in a BRE is taken literally.

Bracket Expressions

A bracket expression is a list of characters, character ranges and character classes enclosed in [ and ] and matches any single character from the represented set of characters.

A character range is written as two characters separated by - and represents all characters (according to the character collating sequence) that are not less than the first and not greater than the second. (Ranges are very collating-sequence-dependent, and portable programs should avoid relying on them.)

A character class is one of the class names

    alnum    digit    punct
    alpha    graph    space
    blank    lower    upper
    cntrl    print    xdigit

enclosed in [: and :] and represents the set of characters as defined in ctype(3).

If the first character after [ is ^, the sense of matching is inverted.

To include a literal '^', place it anywhere else but first. To include a literal ']', place it first or immediately after an initial ^. To include a literal '-', make it the first (or second after ^) or last character, or the second endpoint of a range.

The special bracket expression constructs [[:<:]] and [[:>:]] match the null string at the beginning and end of a word respectively. (Note that neither is identical to Perl's '\b' atom.)
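
The placement rules for literal characters carry over to Perl's own bracketed character classes, so they can be tried out directly. The following is a minimal sketch with made-up sample strings; it uses Perl patterns rather than BREs, and it leaves out [[:<:]] and [[:>:]], which Perl itself does not accept.

```perl
use strict;
use warnings;

# Each entry pairs a sample string with a pattern illustrating one rule.
my @checks = (
    [ 'abc123', qr/^[[:alpha:]]+[[:digit:]]+$/ ],   # character classes inside [ ]
    [ ']',      qr/^[]a]$/  ],                      # literal ']' placed first
    [ 'x',      qr/^[^]a]$/ ],                      # ... or right after an initial ^
    [ '-',      qr/^[a-]$/  ],                      # literal '-' placed last
    [ '^',      qr/^[a^]$/  ],                      # literal '^' anywhere but first
);

# 1 for a match, 0 otherwise, in the order listed above.
my @results = map { $_->[0] =~ $_->[1] ? 1 : 0 } @checks;
print "@results\n";
```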

Additional Atoms

Since some sed implementations provide additional regular expression atoms (not defined in POSIX 1003.2), psed is capable of translating the following backslash escapes:

  • \< This is the same as [[:<:]].
  • \> This is the same as [[:>:]].
  • \w This is an abbreviation for [[:alnum:]_].
  • \W This is an abbreviation for [^[:alnum:]_].
  • \y Match the empty string at a word boundary.
  • \B Match the empty string between any two either word or non-word characters.

To enable this feature, the environment variable PSEDEXTBRE must be set to a string containing the requested characters, e.g.: PSEDEXTBRE='<>wW'.

ENVIRONMENT

The environment variable PSEDEXTBRE may be set to extend BREs. See Additional Atoms.

DIAGNOSTICS

  • ambiguous translation for character '%s' in 'y' command

    The indicated character appears twice, with different translations.

  • '[' cannot be last in pattern

    A '[' in a BRE indicates the beginning of a bracket expression.

  • '\' cannot be last in pattern

    A '\' in a BRE is used to make the subsequent character literal.

  • '\' cannot be last in substitution

    A '\' in a substitution string is used to make the subsequent character literal.

  • conflicting flags '%s'

    In an s command, either the 'g' flag and an n-th occurrence flag, or multiple n-th occurrence flags are specified. Note that only the digits '1' through '9' are permitted.

  • duplicate label %s (first defined at %s)
  • excess address(es)

    The command has more than the permitted number of addresses.

  • extra characters after command (%s)
  • illegal option '%s'
  • improper delimiter in s command

    The BRE and substitution may not be delimited with '\' or newline.

  • invalid address after ','
  • invalid backreference (%s)

    The specified backreference number exceeds the number of backreferences in the BRE.

  • invalid repeat clause '\{%s\}'

    The repeat clause does not contain a valid integer value, or pair of values.

  • malformed regex, 1st address
  • malformed regex, 2nd address
  • malformed regular expression
  • malformed substitution expression
  • malformed 'y' command argument

    The first or second string of a y command is syntactically incorrect.

  • maximum less than minimum in '\{%s\}'
  • no script command given

    There must be at least one -e or one -f option specifying a script or script file.

  • '\' not valid as delimiter in 'y' command
  • option -e requires an argument
  • option -f requires an argument
  • 's' command requires argument
  • start of unterminated '{'
  • string lengths in 'y' command differ

    The translation table strings in a y command must have equal lengths.

  • undefined label '%s'
  • unexpected '}'

    A } command without a preceding { command was encountered.

  • unexpected end of script

    The end of the script was reached although a text line after an a, c or i command indicated another line.

  • unknown command '%s'
  • unterminated '['

    A BRE contains an unterminated bracket expression.

  • unterminated '\('

    A BRE contains an unterminated backreference.

  • '\{' without closing '\}'

    A BRE contains an unterminated bounds specification.

  • '\)' without preceding '\('
  • 'y' command requires argument

EXAMPLE

The basic material for the preceding section was generated by running the sed script

    #no autoprint
    s/^.*Warn( *"\([^"]*\)".*$/\1/
    t process
    b
    :process
    s/$!/%s/g
    s/$[_[:alnum:]]\{1,\}/%s/g
    s/\\\\/\\/g
    s/^/=item /
    p

on the program's own text, and piping the output into sort -u.
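
To make the individual steps of that script easier to follow, here is a rough Perl rendering of the same transformations. It is only an illustrative sketch: extract_diag is a hypothetical name, the sample line is invented, and the BRE syntax has been rewritten by hand into Perl's (an unescaped parenthesis groups in Perl, and a literal $ must be escaped).

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the sed script step by step.
# Returns the =item line, or nothing if the line contains no Warn("...").
sub extract_diag {
    my ($line) = @_;

    # s/^.*Warn( *"\([^"]*\)".*$/\1/ followed by "t process";
    # lines without a match take the "b" branch and print nothing (#n).
    return unless $line =~ s/^.*Warn\( *"([^"]*)".*$/$1/;

    $line =~ s/\$!/%s/g;                 # s/$!/%s/g
    $line =~ s/\$[_[:alnum:]]+/%s/g;     # s/$[_[:alnum:]]\{1,\}/%s/g
    $line =~ s/\\\\/\\/g;                # s/\\\\/\\/g : two backslashes become one
    return "=item $line";                # s/^/=item / and p
}

my $sample = q{    Warn( "undefined label '$lab'", $fl );};
my $item   = extract_diag($sample);
print "$item\n" if defined $item;
```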

SED SCRIPT TRANSLATION

If this program is invoked with the name s2p it will act as a sed-to-Perl translator. After option processing (all other arguments are ignored), a Perl program is printed on standard output, which will process the input stream (as read from all arguments) in the way defined by the sed script and the option setting used for the translation.

SEE ALSO

perl(1), re_format(7)

BUGS

The l command will show escape characters (ESC) as '\e', but a vertical tab (VT) in octal.

Trailing spaces are truncated from labels in :, t and b commands.

The meaning of an empty regular expression ('//'), as defined by sed, is "the last pattern used, at run time". This deviates from the Perl interpretation, which will re-use the "last successfully executed regular expression". Since keeping track of pattern usage would create terribly cluttered code, and differences would only appear in obscure contexts (where other sed implementations appear to deviate, too), the Perl semantics were adopted. Note that common usage of this feature, such as in /abc/s//xyz/, will work as expected.

Collating elements (of bracket expressions in BREs) are not implemented.

STANDARDS

This sed implementation conforms to the IEEE Std 1003.2-1992 ("POSIX.2") definition of sed, and is compatible with the OpenBSD implementation, except where otherwise noted (see BUGS).

AUTHOR

This Perl implementation of sed was written by Wolfgang Laun, Wolfgang.Laun@alcatel.at.

COPYRIGHT and LICENSE

This program is free and open software. You may use, modify, distribute, and sell this program (and any modified variants) in any way you wish, provided you do not restrict others from doing the same.

 
pstruct

NAME

c2ph, pstruct - Dump C structures as generated from cc -g -S stabs

SYNOPSIS

    c2ph [-dpnP] [var=val] [files ...]

OPTIONS

    Options:

    -w      wide; short for: type_width=45 member_width=35 offset_width=8
    -x      hex; short for:  offset_fmt=x offset_width=08 size_fmt=x size_width=04

    -n      do not generate perl code  (default when invoked as pstruct)
    -p      generate perl code         (default when invoked as c2ph)
    -v      generate perl code, with C decls as comments

    -i      do NOT recompute sizes for intrinsic datatypes
    -a      dump information on intrinsics also

    -t      trace execution
    -d      spew reams of debugging output

    -slist  give comma-separated list of structures to dump

DESCRIPTION

The following is the old c2ph.doc documentation by Tom Christiansen <tchrist@perl.com> Date: 25 Jul 91 08:10:21 GMT

Once upon a time, I wrote a program called pstruct. It was a perl program that tried to parse out C structures and display their member offsets for you. This was especially useful for people looking at binary dumps or poking around the kernel.

Pstruct was not a pretty program. Neither was it particularly robust. The problem, you see, was that the C compiler was much better at parsing C than I could ever hope to be.

So I got smart: I decided to be lazy and let the C compiler parse the C, which would spit out debugger stabs for me to read. These were much easier to parse. It's still not a pretty program, but at least it's more robust.

Pstruct takes any .c or .h files, or preferably .s ones, since that's the format it is going to massage them into anyway, and spits out listings like this:

    struct tty {
      int               tty.t_locker            000    4
      int               tty.t_mutex_index       004    4
      struct tty *      tty.t_tp_virt           008    4
      struct clist      tty.t_rawq              00c   20
        int             tty.t_rawq.c_cc         00c    4
        int             tty.t_rawq.c_cmax       010    4
        int             tty.t_rawq.c_cfx        014    4
        int             tty.t_rawq.c_clx        018    4
        struct tty *    tty.t_rawq.c_tp_cpu     01c    4
        struct tty *    tty.t_rawq.c_tp_iop     020    4
        unsigned char * tty.t_rawq.c_buf_cpu    024    4
        unsigned char * tty.t_rawq.c_buf_iop    028    4
      struct clist      tty.t_canq              02c   20
        int             tty.t_canq.c_cc         02c    4
        int             tty.t_canq.c_cmax       030    4
        int             tty.t_canq.c_cfx        034    4
        int             tty.t_canq.c_clx        038    4
        struct tty *    tty.t_canq.c_tp_cpu     03c    4
        struct tty *    tty.t_canq.c_tp_iop     040    4
        unsigned char * tty.t_canq.c_buf_cpu    044    4
        unsigned char * tty.t_canq.c_buf_iop    048    4
      struct clist      tty.t_outq              04c   20
        int             tty.t_outq.c_cc         04c    4
        int             tty.t_outq.c_cmax       050    4
        int             tty.t_outq.c_cfx        054    4
        int             tty.t_outq.c_clx        058    4
        struct tty *    tty.t_outq.c_tp_cpu     05c    4
        struct tty *    tty.t_outq.c_tp_iop     060    4
        unsigned char * tty.t_outq.c_buf_cpu    064    4
        unsigned char * tty.t_outq.c_buf_iop    068    4
      (*int)()          tty.t_oproc_cpu         06c    4
      (*int)()          tty.t_oproc_iop         070    4
      (*int)()          tty.t_stopproc_cpu      074    4
      (*int)()          tty.t_stopproc_iop      078    4
      struct thread *   tty.t_rsel              07c    4

etc.

Actually, this was generated by a particular set of options. You can control the formatting of each column, whether you prefer wide or fat, hex or decimal, leading zeroes or whatever.

All you need to be able to use this is a C compiler that generates BSD/GCC-style stabs. The -g option on native BSD compilers and GCC should get this for you.

To learn more, just type a bogus option, like -\?, and a long usage message will be provided. There are a fair number of possibilities.

If you're only a C programmer, then this is the end of the message for you. You can quit right now, and if you care to, save off the source and run it when you feel like it. Or not.

But if you're a perl programmer, then for you I have something much more wondrous than just a structure offset printer.

You see, if you call pstruct by its other incybernation, c2ph, you have a code generator that translates C code into perl code! Well, structure and union declarations at least, but that's quite a bit.

Prior to this point, anyone programming in perl who wanted to interact with C programs, like the kernel, was forced to guess the layouts of the C structures, and then hardwire these into his program. Of course, when you took your wonderfully crafted program to a system where the sgtty structure was laid out differently, your program broke. Which is a shame.

We've had Larry's h2ph translator, which helped, but that only works on cpp symbols, not real C, which was also very much needed. What I offer you is a symbolic way of getting at all the C structures. I've couched them in terms of packages and functions. Consider the following program:

    #!/usr/local/bin/perl
    require 'syscall.ph';
    require 'sys/time.ph';
    require 'sys/resource.ph';

    # Allocate a zero-filled buffer the size of struct rusage.
    $ru = "\0" x &rusage'sizeof();

    syscall(&SYS_getrusage, &RUSAGE_SELF, $ru) && die "getrusage: $!";

    # Unpack the whole structure using the generated pack template.
    @ru = unpack($t = &rusage'typedef(), $ru);

    $utime =  $ru[ &rusage'ru_utime + &timeval'tv_sec  ]
           + ($ru[ &rusage'ru_utime + &timeval'tv_usec ]) / 1e6;

    $stime =  $ru[ &rusage'ru_stime + &timeval'tv_sec  ]
           + ($ru[ &rusage'ru_stime + &timeval'tv_usec ]) / 1e6;

    printf "you have used %8.3fu+%8.3fs seconds.\n", $utime, $stime;

As you see, the name of the package is the name of the structure. Regular fields are just their own names. Plus the following accessor functions are provided for your convenience:

    struct      This takes no arguments, and is merely the number of first-level
                elements in the structure.  You would use this for indexing
                into arrays of structures, perhaps like this

                    $usec = $u[ &user'u_utimer
                                + (&ITIMER_VIRTUAL * &itimerval'struct)
                                + &itimerval'it_value
                                + &timeval'tv_usec
                              ];

    sizeof      Returns the bytes in the structure, or the member if
                you pass it an argument, such as

                    &rusage'sizeof(&rusage'ru_utime)

    typedef     This is the perl format definition for passing to pack and
                unpack.  If you ask for the typedef of nothing, you get
                the whole structure, otherwise you get that of the member
                you ask for.  Padding is taken care of, as is the magic to
                guarantee that a union is unpacked into all its aliases.
                Bitfields are not quite yet supported however.

    offsetof    This function is the byte offset into the array of that
                member.  You may wish to use this for indexing directly
                into the packed structure with vec() if you're too lazy
                to unpack it.

    typeof      Not to be confused with the typedef accessor function, this
                one returns the C type of that field.  This would allow
                you to print out a nice structured pretty print of some
                structure without knowing anything about it beforehand.
                No args to this one is a noop.  Someday I'll post such
                a thing to dump out your u structure for you.
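
For readers who have never used pack and unpack on C data, the following self-contained sketch shows the kind of work that the sizeof and typedef accessors take off your hands. Nothing here is generated by c2ph: the template 'l l' is an assumption standing in for a timeval-like structure with two 32-bit signed fields, and the sample values are invented.

```perl
use strict;
use warnings;

# Assumed layout for illustration only: two 32-bit signed integers
# (seconds, microseconds).  A c2ph-generated typedef() would supply
# the real template for the platform at hand.
my $template = 'l l';

# What a sizeof() accessor would report for this layout.
my $sizeof = length pack($template, 0, 0);

# Build a raw buffer as a system call might return it, then take it apart.
my $packed       = pack($template, 12, 345_678);
my ($sec, $usec) = unpack($template, $packed);
my $seconds      = $sec + $usec / 1e6;

printf "%d bytes, %.6f seconds\n", $sizeof, $seconds;
```

Hardwiring a template like this is exactly what breaks when the structure is laid out differently on another system, which is the problem the generated accessors are meant to solve.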

The way I see this being used is like basically this:

    % h2ph <some_include_file.h  >  /usr/lib/perl/tmp.ph
    % c2ph  some_include_file.h  >> /usr/lib/perl/tmp.ph
    % install

It's a little trickier with c2ph because you have to get the includes right. I can't know this for your system, but it's not usually too terribly difficult.

The code isn't pretty as I mentioned -- I never thought it would be a 1000-line program when I started, or I might not have begun. :-) But I would have been less cavalier in how the parts of the program communicated with each other, etc. It might also have helped if I didn't have to divine the makeup of the stabs on the fly, and then account for micro differences between my compiler and gcc.

Anyway, here it is. Should run on perl v4 or greater. Maybe less.

    --tom
 
ptar

NAME

ptar - a tar-like program written in perl

DESCRIPTION

ptar is a small, tar look-alike program that uses the perl module Archive::Tar to extract, create and list tar archives.

SYNOPSIS

    ptar -c [-v] [-z] [-C] [-f ARCHIVE_FILE | -] FILE FILE ...
    ptar -c [-v] [-z] [-C] [-T index | -] [-f ARCHIVE_FILE | -]
    ptar -x [-v] [-z] [-f ARCHIVE_FILE | -]
    ptar -t [-z] [-f ARCHIVE_FILE | -]
    ptar -h

OPTIONS

    c   Create ARCHIVE_FILE or STDOUT (-) from FILE
    x   Extract from ARCHIVE_FILE or STDIN (-)
    t   List the contents of ARCHIVE_FILE or STDIN (-)
    f   Name of the ARCHIVE_FILE to use. Default is './default.tar'
    z   Read/Write zlib compressed ARCHIVE_FILE (not always available)
    v   Print filenames as they are added or extracted from ARCHIVE_FILE
    h   Prints this help message
    C   CPAN mode - drop 022 from permissions
    T   get names to create from file
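
Since ptar is a front end to Archive::Tar, the create and list operations can also be driven from Perl directly. The sketch below is not taken from ptar's source; it only uses the module's documented new, add_data, write, read, list_files and get_content methods, and the archive name and member contents are made up for the example.

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Temp qw(tempdir);

# Work in a scratch directory so nothing is left behind.
my $dir     = tempdir(CLEANUP => 1);
my $archive = "$dir/example.tar";            # hypothetical archive name

# Roughly what "ptar -c -f example.tar ..." amounts to,
# except that the member is created from a string instead of a file.
my $out = Archive::Tar->new;
$out->add_data('hello.txt', "hello, world\n");
$out->write($archive) or die $out->error;

# Roughly what "ptar -t -f example.tar" amounts to.
my $in = Archive::Tar->new;
$in->read($archive) or die $in->error;
my @names   = $in->list_files;
my $content = $in->get_content('hello.txt');

print "$_\n" for @names;
```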

SEE ALSO

tar(1), Archive::Tar.
 
ptardiff

NAME

ptardiff - program that diffs an extracted archive against an unextracted one

DESCRIPTION

ptardiff is a small program that diffs an extracted archive against an unextracted one, using the perl module Archive::Tar.

This effectively lets you view changes made to an archive's contents.

Provide the program with an ARCHIVE_FILE and it will look up all the files within the archive, scan the current working directory for a file with the same name, and diff it against the contents of the archive.

SYNOPSIS

    ptardiff ARCHIVE_FILE
    ptardiff -h

    $ tar -xzf Acme-Buffy-1.3.tar.gz
    $ vi Acme-Buffy-1.3/README
    [...]
    $ ptardiff Acme-Buffy-1.3.tar.gz > README.patch

OPTIONS

    h   Prints this help message

SEE ALSO

tar(1), Archive::Tar.

 
re

NAME

re - Perl pragma to alter regular expression behaviour

SYNOPSIS

    use re 'taint';
    ($x) = ($^X =~ /^(.*)$/s);     # $x is tainted here

    $pat = '(?{ $foo = 1 })';
    use re 'eval';
    /foo${pat}bar/;                # won't fail (when not under -T
                                   # switch)

    {
        no re 'taint';             # the default
        ($x) = ($^X =~ /^(.*)$/s); # $x is not tainted here

        no re 'eval';              # the default
        /foo${pat}bar/;            # disallowed (with or without -T
                                   # switch)
    }

    use re '/ix';
    "FOO" =~ / foo /;              # /ix implied
    no re '/x';
    "FOO" =~ /foo/;                # just /i implied

    use re 'debug';                # output debugging info during
    /^(.*)$/s;                     # compile and run time

    use re 'debugcolor';           # same as 'debug', but with colored
                                   # output
    ...

    use re qw(Debug All);          # Same as "use re 'debug'", but you
                                   # can use "Debug" with things other
                                   # than 'All'
    use re qw(Debug More);         # 'All' plus output more details
    no re qw(Debug ALL);           # Turn off (almost) all re debugging
                                   # in this scope

    use re qw(is_regexp regexp_pattern); # import utility functions
    my ($pat,$mods)=regexp_pattern(qr/foo/i);
    if (is_regexp($obj)) {
        print "Got regexp: ",
            scalar regexp_pattern($obj); # just as perl would stringify
    }                                    # it but no hassle with blessed
                                         # re's.

(We use $^X in these examples because it's tainted by default.)

DESCRIPTION

'taint' mode

When use re 'taint' is in effect, and a tainted string is the target of a regexp, the regexp memories (or values returned by the m// operator in list context) are tainted. This feature is useful when regexp operations on tainted data aren't meant to extract safe substrings, but to perform other transformations.

'eval' mode

When use re 'eval' is in effect, a regexp is allowed to contain (?{ ... }) zero-width assertions and (??{ ... }) postponed subexpressions that are derived from variable interpolation, rather than appearing literally within the regexp. That is normally disallowed, since it is a potential security risk. Note that this pragma is ignored when the regular expression is obtained from tainted data, i.e. evaluation is always disallowed with tainted regular expressions. See (?{ code }) in perlre and (??{ code }) in perlre.

For the purpose of this pragma, interpolation of precompiled regular expressions (i.e., the result of qr//) is not considered variable interpolation. Thus:

    /foo${pat}bar/

is allowed if $pat is a precompiled regular expression, even if $pat contains (?{ ... }) assertions or (??{ ... }) subexpressions.
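
The difference between the three cases can be observed with a small experiment. The sketch below is only illustrative: the counter $hits and the sample string are invented, the failing case is wrapped in eval so that the script keeps running, and the exact wording of the error message may vary between perl versions.

```perl
use strict;
use warnings;

our $hits = 0;                      # bumped by the embedded code block

# A code block written literally inside qr// is compiled with the program,
# so interpolating the resulting object needs no pragma.
my $compiled = qr/(?{ $hits++ })/;
my $via_qr   = eval { "foobar" =~ /foo${compiled}bar/ ? 1 : 0 };

# The same text held in a plain string is refused at run time by default.
my $text       = '(?{ $hits++ })';
my $via_string = eval { "foobar" =~ /foo${text}bar/ ? 1 : 0 };
my $refusal    = $@;

# Within the lexical scope of use re 'eval' the string form is accepted.
my $via_pragma;
{
    use re 'eval';
    $via_pragma = "foobar" =~ /foo${text}bar/ ? 1 : 0;
}

print "qr: ", ($via_qr // 'died'),
      ", string: ", ($via_string // 'died'),
      ", with pragma: $via_pragma\n";
```

Keeping use re 'eval' inside the smallest possible block, as here, limits the scope in which interpolated code is trusted.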

'/flags' mode

When use re '/flags' is specified, the given flags are automatically added to every regular expression till the end of the lexical scope.

no re '/flags' will turn off the effect of use re '/flags' for the given flags.

For example, if you want all your regular expressions to have /msx on by default, simply put

    use re '/msx';

at the top of your code.
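
Because the effect is lexically scoped, it can also be confined to a block. A minimal sketch, with invented sample strings, showing the flags applying inside the block and not after it:

```perl
use strict;
use warnings;

my ($inside, $spaced, $outside);
{
    use re '/ix';                                  # applies until the end of this block
    $inside = "FOO" =~ /foo/     ? 1 : 0;          # /i implied
    $spaced = "FOO" =~ / f o o / ? 1 : 0;          # /x implied: whitespace is ignored
}
$outside = "FOO" =~ /foo/ ? 1 : 0;                 # no flags implied any more

print "inside=$inside spaced=$spaced outside=$outside\n";
```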

The character set /adul flags cancel each other out. So, in this example,

    use re "/u";
    "ss" =~ /\xdf/;
    use re "/d";
    "ss" =~ /\xdf/;

the second use re does an implicit no re '/u'.

Turning on one of the character set flags with use re takes precedence over the locale pragma and the 'unicode_strings' feature, for regular expressions. Turning off one of these flags when it is active reverts to the behaviour specified by whatever other pragmata are in scope. For example:

    use feature "unicode_strings";
    no re "/u"; # does nothing
    use re "/l";
    no re "/l"; # reverts to unicode_strings behaviour

'debug' mode

When use re 'debug' is in effect, perl emits debugging messages when compiling and using regular expressions. The output is the same as that obtained by running a -DDEBUGGING-enabled perl interpreter with the -Dr switch. It may be quite voluminous depending on the complexity of the match. Using debugcolor instead of debug enables a form of output that can be used to get a colorful display on terminals that understand termcap color sequences. Set $ENV{PERL_RE_TC} to a comma-separated list of termcap properties to use for highlighting strings on/off, pre-point part on/off. See Debugging Regular Expressions in perldebug for additional info.

As of 5.9.5 the directive use re 'debug' and its equivalents are lexically scoped, as the other directives are. However they have both compile-time and run-time effects.

See Pragmatic Modules in perlmodlib.

'Debug' mode

Similarly use re 'Debug' produces debugging output, the difference being that it allows the fine tuning of what debugging output will be emitted. Options are divided into three groups, those related to compilation, those related to execution and those related to special purposes. The options are as follows:

  • Compile related options
    • COMPILE

      Turns on all compile related debug options.

    • PARSE

      Turns on debug output related to the process of parsing the pattern.

    • OPTIMISE

      Enables output related to the optimisation phase of compilation.

    • TRIEC

      Detailed info about trie compilation.

    • DUMP

      Dump the final program out after it is compiled and optimised.

  • Execute related options
    • EXECUTE

      Turns on all execute related debug options.

    • MATCH

      Turns on debugging of the main matching loop.

    • TRIEE

      Extra debugging of how tries execute.

    • INTUIT

      Enable debugging of start-point optimisations.

  • Extra debugging options
    • EXTRA

      Turns on all "extra" debugging options.

    • BUFFERS

      Enable debugging the capture group storage during match. Warning, this can potentially produce extremely large output.

    • TRIEM

      Enable enhanced TRIE debugging. Enhances both TRIEE and TRIEC.

    • STATE

      Enable debugging of states in the engine.

    • STACK

      Enable debugging of the recursion stack in the engine. Enabling or disabling this option automatically does the same for debugging states as well. The output from this can be quite large.

    • OPTIMISEM

      Enable enhanced optimisation debugging and start-point optimisations. Probably not useful except when debugging the regexp engine itself.

    • OFFSETS

      Dump offset information. This can be used to see how regops correlate to the pattern. Output format is

          NODENUM:POSITION[LENGTH]

      Where 1 is the position of the first char in the string. Note that position can be 0, or larger than the actual length of the pattern, likewise length can be zero.

    • OFFSETSDBG

      Enable debugging of offsets information. This emits copious amounts of trace information and doesn't mesh well with other debug options.

      Almost definitely only useful to people hacking on the offsets part of the debug engine.

  • Other useful flags

    These are useful shortcuts to save on the typing.

    • ALL

      Enable all options at once except OFFSETS, OFFSETSDBG and BUFFERS. (To get every single option without exception, use both ALL and EXTRA.)

    • All

      Enable DUMP and all execute options. Equivalent to:

          use re 'debug';

    • MORE
    • More

      Enable the options enabled by "All", plus STATE, TRIEC, and TRIEM.

As of 5.9.5 the directive use re 'debug' and its equivalents are lexically scoped, as are the other directives. However they have both compile-time and run-time effects.
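
As a small illustration of selecting a single option, the sketch below asks only for DUMP and confines it to one block. The choice of option and the sample pattern are arbitrary; the debugging text itself goes to standard error and its format depends on the perl version, so none of it is reproduced here.

```perl
use strict;
use warnings;

my $matched;
{
    use re qw(Debug DUMP);     # only this block's patterns are dumped
    $matched = "foobar" =~ /o+b/ ? 1 : 0;
}

# Patterns compiled outside the block produce no debugging output.
my $quiet = "foobar" =~ /a+r/ ? 1 : 0;

print "matched=$matched quiet=$quiet\n";
```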

Exportable Functions

As of perl 5.9.5, re provides a number of utility functions that may be optionally exported into the caller's namespace. They are listed below.

  • is_regexp($ref)

    Returns true if the argument is a compiled regular expression as returned by qr//, false if it is not.

    This function will not be confused by overloading or blessing. In internals terms, this extracts the regexp pointer out of the PERL_MAGIC_qr structure so it cannot be fooled.

  • regexp_pattern($ref)

    If the argument is a compiled regular expression as returned by qr//, then this function returns the pattern.

    In list context it returns a two element list, the first element containing the pattern and the second containing the modifiers used when the pattern was compiled.

        my ($pat, $mods) = regexp_pattern($ref);

    In scalar context it returns the same as perl would when stringifying a raw qr// with the same pattern inside. If the argument is not a compiled regular expression then this routine returns false but defined in scalar context, and the empty list in list context. Thus the following

        if (regexp_pattern($ref) eq '(?^i:foo)')

    will be warning free regardless of what $ref actually is.

    Like is_regexp this function will not be confused by overloading or blessing of the object.

  • regmust($ref)

    If the argument is a compiled regular expression as returned by qr//, then this function returns what the optimiser considers to be the longest anchored fixed string and longest floating fixed string in the pattern.

    A fixed string is defined as being a substring that must appear for the pattern to match. An anchored fixed string is a fixed string that must appear at a particular offset from the beginning of the match. A floating fixed string is defined as a fixed string that can appear at any point in a range of positions relative to the start of the match. For example,

        my $qr = qr/here .* there/x;
        my ($anchored, $floating) = regmust($qr);
        print "anchored:'$anchored'\nfloating:'$floating'\n";

    results in

        anchored:'here'
        floating:'there'

    Because the here is before the .* in the pattern, its position can be determined exactly. That's not true, however, for the there; it could appear at any point after where the anchored string appeared. Perl uses both for its optimisations, preferring the longer, or, if they are equal, the floating.

    NOTE: This may not necessarily be the definitive longest anchored and floating string. This will be what the optimiser of the Perl that you are using thinks is the longest. If you believe that the result is wrong please report it via the perlbug utility.

  • regname($name,$all)

    Returns the contents of a named buffer of the last successful match. If $all is true, then returns an array ref containing one entry per buffer, otherwise returns the first defined buffer.

  • regnames($all)

    Returns a list of all of the named buffers defined in the last successful match. If $all is true, then it returns all names defined, if not it returns only names which were involved in the match.

  • regnames_count()

    Returns the number of distinct names defined in the pattern used for the last successful match.

    Note: this result is always the actual number of distinct named buffers defined; it may not match the number of names returned by regnames() and related routines when those routines have not been called with the $all parameter set.
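
The functions described above can be exercised together in a short script. This is only a sketch: the class name My::Class, the sample date string and the capture names year, month and tz are made up for illustration, and the strings reported by regmust depend on the optimiser of the Perl you are running.

```perl
use strict;
use warnings;
use re qw(is_regexp regexp_pattern regmust regname regnames regnames_count);

# regexp_pattern: list context yields the pattern and its modifiers.
my $qr = qr/foo/i;
my ($pat, $mods) = regexp_pattern($qr);
print "pattern '$pat', modifiers '$mods'\n";

# is_regexp is not fooled by blessing; a plain string is not a regexp.
my $blessed = bless qr/bar/, 'My::Class';    # hypothetical class name
printf "blessed qr: %s, string: %s\n",
    (is_regexp($blessed) ? 'regexp' : 'not a regexp'),
    (is_regexp('bar')    ? 'regexp' : 'not a regexp');

# regmust: the fixed strings the optimiser believes must appear.
my ($anchored, $floating) = regmust(qr/here .* there/x);
print "anchored:'$anchored'\nfloating:'$floating'\n";

# Named buffers of the last successful match; (?<tz>Z)? does not take part.
my ($count, @matched, @all, $year);
if ('2014-02' =~ /(?<year>\d{4})-(?<month>\d\d)(?<tz>Z)?/) {
    $count   = regnames_count();                 # names defined in the pattern
    @matched = sort { $a cmp $b } regnames();    # names involved in the match
    @all     = sort { $a cmp $b } regnames(1);   # every name defined
    $year    = regname('year');                  # contents of one buffer
}
print "$count names; matched: @matched; all: @all; year: $year\n";
```

Calling regnames() without a true $all leaves out tz here, while regnames_count() still counts it, which is the discrepancy the note above warns about.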

SEE ALSO

Pragmatic Modules in perlmodlib.

 
s2p

NAME

psed - a stream editor

SYNOPSIS

    psed [-an] script [file ...]
    psed [-an] [-e script] [-f script-file] [file ...]
    s2p [-an] [-e script] [-f script-file]

DESCRIPTION

A stream editor reads the input stream consisting of the specified files (or standard input, if none are given), processes it line by line by applying a script consisting of edit commands, and writes resulting lines to standard output. The filename '-' may be used to read standard input.

The edit script is composed from arguments of -e options and script-files, in the given order. A single script argument may be specified as the first parameter.

If this program is invoked with the name s2p, it will act as a sed-to-Perl translator. See SED SCRIPT TRANSLATION.

sed returns an exit code of 0 on success or >0 if an error occurred.

OPTIONS

  • -a

    A file specified as argument to the w edit command is by default opened before input processing starts. Using -a, opening of such files is delayed until the first line is actually written to the file.

  • -e script

    The editing commands defined by script are appended to the script. Multiple commands must be separated by newlines.

  • -f script-file

    Editing commands from the specified script-file are read and appended to the script.

  • -n

    By default, a line is written to standard output after the editing script has been applied to it. The -n option suppresses automatic printing.

COMMANDS

sed command syntax is defined as

[address[,address]][!]function[argument]

with whitespace being permitted before or after addresses, and between the function character and the argument. The addresses and the address inverter (!) are used to restrict the application of a command to the selected line(s) of input.

Each command must be on a line of its own, except where noted in the synopses below.

The edit cycle performed on each input line consists of reading the line (without its trailing newline character) into the pattern space, applying the applicable commands of the edit script, and writing the final contents of the pattern space and a newline to the standard output. A hold space is provided for saving the contents of the pattern space for later use.
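
The edit cycle can be pictured as a plain Perl loop. The following is a minimal sketch, not the code psed generates: the input lines are made up, and the "script" is hard-wired to the equivalent of s/a/A/g followed by H.

```perl
use strict;
use warnings;

# Hypothetical input standing in for the files named on the command line.
my @input = ("alpha\n", "beta\n", "gamma\n");

my $autoprint = 1;     # would be cleared by the -n option
my $hold      = '';    # hold space, kept from one cycle to the next
my @output;

for my $line (@input) {
    my $pattern = $line;
    chomp $pattern;                  # pattern space excludes the newline

    # The "script": equivalent of  s/a/A/g  followed by  H
    $pattern =~ s/a/A/g;
    $hold .= "\n" . $pattern;        # H: append newline + pattern space

    # End of cycle: write the pattern space and a newline.
    push @output, "$pattern\n" if $autoprint;
}
print @output;
```

The pattern space is rebuilt for every input line, whereas the hold space survives across cycles, which is what makes commands such as h, H, g, G and x useful.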

Addresses

A sed address is either a line number or a pattern, which may be combined arbitrarily to construct ranges. Lines are numbered across all input files.

Any address may be followed by an exclamation mark ('!'), selecting all lines not matching that address.

  • number

    The line with the given number is selected.

  • $

    A dollar sign ($) is the line number of the last line of the input stream.

  • /regular expression/

    A pattern address is a basic regular expression (see BASIC REGULAR EXPRESSIONS), between the delimiting character /. Any other character except \ or newline may be used to delimit a pattern address when the initial delimiter is prefixed with a backslash ('\').

If no address is given, the command selects every line.

If one address is given, it selects the line (or lines) matching the address.

Two addresses select a range that begins whenever the first address matches, and ends (including that line) when the second address matches. If the first address is a matching pattern, the second address is not applied to the very same line to determine the end of the range. Likewise, if the second address is a matching pattern, the first address is not applied to the very same line to determine the beginning of another range. If both addresses are line numbers, and the second line number is less than the first line number, then only the first line is selected.
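
Perl's three-dot range operator in scalar context follows the same rule of not testing the second pattern on the line that opened the range, so a pattern range such as /BEGIN/,/END/ can be sketched as below. The sample lines are invented, and this is an illustration of the semantics rather than the code s2p emits.

```perl
use strict;
use warnings;

# Hypothetical input; the second line matches both patterns.
my @lines = ('one', 'BEGIN END', 'two', 'END', 'three');

my @selected;
for (@lines) {
    # Roughly sed's  /BEGIN/,/END/ : with three dots the right-hand
    # pattern is not tried on the line where the left one matched.
    push @selected, $_ if /BEGIN/ ... /END/;
}
print "$_\n" for @selected;
```

With the two-dot form (/BEGIN/ .. /END/) the range would open and close on the 'BEGIN END' line alone, which is not what sed does.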

Functions

The maximum permitted number of addresses is indicated with each function synopsis below.

The argument text consists of one or more lines following the command. Embedded newlines in text must be preceded with a backslash. Other backslashes in text are deleted and the following character is taken literally.

  • [1addr]a\ text

    Write text (which must start on the line following the command) to standard output immediately before reading the next line of input, either by executing the N function or by beginning a new cycle.

  • [2addr]b [label]

    Branch to the : function with the specified label. If no label is given, branch to the end of the script.

  • [2addr]c\ text

    The line, or range of lines, selected by the address is deleted. The text (which must start on the line following the command) is written to standard output. With an address range, this occurs at the end of the range.

  • [2addr]d

    Deletes the pattern space and starts the next cycle.

  • [2addr]D

    Deletes the pattern space through the first embedded newline or to the end. If the pattern space becomes empty, a new cycle is started, otherwise execution of the script is restarted.

  • [2addr]g

    Replace the contents of the pattern space with the hold space.

  • [2addr]G

    Append a newline and the contents of the hold space to the pattern space.

  • [2addr]h

    Replace the contents of the hold space with the pattern space.

  • [2addr]H

    Append a newline and the contents of the pattern space to the hold space.

  • [1addr]i\ text

    Write the text (which must start on the line following the command) to standard output.

  • [2addr]l

    Print the contents of the pattern space: non-printable characters are shown in C-style escaped form; long lines are split and have a trailing '\' at the point of the split; the true end of a line is marked with a '$'. Escapes are: '\a', '\t', '\n', '\f', '\r', '\e' for BEL, HT, LF, FF, CR, ESC, respectively, and '\' followed by a three-digit octal number for all other non-printable characters.

  • [2addr]n

    If automatic printing is enabled, write the pattern space to the standard output. Replace the pattern space with the next line of input. If there is no more input, processing is terminated.

  • [2addr]N

    Append a newline and the next line of input to the pattern space. If there is no more input, processing is terminated.

  • [2addr]p

    Print the pattern space to the standard output. (Use the -n option to suppress automatic printing at the end of a cycle if you want to avoid double printing of lines.)

  • [2addr]P

    Prints the pattern space through the first embedded newline or to the end.

  • [1addr]q

    Branch to the end of the script and quit without starting a new cycle.

  • [1addr]r file

    Copy the contents of the file to standard output immediately before the next attempt to read a line of input. Any error encountered while reading file is silently ignored.

  • [2addr]s/regular expression/replacement/flags

    Substitute the replacement string for the first substring in the pattern space that matches the regular expression. Any character other than backslash or newline can be used instead of a slash to delimit the regular expression and the replacement. To use the delimiter as a literal character within the regular expression and the replacement, precede the character by a backslash ('\').

    Literal newlines may be embedded in the replacement string by preceding a newline with a backslash.

    Within the replacement, an ampersand ('&') is replaced by the string matching the regular expression. The strings '\1' through '\9' are replaced by the corresponding subpattern (see BASIC REGULAR EXPRESSIONS). To get a literal '&' or '\' in the replacement text, precede it by a backslash.

    The following flags modify the behaviour of the s command:

    • g

      The replacement is performed for all matching, non-overlapping substrings of the pattern space.

    • 1..9

      Replace only the n-th matching substring of the pattern space.

    • p

      If the substitution was made, print the new value of the pattern space.

    • w file

      If the substitution was made, write the new value of the pattern space to the specified file.

  • [2addr]t [label]

    Branch to the : function with the specified label if any s substitutions have been made since the most recent reading of an input line or execution of a t function. If no label is given, branch to the end of the script.

  • [2addr]w file

    The contents of the pattern space are written to the file.

  • [2addr]x

    Swap the contents of the pattern space and the hold space.

  • [2addr]y/string1/string2/

    In the pattern space, replace all characters occurring in string1 by the character at the corresponding position in string2. It is possible to use any character (other than a backslash or newline) instead of a slash to delimit the strings. Within string1 and string2, a backslash followed by any character other than a newline is that literal character, and a backslash followed by an 'n' is replaced by a newline character.

  • [1addr]=

    Prints the current line number on the standard output.

  • [0addr]: [label]

    The command specifies the position of the label. It has no other effect.

  • [2addr]{ [command]
  • [0addr]}

    These two commands begin and end a command list. The first command may be given on the same line as the opening { command. The commands within the list are jointly selected by the address(es) given on the { command (but may still have individual addresses).

  • [0addr]# [comment]

    The entire line is ignored (treated as a comment). If, however, the first two characters in the script are '#n', automatic printing of output is suppressed, as if the -n option were given on the command line.
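
The numeric flag of the s function (replace only the n-th match) has no direct counterpart in Perl's s/// operator. One way to express it is with a counter inside an evaluated replacement. The helper name subst_nth is hypothetical, the replacement is treated as a literal string (no '&' or '\1' handling), and nothing here is taken from psed's own translation.

```perl
use strict;
use warnings;

# Replace only the $nth match of $re in $text with the literal $repl,
# in the spirit of sed's  s/re/repl/N .
sub subst_nth {
    my ($text, $re, $repl, $nth) = @_;
    my $seen = 0;
    # Every match is visited; all but the $nth are put back unchanged.
    $text =~ s/($re)/ ++$seen == $nth ? $repl : $1 /ge;
    return $text;
}

my $pattern_space = 'a-b-c-d';
my $result = subst_nth($pattern_space, qr/-/, '+', 2);
print "$result\n";
```

Here only the second hyphen is replaced; the first and third are matched but restored.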

BASIC REGULAR EXPRESSIONS

A Basic Regular Expression (BRE), as defined in POSIX 1003.2, consists of atoms, for matching parts of a string, and bounds, specifying repetitions of a preceding atom.

Atoms

The possible atoms of a BRE are: ., matching any single character; ^ and $, matching the null string at the beginning or end of a string, respectively; a bracket expression, enclosed in [ and ] (see below); and any single character with no other significance (matching that character). A \ before one of ., ^, $, [, *, \ matches the character after the backslash. A sequence of atoms enclosed in \( and \) becomes an atom and establishes the target for a backreference, consisting of the substring that actually matches the enclosed atoms. Finally, \ followed by one of the digits 0 through 9 is a backreference.

A ^ that is not first, or a $ that is not last, does not have a special significance and need not be preceded by a backslash to become literal. The same is true for a ] that does not terminate a bracket expression.

An unescaped backslash cannot be last in a BRE.

Bounds

The BRE bounds are: *, specifying 0 or more matches of the preceding atom; \{count\}, specifying that many repetitions; \{minimum,\}, giving a lower limit; and \{minimum,maximum\} finally defines a lower and upper bound.

A bound appearing as the first item in a BRE is taken literally.
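
For readers more used to Perl's regular expression syntax, the following sketch shows hand-written Perl equivalents of two BREs. The patterns and test strings are chosen for illustration only, and the Perl forms are manual translations, not necessarily what s2p would produce.

```perl
use strict;
use warnings;

# BRE  ^\(ab\)\{2,3\}c$  written by hand in Perl syntax:
# \( \) become ( ), and \{ \} become { }.
my $grouped = qr/^(ab){2,3}c$/;

# BRE  *a  : a bound that comes first is literal, so Perl needs \* here.
my $leading_star = qr/\*a/;

my %result;
for my $s ('abc', 'ababc', 'abababc') {
    $result{$s} = ($s =~ $grouped) ? 'match' : 'no match';
    print "$s: $result{$s}\n";
}

my $star_ok = ('2*a' =~ $leading_star) ? 1 : 0;
print "literal star matched: $star_ok\n";
```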

Bracket Expressions

A bracket expression is a list of characters, character ranges and character classes enclosed in [ and ] and matches any single character from the represented set of characters.

A character range is written as two characters separated by - and represents all characters (according to the character collating sequence) that are not less than the first and not greater than the second. (Ranges are very collating-sequence-dependent, and portable programs should avoid relying on them.)

A character class is one of the class names

    alnum   digit   punct
    alpha   graph   space
    blank   lower   upper
    cntrl   print   xdigit

enclosed in [: and :] and represents the set of characters as defined in ctype(3).

If the first character after [ is ^, the sense of matching is inverted.

To include a literal '^', place it anywhere else but first. To include a literal ']' place it first or immediately after an initial ^. To include a literal '-' make it the first (or second after ^) or last character, or the second endpoint of a range.

The special bracket expression constructs [[:<:]] and [[:>:]] match the null string at the beginning and end of a word respectively. (Note that neither is identical to Perl's '\b' atom.)

Additional Atoms

Since some sed implementations provide additional regular expression atoms (not defined in POSIX 1003.2), psed is capable of translating the following backslash escapes:

  • \< This is the same as [[:<:]].
  • \> This is the same as [[:>:]].
  • \w This is an abbreviation for [[:alnum:]_].
  • \W This is an abbreviation for [^[:alnum:]_].
  • \y Match the empty string at a word boundary.
  • \B Match the empty string between two word characters or between two non-word characters.

To enable this feature, the environment variable PSEDEXTBRE must be set to a string containing the requested characters, e.g.: PSEDEXTBRE='<>wW'.

ENVIRONMENT

The environment variable PSEDEXTBRE may be set to extend BREs. See Additional Atoms.

DIAGNOSTICS

  • ambiguous translation for character '%s' in 'y' command

    The indicated character appears twice, with different translations.

  • '[' cannot be last in pattern

    A '[' in a BRE indicates the beginning of a bracket expression.

  • '\' cannot be last in pattern

    A '\' in a BRE is used to make the subsequent character literal.

  • '\' cannot be last in substitution

    A '\' in a substitution string is used to make the subsequent character literal.

  • conflicting flags '%s'

    In an s command, either the 'g' flag and an n-th occurrence flag, or multiple n-th occurrence flags are specified. Note that only the digits '1' through '9' are permitted.

  • duplicate label %s (first defined at %s)
  • excess address(es)

    The command has more than the permitted number of addresses.

  • extra characters after command (%s)
  • illegal option '%s'
  • improper delimiter in s command

    The BRE and substitution may not be delimited with '\' or newline.

  • invalid address after ','
  • invalid backreference (%s)

    The specified backreference number exceeds the number of backreferences in the BRE.

  • invalid repeat clause '\{%s\}'

    The repeat clause does not contain a valid integer value, or pair of values.

  • malformed regex, 1st address
  • malformed regex, 2nd address
  • malformed regular expression
  • malformed substitution expression
  • malformed 'y' command argument

    The first or second string of a y command is syntactically incorrect.

  • maximum less than minimum in '\{%s\}'
  • no script command given

    There must be at least one -e or one -f option specifying a script or script file.

  • '\' not valid as delimiter in 'y' command
  • option -e requires an argument
  • option -f requires an argument
  • 's' command requires argument
  • start of unterminated '{'
  • string lengths in 'y' command differ

    The translation table strings in a y command must have equal lengths.

  • undefined label '%s'
  • unexpected '}'

    A } command without a preceding { command was encountered.

  • unexpected end of script

    The end of the script was reached although a text line after an a, c or i command indicated another line.

  • unknown command '%s'
  • unterminated '['

    A BRE contains an unterminated bracket expression.

  • unterminated '\('

    A BRE contains an unterminated backreference.

  • '\{' without closing '\}'

    A BRE contains an unterminated bounds specification.

  • '\)' without preceding '\('
  • 'y' command requires argument

EXAMPLE

The basic material for the preceding section was generated by running the sed script

    #no autoprint
    s/^.*Warn( *"\([^"]*\)".*$/\1/
    t process
    b
    :process
    s/$!/%s/g
    s/$[_[:alnum:]]\{1,\}/%s/g
    s/\\\\/\\/g
    s/^/=item /
    p

on the program's own text, and piping the output into sort -u.

SED SCRIPT TRANSLATION

If this program is invoked with the name s2p it will act as a sed-to-Perl translator. After option processing (all other arguments are ignored), a Perl program is printed on standard output, which will process the input stream (as read from all arguments) in the way defined by the sed script and the option setting used for the translation.

SEE ALSO

perl(1), re_format(7)

BUGS

The l command will show escape characters (ESC) as '\e', but a vertical tab (VT) in octal.

Trailing spaces are truncated from labels in :, t and b commands.

The meaning of an empty regular expression ('//'), as defined by sed, is "the last pattern used, at run time". This deviates from the Perl interpretation, which will re-use the "last successfully executed regular expression". Since keeping track of pattern usage would create terribly cluttered code, and differences would only appear in obscure contexts (where other sed implementations appear to deviate, too), the Perl semantics was adopted. Note that common usage of this feature, such as in /abc/s//xyz/, will work as expected.

Collating elements (of bracket expressions in BREs) are not implemented.

STANDARDS

This sed implementation conforms to the IEEE Std1003.2-1992 ("POSIX.2") definition of sed, and is compatible with the OpenBSD implementation, except where otherwise noted (see BUGS).

AUTHOR

This Perl implementation of sed was written by Wolfgang Laun, Wolfgang.Laun@alcatel.at.

COPYRIGHT and LICENSE

This program is free and open software. You may use, modify, distribute, and sell this program (and any modified variants) in any way you wish, provided you do not restrict others from doing the same.

 
shasum

NAME

shasum - Print or Check SHA Checksums

SYNOPSIS

    Usage: shasum [OPTION]... [FILE]...
    Print or check SHA checksums.
    With no FILE, or when FILE is -, read standard input.

      -a, --algorithm   1 (default), 224, 256, 384, 512, 512224, 512256
      -b, --binary      read in binary mode
      -c, --check       read SHA sums from the FILEs and check them
      -t, --text        read in text mode (default)
      -p, --portable    read in portable mode
                            produces same digest on Windows/Unix/Mac
      -0, --01          read in BITS mode
                            ASCII '0' interpreted as 0-bit,
                            ASCII '1' interpreted as 1-bit,
                            all other characters ignored

    The following two options are useful only when verifying checksums:
      -s, --status      don't output anything, status code shows success
      -w, --warn        warn about improperly formatted checksum lines

      -h, --help        display this help and exit
      -v, --version     output version information and exit

    When verifying SHA-512/224 or SHA-512/256 checksums, indicate the
    algorithm explicitly using the -a option, e.g.

      shasum -a 512224 -c checksumfile

    The sums are computed as described in FIPS-180-4.  When checking, the
    input should be a former output of this program.  The default mode is to
    print a line with checksum, a character indicating type (`*' for binary,
    ` ' for text, `?' for portable, `^' for BITS), and name for each FILE.

    Report shasum bugs to mshelor@cpan.org

DESCRIPTION

Running shasum is often the quickest way to compute SHA message digests. The user simply feeds data to the script through files or standard input, and then collects the results from standard output.

The following command shows how to compute digests for typical inputs such as the NIST test vector "abc":

    perl -e "print qq(abc)" | shasum

Or, if you want to use SHA-256 instead of the default SHA-1, simply say:

    perl -e "print qq(abc)" | shasum -a 256

Since shasum mimics the behavior of the combined GNU sha1sum, sha224sum, sha256sum, sha384sum, and sha512sum programs, you can install this script as a convenient drop-in replacement.

Unlike the GNU programs, shasum encompasses the full SHA standard by allowing partial-byte inputs. This is accomplished through the BITS option (-0). The following example computes the SHA-224 digest of the 7-bit message 0001100:

    perl -e "print qq(0001100)" | shasum -0 -a 224
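
The same digests can be computed from within a Perl program by calling Digest::SHA directly, the module on which shasum is built. This is a small sketch using the module's documented sha1_hex, sha256_hex, new, add_bits and hexdigest interfaces; the variable names are arbitrary.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex sha256_hex);

# Same input as:  perl -e "print qq(abc)" | shasum
my $sha1 = sha1_hex('abc');

# Same input as:  perl -e "print qq(abc)" | shasum -a 256
my $sha256 = sha256_hex('abc');

print "$sha1\n$sha256\n";

# Counterpart of BITS mode: add_bits takes a string of '0' and '1'
# characters, so partial-byte messages such as this 7-bit one work.
my $sha224_bits = Digest::SHA->new(224)->add_bits('0001100')->hexdigest;
print "$sha224_bits\n";
```

Unlike shasum, these calls return only the hexadecimal digest, without the type character and file name that shasum appends to each line.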

AUTHOR

Copyright (c) 2003-2013 Mark Shelor <mshelor@cpan.org>.

SEE ALSO

shasum is implemented using the Perl module Digest::SHA or Digest::SHA::PurePerl.

 
sigtrap

NAME

sigtrap - Perl pragma to enable simple signal handling

SYNOPSIS

    use sigtrap;
    use sigtrap qw(stack-trace old-interface-signals);  # equivalent
    use sigtrap qw(BUS SEGV PIPE ABRT);
    use sigtrap qw(die INT QUIT);
    use sigtrap qw(die normal-signals);
    use sigtrap qw(die untrapped normal-signals);
    use sigtrap qw(die untrapped normal-signals
                   stack-trace any error-signals);
    use sigtrap 'handler' => \&my_handler, 'normal-signals';
    use sigtrap qw(handler my_handler normal-signals
                   stack-trace error-signals);

DESCRIPTION

The sigtrap pragma is a simple interface for installing signal handlers. You can have it install one of two handlers supplied by sigtrap itself (one which provides a Perl stack trace and one which simply die()s), or alternatively you can supply your own handler for it to install. It can be told to install a handler only for signals which are not already trapped or ignored. It has a couple of lists of signals to trap, plus you can supply your own list of signals.

The arguments passed to the use statement which invokes sigtrap are processed in order. When a signal name or the name of one of sigtrap's signal lists is encountered, a handler is immediately installed; when an option is encountered, it affects subsequently installed handlers.

OPTIONS

SIGNAL HANDLERS

These options affect which handler will be used for subsequently installed signals.

  • stack-trace

    The handler used for subsequently installed signals outputs a Perl stack trace to STDERR and then tries to dump core. This is the default signal handler.

  • die

    The handler used for subsequently installed signals calls die (actually croak ) with a message indicating which signal was caught.

  • handler your-handler

    your-handler will be used as the handler for subsequently installed signals. your-handler can be any value which is valid as an assignment to an element of %SIG . See perlvar for examples of handler functions.
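
As a sketch of the handler option, the script below installs its own handler for a single signal and then signals itself. It assumes a platform that provides USR1 (most Unix-like systems); the names my_handler and @caught are made up for the example.

```perl
use strict;
use warnings;

my @caught;

# Perl calls %SIG handlers with the signal name as the first argument.
sub my_handler {
    my ($signame) = @_;
    push @caught, $signame;
}

# Install my_handler for USR1 only; options and signal names are
# processed left to right.
use sigtrap 'handler' => \&my_handler, 'USR1';

kill 'USR1', $$;                      # send the signal to this process
select(undef, undef, undef, 0.1);     # brief pause so the handler can run
print "caught: @caught\n";
```

Because the handler option only affects signals listed after it, placing 'USR1' before 'handler' would have installed the default stack-trace handler instead.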

SIGNAL LISTS

sigtrap has a few built-in lists of signals to trap. They are:

  • normal-signals

    These are the signals which a program might normally expect to encounter and which by default cause it to terminate. They are HUP, INT, PIPE and TERM.

  • error-signals

    These signals usually indicate a serious problem with the Perl interpreter or with your script. They are ABRT, BUS, EMT, FPE, ILL, QUIT, SEGV, SYS and TRAP.

  • old-interface-signals

    These are the signals which were trapped by default by the old sigtrap interface. They are ABRT, BUS, EMT, FPE, ILL, PIPE, QUIT, SEGV, SYS, TERM, and TRAP. If no signals or signal lists are passed to sigtrap, this list is used.

For each of these three lists, the collection of signals set to be trapped is checked before trapping; if your architecture does not implement a particular signal, it will not be trapped but rather silently ignored.

OTHER

  • untrapped

    This token tells sigtrap to install handlers only for subsequently listed signals which aren't already trapped or ignored.

  • any

    This token tells sigtrap to install handlers for all subsequently listed signals. This is the default behavior.

  • signal

    Any argument which looks like a signal name (that is, /^[A-Z][A-Z0-9]*$/ ) indicates that sigtrap should install a handler for that name.

  • number

    Require that at least version number of sigtrap is being used.

EXAMPLES

Provide a stack trace for the old-interface-signals:

    use sigtrap;

Ditto:

    use sigtrap qw(stack-trace old-interface-signals);

Provide a stack trace on the 4 listed signals only:

    use sigtrap qw(BUS SEGV PIPE ABRT);

Die on INT or QUIT:

    use sigtrap qw(die INT QUIT);

Die on HUP, INT, PIPE or TERM:

    use sigtrap qw(die normal-signals);

Die on HUP, INT, PIPE or TERM, except don't change the behavior for signals which are already trapped or ignored:

    use sigtrap qw(die untrapped normal-signals);

Die on receipt of any of the normal-signals which is currently untrapped; provide a stack trace on receipt of any of the error-signals:

    use sigtrap qw(die untrapped normal-signals
                   stack-trace any error-signals);

Install my_handler() as the handler for the normal-signals:

    use sigtrap 'handler', \&my_handler, 'normal-signals';

Install my_handler() as the handler for the normal-signals, provide a Perl stack trace on receipt of one of the error-signals:

    use sigtrap qw(handler my_handler normal-signals
                   stack-trace error-signals);
 
sort

NAME

sort - perl pragma to control sort() behaviour

SYNOPSIS

    use sort 'stable';          # guarantee stability
    use sort '_quicksort';      # use a quicksort algorithm
    use sort '_mergesort';      # use a mergesort algorithm
    use sort 'defaults';        # revert to default behavior
    no  sort 'stable';          # stability not important

    use sort '_qsort';          # alias for quicksort

    my $current;
    BEGIN {
        $current = sort::current();     # identify prevailing algorithm
    }

DESCRIPTION

With the sort pragma you can control the behaviour of the builtin sort() function.

In Perl versions 5.6 and earlier the quicksort algorithm was used to implement sort(), but in Perl 5.8 a mergesort algorithm was also made available, mainly to guarantee worst case O(N log N) behaviour: the worst case of quicksort is O(N**2). In Perl 5.8 and later, quicksort defends against quadratic behaviour by shuffling large arrays before sorting.

A stable sort means that for records that compare equal, the original input ordering is preserved. Mergesort is stable, quicksort is not. Stability will matter only if elements that compare equal can be distinguished in some other way. That means that simple numerical and lexical sorts do not profit from stability, since equal elements are indistinguishable. However, with a comparison such as

    { substr($a, 0, 3) cmp substr($b, 0, 3) }

stability might matter because elements that compare equal on the first 3 characters may be distinguished based on subsequent characters. In Perl 5.8 and later, quicksort can be stabilized, but doing so will add overhead, so it should only be done if it matters.
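
To make the point concrete, here is a small sketch that sorts on the first three characters only while asking for stability. The record strings are invented, and the example assumes a Perl in which the sort pragma is available, as it is in the version documented here.

```perl
use strict;
use warnings;
use sort 'stable';    # equal elements keep their input order

# Hypothetical records: a three-character key followed by a suffix.
my @records = qw(abc-2 xyz-1 abc-1 xyz-0 abc-0);

# Only the key takes part in the comparison, so records with the same
# key compare equal and stability decides their relative order.
my @sorted = sort { substr($a, 0, 3) cmp substr($b, 0, 3) } @records;

print "@sorted\n";    # abc-2 abc-1 abc-0 xyz-1 xyz-0
```

Within each key the suffixes appear in their original input order (2, 1, 0 for abc), which an unstable algorithm would not be obliged to preserve.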

The best algorithm depends on many things. On average, mergesort does fewer comparisons than quicksort, so it may be better when complicated comparison routines are used. Mergesort also takes advantage of pre-existing order, so it would be favored for using sort() to merge several sorted arrays. On the other hand, quicksort is often faster for small arrays, and on arrays of a few distinct values, repeated many times. You can force the choice of algorithm with this pragma, but this feels heavy-handed, so the subpragmas beginning with a _ may not persist beyond Perl 5.8. The default algorithm is mergesort, which will be stable even if you do not explicitly demand it. But the stability of the default sort is a side-effect that could change in later versions. If stability is important, be sure to say so with a

    use sort 'stable';

The no sort pragma doesn't forbid what follows; it just leaves the choice open. Thus, after

    no sort qw(_mergesort stable);

a mergesort, which happens to be stable, will be employed anyway. Note that

    no sort "_quicksort";
    no sort "_mergesort";

have exactly the same effect, leaving the choice of sort algorithm open.

CAVEATS

As of Perl 5.10, this pragma is lexically scoped and takes effect at compile time. In earlier versions its effect was global and took effect at run-time; the documentation suggested using eval() to change the behaviour:

    { eval 'use sort qw(defaults _quicksort)'; # force quicksort
      eval 'no sort "stable"';      # stability not wanted
      print sort::current . "\n";
      @a = sort @b;
      eval 'use sort "defaults"';   # clean up, for others
    }
    { eval 'use sort qw(defaults stable)';     # force stability
      print sort::current . "\n";
      @c = sort @d;
      eval 'use sort "defaults"';   # clean up, for others
    }

Such code no longer has the desired effect, for two reasons. Firstly, the use of eval() means that the sorting algorithm is not changed until runtime, by which time it's too late to have any effect. Secondly, sort::current is also called at run-time, when in fact the compile-time value of sort::current is the one that matters.

So now this code would be written:

    { use sort qw(defaults _quicksort); # force quicksort
      no sort "stable";      # stability not wanted
      my $current;
      BEGIN { $current = sort::current; }
      print "$current\n";
      @a = sort @b;
      # Pragmas go out of scope at the end of the block
    }
    { use sort qw(defaults stable);     # force stability
      my $current;
      BEGIN { $current = sort::current; }
      print "$current\n";
      @c = sort @d;
    }
 
splain

NAME

diagnostics, splain - produce verbose warning diagnostics

SYNOPSIS

Using the diagnostics pragma:

  1. use diagnostics;
  2. use diagnostics -verbose;
  3. enable diagnostics;
  4. disable diagnostics;

Using the splain standalone filter program:

    perl program 2>diag.out
    splain [-v] [-p] diag.out

Using diagnostics to get stack traces from a misbehaving script:

    perl -Mdiagnostics=-traceonly my_script.pl

DESCRIPTION

The diagnostics Pragma

This module extends the terse diagnostics normally emitted by both the perl compiler and the perl interpreter (from running perl with a -w switch or use warnings), augmenting them with the more explicative and endearing descriptions found in perldiag. Like the other pragmata, it affects the compilation phase of your program rather than merely the execution phase.

To use in your program as a pragma, merely invoke

    use diagnostics;

at the start (or near the start) of your program. (Note that this does enable perl's -w flag.) Your whole compilation will then be subject(ed :-) to the enhanced diagnostics. These still go out to STDERR.

Due to the interaction between runtime and compiletime issues, and because it's probably not a very good idea anyway, you may not use no diagnostics to turn them off at compiletime. However, you may control their behaviour at runtime using the disable() and enable() methods to turn them off and on respectively.

The -verbose flag first prints out the perldiag introduction before any other diagnostics. The $diagnostics::PRETTY variable can generate nicer escape sequences for pagers.

Warnings dispatched from perl itself (or more accurately, those that match descriptions found in perldiag) are only displayed once (no duplicate descriptions). User code generated warnings a la warn() are unaffected, allowing duplicate user messages to be displayed.

This module also adds a stack trace to the error message when perl dies. This is useful for pinpointing what caused the death. The -traceonly (or just -t) flag turns off the explanations of warning messages, leaving just the stack traces. So if your script is dying, run it again with

    perl -Mdiagnostics=-traceonly my_bad_script

to see the call stack at the time of death. By supplying the -warntrace (or just -w) flag, any warnings emitted will also come with a stack trace.

The splain Program

While apparently a whole nuther program, splain is actually nothing more than a link to the (executable) diagnostics.pm module, as well as a link to the diagnostics.pod documentation. The -v flag is like the use diagnostics -verbose directive. The -p flag is like the $diagnostics::PRETTY variable. Since you're post-processing with splain, there's no sense in being able to enable() or disable() processing.

Output from splain is directed to STDOUT, unlike the pragma.

EXAMPLES

The following file is certain to trigger a few errors at both runtime and compiletime:

    use diagnostics;
    print NOWHERE "nothing\n";
    print STDERR "\n\tThis message should be unadorned.\n";
    warn "\tThis is a user warning";
    print "\nDIAGNOSTIC TESTER: Please enter a <CR> here: ";
    my $a, $b = scalar <STDIN>;
    print "\n";
    print $x/$y;

If you prefer to run your program first and look at its problem afterwards, do this:

    perl -w test.pl 2>test.out
    ./splain < test.out

Note that this is not in general possible in shells of more dubious heritage, as the theoretical

    (perl -w test.pl >/dev/tty) >& test.out
    ./splain < test.out

does not do the same thing, because you just moved the existing stdout to somewhere else.

If you don't want to modify your source code, but still have on-the-fly warnings, do this:

    exec 3>&1; perl -w test.pl 2>&1 1>&3 3>&- | splain 1>&2 3>&-

Nifty, eh?

If you want to control warnings on the fly, do something like this. Make sure you do the use first, or you won't be able to get at the enable() or disable() methods.

    use diagnostics; # checks entire compilation phase
        print "\ntime for 1st bogus diags: SQUAWKINGS\n";
        print BOGUS1 'nada';
        print "done with 1st bogus\n";

    disable diagnostics; # only turns off runtime warnings
        print "\ntime for 2nd bogus: (squelched)\n";
        print BOGUS2 'nada';
        print "done with 2nd bogus\n";

    enable diagnostics; # turns back on runtime warnings
        print "\ntime for 3rd bogus: SQUAWKINGS\n";
        print BOGUS3 'nada';
        print "done with 3rd bogus\n";

    disable diagnostics;
        print "\ntime for 4th bogus: (squelched)\n";
        print BOGUS4 'nada';
        print "done with 4th bogus\n";

INTERNALS

Diagnostic messages derive from the perldiag.pod file when available at runtime. Otherwise, they may be embedded in the file itself when the splain package is built. See the Makefile for details.

If an extant $SIG{__WARN__} handler is discovered, it will continue to be honored, but only after the diagnostics::splainthis() function (the module's $SIG{__WARN__} interceptor) has had its way with your warnings.

There is a $diagnostics::DEBUG variable you may set if you're desperately curious what sorts of things are being intercepted.

    BEGIN { $diagnostics::DEBUG = 1 }

BUGS

Not being able to say "no diagnostics" is annoying, but may not be insurmountable.

The -pretty directive is called too late to affect matters. You have to do this instead, and before you load the module.

    BEGIN { $diagnostics::PRETTY = 1 }

I could start up faster by delaying compilation until it should be needed, but this gets a "panic: top_level" when using the pragma form in Perl 5.001e.

While it's true that this documentation is somewhat subserious, if you use a program named splain, you should expect a bit of whimsy.

AUTHOR

Tom Christiansen <tchrist@mox.perl.com>, 25 June 1995.

 
strict

NAME

strict - Perl pragma to restrict unsafe constructs

SYNOPSIS

    use strict;

    use strict "vars";
    use strict "refs";
    use strict "subs";

    use strict;
    no strict "vars";

DESCRIPTION

If no import list is supplied, all possible restrictions are assumed. (This is the safest mode to operate in, but is sometimes too strict for casual programming.) Currently, there are three possible things to be strict about: "subs", "vars", and "refs".

  • strict refs

    This generates a runtime error if you use symbolic references (see perlref).

        use strict 'refs';

        $ref = \$foo;
        print $$ref;        # ok
        $ref = "foo";
        print $$ref;        # runtime error; normally ok
        $file = "STDOUT";
        print $file "Hi!";  # error; note: no comma after $file

    There is one exception to this rule:

        $bar = \&{'foo'};
        &$bar;

    is allowed so that goto &$AUTOLOAD would not break under stricture.

  • strict vars

    This generates a compile-time error if you access a variable that was neither explicitly declared (using any of my, our, state, or use vars) nor fully qualified. (Because this is to avoid variable suicide problems and subtle dynamic scoping issues, a merely local variable isn't good enough.) See my, our, state, local, and vars.

        use strict 'vars';
        $X::foo = 1;         # ok, fully qualified
        my $foo = 10;        # ok, my() var
        local $baz = 9;      # blows up, $baz not declared before

        package Cinna;
        our $bar;            # Declares $bar in current package
        $bar = 'HgS';        # ok, global declared via pragma

    The local() generated a compile-time error because you just touched a global name without fully qualifying it.

    Because of their special use by sort(), the variables $a and $b are exempted from this check.

  • strict subs

    This disables the poetry optimization, generating a compile-time error if you try to use a bareword identifier that's not a subroutine, unless it is a simple identifier (no colons) that appears in curly braces or on the left-hand side of the => symbol.

        use strict 'subs';
        $SIG{PIPE} = Plumber;     # blows up
        $SIG{PIPE} = "Plumber";   # just fine: quoted string is always ok
        $SIG{PIPE} = \&Plumber;   # preferred form
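The run-time nature of the strict refs check can be seen by trapping the error with eval. The following is only a sketch; the variable names ($foo, $name, $value) are invented for the illustration.

```perl
use strict;
use warnings;

our $foo  = 'hello';   # package variable, reachable by name
my  $name = 'foo';

# Under strict refs, dereferencing a plain string dies at run time,
# so the failure can be trapped with eval.
my $ok    = eval { my $v = ${$name}; 1 };
my $error = $@;
print "symbolic dereference refused: $error" unless $ok;

# Where a symbolic reference really is wanted, relax the restriction
# in the smallest possible scope.
my $value = do {
    no strict 'refs';
    ${"main::$name"};
};
print "value: $value\n";
```

Confining no strict 'refs' to a do block keeps the rest of the file under the full set of restrictions.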

See Pragmatic Modules in perlmodlib.

HISTORY

In Perl 5.6.1, strict 'subs' erroneously permitted the use of an unquoted compound identifier (e.g. Foo::Bar) as a hash key (before => or inside curlies), but without forcing it always to a literal string.

Starting with Perl 5.8.1 strict is strict about its restrictions: if unknown restrictions are used, the strict pragma will abort with

    Unknown 'strict' tag(s) '...'

As of version 1.04 (Perl 5.10), strict verifies that it is used as "strict" to avoid the dreaded Strict trap on case insensitive file systems.

 
subs

NAME

subs - Perl pragma to predeclare sub names

SYNOPSIS

    use subs qw(frob);
    frob 3..10;

DESCRIPTION

This will predeclare all the subroutines whose names are in the list, allowing you to use them without parentheses even before they're declared.

Unlike pragmas that affect the $^H hints variable, the use vars and use subs declarations are not BLOCK-scoped. They are thus effective for the entire package in which they appear. You may not rescind such declarations with no vars or no subs.

See Pragmatic Modules in perlmodlib and strict subs in strict.
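A small sketch of predeclaration in use: frob is the name from the synopsis, but its body here is invented for the example.

```perl
use strict;
use warnings;
use subs qw(frob);

# Thanks to the predeclaration, frob parses as a list operator here
# even though its definition appears further down the file.
my @got = frob 3 .. 5;

# Hypothetical body: doubles each argument.
sub frob { return map { $_ * 2 } @_ }

print "@got\n";
```

Without the use subs line, the bareword call would not compile under strict, because perl has not yet seen a declaration of frob at that point.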

 
threads

NAME

threads - Perl interpreter-based threads

VERSION

This document describes threads version 1.86

SYNOPSIS

    use threads ('yield',
                 'stack_size' => 64*4096,
                 'exit' => 'threads_only',
                 'stringify');

    sub start_thread {
        my @args = @_;
        print('Thread started: ', join(' ', @args), "\n");
    }
    my $thr = threads->create('start_thread', 'argument');
    $thr->join();

    threads->create(sub { print("I am a thread\n"); })->join();

    my $thr2 = async { foreach (@files) { ... } };
    $thr2->join();
    if (my $err = $thr2->error()) {
        warn("Thread error: $err\n");
    }

    # Invoke thread in list context (implicit) so it can return a list
    my ($thr) = threads->create(sub { return (qw/a b c/); });
    # or specify list context explicitly
    my $thr = threads->create({'context' => 'list'},
                              sub { return (qw/a b c/); });
    my @results = $thr->join();

    $thr->detach();

    # Get a thread's object
    $thr = threads->self();
    $thr = threads->object($tid);

    # Get a thread's ID
    $tid = threads->tid();
    $tid = $thr->tid();
    $tid = "$thr";

    # Give other threads a chance to run
    threads->yield();
    yield();

    # Lists of non-detached threads
    my @threads = threads->list();
    my $thread_count = threads->list();

    my @running = threads->list(threads::running);
    my @joinable = threads->list(threads::joinable);

    # Test thread objects
    if ($thr1 == $thr2) {
        ...
    }

    # Manage thread stack size
    $stack_size = threads->get_stack_size();
    $old_size = threads->set_stack_size(32*4096);

    # Create a thread with a specific context and stack size
    my $thr = threads->create({ 'context'    => 'list',
                                'stack_size' => 32*4096,
                                'exit'       => 'thread_only' },
                              \&foo);

    # Get thread's context
    my $wantarray = $thr->wantarray();

    # Check thread's state
    if ($thr->is_running()) {
        sleep(1);
    }
    if ($thr->is_joinable()) {
        $thr->join();
    }

    # Send a signal to a thread
    $thr->kill('SIGUSR1');

    # Exit a thread
    threads->exit();

DESCRIPTION

Since Perl 5.8, thread programming has been available using a model called interpreter threads which provides a new Perl interpreter for each thread, and, by default, results in no data or state information being shared between threads.

(Prior to Perl 5.8, 5005threads was available through the Thread.pm API. This threading model has been deprecated, and was removed as of Perl 5.10.0.)

As just mentioned, all variables are, by default, thread local. To use shared variables, you need to also load threads::shared:

    use threads;
    use threads::shared;

When loading threads::shared, you must use threads before you use threads::shared. (threads will emit a warning if you do it the other way around.)

It is strongly recommended that you enable threads via use threads as early as possible in your script.

If needed, scripts can be written so as to run on both threaded and non-threaded Perls:

    my $can_use_threads = eval 'use threads; 1';
    if ($can_use_threads) {
        # Do processing using threads
        ...
    } else {
        # Do it without using threads
        ...
    }

  • $thr = threads->create(FUNCTION, ARGS)

    This will create a new thread that will begin execution with the specified entry point function, and give it the ARGS list as parameters. It will return the corresponding threads object, or undef if thread creation failed.

    FUNCTION may either be the name of a function, an anonymous subroutine, or a code ref.

        my $thr = threads->create('func_name', ...);
            # or
        my $thr = threads->create(sub { ... }, ...);
            # or
        my $thr = threads->create(\&func, ...);

    The ->new() method is an alias for ->create().

  • $thr->join()

    This will wait for the corresponding thread to complete its execution. When the thread finishes, ->join() will return the return value(s) of the entry point function.

    The context (void, scalar or list) for the return value(s) for ->join() is determined at the time of thread creation.

        # Create thread in list context (implicit)
        my ($thr1) = threads->create(sub {
                                        my @results = qw(a b c);
                                        return (@results);
                                     });
        #   or (explicit)
        my $thr1 = threads->create({'context' => 'list'},
                                   sub {
                                        my @results = qw(a b c);
                                        return (@results);
                                   });
        # Retrieve list results from thread
        my @res1 = $thr1->join();

        # Create thread in scalar context (implicit)
        my $thr2 = threads->create(sub {
                                        my $result = 42;
                                        return ($result);
                                     });
        # Retrieve scalar result from thread
        my $res2 = $thr2->join();

        # Create a thread in void context (explicit)
        my $thr3 = threads->create({'void' => 1},
                                   sub { print("Hello, world\n"); });
        # Join the thread in void context (i.e., no return value)
        $thr3->join();

    See THREAD CONTEXT for more details.

    If the program exits without all threads having either been joined or detached, then a warning will be issued.

    Calling ->join() or ->detach() on an already joined thread will cause an error to be thrown.

  • $thr->detach()

    Makes the thread unjoinable, and causes any eventual return value to be discarded. When the program exits, any detached threads that are still running are silently terminated.

    If the program exits without all threads having either been joined or detached, then a warning will be issued.

    Calling ->join() or ->detach() on an already detached thread will cause an error to be thrown.

  • threads->detach()

    Class method that allows a thread to detach itself.

  • threads->self()

    Class method that allows a thread to obtain its own threads object.

  • $thr->tid()

    Returns the ID of the thread. Thread IDs are unique integers with the main thread in a program being 0, and incrementing by 1 for every thread created.

  • threads->tid()

    Class method that allows a thread to obtain its own ID.

  • "$thr"

    If you add the stringify import option to your use threads declaration, then using a threads object in a string or a string context (e.g., as a hash key) will cause its ID to be used as the value:

        use threads qw(stringify);

        my $thr = threads->create(...);
        print("Thread $thr started...\n");  # Prints out: Thread 1 started...

  • threads->object($tid)

    This will return the threads object for the active thread associated with the specified thread ID. If $tid is the value for the current thread, then this call works the same as ->self(). Otherwise, returns undef if there is no thread associated with the TID, if the thread is joined or detached, if no TID is specified or if the specified TID is undef.

  • threads->yield()

    This is a suggestion to the OS to let this thread yield CPU time to other threads. What actually happens is highly dependent upon the underlying thread implementation.

    You may say use threads qw(yield), and then just call yield() in your code.

  • threads->list()
  • threads->list(threads::all)
  • threads->list(threads::running)
  • threads->list(threads::joinable)

    With no arguments (or using threads::all) and in a list context, returns a list of all non-joined, non-detached threads objects. In a scalar context, returns a count of the same.

    With a true argument (using threads::running), returns a list of all non-joined, non-detached threads objects that are still running.

    With a false argument (using threads::joinable), returns a list of all non-joined, non-detached threads objects that have finished running (i.e., for which ->join() will not block).

  • $thr1->equal($thr2)

    Tests if two threads objects are the same thread or not. This is overloaded to the more natural forms:

        if ($thr1 == $thr2) {
            print("Threads are the same\n");
        }
        # or
        if ($thr1 != $thr2) {
            print("Threads differ\n");
        }

    (Thread comparison is based on thread IDs.)

  • async BLOCK;

    async creates a thread to execute the block immediately following it. This block is treated as an anonymous subroutine, and so must have a semicolon after the closing brace. Like threads->create(), async returns a threads object.

  • $thr->error()

    Threads are executed in an eval context. This method will return undef if the thread terminates normally. Otherwise, it returns the value of $@ associated with the thread's execution status in its eval context.

  • $thr->_handle()

    This private method returns the memory location of the internal thread structure associated with a threads object. For Win32, this is a pointer to the HANDLE value returned by CreateThread (i.e., HANDLE *); for other platforms, it is a pointer to the pthread_t structure used in the pthread_create call (i.e., pthread_t *).

    This method is of no use for general Perl threads programming. Its intent is to provide other (XS-based) thread modules with the capability to access, and possibly manipulate, the underlying thread structure associated with a Perl thread.

  • threads->_handle()

    Class method that allows a thread to obtain its own handle.
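Several of the methods above can be seen working together in a short sketch. It assumes a Perl built with ithreads; the entry point double_all and the variable names are invented for the illustration.

```perl
use strict;
use warnings;
use threads;

# Hypothetical entry point: doubles each of its arguments.
sub double_all { return map { $_ * 2 } @_ }

# Explicit list context, so join() hands back a list.
my $list_thr = threads->create({ 'context' => 'list' }, \&double_all, 1, 2, 3);
my $tid      = $list_thr->tid();   # the main thread is 0; the first thread created is 1
my @doubled  = $list_thr->join();

# Scalar context is implied by assigning the threads object to a scalar.
my $scalar_thr = threads->create(sub { return 42 });
my $answer     = $scalar_thr->join();

# A thread that dies: a warning goes to STDERR, and error() returns its $@.
my $bad_thr = threads->create(sub { die "oops\n" });
$bad_thr->join();
my $err = $bad_thr->error();

print "thread $tid returned: @doubled\n";
print "scalar thread returned: $answer\n";
print "failed thread reported: $err";
print "threads still unjoined: ", scalar(threads->list()), "\n";
```

Every thread is joined before the program ends, so the "Perl exited with active threads" warning is not triggered.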

EXITING A THREAD

The usual method for terminating a thread is to return EXPR from the entry point function with the appropriate return value(s).

  • threads->exit()

    If needed, a thread can be exited at any time by calling threads->exit(). This will cause the thread to return undef in a scalar context, or the empty list in a list context.

    When called from the main thread, this behaves the same as exit(0).

  • threads->exit(status)

    When called from a thread, this behaves like threads->exit() (i.e., the exit status code is ignored).

    When called from the main thread, this behaves the same as exit(status).

  • die()

    Calling die() in a thread indicates an abnormal exit for the thread. Any $SIG{__DIE__} handler in the thread will be called first, and then the thread will exit with a warning message that will contain any arguments passed in the die() call.

  • exit(status)

    Calling exit EXPR inside a thread causes the whole application to terminate. Because of this, the use of exit() inside threaded code, or in modules that might be used in threaded applications, is strongly discouraged.

    If exit() really is needed, then consider using the following:

        threads->exit() if threads->can('exit');   # Thread friendly
        exit(status);

  • use threads 'exit' => 'threads_only'

    This globally overrides the default behavior of calling exit() inside a thread, and effectively causes such calls to behave the same as threads->exit(). In other words, with this setting, calling exit() causes only the thread to terminate.

    Because of its global effect, this setting should not be used inside modules or the like.

    The main thread is unaffected by this setting.

  • threads->create({'exit' => 'thread_only'}, ...)

    This overrides the default behavior of exit() inside the newly created thread only.

  • $thr->set_thread_exit_only(boolean)

    This can be used to change the exit thread only behavior for a thread after it has been created. With a true argument, exit() will cause only the thread to exit. With a false argument, exit() will terminate the application.

    The main thread is unaffected by this call.

  • threads->set_thread_exit_only(boolean)

    Class method for use inside a thread to change its own behavior for exit().

    The main thread is unaffected by this call.
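The different ways of leaving a thread can be compared in a brief sketch, again assuming a Perl built with ithreads. The variable names are made up for the example.

```perl
use strict;
use warnings;
use threads;

# threads->exit() ends only the calling thread; a thread created in
# scalar context then yields undef from join().
my $early = threads->create(sub {
    threads->exit();
    return 'not reached';
});
my $early_result = $early->join();

# With 'exit' => 'thread_only', a plain exit() is confined to this thread
# and the exit status is ignored.
my $confined = threads->create({ 'exit' => 'thread_only' }, sub {
    exit(3);
    return 'not reached';
});
my $confined_result = $confined->join();

# An ordinary return remains the usual way to finish.
my $normal        = threads->create(sub { return 'finished' });
my $normal_result = $normal->join();

print "early:    ", defined($early_result)    ? $early_result    : 'undef', "\n";
print "confined: ", defined($confined_result) ? $confined_result : 'undef', "\n";
print "normal:   $normal_result\n";
```

The per-thread 'exit' => 'thread_only' option is used here rather than the global import option, since the latter is discouraged in reusable code.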

THREAD STATE

The following boolean methods are useful in determining the state of a thread.

  • $thr->is_running()

    Returns true if a thread is still running (i.e., if its entry point function has not yet finished or exited).

  • $thr->is_joinable()

    Returns true if the thread has finished running, is not detached and has not yet been joined. In other words, the thread is ready to be joined, and a call to $thr->join() will not block.

  • $thr->is_detached()

    Returns true if the thread has been detached.

  • threads->is_detached()

    Class method that allows a thread to determine whether or not it is detached.

THREAD CONTEXT

As with subroutines, the type of value returned from a thread's entry point function may be determined by the thread's context: list, scalar or void. The thread's context is determined at thread creation. This is necessary so that the context is available to the entry point function via wantarray. The thread may then specify a value of the appropriate type to be returned from ->join().

Explicit context

Because thread creation and thread joining may occur in different contexts, it may be desirable to state the context explicitly to the thread's entry point function. This may be done by calling ->create() with a hash reference as the first argument:

    my $thr = threads->create({'context' => 'list'}, \&foo);
    ...
    my @results = $thr->join();

In the above, the threads object is returned to the parent thread in scalar context, and the thread's entry point function foo will be called in list (array) context such that the parent thread can receive a list (array) from the ->join() call. ('array' is synonymous with 'list'.)

Similarly, if you need the threads object, but your thread will not be returning a value (i.e., void context), you would do the following:

    my $thr = threads->create({'context' => 'void'}, \&foo);
    ...
    $thr->join();

The context type may also be used as the key in the hash reference followed by a true value:

    threads->create({'scalar' => 1}, \&foo);
    ...
    my ($thr) = threads->list();
    my $result = $thr->join();

Implicit context

If not explicitly stated, the thread's context is implied from the context of the ->create() call:

    # Create thread in list context
    my ($thr) = threads->create(...);

    # Create thread in scalar context
    my $thr = threads->create(...);

    # Create thread in void context
    threads->create(...);

$thr->wantarray()

This returns the thread's context in the same manner as wantarray.

threads->wantarray()

Class method to return the current thread's context. This returns the same value as running wantarray inside the current thread's entry point function.
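To see how the creation-time context reaches the entry point, consider the following sketch. It assumes a Perl built with ithreads; context_name is a hypothetical helper written for this example.

```perl
use strict;
use warnings;
use threads;

# Hypothetical entry point that reports the context it was started in.
sub context_name {
    my $ctx = wantarray;
    return !defined($ctx) ? 'void' : $ctx ? 'list' : 'scalar';
}

my $list_thr   = threads->create({ 'context' => 'list' }, \&context_name);
my $scalar_thr = threads->create({ 'scalar'  => 1 },      \&context_name);
my $void_thr   = threads->create({ 'void'    => 1 },      \&context_name);

# The parent can ask a threads object for its context as well.
my $void_ctx = $void_thr->wantarray();   # undef for void context

my ($seen_by_list) = $list_thr->join();
my $seen_by_scalar = $scalar_thr->join();
$void_thr->join();                       # return value is discarded

print "list thread saw:   $seen_by_list\n";
print "scalar thread saw: $seen_by_scalar\n";
print "void thread context is ", defined($void_ctx) ? 'defined' : 'undef', "\n";
```

The context is fixed when the thread is created, so the way ->join() is later called does not change what wantarray reports inside the thread.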

THREAD STACK SIZE

The default per-thread stack size for different platforms varies significantly, and is almost always far more than is needed for most applications. On Win32, Perl's makefile explicitly sets the default stack to 16 MB; on most other platforms, the system default is used, which again may be much larger than is needed.

By tuning the stack size to more accurately reflect your application's needs, you may significantly reduce your application's memory usage, and increase the number of simultaneously running threads.

Note that on Windows, address space allocation granularity is 64 KB, therefore, setting the stack smaller than that on Win32 Perl will not save any more memory.

  • threads->get_stack_size();

    Returns the current default per-thread stack size. The default is zero, which means the system default stack size is currently in use.

  • $size = $thr->get_stack_size();

    Returns the stack size for a particular thread. A return value of zero indicates the system default stack size was used for the thread.

  • $old_size = threads->set_stack_size($new_size);

    Sets a new default per-thread stack size, and returns the previous setting.

    Some platforms have a minimum thread stack size. Trying to set the stack size below this value will result in a warning, and the minimum stack size will be used.

    Some Linux platforms have a maximum stack size. Setting too large a stack size will cause thread creation to fail.

    If needed, $new_size will be rounded up to the next multiple of the memory page size (usually 4096 or 8192).

    Threads created after the stack size is set will then either call pthread_attr_setstacksize() (for pthreads platforms), or supply the stack size to CreateThread() (for Win32 Perl).

    (Obviously, this call does not affect any currently extant threads.)

  • use threads ('stack_size' => VALUE);

    This sets the default per-thread stack size at the start of the application.

  • $ENV{'PERL5_ITHREADS_STACK_SIZE'}

    The default per-thread stack size may be set at the start of the application through the use of the environment variable PERL5_ITHREADS_STACK_SIZE:

        PERL5_ITHREADS_STACK_SIZE=1048576
        export PERL5_ITHREADS_STACK_SIZE
        perl -e'use threads; print(threads->get_stack_size(), "\n")'

    This value overrides any stack_size parameter given to use threads. Its primary purpose is to permit setting the per-thread stack size for legacy threaded applications.

  • threads->create({'stack_size' => VALUE}, FUNCTION, ARGS)

    To specify a particular stack size for any individual thread, call ->create() with a hash reference as the first argument:

        my $thr = threads->create({'stack_size' => 32*4096}, \&foo, @args);

  • $thr2 = $thr1->create(FUNCTION, ARGS)

    This creates a new thread ($thr2) that inherits the stack size from an existing thread ($thr1). This is shorthand for the following:

        my $stack_size = $thr1->get_stack_size();
        my $thr2 = threads->create({'stack_size' => $stack_size}, FUNCTION, ARGS);
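The stack size calls above might be combined as in the following sketch. It assumes a Perl built with ithreads and that PERL5_ITHREADS_STACK_SIZE is not set in the environment; the sizes are arbitrary, and the values actually reported may be rounded up by the platform.

```perl
use strict;
use warnings;
use threads ('stack_size' => 64*4096);   # default for threads created later

my $default = threads->get_stack_size();

# Per-thread override, then a second thread inheriting that size.
my $big   = threads->create({ 'stack_size' => 128*4096 }, sub { return 1 });
my $child = $big->create(sub { return 1 });

my $big_size   = $big->get_stack_size();
my $child_size = $child->get_stack_size();

$_->join() for $big, $child;

print "default:   $default\n";
print "big:       $big_size\n";
print "inherited: $child_size\n";
```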

THREAD SIGNALLING

When safe signals is in effect (the default behavior - see Unsafe signals for more details), then signals may be sent and acted upon by individual threads.

  • $thr->kill('SIG...');

    Sends the specified signal to the thread. Signal names and (positive) signal numbers are the same as those supported by kill SIGNAL, LIST. For example, 'SIGTERM', 'TERM' and (depending on the OS) 15 are all valid arguments to ->kill().

    Returns the thread object to allow for method chaining:

        $thr->kill('SIG...')->join();

Signal handlers need to be set up in the threads for the signals they are expected to act upon. Here's an example for cancelling a thread:

    use threads;

    sub thr_func
    {
        # Thread 'cancellation' signal handler
        $SIG{'KILL'} = sub { threads->exit(); };

        ...
    }

    # Create a thread
    my $thr = threads->create('thr_func');

    ...

    # Signal the thread to terminate, and then detach
    # it so that it will get cleaned up automatically
    $thr->kill('KILL')->detach();

Here's another simplistic example that illustrates the use of thread signalling in conjunction with a semaphore to provide rudimentary suspend and resume capabilities:

    use threads;
    use Thread::Semaphore;

    sub thr_func
    {
        my $sema = shift;

        # Thread 'suspend/resume' signal handler
        $SIG{'STOP'} = sub {
            $sema->down();      # Thread suspended
            $sema->up();        # Thread resumes
        };

        ...
    }

    # Create a semaphore and pass it to a thread
    my $sema = Thread::Semaphore->new();
    my $thr = threads->create('thr_func', $sema);

    # Suspend the thread
    $sema->down();
    $thr->kill('STOP');

    ...

    # Allow the thread to continue
    $sema->up();

CAVEAT: The thread signalling capability provided by this module does not actually send signals via the OS. It emulates signals at the Perl-level such that signal handlers are called in the appropriate thread. For example, sending $thr->kill('STOP') does not actually suspend a thread (or the whole process), but does cause a $SIG{'STOP'} handler to be called in that thread (as illustrated above).

As such, signals that would normally not be appropriate to use in the kill() command (e.g., kill('KILL', $$)) are okay to use with the ->kill() method (again, as illustrated above).

Correspondingly, sending a signal to a thread does not disrupt the operation the thread is currently working on: The signal will be acted upon after the current operation has completed. For instance, if the thread is stuck on an I/O call, sending it a signal will not cause the I/O call to be interrupted such that the signal is acted upon immediately.

Sending a signal to a terminated thread is ignored.

WARNINGS

  • Perl exited with active threads:

    If the program exits without all threads having either been joined or detached, then this warning will be issued.

    NOTE: If the main thread exits, then this warning cannot be suppressed using no warnings 'threads'; as suggested below.

  • Thread creation failed: pthread_create returned #

    See the appropriate man page for pthread_create to determine the actual cause for the failure.

  • Thread # terminated abnormally: ...

    A thread terminated in some manner other than just returning from its entry point function, or by using threads->exit(). For example, the thread may have terminated because of an error, or by using die.

  • Using minimum thread stack size of #

    Some platforms have a minimum thread stack size. Trying to set the stack size below this value will result in the above warning, and the stack size will be set to the minimum.

  • Thread creation failed: pthread_attr_setstacksize(SIZE) returned 22

    The specified SIZE exceeds the system's maximum stack size. Use a smaller value for the stack size.

If needed, thread warnings can be suppressed by using:

    no warnings 'threads';

in the appropriate scope.

ERRORS

  • This Perl not built to support threads

    The particular copy of Perl that you're trying to use was not built using the useithreads configuration option.

    Having threads support requires all of Perl and all of the XS modules in the Perl installation to be rebuilt; it is not just a question of adding the threads module (i.e., threaded and non-threaded Perls are binary incompatible.)

  • Cannot change stack size of an existing thread

    The stack size of currently extant threads cannot be changed, therefore, the following results in the above error:

        $thr->set_stack_size($size);

  • Cannot signal threads without safe signals

    Safe signals must be in effect to use the ->kill() signalling method. See Unsafe signals for more details.

  • Unrecognized signal name: ...

    The particular copy of Perl that you're trying to use does not support the specified signal being used in a ->kill() call.

BUGS AND LIMITATIONS

Before you consider posting a bug report, please consult, and possibly post a message to the discussion forum to see if what you've encountered is a known problem.

  • Thread-safe modules

    See Making your module threadsafe in perlmod when creating modules that may be used in threaded applications, especially if those modules use non-Perl data, or XS code.

  • Using non-thread-safe modules

    Unfortunately, you may encounter Perl modules that are not thread-safe. For example, they may crash the Perl interpreter during execution, or may dump core on termination. Depending on the module and the requirements of your application, it may be possible to work around such difficulties.

    If the module will only be used inside a thread, you can try loading the module from inside the thread entry point function using require (and import if needed):

        sub thr_func
        {
            require Unsafe::Module;
            # Unsafe::Module->import(...);
            ....
        }

    If the module is needed inside the main thread, try modifying your application so that the module is loaded (again using require and ->import() ) after any threads are started, and in such a way that no other threads are started afterwards.

    If the above does not work, or is not adequate for your application, then file a bug report on http://rt.cpan.org/Public/ against the problematic module.

  • Memory consumption

    On most systems, frequent and continual creation and destruction of threads can lead to ever-increasing growth in the memory footprint of the Perl interpreter. While it is simple to just launch threads and then ->join() or ->detach() them, for long-lived applications, it is better to maintain a pool of threads, and to reuse them for the work needed, using queues to notify threads of pending work. The CPAN distribution of this module contains a simple example (examples/pool_reuse.pl) illustrating the creation, use and monitoring of a pool of reusable threads.

  • Current working directory

    On all platforms except MSWin32, the setting for the current working directory is shared among all threads such that changing it in one thread (e.g., using chdir()) will affect all the threads in the application.

    On MSWin32, each thread maintains its own current working directory setting.

  • Environment variables

    Currently, on all platforms except MSWin32, all system calls (e.g., using system() or back-ticks) made from threads use the environment variable settings from the main thread. In other words, changes made to %ENV in a thread will not be visible in system calls made by that thread.

    To work around this, set environment variables as part of the system call. For example:

        my $msg = 'hello';
        system("FOO=$msg; echo \$FOO");   # Outputs 'hello' to STDOUT

    On MSWin32, each thread maintains its own set of environment variables.

  • Catching signals

    Signals are caught by the main thread (thread ID = 0) of a script. Therefore, setting up signal handlers in threads for purposes other than THREAD SIGNALLING as documented above will not accomplish what is intended.

    This is especially true if trying to catch SIGALRM in a thread. To handle alarms in threads, set up a signal handler in the main thread, and then use THREAD SIGNALLING to relay the signal to the thread:

        # Create thread with a task that may time out
        my $thr = threads->create(sub {
            threads->yield();
            eval {
                $SIG{ALRM} = sub { die("Timeout\n"); };
                alarm(10);
                ...  # Do work here
                alarm(0);
            };
            if ($@ =~ /Timeout/) {
                warn("Task in thread timed out\n");
            }
        });

        # Set signal handler to relay SIGALRM to thread
        $SIG{ALRM} = sub { $thr->kill('ALRM') };

        ... # Main thread continues working

  • Parent-child threads

    On some platforms, it might not be possible to destroy parent threads while there are still existing child threads.

  • Creating threads inside special blocks

    Creating threads inside BEGIN, CHECK or INIT blocks should not be relied upon. Depending on the Perl version and the application code, results may range from success, to (apparently harmless) warnings of leaked scalar, or all the way up to crashing of the Perl interpreter.

  • Unsafe signals

    Since Perl 5.8.0, signals have been made safer in Perl by postponing their handling until the interpreter is in a safe state. See Safe Signals in perl58delta and Deferred Signals (Safe Signals) in perlipc for more details.

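    Safe signals is the default behavior, and the old, immediate, unsafe signalling behavior is only in effect in the following situations:

      • Perl has been built with PERL_OLD_SIGNALS (see perl -V).

      • The environment variable PERL_SIGNALS is set to unsafe (see PERL_SIGNALS in perlrun).

      • The module Perl::Unsafe::Signals is used.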

    If unsafe signals is in effect, then signal handling is not thread-safe, and the ->kill() signalling method cannot be used.

  • Returning closures from threads

    Returning closures from threads should not be relied upon. Depending on the Perl version and the application code, results may range from success, to (apparently harmless) warnings of leaked scalar, or all the way up to crashing of the Perl interpreter.

  • Returning objects from threads

    Returning objects from threads does not work. Depending on the classes involved, you may be able to work around this by returning a serialized version of the object (e.g., using Data::Dumper or Storable), and then reconstituting it in the joining thread. If you're using Perl 5.10.0 or later, and if the class supports shared objects, you can pass them via shared queues.

  • END blocks in threads

    It is possible to add END blocks to threads by using require VERSION or eval EXPR with the appropriate code. These END blocks will then be executed when the thread's interpreter is destroyed (i.e., either during a ->join() call, or at program termination).

    However, calling any threads methods in such an END block will most likely fail (e.g., the application may hang, or generate an error) due to mutexes that are needed to control functionality within the threads module.

    For this reason, the use of END blocks in threads is strongly discouraged.

  • Open directory handles

    In perl 5.14 and higher, on systems other than Windows that do not support the fchdir C function, directory handles (see opendir DIRHANDLE,EXPR) will not be copied to new threads. You can use the d_fchdir variable in Config.pm to determine whether your system supports it.

    In prior perl versions, spawning threads with open directory handles would crash the interpreter. [perl #75154]

  • Perl Bugs and the CPAN Version of threads

    Support for threads extends beyond the code in this module (i.e., threads.pm and threads.xs), and into the Perl interpreter itself. Older versions of Perl contain bugs that may manifest themselves despite using the latest version of threads from CPAN. There is no workaround for this other than upgrading to the latest version of Perl.

    Even with the latest version of Perl, it is known that certain constructs with threads may result in warning messages concerning leaked scalars or unreferenced scalars. However, such warnings are harmless, and may safely be ignored.

    You can search for threads related bug reports at http://rt.cpan.org/Public/. If needed submit any new bugs, problems, patches, etc. to: http://rt.cpan.org/Public/Dist/Display.html?Name=threads
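
The thread-pool advice given under "Memory consumption" above can be sketched roughly as follows. This is a minimal illustration and not the examples/pool_reuse.pl script from the CPAN distribution: it assumes a Perl built with ithreads, uses the core Thread::Queue module, and the queue names, pool size and squaring job are invented for the example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical names: $work carries pending jobs, $done carries results.
my $work = Thread::Queue->new();
my $done = Thread::Queue->new();

# Start a fixed pool once and reuse the same threads for every job,
# instead of creating and destroying one thread per job.
my $pool_size = 3;
my @pool = map {
    threads->create(sub {
        # dequeue() blocks until a job arrives; undef means "shut down".
        while (defined(my $n = $work->dequeue())) {
            $done->enqueue($n * $n);
        }
    });
} 1 .. $pool_size;

# Hand out ten jobs, then one shutdown marker per worker.
$work->enqueue($_) for 1 .. 10;
$work->enqueue(undef) for @pool;

# Join every worker so that no thread is still active at exit.
$_->join() for @pool;

# Results may arrive in any order, so sort them before printing.
my @squares = sort { $a <=> $b } map { $done->dequeue() } 1 .. 10;
print "@squares\n";
```

A long-lived application would normally keep the pool running and enqueue the shutdown markers only when it is about to exit.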

REQUIREMENTS

Perl 5.8.0 or later

SEE ALSO

threads Discussion Forum on CPAN: http://www.cpanforum.com/dist/threads

threads::shared, perlthrtut

http://www.perl.com/pub/a/2002/06/11/threads.html and http://www.perl.com/pub/a/2002/09/04/threads.html

Perl threads mailing list: http://lists.perl.org/list/ithreads.html

Stack size discussion: http://www.perlmonks.org/?node_id=532956

AUTHOR

Artur Bergman <sky AT crucially DOT net>

CPAN version produced by Jerry D. Hedden <jdhedden AT cpan DOT org>

LICENSE

threads is released under the same license as Perl.

ACKNOWLEDGEMENTS

Richard Soderberg <perl AT crystalflame DOT net> - Helping me out tons, trying to find reasons for races and other weird bugs!

Simon Cozens <simon AT brecon DOT co DOT uk> - Being there to answer zillions of annoying questions

Rocco Caputo <troc AT netrus DOT net>

Vipul Ved Prakash <mail AT vipul DOT net> - Helping with debugging

Dean Arnold <darnold AT presicient DOT com> - Stack size API

 
utf8

NAME

utf8 - Perl pragma to enable/disable UTF-8 (or UTF-EBCDIC) in source code

SYNOPSIS

    use utf8;
    no utf8;

    # Convert the internal representation of a Perl scalar to/from UTF-8.
    $num_octets = utf8::upgrade($string);
    $success = utf8::downgrade($string[, FAIL_OK]);

    # Change each character of a Perl scalar to/from a series of
    # characters that represent the UTF-8 bytes of each original character.
    utf8::encode($string);  # "\x{100}" becomes "\xc4\x80"
    utf8::decode($string);  # "\xc4\x80" becomes "\x{100}"

    $flag = utf8::is_utf8(STRING);  # since Perl 5.8.1
    $flag = utf8::valid(STRING);

DESCRIPTION

The use utf8 pragma tells the Perl parser to allow UTF-8 in the program text in the current lexical scope (allow UTF-EBCDIC on EBCDIC based platforms). The no utf8 pragma tells Perl to switch back to treating the source text as literal bytes in the current lexical scope.

Do not use this pragma for anything other than telling Perl that your script is written in UTF-8. The utility functions described below are directly usable without use utf8;.

Because it is not possible to reliably tell UTF-8 from native 8 bit encodings, you need either a Byte Order Mark at the beginning of your source code, or use utf8;, to tell perl that the source is UTF-8.

When UTF-8 becomes the standard source format, this pragma will effectively become a no-op. For convenience in what follows the term UTF-X is used to refer to UTF-8 on ASCII and ISO Latin based platforms and UTF-EBCDIC on EBCDIC based platforms.

See also the effects of the -C switch and its cousin, the $ENV{PERL_UNICODE} , in perlrun.

Enabling the utf8 pragma has the following effect:

  • Bytes in the source text that have their high-bit set will be treated as being part of a literal UTF-X sequence. This includes most literals such as identifier names, string constants, and constant regular expression patterns.

    On EBCDIC platforms characters in the Latin 1 character set are treated as being part of a literal UTF-EBCDIC character.

Note that if you have bytes with the eighth bit on in your script (for example embedded Latin-1 in your string literals), use utf8 will be unhappy since the bytes are most probably not well-formed UTF-X. If you want to have such bytes under use utf8, you can disable this pragma until the end of the block (or file, if at top level) with no utf8;.
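
To make the effect on literals concrete, the following sketch compiles the same string literal once with the pragma and once without it. It assumes the file is saved as UTF-8 and that the platform is ASCII-based; the variable names are illustrative only.

```perl
use strict;
use warnings;

my ($with_pragma, $without_pragma);

{
    use utf8;                        # literal is decoded as UTF-8 text
    $with_pragma = length("é");      # one character, U+00E9
}
{
    no utf8;                         # literal is taken as raw bytes
    $without_pragma = length("é");   # two bytes, 0xC3 0xA9
}

printf "with use utf8: %d, without: %d\n", $with_pragma, $without_pragma;
```

Because the pragma is lexically scoped, each block sees only its own setting.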

Utility functions

The following functions are defined in the utf8:: package by the Perl core. You do not need to say use utf8 to use these and in fact you should not say that unless you really want to have UTF-8 source code.

  • $num_octets = utf8::upgrade($string)

    Converts in-place the internal representation of the string from an octet sequence in the native encoding (Latin-1 or EBCDIC) to UTF-X. The logical character sequence itself is unchanged. If $string is already stored as UTF-X, then this is a no-op. Returns the number of octets necessary to represent the string as UTF-X. Can be used to make sure that the UTF-8 flag is on, so that \w or lc() work as Unicode on strings containing characters in the range 0x80-0xFF (on ASCII and derivatives).

    Note that this function does not handle arbitrary encodings. Therefore Encode is recommended for general purposes; see also Encode.

  • $success = utf8::downgrade($string[, FAIL_OK])

    Converts in-place the internal representation of the string from UTF-X to the equivalent octet sequence in the native encoding (Latin-1 or EBCDIC). The logical character sequence itself is unchanged. If $string is already stored as native 8 bit, then this is a no-op. Can be used to make sure that the UTF-8 flag is off, e.g. when you want to make sure that the substr() or length() function works with the usually faster byte algorithm.

    Fails if the original UTF-X sequence cannot be represented in the native 8 bit encoding. On failure dies or, if the value of FAIL_OK is true, returns false.

    Returns true on success.

    Note that this function does not handle arbitrary encodings. Therefore Encode is recommended for general purposes; see also Encode.

  • utf8::encode($string)

    Converts in-place the character sequence to the corresponding octet sequence in UTF-X. That is, every (possibly wide) character gets replaced with a sequence of one or more characters that represent the individual UTF-X bytes of the character. The UTF8 flag is turned off. Returns nothing.

        my $a = "\x{100}"; # $a contains one character, with ord 0x100
        utf8::encode($a);  # $a contains two characters, with ords 0xc4 and 0x80

    Note that this function does not handle arbitrary encodings. Therefore Encode is recommended for general purposes; see also Encode.

  • $success = utf8::decode($string)

    Attempts to convert in-place the octet sequence in UTF-X to the corresponding character sequence. That is, it replaces each sequence of characters in the string whose ords represent a valid UTF-X byte sequence, with the corresponding single character. The UTF-8 flag is turned on only if the source string contains multiple-byte UTF-X characters. If $string is invalid as UTF-X, returns false; otherwise returns true.

        my $a = "\xc4\x80"; # $a contains two characters, with ords 0xc4 and 0x80
        utf8::decode($a);   # $a contains one character, with ord 0x100

    Note that this function does not handle arbitrary encodings. Therefore Encode is recommended for general purposes; see also Encode.

  • $flag = utf8::is_utf8(STRING)

    (Since Perl 5.8.1) Test whether STRING is encoded internally in UTF-8. Functionally the same as Encode::is_utf8().

  • $flag = utf8::valid(STRING)

    [INTERNAL] Test whether STRING is in a consistent state regarding UTF-8. Will return true if it is well-formed UTF-8 and has the UTF-8 flag on or if STRING is held as bytes (both these states are 'consistent'). Main reason for this routine is to allow Perl's testsuite to check that operations have left strings in a consistent state. You most probably want to use utf8::is_utf8() instead.

utf8::encode is like utf8::upgrade, but the UTF8 flag is cleared. See perlunicode for more on the UTF8 flag and the C API functions sv_utf8_upgrade, sv_utf8_downgrade, sv_utf8_encode, and sv_utf8_decode, which are wrapped by the Perl functions utf8::upgrade, utf8::downgrade, utf8::encode and utf8::decode. Also, the functions utf8::is_utf8, utf8::valid, utf8::encode, utf8::decode, utf8::upgrade, and utf8::downgrade are actually internal, and thus always available, without a require utf8 statement.
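
The difference between the two pairs of functions described above is easy to miss: upgrade and downgrade change only how a string is stored, while encode and decode change the characters themselves. The sketch below illustrates this on an ASCII-based platform (an assumption; on EBCDIC the byte values differ). The variable names are made up for the example.

```perl
use strict;
use warnings;

# upgrade/downgrade: storage changes, characters stay the same.
my $s = "caf\xe9";                        # 4 characters, held as native bytes
my $flag_before = utf8::is_utf8($s) ? 1 : 0;
my $octets      = utf8::upgrade($s);      # octets needed to hold $s as UTF-8
my $flag_after  = utf8::is_utf8($s) ? 1 : 0;
my $same_text   = ($s eq "caf\xe9" && length($s) == 4) ? 1 : 0;
my $down_ok     = utf8::downgrade($s) ? 1 : 0;   # back to native bytes

# A character above 0xFF cannot be downgraded; FAIL_OK avoids the die.
my $wide      = "\x{100}";
my $wide_down = utf8::downgrade($wide, 1) ? 1 : 0;

# encode/decode: the character sequence itself changes.
my $e = "\x{100}";
utf8::encode($e);
my $encoded_len = length($e);             # UTF-8 bytes, one character each
my $decode_ok   = utf8::decode($e) ? 1 : 0;
my $decoded_ord = ord($e);

print "flag $flag_before -> $flag_after, octets $octets, same text $same_text\n";
print "downgrade ok $down_ok, wide downgrade ok $wide_down\n";
print "encoded length $encoded_len, decode ok $decode_ok, ord $decoded_ord\n";
```

For anything beyond this internal bookkeeping, such as reading or writing data in a named encoding, Encode remains the recommended tool.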

BUGS

One can have Unicode in identifier names, but not in package/class or subroutine names. While some limited functionality towards this does exist as of Perl 5.8.0, that is more accidental than designed; use of Unicode for the said purposes is unsupported.

One reason for this unfinished state is its (currently) inherent unportability: since both package names and subroutine names may need to be mapped to file and directory names, the Unicode capability of the filesystem becomes important, and unfortunately there are no portable answers.

SEE ALSO

perlunitut, perluniintro, perlrun, bytes, perlunicode

 
vars

NAME

vars - Perl pragma to predeclare global variable names

SYNOPSIS

    use vars qw($frob @mung %seen);

DESCRIPTION

NOTE: For use with variables in the current package for a single scope, the functionality provided by this pragma has been superseded by our declarations, available in Perl v5.6.0 or later, and use of this pragma is discouraged. See our.

This will predeclare all the variables whose names are in the list, allowing you to use them under "use strict", and disabling any typo warnings.

Unlike pragmas that affect the $^H hints variable, the use vars and use subs declarations are not BLOCK-scoped. They are thus effective for the entire file in which they appear. You may not rescind such declarations with no vars or no subs.

Packages such as the AutoLoader and SelfLoader that delay loading of subroutines within packages can create problems with package lexicals defined using my(). While the vars pragma cannot duplicate the effect of package lexicals (total transparency outside of the package), it can act as an acceptable substitute by pre-declaring global symbols, ensuring their availability to the later-loaded routines.
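
As a small illustration of predeclaring package globals, the sketch below uses a hypothetical Counter package; the package, variable and subroutine names are invented for the example. New code would normally write our ($count, @history); instead, as the note at the top of this section recommends.

```perl
use strict;
use warnings;

package Counter;               # hypothetical package name
use vars qw($count @history);  # predeclare $Counter::count and @Counter::history

$count = 0;                    # unqualified names are now accepted under strict

sub bump {
    push @history, ++$count;
    return $count;
}

package main;

Counter::bump() for 1 .. 3;
print "count=$Counter::count history=@Counter::history\n";
```

Outside the Counter package the variables are still reachable by their fully qualified names, which is what makes them visible to later-loaded routines.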

See Pragmatic Modules in perlmodlib.

 
vmsish

NAME

vmsish - Perl pragma to control VMS-specific language features

SYNOPSIS

    use vmsish;
    use vmsish 'status';   # or '$?'
    use vmsish 'exit';
    use vmsish 'time';
    use vmsish 'hushed';
    no vmsish 'hushed';
    vmsish::hushed($hush);
    use vmsish;
    no vmsish 'time';

DESCRIPTION

If no import list is supplied, all possible VMS-specific features are assumed. Currently, there are four VMS-specific features available: 'status' (a.k.a '$?'), 'exit', 'time' and 'hushed'.

If you're not running VMS, this module does nothing.

  • vmsish status

    This makes $? and system return the native VMS exit status instead of emulating the POSIX exit status.

  • vmsish exit

    This makes exit 1 produce a successful exit (with status SS$_NORMAL), instead of emulating UNIX exit(), which considers exit 1 to indicate an error. As with the CRTL's exit() function, exit 0 is also mapped to an exit status of SS$_NORMAL, and any other argument to exit() is used directly as Perl's exit status.

  • vmsish time

    This makes all times relative to the local time zone, instead of the default of Universal Time (a.k.a Greenwich Mean Time, or GMT).

  • vmsish hushed

    This suppresses printing of VMS status messages to SYS$OUTPUT and SYS$ERROR if Perl terminates with an error status, and allows programs that are expecting "unix-style" Perl to avoid having to parse VMS error messages. It does not suppress any messages from Perl itself, just the messages generated by DCL after Perl exits. The DCL symbol $STATUS will still have the termination status, but with a high-order bit set:

    EXAMPLE:

        $ perl -e"exit 44;"                          Non-hushed error exit
        %SYSTEM-F-ABORT, abort                       DCL message
        $ show sym $STATUS
          $STATUS == "%X0000002C"

        $ perl -e"use vmsish qw(hushed); exit 44;"   Hushed error exit
        $ show sym $STATUS
          $STATUS == "%X1000002C"

    The 'hushed' flag has a global scope during compilation: the exit() or die() commands that are compiled after 'vmsish hushed' will be hushed when they are executed. Doing a "no vmsish 'hushed'" turns off the hushed flag.

    The status of the hushed flag also affects output of VMS error messages from compilation errors. Again, you still get the Perl error message (and the code in $STATUS).

    EXAMPLE:

        use vmsish 'hushed';    # turn on hushed flag
        use Carp;               # Carp compiled hushed
        exit 44;                # will be hushed
        croak('I die');         # will be hushed
        no vmsish 'hushed';     # turn off hushed flag
        exit 44;                # will not be hushed
        croak('I die2');        # WILL be hushed, croak was compiled hushed

    You can also control the 'hushed' flag at run-time, using the built-in routine vmsish::hushed(). Without argument, it returns the hushed status. Since vmsish::hushed is built-in, you do not need to "use vmsish" to call it.

    EXAMPLE:

        if ($quiet_exit) {
            vmsish::hushed(1);
        }
        print "Sssshhhh...I'm hushed...\n" if vmsish::hushed();
        exit 44;

    Note that an exit() or die() that is compiled 'hushed' because of "use vmsish" is not un-hushed by calling vmsish::hushed(0) at runtime.

    The messages from error exits from inside the Perl core are generally more serious, and are not suppressed.

See Perl Modules in perlmod.

 
warnings

NAME

warnings - Perl pragma to control optional warnings

SYNOPSIS

    use warnings;
    no warnings;

    use warnings "all";
    no warnings "all";

    use warnings::register;
    if (warnings::enabled()) {
        warnings::warn("some warning");
    }

    if (warnings::enabled("void")) {
        warnings::warn("void", "some warning");
    }

    if (warnings::enabled($object)) {
        warnings::warn($object, "some warning");
    }

    warnings::warnif("some warning");
    warnings::warnif("void", "some warning");
    warnings::warnif($object, "some warning");

DESCRIPTION

The warnings pragma is a replacement for the command line flag -w, but the pragma is limited to the enclosing block, while the flag is global. See perllexwarn for more information and the list of built-in warning categories.

If no import list is supplied, all possible warnings are either enabled or disabled.

A number of functions are provided to assist module authors.

  • use warnings::register

    Creates a new warnings category with the same name as the package where the call to the pragma is used.

  • warnings::enabled()

    Use the warnings category with the same name as the current package.

    Return TRUE if that warnings category is enabled in the calling module. Otherwise returns FALSE.

  • warnings::enabled($category)

    Return TRUE if the warnings category, $category, is enabled in the calling module. Otherwise returns FALSE.

  • warnings::enabled($object)

    Use the name of the class for the object reference, $object, as the warnings category.

    Return TRUE if that warnings category is enabled in the first scope where the object is used. Otherwise returns FALSE.

  • warnings::fatal_enabled()

    Return TRUE if the warnings category with the same name as the current package has been set to FATAL in the calling module. Otherwise returns FALSE.

  • warnings::fatal_enabled($category)

    Return TRUE if the warnings category $category has been set to FATAL in the calling module. Otherwise returns FALSE.

  • warnings::fatal_enabled($object)

    Use the name of the class for the object reference, $object, as the warnings category.

    Return TRUE if that warnings category has been set to FATAL in the first scope where the object is used. Otherwise returns FALSE.

  • warnings::warn($message)

    Print $message to STDERR.

    Use the warnings category with the same name as the current package.

    If that warnings category has been set to "FATAL" in the calling module then die. Otherwise return.

  • warnings::warn($category, $message)

    Print $message to STDERR.

    If the warnings category, $category, has been set to "FATAL" in the calling module then die. Otherwise return.

  • warnings::warn($object, $message)

    Print $message to STDERR.

    Use the name of the class for the object reference, $object, as the warnings category.

    If that warnings category has been set to "FATAL" in the scope where $object is first used then die. Otherwise return.

  • warnings::warnif($message)

    Equivalent to:

        if (warnings::enabled())
            { warnings::warn($message) }

  • warnings::warnif($category, $message)

    Equivalent to:

        if (warnings::enabled($category))
            { warnings::warn($category, $message) }

  • warnings::warnif($object, $message)

    Equivalent to:

        if (warnings::enabled($object))
            { warnings::warn($object, $message) }

  • warnings::register_categories(@names)

    This registers warning categories for the given names and is primarily for use by the warnings::register pragma, for which see perllexwarn.
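
The functions above are meant to be combined: a module registers its own category, and its callers decide lexically whether they want to hear from it. The following single-file sketch shows that interaction. My::Parser and its parse routine are hypothetical names, and keeping the module in the same file as its caller is only a convenience for the example.

```perl
use strict;
use warnings;

{
    package My::Parser;          # hypothetical module, kept in the same file
    use warnings::register;      # registers the category "My::Parser"

    sub parse {
        my ($text) = @_;
        # Warn only if the caller has the "My::Parser" category enabled.
        warnings::warnif("empty input") if $text eq '';
        return length $text;
    }
}

package main;

my @caught;
$SIG{__WARN__} = sub { push @caught, $_[0] };   # collect warnings for inspection

My::Parser::parse('');           # caller has all warnings enabled: warns

{
    no warnings 'My::Parser';    # caller opts out of this one category
    My::Parser::parse('');       # silent
}

print scalar(@caught), " warning(s) caught: $caught[0]";
```

The decision is taken from the caller's lexical scope, so the module itself never needs to know who has silenced it.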

See Pragmatic Modules in perlmodlib and perllexwarn.

 
xsubpp

NAME

xsubpp - compiler to convert Perl XS code into C code

SYNOPSIS

xsubpp [-v] [-except] [-s pattern] [-prototypes] [-noversioncheck] [-nolinenumbers] [-nooptimize] [-typemap typemap] [-output filename]... file.xs

DESCRIPTION

This compiler is typically run by the makefiles created by ExtUtils::MakeMaker or by Module::Build or other Perl module build tools.

xsubpp will compile XS code into C code by embedding the constructs necessary to let C functions manipulate Perl values and creates the glue necessary to let Perl access those functions. The compiler uses typemaps to determine how to map C function parameters and variables to Perl values.

The compiler will search for typemap files called typemap. It will use the following search path to find default typemaps, with the rightmost typemap taking precedence.

    ../../../typemap:../../typemap:../typemap:typemap

It will also use a default typemap installed as ExtUtils::typemap.

OPTIONS

Note that the XSOPT MakeMaker option may be used to add these options to any makefiles generated by MakeMaker.

  • -hiertype

    Retains '::' in type names so that C++ hierarchical types can be mapped.

  • -except

    Adds exception handling stubs to the C code.

  • -typemap typemap

    Indicates that a user-supplied typemap should take precedence over the default typemaps. This option may be used multiple times, with the last typemap having the highest precedence.

  • -output filename

    Specifies the name of the output file to generate. If no file is specified, output will be written to standard output.

  • -v

    Prints the xsubpp version number to standard output, then exits.

  • -prototypes

    By default xsubpp will not automatically generate prototype code for all xsubs. This flag will enable prototypes.

  • -noversioncheck

    Disables the run time test that determines if the object file (derived from the .xs file) and the .pm files have the same version number.

  • -nolinenumbers

    Prevents the inclusion of '#line' directives in the output.

  • -nooptimize

    Disables certain optimizations. The only optimization that is currently affected is the use of targets by the output C code (see perlguts). This may significantly slow down the generated code, but this is the way xsubpp of 5.005 and earlier operated.

  • -noinout

    Disable recognition of IN, OUT_LIST and INOUT_LIST declarations.

  • -noargtypes

    Disable recognition of ANSI-like descriptions of function signature.

  • -C++

    Currently doesn't do anything at all. This flag has been a no-op for many versions of perl, at least as far back as perl5.003_07. It's allowed here for backwards compatibility.

ENVIRONMENT

No environment variables are used.

AUTHOR

Originally by Larry Wall. Turned into the ExtUtils::ParseXS module by Ken Williams.

MODIFICATION HISTORY

See the file Changes.

SEE ALSO

perl(1), perlxs(1), perlxstut(1), ExtUtils::ParseXS

 
warnings::register

NAME

warnings::register - warnings import function

SYNOPSIS

    use warnings::register;

DESCRIPTION

Creates a warnings category with the same name as the current package.

See warnings and perllexwarn for more information on this module's usage.

 
threads::shared

NAME

threads::shared - Perl extension for sharing data structures between threads

VERSION

This document describes threads::shared version 1.43

SYNOPSIS

    use threads;
    use threads::shared;

    my $var :shared;
    my %hsh :shared;
    my @ary :shared;

    my ($scalar, @array, %hash);
    share($scalar);
    share(@array);
    share(%hash);

    $var = $scalar_value;
    $var = $shared_ref_value;
    $var = shared_clone($non_shared_ref_value);
    $var = shared_clone({'foo' => [qw/foo bar baz/]});

    $hsh{'foo'} = $scalar_value;
    $hsh{'bar'} = $shared_ref_value;
    $hsh{'baz'} = shared_clone($non_shared_ref_value);
    $hsh{'quz'} = shared_clone([1..3]);

    $ary[0] = $scalar_value;
    $ary[1] = $shared_ref_value;
    $ary[2] = shared_clone($non_shared_ref_value);
    $ary[3] = shared_clone([ {}, [] ]);

    { lock(%hash); ... }

    cond_wait($scalar);
    cond_timedwait($scalar, time() + 30);
    cond_broadcast(@array);
    cond_signal(%hash);

    my $lockvar :shared;
    # condition var != lock var
    cond_wait($var, $lockvar);
    cond_timedwait($var, time()+30, $lockvar);

DESCRIPTION

By default, variables are private to each thread, and each newly created thread gets a private copy of each existing variable. This module allows you to share variables across different threads (and pseudo-forks on Win32). It is used together with the threads module.

This module supports the sharing of the following data types only: scalars and scalar refs, arrays and array refs, and hashes and hash refs.
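
The contrast between private and shared variables can be seen in a short sketch. It assumes a Perl built with ithreads; the variable names, thread count and loop count are arbitrary choices for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $total :shared = 0;   # one value, visible to every thread
my $private = 0;         # each new thread gets its own copy of this

my @workers = map {
    threads->create(sub {
        for (1 .. 1000) {
            lock($total);    # held until the end of this loop iteration
            $total++;
        }
        $private++;          # changes only this thread's copy
    });
} 1 .. 4;

$_->join() for @workers;

print "total=$total private=$private\n";
```

The lock is needed because incrementing a shared variable is a read-modify-write sequence that can otherwise interleave between threads.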

EXPORT

The following functions are exported by this module: share, shared_clone, is_shared, cond_wait, cond_timedwait, cond_signal and cond_broadcast.

Note that if this module is imported when threads has not yet been loaded, then these functions all become no-ops. This makes it possible to write modules that will work in both threaded and non-threaded environments.

FUNCTIONS

  • share VARIABLE

    share takes a variable and marks it as shared:

        my ($scalar, @array, %hash);
        share($scalar);
        share(@array);
        share(%hash);

    share will return the shared rvalue, but always as a reference.

    Variables can also be marked as shared at compile time by using the :shared attribute:

        my ($var, %hash, @array) :shared;

    Shared variables can only store scalars, refs of shared variables, or refs of shared data (discussed in next section):

        my ($var, %hash, @array) :shared;
        my $bork;

        # Storing scalars
        $var = 1;
        $hash{'foo'} = 'bar';
        $array[0] = 1.5;

        # Storing shared refs
        $var = \%hash;
        $hash{'ary'} = \@array;
        $array[1] = \$var;

        # The following are errors:
        #   $var = \$bork;                    # ref of non-shared variable
        #   $hash{'bork'} = [];               # non-shared array ref
        #   push(@array, { 'x' => 1 });       # non-shared hash ref

  • shared_clone REF

    shared_clone takes a reference, and returns a shared version of its argument, performing a deep copy on any non-shared elements. Any shared elements in the argument are used as is (i.e., they are not cloned).

        my $cpy = shared_clone({'foo' => [qw/foo bar baz/]});

    Object status (i.e., the class an object is blessed into) is also cloned.

        my $obj = {'foo' => [qw/foo bar baz/]};
        bless($obj, 'Foo');
        my $cpy = shared_clone($obj);
        print(ref($cpy), "\n"); # Outputs 'Foo'

    For cloning empty array or hash refs, the following may also be used:

        $var = &share([]); # Same as $var = shared_clone([]);
        $var = &share({}); # Same as $var = shared_clone({});

    Not all Perl data types can be cloned (e.g., globs, code refs). By default, shared_clone will croak if it encounters such items. To change this behaviour to a warning, set the following:

        $threads::shared::clone_warn = 1;

    In this case, undef will be substituted for the item to be cloned. If set to zero:

        $threads::shared::clone_warn = 0;

    then the undef substitution will be performed silently.

  • is_shared VARIABLE

    is_shared checks if the specified variable is shared or not. If shared, returns the variable's internal ID (similar to refaddr()). Otherwise, returns undef.

        if (is_shared($var)) {
            print("\$var is shared\n");
        } else {
            print("\$var is not shared\n");
        }

    When used on an element of an array or hash, is_shared checks if the specified element belongs to a shared array or hash. (It does not check the contents of that element.)

        my %hash :shared;
        if (is_shared(%hash)) {
            print("\%hash is shared\n");
        }

        $hash{'elem'} = 1;
        if (is_shared($hash{'elem'})) {
            print("\$hash{'elem'} is in a shared hash\n");
        }
  • lock VARIABLE

    lock places an advisory lock on a variable until the lock goes out of scope. If the variable is locked by another thread, the lock call will block until it's available. Multiple calls to lock by the same thread from within dynamically nested scopes are safe -- the variable will remain locked until the outermost lock on the variable goes out of scope.

    lock follows references exactly one level:

        my %hash :shared;
        my $ref = \%hash;
        lock($ref); # This is equivalent to lock(%hash)

    Note that you cannot explicitly unlock a variable; you can only wait for the lock to go out of scope. This is most easily accomplished by locking the variable inside a block.

        my $var :shared;
        {
            lock($var);
            # $var is locked from here to the end of the block
            ...
        }
        # $var is now unlocked

    As locks are advisory, they do not prevent data access or modification by another thread that does not itself attempt to obtain a lock on the variable.

    You cannot lock the individual elements of a container variable:

        my %hash :shared;
        $hash{'foo'} = 'bar';
        #lock($hash{'foo'}); # Error
        lock(%hash); # Works

    If you need more fine-grained control over shared variable access, see Thread::Semaphore.

  • cond_wait VARIABLE
  • cond_wait CONDVAR, LOCKVAR

    The cond_wait function takes a locked variable as a parameter, unlocks the variable, and blocks until another thread does a cond_signal or cond_broadcast for that same locked variable. The variable that cond_wait blocked on is re-locked after the cond_wait is satisfied. If there are multiple threads cond_waiting on the same variable, all but one will re-block waiting to reacquire the lock on the variable. (So if you're only using cond_wait for synchronization, give up the lock as soon as possible.) The two actions of unlocking the variable and entering the blocked wait state are atomic; the two actions of exiting from the blocked wait state and re-locking the variable are not.

    In its second form, cond_wait takes a shared, unlocked variable followed by a shared, locked variable. The second variable is unlocked and thread execution suspended until another thread signals the first variable.

    It is important to note that the variable can be notified even if no thread calls cond_signal or cond_broadcast on the variable. It is therefore important to check the value of the variable and go back to waiting if the requirement is not fulfilled. For example, to pause until a shared counter drops to zero:

        { lock($counter); cond_wait($counter) until $counter == 0; }
  • cond_timedwait VARIABLE, ABS_TIMEOUT
  • cond_timedwait CONDVAR, ABS_TIMEOUT, LOCKVAR

    In its two-argument form, cond_timedwait takes a locked variable and an absolute timeout in epoch seconds (see time for more) as parameters, unlocks the variable, and blocks until the timeout is reached or another thread signals the variable. A false value is returned if the timeout is reached, and a true value otherwise. In either case, the variable is re-locked upon return.

    Like cond_wait, this function may take a shared, locked variable as an additional parameter; in this case the first parameter is an unlocked condition variable protected by a distinct lock variable.

    Again like cond_wait, waking up and reacquiring the lock are not atomic, and you should always check your desired condition after this function returns. Since the timeout is an absolute value, however, it does not have to be recalculated with each pass:

        lock($var);
        my $abs = time() + 15;
        until ($ok = desired_condition($var)) {
            last if !cond_timedwait($var, $abs);
        }
        # we got it if $ok, otherwise we timed out!
  • cond_signal VARIABLE

    The cond_signal function takes a locked variable as a parameter and unblocks one thread that's cond_waiting on that variable. If more than one thread is blocked in a cond_wait on that variable, only one (and which one is indeterminate) will be unblocked.

    If there are no threads blocked in a cond_wait on the variable, the signal is discarded. By always locking before signaling, you can (with care) avoid signaling before another thread has entered cond_wait().

    cond_signal will normally generate a warning if you attempt to use it on an unlocked variable. On the rare occasions where doing this may be sensible, you can suppress the warning with:

        { no warnings 'threads'; cond_signal($foo); }
  • cond_broadcast VARIABLE

    The cond_broadcast function works similarly to cond_signal. cond_broadcast, though, will unblock all the threads that are blocked in a cond_wait on the locked variable, rather than only one.
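
The locking and condition functions described above are usually combined. The following is a minimal producer/consumer sketch, not taken from the module's distribution: the names @queue and $done are illustrative assumptions, and both are protected by the lock on @queue. The consumer re-checks its condition after every wake-up, as recommended for cond_wait.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my @queue :shared;       # work items; lock(@queue) protects @queue and $done
my $done  :shared = 0;   # set to 1 once the producer has finished

# Consumer: sums every item it receives and returns the total
my $consumer = threads->create(sub {
    my $sum = 0;
    while (1) {
        lock(@queue);
        # A wake-up does not guarantee the condition, so test it in a loop
        cond_wait(@queue) until @queue || $done;
        last if !@queue && $done;
        $sum += shift(@queue);
    }
    return $sum;
});

# Producer: always lock before signalling
for my $item (1 .. 5) {
    lock(@queue);
    push(@queue, $item);
    cond_signal(@queue);
}
{
    lock(@queue);
    $done = 1;
    cond_signal(@queue);
}

my $total = $consumer->join();
print("total = $total\n");
```

Each lock is taken inside its own block (the loop body or a bare block), so it is released at the end of every iteration and the other thread gets a chance to run.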

OBJECTS

threads::shared exports a version of bless() that works on shared objects, such that blessings propagate across threads.

    # Create a shared 'Foo' object
    my $foo :shared = shared_clone({});
    bless($foo, 'Foo');

    # Create a shared 'Bar' object
    my $bar :shared = shared_clone({});
    bless($bar, 'Bar');

    # Put 'bar' inside 'foo'
    $foo->{'bar'} = $bar;

    # Rebless the objects via a thread
    threads->create(sub {
        # Rebless the outer object
        bless($foo, 'Yin');

        # Cannot directly rebless the inner object
        #bless($foo->{'bar'}, 'Yang');

        # Retrieve and rebless the inner object
        my $obj = $foo->{'bar'};
        bless($obj, 'Yang');
        $foo->{'bar'} = $obj;
    })->join();

    print(ref($foo), "\n");          # Prints 'Yin'
    print(ref($foo->{'bar'}), "\n"); # Prints 'Yang'
    print(ref($bar), "\n");          # Also prints 'Yang'

NOTES

threads::shared is designed to disable itself silently if threads are not available. This allows you to write modules and packages that can be used in both threaded and non-threaded applications.

If you want access to threads, you must use threads before you use threads::shared. threads will emit a warning if you use it after threads::shared.

BUGS AND LIMITATIONS

When share is used on arrays, hashes, array refs or hash refs, any data they contain will be lost.

    my @arr = qw(foo bar baz);
    share(@arr);
    # @arr is now empty (i.e., == ());

    # Create a 'foo' object
    my $foo = { 'data' => 99 };
    bless($foo, 'foo');

    # Share the object
    share($foo); # Contents are now wiped out
    print("ERROR: \$foo is empty\n")
        if (! exists($foo->{'data'}));

Therefore, populate such variables after declaring them as shared. (Scalars and scalar refs are not affected by this problem.)
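
As a short sketch of that advice (the variable names @source, @arr and $copy are illustrative only), existing data can be kept either by declaring the array as shared first and filling it afterwards, or by letting shared_clone build a shared deep copy:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my @source = qw(foo bar baz);   # ordinary, non-shared data

# Option 1: declare the array as shared first, then populate it
my @arr :shared;
@arr = @source;

# Option 2: let shared_clone() build a shared deep copy of the data
my $copy = shared_clone(\@source);

print(scalar(@arr), " elements in \@arr\n");
print(scalar(@{$copy}), " elements in the clone\n");
```

In both cases @source itself stays private to the thread; only @arr and the structure behind $copy are shared.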

It is often not wise to share an object unless the class itself has been written to support sharing. For example, an object's destructor may get called multiple times, once for each thread's scope exit. Another danger is that the contents of hash-based objects will be lost due to the above mentioned limitation. See examples/class.pl (in the CPAN distribution of this module) for how to create a class that supports object sharing.
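
The following is a minimal sketch of a class written with sharing in mind. It is not the examples/class.pl file from the distribution: the Counter class and its methods are hypothetical and exist only for illustration. The constructor creates the object's hash as shared data before anything is stored in it, and the package imports threads::shared so that the sharing-aware bless is used.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

package Counter;        # hypothetical class, for illustration only
use threads::shared;    # imports the sharing-aware bless() into this package

sub new {
    my ($class) = @_;
    # Build the object as shared data from the start
    my $self = shared_clone({ 'count' => 0 });
    return bless($self, $class);
}

sub increment {
    my ($self) = @_;
    lock($self);        # lock() follows the reference to the shared hash
    return ++$self->{'count'};
}

sub value {
    my ($self) = @_;
    lock($self);
    return $self->{'count'};
}

package main;

my $counter = Counter->new();

# Four threads each increment the same shared object 100 times
my @workers = map {
    threads->create(sub { $counter->increment() for 1 .. 100; });
} 1 .. 4;
$_->join() for @workers;

print($counter->value(), "\n");
```

Because every method takes the advisory lock before touching the hash, the increments from different threads do not interleave. The class deliberately defines no destructor, which sidesteps the repeated-destruction issue mentioned above.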

Destructors may not be called on objects if those objects still exist at global destruction time. If the destructors must be called, make sure there are no circular references and that nothing is referencing the objects, before the program ends.

Does not support splice on arrays. Does not support explicitly changing array lengths via $#array -- use push and pop instead.
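
Where an element has to be removed from the middle of a shared array, one possible workaround (a sketch only; @items and @keep are illustrative names) is to build the new contents in a private array while holding the lock, and then assign them back:

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my @items :shared;
@items = qw(a b c d);

{
    lock(@items);
    # Collect the elements to keep in a private (non-shared) array...
    my @keep = grep { $_ ne 'b' } @items;
    # ...then replace the shared array's contents in one assignment
    @items = @keep;
}

print("@items\n");
```

This only suits arrays of plain scalars or shared references, and it is safe only if every other thread also locks @items before using it.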

Taking references to the elements of shared arrays and hashes does not autovivify the elements, and neither does slicing a shared array/hash over non-existent indices/keys autovivify the elements.

share() allows you to share($hashref->{key}) and share($arrayref->[idx]) without giving any error message. But the $hashref->{key} or $arrayref->[idx] is not shared, causing the error "lock can only be used on shared values" to occur when you attempt to lock($hashref->{key}) or lock($arrayref->[idx]) in another thread.

Using refaddr() is unreliable for testing whether or not two shared references are equivalent (e.g., when testing for circular references). Use is_shared() instead:

    use threads;
    use threads::shared;
    use Scalar::Util qw(refaddr);

    # If ref is shared, use threads::shared's internal ID.
    # Otherwise, use refaddr().
    my $addr1 = is_shared($ref1) || refaddr($ref1);
    my $addr2 = is_shared($ref2) || refaddr($ref2);

    if ($addr1 == $addr2) {
        # The refs are equivalent
    }

each() does not work properly on shared references embedded in shared structures. For example:

    my %foo :shared;
    $foo{'bar'} = shared_clone({'a'=>'x', 'b'=>'y', 'c'=>'z'});

    while (my ($key, $val) = each(%{$foo{'bar'}})) {
        ...
    }

Either of the following will work instead:

    my $ref = $foo{'bar'};
    while (my ($key, $val) = each(%{$ref})) {
        ...
    }

    foreach my $key (keys(%{$foo{'bar'}})) {
        my $val = $foo{'bar'}{$key};
        ...
    }

This module supports dual-valued variables created using dualvar() from Scalar::Util. However, while $! acts like a dualvar, it is implemented as a tied SV. To propagate its value, use the following construct, if needed:

    my $errno :shared = dualvar($!,$!);

View existing bug reports at, and submit any new bugs, problems, patches, etc. to: http://rt.cpan.org/Public/Dist/Display.html?Name=threads-shared

SEE ALSO

threads::shared Discussion Forum on CPAN: http://www.cpanforum.com/dist/threads-shared

threads, perlthrtut

http://www.perl.com/pub/a/2002/06/11/threads.html and http://www.perl.com/pub/a/2002/09/04/threads.html

Perl threads mailing list: http://lists.perl.org/list/ithreads.html

AUTHOR

Artur Bergman <sky AT crucially DOT net>

Documentation borrowed from the old Thread.pm.

CPAN version produced by Jerry D. Hedden <jdhedden AT cpan DOT org>.

LICENSE

threads::shared is released under the same license as Perl.

 
perldoc-html/static/._1x1.png000644 000765 000024 00000000674 12276001416 016047 0ustar00jjstaff000000 000000 Mac OS X  2мATTR¼ÈôÈœ%com.apple.metadata:kMDItemWhereFromsdXcom.apple.quarantinebplist00¢_:http://upload.wikimedia.org/wikipedia/commons/c/ca/1x1.png_.http://commons.wikimedia.org/wiki/File:1x1.png Hyq/0001;4b93b00f;Google\x20Chrome;F1AB5FE9-0DA6-4A2B-8082-528154F37DD5|com.google.Chromeperldoc-html/static/1x1.png000644 000765 000024 00000000137 12276001416 015624 0ustar00jjstaff000000 000000 ‰PNG  IHDR%ÛVÊPLTE§z=ÚtRNS@æØf IDAT×c`â!¼3IEND®B`‚perldoc-html/static/Logo_40wht.gif000644 000765 000024 00000007405 12276001416 017127 0ustar00jjstaff000000 000000 GIF89a€5÷üñðñíìíÑÑÓÉÉËããäààáÔÔÕÎÎÏ"j¿ÁÆ>Å/š,(„<¸6§&s 1ƒ B‹;z/QœŠ’£˜ ±GÖEÌPÚ FÀMÉF²SÆ#XÃ+Tª8b·°=I¸HgÁf€ÏW‰VÆÆÃóóñööõÞÞÝëêãñÞ„ÑÊ©ôÌ-õØWñá›ÔÆéá¿ÏÍÄÉÈÄÆšá± ÷Æ íÀÕ«Ë­>ư^’‚KŸ“i×ÔÉáߨèæß¤‹k²‹{`Šu/¿¶™ÞÛÑóñ쮢”öõôõôóáàßÖÕÔã͹èàÚµ¬¦¾¶²ÅÁ¿èæåϼµ²±ìéèçäãÜÙØè¨›òèæÞQ9álWä‹zÏ¡™à¸±èÒÎäÙ×Ü% Û-á1Ô8!ã>$ÊŒ‚ª‰Í­¨Í¹¶ÕÎͽÐ! £¹%Ç0Ë3Ã:)¶E8ÂWI¸bW¸}v‰`[›yuÜÔÓ²¥’ ά"§2&}1*F=r ÔÆÅÐÌÌfffÔÑÑÞÜÜüûûæååûûûùùù÷÷÷ôôôóóóïïïëëëéééçççäääâââßßßÜÜÜÚÚÚÙÙÙ×××ÓÓÓÈÈÈÇÇÇÆÆÆÄÄÄ»»»!ùü,€5ÿù H° Áƒ*\Ȱ¡Ã‡#JœH±¢Å‹3jÜȱ£Ç CŠI²¤É“(Sª\ɲ¥ËäÌÉ4Gî¥ÍŒæÎJ§.Ý/4mÆ<çÅ º£^νyÐ\›5hÐüÙcf!s-czIçDÉ‘H¤(ÒEÝtJ™| >eàdä¥fʘè”y"†_\ÜRI«Ko’«óeK?cÔ(à V”äÎ8bƒƒˆ%åB¿C1ƒƒ`/’¬³¬–ºuðÒôɣƊ¼vé.›4——ˆç` @èwS7Þ´bÇ çT„NEà „!;èx—^ e ÃåƒG9Þ¢Ȱ†4êD(‚¤@à‚¶@€Ê­ìy¤(l01!o(A À Àq¸ÜÁ¾q e¬Âõ Ç<èQŒgxƒÃ0Æ;ÿ¾à…Š¢E¸Pÿ €ÑL.la RŒ)îpÐäÑ `ô¢Üƒ#‘ˆN¬bLhGØñz°b’€Ä%¬°!l`\°‡< PzœÀh@ ¬ 7\. 
†-²‹ØãÂx†3¢±ŠVX¡XG'BŽ+Øa7ØHpÿAQIMʉ:\‹`\ƒ—̤OÖA€·‘ΰ`…"&ñ‰UXåXM:P€; Àƒô&Àª: 0„á†lôaa#¤(†0l¡ŒhˆCpd¬P?N^¡;¨A0.У„@庰ÿ‚`a@5nñ‹kL# Ûé—Ø!oä…""Ñ R¼+7æHG;è±2H|3 (+p€ÛÀôA×ú‡ÍËááÞ,)¢‘‹Y¨m‡ *(Bˆ,¤Àp@3 ðŸûFkUÓn“à&`±}P` xäÀ`<À‡´\L镾ˆ 4îQx¸hTšHFÝQ(àÅ8eR¨ó²£ëU2|h`´BŠØI;Þ˜ÆnÂú+¦u#(ñ Rìú0A B8Æ´68¸4¬@6  PFÿ~Ò.E\€x$àx±|É« å P/¨É}àƒÊX³6öqOŒª£¸È†¼ JDBŽVÃc2ˆB\ûàÂ]ß›ì@ƒ¶æ†ÁðÌÁRQzi ñclt£·‰™L–„“A°øx€0à¼ÙHè&Žmð¯‰Ú «ØF5ªqkÊ̲Êà+ôŠ}àÆD=Êñ…€rÄ>ÈÌ8 0 B`WL¶ÃBヶ+´¢Õ86š‘‹qdá ê0….v‘∸¦:hBð\ IˆBÖñ…u´ ;pÀƒkÚÚÚÿ¸F5œa Æ‚ˈn¿ï`;„œð$,!ŠÂbÊ °äè;´£:@HÙæz¼ƒíˆCÆ †3\9€EÊ«ñãgƒ‘Œ˜u‘ŒÆõx¨À"@qx9ˆ; @ðÀúÀVp™wCȃ‡Y´,÷™-qÃÔ„ ó‹ç¡@ß7†dšBÁ‘_!Ÿ({hŠéÈš7a^é·Â¯ˆ]@ံ™9!•C1œl¹$ʉœÌy;perldoc-html/static/banner.png000644 000765 000024 00000061001 12276001416 016455 0ustar00jjstaff000000 000000 ‰PNG  IHDRÙi.–àiCCPICC ProfilexÚ”OhUÇ?o:¡ ‰ ®5”‡‡$ k+ºE¨É&é6î²]¦Ùü£ ›Ù·»c^fÇ7³[[ŠH@¼ÙêQ¼T‹x¨"$Oö h­z‚xª" ½HY³&Ôª?ø¼ßï÷¾¿?†>«¶$lù‘qrY¹º¶.^cˆ†*nÌ–J€Jhî±;? ®OU‚@ç—O^ûõ¹Ç>yûóüO£_•ÿn#fum„Rõ˜ŸR1ŸRg¢ qH¹JDLš%gÄûÀƒõo$¸ªB†f€¦˜¬i»õÄ íW=Ä©†îX°þ\][—q›QŽ=¾øÖC¸ò<:>ðMŒÂ#+°sdà»í qh7¬9 €ÎÂÐÏÎíq8ø.Ü}§ÓùëƒNçî‡p`¾ÖnË´»;â;ˆûϰãôè~Ï@.€• ÿ&¼÷ <õ)<ü”F`é(Öô¾H½Ì5ƒ³Æ«7"y8~^ÎVrÑw§'eEki¼z# ¥Q¡2mUfK·Ün½‡€aå—Oã`½®Â….‹íje>Lƒø­ªæ€I×¼ã‹À°W3ÇË1‹§½hq)fkÙ×ÅB—GýâÉ®¦DY§§¶O-ôü¯T^*c œÍfÞ5s®±´³ÐçsÅ.oZNx¬Ë.âÞ¬oXC£ððQøHrd™"ÀФ†‡‡Æ#‡ÂGaðÙüÇLM)Á’7ñ¹‰ÁãUZ($Ëd‹lOödúÇôééKéËéß/޵&‘mó²çî^¸E …éëv£Ýžb}—&³h4u[ý™ÂD§‰îjþű’»{á–z«x'1¥JTšbEÈ&6 Mˆb6Qí~ÛSè«ç¿Ôºa>|õü¾]5ï™Jõ§Úä%ï+û û{ÑÎØG‘ö‹öŒ}Ìž·3ö v¡£ŒÇ† Ÿ³È}ÿ¯—Kü¶Ê)_Ï•N.+K¦Yó´JÎ2D¦((*Š–\ ("ˆl¡ÌBwYt¦{§ÙùýÁïÞonîM›t1<¯Ç£¶É͹çyϹIÞï÷y¿/ppœ w¡\eÎ î†^ÝÑ=Àþ>^ðQ¸Cî*ƒ\æ ±H""""""""""""¢ûSZµõ¨­«‡²¢E¥e¸U\Š›y¸™[€º†Æ»²ß¼»ÅÈv•9cØÀ~‰èÈÞèèGf§ò K”q’ÒqîJÚ]ctßQ#ÛÍU†ñÅ .v(F…ƒÇã‘™BDDDDDDDDDDDDd—ŒF#®¤_ÇáSçð×é ¨­oøgÙ1ÑQ˜:q F B€Ì""""""""""""¢‘N¯Çñ³‰Øsð.&gÜ¿F6ŸÏÇ#ÇóSCXH72òDDDDDDDDDDDDDªë9ùعû< “©kâË]bdÇŠ—gOC÷_2ÊDDDDDDDDDDDDD]ªÜ‚b|úÍn9}áÞ6²ƒƒüñ΢9Ü¿U""""""""""""¢;ª ÉéX³uò‹Jï-#ÛQâ€3žÆ¬ÉIÎ5Ñ]#^»àË~…Z£¹ûìÈÐlzg1}½ÉèÝ•*(Qâµ5ájV^‡¶+põ﹪£›ñälzg1Ü\edĈˆˆˆˆˆˆˆˆˆˆˆˆîZ¹º8ãñq#ÑФBúõìk·C"Ù2g)Ö¼ö"F €ŒÑ=¥£g.bù‡_ ±Iuçl/Ow|¹þ¿èÙ=ðŽ\ ½^½^NƒÁ@ÿFF»Ê´óx<˜L&ðx< 
ÿnk©÷ÖÚëêó~Âß•üDDDDDD÷’Èç?á'üÛ@>Ÿ>Ÿ@¡P¡P‘HÁ]PËëfî-Ì{*ªkÝ=À_nX?oÅ»%%%ÈÏÏG~~>ÊÊÊPVV†ÚÚZ466¢±±ƒ:Φ‰Bn2ÂOøÛöz£Ñ½^OÆŸð~ÂOø ÿ=ÉÏãñ ‰Èø~Â߉ýqtt„››\]]!—Ë¡P(€ÀÀ@ÀËËë®p¸•–cÞÛkQP¬ìz#;ªwO|¾ö-Èe.] ŸŸÄÄD$''ãêÕ«¨¯¯¿£ÏÎŒüÝí‘EÂOø ?á'ü„Ÿð~ÂOø ?áïˆþÉd2DFF"""‚þ- ïgM]=üw=2oæv‘Ü ßl^©S—@àèÑ£8yò$JJJ:lÐ;ûø»ÍsFø ?á'ü„Ÿð~ÂOø ?á'ü÷¿³³3† ‚áÇcðàÁprêÛ“R]C#f-]…ìüÂÎ7²|½ðÝGïAá.ïT(“É„³gÏbß¾}HIIÁý zÎ+á'ü„Ÿð~ÂOø ?á'ü„ŸðÛËïàà€Ñ£Gã±ÇCXXX—õ¿¼²3^]’²ŠÎ3²Ý\eøþã÷äïÓi F£'NœÀ÷ßüüüs:ÂïÌöÛ{¾Îîá'ü„Ÿð~ÂOø ?á'ü„ŸðßKü¡¡¡˜6mbcc;­ßæÊ/*ÅŒÅËQ[ßÐñF¶H(Ä·[ÞETïžpéÒ%|ñÅÈËËûGL¢Îá'ü„Ÿð~ÂOø ?á'ü„Ÿðßü=zôÀìÙ³ñàƒvúùR¯eaÖ’•Ð 6/põï¹Ê–ß|qÆŒÜ).--Åš5kðõ×_£¦¦¦ÃÚ5dó¿ÿ‰Ë3?á'ü„Ÿð~ÂOø ?á'ü„ÿ~QMM Nž<‰ . W¯^ðôôì´sù(<àè(Á¹Ë©¶ƒ-‘ìÑÃããUÿéðÎFüüóÏøú믡Ñh:½Ä}K“°-ž™ö èêÿ„Ÿð~ÂOø ?á'ü„Ÿð~¿ñóx<<õÔS˜;wn§H{yùû8uáJûlö|¾2gi‡vP©TbíÚµHKK³:¨öL @@ÿðx<ðù|ú5TÛT»z½F£ƒ†Bþ=©Z›Ô]}S~ÂOø ?á'ü„Ÿð~ÂOø ÿ½Ê¯P(ðßÿþì#»®¾“_|¥å•í3²ãß1ÑQÚ¹„„¬Y³MMMm$>Ÿ±X ±X ‰DÒ®ýÓL&t:4 4 t:]§y>:ÛóÓÑ“žð~ÂOø ?á'ü„Ÿð~ÂOøï5þ石gϦ¯©³—S1ÿíu-ÓbNö„‡‡ã…©“:ô"ÿïÿÃ|`·1ËãñàààÈår899A,·xáš››!‰ZmW ÀÁÁNNNJ¥àóùt”›Š€SiÏR óטOˆ;UÁÞþZþ~ÂOø ?á'ü„Ÿð~ÂOø ÿÝΟ’’‚´´4 2ŽŽŽÚvŸ²o!çV‘uÓZ$[êäˆ?v~Ôaûaët:¬[·Gµk ø|>œ!•J!ì:çœ9sðÝwßA«Õ¶¹Ï P©T6;Ú³Â^Ýmûî~ÂOø ?á'ü„Ÿð~ÂOø ÿÝÂïëë‹-[¶ 00°C¹Ë*ªðØœ¥P5«9Ÿ·É^:÷Y Ð1ËÄU*Þ~ûmœ>}Úæå|>2™ H$m õÏ›7.\`-K·UŽŽŽprr‚Éd‚^¯oqÒ´æ êÌB\ž¦ö./±ÅSEø ?á'ü„Ÿð~ÂOø ?á'üw#CCŽ;†èèh(Š3²¥N …8w%ûºrE²|½ðÇÎ ´3rÌ%µZ×^{ III6{2œ!—ËÛ½†~Ë–-øä“O››Û!S§Ó¡ºº¦Ý“¢3<7÷Z?á'ü„Ÿð~ÂOø ?á'ü„¿³ù±aà Üq[Rëôz<:ûU””UØÉ~ãÅYˆèÕ£Ý'ÖjµxóÍ7‘˜˜hÓñ"‘^^^pvvn׺ýeË–áÌ™3ˆÅ7P^^Þ!R ÀÙÙB¡F£‘áI¹Ûs.약ž$ÂOø ?á'ü„Ÿð~ÂOø ?á¿Û¤×ëqâÄ 4ÞÞÞcòùpr”àÔù+­Ù~Þ ¬^º|~û/Ô{gS§N1úmÞwqqB¡hW¥pJ ,Àï¿ÿŽáÇãÖ­[Ðëõm^2Î%±X ©T FÃZBÞÚ¤µÆoËó]Ý~Km´õ¦%ü„Ÿð~ÂOø ?á'ü„Ÿðþ®æ×ëõ8uêFŽ ¹¼cꎅá׿N¡IÕܲ‘ýâÌɈîÖîÆÇÇc÷îÝ,φ¥—ƒÏçÃÓÓ®®®æyä‘Gð÷ßcÀ€P*•xðÁqåÊt¤ø|>¤R)L&Ôjî„w.ÏÎÝìåéèþ~ÂOø ?á'ü„Ÿð~ÂOø ÿÝ"Fƒ³gÏbܸqRuœÏçÃh0â|37›6–88੸‡Û}²„„lß¾±¬€Ë"àãã±XÜ¡O­VC*•B§ÓA"‘ ´´ÁÁÁ0™LÈËËëÐêvnnn‰D¨ªªb,ïOPWW´ì/Ç£™¸ÎßZÿ?á'ü„Ÿð~ÂOø ?á'ü„ÿnâ/..Æë¯¿Ž/¿üÒîÝ«¸ôô#£ðé7?C£ý¿-ª•Åâb‡ÂYêÔ®“”––bÕªU¬‹mù#‰àçç×á¶¥‘íàà€üü|¼ñÆø÷¿ÿÝ)ÔÙÙ …‚ž”\“Ö–}çZ;¾µŸÖ&±½Ë5ŒF#ý›â2ç3ŒëyÂOø ?á'ü„Ÿð~ÂOø ?á¿ÛøÓÓÓ±cÇŽ±eÎRŒ{(Æz$ûÉñí‹b›L&¬Zµ µµµœžJb±¾¾¾â9à’Á` w”––"((—.]BgÉÉÉ 
>>>(--¥'š5~óIjn˜[ßÚó­yޏ>¯¾új»l@‡±#†à§G˜FöCC´«áÍ›7C¥RµèIðööî”ls  ¶¶‰„Ž˜kµZ»¶Ûj\\\ Õjézg*°e°Û[8 +o ÂOø ?á'ü„Ÿð~ÂOø ?áïJþü“'OF@@@»ÎýÐL#ÛÁAŒA}#ÚÜ`ZZŽ9Ò"¤‡‡œœœ:ýÂöíÛƒô[”d2Y— °‡‡š››¡R©Ú\¯½7IkË%ìm¿£'ü„Ÿð~ÂOø ?á'ü„Ÿðþ;ɯÓéðÑGáƒ>h—ý7¸Ä"´:Ým#;*¬'Ä¢67¸eËÆ°L\—H$pwwïãö©§žÂ•+W`4áîîîv)u@€¨¨(œ9s¦Ë mܺu‹Ao­Ð@K¿[+$ÐÙ%öí´–9­ñ |ûí·ððð ãóùøøãñÇØÅ?vìX¼ñÆŒs×ÖÖbúôéw-ÿý>þعs' £Ý9sæ ¸¸ø?ÿ ?á'ü„Ÿð~ÂOø ÿá?vì’““ÝfÛOâà€>¡ÁHμqÛÈÖæÆ.]º„¤¤$«æñxðññé£622ƒ Â[o½àöòtÊÀ‰D=z4víÚ…úúú.éP(„B¡@III›&mgçDt´Ú{ÓvïÞ~~~ ¢©®®FAAcr[øÇÆóûöícÜðwÿý>þƒ Bpp0ã5©©©¨©©¡ÇåŸ<ÿ ?á'ü„Ÿð~ÂOø ÿãÇgŸ}Ö®ëöFvdhH›úꫯ¬vÜd2A¡P@$¡³åââ‚×_éééHMM…B¡€““jjj  ‰ ‰ðâ‹/bãÆè*Éd2ÔÔÔ ©©©ÕIÒÒD´6)Û{|gÊÞ›¤Gðõõe666ÅÅÅØµkW—M6äæævø¤ïˆÂ)[‹Dú{+ÚÔ€Z­Æ¡C‡X›‰S•H$´qÛY ÆÊ•+áåå…Ý»wãèÑ£nGÍbbbPWW‡òòrDEE¸½t{Û¶mX·nž~úiøûûãÃ?Dccc§Úžžžô–^Ö ´uÒÌœ9“q¾¦¦&üþûˆÀ£>Šððp888 ®®'OžÄÞ½{éÊç\7‰‹‹  €ØØX„††B*•‚ÇãA¯×£°°§OŸÆéÓ§QYYÉÙŸY³fÑ{¦óx<444à·ß~ƒÉdBpp0"""àç燂‚üñÇðññaEŸ›››qéÒ%Λ044±±±xà —Ëa2™P\\ŒcÇŽá ·n37²«««Y×O$¡_¿~>|8¢££áêê >Ÿ½^ÒÒRœ={ÿý7g^½e!…B>}ú`äÈ‘ ‚L&ŸÏGcc#rrrpæÌ¤¦¦¢ªªÊ®ñ÷óóÄ ×3''W®\Áˆ#0~üx€Ï磠 »ví•+WZ,áïïbÔ¨Qðõõ¥·ÖÓh4ÈÎÎÆÑ£G‘À¹í]`` ÆÇèOVVΜ9ggg„……!,, …ÇGff&§ã£¤¤m~“–J¥èß¿?F…ÐÐP8;;t:rssqòäIœ;wŽ®Á`9^111ˆˆˆ ïG@€'N 77žžžˆŒŒD·nÝ “ÉðÍ7ß0j9øùùaøðáxøá‡¡P( P]]¤¤$>|111H$tëëëñÛo¿Á`0tøýoë‡%¿-’­y’[jÏÞB(„Ÿð~ÂOø ?á'üwš¿¹¹GÅäÉ“Ûi÷ñò€Ð¯Fö‰'P__oõ¢)mk×VÅÅÅáÅ_„X,Æ‘#Gðõ×_ÓÏ9ÎÎÎ8wîд€€H$üþûï˜4ibbbðÉ'Ÿ`Ó¦M¸zõj§öW$ÑÑì¶Êr’RÇÅÅaÕªUàóùô±¹¹¹=z4 À*<÷ÐCaúôé˜2e kos@€'žxÏ<ó "##9·]ú(”J%Þzë-üý÷ßœonnnxæ™gðÈ# 44¬v~øa̘1·nÝ /¼ÀàjMË–-äI“¥¥¥¡®® `¤C <cÇŽÅ?ü€uëÖ±Þtär9žþyL˜0¡¡¡,g 6 O=õ ðÌ3Ï ²²’ñ&·lÙ2<ú裌×ìÛ·îîî˜1cBCCé4£ÑˆŒŒ <˜6ä)µ” ÐÒü7™Lxä‘G0{ölDEEÑÆµ¹bbbðä“O¢¢¢S§NeU/—J¥xï½÷Æ,ú¨R©0nÜ8<öØcèÙ³'œœœÐØØˆ . 
11|>Ó§OÇŒ3Áº~ãÇǬY³Xµ(òòòpäÈ‘v]´vÿÛãynéC¬5Ou{«‡ÚóNø ?á'ü„Ÿð~§ø9Òf#|¼ ôt—·éҶÚI±XÜæ{k’J¥X´hFŽ øû￱uëVºÓ¦Mœ?FæÂ… ±téR899a̘1ðññÁ|€_~ù»ví‚V«í4CÛÃÃƒŽ¦Úâ™i­:õ÷!C-p{©¾e%gJˆˆˆÀüùóñá‡Òíóù|¼ýöÛ˜:u*ÜÜÜZd …pssæM›°xñb†“"&&†ÕOOO¬[·ŽeHS×Aƒ1ŒE¨ªªBQQÍéììŒ5kÖàÑGµZLÏÁÁÞÞÞ,Ã)11‘a¤=ýôÓX²d g4K΀€lÚ´ K–,am׳gO¬]»–“ÙR‰®®®pvv欖Î5þ`µÌi\€««+¦NŠ'Nàܹstûr¹[¶lAll,„Ba‹}uvvFDDÖ¯_%K–ÐÅÉ$ k ©Èî† }¢ŒSàöþõ\Ž{<¡ÔcK–,ÁÌ™3­:F(9::"((Û·oÇK/½„¼¼<ºÍ€€ÖJ›ššL:ýúõc\ÊÙc2™°xñb¼ð ˭¿oZFìàúõ먭­m×>‘­y~Û»e[>”ì©Úý#ü„Ÿð~ÂOø ?áïlþóçÏ£¶¶¶Åï{-Ú|n®àËeö/énnnÆ™3gè墖?míPk ǶmÛh;!!›6mbì»;~üxøûûC«ÕÒ_ðÍ¿ôzzzbåʕرc¶oß½^‡)S¦à³Ï>CŸ>}:ÍÈ–H$pttäœ$-Mhk×Ùd2Ïçsö™2äjkk9B¡±±±Édôy6n܈ٳg3 캺:$''cÏž=8{ö,+JŽ)S¦Ðý¬è p{;5KãL«ÕââÅ‹0™L8p ë5™™™0 ôð¿ÿý=öËÀV«Õ(**Byy9çM_TTDçê›L&Œ1Ë—/gØeeeHKKÃ7X{9ûúúbþüùppp Yƒƒƒ±oß> 6Œe`×ÖÖ¢°°N ¤T*‘——góør® qvv†F£Amm-ç¶dxüñÇév\]]qàÀŒ3†a@VTTàôéÓØ»w/222è}å)1‚“É„   Nã6&&†eô£°°ÞÞެרÕj\¼xÑîù¿bÅ ÌŸ?ŸÕÞ­[·žžN³¹úõë‡Y³f1Ú e­ðpssÃÀYˆŠŠ (•JL™2óæÍc½¿™L&(•J°V†Pº|ùr«|m¹ÿÍ:ºýöö§³ûGø ?á'ü„Ÿð~Âßüz½Ço³Í'—¹@èâìd÷ ÑÜÜlÕSÐZÔ^QFð¬Y³hcæÜ¹sذaâ–ÂÀ©S§è\ënݺ1Ú ÁæÍ›±bÅ dddà­·Þ‚¿¿?±ù$kß犊 ¼ÿþûøé§Ÿ§§§ãرcøøã±wï^»ÞD† ÂbJJJÂܹséÿSSS1a–‘MEdCBB0nÜ8F;ÅÅÅX½z5é½M&®^½ÊJ3pww§ûÃêOUUΞ=‹ÌÌL:»W¯^HHH€ÑhÄÀYK³+++éÈêŸR©Äúõëé5e Zëz½žŽˆ{zzÒ©æ:xð }}¨¶ÑØØÈ0…B!ø|>$ {ì1–3éÏ?ÿ¤·µ³¼‰¯]»†3f ¼¼œ~Þ–B\Ûš™/³¦ÎqýúuÖ±z½F£>ø «HœX,ƶmÛXm[:2(#šJˆˆˆ`=ÿÅ_à“O>±ú¦7hÐ Ös×®]ƒN§³yþGDDлPR«ÕøòË/éjíÔØQËñ-ßø]­ÝrµE}}=^|ñEúÚRí}÷Ýwðòòb9T€Û[<œ8q‚1¯xàÖü¯¬¬Dii)ëõqÿ·TÓÚº­9Vý¥ð~ÂOø ÿýÁOÕœ‘ËåèÖ­ÆŒƒ>}ú@¡PÐßQí1VìÐÚò|k´-Ž{žwrrb}°TSS4M«‘c[Ù[{®­‘`k[®‹µó666¢ºº7nÜÀåË—QVV¥R‰ââbÔÖÖÂd2Ýñù_SSƒœœôìÙÓîk& !´Ç87™LÐét¸zõªÕ7˜ŽÚ¶+../½ô#?2%%ï½÷+o4<<³gϦ GóJã–Æ—ÑûÖ[oA"‘àÈ‘#ÈÎÎÆòåËi£ä‰'ž@tt4Ö¬YC®í•T*eTf·6 l™ `µŸžžÎze”“šàMMM6lk °T*mµ(˜¹QO©ÿþ¬çSRRèH²¥'Éh4"&&†õšÜÜ\455Çãáá‡f9²³³ór¨XÙÅÅÅtD¼{÷î¬U µµµ8|ø0«å‘5WCCT*ÂÂÂXíÔ××ãçŸæ¼a©ßJ¥’“ßÚøûùù±œMMMŒmͨß\Î-Êø|衇XÏ) ›v0™LÈÍÍ¥ûciœ644àÒ¥KV_+—ËY+ŒF#Ξ=Ëø"ÑÚü窒Ÿ——‡äädÎÕ"–R«Õhjj¢÷Q·ìSqq1òóóYN*Ý2Š]]]°îGó7\+)ìÿŽ.´ÌĹ­Yaa!ã5Ô×5´æ`±µÞ@]]]œ«?ÅÅÅtn5×—ÿSlÿ\ó·¶¶–±2 %ÇZ]]jjj`41dÈ–w¹¼¼œŽ6[~)êÙ³'kÞPÅÐÌûîîîΚÿƒçÏŸoµðF[î[$Y­³£ÏGø 
?á'ü„ÿÞá§¶ÃôòòBTT6lØÀøÞ¤V«QYY‰ææfðù|Èd2xxxØ\…èŸ#@€ØØXDDDàË/¿„££#®\¹ÂX•{§æzz:ž|òÉ6qÙedkµZÆ—cKYÛRÈ-X°€®ŠlþxåÊ•,Ã@,cÅŠôM]PP€_ý•q ×—skZ¼x1šššpöìY|øá‡P*•˜1c€ÛÛ-[¶ ñññØ»wo»)GD{øX¥Š‹‹‘““Ø(¬ˆ·Éd¢+ÄsEþðñÇÛ4‘©%¶AAAœ®rss­²pÆiii0™LprrbE±)NË6‡ Â:.;;ÍÍÍÀiˆšG8ÍõÈ#°>JKK‘ŸŸÏ¹äÜ`0°æc ‰Î)…BÁ2ôŠ‹‹ŽƒÁƒ³œVæ¹Ì”úõëÇZu T*鈷eµV«ÅÖ­[mrUWWãæÍ›ôõåJM°æ02™LèÓ§ëºWTT0r”y<«®€F£aÐ\;Ô×׳ÆÍÑÑ‘³R}JJ ýþamµ…µ|1Ë‚TŸu:ã5œ+)²³³ïX±šö|Iëê/…„Ÿð~ÂOøï ~“ɱX ôìÙ~ø!ÃÈÍÍEQQ+"‘HÖá’‰îyyyáå—_ÆæÍ›1pà@466¢¦¦æŽÎj5g§Ù:ÙÙÙV=­å´¦Y³f± 즦&¬X±‚µ¼üòË ¥ÿÿôÓO¹Ú¾¾¾¬ÈRKâóùxóÍ7ñæ›oâÚµkøþûïÑÐЀ_|‘>fîܹpssC|||›9y<Äb1m”Ùº›åóýúõcE)+**‘N‡—^z‰ÝËÊÊÂÁƒa2™P__Ï꣋‹ ’’’X¹"æËƒž~úi\¿~>Wtt4Ë¢ŒBk|ÁÁÁ¬è¨R©¤·b’H$œyà>>>Œ›ÅÑÑ‘•w ü_4çVWTLóþEGG³–°kµZüþûïô¶h–òôôDÏž=QVVƺ^½zõÂ;#wÞyEEEðõõÅÏ?ÿÌ0~œœðŸÿü‡‘ã˕˜‘‘Á˜ã^^^X¼x1ËQ’––F;*Ì—ÛPÎ)WWWFõvËññõõŬY³ðÙgŸÑýá2^SSS9¯+ÅÏ• ——‡ÆÆFú\Ó¦M믾ÊhG­VcêÔ©¨¬¬´9‹Å¬7ÌÇ{Œ•7^VVF/íöööæ\mAE›¹r‡¸ 9úúú"..»ví¢U±±±¬%øÔýØ–ûûn«^ÛÑÇ~ÂOø ?á¿÷øÍwrrrªU«öõëסT*é×9::B¯×C«ÕB­V#==}ûöí´í~‰îmÉårÌž=Û¶mCdd$Ξ= ƒÁpÇæNNN×ÙÅÅŬ¢c檶*..Ó§OgyÖ®]˹÷m\\ÆŽKÿâÄ ¤§§3ŽáÊlM"‘Ë—/ÇË/¿ŒêêjüþûïÐjµX¼x1}ÌÓO? 
•J…~ø¡Í¼"‘ˆŽ°Ú“3`žÃùU«ÕJ¥ …ðòò–-[лwo†A§×ëqèÐ!äää€Çã!11Z­–aLDEEáË/¿ÄúõëQPP@·ëë님˜Lž<ÑÑÑx÷Ýwqþüyà,uýúuš“+iذa¬}‰KKKQYY “ÉFÃYtŽÊ߸zõ*<==ñßÿþ—U»¢¢‚‘Ÿž——ƒÁÀ¸AAAx衇è‘‘‘X¿~=+7%%…vJäææ¢¢¢‚aPyxx`Û¶mxî¹ç›› >ŸLœ8ÿú׿ •JáââƒÁ€ÈÈH„„„0œ¥¥¥P*•ôX{xx°¢¢ƒpssŸÏGXX¶mÛÆ:N©T">>ž6®srrXcóä“OB§Ó᫯¾Bee%ŒF#œœœ‚G}ãÆƒ\.ÇþýûqõêU«©gÏžµšÓ"‰Ð«W/«Žj\ÆŒÃrœ9s†Îõ7´¡j®¨¨(Œ9§NL˜0s›­³gÏ"99™NàZm‘““ÃrP<”“ÈÒ õÊ+¯àᇆÑhD÷îÝ9—©çææÒÛ†µ–ƒgïýß^qUÇméC¨½Ç~ÂOø ?á¿÷ù©º5Ôwóï^UUU´íææ†Þ½{ÓŇ«ªªpýúuèt:\¿~C† ¹ï·â"j›‚ƒƒ1`Àðx<¤¥¥Ñé~wbþ—––²l‡N1²©=i­©­FvHH.\Èzüûï¿ç,ldy¼Z­ÆŽ;XÇqEÞlµmØÿû_˜L&>|îîî˜9s&}ÌÌ™3QUU…¿þú«Mç‹ÅŒb[Ö<+ÖÞ„œœ8 “õîÝ{÷î…Á`€\.g‹Àí¥àÛ·o§'Zbb"222KÊñÈ# oß¾¨¬¬„^¯‡P(„‡‡ÀãñP]]¤¤$Úè°<—Éd¹sçX}7çåZΟ˜˜HaœKE|||ðÅ_ ªª r¹œsoî’’äççÓçKKKCNNcõƒ‹‹ Ö¬Yƒ3f@¯×£wïÞ¬¥Ëذa½|9++ ×®]cE-ýüüðÝwß¡¢¢‚Î… p{Û­üü|ÀСCYÑù²²2ܺu‹ž,cP `Ö¬Y˜4iø|>çéZ­{÷îÅÉ“'ék¸wï^Œ3†Ñ_///,X°ãǧW2888ÀÏÏ^‘œœL/ë bÇ+..f,£±ß   Ö¸TWWÓK³M&d2ç¥rî©{âØ±c˜:u*Ãóíéé‰7"77ŽŽŽg]ªª*¬[·Ž^5µÚ¼¿åünGë XýìÞ½;½ý™5YV+·üÝÖûŸKm‰”´ô¡ÕÞB?Öþ&ü„Ÿð~Âïò ¸ºº‚Çã±ö¦jÜH$DEE1Òî<<<†ŒŒ ¨ÕjTUUq~w#"¢¾+gdd $$„¶ îÄü7 ¨®®¶©Xp»ŒlƒÁ@oçcÍp´Wo¼ñËè¸víc;$ósX¿wï^Örr@À™{i«ú÷ïI“&aÿþý€ü!!!6l}ÌÂ… ‘™™Ù¦ªãTÿí©^iþ|PPgޱ§§g‹oZEEEX¼x1cY³R©Ä矎5kÖ°¢•,£ÓÜ0¤ŒØÀÀ@V”J%®^½j5ÒéïïÏš´*•бÄ[¯×ãàÁƒ2dk+6777:¯‡r˜«´´”1/rrrðÇ`áÂ… ‡+LéÖ­[X¿~=-nWÔÞ¶mzôèAÑ”|}}YÆ(äç磡¡зo_Ö󨫫£Ç·oß¾œ¹À¬sšØ'Oždmq—€½{÷böìÙ n¡PÈi6?Ê·–šÀµÿ3¥=z°æŽy¡4ŠÇrÞèõzºú8uÜùóçqêÔ)<ñÄ,‡Wše¿ôÒK G×êëׯ3Ò",•™™‰C‡aΜ9œ© ”£ÏÒÑXYYIWÖçšÿí¹ÿ[û¢×QíµGQˆð~ÂOø ÿÝÅO¥ rí¼A}Ïñòòâ,pæééIרihh F6‘UA*•ÂßߟÞÅæNÍÿúúú6Ùv•ø3¨©©¡£P–?m ¥?öØc¬›T§ÓaË–-œaüiÓ¦1ŽW©T¬bgÖîJç³gϦ “É„>úˆ‘+‹ñú믷‰›Úh½­?½zõbs•••V÷ó®­­Å/¿ü‚Ñ£GÓ†‘ùÏ/¿ü‚×^{ ׯ_·š`i`ét:”——Ãd2¡wïÞ¬È+e„qMjÊQ`i0šGŸ©Ÿï¾ûÉÉɬÜbÊ »rå ç’^óýÜ)oÔûï¿Ã‡Ó­5©T*œ9s‹-ÂÏ?ÿ̺^ýõ^ýuäææZ]nbnìSÅè|}}9#–}å2Í·J³T^^>øàÌœ9*•ŠÑ–F£ÁòåËñÃ?pî×ÌõF•——‡›7oÒm˜;—(ݼyjµš±ôÛü kĈ,£´¼¼œž3&“ aaaœÅû¨baÔOcc#–.]ŠÌÌÌV+£WVVbÿþý˜;w.233é6\\\X‘g“É„³g϶x¯éõz,_¾{öìAVVà ¯¯¯ÇåË—qòäI–“§  €UÃÂ|þ·÷§¥;ñcO¤‡ð~ÂOø ÿ½ÉÏãñàààÀ餧¾µô½˜z®µïNDÿl xxxÀÕÕ•ÞõNÍÊyd¯ìŠdFÆ\KO½Æ¦D"amÕû÷ïçÌÁT(˜Ü&#»­….† ƺޗ.]Âï¿ÿŽ &ÀÇÇB¡:999Ø»w/X‘ÍõÇàòåË3f Æ___H$ðù|zì«««‘˜˜Èȧ&à¾}û´SRRN 
K¾¦¦&=z”ÎÙ€ÂÂBV¡(•J…I“&aáÂ…ˆ‹‹ƒL&ƒÉdBmm-þúë/ìß¿óçÏgDZe2¾ùæÖõÔh4˜={6ÆŒƒiÓ¦!$$R©Z­ÍÍÍÈÍÍÅ‘#Gpøða:Ÿ–K‡Fff&&L˜€±cÇÂÃÃàñxÐh4¨¯¯ÇùóçqèÐ!úZ577#55)))t;...øþûïé>J¥RÎmË6lØ™L†#FÀÕÕ&“ ÍÍ͸xñ"<ÈhÓòzëõz,]º_ý5âââ0tèP¸ººÒýÕétÐh4(,,Ä™3gpîÜ9dee1 êŸ~ú‰ñæ÷ÕW_qz©ßÅÅÅ´ƒ‚šó{öìa¼466â×_e8‡ªªªè] Ìçuu5F…'žxO>ù$|||àää>ŸOW‰OOOÇÁƒqîÜ9hµZÖø§¦¦âúõëô¹D">Ì*èÂåÌy饗„øúúB¯×£°°ÙÙÙØºu+ËÈNNNnqþ··ÐMkžÚöê±7RÓÞþ~ÂOø ?á¿ûùy<g…pGGG444 ¦¦†s{ØææfÚŽàÚ9†ˆÈr>Q«Xí‰Twôüoiµc‹÷ʶo~2=ûÄ›ÎÎÎÆ¨Q£¬¾) :Ô®“?¯¾ú*㱦¦&<ÿüóœ^ƒ_|“&Mb<¶`ÁÎÂhëׯo×rqsÇ /¼@/¯æñxغu+ÃR*•˜3gŽ]^¹òòrF”˦Áúÿ×[(âÀ¬èâŠ+èm·Äb1$ £rxGW«ìêꘔ÷”2¦+** ×ëmn«?2™ 2™ b±ÍÍÍP©T­F¹;›¿wïÞØ½{7㪬¬ Ó¦MCrr2]­“úÀjÿ½6þT<r¹...hllDcc#T*U‡ó¯_¿|>þù'®^½JŠ£*ÚòÉ'èÝ»7cåLAAž}öY¤¥¥uÙõþ'?á'ü„Ÿðþ;Ãïè舰°0<õÔSŒ¢ÀT „ªÆÜ«W/Æî1Z­/^„Á`ŸÏGLLL›ÒL‰þ9úòË/‘““ƒíÛ·ßÑùÿÓO?aøðávõýÛ½í‹d›L&F¢xkÕÙZS\\뱿þú‹ÓÀ–H$?~<ã±üü|N›Çã¡wïÞ2À|>qqqøæ›oèk°wï^F4ÛÇÇÑÑѸråŠ]o˜m1øy<üýýYy, t‘%óªÜYHƒkü-Ûkí{ùÕj5½4¼-r–}¨¯¯çÜÂìNò÷êÕ‹•ˬT*éÂiF£MMM¯¿ùùª««QUUÕ©ü~~~5jz÷î3f    0‹Åðööf•V«Åo¿ýƨlßü\ÕjÿIãOø ?á'ü„¿ëùF#4 ]EÜ\þþþ(--…J¥BVV*** —Ë¡×ëQ^^N§#:88X­qBDDÍÁÊÊJ455ÝñùßV;×.#›ÇãA(2rcÛú¦!•JÆzüàÁƒœÇÇÄİ _qU§ŒÞöîÙm®Ñ£GãÛo¿¥YÏž=‹ÆÆFÆVAÇ·ËȦ"°öIÔ1Ý»wgå2Ò¥ZJôo­}®„–&¥=Õ6Û»O¤µ¾Øº\ë^áçJ(..f÷»_ùï¦ñ÷õõ¥‹]8;;³öà¶”F£ÁáDZråJúM™ÌÂOø ?á'ü÷ ¿Ñh„V«¥kø˜_áóùèÛ·/ÒÒÒ R©P[[ËH›¢ÔÜÜŒ¬¬,Æn/DDæ***Bss3½³Ïœÿ–é€fd;::ZüFÎj‚\Šˆˆ`ÁÐåÿ-ŵçuff&ç±­m«c¯ BBBèåÝ:—/_Fll,} WÅèÖ¼’m­~7tèPV.KEEJJJº¼Ze[2™ŒÎYµôPy¶ˆkÛk‘iœQo®âh¬n9Õ…‡‡3r¨SSSF¶¿¿?Äb±ÕêÞ–2 ­NkžÎzaj\Ž;Öå•;:稫ۿû'—Ëáàà€ÔÔTú1©TŠü#øï¶ñ¿pá¾ÿþ{tëÖ NNN‹Å‹ÅÐéthnnFuu5N:…ƒ";;»ÃîÃâü'ü„Ÿð~Âwóët:ÔÕÕA£ÑàçŸFLL kõŸÏ·º¬L&Crr2‘——‰DÂÚ>–蟭ÊÊJ¤¥¥¡¦¦†^¥{'ç¿‹‹KçÙr¹œ6n-×ÈkµZ›s,¸öÆ3ßîÉR\7 µíˆ¸*¶W=zô`üOåÇZ2Y‹Äs½I™¿qZ{åúûùçŸï7åöoKNR{?Dìñ<Ýü•••´óÆ–œ2þËýúuÌ›7<2™ ‰ŽŽŽtEs®¥pdþ~ÂOø ?á¿_ùÕj5JJJàââ‚ü3f̰˖èÛ·/’’’ V«qãÆ 8;;C*•ë’z½¿þú+T*222èmiïäüï2#Û××iiiœ7½J¥²ù&á*ÝOUð¶”D"aåXët:ÆvQæjïþØ\’ÉdŒÿ©¶žW­V·šÈoo¢¿åïŽ>¾½oú–"ü„ÿ^á7™L¨««Cmmm‡m AÆŸð~ÂOø ÿ½Ê¯T*¡P(ðÛo¿ÁËË ãÆ³ù;°X,FTT’““¡×ë¡Ñhˆ‘MƒÁ€}ûö¡  ·nÝBJJ ã^¸SóßÖ´ÁvÙ"‘þþþVo:óýŽ[×’Jkûqí¿ÝÒ¹Úš Þ’,|®­‚ìY&JõßrP[úÍõ†nOá ®¿[zµåC¥½…??á'ü„Ÿð~ÂOø ÿ½Å¯ÑhpóæM8::bÇŽ(..ÆôéÓmÞÿZ*•bРAP«ÕËåÄÂü‡«¶¶¿þú+rrrPXXˆ¿ÿþ›‘‚{§æ¿··w›+áÛmd[kll´¹-.#ÕZ>7UòßÒ 
fMÖŠµGjµºÕÁµ<¦µö:»ð†-Õô&Nœ±X “É„ÚÚZ\¿~^ònO5>lÞ¼›6mBzz:ëø!C† &“ jµYYYÈÉÉaUCïjþÖªÚ[Ðüx‰D‚çž{ ¸víZ›x¬?lØ0øúúÒ׳¨¨%%%¨¬¬¼køí=Ÿ¹ÜÝÝ¡Õjï)-ïçç‡ŠŠ zuKŸ>}ƒ]»vÑ÷å½Ä?ÌÂOø ?á'ü÷7]]ÒÒÒ‰C‡áòå˘0a ›X¹ѽ%jEDJJ ._¾Œêêj(•Jœ9s†±£Îœÿ–éÂfd;88 44”±|żÓTA4[Äu¬µ5ïjµ:ŽáIH$œ¸µeäí‘eäœkY‹­¹™ÍÍÍ6HkZ[~/½ôx<´Z-ÜÝÝQWW‡M›6áøñã­NBóñwvvFŸ>}àááÁ9Y§L™‚˜˜TWWÃÑÑŽŽŽ8räÖ®]Û)ãe+¿-7ª-ü\Ï»»»cúôéÐh4V+á·UÏ<ó úõë‡êêjˆÅb( dggcûöíôØÝiþ¶Vc•H$xÿý÷QZZŠeË–µz¼··7¾þúklß¾{öì 2Ó§OÇÁƒéÚ ÷ ÿý2ÿ ?á'ü„Ÿðßÿü555HLLDhh(´Z-víÚ…_~ùîîîËåpttÇ£Ìûhþ›ú›jÛ2ðfïÿ\íZo¹»‹µkiÞËk`yuóˆôôt@*•ÒÛº™oMfÞ½^7nàµ×^ƒH$ÂÒ¥K1qâDìÞ½ÕÕÕJ¥¨©©ŸŸªªªPRR‚   DEEÁÉÉ ÉÉÉÈÎΦ۔Éd_¿~ýpãÆ ¨ÕjôêÕ MMMð÷÷G=pãÆ ¤§§#<<ááá(//Ç™3gè÷™L†¾}ûÂÇÇ………¸rå t:$ """PXXˆÀÁÁ.\@EE D"œ††dgg#44ÁÁÁppp ÇP«Õ¢{÷ît*Ktt4”J%öíÛ‡üü|ƒÇãA*•¢oß¾ DQQ’““ÑÜÜ “É„ÈÈH”——£gÏžð÷÷Gaa!½eZK÷ck…oÚ{wÅýßžÈ á'ü„Ÿð~ÂORRR——???( ÔÔÔÐé›¶Îæý¡ž37ÆÍûÁe4›÷ßÚë­ù–QN.ÓÚu²6þ”!Oým¾+u ¹r…[{=×ÊP®óǹŒ_®¹c~¼ù~éæÇ[Þ \mP¯1·! òóó‘››K4ï¦ùß§OŸ®1²©êâ=zô@nn.礪©©¯¯o«mååå±k)$Ÿ››Ë2²{ôèÁid755u¸‘]TTÄø¿gÏž­òXSMMUO§ùß­¾°u‚µÔ>p;ò_UU…ªª*$%%¡ÿþXµjFŒ’’xyy!;;‹/†ŸŸvî܉óçÏ£_¿~prrÂÂ… [ì?uéT*¨T*ìÙ³&L€››fΜ‰ÈÈHh4„††âêÕ«X¶l¶lÙggg455aÉ’%غu+~üñG¸¹¹aݺu4hÊËËáêê 777Lž<555غu+ н{w¸»»ã“O>Á¬Y³ R©ÐÜÜ ooo;v «W¯†§§'>ÿüs”••ÁÝÝ|>jµ¹¹¹tŸD"Þ|óM$$$`áÂ…xðÁé½"½¼¼0qâDÚÓ%—Ëqúôi,[¶ nnnX¹r%>úè#ìÙ³«W¯†\.‡‹‹ ´Z-ÜÜÜðé§ŸbçÎpvvÆ{gáÇ£¼¼2™ îîîxþùçqéÒ%ÖøS׳±±Z­¨ªªÂ¶mÛ0yòd¤¥¥aàÀذa}ý—,Y‚+Vàï¿ÿ†ŸŸÞÿ}ôìÙðööÆîÝ»QZZŠ9sæà™gžR©„¿¿?¶oߎ÷Þ{ÀúõëáààWWWèt:H¥Rœ;w<ðT*ÜÝÝñé§Ÿâ믿†££#6n܈¨¨(”””ÀßßçÏŸÇÛo¿ØØX¬_¿EEE´·7//‹-³Ï>‹°°0èõzDDD ¼¼K—.ÅÎ;cxüøq¬^½óæÍƒ\.ÇSO=…G}W®\Ajj*æÍ›‡ÌÌLTTT`ÅŠ9r$JKKáããƒÔÔT,Y²*• [¶lASS<==¡×ë!‹±|ùr?~¼UOhK_ªl)\s§ïÿ–<»¶x‚ ?á'ü„Ÿð~ê¸úúzÔ××ãÚµk …àóùœFMK,æ&eˆÆ£Í­E6-6ËH¶-QakÇ›÷Õ'Ek2ÿ®×Úë[«`ù¼¥1Íõ¿¹@Ðí©ÖÝVþ»aþ0 kŒlJC‡¥lK¨ªª*›Œì¢¢"TVV2¶òêÛ·¯ÕãoÞ¼‰aÆ1 Á•+WXÇÚ³lÝVYFÞ£££ÿ›ïklk$ÛÏ¥-7­žG®ç©ª‡‡úöí‹B©TbÊ”)>|8V¯^Ó§O#&&ëׯÇСCqëÖ-Ú8_¸p! 
ãš[ë¯@ €‡‡<<<ðÜsÏ¡ªª EEEÐh4puuÅ·ß~‹·Þz ®®®xë­·ÐØØˆ  ±±ï¿ÿ>fΜ‰“'ObæÌ™ÇK/½„7nàÁÄÆicR«ÕB `ÅŠ(//Guu5ÒÒÒPPP‘H„3fà‰'žÀW_}F½^ëׯcË–-ð÷÷ÇÖ­[Áãñð¯ý <Ÿ}öÆŒƒ„„¨Õjèõz¬[·©©©Xµj† ‚-[¶àرc˜9s&žyæxyyA§Ó1öC×h4¨­­ÅþóTUUaóæÍ9r$vìØ™3gâÀ+¯¼‚k×®!::Ÿ~ú)ÃSh9',ß³²²píÚ5ÂÅÅo¼ñrrr°lÙ2|øá‡˜;w.’’’ðÆo@.—ãùçŸGQQz÷î~ýú1–îsy+›››Q__Å‹C­VcÛ¶mèÑ£^~ùecóæÍ1bvî܉… "88K—.EZZ&Mš„×_aaaÐétàñx8pàöíÛ‡þýûãƒ>@tt4>ÿüs„……¡¤¤ü1Ôj5ðŸÿü……… …˜9s&üq|ùå—Ø¸q#¾øâ |ñÅ8uêêëë1qâDúƒa„ 3f –-[†3gÎ`Ô¨QX¹r%âââðË/¿@§Ó¡¨¨‹-|öÙg?~<Ž;f÷ý×Òñöæ¼uÅýßÒ‡V{ ý~ÂOø ?áÿçò ΔΎˆìwv{­{‘ÿŸ8ÿ¥R)"""ºÖÈ1b¾ýö[Îç”J%ÂÃÃ9+‚[*11&L ÿ÷òòBpp0§ŸœœŒÙ³g38p vïÞÍ:ÖÚV`m•V«Evv6ý¿T*eÙT´±5UVVÚU…½#eÍ39qâDŒ7žžž¸yó&6n܈^xÍÍ͘8q"&NœH/ù¦£öñññ47µ Áh4r¾©šL&„††â«¯¾¢#¹ñññôuÍÎÎÆ7ß|•J…BAG*)ƒ~çΈ‡B¡@Ïž=‘ššŠ‹/Âh4Ò*º þù'm$ Œ;sçÎ…··7œ!‘HŽŽ„„£©© ÕÕÕ¸vírrr  é½ )ƒ¹¼¼/^DCC.]º„¾}ûâøñã(++ÃéÓ§1wî\…B:§Äüšäää ##’’‚aÆÁh4¢wïÞHKKÃÅ‹¡Óéè=ã C‹9$–bz½<... ÄîÝ»¡T*üñ^}õUÈd2:ªœ‘‘“É„óçÏ#11“'Of´Kµmþ÷Í›7qãÆ ðx<Ãd2!99ƒ™™™ˆŽŽ†Á`@=`00sæL†S§[·nhjj‚N§ÃÁƒ¡T*‘””„šš¸»»ÓÏéõzÓcسgOüûßÿ†——=†ÔêNGO]3£Ñˆððpäåå!!!µµµ8rä¦OŸŽèèh:‡ûÊ•+ô¼ÎÏχ“““Õ/÷ª:"G‰ð~ÂOø ?á'ü„ÿŸÀc“=Û¡FöèÑ£­zôz=ÊÊÊX9Ë\ú믿F6Œ;Û·og›••…ÒÒRF”<22...hhh`K}Ñî(edd0 •=šQ„-''ùùù6µež{'8€‹/¢¼¼Àís…B«W¯">>ñññ8tèc‹({ÏÓÔÔ„ƒâøñãÈÏÏ·êAR©T(..ÆðáÃéýGúúz(•J”––",, ðôôd­*°”¯¯/òóóñÎ;ïàäÿkïÞcš:ß8€ÛÒJ‹A28e^†67ÄyÇM·ÅyÛ4cs&êtÑÌ阷¹-¸¨Ä8osÌøSãDãT4¨€(P«8aŠ:ÆiEîáÖ-íïÓ+U¹*Ìï'1¡ç¼ç}ÏiêÓó¾Ï“˜Øª÷FkÉË˃¿¿? P(بoæd2¡T*±víZøøø ..z½Z­¡¡¡Âz¨   Ü¿:ùùù2d €Iƒ‚‚P]] ggga‡O³¿Å»wïÜÜÜššŠ¨¨(ìÚµ (**zêï‰Ñh„···${yy!77+W®DRR’U’AK{???ዘ‡åääÀÏÏOH^áçç???!ë8ÑÃBCC[t¼]s|ï½÷––fs_II ªªªlËÚ³g~üñGáu§N0mÚ4üöÛo ÚÆÅÅaÚ´iVý¾ûî»øã?¬Ú™L&äää´(#œEmm-’’’„×ãÆ³zš~ÿþ}$$$4ª/FÓÚÊÐhùâ 44«W¯Fzz:$ † ‚¨¨(«'“¶¾)jÊ8ï·¨¬¬ÄÑ£G±bÅ ¬_¿z½!!!HIIF£ALL Fމ;wB¯× 5ÓÚNŽaY‰ÚÚZ›AùÓJ=<œ}òIÙ0ó~ÛçСC7n¶nÝ N'\Ó“ÞÓž={â矆T*…`Æ HLL„X,ÆÉ“'†õë×C*•â­·ÞÂÁƒQXXˆØØXüðÃØ´iòòòзo_hµZlÙ²:+W®Daa!ú÷ïoU“þi×ÿðùþþûïÆúõë‘™™ ggg <K—.µù¾Y^[¦ËOœ8‘‘‘°··GRR&OžŒÈÈHèõz«D:¹¹¹5jÜÝÝQQQììla||ªª*á³±dí¶U/\.—£°°jµºÁìÎ; ÓŠ-ûnܸ¢¢"øúúÂÁÁÇŽÃæÍ›Q[[‹¢¢"¨ÕjàâÅ‹ÈÈÈÀ°aÃ-”zKKKfTdggC*•B.—£¼¼111¨©©A\\t:ºuë†óçÏ£¨¨&“ îîîÈÌÌÄ¿ÿþ ³Ù 777ܽ{ýõ\\\P^^•J“É™L±XŒÄÄDaê¸B¡À‰' Óéàéé‰ .   
îîîÈÎÎÆÍ›7|¸°ÿܹs#//ׯ_ç]Û ÜÜÜðí·ßB­VC$aúôé‰D˜={¶Í¿#ËåX±bÒÓÓa41mÚ4têÔ ³fÍjö}"""""j<©TŠ›7oZUÀjª=±'š·&ÛbîܹÝg6›­¦l>Iii)"""¬‚äž={bÁ‚ ÚÖ××cýúõVk2}}}1~üx«vׯ_·jÓìñãÇ[ØZ­›7onT?&“ÉfÆtj>|òÉ' C~~>–.]Úal OOOÌš5 Ÿþ9Š‹‹±lÙ2ØDDDDDÏÈûï¿ß¢Û¢EO²Íf3‚‚‚„鯶 6 r¹¼Qý…†† µj-öíÛ‡èèèmƒ‚‚°jÕ*aídUUæÌ™c5Åü›o¾Appp³®íöíÛX´hŒF#!dQ.//Ç’%KàÍ25—ˆˆˆˆˆˆÚ§³g϶hª8Ð O²E"–,YòÄ6YYYN‡;vXmûè£0sæÌmU*¶lÙ"¼vvvÆüùó­Ú$''7ëºôz="""`4áããƒ+VXØááá°«««…ú»DDDDDDÔþŒ1¢Å¶…¸¥L™2Ūæí£*++›4UúèÑ£X·nP›fÍš… 4(%gh[=¹¾téR“§Œ›Íf!™Y=ðÓO?¡K—.$iûꫯšt=ׯ_aнuD_ýu«õÕâ ["‘`õêÕOl“““Ó¤õ²çÎÃâÅ‹­ž¿óÎ;øá‡àêêjÕöĉˆˆˆ¤}ñÅÂôôººº&?ÍÞ¶mRSS1hÐ lܸQèëÒ¥KøòË/… ÄñÏ?ÿkº‰ˆˆˆˆˆ¨ý3f Þ|óÍVë¯Ù%¼Ö§O$%%=65›Í(..†··wƒ§ÑSVV†øøx888Àßß"‘žžž5jòòò Õj…¶¹¹¹øóÏ?¹\¥R)×wîÜÁرc!•JŸ:æÉ“'qðàAÌš5 ‹-‚L&ƒN§Ã/¿ü‚;w6Ètþ$¥¥¥¸víïX"""""¢vJ"‘ ::îîî­Ò_ÆìÖ ²àµ×^ÃîÝ»ŸX7»²²J¥RHVö4F£éééHKKƒ——”J%1räH(•Jdee¡¦¦PRR‚3gÎ@©TbРAˆ…ÙlFuuµPãW§ÓÁÃÃCX_ý¨.]º`Ê”) $$$`íÚµÈÌÌlÒ{¡×ë…RLDDDDDDÔ>-\¸S§NmµþZ\'ûQß}÷"##ŸØÆËË ýû÷oVÿýúõÇ~ˆ¡C‡B$¡ºº1118vì˜ÕSf‰D‚úúz›}8::b„ ˜1cœì7™LHNNÆ››Ûäs4 P©T¨®®æKDDDDDÔNùùù!55NNN­ÖçžØ­d×ÕÕaĈÈÊÊzêÅôéÓ§Ùãxxx`Ô¨Q îÝ»‡yóæ5i:·««+–/_ŽîÞ½‹3gÎ 11±Ùµ‰ëëë¡V«­ÊˆQû"‰pêÔ)µj¿­dÀ70räHèõú6 ´-¼¼¼Ð½{w¨Õj›ûçÌ™ƒû÷ïãøñã öI$¼ñÆÐjµMÊn‹Á`À•+WPVVÆ;–ˆˆˆˆˆ¨[¶lÂÃÃ[½ß×ɶå•W^yê”qàAB²ÌÌÌF×Ð~œ{÷î5°W®\‰>ø0tèPáiõš5k0yòd¡]}}=Ο?ßâ»¶¶/^d€MDDDDDÔÎ=Ë—/o³þÅmÑéÌ™3±xñâFÈéééV5±[ÊÉÉ C† ½½=€k¬-¼ƒƒBBBZõZ+** R©PUUÅ»•ˆˆˆˆˆ¨ëÝ»7¢¢¢ ‹ÛlŒ6ëyõêÕÊÒV\\ŒÔÔÔ&ÕÑ~…Bððp˜L&œ>}Z²M&àøñãèÓ§>ýôÓÇfoŠ»wïB¥R=uj<=_ˆ…\.oÓqÚ,ȉDضmFŒñÔ¶z½jµ·oßnÑôñÏ>û =zôÀš5k„äe&“IÈ4~áÂDEEaÒ¤I9rd³Ç1 ¸zõ*²²²„žˆˆˆˆˆˆÚ'¹\ŽØØXøúú¶ùXvmÙ¹T*ž}û0mÚ4¤¤¤<±­Éd­[·pïÞ=ôë×®®®MoãÆ ‚Þòòr«lß±±±8räH³ƒcFƒ¿ÿþ»I™Ì‰ˆˆˆˆˆèùØÇŽ믾úLÆkõìâ¶ÔÔÔ ,, qqq;)‘ÞÞÞèÕ«Z4¶L&C}}ýcëf7VYYnݺÕ*ÓÚ‰ˆˆˆˆˆ¨í)•JÄÆÆ¢_¿~Ïd¼6É.n‹ƒƒ¢££ññÇ7ª½Ùl†F£Arr2®]»N×ì±ëêêZ`—””@­VãâÅ‹ °‰ˆˆˆˆˆ:ˆ¾}ûâìÙ³Ï,À¶°{VI$lݺ¾¾¾ˆˆˆhÔÚk“ÉF­V wwwxyy¡[·nmš ΘçççC«Õ¢¢¢‚w'Q2iÒ$lݺ;w~æcÛ=ë—-[†×_sæÌAAAA£Ž1›Í(,,Daa!¤R)<<<еkW( Èd²V9¯êêj”––¢°°ÅÅÅ-®ßMDDDDDDÏ8Àµ³Ã÷ßùóç?¿sxƒ#%%sçÎEBBB“Ž5 Ðh4Ðh4¸ººÂÉÉIøcoo‰Dbóx£Ñ½^êêjTWW£ªª eee¨©©áIDDDDDÔAõïßÛ·oG@@Àó ôŸ×Àîîî8|ø0öî݋իW£´´´YýTVV¢²²ò±ûE"ðÉ4Ñ““–.]Š… ÂÎÎøy.‰0{öl¤§§#,,Lˆ[“Ùlf€MDDDDDô#‰0sæL\¹r‹/nös²- 
6mÚ„ÄÄD¼ýöÛ¼[ˆˆˆˆˆˆÈv+cêÔ©¸té¶oßOOÏöu~íéd „#GŽ >>¡¡¡mòd›ˆˆˆˆˆˆ:WWW,Z´øõ×_ñòË/·Ëó´k'5lØ0ÄÄÄ ''»wïFLL ŠŠŠxW½@$ F…éÓ§câĉppph÷ç,Úö¿æÞ›Ð®OÒh4")) GŽÁ©S§P\\Ì»ˆˆˆˆˆè?H.—#$$&LÀرc!—Ë;̹ï‰=Ñ>Ÿd?ÊÎΣGÆèÑ£a6›‘‘‘¤¤$¨T*¨Õj”””ðN$"""""ê€zöì‰ÀÀ@bøðáèÐK‡í:Ú ‹D" 8¶iµZdeeáæÍ›¸}û6îܹƒû÷¸¥¥¥0 ­6¾L&ƒ££#ìííáàà;;;H$H$áçæhiôŽv|}}ý {í`2™ ‹ÛÅùÛúlšr·t|ƒÁ€ªªªFýî·äïæöÛÔqŸö~´tKµ··î=êXìí홫„è“Édü½ë ,ÿ&>\ièÑŸÛÎò³Éd²úÿ’Ùl†Éd¶›L&«m–ŸÝö¢U>’H$puu…B¡€»»;|||УG¼ôÒKð÷÷GïÞ½áììüŸºf»ÿÂEx{{ÃÛÛcÆŒ±¹¿ªª eee¨©©Á`þF˜ÍfH$ˆÅb!X–J¥Éd°··‡½½=d2ø-µÊ—á¶ñGuËq·±Õ­×~ ÐÜóm,±X '''¸ººÂÅÅå…û|ÿFîD,/´I·IEND®B`‚perldoc-html/static/center_bg.png000644 000765 000024 00000005357 12276001416 017154 0ustar00jjstaff000000 000000 ‰PNG  IHDRû@a CiCCPICC profilexÚSwX“÷>ß÷eVBØð±—l"#¬ÈY¢’a„@Å…ˆ VœHUÄ‚Õ Hˆâ (¸gAŠˆZ‹U\8îܧµ}zïííû×û¼çœçüÎyÏ€&‘æ¢j9R…<:ØOHÄɽ€Hà æËÂgÅðyx~t°?ü¯opÕ.$ÇáÿƒºP&W ‘à"ç RÈ.TÈȰS³d ”ly|B"ª ìôI>Ø©“ÜØ¢©™(G$@»`UR,À ¬@".À®€Y¶2G€½vŽX@`€™B,Ì 8CÍ L 0Ò¿à©_p…¸HÀ˕͗KÒ3¸•Ðwòðàâ!âÂl±Ba)f ä"œ—›#HçLÎ ùÑÁþ8?çæäáæfçlïôÅ¢þkðo">!ñßþ¼ŒNÏïÚ_ååÖpǰu¿k©[ÚVhßù]3Û  Z Ðzù‹y8ü@ž¡PÈ< í%b¡½0ã‹>ÿ3áoà‹~öü@þÛzðqš@™­À£ƒýqanv®RŽçËB1n÷ç#þÇ…ýŽ)Ñâ4±\,ŠñX‰¸P"MÇy¹R‘D!É•âé2ñ–ý “w ¬†OÀN¶µËlÀ~î‹XÒv@~ó-Œ ‘g42y÷“¿ù@+Í—¤ã¼è\¨”LÆD *°A Á¬ÀœÁ¼ÀaD@ $À<Bä€ ¡–ATÀ:ص° šá´Á18 çà\ëp`žÂ¼† AÈa!:ˆbŽØ"ΙŽ"aH4’€¤ éˆQ"ÅÈr¤©Bj‘]H#ò-r9\@úÛÈ 2ŠüмG1”²QÔu@¹¨ŠÆ sÑt4]€–¢kÑ´=€¶¢§ÑKèut}ŠŽc€Ñ1fŒÙa\Œ‡E`‰X&ÇcåX5V5cX7vÀžaï$‹€ì^„Âl‚GXLXC¨%ì#´ºW ƒ„1Â'"“¨O´%zùÄxb:±XF¬&î!!ž%^'_“H$É’äN !%2I IkHÛH-¤S¤>ÒiœL&ëmÉÞä²€¬ —‘·O’ûÉÃä·:ňâL ¢$R¤”J5e?奟2B™ ªQÍ©žÔªˆ:ŸZIm vP/S‡©4uš%Í›Cˤ-£ÕКigi÷h/étº ݃E—ЗÒkèéçéƒôw † ƒÇHb(k{§·/™L¦Ó—™ÈT0×2™g˜˜oUX*ö*|‘Ê•:•V•~•çªTUsU?Õyª T«U«^V}¦FU³Pã© Ô«Õ©U»©6®ÎRwRPÏQ_£¾_ý‚úc ²†…F †H£Tc·Æ!Æ2eñXBÖrVë,k˜Mb[²ùìLvûv/{LSCsªf¬f‘fæqÍƱàð9ÙœJÎ!Î Î{--?-±Öj­f­~­7ÚzÚ¾ÚbírííëÚïup@,õ:m:÷u º6ºQº…ºÛuÏê>Ócëyé õÊõéÝÑGõmô£õêïÖïÑ7046l18cðÌcèk˜i¸Ñð„á¨Ëhº‘Äh£ÑI£'¸&î‡gã5x>f¬ob¬4ÞeÜkVyVõV׬IÖ\ë,ëmÖWlPW› ›:›Ë¶¨­›­Äv›mßâ)Ò)õSnÚ1ìüì ìšìí9öaö%ömöÏÌÖ;t;|rtuÌvlp¼ë¤á4éĩÃéWgg¡só5¦KË—v—Sm§Š§nŸzË•åîºÒµÓõ£›»›Ü­ÙmÔÝÌ=Å}«ûM.›É]Ã=ïAôð÷XâqÌã§›§Âóç/^v^Y^û½O³œ&žÖ0mÈÛÄ[à½Ë{`:>=eúÎé>Æ>ŸzŸ‡¾¦¾"ß=¾#~Ö~™~üžû;úËýø¿áyòñN`Áå½³k™¥5»/ >B 
AƒÒ6.ÞÁò×ÊÖzn;ëëÖ­³Ûþô§?Ù÷n¿ýö”±¹x]89å¿ÿý¯¹ÿþû­×·Ù–¯¾úÊ~áÂëÁ5¥3'ìTHWèôhÕªUÖןìüxÖs½¾›6mJ±ÙuPpßsí³->ú¨mwWó¾óÎ;‘×*<'_œ¾ôÒK‘Cr½ÿ¹.»ì²R;F¢2˜3tů“Hİ_¸ïñ„‡íC¾’ïñYΙ0÷\ í˜É¾²~ÿ³ý|®õç{>å}~²_öË~Ù/ûe¿ì—ýåa?N5eÙcgo¶;ó’ksþ ‚‰‡w²!„€–‡H%<Ó…‡S^|ñÅ W]u•݆—Û…‚ºë¼ „™"(ýj<–Œ3/O¡Mwºv+ ˆNTAø!Â0]Wo„ܺíž{î)á‰Å+J¦n:!JQƒp]w>Ü”„IgS8/:<øÑa¡óÀÕ‹PCÌFyñsí¸7è0 aè„£»ºH„÷Ml²ø¨Ž<˜x8£ŽCØ7H®æwN×!A;á±÷½º*c=²½¾Œe“¸ù^{Ž“NàÿùÏNÖƒ5ªSv`8¡öŒ[;0þö·¿™~ýú%ë!‰mM¡?þx+PC¯1׌ëëýNØ}T„ÅçŸn½Ë,£ŠBT´EºBÄßB¾£:7¸çè<áÇ8]ôǹçž[Ðï½B!„(?˜º¬"{ÖÜ«Ìv_[Î\²d‰Ù{ï½í ¸¥¿Î˜ÛB SBÁ1Ô(Fp…áì‚mîýp¬¯óL2f“±Á¾‡•fÂs %.‘Ø£²!]ûúû06”°ê¨°d„Ù݇ bí ‰¨ŠÔƒØÅûn§üó!¼:,ˆjw^„w§ åŽ:>F É/ˆ\wlꦃ% æ~@ܰõàd\lX¸WIÔÕ°aÃÈ0a„#!ëxj ÿæ bÏ5õvM'OÔ}Eh4íNX8^~2ݾïîårý‡ZBð#ì ÙfL1ǹøâ‹Kˆy áÔ®Î)¬‡viÑ¢Eòx\wž~¹òÊ+“û04¡´0s¿Ð)ñ£X÷Ë'Ÿ|bøs¹ÿûaçdÜG\WÆ?Ù/ûe¿ì—ýÕË~çlÁÇ.‘p! UŒzß[:ÈõþÖˆ³•e¸îÃs[iÁèó«[ÃXýõž‡}Bä­qËpݧk¸tðšc`›s`UµûŸa®eÙ^}‹ÙïÊùƒ$âà5jÔH.ˆ”BŠR*…sã¹ 6BÜM„PFxñ~»víÒŽ“EϘ1#96×/xE™W¹Ð"Û‰¿íÜÅ Û3ÓvÀ[ÏØÓ(¢£ ”¢!ÂqòôÓO—ð’rIå@D05Ê#îÎï$"?…$†"Ì—¤k$Mówùå—ÛÏ Cñ„ Å‹Ëv¾€Q!Ó„>“lÌo/€+$~síæ"±ߜƒß¾Ü7¡0¤MfZ…Ø€·Ó¿><Àt$p?g{}YšÖŒmaýx¡ÃB›SÂ3ìø c ›ÝuÖ£:ðˆ»ó‰J*Gç S„ü“ß®¿þz;ŽÏq­£ÂˉNÈöþ'b#ŒHqÑ,ü‘øû“Ì,,ØO‡ûpܨh ¾ØJB=†ôŽDrüyD ÉàEøû׎ðþçûèß§¹\ÿLßÿÒê )tý™öÏ„ì—ý²_öËþ­Ãþ½öÚËþ·"®q8”p4à!(ð,B9K·ÝßÏçÁp®ÓÙ®³ ×£ÞËô™tÛÃ÷3?Ýçý×áûûÑSG8xXœØ÷á‡;l<¸üˆp^ØÆ¢ûß…ðzÑ eºýi³¨iÍðŒ‡ŸG Fy²Ù6kÖ¬È0ñðÚB”˜$ƒÔC4@ÔÐ…ùv‡ö09,D¦ärÿó#&ñCÌ—žÇ°p/s¯Sb;ìtàw?”¨ëÃu #-(W_}u‰ß¿¨ï#ýÒîÿ|¾ÿáñK»ÿ2Õ—ïùe:¾ì—ý²_öËþ­Ó~’æò_ɳÿË<áø"Œç ÀyåÖ3AÞ•Ò s¿4ˆî ñß/m½42—oÏÑ<7–Ï¥Ù­m4S:^XÚöÒÈtœLçõ>÷C(çÏŸoç¥æ¾A_ò,׺ªÜÿ8ùÊ"²¯ºõn³ÝuwÌÏiÞ/.:Ûé ã V¨ìáa".F”à%”Ø…°âm¤‡Ä¯'SÁó8dº…àŒótžñBÍ™½çž{ÚörË(ü}ÒíW3,„‡uq‡AÊ"ù2†»ã0n9*ܶ4{ø‚……°j·cÅÓu$ø6"Ä„X\;zJÙNO÷p8všÏ…íKa>Ͼ®7ÐÏ àD=ƒé®e¶×Óïg8€mÄ`¸oÔ˜v¼Òl‹ ¡Ï¶ðÝ ò€zè™&Ì; …æ;™ÎºŽ Âü™»>—ûߟzÏOºGÇIø9¼ÐQûr¯³ÞÓÐÛÌvÿ{éußðA/q¸oT$¹ÊrýsÝ?WÊ»þ|'ûe¿ì—ý²¿jÛï"XéÄÆËɬ!**¹4÷N7¦OÅa‹Ð® ÷?à Ë"²¯žwOn"aWÆT_¶BˆPÆ8†…‡û¨qÒjßÛGÛßõЛ®8¡h?‡']!ì#ü%]æB:Ï-7(Û÷Øcä2*ã7!âl ¡¦³!’ B¦éä B=˜ö)LnFˆH:»ãÂüýB§ÛéòÓ‘ÖÕY@݃6à<£’s1&$¬‹Î–01‚/(íu"Š]{ûm_ÖëËý&Òr¡Çþ~xÖ£2~îÏö¨öËæÚžm:)°/~Q@æm'N}›Ý:Ña‡ãßéŒË¥-¢:ZðÔGí5]˜ËTÏyEE[0KAX³!*Ò‚ë€h÷÷'¬(*·÷R!¾ï¹u=r¹sý|¾Ç“ý²_öË~Ù_}ìç5à ØL«Fžº¤´<;á¼ÂS›mÂQ•m³ð¼HT1¬á³neÝÿh‹ 
Ù|Yn»í¶äI‡¯}衇"þp_İŸÔŒÐƒp¬vTøkºÂ—Ÿ‹ë>ËøÌ°Ý8_ç4];f ž´0áž[~ðüýxøÅ8ÊÙ³g§àxÈýñéÀË玃—7Ì>N2Â4 ŒÃ3BãÆ6¶“Ø!j¼12¬‹ èaÁKî¶_xá…‘Ó;1†(¬ë¾ûî+±/Y¸¹nŒÕ]t2¤³“9ÿ5$ü€øÐcæï5­\á~ˆñÐÓJh5a7ldzÓñÁüÕÙ\_ÆS¹ãD…B3 ´{9Ê«ÌPþ˜ý¿°-¶Œ 9ÇCæx cÄÝ>QÑD¼¤³ß„¨ŒõÜ›þ~Œ s ð›ä·¡BQÝÙ}÷Ýís ¹\ó ‡ZòìÇ3ÞI:¼s™&VeÛ+èò!´I’WÙ÷:"²×苾dn „næ#>ï½÷ÞMx,㪣ö!!~üíŒÅ͵0…OWG”WŒPæ|…6 "ü6ŒjÏLÛ£ÄcÖ©Û¯ƒ±£aFh7­û°=,†¸zÂóA$Ñ»C§†Û5ö×Õ“Î>¼Âá°D5aÆl'd$*8çìŸ^ʨähx!Ý>ŒeË|`3`úç‡À3I#L±—}øâG%ÿ"Œ;ª½èô ¤[x(`zw$"#ŒÙoŸ¨D_„ûÇ@¬G…g>ŽgŸ(ÑI=¥Ý_x÷™g/¹;V˜/ê:„öß}÷Ý%>³pá”c‘áŸ^n¿=ðÓ9äö‹Ùˆåð˜ãÆ+ñgOG¡ülOmAÛ‡÷§ƒ‰°&Ïï„ÛŸp9"]J‹¤Èõûi{¾äZ¡÷—ý²_öË~Ù_ýìwKþO‰V g%áùƉj¢Äx!¿Œ{ñ5¬ŠŠßIÃ37‰ÑܳleÝÿ<³WˆÈF˜ <Øìºë®‘Ñ:Ÿ1ØQãA‹™Íþ€pŸ¨0èl ½l¾ç<êÜî¼óμD6âe·Ýv+ÚÔ-ýu·±5VŠƃÌ[èíFÔ’ìËÕE¢¨0S6aÁCB… .,Ï3Ù£ Õ%ˆ ìîü¢¢ð’§;ˆêÄ@\º}£¦÷ržZ2Ò±CöE>þhãUdlµ;ãêÃÎÄãƒÝ9Æ%Öù“@ôQuFͳÍtNl£]´?™µ %'‰us î­p>j¶“MÒ+½wQ¡ÇxUÝ´Dr ÎC›h2<ºº¢B­‰A óãáLàÑÇûÍýÎeΉ¡ ÔítHD Hw}IöªNçDÝ7xÈýïAT–u"awûÑa5}¢žsa¢H¢¢-øí Ïܱ“UH¸xñb{þDDDÍGOtEi÷>ßÿ| ëOg¡ö—ý²_öË~Ù_ýíwÿï<ӄïp881Íÿ¢?»ÛˆÆsmžñUTÒœ4 %Jº2ïÄ7º°ÜE6 B"Ó‚¬,‚ïS(:(„kg³?b/*œ<*3s¶…1®.É ªÃ‚P-«ÈF$í²Ë.¶ÝXúë®=KÛŽ·:J "<C$›Šš×—¢ÉÕGØkT]nü,m‡œ%ÙÅÝ£KþD=Œ§Zc?¢J»‰£æšÆãìΛ›^Ït!Ø0Ù™ ŽØuÇCôFÍ™Ž(¥M‡½aÁ3J¸ˆ«‡N‡tç„0¥í9¶Ÿ‰öås|>*r€¶¥cÄ]ߨÄkê¤~¼ýQÛz×\sí}smˆê ³¯1ÞV N¿C†ëÉu¥ž¨¡ D¶Ð¦é®/ó`³_¨!ïÎ-꾡\{íµ)÷<Q=߈j:X¸¢Úƒ÷êÔ©“¬‡N¡°0 éì ó#Ýw)S¡'6]½ù|ÿ+¢¾òFöË~Ù/ûeõ³ßå ‰úoÄ9àDtÔøkž×œO÷즢âž1q´â©ìûŸgÍrÙ|yyu'BØH®b“pã¨X*d•Ây3—2']„Û2–&¬7ªàÅ£ê>‹HCØŽ§2Ýñ¢:BðL#¾ýýK¢W4JHó^Ø0yåœsÎ)qLþZa¢°°p^„,#øÓ?B›?’LÙ2él iŸ!ü6 ¹üº£¦5‹òFû=~Dàù:W„â­·ÞZbîï¨Âu¤>ÂÆÝç£Ä)‘&þ1‡<êa!¡žÿBõÃû†k‰‡<´0|¢42uð= „›pÿóD„YØ96¿5™¾o´¿=xþý~‰ò C"¼?‰^(íþÏ—°½Ã‡ºŠ¦¢ÏGöË~Ù/ûeÅÛOç:ކÃE YD@G=“¹ÂóûD ïRQ 8vÚi§J½ÿqŽ•»ÈFÉóåruF³„\…&Y‘Ã$EÄ`Ôþôš…"òÁŒÜ7*cvY ¤üzçÎ[bÄH®¶sÌ1%ÚÏoÓ¨öõ‰£Êj¼® B`Ÿü˜áé'ÜÙ °¨úܱ"cM„3L€%¯iz–ˆZðëCp2FœäuŽ3f”¨ß·±Ôœ«ÿ'FCû9ÖÙgŸmíá|[ô†’ ‹ýzh ƧkO:f/‹Â›Ž¸§^ÚŒ÷é4À{ZÚ—ŽzˆHÀËKÈ2uq^ÔÃw…„d4÷½ùE´®<òH²óŠ:ñDGMkFÇáª\ Æ;óŧÍÛŠºž¡ýœ bœ¤aþùÒ†gšk8qâD›Ï·•ä_þ9ÓFŒÇ*í~"ÌÇ·•Œí=zôH9î¼ü~݈s~,£îÂÔ˜*„(3\\;îOîy²ÿó»Â~¡ý$ÇÃkí {y?ª½¢à^ã»4vìX{D¸0Fžó Ó fºeýþgC®õecxÿçr~²_öË~Ù/û«¿ý8>x–ˆúƳ :jZQ ù~\¸8ylTTJ+<³1̯²ïžñË]dóPÎIì¸ãŽ·^V‘͘æ°MqÔþ„Ô†%]æñt_ð\ ^J¼Æ~Ø8õá˜Þ¨ÐöÒ`š"׎ÙâÚ›ñQãÍœn_Y¢qO=õÔ´IÒÂÂù¤«›ðÍBB2 
²µŸ)¶h³_ÿú×iÛÓmc鯧Û?WÊ»þ|'ûe¿ì—ý²_öË~Ù/û«žýìÏ´ª-[¶4½zõ2óæÍ+‘”7S!JõÛo¿•ºT±÷÷×ÚHÁªpÿ3]l®»L"›qÑ—]vYò¤Bößÿ¬EfÔœ×Ì…–­'›/fez²;wîœW¸8c Òµc¡øÕ¯~•\úëþ>dícÊ(<óÌõM2‹\>ïØgŸ}lï ¢ögŽgwœnݺÙp[z‰*ÛþB~>ÜŸ^^Æ+1V·Ð¶5oÞ<¥=™[›9¿«’ý¹χW:¢²ÝŸû–ñÐî5Q5LûÀ5¨Žöo ÷¿ì—ý²_öËþ­Û~ÞcÞlr2õèÑÃ:›–,Yb¾þúk©F•Œ…Y¦¾úê+;$—1Ø ÅíÓ§ÕUåþ'÷@…ˆl¾4Œ¥þå/iáànéDE¶"óÎ;ï,ÑØé²…ááaiß¾}ä¾|Á ]ð¼g Gìdc;‚Óµay^ÿ=YɃ}$vzæ™gìte¹ÖÏØ2G¶nÝ:rßk®¹ÆF<ñÄ6<ˆu²¤3¢2í/Ïú¿Ë¼$è+´m·Ür‹MBˆ¡V´=‰ýkWÙö—ƒH6vÉ%—dµ?C/^xáÛéàÞãžûŒÈêfÿÖrÿË~Ù/ûe¿ìßúíwÿÛ8Y£Ý»wo3hÐ 3yòd;Û³nðþ§?ýÉ\zé¥IpÚÁå—_ž„¤ÂW^y¥]’§É‡çHŸë®».…n¸!…o¼1 c|o¾ùæH’̳”/j·Ýv›]’°x þºÿÚí÷À˜Ç{Ì‚ˆô×Ýë{ï½7íçK[÷_;î¸ãŽä2Óº¿Ôç3ßoÚüu×¶nI{»vfëqõÕWÛûƒ±×'N´Þë“N:ÉÞKUéþ5jTňlRï3ÍÖ/~ñ {`é`^Ùl„&Œ‡…FO·T¯B7jßE‹\d/X° å„¶û…¶Éeú®° £Ú3ÓöLdS?¡õS§Nµ=‘ü@23_z¶ÑiR¿~}ÛÎ\W<…|Ž•÷y}øá‡ÛÏÕ¬YÓ =†¢ŽÏæÝwßm“GñÀk²¢sÌ1ö5õ0õٜɰÍç:è ÅÀñÙî×IfK%?æ´gƒ ’Ûðbâ%1^„<ûtèÐÁzî9G'îÙ¿nݺÖsJ_ò£>ÚNQAgÇÆVwl¼Å ‹ Ë>ÛˆhÀ#ðÁÛºIÆÇgÝþD`ë|ÑÇùñYz~ý/=Ÿãþ v É ¡*Q×?~иnœ;ö“|äµ×^³õ»}9&çÛ¿þ\ <â Çຌ¡´‡<^»½ZµjÙz8^ÿþýí6zûðÖcS‹-¬Mî³\ŽÁ6#ÒVlãÞ©W¯ž=w® ç@Ï%Ÿáºqýõ×Û}hoÞ?òÈ#í¸ ©‹zI B$ ûsÿÐæ´½;]vÙÅ^OΙ{™{ÛÙȹslw½©¯ßÇ|¿ßñý÷·‡È~Ù/ûe¿ì—ýÙžÏÏþsû Áóÿûü_ã|þãYâíö!Ì܇g:crÿe #ÿp·0`@ {Ïrƒ¶ 2ÄB´é°aÃ’ >Ü2bÄ ‡ÿ:ÜVÚ~~],hþºÛîÖÙæöñëÈôyÿxßß/ê<ܺo3K¿X§Àµ¸6déÚܵpQ•<»:¸~þuå¹%ל{¥»/xÎG ðœYï:x*Dd;T|™B8™Úµkg%4¹ ™¼Å>x³M”†¬Ð…6ÿˆ¿àÎVd#Ô²¹I\›úíë–…€ºÙ&LHƒh"wØŒ·”Ð~D "n×]wµb‡L{ôh½óÎ;vÁˆÈFdE?uq]7¼‡Hãs„aУ‰GvñâÅfåÊ•6;<¢š°ÂôÙÆyð#J]¤ó§'Œc³ aI]ü°Ó¶Ü+tpμO>Çž:ù½ª=¼ÍxÕé\`ÿeË–Ùýè¡ã3Ÿ÷fï;R<§x?üðC{.|˜ÿ‘Þ9~,{|–? 
>{ÿý÷[O>Ça_:øÑ£=ôÌ:›Ü¹#ð£®?”B'4AM$Â…^h_Ó#Èq8OÞå•W’ׇsCÈrߺ69í´Óì(öáf?Ú‡iªHxȱ¹F®ÍÙmœ7vrìçÇ›Ïr­9G¦$ÃËN{òcÅ9ÓÁýâ®1Ççº#v™œ©Ë¸.œ?\œOx é§¹/™N‹sb;=ãüp`'žûså:c3î^$šƒ÷±Ïp¾§á÷-ü>Fýþ•¶®ßߊøþçcì—ý²_öË~ÙŸn:Üù¿çY‹ÿj·tðœ¶ãx`%ÏH¬³Œ‚gÓÒ@ü»¥[Ç à^‡¸m,ýõtûó\Êv–þºû|¸=áñÂóññÏ#Êv¿\›ºkÀ5ñ§Æò“}ù^àmùþçµBE6°éN ‘“­Ød4¿lذ!í¾ˆž°ð µ/sZºL™2%å‹ä¼è Àƒ;þ|ëÝçZÐÑÁqñ¢b/õcÞì¨ëÉq »q÷’ÛÆ½Og?¬DtÐnx¬± QÍ1ÙÆ~ˆe¼ö¼¦-ÇŒc{$ȦN× Aóš:¨ßyå¹_¨.€ú¹vìËu¢­È>Ê9mD»ã¥ç{@ã¦9?¢xW›?Ú…ŽΛöà‡žkÈùpM»ã®!÷#÷õ³?Ç¢W•÷°ƒžR¦¤W?^ÓÁAO+mFç÷£»7 áâ~M÷=+Uáû_Z}²_öË~Ù/ûe¿ì—ýÛªýtJ©\¡"qõ?ÿó?‘pR„lÄ&èÙfèF„„…‡ì¨}B…,ÌçǺŸí<#>zôè¬l&,7]Û•7ÿïÿý¿äÒ­3•bO'Âa‹`¾è¢‹¬WÐÏ‚è&‘Bˆ uu#NÙ„Eï"b”ãQÇ$$…mˆl„½h¼F¼â­D°ºÏ#>9&¡½xG‰|AØFx4Û8/„.ÂŽÐÿžäXä@Xâ­dÄ¡êˆ[<µì‹Cd͘1þ¦ ÁÍ!^Ÿyæ™Vh"äxç[м¦M¨ŠàCd"Ã6D6^fw^tnàfÑÌ÷ŠC^s ¨ñÕžNd#>ý÷9O*qŒÝm#€k€8Ƴ<{öìä}àleD6‚™÷hꡳ…רÎ÷ËÝGnl mìÚ4ë¼O'Žñ¥­[µje½Õœ ûÒÉC;Ò¹CÇmî·•»†´¡ i':VxhvûÓ±A}lc*ڙζÑs‹w|îܹÖD6î³D p¾•õ=­Èï¿ý·vd¿ì—ý²_öË~Ù/û³µÝV—È^³f=øvÛm Þ±l'ÝÂ÷-j_<]d óËO?ýdãüÃ}WPȆ‚#pü²zõꬽ؈—tíV zñâ"¢ˆ\ÞÇ“‡0Á»éÆk D´NdÓaàêñEvÔqÙ\[„¡åˆ+np¶!²Oˆl^#¸ý„y»Ï3^Ãyv—ˆl·-Jd#(ÝvêAÐ1Üf¼ x:ÙxjÙ‘Íy2æž×OÄ!aμFDâ­EdóÏ/¡ãîº"Š©›pw_d³ ±ÕÑ„â³îD6b’׾ȎjO_d»÷ð#žñÓ¾tš0fÜmg÷ðÓÁ…È÷ëä;Èæœé4à=_dó‘M°Îõ#Tœ¡sáÂ…v/:û»ûèŒb̵/²Ù—ëÆ5§Ó‘7‘íÎÎ7w b஡Ù~;ƒ/²ñz#ü×l# ŠNî{^#² qwŸ¥£‚k^•¾§B!„¢â i_…‹l ¤7ÝIÞ™­èd ®_þñØñQû"*‚Ø÷#áS|ª °\Ý„K‡bŸ‡õllÅÃ÷µ*Ý@xñN‡ï9À¸`’A¹÷xHË"²i#„ c?Âm¡ÈF ñšðr±²ˆ2ÆsÞœ÷â­4‘M’=Ä'ÇFH’l¡ª‰lB«©‡Î<®ND–&²ñèÓV´¢,BˆP:*ðÈã‘u½uˆVBßÙÆgËúÀwšˆ „8!í¼ÏØyÂßË"² ±w×ÉíO›Óv¥‰l:8?"؆wŸºh+ìÅ7D‘Ígi;Ÿ±?Ÿ÷E6Ã[ê@ Û¹6ls×X"[!„Bø k*EdNœî¤xff6ÂsÖ¬Y3y;Gßÿ}ʾk׮ܗ†)Dùá‡R¦æBlù…¬çé¦ áᾪÝ@x²ÔáûˆiB¶>xFñDFŒ`I'²÷Zš'›0æt"Û…‹»÷èhA„!â8>/&bŽã#’DX» Ñv"›ótuá=çÜÜ”|.Ù„0;‘ 'TÝÙ|Ž×„J3®ÚÙ.¼:‘í{gé8 Z€uB×IÀEسo“߯¡ÈfªiWD&à‰v‚–ëŠ@¦ƒƒûÖ%7ãûIGÂ!K' ´5mÈulä5u2VÚ‰lÞ?ûì³SD6BÞ‰lÂìÈf<<ŸgÌ6vÓ†tÜðÛ€È&tÜÙ\7D6õr †pßµ@Ä ×o?¶¸kˆÈvÉÏ#ç^¡ÀÙt@p/ÓÖ„ÈãÅç¾vÇváâ¾Èæ8úƒB!„Øö`xjYvÞ"ûÓO?-Õ+‹èÈ֛̓·_~üñG›Ž>j_¼ua!4Ü/h! 
]¾Ç/ïó“ý²_öË~Ù/ûe¿ì—ý²_öËþBÛÙ£GXôvar­]ÿ•9¸aÇô"{ü™só>ÈÊ•+MݺuíÉ;¡Í2„1Ídí.´àÒ´Oœ8Ñ<þøãV`Ϙ1Ã|ñÅvŒvy7nlo¿s¡4ûýméöÏ´=S}!…®¿ª×'ûe¿ì—ý²_öË~Ù/ûe¿ì—ýQû_rÉ%ófu~z‘ýÌKËòªü»ï¾3}úô1‡rˆÜ2Øe™÷:[Þ{ï=›ð ‘=oÞ<;…y³Ëë˜Ð´iS{Á2Ùïð÷‰Ú?×í!¹Ö—ïùe:¾ì—ý²_öË~Ù/ûe¿ì—ý²_öWûÑm$Ì.„È~òùW¢Evvýò®ü†nHk0|ðÁ6¬º¼<ØQ"ûÎ;ï´IÐ(ôV”çqG;Ý…Ç~·ô×Kk¯\öÏ•ò®_öË~Ù/ûe¿ì—ý²_öË~Ù/û«ªýMš41ëׯÏ[ãl>¦uï’"{æWäUñÇlŽ:ê({²tÅ­;#è-(1ØQc² _ºt©¹ûî»í{?üðƒ9ýôÓËýØÀ„çQöW4þ9”å|rý|¾Ç“ý²_öË~Ù/ûe¿ì—ý²_öËþŠ´ÿœsÎ)ˆ7{ÚœËJŠìGŸ~!¯JG4&xyË[à¶oßÞ j¦ízþùçÍüùóíûûÛßìØìŠÙÀ¸ôLíQÞ„7UxÓmíÈ~Ù/ûe¿ì—ý²_öË~Ù/ûeiŸ!¯Ö‡~˜·È^òÄs©"ûàFÌ77•¹BbÙ<ðÀpÒnÉœd!nO;í4óüôk×μùæ›æÞ{ïMŠìóÏ?¿ÂD60öÜÙï·ETû”e{¾äZ¡÷—ý²_öË~Ù/ûe¿ì—ý²_öËþʶĈy‹ì ß|cjбXdw>%¯ »uëf8à{’né¯×ªU«Â„ío¼açÇf}Íš5vL6ëïÅ‹W¨Èfì91ÿ´CéÚ+ÛýCÂÏçZ¾äkmõꫯڡBÿÇŸ³ý#GŽ´Ÿõëzýõ׫´ý[ûõ'’åÓO?M¹&ШQ#Ýÿ²_öË~Ù/ûe¿ì—ý²¿Ríî¹çòÚ†L,Ùs®¸±ì™Ôž|ÒüñL FUˆ¨=õÔSm‚3ް&»8ëÿùÏìë^½zU¨Ð®_¿~©íSøÃ’K½PûW5¯ÏTk~Ù¼y³éܹsÎö/\¸Ð„…÷ªK[l×Ê”)%®É»ï¾k#>tÿË~Ù/ûe¿ì—ý²_öËþÊ´¿ÿþy‹ì³/½®Xd/zô©2W„huF…ì¿ÿþvlrEˆÙž={š 6˜÷ßß¾0`€}ˆ¿ôÒKí8mWž}öÙ Ù@FuÚµIºöÚÚðmÎÆþaÆÙο0çy:ur:. öÞ~ûí‚næÌ™UÚþ­ýú“!,D—èþ—ý²_öË~Ù/ûe¿ì—ýUáÜ™¥*‘½`ÉÒb‘½ü£Ueª„1Ï®£‘ðßÖ­[—»ˆíÞ½»ùè£Ì¿ÿýo3jÔ(ûã¯)ŒÑ¦#À/wÝuW…ŠìfÍš¥Ül…"lïð¦®hò=Ÿo¼±„{æ™gr>‘_~ùeJ=7n´S¹Ueû·æëO ~/Âræ™gêþ—ý²_öË~Ù/ûe¿ì—ýUâxDFç#²ßûpe\d“ô¬¬•L:ÕžÌ~ûí SY•·€íÓ§YµjUÒkíÞ'Œ2pà@ë!¥5Ή¯x B…víÚµ“må.¢[wí•i{®äZ_¸&r=¿Ò·åË/¿\B„qM³­×1|øpóÿ÷)õ,_¾<åT5û·öëÏ„Ÿ}öYÊ5ùöÛoíPÝÿ²_öË~Ù/ûe¿ì—ý²¿*Ø“½˜Ï|Ù$?Û®IaeËž Í& zß}÷Mž¤[g‰çª¼…ëØ±cíyPî»ï¾äû:u2ßÿ½5’×Ó¦M³û¬]»Öz¶]!ÉV=*Ddã]õÛ©,ømì¯3† ã_¹´b’Á-[¶Ì<ñĶ jÖ¬™Rg×öä“O67ß|³ /ä[o½eë ƒb„ æ˜cŽI{>gŸ}¶9묳칰œ}ºmSî}Ž7xð`iâ:½Ž=öØ2ßÿ‹N¶›nºÉ&ªp6Ðwß}·m›#Ž8"íçb‚î;3{öl›œ}¸·iƒÓO?Ý\tÑE)õ¸kÂ=M¸;Çã:s/rOSöûí3iÒ$Ûžåñý¯¨ÏçZ_EOöË~Ù/ûe¿ì—ý²¿ŸGKäãÍnÐu°Ù®ï¸™eú0Scí³Ï>öDX†ð ^ž¢õŠ+®H>´/]ºÔ´mÛ6¹ ï'ÁÀë¹sç&ì™lüá‡N¾þú믭 ­¡}ÔQG•ùsíì··['åüO?ý”"`Èâü—¿üÅÚæg¼Ú>„[Lœ8Ñ 6Å¥+?þø£Ù²e‹ï^û1cÆ”8ÆV#29'?”ÂqÉPÿ÷¿ÿ=å3džæ>òí'S=âÌáQÛþú׿–x¡ÚЏ~ì±Çl(yivÒ9ƒ¨õ¯¿~ôÑGÛ{Œ|?]ù׿þeV®\i“áårÝýûÕ®'a›Qè\ºîºë"ë"Éß+V”ðöû…N*D0áýöÈ#D&•C´s=IV¶;çò#”åþ²ÇÓ9Ày¦+tÒH¯^½z%êbÜ>×",ˆãóÎ;϶¯»ÿ9F×®]íçèád®uºö[¿~}‰ÜÌt õB}ÿó©3Ÿã•öû_ç'ûe¿ì—ý²_öË~Ù_öãäËGd÷3Ýl7þ̹eúð 
AƒÒ6.ÞÁò×ÊÖzn;ëëÖ­³Ûþô§?Ù÷n¿ýö”±¹x]89å¿ÿý¯¹ÿþû­×·Ù–¯¾úÊ~áÂëÁ5¥3'ìTHWèôhÕªUÖןìüxÖs½¾›6mJ±ÙuPpßsí³->ú¨mwWó¾óÎ;‘×*<'_œ¾ôÒK‘Cr½ÿ¹.»ì²R;F¢2˜3tů“Hİ_¸ïñ„‡íC¾’ïñYΙ0÷\ í˜É¾²~ÿ³ý|®õç{>å}~²_öË~Ù/ûe¿ì—ýåa?N5eÙcgo¶;ó’ksþ ‚‰‡w²!„€–‡H%<Ó…‡S^|ñÅ W]u•݆—Û…‚ºë¼ „™"(ýj<–Œ3/O¡Mwºv+ ˆNTAø!Â0]Wo„ܺíž{î)á‰Å+J¦n:!JQƒp]w>Ü”„IgS8/:<øÑa¡óÀÕ‹PCÌFyñsí¸7è0 aè„£»ºH„÷Ml²ø¨Ž<˜x8£ŽCØ7H®æwN×!A;á±÷½º*c=²½¾Œe“¸ù^{Ž“NàÿùÏNÖƒ5ªSv`8¡öŒ[;0þö·¿™~ýú%ë!‰mM¡?þx+PC¯1׌ëëýNØ}T„ÅçŸn½Ë,£ŠBT´EºBÄßB¾£:7¸çè<áÇ8]ôǹçž[Ðï½B!„(?˜º¬"{ÖÜ«Ìv_[Î\²d‰Ù{ï½í ¸¥¿Î˜ÛB SBÁ1Ô(Fp…áì‚mîýp¬¯óL2f“±Á¾‡•fÂs %.‘Ø£²!]ûúû06”°ê¨°d„Ù݇ bí ‰¨ŠÔƒØÅûn§üó!¼:,ˆjw^„w§ åŽ:>F É/ˆ\wlꦃ% æ~@ܰõàd\lX¸WIÔÕ°aÃÈ0a„#!ëxj ÿæ bÏ5õvM'OÔ}Eh4íNX8^~2ݾïîårý‡ZBð#ì ÙfL1ǹøâ‹Kˆy áÔ®Î)¬‡viÑ¢Eòx\wž~¹òÊ+“û04¡´0s¿Ð)ñ£X÷Ë'Ÿ|bøs¹ÿûaçdÜG\WÆ?Ù/ûe¿ì—ýÕË~çlÁÇ.‘p! UŒzß[:ÈõþÖˆ³•e¸îÃs[iÁèó«[ÃXýõž‡}Bä­qËpݧk¸tðšc`›s`UµûŸa®eÙ^}‹ÙïÊùƒ$âà5jÔH.ˆ”BŠR*…sã¹ 6BÜM„PFxñ~»víÒŽ“EϘ1#96×/xE™W¹Ð"Û‰¿íÜÅ Û3ÓvÀ[ÏØÓ(¢£ ”¢!ÂqòôÓO—ð’rIå@D05Ê#îÎï$"?…$†"Ì—¤k$Mówùå—ÛÏ Cñ„ Å‹Ëv¾€Q!Ó„>“lÌo/€+$~síæ"±ߜƒß¾Ü7¡0¤MfZ…Ø€·Ó¿><Àt$p?g{}YšÖŒmaýx¡ÃB›SÂ3ìø c ›ÝuÖ£:ðˆ»ó‰J*Gç S„ü“ß®¿þz;ŽÏq­£ÂˉNÈöþ'b#ŒHqÑ,ü‘øû“Ì,,ØO‡ûpܨh ¾ØJB=†ôŽDrüyD ÉàEøû׎ðþçûèß§¹\ÿLßÿÒê )tý™öÏ„ì—ý²_öËþ­Ãþ½öÚËþ·"®q8”p4à!(ð,B9K·ÝßÏçÁp®ÓÙ®³ ×£ÞËô™tÛÃ÷3?Ýçý×áûûÑSG8xXœØ÷á‡;l<¸üˆp^ØÆ¢ûß…ðzÑ eºýi³¨iÍðŒ‡ŸG Fy²Ù6kÖ¬È0ñðÚB”˜$ƒÔC4@ÔÐ…ùv‡ö09,D¦ärÿó#&ñCÌ—žÇ°p/s¯Sb;ìtàw?”¨ëÃu #-(W_}u‰ß¿¨ï#ýÒîÿ|¾ÿáñK»ÿ2Õ—ïùe:¾ì—ý²_öËþ­Ó~’æò_ɳÿË<áø"Œç ÀyåÖ3AÞ•Ò s¿4ˆî ñß/m½42—oÏÑ<7–Ï¥Ù­m4S:^XÚöÒÈtœLçõ>÷C(çÏŸoç¥æ¾A_ò,׺ªÜÿ8ùÊ"²¯ºõn³ÝuwÌÏiÞ/.:Ûé ã V¨ìáa".F”à%”Ø…°âm¤‡Ä¯'SÁó8dº…àŒótžñBÍ™½çž{ÚörË(ü}ÒíW3,„‡uq‡AÊ"ù2†»ã0n9*ܶ4{ø‚……°j·cÅÓu$ø6"Ä„X\;zJÙNO÷p8všÏ…íKa>Ͼ®7ÐÏ àD=ƒé®e¶×Óïg8€mÄ`¸oÔ˜v¼Òl‹ ¡Ï¶ðÝ ò€zè™&Ì; …æ;™ÎºŽ Âü™»>—ûߟzÏOºGÇIø9¼ÐQûr¯³ÞÓÐÛÌvÿ{éußðA/q¸oT$¹ÊrýsÝ?WÊ»þ|'ûe¿ì—ý²¿jÛï"XéÄÆËɬ!**¹4÷N7¦OÅa‹Ð® ÷?à Ë"²¯žwOn"aWÆT_¶BˆPÆ8†…‡û¨qÒjßÛGÛßõЛ®8¡h?‡']!ì#ü%]æB:Ï-7(Û÷Øcä2*ã7!âl ¡¦³!’ B¦éä B=˜ö)LnFˆH:»ãÂüýB§ÛéòÓ‘ÖÕY@݃6à<£’s1&$¬‹Î–01‚/(íu"Š]{ûm_ÖëËý&Òr¡Çþ~xÖ£2~îÏö¨öËæÚžm:)°/~Q@æm'N}›Ý:Ña‡ãßéŒË¥-¢:ZðÔGí5]˜ËTÏyEE[0KAX³!*Ò‚ë€h÷÷'¬(*·÷R!¾ï¹u=r¹sý|¾Ç“ý²_öË~Ù_}ìç5à ØL«Fžº¤´<;á¼ÂS›mÂQ•m³ð¼HT1¬á³neÝÿh‹ 
Ù|Yn»í¶äI‡¯}衇"þp_İŸÔŒÐƒp¬vTøkºÂ—Ÿ‹ë>ËøÌ°Ý8_ç4];f ž´0áž[~ðüýxøÅ8ÊÙ³g§àxÈýñéÀË玃—7Ì>N2Â4 ŒÃ3BãÆ6¶“Ø!j¼12¬‹ èaÁKî¶_xá…‘Ó;1†(¬ë¾ûî+±/Y¸¹nŒÕ]t2¤³“9ÿ5$ü€øÐcæï5­\á~ˆñÐÓJh5a7ldzÓñÁüÕÙ\_ÆS¹ãD…B3 ´{9Ê«ÌPþ˜ý¿°-¶Œ 9ÇCæx cÄÝ>QÑD¼¤³ß„¨ŒõÜ›þ~Œ s ð›ä·¡BQÝÙ}÷Ýís ¹\ó ‡ZòìÇ3ÞI:¼s™&VeÛ+èò!´I’WÙ÷:"²×苾dn „næ#>ï½÷ÞMx,㪣ö!!~üíŒÅ͵0…OWG”WŒPæ|…6 "ü6ŒjÏLÛ£ÄcÖ©Û¯ƒ±£aFh7­û°=,†¸zÂóA$Ñ»C§†Û5ö×Õ“Î>¼Âá°D5aÆl'd$*8çìŸ^ʨähx!Ý>ŒeË|`3`úç‡À3I#L±—}øâG%ÿ"Œ;ª½èô ¤[x(`zw$"#ŒÙoŸ¨D_„ûÇ@¬G…g>ŽgŸ(ÑI=¥Ý_x÷™g/¹;V˜/ê:„öß}÷Ý%>³pá”c‘áŸ^n¿=ðÓ9äö‹Ùˆåð˜ãÆ+ñgOG¡ülOmAÛ‡÷§ƒ‰°&Ïï„ÛŸp9"]J‹¤Èõûi{¾äZ¡÷—ý²_öË~Ù_ýìwKþO‰V g%áùƉj¢Äx!¿Œ{ñ5¬ŠŠßIÃ37‰ÑܳleÝÿ<³WˆÈF˜ <Øìºë®‘Ñ:Ÿ1ØQãA‹™Íþ€pŸ¨0èl ½l¾ç<êÜî¼óμD6âe·Ýv+ÚÔ-ýu·±5VŠƃÌ[èíFÔ’ìËÕE¢¨0S6aÁCB… .,Ï3Ù£ Õ%ˆ ìîü¢¢ð’§;ˆêÄ@\º}£¦÷ržZ2Ò±CöE>þhãUdlµ;ãêÃÎÄãƒÝ9Æ%Öù“@ôQuFͳÍtNl£]´?™µ %'‰us î­p>j¶“MÒ+½wQ¡ÇxUÝ´Dr ÎC›h2<ºº¢B­‰A óãáLàÑÇûÍýÎeΉ¡ ÔítHD Hw}IöªNçDÝ7xÈýïAT–u"awûÑa5}¢žsa¢H¢¢-øí Ïܱ“UH¸xñb{þDDDÍGOtEi÷>ßÿ| ëOg¡ö—ý²_öË~Ù_ýíwÿï<ӄïp881Íÿ¢?»ÛˆÆsmžñUTÒœ4 %Jº2ïÄ7º°ÜE6 B"Ó‚¬,‚ïS(:(„kg³?b/*œ<*3s¶…1®.É ªÃ‚P-«ÈF$í²Ë.¶ÝXúë®=KÛŽ·:J "<C$›Šš×—¢ÉÕGØkT]nü,m‡œ%ÙÅÝ£KþD=Œ§Zc?¢J»‰£æšÆãìΛ›^Ït!Ø0Ù™ ŽØuÇCôFÍ™Ž(¥M‡½aÁ3J¸ˆ«‡N‡tç„0¥í9¶Ÿ‰öås|>*r€¶¥cÄ]ߨÄkê¤~¼ýQÛz×\sí}smˆê ³¯1ÞV N¿C†ëÉu¥ž¨¡ D¶Ð¦é®/ó`³_¨!ïÎ-꾡\{íµ)÷<Q=߈j:X¸¢Úƒ÷êÔ©“¬‡N¡°0 éì ó#Ýw)S¡'6]½ù|ÿ+¢¾òFöË~Ù/ûeõ³ßå ‰úoÄ9àDtÔøkž×œO÷즢âž1q´â©ìûŸgÍrÙ|yyu'BØH®b“pã¨X*d•Ây3—2']„Û2–&¬7ªàÅ£ê>‹HCØŽ§2Ýñ¢:BðL#¾ýýK¢W4JHó^Ø0yåœsÎ)qLþZa¢°°p^„,#øÓ?B›?’LÙ2él iŸ!ü6 ¹üº£¦5‹òFû=~Dàù:W„â­·ÞZbîï¨Âu¤>ÂÆÝç£Ä)‘&þ1‡<êa!¡žÿBõÃû†k‰‡<´0|¢42uð= „›pÿóD„YØ96¿5™¾o´¿=xþý~‰ò C"¼?‰^(íþÏ—°½Ã‡ºŠ¦¢ÏGöË~Ù/ûeÅÛOç:ކÃE YD@G=“¹ÂóûD ïRQ 8vÚi§J½ÿqŽ•»ÈFÉóåruF³„\…&Y‘Ã$EÄ`Ôþôš…"òÁŒÜ7*cvY ¤üzçÎ[bÄH®¶sÌ1%ÚÏoÓ¨öõ‰£Êj¼® B`Ÿü˜áé'ÜÙ °¨úܱ"cM„3L€%¯iz–ˆZðëCp2FœäuŽ3f”¨ß·±Ôœ«ÿ'FCû9ÖÙgŸmíá|[ô†’ ‹ýzh ƧkO:f/‹Â›Ž¸§^ÚŒ÷é4À{ZÚ—ŽzˆHÀËKÈ2uq^ÔÃw…„d4÷½ùE´®<òH²óŠ:ñDGMkFÇáª\ Æ;óŧÍÛŠºž¡ýœ bœ¤aþùÒ†gšk8qâD›Ï·•ä_þ9ÓFŒÇ*í~"ÌÇ·•Œí=zôH9î¼ü~݈s~,£îÂÔ˜*„(3\\;îOîy²ÿó»Â~¡ý$ÇÃkí {y?ª½¢à^ã»4vìX{D¸0Fžó Ó fºeýþgC®õecxÿçr~²_öË~Ù/û«¿ý8>x–ˆúƳ :jZQ ù~\¸8ylTTJ+<³1̯²ïžñË]dóPÎIì¸ãŽ·^V‘͘æ°MqÔþ„Ô†%]æñt_ð\ ^J¼Æ~Ø8õá˜Þ¨ÐöÒ`š"׎ÙâÚ›ñQãÍœn_Y¢qO=õÔ´IÒÂÂù¤«›ðÍBB2 
²µŸ)¶h³_ÿú×iÛÓmc鯧Û?WÊ»þ|'ûe¿ì—ý²_öË~Ù/û«žýìÏ´ª-[¶4½zõ2óæÍ+‘”7S!JõÛo¿•ºT±÷÷×ÚHÁªpÿ3]l®»L"›qÑ—]vYò¤Bößÿ¬EfÔœ×Ì…–­'›/fez²;wîœW¸8c Òµc¡øÕ¯~•\úëþ>dícÊ(<óÌõM2‹\>ïØgŸ}lï ¢ögŽgwœnݺÙp[z‰*ÛþB~>ÜŸ^^Æ+1V·Ð¶5oÞ<¥=™[›9¿«’ý¹χW:¢²ÝŸû–ñÐî5Q5LûÀ5¨Žöo ÷¿ì—ý²_öËþ­Û~ÞcÞlr2õèÑÃ:›–,Yb¾þúk©F•Œ…Y¦¾úê+;$—1Ø ÅíÓ§ÕUåþ'÷@…ˆl¾4Œ¥þå/iáànéDE¶"óÎ;ï,ÑØé²…ááaiß¾}ä¾|Á ]ð¼g Gìdc;‚Óµay^ÿ=YɃ}$vzæ™gìte¹ÖÏØ2G¶nÝ:rßk®¹ÆF<ñÄ6<ˆu²¤3¢2í/Ïú¿Ë¼$è+´m·Ür‹MBˆ¡V´=‰ýkWÙö—ƒH6vÉ%—dµ?C/^xáÛéàÞãžûŒÈêfÿÖrÿË~Ù/ûe¿ìßúíwÿÛ8Y£Ý»wo3hÐ 3yòd;Û³nðþ§?ýÉ\zé¥IpÚÁå—_ž„¤ÂW^y¥]’§É‡çHŸë®».…n¸!…o¼1 c|o¾ùæH’̳”/j·Ýv›]’°x þºÿÚí÷À˜Ç{Ì‚ˆô×Ýë{ï½7íçK[÷_;î¸ãŽä2Óº¿Ôç3ßoÚüu×¶nI{»vfëqõÕWÛûƒ±×'N´Þë“N:ÉÞKUéþ5jTňlRï3ÍÖ/~ñ {`é`^Ùl„&Œ‡…FO·T¯B7jßE‹\d/X° å„¶û…¶Éeú®° £Ú3ÓöLdS?¡õS§Nµ=‘ü@23_z¶ÑiR¿~}ÛÎ\W<…|Ž•÷y}øá‡ÛÏÕ¬YÓ =†¢ŽÏæÝwßm“GñÀk²¢sÌ1ö5õ0õٜɰÍç:è ÅÀñÙî×IfK%?æ´gƒ ’Ûðbâ%1^„<ûtèÐÁzî9G'îÙ¿nݺÖsJ_ò£>ÚNQAgÇÆVwl¼Å ‹ Ë>ÛˆhÀ#ðÁÛºIÆÇgÝþD`ë|ÑÇùñYz~ý/=Ÿãþ v É ¡*Q×?~иnœ;ö“|äµ×^³õ»}9&çÛ¿þ\ <â Çຌ¡´‡<^»½ZµjÙz8^ÿþýí6zûðÖcS‹-¬Mî³\ŽÁ6#ÒVlãÞ©W¯ž=w® ç@Ï%Ÿáºqýõ×Û}hoÞ?òÈ#í¸ ©‹zI B$ ûsÿÐæ´½;]vÙÅ^OΙ{™{ÛÙȹslw½©¯ßÇ|¿ßñý÷·‡È~Ù/ûe¿ì—ýÙžÏÏþsû Áóÿûü_ã|þãYâíö!Ì܇g:crÿe #ÿp·0`@ {Ïrƒ¶ 2ÄB´é°aÃ’ >Ü2bÄ ‡ÿ:ÜVÚ~~],hþºÛîÖÙæöñëÈôyÿxßß/ê<ܺo3K¿X§Àµ¸6déÚܵpQ•<»:¸~þuå¹%ל{¥»/xÎG ðœYï:x*Dd;T|™B8™Úµkg%4¹ ™¼Å>x³M”†¬Ð…6ÿˆ¿àÎVd#Ô²¹I\›úíë–…€ºÙ&LHƒh"wØŒ·”Ð~D "n×]wµb‡L{ôh½óÎ;vÁˆÈFdE?uq]7¼‡Hãs„aУ‰GvñâÅfåÊ•6;<¢š°ÂôÙÆyð#J]¤ó§'Œc³ aI]ü°Ó¶Ü+tpμO>Çž:ù½ª=¼ÍxÕé\`ÿeË–Ùýè¡ã3Ÿ÷fï;R<§x?üðC{.|˜ÿ‘Þ9~,{|–? 
>{ÿý÷[O>Ça_:øÑ£=ôÌ:›Ü¹#ð£®?”B'4AM$Â…^h_Ó#Èq8OÞå•W’ׇsCÈrߺ69í´Óì(öáf?Ú‡iªHxȱ¹F®ÍÙmœ7vrìçÇ›Ïr­9G¦$ÃËN{òcÅ9ÓÁýâ®1Ççº#v™œ©Ë¸.œ?\œOx é§¹/™N‹sb;=ãüp`'žûså:c3î^$šƒ÷±Ïp¾§á÷-ü>Fýþ•¶®ßߊøþçcì—ý²_öË~ÙŸn:Üù¿çY‹ÿj·tðœ¶ãx`%ÏH¬³Œ‚gÓÒ@ü»¥[Ç à^‡¸m,ýõtûó\Êv–þºû|¸=áñÂóññÏ#Êv¿\›ºkÀ5ñ§Æò“}ù^àmùþçµBE6°éN ‘“­Ød4¿lذ!í¾ˆž°ð µ/sZºL™2%å‹ä¼è Àƒ;þ|ëÝçZÐÑÁqñ¢b/õcÞì¨ëÉq »q÷’ÛÆ½Og?¬DtÐnx¬± QÍ1ÙÆ~ˆe¼ö¼¦-ÇŒc{$ȦN× Aóš:¨ßyå¹_¨.€ú¹vìËu¢­È>Ê9mD»ã¥ç{@ã¦9?¢xW›?Ú…ŽΛöà‡žkÈùpM»ã®!÷#÷õ³?Ç¢W•÷°ƒžR¦¤W?^ÓÁAO+mFç÷£»7 áâ~M÷=+Uáû_Z}²_öË~Ù/ûe¿ì—ýÛªýtJ©\¡"qõ?ÿó?‘pR„lÄ&èÙfèF„„…‡ì¨}B…,ÌçǺŸí<#>zôè¬l&,7]Û•7ÿïÿý¿äÒ­3•bO'Âa‹`¾è¢‹¬WÐÏ‚è&‘Bˆ uu#NÙ„Eï"b”ãQÇ$$…mˆl„½h¼F¼â­D°ºÏ#>9&¡½xG‰|AØFx4Û8/„.ÂŽÐÿžäXä@Xâ­dÄ¡êˆ[<µì‹Cd͘1þ¦ ÁÍ!^Ÿyæ™Vh"äxç[м¦M¨ŠàCd"Ã6D6^fw^tnàfÑÌ÷ŠC^s ¨ñÕžNd#>ý÷9O*qŒÝm#€k€8Ƴ<{öìä}àleD6‚™÷hꡳ…רÎ÷ËÝGnl mìÚ4ë¼O'Žñ¥­[µje½Õœ ûÒÉC;Ò¹CÇmî·•»†´¡ i':VxhvûÓ±A}lc*ڙζÑs‹w|îܹÖD6î³D p¾•õ=­Èï¿ý·vd¿ì—ý²_öË~Ù/û³µÝV—È^³f=øvÛm Þ±l'ÝÂ÷-j_<]d óËO?ýdãüÃ}WPȆ‚#pü²zõꬽ؈—tíV zñâ"¢ˆ\ÞÇ“‡0Á»éÆk D´NdÓaàêñEvÔqÙ\[„¡åˆ+np¶!²Oˆl^#¸ý„y»Ï3^Ãyv—ˆl·-Jd#(ÝvêAÐ1Üf¼ x:ÙxjÙ‘Íy2æž×OÄ!aμFDâ­EdóÏ/¡ãîº"Š©›pw_d³ ±ÕÑ„â³îD6b’׾ȎjO_d»÷ð#žñÓ¾tš0fÜmg÷ðÓÁ…È÷ëä;Èæœé4à=_dó‘M°Îõ#Tœ¡sáÂ…v/:û»ûèŒb̵/²Ù—ëÆ5§Ó‘7‘íÎÎ7w b஡Ù~;ƒ/²ñz#ü×l# ŠNî{^#² qwŸ¥£‚k^•¾§B!„¢â i_…‹l ¤7ÝIÞ™­èd ®_þñØñQû"*‚Ø÷#áS|ª °\Ý„K‡bŸ‡õllÅÃ÷µ*Ý@xñN‡ï9À¸`’A¹÷xHË"²i#„ c?Âm¡ÈF ñšðr±²ˆ2ÆsÞœ÷â­4‘M’=Ä'ÇFH’l¡ª‰lB«©‡Î<®ND–&²ñèÓV´¢,BˆP:*ðÈã‘u½uˆVBßÙÆgËúÀwšˆ „8!í¼ÏØyÂßË"² ±w×ÉíO›Óv¥‰l:8?"؆wŸºh+ìÅ7D‘Ígi;Ÿ±?Ÿ÷E6Ã[ê@ Û¹6ls×X"[!„Bø k*EdNœî¤xff6ÂsÖ¬Y3y;Gßÿ}ʾk׮ܗ†)Dùá‡R¦æBlù…¬çé¦ áᾪÝ@x²ÔáûˆiB¶>xFñDFŒ`I'²÷Zš'›0æt"Û…‹»÷èhA„!â8>/&bŽã#’DX» Ñv"›ótuá=çÜÜ”|.Ù„0;‘ 'TÝÙ|Ž×„J3®ÚÙ.¼:‘í{gé8 Z€uB×IÀEسo“߯¡ÈfªiWD&à‰v‚–ëŠ@¦ƒƒûÖ%7ãûIGÂ!K' ´5mÈulä5u2VÚ‰lÞ?ûì³SD6BÞ‰lÂìÈf<<ŸgÌ6vÓ†tÜðÛ€È&tÜÙ\7D6õr †pßµ@Ä ×o?¶¸kˆÈvÉÏ#ç^¡ÀÙt@p/ÓÖ„ÈãÅç¾vÇváâ¾Èæ8úƒB!„Øö`xjYvÞ"ûÓO?-Õ+‹èÈ֛̓·_~üñG›Ž>j_¼ua!4Ü/h! 
æ®NL(òÙØHâ0¦:ªj7žGß[íƒ@ÁÛŠè dï°›j aŒ0rû"žyÏ —D¡ÂNŒù p¹Þþ6„ cß96‚’!¾'Ùã®М— ƒÆ«Ë\_œrÎÌ×H=J:E°qF/’¦¹}'Mšd¸¹ó@ø!¦yMb.ÂÏß.TQGˆ³‹a݈M¶!䜘ŽÐv¯ýtpŸ!*ɘèjOÆ.c'TÔ§£Ã…àßOÎÝÍ)I[aŸ|Ä/¢•ö!*€÷ÔìOönÚŠï—ë ÂvÞsulŒsqÃG¸Ïi?·óGØS÷'œ#‰á¸®Ó…ë†-ÎsOg êèXaÈí^CµëìpyÜÐ’«ù×…¶Á¼î,ývå;àGæÐþÜ[ú“B!„Øöà™²ÒD6à +Í›xËF€û·¿ý­Ä8g?Ù˜®†˜3w7çâï‡G­ Ïü¹¸ñŒùOb¶ Nĉü!£8N„;‚ˆ0b:CÿÕÕ&D%‘û ÁŒ7˜po'|…B!„å ŽœÉ•*² «-í$sñfãí"‘™_ÿŒÚ¯&c·Ó›<Žá>¹’/¹úð¨ùeݺu)aä¥Ah¬[, #²R@0cŒñNâɬÎ6!² ûFX»)¥H ¨ë-„B!DÅ€³+]‘ýÝwߥ gu±:[¡Žw¦0V7j_Âpÿûßÿ&÷#Œ›pXƵ–µA¡N=$H"„Ýìf:¡lí"›²nZ!„B!„¨º¯ªÒE6dÊÂK6`Æif+H£Æ\“œ)Qލö·“…º,åŸÿü§Í6L,Õ¾ÀN7—wÌïì“B!„BQµ`¨s¾»`"{Ë–-=µdNÎV”ɘþóŸÿ¤ß%K–Dfñ&c±_H®ä¶1¦;×q¼ã.‘ɸ°Ï¦îbº¨\lqS$ !„B!„¨š0#N•Ùàæ³Mž\æ¨ÍEœŽ7ά_¿>E3É®Â}Ö$)£@dXnc\s)dAvcÄÿþ÷¿'ßg:!2>çb”uà !„B!DÕ¥eË–ØÙ€ˆ.íÄ™¾ª¨¨('‘Ú¥K;¯?öÏ2s‡ûNœ8Ñ|óÍ7vævï3%R¶ÞlæÏíܹ³™?~ò˜|oy.ç 'œpBrZ#!„B!„U¦½}íµ×ª¦È~å•WJ7˜Ž(—ñÙÆ@¿ýöÛ)‚ø©§ž*áÕÆÓüâ‹/Za쇖3WðÍ7ßl·ýðÃiEö† ’ã¯ÙßÂ+[ˆçÿÕ¯~¥›V!„B!ª0&L(˜À.¸È†)S¦d4"—i½B¦Nj{œ—™pnÄs8ŸvÔØmG×®]m²6²‘G¦{öÙgíx천#Þúßýîwºa…B!„¢ óÇ?þÑæÝªÒ"{ãÆæˆ#ŽÈʘ² mÎ8ð/¿üÒ cÆn‡B;xÁß}÷ݤ¸þâ‹/Ì­·Þjú÷ï_æó"–§vÒ +„B!„U†ö.]º´ »\D6àiþÍo~SîBÛÁ|ÕQc´<ð@2™YïóÏ??§)¹Jó`ï²Ë.ºa…B!„¢Š3cÆŒ‚ ìrÙÀ\×ÙFèxYÆhg‚Äg„„³Ž·›±Ø¬ÓÀ¹úxÍš53Ûo¿½nV!„B!„¨â´hÑÂ|ûí·ÕKdÃäÉ“³2dh¹f/îÝ»Û9¶ï¸ãûúóÏ?7/¼ð‚]'<ü£>*¨À®_¿~Vž{!„B!„•KÍš5­F, \î"›,ݽzõÊÊPDj®óhGA&ð·ÞzËfwc«×¬Ycž{î9»~ÞyçÙñ×÷ß¿éСCÞÇcü9s€ëfB!„BˆªÍ^{íe–/_^n»ÜE6lÚ´ÉNg•ÁˆÕC=4¯ðñ§Ÿ~ÚsæÌ™É÷V¯^mßw¯ÉFŽ¿ôÒKó]£F ݨB!„BQ ÖK/½T®»BD6¬[·Î4hÐ kãwØa‡2{µÛµkWâ=<Û>ø`Æý²åÈ#4¿üå/u£ !„B!„vÅ‹lذaƒ¶Ù6éÔ÷Ýw_Ó´iÓ¼Cº™Ú«´y³³¥^½zfçwÖM*„B!„Õ„½÷ÞÛ¼òÊ+"°+TdÖ-[ì8é\„rÄvãÆ ž<[êÖ­kvÝuWÝ B!„BQ ‡ÖŠ+*L`W¸ÈvÌš5ËzªsiößsÏ=Í1ÇcZµjUîºyóææðÃ7;nN!„B!„¨fѼvíÚ Õº•&²aÉ’%6³[Yë¿ø…õn#¸Ã…Ö52µjÕ2{ì±GÎB!„B!*ŸŸÿüçæÂ /¬p[é"ÈúHηI”¶ß~û™Ã;ÌÔ©Sdž–·lÙ2­˜fâq±!Ò9ä£ÿë_ÿZ7¤B!„BTcŽ>úhóòË/WšÀ®t‘íæÒ¾úê«ËuÌ3^iy¦…B!„bëäw¿û™={¶Ù¼ys¥ ì*!²Ÿ}ö™2dˆÄ°B!„Bˆ¬ªýúõ3«V­ªtM[åD¶ã¹çž+H¹B!„Bˆ­f¡êÕ«—yóÍ7«Œ–­²"ÛñÄOØyµåÙB!„B;ï¼³™0a‚Y¾|y•Ó°U^d;Þ~ûmÛˆdüÖM%„B!„Û?ûÙÏì4ÎóæÍ36l¨²ÚµÚˆlØ.\hú÷ïovß}wÝlB!„B±•²Ë.»˜®]»š›nºÉ|þùçU^¯VK‘f$þùçÍ9çœcÚ¶mkvÛm7݈B!„BQM9è ƒìë‹.ºÈ¼ôÒKVóU'ZíEv+V¬0<ð€9ï¼ó̰aÃLQQ‘©U«–Ùk¯½Ì/~ñ‹‚Þ¿üå/ÍN;ídöÜsO³ÿþû›<ÐηÍ<ÝGy¤©]»v™`N·|8ꨣò‚sÏÚ;h/Ç¡‡š5kÖÌ 
®_>|ðÁ9Ã}ãÖùQÉê*$øÃrÚÿ€È‹}÷Ý׎¯É=še…iÓAG]i=“ ™ê+í\ ;³iG~÷Þ{oQ ùãÿ˜÷÷M‘i_“ÝüÅ_4¯½öšþᇚիW›/¿üÒ|óÍ7ÕºGE!„BQu¢s¿ýö[;,vÓ¦MfãÆv¼ñ×_m¾úê+«_Ö®]kuáÒLy kÖ¬±úä“O>1ü±ºjåÊ•ÖéˆvAóì½÷Þ3ï¾û®yçw¬®´Ðo¼a–-[–3è#tT¶ðÎeݺuÛܵµ"ûÊ[îÒ.„B!„BäÉ¥7Þa¶»äúÛÕB!„B!DžÌ½æV³ÝìK¯Wc!„B!„yrÆÅטí&}±C!„B!„È“ñg\h¶8ñt5†B!„B‘'ýN9ÍlWÔg”C!„B!„È“¦=‡›íoÖM!„B!„BäÉ!:›íö;¡ùxõ5ˆB!„BQFV¬ZmÐ×Vdÿå©çÕ(B!„B!Dyxé3Å"ûâënS£!„B!„eäÂØÙI‘ÝwÜL5ŠB!„BQFzžV,²I~¶eË·j!„B!„"G6oÙbmÒ¥Xdó//Sã!„B!„9òÔ ¯§­“"ûÜËoPã!„B!„9rÖŸ®-)²v¢ÆB!„B!r¤~çA%E6,{û=5B!„B‘%/¿ñ¶ñuuŠÈž5÷*5’B!„B‘%3.¸"½È>²¨‡Ù¸i“J!„B!„ÈÀ7ßl4µšwO/²áΫ±„B!„Bˆ Ü6‘ 5u ‘ݼ÷H5–B!„B‘&=†eÙ°ø‰gÕ`B!„B!D=ú”‰ÒÓ‘"»U¿1j4!„B!„" E}Fe/²áÞEªá„B!„Bˆ€»>bÒié´"û¸6}Ìú¯¾V !„B!„ ÖÅtò1­{ç.²5o¶B!„B‘J8/vN"û'¶3¯¿ý¾R!„B!Ä6Ï«o¾kö¯×¶ì"Úôk6mÞ¬B!„B±Í²qÓ&Ó²ïh“ICgÙ0ó‚+Õ¨B!„B!¶Y¦Í¹Ìd£Ÿ³Ù°ð/Oªa…B!„BlsÜ¿d©ÉV;g-²k5ïn>\ù‰X!„B!Ä6Ãò«Ìáͺ^dCãîCÍ—ëÖ«¡…B!„Blõ|þåZÓ ë`“‹nÎIdC»§˜¿nøF .„B!„b«åë¿n0mû3¹jæœE6ô7ÓlÞ²E /„B!„b«ƒ¶z™nÊ¢—Ë$²aèÔ³m s]!„B!„[ ßlÜdM:ÔU+—YdC÷‘SÍú¯¾Ö…B!„BQíYÓ·]†M6ùèä¼D6´ê7ƬùüK]!„B!„Õ–ÕŸ}aŠúŒ2ùjä¼E6Ôï<È,{û=]!„B!„ÕŽWß|לØi€)„>.ˆÈ†ƒt47Þµ@H!„B!Dµáú;î³z¶PÚ¸`"Û1bÚ9§-„B!„¢Ê¿&¡w¡5qÁE6œÐ¡¿YôèSºpB!„B!ª ÿò¤©ÛþdSz¸\D¶ãäñ§™V~¬‹(„B!„¢ÒYþÑ*ÓwÜLSž:¸\E6ܰ£9ÿª›­+^U!„B!DE³výWæ¼+n²ú´¼5p¹‹lÇ‘E=ÌÅ×Ýf¾\»^Y!„B!D¹óåºõfî5·šZÍ»›ŠÒ¾&²5w6ÓϿܼùîr]t!„B!„çõ·ß7Óæ\fiÔÙT´æ­p‘íÓnà)æºXºôOÖ|®A!„B!D™ùäÓÏÌ5·ÝkÚ8ÅT¦Î­T‘íØ¿^[+¸/Œ¹ñŸ}y™Ù¸i“n!„B!„iùfã&óÌKËÌWßRéºʉì&o?h¼9õÜKÍw-0?û’Í'ñ-„B!„ÛèÀå+VY]xãŸï7SÏý“Õ‹Öï`ª¢ž­’"»4mÒÅÔë8À4é1Ì´è;Ê´é?V!„B!ÄV:¯q÷¡V÷¡ÿª›fÝN=#B!„BˆªÀ§Ÿ~ªvØÊØn[,ºðB!„BˆªÀŒ3Ìo¼¡¶È–ÈB!„Bˆ|X¹r¥iÖ¬™™5k–ÚC"["[!„B!òᢋ.2M›6µB{ÕªUj‰l‰l!„B!„ðùâ‹/ÌË/¿œq¿uëÖ™V­ZY‘ —\rIÆÏlذÁ<ýôÓjg‰l‰l!„B!ĶÁ–-[¬x>í´ÓJõNßtÓMV\8Ð.Û´icÖ¯_ŸvÿE‹™nݺ™k¯½Ví,‘-‘-„B!„Øv3fŒÎ-[¶4W]u•ùúë¯S¶oŠÍÜ¥K»IϦNj×o½õÖu½öÚkføðáI÷SO=¥6–È–ÈB!„ÕŸæÍ›W§žzªÚ»šsùå—'E1tîÜÙ,X°À|÷Ýwvû<`ß:t¨¡,[¶Ì¾îÚµ«Ù¼y³Ýç“O>1gœqFJ=ðå—_ª%²%²…B!DõÓºukëU,† b1eʵw5çÑGµ×²oß¾fìØ±IÌ5~ñÅÍ€ìëÇܸ2xð`ûÞ=÷ÜcCÂñ‚;oø™gži×{÷î­ö•È–ÈB!„[È=z´)¯‚—S"{ë`õêÕöZvèÐÁü÷¿ÿ5O>ù¤éÙ³gŠGºGæÇL^'Ì}Î9çó׿þÕ,^¼Ø¾Fl«}%²%²…B!„D¶Dö6‡sM¶qÊ?ü`n¿ýv 
á<Ö~ùÏþc›±mÔ¨QfùòåÉmsçεïßu×]j[‰ì­Od“¤àwÞ1?ü°¹ñÆÍ…^h3’¬`òäÉÛü𻥿.ûe¿ì—ý²UöË~Ù_ÞðÌų×\`ŸÅzè!óî»ïÚg´B<oÜ´Ù¼ñþGfñ³¯š{ÿòœùó’§ÍŸÃS&²O4ÔÜþГIîð–ŽÛÁvy[biy(þÚ¾÷P|Öç=øDòýÛL³Lì7ïÁ¥ žHY¦û|ñ>î8KK0ïÁÇͼ…v½yì¹æ‘ûÏ3Ü7§˜ûç˜áCz– §lܸÑN×õý÷ß—¸ðX‡ûS\ò¯šj–Ì?7vŒscËs,Ì?×¾Çúâù³“ïûëqR·-¾÷ì³-Kî¿Çr‰·ýá{bË{XžïöaÛY–%‰ý’ïÝã>wVòó®nw,x(öþC÷Ä·»å"»œmyÈãÁ»â,ºû˃w“x¯x¹ðϳÍÂ;cüùœäò;cü¹˜w–äþ;Xž›ä~ˆ½wÜî-½õ% ®4Ï=qùô“å±ïð·Ù¹ðþûï›y󿙉'šaÆÙñÜìýû÷/c,ÜÒ_Ú·,ûšB_öË~Ù/û«¢ýºÖ²_öo}öó,6hÐ ûl†'3óŠ+ìJeyÞû|íW1û”9õÒy¦ïŒKL»qsL›1³Mëѳí2Jds<„Y¸òÊ+#Ev³ö]MËQg›–#Ï2­XŽŠ/[Å–¬ó>Û[Œ83¹_ ÷ÞÈÄ>£ÎJ®Åöcý‹´L¼vŸuëEÞ>–ágØeóØ’},¬'`{ó¡³Lóa§Çö9=¾ÆëÓbïŸfš¹åЙ¦Ùà¦Ù 8MO7ML‹­Çi:˜å©¦éÀ©±÷‹iÜrŒI¦ñÉ,§Ä–±õ~c/õ`öoö‰/õ‰Ñ›×§$— {3 zM0Æ4è9ÖÔï9Ê4ìÉzŒ£-õ»´4è9Ú´Œ·÷)§›Y]k–>÷ŠÙœ¸¯Öù±y|ÑŸÌw3N;Öœ5ösƨýÌé#÷5§Ú'¾Œ1¸çQöz^vÙeyu¾üýïßÍš˜™Ã÷1³FþÞœ>â÷f¸õØrÖð½Íi#ö¶ËY±åi‰å¬pi·Õˆ3¢x9Ë[ÎrÛ‡íefÆ8mxb™X· ó–Ê·ÏL0}hl94¾ch|9-öúÔÁ1†ì•\NcP 3u` 3%ë“ì•dRì½I±å„þ{™‰ýcë1&ÀÉ5Ìø“÷Š/ûÕ0§ôÛËŒë£o »<¥ïÞflŸqúÆ—£{ïeÕ«Fr9ª×^fdl9²gärDϽÌð±e÷v9<¶Úm/3¬{ìýî±zûlfŽk`®¾h„yåù‡b¿?›%²3A˜_ B6Ò‰ê“O>9¹ô×Ëk{H¸¦óË•\ë—ý²_öËþêh¿®µì—ýÕß~7Ùš¯~ø¡ùöÛo³~æûë†oÌ5÷,6'œ’ž0µ4þ:Jd¿òÊ+%ÆÑfKXWRd·ëyüäkO;1Ü"![øë9|¾„°öHŠh+žgÅóiqÏ4M‡œÈ3­hnÑMÅÄs –žjš œf—cËF1‘ì@07Š ç†1‘Ü0&–ãËI¦A߉¦b9&”Y6ˆ-ë÷:ÅÔï=.¾ŒqRÏq1Æš“zŒ1'öLÐ}”¥^7–£cË‘æ„n#âË®#,u»7u»À0s‚]5ÇwbŽï” ¶^§ã`S§Ã`s|‡A±eŒöƒÌq혓'œa^}ó]³ñ›¯¬‡öœñ5=1“1;eð~f|ÿÌÈ>5cû0Ó½C{=GŒ‘—È~óÍ7m=-š72ºÕ2ÃzjÆô;ÈLø‡˜XÝ×Öˆøy$Åñˆ½J¼ïÎ5¹Q#e½´Ï'Eyb}æð¸pžiÙË.§‰ êCâLŒF@ïiN´gL@ïi¦Ä–SÆ—ScL°gL4/aâɰGL0Ç×Ç÷Û#&’aÏÄr˜pŽÑ{˜XŽÑ{÷{˜Ñ=w7£{ín—cbËQ=ãŒì£Gœá±õáÝËúîn†vÝ-¹[é²›epç]cÄ–v5;zÄ^è°«éß>†[Æè×vW3ó”fæÝ·ž“ÈNûCK0pçwÚ?~¸ûõë—5n÷Çà¿W–ýs­¯¢ë—ý²_öËþêf¿®µì—ý[·ýn$óçÏ/1?q:žzå-ÓeÒIñÙÂóêºõ(aüÓO?YOcYø×¿þ•VdG?Ý{þ9Giº()žOOˆé˜ˆât\Dã¶"zP±v":.¤c˘×9E@§ˆèbݰo\87Lè½Ç'ÅsRDǼÎ'õ›ÐVLw‹g_L[ñì„tB<#¢O°ËaqÝiH*[ñ\,¢ZíslÛþæØ6'›c íÉæØÖýÌ1­úšÓç^eÞ|y‘™ا 8Àtiw¼i×ú¤˜ømXjgJQQ‘]Ö‚&)­þæ1wë LǶõLŸÎG'¼ÙGQ<¢X`Ÿ6"Ý5’â™×3Úz YOˆçéCŠE4âyZB@Ÿ:8þÚ‰g+¤Lò…tÿ=cây+ž'œ_߯X<ŸÒwwsJO<÷ÙÝ.ÍŠh'žaDTÍú°n»[áÒ ñlt ¢!Ý?Ñ'·‹‰h„t»¸˜f½olÙ§í.¦O›]-½ÛìbzµÞÙ\qá(‰ì(V®\ifÏžm{CI“ß§OŸð>!ã§œrŠËðôÓO·ÙþÎ>ûlsÖYgÙ±B¼OÊ~êqŸ‰ªË߯Ò_/Ôþ™(ôñËÛÙ/ûe¿ì/‹ýºÖ²_öo[ö#¶çÌ™cçÎäÕ>ûÚ»l¸uQàöÅkEÉnê{²½ã…륈èæ)"úô¤7Úz 
x&„{èÌbO´ÑxŸ$ij'¢§õDǽÑVD;ï³óDû"º÷¸¸:! Yâ‘>±GB<÷(Ñhç…Nz£ÎS…´/ ñB;}\í ècÓ&.¢i ¶^»eoS»EostŒÚ­Xö4 »1w^34 ¾¢ƒº‘"vIZFçάY³l$,☌᯿þºMjVÖ¼ØÏ?ÿ¼[›<矾A”m«V­’ÇoÛª¾õ¦ÏŠÕ³’"ºFRDÏð<Ñ3aÜ3Þèé Ot\HÇ=ÒÖ=°XD;Ñœâî_, 7zBÌ=>é…v"z÷„:î…¶"1ðF[/tB<[1cD÷’":î}Þ-鎋è]‹4ËN»™A‹…ô€L"ºz—bm×c"ºu\D÷²Ëøz–©to±³Øõ ‰ì’šMš4)åG½W¯^É%ï#œÕqB“H„¾ˆn¾ta…À?¿(Âãå»®çŸïçe¿ì—ý²¿öëZË~Ù¿mÚÏ’g»÷Þ{¯T¡=øÌ+bÂ45¤:OögŸ}f®¸âŠ2qï½÷fåÉ.5”Û h_D#žBÚz£­':åà‡rŸj":Щ¡Ü otÌ+]¿7z\ ôD÷(öDÛPî£KŠçD(·/¢CÊÍ2:鉶ëýŠÅ3bºUŸ¸xŽ é£¡E/sTLD]ÔÓ-zÄ–ÝÍ‘Í4/k5íj.=½õî:;¢Ï!Ö“Ìu;ãŒ3Ì¿ÿýoSQ…„íÛ·OÖ¦^LïS"”{† åfô=“"zš çNië…„r,Ê÷Dï™ åžÐÏó@'„tq(wBD÷Ù3éí‹hOH‡žhuÜ ½kÂÔ¾ˆb½Ð»¥†rwŒåîày Û":é‰Þ%æ‰ÞÕôjU,¤{¶ŠÑÝŠŠéÒú¨L|þùçÑ"»m礈ö½Ðñpn¼Ñӊø¡ÿÔ铊9/´ ãn˜=.žT¬wÂí…rÇ—Å!Üq/ô¨ˆ0î¸ú„@H‡¡Üx¡ó¼Ð‘"Ú å½ÐÅÞ脈nÙ+é‰F<Çtä2鉶^ènV@εRDtçoÔÑÖ°ƒ9<Æa ;šÃ´7‡Aýv– ¦ÖN ÁNŒmžKBÖª¨½~dÎ'@y•ûï¿?ÙQÓ«cí¤ˆvc KÑ"B¹ý¤bãb¢Ùz¡](·ÑnuȾ±qÑ'ÚkصkWóñÇôþøïÿk®¹æš¤ÀîÛùˆ`<ôž%ÆC/‘•;ÑññÐ.™XÒ Šh/3÷/+·ï‰.N*æ‰è,B¹Ã¤b) ÅÜxè6©":›Pî"º™/¢‹…4žhë…ö¼Ñm! "ºQ\D·n*¤[ž´£iuRñ²Å‰;˜¢z±e=‰l ó_“‘’/ð£ÎÑÀvã±Ù .4¯¾újVŸqBÛý‘„Çw4!þ>Ù쟉B×WÞÇ—ý²_öËþ²œ¯®µì—ý²Ÿ%^mHmذ!åy°ËÄó#Çcgòdã¬yðÁËijÏ>-²[wL;ºXH/ m—ã’Y¹ýÌÜI/t ¢OðE´›Öªó°dX·]RDNŸ•»])ã¡[§†r;j»1шgÄtó`<´óD7óDtlIH÷M|/tªˆÆ m=Ò bëõ‹=чÖo[RDŸˆˆn™Ó¬\·EŒ¢äò ã¡¹9wâaÅÓZð§¶Š3}èï“í¶mÛÚB†¥:ݧÓ¥‡NˆhÂÝ«xé‹h_LûS[•½krj«x8÷î‘¡Ü.©Ø?¡˜Ͼˆvžè”ñлä<ºD(wè‰fÙ$. -Ö½£ÓIÝpÇHOtëúqáìÓòÄc"z{+¢‹êÅõ ;˜æPwûø2F³º;˜¦Çï ‘MÂÆZwîÜ9ùCÞ¥K»Ì"îsÏ=÷˜¥K—ÚqÙ/¿ürÖŸƒñãÇ'› þ9úëÙ~>×ú*úx²_öË~Ù_Ÿ×µ–ý²_ö»ú˜K{ÅŠ)aãÇÏ)9µUbJ+—T¬¢Æd7nÕ1b<ô¸ã¡Oì>ÆÏn<ôÈ”ñоÚŠèÎ%“ŠÙñÐ%¼Ðƒ2‹èÖýRÄsíÖÞxè–Þxè„÷Ùz [øã¡»Ù¥?ÚŽ‰ŽðBÇC¹møÛ'Äs;»nÅsŒ¸xŽ‹hç…¶":¶<¤®Ñq9¡E\D×i£™98±LḦæÀc›šsN949½ÕLoj+;:6þ™ðm›¼®Y3Û™SȲeË›HÙÍ™=¤ûþÅã¡{¥&é%s Åâã¡w‹Ï=Ìeç.á…Þ-r"áFDѤ‹—P,î. 
å.)¢žè„ºæI1Î5-­^èVV4ûžhç…>["¢Œ‰æ¸x.ÒˆèkbH,íëcËØþ¼Ö¸šÉi­ìrH|<ô¤5Lç¶uìõ#!Sn•Ganu"mãB¾‰é×é€âdbÝ‹ÇFgäö½Ð¥Ï=À‰h<´?7t»„ºMê˜èÞÉi­vN›T,·ñÐ;&ÇCwlê¼Ð‰eÌ3 ½crÙ¶AzÝÊ åvâÙ÷Dï˜ôD;/t+š"Ú çÄҊ阈nT'¾lœÕÞo´­‹ì·ß~ÛÎ/×±cÇèáL—E< z0I@€›ùð˜ùðßÙÖÁÛüøsü:$—þº;¿\·‡„ûg"×ú2_®Ç“ý²_öËþBدk-ûe¿ì÷éÛ·¯ùðÓÞìö£ÏJ$+žÖªq0º¢Æd7jÙ®XH'=Ñ Ýe¨õ<‹è¡‰i­{¡K2?të`~è–}Љ¹¥ åî™ å¶ÞèæÅ^hÄ4ã¡÷Ù-‹“Хއ¶ÉÄœ˜öŠù^h' k¦Œ‡.mEôñ !m½Ðqáì/}ñÌ:"ùÀ„7ºÄºÓá{Ÿ9}tM›Tl’7­Õø“oÚ·ªŸJ+vߑž<ËO?ýdg6r=ÚÕŒåÞµ8¡˜?:œÚó@'EtrºÖ²_öËþ†ù¹±ÙmGÌŠOoåeån”’Tl|…‰ì†-Ú&ø£’ŠEއf~è¶þXè“S§¶j]œT¬8œ»x<ôQ.œ»yd(÷Qn~hÊÝ$! ­:uz+²rží„´Ï)"Ú èâppîb!Ý2î>¡EÒ#ånωÐíäò¸T/t±`öEs“èí%ÄuæÏO~p<¡Xb<ô¨žûÄÆ`Ÿd¯s²‡ã˳WÊ íN­ŽHÑN@‡"ºmqFî¸':*”{—²‡nê‹hôÎÞXh¹wN†rûžè6 ²]Î]”ån^7Æ]쉎‰g'¦Ã0îã=í æ@4§ÚêdÿùmVdÚƒ˜&!?²íÚµ³Kz6³ñbßtÓMv\^l^/Z´ÈNvï½÷Ú›ž,Dö÷ßoßËFds\Σ¢ñí÷׫Ký²_öË~Ù/[e¿ì—ýe9¿)S¦˜/¾øÂŠì6Ãf'cÔóD{Ÿ[&—I/t]'¢›ÇÇCßÜÒž˜Žq@”·¹„ÞwÛ¡Þ…øüÔ¡ÇŠÅ“‹ é¶lltäÐ1c†¹õÖ[ÍâÅ‹m„+ÓÅá|#+x!ÂÄ×­[góI=õÔS¶ƒˆLã8 ÝñÛ¶¨:/tLL§„r·Î!”»YjfîÎÎíB¹›Oiå’‹µ‹ðB·¹[Gx¢€Nz¡¢™pnçnŠˆN†rï`EsJ8÷ñ¾ NÇ%óÅï;\§pŸßfE67(½Mˆl²ŒgÃ?þ¸ Ó` ¯/¹ä;Gâm·Ýf'†w¡ý裚üÑ,_¾ÜŽÿÎT7Sz…ç”+mÚ´I.ýõ|ë­.È~Ù/ûeÿ¶`¿®µì—ýùÛÏØl—­õàSãóB'²rÛ©­º{c¢c‰ÄòÙõùmVd?üðÃvìµûqnݺµ]òÙNÏ™3ǼóÎ;öfë­·ì{$8[³f×x³ý²zõjsã7Ú^,~ÀéÝ*Md“éœs ñÏÑ_O·=$ÓþáöLÇÏ·þ|/ûe¿ì—ý¹žŸ®µì—ý²?êø<òˆpl5hŠM$ŸÒ*1½U0?t>"û±Ç³ŸgŠXž÷¢2dH<»xÓfq1í‡v:™‘»[b½[ŠºVÓ¨©­:&“‰ž’T¬]r™â…NfæöÇAûS[ÅB¹'Úåvx:9º„pmZBø–¼éö©ÄÏO\³x^èXR±Ak˜~÷3½ÚÿÑto[Ótis„éÐòè˜Wùø˜—»~|>ëØøÿ| ÓÛDgM›˜ÍêZ!ݦy-Ó®èPÓ±è@Ó©h_Ó¥¨FL8ï) ;G‰è„h¶":á‘fO×î˜pæÙv~RàŽ;îÐËc#jÏÎI_×^=©¦ÆW9ÉÁ´ÉÞ²qDívÖûÚj£¶™åûq|ѯY“-c"dŒDkk«þ€5­tÇOÝ%ÕÀ1ܦ ¹ÉdŸEá‹_ü¢¿Nº€çB2ÞbÌM¡‹ðµäC7|O!„B¨4züñÇõw²l&Û|q.f7õl»ÒLsø~Â*öýUrüI~ŒY»v­î© ?Èd‹K|G ¸3n;×}H14ñ ÂTVú-•Íšì/}éKjhhH555$¿hʯI¶ñ•î=¦8…YgJî¿ôÒKþ:™;¯½öšÞïÙgŸõ«Úב_6åÃ^±Ñ¿'³,­½¾÷ÙîŸïñ…n/öõˆŸø‰Ÿøg³?ï5ñ?ñË4^Ï<󌞛xÇŽ9¿8KVr6ÕÙGƒ-VNÓXn“\i¦¹˜ñ'Íd'•$ýìjâséß½Ú{Prþš5Ùbt¥k¶˜Zù ]¹r¥ßÊzÛüJ÷žßþö·ºÛ¸é~Ûm·é9²¥Û·éF.e÷³ñ•¯|EÝu×]:ã-†[ºÛ×Ù°aƒÿǾ{9éö°ò=_X¹ÎŸ¯ ø‰Ÿø‰?ßøy¯‰Ÿø‰?ÿu×]§§c“$CY©µ¹–‘œËñçóÿ$É}¹U륻ø\PêÿGs9þš5Ù2Wõƒ>¨ÇJ„?¤e E8›ýðÃëqØ’­–.ß²NºŠÛse›×QÈøk)œ&….Ädßwß}óKUsÉ¢—êS¡Hs/— 
½ñ?ñ¡ñó^?ñX_þò—u•oc²çJÆo.u›­¶ø‹i²Ä`Ko‡j1ϨÆM¶´Ò…'êƒyýúõ㥿ð…/h£-%öOœ8¡×9sF›èx@¯3ÑÂócËv¹¦ìGy$ãÜ2v£˜¿Îæûko±ÿpæ{½bÿ:MüÄOüÄuÞkâ'~â7’®¹¯¾úªúÅ/~‘w&Õ®¤Ç«ŒµÉ÷wÓ&‘}Œ½,Ùì¨íaåº^Üùã¶ûü³¿\÷WÎøkÚd‹!–®Þz~ˆêƒf˜a1Ëï¼óŽÎLßÿýzÝ׿þuýZº•¿øâ‹&û«_ýª^/YpÓÝÜ–ÌÏX诳¥îR•ïýz¿ÄOüÄOüÅŽŸ÷šø‰ŸøíýeZÕ?üá¾ÉþÝï~‡PNýÈn²©.¿6D}8‹ù–y«Ã¦Xº}‹±³}öìYuË-·¨Ï~ö³êöÛo×Û¤kù¹sç|“}þüyõµ¯}MO7/v¥»*öý•zâ'~â'þbtç½&~â¯Îø¥Ç¢ÔÇ0Ù“ Pd“-ÅÌ>ÿùÏë±QÔ2f;Êh‹nºé&ý+¨L&Ë2‰¼œËB;uê”zôÑGõ¶¨ãå¼á,z¥üš]î.^ÄOüÄOü¥ŽŸ÷šø‰Ÿø¥•¢·O?ý´úóŸÿŒÉF˜l€R˜lAÆI‹N¥R±í¨®ãáñÚRøL¦ô’)¿Ä\gÛ_ºˆK¡³JéVîqV•Ö%Žø‰Ÿø«?~Þkâ'~âÝ{ï½zÖ&a²J`²…·ß~[íÞ½[µ´´Dþa¹eŠ-™Ë:Ê4?ñÄzJ.™Lº…‹éŽÚOŽ—j…r¾rVÐ,uÅÐROÓAüÄOüÄ_Žøy¯‰Ÿø«;~™²ë­·Þ |Äd#L6@‰L¶t— “[·nõvÔµLïµcÇŽX³'™LÎ-ÙòJøCVì_³/ô\—ÄOüÄOü³¹ï5ñíÄ¿gÏõÊ+¯èï|˜l„É(ƒÉ6F[ ¡íÝ»7`´£ÔÚÚª¦¦¦Ô®]»Ô 7ÜaªeÞë믿^íܹS—å—ýËY±³ÜÓbTZÅRâ'~â'þ$c²y¯‰Ÿøk#þ£GêdJØ`çk²å2•k©%ßU1´˜l€ª0ÙÆhKÕp©nŠ’%ý —.àRLCÚB+júës¾¸Šý‡Žø‰Ÿø‰¿Òâç½&~⯽ø%ÉqÏ=÷¨ßüæ7‘;_“-skË÷ÃRkÓ¦MZL6@õ˜lÃ{ï½§+OʤóbœKý‡ëB )u…Râ'~â'þJ¨.Î{MüÄ_;ñK-3gΨwß}7ëw¾|M¶ÌHsüøñ’I†&b²1ÙUi²…÷ß_D“Jᣣ£~±²JüCXìqVºKñ?ñ±ãç½&~â¯ø'''ÕéÓ§Õ¯ýkõ׿þ5ç÷½|M¶ ,%bâ1Ù˜l€ª5Ù™GQº?ùä“zœuOOOà¾]¶*½Kñ?ñ¥ÇÏ{MüÄ_½ñËw3ùŽöÔSOé‰LÏšL6Âd\“mg¶ÿô§?iÃýío[=öØcêÎ;ïÔ•Æ>¬ç¾F!„B¥“LÃuã7ª»îºK=þøãê;ßùŽþn&Cýò1×q&ûíÿ÷kõÝNýÛCŸW_¸û_Ôcwü³zìöϨÿãhh`°,&{lõˆs½ ê¶Ìן»í¤ÓžTŸ;á¶îò?©ÓÎkÑç¬6¬Ó·ŠîÏÐçn5ÛNúËéuö¾îk÷|ÁöÑ[îst¿úWgû¿zí#Î:ýÚÓ£·Þ§÷ó׸O=r«èÓ¾NÝz¯:u˽öa§}øæ{Ò­£‡>ñ)õ/7J=t³ÛFé›ïv—où”zð–ÿ­¼9­üönOîºÇîyP}í±ÿ«~öƒ«ßž?É(…É€êÂ6Ù¿øÑÏÕÇîT7MîUÿзU]×~•:Ô6­‹ZgToGW†É–iÁ$ó<íÛ·/Òd¤zôõµÊu§ÕÁ–uêP‹ÛòÛéÀ²Ù\Nïgö=ì·éõîò´Þv8´ÿ!ëÚš§Ô–µ~»¿yRíkrä¼Þë´{›&Ôi'ôë=N+ºv嘺¶qÔm]³bT]³ÒÓŠµËi¯nX£v®”åuõŠaµcŵÃYç· «Õ¶åCj{ÃÛ.w_o­T[—¯R[t;¨6רM˼¶_/o¬ïW—yZÚ§6,ëU–¦µÞÑÌ’5³´G]ålÛ´b•Ú‘S‡G7«'>{Z½íÐÃIÄðú믫çž{Ž¿*5ÎË/¿¬~ô£©·^CÝ{ð©ë»6js« n‹k®µAu^G™l1é«V­š•Ö¯_k²£®X¯3æ;½ÞlÓË-éå$ÇluuÈkêub¦ÅD¯ÓË£¼ß1Ðû½vŸ6Ò®‰ó,íîÆqGc^;®Íô5Fž‘Þé˜ê]Ž™ÞÙ0ìšh1ÍËE«µy¾Ú[Þ¾Ü5ÐÆDoqL³˜è-b¢e1Ò[ó¼y™k ¥Ýà˜æKúÜvi¿Ú°Ä1ÍŽakzI·c¤]M/éRë®ìTë;º²Ã]vÚµ‹:ÔT£E)Gj¢.¥vtŽªÿÊ“˜l€8d‡o|ã±S9@m YìŸÿüçêÌOªº7¥3È6Êd?“q}ßH·FéˆåC!}ÈËŠë,t‹1Ò¢IÏ@»2轞–eÛ@ËòµŽ´y6Fz…e¢ ôή‰ÖÆÙ1ÐÆH§Í³1Ò«<óìhÓ.s3Ñ:-&z‰kžÅH»fº7‡v¥ ôâ.×D;í”6Ð)ß@O:í¤óz¢®Ýi]MÔµ©ñºV5fitA³ºqË^L6@ÒèÙgŸUç±P»<óÌ3Jz9>ø?>ét 
ŸñMlÐÜÎhÓZN“uýp6úÎ0ÑŽ$#ÝìvëÖ]ºCYè½Þò‰e¡WZYèF“‰ÑJéá@Ú6ÑÆHoudŒ³ÎF×§ ôfÓ{i¦‰¾*d åµ6Ћ»}3­3Ñ‹½l´§)“…öŒô¤麔gžƒzÜÓèBQ³YØâ´-NÛ¬ÖÌoRÃiôä¼vÚ‰%í˜l€l&ûÇ?þ±úÁ~À_€å­·ÞÒCå»áÝ×|Ì7ªbbÃF6.“ýÆoèBl³Ñ‰'bM¶\?¾+·—…nuÇGë,tSÈD[ݹ]ãl›è± ‰^m¢µqÖæy¿lÆ@û&Úëº-FÚ6Ñ›M&z©»èÊ‘…›h×@wZݹ­®ÜVwî / =2Ñ~&Z ´—‰ÑÚ1Ò Z<óܤ%Z´zþJG¾VÍ_á¨A·ƒóÔ ³Œ“Èb²ez‡o~ó›d³j™1æìÙ³ºp™|7üÇmÇ"»gÛ㚣L¶Œéîè蘕dJ°8“íéÀxhcžÓèôxè,]¹Å@7Œx&ÚëÊmLtCPÁ®ÜCÁ.ܺ°Ø€e¢Ý‚b$­Ís¢®Ü®‘î èi/#m…6ã¡íîÜm ´—Ö3í˜è‘Ø™è&ß@YFZŒ³k¤]‰‰˜·<­+–«¾ËëU¿£>O8)€,&[ôË_þR¸¾ûî»ü¥¨!ƒ-ìŸüä'þÔ]w8&û 5–ÙŒmf²»3Lößþö7¼™¢’=b²ûÛ»c¢³Ž‡d¡‡­6‰N vçÖ&º~Я̽5Ô•[·Kü.ÜnA±Líg¡—X]¹›ñÐ^zQ¨ XL&Ú˜g“‰ÎÈBÄ5ÑvÚ5Í+3ŒtÚ@7dhßD/óT¯Û^G=^P—ÖbÕí'Ãd‹ä×GÉhË´^B¨nÄÜJ’Å6Ø¢Û7_ïé™`…î–ôèÃe,|Ö×Þ1z4m¢†½bbkÒY訮ÜÖxèpÚ¨Ì]•;ÚD»qg˜h]T,ûxètQ±–ô˜hÇD§»r»™è ‘N·Y³Ð™èL½4ÃDËrLJ©Î]é´ž.[¤:.­SíŽpR L¶èµ×^SÏ?ÿ¼6ÛRaRŒ·d¹_}õÕŠ•ü(PË’÷¬–%ZjY2þ­–õæ›o"G•ö¾ÔúsYëŸËµþw¹’¿3Iq)|+ß÷~õ«_©ð÷ÁXºÓÅÄìqÑQ&[LûÉ“'g¥S§NÅšl<ôòteî°‰Þb¦µ2c¡—‡ÆC{óA»YhÏD;Yè«"§¶ê އ¾2ûxèI/ =ie¡'sŒ‡5ã¡ç7úYhÓEŒ‡6ݹc2ÑvÚÍB×gd¡Ã&º3`¢y&ºN¥.]è¨NKÌtÛ% T«£6O­—|Dµ\2“ ÔdÉ—$ùUó¥—^RçÎS/¾øâœ‘Üo-ëûßÿ~Më{ßûÞœT­Ç_,½ð 5)þÿVGüsUµþww.}GÉ<Ø?ûÙÏôBqßOl¸.=:0Ú•ëM•gLvo[§c¤WÇC›cõ¶‰Hg —zFzI–ñЋ­ñЋÓ¹%+mÆ@»JYã¡Û32Ñm2Ð ¼ÊÜ šý®ÜéñÐÆD¯ ³2Ñór‡vt”îÖÚ5Ò"1ζL´gœÛ/óLôeÆD/t[Ç`mÔì©%Ôâ¤b˜žžVåO @ípóúCþ´V{¬Ö.*Öa²e|÷l¬ŠšéƘl3µU í,Û]¸Ã]¹íêÜþÔV‹­,´Ý•»®#0Z²ÐcVA±ôÔVéŠÜvQ±5V6Úç!kLt 2·ÉB_‘i¢ÓÙèúD]¹»Œ‰6ÙhéÊ}Y:­3Ñ 3²ÐÍ!µ®ùâyªébi竦KÜ6½~ß0Ù“ I¸ifüÔVR™{åH¤É.Ř잶Ž@&ÚèνثÊíe ×ZÙèI+m›èÀüЦ2·),&­•ŽœÚjeFz`~æXè´‰®ÏìÊ}yÚLw[FÚ5Ћ½ ô"<´éÆmŒ´›yŽÏB·ÄèK湦ùb×4›ecªýíþ~óûÉzžL6Âd@>¾n6ÒW›9¡­‚bfj«žTgyLvkG`Z+ÉDÛU¹'3ÆB‹Š¥3ÑVA1…v‹Š­™žÚ½ÒÏ@çšÚÊÎB÷^¾ÔÏ>Çh‘›ö ´´VQ±tÚëÊ}éB?•…n²^ÛYgß4_½®É_—6ÕM';ž'“0Ù€®Û­‹‰¹ÅÅÜ鬶šÂbŽ6;ݵ» 0Ùï¿ÿ¾zçwrJLvw[‡?µ•;&:¥MtÖªÜ 23Ñv!1{LtF&:ÇÔVý¾‰^i¢í®ÜîXè+½.ÝuÚ@‹yn¿4=ºÕïÆ½P/Çe ›†:"»™™Noo²²×MçxžL6Âd@þaí.w^èz·*·;µUzL´SP¬»½cÖ&û»ßýnò¢h©”W•»%0&Úí¾íuãžß1Ú•©Ê}<´k¢û¯¨Uå^›‰vͳ…® Œ‡6FÚTæNÒ•»9œ‰¶ÇB‡ºjGg¦ç…Ž)íñ<)˜l„É€$&{úšŒªÜëCEźŠ`²×¯_¯Ž?«ÎÎNm²Ã㡇b²ÐQ&ÚÎB»ã¡—eèžÈ®Üfnèº@Wî”e ƒc¡Dšèæˆ*Ýãž/šÞàøçLs[)Çó¤`²&pÌÉdO/îö«qO[S\É8èµÎxh1ÙQÓnåc²?ó™ÏÄîsöìY?“}wîœêïïw»‰/k´Lt?ÚTçÎèÊ}é‚ ÓÛ•;ªº¶Ý…ú’ùY ƒq<ÝÅ0Ù“ ÉMöÄæìS[9c¡;ÛŠk²øÃª·ØY}c°*·W™;ªûvFuîÀ´T¡i¨ìì­•µÎÌær|’ãyR0Ù“ 
¸aón5ÑЩÆë;<9†ÚÑè²v5Zß®Fœ¶+U¼1Ù?ýéOÕÐÐ^×ÕÔªú¯\¡úõ‡Ô·¨Á‘Õêuöz[Ö>þ±_ÌãyR0Ù“ øÇ[N¨c‡Ž¨ãŽŽt¯>gÜt1LöË/¿¬V¯^­_OŽO¨ëvuà°:zàn:h¿>ä¯;j-s|yŽçIÀd#L6$àÈ‘#9¿tuuéîÝb”óÕM7ݤMõîÝ»Õðð°^–s%ùN233ã·örÒíZ…Þ_%ÅÏ“€ÉF|ˆ@&Û¶îîn·@Y$ÅÎJeJKm’Ã*öýUrü<)˜l„É€=z4çwƒñññYibbBw3— ¶hdd¤¬¦±Ü&¹ÒLs1ãçIÀd#L6$ IwñR™ÆRw§®tÓ<—âçIÀd#>D ¡Éž+ÝŸ+ý|Õ?O &a² ¡Éž«…Á*}Ls5ÅÏ“€ÉF|ˆ`²&“0Ù˜l„0Ù˜l„ÉÀd#L6&a²“0Ù˜l„ÉÀd#L6&a²“0Ùeaûöí‰ÖÌÌŒßÚ˳5n…ž¯Ø÷Cü˜l€‚Mv©LZ¥™Òr›ÔjŠŸ' ¡É¦;4"“ P$“½nÝ:-1S¦µ—g»=—ò=¾Ðëå{>âO¯ãIHh²ãLW¾*Ô¤æ{¾rï_Ëñó¤$`Û¶m¾‘Z»v­ßFÉÞ'JáãÃûç»=×õs¯Ðû#þ´xRšìÙš¾\*¶‰äúîú<)yšl„âÄ“€­[·ª©©)_b¨Lk/Çm+×þÅÞV¾÷—ëxâŸÂd$eË–-jrrrNH Ÿiíå¹rÿs9~ž€*3Ùè‰'“0ÙåcóæÍ¾‘š˜˜ðÛ$²‰:>¼=¬\×»Ðç'þ´xRšì¤¦Õ®xRÊl²ÇÇÇýÖ^Nº½Ðó_hUsü<) Ø´i“oÚÆÆÆü6Jö>Q Þ?ßíž~.üõyR°aÃßHŽŽúm”ì}Š¡ðõŠ}þ|¯Güññó¤$4Ùq¦²ÔÊ×äV›æRü<)n²ÑÜO &a²0Ù“ 0çX¿~½o¤FFFüÖ^ŽÛV®ý =_X•~¿Õ?O @®ºê*5<<<+­Y³Æoíåbí_骥øyR033㛾|6‰aYêãK}>âO‹' @“½zõj¿µ—gkÔŠ}¾R_øÓÇð¤$`zz:Ö”…6i¹L\¾Û U¾ç/öþÕ?O @B“],“‡ªW<)˜l„É(ëÖ­‹5VCCC~%{ŸÙ(|þ\çËwÿb_¯–ãçIHh²W­Z¥%fÊ´Id¥ðù*}âŸ' k×®5e•®ÁÁA¿µ—çj<•?O @B“gÒr™¸ðö|UèùÊ}?µ?O @ž&¡8ñ¤ÌÂd ø­½\+f’ø£ãçIHÀÔÔTÀDæ2™áíÅR©ÏŸíºÄ?€É(¦ÉF(›xR099é©þþ~¿­Ù÷”äþJ½-ÇÏ“€‰‰ ßHõõõùm”ì}¢>>¼¾Ûs]?×ù ½?âO‹' ¡ÉŽ3ua²0Ù“ P~ÆÇÇUooo"‰Ù2m”ì}¢ö/ööBï/_Õrü<) óTOOßÚËqÆ+×þùnÏ¥|Ï_ìû­åøyRšì¤&Õ®xR*Ìdwwwû­½\¬ý+]s9~ž€ŒŽŽÆšºB•¯‰Ìu|.Ó™ïýz|-ÅÏ“ÐdËT¢êO @B“ÝÕÕ¥%fÊ´Id¥ðùÂûz½\ÇçÚ¿Ø÷SÍñó¤$4ÙZb°Lk/'ÝV¡ç+öýˆY4­½Lü¹ÏÇ“€áááXS†O @Ö¬Y£:::Ê*žL6Âd`²& ªLv*•ò[{9éö\Ê÷øB¯—¯ˆ?} ž€ Åš¸ööv¿µ—Íö ­ðýɽ›6I<ùî_Ëñó¤$4Ùq&2lÊòU¾çãú•{}ž€¬ZµJµµµi‰™2­½·=¬\û—z{.…‹øã¯Ç“§É«µµÕo£dï“Dáóå:>ßýK}|-ÇÏ“§ÉÎe*«MaÓIüññó¤$4Ùµb*ÑìÅ“€ÉF˜l€ò188è©––¿M"û˜¨ãÃÛÃÊu½ }~âO‹' ¡ÉNj*s©¹¹Ùoíå¤Û纪9~ž€ ø¦¯©©Éoíe³½ÔÊ÷úž_â?O @úûû}S•Ka6eùªÐóû~ˆ?^<)E6Ù¨vÅ“€¾¾>ßH566ú­½\-F1W|Ä¿' ¡É6¦ ¡8ñ¤`²& |ôöö&6Z+W®ô[{¹R÷o+×ù‰“ ===¾éZ±b…ßFÉÞ'JáãÃûç»=×õ Ý¿ÐóÕRü<) MvœIC“ P&“ÝÐÐà·ör±ö¯tÕRü<) èîîöM_¾ ›Ä°‰ kùòå~+’}M›äø|ïǾ†}íÙž¿–ãçIH@WWW¬ +lÒÂûº½På{þbï_Íñó¤$4ÙÅ2y¨zÅ“€ÉF˜l€òÑÙÙ©êëëç¤Äü™Ö^ž«ñTrü<)Un²QùÄ“€T*¥–.]ªµdÉ¿µ—Íö\Êu|¾çÏ÷|aûþk9~ž€´µµ©Å‹#”U<)˜l„ÉÀd#L6&a²0Ùa²0Ù“ €ÉF˜lL6Âd&a²0Ù“ €ÉF˜lL6B˜lL6Âd`²&“0Ù˜lL$Âd`²&“0ÙUÇÿtP°•·É÷IEND®B`‚perldoc-html/static/combined-20090809.js000644 000765 000024 00000236036 12276001417 017546 
0ustar00jjstaff000000 000000
//MooTools, <http://mootools.net>, My Object Oriented (JavaScript) Tools. Copyright (c) 2006-2009 Valerio Proietti, <http://mad4milk.net>, MIT Style License.
var MooTools={version:"1.2.3",build:"4980aa0fb74d2f6eb80bcd9f5b8e1fd6fbb8f607"};var Native=function(k){k=k||{};var a=k.name;var i=k.legacy;var b=k.protect; var c=k.implement;var h=k.generics;var f=k.initialize;var g=k.afterImplement||function(){};var d=f||i;h=h!==false;d.constructor=Native;d.$family={name:"native"}; if(i&&f){d.prototype=i.prototype;}d.prototype.constructor=d;if(a){var e=a.toLowerCase();d.prototype.$family={name:e};Native.typize(d,e);}var j=function(n,l,o,m){if(!b||m||!n.prototype[l]){n.prototype[l]=o; }if(h){Native.genericize(n,l,b);}g.call(n,l,o);return n;};d.alias=function(n,l,p){if(typeof n=="string"){var o=this.prototype[n];if((n=o)){return j(this,l,n,p); }}for(var m in n){this.alias(m,n[m],l);}return this;};d.implement=function(m,l,o){if(typeof m=="string"){return j(this,m,l,o);}for(var n in m){j(this,n,m[n],l); }return this;};if(c){d.implement(c);}return d;};Native.genericize=function(b,c,a){if((!a||!b[c])&&typeof b.prototype[c]=="function"){b[c]=function(){var d=Array.prototype.slice.call(arguments); return b.prototype[c].apply(d.shift(),d);};}};Native.implement=function(d,c){for(var b=0,a=d.length;b-1:this.indexOf(a)>-1;},trim:function(){return this.replace(/^\s+|\s+$/g,"");},clean:function(){return this.replace(/\s+/g," ").trim(); },camelCase:function(){return this.replace(/-\D/g,function(a){return a.charAt(1).toUpperCase();});},hyphenate:function(){return this.replace(/[A-Z]/g,function(a){return("-"+a.charAt(0).toLowerCase()); });},capitalize:function(){return this.replace(/\b[a-z]/g,function(a){return a.toUpperCase();});},escapeRegExp:function(){return this.replace(/([-.*+?^${}()|[\]\/\\])/g,"\\$1"); },toInt:function(a){return parseInt(this,a||10);},toFloat:function(){return parseFloat(this);},hexToRgb:function(b){var a=this.match(/^#?(\w{1,2})(\w{1,2})(\w{1,2})$/); 
return(a)?a.slice(1).hexToRgb(b):null;},rgbToHex:function(b){var a=this.match(/\d{1,3}/g);return(a)?a.rgbToHex(b):null;},stripScripts:function(b){var a=""; var c=this.replace(/<script[^>]*>([\s\S]*?)<\/script>/gi,function(){a+=arguments[1]+"\n";return"";});if(b===true){$exec(a);}else{if($type(b)=="function"){b(a,c); }}return c;},substitute:function(a,b){return this.replace(b||(/\\?\{([^{}]+)\}/g),function(d,c){if(d.charAt(0)=="\\"){return d.slice(1);}return(a[c]!=undefined)?a[c]:""; });}});Hash.implement({has:Object.prototype.hasOwnProperty,keyOf:function(b){for(var a in this){if(this.hasOwnProperty(a)&&this[a]===b){return a;}}return null; },hasValue:function(a){return(Hash.keyOf(this,a)!==null);},extend:function(a){Hash.each(a||{},function(c,b){Hash.set(this,b,c);},this);return this;},combine:function(a){Hash.each(a||{},function(c,b){Hash.include(this,b,c); },this);return this;},erase:function(a){if(this.hasOwnProperty(a)){delete this[a];}return this;},get:function(a){return(this.hasOwnProperty(a))?this[a]:null; },set:function(a,b){if(!this[a]||this.hasOwnProperty(a)){this[a]=b;}return this;},empty:function(){Hash.each(this,function(b,a){delete this[a];},this); return this;},include:function(a,b){if(this[a]==undefined){this[a]=b;}return this;},map:function(b,c){var a=new Hash;Hash.each(this,function(e,d){a.set(d,b.call(c,e,d,this)); },this);return a;},filter:function(b,c){var a=new Hash;Hash.each(this,function(e,d){if(b.call(c,e,d,this)){a.set(d,e);}},this);return a;},every:function(b,c){for(var a in this){if(this.hasOwnProperty(a)&&!b.call(c,this[a],a)){return false; }}return true;},some:function(b,c){for(var a in this){if(this.hasOwnProperty(a)&&b.call(c,this[a],a)){return true;}}return false;},getKeys:function(){var a=[]; Hash.each(this,function(c,b){a.push(b);});return a;},getValues:function(){var a=[];Hash.each(this,function(b){a.push(b);});return a;},toQueryString:function(a){var b=[]; 
d;switch($type(f)){case"object":d=Hash.toQueryString(f,e);break;case"array":var c={};f.each(function(h,g){c[g]=h; });d=Hash.toQueryString(c,e);break;default:d=e+"="+encodeURIComponent(f);}if(f!=undefined){b.push(d);}});return b.join("&");}});Hash.alias({keyOf:"indexOf",hasValue:"contains"}); var Event=new Native({name:"Event",initialize:function(a,f){f=f||window;var k=f.document;a=a||f.event;if(a.$extended){return a;}this.$extended=true;var j=a.type; var g=a.target||a.srcElement;while(g&&g.nodeType==3){g=g.parentNode;}if(j.test(/key/)){var b=a.which||a.keyCode;var m=Event.Keys.keyOf(b);if(j=="keydown"){var d=b-111; if(d>0&&d<13){m="f"+d;}}m=m||String.fromCharCode(b).toLowerCase();}else{if(j.match(/(click|mouse|menu)/i)){k=(!k.compatMode||k.compatMode=="CSS1Compat")?k.html:k.body; var i={x:a.pageX||a.clientX+k.scrollLeft,y:a.pageY||a.clientY+k.scrollTop};var c={x:(a.pageX)?a.pageX-f.pageXOffset:a.clientX,y:(a.pageY)?a.pageY-f.pageYOffset:a.clientY}; if(j.match(/DOMMouseScroll|mousewheel/)){var h=(a.wheelDelta)?a.wheelDelta/120:-(a.detail||0)/3;}var e=(a.which==3)||(a.button==2);var l=null;if(j.match(/over|out/)){switch(j){case"mouseover":l=a.relatedTarget||a.fromElement; break;case"mouseout":l=a.relatedTarget||a.toElement;}if(!(function(){while(l&&l.nodeType==3){l=l.parentNode;}return true;}).create({attempt:Browser.Engine.gecko})()){l=false; }}}}return $extend(this,{event:a,type:j,page:i,client:c,rightClick:e,wheel:h,relatedTarget:l,target:g,code:b,key:m,shift:a.shiftKey,control:a.ctrlKey,alt:a.altKey,meta:a.metaKey}); }});Event.Keys=new Hash({enter:13,up:38,down:40,left:37,right:39,esc:27,space:32,backspace:8,tab:9,"delete":46});Event.implement({stop:function(){return this.stopPropagation().preventDefault(); },stopPropagation:function(){if(this.event.stopPropagation){this.event.stopPropagation();}else{this.event.cancelBubble=true;}return this;},preventDefault:function(){if(this.event.preventDefault){this.event.preventDefault(); 
}else{this.event.returnValue=false;}return this;}});function Class(b){if(b instanceof Function){b={initialize:b};}var a=function(){Object.reset(this);if(a._prototyping){return this; }this._current=$empty;var c=(this.initialize)?this.initialize.apply(this,arguments):this;delete this._current;delete this.caller;return c;}.extend(this); a.implement(b);a.constructor=Class;a.prototype.constructor=a;return a;}Function.prototype.protect=function(){this._protected=true;return this;};Object.reset=function(a,c){if(c==null){for(var e in a){Object.reset(a,e); }return a;}delete a[c];switch($type(a[c])){case"object":var d=function(){};d.prototype=a[c];var b=new d;a[c]=Object.reset(b);break;case"array":a[c]=$unlink(a[c]); break;}return a;};new Native({name:"Class",initialize:Class}).extend({instantiate:function(b){b._prototyping=true;var a=new b;delete b._prototyping;return a; },wrap:function(a,b,c){if(c._origin){c=c._origin;}return function(){if(c._protected&&this._current==null){throw new Error('The method "'+b+'" cannot be called.'); }var e=this.caller,f=this._current;this.caller=f;this._current=arguments.callee;var d=c.apply(this,arguments);this._current=f;this.caller=e;return d;}.extend({_owner:a,_origin:c,_name:b}); }});Class.implement({implement:function(a,d){if($type(a)=="object"){for(var e in a){this.implement(e,a[e]);}return this;}var f=Class.Mutators[a];if(f){d=f.call(this,d); if(d==null){return this;}}var c=this.prototype;switch($type(d)){case"function":if(d._hidden){return this;}c[a]=Class.wrap(this,a,d);break;case"object":var b=c[a]; if($type(b)=="object"){$mixin(b,d);}else{c[a]=$unlink(d);}break;case"array":c[a]=$unlink(d);break;default:c[a]=d;}return this;}});Class.Mutators={Extends:function(a){this.parent=a; this.prototype=Class.instantiate(a);this.implement("parent",function(){var b=this.caller._name,c=this.caller._owner.parent.prototype[b];if(!c){throw new Error('The method "'+b+'" has no parent.'); }return 
c.apply(this,arguments);}.protect());},Implements:function(a){$splat(a).each(function(b){if(b instanceof Function){b=Class.instantiate(b);}this.implement(b); },this);}};var Chain=new Class({$chain:[],chain:function(){this.$chain.extend(Array.flatten(arguments));return this;},callChain:function(){return(this.$chain.length)?this.$chain.shift().apply(this,arguments):false; },clearChain:function(){this.$chain.empty();return this;}});var Events=new Class({$events:{},addEvent:function(c,b,a){c=Events.removeOn(c);if(b!=$empty){this.$events[c]=this.$events[c]||[]; this.$events[c].include(b);if(a){b.internal=true;}}return this;},addEvents:function(a){for(var b in a){this.addEvent(b,a[b]);}return this;},fireEvent:function(c,b,a){c=Events.removeOn(c); if(!this.$events||!this.$events[c]){return this;}this.$events[c].each(function(d){d.create({bind:this,delay:a,"arguments":b})();},this);return this;},removeEvent:function(b,a){b=Events.removeOn(b); if(!this.$events[b]){return this;}if(!a.internal){this.$events[b].erase(a);}return this;},removeEvents:function(c){var d;if($type(c)=="object"){for(d in c){this.removeEvent(d,c[d]); }return this;}if(c){c=Events.removeOn(c);}for(d in this.$events){if(c&&c!=d){continue;}var b=this.$events[d];for(var a=b.length;a--;a){this.removeEvent(d,b[a]); }}return this;}});Events.removeOn=function(a){return a.replace(/^on([A-Z])/,function(b,c){return c.toLowerCase();});};var Options=new Class({setOptions:function(){this.options=$merge.run([this.options].extend(arguments)); if(!this.addEvent){return this;}for(var a in this.options){if($type(this.options[a])!="function"||!(/^on[A-Z]/).test(a)){continue;}this.addEvent(a,this.options[a]); delete this.options[a];}return this;}});var Element=new Native({name:"Element",legacy:window.Element,initialize:function(a,b){var c=Element.Constructors.get(a); if(c){return c(b);}if(typeof a=="string"){return document.newElement(a,b);}return document.id(a).set(b);},afterImplement:function(a,b){Element.Prototype[a]=b; 
if(Array[a]){return;}Elements.implement(a,function(){var c=[],g=true;for(var e=0,d=this.length;e";}return document.id(this.createElement(a)).set(b);},newTextNode:function(a){return this.createTextNode(a); },getDocument:function(){return this;},getWindow:function(){return this.window;},id:(function(){var a={string:function(d,c,b){d=b.getElementById(d);return(d)?a.element(d,c):null; },element:function(b,e){$uid(b);if(!e&&!b.$family&&!(/^object|embed$/i).test(b.tagName)){var c=Element.Prototype;for(var d in c){b[d]=c[d];}}return b;},object:function(c,d,b){if(c.toElement){return a.element(c.toElement(b),d); }return null;}};a.textnode=a.whitespace=a.window=a.document=$arguments(0);return function(c,e,d){if(c&&c.$family&&c.uid){return c;}var b=$type(c);return(a[b])?a[b](c,e,d||document):null; };})()});if(window.$==null){Window.implement({$:function(a,b){return document.id(a,b,this.document);}});}Window.implement({$$:function(a){if(arguments.length==1&&typeof a=="string"){return this.document.getElements(a); }var f=[];var c=Array.flatten(arguments);for(var d=0,b=c.length;d1);a.each(function(e){var f=this.getElementsByTagName(e.trim());(b)?c.extend(f):c=f; },this);return new Elements(c,{ddup:b,cash:!d});}});(function(){var h={},f={};var i={input:"checked",option:"selected",textarea:(Browser.Engine.webkit&&Browser.Engine.version<420)?"innerHTML":"value"}; var c=function(l){return(f[l]||(f[l]={}));};var g=function(n,l){if(!n){return;}var m=n.uid;if(Browser.Engine.trident){if(n.clearAttributes){var q=l&&n.cloneNode(false); n.clearAttributes();if(q){n.mergeAttributes(q);}}else{if(n.removeEvents){n.removeEvents();}}if((/object/i).test(n.tagName)){for(var o in n){if(typeof n[o]=="function"){n[o]=$empty; }}Element.dispose(n);}}if(!m){return;}h[m]=f[m]=null;};var d=function(){Hash.each(h,g);if(Browser.Engine.trident){$A(document.getElementsByTagName("object")).each(g); }if(window.CollectGarbage){CollectGarbage();}h=f=null;};var j=function(n,l,s,m,p,r){var o=n[s||l];var 
q=[];while(o){if(o.nodeType==1&&(!m||Element.match(o,m))){if(!p){return document.id(o,r); }q.push(o);}o=o[l];}return(p)?new Elements(q,{ddup:false,cash:!r}):null;};var e={html:"innerHTML","class":"className","for":"htmlFor",defaultValue:"defaultValue",text:(Browser.Engine.trident||(Browser.Engine.webkit&&Browser.Engine.version<420))?"innerText":"textContent"}; var b=["compact","nowrap","ismap","declare","noshade","checked","disabled","readonly","multiple","selected","noresize","defer"];var k=["value","type","defaultValue","accessKey","cellPadding","cellSpacing","colSpan","frameBorder","maxLength","readOnly","rowSpan","tabIndex","useMap"]; b=b.associate(b);Hash.extend(e,b);Hash.extend(e,k.associate(k.map(String.toLowerCase)));var a={before:function(m,l){if(l.parentNode){l.parentNode.insertBefore(m,l); }},after:function(m,l){if(!l.parentNode){return;}var n=l.nextSibling;(n)?l.parentNode.insertBefore(m,n):l.parentNode.appendChild(m);},bottom:function(m,l){l.appendChild(m); },top:function(m,l){var n=l.firstChild;(n)?l.insertBefore(m,n):l.appendChild(m);}};a.inside=a.bottom;Hash.each(a,function(l,m){m=m.capitalize();Element.implement("inject"+m,function(n){l(this,document.id(n,true)); return this;});Element.implement("grab"+m,function(n){l(document.id(n,true),this);return this;});});Element.implement({set:function(o,m){switch($type(o)){case"object":for(var n in o){this.set(n,o[n]); }break;case"string":var l=Element.Properties.get(o);(l&&l.set)?l.set.apply(this,Array.slice(arguments,1)):this.setProperty(o,m);}return this;},get:function(m){var l=Element.Properties.get(m); return(l&&l.get)?l.get.apply(this,Array.slice(arguments,1)):this.getProperty(m);},erase:function(m){var l=Element.Properties.get(m);(l&&l.erase)?l.erase.apply(this):this.removeProperty(m); return this;},setProperty:function(m,n){var l=e[m];if(n==undefined){return this.removeProperty(m);}if(l&&b[m]){n=!!n;}(l)?this[l]=n:this.setAttribute(m,""+n); return this;},setProperties:function(l){for(var m in 
l){this.setProperty(m,l[m]);}return this;},getProperty:function(m){var l=e[m];var n=(l)?this[l]:this.getAttribute(m,2); return(b[m])?!!n:(l)?n:n||null;},getProperties:function(){var l=$A(arguments);return l.map(this.getProperty,this).associate(l);},removeProperty:function(m){var l=e[m]; (l)?this[l]=(l&&b[m])?false:"":this.removeAttribute(m);return this;},removeProperties:function(){Array.each(arguments,this.removeProperty,this);return this; },hasClass:function(l){return this.className.contains(l," ");},addClass:function(l){if(!this.hasClass(l)){this.className=(this.className+" "+l).clean(); }return this;},removeClass:function(l){this.className=this.className.replace(new RegExp("(^|\\s)"+l+"(?:\\s|$)"),"$1");return this;},toggleClass:function(l){return this.hasClass(l)?this.removeClass(l):this.addClass(l); },adopt:function(){Array.flatten(arguments).each(function(l){l=document.id(l,true);if(l){this.appendChild(l);}},this);return this;},appendText:function(m,l){return this.grab(this.getDocument().newTextNode(m),l); },grab:function(m,l){a[l||"bottom"](document.id(m,true),this);return this;},inject:function(m,l){a[l||"bottom"](this,document.id(m,true));return this;},replaces:function(l){l=document.id(l,true); l.parentNode.replaceChild(this,l);return this;},wraps:function(m,l){m=document.id(m,true);return this.replaces(m).grab(m,l);},getPrevious:function(l,m){return j(this,"previousSibling",null,l,false,m); },getAllPrevious:function(l,m){return j(this,"previousSibling",null,l,true,m);},getNext:function(l,m){return j(this,"nextSibling",null,l,false,m);},getAllNext:function(l,m){return j(this,"nextSibling",null,l,true,m); },getFirst:function(l,m){return j(this,"nextSibling","firstChild",l,false,m);},getLast:function(l,m){return j(this,"previousSibling","lastChild",l,false,m); },getParent:function(l,m){return j(this,"parentNode",null,l,false,m);},getParents:function(l,m){return j(this,"parentNode",null,l,true,m);},getSiblings:function(l,m){return 
this.getParent().getChildren(l,m).erase(this); },getChildren:function(l,m){return j(this,"nextSibling","firstChild",l,true,m);},getWindow:function(){return this.ownerDocument.window;},getDocument:function(){return this.ownerDocument; },getElementById:function(o,n){var m=this.ownerDocument.getElementById(o);if(!m){return null;}for(var l=m.parentNode;l!=this;l=l.parentNode){if(!l){return null; }}return document.id(m,n);},getSelected:function(){return new Elements($A(this.options).filter(function(l){return l.selected;}));},getComputedStyle:function(m){if(this.currentStyle){return this.currentStyle[m.camelCase()]; }var l=this.getDocument().defaultView.getComputedStyle(this,null);return(l)?l.getPropertyValue([m.hyphenate()]):null;},toQueryString:function(){var l=[]; this.getElements("input, select, textarea",true).each(function(m){if(!m.name||m.disabled||m.type=="submit"||m.type=="reset"||m.type=="file"){return;}var n=(m.tagName.toLowerCase()=="select")?Element.getSelected(m).map(function(o){return o.value; }):((m.type=="radio"||m.type=="checkbox")&&!m.checked)?null:m.value;$splat(n).each(function(o){if(typeof o!="undefined"){l.push(m.name+"="+encodeURIComponent(o)); }});});return l.join("&");},clone:function(o,l){o=o!==false;var r=this.cloneNode(o);var n=function(v,u){if(!l){v.removeAttribute("id");}if(Browser.Engine.trident){v.clearAttributes(); v.mergeAttributes(u);v.removeAttribute("uid");if(v.options){var w=v.options,s=u.options;for(var t=w.length;t--;){w[t].selected=s[t].selected;}}}var x=i[u.tagName.toLowerCase()]; if(x&&u[x]){v[x]=u[x];}};if(o){var p=r.getElementsByTagName("*"),q=this.getElementsByTagName("*");for(var m=p.length;m--;){n(p[m],q[m]);}}n(r,this);return document.id(r); },destroy:function(){Element.empty(this);Element.dispose(this);g(this,true);return null;},empty:function(){$A(this.childNodes).each(function(l){Element.destroy(l); });return 
this;},dispose:function(){return(this.parentNode)?this.parentNode.removeChild(this):this;},hasChild:function(l){l=document.id(l,true);if(!l){return false; }if(Browser.Engine.webkit&&Browser.Engine.version<420){return $A(this.getElementsByTagName(l.tagName)).contains(l);}return(this.contains)?(this!=l&&this.contains(l)):!!(this.compareDocumentPosition(l)&16); },match:function(l){return(!l||(l==this)||(Element.get(this,"tag")==l));}});Native.implement([Element,Window,Document],{addListener:function(o,n){if(o=="unload"){var l=n,m=this; n=function(){m.removeListener("unload",n);l();};}else{h[this.uid]=this;}if(this.addEventListener){this.addEventListener(o,n,false);}else{this.attachEvent("on"+o,n); }return this;},removeListener:function(m,l){if(this.removeEventListener){this.removeEventListener(m,l,false);}else{this.detachEvent("on"+m,l);}return this; },retrieve:function(m,l){var o=c(this.uid),n=o[m];if(l!=undefined&&n==undefined){n=o[m]=l;}return $pick(n);},store:function(m,l){var n=c(this.uid);n[m]=l; return this;},eliminate:function(l){var m=c(this.uid);delete m[l];return this;}});window.addListener("unload",d);})();Element.Properties=new Hash;Element.Properties.style={set:function(a){this.style.cssText=a; },get:function(){return this.style.cssText;},erase:function(){this.style.cssText="";}};Element.Properties.tag={get:function(){return this.tagName.toLowerCase(); }};Element.Properties.html=(function(){var c=document.createElement("div");var a={table:[1,"<table>","</table>"],select:[1,"<select>","</select>"],tbody:[2,"<table><tbody>","</tbody></table>"],tr:[3,"<table><tbody><tr>","</tr></tbody></table>"]}; a.thead=a.tfoot=a.tbody;var b={set:function(){var e=Array.flatten(arguments).join("");var f=Browser.Engine.trident&&a[this.get("tag")];if(f){var g=c;g.innerHTML=f[1]+e+f[2]; for(var d=f[0];d--;){g=g.firstChild;}this.empty().adopt(g.childNodes);}else{this.innerHTML=e;}}};b.erase=b.set;return b;})();if(Browser.Engine.webkit&&Browser.Engine.version<420){Element.Properties.text={get:function(){if(this.innerText){return this.innerText; }var a=this.ownerDocument.newElement("div",{html:this.innerHTML}).inject(this.ownerDocument.body);var b=a.innerText;a.destroy();return b;}};}Element.Properties.events={set:function(a){this.addEvents(a); }};Native.implement([Element,Window,Document],{addEvent:function(e,g){var h=this.retrieve("events",{});h[e]=h[e]||{keys:[],values:[]};if(h[e].keys.contains(g)){return this; }h[e].keys.push(g);var f=e,a=Element.Events.get(e),c=g,i=this;if(a){if(a.onAdd){a.onAdd.call(this,g);}if(a.condition){c=function(j){if(a.condition.call(this,j)){return g.call(this,j); }return true;};}f=a.base||f;}var d=function(){return g.call(i);};var b=Element.NativeEvents[f];if(b){if(b==2){d=function(j){j=new Event(j,i.getWindow()); if(c.call(i,j)===false){j.stop();}};}this.addListener(f,d);}h[e].values.push(d);return this;},removeEvent:function(c,b){var a=this.retrieve("events");if(!a||!a[c]){return this; }var f=a[c].keys.indexOf(b);if(f==-1){return this;}a[c].keys.splice(f,1);var e=a[c].values.splice(f,1)[0];var d=Element.Events.get(c);if(d){if(d.onRemove){d.onRemove.call(this,b); }c=d.base||c;}return(Element.NativeEvents[c])?this.removeListener(c,e):this;},addEvents:function(a){for(var b in a){this.addEvent(b,a[b]);}return this; },removeEvents:function(a){var c;if($type(a)=="object"){for(c in a){this.removeEvent(c,a[c]);}return this;}var b=this.retrieve("events");if(!b){return this; }if(!a){for(c in b){this.removeEvents(c);}this.eliminate("events");}else{if(b[a]){while(b[a].keys[0]){this.removeEvent(a,b[a].keys[0]);}b[a]=null;}}return this; 
},fireEvent:function(d,b,a){var c=this.retrieve("events");if(!c||!c[d]){return this;}c[d].keys.each(function(e){e.create({bind:this,delay:a,"arguments":b})(); },this);return this;},cloneEvents:function(d,a){d=document.id(d);var c=d.retrieve("events");if(!c){return this;}if(!a){for(var b in c){this.cloneEvents(d,b); }}else{if(c[a]){c[a].keys.each(function(e){this.addEvent(a,e);},this);}}return this;}});Element.NativeEvents={click:2,dblclick:2,mouseup:2,mousedown:2,contextmenu:2,mousewheel:2,DOMMouseScroll:2,mouseover:2,mouseout:2,mousemove:2,selectstart:2,selectend:2,keydown:2,keypress:2,keyup:2,focus:2,blur:2,change:2,reset:2,select:2,submit:2,load:1,unload:1,beforeunload:2,resize:1,move:1,DOMContentLoaded:1,readystatechange:1,error:1,abort:1,scroll:1}; (function(){var a=function(b){var c=b.relatedTarget;if(c==undefined){return true;}if(c===false){return false;}return($type(this)!="document"&&c!=this&&c.prefix!="xul"&&!this.hasChild(c)); };Element.Events=new Hash({mouseenter:{base:"mouseover",condition:a},mouseleave:{base:"mouseout",condition:a},mousewheel:{base:(Browser.Engine.gecko)?"DOMMouseScroll":"mousewheel"}}); })();Element.Properties.styles={set:function(a){this.setStyles(a);}};Element.Properties.opacity={set:function(a,b){if(!b){if(a==0){if(this.style.visibility!="hidden"){this.style.visibility="hidden"; }}else{if(this.style.visibility!="visible"){this.style.visibility="visible";}}}if(!this.currentStyle||!this.currentStyle.hasLayout){this.style.zoom=1;}if(Browser.Engine.trident){this.style.filter=(a==1)?"":"alpha(opacity="+a*100+")"; }this.style.opacity=a;this.store("opacity",a);},get:function(){return this.retrieve("opacity",1);}};Element.implement({setOpacity:function(a){return this.set("opacity",a,true); },getOpacity:function(){return this.get("opacity");},setStyle:function(b,a){switch(b){case"opacity":return this.set("opacity",parseFloat(a));case"float":b=(Browser.Engine.trident)?"styleFloat":"cssFloat"; }b=b.camelCase();if($type(a)!="string"){var 
c=(Element.Styles.get(b)||"@").split(" ");a=$splat(a).map(function(e,d){if(!c[d]){return"";}return($type(e)=="number")?c[d].replace("@",Math.round(e)):e; }).join(" ");}else{if(a==String(Number(a))){a=Math.round(a);}}this.style[b]=a;return this;},getStyle:function(g){switch(g){case"opacity":return this.get("opacity"); case"float":g=(Browser.Engine.trident)?"styleFloat":"cssFloat";}g=g.camelCase();var a=this.style[g];if(!$chk(a)){a=[];for(var f in Element.ShortStyles){if(g!=f){continue; }for(var e in Element.ShortStyles[f]){a.push(this.getStyle(e));}return a.join(" ");}a=this.getComputedStyle(g);}if(a){a=String(a);var c=a.match(/rgba?\([\d\s,]+\)/); if(c){a=a.replace(c[0],c[0].rgbToHex());}}if(Browser.Engine.presto||(Browser.Engine.trident&&!$chk(parseInt(a,10)))){if(g.test(/^(height|width)$/)){var b=(g=="width")?["left","right"]:["top","bottom"],d=0; b.each(function(h){d+=this.getStyle("border-"+h+"-width").toInt()+this.getStyle("padding-"+h).toInt();},this);return this["offset"+g.capitalize()]-d+"px"; }if((Browser.Engine.presto)&&String(a).test("px")){return a;}if(g.test(/(border(.+)Width|margin|padding)/)){return"0px";}}return a;},setStyles:function(b){for(var a in b){this.setStyle(a,b[a]); }return this;},getStyles:function(){var a={};Array.flatten(arguments).each(function(b){a[b]=this.getStyle(b);},this);return a;}});Element.Styles=new Hash({left:"@px",top:"@px",bottom:"@px",right:"@px",width:"@px",height:"@px",maxWidth:"@px",maxHeight:"@px",minWidth:"@px",minHeight:"@px",backgroundColor:"rgb(@, @, @)",backgroundPosition:"@px @px",color:"rgb(@, @, @)",fontSize:"@px",letterSpacing:"@px",lineHeight:"@px",clip:"rect(@px @px @px @px)",margin:"@px @px @px @px",padding:"@px @px @px @px",border:"@px @ rgb(@, @, @) @px @ rgb(@, @, @) @px @ rgb(@, @, @)",borderWidth:"@px @px @px @px",borderStyle:"@ @ @ @",borderColor:"rgb(@, @, @) rgb(@, @, @) rgb(@, @, @) rgb(@, @, @)",zIndex:"@",zoom:"@",fontWeight:"@",textIndent:"@px",opacity:"@"}); 
Element.ShortStyles={margin:{},padding:{},border:{},borderWidth:{},borderStyle:{},borderColor:{}};["Top","Right","Bottom","Left"].each(function(g){var f=Element.ShortStyles; var b=Element.Styles;["margin","padding"].each(function(h){var i=h+g;f[h][i]=b[i]="@px";});var e="border"+g;f.border[e]=b[e]="@px @ rgb(@, @, @)";var d=e+"Width",a=e+"Style",c=e+"Color"; f[e]={};f.borderWidth[d]=f[e][d]=b[d]="@px";f.borderStyle[a]=f[e][a]=b[a]="@";f.borderColor[c]=f[e][c]=b[c]="rgb(@, @, @)";});(function(){Element.implement({scrollTo:function(h,i){if(b(this)){this.getWindow().scrollTo(h,i); }else{this.scrollLeft=h;this.scrollTop=i;}return this;},getSize:function(){if(b(this)){return this.getWindow().getSize();}return{x:this.offsetWidth,y:this.offsetHeight}; },getScrollSize:function(){if(b(this)){return this.getWindow().getScrollSize();}return{x:this.scrollWidth,y:this.scrollHeight};},getScroll:function(){if(b(this)){return this.getWindow().getScroll(); }return{x:this.scrollLeft,y:this.scrollTop};},getScrolls:function(){var i=this,h={x:0,y:0};while(i&&!b(i)){h.x+=i.scrollLeft;h.y+=i.scrollTop;i=i.parentNode; }return h;},getOffsetParent:function(){var h=this;if(b(h)){return null;}if(!Browser.Engine.trident){return h.offsetParent;}while((h=h.parentNode)&&!b(h)){if(d(h,"position")!="static"){return h; }}return null;},getOffsets:function(){if(this.getBoundingClientRect){var m=this.getBoundingClientRect(),k=document.id(this.getDocument().documentElement),i=k.getScroll(),n=(d(this,"position")=="fixed"); return{x:parseInt(m.left,10)+((n)?0:i.x)-k.clientLeft,y:parseInt(m.top,10)+((n)?0:i.y)-k.clientTop};}var j=this,h={x:0,y:0};if(b(this)){return h;}while(j&&!b(j)){h.x+=j.offsetLeft; h.y+=j.offsetTop;if(Browser.Engine.gecko){if(!f(j)){h.x+=c(j);h.y+=g(j);}var l=j.parentNode;if(l&&d(l,"overflow")!="visible"){h.x+=c(l);h.y+=g(l);}}else{if(j!=this&&Browser.Engine.webkit){h.x+=c(j); h.y+=g(j);}}j=j.offsetParent;}if(Browser.Engine.gecko&&!f(this)){h.x-=c(this);h.y-=g(this);}return 
h;},getPosition:function(k){if(b(this)){return{x:0,y:0}; }var l=this.getOffsets(),i=this.getScrolls();var h={x:l.x-i.x,y:l.y-i.y};var j=(k&&(k=document.id(k)))?k.getPosition():{x:0,y:0};return{x:h.x-j.x,y:h.y-j.y}; },getCoordinates:function(j){if(b(this)){return this.getWindow().getCoordinates();}var h=this.getPosition(j),i=this.getSize();var k={left:h.x,top:h.y,width:i.x,height:i.y}; k.right=k.left+k.width;k.bottom=k.top+k.height;return k;},computePosition:function(h){return{left:h.x-e(this,"margin-left"),top:h.y-e(this,"margin-top")}; },setPosition:function(h){return this.setStyles(this.computePosition(h));}});Native.implement([Document,Window],{getSize:function(){if(Browser.Engine.presto||Browser.Engine.webkit){var i=this.getWindow(); return{x:i.innerWidth,y:i.innerHeight};}var h=a(this);return{x:h.clientWidth,y:h.clientHeight};},getScroll:function(){var i=this.getWindow(),h=a(this); return{x:i.pageXOffset||h.scrollLeft,y:i.pageYOffset||h.scrollTop};},getScrollSize:function(){var i=a(this),h=this.getSize();return{x:Math.max(i.scrollWidth,h.x),y:Math.max(i.scrollHeight,h.y)}; },getPosition:function(){return{x:0,y:0};},getCoordinates:function(){var h=this.getSize();return{top:0,left:0,bottom:h.y,right:h.x,height:h.y,width:h.x}; }});var d=Element.getComputedStyle;function e(h,i){return d(h,i).toInt()||0;}function f(h){return d(h,"-moz-box-sizing")=="border-box";}function g(h){return e(h,"border-top-width"); }function c(h){return e(h,"border-left-width");}function b(h){return(/^(?:body|html)$/i).test(h.tagName);}function a(h){var i=h.getDocument();return(!i.compatMode||i.compatMode=="CSS1Compat")?i.html:i.body; }})();Element.alias("setPosition","position");Native.implement([Window,Document,Element],{getHeight:function(){return this.getSize().y;},getWidth:function(){return this.getSize().x; },getScrollTop:function(){return this.getScroll().y;},getScrollLeft:function(){return this.getScroll().x;},getScrollHeight:function(){return this.getScrollSize().y; 
},getScrollWidth:function(){return this.getScrollSize().x;},getTop:function(){return this.getPosition().y;},getLeft:function(){return this.getPosition().x; }});Element.Events.domready={onAdd:function(a){if(Browser.loaded){a.call(this);}}};(function(){var b=function(){if(Browser.loaded){return;}Browser.loaded=true; window.fireEvent("domready");document.fireEvent("domready");};if(Browser.Engine.trident){var a=document.createElement("div");(function(){($try(function(){a.doScroll(); return document.id(a).inject(document.body).set("html","temp").dispose();}))?b():arguments.callee.delay(50);})();}else{if(Browser.Engine.webkit&&Browser.Engine.version<525){(function(){(["loaded","complete"].contains(document.readyState))?b():arguments.callee.delay(50); })();}else{window.addEvent("load",b);document.addEvent("DOMContentLoaded",b);}}})();var JSON=new Hash({$specialChars:{"\b":"\\b","\t":"\\t","\n":"\\n","\f":"\\f","\r":"\\r",'"':'\\"',"\\":"\\\\"},$replaceChars:function(a){return JSON.$specialChars[a]||"\\u00"+Math.floor(a.charCodeAt()/16).toString(16)+(a.charCodeAt()%16).toString(16); },encode:function(b){switch($type(b)){case"string":return'"'+b.replace(/[\x00-\x1f\\"]/g,JSON.$replaceChars)+'"';case"array":return"["+String(b.map(JSON.encode).clean())+"]"; case"object":case"hash":var a=[];Hash.each(b,function(e,d){var c=JSON.encode(e);if(c){a.push(JSON.encode(d)+":"+c);}});return"{"+a+"}";case"number":case"boolean":return String(b); case false:return"null";}return null;},decode:function(string,secure){if($type(string)!="string"||!string.length){return null;}if(secure&&!(/^[,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]*$/).test(string.replace(/\\./g,"@").replace(/"[^"\\\n\r]*"/g,""))){return null; }return eval("("+string+")");}});Native.implement([Hash,Array,String,Number],{toJSON:function(){return JSON.encode(this);}});var Cookie=new Class({Implements:Options,options:{path:false,domain:false,duration:false,secure:false,document:document},initialize:function(b,a){this.key=b; 
this.setOptions(a);},write:function(b){b=encodeURIComponent(b);if(this.options.domain){b+="; domain="+this.options.domain;}if(this.options.path){b+="; path="+this.options.path; }if(this.options.duration){var a=new Date();a.setTime(a.getTime()+this.options.duration*24*60*60*1000);b+="; expires="+a.toGMTString();}if(this.options.secure){b+="; secure"; }this.options.document.cookie=this.key+"="+b;return this;},read:function(){var a=this.options.document.cookie.match("(?:^|;)\\s*"+this.key.escapeRegExp()+"=([^;]*)"); return(a)?decodeURIComponent(a[1]):null;},dispose:function(){new Cookie(this.key,$merge(this.options,{duration:-1})).write("");return this;}});Cookie.write=function(b,c,a){return new Cookie(b,a).write(c); };Cookie.read=function(a){return new Cookie(a).read();};Cookie.dispose=function(b,a){return new Cookie(b,a).dispose();};var Request=new Class({Implements:[Chain,Events,Options],options:{url:"",data:"",headers:{"X-Requested-With":"XMLHttpRequest",Accept:"text/javascript, text/html, application/xml, text/xml, */*"},async:true,format:false,method:"post",link:"ignore",isSuccess:null,emulation:true,urlEncoded:true,encoding:"utf-8",evalScripts:false,evalResponse:false,noCache:false},initialize:function(a){this.xhr=new Browser.Request(); this.setOptions(a);this.options.isSuccess=this.options.isSuccess||this.isSuccess;this.headers=new Hash(this.options.headers);},onStateChange:function(){if(this.xhr.readyState!=4||!this.running){return; }this.running=false;this.status=0;$try(function(){this.status=this.xhr.status;}.bind(this));this.xhr.onreadystatechange=$empty;if(this.options.isSuccess.call(this,this.status)){this.response={text:this.xhr.responseText,xml:this.xhr.responseXML}; this.success(this.response.text,this.response.xml);}else{this.response={text:null,xml:null};this.failure();}},isSuccess:function(){return((this.status>=200)&&(this.status<300)); 
},processScripts:function(a){if(this.options.evalResponse||(/(ecma|java)script/).test(this.getHeader("Content-type"))){return $exec(a);}return a.stripScripts(this.options.evalScripts); },success:function(b,a){this.onSuccess(this.processScripts(b),a);},onSuccess:function(){this.fireEvent("complete",arguments).fireEvent("success",arguments).callChain(); },failure:function(){this.onFailure();},onFailure:function(){this.fireEvent("complete").fireEvent("failure",this.xhr);},setHeader:function(a,b){this.headers.set(a,b); return this;},getHeader:function(a){return $try(function(){return this.xhr.getResponseHeader(a);}.bind(this));},check:function(){if(!this.running){return true; }switch(this.options.link){case"cancel":this.cancel();return true;case"chain":this.chain(this.caller.bind(this,arguments));return false;}return false;},send:function(k){if(!this.check(k)){return this; }this.running=true;var i=$type(k);if(i=="string"||i=="element"){k={data:k};}var d=this.options;k=$extend({data:d.data,url:d.url,method:d.method},k);var g=k.data,b=k.url,a=k.method.toLowerCase(); switch($type(g)){case"element":g=document.id(g).toQueryString();break;case"object":case"hash":g=Hash.toQueryString(g);}if(this.options.format){var j="format="+this.options.format; g=(g)?j+"&"+g:j;}if(this.options.emulation&&!["get","post"].contains(a)){var h="_method="+a;g=(g)?h+"&"+g:h;a="post";}if(this.options.urlEncoded&&a=="post"){var c=(this.options.encoding)?"; charset="+this.options.encoding:""; this.headers.set("Content-type","application/x-www-form-urlencoded"+c);}if(this.options.noCache){var f="noCache="+new Date().getTime();g=(g)?f+"&"+g:f; }var e=b.lastIndexOf("/");if(e>-1&&(e=b.indexOf("#"))>-1){b=b.substr(0,e);}if(g&&a=="get"){b=b+(b.contains("?")?"&":"?")+g;g=null;}this.xhr.open(a.toUpperCase(),b,this.options.async); this.xhr.onreadystatechange=this.onStateChange.bind(this);this.headers.each(function(m,l){try{this.xhr.setRequestHeader(l,m);}catch(n){this.fireEvent("exception",[l,m]); 
}},this);this.fireEvent("request");this.xhr.send(g);if(!this.options.async){this.onStateChange();}return this;},cancel:function(){if(!this.running){return this; }this.running=false;this.xhr.abort();this.xhr.onreadystatechange=$empty;this.xhr=new Browser.Request();this.fireEvent("cancel");return this;}});(function(){var a={}; ["get","post","put","delete","GET","POST","PUT","DELETE"].each(function(b){a[b]=function(){var c=Array.link(arguments,{url:String.type,data:$defined}); return this.send($extend(c,{method:b}));};});Request.implement(a);})();Element.Properties.send={set:function(a){var b=this.retrieve("send");if(b){b.cancel(); }return this.eliminate("send").store("send:options",$extend({data:this,link:"cancel",method:this.get("method")||"post",url:this.get("action")},a));},get:function(a){if(a||!this.retrieve("send")){if(a||!this.retrieve("send:options")){this.set("send",a); }this.store("send",new Request(this.retrieve("send:options")));}return this.retrieve("send");}};Element.implement({send:function(a){var b=this.get("send"); b.send({data:this,url:a||b.options.url});return this;}});Request.JSON=new Class({Extends:Request,options:{secure:true},initialize:function(a){this.parent(a); this.headers.extend({Accept:"application/json","X-Request":"JSON"});},success:function(a){this.response.json=JSON.decode(a,this.options.secure);this.onSuccess(this.response.json,a); }});//MooTools More, . Copyright (c) 2006-2009 Aaron Newton , Valerio Proietti & the MooTools team , MIT Style License. 
MooTools.More={version:"1.2.3.1"};Class.refactor=function(b,a){$each(a,function(e,d){var c=b.prototype[d];if(c&&(c=c._origin)&&typeof e=="function"){b.implement(d,function(){var f=this.previous; this.previous=c;var g=e.apply(this,arguments);this.previous=f;return g;});}else{b.implement(d,e);}});return b;};Class.Mutators.Binds=function(a){return a; };Class.Mutators.initialize=function(a){return function(){$splat(this.Binds).each(function(b){var c=this[b];if(c){this[b]=c.bind(this);}},this);return a.apply(this,arguments); };};Class.Occlude=new Class({occlude:function(c,b){b=document.id(b||this.element);var a=b.retrieve(c||this.property);if(a&&!$defined(this.occluded)){this.occluded=a; }else{this.occluded=false;b.store(c||this.property,this);}return this.occluded;}});String.implement({parseQueryString:function(){var b=this.split(/[&;]/),a={}; if(b.length){b.each(function(g){var c=g.indexOf("="),d=c<0?[""]:g.substr(0,c).match(/[^\]\[]+/g),e=decodeURIComponent(g.substr(c+1)),f=a;d.each(function(j,h){var k=f[j]; if(h=0||h||r.allowNegative)?n.x:0).toInt(),top:((n.y>=0||h||r.allowNegative)?n.y:0).toInt()}; if(p.getStyle("position")=="fixed"||r.relFixedPosition){var f=window.getScroll();n.top=n.top.toInt()+f.y;n.left=n.left.toInt()+f.x;}if(r.returnPos){return n; }else{this.setStyles(n);}return this;}});})();Element.implement({isDisplayed:function(){return this.getStyle("display")!="none";},toggle:function(){return this[this.isDisplayed()?"hide":"show"](); },hide:function(){var b;try{if("none"!=this.getStyle("display")){b=this.getStyle("display");}}catch(a){}return this.store("originalDisplay",b||"block").setStyle("display","none"); },show:function(a){return this.setStyle("display",a||this.retrieve("originalDisplay")||"block");},swapClass:function(a,b){return this.removeClass(a).addClass(b); }});var OverText=new 
Class({Implements:[Options,Events,Class.Occlude],Binds:["reposition","assert","focus"],options:{element:"label",positionOptions:{position:"upperLeft",edge:"upperLeft",offset:{x:4,y:2}},poll:false,pollInterval:250},property:"OverText",initialize:function(b,a){this.element=document.id(b); if(this.occlude()){return this.occluded;}this.setOptions(a);this.attach(this.element);OverText.instances.push(this);if(this.options.poll){this.poll();}return this; },toElement:function(){return this.element;},attach:function(){var a=this.options.textOverride||this.element.get("alt")||this.element.get("title");if(!a){return; }this.text=new Element(this.options.element,{"class":"overTxtLabel",styles:{lineHeight:"normal",position:"absolute"},html:a,events:{click:this.hide.pass(true,this)}}).inject(this.element,"after"); if(this.options.element=="label"){this.text.set("for",this.element.get("id"));}this.element.addEvents({focus:this.focus,blur:this.assert,change:this.assert}).store("OverTextDiv",this.text); window.addEvent("resize",this.reposition.bind(this));this.assert(true);this.reposition();},startPolling:function(){this.pollingPaused=false;return this.poll(); },poll:function(a){if(this.poller&&!a){return this;}var b=function(){if(!this.pollingPaused){this.assert(true);}}.bind(this);if(a){$clear(this.poller); }else{this.poller=b.periodical(this.options.pollInterval,this);}return this;},stopPolling:function(){this.pollingPaused=true;return this.poll(true);},focus:function(){if(!this.text.isDisplayed()||this.element.get("disabled")){return; }this.hide();},hide:function(b){if(this.text.isDisplayed()&&!this.element.get("disabled")){this.text.hide();this.fireEvent("textHide",[this.text,this.element]); this.pollingPaused=true;try{if(!b){this.element.fireEvent("focus").focus();}}catch(a){}}return this;},show:function(){if(!this.text.isDisplayed()){this.text.show(); this.reposition();this.fireEvent("textShow",[this.text,this.element]);this.pollingPaused=false;}return 
this;},assert:function(a){this[this.test()?"show":"hide"](a); },test:function(){var a=this.element.get("value");return !a;},reposition:function(){this.assert(true);if(!this.element.getParent()||!this.element.offsetHeight){return this.stopPolling().hide(); }if(this.test()){this.text.position($merge(this.options.positionOptions,{relativeTo:this.element}));}return this;}});OverText.instances=[];OverText.update=function(){return OverText.instances.map(function(a){if(a.element&&a.text){return a.reposition(); }return null;});};if(window.Fx&&Fx.Reveal){Fx.Reveal.implement({hideInputs:Browser.Engine.trident?"select, input, textarea, object, embed, .overTxtLabel":false}); }var Drag=new Class({Implements:[Events,Options],options:{snap:6,unit:"px",grid:false,style:true,limit:false,handle:false,invert:false,preventDefault:false,modifiers:{x:"left",y:"top"}},initialize:function(){var b=Array.link(arguments,{options:Object.type,element:$defined}); this.element=document.id(b.element);this.document=this.element.getDocument();this.setOptions(b.options||{});var a=$type(this.options.handle);this.handles=((a=="array"||a=="collection")?$$(this.options.handle):document.id(this.options.handle))||this.element; this.mouse={now:{},pos:{}};this.value={start:{},now:{}};this.selection=(Browser.Engine.trident)?"selectstart":"mousedown";this.bound={start:this.start.bind(this),check:this.check.bind(this),drag:this.drag.bind(this),stop:this.stop.bind(this),cancel:this.cancel.bind(this),eventStop:$lambda(false)}; this.attach();},attach:function(){this.handles.addEvent("mousedown",this.bound.start);return this;},detach:function(){this.handles.removeEvent("mousedown",this.bound.start); return this;},start:function(c){if(this.options.preventDefault){c.preventDefault();}this.mouse.start=c.page;this.fireEvent("beforeStart",this.element); var a=this.options.limit;this.limit={x:[],y:[]};for(var d in 
this.options.modifiers){if(!this.options.modifiers[d]){continue;}if(this.options.style){this.value.now[d]=this.element.getStyle(this.options.modifiers[d]).toInt(); }else{this.value.now[d]=this.element[this.options.modifiers[d]];}if(this.options.invert){this.value.now[d]*=-1;}this.mouse.pos[d]=c.page[d]-this.value.now[d]; if(a&&a[d]){for(var b=2;b--;b){if($chk(a[d][b])){this.limit[d][b]=$lambda(a[d][b])();}}}}if($type(this.options.grid)=="number"){this.options.grid={x:this.options.grid,y:this.options.grid}; }this.document.addEvents({mousemove:this.bound.check,mouseup:this.bound.cancel});this.document.addEvent(this.selection,this.bound.eventStop);},check:function(a){if(this.options.preventDefault){a.preventDefault(); }var b=Math.round(Math.sqrt(Math.pow(a.page.x-this.mouse.start.x,2)+Math.pow(a.page.y-this.mouse.start.y,2)));if(b>this.options.snap){this.cancel();this.document.addEvents({mousemove:this.bound.drag,mouseup:this.bound.stop}); this.fireEvent("start",[this.element,a]).fireEvent("snap",this.element);}},drag:function(a){if(this.options.preventDefault){a.preventDefault();}this.mouse.now=a.page; for(var b in this.options.modifiers){if(!this.options.modifiers[b]){continue;}this.value.now[b]=this.mouse.now[b]-this.mouse.pos[b];if(this.options.invert){this.value.now[b]*=-1; }if(this.options.limit&&this.limit[b]){if($chk(this.limit[b][1])&&(this.value.now[b]>this.limit[b][1])){this.value.now[b]=this.limit[b][1];}else{if($chk(this.limit[b][0])&&(this.value.now[b]c.left&&a.xc.top); },checkDroppables:function(){var a=this.droppables.filter(this.checkAgainst,this).getLast();if(this.overed!=a){if(this.overed){this.fireEvent("leave",[this.element,this.overed]); }if(a){this.fireEvent("enter",[this.element,a]);}this.overed=a;}},drag:function(a){this.parent(a);if(this.options.checkDroppables&&this.droppables.length){this.checkDroppables(); }},stop:function(a){this.checkDroppables();this.fireEvent("drop",[this.element,this.overed,a]);this.overed=null;return 
this.parent(a);}});Element.implement({makeDraggable:function(a){var b=new Drag.Move(this,a); this.store("dragger",b);return b;}});Hash.Cookie=new Class({Extends:Cookie,options:{autoSave:true},initialize:function(b,a){this.parent(b,a);this.load(); },save:function(){var a=JSON.encode(this.hash);if(!a||a.length>4096){return false;}if(a=="{}"){this.dispose();}else{this.write(a);}return true;},load:function(){this.hash=new Hash(JSON.decode(this.read(),true)); return this;}});Hash.each(Hash.prototype,function(b,a){if(typeof b=="function"){Hash.Cookie.implement(a,function(){var c=b.apply(this.hash,arguments);if(this.options.autoSave){this.save(); }return c;});}});

// perldoc.js
//
// JavaScript functions for perldoc.perl.org


//-------------------------------------------------------------------------
// perldoc - site-level functions
//-------------------------------------------------------------------------

var perldoc = {

  // startup - page initialisation functions ------------------------------

  startup: function() {
    toolbar.setup();
    pageIndex.setup();
    recentPages.setup();
    new OverText('search_box');
    perldoc.fromSearch();
  },


  // path - path back to the documentation root directory -----------------

  path: "",


  // setPath - sets the perldoc.path variable from page depth -------------

  setPath: function(depth) {
    perldoc.path = "";
    for (var c = 0; c < depth; c++) {
      perldoc.path = perldoc.path + "../";
    }
  },


  // loadFlags - loads the perldocFlags cookie ----------------------------

  loadFlags: function() {
    var perldocFlags = new Hash.Cookie('perldocFlags',{
      duration: 365,
      path: "/"
    });
    return perldocFlags;
  },


  // setFlag - stores a value in the perldocFlags cookie ------------------

  setFlag: function(name,value) {
    var perldocFlags = perldoc.loadFlags();
    if (!value) {
      value = true;
    }
    perldocFlags.set(name,value);
  },


  // getFlag - gets a value from the perldocFlags cookie ------------------

  getFlag: function(name) {
    var perldocFlags = perldoc.loadFlags();
    if (perldocFlags.has(name)) {
      return perldocFlags.get(name);
    } else {
      return false;
    }
  },


  // clearFlag - removes a value from the perldocFlags cookie -------------

  clearFlag: function(name) {
    var perldocFlags = perldoc.loadFlags();
    if (perldocFlags.has(name)) {
      perldocFlags.erase(name);
    }
  },


  // fromSearch - writes a message if the page was reached from search ----

  fromSearch: function() {
    if (perldoc.getFlag('fromSearch')) {
      var query = perldoc.getFlag('searchQuery');
      var searchURL = perldoc.path + "search.html?r=no&q=" + query;
      $('from_search').set('html',
        'Search results - this is the top result for your query ' + "'" + query + "'." +
        ' View all results');
      perldoc.clearFlag('fromSearch');
      perldoc.clearFlag('searchQuery');
    }
  }

}


//-------------------------------------------------------------------------
// pageIndex - functions to control the floating page index window
//-------------------------------------------------------------------------

var pageIndex = {

  // setup - called to initialise the page index --------------------------

  setup: function() {
    if ($('page_index')) {
      var pageIndexDrag = new Drag('page_index',{
        handle: 'page_index_title',
        onComplete: pageIndex.checkPosition
      });
      $('page_index_content').makeResizable({
        handle: 'page_index_resize',
        onComplete: pageIndex.checkSize
      });
      var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
      if (pageIndexSettings.get('status') == 'Visible') {
        pageIndex.show();
        pageIndex.checkPosition();
        pageIndex.checkSize();
      } else {
        pageIndex.hide();
      }
    }
  },


  // show - displays the page index ---------------------------------------

  show: function() {
    var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
    if (pageIndexSettings.has('x') && pageIndexSettings.has('y')) {
      $('page_index').setStyle('left',pageIndexSettings.get('x'));
      $('page_index').setStyle('top',pageIndexSettings.get('y'));
    }
    if (pageIndexSettings.has('w') && pageIndexSettings.has('h')) {
      var paddingX = $('page_index_content').getStyle('padding-left').toInt() + $('page_index_content').getStyle('padding-right').toInt();
      var paddingY = $('page_index_content').getStyle('padding-top').toInt() + $('page_index_content').getStyle('padding-bottom').toInt();
      $('page_index_content').setStyle('width',pageIndexSettings.get('w') - paddingX);
      $('page_index_content').setStyle('height',pageIndexSettings.get('h') - paddingY);
    }
    pageIndex.windowResized();
    $('page_index').style.display = 'Block';
    $('page_index').style.visibility = 'Visible';
    pageIndexSettings.set('status','Visible');
    $('page_index_toggle').innerHTML = 'Hide page index';
    $('page_index_toggle').removeEvent('click',pageIndex.show);
    $('page_index_toggle').addEvent('click',pageIndex.hide);
    window.addEvent('resize',pageIndex.windowResized);
    return false;
  },


  // hide - hides the page index ------------------------------------------

  hide: function() {
    $('page_index').style.display = 'None';
    $('page_index').style.visibility = 'Hidden';
    $('page_index_toggle').innerHTML = 'Show page index';
    $('page_index_toggle').removeEvent('click',pageIndex.hide);
    $('page_index_toggle').addEvent('click',pageIndex.show);
    window.removeEvent('resize',pageIndex.windowResized);
    var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
    pageIndexSettings.set('status','Hidden');
    return false;
  },


  // checkPosition - checks the index window is within the screen ---------

  checkPosition: function() {
    var pageIndexSize = $('page_index').getSize();
    var pageIndexPosition = {x:$('page_index').getStyle('left').toInt(), y:$('page_index').getStyle('top').toInt()};
    var windowSize = window.getSize();
    var newX = pageIndexPosition.x;
    var newY = pageIndexPosition.y;

    if (pageIndexPosition.x < 0) {newX = 0}
    if (windowSize.x < (pageIndexPosition.x + pageIndexSize.x)) {newX = Math.max(0,windowSize.x - pageIndexSize.x)}
    if (pageIndexPosition.y < 0) {newY = 0}
    if (windowSize.y < (pageIndexPosition.y + pageIndexSize.y)) {newY = Math.max(0,windowSize.y - pageIndexSize.y)}

    $('page_index').setStyle('left',newX);
    $('page_index').setStyle('top',newY);
    pageIndex.saveDimensions();
  },


  // checkSize - checks the index window is smaller than the screen -------

  checkSize: function() {
    var pageIndexSize = $('page_index').getSize();
    var pageIndexPosition = {x:$('page_index').getStyle('left').toInt(), y:$('page_index').getStyle('top').toInt()};
    var pageIndexHeaderSize = $('page_index_header').getSize();
    var pageIndexContentSize = $('page_index_content').getSize();
    var pageIndexFooterSize = $('page_index_footer').getSize();
    var windowSize = window.getSize();
    var newX = pageIndexContentSize.x;
    var newY = pageIndexContentSize.y;
    var paddingX = $('page_index_content').getStyle('padding-left').toInt() + $('page_index_content').getStyle('padding-right').toInt();
    var paddingY = $('page_index_content').getStyle('padding-top').toInt() + $('page_index_content').getStyle('padding-bottom').toInt();

    if (windowSize.x < (pageIndexPosition.x + pageIndexSize.x)) {newX = windowSize.x - pageIndexPosition.x}
    if (windowSize.y < (pageIndexPosition.y + pageIndexSize.y)) {newY = windowSize.y - pageIndexPosition.y - pageIndexFooterSize.y - pageIndexHeaderSize.y}

    $('page_index_content').setStyle('width',newX - paddingX);
    $('page_index_content').setStyle('height',newY - paddingY);
    pageIndex.saveDimensions();
  },


  // windowResized - check the index still fits if the window is resized --

  windowResized: function() {
    pageIndex.checkPosition();
    var windowSize = window.getSize();
    var pageIndexSize = $('page_index').getSize();
    if ((windowSize.x < pageIndexSize.x) || (windowSize.y < pageIndexSize.y)) {
      pageIndex.checkSize();
    }
  },


  // saveDimensions - stores the window size/position in a cookie ---------

  saveDimensions: function() {
    var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
    var pageIndexPosition = {x:$('page_index').getStyle('left').toInt(), y:$('page_index').getStyle('top').toInt()};
    var pageIndexContentSize = $('page_index_content').getSize();
    pageIndexSettings.set('x',pageIndexPosition.x);
    pageIndexSettings.set('y',pageIndexPosition.y);
    pageIndexSettings.set('w',pageIndexContentSize.x);
    pageIndexSettings.set('h',pageIndexContentSize.y);
  }

};


//-------------------------------------------------------------------------
// recentPages - store and display the last viewed pages
//-------------------------------------------------------------------------

var recentPages = {

  // count - number of pages to store -------------------------------------

  count: 10,


  // setup - startup functions---------------------------------------------

  setup: function() {
    recentPages.show();
    if (perldoc.contentPage) {
      recentPages.add(perldoc.pageName,perldoc.pageAddress);
    }
  },


  // add - adds a page to the recent list ---------------------------------

  add: function(name,url) {
    var recentList = recentPages.load();

    // Remove page if it is already in the list
    recentList = recentList.filter(function(item) {
      return (item.url != url);
    });

    // Add page as the first item in the list
    recentList.unshift({
      'name': name,
      'url': url
    });

    // Truncate list to maximum length
    recentList.splice(recentPages.count);

    // Save list
    recentPages.save(recentList);
  },


  // show - displays the recent pages list --------------------------------

  show: function() {
    var recentList = recentPages.load();
    var recentHTML = "";

    if (recentList.length > 0) {
      recentHTML += '';
      recentList.each(function(item){
        recentHTML += '• ' + item.name + '';
      });
      recentHTML += '';
    }

    $('recent_pages').set('html',recentHTML);
  },


  // load - loads the recent pages list -----------------------------------

  load: function() {
    return (perldoc.getFlag('recentPages') || new Array());
  },


  // save - saves the recent pages list -----------------------------------

  save: function(list) {
    perldoc.setFlag('recentPages',list);
  }

};


//-------------------------------------------------------------------------
// toolbar - functions to control the floating toolbar
//-------------------------------------------------------------------------

var toolbar = {

  // state - holds the current CSS positioning state (fixed or static)

  state: "static",


  // setup - initialises the window.onscroll handler

  setup: function() {
    var toolbarType = perldoc.getFlag('toolbar_position');
    if (toolbarType != 'standard') {
      toolbar.checkPosition();
      if (toolbar.state == 'fixed') {
        // If an internal link was called (x.html#y) the link position will be
        // behind the toolbar so the page needs to be scrolled 90px down
        anchor = location.hash.substr(1);
        if (anchor) {
          var allLinks = $(document.body).getElements('a');
          allLinks.each(function(link) {
            if (link.get('name') == anchor) {
              window.scrollTo(0,link.offsetTop + 100);
            }
          });
        }
      }
      toolbar.rewriteLinks();
      window.addEvent('scroll', toolbar.checkPosition);
    }
  },


  // checkPosition - checks the scroll position and updates toolbar

  checkPosition: function() {
    if ((toolbar.state == 'static') && (toolbar.getCurrentYPos() > 120)) {
      $('content_header').style.position = "fixed";
      $('content_body').style.marginTop = "90px";
      toolbar.state = 'fixed';
    }
    if ((toolbar.state == 'fixed') && (toolbar.getCurrentYPos() <= 120)) {
      $('content_header').style.position = "static";
      $('content_body').style.marginTop = "0px";
      toolbar.state = 'static';
    }
  },


  // getCurrentYPos - returns the vertical scroll offset

  getCurrentYPos: function() {
    if (document.body && document.body.scrollTop)
      return document.body.scrollTop;
    if (document.documentElement && document.documentElement.scrollTop)
      return document.documentElement.scrollTop;
    if (window.pageYOffset)
      return window.pageYOffset;
    return 0;
  },


  // goToTop - scroll to the top of the page

  goToTop: function() {
    $('content_header').style.position = "static";
    $('content_body').style.marginTop = "0px";
    toolbar.state = 'static';
    window.scrollTo(0,0);
  },


  // rewriteLinks - stop internal links appearing behind the toolbar
  // Based on code written by Stuart Langridge - http://www.kryogenix.org

  rewriteLinks: function() {
    // Get a list of all links in the page
    var allLinks = $(document.body).getElements('a');

    // Walk through the list
    allLinks.each( function(link) {
      if ((link.href && link.href.indexOf('#') != -1) &&
          ( (link.pathname == location.pathname) ||
            ('/'+link.pathname == location.pathname) ) &&
          (link.search == location.search)) {
        // If the link is internal to the page (begins in #)
        // then attach the smoothScroll function as an onclick
        // event handler
        link.addEvent('click',toolbar.linkScroll);
      }
    });
  },


  // linkScroll - follow internal link and scroll the page
  // Based on code written by Stuart Langridge - http://www.kryogenix.org

  linkScroll: function(e) {
    // This is an event handler; get the clicked on element,
    // in a cross-browser fashion
    if (window.event) {
      target = window.event.srcElement;
    } else if (e) {
      target = e.target;
    } else return;

    // Make sure that the target is an element, not a text node
    // within an element
    if (target.nodeName.toLowerCase() != 'a') {
      target = target.parentNode;
    }

    // Paranoia; check this is an A tag
    if (target.nodeName.toLowerCase() != 'a') return;

    // Find the tag corresponding to this href
    // First strip off the hash (first character)
    anchor = target.hash.substr(1);

    // Now loop all A tags until we find one with that name
    var allLinks = $(document.body).getElements('a');
    var destinationLink = null;
    allLinks.each ( function(link) {
      if (link.name && (link.name == anchor)) {
        destinationLink = link;
      }
    });
    if (!destinationLink) destinationLink = $(anchor);

    // If we didn't find a
    // destination, give up and let the browser do
    // its thing
    if (!destinationLink) return true;

    // Find the destination's position
    var desty = destinationLink.offsetTop;
    var thisNode = destinationLink;
    while (thisNode.offsetParent && (thisNode.offsetParent != document.body)) {
      thisNode = thisNode.offsetParent;
      desty += thisNode.offsetTop;
    }

    // Follow the link
    location.hash = anchor;

    // Scroll if necessary to avoid the top nav bar
    if ((window.pageYOffset > 120) && ((desty + window.innerHeight - 120) < toolbar.getDocHeight())) {
      window.scrollBy(0,-90);
    }

    // And stop the actual click happening
    if (window.event) {
      window.event.cancelBubble = true;
      window.event.returnValue = false;
    }
    if (e && e.preventDefault && e.stopPropagation) {
      e.preventDefault();
      e.stopPropagation();
    }
  },


  // getDocHeight - return the height of the document

  getDocHeight: function() {
    var D = document;
    return Math.max(
      Math.max(D.body.scrollHeight, D.documentElement.scrollHeight),
      Math.max(D.body.offsetHeight, D.documentElement.offsetHeight),
      Math.max(D.body.clientHeight, D.documentElement.clientHeight)
    );
  }

};
¢!=,ÑC»ÄD4BºK\L³<86Ôù3¨Ó¡–1:ln¹n¢Dv6l0 .´½¡”É4hP9xŸñ /¼ÐŽexÉ%—ØjW\q…¹üòËm®ïS²Ÿý¸ÏDíË_ÇÜ_.ÖöÙ(öñKmì—ý²_öWÆ~]kÙ/û÷-ûÛ‹-²ã gój_qç£6ܺ,ðû⵪r²[ûžlïøeár"ºmšˆ¾$é¶h'ž á3?å‰v"ïóð„xöDtËŒžè¸7ÚŠhç}vžh_Dœ÷B'4s<ÒMú%Äs¿”ˆF@;/tÒpî™.¤}Ú‰ès3ˆh_@Ÿ“àìNq}vDZåíšíš³b4èÀ¼¿iÞ{´yäŽ1±Pðc¬Ù÷ô4±KÑ2:w,X`#aÇT ï½÷lQ³ÊNŒ‹ýÆoرµ©CpÍ5רô¢l;tè<~çM­7}A„¨^Ñu’"zžç‰ž—ãž—ðFÏMx¢ãB:þÚz¢G¤D´ÍiÞèa)í¼ÑÓcžèiI/´ч'<Ðq/´шé„7Úz¡âÙŠéãû–ÑqïóaIot\DšÐÌ{fFvO éáÙDtô!)m—c"ºc\D°óør¿öéômw°Ñû$‰ìŠšÍœ93íG}À€É9ï#œÕqB“(„¾ˆn¾tá>‹~Q„Ç+tû|Ï¿ÐÏË~Ù/ûe1ì×µ–ý²ß´Ÿ9ÏvG«Hhºì–˜0M©ÎÅ“ýå—_š[n¹¥R„Õ¥3y²+ åvÚш然Þhë‰B¹‡û¡Ü™ˆtz(wÂóJ7ˆ€žZND'=ÑýRžhÊÝoRyñœåöEt( £B¹™gòB'=ÑvyHJ<#¦; Š‹ç˜> Ú 0gÆDôYeý͙Ю_lÞלÑ&AÛø¼~ëÞæÆKšYï®±ãb=É\·K/½ÔüßÿýŸ©ª‰{ºk×®ñ¡Â:5Ž ácÊ…rÏs¡ÜäA>²œˆvBÚz¡G¡Ü#ʇrÇ=ÑG&C¹§ñ<Ð ! åNˆèAG&=Г|í éРŽ{¡Mx¢ã‚ÚÑ£­ú°ôPîî¡ÜÝ<tç@D'=чÄ<чšRBº‡hݧ,E¯¶1Ú$h› õÁÙ¡Àž:uªÃŽeæ>ãÆ³_ž(}ýõ×›'Ÿ|ÒÜu×]‘ëôlQœÃÃ'êx•Åßg.ûÏwûª>?Ù/ûe¿ì/Æþu­e¿ìß·í'úB¶™„ö ¹I÷‡9ÌU˜“m=ÙÎ x· )¤ÛÀè„Úåö±¡ÜsSžè´Pî9ˆŽ èVDÏ´hÄsóÐ åž’&¢­€Æ }&¥…rûyÑžç”›–ݨÇ+¢ÏMˆèó2…rwIÑ.”ۉ賈Nz¢XOô™mû¥ é2O@·í[Ñmz[!}F«­YîeNoÙ#FÏļ‡Y<÷Ü´g˜:ìDÓ®¬¥½vh î§RO«V­2íÛ·{лžËG§…s#¨ç&¼Ðh¼ÐsF¦°žèPD‹{ŸS¡Ü¾ˆÆðH'C¹Bz@<O´ÒÎû<¡J@Oˆç¨Pn牎 èÃҼо€Šèn /t"„;=”Û…s’åöˆ èþ1A.¢1½ËÒEto_D;b"ºGëƒlºÇèÚò Ó­ÕA¦k‹í\"Û §—“AáCøaf]EâùÎ;ï4/¿ü²-»_ÑvŽéÓ§ÛýF/üY¸¹¿\™}ÕFd¿ì—ý²_°_×ZöËþâÚß·o_b»qãF+fÃgÁs—+z‘£„1¹ÕŸþy¥øê«¯¢EvçžIí{¡['ªp·J‹‹h„sœÌEŜڅq7OæCO˜ðD{¡Üñy*„;î…žÆ÷BŸé0”/ô¹ž:RD{¡Ü¡:åNˆèö’žhÄs\@÷KΓžhë…îcôé1á\?MD÷Lã´Ýͩͻ™ÓbœÚ¼»9­YWs*4íb¹vNƒôìDnóÌX²eÍìõ£r>µJ5=õÔSÉŽšÝ$E´Ë®HDOåö‹ŠM‰fë…v¡ÜND»|èüPî~Ñ¡Ücû¤r¡ãÅMäCG„rûÅ¢Dt€ÏVHwr¡Ü^8·Ñ{¡“"º­ç‰ns°ÎV@·Š‹ènÐ2&žÒ º@óM—ñy皥ÈN9#šo~€}ÂsçÎÍ*š ùyýõ×Í‹/¾hÞ~û휄69Ûì?tÇôPnG—xFL· ò¡'º'¢csBºOoå{¡ÓE4^hë‘n[nšòD×kÚ¹¼ˆn‚ˆnŸÓ,ŸÜ¨]Œ²äü¤ó ­¹jÆ©©a­ÆûC[Å™;æIvçÎm‡N±&ÒRÀÔãôŠó¡"Ú†pHÍ}í‹ih«òùЇ&‡¶Š‡sÊ튊 ÷ Š9ñì‹hç‰Nˇ>$ï|èr¡Ü¡'šy«¸€¶XoôVL'Etó#=ћƅ³Oû&ÆDôþVD—5Ž êó0m¡ÑþñyŒ60­Ï;@"›‚äZ÷ìÙ3ùCÞ«W/;Ï"îCን+WÚ¼ìwÞy'çÏÁ´iÓ’ÇÍÿýå\?Ÿïþªúx²_öË~Ù_ŠÏëZË~Ù/ûÝþK{ýúõiaã=§-*?´UbH+WT¬ªr²[vè‘=µ\>t“¾“=ñìò¡'¤åCû^h+¢{–/*fó¡Ëy¡Gfч¤‰ç½|èö^>tÂûl=Ðíü|è>vîçCÛœè/t<”;Ñ6Œ»kB•= ½¨Ø¯¨˜+(χ>,>>t"z¬«Î]Î }Xd>tè…æCw 
ó¡-ŸÝÑåNåE÷©(:±ÜÓÐ.:!¢­'ÐNHwòtÇ@D·÷<Ñgt»&qot\8ïo…s\LÇtëóöOÒ*öºeCo£å¹ûKd/^¼Øz’Ù=zô°°Ìo®"ùÊ+¯´b/ã1~÷Ýw檫®ÊKh3ÌÇ­œÍ¾ýn¾/ ûe¿ìß7ì×µ–ý²_ö;ûáî»ï6ß|óMò™°Ç…WÙJÜñ\èù¶Jw¹|èa¼sçN[—§2¬Y³&Rd·hß5}h+Ä3cDSP,!œ7ºQ2:1N4":C(÷¹ m«rG m•– ÝÙÑaeîD(7sÆm½ÐNDǼѮ"·ï…>=áFDŸÞª—WP,îNåBÇC¹Ë‹è„':á…®{AL@Ç„s]K‡„ºƒ;'Úy¡OD@Çæˆèc¢9.žSB}â¹­Ì ‰¹}}NbÛž÷/ŸZ79¬•ŽçCÏ^ÇôìÜÐ^? ’1äV)&ÆV'Ò6.ä[™!=NHë›ÊNUäö½Ð=܉h?ÚºK Ý)='z`rX«ƒ3Ë/úÀd>t÷ÖÎ ˜Ç<Óñ\è“óÎÍ2‹è^(·Ïޏ'úÀ¤'Úy¡[YÑœÑN8'æVLÇDt‹†ñyË„¨vð~‹}]dôÑGv|¹îÝ»§Ag¦*âQЃI<ØŒ‡Ç8xŒ‡‡øÎuŒ±Í?ÇïÖ­[rî/»óËw}H¸}6òÝ_¶óË÷x²_öË~Ù_ ûu­e¿ì—ý>ƒ6Ÿ}öYÒ›ÝuÒ剂b©a­ZùÐU•“Ý¢}—”Nz¢"º×ëyN‰è1‰a­F¥¼ÐäC§Ý1ºý T!17·¡Üý“¡ÜÖÝ6å…FL“í¼Ïnž**–žm‹‰91íó½ÐN@×MˇNåE[}^BH[/t\8ûs_<³ŒH>1á.·ìÄtø^Äg.™T×›é k5mèM×ãCiÅî;ªØ—rúé§ŸìÈF."¢_—º¡Ü‡¦ ŠùùÐáøÐž:)¢“ùÐñå^Eîx(÷!†r» ÝÎ Ý3C>4^èî6úàd8wZ(w‹ørÊÝ¡iz.tûD(wR<'<Òxž­m—ú€¸ˆötË„¸f¹…Í Áì–}!mçîu º÷y‘}Ï=÷ØüÆ—ó™0aBN˜ÂŸ~ú©Í§APÓ“yß}÷™]»vÙBTä «Ðfh¯¨? ÎÉÍýå\×gÛ_6J}¼|?/ûe¿ì—ý•9]kÙ/ûei~.7»óøñá­¼ªÜ-ÒŠŠM«2‘ݼ]çdwTQ±È|hƇîìçBMÚªcª¨X*œ;•}¦ çnÛ/Ê}¦Ú…r·Jhë…NÞŠªÜ§%ò vâ9MD[ ç®›çN éöqôùí’éx(w›¸xN„n'çç¦{¡S‚ÙÍ­¢×—×Ù??wÜÉñ‚b‰|è‰ý‰å`_`¯c²‡ãK9QWÊ íNOÑN@‡"ºsª"wÜÊ}Håò¡[û"}°— M8÷ÁÉPnßÝ©Y®ùЩpî²D(wÛFñ0î”':&ž˜øÏó´/˜ÑœIh·h˜ûç÷Y‘ FLS€Ù.]ºØ9=›¹x±ï½÷^››×Ë—/·Ã€‘›ÍD"›ax/‘Íq9ªÆ·ß_®-û—ý²_öË~Ù*ûe¿ì¯ÌùÍž=Û|ýõ×Vdw;/UXŒ|èÄV.'ºI¬ w•‰ì²NÑ"Ú¯ÌÝÉ«Êí mÕÀí ŠµKK«ÊÝ'U™ÛoÕ;­¨XúÐV=’C[æy¡)(VÏ÷BÊÏõ9Oz¡9Ý6ž}^[OH{b:Æ QÞær"8xß­K„zãósÆœ+(/.6ºÏñ±ÜèæI¡;oÞ<óÀØ!}‰pe¸8œoT/F˜ø¶mÛlŠê«¯¾j;ˆ¨4ŽÃпs»éùÐx¡cb:-”»c¡ÜmÒ+s÷tžhÊÝ*5¤•+.Ö% Ý1çîá‰v:é…Nˆf¹7º5":Ê}€ÍiáÜçù‚8]—̤Þw¹añ>¿ÏŠlnPz›Ù>TÏ&†_zé%¦A ¯o¸á+¦|ðA;0¼›Ú/¼ð‚ùñÇͺuëlþw¶}3¤WxNùÒ©S§äÜ_.t¿µÙ/ûe¿ìßì×µ–ý²¿pûÉÍvÐ:Žº(>.t¢*·Úª¯—+$VˆÈæù±¬¬¬BÚ¶m¯.Þ¦mjh+?'Ú¯Èí‹è²þžè¾É¡­ÎðDtùñ¡ã¡Ü)t÷äøÐIX®WÎ /$VND'«rǽÐ'{"úd—ÿì{¡ýísZ§r¢£D°Ÿ7íDs~~æÈ“m1±½Ž7mÛ´Ì©À×ýQÙ‰HÙ\‹éul{NZ1±Œ"º,ðD·IKåC{ãCÛ°îƒRùÐ-RU¹C]¡:Ò}`²˜X?”±ì<Ò =}^à-½ÉØÍ´MU}~ŸÙÏ=÷œÍ½v?Î;v´s~83 àE‹™?þØÞì~ø¡}g[¶l±áá¼Æ›íO›7o¶^oz±ø§w«"‘M¥sÎ%Ä?G9Óúlۇ볿Ðýz|Ù/ûe¿ìÏ÷üt­e¿ì—ýQÇþùçm„c‡‘³m!±øV‰á­‚ñ¡ Ù/¾ø¢ý‰å>ièú­£†¶êž,&vZZQ±.Éyš:Y™Ûσö‡¶Š…r;O´Êíðt2ºœpm]Nø–¼™¶©ÆÏÏU75.t¬¨ØÈžuÌîÇ™]ÿlúv®kzu:ÝtkVÌ«|^ÌËÝ4>žu,ÿ¿‰a‚m¡³Ö­L»6¬îÔ¶¾éRVÏt/;Ñô(;Öô*«·E èžQ":!š­ˆNx¤É‡NѶ2÷A‘ùÐižè° X㓹Û4J 
éÖ~1±ØëVçîŸ,2Ö2ȇv^bߣœ²–/íÞ«9ŸßgE6=B„†‡?´™ÆÅF`»å™»êᄋ㩾í¶Ûl^6¡áDØ8Þî­[·ZøÒ¥K3ŠlŽß¡C{.Ì£pëÜ9‡Ûç»>ùžO¡Ÿ—ý²_öËþRدk-ûe¿ì‚gÂo¿ýÖt13V•{lÚ°V.úÜD>t1D6)†™&*R;‘ʇî«ÊÝÛæA×OÌ£B¹C/tª X˜Ý)9Nô)^UîÐ Ï‡Ž è'&ó¡€n–™ûz“ùÎANt¦ÜéúùiÃ뙑±¡­FzÃZ•+*–ȇîÙá{=Ñ…LŒ\Ù-ÊWåv´öLJ>(å…ö*s[åŽ:Ñí‡tû:-:(*ÖÊ«Êí‡]—/v@´WØÏ®EŸßgE6EËøñm×®]—_~y¤ø]½zµ½É÷ìÙc?Ë{äB0½ùæ›É<íLaä|Ž/ yýë_#sÅW”;'BLÜÜ_·+U}¼šv>²_öËþ}Ã~]kÙ/ûe?sž¿p„t:=sQ1ŠˆÅŠŠ•Rd“†È91¾òy-;¥‹è *·Ñõ’…ÅÒ+s×ó ŠÕm’^T,éNTèŽç<¹Ð Óó¡SùÉå=À~Øõ Q…ÄÒò›kÿç§ ­ÚªK¢¨Xbx+Duª*w<ºsÙÿoïÎä(ï<Žïß°?¬VÚVa³Òr$JìØc1Çsj¦ç><öÜ'6¾±Ácc‡ÍI l¸Ì†° «(Ù¤(Ä”„9AQ"A`HB¢gëûT=ÕOUWuWOîé~¿¤¯ª¦»ª»†Ïgž£Ü{fGÝ/[þ "oa²âý›o¾™ñ¸ÜrX¯^Þý!Ý ½5¾µ•ÌNω–}¢‡"Ë6ºo“žíßÚëyNYÛð|è@p¸'80¬:z>s5œ_³![‹š ·È™½ŠŸ<&ó«eöûï¿ïϵ¶çcG-µ/‹HɾœkÂz¸rÍ×I2ŸÇlíýr_ê×£ý´ŸöÓþb¼ßkÚOûi¿lOœ8¡CöÀU­Û[…sWå.UÈ–iˆroe7`kO÷B›9ÑþPnëÞк'ºiÀ ÏM^€–ùÐöªÜæÖVVOtî[[E­Î×ë[ýçX\m-*ö·EÅ Šõu·èï¹Ìû7dÿàÁƒúñ¨ìpÍ5×èÿdŠ©=BV²‹œ3Øu¹w‹«¿Éœ ºßš mtúÖV^;ÔÝž—3G¹3âÞÑá^ßZ8¿&Cöo¼¡@Ê2ùáj¶RQ¡WnÓ%d¾µyLVõ2GÛ<&󯳑…Ðä8¹‡¶xæ™g"ßO>]áÏh?õ|¸r_î× ?Ÿ«h?í§ý´¿ØŸ—ï5í§ý´ßl÷ìÙ£{‘û¤ó›µæDO—$dËb¼2l]žkêìOß'Ú›ÐMý¡ê ,&f‡è+ÌP´£Beà¶VÒ«›Ê\u›óÕþÅ5‰nm5Þówú{*£þüç?ëyÿgΜ ,RvàÀÀÿ /¾øbàyY9\Ìr·"½ˆZW]Ö­´µ ˜ìËîn;H›UºÃ·¶ÒC¢3‡B»·© §ÌùµÛ“-ËßËÈT*øY~؆¯=ÏÚ¾—ÜSQÈqó˜ôNg#s·?ýéOû=ãòWÓ¨m>‹ý¹ÌÖÞOºÊ`©«ÐÏ—ë|ÚOûi?í¯„öó½¦ý´¿:Û/AG¦öÍíwVÞvomessVäN¹Ûb‡léÈ‘ÅØtv[oÆ¢bæþÐÍÊm͇nȵ€WÄjÚáZÎO|þ¾ù5N€þ›ŒU¹Ý[[™ÅÅœ^æÔ?êï·,f÷àƒúHév/H­Ñó«åkù#‹qêÔ)ýXo番7µ)ðÿèùóçݯ»6…VåÎì‰6s ÓC¹­ÞØPÏlgÔJÚážÚFÎÏ÷üš ÙGމüá-s¢íÀ+ –²oV7s(î¾ûnÿ±$$`Kp—¿hýሠÙEQEQTéëèÑ£:do¹úÚ84¯gšÓµa`FmèŸö«˜![FHÊýºå±–T¿ªëQuíõ¶}H­m‹¯5mÙŸÏUœŸßñ®Þ¨æF>¬f†.Q3òý5íÔäà%jjÐl/Q#½uÿ¯ ô4ª­=ÿ¤Æ{/QC=WêÇnºé&ýÿƒÜ¯Ý=®Kmîú°ÚœúJ­R=©ŽÌÛtu^¢†;ÿÞ¯!§;œmLJ¼mºûŠ=†ó‹{~͆l¹Y¼ êêê TxuñÛn»Íÿá(½Ðò˜Ì¥6î¸ãŽŒÇ²‘¿dÉñ¿ûÝïty8`Ë_°ÂŸ)ßêììô·QeSŠóóý|…¾í§ý´Ÿö'ùü|¯i?í§ýQçŸ>}Z_:ýoêÀ±Sjÿõ'­ºQí?êí;Ûb…l™b¸eËýõØä´Ú{h)]yÛ¥Ðã¡çÃûœ_–óOŸØ­–Ϊë:fÕõ×klËŠG†ûÕÞêèµ³êˆW×í›vî³ívøÉèVéÌ“ýÉñauhÿŒ_×]3£f§6;ƒé½ëêquÐyܯý¡mÔsfÄqœ_ôók6dËœ™#ÑÑÑ¡Àš­ ÇߺKV¸ÍrÓ“}îÜ9ý˜¼^Òžl9^†ŽËÜŒpÈ–yAögŠ*ûóÚûË=>ü|®*÷ç£ý´ŸöÓþbÏ÷šöÓ~ÚoêóŸÿ¼^xlïÞ½jaa!kIÀ™ŸŸ×A9ß2óqåN6cccz_†ŠçzO)¹Ï²ÙÚûIŸ¿ØUè竤öÛÿŸd{_3ç||<öµFFFôÿŸúÔ§ôbgæ>êQ¯'›ãå6Ä¥hw¾¯ÇûÇ_³![®Ü«­­-P333ÁWþºž“-÷S_ùÊWüÇdx.Òã-ÁÜ,t~/ù¡þLEQEQTiêñÇ׿“e 
ÙæçbSϰ+-4‡?O¸Šýù*¹ýIþ377§G*Èd²µKrG`¸3o;×çÅÐ$/\ŒPYél©¤ö×lÈþêW¿ªFGGUkkk ä/šò×$;øÊð³8…yÌ,¹ÿÊ+¯øÉ=°³yýõ×õqÏ?ÿ|Fh—’¿lÊ{ù---þg2û²µ÷ß}¹Çç{~¡Ïûýh?í§ý´9Çó½¦ý´ŸöËm¼ž{î9}oâ}ûöåüÅYz%—Sº÷Qæ`KI+gh,wH®´Ð\Ìö'íÉNZÒég¯&¾’þ»WûŠB^¿fC¶]š-¡V~Ð677û[yÜ¿2¼ç7¿ù6n†‡ßzë­úÙ2ìÛ #—e÷³ùÚ×¾¦n¿ývÝã-[†Ûï³}ûvÿ ûóØûIŸW¾¯®\¯ŸoÚ>ÚOûi?íÏ·ý|¯i?í§ýáößpà úv¬²“ôPVjÚJë‘\ÉíÏçÿ“$Ÿ{rÒ]µ^†‹¯´i¥þÿh%·¿fC¶Ü«úÐs%Â?¤eE¸7û¡‡Òó°¥·Z†|Ëc2TܾW¶ù:ŠÌ¿–…Ód¡ Ù÷ÜsOàõeUséE/Õ?L…þCšëõrU¡ïGûi?í§ý…¶Ÿï5í§ý´?\2²QVù6!{¥ôø­¤a³ÕÖþb†lS°e´Cµ„gªÆC¶leOÔæmÛ¶eÌ—~òÉ'uЖ%öÏž=«{öÙguˆ¾ÿþûõcfA´ðý±åyyO Ø?üpÆkËÜbþu6ß¿öûÎ|߯ئý´ŸöÓþ¨×á{Mûi?í7%CseÝ_üây÷dSµ[2âUæZKÉïïf›¤ìsì}éÍŽz>\¹Þ/îõãž/öë/·ýåú|ålM‡l Ä2Ô[ߟ0âõ±cÇ2°„åwÞyG÷Lß{ï½ú±o~ó›úkVþòË/g„ì¯ýëúqé7ÃÍí’û3ú×ÙR©Ê÷óúyi?í§ý´¿Øíç{Mûi?í·—Ûªþþ÷¿÷Cöoû[ŠÊY {ȲJ¸üµ!ꇳ„o¹ou8˰o Ö¶/\¸ n¹åõÅ/~QßßNž“¡å²J¥ñî»ïªo|ãúv`q÷Å®ôyWÅþ|¥>žöÓ~ÚOû‹1\œï5í§ýÕÙ~±(ëãB6EÈŠ²e1³'žxBÏ…ˆúA-s¶£‚¶ÔÍ7߬ÿ *·“} ÙòZf!´óçÏ«GyD?u¾¼n¸½Rþš]î!^´ŸöÓ~Ú_êöó½¦ý´ŸöËV½•»ÄüéO"dS„l ![È«Þ{ャ¿óå²åŽ4§OŸ.YÉÔDB6!¨Ê->øà½ š¬>11á/VV‰ÿ{žÕÅÒFûi?í§ýÅn?ßkÚOûk£ý333êÑGÕ½×ò»\.ù†l™XJâ Ù„l jC¶!÷Q”!äO?ý´žÓÓßßø_Ž![•>$ŒöÓ~ÚOû+½ý|¯i?í¯ÞöËïfò;Ú3Ï<£;Häö¬I²)B6pB¶Ý³ýÇ?þQîï}ï{ê±ÇS·Ýv›^iüøñãúÞ×EQEQTéJnÃuÓM7©Ûo¿]=þøãêûßÿ¾þÝL¦úå®ãBöÛÿûêßú®úŸŸPOÞñ%õØgÿÓ©ûÔ95:#uoDËRáãîõ^/¸}ä–{œºW}ÙyþËÞöaç1ýµW|æ}œÿØÙ{Ôß‘ºÛ¯óŸ¹K¿å®Àö!gûЙ;Ó[§üôÔ—Î|A=xÆÝFÕýgîp÷où‚zà–ÿPœI×ýþö¯ÜÇ»óõÇþ[½ú£Ÿªß¼û.!(EÈ@u±Cö/~òªºÿÔmêæ™Ã꓃»Õ ]W©¥Îu\ªcQ t÷f„lYq\zž—SGމ Ùé~ý~Kò¾ êXû¼Zjw·¦–üíB`ß<ÜOgŽ=îoÓ»û ú¹ã¡ã—¬÷¾¾mV]ß>ço¶Í¨#­N9_v¶‡[§Õ!Ù¶Lë¯9[©ëš'Õu-îÖ©k7M¨k›½Ú4®8Ûkš¶ªýͲ?®®Ù4¦ömÚªö9ùÛ¦-jÏÆQµ·iÔÝnt¿ÞÝ8¢voܬvéíˆÚÙ8¬vlö¶CzÿêÆ!uõ¯Õö j{Cº¶9µ¸¾_-6ô««œçvlÚ¬ö¥&Õñ‰ê©/>ªÞv¦ ¤€o¼ñ†úîw¿Ë¿*5îµ×^S?ùÉOÔ[oüZÝuìßÕÉÞ«u¸Õ·Ý ×: :_G…l é›7o^VmÛ¶-6dG½ÿqý˜ ßéÇÍsz¿=½Ÿäücn-yÛcú1 Ó¢çõþõNP>êè£ÞöˆÒnˆ–ð,Ûƒ-SNMzÛ)¦¯5åéýN¨>à„éýMcnˆ–мQj‹Ï×xû{7ºÚ„è]Nh–½KB´³/Az—žwnp´l·;¡ùêõƒî¶aHm_ï„f'0‡ka}Ÿ¤ÝZXß«æ¯ìQóõN]Ùíî;Û¹uÝj¶Î©u)§ºÕt]Jíë™PßúÚÓ„l Ž,„ñío;öV¨ Ò‹ýꫯªgŸzZÝØ·#݃`£Bv±ù=Ùïï鎈 ±¿ ÐK^¯¸î…n7AZjÆ Ðn™èÃ^€–};@ËþuNéðl‚ô&+D;=Ðû7¹!Zg'@› Ï&Hoö³ ÍvƒÛ­{£%D¯wói7LäÐné]ßë†hg;«tÊÐ3ÎvÆùzº®ËÙº5]ש¦ê:Ô¤UkÚÔM»²82èùçŸWï:s+P»ž{î9%£øçÏ;ÃÂý ·‹:´–3dG½¸7zÉîÎÑNIt›;¬[éõBööéžèP/t³Õ Ýbz¢Çu¥ƒôX 
ÚÑ&HïvÊgÝݘÐ;Ípî†Ì}U(@Ë×:@×÷ùaZ÷D×{½Ñ^Íš^h/HÏH®Kyá9 §¼šX+Õ¦Æ×¶;ÛvgÛ¦¶®nUcŸhñÊùÚÙN¯ï"dÙBöOúSõ£ýˆYjÔ[o½¥§Êï†w\û/~P•²q=Ù¿þõ¯õBlË©³gÏÆ†lyÿø¡Ü^/t‡;?Z÷B·†B´5œÛ Îvˆž †èæè­ƒ³Ï[ý}3ÚÑÞÐm ÒvˆÞiz¢ÜýÀPîˆ^èpˆvt5œÛÊm çžöz¡§C!ÚíõDëíé5í^xnÕ%ZjËêf§ZüÚ¼z“SMz;²ªI8û$) KÈ–Û;|ç;ß¡7 Éc.\¸ .“ß ?·çTäðl{^sTÈ–9ÝÝÝÝË*¹%X\Èöƒt`>´ Ïéèô|è,C¹%@7{!ÚÊmBtS°‚C¹GƒC¸õÂbÃVˆvÛ.=Ð:<%ÊíéÞ@€^ðz¤í¹Ðf>´=œ;¢M€öz uI˜vBôø'ìžèV?@ZAZ‚³¤Ý’=¼jcº>¾Q ~¬Q 95èI Ȳ¥~ùË_ê®ï½÷ÿÒÔPÀ–ìŸýìgþ­»>ë„ìcÖ\f3·9ؓݗ²ÿò—¿èΛåTTg„ì¡®¾Àœè¬ó¡½ÐcÖ6Ý^P,8œ[‡èÆeîÝ¡¡ÜzÛ0ìávË Ñ~/ôzk(w½™íõB¯ -(Óm³é‰Îè…þ„¢í^h747gét€nÊ Ð~ˆÞàU£Þ8ÕÿÑõêÕU¯úœ"I9B¶”üõQz´å¶^,„PÝ$ÜJ'‹°uÈÞyÒ Ò‹ÁºÛÓs —qá³Á®ÞˆùÐéÝ4æ-&¶5Ý 5”Ûšî…öçCVæŽ^•;:D‡qg„h½¨XöùÐéEÅÚÓs¢ÊíöDƒtz›µ:¢':3@7d„hÙïþÈ:Õó‘+­WW¬Sݗש.§HR@‚-õú믫^xA‡mYaR‚·ôrËð¡J-ù£@-—|Ïj¹d–Z.™ÿVËõæ›oRNUÚ÷¥Ö¯ËZÿ¹\ëÿ.WòïL¦dqYøV~ßûÕ¯~¥Â¿ž½ú„µBwz11{^tTÈ–Ð~îܹeÕùóçcC¶?zczeîpˆÞenkeæBo ͇öîíöB{!Úé…¾*òÖV½ÁùÐWfŸ=ãõBÏX½Ð39æCO˜ùЫ[ü^h³˜m†sÄôDÛ=Ðn/tcF/t8D÷Bô:/DשÔåkªÓ%aºó²5ªÃ©N¯:.û„j¿l5!H²MÉ/IòWÍW^yE½ôÒKêå—_^1%Ÿ—ªÝúá¸"«ÖÛ_¬zñÅk²øÿ·:ÚO­ÌZI¿#IÉ}°%dË„â~<»ý†ô<èÀ|h·dA±Tyædtö8AzKp>´Y`¬ÑÑÃéè/H¯Ï2ºÞš]Ÿ^‘[z¥Íh·RÖ|讌žè@ˆ6=Ðk¼•¹×´ùC¹Óó¡Mˆn-*fõD¯Ê5Ú ÒQºOh7H÷HIpþ¨¢½àÜu…¢¯0!z­»u¶ ѦڼjmIR@Œ……EQ¹Š+€ÚqfÛ’[«CÖÖ^T¬?"dËüîåþ±*êN7&d›[[z¡}{wx(·½:·k«z«ÚÊ]ט -½Ð“Ö‚bé[[¥Wä¶ÛjõF›àW @ȦÙ OÍÔ‹‰¹‹‹¹·³Úmsj§3\»¯€ýÁ¨wÞy'gIÈîëìöomåΉNéuUî5™=ÑöBböœèŒžè·¶òCôúÈmåvçB_é é®ÓZÂs×åéùÐþ0îµz?®º-¨#z—#{¦ÓÏ·Z½×­—ç|®€M²@Ÿœ;àÞºÑ]•Û½µÕPzN´³ X_W÷²Cö~ðƒä‹¢¥RÞªÜí9Ñîðmo÷ê–ˆùÐn™U¹³Ï‡vCôÐÇC«r7ÄöD»áÙî…® ̇6AÚ¬Ìd(w[¸'Úž ªÝ3½*tNiÏçJÙ!$ Ù ×f¬Ê½-´¨XoBö¶mÛÔéÓ§c«§§G‡ìð|èј^è¨m÷B»ó¡7dèþÈ¡ÜæÞÐu¡Ü)+@çB¯‰ Ñm«tgÌ{¾4zƒóŸ3Ãm¥œÏ•²)B6Hà”Ó“½Pßç¯Æ½`ÝâJæAÏ9ó¡%dGÝv+Ÿ}ß}÷Åsá¿';û|èt€v‡ro°kˆÊý‘z¯'Ú¬Ì]—¹2·Õ möÛ²èv{A1¦ðM@ …ݨãp>W @ȦÙ ³ûœ0í,(¶.åÍ6‹Š¥ü9ÐÅèÉŽ Ù/¼ð‚êïï×=Ù} ­éùÐo öB,fUîÖ[C¹ƒó¡íáܦºÓÛ¶e¹7tTˆ÷ø¶^ºõ•Ýó^­» ÎçJÙ!$p|v¿°Xx>´Y•»§+U’-ÏÉ‚gò|_c«¢ƒ«r§çD¯ ÞÚêŠàí­üÞhk(w|otôm­Âé#{€/ËrN•žÏ•²)B6Hàú™]Þ½¡[ ŠÙ÷ˆ.EÈ~饗ÔÐÐ;L|C‹¢ëüùÐfuܗ¯ÉͱC¹£V×¶‡P_¶:ëÂ`œÏpq€M²@ò=½3û­­œ¹Ð=Å Ù?þñÕðð°»ØYcKpUnoeî¨áÛ«snKº •Ý{kõZgöær~’ó¹RB6EÈ ܸó šnêQSÝ^9Ú©‰ 
]j¢±K;ÛÞTñædÿüç?W£££ú±ÞÖ5tå&5èÔP¨×59emõcöãvYÇøçr~1ÏçJÙ!$ð¹[ΪSK'Ôi§Nê¿yÓÅÙ¯½öšÚ²e‹þzfjZ>^ÖÐXî\i¡¹˜íçJÙ!$d¸x©Bc©‡SWzh^IíçJÙ?D@½R†?WúëUsû¹RB6EÈ CöJ]¬Òç4WSû¹RB6ÅQ dS„l dS„l dS!Ù!Ù!Ù!@ȦÙ@ȦÙ@ȦÙ@ȦÙB6EÈ€²Ø»woâ µ¸¸èoíýå·B_¯ØŸ‡ö² à]ªVi¡´Ü!µšÚÏ• C6á)z² H!{~~^—„)³µ÷—û|®Ê÷üBß/ß×£ýéǸR aÈŽ ]ùV¡!5ß×+÷ñµÜ~®H`Ïž=~š››ó·QeUáóÃÇçû|®÷Ïõz…~>ÚŸ.®H²—úrU±C$ïñÞŸ+ò ÙW\)ÀîÝ»Õìì¬_¨ÌÖÞ{>\¹Ž/öóáÊ÷óå:Ÿöϲ ©]»v©™™™QøÌÖÞ_)Ÿ%·Ÿ+ª,dS¯¸R€M² |vîÜé©ééi›¤ìs¢Î?®\ïw±_Ÿö§‹+†ì¤¡’ªÝâJ€2‡ì©©)kï'}¾Ð׿ØUÍíçJ€vìØá‡¶ÉÉIUö1Q>?||¾Ïûýsí®H`ûöí~š˜˜ð·QeSŒ ¿_±_?ß÷£ýñíçJ€„!;.T–ºò ¹ÕV+©ý\)Pá!›Z9Å•„lŠ „lŠ +ζmÛü 5>>îoíý¸çÕëøB_/\•þy«©ý\)ÀUW]¥ÆÆÆ–U[·nõ·ö~±Ž¯ôª¥ös¥@‹‹‹~èË·Â!1"K}~©_ö§‹+ Ù[¶lñ·öþrƒZ±_¯ÔïGûÓçp¥@ ±¡,\á–+Äåû|¡•ïëûøjn?W $ ÙÅ yTõW ²)B6”Ïüü|l°õ·Qe³œ ¿~®×Ë÷øb¿_-·Ÿ+†ìÍ›7ë’0e¶IÊ>'ªÂ¯WéÇÓþøös¥@sss±¡¬ÒkddÄßÚû+µ=•Ü~®H²ãBZ®~>ß*ôõÊýyj¹ý\)gȦ¨¸âJ€e„ìááakï×J˜¤ýÑíçJ€fgg!2WÈ ?_¬*õëg{_Ú?LÈ€b†lŠÊV\)ÀÌÌŒ¤†††üm%”ý™’|¾R_ËíçJ€¦§§ý 588èo£Ê>&ªÂç‡Ï÷ù\ïŸëõ ý|´?]\)0dÇ…:Š"d!›"d@ùMMM©D%aËl£Ê>&êøb?_èçË·j¹ý\)Àä䤤úûûý­½¼rŸïó¹*ß×/öç­åös¥@Â4äQµ[\)Pa!»¯¯ÏßÚûÅ:¾Òk%·Ÿ+˜˜˜ˆ u…V¾!2×ù¹Bg¾Ÿ¿Ðók©ý\)0d+TRÕ[\)0d÷ööê’0e¶IÊ>'ªÂ¯>¾Ð÷Ëu~®ã‹ýyª¹ý\)0d÷ôôè’€e¶ö~ÒçÃUèëûóHX4[{Ÿöç~=®H`ll,6”Q”)®H`ëÖ­ª»»›¢²W ²)B6²)B6TUÈN¥RþÖÞOú|®Ê÷üBß/ߢýé÷àJ€FGGcC\WW—¿µ÷Íó»ÂŸO>»Ù&iO¾Ç×rû¹R aÈŽ ‘áP–oåûz¼å¾?W $°yófÕÙÙ©K”ÙÚûqχ+×ñ¥~>W…Ïí?®È3d‡«££ÃßF•}L’ ¿^®óó=¾Ôç×rû¹R Ï+TV[…C'ío?W $ Ùµ*©åW ²)B6”ÏÈȈ¤ÚÛÛým’²Ï‰:?ü|¸r½ßÅ~}ÚŸ.®H²“†Ê\ÕÖÖæoíý¤Ï¯ôªæös¥@ÃÃÃ~èkmmõ·ö¾y¾Ô•ïûûóÒþø×ãJ€†††üP•«Â!.Êò­B_¯ØŸ‡öÇW 9dSµ[\)Ààà ¤ZZZü­½_-A1Wûhüó\)0d›PEQqÅ•„lŠ å3008h577û[{¿R?®\¯Oû Ù—þþ~?tmÚ´ÉßF•}LT…ÏŸïó¹Þ¿Ðã }½Zj?W $ Ùq!¢ÙP¦ÝÔÔäoíýb_éUKíçJ€úúúüЗo…Cb8D†kãÆþVJŽ5Û$ççûyì÷°ß{¹¯_ËíçJ€z{{cC`¸Â!-||¡ÏZù¾~±¯æös¥@Â]¬GUoq¥!›"d@ùôôô¨ÆÆÆYþÌÖÞ_©í©äös¥@•‡lª|Å• ¤R)ÕÐРkýúõþÖÞ7Ïçª\ççûúù¾^¸Šýùk¹ý\)@gg§ª¯¯§¨¬Å•„lŠ „lŠ „lŠ „lŠ"d!›"d!›"d!›"dÙ!Ù!Ù!ÙEÈB6EÈB6EÈB6EÈB6!’"d!›"d!›"d@Õùó :àFh½WIEND®B`‚perldoc-html/static/combined-20100403.js000644 000765 000024 00000251552 12276001417 017524 0ustar00jjstaff000000 000000 //MooTools, , My Object Oriented (JavaScript) Tools. Copyright (c) 2006-2009 Valerio Proietti, , MIT Style License. 
var MooTools={version:"1.2.3",build:"4980aa0fb74d2f6eb80bcd9f5b8e1fd6fbb8f607"};var Native=function(k){k=k||{};var a=k.name;var i=k.legacy;var b=k.protect; var c=k.implement;var h=k.generics;var f=k.initialize;var g=k.afterImplement||function(){};var d=f||i;h=h!==false;d.constructor=Native;d.$family={name:"native"}; if(i&&f){d.prototype=i.prototype;}d.prototype.constructor=d;if(a){var e=a.toLowerCase();d.prototype.$family={name:e};Native.typize(d,e);}var j=function(n,l,o,m){if(!b||m||!n.prototype[l]){n.prototype[l]=o; }if(h){Native.genericize(n,l,b);}g.call(n,l,o);return n;};d.alias=function(n,l,p){if(typeof n=="string"){var o=this.prototype[n];if((n=o)){return j(this,l,n,p); }}for(var m in n){this.alias(m,n[m],l);}return this;};d.implement=function(m,l,o){if(typeof m=="string"){return j(this,m,l,o);}for(var n in m){j(this,n,m[n],l); }return this;};if(c){d.implement(c);}return d;};Native.genericize=function(b,c,a){if((!a||!b[c])&&typeof b.prototype[c]=="function"){b[c]=function(){var d=Array.prototype.slice.call(arguments); return b.prototype[c].apply(d.shift(),d);};}};Native.implement=function(d,c){for(var b=0,a=d.length;b-1:this.indexOf(a)>-1;},trim:function(){return this.replace(/^\s+|\s+$/g,"");},clean:function(){return this.replace(/\s+/g," ").trim(); },camelCase:function(){return this.replace(/-\D/g,function(a){return a.charAt(1).toUpperCase();});},hyphenate:function(){return this.replace(/[A-Z]/g,function(a){return("-"+a.charAt(0).toLowerCase()); });},capitalize:function(){return this.replace(/\b[a-z]/g,function(a){return a.toUpperCase();});},escapeRegExp:function(){return this.replace(/([-.*+?^${}()|[\]\/\\])/g,"\\$1"); },toInt:function(a){return parseInt(this,a||10);},toFloat:function(){return parseFloat(this);},hexToRgb:function(b){var a=this.match(/^#?(\w{1,2})(\w{1,2})(\w{1,2})$/); return(a)?a.slice(1).hexToRgb(b):null;},rgbToHex:function(b){var a=this.match(/\d{1,3}/g);return(a)?a.rgbToHex(b):null;},stripScripts:function(b){var a=""; var 
c=this.replace(/<script[^>]*>([\s\S]*?)<\/script>/gi,function(){a+=arguments[1]+"\n";return"";});if(b===true){$exec(a);}else{if($type(b)=="function"){b(a,c); }}return c;},substitute:function(a,b){return this.replace(b||(/\\?\{([^{}]+)\}/g),function(d,c){if(d.charAt(0)=="\\"){return d.slice(1);}return(a[c]!=undefined)?a[c]:""; });}});Hash.implement({has:Object.prototype.hasOwnProperty,keyOf:function(b){for(var a in this){if(this.hasOwnProperty(a)&&this[a]===b){return a;}}return null; },hasValue:function(a){return(Hash.keyOf(this,a)!==null);},extend:function(a){Hash.each(a||{},function(c,b){Hash.set(this,b,c);},this);return this;},combine:function(a){Hash.each(a||{},function(c,b){Hash.include(this,b,c); },this);return this;},erase:function(a){if(this.hasOwnProperty(a)){delete this[a];}return this;},get:function(a){return(this.hasOwnProperty(a))?this[a]:null; },set:function(a,b){if(!this[a]||this.hasOwnProperty(a)){this[a]=b;}return this;},empty:function(){Hash.each(this,function(b,a){delete this[a];},this); return this;},include:function(a,b){if(this[a]==undefined){this[a]=b;}return this;},map:function(b,c){var a=new Hash;Hash.each(this,function(e,d){a.set(d,b.call(c,e,d,this)); },this);return a;},filter:function(b,c){var a=new Hash;Hash.each(this,function(e,d){if(b.call(c,e,d,this)){a.set(d,e);}},this);return a;},every:function(b,c){for(var a in this){if(this.hasOwnProperty(a)&&!b.call(c,this[a],a)){return false; }}return true;},some:function(b,c){for(var a in this){if(this.hasOwnProperty(a)&&b.call(c,this[a],a)){return true;}}return false;},getKeys:function(){var a=[]; Hash.each(this,function(c,b){a.push(b);});return a;},getValues:function(){var a=[];Hash.each(this,function(b){a.push(b);});return a;},toQueryString:function(a){var b=[]; Hash.each(this,function(f,e){if(a){e=a+"["+e+"]";}var d;switch($type(f)){case"object":d=Hash.toQueryString(f,e);break;case"array":var c={};f.each(function(h,g){c[g]=h; 
});d=Hash.toQueryString(c,e);break;default:d=e+"="+encodeURIComponent(f);}if(f!=undefined){b.push(d);}});return b.join("&");}});Hash.alias({keyOf:"indexOf",hasValue:"contains"}); var Event=new Native({name:"Event",initialize:function(a,f){f=f||window;var k=f.document;a=a||f.event;if(a.$extended){return a;}this.$extended=true;var j=a.type; var g=a.target||a.srcElement;while(g&&g.nodeType==3){g=g.parentNode;}if(j.test(/key/)){var b=a.which||a.keyCode;var m=Event.Keys.keyOf(b);if(j=="keydown"){var d=b-111; if(d>0&&d<13){m="f"+d;}}m=m||String.fromCharCode(b).toLowerCase();}else{if(j.match(/(click|mouse|menu)/i)){k=(!k.compatMode||k.compatMode=="CSS1Compat")?k.html:k.body; var i={x:a.pageX||a.clientX+k.scrollLeft,y:a.pageY||a.clientY+k.scrollTop};var c={x:(a.pageX)?a.pageX-f.pageXOffset:a.clientX,y:(a.pageY)?a.pageY-f.pageYOffset:a.clientY}; if(j.match(/DOMMouseScroll|mousewheel/)){var h=(a.wheelDelta)?a.wheelDelta/120:-(a.detail||0)/3;}var e=(a.which==3)||(a.button==2);var l=null;if(j.match(/over|out/)){switch(j){case"mouseover":l=a.relatedTarget||a.fromElement; break;case"mouseout":l=a.relatedTarget||a.toElement;}if(!(function(){while(l&&l.nodeType==3){l=l.parentNode;}return true;}).create({attempt:Browser.Engine.gecko})()){l=false; }}}}return $extend(this,{event:a,type:j,page:i,client:c,rightClick:e,wheel:h,relatedTarget:l,target:g,code:b,key:m,shift:a.shiftKey,control:a.ctrlKey,alt:a.altKey,meta:a.metaKey}); }});Event.Keys=new Hash({enter:13,up:38,down:40,left:37,right:39,esc:27,space:32,backspace:8,tab:9,"delete":46});Event.implement({stop:function(){return this.stopPropagation().preventDefault(); },stopPropagation:function(){if(this.event.stopPropagation){this.event.stopPropagation();}else{this.event.cancelBubble=true;}return this;},preventDefault:function(){if(this.event.preventDefault){this.event.preventDefault(); }else{this.event.returnValue=false;}return this;}});function Class(b){if(b instanceof Function){b={initialize:b};}var 
a=function(){Object.reset(this);if(a._prototyping){return this; }this._current=$empty;var c=(this.initialize)?this.initialize.apply(this,arguments):this;delete this._current;delete this.caller;return c;}.extend(this); a.implement(b);a.constructor=Class;a.prototype.constructor=a;return a;}Function.prototype.protect=function(){this._protected=true;return this;};Object.reset=function(a,c){if(c==null){for(var e in a){Object.reset(a,e); }return a;}delete a[c];switch($type(a[c])){case"object":var d=function(){};d.prototype=a[c];var b=new d;a[c]=Object.reset(b);break;case"array":a[c]=$unlink(a[c]); break;}return a;};new Native({name:"Class",initialize:Class}).extend({instantiate:function(b){b._prototyping=true;var a=new b;delete b._prototyping;return a; },wrap:function(a,b,c){if(c._origin){c=c._origin;}return function(){if(c._protected&&this._current==null){throw new Error('The method "'+b+'" cannot be called.'); }var e=this.caller,f=this._current;this.caller=f;this._current=arguments.callee;var d=c.apply(this,arguments);this._current=f;this.caller=e;return d;}.extend({_owner:a,_origin:c,_name:b}); }});Class.implement({implement:function(a,d){if($type(a)=="object"){for(var e in a){this.implement(e,a[e]);}return this;}var f=Class.Mutators[a];if(f){d=f.call(this,d); if(d==null){return this;}}var c=this.prototype;switch($type(d)){case"function":if(d._hidden){return this;}c[a]=Class.wrap(this,a,d);break;case"object":var b=c[a]; if($type(b)=="object"){$mixin(b,d);}else{c[a]=$unlink(d);}break;case"array":c[a]=$unlink(d);break;default:c[a]=d;}return this;}});Class.Mutators={Extends:function(a){this.parent=a; this.prototype=Class.instantiate(a);this.implement("parent",function(){var b=this.caller._name,c=this.caller._owner.parent.prototype[b];if(!c){throw new Error('The method "'+b+'" has no parent.'); }return c.apply(this,arguments);}.protect());},Implements:function(a){$splat(a).each(function(b){if(b instanceof Function){b=Class.instantiate(b);}this.implement(b); },this);}};var 
Chain=new Class({$chain:[],chain:function(){this.$chain.extend(Array.flatten(arguments));return this;},callChain:function(){return(this.$chain.length)?this.$chain.shift().apply(this,arguments):false; },clearChain:function(){this.$chain.empty();return this;}});var Events=new Class({$events:{},addEvent:function(c,b,a){c=Events.removeOn(c);if(b!=$empty){this.$events[c]=this.$events[c]||[]; this.$events[c].include(b);if(a){b.internal=true;}}return this;},addEvents:function(a){for(var b in a){this.addEvent(b,a[b]);}return this;},fireEvent:function(c,b,a){c=Events.removeOn(c); if(!this.$events||!this.$events[c]){return this;}this.$events[c].each(function(d){d.create({bind:this,delay:a,"arguments":b})();},this);return this;},removeEvent:function(b,a){b=Events.removeOn(b); if(!this.$events[b]){return this;}if(!a.internal){this.$events[b].erase(a);}return this;},removeEvents:function(c){var d;if($type(c)=="object"){for(d in c){this.removeEvent(d,c[d]); }return this;}if(c){c=Events.removeOn(c);}for(d in this.$events){if(c&&c!=d){continue;}var b=this.$events[d];for(var a=b.length;a--;a){this.removeEvent(d,b[a]); }}return this;}});Events.removeOn=function(a){return a.replace(/^on([A-Z])/,function(b,c){return c.toLowerCase();});};var Options=new Class({setOptions:function(){this.options=$merge.run([this.options].extend(arguments)); if(!this.addEvent){return this;}for(var a in this.options){if($type(this.options[a])!="function"||!(/^on[A-Z]/).test(a)){continue;}this.addEvent(a,this.options[a]); delete this.options[a];}return this;}});var Element=new Native({name:"Element",legacy:window.Element,initialize:function(a,b){var c=Element.Constructors.get(a); if(c){return c(b);}if(typeof a=="string"){return document.newElement(a,b);}return document.id(a).set(b);},afterImplement:function(a,b){Element.Prototype[a]=b; if(Array[a]){return;}Elements.implement(a,function(){var c=[],g=true;for(var e=0,d=this.length;e";}return 
document.id(this.createElement(a)).set(b);},newTextNode:function(a){return this.createTextNode(a); },getDocument:function(){return this;},getWindow:function(){return this.window;},id:(function(){var a={string:function(d,c,b){d=b.getElementById(d);return(d)?a.element(d,c):null; },element:function(b,e){$uid(b);if(!e&&!b.$family&&!(/^object|embed$/i).test(b.tagName)){var c=Element.Prototype;for(var d in c){b[d]=c[d];}}return b;},object:function(c,d,b){if(c.toElement){return a.element(c.toElement(b),d); }return null;}};a.textnode=a.whitespace=a.window=a.document=$arguments(0);return function(c,e,d){if(c&&c.$family&&c.uid){return c;}var b=$type(c);return(a[b])?a[b](c,e,d||document):null; };})()});if(window.$==null){Window.implement({$:function(a,b){return document.id(a,b,this.document);}});}Window.implement({$$:function(a){if(arguments.length==1&&typeof a=="string"){return this.document.getElements(a); }var f=[];var c=Array.flatten(arguments);for(var d=0,b=c.length;d1);a.each(function(e){var f=this.getElementsByTagName(e.trim());(b)?c.extend(f):c=f; },this);return new Elements(c,{ddup:b,cash:!d});}});(function(){var h={},f={};var i={input:"checked",option:"selected",textarea:(Browser.Engine.webkit&&Browser.Engine.version<420)?"innerHTML":"value"}; var c=function(l){return(f[l]||(f[l]={}));};var g=function(n,l){if(!n){return;}var m=n.uid;if(Browser.Engine.trident){if(n.clearAttributes){var q=l&&n.cloneNode(false); n.clearAttributes();if(q){n.mergeAttributes(q);}}else{if(n.removeEvents){n.removeEvents();}}if((/object/i).test(n.tagName)){for(var o in n){if(typeof n[o]=="function"){n[o]=$empty; }}Element.dispose(n);}}if(!m){return;}h[m]=f[m]=null;};var d=function(){Hash.each(h,g);if(Browser.Engine.trident){$A(document.getElementsByTagName("object")).each(g); }if(window.CollectGarbage){CollectGarbage();}h=f=null;};var j=function(n,l,s,m,p,r){var o=n[s||l];var q=[];while(o){if(o.nodeType==1&&(!m||Element.match(o,m))){if(!p){return document.id(o,r); 
}q.push(o);}o=o[l];}return(p)?new Elements(q,{ddup:false,cash:!r}):null;};var e={html:"innerHTML","class":"className","for":"htmlFor",defaultValue:"defaultValue",text:(Browser.Engine.trident||(Browser.Engine.webkit&&Browser.Engine.version<420))?"innerText":"textContent"}; var b=["compact","nowrap","ismap","declare","noshade","checked","disabled","readonly","multiple","selected","noresize","defer"];var k=["value","type","defaultValue","accessKey","cellPadding","cellSpacing","colSpan","frameBorder","maxLength","readOnly","rowSpan","tabIndex","useMap"]; b=b.associate(b);Hash.extend(e,b);Hash.extend(e,k.associate(k.map(String.toLowerCase)));var a={before:function(m,l){if(l.parentNode){l.parentNode.insertBefore(m,l); }},after:function(m,l){if(!l.parentNode){return;}var n=l.nextSibling;(n)?l.parentNode.insertBefore(m,n):l.parentNode.appendChild(m);},bottom:function(m,l){l.appendChild(m); },top:function(m,l){var n=l.firstChild;(n)?l.insertBefore(m,n):l.appendChild(m);}};a.inside=a.bottom;Hash.each(a,function(l,m){m=m.capitalize();Element.implement("inject"+m,function(n){l(this,document.id(n,true)); return this;});Element.implement("grab"+m,function(n){l(document.id(n,true),this);return this;});});Element.implement({set:function(o,m){switch($type(o)){case"object":for(var n in o){this.set(n,o[n]); }break;case"string":var l=Element.Properties.get(o);(l&&l.set)?l.set.apply(this,Array.slice(arguments,1)):this.setProperty(o,m);}return this;},get:function(m){var l=Element.Properties.get(m); return(l&&l.get)?l.get.apply(this,Array.slice(arguments,1)):this.getProperty(m);},erase:function(m){var l=Element.Properties.get(m);(l&&l.erase)?l.erase.apply(this):this.removeProperty(m); return this;},setProperty:function(m,n){var l=e[m];if(n==undefined){return this.removeProperty(m);}if(l&&b[m]){n=!!n;}(l)?this[l]=n:this.setAttribute(m,""+n); return this;},setProperties:function(l){for(var m in l){this.setProperty(m,l[m]);}return this;},getProperty:function(m){var l=e[m];var 
n=(l)?this[l]:this.getAttribute(m,2); return(b[m])?!!n:(l)?n:n||null;},getProperties:function(){var l=$A(arguments);return l.map(this.getProperty,this).associate(l);},removeProperty:function(m){var l=e[m]; (l)?this[l]=(l&&b[m])?false:"":this.removeAttribute(m);return this;},removeProperties:function(){Array.each(arguments,this.removeProperty,this);return this; },hasClass:function(l){return this.className.contains(l," ");},addClass:function(l){if(!this.hasClass(l)){this.className=(this.className+" "+l).clean(); }return this;},removeClass:function(l){this.className=this.className.replace(new RegExp("(^|\\s)"+l+"(?:\\s|$)"),"$1");return this;},toggleClass:function(l){return this.hasClass(l)?this.removeClass(l):this.addClass(l); },adopt:function(){Array.flatten(arguments).each(function(l){l=document.id(l,true);if(l){this.appendChild(l);}},this);return this;},appendText:function(m,l){return this.grab(this.getDocument().newTextNode(m),l); },grab:function(m,l){a[l||"bottom"](document.id(m,true),this);return this;},inject:function(m,l){a[l||"bottom"](this,document.id(m,true));return this;},replaces:function(l){l=document.id(l,true); l.parentNode.replaceChild(this,l);return this;},wraps:function(m,l){m=document.id(m,true);return this.replaces(m).grab(m,l);},getPrevious:function(l,m){return j(this,"previousSibling",null,l,false,m); },getAllPrevious:function(l,m){return j(this,"previousSibling",null,l,true,m);},getNext:function(l,m){return j(this,"nextSibling",null,l,false,m);},getAllNext:function(l,m){return j(this,"nextSibling",null,l,true,m); },getFirst:function(l,m){return j(this,"nextSibling","firstChild",l,false,m);},getLast:function(l,m){return j(this,"previousSibling","lastChild",l,false,m); },getParent:function(l,m){return j(this,"parentNode",null,l,false,m);},getParents:function(l,m){return j(this,"parentNode",null,l,true,m);},getSiblings:function(l,m){return this.getParent().getChildren(l,m).erase(this); },getChildren:function(l,m){return 
j(this,"nextSibling","firstChild",l,true,m);},getWindow:function(){return this.ownerDocument.window;},getDocument:function(){return this.ownerDocument; },getElementById:function(o,n){var m=this.ownerDocument.getElementById(o);if(!m){return null;}for(var l=m.parentNode;l!=this;l=l.parentNode){if(!l){return null; }}return document.id(m,n);},getSelected:function(){return new Elements($A(this.options).filter(function(l){return l.selected;}));},getComputedStyle:function(m){if(this.currentStyle){return this.currentStyle[m.camelCase()]; }var l=this.getDocument().defaultView.getComputedStyle(this,null);return(l)?l.getPropertyValue([m.hyphenate()]):null;},toQueryString:function(){var l=[]; this.getElements("input, select, textarea",true).each(function(m){if(!m.name||m.disabled||m.type=="submit"||m.type=="reset"||m.type=="file"){return;}var n=(m.tagName.toLowerCase()=="select")?Element.getSelected(m).map(function(o){return o.value; }):((m.type=="radio"||m.type=="checkbox")&&!m.checked)?null:m.value;$splat(n).each(function(o){if(typeof o!="undefined"){l.push(m.name+"="+encodeURIComponent(o)); }});});return l.join("&");},clone:function(o,l){o=o!==false;var r=this.cloneNode(o);var n=function(v,u){if(!l){v.removeAttribute("id");}if(Browser.Engine.trident){v.clearAttributes(); v.mergeAttributes(u);v.removeAttribute("uid");if(v.options){var w=v.options,s=u.options;for(var t=w.length;t--;){w[t].selected=s[t].selected;}}}var x=i[u.tagName.toLowerCase()]; if(x&&u[x]){v[x]=u[x];}};if(o){var p=r.getElementsByTagName("*"),q=this.getElementsByTagName("*");for(var m=p.length;m--;){n(p[m],q[m]);}}n(r,this);return document.id(r); },destroy:function(){Element.empty(this);Element.dispose(this);g(this,true);return null;},empty:function(){$A(this.childNodes).each(function(l){Element.destroy(l); });return this;},dispose:function(){return(this.parentNode)?this.parentNode.removeChild(this):this;},hasChild:function(l){l=document.id(l,true);if(!l){return false; 
}if(Browser.Engine.webkit&&Browser.Engine.version<420){return $A(this.getElementsByTagName(l.tagName)).contains(l);}return(this.contains)?(this!=l&&this.contains(l)):!!(this.compareDocumentPosition(l)&16); },match:function(l){return(!l||(l==this)||(Element.get(this,"tag")==l));}});Native.implement([Element,Window,Document],{addListener:function(o,n){if(o=="unload"){var l=n,m=this; n=function(){m.removeListener("unload",n);l();};}else{h[this.uid]=this;}if(this.addEventListener){this.addEventListener(o,n,false);}else{this.attachEvent("on"+o,n); }return this;},removeListener:function(m,l){if(this.removeEventListener){this.removeEventListener(m,l,false);}else{this.detachEvent("on"+m,l);}return this; },retrieve:function(m,l){var o=c(this.uid),n=o[m];if(l!=undefined&&n==undefined){n=o[m]=l;}return $pick(n);},store:function(m,l){var n=c(this.uid);n[m]=l; return this;},eliminate:function(l){var m=c(this.uid);delete m[l];return this;}});window.addListener("unload",d);})();Element.Properties=new Hash;Element.Properties.style={set:function(a){this.style.cssText=a; },get:function(){return this.style.cssText;},erase:function(){this.style.cssText="";}};Element.Properties.tag={get:function(){return this.tagName.toLowerCase(); }};Element.Properties.html=(function(){var c=document.createElement("div");var a={table:[1,"<table>","</table>"],select:[1,"<select>","</select>"],tbody:[2,"<table><tbody>","</tbody></table>"],tr:[3,"<table><tbody><tr>","</tr></tbody></table>"]};
a.thead=a.tfoot=a.tbody;var b={set:function(){var e=Array.flatten(arguments).join("");var f=Browser.Engine.trident&&a[this.get("tag")];if(f){var g=c;g.innerHTML=f[1]+e+f[2]; for(var d=f[0];d--;){g=g.firstChild;}this.empty().adopt(g.childNodes);}else{this.innerHTML=e;}}};b.erase=b.set;return b;})();if(Browser.Engine.webkit&&Browser.Engine.version<420){Element.Properties.text={get:function(){if(this.innerText){return this.innerText; }var a=this.ownerDocument.newElement("div",{html:this.innerHTML}).inject(this.ownerDocument.body);var b=a.innerText;a.destroy();return b;}};}Element.Properties.events={set:function(a){this.addEvents(a); }};Native.implement([Element,Window,Document],{addEvent:function(e,g){var h=this.retrieve("events",{});h[e]=h[e]||{keys:[],values:[]};if(h[e].keys.contains(g)){return this; }h[e].keys.push(g);var f=e,a=Element.Events.get(e),c=g,i=this;if(a){if(a.onAdd){a.onAdd.call(this,g);}if(a.condition){c=function(j){if(a.condition.call(this,j)){return g.call(this,j); }return true;};}f=a.base||f;}var d=function(){return g.call(i);};var b=Element.NativeEvents[f];if(b){if(b==2){d=function(j){j=new Event(j,i.getWindow()); if(c.call(i,j)===false){j.stop();}};}this.addListener(f,d);}h[e].values.push(d);return this;},removeEvent:function(c,b){var a=this.retrieve("events");if(!a||!a[c]){return this; }var f=a[c].keys.indexOf(b);if(f==-1){return this;}a[c].keys.splice(f,1);var e=a[c].values.splice(f,1)[0];var d=Element.Events.get(c);if(d){if(d.onRemove){d.onRemove.call(this,b); }c=d.base||c;}return(Element.NativeEvents[c])?this.removeListener(c,e):this;},addEvents:function(a){for(var b in a){this.addEvent(b,a[b]);}return this; },removeEvents:function(a){var c;if($type(a)=="object"){for(c in a){this.removeEvent(c,a[c]);}return this;}var b=this.retrieve("events");if(!b){return this; }if(!a){for(c in b){this.removeEvents(c);}this.eliminate("events");}else{if(b[a]){while(b[a].keys[0]){this.removeEvent(a,b[a].keys[0]);}b[a]=null;}}return this; 
},fireEvent:function(d,b,a){var c=this.retrieve("events");if(!c||!c[d]){return this;}c[d].keys.each(function(e){e.create({bind:this,delay:a,"arguments":b})(); },this);return this;},cloneEvents:function(d,a){d=document.id(d);var c=d.retrieve("events");if(!c){return this;}if(!a){for(var b in c){this.cloneEvents(d,b); }}else{if(c[a]){c[a].keys.each(function(e){this.addEvent(a,e);},this);}}return this;}});Element.NativeEvents={click:2,dblclick:2,mouseup:2,mousedown:2,contextmenu:2,mousewheel:2,DOMMouseScroll:2,mouseover:2,mouseout:2,mousemove:2,selectstart:2,selectend:2,keydown:2,keypress:2,keyup:2,focus:2,blur:2,change:2,reset:2,select:2,submit:2,load:1,unload:1,beforeunload:2,resize:1,move:1,DOMContentLoaded:1,readystatechange:1,error:1,abort:1,scroll:1}; (function(){var a=function(b){var c=b.relatedTarget;if(c==undefined){return true;}if(c===false){return false;}return($type(this)!="document"&&c!=this&&c.prefix!="xul"&&!this.hasChild(c)); };Element.Events=new Hash({mouseenter:{base:"mouseover",condition:a},mouseleave:{base:"mouseout",condition:a},mousewheel:{base:(Browser.Engine.gecko)?"DOMMouseScroll":"mousewheel"}}); })();Element.Properties.styles={set:function(a){this.setStyles(a);}};Element.Properties.opacity={set:function(a,b){if(!b){if(a==0){if(this.style.visibility!="hidden"){this.style.visibility="hidden"; }}else{if(this.style.visibility!="visible"){this.style.visibility="visible";}}}if(!this.currentStyle||!this.currentStyle.hasLayout){this.style.zoom=1;}if(Browser.Engine.trident){this.style.filter=(a==1)?"":"alpha(opacity="+a*100+")"; }this.style.opacity=a;this.store("opacity",a);},get:function(){return this.retrieve("opacity",1);}};Element.implement({setOpacity:function(a){return this.set("opacity",a,true); },getOpacity:function(){return this.get("opacity");},setStyle:function(b,a){switch(b){case"opacity":return this.set("opacity",parseFloat(a));case"float":b=(Browser.Engine.trident)?"styleFloat":"cssFloat"; }b=b.camelCase();if($type(a)!="string"){var 
c=(Element.Styles.get(b)||"@").split(" ");a=$splat(a).map(function(e,d){if(!c[d]){return"";}return($type(e)=="number")?c[d].replace("@",Math.round(e)):e; }).join(" ");}else{if(a==String(Number(a))){a=Math.round(a);}}this.style[b]=a;return this;},getStyle:function(g){switch(g){case"opacity":return this.get("opacity"); case"float":g=(Browser.Engine.trident)?"styleFloat":"cssFloat";}g=g.camelCase();var a=this.style[g];if(!$chk(a)){a=[];for(var f in Element.ShortStyles){if(g!=f){continue; }for(var e in Element.ShortStyles[f]){a.push(this.getStyle(e));}return a.join(" ");}a=this.getComputedStyle(g);}if(a){a=String(a);var c=a.match(/rgba?\([\d\s,]+\)/); if(c){a=a.replace(c[0],c[0].rgbToHex());}}if(Browser.Engine.presto||(Browser.Engine.trident&&!$chk(parseInt(a,10)))){if(g.test(/^(height|width)$/)){var b=(g=="width")?["left","right"]:["top","bottom"],d=0; b.each(function(h){d+=this.getStyle("border-"+h+"-width").toInt()+this.getStyle("padding-"+h).toInt();},this);return this["offset"+g.capitalize()]-d+"px"; }if((Browser.Engine.presto)&&String(a).test("px")){return a;}if(g.test(/(border(.+)Width|margin|padding)/)){return"0px";}}return a;},setStyles:function(b){for(var a in b){this.setStyle(a,b[a]); }return this;},getStyles:function(){var a={};Array.flatten(arguments).each(function(b){a[b]=this.getStyle(b);},this);return a;}});Element.Styles=new Hash({left:"@px",top:"@px",bottom:"@px",right:"@px",width:"@px",height:"@px",maxWidth:"@px",maxHeight:"@px",minWidth:"@px",minHeight:"@px",backgroundColor:"rgb(@, @, @)",backgroundPosition:"@px @px",color:"rgb(@, @, @)",fontSize:"@px",letterSpacing:"@px",lineHeight:"@px",clip:"rect(@px @px @px @px)",margin:"@px @px @px @px",padding:"@px @px @px @px",border:"@px @ rgb(@, @, @) @px @ rgb(@, @, @) @px @ rgb(@, @, @)",borderWidth:"@px @px @px @px",borderStyle:"@ @ @ @",borderColor:"rgb(@, @, @) rgb(@, @, @) rgb(@, @, @) rgb(@, @, @)",zIndex:"@",zoom:"@",fontWeight:"@",textIndent:"@px",opacity:"@"}); 
Element.ShortStyles={margin:{},padding:{},border:{},borderWidth:{},borderStyle:{},borderColor:{}};["Top","Right","Bottom","Left"].each(function(g){var f=Element.ShortStyles; var b=Element.Styles;["margin","padding"].each(function(h){var i=h+g;f[h][i]=b[i]="@px";});var e="border"+g;f.border[e]=b[e]="@px @ rgb(@, @, @)";var d=e+"Width",a=e+"Style",c=e+"Color"; f[e]={};f.borderWidth[d]=f[e][d]=b[d]="@px";f.borderStyle[a]=f[e][a]=b[a]="@";f.borderColor[c]=f[e][c]=b[c]="rgb(@, @, @)";});(function(){Element.implement({scrollTo:function(h,i){if(b(this)){this.getWindow().scrollTo(h,i); }else{this.scrollLeft=h;this.scrollTop=i;}return this;},getSize:function(){if(b(this)){return this.getWindow().getSize();}return{x:this.offsetWidth,y:this.offsetHeight}; },getScrollSize:function(){if(b(this)){return this.getWindow().getScrollSize();}return{x:this.scrollWidth,y:this.scrollHeight};},getScroll:function(){if(b(this)){return this.getWindow().getScroll(); }return{x:this.scrollLeft,y:this.scrollTop};},getScrolls:function(){var i=this,h={x:0,y:0};while(i&&!b(i)){h.x+=i.scrollLeft;h.y+=i.scrollTop;i=i.parentNode; }return h;},getOffsetParent:function(){var h=this;if(b(h)){return null;}if(!Browser.Engine.trident){return h.offsetParent;}while((h=h.parentNode)&&!b(h)){if(d(h,"position")!="static"){return h; }}return null;},getOffsets:function(){if(this.getBoundingClientRect){var m=this.getBoundingClientRect(),k=document.id(this.getDocument().documentElement),i=k.getScroll(),n=(d(this,"position")=="fixed"); return{x:parseInt(m.left,10)+((n)?0:i.x)-k.clientLeft,y:parseInt(m.top,10)+((n)?0:i.y)-k.clientTop};}var j=this,h={x:0,y:0};if(b(this)){return h;}while(j&&!b(j)){h.x+=j.offsetLeft; h.y+=j.offsetTop;if(Browser.Engine.gecko){if(!f(j)){h.x+=c(j);h.y+=g(j);}var l=j.parentNode;if(l&&d(l,"overflow")!="visible"){h.x+=c(l);h.y+=g(l);}}else{if(j!=this&&Browser.Engine.webkit){h.x+=c(j); h.y+=g(j);}}j=j.offsetParent;}if(Browser.Engine.gecko&&!f(this)){h.x-=c(this);h.y-=g(this);}return 
h;},getPosition:function(k){if(b(this)){return{x:0,y:0}; }var l=this.getOffsets(),i=this.getScrolls();var h={x:l.x-i.x,y:l.y-i.y};var j=(k&&(k=document.id(k)))?k.getPosition():{x:0,y:0};return{x:h.x-j.x,y:h.y-j.y}; },getCoordinates:function(j){if(b(this)){return this.getWindow().getCoordinates();}var h=this.getPosition(j),i=this.getSize();var k={left:h.x,top:h.y,width:i.x,height:i.y}; k.right=k.left+k.width;k.bottom=k.top+k.height;return k;},computePosition:function(h){return{left:h.x-e(this,"margin-left"),top:h.y-e(this,"margin-top")}; },setPosition:function(h){return this.setStyles(this.computePosition(h));}});Native.implement([Document,Window],{getSize:function(){if(Browser.Engine.presto||Browser.Engine.webkit){var i=this.getWindow(); return{x:i.innerWidth,y:i.innerHeight};}var h=a(this);return{x:h.clientWidth,y:h.clientHeight};},getScroll:function(){var i=this.getWindow(),h=a(this); return{x:i.pageXOffset||h.scrollLeft,y:i.pageYOffset||h.scrollTop};},getScrollSize:function(){var i=a(this),h=this.getSize();return{x:Math.max(i.scrollWidth,h.x),y:Math.max(i.scrollHeight,h.y)}; },getPosition:function(){return{x:0,y:0};},getCoordinates:function(){var h=this.getSize();return{top:0,left:0,bottom:h.y,right:h.x,height:h.y,width:h.x}; }});var d=Element.getComputedStyle;function e(h,i){return d(h,i).toInt()||0;}function f(h){return d(h,"-moz-box-sizing")=="border-box";}function g(h){return e(h,"border-top-width"); }function c(h){return e(h,"border-left-width");}function b(h){return(/^(?:body|html)$/i).test(h.tagName);}function a(h){var i=h.getDocument();return(!i.compatMode||i.compatMode=="CSS1Compat")?i.html:i.body; }})();Element.alias("setPosition","position");Native.implement([Window,Document,Element],{getHeight:function(){return this.getSize().y;},getWidth:function(){return this.getSize().x; },getScrollTop:function(){return this.getScroll().y;},getScrollLeft:function(){return this.getScroll().x;},getScrollHeight:function(){return this.getScrollSize().y; 
},getScrollWidth:function(){return this.getScrollSize().x;},getTop:function(){return this.getPosition().y;},getLeft:function(){return this.getPosition().x; }});Element.Events.domready={onAdd:function(a){if(Browser.loaded){a.call(this);}}};(function(){var b=function(){if(Browser.loaded){return;}Browser.loaded=true; window.fireEvent("domready");document.fireEvent("domready");};if(Browser.Engine.trident){var a=document.createElement("div");(function(){($try(function(){a.doScroll(); return document.id(a).inject(document.body).set("html","temp").dispose();}))?b():arguments.callee.delay(50);})();}else{if(Browser.Engine.webkit&&Browser.Engine.version<525){(function(){(["loaded","complete"].contains(document.readyState))?b():arguments.callee.delay(50); })();}else{window.addEvent("load",b);document.addEvent("DOMContentLoaded",b);}}})();var JSON=new Hash({$specialChars:{"\b":"\\b","\t":"\\t","\n":"\\n","\f":"\\f","\r":"\\r",'"':'\\"',"\\":"\\\\"},$replaceChars:function(a){return JSON.$specialChars[a]||"\\u00"+Math.floor(a.charCodeAt()/16).toString(16)+(a.charCodeAt()%16).toString(16); },encode:function(b){switch($type(b)){case"string":return'"'+b.replace(/[\x00-\x1f\\"]/g,JSON.$replaceChars)+'"';case"array":return"["+String(b.map(JSON.encode).clean())+"]"; case"object":case"hash":var a=[];Hash.each(b,function(e,d){var c=JSON.encode(e);if(c){a.push(JSON.encode(d)+":"+c);}});return"{"+a+"}";case"number":case"boolean":return String(b); case false:return"null";}return null;},decode:function(string,secure){if($type(string)!="string"||!string.length){return null;}if(secure&&!(/^[,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]*$/).test(string.replace(/\\./g,"@").replace(/"[^"\\\n\r]*"/g,""))){return null; }return eval("("+string+")");}});Native.implement([Hash,Array,String,Number],{toJSON:function(){return JSON.encode(this);}});var Cookie=new Class({Implements:Options,options:{path:false,domain:false,duration:false,secure:false,document:document},initialize:function(b,a){this.key=b; 
this.setOptions(a);},write:function(b){b=encodeURIComponent(b);if(this.options.domain){b+="; domain="+this.options.domain;}if(this.options.path){b+="; path="+this.options.path; }if(this.options.duration){var a=new Date();a.setTime(a.getTime()+this.options.duration*24*60*60*1000);b+="; expires="+a.toGMTString();}if(this.options.secure){b+="; secure"; }this.options.document.cookie=this.key+"="+b;return this;},read:function(){var a=this.options.document.cookie.match("(?:^|;)\\s*"+this.key.escapeRegExp()+"=([^;]*)"); return(a)?decodeURIComponent(a[1]):null;},dispose:function(){new Cookie(this.key,$merge(this.options,{duration:-1})).write("");return this;}});Cookie.write=function(b,c,a){return new Cookie(b,a).write(c); };Cookie.read=function(a){return new Cookie(a).read();};Cookie.dispose=function(b,a){return new Cookie(b,a).dispose();};var Request=new Class({Implements:[Chain,Events,Options],options:{url:"",data:"",headers:{"X-Requested-With":"XMLHttpRequest",Accept:"text/javascript, text/html, application/xml, text/xml, */*"},async:true,format:false,method:"post",link:"ignore",isSuccess:null,emulation:true,urlEncoded:true,encoding:"utf-8",evalScripts:false,evalResponse:false,noCache:false},initialize:function(a){this.xhr=new Browser.Request(); this.setOptions(a);this.options.isSuccess=this.options.isSuccess||this.isSuccess;this.headers=new Hash(this.options.headers);},onStateChange:function(){if(this.xhr.readyState!=4||!this.running){return; }this.running=false;this.status=0;$try(function(){this.status=this.xhr.status;}.bind(this));this.xhr.onreadystatechange=$empty;if(this.options.isSuccess.call(this,this.status)){this.response={text:this.xhr.responseText,xml:this.xhr.responseXML}; this.success(this.response.text,this.response.xml);}else{this.response={text:null,xml:null};this.failure();}},isSuccess:function(){return((this.status>=200)&&(this.status<300)); 
},processScripts:function(a){if(this.options.evalResponse||(/(ecma|java)script/).test(this.getHeader("Content-type"))){return $exec(a);}return a.stripScripts(this.options.evalScripts); },success:function(b,a){this.onSuccess(this.processScripts(b),a);},onSuccess:function(){this.fireEvent("complete",arguments).fireEvent("success",arguments).callChain(); },failure:function(){this.onFailure();},onFailure:function(){this.fireEvent("complete").fireEvent("failure",this.xhr);},setHeader:function(a,b){this.headers.set(a,b); return this;},getHeader:function(a){return $try(function(){return this.xhr.getResponseHeader(a);}.bind(this));},check:function(){if(!this.running){return true; }switch(this.options.link){case"cancel":this.cancel();return true;case"chain":this.chain(this.caller.bind(this,arguments));return false;}return false;},send:function(k){if(!this.check(k)){return this; }this.running=true;var i=$type(k);if(i=="string"||i=="element"){k={data:k};}var d=this.options;k=$extend({data:d.data,url:d.url,method:d.method},k);var g=k.data,b=k.url,a=k.method.toLowerCase(); switch($type(g)){case"element":g=document.id(g).toQueryString();break;case"object":case"hash":g=Hash.toQueryString(g);}if(this.options.format){var j="format="+this.options.format; g=(g)?j+"&"+g:j;}if(this.options.emulation&&!["get","post"].contains(a)){var h="_method="+a;g=(g)?h+"&"+g:h;a="post";}if(this.options.urlEncoded&&a=="post"){var c=(this.options.encoding)?"; charset="+this.options.encoding:""; this.headers.set("Content-type","application/x-www-form-urlencoded"+c);}if(this.options.noCache){var f="noCache="+new Date().getTime();g=(g)?f+"&"+g:f; }var e=b.lastIndexOf("/");if(e>-1&&(e=b.indexOf("#"))>-1){b=b.substr(0,e);}if(g&&a=="get"){b=b+(b.contains("?")?"&":"?")+g;g=null;}this.xhr.open(a.toUpperCase(),b,this.options.async); this.xhr.onreadystatechange=this.onStateChange.bind(this);this.headers.each(function(m,l){try{this.xhr.setRequestHeader(l,m);}catch(n){this.fireEvent("exception",[l,m]); 
}},this);this.fireEvent("request");this.xhr.send(g);if(!this.options.async){this.onStateChange();}return this;},cancel:function(){if(!this.running){return this; }this.running=false;this.xhr.abort();this.xhr.onreadystatechange=$empty;this.xhr=new Browser.Request();this.fireEvent("cancel");return this;}});(function(){var a={}; ["get","post","put","delete","GET","POST","PUT","DELETE"].each(function(b){a[b]=function(){var c=Array.link(arguments,{url:String.type,data:$defined}); return this.send($extend(c,{method:b}));};});Request.implement(a);})();Element.Properties.send={set:function(a){var b=this.retrieve("send");if(b){b.cancel(); }return this.eliminate("send").store("send:options",$extend({data:this,link:"cancel",method:this.get("method")||"post",url:this.get("action")},a));},get:function(a){if(a||!this.retrieve("send")){if(a||!this.retrieve("send:options")){this.set("send",a); }this.store("send",new Request(this.retrieve("send:options")));}return this.retrieve("send");}};Element.implement({send:function(a){var b=this.get("send"); b.send({data:this,url:a||b.options.url});return this;}});Request.JSON=new Class({Extends:Request,options:{secure:true},initialize:function(a){this.parent(a); this.headers.extend({Accept:"application/json","X-Request":"JSON"});},success:function(a){this.response.json=JSON.decode(a,this.options.secure);this.onSuccess(this.response.json,a); }});//MooTools More, . Copyright (c) 2006-2009 Aaron Newton , Valerio Proietti & the MooTools team , MIT Style License. 
MooTools.More={version:"1.2.3.1"};Class.refactor=function(b,a){$each(a,function(e,d){var c=b.prototype[d];if(c&&(c=c._origin)&&typeof e=="function"){b.implement(d,function(){var f=this.previous; this.previous=c;var g=e.apply(this,arguments);this.previous=f;return g;});}else{b.implement(d,e);}});return b;};Class.Mutators.Binds=function(a){return a; };Class.Mutators.initialize=function(a){return function(){$splat(this.Binds).each(function(b){var c=this[b];if(c){this[b]=c.bind(this);}},this);return a.apply(this,arguments); };};Class.Occlude=new Class({occlude:function(c,b){b=document.id(b||this.element);var a=b.retrieve(c||this.property);if(a&&!$defined(this.occluded)){this.occluded=a; }else{this.occluded=false;b.store(c||this.property,this);}return this.occluded;}});String.implement({parseQueryString:function(){var b=this.split(/[&;]/),a={}; if(b.length){b.each(function(g){var c=g.indexOf("="),d=c<0?[""]:g.substr(0,c).match(/[^\]\[]+/g),e=decodeURIComponent(g.substr(c+1)),f=a;d.each(function(j,h){var k=f[j]; if(h=0||h||r.allowNegative)?n.x:0).toInt(),top:((n.y>=0||h||r.allowNegative)?n.y:0).toInt()}; if(p.getStyle("position")=="fixed"||r.relFixedPosition){var f=window.getScroll();n.top=n.top.toInt()+f.y;n.left=n.left.toInt()+f.x;}if(r.returnPos){return n; }else{this.setStyles(n);}return this;}});})();Element.implement({isDisplayed:function(){return this.getStyle("display")!="none";},toggle:function(){return this[this.isDisplayed()?"hide":"show"](); },hide:function(){var b;try{if("none"!=this.getStyle("display")){b=this.getStyle("display");}}catch(a){}return this.store("originalDisplay",b||"block").setStyle("display","none"); },show:function(a){return this.setStyle("display",a||this.retrieve("originalDisplay")||"block");},swapClass:function(a,b){return this.removeClass(a).addClass(b); }});var OverText=new 
Class({Implements:[Options,Events,Class.Occlude],Binds:["reposition","assert","focus"],options:{element:"label",positionOptions:{position:"upperLeft",edge:"upperLeft",offset:{x:4,y:2}},poll:false,pollInterval:250},property:"OverText",initialize:function(b,a){this.element=document.id(b); if(this.occlude()){return this.occluded;}this.setOptions(a);this.attach(this.element);OverText.instances.push(this);if(this.options.poll){this.poll();}return this; },toElement:function(){return this.element;},attach:function(){var a=this.options.textOverride||this.element.get("alt")||this.element.get("title");if(!a){return; }this.text=new Element(this.options.element,{"class":"overTxtLabel",styles:{lineHeight:"normal",position:"absolute"},html:a,events:{click:this.hide.pass(true,this)}}).inject(this.element,"after"); if(this.options.element=="label"){this.text.set("for",this.element.get("id"));}this.element.addEvents({focus:this.focus,blur:this.assert,change:this.assert}).store("OverTextDiv",this.text); window.addEvent("resize",this.reposition.bind(this));this.assert(true);this.reposition();},startPolling:function(){this.pollingPaused=false;return this.poll(); },poll:function(a){if(this.poller&&!a){return this;}var b=function(){if(!this.pollingPaused){this.assert(true);}}.bind(this);if(a){$clear(this.poller); }else{this.poller=b.periodical(this.options.pollInterval,this);}return this;},stopPolling:function(){this.pollingPaused=true;return this.poll(true);},focus:function(){if(!this.text.isDisplayed()||this.element.get("disabled")){return; }this.hide();},hide:function(b){if(this.text.isDisplayed()&&!this.element.get("disabled")){this.text.hide();this.fireEvent("textHide",[this.text,this.element]); this.pollingPaused=true;try{if(!b){this.element.fireEvent("focus").focus();}}catch(a){}}return this;},show:function(){if(!this.text.isDisplayed()){this.text.show(); this.reposition();this.fireEvent("textShow",[this.text,this.element]);this.pollingPaused=false;}return 
this;},assert:function(a){this[this.test()?"show":"hide"](a); },test:function(){var a=this.element.get("value");return !a;},reposition:function(){this.assert(true);if(!this.element.getParent()||!this.element.offsetHeight){return this.stopPolling().hide(); }if(this.test()){this.text.position($merge(this.options.positionOptions,{relativeTo:this.element}));}return this;}});OverText.instances=[];OverText.update=function(){return OverText.instances.map(function(a){if(a.element&&a.text){return a.reposition(); }return null;});};if(window.Fx&&Fx.Reveal){Fx.Reveal.implement({hideInputs:Browser.Engine.trident?"select, input, textarea, object, embed, .overTxtLabel":false}); }var Drag=new Class({Implements:[Events,Options],options:{snap:6,unit:"px",grid:false,style:true,limit:false,handle:false,invert:false,preventDefault:false,modifiers:{x:"left",y:"top"}},initialize:function(){var b=Array.link(arguments,{options:Object.type,element:$defined}); this.element=document.id(b.element);this.document=this.element.getDocument();this.setOptions(b.options||{});var a=$type(this.options.handle);this.handles=((a=="array"||a=="collection")?$$(this.options.handle):document.id(this.options.handle))||this.element; this.mouse={now:{},pos:{}};this.value={start:{},now:{}};this.selection=(Browser.Engine.trident)?"selectstart":"mousedown";this.bound={start:this.start.bind(this),check:this.check.bind(this),drag:this.drag.bind(this),stop:this.stop.bind(this),cancel:this.cancel.bind(this),eventStop:$lambda(false)}; this.attach();},attach:function(){this.handles.addEvent("mousedown",this.bound.start);return this;},detach:function(){this.handles.removeEvent("mousedown",this.bound.start); return this;},start:function(c){if(this.options.preventDefault){c.preventDefault();}this.mouse.start=c.page;this.fireEvent("beforeStart",this.element); var a=this.options.limit;this.limit={x:[],y:[]};for(var d in 
this.options.modifiers){if(!this.options.modifiers[d]){continue;}if(this.options.style){this.value.now[d]=this.element.getStyle(this.options.modifiers[d]).toInt(); }else{this.value.now[d]=this.element[this.options.modifiers[d]];}if(this.options.invert){this.value.now[d]*=-1;}this.mouse.pos[d]=c.page[d]-this.value.now[d]; if(a&&a[d]){for(var b=2;b--;b){if($chk(a[d][b])){this.limit[d][b]=$lambda(a[d][b])();}}}}if($type(this.options.grid)=="number"){this.options.grid={x:this.options.grid,y:this.options.grid}; }this.document.addEvents({mousemove:this.bound.check,mouseup:this.bound.cancel});this.document.addEvent(this.selection,this.bound.eventStop);},check:function(a){if(this.options.preventDefault){a.preventDefault(); }var b=Math.round(Math.sqrt(Math.pow(a.page.x-this.mouse.start.x,2)+Math.pow(a.page.y-this.mouse.start.y,2)));if(b>this.options.snap){this.cancel();this.document.addEvents({mousemove:this.bound.drag,mouseup:this.bound.stop}); this.fireEvent("start",[this.element,a]).fireEvent("snap",this.element);}},drag:function(a){if(this.options.preventDefault){a.preventDefault();}this.mouse.now=a.page; for(var b in this.options.modifiers){if(!this.options.modifiers[b]){continue;}this.value.now[b]=this.mouse.now[b]-this.mouse.pos[b];if(this.options.invert){this.value.now[b]*=-1; }if(this.options.limit&&this.limit[b]){if($chk(this.limit[b][1])&&(this.value.now[b]>this.limit[b][1])){this.value.now[b]=this.limit[b][1];}else{if($chk(this.limit[b][0])&&(this.value.now[b]c.left&&a.xc.top); },checkDroppables:function(){var a=this.droppables.filter(this.checkAgainst,this).getLast();if(this.overed!=a){if(this.overed){this.fireEvent("leave",[this.element,this.overed]); }if(a){this.fireEvent("enter",[this.element,a]);}this.overed=a;}},drag:function(a){this.parent(a);if(this.options.checkDroppables&&this.droppables.length){this.checkDroppables(); }},stop:function(a){this.checkDroppables();this.fireEvent("drop",[this.element,this.overed,a]);this.overed=null;return 
this.parent(a);}});Element.implement({makeDraggable:function(a){var b=new Drag.Move(this,a); this.store("dragger",b);return b;}});Hash.Cookie=new Class({Extends:Cookie,options:{autoSave:true},initialize:function(b,a){this.parent(b,a);this.load(); },save:function(){var a=JSON.encode(this.hash);if(!a||a.length>4096){return false;}if(a=="{}"){this.dispose();}else{this.write(a);}return true;},load:function(){this.hash=new Hash(JSON.decode(this.read(),true)); return this;}});Hash.each(Hash.prototype,function(b,a){if(typeof b=="function"){Hash.Cookie.implement(a,function(){var c=b.apply(this.hash,arguments);if(this.options.autoSave){this.save(); }return c;});}});

// perldoc.js
//
// JavaScript functions for perldoc.perl.org

//-------------------------------------------------------------------------
// perldoc - site-level functions
//-------------------------------------------------------------------------

var perldoc = {

  // startup - page initialisation functions ------------------------------
  startup: function() {
    toolbar.setup();
    pageIndex.setup();
    recentPages.setup();
    new OverText('search_box');
    perldoc.fromSearch();
  },

  // path - path back to the documentation root directory -----------------
  path: "",

  // setPath - sets the perldoc.path variable from page depth -------------
  setPath: function(depth) {
    perldoc.path = "";
    for (var c = 0; c < depth; c++) {
      perldoc.path = perldoc.path + "../";
    }
  },

  // loadFlags - loads the perldocFlags cookie ----------------------------
  loadFlags: function() {
    var perldocFlags = new Hash.Cookie('perldocFlags',{
      duration: 365,
      path: "/"
    });
    return perldocFlags;
  },

  // setFlag - stores a value in the perldocFlags cookie ------------------
  setFlag: function(name,value) {
    var perldocFlags = perldoc.loadFlags();
    if (!value) {
      value = true;
    }
    perldocFlags.set(name,value);
  },

  // getFlag - gets a value from the perldocFlags cookie ------------------
  getFlag: function(name) {
    var perldocFlags = perldoc.loadFlags();
    if (perldocFlags.has(name)) {
      return perldocFlags.get(name);
    } else {
      return false;
    }
  },

  // clearFlag - removes a value from the perldocFlags cookie -------------
  clearFlag: function(name) {
    var perldocFlags = perldoc.loadFlags();
    if (perldocFlags.has(name)) {
      perldocFlags.erase(name);
    }
  },

  // fromSearch - writes a message if the page was reached from search ----
  fromSearch: function() {
    if (perldoc.getFlag('fromSearch')) {
      var query = perldoc.getFlag('searchQuery');
      var searchURL = perldoc.path + "search.html?r=no&q=" + query;
      $('from_search').set('html', '
');
      perldoc.clearFlag('fromSearch');
      perldoc.clearFlag('searchQuery');
    }
  }

}

//-------------------------------------------------------------------------
// pageIndex - functions to control the floating page index window
//-------------------------------------------------------------------------

var pageIndex = {

  // setup - called to initialise the page index --------------------------
  setup: function() {
    if ($('page_index')) {
      var pageIndexDrag = new Drag('page_index',{
        handle: 'page_index_title',
        onComplete: pageIndex.checkPosition
      });
      $('page_index_content').makeResizable({
        handle: 'page_index_resize',
        onComplete: pageIndex.checkSize
      });
      var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
      if (pageIndexSettings.get('status') == 'Visible') {
        pageIndex.show();
        pageIndex.checkPosition();
        pageIndex.checkSize();
      } else {
        pageIndex.hide();
      }
    }
  },

  // show - displays the page index ---------------------------------------
  show: function() {
    var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
    if (pageIndexSettings.has('x') && pageIndexSettings.has('y')) {
      $('page_index').setStyle('left',pageIndexSettings.get('x'));
      $('page_index').setStyle('top',pageIndexSettings.get('y'));
    }
    if (pageIndexSettings.has('w') && pageIndexSettings.has('h')) {
      var paddingX = $('page_index_content').getStyle('padding-left').toInt() + $('page_index_content').getStyle('padding-right').toInt();
      var paddingY = $('page_index_content').getStyle('padding-top').toInt() + $('page_index_content').getStyle('padding-bottom').toInt();
      $('page_index_content').setStyle('width',pageIndexSettings.get('w') - paddingX);
      $('page_index_content').setStyle('height',pageIndexSettings.get('h') - paddingY);
    }
    pageIndex.windowResized();
    $('page_index').style.display = 'Block';
    $('page_index').style.visibility = 'Visible';
    pageIndexSettings.set('status','Visible');
    $('page_index_toggle').innerHTML = 'Hide page index';
    $('page_index_toggle').removeEvent('click',pageIndex.show);
    $('page_index_toggle').addEvent('click',pageIndex.hide);
    window.addEvent('resize',pageIndex.windowResized);
    return false;
  },

  // hide - hides the page index ------------------------------------------
  hide: function() {
    $('page_index').style.display = 'None';
    $('page_index').style.visibility = 'Hidden';
    $('page_index_toggle').innerHTML = 'Show page index';
    $('page_index_toggle').removeEvent('click',pageIndex.hide);
    $('page_index_toggle').addEvent('click',pageIndex.show);
    window.removeEvent('resize',pageIndex.windowResized);
    var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
    pageIndexSettings.set('status','Hidden');
    return false;
  },

  // checkPosition - checks the index window is within the screen ---------
  checkPosition: function() {
    var pageIndexSize = $('page_index').getSize();
    var pageIndexPosition = {x:$('page_index').getStyle('left').toInt(), y:$('page_index').getStyle('top').toInt()};
    var windowSize = window.getSize();
    var newX = pageIndexPosition.x;
    var newY = pageIndexPosition.y;
    if (pageIndexPosition.x < 0) {newX = 0}
    if (windowSize.x < (pageIndexPosition.x + pageIndexSize.x)) {newX = Math.max(0,windowSize.x - pageIndexSize.x)}
    if (pageIndexPosition.y < 0) {newY = 0}
    if (windowSize.y < (pageIndexPosition.y + pageIndexSize.y)) {newY = Math.max(0,windowSize.y - pageIndexSize.y)}
    $('page_index').setStyle('left',newX);
    $('page_index').setStyle('top',newY);
    pageIndex.saveDimensions();
  },

  // checkSize - checks the index window is smaller than the screen -------
  checkSize: function() {
    var pageIndexSize = $('page_index').getSize();
    var pageIndexPosition = {x:$('page_index').getStyle('left').toInt(), y:$('page_index').getStyle('top').toInt()};
    var pageIndexHeaderSize = $('page_index_header').getSize();
    var pageIndexContentSize = $('page_index_content').getSize();
    var pageIndexFooterSize = $('page_index_footer').getSize();
    var windowSize = window.getSize();
    var newX = 
pageIndexContentSize.x; var newY = pageIndexContentSize.y; var paddingX = $('page_index_content').getStyle('padding-left').toInt() + $('page_index_content').getStyle('padding-right').toInt(); var paddingY = $('page_index_content').getStyle('padding-top').toInt() + $('page_index_content').getStyle('padding-bottom').toInt(); if (windowSize.x < (pageIndexPosition.x + pageIndexSize.x)) {newX = windowSize.x - pageIndexPosition.x} if (windowSize.y < (pageIndexPosition.y + pageIndexSize.y)) {newY = windowSize.y - pageIndexPosition.y - pageIndexFooterSize.y - pageIndexHeaderSize.y} $('page_index_content').setStyle('width',newX - paddingX); $('page_index_content').setStyle('height',newY - paddingY); pageIndex.saveDimensions(); }, // windowResized - check the index still fits if the window is resized -- windowResized: function() { pageIndex.checkPosition(); var windowSize = window.getSize(); var pageIndexSize = $('page_index').getSize(); if ((windowSize.x < pageIndexSize.x) || (windowSize.y < pageIndexSize.y)) { pageIndex.checkSize(); } }, // saveDimensions - stores the window size/position in a cookie --------- saveDimensions: function() { var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"}); var pageIndexPosition = {x:$('page_index').getStyle('left').toInt(), y:$('page_index').getStyle('top').toInt()}; var pageIndexContentSize = $('page_index_content').getSize(); pageIndexSettings.set('x',pageIndexPosition.x); pageIndexSettings.set('y',pageIndexPosition.y); pageIndexSettings.set('w',pageIndexContentSize.x); pageIndexSettings.set('h',pageIndexContentSize.y); } }; //------------------------------------------------------------------------- // recentPages - store and display the last viewed pages //------------------------------------------------------------------------- var recentPages = { // count - number of pages to store ------------------------------------- count: 10, // add - adds a page to the recent list 
--------------------------------- add: function(name,url) { var recentList = recentPages.load(); // Remove page if it is already in the list recentList = recentList.filter(function(item) { return (item.url != url); }); // Add page as the first item in the list recentList.unshift({ 'name': name, 'url': url }); // Truncate list to maximum length recentList.splice(recentPages.count); // Save list recentPages.save(recentList); }, // show - displays the recent pages list -------------------------------- render: function() { var recentList = recentPages.load(); var recentHTML = ""; if (recentList.length > 0) { recentHTML += '
    '; recentList.each(function(item){ recentHTML += '
  • ' + item.name + ''; }); recentHTML += '
'; } $('recent_pages_content').set('html',recentHTML); }, // load - loads the recent pages list ----------------------------------- load: function() { return (perldoc.getFlag('recentPages') || new Array()); }, // save - saves the recent pages list ----------------------------------- save: function(list) { perldoc.setFlag('recentPages',list); }, setup: function() { recentPages.render(); if (perldoc.contentPage) { recentPages.add(perldoc.pageName,perldoc.pageAddress); } if ($('recent_pages')) { var recentPagesDrag = new Drag('recent_pages',{ handle: 'recent_pages_title', onComplete: recentPages.checkPosition }); $('recent_pages_content').makeResizable({ handle: 'recent_pages_resize', onComplete: recentPages.checkSize }); var recentPagesSettings = new Hash.Cookie('recentPagesSettings',{duration:365,path:"/"}); if (recentPagesSettings.get('status') == 'Visible') { recentPages.show(); recentPages.checkPosition(); recentPages.checkSize(); } else { recentPages.hide(); } } }, // show - displays the page index --------------------------------------- show: function() { var recentPagesSettings = new Hash.Cookie('recentPagesSettings',{duration:365,path:"/"}); if (recentPagesSettings.has('x') && recentPagesSettings.has('y')) { $('recent_pages').setStyle('left',recentPagesSettings.get('x')); $('recent_pages').setStyle('top',recentPagesSettings.get('y')); } if (recentPagesSettings.has('w') && recentPagesSettings.has('h')) { var paddingX = $('recent_pages_content').getStyle('padding-left').toInt() + $('recent_pages_content').getStyle('padding-right').toInt(); var paddingY = $('recent_pages_content').getStyle('padding-top').toInt() + $('recent_pages_content').getStyle('padding-bottom').toInt(); $('recent_pages_content').setStyle('width',recentPagesSettings.get('w') - paddingX); $('recent_pages_content').setStyle('height',recentPagesSettings.get('h') - paddingY); } recentPages.windowResized(); $('recent_pages').style.display = 'Block'; $('recent_pages').style.visibility = 'Visible'; 
    recentPagesSettings.set('status','Visible');
    $('recent_pages_toggle').innerHTML = 'Hide recent pages';
    $('recent_pages_toggle').removeEvent('click',recentPages.show);
    $('recent_pages_toggle').addEvent('click',recentPages.hide);
    window.addEvent('resize',recentPages.windowResized);
    return false;
  },


  // hide - hides the recent pages window ---------------------------------

  hide: function() {
    $('recent_pages').style.display = 'None';
    $('recent_pages').style.visibility = 'Hidden';
    $('recent_pages_toggle').innerHTML = 'Show recent pages';
    $('recent_pages_toggle').removeEvent('click',recentPages.hide);
    $('recent_pages_toggle').addEvent('click',recentPages.show);
    window.removeEvent('resize',recentPages.windowResized);
    var recentPagesSettings = new Hash.Cookie('recentPagesSettings',{duration:365,path:"/"});
    recentPagesSettings.set('status','Hidden');
    return false;
  },


  // checkPosition - checks the recent pages window is within the screen --

  checkPosition: function() {
    var recentPagesSize = $('recent_pages').getSize();
    var recentPagesPosition = {x:$('recent_pages').getStyle('left').toInt(), y:$('recent_pages').getStyle('top').toInt()};
    var windowSize = window.getSize();
    var newX = recentPagesPosition.x;
    var newY = recentPagesPosition.y;
    if (recentPagesPosition.x < 0) {newX = 0}
    if (windowSize.x < (recentPagesPosition.x + recentPagesSize.x)) {newX = Math.max(0,windowSize.x - recentPagesSize.x)}
    if (recentPagesPosition.y < 0) {newY = 0}
    if (windowSize.y < (recentPagesPosition.y + recentPagesSize.y)) {newY = Math.max(0,windowSize.y - recentPagesSize.y)}
    $('recent_pages').setStyle('left',newX);
    $('recent_pages').setStyle('top',newY);
    recentPages.saveDimensions();
  },


  // checkSize - checks the recent pages window is smaller than the screen

  checkSize: function() {
    var recentPagesSize = $('recent_pages').getSize();
    var recentPagesPosition = {x:$('recent_pages').getStyle('left').toInt(), y:$('recent_pages').getStyle('top').toInt()};
    var recentPagesHeaderSize = $('recent_pages_header').getSize();
    var recentPagesContentSize = $('recent_pages_content').getSize();
    var recentPagesFooterSize = $('recent_pages_footer').getSize();
    var windowSize = window.getSize();
    var newX = recentPagesContentSize.x;
    var newY = recentPagesContentSize.y;
    var paddingX = $('recent_pages_content').getStyle('padding-left').toInt() + $('recent_pages_content').getStyle('padding-right').toInt();
    var paddingY = $('recent_pages_content').getStyle('padding-top').toInt() + $('recent_pages_content').getStyle('padding-bottom').toInt();
    if (windowSize.x < (recentPagesPosition.x + recentPagesSize.x)) {newX = windowSize.x - recentPagesPosition.x}
    if (windowSize.y < (recentPagesPosition.y + recentPagesSize.y)) {newY = windowSize.y - recentPagesPosition.y - recentPagesFooterSize.y - recentPagesHeaderSize.y}
    $('recent_pages_content').setStyle('width',newX - paddingX);
    $('recent_pages_content').setStyle('height',newY - paddingY);
    recentPages.saveDimensions();
  },


  // windowResized - check the window still fits if the browser is resized

  windowResized: function() {
    recentPages.checkPosition();
    var windowSize = window.getSize();
    var recentPagesSize = $('recent_pages').getSize();
    if ((windowSize.x < recentPagesSize.x) || (windowSize.y < recentPagesSize.y)) {
      recentPages.checkSize();
    }
  },


  // saveDimensions - stores the window size/position in a cookie ---------

  saveDimensions: function() {
    var recentPagesSettings = new Hash.Cookie('recentPagesSettings',{duration:365,path:"/"});
    var recentPagesPosition = {x:$('recent_pages').getStyle('left').toInt(), y:$('recent_pages').getStyle('top').toInt()};
    var recentPagesContentSize = $('recent_pages_content').getSize();
    recentPagesSettings.set('x',recentPagesPosition.x);
    recentPagesSettings.set('y',recentPagesPosition.y);
    recentPagesSettings.set('w',recentPagesContentSize.x);
    recentPagesSettings.set('h',recentPagesContentSize.y);
  }

};


//-------------------------------------------------------------------------
// toolbar - functions to control the floating toolbar
//-------------------------------------------------------------------------

var toolbar = {

  // state - holds the current CSS positioning state (fixed or static)
  state: "static",


  // setup - initialises the window.onscroll handler
  setup: function() {
    var toolbarType = perldoc.getFlag('toolbar_position');
    if (toolbarType != 'standard') {
      toolbar.checkPosition();
      if (toolbar.state == 'fixed') {
        // If an internal link was called (x.html#y) the link position will be
        // behind the toolbar so the page needs to be scrolled 90px down
        var anchor = location.hash.substr(1);
        if (anchor) {
          var allLinks = $(document.body).getElements('a');
          allLinks.each(function(link) {
            if (link.get('name') == anchor) {
              window.scrollTo(0,link.offsetTop + 100);
            }
          });
        }
      }
      toolbar.rewriteLinks();
      window.addEvent('scroll', toolbar.checkPosition);
    }
  },


  // checkPosition - checks the scroll position and updates toolbar
  checkPosition: function() {
    if ((toolbar.state == 'static') && (toolbar.getCurrentYPos() > 120)) {
      $('content_header').style.position = "fixed";
      $('content_body').style.marginTop = "85px";
      toolbar.state = 'fixed';
    }
    if ((toolbar.state == 'fixed') && (toolbar.getCurrentYPos() <= 120)) {
      $('content_header').style.position = "static";
      $('content_body').style.marginTop = "0px";
      toolbar.state = 'static';
    }
  },


  // getCurrentYPos - returns the vertical scroll offset
  getCurrentYPos: function() {
    if (document.body && document.body.scrollTop)
      return document.body.scrollTop;
    if (document.documentElement && document.documentElement.scrollTop)
      return document.documentElement.scrollTop;
    if (window.pageYOffset)
      return window.pageYOffset;
    return 0;
  },


  // goToTop - scroll to the top of the page
  goToTop: function() {
    $('content_header').style.position = "static";
    $('content_body').style.marginTop = "0px";
    toolbar.state = 'static';
    window.scrollTo(0,0);
  },


  // rewriteLinks - stop internal links appearing behind the toolbar
  // Based on code written by Stuart Langridge - http://www.kryogenix.org
  rewriteLinks: function() {
    // Get a list of all links in the page
    var allLinks = $(document.body).getElements('a');

    // Walk through the list
    allLinks.each( function(link) {
      if ((link.href && link.href.indexOf('#') != -1) &&
          ( (link.pathname == location.pathname) ||
            ('/'+link.pathname == location.pathname) ) &&
          (link.search == location.search)) {
        // If the link is internal to the page (begins in #)
        // then attach the linkScroll function as an onclick
        // event handler
        link.addEvent('click',toolbar.linkScroll);
      }
    });
  },


  // linkScroll - follow internal link and scroll the page
  // Based on code written by Stuart Langridge - http://www.kryogenix.org
  linkScroll: function(e) {
    // This is an event handler; get the clicked on element,
    // in a cross-browser fashion
    var target;
    if (window.event) {
      target = window.event.srcElement;
    } else if (e) {
      target = e.target;
    } else return;

    // Make sure that the target is an element, not a text node
    // within an element
    if (target.nodeName.toLowerCase() != 'a') {
      target = target.parentNode;
    }

    // Paranoia; check this is an A tag
    if (target.nodeName.toLowerCase() != 'a') return;

    // Find the <a name> tag corresponding to this href
    // First strip off the hash (first character)
    var anchor = target.hash.substr(1);
    // Now loop all A tags until we find one with that name
    var allLinks = $(document.body).getElements('a');
    var destinationLink = null;
    allLinks.each ( function(link) {
      if (link.name && (link.name == anchor)) {
        destinationLink = link;
      }
    });
    if (!destinationLink) destinationLink = $(anchor);

    // If we didn't find a destination, give up and let the browser do
    // its thing
    if (!destinationLink) return true;

    // Find the destination's position
    var desty = destinationLink.offsetTop;
    var thisNode = destinationLink;
    while (thisNode.offsetParent && (thisNode.offsetParent != document.body)) {
      thisNode = thisNode.offsetParent;
      desty += thisNode.offsetTop;
    }

    // Follow the link
    location.hash = anchor;

    // Scroll if necessary to avoid the top nav bar
    if ((window.pageYOffset > 120) && ((desty + window.innerHeight - 120) < toolbar.getDocHeight())) {
      window.scrollBy(0,-90);
    }

    // And stop the actual click happening
    if (window.event) {
      window.event.cancelBubble = true;
      window.event.returnValue = false;
    }
    if (e && e.preventDefault && e.stopPropagation) {
      e.preventDefault();
      e.stopPropagation();
    }
  },


  // getDocHeight - return the height of the document
  getDocHeight: function() {
    var D = document;
    return Math.max(
      Math.max(D.body.scrollHeight, D.documentElement.scrollHeight),
      Math.max(D.body.offsetHeight, D.documentElement.offsetHeight),
      Math.max(D.body.clientHeight, D.documentElement.clientHeight)
    );
  }

};
perldoc-html/static/css-20090727.css000644 000765 000024 00000023525 12276001416 016725 0ustar00jjstaff000000 000000 /* @override http://localhost:3323/css-20090727.css */ /* @override http://localhost:3323/css-20090722.css */ /* @override http://localhost:3323/css-20090721.css */ /* @override http://localhost:3323/css.css */ /* @group Layout */ body { background-color: #1d3648; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { background-color: #f1f1f1; width: 985px; margin: 15px auto; } /* @group Header */ div#header { height: 105px; background-image: url(combined-20090722.png); position: relative; width: 985px; } /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #7f7f7f; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 48px; left: 810px; width: 115px; } div#search_form input { border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { background-image: url(page_bg.png); padding-top: 0px; } div#left_column { float: left; width: 145px; padding-left: 15px; } div#centre_column { float: left; width: 648px; margin-left: 8px; margin-right: 9px; position: relative; background-color: #ffffff; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { height: 48px; width: 985px; background: url(combined-20090722.png) 0 -105px; } /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ form#perl_version { padding-left: 7px; margin-top: 3px; padding-bottom: 10px; } /* @end */ div.side_group { margin-bottom: 25px; } div.side_panel { width: 130px; background: #2f2f2f
url(combined-20090722.png) no-repeat 0 -216px; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background: url(combined-20090722.png) -648px -153px; } div.links_panel p { background: url(combined-20090722.png) -786px -153px; } div.tools_panel p { background: url(combined-20090722.png) -648px -194px; } div.side_panel ul { padding-left: 8px; margin-bottom: 0; padding-bottom: 11px; margin-top: 1px; font-size: .9em; margin-left: 0; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: #bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(center_bg.png); top: 0; width: 648px; height: 90px; z-index: 1; } /* @group Title bar */ div#title_bar { height: 53px; position: relative; width: 648px; background: url(combined-20090722.png) 0 -153px; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div#page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; top: 33px; right: 21px; } div#page_links a { color: #bebebe; text-decoration: none; } div#page_links a:hover { color: white; } div#page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 12px; padding-top: 5px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ 
div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { padding-right: 12px; padding-left: 12px; font-size: .9em; color: #262626; position: relative; top: 0px; width: 624px; background-image: url(center_bg.png); } div#content_body h1 { margin-top: 0; margin-bottom: 6px; font-size: 1.4em; border-bottom: 2px solid #d8d8d8; color: #3f3f3f; } div#content_body h2 { margin-top: 0; margin-bottom: 6px; font-size: 1.2em; border-bottom: 1px solid #d8d8d8; color: #3f3f3f; } div#content_body p { margin-top: 2px; margin-bottom: 12px; } div#content_body ul li { padding-bottom: 2px; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { height: 10px; width: 648px; background: url(combined-20090722.png) 0 -206px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h { color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: 
#B452CD;} /* numeric */ .p { color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* structure */ .sc { color: #000000;} /* semicolon */ .v { color: #B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* keyword */ a.l_w:hover { color: #225533; } /* keyword */ code.inline { background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { position: fixed; top: 180px; left: 550px; visibility: hidden; z-index: 2; display: none; } div#page_index_header { } div#page_index_footer { } div#page_index_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 23px; float: left; } div#page_index_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div#page_index_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: absolute; right: 0; top: 0; } 
div#page_index_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div#page_index_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; position: absolute; right: 0; bottom: 0; cursor: se-resize; } div#page_index_close { width: 24px; height: 23px; background-image: url(popup_close.png); float: left; } div#page_index_close a { display: block; width: 24px; height: 23px; } div#page_index_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div#page_index_content ul { margin-top: 10px; } div#page_index_content a:link { color: #a6a6a6; } div#page_index_content a:hover { color: #fff; } span.page_index_top { display: inline-block; height: 19px; } span.page_index_bottom { display: inline-block; height: 23px; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */ /* @group Preferences */ form.prefs { width: 580px; margin-right: auto; margin-left: auto; } form.prefs legend { padding: 2px 15px 2px 8px; color: #3f3f3f; border: 2px solid #bfbfbf; background-color: white; font-weight: bold; font-size: .9em; } form.prefs fieldset { margin-bottom: 20px; padding-bottom: 25px; border: 2px solid #bfbfbf; background-color: #fafafa; } form.prefs label { display: block; margin-left: 50px; margin-bottom: -8px; font-size: .9em; } form.prefs fieldset input { float: left; margin-left: 25px; } div.form_section { font-size: .9em; color: #3f3f3f; border-bottom: 1px solid #bfbfbf; padding-bottom: 1px; margin: 7px 10px 13px; font-weight: bold; font-style: italic; } div.form_buttons { padding-bottom: 5px; text-align: center; } div.form_buttons input { margin-right: 6px; margin-left: 6px; font-size: 1.2em; } /* @end */ perldoc-html/static/css-20090730.css000644 000765 000024 00000023556 12276001416 016723 0ustar00jjstaff000000 000000 /* 
@override http://localhost:3323/css-20090730.css */ /* @override http://localhost:3323/css-20090727.css */ /* @override http://localhost:3323/css-20090722.css */ /* @override http://localhost:3323/css-20090721.css */ /* @override http://localhost:3323/css.css */ /* @group Layout */ body { background-color: #1d3648; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { background-color: #f1f1f1; width: 985px; margin: 15px auto; } /* @group Header */ div#header { height: 105px; background-image: url(combined-20090722.png); position: relative; width: 985px; } /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #7f7f7f; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 48px; left: 810px; width: 115px; } div#search_form input { border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { background-image: url(page_bg.png); padding-top: 0px; } div#left_column { float: left; width: 145px; padding-left: 15px; } div#centre_column { float: left; width: 648px; margin-left: 8px; margin-right: 9px; position: relative; background-color: #ffffff; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { height: 48px; width: 985px; background: url(combined-20090722.png) 0 -105px; } /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ select#perl_version_select { margin: 5px 8px 11px; } /* @end */ div.side_group { margin-bottom: 25px; } div.side_panel { width: 130px; background: #2f2f2f url(combined-20090722.png) no-repeat 0 -216px; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; 
padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background: url(combined-20090722.png) -648px -153px; } div.links_panel p { background: url(combined-20090722.png) -786px -153px; } div.tools_panel p { background: url(combined-20090722.png) -648px -194px; } div.side_panel ul { padding-left: 8px; margin-bottom: 0; padding-bottom: 11px; margin-top: 1px; font-size: .9em; margin-left: 0; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: #bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(center_bg.png); top: 0; width: 648px; height: 90px; z-index: 1; } /* @group Title bar */ div#title_bar { height: 53px; position: relative; width: 648px; background: url(combined-20090722.png) 0 -153px; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div#page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; top: 33px; right: 21px; } div#page_links a { color: #bebebe; text-decoration: none; } div#page_links a:hover { color: white; } div#page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 12px; padding-top: 5px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { padding-right: 12px; padding-left: 12px; font-size: .9em; color: #262626; position: relative; top: 
0px; width: 624px; background-image: url(center_bg.png); } div#content_body h1 { margin-top: 0; margin-bottom: 6px; font-size: 1.4em; border-bottom: 2px solid #d8d8d8; color: #3f3f3f; } div#content_body h2 { margin-top: 0; margin-bottom: 6px; font-size: 1.2em; border-bottom: 1px solid #d8d8d8; color: #3f3f3f; } div#content_body p { margin-top: 2px; margin-bottom: 12px; } div#content_body ul li { padding-bottom: 2px; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { height: 10px; width: 648px; background: url(combined-20090722.png) 0 -206px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h { color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: #B452CD;} /* numeric */ .p { color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* 
structure */ .sc { color: #000000;} /* semicolon */ .v { color: #B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* keyword */ a.l_w:hover { color: #225533; } /* keyword */ code.inline { background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { position: fixed; top: 180px; left: 550px; visibility: hidden; z-index: 2; display: none; } div#page_index_header { } div#page_index_footer { } div#page_index_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 23px; float: left; } div#page_index_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div#page_index_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: absolute; right: 0; top: 0; } div#page_index_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div#page_index_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; position: absolute; right: 0; 
bottom: 0; cursor: se-resize; } div#page_index_close { width: 24px; height: 23px; background-image: url(popup_close.png); float: left; } div#page_index_close a { display: block; width: 24px; height: 23px; } div#page_index_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div#page_index_content ul { margin-top: 10px; } div#page_index_content a:link { color: #a6a6a6; } div#page_index_content a:hover { color: #fff; } span.page_index_top { display: inline-block; height: 19px; } span.page_index_bottom { display: inline-block; height: 23px; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */ /* @group Preferences */ form.prefs { width: 580px; margin-right: auto; margin-left: auto; } form.prefs legend { padding: 2px 15px 2px 8px; color: #3f3f3f; border: 2px solid #bfbfbf; background-color: white; font-weight: bold; font-size: .9em; } form.prefs fieldset { margin-bottom: 20px; padding-bottom: 25px; border: 2px solid #bfbfbf; background-color: #fafafa; } form.prefs label { display: block; margin-left: 50px; margin-bottom: -8px; font-size: .9em; } form.prefs fieldset input { float: left; margin-left: 25px; } div.form_section { font-size: .9em; color: #3f3f3f; border-bottom: 1px solid #bfbfbf; padding-bottom: 1px; margin: 7px 10px 13px; font-weight: bold; font-style: italic; } div.form_buttons { padding-bottom: 5px; text-align: center; } div.form_buttons input { margin-right: 6px; margin-left: 6px; font-size: 1.2em; } /* @end */ /* File: perldoc-html/static/css-20090810.css */ /* @override http://localhost:3323/css-20090810.css */ /* @override http://localhost:3323/css-20090730.css */ /* @override http://localhost:3323/css-20090727.css */ /* @override http://localhost:3323/css-20090722.css */ /* 
@override http://localhost:3323/css-20090721.css */ /* @override http://localhost:3323/css.css */ /* @group Layout */ body { background-color: #1d3648; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { background-color: #f1f1f1; width: 985px; margin: 15px auto; } /* @group Header */ div#header { height: 105px; background-image: url(combined-20090722.png); position: relative; width: 985px; } /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #7f7f7f; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 48px; left: 810px; width: 115px; } div#search_form input { border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { background-image: url(page_bg.png); padding-top: 0px; } div#left_column { float: left; width: 145px; padding-left: 15px; } div#centre_column { float: left; width: 648px; margin-left: 8px; margin-right: 9px; position: relative; background-color: #ffffff; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { height: 48px; width: 985px; background: url(combined-20090722.png) 0 -105px; } /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ select#perl_version_select { margin: 5px 8px 11px; } /* @end */ div.side_group { margin-bottom: 25px; } div.side_panel { width: 130px; background: #2f2f2f url(combined-20090722.png) no-repeat 0 -216px; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background: url(combined-20090722.png) -648px -153px; } 
div.links_panel p { background: url(combined-20090722.png) -786px -153px; } div.tools_panel p { background: url(combined-20090722.png) -648px -194px; } div.side_panel ul { padding-left: 8px; margin-bottom: 0; padding-bottom: 11px; margin-top: 1px; font-size: .9em; margin-left: 0; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: #bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(center_bg.png); top: 0; width: 648px; height: 90px; z-index: 1; } /* @group Title bar */ div#title_bar { height: 53px; position: relative; width: 648px; background: url(combined-20090722.png) 0 -153px; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div#page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; top: 33px; right: 21px; } div#page_links a { color: #bebebe; text-decoration: none; } div#page_links a:hover { color: white; } div#page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 12px; padding-top: 5px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { padding-right: 12px; padding-left: 12px; font-size: .9em; color: #262626; position: relative; top: 0px; width: 624px; background-image: url(center_bg.png); } div#content_body h1 { margin-top: 0; margin-bottom: 6px; font-size: 1.4em; border-bottom: 2px solid 
#d8d8d8; color: #3f3f3f; } div#content_body h2 { margin-top: 18px; margin-bottom: 6px; font-size: 1.2em; border-bottom: 1px solid #d8d8d8; color: #3f3f3f; } div#content_body h3 { margin-top: 6px; margin-bottom: 6px; font-size: 1.1em; color: #3f3f3f; } div#content_body p { margin-top: 2px; margin-bottom: 12px; } div#content_body ul li { padding-bottom: 2px; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { height: 10px; width: 648px; background: url(combined-20090722.png) 0 -206px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h { color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: #B452CD;} /* numeric */ .p { color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* structure */ .sc { color: #000000;} /* semicolon */ .v { color: 
#B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* keyword */ a.l_w:hover { color: #225533; } /* keyword */ code.inline { background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { position: fixed; top: 180px; left: 550px; visibility: hidden; z-index: 2; display: none; } div#page_index_header { } div#page_index_footer { } div#page_index_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 23px; float: left; } div#page_index_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div#page_index_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: absolute; right: 0; top: 0; } div#page_index_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div#page_index_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; position: absolute; right: 0; bottom: 0; cursor: se-resize; } div#page_index_close { width: 24px; 
height: 23px; background-image: url(popup_close.png); float: left; } div#page_index_close a { display: block; width: 24px; height: 23px; } div#page_index_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div#page_index_content ul { margin-top: 10px; } div#page_index_content a:link { color: #a6a6a6; } div#page_index_content a:hover { color: #fff; } span.page_index_top { display: inline-block; height: 19px; } span.page_index_bottom { display: inline-block; height: 23px; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */ /* @group Preferences */ form.prefs { width: 580px; margin-right: auto; margin-left: auto; } form.prefs legend { padding: 2px 15px 2px 8px; color: #3f3f3f; border: 2px solid #bfbfbf; background-color: white; font-weight: bold; font-size: .9em; } form.prefs fieldset { margin-bottom: 20px; padding-bottom: 25px; border: 2px solid #bfbfbf; background-color: #fafafa; } form.prefs label { display: block; margin-left: 50px; margin-bottom: -8px; font-size: .9em; } form.prefs fieldset input { float: left; margin-left: 25px; } div.form_section { font-size: .9em; color: #3f3f3f; border-bottom: 1px solid #bfbfbf; padding-bottom: 1px; margin: 7px 10px 13px; font-weight: bold; font-style: italic; } div.form_buttons { padding-bottom: 5px; text-align: center; } div.form_buttons input { margin-right: 6px; margin-left: 6px; font-size: 1.2em; } /* @end */ /* File: perldoc-html/static/css-20090925.css */ /* @override http://localhost:3323/css-20090810.css */ /* @override http://localhost:3323/css-20090730.css */ /* @override http://localhost:3323/css-20090727.css */ /* @override http://localhost:3323/css-20090722.css */ /* @override http://localhost:3323/css-20090721.css */ /* @override 
http://localhost:3323/css.css */ /* @group Layout */ body { background-color: #274255; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { background-color: #f1f1f1; width: 985px; margin: 15px auto; } /* @group Header */ div#header { height: 105px; background-image: url(combined-20090925.png); position: relative; width: 985px; } /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #7f7f7f; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 48px; left: 810px; width: 115px; } div#search_form input { border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { background-image: url(page_bg.png); padding-top: 0px; } div#left_column { float: left; width: 145px; padding-left: 15px; } div#centre_column { float: left; width: 648px; margin-left: 8px; margin-right: 9px; position: relative; background-color: #ffffff; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { height: 48px; width: 985px; background: url(combined-20090925.png) 0 -105px; } /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ select#perl_version_select { margin: 5px 8px 11px; } /* @end */ div.side_group { margin-bottom: 25px; } div.side_panel { width: 130px; background: #3b3b3b url(combined-20090925.png) no-repeat 0 -216px; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background: url(combined-20090925.png) -648px -153px; } div.links_panel p { background: url(combined-20090925.png) -786px -153px; } 
div.tools_panel p { background: url(combined-20090925.png) -648px -194px; } div.side_panel ul { padding-left: 8px; margin-bottom: 0; padding-bottom: 11px; margin-top: 1px; font-size: .9em; margin-left: 0; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: #bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(center_bg.png); top: 0; width: 648px; height: 90px; z-index: 1; } /* @group Title bar */ div#title_bar { height: 53px; position: relative; width: 648px; background: url(combined-20090925.png) 0 -153px; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div#page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; top: 33px; right: 21px; } div#page_links a { color: #bebebe; text-decoration: none; } div#page_links a:hover { color: white; } div#page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 12px; padding-top: 5px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { padding-right: 12px; padding-left: 12px; font-size: .9em; color: #262626; position: relative; top: 0px; width: 624px; background-image: url(center_bg.png); } div#content_body h1 { margin-top: 0; margin-bottom: 6px; font-size: 1.4em; border-bottom: 2px solid #d8d8d8; color: #3f3f3f; } div#content_body h2 { margin-top: 18px; 
margin-bottom: 6px; font-size: 1.2em; border-bottom: 1px solid #d8d8d8; color: #3f3f3f; } div#content_body h3 { margin-top: 6px; margin-bottom: 6px; font-size: 1.1em; color: #3f3f3f; } div#content_body p { margin-top: 2px; margin-bottom: 12px; } div#content_body ul li { padding-bottom: 2px; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { height: 10px; width: 648px; background: url(combined-20090925.png) 0 -206px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h { color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: #B452CD;} /* numeric */ .p { color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* structure */ .sc { color: #000000;} /* semicolon */ .v { color: #B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ 
a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* keyword */ a.l_w:hover { color: #225533; } /* keyword */ code.inline { background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { position: fixed; top: 180px; left: 550px; visibility: hidden; z-index: 2; display: none; } div#page_index_header { } div#page_index_footer { } div#page_index_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 23px; float: left; } div#page_index_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div#page_index_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: absolute; right: 0; top: 0; } div#page_index_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div#page_index_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; position: absolute; right: 0; bottom: 0; cursor: se-resize; } div#page_index_close { width: 24px; height: 23px; background-image: url(popup_close.png); float: 
left; } div#page_index_close a { display: block; width: 24px; height: 23px; } div#page_index_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div#page_index_content ul { margin-top: 10px; } div#page_index_content a:link { color: #a6a6a6; } div#page_index_content a:hover { color: #fff; } span.page_index_top { display: inline-block; height: 19px; } span.page_index_bottom { display: inline-block; height: 23px; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */ /* @group Preferences */ form.prefs { width: 580px; margin-right: auto; margin-left: auto; } form.prefs legend { padding: 2px 15px 2px 8px; color: #3f3f3f; border: 2px solid #bfbfbf; background-color: white; font-weight: bold; font-size: .9em; } form.prefs fieldset { margin-bottom: 20px; padding-bottom: 25px; border: 2px solid #bfbfbf; background-color: #fafafa; } form.prefs label { display: block; margin-left: 50px; margin-bottom: -8px; font-size: .9em; } form.prefs fieldset input { float: left; margin-left: 25px; } div.form_section { font-size: .9em; color: #3f3f3f; border-bottom: 1px solid #bfbfbf; padding-bottom: 1px; margin: 7px 10px 13px; font-weight: bold; font-style: italic; } div.form_buttons { padding-bottom: 5px; text-align: center; } div.form_buttons input { margin-right: 6px; margin-left: 6px; font-size: 1.2em; } /* @end */ /* File: perldoc-html/static/css-20100402.css */ /* @override http://localhost:3323/static/css-20100402.css */ /* @override http://localhost:3323/static/css-20100402.css */ /* @override http://localhost:3323/css-20090810.css */ /* @override http://localhost:3323/css-20090730.css */ /* @override http://localhost:3323/css-20090727.css */ /* @override http://localhost:3323/css-20090722.css */ /* @override 
http://localhost:3323/css-20090721.css */ /* @override http://localhost:3323/css.css */ /* @group Layout */ body { background-color: #004065; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { width: 946px; margin: 16px auto 26px; position: relative; } p { margin: 0; padding-top: 5px; padding-bottom: 5px; } /* @group Header */ div#header { height: 105px; background-image: url(header.png); position: relative; width: 946px; } div#strapline { color: #b4b4b4; position: absolute; top: 63px; left: 119px; font-size: .8em; text-align: center; width: 228px; } /* @group Download */ div#download_link { position: absolute; top: 48px; font-size: .8em; left: 724px; } div#explore_link { position: absolute; top: 48px; font-size: .8em; left: 846px; } div.download a { text-decoration: none; color: #b4b4b4; } div.download a:hover { color: #eaeaea; } /* @end */ /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #afafaf; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 23px; left: 588px; width: 115px; } div#search_form input { border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { padding-top: 0px; background: url(main_bg.png) repeat-y; } div#left_column { float: left; width: 145px; padding-left: 30px; } div#centre_column { float: left; width: 731px; margin-left: 9px; margin-right: 9px; position: relative; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { } div#footer_content { background-image: url(footer_bg.png); padding-right: 74px; padding-left: 74px; } div#footer_strapline { text-align: center; padding-top: 15px; padding-bottom: 15px; color: #6c6c6c; font: lighter 1.05em "Gill Sans"; } 
div#footer_end { background-image: url(footer.png); width: 946px; height: 42px; clear: both; background-color: #004065; } /* @group Address */ div#address { background: url(onion-grey.png) no-repeat 0 4px; padding-left: 51px; float: left; border-right: 1px solid #b4b4b4; padding-right: 35px; } div#address p.name { margin: 0; color: #414141; } div#address p.address { color: #797979; font-size: .9em; margin: 3px 0 0; padding: 0; } div#address p.contact { color: #797979; margin: 3px 0 0; font-size: .9em; } div#address p.contact a { color: #797979; text-decoration: underline; } div#address p.contact a:hover { color: #515151; } div#address p.address a { color: #797979; text-decoration: underline; } div#address p.address a:hover { color: #515151; } /* @end */ /* @group Links */ div#footer_links { background-image: url(footer_bg.png); } ul.f1 { list-style-type: none; margin: 0; float: right; padding: 8px 0 0; } ul.f1 li { float: left; color: #414141; font-size: .8em; text-align: right; padding-left: 40px; } ul.f2 { list-style-type: none; text-align: right; margin: 3px 0 0; padding: 0; } ul.f2 li { float: none; font-size: 1em; margin: 0; padding: 0; } ul.f2 li a { text-decoration: none; color: #858585; } ul.f2 li a:hover { color: #515151; } /* @end */ /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ select#perl_version_select { margin: 5px 8px 11px; width: 121px; } /* @end */ div.side_group { margin-bottom: 25px; } div.side_panel { width: 130px; background: #3b3b3b url(combined-20090925.png) no-repeat 0 -216px; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background: url(combined-20090925.png) -648px -153px; } div.links_panel p { background: url(combined-20090925.png) -786px -153px; } div.tools_panel p { background: 
url(combined-20090925.png) -648px -194px; } div.side_panel ul { padding-left: 8px; margin-bottom: 0; padding-bottom: 11px; margin-top: 1px; font-size: .9em; margin-left: 0; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: #bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(content_bg.png); top: 0; width: 731px; height: 85px; z-index: 1; } /* @group Title bar */ div#title_bar { height: 53px; position: relative; background: url(center_header.png) 0 0; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div.page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; } div#page_links_top { top: 18px; right: 179px; } div#page_links_bottom { top: 33px; right: 179px; } div.page_links a { color: #bebebe; text-decoration: none; } div.page_links a:hover { color: white; } div.page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 30px; padding-top: 10px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { padding-right: 30px; padding-left: 30px; font-size: .8em; color: #515151; position: relative; top: 0px; background-image: url(content_bg.png); } div#content_body h1 { font: 1.9em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; background: url(h2_bg.png) no-repeat -2px 29px; 
padding-left: 0px; padding-top: 6px; padding-bottom: 8px; margin-top: 0px; margin-bottom: 0; } div#content_body h2 { font: 1.5em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; padding-top: 12px; margin: 0; background: url(h2_bg.png) no-repeat -2px 30px; padding-bottom: 4px; padding-left: 0; } div#content_body h3 { font: 1.3em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; margin-top: 9px; margin-bottom: 3px; } div#content_body p { margin-top: 0px; margin-bottom: 0px; } div#content_body ul li { padding-bottom: 3px; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { width: 731px; height: 25px; background: url(content_footer.png) no-repeat; margin-bottom: 20px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h { color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: #B452CD;} /* numeric */ .p 
{ color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* structure */ .sc { color: #000000;} /* semicolon */ .v { color: #B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* keyword */ a.l_w:hover { color: #225533; } /* keyword */ code.inline { background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; font-size: 1.2em; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; font-size: 1.2em; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { top: 180px; left: 550px; } div#recent_pages { top: 180px; left: 250px; } div.hud_container { position: fixed; z-index: 2; display: none; visibility: hidden; } div.hud_header { } div.hud_footer { } div.hud_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 23px; float: left; } div.hud_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div.hud_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: 
absolute; right: 0; top: 0; } div.hud_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div.hud_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; position: absolute; right: 0; bottom: 0; cursor: se-resize; } div.hud_close { width: 24px; height: 23px; background-image: url(popup_close.png); float: left; } div.hud_close a { display: block; width: 24px; height: 23px; } div.hud_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div.hud_content ul { margin-top: 10px; } div#content_body div.hud_content a:link { color: #a6a6a6; } div#content_body div.hud_content a:visited { color: #a6a6a6; } div#content_body div.hud_content { color: #a6a6a6; } div#content_body div.hud_content a:hover { color: #fff; } span.hud_span_top { display: inline-block; height: 19px; } span.hud_span_bottom { display: inline-block; height: 23px; } div#recent_pages_content { width: 170px; height: 200px; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */ /* @group Preferences */ form.prefs { width: 580px; margin-right: auto; margin-left: auto; } form.prefs legend { padding: 2px 15px 2px 8px; color: #3f3f3f; border: 2px solid #bfbfbf; background-color: white; font-weight: bold; font-size: .9em; } form.prefs fieldset { margin-bottom: 20px; padding-bottom: 25px; border: 2px solid #bfbfbf; background-color: #fafafa; } form.prefs label { display: block; margin-left: 50px; margin-bottom: -8px; font-size: .9em; } form.prefs fieldset input { float: left; margin-left: 25px; } div.form_section { font-size: .9em; color: #3f3f3f; border-bottom: 1px solid #bfbfbf; padding-bottom: 1px; margin: 7px 10px 13px; font-weight: bold; font-style: italic; } div.form_buttons { padding-bottom: 5px; text-align: center; } div.form_buttons input { 
margin-right: 6px; margin-left: 6px; font-size: 1.2em; } /* @end */ perldoc-html/static/css-20100513.css000644 000765 000024 00000031352 12276001417 016704 0ustar00jjstaff000000 000000 /* @override http://localhost:3323/static/css-20100402.css */ /* @override http://localhost:3323/static/css-20100402.css */ /* @override http://localhost:3323/css-20090810.css */ /* @override http://localhost:3323/css-20090730.css */ /* @override http://localhost:3323/css-20090727.css */ /* @override http://localhost:3323/css-20090722.css */ /* @override http://localhost:3323/css-20090721.css */ /* @override http://localhost:3323/css.css */ /* @group Layout */ body { background-color: #004065; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { width: 946px; margin: 16px auto 26px; position: relative; } p { margin: 0; padding-top: 5px; padding-bottom: 5px; } /* @group Header */ div#header { height: 105px; background-image: url(header.png); position: relative; width: 946px; } div#strapline { color: #b4b4b4; position: absolute; top: 63px; left: 119px; font-size: .8em; text-align: center; width: 228px; } /* @group Download */ div#download_link { position: absolute; top: 48px; font-size: .8em; left: 724px; } div#explore_link { position: absolute; top: 48px; font-size: .8em; left: 846px; } div.download a { text-decoration: none; color: #b4b4b4; } div.download a:hover { color: #eaeaea; } /* @end */ /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #afafaf; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 23px; left: 588px; width: 115px; } div#search_form input { border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { padding-top: 0px; background: url(main_bg.png) 
repeat-y; } div#left_column { float: left; width: 145px; padding-left: 30px; } div#centre_column { float: left; width: 731px; margin-left: 9px; margin-right: 9px; position: relative; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { } div#footer_content { background-image: url(footer_bg.png); padding-right: 74px; padding-left: 74px; } div#footer_strapline { text-align: center; padding-top: 15px; padding-bottom: 15px; color: #6c6c6c; font: lighter 1.05em "Gill Sans"; } div#footer_end { background-image: url(footer.png); width: 946px; height: 42px; clear: both; background-color: #004065; } /* @group Address */ div#address { background: url(onion-grey.png) no-repeat 0 4px; padding-left: 51px; float: left; border-right: 1px solid #b4b4b4; padding-right: 35px; } div#address p.name { margin: 0; color: #414141; } div#address p.address { color: #797979; font-size: .9em; margin: 3px 0 0; padding: 0; } div#address p.contact { color: #797979; margin: 3px 0 0; font-size: .9em; } div#address p.contact a { color: #797979; text-decoration: underline; } div#address p.contact a:hover { color: #515151; } div#address p.address a { color: #797979; text-decoration: underline; } div#address p.address a:hover { color: #515151; } /* @end */ /* @group Links */ div#footer_links { background-image: url(footer_bg.png); } ul.f1 { list-style-type: none; margin: 0; float: right; padding: 8px 0 0; } ul.f1 li { float: left; color: #414141; font-size: .8em; text-align: right; padding-left: 40px; } ul.f2 { list-style-type: none; text-align: right; margin: 3px 0 0; padding: 0; } ul.f2 li { float: none; font-size: 1em; margin: 0; padding: 0; } ul.f2 li a { text-decoration: none; color: #858585; } ul.f2 li a:hover { color: #515151; } /* @end */ /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ select#perl_version_select { margin: 5px 8px 11px; width: 121px; } /* @end */ div.side_group { margin-bottom: 25px; } 
div.side_panel { width: 130px; background: #3b3b3b url(combined-20090925.png) no-repeat 0 -216px; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background: url(combined-20090925.png) -648px -153px; } div.links_panel p { background: url(combined-20090925.png) -786px -153px; } div.tools_panel p { background: url(combined-20090925.png) -648px -194px; } div.side_panel ul { padding-left: 8px; margin-bottom: 0; padding-bottom: 11px; margin-top: 1px; font-size: .9em; margin-left: 0; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: #bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(content_bg.png); top: 0; width: 731px; height: 85px; z-index: 1; } /* @group Title bar */ div#title_bar { height: 53px; position: relative; background: url(center_header.png) 0 0; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div.page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; } div#page_links_top { top: 18px; right: 179px; } div#page_links_bottom { top: 33px; right: 179px; } div.page_links a { color: #bebebe; text-decoration: none; } div.page_links a:hover { color: white; } div.page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 30px; padding-top: 10px; font: .8em "Helvetica Neue", Arial, 
Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { padding-right: 30px; padding-left: 30px; font-size: .8em; color: #515151; position: relative; top: 0px; background-image: url(content_bg.png); } div#content_body h1 { font: 1.9em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; background: url(h2_bg.png) no-repeat -2px 29px; padding-left: 0px; padding-top: 6px; padding-bottom: 8px; margin-top: 0px; margin-bottom: 0; } div#content_body h2 { font: 1.5em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; padding-top: 12px; margin: 0; background: url(h2_bg.png) no-repeat -2px 30px; padding-bottom: 4px; padding-left: 0; } div#content_body h3 { font: 1.3em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; margin-top: 9px; margin-bottom: 3px; } div#content_body p { margin-top: 0px; margin-bottom: 0px; } div#content_body ul li { padding-bottom: 3px; } div#content_body dt { font-weight: bold; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { width: 731px; height: 25px; background: 
url(content_footer.png) no-repeat; margin-bottom: 20px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h { color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: #B452CD;} /* numeric */ .p { color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* structure */ .sc { color: #000000;} /* semicolon */ .v { color: #B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* bareword */ a.l_w:hover { color: #225533; } /* bareword */ code.inline { background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; font-size: 1.2em; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; font-size: 1.2em; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { top: 180px; left: 550px; } 
div#recent_pages { top: 180px; left: 250px; } div.hud_container { position: fixed; z-index: 2; display: none; visibility: hidden; } div.hud_header { } div.hud_footer { } div.hud_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 23px; float: left; } div.hud_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div.hud_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: absolute; right: 0; top: 0; } div.hud_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div.hud_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; position: absolute; right: 0; bottom: 0; cursor: se-resize; } div.hud_close { width: 24px; height: 23px; background-image: url(popup_close.png); float: left; } div.hud_close a { display: block; width: 24px; height: 23px; } div.hud_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div.hud_content ul { margin-top: 10px; } div#content_body div.hud_content a:link { color: #a6a6a6; } div#content_body div.hud_content a:visited { color: #a6a6a6; } div#content_body div.hud_content { color: #a6a6a6; } div#content_body div.hud_content a:hover { color: #fff; } span.hud_span_top { display: inline-block; height: 19px; } span.hud_span_bottom { display: inline-block; height: 23px; } div#recent_pages_content { width: 170px; height: 200px; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */ /* @group Preferences */ form.prefs { width: 580px; margin-right: auto; margin-left: auto; } form.prefs legend { padding: 2px 15px 2px 8px; color: #3f3f3f; border: 2px solid #bfbfbf; background-color: white; 
font-weight: bold; font-size: .9em; } form.prefs fieldset { margin-bottom: 20px; padding-bottom: 25px; border: 2px solid #bfbfbf; background-color: #fafafa; } form.prefs label { display: block; margin-left: 50px; margin-bottom: -8px; font-size: .9em; } form.prefs fieldset input { float: left; margin-left: 25px; } div.form_section { font-size: .9em; color: #3f3f3f; border-bottom: 1px solid #bfbfbf; padding-bottom: 1px; margin: 7px 10px 13px; font-weight: bold; font-style: italic; } div.form_buttons { padding-bottom: 5px; text-align: center; } div.form_buttons input { margin-right: 6px; margin-left: 6px; font-size: 1.2em; } /* @end */ perldoc-html/static/css-20100522.css000644 000765 000024 00000031357 12276001417 016711 0ustar00jjstaff000000 000000 /* @override http://localhost:3323/static/css-20100402.css */ /* @override http://localhost:3323/static/css-20100402.css */ /* @override http://localhost:3323/css-20090810.css */ /* @override http://localhost:3323/css-20090730.css */ /* @override http://localhost:3323/css-20090727.css */ /* @override http://localhost:3323/css-20090722.css */ /* @override http://localhost:3323/css-20090721.css */ /* @override http://localhost:3323/css.css */ /* @group Layout */ body { background-color: #004065; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { width: 946px; margin: 16px auto 26px; position: relative; } p { margin: 0; padding-top: 5px; padding-bottom: 5px; } /* @group Header */ div#header { height: 105px; background-image: url(header.png); position: relative; width: 946px; } div#strapline { color: #b4b4b4; position: absolute; top: 63px; left: 119px; font-size: .8em; text-align: center; width: 228px; } /* @group Download */ div#download_link { position: absolute; top: 48px; font-size: .8em; left: 724px; } div#explore_link { position: absolute; top: 48px; font-size: .8em; left: 846px; } div.download a { text-decoration: none; color: #b4b4b4; } div.download 
a:hover { color: #eaeaea; } /* @end */ /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #afafaf; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 23px; left: 588px; width: 115px; } div#search_form input { border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { padding-top: 0px; background: #fff url(main_bg.png) repeat-y; } div#left_column { float: left; width: 145px; padding-left: 30px; } div#centre_column { float: left; width: 731px; margin-left: 9px; margin-right: 9px; position: relative; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { } div#footer_content { background-image: url(footer_bg.png); padding-right: 74px; padding-left: 74px; } div#footer_strapline { text-align: center; padding-top: 15px; padding-bottom: 15px; color: #6c6c6c; font: lighter 1.05em "Gill Sans"; } div#footer_end { background-image: url(footer.png); width: 946px; height: 42px; clear: both; background-color: #004065; } /* @group Address */ div#address { background: url(onion-grey.png) no-repeat 0 4px; padding-left: 51px; float: left; border-right: 1px solid #b4b4b4; padding-right: 35px; } div#address p.name { margin: 0; color: #414141; } div#address p.address { color: #797979; font-size: .9em; margin: 3px 0 0; padding: 0; } div#address p.contact { color: #797979; margin: 3px 0 0; font-size: .9em; } div#address p.contact a { color: #797979; text-decoration: underline; } div#address p.contact a:hover { color: #515151; } div#address p.address a { color: #797979; text-decoration: underline; } div#address p.address a:hover { color: #515151; } /* @end */ /* @group Links */ div#footer_links { background-image: url(footer_bg.png); } ul.f1 { list-style-type: none; margin: 0; float: 
right; padding: 8px 0 0; } ul.f1 li { float: left; color: #414141; font-size: .8em; text-align: right; padding-left: 40px; } ul.f2 { list-style-type: none; text-align: right; margin: 3px 0 0; padding: 0; } ul.f2 li { float: none; font-size: 1em; margin: 0; padding: 0; } ul.f2 li a { text-decoration: none; color: #858585; } ul.f2 li a:hover { color: #515151; } /* @end */ /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ select#perl_version_select { margin: 5px 8px 11px; width: 121px; } /* @end */ div.side_group { margin-bottom: 25px; } div.side_panel { width: 130px; background: #3b3b3b url(combined-20090925.png) no-repeat 0 -216px; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background: url(combined-20090925.png) -648px -153px; } div.links_panel p { background: url(combined-20090925.png) -786px -153px; } div.tools_panel p { background: url(combined-20090925.png) -648px -194px; } div.side_panel ul { padding-left: 8px; margin-bottom: 0; padding-bottom: 11px; margin-top: 1px; font-size: .9em; margin-left: 0; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: #bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(content_bg.png); top: 0; width: 731px; height: 85px; z-index: 1; } /* @group Title bar */ div#title_bar { height: 53px; position: relative; background: url(center_header.png) 0 0; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", 
Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div.page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; } div#page_links_top { top: 18px; right: 179px; } div#page_links_bottom { top: 33px; right: 179px; } div.page_links a { color: #bebebe; text-decoration: none; } div.page_links a:hover { color: white; } div.page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 30px; padding-top: 10px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { padding-right: 30px; padding-left: 30px; font-size: .8em; color: #515151; position: relative; top: 0px; background-image: url(content_bg.png); } div#content_body h1 { font: 1.9em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; background: url(h2_bg.png) no-repeat -2px 29px; padding-left: 0px; padding-top: 6px; padding-bottom: 8px; margin-top: 0px; margin-bottom: 0; } div#content_body h2 { font: 1.5em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; padding-top: 12px; margin: 0; background: url(h2_bg.png) no-repeat -2px 30px; padding-bottom: 4px; padding-left: 0; } div#content_body h3 { font: 1.3em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; margin-top: 9px; margin-bottom: 3px; } div#content_body p { margin-top: 0px; margin-bottom: 0px; } div#content_body ul li { padding-bottom: 3px; } div#content_body dt { font-weight: bold; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; 
color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { width: 731px; height: 25px; background: url(content_footer.png) no-repeat; margin-bottom: 20px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h { color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: #B452CD;} /* numeric */ .p { color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* structure */ .sc { color: #000000;} /* semicolon */ .v { color: #B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* bareword */ a.l_w:hover { color: #225533; } /* bareword */ code.inline { background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; font-size: 1.2em; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; 
border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; font-size: 1.2em; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { top: 180px; left: 550px; } div#recent_pages { top: 180px; left: 250px; } div.hud_container { position: fixed; z-index: 2; display: none; visibility: hidden; } div.hud_header { } div.hud_footer { } div.hud_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 23px; float: left; } div.hud_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div.hud_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: absolute; right: 0; top: 0; } div.hud_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div.hud_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; position: absolute; right: 0; bottom: 0; cursor: se-resize; } div.hud_close { width: 24px; height: 23px; background-image: url(popup_close.png); float: left; } div.hud_close a { display: block; width: 24px; height: 23px; } div.hud_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div.hud_content ul { margin-top: 10px; } div#content_body div.hud_content a:link { color: #a6a6a6; } div#content_body div.hud_content a:visited { color: #a6a6a6; } div#content_body div.hud_content { color: #a6a6a6; } div#content_body div.hud_content a:hover { color: 
#fff; } span.hud_span_top { display: inline-block; height: 19px; } span.hud_span_bottom { display: inline-block; height: 23px; } div#recent_pages_content { width: 170px; height: 200px; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */ /* @group Preferences */ form.prefs { width: 580px; margin-right: auto; margin-left: auto; } form.prefs legend { padding: 2px 15px 2px 8px; color: #3f3f3f; border: 2px solid #bfbfbf; background-color: white; font-weight: bold; font-size: .9em; } form.prefs fieldset { margin-bottom: 20px; padding-bottom: 25px; border: 2px solid #bfbfbf; background-color: #fafafa; } form.prefs label { display: block; margin-left: 50px; margin-bottom: -8px; font-size: .9em; } form.prefs fieldset input { float: left; margin-left: 25px; } div.form_section { font-size: .9em; color: #3f3f3f; border-bottom: 1px solid #bfbfbf; padding-bottom: 1px; margin: 7px 10px 13px; font-weight: bold; font-style: italic; } div.form_buttons { padding-bottom: 5px; text-align: center; } div.form_buttons input { margin-right: 6px; margin-left: 6px; font-size: 1.2em; } /* @end */ perldoc-html/static/css-20100830.css000644 000765 000024 00000031504 12276001417 016705 0ustar00jjstaff000000 000000 /* @override http://perldoc.perl.org/static/css-20100522.css */ /* @override http://localhost:3323/static/css-20100402.css */ /* @override http://localhost:3323/static/css-20100402.css */ /* @override http://localhost:3323/css-20090810.css */ /* @override http://localhost:3323/css-20090730.css */ /* @override http://localhost:3323/css-20090727.css */ /* @override http://localhost:3323/css-20090722.css */ /* @override http://localhost:3323/css-20090721.css */ /* @override http://localhost:3323/css.css */ /* @group Layout */ body { background-color: #004065; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { 
width: 946px; margin: 16px auto 26px; position: relative; } p { margin: 0; padding-top: 5px; padding-bottom: 5px; } /* @group Header */ div#header { height: 105px; background-image: url(header.png); position: relative; width: 946px; } div#strapline { color: #b4b4b4; position: absolute; top: 63px; left: 119px; font-size: .8em; text-align: center; width: 228px; } /* @group Download */ div#download_link { position: absolute; top: 48px; font-size: .8em; left: 724px; } div#explore_link { position: absolute; top: 48px; font-size: .8em; left: 846px; } div.download a { text-decoration: none; color: #b4b4b4; } div.download a:hover { color: #eaeaea; } /* @end */ /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #afafaf; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 23px; left: 588px; width: 115px; } div#search_form input { border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { padding-top: 0px; background: url(main_bg.png) repeat-y; } div#left_column { float: left; width: 145px; padding-left: 30px; } div#centre_column { float: left; width: 731px; margin-left: 9px; margin-right: 9px; position: relative; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { } div#footer_content { background-image: url(footer_bg.png); padding-right: 74px; padding-left: 74px; } div#footer_strapline { text-align: center; padding-top: 15px; padding-bottom: 15px; color: #6c6c6c; font: lighter 1.05em "Gill Sans"; } div#footer_end { background-image: url(footer.png); width: 946px; height: 42px; clear: both; background-color: #004065; } /* @group Address */ div#address { background: url(onion-grey.png) no-repeat 0 4px; padding-left: 51px; float: left; border-right: 1px solid #b4b4b4; 
padding-right: 35px; } div#address p.name { margin: 0; color: #414141; } div#address p.address { color: #797979; font-size: .9em; margin: 3px 0 0; padding: 0; } div#address p.contact { color: #797979; margin: 3px 0 0; font-size: .9em; } div#address p.contact a { color: #797979; text-decoration: underline; } div#address p.contact a:hover { color: #515151; } div#address p.address a { color: #797979; text-decoration: underline; } div#address p.address a:hover { color: #515151; } /* @end */ /* @group Links */ div#footer_links { background-image: url(footer_bg.png); } ul.f1 { list-style-type: none; margin: 0; float: right; padding: 8px 0 0; } ul.f1 li { float: left; color: #414141; font-size: .8em; text-align: right; padding-left: 40px; } ul.f2 { list-style-type: none; text-align: right; margin: 3px 0 0; padding: 0; } ul.f2 li { float: none; font-size: 1em; margin: 0; padding: 0; } ul.f2 li a { text-decoration: none; color: #858585; } ul.f2 li a:hover { color: #515151; } /* @end */ /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ select#perl_version_select { margin: 5px 8px 11px; width: 121px; } /* @end */ div.side_group { margin-bottom: 25px; } div.side_panel { width: 130px; background: #3b3b3b url(combined-20090925.png) no-repeat 0 -216px; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background: url(combined-20090925.png) -648px -153px; } div.links_panel p { background: url(combined-20090925.png) -786px -153px; } div.tools_panel p { background: url(combined-20090925.png) -648px -194px; } div.side_panel ul { padding-left: 8px; margin-bottom: 0; padding-bottom: 11px; margin-top: 1px; font-size: .9em; margin-left: 0; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: 
#bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(content_bg.png); top: 0; width: 731px; height: 85px; z-index: 1; } /* @group Title bar */ div#title_bar { height: 53px; position: relative; background: url(center_header.png) 0 0; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div.page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; } div#page_links_top { top: 18px; right: 179px; } div#page_links_bottom { top: 33px; right: 179px; } div.page_links a { color: #bebebe; text-decoration: none; } div.page_links a:hover { color: white; } div.page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 30px; padding-top: 10px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { padding-right: 30px; padding-left: 30px; font-size: .8em; color: #515151; position: relative; top: 0px; background-image: url(content_bg.png); background-color: #fff; } div#content_body h1 { font: 1.9em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; background: url(h2_bg.png) no-repeat -2px 29px; padding-left: 0px; padding-top: 6px; padding-bottom: 8px; margin-top: 0px; margin-bottom: 0; } div#content_body h2 { font: 1.5em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; padding-top: 12px; margin: 0; background: 
url(h2_bg.png) no-repeat -2px 30px; padding-bottom: 4px; padding-left: 0; } div#content_body h3 { font: 1.3em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; margin-top: 9px; margin-bottom: 3px; } div#content_body p { margin-top: 0px; margin-bottom: 0px; } div#content_body ul li { padding-bottom: 3px; } div#content_body dt { font-weight: bold; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { width: 731px; height: 25px; background: url(content_footer.png) no-repeat; margin-bottom: 20px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h { color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: #B452CD;} /* numeric */ .p { color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* structure */ .sc { 
color: #000000;} /* semicolon */ .v { color: #B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* keyword */ a.l_w:hover { color: #225533; } /* keyword */ code.inline { background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; font-size: 1.2em; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; font-size: 1.2em; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { top: 180px; left: 550px; } div#recent_pages { top: 180px; left: 250px; } div.hud_container { position: fixed; z-index: 2; display: none; visibility: hidden; } div.hud_header { } div.hud_footer { } div.hud_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 23px; float: left; } div.hud_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div.hud_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: absolute; right: 0; top: 0; } div.hud_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div.hud_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; 
position: absolute; right: 0; bottom: 0; cursor: se-resize; } div.hud_close { width: 24px; height: 23px; background-image: url(popup_close.png); float: left; } div.hud_close a { display: block; width: 24px; height: 23px; } div.hud_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div.hud_content ul { margin-top: 10px; } div#content_body div.hud_content a:link { color: #a6a6a6; } div#content_body div.hud_content a:visited { color: #a6a6a6; } div#content_body div.hud_content { color: #a6a6a6; } div#content_body div.hud_content a:hover { color: #fff; } span.hud_span_top { display: inline-block; height: 19px; } span.hud_span_bottom { display: inline-block; height: 23px; } div#recent_pages_content { width: 170px; height: 200px; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */ /* @group Preferences */ form.prefs { width: 580px; margin-right: auto; margin-left: auto; } form.prefs legend { padding: 2px 15px 2px 8px; color: #3f3f3f; border: 2px solid #bfbfbf; background-color: white; font-weight: bold; font-size: .9em; } form.prefs fieldset { margin-bottom: 20px; padding-bottom: 25px; border: 2px solid #bfbfbf; background-color: #fafafa; } form.prefs label { display: block; margin-left: 50px; margin-bottom: -8px; font-size: .9em; } form.prefs fieldset input { float: left; margin-left: 25px; } div.form_section { font-size: .9em; color: #3f3f3f; border-bottom: 1px solid #bfbfbf; padding-bottom: 1px; margin: 7px 10px 13px; font-weight: bold; font-style: italic; } div.form_buttons { padding-bottom: 5px; text-align: center; } div.form_buttons input { margin-right: 6px; margin-left: 6px; font-size: 1.2em; } /* @end */ perldoc-html/static/css.css000644 000765 000024 00000021140 12276001417 016005 0ustar00jjstaff000000 000000 /* @override http://localhost:3323/css.css */ /* 
@group Layout */ body { background-color: #1d3648; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { background-color: #f1f1f1; width: 985px; margin: 15px auto; } /* @group Header */ div#header { height: 105px; background-image: url(banner.png); position: relative; } /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #7f7f7f; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 48px; left: 810px; width: 115px; } div#search_form input { border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { background-image: url(page_bg.png); padding-top: 0px; } div#left_column { float: left; width: 145px; padding-left: 15px; } div#centre_column { float: left; width: 648px; margin-left: 8px; margin-right: 9px; position: relative; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { height: 48px; background-image: url(footer.png); } /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ form#perl_version { padding-left: 7px; margin-top: 3px; padding-bottom: 10px; } /* @end */ div.side_group { margin-bottom: 25px; } div.side_panel { width: 130px; background: #2f2f2f url(panel_bg.png) no-repeat; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background-image: url(panel_top_doc.png); } div.links_panel p { background-image: url(panel_top_link.png); } div.tools_panel p { background-image: url(panel_top_tools.png); } div.side_panel ul { padding-left: 8px; margin-bottom: 0; 
padding-bottom: 11px; margin-top: 1px; font-size: .9em; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: #bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(center_bg.png); top: 0; width: 648px; height: 90px; z-index: 1; } /* @group Title bar */ div#title_bar { background-image: url(center_header.png); height: 53px; position: relative; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div#page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; top: 33px; right: 21px; } div#page_links a { color: #bebebe; text-decoration: none; } div#page_links a:hover { color: white; } div#page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 12px; padding-top: 5px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { background-image: url(center_bg.png); padding-right: 10px; padding-left: 10px; font-size: .9em; color: #262626; position: relative; top: 0px; width: 628px; } div#content_body h1 { margin-top: 0; margin-bottom: 6px; font-size: 1.4em; border-bottom: 2px solid #d8d8d8; color: #3f3f3f; } div#content_body h2 { margin-top: 0; margin-bottom: 6px; font-size: 1.2em; border-bottom: 1px solid #d8d8d8; color: #3f3f3f; } div#content_body p { margin-top: 2px; margin-bottom: 12px; } div#content_body ul li { 
padding-bottom: 2px; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { background-image: url(center_footer.png); height: 10px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h { color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: #B452CD;} /* numeric */ .p { color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* structure */ .sc { color: #000000;} /* semicolon */ .v { color: #B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* keyword */ a.l_w:hover { color: #225533; } /* keyword */ code.inline { 
background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { position: fixed; top: 180px; left: 550px; visibility: hidden; z-index: 2; } div#page_index_header { } div#page_index_footer { } div#page_index_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 23px; float: left; } div#page_index_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div#page_index_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: absolute; right: 0; top: 0; } div#page_index_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div#page_index_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; position: absolute; right: 0; bottom: 0; cursor: se-resize; } div#page_index_close { width: 24px; height: 23px; background-image: url(popup_close.png); float: left; } div#page_index_close a { display: block; width: 24px; height: 23px; } div#page_index_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div#page_index_content ul { margin-top: 10px; } 
div#page_index_content a:link { color: #a6a6a6; } div#page_index_content a:hover { color: #fff; } span.page_index_top { display: inline-block; height: 19px; } span.page_index_bottom { display: inline-block; height: 23px; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */
//   • heading
//
//
addContent: function(parent_id) {
  var parent = document.getElementById(parent_id);
  if (parent) {
    var ul = this.maker('ul'),
        li = this.maker('li'),
        a = this.maker('a'),
        span = this.maker('span');
    topMenu = ul({'class': 'explorePerl_l1'});
    for (m=0; m 1) {o = 1};
  if (o < 0) {o = 0; this.container.style.visibility = 'hidden';};
  if (this.fadeDirection == 'in') {
    if (o < 1) {
      o = o + 0.1;
      this.o = o;
      this.container.style.visibility = 'visible';
      this.container.style.opacity = o;
      //this.container.style.filter = 'alpha(opacity=' + parseInt(o * 100) + ')';
      this.fadeTimeout = setTimeout(function(){x.doFade()},1);
    }
  }
  if (this.fadeDirection == 'out') {
    if (o > 0) {
      o = o - 0.1;
      this.o = o;
      this.container.style.opacity = o;
      //this.container.style.filter = 'alpha(opacity=' + parseInt(o * 100) + ')';
      this.fadeTimeout = setTimeout(function(){x.doFade()},1);
    }
  }
},

// 'make' and 'maker' functions adapted from the book
// JavaScript: The Definitive Guide, 5th Edition, by David Flanagan.
// Copyright 2006 O'Reilly Media, Inc. (ISBN #0596101996)
//
// See http://oreilly.com/catalog/9780596101992/

/**
 * make(tagname, attributes, children):
 * create an HTML element with specified tagname, attributes and children.
 *
 * The attributes argument is a JavaScript object: the names and values of its
 * properties are taken as the names and values of the attributes to set.
 * If attributes is null, and children is an array or a string, the attributes
 * can be omitted altogether and the children passed as the second argument.
 *
 * The children argument is normally an array of children to be added to
 * the created element. If there are no children, this argument can be
 * omitted. If there is only a single child, it can be passed directly
 * instead of being enclosed in an array. (But if the child is not a string
 * and no attributes are specified, an array must be used.)
 *
 * Example: make("p", ["This is a ", make("b", "bold"), " word."]);
 *
 * Inspired by the MochiKit library (http://mochikit.com) by Bob Ippolito
 */
make: function(tagname, attributes, children) {
  // If we were invoked with two arguments the attributes argument is
  // an array or string, it should really be the children arguments.
  if (arguments.length == 2 &&
      (attributes instanceof Array || typeof attributes == "string")) {
    children = attributes;
    attributes = null;
  }

  // Create the element
  var e = document.createElement(tagname);

  // Set attributes
  if (attributes) {
    for(var name in attributes) {
      if (name == "class") { // Fix for IE7
        e.className = attributes[name];
      } else {
        e.setAttribute(name, attributes[name]);
      }
    }
  }

  // Add children, if any were specified.
  if (children != null) {
    if (children instanceof Array) { // If it really is an array
      for(var i = 0; i < children.length; i++) { // Loop through kids
        var child = children[i];
        if (typeof child == "string") // Handle text nodes
          child = document.createTextNode(child);
        e.appendChild(child); // Assume anything else is a Node
      }
    }
    else if (typeof children == "string") // Handle single text child
      e.appendChild(document.createTextNode(children));
    else e.appendChild(children); // Handle any other single child
  }

  // Finally, return the element.
  return e;
},

/**
 * maker(tagname): return a function that calls make() for the specified tag.
* Example: var table = maker("table"), tr = maker("tr"), td = maker("td"); */ maker: function (tag) { var make = this.make; return function(attrs, kids) { if (arguments.length == 1) return make(tag, attrs); else return make(tag, attrs, kids); } } }; perldoc-html/static/extracted-css-20100402.css000644 000765 000024 00000030551 12276001417 020662 0ustar00jjstaff000000 000000 /* @override http://localhost:3323/static/css-20100402.css */ /* @override http://localhost:3323/css-20090810.css */ /* @override http://localhost:3323/css-20090730.css */ /* @override http://localhost:3323/css-20090727.css */ /* @override http://localhost:3323/css-20090722.css */ /* @override http://localhost:3323/css-20090721.css */ /* @override http://localhost:3323/css.css */ /* @group Layout */ body { background-color: #004065; margin: 0; font-family: "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div.clear { clear: both; } div#page { width: 946px; margin: 16px auto 26px; position: relative; } p { margin: 0; padding-top: 5px; padding-bottom: 5px; } /* @group Header */ div#header { height: 105px; background-image: url(header.png); position: relative; width: 946px; } div#strapline { color: #b4b4b4; position: absolute; top: 63px; left: 119px; font-size: .8em; text-align: center; width: 228px; } /* @group Download */ div#download_link { position: absolute; top: 48px; font-size: .8em; left: 724px; } div#explore_link { position: absolute; top: 48px; font-size: .8em; left: 846px; } div.download a { text-decoration: none; color: #b4b4b4; } div.download a:hover { color: #eaeaea; } /* @end */ /* @group Homepage link */ div#homepage_link { position: absolute; top: 10px; left: 10px; height: 90px; width: 340px; } div#homepage_link a { display: block; height: 90px; } /* @end */ /* @group Search form */ label.overTxtLabel { color: #afafaf; font-size: .8em; margin-top: -2px; margin-left: -6px; } div#search_form { position: relative; top: 23px; left: 588px; width: 115px; } div#search_form input { 
border-style: none; width: 115px; } /* @end */ /* @group Body */ div#body { padding-top: 0px; background: url(main_bg.png) repeat-y; } div#left_column { float: left; width: 145px; padding-left: 30px; } div#centre_column { float: left; width: 731px; margin-left: 9px; margin-right: 9px; position: relative; background-color: #ffffff; } div#right_column { float: left; width: 145px; padding-left: 6px; } /* @end */ /* @group Footer */ div#footer { } div#footer_content { background-image: url(footer_bg.png); padding-right: 74px; padding-left: 74px; } div#footer_strapline { text-align: center; padding-top: 15px; padding-bottom: 15px; color: #6c6c6c; font: lighter 1.05em "Gill Sans"; } div#footer_end { background-image: url(footer.png); width: 946px; height: 42px; clear: both; } /* @group Address */ div#address { background: url(onion-grey.png) no-repeat 0 4px; padding-left: 51px; float: left; border-right: 1px solid #b4b4b4; padding-right: 35px; } div#address p.name { margin: 0; color: #414141; } div#address p.address { color: #797979; font-size: .9em; margin: 3px 0 0; padding: 0; } div#address p.contact { margin: 3px 0 0; font-size: .9em; } div#address p.contact a { color: #797979; text-decoration: none; } div#address p.contact a:hover { color: #515151; } /* @end */ /* @group Links */ div#footer_links { background-image: url(footer_bg.png); } ul.f1 { list-style-type: none; margin: 0; float: right; padding: 8px 0 0; } ul.f1 li { float: left; color: #414141; font-size: .8em; text-align: right; padding-left: 40px; } ul.f2 { list-style-type: none; text-align: right; margin: 3px 0 0; padding: 0; } ul.f2 li { float: none; font-size: 1em; margin: 0; padding: 0; } ul.f2 li a { text-decoration: none; color: #858585; } ul.f2 li a:hover { color: #515151; } /* @end */ /* @end */ /* @end */ /* @group Side panels */ /* @group Perl version form */ select#perl_version_select { margin: 5px 8px 11px; } /* @end */ div.side_group { margin-bottom: 25px; } div.side_panel { width: 130px; 
background: #3b3b3b url(combined-20090925.png) no-repeat 0 -216px; color: #d8d8d8; padding-right: 8px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin-bottom: 1px; } div.side_panel p { margin: 0; padding-top: 14px; padding-left: 42px; height: 27px; width: 96px; padding-bottom: 0; } div.doc_panel p { background: url(combined-20090925.png) -648px -153px; } div.links_panel p { background: url(combined-20090925.png) -786px -153px; } div.tools_panel p { background: url(combined-20090925.png) -648px -194px; } div.side_panel ul { padding-left: 8px; margin-bottom: 0; padding-bottom: 11px; margin-top: 1px; font-size: .9em; margin-left: 0; } div.side_panel li { list-style-type: none; text-align: right; padding-bottom: 1px; } div.side_panel a { color: #bebebe; text-decoration: none; } div.side_panel a:hover { color: white; } /* @end */ /* @group Content */ /* @group Header */ div#content_header { background-image: url(content_bg.png); top: 0; width: 731px; height: 85px; z-index: 1; } /* @group Title bar */ div#title_bar { height: 53px; position: relative; background: url(center_header.png) 0 0; } div#page_name { position: absolute; left: 46px; top: 16px; } div#page_name h1 { color: white; font: normal normal 0.85em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; margin: 0; padding: 0; } div#perl_version { color: #bebebe; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; left: 46px; top: 33px; } div#page_links { color: #7f7f7f; font: .7em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; position: absolute; top: 33px; right: 179px; } div#page_links a { color: #bebebe; text-decoration: none; } div#page_links a:hover { color: white; } div#page_links a:focus { outline: none; } /* @end */ div#breadcrumbs { color: #7f7f7f; padding-left: 30px; padding-top: 10px; font: .8em "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif; } div#breadcrumbs a { color: #5a5a5a; } /* @end */ /* @group Body */ 
div.mod_az_list { font-size: .9em; text-align: center; margin-top: 15px; margin-bottom: 15px; } div#content_body { padding-right: 30px; padding-left: 30px; font-size: .8em; color: #515151; position: relative; top: 0px; background-image: url(content_bg.png); } div#content_body h1 { font: 1.9em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; background: url(h2_bg.png) no-repeat -2px 26px; padding-left: 0px; padding-top: 0px; padding-bottom: 8px; margin-top: 0; margin-bottom: 0; } div#content_body h2 { font: 1.5em "ITC Garamond", Garamond, Georgia, "Times New Roman", Times, serif; color: #36497d; padding-top: 12px; margin: 0; background: url(h2_bg.png) no-repeat -2px 30px; padding-bottom: 4px; padding-left: 0; } div#content_body h3 { margin-top: 6px; margin-bottom: 6px; font-size: 1.1em; color: #3f3f3f; } div#content_body p { margin-top: 0px; margin-bottom: 0px; } div#content_body ul li { padding-bottom: 2px; } div#content_body a:link, div#content_body a:active { color: #36415c; } div#content_body a:visited { color: #666666; } div#content_body a:hover { color: #888; } div#searchBanner { text-align: center; background-color: #5ab06d; padding: 10px; margin-bottom: 15px; color: #f1f1f1; border: 5px solid #1a661c; margin-right: 15px; margin-left: 15px; } div#searchBanner a { color: white; font-weight: bold; } div#searchBanner a:link { color: white; font-weight: bold; } div#searchBanner a:visited { color: white; font-weight: bold; } div#searchBanner a:hover { color: white; font-weight: bold; } div.noscript { text-align: center; padding: 10px; margin-bottom: 15px; border: 5px solid #dc4c24; margin-right: 15px; margin-left: 15px; } /* @end */ /* @group Footer */ div#content_footer { width: 731px; height: 25px; background: url(content_footer.png) no-repeat; margin-bottom: 20px; } /* @end */ /* @end */ /* @group Syntax highlighting */ .c { color: #228B22;} /* comment */ .cm { color: #000000;} /* comma */ .co { color: #000000;} /* colon */ .h 
{ color: #CD5555; font-weight:bold;} /* here-doc-target */ .hh { color: #CD5555; font-style:italic;} /* here-doc-text */ .i { color: #00688B;} /* identifier */ .j { color: #CD5555; font-weight:bold;} /* label */ .k { color: #8B008B; font-weight:bold;} /* keyword */ .m { color: #FF0000; font-weight:bold;} /* subroutine */ .n { color: #B452CD;} /* numeric */ .p { color: #000000;} /* paren */ .pd { color: #228B22; font-style:italic;} /* pod-text */ .pu { color: #000000;} /* punctuation */ .q { color: #CD5555;} /* quote */ .s { color: #000000;} /* structure */ .sc { color: #000000;} /* semicolon */ .v { color: #B452CD;} /* v-string */ .w { color: #000000;} /* bareword */ a.l_k:link, a.l_k:visited, a.l_k:active { color: #8B008B; font-weight:bold;} /* keyword */ a.l_k:hover { color: #225533; font-weight:bold;} /* keyword */ a.l_w:link, a.l_w:visited, a.l_w:active { color: #000000; } /* keyword */ a.l_w:hover { color: #225533; } /* keyword */ code.inline { background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; } pre.verbatim { margin-left: 10px; margin-right: 10px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; word-wrap: break-word; white-space: pre-wrap; padding: 3px; } pre.verbatim ol { background-color: #d8d8d8; color: #3f3f3f; margin-top: 0; margin-bottom: 0; } pre.verbatim ol li { background: #eeeedd; padding-left: 5px; color: #262626; padding-bottom: 2px; } pre.indent { margin-left: 30px; margin-right: 30px; background: #eeeedd; border-width: 1px; border-style: solid solid solid solid; border-color: #ccc; padding: 10px 10px 10px 10px; } /* @end */ /* @group Page Index */ div#page_index { top: 180px; left: 550px; } div#recent_pages { top: 180px; left: 250px; } div.hud_container { position: fixed; z-index: 2; display: none; visibility: hidden; } div.hud_header { } div.hud_footer { } div.hud_bottomleft { background: url(popup_bottomleft.png) no-repeat; width: 8px; height: 
23px; float: left; } div.hud_title { background-image: url(popup_title.png); margin-left: 24px; margin-right: 8px; padding-top: 4px; color: #d8d8d8; text-align: center; padding-right: 16px; font-size: .9em; cursor: move; } div.hud_topright { width: 8px; height: 23px; background-image: url(popup_topright.png); position: absolute; right: 0; top: 0; } div.hud_bottom { background-image: url(popup_bg.png); margin-left: 8px; margin-right: 23px; } div.hud_resize { background: url(popup_resize.png) no-repeat; width: 23px; height: 23px; position: absolute; right: 0; bottom: 0; cursor: se-resize; } div.hud_close { width: 24px; height: 23px; background-image: url(popup_close.png); float: left; } div.hud_close a { display: block; width: 24px; height: 23px; } div.hud_content { overflow: auto; width: 250px; height: 200px; background-image: url(popup_bg.png); font-size: .9em; color: #7f7f7f; padding-right: 5px; } div.hud_content ul { margin-top: 10px; } div#content_body div.hud_content a:link { color: #a6a6a6; } div#content_body div.hud_content { color: #a6a6a6; } div#content_body div.hud_content a:hover { color: #fff; } span.hud_span_top { display: inline-block; height: 19px; } span.hud_span_bottom { display: inline-block; height: 23px; } div#recent_pages_content { width: 100px; height: 180px; } div#page_index_content a:link { color: #a6a6a6; } div#page_index_content { color: #a6a6a6; } div#page_index_content a:hover { color: #fff; } /* @end */ /* @group Search results */ div.search_results { margin-bottom: 15px; } div.search_results img { margin-right: 10px; vertical-align: top; } /* @end */ /* @group Preferences */ form.prefs { width: 580px; margin-right: auto; margin-left: auto; } form.prefs legend { padding: 2px 15px 2px 8px; color: #3f3f3f; border: 2px solid #bfbfbf; background-color: white; font-weight: bold; font-size: .9em; } form.prefs fieldset { margin-bottom: 20px; padding-bottom: 25px; border: 2px solid #bfbfbf; background-color: #fafafa; } form.prefs label { 
display: block; margin-left: 50px; margin-bottom: -8px; font-size: .9em; } form.prefs fieldset input { float: left; margin-left: 25px; } div.form_section { font-size: .9em; color: #3f3f3f; border-bottom: 1px solid #bfbfbf; padding-bottom: 1px; margin: 7px 10px 13px; font-weight: bold; font-style: italic; } div.form_buttons { padding-bottom: 5px; text-align: center; } div.form_buttons input { margin-right: 6px; margin-left: 6px; font-size: 1.2em; } /* @end */
perldoc-html/static/indexFAQs.js000644 000765 000024 00000043304 12276000231 016662 0ustar00jjstaff000000 000000 perldocSearch.indexData.faqs = new Array ( [1, "What is Perl?"], [1, "Who supports Perl? Who develops it? 
Why is it free?"], [1, "Which version of Perl should I use?"], [1, "What are Perl 4, Perl 5, or Perl 6?"], [1, "What is Perl 6?"], [1, "How stable is Perl?"], [1, "Is Perl difficult to learn?"], [1, "How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl?"], [1, "Can I do [task] in Perl?"], [1, "When shouldn't I program in Perl?"], [1, "What's the difference between \"perl\" and \"Perl\"?"], [1, "What is a JAPH?"], [1, "How can I convince others to use Perl?"], [2, "What machines support Perl? Where do I get it?"], [2, "How can I get a binary version of Perl?"], [2, "I don't have a C compiler. How can I build my own Perl interpreter?"], [2, "I copied the Perl binary from one machine to another, but scripts don't work."], [2, "I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work?"], [2, "What modules and extensions are available for Perl? What is CPAN?"], [2, "Where can I get information on Perl?"], [2, "What is perl.com? Perl Mongers? pm.org? perl.org? cpan.org?"], [2, "Where can I post questions?"], [2, "Perl Books"], [2, "Which magazines have Perl content?"], [2, "Which Perl blogs should I read?"], [2, "What mailing lists are there for Perl?"], [2, "Where can I buy a commercial version of Perl?"], [2, "Where do I send bug reports?"], [3, "How do I do (anything)?"], [3, "How can I use Perl interactively?"], [3, "How do I find which modules are installed on my system?"], [3, "How do I debug my Perl programs?"], [3, "How do I profile my Perl programs?"], [3, "How do I cross-reference my Perl programs?"], [3, "Is there a pretty-printer (formatter) for Perl?"], [3, "Is there an IDE or Windows Perl Editor?"], [3, "Where can I get Perl macros for vi?"], [3, "Where can I get perl-mode or cperl-mode for emacs? "], [3, "How can I use curses with Perl?"], [3, "How can I write a GUI (X, Tk, Gtk, etc.) in Perl? 
"], [3, "How can I make my Perl program run faster?"], [3, "How can I make my Perl program take less memory?"], [3, "Is it safe to return a reference to local or lexical data?"], [3, "How can I free an array or hash so my program shrinks?"], [3, "How can I make my CGI script more efficient?"], [3, "How can I hide the source for my Perl program?"], [3, "How can I compile my Perl program into byte code or C?"], [3, "How can I get '#!perl' to work on [MS-DOS,NT,...]?"], [3, "Can I write useful Perl programs on the command line?"], [3, "Why don't Perl one-liners work on my DOS/Mac/VMS system?"], [3, "Where can I learn about CGI or Web programming in Perl?"], [3, "Where can I learn about object-oriented Perl programming?"], [3, "Where can I learn about linking C with Perl?"], [3, "I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong?"], [3, "When I tried to run my script, I got this message. What does it mean?"], [3, "What's MakeMaker?"], [4, "Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?"], [4, "Why is int() broken?"], [4, "Why isn't my octal data interpreted correctly?"], [4, "Does Perl have a round() function? What about ceil() and floor()? Trig functions?"], [4, "How do I convert between numeric representations/bases/radixes?"], [4, "Why doesn't & work the way I want it to?"], [4, "How do I multiply matrices?"], [4, "How do I perform an operation on a series of integers?"], [4, "How can I output Roman numerals?"], [4, "Why aren't my random numbers random?"], [4, "How do I get a random number between X and Y?"], [4, "How do I find the day or week of the year?"], [4, "How do I find the current century or millennium?"], [4, "How can I compare two dates and find the difference?"], [4, "How can I take a string and turn it into epoch seconds?"], [4, "How can I find the Julian Day?"], [4, "How do I find yesterday's date? 
"], [4, "Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant?"], [4, "How do I validate input?"], [4, "How do I unescape a string?"], [4, "How do I remove consecutive pairs of characters?"], [4, "How do I expand function calls in a string?"], [4, "How do I find matching/nesting anything?"], [4, "How do I reverse a string?"], [4, "How do I expand tabs in a string?"], [4, "How do I reformat a paragraph?"], [4, "How can I access or change N characters of a string?"], [4, "How do I change the Nth occurrence of something?"], [4, "How can I count the number of occurrences of a substring within a string?"], [4, "How do I capitalize all the words on one line? "], [4, "How can I split a [character]-delimited string except when inside [character]?"], [4, "How do I strip blank space from the beginning/end of a string?"], [4, "How do I pad a string with blanks or pad a number with zeroes?"], [4, "How do I extract selected columns from a string?"], [4, "How do I find the soundex value of a string?"], [4, "How can I expand variables in text strings?"], [4, "What's wrong with always quoting \"$vars\"?"], [4, "Why don't my <? "], [5, "How can I open a file with a leading \">\" or trailing blanks? "], [5, "How can I reliably rename a file? "], [5, "How can I lock a file? "], [5, "Why can't I just open(FH, \">file.lock\")? "], [5, "I still don't get locking. I just want to increment the number in the file. How can I do this? "], [5, "All I want to do is append a small amount of text to the end of a file. Do I still have to use locking? "], [5, "How do I randomly update a binary file? "], [5, "How do I get a file's timestamp in perl? "], [5, "How do I set a file's timestamp in perl? "], [5, "How do I print to more than one file at once? "], [5, "How can I read in an entire file all at once? "], [5, "How can I read in a file by paragraphs? "], [5, "How can I read a single character from a file? From the keyboard? 
"], [5, "How can I tell whether there's a character waiting on a filehandle?"], [5, "How do I do a 'tail -f' in perl? "], [5, "How do I dup() a filehandle in Perl? "], [5, "How do I close a file descriptor by number? "], [5, "Why can't I use \"C:\\temp\\foo\" in DOS paths? Why doesn't `C:\\temp\\foo.exe` work? "], [5, "Why doesn't glob(\"*.*\") get all the files? "], [5, "Why does Perl let me delete read-only files? Why does '-i' clobber protected files? Isn't this a bug in Perl?"], [5, "How do I select a random line from a file? "], [5, "Why do I get weird spaces when I print an array of lines?"], [5, "How do I traverse a directory tree?"], [5, "How do I delete a directory tree?"], [5, "How do I copy an entire directory?"], [6, "How can I hope to use regular expressions without creating illegible and unmaintainable code? "], [6, "I'm having trouble matching over more than one line. What's wrong? "], [6, "How can I pull out lines between two patterns that are themselves on different lines? "], [6, "How do I match XML, HTML, or other nasty, ugly things with a regex? "], [6, "I put a regular expression into $/ but it didn't work. What's wrong? "], [6, "How do I substitute case-insensitively on the LHS while preserving case on the RHS? "], [6, "How can I make '\\w' match national character sets? "], [6, "How can I match a locale-smart version of '/[a-zA-Z]/'? "], [6, "How can I quote a variable to use in a regex? "], [6, "What is '/o' really for? "], [6, "How do I use a regular expression to strip C-style comments from a file?"], [6, "Can I use Perl regular expressions to match balanced text? "], [6, "What does it mean that regexes are greedy? How can I get around it? "], [6, "How do I process each word on each line? "], [6, "How can I print out a word-frequency or line-frequency summary?"], [6, "How can I do approximate matching? "], [6, "How do I efficiently match many regular expressions at once? "], [6, "Why don't word-boundary searches with '\\b' work for me? 
"], [6, "Why does using $&, $`, or $' slow my program down? "], [6, "What good is '\\G' in a regular expression? "], [6, "Are Perl regexes DFAs or NFAs? Are they POSIX compliant? "], [6, "What's wrong with using grep in a void context? "], [6, "How can I match strings with multibyte characters? "], [6, "How do I match a regular expression that's in a variable? "], [7, "Can I get a BNF/yacc/RE for the Perl language?"], [7, "What are all these $@%&* punctuation signs, and how do I know when to use them?"], [7, "Do I always/never have to quote my strings or use semicolons and commas?"], [7, "How do I skip some return values?"], [7, "How do I temporarily block warnings?"], [7, "What's an extension?"], [7, "Why do Perl operators have different precedence than C operators?"], [7, "How do I declare/create a structure?"], [7, "How do I create a module?"], [7, "How do I adopt or take over a module already on CPAN?"], [7, "How do I create a class? "], [7, "How can I tell if a variable is tainted?"], [7, "What's a closure?"], [7, "What is variable suicide and how can I prevent it?"], [7, "How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}?"], [7, "How do I create a static variable?"], [7, "What's the difference between dynamic and lexical (static) scoping? 
Between local() and my()?"], [7, "How can I access a dynamic variable while a similarly named lexical is in scope?"], [7, "What's the difference between deep and shallow binding?"], [7, "Why doesn't \"my($foo) = <$fh>;\" work right?"], [7, "How do I redefine a builtin function, operator, or method?"], [7, "What's the difference between calling a function as &foo and foo()?"], [7, "How do I create a switch or case statement?"], [7, "How can I catch accesses to undefined variables, functions, or methods?"], [7, "Why can't a method included in this same file be found?"], [7, "How can I find out my current or calling package?"], [7, "How can I comment out a large block of Perl code?"], [7, "How do I clear a package?"], [7, "How can I use a variable as a variable name?"], [7, "What does \"bad interpreter\" mean?"], [8, "How do I find out which operating system I'm running under?"], [8, "How come exec() doesn't return? "], [8, "How do I do fancy stuff with the keyboard/screen/mouse?"], [8, "How do I print something out in color?"], [8, "How do I read just one key without waiting for a return key?"], [8, "How do I check whether input is ready on the keyboard?"], [8, "How do I clear the screen?"], [8, "How do I get the screen size?"], [8, "How do I ask the user for a password?"], [8, "How do I read and write the serial port?"], [8, "How do I decode encrypted password files?"], [8, "How do I start a process in the background?"], [8, "How do I trap control characters/signals?"], [8, "How do I modify the shadow password file on a Unix system?"], [8, "How do I set the time and date?"], [8, "How can I sleep() or alarm() for under a second? "], [8, "How can I measure time under a second? "], [8, "How can I do an atexit() or setjmp()/longjmp()? (Exception handling)"], [8, "Why doesn't my sockets program work under System V (Solaris)? 
What does the error message \"Protocol not supported\" mean?"], [8, "How can I call my system's unique C functions from Perl?"], [8, "Where do I get the include files to do ioctl() or syscall()?"], [8, "Why do setuid perl scripts complain about kernel problems?"], [8, "How can I open a pipe both to and from a command?"], [8, "Why can't I get the output of a command with system()?"], [8, "How can I capture STDERR from an external command?"], [8, "Why doesn't open() return an error when a pipe open fails?"], [8, "What's wrong with using backticks in a void context?"], [8, "How can I call backticks without shell processing?"], [8, "Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)?"], [8, "How can I convert my shell script to perl?"], [8, "Can I use perl to run a telnet or ftp session?"], [8, "How can I write expect in Perl?"], [8, "Is there a way to hide perl's command line from programs such as \"ps\"?"], [8, "I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible?"], [8, "How do I close a process's filehandle without waiting for it to complete?"], [8, "How do I fork a daemon process?"], [8, "How do I find out if I'm running interactively or not?"], [8, "How do I timeout a slow event?"], [8, "How do I set CPU limits? 
"], [8, "How do I avoid zombies on a Unix system?"], [8, "How do I use an SQL database?"], [8, "How do I make a system() exit on control-C?"], [8, "How do I open a file without blocking?"], [8, "How do I tell the difference between errors from the shell and perl?"], [8, "How do I install a module from CPAN?"], [8, "What's the difference between require and use?"], [8, "How do I keep my own module/library directory?"], [8, "How do I add the directory my program lives in to the module/library search path?"], [8, "How do I add a directory to my include path (@INC) at runtime?"], [8, "What is socket.ph and where do I get it?"], [9, "Should I use a web framework?"], [9, "Which web framework should I use? "], [9, "What is Plack and PSGI?"], [9, "How do I remove HTML from a string?"], [9, "How do I extract URLs?"], [9, "How do I fetch an HTML file?"], [9, "How do I automate an HTML form submission?"], [9, "How do I decode or create those %-encodings on the web? "], [9, "How do I redirect to another page?"], [9, "How do I put a password on my web pages?"], [9, "How do I make sure users can't enter values into a form that causes my CGI script to do bad things?"], [9, "How do I parse a mail header?"], [9, "How do I check a valid mail address?"], [9, "How do I decode a MIME/BASE64 string?"], [9, "How do I find the user's mail address?"], [9, "How do I send email?"], [9, "How do I use MIME to make an attachment to a mail message?"], [9, "How do I read email?"], [9, "How do I find out my hostname, domainname, or IP address? 
"], [9, "How do I fetch/put an (S)FTP file?"], [9, "How can I do RPC in Perl?"] ); perldoc-html/static/indexFunctions.js000644 000765 000024 00000027173 12276000230 020045 0ustar00jjstaff000000 000000 perldocSearch.indexData.functions = new Hash ({ "_ne": "1", "_scalar": "force a scalar context", "_tr": "transliterate a string", "_my": "declare and assign a local variable (lexical scoping)", "_print": "output a list to a filehandle", "_vec": "test or set particular bits in a string", "_fc": "1", "_stat": "get a file's status information", "_setpriority": "set a process's nice value", "_eof": "test a filehandle for its end", "_getpwent": "get next passwd record", "_tied": "get a reference to the object underlying a tied variable", "_setnetent": "prepare networks file for use", "_system": "run a separate program ", "_time": "return number of seconds since 1970", "_shmget": "get SysV shared memory segment identifier", "_CHECK": "1", "_getprotobyname": "get protocol record given name", "_getpriority": "get current nice value", "_readline": "fetch a record from a file", "_when": "1", "_ord": "find a character's numeric representation", "_bless": "create an object ", "_wait": "wait for any child process to die", "_seekdir": "reposition directory pointer ", "_exists": "test whether a hash key is present", "_substr": "get or alter a portion of a stirng", "_no": "unimport some module symbols or semantics at compile time", "_kill": "send a signal to a process or process group", "_readdir": "get a directory from a directory handle", "_grep": "locate elements in a list test true against a given criterion", "_exp": "raise I to a power", "_ioctl": "system-dependent device control system call", "_values": "return a list of the values in a hash", "_getservbyname": "get services record given its name", "_setsockopt": "set some socket options", "_shift": "remove the first element of an array, and return it", "_hex": "convert a string to a hexadecimal number", "_setgrent": "prepare 
group file for use", "_msgctl": "SysV IPC message control operations", "_msgrcv": "receive a SysV IPC message from a message queue", "_gethostbyname": "get host record given name", "_goto": "create spaghetti code", "___SUB__": "1", "_accept": "accept an incoming socket connect", "_unless": "1", "_import": "patch a module's namespace into your own", "_endpwent": "be done using passwd file", "___DATA__": "1", "_lc": "return lower-case version of a string", "_getgrent": "get next group record ", "_atan2": "arctangent of Y/X in the range -PI to PI", "_fcntl": "file control system call", "_sysread": "fixed-length unbuffered input from a filehandle", "_map": "apply a change to a list to get back a new list with the changes", "_closedir": "close directory handle", "_getservbyport": "get services record given numeric port", "_study": "optimize input data for repeated searches", "_break": "break out of a \"given\" block", "_semctl": "SysV semaphore control operations", "_ref": "find out the type of thing being referenced", "_abs": "absolute value function", "_tell": "get current seekpointer on a filehandle", "_sysopen": "open a file, pipe, or descriptor", "_int": "get the integer portion of a number", "_local": "create a temporary value for a global variable (dynamic scoping)", "_gethostent": "get next hosts record ", "_syswrite": "fixed-length unbuffered output to a filehandle", "_pos": "find or set the offset for the last/next m//g search", "_glob": "expand filenames using wildcards", "_syscall": "execute an arbitrary system call", "_chr": "get character this number represents", "_shmctl": "SysV shared memory operations", "_getnetbyname": "get networks record given name", "_package": "declare a separate global namespace", "_semop": "SysV semaphore operations", "_mkdir": "create a directory", "_lstat": "stat a symbolic link", "_lock": "get a thread lock on a variable, subroutine, or method", "_pipe": "open a pair of connected filehandles", "_state": "declare and assign a 
state variable (persistent lexical scoping)", "_getc": "get the next character from the filehandle", "_fileno": "return file descriptor from filehandle", "_prototype": "get the prototype (if any) of a subroutine", "_msgget": "get SysV IPC message queue", "_exec": "abandon this program to run another", "_srand": "seed the random number generator", "_qq": "doubly quote a string", "_defined": "test whether a value, variable, or function is defined", "_default": "1", "_telldir": "get current seekpointer on a directory handle", "_reverse": "flip a string or a list", "_continue": "optional trailing block in a while or foreach ", "_setpgrp": "set the process group of a process", "_send": "send a message over a socket", "_crypt": "one-way passwd-style encryption", "_use": "load in a module at compile time", "_symlink": "create a symbolic link to a file", "_say": "print with newline", "_log": "retrieve the natural logarithm for a number", "_dbmopen": "create binding on a tied dbm file", "_getprotoent": "get next protocols record", "_return": "get out of a function early", "_until": "1", "_format": "declare a picture format with use by the write() function", "_dbmclose": "breaks binding on a tied dbm file", "_msgsnd": "send a SysV IPC message to a message queue", "_seek": "reposition file pointer for random-access I/O", "_sqrt": "square root function", "_le": "1", "_getppid": "get parent process ID", "_if": "1", "_rename": "change a filename", "_y": "transliterate a string", "___END__": "1", "_chop": "remove the last character from a string", "_caller": "get context of the current subroutine call", "_wantarray": "get void vs scalar vs list context of current subroutine call", "_each": "retrieve the next key/value pair from a hash", "_undef": "remove a variable or function definition", "_open": "open a file, pipe, or descriptor", "_-X": "a file test (-r, -x, etc)", "_getpwuid": "get passwd record given user ID", "_flock": "lock an entire file with an advisory lock", "_qx": 
"backquote quote a string", "_delete": "deletes a value from a hash", "_qr": "Compile pattern ", "_getpeername": "find the other end of a socket connection", "_rindex": "right-to-left substring search", "_and": "1", "_quotemeta": "quote regular expression magic characters", "_ge": "1", "_evalbytes": "1", "___LINE__": "1", "_die": "raise an exception or bail out", "_unshift": "prepend more elements to the beginning of a list", "_AUTOLOAD": "1", "_uc": "return upper-case version of a string", "_warn": "print debugging info", "_getprotobynumber": "get protocol record numeric protocol", "_last": "exit a block prematurely", "_truncate": "shorten a file", "_getlogin": "return who logged in at this tty", "_index": "find a substring within a string", "_length": "return the number of bytes in a string", "_sort": "sort a list of values ", "_xor": "1", "_chdir": "change your current working directory", "_gt": "1", "_UNITCHECK": "1", "_shmwrite": "write SysV shared memory ", "_opendir": "open a directory", "_next": "iterate a block prematurely", "_require": "load in external functions from a library at runtime", "_shmread": "read SysV shared memory ", "_exit": "terminate this program", "_unlink": "remove one link to a file", "_getservent": "get next services record ", "_split": "split up a string using a regexp delimiter", "_write": "print a picture record", "_DESTROY": "1", "_s": "replace a pattern with a string", "_END": "1", "_setprotoent": "prepare protocols file for use", "_sin": "return the sine of a number", "_push": "append one or more elements to an array", "_pack": "convert a list into a binary representation", "_getpgrp": "get process group", "_endprotoent": "be done using protocols file", "_foreach": "1", "_keys": "retrieve list of indices from a hash", "_close": "close file (or pipe or socket) handle", "_printf": "output a formatted list to a filehandle", "_select": "reset default output or do I/O multiplexing", "_formline": "internal function used for formats", 
"_readpipe": "execute a system command and collect standard output", "_gmtime": "convert UNIX time into record or string using Greenwich time", "_m": "match a string with a regular expression pattern", "_chmod": "changes the permissions on a list of files", "_fork": "create a new process just like this one", "_while": "1", "_splice": "add or remove elements anywhere in an array", "_rewinddir": "reset directory handle", "_not": "1", "_getsockname": "retrieve the sockaddr for a given socket", "_our": "declare and assign a package variable (lexical scoping)", "_listen": "register your socket as a server ", "_sethostent": "prepare hosts file for use", "_eq": "1", "_localtime": "convert UNIX time into record or string using local time", "_lcfirst": "return a string with just the next letter in lower case", "_join": "join a list into a string using a separator", "_chown": "change the owership on a list of files", "_q": "singly quote a string", "_endgrent": "be done using group file", "_do": "turn a BLOCK into a TERM", "_untie": "break a tie binding to a variable", "_unpack": "convert binary structure into normal perl variables", "_setservent": "prepare services file for use", "_setpwent": "prepare passwd file for use", "_getnetent": "get next networks record ", "_BEGIN": "1", "_alarm": "schedule a SIGALRM ", "_oct": "convert a string to an octal number", "_getsockopt": "get socket options on a given socket", "_getnetbyaddr": "get network record given its address", "_tie": "bind a variable to an object class ", "_elsif": "1", "_binmode": "prepare binary files for I/O", "___PACKAGE__": "1", "_waitpid": "wait for a particular child process to die", "_gethostbyaddr": "get host record given its address", "_pop": "remove the last element from an array and return it", "_semget": "get set of SysV semaphores", "_times": "return elapsed time for self and child processes", "_ucfirst": "return a string with just the next letter in upper case", "_given": "1", "_redo": "start this 
loop iteration over again", "_rmdir": "remove a directory", "_sysseek": "position I/O pointer on handle used with sysread and syswrite", "_or": "1", "_shutdown": "close down just half of a socket connection", "_read": "fixed-length buffered input from a filehandle", "_chroot": "make directory new root for path lookups", "_for": "1", "_bind": "binds an address to a socket", "_readlink": "determine where a symbolic link is pointing", "_else": "1", "_socket": "create a socket", "_lt": "1", "_utime": "set a file's last access and modify times", "_recv": "receive a message over a Socket", "_endhostent": "be done using hosts file", "_dump": "create an immediate core dump", "_socketpair": "create a pair of sockets", "_x": "1", "_getpwnam": "get passwd record given user login name", "_endnetent": "be done using networks file", "_endservent": "be done using services file", "_cos": "cosine function", "_reset": "clear all variables of a given name", "_INIT": "1", "_getgrnam": "get group record given group name", "_getgrgid": "get group record given group user ID", "_sprintf": "formatted print into a string ", "_elseif": "1", "_connect": "connect to a remote socket", "_eval": "catch exceptions or compile and run code", "_qw": "quote a list of words", "_link": "create a hard link in the filesystem", "_rand": "retrieve the next pseudorandom number ", "_cmp": "1", "_sub": "declare a subroutine, possibly anonymously", "_chomp": "remove a trailing record separator from a string", "___FILE__": "1", "_umask": "set file creation mode mask", "_sleep": "block for some number of seconds" });
// perldoc-html/static/indexModules.js
perldocSearch.indexData.modules = new Hash ({ "attributes": "get/set subroutine or variable attributes", "autodie": "Replace functions with ones that succeed or die with lexical scope", "autouse": "postpone load of modules until a function is used", "base": "Establish an ISA relationship 
with base classes at compile time", "bigint": "Transparent BigInteger support for Perl", "bignum": "Transparent BigNumber support for Perl", "bigrat": "Transparent BigNumber/BigRational support for Perl", "blib": "Use MakeMaker's uninstalled version of a package", "bytes": "Perl pragma to force byte semantics rather than character semantics", "charnames": "access to Unicode character names and named character sequences; also define character names", "constant": "Perl pragma to declare constants", "diagnostics": "produce verbose warning diagnostics", "encoding": "allows you to write your script in non-ascii or non-utf8", "feature": "Perl pragma to enable new features", "fields": "compile-time class fields", "filetest": "Perl pragma to control the filetest permission operators", "if": "use a Perl module if a condition holds", "integer": "Perl pragma to use integer arithmetic instead of floating point", "less": "perl pragma to request less of something", "lib": "manipulate @INC at compile time", "locale": "Perl pragma to use or avoid POSIX locales for built-in operations", "mro": "Method Resolution Order", "open": "perl pragma to set default PerlIO layers for input and output", "ops": "Perl pragma to restrict unsafe operations when compiling", "overload": "Package for overloading Perl operations", "overloading": "perl pragma to lexically control overloading", "parent": "Establish an ISA relationship with base classes at compile time", "re": "Perl pragma to alter regular expression behaviour", "sigtrap": "Perl pragma to enable simple signal handling", "sort": "perl pragma to control sort() behaviour", "strict": "Perl pragma to restrict unsafe constructs", "subs": "Perl pragma to predeclare sub names", "threads": "Perl interpreter-based threads", "threads::shared": "Perl extension for sharing data structures between threads", "utf8": "Perl pragma to enable/disable UTF-8 (or UTF-EBCDIC) in source code", "vars": "Perl pragma to predeclare global variable names", "vmsish": 
"Perl pragma to control VMS-specific language features", "warnings": "Perl pragma to control optional warnings", "warnings::register": "warnings import function", "Pod::Simple": "framework for parsing Pod", "Config": "access Perl configuration information", "Params::Check": "A generic input parsing/checking mechanism.", "ExtUtils::MM_MacOS": "once produced Makefiles for MacOS Classic", "ExtUtils::Embed": "Utilities for embedding Perl in C/C++ applications", "Text::Tabs": "expand and unexpand tabs like unix expand(1) and unexpand(1)", "ExtUtils::MakeMaker": "Create a module Makefile", "Log::Message::Simple": "Simplified interface to Log::Message", "PerlIO::encoding": "encoding layer", "Text::ParseWords": "parse text into an array of tokens or array of arrays", "IPC::Open2": "open a process for both reading and writing using open2()", "File::stat": "by-name interface to Perl's built-in stat() functions", "Pod::Simple::PullParserEndToken": "end-tokens from Pod::Simple::PullParser", "Math::Complex": "complex numbers and associated mathematical functions", "ExtUtils::MM_VMS": "methods to override UN*X behaviour in ExtUtils::MakeMaker", "File::DosGlob": "DOS like globbing and then some", "IO::Poll": "Object interface to system poll call", "File::Spec::Functions": "portably perform operations on file names", "Socket": "networking constants and support functions", "Module::Pluggable": "automatically give your module the ability to have plugins", "Parse::CPAN::Meta": "Parse META.yml and META.json CPAN metadata files", "Locale::Maketext::Simple": "Simple interface to Locale::Maketext::Lexicon", "Tie::File": "Access the lines of a disk file via a Perl array", "Encode::GSM0338": "ESTI GSM 03.38 Encoding", "File::GlobMapper": "Extend File Glob to Allow Input and Output Files", "Module::Build::Platform::cygwin": "Builder class for Cygwin platform", "Devel::SelfStubber": "generate stubs for a SelfLoading module", "Module::Build::Platform::Amiga": "Builder class for Amiga 
platforms", "File::Spec::Cygwin": "methods for Cygwin file specs", "Pod::Simple::HTML": "convert Pod to HTML", "Pod::Simple::HTMLBatch": "convert several Pod files to several HTML files", "CGI::Util": "Internal utilities used by CGI module", "IO::Pipe": "supply object methods for pipes", "Module::Build::YAML": "DEPRECATED", "ExtUtils::Command": "utilities to replace common UNIX commands in Makefiles etc.", "List::Util": "A selection of general-utility list subroutines", "Pod::Simple::Search": "find POD documents in directory trees", "IO::Uncompress::AnyInflate": "Uncompress zlib-based (zip, gzip) file/buffer", "TAP::Formatter::Base": "Base class for harness output delegates", "File::Spec::VMS": "methods for VMS file specs", "Opcode": "Disable named opcodes when compiling perl code", "Archive::Tar": "module for manipulations of tar archives", "PerlIO::via::QuotedPrint": "PerlIO layer for quoted-printable strings", "TAP::Parser::Result::Pragma": "TAP pragma token.", "File::CheckTree": "run many filetest checks on a tree", "File::Copy": "Copy files or filehandles", "Encode::EBCDIC": "EBCDIC Encodings", "CPANPLUS::Internals::Search": "internals for searching for modules", "Net::Netrc": "OO interface to users netrc file", "Compress::Raw::Zlib": "Low-Level Interface to zlib compression library", "TAP::Parser::YAMLish::Writer": "Write YAMLish data", "Pod::Perldoc::GetOptsOO": "Customized option parser for Pod::Perldoc", "Thread::Queue": "Thread-safe queues", "CPANPLUS::Dist::Sample": "Sample code to create your own Dist::* plugin", "Encode::Encoder": "Object Oriented Encoder", "Fcntl": "load the C Fcntl.h defines", "Test::Builder::Tester": "test testsuites that have been built withTest::Builder", "TAP::Formatter::Session": "Abstract base class for harness output delegate ", "CPANPLUS::Error": "error handling for CPANPLUS", "TAP::Parser": "Parse TAP|Test::Harness::TAP output", "Hash::Util": "A selection of general-utility hash subroutines", "ExtUtils::MakeMaker::FAQ": 
"Frequently Asked Questions About MakeMaker", "B::Lint::Debug": "Adds debugging stringification to B::", "Math::BigInt::FastCalc": "Math::BigInt::Calc with some XS for more speed", "Errno": "System errno constants", "Dumpvalue": "provides screen dump of Perl data.", "ExtUtils::testlib": "add blib/* directories to @INC", "Tie::RefHash": "use references as hash keys", "IO::Uncompress::RawInflate": "Read RFC 1951 files/buffers", "TAP::Parser::Utils": "Internal TAP::Parser utilities", "Module::Build::Platform::VMS": "Builder class for VMS platforms", "ExtUtils::MM_DOS": "DOS specific subclass of ExtUtils::MM_Unix", "CPANPLUS::Internals::Source::Memory": "In memory implementation", "CPANPLUS::Backend": "programmer's interface to CPANPLUS", "CPAN::Kwalify": "Interface between CPAN.pm and Kwalify.pm", "DBM_Filter::null": "filter for DBM_Filter", "ExtUtils::CBuilder": "Compile and link C code for Perl modules", "DBM_Filter::compress": "filter for DBM_Filter", "IO::Uncompress::Gunzip": "Read RFC 1952 files/buffers", "Encode::KR": "Korean Encodings", "Exporter": "Implements default import method for modules", "ExtUtils::Constant::XS": "generate C code for XS modules' constants.", "AutoLoader": "load subroutines only on demand", "Module::Build::Platform::aix": "Builder class for AIX platform", "Pod::Text::Color": "Convert POD data to formatted color ASCII text", "Time::tm": "internal object used by Time::gmtime and Time::localtime", "Net::NNTP": "NNTP Client class", "IO": "load various IO modules", "Module::Load": "runtime require of both modules and files", "Encode::KR::2022_KR": "internally used by Encode::KR", "CPANPLUS::Shell::Default": "the default CPANPLUS shell", "ExtUtils::Manifest": "utilities to write and check a MANIFEST file", "TAP::Parser::Scheduler::Spinner": "A no-op job.", "Getopt::Long": "Extended processing of command line options", "Digest::base": "Digest base class", "Pod::LaTeX": "Convert Pod data to formatted Latex", "Module::Build::PPMMaker": "Perl 
Package Manager file creation", "PerlIO::via": "Helper class for PerlIO layers implemented in perl", "SelfLoader": "load functions only on demand", "TAP::Parser::Iterator": "Base class for TAP source iterators", "TAP::Parser::Result": "Base class for TAP::Parser output objects", "Module::CoreList": "what modules shipped with versions of perl", "Encode::CJKConstants": "Internally used by Encode::??::ISO_2022_*", "CPAN::FirstTime": "Utility for CPAN::Config file Initialization", "File::Spec": "portably perform operations on file names", "Pod::Simple::PullParserStartToken": "start-tokens from Pod::Simple::PullParser", "Pod::Perldoc::ToRtf": "let Perldoc render Pod as RTF", "Test::Builder::Module": "Base class for test modules", "File::Compare": "Compare files or filehandles", "IO::Compress::Deflate": "Write RFC 1950 files/buffers", "User::grent": "by-name interface to Perl's built-in getgr*() functions", "Pod::Html": "module to convert pod files to HTML", "Pod::Perldoc::ToNroff": "let Perldoc convert Pod to nroff", "TAP::Parser::Result::Comment": "Comment result token.", "CPANPLUS::Selfupdate": "self-updating for CPANPLUS", "ExtUtils::Packlist": "manage .packlist files", "TAP::Formatter::Console::ParallelSession": "Harness output delegate for parallel console output", "Thread": "Manipulate threads in Perl (for old code only)", "Unicode::Collate": "Unicode Collation Algorithm", "CPANPLUS::Module::Fake": "fake module object for internal use", "IO::Compress::Gzip": "Write RFC 1952 files/buffers", "Time::HiRes": "High resolution alarm, sleep, gettimeofday, interval timers", "Net::protoent": "by-name interface to Perl's built-in getproto*() functions", "IPC::Semaphore": "SysV Semaphore IPC object class", "Pod::Simple::PullParserToken": "tokens from Pod::Simple::PullParser", "IPC::SysV": "System V IPC constants and system calls", "Math::BigInt::CalcEmu": "Emulate low-level math with BigInt code", "Encode::JP::JIS7": "internally used by Encode::JP", "Sys::Hostname": "Try 
every conceivable way to get hostname", "Pod::PlainText": "Convert POD data to formatted ASCII text", "IO::Socket::INET": "Object interface for AF_INET domain sockets", "Encode::Byte": "Single Byte Encodings", "Module::Build": "Build and install Perl modules", "DynaLoader": "Dynamically load C libraries into Perl code", "CPANPLUS::Shell::Default::Plugins::Source": "read in CPANPLUS commands", "Tie::Scalar": "base class definitions for tied scalars", "CPANPLUS::Module::Author::Fake": "dummy author object for CPANPLUS", "CPANPLUS::Internals::Source::SQLite": "SQLite implementation", "Math::Trig": "trigonometric functions", "TAP::Object": "Base class that provides common functionality to all TAP::* modules", "Tie::Array": "base class for tied arrays", "I18N::Langinfo": "query locale information", "CPANPLUS::Shell::Default::Plugins::Remote": "connect to a remote CPANPLUS", "ExtUtils::CBuilder::Platform::Windows": "Builder class for Windows platforms", "Test::Harness": "Run Perl standard test scripts with statistics", "ExtUtils::MM_Win95": "method to customize MakeMaker for Win9X", "Attribute::Handlers": "Simpler definition of attribute handlers", "Term::UI": "Term::ReadLine UI made easy", "Memoize::NDBM_File": "glue to provide EXISTS for NDBM_File for Storable use", "Net::servent": "by-name interface to Perl's built-in getserv*() functions", "Module::Pluggable::Object": "automatically give your module the ability to have plugins", "Class::Struct": "declare struct-like datatypes as Perl classes", "Encode::Guess": "Guesses encoding from data", "CPANPLUS::Internals::Source": "internals for updating source files", "Pod::Perldoc::ToChecker": "let Perldoc check Pod for errors", "Test::Builder": "Backend for building test libraries", "DBM_Filter::int32": "filter for DBM_Filter", "ExtUtils::MM_NW5": "methods to override UN*X behaviour in ExtUtils::MakeMaker", "CGI::Carp": "CGI routines for writing to the HTTPD (or other) error log", "Module::Build::Compat": "Compatibility with 
ExtUtils::MakeMaker", "Getopt::Std": "Process single-character switches with switch clustering", "Module::Build::Platform::MPEiX": "Builder class for MPEiX platforms", "ExtUtils::MM_Win32": "methods to override UN*X behaviour in ExtUtils::MakeMaker", "Object::Accessor": "interface to create per object accessors", "CPAN::Debug": "internal debugging for CPAN.pm", "File::Fetch": "A generic file fetching mechanism", "Term::ReadLine": "Perl interface to various readline packages.If no real package is found, substitutes stubs instead of basic functions.", "ExtUtils::Constant::Utils": "helper functions for ExtUtils::Constant", "Scalar::Util": "A selection of general-utility scalar subroutines", "Locale::Language": "standard codes for language identification", "Benchmark": "benchmark running times of Perl code", "FindBin": "Locate directory of original perl script", "Net::hostent": "by-name interface to Perl's built-in gethost*() functions", "ExtUtils::MY": "ExtUtils::MakeMaker subclass for customization", "FileCache": "keep more files open than the system permits", "CPANPLUS": "API & CLI access to the CPAN mirrors", "Pod::Select": "extract selected sections of POD from input", "Locale::Maketext::GutsLoader": "Deprecated module to load Locale::Maketext utf8 code", "TAP::Parser::Result::Plan": "Plan result token.", "Encode::JP": "Japanese Encodings", "Pod::Checker": "check pod documents for syntax errors", "Module::Build::Platform::os2": "Builder class for OS/2 platform", "Pod::Text::Termcap": "Convert POD data to ASCII text with format escapes", "Locale::Maketext::Guts": "Deprecated module to load Locale::Maketext utf8 code", "ExtUtils::Mkbootstrap": "make a bootstrap file for use by DynaLoader", "Pod::Usage": "print a usage message from embedded pod documentation", "Pod::Text": "Convert POD data to formatted ASCII text", "Pod::ParseUtils": "helpers for POD parsing and conversion", "Pod::Perldoc::ToPod": "let Perldoc render Pod as ... 
Pod!", "CPAN::Version": "utility functions to compare CPAN versions", "I18N::LangTags": "functions for dealing with RFC3066-style language tags", "Safe": "Compile and execute code in restricted compartments", "IO::File": "supply object methods for filehandles", "ExtUtils::ParseXS": "converts Perl XS code into C code", "CPANPLUS::Module::Checksums": "checking the checksum of a distribution", "Module::Build::ModuleInfo": "DEPRECATED", "Encode::MIME::Header": "MIME 'B' and 'Q' header encoding", "File::Find": "Traverse a directory tree.", "Module::Build::Platform::EBCDIC": "Builder class for EBCDIC platforms", "DBM_Filter::encode": "filter for DBM_Filter", "Test": "provides a simple framework for writing test scripts", "Hash::Util::FieldHash": "Support for Inside-Out Classes", "Module::Build::Version": "DEPRECATED", "Exporter::Heavy": "Exporter guts", "App::Prove::State::Result": "Individual test suite results.", "CPANPLUS::Shell::Default::Plugins::CustomSource": "add custom sources to CPANPLUS", "CPANPLUS::Module::Author": "CPAN author object for CPANPLUS", "Time::localtime": "by-name interface to Perl's built-in localtime() function", "Module::Load::Conditional": "Looking up module information / loading at runtime", "CPANPLUS::Backend::RV": "return value objects", "Pod::Functions": "Group Perl's functions a la perlfunc.pod", "File::Spec::Epoc": "methods for Epoc file specs", "Devel::PPPort": "Perl/Pollution/Portability", "ExtUtils::Command::MM": "Commands for the MM's to use in Makefiles", "IO::Socket": "Object interface to socket communications", "File::Temp": "return name and handle of a temporary file safely", "Locale::Script": "standard codes for script identification", "Term::UI::History": "history function", "TAP::Parser::Result::Test": "Test result token.", "CGI::Push": "Simple Interface to Server Push", "Digest::MD5": "Perl interface to the MD5 Algorithm", "Text::Balanced": "Extract delimited text sequences from strings.", "Net::Domain": "Attempt to evaluate 
the current host's internet name and domain", "TAP::Formatter::Color": "Run Perl test scripts with color", "TAP::Formatter::File": "Harness output delegate for file output", "Pod::Perldoc::ToTk": "let Perldoc use Tk::Pod to render Pod", "Memoize::Storable": "store Memoized data in Storable database", "Unicode::UCD": "Unicode character database", "Encode::TW": "Taiwan-based Chinese Encodings", "Module::Build::Cookbook": "Examples of Module::Build Usage", "Pod::Perldoc": "Look up Perl documentation in Pod format.", "Module::Build::ConfigData": "Configuration for Module::Build", "AutoSplit": "split a package for autoloading", "Module::Build::Platform::Default": "Stub class for unknown platforms", "TAP::Parser::Result::Version": "TAP syntax version token.", "Locale::Country": "standard codes for country identification", "Module::Build::Platform::Unix": "Builder class for Unix platforms", "File::Path": "Create or remove directory trees", "ExtUtils::Mksymlists": "write linker options files for dynamic extension", "CPAN::HandleConfig": "internal configuration handling for CPAN.pm", "Memoize::Expire": "Plug-in module for automatic expiration of memoized values", "TAP::Parser::Result::Unknown": "Unknown result token.", "SDBM_File": "Tied access to sdbm files", "Pod::Simple::Methody": "turn Pod::Simple events into method calls", "IO::Compress::RawDeflate": "Write RFC 1951 files/buffers", "ExtUtils::MM_Any": "Platform-agnostic MM methods", "IO::Uncompress::AnyUncompress": "Uncompress gzip, zip, bzip2 or lzop file/buffer", "TAP::Formatter::Console": "Harness output delegate for default console output", "B::Deparse": "Perl compiler backend to produce perl code", "FileHandle": "supply object methods for filehandles", "CPANPLUS::Dist": "base class for plugins", "CPANPLUS::Dist::Build": "CPANPLUS plugin to install packages that use Build.PL", "File::Glob": "Perl extension for BSD glob routine", "Digest::file": "Calculate digests of files", "Encode::Symbol": "Symbol Encodings", 
"B::Debug": "Walk Perl syntax tree, printing debug info about ops", "I18N::LangTags::List": "tags and names for human languages", "List::Util::XS": "Indicate if List::Util was compiled with a C compiler", "ExtUtils::Constant": "generate XS code to import C header constants", "App::Prove::State::Result::Test": "Individual test results.", "IO::Socket::UNIX": "Object interface for AF_UNIX domain sockets", "Test::Builder::Tester::Color": "turn on colour in Test::Builder::Tester", "TAP::Parser::Source": "a TAP source & meta data about it", "ExtUtils::MM_UWIN": "U/WIN specific subclass of ExtUtils::MM_Unix", "ExtUtils::MM_QNX": "QNX specific subclass of ExtUtils::MM_Unix", "Term::ANSIColor": "Color screen output using ANSI escape sequences", "Pod::Simple::Checker": "check the Pod syntax of a document", "TAP::Parser::Multiplexer": "Multiplex multiple TAP::Parsers", "MIME::QuotedPrint": "Encoding and decoding of quoted-printable strings", "Time::Seconds": "a simple API to convert seconds to other date values", "B": "The Perl Compiler Backend", "Test::More": "yet another framework for writing test scripts", "TAP::Base": "Base class that provides common functionality to TAP::Parserand TAP::Harness", "B::Concise": "Walk Perl syntax tree, printing concise info about ops", "PerlIO": "On demand loader for PerlIO layers and root of PerlIO::* name space", "ExtUtils::MM_BeOS": "methods to override UN*X behaviour in ExtUtils::MakeMaker", "TAP::Parser::Iterator::Process": "Iterator for process-based TAP sources", "IO::Dir": "supply object methods for directory handles", "CGI::Cookie": "Interface to HTTP Cookies", "UNIVERSAL": "base class for ALL classes (blessed references)", "IO::Uncompress::Inflate": "Read RFC 1950 files/buffers", "Log::Message::Item": "Message objects for Log::Message", "Test::Simple": "Basic utilities for writing tests.", "ExtUtils::MM_Cygwin": "methods to override UN*X behaviour in ExtUtils::MakeMaker", "Fatal": "Replace functions with equivalents which succeed 
or die", "IPC::Msg": "SysV Msg IPC object class", "CPANPLUS::Module": "CPAN module objects for CPANPLUS", "B::Terse": "Walk Perl syntax tree, printing terse info about ops", "Archive::Extract": "A generic archive extracting mechanism", "Module::Build::Platform::MacOS": "Builder class for MacOS platforms", "CPAN": "query, download and build perl modules from CPAN sites", "Module::Build::Notes": "Create persistent distribution configuration modules", "Module::Build::Base": "Default methods for Module::Build", "Net::netent": "by-name interface to Perl's built-in getnet*() functions", "Encode": "character encodings in Perl", "Devel::InnerPackage": "find all the inner packages of a package", "CPANPLUS::Internals::Fetch": "internals for fetching files", "IO::Zlib": "IO:: style interface to Compress::Zlib", "Memoize::SDBM_File": "glue to provide EXISTS for SDBM_File for Storable use", "Net::FTP": "FTP Client class", "Package::Constants": "List all constants declared in a package", "Pod::Man": "Convert POD data to formatted *roff input", "Net::POP3": "Post Office Protocol 3 Client class (RFC1939)", "B::Xref": "Generates cross reference reports for Perl programs", "ExtUtils::MM_VOS": "VOS specific subclass of ExtUtils::MM_Unix", "Math::BigRat": "Arbitrary big rational numbers", "TAP::Formatter::File::Session": "Harness output delegate for file output", "CPANPLUS::Dist::Autobundle": "distribution class for installation snapshots", "IO::Uncompress::Unzip": "Read zip files/buffers", "Encode::Encoding": "Encode Implementation Base Class", "IO::Uncompress::Bunzip2": "Read bzip2 files/buffers", "Sys::Syslog": "Perl interface to the UNIX syslog(3) calls", "Unicode::Normalize": "Unicode Normalization Forms", "Filter::Util::Call": "Perl Source Filter Utility Module", "ExtUtils::MM_OS2": "methods to override UN*X behaviour in ExtUtils::MakeMaker", "Text::Wrap": "line wrapping to form simple paragraphs", "Text::Abbrev": "abbrev - create an abbreviation table from a list", 
"Pod::Parser": "base class for creating POD filters and translators", "SelectSaver": "save and restore selected file handle", "Time::gmtime": "by-name interface to Perl's built-in gmtime() function", "CGI::Fast": "CGI Interface for Fast CGI", "CGI": "Handle Common Gateway Interface requests and responses", "IPC::SharedMem": "SysV Shared Memory IPC object class", "CPANPLUS::Shell::Classic": "CPAN.pm emulation for CPANPLUS", "App::Prove::State": "State storage for the prove command.", "DBM_Filter::utf8": "filter for DBM_Filter", "Time::Local": "efficiently compute time from local and GMT time", "Pod::Simple::RTF": "format Pod as RTF", "Pod::ParseLink": "Parse an L<> formatting code in POD text", "Log::Message::Handlers": "Message handlers for Log::Message", "Pod::Simple::PullParser": "a pull-parser interface to parsing Pod", "Tie::Handle": "base class definitions for tied handles", "Module::Build::Platform::RiscOS": "Builder class for RiscOS platforms", "Memoize::ExpireFile": "test for Memoize expiration semantics", "CPANPLUS::Config": "configuration defaults and heuristics for CPANPLUS", "Text::Soundex": "Implementation of the soundex algorithm.", "IO::Select": "OO interface to the select system call", "CPANPLUS::Internals::Utils": "convenience functions for CPANPLUS", "Locale::Maketext": "framework for localization", "I18N::Collate": "compare 8-bit scalar data according to the current locale", "Pod::Simple::Text": "format Pod as plaintext", "TAP::Parser::Scheduler::Job": "A single testing job.", "ExtUtils::MakeMaker::Tutorial": "Writing a module with MakeMaker", "IO::Uncompress::Base": "Base Class for IO::Uncompress modules ", "ExtUtils::Installed": "Inventory management of installed modules", "Encode::CN::HZ": "internally used by Encode::CN", "Pod::Perldoc::ToXml": "let Perldoc render Pod as XML", "TAP::Parser::IteratorFactory": "Figures out which SourceHandler objects to use for a given Source", "TAP::Harness": "Run test scripts with statistics", "Net::Config": 
"Local configuration data for libnet", "ExtUtils::Miniperl": "write the C code for perlmain.c", "Archive::Tar::File": "a subclass for in-memory extracted file from Archive::Tar", "TAP::Parser::Iterator::Stream": "Iterator for filehandle-based TAP sources", "Pod::Simple::LinkSection": "represent \"section\" attributes of L codes", "IO::Compress::Zip": "Write zip files/buffers", "Encode::Alias": "alias definitions to encodings", "Memoize::AnyDBM_File": "glue to provide EXISTS for AnyDBM_File for Storable use", "Pod::Text::Overstrike": "Convert POD data to formatted overstrike text", "Pod::Perldoc::BaseTo": "Base for Pod::Perldoc formatters", "Pod::Escapes": "for resolving Pod E<...> sequences", "Storable": "persistence for Perl data structures", "TAP::Parser::Scheduler": "Schedule tests during parallel testing", "Symbol": "manipulate Perl symbols and their names", "Term::Cap": "Perl termcap interface", "Math::BigInt": "Arbitrary size integer/float math package", "IO::Compress::Base": "Base Class for IO::Compress modules ", "TAP::Parser::Result::Bailout": "Bailout result token.", "B::Lint": "Perl lint", "Pod::Find": "find POD documents in directory trees", "English": "use nice English (or awk) names for ugly punctuation variables", "Pod::InputObjects": "objects representing POD input paragraphs, commands, etc.", "CPANPLUS::Configure": "configuration for CPANPLUS", "Tie::Hash": "base class definitions for tied hashes", "Log::Message::Config": "Configuration options for Log::Message", "TAP::Parser::Result::YAML": "YAML result token.", "Math::BigInt::Calc": "Pure Perl module to support Math::BigInt", "TAP::Parser::Grammar": "A grammar for the Test Anything Protocol.", "File::Spec::Mac": "File::Spec for Mac OS (Classic)", "Filter::Simple": "Simplified source filtering", "XSLoader": "Dynamically load C libraries into Perl code", "IO::Handle": "supply object methods for I/O handles", "Module::Build::Platform::darwin": "Builder class for Mac OS X platform", "TAP::Formatter::Console::Session": 
"Harness output delegate for default console output", "Carp": "alternative warn and die for modules", "Search::Dict": "look - search for key in dictionary file", "CPANPLUS::Dist::MM": "distribution class for MakeMaker related modules", "ExtUtils::Liblist": "determine libraries to use and how to use them", "CPAN::Queue": "internal queue support for CPAN.pm", "ExtUtils::Constant::Base": "base class for ExtUtils::Constant objects", "IO::Seekable": "supply seek based methods for I/O objects", "Tie::Memoize": "add data to hash when needed", "B::Showlex": "Show lexical variables used in functions or files", "Time::Piece": "Object Oriented time objects", "Pod::Simple::DumpAsXML": "turn Pod into XML", "DB": "programmatic interface to the Perl debugging API", "Module::Build::Platform::VOS": "Builder class for VOS platforms", "Data::Dumper": "stringified perl data structures, suitable for both printing and eval", "IPC::Cmd": "finding and running system commands made easy", "Env": "perl module that imports environment variables as scalars or arrays", "I18N::LangTags::Detect": "detect the user's language preferences", "Encode::CN": "China-based Chinese Encodings", "Config::Extensions": "hash lookup of which core extensions were built.", "Pod::Perldoc::ToText": "let Perldoc render Pod as plaintext", "Encode::Unicode": "Various Unicode Transformation Formats", "Pod::Simple::XMLOutStream": "turn Pod into XML", "CPAN::Distroprefs": "read and match distroprefs", "Pod::Simple::XHTML": "format Pod as validating XHTML", "Net::Cmd": "Network Command class (as used by FTP, SMTP etc)", "Encode::Config": "internally used by Encode", "Pod::Simple::DumpAsText": "dump Pod-parsing events as text", "Encode::JP::H2Z": "internally used by Encode::JP::2022_JP*", "CGI::Switch": "Backward compatibility module for defunct CGI::Switch", "Digest": "Modules that calculate message digests", "CPANPLUS::Internals::Extract": "internals for archive extraction", "Math::BigFloat": "Arbitrary size floating 
point math package", "Net::Time": "time and daytime network client interface", "Module::Loaded": "mark modules as loaded or unloaded", "ExtUtils::MM_AIX": "AIX specific subclass of ExtUtils::MM_Unix", "Cwd": "get pathname of current working directory", "TAP::Parser::ResultFactory": "Factory for creating TAP::Parser output objects", "File::Spec::Win32": "methods for Win32 file specs", "Memoize::ExpireTest": "test for Memoize expiration semantics", "ExtUtils::Install": "install files from here to there", "IPC::Open3": "open a process for reading, writing, and error handling using open3()", "CPANPLUS::Internals::Report": "internals for sending test reports", "ExtUtils::MM_Darwin": "special behaviors for OS X", "Net::Ping": "check a remote host for reachability", "Pod::Perldoc::ToMan": "let Perldoc render Pod as man pages", "Compress::Raw::Bzip2": "Low-Level Interface to bzip2 compression library", "PerlIO::scalar": "in-memory IO, scalar IO", "App::Prove": "Implements the prove command.", "Thread::Semaphore": "Thread-safe semaphores", "Compress::Zlib": "Interface to zlib compression library", "NEXT": "Provide a pseudo-class NEXT (et al) that allows method redispatch", "Tie::SubstrHash": "Fixed-table-size, fixed-key-length hashing", "CGI::Apache": "Backward compatibility module for CGI.pm", "NDBM_File": "Tied access to ndbm files", "Tie::StdHandle": "base class definitions for tied handles", "CPAN::Tarzip": "internal handling of tar archives for CPAN.pm", "TAP::Parser::Aggregator": "Aggregate TAP::Parser results", "Net::SMTP": "Simple Mail Transfer Protocol Client", "Pod::Simple::SimpleTree": "parse Pod into a simple parse tree ", "MIME::Base64": "Encoding and decoding of base64 strings", "Encode::Unicode::UTF7": "UTF-7 encoding", "Pod::Simple::TextContent": "get the text content of Pod", "File::Basename": "Parse file paths into directory, filename and suffix.", "Log::Message": "A generic message storing mechanism;", "Devel::Peek": "A data debugging tool for the XS 
programmer", "CPANPLUS::Dist::Base": "Base class for custom distribution classes", "CGI::Pretty": "module to produce nicely formatted HTML code", "File::Spec::Unix": "File::Spec for Unix, base for other File::Spec modules", "CPANPLUS::Dist::Build::Constants": "Constants for CPANPLUS::Dist::Build", "CPANPLUS::Internals": "CPANPLUS internals", "ExtUtils::MM": "OS adjusted ExtUtils::MakeMaker subclass", "TAP::Parser::YAMLish::Reader": "Read YAMLish data from iterator", "Digest::SHA": "Perl extension for SHA-1/224/256/384/512", "Locale::Currency": "standard codes for currency identification", "DirHandle": "supply object methods for directory handles", "Encode::MIME::Name": "internally used by Encode", "ExtUtils::MakeMaker::Config": "Wrapper around Config.pm", "CPAN::Nox": "Wrapper around CPAN.pm without using any XS module", "ExtUtils::MM_Unix": "methods used by ExtUtils::MakeMaker", "DB_File": "Perl5 access to Berkeley DB version 1.x", "File::Spec::OS2": "methods for OS/2 file specs", "Pod::Simple::Debug": "put Pod::Simple into trace/debug mode", "IO::Compress::Bzip2": "Write bzip2 files/buffers", "Pod::Simple::PullParserTextToken": "text-tokens from Pod::Simple::PullParser", "CPANPLUS::Shell": "base class for CPANPLUS shells", "Memoize": "Make functions faster by trading space for time", "Term::Complete": "Perl word completion module", "POSIX": "Perl interface to IEEE Std 1003.1", "User::pwent": "by-name interface to Perl's built-in getpw*() functions", "Tie::Hash::NamedCapture": "Named regexp capture buffers", "DBM_Filter": "Filter DBM keys/values ", "Module::Build::Platform::Windows": "Builder class for Windows platforms", "O": "Generic interface to Perl Compiler backends", "AnyDBM_File": "provide framework for multiple DBMs", "TAP::Parser::Iterator::Array": "Iterator for array-based TAP sources" }); perldoc-html/static/indexPod.js000644 000765 000024 00000024214 12276000230 016610 0ustar00jjstaff000000 000000 perldocSearch.indexData.pod = new Hash ({ "perl": "The 
Perl 5 language interpreter", "perlintro": "a brief introduction and overview of Perl", "perlrun": "how to execute the Perl interpreter", "perlbook": "Books about and related to Perl", "perlcommunity": "a brief overview of the Perl community", "perlreftut": "Mark's very short tutorial about references", "perldsc": "data structure complex data structure struct", "perllol": "Manipulating Arrays of Arrays in Perl", "perlrequick": "Perl regular expressions quick start", "perlretut": "Perl regular expressions tutorial", "perlboot": "This document has been deleted", "perlootut": "Object-Oriented Programming in Perl Tutorial", "perltoot": "This document has been deleted", "perltooc": "This document has been deleted", "perlbot": "This document has been deleted", "perlstyle": "Perl style guide", "perlcheat": "Perl 5 Cheat Sheet", "perltrap": "Perl traps for the unwary", "perldebtut": "Perl debugging tutorial", "perlopentut": "tutorial on opening things in Perl", "perlpacktut": "tutorial on pack and unpack", "perlthrtut": "Tutorial on threads in Perl", "perlxstut": "Tutorial for writing XSUBs", "perlunitut": "Perl Unicode Tutorial", "perlpragma": "how to write a user pragma", "perlfaq": "frequently asked questions about Perl", "perlfaq1": "General Questions About Perl", "perlfaq2": "Obtaining and Learning about Perl", "perlfaq3": "Programming Tools", "perlfaq4": "Data Manipulation", "perlfaq5": "Files and Formats", "perlfaq6": "Regular Expressions", "perlfaq7": "General Perl Language Issues", "perlfaq8": "System Interaction", "perlfaq9": "Web, Email and Networking", "perlunifaq": "Perl Unicode FAQ", "perlsyn": "syntax", "perldata": "Perl data types", "perlsub": "subroutine function", "perlop": "operator", "perlfunc": "function", "perlpod": "POD plain old documentation", "perlpodspec": "Plain Old Documentation: format specification and notes", "perlpodstyle": "Perl POD style guide", "perldiag": "various Perl diagnostics", "perllexwarn": "warning, lexical warnings warning", 
"perldebug": "debug debugger", "perlvar": "Perl predefined variables", "perlre": "regular expression regex regexp", "perlreref": "Perl Regular Expressions Reference", "perlrebackslash": "Perl Regular Expression Backslash Sequences and Escapes", "perlrecharclass": "character class", "perlref": "reference pointer data structure structure struct", "perlform": "format report chart", "perlobj": "object OOP", "perltie": "tie", "perldbmfilter": "Perl DBM Filters", "perlipc": "Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)", "perlfork": "Perl's fork() emulation", "perlnumber": "semantics of numbers and numeric operations in Perl", "perlperf": "Perl Performance and Optimization Techniques", "perlport": "Writing portable Perl", "perllocale": "Perl locale handling (internationalization and localization)", "perluniintro": "Perl Unicode introduction", "perlunicode": "Unicode support in Perl", "perluniprops": "Index of Unicode Version 6.2.0 character properties in Perl", "perlebcdic": "Considerations for running Perl on EBCDIC platforms", "perlsec": "Perl security", "perlmod": "Perl modules (packages and symbol tables)", "perlmodlib": "constructing new Perl modules and finding existing ones", "perlmodstyle": "Perl module style guide", "perlmodinstall": "Installing CPAN Modules", "perlnewmod": "preparing a new module for distribution", "perlfilter": "Source Filters", "perlglossary": "Perl Glossary", "perlexperiment": "A listing of experimental features in Perl", "perldtrace": "Perl's support for DTrace", "CORE": "Namespace for Perl's core routines", "perlembed": "how to embed perl in your C program", "perldebguts": "Guts of Perl debugging ", "perlxs": "XS language reference manual", "perlxstut": "Tutorial for writing XSUBs", "perlxstypemap": "Perl XS C/Perl type mapping", "perlinterp": "An overview of the Perl interpreter", "perlsource": "A guide to the Perl source tree", "perlrepository": "Links to current information on the 
Perl source repository", "perlclib": "Internal replacements for standard C library functions", "perlguts": "Introduction to the Perl API", "perlcall": "Perl calling conventions from C", "perlapi": "autogenerated documentation for the perl public API", "perlintern": "autogenerated documentation of purely internal Perl functions", "perlmroapi": "Perl method resolution plugin interface", "perliol": "C API for Perl's implementation of IO in Layers.", "perlapio": "perl's IO abstraction interface.", "perlhack": "How to hack on Perl", "perlhacktut": "Walk through the creation of a simple C code patch", "perlhacktips": "Tips for Perl core C code hacking", "perlreguts": "Description of the Perl regular expression engine.", "perlreapi": "Perl regular expression plugin interface", "perlpolicy": "Various and sundry policies and commitments related to the Perl core", "perlhist": "the Perl history records", "perltodo": "Perl TO-DO List", "perldelta": "what is new for perl v5.18.2", "perl5182delta": "what is new for perl v5.18.2", "perl5181delta": "what is new for perl v5.18.1", "perl5180delta": "what is new for perl v5.18.0", "perl5163delta": "what is new for perl v5.16.3", "perl5162delta": "what is new for perl v5.16.2", "perl5161delta": "what is new for perl v5.16.1", "perl5160delta": "what is new for perl v5.16.0", "perl5144delta": "what is new for perl v5.14.4", "perl5143delta": "what is new for perl v5.14.3", "perl5142delta": "what is new for perl v5.14.2", "perl5141delta": "what is new for perl v5.14.1", "perl5140delta": "what is new for perl v5.14.0", "perl5125delta": "what is new for perl v5.12.5", "perl5124delta": "what is new for perl v5.12.4", "perl5123delta": "what is new for perl v5.12.3", "perl5122delta": "what is new for perl v5.12.2", "perl5121delta": "what is new for perl v5.12.1", "perl5120delta": "what is new for perl v5.12.0", "perl5101delta": "what is new for perl v5.10.1", "perl5100delta": "what is new for perl 5.10.0", "perl589delta": "what is new for perl 
v5.8.9", "perl588delta": "what is new for perl v5.8.8", "perl587delta": "what is new for perl v5.8.7", "perl586delta": "what is new for perl v5.8.6", "perl585delta": "what is new for perl v5.8.5", "perl584delta": "what is new for perl v5.8.4", "perl583delta": "what is new for perl v5.8.3", "perl582delta": "what is new for perl v5.8.2", "perl581delta": "what is new for perl v5.8.1", "perl58delta": "what is new for perl v5.8.0", "perl561delta": "what's new for perl v5.6.1", "perl56delta": "what's new for perl v5.6.0", "perl5005delta": "what's new for perl5.005", "perl5004delta": "what's new for perl5.004", "perlartistic": "the Perl Artistic License", "perlgpl": "the GNU General Public License, version 1", "perlaix": "Perl version 5 on IBM AIX (UNIX) systems", "perlamiga": "Perl under Amiga OS", "perlbs2000": "building and installing Perl for BS2000.", "perlce": "Perl for WinCE", "perlcygwin": "Perl for Cygwin", "perldgux": "Perl under DG/UX.", "perldos": "Perl under DOS, W31, W95.", "perlfreebsd": "Perl version 5 on FreeBSD systems", "perlhaiku": "Perl version 5.10+ on Haiku", "perlhpux": "Perl version 5 on Hewlett-Packard Unix (HP-UX) systems", "perlhurd": "Perl version 5 on Hurd", "perlirix": "Perl version 5 on Irix systems", "perllinux": "Perl version 5 on Linux systems", "perlmacos": "Perl under Mac OS (Classic)", "perlmacosx": "Perl under Mac OS X", "perlnetware": "Perl for NetWare", "perlopenbsd": "Perl version 5 on OpenBSD systems", "perlos2": "Perl under OS/2, DOS, Win0.3*, Win0.95 and WinNT.", "perlos390": "building and installing Perl for OS/390 and z/OS", "perlos400": "Perl version 5 on OS/400", "perlplan9": "Plan 9-specific documentation for Perl", "perlqnx": "Perl version 5 on QNX", "perlriscos": "Perl version 5 for RISC OS", "perlsolaris": "Perl version 5 on Solaris systems", "perlsymbian": "Perl version 5 on Symbian OS", "perltru64": "Perl version 5 on Tru64 (formerly known as Digital UNIX formerly known as DEC OSF/1) systems", "perlvms": "VMS-specific 
documentation for Perl", "perlvos": "Perl for Stratus OpenVOS", "perlwin32": "Perl under Windows", "perlutil": "utilities packaged with the Perl distribution", "a2p": "Awk to Perl translator", "c2ph": "Dump C structures as generated from cc -g -S stabs", "config_data": "Query or change configuration of Perl modules", "corelist": "a commandline frontend to Module::CoreList", "cpan": "easily interact with CPAN from the command line", "cpanp": "The CPANPLUS launcher", "cpan2dist": "The CPANPLUS distribution creator", "enc2xs": "Perl Encode Module Generator", "find2perl": "translate find command lines to Perl code", "h2ph": "convert .h C header files to .ph Perl header files", "h2xs": "convert .h C header files to Perl extensions", "instmodsh": "A shell to examine installed modules", "libnetcfg": "configure libnet", "perlbug": "how to submit bug reports on Perl", "piconv": "iconv(1), reinvented in perl", "prove": "Run tests through a TAP harness.", "psed": "a stream editor", "podchecker": "check the syntax of POD format documentation files", "perldoc": "Look up Perl documentation in Pod format.", "perlivp": "Perl Installation Verification Procedure", "pod2html": "convert .pod files to .html files", "pod2latex": "convert pod documentation to latex format", "pod2man": "Convert POD data to formatted *roff input", "pod2text": "Convert POD data to formatted ASCII text", "pod2usage": "print usage messages from embedded pod docs in files", "podselect": "print selected sections of pod documentation on standard output", "pstruct": "Dump C structures as generated from cc -g -S stabs", "ptar": "a tar-like program written in perl", "ptardiff": "program that diffs an extracted archive against an unextracted one", "s2p": "a stream editor", "shasum": "Print or Check SHA Checksums", "splain": "produce verbose warning diagnostics", "xsubpp": "compiler to convert Perl XS code into C code", "perlthanks": "how to submit bug reports on Perl" });
perldoc-html/static/mootools-1.2.3-core.js000644 000765 000024 00000137323 12276001417 020314 0ustar00jjstaff000000 000000 //MooTools, My Object Oriented (JavaScript) Tools. Copyright (c) 2006-2009 Valerio Proietti, MIT Style License.
var MooTools={version:"1.2.3",build:"4980aa0fb74d2f6eb80bcd9f5b8e1fd6fbb8f607"};var Native=function(k){k=k||{};var a=k.name;var i=k.legacy;var b=k.protect; var c=k.implement;var h=k.generics;var f=k.initialize;var g=k.afterImplement||function(){};var d=f||i;h=h!==false;d.constructor=Native;d.$family={name:"native"}; if(i&&f){d.prototype=i.prototype;}d.prototype.constructor=d;if(a){var e=a.toLowerCase();d.prototype.$family={name:e};Native.typize(d,e);}var j=function(n,l,o,m){if(!b||m||!n.prototype[l]){n.prototype[l]=o; }if(h){Native.genericize(n,l,b);}g.call(n,l,o);return n;};d.alias=function(n,l,p){if(typeof n=="string"){var o=this.prototype[n];if((n=o)){return j(this,l,n,p); }}for(var m in n){this.alias(m,n[m],l);}return this;};d.implement=function(m,l,o){if(typeof m=="string"){return j(this,m,l,o);}for(var n in m){j(this,n,m[n],l); }return this;};if(c){d.implement(c);}return d;};Native.genericize=function(b,c,a){if((!a||!b[c])&&typeof b.prototype[c]=="function"){b[c]=function(){var d=Array.prototype.slice.call(arguments); return b.prototype[c].apply(d.shift(),d);};}};Native.implement=function(d,c){for(var b=0,a=d.length;b-1:this.indexOf(a)>-1;},trim:function(){return 
this.replace(/^\s+|\s+$/g,"");},clean:function(){return this.replace(/\s+/g," ").trim(); },camelCase:function(){return this.replace(/-\D/g,function(a){return a.charAt(1).toUpperCase();});},hyphenate:function(){return this.replace(/[A-Z]/g,function(a){return("-"+a.charAt(0).toLowerCase()); });},capitalize:function(){return this.replace(/\b[a-z]/g,function(a){return a.toUpperCase();});},escapeRegExp:function(){return this.replace(/([-.*+?^${}()|[\]\/\\])/g,"\\$1"); },toInt:function(a){return parseInt(this,a||10);},toFloat:function(){return parseFloat(this);},hexToRgb:function(b){var a=this.match(/^#?(\w{1,2})(\w{1,2})(\w{1,2})$/); return(a)?a.slice(1).hexToRgb(b):null;},rgbToHex:function(b){var a=this.match(/\d{1,3}/g);return(a)?a.rgbToHex(b):null;},stripScripts:function(b){var a=""; var c=this.replace(/]*>([\s\S]*?)<\/script>/gi,function(){a+=arguments[1]+"\n";return"";});if(b===true){$exec(a);}else{if($type(b)=="function"){b(a,c); }}return c;},substitute:function(a,b){return this.replace(b||(/\\?\{([^{}]+)\}/g),function(d,c){if(d.charAt(0)=="\\"){return d.slice(1);}return(a[c]!=undefined)?a[c]:""; });}});Hash.implement({has:Object.prototype.hasOwnProperty,keyOf:function(b){for(var a in this){if(this.hasOwnProperty(a)&&this[a]===b){return a;}}return null; },hasValue:function(a){return(Hash.keyOf(this,a)!==null);},extend:function(a){Hash.each(a||{},function(c,b){Hash.set(this,b,c);},this);return this;},combine:function(a){Hash.each(a||{},function(c,b){Hash.include(this,b,c); },this);return this;},erase:function(a){if(this.hasOwnProperty(a)){delete this[a];}return this;},get:function(a){return(this.hasOwnProperty(a))?this[a]:null; },set:function(a,b){if(!this[a]||this.hasOwnProperty(a)){this[a]=b;}return this;},empty:function(){Hash.each(this,function(b,a){delete this[a];},this); return this;},include:function(a,b){if(this[a]==undefined){this[a]=b;}return this;},map:function(b,c){var a=new Hash;Hash.each(this,function(e,d){a.set(d,b.call(c,e,d,this)); },this);return 
a;},filter:function(b,c){var a=new Hash;Hash.each(this,function(e,d){if(b.call(c,e,d,this)){a.set(d,e);}},this);return a;},every:function(b,c){for(var a in this){if(this.hasOwnProperty(a)&&!b.call(c,this[a],a)){return false; }}return true;},some:function(b,c){for(var a in this){if(this.hasOwnProperty(a)&&b.call(c,this[a],a)){return true;}}return false;},getKeys:function(){var a=[]; Hash.each(this,function(c,b){a.push(b);});return a;},getValues:function(){var a=[];Hash.each(this,function(b){a.push(b);});return a;},toQueryString:function(a){var b=[]; Hash.each(this,function(f,e){if(a){e=a+"["+e+"]";}var d;switch($type(f)){case"object":d=Hash.toQueryString(f,e);break;case"array":var c={};f.each(function(h,g){c[g]=h; });d=Hash.toQueryString(c,e);break;default:d=e+"="+encodeURIComponent(f);}if(f!=undefined){b.push(d);}});return b.join("&");}});Hash.alias({keyOf:"indexOf",hasValue:"contains"}); var Event=new Native({name:"Event",initialize:function(a,f){f=f||window;var k=f.document;a=a||f.event;if(a.$extended){return a;}this.$extended=true;var j=a.type; var g=a.target||a.srcElement;while(g&&g.nodeType==3){g=g.parentNode;}if(j.test(/key/)){var b=a.which||a.keyCode;var m=Event.Keys.keyOf(b);if(j=="keydown"){var d=b-111; if(d>0&&d<13){m="f"+d;}}m=m||String.fromCharCode(b).toLowerCase();}else{if(j.match(/(click|mouse|menu)/i)){k=(!k.compatMode||k.compatMode=="CSS1Compat")?k.html:k.body; var i={x:a.pageX||a.clientX+k.scrollLeft,y:a.pageY||a.clientY+k.scrollTop};var c={x:(a.pageX)?a.pageX-f.pageXOffset:a.clientX,y:(a.pageY)?a.pageY-f.pageYOffset:a.clientY}; if(j.match(/DOMMouseScroll|mousewheel/)){var h=(a.wheelDelta)?a.wheelDelta/120:-(a.detail||0)/3;}var e=(a.which==3)||(a.button==2);var l=null;if(j.match(/over|out/)){switch(j){case"mouseover":l=a.relatedTarget||a.fromElement; break;case"mouseout":l=a.relatedTarget||a.toElement;}if(!(function(){while(l&&l.nodeType==3){l=l.parentNode;}return true;}).create({attempt:Browser.Engine.gecko})()){l=false; }}}}return 
$extend(this,{event:a,type:j,page:i,client:c,rightClick:e,wheel:h,relatedTarget:l,target:g,code:b,key:m,shift:a.shiftKey,control:a.ctrlKey,alt:a.altKey,meta:a.metaKey}); }});Event.Keys=new Hash({enter:13,up:38,down:40,left:37,right:39,esc:27,space:32,backspace:8,tab:9,"delete":46});Event.implement({stop:function(){return this.stopPropagation().preventDefault(); },stopPropagation:function(){if(this.event.stopPropagation){this.event.stopPropagation();}else{this.event.cancelBubble=true;}return this;},preventDefault:function(){if(this.event.preventDefault){this.event.preventDefault(); }else{this.event.returnValue=false;}return this;}});function Class(b){if(b instanceof Function){b={initialize:b};}var a=function(){Object.reset(this);if(a._prototyping){return this; }this._current=$empty;var c=(this.initialize)?this.initialize.apply(this,arguments):this;delete this._current;delete this.caller;return c;}.extend(this); a.implement(b);a.constructor=Class;a.prototype.constructor=a;return a;}Function.prototype.protect=function(){this._protected=true;return this;};Object.reset=function(a,c){if(c==null){for(var e in a){Object.reset(a,e); }return a;}delete a[c];switch($type(a[c])){case"object":var d=function(){};d.prototype=a[c];var b=new d;a[c]=Object.reset(b);break;case"array":a[c]=$unlink(a[c]); break;}return a;};new Native({name:"Class",initialize:Class}).extend({instantiate:function(b){b._prototyping=true;var a=new b;delete b._prototyping;return a; },wrap:function(a,b,c){if(c._origin){c=c._origin;}return function(){if(c._protected&&this._current==null){throw new Error('The method "'+b+'" cannot be called.'); }var e=this.caller,f=this._current;this.caller=f;this._current=arguments.callee;var d=c.apply(this,arguments);this._current=f;this.caller=e;return d;}.extend({_owner:a,_origin:c,_name:b}); }});Class.implement({implement:function(a,d){if($type(a)=="object"){for(var e in a){this.implement(e,a[e]);}return this;}var f=Class.Mutators[a];if(f){d=f.call(this,d); 
if(d==null){return this;}}var c=this.prototype;switch($type(d)){case"function":if(d._hidden){return this;}c[a]=Class.wrap(this,a,d);break;case"object":var b=c[a]; if($type(b)=="object"){$mixin(b,d);}else{c[a]=$unlink(d);}break;case"array":c[a]=$unlink(d);break;default:c[a]=d;}return this;}});Class.Mutators={Extends:function(a){this.parent=a; this.prototype=Class.instantiate(a);this.implement("parent",function(){var b=this.caller._name,c=this.caller._owner.parent.prototype[b];if(!c){throw new Error('The method "'+b+'" has no parent.'); }return c.apply(this,arguments);}.protect());},Implements:function(a){$splat(a).each(function(b){if(b instanceof Function){b=Class.instantiate(b);}this.implement(b); },this);}};var Chain=new Class({$chain:[],chain:function(){this.$chain.extend(Array.flatten(arguments));return this;},callChain:function(){return(this.$chain.length)?this.$chain.shift().apply(this,arguments):false; },clearChain:function(){this.$chain.empty();return this;}});var Events=new Class({$events:{},addEvent:function(c,b,a){c=Events.removeOn(c);if(b!=$empty){this.$events[c]=this.$events[c]||[]; this.$events[c].include(b);if(a){b.internal=true;}}return this;},addEvents:function(a){for(var b in a){this.addEvent(b,a[b]);}return this;},fireEvent:function(c,b,a){c=Events.removeOn(c); if(!this.$events||!this.$events[c]){return this;}this.$events[c].each(function(d){d.create({bind:this,delay:a,"arguments":b})();},this);return this;},removeEvent:function(b,a){b=Events.removeOn(b); if(!this.$events[b]){return this;}if(!a.internal){this.$events[b].erase(a);}return this;},removeEvents:function(c){var d;if($type(c)=="object"){for(d in c){this.removeEvent(d,c[d]); }return this;}if(c){c=Events.removeOn(c);}for(d in this.$events){if(c&&c!=d){continue;}var b=this.$events[d];for(var a=b.length;a--;a){this.removeEvent(d,b[a]); }}return this;}});Events.removeOn=function(a){return a.replace(/^on([A-Z])/,function(b,c){return c.toLowerCase();});};var Options=new 
Class({setOptions:function(){this.options=$merge.run([this.options].extend(arguments)); if(!this.addEvent){return this;}for(var a in this.options){if($type(this.options[a])!="function"||!(/^on[A-Z]/).test(a)){continue;}this.addEvent(a,this.options[a]); delete this.options[a];}return this;}});var Element=new Native({name:"Element",legacy:window.Element,initialize:function(a,b){var c=Element.Constructors.get(a); if(c){return c(b);}if(typeof a=="string"){return document.newElement(a,b);}return document.id(a).set(b);},afterImplement:function(a,b){Element.Prototype[a]=b; if(Array[a]){return;}Elements.implement(a,function(){var c=[],g=true;for(var e=0,d=this.length;e";}return document.id(this.createElement(a)).set(b);},newTextNode:function(a){return this.createTextNode(a); },getDocument:function(){return this;},getWindow:function(){return this.window;},id:(function(){var a={string:function(d,c,b){d=b.getElementById(d);return(d)?a.element(d,c):null; },element:function(b,e){$uid(b);if(!e&&!b.$family&&!(/^object|embed$/i).test(b.tagName)){var c=Element.Prototype;for(var d in c){b[d]=c[d];}}return b;},object:function(c,d,b){if(c.toElement){return a.element(c.toElement(b),d); }return null;}};a.textnode=a.whitespace=a.window=a.document=$arguments(0);return function(c,e,d){if(c&&c.$family&&c.uid){return c;}var b=$type(c);return(a[b])?a[b](c,e,d||document):null; };})()});if(window.$==null){Window.implement({$:function(a,b){return document.id(a,b,this.document);}});}Window.implement({$$:function(a){if(arguments.length==1&&typeof a=="string"){return this.document.getElements(a); }var f=[];var c=Array.flatten(arguments);for(var d=0,b=c.length;d1);a.each(function(e){var f=this.getElementsByTagName(e.trim());(b)?c.extend(f):c=f; },this);return new Elements(c,{ddup:b,cash:!d});}});(function(){var h={},f={};var i={input:"checked",option:"selected",textarea:(Browser.Engine.webkit&&Browser.Engine.version<420)?"innerHTML":"value"}; var c=function(l){return(f[l]||(f[l]={}));};var 
g=function(n,l){if(!n){return;}var m=n.uid;if(Browser.Engine.trident){if(n.clearAttributes){var q=l&&n.cloneNode(false); n.clearAttributes();if(q){n.mergeAttributes(q);}}else{if(n.removeEvents){n.removeEvents();}}if((/object/i).test(n.tagName)){for(var o in n){if(typeof n[o]=="function"){n[o]=$empty; }}Element.dispose(n);}}if(!m){return;}h[m]=f[m]=null;};var d=function(){Hash.each(h,g);if(Browser.Engine.trident){$A(document.getElementsByTagName("object")).each(g); }if(window.CollectGarbage){CollectGarbage();}h=f=null;};var j=function(n,l,s,m,p,r){var o=n[s||l];var q=[];while(o){if(o.nodeType==1&&(!m||Element.match(o,m))){if(!p){return document.id(o,r); }q.push(o);}o=o[l];}return(p)?new Elements(q,{ddup:false,cash:!r}):null;};var e={html:"innerHTML","class":"className","for":"htmlFor",defaultValue:"defaultValue",text:(Browser.Engine.trident||(Browser.Engine.webkit&&Browser.Engine.version<420))?"innerText":"textContent"}; var b=["compact","nowrap","ismap","declare","noshade","checked","disabled","readonly","multiple","selected","noresize","defer"];var k=["value","type","defaultValue","accessKey","cellPadding","cellSpacing","colSpan","frameBorder","maxLength","readOnly","rowSpan","tabIndex","useMap"]; b=b.associate(b);Hash.extend(e,b);Hash.extend(e,k.associate(k.map(String.toLowerCase)));var a={before:function(m,l){if(l.parentNode){l.parentNode.insertBefore(m,l); }},after:function(m,l){if(!l.parentNode){return;}var n=l.nextSibling;(n)?l.parentNode.insertBefore(m,n):l.parentNode.appendChild(m);},bottom:function(m,l){l.appendChild(m); },top:function(m,l){var n=l.firstChild;(n)?l.insertBefore(m,n):l.appendChild(m);}};a.inside=a.bottom;Hash.each(a,function(l,m){m=m.capitalize();Element.implement("inject"+m,function(n){l(this,document.id(n,true)); return this;});Element.implement("grab"+m,function(n){l(document.id(n,true),this);return this;});});Element.implement({set:function(o,m){switch($type(o)){case"object":for(var n in o){this.set(n,o[n]); }break;case"string":var 
l=Element.Properties.get(o);(l&&l.set)?l.set.apply(this,Array.slice(arguments,1)):this.setProperty(o,m);}return this;},get:function(m){var l=Element.Properties.get(m); return(l&&l.get)?l.get.apply(this,Array.slice(arguments,1)):this.getProperty(m);},erase:function(m){var l=Element.Properties.get(m);(l&&l.erase)?l.erase.apply(this):this.removeProperty(m); return this;},setProperty:function(m,n){var l=e[m];if(n==undefined){return this.removeProperty(m);}if(l&&b[m]){n=!!n;}(l)?this[l]=n:this.setAttribute(m,""+n); return this;},setProperties:function(l){for(var m in l){this.setProperty(m,l[m]);}return this;},getProperty:function(m){var l=e[m];var n=(l)?this[l]:this.getAttribute(m,2); return(b[m])?!!n:(l)?n:n||null;},getProperties:function(){var l=$A(arguments);return l.map(this.getProperty,this).associate(l);},removeProperty:function(m){var l=e[m]; (l)?this[l]=(l&&b[m])?false:"":this.removeAttribute(m);return this;},removeProperties:function(){Array.each(arguments,this.removeProperty,this);return this; },hasClass:function(l){return this.className.contains(l," ");},addClass:function(l){if(!this.hasClass(l)){this.className=(this.className+" "+l).clean(); }return this;},removeClass:function(l){this.className=this.className.replace(new RegExp("(^|\\s)"+l+"(?:\\s|$)"),"$1");return this;},toggleClass:function(l){return this.hasClass(l)?this.removeClass(l):this.addClass(l); },adopt:function(){Array.flatten(arguments).each(function(l){l=document.id(l,true);if(l){this.appendChild(l);}},this);return this;},appendText:function(m,l){return this.grab(this.getDocument().newTextNode(m),l); },grab:function(m,l){a[l||"bottom"](document.id(m,true),this);return this;},inject:function(m,l){a[l||"bottom"](this,document.id(m,true));return this;},replaces:function(l){l=document.id(l,true); l.parentNode.replaceChild(this,l);return this;},wraps:function(m,l){m=document.id(m,true);return this.replaces(m).grab(m,l);},getPrevious:function(l,m){return j(this,"previousSibling",null,l,false,m); 
},getAllPrevious:function(l,m){return j(this,"previousSibling",null,l,true,m);},getNext:function(l,m){return j(this,"nextSibling",null,l,false,m);},getAllNext:function(l,m){return j(this,"nextSibling",null,l,true,m); },getFirst:function(l,m){return j(this,"nextSibling","firstChild",l,false,m);},getLast:function(l,m){return j(this,"previousSibling","lastChild",l,false,m); },getParent:function(l,m){return j(this,"parentNode",null,l,false,m);},getParents:function(l,m){return j(this,"parentNode",null,l,true,m);},getSiblings:function(l,m){return this.getParent().getChildren(l,m).erase(this); },getChildren:function(l,m){return j(this,"nextSibling","firstChild",l,true,m);},getWindow:function(){return this.ownerDocument.window;},getDocument:function(){return this.ownerDocument; },getElementById:function(o,n){var m=this.ownerDocument.getElementById(o);if(!m){return null;}for(var l=m.parentNode;l!=this;l=l.parentNode){if(!l){return null; }}return document.id(m,n);},getSelected:function(){return new Elements($A(this.options).filter(function(l){return l.selected;}));},getComputedStyle:function(m){if(this.currentStyle){return this.currentStyle[m.camelCase()]; }var l=this.getDocument().defaultView.getComputedStyle(this,null);return(l)?l.getPropertyValue([m.hyphenate()]):null;},toQueryString:function(){var l=[]; this.getElements("input, select, textarea",true).each(function(m){if(!m.name||m.disabled||m.type=="submit"||m.type=="reset"||m.type=="file"){return;}var n=(m.tagName.toLowerCase()=="select")?Element.getSelected(m).map(function(o){return o.value; }):((m.type=="radio"||m.type=="checkbox")&&!m.checked)?null:m.value;$splat(n).each(function(o){if(typeof o!="undefined"){l.push(m.name+"="+encodeURIComponent(o)); }});});return l.join("&");},clone:function(o,l){o=o!==false;var r=this.cloneNode(o);var n=function(v,u){if(!l){v.removeAttribute("id");}if(Browser.Engine.trident){v.clearAttributes(); v.mergeAttributes(u);v.removeAttribute("uid");if(v.options){var 
w=v.options,s=u.options;for(var t=w.length;t--;){w[t].selected=s[t].selected;}}}var x=i[u.tagName.toLowerCase()]; if(x&&u[x]){v[x]=u[x];}};if(o){var p=r.getElementsByTagName("*"),q=this.getElementsByTagName("*");for(var m=p.length;m--;){n(p[m],q[m]);}}n(r,this);return document.id(r); },destroy:function(){Element.empty(this);Element.dispose(this);g(this,true);return null;},empty:function(){$A(this.childNodes).each(function(l){Element.destroy(l); });return this;},dispose:function(){return(this.parentNode)?this.parentNode.removeChild(this):this;},hasChild:function(l){l=document.id(l,true);if(!l){return false; }if(Browser.Engine.webkit&&Browser.Engine.version<420){return $A(this.getElementsByTagName(l.tagName)).contains(l);}return(this.contains)?(this!=l&&this.contains(l)):!!(this.compareDocumentPosition(l)&16); },match:function(l){return(!l||(l==this)||(Element.get(this,"tag")==l));}});Native.implement([Element,Window,Document],{addListener:function(o,n){if(o=="unload"){var l=n,m=this; n=function(){m.removeListener("unload",n);l();};}else{h[this.uid]=this;}if(this.addEventListener){this.addEventListener(o,n,false);}else{this.attachEvent("on"+o,n); }return this;},removeListener:function(m,l){if(this.removeEventListener){this.removeEventListener(m,l,false);}else{this.detachEvent("on"+m,l);}return this; },retrieve:function(m,l){var o=c(this.uid),n=o[m];if(l!=undefined&&n==undefined){n=o[m]=l;}return $pick(n);},store:function(m,l){var n=c(this.uid);n[m]=l; return this;},eliminate:function(l){var m=c(this.uid);delete m[l];return this;}});window.addListener("unload",d);})();Element.Properties=new Hash;Element.Properties.style={set:function(a){this.style.cssText=a; },get:function(){return this.style.cssText;},erase:function(){this.style.cssText="";}};Element.Properties.tag={get:function(){return this.tagName.toLowerCase(); }};Element.Properties.html=(function(){var c=document.createElement("div");var a={table:[1,"<table>","</table>"],select:[1,"<select>","</select>"],tbody:[2,"<table><tbody>","</tbody></table>"],tr:[3,"<table><tbody><tr>","</tr></tbody></table>"]};
a.thead=a.tfoot=a.tbody;var b={set:function(){var e=Array.flatten(arguments).join("");var f=Browser.Engine.trident&&a[this.get("tag")];if(f){var g=c;g.innerHTML=f[1]+e+f[2]; for(var d=f[0];d--;){g=g.firstChild;}this.empty().adopt(g.childNodes);}else{this.innerHTML=e;}}};b.erase=b.set;return b;})();if(Browser.Engine.webkit&&Browser.Engine.version<420){Element.Properties.text={get:function(){if(this.innerText){return this.innerText; }var a=this.ownerDocument.newElement("div",{html:this.innerHTML}).inject(this.ownerDocument.body);var b=a.innerText;a.destroy();return b;}};}Element.Properties.events={set:function(a){this.addEvents(a); }};Native.implement([Element,Window,Document],{addEvent:function(e,g){var h=this.retrieve("events",{});h[e]=h[e]||{keys:[],values:[]};if(h[e].keys.contains(g)){return this; }h[e].keys.push(g);var f=e,a=Element.Events.get(e),c=g,i=this;if(a){if(a.onAdd){a.onAdd.call(this,g);}if(a.condition){c=function(j){if(a.condition.call(this,j)){return g.call(this,j); }return true;};}f=a.base||f;}var d=function(){return g.call(i);};var b=Element.NativeEvents[f];if(b){if(b==2){d=function(j){j=new Event(j,i.getWindow()); if(c.call(i,j)===false){j.stop();}};}this.addListener(f,d);}h[e].values.push(d);return this;},removeEvent:function(c,b){var a=this.retrieve("events");if(!a||!a[c]){return this; }var f=a[c].keys.indexOf(b);if(f==-1){return this;}a[c].keys.splice(f,1);var e=a[c].values.splice(f,1)[0];var d=Element.Events.get(c);if(d){if(d.onRemove){d.onRemove.call(this,b); }c=d.base||c;}return(Element.NativeEvents[c])?this.removeListener(c,e):this;},addEvents:function(a){for(var b in a){this.addEvent(b,a[b]);}return this; },removeEvents:function(a){var c;if($type(a)=="object"){for(c in a){this.removeEvent(c,a[c]);}return this;}var b=this.retrieve("events");if(!b){return this; }if(!a){for(c in b){this.removeEvents(c);}this.eliminate("events");}else{if(b[a]){while(b[a].keys[0]){this.removeEvent(a,b[a].keys[0]);}b[a]=null;}}return this; 
},fireEvent:function(d,b,a){var c=this.retrieve("events");if(!c||!c[d]){return this;}c[d].keys.each(function(e){e.create({bind:this,delay:a,"arguments":b})(); },this);return this;},cloneEvents:function(d,a){d=document.id(d);var c=d.retrieve("events");if(!c){return this;}if(!a){for(var b in c){this.cloneEvents(d,b); }}else{if(c[a]){c[a].keys.each(function(e){this.addEvent(a,e);},this);}}return this;}});Element.NativeEvents={click:2,dblclick:2,mouseup:2,mousedown:2,contextmenu:2,mousewheel:2,DOMMouseScroll:2,mouseover:2,mouseout:2,mousemove:2,selectstart:2,selectend:2,keydown:2,keypress:2,keyup:2,focus:2,blur:2,change:2,reset:2,select:2,submit:2,load:1,unload:1,beforeunload:2,resize:1,move:1,DOMContentLoaded:1,readystatechange:1,error:1,abort:1,scroll:1}; (function(){var a=function(b){var c=b.relatedTarget;if(c==undefined){return true;}if(c===false){return false;}return($type(this)!="document"&&c!=this&&c.prefix!="xul"&&!this.hasChild(c)); };Element.Events=new Hash({mouseenter:{base:"mouseover",condition:a},mouseleave:{base:"mouseout",condition:a},mousewheel:{base:(Browser.Engine.gecko)?"DOMMouseScroll":"mousewheel"}}); })();Element.Properties.styles={set:function(a){this.setStyles(a);}};Element.Properties.opacity={set:function(a,b){if(!b){if(a==0){if(this.style.visibility!="hidden"){this.style.visibility="hidden"; }}else{if(this.style.visibility!="visible"){this.style.visibility="visible";}}}if(!this.currentStyle||!this.currentStyle.hasLayout){this.style.zoom=1;}if(Browser.Engine.trident){this.style.filter=(a==1)?"":"alpha(opacity="+a*100+")"; }this.style.opacity=a;this.store("opacity",a);},get:function(){return this.retrieve("opacity",1);}};Element.implement({setOpacity:function(a){return this.set("opacity",a,true); },getOpacity:function(){return this.get("opacity");},setStyle:function(b,a){switch(b){case"opacity":return this.set("opacity",parseFloat(a));case"float":b=(Browser.Engine.trident)?"styleFloat":"cssFloat"; }b=b.camelCase();if($type(a)!="string"){var 
c=(Element.Styles.get(b)||"@").split(" ");a=$splat(a).map(function(e,d){if(!c[d]){return"";}return($type(e)=="number")?c[d].replace("@",Math.round(e)):e; }).join(" ");}else{if(a==String(Number(a))){a=Math.round(a);}}this.style[b]=a;return this;},getStyle:function(g){switch(g){case"opacity":return this.get("opacity"); case"float":g=(Browser.Engine.trident)?"styleFloat":"cssFloat";}g=g.camelCase();var a=this.style[g];if(!$chk(a)){a=[];for(var f in Element.ShortStyles){if(g!=f){continue; }for(var e in Element.ShortStyles[f]){a.push(this.getStyle(e));}return a.join(" ");}a=this.getComputedStyle(g);}if(a){a=String(a);var c=a.match(/rgba?\([\d\s,]+\)/); if(c){a=a.replace(c[0],c[0].rgbToHex());}}if(Browser.Engine.presto||(Browser.Engine.trident&&!$chk(parseInt(a,10)))){if(g.test(/^(height|width)$/)){var b=(g=="width")?["left","right"]:["top","bottom"],d=0; b.each(function(h){d+=this.getStyle("border-"+h+"-width").toInt()+this.getStyle("padding-"+h).toInt();},this);return this["offset"+g.capitalize()]-d+"px"; }if((Browser.Engine.presto)&&String(a).test("px")){return a;}if(g.test(/(border(.+)Width|margin|padding)/)){return"0px";}}return a;},setStyles:function(b){for(var a in b){this.setStyle(a,b[a]); }return this;},getStyles:function(){var a={};Array.flatten(arguments).each(function(b){a[b]=this.getStyle(b);},this);return a;}});Element.Styles=new Hash({left:"@px",top:"@px",bottom:"@px",right:"@px",width:"@px",height:"@px",maxWidth:"@px",maxHeight:"@px",minWidth:"@px",minHeight:"@px",backgroundColor:"rgb(@, @, @)",backgroundPosition:"@px @px",color:"rgb(@, @, @)",fontSize:"@px",letterSpacing:"@px",lineHeight:"@px",clip:"rect(@px @px @px @px)",margin:"@px @px @px @px",padding:"@px @px @px @px",border:"@px @ rgb(@, @, @) @px @ rgb(@, @, @) @px @ rgb(@, @, @)",borderWidth:"@px @px @px @px",borderStyle:"@ @ @ @",borderColor:"rgb(@, @, @) rgb(@, @, @) rgb(@, @, @) rgb(@, @, @)",zIndex:"@",zoom:"@",fontWeight:"@",textIndent:"@px",opacity:"@"}); 
Element.ShortStyles={margin:{},padding:{},border:{},borderWidth:{},borderStyle:{},borderColor:{}};["Top","Right","Bottom","Left"].each(function(g){var f=Element.ShortStyles; var b=Element.Styles;["margin","padding"].each(function(h){var i=h+g;f[h][i]=b[i]="@px";});var e="border"+g;f.border[e]=b[e]="@px @ rgb(@, @, @)";var d=e+"Width",a=e+"Style",c=e+"Color"; f[e]={};f.borderWidth[d]=f[e][d]=b[d]="@px";f.borderStyle[a]=f[e][a]=b[a]="@";f.borderColor[c]=f[e][c]=b[c]="rgb(@, @, @)";});(function(){Element.implement({scrollTo:function(h,i){if(b(this)){this.getWindow().scrollTo(h,i); }else{this.scrollLeft=h;this.scrollTop=i;}return this;},getSize:function(){if(b(this)){return this.getWindow().getSize();}return{x:this.offsetWidth,y:this.offsetHeight}; },getScrollSize:function(){if(b(this)){return this.getWindow().getScrollSize();}return{x:this.scrollWidth,y:this.scrollHeight};},getScroll:function(){if(b(this)){return this.getWindow().getScroll(); }return{x:this.scrollLeft,y:this.scrollTop};},getScrolls:function(){var i=this,h={x:0,y:0};while(i&&!b(i)){h.x+=i.scrollLeft;h.y+=i.scrollTop;i=i.parentNode; }return h;},getOffsetParent:function(){var h=this;if(b(h)){return null;}if(!Browser.Engine.trident){return h.offsetParent;}while((h=h.parentNode)&&!b(h)){if(d(h,"position")!="static"){return h; }}return null;},getOffsets:function(){if(this.getBoundingClientRect){var m=this.getBoundingClientRect(),k=document.id(this.getDocument().documentElement),i=k.getScroll(),n=(d(this,"position")=="fixed"); return{x:parseInt(m.left,10)+((n)?0:i.x)-k.clientLeft,y:parseInt(m.top,10)+((n)?0:i.y)-k.clientTop};}var j=this,h={x:0,y:0};if(b(this)){return h;}while(j&&!b(j)){h.x+=j.offsetLeft; h.y+=j.offsetTop;if(Browser.Engine.gecko){if(!f(j)){h.x+=c(j);h.y+=g(j);}var l=j.parentNode;if(l&&d(l,"overflow")!="visible"){h.x+=c(l);h.y+=g(l);}}else{if(j!=this&&Browser.Engine.webkit){h.x+=c(j); h.y+=g(j);}}j=j.offsetParent;}if(Browser.Engine.gecko&&!f(this)){h.x-=c(this);h.y-=g(this);}return 
h;},getPosition:function(k){if(b(this)){return{x:0,y:0}; }var l=this.getOffsets(),i=this.getScrolls();var h={x:l.x-i.x,y:l.y-i.y};var j=(k&&(k=document.id(k)))?k.getPosition():{x:0,y:0};return{x:h.x-j.x,y:h.y-j.y}; },getCoordinates:function(j){if(b(this)){return this.getWindow().getCoordinates();}var h=this.getPosition(j),i=this.getSize();var k={left:h.x,top:h.y,width:i.x,height:i.y}; k.right=k.left+k.width;k.bottom=k.top+k.height;return k;},computePosition:function(h){return{left:h.x-e(this,"margin-left"),top:h.y-e(this,"margin-top")}; },setPosition:function(h){return this.setStyles(this.computePosition(h));}});Native.implement([Document,Window],{getSize:function(){if(Browser.Engine.presto||Browser.Engine.webkit){var i=this.getWindow(); return{x:i.innerWidth,y:i.innerHeight};}var h=a(this);return{x:h.clientWidth,y:h.clientHeight};},getScroll:function(){var i=this.getWindow(),h=a(this); return{x:i.pageXOffset||h.scrollLeft,y:i.pageYOffset||h.scrollTop};},getScrollSize:function(){var i=a(this),h=this.getSize();return{x:Math.max(i.scrollWidth,h.x),y:Math.max(i.scrollHeight,h.y)}; },getPosition:function(){return{x:0,y:0};},getCoordinates:function(){var h=this.getSize();return{top:0,left:0,bottom:h.y,right:h.x,height:h.y,width:h.x}; }});var d=Element.getComputedStyle;function e(h,i){return d(h,i).toInt()||0;}function f(h){return d(h,"-moz-box-sizing")=="border-box";}function g(h){return e(h,"border-top-width"); }function c(h){return e(h,"border-left-width");}function b(h){return(/^(?:body|html)$/i).test(h.tagName);}function a(h){var i=h.getDocument();return(!i.compatMode||i.compatMode=="CSS1Compat")?i.html:i.body; }})();Element.alias("setPosition","position");Native.implement([Window,Document,Element],{getHeight:function(){return this.getSize().y;},getWidth:function(){return this.getSize().x; },getScrollTop:function(){return this.getScroll().y;},getScrollLeft:function(){return this.getScroll().x;},getScrollHeight:function(){return this.getScrollSize().y; 
},getScrollWidth:function(){return this.getScrollSize().x;},getTop:function(){return this.getPosition().y;},getLeft:function(){return this.getPosition().x; }});Element.Events.domready={onAdd:function(a){if(Browser.loaded){a.call(this);}}};(function(){var b=function(){if(Browser.loaded){return;}Browser.loaded=true; window.fireEvent("domready");document.fireEvent("domready");};if(Browser.Engine.trident){var a=document.createElement("div");(function(){($try(function(){a.doScroll(); return document.id(a).inject(document.body).set("html","temp").dispose();}))?b():arguments.callee.delay(50);})();}else{if(Browser.Engine.webkit&&Browser.Engine.version<525){(function(){(["loaded","complete"].contains(document.readyState))?b():arguments.callee.delay(50); })();}else{window.addEvent("load",b);document.addEvent("DOMContentLoaded",b);}}})();var JSON=new Hash({$specialChars:{"\b":"\\b","\t":"\\t","\n":"\\n","\f":"\\f","\r":"\\r",'"':'\\"',"\\":"\\\\"},$replaceChars:function(a){return JSON.$specialChars[a]||"\\u00"+Math.floor(a.charCodeAt()/16).toString(16)+(a.charCodeAt()%16).toString(16); },encode:function(b){switch($type(b)){case"string":return'"'+b.replace(/[\x00-\x1f\\"]/g,JSON.$replaceChars)+'"';case"array":return"["+String(b.map(JSON.encode).clean())+"]"; case"object":case"hash":var a=[];Hash.each(b,function(e,d){var c=JSON.encode(e);if(c){a.push(JSON.encode(d)+":"+c);}});return"{"+a+"}";case"number":case"boolean":return String(b); case false:return"null";}return null;},decode:function(string,secure){if($type(string)!="string"||!string.length){return null;}if(secure&&!(/^[,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]*$/).test(string.replace(/\\./g,"@").replace(/"[^"\\\n\r]*"/g,""))){return null; }return eval("("+string+")");}});Native.implement([Hash,Array,String,Number],{toJSON:function(){return JSON.encode(this);}});var Cookie=new Class({Implements:Options,options:{path:false,domain:false,duration:false,secure:false,document:document},initialize:function(b,a){this.key=b; 
this.setOptions(a);},write:function(b){b=encodeURIComponent(b);if(this.options.domain){b+="; domain="+this.options.domain;}if(this.options.path){b+="; path="+this.options.path; }if(this.options.duration){var a=new Date();a.setTime(a.getTime()+this.options.duration*24*60*60*1000);b+="; expires="+a.toGMTString();}if(this.options.secure){b+="; secure"; }this.options.document.cookie=this.key+"="+b;return this;},read:function(){var a=this.options.document.cookie.match("(?:^|;)\\s*"+this.key.escapeRegExp()+"=([^;]*)"); return(a)?decodeURIComponent(a[1]):null;},dispose:function(){new Cookie(this.key,$merge(this.options,{duration:-1})).write("");return this;}});Cookie.write=function(b,c,a){return new Cookie(b,a).write(c); };Cookie.read=function(a){return new Cookie(a).read();};Cookie.dispose=function(b,a){return new Cookie(b,a).dispose();};var Request=new Class({Implements:[Chain,Events,Options],options:{url:"",data:"",headers:{"X-Requested-With":"XMLHttpRequest",Accept:"text/javascript, text/html, application/xml, text/xml, */*"},async:true,format:false,method:"post",link:"ignore",isSuccess:null,emulation:true,urlEncoded:true,encoding:"utf-8",evalScripts:false,evalResponse:false,noCache:false},initialize:function(a){this.xhr=new Browser.Request(); this.setOptions(a);this.options.isSuccess=this.options.isSuccess||this.isSuccess;this.headers=new Hash(this.options.headers);},onStateChange:function(){if(this.xhr.readyState!=4||!this.running){return; }this.running=false;this.status=0;$try(function(){this.status=this.xhr.status;}.bind(this));this.xhr.onreadystatechange=$empty;if(this.options.isSuccess.call(this,this.status)){this.response={text:this.xhr.responseText,xml:this.xhr.responseXML}; this.success(this.response.text,this.response.xml);}else{this.response={text:null,xml:null};this.failure();}},isSuccess:function(){return((this.status>=200)&&(this.status<300)); 
},processScripts:function(a){if(this.options.evalResponse||(/(ecma|java)script/).test(this.getHeader("Content-type"))){return $exec(a);}return a.stripScripts(this.options.evalScripts); },success:function(b,a){this.onSuccess(this.processScripts(b),a);},onSuccess:function(){this.fireEvent("complete",arguments).fireEvent("success",arguments).callChain(); },failure:function(){this.onFailure();},onFailure:function(){this.fireEvent("complete").fireEvent("failure",this.xhr);},setHeader:function(a,b){this.headers.set(a,b); return this;},getHeader:function(a){return $try(function(){return this.xhr.getResponseHeader(a);}.bind(this));},check:function(){if(!this.running){return true; }switch(this.options.link){case"cancel":this.cancel();return true;case"chain":this.chain(this.caller.bind(this,arguments));return false;}return false;},send:function(k){if(!this.check(k)){return this; }this.running=true;var i=$type(k);if(i=="string"||i=="element"){k={data:k};}var d=this.options;k=$extend({data:d.data,url:d.url,method:d.method},k);var g=k.data,b=k.url,a=k.method.toLowerCase(); switch($type(g)){case"element":g=document.id(g).toQueryString();break;case"object":case"hash":g=Hash.toQueryString(g);}if(this.options.format){var j="format="+this.options.format; g=(g)?j+"&"+g:j;}if(this.options.emulation&&!["get","post"].contains(a)){var h="_method="+a;g=(g)?h+"&"+g:h;a="post";}if(this.options.urlEncoded&&a=="post"){var c=(this.options.encoding)?"; charset="+this.options.encoding:""; this.headers.set("Content-type","application/x-www-form-urlencoded"+c);}if(this.options.noCache){var f="noCache="+new Date().getTime();g=(g)?f+"&"+g:f; }var e=b.lastIndexOf("/");if(e>-1&&(e=b.indexOf("#"))>-1){b=b.substr(0,e);}if(g&&a=="get"){b=b+(b.contains("?")?"&":"?")+g;g=null;}this.xhr.open(a.toUpperCase(),b,this.options.async); this.xhr.onreadystatechange=this.onStateChange.bind(this);this.headers.each(function(m,l){try{this.xhr.setRequestHeader(l,m);}catch(n){this.fireEvent("exception",[l,m]); 
}},this);this.fireEvent("request");this.xhr.send(g);if(!this.options.async){this.onStateChange();}return this;},cancel:function(){if(!this.running){return this; }this.running=false;this.xhr.abort();this.xhr.onreadystatechange=$empty;this.xhr=new Browser.Request();this.fireEvent("cancel");return this;}});(function(){var a={}; ["get","post","put","delete","GET","POST","PUT","DELETE"].each(function(b){a[b]=function(){var c=Array.link(arguments,{url:String.type,data:$defined}); return this.send($extend(c,{method:b}));};});Request.implement(a);})();Element.Properties.send={set:function(a){var b=this.retrieve("send");if(b){b.cancel(); }return this.eliminate("send").store("send:options",$extend({data:this,link:"cancel",method:this.get("method")||"post",url:this.get("action")},a));},get:function(a){if(a||!this.retrieve("send")){if(a||!this.retrieve("send:options")){this.set("send",a); }this.store("send",new Request(this.retrieve("send:options")));}return this.retrieve("send");}};Element.implement({send:function(a){var b=this.get("send"); b.send({data:this,url:a||b.options.url});return this;}});Request.JSON=new Class({Extends:Request,options:{secure:true},initialize:function(a){this.parent(a); this.headers.extend({Accept:"application/json","X-Request":"JSON"});},success:function(a){this.response.json=JSON.decode(a,this.options.secure);this.onSuccess(this.response.json,a); }});
perldoc-html/static/mootools-1.2.3.1-more.js000644 000765 000024 00000037051 12276001417 020462 0ustar00jjstaff000000 000000 //MooTools More. Copyright (c) 2006-2009 Aaron Newton, Valerio Proietti & the MooTools team, MIT Style License.
MooTools.More={version:"1.2.3.1"};Class.refactor=function(b,a){$each(a,function(e,d){var c=b.prototype[d];if(c&&(c=c._origin)&&typeof e=="function"){b.implement(d,function(){var f=this.previous; this.previous=c;var g=e.apply(this,arguments);this.previous=f;return g;});}else{b.implement(d,e);}});return b;};Class.Mutators.Binds=function(a){return a; };Class.Mutators.initialize=function(a){return function(){$splat(this.Binds).each(function(b){var c=this[b];if(c){this[b]=c.bind(this);}},this);return a.apply(this,arguments); };};Class.Occlude=new Class({occlude:function(c,b){b=document.id(b||this.element);var a=b.retrieve(c||this.property);if(a&&!$defined(this.occluded)){this.occluded=a; }else{this.occluded=false;b.store(c||this.property,this);}return this.occluded;}});String.implement({parseQueryString:function(){var b=this.split(/[&;]/),a={}; if(b.length){b.each(function(g){var c=g.indexOf("="),d=c<0?[""]:g.substr(0,c).match(/[^\]\[]+/g),e=decodeURIComponent(g.substr(c+1)),f=a;d.each(function(j,h){var k=f[j]; if(h=0||h||r.allowNegative)?n.x:0).toInt(),top:((n.y>=0||h||r.allowNegative)?n.y:0).toInt()}; if(p.getStyle("position")=="fixed"||r.relFixedPosition){var f=window.getScroll();n.top=n.top.toInt()+f.y;n.left=n.left.toInt()+f.x;}if(r.returnPos){return n; }else{this.setStyles(n);}return this;}});})();Element.implement({isDisplayed:function(){return this.getStyle("display")!="none";},toggle:function(){return this[this.isDisplayed()?"hide":"show"](); },hide:function(){var b;try{if("none"!=this.getStyle("display")){b=this.getStyle("display");}}catch(a){}return this.store("originalDisplay",b||"block").setStyle("display","none"); },show:function(a){return this.setStyle("display",a||this.retrieve("originalDisplay")||"block");},swapClass:function(a,b){return this.removeClass(a).addClass(b); }});var OverText=new 
Class({Implements:[Options,Events,Class.Occlude],Binds:["reposition","assert","focus"],options:{element:"label",positionOptions:{position:"upperLeft",edge:"upperLeft",offset:{x:4,y:2}},poll:false,pollInterval:250},property:"OverText",initialize:function(b,a){this.element=document.id(b); if(this.occlude()){return this.occluded;}this.setOptions(a);this.attach(this.element);OverText.instances.push(this);if(this.options.poll){this.poll();}return this; },toElement:function(){return this.element;},attach:function(){var a=this.options.textOverride||this.element.get("alt")||this.element.get("title");if(!a){return; }this.text=new Element(this.options.element,{"class":"overTxtLabel",styles:{lineHeight:"normal",position:"absolute"},html:a,events:{click:this.hide.pass(true,this)}}).inject(this.element,"after"); if(this.options.element=="label"){this.text.set("for",this.element.get("id"));}this.element.addEvents({focus:this.focus,blur:this.assert,change:this.assert}).store("OverTextDiv",this.text); window.addEvent("resize",this.reposition.bind(this));this.assert(true);this.reposition();},startPolling:function(){this.pollingPaused=false;return this.poll(); },poll:function(a){if(this.poller&&!a){return this;}var b=function(){if(!this.pollingPaused){this.assert(true);}}.bind(this);if(a){$clear(this.poller); }else{this.poller=b.periodical(this.options.pollInterval,this);}return this;},stopPolling:function(){this.pollingPaused=true;return this.poll(true);},focus:function(){if(!this.text.isDisplayed()||this.element.get("disabled")){return; }this.hide();},hide:function(b){if(this.text.isDisplayed()&&!this.element.get("disabled")){this.text.hide();this.fireEvent("textHide",[this.text,this.element]); this.pollingPaused=true;try{if(!b){this.element.fireEvent("focus").focus();}}catch(a){}}return this;},show:function(){if(!this.text.isDisplayed()){this.text.show(); this.reposition();this.fireEvent("textShow",[this.text,this.element]);this.pollingPaused=false;}return 
this;},assert:function(a){this[this.test()?"show":"hide"](a); },test:function(){var a=this.element.get("value");return !a;},reposition:function(){this.assert(true);if(!this.element.getParent()||!this.element.offsetHeight){return this.stopPolling().hide(); }if(this.test()){this.text.position($merge(this.options.positionOptions,{relativeTo:this.element}));}return this;}});OverText.instances=[];OverText.update=function(){return OverText.instances.map(function(a){if(a.element&&a.text){return a.reposition(); }return null;});};if(window.Fx&&Fx.Reveal){Fx.Reveal.implement({hideInputs:Browser.Engine.trident?"select, input, textarea, object, embed, .overTxtLabel":false}); }var Drag=new Class({Implements:[Events,Options],options:{snap:6,unit:"px",grid:false,style:true,limit:false,handle:false,invert:false,preventDefault:false,modifiers:{x:"left",y:"top"}},initialize:function(){var b=Array.link(arguments,{options:Object.type,element:$defined}); this.element=document.id(b.element);this.document=this.element.getDocument();this.setOptions(b.options||{});var a=$type(this.options.handle);this.handles=((a=="array"||a=="collection")?$$(this.options.handle):document.id(this.options.handle))||this.element; this.mouse={now:{},pos:{}};this.value={start:{},now:{}};this.selection=(Browser.Engine.trident)?"selectstart":"mousedown";this.bound={start:this.start.bind(this),check:this.check.bind(this),drag:this.drag.bind(this),stop:this.stop.bind(this),cancel:this.cancel.bind(this),eventStop:$lambda(false)}; this.attach();},attach:function(){this.handles.addEvent("mousedown",this.bound.start);return this;},detach:function(){this.handles.removeEvent("mousedown",this.bound.start); return this;},start:function(c){if(this.options.preventDefault){c.preventDefault();}this.mouse.start=c.page;this.fireEvent("beforeStart",this.element); var a=this.options.limit;this.limit={x:[],y:[]};for(var d in 
this.options.modifiers){if(!this.options.modifiers[d]){continue;}if(this.options.style){this.value.now[d]=this.element.getStyle(this.options.modifiers[d]).toInt(); }else{this.value.now[d]=this.element[this.options.modifiers[d]];}if(this.options.invert){this.value.now[d]*=-1;}this.mouse.pos[d]=c.page[d]-this.value.now[d]; if(a&&a[d]){for(var b=2;b--;b){if($chk(a[d][b])){this.limit[d][b]=$lambda(a[d][b])();}}}}if($type(this.options.grid)=="number"){this.options.grid={x:this.options.grid,y:this.options.grid}; }this.document.addEvents({mousemove:this.bound.check,mouseup:this.bound.cancel});this.document.addEvent(this.selection,this.bound.eventStop);},check:function(a){if(this.options.preventDefault){a.preventDefault(); }var b=Math.round(Math.sqrt(Math.pow(a.page.x-this.mouse.start.x,2)+Math.pow(a.page.y-this.mouse.start.y,2)));if(b>this.options.snap){this.cancel();this.document.addEvents({mousemove:this.bound.drag,mouseup:this.bound.stop}); this.fireEvent("start",[this.element,a]).fireEvent("snap",this.element);}},drag:function(a){if(this.options.preventDefault){a.preventDefault();}this.mouse.now=a.page; for(var b in this.options.modifiers){if(!this.options.modifiers[b]){continue;}this.value.now[b]=this.mouse.now[b]-this.mouse.pos[b];if(this.options.invert){this.value.now[b]*=-1; }if(this.options.limit&&this.limit[b]){if($chk(this.limit[b][1])&&(this.value.now[b]>this.limit[b][1])){this.value.now[b]=this.limit[b][1];}else{if($chk(this.limit[b][0])&&(this.value.now[b]c.left&&a.xc.top); },checkDroppables:function(){var a=this.droppables.filter(this.checkAgainst,this).getLast();if(this.overed!=a){if(this.overed){this.fireEvent("leave",[this.element,this.overed]); }if(a){this.fireEvent("enter",[this.element,a]);}this.overed=a;}},drag:function(a){this.parent(a);if(this.options.checkDroppables&&this.droppables.length){this.checkDroppables(); }},stop:function(a){this.checkDroppables();this.fireEvent("drop",[this.element,this.overed,a]);this.overed=null;return 
this.parent(a);}});Element.implement({makeDraggable:function(a){var b=new Drag.Move(this,a); this.store("dragger",b);return b;}});Hash.Cookie=new Class({Extends:Cookie,options:{autoSave:true},initialize:function(b,a){this.parent(b,a);this.load(); },save:function(){var a=JSON.encode(this.hash);if(!a||a.length>4096){return false;}if(a=="{}"){this.dispose();}else{this.write(a);}return true;},load:function(){this.hash=new Hash(JSON.decode(this.read(),true)); return this;}});Hash.each(Hash.prototype,function(b,a){if(typeof b=="function"){Hash.Cookie.implement(a,function(){var c=b.apply(this.hash,arguments);if(this.options.autoSave){this.save(); }return c;});}});
a¼ÝæÚn{;?"9ö¢iƒm}}—/_ (ÇzOá)’ªFª6`Ö†¾Â±ØU:p€è,,ÆaÓÍ2!\ó}\°Ãƒ60ìD±ú kA?Îl°<žü2l‹t.²ˆeV iƱËYWí@‰EËõh£4VÁ,® 7Ñ` ]ƒÙMM#As0÷b=†%Œ¬>HÑ/°“¦Òt>eG©˜bÐÃ=ŒÒèªd`@”®¨'Å4H´8=²žÂ8±p˜DÁÊÊ .]º4 fggñÊ+¯ M»«JXÁÀ%¦¡QFTq>ª¸xeÆ >=ú@Y•_J}àà ŠêNPŒ•>]ÎÎÂTsHð^ìBMD* Q²a (ézÂx‚†gr]*8G‘Ú¤+Š•1䵑v=êbê¨1ð¶84hÐ{÷îáâÅ‹#k”> DúW8C)™^iëFÎ¥ú‚¤ †ƒµCˆIC0kU§({@ÙšlPŒÙ‚ÑMÙ ù=¶˜Ãi9gb7}ÀBp[ØaÓ¶6¢š?• Ôe˜ö0é°êtë¨è\gª{RMº¿–²°°€7ß|s$ LMM /Ʊó‘‚Τy4D™_ 0Æj¤áxe¬†Ëìª-5šHÒo¿¨j¼='P™!lÈcÁÀ†SOXˆ,™¾CSÞáÇMVð6)±Šeí…Yî>dr­SÍ"Q-Åqè¶áÌÌ ^~ùåñk×™ÈÀ.ˆæj+µŽê@™`C­´¬‘¥øSk$Ï–JŠ•›…çx½TlM…eX0eÅ%€³À.Êìi=lÀþí±Š¥Sf¥ìDh'ÕÎ(éº Õ‡›*µ„„j%SÑ@§.o da,d+Š´±¨]Tû.QÑ3•ù­qr¹)Ãn÷´ÑšÂ×Fj8n&&ጲ¯„p&HÀÕ¢r4ZÌRM¶UìlÄ/›}¡\ñ‘#€_?éa²ÜØŸ‚VA8×eI¦¶ˆ´À>lßuÖÖÖö\•í¢¸62±Îé‹l! ]ÃÊr'6¾Káz m%œÙmñ¬Ñ[$+¶i¼=0ƒÆPˆDç1%Ï•-Ø€Y——‰+Ká £S@³f®³T±ºv·){îß::ì¨+v• ; &æØ–––pöìÙ=XvÑÊzFR&ã-à+б4\„‹HÐ/®w—¹’,@‘·ùKñ§1u›QTa©'TŠezÚâ`>FâmòðôŒA 6 ã¤(öÙ¿E#pÚõî6ö„¿‘Û‡޿råŠÏzÂ*•UÉ œI©€×E¤òĆ¢¬jfÒ\`Q‰AÁ’Êúì­»Xј¶–ŠÝNmÙKc œ¦¤põì3¯«J¸´9³ž,f% ÙPN:…W_}µ÷ÞåË—ññÇûPUTk} †FÃ~-Vª÷†2<·â-#¤=™àBQר٠«AP„…tÁŠF”g\LüÏuL–ž^PûmW¯^Å믿As+d=j¦žb5`¡çCa«Ú÷§”žϤ½ « hÏzà ̦cT!’N¹y‘̼¸Ñ{óA‘Ïeìϧ£«Ptp^ ¤S6&·¦‚}·1´k×®áÂ… 躇ֶ kp(ìÅýaÎlȲzš¼Ÿ,+³=©b>‡U¦Œ(Jl’Ø%÷1žX'¥yë·øüñkпuÔtâ¬<´2ìÔéxÚÒÒ^{í5ììì«›Ðûëy„ezËÊ),Äó'ÄàÙžlU¹{@,Ò¦ƒ&™,b:Ó‚GÊþ&õoA¢«ÊÊEÚH<f¹yó&Î;‡ÍÍM4[¦wSearch results - this is the top result for your query '+"'"+query+"'. "); document.write('
    View all results');
      perldoc.clearFlag('fromSearch');
      perldoc.clearFlag('searchQuery');
    }
  }
}


//-------------------------------------------------------------------------
// pageIndex - functions to control the floating page index window
//-------------------------------------------------------------------------

var pageIndex = {

  // setup - called to initialise the page index --------------------------

  setup: function() {
    if ($('page_index')) {
      var pageIndexDrag = new Drag('page_index',{
        handle: 'page_index_title',
        onComplete: pageIndex.checkPosition
      });
      $('page_index_content').makeResizable({
        handle: 'page_index_resize',
        onComplete: pageIndex.checkSize
      });
      var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
      if (pageIndexSettings.get('status') == 'Visible') {
        pageIndex.show();
      } else {
        pageIndex.hide();
      }
    }
  },


  // show - displays the page index ---------------------------------------

  show: function() {
    var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
    if (pageIndexSettings.has('x') && pageIndexSettings.has('y')) {
      $('page_index').setStyle('left',pageIndexSettings.get('x'));
      $('page_index').setStyle('top',pageIndexSettings.get('y'));
    }
    if (pageIndexSettings.has('w') && pageIndexSettings.has('h')) {
      var paddingX = $('page_index_content').getStyle('padding-left').toInt() + $('page_index_content').getStyle('padding-right').toInt();
      var paddingY = $('page_index_content').getStyle('padding-top').toInt() + $('page_index_content').getStyle('padding-bottom').toInt();
      $('page_index_content').setStyle('width',pageIndexSettings.get('w') - paddingX);
      $('page_index_content').setStyle('height',pageIndexSettings.get('h') - paddingY);
    }
    pageIndex.windowResized();
    $('page_index').style.visibility = 'Visible';
    pageIndexSettings.set('status','Visible');
    $('page_index_toggle').innerHTML = 'Hide page index';
    $('page_index_toggle').removeEvent('click',pageIndex.show);
    $('page_index_toggle').addEvent('click',pageIndex.hide);
    window.addEvent('resize',pageIndex.windowResized);
    return false;
  },


  // hide - hides the page index ------------------------------------------

  hide: function() {
    $('page_index').style.visibility = 'Hidden';
    $('page_index_toggle').innerHTML = 'Show page index';
    $('page_index_toggle').removeEvent('click',pageIndex.hide);
    $('page_index_toggle').addEvent('click',pageIndex.show);
    window.removeEvent('resize',pageIndex.windowResized);
    var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
    pageIndexSettings.set('status','Hidden');
    return false;
  },


  // checkPosition - checks the index window is within the screen ---------

  checkPosition: function() {
    var pageIndexSize = $('page_index').getSize();
    var pageIndexPosition = {x:$('page_index').getStyle('left').toInt(), y:$('page_index').getStyle('top').toInt()};
    var windowSize = window.getSize();
    var newX = pageIndexPosition.x;
    var newY = pageIndexPosition.y;

    if (pageIndexPosition.x < 0) {newX = 0}
    if (windowSize.x < (pageIndexPosition.x + pageIndexSize.x)) {newX = Math.max(0,windowSize.x - pageIndexSize.x)}
    if (pageIndexPosition.y < 0) {newY = 0}
    if (windowSize.y < (pageIndexPosition.y + pageIndexSize.y)) {newY = Math.max(0,windowSize.y - pageIndexSize.y)}

    $('page_index').setStyle('left',newX);
    $('page_index').setStyle('top',newY);
    pageIndex.saveDimensions();
  },


  // checkSize - checks the index window is smaller than the screen -------

  checkSize: function() {
    var pageIndexSize = $('page_index').getSize();
    var pageIndexPosition = {x:$('page_index').getStyle('left').toInt(), y:$('page_index').getStyle('top').toInt()};
    var pageIndexHeaderSize = $('page_index_header').getSize();
    var pageIndexContentSize = $('page_index_content').getSize();
    var pageIndexFooterSize = $('page_index_footer').getSize();
    var windowSize = window.getSize();
    var newX = pageIndexContentSize.x;
    var newY = pageIndexContentSize.y;
    var paddingX = $('page_index_content').getStyle('padding-left').toInt() + $('page_index_content').getStyle('padding-right').toInt();
    var paddingY = $('page_index_content').getStyle('padding-top').toInt() + $('page_index_content').getStyle('padding-bottom').toInt();

    if (windowSize.x < (pageIndexPosition.x + pageIndexSize.x)) {newX = windowSize.x - pageIndexPosition.x}
    if (windowSize.y < (pageIndexPosition.y + pageIndexSize.y)) {newY = windowSize.y - pageIndexPosition.y - pageIndexFooterSize.y - pageIndexHeaderSize.y}

    $('page_index_content').setStyle('width',newX - paddingX);
    $('page_index_content').setStyle('height',newY - paddingY);
    pageIndex.saveDimensions();
  },


  // windowResized - check the index still fits if the window is resized --

  windowResized: function() {
    pageIndex.checkPosition();
    var windowSize = window.getSize();
    var pageIndexSize = $('page_index').getSize();
    if ((windowSize.x < pageIndexSize.x) || (windowSize.y < pageIndexSize.y)) {
      pageIndex.checkSize();
    }
  },


  // saveDimensions - stores the window size/position in a cookie ---------

  saveDimensions: function() {
    var pageIndexSettings = new Hash.Cookie('pageIndexSettings',{duration:365,path:"/"});
    var pageIndexPosition = {x:$('page_index').getStyle('left').toInt(), y:$('page_index').getStyle('top').toInt()};
    var pageIndexContentSize = $('page_index_content').getSize();
    pageIndexSettings.set('x',pageIndexPosition.x);
    pageIndexSettings.set('y',pageIndexPosition.y);
    pageIndexSettings.set('w',pageIndexContentSize.x);
    pageIndexSettings.set('h',pageIndexContentSize.y);
  }

};


//-------------------------------------------------------------------------
// recentPages - store and display the last viewed pages
//-------------------------------------------------------------------------

var recentPages = {

  // count - number of pages to store -------------------------------------

  count: 10,


  // setup - startup functions---------------------------------------------

  setup: function() {
    recentPages.show();
    if (perldoc.contentPage) {
      recentPages.add(perldoc.pageName,perldoc.pageAddress);
    }
  },


  // add - adds a page to the recent list ---------------------------------

  add: function(name,url) {
    var recentList = recentPages.load();

    // Remove page if it is already in the list
    recentList = recentList.filter(function(item) {
      return (item.url != url);
    });

    // Add page as the first item in the list
    recentList.unshift({
      'name': name,
      'url': url
    });

    // Truncate list to maximum length
    recentList.splice(recentPages.count);

    // Save list
    recentPages.save(recentList);
  },


  // show - displays the recent pages list --------------------------------

  show: function() {
    var recentList = recentPages.load();
    var recentHTML = "";
    if (recentList.length > 0) {
      recentHTML += '
      '; recentList.each(function(item){ recentHTML += '
    • ' + item.name + ''; }); recentHTML += '
    ';
    }
    $('recent_pages').set('html',recentHTML);
  },


  // load - loads the recent pages list -----------------------------------

  load: function() {
    return (perldoc.getFlag('recentPages') || new Array());
  },


  // save - saves the recent pages list -----------------------------------

  save: function(list) {
    perldoc.setFlag('recentPages',list);
  }

};


//-------------------------------------------------------------------------

window.onscroll = function() {
  var scrOfY = 0;
  if( typeof( window.pageYOffset ) == 'number' ) {
    //Netscape compliant
    scrOfY = window.pageYOffset;
  } else if( document.body && ( document.body.scrollLeft || document.body.scrollTop ) ) {
    //DOM compliant
    scrOfY = document.body.scrollTop;
  } else if( document.documentElement && ( document.documentElement.scrollLeft || document.documentElement.scrollTop ) ) {
    //IE6 standards compliant mode
    scrOfY = document.documentElement.scrollTop;
  }

  if (scrOfY >120) {
    $('content_header').style.position = "fixed";
    $('content_body').style.marginTop = "90px";
  } else {
    $('content_header').style.position = "static";
    $('content_body').style.marginTop = "0px";
  }
};

function goToTop () {
  window.scrollTo(0,0);
  $('content_header').style.position = "static";
  $('content_body').style.marginTop = "0px";
}


/* Smooth scrolling
   Changes links that link to other parts of this page to scroll
   smoothly to those links rather than jump to them directly, which
   can be a little disorienting.
   sil, http://www.kryogenix.org/

   v1.0 2003-11-11
   v1.1 2005-06-16 wrap it up in an object
*/

var ss = {
  fixAllLinks: function() {
    // Get a list of all links in the page
    var allLinks = document.getElementsByTagName('a');
    // Walk through the list
    for (var i=0;i tag corresponding to this href
    // First strip off the hash (first character)
    anchor = target.hash.substr(1);
    // Now loop all A tags until we find one with that name
    var allLinks = document.getElementsByTagName('a');
    var destinationLink = null;
    for (var i=0;i 120) && ((desty + window.innerHeight - 120) < ss.getDocHeight())) {
      window.scrollBy(0,-90);
    }

    // And stop the actual click happening
    if (window.event) {
      window.event.cancelBubble = true;
      window.event.returnValue = false;
    }
    if (e && e.preventDefault && e.stopPropagation) {
      e.preventDefault();
      e.stopPropagation();
    }
  },

  getDocHeight: function() {
    var D = document;
    return Math.max(
      Math.max(D.body.scrollHeight, D.documentElement.scrollHeight),
      Math.max(D.body.offsetHeight, D.documentElement.offsetHeight),
      Math.max(D.body.clientHeight, D.documentElement.clientHeight)
    );
  },

  scrollWindow: function(scramount,dest,anchor) {
    wascypos = ss.getCurrentYPos();
    isAbove = (wascypos < dest);
    window.scrollTo(0,wascypos + scramount);
    iscypos = ss.getCurrentYPos();
    isAboveNow = (iscypos < dest);
    if ((isAbove != isAboveNow) || (wascypos == iscypos)) {
      // if we've just scrolled past the destination, or
      // we haven't moved from the last scroll (i.e., we're at the
      // bottom of the page) then scroll exactly to the link
      window.scrollTo(0,dest);
      // cancel the repeating timer
      clearInterval(ss.INTERVAL);
      // and jump to the link directly so the URL's right
      //location.hash = anchor;
    }
  },

  getCurrentYPos: function() {
    if (document.body && document.body.scrollTop)
      return document.body.scrollTop;
    if (document.documentElement && document.documentElement.scrollTop)
      return document.documentElement.scrollTop;
    if (window.pageYOffset)
      return window.pageYOffset;
    return 0;
  },

  addEvent: function(elm,
    evType, fn, useCapture) {
    // addEvent and removeEvent
    // cross-browser event handling for IE5+, NS6 and Mozilla
    // By Scott Andrew
    if (elm.addEventListener){
      elm.addEventListener(evType, fn, useCapture);
      return true;
    } else if (elm.attachEvent){
      var r = elm.attachEvent("on"+evType, fn);
      return r;
    } else {
      alert("Handler could not be removed");
    }
  }
}

ss.STEPS = 25;

ss.addEvent(window,"load",ss.fixAllLinks);
perldoc-html/static/perlversion.js000644 000765 000024 00000002723 12276001417 017417 0ustar00jjstaff000000 000000
// perlversion.js - writes Perl version drop-down menu

function selectPerlVersion(element) {
  if (element.value.substring(0,1) == '/') {
    location.href = element.value;
  }
}

document.write('');
perldoc-html/static/preferences.js000644 000765 000024 00000001406 12276001417 017345 0ustar00jjstaff000000 000000
var perldocPrefs = {

  // load - loads current settings into the form
  load: function() {
    var toolbarType = perldoc.getFlag('toolbar_position');
    if (toolbarType != 'standard') {
      $('toolbar_fixed').checked = true;
    } else {
      $('toolbar_standard').checked = true;
    }
  },

  // save - saves settings into the perldoc cookie
  save: function() {
    if ($('toolbar_standard').checked) {
      perldoc.setFlag('toolbar_position','standard');
    } else {
      perldoc.clearFlag('toolbar_position');
    }
    $('from_search').set('html','Your preferences have been saved.');
  },
  // cancel - cancels changing settings
  cancel: function() {
    location.href = "index.html";
  }
};
perldoc-html/static/search.js000644 000765 000024 00000026710 12276001417 016316 0ustar00jjstaff000000 000000 
// search.js
//
// perldoc.perl.org search engine

//-------------------------------------------------------------------------
// perldocSearch
//-------------------------------------------------------------------------

var perldocSearch = {

  // indexData - object to hold page indexes ------------------------------
  indexData: { },

  // run - runs the search query ------------------------------------------
  run: function(args) {
    if (args.q) {
      args.q = args.q.replace(/\+/g," ");
      $('results_title').innerHTML = 'Search results for query "' + encodeURI(args.q) + '"';
      if (args.r && args.r == "no") {
        perldocSearch.doFullSearch(args.q);
      } else {
        perldocSearch.doQuickSearch(args.q) || perldocSearch.doFullSearch(args.q);
      }
    } else {
      // no query string specified
    }
  },

  // doQuickSearch - search for an exact page match -----------------------
  doQuickSearch: function(query) {
    ScriptLoader.load('indexPod.js');
    ScriptLoader.load('indexFunctions.js');
    ScriptLoader.load('indexModules.js');
    if (perldocSearch.indexData.functions.has("_"+query.toLowerCase())) {
      perldoc.setFlag('fromSearch',true);
      perldoc.setFlag('searchQuery',query);
      location.replace("functions/"+query.toLowerCase()+".html");
      return true;
    }
    if (perldocSearch.indexData.pod.has(query.toLowerCase())) {
      perldoc.setFlag('fromSearch',true);
      perldoc.setFlag('searchQuery',query);
      location.replace(query.toLowerCase()+".html");
      return true;
    }
    if (perldocSearch.indexData.pod.has("perl" + query.toLowerCase())) {
      perldoc.setFlag('fromSearch',true);
      perldoc.setFlag('searchQuery',query);
      location.replace("perl"+query.toLowerCase()+".html");
      return true;
    }
    var moduleQuery = query.toLowerCase();
    moduleQuery = moduleQuery.replace(/\.pm$/,"");
    moduleQuery = moduleQuery.replace(/-/g," ");
    moduleQuery = moduleQuery.replace(/::/g," ");
    var found = false;
    perldocSearch.indexData.modules.each( function(description,name) {
      var moduleName = name.toLowerCase().replace(/::/g," ");
      if (moduleName == moduleQuery) {
        perldoc.setFlag('fromSearch',true);
        perldoc.setFlag('searchQuery',query);
        location.replace(name.replace(/::/g,"/") + ".html");
        found = true;
      }
    });
    return found;
  },

  // doFullSearch - run a complete search ---------------------------------
  doFullSearch: function(query) {
    // Split query string into individual words
    var queryWords = new Array();
    queryWords = query.toLowerCase().replace(/[^-:\w\.]/g," ").split(/\s+/);
    queryWords = queryWords.map(perldocSearch.stemWord);
    window.setTimeout(function(){perldocSearch.searchPod(queryWords)},0);
    window.setTimeout(function(){perldocSearch.searchFunctions(queryWords)},0);
    window.setTimeout(function(){perldocSearch.searchModules(queryWords)},0);
    window.setTimeout(function(){perldocSearch.searchFAQs(queryWords)},0);
  },

  // searchPod - full search of Pod documents -----------------------------
  searchPod: function(queryWords) {
    perldocSearch.displayProgress('pod_search_results');
    ScriptLoader.load('indexPod.js');
    var sortedResults = perldocSearch.performFullSearch(
      perldocSearch.indexData.pod,
      queryWords,
      function(name) { return name + ".html"; }
    );
    perldocSearch.displayResults('pod_search_results',sortedResults);
  },

  // searchFunctions - full search of Perl functions ----------------------
  searchFunctions: function(queryWords) {
    perldocSearch.displayProgress('function_search_results');
    ScriptLoader.load('indexFunctions.js');
    var sortedResults = perldocSearch.performFullSearch(
      perldocSearch.indexData.functions,
      queryWords,
      function(name) { return "functions/" + name + ".html"; },
      "_"
    );
    perldocSearch.displayResults('function_search_results',sortedResults);
  },

  // searchModules - perform a full search on modules ---------------------
  searchModules: function(queryWords) {
    perldocSearch.displayProgress('module_search_results');
    ScriptLoader.load('indexModules.js');
    var sortedResults = perldocSearch.performFullSearch(
      perldocSearch.indexData.modules,
      queryWords,
      function(name) { return name.replace(/::/g,"/") + ".html"; }
    );
    perldocSearch.displayResults('module_search_results',sortedResults);
  },

  // searchFAQs - perform a full search on FAQs ---------------------------
  searchFAQs: function(queryWords) {
    perldocSearch.displayProgress('faq_search_results');
    ScriptLoader.load('indexFAQs.js');
    var score = new Hash;
    var matched = new Hash;
    perldocSearch.indexData.faqs.each(function(faq,faqIndex){matched.set(faqIndex,0);});
    queryWords.each( function(word) {
      matched.each(function(found,faqIndex) {
        var faq = perldocSearch.indexData.faqs[faqIndex];
        var faqWords = new Array();
        faqWords = faq[1].toString().toLowerCase().replace(/[-.,\/\?\(\)\{\}=_+]/g," ").split(/\s+/);
        faqWords = faqWords.map(perldocSearch.stemWord);
        var faqText = faqWords.join(" ");
        if (word.length > 1) {
          if (faqText.indexOf(word) > -1) {
            matched.set(faqIndex,1);
            if (score.has(faqIndex)) {
              score.set(faqIndex,score.get(faqIndex)+2);
            } else {
              score.set(faqIndex,2);
            }
          }
        }
      });
      // Remove unmatched entries (score == 0)
      matched = matched.filter(function(value,key) {return value > 0;});
      matched = matched.map(function(){return 0;});
    });
    var sortedResults = matched.getKeys();
    sortedResults.sort(function(a,b){return score.get(b) - score.get(a)});
    sortedResults = sortedResults.map(function(faqIndex) {
      return new Hash ({
        "url": "perlfaq" + perldocSearch.indexData.faqs[faqIndex][0] + ".html#" + perldocSearch.newEscape(perldocSearch.indexData.faqs[faqIndex][1].trim()),
        "text": perldocSearch.indexData.faqs[faqIndex][1]
      });
    });
    perldocSearch.displayResults('faq_search_results',sortedResults);
  },

  // performFullSearch - name and description text search -----------------
  performFullSearch: function(dataSet, queryWords, nameToUrl, prefix) {
    var score = new Hash;
    var matched = dataSet.map(function(){return 0;});
    if (!prefix) {prefix = "";}
    queryWords.each(function(word) {
      matched.each(function(value,key) {
        var name = key;
        name = name.slice(prefix.length);
        name = name.toLowerCase();
        var descriptionWords = new Array();
        descriptionWords = dataSet[key].toLowerCase().replace(/[^-:\w\.]/g," ").split(/\s+/);
        descriptionWords = descriptionWords.map(perldocSearch.stemWord);
        var description = descriptionWords.join(" ");
        if (word.length > 1) {
          if (name == word) {
            matched.set(key,1);
            if (score.has(key)) {
              score.set(key,score.get(key)+20);
            } else {
              score.set(key,20);
            }
          } else if (name.indexOf(word) > -1) {
            matched.set(key,1);
            if (score.has(key)) {
              score.set(key,score.get(key)+15);
            } else {
              score.set(key,15);
            }
            if (name.indexOf(word) < 10) {
              score.set(key,score.get(key) - name.indexOf(word));
            }
            if (name.indexOf(word) >= 10) {
              score.set(key,score.get(key) - 10);
            }
          }
          if (description.indexOf(word) > -1) {
            matched.set(key,1);
            if (score.has(key)) {
              score.set(key,score.get(key)+5);
            } else {
              score.set(key,5);
            }
          }
        }
      });
      // Remove unmatched entries (score == 0)
      matched = matched.filter(function(value,key) {return value > 0;});
      matched = matched.map(function(){return 0;});
    });
    var sortedResults = matched.getKeys();
    sortedResults.sort(function(a,b){
      if (score.get(a) == score.get(b)) {
        return a.length - b.length;
      } else {
        return score.get(b) - score.get(a);
      }
    });
    sortedResults = sortedResults.map(function(name) {
      return new Hash ({
        "url": nameToUrl(name.slice(prefix.length)),
        "text": name.slice(prefix.length),
        "description": dataSet.get(name)
      });
    });
    return sortedResults;
  },

  // displayProgress - shows "Searching..." indicator ---------------------
  displayProgress: function(elementID) {
    $(elementID).innerHTML = 'Searching...';
  },

  // displayResults - shows search results --------------------------------
  displayResults: function(elementID,results) {
    if (results.length > 0) {
      var resultsHTML = "";
      results.each( function(result) {
        resultsHTML += '• ' + result.text + '';
        if (result.description) {
          resultsHTML += ' - ' + result.description;
        }
      });
      resultsHTML += "";
      $(elementID).innerHTML = resultsHTML;
    } else {
      $(elementID).innerHTML = "No matches found";
    }
  },

  // stemWord - returns the stem of a given word --------------------------
  stemWord: function(word) {
    word = word.toString().toLowerCase();
    word = word.replace(/[^-:\w\.]/g,"");
    word = word.replace(/\.pm$/,"");
    if (word.length > 5) {
      word = word.replace(/(\w+)ing$/,"$1");
      word = word.replace(/(\w+)ies$/,"$1y");
    }
    if (word.length > 3) {
      word = word.replace(/(\w+)s$/,"$1");
    }
    return word;
  },

  // newEscape - escape special characters --------------------------------
  newEscape: function(word) {
    word = escape(word);
    word = word.replace(/%20/g,"-");
    return word;
  }
}

//-------------------------------------------------------------------------
// ScriptLoader - load JavaScript files on demand
//-------------------------------------------------------------------------

var ScriptLoader = {
  request: null,
  loaded: {},
  load: function() {
    for (var i = 0, len = arguments.length; i < len; i++) {
      var filename = 'static/' + arguments[i];
      if (!this.loaded[filename]) {
        if (!this.request) {
          if (window.XMLHttpRequest)
            this.request = new XMLHttpRequest;
          else if (window.ActiveXObject) {
            try {
              this.request = new ActiveXObject('MSXML2.XMLHTTP');
            } catch (e) {
              this.request = new ActiveXObject('Microsoft.XMLHTTP');
            }
          }
        }
        if (this.request) {
          if (this.request.overrideMimeType) {
            this.request.overrideMimeType("text/javascript");
          }
          this.request.open('GET', filename, false);
          this.request.send(null);
          if (this.request.responseText) {
            this.globalEval(this.request.responseText);
            this.loaded[filename] = true;
          }
        }
      }
    }
  },
  globalEval: function(code) {
    if (window.execScript)
      window.execScript(code, 'javascript');
    else
      window.eval(code);
  }
}
perldoc-html/static/search.xml000644 000765 000024 00000004532 12276001417 016500 0ustar00jjstaff000000 000000 perldoc.perl.org Perl Programming Documentation UTF-8
perldoc-html/functions/-X.html000644 000765 000024 00000071650 12275777527 016436 0ustar00jjstaff000000 000000 -X - perldoc.perl.org


    -X

    • -X FILEHANDLE

    • -X EXPR
    • -X DIRHANDLE
    • -X

      A file test, where X is one of the letters listed below. This unary operator takes one argument, either a filename, a filehandle, or a dirhandle, and tests the associated file to see if something is true about it. If the argument is omitted, tests $_ , except for -t , which tests STDIN. Unless otherwise documented, it returns 1 for true and '' for false, or the undefined value if the file doesn't exist. Despite the funny names, precedence is the same as any other named unary operator. The operator may be any of:

      -r  File is readable by effective uid/gid.
      -w  File is writable by effective uid/gid.
      -x  File is executable by effective uid/gid.
      -o  File is owned by effective uid.

      -R  File is readable by real uid/gid.
      -W  File is writable by real uid/gid.
      -X  File is executable by real uid/gid.
      -O  File is owned by real uid.

      -e  File exists.
      -z  File has zero size (is empty).
      -s  File has nonzero size (returns size in bytes).

      -f  File is a plain file.
      -d  File is a directory.
      -l  File is a symbolic link.
      -p  File is a named pipe (FIFO), or Filehandle is a pipe.
      -S  File is a socket.
      -b  File is a block special file.
      -c  File is a character special file.
      -t  Filehandle is opened to a tty.

      -u  File has setuid bit set.
      -g  File has setgid bit set.
      -k  File has sticky bit set.

      -T  File is an ASCII text file (heuristic guess).
      -B  File is a "binary" file (opposite of -T).

      -M  Script start time minus file modification time, in days.
      -A  Same for access time.
      -C  Same for inode change time (Unix, may differ for other
          platforms)

      Example:

      while (<>) {
          chomp;
          next unless -f $_;  # ignore specials
          #...
      }
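A minimal sketch of how several of the tests listed above might be combined to classify a path. The helper name `describe_path` is hypothetical and exists only for this illustration; the paths come from the command line.

```perl
use strict;
use warnings;

# describe_path() is a hypothetical helper used only for illustration.
sub describe_path {
    my ($path) = @_;
    return "missing"    unless -e $path;          # -e: file exists
    return "directory"  if -d $path;              # -d: directory
    return "empty file" if -f $path && -z $path;  # -z: zero size
    return "file of " . (-s $path) . " bytes" if -f $path;  # -s: size in bytes
    return "other";                               # pipe, socket, device, ...
}

print "$_: ", describe_path($_), "\n" for @ARGV;
```

Each test here names the path explicitly, which costs one stat per test but keeps the example easy to follow.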

      Note that -s/a/b/ does not do a negated substitution. Saying -exp($foo) still works as expected, however: only single letters following a minus are interpreted as file tests.

      These operators are exempt from the "looks like a function" rule described in perlfunc. That is, an opening parenthesis after the operator does not affect how much of the following code constitutes the argument. Put the opening parentheses before the operator to separate it from code that follows (this applies only to operators with higher precedence than unary operators, of course):

      -s($file) + 1024   # probably wrong; same as -s($file + 1024)
      (-s $file) + 1024  # correct

      The interpretation of the file permission operators -r , -R , -w , -W , -x , and -X is by default based solely on the mode of the file and the uids and gids of the user. There may be other reasons you can't actually read, write, or execute the file: for example network filesystem access controls, ACLs (access control lists), read-only filesystems, and unrecognized executable formats. Note that the use of these six specific operators to verify if some operation is possible is usually a mistake, because it may be open to race conditions.
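One way to sidestep the race condition mentioned above is to attempt the operation and inspect the error, instead of testing with -r first. The following is only a sketch; `read_first_line` is a hypothetical helper name.

```perl
use strict;
use warnings;

# read_first_line() is a hypothetical helper: it tries the open directly
# and reports the failure reason, rather than asking -r beforehand.
sub read_first_line {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or return (undef, "cannot open $path: $!");
    my $line = <$fh>;
    close $fh;
    return ($line, undef);
}

my ($line, $err) = read_first_line($0);
print defined $err ? "$err\n" : "opened and read successfully\n";
```

Because the check and the use are the same system call, nothing can change the file's permissions in between.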

      Also note that, for the superuser on the local filesystems, the -r , -R , -w , and -W tests always return 1, and -x and -X return 1 if any execute bit is set in the mode. Scripts run by the superuser may thus need to do a stat() to determine the actual mode of the file, or temporarily set their effective uid to something else.

      If you are using ACLs, there is a pragma called filetest that may produce more accurate results than the bare stat() mode bits. When under use filetest 'access' the above-mentioned filetests test whether the permission can(not) be granted using the access(2) family of system calls. Also note that the -x and -X may under this pragma return true even if there are no execute permission bits set (nor any extra execute permission ACLs). This strangeness is due to the underlying system calls' definitions. Note also that, due to the implementation of use filetest 'access' , the _ special filehandle won't cache the results of the file tests when this pragma is in effect. Read the documentation for the filetest pragma for more information.

      The -T and -B switches work as follows. The first block or so of the file is examined for odd characters such as strange control codes or characters with the high bit set. If too many strange characters (>30%) are found, it's a -B file; otherwise it's a -T file. Also, any file containing a zero byte in the first block is considered a binary file. If -T or -B is used on a filehandle, the current IO buffer is examined rather than the first block. Both -T and -B return true on an empty file, or a file at EOF when testing a filehandle. Because you have to read a file to do the -T test, on most occasions you want to use a -f against the file first, as in next unless -f $file && -T $file .

      If any of the file tests (or either the stat or lstat operator) is given the special filehandle consisting of a solitary underline, then the stat structure of the previous file test (or stat operator) is used, saving a system call. (This doesn't work with -t , and you need to remember that lstat() and -l leave values in the stat structure for the symbolic link, not the real file.) (Also, if the stat buffer was filled by an lstat call, -T and -B will reset it with the results of stat _ ). Example:

      print "Can do.\n" if -r $a || -w _ || -x _;

      stat($filename);
      print "Readable\n" if -r _;
      print "Writable\n" if -w _;
      print "Executable\n" if -x _;
      print "Setuid\n" if -u _;
      print "Setgid\n" if -g _;
      print "Sticky\n" if -k _;
      print "Text\n" if -T _;
      print "Binary\n" if -B _;

      As of Perl 5.10.0, as a form of purely syntactic sugar, you can stack file test operators, in a way that -f -w -x $file is equivalent to -x $file && -w _ && -f _ . (This is only fancy syntax: if you use the return value of -f $file as an argument to another filetest operator, no special magic will happen.)
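A short sketch of stacked file tests in use. The helper name `is_readable_plain_file` is hypothetical; the stacked form is evaluated right to left, as the equivalence above describes.

```perl
use 5.010;    # stacked file tests need Perl 5.10 or later
use strict;
use warnings;

# is_readable_plain_file() is a hypothetical helper.
sub is_readable_plain_file {
    my ($path) = @_;
    # Stacked form; equivalent to: -r $path && -f _
    return (-f -r $path) ? 1 : 0;
}

say "$_: ", (is_readable_plain_file($_) ? "yes" : "no") for @ARGV;
```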

      Portability issues: -X in perlport.

      To avoid confusing would-be users of your code with mysterious syntax errors, put something like this at the top of your script:

      use 5.010;  # so filetest ops can stack
     
    perldoc-html/functions/AUTOLOAD.html000644 000765 000024 00000032542 12275777527 017357 0ustar00jjstaff000000 000000 AUTOLOAD - perldoc.perl.org


    AUTOLOAD

     
    perldoc-html/functions/BEGIN.html000644 000765 000024 00000032625 12275777531 016770 0ustar00jjstaff000000 000000 BEGIN - perldoc.perl.org


    BEGIN

     
    perldoc-html/functions/CHECK.html000644 000765 000024 00000032625 12275777524 016763 0ustar00jjstaff000000 000000 CHECK - perldoc.perl.org


    CHECK

     
    perldoc-html/functions/DESTROY.html000644 000765 000024 00000032541 12275777530 017271 0ustar00jjstaff000000 000000 DESTROY - perldoc.perl.org


    DESTROY

     
    perldoc-html/functions/END.html000644 000765 000024 00000032605 12275777530 016547 0ustar00jjstaff000000 000000 END - perldoc.perl.org


    END

     
    perldoc-html/functions/INIT.html000644 000765 000024 00000032615 12275777532 016707 0ustar00jjstaff000000 000000 INIT - perldoc.perl.org


    INIT

     
    perldoc-html/functions/UNITCHECK.html000644 000765 000024 00000032665 12275777530 017464 0ustar00jjstaff000000 000000 UNITCHECK - perldoc.perl.org


    UNITCHECK

     
    perldoc-html/functions/__DATA__.html000644 000765 000024 00000032561 12275777525 017453 0ustar00jjstaff000000 000000 __DATA__ - perldoc.perl.org


    __DATA__

     
    perldoc-html/functions/__END__.html000644 000765 000024 00000032551 12275777527 017351 0ustar00jjstaff000000 000000 __END__ - perldoc.perl.org


    __END__

     
    perldoc-html/functions/__FILE__.html000644 000765 000024 00000032520 12275777532 017452 0ustar00jjstaff000000 000000 __FILE__ - perldoc.perl.org


    __FILE__

    • __FILE__

      A special token that returns the name of the file in which it occurs.

     
    perldoc-html/functions/__LINE__.html000644 000765 000024 00000032504 12275777527 017470 0ustar00jjstaff000000 000000 __LINE__ - perldoc.perl.org


    __LINE__

    • __LINE__

      A special token that compiles to the current line number.
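A small sketch of a typical use: tagging a diagnostic message with its source location. The helper `where` is hypothetical. The tokens are passed in by the caller because they are resolved at the place where they appear in the source, not where the helper is defined.

```perl
use strict;
use warnings;

# where() is a hypothetical helper that formats a source location.
sub where {
    my ($file, $line) = @_;
    return "$file line $line";
}

# __FILE__ and __LINE__ are replaced at compile time with this
# file's name and the number of the line they are written on.
print "reached ", where(__FILE__, __LINE__), "\n";
```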

     
    perldoc-html/functions/__PACKAGE__.html000644 000765 000024 00000032553 12275777531 017773 0ustar00jjstaff000000 000000 __PACKAGE__ - perldoc.perl.org


    __PACKAGE__

    • __PACKAGE__

      A special token that returns the name of the package in which it occurs.
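For illustration, a sketch in which a package uses the token to label its own output. The package name `My::Logger` is an assumption made up for this example.

```perl
use strict;
use warnings;

package My::Logger;    # hypothetical package name

# __PACKAGE__ is resolved at compile time to the enclosing package.
sub tag { return "[" . __PACKAGE__ . "]" }

package main;

print My::Logger::tag(), "\n";    # prints "[My::Logger]"
```

Using the token instead of a hard-coded string means the label stays correct if the package is later renamed.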

     
    perldoc-html/functions/__SUB__.html000644 000765 000024 00000033601 12275777525 017367 0ustar00jjstaff000000 000000 __SUB__ - perldoc.perl.org


    __SUB__

    • __SUB__

      A special token that returns a reference to the current subroutine, or undef outside of a subroutine.

      The behaviour of __SUB__ within a regex code block (such as /(?{...})/ ) is subject to change.

      This token is only available under use v5.16 or the "current_sub" feature. See feature.
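A common use is recursion inside an anonymous subroutine, which has no name to call itself by. A minimal sketch:

```perl
use v5.16;       # enables the current_sub feature (and strict)
use warnings;

# An anonymous recursive factorial; __SUB__ refers to the sub itself.
my $factorial = sub {
    my ($n) = @_;
    return $n <= 1 ? 1 : $n * __SUB__->($n - 1);
};

say $factorial->(5);    # prints 120
```

Unlike a closure that captures its own variable, this form does not create a reference cycle.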

     
    perldoc-html/functions/abs.html000644 000765 000024 00000032637 12275777525 016717 0ustar00jjstaff000000 000000 abs - perldoc.perl.org


    abs

    • abs VALUE

    • abs

      Returns the absolute value of its argument. If VALUE is omitted, uses $_ .
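A brief sketch showing both forms, with an explicit argument and with the implicit $_ . The variable names are arbitrary.

```perl
use strict;
use warnings;

my @deltas = (-3, 4.5, 0, -0.25);

my @magnitudes;
for (@deltas) {
    push @magnitudes, abs;      # no argument: operates on $_
}

print "@magnitudes\n";          # prints "3 4.5 0 0.25"
print abs(-42), "\n";           # explicit argument: prints 42
```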

     
    perldoc-html/functions/accept.html000644 000765 000024 00000033411 12275777525 017400 0ustar00jjstaff000000 000000 accept - perldoc.perl.org


    accept

    • accept NEWSOCKET,GENERICSOCKET

      Accepts an incoming socket connection, just as accept(2) does. Returns the packed address if it succeeded, false otherwise. See the example in Sockets: Client/Server Communication in perlipc.

      On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptor, as determined by the value of $^F. See $^F in perlvar.
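The following is a self-contained sketch, not the perlipc example: it opens a listening socket on the loopback interface, connects to it from the same process, and accepts that connection. All variable names are made up for the illustration, and it assumes an IPv4 loopback interface is available.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM INADDR_LOOPBACK
              pack_sockaddr_in unpack_sockaddr_in inet_ntoa);

# Listening socket on 127.0.0.1; port 0 lets the OS pick a free port.
socket(my $listener, PF_INET, SOCK_STREAM, 0)         or die "socket: $!";
bind($listener, pack_sockaddr_in(0, INADDR_LOOPBACK)) or die "bind: $!";
listen($listener, 1)                                  or die "listen: $!";

# A client in the same process, so the sketch needs no second program.
socket(my $client, PF_INET, SOCK_STREAM, 0)           or die "socket: $!";
connect($client, getsockname($listener))              or die "connect: $!";

# accept() opens $conn and returns the peer's packed address.
my $packed = accept(my $conn, $listener)              or die "accept: $!";
my ($peer_port, $peer_ip) = unpack_sockaddr_in($packed);
print "accepted connection from ", inet_ntoa($peer_ip), ":$peer_port\n";

close $_ for $conn, $client, $listener;
```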

     
    perldoc-html/functions/alarm.html000644 000765 000024 00000043147 12275777531 017241 0ustar00jjstaff000000 000000 alarm - perldoc.perl.org


    alarm

    • alarm SECONDS

    • alarm

      Arranges to have a SIGALRM delivered to this process after the specified number of wallclock seconds has elapsed. If SECONDS is not specified, the value stored in $_ is used. (On some machines, unfortunately, the elapsed time may be up to one second less or more than you specified because of how seconds are counted, and process scheduling may delay the delivery of the signal even further.)

      Only one timer may be counting at once. Each call disables the previous timer, and an argument of 0 may be supplied to cancel the previous timer without starting a new one. The returned value is the amount of time remaining on the previous timer.

      For delays of finer granularity than one second, the Time::HiRes module (from CPAN, and starting from Perl 5.8 part of the standard distribution) provides ualarm(). You may also use Perl's four-argument version of select() leaving the first three arguments undefined, or you might be able to use the syscall interface to access setitimer(2) if your system supports it. See perlfaq8 for details.
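A minimal sketch of the four-argument select() approach to a sub-second pause. The helper name `nap` is hypothetical; Time::HiRes is used here only to measure the elapsed time.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # high-resolution time(), for measurement only

# nap() is a hypothetical helper: with the first three arguments
# undefined, select() simply waits for the given timeout in seconds.
sub nap {
    my ($seconds) = @_;
    select(undef, undef, undef, $seconds);
    return;
}

my $start = time;
nap(0.25);
printf "paused for about %.2f seconds\n", time - $start;
```

The pause may end early if a signal arrives, so code that needs an exact delay should check the elapsed time and wait again.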

      It is usually a mistake to intermix alarm and sleep calls, because sleep may be internally implemented on your system with alarm.

      If you want to use alarm to time out a system call you need to use an eval/die pair. You can't rely on the alarm causing the system call to fail with $! set to EINTR because Perl sets up signal handlers to restart system calls on some systems. Using eval/die always works, modulo the caveats given in Signals in perlipc.

      eval {
          local $SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required
          alarm $timeout;
          $nread = sysread SOCKET, $buffer, $size;
          alarm 0;
      };
      if ($@) {
          die unless $@ eq "alarm\n";   # propagate unexpected errors
          # timed out
      }
      else {
          # didn't
      }
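The same eval/die idiom can be wrapped in a reusable routine. This is only a sketch under stated assumptions: `with_timeout` is a hypothetical helper, it calls the code in scalar context, and it returns undef on timeout, so it cannot distinguish a timeout from code that legitimately returns undef.

```perl
use strict;
use warnings;

# with_timeout() is a hypothetical helper, not a core function.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "alarm\n" };   # NB: \n required
        alarm $seconds;
        $result = $code->();
        alarm 0;
        1;
    };
    return $result if $ok;
    alarm 0;                           # make sure no timer is left running
    die $@ unless $@ eq "alarm\n";     # propagate unexpected errors
    return;                            # timed out
}

my $answer = with_timeout(5, sub { 6 * 7 });
print defined $answer ? "finished: $answer\n" : "timed out\n";
```

Cancelling the timer in the failure path matters: if the code dies for an unrelated reason, a pending alarm would otherwise fire later with no handler installed.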

      For more information see perlipc.

      Portability issues: alarm in perlport.

     
    perldoc-html/functions/and.html000644 000765 000024 00000032441 12275777527 016707 0ustar00jjstaff000000 000000 and - perldoc.perl.org


    and

    • and

      These operators are documented in perlop.

     
    perldoc-html/functions/atan2.html000644 000765 000024 00000034257 12275777525 017157 0ustar00jjstaff000000 000000 atan2 - perldoc.perl.org


    atan2

    • atan2 Y,X

      Returns the arctangent of Y/X in the range -PI to PI.

      For the tangent operation, you may use the Math::Trig::tan function, or use the familiar relation:

      sub tan { sin($_[0]) / cos($_[0]) }

      The return value for atan2(0,0) is implementation-defined; consult your atan2(3) manpage for more information.
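A short sketch of two common uses: deriving PI, and finding the angle of a point in the correct quadrant. The helper `deg` is hypothetical and only converts radians to degrees.

```perl
use strict;
use warnings;

my $pi = 4 * atan2(1, 1);              # atan2(1,1) is PI/4

sub tan { sin($_[0]) / cos($_[0]) }    # the relation given above
sub deg { $_[0] * 180 / $pi }          # hypothetical radians-to-degrees helper

printf "pi           = %.6f\n", $pi;                          # 3.141593
printf "atan2(1,-1)  = %.1f degrees\n", deg(atan2(1, -1));    # 135.0
printf "tan(pi/4)    = %.1f\n", tan($pi / 4);                 # 1.0
```

Taking Y and X separately is what lets atan2 tell the second quadrant (135 degrees) apart from the fourth (-45 degrees), which a plain arctangent of Y/X cannot do.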

      Portability issues: atan2 in perlport.

     
    perldoc-html/functions/bind.html000644 000765 000024 00000033105 12275777531 017052 0ustar00jjstaff000000 000000 bind - perldoc.perl.org


    bind

    • bind SOCKET,NAME

      Binds a network address to a socket, just as bind(2) does. Returns true if it succeeded, false otherwise. NAME should be a packed address of the appropriate type for the socket. See the examples in Sockets: Client/Server Communication in perlipc.
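A minimal sketch of binding a TCP socket to the loopback address, using the Socket module to build the packed address. It assumes an IPv4 loopback interface; the variable names are arbitrary.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM INADDR_LOOPBACK
              pack_sockaddr_in unpack_sockaddr_in);

socket(my $server, PF_INET, SOCK_STREAM, 0) or die "socket: $!";

# NAME is a packed address; port 0 asks the OS to choose a free port.
my $name = pack_sockaddr_in(0, INADDR_LOOPBACK);
bind($server, $name) or die "bind: $!";

# Ask the socket which port it actually received.
my ($port) = unpack_sockaddr_in(getsockname($server));
print "bound to 127.0.0.1:$port\n";

close $server;
```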

     
    perldoc-html/functions/binmode.html000644 000765 000024 00000050645 12275777531 017563 0ustar00jjstaff000000 000000 binmode - perldoc.perl.org


    binmode

    • binmode FILEHANDLE, LAYER

    • binmode FILEHANDLE

      Arranges for FILEHANDLE to be read or written in "binary" or "text" mode on systems where the run-time libraries distinguish between binary and text files. If FILEHANDLE is an expression, the value is taken as the name of the filehandle. Returns true on success, otherwise it returns undef and sets $! (errno).

      On some systems (in general, DOS- and Windows-based systems) binmode() is necessary when you're not working with a text file. For the sake of portability it is a good idea always to use it when appropriate, and never to use it when it isn't appropriate. Also, people can set their I/O to be by default UTF8-encoded Unicode, not bytes.

      In other words: regardless of platform, use binmode() on binary data, like images, for example.

      If LAYER is present it is a single string, but may contain multiple directives. The directives alter the behaviour of the filehandle. When LAYER is present, using binmode on a text file makes sense.

      If LAYER is omitted or specified as :raw the filehandle is made suitable for passing binary data. This includes turning off possible CRLF translation and marking it as bytes (as opposed to Unicode characters). Note that, despite what may be implied in "Programming Perl" (the Camel, 3rd edition) or elsewhere, :raw is not simply the inverse of :crlf . Other layers that would affect the binary nature of the stream are also disabled. See PerlIO, perlrun, and the discussion about the PERLIO environment variable.

      The :bytes , :crlf , :utf8 , and any other directives of the form :... , are called I/O layers. The open pragma can be used to establish default I/O layers. See open.

      The LAYER parameter of the binmode() function is described as "DISCIPLINE" in "Programming Perl, 3rd Edition". However, since the publishing of this book, by many known as "Camel III", the consensus of the naming of this functionality has moved from "discipline" to "layer". All documentation of this version of Perl therefore refers to "layers" rather than to "disciplines". Now back to the regularly scheduled documentation...

      To mark FILEHANDLE as UTF-8, use :utf8 or :encoding(UTF-8) . :utf8 just marks the data as UTF-8 without further checking, while :encoding(UTF-8) checks the data for actually being valid UTF-8. More details can be found in PerlIO::encoding.
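A sketch of both uses described above: binmode() with no LAYER for binary data, and an encoding layer for text. The scratch file comes from File::Temp and exists only for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Binary data: call binmode() after open and before any I/O.
my ($fh, $path) = tempfile(UNLINK => 1);    # scratch file for the demo
binmode($fh) or die "binmode: $!";          # raw bytes, no CRLF translation
print {$fh} "\x89PNG\r\n\x1a\n";            # the 8-byte PNG signature
close $fh or die "close: $!";
printf "%d bytes written\n", (-s $path);    # 8 on every platform

# Text: :encoding(UTF-8) encodes characters on output and validates input.
binmode(STDOUT, ':encoding(UTF-8)') or die "binmode: $!";
print "caf\x{e9}\n";                        # written as UTF-8 bytes
```

Without the first binmode() call, a system that translates line endings could write more than eight bytes.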

      In general, binmode() should be called after open() but before any I/O is done on the filehandle. Calling binmode() normally flushes any pending buffered output data (and perhaps pending input data) on the handle. An exception to this is the :encoding layer that changes the default character encoding of the handle; see open. The :encoding layer sometimes needs to be called in mid-stream, and it doesn't flush the stream. The :encoding also implicitly pushes on top of itself the :utf8 layer because internally Perl operates on UTF8-encoded Unicode characters.

      The operating system, device drivers, C libraries, and Perl run-time system all conspire to let the programmer treat a single character (\n ) as the line terminator, irrespective of external representation. On many operating systems, the native text file representation matches the internal representation, but on some platforms the external representation of \n is made up of more than one character.

      All variants of Unix, Mac OS (old and new), and Stream_LF files on VMS use a single character to end each line in the external representation of text (even though that single character is CARRIAGE RETURN on old, pre-Darwin flavors of Mac OS, and is LINE FEED on Unix and most VMS files). In other systems like OS/2, DOS, and the various flavors of MS-Windows, your program sees a \n as a simple \cJ , but what's stored in text files are the two characters \cM\cJ . That means that if you don't use binmode() on these systems, \cM\cJ sequences on disk will be converted to \n on input, and any \n in your program will be converted back to \cM\cJ on output. This is what you want for text files, but it can be disastrous for binary files.

      Another consequence of using binmode() (on some systems) is that special end-of-file markers will be seen as part of the data stream. For systems from the Microsoft family this means that, if your binary data contain \cZ , the I/O subsystem will regard it as the end of the file, unless you use binmode().

      binmode() is important not only for readline() and print() operations, but also when using read(), seek(), sysread(), syswrite() and tell() (see perlport for more details). See the $/ and $\ variables in perlvar for how to manually set your input and output line-termination sequences.
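      As a minimal sketch of the advice above, the following writes and re-reads a few bytes that text-mode I/O could alter on some systems. The payload and the temporary file are assumptions made for illustration; binmode() is called right after each open and before any I/O.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical payload: CR, LF and \cZ are bytes that text mode may treat specially.
my $payload = "A\x0D\x0A\x1AB";

my ($out, $path) = tempfile(UNLINK => 1);
binmode($out);                       # right after open, before any I/O
print {$out} $payload;
close($out) or die "close failed: $!";

open(my $in, '<', $path) or die "Can't open $path: $!";
binmode($in);                        # raw bytes: no newline or \cZ handling
my $data = do { local $/; <$in> };   # slurp the whole file
close($in) or die "close failed: $!";

print length($data), " bytes read\n";
```

      On Unix the binmode() calls make no difference to this data, but keeping them is what makes the same code safe on systems that translate line endings.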

      Portability issues: binmode in perlport.

     
    bless

    • bless REF,CLASSNAME

    • bless REF

      This function tells the thingy referenced by REF that it is now an object in the CLASSNAME package. If CLASSNAME is omitted, the current package is used. Because a bless is often the last thing in a constructor, it returns the reference for convenience. Always use the two-argument version if a derived class might inherit the function doing the blessing. See perlobj for more about the blessing (and blessings) of objects.

      Consider always blessing objects in CLASSNAMEs that are mixed case. Namespaces with all lowercase names are considered reserved for Perl pragmata. Builtin types have all uppercase names. To prevent confusion, you may wish to avoid such package names as well. Make sure that CLASSNAME is a true value.
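      The advice about the two-argument form can be sketched as follows. The class names Animal and Animal::Dog are hypothetical; the point is that an inherited constructor blesses into whatever class it was invoked on.

```perl
use strict;
use warnings;

package Animal;

sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name} };
    # Two-argument form: $class is the class new() was invoked on,
    # which may be a subclass of Animal.
    return bless $self, $class;
}

package Animal::Dog;
our @ISA = ('Animal');    # inherits new() from Animal

package main;

my $pet = Animal::Dog->new(name => 'Rex');
print ref($pet), "\n";    # the class the object was blessed into
```

      With the one-argument form, new() would bless into Animal (the package where the call to bless is compiled), and the derived class would be lost.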

      See Perl Modules in perlmod.

     
    break

    • break

      Break out of a given() block.

      This keyword is enabled by the "switch" feature; see feature for more information on "switch" . You can also access it by prefixing it with CORE:: . Alternatively, include a use v5.10 or later in the current scope.

     
    caller

    • caller EXPR

    • caller

      Returns the context of the current subroutine call. In scalar context, returns the caller's package name if there is a caller (that is, if we're in a subroutine or eval or require) and the undefined value otherwise. In list context, returns

          # 0         1          2
          ($package, $filename, $line) = caller;

      With EXPR, it returns some extra information that the debugger uses to print a stack trace. The value of EXPR indicates how many call frames to go back before the current one.

          #  0         1          2      3            4
          ($package, $filename, $line, $subroutine, $hasargs,

          #  5          6          7            8       9         10
          $wantarray, $evaltext, $is_require, $hints, $bitmask, $hinthash)
            = caller($i);

      Here $subroutine may be (eval) if the frame is not a subroutine call, but an eval. In such a case additional elements $evaltext and $is_require are set: $is_require is true if the frame is created by a require or use statement, $evaltext contains the text of the eval EXPR statement. In particular, for an eval BLOCK statement, $subroutine is (eval) , but $evaltext is undefined. (Note also that each use statement creates a require frame inside an eval EXPR frame.) $subroutine may also be (unknown) if this particular subroutine happens to have been deleted from the symbol table. $hasargs is true if a new instance of @_ was set up for the frame. $hints and $bitmask contain pragmatic hints that the caller was compiled with. $hints corresponds to $^H , and $bitmask corresponds to ${^WARNING_BITS} . The $hints and $bitmask values are subject to change between versions of Perl, and are not meant for external use.
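      A small sketch of reading the $subroutine element (index 3) from two different frames. The subs whoami and outer are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: report its own name and the name of the sub that called it.
sub whoami {
    # caller(0) describes the call to whoami() itself; element 3 is its name.
    my $this_sub = (caller(0))[3];
    # caller(1) is one frame further back: the sub that called whoami().
    my @frame  = caller(1);
    my $parent = @frame ? $frame[3] : '(top level)';
    return "$this_sub called from $parent";
}

sub outer { return whoami() }

my $report = outer();
print "$report\n";
```

      When whoami() is called directly from the top level of a program there is no frame 1, so caller(1) returns an empty list and the sketch falls back to a placeholder string.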

      $hinthash is a reference to a hash containing the value of %^H when the caller was compiled, or undef if %^H was empty. Do not modify the values of this hash, as they are the actual values stored in the optree.

      Furthermore, when called from within the DB package in list context, and with an argument, caller returns more detailed information: it sets the list variable @DB::args to be the arguments with which the subroutine was invoked.

      Be aware that the optimizer might have optimized call frames away before caller had a chance to get the information. That means that caller(N) might not return information about the call frame you expect it to, for N > 1 . In particular, @DB::args might have information from the previous time caller was called.

      Be aware that setting @DB::args is best effort, intended for debugging or generating backtraces, and should not be relied upon. In particular, as @_ contains aliases to the caller's arguments, Perl does not take a copy of @_ , so @DB::args will contain modifications the subroutine makes to @_ or its contents, not the original values at call time. @DB::args , like @_ , does not hold explicit references to its elements, so under certain cases its elements may have become freed and reallocated for other variables or temporary values. Finally, a side effect of the current implementation is that the effects of shift @_ can normally be undone (but not pop @_ or other splicing, and not if a reference to @_ has been taken, and subject to the caveat about reallocated elements), so @DB::args is actually a hybrid of the current state and initial state of @_ . Buyer beware.

     
    chdir

    • chdir EXPR

    • chdir FILEHANDLE
    • chdir DIRHANDLE
    • chdir

      Changes the working directory to EXPR, if possible. If EXPR is omitted, changes to the directory specified by $ENV{HOME} , if set; if not, changes to the directory specified by $ENV{LOGDIR} . (Under VMS, the variable $ENV{SYS$LOGIN} is also checked, and used if it is set.) If neither is set, chdir does nothing. It returns true on success, false otherwise. See the example under die.

      On systems that support fchdir(2), you may pass a filehandle or directory handle as the argument. On systems that don't support fchdir(2), passing handles raises an exception.
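      Because chdir returns false on failure, its result is worth checking. A minimal sketch, assuming a temporary directory as the (hypothetical) destination and the core Cwd module to observe the effect:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start  = getcwd();
my $target = tempdir(CLEANUP => 1);    # hypothetical destination

chdir($target) or die "Can't chdir to $target: $!";
my $inside = getcwd();

# Return to where we started so later relative paths still work.
chdir($start) or die "Can't chdir back to $start: $!";
```

      The directory reported by getcwd() may differ textually from $target when symbolic links are involved, so the sketch avoids comparing the two strings.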

     
    chmod

    • chmod LIST

      Changes the permissions of a list of files. The first element of the list must be the numeric mode, which should probably be an octal number, and which definitely should not be a string of octal digits: 0644 is okay, but "0644" is not. Returns the number of files successfully changed. See also oct if all you have is a string.

          $cnt = chmod 0755, "foo", "bar";
          chmod 0755, @executables;
          $mode = "0644"; chmod $mode, "foo";      # !!! sets mode to
                                                   # --w----r-T
          $mode = "0644"; chmod oct($mode), "foo"; # this is better
          $mode = 0644;   chmod $mode, "foo";      # this is best

      On systems that support fchmod(2), you may pass filehandles among the files. On systems that don't support fchmod(2), passing filehandles raises an exception. Filehandles must be passed as globs or glob references to be recognized; barewords are considered filenames.

          open(my $fh, "<", "foo");
          my $perm = (stat $fh)[2] & 07777;
          chmod($perm | 0600, $fh);

      You can also import the symbolic S_I* constants from the Fcntl module:

          use Fcntl qw( :mode );
          chmod S_IRWXU|S_IRGRP|S_IXGRP|S_IROTH|S_IXOTH, @executables;
          # Identical to the chmod 0755 of the example above.

      Portability issues: chmod in perlport.

     
    chomp

    • chomp VARIABLE

    • chomp( LIST )
    • chomp

      This safer version of chop removes any trailing string that corresponds to the current value of $/ (also known as $INPUT_RECORD_SEPARATOR in the English module). It returns the total number of characters removed from all its arguments. It's often used to remove the newline from the end of an input record when you're worried that the final record may be missing its newline. When in paragraph mode ($/ = "" ), it removes all trailing newlines from the string. When in slurp mode ($/ = undef ) or fixed-length record mode ($/ is a reference to an integer or the like; see perlvar) chomp() won't remove anything. If VARIABLE is omitted, it chomps $_ . Example:

          while (<>) {
              chomp;  # avoid \n on last field
              @array = split(/:/);
              # ...
          }

      If VARIABLE is a hash, it chomps the hash's values, but not its keys.

      You can actually chomp anything that's an lvalue, including an assignment:

          chomp($cwd = `pwd`);
          chomp($answer = <STDIN>);

      If you chomp a list, each element is chomped, and the total number of characters removed is returned.
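      The list and hash behavior can be sketched as follows, assuming $/ still has its default value of "\n". The data is made up for illustration.

```perl
use strict;
use warnings;

my @lines   = ("alpha\n", "beta", "gamma\n");
my $removed = chomp(@lines);     # counts characters removed across all elements

my %conf = ("key\n" => "value\n");
chomp(%conf);                    # values are chomped, keys are left alone
```

      Here only two of the three array elements end in a newline, so the returned count is 2, and the hash key keeps its trailing newline.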

      Note that parentheses are necessary when you're chomping anything that is not a simple variable. This is because chomp $cwd = `pwd`; is interpreted as (chomp $cwd) = `pwd`; , rather than as chomp( $cwd = `pwd` ) which you might expect. Similarly, chomp $a, $b is interpreted as chomp($a), $b rather than as chomp($a, $b) .

     
    chop

    • chop VARIABLE

    • chop( LIST )
    • chop

      Chops off the last character of a string and returns the character chopped. It is much more efficient than s/.$//s because it neither scans nor copies the string. If VARIABLE is omitted, chops $_ . If VARIABLE is a hash, it chops the hash's values, but not its keys.

      You can actually chop anything that's an lvalue, including an assignment.

      If you chop a list, each element is chopped. Only the value of the last chop is returned.

      Note that chop returns the last character. To return all but the last character, use substr($string, 0, -1) .
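      A brief sketch of the difference between the two, using a made-up string. chop modifies its argument and returns the removed character, while substr leaves the original untouched.

```perl
use strict;
use warnings;

my $word    = "Perl!";
my $chopped = chop(my $copy = $word);    # chops the copy, returns the removed character
my $prefix  = substr($word, 0, -1);      # all but the last character; $word is unchanged
```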

      See also chomp.

     
    chown

    • chown LIST

      Changes the owner (and group) of a list of files. The first two elements of the list must be the numeric uid and gid, in that order. A value of -1 in either position is interpreted by most systems to leave that value unchanged. Returns the number of files successfully changed.

          $cnt = chown $uid, $gid, 'foo', 'bar';
          chown $uid, $gid, @filenames;

      On systems that support fchown(2), you may pass filehandles among the files. On systems that don't support fchown(2), passing filehandles raises an exception. Filehandles must be passed as globs or glob references to be recognized; barewords are considered filenames.

      Here's an example that looks up nonnumeric uids in the passwd file:

          print "User: ";
          chomp($user = <STDIN>);
          print "Files: ";
          chomp($pattern = <STDIN>);

          ($login,$pass,$uid,$gid) = getpwnam($user)
              or die "$user not in passwd file";

          @ary = glob($pattern);  # expand filenames
          chown $uid, $gid, @ary;

      On most systems, you are not allowed to change the ownership of the file unless you're the superuser, although you should be able to change the group to any of your secondary groups. On insecure systems, these restrictions may be relaxed, but this is not a portable assumption. On POSIX systems, you can detect this condition this way:

          use POSIX qw(sysconf _PC_CHOWN_RESTRICTED);
          $can_chown_giveaway = not sysconf(_PC_CHOWN_RESTRICTED);

      Portability issues: chown in perlport.

     
    chr

    • chr NUMBER

    • chr

      Returns the character represented by that NUMBER in the character set. For example, chr(65) is "A" in either ASCII or Unicode, and chr(0x263a) is a Unicode smiley face.

      Negative values give the Unicode replacement character (chr(0xfffd)), except under the bytes pragma, where the low eight bits of the value (truncated to an integer) are used.

      If NUMBER is omitted, uses $_ .

      For the reverse, use ord.
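      A minimal round trip between chr and ord, using the two code points mentioned above:

```perl
use strict;
use warnings;

my $letter = chr(65);         # "A" in ASCII and Unicode
my $code   = ord($letter);    # back to the number
my $smiley = chr(0x263a);     # a single character, code point U+263A
```

      Note that $smiley is one character long regardless of how many bytes it takes up when encoded for output.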

      Note that characters from 128 to 255 (inclusive) are by default internally not encoded as UTF-8 for backward compatibility reasons.

      See perlunicode for more about Unicode.

     
    chroot

    • chroot FILENAME

    • chroot

      This function works like the system call by the same name: it makes the named directory the new root directory for all further pathnames that begin with a / by your process and all its children. (It doesn't change your current working directory.) For security reasons, this call is restricted to the superuser. If FILENAME is omitted, does a chroot to $_ .

      Portability issues: chroot in perlport.

     
    close

    • close FILEHANDLE

    • close

      Closes the file or pipe associated with the filehandle, flushes the IO buffers, and closes the system file descriptor. Returns true if those operations succeed and if no error was reported by any PerlIO layer. Closes the currently selected filehandle if the argument is omitted.

      You don't have to close FILEHANDLE if you are immediately going to do another open on it, because open closes it for you. (See open.) However, an explicit close on an input file resets the line counter ($. ), while the implicit close done by open does not.

      If the filehandle came from a piped open, close returns false if one of the other syscalls involved fails or if its program exits with non-zero status. If the only problem was that the program exited non-zero, $! will be set to 0 . Closing a pipe also waits for the process executing on the pipe to exit--in case you wish to look at the output of the pipe afterwards--and implicitly puts the exit status value of that command into $? and ${^CHILD_ERROR_NATIVE} .

      If there are multiple threads running, close on a filehandle from a piped open returns true without waiting for the child process to terminate, if the filehandle is still open in another thread.

      Closing the read end of a pipe before the process writing to it at the other end is done writing results in the writer receiving a SIGPIPE. If the other end can't handle that, be sure to read all the data before closing the pipe.

      Example:

          open(OUTPUT, '|sort >foo')  # pipe to sort
              or die "Can't start sort: $!";
          #...                        # print stuff to output
          close OUTPUT                # wait for sort to finish
              or warn $! ? "Error closing sort pipe: $!"
                         : "Exit status $? from sort";
          open(INPUT, 'foo')          # get sort's results
              or die "Can't open 'foo' for input: $!";
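      The exit-status behavior of close on a piped open can also be observed directly. The following is a sketch that assumes a Unix-like system (it uses the list form of a piped open) and runs the current perl binary, $^X, as a hypothetical child that exits with status 3.

```perl
use strict;
use warnings;

# List-form piped open (no shell involved); $^X is the running perl binary.
open(my $child, '-|', $^X, '-e', 'print "hi\n"; exit 3')
    or die "Can't start child: $!";
my @output = <$child>;

my $ok     = close($child);    # false, because the child exited non-zero
my $errno  = $! + 0;           # 0 when the exit status was the only problem
my $status = $? >> 8;          # the child's exit code
```

      Capturing $! immediately after close matters, since any later system call may overwrite it.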

      FILEHANDLE may be an expression whose value can be used as an indirect filehandle, usually the real filehandle name or an autovivified handle.

     
    closedir

    • closedir DIRHANDLE

      Closes a directory opened by opendir and returns the success of that system call.
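      A short sketch of the usual opendir, readdir, closedir sequence. The temporary directory and the file names a.txt and b.txt are assumptions made so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);    # hypothetical directory to scan
for my $name (qw(a.txt b.txt)) {
    open(my $fh, '>', "$dir/$name") or die "Can't create $name: $!";
    close($fh) or die "Can't close $name: $!";
}

opendir(my $dh, $dir) or die "Can't opendir $dir: $!";
my @entries = sort grep { !/^\.\.?$/ } readdir($dh);    # skip . and ..
closedir($dh) or warn "closedir failed: $!";
```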

     
    cmp

    • cmp

      These operators are documented in perlop.

     
    connect

    • connect SOCKET,NAME

      Attempts to connect to a remote socket, just like connect(2). Returns true if it succeeded, false otherwise. NAME should be a packed address of the appropriate type for the socket. See the examples in Sockets: Client/Server Communication in perlipc.

     
    continue

    • continue BLOCK

    • continue

      When followed by a BLOCK, continue is actually a flow control statement rather than a function. If there is a continue BLOCK attached to a BLOCK (typically in a while or foreach ), it is always executed just before the conditional is about to be evaluated again, just like the third part of a for loop in C. Thus it can be used to increment a loop variable, even when the loop has been continued via the next statement (which is similar to the C continue statement).

      last, next, or redo may appear within a continue block; last and redo behave as if they had been executed within the main block. So will next, but since it will execute a continue block, it may be more entertaining.

          while (EXPR) {
              ### redo always comes here
              do_something;
          } continue {
              ### next always comes here
              do_something_else;
              # then back to the top to re-check EXPR
          }
          ### last always comes here

      Omitting the continue section is equivalent to using an empty one, logically enough, so next goes directly back to check the condition at the top of the loop.
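      As a concrete sketch of the skeleton above, this loop collects odd numbers. The increment lives in the continue block, so it still runs when next skips the rest of the main block.

```perl
use strict;
use warnings;

my $i = 0;
my @odd;
while ($i < 6) {
    next if $i % 2 == 0;    # skips the push, but not the continue block
    push @odd, $i;
} continue {
    $i++;                   # runs after every iteration, including after next
}
```

      Had the increment been placed at the end of the main block instead, the first next would have skipped it and the loop would never terminate.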

      When there is no BLOCK, continue is a function that falls through the current when or default block instead of iterating a dynamically enclosing foreach or exiting a lexically enclosing given. In Perl 5.14 and earlier, this form of continue was only available when the "switch" feature was enabled. See feature and Switch Statements in perlsyn for more information.

     
    cos

    • cos EXPR

    • cos

      Returns the cosine of EXPR (expressed in radians). If EXPR is omitted, takes the cosine of $_ .

      For the inverse cosine operation, you may use the Math::Trig::acos() function, or use this relation:

          sub acos { atan2( sqrt(1 - $_[0] * $_[0]), $_[0] ) }
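
      A quick check of that relation, as a sketch: taking the cosine of an angle and feeding it back through acos should recover the angle, up to floating-point error. The value of pi is derived from atan2 rather than assumed.

```perl
use strict;
use warnings;

sub acos { atan2( sqrt(1 - $_[0] * $_[0]), $_[0] ) }

my $pi    = 4 * atan2(1, 1);
my $angle = acos(cos($pi / 3));    # should recover roughly pi/3

printf "%.6f %.6f\n", $angle, $pi / 3;
```

      The argument to this acos must lie between -1 and 1; otherwise the sqrt call receives a negative number and dies.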
     
    crypt

    • crypt PLAINTEXT,SALT

      Creates a digest string exactly like the crypt(3) function in the C library (assuming that you actually have a version there that has not been extirpated as a potential munition).

      crypt() is a one-way hash function. The PLAINTEXT and SALT are turned into a short string, called a digest, which is returned. The same PLAINTEXT and SALT will always return the same string, but there is no (known) way to get the original PLAINTEXT from the hash. Small changes in the PLAINTEXT or SALT will result in large changes in the digest.

      There is no decrypt function. This function isn't all that useful for cryptography (for that, look for Crypt modules on your nearby CPAN mirror) and the name "crypt" is a bit of a misnomer. Instead it is primarily used to check if two pieces of text are the same without having to transmit or store the text itself. An example is checking if a correct password is given. The digest of the password is stored, not the password itself. The user types in a password that is crypt()'d with the same salt as the stored digest. If the two digests match, the password is correct.

      When verifying an existing digest string you should use the digest as the salt (like crypt($plain, $digest) eq $digest ). The SALT used to create the digest is visible as part of the digest. This ensures crypt() will hash the new string with the same salt as the digest. This allows your code to work with the standard crypt and with more exotic implementations. In other words, assume nothing about the returned string itself nor about how many bytes of SALT may matter.

      Traditionally the result is a string of 13 bytes: the first two bytes of the salt, followed by 11 bytes from the set [./0-9A-Za-z], and only the first eight bytes of PLAINTEXT mattered. But alternative hashing schemes (like MD5), higher level security schemes (like C2), and implementations on non-Unix platforms may produce different strings.

      When choosing a new salt create a random two character string whose characters come from the set [./0-9A-Za-z] (like join '', ('.', '/', 0..9, 'A'..'Z', 'a'..'z')[rand 64, rand 64] ). This set of characters is just a recommendation; the characters allowed in the salt depend solely on your system's crypt library, and Perl can't restrict what salts crypt() accepts.

      Here's an example that makes sure that whoever runs this program knows their password:

          $pwd = (getpwuid($<))[1];

          system "stty -echo";
          print "Password: ";
          chomp($word = <STDIN>);
          print "\n";
          system "stty echo";

          if (crypt($word, $pwd) ne $pwd) {
              die "Sorry...\n";
          } else {
              print "ok\n";
          }

      Of course, typing in your own password to whoever asks you for it is unwise.

      The crypt function is unsuitable for hashing large quantities of data, not least of all because you can't get the information back. Look at the Digest module for more robust algorithms.

      If using crypt() on a Unicode string (which potentially has characters with codepoints above 255), Perl tries to make sense of the situation by trying to downgrade (a copy of) the string back to an eight-bit byte string before calling crypt() (on that copy). If that works, good. If not, crypt() dies with Wide character in crypt .

      Portability issues: crypt in perlport.

     
    dbmclose

    • dbmclose HASH

      [This function has been largely superseded by the untie function.]

      Breaks the binding between a DBM file and a hash.

      Portability issues: dbmclose in perlport.

     
    dbmopen

    • dbmopen HASH,DBNAME,MASK

      [This function has been largely superseded by the tie function.]

      This binds a dbm(3), ndbm(3), sdbm(3), gdbm(3), or Berkeley DB file to a hash. HASH is the name of the hash. (Unlike normal open, the first argument is not a filehandle, even though it looks like one.) DBNAME is the name of the database (without the .dir or .pag extension if any). If the database does not exist, it is created with protection specified by MASK (as modified by the umask). To prevent creation of the database if it doesn't exist, you may specify a MASK of 0, and the function will return a false value if it can't find an existing database. If your system supports only the older DBM functions, you may make only one dbmopen call in your program. In older versions of Perl, if your system had neither DBM nor ndbm, calling dbmopen produced a fatal error; it now falls back to sdbm(3).

      If you don't have write access to the DBM file, you can only read hash variables, not set them. If you want to test whether you can write, either use file tests or try setting a dummy hash entry inside an eval to trap the error.

      Note that functions such as keys and values may return huge lists when used on large DBM files. You may prefer to use the each function to iterate over large DBM files. Example:

          # print out history file offsets
          dbmopen(%HIST,'/usr/lib/news/history',0666);
          while (($key,$val) = each %HIST) {
              print $key, ' = ', unpack('L',$val), "\n";
          }
          dbmclose(%HIST);

      See also AnyDBM_File for a more general description of the pros and cons of the various dbm approaches, as well as DB_File for a particularly rich implementation.

      You can control which DBM library you use by loading that library before you call dbmopen():

          use DB_File;
          dbmopen(%NS_Hist, "$ENV{HOME}/.netscape/history.db", 0666)
              or die "Can't open netscape history file: $!";

      Portability issues: dbmopen in perlport.

     
    default

     
    defined

    • defined EXPR

    • defined

      Returns a Boolean value telling whether EXPR has a value other than the undefined value undef. If EXPR is not present, $_ is checked.

      Many operations return undef to indicate failure, end of file, system error, uninitialized variable, and other exceptional conditions. This function allows you to distinguish undef from other values. (A simple Boolean test will not distinguish among undef, zero, the empty string, and "0" , which are all equally false.) Note that since undef is a valid scalar, its presence doesn't necessarily indicate an exceptional condition: pop returns undef when its argument is an empty array, or when the element to return happens to be undef.

      You may also use defined(&func) to check whether subroutine &func has ever been defined. The return value is unaffected by any forward declarations of &func . A subroutine that is not defined may still be callable: its package may have an AUTOLOAD method that makes it spring into existence the first time that it is called; see perlsub.

      Use of defined on aggregates (hashes and arrays) is deprecated. It used to report whether memory for that aggregate had ever been allocated. This behavior may disappear in future versions of Perl. You should instead use a simple test for size:

          if (@an_array) { print "has array elements\n" }
          if (%a_hash)   { print "has hash members\n"   }

      When used on a hash element, it tells you whether the value is defined, not whether the key exists in the hash. Use exists for the latter purpose.
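      The distinction can be sketched with a small, made-up options hash that holds one undefined value, one false but defined value, and lacks a third key altogether:

```perl
use strict;
use warnings;

my %opt = (verbose => undef, depth => 0);    # hypothetical options
my @report;
for my $key (qw(verbose depth missing)) {
    push @report, sprintf "%s: exists=%d defined=%d",
        $key,
        (exists($opt{$key})  ? 1 : 0),
        (defined($opt{$key}) ? 1 : 0);
}
print "$_\n" for @report;
```

      Only exists separates the key that is present with an undefined value from the key that is absent, and only defined separates undef from the false value 0.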

      Examples:

          print if defined $switch{D};
          print "$val\n" while defined($val = pop(@ary));
          die "Can't readlink $sym: $!"
              unless defined($value = readlink $sym);
          sub foo { defined &$bar ? &$bar(@_) : die "No bar"; }
          $debugging = 0 unless defined $debugging;

      Note: Many folks tend to overuse defined and are then surprised to discover that the number 0 and "" (the zero-length string) are, in fact, defined values. For example, if you say

          "ab" =~ /a(.*)b/;

      the pattern match succeeds and $1 is defined, although it matched "nothing". It didn't really fail to match anything. Rather, it matched something that happened to be zero characters long. This is all very above-board and honest. When a function returns an undefined value, it's an admission that it couldn't give you an honest answer. So you should use defined only when questioning the integrity of what you're trying to do. At other times, a simple comparison to 0 or "" is what you want.

      See also undef, exists, ref.

     
    delete

    • delete EXPR

      Given an expression that specifies an element or slice of a hash, delete deletes the specified elements from that hash so that exists() on that element no longer returns true. Setting a hash element to the undefined value does not remove its key, but deleting it does; see exists.

      In list context, returns the value or values deleted, or the last such element in scalar context. The return list's length always matches that of the argument list: deleting non-existent elements returns the undefined value in their corresponding positions.

      delete() may also be used on arrays and array slices, but its behavior is less straightforward. Although exists() will return false for deleted entries, deleting array elements never changes indices of existing values; use shift() or splice() for that. However, if all deleted elements fall at the end of an array, the array's size shrinks to the position of the highest element that still tests true for exists(), or to 0 if none do.

      WARNING: Calling delete on array values is deprecated and likely to be removed in a future version of Perl.
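      With the warning above in mind, the array behavior described earlier can be observed in a small sketch. The array @queue and the size variables are illustrative names only.

```perl
use strict;
use warnings;

my @queue = (10, 20, 30);

delete $queue[1];                 # middle element: indices do not shift
my $size_after_middle = @queue;   # the array keeps its length
my $middle_exists = exists $queue[1] ? 1 : 0;

delete $queue[2];                 # last element: the array can now shrink
my $size_after_end = @queue;      # shrinks to the highest existing element

print "$size_after_middle $middle_exists $size_after_end\n";
```

      Because indices never shift, shift() or splice() remains the better choice when the goal is to remove a slot rather than to blank it.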

      Deleting from %ENV modifies the environment. Deleting from a hash tied to a DBM file deletes the entry from the DBM file. Deleting from a tied hash or array may not necessarily return anything; it depends on the implementation of the tied package's DELETE method, which may do whatever it pleases.

      The delete local EXPR construct localizes the deletion to the current block at run time. Until the block exits, elements locally deleted temporarily no longer exist. See Localized deletion of elements of composite types in perlsub.
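      A short sketch of localized deletion, assuming Perl 5.12 or later. The package hash %config and its keys are hypothetical.

```perl
use strict;
use warnings;

our %config = (debug => 1, verbose => 1);   # hypothetical package hash

my $inside;
{
    delete local $config{debug};            # removed only for this block
    $inside = exists $config{debug} ? 1 : 0;
}
my $after = exists $config{debug} ? 1 : 0;  # restored when the block exits

print "inside: $inside, after: $after\n";
```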

          %hash = (foo => 11, bar => 22, baz => 33);
          $scalar = delete $hash{foo};          # $scalar is 11
          $scalar = delete @hash{qw(foo bar)};  # $scalar is 22
          @array  = delete @hash{qw(foo baz)};  # @array is (undef,33)

      The following (inefficiently) deletes all the values of %HASH and @ARRAY:

          foreach $key (keys %HASH) {
              delete $HASH{$key};
          }

          foreach $index (0 .. $#ARRAY) {
              delete $ARRAY[$index];
          }

      And so do these:

          delete @HASH{keys %HASH};
          delete @ARRAY[0 .. $#ARRAY];

      But both are slower than assigning the empty list or undefining %HASH or @ARRAY, which is the customary way to empty out an aggregate:

          %HASH = ();     # completely empty %HASH
          undef %HASH;    # forget %HASH ever existed
          @ARRAY = ();    # completely empty @ARRAY
          undef @ARRAY;   # forget @ARRAY ever existed

      The EXPR can be arbitrarily complicated provided its final operation is an element or slice of an aggregate:

          delete $ref->[$x][$y]{$key};
          delete @{$ref->[$x][$y]}{$key1, $key2, @morekeys};
          delete $ref->[$x][$y][$index];
          delete @{$ref->[$x][$y]}[$index1, $index2, @moreindices];
     

    die

    • die LIST

      die raises an exception. Inside an eval the error message is stuffed into $@ and the eval is terminated with the undefined value. If the exception is outside of all enclosing evals, then the uncaught exception prints LIST to STDERR and exits with a non-zero value. If you need to exit the process with a specific exit code, see exit.

      Equivalent examples:

          die "Can't cd to spool: $!\n" unless chdir '/usr/spool/news';
          chdir '/usr/spool/news' or die "Can't cd to spool: $!\n";

      If the last element of LIST does not end in a newline, the current script line number and input line number (if any) are also printed, and a newline is supplied. Note that the "input line number" (also known as "chunk") is subject to whatever notion of "line" happens to be currently in effect, and is also available in the special variable $. (see $/ in perlvar and $. in perlvar).
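      The effect of the trailing newline can be seen by trapping both variants with eval. This is a small sketch; the message text and variable names are invented for the example.

```perl
use strict;
use warnings;

eval { die "no such user\n" };     # ends in newline: message left as-is
my $with_newline = $@;

eval { die "no such user" };       # no newline: " at FILE line N.\n" is added
my $without_newline = $@;

print $with_newline, $without_newline;
```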

      Hint: sometimes appending ", stopped" to your message will cause it to make better sense when the string "at foo line 123" is appended. Suppose you are running script "canasta".

          die "/etc/games is no good";
          die "/etc/games is no good, stopped";

      produce, respectively

          /etc/games is no good at canasta line 123.
          /etc/games is no good, stopped at canasta line 123.

      If the output is empty and $@ already contains a value (typically from a previous eval), that value is reused after appending "\t...propagated". This is useful for propagating exceptions:

          eval { ... };
          die unless $@ =~ /Expected exception/;

      If the output is empty and $@ contains an object reference that has a PROPAGATE method, that method will be called with additional file and line number parameters. The return value replaces the value in $@; that is, as if $@ = eval { $@->PROPAGATE(__FILE__, __LINE__) }; were called.

      If $@ is empty then the string "Died" is used.
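      The propagation rule for string exceptions can be sketched with a nested eval. The message "inner failure" and the variable $propagated are illustrative.

```perl
use strict;
use warnings;

eval {
    eval { die "inner failure\n" };   # sets $@ inside the outer eval
    die;                              # empty LIST: re-raises $@, annotated
};
my $propagated = $@;

print $propagated;
```

      The outer handler sees the original text followed by a tab, "...propagated", and the location of the bare die.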

      If an uncaught exception results in interpreter exit, the exit code is determined from the values of $! and $? with this pseudocode:

          exit $! if $!;              # errno
          exit $? >> 8 if $? >> 8;    # child exit status
          exit 255;                   # last resort

      The intent is to squeeze as much information as possible about the likely cause into the limited space of the system exit code. However, because $! is the value of C's errno, which can be set by any system call, the exit code used by die can be unpredictable, and should not be relied upon other than to be non-zero.

      You can also call die with a reference argument, and if this is trapped within an eval, $@ contains that reference. This permits more elaborate exception handling using objects that maintain arbitrary state about the exception. Such a scheme is sometimes preferable to matching particular string values of $@ with regular expressions. Because $@ is a global variable and eval may be used within object implementations, be careful that analyzing the error object doesn't replace the reference in the global variable. It's easiest to make a local copy of the reference before any manipulations. Here's an example:

          use Scalar::Util "blessed";

          eval { ... ; die Some::Module::Exception->new( FOO => "bar" ) };
          if (my $ev_err = $@) {
              if (blessed($ev_err)
                  && $ev_err->isa("Some::Module::Exception")) {
                  # handle Some::Module::Exception
              }
              else {
                  # handle all other possible exceptions
              }
          }

      Because Perl stringifies uncaught exception messages before display, you'll probably want to overload stringification operations on exception objects. See overload for details about that.
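      One way to overload stringification on an exception object might look like the sketch below. The class My::Exception and its message field are hypothetical and are not part of any standard module.

```perl
use strict;
use warnings;

package My::Exception;   # hypothetical exception class for illustration
use overload '""' => sub { "My::Exception: " . $_[0]{message} },
             fallback => 1;

sub new {
    my ($class, %args) = @_;
    return bless { message => $args{message} }, $class;
}

package main;

my $err;
eval { die My::Exception->new(message => "disk full"); 1 }
    or $err = $@;        # copy the reference out of the global $@

my $text = "$err";       # stringification goes through the overload
print "$text\n";
```

      If such an object were left uncaught, the same overloaded string is what Perl would print to STDERR.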

      You can arrange for a callback to be run just before the die does its deed, by setting the $SIG{__DIE__} hook. The associated handler is called with the error text and can change the error message, if it sees fit, by calling die again. See %SIG in perlvar for details on setting %SIG entries, and eval BLOCK for some examples. Although this hook was intended to run only right before your program was to exit, this is not currently so: the $SIG{__DIE__} hook is currently called even inside eval()ed blocks/strings! If you want the hook to do nothing in such situations, put

          die @_ if $^S;

      as the first line of the handler (see $^S in perlvar). Because this promotes strange action at a distance, this counterintuitive behavior may be fixed in a future release.
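      A minimal sketch of a handler guarded this way follows. The @seen array, used here to record what the hook handled, is an invented name.

```perl
use strict;
use warnings;

my (@seen, $trapped);
{
    local $SIG{__DIE__} = sub {
        die @_ if $^S;          # inside an eval: re-raise untouched
        push @seen, $_[0];      # outside any eval: record the message
    };

    eval { die "trapped\n" };   # the hook runs, but does nothing extra
    $trapped = $@;
}

print "seen by hook: ", scalar(@seen), "\n";
```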

      See also exit(), warn(), and the Carp module.

     

    do

    • do BLOCK

      Not really a function. Returns the value of the last command in the sequence of commands indicated by BLOCK. When modified by the while or until loop modifier, executes the BLOCK once before testing the loop condition. (On other statements the loop modifiers test the conditional first.)

      do BLOCK does not count as a loop, so the loop control statements next, last, or redo cannot be used to leave or restart the block. See perlsyn for alternative strategies.
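      Both properties of do BLOCK, its return value and its run-once behavior under a loop modifier, can be sketched briefly. The variable names are illustrative.

```perl
use strict;
use warnings;

my $count = 0;

# The condition is false from the start, yet the block still runs once.
do {
    $count++;
} while ($count < 0);

# do BLOCK also yields the value of its last statement.
my $label = do {
    my $n = $count * 10;
    "ran $count time(s), n=$n";
};

print "$label\n";
```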

    • do SUBROUTINE(LIST)

      This form of subroutine call is deprecated. SUBROUTINE can be a bareword or scalar variable.

    • do EXPR

      Uses the value of EXPR as a filename and executes the contents of the file as a Perl script.

          do 'stat.pl';

      is largely like

          eval `cat stat.pl`;

      except that it's more concise, runs no external processes, keeps track of the current filename for error messages, searches the @INC directories, and updates %INC if the file is found. See @INC in perlvar and %INC in perlvar for these variables. It also differs in that code evaluated with do FILENAME cannot see lexicals in the enclosing scope; eval STRING does. It's the same, however, in that it does reparse the file every time you call it, so you probably don't want to do this inside a loop.

      If do can read the file but cannot compile it, it returns undef and sets an error message in $@. If do cannot read the file, it returns undef and sets $! to the error. Always check $@ first, as compilation could fail in a way that also sets $!. If the file is successfully compiled, do returns the value of the last expression evaluated.
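      The return values described above can be observed with a throwaway file. This sketch assumes File::Temp is available (it ships with Perl); the file contents and the path /nonexistent/dir/none.rc are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny file whose last expression is a hash reference.
my ($fh, $path) = tempfile(SUFFIX => '.rc', UNLINK => 1);
print {$fh} "+{ name => 'demo', level => 3 };\n";
close $fh or die "Can't close $path: $!";

my $config = do $path;                         # value of the last expression

my $missing = do '/nonexistent/dir/none.rc';   # unreadable: undef, $! is set
my $read_error = "$!";

print "level: $config->{level}\n";
```

      The leading + in the file keeps Perl from parsing the braces as a bare block rather than an anonymous hash.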

      Inclusion of library modules is better done with the use and require operators, which also do automatic error checking and raise an exception if there's a problem.

      You might like to use do to read in a program configuration file. Manual error checking can be done this way:

          # read in config files: system first, then user
          for $file ("/share/prog/defaults.rc",
                     "$ENV{HOME}/.someprogrc")
          {
              unless ($return = do $file) {
                  warn "couldn't parse $file: $@" if $@;
                  warn "couldn't do $file: $!"    unless defined $return;
                  warn "couldn't run $file"       unless $return;
              }
          }
     

    dump

    • dump LABEL

    • dump EXPR
    • dump

      This function causes an immediate core dump. See also the -u command-line switch in perlrun, which does the same thing. Primarily this is so that you can use the undump program (not supplied) to turn your core dump into an executable binary after having initialized all your variables at the beginning of the program. When the new binary is executed it will begin by executing a goto LABEL (with all the restrictions that goto suffers). Think of it as a goto with an intervening core dump and reincarnation. If LABEL is omitted, restarts the program from the top. The dump EXPR form, available starting in Perl 5.18.0, allows a name to be computed at run time, being otherwise identical to dump LABEL.

      WARNING: Any files opened at the time of the dump will not be open any more when the program is reincarnated, with possible resulting confusion by Perl.

      This function is now largely obsolete, mostly because it's very hard to convert a core file into an executable. That's why you should now invoke it as CORE::dump() if you don't want to be warned against a possible typo.

      Unlike most named operators, this has the same precedence as assignment. It is also exempt from the looks-like-a-function rule, so dump ("foo")."bar" will cause "bar" to be part of the argument to dump.

      Portability issues: dump in perlport.

     

    each

    • each HASH

    • each ARRAY

    • each EXPR

      When called on a hash in list context, returns a 2-element list consisting of the key and value for the next element of a hash. In Perl 5.12 and later only, it will also return the index and value for the next element of an array so that you can iterate over it; older Perls consider this a syntax error. When called in scalar context, returns only the key (not the value) in a hash, or the index in an array.

      Hash entries are returned in an apparently random order. The actual random order is specific to a given hash; the exact same series of operations on two hashes may result in a different order for each hash. Any insertion into the hash may change the order, as will any deletion, with the exception that the most recent key returned by each or keys may be deleted without changing the order. So long as a given hash is unmodified you may rely on keys, values and each to repeatedly return the same order as each other. See Algorithmic Complexity Attacks in perlsec for details on why hash order is randomized. Aside from the guarantees provided here the exact details of Perl's hash algorithm and the hash traversal order are subject to change in any release of Perl.

      After each has returned all entries from the hash or array, the next call to each returns the empty list in list context and undef in scalar context; the next call following that one restarts iteration. Each hash or array has its own internal iterator, accessed by each, keys, and values. The iterator is implicitly reset when each has reached the end as just described; it can be explicitly reset by calling keys or values on the hash or array. If you add or delete a hash's elements while iterating over it, entries may be skipped or duplicated--so don't do that. Exception: In the current implementation, it is always safe to delete the item most recently returned by each(), so the following code works properly:

          while (($key, $value) = each %hash) {
              print $key, "\n";
              delete $hash{$key};   # This is safe
          }
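      The explicit iterator reset mentioned above can be sketched as follows. The hash %h and the @visited array are illustrative names.

```perl
use strict;
use warnings;

my %h = (a => 1, b => 2, c => 3);

my ($first) = each %h;   # advance the shared iterator by one entry
keys %h;                 # keys in void context resets the iterator

my @visited;
while (my ($k, $v) = each %h) {
    push @visited, $k;   # sees every key again, including the first one
}

print join(",", sort @visited), "\n";
```

      Without the keys call, the loop would resume from the second entry and visit only two keys.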

      This prints out your environment like the printenv(1) program, but in a different order:

          while (($key,$value) = each %ENV) {
              print "$key=$value\n";
          }

      Starting with Perl 5.14, each can take a scalar EXPR, which must hold a reference to an unblessed hash or array. The argument will be dereferenced automatically. This aspect of each is considered highly experimental. The exact behaviour may change in a future version of Perl.

          while (($key,$value) = each $hashref) { ... }

      As of Perl 5.18 you can use a bare each in a while loop, which will set $_ on every iteration.

          while (each %ENV) {
              print "$_=$ENV{$_}\n";
          }

      To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

          use 5.012;  # so keys/values/each work on arrays
          use 5.014;  # so keys/values/each work on scalars (experimental)
          use 5.018;  # so each assigns to $_ in a lone while test

      See also keys, values, and sort.

     

    else

     

    elseif

     

    elsif

     

    endgrent

    • endgrent
     

    endhostent

    • endhostent
     

    endnetent

    • endnetent
     

    endprotoent

    • endprotoent
     

    endpwent

    • endpwent
     

    endservent

    • endservent

      These routines are the same as their counterparts in the system C library. In list context, the return values from the various get routines are as follows:

          ($name,$passwd,$uid,$gid,
           $quota,$comment,$gcos,$dir,$shell,$expire) = getpw*
          ($name,$passwd,$gid,$members)               = getgr*
          ($name,$aliases,$addrtype,$length,@addrs)   = gethost*
          ($name,$aliases,$addrtype,$net)             = getnet*
          ($name,$aliases,$proto)                     = getproto*
          ($name,$aliases,$port,$proto)               = getserv*

      (If the entry doesn't exist you get an empty list.)

      The exact meaning of the $gcos field varies, but it usually contains the real name of the user (as opposed to the login name) and other information pertaining to the user. Beware, however, that on many systems users are able to change this information, so it cannot be trusted; for that reason $gcos is tainted (see perlsec). The $passwd and $shell fields, the user's encrypted password and login shell, are also tainted for the same reason.

      In scalar context, you get the name, unless the function was a lookup by name, in which case you get the other thing, whatever it is. (If the entry doesn't exist you get the undefined value.) For example:

          $uid  = getpwnam($name);
          $name = getpwuid($num);
          $name = getpwent();
          $gid  = getgrnam($name);
          $name = getgrgid($num);
          $name = getgrent();
          #etc.

      In getpw*() the fields $quota, $comment, and $expire are special in that they are unsupported on many systems. If the $quota field is unsupported, it is an empty scalar. If it is supported, it usually encodes the disk quota. If the $comment field is unsupported, it is an empty scalar. If it is supported, it usually encodes some administrative comment about the user. On some systems the $quota field may be $change or $age, fields that have to do with password aging. On some systems the $comment field may be $class. The $expire field, if present, encodes the expiration period of the account or the password. For the availability and the exact meaning of these fields on your system, please consult getpwnam(3) and your system's pwd.h file. You can also find out from within Perl what your $quota and $comment fields mean and whether you have the $expire field by using the Config module and the values d_pwquota, d_pwage, d_pwchange, d_pwcomment, and d_pwexpire.

      Shadow password files are supported only if your vendor has implemented them in the intuitive fashion, where calling the regular C library routines gets the shadow versions if you're running under privilege, or if the shadow(3) functions exist as found in System V (this includes Solaris and Linux). Systems that implement a proprietary shadow password facility are unlikely to be supported.

      The $members value returned by getgr*() is a space-separated list of the login names of the members of the group.

      For the gethost*() functions, if the h_errno variable is supported in C, it will be returned to you via $? if the function call fails. The @addrs value returned by a successful call is a list of raw addresses returned by the corresponding library call. In the Internet domain, each address is four bytes long; you can unpack it by saying something like:

          ($a,$b,$c,$d) = unpack('W4',$addrs[0]);

      The Socket library makes this slightly easier:

          use Socket;
          $iaddr = inet_aton("127.1");   # or whatever address
          $name  = gethostbyaddr($iaddr, AF_INET);

          # or going the other way
          $straddr = inet_ntoa($iaddr);

      In the opposite way, to resolve a hostname to the IP address you can write this:

          use Socket;
          $packed_ip = gethostbyname("www.perl.org");
          if (defined $packed_ip) {
              $ip_address = inet_ntoa($packed_ip);
          }

      Make sure gethostbyname() is called in SCALAR context and that its return value is checked for definedness.

      The getprotobynumber function, even though it only takes one argument, has the precedence of a list operator, so beware:

          getprotobynumber $number eq 'icmp'    # WRONG
          getprotobynumber($number eq 'icmp')   # actually means this
          getprotobynumber($number) eq 'icmp'   # better this way

      If you get tired of remembering which element of the return list contains which return value, by-name interfaces are provided in standard modules: File::stat, Net::hostent, Net::netent, Net::protoent, Net::servent, Time::gmtime, Time::localtime, and User::grent. These override the normal built-ins, supplying versions that return objects with the appropriate names for each field. For example:

          use File::stat;
          use User::pwent;
          $is_his = (stat($filename)->uid == pwent($whoever)->uid);

      Even though it looks as though they're the same method calls (uid), they aren't, because a File::stat object is different from a User::pwent object.

      Portability issues: getpwnam in perlport to endservent in perlport.

     

    eof

    • eof FILEHANDLE

    • eof ()
    • eof

      Returns 1 if the next read on FILEHANDLE will return end of file or if FILEHANDLE is not open. FILEHANDLE may be an expression whose value gives the real filehandle. (Note that this function actually reads a character and then pushes it back with ungetc, so it isn't useful in an interactive context.) Do not read from a terminal file (or call eof(FILEHANDLE) on it) after end-of-file is reached. File types such as terminals may lose the end-of-file condition if you do.

      An eof without an argument uses the last file read. Using eof() with empty parentheses is different. It refers to the pseudo file formed from the files listed on the command line and accessed via the <> operator. Since <> isn't explicitly opened, as a normal filehandle is, an eof() before <> has been used will cause @ARGV to be examined to determine if input is available. Similarly, an eof() after <> has returned end-of-file will assume you are processing another @ARGV list, and if you haven't set @ARGV, will read input from STDIN; see I/O Operators in perlop.

      In a while (<>) loop, eof or eof(ARGV) can be used to detect the end of each file, whereas eof() will detect the end of the very last file only. Examples:

          # reset line numbering on each input file
          while (<>) {
              next if /^\s*#/;   # skip comments
              print "$.\t$_";
          } continue {
              close ARGV if eof; # Not eof()!
          }

          # insert dashes just before last line of last file
          while (<>) {
              if (eof()) {       # check for end of last file
                  print "--------------\n";
              }
              print;
              last if eof();     # needed if we're reading from a terminal
          }

      Practical hint: you almost never need to use eof in Perl, because the input operators typically return undef when they run out of data or encounter an error.
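      A read loop that relies on the input operator returning undef, rather than on eof, might look like the sketch below. It reads from an in-memory filehandle so that it is self-contained; the sample data is made up.

```perl
use strict;
use warnings;

my $data = "alpha\nbeta\ngamma\n";
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";

my @lines;
while (defined(my $line = <$fh>)) {   # loop ends when readline returns undef
    chomp $line;
    push @lines, $line;
}
close $fh;

print scalar(@lines), " lines read\n";
```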

     

    eq

    • eq

      These operators are documented in perlop.

     

    eval

    • eval EXPR

    • eval BLOCK
    • eval

      In the first form, the return value of EXPR is parsed and executed as if it were a little Perl program. The value of the expression (which is itself determined within scalar context) is first parsed and, if there were no errors, executed as a block within the lexical context of the current Perl program. This means, in particular, that any outer lexical variables are visible to it, and any package variable settings or subroutine and format definitions remain afterwards.

      Note that the value is parsed every time the eval executes. If EXPR is omitted, evaluates $_. This form is typically used to delay parsing and subsequent execution of the text of EXPR until run time.

      If the unicode_eval feature is enabled (which is the default under a use 5.16 or higher declaration), EXPR or $_ is treated as a string of characters, so use utf8 declarations have no effect, and source filters are forbidden. In the absence of the unicode_eval feature, the string will sometimes be treated as characters and sometimes as bytes, depending on the internal encoding, and source filters activated within the eval exhibit the erratic, but historical, behaviour of affecting some outer file scope that is still compiling. See also the evalbytes keyword, which always treats its input as a byte stream and works properly with source filters, and the feature pragma.

      In the second form, the code within the BLOCK is parsed only once--at the same time the code surrounding the eval itself was parsed--and executed within the context of the current Perl program. This form is typically used to trap exceptions more efficiently than the first (see below), while also providing the benefit of checking the code within BLOCK at compile time.

      The final semicolon, if any, may be omitted from the value of EXPR or within the BLOCK.

      In both forms, the value returned is the value of the last expression evaluated inside the mini-program; a return statement may be also used, just as with subroutines. The expression providing the return value is evaluated in void, scalar, or list context, depending on the context of the eval itself. See wantarray for more on how the evaluation context can be determined.
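      The return-value rules can be illustrated with a few one-liners. This is a sketch only; the variable names are invented.

```perl
use strict;
use warnings;

my @list   = eval { (1, 2, 3) };       # list context: all three values
my $scalar = eval { return 42; 99 };   # explicit return leaves the eval early
my $sum    = eval "2 + 3";             # string form: parsed at run time

print "@list / $scalar / $sum\n";
```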

      If there is a syntax error or runtime error, or a die statement is executed, eval returns undef in scalar context or an empty list in list context, and $@ is set to the error message. (Prior to 5.16, a bug caused undef to be returned in list context for syntax errors, but not for runtime errors.) If there was no error, $@ is set to the empty string. A control flow operator like last or goto can bypass the setting of $@. Beware that using eval neither silences Perl from printing warnings to STDERR, nor does it stuff the text of warning messages into $@. To do either of those, you have to use the $SIG{__WARN__} facility, or turn off warnings inside the BLOCK or EXPR using no warnings 'all'. See warn, perlvar, warnings and perllexwarn.
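      Since eval leaves warnings alone, collecting them takes a $SIG{__WARN__} handler. A minimal sketch, with @warnings as an illustrative name:

```perl
use strict;
use warnings;

my (@warnings, $result);
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };   # capture, don't print
    $result = eval { warn "careful\n"; "done" };
}
my $error = $@;   # empty string: a warning is not an exception

print "result: $result, warnings: ", scalar(@warnings), "\n";
```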

      Note that, because eval traps otherwise-fatal errors, it is useful for determining whether a particular feature (such as socket or symlink) is implemented. It is also Perl's exception-trapping mechanism, where the die operator is used to raise exceptions.
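      A feature probe along these lines might be written as follows. The variable name $symlink_exists is illustrative; the result depends on the platform, so no particular value is assumed.

```perl
use strict;
use warnings;

# symlink() raises an exception where it is unimplemented; on other
# systems this call merely fails, and the block goes on to return 1.
my $symlink_exists = eval { symlink("", ""); 1 } ? 1 : 0;

print "symlink is ", ($symlink_exists ? "" : "not "), "implemented\n";
```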

      If you want to trap errors when loading an XS module, some problems with the binary interface (such as Perl version skew) may be fatal even with eval unless $ENV{PERL_DL_NONLAZY} is set. See perlrun.

      If the code to be executed doesn't vary, you may use the eval-BLOCK form to trap run-time errors without incurring the penalty of recompiling each time. The error, if any, is still returned in $@. Examples:

          # make divide-by-zero nonfatal
          eval { $answer = $a / $b; }; warn $@ if $@;

          # same thing, but less efficient
          eval '$answer = $a / $b'; warn $@ if $@;

          # a compile-time error
          eval { $answer = }; # WRONG

          # a run-time error
          eval '$answer ='; # sets $@

      Using the eval{} form as an exception trap in libraries does have some issues. Due to the current arguably broken state of __DIE__ hooks, you may wish not to trigger any __DIE__ hooks that user code may have installed. You can use the local $SIG{__DIE__} construct for this purpose, as this example shows:

          # a private exception trap for divide-by-zero
          eval { local $SIG{'__DIE__'}; $answer = $a / $b; };
          warn $@ if $@;

      This is especially significant, given that __DIE__ hooks can call die again, which has the effect of changing their error messages:

          # __DIE__ hooks may modify error messages
          {
              local $SIG{'__DIE__'} =
                  sub { (my $x = $_[0]) =~ s/foo/bar/g; die $x };
              eval { die "foo lives here" };
              print $@ if $@;   # prints "bar lives here"
          }

      Because this promotes action at a distance, this counterintuitive behavior may be fixed in a future release.

      With an eval, you should be especially careful to remember what's being looked at when:

          eval $x;          # CASE 1
          eval "$x";        # CASE 2
          eval '$x';        # CASE 3
          eval { $x };      # CASE 4
          eval "\$$x++";    # CASE 5
          $$x++;            # CASE 6

      Cases 1 and 2 above behave identically: they run the code contained in the variable $x. (Case 2, however, has misleading double quotes that make the reader wonder what else might be happening; nothing is.) Cases 3 and 4 likewise behave in the same way: they run the code '$x', which does nothing but return the value of $x. (Case 4 is preferred for purely visual reasons, but it also has the advantage of compiling at compile-time instead of at run-time.) Case 5 is a place where normally you would like to use double quotes, except that in this particular situation, you can just use symbolic references instead, as in case 6.

      Before Perl 5.14, the assignment to $@ occurred before restoration of localized variables, which means that for your code to run on older versions, a temporary is required if you want to mask some but not all errors:

          # alter $@ on nefarious repugnancy only
          {
              my $e;
              {
                  local $@;   # protect existing $@
                  eval { test_repugnancy() };
                  # $@ =~ /nefarious/ and die $@;   # Perl 5.14 and higher only
                  $@ =~ /nefarious/ and $e = $@;
              }
              die $e if defined $e;
          }

      eval BLOCK does not count as a loop, so the loop control statements next, last, or redo cannot be used to leave or restart the block.

      An eval '' executed within a subroutine defined in the DB package doesn't see the usual surrounding lexical scope, but rather the scope of the first non-DB piece of code that called it. You don't normally need to worry about this unless you are writing a Perl debugger.

     
    evalbytes

    • evalbytes EXPR

    • evalbytes

      This function is like eval with a string argument, except it always parses its argument, or $_ if EXPR is omitted, as a string of bytes. A string containing characters whose ordinal value exceeds 255 results in an error. Source filters activated within the evaluated code apply to the code itself.

      This function is only available under the evalbytes feature, a use v5.16 (or higher) declaration, or with a CORE:: prefix. See feature for more information.
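
      As a minimal sketch, assuming Perl 5.16 or later, the CORE:: prefix lets the function be called without enabling the feature. The sample strings are made up for illustration:

```perl
use strict;
use warnings;

# CORE::evalbytes works without "use feature 'evalbytes'".
my $sum = CORE::evalbytes('1 + 2');
print "sum: $sum\n";

# A string holding a character above 255 cannot be parsed as bytes.
# The outer eval block is a precaution in case the error is raised
# in the caller rather than trapped by evalbytes itself.
my $wide   = "1 + \x{263A}";
my $result = eval { CORE::evalbytes($wide) };
print "wide-character source was rejected\n" unless defined $result;
```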

     
    exec

    • exec LIST

    • exec PROGRAM LIST

      The exec function executes a system command and never returns; use system instead of exec if you want it to return. It fails and returns false only if the command does not exist and it is executed directly instead of via your system's command shell (see below).

      Since it's a common mistake to use exec instead of system, Perl warns you if exec is called in void context and if there is a following statement that isn't die, warn, or exit (if -w is set--but you always do that, right?). If you really want to follow an exec with some other statement, you can use one of these styles to avoid the warning:

      exec ('foo')   or print STDERR "couldn't exec foo: $!";
      { exec ('foo') }; print STDERR "couldn't exec foo: $!";

      If there is more than one argument in LIST, or if LIST is an array with more than one value, exec calls execvp(3) with the arguments in LIST. If there is only one scalar argument or an array with one element in it, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system's command shell for parsing (this is /bin/sh -c on Unix platforms, but varies on other platforms). If there are no shell metacharacters in the argument, it is split into words and passed directly to execvp, which is more efficient. Examples:

      exec '/bin/echo', 'Your arguments are: ', @ARGV;
      exec "sort $outfile | uniq";

      If you don't really want to execute the first argument, but want to lie to the program you are executing about its own name, you can specify the program you actually want to run as an "indirect object" (without a comma) in front of the LIST. (This always forces interpretation of the LIST as a multivalued list, even if there is only a single scalar in the list.) Example:

      $shell = '/bin/csh';
      exec $shell '-sh';    # pretend it's a login shell

      or, more directly,

      exec {'/bin/csh'} '-sh';    # pretend it's a login shell

      When the arguments get executed via the system shell, results are subject to its quirks and capabilities. See `STRING` in perlop for details.

      Using an indirect object with exec or system is also more secure. This usage (which also works fine with system()) forces interpretation of the arguments as a multivalued list, even if the list had just one argument. That way you're safe from the shell expanding wildcards or splitting up words with whitespace in them.

      @args = ( "echo surprise" );

      exec @args;                 # subject to shell escapes
                                  # if @args == 1
      exec { $args[0] } @args;    # safe even with one-arg list

      The first version, the one without the indirect object, ran the echo program, passing it "surprise" as an argument. The second version didn't; it tried to run a program named "echo surprise", didn't find it, and set $? to a non-zero value indicating failure.

      Perl attempts to flush all files opened for output before the exec, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles to avoid lost output.

      Note that exec will not call your END blocks, nor will it invoke DESTROY methods on your objects.
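
      Because a successful exec never returns, it is commonly paired with fork. The following is a sketch under the assumption of a Unix-like system with a real fork(); it uses $^X (the running perl binary) as a stand-in for an arbitrary external program:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical command: the current perl, asked to exit with status 7.
my @cmd = ($^X, '-e', 'exit 7');

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: the indirect-object form never consults the shell.
    exec { $cmd[0] } @cmd
        or do {
            print STDERR "couldn't exec $cmd[0]: $!\n";
            POSIX::_exit(127);    # leave without running END blocks
        };
}

# Parent: wait for the replaced child process and decode its status.
waitpid($pid, 0);
my $status = $? >> 8;
print "child exited with status $status\n";
```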

      Portability issues: exec in perlport.

     
    exists

    • exists EXPR

      Given an expression that specifies an element of a hash, returns true if the specified element in the hash has ever been initialized, even if the corresponding value is undefined.

      print "Exists\n"    if exists $hash{$key};
      print "Defined\n"   if defined $hash{$key};
      print "True\n"      if $hash{$key};

      exists may also be called on array elements, but its behavior is much less obvious and is strongly tied to the use of delete on arrays. Be aware that calling exists on array values is deprecated and likely to be removed in a future version of Perl.

      print "Exists\n"    if exists $array[$index];
      print "Defined\n"   if defined $array[$index];
      print "True\n"      if $array[$index];

      A hash or array element can be true only if it's defined and defined only if it exists, but the reverse doesn't necessarily hold true.
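
      The three states can be told apart with a small sketch. The hash contents and key names here are invented for illustration:

```perl
use strict;
use warnings;

my %hash = (
    zero  => 0,        # exists, defined, false
    empty => undef,    # exists, not defined
    one   => 1,        # exists, defined, true
);

for my $key (qw(zero empty one missing)) {
    my @state;
    push @state, 'exists'  if exists $hash{$key};
    push @state, 'defined' if defined $hash{$key};
    push @state, 'true'    if $hash{$key};
    printf "%-8s %s\n", $key, @state ? join(', ', @state) : 'absent';
}
```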

      Given an expression that specifies the name of a subroutine, returns true if the specified subroutine has ever been declared, even if it is undefined. Mentioning a subroutine name for exists or defined does not count as declaring it. Note that a subroutine that does not exist may still be callable: its package may have an AUTOLOAD method that makes it spring into existence the first time that it is called; see perlsub.

      print "Exists\n"    if exists &subroutine;
      print "Defined\n"   if defined &subroutine;

      Note that the EXPR can be arbitrarily complicated as long as the final operation is a hash or array key lookup or subroutine name:

      if (exists $ref->{A}->{B}->{$key})    { }
      if (exists $hash{A}{B}{$key})         { }

      if (exists $ref->{A}->{B}->[$ix])     { }
      if (exists $hash{A}{B}[$ix])          { }

      if (exists &{$ref->{A}{B}{$key}})     { }

      Although the most deeply nested array or hash element will not spring into existence just because its existence was tested, any intervening ones will. Thus $ref->{"A"} and $ref->{"A"}->{"B"} will spring into existence due to the existence test for the $key element above. This happens anywhere the arrow operator is used, including even here:

      undef $ref;
      if (exists $ref->{"Some key"})    { }
      print $ref;    # prints HASH(0x80d3d5c)

      This surprising autovivification in what does not at first--or even second--glance appear to be an lvalue context may be fixed in a future release.
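
      One way to sidestep the intermediate autovivification is to test each level in turn. This is only a sketch; the %config and %clean hashes and their keys are hypothetical:

```perl
use strict;
use warnings;

my %config;    # empty to start

# Testing the innermost key creates $config{server} as a side effect.
my $found = exists $config{server}{port};
print "port exists: ", ($found ? "yes" : "no"), "\n";
print "server was autovivified\n" if exists $config{server};

# Short-circuiting on the outer key leaves the hash untouched.
my %clean;
my $safe = exists $clean{server} && exists $clean{server}{port};
print "clean is still empty\n" unless %clean;
```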

      Use of a subroutine call, rather than a subroutine name, as an argument to exists() is an error.

      exists &sub;      # OK
      exists &sub();    # Error
     
    exit

    • exit EXPR

    • exit

      Evaluates EXPR and exits immediately with that value. Example:

      $ans = <STDIN>;
      exit 0 if $ans =~ /^[Xx]/;

      See also die. If EXPR is omitted, exits with 0 status. The only universally recognized values for EXPR are 0 for success and 1 for error; other values are subject to interpretation depending on the environment in which the Perl program is running. For example, exiting 69 (EX_UNAVAILABLE) from a sendmail incoming-mail filter will cause the mailer to return the item undelivered, but that's not true everywhere.

      Don't use exit to abort a subroutine if there's any chance that someone might want to trap whatever error happened. Use die instead, which can be trapped by an eval.

      The exit() function does not always exit immediately. It calls any defined END routines first, but these END routines may not themselves abort the exit. Likewise any object destructors that need to be called are called before the real exit. END routines and destructors can change the exit status by modifying $? . If this is a problem, you can call POSIX::_exit($status) to avoid END and destructor processing. See perlmod for details.
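
      The effect of END blocks on the exit status can be observed from a parent process. This sketch assumes a system where $^X (the path of the running perl) can be launched with system; the one-line child programs are invented for the demonstration:

```perl
use strict;
use warnings;

# The child's END block overrides the status passed to exit.
system($^X, '-e', 'END { $? = 3 } exit 0');
my $with_end = $? >> 8;

# POSIX::_exit skips END blocks, so the status is left alone.
system($^X, '-MPOSIX', '-e', 'END { $? = 3 } POSIX::_exit(0)');
my $without_end = $? >> 8;

print "exit with END block: $with_end\n";
print "POSIX::_exit:        $without_end\n";
```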

      Portability issues: exit in perlport.

     
    exp

    • exp EXPR

    • exp

      Returns e (the natural logarithm base) to the power of EXPR. If EXPR is omitted, gives exp($_).
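
      A brief illustration, with values chosen arbitrarily:

```perl
use strict;
use warnings;

my $e = exp(1);    # Euler's number, roughly 2.71828
printf "e = %.5f\n", $e;

# log is the inverse of exp.
my $round_trip = log(exp(2));
printf "log(exp(2)) = %.1f\n", $round_trip;

# With no argument, exp operates on $_.
for (0, 1) {
    printf "exp(%d) = %.5f\n", $_, exp;
}
```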

     
    fc

    • fc EXPR

    • fc

      Returns the casefolded version of EXPR. This is the internal function implementing the \F escape in double-quoted strings.

      Casefolding is the process of mapping strings to a form where case differences are erased; comparing two strings in their casefolded form is effectively a way of asking if two strings are equal, regardless of case.

      Roughly, if you ever found yourself writing this

      lc($this) eq lc($that)       # Wrong!
          # or
      uc($this) eq uc($that)       # Also wrong!
          # or
      $this =~ /^\Q$that\E\z/i     # Right!

      Now you can write

      fc($this) eq fc($that)

      And get the correct results.
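
      The German sharp s is a common case where lc falls short, because it has no single-character uppercase form. The sketch below assumes Perl 5.16 or later; the two sample strings are illustrative:

```perl
use v5.16;       # enables the fc and unicode_strings features
use warnings;

my $this = "stra\x{DF}e";    # contains LATIN SMALL LETTER SHARP S
my $that = "STRASSE";

my $lc_equal = lc($this) eq lc($that) ? 1 : 0;
my $fc_equal = fc($this) eq fc($that) ? 1 : 0;

say "lc comparison: ", $lc_equal ? "equal" : "different";
say "fc comparison: ", $fc_equal ? "equal" : "different";
```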

      Perl only implements the full form of casefolding, but you can access the simple folds using casefold() in Unicode::UCD and prop_invmap() in Unicode::UCD. For further information on casefolding, refer to the Unicode Standard, specifically sections 3.13 Default Case Operations , 4.2 Case-Normative , and 5.18 Case Mappings , available at http://www.unicode.org/versions/latest/, as well as the Case Charts available at http://www.unicode.org/charts/case/.

      If EXPR is omitted, uses $_ .

      This function behaves the same way under various pragmas, such as in a locale, as lc does.

      While the Unicode Standard defines two additional forms of casefolding, one for Turkic languages and one that never maps one character into multiple characters, these are not provided by the Perl core. However, the CPAN module Unicode::Casing may be used to provide an implementation.

      This keyword is available only when the "fc" feature is enabled, or when prefixed with CORE::. See feature. Alternatively, include a use v5.16 or later declaration in the current scope.

     
    fcntl

    • fcntl FILEHANDLE,FUNCTION,SCALAR

      Implements the fcntl(2) function. You'll probably have to say

      use Fcntl;

      first to get the correct constant definitions. Argument processing and value returned work just like ioctl below. For example:

      use Fcntl;
      fcntl($filehandle, F_GETFL, $packed_return_buffer)
          or die "can't fcntl F_GETFL: $!";

      You don't have to check for defined on the return from fcntl. Like ioctl, it maps a 0 return from the system call into "0 but true" in Perl. This string is true in boolean context and 0 in numeric context. It is also exempt from the normal -w warnings on improper numeric conversions.
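
      The "0 but true" return value can be seen directly in a small sketch. It assumes a Unix-like system with fcntl(2), and uses a pipe only as a convenient source of a filehandle:

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFD F_SETFD);

pipe(my $reader, my $writer) or die "pipe failed: $!";

# Clear the descriptor flags; the system call returns 0 on success.
fcntl($writer, F_SETFD, 0)
    or die "can't fcntl F_SETFD: $!";

# With no flags set, F_GETFD also yields 0, delivered as "0 but true".
my $fd_flags = fcntl($writer, F_GETFD, 0)
    or die "can't fcntl F_GETFD: $!";

printf "as string: %s, as number: %d\n", $fd_flags, $fd_flags;
```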

      Note that fcntl raises an exception if used on a machine that doesn't implement fcntl(2). See the Fcntl module or your fcntl(2) manpage to learn what functions are available on your system.

      Here's an example of setting a filehandle named REMOTE to be non-blocking at the system level. You'll have to negotiate $| on your own, though.

      use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

      $flags = fcntl(REMOTE, F_GETFL, 0)
          or die "Can't get flags for the socket: $!\n";

      $flags = fcntl(REMOTE, F_SETFL, $flags | O_NONBLOCK)
          or die "Can't set flags for the socket: $!\n";

      Portability issues: fcntl in perlport.

     
    fileno

    • fileno FILEHANDLE

      Returns the file descriptor for a filehandle, or undefined if the filehandle is not open. If there is no real file descriptor at the OS level, as can happen with filehandles connected to memory objects via open with a reference for the third argument, -1 is returned.

      This is mainly useful for constructing bitmaps for select and low-level POSIX tty-handling operations. If FILEHANDLE is an expression, the value is taken as an indirect filehandle, generally its name.
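
      A short sketch of the return values described above. The in-memory string and the handle names are assumptions made for the example:

```perl
use strict;
use warnings;

# An in-memory filehandle has no descriptor at the OS level.
my $text = "held in memory\n";
open(my $mem, '<', \$text) or die "can't open in-memory handle: $!";
my $mem_fd = fileno($mem);

# The '>&=' mode creates a Perl handle that reuses an existing descriptor.
open(my $alias, '>&=', fileno(STDERR)) or die "can't alias STDERR: $!";
my $same = fileno($alias) == fileno(STDERR) ? 1 : 0;

print "in-memory handle: $mem_fd\n";
print "alias shares STDERR's descriptor\n" if $same;
```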

      You can use this to find out whether two handles refer to the same underlying descriptor:

      if (fileno(THIS) == fileno(THAT)) {
          print "THIS and THAT are dups\n";
      }
     
    flock

    • flock FILEHANDLE,OPERATION

      Calls flock(2), or an emulation of it, on FILEHANDLE. Returns true for success, false on failure. Produces a fatal error if used on a machine that doesn't implement flock(2), fcntl(2) locking, or lockf(3). flock is Perl's portable file-locking interface, although it locks entire files only, not records.

      Two potentially non-obvious but traditional flock semantics are that it waits indefinitely until the lock is granted, and that its locks are merely advisory. Such discretionary locks are more flexible, but offer fewer guarantees. This means that programs that do not also use flock may modify files locked with flock. See perlport, your port's specific documentation, and your system-specific local manpages for details. It's best to assume traditional behavior if you're writing portable programs. (But if you're not, you should as always feel perfectly free to write for your own system's idiosyncrasies (sometimes called "features"). Slavish adherence to portability concerns shouldn't get in the way of your getting your job done.)

      OPERATION is one of LOCK_SH, LOCK_EX, or LOCK_UN, possibly combined with LOCK_NB. These constants are traditionally valued 1, 2, 8 and 4, but you can use the symbolic names if you import them from the Fcntl module, either individually, or as a group using the :flock tag. LOCK_SH requests a shared lock, LOCK_EX requests an exclusive lock, and LOCK_UN releases a previously requested lock. If LOCK_NB is bitwise-or'ed with LOCK_SH or LOCK_EX, then flock returns immediately rather than blocking waiting for the lock; check the return status to see if you got it.
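
      A non-blocking lock attempt might look like the following sketch. The scratch file from File::Temp is an assumption standing in for a real shared file:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);

# LOCK_NB makes flock return at once instead of waiting for the lock.
my $got_lock = flock($fh, LOCK_EX | LOCK_NB) ? 1 : 0;

if ($got_lock) {
    print {$fh} "exclusive access\n";
    flock($fh, LOCK_UN) or die "Cannot unlock $path - $!\n";
}
else {
    print "$path is locked by another process; not waiting\n";
}
```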

      To avoid the possibility of miscoordination, Perl now flushes FILEHANDLE before locking or unlocking it.

      Note that the emulation built with lockf(3) doesn't provide shared locks, and it requires that FILEHANDLE be open with write intent. These are the semantics that lockf(3) implements. Most if not all systems implement lockf(3) in terms of fcntl(2) locking, though, so the differing semantics shouldn't bite too many people.

      Note that the fcntl(2) emulation of flock(3) requires that FILEHANDLE be open with read intent to use LOCK_SH and requires that it be open with write intent to use LOCK_EX.

      Note also that some versions of flock cannot lock things over the network; you would need to use the more system-specific fcntl for that. If you like you can force Perl to ignore your system's flock(2) function, and so provide its own fcntl(2)-based emulation, by passing the switch -Ud_flock to the Configure program when you configure and build a new Perl.

      Here's a mailbox appender for BSD systems.

      # import LOCK_* and SEEK_END constants
      use Fcntl qw(:flock SEEK_END);

      sub lock {
          my ($fh) = @_;
          flock($fh, LOCK_EX) or die "Cannot lock mailbox - $!\n";

          # and, in case someone appended while we were waiting...
          seek($fh, 0, SEEK_END) or die "Cannot seek - $!\n";
      }

      sub unlock {
          my ($fh) = @_;
          flock($fh, LOCK_UN) or die "Cannot unlock mailbox - $!\n";
      }

      open(my $mbox, ">>", "/usr/spool/mail/$ENV{'USER'}")
          or die "Can't open mailbox: $!";

      lock($mbox);
      print $mbox $msg,"\n\n";
      unlock($mbox);

      On systems that support a real flock(2), locks are inherited across fork() calls, whereas those that must resort to the more capricious fcntl(2) function lose their locks, making it seriously harder to write servers.

      See also DB_File for other flock() examples.

      Portability issues: flock in perlport.

     
    for

     
    foreach

     
    fork

    • fork

      Does a fork(2) system call to create a new process running the same program at the same point. It returns the child pid to the parent process, 0 to the child process, or undef if the fork is unsuccessful. File descriptors (and sometimes locks on those descriptors) are shared, while everything else is copied. On most systems supporting fork(), great care has gone into making it extremely efficient (for example, using copy-on-write technology on data pages), making it the dominant paradigm for multitasking over the last few decades.

      Perl attempts to flush all files opened for output before forking the child process, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles to avoid duplicate output.

      If you fork without ever waiting on your children, you will accumulate zombies. On some systems, you can avoid this by setting $SIG{CHLD} to "IGNORE" . See also perlipc for more examples of forking and reaping moribund children.
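
      The return values and the reaping step can be sketched as follows, assuming a system with a real fork(). The exit status 5 is arbitrary:

```perl
use strict;
use warnings;

my $pid = fork();
die "fork failed: $!" unless defined $pid;    # undef means the fork failed

if ($pid == 0) {
    # Child process: fork returned 0 here.
    exit 5;
}

# Parent process: fork returned the child's pid. Reap it to avoid a zombie.
my $reaped = waitpid($pid, 0);
my $status = $? >> 8;
print "child $reaped exited with status $status\n";
```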

      Note that if your forked child inherits system file descriptors like STDIN and STDOUT that are actually connected by a pipe or socket, even if you exit, then the remote server (such as, say, a CGI script or a backgrounded job launched from a remote shell) won't think you're done. You should reopen those to /dev/null if it's any issue.

      On some platforms such as Windows, where the fork() system call is not available, Perl can be built to emulate fork() in the Perl interpreter. The emulation is designed, at the level of the Perl program, to be as compatible as possible with the "Unix" fork(). However it has limitations that have to be considered in code intended to be portable. See perlfork for more details.

      Portability issues: fork in perlport.

     
    format

    • format

      Declare a picture format for use by the write function. For example:

      format Something =
          Test: @<<<<<<<< @||||| @>>>>>
                $str,     $%,    '$' . int($num)
      .

      $str = "widget";
      $num = $cost/$quantity;
      $~ = 'Something';
      write;

      See perlform for many details and examples.

     
    formline

    • formline PICTURE,LIST

      This is an internal function used by formats, though you may call it, too. It formats (see perlform) a list of values according to the contents of PICTURE, placing the output into the format output accumulator, $^A (or $ACCUMULATOR in English). Eventually, when a write is done, the contents of $^A are written to some filehandle. You could also read $^A and then set $^A back to "" . Note that a format typically does one formline per line of form, but the formline function itself doesn't care how many newlines are embedded in the PICTURE. This means that the ~ and ~~ tokens treat the entire PICTURE as a single line. You may therefore need to use multiple formlines to implement a single record format, just like the format compiler.

      Be careful if you put double quotes around the picture, because an @ character may be taken to mean the beginning of an array name. formline always returns true. See perlform for other examples.
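
      A minimal sketch of calling formline directly and collecting the result from $^A. The picture and the data values are invented for the example:

```perl
use strict;
use warnings;

# Single quotes keep the @ field markers from being read as array names.
my $picture = '@<<<<< @>>>>' . "\n";

formline($picture, 'apple', 3);
formline($picture, 'kiwi',  12);

my $report = $^A;    # read the accumulator...
$^A = "";            # ...and reset it for the next use

print $report;
```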

      If you are trying to use this instead of write to capture the output, you may find it easier to open a filehandle to a scalar (open $fh, ">", \$output ) and write to that instead.

     
    ge

    • ge

      These operators are documented in perlop.

     
    getc

    • getc FILEHANDLE

    • getc

      Returns the next character from the input file attached to FILEHANDLE, or the undefined value at end of file or if there was an error (in the latter case $! is set). If FILEHANDLE is omitted, reads from STDIN. This is not particularly efficient, and it cannot be used by itself to fetch single characters without waiting for the user to hit enter. For that, try something more like:

      if ($BSD_STYLE) {
          system "stty cbreak </dev/tty >/dev/tty 2>&1";
      }
      else {
          system "stty", '-icanon', 'eol', "\001";
      }

      $key = getc(STDIN);

      if ($BSD_STYLE) {
          system "stty -cbreak </dev/tty >/dev/tty 2>&1";
      }
      else {
          system 'stty', 'icanon', 'eol', '^@';    # ASCII NUL
      }
      print "\n";

      Determination of whether $BSD_STYLE should be set is left as an exercise to the reader.

      The POSIX::getattr function can do this more portably on systems purporting POSIX compliance. See also the Term::ReadKey module from your nearest CPAN site; details on CPAN can be found under CPAN in perlmodlib.
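
      The end-of-file behavior described at the top of this entry can be sketched with an in-memory filehandle. The sample data is an assumption for illustration:

```perl
use strict;
use warnings;

my $data = "abc";
open(my $in, '<', \$data) or die "can't open in-memory handle: $!";

# getc returns undef at end of file, so test with defined rather than
# truth: a "0" character would otherwise end the loop early.
my @chars;
while (defined(my $char = getc($in))) {
    push @chars, $char;
}
close($in);

print join(',', @chars), "\n";
```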

     
    getgrent

    • getgrent
     
    getgrgid

    • getgrgid GID
     
    getgrnam

    • getgrnam NAME
     
    gethostbyaddr

    • gethostbyaddr ADDR,ADDRTYPE
     
    gethostbyname

    • gethostbyname NAME
     
    gethostent

    • gethostent
     
    getlogin

    • getlogin

      This implements the C library function of the same name, which on most systems returns the current login from /etc/utmp, if any. If it returns the empty string, use getpwuid.

      $login = getlogin || getpwuid($<) || "Kilroy";

      Do not consider getlogin for authentication: it is not as secure as getpwuid.

      Portability issues: getlogin in perlport.

     
    getnetbyaddr

    • getnetbyaddr ADDR,ADDRTYPE
     
    getnetbyname

    • getnetbyname NAME
     
    getnetent

    • getnetent
     
    getpeername

    • getpeername SOCKET

      Returns the packed sockaddr address of the other end of the SOCKET connection.

      use Socket;
      $hersockaddr = getpeername(SOCK);
      ($port, $iaddr) = sockaddr_in($hersockaddr);
      $herhostname = gethostbyaddr($iaddr, AF_INET);
      $herstraddr = inet_ntoa($iaddr);
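
      For a version that can be run on its own, the sketch below creates a throwaway TCP listener on the loopback interface and asks the client socket who it is connected to. The listener, the use of IO::Socket::INET, and the variable names are assumptions added for the demonstration:

```perl
use strict;
use warnings;
use Socket qw(sockaddr_in inet_ntoa);
use IO::Socket::INET;

# Hypothetical server: listen on loopback and let the OS pick a free port.
my $server = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 1,
) or die "can't listen: $@";

my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $server->sockport,
    Proto    => 'tcp',
) or die "can't connect: $@";

# Unpack the peer's packed sockaddr into a port and an IP address.
my ($peer_port, $peer_iaddr) = sockaddr_in(getpeername($client));
my $peer_ip = inet_ntoa($peer_iaddr);
print "connected to $peer_ip:$peer_port\n";
```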
     
    getpgrp

    • getpgrp PID

      Returns the current process group for the specified PID. Use a PID of 0 to get the current process group for the current process. Will raise an exception if used on a machine that doesn't implement getpgrp(2). If PID is omitted, returns the process group of the current process. Note that the POSIX version of getpgrp does not accept a PID argument, so only PID==0 is truly portable.
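
      A brief sketch, assuming a system that implements getpgrp(2), showing that omitting PID and passing 0 give the same answer:

```perl
use strict;
use warnings;

my $own_group = getpgrp();     # PID omitted: the current process
my $via_zero  = getpgrp(0);    # PID 0: also the current process

print "process $$ belongs to process group $own_group\n";
print "both forms agree\n" if $own_group == $via_zero;
```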

      Portability issues: getpgrp in perlport.

     
    getppid

    • getppid

      Returns the process id of the parent process.
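
      As a sketch, assuming a system with a real fork(), a child can report its parent's id back through a pipe so the two values can be compared:

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: send the parent's process id through the pipe.
    close($reader);
    print {$writer} getppid(), "\n";
    close($writer);
    exit 0;
}

# Parent: read the child's report, then reap the child.
close($writer);
chomp(my $reported = <$reader>);
close($reader);
waitpid($pid, 0);

print "child saw parent pid $reported; parent is $$\n";
```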

      Note for Linux users: Between v5.8.1 and v5.16.0, Perl would work around non-POSIX thread semantics on the minority of Linux systems (and Debian GNU/kFreeBSD systems) that used LinuxThreads. This emulation has since been removed. See the documentation for $$ for details.

      Portability issues: getppid in perlport.

     
    getpriority

    • getpriority WHICH,WHO

      Returns the current priority for a process, a process group, or a user. (See getpriority(2).) Will raise a fatal exception if used on a machine that doesn't implement getpriority(2).
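
      A minimal sketch for the current process. It assumes that PRIO_PROCESS has the value 0, as it does on most Unix systems, and that a WHO of 0 means the calling process; check getpriority(2) on your platform:

```perl
use strict;
use warnings;

# Assumption: WHICH 0 is PRIO_PROCESS, WHO 0 is the current process.
my $priority = getpriority(0, 0);

print "current scheduling priority (nice value): $priority\n";
```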

      Portability issues: getpriority in perlport.

     
    getprotobyname

    • getprotobyname NAME
     
    getprotobynumber

    • getprotobynumber NUMBER
     
    getprotoent

    • getprotoent
     
    getpwent

    • getpwent
     
    getpwnam

    • getpwnam NAME

     
    getpwuid

    • getpwuid UID
     
    getservbyname

    • getservbyname NAME,PROTO
     
    getservbyport

    • getservbyport PORT,PROTO
     
    getservent

    • getservent
     
    getsockname

    • getsockname SOCKET

      Returns the packed sockaddr address of this end of the SOCKET connection, in case you don't know the address because you have several different IPs that the connection might have come in on.

          use Socket;
          $mysockaddr = getsockname(SOCK);
          ($port, $myaddr) = sockaddr_in($mysockaddr);
          printf "Connect to %s [%s]\n",
              scalar gethostbyaddr($myaddr, AF_INET),
              inet_ntoa($myaddr);
     
    getsockopt

    • getsockopt SOCKET,LEVEL,OPTNAME

      Queries the option named OPTNAME associated with SOCKET at a given LEVEL. Options may exist at multiple protocol levels depending on the socket type, but at least the uppermost socket level SOL_SOCKET (defined in the Socket module) will exist. To query options at another level the protocol number of the appropriate protocol controlling the option should be supplied. For example, to indicate that an option is to be interpreted by the TCP protocol, LEVEL should be set to the protocol number of TCP, which you can get using getprotobyname.

      The function returns a packed string representing the requested socket option, or undef on error, with the reason for the error placed in $!. Just what is in the packed string depends on LEVEL and OPTNAME; consult getsockopt(2) for details. A common case is that the option is an integer, in which case the result is a packed integer, which you can decode using unpack with the i (or I) format.

      Here's an example to test whether Nagle's algorithm is enabled on a socket:

          use Socket qw(:all);
          defined(my $tcp = getprotobyname("tcp"))
              or die "Could not determine the protocol number for tcp";
          # my $tcp = IPPROTO_TCP; # Alternative
          my $packed = getsockopt($socket, $tcp, TCP_NODELAY)
              or die "getsockopt TCP_NODELAY: $!";
          my $nodelay = unpack("I", $packed);
          print "Nagle's algorithm is turned ",
              $nodelay ? "off\n" : "on\n";

      Portability issues: getsockopt in perlport.

     
    given

     
    glob

    • glob EXPR

    • glob

      In list context, returns a (possibly empty) list of filename expansions on the value of EXPR such as the standard Unix shell /bin/csh would do. In scalar context, glob iterates through such filename expansions, returning undef when the list is exhausted. This is the internal function implementing the <*.c> operator, but you can use it directly. If EXPR is omitted, $_ is used. The <*.c> operator is discussed in more detail in I/O Operators in perlop.

      Note that glob splits its arguments on whitespace and treats each segment as a separate pattern. As such, glob("*.c *.h") matches all files with a .c or .h extension. The expression glob(".* *") matches all files in the current working directory. If you want to glob filenames that might contain whitespace, you'll have to use extra quotes around the spacey filename to protect it. For example, to glob filenames that have an e followed by a space followed by an f, use any of:

          @spacies = <"*e f*">;
          @spacies = glob '"*e f*"';
          @spacies = glob q("*e f*");

      If you had to get a variable through, you could do this:

          @spacies = glob "'*${var}e f*'";
          @spacies = glob qq("*${var}e f*");

      If non-empty braces are the only wildcard characters used in the glob, no filenames are matched, but potentially many strings are returned. For example, this produces nine strings, one for each pairing of fruits and colors:

          @many = glob "{apple,tomato,cherry}={green,yellow,red}";

      This operator is implemented using the standard File::Glob extension. See File::Glob for details, including bsd_glob which does not treat whitespace as a pattern separator.

      Portability issues: glob in perlport.
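
      The brace-only case described above can be tried without touching the filesystem. This is a minimal sketch; the variable name is taken from the example in the text, and the only assumption is a perl with the standard File::Glob behaviour.

```perl
use strict;
use warnings;

# Braces are the only wildcard here, so no filenames are matched;
# glob simply generates one string per fruit/color pairing.
my @many = glob "{apple,tomato,cherry}={green,yellow,red}";

print scalar(@many), " strings\n";
print "$_\n" for @many;
```

      Each element has the form fruit=color, so the result can be fed straight into split or a hash initializer.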

     
    gmtime

    • gmtime EXPR

    • gmtime

      Works just like localtime but the returned values are localized for the standard Greenwich time zone.

      Note: When called in list context, $isdst, the last value returned by gmtime, is always 0 . There is no Daylight Saving Time in GMT.

      Portability issues: gmtime in perlport.
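
      As an illustration, the sketch below breaks epoch second 0 into its GMT fields. The variable names follow the usual localtime convention, and the format string is only one possible choice.

```perl
use strict;
use warnings;

# Epoch second 0 is 1970-01-01 00:00:00 in GMT, whatever the local zone is.
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = gmtime(0);

# $year counts from 1900 and $mon from 0, exactly as with localtime.
my $stamp = sprintf "%04d-%02d-%02d %02d:%02d:%02d",
    $year + 1900, $mon + 1, $mday, $hour, $min, $sec;

print "$stamp GMT (isdst=$isdst)\n";   # 1970-01-01 00:00:00 GMT (isdst=0)
```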

     
    goto

    • goto LABEL

    • goto EXPR
    • goto &NAME

      The goto-LABEL form finds the statement labeled with LABEL and resumes execution there. It can't be used to get out of a block or subroutine given to sort. It can be used to go almost anywhere else within the dynamic scope, including out of subroutines, but it's usually better to use some other construct such as last or die. The author of Perl has never felt the need to use this form of goto (in Perl, that is; C is another matter). (The difference is that C does not offer named loops combined with loop control. Perl does, and this replaces most structured uses of goto in other languages.)

      The goto-EXPR form expects a label name, whose scope will be resolved dynamically. This allows for computed gotos per FORTRAN, but isn't necessarily recommended if you're optimizing for maintainability:

          goto ("FOO", "BAR", "GLARCH")[$i];

      As shown in this example, goto-EXPR is exempt from the "looks like a function" rule. A pair of parentheses following it does not (necessarily) delimit its argument. goto("NE")."XT" is equivalent to goto NEXT . Also, unlike most named operators, this has the same precedence as assignment.

      Use of goto-LABEL or goto-EXPR to jump into a construct is deprecated and will issue a warning. Even then, it may not be used to go into any construct that requires initialization, such as a subroutine or a foreach loop. It also can't be used to go into a construct that is optimized away.

      The goto-&NAME form is quite different from the other forms of goto. In fact, it isn't a goto in the normal sense at all, and doesn't have the stigma associated with other gotos. Instead, it exits the current subroutine (losing any changes set by local()) and immediately calls in its place the named subroutine using the current value of @_. This is used by AUTOLOAD subroutines that wish to load another subroutine and then pretend that the other subroutine had been called in the first place (except that any modifications to @_ in the current subroutine are propagated to the other subroutine.) After the goto, not even caller will be able to tell that this routine was called first.

      NAME needn't be the name of a subroutine; it can be a scalar variable containing a code reference or a block that evaluates to a code reference.
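
      A small sketch may make the goto-&NAME behaviour more concrete. All subroutine names below are hypothetical; the first pair shows that the calling frame is replaced rather than stacked, and the second shows the current @_ being handed to the target.

```perl
use strict;
use warnings;

# Count the subroutine frames currently on the call stack.
sub depth {
    my $d = 0;
    $d++ while caller($d);
    return $d;
}

sub via_call { return depth() }   # ordinary call: adds a frame
sub via_goto { goto &depth }      # replaces the current frame instead

sub greet { my ($name) = @_; return "hello, $name" }

# Whatever is in @_ at the time of the goto is what greet() receives.
sub greet_loudly {
    @_ = (uc $_[0]);
    goto &greet;
}

printf "via_call sees %d frame(s), via_goto sees %d\n", via_call(), via_goto();
print greet_loudly("world"), "\n";   # hello, WORLD
```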

     
    grep

    • grep BLOCK LIST

    • grep EXPR,LIST

      This is similar in spirit to, but not the same as, grep(1) and its relatives. In particular, it is not limited to using regular expressions.

      Evaluates the BLOCK or EXPR for each element of LIST (locally setting $_ to each element) and returns the list value consisting of those elements for which the expression evaluated to true. In scalar context, returns the number of times the expression was true.

          @foo = grep(!/^#/, @bar);    # weed out comments

      or equivalently,

          @foo = grep {!/^#/} @bar;    # weed out comments

      Note that $_ is an alias to the list value, so it can be used to modify the elements of the LIST. While this is useful and supported, it can cause bizarre results if the elements of LIST are not variables. Similarly, grep returns aliases into the original list, much as a for loop's index variable aliases the list elements. That is, modifying an element of a list returned by grep (for example, in a foreach , map or another grep) actually modifies the element in the original list. This is usually something to be avoided when writing clear code.

      If $_ is lexical in the scope where the grep appears (because it has been declared with the deprecated my $_ construct) then, in addition to being locally aliased to the list elements, $_ keeps being lexical inside the block; i.e., it can't be seen from the outside, avoiding any potential side-effects.

      See also map for a list composed of the results of the BLOCK or EXPR.
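
      The following sketch pulls the points above together: list context, scalar context, and the aliasing of returned elements. The data is made up for illustration.

```perl
use strict;
use warnings;

my @bar = ("# header", "alpha", "# note", "beta");

my @foo   = grep { !/^#/ } @bar;   # list context: the matching elements
my $count = grep { /^#/ } @bar;    # scalar context: how many matched

# grep returns aliases, so this loop changes @nums itself.
my @nums = (1, 2, 3, 4);
$_ *= 10 for grep { $_ % 2 == 0 } @nums;

print "@foo\n";     # alpha beta
print "$count\n";   # 2
print "@nums\n";    # 1 20 3 40
```

      The last statement is exactly the kind of action at a distance the text warns about; copying the result into a new array first avoids it.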

     
    gt

    • gt

      These operators are documented in perlop.

     
    hex

    • hex EXPR

    • hex

      Interprets EXPR as a hex string and returns the corresponding value. (To convert strings that might start with either 0, 0x, or 0b, see oct.) If EXPR is omitted, uses $_.

          print hex '0xAf';    # prints '175'
          print hex 'aF';      # same

      Hex strings may only represent integers. Strings that would cause integer overflow trigger a warning. Leading whitespace is not stripped, unlike oct(). To present something as hex, look into printf, sprintf, and unpack.
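
      For comparison, here is a short sketch of hex next to oct and sprintf, the two directions of conversion mentioned above. The values are arbitrary.

```perl
use strict;
use warnings;

print hex("0xAf"), "\n";          # 175
print hex("ff"), "\n";            # 255

# oct() inspects the prefix, so it handles 0x, 0b and plain octal.
print oct("0x1f"), "\n";          # 31
print oct("0b101"), "\n";         # 5
print oct("755"), "\n";           # 493

# Going the other way: format a number as hex.
my $as_hex = sprintf "%x", 255;
print "$as_hex\n";                # ff
```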

     
    if

    • if

      This flow-control keyword is documented in Compound Statements in perlsyn.

     
    import

    • import LIST

      There is no builtin import function. It is just an ordinary method (subroutine) defined (or inherited) by modules that wish to export names to another module. The use function calls the import method for the package used. See also use, perlmod, and Exporter.

     
    index

    • index STR,SUBSTR,POSITION

    • index STR,SUBSTR

      The index function searches for one string within another, but without the wildcard-like behavior of a full regular-expression pattern match. It returns the position of the first occurrence of SUBSTR in STR at or after POSITION. If POSITION is omitted, starts searching from the beginning of the string. POSITION before the beginning of the string or after its end is treated as if it were the beginning or the end, respectively. POSITION and the return value are based at zero. If the substring is not found, index returns -1.
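
      A brief sketch of the return values described above, including the common idiom of advancing POSITION to find every occurrence. The sample string is arbitrary.

```perl
use strict;
use warnings;

my $str = "hello world";

my $first  = index($str, "o");       # first "o"
my $second = index($str, "o", 5);    # search starts at position 5
my $none   = index($str, "xyz");     # not found

# Collect every position of "l" by restarting just past each hit.
my @hits;
my $pos = 0;
while ((my $found = index($str, "l", $pos)) != -1) {
    push @hits, $found;
    $pos = $found + 1;
}

print "$first $second $none\n";   # 4 7 -1
print "@hits\n";                  # 2 3 9
```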

     
    int

    • int EXPR

    • int

      Returns the integer portion of EXPR. If EXPR is omitted, uses $_. You should not use this function for rounding: first, because it truncates toward 0, and second, because machine representations of floating-point numbers can sometimes produce counterintuitive results. For example, int(-6.725/0.025) produces -268 rather than the correct -269; that's because the quotient is really more like -268.99999999999994315658. Usually, sprintf, printf, or the POSIX::floor and POSIX::ceil functions will serve you better than int().
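
      The difference between truncation, floor, ceil, and sprintf-based rounding can be seen in a few lines. This is only a sketch; it assumes a perl built with ordinary double-precision numbers.

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);

print int(7.9), "\n";      # 7
print int(-7.9), "\n";     # -7, truncated toward zero
print floor(-7.9), "\n";   # -8
print ceil(7.1), "\n";     # 8

# Rounding to the nearest integer with sprintf instead of int().
my $quotient = -6.725 / 0.025;
my $rounded  = sprintf "%.0f", $quotient;
print "$rounded\n";        # -269
```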

     
    ioctl

    • ioctl FILEHANDLE,FUNCTION,SCALAR

      Implements the ioctl(2) function. You'll probably first have to say

          require "sys/ioctl.ph";  # probably in
                                   # $Config{archlib}/sys/ioctl.ph

      to get the correct function definitions. If sys/ioctl.ph doesn't exist or doesn't have the correct definitions you'll have to roll your own, based on your C header files such as <sys/ioctl.h>. (There is a Perl script called h2ph that comes with the Perl kit that may help you in this, but it's nontrivial.) SCALAR will be read and/or written depending on the FUNCTION; a C pointer to the string value of SCALAR will be passed as the third argument of the actual ioctl call. (If SCALAR has no string value but does have a numeric value, that value will be passed rather than a pointer to the string value. To guarantee this to be true, add a 0 to the scalar before using it.) The pack and unpack functions may be needed to manipulate the values of structures used by ioctl.

      The return value of ioctl (and fcntl) is as follows:

          if OS returns:      then Perl returns:
              -1                undefined value
               0                string "0 but true"
          anything else         that number

      Thus Perl returns true on success and false on failure, yet you can still easily determine the actual value returned by the operating system:

          $retval = ioctl(...) || -1;
          printf "System returned %d\n", $retval;

      The special string "0 but true" is exempt from -w complaints about improper numeric conversions.

      Portability issues: ioctl in perlport.
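
      Because a real ioctl call depends on the device and platform, the sketch below uses a hypothetical helper, perl_style_return, that merely mimics the return-value mapping described above. It is meant to show why the "|| -1" idiom recovers the operating system's value, not how ioctl is implemented.

```perl
use strict;
use warnings;

# Hypothetical stand-in: maps a raw OS return value the way ioctl does.
sub perl_style_return {
    my ($os_value) = @_;
    return undef        if $os_value == -1;
    return "0 but true" if $os_value == 0;
    return $os_value;
}

for my $os_value (-1, 0, 42) {
    # undef is false, so failure becomes -1; "0 but true" is true,
    # yet converts to the number 0 without a warning.
    my $retval = perl_style_return($os_value) || -1;
    printf "OS returned %d, recovered %d\n", $os_value, $retval;
}
```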

     
    join

    • join EXPR,LIST

      Joins the separate strings of LIST into a single string with fields separated by the value of EXPR, and returns that new string. Example:

          $rec = join(':', $login,$passwd,$uid,$gid,$gcos,$home,$shell);

      Beware that unlike split, join doesn't take a pattern as its first argument. Compare split.
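
      A round trip through join and split illustrates the difference in their first arguments. The record contents here are invented for the example.

```perl
use strict;
use warnings;

my @fields = ("alice", "x", 1000, 1000, "Alice", "/home/alice", "/bin/sh");

my $rec  = join(":", @fields);    # first argument is a plain string
my @back = split(/:/, $rec);      # first argument is a pattern

print "$rec\n";
print scalar(@back), " fields\n";   # 7 fields
```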

     
    keys

    • keys HASH

    • keys ARRAY
    • keys EXPR

      Called in list context, returns a list consisting of all the keys of the named hash, or in Perl 5.12 or later only, the indices of an array. Perl releases prior to 5.12 will produce a syntax error if you try to use an array argument. In scalar context, returns the number of keys or indices.

      Hash entries are returned in an apparently random order. The actual random order is specific to a given hash; the exact same series of operations on two hashes may result in a different order for each hash. Any insertion into the hash may change the order, as will any deletion, with the exception that the most recent key returned by each or keys may be deleted without changing the order. So long as a given hash is unmodified you may rely on keys, values and each to repeatedly return the same order as each other. See Algorithmic Complexity Attacks in perlsec for details on why hash order is randomized. Aside from the guarantees provided here the exact details of Perl's hash algorithm and the hash traversal order are subject to change in any release of Perl.

      As a side effect, calling keys() resets the internal iterator of the HASH or ARRAY (see each). In particular, calling keys() in void context resets the iterator with no other overhead.

      Here is yet another way to print your environment:

          @keys = keys %ENV;
          @values = values %ENV;
          while (@keys) {
              print pop(@keys), '=', pop(@values), "\n";
          }

      or how about sorted by key:

          foreach $key (sort(keys %ENV)) {
              print $key, '=', $ENV{$key}, "\n";
          }

      The returned values are copies of the original keys in the hash, so modifying them will not affect the original hash. Compare values.

      To sort a hash by value, you'll need to use a sort function. Here's a descending numeric sort of a hash by its values:

          foreach $key (sort { $hash{$b} <=> $hash{$a} } keys %hash) {
              printf "%4d %s\n", $hash{$key}, $key;
          }

      Used as an lvalue, keys allows you to increase the number of hash buckets allocated for the given hash. This can gain you a measure of efficiency if you know the hash is going to get big. (This is similar to pre-extending an array by assigning a larger number to $#array.) If you say

          keys %hash = 200;

      then %hash will have at least 200 buckets allocated for it--256 of them, in fact, since it rounds up to the next power of two. These buckets will be retained even if you do %hash = (); use undef %hash if you want to free the storage while %hash is still in scope. You can't shrink the number of buckets allocated for the hash using keys in this way (but you needn't worry about doing this by accident, as trying has no effect). keys @array in an lvalue context is a syntax error.

      Starting with Perl 5.14, keys can take a scalar EXPR, which must contain a reference to an unblessed hash or array. The argument will be dereferenced automatically. This aspect of keys is considered highly experimental. The exact behaviour may change in a future version of Perl.

          for (keys $hashref) { ... }
          for (keys $obj->get_arrayref) { ... }

      To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

          use 5.012;  # so keys/values/each work on arrays
          use 5.014;  # so keys/values/each work on scalars (experimental)

      See also each, values, and sort.
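
      The sketch below combines several of the behaviours described on this page: counting in scalar context, sorting by key and by value, and resetting the each iterator. The hash contents are invented.

```perl
use strict;
use warnings;

my %score = (alice => 90, bob => 72, carol => 85);

my $n       = keys %score;                                   # scalar context: count
my @by_name = sort keys %score;
my @by_rank = sort { $score{$b} <=> $score{$a} } keys %score;

# Abandon an each() traversal part-way, then reset the iterator.
my ($first) = each %score;
keys %score;                      # void context: only resets the iterator
my $seen = 0;
$seen++ while each %score;        # starts from the beginning again

print "$n keys: @by_name\n";      # 3 keys: alice bob carol
print "ranked: @by_rank\n";       # ranked: alice carol bob
print "each saw $seen\n";         # each saw 3
```

      Without the reset, the final loop would resume after the first pair and count only two entries.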

     
    kill

    • kill SIGNAL, LIST
    • kill SIGNAL

      Sends a signal to a list of processes. Returns the number of processes successfully signaled (which is not necessarily the same as the number actually killed).

          $cnt = kill 'HUP', $child1, $child2;
          kill 'KILL', @goners;

      SIGNAL may be either a signal name (a string) or a signal number. A signal name may start with a SIG prefix, thus FOO and SIGFOO refer to the same signal. The string form of SIGNAL is recommended for portability because the same signal may have different numbers in different operating systems.

      A list of signal names supported by the current platform can be found in $Config{sig_name} , which is provided by the Config module. See Config for more details.

      A negative signal name is the same as a negative signal number, killing process groups instead of processes. For example, kill '-KILL', $pgrp and kill -9, $pgrp will send SIGKILL to the entire process group specified. That means you usually want to use positive not negative signals.

      If SIGNAL is either the number 0 or the string ZERO (or SIGZERO), no signal is sent to the process, but kill checks whether it's possible to send a signal to it (that means, to be brief, that the process is owned by the same user, or we are the super-user). This is useful to check that a child process is still alive (even if only as a zombie) and hasn't changed its UID. See perlport for notes on the portability of this construct.

      The behavior of kill when a PROCESS number is zero or negative depends on the operating system. For example, on POSIX-conforming systems, zero will signal the current process group, -1 will signal all processes, and any other negative PROCESS number will act as a negative signal number and kill the entire process group specified.

      If both the SIGNAL and the PROCESS are negative, the results are undefined. A warning may be produced in a future version.

      See Signals in perlipc for more details.

      On some platforms, such as Windows, where the fork() system call is not available, Perl can be built to emulate fork() at the interpreter level. This emulation has limitations related to kill that have to be considered, both for code running on Windows and for code intended to be portable.

      See perlfork for more details.

      If there is no LIST of processes, no signal is sent, and the return value is 0. This form is sometimes used, however, because it causes tainting checks to be run. But see Laundering and Detecting Tainted Data in perlsec.

      Portability issues: kill in perlport.
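
      As a rough illustration of the signal-0 probe and of $Config{sig_name}, the sketch below checks the current process rather than a child. It assumes a POSIX-like system, where a process can always signal itself.

```perl
use strict;
use warnings;
use Config;

# Signal 0 sends nothing; it only asks whether signalling would be allowed.
my $alive = kill 0, $$;
print "process $$ is ", ($alive ? "reachable" : "not reachable"), "\n";

# Signal names known to this build of perl, indexed by signal number.
my @names = split ' ', $Config{sig_name};
print "signal 1 is called $names[1]\n";
```

      In real code the probe would usually target a child PID saved from fork, and the string form of the signal name would be preferred over its number.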

     
    last

    • last LABEL

    • last EXPR
    • last

      The last command is like the break statement in C (as used in loops); it immediately exits the loop in question. If the LABEL is omitted, the command refers to the innermost enclosing loop. The last EXPR form, available starting in Perl 5.18.0, allows a label name to be computed at run time, and is otherwise identical to last LABEL . The continue block, if any, is not executed:

          LINE: while (<STDIN>) {
              last LINE if /^$/;  # exit when done with header
              #...
          }

      last cannot be used to exit a block that returns a value such as eval {} , sub {} , or do {} , and should not be used to exit a grep() or map() operation.

      Note that a block by itself is semantically identical to a loop that executes once. Thus last can be used to effect an early exit out of such a block.

      See also continue for an illustration of how last, next, and redo work.

      Unlike most named operators, this has the same precedence as assignment. It is also exempt from the looks-like-a-function rule, so last ("foo")."bar" will cause "bar" to be part of the argument to last.
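
      Two of the cases above, a labeled exit from nested loops and an early exit from a bare block, are sketched here. The label and data are illustrative only.

```perl
use strict;
use warnings;

my @grid = ([1, 2, 3], [4, 5, 6], [7, 8, 9]);
my @visited;

ROW: for my $row (@grid) {
    for my $cell (@$row) {
        last ROW if $cell == 5;   # leaves both loops at once
        push @visited, $cell;
    }
}

# A bare block behaves like a loop that runs once, so last exits it early.
my $status = "unset";
{
    $status = "started";
    last if @visited < 10;
    $status = "finished";         # skipped
}

print "@visited\n";   # 1 2 3 4
print "$status\n";    # started
```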

     
    lc

    • lc EXPR

    • lc

      Returns a lowercased version of EXPR. This is the internal function implementing the \L escape in double-quoted strings.

      If EXPR is omitted, uses $_ .

      What gets returned depends on several factors:

      • If use bytes is in effect:

        The results follow ASCII semantics. Only characters A-Z change, to a-z respectively.

      • Otherwise, if use locale (but not use locale ':not_characters' ) is in effect:

        Respects current LC_CTYPE locale for code points < 256; and uses Unicode semantics for the remaining code points (this last can only happen if the UTF8 flag is also set). See perllocale.

        A deficiency in this is that case changes that cross the 255/256 boundary are not well-defined. For example, the lower case of LATIN CAPITAL LETTER SHARP S (U+1E9E) in Unicode semantics is U+00DF (on ASCII platforms). But under use locale , the lower case of U+1E9E is itself, because 0xDF may not be LATIN SMALL LETTER SHARP S in the current locale, and Perl has no way of knowing if that character even exists in the locale, much less what code point it is. Perl returns the input character unchanged, for all instances (and there aren't many) where the 255/256 boundary would otherwise be crossed.

      • Otherwise, if EXPR has the UTF8 flag set:

        Unicode semantics are used for the case change.

      • Otherwise, if use feature 'unicode_strings' or use locale ':not_characters' is in effect:

        Unicode semantics are used for the case change.

      • Otherwise:

        ASCII semantics are used for the case change. The lowercase of any character outside the ASCII range is the character itself.

     
    lcfirst

    • lcfirst EXPR

    • lcfirst

      Returns the value of EXPR with the first character lowercased. This is the internal function implementing the \l escape in double-quoted strings.

      If EXPR is omitted, uses $_ .

      This function behaves the same way under various pragmata, such as in a locale, as lc does.

     
    le

    • le

      These operators are documented in perlop.

     
    length

    • length EXPR

    • length

      Returns the length in characters of the value of EXPR. If EXPR is omitted, returns the length of $_ . If EXPR is undefined, returns undef.

      This function cannot be used on an entire array or hash to find out how many elements these have. For that, use scalar @array and scalar keys %hash , respectively.

      Like all Perl character operations, length() normally deals in logical characters, not physical bytes. For how many bytes a string encoded as UTF-8 would take up, use length(Encode::encode_utf8(EXPR)) (you'll have to use Encode first). See Encode and perlunicode.
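
      The distinction between characters, bytes, and element counts is easiest to see side by side. The following is a sketch with invented data, assuming Perl 5.12 or later for the undef behaviour.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

my $word = "caf\x{e9}";                    # four characters, ending in e-acute

my $chars = length $word;                  # logical characters
my $bytes = length encode_utf8($word);     # e-acute takes two bytes in UTF-8

my @list = (10, 20, 30);
my %map  = (a => 1, b => 2);
my $elements = scalar @list;               # element count, not length()
my $pairs    = scalar keys %map;

my $nothing;
my $undef_len = length $nothing;           # undef, with no warning

print "$chars chars, $bytes bytes\n";      # 4 chars, 5 bytes
print "$elements elements, $pairs pairs\n";
```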

     
    link

    • link OLDFILE,NEWFILE

      Creates a new filename linked to the old filename. Returns true for success, false otherwise.

      Portability issues: link in perlport.
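
      A minimal sketch of link follows, assuming a filesystem that supports hard links. The file names are hypothetical, and the temporary directory is removed when the program exits.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my ($old, $new) = ("$dir/original.txt", "$dir/alias.txt");

open(my $fh, '>', $old) or die "Cannot create $old: $!";
print {$fh} "shared contents\n";
close($fh) or die "Cannot close $old: $!";

# link returns true on success, false (with $! set) otherwise.
link($old, $new) or die "Cannot link $old to $new: $!";

# Both names should now refer to the same inode.
my $old_inode = (stat $old)[1];
my $new_inode = (stat $new)[1];
print "same file\n" if $old_inode == $new_inode;
```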

     
    listen

     

    local

     

    localtime

    • localtime EXPR

    • localtime

      Converts a time as returned by the time function to a 9-element list with the time analyzed for the local time zone. Typically used as follows:

          #  0    1    2     3     4    5     6     7     8
          ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
                                                      localtime(time);

      All list elements are numeric and come straight out of the C struct tm. $sec, $min, and $hour are the seconds, minutes, and hours of the specified time.

      $mday is the day of the month and $mon the month in the range 0..11 , with 0 indicating January and 11 indicating December. This makes it easy to get a month name from a list:

          my @abbr = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
          print "$abbr[$mon] $mday";
          # $mon=9, $mday=18 gives "Oct 18"

      $year contains the number of years since 1900. To get a 4-digit year write:

          $year += 1900;

      To get the last two digits of the year (e.g., "01" in 2001) do:

          $year = sprintf("%02d", $year % 100);

      $wday is the day of the week, with 0 indicating Sunday and 3 indicating Wednesday. $yday is the day of the year, in the range 0..364 (or 0..365 in leap years).

      $isdst is true if the specified time occurs during Daylight Saving Time, false otherwise.
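      The adjustments to $mon and $year described above can be combined into one formatting routine. This is only a sketch; local_timestamp is a hypothetical helper name, not a builtin:

```perl
use strict;
use warnings;

# Format an epoch time as "YYYY-MM-DD HH:MM:SS" in the local time zone.
sub local_timestamp {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year) = localtime($epoch);
    # $mon is 0-based and $year counts from 1900, so adjust both.
    return sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                   $year + 1900, $mon + 1, $mday, $hour, $min, $sec);
}

print local_timestamp(time), "\n";
```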

      If EXPR is omitted, localtime() uses the current time (as returned by time(3)).

      In scalar context, localtime() returns the ctime(3) value:

          $now_string = localtime;  # e.g., "Thu Oct 13 04:54:34 1994"

      The format of this scalar value is not locale-dependent but built into Perl. For GMT instead of local time use the gmtime builtin. See also the Time::Local module (for converting seconds, minutes, hours, and such back to the integer value returned by time()), and the POSIX module's strftime(3) and mktime(3) functions.

      To get somewhat similar but locale-dependent date strings, set up your locale environment variables appropriately (please see perllocale) and try for example:

          use POSIX qw(strftime);
          $now_string = strftime "%a %b %e %H:%M:%S %Y", localtime;
          # or for GMT formatted appropriately for your locale:
          $now_string = strftime "%a %b %e %H:%M:%S %Y", gmtime;

      Note that %a and %b, the short forms of the day of the week and the month of the year, may not necessarily be three characters wide.

      The Time::gmtime and Time::localtime modules provide a convenient, by-name access mechanism to the gmtime() and localtime() functions, respectively.

      For a comprehensive date and time representation look at the DateTime module on CPAN.

      Portability issues: localtime in perlport.

     

    lock

    • lock THING

      This function places an advisory lock on a shared variable or referenced object contained in THING until the lock goes out of scope.

      The value returned is the scalar itself, if the argument is a scalar, or a reference, if the argument is a hash, array or subroutine.

      lock() is a "weak keyword": if you've defined a function by this name (before any calls to it), that function will be called instead. Unless use threads::shared is in effect, lock() does nothing. See threads::shared.

     

    log

    • log EXPR

    • log

      Returns the natural logarithm (base e) of EXPR. If EXPR is omitted, returns the log of $_ . To get the log of another base, use basic algebra: The base-N log of a number is equal to the natural log of that number divided by the natural log of N. For example:

          sub log10 {
              my $n = shift;
              return log($n)/log(10);
          }

      See also exp for the inverse operation.
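      The same change-of-base identity generalizes to any base. In this sketch, log_base is a hypothetical helper name:

```perl
use strict;
use warnings;

# Logarithm of $n in an arbitrary base: log_b(n) = ln(n) / ln(b).
sub log_base {
    my ($base, $n) = @_;
    return log($n) / log($base);
}

printf "%.4f\n", log_base(2, 1024);   # prints 10.0000
printf "%.4f\n", exp(log(5));         # exp undoes log: prints 5.0000
```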

     

    lstat

    • lstat FILEHANDLE

    • lstat EXPR
    • lstat DIRHANDLE
    • lstat

      Does the same thing as the stat function (including setting the special _ filehandle) but stats a symbolic link instead of the file the symbolic link points to. If symbolic links are unimplemented on your system, a normal stat is done. For much more detailed information, please see the documentation for stat.

      If EXPR is omitted, stats $_ .
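      As a sketch of the difference from stat, assuming a platform with symbolic links (the eval guards against platforms where symlink is not implemented). File names are illustrative:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $dir    = tempdir(CLEANUP => 1);
my $target = File::Spec->catfile($dir, "target.txt");
my $link   = File::Spec->catfile($dir, "alias");

open(my $fh, ">", $target) or die "cannot open > $target: $!";
close($fh) or die "cannot close $target: $!";

my ($link_is_symlink, $stat_sees_file) = (0, 0);
if (eval { symlink($target, $link) }) {
    lstat($link) or die "cannot lstat $link: $!";
    $link_is_symlink = -l _ ? 1 : 0;   # _ reuses the lstat result: the link itself
    stat($link) or die "cannot stat $link: $!";
    $stat_sees_file = -f _ ? 1 : 0;    # stat followed the link to the plain file
}
```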

      Portability issues: lstat in perlport.

     

    lt

    • lt

      These operators are documented in perlop.

     

    m

     

    map

    • map BLOCK LIST

    • map EXPR,LIST

      Evaluates the BLOCK or EXPR for each element of LIST (locally setting $_ to each element) and returns the list value composed of the results of each such evaluation. In scalar context, returns the total number of elements so generated. Evaluates BLOCK or EXPR in list context, so each element of LIST may produce zero, one, or more elements in the returned value.

          @chars = map(chr, @numbers);

      translates a list of numbers to the corresponding characters.

          my @squares = map { $_ * $_ } @numbers;

      translates a list of numbers to their squared values.

          my @squares = map { $_ > 5 ? ($_ * $_) : () } @numbers;

      shows that the number of returned elements can differ from the number of input elements. To omit an element, return an empty list (). This could also be achieved by writing

          my @squares = map { $_ * $_ } grep { $_ > 5 } @numbers;

      which makes the intention clearer.

      Map always returns a list, which can be assigned to a hash such that the elements become key/value pairs. See perldata for more details.

          %hash = map { get_a_key_for($_) => $_ } @array;

      is just a funny way to write

          %hash = ();
          foreach (@array) {
              $hash{get_a_key_for($_)} = $_;
          }

      Note that $_ is an alias to the list value, so it can be used to modify the elements of the LIST. While this is useful and supported, it can cause bizarre results if the elements of LIST are not variables. Using a regular foreach loop for this purpose would be clearer in most cases. See also grep for an array composed of those items of the original list for which the BLOCK or EXPR evaluates to true.
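      The aliasing behaviour can be seen in a small sketch. The data is made up, and the /r substitution flag used for the non-destructive variant assumes Perl 5.14 or later:

```perl
use strict;
use warnings;

my @words = ("  apple ", "banana  ");

# s///r returns a changed copy, so @words is left untouched here.
my @trimmed = map { s/^\s+|\s+$//gr } @words;
my $untouched = $words[0];

# Without /r the substitution edits $_, which aliases each element,
# so @words itself is trimmed as a side effect.
my @lengths = map { s/^\s+|\s+$//g; length } @words;
```

      After the second map, @words holds the trimmed strings, which is the kind of side effect the paragraph above warns about.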

      If $_ is lexical in the scope where the map appears (because it has been declared with the deprecated my $_ construct), then, in addition to being locally aliased to the list elements, $_ keeps being lexical inside the block; that is, it can't be seen from the outside, avoiding any potential side-effects.

      { starts both hash references and blocks, so map { ... could be either the start of map BLOCK LIST or map EXPR, LIST. Because Perl doesn't look ahead for the closing } it has to take a guess at which it's dealing with based on what it finds just after the {. Usually it gets it right, but if it doesn't it won't realize something is wrong until it gets to the } and encounters the missing (or unexpected) comma. The syntax error will be reported close to the }, but you'll need to change something near the { such as using a unary + to give Perl some help:

          %hash = map {  "\L$_" => 1  } @array # perl guesses EXPR. wrong
          %hash = map { +"\L$_" => 1  } @array # perl guesses BLOCK. right
          %hash = map { ("\L$_" => 1) } @array # this also works
          %hash = map {  lc($_) => 1  } @array # as does this.
          %hash = map +( lc($_) => 1 ), @array # this is EXPR and works!
          %hash = map  ( lc($_), 1 ),   @array # evaluates to (1, @array)

      or to force an anon hash constructor use +{:

          @hashes = map +{ lc($_) => 1 }, @array # EXPR, so needs
                                                 # comma at end

      to get a list of anonymous hashes each with only one entry apiece.

     

    mkdir

    • mkdir FILENAME,MASK

    • mkdir FILENAME
    • mkdir

      Creates the directory specified by FILENAME, with permissions specified by MASK (as modified by umask). If it succeeds it returns true; otherwise it returns false and sets $! (errno). MASK defaults to 0777 if omitted, and FILENAME defaults to $_ if omitted.

      In general, it is better to create directories with a permissive MASK and let the user modify that with their umask than it is to supply a restrictive MASK and give the user no way to be more permissive. The exceptions to this rule are when the file or directory should be kept private (mail files, for instance). The perlfunc(1) entry on umask discusses the choice of MASK in more detail.

      Note that according to POSIX 1003.1-1996, FILENAME may have any number of trailing slashes. Some operating systems and filesystems do not get this right, so Perl automatically removes all trailing slashes to keep everyone happy.

      To recursively create a directory structure, look at the mkpath function of the File::Path module.
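      A short sketch contrasting single-level mkdir with recursive creation. It uses make_path, the newer interface that File::Path provides alongside mkpath; the directory names are illustrative assumptions:

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);
use File::Spec;

my $base = tempdir(CLEANUP => 1);

# One level: succeeds because the parent directory exists.
my $single = File::Spec->catdir($base, "logs");
mkdir($single, 0755) or die "cannot mkdir $single: $!";

# Two levels in one call: mkdir returns false because "x" is missing.
my $deep_ok = mkdir(File::Spec->catdir($base, "x", "y")) ? 1 : 0;

# make_path creates every missing intermediate directory.
my $nested = File::Spec->catdir($base, "a", "b", "c");
make_path($nested);
```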

     

    msgctl

    • msgctl ID,CMD,ARG

      Calls the System V IPC function msgctl(2). You'll probably have to say

          use IPC::SysV;

      first to get the correct constant definitions. If CMD is IPC_STAT, then ARG must be a variable that will hold the returned msqid_ds structure. Returns like ioctl: the undefined value for error, "0 but true" for zero, or the actual return value otherwise. See also SysV IPC in perlipc and the documentation for IPC::SysV and IPC::Semaphore.

      Portability issues: msgctl in perlport.

     

    msgget

    • msgget KEY,FLAGS

      Calls the System V IPC function msgget(2). Returns the message queue id, or undef on error. See also SysV IPC in perlipc and the documentation for IPC::SysV and IPC::Msg .

      Portability issues: msgget in perlport.

     

    msgrcv

    • msgrcv ID,VAR,SIZE,TYPE,FLAGS

      Calls the System V IPC function msgrcv to receive a message from message queue ID into variable VAR with a maximum message size of SIZE. Note that when a message is received, the message type as a native long integer will be the first thing in VAR, followed by the actual message. This packing may be opened with unpack("l! a*"). Taints the variable. Returns true if successful, false on error. See also SysV IPC in perlipc and the documentation for IPC::SysV and IPC::Msg.

      Portability issues: msgrcv in perlport.

     

    msgsnd

    • msgsnd ID,MSG,FLAGS

      Calls the System V IPC function msgsnd to send the message MSG to the message queue ID. MSG must begin with the native long integer message type, followed by the message itself. This kind of packing can be achieved with pack("l! a*", $type, $message). Returns true if successful, false on error. See also the IPC::SysV and IPC::Msg documentation.
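      A round trip through a private queue might look like the sketch below. It assumes a platform with System V IPC; the eval and the defined check skip the exchange where queues are unavailable. The type number and message text are arbitrary:

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);

my ($type_rcvd, $text_rcvd);

# Create a private queue; $id stays undef if System V IPC is unavailable.
my $id = eval { msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR) };
if (defined $id) {
    # Native long message type first, then the message bytes.
    msgsnd($id, pack("l! a*", 1234, "hello"), 0)
        or die "msgsnd failed: $!";

    my $buf;
    msgrcv($id, $buf, 60, 0, 0) or die "msgrcv failed: $!";
    ($type_rcvd, $text_rcvd) = unpack("l! a*", $buf);

    msgctl($id, IPC_RMID, 0) or die "msgctl failed: $!";  # remove the queue
}
```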

      Portability issues: msgsnd in perlport.

     

    my

    • my EXPR

    • my TYPE EXPR
    • my EXPR : ATTRS
    • my TYPE EXPR : ATTRS

      A my declares the listed variables to be local (lexically) to the enclosing block, file, or eval. If more than one variable is listed, the list must be placed in parentheses.
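      A minimal sketch of both points, the parenthesized list and the lexical scope. The variable names are illustrative only:

```perl
use strict;
use warnings;

my ($width, $height) = (3, 4);   # parentheses declare both variables

my $label = "outer";
my $seen_inside;
{
    my $label = "inner";         # a separate variable, visible only in this block
    $seen_inside = $label;
}

print "$label $seen_inside ", $width * $height, "\n";   # prints "outer inner 12"
```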

      The exact semantics and interface of TYPE and ATTRS are still evolving. TYPE is currently bound to the use of the fields pragma, and attributes are handled using the attributes pragma, or starting from Perl 5.8.0 also via the Attribute::Handlers module. See Private Variables via my() in perlsub for details, and fields, attributes, and Attribute::Handlers.

     

    ne

    • ne

      These operators are documented in perlop.

     

    next

    • next LABEL

    • next EXPR
    • next

      The next command is like the continue statement in C; it starts the next iteration of the loop:

          LINE: while (<STDIN>) {
              next LINE if /^#/;  # discard comments
              #...
          }

      Note that if there were a continue block on the above, it would get executed even on discarded lines. If LABEL is omitted, the command refers to the innermost enclosing loop. The next EXPR form, available as of Perl 5.18.0, allows a label name to be computed at run time, being otherwise identical to next LABEL .

      next cannot be used to exit a block which returns a value such as eval {} , sub {} , or do {} , and should not be used to exit a grep() or map() operation.

      Note that a block by itself is semantically identical to a loop that executes once. Thus next will exit such a block early.
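      Both behaviours, the continue block running on skipped iterations and next leaving a bare block early, can be sketched with made-up data:

```perl
use strict;
use warnings;

my @lines = ("# comment", "alpha", "# another", "beta");
my @kept;
my $seen = 0;

LINE: foreach my $line (@lines) {
    next LINE if $line =~ /^#/;   # discard comments
    push @kept, $line;
}
continue {
    $seen++;                      # also runs for the discarded lines
}

# A bare block behaves like a loop that runs once, so next exits it.
my $reached_end = 0;
{
    next if @kept;                # @kept is non-empty: leave the block here
    $reached_end = 1;
}

print "kept=@kept seen=$seen reached_end=$reached_end\n";
```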

      See also continue for an illustration of how last, next, and redo work.

      Unlike most named operators, this has the same precedence as assignment. It is also exempt from the looks-like-a-function rule, so next ("foo")."bar" will cause "bar" to be part of the argument to next.

     

    no

    • no MODULE VERSION LIST

    • no MODULE VERSION
    • no MODULE LIST
    • no MODULE
    • no VERSION

      See the use function, of which no is the opposite.

     

    not

    • not

      These operators are documented in perlop.

     

    oct

    • oct EXPR

    • oct

      Interprets EXPR as an octal string and returns the corresponding value. (If EXPR happens to start off with 0x, interprets it as a hex string. If EXPR starts off with 0b, it is interpreted as a binary string. Leading whitespace is ignored in all three cases.) The following will handle decimal, binary, octal, and hex in standard Perl notation:

          $val = oct($val) if $val =~ /^0/;

      If EXPR is omitted, uses $_. To go the other way (produce a number in octal), use sprintf() or printf():

          $dec_perms = (stat("filename"))[2] & 07777;
          $oct_perm_str = sprintf "%o", $dec_perms;

      The oct() function is commonly used when a string such as 644 needs to be converted into a file mode, for example. Although Perl automatically converts strings into numbers as needed, this automatic conversion assumes base 10.

      Leading white space is ignored without warning, as too are any trailing non-digits, such as a decimal point (oct only handles non-negative integers, not negative integers or floating point).
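      The conversions above can be collected into a small sketch. parse_number is a hypothetical helper built on the idiom shown earlier, not a builtin:

```perl
use strict;
use warnings;

# Convert a string in Perl numeric notation (decimal, 0-prefixed octal,
# 0x hex, 0b binary) to a number.
sub parse_number {
    my ($str) = @_;
    return $str =~ /^0/ ? oct($str) : $str + 0;
}

my $mode = oct("644");              # the file mode 0644, i.e. 420 decimal
my $back = sprintf("%o", $mode);    # back to the string "644"

print join(" ", map { parse_number($_) } qw(42 0755 0x1f 0b101)), "\n";
```

      The final line prints "42 493 31 5".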

     

    open

    • open FILEHANDLE,EXPR

    • open FILEHANDLE,MODE,EXPR
    • open FILEHANDLE,MODE,EXPR,LIST
    • open FILEHANDLE,MODE,REFERENCE
    • open FILEHANDLE

      Opens the file whose filename is given by EXPR, and associates it with FILEHANDLE.

      Simple examples to open a file for reading:

          open(my $fh, "<", "input.txt")
              or die "cannot open < input.txt: $!";

      and for writing:

          open(my $fh, ">", "output.txt")
              or die "cannot open > output.txt: $!";

      (The following is a comprehensive reference to open(): for a gentler introduction you may consider perlopentut.)

      If FILEHANDLE is an undefined scalar variable (or array or hash element), a new filehandle is autovivified, meaning that the variable is assigned a reference to a newly allocated anonymous filehandle. Otherwise if FILEHANDLE is an expression, its value is the real filehandle. (This is considered a symbolic reference, so use strict "refs" should not be in effect.)

      If EXPR is omitted, the global (package) scalar variable of the same name as the FILEHANDLE contains the filename. (Note that lexical variables--those declared with my or state--will not work for this purpose; so if you're using my or state, specify EXPR in your call to open.)

      If three (or more) arguments are specified, the open mode (including optional encoding) in the second argument is distinct from the filename in the third. If MODE is < or nothing, the file is opened for input. If MODE is >, the file is opened for output, with existing files first being truncated ("clobbered") and nonexisting files newly created. If MODE is >>, the file is opened for appending, again being created if necessary.

      You can put a + in front of the > or < to indicate that you want both read and write access to the file; thus +< is almost always preferred for read/write updates--the +> mode would clobber the file first. You can't usually use either read-write mode for updating textfiles, since they have variable-length records. See the -i switch in perlrun for a better approach. The file is created with permissions of 0666 modified by the process's umask value.

      These various prefixes correspond to the fopen(3) modes of r , r+ , w , w+ , a , and a+ .
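      The three basic modes can be exercised in a short sketch. The temporary file and its contents are assumptions made for the illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $path = File::Spec->catfile(tempdir(CLEANUP => 1), "demo.txt");

# ">" creates the file, or truncates it if it already exists.
open(my $out, ">", $path) or die "cannot open > $path: $!";
print {$out} "first\n";
close($out) or die "cannot close $path: $!";

# ">>" appends to the end, keeping what is already there.
open(my $app, ">>", $path) or die "cannot open >> $path: $!";
print {$app} "second\n";
close($app) or die "cannot close $path: $!";

# "<" opens for input only.
open(my $in, "<", $path) or die "cannot open < $path: $!";
my @lines = <$in>;
close($in) or die "cannot close $path: $!";
```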

      In the one- and two-argument forms of the call, the mode and filename should be concatenated (in that order), preferably separated by white space. You can--but shouldn't--omit the mode in these forms when that mode is < . It is always safe to use the two-argument form of open if the filename argument is a known literal.

      For three or more arguments if MODE is |- , the filename is interpreted as a command to which output is to be piped, and if MODE is -| , the filename is interpreted as a command that pipes output to us. In the two-argument (and one-argument) form, one should replace dash (- ) with the command. See Using open() for IPC in perlipc for more examples of this. (You are not allowed to open to a command that pipes both in and out, but see IPC::Open2, IPC::Open3, and Bidirectional Communication with Another Process in perlipc for alternatives.)

      In the form of pipe opens taking three or more arguments, if LIST is specified (extra arguments after the command name) then LIST becomes arguments to the command invoked if the platform supports it. The meaning of open with more than three arguments for non-pipe modes is not yet defined, but experimental "layers" may give extra LIST arguments meaning.

      In the two-argument (and one-argument) form, opening <- or - opens STDIN and opening >- opens STDOUT.

      You may (and usually should) use the three-argument form of open to specify I/O layers (sometimes referred to as "disciplines") to apply to the handle that affect how the input and output are processed (see open and PerlIO for more details). For example:

          open(my $fh, "<:encoding(UTF-8)", "filename")
              || die "can't open UTF-8 encoded filename: $!";

      opens the UTF8-encoded file containing Unicode characters; see perluniintro. Note that if layers are specified in the three-argument form, then default layers stored in ${^OPEN} (see perlvar; usually set by the open pragma or the switch -CioD) are ignored. Those layers will also be ignored if you specify a colon with no name following it. In that case the default layer for the operating system (:raw on Unix, :crlf on Windows) is used.
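      As a sketch of what the encoding layer does, the following writes and rereads a string through :encoding(UTF-8). The file name and sample text are illustrative:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $path = File::Spec->catfile(tempdir(CLEANUP => 1), "utf8.txt");
my $text = "na\x{ef}ve \x{263a}";   # 7 characters, two of them non-ASCII

open(my $out, ">:encoding(UTF-8)", $path) or die "cannot open > $path: $!";
print {$out} $text;                  # characters are encoded on the way out
close($out) or die "cannot close $path: $!";

open(my $in, "<:encoding(UTF-8)", $path) or die "cannot open < $path: $!";
my $decoded = do { local $/; <$in> };   # bytes are decoded on the way in
close($in) or die "cannot close $path: $!";

my $bytes_on_disk = -s $path;        # 10 bytes hold the 7 characters
```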

      Open returns nonzero on success, the undefined value otherwise. If the open involved a pipe, the return value happens to be the pid of the subprocess.

      If you're running Perl on a system that distinguishes between text files and binary files, then you should check out binmode for tips for dealing with this. The key distinction between systems that need binmode and those that don't is their text file formats. Systems like Unix, Mac OS, and Plan 9, that end lines with a single character and encode that character in C as "\n" do not need binmode. The rest need it.

      When opening a file, it's seldom a good idea to continue if the request failed, so open is frequently used with die. Even if die won't do what you want (say, in a CGI script, where you want to format a suitable error message (but there are modules that can help with that problem)) always check the return value from opening a file.

      As a special case the three-argument form with a read/write mode and the third argument being undef:

          open(my $tmp, "+>", undef) or die ...

      opens a filehandle to an anonymous temporary file. Also using +< works for symmetry, but you really should consider writing something to the temporary file first. You will need to seek() to do the reading.
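      A minimal sketch of the write, seek, read sequence on an anonymous temporary file (the data written is arbitrary):

```perl
use strict;
use warnings;

open(my $tmp, "+>", undef) or die "cannot open temporary file: $!";
print {$tmp} "scratch data\n";

# Rewind to the start before reading back what was written.
seek($tmp, 0, 0) or die "cannot seek: $!";
my $line = <$tmp>;
close($tmp) or die "cannot close temporary file: $!";
```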

      Perl is built using PerlIO by default. Unless you've changed this (such as building Perl with Configure -Uuseperlio), you can open filehandles directly to Perl scalars via:

          open($fh, ">", \$variable) || ..

      To (re)open STDOUT or STDERR as an in-memory file, close it first:

          close STDOUT;
          open(STDOUT, ">", \$variable)
              or die "Can't open STDOUT: $!";
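      In-memory handles work for reading as well as writing. A small sketch, assuming a PerlIO-enabled perl; the variable names are illustrative:

```perl
use strict;
use warnings;

my $buffer = "";

# Writes go into $buffer instead of a file on disk.
open(my $out, ">", \$buffer) or die "cannot open in-memory file: $!";
print {$out} "line 1\nline 2\n";
close($out) or die "cannot close in-memory file: $!";

# The same scalar can then be read line by line.
open(my $in, "<", \$buffer) or die "cannot open in-memory file: $!";
my @lines = <$in>;
close($in) or die "cannot close in-memory file: $!";
```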

      General examples:

          $ARTICLE = 100;
          open(ARTICLE) or die "Can't find article $ARTICLE: $!\n";
          while (<ARTICLE>) {...

          open(LOG, ">>/usr/spool/news/twitlog");  # (log is reserved)
          # if the open fails, output is discarded

          open(my $dbase, "+<", "dbase.mine")      # open for update
              or die "Can't open 'dbase.mine' for update: $!";

          open(my $dbase, "+<dbase.mine")          # ditto
              or die "Can't open 'dbase.mine' for update: $!";

          open(ARTICLE, "-|", "caesar <$article")  # decrypt article
              or die "Can't start caesar: $!";

          open(ARTICLE, "caesar <$article |")      # ditto
              or die "Can't start caesar: $!";

          open(EXTRACT, "|sort >Tmp$$")            # $$ is our process id
              or die "Can't start sort: $!";

          # in-memory files
          open(MEMORY, ">", \$var)
              or die "Can't open memory file: $!";
          print MEMORY "foo!\n";                   # output will appear in $var

          # process argument list of files along with any includes
          foreach $file (@ARGV) {
              process($file, "fh00");
          }

          sub process {
              my($filename, $input) = @_;
              $input++;                            # this is a string increment
              unless (open($input, "<", $filename)) {
                  print STDERR "Can't open $filename: $!\n";
                  return;
              }

              local $_;
              while (<$input>) {                   # note use of indirection
                  if (/^#include "(.*)"/) {
                      process($1, $input);
                      next;
                  }
                  #...                             # whatever
              }
          }

      See perliol for detailed info on PerlIO.

      You may also, in the Bourne shell tradition, specify an EXPR beginning with >&, in which case the rest of the string is interpreted as the name of a filehandle (or file descriptor, if numeric) to be duped (as dup(2) ) and opened. You may use & after >, >> , < , +>, +>> , and +< . The mode you specify should match the mode of the original filehandle. (Duping a filehandle does not take into account any existing contents of IO buffers.) If you use the three-argument form, then you can pass either a number, the name of a filehandle, or the normal "reference to a glob".

      Here is a script that saves, redirects, and restores STDOUT and STDERR using various methods:

          #!/usr/bin/perl
          open(my $oldout, ">&STDOUT")     or die "Can't dup STDOUT: $!";
          open(OLDERR,     ">&", \*STDERR) or die "Can't dup STDERR: $!";

          open(STDOUT, '>', "foo.out") or die "Can't redirect STDOUT: $!";
          open(STDERR, ">&STDOUT")     or die "Can't dup STDOUT: $!";

          select STDERR; $| = 1;  # make unbuffered
          select STDOUT; $| = 1;  # make unbuffered

          print STDOUT "stdout 1\n";  # this works for
          print STDERR "stderr 1\n";  # subprocesses too

          open(STDOUT, ">&", $oldout) or die "Can't dup \$oldout: $!";
          open(STDERR, ">&OLDERR")    or die "Can't dup OLDERR: $!";

          print STDOUT "stdout 2\n";
          print STDERR "stderr 2\n";

      If you specify '<&=X' , where X is a file descriptor number or a filehandle, then Perl will do an equivalent of C's fdopen of that file descriptor (and not call dup(2) ); this is more parsimonious of file descriptors. For example:

          # open for input, reusing the fileno of $fd
          open(FILEHANDLE, "<&=$fd")

      or

          open(FILEHANDLE, "<&=", $fd)

      or

          # open for append, using the fileno of OLDFH
          open(FH, ">>&=", OLDFH)

      or

          open(FH, ">>&=OLDFH")

      Sharing a descriptor is useful not only because it is parsimonious but also when something depends on file descriptors, such as locking with flock(). If you do just open(A, ">>&B"), the filehandle A will not have the same file descriptor as B, and therefore flock(A) will not flock(B) nor vice versa. But with open(A, ">>&=B"), the filehandles will share the same underlying system file descriptor.
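      The difference is visible through fileno. This sketch uses lexical filehandles on a temporary file; the file name is an assumption for the illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $path = File::Spec->catfile(tempdir(CLEANUP => 1), "shared.log");
open(my $orig, ">>", $path) or die "cannot open >> $path: $!";

# ">>&" calls dup(2): a second descriptor for the same open file.
open(my $dup, ">>&", $orig) or die "cannot dup: $!";

# ">>&=" reuses the descriptor itself, in the manner of fdopen.
open(my $alias, ">>&=", $orig) or die "cannot fdopen: $!";

my $dup_differs = fileno($dup)   != fileno($orig) ? 1 : 0;
my $alias_same  = fileno($alias) == fileno($orig) ? 1 : 0;
print "dup differs: $dup_differs, alias same: $alias_same\n";
```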

      Note that under Perls older than 5.8.0, Perl uses the standard C library's fdopen() to implement the = functionality. On many Unix systems, fdopen() fails when file descriptors exceed a certain value, typically 255. For Perls 5.8.0 and later, PerlIO is (most often) the default.

      You can see whether your Perl was built with PerlIO by running perl -V and looking for the useperlio= line. If useperlio is "define", you have PerlIO; otherwise you don't.

      If you open a pipe on the command - (that is, specify either |- or -| with the one- or two-argument forms of open), an implicit fork is done, so open returns twice: in the parent process it returns the pid of the child process, and in the child process it returns (a defined) 0 . Use defined($pid) or // to determine whether the open was successful.

      For example, use either

          $child_pid = open(FROM_KID, "-|") // die "can't fork: $!";

      or

          $child_pid = open(TO_KID, "|-") // die "can't fork: $!";

      followed by

          if ($child_pid) {
              # am the parent:
              # either write TO_KID or else read FROM_KID
              ...
              waitpid $child_pid, 0;
          } else {
              # am the child; use STDIN/STDOUT normally
              ...
              exit;
          }

      The filehandle behaves normally for the parent, but I/O to that filehandle is piped from/to the STDOUT/STDIN of the child process. In the child process, the filehandle isn't opened--I/O happens from/to the new STDOUT/STDIN. Typically this is used like the normal piped open when you want to exercise more control over just how the pipe command gets executed, such as when running setuid and you don't want to have to scan shell commands for metacharacters.
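      A complete, if minimal, instance of the reading variant, assuming a platform with a real fork(). The handle name and the message are illustrative:

```perl
use strict;
use warnings;

my @from_kid;
my $child_pid = open(my $kid, "-|") // die "can't fork: $!";

if ($child_pid) {
    # Parent: whatever the child prints to its STDOUT arrives on $kid.
    @from_kid = <$kid>;
    close($kid) or warn "child exited with status $?";
} else {
    # Child: STDOUT is the write end of the pipe to the parent.
    print "hello from child $$\n";
    exit 0;
}
```

      close() on the piped handle waits for the child, so no separate waitpid is needed in this form.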

      The following blocks are more or less equivalent:

          open(FOO, "|tr '[a-z]' '[A-Z]'");
          open(FOO, "|-", "tr '[a-z]' '[A-Z]'");
          open(FOO, "|-") || exec 'tr', '[a-z]', '[A-Z]';
          open(FOO, "|-", "tr", '[a-z]', '[A-Z]');

          open(FOO, "cat -n '$file'|");
          open(FOO, "-|", "cat -n '$file'");
          open(FOO, "-|") || exec "cat", "-n", $file;
          open(FOO, "-|", "cat", "-n", $file);

      The last two examples in each block show the pipe as "list form", which is not yet supported on all platforms. A good rule of thumb is that if your platform has a real fork() (in other words, if your platform is Unix, including Linux and MacOS X), you can use the list form. You would want to use the list form of the pipe so you can pass literal arguments to the command without risk of the shell interpreting any shell metacharacters in them. However, this also bars you from opening pipes to commands that intentionally contain shell metacharacters, such as:

          open(FOO, "|cat -n | expand -4 | lpr")
              // die "Can't open pipeline to lpr: $!";

      See Safe Pipe Opens in perlipc for more examples of this.
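
      To illustrate why the list form is safer, here is a small sketch that passes an argument full of shell metacharacters to a child process. It assumes a platform where the list form is supported, and it runs the current Perl interpreter ($^X) as the child only so that the example does not depend on any external command.

      ```perl
      use strict;
      use warnings;

      # An argument that a shell would mangle: separators, pipes, globs.
      my $arg = 'a;b | c > $HOME *';

      # List form: no shell is involved, so $arg reaches the child verbatim.
      open(my $fh, "-|", $^X, "-e", 'print $ARGV[0]', $arg)
          or die "can't run $^X: $!";
      my $echoed = do { local $/; <$fh> };
      close($fh) or die "child failed: status $?";

      print "child received: $echoed\n";
      ```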

      Perl will attempt to flush all files opened for output before any operation that may do a fork, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles.

      On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptor as determined by the value of $^F . See $^F in perlvar.

      Closing any piped filehandle causes the parent process to wait for the child to finish, then returns the status value in $? and ${^CHILD_ERROR_NATIVE} .
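
      The following sketch shows one way to collect the child's exit code after close. It assumes the list form of piped open is available, and it uses the running interpreter ($^X) as a stand-in child that exits with status 3.

      ```perl
      use strict;
      use warnings;

      open(my $pipe, "-|", $^X, "-e", 'print "hi\n"; exit 3')
          or die "can't start child: $!";
      my @output = <$pipe>;

      # close() waits for the child; it returns false when the child
      # exits with a nonzero status, and the wait status lands in $?.
      my $closed_ok = close($pipe);
      my $exit_code = $? >> 8;

      printf "close returned %s, exit code %d\n",
          ($closed_ok ? "true" : "false"), $exit_code;
      ```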

      The filename passed to the one- and two-argument forms of open() will have leading and trailing whitespace deleted and normal redirection characters honored. This property, known as "magic open", can often be used to good effect. A user could specify a filename of "rsh cat file |", or you could change certain filenames as needed:

          $filename =~ s/(.*\.gz)\s*$/gzip -dc < $1|/;
          open(FH, $filename) or die "Can't open $filename: $!";

      Use the three-argument form to open a file with arbitrary weird characters in it,

          open(FOO, "<", $file)
              || die "can't open < $file: $!";

      otherwise it's necessary to protect any leading and trailing whitespace:

          $file =~ s#^(\s)#./$1#;
          open(FOO, "< $file\0")
              || die "open failed: $!";

      (this may not work on some bizarre filesystems). One should conscientiously choose between the magic and three-argument form of open():

          open(IN, $ARGV[0]) || die "can't open $ARGV[0]: $!";

      will allow the user to specify an argument of the form "rsh cat file |" , but will not work on a filename that happens to have a trailing space, while

          open(IN, "<", $ARGV[0])
              || die "can't open < $ARGV[0]: $!";

      will have exactly the opposite restrictions.
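
      The difference can be seen with a filename that ends in a space. This is only a sketch: it assumes a Unix-like filesystem that permits such names, and the file name data.txt is made up for the demonstration.

      ```perl
      use strict;
      use warnings;
      use File::Temp qw(tempdir);

      my $dir  = tempdir(CLEANUP => 1);
      my $name = "$dir/data.txt ";    # note the trailing space

      # Create the file with the three-argument form, which takes the
      # name literally.
      open(my $out, ">", $name) or die "can't create '$name': $!";
      print {$out} "hello\n";
      close($out) or die "close failed: $!";

      # Three-argument open finds the file; two-argument "magic" open
      # strips the trailing space and looks for a file that isn't there.
      my $three_ok = open(my $in3, "<", $name) ? 1 : 0;
      my $two_ok   = open(my $in2, $name)      ? 1 : 0;

      print "three-argument open: ", ($three_ok ? "ok" : "failed"), "\n";
      print "two-argument open:   ", ($two_ok   ? "ok" : "failed"), "\n";
      ```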

      If you want a "real" C open (see open(2) on your system), then you should use the sysopen function, which involves no such magic (but may use subtly different filemodes than Perl open(), which is mapped to C fopen()). This is another way to protect your filenames from interpretation. For example:

          use Fcntl;        # exports O_RDWR, O_CREAT, O_EXCL
          use IO::Handle;
          sysopen(HANDLE, $path, O_RDWR|O_CREAT|O_EXCL)
              or die "sysopen $path: $!";
          $oldfh = select(HANDLE); $| = 1; select($oldfh);
          print HANDLE "stuff $$\n";
          seek(HANDLE, 0, 0);
          print "File contains: ", <HANDLE>;

      Using the constructor from the IO::Handle package (or one of its subclasses, such as IO::File or IO::Socket ), you can generate anonymous filehandles that have the scope of the variables used to hold them, then automatically (but silently) close once their reference counts become zero, typically at scope exit:

          use IO::File;
          #...
          sub read_myfile_munged {
              my $ALL = shift;
              # or just leave it undef to autoviv
              my $handle = IO::File->new;
              open($handle, "<", "myfile") or die "myfile: $!";
              my $first = <$handle>
                  or return ();                   # Automatically closed here.
              mung($first) or die "mung failed";  # Or here.
              return ($first, <$handle>) if $ALL; # Or here.
              return $first;                      # Or here.
          }

      WARNING: The previous example has a bug because the automatic close that happens when the refcount on $handle reaches zero does not properly detect and report failures. Always close the handle yourself and inspect the return value.

          close($handle)
              || warn "close failed: $!";

      See seek for some details about mixing reading and writing.

      Portability issues: open in perlport.

     
    opendir

    • opendir DIRHANDLE,EXPR

      Opens a directory named EXPR for processing by readdir, telldir, seekdir, rewinddir, and closedir. Returns true if successful. DIRHANDLE may be an expression whose value can be used as an indirect dirhandle, usually the real dirhandle name. If DIRHANDLE is an undefined scalar variable (or array or hash element), the variable is assigned a reference to a new anonymous dirhandle; that is, it's autovivified. DIRHANDLEs have their own namespace separate from FILEHANDLEs.

      See the example at readdir.
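
      As a brief sketch of the autovivified form, the code below lists a directory through a lexical dirhandle. The temporary directory and the file names a.txt and b.txt are invented for the example.

      ```perl
      use strict;
      use warnings;
      use File::Temp qw(tempdir);

      # Build a scratch directory holding two empty files.
      my $dir = tempdir(CLEANUP => 1);
      for my $name (qw(a.txt b.txt)) {
          open(my $fh, ">", "$dir/$name") or die "can't create $name: $!";
          close($fh) or die "close failed: $!";
      }

      # $dh is undefined, so opendir stores a new anonymous dirhandle in it.
      opendir(my $dh, $dir) or die "can't opendir $dir: $!";
      my @entries = sort grep { $_ ne "." && $_ ne ".." } readdir($dh);
      closedir($dh);

      print "$_\n" for @entries;
      ```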

     
    or

    • or

      These operators are documented in perlop.

     
    ord

    • ord EXPR

    • ord

      Returns the numeric value of the first character of EXPR. If EXPR is an empty string, returns 0. If EXPR is omitted, uses $_ . (Note character, not byte.)

      For the reverse, see chr. See perlunicode for more about Unicode.
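
      A few illustrative calls, assuming an ASCII-based platform (the code points for "A" and "z" differ on EBCDIC):

      ```perl
      use strict;
      use warnings;

      my @codes = (
          ord("A"),        # 65
          ord("ABC"),      # 65: only the first character counts
          ord(""),         # 0 for the empty string
          ord("\x{3B1}"),  # 945: a character, not a byte
      );

      # With no argument, ord works on $_.
      my $default = do { local $_ = "z"; ord() };

      print "@codes $default\n";
      ```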

     
    our

    • our EXPR

    • our TYPE EXPR
    • our EXPR : ATTRS
    • our TYPE EXPR : ATTRS

      our makes a lexical alias to a package variable of the same name in the current package for use within the current lexical scope.

      our has the same scoping rules as my or state, but our only declares an alias, whereas my or state both declare a variable name and allocate storage for that name within the current scope.

      This means that when use strict 'vars' is in effect, our lets you use a package variable without qualifying it with the package name, but only within the lexical scope of the our declaration. In this way, our differs from use vars , which allows use of an unqualified name only within the affected package, but across scopes.
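
      The scoping rule can be sketched as follows. The package name Counter and the variable $count are hypothetical.

      ```perl
      use strict;
      use warnings;

      package Counter;

      {
          our $count = 5;   # alias for $Counter::count, usable only in this block
          $count++;
      }

      # Out here a bare $count would not compile under strict 'vars',
      # but the package variable still exists under its full name.
      print "count is $Counter::count\n";
      ```

      The alias disappears at the end of the block, while the package variable it referred to keeps its value.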

      If more than one value is listed, the list must be placed in parentheses.

          our $foo;
          our($bar, $baz);

      An our declaration declares an alias for a package variable that will be visible across its entire lexical scope, even across package boundaries. The package in which the variable is entered is determined at the point of the declaration, not at the point of use. This means the following behavior holds:

          package Foo;
          our $bar;      # declares $Foo::bar for rest of lexical scope
          $bar = 20;

          package Bar;
          print $bar;    # prints 20, as it refers to $Foo::bar

      Multiple our declarations with the same name in the same lexical scope are allowed if they are in different packages. If they happen to be in the same package, Perl will emit warnings if you have asked for them, just like multiple my declarations. Unlike a second my declaration, which will bind the name to a fresh variable, a second our declaration in the same package, in the same scope, is merely redundant.

          use warnings;
          package Foo;
          our $bar;      # declares $Foo::bar for rest of lexical scope
          $bar = 20;

          package Bar;
          our $bar = 30; # declares $Bar::bar for rest of lexical scope
          print $bar;    # prints 30

          our $bar;      # emits warning but has no other effect
          print $bar;    # still prints 30

      An our declaration may also have a list of attributes associated with it.

      The exact semantics and interface of TYPE and ATTRS are still evolving. TYPE is currently bound to the use of the fields pragma, and attributes are handled using the attributes pragma, or, starting from Perl 5.8.0, also via the Attribute::Handlers module. See Private Variables via my() in perlsub for details, and fields, attributes, and Attribute::Handlers.

     
    pack

    • pack TEMPLATE,LIST

      Takes a LIST of values and converts it into a string using the rules given by the TEMPLATE. The resulting string is the concatenation of the converted values. Typically, each converted value looks like its machine-level representation. For example, on 32-bit machines an integer may be represented by a sequence of 4 bytes, which will in Perl be presented as a string that's 4 characters long.

      See perlpacktut for an introduction to this function.

      The TEMPLATE is a sequence of characters that give the order and type of values, as follows:

          a  A string with arbitrary binary data, will be null padded.
          A  A text (ASCII) string, will be space padded.
          Z  A null-terminated (ASCIZ) string, will be null padded.

          b  A bit string (ascending bit order inside each byte,
             like vec()).
          B  A bit string (descending bit order inside each byte).
          h  A hex string (low nybble first).
          H  A hex string (high nybble first).

          c  A signed char (8-bit) value.
          C  An unsigned char (octet) value.
          W  An unsigned char value (can be greater than 255).

          s  A signed short (16-bit) value.
          S  An unsigned short value.

          l  A signed long (32-bit) value.
          L  An unsigned long value.

          q  A signed quad (64-bit) value.
          Q  An unsigned quad value.
               (Quads are available only if your system supports 64-bit
                integer values _and_ if Perl has been compiled to support
                those.  Raises an exception otherwise.)

          i  A signed integer value.
          I  An unsigned integer value.
               (This 'integer' is _at_least_ 32 bits wide.  Its exact
                size depends on what a local C compiler calls 'int'.)

          n  An unsigned short (16-bit) in "network" (big-endian) order.
          N  An unsigned long (32-bit) in "network" (big-endian) order.
          v  An unsigned short (16-bit) in "VAX" (little-endian) order.
          V  An unsigned long (32-bit) in "VAX" (little-endian) order.

          j  A Perl internal signed integer value (IV).
          J  A Perl internal unsigned integer value (UV).

          f  A single-precision float in native format.
          d  A double-precision float in native format.

          F  A Perl internal floating-point value (NV) in native format.
          D  A float of long-double precision in native format.
               (Long doubles are available only if your system supports
                long double values _and_ if Perl has been compiled to
                support those.  Raises an exception otherwise.)

          p  A pointer to a null-terminated string.
          P  A pointer to a structure (fixed-length string).

          u  A uuencoded string.
          U  A Unicode character number.  Encodes to a character in
             character mode and UTF-8 (or UTF-EBCDIC on EBCDIC platforms)
             in byte mode.

          w  A BER compressed integer (not an ASN.1 BER, see perlpacktut
             for details).  Its bytes represent an unsigned integer in
             base 128, most significant digit first, with as few digits
             as possible.  Bit eight (the high bit) is set on each byte
             except the last.

          x  A null byte (a.k.a. ASCII NUL, "\000", chr(0))
          X  Back up a byte.
          @  Null-fill or truncate to absolute position, counted from the
             start of the innermost ()-group.
          .  Null-fill or truncate to absolute position specified by
             the value.
          (  Start of a ()-group.

      One or more modifiers below may optionally follow certain letters in the TEMPLATE (the second column lists letters for which the modifier is valid):

          !   sSlLiI     Forces native (short, long, int) sizes instead
                         of fixed (16-/32-bit) sizes.

              xX         Make x and X act as alignment commands.

              nNvV       Treat integers as signed instead of unsigned.

              @.         Specify position as byte offset in the internal
                         representation of the packed string.  Efficient
                         but dangerous.

          >   sSiIlLqQ   Force big-endian byte-order on the type.
              jJfFdDpP   (The "big end" touches the construct.)

          <   sSiIlLqQ   Force little-endian byte-order on the type.
              jJfFdDpP   (The "little end" touches the construct.)

      The > and < modifiers can also be used on () groups to force a particular byte-order on all components in that group, including all its subgroups.

      The following rules apply:

      • Each letter may optionally be followed by a number indicating the repeat count. A numeric repeat count may optionally be enclosed in brackets, as in pack("C[80]", @arr) . The repeat count gobbles that many values from the LIST when used with all format types other than a , A , Z , b , B , h , H , @ , ., x , X , and P , where it means something else, described below. Supplying a * for the repeat count instead of a number means to use however many items are left, except for:

        • @ , x , and X , where it is equivalent to 0 .

        • . , where it means relative to the start of the string.

        • u , where it is equivalent to 1 (or 45, which here is equivalent).

        One can replace a numeric repeat count with a template letter enclosed in brackets to use the packed byte length of the bracketed template for the repeat count.

        For example, the template x[L] skips as many bytes as in a packed long, and the template "$t X[$t] $t" unpacks twice whatever $t (when variable-expanded) unpacks. If the template in brackets contains alignment commands (such as x![d] ), its packed length is calculated as if the start of the template had the maximal possible alignment.

        When used with Z , a * as the repeat count is guaranteed to add a trailing null byte, so the resulting string is always one byte longer than the byte length of the item itself.

        When used with @ , the repeat count represents an offset from the start of the innermost () group.

        When used with ., the repeat count determines the starting position to calculate the value offset as follows:

        • If the repeat count is 0 , it's relative to the current position.

        • If the repeat count is * , the offset is relative to the start of the packed string.

        • And if it's an integer n, the offset is relative to the start of the nth innermost ( ) group, or to the start of the string if n is bigger than the group level.

        The repeat count for u is interpreted as the maximal number of bytes to encode per line of output, with 0, 1 and 2 replaced by 45. The repeat count should not be more than 65.

      • The a , A , and Z types gobble just one value, but pack it as a string of length count, padding with nulls or spaces as needed. When unpacking, A strips trailing whitespace and nulls, Z strips everything after the first null, and a returns data with no stripping at all.

        If the value to pack is too long, the result is truncated. If it's too long and an explicit count is provided, Z packs only $count-1 bytes, followed by a null byte. Thus Z always packs a trailing null, except when the count is 0.
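
        The padding, truncation, and stripping rules for these three types can be checked with a short sketch like this one (the variable names are illustrative):

        ```perl
        use strict;
        use warnings;

        # Packing: pad to the count, or truncate if the value is too long.
        my $a_pad   = pack("a5", "hi");            # "hi\0\0\0"
        my $A_pad   = pack("A5", "hi");            # "hi   "
        my $Z_pad   = pack("Z5", "hi");            # "hi\0\0\0"
        my $a_trunc = pack("a5", "hello world");   # "hello"
        my $Z_trunc = pack("Z5", "hello world");   # "hell\0": 4 bytes plus a null
        my $Z_star  = pack("Z*", "hi");            # "hi\0": always a trailing null

        # Unpacking: A strips trailing spaces and nulls, Z stops at the
        # first null, a returns the field untouched.
        my $A_back = unpack("A5", "hi   ");        # "hi"
        my $Z_back = unpack("Z5", "hi\0xx");       # "hi"
        my $a_back = unpack("a5", "hi\0\0\0");     # "hi\0\0\0"

        printf "%d %d %d\n", length $A_back, length $Z_back, length $a_back;
        ```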

      • Likewise, the b and B formats pack a string that's that many bits long. Each such format generates 1 bit of the result. These are typically followed by a repeat count like B8 or B64 .

        Each result bit is based on the least-significant bit of the corresponding input character, i.e., on ord($char)%2. In particular, characters "0" and "1" generate bits 0 and 1, as do characters "\000" and "\001" .

        Starting from the beginning of the input string, each 8-tuple of characters is converted to 1 character of output. With format b , the first character of the 8-tuple determines the least-significant bit of a character; with format B , it determines the most-significant bit of a character.

        If the length of the input string is not evenly divisible by 8, the remainder is packed as if the input string were padded by null characters at the end. Similarly during unpacking, "extra" bits are ignored.

        If the input string is longer than needed, remaining characters are ignored.

        A * for the repeat count uses all characters of the input field. On unpacking, bits are converted to a string of 0 s and 1 s.
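
        As a small sketch of the two bit orders, the letter "A" (0x41 on ASCII-based platforms, binary 01000001) can be built and taken apart either way:

        ```perl
        use strict;
        use warnings;

        my $from_B = pack("B8", "01000001");   # most-significant bit first
        my $from_b = pack("b8", "10000010");   # least-significant bit first

        my $bits_B = unpack("B8", "A");        # "01000001"
        my $bits_b = unpack("b*", "A");        # "10000010"

        # A short input is padded with zero bits at the end.
        my $half = pack("B4", "1111");         # "\xF0"

        print "$bits_B $bits_b\n";
        ```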

      • The h and H formats pack a string that many nybbles (4-bit groups, representable as hexadecimal digits, "0".."9" "a".."f" ) long.

        For each such format, pack() generates 4 bits of result. With non-alphabetical characters, the result is based on the 4 least-significant bits of the input character, i.e., on ord($char)%16. In particular, characters "0" and "1" generate nybbles 0 and 1, as do bytes "\000" and "\001" . For characters "a".."f" and "A".."F" , the result is compatible with the usual hexadecimal digits, so that "a" and "A" both generate the nybble 0xA==10 . Use only these specific hex characters with this format.

        Starting from the beginning of the input string, each pair of characters is converted to 1 character of output. With format h , the first character of the pair determines the least-significant nybble of the output character; with format H , it determines the most-significant nybble.

        If the length of the input string is not even, it behaves as if padded by a null character at the end. Similarly, "extra" nybbles are ignored during unpacking.

        If the input string is longer than needed, extra characters are ignored.

        A * for the repeat count uses all characters of the input field. For unpack(), nybbles are converted to a string of hexadecimal digits.
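
        A brief sketch of the two nybble orders, using the bytes 0x41 and 0x42 ("AB" on ASCII-based platforms):

        ```perl
        use strict;
        use warnings;

        my $from_H = pack("H4", "4142");   # high nybble first: "\x41\x42"
        my $from_h = pack("h4", "1424");   # low nybble first:  "\x41\x42"

        my $hex_H = unpack("H*", "AB");    # "4142"
        my $hex_h = unpack("h*", "AB");    # "1424"

        # An odd count is padded with a zero nybble at the end.
        my $odd = pack("H3", "414");       # "\x41\x40"

        print "$hex_H $hex_h\n";
        ```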

      • The p format packs a pointer to a null-terminated string. You are responsible for ensuring that the string is not a temporary value, as that could potentially get deallocated before you got around to using the packed result. The P format packs a pointer to a structure of the size indicated by the length. A null pointer is created if the corresponding value for p or P is undef; similarly with unpack(), where a null pointer unpacks into undef.

        If your system has a strange pointer size--meaning a pointer is neither as big as an int nor as big as a long--it may not be possible to pack or unpack pointers in big- or little-endian byte order. Attempting to do so raises an exception.

      • The / template character allows packing and unpacking of a sequence of items where the packed structure contains a packed item count followed by the packed items themselves. This is useful when the structure you're unpacking has encoded the sizes or repeat counts for some of its fields within the structure itself as separate fields.

        For pack, you write length-item/sequence-item, and the length-item describes how the length value is packed. Formats likely to be of most use are integer-packing ones like n for Java strings, w for ASN.1 or SNMP, and N for Sun XDR.

        For pack, sequence-item may have a repeat count, in which case the minimum of that and the number of available items is used as the argument for length-item. If it has no repeat count or uses a '*', the number of available items is used.

        For unpack, an internal stack of integer arguments unpacked so far is used. You write /sequence-item and the repeat count is obtained by popping off the last element from the stack. The sequence-item must not have a repeat count.

        If sequence-item refers to a string type ("A" , "a" , or "Z" ), the length-item is the string length, not the number of strings. With an explicit repeat count for pack, the packed string is adjusted to that length. For example:

            This code:                               gives this result:

            unpack("W/a", "\004Gurusamy")            ("Guru")
            unpack("a3/A A*", "007 Bond  J ")        (" Bond", "J")
            unpack("a3 x2 /A A*", "007: Bond, J.")   ("Bond, J", ".")

            pack("n/a* w/a","hello,","world")        "\000\006hello,\005world"
            pack("a/W2", ord("a") .. ord("z"))       "2ab"

        The length-item is not returned explicitly from unpack.

        Supplying a count to the length-item format letter is only useful with A , a , or Z . Packing with a length-item of a or Z may introduce "\000" characters, which Perl does not regard as legal in numeric strings.
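
        A round trip through a length-prefixed record might look like the sketch below, which uses a one-byte count (C) in front of each string. The field values are arbitrary.

        ```perl
        use strict;
        use warnings;

        # Pack: each string is preceded by its length.
        my $short  = pack("n/a*", "hello");            # "\x00\x05hello"
        my $record = pack("C/a* C/a*", "ab", "cde");   # "\x02ab\x03cde"

        # Unpack: the sequence-item carries no repeat count of its own;
        # the count comes from the length-item just unpacked.
        my @fields = unpack("C/a C/a", $record);       # ("ab", "cde")

        print join(",", @fields), "\n";
        ```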

      • The integer types s, S , l , and L may be followed by a ! modifier to specify native shorts or longs. As shown in the example above, a bare l means exactly 32 bits, although the native long as seen by the local C compiler may be larger. This is mainly an issue on 64-bit platforms. You can see whether using ! makes any difference this way:

            printf "format s is %d, s! is %d\n",
                length pack("s"), length pack("s!");

            printf "format l is %d, l! is %d\n",
                length pack("l"), length pack("l!");

        i! and I! are also allowed, but only for completeness' sake: they are identical to i and I .

        The actual sizes (in bytes) of native shorts, ints, longs, and long longs on the platform where Perl was built are also available from the command line:

            $ perl -V:{short,int,long{,long}}size
            shortsize='2';
            intsize='4';
            longsize='4';
            longlongsize='8';

        or programmatically via the Config module:

            use Config;
            print $Config{shortsize},    "\n";
            print $Config{intsize},      "\n";
            print $Config{longsize},     "\n";
            print $Config{longlongsize}, "\n";

        $Config{longlongsize} is undefined on systems without long long support.

      • The integer formats s, S , i , I , l , L , j , and J are inherently non-portable between processors and operating systems because they obey native byteorder and endianness. For example, a 4-byte integer 0x12345678 (305419896 decimal) would be ordered natively (arranged in and handled by the CPU registers) into bytes as

            0x12 0x34 0x56 0x78  # big-endian
            0x78 0x56 0x34 0x12  # little-endian

        Basically, Intel and VAX CPUs are little-endian, while everybody else, including Motorola m68k/88k, PPC, Sparc, HP PA, Power, and Cray, are big-endian. Alpha and MIPS can be either: Digital/Compaq uses (well, used) them in little-endian mode, but SGI/Cray uses them in big-endian mode.

        The names big-endian and little-endian are comic references to the egg-eating habits of the little-endian Lilliputians and the big-endian Blefuscudians from the classic Jonathan Swift satire, Gulliver's Travels. This entered computer lingo via the paper "On Holy Wars and a Plea for Peace" by Danny Cohen, USC/ISI IEN 137, April 1, 1980.

        Some systems may have even weirder byte orders such as

            0x56 0x78 0x12 0x34
            0x34 0x12 0x78 0x56

        You can determine your system endianness with this incantation:

            printf("%#02x ", $_) for unpack("W*", pack L=>0x12345678);

        The byteorder on the platform where Perl was built is also available via Config:

            use Config;
            print "$Config{byteorder}\n";

        or from the command line:

            $ perl -V:byteorder

        Byteorders "1234" and "12345678" are little-endian; "4321" and "87654321" are big-endian.

        For portably packed integers, either use the formats n , N , v , and V or else use the > and < modifiers described immediately below. See also perlport.
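
        For instance, the fixed-order formats produce the same bytes on every platform, whatever the native byte order is:

        ```perl
        use strict;
        use warnings;

        my $big32    = pack("N", 0x12345678);   # "\x12\x34\x56\x78"
        my $little32 = pack("V", 0x12345678);   # "\x78\x56\x34\x12"
        my $big16    = pack("n", 0x1234);       # "\x12\x34"
        my $little16 = pack("v", 0x1234);       # "\x34\x12"

        # Reading back with the matching format recovers the number.
        my $value = unpack("N", $big32);

        printf "%#x\n", $value;
        ```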

      • Starting with Perl 5.10.0, integer and floating-point formats, along with the p and P formats and () groups, may all be followed by the > or < endianness modifiers to respectively enforce big- or little-endian byte-order. These modifiers are especially useful given how n , N , v , and V don't cover signed integers, 64-bit integers, or floating-point values.

        Here are some concerns to keep in mind when using an endianness modifier:

        • Exchanging signed integers between different platforms works only when all platforms store them in the same format. Most platforms store signed integers in two's-complement notation, so usually this is not an issue.

        • The > or < modifiers can only be used on floating-point formats on big- or little-endian machines. Otherwise, attempting to use them raises an exception.

        • Forcing big- or little-endian byte-order on floating-point values for data exchange can work only if all platforms use the same binary representation such as IEEE floating-point. Even if all platforms are using IEEE, there may still be subtle differences. Being able to use > or < on floating-point values can be useful, but also dangerous if you don't know exactly what you're doing. It is not a general way to portably store floating-point values.

        • When using > or < on a () group, this affects all types inside the group that accept byte-order modifiers, including all subgroups. It is silently ignored for all other types. You are not allowed to override the byte-order within a group that already has a byte-order modifier suffix.
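
        A short sketch of the modifiers on signed integers and on a group, assuming Perl 5.10.0 or later and two's-complement integers:

        ```perl
        use strict;
        use warnings;

        my $be = pack("l>", -2);            # "\xFF\xFF\xFF\xFE"
        my $le = pack("l<", -2);            # "\xFE\xFF\xFF\xFF"

        # A modifier on a group applies to every type inside it.
        my $group = pack("(s l)>", 1, 2);   # "\x00\x01\x00\x00\x00\x02"

        my $back = unpack("l>", $be);       # -2

        print "$back\n";
        ```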

      • Real numbers (floats and doubles) are in native machine format only. Due to the multiplicity of floating-point formats and the lack of a standard "network" representation for them, no facility for interchange has been made. This means that packed floating-point data written on one machine may not be readable on another, even if both use IEEE floating-point arithmetic (because the endianness of the memory representation is not part of the IEEE spec). See also perlport.

        If you know exactly what you're doing, you can use the > or < modifiers to force big- or little-endian byte-order on floating-point values.

        Because Perl uses doubles (or long doubles, if configured) internally for all numeric calculation, converting from double into float and thence to double again loses precision, so unpack("f", pack("f", $foo)) will not in general equal $foo.
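
        The loss is easy to observe. In this sketch 0.1 cannot be represented exactly in single precision, while 0.5 can, so only the latter survives the round trip (assuming Perl's internal NV is wider than a C float, which is the usual case):

        ```perl
        use strict;
        use warnings;

        my $inexact = unpack("f", pack("f", 0.1));
        my $exact   = unpack("f", pack("f", 0.5));

        printf "0.1 -> %.17g (%s)\n", $inexact,
            ($inexact == 0.1 ? "same" : "different");
        printf "0.5 -> %.17g (%s)\n", $exact,
            ($exact == 0.5 ? "same" : "different");
        ```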

      • Pack and unpack can operate in two modes: character mode (C0 mode) where the packed string is processed per character, and UTF-8 mode (U0 mode) where the packed string is processed in its UTF-8-encoded Unicode form on a byte-by-byte basis. Character mode is the default unless the format string starts with U . You can always switch mode mid-format with an explicit C0 or U0 in the format. This mode remains in effect until the next mode change, or until the end of the () group it (directly) applies to.

        Using C0 to get Unicode characters while using U0 to get non-Unicode bytes is not necessarily obvious. Probably only the first of these is what you want:

            $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
              perl -CS -ne 'printf "%v04X\n", $_ for unpack("C0A*", $_)'
            03B1.03C9
            $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
              perl -CS -ne 'printf "%v02X\n", $_ for unpack("U0A*", $_)'
            CE.B1.CF.89
            $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
              perl -C0 -ne 'printf "%v02X\n", $_ for unpack("C0A*", $_)'
            CE.B1.CF.89
            $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
              perl -C0 -ne 'printf "%v02X\n", $_ for unpack("U0A*", $_)'
            C3.8E.C2.B1.C3.8F.C2.89

        Those examples also illustrate that you should not try to use pack/unpack as a substitute for the Encode module.
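
        On the packing side, the effect of the mode switch can be sketched with a single Greek alpha (U+03B1):

        ```perl
        use strict;
        use warnings;

        # A template starting with U selects U0 mode, so the UTF-8 bytes
        # are joined into one character.
        my $as_char = pack("U", 0x3B1);

        # Forcing C0 mode first leaves the UTF-8 bytes as separate characters.
        my $as_bytes = pack("C0 U", 0x3B1);

        printf "%d character(s) versus %d\n", length $as_char, length $as_bytes;
        ```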

      • You must do any alignment or padding yourself by inserting, for example, enough "x"es while packing. There is no way for pack() and unpack() to know where characters are going to or coming from, so they handle their output and input as flat sequences of characters.

      • A () group is a sub-TEMPLATE enclosed in parentheses. A group may take a repeat count either as postfix, or for unpack(), also via the / template character. Within each repetition of a group, positioning with @ starts over at 0. Therefore, the result of

            pack("@1A((@2A)@3A)", qw[X Y Z])

        is the string "\0X\0\0YZ" .
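
        A repeated group is convenient for arrays of fixed-layout records. The sketch below packs two records, each a two-character tag followed by a 16-bit big-endian number; the tags and numbers are arbitrary.

        ```perl
        use strict;
        use warnings;

        # Two repetitions of (tag, number).
        my $packed = pack("(A2 n)2", "ab", 1, "cd", 2);

        # With *, the group repeats until the input is used up.
        my @fields = unpack("(A2 n)*", $packed);

        print "@fields\n";
        ```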

      • x and X accept the ! modifier to act as alignment commands: they jump forward or back to the closest position aligned at a multiple of count characters. For example, to pack() or unpack() a C structure like

            struct {
                char   c;    /* one signed, 8-bit character */
                double d;
                char   cc[2];
            }

        one may need to use the template c x![d] d c[2] . This assumes that doubles must be aligned to the size of double.

        For alignment commands, a count of 0 is equivalent to a count of 1; both are no-ops.
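
        Because the size of a double varies, a platform-independent sketch of the same idea uses an explicit count: x!4 advances to the next multiple of 4 before the 32-bit field.

        ```perl
        use strict;
        use warnings;

        # One byte, padding up to offset 4, then a 32-bit big-endian integer.
        my $packed = pack("C x!4 N", 1, 2);

        # The same template skips the padding again when unpacking.
        my @fields = unpack("C x!4 N", $packed);

        printf "%d bytes, fields: @fields\n", length $packed;
        ```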

      • n , N , v and V accept the ! modifier to represent signed 16-/32-bit integers in big-/little-endian order. This is portable only when all platforms sharing packed data use the same binary representation for signed integers; for example, when all platforms use two's-complement representation.

      • Comments can be embedded in a TEMPLATE using # through the end of line. White space can separate pack codes from each other, but modifiers and repeat counts must follow immediately. Breaking complex templates into individual line-by-line components, suitably annotated, can do as much to improve legibility and maintainability of pack/unpack formats as /x can for complicated pattern matches.
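
        An annotated template might be laid out like this sketch, in which the record layout (type, flags, tag) is invented for the example:

        ```perl
        use strict;
        use warnings;

        my $template = q{
            n    # record type, 16-bit big-endian
            C    # flags, one unsigned byte
            a4   # four-character tag
        };

        my $packed = pack($template, 1, 2, "abcd");
        my @fields = unpack($template, $packed);

        print "@fields\n";
        ```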

      • If TEMPLATE requires more arguments than pack() is given, pack() assumes additional "" arguments. If TEMPLATE requires fewer arguments than given, extra arguments are ignored.

      Examples:

          $foo = pack("WWWW",65,66,67,68);
          # foo eq "ABCD"
          $foo = pack("W4",65,66,67,68);
          # same thing
          $foo = pack("W4",0x24b6,0x24b7,0x24b8,0x24b9);
          # same thing with Unicode circled letters.
          $foo = pack("U4",0x24b6,0x24b7,0x24b8,0x24b9);
          # same thing with Unicode circled letters.  You don't get the
          # UTF-8 bytes because the U at the start of the format caused
          # a switch to U0-mode, so the UTF-8 bytes get joined into
          # characters
          $foo = pack("C0U4",0x24b6,0x24b7,0x24b8,0x24b9);
          # foo eq "\xe2\x92\xb6\xe2\x92\xb7\xe2\x92\xb8\xe2\x92\xb9"
          # This is the UTF-8 encoding of the string in the
          # previous example

          $foo = pack("ccxxcc",65,66,67,68);
          # foo eq "AB\0\0CD"

          # NOTE: The examples above featuring "W" and "c" are true
          # only on ASCII and ASCII-derived systems such as ISO Latin 1
          # and UTF-8.  On EBCDIC systems, the first example would be
          #      $foo = pack("WWWW",193,194,195,196);

          $foo = pack("s2",1,2);
          # "\001\000\002\000" on little-endian
          # "\000\001\000\002" on big-endian

          $foo = pack("a4","abcd","x","y","z");
          # "abcd"

          $foo = pack("aaaa","abcd","x","y","z");
          # "axyz"

          $foo = pack("a14","abcdefg");
          # "abcdefg\0\0\0\0\0\0\0"

          $foo = pack("i9pl", gmtime);
          # a real struct tm (on my system anyway)

          $utmp_template = "Z8 Z8 Z16 L";
          $utmp = pack($utmp_template, @utmp1);
          # a struct utmp (BSDish)

          @utmp2 = unpack($utmp_template, $utmp);
          # "@utmp1" eq "@utmp2"

          sub bintodec {
              unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
          }

          $foo = pack('sx2l', 12, 34);
          # short 12, two zero bytes padding, long 34
          $bar = pack('s@4l', 12, 34);
          # short 12, zero fill to position 4, long 34
          # $foo eq $bar
          $baz = pack('s.l', 12, 4, 34);
          # short 12, zero fill to position 4, long 34

          $foo = pack('nN', 42, 4711);
          # pack big-endian 16- and 32-bit unsigned integers
          $foo = pack('S>L>', 42, 4711);
          # exactly the same
          $foo = pack('s<l<', -42, 4711);
          # pack little-endian 16- and 32-bit signed integers
          $foo = pack('(sl)<', -42, 4711);
          # exactly the same

      The same template may generally also be used in unpack().
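A round trip makes the last point concrete. The sketch below packs three values and unpacks them again with the same template. The record layout (16-bit id, 32-bit size, 4-byte tag) and the variable names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical record layout: big-endian 16-bit id, 32-bit size, 4-byte tag.
my $template = 'n N a4';

my $packed = pack($template, 42, 4711, 'abcd');
my ($id, $size, $tag) = unpack($template, $packed);

printf "%d bytes: id=%d size=%d tag=%s\n",
    length($packed), $id, $size, $tag;
# prints "10 bytes: id=42 size=4711 tag=abcd"
```

Whitespace inside a template is ignored, so it can be used freely to separate fields.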

     
    package

    • package NAMESPACE
    • package NAMESPACE VERSION

    • package NAMESPACE BLOCK
    • package NAMESPACE VERSION BLOCK

      Declares the BLOCK or the rest of the compilation unit as being in the given namespace. The scope of the package declaration is either the supplied code BLOCK or, in the absence of a BLOCK, from the declaration itself through the end of the current scope (the enclosing block, file, or eval). That is, the forms without a BLOCK are operative through the end of the current scope, just like the my, state, and our operators. All unqualified dynamic identifiers in this scope will be in the given namespace, except where overridden by another package declaration or when they're one of the special identifiers that qualify into main::, like STDOUT, ARGV, ENV, and the punctuation variables.

      A package statement affects dynamic variables only, including those you've used local on, but not lexically-scoped variables, which are created with my, state, or our. Typically it would be the first declaration in a file included by require or use. You can switch into a package in more than one place, since this only determines which default symbol table the compiler uses for the rest of that block. You can refer to identifiers in other packages than the current one by prefixing the identifier with the package name and a double colon, as in $SomePack::var or ThatPack::INPUT_HANDLE. If the package name is omitted, the main package is assumed. That is, $::sail is equivalent to $main::sail (as well as to $main'sail, still seen in ancient code, mostly from Perl 4).

      If VERSION is provided, package sets the $VERSION variable in the given namespace to a version object with the VERSION provided. VERSION must be a "strict" style version number as defined by the version module: a positive decimal number (integer or decimal-fraction) without exponentiation or else a dotted-decimal v-string with a leading 'v' character and at least three components. You should set $VERSION only once per package.

      See Packages in perlmod for more information about packages, modules, and classes. See perlsub for other scoping issues.
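The block form, the VERSION argument, and qualified names can be seen together in a small sketch. The package name Counter, its bump function, and the $sail variable are made up for this example.

```perl
use 5.014;    # package NAMESPACE VERSION BLOCK needs Perl 5.14 or later
use strict;
use warnings;

package Counter 1.02 {
    our $count = 0;                 # this is $Counter::count
    sub bump { return ++$count }
}

# Back in main here: the block form ended the Counter scope.
our $sail = 'mainsail';
Counter::bump() for 1 .. 3;

print "count: $Counter::count\n";                       # count: 3
print "same variable\n" if \$::sail == \$main::sail;    # same variable
print "version: ", Counter->VERSION, "\n";
```

Comparing the two references shows that $::sail and $main::sail name one and the same variable.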

     
    pipe

    • pipe READHANDLE,WRITEHANDLE

      Opens a pair of connected pipes like the corresponding system call. Note that if you set up a loop of piped processes, deadlock can occur unless you are very careful. In addition, note that Perl's pipes use IO buffering, so you may need to set $| to flush your WRITEHANDLE after each command, depending on the application.

      Returns true on success.

      See IPC::Open2, IPC::Open3, and Bidirectional Communication with Another Process in perlipc for examples of such things.

      On systems that support a close-on-exec flag on files, that flag is set on all newly opened file descriptors whose filenos are higher than the current value of $^F (by default 2 for STDERR ). See $^F in perlvar.
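As a rough sketch of the usual pattern, the following creates a pipe, forks, and has the child send one line to the parent. It assumes a platform where fork is available, and the handle names are arbitrary. IO::Handle's autoflush is used here in place of setting $| on the selected handle.

```perl
use strict;
use warnings;
use IO::Handle;    # for the autoflush method

pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);    # do not leave data sitting in the write buffer

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: writes only, so close the read end.
    close $reader;
    print {$writer} "hello from child $$\n";
    close $writer;
    exit 0;
}

# Parent: reads only, so close the write end.
close $writer;
my $line = <$reader>;
close $reader;
waitpid($pid, 0);

print "parent read: $line";
```

Closing the unused end in each process matters: the reader sees end-of-file only once every copy of the write end has been closed.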

     
    pop

    • pop ARRAY

    • pop EXPR
    • pop

      Pops and returns the last value of the array, shortening the array by one element.

      Returns the undefined value if the array is empty, although this may also happen at other times. If ARRAY is omitted, pops the @ARGV array in the main program, but the @_ array in subroutines, just like shift.
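A short illustration of these cases follows; the array contents and the last_arg helper are made up for the example.

```perl
use strict;
use warnings;

my @queue = qw(a b c);
my $last  = pop @queue;     # "c"; @queue is now ("a", "b")

my @empty;
my $none = pop @empty;      # undef: there is nothing left to pop

# Inside a subroutine, a bare pop works on @_.
sub last_arg { return pop }

print "last: $last, left: @queue\n";            # last: c, left: a b
print "empty array gives undef\n" unless defined $none;
print "last_arg: ", last_arg(1, 2, 3), "\n";    # last_arg: 3
```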

      Starting with Perl 5.14, pop can take a scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of pop is considered highly experimental. The exact behaviour may change in a future version of Perl.

      To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

      use 5.014; # so push/pop/etc work on scalars (experimental)
     
    pos

    • pos SCALAR

    • pos

      Returns the offset of where the last m//g search left off for the variable in question ($_ is used when the variable is not specified). Note that 0 is a valid match offset. undef indicates that the search position is reset (usually due to match failure, but can also be because no match has yet been run on the scalar).

      pos directly accesses the location used by the regexp engine to store the offset, so assigning to pos will change that offset, and so will also influence the \G zero-width assertion in regular expressions. Both of these effects take place for the next match, so you can't affect the position with pos during the current match, such as in (?{pos() = 5}) or s//pos() = 5/e .

      Setting pos also resets the matched with zero-length flag, described under Repeated Patterns Matching a Zero-length Substring in perlre.

      Because a failed m//gc match doesn't reset the offset, the return from pos won't change either in this case. See perlre and perlop.
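The interplay of m//g, the /c modifier, and assignment to pos can be sketched as follows. The sample string is arbitrary.

```perl
use strict;
use warnings;

my $text = 'key=value';

$text =~ /\w+/g;                  # matches "key"
my $after_key = pos($text);       # 3

$text =~ /\d+/gc;                 # fails, but /c keeps the offset
my $after_fail = pos($text);      # still 3

pos($text) = 4;                   # step over the "="
my $value;
$value = $1 if $text =~ /\G(\w+)/gc;    # \G anchors at offset 4

print "after key: $after_key, after failed match: $after_fail\n";
print "value: $value\n";          # value: value
```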

     
    print

    • print FILEHANDLE LIST

    • print FILEHANDLE
    • print LIST
    • print

      Prints a string or a list of strings. Returns true if successful. FILEHANDLE may be a scalar variable containing the name of or a reference to the filehandle, thus introducing one level of indirection. (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be misinterpreted as an operator unless you interpose a + or put parentheses around the arguments.) If FILEHANDLE is omitted, prints to the last selected (see select) output handle. If LIST is omitted, prints $_ to the currently selected output handle. To use FILEHANDLE alone to print the content of $_ to it, you must use a real filehandle like FH , not an indirect one like $fh . To set the default output handle to something other than STDOUT, use the select operation.

      The current value of $, (if any) is printed between each LIST item. The current value of $\ (if any) is printed after the entire LIST has been printed. Because print takes a LIST, anything in the LIST is evaluated in list context, including any subroutines whose return lists you pass to print. Be careful not to follow the print keyword with a left parenthesis unless you want the corresponding right parenthesis to terminate the arguments to the print; put parentheses around all arguments (or interpose a + , but that doesn't look as good).

      If you're storing handles in an array or hash, or in general whenever you're using any expression more complex than a bareword handle or a plain, unsubscripted scalar variable to retrieve it, you will have to use a block returning the filehandle value instead, in which case the LIST may not be omitted:

      print { $files[$i] } "stuff\n";
      print { $OK ? STDOUT : STDERR } "stuff\n";

      Printing to a closed pipe or socket will generate a SIGPIPE signal. See perlipc for more on signal handling.
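The separator variables and the block form can be tried out with an in-memory filehandle, which keeps the output inspectable. The handle, hash, and variable names below are illustrative only.

```perl
use strict;
use warnings;

# In-memory handle so the output can be examined afterwards.
open(my $mem, '>', \my $buffer) or die "can't open in-memory handle: $!";
my %handles = (mem => $mem, err => \*STDERR);
my $ok = 1;

{
    local $, = '-';     # printed between LIST items
    local $\ = "\n";    # printed after the whole LIST
    print { $handles{mem} } 'a', 'b', 'c';
}

# Any expression that yields a handle must go inside a block.
print { $ok ? $mem : \*STDERR } "stuff\n";
close $mem;

print $buffer;    # "a-b-c\n" followed by "stuff\n"
```

Using local confines the changes to $, and $\ to the enclosing block, so later print calls are unaffected.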

     
    printf

    • printf FILEHANDLE FORMAT, LIST

    • printf FILEHANDLE
    • printf FORMAT, LIST
    • printf

      Equivalent to print FILEHANDLE sprintf(FORMAT, LIST) , except that $\ (the output record separator) is not appended. The FORMAT and the LIST are actually parsed as a single list. The first argument of the list will be interpreted as the printf format. This means that printf(@_) will use $_[0] as the format. See sprintf for an explanation of the format argument. If use locale (including use locale ':not_characters' ) is in effect and POSIX::setlocale() has been called, the character used for the decimal separator in formatted floating-point numbers is affected by the LC_NUMERIC locale setting. See perllocale and POSIX.

      For historical reasons, if you omit the list, $_ is used as the format; to use FILEHANDLE without a list, you must use a real filehandle like FH , not an indirect one like $fh . However, this will rarely do what you want; if $_ contains formatting codes, they will be replaced with the empty string and a warning will be emitted if warnings are enabled. Just use print if you want to print the contents of $_.

      Don't fall into the trap of using a printf when a simple print would do. The print is more efficient and less error prone.
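A brief sketch of both points: the first element of the list acts as the format, and data that may contain percent signs is safer passed as an argument than used as the format. The values are arbitrary, and the example assumes no locale pragma is in effect.

```perl
use strict;
use warnings;

# The first element of the list is taken as the format.
my @args = ('%-6s|%5.2f|%03d', 'pi', 3.14159, 7);
printf(@args);
print "\n";                   # output so far: "pi    | 3.14|007"

my $text = 'progress: 100%';
print $text, "\n";            # plain print: the % is just data
printf "%s\n", $text;         # also safe: the % is in an argument, not the format
```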

     
    prototype

    • prototype FUNCTION

      Returns the prototype of a function as a string (or undef if the function has no prototype). FUNCTION is a reference to, or the name of, the function whose prototype you want to retrieve.

      If FUNCTION is a string starting with CORE:: , the rest is taken as a name for a Perl builtin. If the builtin's arguments cannot be adequately expressed by a prototype (such as system), prototype() returns undef, because the builtin does not really behave like a Perl function. Otherwise, the string describing the equivalent prototype is returned.
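For illustration, the sketch below queries a few prototypes. The subroutines pair and plain are made up. The exact string returned for CORE::push differs between Perl versions, so it is printed without a stated result.

```perl
use strict;
use warnings;

sub pair($$) { return "@_" }
sub plain    { return "@_" }

my $by_ref  = prototype(\&pair);            # '$$'
my $by_name = prototype('main::pair');      # '$$'
my $none    = prototype(\&plain);           # undef: no prototype declared
my $system  = prototype('CORE::system');    # undef: not expressible
my $builtin = prototype('CORE::push');      # version-dependent string

print "pair by reference: $by_ref\n";
print "pair by name:      $by_name\n";
print "plain:             ", $none   // 'undef', "\n";
print "CORE::system:      ", $system // 'undef', "\n";
print "CORE::push:        $builtin\n";
```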

     
    push

    • push ARRAY,LIST

    • push EXPR,LIST

      Treats ARRAY as a stack by appending the values of LIST to the end of ARRAY. The length of ARRAY increases by the length of LIST. Has the same effect as

      for $value (LIST) {
          $ARRAY[++$#ARRAY] = $value;
      }

      but is more efficient. Returns the number of elements in the array following the completed push.
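A minimal example of the return value and of list flattening, with arbitrary data:

```perl
use strict;
use warnings;

my @stack = (1, 2);
my $size  = push @stack, 3, 4;    # returns the new element count

my @more = (5, 6);
push @stack, @more;               # @more is flattened into the list

print "size after first push: $size\n";    # size after first push: 4
print "stack: @stack\n";                   # stack: 1 2 3 4 5 6
```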

      Starting with Perl 5.14, push can take a scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of push is considered highly experimental. The exact behaviour may change in a future version of Perl.

      To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

      use 5.014; # so push/pop/etc work on scalars (experimental)
     
    q

    • q/STRING/
     
    qq

    • qq/STRING/
     
    qr

     
    quotemeta

    • quotemeta EXPR

    • quotemeta

      Returns the value of EXPR with all the ASCII non-"word" characters backslashed. (That is, all ASCII characters not matching /[A-Za-z_0-9]/ will be preceded by a backslash in the returned string, regardless of any locale settings.) This is the internal function implementing the \Q escape in double-quoted strings. (See below for the behavior on non-ASCII code points.)

      If EXPR is omitted, uses $_.

      quotemeta (and \Q ... \E ) are useful when interpolating strings into regular expressions, because by default an interpolated variable will be considered a mini-regular expression. For example:

      my $sentence = 'The quick brown fox jumped over the lazy dog';
      my $substring = 'quick.*?fox';
      $sentence =~ s{$substring}{big bad wolf};

      Will cause $sentence to become 'The big bad wolf jumped over...'.

      On the other hand:

      my $sentence = 'The quick brown fox jumped over the lazy dog';
      my $substring = 'quick.*?fox';
      $sentence =~ s{\Q$substring\E}{big bad wolf};

      Or:

      my $sentence = 'The quick brown fox jumped over the lazy dog';
      my $substring = 'quick.*?fox';
      my $quoted_substring = quotemeta($substring);
      $sentence =~ s{$quoted_substring}{big bad wolf};

      Will both leave the sentence as is. Normally, when accepting literal string input from the user, quotemeta() or \Q must be used.
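The difference is easy to check directly. In this sketch the variable $input stands in for a string received from a user.

```perl
use strict;
use warnings;

my $input  = 'a.b*c';                # pretend this came from a user
my $quoted = quotemeta($input);      # a\.b\*c

# Unquoted, "." and "*" act as metacharacters.
my $loose   = 'axbbc' =~ /^$input$/      ? 1 : 0;    # 1
# Quoted, only the literal text matches.
my $strict  = 'axbbc' =~ /^\Q$input\E$/  ? 1 : 0;    # 0
my $literal = 'a.b*c' =~ /^$quoted$/     ? 1 : 0;    # 1

print "quoted form: $quoted\n";
print "loose=$loose strict=$strict literal=$literal\n";
# prints "loose=1 strict=0 literal=1"
```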

      In Perl v5.14, all non-ASCII characters are quoted in non-UTF-8-encoded strings, but not quoted in UTF-8 strings.

      Starting in Perl v5.16, Perl adopted a Unicode-defined strategy for quoting non-ASCII characters; the quoting of ASCII characters is unchanged.

      Also unchanged is the quoting of non-UTF-8 strings when outside the scope of a use feature 'unicode_strings' , which is to quote all characters in the upper Latin1 range. This provides complete backwards compatibility for old programs which do not use Unicode. (Note that unicode_strings is automatically enabled within the scope of a use v5.12 or greater.)

      Within the scope of use locale , all non-ASCII Latin1 code points are quoted whether the string is encoded as UTF-8 or not. As mentioned above, locale does not affect the quoting of ASCII-range characters. This protects against those locales where characters such as "|" are considered to be word characters.

      Otherwise, Perl quotes non-ASCII characters using an adaptation from Unicode (see http://www.unicode.org/reports/tr31/). The only code points that are quoted are those that have any of the Unicode properties: Pattern_Syntax, Pattern_White_Space, White_Space, Default_Ignorable_Code_Point, or General_Category=Control.

      Of these properties, the two important ones are Pattern_Syntax and Pattern_White_Space. They have been set up by Unicode for exactly this purpose of deciding which characters in a regular expression pattern should be quoted. No character that can be in an identifier has these properties.

      Perl promises that if we ever add regular expression pattern metacharacters to the dozen already defined (\ | ( ) [ { ^ $ * + ? .), we will use only ones that have the Pattern_Syntax property. Perl also promises that if we ever add characters that are considered to be white space in regular expressions (currently mostly affected by /x), they will all have the Pattern_White_Space property.

      Unicode promises that the set of code points that have these two properties will never change, so something that is not quoted in v5.16 will never need to be quoted in any future Perl release. (Not all the code points that match Pattern_Syntax have actually had characters assigned to them; so there is room to grow, but they are quoted whether assigned or not. Perl, of course, would never use an unassigned code point as an actual metacharacter.)

      Quoting characters that have the other 3 properties is done to enhance the readability of the regular expression and not because they actually need to be quoted for regular expression purposes (characters with the White_Space property are likely to be indistinguishable on the page or screen from those with the Pattern_White_Space property; and the other two properties contain non-printing characters).

     
    qw

    • qw/STRING/
     
    qx

     
    rand

    • rand EXPR

    • rand

      Returns a random fractional number greater than or equal to 0 and less than the value of EXPR. (EXPR should be positive.) If EXPR is omitted, the value 1 is used. Currently EXPR with the value 0 is also special-cased as 1 (this was undocumented before Perl 5.8.0 and is subject to change in future versions of Perl). Automatically calls srand unless srand has already been called. See also srand.

      Apply int() to the value returned by rand() if you want random integers instead of random fractional numbers. For example,

      int(rand(10))

      returns a random integer between 0 and 9, inclusive.

      (Note: If your rand function consistently returns numbers that are too large or too small, then your version of Perl was probably compiled with the wrong number of RANDBITS.)

      rand() is not cryptographically secure. You should not rely on it in security-sensitive situations. As of this writing, a number of third-party CPAN modules offer random number generators intended by their authors to be cryptographically secure, including: Data::Entropy, Crypt::Random, Math::Random::Secure, and Math::TrulyRandom.
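Two common non-security uses are sketched below: a die roll and a random pick from a list. The names are illustrative, and no particular output is shown because the results vary from run to run.

```perl
use strict;
use warnings;

# A die roll: int(rand(6)) is 0..5, so add 1 for 1..6.
my $roll = 1 + int(rand(6));

# A random element: @colors is evaluated in scalar context (3), and the
# fractional index is truncated to an integer when used as a subscript.
my @colors = qw(red green blue);
my $pick   = $colors[ rand @colors ];

print "rolled $roll, picked $pick\n";
```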

     
    read

    • read FILEHANDLE,SCALAR,LENGTH,OFFSET

    • read FILEHANDLE,SCALAR,LENGTH

      Attempts to read LENGTH characters of data into variable SCALAR from the specified FILEHANDLE. Returns the number of characters actually read, 0 at end of file, or undef if there was an error (in the latter case $! is also set). SCALAR will be grown or shrunk so that the last character actually read is the last character of the scalar after the read.

      An OFFSET may be specified to place the read data at some place in the string other than the beginning. A negative OFFSET specifies placement at that many characters counting backwards from the end of the string. A positive OFFSET greater than the length of SCALAR results in the string being padded to the required size with "\0" bytes before the result of the read is appended.

      The call is implemented in terms of either Perl's or your system's native fread(3) library function. To get a true read(2) system call, see sysread.

      Note the characters: depending on the status of the filehandle, either (8-bit) bytes or characters are read. By default, all filehandles operate on bytes, but for example if the filehandle has been opened with the :utf8 I/O layer (see open, and the open pragma, open), the I/O will operate on UTF8-encoded Unicode characters, not bytes. Similarly for the :encoding pragma: in that case pretty much any characters can be read.
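The return values and the OFFSET argument can be observed by reading a short string in fixed-size chunks. An in-memory filehandle is used so the sketch is self-contained; the data and chunk size are arbitrary.

```perl
use strict;
use warnings;

my $data = 'abcdefghij';
open(my $fh, '<', \$data) or die "can't open in-memory handle: $!";

my $buffer = '';
my @counts;
while (1) {
    # OFFSET = current length, so each chunk is appended to $buffer.
    my $got = read($fh, $buffer, 4, length $buffer);
    die "read failed: $!" unless defined $got;
    push @counts, $got;
    last if $got == 0;    # 0 means end of file
}
close $fh;

print "chunks: @counts\n";    # chunks: 4 4 2 0
print "buffer: $buffer\n";    # buffer: abcdefghij
```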

     
    readdir

    • readdir DIRHANDLE

      Returns the next directory entry for a directory opened by opendir. If used in list context, returns all the rest of the entries in the directory. If there are no more entries, returns the undefined value in scalar context and the empty list in list context.

      If you're planning to filetest the return values out of a readdir, you'd better prepend the directory in question. Otherwise, because we didn't chdir there, it would have been testing the wrong file.

      opendir(my $dh, $some_dir) || die "can't opendir $some_dir: $!";
      @dots = grep { /^\./ && -f "$some_dir/$_" } readdir($dh);
      closedir $dh;

      As of Perl 5.12 you can use a bare readdir in a while loop, which will set $_ on every iteration.

      opendir(my $dh, $some_dir) || die;
      while(readdir $dh) {
          print "$some_dir/$_\n";
      }
      closedir $dh;

      To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious failures, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

      use 5.012; # so readdir assigns to $_ in a lone while test
     
    readline

    • readline EXPR
    • readline

      Reads from the filehandle whose typeglob is contained in EXPR (or from *ARGV if EXPR is not provided). In scalar context, each call reads and returns the next line until end-of-file is reached, whereupon the subsequent call returns undef. In list context, reads until end-of-file is reached and returns a list of lines. Note that the notion of "line" used here is whatever you may have defined with $/ (or $INPUT_RECORD_SEPARATOR). See $/ in perlvar.

      When $/ is set to undef, when readline is in scalar context (i.e., file slurp mode), and when an empty file is read, it returns '' the first time, followed by undef subsequently.
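The three reading modes (scalar context, list context, and slurp mode with $/ undefined) are sketched below on non-empty data. An in-memory filehandle stands in for a real file, and the sample text is arbitrary.

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";

open(my $fh, '<', \$data) or die "can't open in-memory handle: $!";
my $first = readline($fh);    # scalar context: "one\n"
my @rest  = readline($fh);    # list context: ("two\n", "three\n")
close $fh;

open($fh, '<', \$data) or die "can't reopen in-memory handle: $!";
my $all = do { local $/; readline($fh) };    # slurp mode: one string
close $fh;

print "first line: $first";
print "remaining lines: ", scalar(@rest), "\n";    # remaining lines: 2
print "slurped length: ", length($all), "\n";      # slurped length: 14
```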

      This is the internal function implementing the <EXPR> operator, but you can use it directly. The <EXPR> operator is discussed in more detail in I/O Operators in perlop.

      $line = <STDIN>;
      $line = readline(*STDIN); # same thing

      If readline encounters an operating system error, $! will be set with the corresponding error message. It can be helpful to check $! when you are reading from filehandles you don't trust, such as a tty or a socket. The following example uses the operator form of readline and dies if the result is not defined.

      while ( ! eof($fh) ) {
          defined( $_ = <$fh> ) or die "readline failed: $!";
          ...
      }

      Note that you can't handle readline errors that way with the ARGV filehandle. In that case, you have to open each element of @ARGV yourself, since eof handles ARGV differently.

      foreach my $arg (@ARGV) {
          open(my $fh, $arg) or warn "Can't open $arg: $!";
          while ( ! eof($fh) ) {
              defined( $_ = <$fh> )
                  or die "readline failed for $arg: $!";
              ...
          }
      }
     
    readlink

    • readlink EXPR

    • readlink

      Returns the value of a symbolic link, if symbolic links are implemented. If not, raises an exception. If there is a system error, returns the undefined value and sets $! (errno). If EXPR is omitted, uses $_.

      Portability issues: readlink in perlport.

     
    readpipe

    • readpipe EXPR
    • readpipe

      EXPR is executed as a system command. The collected standard output of the command is returned. In scalar context, it comes back as a single (potentially multi-line) string. In list context, returns a list of lines (however you've defined lines with $/ or $INPUT_RECORD_SEPARATOR ). This is the internal function implementing the qx/EXPR/ operator, but you can use it directly. The qx/EXPR/ operator is discussed in more detail in I/O Operators in perlop. If EXPR is omitted, uses $_ .
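A small sketch of the two calling contexts follows. It assumes a shell in which echo and && behave as they do in typical Unix shells; the commands themselves are only placeholders.

```perl
use strict;
use warnings;

# Scalar context: the whole output comes back as one string.
my $output = readpipe('echo hello');
die "command could not be run" unless defined $output;
chomp $output;

# List context: one element per line, as defined by $/.
my @lines = readpipe('echo one && echo two');

print "scalar context: $output\n";
print "list context: ", scalar(@lines), " lines\n";
```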

     
    recv

    • recv SOCKET,SCALAR,LENGTH,FLAGS

      Receives a message on a socket. Attempts to receive LENGTH characters of data into variable SCALAR from the specified SOCKET filehandle. SCALAR will be grown or shrunk to the length actually read. Takes the same flags as the system call of the same name. Returns the address of the sender if SOCKET's protocol supports this; returns an empty string otherwise. If there's an error, returns the undefined value. This call is actually implemented in terms of the recvfrom(2) system call. See UDP: Message Passing in perlipc for examples.

      Note the characters: depending on the status of the socket, either (8-bit) bytes or characters are received. By default all sockets operate on bytes, but for example if the socket has been changed using binmode() to operate with the :encoding(utf8) I/O layer (see the open pragma, open), the I/O will operate on UTF8-encoded Unicode characters, not bytes. Similarly for the :encoding pragma: in that case pretty much any characters can be read.

     
    redo

    • redo LABEL

    • redo EXPR
    • redo

      The redo command restarts the loop block without evaluating the conditional again. The continue block, if any, is not executed. If the LABEL is omitted, the command refers to the innermost enclosing loop. The redo EXPR form, available starting in Perl 5.18.0, allows a label name to be computed at run time, and is otherwise identical to redo LABEL . Programs that want to lie to themselves about what was just input normally use this command:

      # a simpleminded Pascal comment stripper
      # (warning: assumes no { or } in strings)
      LINE: while (<STDIN>) {
          while (s|({.*}.*){.*}|$1 |) {}
          s|{.*}| |;
          if (s|{.*| |) {
              $front = $_;
              while (<STDIN>) {
                  if (/}/) { # end of comment?
                      s|^|$front\{|;
                      redo LINE;
                  }
              }
          }
          print;
      }

      redo cannot be used to retry a block that returns a value such as eval {} , sub {} , or do {} , and should not be used to exit a grep() or map() operation.

      Note that a block by itself is semantically identical to a loop that executes once. Thus redo inside such a block will effectively turn it into a looping construct.
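The bare-block case can be sketched as a retry loop. The @inputs array stands in for successive lines of user input and is purely hypothetical.

```perl
use strict;
use warnings;

my @inputs = ('abc', '', '42');    # hypothetical canned "user input"
my $number;

{
    my $answer = shift @inputs;
    last unless defined $answer;       # ran out of input: leave the block
    redo unless $answer =~ /^\d+$/;    # not a number: run the block again
    $number = $answer;
}

print "accepted: $number\n";    # accepted: 42
```

The block runs three times here: redo restarts it for 'abc' and for the empty string, and the third value passes the check.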

      See also continue for an illustration of how last, next, and redo work.

      Unlike most named operators, this has the same precedence as assignment. It is also exempt from the looks-like-a-function rule, so redo ("foo")."bar" will cause "bar" to be part of the argument to redo.

     
    ref

    • ref EXPR

    • ref

      Returns a non-empty string if EXPR is a reference, the empty string otherwise. If EXPR is not specified, $_ will be used. The value returned depends on the type of thing the reference is a reference to. Builtin types include:

      SCALAR
      ARRAY
      HASH
      CODE
      REF
      GLOB
      LVALUE
      FORMAT
      IO
      VSTRING
      Regexp

      If the referenced object has been blessed into a package, then that package name is returned instead. You can think of ref as a typeof operator.

      if (ref($r) eq "HASH") {
          print "r is a reference to a hash.\n";
      }
      unless (ref($r)) {
          print "r is not a reference at all.\n";
      }

      The return value LVALUE indicates a reference to an lvalue that is not a variable. You get this from taking the reference of function calls like pos() or substr(). VSTRING is returned if the reference points to a version string.

      The result Regexp indicates that the argument is a regular expression resulting from qr//.
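Several of these return values can be checked in one pass. The class name My::Class and the sample values are invented for the example.

```perl
use strict;
use warnings;

my $string = 'abc';
my @things = (
    \1,                        # SCALAR
    [1, 2],                    # ARRAY
    { a => 1 },                # HASH
    sub { 1 },                 # CODE
    \\1,                       # REF
    \*STDOUT,                  # GLOB
    qr/x/,                     # Regexp
    \substr($string, 0, 1),    # LVALUE
    bless({}, 'My::Class'),    # the package name
    'plain',                   # not a reference
);

my @types = map { ref($_) || '(not a reference)' } @things;
print "$_\n" for @types;
```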

      See also perlref.

     
    rename

    • rename OLDNAME,NEWNAME

      Changes the name of a file; an existing file NEWNAME will be clobbered. Returns true for success, false otherwise.

      Behavior of this function varies wildly depending on your system implementation. For example, it will usually not work across file system boundaries, even though the system mv command sometimes compensates for this. Other restrictions include whether it works on directories, open files, or pre-existing files. Check perlport and either the rename(2) manpage or equivalent system documentation for details.

      For a platform independent move function look at the File::Copy module.
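
      A minimal sketch of both approaches, using a temporary directory so that it has no lasting side effects. The file names are invented for the example.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Work inside a throwaway directory that is removed at exit.
my $dir   = tempdir(CLEANUP => 1);
my $old   = "$dir/old.txt";
my $new   = "$dir/new.txt";
my $final = "$dir/final.txt";

open(my $fh, '>', $old) or die "Cannot create $old: $!";
print {$fh} "data\n";
close($fh) or die "Cannot close $old: $!";

# rename reports failure through its return value and $!.
rename($old, $new) or die "Cannot rename $old to $new: $!";

# File::Copy's move is the more portable choice when the source and
# destination might live on different file systems.
move($new, $final) or die "Cannot move $new to $final: $!";

print "moved to $final\n" if -e $final;
```

      Always check the return value: a failed rename is silent otherwise.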

      Portability issues: rename in perlport.

     

    require

    • require VERSION

    • require EXPR
    • require

      Demands a version of Perl specified by VERSION, or demands some semantics specified by EXPR or by $_ if EXPR is not supplied.

      VERSION may be either a numeric argument such as 5.006, which will be compared to $], or a literal of the form v5.6.1, which will be compared to $^V (aka $PERL_VERSION). An exception is raised if VERSION is greater than the version of the current Perl interpreter. Compare with use, which can do a similar check at compile time.

      Specifying VERSION as a literal of the form v5.6.1 should generally be avoided, because it leads to misleading error messages under earlier versions of Perl that do not support this syntax. The equivalent numeric version should be used instead.

          require v5.6.1;     # run time version check
          require 5.6.1;      # ditto
          require 5.006_001;  # ditto; preferred for backwards
                              # compatibility

      Otherwise, require demands that a library file be included if it hasn't already been included. The file is included via the do-FILE mechanism, which is essentially just a variety of eval with the caveat that lexical variables in the invoking script will be invisible to the included code. It has semantics similar to the following subroutine:

          sub require {
              my ($filename) = @_;
              if (exists $INC{$filename}) {
                  return 1 if $INC{$filename};
                  die "Compilation failed in require";
              }
              my ($realfilename,$result);
              ITER: {
                  foreach $prefix (@INC) {
                      $realfilename = "$prefix/$filename";
                      if (-f $realfilename) {
                          $INC{$filename} = $realfilename;
                          $result = do $realfilename;
                          last ITER;
                      }
                  }
                  die "Can't find $filename in \@INC";
              }
              if ($@) {
                  $INC{$filename} = undef;
                  die $@;
              } elsif (!$result) {
                  delete $INC{$filename};
                  die "$filename did not return true value";
              } else {
                  return $result;
              }
          }

      Note that the file will not be included twice under the same specified name.

      The file must return true as the last statement to indicate successful execution of any initialization code, so it's customary to end such a file with 1; unless you're sure it'll return true otherwise. But it's better just to put the 1;, in case you add more statements.

      If EXPR is a bareword, the require assumes a ".pm" extension and replaces "::" with "/" in the filename for you, to make it easy to load standard modules. This form of loading of modules does not risk altering your namespace.

      In other words, if you try this:

          require Foo::Bar;     # a splendid bareword

      The require function will actually look for the "Foo/Bar.pm" file in the directories specified in the @INC array.

      But if you try this:

          $class = 'Foo::Bar';
          require $class;       # $class is not a bareword
          #or
          require "Foo::Bar";   # not a bareword because of the ""

      The require function will look for the "Foo::Bar" file in the @INC array and will complain about not finding "Foo::Bar" there. In this case you can do:

          eval "require $class";
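
      If you would rather not pass a run-time string to eval, one alternative is to do by hand the same conversion that require applies to a bareword. The sketch below is only an illustration: load_class is a hypothetical helper, and File::Spec stands in for whatever module name arrives at run time.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module whose name is known only at run time.
sub load_class {
    my ($class) = @_;

    # Refuse anything that does not look like a package name.
    die "Invalid class name '$class'\n"
        unless $class =~ /\A\w+(?:::\w+)*\z/;

    # Foo::Bar becomes Foo/Bar.pm, as require does for a bareword.
    (my $file = "$class.pm") =~ s{::}{/}g;

    require $file;
    return $class;
}

my $class = load_class("File::Spec");
print $class->catfile("a", "b"), "\n";
```

      The name check matters for the eval form as well, since eval "require $class" will run whatever code the string contains.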

      Now that you understand how require looks for files with a bareword argument, there is a little extra functionality going on behind the scenes. Before require looks for a ".pm" extension, it will first look for a similar filename with a ".pmc" extension. If this file is found, it will be loaded in place of any file ending in a ".pm" extension.

      You can also insert hooks into the import facility by putting Perl code directly into the @INC array. There are three forms of hooks: subroutine references, array references, and blessed objects.

      Subroutine references are the simplest case. When the inclusion system walks through @INC and encounters a subroutine, this subroutine gets called with two parameters, the first a reference to itself, and the second the name of the file to be included (e.g., "Foo/Bar.pm"). The subroutine should return either nothing or else a list of up to three values in the following order:

      1. A filehandle, from which the file will be read.

      2. A reference to a subroutine. If there is no filehandle (previous item), then this subroutine is expected to generate one line of source code per call, writing the line into $_ and returning 1, then finally at end of file returning 0. If there is a filehandle, then the subroutine will be called to act as a simple source filter, with the line as read in $_. Again, return 1 for each valid line, and 0 after all lines have been returned.

      3. Optional state for the subroutine. The state is passed in as $_[1]. A reference to the subroutine itself is passed in as $_[0].

      If an empty list, undef, or nothing that matches the first 3 values above is returned, then require looks at the remaining elements of @INC. Note that this filehandle must be a real filehandle (strictly a typeglob or reference to a typeglob, whether blessed or unblessed); tied filehandles will be ignored and processing will stop there.

      If the hook is an array reference, its first element must be a subroutine reference. This subroutine is called as above, but the first parameter is the array reference. This lets you indirectly pass arguments to the subroutine.

      In other words, you can write:

          push @INC, \&my_sub;
          sub my_sub {
              my ($coderef, $filename) = @_;  # $coderef is \&my_sub
              ...
          }

      or:

          push @INC, [ \&my_sub, $x, $y, ... ];
          sub my_sub {
              my ($arrayref, $filename) = @_;
              # Retrieve $x, $y, ...
              my @parameters = @$arrayref[1..$#$arrayref];
              ...
          }

      If the hook is an object, it must provide an INC method that will be called as above, the first parameter being the object itself. (Note that you must fully qualify the sub's name, as unqualified INC is always forced into package main.) Here is a typical code layout:

          # In Foo.pm
          package Foo;
          sub new { ... }
          sub Foo::INC {
              my ($self, $filename) = @_;
              ...
          }

          # In the main program
          push @INC, Foo->new(...);

      These hooks are also permitted to set the %INC entry corresponding to the files they have loaded. See %INC in perlvar.
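
      As a sketch of the subroutine-reference form, the hook below serves a module from a string held in memory. Everything named here is invented for the example: the %virtual_file table, the My::Virtual package, and its greeting function. It also assumes that an in-memory filehandle opened on a scalar reference is acceptable as the returned filehandle.

```perl
use strict;
use warnings;

# Hypothetical in-memory "files", keyed by the name require asks for.
my %virtual_file = (
    "My/Virtual.pm" => <<'END_SOURCE',
package My::Virtual;
sub greeting { return "hello from memory" }
1;
END_SOURCE
);

push @INC, sub {
    my ($self, $filename) = @_;
    my $source = $virtual_file{$filename};

    # Returning nothing tells require to keep looking through @INC.
    return unless defined $source;

    # Hand back a filehandle from which the source will be read.
    open(my $fh, '<', \$source)
        or die "Cannot open in-memory file for $filename: $!";
    return $fh;
};

require My::Virtual;
print My::Virtual::greeting(), "\n";
```

      Because the hook is pushed onto the end of @INC, real directories are searched first and the hook is consulted only when no file on disk matches.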

      For a yet-more-powerful import facility, see use and perlmod.

     

    reset

    • reset EXPR

    • reset

      Generally used in a continue block at the end of a loop to clear variables and reset ?? searches so that they work again. The expression is interpreted as a list of single characters (hyphens allowed for ranges). All variables and arrays beginning with one of those letters are reset to their pristine state. If the expression is omitted, one-match searches (?pattern?) are reset to match again. Only resets variables or searches in the current package. Always returns 1. Examples:

          reset 'X';      # reset all X variables
          reset 'a-z';    # reset lower case variables
          reset;          # just reset ?one-time? searches

      Resetting "A-Z" is not recommended because you'll wipe out your @ARGV and @INC arrays and your %ENV hash. Resets only package variables; lexical variables are unaffected, but they clean themselves up on scope exit anyway, so you'll probably want to use them instead. See my.

     

    return

    • return EXPR

    • return

      Returns from a subroutine, eval, or do FILE with the value given in EXPR. Evaluation of EXPR may be in list, scalar, or void context, depending on how the return value will be used, and the context may vary from one execution to the next (see wantarray). If no EXPR is given, returns an empty list in list context, the undefined value in scalar context, and (of course) nothing at all in void context.

      (In the absence of an explicit return, a subroutine, eval, or do FILE automatically returns the value of the last expression evaluated.)

      Unlike most named operators, this is also exempt from the looks-like-a-function rule, so return ("foo")."bar" will cause "bar" to be part of the argument to return.
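
      The context rules above can be seen in a small sketch. The find_even function is hypothetical and exists only to show a bare return next to a context-sensitive one.

```perl
use strict;
use warnings;

# Hypothetical function: returns the even numbers among its arguments.
sub find_even {
    my @even = grep { $_ % 2 == 0 } @_;

    # Bare return: empty list in list context, undef in scalar context.
    return unless @even;

    # wantarray tells us which context the caller supplied.
    return wantarray ? @even : scalar @even;
}

my @none  = find_even(1, 3, 5);    # empty list
my $undef = find_even(1, 3, 5);    # undefined value
my @found = find_even(1, 2, 4);    # the even numbers themselves
my $count = find_even(1, 2, 4);    # how many were found

print "found @found ($count in total)\n";
```

      A bare return is usually safer than return undef, because return undef yields a one-element list in list context, and that list is true.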

     

    reverse

    • reverse LIST

      In list context, returns a list value consisting of the elements of LIST in the opposite order. In scalar context, concatenates the elements of LIST and returns a string value with all characters in the opposite order.

          print join(", ", reverse "world", "Hello");  # Hello, world
          print scalar reverse "dlrow ,", "olleH";     # Hello, world

      Used without arguments in scalar context, reverse() reverses $_.

          $_ = "dlrow ,olleH";
          print reverse;          # No output, list context
          print scalar reverse;   # Hello, world

      Note that reversing an array to itself (as in @a = reverse @a ) will preserve non-existent elements whenever possible; i.e., for non-magical arrays or for tied arrays with EXISTS and DELETE methods.

      This operator is also handy for inverting a hash, although there are some caveats. If a value is duplicated in the original hash, only one of those can be represented as a key in the inverted hash. Also, this has to unwind one hash and build a whole new one, which may take some time on a large hash, such as from a DBM file.

          %by_name = reverse %by_address;  # Invert the hash
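
      The duplicate-value caveat can be seen in a short sketch. The addresses and names are invented sample data, and %addresses_for is one possible way to keep every key when values repeat.

```perl
use strict;
use warnings;

my %by_address = (
    "10.0.0.1" => "alpha",
    "10.0.0.2" => "beta",
    "10.0.0.3" => "alpha",    # same value as 10.0.0.1
);

# Plain inversion: only one of the two "alpha" addresses survives,
# and which one depends on hash ordering.
my %by_name = reverse %by_address;

# Keeping every address: build a hash of array references instead.
my %addresses_for;
for my $address (sort keys %by_address) {
    push @{ $addresses_for{ $by_address{$address} } }, $address;
}

for my $name (sort keys %addresses_for) {
    print "$name: @{ $addresses_for{$name} }\n";
}
```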
     

    rewinddir

    • rewinddir DIRHANDLE

      Sets the current position to the beginning of the directory for the readdir routine on DIRHANDLE.

      Portability issues: rewinddir in perlport.

     

    rindex

    • rindex STR,SUBSTR,POSITION

    • rindex STR,SUBSTR

      Works just like index() except that it returns the position of the last occurrence of SUBSTR in STR. If POSITION is specified, returns the last occurrence beginning at or before that position.
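
      A brief sketch on an invented path string. Like index, rindex counts positions from zero and returns -1 when SUBSTR is not found.

```perl
use strict;
use warnings;

my $path = "/usr/local/lib/perl5";

# Position of the last slash in the string.
my $last = rindex($path, "/");

# Last slash at or before the position just left of that one.
my $previous = rindex($path, "/", $last - 1);

# No match at all.
my $missing = rindex($path, "#");

# Everything after the last slash.
my $basename = substr($path, $last + 1);

print "last=$last previous=$previous missing=$missing basename=$basename\n";
```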

     

    rmdir

    • rmdir FILENAME

    • rmdir

      Deletes the directory specified by FILENAME if that directory is empty. If it succeeds it returns true; otherwise it returns false and sets $! (errno). If FILENAME is omitted, uses $_.

      To remove a directory tree recursively (rm -rf on Unix) look at the rmtree function of the File::Path module.
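
      The difference between the two can be sketched inside a temporary directory. The directory names are made up for the example.

```perl
use strict;
use warnings;
use File::Path qw(mkpath rmtree);
use File::Temp qw(tempdir);

# Scratch area that is cleaned up when the program exits.
my $root = tempdir(CLEANUP => 1);

# An empty directory: rmdir succeeds.
my $empty = "$root/empty";
mkdir($empty) or die "mkdir $empty: $!";
rmdir($empty) or die "rmdir $empty: $!";

# A directory with contents: rmdir returns false and sets $!.
my $tree = "$root/tree";
mkpath("$tree/sub/dir");
my $removed = rmdir($tree);
print "rmdir refused: $!\n" unless $removed;

# rmtree removes the directory and everything below it.
rmtree($tree);
print "tree is gone\n" unless -d $tree;
```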

     

    s

     

    say

    • say FILEHANDLE LIST

    • say FILEHANDLE
    • say LIST
    • say

      Just like print, but implicitly appends a newline. say LIST is simply an abbreviation for { local $\ = "\n"; print LIST }. To use FILEHANDLE without a LIST to print the contents of $_ to it, you must use a real filehandle like FH, not an indirect one like $fh.

      This keyword is available only when the "say" feature is enabled, or when prefixed with CORE::; see feature. Alternatively, put a use v5.10 (or later) declaration in the current scope.
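
      A small sketch that compares say with the print idiom it abbreviates. Output goes to an in-memory filehandle so the result can be inspected; the variable names are arbitrary.

```perl
use strict;
use warnings;
use feature 'say';

my $output = "";
open(my $out, '>', \$output) or die "Cannot open in-memory handle: $!";

say {$out} "one";
say {$out} "two", "three";    # one newline, after the whole list

# The equivalent spelled out with print.
{
    local $\ = "\n";
    print {$out} "four";
}

close($out) or die "Cannot close in-memory handle: $!";
print $output;
```

      Note that the newline is added once per call, not once per list element.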

     

    scalar

    • scalar EXPR

      Forces EXPR to be interpreted in scalar context and returns the value of EXPR.

          @counts = ( scalar @a, scalar @b, scalar @c );

      There is no equivalent operator to force an expression to be interpolated in list context because in practice, this is never needed. If you really wanted to do so, however, you could use the construction @{[ (some expression) ]}, but usually a simple (some expression) suffices.

      Because scalar is a unary operator, if you accidentally use a parenthesized list for the EXPR, this behaves as a scalar comma expression, evaluating all but the last element in void context and returning the final element evaluated in scalar context. This is seldom what you want.

      The following single statement:

          print uc(scalar(&foo,$bar)),$baz;

      is the moral equivalent of these two:

          &foo;
          print(uc($bar),$baz);

      See perlop for more details on unary operators and the comma operator.
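
      A short sketch of scalar next to the list-context construction mentioned above. The array contents are sample data.

```perl
use strict;
use warnings;

my @a = (1, 2, 3);
my @b = ();

# Without scalar, the arrays would be flattened into the list.
my @counts = (scalar @a, scalar @b);

# reverse behaves differently in scalar context: it reverses the string.
my $reversed = scalar reverse "abc";

# @{[ ... ]} evaluates an expression in list context inside a string.
my $doubled = "doubled: @{[ map { $_ * 2 } @a ]}";

print "counts: @counts\n";
print "$reversed\n";
print "$doubled\n";
```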

     

    seek

    • seek FILEHANDLE,POSITION,WHENCE

      Sets FILEHANDLE's position, just like the fseek call of stdio. FILEHANDLE may be an expression whose value gives the name of the filehandle. The values for WHENCE are 0 to set the new position in bytes to POSITION; 1 to set it to the current position plus POSITION; and 2 to set it to EOF plus POSITION, typically negative. For WHENCE you may use the constants SEEK_SET, SEEK_CUR, and SEEK_END (start of the file, current position, end of the file) from the Fcntl module. Returns 1 on success, false otherwise.

      Note the phrase "in bytes": even if the filehandle has been set to operate on characters (for example by using the :encoding(utf8) open layer), tell() will return byte offsets, not character offsets (because implementing that would render seek() and tell() rather slow).

      If you want to position the file for sysread or syswrite, don't use seek, because buffering makes its effect on the file's read-write position unpredictable and non-portable. Use sysseek instead.

      Due to the rules and rigors of ANSI C, on some systems you have to do a seek whenever you switch between reading and writing. Amongst other things, this may have the effect of calling stdio's clearerr(3). A WHENCE of 1 (SEEK_CUR) is useful for not moving the file position:

          seek(TEST,0,1);

      This is also useful for applications emulating tail -f. Once you hit EOF on your read and then sleep for a while, you (probably) have to stick in a dummy seek() to reset things. The seek doesn't change the position, but it does clear the end-of-file condition on the handle, so that the next <FILE> makes Perl try again to read something. (We hope.)

      If that doesn't work (some I/O implementations are particularly cantankerous), you might need something like this:

          for (;;) {
              for ($curpos = tell(FILE); $_ = <FILE>;
                   $curpos = tell(FILE)) {
                  # search for some stuff and put it into files
              }
              sleep($for_a_while);
              seek(FILE, $curpos, 0);
          }
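
      The three WHENCE values can be tried out on a scratch file. This is only a sketch: the file contents are arbitrary, and the temporary file comes from File::Temp.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR SEEK_END);
use File::Temp qw(tempfile);

# A scratch file opened for both reading and writing.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "0123456789";

# SEEK_SET: absolute position from the start of the file.
seek($fh, 0, SEEK_SET) or die "seek: $!";
read($fh, my $first, 3);

# SEEK_CUR: relative to the current position (skip two bytes).
seek($fh, 2, SEEK_CUR) or die "seek: $!";
read($fh, my $middle, 2);

# SEEK_END: relative to end of file, so the offset is negative.
seek($fh, -3, SEEK_END) or die "seek: $!";
read($fh, my $last, 3);

my $position = tell($fh);
print "first=$first middle=$middle last=$last position=$position\n";
```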
     

    seekdir

    • seekdir DIRHANDLE,POS

      Sets the current position for the readdir routine on DIRHANDLE. POS must be a value returned by telldir. seekdir also has the same caveats about possible directory compaction as the corresponding system library routine.

     

    select

    • select FILEHANDLE

    • select

      Returns the currently selected filehandle. If FILEHANDLE is supplied, sets the new current default filehandle for output. This has two effects: first, a write or a print without a filehandle defaults to this FILEHANDLE. Second, references to variables related to output will refer to this output channel.

      For example, to set the top-of-form format for more than one output channel, you might do the following:

          select(REPORT1);
          $^ = 'report1_top';
          select(REPORT2);
          $^ = 'report2_top';

      FILEHANDLE may be an expression whose value gives the name of the actual filehandle. Thus:

          $oldfh = select(STDERR); $| = 1; select($oldfh);

      Some programmers may prefer to think of filehandles as objects with methods, preferring to write the last example as:

          use IO::Handle;
          STDERR->autoflush(1);
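
      Since select returns the previously selected handle, the save-and-restore idiom can be wrapped in a helper. The with_selected function below is hypothetical, a sketch of one way to make sure the old handle is always put back.

```perl
use strict;
use warnings;

# Hypothetical helper: run a block with $fh as the default output
# handle, then restore whichever handle was selected before.
sub with_selected {
    my ($fh, $code) = @_;
    my $previous = select($fh);
    my $ok    = eval { $code->(); 1 };
    my $error = $@;
    select($previous);            # restore even if the block died
    die $error unless $ok;
    return;
}

my $buffer = "";
open(my $mem, '>', \$buffer) or die "Cannot open in-memory handle: $!";

# This print names no filehandle, so it goes to the selected one.
with_selected($mem, sub { print "captured\n" });

close($mem) or die "Cannot close in-memory handle: $!";
print "buffer holds: $buffer";
```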

      Portability issues: select in perlport.

    • select RBITS,WBITS,EBITS,TIMEOUT

      This calls the select(2) syscall with the bit masks specified, which can be constructed using fileno and vec, along these lines:

          $rin = $win = $ein = '';
          vec($rin, fileno(STDIN),  1) = 1;
          vec($win, fileno(STDOUT), 1) = 1;
          $ein = $rin | $win;

      If you want to select on many filehandles, you may wish to write a subroutine like this:

          sub fhbits {
              my @fhlist = @_;
              my $bits = "";
              for my $fh (@fhlist) {
                  vec($bits, fileno($fh), 1) = 1;
              }
              return $bits;
          }
          $rin = fhbits(*STDIN, *TTY, *MYSOCK);

      The usual idiom is:

          ($nfound,$timeleft) =
            select($rout=$rin, $wout=$win, $eout=$ein, $timeout);

      or, to block until something becomes ready, just do this:

          $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef);

      Most systems do not bother to return anything useful in $timeleft, so calling select() in scalar context just returns $nfound.

      Any of the bit masks can also be undef. The timeout, if specified, is in seconds, which may be fractional. Note: not all implementations are capable of returning the $timeleft. If not, they always return $timeleft equal to the supplied $timeout.

      You can effect a sleep of 250 milliseconds this way:

          select(undef, undef, undef, 0.25);

      Note that whether select gets restarted after signals (say, SIGALRM) is implementation-dependent. See also perlport for notes on the portability of select.

      On error, select behaves just like select(2): it returns -1 and sets $!.

      On some Unixes, select(2) may report a socket file descriptor as "ready for reading" even when no data is available, and thus any subsequent read would block. This can be avoided if you always use O_NONBLOCK on the socket. See select(2) and fcntl(2) for further details.

      The standard IO::Select module provides a user-friendlier interface to select, mostly because it does all the bit-mask work for you.
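
      For comparison with the bit-mask idiom shown earlier, here is a self-contained sketch that polls a pipe. The pipe and the "ping" payload are invented for the example, and sysread and syswrite are used so that no buffered I/O is mixed with select.

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe: $!";

# Build the read mask for the pipe's reading end.
my $rin = "";
vec($rin, fileno($reader), 1) = 1;

# Nothing has been written yet, so this should time out with 0 ready.
my $rout;
my $before = select($rout = $rin, undef, undef, 0.1);

syswrite($writer, "ping\n") or die "syswrite: $!";

# Now the reader should be reported as ready.
my $after = select($rout = $rin, undef, undef, 0.1);

my $data = "";
if ($after > 0 && vec($rout, fileno($reader), 1)) {
    sysread($reader, $data, 1024);
}

print "ready before write: $before, after write: $after\n";
print "read: $data";
```

      The mask is copied into $rout on every call because select overwrites its arguments with the result.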

      WARNING: One should not attempt to mix buffered I/O (like read or <FH>) with select, except as permitted by POSIX, and even then only on POSIX systems. You have to use sysread instead.

      Portability issues: select in perlport.

     

    semctl

    • semctl ID,SEMNUM,CMD,ARG

      Calls the System V IPC function semctl(2). You'll probably have to say

          use IPC::SysV;

      first to get the correct constant definitions. If CMD is IPC_STAT or GETALL, then ARG must be a variable that will hold the returned semid_ds structure or semaphore value array. Returns like ioctl: the undefined value for error, "0 but true" for zero, or the actual return value otherwise. The ARG must consist of a vector of native short integers, which may be created with pack("s!*",(0)x$nsem). See also SysV IPC in perlipc, and the IPC::SysV and IPC::Semaphore documentation.

      Portability issues: semctl in perlport.

     

    semget

    • semget KEY,NSEMS,FLAGS

      Calls the System V IPC function semget(2). Returns the semaphore id, or the undefined value on error. See also SysV IPC in perlipc, and the IPC::SysV and IPC::Semaphore documentation.

      Portability issues: semget in perlport.

     

    semop

    • semop KEY,OPSTRING

      Calls the System V IPC function semop(2) for semaphore operations such as signalling and waiting. OPSTRING must be a packed array of semop structures. Each semop structure can be generated with pack("s!3", $semnum, $semop, $semflag). The length of OPSTRING implies the number of semaphore operations. Returns true if successful, false on error. As an example, the following code waits on semaphore $semnum of semaphore id $semid:

          $semop = pack("s!3", $semnum, -1, 0);
          die "Semaphore trouble: $!\n" unless semop($semid, $semop);

      To signal the semaphore, replace -1 with 1. See also SysV IPC in perlipc, and the IPC::SysV and IPC::Semaphore documentation.
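
      Putting semget, semctl, and semop together, a complete wait-and-signal round trip might look like the sketch below. It assumes a platform with System V IPC support, and it uses a private semaphore set that it removes again at the end.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID S_IRUSR S_IWUSR
                 SETVAL GETVAL);

# Create a private set containing a single semaphore.
my $semid = semget(IPC_PRIVATE, 1, S_IRUSR | S_IWUSR | IPC_CREAT);
defined $semid or die "semget: $!";

# Start semaphore 0 at 1, meaning one unit is available.
semctl($semid, 0, SETVAL, 1) or die "semctl SETVAL: $!";

# Wait: decrement semaphore 0 by one.
my $wait = pack("s!3", 0, -1, 0);
semop($semid, $wait) or die "semop wait: $!";
my $after_wait = semctl($semid, 0, GETVAL, 0);      # "0 but true"

# Signal: increment semaphore 0 by one.
my $signal = pack("s!3", 0, 1, 0);
semop($semid, $signal) or die "semop signal: $!";
my $after_signal = semctl($semid, 0, GETVAL, 0);

# Remove the set so it does not outlive the program.
semctl($semid, 0, IPC_RMID, 0) or die "semctl IPC_RMID: $!";

printf "after wait: %d, after signal: %d\n", $after_wait, $after_signal;
```

      System V semaphores persist until they are removed explicitly, so real code should arrange for the IPC_RMID call to run even when an earlier step fails.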

      Portability issues: semop in perlport.

     

    send

    • send SOCKET,MSG,FLAGS,TO

    • send SOCKET,MSG,FLAGS

      Sends a message on a socket. Attempts to send the scalar MSG to the SOCKET filehandle. Takes the same flags as the system call of the same name. On unconnected sockets, you must specify a destination to send to, in which case it does a sendto(2) syscall. Returns the number of characters sent, or the undefined value on error. The sendmsg(2) syscall is currently unimplemented. See UDP: Message Passing in perlipc for examples.

      Note the word "characters": depending on the status of the socket, either (8-bit) bytes or characters are sent. By default all sockets operate on bytes, but for example if the socket has been changed using binmode() to operate with the :encoding(utf8) I/O layer (see open, or the open pragma), the I/O will operate on UTF-8 encoded Unicode characters, not bytes. Similarly for the :encoding pragma: in that case pretty much any characters can be sent.
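
      A minimal sketch of send on a connected socket, using a socketpair so that no network setup is needed. It assumes a platform that supports Unix-domain datagram socket pairs; the message text is arbitrary.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_DGRAM PF_UNSPEC);

# Two connected datagram sockets; no destination address is needed.
socketpair(my $left, my $right, AF_UNIX, SOCK_DGRAM, PF_UNSPEC)
    or die "socketpair: $!";

my $message = "hello";

# send returns the number of characters sent, or undef on error.
my $sent = send($left, $message, 0);
defined $sent or die "send: $!";

my $received = "";
defined recv($right, $received, 1024, 0) or die "recv: $!";

print "sent $sent, received '$received'\n";
```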

     

    setgrent

    • setgrent
     

    sethostent

    • sethostent STAYOPEN
     

    setnetent

    • setnetent STAYOPEN
     

    setpgrp

    • setpgrp PID,PGRP

      Sets the current process group for the specified PID, 0 for the current process. Raises an exception when used on a machine that doesn't implement POSIX setpgid(2) or BSD setpgrp(2). If the arguments are omitted, it defaults to 0,0. Note that the BSD 4.2 version of setpgrp does not accept any arguments, so only setpgrp(0,0) is portable. See also POSIX::setsid().

      Portability issues: setpgrp in perlport.

     

    setpriority

    • setpriority WHICH,WHO,PRIORITY

      Sets the current priority for a process, a process group, or a user. (See setpriority(2).) Raises an exception when used on a machine that doesn't implement setpriority(2).

      Portability issues: setpriority in perlport.

     

    setprotoent

    • setprotoent STAYOPEN
     

    setpwent

    • setpwent
     

    setservent

    • setservent STAYOPEN
     

    setsockopt

    • setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL

      Sets the socket option requested. Returns undef on error. Use integer constants provided by the Socket module for LEVEL and OPTNAME. Values for LEVEL can also be obtained from getprotobyname. OPTVAL might either be a packed string or an integer. An integer OPTVAL is shorthand for pack("i", OPTVAL).

      An example disabling Nagle's algorithm on a socket:

          use Socket qw(IPPROTO_TCP TCP_NODELAY);
          setsockopt($socket, IPPROTO_TCP, TCP_NODELAY, 1);

      Portability issues: setsockopt in perlport.

     

    shift

    • shift ARRAY

    • shift EXPR
    • shift

      Shifts the first value of the array off and returns it, shortening the array by 1 and moving everything down. If there are no elements in the array, returns the undefined value. If ARRAY is omitted, shifts the @_ array within the lexical scope of subroutines and formats, and the @ARGV array outside a subroutine and also within the lexical scopes established by the eval STRING, BEGIN {}, INIT {}, CHECK {}, UNITCHECK {}, and END {} constructs.

      Starting with Perl 5.14, shift can take a scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of shift is considered highly experimental. The exact behaviour may change in a future version of Perl.

      To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

          use 5.014;  # so push/pop/etc work on scalars (experimental)

      See also unshift, push, and pop. shift and unshift do the same thing to the left end of an array that pop and push do to the right end.
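
      The common uses of shift can be sketched briefly. The greet function and the sample data are invented; the last lines show an explicit dereference, which does not rely on the experimental scalar form.

```perl
use strict;
use warnings;

# Hypothetical sub: inside a subroutine, shift works on @_ by default.
sub greet {
    my $name     = shift;               # same as shift(@_)
    my $greeting = shift // "Hello";    # undef when @_ has run out
    return "$greeting, $name";
}

my @queue = qw(first second third);
my $head  = shift @queue;               # @queue is now two elements long

my @empty;
my $nothing = shift @empty;             # the undefined value

# Explicit dereference of an array reference.
my $aref = [1, 2, 3];
my $one  = shift @$aref;

print greet("world"), "\n";
print "head=$head, rest=@queue, one=$one\n";
```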

     

    shmctl

    • shmctl ID,CMD,ARG

      Calls the System V IPC function shmctl. You'll probably have to say

          use IPC::SysV;

      first to get the correct constant definitions. If CMD is IPC_STAT , then ARG must be a variable that will hold the returned shmid_ds structure. Returns like ioctl: undef for error; "0 but true" for zero; and the actual return value otherwise. See also SysV IPC in perlipc and IPC::SysV documentation.

      Portability issues: shmctl in perlport.

     
    shmget

    • shmget KEY,SIZE,FLAGS

      Calls the System V IPC function shmget. Returns the shared memory segment id, or undef on error. See also SysV IPC in perlipc and IPC::SysV documentation.

      Portability issues: shmget in perlport.

     
    shmread

    • shmread ID,VAR,POS,SIZE

     
    shmwrite

    • shmwrite ID,STRING,POS,SIZE

      Reads or writes the System V shared memory segment ID starting at position POS for size SIZE by attaching to it, copying in/out, and detaching from it. When reading, VAR must be a variable that will hold the data read. When writing, if STRING is too long, only SIZE bytes are used; if STRING is too short, nulls are written to fill out SIZE bytes. Returns true if successful, false on error. shmread() taints the variable. See also SysV IPC in perlipc, IPC::SysV, and the IPC::Shareable module from CPAN.

      Portability issues: shmread in perlport and shmwrite in perlport.
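The four shared-memory calls are usually used together. The following is a minimal sketch, assuming a platform with System V IPC available; the segment size, permissions, and message are arbitrary illustration values.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID);

# Create a private 32-byte segment readable and writable by the owner only.
my $size = 32;
my $id   = shmget(IPC_PRIVATE, $size, IPC_CREAT | 0600);
defined $id or die "shmget: $!";

# Write a short string at offset 0, then read the same bytes back.
my $msg = "hello";
shmwrite($id, $msg, 0, length $msg) or die "shmwrite: $!";

my $buf;
shmread($id, $buf, 0, length $msg) or die "shmread: $!";

# Mark the segment for removal so it does not outlive the script.
shmctl($id, IPC_RMID, 0) or die "shmctl: $!";

print "read back: $buf\n";
```

Removing the segment with IPC_RMID matters because System V segments persist after the creating process exits.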

     
    shutdown

    • shutdown SOCKET,HOW

      Shuts down a socket connection in the manner indicated by HOW, which has the same interpretation as in the syscall of the same name.

          shutdown(SOCKET, 0);    # I/we have stopped reading data
          shutdown(SOCKET, 1);    # I/we have stopped writing data
          shutdown(SOCKET, 2);    # I/we have stopped using this socket

      This is useful with sockets when you want to tell the other side you're done writing but not done reading, or vice versa. It's also a more insistent form of close because it also disables the file descriptor in any forked copies in other processes.

      Returns 1 for success; on error, returns undef if the first argument is not a valid filehandle, or returns 0 and sets $! for any other failure.
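A minimal sketch of the "done writing but not done reading" case, assuming a platform that provides socketpair with AF_UNIX. Both ends live in one process here purely to keep the example short; the handle names and messages are hypothetical.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

# Send a request, then half-close: no more writes from $left.
defined(syswrite($left, "ping\n")) or die "syswrite: $!";
shutdown($left, 1) or die "shutdown: $!";

# The peer can read to end-of-file, which the half-close delivers.
my $request = do { local $/; <$right> };

# $left gave up writing only, so it can still read the reply.
defined(syswrite($right, "pong\n")) or die "syswrite: $!";
shutdown($right, 1) or die "shutdown: $!";
my $reply = do { local $/; <$left> };

print "request: $request";
print "reply:   $reply";
```

syswrite is used instead of print so that no data sits in Perl's output buffer when shutdown is called.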

     
    sin

    • sin EXPR

    • sin

      Returns the sine of EXPR (expressed in radians). If EXPR is omitted, returns the sine of $_.

      For the inverse sine operation, you may use the Math::Trig::asin function, or use this relation:

          sub asin { atan2($_[0], sqrt(1 - $_[0] * $_[0])) }
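As a quick check of that relation, the sketch below computes the sine of pi/6 and recovers the angle with the atan2-based asin. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Inverse sine via the atan2 relation; valid for arguments in [-1, 1].
sub asin { atan2($_[0], sqrt(1 - $_[0] * $_[0])) }

my $pi    = 4 * atan2(1, 1);
my $s     = sin($pi / 6);    # approximately 0.5
my $angle = asin($s);        # approximately pi/6 again

printf "sin(pi/6)       = %.6f\n", $s;
printf "asin(sin(pi/6)) = %.6f (pi/6 = %.6f)\n", $angle, $pi / 6;
```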
     
    sleep

    • sleep EXPR

    • sleep

      Causes the script to sleep for (integer) EXPR seconds, or forever if no argument is given. Returns the integer number of seconds actually slept.

      May be interrupted if the process receives a signal such as SIGALRM.

          eval {
              local $SIG{ALRM} = sub { die "Alarm!\n" };
              sleep;
          };
          die $@ unless $@ eq "Alarm!\n";

      You probably cannot mix alarm and sleep calls, because sleep is often implemented using alarm.

      On some older systems, it may sleep up to a full second less than what you requested, depending on how it counts seconds. Most modern systems always sleep the full amount. They may appear to sleep longer than that, however, because your process might not be scheduled right away in a busy multitasking system.

      For delays of finer granularity than one second, the Time::HiRes module (from CPAN, and starting from Perl 5.8 part of the standard distribution) provides usleep(). You may also use Perl's four-argument version of select() leaving the first three arguments undefined, or you might be able to use the syscall interface to access setitimer(2) if your system supports it. See perlfaq8 for details.

      See also the POSIX module's pause function.
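A minimal sketch of the two sub-second options mentioned above, Time::HiRes::usleep and four-argument select, assuming Time::HiRes is installed (it ships with Perl 5.8 and later). The 0.15-second delays are arbitrary.

```perl
use strict;
use warnings;
use Time::HiRes qw(time usleep);

my $start = time;                      # high-resolution wall-clock time

usleep(150_000);                       # 150,000 microseconds = 0.15 s
select(undef, undef, undef, 0.15);     # 0.15 s using four-argument select

my $elapsed = time - $start;
printf "slept for about %.2f seconds\n", $elapsed;
```

The measured time may be longer than 0.3 seconds on a busy system, for the scheduling reasons described above.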

     
    socket

    • socket SOCKET,DOMAIN,TYPE,PROTOCOL

      Opens a socket of the specified kind and attaches it to filehandle SOCKET. DOMAIN, TYPE, and PROTOCOL are specified the same as for the syscall of the same name. You should use Socket first to get the proper definitions imported. See the examples in Sockets: Client/Server Communication in perlipc.

      On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptor, as determined by the value of $^F. See $^F in perlvar.
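The sketch below opens a socket and inspects its close-on-exec flag. It assumes a Unix-like platform that supports PF_UNIX sockets and fcntl with F_GETFD; a local-domain socket is used only so that no network access is needed.

```perl
use strict;
use warnings;
use Socket qw(PF_UNIX SOCK_STREAM);
use Fcntl  qw(F_GETFD FD_CLOEXEC);

# Protocol 0 asks the system for the default protocol for this socket type.
socket(my $sock, PF_UNIX, SOCK_STREAM, 0) or die "socket: $!";

# Read the descriptor flags and test the close-on-exec bit.
my $flags = fcntl($sock, F_GETFD, 0);
defined $flags or die "fcntl: $!";
my $cloexec = ($flags & FD_CLOEXEC) ? 1 : 0;

printf "fd %d, close-on-exec is %s\n",
    fileno($sock), $cloexec ? "set" : "clear";

close($sock) or die "close: $!";
```

With the default value of $^F (2), descriptors above 2 are expected to report the flag as set.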

     
    socketpair

    • socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL

      Creates an unnamed pair of sockets in the specified domain, of the specified type. DOMAIN, TYPE, and PROTOCOL are specified the same as for the syscall of the same name. If unimplemented, raises an exception. Returns true if successful.

      On systems that support a close-on-exec flag on files, the flag will be set for the newly opened file descriptors, as determined by the value of $^F. See $^F in perlvar.

      Some systems defined pipe in terms of socketpair, in which a call to pipe(Rdr, Wtr) is essentially:

          use Socket;
          socketpair(Rdr, Wtr, AF_UNIX, SOCK_STREAM, PF_UNSPEC);
          shutdown(Rdr, 1);    # no more writing for reader
          shutdown(Wtr, 0);    # no more reading for writer

      See perlipc for an example of socketpair use. Perl 5.8 and later will emulate socketpair using IP sockets to localhost if your system implements sockets but not socketpair.

      Portability issues: socketpair in perlport.

     
    sort

    • sort SUBNAME LIST

    • sort BLOCK LIST
    • sort LIST

      In list context, this sorts the LIST and returns the sorted list value. In scalar context, the behaviour of sort() is undefined.

      If SUBNAME or BLOCK is omitted, sorts in standard string comparison order. If SUBNAME is specified, it gives the name of a subroutine that returns an integer less than, equal to, or greater than 0, depending on how the elements of the list are to be ordered. (The <=> and cmp operators are extremely useful in such routines.) SUBNAME may be a scalar variable name (unsubscripted), in which case the value provides the name of (or a reference to) the actual subroutine to use. In place of a SUBNAME, you can provide a BLOCK as an anonymous, in-line sort subroutine.

      If the subroutine's prototype is ($$), the elements to be compared are passed by reference in @_, as for a normal subroutine. This is slower than unprototyped subroutines, where the elements to be compared are passed into the subroutine as the package global variables $a and $b (see example below). Note that in the latter case, it is usually highly counter-productive to declare $a and $b as lexicals.

      If the subroutine is an XSUB, the elements to be compared are pushed on to the stack, the way arguments are usually passed to XSUBs. $a and $b are not set.

      The values to be compared are always passed by reference and should not be modified.

      You also cannot exit out of the sort block or subroutine using any of the loop control operators described in perlsyn or with goto.

      When use locale (but not use locale 'not_characters') is in effect, sort LIST sorts LIST according to the current collation locale. See perllocale.

      sort() returns aliases into the original list, much as a for loop's index variable aliases the list elements. That is, modifying an element of a list returned by sort() (for example, in a foreach, map or grep) actually modifies the element in the original list. This is usually something to be avoided when writing clear code.

      Perl 5.6 and earlier used a quicksort algorithm to implement sort. That algorithm was not stable, and could go quadratic. (A stable sort preserves the input order of elements that compare equal. Although quicksort's run time is O(NlogN) when averaged over all arrays of length N, the time can be O(N**2), quadratic behavior, for some inputs.) In 5.7, the quicksort implementation was replaced with a stable mergesort algorithm whose worst-case behavior is O(NlogN). But benchmarks indicated that for some inputs, on some platforms, the original quicksort was faster. 5.8 has a sort pragma for limited control of the sort. Its rather blunt control of the underlying algorithm may not persist into future Perls, but the ability to characterize the input or output in implementation independent ways quite probably will. See the sort pragma.

      Examples:

          # sort lexically
          @articles = sort @files;

          # same thing, but with explicit sort routine
          @articles = sort {$a cmp $b} @files;

          # now case-insensitively
          # (fc needs "use feature 'fc'", available from Perl 5.16)
          @articles = sort {fc($a) cmp fc($b)} @files;

          # same thing in reversed order
          @articles = sort {$b cmp $a} @files;

          # sort numerically ascending
          @articles = sort {$a <=> $b} @files;

          # sort numerically descending
          @articles = sort {$b <=> $a} @files;

          # this sorts the %age hash by value instead of key
          # using an in-line function
          @eldest = sort { $age{$b} <=> $age{$a} } keys %age;

          # sort using explicit subroutine name
          sub byage {
              $age{$a} <=> $age{$b};    # presuming numeric
          }
          @sortedclass = sort byage @class;

          sub backwards { $b cmp $a }
          @harry  = qw(dog cat x Cain Abel);
          @george = qw(gone chased yz Punished Axed);
          print sort @harry;
              # prints AbelCaincatdogx
          print sort backwards @harry;
              # prints xdogcatCainAbel
          print sort @george, 'to', @harry;
              # prints AbelAxedCainPunishedcatchaseddoggonetoxyz

          # inefficiently sort by descending numeric compare using
          # the first integer after the first = sign, or the
          # whole record case-insensitively otherwise
          my @new = sort {
              ($b =~ /=(\d+)/)[0] <=> ($a =~ /=(\d+)/)[0]
                                  ||
                          fc($a) cmp fc($b)
          } @old;

          # same thing, but much more efficiently;
          # we'll build auxiliary indices instead
          # for speed
          my (@nums, @caps);
          for (@old) {
              push @nums, ( /=(\d+)/ ? $1 : undef );
              push @caps, fc($_);
          }
          @new = @old[ sort {
                           $nums[$b] <=> $nums[$a]
                                     ||
                           $caps[$a] cmp $caps[$b]
                       } 0..$#old
                     ];

          # same thing, but without any temps
          @new = map { $_->[0] }
                 sort { $b->[1] <=> $a->[1]
                                ||
                        $a->[2] cmp $b->[2]
                 } map { [$_, /=(\d+)/, fc($_)] } @old;

          # using a prototype allows you to use any comparison subroutine
          # as a sort subroutine (including other package's subroutines)
          package other;
          sub backwards ($$) { $_[1] cmp $_[0]; }    # $a and $b are
                                                     # not set here
          package main;
          @new = sort other::backwards @old;

          # guarantee stability, regardless of algorithm
          use sort 'stable';
          @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old;

          # force use of mergesort (not portable outside Perl 5.8)
          use sort '_mergesort';    # note discouraging _
          @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old;

      Warning: syntactical care is required when sorting the list returned from a function. If you want to sort the list returned by the function call find_records(@key), you can use:

          @contact = sort { $a cmp $b } find_records @key;
          @contact = sort +find_records(@key);
          @contact = sort &find_records(@key);
          @contact = sort(find_records(@key));

      If instead you want to sort the array @key with the comparison routine find_records() then you can use:

          @contact = sort { find_records() } @key;
          @contact = sort find_records(@key);
          @contact = sort(find_records @key);
          @contact = sort(find_records (@key));

      If you're using strict, you must not declare $a and $b as lexicals. They are package globals. That means that if you're in the main package and type

          @articles = sort {$b <=> $a} @files;

      then $a and $b are $main::a and $main::b (or $::a and $::b), but if you're in the FooPack package, it's the same as typing

          @articles = sort {$FooPack::b <=> $FooPack::a} @files;

      The comparison function is required to behave. If it returns inconsistent results (sometimes saying $x[1] is less than $x[2] and sometimes saying the opposite, for example) the results are not well-defined.

      Because <=> returns undef when either operand is NaN (not-a-number), be careful when sorting with a comparison function like $a <=> $b any lists that might contain a NaN. The following example takes advantage of the fact that NaN != NaN to eliminate any NaNs from the input list.

          @result = sort { $a <=> $b } grep { $_ == $_ } @input;
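The following sketch shows two of the points above on concrete data: a comparison block with a tie-breaker, so that the result is fully determined, and the NaN filter. The %score hash is hypothetical, and producing NaN as infinity minus infinity assumes an IEEE 754 platform.

```perl
use strict;
use warnings;

# Hypothetical data for illustration.
my %score = (alice => 90, bob => 85, carol => 90, dave => 85);

# Highest score first; equal scores fall back to name order.
my @ranked = sort { $score{$b} <=> $score{$a} or $a cmp $b } keys %score;
print "@ranked\n";    # alice carol bob dave

# 9**9**9 overflows to infinity; infinity minus infinity is NaN.
my $inf = 9**9**9;
my $nan = $inf - $inf;

# NaN is the only value not equal to itself, so grep drops it.
my @input  = (3, $nan, 1, 2);
my @result = sort { $a <=> $b } grep { $_ == $_ } @input;
print "@result\n";    # 1 2 3
```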
     
    splice

    • splice ARRAY or EXPR,OFFSET,LENGTH,LIST

    • splice ARRAY or EXPR,OFFSET,LENGTH
    • splice ARRAY or EXPR,OFFSET
    • splice ARRAY or EXPR

      Removes the elements designated by OFFSET and LENGTH from an array, and replaces them with the elements of LIST, if any. In list context, returns the elements removed from the array. In scalar context, returns the last element removed, or undef if no elements are removed. The array grows or shrinks as necessary. If OFFSET is negative then it starts that far from the end of the array. If LENGTH is omitted, removes everything from OFFSET onward. If LENGTH is negative, removes the elements from OFFSET onward except for -LENGTH elements at the end of the array. If both OFFSET and LENGTH are omitted, removes everything. If OFFSET is past the end of the array, Perl issues a warning, and splices at the end of the array.

      The following equivalences hold (assuming $#a >= $i):

          push(@a,$x,$y)       splice(@a,@a,0,$x,$y)
          pop(@a)              splice(@a,-1)
          shift(@a)            splice(@a,0,1)
          unshift(@a,$x,$y)    splice(@a,0,0,$x,$y)
          $a[$i] = $y          splice(@a,$i,1,$y)
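A short worked sketch of the OFFSET and LENGTH rules, including negative values and the scalar-context return value. The array contents are arbitrary sample data.

```perl
use strict;
use warnings;

my @a = (1 .. 10);

# Positive OFFSET and LENGTH: remove three elements starting at index 2.
my @removed = splice(@a, 2, 3);    # removes (3, 4, 5)

# Negative OFFSET, no LENGTH: remove everything from two before the end.
my @tail = splice(@a, -2);         # removes (9, 10)

# Negative LENGTH: remove from index 1 onward, keeping one element at the end.
my @middle = splice(@a, 1, -1);    # removes (2, 6, 7); @a is now (1, 8)

# LENGTH 0 with a LIST: pure insertion, nothing removed.
splice(@a, 1, 0, 'x', 'y');        # @a is now (1, x, y, 8)
my $after_insert = "@a";

# Scalar context: returns the last element removed.
my $last = splice(@a, 1, 2);       # removes (x, y) and returns 'y'

print "removed=@removed tail=@tail middle=@middle\n";
print "after insert: $after_insert; last=$last; final: @a\n";
```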

      Example, assuming array lengths are passed before arrays:

          sub aeq {    # compare two list values
              my(@a) = splice(@_,0,shift);
              my(@b) = splice(@_,0,shift);
              return 0 unless @a == @b;    # same len?
              while (@a) {
                  return 0 if pop(@a) ne pop(@b);
              }
              return 1;
          }
          if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... }

      Starting with Perl 5.14, splice can take scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of splice is considered highly experimental. The exact behaviour may change in a future version of Perl.

      To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

          use 5.014;    # so push/pop/etc work on scalars (experimental)
     
    split

    • split /PATTERN/,EXPR,LIMIT

    • split /PATTERN/,EXPR
    • split /PATTERN/
    • split

      Splits the string EXPR into a list of strings and returns the list in list context, or the size of the list in scalar context.

      If only PATTERN is given, EXPR defaults to $_.

      Anything in EXPR that matches PATTERN is taken to be a separator that separates the EXPR into substrings (called "fields") that do not include the separator. Note that a separator may be longer than one character or even have no characters at all (the empty string, which is a zero-width match).

      The PATTERN need not be constant; an expression may be used to specify a pattern that varies at runtime.

      If PATTERN matches the empty string, the EXPR is split at the match position (between characters). As an example, the following:

          print join(':', split('b', 'abc')), "\n";

      uses the 'b' in 'abc' as a separator to produce the output 'a:c'. However, this:

          print join(':', split('', 'abc')), "\n";

      uses empty string matches as separators to produce the output 'a:b:c'; thus, the empty string may be used to split EXPR into a list of its component characters.

      As a special case for split, the empty pattern given in match operator syntax (//) specifically matches the empty string, which is contrary to its usual interpretation as the last successful match.

      If PATTERN is /^/, then it is treated as if it used the multiline modifier (/^/m), since it isn't much use otherwise.

      As another special case, split emulates the default behavior of the command line tool awk when the PATTERN is either omitted or a literal string composed of a single space character (such as ' ' or "\x20", but not e.g. / /). In this case, any leading whitespace in EXPR is removed before splitting occurs, and the PATTERN is instead treated as if it were /\s+/; in particular, this means that any contiguous whitespace (not just a single space character) is used as a separator. However, this special treatment can be avoided by specifying the pattern / / instead of the string " ", thereby allowing only a single space character to be a separator. In earlier Perls this special case was restricted to the use of a plain " " as the pattern argument to split; in Perl 5.18.0 and later it is triggered by any expression that evaluates to the simple string " ".

      If omitted, PATTERN defaults to a single space, " ", triggering the previously described awk emulation.
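To make the difference between the awk-style string ' ' and the pattern / / concrete, here is a minimal sketch on one sample line. The input string is arbitrary illustration data.

```perl
use strict;
use warnings;

my $line = "  foo  bar baz ";

# awk mode: leading whitespace dropped, runs of whitespace separate fields.
my @awk = split ' ', $line;

# Single-space pattern: every space is a separator, so empty fields appear.
# Trailing empty fields are still stripped because LIMIT is omitted.
my @single = split / /, $line;

print join('|', @awk),    "\n";    # foo|bar|baz
print join('|', @single), "\n";    # ||foo||bar|baz
```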

      If LIMIT is specified and positive, it represents the maximum number of fields into which the EXPR may be split; in other words, LIMIT is one greater than the maximum number of times EXPR may be split. Thus, the LIMIT value 1 means that EXPR may be split a maximum of zero times, producing a maximum of one field (namely, the entire value of EXPR). For instance:

          print join(':', split(//, 'abc', 1)), "\n";

      produces the output 'abc', and this:

          print join(':', split(//, 'abc', 2)), "\n";

      produces the output 'a:bc', and each of these:

          print join(':', split(//, 'abc', 3)), "\n";
          print join(':', split(//, 'abc', 4)), "\n";

      produces the output 'a:b:c'.

      If LIMIT is negative, it is treated as if it were instead arbitrarily large; as many fields as possible are produced.

      If LIMIT is omitted (or, equivalently, zero), then it is usually treated as if it were instead negative but with the exception that trailing empty fields are stripped (empty leading fields are always preserved); if all fields are empty, then all fields are considered to be trailing (and are thus stripped in this case). Thus, the following:

          print join(':', split(',', 'a,b,c,,,')), "\n";

      produces the output 'a:b:c', but the following:

          print join(':', split(',', 'a,b,c,,,', -1)), "\n";

      produces the output 'a:b:c:::'.

      In time-critical applications, it is worthwhile to avoid splitting into more fields than necessary. Thus, when assigning to a list, if LIMIT is omitted (or zero), then LIMIT is treated as though it were one larger than the number of variables in the list; for the following, LIMIT is implicitly 3:

          ($login, $passwd) = split(/:/);

      Note that splitting an EXPR that evaluates to the empty string always produces zero fields, regardless of the LIMIT specified.

      An empty leading field is produced when there is a positive-width match at the beginning of EXPR. For instance:

          print join(':', split(/ /, ' abc')), "\n";

      produces the output ':abc'. However, a zero-width match at the beginning of EXPR never produces an empty field, so that:

          print join(':', split(//, ' abc'));

      produces the output ' :a:b:c' (rather than ': :a:b:c').

      An empty trailing field, on the other hand, is produced when there is a match at the end of EXPR, regardless of the length of the match (of course, unless a non-zero LIMIT is given explicitly, such fields are removed, as in the last example). Thus:

          print join(':', split(//, ' abc', -1)), "\n";

      produces the output ' :a:b:c:'.

      If the PATTERN contains capturing groups, then for each separator, an additional field is produced for each substring captured by a group (in the order in which the groups are specified, as per backreferences); if any group does not match, then it captures the undef value instead of a substring. Also, note that any such additional field is produced whenever there is a separator (that is, whenever a split occurs), and such an additional field does not count towards the LIMIT. Consider the following expressions evaluated in list context (each returned list is provided in the associated comment):

          split(/-|,/, "1-10,20", 3)
          # ('1', '10', '20')

          split(/(-|,)/, "1-10,20", 3)
          # ('1', '-', '10', ',', '20')

          split(/-|(,)/, "1-10,20", 3)
          # ('1', undef, '10', ',', '20')

          split(/(-)|,/, "1-10,20", 3)
          # ('1', '-', '10', undef, '20')

          split(/(-)|(,)/, "1-10,20", 3)
          # ('1', '-', undef, '10', undef, ',', '20')
     
    sprintf

    • sprintf FORMAT, LIST

      Returns a string formatted by the usual printf conventions of the C library function sprintf. See below for more details and see sprintf(3) or printf(3) on your system for an explanation of the general principles.

      For example:

          # Format number with up to 8 leading zeroes
          $result = sprintf("%08d", $number);

          # Round number to 3 digits after decimal point
          $rounded = sprintf("%.3f", $number);

      Perl does its own sprintf formatting: it emulates the C function sprintf(3), but doesn't use it except for floating-point numbers, and even then only standard modifiers are allowed. Non-standard extensions in your local sprintf(3) are therefore unavailable from Perl.

      Unlike printf, sprintf does not do what you probably mean when you pass it an array as your first argument. The array is given scalar context, and instead of using the 0th element of the array as the format, Perl will use the count of elements in the array as the format, which is almost never useful.
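A minimal sketch of that pitfall and one way around it: pass the format as a separate scalar and the rest of the array as the list. The @args array is hypothetical illustration data.

```perl
use strict;
use warnings;

my @args = ("%s-%s", "a", "b");

# The array is evaluated in scalar context, so the format is "3".
my $wrong = sprintf(@args);

# Pass the format explicitly, then the remaining elements as the list.
my $right = sprintf($args[0], @args[1 .. $#args]);

print "wrong: $wrong\n";    # wrong: 3
print "right: $right\n";    # right: a-b
```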

      Perl's sprintf permits the following universally-known conversions:

          %%    a percent sign
          %c    a character with the given number
          %s    a string
          %d    a signed integer, in decimal
          %u    an unsigned integer, in decimal
          %o    an unsigned integer, in octal
          %x    an unsigned integer, in hexadecimal
          %e    a floating-point number, in scientific notation
          %f    a floating-point number, in fixed decimal notation
          %g    a floating-point number, in %e or %f notation

      In addition, Perl permits the following widely-supported conversions:

          %X    like %x, but using upper-case letters
          %E    like %e, but using an upper-case "E"
          %G    like %g, but with an upper-case "E" (if applicable)
          %b    an unsigned integer, in binary
          %B    like %b, but using an upper-case "B" with the # flag
          %p    a pointer (outputs the Perl value's address in hexadecimal)
          %n    special: *stores* the number of characters output so far
                into the next argument in the parameter list

      Finally, for backward (and we do mean "backward") compatibility, Perl permits these unnecessary but widely-supported conversions:

          %i    a synonym for %d
          %D    a synonym for %ld
          %U    a synonym for %lu
          %O    a synonym for %lo
          %F    a synonym for %f

      Note that the number of exponent digits in the scientific notation produced by %e, %E, %g and %G for numbers with the modulus of the exponent less than 100 is system-dependent: it may be three or less (zero-padded as necessary). In other words, 1.23 times ten to the 99th may be either "1.23e99" or "1.23e099".

      Between the % and the format letter, you may specify several additional attributes controlling the interpretation of the format. In order, these are:

      • format parameter index

        An explicit format parameter index, such as 2$. By default sprintf will format the next unused argument in the list, but this allows you to take the arguments out of order:

            printf '%2$d %1$d', 12, 34;        # prints "34 12"
            printf '%3$d %d %1$d', 1, 2, 3;    # prints "3 1 1"
      • flags

        one or more of:

            space   prefix non-negative number with a space
            +       prefix non-negative number with a plus sign
            -       left-justify within the field
            0       use zeros, not spaces, to right-justify
            #       ensure the leading "0" for any octal,
                    prefix non-zero hexadecimal with "0x" or "0X",
                    prefix non-zero binary with "0b" or "0B"

        For example:

            printf '<% d>',  12;    # prints "< 12>"
            printf '<%+d>',  12;    # prints "<+12>"
            printf '<%6s>',  12;    # prints "<    12>"
            printf '<%-6s>', 12;    # prints "<12    >"
            printf '<%06s>', 12;    # prints "<000012>"
            printf '<%#o>',  12;    # prints "<014>"
            printf '<%#x>',  12;    # prints "<0xc>"
            printf '<%#X>',  12;    # prints "<0XC>"
            printf '<%#b>',  12;    # prints "<0b1100>"
            printf '<%#B>',  12;    # prints "<0B1100>"

        When a space and a plus sign are given as the flags at once, a plus sign is used to prefix a positive number.

            printf '<%+ d>', 12;    # prints "<+12>"
            printf '<% +d>', 12;    # prints "<+12>"

        When the # flag and a precision are given in the %o conversion, the precision is incremented if it's necessary for the leading "0".

            printf '<%#.5o>', 012;       # prints "<00012>"
            printf '<%#.5o>', 012345;    # prints "<012345>"
            printf '<%#.0o>', 0;         # prints "<0>"
      • vector flag

        This flag tells Perl to interpret the supplied string as a vector of integers, one for each character in the string. Perl applies the format to each integer in turn, then joins the resulting strings with a separator (a dot . by default). This can be useful for displaying ordinal values of characters in arbitrary strings:

            printf "%vd", "AB\x{100}";           # prints "65.66.256"
            printf "version is v%vd\n", $^V;     # Perl's version

        Put an asterisk * before the v to override the string to use to separate the numbers:

            printf "address is %*vX\n", ":", $addr;    # IPv6 address
            printf "bits are %0*v8b\n", " ", $bits;    # random bitstring

        You can also explicitly specify the argument number to use for the join string using something like *2$v; for example:

            printf '%*4$vX %*4$vX %*4$vX',    # 3 IPv6 addresses
                   @addr[1..3], ":";
      • (minimum) width

        Arguments are usually formatted to be only as wide as required to display the given value. You can override the width by putting a number here, or get the width from the next argument (with *) or from a specified argument (e.g., with *2$):

            printf "<%s>", "a";          # prints "<a>"
            printf "<%6s>", "a";         # prints "<     a>"
            printf "<%*s>", 6, "a";      # prints "<     a>"
            printf '<%*2$s>', "a", 6;    # prints "<     a>"
            printf "<%2s>", "long";      # prints "<long>" (does not truncate)

        If a field width obtained through * is negative, it has the same effect as the - flag: left-justification.

      • precision, or maximum width

        You can specify a precision (for numeric conversions) or a maximum width (for string conversions) by specifying a . followed by a number. For floating-point formats except g and G, this specifies how many places right of the decimal point to show (the default being 6). For example:

            # these examples are subject to system-specific variation
            printf '<%f>', 1;       # prints "<1.000000>"
            printf '<%.1f>', 1;     # prints "<1.0>"
            printf '<%.0f>', 1;     # prints "<1>"
            printf '<%e>', 10;      # prints "<1.000000e+01>"
            printf '<%.1e>', 10;    # prints "<1.0e+01>"

        For "g" and "G", this specifies the maximum number of digits to show, including those prior to the decimal point and those after it; for example:

        1. # These examples are subject to system-specific variation.
        2. printf '<%g>', 1; # prints "<1>"
        3. printf '<%.10g>', 1; # prints "<1>"
        4. printf '<%g>', 100; # prints "<100>"
        5. printf '<%.1g>', 100; # prints "<1e+02>"
        6. printf '<%.2g>', 100.01; # prints "<1e+02>"
        7. printf '<%.5g>', 100.01; # prints "<100.01>"
        8. printf '<%.4g>', 100.01; # prints "<100>"

        For integer conversions, specifying a precision implies that the output of the number itself should be zero-padded to this width, where the 0 flag is ignored:

        printf '<%.6d>', 1;      # prints "<000001>"
        printf '<%+.6d>', 1;     # prints "<+000001>"
        printf '<%-10.6d>', 1;   # prints "<000001    >"
        printf '<%10.6d>', 1;    # prints "<    000001>"
        printf '<%010.6d>', 1;   # prints "<    000001>"
        printf '<%+10.6d>', 1;   # prints "<   +000001>"
        printf '<%.6x>', 1;      # prints "<000001>"
        printf '<%#.6x>', 1;     # prints "<0x000001>"
        printf '<%-10.6x>', 1;   # prints "<000001    >"
        printf '<%10.6x>', 1;    # prints "<    000001>"
        printf '<%010.6x>', 1;   # prints "<    000001>"
        printf '<%#10.6x>', 1;   # prints "<  0x000001>"

        For string conversions, specifying a precision truncates the string to fit the specified width:

        printf '<%.5s>', "truncated";   # prints "<trunc>"
        printf '<%10.5s>', "truncated"; # prints "<     trunc>"

        You can also get the precision from the next argument using .*:

        printf '<%.6x>', 1;       # prints "<000001>"
        printf '<%.*x>', 6, 1;    # prints "<000001>"

        If a precision obtained through * is negative, it counts as having no precision at all.

        printf '<%.*s>',  7, "string";   # prints "<string>"
        printf '<%.*s>',  3, "string";   # prints "<str>"
        printf '<%.*s>',  0, "string";   # prints "<>"
        printf '<%.*s>', -1, "string";   # prints "<string>"
        printf '<%.*d>',  1, 0;          # prints "<0>"
        printf '<%.*d>',  0, 0;          # prints "<>"
        printf '<%.*d>', -1, 0;          # prints "<0>"

        You cannot currently get the precision from a specified number, but it is intended that this will be possible in the future, for example using .*2$:

        printf '<%.*2$x>', 1, 6;   # INVALID, but in future will print
                                   # "<000001>"

      • size

        For numeric conversions, you can specify the size to interpret the number as using l, h, V, q, L, or ll. For integer conversions (d u o x X b i D U O), numbers are usually assumed to be whatever the default integer size is on your platform (usually 32 or 64 bits), but you can override this to use instead one of the standard C types, as supported by the compiler used to build Perl:

        hh          interpret integer as C type "char" or "unsigned
                    char" on Perl 5.14 or later
        h           interpret integer as C type "short" or
                    "unsigned short"
        j           interpret integer as C type "intmax_t" on Perl
                    5.14 or later, and only with a C99 compiler
                    (unportable)
        l           interpret integer as C type "long" or
                    "unsigned long"
        q, L, or ll interpret integer as C type "long long",
                    "unsigned long long", or "quad" (typically
                    64-bit integers)
        t           interpret integer as C type "ptrdiff_t" on Perl
                    5.14 or later
        z           interpret integer as C type "size_t" on Perl 5.14
                    or later

        As of 5.14, none of these raises an exception if they are not supported on your platform. However, if warnings are enabled, a warning of the printf warning class is issued on an unsupported conversion flag. Should you instead prefer an exception, do this:

        use warnings FATAL => "printf";

        If you would like to know about a version dependency before you start running the program, put something like this at its top:

        use 5.014;  # for hh/j/t/z/ printf modifiers

        You can find out whether your Perl supports quads via Config:

        use Config;
        if ($Config{use64bitint} eq "define"
            || $Config{longsize} >= 8) {
            print "Nice quads!\n";
        }

        For floating-point conversions (e f g E F G), numbers are usually assumed to be the default floating-point size on your platform (double or long double), but you can force "long double" with q, L, or ll if your platform supports them. You can find out whether your Perl supports long doubles via Config:

        use Config;
        print "long doubles\n" if $Config{d_longdbl} eq "define";

        You can find out whether Perl considers "long double" to be the default floating-point size to use on your platform via Config:

        use Config;
        if ($Config{uselongdouble} eq "define") {
            print "long doubles by default\n";
        }

        It can also be that long doubles and doubles are the same thing:

        use Config;
        ($Config{doublesize} == $Config{longdblsize}) &&
            print "doubles are long doubles\n";

        The size specifier V has no effect for Perl code, but is supported for compatibility with XS code. It means "use the standard size for a Perl integer or floating-point number", which is the default.

      • order of arguments

        Normally, sprintf() takes the next unused argument as the value to format for each format specification. If the format specification uses * to require additional arguments, these are consumed from the argument list in the order they appear in the format specification before the value to format. Where an argument is specified by an explicit index, this does not affect the normal order for the arguments, even when the explicitly specified index would have been the next argument.

        So:

        printf "<%*.*s>", $a, $b, $c;

        uses $a for the width, $b for the precision, and $c as the value to format; while:

        printf '<%*1$.*s>', $a, $b;

        would use $a for the width and precision, and $b as the value to format.

        Here are some more examples; be aware that when using an explicit index, the $ may need escaping:

        printf "%2\$d %d\n", 12, 34;         # will print "34 12\n"
        printf "%2\$d %d %d\n", 12, 34;      # will print "34 12 34\n"
        printf "%3\$d %d %d\n", 12, 34, 56;  # will print "56 12 34\n"
        printf "%2\$*3\$d %d\n", 12, 34, 3;  # will print " 34 12\n"

      If use locale (including use locale 'not_characters') is in effect and POSIX::setlocale() has been called, the character used for the decimal separator in formatted floating-point numbers is affected by the LC_NUMERIC locale. See perllocale and POSIX.

     

    sqrt

    • sqrt EXPR

    • sqrt

      Return the positive square root of EXPR. If EXPR is omitted, uses $_. Works only for non-negative operands unless you've loaded the Math::Complex module.

      use Math::Complex;
      print sqrt(-4);    # prints 2i
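When Math::Complex is not loaded, one way to cope with possibly negative input is to check the operand first. The following is a minimal sketch; real_sqrt is a hypothetical helper name, not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns undef for a negative operand instead of
# letting the built-in sqrt raise an exception.
sub real_sqrt {
    my ($n) = @_;
    return $n < 0 ? undef : sqrt $n;
}

my $ok  = real_sqrt(16);
my $bad = real_sqrt(-4);
printf "%s %s\n", $ok, defined $bad ? $bad : 'undef';   # prints "4 undef"
```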
     

    srand

    • srand EXPR

    • srand

      Sets and returns the random number seed for the rand operator.

      The point of the function is to "seed" the rand function so that rand can produce a different sequence each time you run your program. When called with a parameter, srand uses that for the seed; otherwise it (semi-)randomly chooses a seed. In either case, starting with Perl 5.14, it returns the seed. To signal that your code will work only on Perls of a recent vintage:

      use 5.014;  # so srand returns the seed

      If srand() is not called explicitly, it is called implicitly without a parameter at the first use of the rand operator. However, there are a few situations where programs are likely to want to call srand. One is for generating predictable results, generally for testing or debugging. There, you use srand($seed), with the same $seed each time. Another case is that you may want to call srand() after a fork() to avoid child processes sharing the same seed value as the parent (and consequently each other).

      Do not call srand() (i.e., without an argument) more than once per process. The internal state of the random number generator should contain more entropy than can be provided by any seed, so calling srand() again actually loses randomness.

      Most implementations of srand take an integer and will silently truncate decimal numbers. This means srand(42) will usually produce the same results as srand(42.1). To be safe, always pass srand an integer.

      A typical use of the returned seed is for a test program which has too many combinations to test comprehensively in the time available to it each run. It can test a random subset each time, and should there be a failure, log the seed used for that run so that it can later be used to reproduce the same results.
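The seed-logging idea can be sketched as follows. The variable names are illustrative; a real test program would write the seed to its log rather than just printing it.

```perl
use strict;
use warnings;
use 5.014;   # srand returns the seed

my $seed  = srand();                        # let Perl pick a seed; keep it
my @first = map { int rand 1000 } 1 .. 5;

srand($seed);                               # replay with the logged seed
my @again = map { int rand 1000 } 1 .. 5;

print "seed=$seed\n";
print "reproduced\n" if "@first" eq "@again";
```

Note that srand() without an argument is called only once here; the second call passes the recorded seed explicitly.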

      rand() is not cryptographically secure. You should not rely on it in security-sensitive situations. As of this writing, a number of third-party CPAN modules offer random number generators intended by their authors to be cryptographically secure, including: Data::Entropy, Crypt::Random, Math::Random::Secure, and Math::TrulyRandom.

     

    stat

    • stat FILEHANDLE

    • stat EXPR
    • stat DIRHANDLE
    • stat

      Returns a 13-element list giving the status info for a file, either the file opened via FILEHANDLE or DIRHANDLE, or named by EXPR. If EXPR is omitted, it stats $_ (not _!). Returns the empty list if stat fails. Typically used as follows:

      ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
       $atime,$mtime,$ctime,$blksize,$blocks)
          = stat($filename);

      Not all fields are supported on all filesystem types. Here are the meanings of the fields:

       0 dev      device number of filesystem
       1 ino      inode number
       2 mode     file mode (type and permissions)
       3 nlink    number of (hard) links to the file
       4 uid      numeric user ID of file's owner
       5 gid      numeric group ID of file's owner
       6 rdev     the device identifier (special files only)
       7 size     total size of file, in bytes
       8 atime    last access time in seconds since the epoch
       9 mtime    last modify time in seconds since the epoch
      10 ctime    inode change time in seconds since the epoch (*)
      11 blksize  preferred I/O size in bytes for interacting with the
                  file (may vary from file to file)
      12 blocks   actual number of system-specific blocks allocated
                  on disk (often, but not always, 512 bytes each)

      (The epoch was at 00:00 January 1, 1970 GMT.)

      (*) Not all fields are supported on all filesystem types. Notably, the ctime field is non-portable. In particular, you cannot expect it to be a "creation time"; see Files and Filesystems in perlport for details.

      If stat is passed the special filehandle consisting of an underline, no stat is done, but the current contents of the stat structure from the last stat, lstat, or filetest are returned. Example:

      if (-x $file && (($d) = stat(_)) && $d < 0) {
          print "$file is executable NFS file\n";
      }

      (This works only on machines for which the device number is negative under NFS.)

      Because the mode contains both the file type and its permissions, you should mask off the file type portion and (s)printf using a "%o" if you want to see the real permissions.

      $mode = (stat($filename))[2];
      printf "Permissions are %04o\n", $mode & 07777;

      In scalar context, stat returns a boolean value indicating success or failure, and, if successful, sets the information associated with the special filehandle _.
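A minimal sketch of the scalar-context form combined with the special _ filehandle follows. It assumes the File::Temp module is available to create a scratch file; the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "hello";
close $fh or die "close failed: $!";

my ($size, $nfields);
if (stat $path) {              # boolean context: true on success, fills _
    $size = -s _;              # reuses the cached stat data, no second stat
    my @cached = stat(_);      # the same 13 fields, again without a new stat
    $nfields = @cached;
    print "size=$size fields=$nfields\n";   # prints "size=5 fields=13"
}
```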

      The File::stat module provides a convenient, by-name access mechanism:

      use File::stat;
      $sb = stat($filename);
      printf "File is %s, size is %s, perm %04o, mtime %s\n",
          $filename, $sb->size, $sb->mode & 07777,
          scalar localtime $sb->mtime;

      You can import symbolic mode constants (S_IF*) and functions (S_IS*) from the Fcntl module:

      use Fcntl ':mode';
      $mode = (stat($filename))[2];
      $user_rwx      = ($mode & S_IRWXU) >> 6;
      $group_read    = ($mode & S_IRGRP) >> 3;
      $other_execute =  $mode & S_IXOTH;
      printf "Permissions are %04o\n", S_IMODE($mode);
      $is_setuid     =  $mode & S_ISUID;
      $is_directory  =  S_ISDIR($mode);

      You could write the last two using the -u and -d operators. Commonly available S_IF* constants are:

      # Permissions: read, write, execute, for user, group, others.
      S_IRWXU S_IRUSR S_IWUSR S_IXUSR
      S_IRWXG S_IRGRP S_IWGRP S_IXGRP
      S_IRWXO S_IROTH S_IWOTH S_IXOTH

      # Setuid/Setgid/Stickiness/SaveText.
      # Note that the exact meaning of these is system-dependent.
      S_ISUID S_ISGID S_ISVTX S_ISTXT

      # File types. Not all are necessarily available on
      # your system.
      S_IFREG S_IFDIR S_IFLNK S_IFBLK S_IFCHR
      S_IFIFO S_IFSOCK S_IFWHT S_ENFMT

      # The following are compatibility aliases for S_IRUSR,
      # S_IWUSR, and S_IXUSR.
      S_IREAD S_IWRITE S_IEXEC

      and the S_IF* functions are

      S_IMODE($mode)    the part of $mode containing the permission
                        bits and the setuid/setgid/sticky bits

      S_IFMT($mode)     the part of $mode containing the file type
                        which can be bit-anded with (for example)
                        S_IFREG or with the following functions

      # The operators -f, -d, -l, -b, -c, -p, and -S.
      S_ISREG($mode) S_ISDIR($mode) S_ISLNK($mode)
      S_ISBLK($mode) S_ISCHR($mode) S_ISFIFO($mode) S_ISSOCK($mode)

      # No direct -X operator counterpart, but for the first one
      # the -g operator is often equivalent. The ENFMT stands for
      # record flocking enforcement, a platform-dependent feature.
      S_ISENFMT($mode) S_ISWHT($mode)

      See your native chmod(2) and stat(2) documentation for more details about the S_* constants. To get status info for a symbolic link instead of the target file behind the link, use the lstat function.

      Portability issues: stat in perlport.

     

    state

    • state EXPR

    • state TYPE EXPR
    • state EXPR : ATTRS
    • state TYPE EXPR : ATTRS

      state declares a lexically scoped variable, just like my. However, those variables will never be reinitialized, contrary to lexical variables that are reinitialized each time their enclosing block is entered. See Persistent Private Variables in perlsub for details.

      state variables are enabled only when the use feature "state" pragma is in effect, unless the keyword is written as CORE::state. See also feature.
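As a small illustration of the difference from my, here is a sketch of a counter that keeps its value between calls. The subroutine name next_id is hypothetical.

```perl
use strict;
use warnings;
use feature 'state';

sub next_id {
    state $n = 0;     # initialized once, on the first call only
    return ++$n;
}

my @ids = map { next_id() } 1 .. 3;
print "@ids\n";       # prints "1 2 3"
```

Had $n been declared with my, every call would have returned 1.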

     

    study

    • study SCALAR

    • study

      Takes extra time to study SCALAR ($_ if unspecified) in anticipation of doing many pattern matches on the string before it is next modified. This may or may not save time, depending on the nature and number of patterns you are searching and the distribution of character frequencies in the string to be searched; you probably want to compare run times with and without it to see which is faster. Those loops that scan for many short constant strings (including the constant parts of more complex patterns) will benefit most. (The way study works is this: a linked list of every character in the string to be searched is made, so we know, for example, where all the 'k' characters are. From each search string, the rarest character is selected, based on some static frequency tables constructed from some C programs and English text. Only those places that contain this "rarest" character are examined.)

      For example, here is a loop that inserts index producing entries before any line containing a certain pattern:

      while (<>) {
          study;
          print ".IX foo\n"    if /\bfoo\b/;
          print ".IX bar\n"    if /\bbar\b/;
          print ".IX blurfl\n" if /\bblurfl\b/;
          # ...
          print;
      }

      In searching for /\bfoo\b/, only locations in $_ that contain f will be looked at, because f is rarer than o. In general, this is a big win except in pathological cases. The only question is whether it saves you more time than it took to build the linked list in the first place.

      Note that if you have to look for strings that you don't know till runtime, you can build an entire loop as a string and eval that to avoid recompiling all your patterns all the time. Together with undefining $/ to input entire files as one record, this can be quite fast, often faster than specialized programs like fgrep(1). The following scans a list of files (@files) for a list of words (@words), and prints out the names of those files that contain a match:

      $search = 'while (<>) { study;';
      foreach $word (@words) {
          $search .= "++\$seen{\$ARGV} if /\\b$word\\b/;\n";
      }
      $search .= "}";
      @ARGV = @files;
      undef $/;
      eval $search;        # this screams
      $/ = "\n";           # put back to normal input delimiter
      foreach $file (sort keys(%seen)) {
          print $file, "\n";
      }
     

    sub

    • sub NAME BLOCK

    • sub NAME (PROTO) BLOCK
    • sub NAME : ATTRS BLOCK
    • sub NAME (PROTO) : ATTRS BLOCK

      This is a subroutine definition, not a real function per se. Without a BLOCK it's just a forward declaration. Without a NAME, it's an anonymous function declaration, so it does return a value: the CODE ref of the closure just created.

      See perlsub and perlref for details about subroutines and references; see attributes and Attribute::Handlers for more information about attributes.
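The three forms described above (forward declaration, named definition, anonymous sub) can be sketched together. The name make_adder is purely illustrative.

```perl
use strict;
use warnings;

sub make_adder;                 # forward declaration: no BLOCK

sub make_adder {                # named definition
    my ($n) = @_;
    return sub { $_[0] + $n };  # anonymous sub: yields a CODE ref (a closure)
}

my $add5 = make_adder(5);
print ref($add5), " ", $add5->(10), "\n";   # prints "CODE 15"
```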

     

    substr

    • substr EXPR,OFFSET,LENGTH,REPLACEMENT

    • substr EXPR,OFFSET,LENGTH
    • substr EXPR,OFFSET

      Extracts a substring out of EXPR and returns it. First character is at offset zero. If OFFSET is negative, starts that far back from the end of the string. If LENGTH is omitted, returns everything through the end of the string. If LENGTH is negative, leaves that many characters off the end of the string.

      my $s = "The black cat climbed the green tree";
      my $color  = substr $s, 4, 5;      # black
      my $middle = substr $s, 4, -11;    # black cat climbed the
      my $end    = substr $s, 14;        # climbed the green tree
      my $tail   = substr $s, -4;        # tree
      my $z      = substr $s, -4, 2;     # tr

      You can use the substr() function as an lvalue, in which case EXPR must itself be an lvalue. If you assign something shorter than LENGTH, the string will shrink, and if you assign something longer than LENGTH, the string will grow to accommodate it. To keep the string the same length, you may need to pad or chop your value using sprintf.
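The shrinking, growing, and padding behavior of lvalue substr() can be illustrated with a short sketch. The strings and variable names are made up for the example.

```perl
use strict;
use warnings;

my $s = "The black cat";
substr($s, 4, 5) = "red";                    # shorter than LENGTH: shrinks
my $shrunk = $s;                             # "The red cat"

substr($s, 4, 3) = "tortoiseshell";          # longer than LENGTH: grows
my $grown = $s;                              # "The tortoiseshell cat"

my $t = "The black cat";
substr($t, 4, 5) = sprintf "%-5.5s", "red";  # pad or chop to exactly 5
print "$shrunk\n$grown\n$t\n";
```

In the last assignment, the "%-5.5s" format pads the replacement with spaces (or would truncate a longer one), so $t keeps its original length of 13 characters.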

      If OFFSET and LENGTH specify a substring that is partly outside the string, only the part within the string is returned. If the substring is beyond either end of the string, substr() returns the undefined value and produces a warning. When used as an lvalue, specifying a substring that is entirely outside the string raises an exception. Here's an example showing the behavior for boundary cases:

      my $name = 'fred';
      substr($name, 4) = 'dy';         # $name is now 'freddy'
      my $null = substr $name, 6, 2;   # returns "" (no warning)
      my $oops = substr $name, 7;      # returns undef, with warning
      substr($name, 7) = 'gap';        # raises an exception

      An alternative to using substr() as an lvalue is to specify the replacement string as the 4th argument. This allows you to replace parts of the EXPR and return what was there before in one operation, just as you can with splice().

      my $s = "The black cat climbed the green tree";
      my $z = substr $s, 14, 7, "jumped from";    # climbed
      # $s is now "The black cat jumped from the green tree"

      Note that the lvalue returned by the three-argument version of substr() acts as a 'magic bullet'; each time it is assigned to, it remembers which part of the original string is being modified; for example:

      $x = '1234';
      for (substr($x,1,2)) {
          $_ = 'a';   print $x,"\n";    # prints 1a4
          $_ = 'xyz'; print $x,"\n";    # prints 1xyz4
          $x = '56789';
          $_ = 'pq';  print $x,"\n";    # prints 5pq9
      }

      With negative offsets, it remembers its position from the end of the string when the target string is modified:

      $x = '1234';
      for (substr($x, -3, 2)) {
          $_ = 'a';   print $x,"\n";    # prints 1a4, as above
          $x = 'abcdefg';
          print $_,"\n";                # prints f
      }

      Prior to Perl version 5.10, the result of using an lvalue multiple times was unspecified. Prior to 5.16, the result with negative offsets was unspecified.

     

    symlink

    • symlink OLDFILE,NEWFILE

      Creates a new filename symbolically linked to the old filename. Returns 1 for success, 0 otherwise. On systems that don't support symbolic links, raises an exception. To check for that, use eval:

      $symlink_exists = eval { symlink("",""); 1 };

      Portability issues: symlink in perlport.

     

    syscall

    • syscall NUMBER, LIST

      Calls the system call specified as the first element of the list, passing the remaining elements as arguments to the system call. If unimplemented, raises an exception. The arguments are interpreted as follows: if a given argument is numeric, the argument is passed as an int. If not, the pointer to the string value is passed. You are responsible for making sure a string is pre-extended long enough to receive any result that might be written into it. You can't use a string literal (or other read-only string) as an argument to syscall because Perl has to assume that any string pointer might be written through. If your integer arguments are not literals and have never been interpreted in a numeric context, you may need to add 0 to them to force them to look like numbers. This emulates the syswrite function (or vice versa):

      require 'syscall.ph';        # may need to run h2ph
      $s = "hi there\n";
      syscall(&SYS_write, fileno(STDOUT), $s, length $s);

      Note that Perl supports passing of up to only 14 arguments to your syscall, which in practice should (usually) suffice.

      syscall returns whatever value is returned by the system call it invokes. If the system call fails, syscall returns -1 and sets $! (errno). Note that some system calls can legitimately return -1. The proper way to handle such calls is to assign $!=0 before the call, then check the value of $! if syscall returns -1.

      There's a problem with syscall(&SYS_pipe): it returns the file number of the read end of the pipe it creates, but there is no way to retrieve the file number of the other end. You can avoid this problem by using pipe instead.

      Portability issues: syscall in perlport.

     

    sysopen

    • sysopen FILEHANDLE,FILENAME,MODE

    • sysopen FILEHANDLE,FILENAME,MODE,PERMS

      Opens the file whose filename is given by FILENAME, and associates it with FILEHANDLE. If FILEHANDLE is an expression, its value is used as the real filehandle wanted; an undefined scalar will be suitably autovivified. This function calls the underlying operating system's open(2) function with the parameters FILENAME, MODE, and PERMS.

      The possible values and flag bits of the MODE parameter are system-dependent; they are available via the standard module Fcntl. See the documentation of your operating system's open(2) syscall to see which values and flag bits are available. You may combine several flags using the | operator.

      Some of the most common values are O_RDONLY for opening the file in read-only mode, O_WRONLY for opening the file in write-only mode, and O_RDWR for opening the file in read-write mode.

      For historical reasons, some values work on almost every system supported by Perl: 0 means read-only, 1 means write-only, and 2 means read/write. We know that these values do not work under OS/390 and on the Macintosh; you probably don't want to use them in new code.

      If the file named by FILENAME does not exist and the open call creates it (typically because MODE includes the O_CREAT flag), then the value of PERMS specifies the permissions of the newly created file. If you omit the PERMS argument to sysopen, Perl uses the octal value 0666. These permission values need to be in octal, and are modified by your process's current umask.

      In many systems the O_EXCL flag is available for opening files in exclusive mode. This is not locking: exclusiveness means here that if the file already exists, sysopen() fails. O_EXCL may not work on network filesystems, and has no effect unless the O_CREAT flag is set as well. Setting O_CREAT|O_EXCL prevents the file from being opened if it is a symbolic link. It does not protect against symbolic links in the file's path.

      Sometimes you may want to truncate an already-existing file. This can be done using the O_TRUNC flag. The behavior of O_TRUNC with O_RDONLY is undefined.

      You should seldom if ever use 0644 as argument to sysopen, because that takes away the user's option to have a more permissive umask. Better to omit it. See the perlfunc(1) entry on umask for more on this.
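The O_CREAT|O_EXCL combination and the advice to omit PERMS can be sketched as follows. This assumes a local filesystem where O_EXCL is honored and uses File::Temp for a scratch directory; the file name lock.txt is hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/lock.txt";      # hypothetical file name

# First attempt creates the file. PERMS is omitted, so Perl uses 0666
# as modified by the process's umask.
my $created = sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL);
die "sysopen failed: $!" unless $created;
close $fh;

# Second attempt fails because the file now exists.
my $again = sysopen(my $fh2, $path, O_WRONLY | O_CREAT | O_EXCL);
print $again ? "unexpectedly opened\n" : "refused: $!\n";
```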

      Note that sysopen depends on the fdopen() C library function. On many Unix systems, fdopen() is known to fail when file descriptors exceed a certain value, typically 255. If you need more file descriptors than that, consider rebuilding Perl to use the sfio library, or perhaps using the POSIX::open() function.

      See perlopentut for a kinder, gentler explanation of opening files.

      Portability issues: sysopen in perlport.

     

    sysread

    • sysread FILEHANDLE,SCALAR,LENGTH,OFFSET

    • sysread FILEHANDLE,SCALAR,LENGTH

      Attempts to read LENGTH bytes of data into variable SCALAR from the specified FILEHANDLE, using read(2). It bypasses buffered IO, so mixing this with other kinds of reads, print, write, seek, tell, or eof can cause confusion because the perlio or stdio layers usually buffer data. Returns the number of bytes actually read, 0 at end of file, or undef if there was an error (in the latter case $! is also set). SCALAR will be grown or shrunk so that the last byte actually read is the last byte of the scalar after the read.

      An OFFSET may be specified to place the read data at some place in the string other than the beginning. A negative OFFSET specifies placement at that many characters counting backwards from the end of the string. A positive OFFSET greater than the length of SCALAR results in the string being padded to the required size with "\0" bytes before the result of the read is appended.

      There is no syseof() function, which is ok, since eof() doesn't work well on device files (like ttys) anyway. Use sysread() and check for a return value of 0 to decide whether you're done.

      Note that if the filehandle has been marked as :utf8, Unicode characters are read instead of bytes (the LENGTH, OFFSET, and the return value of sysread() are in Unicode characters). The :encoding(...) layer implicitly introduces the :utf8 layer. See binmode, open, and the open pragma, open.
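A read loop that uses the return value to detect errors and end of file, and OFFSET to append each chunk, might look like the sketch below. It assumes File::Temp for a scratch file; the 4096-byte chunk size is an arbitrary choice.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "x" x 10_000;
close $out or die "close failed: $!";

open my $in, '<:raw', $path or die "open failed: $!";
my ($data, $total) = ('', 0);
while (1) {
    # OFFSET is the current length, so each chunk is appended to $data.
    my $n = sysread $in, $data, 4096, length $data;
    die "sysread failed: $!" unless defined $n;   # undef signals an error
    last if $n == 0;                              # 0 signals end of file
    $total += $n;
}
close $in;
print "read $total bytes\n";                      # prints "read 10000 bytes"
```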

     

    sysseek

    • sysseek FILEHANDLE,POSITION,WHENCE

      Sets FILEHANDLE's system position in bytes using lseek(2). FILEHANDLE may be an expression whose value gives the name of the filehandle. The values for WHENCE are 0 to set the new position to POSITION; 1 to set it to the current position plus POSITION; and 2 to set it to EOF plus POSITION, typically negative.

      Note the "in bytes": even if the filehandle has been set to operate on characters (for example by using the :encoding(utf8) I/O layer), sysseek() works with byte offsets, not character offsets (because implementing character offsets would render sysseek() unacceptably slow).

      sysseek() bypasses normal buffered IO, so mixing it with reads other than sysread (for example <> or read()), print, write, seek, tell, or eof may cause confusion.

      For WHENCE, you may also use the constants SEEK_SET, SEEK_CUR, and SEEK_END (start of the file, current position, end of the file) from the Fcntl module. Use of the constants is also more portable than relying on 0, 1, and 2. For example, to define a "systell" function:

      use Fcntl 'SEEK_CUR';
      sub systell { sysseek($_[0], 0, SEEK_CUR) }

      Returns the new position, or the undefined value on failure. A position of zero is returned as the string "0 but true"; thus sysseek returns true on success and false on failure, yet you can still easily determine the new position.
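The "0 but true" return value can be observed with a short sketch. It assumes File::Temp for a scratch file; the variable names are illustrative.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_END);
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
syswrite $fh, "abcdef" or die "syswrite failed: $!";

my $start = sysseek($fh, 0, SEEK_SET);   # position zero
my $end   = sysseek($fh, 0, SEEK_END);   # position six

print "start is true\n" if $start;          # "0 but true" is a true value
printf "start=%d end=%d\n", $start, $end;   # prints "start=0 end=6"
```

Because the success value at position zero is still true, a plain `or die` check after sysseek works even when seeking to the start of the file.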

     

    system

    • system LIST

    • system PROGRAM LIST

      Does exactly the same thing as exec LIST, except that a fork is done first and the parent process waits for the child process to exit. Note that argument processing varies depending on the number of arguments. If there is more than one argument in LIST, or if LIST is an array with more than one value, starts the program given by the first element of the list with arguments given by the rest of the list. If there is only one scalar argument, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system's command shell for parsing (this is /bin/sh -c on Unix platforms, but varies on other platforms). If there are no shell metacharacters in the argument, it is split into words and passed directly to execvp, which is more efficient.

      Perl will attempt to flush all files opened for output before any operation that may do a fork, but this may not be supported on some platforms (see perlport). To be safe, you may need to set $| ($AUTOFLUSH in English) or call the autoflush() method of IO::Handle on any open handles.

      The return value is the exit status of the program as returned by the wait call. To get the actual exit value, shift right by eight (see below). See also exec. This is not what you want to use to capture the output from a command; for that you should use merely backticks or qx//, as described in `STRING` in perlop. Return value of -1 indicates a failure to start the program or an error of the wait(2) system call (inspect $! for the reason).

      If you'd like to make system (and many other bits of Perl) die on error, have a look at the autodie pragma.

      Like exec, system allows you to lie to a program about its name if you use the system PROGRAM LIST syntax. Again, see exec.

      Since SIGINT and SIGQUIT are ignored during the execution of system, if you expect your program to terminate on receipt of these signals you will need to arrange to do so yourself based on the return value.

          @args = ("command", "arg1", "arg2");
          system(@args) == 0
              or die "system @args failed: $?";

      If you'd like to manually inspect system's failure, you can check all possible failure modes by inspecting $? like this:

          if ($? == -1) {
              print "failed to execute: $!\n";
          }
          elsif ($? & 127) {
              printf "child died with signal %d, %s coredump\n",
                  ($? & 127), ($? & 128) ? 'with' : 'without';
          }
          else {
              printf "child exited with value %d\n", $? >> 8;
          }

      Alternatively, you may inspect the value of ${^CHILD_ERROR_NATIVE} with the W*() calls from the POSIX module.
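
      A minimal sketch of that alternative follows. It assumes a Unix-like platform where the POSIX wait macros are available, and it uses the running perl ($^X) with a one-line script as a stand-in child program; neither detail comes from the text above.

```perl
use strict;
use warnings;
use POSIX qw(WIFEXITED WEXITSTATUS WIFSIGNALED WTERMSIG);

# Stand-in command: the running perl, told to exit with status 3.
my @cmd = ($^X, '-e', 'exit 3');

# List form: no shell is involved, so the arguments need no quoting.
system(@cmd);

my $exit_value;
if ($? == -1) {
    die "failed to execute: $!";
}
elsif (WIFEXITED(${^CHILD_ERROR_NATIVE})) {
    $exit_value = WEXITSTATUS(${^CHILD_ERROR_NATIVE});
    print "child exited with value $exit_value\n";
}
elsif (WIFSIGNALED(${^CHILD_ERROR_NATIVE})) {
    printf "child died with signal %d\n", WTERMSIG(${^CHILD_ERROR_NATIVE});
}
```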

      When system's arguments are executed indirectly by the shell, results and return codes are subject to its quirks. See `STRING` in perlop and exec for details.

      Since system does a fork and wait it may affect a SIGCHLD handler. See perlipc for details.

      Portability issues: system in perlport.

     
    syswrite

    • syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET

    • syswrite FILEHANDLE,SCALAR,LENGTH
    • syswrite FILEHANDLE,SCALAR

      Attempts to write LENGTH bytes of data from variable SCALAR to the specified FILEHANDLE, using write(2). If LENGTH is not specified, writes the whole SCALAR. It bypasses buffered IO, so mixing this with reads (other than sysread()), print, write, seek, tell, or eof may cause confusion because the perlio and stdio layers usually buffer data. Returns the number of bytes actually written, or undef if there was an error (in this case the errno variable $! is also set). If the LENGTH is greater than the data available in the SCALAR after the OFFSET, only as much data as is available will be written.

      An OFFSET may be specified to write the data from some part of the string other than the beginning. A negative OFFSET specifies writing that many characters counting backwards from the end of the string. If SCALAR is of length zero, you can only use an OFFSET of 0.

      WARNING: If the filehandle is marked :utf8, Unicode characters encoded in UTF-8 are written instead of bytes, and the LENGTH, OFFSET, and return value of syswrite() are in (UTF8-encoded Unicode) characters. The :encoding(...) layer implicitly introduces the :utf8 layer. Alternately, if the handle is not marked with an encoding but you attempt to write characters with code points over 255, syswrite raises an exception. See binmode, open, and the open pragma.
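
      Because syswrite may write fewer bytes than requested, callers often loop until everything has been written. The sketch below shows one way to do that with LENGTH and OFFSET; write_all is a hypothetical helper name, the File::Temp scratch file is an assumption, and retrying after an interrupted call is deliberately left out.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: call syswrite until all of $data has been written.
sub write_all {
    my ($fh, $data) = @_;
    my $offset = 0;
    my $length = length $data;
    while ($offset < $length) {
        # Write the remaining bytes, starting at $offset within $data.
        my $written = syswrite($fh, $data, $length - $offset, $offset);
        die "syswrite failed: $!" unless defined $written;
        $offset += $written;
    }
    return $offset;
}

my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;                                # plain bytes, no :utf8 layer
my $total = write_all($fh, "0123456789");
print "wrote $total bytes\n";
```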

     
    tell

    • tell FILEHANDLE

    • tell

      Returns the current position in bytes for FILEHANDLE, or -1 on error. FILEHANDLE may be an expression whose value gives the name of the actual filehandle. If FILEHANDLE is omitted, assumes the file last read.

      Note the phrase "in bytes": even if the filehandle has been set to operate on characters (for example by using the :encoding(utf8) open layer), tell() will return byte offsets, not character offsets (because returning character offsets would render seek() and tell() rather slow).

      The return value of tell() for the standard streams such as STDIN depends on the operating system: it may return -1 or something else. tell() on pipes, fifos, and sockets usually returns -1.

      There is no systell function. Use sysseek(FH, 0, 1) for that.

      Do not use tell() (or other buffered I/O operations) on a filehandle that has been manipulated by sysread(), syswrite(), or sysseek(). Those functions ignore the buffering, while tell() does not.
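
      The byte-versus-character distinction can be illustrated with a small sketch. The scratch file from File::Temp and the explicit flush are assumptions made for the example, not something the text above prescribes.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh, ':encoding(UTF-8)' or die "binmode failed: $!";

# "\x{e9}" (e with acute accent) is one character but two bytes in UTF-8.
my $text = "\x{e9}";
print {$fh} $text;
$fh->flush;

my $chars = length $text;    # character count of what was printed
my $pos   = tell $fh;        # byte offset in the file
print "characters: $chars, tell: $pos\n";
```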

     
    telldir

    • telldir DIRHANDLE

      Returns the current position of the readdir routines on DIRHANDLE. Value may be given to seekdir to access a particular location in a directory. telldir has the same caveats about possible directory compaction as the corresponding system library routine.

     
    tie

    • tie VARIABLE,CLASSNAME,LIST

      This function binds a variable to a package class that will provide the implementation for the variable. VARIABLE is the name of the variable to be enchanted. CLASSNAME is the name of a class implementing objects of correct type. Any additional arguments are passed to the appropriate constructor method of the class (meaning TIESCALAR, TIEHANDLE, TIEARRAY, or TIEHASH). Typically these are arguments such as might be passed to the dbm_open() function of C. The object returned by the constructor is also returned by the tie function, which would be useful if you want to access other methods in CLASSNAME.

      Note that functions such as keys and values may return huge lists when used on large objects, like DBM files. You may prefer to use the each function to iterate over such. Example:

          # print out history file offsets
          use NDBM_File;
          tie(%HIST, 'NDBM_File', '/usr/lib/news/history', 1, 0);
          while (($key,$val) = each %HIST) {
              print $key, ' = ', unpack('L',$val), "\n";
          }
          untie(%HIST);

      A class implementing a hash should have the following methods:

          TIEHASH classname, LIST
          FETCH this, key
          STORE this, key, value
          DELETE this, key
          CLEAR this
          EXISTS this, key
          FIRSTKEY this
          NEXTKEY this, lastkey
          SCALAR this
          DESTROY this
          UNTIE this

      A class implementing an ordinary array should have the following methods:

          TIEARRAY classname, LIST
          FETCH this, key
          STORE this, key, value
          FETCHSIZE this
          STORESIZE this, count
          CLEAR this
          PUSH this, LIST
          POP this
          SHIFT this
          UNSHIFT this, LIST
          SPLICE this, offset, length, LIST
          EXTEND this, count
          DELETE this, key
          EXISTS this, key
          DESTROY this
          UNTIE this

      A class implementing a filehandle should have the following methods:

          TIEHANDLE classname, LIST
          READ this, scalar, length, offset
          READLINE this
          GETC this
          WRITE this, scalar, length, offset
          PRINT this, LIST
          PRINTF this, format, LIST
          BINMODE this
          EOF this
          FILENO this
          SEEK this, position, whence
          TELL this
          OPEN this, mode, LIST
          CLOSE this
          DESTROY this
          UNTIE this

      A class implementing a scalar should have the following methods:

          TIESCALAR classname, LIST
          FETCH this
          STORE this, value
          DESTROY this
          UNTIE this

      Not all methods indicated above need be implemented. See perltie, Tie::Hash, Tie::Array, Tie::Scalar, and Tie::Handle.

      Unlike dbmopen, the tie function will not use or require a module for you; you need to do that explicitly yourself. See DB_File or the Config module for interesting tie implementations.

      For further details see perltie, tied VARIABLE.
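
      As an illustration of the scalar interface listed above, here is a minimal sketch of a tied scalar. CountingScalar and its reads method are hypothetical names invented for this example; only TIESCALAR, FETCH, and STORE are implemented, which is enough for simple reads and writes.

```perl
use strict;
use warnings;

# Hypothetical class: a scalar that counts how often it is read.
package CountingScalar;

sub TIESCALAR {
    my ($class, $initial) = @_;
    return bless { value => $initial, reads => 0 }, $class;
}

sub FETCH {
    my ($self) = @_;
    $self->{reads}++;
    return $self->{value};
}

sub STORE {
    my ($self, $value) = @_;
    $self->{value} = $value;
}

# Extra method, reachable through the object that tie returns.
sub reads { $_[0]{reads} }

package main;

# tie returns the object built by TIESCALAR.
my $object = tie my $counter, 'CountingScalar', 10;

my $first  = $counter;    # calls FETCH
$counter   = 20;          # calls STORE
my $second = $counter;    # calls FETCH again

print "value=$second reads=", $object->reads, "\n";
```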

     
    tied

    • tied VARIABLE

      Returns a reference to the object underlying VARIABLE (the same value that was originally returned by the tie call that bound the variable to a package). Returns the undefined value if VARIABLE isn't tied to a package.
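
      A short sketch of both outcomes, assuming the Tie::StdScalar class that ships in the Tie::Scalar module is an acceptable stand-in for a real tie class:

```perl
use strict;
use warnings;
use Tie::Scalar;    # provides Tie::StdScalar

my $plain;                                        # never tied
my $object = tie my $bound, 'Tie::StdScalar';

my $same_object = tied($bound) == $object;        # same underlying object
my $not_tied    = !defined tied($plain);          # undef for an untied variable

print "same object\n" if $same_object;
print "plain variable is not tied\n" if $not_tied;
```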

     
    time

    • time

      Returns the number of non-leap seconds since whatever time the system considers to be the epoch, suitable for feeding to gmtime and localtime. On most systems the epoch is 00:00:00 UTC, January 1, 1970; a prominent exception being Mac OS Classic which uses 00:00:00, January 1, 1904 in the current local time zone for its epoch.

      For measuring time in better granularity than one second, use the Time::HiRes module from Perl 5.8 onwards (or from CPAN before then), or, if you have gettimeofday(2), you may be able to use the syscall interface of Perl. See perlfaq8 for details.
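
      A minimal sketch of sub-second timing with Time::HiRes, assuming the module is installed (it has shipped with Perl since 5.8). The quarter-second pause is an arbitrary value chosen for the illustration.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);    # fractional-second versions of the builtins

my $start = time;
sleep 0.25;                        # arbitrary pause for the illustration
my $elapsed = time - $start;

printf "slept for about %.3f seconds\n", $elapsed;
```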

      For date and time processing look at the many related modules on CPAN. For a comprehensive date and time representation look at the DateTime module.

     
    times

    • times

      Returns a four-element list giving the user and system times in seconds for this process and any exited children of this process.

          ($user,$system,$cuser,$csystem) = times;

      In scalar context, times returns $user.

      Children's times are only included for terminated children.

      Portability issues: times in perlport.

     
    tr

     
    truncate

    • truncate FILEHANDLE,LENGTH

    • truncate EXPR,LENGTH

      Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified length. Raises an exception if truncate isn't implemented on your system. Returns true if successful, undef on error.

      The behavior is undefined if LENGTH is greater than the length of the file.

      The position in the file of FILEHANDLE is left unchanged. You may want to call seek before writing to the file.
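
      The following sketch shows why the seek matters when a file is emptied and then rewritten through the same handle. The scratch file comes from File::Temp, which is an assumption of the example.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "old contents\n";

# Empty the file; the handle's position stays past the old data.
truncate($fh, 0) or die "truncate failed: $!";

# Move back to the start so the next write begins at offset 0.
seek($fh, 0, SEEK_SET) or die "seek failed: $!";
print {$fh} "new\n";
close $fh or die "close failed: $!";

my $size = -s $path;
print "file size is now $size bytes\n";
```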

      Portability issues: truncate in perlport.

     
    uc

    • uc EXPR

    • uc

      Returns an uppercased version of EXPR. This is the internal function implementing the \U escape in double-quoted strings. It does not attempt to do titlecase mapping on initial letters. See ucfirst for that.

      If EXPR is omitted, uses $_.

      This function behaves the same way under various pragmas, such as in a locale, as lc does.

     
    ucfirst

    • ucfirst EXPR

    • ucfirst

      Returns the value of EXPR with the first character in uppercase (titlecase in Unicode). This is the internal function implementing the \u escape in double-quoted strings.

      If EXPR is omitted, uses $_.

      This function behaves the same way under various pragmas, such as in a locale, as lc does.

     
    umask

    • umask EXPR

    • umask

      Sets the umask for the process to EXPR and returns the previous value. If EXPR is omitted, merely returns the current umask.

      The Unix permission rwxr-x--- is represented as three sets of three bits, or three octal digits: 0750 (the leading 0 indicates octal and isn't one of the digits). The umask value is such a number representing disabled permissions bits. The permission (or "mode") values you pass mkdir or sysopen are modified by your umask, so even if you tell sysopen to create a file with permissions 0777, if your umask is 0022, then the file will actually be created with permissions 0755. If your umask were 0027 (group can't write; others can't read, write, or execute), then passing sysopen 0666 would create a file with mode 0640 (because 0666 & ~027 is 0640).

      Here's some advice: supply a creation mode of 0666 for regular files (in sysopen) and one of 0777 for directories (in mkdir) and executable files. This gives users the freedom of choice: if they want protected files, they might choose process umasks of 022, 027, or even the particularly antisocial mask of 077. Programs should rarely if ever make policy decisions better left to the user. The exception to this is when writing files that should be kept private: mail files, web browser cookies, .rhosts files, and so on.

      If umask(2) is not implemented on your system and you are trying to restrict access for yourself (i.e., (EXPR & 0700) > 0 ), raises an exception. If umask(2) is not implemented and you are not trying to restrict access for yourself, returns undef.

      Remember that a umask is a number, usually given in octal; it is not a string of octal digits. See also oct, if all you have is a string.
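
      The arithmetic described above can be checked directly. This sketch only computes the resulting mode; it does not change the process umask, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $requested = 0666;    # mode passed to sysopen
my $mask      = 0027;    # umask: group can't write, others get nothing

# Bits set in the umask are cleared from the requested mode.
my $effective = $requested & ~$mask;
printf "%04o\n", $effective;    # prints 0640

# A string of octal digits is not an octal number; convert it with oct.
my $from_string = oct("027");
```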

      Portability issues: umask in perlport.

     
    undef

    • undef EXPR

    • undef

      Undefines the value of EXPR, which must be an lvalue. Use only on a scalar value, an array (using @), a hash (using %), a subroutine (using &), or a typeglob (using *). Saying undef $hash{$key} will probably not do what you expect on most predefined variables or DBM list values, so don't do that; see delete. Always returns the undefined value. You can omit the EXPR, in which case nothing is undefined, but you still get an undefined value that you could, for instance, return from a subroutine, assign to a variable, or pass as a parameter. Examples:

          undef $foo;
          undef $bar{'blurfl'}; # Compare to: delete $bar{'blurfl'};
          undef @ary;
          undef %hash;
          undef &mysub;
          undef *xyz; # destroys $xyz, @xyz, %xyz, &xyz, etc.
          return (wantarray ? (undef, $errmsg) : undef) if $they_blew_it;
          select undef, undef, undef, 0.25;
          ($a, $b, undef, $c) = &foo; # Ignore third value returned

      Note that this is a unary operator, not a list operator.
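
      The difference between undef and delete on a hash element, mentioned in the examples above, can be sketched as follows. The hash contents are made up for the illustration.

```perl
use strict;
use warnings;

my %bar = (blurfl => 1, other => 2);

# undef clears the value but leaves the key in the hash.
undef $bar{blurfl};
my $exists_after_undef = exists $bar{blurfl};     # true
my $keys_after_undef   = keys %bar;               # still 2

# delete removes the key itself.
delete $bar{blurfl};
my $exists_after_delete = exists $bar{blurfl};    # false
my $keys_after_delete   = keys %bar;              # now 1
```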

     
    unless

     
    unlink

    • unlink LIST

    • unlink

      Deletes a list of files. On success, it returns the number of files it successfully deleted. On failure, it returns false and sets $! (errno):

          my $unlinked = unlink 'a', 'b', 'c';
          unlink @goners;
          unlink glob "*.bak";

      On error, unlink will not tell you which files it could not remove. If you want to know which files you could not remove, try them one at a time:

          foreach my $file ( @goners ) {
              unlink $file or warn "Could not unlink $file: $!";
          }

      Note: unlink will not attempt to delete directories unless you are superuser and the -U flag is supplied to Perl. Even if these conditions are met, be warned that unlinking a directory can inflict damage on your filesystem. Finally, using unlink on directories is not supported on many operating systems. Use rmdir instead.

      If LIST is omitted, unlink uses $_.

     
    unpack

    • unpack TEMPLATE,EXPR

    • unpack TEMPLATE

      unpack does the reverse of pack: it takes a string and expands it out into a list of values. (In scalar context, it returns merely the first value produced.)

      If EXPR is omitted, unpacks the $_ string. See perlpacktut for an introduction to this function.

      The string is broken into chunks described by the TEMPLATE. Each chunk is converted separately to a value. Typically, either the string is a result of pack, or the characters of the string represent a C structure of some kind.

      The TEMPLATE has the same format as in the pack function. Here's a subroutine that does substring:

          sub substr {
              my($what,$where,$howmuch) = @_;
              unpack("x$where a$howmuch", $what);
          }

      and then there's

          sub ordinal { unpack("W",$_[0]); } # same as ord()

      In addition to fields allowed in pack(), you may prefix a field with a %<number> to indicate that you want a <number>-bit checksum of the items instead of the items themselves. Default is a 16-bit checksum. Checksum is calculated by summing numeric values of expanded values (for string fields the sum of ord($char) is taken; for bit fields the sum of zeroes and ones).

      For example, the following computes the same number as the System V sum program:

          $checksum = do {
              local $/; # slurp!
              unpack("%32W*",<>) % 65535;
          };

      The following efficiently counts the number of set bits in a bit vector:

          $setbits = unpack("%32b*", $selectmask);

      The p and P formats should be used with care. Since Perl has no way of checking whether the value passed to unpack() corresponds to a valid memory location, passing a pointer value that's not known to be valid is likely to have disastrous consequences.

      If there are more pack codes or if the repeat count of a field or a group is larger than what the remainder of the input string allows, the result is not well defined: the repeat count may be decreased, or unpack() may produce empty strings or zeros, or it may raise an exception. If the input string is longer than one described by the TEMPLATE, the remainder of that input string is ignored.

      See pack for more examples and notes.
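
      The checksum prefix and ordinary field extraction can be tried on small literal inputs. The strings below are made up for the illustration; the templates are the ones discussed above.

```perl
use strict;
use warnings;

# Sum of the character values: ord("a") + ord("b") + ord("c").
my $sum = unpack("%32W*", "abc");

# Number of set bits in one byte with the low three bits on.
my $setbits = unpack("%32b*", "\x07");

# Split a string into a two-character prefix and the remainder.
my ($head, $rest) = unpack("a2 a*", "perl5");

print "sum=$sum setbits=$setbits head=$head rest=$rest\n";
```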

     
    unshift

    • unshift ARRAY,LIST

    • unshift EXPR,LIST

      Does the opposite of a shift. Or the opposite of a push, depending on how you look at it. Prepends list to the front of the array and returns the new number of elements in the array.

          unshift(@ARGV, '-e') unless $ARGV[0] =~ /^-/;

      Note the LIST is prepended whole, not one element at a time, so the prepended elements stay in the same order. Use reverse to do the reverse.
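
      A small sketch of the ordering rule, using made-up array contents:

```perl
use strict;
use warnings;

my @queue = (3, 4);
my $count = unshift @queue, 1, 2;    # LIST goes in whole: (1, 2, 3, 4)

my @stack = (3, 4);
unshift @stack, reverse 1, 2;        # reversed first: (2, 1, 3, 4)

print "@queue\n";
print "@stack\n";
```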

      Starting with Perl 5.14, unshift can take a scalar EXPR, which must hold a reference to an unblessed array. The argument will be dereferenced automatically. This aspect of unshift is considered highly experimental. The exact behaviour may change in a future version of Perl.

      To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

          use 5.014; # so push/pop/etc work on scalars (experimental)
     
    untie

    • untie VARIABLE

      Breaks the binding between a variable and a package. (See tie.) Has no effect if the variable is not tied.

     
    until

     
    use

    • use Module VERSION LIST

    • use Module VERSION
    • use Module LIST
    • use Module
    • use VERSION

      Imports some semantics into the current package from the named module, generally by aliasing certain subroutine or variable names into your package. It is exactly equivalent to

          BEGIN { require Module; Module->import( LIST ); }

      except that Module must be a bareword. The importation can be made conditional by using the if module.

      In the peculiar use VERSION form, VERSION may be either a positive decimal fraction such as 5.006, which will be compared to $], or a v-string of the form v5.6.1, which will be compared to $^V (aka $PERL_VERSION). An exception is raised if VERSION is greater than the version of the current Perl interpreter; Perl will not attempt to parse the rest of the file. Compare with require, which can do a similar check at run time. Symmetrically, no VERSION allows you to specify that you want a version of Perl older than the specified one.

      Specifying VERSION as a literal of the form v5.6.1 should generally be avoided, because it leads to misleading error messages under earlier versions of Perl (that is, prior to 5.6.0) that do not support this syntax. The equivalent numeric version should be used instead.

          use v5.6.1; # compile time version check
          use 5.6.1; # ditto
          use 5.006_001; # ditto; preferred for backwards compatibility

      This is often useful if you need to check the current Perl version before use-ing library modules that won't work with older versions of Perl. (We try not to do this more than we have to.)

      use VERSION also enables all features available in the requested version as defined by the feature pragma, disabling any features not in the requested version's feature bundle. See feature. Similarly, if the specified Perl version is greater than or equal to 5.12.0, strictures are enabled lexically as with use strict. Any explicit use of use strict or no strict overrides use VERSION, even if it comes before it. In both cases, the feature.pm and strict.pm files are not actually loaded.

      The BEGIN forces the require and import to happen at compile time. The require makes sure the module is loaded into memory if it hasn't been yet. The import is not a builtin; it's just an ordinary static method call into the Module package to tell the module to import the list of features back into the current package. The module can implement its import method any way it likes, though most modules just choose to derive their import method via inheritance from the Exporter class that is defined in the Exporter module. See Exporter. If no import method can be found then the call is skipped, even if there is an AUTOLOAD method.

      If you do not want to call the package's import method (for instance, to stop your namespace from being altered), explicitly supply the empty list:

          use Module ();

      That is exactly equivalent to

          BEGIN { require Module }

      If the VERSION argument is present between Module and LIST, then the use will call the VERSION method in class Module with the given version as an argument. The default VERSION method, inherited from the UNIVERSAL class, croaks if the given version is larger than the value of the variable $Module::VERSION.

      Again, there is a distinction between omitting LIST (import called with no arguments) and an explicit empty LIST () (import not called). Note that there is no comma after VERSION!
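
      The effect of an explicit empty LIST can be observed by checking which names end up in the current package. The sketch uses List::Util and Scalar::Util purely as convenient core modules; the choice of modules is an assumption of the example.

```perl
use strict;
use warnings;

use List::Util ();               # loaded, but import is not called
use Scalar::Util qw(blessed);    # import called with an explicit list

# Nothing was imported from List::Util, so use the fully qualified name.
my $max = List::Util::max(3, 7, 5);

my $has_max     = defined &main::max;        # false: never imported
my $has_blessed = defined &main::blessed;    # true: requested above

print "max is $max\n";
```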

      Because this is a wide-open interface, pragmas (compiler directives) are also implemented this way. Currently implemented pragmas are:

          use constant;
          use diagnostics;
          use integer;
          use sigtrap qw(SEGV BUS);
          use strict qw(subs vars refs);
          use subs qw(afunc blurfl);
          use warnings qw(all);
          use sort qw(stable _quicksort _mergesort);

      Some of these pseudo-modules import semantics into the current block scope (like strict or integer), unlike ordinary modules, which import symbols into the current package (which are effective through the end of the file).

      Because use takes effect at compile time, it doesn't respect the ordinary flow control of the code being compiled. In particular, putting a use inside the false branch of a conditional doesn't prevent it from being processed. If a module or pragma only needs to be loaded conditionally, this can be done using the if pragma:

          use if $] < 5.008, "utf8";
          use if WANT_WARNINGS, warnings => qw(all);

      There's a corresponding no declaration that unimports meanings imported by use, i.e., it calls unimport Module LIST instead of import. It behaves just as import does with VERSION, an omitted or empty LIST, or no unimport method being found.

          no integer;
          no strict 'refs';
          no warnings;

      Care should be taken when using the no VERSION form of no. It is only meant to be used to assert that the running Perl is of an earlier version than its argument and not to undo the feature-enabling side effects of use VERSION.

      See perlmodlib for a list of standard modules and pragmas. See perlrun for the -M and -m command-line options to Perl that give use functionality from the command-line.

     
    utime

    • utime LIST

      Changes the access and modification times on each file of a list of files. The first two elements of the list must be the NUMERIC access and modification times, in that order. Returns the number of files successfully changed. The inode change time of each file is set to the current time. For example, this code has the same effect as the Unix touch(1) command when the files already exist and belong to the user running the program:

          #!/usr/bin/perl
          $atime = $mtime = time;
          utime $atime, $mtime, @ARGV;

      Since Perl 5.8.0, if the first two elements of the list are undef, the utime(2) syscall from your C library is called with a null second argument. On most systems, this will set the file's access and modification times to the current time (i.e., equivalent to the example above) and will work even on files you don't own provided you have write permission:

          for $file (@ARGV) {
              utime(undef, undef, $file)
                  || warn "couldn't touch $file: $!";
          }

      Under NFS this will use the time of the NFS server, not the time of the local machine. If there is a time synchronization problem, the NFS server and local machine will have different times. The Unix touch(1) command will in fact normally use this form instead of the one shown in the first example.

      Passing only one of the first two elements as undef is equivalent to passing a 0 and will not have the effect described when both are undef. This also triggers an uninitialized warning.

      On systems that support futimes(2), you may pass filehandles among the files. On systems that don't support futimes(2), passing filehandles raises an exception. Filehandles must be passed as globs or glob references to be recognized; barewords are considered filenames.
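
      Setting explicit times and reading them back with stat can be sketched as follows. The scratch file from File::Temp and the timestamp value are assumptions chosen for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
close $fh or die "close failed: $!";

# An arbitrary fixed time: one (non-leap) year after the epoch.
my $stamp = 365 * 86_400;

# utime returns the number of files successfully changed.
utime($stamp, $stamp, $path) == 1
    or die "couldn't set times on $path: $!";

# Elements 8 and 9 of stat are the access and modification times.
my ($atime, $mtime) = (stat $path)[8, 9];
print "mtime is now $mtime\n";
```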

      Portability issues: utime in perlport.

     
    values

    • values HASH

    • values ARRAY
    • values EXPR

      In list context, returns a list consisting of all the values of the named hash. In Perl 5.12 or later only, will also return a list of the values of an array; prior to that release, attempting to use an array argument will produce a syntax error. In scalar context, returns the number of values.

      Hash entries are returned in an apparently random order. The actual random order is specific to a given hash; the exact same series of operations on two hashes may result in a different order for each hash. Any insertion into the hash may change the order, as will any deletion, with the exception that the most recent key returned by each or keys may be deleted without changing the order. So long as a given hash is unmodified you may rely on keys, values and each to repeatedly return the same order as each other. See Algorithmic Complexity Attacks in perlsec for details on why hash order is randomized. Aside from the guarantees provided here the exact details of Perl's hash algorithm and the hash traversal order are subject to change in any release of Perl.

      As a side effect, calling values() resets the HASH or ARRAY's internal iterator; see each. (In particular, calling values() in void context resets the iterator with no other overhead.) Apart from resetting the iterator, values @array in list context is the same as plain @array. (We recommend that you use void context keys @array for this, but reasoned that taking values @array out would require more documentation than leaving it in.)

      Note that the values are not copied, which means modifying them will modify the contents of the hash:

          for (values %hash) { s/foo/bar/g }          # modifies %hash values
          for (@hash{keys %hash}) { s/foo/bar/g }     # same

      Starting with Perl 5.14, values can take a scalar EXPR, which must hold a reference to an unblessed hash or array. The argument will be dereferenced automatically. This aspect of values is considered highly experimental. The exact behaviour may change in a future version of Perl.

          for (values $hashref) { ... }
          for (values $obj->get_arrayref) { ... }

      To avoid confusing would-be users of your code who are running earlier versions of Perl with mysterious syntax errors, put this sort of thing at the top of your file to signal that your code will work only on Perls of a recent vintage:

          use 5.012;  # so keys/values/each work on arrays
          use 5.014;  # so keys/values/each work on scalars (experimental)

      See also keys, each, and sort.

     
    vec

    • vec EXPR,OFFSET,BITS

      Treats the string in EXPR as a bit vector made up of elements of width BITS and returns the value of the element specified by OFFSET as an unsigned integer. BITS therefore specifies the number of bits that are reserved for each element in the bit vector. This must be a power of two from 1 to 32 (or 64, if your platform supports that).

      If BITS is 8, "elements" coincide with bytes of the input string.

      If BITS is 16 or more, bytes of the input string are grouped into chunks of size BITS/8, and each group is converted to a number as with pack()/unpack() with big-endian formats n/N (and analogously for BITS==64). See pack for details.

      If BITS is 4 or less, the string is broken into bytes, then the bits of each byte are broken into 8/BITS groups. Bits of a byte are numbered in a little-endian-ish way, as in 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80. For example, breaking the single input byte chr(0x36) into two groups gives a list (0x6, 0x3); breaking it into 4 groups gives (0x2, 0x1, 0x3, 0x0).
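      The grouping rules above can be tried directly. This is a minimal sketch; the variable names ($byte, $word and so on) are illustrative only.

```perl
use strict;
use warnings;

# One input byte, 0x36 == 0b0011_0110.
my $byte = chr(0x36);

# BITS == 4: two groups per byte, low nibble first.
my @nibbles = map { vec($byte, $_, 4) } 0 .. 1;

# BITS == 2: four groups per byte, lowest pair of bits first.
my @pairs = map { vec($byte, $_, 2) } 0 .. 3;

# BITS == 16: two bytes are combined big-endian, like unpack("n", ...).
my $word   = "\x12\x34";
my $as_vec = vec($word, 0, 16);
my $as_n   = unpack("n", $word);

printf "nibbles: %s\n", join(", ", map { sprintf "0x%X", $_ } @nibbles);
printf "pairs:   %s\n", join(", ", map { sprintf "0x%X", $_ } @pairs);
printf "16-bit:  0x%04X\n", $as_vec;
```

      Elements narrower than a byte are numbered from the least significant bits upward, while elements wider than a byte are read most significant byte first.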

      vec may also be assigned to, in which case parentheses are needed to give the expression the correct precedence as in

          vec($image, $max_x * $x + $y, 8) = 3;

      If the selected element is outside the string, the value 0 is returned. If an element off the end of the string is written to, Perl will first extend the string with sufficiently many zero bytes. It is an error to try to write off the beginning of the string (i.e., negative OFFSET).
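      A short sketch of these boundary cases follows. The buffer contents are arbitrary, and the eval is only there so the fatal error from a negative lvalue offset can be observed without ending the program.

```perl
use strict;
use warnings;

my $buf = "ab";

# Reading past the end returns 0 and leaves the string alone.
my $past_end     = vec($buf, 10, 8);
my $len_after_rd = length $buf;

# Writing past the end pads the string with zero bytes first.
vec($buf, 5, 8) = 0x41;                 # 0x41 is 'A'
my $len_after_wr = length $buf;

# Writing before the start of the string is an error.
my $ok = eval { vec($buf, -1, 8) = 1; 1 };
my $err = $ok ? '' : $@;
```

      Because reads never extend the string, probing an offset is safe even when the vector has not been sized in advance.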

      If the string happens to be encoded as UTF-8 internally (and thus has the UTF8 flag set), this is ignored by vec, and it operates on the internal byte string, not the conceptual character string, even if you only have characters with values less than 256.

      Strings created with vec can also be manipulated with the logical operators |, &, ^, and ~. These operators will assume a bit vector operation is desired when both operands are strings. See Bitwise String Operators in perlop.
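      As an illustration, two small bit vectors can be treated as sets and combined with the string forms of | and &. The names $x and $y and the chosen bit positions are arbitrary. The sketch assumes the "bitwise" feature is not enabled, since that feature makes these operators always numeric.

```perl
use strict;
use warnings;

# Build two bit vectors; both stay pure strings, never used as numbers.
my ($x, $y) = ('', '');
vec($x, $_, 1) = 1 for (1, 3);    # set {1, 3}
vec($y, $_, 1) = 1 for (3, 5);    # set {3, 5}

# With two string operands, | and & work bit by bit on the strings.
my $union        = $x | $y;
my $intersection = $x & $y;

# unpack "b*" lists the bits in ascending order, matching vec's numbering.
my $union_bits = unpack("b*", $union);
my $inter_bits = unpack("b*", $intersection);

print "union:        $union_bits\n";
print "intersection: $inter_bits\n";
```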

      The following code will build up an ASCII string saying 'PerlPerlPerl'. The comments show the string after each step. Note that this code works in the same way on big-endian or little-endian machines.

          my $foo = '';
          vec($foo,  0, 32) = 0x5065726C;   # 'Perl'

          # $foo eq "Perl" eq "\x50\x65\x72\x6C", 32 bits
          print vec($foo, 0, 8);            # prints 80 == 0x50 == ord('P')

          vec($foo,  2, 16) = 0x5065;       # 'PerlPe'
          vec($foo,  3, 16) = 0x726C;       # 'PerlPerl'
          vec($foo,  8,  8) = 0x50;         # 'PerlPerlP'
          vec($foo,  9,  8) = 0x65;         # 'PerlPerlPe'
          vec($foo, 20,  4) = 2;            # 'PerlPerlPe' . "\x02"
          vec($foo, 21,  4) = 7;            # 'PerlPerlPer'
                                            # 'r' is "\x72"
          vec($foo, 45,  2) = 3;            # 'PerlPerlPer' . "\x0c"
          vec($foo, 93,  1) = 1;            # 'PerlPerlPer' . "\x2c"
          vec($foo, 94,  1) = 1;            # 'PerlPerlPerl'
                                            # 'l' is "\x6c"

      To transform a bit vector into a string or list of 0's and 1's, use these:

          $bits = unpack("b*", $vector);
          @bits = split(//, unpack("b*", $vector));

      If you know the exact length in bits, it can be used in place of the *.

      Here is an example to illustrate how the bits actually fall in place:

          #!/usr/bin/perl -wl

          print <<'EOT';
                                            0         1         2         3
                             unpack("V",$_) 01234567890123456789012345678901
          ------------------------------------------------------------------
          EOT

          for $w (0..3) {
              $width = 2**$w;
              for ($shift=0; $shift < $width; ++$shift) {
                  for ($off=0; $off < 32/$width; ++$off) {
                      $str = pack("B*", "0"x32);
                      $bits = (1<<$shift);
                      vec($str, $off, $width) = $bits;
                      $res = unpack("b*",$str);
                      $val = unpack("V", $str);
                      write;
                  }
              }
          }

          format STDOUT =
          vec($_,@#,@#) = @<< == @######### @>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
          $off, $width, $bits, $val, $res
          .
          __END__

      Regardless of the machine architecture on which it runs, the example above should print the following table:

                                            0         1         2         3
                             unpack("V",$_) 01234567890123456789012345678901
          ------------------------------------------------------------------
          vec($_, 0, 1) = 1   ==          1 10000000000000000000000000000000
          vec($_, 1, 1) = 1   ==          2 01000000000000000000000000000000
          vec($_, 2, 1) = 1   ==          4 00100000000000000000000000000000
          vec($_, 3, 1) = 1   ==          8 00010000000000000000000000000000
          vec($_, 4, 1) = 1   ==         16 00001000000000000000000000000000
          vec($_, 5, 1) = 1   ==         32 00000100000000000000000000000000
          vec($_, 6, 1) = 1   ==         64 00000010000000000000000000000000
          vec($_, 7, 1) = 1   ==        128 00000001000000000000000000000000
          vec($_, 8, 1) = 1   ==        256 00000000100000000000000000000000
          vec($_, 9, 1) = 1   ==        512 00000000010000000000000000000000
          vec($_,10, 1) = 1   ==       1024 00000000001000000000000000000000
          vec($_,11, 1) = 1   ==       2048 00000000000100000000000000000000
          vec($_,12, 1) = 1   ==       4096 00000000000010000000000000000000
          vec($_,13, 1) = 1   ==       8192 00000000000001000000000000000000
          vec($_,14, 1) = 1   ==      16384 00000000000000100000000000000000
          vec($_,15, 1) = 1   ==      32768 00000000000000010000000000000000
          vec($_,16, 1) = 1   ==      65536 00000000000000001000000000000000
          vec($_,17, 1) = 1   ==     131072 00000000000000000100000000000000
          vec($_,18, 1) = 1   ==     262144 00000000000000000010000000000000
          vec($_,19, 1) = 1   ==     524288 00000000000000000001000000000000
          vec($_,20, 1) = 1   ==    1048576 00000000000000000000100000000000
          vec($_,21, 1) = 1   ==    2097152 00000000000000000000010000000000
          vec($_,22, 1) = 1   ==    4194304 00000000000000000000001000000000
          vec($_,23, 1) = 1   ==    8388608 00000000000000000000000100000000
          vec($_,24, 1) = 1   ==   16777216 00000000000000000000000010000000
          vec($_,25, 1) = 1   ==   33554432 00000000000000000000000001000000
          vec($_,26, 1) = 1   ==   67108864 00000000000000000000000000100000
          vec($_,27, 1) = 1   ==  134217728 00000000000000000000000000010000
          vec($_,28, 1) = 1   ==  268435456 00000000000000000000000000001000
          vec($_,29, 1) = 1   ==  536870912 00000000000000000000000000000100
          vec($_,30, 1) = 1   == 1073741824 00000000000000000000000000000010
          vec($_,31, 1) = 1   == 2147483648 00000000000000000000000000000001
          vec($_, 0, 2) = 1   ==          1 10000000000000000000000000000000
          vec($_, 1, 2) = 1   ==          4 00100000000000000000000000000000
          vec($_, 2, 2) = 1   ==         16 00001000000000000000000000000000
          vec($_, 3, 2) = 1   ==         64 00000010000000000000000000000000
          vec($_, 4, 2) = 1   ==        256 00000000100000000000000000000000
          vec($_, 5, 2) = 1   ==       1024 00000000001000000000000000000000
          vec($_, 6, 2) = 1   ==       4096 00000000000010000000000000000000
          vec($_, 7, 2) = 1   ==      16384 00000000000000100000000000000000
          vec($_, 8, 2) = 1   ==      65536 00000000000000001000000000000000
          vec($_, 9, 2) = 1   ==     262144 00000000000000000010000000000000
          vec($_,10, 2) = 1   ==    1048576 00000000000000000000100000000000
          vec($_,11, 2) = 1   ==    4194304 00000000000000000000001000000000
          vec($_,12, 2) = 1   ==   16777216 00000000000000000000000010000000
          vec($_,13, 2) = 1   ==   67108864 00000000000000000000000000100000
          vec($_,14, 2) = 1   ==  268435456 00000000000000000000000000001000
          vec($_,15, 2) = 1   == 1073741824 00000000000000000000000000000010
          vec($_, 0, 2) = 2   ==          2 01000000000000000000000000000000
          vec($_, 1, 2) = 2   ==          8 00010000000000000000000000000000
          vec($_, 2, 2) = 2   ==         32 00000100000000000000000000000000
          vec($_, 3, 2) = 2   ==        128 00000001000000000000000000000000
          vec($_, 4, 2) = 2   ==        512 00000000010000000000000000000000
          vec($_, 5, 2) = 2   ==       2048 00000000000100000000000000000000
          vec($_, 6, 2) = 2   ==       8192 00000000000001000000000000000000
          vec($_, 7, 2) = 2   ==      32768 00000000000000010000000000000000
          vec($_, 8, 2) = 2   ==     131072 00000000000000000100000000000000
          vec($_, 9, 2) = 2   ==     524288 00000000000000000001000000000000
          vec($_,10, 2) = 2   ==    2097152 00000000000000000000010000000000
          vec($_,11, 2) = 2   ==    8388608 00000000000000000000000100000000
          vec($_,12, 2) = 2   ==   33554432 00000000000000000000000001000000
          vec($_,13, 2) = 2   ==  134217728 00000000000000000000000000010000
          vec($_,14, 2) = 2   ==  536870912 00000000000000000000000000000100
          vec($_,15, 2) = 2   == 2147483648 00000000000000000000000000000001
          vec($_, 0, 4) = 1   ==          1 10000000000000000000000000000000
          vec($_, 1, 4) = 1   ==         16 00001000000000000000000000000000
          vec($_, 2, 4) = 1   ==        256 00000000100000000000000000000000
          vec($_, 3, 4) = 1   ==       4096 00000000000010000000000000000000
          vec($_, 4, 4) = 1   ==      65536 00000000000000001000000000000000
          vec($_, 5, 4) = 1   ==    1048576 00000000000000000000100000000000
          vec($_, 6, 4) = 1   ==   16777216 00000000000000000000000010000000
          vec($_, 7, 4) = 1   ==  268435456 00000000000000000000000000001000
          vec($_, 0, 4) = 2   ==          2 01000000000000000000000000000000
          vec($_, 1, 4) = 2   ==         32 00000100000000000000000000000000
          vec($_, 2, 4) = 2   ==        512 00000000010000000000000000000000
          vec($_, 3, 4) = 2   ==       8192 00000000000001000000000000000000
          vec($_, 4, 4) = 2   ==     131072 00000000000000000100000000000000
          vec($_, 5, 4) = 2   ==    2097152 00000000000000000000010000000000
          vec($_, 6, 4) = 2   ==   33554432 00000000000000000000000001000000
          vec($_, 7, 4) = 2   ==  536870912 00000000000000000000000000000100
          vec($_, 0, 4) = 4   ==          4 00100000000000000000000000000000
          vec($_, 1, 4) = 4   ==         64 00000010000000000000000000000000
          vec($_, 2, 4) = 4   ==       1024 00000000001000000000000000000000
          vec($_, 3, 4) = 4   ==      16384 00000000000000100000000000000000
          vec($_, 4, 4) = 4   ==     262144 00000000000000000010000000000000
          vec($_, 5, 4) = 4   ==    4194304 00000000000000000000001000000000
          vec($_, 6, 4) = 4   ==   67108864 00000000000000000000000000100000
          vec($_, 7, 4) = 4   == 1073741824 00000000000000000000000000000010
          vec($_, 0, 4) = 8   ==          8 00010000000000000000000000000000
          vec($_, 1, 4) = 8   ==        128 00000001000000000000000000000000
          vec($_, 2, 4) = 8   ==       2048 00000000000100000000000000000000
          vec($_, 3, 4) = 8   ==      32768 00000000000000010000000000000000
          vec($_, 4, 4) = 8   ==     524288 00000000000000000001000000000000
          vec($_, 5, 4) = 8   ==    8388608 00000000000000000000000100000000
          vec($_, 6, 4) = 8   ==  134217728 00000000000000000000000000010000
          vec($_, 7, 4) = 8   == 2147483648 00000000000000000000000000000001
          vec($_, 0, 8) = 1   ==          1 10000000000000000000000000000000
          vec($_, 1, 8) = 1   ==        256 00000000100000000000000000000000
          vec($_, 2, 8) = 1   ==      65536 00000000000000001000000000000000
          vec($_, 3, 8) = 1   ==   16777216 00000000000000000000000010000000
          vec($_, 0, 8) = 2   ==          2 01000000000000000000000000000000
          vec($_, 1, 8) = 2   ==        512 00000000010000000000000000000000
          vec($_, 2, 8) = 2   ==     131072 00000000000000000100000000000000
          vec($_, 3, 8) = 2   ==   33554432 00000000000000000000000001000000
          vec($_, 0, 8) = 4   ==          4 00100000000000000000000000000000
          vec($_, 1, 8) = 4   ==       1024 00000000001000000000000000000000
          vec($_, 2, 8) = 4   ==     262144 00000000000000000010000000000000
          vec($_, 3, 8) = 4   ==   67108864 00000000000000000000000000100000
          vec($_, 0, 8) = 8   ==          8 00010000000000000000000000000000
          vec($_, 1, 8) = 8   ==       2048 00000000000100000000000000000000
          vec($_, 2, 8) = 8   ==     524288 00000000000000000001000000000000
          vec($_, 3, 8) = 8   ==  134217728 00000000000000000000000000010000
          vec($_, 0, 8) = 16  ==         16 00001000000000000000000000000000
          vec($_, 1, 8) = 16  ==       4096 00000000000010000000000000000000
          vec($_, 2, 8) = 16  ==    1048576 00000000000000000000100000000000
          vec($_, 3, 8) = 16  ==  268435456 00000000000000000000000000001000
          vec($_, 0, 8) = 32  ==         32 00000100000000000000000000000000
          vec($_, 1, 8) = 32  ==       8192 00000000000001000000000000000000
          vec($_, 2, 8) = 32  ==    2097152 00000000000000000000010000000000
          vec($_, 3, 8) = 32  ==  536870912 00000000000000000000000000000100
          vec($_, 0, 8) = 64  ==         64 00000010000000000000000000000000
          vec($_, 1, 8) = 64  ==      16384 00000000000000100000000000000000
          vec($_, 2, 8) = 64  ==    4194304 00000000000000000000001000000000
          vec($_, 3, 8) = 64  == 1073741824 00000000000000000000000000000010
          vec($_, 0, 8) = 128 ==        128 00000001000000000000000000000000
          vec($_, 1, 8) = 128 ==      32768 00000000000000010000000000000000
          vec($_, 2, 8) = 128 ==    8388608 00000000000000000000000100000000
          vec($_, 3, 8) = 128 == 2147483648 00000000000000000000000000000001
     
    wait

    • wait

      Behaves like wait(2) on your system: it waits for a child process to terminate and returns the pid of the deceased process, or -1 if there are no child processes. The status is returned in $? and ${^CHILD_ERROR_NATIVE}. Note that a return value of -1 could mean that child processes are being automatically reaped, as described in perlipc.

      If you use wait in your handler for $SIG{CHLD}, it may accidentally wait for the child created by qx() or system(). See perlipc for details.
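      A minimal sketch of the basic pattern, assuming a Unix-like system where fork is available and where children are not being reaped automatically. The exit code 3 is arbitrary, and POSIX::_exit is used in the child only so that it leaves without running the parent's cleanup code.

```perl
use strict;
use warnings;
use POSIX ();

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: leave immediately with a recognizable status.
    POSIX::_exit(3);
}

# Parent: block until the child terminates.
my $reaped    = wait();
my $exit_code = $? >> 8;       # high byte of $? holds the exit status

# With no children left, wait returns -1.
my $again = wait();

print "reaped $reaped, exit code $exit_code, next wait returned $again\n";
```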

      Portability issues: wait in perlport.

     
    waitpid

    • waitpid PID,FLAGS

      Waits for a particular child process to terminate and returns the pid of the deceased process, or -1 if there is no such child process. On some systems, a value of 0 indicates that there are processes still running. The status is returned in $? and ${^CHILD_ERROR_NATIVE}. If you say

          use POSIX ":sys_wait_h";
          #...
          do {
              $kid = waitpid(-1, WNOHANG);
          } while $kid > 0;

      then you can do a non-blocking wait for all pending zombie processes. Non-blocking wait is available on machines supporting either the waitpid(2) or wait4(2) syscalls. However, waiting for a particular pid with FLAGS of 0 is implemented everywhere. (Perl emulates the system call by remembering the status values of processes that have exited but have not been harvested by the Perl script yet.)
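      The two styles can be combined, as in the following sketch. It assumes a Unix-like system with fork and no automatic reaping of children; the exit code 7 is arbitrary.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: exit at once without running the parent's cleanup code.
    POSIX::_exit(7);
}

# Blocking wait for one particular child (FLAGS of 0).
my $reaped = waitpid($pid, 0);
my $status = $? >> 8;

# Non-blocking sweep for any other finished children.
# It stops as soon as waitpid reports nothing left to collect.
my $kid;
do {
    $kid = waitpid(-1, WNOHANG);
} while $kid > 0;

print "child $reaped exited with status $status\n";
```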

      Note that on some systems, a return value of -1 could mean that child processes are being automatically reaped. See perlipc for details, and for other examples.

      Portability issues: waitpid in perlport.

     
    wantarray

    • wantarray

      Returns true if the context of the currently executing subroutine or eval is looking for a list value. Returns false if the context is looking for a scalar. Returns the undefined value if the context is looking for no value (void context).

          return unless defined wantarray; # don't bother doing more
          my @a = complex_calculation();
          return wantarray ? @a : "@a";
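      The three possible results can be seen side by side with a small helper. The sub name context() and the $seen variable are made up for this sketch.

```perl
use strict;
use warnings;

my $seen;    # remembers the context of the most recent call

sub context {
    my $ctx = wantarray          ? 'list'
            : defined(wantarray) ? 'scalar'
            :                      'void';
    $seen = $ctx;
    return $ctx;
}

my @in_list   = context();    # list context
my $in_scalar = context();    # scalar context
context();                    # void context: the return value is discarded
my $in_void   = $seen;

print "$in_list[0] $in_scalar $in_void\n";
```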

      wantarray()'s result is unspecified in the top level of a file, in a BEGIN, UNITCHECK, CHECK, INIT or END block, or in a DESTROY method.

      This function should have been named wantlist() instead.

     
    warn

    • warn LIST

      Prints the value of LIST to STDERR. If the last element of LIST does not end in a newline, it appends the same file/line number text as die does.

      If the output is empty and $@ already contains a value (typically from a previous eval), that value is used after appending "\t...caught" to $@. This is useful for staying almost, but not entirely, similar to die.

      If $@ is empty then the string "Warning: Something's wrong" is used.

      No message is printed if there is a $SIG{__WARN__} handler installed. It is the handler's responsibility to deal with the message as it sees fit (like, for instance, converting it into a die). Most handlers must therefore arrange to actually display the warnings that they are not prepared to deal with, by calling warn again in the handler. Note that this is quite safe and will not produce an endless loop, since __WARN__ hooks are not called from inside one.
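      The following sketch shows a handler taking over the message. It installs the handler with local so that it only applies inside the block; the @caught array and the message texts are illustrative.

```perl
use strict;
use warnings;

my @caught;

{
    # While this handler is installed, warn prints nothing by itself;
    # the message is handed to the handler as its first argument.
    local $SIG{__WARN__} = sub { push @caught, $_[0] };

    warn "disk almost full\n";    # ends in a newline: passed through as is
    warn "no newline here";       # file and line number text is appended
}

print "captured: $_" for @caught;
```

      A handler that wants some messages to reach STDERR after all can call warn on them itself, since the hook is not invoked again from inside the handler.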

      You will find this behavior is slightly different from that of $SIG{__DIE__} handlers (which don't suppress the error text, but can instead call die again to change it).

      Using a __WARN__ handler provides a powerful way to silence all warnings (even the so-called mandatory ones). An example:

          # wipe out *all* compile-time warnings
          BEGIN { $SIG{'__WARN__'} = sub { warn $_[0] if $DOWARN } }
          my $foo = 10;
          my $foo = 20;          # no warning about duplicate my $foo,
                                 # but hey, you asked for it!
          # no compile-time or run-time warnings before here
          $DOWARN = 1;

          # run-time warnings enabled after here
          warn "\$foo is alive and $foo!";     # does show up

      See perlvar for details on setting %SIG entries and for more examples. See the Carp module for other kinds of warnings using its carp() and cluck() functions.

     
    when

     
    while

     
    write

    • write FILEHANDLE

    • write EXPR
    • write

      Writes a formatted record (possibly multi-line) to the specified FILEHANDLE, using the format associated with that file. By default the format for a file is the one having the same name as the filehandle, but the format for the current output channel (see the select function) may be set explicitly by assigning the name of the format to the $~ variable.

      Top-of-form processing is handled automatically: if there is insufficient room on the current page for the formatted record, the page is advanced by writing a form feed, and a special top-of-page format is used to format the new page header before the record is written. By default, the top-of-page format is the name of the filehandle with "_TOP" appended. This would be a problem with autovivified filehandles, but it may be dynamically set to the format of your choice by assigning the name to the $^ variable while that filehandle is selected. The number of lines remaining on the current page is in variable $-, which can be set to 0 to force a new page.

      If FILEHANDLE is unspecified, output goes to the current default output channel, which starts out as STDOUT but may be changed by the select operator. If the FILEHANDLE is an EXPR, then the expression is evaluated and the resulting string is used to look up the name of the FILEHANDLE at run time. For more on formats, see perlform.
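      Here is a minimal sketch of selecting a format by name for a lexical filehandle. The format name REPORT, the variables $name and $count, and the sample rows are all made up for illustration, and the output goes to an in-memory file so the result can be inspected. No top-of-page format is defined, so no page header is written.

```perl
use strict;
use warnings;

# Variables used in a format must be visible where the format is declared.
our ($name, $count);

format REPORT =
@<<<<<<<<< @>>>>
$name,     $count
.

my $out = '';
open(my $fh, '>', \$out) or die "cannot open in-memory file: $!";

# $~ applies to the currently selected handle, so select $fh briefly.
my $old = select($fh);
$~ = 'REPORT';
select($old);

for my $row (['apples', 3], ['pears', 12]) {
    ($name, $count) = @$row;
    write($fh);               # one formatted record per call
}
close($fh);

print $out;
```

      Assigning to $~ is what ties the lexical handle to the named format; without it, write would look for a format named after the handle itself.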

      Note that write is not the opposite of read. Unfortunately.

     
    x

    • x
    • xor

      These operators are documented in perlop.

     
    xor

    • xor

      These operators are documented in perlop.

     
    y

     
    User::grent

    NAME

    User::grent - by-name interface to Perl's built-in getgr*() functions

    SYNOPSIS

        use User::grent;
        $gr = getgrgid(0) or die "No group zero";
        if ( $gr->name eq 'wheel' && @{$gr->members} > 1 ) {
            print "gid zero name wheel, with other members";
        }

        use User::grent qw(:FIELDS);
        getgrgid(0) or die "No group zero";
        if ( $gr_name eq 'wheel' && @gr_members > 1 ) {
            print "gid zero name wheel, with other members";
        }

        $gr = getgr($whoever);

    DESCRIPTION

    This module's default exports override the core getgrent(), getgrgid(), and getgrnam() functions, replacing them with versions that return "User::grent" objects. This object has methods that return the similarly named structure field from C's group structure in grp.h; namely name, passwd, gid, and members (not mem). The first three return scalars, the last an array reference.

    You may also import all the structure fields directly into your namespace as regular variables using the :FIELDS import tag. (Note that this still overrides your core functions.) Access these fields as variables named with a preceding gr_ . Thus, $group_obj->gid() corresponds to $gr_gid if you import the fields. Array references are available as regular array variables, so @{ $group_obj->members() } would be simply @gr_members.

    The getgr() function is a simple front-end that forwards a numeric argument to getgrgid() and the rest to getgrnam().

    To access this functionality without the core overrides, pass the use statement an empty import list, and then access the functions by their fully qualified names. On the other hand, the built-ins are still available via the CORE:: pseudo-package.
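    The no-override style might look like the following sketch. It assumes the system has a group with gid 0, which is common on Unix-like systems but not guaranteed, so the lookup result is checked before use.

```perl
use strict;
use warnings;
use User::grent ();    # empty import list: the core getgr*() are left alone

# Object interface, reached through the fully qualified name.
my $gr = User::grent::getgrgid(0);

# The built-in is still the plain list-returning function.
my @core = CORE::getgrgid(0);

if ($gr) {
    printf "group %s (gid %d) has %d listed member(s)\n",
        $gr->name, $gr->gid, scalar @{ $gr->members };
}
else {
    print "no group with gid 0 on this system\n";
}
```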

    NOTE

    While this class is currently implemented using the Class::Struct module to build a struct-like class, you shouldn't rely upon this.

    AUTHOR

    Tom Christiansen

     
    User::pwent

    NAME

    User::pwent - by-name interface to Perl's built-in getpw*() functions

    SYNOPSIS

        use User::pwent;
        $pw = getpwnam('daemon') || die "No daemon user";
        if ( $pw->uid == 1 && $pw->dir =~ m#^/(bin|tmp)?\z#s ) {
            print "uid 1 on root dir";
        }

        $real_shell = $pw->shell || '/bin/sh';

        for (($fullname, $office, $workphone, $homephone) =
            split /\s*,\s*/, $pw->gecos)
        {
            s/&/ucfirst(lc($pw->name))/ge;
        }

        use User::pwent qw(:FIELDS);
        getpwnam('daemon') || die "No daemon user";
        if ( $pw_uid == 1 && $pw_dir =~ m#^/(bin|tmp)?\z#s ) {
            print "uid 1 on root dir";
        }

        $pw = getpw($whoever);

        use User::pwent qw/:DEFAULT pw_has/;
        if (pw_has(qw[gecos expire quota])) { .... }
        if (pw_has("name uid gid passwd")) { .... }
        print "Your struct pwd has: ", scalar pw_has(), "\n";

    DESCRIPTION

    This module's default exports override the core getpwent(), getpwuid(), and getpwnam() functions, replacing them with versions that return User::pwent objects. This object has methods that return the similarly named structure field from C's passwd structure in pwd.h, stripped of their leading "pw_" parts, namely name, passwd, uid, gid, change, age, quota, comment, class, gecos, dir, shell, and expire. The passwd, gecos, and shell fields are tainted when running in taint mode.

    You may also import all the structure fields directly into your namespace as regular variables using the :FIELDS import tag. (Note that this still overrides your core functions.) Access these fields as variables named with a preceding pw_ in front of their method names. Thus, $passwd_obj->shell corresponds to $pw_shell if you import the fields.

    The getpw() function is a simple front-end that forwards a numeric argument to getpwuid() and the rest to getpwnam().

    To access this functionality without the core overrides, pass the use statement an empty import list, and then access the functions by their fully qualified names. The built-ins are always still available via the CORE:: pseudo-package.

    System Specifics

    Perl believes that no machine ever has more than one of change, age, or quota implemented, nor more than one of either comment or class. Some machines do not support expire, gecos, or allegedly, passwd. You may call these methods no matter what machine you're on, but they return undef if unimplemented.

    You may ask whether one of these was implemented on the system Perl was built on by asking the importable pw_has function about them. This function returns true if all parameters are supported fields on the build platform, false if one or more were not, and raises an exception if you asked about a field that Perl never knows how to provide. Parameters may be in a space-separated string, or as separate arguments. If you pass no parameters, the function returns the list of struct pwd fields supported by your build platform's C library, as a list in list context, or a space-separated string in scalar context. Note that just because your C library had a field doesn't necessarily mean that it's fully implemented on that system.
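    The different calling styles of pw_has can be sketched as follows. Which optional fields are reported depends on the platform Perl was built on, so the printed list will vary; the variable names are illustrative.

```perl
use strict;
use warnings;
use User::pwent qw(:DEFAULT pw_has);

# Fields may be given as one space-separated string or as separate arguments.
my $has_basics      = pw_has("name uid gid");
my $has_basics_list = pw_has(qw(name uid gid));

# With no arguments: a list in list context, a string in scalar context.
my @field_list  = pw_has();
my $field_names = pw_has();

# Asking first avoids depending on an undef return from an optional field.
my $has_expire = pw_has("expire");

print "supported fields: $field_names\n";
print "expire is ", ($has_expire ? "" : "not "), "available here\n";
```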

    Interpretation of the gecos field varies between systems, but traditionally holds 4 comma-separated fields containing the user's full name, office location, work phone number, and home phone number. An & in the gecos field should be replaced by the user's properly capitalized login name. The shell field, if blank, must be assumed to be /bin/sh. Perl does not do this for you. The passwd is one-way hashed garble, not clear text, and may not be unhashed save by brute-force guessing. Secure systems use a more secure hashing than DES. On systems supporting shadow password systems, Perl automatically returns the shadow password entry when called by a suitably empowered user, even if your underlying vendor-provided C library was too short-sighted to realize it should do this.
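    The traditional conventions can be applied with two small helpers. This is a sketch only: parse_gecos() and effective_shell() are hypothetical names, not part of User::pwent, and the sample gecos string is invented. In real use the arguments would come from $pw->gecos, $pw->name and $pw->shell.

```perl
use strict;
use warnings;

# Split a traditional four-field gecos string and expand '&'
# to the capitalized login name in each field.
sub parse_gecos {
    my ($gecos, $login) = @_;
    my @fields = split /\s*,\s*/, $gecos;
    s/&/ucfirst(lc($login))/ge for @fields;
    my %info;
    @info{qw(fullname office workphone homephone)} = @fields;
    return \%info;
}

# A blank shell field means /bin/sh; Perl does not fill that in itself.
sub effective_shell {
    my ($shell) = @_;
    return (defined $shell && length $shell) ? $shell : '/bin/sh';
}

my $info  = parse_gecos('Jane &, Room 12, 555-0100, 555-0199', 'jdoe');
my $shell = effective_shell('');

print "$info->{fullname} works in $info->{office} and uses $shell\n";
```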

    See passwd(5) and getpwent(3) for details.

    NOTE

    While this class is currently implemented using the Class::Struct module to build a struct-like class, you shouldn't rely upon this.

    AUTHOR

    Tom Christiansen

    HISTORY

    • March 18th, 2000

      Reworked internals to support better interface to dodgy fields than normal Perl function provides. Added pw_has() field. Improved documentation.

     
    Unicode::Collate

    NAME

    Unicode::Collate - Unicode Collation Algorithm

    SYNOPSIS

        use Unicode::Collate;

        #construct
        $Collator = Unicode::Collate->new(%tailoring);

        #sort
        @sorted = $Collator->sort(@not_sorted);

        #compare
        $result = $Collator->cmp($a, $b); # returns 1, 0, or -1.

    Note: Strings in @not_sorted, $a and $b are interpreted according to Perl's Unicode support. See perlunicode, perluniintro, perlunitut, perlunifaq, utf8. Otherwise, you can use preprocess, or you should decode them beforehand.
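    As a concrete instance of the synopsis, the sketch below uses a collator with no tailoring and compares its result with Perl's built-in sort. The word list is arbitrary.

```perl
use strict;
use warnings;
use Unicode::Collate;

# No parameters: default collation.
my $Collator = Unicode::Collate->new();

my @words = qw(banana Apple cherry apple);

# Collation order keeps "apple" and "Apple" together ...
my @sorted = $Collator->sort(@words);

# ... whereas the built-in sort compares code points,
# which puts every capital letter before any lowercase one.
my @by_codepoint = sort @words;

# cmp returns -1, 0, or 1, like the built-in cmp operator.
my $result = $Collator->cmp('a', 'B');

print "collated:   @sorted\n";
print "code point: @by_codepoint\n";
```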

    DESCRIPTION

    This module is an implementation of Unicode Technical Standard #10 (a.k.a. UTS #10) - Unicode Collation Algorithm (a.k.a. UCA).

    Constructor and Tailoring

    The new method returns a collator object. If new() is called with no parameters, the collator should do the default collation.

        $Collator = Unicode::Collate->new(
            UCA_Version => $UCA_Version,
            alternate => $alternate, # alias for 'variable'
            backwards => $levelNumber, # or \@levelNumbers
            entry => $element,
            hangul_terminator => $term_primary_weight,
            highestFFFF => $bool,
            identical => $bool,
            ignoreName => qr/$ignoreName/,
            ignoreChar => qr/$ignoreChar/,
            ignore_level2 => $bool,
            katakana_before_hiragana => $bool,
            level => $collationLevel,
            minimalFFFE => $bool,
            normalization => $normalization_form,
            overrideCJK => \&overrideCJK,
            overrideHangul => \&overrideHangul,
            preprocess => \&preprocess,
            rearrange => \@charList,
            rewrite => \&rewrite,
            suppress => \@charList,
            table => $filename,
            undefName => qr/$undefName/,
            undefChar => qr/$undefChar/,
            upper_before_lower => $bool,
            variable => $variable,
        );

    • UCA_Version

      If the revision (previously "tracking version") number of UCA is given, the behavior of that revision is emulated when collating. If omitted, the return value of UCA_Version() is used.

      The following revisions are supported. The default is 26.

          UCA     Unicode Standard            DUCET (@version)
          -------------------------------------------------------
           8      3.1                         3.0.1 (3.0.1d9)
           9      3.1 with Corrigendum 3      3.1.1 (3.1.1)
          11      4.0                         4.0.0 (4.0.0)
          14      4.1.0                       4.1.0 (4.1.0)
          16      5.0                         5.0.0 (5.0.0)
          18      5.1.0                       5.1.0 (5.1.0)
          20      5.2.0                       5.2.0 (5.2.0)
          22      6.0.0                       6.0.0 (6.0.0)
          24      6.1.0                       6.1.0 (6.1.0)
          26      6.2.0                       6.2.0 (6.2.0)

      * Noncharacters (e.g. U+FFFF) are not ignored, and can be overridden since UCA_Version 22.

      * Fully ignorable characters were ignored, and would not interrupt contractions with UCA_Version 9 and 11.

      * Treatment of ignorables after variables and some behaviors were changed at UCA_Version 9.

      * Characters regarded as CJK unified ideographs (cf. overrideCJK) depend on UCA_Version.

      * Many hangul jamo are assigned at UCA_Version 20, which affects hangul_terminator.

    • alternate

      -- see 3.2.2 Alternate Weighting, version 8 of UTS #10

      For backward compatibility, alternate (old name) can be used as an alias for variable.

    • backwards

      -- see 3.4 Backward Accents, UTS #10.

          backwards => $levelNumber or \@levelNumbers

      Weights are compared in reverse order at the specified level(s); e.g. level 2 (diacritic ordering) in French. If omitted (or if $levelNumber is undef or \@levelNumbers is []), weights are compared forwards at all levels.

    • entry

      -- see 5 Tailoring; 3.6.1 File Format, UTS #10.

      If the same character (or a sequence of characters) exists in the collation element table through table , mapping to collation elements is overridden. If it does not exist, the mapping is defined additionally.

          entry => <<'ENTRY', # for DUCET v4.0.0 (allkeys-4.0.0.txt)
          0063 0068 ; [.0E6A.0020.0002.0063] # ch
          0043 0068 ; [.0E6A.0020.0007.0043] # Ch
          0043 0048 ; [.0E6A.0020.0008.0043] # CH
          006C 006C ; [.0F4C.0020.0002.006C] # ll
          004C 006C ; [.0F4C.0020.0007.004C] # Ll
          004C 004C ; [.0F4C.0020.0008.004C] # LL
          00F1 ; [.0F7B.0020.0002.00F1] # n-tilde
          006E 0303 ; [.0F7B.0020.0002.00F1] # n-tilde
          00D1 ; [.0F7B.0020.0008.00D1] # N-tilde
          004E 0303 ; [.0F7B.0020.0008.00D1] # N-tilde
          ENTRY

          entry => <<'ENTRY', # for DUCET v4.0.0 (allkeys-4.0.0.txt)
          00E6 ; [.0E33.0020.0002.00E6][.0E8B.0020.0002.00E6] # ae ligature as <a><e>
          00C6 ; [.0E33.0020.0008.00C6][.0E8B.0020.0008.00C6] # AE ligature as <A><E>
          ENTRY

      NOTE: The code point in the UCA file format (before ';') must be a Unicode code point (written in hexadecimal), not a native code point. So 0063 must always denote U+0063, not the character "\x63".

      Weighting may vary depending on the collation element table. So ensure that the weights defined in entry are consistent with those in the collation element table loaded via table.

      In DUCET v4.0.0, the primary weight of C is 0E60 and that of D is 0E6D. So setting the primary weight of CH to 0E6A (a value between 0E60 and 0E6D) gives the ordering C < CH < D. Strictly speaking, DUCET already has some characters between C and D: small capital C (U+1D04) with primary weight 0E64, c-hook/C-hook (U+0188/U+0187) with 0E65, and c-curl (U+0255) with 0E69. So the primary weight 0E6A for CH places CH between c-curl and D.

    • hangul_terminator

      -- see 7.1.4 Trailing Weights, UTS #10.

      If a true value is given (non-zero but should be positive), it will be added as a terminator primary weight to the end of every standard Hangul syllable. Secondary and any higher weights for terminator are set to zero. If the value is false or hangul_terminator key does not exist, insertion of terminator weights will not be performed.

      Boundaries of Hangul syllables are determined according to conjoining Jamo behavior in the Unicode Standard and HangulSyllableType.txt.

      Implementation Note: (1) For an expansion mapping (a Unicode character mapped to a sequence of collation elements), a terminator will not be added between collation elements, even if a Hangul syllable boundary exists there. A terminator is added only at the position following the last collation element.

      (2) Non-conjoining Hangul letters (Compatibility Jamo, halfwidth Jamo, and enclosed letters) are not automatically terminated with a terminator primary weight. These characters may need terminator included in a collation element table beforehand.

    • highestFFFF

      -- see 5.14 Collation Elements, UTS #35.

      If the parameter is made true, U+FFFF has the highest primary weight. When both $coll->ge($str, "abc") and $coll->le($str, "abc\x{FFFF}") are true, it is expected that $str begins with "abc", or another primary equivalent. $str may be "abcd" or "abc012", but should not include U+FFFF, as in "abc\x{FFFF}xyz".

      $coll->le($str, "abc\x{FFFF}") works almost like $coll->lt($str, "abd"), but the latter has the problem that you must know which letter comes next after c. For a language in which ch is the letter after c, "abch" is greater than "abc\x{FFFF}", but less than "abd".

      Note: This is equivalent to entry => 'FFFF ; [.FFFE.0020.0005.FFFF]'. Any character other than U+FFFF can be tailored by entry.
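
      The prefix test described above can be sketched as follows. The candidate strings are invented for the example, and the result shown is what the description above leads one to expect with a DUCET-based collator; treat it as an illustration, not a guarantee for every tailoring.

```perl
use strict;
use warnings;
use Unicode::Collate;

my $coll = Unicode::Collate->new(highestFFFF => 1);

# Keep only the strings that collate between "abc" and "abc" followed by
# U+FFFF, i.e. those that begin with "abc" (or a primary equivalent).
my @candidates  = qw(ab abc abcd abc012 abd b);
my @with_prefix = grep {
    $coll->ge($_, "abc") && $coll->le($_, "abc\x{FFFF}")
} @candidates;

print "@with_prefix\n";    # expected: abc abcd abc012
```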

    • identical

      -- see A.3 Deterministic Comparison, UTS #10.

      By default, strings whose weights are equal should be equal, even though their code points are not equal. Completely ignorable characters are ignored.

      If the parameter is made true, a final, tie-breaking level is used. If no difference of weights is found after the comparison through all the levels specified by level, a comparison by code points is performed. For the tie-breaking comparison, the sort key has the code points of the original string appended. Completely ignorable characters are not ignored.

      If preprocess and/or normalization is applied, the code points of the string after them (in NFD by default) are used.
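
      A minimal sketch of the difference, assuming the default DUCET, in which U+0000 is completely ignorable (the strings are invented for the example):

```perl
use strict;
use warnings;
use Unicode::Collate;

my $default   = Unicode::Collate->new();
my $identical = Unicode::Collate->new(identical => 1);

# The two strings differ only by NUL characters, which carry no weights.
my ($x, $y) = ("cam\0e\0l", "camel");

my $eq_default   = $default->eq($x, $y)   ? 1 : 0;
my $eq_identical = $identical->eq($x, $y) ? 1 : 0;

print "default:   ", ($eq_default   ? "equal" : "different"), "\n";  # equal
print "identical: ", ($eq_identical ? "equal" : "different"), "\n";  # different
```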

    • ignoreChar
    • ignoreName

      -- see 3.6.2 Variable Weighting, UTS #10.

      Makes the entry in the table completely ignorable; i.e. as if the weights were zero at all levels.

      Through ignoreChar, any character matching qr/$ignoreChar/ will be ignored. Through ignoreName, any character whose name (given in the table file as a comment) matches qr/$ignoreName/ will be ignored.

      E.g. when 'a' and 'e' are ignorable, 'element' is equal to 'lament' (or 'lmnt').
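
      The 'element'/'lament' example can be tried with the sketch below. The regular expression is an assumption about how to select exactly the two single-character entries; note that using ignoreChar makes the module read the table file (allkeys.txt) at construction time, so that file must be installed and construction is slower.

```perl
use strict;
use warnings;
use Unicode::Collate;

# Treat the single characters "a" and "e" as completely ignorable.
my $no_ae = Unicode::Collate->new(ignoreChar => qr/^[ae]$/);

my $same1 = $no_ae->eq('element', 'lament') ? 1 : 0;
my $same2 = $no_ae->eq('element', 'lmnt')   ? 1 : 0;

print "element vs lament: ", ($same1 ? "equal" : "different"), "\n";
print "element vs lmnt:   ", ($same2 ? "equal" : "different"), "\n";
```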

    • ignore_level2

      -- see 5.1 Parametric Tailoring, UTS #10.

      By default, case-sensitive comparison (that is level 3 difference) won't ignore accents (that is level 2 difference).

      If the parameter is made true, accents (and other primary ignorable characters) are ignored, even though cases are taken into account.

      NOTE: level should be 3 or greater.

    • katakana_before_hiragana

      -- see 7.2 Tertiary Weight Table, UTS #10.

      By default, hiragana is before katakana. If the parameter is made true, this is reversed.

      NOTE: This parameter simplemindedly assumes that any hiragana/katakana distinctions must occur at level 3, and that their weights at level 3 must be the same as those mentioned in 7.3.1, UTS #10. If you define collation elements that violate this requirement, this parameter does not work correctly.

    • level

      -- see 4.3 Form Sort Key, UTS #10.

      Set the maximum level. Any higher levels than the specified one are ignored.

          Level 1: alphabetic ordering
          Level 2: diacritic ordering
          Level 3: case ordering
          Level 4: tie-breaking (e.g. in the case when variable is 'shifted')

          ex. level => 2,

      If omitted, the maximum is the 4th.

      NOTE: The DUCET includes weights over 0xFFFF at the 4th level. But this module only uses weights within 0xFFFF. When variable is 'blanked' or 'non-ignorable' (other than 'shifted' and 'shift-trimmed'), the level 4 may be unreliable.

      See also identical .
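
      To make the levels concrete, the following sketch compares a few invented strings at different maximum levels. It uses only new() and eq() as documented here.

```perl
use strict;
use warnings;
use Unicode::Collate;

my $level1 = Unicode::Collate->new(level => 1);  # letters only
my $level2 = Unicode::Collate->new(level => 2);  # letters + accents
my $full   = Unicode::Collate->new();            # default maximum level

my $accented = "r\x{E9}sum\x{E9}";   # "resume" with two e-acute characters

# Level 1 ignores both accents and case.
my $l1 = $level1->eq($accented, "Resume") ? 1 : 0;

# Level 2 ignores case but not accents.
my $l2_case   = $level2->eq("resume", "Resume")  ? 1 : 0;
my $l2_accent = $level2->eq($accented, "resume") ? 1 : 0;

# With all levels, case matters too.
my $all = $full->eq("resume", "Resume") ? 1 : 0;

print "$l1 $l2_case $l2_accent $all\n";   # 1 1 0 0
```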

    • minimalFFFE

      -- see 5.14 Collation Elements, UTS #35.

      If the parameter is made true, U+FFFE has a minimal primary weight. The comparison between "$a1\x{FFFE}$a2" and "$b1\x{FFFE}$b2" first compares $a1 and $b1 at level 1, and then $a2 and $b2 at level 1, as follows.

          "ab\x{FFFE}a"
          "Ab\x{FFFE}a"
          "ab\x{FFFE}c"
          "Ab\x{FFFE}c"
          "ab\x{FFFE}xyz"
          "abc\x{FFFE}def"
          "abc\x{FFFE}xYz"
          "aBc\x{FFFE}xyz"
          "abcX\x{FFFE}def"
          "abcx\x{FFFE}xyz"
          "b\x{FFFE}aaa"
          "bbb\x{FFFE}a"

      Note: This is equivalent to entry => 'FFFE ; [.0001.0020.0005.FFFE]'. Any character other than U+FFFE can be tailored by entry.

    • normalization

      -- see 4.1 Normalize, UTS #10.

      If specified, strings are normalized before preparation of sort keys (the normalization is executed after preprocess).

      A form name that Unicode::Normalize::normalize() accepts will be applied as $normalization_form. Acceptable names include 'NFD', 'NFC', 'NFKD', and 'NFKC'. See Unicode::Normalize::normalize() for details. If omitted, 'NFD' is used.

      normalization is performed after preprocess (if defined).

      Furthermore, the special values undef and "prenormalized" can be used, though they are not concerned with Unicode::Normalize::normalize().

      If undef (not the string "undef") is passed explicitly as the value for this key, no normalization is carried out (this may make tailoring easier if normalization is not desired). Under (normalization => undef), only contiguous contractions are resolved; e.g. even if A-ring (and A-ring-cedilla) is ordered after Z, A-cedilla-ring would be primary equal to A. In this respect, (normalization => undef, preprocess => sub { NFD(shift) }) is not equivalent to (normalization => 'NFD').

      In the case of (normalization => "prenormalized"), no normalization is performed, but discontiguous contractions with combining characters are resolved. Therefore (normalization => 'prenormalized', preprocess => sub { NFD(shift) }) is equivalent to (normalization => 'NFD'). If source strings are already properly normalized, (normalization => 'prenormalized') may save the time spent on normalization.

      Except with (normalization => undef), Unicode::Normalize is required (see also CAVEATS).

    • overrideCJK

      -- see 7.1 Derived Collation Elements, UTS #10.

      By default, CJK unified ideographs are ordered in Unicode codepoint order, but those in the CJK Unified Ideographs block are ordered before those in the CJK Unified Ideographs Extension A etc.

          In the CJK Unified Ideographs block:
              U+4E00..U+9FA5 if UCA_Version is 8, 9 or 11.
              U+4E00..U+9FBB if UCA_Version is 14 or 16.
              U+4E00..U+9FC3 if UCA_Version is 18.
              U+4E00..U+9FCB if UCA_Version is 20 or 22.
              U+4E00..U+9FCC if UCA_Version is 24 or 26.

          In the CJK Unified Ideographs Extension blocks:
              Ext.A (U+3400..U+4DB5) and Ext.B (U+20000..U+2A6D6) in any UCA_Version.
              Ext.C (U+2A700..U+2B734) if UCA_Version is 20 or greater.
              Ext.D (U+2B740..U+2B81D) if UCA_Version is 22 or greater.

      Through overrideCJK, ordering of CJK unified ideographs (including extensions) can be overridden.

      ex. CJK unified ideographs in the JIS code point order.

          overrideCJK => sub {
              my $u = shift; # get a Unicode codepoint
              my $b = pack('n', $u); # to UTF-16BE
              my $s = your_unicode_to_sjis_converter($b); # convert
              my $n = unpack('n', $s); # convert sjis to short
              [ $n, 0x20, 0x2, $u ]; # return the collation element
          },

      The return value may be an arrayref of 1st to 4th weights as shown above. The return value may be an integer as the primary weight as shown below. If undef is returned, the default derived collation element will be used.

          overrideCJK => sub {
              my $u = shift; # get a Unicode codepoint
              my $b = pack('n', $u); # to UTF-16BE
              my $s = your_unicode_to_sjis_converter($b); # convert
              my $n = unpack('n', $s); # convert sjis to short
              return $n; # return the primary weight
          },

      The return value may be a list containing zero or more of an arrayref, an integer, or undef.

      ex. ignores all CJK unified ideographs.

          overrideCJK => sub {()}, # CODEREF returning empty list

          # where ->eq("Pe\x{4E00}rl", "Perl") is true
          # as U+4E00 is a CJK unified ideograph and to be ignorable.

      If undef is passed explicitly as the value for this key, weights for CJK unified ideographs are treated as undefined. But assignment of weight for CJK unified ideographs in table or entry is still valid.

      Note: In addition to them, 12 CJK compatibility ideographs (U+FA0E, U+FA0F, U+FA11, U+FA13, U+FA14, U+FA1F, U+FA21, U+FA23, U+FA24, U+FA27, U+FA28, U+FA29) are also treated as CJK unified ideographs. But they can't be overridden via overrideCJK when you use DUCET, as the table includes weights for them. table or entry has priority over overrideCJK.

    • overrideHangul

      -- see 7.1 Derived Collation Elements, UTS #10.

      By default, Hangul syllables are decomposed into Hangul Jamo, even if (normalization => undef) . But the mapping of Hangul syllables may be overridden.

      This parameter works like overrideCJK , so see there for examples.

      If you want to override the mapping of Hangul syllables, NFD and NFKD are not appropriate, since NFD and NFKD will decompose Hangul syllables before overriding. FCD may decompose Hangul syllables as the case may be.

      If undef is passed explicitly as the value for this key, weight for Hangul syllables is treated as undefined without decomposition into Hangul Jamo. But definition of weight for Hangul syllables in table or entry is still valid.

    • preprocess

      -- see 5.4 Preprocessing, UTS #10.

      If specified, the coderef is used to preprocess each string before the formation of sort keys.

      ex. dropping English articles, such as "a" or "the". Then, "the pen" is before "a pencil".

          preprocess => sub {
              my $str = shift;
              $str =~ s/\b(?:an?|the)\s+//gi;
              return $str;
          },

      preprocess is performed before normalization (if defined).

      ex. decoding strings in a legacy encoding such as shift-jis:

          $sjis_collator = Unicode::Collate->new(
              preprocess => \&your_shiftjis_to_unicode_decoder,
          );
          @result = $sjis_collator->sort(@shiftjis_strings);

      Note: Strings returned from the coderef will be interpreted according to Perl's Unicode support. See perlunicode, perluniintro, perlunitut, perlunifaq, utf8.
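
      The article-dropping idea can be tried end to end with the sketch below; the two titles are the ones mentioned in the text, and the plain collator is included only for contrast.

```perl
use strict;
use warnings;
use Unicode::Collate;

my $plain = Unicode::Collate->new();

# The preprocess coderef only affects the sort keys; sort() still returns
# the original, unmodified strings.
my $no_articles = Unicode::Collate->new(
    preprocess => sub {
        my $str = shift;
        $str =~ s/\b(?:an?|the)\s+//gi;   # drop "a", "an", "the"
        return $str;
    },
);

my @titles = ("the pen", "a pencil");

my @by_plain    = $plain->sort(@titles);
my @by_articles = $no_articles->sort(@titles);

print join(", ", @by_plain), "\n";      # a pencil, the pen
print join(", ", @by_articles), "\n";   # the pen, a pencil
```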

    • rearrange

      -- see 3.5 Rearrangement, UTS #10.

      Specifies characters that are not coded in logical order and are to be rearranged. If UCA_Version is equal to or less than 11, the default is:

          rearrange => [ 0x0E40..0x0E44, 0x0EC0..0x0EC4 ],

      If you want to disallow any rearrangement, pass undef or [] (a reference to an empty list) as the value for this key.

      If UCA_Version is equal to or greater than 14, the default is [] (i.e. no rearrangement).

      According to version 9 of UCA, this parameter shall not be used; but no warning is issued at present.

    • rewrite

      If specified, the coderef is used to rewrite lines in table or entry . The coderef will get each line, and then should return a rewritten line according to the UCA file format. If the coderef returns an empty line, the line will be skipped.

      e.g. any primary ignorable characters into tertiary ignorable:

          rewrite => sub {
              my $line = shift;
              $line =~ s/\[\.0000\..{4}\..{4}\./[.0000.0000.0000./g;
              return $line;
          },

      This example shows rewriting weights. rewrite is allowed to affect code points, weights, and the name.

      NOTE: table is available to use another table file; preparing a modified table once would be more efficient than rewriting lines on reading an unmodified table every time.

    • suppress

      -- see suppress contractions in 5.14.11 Special-Purpose Commands, UTS #35 (LDML).

      Contractions beginning with the specified characters are suppressed, even if those contractions are defined in table .

      An example for Russian and some languages using the Cyrillic script:

          suppress => [0x0400..0x0417, 0x041A..0x0437, 0x043A..0x045F],

      where 0x0400 stands for U+0400, CYRILLIC CAPITAL LETTER IE WITH GRAVE.

      NOTE: Contractions via entry are not suppressed.

    • table

      -- see 3.6 Default Unicode Collation Element Table, UTS #10.

      You can use another collation element table if desired.

      The table file should be located in the Unicode/Collate directory on @INC. Say, if the filename is Foo.txt, the table file is searched for as Unicode/Collate/Foo.txt in @INC.

      By default, allkeys.txt (the filename of DUCET) is used. If you prepare your own table file, a name other than allkeys.txt may be better, to avoid a namespace conflict.

      NOTE: When XSUB is used, the DUCET is compiled when this module is built, which may save time at run time. Explicitly specifying table => 'allkeys.txt' (or using another table), or using ignoreChar, ignoreName, undefChar, undefName or rewrite will prevent this module from using the compiled DUCET.

      If undef is passed explicitly as the value for this key, no file is read (but you can define collation elements via entry ).

      A typical way to define a collation element table without any file of table:

          $onlyABC = Unicode::Collate->new(
              table => undef,
              entry => << 'ENTRIES',
          0061 ; [.0101.0020.0002.0061] # LATIN SMALL LETTER A
          0041 ; [.0101.0020.0008.0041] # LATIN CAPITAL LETTER A
          0062 ; [.0102.0020.0002.0062] # LATIN SMALL LETTER B
          0042 ; [.0102.0020.0008.0042] # LATIN CAPITAL LETTER B
          0063 ; [.0103.0020.0002.0063] # LATIN SMALL LETTER C
          0043 ; [.0103.0020.0008.0043] # LATIN CAPITAL LETTER C
          ENTRIES
          );

      If ignoreName or undefName is used, character names should be specified as a comment (following # ) on each line.

    • undefChar
    • undefName

      -- see 6.3.4 Reducing the Repertoire, UTS #10.

      Undefines the collation element as if it were unassigned in the table . This reduces the size of the table. If an unassigned character appears in the string to be collated, the sort key is made from its codepoint as a single-character collation element, as it is greater than any other assigned collation elements (in the codepoint order among the unassigned characters). But, it'd be better to ignore characters unfamiliar to you and maybe never used.

      Through undefChar , any character matching qr/$undefChar/ will be undefined. Through undefName , any character whose name (given in the table file as a comment) matches qr/$undefName/ will be undefined.

      ex. Collation weights for beyond-BMP characters are not stored in object:

          undefChar => qr/[^\0-\x{fffd}]/,

    • upper_before_lower

      -- see 6.6 Case Comparisons, UTS #10.

      By default, lowercase is before uppercase. If the parameter is made true, this is reversed.

      NOTE: This parameter simplemindedly assumes that any lowercase/uppercase distinctions must occur at level 3, and that their weights at level 3 must be the same as those mentioned in 7.3.1, UTS #10. If you define collation elements that differ from this requirement, this parameter doesn't work correctly.
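
      A minimal sketch of the effect, using an invented pair of words and the default DUCET:

```perl
use strict;
use warnings;
use Unicode::Collate;

my $lower_first = Unicode::Collate->new();
my $upper_first = Unicode::Collate->new(upper_before_lower => 1);

my @words = qw(Apple apple);

my @default_order  = $lower_first->sort(@words);
my @reversed_order = $upper_first->sort(@words);

print "@default_order\n";    # apple Apple
print "@reversed_order\n";   # Apple apple
```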

    • variable

      -- see 3.6.2 Variable Weighting, UTS #10.

      This key allows for variable weighting of variable collation elements, which are marked with an ASTERISK in the table (NOTE: Many punctuation marks and symbols are variable in allkeys.txt).

          variable => 'blanked', 'non-ignorable', 'shifted', or 'shift-trimmed'.

      These names are case-insensitive. By default (if specification is omitted), 'shifted' is adopted.

          'Blanked'        Variable elements are made ignorable at levels 1 through 3;
                           considered at the 4th level.

          'Non-Ignorable'  Variable elements are not reset to ignorable.

          'Shifted'        Variable elements are made ignorable at levels 1 through 3
                           their level 4 weight is replaced by the old level 1 weight.
                           Level 4 weight for Non-Variable elements is 0xFFFF.

          'Shift-Trimmed'  Same as 'shifted', but all FFFF's at the 4th level
                           are trimmed.
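
      The practical difference between the options shows up with punctuation. The sketch below assumes the default DUCET, in which the hyphen is a variable element, and limits comparison to level 3 so that the fourth-level tie-breaker does not come into play.

```perl
use strict;
use warnings;
use Unicode::Collate;

my $shifted = Unicode::Collate->new(level => 3);   # 'shifted' is the default
my $non_ign = Unicode::Collate->new(level => 3, variable => 'non-ignorable');

# With 'shifted', hyphens are ignorable at levels 1 through 3.
my $eq_shifted = $shifted->eq("c-a-m-e-l", "camel") ? 1 : 0;

# With 'non-ignorable', hyphens keep their primary weights.
my $eq_non_ign = $non_ign->eq("c-a-m-e-l", "camel") ? 1 : 0;

print "shifted:       ", ($eq_shifted ? "equal" : "different"), "\n";  # equal
print "non-ignorable: ", ($eq_non_ign ? "equal" : "different"), "\n";  # different
```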

    Methods for Collation

    • @sorted = $Collator->sort(@not_sorted)

      Sorts a list of strings.

    • $result = $Collator->cmp($a, $b)

      Returns 1 (when $a is greater than $b) or 0 (when $a is equal to $b) or -1 (when $a is less than $b).

    • $result = $Collator->eq($a, $b)
    • $result = $Collator->ne($a, $b)
    • $result = $Collator->lt($a, $b)
    • $result = $Collator->le($a, $b)
    • $result = $Collator->gt($a, $b)
    • $result = $Collator->ge($a, $b)

      They work like the operators of the same names.

          eq : whether $a is equal to $b.
          ne : whether $a is not equal to $b.
          lt : whether $a is less than $b.
          le : whether $a is less than $b or equal to $b.
          gt : whether $a is greater than $b.
          ge : whether $a is greater than $b or equal to $b.

    • $sortKey = $Collator->getSortKey($string)

      -- see 4.3 Form Sort Key, UTS #10.

      Returns a sort key.

      Comparing two sort keys with a binary comparison gives the same result as comparing the original strings using UCA.

          $Collator->getSortKey($a) cmp $Collator->getSortKey($b)

      is equivalent to

          $Collator->cmp($a, $b)
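
      One use of this equivalence is to compute each sort key once and then sort with the built-in cmp, which avoids recomputing keys on every comparison. A minimal sketch with an invented word list:

```perl
use strict;
use warnings;
use Unicode::Collate;

my $Collator = Unicode::Collate->new();
my @words    = qw(cherry Apple banana apple);

# Compute every sort key exactly once.
my %key;
$key{$_} = $Collator->getSortKey($_) for @words;

# Sort keys are binary strings, so the built-in cmp is sufficient here.
my @by_key = sort { $key{$a} cmp $key{$b} } @words;

# Same order as the collator's own sort method.
my @by_method = $Collator->sort(@words);

print "@by_key\n";      # apple Apple banana cherry
print "@by_method\n";   # apple Apple banana cherry
```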
    • $sortKeyForm = $Collator->viewSortKey($string)

      Converts a sorting key into its representation form. If UCA_Version is 8, the output is slightly different.

          use Unicode::Collate;
          my $c = Unicode::Collate->new();
          print $c->viewSortKey("Perl"),"\n";

          # output:
          # [0B67 0A65 0B7F 0B03 | 0020 0020 0020 0020 | 0008 0002 0002 0002 | FFFF FFFF FFFF FFFF]
          #  Level 1               Level 2               Level 3               Level 4

    Methods for Searching

    The match, gmatch, subst, and gsubst methods work like m//, m//g, s///, and s///g, respectively, but they do not understand patterns; they match only a literal substring.

    DISCLAIMER: If the preprocess or normalization parameter is true for $Collator, calling these methods (index, match, gmatch, subst, gsubst) croaks, as the position and the length might differ from those on the specified string.

    The rearrange and hangul_terminator parameters are ignored. katakana_before_hiragana and upper_before_lower don't affect matching and searching, as it doesn't matter which string is greater or lesser.

    • $position = $Collator->index($string, $substring[, $position])
    • ($position, $length) = $Collator->index($string, $substring[, $position])

      If $substring matches a part of $string , returns the position of the first occurrence of the matching part in scalar context; in list context, returns a two-element list of the position and the length of the matching part.

      If $substring does not match any part of $string , returns -1 in scalar context and an empty list in list context.

      e.g. you say

          my $Collator = Unicode::Collate->new( normalization => undef, level => 1 );
          # (normalization => undef) is REQUIRED.
          my $str = "Ich muß studieren Perl.";
          my $sub = "MÜSS";
          my $match;
          if (my($pos,$len) = $Collator->index($str, $sub)) {
              $match = substr($str, $pos, $len);
          }

      and get "muß" in $match since "muß" is primary equal to "MÜSS".

    • $match_ref = $Collator->match($string, $substring)
    • ($match) = $Collator->match($string, $substring)

      If $substring matches a part of $string , in scalar context, returns a reference to the first occurrence of the matching part ($match_ref is always true if matches, since every reference is true); in list context, returns the first occurrence of the matching part.

      If $substring does not match any part of $string , returns undef in scalar context and an empty list in list context.

      e.g.

          if ($match_ref = $Collator->match($str, $sub)) { # scalar context
              print "matches [$$match_ref].\n";
          } else {
              print "doesn't match.\n";
          }

      or

          if (($match) = $Collator->match($str, $sub)) { # list context
              print "matches [$match].\n";
          } else {
              print "doesn't match.\n";
          }

    • @match = $Collator->gmatch($string, $substring)

      If $substring matches a part of $string , returns all the matching parts (or matching count in scalar context).

      If $substring does not match any part of $string , returns an empty list.
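
      A minimal sketch of gmatch in list context, with an invented input string. As for the other searching methods, normalization => undef is needed on the collator.

```perl
use strict;
use warnings;
use Unicode::Collate;

# level => 1 makes the search insensitive to case and accents.
my $Collator = Unicode::Collate->new(normalization => undef, level => 1);

my $str     = "Camel donkey zebra CAMEL horse";
my @matches = $Collator->gmatch($str, "camel");
my $count   = scalar @matches;

print "$count: @matches\n";   # 2: Camel CAMEL

# No match gives an empty list.
my @none = $Collator->gmatch($str, "llama");
print scalar(@none), "\n";    # 0
```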

    • $count = $Collator->subst($string, $substring, $replacement)

      If $substring matches a part of $string, the first occurrence of the matching part is replaced by $replacement ($string is modified) and $count (always equal to 1) is returned.

      $replacement can be a CODEREF , taking the matching part as an argument, and returning a string to replace the matching part (a bit similar to s/(..)/$coderef->($1)/e).

    • $count = $Collator->gsubst($string, $substring, $replacement)

      If $substring matches a part of $string , all the occurrences of the matching part are replaced by $replacement ($string is modified) and $count is returned.

      $replacement can be a CODEREF , taking the matching part as an argument, and returning a string to replace the matching part (a bit similar to s/(..)/$coderef->($1)/eg).

      e.g.

          my $Collator = Unicode::Collate->new( normalization => undef, level => 1 );
          # (normalization => undef) is REQUIRED.
          my $str = "Camel donkey zebra came\x{301}l CAMEL horse cam\0e\0l...";
          $Collator->gsubst($str, "camel", sub { "<b>$_[0]</b>" });

          # now $str is "<b>Camel</b> donkey zebra <b>came\x{301}l</b> <b>CAMEL</b> horse <b>cam\0e\0l</b>...";
          # i.e., all the camels are made bold-faced.

          Examples: levels and ignore_level2 - what does camel match?
          ---------------------------------------------------------------------------
          level  ignore_level2  |  camel  Camel  came\x{301}l  c-a-m-e-l  cam\0e\0l
          ----------------------|----------------------------------------------------
          1      false          |  yes    yes    yes           yes        yes
          2      false          |  yes    yes    no            yes        yes
          3      false          |  yes    no     no            yes        yes
          4      false          |  yes    no     no            no         yes
          ----------------------|----------------------------------------------------
          1      true           |  yes    yes    yes           yes        yes
          2      true           |  yes    yes    yes           yes        yes
          3      true           |  yes    no     yes           yes        yes
          4      true           |  yes    no     yes           no         yes
          ---------------------------------------------------------------------------
          note: if variable => non-ignorable, camel doesn't match c-a-m-e-l
                at any level.

    Other Methods

    • %old_tailoring = $Collator->change(%new_tailoring)
    • $modified_collator = $Collator->change(%new_tailoring)

      Changes the value of specified keys and returns the changed part.

          $Collator = Unicode::Collate->new(level => 4);

          $Collator->eq("perl", "PERL"); # false

          %old = $Collator->change(level => 2); # returns (level => 4).

          $Collator->eq("perl", "PERL"); # true

          $Collator->change(%old); # returns (level => 2).

          $Collator->eq("perl", "PERL"); # false

      Not all (key,value)s are allowed to be changed. See also @Unicode::Collate::ChangeOK and @Unicode::Collate::ChangeNG.

      In the scalar context, returns the modified collator (but it is not a clone from the original).

          $Collator->change(level => 2)->eq("perl", "PERL"); # true

          $Collator->eq("perl", "PERL"); # true; now max level is 2nd.

          $Collator->change(level => 4)->eq("perl", "PERL"); # false

    • $version = $Collator->version()

      Returns the version number (a string) of the Unicode Standard which the table file used by the collator object is based on. If the table does not include a version line (starting with @version ), returns "unknown" .

    • UCA_Version()

      Returns the revision number of UTS #10 this module consults, which should correspond with the DUCET incorporated.

    • Base_Unicode_Version()

      Returns the version number of UTS #10 this module consults, which should correspond with the DUCET incorporated.

    EXPORT

    No method will be exported.

    INSTALL

    Though this module can be used without any table file, to use this module easily, it is recommended to install a table file in the UCA format, by copying it under the directory <a place in @INC>/Unicode/Collate.

    The most preferable one is "The Default Unicode Collation Element Table" (aka DUCET), available from the Unicode Consortium's website:

        http://www.unicode.org/Public/UCA/

        http://www.unicode.org/Public/UCA/latest/allkeys.txt (latest version)

    If DUCET is not installed, it is recommended to copy the file from http://www.unicode.org/Public/UCA/latest/allkeys.txt to <a place in @INC>/Unicode/Collate/allkeys.txt manually.

    CAVEATS

    • Normalization

      Use of the normalization parameter requires the Unicode::Normalize module (see Unicode::Normalize).

      If you do not need it (say, when you need not handle any combining characters), assign normalization => undef explicitly.

      -- see 6.5 Avoiding Normalization, UTS #10.

    • Conformance Test

      The Conformance Test for the UCA is available under http://www.unicode.org/Public/UCA/.

      For CollationTest_SHIFTED.txt, a collator via Unicode::Collate->new() should be used; for CollationTest_NON_IGNORABLE.txt, a collator via Unicode::Collate->new(variable => "non-ignorable", level => 3).

      If UCA_Version is 26 or later, the identical level is preferred; Unicode::Collate->new(identical => 1) and Unicode::Collate->new(identical => 1, variable => "non-ignorable", level => 3) should be used.

      Unicode::Normalize is required to run the Conformance Test.

    AUTHOR, COPYRIGHT AND LICENSE

    The Unicode::Collate module for perl was written by SADAHIRO Tomoyuki, <SADAHIRO@cpan.org>. This module is Copyright(C) 2001-2012, SADAHIRO Tomoyuki. Japan. All rights reserved.

    This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    The file Unicode/Collate/allkeys.txt was copied verbatim from http://www.unicode.org/Public/UCA/6.2.0/allkeys.txt. For this file, Copyright (c) 2001-2012 Unicode, Inc. Distributed under the Terms of Use in http://www.unicode.org/copyright.html.

    SEE ALSO

     
    Unicode::Normalize

    NAME

    Unicode::Normalize - Unicode Normalization Forms

    SYNOPSIS

    (1) using function names exported by default:

        use Unicode::Normalize;

        $NFD_string  = NFD($string);  # Normalization Form D
        $NFC_string  = NFC($string);  # Normalization Form C
        $NFKD_string = NFKD($string); # Normalization Form KD
        $NFKC_string = NFKC($string); # Normalization Form KC

    (2) using function names exported on request:

        use Unicode::Normalize 'normalize';

        $NFD_string  = normalize('D',  $string);  # Normalization Form D
        $NFC_string  = normalize('C',  $string);  # Normalization Form C
        $NFKD_string = normalize('KD', $string);  # Normalization Form KD
        $NFKC_string = normalize('KC', $string);  # Normalization Form KC

    DESCRIPTION

    Parameters:

    $string is used as a string under character semantics (see perlunicode).

    $code_point should be an unsigned integer representing a Unicode code point.

    Note: The XSUB and pure Perl implementations are incompatible in how they interpret $code_point as a decimal number. XSUB converts $code_point to an unsigned integer, but pure Perl does not. Do not use a floating-point value or a negative sign in $code_point.

    Normalization Forms

    • $NFD_string = NFD($string)

      It returns the Normalization Form D (formed by canonical decomposition).

    • $NFC_string = NFC($string)

      It returns the Normalization Form C (formed by canonical decomposition followed by canonical composition).

    • $NFKD_string = NFKD($string)

      It returns the Normalization Form KD (formed by compatibility decomposition).

    • $NFKC_string = NFKC($string)

      It returns the Normalization Form KC (formed by compatibility decomposition followed by canonical composition).

    • $FCD_string = FCD($string)

      If the given string is in FCD ("Fast C or D" form; cf. UTN #5), it returns the string without modification; otherwise it returns an FCD string.

      Note: The FCD form is not always unique, so several different forms may be equivalent to each other. FCD() returns one of these equivalent forms.

    • $FCC_string = FCC($string)

      It returns the FCC form ("Fast C Contiguous"; cf. UTN #5).

      Note: FCC is unique, as are the four normalization forms (NF*).

    • $normalized_string = normalize($form_name, $string)

      It returns $string converted to the normalization form named by $form_name.

      $form_name must be one of the following names:

          'C'  or 'NFC'   for Normalization Form C  (UAX #15)
          'D'  or 'NFD'   for Normalization Form D  (UAX #15)
          'KC' or 'NFKC'  for Normalization Form KC (UAX #15)
          'KD' or 'NFKD'  for Normalization Form KD (UAX #15)
          'FCD'           for "Fast C or D" Form    (UTN #5)
          'FCC'           for "Fast C Contiguous"   (UTN #5)
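
    The difference between the canonical and the compatibility forms is easiest to see on concrete characters. The sketch below is illustrative: the helper code_points() and the two sample characters are assumptions of this example, not part of the module.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFC NFKD NFKC normalize);

# Illustrative helper: render a string as a list of U+XXXX code points.
sub code_points {
    my ($string) = @_;
    return join ' ', map { sprintf('U+%04X', ord($_)) } split //, $string;
}

my $e_acute = "\x{E9}";      # LATIN SMALL LETTER E WITH ACUTE
my $fi_lig  = "\x{FB01}";    # LATIN SMALL LIGATURE FI

# Canonical decomposition splits the accented letter; composition restores it.
print 'NFD  of e-acute: ', code_points(NFD($e_acute)), "\n";
print 'NFC  of that:    ', code_points(NFC(NFD($e_acute))), "\n";

# The ligature has only a compatibility decomposition, so NFD leaves it alone.
print 'NFD  of fi:      ', code_points(NFD($fi_lig)), "\n";
print 'NFKD of fi:      ', code_points(NFKD($fi_lig)), "\n";

# normalize() with a form name gives the same result as the named function.
print "normalize('KC') agrees with NFKC()\n"
    if normalize('KC', $fi_lig) eq NFKC($fi_lig);
```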

    Decomposition and Composition

    • $decomposed_string = decompose($string [, $useCompatMapping])

      It returns the concatenation of the decomposition of each character in the string.

      If the second parameter (a boolean) is omitted or false, canonical decomposition is used; if it is true, compatibility decomposition is used.

      The string returned is not always in NFD/NFKD. Reordering may be required.

          $NFD_string  = reorder(decompose($string));     # equivalent to NFD()
          $NFKD_string = reorder(decompose($string, 1));  # equivalent to NFKD()

    • $reordered_string = reorder($string)

      It returns the result of reordering the combining characters according to Canonical Ordering Behavior.

      For example, when you have a list of NFD/NFKD strings, you can get the concatenated NFD/NFKD string from them, by saying

          $concat_NFD  = reorder(join '', @NFD_strings);
          $concat_NFKD = reorder(join '', @NFKD_strings);

    • $composed_string = compose($string)

      It returns the result of canonical composition without applying any decomposition.

      For example, when you have a NFD/NFKD string, you can get its NFC/NFKC string, by saying

          $NFC_string  = compose($NFD_string);
          $NFKC_string = compose($NFKD_string);

    • ($processed, $unprocessed) = splitOnLastStarter($normalized)

      It returns two strings: the first, $processed, is the part before the last starter, and the second, $unprocessed, is the remaining part, beginning at the last starter. A starter is a character having a combining class of zero (see UAX #15).

      Note that $processed may be empty (when $normalized contains no starter or starts with the last starter), in which case $unprocessed is equal to the entire $normalized.

      When you have a $normalized string and an $unnormalized string following it, simple concatenation is wrong:

          $concat = $normalized . normalize($form, $unnormalized); # wrong!

      Instead, do this:

          ($processed, $unprocessed) = splitOnLastStarter($normalized);
          $concat = $processed . normalize($form, $unprocessed.$unnormalized);

      splitOnLastStarter() should be called with a pre-normalized parameter $normalized that is already in the form $form that you want.

      If you have an array @string whose elements should be concatenated and then normalized, you can do this:

          my $result = "";
          my $unproc = "";
          foreach my $str (@string) {
              $unproc .= $str;
              my $n = normalize($form, $unproc);
              my($p, $u) = splitOnLastStarter($n);
              $result .= $p;
              $unproc = $u;
          }
          $result .= $unproc;
          # instead of normalize($form, join('', @string))

    • $processed = normalize_partial($form, $unprocessed)

      A wrapper for the combination of normalize() and splitOnLastStarter() . Note that $unprocessed will be modified as a side-effect.

      If you have an array @string whose elements should be concatenated and then normalized, you can do this:

          my $result = "";
          my $unproc = "";
          foreach my $str (@string) {
              $unproc .= $str;
              $result .= normalize_partial($form, $unproc);
          }
          $result .= $unproc;
          # instead of normalize($form, join('', @string))

    • $processed = NFD_partial($unprocessed)

      It behaves like normalize_partial('NFD', $unprocessed). Note that $unprocessed will be modified as a side-effect.

    • $processed = NFC_partial($unprocessed)

      It behaves like normalize_partial('NFC', $unprocessed). Note that $unprocessed will be modified as a side-effect.

    • $processed = NFKD_partial($unprocessed)

      It behaves like normalize_partial('NFKD', $unprocessed). Note that $unprocessed will be modified as a side-effect.

    • $processed = NFKC_partial($unprocessed)

      It behaves like normalize_partial('NFKC', $unprocessed). Note that $unprocessed will be modified as a side-effect.
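
    To see why chunk-by-chunk normalization needs the partial functions, consider input in which a combining mark arrives in a later chunk than its base letter. The following sketch is illustrative: the chunk contents and variable names are assumptions, and it relies on NFC_partial() being importable on request.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFC_partial);

# "e" arrives first; COMBINING ACUTE ACCENT and "x" arrive in the next chunk.
my @chunks = ("e", "\x{301}x");

# Naive approach: normalize every chunk on its own, then join the results.
my $naive = join '', map { NFC($_) } @chunks;

# Incremental approach: $pending holds the tail that may still change.
my ($result, $pending) = ('', '');
for my $chunk (@chunks) {
    $pending .= $chunk;
    $result  .= NFC_partial($pending);   # modifies $pending as a side-effect
}
$result .= $pending;

# Reference result: normalize the whole concatenated string at once.
my $whole = NFC(join '', @chunks);

printf "incremental equals whole-string NFC: %s\n", $result eq $whole ? 'yes' : 'no';
printf "naive equals whole-string NFC:       %s\n", $naive  eq $whole ? 'yes' : 'no';
```

    The naive version leaves the accent uncomposed because the base letter and the accent were never seen together in a single call.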

    Quick Check

    (see Annex 8, UAX #15; and DerivedNormalizationProps.txt)

    The following functions check whether the string is in that normalization form.

    The result returned will be one of the following:

        YES     The string is in that normalization form.
        NO      The string is not in that normalization form.
        MAYBE   Dubious. Maybe yes, maybe no.

    • $result = checkNFD($string)

      It returns true (1) if YES; false (empty string) if NO.

    • $result = checkNFC($string)

      It returns true (1) if YES; false (empty string) if NO; undef if MAYBE.

    • $result = checkNFKD($string)

      It returns true (1) if YES; false (empty string) if NO.

    • $result = checkNFKC($string)

      It returns true (1) if YES; false (empty string) if NO; undef if MAYBE.

    • $result = checkFCD($string)

      It returns true (1) if YES; false (empty string) if NO.

    • $result = checkFCC($string)

      It returns true (1) if YES; false (empty string) if NO; undef if MAYBE.

      Note: If a string is not in FCD, it cannot be in FCC. So checkFCC($not_FCD_string) should return NO.

    • $result = check($form_name, $string)

      It returns true (1) if YES; false (empty string) if NO; undef if MAYBE.

      $form_name must be one of the following names:

          'C'  or 'NFC'   for Normalization Form C  (UAX #15)
          'D'  or 'NFD'   for Normalization Form D  (UAX #15)
          'KC' or 'NFKC'  for Normalization Form KC (UAX #15)
          'KD' or 'NFKD'  for Normalization Form KD (UAX #15)
          'FCD'           for "Fast C or D" Form    (UTN #5)
          'FCC'           for "Fast C Contiguous"   (UTN #5)

    Note

    In the cases of NFD, NFKD, and FCD, the answer must be either YES or NO. The answer MAYBE may be returned in the cases of NFC, NFKC, and FCC.

    A MAYBE string should contain at least one combining character or the like. For example, COMBINING ACUTE ACCENT has the MAYBE_NFC/MAYBE_NFKC property.

    Both checkNFC("A\N{COMBINING ACUTE ACCENT}") and checkNFC("B\N{COMBINING ACUTE ACCENT}") will return MAYBE. "A\N{COMBINING ACUTE ACCENT}" is not in NFC (its NFC is "\N{LATIN CAPITAL LETTER A WITH ACUTE}"), while "B\N{COMBINING ACUTE ACCENT}" is in NFC.

    If you want an exact check, compare the string with its NFC/NFKC/FCC form.

        if ($string eq NFC($string)) {
            # $string is exactly normalized in NFC;
        } else {
            # $string is not normalized in NFC;
        }

        if ($string eq NFKC($string)) {
            # $string is exactly normalized in NFKC;
        } else {
            # $string is not normalized in NFKC;
        }
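
    The quick check and the exact comparison can be combined so that the full normalization runs only for the MAYBE case. The sketch below is one way to do that; is_nfc() is a hypothetical helper, not a function provided by the module.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC checkNFC);

# Hypothetical helper: return 1 if $string is in NFC, 0 otherwise.
sub is_nfc {
    my ($string) = @_;
    my $quick = checkNFC($string);
    # YES or NO: the quick check is decisive.
    return $quick ? 1 : 0 if defined $quick;
    # MAYBE (undef): fall back to the exact comparison.
    return $string eq NFC($string) ? 1 : 0;
}

for my $string ("A\x{301}", "B\x{301}", "\x{C1}") {
    my $shown = join ' ', map { sprintf('U+%04X', ord($_)) } split //, $string;
    printf "%-14s %s\n", $shown, is_nfc($string) ? 'in NFC' : 'not in NFC';
}
```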

    Character Data

    These functions are an interface to the character data used internally. If you only want to get Unicode normalization forms, you do not need to call them yourself.

    • $canonical_decomposition = getCanon($code_point)

      If the character is canonically decomposable (including Hangul Syllables), it returns the (full) canonical decomposition as a string. Otherwise it returns undef.

      Note: According to the Unicode standard, the canonical decomposition of a character that is not canonically decomposable is the same as the character itself.

    • $compatibility_decomposition = getCompat($code_point)

      If the character is compatibility decomposable (including Hangul Syllables), it returns the (full) compatibility decomposition as a string. Otherwise it returns undef.

      Note: According to the Unicode standard, the compatibility decomposition of a character that is not compatibility decomposable is the same as the character itself.

    • $code_point_composite = getComposite($code_point_here, $code_point_next)

      If two characters here and next (as code points) are composable (including Hangul Jamo/Syllables and Composition Exclusions), it returns the code point of the composite.

      If they are not composable, it returns undef.

    • $combining_class = getCombinClass($code_point)

      It returns the combining class (as an integer) of the character.

    • $may_be_composed_with_prev_char = isComp2nd($code_point)

      It returns a boolean indicating whether the character at the specified code point may be composed with the previous one in some composition (including Hangul Compositions, but excluding Composition Exclusions and Non-Starter Decompositions).

    • $is_exclusion = isExclusion($code_point)

      It returns a boolean indicating whether the code point is a composition exclusion.

    • $is_singleton = isSingleton($code_point)

      It returns a boolean indicating whether the code point is a singleton.

    • $is_non_starter_decomposition = isNonStDecomp($code_point)

      It returns a boolean indicating whether the code point has a Non-Starter Decomposition.

    • $is_Full_Composition_Exclusion = isComp_Ex($code_point)

      It returns the boolean value of the derived property Comp_Ex (Full_Composition_Exclusion). This property is generated from Composition Exclusions + Singletons + Non-Starter Decompositions.

    • $NFD_is_NO = isNFD_NO($code_point)

      It returns a boolean of the derived property NFD_NO (NFD_Quick_Check=No).

    • $NFC_is_NO = isNFC_NO($code_point)

      It returns a boolean of the derived property NFC_NO (NFC_Quick_Check=No).

    • $NFC_is_MAYBE = isNFC_MAYBE($code_point)

      It returns a boolean of the derived property NFC_MAYBE (NFC_Quick_Check=Maybe).

    • $NFKD_is_NO = isNFKD_NO($code_point)

      It returns a boolean of the derived property NFKD_NO (NFKD_Quick_Check=No).

    • $NFKC_is_NO = isNFKC_NO($code_point)

      It returns a boolean of the derived property NFKC_NO (NFKC_Quick_Check=No).

    • $NFKC_is_MAYBE = isNFKC_MAYBE($code_point)

      It returns a boolean of the derived property NFKC_MAYBE (NFKC_Quick_Check=Maybe).
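
    A few of the character data functions can be exercised on well-known characters. This is a minimal sketch; the sample code points (U+00E9, U+FB01, U+212B) are chosen for illustration only.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(getCanon getCompat getComposite getCombinClass
                          isSingleton isComp_Ex);

# Canonical decomposition of U+00E9: "e" followed by COMBINING ACUTE ACCENT.
my $canon = getCanon(0xE9);
printf "getCanon(U+00E9):       %s\n",
    join ' ', map { sprintf('U+%04X', ord($_)) } split //, $canon;

# The reverse direction: compose the pair back into a single code point.
printf "getComposite(e, acute): U+%04X\n", getComposite(0x65, 0x301);

# Combining class of COMBINING ACUTE ACCENT.
printf "getCombinClass(U+0301): %d\n", getCombinClass(0x301);

# A character that is not canonically decomposable yields undef.
print "getCanon(U+0041) is undef\n" unless defined getCanon(0x41);

# The fi ligature has a compatibility decomposition only.
printf "getCompat(U+FB01):      %s\n", getCompat(0xFB01);

# ANGSTROM SIGN is a singleton, so it is also a full composition exclusion.
printf "U+212B singleton=%d Comp_Ex=%d\n",
    isSingleton(0x212B) ? 1 : 0, isComp_Ex(0x212B) ? 1 : 0;
```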

    EXPORT

    NFC, NFD, NFKC, NFKD: by default.

    normalize and some other functions: on request.

    CAVEATS

    • Perl's version vs. Unicode version

      Since this module refers to perl core's Unicode database in the directory /lib/unicore (or formerly /lib/unicode), the Unicode version of normalization implemented by this module depends on your perl's version.

          perl's version     implemented Unicode version
          --------------     ---------------------------
          5.6.1              3.0.1
          5.7.2              3.1.0
          5.7.3              3.1.1 (normalization is same as 3.1.0)
          5.8.0              3.2.0
          5.8.1-5.8.3        4.0.0
          5.8.4-5.8.6        4.0.1 (normalization is same as 4.0.0)
          5.8.7-5.8.8        4.1.0
          5.10.0             5.0.0
          5.8.9, 5.10.1      5.1.0
          5.12.0-5.12.3      5.2.0
          5.14.x             6.0.0
          5.16.x             6.1.0

    • Correction of decomposition mapping

      In older Unicode versions, a small number of characters (all of which are CJK compatibility ideographs as far as they have been found) may have an erroneous decomposition mapping (see NormalizationCorrections.txt). Anyhow, this module will neither refer to NormalizationCorrections.txt nor provide any specific version of normalization. Therefore this module running on an older perl with an older Unicode database may use the erroneous decomposition mapping blindly conforming to the Unicode database.

    • Revised definition of canonical composition

      In Unicode 4.1.0, the definition D2 of canonical composition (which affects NFC and NFKC) has been changed (see Public Review Issue #29 and recent UAX #15). This module has used the newer definition since the version 0.07 (Oct 31, 2001). This module will not support the normalization according to the older definition, even if the Unicode version implemented by perl is lower than 4.1.0.

    AUTHOR

    SADAHIRO Tomoyuki <SADAHIRO@cpan.org>

    Copyright(C) 2001-2012, SADAHIRO Tomoyuki. Japan. All rights reserved.

    This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

     
    Unicode::UCD

    NAME

    Unicode::UCD - Unicode character database

    SYNOPSIS

        use Unicode::UCD 'charinfo';
        my $charinfo = charinfo($codepoint);

        use Unicode::UCD 'casefold';
        my $casefold = casefold(0xFB00);

        use Unicode::UCD 'all_casefolds';
        my $all_casefolds_ref = all_casefolds();

        use Unicode::UCD 'casespec';
        my $casespec = casespec(0xFB00);

        use Unicode::UCD 'charblock';
        my $charblock = charblock($codepoint);

        use Unicode::UCD 'charscript';
        my $charscript = charscript($codepoint);

        use Unicode::UCD 'charblocks';
        my $charblocks = charblocks();

        use Unicode::UCD 'charscripts';
        my $charscripts = charscripts();

        use Unicode::UCD qw(charscript charinrange);
        my $range = charscript($script);
        print "looks like $script\n" if charinrange($range, $codepoint);

        use Unicode::UCD qw(general_categories bidi_types);
        my $categories = general_categories();
        my $types = bidi_types();

        use Unicode::UCD 'prop_aliases';
        my @space_names = prop_aliases("space");

        use Unicode::UCD 'prop_value_aliases';
        my @gc_punct_names = prop_value_aliases("Gc", "Punct");

        use Unicode::UCD 'prop_invlist';
        my @puncts = prop_invlist("gc=punctuation");

        use Unicode::UCD 'prop_invmap';
        my ($list_ref, $map_ref, $format, $missing)
            = prop_invmap("General Category");

        use Unicode::UCD 'compexcl';
        my $compexcl = compexcl($codepoint);

        use Unicode::UCD 'namedseq';
        my $namedseq = namedseq($named_sequence_name);

        my $unicode_version = Unicode::UCD::UnicodeVersion();

        my $convert_to_numeric =
            Unicode::UCD::num("\N{RUMI DIGIT ONE}\N{RUMI DIGIT TWO}");

    DESCRIPTION

    The Unicode::UCD module offers a series of functions that provide a simple interface to the Unicode Character Database.

    code point argument

    Some of the functions are called with a code point argument, which is either a decimal or a hexadecimal scalar designating a Unicode code point, or U+ followed by hexadecimals designating a Unicode code point. In other words, if you want a code point to be interpreted as a hexadecimal number, you must prefix it with either 0x or U+, because a string such as 123 will be interpreted as a decimal code point. As the examples show, a string with a leading zero, such as 0223, is also interpreted as hexadecimal.

    Examples:

        223     # Decimal 223
        0223    # Hexadecimal 223 (= 547 decimal)
        0xDF    # Hexadecimal DF (= 223 decimal)
        U+DF    # Hexadecimal DF

    Note that the largest code point in Unicode is U+10FFFF.
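
    The forms listed above can be compared by passing each one to charinfo() and printing the code point it resolved to. This is a sketch for illustration; the arguments are quoted as strings so that Perl itself does not first parse a literal such as 0223 as an octal number.

```perl
use strict;
use warnings;
use Unicode::UCD 'charinfo';

# Each argument is one of the forms from the examples above.
for my $arg (223, '0xDF', 'U+DF', '0223') {
    my $info = charinfo($arg);
    printf "%-5s => U+%s %s\n", $arg, $info->{code}, $info->{name};
}
```

    The first three arguments should all resolve to the same code point, while the last one resolves to a different one because it is read as hexadecimal.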

    charinfo()

        use Unicode::UCD 'charinfo';

        my $charinfo = charinfo(0x41);

    This returns information about the input code point argument as a reference to a hash of fields as defined by the Unicode standard. If the code point argument is not assigned in the standard (i.e., has the general category Cn, meaning Unassigned) or is a non-character (meaning it is guaranteed never to be assigned in the standard), undef is returned.

    Fields that aren't applicable to the particular code point argument exist in the returned hash, and are empty.

    The keys in the hash with the meanings of their values are:

    • code

      the input code point argument expressed in hexadecimal, with leading zeros added if necessary to make it contain at least four hexdigits

    • name

      name of code, all IN UPPER CASE. Some control-type code points do not have names. This field will be empty for Surrogate and Private Use code points, and for the others without a name, it will contain a description enclosed in angle brackets, like <control>.

    • category

      The short name of the general category of code. This will match one of the keys in the hash returned by general_categories().

      The prop_value_aliases() function can be used to get all the synonyms of the category name.

    • combining

      the combining class number for code used in the Canonical Ordering Algorithm. For Unicode 5.1, this is described in Section 3.11 Canonical Ordering Behavior available at http://www.unicode.org/versions/Unicode5.1.0/

      The prop_value_aliases() function can be used to get all the synonyms of the combining class number.

    • bidi

      bidirectional type of code. This will match one of the keys in the hash returned by bidi_types().

      The prop_value_aliases() function can be used to get all the synonyms of the bidi type name.

    • decomposition

      is empty if code has no decomposition; or is one or more codes (separated by spaces) that, taken in order, represent a decomposition for code. Each has at least four hexdigits. The codes may be preceded by a word enclosed in angle brackets then a space, like <compat>, giving the type of decomposition.

      This decomposition may be an intermediate one whose components are also decomposable. Use Unicode::Normalize to get the final decomposition.

    • decimal

      if code is a decimal digit this is its integer numeric value

    • digit

      if code represents some other digit-like number, this is its integer numeric value

    • numeric

      if code represents a whole or rational number, this is its numeric value. Rational values are expressed as a string like 1/4.

    • mirrored

      Y or N designating if code is mirrored in bidirectional text

    • unicode10

      name of code in the Unicode 1.0 standard if one existed for this code point and is different from the current name

    • comment

      As of Unicode 6.0, this is always empty.

    • upper

      is empty if there is no single code point uppercase mapping for code (its uppercase mapping is itself); otherwise it is that mapping expressed as at least four hexdigits. (casespec() should be used in addition to charinfo() for case mappings when the calling program can cope with multiple code point mappings.)

    • lower

      is empty if there is no single code point lowercase mapping for code (its lowercase mapping is itself); otherwise it is that mapping expressed as at least four hexdigits. (casespec() should be used in addition to charinfo() for case mappings when the calling program can cope with multiple code point mappings.)

    • title

      is empty if there is no single code point titlecase mapping for code (its titlecase mapping is itself); otherwise it is that mapping expressed as at least four hexdigits. (casespec() should be used in addition to charinfo() for case mappings when the calling program can cope with multiple code point mappings.)

    • block

      the block code belongs to (used in \p{Blk=...}). See Blocks versus Scripts.

    • script

      the script code belongs to. See Blocks versus Scripts.

    Note that you cannot do (de)composition and casing based solely on the decomposition, combining, lower, upper, and title fields; you will also need the compexcl() and casespec() functions.
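
    The field descriptions above can be checked against a concrete character. The following sketch prints a handful of fields for U+0061 and shows the undef return for a non-character; the choice of fields and code points is illustrative.

```perl
use strict;
use warnings;
use Unicode::UCD 'charinfo';

my $info = charinfo(0x61);    # LATIN SMALL LETTER A

# Fields that do not apply (here: lower) exist but are empty strings.
for my $field (qw(code name category upper lower block script)) {
    printf "%-9s '%s'\n", $field, $info->{$field};
}

# U+FFFF is a non-character, so charinfo() returns undef for it.
print "U+FFFF: no charinfo\n" unless defined charinfo(0xFFFF);
```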

    charblock()

        use Unicode::UCD 'charblock';

        my $charblock = charblock(0x41);
        my $charblock = charblock(1234);
        my $charblock = charblock(0x263a);
        my $charblock = charblock("U+263a");

        my $range = charblock('Armenian');

    With a code point argument charblock() returns the block the code point belongs to, e.g. Basic Latin. The old-style block name is returned (see Old-style versus new-style block names). If the code point is unassigned, this returns the block it would belong to if it were assigned. (If the Unicode version being used is so early as to not have blocks, all code points are considered to be in No_Block.)

    See also Blocks versus Scripts.

    If supplied with an argument that can't be a code point, charblock() tries to do the opposite and interpret the argument as an old-style block name. The return value is a range set with one range: an anonymous list with a single element that consists of another anonymous list whose first element is the first code point in the block, and whose second (and final) element is the final code point in the block. (The extra list consisting of just one element is so that the same program logic can be used to handle both this return, and the return from charscript() which can have multiple ranges.) You can test whether a code point is in a range using the charinrange() function. If the argument is not a known block, undef is returned.

    charscript()

        use Unicode::UCD 'charscript';

        my $charscript = charscript(0x41);
        my $charscript = charscript(1234);
        my $charscript = charscript("U+263a");

        my $range = charscript('Thai');

    With a code point argument charscript() returns the script the code point belongs to, e.g. Latin, Greek, Han. If the code point is unassigned or the Unicode version being used is so early that it doesn't have scripts, this function returns "Unknown".

    If supplied with an argument that can't be a code point, charscript() tries to do the opposite and interpret the argument as a script name. The return value is a range set: an anonymous list of lists that contain start-of-range, end-of-range code point pairs. You can test whether a code point is in a range set using the charinrange() function. If the argument is not a known script, undef is returned.

    See also Blocks versus Scripts.

    charblocks()

        use Unicode::UCD 'charblocks';

        my $charblocks = charblocks();

    charblocks() returns a reference to a hash with the known block names as the keys, and the code point ranges (see charblock()) as the values.

    The names are in the old-style (see Old-style versus new-style block names).

    prop_invmap(block) can be used to get this same data in a different type of data structure.

    See also Blocks versus Scripts.

    charscripts()

        use Unicode::UCD 'charscripts';

        my $charscripts = charscripts();

    charscripts() returns a reference to a hash with the known script names as the keys, and the code point ranges (see charscript()) as the values.

    prop_invmap(script) can be used to get this same data in a different type of data structure.

    See also Blocks versus Scripts.

    charinrange()

    In addition to using the \p{Blk=...} and \P{Blk=...} constructs, you can also test whether a code point is in the range as returned by charblock() and charscript() or as the values of the hash returned by charblocks() and charscripts() by using charinrange():

        use Unicode::UCD qw(charscript charinrange);

        $range = charscript('Hiragana');
        print "looks like hiragana\n" if charinrange($range, $codepoint);
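
    The two directions of charblock() and charscript() (code point to name, and name to range set) can be combined with charinrange(). This is a minimal sketch; the sample code points are illustrative.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charblock charscript charinrange);

# Name to range set: one range for a block, possibly several for a script.
my $hiragana_script = charscript('Hiragana');
my $armenian_block  = charblock('Armenian');

# Code point to name, plus membership tests against the range sets.
for my $cp (0x3042, 0x0531, 0x41) {
    printf "U+%04X block='%s' script='%s' hiragana=%s armenian=%s\n",
        $cp, charblock($cp), charscript($cp),
        charinrange($hiragana_script, $cp) ? 'yes' : 'no',
        charinrange($armenian_block,  $cp) ? 'yes' : 'no';
}
```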

    general_categories()

        use Unicode::UCD 'general_categories';

        my $categories = general_categories();

    This returns a reference to a hash which has short general category names (such as Lu, Nd, Zs, S) as keys and long names (such as UppercaseLetter, DecimalNumber, SpaceSeparator, Symbol) as values. The hash is reversible in case you need to go from the long names to the short names. The general category is the one returned from charinfo() under the category key.

    The prop_value_aliases() function can be used to get all the synonyms of the category name.

    bidi_types()

        use Unicode::UCD 'bidi_types';

        my $categories = bidi_types();

    This returns a reference to a hash which has the short bidi (bidirectional) type names (such as L, R) as keys and long names (such as Left-to-Right, Right-to-Left) as values. The hash is reversible in case you need to go from the long names to the short names. The bidi type is the one returned from charinfo() under the bidi key. For the exact meaning of the various bidi classes, Unicode TR9 is recommended reading: http://www.unicode.org/reports/tr9/ (as of Unicode 5.0.0)

    The prop_value_aliases() function can be used to get all the synonyms of the bidi type name.
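
    These two hashes are typically used to expand the short names that charinfo() reports. The sketch below shows that lookup and the reversal from long names back to short names; the variable names are illustrative.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo general_categories bidi_types);

my $categories = general_categories();
my $bidi       = bidi_types();

# Expand the short names reported by charinfo() into their long names.
my $info = charinfo(0x41);
printf "category %s = %s\n", $info->{category}, $categories->{ $info->{category} };
printf "bidi     %s = %s\n", $info->{bidi},     $bidi->{ $info->{bidi} };

# Reverse the hash to map long names back to short names.
my %short_for = reverse %$categories;
print "UppercaseLetter => $short_for{UppercaseLetter}\n";
```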

    compexcl()

        use Unicode::UCD 'compexcl';

        my $compexcl = compexcl(0x09dc);

    This routine returns undef if the Unicode version being used is so early that it doesn't have this property. It is included for backwards compatibility, but as of Perl 5.12 and more modern Unicode versions, for most purposes it is probably more convenient to use one of the following instead:

        my $compexcl = chr(0x09dc) =~ /\p{Comp_Ex}/;
        my $compexcl = chr(0x09dc) =~ /\p{Full_Composition_Exclusion}/;

    or even

        my $compexcl = chr(0x09dc) =~ /\p{CE}/;
        my $compexcl = chr(0x09dc) =~ /\p{Composition_Exclusion}/;

    The first two forms return true if the code point argument should not be produced by composition normalization. For the final two forms to return true, it is additionally required that this fact not otherwise be determinable from the Unicode data base.

    This routine behaves identically to the final two forms. That is, it does not return true if the code point has a decomposition consisting of another single code point, nor if its decomposition starts with a code point whose combining class is non-zero. Code points that meet either of these conditions should also not be produced by composition normalization, which is probably why you should use the Full_Composition_Exclusion property instead, as shown above.

    The routine returns false otherwise.
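
    The difference between the narrower Composition_Exclusion property (which compexcl() follows) and Full_Composition_Exclusion shows up on a singleton such as ANGSTROM SIGN. The following sketch assumes a Perl of version 5.12 or later; the sample code points are illustrative.

```perl
use strict;
use warnings;
use Unicode::UCD 'compexcl';

# U+09DC: listed composition exclusion; U+212B: singleton; U+0041: neither.
for my $cp (0x09DC, 0x212B, 0x41) {
    my $char = chr $cp;
    printf "U+%04X compexcl=%d CE=%d Comp_Ex=%d\n", $cp,
        compexcl($cp)                               ? 1 : 0,
        $char =~ /\p{Composition_Exclusion}/        ? 1 : 0,
        $char =~ /\p{Full_Composition_Exclusion}/   ? 1 : 0;
}
```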

    casefold()

        use Unicode::UCD 'casefold';

        my $casefold = casefold(0xDF);
        if (defined $casefold) {
            my @full_fold_hex = split / /, $casefold->{'full'};
            my $full_fold_string =
                join "", map {chr(hex($_))} @full_fold_hex;
            my @turkic_fold_hex =
                split / /, ($casefold->{'turkic'} ne "")
                           ? $casefold->{'turkic'}
                           : $casefold->{'full'};
            my $turkic_fold_string =
                join "", map {chr(hex($_))} @turkic_fold_hex;
        }
        if (defined $casefold && $casefold->{'simple'} ne "") {
            my $simple_fold_hex = $casefold->{'simple'};
            my $simple_fold_string = chr(hex($simple_fold_hex));
        }

    This returns the (almost) locale-independent case folding of the character specified by the code point argument. (Starting in Perl v5.16, the core function fc() returns the full mapping (described below) faster than this does, and for entire strings.)

    If there is no case folding for the input code point, undef is returned.

    If there is a case folding for that code point, a reference to a hash with the following fields is returned:

    • code

      the input code point argument expressed in hexadecimal, with leading zeros added if necessary to make it contain at least four hexdigits

    • full

      one or more codes (separated by spaces) that, taken in order, give the code points for the case folding for code. Each has at least four hexdigits.

    • simple

      is empty, or is exactly one code with at least four hexdigits which can be used as an alternative case folding when the calling program cannot cope with the fold being a sequence of multiple code points. If full is just one code point, then simple equals full. If there is no single code point folding defined for code, then simple is the empty string. Otherwise, it is an inferior, but still better-than-nothing alternative folding to full.

    • mapping

      is the same as simple if simple is not empty, and it is the same as full otherwise. It can be considered to be the simplest possible folding for code. It is defined primarily for backwards compatibility.

    • status

      is C (for common) if the best possible fold is a single code point (simple equals full equals mapping). It is S if there are distinct folds, simple and full (mapping equals simple). And it is F if there is only a full fold (mapping equals full; simple is empty). Note that this describes the contents of mapping. It is defined primarily for backwards compatibility.

      For Unicode versions between 3.1 and 3.1.1 inclusive, status can also be I, which is the same as C but is a special case for dotted uppercase I and dotless lowercase i:

      • If you use this I mapping

        the result is case-insensitive, but dotless and dotted I's are not distinguished

      • If you exclude this I mapping

        the result is not fully case-insensitive, but dotless and dotted I's are distinguished

    • turkic

      contains any special folding for Turkic languages. For versions of Unicode starting with 3.2, this field is empty unless code has a different folding in Turkic languages, in which case it is one or more codes (separated by spaces) that, taken in order, give the code points for the case folding for code in those languages. Each code has at least four hexdigits. Note that this folding does not maintain canonical equivalence without additional processing.

      For Unicode versions between 3.1 and 3.1.1 inclusive, this field is empty unless there is a special folding for Turkic languages, in which case status is I, and mapping, full, simple, and turkic are all equal.

    Programs that want complete generality and the best folding results should use the folding contained in the full field. But note that the fold for some code points will be a sequence of multiple code points.

    Programs that can't cope with the fold mapping being multiple code points can use the folding contained in the simple field, with the loss of some generality. In Unicode 5.1, about 7% of the defined foldings have no single code point folding.

    The mapping and status fields are provided for backwards compatibility for existing programs. They contain the same values as in previous versions of this function.

    Case folding is not completely independent of locale. The turkic field contains results to use when the locale is a Turkic language.

    For more information about case mappings see http://www.unicode.org/unicode/reports/tr21
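As a rough sketch of how these fields might be consumed, the hypothetical helper fold_char below (not part of Unicode::UCD) prefers the full fold and uses simple only when the caller asks for a single code point. It assumes a Perl recent enough that casefold() returns the full and simple fields, and it chooses to leave a character unchanged when no single code point fold exists; that fallback is a design choice of this sketch, not something the module prescribes.

```perl
use strict;
use warnings;
use Unicode::UCD 'casefold';

# Hypothetical helper: return the folded string for one code point.
# With $single_only true, use the 'simple' field instead of 'full'.
sub fold_char {
    my ($code_point, $single_only) = @_;
    my $fold = casefold($code_point);

    # No entry means the character folds to itself.
    return chr($code_point) unless defined $fold;

    my $hex = $fold->{full};
    if ($single_only) {
        # An empty 'simple' means no single code point fold exists;
        # this sketch then leaves the character unchanged.
        return chr($code_point) if $fold->{simple} eq "";
        $hex = $fold->{simple};
    }
    return join "", map { chr(hex($_)) } split / /, $hex;
}

print fold_char(ord("A")), "\n";           # full fold of "A"
print fold_char(0xDF), "\n";               # full fold of U+00DF (two characters)
print length(fold_char(0xDF, 1)), "\n";    # single-code-point mode
```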

    all_casefolds()

        use Unicode::UCD 'all_casefolds';

        my $all_folds_ref = all_casefolds();
        foreach my $char_with_casefold (sort { $a <=> $b }
                                        keys %$all_folds_ref)
        {
            printf "%04X:", $char_with_casefold;
            my $casefold = $all_folds_ref->{$char_with_casefold};

            # Get folds for $char_with_casefold
            my @full_fold_hex = split / /, $casefold->{'full'};
            my $full_fold_string =
                join "", map {chr(hex($_))} @full_fold_hex;
            print " full=", join " ", @full_fold_hex;
            my @turkic_fold_hex =
                split / /, ($casefold->{'turkic'} ne "")
                           ? $casefold->{'turkic'}
                           : $casefold->{'full'};
            my $turkic_fold_string =
                join "", map {chr(hex($_))} @turkic_fold_hex;
            print "; turkic=", join " ", @turkic_fold_hex;
            if (defined $casefold && $casefold->{'simple'} ne "") {
                my $simple_fold_hex = $casefold->{'simple'};
                my $simple_fold_string = chr(hex($simple_fold_hex));
                print "; simple=$simple_fold_hex";
            }
            print "\n";
        }

    This returns all the case foldings in the current version of Unicode in the form of a reference to a hash. Each key to the hash is the decimal representation of a Unicode character that has a casefold to other than itself. The casefold of a semi-colon is itself, so it isn't in the hash; likewise for a lowercase "a", but there is an entry for a capital "A". The hash value for each key is another hash, identical to what is returned by casefold() if called with that code point as its argument. So the value all_casefolds()->{ord("A")} is equivalent to casefold(ord("A")).

    casespec()

        use Unicode::UCD 'casespec';
        my $casespec = casespec(0xFB00);

    This returns the potentially locale-dependent case mappings of the code point argument. The mappings may be longer than a single code point (which the basic Unicode case mappings as returned by charinfo() never are).

    If there are no case mappings for the code point argument, or if all three possible mappings (lower, title and upper) result in single code points and are locale independent and unconditional, undef is returned (which means that the case mappings, if any, for the code point are those returned by charinfo()).

    Otherwise, a reference to a hash giving the mappings (or a reference to a hash of such hashes, explained below) is returned.

    The keys in the bottom layer hash with the meanings of their values are:

    • code

      the input code point argument expressed in hexadecimal, with leading zeros added if necessary to make it contain at least four hexdigits

    • lower

      one or more codes (separated by spaces) that, taken in order, give the code points for the lower case of code. Each has at least four hexdigits.

    • title

      one or more codes (separated by spaces) that, taken in order, give the code points for the title case of code. Each has at least four hexdigits.

    • upper

      one or more codes (separated by spaces) that, taken in order, give the code points for the upper case of code. Each has at least four hexdigits.

    • condition

      the conditions for the mappings to be valid. If undef, the mappings are always valid. When defined, this field is a list of conditions, all of which must be true for the mappings to be valid. The list consists of one or more locales (see below) and/or contexts (explained in the next paragraph), separated by spaces. (Other than as used to separate elements, spaces are to be ignored.) Case distinctions in the condition list are not significant. Conditions preceded by "NON_" represent the negation of the condition.

      A context is one of those defined in the Unicode standard. For Unicode 5.1, they are defined in Section 3.13 Default Case Operations available at http://www.unicode.org/versions/Unicode5.1.0/. These are for context-sensitive casing.

    The hash described above is returned for locale-independent casing, where at least one of the mappings has length longer than one. If undef is returned, the code point may have mappings, but if so, all are length one, and are returned by charinfo(). Note that when this function does return a value, it will be for the complete set of mappings for a code point, even those whose length is one.

    If there are additional casing rules that apply only in certain locales, an additional key for each will be defined in the returned hash. Each such key will be its locale name, defined as a 2-letter ISO 3166 country code, possibly followed by a "_" and a 2-letter ISO language code (possibly followed by a "_" and a variant code). For the lists of all possible locales, see Locale::Country and Locale::Language. (In Unicode 6.0, the only locales returned by this function are lt, tr, and az.)

    Each locale key is a reference to a hash that has the form above, and gives the casing rules for that particular locale, which take precedence over the locale-independent ones when in that locale.

    If the only casing for a code point is locale-dependent, then the returned hash will not have any of the base keys, like code , upper , etc., but will contain only locale keys.

    For more information about case mappings see http://www.unicode.org/unicode/reports/tr21/
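The two-layer return value can be awkward to handle, so here is a minimal sketch of one way to walk it. The helper describe_casespec is hypothetical; it treats any value that is itself a hash reference as a per-locale entry, and checks for the code key to decide whether locale-independent mappings are present.

```perl
use strict;
use warnings;
use Unicode::UCD 'casespec';

# Hypothetical helper: summarize the result of casespec() for a code point.
sub describe_casespec {
    my ($code_point) = @_;
    my $spec = casespec($code_point);
    return "no special casing" unless defined $spec;

    my @parts;
    # Base keys are present only when locale-independent mappings exist.
    if (exists $spec->{code}) {
        push @parts, "upper=$spec->{upper}";
    }
    # Any value that is itself a hash reference is a per-locale entry.
    for my $key (sort keys %$spec) {
        next unless ref($spec->{$key}) eq 'HASH';
        push @parts, "locale=$key";
    }
    return join "; ", @parts;
}

print describe_casespec(0xFB00), "\n";      # LATIN SMALL LIGATURE FF
print describe_casespec(ord("a")), "\n";    # casespec() returns undef here
```

For a code point such as "a", the single code point mappings would then come from charinfo() instead.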

    namedseq()

        use Unicode::UCD 'namedseq';
        my $namedseq = namedseq("KATAKANA LETTER AINU P");
        my @namedseq = namedseq("KATAKANA LETTER AINU P");
        my %namedseq = namedseq();

    If used with a single argument in a scalar context, returns the string consisting of the code points of the named sequence, or undef if no named sequence by that name exists. If used with a single argument in a list context, it returns the list of the ordinals of the code points. If used with no arguments in a list context, returns a hash with the names of the named sequences as the keys and the named sequences as strings as the values. Otherwise, it returns undef or an empty list depending on the context.

    This function only operates on officially approved (not provisional) named sequences.

    Note that as of Perl 5.14, \N{KATAKANA LETTER AINU P} will insert the named sequence into double-quoted strings, and charnames::string_vianame("KATAKANA LETTER AINU P") will return the same string this function does, but will also operate on character names that aren't named sequences, without you having to know which are which. See charnames.
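The difference between the scalar and list forms may be easier to see side by side. The following is a small sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Unicode::UCD 'namedseq';

my $name = "KATAKANA LETTER AINU P";

# List context: the ordinals of the code points in the sequence.
my @ordinals = namedseq($name);
my $hex = join " ", map { sprintf "%04X", $_ } @ordinals;
print "$name = $hex\n";

# Scalar context: the same sequence as a string.
my $string = namedseq($name);
print "length in characters: ", length($string), "\n";

# An unknown name yields undef in scalar context.
my $missing = namedseq("NO SUCH SEQUENCE");
print "unknown name is ", (defined $missing ? "defined" : "undef"), "\n";
```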

    num()

        use Unicode::UCD 'num';
        my $val = num("123");
        my $one_quarter = num("\N{VULGAR FRACTION ONE QUARTER}");

    num returns the numeric value of the input Unicode string; or undef if it doesn't think the entire string has a completely valid, safe numeric value.

    If the string is just one character in length, the Unicode numeric value is returned if it has one, or undef otherwise. Note that this need not be a whole number. num("\N{TIBETAN DIGIT HALF ZERO}"), for example, returns -0.5.

    If the string is more than one character, undef is returned unless all its characters are decimal digits (that is, they would match \d+ ), from the same script. For example if you have an ASCII '0' and a Bengali '3', mixed together, they aren't considered a valid number, and undef is returned. A further restriction is that the digits all have to be of the same form. A half-width digit mixed with a full-width one will return undef. The Arabic script has two sets of digits; num will return undef unless all the digits in the string come from the same set.

    num errs on the side of safety, and there may be valid strings of decimal digits that it doesn't recognize. Note that Unicode defines a number of "digit" characters that aren't "decimal digit" characters. "Decimal digits" have the property that they have a positional value, i.e., there is a units position, a 10's position, a 100's, etc, AND they are arranged in Unicode in blocks of 10 contiguous code points. The Chinese digits, for example, are not in such a contiguous block, and so Unicode doesn't view them as decimal digits, but merely digits, and so \d will not match them. A single-character string containing one of these digits will have its decimal value returned by num , but any longer string containing only these digits will return undef.

    Strings of multiple sub- and superscripts are not recognized as numbers. You can use either of the compatibility decompositions in Unicode::Normalize to change these into digits, and then call num on the result.
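The decomposition approach just described might look like the sketch below. The wrapper lenient_num is hypothetical; it first tries num() directly and then retries after applying NFKD from Unicode::Normalize.

```perl
use strict;
use warnings;
use Unicode::UCD 'num';
use Unicode::Normalize 'NFKD';

# Hypothetical wrapper: try num() directly, then retry after a
# compatibility decomposition (which turns superscripts into digits).
sub lenient_num {
    my ($string) = @_;
    my $value = num($string);
    return $value if defined $value;
    return num(NFKD($string));
}

my $superscripts = "\N{U+00B2}\N{U+00B3}";    # SUPERSCRIPT TWO, SUPERSCRIPT THREE
my $mixed        = "1\N{U+09E9}";             # ASCII 1, then BENGALI DIGIT THREE

print "direct:  ", (defined num($superscripts) ? "number" : "undef"), "\n";
print "lenient: ", lenient_num($superscripts), "\n";
print "mixed:   ", (defined lenient_num($mixed) ? "number" : "undef"), "\n";
```

Mixed-script digit strings still yield undef, since decomposition does not change which script a digit belongs to.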

    prop_aliases()

        use Unicode::UCD 'prop_aliases';

        my ($short_name, $full_name, @other_names) = prop_aliases("space");
        my $same_full_name = prop_aliases("Space");     # Scalar context
        my ($same_short_name) = prop_aliases("Space");  # gets 0th element
        print "The full name is $full_name\n";
        print "The short name is $short_name\n";
        print "The other aliases are: ", join(", ", @other_names), "\n";

    prints:

        The full name is White_Space
        The short name is WSpace
        The other aliases are: Space

    Most Unicode properties have several synonymous names. Typically, there is at least a short name, convenient to type, and a long name that more fully describes the property, and hence is more easily understood.

    If you know one name for a Unicode property, you can use prop_aliases to find either the long name (when called in scalar context), or a list of all of the names, somewhat ordered so that the short name is in the 0th element, the long name in the next element, and any other synonyms are in the remaining elements, in no particular order.

    The long name is returned in a form nicely capitalized, suitable for printing.

    The input parameter name is loosely matched, which means that white space, hyphens, and underscores are ignored (except for the trailing underscore in the old-form grandfathered-in "L_", which is better written as "LC", and both of which mean General_Category=Cased Letter).

    If the name is unknown, undef is returned (or an empty list in list context). Note that Perl typically recognizes property names in regular expressions with an optional "Is_ " (with or without the underscore) prefixed to them, such as \p{isgc=punct} . This function does not recognize those in the input, returning undef. Nor are they included in the output as possible synonyms.

    prop_aliases does know about the Perl extensions to Unicode properties, such as Any and XPosixAlpha , and the single form equivalents to Unicode properties such as XDigit , Greek , In_Greek , and Is_Greek . The final example demonstrates that the "Is_" prefix is recognized for these extensions; it is needed to resolve ambiguities. For example, prop_aliases('lc') returns the list (lc, Lowercase_Mapping) , but prop_aliases('islc') returns (Is_LC, Cased_Letter) . This is because islc is a Perl extension which is short for General_Category=Cased Letter . The lists returned for the Perl extensions will not include the "Is_" prefix (whether or not the input had it) unless needed to resolve ambiguities, as shown in the "islc" example, where the returned list had one element containing "Is_" , and the other without.

    It is also possible for the reverse to happen: prop_aliases('isc') returns the list (isc, ISO_Comment); whereas prop_aliases('c') returns (C, Other) (the latter being a Perl extension meaning General_Category=Other). Properties accessible through Unicode::UCD in perluniprops lists the available forms, including which ones are discouraged from use.

    Those discouraged forms are accepted as input to prop_aliases, but are not returned in the lists. prop_aliases('isL&') and prop_aliases('isL_'), which are old synonyms for "Is_LC" and should not be used in new code, are examples of this. These both return (Is_LC, Cased_Letter). Thus this function allows you to take a discouraged form, and find its acceptable alternatives. The same goes for single-form Block property equivalences. Only the forms that begin with "In_" are not discouraged; if you pass prop_aliases a discouraged form, you will get back the equivalent ones that begin with "In_". It will otherwise look like a new-style block name (see Old-style versus new-style block names).

    prop_aliases does not know about any user-defined properties, and will return undef if called with one of those. Likewise for Perl internal properties, with the exception of "Perl_Decimal_Digit" which it does know about (and which is documented below in prop_invmap()).
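One use for the scalar-context form is checking whether two spellings name the same property. The helper same_property below is a hypothetical sketch that compares the long names; unknown names are treated as never matching.

```perl
use strict;
use warnings;
use Unicode::UCD 'prop_aliases';

# Hypothetical helper: true if two names denote the same property.
# It compares the long names that prop_aliases() returns in scalar context.
sub same_property {
    my ($name_a, $name_b) = @_;
    my $full_a = prop_aliases($name_a);
    my $full_b = prop_aliases($name_b);
    return 0 unless defined $full_a && defined $full_b;
    return $full_a eq $full_b ? 1 : 0;
}

for my $pair (["wspace", "White-Space"],
              ["wspace", "Alphabetic"],
              ["wspace", "no_such_property"])
{
    my ($first, $second) = @$pair;
    printf "%s vs %s: %s\n", $first, $second,
        same_property($first, $second) ? "same" : "different";
}
```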

    prop_value_aliases()

        use Unicode::UCD 'prop_value_aliases';

        my ($short_name, $full_name, @other_names)
                                       = prop_value_aliases("Gc", "Punct");
        my $same_full_name = prop_value_aliases("Gc", "P");   # Scalar cntxt
        my ($same_short_name) = prop_value_aliases("Gc", "P"); # gets 0th
                                                               # element
        print "The full name is $full_name\n";
        print "The short name is $short_name\n";
        print "The other aliases are: ", join(", ", @other_names), "\n";

    prints:

        The full name is Punctuation
        The short name is P
        The other aliases are: Punct

    Some Unicode properties have a restricted set of legal values. For example, all binary properties are restricted to just true or false ; and there are only a few dozen possible General Categories.

    For such properties, there are usually several synonyms for each possible value. For example, in binary properties, truth can be represented by any of the strings "Y", "Yes", "T", or "True"; and the General Category "Punctuation" by that string, or "Punct", or simply "P".

    Like property names, there is typically at least a short name for each such property-value, and a long name. If you know any name of the property-value, you can use prop_value_aliases () to get the long name (when called in scalar context), or a list of all the names, with the short name in the 0th element, the long name in the next element, and any other synonyms in the remaining elements, in no particular order, except that any all-numeric synonyms will be last.

    The long name is returned in a form nicely capitalized, suitable for printing.

    Case, white space, hyphens, and underscores are ignored in the input parameters (except for the trailing underscore in the old-form grandfathered-in general category property value "L_" , which is better written as "LC" ).

    If either name is unknown, undef is returned. Note that Perl typically recognizes property names in regular expressions with an optional "Is_ " (with or without the underscore) prefixed to them, such as \p{isgc=punct} . This function does not recognize those in the property parameter, returning undef.

    If called with a property that doesn't have synonyms for its values, it returns the input value, possibly normalized with capitalization and underscores.

    For the block property, new-style block names are returned (see Old-style versus new-style block names).

    To find the synonyms for single-forms, such as \p{Any} , use prop_aliases() instead.

    prop_value_aliases does not know about any user-defined properties, and will return undef if called with one of those.
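A typical use is normalizing whatever spelling of a value a caller supplies. The hypothetical helper canonical_value below returns the long name when one is known, and otherwise returns its input unchanged; that fallback is an assumption of this sketch.

```perl
use strict;
use warnings;
use Unicode::UCD 'prop_value_aliases';

# Hypothetical helper: map any spelling of a property value to its long
# name, or return the input unchanged if the property or value is unknown.
sub canonical_value {
    my ($property, $value) = @_;
    my $long = prop_value_aliases($property, $value);
    return defined $long ? $long : $value;
}

print canonical_value("gc", "punct"), "\n";
print canonical_value("Gc", "P"), "\n";
print canonical_value("No_Such_Property", "x"), "\n";
```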

    prop_invlist()

    prop_invlist returns an inversion list (described below) that defines all the code points for the binary Unicode property (or "property=value" pair) given by the input parameter string:

        use feature 'say';
        use Unicode::UCD 'prop_invlist';
        say join ", ", prop_invlist("Any");

    prints:

        0, 1114112

    If the input is unknown, undef is returned in scalar context, and an empty list in list context. If the input is known, the number of elements in the list is returned if called in scalar context.

    perluniprops gives the list of properties that this function accepts, as well as all the possible forms for them (including with the optional "Is_" prefixes). (Except this function doesn't accept any Perl-internal properties, some of which are listed there.) This function uses the same loose or tighter matching rules for resolving the input property's name as is done for regular expressions. These are also specified in perluniprops. Examples of using the "property=value" form are:

        say join ", ", prop_invlist("Script=Shavian");

    prints:

        66640, 66688

        say join ", ", prop_invlist("ASCII_Hex_Digit=No");

    prints:

        0, 48, 58, 65, 71, 97, 103

        say join ", ", prop_invlist("ASCII_Hex_Digit=Yes");

    prints:

        48, 58, 65, 71, 97, 103

    Inversion lists are a compact way of specifying Unicode property-value definitions. The 0th item in the list is the lowest code point that has the property-value. The next item (item [1]) is the lowest code point beyond that one that does NOT have the property-value. And the next item beyond that ([2]) is the lowest code point beyond that one that does have the property-value, and so on. Put another way, each element in the list gives the beginning of a range that has the property-value (for even numbered elements), or doesn't have the property-value (for odd numbered elements). The name for this data structure stems from the fact that each element in the list toggles (or inverts) whether the corresponding range is or isn't on the list.

    In the final example above, the first ASCII Hex digit is code point 48, the character "0", and all code points from it through 57 (a "9") are ASCII hex digits. Code points 58 through 64 aren't, but 65 (an "A") through 70 (an "F") are, as are 97 ("a") through 102 ("f"). 103 starts a range of code points that aren't ASCII hex digits. That range extends to infinity, which on your computer can be found in the variable $Unicode::UCD::MAX_CP . (This variable is as close to infinity as Perl can get on your platform, and may be too high for some operations to work; you may wish to use a smaller number for your purposes.)
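Because an inversion list is sorted, membership can be tested with a binary search rather than by expanding the list. The helper in_invlist below is a hypothetical sketch: it finds the last element that is less than or equal to the code point, and an even index means the code point has the property. The list is hard-coded from the ASCII_Hex_Digit=Yes example above so that the sketch does not depend on the Unicode version.

```perl
use strict;
use warnings;

# Hypothetical helper: is $code_point in the set described by @$invlist?
sub in_invlist {
    my ($invlist, $code_point) = @_;
    my ($low, $high) = (0, $#$invlist);
    my $index = -1;    # -1 means below the first range start
    while ($low <= $high) {
        my $mid = int(($low + $high) / 2);
        if ($invlist->[$mid] <= $code_point) {
            $index = $mid;         # candidate; look for a later one
            $low   = $mid + 1;
        }
        else {
            $high = $mid - 1;
        }
    }
    # Even-numbered elements start ranges that have the property.
    return ($index >= 0 && $index % 2 == 0) ? 1 : 0;
}

my @hex_digits = (48, 58, 65, 71, 97, 103);

for my $char ("0", "9", ":", "A", "F", "G", "f", "g") {
    printf "%s: %s\n", $char,
        in_invlist(\@hex_digits, ord($char)) ? "yes" : "no";
}
```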

    Note that the inversion lists returned by this function can possibly include non-Unicode code points, that is anything above 0x10FFFF. This is in contrast to Perl regular expression matches on those code points, in which a non-Unicode code point always fails to match. For example, both of these have the same result:

        chr(0x110000) =~ /\p{ASCII_Hex_Digit=True}/      # Fails.
        chr(0x110000) =~ /\p{ASCII_Hex_Digit=False}/     # Fails!

    And both raise a warning that a Unicode property is being used on a non-Unicode code point. It is arguable as to which is the correct thing to do here. This function has chosen the way opposite to the Perl regular expression behavior. This allows you to easily flip to the Perl regular expression way (for you to go in the other direction would be far harder). Simply add 0x110000 at the end of the non-empty returned list if it isn't already that value; and pop that value if it is; like:

        my @list = prop_invlist("foo");
        if (@list) {
            if ($list[-1] == 0x110000) {
                pop @list;  # Defeat the turning on for above Unicode
            }
            else {
                push @list, 0x110000; # Turn off for above Unicode
            }
        }

    It is a simple matter to expand out an inversion list to a full list of all code points that have the property-value:

        my @invlist = prop_invlist($property_name);
        die "empty" unless @invlist;
        my @full_list;
        for (my $i = 0; $i < @invlist; $i += 2) {
            my $upper = ($i + 1) < @invlist
                        ? $invlist[$i+1] - 1      # In range
                        : $Unicode::UCD::MAX_CP;  # To infinity.  You may want
                                                  # to stop much much earlier;
                                                  # going this high may expose
                                                  # perl deficiencies with very
                                                  # large numbers.
            for my $j ($invlist[$i] .. $upper) {
                push @full_list, $j;
            }
        }

    prop_invlist does not know about any user-defined nor Perl internal-only properties, and will return undef if called with one of those.
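When a full expansion would be too large, converting the inversion list to inclusive [first, last] pairs is a lighter alternative. The sketch below is hypothetical; the $max parameter that caps an open-ended final range is an assumption of this example, and the input list is hard-coded from the ASCII_Hex_Digit=Yes example earlier in this section.

```perl
use strict;
use warnings;

# Hypothetical helper: convert an inversion list to [first, last] pairs.
# $max caps an open-ended final range.
sub invlist_to_ranges {
    my ($invlist, $max) = @_;
    my @ranges;
    for (my $i = 0; $i < @$invlist; $i += 2) {
        my $last = ($i + 1 < @$invlist) ? $invlist->[$i + 1] - 1 : $max;
        push @ranges, [ $invlist->[$i], $last ];
    }
    return @ranges;
}

my @ranges = invlist_to_ranges([48, 58, 65, 71, 97, 103], 0x10FFFF);
my $summary = join ", ", map { sprintf "%d-%d", @$_ } @ranges;
print "$summary\n";
```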

    prop_invmap()

        use Unicode::UCD 'prop_invmap';
        my ($list_ref, $map_ref, $format, $missing)
                                          = prop_invmap("General Category");

    prop_invmap is used to get the complete mapping definition for a property, in the form of an inversion map. An inversion map consists of two parallel arrays. One is an ordered list of code points that mark range beginnings, and the other gives the value (or mapping) that all code points in the corresponding range have.

    prop_invmap is called with the name of the desired property. The name is loosely matched, meaning that differences in case, white-space, hyphens, and underscores are not meaningful (except for the trailing underscore in the old-form grandfathered-in property "L_" , which is better written as "LC" , or even better, "Gc=LC" ).

    Many Unicode properties have more than one name (or alias). prop_invmap understands all of these, including Perl extensions to them. Ambiguities are resolved as described above for prop_aliases(). The Perl internal property "Perl_Decimal_Digit", described below, is also accepted. undef is returned if the property name is unknown. See Properties accessible through Unicode::UCD in perluniprops for the properties acceptable as inputs to this function.

    It is a fatal error to call this function except in list context.

    In addition to the two arrays that form the inversion map, prop_invmap returns two other values; one is a scalar that gives some details as to the format of the entries of the map array; the other is used for specialized purposes, described at the end of this section.

    This means that prop_invmap returns a 4 element list. For example,

        my ($blocks_ranges_ref, $blocks_maps_ref, $format, $default)
                                                   = prop_invmap("Block");

    In this call, the two arrays will be populated as shown below (for Unicode 6.0):

        Index  @blocks_ranges  @blocks_maps
          0        0x0000      Basic Latin
          1        0x0080      Latin-1 Supplement
          2        0x0100      Latin Extended-A
          3        0x0180      Latin Extended-B
          4        0x0250      IPA Extensions
          5        0x02B0      Spacing Modifier Letters
          6        0x0300      Combining Diacritical Marks
          7        0x0370      Greek and Coptic
          8        0x0400      Cyrillic
                    ...
        233       0x2B820      No_Block
        234       0x2F800      CJK Compatibility Ideographs Supplement
        235       0x2FA20      No_Block
        236       0xE0000      Tags
        237       0xE0080      No_Block
        238       0xE0100      Variation Selectors Supplement
        239       0xE01F0      No_Block
        240       0xF0000      Supplementary Private Use Area-A
        241      0x100000      Supplementary Private Use Area-B
        242      0x110000      No_Block

    The first line (with Index [0]) means that the value for code point 0 is "Basic Latin". The entry "0x0080" in the @blocks_ranges column in the second line means that the value from the first line, "Basic Latin", extends to all code points in the range from 0 up to but not including 0x0080, that is, through 127. In other words, the code points from 0 to 127 are all in the "Basic Latin" block. Similarly, all code points in the range from 0x0080 up to (but not including) 0x0100 are in the block named "Latin-1 Supplement", etc. (Notice that the return is the old-style block names; see Old-style versus new-style block names).

    The final line (with Index [242]) means that all code points above the legal Unicode maximum code point have the value "No_Block", which is the term Unicode uses for a non-existing block.

    The arrays completely specify the mappings for all possible code points. The final element in an inversion map returned by this function will always be for the range that consists of all the code points that aren't legal Unicode, but that are expressible on the platform. (That is, it starts with code point 0x110000, the first code point above the legal Unicode maximum, and extends to infinity.) The value for that range will be the same that any typical unassigned code point has for the specified property. (Certain unassigned code points are not "typical"; for example the non-character code points, or those in blocks that are to be written right-to-left. The above-Unicode range's value is not based on these atypical code points.) It could be argued that, instead of treating these as unassigned Unicode code points, the value for this range should be undef. If you wish, you can change the returned arrays accordingly.
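Looking up the value for one code point in an inversion map amounts to finding the last range start that does not exceed it. The helper invmap_lookup below is a hypothetical sketch using a binary search over the two parallel arrays; the data is the first few rows of the Block table above, hard-coded for illustration rather than fetched from prop_invmap.

```perl
use strict;
use warnings;

# Hypothetical helper: return the map entry whose range contains $code_point.
# @$ranges must be sorted ascending and start at 0.
sub invmap_lookup {
    my ($ranges, $maps, $code_point) = @_;
    my ($low, $high) = (0, $#$ranges);
    while ($low < $high) {
        my $mid = int(($low + $high + 1) / 2);    # round up so the loop ends
        if ($ranges->[$mid] <= $code_point) { $low  = $mid }
        else                                { $high = $mid - 1 }
    }
    return $maps->[$low];
}

my @blocks_ranges = (0x0000, 0x0080, 0x0100, 0x0180, 0x0250);
my @blocks_maps   = ("Basic Latin", "Latin-1 Supplement", "Latin Extended-A",
                     "Latin Extended-B", "IPA Extensions");

for my $code_point (0x41, 0x7F, 0x80, 0xE9, 0x17F) {
    printf "U+%04X is in %s\n", $code_point,
        invmap_lookup(\@blocks_ranges, \@blocks_maps, $code_point);
}
```

Because this sample stops at 0x0250, every higher code point would be reported as "IPA Extensions"; with the complete arrays the final range is the above-Unicode one described above.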

    The maps are almost always simple scalars that should be interpreted as-is. These values are those given in the Unicode-supplied data files, which may be inconsistent as to capitalization and as to which synonym for a property-value is given. The results may be normalized by using the prop_value_aliases() function.

    There are exceptions to the simple scalar maps. Some properties have some elements in their map list that are themselves lists of scalars; and some special strings are returned that are not to be interpreted as-is. Element [2] (placed into $format in the example above) of the returned four element list tells you if the map has any of these special elements or not, as follows:

    • s

      means all the elements of the map array are simple scalars, with no special elements. Almost all properties are like this, like the block example above.

    • sl

      means that some of the map array elements have the form given by "s" , and the rest are lists of scalars. For example, here is a portion of the output of calling prop_invmap () with the "Script Extensions" property:

          @scripts_ranges  @scripts_maps
               ...
              0x0953      Devanagari
              0x0964      [ Bengali, Devanagari, Gurmukhi, Oriya ]
              0x0966      Devanagari
              0x0970      Common

      Here, the code points 0x964 and 0x965 are both used in Bengali, Devanagari, Gurmukhi, and Oriya, but no other scripts.

      The Name_Alias property is also of this form. But each scalar consists of two components: 1) the name, and 2) the type of alias this is. They are separated by a colon and a space. In Unicode 6.1, there are several alias types:

      • correction

        indicates that the name is a corrected form for the original name (which remains valid) for the same code point.

      • control

        adds a new name for a control character.

      • alternate

        is an alternate name for a character

      • figment

        is a name for a character that has been documented but was never in any actual standard.

      • abbreviation

        is a common abbreviation for a character

      The lists are ordered (roughly) so the most preferred names come before less preferred ones.

      For example,

          @aliases_ranges        @alias_maps
             ...
            0x009E        [ 'PRIVACY MESSAGE: control', 'PM: abbreviation' ]
            0x009F        [ 'APPLICATION PROGRAM COMMAND: control',
                            'APC: abbreviation'
                          ]
            0x00A0        'NBSP: abbreviation'
            0x00A1        ""
            0x00AD        'SHY: abbreviation'
            0x00AE        ""
            0x01A2        'LATIN CAPITAL LETTER GHA: correction'
            0x01A3        'LATIN SMALL LETTER GHA: correction'
            0x01A4        ""
             ...

      A map to the empty string means that there is no alias defined for the code point.
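Since each Name_Alias element is either a scalar or a list of scalars, code that consumes it has to handle both shapes. The hypothetical helper parse_aliases below is a sketch that splits each entry into its name and alias type; the sample elements are copied from the table above.

```perl
use strict;
use warnings;

# Hypothetical helper: split one Name_Alias map element into
# [name, type] pairs. An element is a scalar or an array reference.
sub parse_aliases {
    my ($element) = @_;
    my @entries = ref($element) eq 'ARRAY' ? @$element : ($element);
    my @pairs;
    for my $entry (@entries) {
        next if $entry eq "";    # empty string: no alias defined
        # Match greedily so the split happens at the last ": ".
        my ($name, $type) = $entry =~ /^(.*): (\w+)$/;
        push @pairs, [ $name, $type ];
    }
    return @pairs;
}

for my $element ([ 'PRIVACY MESSAGE: control', 'PM: abbreviation' ],
                 'NBSP: abbreviation',
                 "")
{
    print "$_->[0] ($_->[1])\n" for parse_aliases($element);
}
```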

    • a

      is like "s" in that all the map array elements are scalars, but here they are restricted to all being integers, and some have to be adjusted (hence the name "a" ) to get the correct result. For example, in:

          my ($uppers_ranges_ref, $uppers_maps_ref, $format)
                                  = prop_invmap("Simple_Uppercase_Mapping");

      the returned arrays look like this:

          @$uppers_ranges_ref    @$uppers_maps_ref   Note
                 0                      0
                97                     65          'a' maps to 'A', b => B ...
               123                      0
               181                    924          MICRO SIGN => Greek Cap MU
               182                      0
               ...

      Let's start with the second line. It says that the uppercase of code point 97 is 65; or uc("a") eq "A". But the line is for the entire range of code points 97 through 122. To get the mapping for any code point in a range, you take the offset it has from the beginning code point of the range, and add that to the mapping for that first code point. So, the mapping for 122 ("z") is derived by taking the offset of 122 from 97 (=25) and adding that to 65, yielding 90 ("Z"). Likewise for everything in between.

      The first line works the same way. The first map in a range is always the correct value for its code point (because the adjustment is 0). Thus uc(chr(0)) is just itself. Likewise, uc(chr(1)) is itself, as the adjusted value is 0+1-0 = 1; and so on up to uc(chr(96)), which is chr(96).

      Requiring this simple adjustment allows the returned arrays to be significantly smaller than otherwise, up to a factor of 10, speeding up searching through them.
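The adjustment can be sketched as follows, using only the five sample rows shown above rather than a live call to prop_invmap. The helper simple_upper is hypothetical, and it assumes, as the sample table suggests but this section does not state outright, that a map of 0 marks a range whose code points map to themselves.

```perl
use strict;
use warnings;

# Sample data copied from the table above (first five rows only).
my @uppers_ranges = (0, 97, 123, 181, 182);
my @uppers_maps   = (0, 65,   0, 924,   0);

# Hypothetical helper for an "a"-format map.
sub simple_upper {
    my ($code_point) = @_;

    # Linear scan for the last range start <= $code_point; fine for a sketch.
    my $i = 0;
    $i++ while $i < $#uppers_ranges && $uppers_ranges[$i + 1] <= $code_point;

    # ASSUMPTION: a map of 0 means the code point maps to itself.
    return $code_point if $uppers_maps[$i] == 0;

    # Adjust: add the offset from the start of the range to the map.
    return $uppers_maps[$i] + ($code_point - $uppers_ranges[$i]);
}

for my $char ("a", "m", "z", "{") {
    printf "%s -> %s\n", $char, chr(simple_upper(ord($char)));
}
```

A real implementation would use the complete arrays and a binary search; the sample rows here cover only code points below 182.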

    • al

      means that some of the map array elements have the form given by "a" , and the rest are ordered lists of code points. For example, in:

          my ($uppers_ranges_ref, $uppers_maps_ref, $format)
                                         = prop_invmap("Uppercase_Mapping");

      the returned arrays look like this:

          @$uppers_ranges_ref    @$uppers_maps_ref
                 0                      0
                97                     65
               123                      0
               181                    924
               182                      0
               ...
              0x0149              [ 0x02BC, 0x004E ]
              0x014A                    0
              0x014B                  330
               ...

      This is the full Uppercase_Mapping property (as opposed to the Simple_Uppercase_Mapping given in the example for format "a" ). The only difference between the two in the ranges shown is that the code point at 0x0149 (LATIN SMALL LETTER N PRECEDED BY APOSTROPHE) maps to a string of two characters, 0x02BC (MODIFIER LETTER APOSTROPHE) followed by 0x004E (LATIN CAPITAL LETTER N).

      No adjustments are needed to entries that are references to arrays; each such entry will have exactly one element in its range, so the offset is always 0.

    • ae

      This is like "a" , but some elements are the empty string, and should not be adjusted. The one internal Perl property accessible by prop_invmap is of this type: "Perl_Decimal_Digit" returns an inversion map which gives the numeric values that are represented by the Unicode decimal digit characters. Characters that don't represent decimal digits map to the empty string, like so:

          @digits    @values
          0x0000       ""
          0x0030        0
          0x003A       ""
          0x0660        0
          0x066A       ""
          0x06F0        0
          0x06FA       ""
          0x07C0        0
          0x07CA       ""
          0x0966        0
          ...

      This means that the code points from 0 to 0x2F do not represent decimal digits; the code point 0x30 (DIGIT ZERO) represents 0; code point 0x31, (DIGIT ONE), represents 0+1-0 = 1; ... code point 0x39, (DIGIT NINE), represents 0+9-0 = 9; ... code points 0x3A through 0x65F do not represent decimal digits; 0x660 (ARABIC-INDIC DIGIT ZERO), represents 0; ... 0x07C1 (NKO DIGIT ONE), represents 0+1-0 = 1 ...

    • ale

      is a combination of the "al" type and the "ae" type. Some of the map array elements have the forms given by "al" , and the rest are the empty string. The property NFKC_Casefold has this form. An example slice is:

        @$ranges_ref  @$maps_ref          Note
          ...
         0x00AA       97                  FEMININE ORDINAL INDICATOR => 'a'
         0x00AB       0
         0x00AD       ""                  SOFT HYPHEN => ""
         0x00AE       0
         0x00AF       [ 0x0020, 0x0304 ]  MACRON => SPACE . COMBINING MACRON
         0x00B0       0
          ...

    • ar

      means that all the elements of the map array are either rational numbers or the string "NaN" , meaning "Not a Number". A rational number is either an integer, or two integers separated by a solidus ("/" ). The second integer represents the denominator of the division implied by the solidus, and is actually always positive, so it is guaranteed not to be 0 and to not be signed. When the element is a plain integer (without the solidus), it may need to be adjusted to get the correct value by adding the offset, just as other "a" properties. No adjustment is needed for fractions, as the range is guaranteed to have just a single element, and so the offset is always 0.

      If you want to convert the returned map to entirely scalar numbers, you can use something like this:

        my ($invlist_ref, $invmap_ref, $format) = prop_invmap($property);
        if ($format && $format eq "ar") {
            # Evaluate fractions such as "1/4"; leave "NaN" entries alone.
            for (@$invmap_ref) { $_ = eval $_ if $_ ne 'NaN' }
        }

      Here are some entries from the output of the property "Nv", which has format "ar".

        @numerics_ranges  @numerics_maps  Note
              0x00          "NaN"
              0x30          0             DIGIT 0 .. DIGIT 9
              0x3A          "NaN"
              0xB2          2             SUPERSCRIPTs 2 and 3
              0xB4          "NaN"
              0xB9          1             SUPERSCRIPT 1
              0xBA          "NaN"
              0xBC          1/4           VULGAR FRACTION 1/4
              0xBD          1/2           VULGAR FRACTION 1/2
              0xBE          3/4           VULGAR FRACTION 3/4
              0xBF          "NaN"
              0x660         0             ARABIC-INDIC DIGIT ZERO .. NINE
              0x66A         "NaN"

    • n

      means the Name property. All the elements of the map array are simple scalars, but some of them contain special strings that require more work to get the actual name.

      Entries such as:

      1. CJK UNIFIED IDEOGRAPH-<code point>

      mean that the name for the code point is "CJK UNIFIED IDEOGRAPH-" with the code point (expressed in hexadecimal) appended to it, like "CJK UNIFIED IDEOGRAPH-3403" (similarly for CJK COMPATIBILITY IDEOGRAPH-<code point> ).

      Also, entries like

      1. <hangul syllable>

      mean that the name is algorithmically calculated. This is easily done by the function charnames::viacode(code) in charnames.

      Note that for control characters (Gc=cc ), Unicode's data files have the string "<control> ", but the real name of each of these characters is the empty string. This function returns that real name, the empty string. (There are names for these characters, but they are considered aliases, not the Name property name, and are contained in the Name_Alias property.)

    • ad

      means the Decomposition_Mapping property. This property is like "al" properties, except that one of the scalar elements is of the form:

      1. <hangul syllable>

      This signifies that this entry should be replaced by the decompositions for all the code points whose decomposition is algorithmically calculated. (All of them are currently in one range and no others outside the range are likely to ever be added to Unicode; the "n" format has this same entry.) These can be generated via the function Unicode::Normalize::NFD().

      Note that the mapping is the one that is specified in the Unicode data files, and to get the final decomposition, it may need to be applied recursively.
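
    For the "ar" format described above, a string eval is not the only way to obtain plain numbers. The following is a minimal sketch, assuming every entry has one of the documented forms (an integer, two integers separated by a solidus, or "NaN"). The helper name rational_to_number is hypothetical and not part of Unicode::UCD.

```perl
use strict;
use warnings;

# Hypothetical helper: convert one "ar" map entry to a plain number
# without string eval.  "NaN" entries are returned unchanged.
sub rational_to_number {
    my ($entry) = @_;
    return $entry if $entry eq 'NaN';
    my ($numerator, $denominator) = split m{/}, $entry;
    # The denominator is documented as always positive, so no zero check.
    return defined $denominator ? $numerator / $denominator : $numerator + 0;
}

# Sample entries taken from the "Nv" table above.
my @maps    = ('NaN', 0, '1/4', '1/2', '3/4');
my @numbers = map { rational_to_number($_) } @maps;
print "@numbers\n";
```

    This converts the stored entry only. Integer entries that cover a multi-element range still need the offset added, as for other "a" formats.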

    Note that a format begins with the letter "a" if and only if the property it is for requires adjustments by adding the offsets in multi-element ranges. For all these properties, an entry should be adjusted only if the map is a scalar which is an integer. That is, it must match the regular expression:

      / ^ -? \d+ $ /xa

    Further, the first element in a range never needs adjustment, as the adjustment would be just adding 0.

    A binary search can be used to quickly find a code point in the inversion list, and hence its corresponding mapping.
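
    The lookup and the adjustment rule can be sketched together as follows. This is an illustration only: the data is copied by hand from the "ae" table above rather than fetched with prop_invmap, and the names invlist_index and adjusted_map are hypothetical.

```perl
use strict;
use warnings;

# Sample data copied from the "ae" table above (Perl_Decimal_Digit).
my @digits = (0x0000, 0x0030, 0x003A, 0x0660, 0x066A);
my @values = ("",     0,      "",     0,      "");

# Binary search: return the largest index $i with $list->[$i] <= $cp.
# Assumes $list is sorted ascending and starts at 0.
sub invlist_index {
    my ($list, $cp) = @_;
    my ($lo, $hi) = (0, $#$list);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi + 1) / 2);
        if ($list->[$mid] <= $cp) { $lo = $mid }
        else                      { $hi = $mid - 1 }
    }
    return $lo;
}

# Look up a mapping; only integer scalars get the offset added.
sub adjusted_map {
    my ($list, $map, $cp) = @_;
    my $i     = invlist_index($list, $cp);
    my $entry = $map->[$i];
    return $entry if ref($entry) || $entry !~ /^-?\d+$/a;
    return $entry + ($cp - $list->[$i]);
}

printf "U+%04X => '%s'\n", $_, adjusted_map(\@digits, \@values, $_)
    for 0x2F, 0x30, 0x39, 0x663;
```

    For the first code point of a range the offset is 0, so the stored value is returned unchanged, in line with the note above.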

    The final element (index [3], assigned to $default in the "block" example) in the four element list returned by this function may be useful for applications that wish to convert the returned inversion map data structure into some other, such as a hash. It gives the mapping that most code points map to under the property. If you establish the convention that any code point not explicitly listed in your data structure maps to this value, you can potentially make your data structure much smaller. As you construct your data structure from the one returned by this function, simply ignore those ranges that map to this value, generally called the "default" value. For example, to convert to the data structure searchable by charinrange(), you can follow this recipe for properties that don't require adjustments:

      my ($list_ref, $map_ref, $format, $missing) = prop_invmap($property);
      my @range_list;

      # Look at each element in the list, but the -2 is needed because we
      # look at $i+1 in the loop, and the final element is guaranteed to map
      # to $missing by prop_invmap(), so we would skip it anyway.
      for my $i (0 .. @$list_ref - 2) {
          next if $map_ref->[$i] eq $missing;
          # $list_ref->[$i+1] starts the next range, and charinrange()
          # treats range ends as inclusive, hence the -1.
          push @range_list, [ $list_ref->[$i],
                              $list_ref->[$i+1] - 1,
                              $map_ref->[$i]
                            ];
      }

      print charinrange(\@range_list, $code_point), "\n";

    With this, charinrange() will return undef if its input code point maps to $missing . You can avoid this by omitting the next statement, and adding a line after the loop to handle the final element of the inversion map.

    Similarly, this recipe can be used for properties that do require adjustments:

      for my $i (0 .. @$list_ref - 2) {
          next if $map_ref->[$i] eq $missing;

          # prop_invmap() guarantees that if the mapping is to an array, the
          # range has just one element, so no need to worry about adjustments.
          if (ref $map_ref->[$i]) {
              push @range_list,
                   [ $list_ref->[$i], $list_ref->[$i], $map_ref->[$i] ];
          }
          else {  # Otherwise each element is actually mapped to a separate
                  # value, so the range has to be split into single code point
                  # ranges.

              my $adjustment = 0;

              # For each code point that gets mapped to something...
              for my $j ($list_ref->[$i] .. $list_ref->[$i+1] - 1) {

                  # ... add a range consisting of just it mapping to the
                  # original plus the adjustment, which is incremented for the
                  # next time through the loop, as the offset increases by 1
                  # for each element in the range
                  push @range_list,
                       [ $j, $j, $map_ref->[$i] + $adjustment++ ];
              }
          }
      }
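
    The same idea carries over to a hash that stores only the non-default mappings. The sketch below is an illustration with hand-copied data from the uppercase-mapping table shown earlier; it assumes that 0 is the default value for that map and that no entry is an array reference. The variable names are made up for the example.

```perl
use strict;
use warnings;

# Sample data copied from the uppercase-mapping table earlier in this
# section; 0 is assumed to be the default ("maps to itself") value.
my @ranges  = (0, 97, 123, 181, 182);
my @maps    = (0, 65, 0,   924, 0);
my $default = 0;

my %upper;
for my $i (0 .. $#ranges - 1) {
    next if $maps[$i] eq $default;    # skip ranges that map to the default
    for my $cp ($ranges[$i] .. $ranges[$i + 1] - 1) {
        # The offset within the range is added to the stored mapping.
        $upper{$cp} = $maps[$i] + ($cp - $ranges[$i]);
    }
}

# Any code point absent from the hash maps to the default.
for my $cp (97, 122, 181, 65) {
    my $mapped = exists $upper{$cp} ? $upper{$cp} : $default;
    printf "U+%04X => %d\n", $cp, $mapped;
}
```

    Only 27 code points end up in the hash for this slice; every other code point is covered by the default convention.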

    Note that the inversion maps returned for the Case_Folding and Simple_Case_Folding properties do not include the Turkic-locale mappings. Use casefold() for these.

    prop_invmap does not know about any user-defined properties, and will return undef if called with one of those.

    Unicode::UCD::UnicodeVersion

    This returns the version of the Unicode Character Database, in other words, the version of the Unicode standard the database implements. The version is a string of numbers delimited by dots ('.' ).

    Blocks versus Scripts

    The difference between a block and a script is that scripts are closer to the linguistic notion of a set of code points required to present languages, while block is more of an artifact of the Unicode code point numbering and separation into blocks of consecutive code points (so far the size of a block is some multiple of 16, like 128 or 256).

    For example the Latin script is spread over several blocks, such as Basic Latin , Latin 1 Supplement, Latin Extended-A , and Latin Extended-B . On the other hand, the Latin script does not contain all the characters of the Basic Latin block (also known as ASCII): it includes only the letters, and not, for example, the digits or the punctuation.

    For blocks see http://www.unicode.org/Public/UNIDATA/Blocks.txt

    For scripts see UTR #24: http://www.unicode.org/unicode/reports/tr24/

    Matching Scripts and Blocks

    Scripts are matched with the regular-expression construct \p{...} (e.g. \p{Tibetan} matches characters of the Tibetan script), while \p{Blk=...} is used for blocks (e.g. \p{Blk=Tibetan} matches any of the 256 code points in the Tibetan block).
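
    The distinction drawn above for Latin can be checked directly. This is a small sketch, not part of the module; the helper name classify is hypothetical. It tests a letter and a digit against the Latin script and the Basic Latin block.

```perl
use strict;
use warnings;

# Hypothetical helper: report script and block membership for one character.
sub classify {
    my ($char) = @_;
    return {
        latin_script      => ($char =~ /\p{Latin}/           ? 1 : 0),
        basic_latin_block => ($char =~ /\p{Blk=Basic_Latin}/ ? 1 : 0),
    };
}

for my $char ('A', '1') {
    my $result = classify($char);
    print "'$char': script=$result->{latin_script} ",
          "block=$result->{basic_latin_block}\n";
}
```

    The digit is in the block but not in the script, which is the case described in the Blocks versus Scripts section.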

    Old-style versus new-style block names

    Unicode publishes the names of blocks in two different styles, though the two are equivalent under Unicode's loose matching rules.

    The original style uses blanks and hyphens in the block names (except for No_Block ), like so:

    1. Miscellaneous Mathematical Symbols-B

    The newer style replaces these with underscores, like this:

    1. Miscellaneous_Mathematical_Symbols_B

    This newer style is consistent with the values of other Unicode properties. To preserve backward compatibility, all the functions in Unicode::UCD that return block names (except one) return the old-style ones. That one function, prop_value_aliases(), can be used to convert from old-style to new-style:

      my $new_style = prop_value_aliases("block", $old_style);

    Perl also has single-form extensions that refer to blocks, such as In_Cyrillic, meaning Block=Cyrillic. These have always been written in the new style.

    To convert from new-style to old-style, follow this recipe:

      $old_style = charblock((prop_invlist("block=$new_style"))[0]);

    (which finds the range of code points in the block using prop_invlist , gets the lower end of the range (0th element) and then looks up the old name for its block using charblock ).
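
    A round trip between the two styles might look like the sketch below. It relies on the documented behaviour that prop_value_aliases() returns the short name first and the full name second in list context; the choice of 'Basic Latin' as input is only an example.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charblock prop_invlist prop_value_aliases);

my $old_style = 'Basic Latin';

# In list context the second element is the full (new-style) name;
# the first is the short synonym.
my (undef, $new_style) = prop_value_aliases('block', $old_style);

# New style back to old style, following the recipe above.
my $round_trip = charblock((prop_invlist("block=$new_style"))[0]);

print "$old_style -> $new_style -> $round_trip\n";
```

    Taking the second list element avoids getting a short synonym when the full new-style name is wanted.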

    Note that starting in Unicode 6.1, many of the block names have shorter synonyms. These are always given in the new style.

    BUGS

    Does not yet support EBCDIC platforms.

    AUTHOR

    Jarkko Hietaniemi. Now maintained by perl5 porters.

     

    Time::HiRes

    Perl 5 version 18.2 documentation

    NAME

    Time::HiRes - High resolution alarm, sleep, gettimeofday, interval timers

    SYNOPSIS

      use Time::HiRes qw( usleep ualarm gettimeofday tv_interval nanosleep
                          clock_gettime clock_getres clock_nanosleep clock
                          stat );

      usleep ($microseconds);
      nanosleep ($nanoseconds);

      ualarm ($microseconds);
      ualarm ($microseconds, $interval_microseconds);

      $t0 = [gettimeofday];
      ($seconds, $microseconds) = gettimeofday;

      $elapsed = tv_interval ( $t0, [$seconds, $microseconds]);
      $elapsed = tv_interval ( $t0, [gettimeofday]);
      $elapsed = tv_interval ( $t0 );

      use Time::HiRes qw ( time alarm sleep );

      $now_fractions = time;
      sleep ($floating_seconds);
      alarm ($floating_seconds);
      alarm ($floating_seconds, $floating_interval);

      use Time::HiRes qw( setitimer getitimer );

      setitimer ($which, $floating_seconds, $floating_interval );
      getitimer ($which);

      use Time::HiRes qw( clock_gettime clock_getres clock_nanosleep
                          ITIMER_REAL ITIMER_VIRTUAL ITIMER_PROF ITIMER_REALPROF );

      $realtime   = clock_gettime(CLOCK_REALTIME);
      $resolution = clock_getres(CLOCK_REALTIME);

      clock_nanosleep(CLOCK_REALTIME, 1.5e9);
      clock_nanosleep(CLOCK_REALTIME, time()*1e9 + 10e9, TIMER_ABSTIME);

      my $ticktock = clock();

      use Time::HiRes qw( stat );

      my @stat = stat("file");
      my @stat = stat(FH);

    DESCRIPTION

    The Time::HiRes module implements a Perl interface to the usleep , nanosleep , ualarm , gettimeofday , and setitimer /getitimer system calls, in other words, high resolution time and timers. See the EXAMPLES section below and the test scripts for usage; see your system documentation for the description of the underlying nanosleep or usleep , ualarm , gettimeofday , and setitimer /getitimer calls.

    If your system lacks gettimeofday() or an emulation of it you don't get gettimeofday() or the one-argument form of tv_interval() . If your system lacks all of nanosleep() , usleep() , select(), and poll , you don't get Time::HiRes::usleep() , Time::HiRes::nanosleep() , or Time::HiRes::sleep() . If your system lacks both ualarm() and setitimer() you don't get Time::HiRes::ualarm() or Time::HiRes::alarm() .

    If you try to import an unimplemented function in the use statement it will fail at compile time.

    If your subsecond sleeping is implemented with nanosleep() instead of usleep() , you can mix subsecond sleeping with signals since nanosleep() does not use signals. This, however, is not portable, and you should first check for the truth value of &Time::HiRes::d_nanosleep to see whether you have nanosleep, and then carefully read your nanosleep() C API documentation for any peculiarities.
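
    One way to make that check is sketched below. The helper name sleep_ms is hypothetical; the code assumes Time::HiRes::sleep() is available as a fallback, and it imports nothing so that a missing function cannot make the use statement fail.

```perl
use strict;
use warnings;
use Time::HiRes ();    # import nothing; functions are called fully qualified

# Hypothetical helper: sleep for some milliseconds, preferring nanosleep().
# Returns the name of the function that was used.
sub sleep_ms {
    my ($ms) = @_;
    if (Time::HiRes::d_nanosleep()) {
        Time::HiRes::nanosleep($ms * 1_000_000);
        return 'nanosleep';
    }
    Time::HiRes::sleep($ms / 1000);
    return 'sleep';
}

print "slept using ", sleep_ms(5), "\n";
```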

    If you are using nanosleep for something other than mixing sleeping with signals, give some thought to whether Perl is the tool you should be using for work requiring nanosecond accuracies.

    Remember that unless you are working on a hard realtime system, any clocks and timers will be imprecise, especially so if you are working in a pre-emptive multiuser system. Understand the difference between wallclock time and process time (in UNIX-like systems the sum of user and system times). Any attempt to sleep for X seconds will most probably end up sleeping more than that, but don't be surprised if you end up sleeping slightly less.

    The following functions can be imported from this module. No functions are exported by default.

    • gettimeofday ()

      In array context returns a two-element array with the seconds and microseconds since the epoch. In scalar context returns floating seconds like Time::HiRes::time() (see below).

    • usleep ( $useconds )

      Sleeps for the number of microseconds (millionths of a second) specified. Returns the number of microseconds actually slept. Can sleep for more than one second, unlike the usleep system call. Can also sleep for zero seconds, which often works like a thread yield. See also Time::HiRes::sleep(), Time::HiRes::nanosleep(), and Time::HiRes::clock_nanosleep().

      Do not expect usleep() to be exact down to one microsecond.

    • nanosleep ( $nanoseconds )

      Sleeps for the number of nanoseconds (1e9ths of a second) specified. Returns the number of nanoseconds actually slept (accurate only to microseconds, the nearest thousand of them). Can sleep for more than one second. Can also sleep for zero seconds, which often works like a thread yield. See also Time::HiRes::sleep() , Time::HiRes::usleep() , and Time::HiRes::clock_nanosleep() .

      Do not expect nanosleep() to be exact down to one nanosecond. Getting even accuracy of one thousand nanoseconds is good.

    • ualarm ( $useconds [, $interval_useconds ] )

      Issues a ualarm call; the $interval_useconds is optional and will be zero if unspecified, resulting in alarm-like behaviour.

      Returns the remaining time in the alarm in microseconds, or undef if an error occurred.

      ualarm(0) will cancel an outstanding ualarm().

      Note that the interaction between alarms and sleeps is unspecified.

    • tv_interval

      tv_interval ( $ref_to_gettimeofday [, $ref_to_later_gettimeofday] )

      Returns the floating seconds between the two times, which should have been returned by gettimeofday() . If the second argument is omitted, then the current time is used.

    • time ()

      Returns floating seconds since the epoch. This function can be imported, resulting in a nice drop-in replacement for the time provided with core Perl; see the EXAMPLES below.

      NOTE 1: This higher resolution timer can return values either less or more than the core time(), depending on whether your platform rounds the higher resolution timer values up, down, or to the nearest second to get the core time(), but naturally the difference should never be more than half a second. See also clock_getres, if available on your system.

      NOTE 2: Since Sunday, September 9th, 2001 at 01:46:40 AM GMT, when the time() seconds since epoch rolled over to 1_000_000_000, the default floating point format of Perl and the seconds since epoch have conspired to produce an apparent bug: if you print the value of Time::HiRes::time() you seem to be getting only five decimals, not six as promised (microseconds). Not to worry, the microseconds are there (assuming your platform supports such granularity in the first place). What is going on is that the default floating point format of Perl only outputs 15 digits. In this case that means ten digits before the decimal separator and five after. To see the microseconds you can use either printf/sprintf with "%.6f" , or the gettimeofday() function in list context, which will give you the seconds and microseconds as two separate values.

    • sleep ( $floating_seconds )

      Sleeps for the specified number of seconds. Returns the number of seconds actually slept (a floating point value). This function can be imported, resulting in a nice drop-in replacement for the sleep provided with perl; see the EXAMPLES below.

      Note that the interaction between alarms and sleeps is unspecified.

    • alarm ( $floating_seconds [, $interval_floating_seconds ] )

      The SIGALRM signal is sent after the specified number of seconds. Implemented using setitimer() if available, ualarm() if not. The $interval_floating_seconds argument is optional and will be zero if unspecified, resulting in alarm()-like behaviour. This function can be imported, resulting in a nice drop-in replacement for the alarm provided with perl; see the EXAMPLES below.

      Returns the remaining time in the alarm in seconds, or undef if an error occurred.

      NOTE 1: With some combinations of operating systems and Perl releases SIGALRM restarts select(), instead of interrupting it. This means that an alarm() followed by a select() may together take the sum of the times specified for the alarm() and the select(), not just the time of the alarm().

      Note that the interaction between alarms and sleeps is unspecified.

    • setitimer ( $which, $floating_seconds [, $interval_floating_seconds ] )

      Start up an interval timer: after a certain time, a signal ($which) arrives, and more signals may keep arriving at certain intervals. To disable an "itimer", use $floating_seconds of zero. If the $interval_floating_seconds is set to zero (or unspecified), the timer is disabled after the next delivered signal.

      Use of interval timers may interfere with alarm(), sleep(), and usleep() . In standard-speak the "interaction is unspecified", which means that anything may happen: it may work, it may not.

      In scalar context, the remaining time in the timer is returned.

      In list context, both the remaining time and the interval are returned.

      There are usually three or four interval timers (signals) available: the $which can be ITIMER_REAL, ITIMER_VIRTUAL, ITIMER_PROF, or ITIMER_REALPROF. Note that which ones are available depends on the platform: true UNIX platforms usually have the first three, but only Solaris seems to have ITIMER_REALPROF (which is used to profile multithreaded programs). Win32 unfortunately does not have interval timers.

      ITIMER_REAL results in alarm()-like behaviour. Time is counted in real time; that is, wallclock time. SIGALRM is delivered when the timer expires.

      ITIMER_VIRTUAL counts time in (process) virtual time; that is, only when the process is running. In multiprocessor/user/CPU systems this may be more or less than real or wallclock time. (This time is also known as the user time.) SIGVTALRM is delivered when the timer expires.

      ITIMER_PROF counts time both when the process is running (process virtual time) and when the operating system is running on behalf of the process (such as I/O). (The latter is also known as the system time.) (The sum of user time and system time is known as the CPU time.) SIGPROF is delivered when the timer expires. SIGPROF can interrupt system calls.

      The semantics of interval timers for multithreaded programs are system-specific, and some systems may support additional interval timers. For example, it is unspecified which thread gets the signals. See your setitimer() documentation.

    • getitimer ( $which )

      Return the remaining time in the interval timer specified by $which .

      In scalar context, the remaining time is returned.

      In list context, both the remaining time and the interval are returned. The interval is always what you put in using setitimer() .

    • clock_gettime ( $which )

      Return as seconds the current value of the POSIX high resolution timer specified by $which. All implementations that support POSIX high resolution timers are supposed to support at least the $which value of CLOCK_REALTIME, which is supposed to return results close to the results of gettimeofday, or the number of seconds since 00:00:00 January 1, 1970 Greenwich Mean Time (GMT). Do not assume that CLOCK_REALTIME is zero; it might be one, or something else. Another potentially useful (but not available everywhere) value is CLOCK_MONOTONIC, which guarantees a monotonically increasing time value (unlike time() or gettimeofday(), which can be adjusted). See your system documentation for other possibly supported values.

    • clock_getres ( $which )

      Return as seconds the resolution of the POSIX high resolution timer specified by $which . All implementations that support POSIX high resolution timers are supposed to support at least the $which value of CLOCK_REALTIME , see clock_gettime.

    • clock_nanosleep ( $which, $nanoseconds, $flags = 0)

      Sleeps for the number of nanoseconds (1e9ths of a second) specified. Returns the number of nanoseconds actually slept. The $which is the "clock id", as with clock_gettime() and clock_getres(). The flags default to zero, but TIMER_ABSTIME can be specified (it must be imported explicitly), which means that $nanoseconds is not a time interval (as is the default) but instead an absolute time. Can sleep for more than one second. Can also sleep for zero seconds, which often works like a thread yield. See also Time::HiRes::sleep(), Time::HiRes::usleep(), and Time::HiRes::nanosleep().

      Do not expect clock_nanosleep() to be exact down to one nanosecond. Getting even accuracy of one thousand nanoseconds is good.

    • clock()

      Return as seconds the process time (user + system time) spent by the process since the first call to clock() (the definition is not "since the start of the process", though if you are lucky these times may be quite close to each other, depending on the system). What this means is that you probably need to store the result of your first call to clock(), and subtract that value from the following results of clock().

      The time returned also includes the process times of the terminated child processes for which wait() has been executed. This value is somewhat like the second value returned by the times() of core Perl, but not necessarily identical. Note that due to backward compatibility limitations the returned value may wrap around at about 2147 seconds (about 36 minutes).

    • stat
    • stat FH
    • stat EXPR

      Like the core stat, but with the access/modify/change file timestamps in subsecond resolution, if the operating system and the filesystem both support such timestamps. To override the standard stat():

      1. use Time::HiRes qw(stat);

      Test for the value of &Time::HiRes::d_hires_stat to find out whether the operating system supports subsecond file timestamps: a value larger than zero means yes. There are unfortunately no easy ways to find out whether the filesystem supports such timestamps. UNIX filesystems often do; NTFS does; FAT doesn't (FAT timestamp granularity is two seconds).

      A zero return value of &Time::HiRes::d_hires_stat means that Time::HiRes::stat is a no-op passthrough for CORE::stat(), and therefore the timestamps will stay integers. The same thing will happen if the filesystem does not do subsecond timestamps, even if the &Time::HiRes::d_hires_stat is non-zero.

      In any case do not expect nanosecond resolution, or even a microsecond resolution. Also note that the modify/access timestamps might have different resolutions, and that they need not be synchronized, e.g. if the operations are

      1. write
      2. stat # t1
      3. read
      4. stat # t2

      the access time stamp from t2 need not be greater-than the modify time stamp from t1: it may be equal or less.
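
    NOTE 2 under time() above describes two ways to see all six decimals of a timestamp. Both are sketched here; the variable names are illustrative, and the code assumes gettimeofday() is available on the platform.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday time);

# List context: integer seconds and microseconds, no floating point involved.
my ($seconds, $microseconds) = gettimeofday();
my $exact = sprintf '%d.%06d', $seconds, $microseconds;

# Floating seconds, formatted explicitly with six decimals.
my $formatted = sprintf '%.6f', time();

print "$exact\n$formatted\n";
```

    The %06d conversion pads the microseconds with leading zeros, which a plain string concatenation of the two values would lose.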

    EXAMPLES

      use Time::HiRes qw(usleep ualarm gettimeofday tv_interval);

      $microseconds = 750_000;
      usleep($microseconds);

      # signal alarm in 2.5s & every .1s thereafter
      ualarm(2_500_000, 100_000);
      # cancel that ualarm
      ualarm(0);

      # get seconds and microseconds since the epoch
      ($s, $usec) = gettimeofday();

      # measure elapsed time
      # (could also do by subtracting 2 gettimeofday return values)
      $t0 = [gettimeofday];
      # do bunch of stuff here
      $t1 = [gettimeofday];
      # do more stuff here
      $t0_t1 = tv_interval $t0, $t1;

      $elapsed = tv_interval ($t0, [gettimeofday]);
      $elapsed = tv_interval ($t0);  # equivalent code

      #
      # replacements for time, alarm and sleep that know about
      # floating seconds
      #
      use Time::HiRes;
      $now_fractions = Time::HiRes::time;
      Time::HiRes::sleep (2.5);
      Time::HiRes::alarm (10.6666666);

      use Time::HiRes qw ( time alarm sleep );
      $now_fractions = time;
      sleep (2.5);
      alarm (10.6666666);

      # Arm an interval timer to go off first at 10 seconds and
      # after that every 2.5 seconds, in process virtual time
      use Time::HiRes qw ( setitimer ITIMER_VIRTUAL time );

      $SIG{VTALRM} = sub { print time, "\n" };
      setitimer(ITIMER_VIRTUAL, 10, 2.5);

      use Time::HiRes qw( clock_gettime clock_getres CLOCK_REALTIME );
      # Read the POSIX high resolution timer.
      my $high = clock_gettime(CLOCK_REALTIME);
      # But how accurate we can be, really?
      my $reso = clock_getres(CLOCK_REALTIME);

      use Time::HiRes qw( clock_nanosleep TIMER_ABSTIME );
      clock_nanosleep(CLOCK_REALTIME, 1e6);
      clock_nanosleep(CLOCK_REALTIME, 2e9, TIMER_ABSTIME);

      use Time::HiRes qw( clock );
      my $clock0 = clock();
      ... # Do something.
      my $clock1 = clock();
      my $clockd = $clock1 - $clock0;

      use Time::HiRes qw( stat );
      my ($atime, $mtime, $ctime) = (stat("istics"))[8, 9, 10];

    C API

    In addition to the perl API described above, a C API is available for extension writers. The following C functions are available in the modglobal hash:

      name             C prototype
      ---------------  ----------------------
      Time::NVtime     double (*)()
      Time::U2time     void (*)(pTHX_ UV ret[2])
    Both functions return equivalent information (like gettimeofday ) but with different representations. The names NVtime and U2time were selected mainly because they are operating system independent. (gettimeofday is Unix-centric, though some platforms like Win32 and VMS have emulations for it.)

    Here is an example of using NVtime from C:

      double (*myNVtime)(); /* Returns -1 on failure. */
      SV **svp = hv_fetch(PL_modglobal, "Time::NVtime", 12, 0);
      if (!svp)         croak("Time::HiRes is required");
      if (!SvIOK(*svp)) croak("Time::NVtime isn't a function pointer");
      myNVtime = INT2PTR(double(*)(), SvIV(*svp));
      printf("The current time is: %f\n", (*myNVtime)());

    DIAGNOSTICS

    useconds or interval more than ...

    In ualarm() you tried to use a number of microseconds or an interval (also in microseconds) greater than 1_000_000, and setitimer() is not available on your system to emulate that case.

    negative time not invented yet

    You tried to use a negative time argument.

    internal error: useconds < 0 (unsigned ... signed ...)

    Something went horribly wrong: a number of microseconds that cannot become negative just became negative. Maybe your compiler is broken?

    useconds or uinterval equal to or more than 1000000

    On some platforms it is not possible to set an alarm that both has subsecond resolution and fires more than one second in the future.

    unimplemented in this platform

    Some calls simply aren't available, real or emulated, on every platform.

    CAVEATS

    Notice that the core time() may be rounding rather than truncating. What this means is that the core time() may be reporting the time as one second later than gettimeofday() and Time::HiRes::time().

    Adjusting the system clock (either manually or by services like ntp) may cause problems, especially for long running programs that assume a monotonically increasing time (note that not all platforms adjust time as gracefully as UNIX ntp does). For example in Win32 (and derived platforms like Cygwin and MinGW) the Time::HiRes::time() may temporarily drift off from the system clock (and the original time()) by up to 0.5 seconds. Time::HiRes will notice this eventually and recalibrate. Note that since Time::HiRes 1.77 the clock_gettime(CLOCK_MONOTONIC) might help in this (in case your system supports CLOCK_MONOTONIC).
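
    A minimal sketch of measuring an interval with CLOCK_MONOTONIC where it exists follows. The helper name steady_now is hypothetical; the code assumes that an unsupported clock_gettime() or CLOCK_MONOTONIC raises an exception that eval can catch, and it falls back to Time::HiRes::time() in that case.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical helper: prefer CLOCK_MONOTONIC, fall back to Time::HiRes::time().
sub steady_now {
    my $now = eval {
        Time::HiRes::clock_gettime(Time::HiRes::CLOCK_MONOTONIC());
    };
    return defined $now ? $now : Time::HiRes::time();
}

my $start = steady_now();
Time::HiRes::sleep(0.01);
my $elapsed = steady_now() - $start;
printf "elapsed: %.6f s\n", $elapsed;
```

    Values from the monotonic clock are only meaningful as differences; they are not seconds since the epoch.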

    Some systems have APIs but not implementations: for example QNX and Haiku have the interval timer APIs but not the functionality.

    SEE ALSO

    Perl modules BSD::Resource, Time::TAI64.

    Your system documentation for clock , clock_gettime , clock_getres , clock_nanosleep , clock_settime , getitimer , gettimeofday , setitimer , sleep, stat, ualarm .

    AUTHORS

    D. Wegscheid <wegscd@whirlpool.com>, R. Schertler <roderick@argon.org>, J. Hietaniemi <jhi@iki.fi>, G. Aas <gisle@aas.no>

    COPYRIGHT AND LICENSE

    Copyright (c) 1996-2002 Douglas E. Wegscheid. All rights reserved.

    Copyright (c) 2002, 2003, 2004, 2005, 2006, 2007, 2008 Jarkko Hietaniemi. All rights reserved.

    Copyright (C) 2011, 2012 Andrew Main (Zefram) <zefram@fysh.org>

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    Time::Local

    Perl 5 version 18.2 documentation

    NAME

    Time::Local - efficiently compute time from local and GMT time

    SYNOPSIS

      $time = timelocal( $sec, $min, $hour, $mday, $mon, $year );
      $time = timegm( $sec, $min, $hour, $mday, $mon, $year );

    DESCRIPTION

    This module provides functions that are the inverse of built-in perl functions localtime() and gmtime(). They accept a date as a six-element array, and return the corresponding time(2) value in seconds since the system epoch (Midnight, January 1, 1970 GMT on Unix, for example). This value can be positive or negative, though POSIX only requires support for positive values, so dates before the system's epoch may not work on all operating systems.

    It is worth drawing particular attention to the expected ranges for the values provided. The value for the day of the month is the actual day (i.e. 1..31), while the month is the number of months since January (0..11). This is consistent with the values returned from localtime() and gmtime().
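
    A short round trip illustrates these ranges. The date is an arbitrary example, and timegm() is used rather than timelocal() so that the result does not depend on the local time zone.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# 1 January 2000, 00:00:00 GMT: day of month is 1-based, month is 0-based.
my $epoch = timegm(0, 0, 0, 1, 0, 2000);

# gmtime() is the inverse, so the same values come back
# (the year is returned as an offset from 1900).
my ($sec, $min, $hour, $mday, $mon, $year) = gmtime($epoch);
printf "%d => %04d-%02d-%02d\n", $epoch, $year + 1900, $mon + 1, $mday;
```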

    FUNCTIONS

    timelocal() and timegm()

    This module exports two functions by default, timelocal() and timegm() .

    The timelocal() and timegm() functions perform range checking on the input $sec, $min, $hour, $mday, and $mon values by default.
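A minimal sketch of what the range checking means in practice: an out-of-range month makes the checked function croak, which the caller can trap with eval. The variable names are illustrative only.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Month 27 is outside 0..11, so the checked variant croaks
# and eval returns undef.
my $epoch = eval { timegm(0, 0, 0, 1, 27, 2000) };
if (!defined $epoch) {
    print "rejected: $@";
}
```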

    timelocal_nocheck() and timegm_nocheck()

    If you are working with data you know to be valid, you can speed your code up by using the "nocheck" variants, timelocal_nocheck() and timegm_nocheck(). These variants must be explicitly imported.

    use Time::Local 'timelocal_nocheck';
    # The 365th day of 1999
    print scalar localtime timelocal_nocheck( 0, 0, 0, 365, 0, 99 );

    If you supply data which is not valid (month 27, second 1,000) the results will be unpredictable (so don't do that).

    Year Value Interpretation

    Strictly speaking, the year should be specified in a form consistent with localtime(), i.e. the offset from 1900. In order to make the interpretation of the year easier for humans, however, who are more accustomed to seeing years as two-digit or four-digit values, the following conventions are followed:

    • Years greater than 999 are interpreted as being the actual year, rather than the offset from 1900. Thus, 1964 would indicate the year Martin Luther King won the Nobel prize, not the year 3864.

    • Years in the range 100..999 are interpreted as offset from 1900, so that 112 indicates 2012. This rule also applies to years less than zero (but see note below regarding date range).

    • Years in the range 0..99 are interpreted as shorthand for years in the rolling "current century," defined as 50 years on either side of the current year. Thus, today, in 1999, 0 would refer to 2000, and 45 to 2045, but 55 would refer to 1955. Twenty years from now, 55 would instead refer to 2055. This is messy, but matches the way people currently think about two digit dates. Whenever possible, use an absolute four digit year instead.

    The scheme above allows interpretation of a wide range of dates, particularly if 4-digit years are used.
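The rules above can be checked with a short sketch. resolved_year() is a hypothetical helper written for this illustration, not part of Time::Local; it feeds a year argument to timegm() and reports which calendar year came out. The two-digit case depends on the year in which the code is run, so no result is stated for it.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: which calendar year does timegm() pick
# for a given year argument?
sub resolved_year {
    my ($year_arg) = @_;
    my $epoch = timegm(0, 0, 0, 1, 0, $year_arg);   # 1 January, 00:00:00 GMT
    return (gmtime $epoch)[5] + 1900;
}

print resolved_year(1964), "\n";   # greater than 999: taken literally, 1964
print resolved_year(112),  "\n";   # 100..999: offset from 1900, so 2012
print resolved_year(45),   "\n";   # 0..99: rolling century, varies over time
```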

    Limits of time_t

    On perl versions older than 5.12.0, the range of dates that can actually be handled depends on the size of time_t (usually a signed integer) on the given platform. Currently, this is 32 bits for most systems, yielding an approximate range from Dec 1901 to Jan 2038.

    Both timelocal() and timegm() croak if given dates outside the supported range.

    As of version 5.12.0, perl has stopped using the underlying time library of the operating system it's running on and has its own implementation of those routines with a safe range of at least +/- 2**52 (about 142 million years).

    Ambiguous Local Times (DST)

    Because of DST changes, there are many time zones where the same local time occurs for two different GMT times on the same day. For example, in the "Europe/Paris" time zone, the local time of 2001-10-28 02:30:00 can represent either 2001-10-28 00:30:00 GMT, or 2001-10-28 01:30:00 GMT.

    When given an ambiguous local time, the timelocal() function should always return the epoch for the earlier of the two possible GMT times.

    Non-Existent Local Times (DST)

    When a DST change causes a local clock to skip one hour forward, there will be an hour's worth of local times that don't exist. Again, for the "Europe/Paris" time zone, the local clock jumped from 2001-03-25 01:59:59 to 2001-03-25 03:00:00.

    If the timelocal() function is given a non-existent local time, it will simply return an epoch value for the time one hour later.

    Negative Epoch Values

    On perl version 5.12.0 and newer, negative epoch values are fully supported.

    On older versions of perl, negative epoch (time_t ) values, which are not officially supported by the POSIX standards, are known not to work on some systems. These include MacOS (pre-OSX) and Win32.

    On systems which do support negative epoch values, this module should be able to cope with dates before the start of the epoch, down to the minimum value of time_t for the system.

    IMPLEMENTATION

    These routines are quite efficient and yet are always guaranteed to agree with localtime() and gmtime(). We manage this by caching the start times of any months we've seen before. If we know the start time of the month, we can always calculate any time within the month. The start times are calculated using a mathematical formula, unlike other algorithms that make multiple calls to gmtime().

    The timelocal() function is implemented using the same cache. We just assume that we're translating a GMT time, and then fudge it when we're done for the timezone and daylight savings arguments. Note that the timezone is evaluated for each date because countries occasionally change their official timezones. Assuming that localtime() corrects for these changes, this routine will also be correct.

    BUGS

    The whole scheme for interpreting two-digit years can be considered a bug.

    SUPPORT

    Support for this module is provided via the datetime@perl.org email list. See http://lists.perl.org/ for more details.

    Please submit bugs to the CPAN RT system at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Time-Local or via email at bug-time-local@rt.cpan.org.

    COPYRIGHT

    Copyright (c) 1997-2003 Graham Barr, 2003-2007 David Rolsky. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    The full text of the license can be found in the LICENSE file included with this module.

    AUTHOR

    This module is based on a Perl 4 library, timelocal.pl, that was included with Perl 4.036, and was most likely written by Tom Christiansen.

    The current version was written by Graham Barr.

    It is now being maintained separately from the Perl core by Dave Rolsky, <autarch@urth.org>.

     
    Time::Piece

    NAME

    Time::Piece - Object Oriented time objects

    SYNOPSIS

    use Time::Piece;
    my $t = localtime;
    print "Time is $t\n";
    print "Year is ", $t->year, "\n";

    DESCRIPTION

    This module replaces the standard localtime and gmtime functions with implementations that return objects. It does so in a backwards compatible manner, so that using localtime/gmtime in the way documented in perlfunc will still return what you expect.

    The module actually implements most of an interface described by Larry Wall on the perl5-porters mailing list here: http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-01/msg00241.html

    USAGE

    After importing this module, when you use localtime or gmtime in a scalar context, rather than getting an ordinary scalar string representing the date and time, you get a Time::Piece object, whose stringification happens to produce the same effect as the localtime and gmtime functions. There is also a new() constructor provided, which is the same as localtime(), except when passed a Time::Piece object, in which case it's a copy constructor. The following methods are available on the object:

    $t->sec                    # also available as $t->second
    $t->min                    # also available as $t->minute
    $t->hour                   # 24 hour
    $t->mday                   # also available as $t->day_of_month
    $t->mon                    # 1 = January
    $t->_mon                   # 0 = January
    $t->monname                # Feb
    $t->month                  # same as $t->monname
    $t->fullmonth              # February
    $t->year                   # based at 0 (year 0 AD is, of course 1 BC)
    $t->_year                  # year minus 1900
    $t->yy                     # 2 digit year
    $t->wday                   # 1 = Sunday
    $t->_wday                  # 0 = Sunday
    $t->day_of_week            # 0 = Sunday
    $t->wdayname               # Tue
    $t->day                    # same as wdayname
    $t->fullday                # Tuesday
    $t->yday                   # also available as $t->day_of_year, 0 = Jan 01
    $t->isdst                  # also available as $t->daylight_savings
    $t->hms                    # 12:34:56
    $t->hms(".")               # 12.34.56
    $t->time                   # same as $t->hms
    $t->ymd                    # 2000-02-29
    $t->date                   # same as $t->ymd
    $t->mdy                    # 02-29-2000
    $t->mdy("/")               # 02/29/2000
    $t->dmy                    # 29-02-2000
    $t->dmy(".")               # 29.02.2000
    $t->datetime               # 2000-02-29T12:34:56 (ISO 8601)
    $t->cdate                  # Tue Feb 29 12:34:56 2000
    "$t"                       # same as $t->cdate
    $t->epoch                  # seconds since the epoch
    $t->tzoffset               # timezone offset in a Time::Seconds object
    $t->julian_day             # number of days since Julian period began
    $t->mjd                    # modified Julian date (JD-2400000.5 days)
    $t->week                   # week number (ISO 8601)
    $t->is_leap_year           # true if it is
    $t->month_last_day         # 28-31
    $t->time_separator($s)     # set the default separator (default ":")
    $t->date_separator($s)     # set the default separator (default "-")
    $t->day_list(@days)        # set the default weekdays
    $t->mon_list(@months)      # set the default months
    $t->strftime(FORMAT)       # same as POSIX::strftime (without the overhead
                               # of the full POSIX extension)
    $t->strftime()             # "Tue, 29 Feb 2000 12:34:56 GMT"
    Time::Piece->strptime(STRING, FORMAT)
                               # see strptime man page. Creates a new
                               # Time::Piece object
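To make the list concrete, here is a short sketch that builds an object for the sample date used in the comments (Tuesday 29 February 2000, 12:34:56 GMT) and calls a few of the accessors. The epoch value 951827696 corresponds to that moment; gmtime() is used rather than localtime() so the output does not depend on the local time zone.

```perl
use strict;
use warnings;
use Time::Piece;

# 951827696 is 2000-02-29 12:34:56 GMT.
my $t = gmtime(951827696);

print $t->ymd,      "\n";                     # 2000-02-29
print $t->hms,      "\n";                     # 12:34:56
print $t->datetime, "\n";                     # 2000-02-29T12:34:56
print $t->monname, " ", $t->wdayname, "\n";   # Feb Tue
print $t->mon, " vs ", $t->_mon, "\n";        # 2 vs 1
print $t->yday,     "\n";                     # 59 (0 = Jan 01)
print $t->month_last_day, "\n";               # 29
print "leap year\n" if $t->is_leap_year;
```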

    Local Locales

    Both wdayname (day) and monname (month) allow passing in a list to use to index the name of the days against. This can be useful if you need to implement some form of localisation without actually installing or using locales.

    my @days = qw( Dimanche Lundi Mardi Mercredi Jeudi Vendredi Samedi );
    my $french_day = localtime->day(@days);

    These settings can be overridden globally too:

    Time::Piece::day_list(@days);

    Or for months:

    Time::Piece::mon_list(@months);

    And locally for months:

    print localtime->month(@months);
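A runnable sketch of the per-call form, using a fixed GMT date so the result is reproducible. The list is indexed from Sunday, matching $t->_wday; the sample epoch is an arbitrary choice.

```perl
use strict;
use warnings;
use Time::Piece;

my @days = qw( Dimanche Lundi Mardi Mercredi Jeudi Vendredi Samedi );

# 951827696 is Tuesday 2000-02-29 12:34:56 GMT.
my $t = gmtime(951827696);

# Index 0 of the list is Sunday, so Tuesday picks the third entry.
print $t->day(@days), "\n";   # Mardi
print $t->day,        "\n";   # Tue (the default list is untouched)
```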

    Date Calculations

    It's possible to use simple addition and subtraction of objects:

    use Time::Seconds;
    my $seconds = $t1 - $t2;
    $t1 += ONE_DAY; # add 1 day (constant from Time::Seconds)

    The following are valid ($t1 and $t2 are Time::Piece objects):

    $t1 - $t2; # returns Time::Seconds object
    $t1 - 42;  # returns Time::Piece object
    $t1 + 533; # returns Time::Piece object

    However adding a Time::Piece object to another Time::Piece object will cause a runtime error.

    Note that the first of the above returns a Time::Seconds object, so while examining the object will print the number of seconds (because of the overloading), you can also get the number of minutes, hours, days, weeks and years in that delta, using the Time::Seconds API.
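A brief sketch of both cases, with two arbitrary dates parsed by strptime() so that the result does not depend on the current time or time zone:

```perl
use strict;
use warnings;
use Time::Piece;
use Time::Seconds;

my $t1 = Time::Piece->strptime("2000-03-01", "%Y-%m-%d");
my $t2 = Time::Piece->strptime("2000-02-01", "%Y-%m-%d");

my $delta = $t1 - $t2;       # a Time::Seconds object
print $delta->days, "\n";    # 29, since 2000 is a leap year

my $next = $t2 + ONE_DAY;    # still a Time::Piece object
print $next->ymd, "\n";      # 2000-02-02
```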

    In addition to adding seconds, there are two APIs for adding months and years:

    $t->add_months(6);
    $t->add_years(5);

    The months and years can be negative for subtractions. Note that there is some "strange" behaviour when adding and subtracting months at the ends of months. Generally when the resulting month is shorter than the starting month then the number of overlap days is added. For example subtracting a month from 2008-03-31 will not result in 2008-02-31 as this is an impossible date. Instead you will get 2008-03-02. This appears to be consistent with other date manipulation tools.
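The end-of-month case described above can be reproduced with a few lines. This is only a sketch of the documented behaviour, using the same 2008-03-31 date:

```perl
use strict;
use warnings;
use Time::Piece;

my $t = Time::Piece->strptime("2008-03-31", "%Y-%m-%d");

# February 2008 has 29 days, so "February 31" overflows by two days.
print $t->add_months(-1)->ymd, "\n";   # 2008-03-02
print $t->add_years(1)->ymd,   "\n";   # 2009-03-31
```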

    Date Comparisons

    Date comparisons are also possible, using the full suite of "<", ">", "<=", ">=", "<=>", "==" and "!=".

    Date Parsing

    Time::Piece has a built-in strptime() function (from FreeBSD), allowing you incredibly flexible date parsing routines. For example:

    my $t = Time::Piece->strptime("Sunday 3rd Nov, 1943",
                                  "%A %drd %b, %Y");
    print $t->strftime("%a, %d %b %Y");

    Outputs:

    Wed, 03 Nov 1943

    (see, it's even smart enough to fix my obvious date bug)

    For more information see "man strptime", which should be on all unix systems.

    Alternatively look here: http://www.unix.com/man-page/FreeBSD/3/strftime/

    YYYY-MM-DDThh:mm:ss

    The ISO 8601 standard defines the date format to be YYYY-MM-DD, and the time format to be hh:mm:ss (24 hour clock), and if combined, they should be concatenated with date first and with a capital 'T' in front of the time.
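As an illustration, a combined ISO 8601 string can be parsed with strptime() and printed again with the date, time and datetime methods. The format string below is one plausible way to spell the combined form; the sample timestamp is arbitrary.

```perl
use strict;
use warnings;
use Time::Piece;

my $t = Time::Piece->strptime("2000-02-29T12:34:56", "%Y-%m-%dT%H:%M:%S");

print $t->date,     "\n";   # 2000-02-29
print $t->time,     "\n";   # 12:34:56
print $t->datetime, "\n";   # 2000-02-29T12:34:56
```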

    Week Number

    The week number may be an unknown concept to some readers. The ISO 8601 standard defines that weeks begin on a Monday and week 1 of the year is the week that includes both January 4th and the first Thursday of the year. In other words, if the first Monday of January is the 2nd, 3rd, or 4th, the preceding days of January are part of the last week of the preceding year. Week numbers range from 1 to 53.
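A small sketch of the rule at a year boundary. In January 2005 the first Monday was the 3rd, so Saturday the 1st still belongs to the last week of 2004. The dates are chosen only for illustration.

```perl
use strict;
use warnings;
use Time::Piece;

for my $date ("2005-01-01", "2005-01-03") {
    my $t = Time::Piece->strptime($date, "%Y-%m-%d");
    printf "%s (%s) is in week %d\n", $date, $t->wdayname, $t->week;
}
# 2005-01-01 (Sat) is in week 53
# 2005-01-03 (Mon) is in week 1
```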

    Global Overriding

    Finally, it's possible to override localtime and gmtime everywhere, by including the ':override' tag in the import list:

    use Time::Piece ':override';

    CAVEATS

    Setting $ENV{TZ} in Threads on Win32

    Note that when using perl in the default build configuration on Win32 (specifically, when perl is built with PERL_IMPLICIT_SYS), each perl interpreter maintains its own copy of the environment and only the main interpreter will update the process environment seen by strftime.

    Therefore, if you make changes to $ENV{TZ} from inside a thread other than the main thread then those changes will not be seen by strftime if you subsequently call that with the %Z formatting code. You must change $ENV{TZ} in the main thread to have the desired effect in this case (and you must also call _tzset() in the main thread to register the environment change).

    Furthermore, remember that this caveat also applies to fork(), which is emulated by threads on Win32.

    Use of epoch seconds

    This module internally uses the epoch seconds system that is provided via the perl time() function and supported by gmtime() and localtime().

    If your perl does not support times larger than 2^31 seconds then this module is likely to fail at processing dates beyond the year 2038. There are moves afoot to fix that in perl. Alternatively use 64 bit perl. Or if none of those are options, use the DateTime module which has support for years well into the future and past.

    AUTHOR

    Matt Sergeant, matt@sergeant.org

    Jarkko Hietaniemi, jhi@iki.fi (while creating Time::Piece for core perl)

    License

    This module is free software, you may distribute it under the same terms as Perl.

    SEE ALSO

    The excellent Calendar FAQ at http://www.tondering.dk/claus/calendar.html

    BUGS

    The test harness leaves much to be desired. Patches welcome.

     
    Time::Seconds

    NAME

    Time::Seconds - a simple API to convert seconds to other date values

    SYNOPSIS

    use Time::Piece;
    use Time::Seconds;
    my $t = localtime;
    $t += ONE_DAY;
    my $t2 = localtime;
    my $s = $t - $t2;
    print "Difference is: ", $s->days, "\n";

    DESCRIPTION

    This module is part of the Time::Piece distribution. It allows the user to find out the number of minutes, hours, days, weeks or years in a given number of seconds. It is returned by Time::Piece when you delta two Time::Piece objects.

    Time::Seconds also exports the following constants:

    ONE_DAY
    ONE_WEEK
    ONE_HOUR
    ONE_MINUTE
    ONE_MONTH
    ONE_YEAR
    ONE_FINANCIAL_MONTH
    LEAP_YEAR
    NON_LEAP_YEAR

    Since perl does not (yet?) support constant objects, these constants are in seconds only, so you cannot, for example, do this: print ONE_WEEK->minutes;

    METHODS

    The following methods are available:

    my $val = Time::Seconds->new(SECONDS);
    $val->seconds;
    $val->minutes;
    $val->hours;
    $val->days;
    $val->weeks;
    $val->months;
    $val->financial_months; # 30 days
    $val->years;
    $val->pretty; # gives English representation of the delta

    The usual arithmetic (+,-,+=,-=) is also available on the objects.

    The methods make the assumption that there are 24 hours in a day, 7 days in a week, 365.24225 days in a year and 12 months in a year. (from The Calendar FAQ at http://www.tondering.dk/claus/calendar.html)
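Because the exported constants are plain numbers, one way to use the conversion methods on them is to wrap the value in an object first. A minimal sketch:

```perl
use strict;
use warnings;
use Time::Seconds;

# The exported constants are plain numbers, so wrap one in an object
# before calling conversion methods on it.
my $week = Time::Seconds->new(ONE_WEEK);

print $week->days,    "\n";   # 7
print $week->hours,   "\n";   # 168
print $week->minutes, "\n";   # 10080

# Arithmetic on the object yields another Time::Seconds object.
my $longer = $week + ONE_DAY;
print $longer->days, "\n";    # 8
```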

    AUTHOR

    Matt Sergeant, matt@sergeant.org

    Tobias Brox, tobiasb@tobiasb.funcom.com

    Balázs Szabó (dLux), dlux@kapu.hu

    LICENSE

    Please see Time::Piece for the license.

    Bugs

    Currently the methods aren't as efficient as they could be, for reasons of clarity. This is probably a bad idea.

     
    Time::gmtime

    NAME

    Time::gmtime - by-name interface to Perl's built-in gmtime() function

    SYNOPSIS

    use Time::gmtime;
    $gm = gmtime();
    printf "The day in Greenwich is %s\n",
        (qw(Sun Mon Tue Wed Thu Fri Sat Sun))[ $gm->wday() ];

    use Time::gmtime qw(:FIELDS);
    gmtime();
    printf "The day in Greenwich is %s\n",
        (qw(Sun Mon Tue Wed Thu Fri Sat Sun))[ $tm_wday ];

    $now = gmctime();

    use Time::gmtime;
    use File::stat;
    $date_string = gmctime(stat($file)->mtime);

    DESCRIPTION

    This module's default exports override the core gmtime() function, replacing it with a version that returns "Time::tm" objects. This object has methods that return the similarly named structure field name from C's tm structure from time.h; namely sec, min, hour, mday, mon, year, wday, yday, and isdst.

    You may also import all the structure fields directly into your namespace as regular variables using the :FIELDS import tag. (Note that this still overrides your core functions.) Access these fields as variables named with a preceding tm_ in front of their method names. Thus, $tm_obj->mday() corresponds to $tm_mday if you import the fields.

    The gmctime() function provides a way of getting at the scalar sense of the original CORE::gmtime() function.

    To access this functionality without the core overrides, pass the use statement an empty import list, and then access the functions with their fully qualified names. On the other hand, the built-ins are still available via the CORE:: pseudo-package.
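A sketch of the non-overriding form. With an empty import list the built-in gmtime() keeps its usual behaviour, and the object interface is reached through fully qualified names. Epoch 0 is used so the output is fixed.

```perl
use strict;
use warnings;
use Time::gmtime ();   # empty import list: core gmtime() is left alone

# Epoch 0 is Thursday 1 January 1970, 00:00:00 GMT.
my $gm = Time::gmtime::gmtime(0);
printf "%04d-%02d-%02d, wday %d, yday %d\n",
    $gm->year + 1900, $gm->mon + 1, $gm->mday, $gm->wday, $gm->yday;

# The built-in still returns a plain list in list context.
my @fields = gmtime(0);
print scalar(@fields), " fields from the built-in\n";

# gmctime() gives the string form of the same moment.
print Time::gmtime::gmctime(0), "\n";
```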

    NOTE

    While this class is currently implemented using the Class::Struct module to build a struct-like class, you shouldn't rely upon this.

    AUTHOR

    Tom Christiansen

     
    Time::localtime

    NAME

    Time::localtime - by-name interface to Perl's built-in localtime() function

    SYNOPSIS

    use Time::localtime;
    printf "Year is %d\n", localtime->year() + 1900;

    $now = ctime();

    use Time::localtime;
    use File::stat;
    $date_string = ctime(stat($file)->mtime);

    DESCRIPTION

    This module's default exports override the core localtime() function, replacing it with a version that returns "Time::tm" objects. This object has methods that return the similarly named structure field name from C's tm structure from time.h; namely sec, min, hour, mday, mon, year, wday, yday, and isdst.

    You may also import all the structure fields directly into your namespace as regular variables using the :FIELDS import tag. (Note that this still overrides your core functions.) Access these fields as variables named with a preceding tm_ in front of their method names. Thus, $tm_obj->mday() corresponds to $tm_mday if you import the fields.

    The ctime() function provides a way of getting at the scalar sense of the original CORE::localtime() function.

    To access this functionality without the core overrides, pass the use statement an empty import list, and then access the functions with their fully qualified names. On the other hand, the built-ins are still available via the CORE:: pseudo-package.
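A sketch that compares the object interface with the built-in for the same moment. The module is loaded with an empty import list, so localtime() itself is not overridden; the output depends on the current time and time zone, so none is shown.

```perl
use strict;
use warnings;
use Time::localtime ();   # empty import list: core localtime() is left alone

my $now = time;
my $tm  = Time::localtime::localtime($now);   # a Time::tm object
my @raw = CORE::localtime($now);              # the built-in's plain list

# Both views describe the same moment, so the fields should agree.
printf "object: %04d-%02d-%02d\n", $tm->year + 1900, $tm->mon + 1, $tm->mday;
printf "list:   %04d-%02d-%02d\n", $raw[5] + 1900,  $raw[4] + 1,  $raw[3];

# ctime() gives the string that scalar localtime would produce.
print Time::localtime::ctime($now), "\n";
```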

    NOTE

    While this class is currently implemented using the Class::Struct module to build a struct-like class, you shouldn't rely upon this.

    AUTHOR

    Tom Christiansen

     
    Time::tm

    NAME

    Time::tm - internal object used by Time::gmtime and Time::localtime

    SYNOPSIS

    Don't use this module directly.

    DESCRIPTION

    This module is used internally as a base class by the Time::localtime and Time::gmtime functions. It creates a Time::tm struct object which is addressable just like C's tm structure from time.h; namely with sec, min, hour, mday, mon, year, wday, yday, and isdst.

    This class is an internal interface only.

    AUTHOR

    Tom Christiansen

     
    Tie::Array

    NAME

    Tie::Array - base class for tied arrays

    SYNOPSIS

    package Tie::NewArray;
    use Tie::Array;
    @ISA = ('Tie::Array');

    # mandatory methods
    sub TIEARRAY { ... }
    sub FETCH { ... }
    sub FETCHSIZE { ... }

    sub STORE { ... }     # mandatory if elements writeable
    sub STORESIZE { ... } # mandatory if elements can be added/deleted
    sub EXISTS { ... }    # mandatory if exists() expected to work
    sub DELETE { ... }    # mandatory if delete() expected to work

    # optional methods - for efficiency
    sub CLEAR { ... }
    sub PUSH { ... }
    sub POP { ... }
    sub SHIFT { ... }
    sub UNSHIFT { ... }
    sub SPLICE { ... }
    sub EXTEND { ... }
    sub DESTROY { ... }

    package Tie::NewStdArray;
    use Tie::Array;
    @ISA = ('Tie::StdArray');

    # all methods provided by default

    package main;
    $object = tie @somearray,'Tie::NewArray';
    $object = tie @somearray,'Tie::StdArray';
    $object = tie @somearray,'Tie::NewStdArray';

    DESCRIPTION

    This module provides methods for array-tying classes. See perltie for a list of the functions required in order to tie an array to a package. The basic Tie::Array package provides stub DESTROY and EXTEND methods that do nothing, stub DELETE and EXISTS methods that croak() if the delete() or exists() builtins are ever called on the tied array, and implementations of PUSH, POP, SHIFT, UNSHIFT, SPLICE and CLEAR in terms of basic FETCH, STORE, FETCHSIZE, STORESIZE.

    The Tie::StdArray package provides efficient methods required for tied arrays which are implemented as blessed references to an "inner" perl array. It inherits from Tie::Array, and should cause tied arrays to behave exactly like standard arrays, allowing for selective overloading of methods.
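As a sketch of selective overloading, the hypothetical class below inherits everything from Tie::StdArray and replaces only STORE, upper-casing each value written by element assignment. The class name is made up for this illustration.

```perl
use strict;
use warnings;

# UpperCaseArray is a hypothetical class used only for this illustration.
package UpperCaseArray;
use Tie::Array;
our @ISA = ('Tie::StdArray');

# Override only STORE; the object is a blessed reference to the inner array.
sub STORE {
    my ($self, $index, $value) = @_;
    $self->[$index] = uc $value;
}

package main;

tie my @words, 'UpperCaseArray';
$words[0] = 'gold';
$words[1] = 'myrrh';

print "@words\n";              # GOLD MYRRH
print scalar(@words), "\n";    # 2
```

Tie::StdArray's own PUSH, UNSHIFT and SPLICE appear to work on the inner array directly rather than going through STORE, so a class that needs to see every write would probably have to override those as well.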

    For developers wishing to write their own tied arrays, the required methods are briefly defined below. See the perltie section for more detailed descriptions, as well as example code:

    • TIEARRAY classname, LIST

      The class method is invoked by the command tie @array, classname. Associates an array instance with the specified class. LIST would represent additional arguments (along the lines of AnyDBM_File and compatriots) needed to complete the association. The method should return an object of a class which provides the methods below.

    • STORE this, index, value

      Store datum value into index for the tied array associated with object this. If this makes the array larger, then the class's mapping of undef should be returned for new positions.

    • FETCH this, index

      Retrieve the datum in index for the tied array associated with object this.

    • FETCHSIZE this

      Returns the total number of items in the tied array associated with object this. (Equivalent to scalar(@array)).

    • STORESIZE this, count

      Sets the total number of items in the tied array associated with object this to be count. If this makes the array larger, then the class's mapping of undef should be returned for new positions. If the array becomes smaller, then entries beyond count should be deleted.

    • EXTEND this, count

      Informative call that array is likely to grow to have count entries. Can be used to optimize allocation. This method need do nothing.

    • EXISTS this, key

      Verify that the element at index key exists in the tied array this.

      The Tie::Array implementation is a stub that simply croaks.

    • DELETE this, key

      Delete the element at index key from the tied array this.

      The Tie::Array implementation is a stub that simply croaks.

    • CLEAR this

      Clear (remove, delete, ...) all values from the tied array associated with object this.

    • DESTROY this

      Normal object destructor method.

    • PUSH this, LIST

      Append elements of LIST to the array.

    • POP this

      Remove last element of the array and return it.

    • SHIFT this

      Remove the first element of the array (shifting other elements down) and return it.

    • UNSHIFT this, LIST

      Insert LIST elements at the beginning of the array, moving existing elements up to make room.

    • SPLICE this, offset, length, LIST

      Perform the equivalent of splice on the array.

      offset is optional and defaults to zero, negative values count back from the end of the array.

      length is optional and defaults to rest of the array.

      LIST may be empty.

      Returns a list of the original length elements at offset.
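The method list above can be sketched as a minimal subclass. CountingArray is a hypothetical class written for this illustration: it defines only TIEARRAY, FETCH, FETCHSIZE, STORE and STORESIZE, and relies on Tie::Array for PUSH and POP. The store counter is there only to show that the inherited methods go through STORE.

```perl
use strict;
use warnings;

# CountingArray is a hypothetical class used only for this illustration.
package CountingArray;
use Tie::Array;
our @ISA = ('Tie::Array');

sub TIEARRAY  { my ($class) = @_; return bless { data => [], stores => 0 }, $class }
sub FETCH     { my ($self, $i) = @_; return $self->{data}[$i] }
sub FETCHSIZE { my ($self) = @_; return scalar @{ $self->{data} } }
sub STORE     { my ($self, $i, $v) = @_; $self->{stores}++; return $self->{data}[$i] = $v }
sub STORESIZE { my ($self, $n) = @_; $#{ $self->{data} } = $n - 1; return }

package main;

my $object = tie my @queue, 'CountingArray';

push @queue, 'a', 'b', 'c';    # inherited PUSH calls STORE three times
my $last = pop @queue;         # inherited POP calls FETCH, then STORESIZE

print "popped $last, ", scalar(@queue), " left, $object->{stores} stores\n";

# EXISTS is not defined here, so the inherited stub croaks.
my $ok = eval { my $found = exists $queue[0]; 1 };
print "exists() failed: $@" unless $ok;
```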

    CAVEATS

    There is no support at present for tied @ISA. There is a potential conflict between magic entries needed to notice setting of @ISA, and those needed to implement 'tie'.

    AUTHOR

    Nick Ing-Simmons <nik@tiuk.ti.com>

     
    Tie::File

    NAME

    Tie::File - Access the lines of a disk file via a Perl array

    SYNOPSIS

    # This file documents Tie::File version 0.98
    use Tie::File;

    tie @array, 'Tie::File', filename or die ...;

    $array[13] = 'blah'; # line 13 of the file is now 'blah'
    print $array[42];    # display line 42 of the file

    $n_recs = @array;    # how many records are in the file?
    $#array -= 2;        # chop two records off the end

    for (@array) {
        s/PERL/Perl/g;   # Replace PERL with Perl everywhere in the file
    }

    # These are just like regular push, pop, unshift, shift, and splice
    # Except that they modify the file in the way you would expect
    push @array, new recs...;
    my $r1 = pop @array;
    unshift @array, new recs...;
    my $r2 = shift @array;
    @old_recs = splice @array, 3, 7, new recs...;

    untie @array;        # all finished

    DESCRIPTION

    Tie::File represents a regular text file as a Perl array. Each element in the array corresponds to a record in the file. The first line of the file is element 0 of the array; the second line is element 1, and so on.

    The file is not loaded into memory, so this will work even for gigantic files.

    Changes to the array are reflected in the file immediately.

    Lazy people and beginners may now stop reading the manual.

    recsep

    What is a 'record'? By default, the meaning is the same as for the <...> operator: It's a string terminated by $/, which is probably "\n". (Minor exception: on DOS and Win32 systems, a 'record' is a string terminated by "\r\n".) You may change the definition of "record" by supplying the recsep option in the tie call:

    tie @array, 'Tie::File', $file, recsep => 'es';

    This says that records are delimited by the string es. If the file contained the following data:

    Curse these pesky flies!\n

    then the @array would appear to have four elements:

    "Curse th"
    "e p"
    "ky fli"
    "!\n"

    An undefined value is not permitted as a record separator. Perl's special "paragraph mode" semantics (à la $/ = "" ) are not emulated.
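The recsep example can be tried against a scratch file. This is only a sketch: the temporary file comes from File::Temp, and the file is opened read-only (an assumption made here so that the sample data is not modified while it is being read).

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use File::Temp qw(tempfile);
use Tie::File;

# Write the sample data to a scratch file.
my ($fh, $file) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "Curse these pesky flies!\n";
close $fh or die "close: $!";

# Read-only mode, so the sample file itself is left untouched.
tie my @array, 'Tie::File', $file, recsep => 'es', mode => O_RDONLY
    or die "Cannot tie $file: $!";

my @records = @array;    # copy the records out before untying
untie @array;

printf "%d records\n", scalar @records;
print "[$_]\n" for @records;
```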

    Records read from the tied array do not have the record separator string on the end; this is to allow

    $array[17] .= "extra";

    to work as expected.

    (See autochomp, below.) Records stored into the array will have the record separator string appended before they are written to the file, if they don't have one already. For example, if the record separator string is "\n", then the following two lines do exactly the same thing:

    $array[17] = "Cherry pie";
    $array[17] = "Cherry pie\n";

    The result is that the contents of line 17 of the file will be replaced with "Cherry pie"; a newline character will separate line 17 from line 18. This means that this code will do nothing:

    chomp $array[17];

    This is because the chomped value will have the separator reattached when it is written back to the file. There is no way to create a file whose trailing record separator string is missing.
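A sketch of this behaviour using a scratch file: one record is stored without a separator, another is chomped, and the file is then read back in the ordinary way. The file contents are made up for the illustration, and recsep is passed explicitly so that the result does not depend on the platform default.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Tie::File;

my ($fh, $file) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "Apple pie\nPecan pie\nPeach pie\n";
close $fh or die "close: $!";

# recsep is given explicitly so the sketch behaves the same everywhere.
tie my @array, 'Tie::File', $file, recsep => "\n"
    or die "Cannot tie $file: $!";
$array[1] = "Cherry pie";    # no separator supplied; Tie::File adds it
chomp $array[2];             # no lasting effect on the file
untie @array;

# Read the file back the ordinary way to see what was written.
open my $in, '<', $file or die "open: $!";
binmode $in;
my $contents = do { local $/; <$in> };
close $in;
print $contents;
```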

    Inserting records that contain the record separator string is not supported by this module. It will probably produce a reasonable result, but what this result will be may change in a future version. Use 'splice' to insert records or to replace one record with several.

    autochomp

    Normally, array elements have the record separator removed, so that if the file contains the text

    Gold
    Frankincense
    Myrrh

    the tied array will appear to contain ("Gold", "Frankincense", "Myrrh"). If you set autochomp to a false value, the record separator will not be removed. If the file above was tied with

    tie @gifts, "Tie::File", $gifts, autochomp => 0;

    then the array @gifts would appear to contain ("Gold\n", "Frankincense\n", "Myrrh\n"), or (on Win32 systems) ("Gold\r\n", "Frankincense\r\n", "Myrrh\r\n").
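The difference can be seen by tying the same scratch file twice, once with the default and once with autochomp disabled. This is a sketch only; recsep is set to "\n" explicitly and the file is opened read-only, both assumptions made here to keep the result platform-independent.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use File::Temp qw(tempfile);
use Tie::File;

my ($fh, $file) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "Gold\nFrankincense\nMyrrh\n";
close $fh or die "close: $!";

# recsep is set explicitly so the result does not depend on the platform.
tie my @chomped, 'Tie::File', $file, recsep => "\n", mode => O_RDONLY
    or die "Cannot tie $file: $!";
tie my @raw, 'Tie::File', $file, recsep => "\n", mode => O_RDONLY, autochomp => 0
    or die "Cannot tie $file: $!";

my $with    = length $raw[0];        # "Gold\n"
my $without = length $chomped[0];    # "Gold"
print "raw: $with characters, chomped: $without characters\n";

untie @chomped;
untie @raw;
```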

    mode

    Normally, the specified file will be opened for read and write access, and will be created if it does not exist. (That is, the flags O_RDWR | O_CREAT are supplied in the open call.) If you want to change this, you may supply alternative flags in the mode option. See Fcntl for a listing of available flags. For example:

    # open the file if it exists, but fail if it does not exist
    use Fcntl 'O_RDWR';
    tie @array, 'Tie::File', $file, mode => O_RDWR;

    # create the file if it does not exist
    use Fcntl 'O_RDWR', 'O_CREAT';
    tie @array, 'Tie::File', $file, mode => O_RDWR | O_CREAT;

    # open an existing file in read-only mode
    use Fcntl 'O_RDONLY';
    tie @array, 'Tie::File', $file, mode => O_RDONLY;

    Opening the data file in write-only or append mode is not supported.
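A sketch of the first two cases against a path that does not exist yet. The directory comes from File::Temp and the file name records.txt is arbitrary; the point is that tie returns a false value when the open fails, which is why the synopsis checks it with "or die".

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use File::Spec;
use File::Temp qw(tempdir);
use Tie::File;

my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'records.txt');   # does not exist yet

# Without O_CREAT the tie fails rather than creating the file.
my $tied = tie my @strict, 'Tie::File', $file, mode => O_RDWR;
print "first attempt: ", ($tied ? "tied" : "failed ($!)"), "\n";

# With O_CREAT (as in the default flags) the file is created on demand.
tie my @array, 'Tie::File', $file, mode => O_RDWR | O_CREAT
    or die "Cannot tie $file: $!";
push @array, "first record";
untie @array;

print "file exists after second attempt\n" if -e $file;
```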

    memory

    This is an upper limit on the amount of memory that Tie::File will consume at any time while managing the file. This is used for two things: managing the read cache and managing the deferred write buffer.

    Records read in from the file are cached, to avoid having to re-read them repeatedly. If you read the same record twice, the first time it will be stored in memory, and the second time it will be fetched from the read cache. The amount of data in the read cache will not exceed the value you specified for memory. If Tie::File wants to cache a new record, but the read cache is full, it will make room by expiring the least-recently visited records from the read cache.

    The default memory limit is 2 MiB. You can adjust the maximum read cache size by supplying the memory option. The argument is the desired cache size, in bytes.

    # I have a lot of memory, so use a large cache to speed up access
    tie @array, 'Tie::File', $file, memory => 20_000_000;

    Setting the memory limit to 0 will inhibit caching; records will be fetched from disk every time you examine them.

    The memory value is not an absolute or exact limit on the memory used. Tie::File objects contain some structures besides the read cache and the deferred write buffer, whose sizes are not charged against memory.

    The cache itself consumes about 310 bytes per cached record, so if your file has many short records, you may want to decrease the cache memory limit, or else the cache overhead may exceed the size of the cached data.
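
    To see how quickly the overhead adds up, here is a rough back-of-the-envelope calculation. It assumes that only the record data itself is charged against memory, and the 20-byte average record length is an arbitrary illustrative figure; the 310-byte figure is the approximate per-record overhead quoted above.

```perl
use strict;
use warnings;

my $memory_limit   = 2 * 1024 * 1024;  # default limit of 2 MiB, in bytes
my $avg_record_len = 20;               # assumed average record size, in bytes
my $overhead       = 310;              # approximate cache overhead per record

# Assumption: only record data counts toward the memory limit.
my $cacheable_records = int($memory_limit / $avg_record_len);
my $overhead_bytes    = $cacheable_records * $overhead;

printf "%d records fit in the cache; their overhead is about %.1f MB\n",
    $cacheable_records, $overhead_bytes / 1e6;
```

    Under these assumptions the bookkeeping is roughly fifteen times larger than the cached data, which is why a smaller memory value can be the better choice for files with many short records.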

    dw_size

    (This is an advanced feature. Skip this section on first reading.)

    If you use deferred writing (See Deferred Writing, below) then data you write into the array will not be written directly to the file; instead, it will be saved in the deferred write buffer to be written out later. Data in the deferred write buffer is also charged against the memory limit you set with the memory option.

    You may set the dw_size option to limit the amount of data that can be saved in the deferred write buffer. This limit may not exceed the total memory limit. For example, if you set dw_size to 1000 and memory to 2500, that means that no more than 1000 bytes of deferred writes will be saved up. The space available for the read cache will vary, but it will always be at least 1500 bytes (if the deferred write buffer is full) and it could grow as large as 2500 bytes (if the deferred write buffer is empty.)

    If you don't specify a dw_size, it defaults to the entire memory limit.
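
    A minimal sketch of the 1000/2500 split described above, using a temporary file and made-up record contents. It only shows how the two options are passed together; it does not measure the buffers.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
close $fh or die "Cannot close $file: $!";

# At most 1000 bytes of deferred writes, out of a 2500-byte total budget.
my $o = tie my @array, 'Tie::File', $file, memory => 2500, dw_size => 1000
    or die "Cannot tie $file: $!";

push @array, 'first record', 'second record';

$o->defer;                     # start saving writes in the deferred buffer
$array[0] = 'FIRST RECORD';    # small enough to fit within dw_size
$o->flush;                     # write the buffered change to the file

my @records = @array;
undef $o;
untie @array;

print "$_\n" for @records;
```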

    Option Format

    -mode is a synonym for mode. -recsep is a synonym for recsep. -memory is a synonym for memory. You get the idea.

    Public Methods

    The tie call returns an object, say $o. You may call

    $rec = $o->FETCH($n);
    $o->STORE($n, $rec);

    to fetch or store the record at line $n, respectively; similarly the other tied array methods. (See perltie for details.) You may also call the following methods on this object:
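
    For instance, the object interface can be used side by side with the tied array. This is only a sketch; the temporary file and its contents are illustrative.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "alpha\nbeta\ngamma\n";
close $fh or die "Cannot close $file: $!";

my $o = tie my @array, 'Tie::File', $file
    or die "Cannot tie $file: $!";

my $n   = 1;
my $rec = $o->FETCH($n);        # same record as $array[$n]
$o->STORE($n, uc $rec);         # same effect as $array[$n] = uc $rec

my $updated = $array[$n];
my $size    = $o->FETCHSIZE;    # another standard tied-array method

undef $o;                       # drop our reference before untying
untie @array;

print "record $n: $rec -> $updated ($size records)\n";
```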

    flock

    $o->flock(MODE)

    will lock the tied file. MODE has the same meaning as the second argument to the Perl built-in flock function; for example LOCK_SH or LOCK_EX | LOCK_NB . (These constants are provided by the use Fcntl ':flock' declaration.)

    MODE is optional; the default is LOCK_EX.

    Tie::File maintains an internal table of the byte offset of each record it has seen in the file.

    When you use flock to lock the file, Tie::File assumes that the read cache is no longer trustworthy, because another process might have modified the file since the last time it was read. Therefore, a successful call to flock discards the contents of the read cache and the internal record offset table.

    Tie::File promises that the following sequence of operations will be safe:

    my $o = tie @array, "Tie::File", $filename;
    $o->flock;

    In particular, Tie::File will not read or write the file during the tie call. (Exception: Using mode => O_TRUNC will, of course, erase the file during the tie call. If you want to do this safely, then open the file without O_TRUNC, lock the file, and use @array = ().)

    The best way to unlock a file is to discard the object and untie the array. It is probably unsafe to unlock the file without also untying it, because if you do, changes may remain unwritten inside the object. That is why there is no shortcut for unlocking. If you really want to unlock the file prematurely, you know what to do; if you don't know what to do, then don't do it.

    All the usual warnings about file locking apply here. In particular, note that file locking in Perl is advisory, which means that holding a lock will not prevent anyone else from reading, writing, or erasing the file; it only prevents them from getting another lock at the same time. Locks are analogous to green traffic lights: If you have a green light, that does not prevent the idiot coming the other way from plowing into you sideways; it merely guarantees to you that the idiot does not also have a green light at the same time.
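
    Put together, a locked update might look like the sketch below. The temporary file and record text are illustrative, and the sketch assumes a platform where flock is available; the lock is given up by discarding the object and untying the array, as recommended above.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use Tie::File;
use File::Temp qw(tempfile);

my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "existing entry\n";
close $fh or die "Cannot close $filename: $!";

# Tie first, then lock: the tie call itself does not touch the file.
my $o = tie my @array, 'Tie::File', $filename
    or die "Cannot tie $filename: $!";
my $locked = $o->flock(LOCK_EX)
    or die "Cannot lock $filename: $!";

push @array, 'appended while holding the lock';

# Discard the object and untie; do not unlock by hand.
undef $o;
untie @array;

open my $in, '<', $filename or die "Cannot read $filename: $!";
my @on_disk = <$in>;
close $in;
print @on_disk;
```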

    autochomp

    my $old_value = $o->autochomp(0); # disable autochomp option
    my $old_value = $o->autochomp(1); # enable autochomp option
    my $ac = $o->autochomp(); # recover current value

    See autochomp, above.
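
    A short sketch of toggling the setting on a live object and restoring it afterwards. The file contents and variable names are illustrative, and recsep is pinned to "\n" for portability.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "Gold\nFrankincense\nMyrrh\n";
close $fh or die "Cannot close $file: $!";

my $o = tie my @gifts, 'Tie::File', $file, recsep => "\n"
    or die "Cannot tie $file: $!";

my $chomped   = $gifts[0];            # autochomp is on by default
my $old_value = $o->autochomp(0);     # turn it off, remembering the old value
my $unchomped = $gifts[0];            # now includes the record separator
$o->autochomp($old_value);            # put the previous setting back
my $restored  = $gifts[0];

undef $o;
untie @gifts;

print "separator kept while autochomp was off\n" if $unchomped eq "Gold\n";
```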

    defer , flush , discard , and autodefer

    See Deferred Writing, below.

    offset

    $off = $o->offset($n);

    This method returns the byte offset of the start of record $n in the file. If there is no such record, it returns an undefined value.
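
    As an illustration, the sketch below asks for the offsets of the three records in a small file and for a record well past the end. The file contents are made up, and recsep is pinned to "\n" so that the byte counts are the same everywhere.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "Gold\nFrankincense\nMyrrh\n";
close $fh or die "Cannot close $file: $!";

my $o = tie my @gifts, 'Tie::File', $file, recsep => "\n"
    or die "Cannot tie $file: $!";

# "Gold\n" is 5 bytes and "Frankincense\n" is 13 bytes.
my @offsets = map { $o->offset($_) } 0 .. 2;
my $beyond  = $o->offset(10);          # no such record

undef $o;
untie @gifts;

print "record $_ starts at byte $offsets[$_]\n" for 0 .. 2;
print "record 10 does not exist\n" unless defined $beyond;
```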

    Tying to an already-opened filehandle

    If $fh is a filehandle, such as is returned by IO::File or one of the other IO modules, you may use:

    tie @array, 'Tie::File', $fh, ...;

    Similarly if you opened that handle FH with regular open or sysopen, you may use:

    tie @array, 'Tie::File', \*FH, ...;

    Handles that were opened write-only won't work. Handles that were opened read-only will work as long as you don't try to modify the array. Handles must be attached to seekable sources of data---that means no pipes or sockets. If Tie::File can detect that you supplied a non-seekable handle, the tie call will throw an exception. (On Unix systems, it can detect this.)

    Note that Tie::File will only close any filehandles that it opened internally. If you passed it a filehandle as above, you "own" the filehandle, and are responsible for closing it after you have untied the @array.
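
    The ownership rule can be sketched as follows: the script opens a handle for reading and writing, ties an array to it, and closes the handle itself after untying. The temporary file and record contents are illustrative.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "alpha\nbeta\n";
close $tmp or die "Cannot close $path: $!";

# Open the handle ourselves, for both reading and writing.
open my $fh, '+<', $path or die "Cannot open $path: $!";

tie my @array, 'Tie::File', $fh
    or die "Cannot tie to the handle: $!";
my $first = $array[0];
$array[1] = 'BETA';
untie @array;

# Tie::File did not open the handle, so it is still ours to close.
my $still_open = defined fileno($fh);
close $fh or die "Cannot close $path: $!";

print "first record: $first\n";
print "handle was still open after untie\n" if $still_open;
```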

    Deferred Writing

    (This is an advanced feature. Skip this section on first reading.)

    Normally, modifying a Tie::File array writes to the underlying file immediately. Every assignment like $a[3] = ... rewrites as much of the file as is necessary; typically, everything from line 3 through the end will need to be rewritten. This is the simplest and most transparent behavior. Performance even for large files is reasonably good.

    However, under some circumstances, this behavior may be excessively slow. For example, suppose you have a million-record file, and you want to do:

    for (@FILE) {
        $_ = "> $_";
    }

    The first time through the loop, you will rewrite the entire file, from line 0 through the end. The second time through the loop, you will rewrite the entire file from line 1 through the end. The third time through the loop, you will rewrite the entire file from line 2 to the end. And so on.

    If the performance in such cases is unacceptable, you may defer the actual writing, and then have it done all at once. The following loop will perform much better for large files:

    (tied @a)->defer;
    for (@a) {
        $_ = "> $_";
    }
    (tied @a)->flush;

    If Tie::File's memory limit is large enough, all the writing will be done in memory. Then, when you call ->flush, the entire file will be rewritten in a single pass.

    (Actually, the preceding discussion is something of a fib. You don't need to enable deferred writing to get good performance for this common case, because Tie::File will do it for you automatically unless you specifically tell it not to. See Autodeferring, below.)

    Calling ->flush returns the array to immediate-write mode. If you wish to discard the deferred writes, you may call ->discard instead of ->flush. Note that in some cases, some of the data will have been written already, and it will be too late for ->discard to discard all the changes. Support for ->discard may be withdrawn in a future version of Tie::File.

    Deferred writes are cached in memory up to the limit specified by the dw_size option (see above). If the deferred-write buffer is full and you try to write still more deferred data, the buffer will be flushed. All buffered data will be written immediately, the buffer will be emptied, and the now-empty space will be used for future deferred writes.

    If the deferred-write buffer isn't yet full, but the total size of the buffer and the read cache would exceed the memory limit, the oldest records will be expired from the read cache until the total size is under the limit.

    push, pop, shift, unshift, and splice cannot be deferred. When you perform one of these operations, any deferred data is written to the file and the operation is performed immediately. This may change in a future version.

    If you resize the array with deferred writing enabled, the file will be resized immediately, but deferred records will not be written. This has a surprising consequence: @a = (...) erases the file immediately, but the writing of the actual data is deferred. This might be a bug. If it is a bug, it will be fixed in a future version.
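
    A complete, runnable sketch of the defer/flush cycle is shown below. The temporary file, its three records, and the slurp helper are illustrative; the file is read directly before and after the flush so the effect of deferring can be inspected.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical helper: return the whole file as one string.
sub slurp {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    local $/;
    return scalar <$in>;
}

my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "one\ntwo\nthree\n";
close $fh or die "Cannot close $file: $!";

my $o = tie my @a, 'Tie::File', $file
    or die "Cannot tie $file: $!";

$o->defer;                         # start buffering writes
$_ = "> $_" for @a;                # changes go to the deferred write buffer
my $before_flush = slurp($file);   # what is on disk while writes are pending
$o->flush;                         # write everything out
my $after_flush  = slurp($file);

undef $o;
untie @a;

print "before flush:\n$before_flush";
print "after flush:\n$after_flush";
```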

    Autodeferring

    Tie::File tries to guess when deferred writing might be helpful, and to turn it on and off automatically.

    for (@a) {
        $_ = "> $_";
    }

    In this example, only the first two assignments will be done immediately; after this, all the changes to the file will be deferred up to the user-specified memory limit.

    You should usually be able to ignore this and just use the module without thinking about deferring. However, special applications may require fine control over which writes are deferred, or may require that all writes be immediate. To disable the autodeferment feature, use

    (tied @o)->autodefer(0);

    or

    tie @array, 'Tie::File', $file, autodefer => 0;

    Similarly, ->autodefer(1) re-enables autodeferment, and ->autodefer() recovers the current value of the autodefer setting.
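
    Both ways of switching the feature off can be checked with a short script. This is a sketch only; the temporary file and variable names are illustrative.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "one\ntwo\n";
close $fh or die "Cannot close $file: $!";

# Disable autodeferment on an existing object.
my $o = tie my @array, 'Tie::File', $file
    or die "Cannot tie $file: $!";
my $was_enabled = $o->autodefer(0);    # returns the previous setting
my $now_enabled = $o->autodefer();     # recovers the current setting
undef $o;
untie @array;

# Disable autodeferment at tie time instead.
tie my @again, 'Tie::File', $file, autodefer => 0
    or die "Cannot tie $file: $!";
my $initially_enabled = (tied @again)->autodefer();
untie @again;

printf "default: %s, after autodefer(0): %s, tied with autodefer => 0: %s\n",
    map { $_ ? 'on' : 'off' } $was_enabled, $now_enabled, $initially_enabled;
```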

    CONCURRENT ACCESS TO FILES

    Caching and deferred writing are inappropriate if you want the same file to be accessed simultaneously from more than one process. Other optimizations performed internally by this module are also incompatible with concurrent access. A future version of this module will support a concurrent => 1 option that enables safe concurrent access.

    Previous versions of this documentation suggested using memory => 0 for safe concurrent access. This was mistaken. Tie::File will not support safe concurrent access before version 0.96.

    CAVEATS

    (That's Latin for 'warnings'.)

    • Reasonable effort was made to make this module efficient. Nevertheless, changing the size of a record in the middle of a large file will always be fairly slow, because everything after the new record must be moved.

    • The behavior of tied arrays is not precisely the same as for regular arrays. For example:

      # This DOES print "How unusual!"
      undef $a[10]; print "How unusual!\n" if defined $a[10];

      undef-ing a Tie::File array element just blanks out the corresponding record in the file. When you read it back again, you'll get the empty string, so the supposedly-undef'ed value will be defined. Similarly, if you have autochomp disabled, then

      # This DOES print "How unusual!" if 'autochomp' is disabled
      undef $a[10];
      print "How unusual!\n" if $a[10];

      This happens because, when autochomp is disabled, $a[10] will read back as "\n" (or whatever the record separator string is).

      There are other minor differences, particularly regarding exists and delete, but in general, the correspondence is extremely close.

    • I have supposed that since this module is concerned with file I/O, almost all normal use of it will be heavily I/O bound. This means that the time to maintain complicated data structures inside the module will be dominated by the time to actually perform the I/O. When there was an opportunity to spend CPU time to avoid doing I/O, I usually tried to take it.

    • You might be tempted to think that deferred writing is like transactions, with flush as commit and discard as rollback, but it isn't, so don't.

    • There is a large memory overhead for each record offset and for each cache entry: about 310 bytes per cached data record, and about 21 bytes per offset table entry.

      The per-record overhead will limit the maximum number of records you can access per file. Note that accessing the length of the array via $x = scalar @tied_file accesses all records and stores their offsets. The same is true of foreach (@tied_file), even if you exit the loop early.
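
    The undef caveat from the list above can be reproduced with a small script. This sketch uses a temporary three-record file and undefines an element that already exists; the file contents and variable names are illustrative.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "one\ntwo\nthree\n";
close $fh or die "Cannot close $file: $!";

tie my @a, 'Tie::File', $file, recsep => "\n"
    or die "Cannot tie $file: $!";

undef $a[1];                        # blanks out the record in the file
my $is_defined = defined $a[1];     # true, unlike an ordinary array
my $value      = $a[1];             # the empty string

untie @a;

print "How unusual!\n" if $is_defined;
```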

    SUBCLASSING

    This version promises absolutely nothing about the internals, which may change without notice. A future version of the module will have a well-defined and stable subclassing API.

    WHAT ABOUT DB_File ?

    People sometimes point out that DB_File will do something similar, and ask why the Tie::File module is necessary.

    There are a number of reasons that you might prefer Tie::File. A list is available at http://perl.plover.com/TieFile/why-not-DB_File.

    AUTHOR

    Mark Jason Dominus

    To contact the author, send email to: mjd-perl-tiefile+@plover.com

    To receive an announcement whenever a new version of this module is released, send a blank email message to mjd-perl-tiefile-subscribe@plover.com.

    The most recent version of this module, including documentation and any news of importance, will be available at

    http://perl.plover.com/TieFile/

    LICENSE

    Tie::File version 0.96 is copyright (C) 2003 Mark Jason Dominus.

    This library is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

    These terms are your choice of any of (1) the Perl Artistic Licence, or (2) version 2 of the GNU General Public License as published by the Free Software Foundation, or (3) any later version of the GNU General Public License.

    This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along with this library program; it should be in the file COPYING. If not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

    For licensing inquiries, contact the author at:

    Mark Jason Dominus
    255 S. Warnock St.
    Philadelphia, PA 19107

    WARRANTY

    Tie::File version 0.98 comes with ABSOLUTELY NO WARRANTY. For details, see the license.

    THANKS

    Gigantic thanks to Jarkko Hietaniemi, for agreeing to put this in the core when I hadn't written it yet, and for generally being helpful, supportive, and competent. (Usually the rule is "choose any one.") Also big thanks to Abhijit Menon-Sen for all of the same things.

    Special thanks to Craig Berry and Peter Prymmer (for VMS portability help), Randy Kobes (for Win32 portability help), Clinton Pierce and Autrijus Tang (for heroic eleventh-hour Win32 testing above and beyond the call of duty), Michael G Schwern (for testing advice), and the rest of the CPAN testers (for testing generally).

    Special thanks to Tels for suggesting several speed and memory optimizations.

    Additional thanks to: Edward Avis / Mattia Barbon / Tom Christiansen / Gerrit Haase / Gurusamy Sarathy / Jarkko Hietaniemi (again) / Nikola Knezevic / John Kominetz / Nick Ing-Simmons / Tassilo von Parseval / H. Dieter Pearcey / Slaven Rezic / Eric Roode / Peter Scott / Peter Somu / Autrijus Tang (again) / Tels (again) / Juerd Waalboer / Todd Rinaldo

    TODO

    More tests. (Stuff I didn't think of yet.)

    Paragraph mode?

    Fixed-length mode. Leave-blanks mode.

    Maybe an autolocking mode?

    For many common uses of the module, the read cache is a liability. For example, a program that inserts a single record, or that scans the file once, will have a cache hit rate of zero. This suggests a major optimization: The cache should be initially disabled. Here's a hybrid approach: Initially, the cache is disabled, but the cache code maintains statistics about how high the hit rate would be *if* it were enabled. When it sees the hit rate get high enough, it enables itself. The STAT comments in this code are the beginning of an implementation of this.

    Record locking with fcntl()? Then the module might support an undo log and get real transactions. What a tour de force that would be.

    Keeping track of the highest cached record. This would allow reads-in-a-row to skip the cache lookup faster (if reading from 1..N with empty cache at start, the last cached value will be always N-1).

    More tests.

     

    Tie::Handle

    NAME

    Tie::Handle - base class definitions for tied handles

    SYNOPSIS

    package NewHandle;
    require Tie::Handle;
    @ISA = qw(Tie::Handle);
    sub READ { ... } # Provide a needed method
    sub TIEHANDLE { ... } # Overrides inherited method

    package main;
    tie *FH, 'NewHandle';

    DESCRIPTION

    This module provides some skeletal methods for handle-tying classes. See perltie for a list of the functions required in tying a handle to a package. The basic Tie::Handle package provides a new method, as well as methods TIEHANDLE, PRINT, PRINTF and GETC.

    For developers wishing to write their own tied-handle classes, the methods are summarized below. The perltie section not only documents these, but has sample code as well:

    • TIEHANDLE classname, LIST

      The method invoked by the command tie *glob, classname. Associates a new glob instance with the specified class. LIST would represent additional arguments (along the lines of AnyDBM_File and compatriots) needed to complete the association.

    • WRITE this, scalar, length, offset

      Write length bytes of data from scalar starting at offset.

    • PRINT this, LIST

      Print the values in LIST

    • PRINTF this, format, LIST

      Print the values in LIST using format

    • READ this, scalar, length, offset

      Read length bytes of data into scalar starting at offset.

    • READLINE this

      Read a single line

    • GETC this

      Get a single character

    • CLOSE this

      Close the handle

    • OPEN this, filename

      (Re-)open the handle

    • BINMODE this

      Specify content is binary

    • EOF this

      Test for end of file.

    • TELL this

      Return position in the file.

    • SEEK this, offset, whence

      Position the file.

    • DESTROY this

      Free the storage associated with the tied handle referenced by this. This is rarely needed, as Perl manages its memory quite well. But the option exists, should a class wish to perform specific actions upon the destruction of an instance.
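
    As a sketch of how these methods fit together, the hypothetical CaptureHandle class below defines only TIEHANDLE and WRITE and relies on the PRINT and PRINTF methods inherited from Tie::Handle, which pass their output to WRITE. The class name and the in-memory buffer are illustrative assumptions, not part of the module.

```perl
package CaptureHandle;
use strict;
use warnings;
use parent 'Tie::Handle';

# The object is a reference to the string that accumulates the output.
sub TIEHANDLE {
    my ($class) = @_;
    my $buffer = '';
    return bless \$buffer, $class;
}

# WRITE this, scalar, length, offset
sub WRITE {
    my ($self, $data, $length, $offset) = @_;
    $$self .= substr($data, $offset, $length);
    return $length;
}

package main;

tie *CAPTURE, 'CaptureHandle';
print  CAPTURE 'hello, ', 'world';    # inherited PRINT calls our WRITE
printf CAPTURE ' %03d', 7;            # inherited PRINTF calls our WRITE
my $captured = ${ tied *CAPTURE };    # copy the buffer out of the object
untie *CAPTURE;

print "$captured\n";
```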

    MORE INFORMATION

    The perltie section contains an example of tying handles.

    COMPATIBILITY

    This version of Tie::Handle is neither related to nor compatible with the Tie::Handle (3.0) module available on CPAN. It was due to an accident that two modules with the same name appeared. The namespace clash was resolved in September 2000 in favor of this module, which comes with the perl core, and accordingly the version number has been bumped up to 4.0.

     

    Tie::Hash

    NAME

    Tie::Hash, Tie::StdHash, Tie::ExtraHash - base class definitions for tied hashes

    SYNOPSIS

    package NewHash;
    require Tie::Hash;
    @ISA = qw(Tie::Hash);
    sub DELETE { ... } # Provides needed method
    sub CLEAR { ... } # Overrides inherited method

    package NewStdHash;
    require Tie::Hash;
    @ISA = qw(Tie::StdHash);
    # All methods provided by default, define only those needing overrides
    # Accessors access the storage in %{$_[0]};
    # TIEHASH should return a reference to the actual storage
    sub DELETE { ... }

    package NewExtraHash;
    require Tie::Hash;
    @ISA = qw(Tie::ExtraHash);
    # All methods provided by default, define only those needing overrides
    # Accessors access the storage in %{$_[0][0]};
    # TIEHASH should return an array reference with the first element being
    # the reference to the actual storage
    sub DELETE {
        $_[0][1]->('del', $_[0][0], $_[1]); # Call the report writer
        delete $_[0][0]->{$_[1]}; # $_[0]->SUPER::DELETE($_[1])
    }

    package main;
    tie %new_hash, 'NewHash';
    tie %new_std_hash, 'NewStdHash';
    tie %new_extra_hash, 'NewExtraHash',
        sub {warn "Doing \U$_[1]\E of $_[2].\n"};

    DESCRIPTION

    This module provides some skeletal methods for hash-tying classes. See perltie for a list of the functions required in order to tie a hash to a package. The basic Tie::Hash package provides a new method, as well as methods TIEHASH, EXISTS and CLEAR. The Tie::StdHash and Tie::ExtraHash packages provide most methods for hashes described in perltie (the exceptions are UNTIE and DESTROY). They cause tied hashes to behave exactly like standard hashes, and allow for selective overwriting of methods. Tie::Hash grandfathers the new method: it is used if a class forgets to define a TIEHASH method.

    For developers wishing to write their own tied hashes, the required methods are briefly defined below. See the perltie section for more detailed descriptions, as well as example code:

    • TIEHASH classname, LIST

      The method invoked by the command tie %hash, classname. Associates a new hash instance with the specified class. LIST would represent additional arguments (along the lines of AnyDBM_File and compatriots) needed to complete the association.

    • STORE this, key, value

      Store datum value into key for the tied hash this.

    • FETCH this, key

      Retrieve the datum in key for the tied hash this.

    • FIRSTKEY this

      Return the first key in the hash.

    • NEXTKEY this, lastkey

      Return the next key in the hash.

    • EXISTS this, key

      Verify that key exists with the tied hash this.

      The Tie::Hash implementation is a stub that simply croaks.

    • DELETE this, key

      Delete the key key from the tied hash this.

    • CLEAR this

      Clear all values from the tied hash this.

    • SCALAR this

      Returns what evaluating the hash in scalar context yields.

      Tie::Hash does not implement this method (but Tie::StdHash and Tie::ExtraHash do).

    Inheriting from Tie::StdHash

    The accessor methods assume that the actual storage for the data in the tied hash is in the hash referenced by tied(%tiedhash). Thus an overridden TIEHASH method should return a hash reference, and the remaining methods should operate on the hash referenced by the first argument:

    package ReportHash;
    our @ISA = 'Tie::StdHash';

    sub TIEHASH {
        my $storage = bless {}, shift;
        warn "New ReportHash created, stored in $storage.\n";
        $storage
    }
    sub STORE {
        warn "Storing data with key $_[1] at $_[0].\n";
        $_[0]{$_[1]} = $_[2]
    }

    Inheriting from Tie::ExtraHash

    The accessor methods assume that the actual storage for the data in the tied hash is in the hash referenced by (tied(%tiedhash))->[0]. Thus an overridden TIEHASH method should return an array reference with the first element being a hash reference, and the remaining methods should operate on the hash %{ $_[0]->[0] }:

    package ReportHash;
    our @ISA = 'Tie::ExtraHash';

    sub TIEHASH {
        my $class = shift;
        my $storage = bless [{}, @_], $class;
        warn "New ReportHash created, stored in $storage.\n";
        $storage;
    }
    sub STORE {
        warn "Storing data with key $_[1] at $_[0].\n";
        $_[0][0]{$_[1]} = $_[2]
    }

    The default TIEHASH method stores "extra" arguments to tie() starting from offset 1 in the array referenced by tied(%tiedhash); this is the same storage algorithm as in the TIEHASH subroutine above. Hence, a typical package inheriting from Tie::ExtraHash does not need to override this method.
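
    To illustrate the default layout, the hypothetical LoggedHash class below inherits TIEHASH from Tie::ExtraHash and overrides only STORE. The class name and the callback passed as the extra argument to tie() are assumptions made for this sketch.

```perl
package LoggedHash;
use strict;
use warnings;
require Tie::Hash;
our @ISA = ('Tie::ExtraHash');

# No TIEHASH here: the inherited one builds [ {}, @extra_arguments ].
sub STORE {
    my ($self, $key, $value) = @_;
    $self->[1]->($key);            # first extra argument: a callback
    $self->[0]{$key} = $value;     # element 0: the real storage
}

package main;

my @stored_keys;
tie my %hash, 'LoggedHash', sub { push @stored_keys, $_[0] };

$hash{apple}  = 1;
$hash{banana} = 2;

my $banana     = $hash{banana};           # inherited FETCH reads element 0
my $extra_type = ref( (tied %hash)->[1] );

print "stored keys: @stored_keys\n";
```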

    SCALAR, UNTIE and DESTROY

    The methods UNTIE and DESTROY are not defined in Tie::Hash, Tie::StdHash, or Tie::ExtraHash. Tied hashes do not require the presence of these methods, but if they are defined, they will be called at the proper time; see perltie.

    SCALAR is only defined in Tie::StdHash and Tie::ExtraHash.

    If needed, these methods should be defined by the package inheriting from Tie::Hash, Tie::StdHash, or Tie::ExtraHash. See SCALAR in perltie to find out what happens when SCALAR does not exist.

    MORE INFORMATION

    The packages relating to various DBM-related implementations (DB_File, NDBM_File, etc.) show examples of general tied hashes, as does the Config module. While these do not utilize Tie::Hash, they serve as good working examples.

     

    Tie::Memoize

    NAME

    Tie::Memoize - add data to hash when needed

    SYNOPSIS

    require Tie::Memoize;
    tie %hash, 'Tie::Memoize',
        \&fetch, # The rest is optional
        $DATA, \&exists,
        {%ini_value}, {%ini_existence};

    DESCRIPTION

    This package allows a tied hash to autoload its values on the first access, and to use the cached value on the following accesses.

    Only read-accesses (via fetching the value or exists) result in calls to the functions; the modify-accesses are performed as on a normal hash.

    The required arguments during tie are the hash, the package, and the reference to the FETCHing function. The optional arguments are an arbitrary scalar $data, the reference to the EXISTS function, and initial values of the hash and of the existence cache.

    Both the FETCHing function and the EXISTS functions have the same signature: the arguments are $key, $data; $data is the same value as given as argument during tie()ing. Both functions should return an empty list if the value does not exist. If EXISTS function is different from the FETCHing function, it should return a TRUE value on success. The FETCHing function should return the intended value if the key is valid.
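
    A small self-contained sketch of this calling convention follows. The scaled_square function, the scale setting passed as $data, and the call counter are all illustrative; the counter is there only to show that the second access is served from the cache.

```perl
use strict;
use warnings;
use Tie::Memoize;

my $calls = 0;

# FETCHing function: receives ($key, $data) and returns an empty list
# for keys that have no value.
sub scaled_square {
    my ($key, $data) = @_;
    return unless $key =~ /\A\d+\z/;
    $calls++;
    return $key * $key * $data->{scale};
}

tie my %square, 'Tie::Memoize', \&scaled_square, { scale => 10 };

my $first   = $square{12};      # calls scaled_square
my $second  = $square{12};      # served from the cache
my $missing = $square{oops};    # no value: scaled_square returns ()

print "value $first, computed $calls time(s)\n";
```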

    Inheriting from Tie::Memoize

    The structure of the tied() data is an array reference with elements

    0: cache of known values
    1: cache of known existence of keys
    2: FETCH function
    3: EXISTS function
    4: $data

    The rest is for internal usage of this package. In particular, if TIEHASH is overwritten, it should call SUPER::TIEHASH.

    EXAMPLE

    sub slurp {
        my ($key, $dir) = @_;
        open my $h, '<', "$dir/$key" or return;
        local $/; <$h> # slurp it all
    }
    # Return an empty list, not a false value, when the file is missing
    sub exists { my ($key, $dir) = @_; return -f "$dir/$key" ? 1 : () }

    tie %hash, 'Tie::Memoize', \&slurp, $directory, \&exists,
        { fake_file1 => $content1, fake_file2 => $content2 },
        { pretend_does_not_exists => 0, known_to_exist => 1 };

    This example treats the slightly modified contents of $directory as a hash. The modifications are that the keys fake_file1 and fake_file2 fetch values $content1 and $content2, and pretend_does_not_exists will never be accessed. Additionally, the existence of known_to_exist is never checked (so if it does not exist when its content is needed, the user of %hash may be confused).

    BUGS

    FIRSTKEY and NEXTKEY methods go through the keys which were already read, not all the possible keys of the hash.

    AUTHOR

    Ilya Zakharevich mailto:perl-module-hash-memoize@ilyaz.org.

     

    Tie::RefHash

    NAME

    Tie::RefHash - use references as hash keys

    SYNOPSIS

    require 5.004;
    use Tie::RefHash;
    tie HASHVARIABLE, 'Tie::RefHash', LIST;
    tie HASHVARIABLE, 'Tie::RefHash::Nestable', LIST;
    untie HASHVARIABLE;

    DESCRIPTION

    This module provides the ability to use references as hash keys if you first tie the hash variable to this module. Normally, only the keys of the tied hash itself are preserved as references; to use references as keys in hashes-of-hashes, use Tie::RefHash::Nestable, included as part of Tie::RefHash.

    It is implemented using the standard perl TIEHASH interface. Please see the tie entry in perlfunc(1) and perltie(1) for more information.

    The Nestable version works by looking for hash references being stored and converting them to tied hashes so that they too can have references as keys. This will happen without warning whenever you store a reference to one of your own hashes in the tied hash.

    EXAMPLE

    use Tie::RefHash;
    tie %h, 'Tie::RefHash';
    $a = [];
    $b = {};
    $c = \*main;
    $d = \"gunk";
    $e = sub { 'foo' };
    %h = ($a => 1, $b => 2, $c => 3, $d => 4, $e => 5);
    $a->[0] = 'foo';
    $b->{foo} = 'bar';
    for (keys %h) {
        print ref($_), "\n";
    }

    tie %h, 'Tie::RefHash::Nestable';
    $h{$a}->{$b} = 1;
    for (keys %h, keys %{$h{$a}}) {
        print ref($_), "\n";
    }
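
    For contrast, the sketch below stores the same array reference as a key in an ordinary hash and in a hash tied to Tie::RefHash. The variable names are illustrative.

```perl
use strict;
use warnings;
use Tie::RefHash;

my $list = [1, 2, 3];

# Ordinary hash: the reference is turned into a string when used as a key.
my %plain = ($list => 'plain');
my ($plain_key) = keys %plain;

# Tied hash: the key comes back as the original reference.
tie my %by_ref, 'Tie::RefHash';
$by_ref{$list} = 'tied';
my ($ref_key) = keys %by_ref;

printf "plain key is a %s\n", ref($plain_key) ? 'reference' : 'string';
printf "tied key is an %s reference with %d elements\n",
    ref($ref_key), scalar @$ref_key;
```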

    THREAD SUPPORT

    Tie::RefHash fully supports threading using the CLONE method.

    STORABLE SUPPORT

    Storable hooks are provided for semantically correct serialization and cloning of tied refhashes.

    RELIC SUPPORT

    This version of Tie::RefHash seems to no longer work with 5.004. This has not been thoroughly investigated. Patches welcome ;-)

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    MAINTAINER

    Yuval Kogman <nothingmuch@woobling.org>

    AUTHOR

    Gurusamy Sarathy gsar@activestate.com

    'Nestable' by Ed Avis ed@membled.com

    SEE ALSO

    perl(1), perlfunc(1), perltie(1)

     

    Tie::Scalar

    NAME

    Tie::Scalar, Tie::StdScalar - base class definitions for tied scalars

    SYNOPSIS

    package NewScalar;
    require Tie::Scalar;
    @ISA = qw(Tie::Scalar);
    sub FETCH { ... } # Provide a needed method
    sub TIESCALAR { ... } # Overrides inherited method

    package NewStdScalar;
    require Tie::Scalar;
    @ISA = qw(Tie::StdScalar);
    # All methods provided by default, so define only what needs be overridden
    sub FETCH { ... }

    package main;
    tie $new_scalar, 'NewScalar';
    tie $new_std_scalar, 'NewStdScalar';

    DESCRIPTION

    This module provides some skeletal methods for scalar-tying classes. See perltie for a list of the functions required in tying a scalar to a package. The basic Tie::Scalar package provides a new method, as well as methods TIESCALAR, FETCH and STORE. The Tie::StdScalar package provides all the methods specified in perltie. It inherits from Tie::Scalar and causes scalars tied to it to behave exactly like the built-in scalars, allowing for selective overloading of methods. The new method is provided as a means of grandfathering, for classes that forget to provide their own TIESCALAR method.

    For developers wishing to write their own tied-scalar classes, the methods are summarized below. The perltie section not only documents these, but has sample code as well:

    • TIESCALAR classname, LIST

      The method invoked by the command tie $scalar, classname. Associates a new scalar instance with the specified class. LIST would represent additional arguments (along the lines of AnyDBM_File and compatriots) needed to complete the association.

    • FETCH this

      Retrieve the value of the tied scalar referenced by this.

    • STORE this, value

      Store data value in the tied scalar referenced by this.

    • DESTROY this

      Free the storage associated with the tied scalar referenced by this. This is rarely needed, as Perl manages its memory quite well. But the option exists, should a class wish to perform specific actions upon the destruction of an instance.

    Tie::Scalar vs Tie::StdScalar

    Tie::Scalar provides all the necessary methods, but one should realize they do not do anything useful. Calling Tie::Scalar::FETCH or Tie::Scalar::STORE results in a (trappable) croak. And if you inherit from Tie::Scalar, you must provide either a new or a TIESCALAR method.

    If you are looking for a class that supplies every method you do not define yourself, use the Tie::StdScalar class, not the Tie::Scalar one.
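
    The difference can be seen in a minimal sketch. The class name UpperScalar is hypothetical; it inherits from Tie::StdScalar and overrides only STORE, relying on the inherited TIESCALAR and FETCH:

```perl
use strict;
use warnings;

# Hypothetical class: a scalar that upper-cases everything stored in it.
package UpperScalar;
require Tie::Scalar;
our @ISA = ('Tie::StdScalar');

# Only STORE is overridden. TIESCALAR and FETCH come from Tie::StdScalar,
# whose objects are blessed references to an ordinary scalar.
sub STORE {
    my ($self, $value) = @_;
    $$self = uc $value;
}

package main;

tie my $shout, 'UpperScalar';
$shout = 'hello';
print "$shout\n";    # prints "HELLO"
```

    Had the class inherited from Tie::Scalar instead, reading $shout would croak, because no FETCH would have been defined.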

    MORE INFORMATION

    The perltie section uses a good example of tying scalars by associating process IDs with priority.

     
    Tie::StdHandle

    NAME

    Tie::StdHandle - base class definitions for tied handles

    SYNOPSIS

        package NewHandle;
        require Tie::Handle;

        @ISA = qw(Tie::Handle);

        sub READ { ... }         # Provide a needed method
        sub TIEHANDLE { ... }    # Overrides inherited method

        package main;

        tie *FH, 'NewHandle';

    DESCRIPTION

    The Tie::StdHandle package provides most methods for file handles described in perltie (the exceptions are UNTIE and DESTROY). It causes tied file handles to behave exactly like standard file handles and allows for selective overwriting of methods.

     
    Tie::SubstrHash

    NAME

    Tie::SubstrHash - Fixed-table-size, fixed-key-length hashing

    SYNOPSIS

        require Tie::SubstrHash;

        tie %myhash, 'Tie::SubstrHash', $key_len, $value_len, $table_size;

    DESCRIPTION

    The Tie::SubstrHash package provides a hash-table-like interface to an array of determinate size, with constant key size and record size.

    Upon tying a new hash to this package, the developer must specify the size of the keys that will be used, the size of the value fields that the keys will index, and the size of the overall table (in terms of key-value pairs, not size in hard memory). These values will not change for the duration of the tied hash. The newly-allocated hash table may now have data stored and retrieved. Efforts to store more than $table_size elements will result in a fatal error, as will efforts to store a value not exactly $value_len characters in length, or reference through a key not exactly $key_len characters in length. While these constraints may seem excessive, the result is a hash table using much less internal memory than an equivalent freely-allocated hash table.
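
    As a minimal sketch of these constraints, the following ties a table with 2-character keys, 3-character values and room for 10 pairs (sizes chosen only for illustration), then traps the fatal error raised by a key of the wrong length:

```perl
use strict;
use warnings;
require Tie::SubstrHash;

# 2-character keys, 3-character values, room for 10 key-value pairs.
tie my %codes, 'Tie::SubstrHash', 2, 3, 10;

$codes{ab} = 'xyz';
print "$codes{ab}\n";    # prints "xyz"

# A key that is not exactly 2 characters long is a fatal error,
# so the store is wrapped in eval to trap it.
my $ok = eval { $codes{abc} = 'xyz'; 1 };
print "rejected: $@" unless $ok;
```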

    CAVEATS

    Because the current implementation uses the table and key sizes for the hashing algorithm, there is no means by which to dynamically change the value of any of the initialization parameters.

    The hash does not support exists().

     
    Tie::Hash::NamedCapture

    NAME

    Tie::Hash::NamedCapture - Named regexp capture buffers

    SYNOPSIS

        tie my %hash, "Tie::Hash::NamedCapture";
        # %hash now behaves like %+

        tie my %hash, "Tie::Hash::NamedCapture", all => 1;
        # %hash now accesses capture buffers like %-

    DESCRIPTION

    This module is used to implement the special hashes %+ and %-, but it can be used to tie other variables as you choose.

    When the all parameter is provided, then the tied hash elements will be array refs listing the contents of each capture buffer whose name is the same as the associated hash key. If none of these buffers were involved in the match, the contents of that array ref will be as many undef values as there are capture buffers with that name. In other words, the tied hash will behave as %-.

    When the all parameter is omitted or false, then the tied hash elements will be the contents of the leftmost defined buffer with the name of the associated hash key. In other words, the tied hash will behave as %+.

    The keys of hashes that behave like %- correspond to all buffer names found in the regular expression; the keys of hashes that behave like %+ list only the names of buffers that have captured (and that are thus associated with defined values).
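
    The two modes can be compared side by side. This is a small sketch in which the hash names %first and %all and the pattern are chosen only for illustration; two groups share the name x and both take part in the match:

```perl
use strict;
use warnings;
use Tie::Hash::NamedCapture;

tie my %first, 'Tie::Hash::NamedCapture';              # behaves like %+
tie my %all,   'Tie::Hash::NamedCapture', all => 1;    # behaves like %-

my ($first_x, @all_x);

# Two groups are both named "x"; each captures one character.
if ('ab' =~ /(?<x>a)(?<x>b)/) {
    $first_x = $first{x};        # leftmost defined buffer named x
    @all_x   = @{ $all{x} };     # every buffer named x, in order
}

print "first: $first_x\n";    # prints "first: a"
print "all: @all_x\n";        # prints "all: a b"
```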

    SEE ALSO

    perlreapi, re, Pragmatic Modules in perlmodlib, %+ in perlvar, %- in perlvar.

     
    Thread::Queue

    NAME

    Thread::Queue - Thread-safe queues

    VERSION

    This document describes Thread::Queue version 3.02

    SYNOPSIS

        use strict;
        use warnings;

        use threads;
        use Thread::Queue;

        my $q = Thread::Queue->new();    # A new empty queue

        # Worker thread
        my $thr = threads->create(
            sub {
                # Thread will loop until no more work
                while (defined(my $item = $q->dequeue())) {
                    # Do work on $item
                    ...
                }
            }
        );

        # Send work to the thread
        $q->enqueue($item1, ...);

        # Signal that there is no more work to be sent
        $q->end();

        # Join up with the thread when it finishes
        $thr->join();

        ...

        # Count of items in the queue
        my $left = $q->pending();

        # Non-blocking dequeue
        if (defined(my $item = $q->dequeue_nb())) {
            # Work on $item
        }

        # Blocking dequeue with 5-second timeout
        if (defined(my $item = $q->dequeue_timed(5))) {
            # Work on $item
        }

        # Get the second item in the queue without dequeuing anything
        my $item = $q->peek(1);

        # Insert two items into the queue just behind the head
        $q->insert(1, $item1, $item2);

        # Extract the last two items on the queue
        my ($item1, $item2) = $q->extract(-2, 2);

    DESCRIPTION

    This module provides thread-safe FIFO queues that can be accessed safely by any number of threads.

    Any data types supported by threads::shared can be passed via queues:

    • Ordinary scalars
    • Array refs
    • Hash refs
    • Scalar refs
    • Objects based on the above

    Ordinary scalars are added to queues as they are.

    If not already thread-shared, the other complex data types will be cloned (recursively, if needed, and including any blessings and read-only settings) into thread-shared structures before being placed onto a queue.

    For example, the following would cause Thread::Queue to create an empty, shared array reference via &shared([]), copy the elements 'foo', 'bar' and 'baz' from @ary into it, and then place that shared reference onto the queue:

        my @ary = qw/foo bar baz/;
        $q->enqueue(\@ary);

    However, for the following, the items are already shared, so their references are added directly to the queue, and no cloning takes place:

        my @ary :shared = qw/foo bar baz/;
        $q->enqueue(\@ary);

        my $obj = &shared({});
        $$obj{'foo'} = 'bar';
        $$obj{'qux'} = 99;
        bless($obj, 'My::Class');
        $q->enqueue($obj);

    See LIMITATIONS for caveats related to passing objects via queues.

    QUEUE CREATION

    • ->new()

      Creates a new empty queue.

    • ->new(LIST)

      Creates a new queue pre-populated with the provided list of items.

    BASIC METHODS

    The following methods deal with queues on a FIFO basis.

    • ->enqueue(LIST)

      Adds a list of items onto the end of the queue.

    • ->dequeue()
    • ->dequeue(COUNT)

      Removes the requested number of items (default is 1) from the head of the queue, and returns them. If the queue contains fewer than the requested number of items, then the thread will be blocked until the requisite number of items are available (i.e., until other threads enqueue more items).

    • ->dequeue_nb()
    • ->dequeue_nb(COUNT)

      Removes the requested number of items (default is 1) from the head of the queue, and returns them. If the queue contains fewer than the requested number of items, then it immediately (i.e., non-blocking) returns whatever items there are on the queue. If the queue is empty, then undef is returned.

    • ->dequeue_timed(TIMEOUT)
    • ->dequeue_timed(TIMEOUT, COUNT)

      Removes the requested number of items (default is 1) from the head of the queue, and returns them. If the queue contains fewer than the requested number of items, then the thread will be blocked until the requisite number of items are available, or until the timeout is reached. If the timeout is reached, it returns whatever items there are on the queue, or undef if the queue is empty.

      The timeout may be a number of seconds relative to the current time (e.g., 5 seconds from when the call is made), or may be an absolute timeout in epoch seconds the same as would be used with cond_timedwait(). Fractional seconds (e.g., 2.5 seconds) are also supported (to the extent of the underlying implementation).

      If TIMEOUT is missing, undef, or less than or equal to 0, then this call behaves the same as dequeue_nb.

    • ->pending()

      Returns the number of items still in the queue. Returns undef if the queue has been ended (see below), and there are no more items in the queue.

    • ->end()

      Declares that no more items will be added to the queue.

      All threads blocking on dequeue() calls will be unblocked with any remaining items in the queue and/or undef being returned. Any subsequent calls to dequeue() will behave like dequeue_nb().

      Once ended, no more items may be placed in the queue.
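
    The interplay of dequeue_nb, end and pending can be sketched without creating any threads. The item values are arbitrary, and the final step assumes that enqueue dies when called on an ended queue, which the text above implies but does not spell out:

```perl
use strict;
use warnings;
use Thread::Queue;

my $q = Thread::Queue->new('a', 'b', 'c');

my @two = $q->dequeue_nb(2);    # ('a', 'b')
$q->end();                      # declare that nothing more will be added

# On an ended queue, dequeue() behaves like dequeue_nb() and never blocks.
my $last = $q->dequeue();       # 'c'
my $none = $q->dequeue();       # undef: the queue is ended and empty

print "got: @two $last\n";      # prints "got: a b c"
print "pending is undef\n" unless defined $q->pending();

# Assumption: adding to an ended queue raises an exception.
my $ok = eval { $q->enqueue('d'); 1 };
print "enqueue refused\n" unless $ok;
```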

    ADVANCED METHODS

    The following methods can be used to manipulate items anywhere in a queue.

    To prevent the contents of a queue from being modified by another thread while it is being examined and/or changed, lock the queue inside a local block:

        {
            lock($q);    # Keep other threads from changing the queue's contents
            my $item = $q->peek();
            if ($item ...) {
                ...
            }
        }
        # Queue is now unlocked

    • ->peek()
    • ->peek(INDEX)

      Returns an item from the queue without dequeuing anything. Defaults to the head of the queue (at index position 0) if no index is specified. Negative index values are supported as with arrays (i.e., -1 is the end of the queue, -2 is next to last, and so on).

      If no item exists at the specified index (i.e., the queue is empty, or the index is beyond the number of items on the queue), then undef is returned.

      Remember, the returned item is not removed from the queue, so manipulating a peeked-at reference affects the item on the queue.

    • ->insert(INDEX, LIST)

      Adds the list of items to the queue at the specified index position (0 is the head of the list). Any existing items at and beyond that position are pushed back past the newly added items:

          $q->enqueue(1, 2, 3, 4);
          $q->insert(1, qw/foo bar/);
          # Queue now contains: 1, foo, bar, 2, 3, 4

      Specifying an index position greater than the number of items in the queue just adds the list to the end.

      Negative index positions are supported:

          $q->enqueue(1, 2, 3, 4);
          $q->insert(-2, qw/foo bar/);
          # Queue now contains: 1, 2, foo, bar, 3, 4

      Specifying a negative index position greater than the number of items in the queue adds the list to the head of the queue.

    • ->extract()
    • ->extract(INDEX)
    • ->extract(INDEX, COUNT)

      Removes and returns the specified number of items (defaults to 1) from the specified index position in the queue (0 is the head of the queue). When called with no arguments, extract operates the same as dequeue_nb.

      This method is non-blocking, and will return only as many items as are available to fulfill the request:

          $q->enqueue(1, 2, 3, 4);
          my $item = $q->extract(2);        # Returns 3
          # Queue now contains: 1, 2, 4
          my @items = $q->extract(1, 3);    # Returns (2, 4)
          # Queue now contains: 1

      Specifying an index position greater than the number of items in the queue results in undef or an empty list being returned.

          $q->enqueue('foo');
          my $nada = $q->extract(3);        # Returns undef
          my @nada = $q->extract(1, 3);     # Returns ()

      Negative index positions are supported. Specifying a negative index position greater than the number of items in the queue may return items from the head of the queue (similar to dequeue_nb) if the count overlaps the head of the queue from the specified position (i.e. if queue size + index + count is greater than zero):

          $q->enqueue(qw/foo bar baz/);
          my @nada = $q->extract(-6, 2);    # Returns () - (3+(-6)+2) <= 0
          my @some = $q->extract(-6, 4);    # Returns (foo) - (3+(-6)+4) > 0
          # Queue now contains: bar, baz
          my @rest = $q->extract(-3, 4);    # Returns (bar, baz) - (2+(-3)+4) > 0
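
    The warning about peek given earlier in this list, that changing a peeked-at reference changes the item still on the queue, can be seen in a short single-threaded sketch (the hash contents are made up for illustration):

```perl
use strict;
use warnings;
use Thread::Queue;

my $q = Thread::Queue->new();
$q->enqueue({ name => 'job', tries => 0 });

# peek() hands back the queued reference itself, not a copy of it.
my $head = $q->peek();
$head->{tries}++;

# The item that is later dequeued carries the modification.
my $item = $q->dequeue_nb();
print "$item->{tries}\n";    # prints "1"
```

    In a threaded program the same inspection would normally sit inside a block that calls lock($q), as shown at the start of this section.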

    NOTES

    Queues created by Thread::Queue can be used in both threaded and non-threaded applications.

    LIMITATIONS

    Passing objects on queues may not work if the objects' classes do not support sharing. See BUGS AND LIMITATIONS in threads::shared for more.

    Passing array/hash refs that contain objects may not work for Perl prior to 5.10.0.

    SEE ALSO

    Thread::Queue Discussion Forum on CPAN: http://www.cpanforum.com/dist/Thread-Queue

    threads, threads::shared

    Sample code in the examples directory of this distribution on CPAN.

    MAINTAINER

    Jerry D. Hedden, <jdhedden AT cpan DOT org>

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Thread::Semaphore

    NAME

    Thread::Semaphore - Thread-safe semaphores

    VERSION

    This document describes Thread::Semaphore version 2.12

    SYNOPSIS

        use Thread::Semaphore;
        my $s = Thread::Semaphore->new();
        $s->down();    # Also known as the semaphore P operation.
        # The guarded section is here
        $s->up();      # Also known as the semaphore V operation.

        # Decrement the semaphore only if it would immediately succeed.
        if ($s->down_nb()) {
            # The guarded section is here
            $s->up();
        }

        # Forcefully decrement the semaphore even if its count goes below 0.
        $s->down_force();

        # The default value for semaphore operations is 1
        my $s = Thread::Semaphore->new($initial_value);
        $s->down($down_value);
        $s->up($up_value);
        if ($s->down_nb($down_value)) {
            ...
            $s->up($up_value);
        }
        $s->down_force($down_value);
    DESCRIPTION

    Semaphores provide a mechanism to regulate access to resources. Unlike locks, semaphores aren't tied to particular scalars, and so may be used to control access to anything you care to use them for.

    Semaphores don't limit their values to zero and one, so they can be used to control access to some resource that there may be more than one of (e.g., filehandles). Increment and decrement amounts aren't fixed at one either, so threads can reserve or return multiple resources at once.

    METHODS

    • ->new()
    • ->new(NUMBER)

      new creates a new semaphore, and initializes its count to the specified number (which must be an integer). If no number is specified, the semaphore's count defaults to 1.

    • ->down()
    • ->down(NUMBER)

      The down method decreases the semaphore's count by the specified number (which must be an integer >= 1), or by one if no number is specified.

      If the semaphore's count would drop below zero, this method will block until such time as the semaphore's count is greater than or equal to the amount by which you are decrementing it.

      This is the semaphore "P operation" (the name derives from the Dutch word "pak", which means "capture" -- the semaphore operations were named by the late Dijkstra, who was Dutch).

    • ->down_nb()
    • ->down_nb(NUMBER)

      The down_nb method attempts to decrease the semaphore's count by the specified number (which must be an integer >= 1), or by one if no number is specified.

      If the semaphore's count would drop below zero, this method will return false, and the semaphore's count remains unchanged. Otherwise, the semaphore's count is decremented and this method returns true.

    • ->down_force()
    • ->down_force(NUMBER)

      The down_force method decreases the semaphore's count by the specified number (which must be an integer >= 1), or by one if no number is specified. This method does not block, and may cause the semaphore's count to drop below zero.

    • ->up()
    • ->up(NUMBER)

      The up method increases the semaphore's count by the number specified (which must be an integer >= 1), or by one if no number is specified.

      This will unblock any thread that is blocked trying to down the semaphore if the up raises the semaphore's count to at least the amount that the down is trying to decrement it by. For example, if three threads are blocked trying to down a semaphore by one, and another thread ups the semaphore by two, then two of the blocked threads (which two is indeterminate) will become unblocked.

      This is the semaphore "V operation" (the name derives from the Dutch word "vrij", which means "release").
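
    The non-blocking variant is the easiest one to try out in a single thread. The sketch below treats the semaphore as a hypothetical pool of two interchangeable resources and uses only new, down_nb and up as described above:

```perl
use strict;
use warnings;
use Thread::Semaphore;

# Hypothetical pool of two interchangeable resources.
my $pool = Thread::Semaphore->new(2);

my $first  = $pool->down_nb(2);    # true: both units are reserved
my $second = $pool->down_nb();     # false: count is 0 and stays unchanged

print $first  ? "reserved\n" : "busy\n";    # prints "reserved"
print $second ? "reserved\n" : "busy\n";    # prints "busy"

$pool->up(2);                      # return both units to the pool
my $third = $pool->down_nb();      # true again
print $third ? "reserved\n" : "busy\n";     # prints "reserved"
```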

    NOTES

    Semaphores created by Thread::Semaphore can be used in both threaded and non-threaded applications. This allows you to write modules and packages that potentially make use of semaphores, and that will function in either environment.

    SEE ALSO

    Thread::Semaphore Discussion Forum on CPAN: http://www.cpanforum.com/dist/Thread-Semaphore

    threads, threads::shared

    MAINTAINER

    Jerry D. Hedden, <jdhedden AT cpan DOT org>

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Text::Abbrev

    NAME

    Text::Abbrev - abbrev - create an abbreviation table from a list

    SYNOPSIS

        use Text::Abbrev;
        abbrev $hashref, LIST

    DESCRIPTION

    Stores all unambiguous truncations of each element of LIST as keys in the associative array referenced by $hashref. The values are the original list elements.

    EXAMPLE

        $hashref = abbrev qw(list edit send abort gripe);

        %hash = abbrev qw(list edit send abort gripe);

        abbrev $hashref, qw(list edit send abort gripe);

        abbrev(*hash, qw(list edit send abort gripe));
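
    To make "unambiguous truncations" concrete, here is a small sketch with two words that share their first letter. The prefix "l" could mean either word, so it is left out of the table:

```perl
use strict;
use warnings;
use Text::Abbrev;

my %table = abbrev qw(list load);

# "l" is ambiguous, so only longer prefixes (and the full words) appear.
my $keys = join ' ', sort keys %table;
print "$keys\n";          # prints "li lis list lo loa load"
print "$table{lo}\n";     # prints "load"
```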
     
    Text::Balanced

    NAME

    Text::Balanced - Extract delimited text sequences from strings.

    SYNOPSIS

        use Text::Balanced qw (
            extract_delimited
            extract_bracketed
            extract_quotelike
            extract_codeblock
            extract_variable
            extract_tagged
            extract_multiple
            gen_delimited_pat
            gen_extract_tagged
        );

        # Extract the initial substring of $text that is delimited by
        # two (unescaped) instances of the first character in $delim.

        ($extracted, $remainder) = extract_delimited($text,$delim);

        # Extract the initial substring of $text that is bracketed
        # with a delimiter(s) specified by $delim (where the string
        # in $delim contains one or more of '(){}[]<>').

        ($extracted, $remainder) = extract_bracketed($text,$delim);

        # Extract the initial substring of $text that is bounded by
        # an XML tag.

        ($extracted, $remainder) = extract_tagged($text);

        # Extract the initial substring of $text that is bounded by
        # a BEGIN...END pair. Don't allow nested BEGIN tags

        ($extracted, $remainder) =
            extract_tagged($text,"BEGIN","END",undef,{bad=>["BEGIN"]});

        # Extract the initial substring of $text that represents a
        # Perl "quote or quote-like operation"

        ($extracted, $remainder) = extract_quotelike($text);

        # Extract the initial substring of $text that represents a block
        # of Perl code, bracketed by any of character(s) specified by $delim
        # (where the string $delim contains one or more of '(){}[]<>').

        ($extracted, $remainder) = extract_codeblock($text,$delim);

        # Extract the initial substrings of $text that would be extracted by
        # one or more sequential applications of the specified functions
        # or regular expressions

        @extracted = extract_multiple($text,
                                      [ \&extract_bracketed,
                                        \&extract_quotelike,
                                        \&some_other_extractor_sub,
                                        qr/[xyz]*/,
                                        'literal',
                                      ]);

        # Create a string representing an optimized pattern (a la Friedl)
        # that matches a substring delimited by any of the specified characters
        # (in this case: any type of quote or a slash)

        $patstring = gen_delimited_pat(q{'"`/});

        # Generate a reference to an anonymous sub that is just like extract_tagged
        # but pre-compiled and optimized for a specific pair of tags, and consequently
        # much faster (i.e. 3 times faster). It uses qr// for better performance on
        # repeated calls, so it only works under Perl 5.005 or later.

        $extract_head = gen_extract_tagged('<HEAD>','</HEAD>');

        ($extracted, $remainder) = $extract_head->($text);

    DESCRIPTION

    The various extract_... subroutines may be used to extract a delimited substring, possibly after skipping a specified prefix string. By default, that prefix is optional whitespace (/\s*/), but you can change it to whatever you wish (see below).

    The substring to be extracted must appear at the current pos location of the string's variable (or at index zero, if no pos position is defined). In other words, the extract_... subroutines don't extract the first occurrence of a substring anywhere in a string (like an unanchored regex would). Rather, they extract an occurrence of the substring appearing immediately at the current matching position in the string (like a \G-anchored regex would).

    General behaviour in list contexts

    In a list context, all the subroutines return a list, the first three elements of which are always:

    • [0]

      The extracted string, including the specified delimiters. If the extraction fails undef is returned.

    • [1]

      The remainder of the input string (i.e. the characters after the extracted string). On failure, the entire string is returned.

    • [2]

      The skipped prefix (i.e. the characters before the extracted string). On failure, undef is returned.

    Note that in a list context, the contents of the original input text (the first argument) are not modified in any way.

    However, if the input text was passed in a variable, that variable's pos value is updated to point at the first character after the extracted text. That means that in a list context the various subroutines can be used much like regular expressions. For example:

        while ( $next = (extract_quotelike($text))[0] )
        {
            # process next quote-like (in $next)
        }
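
    The list-context rules can be checked with a short sketch. It uses extract_delimited and an arbitrary sample string; the text is left untouched, while pos($text) moves past the extracted substring so that a second call resumes from there:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

my $text = q{  'one' 'two'};

# List context: $text itself is not modified, but pos($text) advances.
my ($extracted, $remainder, $prefix) = extract_delimited($text, q{'});

print "extracted: $extracted\n";    # prints "extracted: 'one'"
print "remainder:$remainder\n";     # prints "remainder: 'two'"
print "text unchanged\n" if $text eq q{  'one' 'two'};

my $pos_after = pos($text);
print "pos: $pos_after\n";          # prints "pos: 7"

# A second call starts at pos($text) and finds the next quoted string.
my ($next) = extract_delimited($text, q{'});
print "next: $next\n";              # prints "next: 'two'"
```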

    General behaviour in scalar and void contexts

    In a scalar context, the extracted string is returned, having first been removed from the input text. Thus, the following code also processes each quote-like operation, but actually removes them from $text:

        while ( $next = extract_quotelike($text) )
        {
            # process next quote-like (in $next)
        }

    Note that if the input text is a read-only string (i.e. a literal), no attempt is made to remove the extracted text.

    In a void context the behaviour of the extraction subroutines is exactly the same as in a scalar context, except (of course) that the extracted substring is not returned.
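
    By contrast with list context, a scalar-context loop consumes the input as it goes. A minimal sketch with extract_delimited and an arbitrary sample string:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

my $text = q{'one' 'two' tail};
my @found;

# Scalar context: each successful call removes the skipped prefix and
# the extracted substring from $text. The loop ends on the first failure.
while (my $next = extract_delimited($text, q{'})) {
    push @found, $next;
}

print "found: @found\n";    # prints "found: 'one' 'two'"
print "left: [$text]\n";    # prints "left: [ tail]"
```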

    A note about prefixes

    Prefix patterns are matched without any trailing modifiers (/gimsox etc.) This can bite you if you're expecting a prefix specification like '.*?(?=<H1>)' to skip everything up to the first <H1> tag. Such a prefix pattern will only succeed if the <H1> tag is on the current line, since . normally doesn't match newlines.

    To overcome this limitation, you need to turn on /s matching within the prefix pattern, using the (?s) directive: '(?s).*?(?=<H1>)'
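
    The same effect can be shown with extract_delimited and a quote character in place of the <H1> tag. This is a sketch with made-up input; the prefix has to skip across a newline to reach the quoted string:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

my $source = "skip this\nand this 'quoted' rest";

# Without (?s) the dot cannot cross the newline, so the prefix never
# reaches the quote and the extraction fails.
my $plain_text = $source;
my ($plain) = extract_delimited($plain_text, q{'}, q{.*?(?=')});
print "plain prefix failed\n" unless $plain;

# With (?s) the prefix may span lines.
my $dotall_text = $source;
my ($extracted, $remainder, $prefix) =
    extract_delimited($dotall_text, q{'}, q{(?s).*?(?=')});

print "extracted: $extracted\n";    # prints "extracted: 'quoted'"
print "remainder:$remainder\n";     # prints "remainder: rest"
```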

    extract_delimited

    The extract_delimited function formalizes the common idiom of extracting a single-character-delimited substring from the start of a string. For example, to extract a single-quote delimited string, the following code is typically used:

        ($remainder = $text) =~ s/\A('(\\.|[^'])*')//s;
        $extracted = $1;

    but with extract_delimited it can be simplified to:

        ($extracted,$remainder) = extract_delimited($text, "'");

    extract_delimited takes up to four scalars (the input text, the delimiters, a prefix pattern to be skipped, and any escape characters) and extracts the initial substring of the text that is appropriately delimited. If the delimiter string has multiple characters, the first one encountered in the text is taken to delimit the substring. The third argument specifies a prefix pattern that is to be skipped (but must be present!) before the substring is extracted. The final argument specifies the escape character to be used for each delimiter.

    All arguments are optional. If the escape characters are not specified, every delimiter is escaped with a backslash (\). If the prefix is not specified, the pattern '\s*' - optional whitespace - is used. If the delimiter set is also not specified, the set /["'`]/ is used. If the text to be processed is not specified either, $_ is used.

    In list context, extract_delimited returns an array of three elements, the extracted substring (including the surrounding delimiters), the remainder of the text, and the skipped prefix (if any). If a suitable delimited substring is not found, the first element of the array is the empty string, the second is the complete original text, and the prefix returned in the third element is an empty string.

    In a scalar context, just the extracted substring is returned. In a void context, the extracted substring (and any prefix) are simply removed from the beginning of the first argument.

    Examples:

        # Remove a single-quoted substring from the very beginning of $text:

        $substring = extract_delimited($text, "'", '');

        # Remove a single-quoted Pascalish substring (i.e. one in which
        # doubling the quote character escapes it) from the very
        # beginning of $text:

        $substring = extract_delimited($text, "'", '', "'");

        # Extract a single- or double- quoted substring from the
        # beginning of $text, optionally after some whitespace
        # (note the list context to protect $text from modification):

        ($substring) = extract_delimited $text, q{"'};

        # Delete the substring delimited by the first '/' in $text:

        $text = join '', (extract_delimited($text,'/','[^/]*'))[2,1];

    Note that this last example is not the same as deleting the first quote-like pattern. For instance, if $text contained the string:

        "if ('./cmd' =~ m/$UNIXCMD/s) { $cmd = $1; }"

    then after the deletion it would contain:

        "if ('.$UNIXCMD/s) { $cmd = $1; }"

    not:

        "if ('./cmd' =~ ms) { $cmd = $1; }"

    See extract_quotelike for a (partial) solution to this problem.

    extract_bracketed

    Like "extract_delimited", the extract_bracketed function takes up to three optional scalar arguments: a string to extract from, a delimiter specifier, and a prefix pattern. As before, a missing prefix defaults to optional whitespace and a missing text defaults to $_. However, a missing delimiter specifier defaults to '{}()[]<>' (see below).

    extract_bracketed extracts a balanced-bracket-delimited substring (using any one (or more) of the user-specified delimiter brackets: '(..)', '{..}', '[..]', or '<..>'). Optionally it will also respect quoted unbalanced brackets (see below).

    A "delimiter bracket" is a bracket in the list of delimiters passed as extract_bracketed's second argument. Delimiter brackets are specified by giving either the left or right (or both!) versions of the required bracket(s). Note that the order in which two or more delimiter brackets are specified is not significant.

    A "balanced-bracket-delimited substring" is a substring bounded by matched brackets, such that any other (left or right) delimiter bracket within the substring is also matched by an opposite (right or left) delimiter bracket at the same level of nesting. Any type of bracket not in the delimiter list is treated as an ordinary character.

    In other words, each type of bracket specified as a delimiter must be balanced and correctly nested within the substring, and any other kind of ("non-delimiter") bracket in the substring is ignored.

    For example, given the string:

        $text = "{ an '[irregularly :-(] {} parenthesized >:-)' string }";

    then a call to extract_bracketed in a list context:

        @result = extract_bracketed( $text, '{}' );

    would return:

        ( "{ an '[irregularly :-(] {} parenthesized >:-)' string }" , "" , "" )

    since both sets of '{..}' brackets are properly nested and evenly balanced. (In a scalar context just the first element of the array would be returned. In a void context, $text would be replaced by an empty string.)

    Likewise the call in:

        @result = extract_bracketed( $text, '{[' );

    would return the same result, since all sets of both types of specified delimiter brackets are correctly nested and balanced.

    However, the call in:

        @result = extract_bracketed( $text, '{([<' );

    would fail, returning:

        ( undef , "{ an '[irregularly :-(] {} parenthesized >:-)' string }" );

    because the embedded pairs of '(..)'s and '[..]'s are "cross-nested" and the embedded '>' is unbalanced. (In a scalar context, this call would return an empty string. In a void context, $text would be unchanged.)

    Note that the embedded single-quotes in the string don't help in this case, since they have not been specified as acceptable delimiters and are therefore treated as non-delimiter characters (and ignored).

    However, if a particular species of quote character is included in the delimiter specification, then that type of quote will be correctly handled. For example, if $text is:

        $text = '<A HREF=">>>>">link</A>';

    then

        @result = extract_bracketed( $text, '<">' );

    returns:

        ( '<A HREF=">>>>">', 'link</A>', "" )

    as expected. Without the specification of " as an embedded quoter:

        @result = extract_bracketed( $text, '<>' );

    the result would be:

        ( '<A HREF=">', '>>>">link</A>', "" )

    In addition to the quote delimiters ', ", and `, full Perl quote-like quoting (i.e. q{string}, qq{string}, etc) can be specified by including the letter 'q' as a delimiter. Hence:

        @result = extract_bracketed( $text, '<q>' );

    would correctly match something like this:

        $text = '<leftop: conj /and/ conj>';

    See also: "extract_quotelike" and "extract_codeblock".
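    The following sketch runs the HTML example from the text, plus a nested round-bracket case, so the three returned elements can be seen side by side. Variable names are illustrative. Each call gets its own copy of the string, because the extraction functions start at the string's current pos, and an earlier list-context call on the same variable would move it.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $html = '<A HREF=">>>>">link</A>';

# With '"' in the delimiter list, brackets inside double quotes are skipped.
my $quote_aware = $html;
my ($tag, $rest) = extract_bracketed($quote_aware, '<">');
print "quote aware: [$tag] [$rest]\n";

# Without it, the first '>' closes the bracket.
my $plain = $html;
my ($short, $tail) = extract_bracketed($plain, '<>');
print "plain:       [$short] [$tail]\n";

# Nested round brackets; the skipped leading whitespace comes back as the prefix.
my $expr = '  (a (b) c) + d';
my ($group, $after, $prefix) = extract_bracketed($expr, '()');
print "group: [$group] after: [$after] prefix: [$prefix]\n";
```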

    extract_variable

    extract_variable extracts any valid Perl variable or variable-involved expression, including scalars, arrays, hashes, array accesses, hash look-ups, method calls through objects, subroutine calls through subroutine references, etc.

    The subroutine takes up to two optional arguments:

    1.

    A string to be processed ($_ if the string is omitted or undef)

    2.

    A string specifying a pattern to be matched as a prefix (which is to be skipped). If omitted, optional whitespace is skipped.

    On success in a list context, an array of 3 elements is returned. The elements are:

    • [0]

      the extracted variable, or variablish expression

    • [1]

      the remainder of the input text,

    • [2]

      the prefix substring (if any),

    On failure, all of these values (except the remaining text) are undef.

    In a scalar context, extract_variable returns just the complete substring that matched a variablish expression. undef is returned on failure. In addition, the original input text has the returned substring (and any prefix) removed from it.

    In a void context, the input text just has the matched substring (and any specified prefix) removed.
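    As a minimal sketch of the list-context return value, the example below extracts a chained hash and array access from a line of code. The input string and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_variable);

my $code = q{  $data->{key}[0] = 42;};

# List context: (variable, remainder, prefix); $code itself is not modified.
my ($var, $remainder, $prefix) = extract_variable($code);

if (defined $var) {
    print "variable:  [$var]\n";
    print "remainder: [$remainder]\n";
    print "prefix:    [$prefix]\n";
}
else {
    print "no variable found: $@\n";
}
```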

    extract_tagged

    extract_tagged extracts and segments text between (balanced) specified tags.

    The subroutine takes up to five optional arguments:

    1.

    A string to be processed ($_ if the string is omitted or undef)

    2.

    A string specifying a pattern to be matched as the opening tag. If the pattern string is omitted (or undef) then a pattern that matches any standard XML tag is used.

    3.

    A string specifying a pattern to be matched at the closing tag. If the pattern string is omitted (or undef) then the closing tag is constructed by inserting a / after any leading bracket characters in the actual opening tag that was matched (not the pattern that matched the tag). For example, if the opening tag pattern is specified as '{{\w+}}' and actually matched the opening tag "{{DATA}}" , then the constructed closing tag would be "{{/DATA}}" .

    4.

    A string specifying a pattern to be matched as a prefix (which is to be skipped). If omitted, optional whitespace is skipped.

    5.

    A hash reference containing various parsing options (see below)

    The various options that can be specified are:

    • reject => $listref

      The list reference contains one or more strings specifying patterns that must not appear within the tagged text.

      For example, to extract an HTML link (which should not contain nested links) use:

          extract_tagged($text, '<A>', '</A>', undef, {reject => ['<A>']} );

    • ignore => $listref

      The list reference contains one or more strings specifying patterns that are not to be treated as nested tags within the tagged text (even if they would match the start tag pattern).

      For example, to extract an arbitrary XML tag, but ignore "empty" elements:

          extract_tagged($text, undef, undef, undef, {ignore => ['<[^>]*/>']} );

      (also see gen_delimited_pat below).

    • fail => $str

      The fail option indicates the action to be taken if a matching end tag is not encountered (i.e. before the end of the string or some reject pattern matches). By default, a failure to match a closing tag causes extract_tagged to immediately fail.

      However, if the string value associated with fail is "MAX", then extract_tagged returns the complete text up to the point of failure. If the string is "PARA", extract_tagged returns only the first paragraph after the tag (up to the first line that is either empty or contains only whitespace characters). If the string is "", the default behaviour (i.e. failure) is reinstated.

      For example, suppose the start tag "/para" introduces a paragraph, which then continues until the next "/endpara" tag or until another "/para" tag is encountered:

          $text = "/para line 1\n\nline 3\n/para line 4";
          extract_tagged($text, '/para', '/endpara', undef,
                         {reject => ['/para'], fail => 'MAX'} );
          # EXTRACTED: "/para line 1\n\nline 3\n"

      Suppose instead, that if no matching "/endpara" tag is found, the "/para" tag refers only to the immediately following paragraph:

          $text = "/para line 1\n\nline 3\n/para line 4";
          extract_tagged($text, '/para', '/endpara', undef,
                         {reject => ['/para'], fail => 'PARA'} );
          # EXTRACTED: "/para line 1\n"

      Note that the specified fail behaviour applies to nested tags as well.

    On success in a list context, an array of 6 elements is returned. The elements are:

    • [0]

      the extracted tagged substring (including the outermost tags),

    • [1]

      the remainder of the input text,

    • [2]

      the prefix substring (if any),

    • [3]

      the opening tag

    • [4]

      the text between the opening and closing tags

    • [5]

      the closing tag (or "" if no closing tag was found)

    On failure, all of these values (except the remaining text) are undef.

    In a scalar context, extract_tagged returns just the complete substring that matched a tagged text (including the start and end tags). undef is returned on failure. In addition, the original input text has the returned substring (and any prefix) removed from it.

    In a void context, the input text just has the matched substring (and any specified prefix) removed.
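    The sketch below exercises three of the behaviours described in this section: the default tag patterns with a nested tag, the reject option, and fail => 'MAX'. The input strings and variable names are illustrative, apart from the "/para" text, which is the one used above.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_tagged);

# Default tags: any XML-style opening tag, with the closing tag derived from it.
my $xml = '<b>bold <b>inner</b> text</b> tail';
my @parts = extract_tagged($xml);
print "whole:   [$parts[0]]\n";   # outermost tags included
print "rest:    [$parts[1]]\n";
print "open:    [$parts[3]]\n";
print "content: [$parts[4]]\n";
print "close:   [$parts[5]]\n";

# reject: a nested <A> makes the extraction fail.
my $links = '<A>outer <A>inner</A></A>';
my @failed = extract_tagged($links, '<A>', '</A>', undef,
                            { reject => ['<A>'] });
print "rejected: ", (defined $failed[0] ? 'no' : 'yes'), "\n";

# fail => 'MAX': return everything up to the point of failure instead.
my $para = "/para line 1\n\nline 3\n/para line 4";
my ($chunk) = extract_tagged($para, '/para', '/endpara', undef,
                             { reject => ['/para'], fail => 'MAX' });
print "MAX chunk: [$chunk]\n";
```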

    gen_extract_tagged

    (Note: This subroutine is only available under Perl 5.005 or later.)

    gen_extract_tagged generates a new anonymous subroutine which extracts text between (balanced) specified tags. In other words, it generates a subroutine that behaves identically to extract_tagged.

    The difference between extract_tagged and the anonymous subroutines generated by gen_extract_tagged is that those generated subroutines:

    • do not have to reparse tag specification or parsing options every time they are called (whereas extract_tagged has to effectively rebuild its tag parser on every call);

    • make use of the new qr// construct to pre-compile the regexes they use (whereas extract_tagged uses standard string variable interpolation to create tag-matching patterns).

    The subroutine takes up to four optional arguments (the same set as extract_tagged except for the string to be processed). It returns a reference to a subroutine which in turn takes a single argument (the text to be extracted from).

    In other words, the implementation of extract_tagged is exactly equivalent to:

        sub extract_tagged
        {
            my $text = shift;
            my $extractor = gen_extract_tagged(@_);
            return $extractor->($text);
        }

    (although extract_tagged is not currently implemented that way, in order to preserve pre-5.005 compatibility).

    Using gen_extract_tagged to create extraction functions for specific tags is a good idea if those functions are going to be called more than once, since their performance is typically twice as good as the more general-purpose extract_tagged.
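    A minimal sketch of that usage pattern: the extractor is built once and then applied to several strings. The <item> tag and the sample records are hypothetical.

```perl
use strict;
use warnings;
use Text::Balanced qw(gen_extract_tagged);

# Build the extractor once; the tag patterns are prepared at this point.
my $extract_item = gen_extract_tagged('<item>', '</item>');

my @records = (
    '<item>alpha</item> trailing',
    '  <item>beta <item>nested</item></item>',
    'no tag here',
);

my @bodies;
for my $record (@records) {
    # Same six-element list as extract_tagged returns in list context.
    my ($whole, $rest, $prefix, $open, $body, $close) = $extract_item->($record);
    push @bodies, defined $whole ? $body : undef;
    print defined $whole ? "body: [$body]\n" : "no match in: [$record]\n";
}
```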

    extract_quotelike

    extract_quotelike attempts to recognize, extract, and segment any one of the various Perl quotes and quotelike operators (see perlop(3)). Nested backslashed delimiters, embedded balanced bracket delimiters (for the quotelike operators), and trailing modifiers are all caught. For example, in:

        extract_quotelike 'q # an octothorpe: \# (not the end of the q!) #'
        extract_quotelike ' "You said, \"Use sed\"." '
        extract_quotelike ' s{([A-Z]{1,8}\.[A-Z]{3})} /\L$1\E/; '
        extract_quotelike ' tr/\\\/\\\\/\\\//ds; '

    the full Perl quotelike operations are all extracted correctly.

    Note too that, when using the /x modifier on a regex, any comment containing the current pattern delimiter will cause the regex to be immediately terminated. In other words:

        'm /
            (?i)            # CASE INSENSITIVE
            [a-z_]          # LEADING ALPHABETIC/UNDERSCORE
            [a-z0-9]*       # FOLLOWED BY ANY NUMBER OF ALPHANUMERICS
          /x'

    will be extracted as if it were:

        'm /
            (?i)            # CASE INSENSITIVE
            [a-z_]          # LEADING ALPHABETIC/'

    This behaviour is identical to that of the actual compiler.

    extract_quotelike takes two arguments: the text to be processed and a prefix to be matched at the very beginning of the text. If no prefix is specified, optional whitespace is the default. If no text is given, $_ is used.

    In a list context, an array of 11 elements is returned. The elements are:

    • [0]

      the extracted quotelike substring (including trailing modifiers),

    • [1]

      the remainder of the input text,

    • [2]

      the prefix substring (if any),

    • [3]

      the name of the quotelike operator (if any),

    • [4]

      the left delimiter of the first block of the operation,

    • [5]

      the text of the first block of the operation (that is, the contents of a quote, the regex of a match or substitution or the target list of a translation),

    • [6]

      the right delimiter of the first block of the operation,

    • [7]

      the left delimiter of the second block of the operation (that is, if it is a s, tr, or y),

    • [8]

      the text of the second block of the operation (that is, the replacement of a substitution or the translation list of a translation),

    • [9]

      the right delimiter of the second block of the operation (if any),

    • [10]

      the trailing modifiers on the operation (if any).

    For each of the fields marked "(if any)" the default value on success is an empty string. On failure, all of these values (except the remaining text) are undef.

    In a scalar context, extract_quotelike returns just the complete substring that matched a quotelike operation (or undef on failure). In a scalar or void context, the input text has the same substring (and any specified prefix) removed.

    Examples:

        # Remove the first quotelike literal that appears in text
        $quotelike = extract_quotelike($text,'.*?');

        # Replace one or more leading whitespace-separated quotelike
        # literals in $_ with "<QLL>"
        do { $_ = join '<QLL>', (extract_quotelike)[2,1] } until $@;

        # Isolate the search pattern in a quotelike operation from $text
        ($op,$pat) = (extract_quotelike $text)[3,5];
        if ($op =~ /[ms]/)
        {
            print "search pattern: $pat\n";
        }
        else
        {
            print "$op is not a pattern matching operation\n";
        }
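    To make the eleven-element list more concrete, here is a small sketch that splits a substitution into its parts. The input string is made up; the indices are the ones listed above.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_quotelike);

my $code = q{s/foo/bar/gi; next};

my @f = extract_quotelike($code);
die "no quotelike found: $@" unless defined $f[0];

print "whole:        [$f[0]]\n";    # includes the trailing modifiers
print "remainder:    [$f[1]]\n";
print "operator:     [$f[3]]\n";
print "first block:  [$f[5]]\n";    # the regex of the substitution
print "second block: [$f[8]]\n";    # the replacement
print "modifiers:    [$f[10]]\n";
```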

    extract_quotelike and "here documents"

    extract_quotelike can successfully extract "here documents" from an input string, but with an important caveat in list contexts.

    Unlike other types of quote-like literals, a here document is rarely a contiguous substring. For example, a typical piece of code using a here document might look like this:

        <<'EOMSG' || die;
        This is the message.
        EOMSG
        exit;

    Given this as an input string in a scalar context, extract_quotelike would correctly return the string "<<'EOMSG'\nThis is the message.\nEOMSG", leaving the string " || die;\nexit;" in the original variable. In other words, the two separate pieces of the here document are successfully extracted and concatenated.

    In a list context, extract_quotelike would return the list

    • [0]

      "<<'EOMSG'\nThis is the message.\nEOMSG\n" (i.e. the full extracted here document, including fore and aft delimiters),

    • [1]

      " || die;\nexit;" (i.e. the remainder of the input text, concatenated),

    • [2]

      "" (i.e. the prefix substring -- trivial in this case),

    • [3]

      "<<" (i.e. the "name" of the quotelike operator)

    • [4]

      "'EOMSG'" (i.e. the left delimiter of the here document, including any quotes),

    • [5]

      "This is the message.\n" (i.e. the text of the here document),

    • [6]

      "EOMSG" (i.e. the right delimiter of the here document),

    • [7..10]

      "" (a here document has no second left delimiter, second text, second right delimiter, or trailing modifiers).

    However, the matching position of the input variable would be set to "exit;" (i.e. after the closing delimiter of the here document), which would cause the earlier " || die;\nexit;" to be skipped in any sequence of code fragment extractions.

    To avoid this problem, when it encounters a here document whilst extracting from a modifiable string, extract_quotelike silently rearranges the string to an equivalent piece of Perl:

        <<'EOMSG'
        This is the message.
        EOMSG
        || die;
        exit;

    in which the here document is contiguous. It still leaves the matching position after the here document, but now the rest of the line on which the here document starts is not skipped.

    To prevent extract_quotelike from mucking about with the input in this way (this is the only case where a list-context extract_quotelike does so), you can pass the input variable as an interpolated literal:

        $quotelike = extract_quotelike("$var");

    extract_codeblock

    extract_codeblock attempts to recognize and extract a balanced bracket delimited substring that may contain unbalanced brackets inside Perl quotes or quotelike operations. That is, extract_codeblock is like a combination of "extract_bracketed" and "extract_quotelike".

    extract_codeblock takes the same initial three parameters as extract_bracketed: a text to process, a set of delimiter brackets to look for, and a prefix to match first. It also takes an optional fourth parameter, which allows the outermost delimiter brackets to be specified separately (see below).

    Omitting the first argument (input text) means process $_ instead. Omitting the second argument (delimiter brackets) indicates that only '{' is to be used. Omitting the third argument (prefix argument) implies optional whitespace at the start. Omitting the fourth argument (outermost delimiter brackets) indicates that the value of the second argument is to be used for the outermost delimiters.

    Once the prefix and the outermost opening delimiter bracket have been recognized, code blocks are extracted by stepping through the input text and trying the following alternatives in sequence:

    1.

    Try to match a closing delimiter bracket. If the bracket was the same species as the last opening bracket, return the substring to that point. If the bracket was mismatched, return an error.

    2.

    Try to match a quote or quotelike operator. If found, call extract_quotelike to eat it. If extract_quotelike fails, return the error it returned. Otherwise go back to step 1.

    3.

    Try to match an opening delimiter bracket. If found, call extract_codeblock recursively to eat the embedded block. If the recursive call fails, return an error. Otherwise, go back to step 1.

    4.

    Unconditionally match a bareword or any other single character, and then go back to step 1.

    Examples:

        # Find a while loop in the text
        if ($text =~ s/.*?while\s*\{/{/)
        {
            $loop = "while " . extract_codeblock($text);
        }

        # Remove the first round-bracketed list (which may include
        # round- or curly-bracketed code blocks or quotelike operators)
        extract_codeblock $text, "(){}", '[^(]*';
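    The difference from extract_bracketed can be seen in a small sketch. The input is a made-up block containing a quoted closing brace; each function gets its own copy of the string so that neither call affects the other's starting position.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed extract_codeblock);

my $source = '{ print "}"; } rest';

# extract_bracketed with only '{}' as delimiters treats the quoted '}'
# as a real closing bracket.
my $naive = $source;
my ($bracketed) = extract_bracketed($naive, '{}');

# extract_codeblock recognizes the quoted string and skips over it.
my $aware = $source;
my ($block, $remainder) = extract_codeblock($aware, '{}');

print "extract_bracketed: [$bracketed]\n";
print "extract_codeblock: [$block] + [$remainder]\n";
```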

    The ability to specify a different outermost delimiter bracket is useful in some circumstances. For example, in the Parse::RecDescent module, parser actions which are to be performed only on a successful parse are specified using a <defer:...> directive. For example:

        sentence: subject verb object
                  <defer: {$::theVerb = $item{verb}} >

    Parse::RecDescent uses extract_codeblock($text, '{}<>') to extract the code within the <defer:...> directive, but there's a problem.

    A deferred action like this:

        <defer: {if ($count>10) {$count--}} >

    will be incorrectly parsed as:

        <defer: {if ($count>

    because the "greater than" operator is interpreted as a closing delimiter.

    But, by extracting the directive using extract_codeblock($text, '{}', undef, '<>') the '>' character is only treated as a delimiter at the outermost level of the code block, so the directive is parsed correctly.

    extract_multiple

    The extract_multiple subroutine takes a string to be processed and a list of extractors (subroutines or regular expressions) to apply to that string.

    In an array context extract_multiple returns an array of substrings of the original string, as extracted by the specified extractors. In a scalar context, extract_multiple returns the first substring successfully extracted from the original string. In both scalar and void contexts the original string has the first successfully extracted substring removed from it. In all contexts extract_multiple starts at the current pos of the string, and sets that pos appropriately after it matches.

    Hence, the aim of a call to extract_multiple in a list context is to split the processed string into as many non-overlapping fields as possible, by repeatedly applying each of the specified extractors to the remainder of the string. Thus extract_multiple is a generalized form of Perl's split subroutine.

    The subroutine takes up to four optional arguments:

    1.

    A string to be processed ($_ if the string is omitted or undef)

    2.

    A reference to a list of subroutine references and/or qr// objects and/or literal strings and/or hash references, specifying the extractors to be used to split the string. If this argument is omitted (or undef) the list:

        [
            sub { extract_variable($_[0], '') },
            sub { extract_quotelike($_[0],'') },
            sub { extract_codeblock($_[0],'{}','') },
        ]

    is used.

    3.

    A number specifying the maximum number of fields to return. If this argument is omitted (or undef), splitting continues as long as possible.

    If the third argument is N, then extraction continues until N fields have been successfully extracted, or until the string has been completely processed.

    Note that in scalar and void contexts the value of this argument is automatically reset to 1 (under -w, a warning is issued if the argument has to be reset).

    4.

    A value indicating whether unmatched substrings (see below) within the text should be skipped or returned as fields. If the value is true, such substrings are skipped. Otherwise, they are returned.

    The extraction process works by applying each extractor in sequence to the text string.

    If the extractor is a subroutine it is called in a list context and is expected to return a list of a single element, namely the extracted text. It may optionally also return two further arguments: a string representing the text left after extraction (like $' for a pattern match), and a string representing any prefix skipped before the extraction (like $` in a pattern match). Note that this is designed to facilitate the use of other Text::Balanced subroutines with extract_multiple. Note too that the value returned by an extractor subroutine need not bear any relationship to the corresponding substring of the original text (see examples below).

    If the extractor is a precompiled regular expression or a string, it is matched against the text in a scalar context with a leading '\G' and the gc modifiers enabled. The extracted value is either $1 if that variable is defined after the match, or else the complete match (i.e. $&).

    If the extractor is a hash reference, it must contain exactly one element. The value of that element is one of the above extractor types (subroutine reference, regular expression, or string). The key of that element is the name of a class into which the successful return value of the extractor will be blessed.

    If an extractor returns a defined value, that value is immediately treated as the next extracted field and pushed onto the list of fields. If the extractor was specified in a hash reference, the field is also blessed into the appropriate class.

    If the extractor fails to match (in the case of a regex extractor), or returns an empty list or an undefined value (in the case of a subroutine extractor), it is assumed to have failed to extract. If none of the extractor subroutines succeeds, then one character is extracted from the start of the text and the extraction subroutines reapplied. Characters which are thus removed are accumulated and eventually become the next field (unless the fourth argument is true, in which case they are discarded).
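    The process described above can be sketched with two hash-reference extractors. The class names Quoted and Braced and the input string are hypothetical. Blessed fields come back as references to the extracted text, so they are dereferenced with $$field; text that no extractor matched comes back as plain strings.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_multiple extract_delimited extract_bracketed);

my $text = q{say "hi" {block} end};

my @fields = extract_multiple($text, [
    { Quoted => sub { extract_delimited($_[0], q{"}) } },
    { Braced => sub { extract_bracketed($_[0], '{}') } },
]);

for my $field (@fields) {
    if (ref $field) {
        # Matched by an extractor: the class name tells which one.
        printf "%-6s [%s]\n", ref $field, $$field;
    }
    else {
        # Accumulated characters that no extractor matched.
        printf "%-6s [%s]\n", 'plain', $field;
    }
}
```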

    For example, the following extracts substrings that are valid Perl variables:

        @fields = extract_multiple($text,
                                   [ sub { extract_variable($_[0]) } ],
                                   undef, 1);

    This example separates a text into fields which are quote delimited, curly bracketed, and anything else. The delimited and bracketed parts are also blessed to identify them (the "anything else" is unblessed):

        @fields = extract_multiple($text,
                                   [
                                     { Delim => sub { extract_delimited($_[0],q{'"}) } },
                                     { Brack => sub { extract_bracketed($_[0],'{}') } },
                                   ]);

    This call extracts the next single substring that is a valid Perl quotelike operator (and removes it from $text):

        $quotelike = extract_multiple($text,
                                      [
                                        sub { extract_quotelike($_[0]) },
                                      ], undef, 1);

    Finally, here is yet another way to do comma-separated value parsing:

        @fields = extract_multiple($csv_text,
                                   [
                                     sub { extract_delimited($_[0],q{'"}) },
                                     qr/([^,]+)/,
                                   ],
                                   undef,1);

    The list in the second argument means: "Try and extract a ' or " delimited string, otherwise extract anything up to a comma...". The undef third argument means: "...as many times as possible...", and the true value in the fourth argument means "...discarding anything else that appears (i.e. the commas)".

    If you wanted the commas preserved as separate fields (i.e. like split does if your split pattern has capturing parentheses), you would just make the last parameter undefined (or remove it).
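    A runnable sketch of the comma-separated example, showing the effect of the fourth argument. The sample line is made up, and the regex extractor here is qr/([^,]+)/, which stops at the next comma; that pattern is a choice made for this sketch.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_multiple extract_delimited);

my @extractors = (
    sub { extract_delimited($_[0], q{'"}) },   # a quoted field
    qr/([^,]+)/,                               # an unquoted run up to a comma
);

# Fourth argument true: unmatched text (the commas) is discarded.
my $line = q{a,"b,c",d};
my @fields = extract_multiple($line, \@extractors, undef, 1);

# Fourth argument omitted: unmatched text is returned as fields too.
my $again = q{a,"b,c",d};
my @with_commas = extract_multiple($again, \@extractors);

print join(' | ', @fields), "\n";
print join(' | ', @with_commas), "\n";
```

    Note that the quoted field keeps its quote characters; stripping them is left to the caller.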

    gen_delimited_pat

    The gen_delimited_pat subroutine takes a single (string) argument and builds a Friedl-style optimized regex that matches a string delimited by any one of the characters in the single argument. For example:

        gen_delimited_pat(q{'"})

    returns the regex:

        (?:\"(?:\\\"|(?!\").)*\"|\'(?:\\\'|(?!\').)*\')

    Note that the specified delimiters are automatically quotemeta'd.

    A typical use of gen_delimited_pat would be to build special purpose tags for extract_tagged. For example, to properly ignore "empty" XML elements (which might contain quoted strings):

        my $empty_tag = '<(' . gen_delimited_pat(q{'"}) . '|.)+/>';
        extract_tagged($text, undef, undef, undef, {ignore => [$empty_tag]} );

    gen_delimited_pat may also be called with an optional second argument, which specifies the "escape" character(s) to be used for each delimiter. For example to match a Pascal-style string (where ' is the delimiter and '' is a literal ' within the string):

        gen_delimited_pat(q{'},q{'});

    Different escape characters can be specified for different delimiters. For example, to specify that '/' is the escape for single quotes and '%' is the escape for double quotes:

        gen_delimited_pat(q{'"},q{/%});

    If more delimiters than escape chars are specified, the last escape char is used for the remaining delimiters. If no escape char is specified for a given specified delimiter, '\' is used.
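    As a sketch of how the generated pattern might be used directly, the example below interpolates it into an ordinary match. The input strings are made up, and the exact text of the generated regex may differ between module versions, so only the matches are shown.

```perl
use strict;
use warnings;
use Text::Balanced qw(gen_delimited_pat);

# Default escape character (backslash) for both quote styles.
my $quoted  = gen_delimited_pat(q{'"});
my $line    = q{say 'a' and "b\"c" now};
my @strings = $line =~ /($quoted)/g;
print "found: $_\n" for @strings;

# Pascal-style strings: the delimiter doubles as its own escape.
my $pascal    = gen_delimited_pat(q{'}, q{'});
my ($literal) = q{x := 'it''s';} =~ /($pascal)/;
print "pascal: $literal\n";
```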

    delimited_pat

    Note that gen_delimited_pat was previously called delimited_pat. That name may still be used, but is now deprecated.

    DIAGNOSTICS

    In a list context, all the functions return (undef,$original_text) on failure. In a scalar context, failure is indicated by returning undef (in this case the input text is not modified in any way).

    In addition, on failure in any context, the $@ variable is set. Accessing $@->{error} returns one of the error diagnostics listed below. Accessing $@->{pos} returns the offset into the original string at which the error was detected (although not necessarily where it occurred!). Printing $@ directly produces the error message, with the offset appended. On success, the $@ variable is guaranteed to be undef.
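    A minimal sketch of checking for failure in list context follows. The input strings are made up; the error details are copied out of $@ straight away because a later successful call resets it.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $text = 'no brackets here';
my ($extracted, $remainder) = extract_bracketed($text, '()');

my ($message, $offset);
if (!defined $extracted) {
    # $@ holds both the diagnostic and the offset where it was detected.
    ($message, $offset) = ($@->{error}, $@->{pos});
    print "error:  ", $message, "\n";
    print "offset: ", $offset, "\n";
    print "text returned unchanged: ", ($remainder eq $text ? 'yes' : 'no'), "\n";
}

# A successful call clears $@ again.
my $ok = '(fine)';
my ($group) = extract_bracketed($ok, '()');
my $after = defined $@ ? 'set' : 'undef';
print "after success, \$@ is $after\n";
```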

    The available diagnostics are:

    • Did not find a suitable bracket: "%s"

      The delimiter provided to extract_bracketed was not one of '()[]<>{}'.

    • Did not find prefix: /%s/

      A non-optional prefix was specified but wasn't found at the start of the text.

    • Did not find opening bracket after prefix: "%s"

      extract_bracketed or extract_codeblock was expecting a particular kind of bracket at the start of the text, and didn't find it.

    • No quotelike operator found after prefix: "%s"

      extract_quotelike didn't find one of the quotelike operators q, qq, qw, qx, s, tr or y at the start of the substring it was extracting.

    • Unmatched closing bracket: "%c"

      extract_bracketed, extract_quotelike or extract_codeblock encountered a closing bracket where none was expected.

    • Unmatched opening bracket(s): "%s"

      extract_bracketed, extract_quotelike or extract_codeblock ran out of characters in the text before closing one or more levels of nested brackets.

    • Unmatched embedded quote (%s)

      extract_bracketed attempted to match an embedded quoted substring, but failed to find a closing quote to match it.

    • Did not find closing delimiter to match '%s'

      extract_quotelike was unable to find a closing delimiter to match the one that opened the quote-like operation.

    • Mismatched closing bracket: expected "%c" but found "%s"

      extract_bracketed, extract_quotelike or extract_codeblock found a valid bracket delimiter, but it was the wrong species. This usually indicates a nesting error, but may indicate incorrect quoting or escaping.

    • No block delimiter found after quotelike "%s"

      extract_quotelike or extract_codeblock found one of the quotelike operators q, qq, qw, qx, s, tr or y without a suitable block after it.

    • Did not find leading dereferencer

      extract_variable was expecting one of '$', '@', or '%' at the start of a variable, but didn't find any of them.

    • Bad identifier after dereferencer

      extract_variable found a '$', '@', or '%' indicating a variable, but that character was not followed by a legal Perl identifier.

    • Did not find expected opening bracket at %s

      extract_codeblock failed to find any of the outermost opening brackets that were specified.

    • Improperly nested codeblock at %s

      A nested code block was found that started with a delimiter that was specified as being only to be used as an outermost bracket.

    • Missing second block for quotelike "%s"

      extract_codeblock or extract_quotelike found one of the quotelike operators s, tr or y followed by only one block.

    • No match found for opening bracket

      extract_codeblock failed to find a closing bracket to match the outermost opening bracket.

    • Did not find opening tag: /%s/

      extract_tagged did not find a suitable opening tag (after any specified prefix was removed).

    • Unable to construct closing tag to match: /%s/

      extract_tagged matched the specified opening tag and tried to modify the matched text to produce a matching closing tag (because none was specified). It failed to generate the closing tag, almost certainly because the opening tag did not start with a bracket of some kind.

    • Found invalid nested tag: %s

      extract_tagged found a nested tag that appeared in the "reject" list (and the failure mode was not "MAX" or "PARA").

    • Found unbalanced nested tag: %s

      extract_tagged found a nested opening tag that was not matched by a corresponding nested closing tag (and the failure mode was not "MAX" or "PARA").

    • Did not find closing tag

      extract_tagged reached the end of the text without finding a closing tag to match the original opening tag (and the failure mode was not "MAX" or "PARA").

    AUTHOR

    Damian Conway (damian@conway.org)

    BUGS AND IRRITATIONS

    There are undoubtedly serious bugs lurking somewhere in this code, if only because parts of it give the impression of understanding a great deal more about Perl than they really do.

    Bug reports and other feedback are most welcome.

    COPYRIGHT

    Copyright 1997 - 2001 Damian Conway. All Rights Reserved.

    Some (minor) parts copyright 2009 Adam Kennedy.

    This module is free software. It may be used, redistributed and/or modified under the same terms as Perl itself.

     
    Text::ParseWords

    Perl 5 version 18.2 documentation

    NAME

    Text::ParseWords - parse text into an array of tokens or array of arrays

    SYNOPSIS

        use Text::ParseWords;
        @lists = nested_quotewords($delim, $keep, @lines);
        @words = quotewords($delim, $keep, @lines);
        @words = shellwords(@lines);
        @words = parse_line($delim, $keep, $line);
        @words = old_shellwords(@lines); # DEPRECATED!

    DESCRIPTION

    The &nested_quotewords() and &quotewords() functions accept a delimiter (which can be a regular expression) and a list of lines, and then break those lines up into a list of words, ignoring delimiters that appear inside quotes. &quotewords() returns all of the tokens in a single long list, while &nested_quotewords() returns a list of token lists corresponding to the elements of @lines. &parse_line() does tokenizing on a single string. The &*quotewords() functions simply call &parse_line(), so if you're only splitting one line you can call &parse_line() directly and save a function call.

    The $keep argument is a boolean flag. If true, then the tokens are split on the specified delimiter, but all other characters (quotes, backslashes, etc.) are kept in the tokens. If $keep is false then the &*quotewords() functions remove all quotes and backslashes that are not themselves backslash-escaped or inside of single quotes (i.e., &quotewords() tries to interpret these characters just like the Bourne shell). NB: these semantics are significantly different from the original version of this module shipped with Perl 5.000 through 5.004. As an additional feature, $keep may be the keyword "delimiters" which causes the functions to preserve the delimiters in each string as tokens in the token lists, in addition to preserving quote and backslash characters.

    &shellwords() is written as a special case of &quotewords(), and it does token parsing with whitespace as a delimiter-- similar to most Unix shells.
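
The effect of $keep and the difference between the functions can be seen in a small sketch. The input strings and variable names here are made up for illustration; only the functions exported by Text::ParseWords are used.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords nested_quotewords shellwords);

my $line = q{alpha,"beta,gamma",delta};

# $keep false: quotes are removed; the quoted comma stays inside its token.
my @plain = quotewords(',', 0, $line);

# $keep true: same split points, but the quote characters are retained.
my @kept = quotewords(',', 1, $line);

# nested_quotewords() returns one array reference per input line.
my @rows = nested_quotewords(',', 0, 'a,b', 'c,"d,e"');

# shellwords() splits on whitespace, honouring quotes and backslashes.
my @args = shellwords(q{copy "my file.txt" backup\ dir});

print join('|', @plain), "\n";    # alpha|beta,gamma|delta
print join('|', @kept),  "\n";    # alpha|"beta,gamma"|delta
print scalar(@rows), " rows, second row: ", join('|', @{ $rows[1] }), "\n";
print join('|', @args),  "\n";    # copy|my file.txt|backup dir
```

For a single string, calling parse_line() directly with the same arguments avoids the extra function call mentioned above.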

    EXAMPLES

    The sample program:

        use Text::ParseWords;
        @words = quotewords('\s+', 0, q{this   is "a test" of\ quotewords \"for you});
        $i = 0;
        foreach (@words) {
            print "$i: <$_>\n";
            $i++;
        }

    produces:

        0: <this>
        1: <is>
        2: <a test>
        3: <of quotewords>
        4: <"for>
        5: <you>

    demonstrating:

    • 0: a simple word
    • 1: multiple spaces are skipped because of our $delim
    • 2: use of quotes to include a space in a word
    • 3: use of a backslash to include a space in a word
    • 4: use of a backslash to remove the special meaning of a double-quote
    • 5: another simple word (note the lack of effect of the backslashed double-quote)

    Replacing quotewords('\s+', 0, q{this is...}) with shellwords(q{this is...}) is a simpler way to accomplish the same thing.

    SEE ALSO

    Text::CSV - for parsing CSV files

    AUTHORS

    Maintainer: Alexandr Ciornii <alexchornyATgmail.com>.

    Previous maintainer: Hal Pomeranz <pomeranz@netcom.com>, 1994-1997 (Original author unknown). Much of the code for &parse_line() (including the primary regexp) from Joerk Behrends <jbehrends@multimediaproduzenten.de>.

    Examples section and other documentation provided by John Heidemann <johnh@ISI.EDU>.

    Bug reports, patches, and nagging provided by lots of folks-- thanks everybody! Special thanks to Michael Schwern <schwern@envirolink.org> for assuring me that a &nested_quotewords() would be useful, and to Jeff Friedl <jfriedl@yahoo-inc.com> for telling me not to worry about error-checking (sort of-- you had to be there).

     
    Text::Soundex

    NAME

    Text::Soundex - Implementation of the soundex algorithm.

    SYNOPSIS

    1. use Text::Soundex;
    2. # Original algorithm.
    3. $code = soundex($name); # Get the soundex code for a name.
    4. @codes = soundex(@names); # Get the list of codes for a list of names.
    5. # American Soundex variant (NARA) - Used for US census data.
    6. $code = soundex_nara($name); # Get the soundex code for a name.
    7. @codes = soundex_nara(@names); # Get the list of codes for a list of names.
    8. # Redefine the value that soundex() will return if the input string
    9. # contains no identifiable sounds within it.
    10. $Text::Soundex::nocode = 'Z000';

    DESCRIPTION

    Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for names with the same pronunciation to be encoded to the same representation so that they can be matched despite minor differences in spelling. Soundex is the most widely known of all phonetic algorithms and is often used (incorrectly) as a synonym for "phonetic algorithm". Improvements to Soundex are the basis for many modern phonetic algorithms. (Wikipedia, 2007)

    This module implements the original soundex algorithm developed by Robert Russell and Margaret Odell, patented in 1918 and 1922, as well as a variation called "American Soundex" used for US census data, and currently maintained by the National Archives and Records Administration (NARA).

    The soundex algorithm may be recognized from Donald Knuth's The Art of Computer Programming. The algorithm described by Knuth is the NARA algorithm.

    The value returned for strings which have no soundex encoding is defined using $Text::Soundex::nocode . The default value is undef, however values such as 'Z000' are commonly used alternatives.

    For backward compatibility with older versions of this module the $Text::Soundex::nocode is exported into the caller's namespace as $soundex_nocode .

    In scalar context, soundex() returns the soundex code of its first argument. In list context, a list is returned in which each element is the soundex code for the corresponding argument passed to soundex() . For example, the following code assigns @codes the value ('M200', 'S320') :

    1. @codes = soundex qw(Mike Stok);

    To use Text::Soundex to generate codes that can be used to search one of the publicly available US Censuses, a variant of the soundex algorithm must be used:

    1. use Text::Soundex;
    2. $code = soundex_nara($name);

    An example of where these algorithms differ follows:

    1. use Text::Soundex;
    2. print soundex("Ashcraft"), "\n"; # prints: A226
    3. print soundex_nara("Ashcraft"), "\n"; # prints: A261

    EXAMPLES

    Donald Knuth's examples of names and the soundex codes they map to are listed below:

        Euler, Ellery          -> E460
        Gauss, Ghosh           -> G200
        Hilbert, Heilbronn     -> H416
        Knuth, Kant            -> K530
        Lloyd, Ladd            -> L300
        Lukasiewicz, Lissajous -> L222

    so:

    1. $code = soundex 'Knuth'; # $code contains 'K530'
    2. @list = soundex qw(Lloyd Gauss); # @list contains 'L300', 'G200'

    LIMITATIONS

    As the soundex algorithm was originally used a long time ago in the US it considers only the English alphabet and pronunciation. In particular, non-ASCII characters will be ignored. The recommended method of dealing with characters that have accents, or other unicode characters, is to use the Text::Unidecode module available from CPAN. Either use the module explicitly:

    1. use Text::Soundex;
    2. use Text::Unidecode;
    3. print soundex(unidecode("Fran\xE7ais")), "\n"; # Prints "F652\n"

    Or use the convenient wrapper routine:

    1. use Text::Soundex 'soundex_unicode';
    2. print soundex_unicode("Fran\xE7ais"), "\n"; # Prints "F652\n"

    Since the soundex algorithm maps a large space (strings of arbitrary length) onto a small space (single letter plus 3 digits) no inference can be made about the similarity of two strings which end up with the same soundex code. For example, both Hilbert and Heilbronn end up with a soundex code of H416 .

    MAINTAINER

    This module is currently maintained by Mark Mielke (mark@mielke.cc).

    HISTORY

    Version 3 is a significant update to provide support for versions of Perl later than Perl 5.004. Specifically, the XS version of the soundex() subroutine understands strings that are encoded using UTF-8 (unicode strings).

    Version 2 of this module was a re-write by Mark Mielke (mark@mielke.cc ) to improve the speed of the subroutines. The XS version of the soundex() subroutine was introduced in 2.00.

    Version 1 of this module was written by Mike Stok (mike@stok.co.uk ) and was included into the Perl core library set.

    Dave Carlsen (dcarlsen@csranet.com ) made the request for the NARA algorithm to be included. The NARA soundex page can be viewed at: http://www.nara.gov/genealogy/soundex/soundex.html

    Ian Phillips (ian@pipex.net ) and Rich Pinder (rpinder@hsc.usc.edu ) supplied ideas and spotted mistakes for v1.x.

     
    Text::Tabs

    NAME

    Text::Tabs - expand and unexpand tabs like unix expand(1) and unexpand(1)

    SYNOPSIS

    1. use Text::Tabs;
    2. $tabstop = 4; # default = 8
    3. @lines_without_tabs = expand(@lines_with_tabs);
    4. @lines_with_tabs = unexpand(@lines_without_tabs);

    DESCRIPTION

    Text::Tabs does most of what the unix utilities expand(1) and unexpand(1) do. Given a line with tabs in it, expand replaces those tabs with the appropriate number of spaces. Given a line with or without tabs in it, unexpand adds tabs when it can save bytes by doing so, like the unexpand -a command.

    Unlike the old unix utilities, this module correctly accounts for any Unicode combining characters (such as diacriticals) that may occur in each line for both expansion and unexpansion. These are overstrike characters that do not increment the logical position. Make sure you have the appropriate Unicode settings enabled.

    EXPORTS

    The following are exported:

    • expand
    • unexpand
    • $tabstop

      The $tabstop variable controls how many column positions apart each tabstop is. The default is 8.

      Please note that local($tabstop) doesn't do the right thing and if you want to use local to override $tabstop , you need to use local($Text::Tabs::tabstop).
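
A short sketch of expand, unexpand and a temporary tab stop follows. The helper name expand_with and the sample strings are hypothetical; the helper localizes the fully qualified variable, as the note above recommends.

```perl
use strict;
use warnings;
use Text::Tabs;    # exports expand, unexpand and $tabstop

# Hypothetical helper: expand lines with a temporary tab stop.
sub expand_with {
    my ($width, @lines) = @_;
    # Localize the package variable itself, not the imported alias.
    local $Text::Tabs::tabstop = $width;
    return expand(@lines);
}

my @four  = expand_with(4, "a\tb", "\tindented");
my @eight = expand_with(8, "a\tb");

# Back at the default tab stop of 8: eight leading spaces become one tab.
my $packed = unexpand("        deep");

printf "[%s]\n", $_ for @four, @eight;
print $packed eq "\tdeep" ? "unexpand restored the tab\n" : "no tab restored\n";
```

Because the change is made with local, the tab stop is back at its previous value as soon as expand_with returns.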

    EXAMPLE

    1. #!perl
    2. # unexpand -a
    3. use Text::Tabs;
    4. while (<>) {
    5. print unexpand $_;
    6. }

    Instead of the shell's expand command, use:

    1. perl -MText::Tabs -n -e 'print expand $_'

    Instead of the shell's unexpand -a command, use:

    1. perl -MText::Tabs -n -e 'print unexpand $_'

    SUBVERSION

    This module comes in two flavors: one for modern perls (5.10 and above) and one for ancient obsolete perls. The version for modern perls has support for Unicode. The version for old perls does not. You can tell which version you have installed by looking at $Text::Tabs::SUBVERSION : it is old for obsolete perls and modern for current perls.

    This man page is for the version for modern perls and so that's probably what you've got.

    BUGS

    Text::Tabs handles only tabs ("\t") and combining characters (/\pM/). It doesn't count backwards for backspaces ("\b"), omit other non-printing control characters (/\pC/), or otherwise deal with any other zero-, half-, and full-width characters.

    LICENSE

    Copyright (C) 1996-2002,2005,2006 David Muir Sharnoff. Copyright (C) 2005 Aristotle Pagaltzis. Copyright (C) 2012 Google, Inc. This module may be modified, used, copied, and redistributed at your own risk. Publicly redistributed modified versions must use a different name.

     
    Text::Wrap

    NAME

    Text::Wrap - line wrapping to form simple paragraphs

    SYNOPSIS

    Example 1

    1. use Text::Wrap;
    2. $initial_tab = "\t"; # Tab before first line
    3. $subsequent_tab = ""; # All other lines flush left
    4. print wrap($initial_tab, $subsequent_tab, @text);
    5. print fill($initial_tab, $subsequent_tab, @text);
    6. $lines = wrap($initial_tab, $subsequent_tab, @text);
    7. @paragraphs = fill($initial_tab, $subsequent_tab, @text);

    Example 2

    1. use Text::Wrap qw(wrap $columns $huge);
    2. $columns = 132; # Wrap at 132 characters
    3. $huge = 'die';
    4. $huge = 'wrap';
    5. $huge = 'overflow';

    Example 3

    1. use Text::Wrap;
    2. $Text::Wrap::columns = 72;
    3. print wrap('', '', @text);

    DESCRIPTION

    Text::Wrap::wrap() is a very simple paragraph formatter. It formats a single paragraph at a time by breaking lines at word boundaries. Indentation is controlled for the first line ($initial_tab ) and all subsequent lines ($subsequent_tab ) independently. Please note: $initial_tab and $subsequent_tab are the literal strings that will be used: it is unlikely you would want to pass in a number.

    Text::Wrap::fill() is a simple multi-paragraph formatter. It formats each paragraph separately and then joins them together when it's done. It will destroy any whitespace in the original text. It breaks text into paragraphs by looking for whitespace after a newline. In other respects, it acts like wrap().

    wrap() compresses trailing whitespace into one newline, and fill() deletes all trailing whitespace.

    Both wrap() and fill() return a single string.

    Unlike the old Unix fmt(1) utility, this module correctly accounts for any Unicode combining characters (such as diacriticals) that may occur in each line when wrapping. These are overstrike characters that do not increment the logical position. Make sure you have the appropriate Unicode settings enabled.

    OVERRIDES

    Text::Wrap::wrap() has a number of variables that control its behavior. Because other modules might be using Text::Wrap::wrap() it is suggested that you leave these variables alone! If you can't do that, then use local($Text::Wrap::VARIABLE) = YOURVALUE when you change the values so that the original value is restored. This local() trick will not work if you import the variable into your own namespace.

    Lines are wrapped at $Text::Wrap::columns columns (default value: 76). $Text::Wrap::columns should be set to the full width of your output device. In fact, every resulting line will have length of no more than $columns - 1 .

    It is possible to control which characters terminate words by modifying $Text::Wrap::break . Set this to a string such as '[\s:]' (to break before spaces or colons) or a pre-compiled regexp such as qr/[\s']/ (to break before spaces or apostrophes). The default is simply '\s' ; that is, words are terminated by spaces. (This means, among other things, that trailing punctuation such as full stops or commas stay with the word they are "attached" to.) Setting $Text::Wrap::break to a regular expression that doesn't eat any characters (perhaps just a forward look-ahead assertion) will cause warnings.

    Beginner note: In Example 2 above, $columns is imported into the local namespace and set locally. In Example 3, $Text::Wrap::columns is set in its own namespace without importing it.

    Text::Wrap::wrap() starts its work by expanding all the tabs in its input into spaces. The last thing it does is to turn spaces back into tabs. If you do not want tabs in your results, set $Text::Wrap::unexpand to a false value. Likewise, if you do not want to use 8-character tabstops, set $Text::Wrap::tabstop to the number of characters you do want for your tabstops.

    If you want to separate your lines with something other than \n then set $Text::Wrap::separator to your preference. This replaces all newlines with $Text::Wrap::separator . If you just want to preserve existing newlines but add new breaks with something else, set $Text::Wrap::separator2 instead.

    When words that are longer than $columns are encountered, they are broken up. wrap() adds a "\n" at column $columns . This behavior can be overridden by setting $huge to 'die' or to 'overflow'. When set to 'die', large words will cause die() to be called. When set to 'overflow', large words will be left intact.

    Historical notes: 'die' used to be the default value of $huge . Now, 'wrap' is the default value.
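
The overrides can be combined in a small wrapper that localizes them, so other users of Text::Wrap are not affected. This is only a sketch: wrap_to is a hypothetical helper, and the widths and texts are arbitrary.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Hypothetical helper: wrap $text for a device that is $width columns wide.
sub wrap_to {
    my ($width, $text) = @_;
    local $Text::Wrap::columns  = $width;       # lines end up at most $width - 1 long
    local $Text::Wrap::unexpand = 0;            # never turn spaces back into tabs
    local $Text::Wrap::huge     = 'overflow';   # leave over-long words intact
    return wrap('', '  ', $text);               # indent continuation lines by two spaces
}

my $text    = "This is a bit of text that forms a normal book-style paragraph";
my $wrapped = wrap_to(20, $text);

my $long_word = 'Supercalifragilisticexpialidocious';
my $overflow  = wrap_to(10, "see $long_word now");

print "$wrapped\n\n$overflow\n";
```

With $huge left at its default of 'wrap', the long word in the second call would instead be broken across lines.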

    EXAMPLES

    Code:

    1. print wrap("\t","",<<END);
    2. This is a bit of text that forms
    3. a normal book-style indented paragraph
    4. END

    Result:

    1. " This is a bit of text that forms
    2. a normal book-style indented paragraph
    3. "

    Code:

    1. $Text::Wrap::columns=20;
    2. $Text::Wrap::separator="|";
    3. print wrap("","","This is a bit of text that forms a normal book-style paragraph");

    Result:

    1. "This is a bit of|text that forms a|normal book-style|paragraph"

    SUBVERSION

    This module comes in two flavors: one for modern perls (5.10 and above) and one for ancient obsolete perls. The version for modern perls has support for Unicode. The version for old perls does not. You can tell which version you have installed by looking at $Text::Wrap::SUBVERSION : it is old for obsolete perls and modern for current perls.

    This man page is for the version for modern perls and so that's probably what you've got.

    SEE ALSO

    For correct handling of East Asian half- and full-width characters, see Text::WrapI18N. For more detailed controls: Text::Format.

    AUTHOR

    David Muir Sharnoff <cpan@dave.sharnoff.org> with help from Tim Pierce and many many others.

    LICENSE

    Copyright (C) 1996-2009 David Muir Sharnoff. Copyright (C) 2012 Google, Inc. This module may be modified, used, copied, and redistributed at your own risk. Publicly redistributed modified versions must use a different name.

     
    Test::Builder

    NAME

    Test::Builder - Backend for building test libraries

    SYNOPSIS

    1. package My::Test::Module;
    2. use base 'Test::Builder::Module';
    3. my $CLASS = __PACKAGE__;
    4. sub ok {
    5. my($test, $name) = @_;
    6. my $tb = $CLASS->builder;
    7. $tb->ok($test, $name);
    8. }

    DESCRIPTION

    Test::Simple and Test::More have proven to be popular testing modules, but they're not always flexible enough. Test::Builder provides a building block upon which to write your own test libraries which can work together.

    Construction

    • new
      1. my $Test = Test::Builder->new;

      Returns a Test::Builder object representing the current state of the test.

      Since you only run one test per program, new always returns the same Test::Builder object. No matter how many times you call new(), you're getting the same object. This is called a singleton. This is done so that multiple modules share such global information as the test counter and where test output is going.

      If you want a completely new Test::Builder object different from the singleton, use create .

    • create
      1. my $Test = Test::Builder->create;

      Ok, so there can be more than one Test::Builder object and this is how you get it. You might use this instead of new() if you're testing a Test::Builder based module, but otherwise you probably want new .

      NOTE: the implementation is not complete. level , for example, is still shared amongst all Test::Builder objects, even ones created using this method. Also, the method name may change in the future.

    • child
      1. my $child = $builder->child($name_of_child);
      2. $child->plan( tests => 4 );
      3. $child->ok(some_code());
      4. ...
      5. $child->finalize;

      Returns a new instance of Test::Builder . Any output from this child will be indented four spaces more than the parent's indentation. When done, the finalize method must be called explicitly.

      Trying to create a new child with a previous child still active (i.e., finalize not called) will croak .

      Trying to run a test when you have an open child will also croak and cause the test suite to fail.

    • subtest
      1. $builder->subtest($name, \&subtests);

      See documentation of subtest in Test::More.

    • finalize
      1. my $ok = $child->finalize;

      When your child is done running tests, you must call finalize to clean up and tell the parent your pass/fail status.

      Calling finalize on a child with open children will croak .

      If the child falls out of scope before finalize is called, a failure diagnostic will be issued and the child is considered to have failed.

      No attempt to call methods on a child after finalize is called is guaranteed to succeed.

      Calling this on the root builder is a no-op.

    • parent
      1. if ( my $parent = $builder->parent ) {
      2. ...
      3. }

      Returns the parent Test::Builder instance, if any. Only used with child builders for nested TAP.

    • name
      1. diag $builder->name;

      Returns the name of the current builder. Top level builders default to $0 (the name of the executable). Child builders are named via the child method. If no name is supplied, it will be named "Child of $parent->name".

    • reset
      1. $Test->reset;

      Reinitializes the Test::Builder singleton to its original state. Mostly useful for tests run in persistent environments where the same test might be run multiple times in the same process.
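
The difference between new and create can be checked directly by comparing object references. A minimal sketch, with made-up variable names:

```perl
use strict;
use warnings;
use Test::Builder;

my $first  = Test::Builder->new;
my $second = Test::Builder->new;
my $own    = Test::Builder->create;

# Comparing references numerically tells us whether they are the same object.
my $same_singleton = ($first == $second) ? 1 : 0;
my $separate       = ($own != $first)    ? 1 : 0;

print "new() twice returns the same object: $same_singleton\n";
print "create() returns a different object: $separate\n";
```

No tests are run here, so nothing is written to the test output.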

    Setting up tests

    These methods are for setting up tests and declaring how many there are. You usually only want to call one of these methods.

    • plan
      1. $Test->plan('no_plan');
      2. $Test->plan( skip_all => $reason );
      3. $Test->plan( tests => $num_tests );

      A convenient way to set up your tests. Call this and Test::Builder will print the appropriate headers and take the appropriate actions.

      If you call plan() , don't call any of the other methods below.

      If a child calls "skip_all" in the plan, a Test::Builder::Exception is thrown. Trap this error, call finalize() and don't run any more tests on the child.

      1. my $child = $Test->child('some child');
      2. eval { $child->plan( $condition ? ( skip_all => $reason ) : ( tests => 3 ) ) };
      3. if ( eval { $@->isa('Test::Builder::Exception') } ) {
      4. $child->finalize;
      5. return;
      6. }
      7. # run your tests
    • expected_tests
      1. my $max = $Test->expected_tests;
      2. $Test->expected_tests($max);

      Gets/sets the number of tests we expect this test to run and prints out the appropriate headers.

    • no_plan
      1. $Test->no_plan;

      Declares that this test will run an indeterminate number of tests.

    • done_testing
      1. $Test->done_testing();
      2. $Test->done_testing($num_tests);

      Declares that you are done testing, no more tests will be run after this point.

      If a plan has not yet been output, it will do so.

      $num_tests is the number of tests you planned to run. If a numbered plan was already declared, and if this contradicts, a failing test will be run to reflect the planning mistake. If no_plan was declared, this will override.

      If done_testing() is called twice, the second call will issue a failing test.

      If $num_tests is omitted, the number of tests run will be used, like no_plan.

      done_testing() is, in effect, used when you'd want to use no_plan , but safer. You'd use it like so:

      1. $Test->ok($a == $b);
      2. $Test->done_testing();

      Or to plan a variable number of tests:

          for my $test (@tests) {
              $Test->ok($test);
          }
          $Test->done_testing(scalar @tests);
    • has_plan
          $plan = $Test->has_plan;

      Find out whether a plan has been defined. $plan is either undef (no plan has been set), no_plan (indeterminate # of tests) or an integer (the number of expected tests).

    • skip_all
      1. $Test->skip_all;
      2. $Test->skip_all($reason);

      Skips all the tests, using the given $reason . Exits immediately with 0.

    • exported_to
      1. my $pack = $Test->exported_to;
      2. $Test->exported_to($pack);

      Tells Test::Builder what package you exported your functions to.

      This method isn't terribly useful since modules which share the same Test::Builder object might get exported to different packages and only the last one will be honored.
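
As an illustration of has_plan and done_testing, the sketch below runs a variable number of tests on a private builder whose output is captured in scalars. The use of create and of the output methods described later in this document is a choice made for the example, so that the singleton and the real STDOUT are left alone.

```perl
use strict;
use warnings;
use Test::Builder;

# Private builder; TAP and diagnostics are captured in separate scalars.
my $tb = Test::Builder->create;
my ($tap, $diag, $todo) = ('', '', '');
$tb->output(\$tap);
$tb->failure_output(\$diag);
$tb->todo_output(\$todo);

my $plan_before = $tb->has_plan;    # undef: no plan has been declared yet

my @checks = (1, 'non-empty string', 42);
$tb->ok($_, "value '$_' is true") for @checks;

# Pass the number of tests, not the list itself.
$tb->done_testing(scalar @checks);

print $tap;
```

Since no plan was declared up front, done_testing prints the "1..N" line after the test results.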

    Running tests

    These actually run the tests, analogous to the functions in Test::More.

    They all return true if the test passed, false if the test failed.

    $name is always optional.

    • ok
      1. $Test->ok($test, $name);

      Your basic test. Pass if $test is true, fail if $test is false. Just like Test::Simple's ok() .

    • is_eq
      1. $Test->is_eq($got, $expected, $name);

      Like Test::More's is() . Checks if $got eq $expected . This is the string version.

      undef only ever matches another undef.

    • is_num
      1. $Test->is_num($got, $expected, $name);

      Like Test::More's is() . Checks if $got == $expected . This is the numeric version.

      undef only ever matches another undef.

    • isnt_eq
      1. $Test->isnt_eq($got, $dont_expect, $name);

      Like Test::More's isnt() . Checks if $got ne $dont_expect . This is the string version.

    • isnt_num
      1. $Test->isnt_num($got, $dont_expect, $name);

      Like Test::More's isnt(). Checks if $got != $dont_expect. This is the numeric version.

    • like
      1. $Test->like($this, qr/$regex/, $name);
      2. $Test->like($this, '/$regex/', $name);

      Like Test::More's like() . Checks if $this matches the given $regex .

    • unlike
      1. $Test->unlike($this, qr/$regex/, $name);
      2. $Test->unlike($this, '/$regex/', $name);

      Like Test::More's unlike() . Checks if $this does not match the given $regex .

    • cmp_ok
      1. $Test->cmp_ok($this, $type, $that, $name);

      Works just like Test::More's cmp_ok() .

      1. $Test->cmp_ok($big_num, '!=', $other_big_num);
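
The string and numeric variants give different answers for values such as '10' and '10.0'. The following sketch shows this on a private builder with captured output (an assumption made for the example, so the deliberately failing comparison does not affect a real test run):

```perl
use strict;
use warnings;
use Test::Builder;

my $tb = Test::Builder->create;
my ($tap, $diag, $todo) = ('', '', '');
$tb->output(\$tap);
$tb->failure_output(\$diag);
$tb->todo_output(\$todo);

# Each method returns true if its test passed, false otherwise.
my $as_strings = $tb->is_eq('10', '10.0', 'compared as strings');    # fails: 'eq'
my $as_numbers = $tb->is_num('10', '10.0', 'compared as numbers');   # passes: '=='
my $not_equal  = $tb->isnt_num(3, 4, 'three is not four');
my $ordered    = $tb->cmp_ok(2, '<', 10, 'numeric less-than');
my $matches    = $tb->like('Test::Builder', qr/^Test::/, 'has the Test:: prefix');
$tb->done_testing;

print $tap;
```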

    Other Testing Methods

    These are methods which are used in the course of writing a test but are not themselves tests.

    • BAIL_OUT
      1. $Test->BAIL_OUT($reason);

      Indicates to the Test::Harness that things are going so badly all testing should terminate. This includes running any additional test scripts.

      It will exit with 255.

    • skip
      1. $Test->skip;
      2. $Test->skip($why);

      Skips the current test, reporting $why .

    • todo_skip
      1. $Test->todo_skip;
      2. $Test->todo_skip($why);

      Like skip() , only it will declare the test as failing and TODO. Similar to

      1. print "not ok $tnum # TODO $why\n";
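
A sketch of what skip and todo_skip record, again on a private builder with captured output (an assumption for the example; the reasons are made up):

```perl
use strict;
use warnings;
use Test::Builder;

my $tb = Test::Builder->create;
my ($tap, $diag, $todo) = ('', '', '');
$tb->output(\$tap);
$tb->failure_output(\$diag);
$tb->todo_output(\$todo);

$tb->skip('no network available');         # counted as a passing test
$tb->todo_skip('parser not written yet');  # declared failing, but TODO
$tb->done_testing(2);

# summary() gives the logical pass/fail of each test so far.
my @summary = $tb->summary;

print $tap;
```

Both entries count as logical passes, since a skipped test passes and a TODO failure is expected.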

    Test building utility methods

    These methods are useful when writing your own test methods.

    • maybe_regex
      1. $Test->maybe_regex(qr/$regex/);
      2. $Test->maybe_regex('/$regex/');

      This method used to be useful back when Test::Builder worked on Perls before 5.6, which didn't have qr//. Now it's pretty useless.

      Convenience method for building testing functions that take regular expressions as arguments.

      Takes a quoted regular expression produced by qr//, or a string representing a regular expression.

      Returns a Perl value which may be used instead of the corresponding regular expression, or undef if its argument is not recognised.

      For example, a version of like() , sans the useful diagnostic messages, could be written as:

      1. sub laconic_like {
      2. my ($self, $this, $regex, $name) = @_;
      3. my $usable_regex = $self->maybe_regex($regex);
      4. die "expecting regex, found '$regex'\n"
      5. unless $usable_regex;
      6. $self->ok($this =~ m/$usable_regex/, $name);
      7. }
    • is_fh
      1. my $is_fh = $Test->is_fh($thing);

      Determines if the given $thing can be used as a filehandle.
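
The two utility methods can be tried without running any tests. A small sketch with made-up inputs:

```perl
use strict;
use warnings;
use Test::Builder;

my $tb = Test::Builder->create;

my $from_qr     = $tb->maybe_regex(qr/ab+c/);     # a qr// object is usable as is
my $from_string = $tb->maybe_regex('/ab+c/i');    # '/.../' string with a flag
my $rejected    = $tb->maybe_regex('ab+c');       # no delimiters: not recognised

print "qr form matches\n"     if 'abbc' =~ /$from_qr/;
print "string form matches\n" if 'ABBC' =~ /$from_string/;
print "bare string rejected\n" unless defined $rejected;

# is_fh() accepts glob references such as \*STDOUT, but not plain strings.
my $glob_ok   = $tb->is_fh(\*STDOUT);
my $string_ok = $tb->is_fh('plain string');
print "STDOUT is a filehandle\n" if $glob_ok;
```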

    Test style

    • level
      1. $Test->level($how_high);

      How far up the call stack should $Test look when reporting where the test failed.

      Defaults to 1.

      Setting $Test::Builder::Level overrides. This is typically useful localized:

      1. sub my_ok {
      2. my $test = shift;
      3. local $Test::Builder::Level = $Test::Builder::Level + 1;
      4. $TB->ok($test);
      5. }

      To be polite to other functions wrapping your own you usually want to increment $Level rather than set it to a constant.

    • use_numbers
      1. $Test->use_numbers($on_or_off);

      Whether or not the test should output numbers. That is, this if true:

          ok 1
          ok 2
          ok 3

      or this if false:

          ok
          ok
          ok

      Most useful when you can't depend on the test output order, such as when threads or forking is involved.

      Defaults to on.

    • no_diag
      1. $Test->no_diag($no_diag);

      If set true no diagnostics will be printed. This includes calls to diag() .

    • no_ending
      1. $Test->no_ending($no_ending);

      Normally, Test::Builder does some extra diagnostics when the test ends. It also changes the exit code as described below.

      If this is true, none of that will be done.

    • no_header
      1. $Test->no_header($no_header);

      If set to true, no "1..N" header will be printed.
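
A sketch of how use_numbers and no_header change the output, using a private builder with captured output (an assumption for the example):

```perl
use strict;
use warnings;
use Test::Builder;

my $tb = Test::Builder->create;
my ($tap, $diag, $todo) = ('', '', '');
$tb->output(\$tap);
$tb->failure_output(\$diag);
$tb->todo_output(\$todo);

$tb->use_numbers(0);    # "ok" lines without test numbers
$tb->no_header(1);      # suppress the "1..N" header
$tb->plan(tests => 2);

$tb->ok(1, 'first');
$tb->ok(1, 'second');

print $tap;
```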

    Output

    Controlling where the test output goes.

    It's ok for your test to change where STDOUT and STDERR point to; Test::Builder's default output settings will not be affected.

    • diag
      1. $Test->diag(@msgs);

      Prints out the given @msgs . Like print, arguments are simply appended together.

      Normally, it uses the failure_output() handle, but if this is for a TODO test, the todo_output() handle is used.

      Output will be indented and marked with a # so as not to interfere with test output. A newline will be put on the end if there isn't one already.

      We encourage using this rather than calling print directly.

      Returns false. Why? Because diag() is often used in conjunction with a failing test (ok() || diag() ) it "passes through" the failure.

      1. return ok(...) || diag(...);
    • note
      1. $Test->note(@msgs);

      Like diag() , but it prints to the output() handle so it will not normally be seen by the user except in verbose mode.

    • explain
      1. my @dump = $Test->explain(@msgs);

      Will dump the contents of any references in a human readable format. Handy for things like...

      1. is_deeply($have, $want) || diag explain $have;

      or

      1. is_deeply($have, $want) || note explain $have;
    • output
    • failure_output
    • todo_output
      1. my $filehandle = $Test->output;
      2. $Test->output($filehandle);
      3. $Test->output($filename);
      4. $Test->output(\$scalar);

      These methods control where Test::Builder will print its output. They take either an open $filehandle , a $filename to open and write to or a $scalar reference to append to. It will always return a $filehandle .

      output is where normal "ok/not ok" test output goes.

      Defaults to STDOUT.

      failure_output is where diagnostic output on test failures and diag() goes. It is normally not read by Test::Harness and instead is displayed to the user.

      Defaults to STDERR.

      todo_output is used instead of failure_output() for the diagnostics of a failing TODO test. These will not be seen by the user.

      Defaults to STDOUT.

    • reset_outputs
      1. $tb->reset_outputs;

      Resets all the output filehandles back to their defaults.

    • carp
      1. $tb->carp(@message);

      Warns with @message but the message will appear to come from the point where the original test function was called ($tb->caller ).

    • croak
      1. $tb->croak(@message);

      Dies with @message but the message will appear to come from the point where the original test function was called ($tb->caller ).

    Test Status and Info

    • current_test
      1. my $curr_test = $Test->current_test;
      2. $Test->current_test($num);

      Gets/sets the current test number we're on. You usually shouldn't have to set this.

      If set forward, the details of the missing tests are filled in as 'unknown'. If set backward, the details of the intervening tests are deleted. You can erase history if you really want to.

    • is_passing
      1. my $ok = $builder->is_passing;

      Indicates if the test suite is currently passing.

      More formally, it will be false if anything has happened which makes it impossible for the test suite to pass. True otherwise.

      For example, if no tests have run is_passing() will be true because even though a suite with no tests is a failure you can add a passing test to it and start passing.

      Don't think about it too much.

    • summary
      1. my @tests = $Test->summary;

      A simple summary of the tests so far. True for pass, false for fail. This is a logical pass/fail, so todos are passes.

      Of course, test #1 is $tests[0], etc...

    • details
      1. my @tests = $Test->details;

      Like summary() , but with a lot more detail.

      $tests[$test_num - 1] = {
          'ok'      => is the test considered a pass?
          actual_ok => did it literally say 'ok'?
          name      => name of the test (if any)
          type      => type of test (if any, see below).
          reason    => reason for the above (if any)
      };

      'ok' is true if Test::Harness will consider the test to be a pass.

      'actual_ok' is a reflection of whether or not the test literally printed 'ok' or 'not ok'. This is for examining the result of 'todo' tests.

      'name' is the name of the test.

      'type' indicates if it was a special test. Normal tests have a type of ''. Type can be one of the following:

      skip        see skip()
      todo        see todo()
      todo_skip   see todo_skip()
      unknown     see below

      Sometimes the Test::Builder test counter is incremented without it printing any test output, for example, when current_test() is changed. In these cases, Test::Builder doesn't know the result of the test, so its type is 'unknown'. The details for these tests are still filled in: they are considered ok, but name and actual_ok are left undef.

      For example "not ok 23 - hole count # TODO insufficient donuts" would result in this structure:

      $tests[22] =    # 23 - 1, since arrays start from 0.
        { ok        => 1,    # logically, the test passed since it's todo
          actual_ok => 0,    # in absolute terms, it failed
          name      => 'hole count',
          type      => 'todo',
          reason    => 'insufficient donuts'
        };
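
      The structure above, and the 'unknown' type, can be reproduced with a short sketch. It assumes a private builder with captured output; the test name and reason are taken from the example above.

```perl
use strict;
use warnings;
use Test::Builder;

my $tb = Test::Builder->create;
my ($out, $err, $todo) = ('', '', '');
$tb->output(\$out);
$tb->failure_output(\$err);
$tb->todo_output(\$todo);
$tb->plan(tests => 3);

# Test 1: a failing TODO test.
$tb->todo_start('insufficient donuts');
$tb->ok(0, 'hole count');
$tb->todo_end;

# Jump the counter forward; tests 2 and 3 never actually ran.
$tb->current_test(3);

my @summary = $tb->summary;    # logical pass/fail only
my @details = $tb->details;    # one hash ref per test

for my $i (0 .. $#details) {
    printf "test %d: ok=%s type='%s'\n",
        $i + 1, $details[$i]{ok}, $details[$i]{type};
}
```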
    • todo
      1. my $todo_reason = $Test->todo;
      2. my $todo_reason = $Test->todo($pack);

      If the current tests are considered "TODO" it will return the reason, if any. This reason can come from a $TODO variable or the last call to todo_start() .

      Since a TODO test does not need a reason, this function can return an empty string even when inside a TODO block. Use $Test->in_todo to determine if you are currently inside a TODO block.

      todo() is about finding the right package to look for $TODO in. It's pretty good at guessing the right package to look at. It first looks for the caller based on $Level + 1 , since todo() is usually called inside a test function. As a last resort it will use exported_to() .

      Sometimes there is some confusion about where todo() should be looking for the $TODO variable. If you want to be sure, tell it explicitly what $pack to use.

    • find_TODO
      1. my $todo_reason = $Test->find_TODO();
      2. my $todo_reason = $Test->find_TODO($pack);

      Like todo() but only returns the value of $TODO ignoring todo_start() .

      Can also be used to set $TODO to a new value while returning the old value:

      1. my $old_reason = $Test->find_TODO($pack, 1, $new_reason);
    • in_todo
      1. my $in_todo = $Test->in_todo;

      Returns true if the test is currently inside a TODO block.

    • todo_start
      1. $Test->todo_start();
      2. $Test->todo_start($message);

      This method allows you to declare all subsequent tests as TODO tests, up until the todo_end method has been called.

      The TODO: and $TODO syntax is generally pretty good about figuring out whether or not we're in a TODO test. However, often we find that this is not possible to determine (such as when we want to use $TODO but the tests are being executed in other packages which can't be inferred beforehand).

      Note that you can use this to nest "todo" tests:

      $Test->todo_start('working on this');
      # lots of code
      $Test->todo_start('working on that');
      # more code
      $Test->todo_end;
      $Test->todo_end;

      This is generally not recommended, but large testing systems often have weird internal needs.

      We've tried to make this also work with the TODO: syntax, but it's not guaranteed and its use is also discouraged:

      TODO: {
          local $TODO = 'We have work to do!';
          $Test->todo_start('working on this');
          # lots of code
          $Test->todo_start('working on that');
          # more code
          $Test->todo_end;
          $Test->todo_end;
      }

      Pick one style or another of "TODO" to be on the safe side.

    • todo_end
      1. $Test->todo_end;

      Stops running tests as "TODO" tests. This method is fatal if called without a preceding todo_start method call.
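
      How todo_start(), in_todo(), todo() and todo_end() fit together can be sketched like this. The example assumes a private builder with captured output; the reason string and test name are invented for illustration.

```perl
use strict;
use warnings;
use Test::Builder;

my $tb = Test::Builder->create;
my ($out, $err, $todo_out) = ('', '', '');
$tb->output(\$out);
$tb->failure_output(\$err);
$tb->todo_output(\$todo_out);
$tb->plan(tests => 1);

my $before = $tb->in_todo ? 1 : 0;    # not in a TODO block yet

$tb->todo_start('parser not finished');
my $during = $tb->in_todo ? 1 : 0;    # now inside one
my $reason = $tb->todo;               # the message given to todo_start()
$tb->ok(0, 'parses nested lists');    # fails, but is treated as a TODO test
$tb->todo_end;

my $after = $tb->in_todo ? 1 : 0;     # back outside
```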

    • caller
      1. my $package = $Test->caller;
      2. my($pack, $file, $line) = $Test->caller;
      3. my($pack, $file, $line) = $Test->caller($height);

      Like the normal caller(), except it reports according to your level() .

      $height will be added to the level() .

      If caller() winds up off the top of the stack it reports the highest context.

    EXIT CODES

    If all your tests passed, Test::Builder will exit with zero (which is normal). If anything failed it will exit with how many failed. If you run fewer (or more) tests than you planned, the missing (or extra) tests will be considered failures. If no tests were ever run Test::Builder will throw a warning and exit with 255. If the test died, even after having successfully completed all its tests, it will still be considered a failure and will exit with 255.

    So the exit codes are...

    0                   all tests successful
    255                 test died or all passed but wrong # of tests run
    any other number    how many failed (including missing or extras)

    If you fail more than 254 tests, it will be reported as 254.
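
    The table can be restated as a small function. This is only a sketch of the rules as listed here, not Test::Builder's own code; expected_exit_code() and its argument names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the exit-code table above.
# Arguments: died (boolean), planned, run and failed (counts).
sub expected_exit_code {
    my (%r) = @_;

    return 255 if $r{died};                  # the test died
    return 255 if $r{run} == 0;              # no tests were ever run
    if ($r{failed}) {
        return $r{failed} > 254 ? 254 : $r{failed};   # capped at 254
    }
    return 255 if $r{run} != $r{planned};    # all passed, wrong number run
    return 0;                                # all tests successful
}

printf "%d\n", expected_exit_code(died => 0, planned => 3, run => 3, failed => 2);
```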

    THREADS

    In perl 5.8.1 and later, Test::Builder is thread-safe. The test number is shared amongst all threads. This means if one thread sets the test number using current_test() they will all be affected.

    While versions earlier than 5.8.1 had threads they contain too many bugs to support.

    Test::Builder is only thread-aware if threads.pm is loaded before Test::Builder.

    MEMORY

    An informative hash, accessible via details(), is stored for each test you perform. So memory usage will scale linearly with each test run. Although this is not a problem for most test suites, it can become an issue if you do large (hundreds of thousands to millions) combinatorics tests in the same run.

    In such cases, you are advised to either split the test file into smaller ones, or use a reverse approach, doing "normal" (code) compares and triggering fail() should anything go unexpected.

    Future versions of Test::Builder will have a way to turn history off.
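
    A variation on the "reverse approach" can be sketched as follows: compare with plain Perl inside the loop and report one result at the end, so only a single details() entry is stored. The square() function is a stand-in for the code under test, and a private builder with captured output is assumed.

```perl
use strict;
use warnings;
use Test::Builder;

my $tb = Test::Builder->create;
my ($out, $err, $todo) = ('', '', '');
$tb->output(\$out);
$tb->failure_output(\$err);
$tb->todo_output(\$todo);
$tb->plan(tests => 1);

# square() is a stand-in for the code under test.
sub square { my ($n) = @_; return $n * $n }

# Compare with ordinary Perl and remember only the mismatches, so no
# per-comparison history is kept by Test::Builder.
my @mismatches;
for my $n (1 .. 10_000) {
    push @mismatches, $n unless square($n) == $n**2;
}

# One test result for the whole batch.
$tb->ok(!@mismatches, 'square() agrees with ** for 1..10_000')
    or $tb->diag(scalar(@mismatches) . " mismatches, first at $mismatches[0]");

my $stored = () = $tb->details;    # number of history entries kept
```

    The trade-off is coarser reporting: a failure tells you that the batch failed, and the diagnostic has to carry whatever detail you want to see.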

    EXAMPLES

    CPAN can provide the best examples. Test::Simple, Test::More, Test::Exception and Test::Differences all use Test::Builder.

    SEE ALSO

    Test::Simple, Test::More, Test::Harness

    AUTHORS

    Original code by chromatic, maintained by Michael G Schwern <schwern@pobox.com>

    COPYRIGHT

    Copyright 2002-2008 by chromatic <chromatic@wgz.org> and Michael G Schwern <schwern@pobox.com>.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    See http://www.perl.com/perl/misc/Artistic.html

     
    Test::Harness

    NAME

    Test::Harness - Run Perl standard test scripts with statistics

    VERSION

    Version 3.26

    SYNOPSIS

    1. use Test::Harness;
    2. runtests(@test_files);

    DESCRIPTION

    Although, for historical reasons, the Test::Harness distribution takes its name from this module it now exists only to provide TAP::Harness with an interface that is somewhat backwards compatible with Test::Harness 2.xx. If you're writing new code consider using TAP::Harness directly instead.

    Emulation is provided for runtests and execute_tests but the pluggable 'Straps' interface that previous versions of Test::Harness supported is not reproduced here. Straps is now available as a stand alone module: Test::Harness::Straps.

    See TAP::Parser, TAP::Harness for the main documentation for this distribution.

    FUNCTIONS

    The following functions are available.

    runtests( @test_files )

    This runs all the given @test_files and divines whether they passed or failed based on their output to STDOUT (details above). It prints out each individual test which failed along with a summary report and how long it all took.

    It returns true if everything was ok. Otherwise it will die() with one of the messages in the DIAGNOSTICS section.

    execute_tests( tests => \@test_files, out => \*FH )

    Runs all the given @test_files (just like runtests() ) but doesn't generate the final report. During testing, progress information will be written to the currently selected output filehandle (usually STDOUT ), or to the filehandle given by the out parameter. The out is optional.

    Returns a list of two values, $total and $failed , describing the results. $total is a hash ref summary of all the tests run. Its keys and values are this:

    bonus         Number of individual todo tests unexpectedly passed
    max           Number of individual tests run
    ok            Number of individual tests passed
    sub_skipped   Number of individual tests skipped
    todo          Number of individual todo tests
    files         Number of test files run
    good          Number of test files passed
    bad           Number of test files failed
    tests         Number of test files originally given
    skipped       Number of test files skipped

    If $total->{bad} == 0 and $total->{max} > 0 , you've got a successful test.

    $failed is a hash ref of all the test scripts that failed. Each key is the name of a test script, each value is another hash representing how that script failed. Its keys are these:

    name     Name of the test which failed
    estat    Script's exit value
    wstat    Script's wait status
    max      Number of individual tests
    failed   Number which failed
    canon    List of tests which failed (as string).

    $failed should be empty if everything passed.
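
    The success check and the shape of $failed can be illustrated with hand-built data. The hashes below are made-up sample values in the layout described above, not output from a real run; in practice they would come from execute_tests(). The helper names run_succeeded() and failure_report() are hypothetical.

```perl
use strict;
use warnings;

# True when no test file failed and at least one test actually ran.
sub run_succeeded {
    my ($total, $failed) = @_;
    return ( $total->{bad} == 0 && $total->{max} > 0 && !%$failed ) ? 1 : 0;
}

# One line per failed script, built from the per-script hash.
sub failure_report {
    my ($failed) = @_;
    return map {
        sprintf( '%s: %d of %d failed (%s)',
            $_, $failed->{$_}{failed}, $failed->{$_}{max}, $failed->{$_}{canon} )
    } sort keys %$failed;
}

# Made-up sample data for illustration only.
my %good_total = ( bad => 0, good => 2, files => 2, tests => 2, max => 12, ok => 12 );
my %bad_total  = ( bad => 1, good => 1, files => 2, tests => 2, max => 12, ok => 10 );
my %bad_failed = (
    't/widget.t' => {
        name => 't/widget.t', estat => 2, wstat => 512,
        max  => 5,            failed => 2, canon => '2 4',
    },
);

print "good run: ", run_succeeded( \%good_total, {} ), "\n";
print "$_\n" for failure_report( \%bad_failed );
```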

    EXPORT

    &runtests is exported by Test::Harness by default.

    &execute_tests , $verbose , $switches and $debug are exported upon request.

    ENVIRONMENT VARIABLES THAT TAP::HARNESS::COMPATIBLE SETS

    Test::Harness sets these before executing the individual tests.

    • HARNESS_ACTIVE

      This is set to a true value. It allows the tests to determine if they are being executed through the harness or by any other means.

    • HARNESS_VERSION

      This is the version of Test::Harness .

    ENVIRONMENT VARIABLES THAT AFFECT TEST::HARNESS

    • HARNESS_TIMER

      Setting this to true will make the harness display the number of milliseconds each test took. You can also use prove's --timer switch.

    • HARNESS_VERBOSE

      If true, Test::Harness will output the verbose results of running its tests. Setting $Test::Harness::verbose will override this, or you can use the -v switch in the prove utility.

    • HARNESS_OPTIONS

      Provide additional options to the harness. Currently supported options are:

      • j<n>

        Run <n> (default 9) parallel jobs.

      • c

        Try to color output. See new in TAP::Formatter::Base.

      • a<file.tgz>

        Will use TAP::Harness::Archive as the harness class, and save the TAP to file.tgz

      • fPackage-With-Dashes

        Set the formatter_class of the harness being run. Since the items in HARNESS_OPTIONS are separated by :, use - in place of :: in the package name.

      Multiple options may be separated by colons:

      1. HARNESS_OPTIONS=j9:c make test
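
      The format can be sketched with a small parser. This is an illustration of the syntax described above, not the harness's internal code; parse_harness_options() and the keys of the hash it returns are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical parser for the HARNESS_OPTIONS syntax described above.
sub parse_harness_options {
    my ($string) = @_;
    my %opt;
    for my $item ( split /:/, $string ) {
        if ( $item =~ /^j(\d*)$/ ) {
            $opt{jobs} = length($1) ? $1 : 9;    # j<n>, default 9
        }
        elsif ( $item eq 'c' ) {
            $opt{color} = 1;
        }
        elsif ( $item =~ /^a(.+)$/ ) {
            $opt{archive} = $1;                  # a<file.tgz>
        }
        elsif ( $item =~ /^f(.+)$/ ) {
            ( my $class = $1 ) =~ s/-/::/g;      # dashes stand in for ::
            $opt{formatter_class} = $class;
        }
        else {
            die "Unknown HARNESS_OPTIONS item: $item\n";
        }
    }
    return \%opt;
}

my $opt = parse_harness_options('j4:c:fPackage-With-Dashes');
print "$_ => $opt->{$_}\n" for sort keys %$opt;
```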
    • HARNESS_SUBCLASS

      Specifies a TAP::Harness subclass to be used in place of TAP::Harness.

    Taint Mode

    Normally when a Perl program is run in taint mode the contents of the PERL5LIB environment variable do not appear in @INC .

    Because PERL5LIB is often used during testing to add build directories to @INC, Test::Harness passes the names of any directories found in PERL5LIB as -I switches. The net effect of this is that PERL5LIB is honoured even in taint mode.
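
    The translation from PERL5LIB to -I switches can be pictured with a short sketch. perl5lib_switches() is a hypothetical helper written for this example; the harness does the equivalent work internally in its own way.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: turn a PERL5LIB-style string into -I switches.
sub perl5lib_switches {
    my ($perl5lib) = @_;
    return () unless defined($perl5lib) && length($perl5lib);

    # PERL5LIB uses the platform's path separator (":" on Unix).
    return map { "-I$_" }
        grep { length }
        split /\Q$Config{path_sep}\E/, $perl5lib;
}

my @switches = perl5lib_switches( join $Config{path_sep}, 'blib/lib', 'blib/arch' );
print "@switches\n";
```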

    SEE ALSO

    TAP::Harness

    BUGS

    Please report any bugs or feature requests to bug-test-harness at rt.cpan.org , or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Test-Harness. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

    AUTHORS

    Andy Armstrong <andy@hexten.net>

    Test::Harness 2.64 (maintained by Andy Lester and on which this module is based) has this attribution:

    Either Tim Bunce or Andreas Koenig, we don't know. What we know for
    sure is, that it was inspired by Larry Wall's TEST script that came
    with perl distributions for ages. Numerous anonymous contributors
    exist. Andreas Koenig held the torch for many years, and then
    Michael G Schwern.

    LICENCE AND COPYRIGHT

    Copyright (c) 2007-2011, Andy Armstrong <andy@hexten.net> . All rights reserved.

    This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

     
    Test::More

    NAME

    Test::More - yet another framework for writing test scripts

    SYNOPSIS

    use Test::More tests => 23;
    # or
    use Test::More skip_all => $reason;
    # or
    use Test::More; # see done_testing()
    BEGIN { use_ok( 'Some::Module' ); }
    require_ok( 'Some::Module' );
    # Various ways to say "ok"
    ok($got eq $expected, $test_name);
    is ($got, $expected, $test_name);
    isnt($got, $expected, $test_name);
    # Rather than print STDERR "# here's what went wrong\n"
    diag("here's what went wrong");
    like ($got, qr/expected/, $test_name);
    unlike($got, qr/expected/, $test_name);
    cmp_ok($got, '==', $expected, $test_name);
    is_deeply($got_complex_structure, $expected_complex_structure, $test_name);
    SKIP: {
        skip $why, $how_many unless $have_some_feature;
        ok( foo(), $test_name );
        is( foo(42), 23, $test_name );
    };
    TODO: {
        local $TODO = $why;
        ok( foo(), $test_name );
        is( foo(42), 23, $test_name );
    };
    can_ok($module, @methods);
    isa_ok($object, $class);
    pass($test_name);
    fail($test_name);
    BAIL_OUT($why);
    # UNIMPLEMENTED!!!
    my @status = Test::More::status;

    DESCRIPTION

    STOP! If you're just getting started writing tests, have a look at Test::Simple first. This is a drop in replacement for Test::Simple which you can switch to once you get the hang of basic testing.

    The purpose of this module is to provide a wide range of testing utilities: various ways to say "ok" with better diagnostics, and facilities to skip tests, test future features and compare complicated data structures. While you can do almost anything with a simple ok() function, it doesn't provide good diagnostic output.

    I love it when a plan comes together

    Before anything else, you need a testing plan. This basically declares how many tests your script is going to run to protect against premature failure.

    The preferred way to do this is to declare a plan when you use Test::More .

    1. use Test::More tests => 23;

    There are cases when you will not know beforehand how many tests your script is going to run. In this case, you can declare your tests at the end.

    1. use Test::More;
    2. ... run your tests ...
    3. done_testing( $number_of_tests_run );

    Sometimes you really don't know how many tests were run, or it's too difficult to calculate. In which case you can leave off $number_of_tests_run.

    In some cases, you'll want to completely skip an entire testing script.

    1. use Test::More skip_all => $skip_reason;

    Your script will declare a skip with the reason why you skipped and exit immediately with a zero (success). See Test::Harness for details.

    If you want to control what functions Test::More will export, you have to use the 'import' option. For example, to import everything but 'fail', you'd do:

    1. use Test::More tests => 23, import => ['!fail'];

    Alternatively, you can use the plan() function. Useful for when you have to calculate the number of tests.

    1. use Test::More;
    2. plan tests => keys %Stuff * 3;

    or for deciding between running the tests at all:

    1. use Test::More;
    2. if( $^O eq 'MacOS' ) {
    3. plan skip_all => 'Test irrelevant on MacOS';
    4. }
    5. else {
    6. plan tests => 42;
    7. }
    • done_testing
      1. done_testing();
      2. done_testing($number_of_tests);

      If you don't know how many tests you're going to run, you can issue the plan when you're done running tests.

      $number_of_tests means the same as in plan(): it's the number of tests you expected to run. You can omit this, in which case the number of tests you ran doesn't matter, just the fact that your tests ran to conclusion.

      This is safer than and replaces the "no_plan" plan.

    Test names

    By convention, each test is assigned a number in order. This is largely done automatically for you. However, it's often very useful to assign a name to each test. Which would you rather see:

    1. ok 4
    2. not ok 5
    3. ok 6

    or

    1. ok 4 - basic multi-variable
    2. not ok 5 - simple exponential
    3. ok 6 - force == mass * acceleration

    The latter gives you some idea of what failed. It also makes it easier to find the test in your script; simply search for "simple exponential".

    All test functions take a name argument. It's optional, but highly suggested that you use it.

    I'm ok, you're not ok.

    The basic purpose of this module is to print out either "ok #" or "not ok #" depending on if a given test succeeded or failed. Everything else is just gravy.

    All of the following print "ok" or "not ok" depending on if the test succeeded or failed. They all also return true or false, respectively.

    • ok
      1. ok($got eq $expected, $test_name);

      This simply evaluates any expression ($got eq $expected is just a simple example) and uses that to determine if the test succeeded or failed. A true expression passes, a false one fails. Very simple.

      For example:

      1. ok( $exp{9} == 81, 'simple exponential' );
      2. ok( Film->can('db_Main'), 'set_db()' );
      3. ok( $p->tests == 4, 'saw tests' );
      4. ok( !grep !defined $_, @items, 'items populated' );

      (Mnemonic: "This is ok.")

      $test_name is a very short description of the test that will be printed out. It makes it very easy to find a test in your script when it fails and gives others an idea of your intentions. $test_name is optional, but we very strongly encourage its use.

      Should an ok() fail, it will produce some diagnostics:

      1. not ok 18 - sufficient mucus
      2. # Failed test 'sufficient mucus'
      3. # in foo.t at line 42.

      This is the same as Test::Simple's ok() routine.

    • is
    • isnt
      1. is ( $got, $expected, $test_name );
      2. isnt( $got, $expected, $test_name );

      Similar to ok(), is() and isnt() compare their two arguments with eq and ne respectively and use the result of that to determine if the test succeeded or failed. So these:

      1. # Is the ultimate answer 42?
      2. is( ultimate_answer(), 42, "Meaning of Life" );
      3. # $foo isn't empty
      4. isnt( $foo, '', "Got some foo" );

      are similar to these:

      1. ok( ultimate_answer() eq 42, "Meaning of Life" );
      2. ok( $foo ne '', "Got some foo" );

      undef will only ever match undef. So you can test a value against undef like this:

      1. is($not_defined, undef, "undefined as expected");

      (Mnemonic: "This is that." "This isn't that.")

      So why use these? They produce better diagnostics on failure. ok() cannot know what you are testing for (beyond the name), but is() and isnt() know what the test was and why it failed. For example this test:

      1. my $foo = 'waffle'; my $bar = 'yarblokos';
      2. is( $foo, $bar, 'Is foo the same as bar?' );

      Will produce something like this:

      1. not ok 17 - Is foo the same as bar?
      2. # Failed test 'Is foo the same as bar?'
      3. # in foo.t at line 139.
      4. # got: 'waffle'
      5. # expected: 'yarblokos'

      So you can figure out what went wrong without rerunning the test.

      You are encouraged to use is() and isnt() over ok() where possible, however do not be tempted to use them to find out if something is true or false!

      1. # XXX BAD!
      2. is( exists $brooklyn{tree}, 1, 'A tree grows in Brooklyn' );

      This does not check if exists $brooklyn{tree} is true, it checks if it returns 1. Very different. Similar caveats exist for false and 0. In these cases, use ok().

      1. ok( exists $brooklyn{tree}, 'A tree grows in Brooklyn' );

      A simple call to isnt() usually does not provide a strong test but there are cases when you cannot say much more about a value than that it is different from some other value:

      my $obj = new_ok "Foo";
      my $clone = $obj->clone;
      isa_ok $clone, "Foo", "Foo->clone";
      isnt $obj, $clone, "clone() produces a different object";

      For those grammatical pedants out there, there's an isn't() function which is an alias of isnt().

    • like
      1. like( $got, qr/expected/, $test_name );

      Similar to ok(), like() matches $got against the regex qr/expected/.

      So this:

      1. like($got, qr/expected/, 'this is like that');

      is similar to:

      1. ok( $got =~ /expected/, 'this is like that');

      (Mnemonic "This is like that".)

      The second argument is a regular expression. It may be given as a regex reference (i.e. qr//) or (for better compatibility with older perls) as a string that looks like a regex (alternative delimiters are currently not supported):

      1. like( $got, '/expected/', 'this is like that' );

      Regex options may be placed on the end ('/expected/i' ).

      Its advantages over ok() are similar to those of is() and isnt(): better diagnostics on failure.

    • unlike
      1. unlike( $got, qr/expected/, $test_name );

      Works exactly as like(), only it checks if $got does not match the given pattern.

    • cmp_ok
      1. cmp_ok( $got, $op, $expected, $test_name );

      Halfway between ok() and is() lies cmp_ok(). This allows you to compare two arguments using any binary perl operator.

      1. # ok( $got eq $expected );
      2. cmp_ok( $got, 'eq', $expected, 'this eq that' );
      3. # ok( $got == $expected );
      4. cmp_ok( $got, '==', $expected, 'this == that' );
      5. # ok( $got && $expected );
      6. cmp_ok( $got, '&&', $expected, 'this && that' );
      7. ...etc...

      Its advantage over ok() is when the test fails you'll know what $got and $expected were:

      1. not ok 1
      2. # Failed test in foo.t at line 12.
      3. # '23'
      4. # &&
      5. # undef

      It's also useful in those cases where you are comparing numbers and is()'s use of eq will interfere:

      1. cmp_ok( $big_hairy_number, '==', $another_big_hairy_number );

      It's especially useful when testing greater-than or less-than relations between values:

      1. cmp_ok( $some_value, '<=', $upper_limit );
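
      The difference between is()'s string comparison and a numeric cmp_ok() can be seen in a short sketch. The values are chosen for illustration, and every test here is written to pass.

```perl
use strict;
use warnings;
use Test::More tests => 3;

my $got = '1.0';

# As strings, '1.0' and '1' differ, so is( $got, 1 ) would fail.
my $differs_as_string = isnt( $got, 1, "'1.0' is not the string '1'" );

# As numbers they are equal, which is what cmp_ok() with '==' checks.
my $equal_as_number = cmp_ok( $got, '==', 1, "'1.0' == 1 numerically" );

# Relational checks work the same way and report both values on failure.
my $within_limit = cmp_ok( 7, '<=', 10, '7 is within the limit' );
```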
    • can_ok
      1. can_ok($module, @methods);
      2. can_ok($object, @methods);

      Checks to make sure the $module or $object can do these @methods (works with functions, too).

      1. can_ok('Foo', qw(this that whatever));

      is almost exactly like saying:

      1. ok( Foo->can('this') &&
      2. Foo->can('that') &&
      3. Foo->can('whatever')
      4. );

      only without all the typing and with a better interface. Handy for quickly testing an interface.

      No matter how many @methods you check, a single can_ok() call counts as one test. If you desire otherwise, use:

      1. foreach my $meth (@methods) {
      2. can_ok('Foo', $meth);
      3. }
    • isa_ok
      1. isa_ok($object, $class, $object_name);
      2. isa_ok($subclass, $class, $object_name);
      3. isa_ok($ref, $type, $ref_name);

      Checks to see if the given $object->isa($class) . Also checks to make sure the object was defined in the first place. Handy for this sort of thing:

      1. my $obj = Some::Module->new;
      2. isa_ok( $obj, 'Some::Module' );

      where you'd otherwise have to write

      1. my $obj = Some::Module->new;
      2. ok( defined $obj && $obj->isa('Some::Module') );

      to safeguard against your test script blowing up.

      You can also test a class, to make sure that it has the right ancestor:

      1. isa_ok( 'Vole', 'Rodent' );

      It works on references, too:

      1. isa_ok( $array_ref, 'ARRAY' );

      The diagnostics of this test normally just refer to 'the object'. If you'd like them to be more specific, you can supply an $object_name (for example 'Test customer').

    • new_ok
      1. my $obj = new_ok( $class );
      2. my $obj = new_ok( $class => \@args );
      3. my $obj = new_ok( $class => \@args, $object_name );

      A convenience function which combines creating an object and calling isa_ok() on that object.

      It is basically equivalent to:

      1. my $obj = $class->new(@args);
      2. isa_ok $obj, $class, $object_name;

      If @args is not given, an empty list will be used.

      This function only works on new() and it assumes new() will return just a single object which isa $class .
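
      A sketch combining can_ok(), new_ok() and isa_ok() is shown below. Counter is a hypothetical class defined inline purely for the example.

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Counter is a hypothetical class defined only for this sketch.
{
    package Counter;
    sub new {
        my ( $class, %args ) = @_;
        return bless { count => $args{start} || 0 }, $class;
    }
    sub inc   { my ($self) = @_; $self->{count}++; return $self }
    sub count { my ($self) = @_; return $self->{count} }
}

# One test, however many methods are listed.
can_ok( 'Counter', qw(new inc count) );

# Constructs the object with the given arguments and checks its class.
my $counter = new_ok( 'Counter' => [ start => 5 ], 'counter' );

# The same class check on its own, with a descriptive object name.
isa_ok( $counter, 'Counter', 'counter' );

is( $counter->inc->count, 6, 'inc() adds one' );
```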

    • subtest
      1. subtest $name => \&code;

      subtest() runs the &code as its own little test with its own plan and its own result. The main test counts this as a single test using the result of the whole subtest to determine if it's ok or not ok.

      For example...

      1. use Test::More tests => 3;
      2. pass("First test");
      3. subtest 'An example subtest' => sub {
      4. plan tests => 2;
      5. pass("This is a subtest");
      6. pass("So is this");
      7. };
      8. pass("Third test");

      This would produce:

      1..3
      ok 1 - First test
          1..2
          ok 1 - This is a subtest
          ok 2 - So is this
      ok 2 - An example subtest
      ok 3 - Third test

      A subtest may call "skip_all". No tests will be run, but the subtest is considered a skip.

      1. subtest 'skippy' => sub {
      2. plan skip_all => 'cuz I said so';
      3. pass('this test will never be run');
      4. };

      Returns true if the subtest passed, false otherwise.

      Due to how subtests work, you may omit a plan if you desire. This adds an implicit done_testing() to the end of your subtest. The following two subtests are equivalent:

      1. subtest 'subtest with implicit done_testing()', sub {
      2. ok 1, 'subtests with an implicit done testing should work';
      3. ok 1, '... and support more than one test';
      4. ok 1, '... no matter how many tests are run';
      5. };
      6. subtest 'subtest with explicit done_testing()', sub {
      7. ok 1, 'subtests with an explicit done testing should work';
      8. ok 1, '... and support more than one test';
      9. ok 1, '... no matter how many tests are run';
      10. done_testing();
      11. };
    • pass
    • fail
      1. pass($test_name);
      2. fail($test_name);

      Sometimes you just want to say that the tests have passed. Usually the case is you've got some complicated condition that is difficult to wedge into an ok(). In this case, you can simply use pass() (to declare the test ok) or fail() (for not ok). They are synonyms for ok(1) and ok(0).

      Use these very, very, very sparingly.

    Module tests

    You usually want to test if the module you're testing loads ok, rather than just vomiting if its load fails. For such purposes we have use_ok and require_ok .

    • use_ok
      1. BEGIN { use_ok($module); }
      2. BEGIN { use_ok($module, @imports); }

      These simply use the given $module and test to make sure the load happened ok. It's recommended that you run use_ok() inside a BEGIN block so its functions are exported at compile-time and prototypes are properly honored.

      If @imports are given, they are passed through to the use. So this:

      1. BEGIN { use_ok('Some::Module', qw(foo bar)) }

      is like doing this:

      1. use Some::Module qw(foo bar);

      Version numbers can be checked like so:

      1. # Just like "use Some::Module 1.02"
      2. BEGIN { use_ok('Some::Module', 1.02) }

      Don't try to do this:

      1. BEGIN {
      2. use_ok('Some::Module');
      3. ...some code that depends on the use...
      4. ...happening at compile time...
      5. }

      because the notion of "compile-time" is relative. Instead, you want:

      1. BEGIN { use_ok('Some::Module') }
      2. BEGIN { ...some code that depends on the use... }

      If you want the equivalent of use Foo (), that is, to load a module without importing anything, use require_ok.

      1. BEGIN { require_ok "Foo" }
    • require_ok
      1. require_ok($module);
      2. require_ok($file);

      Like use_ok(), except it requires the $module or $file.

    Complex data structures

    Not everything is a simple eq check or regex. There are times you need to see if two data structures are equivalent. For these instances Test::More provides a handful of useful functions.

    NOTE I'm not quite sure what will happen with filehandles.

    • is_deeply
      1. is_deeply( $got, $expected, $test_name );

      Similar to is(), except that if $got and $expected are references, it does a deep comparison walking each data structure to see if they are equivalent. If the two structures are different, it will display the place where they start differing.

      is_deeply() compares the dereferenced values of references; the references themselves (except for their type) are ignored. This means aspects such as blessing and ties are not considered "different".

      is_deeply() currently has very limited handling of function references and globs. It merely checks if they have the same referent. This may improve in the future.

      Test::Differences and Test::Deep provide more in-depth functionality along these lines.
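
      A short sketch of a deep comparison, including the point about blessing, is given below. The data and the class name Animal are made up for illustration, and both tests are written to pass.

```perl
use strict;
use warnings;
use Test::More tests => 2;

my $expected = {
    name   => 'vole',
    tags   => [qw(small rodent)],
    limits => { max => 3 },
};

# Same shape and values, built separately: the deep comparison passes
# even though every reference is a different one.
my $got = {
    name   => 'vole',
    tags   => [ 'small', 'rodent' ],
    limits => { max => 3 },
};
my $same = is_deeply( $got, $expected, 'structures are equivalent' );

# Blessing is ignored, so a blessed copy still compares equal.
# 'Animal' is a hypothetical class name used only here.
my $blessed      = bless { %$got }, 'Animal';
my $same_blessed = is_deeply( $blessed, $expected, 'blessing is ignored' );
```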

    Diagnostics

    If you pick the right test function, you'll usually get a good idea of what went wrong when it failed. But sometimes it doesn't work out that way. So here we have ways for you to write your own diagnostic messages which are safer than just print STDERR .

    • diag
      1. diag(@diagnostic_message);

      Prints a diagnostic message which is guaranteed not to interfere with test output. Like print, @diagnostic_message is simply concatenated together.

      Returns false, so as to preserve failure.

      Handy for this sort of thing:

      1. ok( grep(/foo/, @users), "There's a foo user" ) or
      2. diag("Since there's no foo, check that /etc/bar is set up right");

      which would produce:

      not ok 42 - There's a foo user
      #   Failed test 'There's a foo user'
      #   in foo.t at line 52.
      # Since there's no foo, check that /etc/bar is set up right

      You might remember ok() or diag() with the mnemonic open() or die() .

      NOTE The exact formatting of the diagnostic output is still changing, but it is guaranteed that whatever you throw at it it won't interfere with the test.

    • note
      1. note(@diagnostic_message);

      Like diag(), except the message will not be seen when the test is run in a harness. It will only be visible in the verbose TAP stream.

      Handy for putting in notes which might be useful for debugging, but don't indicate a problem.

      1. note("Tempfile is $tempfile");
    • explain
      1. my @dump = explain @diagnostic_message;

      Will dump the contents of any references in a human readable format. Usually you want to pass this into note or diag .

      Handy for things like...

      1. is_deeply($have, $want) || diag explain $have;

      or

      1. note explain \%args;
      2. Some::Class->method(%args);
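The three diagnostic functions can be combined as in the following sketch. The %args hash and its contents are invented for the example; only diag(), note() and explain() come from the text above.

```perl
use strict;
use warnings;
use Test::More tests => 2;

my %args = ( host => 'localhost', port => 8080 );    # hypothetical settings

# Shown only in the verbose TAP stream, not when run under a harness.
note explain \%args;

# explain() returns the dump, so it can be inspected as well as printed.
my @dump = explain \%args;
like( join( '', @dump ), qr/localhost/, 'explain() dumped the hash contents' );

# diag() returns false, so the failure of the test is preserved.
ok( exists $args{port}, 'a port was given' )
    or diag( 'No port given; args were: ', explain \%args );
```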

    Conditional tests

    Sometimes running a test under certain conditions will cause the test script to die. A certain function or method isn't implemented (such as fork() on MacOS), some resource isn't available (like a net connection) or a module isn't available. In these cases it's necessary to skip tests, or declare that they are supposed to fail but will work in the future (a todo test).

    For more details on the mechanics of skip and todo tests see Test::Harness.

    The way Test::More handles this is with a named block. Basically, a block of tests which can be skipped over or made todo. It's best if I just show you...

    • SKIP: BLOCK
      SKIP: {
          skip $why, $how_many if $condition;

          ...normal testing code goes here...
      }

      This declares a block of tests that might be skipped, $how_many tests there are, $why and under what $condition to skip them. An example is the easiest way to illustrate:

      SKIP: {
          eval { require HTML::Lint };

          skip "HTML::Lint not installed", 2 if $@;

          my $lint = HTML::Lint->new;
          isa_ok( $lint, "HTML::Lint" );

          $lint->parse( $html );
          is( $lint->errors, 0, "No errors found in HTML" );
      }

      If the user does not have HTML::Lint installed, the whole block of code won't be run at all. Test::More will output special ok's which Test::Harness interprets as skipped, but passing, tests.

      It's important that $how_many accurately reflects the number of tests in the SKIP block so the # of tests run will match up with your plan. If your plan is no_plan, $how_many is optional and will default to 1.

      It's perfectly safe to nest SKIP blocks. Each SKIP block must have the label SKIP , or Test::More can't work its magic.

      You don't skip tests which are failing because there's a bug in your program, or for which you don't yet have code written. For that you use TODO. Read on.

    • TODO: BLOCK
      TODO: {
          local $TODO = $why if $condition;

          ...normal testing code goes here...
      }

      Declares a block of tests you expect to fail and $why. Perhaps it's because you haven't fixed a bug or haven't finished a new feature:

      TODO: {
          local $TODO = "URI::Geller not finished";

          my $card = "Eight of clubs";
          is( URI::Geller->your_card, $card, 'Is THIS your card?' );

          my $spoon;
          URI::Geller->bend_spoon;
          is( $spoon, 'bent', "Spoon bending, that's original" );
      }

      With a todo block, the tests inside are expected to fail. Test::More will run the tests normally, but print out special flags indicating they are "todo". Test::Harness will interpret failures as being ok. Should anything succeed, it will report it as an unexpected success. You then know the thing you had todo is done and can remove the TODO flag.

      The nice part about todo tests, as opposed to simply commenting out a block of tests, is it's like having a programmatic todo list. You know how much work is left to be done, you're aware of what bugs there are, and you'll know immediately when they're fixed.

      Once a todo test starts succeeding, simply move it outside the block. When the block is empty, delete it.

    • todo_skip
      TODO: {
          todo_skip $why, $how_many if $condition;

          ...normal testing code...
      }

      With todo tests, it's best to have the tests actually run. That way you'll know when they start passing. Sometimes this isn't possible. Often a failing test will cause the whole program to die or hang, even inside an eval BLOCK and with an alarm set. In these extreme cases you have no choice but to skip over the broken tests entirely.

      The syntax and behavior is similar to a SKIP: BLOCK except the tests will be marked as failing but todo. Test::Harness will interpret them as passing.

    • When do I use SKIP vs. TODO?

      If it's something the user might not be able to do, use SKIP. This includes optional modules that aren't installed, running under an OS that doesn't have some feature (like fork() or symlinks), or maybe you need an Internet connection and one isn't available.

      If it's something the programmer hasn't done yet, use TODO. This is for any code you haven't written yet, or bugs you have yet to fix, but want to put tests in your testing script (always a good idea).
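The distinction can be sketched in one script: a SKIP block guards tests that need something the user may not have, while a TODO block wraps a test for something the programmer has not written yet. The module name Acme::Hypothetical::Feature and the function frobnicate() are placeholders invented for this example.

```perl
use strict;
use warnings;
use Test::More tests => 3;

SKIP: {
    # Something the user might not have: an optional module.
    eval { require Acme::Hypothetical::Feature };    # placeholder module name
    skip 'Acme::Hypothetical::Feature not installed', 2 if $@;

    ok( Acme::Hypothetical::Feature->can('new'), 'module has a constructor' );
    ok( Acme::Hypothetical::Feature->new,        'constructor returns an object' );
}

TODO: {
    # Something the programmer hasn't done yet: the test runs, and its
    # failure is reported as an expected "todo" failure.
    local $TODO = 'frobnicate() is not written yet';    # placeholder feature

    ok( main->can('frobnicate'), 'frobnicate() exists' );
}
```

Note that the two skipped tests are counted in the plan, which is why $how_many is 2 here.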

    Test control

    • BAIL_OUT
      1. BAIL_OUT($reason);

      Indicates to the harness that things are going so badly all testing should terminate. This includes the running of any additional test scripts.

      This is typically used when testing cannot continue such as a critical module failing to compile or a necessary external utility not being available such as a database connection failing.

      The test will exit with 255.

      For even better control look at Test::Most.
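A common pattern, sketched here under the assumption that the critical module is File::Spec (picked only because it ships with Perl), is to bail out as soon as a prerequisite fails to load:

```perl
use strict;
use warnings;
use Test::More tests => 2;

# If the module everything else depends on can't be loaded,
# there is no point in running this or any other test script.
require_ok('File::Spec')
    or BAIL_OUT('File::Spec failed to load; nothing else can be tested');

ok( File::Spec->can('catfile'), 'catfile() is available' );
```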

    Discouraged comparison functions

    The use of the following functions is discouraged as they are not actually testing functions and produce no diagnostics to help figure out what went wrong. They were written before is_deeply() existed because I couldn't figure out how to display a useful diff of two arbitrary data structures.

    These functions are usually used inside an ok().

    1. ok( eq_array(\@got, \@expected) );

    is_deeply() can do that better and with diagnostics.

    1. is_deeply( \@got, \@expected );

    They may be deprecated in future versions.

    • eq_array
      1. my $is_eq = eq_array(\@got, \@expected);

      Checks if two arrays are equivalent. This is a deep check, so multi-level structures are handled correctly.

    • eq_hash
      1. my $is_eq = eq_hash(\%got, \%expected);

      Determines if the two hashes contain the same keys and values. This is a deep check.

    • eq_set
      1. my $is_eq = eq_set(\@got, \@expected);

      Similar to eq_array(), except the order of the elements is not important. This is a deep check, but the irrelevancy of order only applies to the top level.

      1. ok( eq_set(\@got, \@expected) );

      Is better written:

      1. is_deeply( [sort @got], [sort @expected] );

      NOTE By historical accident, this is not a true set comparison. While the order of elements does not matter, duplicate elements do.

      NOTE eq_set() does not know how to deal with references at the top level. The following is an example of a comparison which might not work:

      1. eq_set([\1, \2], [\2, \1]);

      Test::Deep contains much better set comparison functions.
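As a sketch of the is_deeply() replacement for eq_set(), the following compares two lists regardless of order, and then removes duplicates first to get a true set comparison. The fruit lists are made-up data, and the approach only suits elements that sort sensibly as strings.

```perl
use strict;
use warnings;
use Test::More tests => 2;

my @got      = qw(pear apple fig);
my @expected = qw(apple fig pear);

# Sorting both sides makes the order of elements irrelevant.
is_deeply( [ sort @got ], [ sort @expected ], 'same elements in any order' );

# For a true set comparison, drop duplicates before sorting.
my %seen;
my @unique = grep { !$seen{$_}++ } qw(apple apple fig pear);

is_deeply( [ sort @unique ], [ sort @expected ], 'same set once duplicates are removed' );
```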

    Extending and Embedding Test::More

    Sometimes the Test::More interface isn't quite enough. Fortunately, Test::More is built on top of Test::Builder which provides a single, unified backend for any test library to use. This means two test libraries which both use Test::Builder can be used together in the same program.

    If you simply want to do a little tweaking of how the tests behave, you can access the underlying Test::Builder object like so:

    • builder
      1. my $test_builder = Test::More->builder;

      Returns the Test::Builder object underlying Test::More for you to play with.
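One use for the builder object is writing a small test function of your own. The sketch below assumes a helper named is_positive(), which is not part of Test::More; it bumps $Test::Builder::Level so that failures are reported at the caller's line rather than inside the helper.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical helper built on the shared Test::Builder object.
sub is_positive {
    my ( $n, $name ) = @_;

    # Report failures at the line that called is_positive().
    local $Test::Builder::Level = $Test::Builder::Level + 1;

    my $tb = Test::More->builder;
    my $ok = $tb->ok( $n > 0, $name );
    $tb->diag("got: $n") unless $ok;
    return $ok;
}

is_positive( 3, '3 is positive' );
isa_ok( Test::More->builder, 'Test::Builder' );
```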

    EXIT CODES

    If all your tests passed, Test::Builder will exit with zero (which is normal). If anything failed it will exit with how many failed. If you run fewer (or more) tests than you planned, the missing (or extras) will be considered failures. If no tests were ever run Test::Builder will throw a warning and exit with 255. If the test died, even after having successfully completed all its tests, it will still be considered a failure and will exit with 255.

    So the exit codes are...

    0                   all tests successful
    255                 test died or all passed but wrong # of tests run
    any other number    how many failed (including missing or extras)

    If you fail more than 254 tests, it will be reported as 254.

    NOTE This behavior may go away in future versions.

    CAVEATS and NOTES

    • Backwards compatibility

      Test::More works with Perls as old as 5.6.0.

    • utf8 / "Wide character in print"

      If you use utf8 or other non-ASCII characters with Test::More you might get a "Wide character in print" warning. Using binmode STDOUT, ":utf8" will not fix it. Test::Builder (which powers Test::More) duplicates STDOUT and STDERR. So any changes to them, including changing their output disciplines, will not be seen by Test::More.

      The workaround is to change the filehandles used by Test::Builder directly.

      1. my $builder = Test::More->builder;
      2. binmode $builder->output, ":utf8";
      3. binmode $builder->failure_output, ":utf8";
      4. binmode $builder->todo_output, ":utf8";
    • Overloaded objects

      String overloaded objects are compared as strings (or in cmp_ok()'s case, strings or numbers as appropriate to the comparison op). This prevents Test::More from piercing an object's interface allowing better blackbox testing. So if a function starts returning overloaded objects instead of bare strings your tests won't notice the difference. This is good.

      However, it does mean that functions like is_deeply() cannot be used to test the internals of string overloaded objects. In this case I would suggest Test::Deep which contains more flexible testing functions for complex data structures.

    • Threads

      Test::More will only be aware of threads if "use threads" has been done before Test::More is loaded. This is ok:

      1. use threads;
      2. use Test::More;

      This may cause problems:

      1. use Test::More;
      2. use threads;

      5.8.1 and above are supported. Anything below that has too many bugs.

    HISTORY

    This is a case of convergent evolution with Joshua Pritikin's Test module. I was largely unaware of its existence when I'd first written my own ok() routines. This module exists because I can't figure out how to easily wedge test names into Test's interface (along with a few other problems).

    The goal here is to have a testing utility that's simple to learn, quick to use and difficult to trip yourself up with while still providing more flexibility than the existing Test.pm. As such, the names of the most common routines are kept tiny, special cases and magic side-effects are kept to a minimum. WYSIWYG.

    SEE ALSO

    Test::Simple if all this confuses you and you just want to write some tests. You can upgrade to Test::More later (it's forward compatible).

    Test::Harness is the test runner and output interpreter for Perl. It's the thing that powers make test and where the prove utility comes from.

    Test::Legacy tests written with Test.pm, the original testing module, do not play well with other testing libraries. Test::Legacy emulates the Test.pm interface and does play well with others.

    Test::Differences for more ways to test complex data structures. And it plays well with Test::More.

    Test::Class is like xUnit but more perlish.

    Test::Deep gives you more powerful complex data structure testing.

    Test::Inline shows the idea of embedded testing.

    Bundle::Test installs a whole bunch of useful test modules.

    AUTHORS

    Michael G Schwern <schwern@pobox.com> with much inspiration from Joshua Pritikin's Test module and lots of help from Barrie Slaymaker, Tony Bowden, blackstar.co.uk, chromatic, Fergal Daly and the perl-qa gang.

    BUGS

    See http://rt.cpan.org to report and view bugs.

    SOURCE

    The source code repository for Test::More can be found at http://github.com/schwern/test-more/.

    COPYRIGHT

    Copyright 2001-2008 by Michael G Schwern <schwern@pobox.com>.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    See http://www.perl.com/perl/misc/Artistic.html

     
    Test::Simple

    NAME

    Test::Simple - Basic utilities for writing tests.

    SYNOPSIS

    1. use Test::Simple tests => 1;
    2. ok( $foo eq $bar, 'foo is bar' );

    DESCRIPTION

    ** If you are unfamiliar with testing read Test::Tutorial first! **

    This is an extremely simple, extremely basic module for writing tests suitable for CPAN modules and other pursuits. If you wish to do more complicated testing, use the Test::More module (a drop-in replacement for this one).

    The basic unit of Perl testing is the ok. For each thing you want to test your program will print out an "ok" or "not ok" to indicate pass or fail. You do this with the ok() function (see below).

    The only other constraint is you must pre-declare how many tests you plan to run. This is in case something goes horribly wrong during the test and your test program aborts, or skips a test or whatever. You do this like so:

    1. use Test::Simple tests => 23;

    You must have a plan.

    • ok
      1. ok( $foo eq $bar, $name );
      2. ok( $foo eq $bar );

      ok() is given an expression (in this case $foo eq $bar ). If it's true, the test passed. If it's false, it didn't. That's about it.

      ok() prints out either "ok" or "not ok" along with a test number (it keeps track of that for you).

      1. # This produces "ok 1 - Hell not yet frozen over" (or not ok)
      2. ok( get_temperature($hell) > 0, 'Hell not yet frozen over' );

      If you provide a $name, that will be printed along with the "ok/not ok" to make it easier to find your test when it fails (just search for the name). It also makes it easier for the next guy to understand what your test is for. It's highly recommended you use test names.

      All tests are run in scalar context. So this:

      1. ok( @stuff, 'I have some stuff' );

      will do what you mean (fail if stuff is empty).

    Test::Simple will start by printing the number of tests to be run in the form "1..M" (so "1..5" means you're going to run 5 tests). This strange format lets Test::Harness know how many tests you plan on running in case something goes horribly wrong.

    If all your tests passed, Test::Simple will exit with zero (which is normal). If anything failed it will exit with how many failed. If you run fewer (or more) tests than you planned, the missing (or extras) will be considered failures. If no tests were ever run Test::Simple will throw a warning and exit with 255. If the test died, even after having successfully completed all its tests, it will still be considered a failure and will exit with 255.

    So the exit codes are...

    0                   all tests successful
    255                 test died or all passed but wrong # of tests run
    any other number    how many failed (including missing or extras)

    If you fail more than 254 tests, it will be reported as 254.

    This module is by no means trying to be a complete testing system. It's just to get you started. Once you're off the ground it's recommended you look at Test::More.
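Putting the pieces above together, a minimal self-contained script might look like the sketch below. The variables and their values are invented for illustration; only ok() and the plan come from Test::Simple.

```perl
use strict;
use warnings;
use Test::Simple tests => 3;    # the plan: prints "1..3" up front

my @stuff = ( 1, 2, 3 );

# The first argument is evaluated in scalar context, so a
# non-empty array counts as true.
ok( @stuff, 'I have some stuff' );

ok( 1 + 1 == 2, 'addition works' );

my ( $foo, $bar ) = ( 'same', 'same' );
ok( $foo eq $bar, 'foo is bar' );
```

Since every test passes and the number run matches the plan, the script exits with zero.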

    EXAMPLE

    Here's an example of a simple .t file for the fictional Film module.

    use Test::Simple tests => 5;

    use Film;  # What you're testing.

    my $btaste = Film->new({ Title             => 'Bad Taste',
                             Director          => 'Peter Jackson',
                             Rating            => 'R',
                             NumExplodingSheep => 1
                           });
    ok( defined($btaste) && ref $btaste eq 'Film', 'new() works' );

    ok( $btaste->Title      eq 'Bad Taste',     'Title() get'    );
    ok( $btaste->Director   eq 'Peter Jackson', 'Director() get' );
    ok( $btaste->Rating     eq 'R',             'Rating() get'   );
    ok( $btaste->NumExplodingSheep == 1,        'NumExplodingSheep() get' );

    It will produce output like this:

    1..5
    ok 1 - new() works
    ok 2 - Title() get
    ok 3 - Director() get
    not ok 4 - Rating() get
    #   Failed test 'Rating() get'
    #   in t/film.t at line 14.
    ok 5 - NumExplodingSheep() get
    # Looks like you failed 1 tests of 5

    Indicating the Film::Rating() method is broken.

    CAVEATS

    Test::Simple will only report a maximum of 254 failures in its exit code. If this is a problem, you probably have a huge test script. Split it into multiple files. (Otherwise blame the Unix folks for using an unsigned short integer as the exit status).

    Because VMS's exit codes are much, much different than the rest of the universe, and perl does horrible mangling to them that gets in my way, it works like this on VMS.

    0     SS$_NORMAL        all tests successful
    4     SS$_ABORT         something went wrong

    Unfortunately, I can't differentiate any further.

    NOTES

    Test::Simple is explicitly tested all the way back to perl 5.6.0.

    Test::Simple is thread-safe in perl 5.8.1 and up.

    HISTORY

    This module was conceived while talking with Tony Bowden in his kitchen one night about the problems I was having writing some really complicated feature into the new Testing module. He observed that the main problem is not dealing with these edge cases but that people hate to write tests at all. What was needed was a dead simple module that took all the hard work out of testing and was really, really easy to learn. Paul Johnson simultaneously had this idea (unfortunately, he wasn't in Tony's kitchen). This is it.

    SEE ALSO

    • Test::More

      More testing functions! Once you outgrow Test::Simple, look at Test::More. Test::Simple is 100% forward compatible with Test::More (i.e. you can just use Test::More instead of Test::Simple in your programs and things will still work).

    Look in Test::More's SEE ALSO for more testing modules.

    AUTHORS

    Idea by Tony Bowden and Paul Johnson, code by Michael G Schwern <schwern@pobox.com>, wardrobe by Calvin Klein.

    COPYRIGHT

    Copyright 2001-2008 by Michael G Schwern <schwern@pobox.com>.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    See http://www.perl.com/perl/misc/Artistic.html

     
    Test::Builder::Module

    NAME

    Test::Builder::Module - Base class for test modules

    SYNOPSIS

    # Emulates Test::Simple
    package Your::Module;

    my $CLASS = __PACKAGE__;

    use base 'Test::Builder::Module';
    our @EXPORT = qw(ok);

    sub ok ($;$) {
        my $tb = $CLASS->builder;
        return $tb->ok(@_);
    }

    1;

    DESCRIPTION

    This is a superclass for Test::Builder-based modules. It provides a handful of common functionality and a method of getting at the underlying Test::Builder object.

    Importing

    Test::Builder::Module is a subclass of Exporter which means your module is also a subclass of Exporter. @EXPORT, @EXPORT_OK, etc... all act normally.

    A few methods are provided to do the use Your::Module tests => 23 part for you.

    import

    Test::Builder::Module provides an import() method which acts in the same basic way as Test::More's, setting the plan and controlling exporting of functions and variables. This allows your module to set the plan independent of Test::More.

    All arguments passed to import() are passed onto Your::Module->builder->plan() with the exception of import =>[qw(things to import)] .

    1. use Your::Module import => [qw(this that)], tests => 23;

    says to import the functions this() and that() as well as set the plan to be 23 tests.

    import() also sets the exported_to() attribute of your builder to be the caller of the import() function.

    Additional behaviors can be added to your import() method by overriding import_extra().

    import_extra

    1. Your::Module->import_extra(\@import_args);

    import_extra() is called by import(). It provides an opportunity for you to add behaviors to your module based on its import list.

    Any extra arguments which shouldn't be passed on to plan() should be stripped off by this method.

    See Test::More for an example of its use.

    NOTE This mechanism is VERY ALPHA AND LIKELY TO CHANGE as it feels like a bit of an ugly hack in its current form.
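To illustrate how import_extra() can strip its own arguments before the rest reach plan(), here is a single-file sketch. The module name My::Test::Module, the verbose option and the is_even() function are all hypothetical. import() is called at run time here only so the example fits in one file; a real module would live in its own file and be loaded with use.

```perl
use strict;
use warnings;

package My::Test::Module;    # hypothetical test module
use base 'Test::Builder::Module';

our @EXPORT  = qw(is_even);
our $Verbose = 0;

# Called by import() with a reference to the import list. Remove the
# hypothetical "verbose" pair so that only plan arguments remain.
sub import_extra {
    my ( $class, $args ) = @_;

    my @keep;
    while (@$args) {
        my $item = shift @$args;
        if ( defined $item and $item eq 'verbose' ) {
            $Verbose = shift @$args;
        }
        else {
            push @keep, $item;
        }
    }
    @$args = @keep;
}

sub is_even {
    my ( $n, $name ) = @_;
    my $tb = __PACKAGE__->builder;
    $tb->note("checking $n") if $Verbose;
    return $tb->ok( $n % 2 == 0, $name );
}

package main;

# Roughly what "use My::Test::Module verbose => 1, tests => 1" would do.
My::Test::Module->import( verbose => 1, tests => 1 );

is_even( 4, 'four is even' );
```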

    Builder

    Test::Builder::Module provides some methods of getting at the underlying Test::Builder object.

    builder

    1. my $builder = Your::Class->builder;

    This method returns the Test::Builder object associated with Your::Class. It is not a constructor so you can call it as often as you like.

    This is the preferred way to get the Test::Builder object. You should not get it via Test::Builder->new as was previously recommended.

    The object returned by builder() may change at runtime so you should call builder() inside each function rather than store it in a global.

    sub ok {
        my $builder = Your::Class->builder;

        return $builder->ok(@_);
    }
     
    Test::Builder::Tester

    NAME

    Test::Builder::Tester - test testsuites that have been built with Test::Builder

    SYNOPSIS

    use Test::Builder::Tester tests => 1;
    use Test::More;

    test_out("not ok 1 - foo");
    test_fail(+1);
    fail("foo");
    test_test("fail works");

    DESCRIPTION

    A module that helps you test testing modules that are built with Test::Builder.

    The testing system is designed to be used by performing a three step process for each test you wish to test. This process starts with using test_out and test_err in advance to declare what the testsuite you are testing will output with Test::Builder to stdout and stderr.

    You then can run the test(s) from your test suite that call Test::Builder. At this point the output of Test::Builder is safely captured by Test::Builder::Tester rather than being interpreted as real test output.

    The final stage is to call test_test that will simply compare what you predeclared to what Test::Builder actually outputted, and report the results back with a "ok" or "not ok" (with debugging) to the normal output.

    Functions

    These are the six functions that are exported by default.

    • test_out
    • test_err

      Procedures for predeclaring the output that your test suite is expected to produce until test_test is called. These procedures automatically assume that each line terminates with "\n". So

      1. test_out("ok 1","ok 2");

      is the same as

      1. test_out("ok 1\nok 2");

      which is even the same as

      1. test_out("ok 1");
      2. test_out("ok 2");

      Once test_out or test_err (or test_fail or test_diag ) have been called, all further output from Test::Builder will be captured by Test::Builder::Tester. This means that you will not be able to perform further tests to the normal output in the normal way until you call test_test (well, unless you manually meddle with the output filehandles).

    • test_fail

      Because the standard failure message that Test::Builder produces whenever a test fails will be a common occurrence in your test error output, and because it has changed between Test::Builder versions, rather than forcing you to call test_err with the string all the time like so

      1. test_err("# Failed test ($0 at line ".line_num(+1).")");

      test_fail exists as a convenience function that can be called instead. It takes one argument, the offset from the current line that the line that causes the fail is on.

      1. test_fail(+1);

      This means that the example in the synopsis could be rewritten more simply as:

      1. test_out("not ok 1 - foo");
      2. test_fail(+1);
      3. fail("foo");
      4. test_test("fail works");
    • test_diag

      As most of the remaining expected output to the error stream will be created by Test::Builder's diag function, Test::Builder::Tester provides a convenience function test_diag that you can use instead of test_err .

      The test_diag function prepends comment hashes and spacing to the start and newlines to the end of the expected output passed to it and adds it to the list of expected error output. So, instead of writing

      1. test_err("# Couldn't open file");

      you can write

      1. test_diag("Couldn't open file");

      Remember that Test::Builder's diag function will not add newlines to the end of output and test_diag will. So to check

      1. Test::Builder->new->diag("foo\n","bar\n");

      You would do

      1. test_diag("foo","bar")

      without the newlines.

    • test_test

      Actually performs the output check testing the tests, comparing the data (with eq ) that we have captured from Test::Builder against what was declared with test_out and test_err .

      This takes name/value pairs that affect how the test is run.

      • title (synonym 'name', 'label')

        The name of the test that will be displayed after the ok or not ok .

      • skip_out

        Setting this to a true value will cause the test to ignore if the output sent by the test to the output stream does not match that declared with test_out .

      • skip_err

        Setting this to a true value will cause the test to ignore if the output sent by the test to the error stream does not match that declared with test_err .

      As a convenience, if only one argument is passed then this argument is assumed to be the name of the test (as in the above examples.)

      Once test_test has been run test output will be redirected back to the original filehandles that Test::Builder was connected to (probably STDOUT and STDERR,) meaning any further tests you run will function normally and cause success/errors for Test::Harness.

    • line_num

      A utility function that returns the line number that the function was called on. You can pass it an offset which will be added to the result. This is very useful for working out the correct text of diagnostic functions that contain line numbers.

      Essentially this is the same as the __LINE__ macro, but the line_num(+3) idiom is arguably nicer.
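The three-step process described above (declare, run, compare) can be sketched with the exported functions as follows. The test names are arbitrary; pass(), fail() and diag() come from Test::More.

```perl
use strict;
use warnings;
use Test::Builder::Tester tests => 2;
use Test::More;

# A passing test: declare the expected output, run it, compare.
test_out("ok 1 - passes");
pass("passes");
test_test("pass() prints what was declared");

# A failing test that also emits a diagnostic message.
test_out("not ok 1 - fails");
test_fail(+2);                  # the failing call is two lines below
test_diag("extra detail");
fail("fails") or diag("extra detail");
test_test("fail() and diag() print what was declared");
```

Test numbering restarts at 1 inside each captured section, which is why both test_out() declarations expect test number 1.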

    In addition to the six exported functions there exists one function that can only be accessed with a fully qualified function call.

    • color

      When test_test is called and the output that your tests generate does not match that which you declared, test_test will print out debug information showing the two conflicting versions. As this output itself is debug information it can be confusing which part of the output is from test_test and which was the original output from your original tests. Also, it may be hard to spot things like extraneous whitespace at the end of lines that may cause your test to fail even though the output looks similar.

      To assist you test_test can colour the background of the debug information to disambiguate the different types of output. The debug output will have its background coloured green and red. The green part represents the text which is the same between the expected and actual output, the red shows which part differs.

      The color function determines if colouring should occur or not. Passing it a true or false value will enable or disable colouring respectively, and the function called with no argument will return the current setting.

      To enable colouring from the command line, you can use the Test::Builder::Tester::Color module like so:

      perl -MTest::Builder::Tester::Color test.t

      Or by including the Test::Builder::Tester::Color module directly in the PERL5LIB.

    BUGS

    Calls Test::Builder->no_ending, turning off the ending tests. This is needed as otherwise it will trip out because we've run more tests than we strictly should have and it'll register any failures we had that we were testing for as real failures.

    The color function doesn't work unless Term::ANSIColor is compatible with your terminal.

    Bugs (and requests for new features) can be reported to the author through the CPAN RT system: http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Test-Builder-Tester

    AUTHOR

    Copyright Mark Fowler <mark@twoshortplanks.com> 2002, 2004.

    Some code taken from Test::More and Test::Catch, written by Michael G Schwern <schwern@pobox.com>. Hence, those parts Copyright Michael G Schwern 2001. Used and distributed with permission.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    NOTES

    Thanks to Richard Clamp <richardc@unixbeard.net> for letting me use his testing system to try this module out on.

    SEE ALSO

    Test::Builder, Test::Builder::Tester::Color, Test::More.

     
    Test::Builder::Tester::Color

    NAME

    Test::Builder::Tester::Color - turn on colour in Test::Builder::Tester

    SYNOPSIS

    When running a test script

      perl -MTest::Builder::Tester::Color test.t

    DESCRIPTION

    Importing this module causes the subroutine color in Test::Builder::Tester to be called with a true value causing colour highlighting to be turned on in debug output.

    The sole purpose of this module is to enable colour highlighting from the command line.

    AUTHOR

    Copyright Mark Fowler <mark@twoshortplanks.com> 2002.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    BUGS

    This module will have no effect unless Term::ANSIColor is installed.

    SEE ALSO

    Test::Builder::Tester, Term::ANSIColor

     

    Term::ANSIColor

    NAME

    Term::ANSIColor - Color screen output using ANSI escape sequences

    SYNOPSIS

        use Term::ANSIColor;
        print color 'bold blue';
        print "This text is bold blue.\n";
        print color 'reset';
        print "This text is normal.\n";
        print colored("Yellow on magenta.", 'yellow on_magenta'), "\n";
        print "This text is normal.\n";
        print colored ['yellow on_magenta'], 'Yellow on magenta.', "\n";
        print colored ['red on_bright_yellow'], 'Red on bright yellow.', "\n";
        print colored ['bright_red on_black'], 'Bright red on black.', "\n";
        print "\n";

        # Map escape sequences back to color names.
        use Term::ANSIColor 1.04 qw(uncolor);
        my @names = uncolor('01;31');
        print join(q{ }, @names), "\n";

        # Strip all color escape sequences.  The string must be double-quoted
        # so that \e is interpolated as the escape character.
        use Term::ANSIColor 2.01 qw(colorstrip);
        print colorstrip "\e[1mThis is bold\e[0m", "\n";

        # Determine whether a color is valid.
        use Term::ANSIColor 2.02 qw(colorvalid);
        my $valid = colorvalid('blue bold', 'on_magenta');
        print "Color string is ", $valid ? "valid\n" : "invalid\n";

        # Create new aliases for colors.
        use Term::ANSIColor 4.00 qw(coloralias);
        coloralias('alert', 'red');
        print "Alert is ", coloralias('alert'), "\n";
        print colored("This is in red.", 'alert'), "\n";

        use Term::ANSIColor qw(:constants);
        print BOLD, BLUE, "This text is in bold blue.\n", RESET;

        use Term::ANSIColor qw(:constants);
        {
            local $Term::ANSIColor::AUTORESET = 1;
            print BOLD BLUE "This text is in bold blue.\n";
            print "This text is normal.\n";
        }

        use Term::ANSIColor 2.00 qw(:pushpop);
        print PUSHCOLOR RED ON_GREEN "This text is red on green.\n";
        print PUSHCOLOR BRIGHT_BLUE "This text is bright blue on green.\n";
        print RESET BRIGHT_BLUE "This text is just bright blue.\n";
        print POPCOLOR "Back to red on green.\n";
        print LOCALCOLOR GREEN ON_BLUE "This text is green on blue.\n";
        print "This text is red on green.\n";
        {
            local $Term::ANSIColor::AUTOLOCAL = 1;
            print ON_BLUE "This text is red on blue.\n";
            print "This text is red on green.\n";
        }
        print POPCOLOR "Back to whatever we started as.\n";

    DESCRIPTION

    This module has two interfaces, one through color() and colored() and the other through constants. It also offers the utility functions uncolor(), colorstrip(), colorvalid(), and coloralias(), which have to be explicitly imported to be used (see SYNOPSIS).

    See COMPATIBILITY for the versions of Term::ANSIColor that introduced particular features and the versions of Perl that included them.

    Supported Colors

    Terminal emulators that support color divide into three types: ones that support only eight colors, ones that support sixteen, and ones that support 256. This module provides the ANSI escape codes for all of them. These colors are referred to as ANSI colors 0 through 7 (normal), 8 through 15 (16-color), and 16 through 255 (256-color).

    Unfortunately, interpretation of colors 0 through 7 often depends on whether the emulator supports eight colors or sixteen colors. Emulators that only support eight colors (such as the Linux console) will display colors 0 through 7 with normal brightness and ignore colors 8 through 15, treating them the same as white. Emulators that support 16 colors, such as gnome-terminal, normally display colors 0 through 7 as dim or darker versions and colors 8 through 15 as normal brightness. On such emulators, the "normal" white (color 7) usually is shown as pale grey, requiring bright white (15) to be used to get a real white color. Bright black usually is a dark grey color, although some terminals display it as pure black. Some sixteen-color terminal emulators also treat normal yellow (color 3) as orange or brown, and bright yellow (color 11) as yellow.

    Following the normal convention of sixteen-color emulators, this module provides a pair of attributes for each color. For every normal color (0 through 7), the corresponding bright color (8 through 15) is obtained by prepending the string bright_ to the normal color name. For example, red is color 1 and bright_red is color 9. The same applies for background colors: on_red is the normal color and on_bright_red is the bright color. Capitalize these strings for the constant interface.
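
    The naming scheme can be checked by looking at the escape sequences that color() returns. The following is a small sketch; the numeric codes it reveals (30-37, 90-97, 40-47, and 100-107) are the standard SGR values rather than something stated above, and the %sequence hash is only an illustrative name:

    ```perl
    use strict;
    use warnings;
    use Term::ANSIColor 3.00;

    # Make sure escapes are emitted for this demo (newer releases of the
    # module also honor NO_COLOR).
    delete @ENV{qw(ANSI_COLORS_DISABLED NO_COLOR)};

    my @names    = qw(red bright_red on_red on_bright_red);
    my %sequence = map { $_ => color($_) } @names;

    # Print each sequence with the escape character made visible.
    for my $name (@names) {
        (my $visible = $sequence{$name}) =~ s/\e/\\e/g;
        printf "%-14s %s\n", $name, $visible;
    }

    # The same attribute names work with colored().
    print colored('normal red', 'red'), q{ },
        colored('bright red', 'bright_red'), "\n";
    ```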

    For 256-color emulators, this module additionally provides ansi0 through ansi15, which are the same as colors 0 through 15 in sixteen-color emulators but use the 256-color escape syntax, grey0 through grey23 ranging from nearly black to nearly white, and a set of RGB colors. The RGB colors are of the form rgbRGB where R, G, and B are numbers from 0 to 5 giving the intensity of red, green, and blue. on_ variants of all of these colors are also provided. These colors may be ignored completely on non-256-color terminals or may be misinterpreted and produce random behavior. Additional attributes such as blink, italic, or bold may not work with the 256-color palette.

    There is unfortunately no way to know whether the current emulator supports more than eight colors, which makes the choice of colors difficult. The most conservative choice is to use only the regular colors, which are at least displayed on all emulators. However, they will appear dark in sixteen-color terminal emulators, including most common emulators in UNIX X environments. If you know the display is one of those emulators, you may wish to use the bright variants instead. Even better, offer the user a way to configure the colors for a given application to fit their terminal emulator.

    Function Interface

    The function interface uses attribute strings to describe the colors and text attributes to assign to text. The recognized non-color attributes are clear, reset, bold, dark, faint, italic, underline, underscore, blink, reverse, and concealed. Clear and reset (reset to default attributes), dark and faint (dim and saturated), and underline and underscore are equivalent, so use whichever is the most intuitive to you.

    Note that not all attributes are supported by all terminal types, and some terminals may not support any of these sequences. Dark and faint, italic, blink, and concealed in particular are frequently not implemented.

    The recognized normal foreground color attributes (colors 0 to 7) are:

        black red green yellow blue magenta cyan white

    The corresponding bright foreground color attributes (colors 8 to 15) are:

        bright_black bright_red bright_green bright_yellow
        bright_blue bright_magenta bright_cyan bright_white

    The recognized normal background color attributes (colors 0 to 7) are:

        on_black on_red on_green on_yellow
        on_blue on_magenta on_cyan on_white

    The recognized bright background color attributes (colors 8 to 15) are:

        on_bright_black on_bright_red on_bright_green on_bright_yellow
        on_bright_blue on_bright_magenta on_bright_cyan on_bright_white

    For 256-color terminals, the recognized foreground colors are:

        ansi0 .. ansi15
        grey0 .. grey23

    plus rgbRGB for R, G, and B values from 0 to 5, such as rgb000 or rgb515. Similarly, the recognized background colors are:

        on_ansi0 .. on_ansi15
        on_grey0 .. on_grey23

    plus on_rgbRGB for R, G, and B values from 0 to 5.

    For any of the above listed attributes, case is not significant.
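
    As an illustration of the 256-color names, the sketch below asks color() for a few of them. The index arithmetic is not spelled out above: rgb_index is a hypothetical helper that assumes the usual xterm palette layout, with the 6x6x6 color cube starting at index 16.

    ```perl
    use strict;
    use warnings;
    use Term::ANSIColor 4.00;

    # Make sure escapes are emitted for this demo (newer releases of the
    # module also honor NO_COLOR).
    delete @ENV{qw(ANSI_COLORS_DISABLED NO_COLOR)};

    # Hypothetical helper, not part of Term::ANSIColor: palette index of
    # rgbRGB, assuming the xterm layout described in the lead-in.
    sub rgb_index {
        my ($red, $green, $blue) = @_;
        return 16 + 36 * $red + 6 * $green + $blue;
    }

    my $foreground = color('rgb515');      # a 256-color foreground
    my $background = color('on_grey0');    # darkest grey as background
    my $uppercase  = color('RGB515');      # case is not significant

    # Print the sequences with the escape character made visible.
    for my $sequence ($foreground, $background) {
        (my $visible = $sequence) =~ s/\e/\\e/g;
        print "$visible\n";
    }
    print 'rgb515 is palette index ', rgb_index(5, 1, 5), "\n";
    ```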

    Attributes, once set, last until they are unset (by printing the attribute clear or reset). Be careful to do this, or otherwise your attribute will last after your script is done running, and people get very annoyed at having their prompt and typing changed to weird colors.

    • color(ATTR[, ATTR ...])

      color() takes any number of strings as arguments and considers them to be space-separated lists of attributes. It then forms and returns the escape sequence to set those attributes. It doesn't print it out, just returns it, so you'll have to print it yourself if you want to. This is so that you can save it as a string, pass it to something else, send it to a file handle, or do anything else with it that you might care to. color() throws an exception if given an invalid attribute.

    • colored(STRING, ATTR[, ATTR ...])
    • colored(ATTR-REF, STRING[, STRING...])

      As an aid in resetting colors, colored() takes a scalar as the first argument and any number of attribute strings as the second argument and returns the scalar wrapped in escape codes so that the attributes will be set as requested before the string and reset to normal after the string. Alternately, you can pass a reference to an array as the first argument, and then the contents of that array will be taken as attributes and color codes and the remainder of the arguments as text to colorize.

      Normally, colored() just puts attribute codes at the beginning and end of the string, but if you set $Term::ANSIColor::EACHLINE to some string, that string will be considered the line delimiter and the attribute will be set at the beginning of each line of the passed string and reset at the end of each line. This is often desirable if the output contains newlines and you're using background colors, since a background color that persists across a newline is often interpreted by the terminal as providing the default background color for the next line. Programs like pagers can also be confused by attributes that span lines. Normally you'll want to set $Term::ANSIColor::EACHLINE to "\n" to use this feature.

    • uncolor(ESCAPE)

      uncolor() performs the opposite translation as color(), turning escape sequences into a list of strings corresponding to the attributes being set by those sequences.

    • colorstrip(STRING[, STRING ...])

      colorstrip() removes all color escape sequences from the provided strings, returning the modified strings separately in array context or joined together in scalar context. Its arguments are not modified.

    • colorvalid(ATTR[, ATTR ...])

      colorvalid() takes attribute strings the same as color() and returns true if all attributes are known and false otherwise.

    • coloralias(ALIAS[, ATTR])

      If ATTR is specified, coloralias() sets up an alias of ALIAS for the standard color ATTR. From that point forward, ALIAS can be passed into color(), colored(), and colorvalid() and will have the same meaning as ATTR. One possible use of this facility is to give more meaningful names to the 256-color RGB colors. Only alphanumerics, ., _, and - are allowed in alias names.

      If ATTR is not specified, coloralias() returns the standard color name to which ALIAS is aliased, if any, or undef if ALIAS does not exist.

      This is the same facility used by the ANSI_COLORS_ALIASES environment variable (see ENVIRONMENT below) but can be used at runtime, not just when the module is loaded.

      Later invocations of coloralias() with the same ALIAS will override earlier aliases. There is no way to remove an alias.

      Aliases have no effect on the return value of uncolor().

      WARNING: Aliases are global and affect all callers in the same process. There is no way to set an alias limited to a particular block of code or a particular object.
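
    The functions described above can be exercised together. This is a minimal sketch rather than a recommended structure; the strings, the alert alias, and the variable names are made up for the example, and results are kept in variables so they can be inspected instead of only printed.

    ```perl
    use strict;
    use warnings;
    use Term::ANSIColor 4.00
        qw(color colored uncolor colorstrip colorvalid coloralias);

    # Make sure escapes are emitted for this demo (newer releases of the
    # module also honor NO_COLOR).
    delete @ENV{qw(ANSI_COLORS_DISABLED NO_COLOR)};

    # color() only builds the escape sequence; printing is up to the caller.
    my $start = color('bold blue');
    my $stop  = color('reset');

    # colored() wraps the text in the attributes plus a trailing reset.
    # Both calling conventions give the same result.
    my $warning = colored('disk almost full', 'yellow');
    my $same    = colored(['yellow'], 'disk ', 'almost full');

    # With EACHLINE set, every line is wrapped separately.
    my $block;
    {
        local $Term::ANSIColor::EACHLINE = "\n";
        $block = colored("one\ntwo\n", 'on_blue');
    }

    # Utility functions.
    my @names = uncolor('01;31');
    my $plain = colorstrip($warning);
    my $good  = colorvalid('blue bold', 'on_magenta');
    my $bad   = colorvalid('blue', 'no_such_color');

    # Aliases are global for the rest of the process.
    coloralias('alert', 'red');
    my $target = coloralias('alert');
    my $alert  = colored('failure', 'alert');

    # color() throws an exception for unknown attributes.
    my $error = eval { color('no_such_color'); 1 } ? q{} : $@;

    print $start, 'bold blue text', $stop, "\n";
    print $warning, "\n";
    ```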

    Constant Interface

    Alternately, if you import :constants, you can use the following constants directly:

        CLEAR           RESET             BOLD            DARK
        FAINT           ITALIC            UNDERLINE       UNDERSCORE
        BLINK           REVERSE           CONCEALED

        BLACK           RED               GREEN           YELLOW
        BLUE            MAGENTA           CYAN            WHITE
        BRIGHT_BLACK    BRIGHT_RED        BRIGHT_GREEN    BRIGHT_YELLOW
        BRIGHT_BLUE     BRIGHT_MAGENTA    BRIGHT_CYAN     BRIGHT_WHITE

        ON_BLACK        ON_RED            ON_GREEN        ON_YELLOW
        ON_BLUE         ON_MAGENTA        ON_CYAN         ON_WHITE
        ON_BRIGHT_BLACK ON_BRIGHT_RED     ON_BRIGHT_GREEN ON_BRIGHT_YELLOW
        ON_BRIGHT_BLUE  ON_BRIGHT_MAGENTA ON_BRIGHT_CYAN  ON_BRIGHT_WHITE

    These are the same as color('attribute') and can be used if you prefer typing:

        print BOLD BLUE ON_WHITE "Text", RESET, "\n";

    to

        print colored ("Text", 'bold blue on_white'), "\n";

    (Note that the newline is kept separate to avoid confusing the terminal as described above since a background color is being used.)

    If you import :constants256, you can use the following constants directly:

        ANSI0 .. ANSI15
        GREY0 .. GREY23

        RGBXYZ (for X, Y, and Z values from 0 to 5, like RGB000 or RGB515)

        ON_ANSI0 .. ON_ANSI15
        ON_GREY0 .. ON_GREY23

        ON_RGBXYZ (for X, Y, and Z values from 0 to 5)

    Note that :constants256 does not include the other constants, so if you want to mix both, you need to include :constants as well. You may want to explicitly import at least RESET, as in:

        use Term::ANSIColor 4.00 qw(RESET :constants256);

    When using the constants, if you don't want to have to remember to add the , RESET at the end of each print line, you can set $Term::ANSIColor::AUTORESET to a true value. Then, the display mode will automatically be reset if there is no comma after the constant. In other words, with that variable set:

        print BOLD BLUE "Text\n";

    will reset the display mode afterward, whereas:

        print BOLD, BLUE, "Text\n";

    will not. If you are using background colors, you will probably want to either use say() (in newer versions of Perl) or print the newline with a separate print statement to avoid confusing the terminal.

    If $Term::ANSIColor::AUTOLOCAL is set (see below), it takes precedence over $Term::ANSIColor::AUTORESET, and the latter is ignored.

    The subroutine interface has the advantage over the constants interface in that only two subroutines are exported into your namespace, versus dozens in the constants interface. On the flip side, the constants interface has the advantage of better compile time error checking, since misspelled names of colors or attributes in calls to color() and colored() won't be caught until runtime whereas misspelled names of constants will be caught at compile time. So, pollute your namespace with dozens of subroutines that you may not even use that often, or risk a silly bug by mistyping an attribute. Your choice, TMTOWTDI after all.
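
    The difference between the comma and no-comma forms is easiest to see by capturing the strings instead of printing them. A small sketch, with illustrative variable names:

    ```perl
    use strict;
    use warnings;
    use Term::ANSIColor qw(:constants colorstrip);

    # Make sure escapes are emitted for this demo (newer releases of the
    # module also honor NO_COLOR).
    delete @ENV{qw(ANSI_COLORS_DISABLED NO_COLOR)};

    # With commas, each constant is just its own escape sequence, so the
    # reset has to be added by hand.
    my $manual = join q{}, BOLD, BLUE, 'Text', RESET;

    # Without commas and with AUTORESET set, the constants wrap the text
    # and append the reset themselves.
    my $auto;
    {
        local $Term::ANSIColor::AUTORESET = 1;
        $auto = BOLD BLUE 'Text';
    }

    # Both forms carry the same visible text.
    my $visible_manual = colorstrip($manual);
    my $visible_auto   = colorstrip($auto);
    ```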

    The Color Stack

    You can import :pushpop and maintain a stack of colors using PUSHCOLOR, POPCOLOR, and LOCALCOLOR. PUSHCOLOR takes the attribute string that starts its argument and pushes it onto a stack of attributes. POPCOLOR removes the top of the stack and restores the previous attributes set by the argument of a prior PUSHCOLOR. LOCALCOLOR surrounds its argument in a PUSHCOLOR and POPCOLOR so that the color resets afterward.

    If $Term::ANSIColor::AUTOLOCAL is set, each sequence of color constants will be implicitly preceded by LOCALCOLOR. In other words, the following:

        {
            local $Term::ANSIColor::AUTOLOCAL = 1;
            print BLUE "Text\n";
        }

    is equivalent to:

        print LOCALCOLOR BLUE "Text\n";

    If $Term::ANSIColor::AUTOLOCAL is set, it takes precedence over $Term::ANSIColor::AUTORESET, and the latter is ignored.

    When using PUSHCOLOR, POPCOLOR, and LOCALCOLOR, it's particularly important to not put commas between the constants.

        print PUSHCOLOR BLUE "Text\n";

    will correctly push BLUE onto the top of the stack.

        print PUSHCOLOR, BLUE, "Text\n";    # wrong!

    will not, and a subsequent pop won't restore the correct attributes. PUSHCOLOR pushes the attributes set by its argument, which is normally a string of color constants. It can't ask the terminal what the current attributes are.
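
    The stack behavior can be observed by building one line out of several pieces. A minimal sketch, assuming the stack is empty when it starts; the text fragments and variable names are arbitrary:

    ```perl
    use strict;
    use warnings;
    use Term::ANSIColor 2.01 qw(:constants :pushpop colorstrip);

    # Make sure escapes are emitted for this demo (newer releases of the
    # module also honor NO_COLOR).
    delete @ENV{qw(ANSI_COLORS_DISABLED NO_COLOR)};

    # Push red; it stays in effect until the matching POPCOLOR.
    my $outer = PUSHCOLOR RED 'outer ';

    # Green applies to this fragment only; red is restored right after it.
    my $inner = LOCALCOLOR GREEN 'inner ';

    my $resume = 'still red ';

    # Popping the last entry falls back to a plain reset.
    my $done = POPCOLOR "default\n";

    my $line = $outer . $inner . $resume . $done;
    print $line;
    ```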

    DIAGNOSTICS

    • Bad color mapping %s

      (W) The specified color mapping from ANSI_COLORS_ALIASES is not valid and could not be parsed. It was ignored.

    • Bad escape sequence %s

      (F) You passed an invalid ANSI escape sequence to uncolor().

    • Bareword "%s" not allowed while "strict subs" in use

      (F) You probably mistyped a constant color name such as:

          $Foobar = FOOBAR . "This line should be blue\n";

      or:

          @Foobar = FOOBAR, "This line should be blue\n";

      This will only show up under use strict (another good reason to run under use strict).

    • Cannot alias standard color %s

      (F) The alias name passed to coloralias() matches a standard color name. Standard color names cannot be aliased.

    • Cannot alias standard color %s in %s

      (W) The same, but in ANSI_COLORS_ALIASES. The color mapping was ignored.

    • Invalid alias name %s

      (F) You passed an invalid alias name to coloralias(). Alias names must consist only of alphanumerics, ., -, and _.

    • Invalid alias name %s in %s

      (W) You specified an invalid alias name on the left hand of the equal sign in a color mapping in ANSI_COLORS_ALIASES. The color mapping was ignored.

    • Invalid attribute name %s

      (F) You passed an invalid attribute name to color(), colored(), or coloralias().

    • Invalid attribute name %s in %s

      (W) You specified an invalid attribute name on the right hand of the equal sign in a color mapping in ANSI_COLORS_ALIASES. The color mapping was ignored.

    • Name "%s" used only once: possible typo

      (W) You probably mistyped a constant color name such as:

          print FOOBAR "This text is color FOOBAR\n";

      It's probably better to always use commas after constant names in order to force the next error.

    • No comma allowed after filehandle

      (F) You probably mistyped a constant color name such as:

          print FOOBAR, "This text is color FOOBAR\n";

      Generating this fatal compile error is one of the main advantages of using the constants interface, since you'll immediately know if you mistype a color name.

    • No name for escape sequence %s

      (F) The ANSI escape sequence passed to uncolor() contains escapes which aren't recognized and can't be translated to names.

    ENVIRONMENT

    • ANSI_COLORS_ALIASES

      This environment variable allows the user to specify custom color aliases that will be understood by color(), colored(), and colorvalid(). None of the other functions will be affected, and no new color constants will be created. The custom colors are aliases for existing color names; no new escape sequences can be introduced. Only alphanumerics, ., _, and - are allowed in alias names.

      The format is:

          ANSI_COLORS_ALIASES='newcolor1=oldcolor1,newcolor2=oldcolor2'

      Whitespace is ignored.

      For example the Solarized colors can be mapped with:

          ANSI_COLORS_ALIASES='\
              base00=bright_yellow, on_base00=on_bright_yellow,\
              base01=bright_green,  on_base01=on_bright_green, \
              base02=black,         on_base02=on_black, \
              base03=bright_black,  on_base03=on_bright_black, \
              base0=bright_blue,    on_base0=on_bright_blue, \
              base1=bright_cyan,    on_base1=on_bright_cyan, \
              base2=white,          on_base2=on_white, \
              base3=bright_white,   on_base3=on_bright_white, \
              orange=bright_red,    on_orange=on_bright_red, \
              violet=bright_magenta,on_violet=on_bright_magenta'

      This environment variable is read and applied when the Term::ANSIColor module is loaded and is then subsequently ignored. Changes to ANSI_COLORS_ALIASES after the module is loaded will have no effect. See coloralias() for an equivalent facility that can be used at runtime.

    • ANSI_COLORS_DISABLED

      If this environment variable is set to a true value, all of the functions defined by this module (color(), colored(), and all of the constants not previously used in the program) will not output any escape sequences and instead will just return the empty string or pass through the original text as appropriate. This is intended to support easy use of scripts using this module on platforms that don't support ANSI escape sequences.
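
    The effect of ANSI_COLORS_DISABLED can be seen by setting it temporarily from within a script. A short sketch; the variable names are illustrative, and setting the variable with local is only one way to do it (normally the user would set it in the shell):

    ```perl
    use strict;
    use warnings;
    use Term::ANSIColor;

    my ($sequence, $text);
    {
        # Pretend the user has disabled color for this block only.
        local $ENV{ANSI_COLORS_DISABLED} = 1;

        $sequence = color('bold red');             # no escape sequence
        $text     = colored('plain', 'bold red');  # text passes through
    }

    print "color() returned a string of length ", length($sequence), "\n";
    print "colored() returned '$text'\n";
    ```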

    COMPATIBILITY

    Term::ANSIColor was first included with Perl in Perl 5.6.0.

    The uncolor() function and support for ANSI_COLORS_DISABLED were added in Term::ANSIColor 1.04, included in Perl 5.8.0.

    Support for dark was added in Term::ANSIColor 1.08, included in Perl 5.8.4.

    The color stack, including the :pushpop import tag, PUSHCOLOR, POPCOLOR, LOCALCOLOR, and the $Term::ANSIColor::AUTOLOCAL variable, was added in Term::ANSIColor 2.00, included in Perl 5.10.1.

    colorstrip() was added in Term::ANSIColor 2.01 and colorvalid() was added in Term::ANSIColor 2.02, both included in Perl 5.11.0.

    Support for colors 8 through 15 (the bright_ variants) was added in Term::ANSIColor 3.00, included in Perl 5.13.3.

    Support for italic was added in Term::ANSIColor 3.02, included in Perl 5.17.1.

    Support for colors 16 through 255 (the ansi, rgb, and grey colors), the :constants256 import tag, the coloralias() function, and support for the ANSI_COLORS_ALIASES environment variable were added in Term::ANSIColor 4.00.

    $Term::ANSIColor::AUTOLOCAL was changed to take precedence over $Term::ANSIColor::AUTORESET, rather than the other way around, in Term::ANSIColor 4.00.

    RESTRICTIONS

    It would be nice if one could leave off the commas around the constants entirely and just say:

        print BOLD BLUE ON_WHITE "Text\n" RESET;

    but the syntax of Perl doesn't allow this. You need a comma after the string. (Of course, you may consider it a bug that commas between all the constants aren't required, in which case you may feel free to insert commas unless you're using $Term::ANSIColor::AUTORESET or PUSHCOLOR/POPCOLOR.)

    For easier debugging, you may prefer to always use the commas when not setting $Term::ANSIColor::AUTORESET or PUSHCOLOR/POPCOLOR so that you'll get a fatal compile error rather than a warning.

    It's not possible to use this module to embed formatting and color attributes using Perl formats. They replace the escape character with a space (as documented in perlform(1)), resulting in garbled output from the unrecognized attribute. Even if there were a way around that problem, the format doesn't know that the non-printing escape sequence is zero-length and would incorrectly format the output. For formatted output using color or other attributes, either use sprintf() instead or use formline() and then add the color or other attributes after formatting and before output.
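
    The sprintf() approach amounts to padding the text while it is still plain and adding the color afterward, so the escape sequences never take part in the width calculation. A minimal sketch with made-up status rows and variable names:

    ```perl
    use strict;
    use warnings;
    use Term::ANSIColor 2.01 qw(colored colorstrip);

    # Make sure escapes are emitted for this demo (newer releases of the
    # module also honor NO_COLOR).
    delete @ENV{qw(ANSI_COLORS_DISABLED NO_COLOR)};

    my @rows = (['ok', 'build'], ['FAILED', 'tests']);
    my @lines;
    for my $row (@rows) {
        my ($status, $step) = @{$row};

        # Pad to a fixed width first, while the text has no escapes in it.
        my $cell = sprintf '%-8s', $status;

        # Then wrap the already padded cell in color.
        my $attr = $status eq 'ok' ? 'green' : 'bold red';
        push @lines, colored($cell, $attr) . " $step";
    }

    print "$_\n" for @lines;
    ```

    Because the padding is part of the colored cell, the second column lines up regardless of how long the escape sequences are.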

    NOTES

    The codes generated by this module are standard terminal control codes, complying with ECMA-048 and ISO 6429 (generally referred to as "ANSI color" for the color codes). The non-color control codes (bold, dark, italic, underline, and reverse) are part of the earlier ANSI X3.64 standard for control sequences for video terminals and peripherals.

    Note that not all displays are ISO 6429-compliant, or even X3.64-compliant (or are even attempting to be so). This module will not work as expected on displays that do not honor these escape sequences, such as cmd.exe, 4nt.exe, and command.com under either Windows NT or Windows 2000. They may just be ignored, or they may display as an ESC character followed by some apparent garbage.

    Jean Delvare provided the following table of different common terminal emulators and their support for the various attributes and others have helped me flesh it out:

                      clear   bold     faint  under  blink       reverse  conceal
        -------------------------------------------------------------------------
        xterm         yes     yes      no     yes    yes         yes      yes
        linux         yes     yes      yes    bold   yes         yes      no
        rxvt          yes     yes      no     yes    bold/black  yes      no
        dtterm        yes     yes      yes    yes    reverse     yes      yes
        teraterm      yes     reverse  no     yes    rev/red     yes      no
        aixterm       kinda   normal   no     yes    no          yes      yes
        PuTTY         yes     color    no     yes    no          yes      no
        Windows       yes     no       no     no     no          yes      no
        Cygwin SSH    yes     yes      no     color  color       color    yes
        Terminal.app  yes     yes      no     yes    yes         yes      yes

    Windows is Windows telnet, Cygwin SSH is the OpenSSH implementation under Cygwin on Windows NT, and Terminal.app is the Terminal application in Mac OS X. Where the entry is other than yes or no, that emulator displays the given attribute as something else instead. Note that on an aixterm, clear doesn't reset colors; you have to explicitly set the colors back to what you want. More entries in this table are welcome.

    Support for code 3 (italic) is rare and therefore not mentioned in that table. It is not believed to be fully supported by any of the terminals listed (the Linux console displays it as green), but it is reportedly supported by urxvt.

    Note that codes 6 (rapid blink) and 9 (strike-through) are specified in ANSI X3.64 and ECMA-048 but are not commonly supported by most displays and emulators and therefore aren't supported by this module at the present time. ECMA-048 also specifies a large number of other attributes, including a sequence of attributes for font changes, Fraktur characters, double-underlining, framing, circling, and overlining. As none of these attributes are widely supported or useful, they also aren't currently supported by this module.

    Most modern X terminal emulators support 256 colors. Known to not support those colors are aterm, rxvt, Terminal.app, and TTY/VC.

    SEE ALSO

    ECMA-048 is available on-line (at least at the time of this writing) at http://www.ecma-international.org/publications/standards/Ecma-048.htm.

    ISO 6429 is available from ISO for a charge; the author of this module does not own a copy of it. Since the source material for ISO 6429 was ECMA-048 and the latter is available for free, there seems little reason to obtain the ISO standard.

    The 256-color control sequences are documented at http://www.xfree86.org/current/ctlseqs.html (search for 256-color).

    The CPAN module Term::ExtendedColor provides a different and more comprehensive interface for 256-color emulators that may be more convenient.

    The current version of this module is always available from its web site at http://www.eyrie.org/~eagle/software/ansicolor/. It is also part of the Perl core distribution as of 5.6.0.

    AUTHORS

    Original idea (using constants) by Zenin, reimplemented using subs by Russ Allbery <rra@stanford.edu>, and then combined with the original idea by Russ with input from Zenin. 256-color support is based on work by Kurt Starsinic. Russ Allbery now maintains this module.

    PUSHCOLOR, POPCOLOR, and LOCALCOLOR were contributed by openmethods.com voice solutions.

    COPYRIGHT AND LICENSE

    Copyright 1996 Zenin. Copyright 1996, 1997, 1998, 2000, 2001, 2002, 2005, 2006, 2008, 2009, 2010, 2011, 2012 Russ Allbery <rra@stanford.edu>. Copyright 2012 Kurt Starsinic <kstarsinic@gmail.com>. This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

     

    Term::Cap

    NAME

    Term::Cap - Perl termcap interface

    SYNOPSIS

        require Term::Cap;
        $terminal = Tgetent Term::Cap { TERM => undef, OSPEED => $ospeed };
        $terminal->Trequire(qw/ce ku kd/);
        $terminal->Tgoto('cm', $col, $row, $FH);
        $terminal->Tputs('dl', $count, $FH);
        $terminal->Tpad($string, $count, $FH);

    DESCRIPTION

    These are low-level functions to extract and use capabilities from a terminal capability (termcap) database.

    More information on the terminal capabilities will be found in the termcap manpage on most Unix-like systems.

    METHODS

    The output strings for Tputs are cached for counts of 1 for performance. Tgoto and Tpad do not cache. $self->{_xx} is the raw termcap data and $self->{xx} is the cached version.

        print $terminal->Tpad($self->{_xx}, 1);

    Tgoto, Tputs, and Tpad return the string and will also output the string to $FH if specified.

    • Tgetent

      Returns a blessed object reference which the user can then use to send the control strings to the terminal using Tputs and Tgoto.

      The function extracts the entry of the specified terminal type TERM (defaults to the environment variable TERM) from the database.

      It will look in the environment for a TERMCAP variable. If found, and the value does not begin with a slash, and the terminal type name is the same as the environment string TERM, the TERMCAP string is used instead of reading a termcap file. If it does begin with a slash, the string is used as a path name of the termcap file to search. If TERMCAP does not begin with a slash and name is different from TERM, Tgetent searches the files $HOME/.termcap, /etc/termcap, and /usr/share/misc/termcap, in that order, unless the environment variable TERMPATH exists, in which case it specifies a list of file pathnames (separated by spaces or colons) to be searched instead. Whenever multiple files are searched and a tc field occurs in the requested entry, the entry it names must be found in the same file or one of the succeeding files. If there is a :tc=...: in the TERMCAP environment variable string it will continue the search in the files as above.

      The extracted termcap entry is available in the object as $self->{TERMCAP} .

      It takes a hash reference as an argument with two optional keys:

      • OSPEED

        The terminal output bit rate (often mistakenly called the baud rate) for this terminal. If it is not set, a warning will be generated and it will default to 9600. OSPEED can be specified as either a POSIX termios/SYSV termio speed (where 9600 equals 9600) or an old BSD-style speed (where 13 equals 9600).

      • TERM

        The terminal type whose termcap entry will be used - if not supplied it will default to $ENV{TERM}: if that is not set then Tgetent will croak.

      It calls croak on failure.

    • Tpad

      Outputs a literal string with appropriate padding for the current terminal.

      It takes three arguments:

      • $string

        The literal string to be output. If it starts with a number and an optional '*' then the padding will be increased by an amount relative to this number; if the '*' is present then this amount will be multiplied by $cnt. This part of $string is removed before output.

      • $cnt

        Will be used to modify the padding applied to string as described above.

      • $FH

        An optional filehandle (or IO::Handle ) that output will be printed to.

      The padded $string is returned.

    • Tputs

      Output the string for the given capability padded as appropriate without any parameter substitution.

      It takes three arguments:

      • $cap

        The capability whose string is to be output.

      • $cnt

        A count passed to Tpad to modify the padding applied to the output string. If $cnt is zero or one then the resulting string will be cached.

      • $FH

        An optional filehandle (or IO::Handle ) that output will be printed to.

      The appropriate string for the capability will be returned.

    • Tgoto

      Tgoto decodes a cursor addressing string with the given parameters.

      There are four arguments:

      • $cap

        The name of the capability to be output.

      • $col

        The first value to be substituted in the output string (usually the column in a cursor addressing capability)

      • $row

        The second value to be substituted in the output string (usually the row in cursor addressing capabilities)

      • $FH

        An optional filehandle (or IO::Handle ) to which the output string will be printed.

      Substitutions are made with $col and $row in the output string with the following sprintf()-like formats:

          %%    output `%'
          %d    output value as in printf %d
          %2    output value as in printf %2d
          %3    output value as in printf %3d
          %.    output value as in printf %c
          %+x   add x to value, then do %.
          %>xy  if value > x then add y, no output
          %r    reverse order of two parameters, no output
          %i    increment by one, no output
          %B    BCD (16*(value/10)) + (value%10), no output
          %n    exclusive-or all parameters with 0140 (Datamedia 2500)
          %D    Reverse coding (value - 2*(value%16)), no output (Delta Data)

      The output string will be returned.

    • Trequire

      Takes a list of capabilities as an argument and will croak if one is not found.
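
To make the Tgoto substitution rules concrete, here is a minimal sketch that expands a subset of the formats listed above (%%, %d, %2, %3, %., %r and %i). The helper expand_goto is hypothetical and is not part of Term::Cap. It assumes, following the usual termcap convention, that the row is consumed before the column, and it neither looks up capabilities nor applies padding.

```perl
use strict;
use warnings;

# Hypothetical helper (not Term::Cap's implementation): expands a subset
# of the cursor-addressing formats in $template using $col and $row.
sub expand_goto {
    my ( $template, $col, $row ) = @_;
    my @params = ( $row, $col );    # assumption: row is substituted first
    my $out    = '';

    # Peel off one "%x" escape at a time, copying literal text through.
    while ( $template =~ /^([^%]*)%(.)(.*)$/s ) {
        $out .= $1;
        my $code = $2;
        $template = $3;

        if    ( $code eq '%' ) { $out .= '%' }
        elsif ( $code eq 'd' ) { $out .= sprintf '%d',  shift @params }
        elsif ( $code eq '2' ) { $out .= sprintf '%2d', shift @params }
        elsif ( $code eq '3' ) { $out .= sprintf '%3d', shift @params }
        elsif ( $code eq '.' ) { $out .= sprintf '%c',  shift @params }
        elsif ( $code eq 'r' ) { @params = reverse @params }
        elsif ( $code eq 'i' ) { $_++ for @params }
        else                   { die "unsupported format %$code\n" }
    }
    return $out . $template;    # whatever is left contains no escapes
}

# An ANSI-style cursor-motion template (illustrative, not read from termcap).
my $seq = expand_goto( "\e[%i%d;%dH", 5, 10 );    # column 5, row 10
( my $printable = $seq ) =~ s/\e/ESC/g;
print "$printable\n";                             # prints ESC[11;6H
```

The %i escape increments both parameters, which is why a zero-based column 5 and row 10 come out as 6 and 11.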

    EXAMPLES

        use Term::Cap;

        # Get terminal output speed
        require POSIX;
        my $termios = POSIX::Termios->new;
        $termios->getattr;
        my $ospeed = $termios->getospeed;

        # Old-style ioctl code to get ospeed:
        #     require 'ioctl.pl';
        #     ioctl(TTY,$TIOCGETP,$sgtty);
        #     ($ispeed,$ospeed) = unpack('cc',$sgtty);

        # allocate and initialize a terminal structure
        my $terminal = Term::Cap->Tgetent({ TERM => undef, OSPEED => $ospeed });

        # require certain capabilities to be available
        $terminal->Trequire(qw/ce ku kd/);

        # Output Routines, if $FH is undefined these just return the string

        # Tgoto does the % expansion stuff with the given args
        $terminal->Tgoto('cm', $col, $row, $FH);

        # Tputs doesn't do any % expansion.
        $terminal->Tputs('dl', $count = 1, $FH);

    COPYRIGHT AND LICENSE

    Please see the README file in distribution.

    AUTHOR

    This module is part of the core Perl distribution and is also maintained for CPAN by Jonathan Stowe <jns@gellyfish.com>.

    SEE ALSO

    termcap(5)

     
    Term::Complete

    NAME

    Term::Complete - Perl word completion module

    SYNOPSIS

        $input = Complete('prompt_string', \@completion_list);
        $input = Complete('prompt_string', @completion_list);

    DESCRIPTION

    This routine provides word completion on the list of words in the array (or array ref).

    The tty driver is put into raw mode and restored using an operating-system-specific command; in UNIX-like environments this is stty.

    The following command characters are defined:

    • <tab>

      Attempts word completion. Cannot be changed.

    • ^D

      Prints completion list. Defined by $Term::Complete::complete.

    • ^U

      Erases the current input. Defined by $Term::Complete::kill.

    • <del>, <bs>

      Erases one character. Defined by $Term::Complete::erase1 and $Term::Complete::erase2.
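
The command characters above are ordinary package variables, so they can be reassigned before calling Complete. The following is a small sketch: the choice of Ctrl-F and Ctrl-X and the word list are illustrative assumptions, and the call is guarded with -t STDIN because Complete puts the tty into raw mode and so only makes sense interactively.

```perl
use strict;
use warnings;
use Term::Complete;

# Rebind the "print completion list" and "erase input" keys.
$Term::Complete::complete = "\cF";    # Ctrl-F instead of the default Ctrl-D
$Term::Complete::kill     = "\cX";    # Ctrl-X instead of the default Ctrl-U

my @fruit = qw(apple apricot banana cherry);    # illustrative word list

# Only prompt when attached to a terminal.
if ( -t STDIN ) {
    my $choice = Complete( 'Fruit: ', \@fruit );
    print "You picked: $choice\n";
}
```

The <tab> completion character is not included here because, as noted above, it cannot be changed.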

    DIAGNOSTICS

    Bell sounds when word completion fails.

    BUGS

    The completion character <tab> cannot be changed.

    AUTHOR

    Wayne Thompson

     
    Term::ReadLine

    NAME

    Term::ReadLine - Perl interface to various readline packages. If no real package is found, substitutes stubs instead of basic functions.

    SYNOPSIS

        use Term::ReadLine;
        my $term = Term::ReadLine->new('Simple Perl calc');
        my $prompt = "Enter your arithmetic expression: ";
        my $OUT = $term->OUT || \*STDOUT;
        while ( defined ($_ = $term->readline($prompt)) ) {
            my $res = eval($_);
            warn $@ if $@;
            print $OUT $res, "\n" unless $@;
            $term->addhistory($_) if /\S/;
        }

    DESCRIPTION

    This package is just a front end to some other packages. It's a stub to set up a common interface to the various ReadLine implementations found on CPAN (under the Term::ReadLine::* namespace).

    Minimal set of supported functions

    All the supported functions should be called as methods, i.e., either as

        $term = Term::ReadLine->new('name');

    or as

        $term->addhistory('row');

    where $term is a return value of Term::ReadLine->new().

    • ReadLine

      returns the actual package that executes the commands. Among possible values are Term::ReadLine::Gnu , Term::ReadLine::Perl , Term::ReadLine::Stub .

    • new

      returns the handle for subsequent calls to following functions. Argument is the name of the application. Optionally can be followed by two arguments for IN and OUT filehandles. These arguments should be globs.

    • readline

      gets an input line, possibly with actual readline support. Trailing newline is removed. Returns undef on EOF .

    • addhistory

      adds the line to the history of input, from where it can be used if the actual readline is present.

    • IN , OUT

      return the filehandles for input and output or undef if readline input and output cannot be used for Perl.

    • MinLine

      If an argument is specified, it is advice on the minimal size of a line to be included in the history. undef means do not include anything in the history. Returns the old value.

    • findConsole

      returns an array with two strings that give the most appropriate names for files for input and output, using the conventions "<$in", ">$out".

    • Attribs

      returns a reference to a hash which describes internal configuration of the package. Names of keys in this hash conform to standard conventions with the leading rl_ stripped.

    • Features

      Returns a reference to a hash whose keys are the features present in the current implementation. Several optional features are used in the minimal interface: appname should be present if the first argument to new is recognized, and minline should be present if the MinLine method is not a dummy. autohistory should be present if lines are put into history automatically (possibly subject to MinLine), and addhistory if the addhistory method is not a dummy.

      If the Features method reports the feature attribs as present, the Attribs method is not a dummy.

    Additional supported functions

    Term::ReadLine can also use some other package that supports a richer set of commands.

    All these commands are callable via method interface and have names which conform to standard conventions with the leading rl_ stripped.

    The stub package included with the perl distribution allows some additional methods:

    • tkRunning

      makes Tk event loop run when waiting for user input (i.e., during readline method).

    • event_loop

      Registers call-backs to wait for user input (i.e., during readline method). This supersedes tkRunning.

      The first call-back registered is the call back for waiting. It is expected that the callback will call the current event loop until there is something waiting to get on the input filehandle. The parameter passed in is the return value of the second call back.

      The second call-back registered is the call back for registration. The input filehandle (often STDIN, but not necessarily) will be passed in.

      For example, with AnyEvent:

          $term->event_loop(sub {
              my $data = shift;
              $data->[1] = AE::cv();
              $data->[1]->recv();
          }, sub {
              my $fh = shift;
              my $data = [];
              $data->[0] = AE::io($fh, 0, sub { $data->[1]->send() });
              $data;
          });

      The second call-back is optional if the call back is registered prior to the call to $term->readline.

      Deregistration is done in this case by calling event_loop with undef as its parameter:

          $term->event_loop(undef);

      This will cause the data array ref to be removed, allowing normal garbage collection to clean it up. With AnyEvent, that will cause $data->[0] to be cleaned up, and AnyEvent will automatically cancel the watcher at that time. If another loop requires more than that to clean up a file watcher, that will be up to the caller to handle.

    • ornaments

      makes the command line stand out by using termcap data. The argument to ornaments should be 0, 1, or a string of the form "aa,bb,cc,dd". The four components of this string should be names of terminal capabilities: the first two will be issued to make the prompt stand out, the last two to make the input line stand out.

    • newTTY

      takes two arguments which are input filehandle and output filehandle. Switches to use these filehandles.

    One can check whether the currently loaded ReadLine package supports these methods by checking for corresponding Features .
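
As an illustration of checking Features before relying on an optional method, here is a minimal sketch. It sets PERL_RL to a false value so that the dummy interface described under ENVIRONMENT is used; that choice, the application name 'feature-demo', and the use of STDIN, STDOUT and STDERR are assumptions made only to keep the example independent of whichever readline packages happen to be installed.

```perl
use strict;
use warnings;

# A false PERL_RL selects the dummy interface; this must happen before
# Term::ReadLine is loaded, hence the BEGIN block.
BEGIN { $ENV{PERL_RL} = 0 }
use Term::ReadLine;

# Explicit IN and OUT globs, so no console lookup is needed.
my $term = Term::ReadLine->new( 'feature-demo', \*STDIN, \*STDOUT );

my $features = $term->Features;
print "Backend: ", $term->ReadLine, "\n";

# Report a few of the feature names mentioned in this document.
for my $name (qw(appname minline autohistory addhistory attribs ornaments newTTY)) {
    printf "%-12s %s\n", $name, $features->{$name} ? 'yes' : 'no';
}

# Only call an optional method when the backend advertises it.
$term->newTTY( \*STDIN, \*STDERR ) if $features->{newTTY};
```

Which features are reported depends on the package that is actually loaded, so the output is not shown here.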

    EXPORTS

    None

    ENVIRONMENT

    The environment variable PERL_RL governs which ReadLine clone is loaded. If the value is false, a dummy interface is used. If the value is true, it should be tail of the name of the package to use, such as Perl or Gnu .

    As a special case, if the value of this variable is space-separated, the tail might be used to disable the ornaments by setting the tail to be o=0 or ornaments=0. The head should be as described above.

    If the variable is not set, or if the head of the space-separated list is empty, the best available package is loaded. For example:

        export "PERL_RL=Perl o=0"   # Use Perl ReadLine sans ornaments
        export "PERL_RL= o=0"       # Use best available ReadLine sans ornaments

    (Note that processing of PERL_RL for ornaments is at the discretion of the particular Term::ReadLine::* package in use.)

     
    Term::UI

    NAME

    Term::UI - Term::ReadLine UI made easy

    SYNOPSIS

        use Term::UI;
        use Term::ReadLine;

        my $term = Term::ReadLine->new('brand');

        my $reply = $term->get_reply(
            prompt  => 'What is your favourite colour?',
            choices => [qw|blue red green|],
            default => 'blue',
        );

        my $bool = $term->ask_yn(
            prompt  => 'Do you like cookies?',
            default => 'y',
        );

        my $string = q[some_command -option --no-foo --quux='this thing'];

        my ($options,$munged_input) = $term->parse_options($string);

        ### don't have Term::UI issue warnings -- default is '1'
        $Term::UI::VERBOSE = 0;

        ### always pick the default (good for non-interactive terms)
        ### -- default is '0'
        $Term::UI::AUTOREPLY = 1;

        ### Retrieve the entire session as a printable string:
        $hist = Term::UI::History->history_as_string;
        $hist = $term->history_as_string;

    DESCRIPTION

    Term::UI is a transparent way of eliminating the overhead of having to format a question and then validate the reply, informing the user if the answer was not proper and re-issuing the question.

    Simply give it the question you want to ask, optionally with choices the user can pick from and a default and Term::UI will DWYM.

    For asking a yes or no question, there's even a shortcut.

    HOW IT WORKS

    Term::UI places itself at the back of the Term::ReadLine @ISA array, so you can call its functions through your term object.

    Term::UI uses Term::UI::History to record all interactions with the commandline. You can retrieve this history, or alter the filehandle the interaction is printed to. See the Term::UI::History manpage or the SYNOPSIS for details.

    METHODS

    $reply = $term->get_reply( prompt => 'question?', [choices => \@list, default => $list[0], multi => BOOL, print_me => "extra text to print & record", allow => $ref] );

    get_reply asks a user a question, and then returns the reply to the caller. If the answer is invalid (more on that below), the question will be posed again until a satisfactory answer has been entered.

    You have the option of providing a list of choices the user can pick from using the choices argument. If the answer is not in the list of choices presented, the question will be posed again.

    If you provide a default answer, this will be returned when either $AUTOREPLY is set to true, (see the GLOBAL VARIABLES section further below), or when the user just hits enter .

    You can indicate that the user is allowed to enter multiple answers by toggling the multi flag. Note that a list of answers will then be returned to you, rather than a simple string.

    By specifying an allow handler, you can validate the answer a user gives yourself. This can be any of the types that the Params::Check allow function allows, so please refer to that manpage for details.

    Finally, you have the option of adding a print_me argument, which is simply printed before the prompt. It's printed to the same file handle as the rest of the questions, so you can use this to keep track of a full session of Q&A with the user, and retrieve it later using the Term::UI->history_as_string function.

    See the EXAMPLES section for samples of how to use this function.

    $bool = $term->ask_yn( prompt => "your question", [default => (y|1,n|0), print_me => "extra text to print & record"] )

    Asks a simple yes or no question to the user, returning a boolean indicating true or false to the caller.

    The default answer will automatically be returned if the user hits enter or if $AUTOREPLY is set to true. See the GLOBAL VARIABLES section further below.

    Also, you have the option of adding a print_me argument, which is simply printed before the prompt. It's printed to the same file handle as the rest of the questions, so you can use this to keep track of a full session of Q&A with the user, and retrieve it later using the Term::UI->history_as_string function.

    See the EXAMPLES section for samples of how to use this function.

    ($opts, $munged) = $term->parse_options( STRING );

    parse_options will convert all options given from an input string to a hash reference. If called in list context it will also return the part of the input string that it found no options in.

    Consider this example:

        my $str =   q[command --no-foo --baz --bar=0 --quux=bleh ] .
                    q[--option="some'thing" -one-dash -single=blah' arg];

        my ($options,$munged) = $term->parse_options($str);

        ### $options would contain: ###
        $options = {
            'foo'       => 0,
            'bar'       => 0,
            'one-dash'  => 1,
            'baz'       => 1,
            'quux'      => 'bleh',
            'single'    => 'blah\'',
            'option'    => 'some\'thing'
        };

        ### and this is the munged version of the input string,
        ### ie what's left of the input minus the options
        $munged = 'command arg';

    As you can see, you can either use a single or a double - to indicate an option. If you prefix an option with no- and do not give it a value, it will be set to 0. If it has no prefix and no value, it will be set to 1. Otherwise, it will be set to its value. Note also that it can deal fine with single/double quoting issues.
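
The rules in the paragraph above can be sketched in a few lines of plain Perl. The function simple_parse_options below is hypothetical and is not how Term::UI implements parse_options: it splits on whitespace and therefore ignores the quoting that the real method handles. It is meant only to show the no- prefix, bare option and option=value cases.

```perl
use strict;
use warnings;

# Hypothetical simplified parser; unlike Term::UI it does not understand
# quoted values, so every whitespace-separated word is treated on its own.
sub simple_parse_options {
    my ($input) = @_;
    my ( %opts, @rest );

    for my $word ( split ' ', $input ) {
        if ( $word =~ /^--?no-([\w-]+)$/ ) {
            $opts{$1} = 0;       # "no-" prefix without a value
        }
        elsif ( $word =~ /^--?([\w-]+)=(.*)$/ ) {
            $opts{$1} = $2;      # explicit value
        }
        elsif ( $word =~ /^--?([\w-]+)$/ ) {
            $opts{$1} = 1;       # bare option
        }
        else {
            push @rest, $word;   # not an option: keep for the munged string
        }
    }
    return ( \%opts, join ' ', @rest );
}

my ( $opts, $rest ) = simple_parse_options(
    'command --no-foo --baz --bar=0 --quux=bleh -one-dash arg');

print "$_ => $opts->{$_}\n" for sort keys %$opts;
print "rest: $rest\n";
```

An option such as --no-foo=bar carries a value, so in this sketch it is stored under the key no-foo rather than being set to 0.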

    $str = $term->history_as_string

    Convenience wrapper around Term::UI::History->history_as_string .

    Consult the Term::UI::History man page for details.

    GLOBAL VARIABLES

    The behaviour of Term::UI can be altered by changing the following global variables:

    $Term::UI::VERBOSE

    This controls whether Term::UI will issue warnings and explanations as to why certain things may have failed. If you set it to 0, Term::UI will not output any warnings. The default is 1.

    $Term::UI::AUTOREPLY

    This will make every question be answered by the default, and warn if there was no default provided. This is particularly useful if your program is run in non-interactive mode. The default is 0.

    $Term::UI::INVALID

    This holds the string that will be printed when the user makes an invalid choice. You can override this string from your program if you, for example, wish to do localization. The default is "Invalid selection, please try again:".

    $Term::UI::History::HISTORY_FH

    This is the filehandle all the print statements from this module are being sent to. Please consult the Term::UI::History manpage for details.

    This defaults to *STDOUT .

    EXAMPLES

    Basic get_reply sample

        ### ask a user (with an open question) for their favourite colour
        $reply = $term->get_reply( prompt => 'Your favourite colour?' );

    which would look like:

        Your favourite colour?

    and $reply would hold the text the user typed.

    get_reply with choices

        ### now provide a list of choices, so the user has to pick one
        $reply = $term->get_reply(
                    prompt  => 'Your favourite colour?',
                    choices => [qw|red green blue|] );

    which would look like:

          1> red
          2> green
          3> blue

        Your favourite colour?

    $reply will hold one of the choices presented. Term::UI will pose the question again if the user attempts to enter an answer that's not in the list of choices. The string presented is held in the $Term::UI::INVALID variable (see the GLOBAL VARIABLES section for details).

    get_reply with choices and default

        ### provide a sensible default option -- everyone loves blue!
        $reply = $term->get_reply(
                    prompt  => 'Your favourite colour?',
                    choices => [qw|red green blue|],
                    default => 'blue' );

    which would look like:

          1> red
          2> green
          3> blue

        Your favourite colour? [3]:

    Note the default answer after the prompt. A user can now just hit enter (or set $Term::UI::AUTOREPLY -- see the GLOBAL VARIABLES section) and the sensible answer 'blue' will be returned.

    get_reply using print_me & multi

        ### allow the user to pick more than one colour and add an
        ### introduction text
        @reply = $term->get_reply(
                    print_me => 'Tell us what colours you like',
                    prompt   => 'Your favourite colours?',
                    choices  => [qw|red green blue|],
                    multi    => 1 );

    which would look like:

        Tell us what colours you like

          1> red
          2> green
          3> blue

        Your favourite colours?

    An answer of 3 2 1 would fill @reply with blue green red.

    get_reply & allow

        ### pose an open question, but do a custom verification on
        ### the answer, which will only exit the question loop, if
        ### the answer matches the allow handler.
        $reply = $term->get_reply(
                    prompt => "What is the magic number?",
                    allow  => 42 );

    Unless the user now enters 42, the question will be posed over and over again. You can use more sophisticated allow handlers (even subroutines can be used). The allow handler is implemented using Params::Check's allow function. Check its manpage for details.

    an elaborate ask_yn sample

        ### ask a user if he likes cookies. Default to a sensible 'yes'
        ### and inform him first what cookies are.
        $bool = $term->ask_yn( prompt   => 'Do you like cookies?',
                               default  => 'y',
                               print_me => 'Cookies are LOVELY!!!' );

    would print:

        Cookies are LOVELY!!!
        Do you like cookies? [Y/n]:

    If a user then simply hits enter, agreeing with the default, $bool would be set to true. (Simply hitting 'y' would also return true. Hitting 'n' would return false.)

    We could later retrieve this interaction by printing out the Q&A history as follows:

        print $term->history_as_string;

    which would then print:

        Cookies are LOVELY!!!
        Do you like cookies? [Y/n]: y

    There's a chance we're doing this non-interactively, because a console is missing, the user indicated he just wanted the defaults, etc.

    In this case, simply setting $Term::UI::AUTOREPLY to true, will return from every question with the default answer set for the question. Do note that if AUTOREPLY is true, and no default is set, Term::UI will warn about this and return undef.

    See Also

    Params::Check , Term::ReadLine , Term::UI::History

    BUG REPORTS

    Please report bugs or other issues to <bug-term-ui@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    Term::UI::History

    NAME

    Term::UI::History - history function

    SYNOPSIS

        use Term::UI::History qw[history];

        history("Some message");

        ### retrieve the history in printable form
        $hist = Term::UI::History->history_as_string;

        ### redirect output
        local $Term::UI::History::HISTORY_FH = \*STDERR;

    DESCRIPTION

    This module provides the history function for Term::UI , printing and saving all the UI interaction.

    Refer to the Term::UI manpage for details on usage from Term::UI .

    This module subclasses Log::Message::Simple . Refer to its manpage for additional functionality available via this package.

    FUNCTIONS

    history("message string" [,VERBOSE])

    Records a message on the stack, and prints it to STDOUT (or actually $HISTORY_FH , see the GLOBAL VARIABLES section below), if the VERBOSE option is true.

    The VERBOSE option defaults to true.

    GLOBAL VARIABLES

    • $HISTORY_FH

      This is the filehandle to which all the messages sent to history() are printed. This defaults to *STDOUT.

    See Also

    Log::Message::Simple , Term::UI

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This module is copyright (c) 2005 Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    TAP::Base

    NAME

    TAP::Base - Base class that provides common functionality to TAP::Parser and TAP::Harness

    VERSION

    Version 3.26

    SYNOPSIS

        package TAP::Whatever;

        use TAP::Base;

        use vars qw($VERSION @ISA);
        @ISA = qw(TAP::Base);

        # ... later ...

        my $thing = TAP::Whatever->new();

        $thing->callback( event => sub {
            # do something interesting
        } );

    DESCRIPTION

    TAP::Base provides callback management.

    METHODS

    Class Methods

    callback

    Install a callback for a named event.

    get_time

    Return the current time using Time::HiRes if available.

    time_is_hires

    Return true if the time returned by get_time is high resolution (i.e. if Time::HiRes is available).
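
As a small illustration of the two timing methods, the sketch below times a trivial loop and picks an output format depending on whether the clock is high resolution. The loop and the format strings are illustrative assumptions; only get_time and time_is_hires come from TAP::Base.

```perl
use strict;
use warnings;
use TAP::Base;

my $start = TAP::Base->get_time;

# Some work to time.
my $sum = 0;
$sum += $_ for 1 .. 100_000;

my $elapsed = TAP::Base->get_time - $start;

# Show fractions of a second only when they are meaningful.
my $fmt = TAP::Base->time_is_hires ? '%.4f' : '%d';
printf "summed to %s in $fmt seconds\n", $sum, $elapsed;
```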

     
    TAP::Harness

    NAME

    TAP::Harness - Run test scripts with statistics

    VERSION

    Version 3.26

    DESCRIPTION

    This is a simple test harness which allows tests to be run and results automatically aggregated and output to STDOUT.

    SYNOPSIS

        use TAP::Harness;
        my $harness = TAP::Harness->new( \%args );
        $harness->runtests(@tests);

    METHODS

    Class Methods

    new

        my %args = (
            verbosity => 1,
            lib       => [ 'lib', 'blib/lib', 'blib/arch' ],
        );
        my $harness = TAP::Harness->new( \%args );

    The constructor returns a new TAP::Harness object. It accepts an optional hashref whose allowed keys are:

    • verbosity

      Set the verbosity level:

           1   verbose        Print individual test results to STDOUT.
           0   normal
          -1   quiet          Suppress some test output (mostly failures
                              while tests are running).
          -2   really quiet   Suppress everything but the tests summary.
          -3   silent         Suppress everything.

    • timer

      Append run time for each test to output. Uses Time::HiRes if available.

    • failures

      Show test failures (this is a no-op if verbose is selected).

    • comments

      Show test comments (this is a no-op if verbose is selected).

    • show_count

      Update the running test count during testing.

    • normalize

      Set to a true value to normalize the TAP that is emitted in verbose modes.

    • lib

      Accepts a scalar value or array ref of scalar values indicating which library paths should be included if Perl tests are executed. Naturally, this only makes sense in the context of tests written in Perl.

    • switches

      Accepts a scalar value or array ref of scalar values indicating which switches should be included if Perl tests are executed. Naturally, this only makes sense in the context of tests written in Perl.

    • test_args

      A reference to an @INC style array of arguments to be passed to each test program.

          test_args => ['foo', 'bar'],

      If you want to pass different arguments to each test then you should pass a hash of arrays, keyed by the alias for each test:

          test_args => {
              my_test    => ['foo', 'bar'],
              other_test => ['baz'],
          }

    • color

      Attempt to produce color output.

    • exec

      Typically, Perl tests are run through this. However, anything which spits out TAP is fine. You can use this argument to specify the name of the program (and optional switches) to run your tests with:

          exec => ['/usr/bin/ruby', '-w']

      You can also pass a subroutine reference in order to determine and return the proper program to run based on a given test script. The subroutine reference should expect the TAP::Harness object itself as the first argument, and the file name as the second argument. It should return an array reference containing the command to be run and including the test file name. It can also simply return undef, in which case TAP::Harness will fall back on executing the test script in Perl:

          exec => sub {
              my ( $harness, $test_file ) = @_;

              # Let Perl tests run.
              return undef if $test_file =~ /[.]t$/;
              return [ qw( /usr/bin/ruby -w ), $test_file ]
                if $test_file =~ /[.]rb$/;
          }

      If the subroutine returns a scalar with a newline or a filehandle, it will be interpreted as raw TAP or as a TAP stream, respectively.

    • merge

      If merge is true the harness will create parsers that merge STDOUT and STDERR together for any processes they start.

    • sources

      NEW to 3.18.

      If set, sources must be a hashref containing the names of the TAP::Parser::SourceHandlers to load and/or configure. The values are a hash of configuration that will be accessible to the source handlers via config_for in TAP::Parser::Source.

      For example:

          sources => {
              Perl     => { exec => '/path/to/custom/perl' },
              File     => { extensions => [ '.tap', '.txt' ] },
              MyCustom => { some => 'config' },
          }

      The sources parameter affects how source , tap and exec parameters are handled.

      For more details, see the sources parameter in new in TAP::Parser, TAP::Parser::Source, and TAP::Parser::IteratorFactory.

    • aggregator_class

      The name of the class to use to aggregate test results. The default is TAP::Parser::Aggregator.

    • version

      NEW to 3.22.

      Assume this TAP version for TAP::Parser instead of default TAP version 12.

    • formatter_class

      The name of the class to use to format output. The default is TAP::Formatter::Console, or TAP::Formatter::File if the output isn't a TTY.

    • multiplexer_class

      The name of the class to use to multiplex tests during parallel testing. The default is TAP::Parser::Multiplexer.

    • parser_class

      The name of the class to use to parse TAP. The default is TAP::Parser.

    • scheduler_class

      The name of the class to use to schedule test execution. The default is TAP::Parser::Scheduler.

    • formatter

      If set formatter must be an object that is capable of formatting the TAP output. See TAP::Formatter::Console for an example.

    • errors

      If parse errors are found in the TAP output, a note of this will be made in the summary report. To see all of the parse errors, set this argument to true:

          errors => 1

    • directives

      If set to a true value, only test results with directives will be displayed. This overrides other settings such as verbose or failures .

    • ignore_exit

      If set to a true value, instructs TAP::Parser to ignore the exit and wait status from test scripts.

    • jobs

      The maximum number of parallel tests to run at any time. Which tests can be run in parallel is controlled by rules . The default is to run only one test at a time.

    • rules

      A reference to a hash of rules that control which tests may be executed in parallel. This is an experimental feature and the interface may change.

          $harness->rules(
              {   par => [
                      { seq => '../ext/DB_File/t/*' },
                      { seq => '../ext/IO_Compress_Zlib/t/*' },
                      { seq => '../lib/CPANPLUS/*' },
                      { seq => '../lib/ExtUtils/t/*' },
                      '*'
                  ]
              }
          );

    • stdout

      A filehandle for catching standard output.

    • trap

      Attempt to print summary information if run is interrupted by SIGINT (Ctrl-C).

    Any keys for which the value is undef will be ignored.
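
The exec option's ability to return raw TAP makes it possible to try the constructor without any test files on disk. The sketch below relies on the behaviour described under exec (a returned scalar containing a newline is interpreted as raw TAP). The test names demo-pass and demo-fail and their canned output are made up for illustration, and the all_passed and failed accessors belong to the TAP::Parser::Aggregator object returned by runtests.

```perl
use strict;
use warnings;
use TAP::Harness;

# Canned TAP keyed by a made-up test name; nothing is read from disk.
my %tap_for = (
    'demo-pass' => "1..2\nok 1 - first\nok 2 - second\n",
    'demo-fail' => "1..2\nok 1 - first\nnot ok 2 - second\n",
);

my $harness = TAP::Harness->new(
    {   verbosity => -3,    # silent: this sketch reports results itself
        exec      => sub {
            my ( undef, $test_file ) = @_;
            return $tap_for{$test_file};    # has newlines, so raw TAP
        },
    }
);

my $aggregator = $harness->runtests( sort keys %tap_for );

printf "all passed: %s\n", $aggregator->all_passed ? 'yes' : 'no';
printf "failed:     %s\n", join ', ', $aggregator->failed;
```

In list context failed returns the names of the tests that had failures, which is why it can be joined directly.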

    Instance Methods

    runtests

        $harness->runtests(@tests);

    Accepts an array of @tests to be run. This should generally be the names of test files, but this is not required. Each element in @tests will be passed to TAP::Parser::new() as a source . See TAP::Parser for more information.

    It is possible to provide aliases that will be displayed in place of the test name by supplying the test as a reference to an array containing [ $test, $alias ] :

        $harness->runtests( [ 't/foo.t', 'Foo Once' ],
                            [ 't/foo.t', 'Foo Twice' ] );

    Normally it is an error to attempt to run the same test twice. Aliases allow you to overcome this limitation by giving each run of the test a unique name.

    Tests will be run in the order found.

    If the environment variable PERL_TEST_HARNESS_DUMP_TAP is defined it should name a directory into which a copy of the raw TAP for each test will be written. TAP is written to files named for each test. Subdirectories will be created as needed.

    Returns a TAP::Parser::Aggregator containing the test results.
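
    As a rough sketch of aliasing, the following writes a throwaway test script (its contents are made up for illustration) and runs it twice under two display names. It assumes that verbosity => -3 keeps the harness silent, and that the returned aggregator's descriptions and all_passed methods report the aliases and the overall status.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use TAP::Harness;

# Create a tiny throwaway test script (hypothetical content, illustration only).
my ( $fh, $test_file ) = tempfile( SUFFIX => '.t', UNLINK => 1 );
print {$fh} <<'END_TEST';
print "1..1\n";
print "ok 1 - trivial\n";
END_TEST
close $fh or die "Cannot close $test_file: $!";

# verbosity => -3 is assumed to suppress the harness's own output.
my $harness = TAP::Harness->new( { verbosity => -3 } );

# The same file is run twice; the aliases give each run a unique name.
my $aggregator = $harness->runtests(
    [ $test_file, 'First pass' ],
    [ $test_file, 'Second pass' ],
);

my @names = $aggregator->descriptions;
print "Ran: ", join( ', ', @names ), "\n";
print "All passed\n" if $aggregator->all_passed;
```

    Without the aliases, the second entry would collide with the first, which is the error case described above.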

    summary

    1. $harness->summary( $aggregator );

    Output the summary for a TAP::Parser::Aggregator.

    aggregate_tests

    1. $harness->aggregate_tests( $aggregate, @tests );

    Run the named tests and display a summary of results. Tests will be run in the order found.

    Test results will be added to the supplied TAP::Parser::Aggregator. aggregate_tests may be called multiple times to run several sets of tests. Multiple TAP::Harness instances may be used to pass results to a single aggregator so that different parts of a complex test suite may be run using different TAP::Harness settings. This is useful, for example, in the case where some tests should run in parallel but others are unsuitable for parallel execution.

        my $formatter   = TAP::Formatter::Console->new;
        my $ser_harness = TAP::Harness->new( { formatter => $formatter } );
        my $par_harness = TAP::Harness->new(
            {   formatter => $formatter,
                jobs      => 9
            }
        );
        my $aggregator = TAP::Parser::Aggregator->new;

        $aggregator->start();
        $ser_harness->aggregate_tests( $aggregator, @ser_tests );
        $par_harness->aggregate_tests( $aggregator, @par_tests );
        $aggregator->stop();
        $formatter->summary($aggregator);

    Note that for simpler testing requirements it will often be possible to replace the above code with a single call to runtests.

    Each element of the @tests array is either:

    • the source name of a test to run
    • a reference to a [ source name, display name ] array

    In the case of a perl test suite, typically source names are simply the file names of the test scripts to run.

    When you supply a separate display name it becomes possible to run a test more than once; the display name is effectively the alias by which the test is known inside the harness. The harness doesn't care if it runs the same test more than once when each invocation uses a different name.

    make_scheduler

    Called by the harness when it needs to create a TAP::Parser::Scheduler. Override in a subclass to provide an alternative scheduler. make_scheduler is passed the list of tests that was passed to aggregate_tests.

    jobs

    Gets or sets the number of concurrent test runs the harness is handling. By default, this value is 1; for parallel testing, set it higher.

    make_parser

    Make a new parser and display formatter session. Typically used and/or overridden in subclasses.

    1. my ( $parser, $session ) = $harness->make_parser;

    finish_parser

    Terminate use of a parser. Typically used and/or overridden in subclasses. The parser isn't destroyed as a result of this.

    CONFIGURING

    TAP::Harness is designed to be easy to configure.

    Plugins

    TAP::Parser plugins let you change the way TAP is input to and output from the parser.

    TAP::Parser::SourceHandlers handle TAP input. You can configure them and load custom handlers using the sources parameter to new.

    TAP::Formatters handle TAP output. You can load custom formatters by using the formatter_class parameter to new. To configure a formatter, you currently need to instantiate it outside of TAP::Harness and pass it in with the formatter parameter to new. This may be addressed by adding a formatters parameter to new in the future.

    Module::Build

    Module::Build version 0.30 supports TAP::Harness.

    To load TAP::Harness plugins, you'll need to use the tap_harness_args parameter to new, typically from your Build.PL. For example:

        Module::Build->new(
            module_name      => 'MyApp',
            test_file_exts   => [qw(.t .tap .txt)],
            use_tap_harness  => 1,
            tap_harness_args => {
                sources => {
                    MyCustom => {},
                    File     => {
                        extensions => ['.tap', '.txt'],
                    },
                },
                formatter_class => 'TAP::Formatter::HTML',
            },
            build_requires => {
                'Module::Build' => '0.30',
                'TAP::Harness'  => '3.18',
            },
        )->create_build_script;

    See new

    ExtUtils::MakeMaker

    ExtUtils::MakeMaker does not support TAP::Harness out-of-the-box.

    prove

    prove supports TAP::Harness plugins, and has a plugin system of its own. See FORMATTERS in prove, SOURCE HANDLERS in prove and App::Prove for more details.

    WRITING PLUGINS

    If you can't configure TAP::Harness to do what you want, and you can't find an existing plugin, consider writing one.

    The two primary use cases supported by TAP::Harness for plugins are input and output:

    • Customize how TAP gets into the parser

      To do this, you can either extend an existing TAP::Parser::SourceHandler, or write your own. It's a pretty simple API, and they can be loaded and configured using the sources parameter to new.

    • Customize how TAP results are output from the parser

      To do this, you can either extend an existing TAP::Formatter, or write your own. Writing a formatter is a bit more involved than writing a SourceHandler, as you'll need to understand the TAP::Parser API. A good place to start is by understanding how aggregate_tests works.

      Custom formatters can be loaded using the formatter_class parameter to new.

    SUBCLASSING

    If you can't configure TAP::Harness to do exactly what you want, and writing a plugin isn't an option, consider extending it. It is designed to be (mostly) easy to subclass, though the cases when sub-classing is necessary should be few and far between.

    Methods

    The following methods are ones you may wish to override if you want to subclass TAP::Harness.

    REPLACING

    If you like the prove utility and TAP::Parser but you want your own harness, all you need to do is write one and provide new and runtests methods. Then you can use the prove utility like so:

    1. prove --harness My::Test::Harness

    Note that while prove accepts a list of tests (or things to be tested), new has a fairly rich set of arguments. You'll probably want to read over this code carefully to see how all of them are being used.

    SEE ALSO

    Test::Harness

     

    TAP::Object

    NAME

    TAP::Object - Base class that provides common functionality to all TAP::* modules

    VERSION

    Version 3.26

    SYNOPSIS

        package TAP::Whatever;

        use strict;
        use vars qw(@ISA);

        use TAP::Object;

        @ISA = qw(TAP::Object);

        # new() implementation by TAP::Object
        sub _initialize {
            my ( $self, @args) = @_;
            # initialize your object
            return $self;
        }

        # ... later ...
        my $obj = TAP::Whatever->new(@args);

    DESCRIPTION

    TAP::Object provides a default constructor and exception model for all TAP::* classes. Exceptions are raised using Carp.

    METHODS

    Class Methods

    new

    Create a new object. Any arguments passed to new will be passed on to the _initialize method. Returns a new object.

    Instance Methods

    _initialize

    Initializes a new object. This method is a stub by default; you should override it as appropriate.

    Note: new expects you to return $self or raise an exception. See _croak, and Carp.

    _croak

    Raise an exception using croak from Carp, eg:

    1. $self->_croak( 'why me?', 'aaarrgh!' );

    May also be called as a class method.

    1. $class->_croak( 'this works too' );

    _confess

    Raise an exception using confess from Carp, eg:

    1. $self->_confess( 'why me?', 'aaarrgh!' );

    May also be called as a class method.

    1. $class->_confess( 'this works too' );

    _construct

    Create a new instance of the specified class.

    mk_methods

    Create simple getter/setters.

    1. __PACKAGE__->mk_methods(@method_names);
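
    The pieces above can be combined into a small subclass. This is a minimal sketch, not part of the TAP distribution: the class name My::Counter and its label and count fields are hypothetical, chosen only to show new, _initialize, _croak and mk_methods working together.

```perl
use strict;
use warnings;

package My::Counter;    # hypothetical class, for illustration only

use TAP::Object ();
our @ISA = ('TAP::Object');

# Generate simple getter/setters for the two (made-up) fields.
__PACKAGE__->mk_methods(qw(label count));

# Called by TAP::Object::new with the constructor arguments.
sub _initialize {
    my ( $self, %args ) = @_;
    $self->_croak('label is required') unless defined $args{label};
    $self->label( $args{label} );
    $self->count(0);
    return $self;    # new expects $self back
}

package main;

my $counter = My::Counter->new( label => 'widgets' );
$counter->count( $counter->count + 1 );
print $counter->label, ': ', $counter->count, "\n";

# A missing argument raises an exception through _croak.
my $ok = eval { My::Counter->new; 1 };
print "Constructor failed: $@" unless $ok;
```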
     

    TAP::Parser

    NAME

    TAP::Parser - Parse TAP output

    VERSION

    Version 3.26

    SYNOPSIS

        use TAP::Parser;

        my $parser = TAP::Parser->new( { source => $source } );

        while ( my $result = $parser->next ) {
            print $result->as_string;
        }

    DESCRIPTION

    TAP::Parser is designed to produce a proper parse of TAP output. For an example of how to run tests through this module, see the simple harnesses in examples/.

    There's a wiki dedicated to the Test Anything Protocol:

    http://testanything.org

    It includes the TAP::Parser Cookbook:

    http://testanything.org/wiki/index.php/TAP::Parser_Cookbook

    METHODS

    Class Methods

    new

    1. my $parser = TAP::Parser->new(\%args);

    Returns a new TAP::Parser object.

    The arguments should be a hashref with one of the following keys:

    • source

      CHANGED in 3.18

      This is the preferred method of passing input to the constructor.

      The source is used to create a TAP::Parser::Source that is passed to the iterator_factory_class which in turn figures out how to handle the source and creates a TAP::Parser::Iterator for it. The iterator is used by the parser to read in the TAP stream.

      To configure the IteratorFactory use the sources parameter below.

      Note that source, tap and exec are mutually exclusive.

    • tap

      CHANGED in 3.18

      The value should be the complete TAP output.

      The tap is used to create a TAP::Parser::Source that is passed to the iterator_factory_class which in turn figures out how to handle the source and creates a TAP::Parser::Iterator for it. The iterator is used by the parser to read in the TAP stream.

      To configure the IteratorFactory use the sources parameter below.

      Note that source, tap and exec are mutually exclusive.

    • exec

      Must be passed an array reference.

      The exec array ref is used to create a TAP::Parser::Source that is passed to the iterator_factory_class which in turn figures out how to handle the source and creates a TAP::Parser::Iterator for it. The iterator is used by the parser to read in the TAP stream.

      By default the TAP::Parser::SourceHandler::Executable class will create a TAP::Parser::Iterator::Process object to handle the source. This passes the array reference strings as command arguments to IPC::Open3::open3:

      1. exec => [ '/usr/bin/ruby', 't/my_test.rb' ]

      If any test_args are given they will be appended to the end of the command argument list.

      To configure the IteratorFactory use the sources parameter below.

      Note that source, tap and exec are mutually exclusive.

    The following keys are optional.

    • sources

      NEW to 3.18.

      If set, sources must be a hashref containing the names of the TAP::Parser::SourceHandlers to load and/or configure. The values are a hash of configuration that will be accessible to the source handlers via config_for in TAP::Parser::Source.

      For example:

          sources => {
              Perl     => { exec => '/path/to/custom/perl' },
              File     => { extensions => [ '.tap', '.txt' ] },
              MyCustom => { some => 'config' },
          }

      This will cause TAP::Parser to pass custom configuration to two of the built-in source handlers (TAP::Parser::SourceHandler::Perl and TAP::Parser::SourceHandler::File) and attempt to load the MyCustom class. See load_handlers in TAP::Parser::IteratorFactory for more detail.

      The sources parameter affects how source, tap and exec parameters are handled.

      See TAP::Parser::IteratorFactory, TAP::Parser::SourceHandler and subclasses for more details.

    • callbacks

      If present, each callback corresponding to a given result type will be called with the result as the argument if the run method is used:

          my %callbacks = (
              test    => \&test_callback,
              plan    => \&plan_callback,
              comment => \&comment_callback,
              bailout => \&bailout_callback,
              unknown => \&unknown_callback,
          );

          my $aggregator = TAP::Parser::Aggregator->new;
          for my $file ( @test_files ) {
              my $parser = TAP::Parser->new(
                  {
                      source    => $file,
                      callbacks => \%callbacks,
                  }
              );
              $parser->run;
              $aggregator->add( $file, $parser );
          }

    • switches

      If using a Perl file as a source, optional switches may be passed which will be used when invoking the perl executable.

          my $parser = TAP::Parser->new( {
              source   => $test_file,
              switches => [ '-Ilib' ],
          } );

    • test_args

      Used in conjunction with the source and exec options to supply a reference to an @ARGV style array of arguments to pass to the test program.

    • spool

      If passed a filehandle, the parser will write a copy of all parsed TAP to that handle.

    • merge

      If false, STDERR is not captured (though it is 'relayed' to keep it somewhat synchronized with STDOUT).

      If true, STDERR and STDOUT are the same filehandle. This may cause breakage if STDERR contains anything resembling TAP format, but does allow exact synchronization.

      Subtleties of this behavior may be platform-dependent and may change in the future.

    • grammar_class

      This option was introduced to let you easily customize which grammar class the parser should use. It defaults to TAP::Parser::Grammar.

      See also make_grammar.

    • result_factory_class

      This option was introduced to let you easily customize which result factory class the parser should use. It defaults to TAP::Parser::ResultFactory.

      See also make_result.

    • iterator_factory_class

      CHANGED in 3.18

      This option was introduced to let you easily customize which iterator factory class the parser should use. It defaults to TAP::Parser::IteratorFactory.

    Instance Methods

    next

        my $parser = TAP::Parser->new( { source => $file } );
        while ( my $result = $parser->next ) {
            print $result->as_string, "\n";
        }

    This method returns the results of the parsing, one result at a time. Note that it is destructive. You can't rewind and examine previous results.

    If callbacks are used, they will be issued before this call returns.

    Each result returned is a subclass of TAP::Parser::Result. See that module and related classes for more information on how to use them.
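
    To try the loop without a test file on disk, the tap key can supply the TAP directly. The sketch below uses made-up TAP text and simply records the type of every result that next returns.

```perl
use strict;
use warnings;
use TAP::Parser;

# Illustrative TAP stream; the test descriptions are invented.
my $tap = <<'END_TAP';
1..3
ok 1 - input file opened
not ok 2 - first line of the input valid # TODO some data
ok 3 - read the rest of the file
END_TAP

my $parser = TAP::Parser->new( { tap => $tap } );

my @types;
while ( my $result = $parser->next ) {
    push @types, $result->type;
    printf "%-7s %s\n", $result->type, $result->as_string;
}
```

    Because next is destructive, anything needed later (here, the list of types) has to be collected inside the loop.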

    run

    1. $parser->run;

    This method merely runs the parser and parses all of the TAP.

    make_grammar

    Make a new TAP::Parser::Grammar object and return it. Passes through any arguments given.

    The grammar_class can be customized, as described in new.

    make_result

    Make a new TAP::Parser::Result object using the parser's TAP::Parser::ResultFactory, and return it. Passes through any arguments given.

    The result_factory_class can be customized, as described in new.

    make_iterator_factory

    NEW to 3.18.

    Make a new TAP::Parser::IteratorFactory object and return it. Passes through any arguments given.

    iterator_factory_class can be customized, as described in new.

    INDIVIDUAL RESULTS

    If you've read this far in the docs, you've seen this:

        while ( my $result = $parser->next ) {
            print $result->as_string;
        }

    Each result returned is a TAP::Parser::Result subclass, referred to as result types.

    Result types

    Basically, you fetch individual results from the TAP. The seven types, with examples of each, are as follows:

    • Version
      1. TAP version 13
    • Plan
      1. 1..42
    • Pragma
      1. pragma +strict
    • Test
      1. ok 3 - We should start with some foobar!
    • Comment
      1. # Hope we don't use up the foobar.
    • Bailout
      1. Bail out! We ran out of foobar!
    • Unknown
      1. ... yo, this ain't TAP! ...

    Each result fetched is a result object of a different type. There are common methods to each result object and different types may have methods unique to their type. Sometimes a type method may be overridden in a subclass, but its use is guaranteed to be identical.

    Common type methods

    type

    Returns the type of result, such as comment or test.

    as_string

    Returns a string representation of the token. This might not be the exact output, however. Tests will have test numbers added if not present, TODO and SKIP directives will be capitalized and, in general, things will be cleaned up. If you need the original text for the token, see the raw method.

    raw

    Returns the original line of text which was parsed.

    is_plan

    Indicates whether or not this is the test plan line.

    is_test

    Indicates whether or not this is a test line.

    is_comment

    Indicates whether or not this is a comment. Comments will generally only appear in the TAP stream if STDERR is merged to STDOUT. See the merge option.

    is_bailout

    Indicates whether or not this is a bailout line.

    is_yaml

    Indicates whether or not the current item is a YAML block.

    is_unknown

    Indicates whether or not the current line is unknown, that is, whether the parser failed to recognize it as TAP.

    is_ok

    1. if ( $result->is_ok ) { ... }

    Reports whether or not a given result has passed. Anything which is not a test result returns true. This is merely provided as a convenient shortcut which allows you to do this:

        my $parser = TAP::Parser->new( { source => $source } );
        while ( my $result = $parser->next ) {
            # only print failing results
            print $result->as_string unless $result->is_ok;
        }

    plan methods

    1. if ( $result->is_plan ) { ... }

    If the above evaluates as true, the following methods will be available on the $result object.

    plan

    1. if ( $result->is_plan ) {
    2. print $result->plan;
    3. }

    This is merely a synonym for as_string.

    directive

    1. my $directive = $result->directive;

    If a SKIP directive is included with the plan, this method will return it.

    1. 1..0 # SKIP: why bother?

    explanation

    1. my $explanation = $result->explanation;

    If a SKIP directive was included with the plan, this method will return the explanation, if any.

    pragma methods

    1. if ( $result->is_pragma ) { ... }

    If the above evaluates as true, the following methods will be available on the $result object.

    pragmas

    Returns a list of pragmas each of which is a + or - followed by the pragma name.

    comment methods

    1. if ( $result->is_comment ) { ... }

    If the above evaluates as true, the following methods will be available on the $result object.

    comment

    1. if ( $result->is_comment ) {
    2. my $comment = $result->comment;
    3. print "I have something to say: $comment";
    4. }

    bailout methods

    1. if ( $result->is_bailout ) { ... }

    If the above evaluates as true, the following methods will be available on the $result object.

    explanation

    1. if ( $result->is_bailout ) {
    2. my $explanation = $result->explanation;
    3. print "We bailed out because ($explanation)";
    4. }

    If, and only if, a token is a bailout token, you can get an "explanation" via this method. The explanation is the text after the mystical "Bail out!" words which appear in the tap output.

    unknown methods

    1. if ( $result->is_unknown ) { ... }

    There are no unique methods for unknown results.

    test methods

    1. if ( $result->is_test ) { ... }

    If the above evaluates as true, the following methods will be available on the $result object.

    ok

    1. my $ok = $result->ok;

    Returns the literal text of the ok or not ok status.

    number

    1. my $test_number = $result->number;

    Returns the number of the test, even if the original TAP output did not supply that number.

    description

    1. my $description = $result->description;

    Returns the description of the test, if any. This is the portion after the test number but before the directive.

    directive

    1. my $directive = $result->directive;

    Returns either TODO or SKIP if either directive was present for a test line.

    explanation

    1. my $explanation = $result->explanation;

    If a test had either a TODO or SKIP directive, this method will return the accompanying explanation, if present.

    1. not ok 17 - 'Pigs can fly' # TODO not enough acid

    For the above line, the explanation is not enough acid.

    is_ok

    1. if ( $result->is_ok ) { ... }

    Returns a boolean value indicating whether or not the test passed. Remember that for TODO tests, the test always passes.

    Note: this was formerly passed. The latter method is deprecated and will issue a warning.

    is_actual_ok

    1. if ( $result->is_actual_ok ) { ... }

    Returns a boolean value indicating whether or not the test passed, regardless of its TODO status.

    Note: this was formerly actual_passed. The latter method is deprecated and will issue a warning.

    is_unplanned

    1. if ( $test->is_unplanned ) { ... }

    If a test number is greater than the number of planned tests, this method will return true. Unplanned tests will always return false for is_ok, regardless of whether or not the test has_todo (see TAP::Parser::Result::Test for more information about this).

    has_skip

    1. if ( $result->has_skip ) { ... }

    Returns a boolean value indicating whether or not this test had a SKIP directive.

    has_todo

    1. if ( $result->has_todo ) { ... }

    Returns a boolean value indicating whether or not this test had a TODO directive.

    Note that TODO tests always pass. If you need to know whether or not they really passed, check the is_actual_ok method.
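
    The difference between is_ok and is_actual_ok is easiest to see on a failing TODO test. The following sketch feeds the parser a short, made-up TAP stream and records what the test methods report for the TODO line.

```perl
use strict;
use warnings;
use TAP::Parser;

# Illustrative TAP: one ordinary pass and one failing TODO test.
my $tap = <<'END_TAP';
1..2
ok 1 - sanity check
not ok 2 - 'Pigs can fly' # TODO not enough acid
END_TAP

my $parser = TAP::Parser->new( { tap => $tap } );

my %todo;    # what the parser reports for the TODO test
while ( my $result = $parser->next ) {
    next unless $result->is_test;

    printf "test %d: is_ok=%d is_actual_ok=%d directive='%s'\n",
        $result->number,
        $result->is_ok        ? 1 : 0,
        $result->is_actual_ok ? 1 : 0,
        $result->directive;

    if ( $result->has_todo ) {
        %todo = (
            is_ok        => $result->is_ok        ? 1 : 0,
            is_actual_ok => $result->is_actual_ok ? 1 : 0,
            directive    => $result->directive,
            explanation  => $result->explanation,
        );
    }
}
```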

    in_todo

    1. if ( $parser->in_todo ) { ... }

    True while the most recent result was a TODO. Becomes true before the TODO result is returned and stays true until just before the next non-TODO test is returned.

    TOTAL RESULTS

    After parsing the TAP, there are many methods available to let you dig through the results and determine what is meaningful to you.

    Individual Results

    These results refer to individual tests which are run.

    passed

    1. my @passed = $parser->passed; # the test numbers which passed
    2. my $passed = $parser->passed; # the number of tests which passed

    This method lets you know which (or how many) tests passed. If a test failed but had a TODO directive, it will be counted as a passed test.

    failed

    1. my @failed = $parser->failed; # the test numbers which failed
    2. my $failed = $parser->failed; # the number of tests which failed

    This method lets you know which (or how many) tests failed. If a test passed but had a TODO directive, it will NOT be counted as a failed test.

    actual_passed

    1. # the test numbers which actually passed
    2. my @actual_passed = $parser->actual_passed;
    3. # the number of tests which actually passed
    4. my $actual_passed = $parser->actual_passed;

    This method lets you know which (or how many) tests actually passed, regardless of whether or not a TODO directive was found.

    actual_ok

    This method is a synonym for actual_passed.

    actual_failed

    1. # the test numbers which actually failed
    2. my @actual_failed = $parser->actual_failed;
    3. # the number of tests which actually failed
    4. my $actual_failed = $parser->actual_failed;

    This method lets you know which (or how many) tests actually failed, regardless of whether or not a TODO directive was found.

    todo

    1. my @todo = $parser->todo; # the test numbers with todo directives
    2. my $todo = $parser->todo; # the number of tests with todo directives

    This method lets you know which (or how many) tests had TODO directives.

    todo_passed

    1. # the test numbers which unexpectedly succeeded
    2. my @todo_passed = $parser->todo_passed;
    3. # the number of tests which unexpectedly succeeded
    4. my $todo_passed = $parser->todo_passed;

    This method lets you know which (or how many) tests actually passed but were declared as "TODO" tests.

    todo_failed

    1. # deprecated in favor of 'todo_passed'. This method was horribly misnamed.

    This was a badly misnamed method. It indicates which TODO tests unexpectedly succeeded. Will now issue a warning and call todo_passed.

    skipped

    1. my @skipped = $parser->skipped; # the test numbers with SKIP directives
    2. my $skipped = $parser->skipped; # the number of tests with SKIP directives

    This method lets you know which (or how many) tests had SKIP directives.
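
    The sketch below runs the parser over a made-up TAP stream that contains one test of each kind, then queries the methods described above in list context (test numbers) and, for passed, in scalar context (a count).

```perl
use strict;
use warnings;
use TAP::Parser;

# Illustrative TAP covering pass, fail, TODO (failing and passing) and SKIP.
my $tap = <<'END_TAP';
1..5
ok 1 - plain pass
not ok 2 - plain fail
not ok 3 - known bug # TODO not fixed yet
ok 4 - surprise # TODO thought this was broken
ok 5 - needs a database # SKIP no database configured
END_TAP

my $parser = TAP::Parser->new( { tap => $tap } );
$parser->run;

my @passed        = $parser->passed;
my @failed        = $parser->failed;
my @actual_passed = $parser->actual_passed;
my @actual_failed = $parser->actual_failed;
my @todo          = $parser->todo;
my @todo_passed   = $parser->todo_passed;
my @skipped       = $parser->skipped;
my $passed_count  = $parser->passed;    # scalar context gives the count

print "passed:        @passed ($passed_count in total)\n";
print "failed:        @failed\n";
print "actual_passed: @actual_passed\n";
print "actual_failed: @actual_failed\n";
print "todo:          @todo\n";
print "todo_passed:   @todo_passed\n";
print "skipped:       @skipped\n";
```

    Test 3 appears under passed but also under actual_failed, which is the TODO behaviour described for passed and actual_failed.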

    Pragmas

    pragma

    Get or set a pragma. To get the state of a pragma:

    1. if ( $p->pragma('strict') ) {
    2. # be strict
    3. }

    To set the state of a pragma:

    1. $p->pragma('strict', 1); # enable strict mode

    pragmas

    Get a list of all the currently enabled pragmas:

    1. my @pragmas_enabled = $p->pragmas;
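
    As a small sketch of the getter and setter, the following enables strict on a parser before it runs and then lists the enabled pragmas. The TAP text is made up. The expectation that strict mode turns an unrecognized line into a parse error is an assumption about the parser's behaviour, so the code only prints whatever errors are reported.

```perl
use strict;
use warnings;
use TAP::Parser;

# Illustrative TAP with one line that is not valid TAP.
my $tap = "1..2\nok 1 - first\nthis line is not TAP\nok 2 - second\n";

my $parser = TAP::Parser->new( { tap => $tap } );

$parser->pragma( 'strict', 1 );           # enable strict mode
my $strict_on = $parser->pragma('strict');
my @enabled   = $parser->pragmas;

$parser->run;

print "strict is ", ( $strict_on ? 'on' : 'off' ), "\n";
print "enabled pragmas: @enabled\n";

# Assumption: under strict, the junk line shows up here.
print "parse error: $_\n" for $parser->parse_errors;
```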

    Summary Results

    These results are "meta" information about the total results of an individual test program.

    plan

    1. my $plan = $parser->plan;

    Returns the test plan, if found.

    good_plan

    Deprecated. Use is_good_plan instead.

    is_good_plan

    1. if ( $parser->is_good_plan ) { ... }

    Returns a boolean value indicating whether or not the number of tests planned matches the number of tests run.

    Note: this was formerly good_plan. The latter method is deprecated and will issue a warning.

    And since we're on that subject ...

    tests_planned

    1. print $parser->tests_planned;

    Returns the number of tests planned, according to the plan. For example, a plan of '1..17' will mean that 17 tests were planned.

    tests_run

    1. print $parser->tests_run;

    Returns the number of tests which actually were run. Hopefully this will match the number of $parser->tests_planned.

    skip_all

    Returns a true value (actually the reason for skipping) if all tests were skipped.

    start_time

    Returns the time when the Parser was created.

    end_time

    Returns the time when the end of TAP input was seen.

    has_problems

    1. if ( $parser->has_problems ) {
    2. ...
    3. }

    This is a 'catch-all' method which returns true if any tests have currently failed, any TODO tests unexpectedly succeeded, or any parse errors occurred.

    version

    1. $parser->version;

    Once the parser is done, this will return the version number for the parsed TAP. Version numbers were introduced with TAP version 13 so if no version number is found version 12 is assumed.
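
    A short sketch of the summary methods, using two made-up TAP streams: one that plans three tests but runs only two, and one that declares TAP version 13. Both parsers are run to completion before the summary methods are queried.

```perl
use strict;
use warnings;
use TAP::Parser;

# Plans three tests but only runs two, and has no version line.
my $short = TAP::Parser->new( { tap => "1..3\nok 1\nok 2\n" } );
$short->run;

printf "planned=%d run=%d good_plan=%s has_problems=%s version=%s\n",
    $short->tests_planned,
    $short->tests_run,
    ( $short->is_good_plan ? 'yes' : 'no' ),
    ( $short->has_problems ? 'yes' : 'no' ),
    $short->version;

# Declares its version explicitly and matches its plan.
my $v13 = TAP::Parser->new( { tap => "TAP version 13\n1..1\nok 1\n" } );
$v13->run;

printf "planned=%d run=%d good_plan=%s version=%s\n",
    $v13->tests_planned,
    $v13->tests_run,
    ( $v13->is_good_plan ? 'yes' : 'no' ),
    $v13->version;
```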

    exit

    1. $parser->exit;

    Once the parser is done, this will return the exit status. If the parser ran an executable, it returns the exit status of the executable.

    wait

    1. $parser->wait;

    Once the parser is done, this will return the wait status. If the parser ran an executable, it returns the wait status of the executable. Otherwise, this merely returns the exit status.

    ignore_exit

    1. $parser->ignore_exit(1);

    Tell the parser to ignore the exit status from the test when determining whether the test passed. Normally tests with non-zero exit status are considered to have failed even if all individual tests passed. In cases where it is not possible to control the exit value of the test script use this option to ignore it.

    parse_errors

        my @errors = $parser->parse_errors; # the parser errors
        my $errors = $parser->parse_errors; # the number of parse errors

    Fortunately, all TAP output is perfect. In the event that it is not, this method will return parser errors. Note that a junk line which the parser does not recognize is not an error. This allows this parser to handle future versions of TAP. The following are all TAP errors reported by the parser:

    • Misplaced plan

      The plan (for example, '1..5') must only come at the beginning or end of the TAP output.

    • No plan

      Gotta have a plan!

    • More than one plan

          1..3
          ok 1 - input file opened
          not ok 2 - first line of the input valid # todo some data
          ok 3 read the rest of the file
          1..3

      Right. Very funny. Don't do that.

    • Test numbers out of sequence

          1..3
          ok 1 - input file opened
          not ok 2 - first line of the input valid # todo some data
          ok 2 read the rest of the file

      That last test line above should have the number '3' instead of '2'.

      Note that it's perfectly acceptable for some lines to have test numbers and others to not have them. However, when a test number is found, it must be in sequence. The following is also an error:

          1..3
          ok 1 - input file opened
          not ok - first line of the input valid # todo some data
          ok 2 read the rest of the file

      But this is not:

          1..3
          ok - input file opened
          not ok - first line of the input valid # todo some data
          ok 3 read the rest of the file
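
    The out-of-sequence case can be reproduced with a short sketch. It feeds the parser the TAP from the first example in that item and collects the errors in both list and scalar context. The exact wording of the error message is not specified above, so the code prints it rather than relying on it.

```perl
use strict;
use warnings;
use TAP::Parser;

# The last test line is numbered 2 where 3 is expected.
my $tap = <<'END_TAP';
1..3
ok 1 - input file opened
not ok 2 - first line of the input valid # todo some data
ok 2 read the rest of the file
END_TAP

my $parser = TAP::Parser->new( { tap => $tap } );
$parser->run;

my @errors      = $parser->parse_errors;    # the error messages
my $error_count = $parser->parse_errors;    # how many there are

print "$error_count parse error(s)\n";
print "  $_\n" for @errors;
```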

    get_select_handles

    Get a list of file handles which can be passed to select to determine the readiness of this parser.

    delete_spool

    Delete and return the spool.

    1. my $fh = $parser->delete_spool;
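
    As a sketch of how spool and delete_spool fit together, the following spools a made-up TAP stream into an in-memory filehandle, takes the handle back from the parser, and closes it before reading the copy.

```perl
use strict;
use warnings;
use TAP::Parser;

my $tap = "1..2\nok 1 - first\nok 2 - second\n";

# An in-memory filehandle stands in for a real spool file.
open my $spool_fh, '>', \my $copy
    or die "Cannot open in-memory handle: $!";

my $parser = TAP::Parser->new( { tap => $tap, spool => $spool_fh } );
$parser->run;

# Take the spool handle back and close it before reading the copy.
my $fh = $parser->delete_spool;
close $fh or die "Cannot close spool: $!";

print "Spooled TAP:\n$copy";
```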

    CALLBACKS

    As mentioned earlier, a callbacks key may be added to the TAP::Parser constructor. If present, each callback corresponding to a given result type will be called with the result as the argument if the run method is used. Each callback is expected to be a subroutine reference (or anonymous subroutine).

        my %callbacks = (
            test    => \&test_callback,
            plan    => \&plan_callback,
            comment => \&comment_callback,
            bailout => \&bailout_callback,
            unknown => \&unknown_callback,
        );

        my $aggregator = TAP::Parser::Aggregator->new;
        for my $file ( @test_files ) {
            my $parser = TAP::Parser->new(
                {
                    source    => $file,
                    callbacks => \%callbacks,
                }
            );
            $parser->run;
            $aggregator->add( $file, $parser );
        }

    Callbacks may also be added like this:

    1. $parser->callback( test => \&test_callback );
    2. $parser->callback( plan => \&plan_callback );

    The following keys are allowed for callbacks. These keys are case-sensitive.

    • test

      Invoked if $result->is_test returns true.

    • version

      Invoked if $result->is_version returns true.

    • plan

      Invoked if $result->is_plan returns true.

    • comment

      Invoked if $result->is_comment returns true.

    • bailout

      Invoked if $result->is_bailout returns true.

    • yaml

      Invoked if $result->is_yaml returns true.

    • unknown

      Invoked if $result->is_unknown returns true.

    • ELSE

      If a result does not have a callback defined for it, this callback will be invoked. Thus, if all of the previous result types are specified as callbacks, this callback will never be invoked.

    • ALL

      This callback will always be invoked and this will happen for each result after one of the above callbacks is invoked. For example, if Term::ANSIColor is loaded, you could use the following to color your test output:

          my %callbacks = (
              test => sub {
                  my $test = shift;
                  if ( $test->is_ok && not $test->directive ) {
                      # normal passing test
                      print color 'green';
                  }
                  elsif ( !$test->is_ok ) {    # even if it's TODO
                      print color 'white on_red';
                  }
                  elsif ( $test->has_skip ) {
                      print color 'white on_blue';
                  }
                  elsif ( $test->has_todo ) {
                      print color 'white';
                  }
              },
              ELSE => sub {
                  # plan, comment, and so on (anything which isn't a test line)
                  print color 'black on_white';
              },
              ALL => sub {
                  # now print them
                  print shift->as_string;
                  print color 'reset';
                  print "\n";
              },
          );

    • EOF

      Invoked when there are no more lines to be parsed. Since there is no accompanying TAP::Parser::Result object the TAP::Parser object is passed instead.
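    The callback keys above can be tried out without any test files. The following is a minimal sketch, assuming the tap constructor parameter is used to feed a raw TAP string to the parser; the TAP stream and the %seen counter are made up for illustration.

```perl
use strict;
use warnings;
use TAP::Parser;

# Hypothetical in-memory TAP stream, so the sketch needs no test files.
my $tap = join "\n",
    '1..3',
    'ok 1 - first',
    'not ok 2 - second',
    '# a comment',
    'ok 3 - third # SKIP not today',
    '';

# How many times each callback fired.
my %seen = map { $_ => 0 } qw(test ELSE ALL EOF);

my $parser = TAP::Parser->new(
    {
        tap       => $tap,
        callbacks => {
            test => sub { $seen{test}++ },
            ELSE => sub { $seen{ELSE}++ },    # plan and comment have no callback of their own
            ALL  => sub { $seen{ALL}++ },     # fires after the specific callback
            EOF  => sub { $seen{EOF}++ },     # receives the parser, not a result
        },
    }
);
$parser->run;    # callbacks are only dispatched while the parser iterates

printf "%-4s fired %d time(s)\n", $_, $seen{$_} for qw(test ELSE ALL EOF);
```

    With this stream, test should fire once per test line, ELSE once each for the plan and the comment, ALL once per result, and EOF once at the end.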

    TAP GRAMMAR

    If you're looking for an EBNF grammar, see TAP::Parser::Grammar.

    BACKWARDS COMPATIBILITY

    The Perl-QA list attempted to ensure backwards compatibility with Test::Harness. However, there are some minor differences.

    Differences

    • TODO plans

      A little-known feature of Test::Harness is that it supported TODO lists in the plan:

          1..2 todo 2
          ok 1 - We have liftoff
          not ok 2 - Anti-gravity device activated

      Under Test::Harness, test number 2 would pass because it was listed as a TODO test on the plan line. However, we are not aware of anyone actually using this feature and hard-coding test numbers is discouraged because it's very easy to add a test and break the test number sequence. This makes test suites very fragile. Instead, the following should be used:

          1..2
          ok 1 - We have liftoff
          not ok 2 - Anti-gravity device activated # TODO
    • 'Missing' tests

      It rarely happens, but sometimes a harness might encounter 'missing' tests:

          ok 1
          ok 2
          ok 15
          ok 16
          ok 17

      Test::Harness would report tests 3-14 as having failed. For TAP::Parser, these tests are not considered failed because they never ran. They're reported as parse failures (tests out of sequence).
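    The difference can be observed directly. This is a small sketch, assuming the tap constructor parameter is used to pass the stream shown above as a raw string; it only relies on the failed and parse_errors methods of the parser.

```perl
use strict;
use warnings;
use TAP::Parser;

# The out-of-sequence stream from the example above, fed in as raw TAP.
my $parser = TAP::Parser->new( { tap => "ok 1\nok 2\nok 15\nok 16\nok 17\n" } );
$parser->run;

my @failed = $parser->failed;          # numbers of the tests that failed
my @errors = $parser->parse_errors;    # messages describing parse problems

printf "failed tests: %d\n", scalar @failed;
print "parse error: $_\n" for @errors;
```

    No test is reported as failed; the gap in the numbering shows up only in the list of parse errors.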

    SUBCLASSING

    If you find you need to provide custom functionality (as you would have using Test::Harness::Straps), you're in luck: TAP::Parser and friends are designed to be easily plugged into and/or subclassed.

    Before you start, it's important to know a few things:

    1. All TAP::* objects inherit from TAP::Object.

    2. Many TAP::* classes have a SUBCLASSING section to guide you.

    3. Note that TAP::Parser is designed to be the central "maker", i.e. it is responsible for creating most new objects in the TAP::Parser::* namespace.

       This makes it possible for you to have a single point of configuring what subclasses should be used, which means that in many cases you'll find you only need to sub-class one of the parser's components.

       The exceptions to this rule are SourceHandlers and Iterators, but both of those are created by the customizable IteratorFactory.

    4. By subclassing, you may end up overriding undocumented methods. That's not a bad thing per se, but be forewarned that undocumented methods may change without warning from one release to the next - we cannot guarantee backwards compatibility. If any documented method needs changing, it will be deprecated first, and changed in a later release.

    Parser Components

    Sources

    A TAP parser consumes input from a single raw source of TAP, which could come from anywhere (a file, an executable, a database, an IO handle, a URI, etc.). The source gets bundled up in a TAP::Parser::Source object which gathers some metadata about it. The parser then uses a TAP::Parser::IteratorFactory to determine which TAP::Parser::SourceHandler to use to turn the raw source into a stream of TAP by way of Iterators.

    If you simply want TAP::Parser to handle a new source of TAP you probably don't need to subclass TAP::Parser itself. Rather, you'll need to create a new TAP::Parser::SourceHandler class, and just plug it into the parser using the sources param to new. Before you start writing one, read through TAP::Parser::IteratorFactory to get a feel for how the system works first.

    If you find you really need to use your own iterator factory you can still do so without sub-classing TAP::Parser by setting iterator_factory_class.

    If you just need to customize the objects on creation, subclass TAP::Parser and override make_iterator_factory.

    Note that make_source & make_perl_source have been DEPRECATED and are now removed.

    Iterators

    A TAP parser uses iterators to loop through the stream of TAP read in from the source it was given. There are a few types of Iterators available by default, all sub-classes of TAP::Parser::Iterator. Choosing which iterator to use is the responsibility of the iterator factory, though it simply delegates to the Source Handler it uses.

    If you're writing your own TAP::Parser::SourceHandler, you may need to create your own iterators too. If so you'll need to subclass TAP::Parser::Iterator.

    Note that make_iterator has been DEPRECATED and is now removed.

    Results

    A TAP parser creates TAP::Parser::Results as it iterates through the input stream. There are quite a few result types available; choosing which class to use is the responsibility of the result factory.

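    To create your own result types you have two options:

    1. Subclass TAP::Parser::Result and register your new result type/class with the default TAP::Parser::ResultFactory (see register_type in TAP::Parser::ResultFactory).

    2. Subclass TAP::Parser::ResultFactory itself and implement your own TAP::Parser::Result creation logic. You'll then need to tell your parser to use it by setting the result_factory_class parameter. See new for more details.

    If you need to customize the objects on creation, subclass TAP::Parser and override make_result.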

    Grammar

    TAP::Parser::Grammar is the heart of the parser. It tokenizes the TAP input stream and produces results. If you need to customize its behaviour you should probably familiarize yourself with the source first. Enough lecturing.

    Subclass TAP::Parser::Grammar and customize your parser by setting the grammar_class parameter. See new for more details.

    If you need to customize the objects on creation, subclass TAP::Parser and override make_grammar.

    ACKNOWLEDGMENTS

    All of the following have helped. Bug reports, patches, (im)moral support, or just words of encouragement have all been forthcoming.

    • Michael Schwern
    • Andy Lester
    • chromatic
    • GEOFFR
    • Shlomi Fish
    • Torsten Schoenfeld
    • Jerry Gay
    • Aristotle
    • Adam Kennedy
    • Yves Orton
    • Adrian Howard
    • Sean & Lil
    • Andreas J. Koenig
    • Florian Ragwitz
    • Corion
    • Mark Stosberg
    • Matt Kraai
    • David Wheeler
    • Alex Vandiver
    • Cosimo Streppone
    • Ville Skyttä

    AUTHORS

    Curtis "Ovid" Poe <ovid@cpan.org>

    Andy Armstrong <andy@hexten.net>

    Eric Wilhelm <ewilhelm at cpan dot org>

    Michael Peters <mpeters at plusthree dot com>

    Leif Eriksen <leif dot eriksen at bigpond dot com>

    Steve Purkis <spurkis@cpan.org>

    Nicholas Clark <nick@ccl4.org>

    Lee Johnson <notfadeaway at btinternet dot com>

    Philippe Bruhat <book@cpan.org>

    BUGS

    Please report any bugs or feature requests to bug-test-harness@rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Test-Harness. We will be notified, and then you'll automatically be notified of progress on your bug as we make changes.

    Obviously, bugs which include patches are best. If you prefer, you can patch against bleed via an anonymous checkout of the latest version:

        git clone git://github.com/Perl-Toolchain-Gang/Test-Harness.git

    COPYRIGHT & LICENSE

    Copyright 2006-2008 Curtis "Ovid" Poe, all rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    TAP::Parser::Aggregator

    NAME

    TAP::Parser::Aggregator - Aggregate TAP::Parser results

    VERSION

    Version 3.26

    SYNOPSIS

        use TAP::Parser::Aggregator;

        my $aggregate = TAP::Parser::Aggregator->new;
        $aggregate->add( 't/00-load.t', $load_parser );
        $aggregate->add( 't/10-lex.t',  $lex_parser );

        my $summary = <<'END_SUMMARY';
        Passed: %s
        Failed: %s
        Unexpectedly succeeded: %s
        END_SUMMARY
        printf $summary,
            scalar $aggregate->passed,
            scalar $aggregate->failed,
            scalar $aggregate->todo_passed;

    DESCRIPTION

    TAP::Parser::Aggregator collects parser objects and allows reporting/querying their aggregate results.

    METHODS

    Class Methods

    new

        my $aggregate = TAP::Parser::Aggregator->new;

    Returns a new TAP::Parser::Aggregator object.

    Instance Methods

    add

        $aggregate->add( $description => $parser );

    The $description is usually a test file name, but only by convention. It is used as a unique identifier (see, for example, parsers). Reusing a description is a fatal error.

    The $parser is a TAP::Parser object.

    parsers

        my $count   = $aggregate->parsers;
        my @parsers = $aggregate->parsers;
        my @parsers = $aggregate->parsers(@descriptions);

    In scalar context without arguments, this method returns the number of parsers aggregated. In list context without arguments, returns the parsers in the order they were added.

    If @descriptions is given, these correspond to the keys used in each call to the add() method. Returns an array of the requested parsers (in the requested order) in list context or an array reference in scalar context.

    Requesting an unknown identifier is a fatal error.

    descriptions

    Get an array of descriptions in the order in which they were added to the aggregator.

    start

    Call start immediately before adding any results to the aggregator. Among other things, it records the start time for the test run.

    stop

    Call stop immediately after adding all test results to the aggregator.

    elapsed

    Elapsed returns a Benchmark object that represents the running time of the aggregated tests. In order for elapsed to be valid you must call start before running the tests and stop immediately afterwards.

    elapsed_timestr

    Returns a formatted string representing the runtime returned by elapsed(). This spares the caller from having to deal with Benchmark directly.

    all_passed

    Return true if all the tests passed and no parse errors were detected.

    get_status

    Get a single word describing the status of the aggregated tests. Depending on the outcome of the tests returns 'PASS', 'FAIL' or 'NOTESTS'. This token is understood by CPAN::Reporter.

    Summary methods

    Each of the following methods will return the total number of corresponding tests if called in scalar context. If called in list context, returns the descriptions of the parsers which contain the corresponding tests (see add for an explanation of description).

    • failed
    • parse_errors
    • passed
    • planned
    • skipped
    • todo
    • todo_passed
    • wait
    • exit

    For example, to find out how many tests unexpectedly succeeded (TODO tests which passed when they shouldn't):

        my $count        = $aggregate->todo_passed;
        my @descriptions = $aggregate->todo_passed;

    Note that wait and exit are the totals of the wait and exit statuses of each of the tests. These values are totalled only to provide a true value if any of them are non-zero.
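    The scalar/list behaviour of the summary methods can be sketched as follows. The two TAP streams and their descriptions are hypothetical, and the tap parameter of TAP::Parser is assumed so that no test files are needed.

```perl
use strict;
use warnings;
use TAP::Parser;
use TAP::Parser::Aggregator;

# Hypothetical descriptions and raw TAP, in the order they will be added.
my @runs = (
    [ 't/pass.t' => "1..2\nok 1\nok 2\n" ],
    [ 't/fail.t' => "1..2\nok 1\nnot ok 2\n" ],
);

my $aggregate = TAP::Parser::Aggregator->new;
$aggregate->start;    # records the start time
for my $run (@runs) {
    my ( $description, $tap ) = @$run;
    my $parser = TAP::Parser->new( { tap => $tap } );
    $parser->run;
    $aggregate->add( $description, $parser );
}
$aggregate->stop;

my $passed  = $aggregate->passed;    # scalar context: number of passed tests
my @failing = $aggregate->failed;    # list context: descriptions with failures
my $total   = $aggregate->total;
my $status  = $aggregate->get_status;

print "passed $passed of $total, status $status\n";
print "failures in: @failing\n";
print "elapsed: ", $aggregate->elapsed_timestr, "\n";
```

    Calling start and stop around the loop is what makes elapsed and elapsed_timestr meaningful.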

    total

        my $tests_run = $aggregate->total;

    Returns the total number of tests run.

    has_problems

        if ( $aggregate->has_problems ) {
            ...
        }

    Identical to has_errors, but also returns true if any TODO tests unexpectedly succeeded. This is more akin to "warnings".

    has_errors

        if ( $aggregate->has_errors ) {
            ...
        }

    Returns true if any of the parsers failed. This includes:

    • Failed tests
    • Parse errors
    • Bad exit or wait status
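    The distinction between has_errors and has_problems shows up with a TODO test that unexpectedly passes. A minimal sketch, assuming a hypothetical one-test TAP stream passed through the tap parameter of TAP::Parser:

```perl
use strict;
use warnings;
use TAP::Parser;
use TAP::Parser::Aggregator;

# A single TODO test that passes although it was expected to fail.
my $parser = TAP::Parser->new(
    { tap => "1..1\nok 1 - surprise # TODO not implemented yet\n" } );
$parser->run;

my $aggregate = TAP::Parser::Aggregator->new;
$aggregate->add( 't/todo.t', $parser );    # hypothetical description

my $todo_passed  = $aggregate->todo_passed;              # count in scalar context
my $has_errors   = $aggregate->has_errors   ? 1 : 0;
my $has_problems = $aggregate->has_problems ? 1 : 0;

print "todo_passed=$todo_passed has_errors=$has_errors has_problems=$has_problems\n";
```

    Nothing failed and nothing was mis-parsed, so has_errors is false, while has_problems flags the unexpected success.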

    todo_failed

        # deprecated in favor of 'todo_passed'. This method was horribly misnamed.

    This was a badly misnamed method. It indicates which TODO tests unexpectedly succeeded. It now issues a warning and calls todo_passed.

    See Also

    TAP::Parser

    TAP::Harness

     

    TAP::Parser::Grammar

    NAME

    TAP::Parser::Grammar - A grammar for the Test Anything Protocol.

    VERSION

    Version 3.26

    SYNOPSIS

        use TAP::Parser::Grammar;

        my $grammar = $self->make_grammar({
            iterator => $tap_parser_iterator,
            parser   => $tap_parser,
            version  => 12,
        });

        my $result = $grammar->tokenize;

    DESCRIPTION

    TAP::Parser::Grammar tokenizes lines from a TAP::Parser::Iterator and constructs TAP::Parser::Result subclasses to represent the tokens.

    Do not attempt to use this class directly. It won't make sense. It's mainly here to ensure that we will be able to have pluggable grammars when TAP is expanded at some future date (plus, this stuff was really cluttering the parser).

    METHODS

    Class Methods

    new

        my $grammar = TAP::Parser::Grammar->new({
            iterator => $iterator,
            parser   => $parser,
            version  => $version,
        });

    Returns a TAP::Parser::Grammar object that will parse the TAP stream from the specified iterator. Both iterator and parser are required arguments. If version is not set it defaults to 12 (see set_version for more details).

    Instance Methods

    set_version

        $grammar->set_version(13);

    Tell the grammar which TAP syntax version to support. The lowest supported version is 12. Although 'TAP version' isn't valid version 12 syntax it is accepted so that higher version numbers may be parsed.

    tokenize

        my $token = $grammar->tokenize;

    This method will return a TAP::Parser::Result object representing the current line of TAP.

    token_types

        my @types = $grammar->token_types;

    Returns the different types of tokens which this grammar can parse.

    syntax_for

        my $syntax = $grammar->syntax_for($token_type);

    Returns a pre-compiled regular expression which will match a chunk of TAP corresponding to the token type. For example (not that you should really pay attention to this), $grammar->syntax_for('comment') will return qr/^#(.*)/.

    handler_for

        my $handler = $grammar->handler_for($token_type);

    Returns a code reference which, when passed an appropriate line of TAP, returns the lexed token corresponding to that line. As a result, the basic TAP parsing loop looks similar to the following:

        my @tokens;
        my $grammar = TAP::Parser::Grammar->new(
            { iterator => $iterator, parser => $parser } );
        LINE: while ( defined( my $line = $parser->_next_chunk_of_tap ) ) {
            for my $type ( $grammar->token_types ) {
                my $syntax = $grammar->syntax_for($type);
                if ( $line =~ $syntax ) {
                    my $handler = $grammar->handler_for($type);
                    push @tokens => $grammar->$handler($line);
                    next LINE;
                }
            }
            # no token type matched this line
            push @tokens => $grammar->_make_unknown_token($line);
        }

    TAP GRAMMAR

    NOTE: This grammar is slightly out of date. There's still some discussion about it and a new one will be provided when we have things better defined.

    The TAP::Parser does not use a formal grammar because TAP is essentially a stream-based protocol. In fact, it's quite legal to have an infinite stream. For the same reason that we don't apply regexes to streams, we're not using a formal grammar here. Instead, we parse the TAP in lines.

    For purposes of forward compatibility, any result which does not match the following grammar is currently treated as a TAP::Parser::Result::Unknown. It is not a parse error.
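    The line-by-line treatment, including unknown lines, can be observed from the parser side. The sketch below assumes a hypothetical TAP stream containing one junk line, passed via the tap parameter of TAP::Parser, and collects the type of every result.

```perl
use strict;
use warnings;
use TAP::Parser;

# Hypothetical stream with one line that is not valid TAP.
my $tap = join "\n",
    '1..2',
    'ok 1 - alpha',
    '... this line is junk ...',
    '# a diagnostic comment',
    'not ok 2 - beta # TODO later',
    '';

my $parser = TAP::Parser->new( { tap => $tap } );

my @types;
while ( defined( my $result = $parser->next ) ) {
    push @types, $result->type;    # one result per line of TAP
}
my @errors = $parser->parse_errors;

print "types: @types\n";
printf "parse errors: %d\n", scalar @errors;
```

    The junk line comes back as a result of type unknown and does not add a parse error.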

    A formal grammar would look similar to the following:

        (*
            For the time being, I'm cheating on the EBNF by allowing
            certain terms to be defined by POSIX character classes by
            using the following syntax:

              digit ::= [:digit:]

            As far as I am aware, that's not valid EBNF. Sue me. I
            didn't know how to write "char" otherwise (Unicode issues).
            Suggestions welcome.
        *)

        tap                ::= version? { comment | unknown } leading_plan lines
                               |
                               lines trailing_plan {comment}

        version            ::= 'TAP version ' positiveInteger {positiveInteger} "\n"

        leading_plan       ::= plan skip_directive? "\n"

        trailing_plan      ::= plan "\n"

        plan               ::= '1..' nonNegativeInteger

        lines              ::= line {line}

        line               ::= (comment | test | unknown | bailout ) "\n"

        test               ::= status positiveInteger? description? directive?

        status             ::= 'not '? 'ok '

        description        ::= (character - (digit | '#')) {character - '#'}

        directive          ::= todo_directive | skip_directive

        todo_directive     ::= hash_mark 'TODO' ' ' {character}

        skip_directive     ::= hash_mark 'SKIP' ' ' {character}

        comment            ::= hash_mark {character}

        hash_mark          ::= '#' {' '}

        bailout            ::= 'Bail out!' {character}

        unknown            ::= { (character - "\n") }

        (* POSIX character classes and other terminals *)

        digit              ::= [:digit:]
        character          ::= ([:print:] - "\n")
        positiveInteger    ::= ( digit - '0' ) {digit}
        nonNegativeInteger ::= digit {digit}

    SUBCLASSING

    Please see SUBCLASSING in TAP::Parser for a subclassing overview.

    If you really want to subclass TAP::Parser's grammar the best thing to do is read through the code. There's no easy way of summarizing it here.

    SEE ALSO

    TAP::Object, TAP::Parser, TAP::Parser::Iterator, TAP::Parser::Result

     

    TAP::Parser::Iterator

    NAME

    TAP::Parser::Iterator - Base class for TAP source iterators

    VERSION

    Version 3.26

    SYNOPSIS

        # to subclass:
        use vars qw(@ISA);
        use TAP::Parser::Iterator ();
        @ISA = qw(TAP::Parser::Iterator);

        sub _initialize {
            # see TAP::Object...
        }

        sub next_raw { ... }
        sub wait     { ... }
        sub exit     { ... }

    DESCRIPTION

    This is a simple iterator base class that defines TAP::Parser's iterator API. Iterators are typically created from TAP::Parser::SourceHandlers.

    METHODS

    Class Methods

    new

    Create an iterator. Provided by TAP::Object.

    Instance Methods

    next

        while ( my $item = $iter->next ) { ... }

    Returns the next item from the iterator, applying fixes for quirky input syntax (compare next_raw).

    next_raw

    Note: this method is abstract and should be overridden.

        while ( my $item = $iter->next_raw ) { ... }

    Iterate raw input without applying any fixes for quirky input syntax.

    handle_unicode

    If necessary switch the input stream to handle unicode. This only has any effect for I/O handle based streams.

    The default implementation does nothing.

    get_select_handles

    Return a list of filehandles that may be used upstream in a select() call to signal that this Iterator is ready. Iterators that are not handle-based should return an empty list.

    The default implementation does nothing.

    wait

    Note: this method is abstract and should be overridden.

        my $wait_status = $iter->wait;

    Return the wait status for this iterator.

    exit

    Note: this method is abstract and should be overridden.

        my $exit_status = $iter->exit;

    Return the exit status for this iterator.

    SUBCLASSING

    Please see SUBCLASSING in TAP::Parser for a subclassing overview.

    You must override the abstract methods as noted above.

    Example

    TAP::Parser::Iterator::Array is probably the easiest example to follow. There's not much point repeating it here.
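    For orientation, the shape of a subclass can still be sketched in a few lines. The class name My::Iterator::Lines and its lines attribute are hypothetical; the sketch assumes only what is described above, namely that new is provided by TAP::Object and calls _initialize, and that next_raw, wait and exit must be overridden.

```perl
use strict;
use warnings;

{
    # Hypothetical iterator that serves lines from an in-memory list.
    package My::Iterator::Lines;
    use parent 'TAP::Parser::Iterator';

    # Called by the inherited constructor; must return the object.
    sub _initialize {
        my ( $self, @lines ) = @_;
        $self->{lines} = [@lines];
        return $self;
    }

    # Abstract in the base class: hand back one raw line, or undef when done.
    sub next_raw { my $self = shift; return shift @{ $self->{lines} } }

    # Nothing was executed, so report a clean wait and exit status.
    sub wait { return 0 }
    sub exit { return 0 }
}

my $iter = My::Iterator::Lines->new( '1..2', 'ok 1', 'ok 2' );

my @lines;
while ( defined( my $line = $iter->next ) ) {    # next() is inherited
    push @lines, $line;
}
print "$_\n" for @lines;
```

    Only next_raw needs to know where the lines come from; the inherited next method builds on it.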

    SEE ALSO

    TAP::Object, TAP::Parser, TAP::Parser::Iterator::Array, TAP::Parser::Iterator::Stream, TAP::Parser::Iterator::Process

     

    TAP::Parser::IteratorFactory

    NAME

    TAP::Parser::IteratorFactory - Figures out which SourceHandler objects to use for a given Source

    VERSION

    Version 3.26

    SYNOPSIS

        use TAP::Parser::IteratorFactory;
        my $factory  = TAP::Parser::IteratorFactory->new({ %config });
        my $iterator = $factory->make_iterator( $source );

    DESCRIPTION

    This is a factory class that takes a TAP::Parser::Source and runs it through all the registered TAP::Parser::SourceHandlers to see which one should handle the source.

    If you're a plugin author, you'll be interested in how to register handlers (see register_handler) and in how detect_source works.

    METHODS

    Class Methods

    new

    Creates a new factory class:

        my $sf = TAP::Parser::IteratorFactory->new( $config );

    $config is optional. If given, sets config and calls load_handlers.

    register_handler

    Registers a new TAP::Parser::SourceHandler with this factory.

        __PACKAGE__->register_handler( $handler_class );

    handlers

    List of handlers that have been registered.

    Instance Methods

    config

        my $cfg = $sf->config;
        $sf->config({ Perl => { %config } });

    Chaining getter/setter for the configuration of the available source handlers. This is a hashref keyed on handler class whose values contain config to be passed on to the handlers during detection and creation. Class names may be fully qualified or abbreviated, e.g.:

        # these are equivalent
        $sf->config({ 'TAP::Parser::SourceHandler::Perl' => { %config } });
        $sf->config({ 'Perl' => { %config } });

    load_handlers

        $sf->load_handlers;

    Loads the handler classes defined in config. For example, given a config:

        $sf->config({
            MySourceHandler => { some => 'config' },
        });

    load_handlers will attempt to load the MySourceHandler class by looking in @INC for it in this order:

    1. TAP::Parser::SourceHandler::MySourceHandler
    2. MySourceHandler

    croaks on error.

    make_iterator

        my $iterator = $src_factory->make_iterator( $source );

    Given a TAP::Parser::Source, finds the most suitable TAP::Parser::SourceHandler to use to create a TAP::Parser::Iterator (see detect_source). Dies on error.

    detect_source

    Given a TAP::Parser::Source, detects what kind of source it is and returns one TAP::Parser::SourceHandler (the most confident one). Dies on error.

    The detection algorithm works something like this:

        for my $handler (@registered_handlers) {
            # ask them how confident they are about handling this source
            $confidence{$handler} = $handler->can_handle( $source );
        }
        # choose the most confident handler

    Ties are handled by choosing the first handler.
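    The selection rule described above can be illustrated in isolation. This is a sketch of the described behaviour only, not the module's own code: the handler names, their confidence rules and the detect function are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical handlers in registration order; each returns a confidence
# between 0 (cannot handle the source) and 1 (certain).
my @registered_handlers = (
    [ 'Handler::PerlScript' => sub { $_[0] =~ /\.t\z/      ? 0.8 : 0 } ],
    [ 'Handler::TapFile'    => sub { $_[0] =~ /\.tap\z/    ? 0.9 : 0 } ],
    [ 'Handler::AnyFile'    => sub { $_[0] =~ /\.t(?:ap)?\z/ ? 0.8 : 0 } ],
);

# Pick the most confident handler; on a tie the earlier one wins because
# a later handler must be strictly more confident to replace it.
sub detect {
    my ($source) = @_;
    my ( $best, $best_confidence ) = ( undef, 0 );
    for my $entry (@registered_handlers) {
        my ( $name, $can_handle ) = @$entry;
        my $confidence = $can_handle->($source);
        ( $best, $best_confidence ) = ( $name, $confidence )
            if $confidence > $best_confidence;
    }
    die "Cannot detect source of '$source'\n" unless defined $best;
    return $best;
}

print "$_ -> ", detect($_), "\n" for qw(t/basic.t results.tap);
```

    In a real plugin the confidence comes from the can_handle method of a TAP::Parser::SourceHandler subclass rather than from a code reference.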

    SUBCLASSING

    Please see SUBCLASSING in TAP::Parser for a subclassing overview.

    Example

    If we've done things right, you'll probably want to write a new source, rather than sub-classing this (see TAP::Parser::SourceHandler for that).

    But in case you find the need to...

        package MyIteratorFactory;

        use strict;
        use vars '@ISA';

        use TAP::Parser::IteratorFactory;

        @ISA = qw( TAP::Parser::IteratorFactory );

        # override source detection algorithm
        sub detect_source {
            my ($self, $raw_source_ref, $meta) = @_;
            # do detective work, using $meta and whatever else...
        }

        1;

    AUTHORS

    Steve Purkis

    ATTRIBUTION

    Originally ripped off from Test::Harness.

    Moved out of TAP::Parser & converted to a factory class to support extensible TAP source detective work by Steve Purkis.

    SEE ALSO

    TAP::Object, TAP::Parser, TAP::Parser::SourceHandler, TAP::Parser::SourceHandler::File, TAP::Parser::SourceHandler::Perl, TAP::Parser::SourceHandler::RawTAP, TAP::Parser::SourceHandler::Handle, TAP::Parser::SourceHandler::Executable

     

    TAP::Parser::Multiplexer

    NAME

    TAP::Parser::Multiplexer - Multiplex multiple TAP::Parsers

    VERSION

    Version 3.26

    SYNOPSIS

        use TAP::Parser::Multiplexer;

        my $mux = TAP::Parser::Multiplexer->new;
        $mux->add( $parser1, $stash1 );
        $mux->add( $parser2, $stash2 );
        while ( my ( $parser, $stash, $result ) = $mux->next ) {
            # do stuff
        }

    DESCRIPTION

    TAP::Parser::Multiplexer gathers input from multiple TAP::Parsers. Internally it calls select on the input file handles for those parsers to wait for one or more of them to have input available.

    See TAP::Harness for an example of its use.

    METHODS

    Class Methods

    new

        my $mux = TAP::Parser::Multiplexer->new;

    Returns a new TAP::Parser::Multiplexer object.

    Instance Methods

    add

        $mux->add( $parser, $stash );

    Add a TAP::Parser to the multiplexer. $stash is an optional opaque reference that will be returned from next along with the parser and the next result.

    parsers

        my $count = $mux->parsers;

    Returns the number of parsers. Parsers are removed from the multiplexer when their input is exhausted.

    next

    Return a result from the next available parser. Returns a list containing the parser from which the result came, the stash that corresponds with that parser and the result.

        my ( $parser, $stash, $result ) = $mux->next;

    If $result is undefined the corresponding parser has reached the end of its input (and will automatically be removed from the multiplexer).

    When all parsers are exhausted an empty list will be returned.

        if ( my ( $parser, $stash, $result ) = $mux->next ) {
            if ( ! defined $result ) {
                # End of this parser
            }
            else {
                # Process result
            }
        }
        else {
            # All parsers finished
        }
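    Put together, a complete loop might look like the following sketch. The two TAP streams and the stash layout are hypothetical, and the parsers are built from raw TAP strings (the tap parameter of TAP::Parser), so they have no file handles to select on.

```perl
use strict;
use warnings;
use TAP::Parser;
use TAP::Parser::Multiplexer;

# One stash per parser; the multiplexer hands it back with every result.
my @stashes = (
    { name => 'first',  results => 0 },
    { name => 'second', results => 0 },
);

my $mux = TAP::Parser::Multiplexer->new;
$mux->add( TAP::Parser->new( { tap => "1..2\nok 1\nok 2\n" } ), $stashes[0] );
$mux->add( TAP::Parser->new( { tap => "1..1\nok 1\n" } ),       $stashes[1] );

# An empty list ends the loop; an undefined result only ends one parser.
while ( my ( $parser, $stash, $result ) = $mux->next ) {
    next unless defined $result;
    $stash->{results}++;
}

printf "%s: %d results\n", $_->{name}, $_->{results} for @stashes;
```

    The stash is a convenient place to keep per-parser state such as a formatter session or a counter, since the multiplexer treats it as opaque.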

    See Also

    TAP::Parser

    TAP::Harness

     

    TAP::Parser::Result

    NAME

    TAP::Parser::Result - Base class for TAP::Parser output objects

    VERSION

    Version 3.26

    SYNOPSIS

        # abstract class - not meant to be used directly
        # see TAP::Parser::ResultFactory for preferred usage

        # directly:
        use TAP::Parser::Result;
        my $token  = {...};
        my $result = TAP::Parser::Result->new( $token );

    DESCRIPTION

    This is a simple base class used by TAP::Parser to store objects that represent the current bit of test output data from TAP (usually a single line). Unless you're subclassing, you probably won't need to use this module directly.

    METHODS

    new

        # see TAP::Parser::ResultFactory for preferred usage

        # to use directly:
        my $result = TAP::Parser::Result->new($token);

    Returns an instance of the appropriate class for the test token passed in.

    Boolean methods

    The following methods all return a boolean value and are to be overridden in the appropriate subclass.

    • is_plan

      Indicates whether or not this is the test plan line.

          1..3

    • is_pragma

      Indicates whether or not this is a pragma line.

          pragma +strict

    • is_test

      Indicates whether or not this is a test line.

          ok 1 Is OK!

    • is_comment

      Indicates whether or not this is a comment.

          # this is a comment

    • is_bailout

      Indicates whether or not this is a bailout line.

          Bail out! We're out of dilithium crystals.

    • is_version

      Indicates whether or not this is a TAP version line.

          TAP version 4

    • is_unknown

      Indicates whether or not the current line is unknown, that is, whether it could not be parsed.

          ... this line is junk ...

    • is_yaml

      Indicates whether or not this is a YAML chunk.

    raw

        print $result->raw;

    Returns the original line of text which was parsed.

    type

        my $type = $result->type;

    Returns the "type" of a token, such as comment or test.

    as_string

        print $result->as_string;

    Returns a string representation of the token. This might not be the exact output, however. Tests will have test numbers added if not present, TODO and SKIP directives will be capitalized and, in general, things will be cleaned up. If you need the original text for the token, see the raw method.

    is_ok

        if ( $result->is_ok ) { ... }

    Reports whether or not a given result has passed. Anything which is not a test result returns true. This is merely provided as a convenient shortcut.

    passed

    Deprecated. Please use is_ok instead.

    has_directive

        if ( $result->has_directive ) {
            ...
        }

    Indicates whether or not the given result has a TODO or SKIP directive.

    has_todo

        if ( $result->has_todo ) {
            ...
        }

    Indicates whether or not the given result has a TODO directive.

    has_skip

        if ( $result->has_skip ) {
            ...
        }

    Indicates whether or not the given result has a SKIP directive.
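    The directive methods are easiest to understand side by side. The sketch below assumes a hypothetical three-test TAP stream, passed via the tap parameter of TAP::Parser, and records what each test result reports; the %info hash is made up for illustration.

```perl
use strict;
use warnings;
use TAP::Parser;

# Hypothetical stream: a plain pass, a failing TODO test and a skipped test.
my $tap = join "\n",
    '1..3',
    'ok 1 - plain pass',
    'not ok 2 - not written yet # TODO later',
    'ok 3 - needs a database # SKIP no database',
    '';

my $parser = TAP::Parser->new( { tap => $tap } );

my %info;    # test number => what the result object reports
while ( defined( my $result = $parser->next ) ) {
    next unless $result->is_test;    # ignore the plan line
    $info{ $result->number } = {
        ok        => $result->is_ok         ? 1 : 0,
        directive => $result->has_directive ? 1 : 0,
        todo      => $result->has_todo      ? 1 : 0,
        skip      => $result->has_skip      ? 1 : 0,
    };
}

for my $number ( sort { $a <=> $b } keys %info ) {
    my $i = $info{$number};
    print "test $number: ok=$i->{ok} directive=$i->{directive} ",
        "todo=$i->{todo} skip=$i->{skip}\n";
}
```

    Note that the failing TODO test still reports is_ok as true, because a TODO test is expected to fail.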

    set_directive

    Set the directive associated with this token. Used internally to fake TODO tests.

    SUBCLASSING

    Please see SUBCLASSING in TAP::Parser for a subclassing overview.

    Remember: if you want your subclass to be automatically used by the parser, you'll have to register it with register_type in TAP::Parser::ResultFactory.

    If you're creating a completely new result type, you'll probably need to subclass TAP::Parser::Grammar too, or else it'll never get used.

    Example

        package MyResult;

        use strict;
        use vars '@ISA';

        use TAP::Parser::Result;
        use TAP::Parser::ResultFactory;

        @ISA = 'TAP::Parser::Result';

        # register with the factory:
        TAP::Parser::ResultFactory->register_type( 'my_type' => __PACKAGE__ );

        sub as_string { 'My results all look the same' }

    SEE ALSO

    TAP::Object, TAP::Parser, TAP::Parser::ResultFactory, TAP::Parser::Result::Bailout, TAP::Parser::Result::Comment, TAP::Parser::Result::Plan, TAP::Parser::Result::Pragma, TAP::Parser::Result::Test, TAP::Parser::Result::Unknown, TAP::Parser::Result::Version, TAP::Parser::Result::YAML

     

    TAP::Parser::ResultFactory

    NAME

    TAP::Parser::ResultFactory - Factory for creating TAP::Parser output objects

    SYNOPSIS

    1. use TAP::Parser::ResultFactory;
    2. my $token = {...};
    3. my $factory = TAP::Parser::ResultFactory->new;
    4. my $result = $factory->make_result( $token );

    VERSION

    Version 3.26

    DESCRIPTION

    This is a simple factory class which returns a TAP::Parser::Result subclass representing the current bit of test data from TAP (usually a single line). It is used primarily by TAP::Parser::Grammar. Unless you're subclassing, you probably won't need to use this module directly.

    METHODS

    Class Methods

    new

    Creates a new factory class. Note: You currently don't need to instantiate a factory in order to use it.

    make_result

    Returns an instance of the appropriate class for the test token passed in.

    1. my $result = TAP::Parser::ResultFactory->make_result($token);

    Can also be called as an instance method.

    class_for

    Takes one argument: $type. Returns the class for this $type, or croaks with an error.
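
As a rough illustration of class_for and make_result together, the sketch below looks up a built-in type, builds a result from a hand-made token, and traps the error for an unregistered type. The token fields and the type name 'no_such_type' are made up for the example; real tokens come from TAP::Parser::Grammar.

```perl
use strict;
use warnings;
use TAP::Parser::ResultFactory;

# Look up the class registered for a built-in result type.
my $class = TAP::Parser::ResultFactory->class_for('comment');

# Build a result from a hand-made token (hypothetical values).
my $result = TAP::Parser::ResultFactory->make_result(
    { type => 'comment', raw => '# hello', comment => 'hello' } );

# class_for croaks on an unregistered type, so trap the error with eval.
my $error;
eval { TAP::Parser::ResultFactory->class_for('no_such_type'); 1 }
  or $error = $@;

print "class:   $class\n";
print "result:  ", ref($result), "\n";
print "comment: ", $result->comment, "\n";
print "error:   ", ( $error ? 'croaked' : 'none' ), "\n";
```

Wrapping class_for in eval is one way to probe whether a type is registered without aborting the program.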

    register_type

    Takes two arguments: $type and $class.

    This lets you override an existing type with your own custom type, or register a completely new type, eg:

    # create a custom result type:
    package MyResult;
    use strict;
    use vars qw(@ISA);
    @ISA = 'TAP::Parser::Result';
    # register with the factory:
    TAP::Parser::ResultFactory->register_type( 'my_type' => __PACKAGE__ );
    # use it (make_result is the factory method that builds a result):
    my $r = TAP::Parser::ResultFactory->make_result( { type => 'my_type' } );

    Your custom type should then be picked up automatically by the TAP::Parser.

    SUBCLASSING

    Please see SUBCLASSING in TAP::Parser for a subclassing overview.

    There are a few things to bear in mind when creating your own ResultFactory:

    1

    The factory itself is never instantiated (this may change in the future). This means that _initialize is never called.

    2

    TAP::Parser::Result->new is never called; $tokens are reblessed instead. This will change in a future version!

    3

    TAP::Parser::Result subclasses will register themselves with TAP::Parser::ResultFactory directly:

    1. package MyFooResult;
    2. TAP::Parser::ResultFactory->register_type( foo => __PACKAGE__ );

    Of course, it's up to you to decide whether or not to ignore them.

    Example

    1. package MyResultFactory;
    2. use strict;
    3. use vars '@ISA';
    4. use MyResult;
    5. use TAP::Parser::ResultFactory;
    6. @ISA = qw( TAP::Parser::ResultFactory );
    7. # force all results to be 'MyResult'
    8. sub class_for {
    9. return 'MyResult';
    10. }
    11. 1;

    SEE ALSO

    TAP::Parser, TAP::Parser::Result, TAP::Parser::Grammar

     
    TAP::Parser::Scheduler

    NAME

    TAP::Parser::Scheduler - Schedule tests during parallel testing

    VERSION

    Version 3.26

    SYNOPSIS

    1. use TAP::Parser::Scheduler;

    DESCRIPTION

    METHODS

    Class Methods

    new

    1. my $sched = TAP::Parser::Scheduler->new;

    Returns a new TAP::Parser::Scheduler object.

    get_all

    Get a list of all remaining tests.

    get_job

    Return the next available job or undef if none are available. Returns a TAP::Parser::Scheduler::Spinner if the scheduler still has pending jobs but none are available to run right now.
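
A minimal sketch of a get_job loop follows. It assumes the constructor accepts a 'tests' argument holding [ filename, description ] pairs, which this page does not document; the test names are invented. Each job is finished immediately, so no spinner is expected here.

```perl
use strict;
use warnings;
use TAP::Parser::Scheduler;

# Assumption: 'tests' is a list of [ filename, description ] pairs.
# The file names below are made up for the example.
my $sched = TAP::Parser::Scheduler->new(
    tests => [ [ 't/a.t', 'a' ], [ 't/b.t', 'b' ] ],
);

my @ran;
while ( defined( my $job = $sched->get_job ) ) {

    # A spinner means "nothing runnable yet". Nothing else is running in
    # this sketch, so stop rather than loop forever.
    last if $job->is_spinner;

    push @ran, $job->filename;
    $job->finish;    # mark the job complete so locked jobs can proceed
}
print "ran: @ran\n";
```

A real harness would keep servicing its running tests when it receives a spinner, then ask for a job again.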

    as_string

    Return a human readable representation of the scheduling tree.

     
    TAP::Parser::Source

    NAME

    TAP::Parser::Source - a TAP source & meta data about it

    VERSION

    Version 3.26

    SYNOPSIS

    1. use TAP::Parser::Source;
    2. my $source = TAP::Parser::Source->new;
    3. $source->raw( \'reference to raw TAP source' )
    4. ->config( \%config )
    5. ->merge( $boolean )
    6. ->switches( \@switches )
    7. ->test_args( \@args )
    8. ->assemble_meta;
    9. do { ... } if $source->meta->{is_file};
    10. # see assemble_meta for a full list of data available

    DESCRIPTION

    A TAP source is something that produces a stream of TAP for the parser to consume, such as an executable file, a text file, an archive, an IO handle, a database, etc. TAP::Parser::Source objects encapsulate these raw sources and provide some useful meta data about them. They are used by TAP::Parser::SourceHandlers, which do whatever is required to produce and capture a stream of TAP from the raw source, and package it up in a TAP::Parser::Iterator for the parser to consume.

    Unless you're writing a new TAP::Parser::SourceHandler, a plugin or subclassing TAP::Parser, you probably won't need to use this module directly.

    METHODS

    Class Methods

    new

    1. my $source = TAP::Parser::Source->new;

    Returns a new TAP::Parser::Source object.

    Instance Methods

    raw

    1. my $raw = $source->raw;
    2. $source->raw( $some_value );

    Chaining getter/setter for the raw TAP source. This is a reference, as it may contain large amounts of data (eg: raw TAP).

    meta

    1. my $meta = $source->meta;
    2. $source->meta({ %some_value });

    Chaining getter/setter for meta data about the source. This defaults to an empty hashref. See assemble_meta for more info.

    has_meta

    True if the source has meta data.

    config

    1. my $config = $source->config;
    2. $source->config({ %some_value });

    Chaining getter/setter for the source's configuration, if any has been provided by the user. How it's used is up to you. This defaults to an empty hashref. See config_for for more info.

    merge

    my $merge = $source->merge;
    $source->merge( $bool );

    Chaining getter/setter for the flag that dictates whether STDOUT and STDERR should be merged (where appropriate). Defaults to undef.

    switches

    my $switches = $source->switches;
    $source->switches([ @switches ]);

    Chaining getter/setter for the list of command-line switches that should be passed to the source (where appropriate). Defaults to undef.

    test_args

    my $test_args = $source->test_args;
    $source->test_args([ @test_args ]);

    Chaining getter/setter for the list of command-line arguments that should be passed to the source (where appropriate). Defaults to undef.

    assemble_meta

    1. my $meta = $source->assemble_meta;

    Gathers meta data about the raw source, stashes it in meta and returns it as a hashref. This is done so that the TAP::Parser::SourceHandlers don't have to repeat common checks. Currently this includes:

    1. is_scalar => $bool,
    2. is_hash => $bool,
    3. is_array => $bool,
    4. # for scalars:
    5. length => $n
    6. has_newlines => $bool
    7. # only done if the scalar looks like a filename
    8. is_file => $bool,
    9. is_dir => $bool,
    10. is_symlink => $bool,
    11. file => {
    12. # only done if the scalar looks like a filename
    13. basename => $string, # including ext
    14. dir => $string,
    15. ext => $string,
    16. lc_ext => $string,
    17. # system checks
    18. exists => $bool,
    19. stat => [ ... ], # perldoc -f stat
    20. empty => $bool,
    21. size => $n,
    22. text => $bool,
    23. binary => $bool,
    24. read => $bool,
    25. write => $bool,
    26. execute => $bool,
    27. setuid => $bool,
    28. setgid => $bool,
    29. sticky => $bool,
    30. is_file => $bool,
    31. is_dir => $bool,
    32. is_symlink => $bool,
    33. # only done if the file's a symlink
    34. lstat => [ ... ], # perldoc -f lstat
    35. # only done if the file's a readable text file
    36. shebang => $first_line,
    37. }
    38. # for arrays:
    39. size => $n,
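
To see which of these keys are actually filled in for a scalar source, a small sketch like the following can help. The TAP text is made up; because it contains newlines, the filename checks are not expected to run.

```perl
use strict;
use warnings;
use TAP::Parser::Source;

my $tap = "1..1\nok 1 - example\n";    # made-up TAP for illustration

# raw() takes a reference and returns the object, so calls can be chained.
my $source = TAP::Parser::Source->new;
my $meta   = $source->raw( \$tap )->assemble_meta;

# Dump whatever meta data was gathered.
for my $key ( sort keys %$meta ) {
    my $value = defined $meta->{$key} ? $meta->{$key} : 'undef';
    print "$key => $value\n";
}
```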

    shebang

    Get the shebang line for a script file.

    1. my $shebang = TAP::Parser::Source->shebang( $some_script );

    May be called as a class method.

    config_for

    1. my $config = $source->config_for( $class );

    Returns config for the $class given. Class names may be fully qualified or abbreviated, eg:

    1. # these are equivalent
    2. $source->config_for( 'Perl' );
    3. $source->config_for( 'TAP::Parser::SourceHandler::Perl' );

    If a fully qualified $class is given, its abbreviated version is checked first.
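
The lookup can be tried out with a short sketch. The 'verbose' setting is a hypothetical key chosen for the example, not an option defined by any handler.

```perl
use strict;
use warnings;
use TAP::Parser::Source;

my $source = TAP::Parser::Source->new;

# Hypothetical per-handler settings, keyed by the abbreviated class name.
$source->config( { Perl => { verbose => 1 } } );

my $short = $source->config_for('Perl');
my $long  = $source->config_for('TAP::Parser::SourceHandler::Perl');

# Both lookups should return the same hash reference.
print "same config\n" if $short == $long;
```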

    AUTHORS

    Steve Purkis.

    SEE ALSO

    TAP::Object, TAP::Parser, TAP::Parser::IteratorFactory, TAP::Parser::SourceHandler

     
    TAP::Parser::Utils

    NAME

    TAP::Parser::Utils - Internal TAP::Parser utilities

    VERSION

    Version 3.26

    SYNOPSIS

    use TAP::Parser::Utils qw( split_shell );
    my @switches = split_shell( $arg );

    DESCRIPTION

    FOR INTERNAL USE ONLY!

    INTERFACE

    split_shell

    Shell style argument parsing. Handles backslash escaping, single and double quoted strings but not shell substitutions.

    Pass one or more strings containing shell-escaped arguments. The return value is an array of arguments parsed from the input strings according to (approximate) shell parsing rules. It's legal to pass undef, in which case an empty array is returned. That makes it possible to write

    1. my @args = split_shell( $ENV{SOME_ENV_VAR} );

    without worrying about whether the environment variable exists.

    This is used to split HARNESS_PERL_ARGS into individual switches.
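
A brief sketch of the quoting behaviour described above, using an invented switch string:

```perl
use strict;
use warnings;
use TAP::Parser::Utils qw( split_shell );

# Quotes group words into one argument and are removed from the result.
my @switches = split_shell(q{-w -I"my lib"});
print "[$_]\n" for @switches;

# undef yields an empty list, so an unset variable is harmless.
my @none = split_shell(undef);
print scalar(@none), " arguments\n";
```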

     
    TAP::Parser::YAMLish::Reader

    NAME

    TAP::Parser::YAMLish::Reader - Read YAMLish data from iterator

    VERSION

    Version 3.26

    SYNOPSIS

    DESCRIPTION

    Note that parts of this code were derived from YAML::Tiny with the permission of Adam Kennedy.

    METHODS

    Class Methods

    new

    The constructor new creates and returns an empty TAP::Parser::YAMLish::Reader object.

    1. my $reader = TAP::Parser::YAMLish::Reader->new;

    Instance Methods

    read

    1. my $got = $reader->read($iterator);

    Read YAMLish from a TAP::Parser::Iterator and return the data structure it represents.

    get_raw

    my $source = $reader->get_raw;

    Return the raw YAMLish source from the most recent read.

    AUTHOR

    Andy Armstrong, <andy@hexten.net>

    Adam Kennedy wrote YAML::Tiny which provided the template and many of the YAML matching regular expressions for this module.

    SEE ALSO

    YAML::Tiny, YAML, YAML::Syck, Config::Tiny, CSS::Tiny, http://use.perl.org/~Alias/journal/29427

    COPYRIGHT

    Copyright 2007-2011 Andy Armstrong.

    Portions copyright 2006-2008 Adam Kennedy.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    The full text of the license can be found in the LICENSE file included with this module.

     
    TAP::Parser::YAMLish::Writer

    NAME

    TAP::Parser::YAMLish::Writer - Write YAMLish data

    VERSION

    Version 3.26

    SYNOPSIS

    1. use TAP::Parser::YAMLish::Writer;
    2. my $data = {
    3. one => 1,
    4. two => 2,
    5. three => [ 1, 2, 3 ],
    6. };
    7. my $yw = TAP::Parser::YAMLish::Writer->new;
    8. # Write to an array...
    9. $yw->write( $data, \@some_array );
    10. # ...an open file handle...
    11. $yw->write( $data, $some_file_handle );
    12. # ...a string ...
    13. $yw->write( $data, \$some_string );
    14. # ...or a closure
    15. $yw->write( $data, sub {
    16. my $line = shift;
    17. print "$line\n";
    18. } );

    DESCRIPTION

    Encodes a scalar, hash reference or array reference as YAMLish.

    METHODS

    Class Methods

    new

    1. my $writer = TAP::Parser::YAMLish::Writer->new;

    The constructor new creates and returns an empty TAP::Parser::YAMLish::Writer object.

    Instance Methods

    write

    1. $writer->write($obj, $output );

    Encode a scalar, hash reference or array reference as YAML.

    1. my $writer = sub {
    2. my $line = shift;
    3. print SOMEFILE "$line\n";
    4. };
    5. my $data = {
    6. one => 1,
    7. two => 2,
    8. three => [ 1, 2, 3 ],
    9. };
    10. my $yw = TAP::Parser::YAMLish::Writer->new;
    11. $yw->write( $data, $writer );

    The $output argument may be:

    • a reference to a scalar to append YAML to
    • the handle of an open file
    • a reference to an array into which YAML will be pushed
    • a code reference

    If you supply a code reference the subroutine will be called once for each line of output with the line as its only argument. Passed lines will have no trailing newline.
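
The code-reference form can be sketched as follows; the data structure is arbitrary example data. Collecting the lines in an array makes it easy to inspect the document markers that wrap the output.

```perl
use strict;
use warnings;
use TAP::Parser::YAMLish::Writer;

my $writer = TAP::Parser::YAMLish::Writer->new;

# With a code reference, each output line arrives without a newline.
my @lines;
$writer->write(
    { one => 1, list => [ 'a', 'b' ] },
    sub { push @lines, shift },
);

print "$_\n" for @lines;
```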

    AUTHOR

    Andy Armstrong, <andy@hexten.net>

    SEE ALSO

    YAML::Tiny, YAML, YAML::Syck, Config::Tiny, CSS::Tiny, http://use.perl.org/~Alias/journal/29427

    COPYRIGHT

    Copyright 2007-2011 Andy Armstrong.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    The full text of the license can be found in the LICENSE file included with this module.

     
    TAP::Parser::Scheduler::Job

    NAME

    TAP::Parser::Scheduler::Job - A single testing job.

    VERSION

    Version 3.26

    SYNOPSIS

    1. use TAP::Parser::Scheduler::Job;

    DESCRIPTION

    Represents a single test 'job'.

    METHODS

    Class Methods

    new

    1. my $job = TAP::Parser::Scheduler::Job->new(
    2. $name, $desc
    3. );

    Returns a new TAP::Parser::Scheduler::Job object.

    on_finish

    Register a closure to be called when this job is destroyed.

    finish

    Called when a job is complete to unlock it.

    filename

    description

    context

    as_array_ref

    For backwards compatibility in callbacks.

    is_spinner

    Returns false indicating that this is a real job rather than a 'spinner'. Spinners are returned when the scheduler still has pending jobs but can't (because of locking) return one right now.
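
The following sketch exercises the accessors and the finish callback on a stand-alone job, and contrasts is_spinner with a TAP::Parser::Scheduler::Spinner. The file name and description are hypothetical, and it assumes finish invokes the closure registered with on_finish.

```perl
use strict;
use warnings;
use TAP::Parser::Scheduler::Job;
use TAP::Parser::Scheduler::Spinner;

# Hypothetical test file name and description.
my $job = TAP::Parser::Scheduler::Job->new( 't/basic.t', 'basic checks' );

my $finished = 0;
$job->on_finish( sub { $finished = 1 } );

printf "%s (%s)\n", $job->filename, $job->description;
$job->finish;    # assumed to run the registered closure
print "finished: $finished\n";

# Real jobs and spinners can be told apart with is_spinner.
my $spinner = TAP::Parser::Scheduler::Spinner->new;
print "job is spinner: ",     ( $job->is_spinner     ? 'yes' : 'no' ), "\n";
print "spinner is spinner: ", ( $spinner->is_spinner ? 'yes' : 'no' ), "\n";
```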

     
    TAP::Parser::Scheduler::Spinner

    NAME

    TAP::Parser::Scheduler::Spinner - A no-op job.

    VERSION

    Version 3.26

    SYNOPSIS

    1. use TAP::Parser::Scheduler::Spinner;

    DESCRIPTION

    A no-op job. Returned by TAP::Parser::Scheduler as an instruction to the harness to spin (keep executing tests) while the scheduler can't return a real job.

    METHODS

    Class Methods

    new

    1. my $job = TAP::Parser::Scheduler::Spinner->new;

    Returns a new TAP::Parser::Scheduler::Spinner object.

    is_spinner

    Returns true, indicating that this is a 'spinner' job. Spinners are returned when the scheduler still has pending jobs but can't (because of locking) return one right now.

     
    TAP::Parser::Result::Bailout

    NAME

    TAP::Parser::Result::Bailout - Bailout result token.

    VERSION

    Version 3.26

    DESCRIPTION

    This is a subclass of TAP::Parser::Result. A token of this class will be returned if a bail out line is encountered.

    1. 1..5
    2. ok 1 - woo hooo!
    3. Bail out! Well, so much for "woo hooo!"

    OVERRIDDEN METHODS

    Mainly listed here to shut up the pitiful screams of the pod coverage tests. They keep me awake at night.

    • as_string

    Instance Methods

    explanation

    1. if ( $result->is_bailout ) {
    2. my $explanation = $result->explanation;
    3. print "We bailed out because ($explanation)";
    4. }

    If, and only if, a token is a bailout token, you can get an "explanation" via this method. The explanation is the text after the mystical "Bail out!" words which appear in the tap output.
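
For a complete round trip, a sketch like this feeds a small, invented TAP stream through TAP::Parser and picks out the bailout explanation:

```perl
use strict;
use warnings;
use TAP::Parser;

# Made-up TAP stream that bails out after the first test.
my $tap    = "1..5\nok 1 - woo hooo!\nBail out! Disk is full\n";
my $parser = TAP::Parser->new( { tap => $tap } );

my $explanation;
while ( my $result = $parser->next ) {
    next unless $result->is_bailout;
    $explanation = $result->explanation;
}
print "We bailed out because ($explanation)\n";
```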

     
    TAP::Parser::Result::Comment

    NAME

    TAP::Parser::Result::Comment - Comment result token.

    VERSION

    Version 3.26

    DESCRIPTION

    This is a subclass of TAP::Parser::Result. A token of this class will be returned if a comment line is encountered.

    1. 1..1
    2. ok 1 - woo hooo!
    3. # this is a comment

    OVERRIDDEN METHODS

    Mainly listed here to shut up the pitiful screams of the pod coverage tests. They keep me awake at night.

    • as_string

      Note that this method merely returns the comment preceded by a '# '.

    Instance Methods

    comment

    1. if ( $result->is_comment ) {
    2. my $comment = $result->comment;
    3. print "I have something to say: $comment";
    4. }
     
    TAP::Parser::Result::Plan

    NAME

    TAP::Parser::Result::Plan - Plan result token.

    VERSION

    Version 3.26

    DESCRIPTION

    This is a subclass of TAP::Parser::Result. A token of this class will be returned if a plan line is encountered.

    1. 1..1
    2. ok 1 - woo hooo!

    1..1 is the plan. Gotta have a plan.

    OVERRIDDEN METHODS

    Mainly listed here to shut up the pitiful screams of the pod coverage tests. They keep me awake at night.

    • as_string
    • raw

    Instance Methods

    plan

    1. if ( $result->is_plan ) {
    2. print $result->plan;
    3. }

    This is merely a synonym for as_string.

    tests_planned

    1. my $planned = $result->tests_planned;

    Returns the number of tests planned. For example, a plan of 1..17 will cause this method to return '17'.

    directive

    1. my $directive = $plan->directive;

    If a SKIP directive is included with the plan, this method will return it.

    1. 1..0 # SKIP: why bother?

    has_skip

    1. if ( $result->has_skip ) { ... }

    Returns a boolean value indicating whether or not this test has a SKIP directive.

    explanation

    1. my $explanation = $plan->explanation;

    If a SKIP directive was included with the plan, this method will return the explanation, if any.
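
The plan accessors can be tried together on a skip-all plan. This sketch parses the one-line stream shown earlier (without the colon after SKIP) and prints what each method returns:

```perl
use strict;
use warnings;
use TAP::Parser;

# A skip-all plan, as a one-line TAP stream.
my $parser = TAP::Parser->new( { tap => "1..0 # SKIP why bother?\n" } );

my $plan;
while ( my $result = $parser->next ) {
    $plan = $result if $result->is_plan;
}

print "planned:     ", $plan->tests_planned, "\n";
print "has_skip:    ", ( $plan->has_skip ? 'yes' : 'no' ), "\n";
print "directive:   ", $plan->directive,   "\n";
print "explanation: ", $plan->explanation, "\n";
```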

    todo_list

    1. my $todo = $result->todo_list;
    2. for ( @$todo ) {
    3. ...
    4. }
     
    TAP::Parser::Result::Pragma

    NAME

    TAP::Parser::Result::Pragma - TAP pragma token.

    VERSION

    Version 3.26

    DESCRIPTION

    This is a subclass of TAP::Parser::Result. A token of this class will be returned if a pragma is encountered.

    1. TAP version 13
    2. pragma +strict, -foo

    Pragmas are only supported from TAP version 13 onwards.

    OVERRIDDEN METHODS

    Mainly listed here to shut up the pitiful screams of the pod coverage tests. They keep me awake at night.

    • as_string
    • raw

    Instance Methods

    pragmas

    if ( $result->is_pragma ) { @pragmas = $result->pragmas; }

     
    TAP::Parser::Result::Test

    NAME

    TAP::Parser::Result::Test - Test result token.

    VERSION

    Version 3.26

    DESCRIPTION

    This is a subclass of TAP::Parser::Result. A token of this class will be returned if a test line is encountered.

    1. 1..1
    2. ok 1 - woo hooo!

    OVERRIDDEN METHODS

    This class is the workhorse of the TAP::Parser system. Most TAP lines will be test lines, and if $result->is_test is true, you have a bunch of methods at your disposal.

    Instance Methods

    ok

    1. my $ok = $result->ok;

    Returns the literal text of the ok or not ok status.

    number

    1. my $test_number = $result->number;

    Returns the number of the test, even if the original TAP output did not supply that number.

    description

    1. my $description = $result->description;

    Returns the description of the test, if any. This is the portion after the test number but before the directive.

    directive

    1. my $directive = $result->directive;

    Returns either TODO or SKIP if either directive was present for a test line.

    explanation

    1. my $explanation = $result->explanation;

    If a test had either a TODO or SKIP directive, this method will return the accompanying explanation, if present.

    1. not ok 17 - 'Pigs can fly' # TODO not enough acid

    For the above line, the explanation is not enough acid.

    is_ok

    1. if ( $result->is_ok ) { ... }

    Returns a boolean value indicating whether or not the test passed. Remember that for TODO tests, the test always passes.

    If the test is unplanned, this method will always return false. See is_unplanned.

    is_actual_ok

    1. if ( $result->is_actual_ok ) { ... }

    Returns a boolean value indicating whether or not the test passed, regardless of its TODO status.
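
The difference between is_ok and is_actual_ok shows up on a failing TODO test. The sketch below parses a small invented stream and prints both values for each test line:

```perl
use strict;
use warnings;
use TAP::Parser;

# Made-up TAP: one plain pass and one failing TODO test.
my $tap = "1..2\n"
  . "ok 1 - works\n"
  . "not ok 2 - pigs can fly # TODO not enough acid\n";
my $parser = TAP::Parser->new( { tap => $tap } );

my @tests;
while ( my $result = $parser->next ) {
    push @tests, $result if $result->is_test;
}

for my $test (@tests) {
    printf "test %d: is_ok=%s is_actual_ok=%s directive=%s\n",
      $test->number,
      ( $test->is_ok        ? 'yes' : 'no' ),
      ( $test->is_actual_ok ? 'yes' : 'no' ),
      ( $test->directive || 'none' );
}
```

The second test counts as passing for is_ok because it is marked TODO, while is_actual_ok reports the literal "not ok".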

    actual_passed

    Deprecated. Please use is_actual_ok instead.

    todo_passed

    1. if ( $test->todo_passed ) {
    2. # test unexpectedly succeeded
    3. }

    If this is a TODO test and an 'ok' line, this method returns true. Otherwise, it will always return false (regardless of passing status on non-todo tests).

    This is used to track which tests unexpectedly succeeded.

    todo_failed

    1. # deprecated in favor of 'todo_passed'. This method was horribly misnamed.

    This was a badly misnamed method. It indicates which TODO tests unexpectedly succeeded. Will now issue a warning and call todo_passed.

    has_skip

    1. if ( $result->has_skip ) { ... }

    Returns a boolean value indicating whether or not this test has a SKIP directive.

    has_todo

    1. if ( $result->has_todo ) { ... }

    Returns a boolean value indicating whether or not this test has a TODO directive.

    as_string

    1. print $result->as_string;

    This method returns the test as a string. It will probably be similar, but not necessarily identical, to the original test line. Directives are capitalized, some whitespace may be trimmed and a test number will be added if it was not present in the original line. If you need the original text of the test line, use the raw method.

    is_unplanned

    1. if ( $test->is_unplanned ) { ... }
    2. $test->is_unplanned(1);

    If a test number is greater than the number of planned tests, this method will return true. Unplanned tests will always return false for is_ok, regardless of whether or not the test has_todo.

    Note that if tests have a trailing plan, it is not possible to set this property for unplanned tests as we do not know it's unplanned until the plan is reached:

    1. print <<'END';
    2. ok 1
    3. ok 2
    4. 1..1
    5. END
     
    TAP::Parser::Result::Unknown

    NAME

    TAP::Parser::Result::Unknown - Unknown result token.

    VERSION

    Version 3.26

    DESCRIPTION

    This is a subclass of TAP::Parser::Result. A token of this class will be returned if the parser does not recognize the token line. For example:

    1. 1..5
    2. VERSION 7
    3. ok 1 - woo hooo!
    4. ... woo hooo! is cool!

    In the above "TAP", the second and fourth lines will generate "Unknown" tokens.

    OVERRIDDEN METHODS

    Mainly listed here to shut up the pitiful screams of the pod coverage tests. They keep me awake at night.

    • as_string
    • raw
     
    TAP::Parser::Result::Version

    NAME

    TAP::Parser::Result::Version - TAP syntax version token.

    VERSION

    Version 3.26

    DESCRIPTION

    This is a subclass of TAP::Parser::Result. A token of this class will be returned if a version line is encountered.

    1. TAP version 13
    2. ok 1
    3. not ok 2

    The first version of TAP to include an explicit version number is 13.

    OVERRIDDEN METHODS

    Mainly listed here to shut up the pitiful screams of the pod coverage tests. They keep me awake at night.

    • as_string
    • raw

    Instance Methods

    version

    1. if ( $result->is_version ) {
    2. print $result->version;
    3. }

    This is merely a synonym for as_string.

     
    TAP::Parser::Result::YAML

    NAME

    TAP::Parser::Result::YAML - YAML result token.

    VERSION

    Version 3.26

    DESCRIPTION

    This is a subclass of TAP::Parser::Result. A token of this class will be returned if a YAML block is encountered.

    1. 1..1
    2. ok 1 - woo hooo!

    1..1 is the plan. Gotta have a plan.

    OVERRIDDEN METHODS

    Mainly listed here to shut up the pitiful screams of the pod coverage tests. They keep me awake at night.

    • as_string
    • raw

    Instance Methods

    data

    1. if ( $result->is_yaml ) {
    2. print $result->data;
    3. }

    Return the parsed YAML data for this result.

     
    TAP::Parser::Iterator::Array

    NAME

    TAP::Parser::Iterator::Array - Iterator for array-based TAP sources

    VERSION

    Version 3.26

    SYNOPSIS

    use TAP::Parser::Iterator::Array;
    my @data = ('foo', 'bar', 'baz');
    my $it = TAP::Parser::Iterator::Array->new(\@data);
    my $line = $it->next;

    DESCRIPTION

    This is a simple iterator wrapper for arrays of scalar content, used by TAP::Parser. Unless you're writing a plugin or subclassing, you probably won't need to use this module directly.

    METHODS

    Class Methods

    new

    Create an iterator. Takes one argument: an $array_ref

    Instance Methods

    next

    Iterate through it, of course.

    next_raw

    Iterate raw input without applying any fixes for quirky input syntax.

    wait

    Get the wait status for this iterator. For an array iterator this will always be zero.

    exit

    Get the exit status for this iterator. For an array iterator this will always be zero.
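
A short sketch of draining an array iterator and then reading its exit status; the TAP lines are invented. The status is read only after the iterator is exhausted, which is when it is meaningful.

```perl
use strict;
use warnings;
use TAP::Parser::Iterator::Array;

my @data = ( '1..2', 'ok 1', 'ok 2' );    # made-up TAP lines
my $it   = TAP::Parser::Iterator::Array->new( \@data );

# next returns undef once the array is exhausted.
my @seen;
while ( defined( my $line = $it->next ) ) {
    push @seen, $line;
}

print "read ", scalar(@seen), " lines\n";
print "exit status: ", $it->exit, "\n";
```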

    ATTRIBUTION

    Originally ripped off from Test::Harness.

    SEE ALSO

    TAP::Object, TAP::Parser, TAP::Parser::Iterator

     
    TAP::Parser::Iterator::Process

    NAME

    TAP::Parser::Iterator::Process - Iterator for process-based TAP sources

    VERSION

    Version 3.26

    SYNOPSIS

    1. use TAP::Parser::Iterator::Process;
    2. my %args = (
    3. command => ['python', 'setup.py', 'test'],
    4. merge => 1,
    5. setup => sub { ... },
    6. teardown => sub { ... },
    7. );
    8. my $it = TAP::Parser::Iterator::Process->new(\%args);
    9. my $line = $it->next;

    DESCRIPTION

    This is a simple iterator wrapper for executing external processes, used by TAP::Parser. Unless you're writing a plugin or subclassing, you probably won't need to use this module directly.

    METHODS

    Class Methods

    new

    Create an iterator. Expects one argument containing a hashref of the form:

    1. command => \@command_to_execute
    2. merge => $attempt_merge_stderr_and_stdout?
    3. setup => $callback_to_setup_command
    4. teardown => $callback_to_teardown_command

    Tries to use IPC::Open3 and IO::Select to communicate with the spawned process if they are available. Falls back onto open().

    Instance Methods

    next

    Return the next line of the process output, or undef once the output is exhausted.

    next_raw

    Iterate raw input without applying any fixes for quirky input syntax.

    wait

    Get the wait status for this iterator's process.

    exit

    Get the exit status for this iterator's process.

    handle_unicode

    Upgrade the input stream to handle UTF8.

    get_select_handles

    Return a list of filehandles that may be used upstream in a select() call to signal that this Iterator is ready. Iterators that are not handle based should return an empty list.
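    As a rough illustration of the constructor and instance methods described above, the following sketch spawns the running perl ($^X) as a stand-in TAP producer. The one-line child program and its output are assumptions made up for this example; any command that prints TAP could be used instead.

    ```perl
    use strict;
    use warnings;
    use TAP::Parser::Iterator::Process;

    # $^X is the perl binary running this script; the child program is a
    # hypothetical one-liner that prints a tiny TAP stream.
    my $it = TAP::Parser::Iterator::Process->new(
        {   command => [ $^X, '-e', 'print "1..1\nok 1 - demo\n"' ],
            merge   => 1,    # ask for STDERR to be merged into STDOUT
        }
    );

    # Drain the output of the child process line by line.
    my @lines;
    while ( defined( my $line = $it->next ) ) {
        push @lines, $line;
    }
    print "$_\n" for @lines;

    # The statuses are read after the process output has been consumed.
    my $exit = $it->exit;
    my $wait = $it->wait;
    print "wait=$wait exit=$exit\n";
    ```

    The setup and teardown callbacks are omitted here because they are optional in this sketch; the SYNOPSIS shows where they would go.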

    ATTRIBUTION

    Originally ripped off from Test::Harness.

    SEE ALSO

    TAP::Object, TAP::Parser, TAP::Parser::Iterator

     
    TAP::Parser::Iterator::Stream

    NAME

    TAP::Parser::Iterator::Stream - Iterator for filehandle-based TAP sources

    VERSION

    Version 3.26

    SYNOPSIS

        use TAP::Parser::Iterator::Stream;
        # Check the result of open() so a missing file is reported.
        open( my $fh, '<', 'test.tap' ) or die "Cannot open test.tap: $!";
        my $it   = TAP::Parser::Iterator::Stream->new($fh);
        my $line = $it->next;

    DESCRIPTION

    This is a simple iterator wrapper for reading from filehandles, used by TAP::Parser. Unless you're writing a plugin or subclassing, you probably won't need to use this module directly.

    METHODS

    Class Methods

    new

    Create an iterator. Expects one argument containing a filehandle.

    Instance Methods

    next

    Return the next line read from the filehandle, or undef at end of input.

    next_raw

    Iterate raw input without applying any fixes for quirky input syntax.

    wait

    Get the wait status for this iterator. Always returns zero.

    exit

    Get the exit status for this iterator. Always returns zero.
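    The stream iterator can be tried without a file on disk. The sketch below assumes that an in-memory filehandle is an acceptable stand-in for a real one; the TAP text in $tap is made up for illustration.

    ```perl
    use strict;
    use warnings;
    use TAP::Parser::Iterator::Stream;

    # Hypothetical TAP text, read through an in-memory filehandle.
    my $tap = "1..2\nok 1\nok 2\n";
    open( my $fh, '<', \$tap ) or die "Cannot open in-memory handle: $!";

    my $it = TAP::Parser::Iterator::Stream->new($fh);

    # Count and print each line until the iterator returns undef.
    my $count = 0;
    while ( defined( my $line = $it->next ) ) {
        $count++;
        print "line $count: $line\n";
    }
    ```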

    ATTRIBUTION

    Originally ripped off from Test::Harness.

    SEE ALSO

    TAP::Object, TAP::Parser, TAP::Parser::Iterator

     
    TAP::Formatter::Base

    NAME

    TAP::Formatter::Base - Base class for harness output delegates

    VERSION

    Version 3.26

    DESCRIPTION

    This provides console orientated output formatting for TAP::Harness.

    SYNOPSIS

        use TAP::Formatter::Console;
        my $harness = TAP::Formatter::Console->new( \%args );

    METHODS

    Class Methods

    new

        my %args = (
            verbose => 1,
        );
        my $harness = TAP::Formatter::Console->new( \%args );

    The constructor returns a new TAP::Formatter::Console object. If a TAP::Harness is created with no formatter, a TAP::Formatter::Console is automatically created. If any of the following options were given to TAP::Harness->new, they will be passed to this constructor, which accepts an optional hashref whose allowed keys are:

    • verbosity

      Set the verbosity level.

    • verbose

      Print individual test results to STDOUT.

    • timer

      Append run time for each test to output. Uses Time::HiRes if available.

    • failures

      Show test failures (this is a no-op if verbose is selected).

    • comments

      Show test comments (this is a no-op if verbose is selected).

    • quiet

      Suppress some test output (mostly failures while tests are running).

    • really_quiet

      Suppress everything but the tests summary.

    • silent

      Suppress all output.

    • errors

      If parse errors are found in the TAP output, a note of this will be made in the summary report. To see all of the parse errors, set this argument to true:

          errors => 1

    • directives

      If set to a true value, only test results with directives will be displayed. This overrides other settings such as verbose , failures , or comments .

    • stdout

      A filehandle for catching standard output.

    • color

      If defined specifies whether color output is desired. If color is not defined it will default to color output if color support is available on the current platform and output is not being redirected.

    • jobs

      The number of concurrent jobs this formatter will handle.

    • show_count

      Boolean value. If false, disables the X/Y test count which shows up while tests are running.

    Any keys for which the value is undef will be ignored.

    prepare

    Called by Test::Harness before any test output is generated.

    This is an advisory and may not be called in the case where tests are being supplied to Test::Harness by an iterator.

    open_test

    Called to create a new test session. A test session looks like this:

        my $session = $formatter->open_test( $test, $parser );
        while ( defined( my $result = $parser->next ) ) {
            $session->result($result);
            exit 1 if $result->is_bailout;
        }
        $session->close_test;

    summary

        $harness->summary( $aggregate );

    summary prints the summary report after all tests are run. The first argument is an aggregate to summarise. An optional second argument may be set to a true value to indicate that the summary is being output as a result of an interrupted test run.
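    The open_test session loop described above can be driven by hand. The following is a sketch under several assumptions rather than the way TAP::Harness itself wires things up: the test name demo.t is hypothetical, the TAP text is made up, the parser is fed that text through its tap argument, output is captured via the stdout key, and a verbosity of 1 is assumed to select verbose output.

    ```perl
    use strict;
    use warnings;
    use TAP::Parser;
    use TAP::Formatter::Console;

    # Capture formatter output in a string instead of the terminal.
    my $captured = '';
    open( my $out, '>', \$captured ) or die "Cannot open in-memory handle: $!";

    my $formatter = TAP::Formatter::Console->new(
        {   verbosity  => 1,       # assumed to mean "verbose"
            stdout     => $out,    # filehandle that receives the output
            jobs       => 1,       # a single, non-parallel session
            show_count => 0,       # no running X/Y counter
        }
    );

    my $name   = 'demo.t';         # hypothetical test name
    my $parser = TAP::Parser->new( { tap => "1..2\nok 1\nok 2\n" } );

    # prepare() is normally called by the harness before any output.
    $formatter->prepare($name);

    my $session = $formatter->open_test( $name, $parser );
    while ( defined( my $result = $parser->next ) ) {
        $session->result($result);
        last if $result->is_bailout;
    }
    $session->close_test;

    close $out or die "Cannot close in-memory handle: $!";
    print $captured;
    ```

    Capturing the output in a string makes it possible to inspect what the formatter wrote; in normal use the formatter writes to STDOUT and the harness creates the session.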

     
    TAP::Formatter::Color

    NAME

    TAP::Formatter::Color - Run Perl test scripts with color

    VERSION

    Version 3.26

    DESCRIPTION

    Note that this harness is experimental. You may not like the colors I've chosen and I haven't yet provided an easy way to override them.

    This test harness is the same as TAP::Harness, but test results are output in color. Passing tests are printed in green. Failing tests are in red. Skipped tests are blue on a white background and TODO tests are printed in white.

    If Term::ANSIColor cannot be found (or Win32::Console if running under Windows) tests will be run without color.

    SYNOPSIS

        use TAP::Formatter::Color;
        my $harness = TAP::Formatter::Color->new( \%args );
        $harness->runtests(@tests);

    METHODS

    Class Methods

    new

    The constructor returns a new TAP::Formatter::Color object. If Term::ANSIColor is not installed, returns undef.

    can_color

        TAP::Formatter::Color->can_color()

    Returns a boolean indicating whether or not this module can actually generate colored output. This will be false if it could not load the modules needed for the current platform.
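    A caller might check can_color before deciding how to present results. This is a small sketch; the messages printed are made up for illustration.

    ```perl
    use strict;
    use warnings;
    use TAP::Formatter::Color;

    # Normalise the boolean to 1 or 0 so it is easy to print and compare.
    my $can_color = TAP::Formatter::Color->can_color ? 1 : 0;

    print $can_color
        ? "Colored output is available on this platform.\n"
        : "Falling back to plain output.\n";
    ```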

    set_color

    Set the output color.

     
    TAP::Formatter::Console

    NAME

    TAP::Formatter::Console - Harness output delegate for default console output

    VERSION

    Version 3.26

    DESCRIPTION

    This provides console orientated output formatting for TAP::Harness.

    SYNOPSIS

        use TAP::Formatter::Console;
        my $harness = TAP::Formatter::Console->new( \%args );

    open_test

    See TAP::Formatter::Base

     
    TAP::Formatter::File

    NAME

    TAP::Formatter::File - Harness output delegate for file output

    VERSION

    Version 3.26

    DESCRIPTION

    This provides file orientated output formatting for TAP::Harness.

    SYNOPSIS

        use TAP::Formatter::File;
        my $harness = TAP::Formatter::File->new( \%args );

    open_test

    See TAP::Formatter::Base

     
    TAP::Formatter::Session

    NAME

    TAP::Formatter::Session - Abstract base class for harness output delegate

    VERSION

    Version 3.26

    METHODS

    Class Methods

    new

        my %args = (
            formatter => $self,
        );
        my $harness = TAP::Formatter::Console::Session->new( \%args );

    The constructor returns a new TAP::Formatter::Console::Session object.

    • formatter
    • parser
    • name
    • show_count

    header

    Output test preamble.

    result

    Called by the harness for each line of TAP it receives.

    close_test

    Called to close a test session.

    clear_for_close

    Called by close_test to clear the line showing test progress, or the parallel test ruler, prior to printing the final test result.

     
    TAP::Formatter::File::Session

    NAME

    TAP::Formatter::File::Session - Harness output delegate for file output

    VERSION

    Version 3.26

    DESCRIPTION

    This provides file orientated output formatting for TAP::Harness. It is particularly important when running with parallel tests, as it ensures that test results are not interleaved, even when run verbosely.

    METHODS

    result

    Stores results for later output, all together.

    close_test

    When the test file finishes, outputs the summary, together.

     
    TAP::Formatter::Console::ParallelSession

    NAME

    TAP::Formatter::Console::ParallelSession - Harness output delegate for parallel console output

    VERSION

    Version 3.26

    DESCRIPTION

    This provides console orientated output formatting for TAP::Harness when run with multiple jobs.

    SYNOPSIS

    METHODS

    Class Methods

    header

    Output test preamble.

    result

    Called by the harness for each line of TAP it receives.

    clear_for_close

    close_test

     
    TAP::Formatter::Console::Session

    NAME

    TAP::Formatter::Console::Session - Harness output delegate for default console output

    VERSION

    Version 3.26

    DESCRIPTION

    This provides console orientated output formatting for TAP::Harness.

    clear_for_close

    close_test

    header

    result

     
    Sys::Hostname

    NAME

    Sys::Hostname - Try every conceivable way to get hostname

    SYNOPSIS

        use Sys::Hostname;
        $host = hostname;

    DESCRIPTION

    Attempts several methods of getting the system hostname and then caches the result. It tries the first available of the C library's gethostname(), `$Config{aphostname}`, uname(2), syscall(SYS_gethostname), `hostname`, `uname -n`, and the file /com/host. If all that fails, it croaks.

    All NULs, returns, and newlines are removed from the result.
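    Because hostname croaks when every method fails, a caller that wants to keep running can wrap the call in eval. This is a minimal sketch; the messages are made up for illustration.

    ```perl
    use strict;
    use warnings;
    use Sys::Hostname;

    # hostname() croaks on failure, so trap the exception with eval.
    my $host = eval { hostname() };

    if ( defined $host ) {
        print "Running on $host\n";
    }
    else {
        warn "Could not determine hostname: $@";
    }
    ```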

    AUTHOR

    David Sundstrom <sunds@asictest.sc.ti.com>

    Texas Instruments

    XS code added by Greg Bacon <gbacon@cs.uah.edu>

     
    Sys::Syslog

    NAME

    Sys::Syslog - Perl interface to the UNIX syslog(3) calls

    VERSION

    This is the documentation of version 0.32

    SYNOPSIS

        use Sys::Syslog;                        # all except setlogsock()
        use Sys::Syslog qw(:standard :macros);  # standard functions & macros

        openlog($ident, $logopt, $facility);    # don't forget this
        syslog($priority, $format, @args);
        $oldmask = setlogmask($mask_priority);
        closelog();

    DESCRIPTION

    Sys::Syslog is an interface to the UNIX syslog(3) library function. Call syslog() with a string priority and a list of printf() arguments, just as you would call syslog(3).

    EXPORTS

    Sys::Syslog exports the following Exporter tags:

    • :standard exports the standard syslog(3) functions:

          openlog closelog setlogmask syslog

    • :extended exports the Perl-specific functions for syslog(3):

          setlogsock

    • :macros exports the symbols corresponding to most of your syslog(3) macros and the LOG_UPTO() and LOG_MASK() functions. See CONSTANTS for the supported constants and their meaning.

    By default, Sys::Syslog exports the symbols from the :standard tag.

    FUNCTIONS

    • openlog($ident, $logopt, $facility)

      Opens the syslog. $ident is prepended to every message. $logopt contains zero or more of the options detailed below. $facility specifies the part of the system to report about, for example LOG_USER or LOG_LOCAL0 : see Facilities for a list of well-known facilities, and your syslog(3) documentation for the facilities available in your system. Check SEE ALSO for useful links. Facility can be given as a string or a numeric macro.

      This function will croak if it can't connect to the syslog daemon.

      Note that openlog() now takes three arguments, just like openlog(3) .

      You should use openlog() before calling syslog() .

      Options

      • cons - This option is ignored, since the failover mechanism will drop down to the console automatically if all other media fail.

      • ndelay - Open the connection immediately (normally, the connection is opened when the first message is logged).

      • noeol - When set to true, no end of line character (\n ) will be appended to the message. This can be useful for some buggy syslog daemons.

      • nofatal - When set to true, openlog() and syslog() will only emit warnings instead of dying if the connection to the syslog can't be established.

      • nonul - When set to true, no NUL character (\0 ) will be appended to the message. This can be useful for some buggy syslog daemons.

      • nowait - Don't wait for child processes that may have been created while logging the message. (The GNU C library does not create a child process, so this option has no effect on Linux.)

      • perror - Write the message to standard error output as well as to the system log (added in Sys::Syslog 0.22).

      • pid - Include PID with each message.

      Examples

      Open the syslog with options ndelay and pid , and with facility LOCAL0 :

          openlog($name, "ndelay,pid", "local0");

      Same thing, but this time using the macro corresponding to LOCAL0 :

          openlog($name, "ndelay,pid", LOG_LOCAL0);

    • syslog($priority, $message)
    • syslog($priority, $format, @args)

      If $priority permits, logs $message or sprintf($format, @args) with the addition that %m in $message or $format is replaced with "$!" (the latest error message).

      $priority can specify a level, or a level and a facility. Levels and facilities can be given as strings or as macros. When using the eventlog mechanism, priorities DEBUG and INFO are mapped to event type informational, NOTICE and WARNING to warning, and ERR through EMERG to error.

      If you didn't use openlog() before using syslog() , syslog() will try to guess the $ident by extracting the shortest prefix of $format that ends in a ":" .

      Examples

          # informational level
          syslog("info", $message);
          syslog(LOG_INFO, $message);

          # informational level, Local0 facility
          syslog("info|local0", $message);
          syslog(LOG_INFO|LOG_LOCAL0, $message);

      • Note

        Sys::Syslog version v0.07 and older passed the $message as the formatting string to sprintf() even when no formatting arguments were provided. If the code calling syslog() might execute with older versions of this module, make sure to call the function as syslog($priority, "%s", $message) instead of syslog($priority, $message) . This protects against hostile formatting sequences that might show up if $message contains tainted data.

    • setlogmask($mask_priority)

      Sets the log mask for the current process to $mask_priority and returns the old mask. If the mask argument is 0, the current log mask is not modified. See Levels for the list of available levels. You can use the LOG_UPTO() function to allow all levels up to a given priority (but it only accepts the numeric macros as arguments).

      Examples

      Only log errors:

          setlogmask( LOG_MASK(LOG_ERR) );

      Log everything except informational messages:

          setlogmask( ~(LOG_MASK(LOG_INFO)) );

      Log critical messages, errors and warnings:

          setlogmask( LOG_MASK(LOG_CRIT)
                    | LOG_MASK(LOG_ERR)
                    | LOG_MASK(LOG_WARNING) );

      Log all messages up to debug:

          setlogmask( LOG_UPTO(LOG_DEBUG) );

    • setlogsock()

      Sets the socket type and options to be used for the next call to openlog() or syslog() . Returns true on success, undef on failure.

      Being Perl-specific, this function has evolved over time. It can currently be called as follows:

      • setlogsock($sock_type)

      • setlogsock($sock_type, $stream_location) (added in Perl 5.004_02)

      • setlogsock($sock_type, $stream_location, $sock_timeout) (added in Sys::Syslog 0.25)

      • setlogsock(\%options) (added in Sys::Syslog 0.28)

      The available options are:

      • type - equivalent to $sock_type , selects the socket type (or "mechanism"). An array reference can be passed to specify several mechanisms to try, in the given order.

      • path - equivalent to $stream_location , sets the stream location. Defaults to standard Unix location, or _PATH_LOG .

      • timeout - equivalent to $sock_timeout , sets the socket timeout in seconds. Defaults to 0 on all systems except Mac OS X where it is set to 0.25 sec.

      • host - sets the hostname to send the messages to. Defaults to the local host.

      • port - sets the TCP or UDP port to connect to. Defaults to the first standard syslog port available on the system.

      The available mechanisms are:

      • "native" - use the native C functions from your syslog(3) library (added in Sys::Syslog 0.15).

      • "eventlog" - send messages to the Win32 events logger (Win32 only; added in Sys::Syslog 0.19).

      • "tcp" - connect to a TCP socket, on the syslog/tcp or syslogng/tcp service. See also the host , port and timeout options.

      • "udp" - connect to a UDP socket, on the syslog/udp service. See also the host , port and timeout options.

      • "inet" - connect to an INET socket, either TCP or UDP, tried in that order. See also the host , port and timeout options.

      • "unix" - connect to a UNIX domain socket (in some systems a character special device). The name of that socket is given by the path option or, if omitted, the value returned by the _PATH_LOG macro (if your system defines it), /dev/log or /dev/conslog, whichever is writable.

      • "stream" - connect to the stream indicated by the path option, or, if omitted, the value returned by the _PATH_LOG macro (if your system defines it), /dev/log or /dev/conslog, whichever is writable. For example, Solaris and IRIX systems may prefer "stream" instead of "unix".

      • "pipe" - connect to the named pipe indicated by the path option, or, if omitted, to the value returned by the _PATH_LOG macro (if your system defines it), or /dev/log (added in Sys::Syslog 0.21). HP-UX is a system which uses such a named pipe.

      • "console" - send messages directly to the console, as for the "cons" option of openlog() .

      The default is to try native , tcp , udp , unix , pipe, stream , console . Under systems with the Win32 API, eventlog will be added as the first mechanism to try if Win32::EventLog is available.

      Giving an invalid value for $sock_type will croak .

      Examples

      Select the UDP socket mechanism:

          setlogsock("udp");

      Send messages using the TCP socket mechanism on a custom port:

          setlogsock({ type => "tcp", port => 2486 });

      Send messages to a remote host using the TCP socket mechanism:

          setlogsock({ type => "tcp", host => $loghost });

      Try the native, UDP socket then UNIX domain socket mechanisms:

          setlogsock(["native", "udp", "unix"]);

      • Note

        Now that the "native" mechanism is supported by Sys::Syslog and selected by default, the use of the setlogsock() function is discouraged because other mechanisms are less portable across operating systems. Authors of modules and programs that use this function, especially its cargo-cult form setlogsock("unix"), are advised to remove any occurrence of it unless they specifically want to use a given mechanism (like TCP or UDP to connect to a remote host).

    • closelog()

      Closes the log file and returns true on success.
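    Putting the functions above together, a typical logging sequence might look like the following sketch. The identifier demo-script and the message texts are hypothetical; the nofatal option is used so that a missing syslog connection produces a warning instead of an exception, and the "%s" format guards against formatting sequences hidden in message data, as recommended in the note on older versions.

    ```perl
    use strict;
    use warnings;
    use Sys::Syslog qw(:standard :macros);

    my $ident = 'demo-script';    # hypothetical program name

    # eval traps any exception that still escapes despite "nofatal".
    my $logged = eval {
        openlog( $ident, 'ndelay,pid,nofatal', LOG_USER );

        # Pass untrusted text as an argument, never as the format itself.
        syslog( LOG_INFO, '%s', 'starting up' );

        # printf-style formatting with explicit arguments.
        syslog( LOG_WARNING, 'processed %d of %d records', 3, 10 );

        closelog();
        1;
    };
    warn "logging failed: $@" unless $logged;
    ```

    Calling openlog first and closelog last follows the order the rules in the next section ask for.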

    THE RULES OF SYS::SYSLOG

    The First Rule of Sys::Syslog is: You do not call setlogsock .

    The Second Rule of Sys::Syslog is: You do not call setlogsock .

    The Third Rule of Sys::Syslog is: The program crashes, dies, calls closelog , the log is over.

    The Fourth Rule of Sys::Syslog is: One facility, one priority.

    The Fifth Rule of Sys::Syslog is: One log at a time.

    The Sixth Rule of Sys::Syslog is: No syslog before openlog .

    The Seventh Rule of Sys::Syslog is: Logs will go on as long as they have to.

    The Eighth, and Final Rule of Sys::Syslog is: If this is your first use of Sys::Syslog, you must read the doc.

    EXAMPLES

    An example:

        openlog($program, 'cons,pid', 'user');
        syslog('info', '%s', 'this is another test');
        syslog('mail|warning', 'this is a better test: %d', time);
        closelog();

        syslog('debug', 'this is the last test');

    Another example:

        openlog("$program $$", 'ndelay', 'user');
        syslog('notice', 'fooprogram: this is really done');

    Example of use of %m:

        $! = 55;
        syslog('info', 'problem was %m');   # %m == $! in syslog(3)

    Log to UDP port on $remotehost instead of logging locally:

        setlogsock("udp", $remotehost);
        openlog($program, 'ndelay', 'user');
        syslog('info', 'something happened over here');

    CONSTANTS

    Facilities

    • LOG_AUDIT - audit daemon (IRIX); falls back to LOG_AUTH

    • LOG_AUTH - security/authorization messages

    • LOG_AUTHPRIV - security/authorization messages (private)

    • LOG_CONSOLE - /dev/console output (FreeBSD); falls back to LOG_USER

    • LOG_CRON - clock daemons (cron and at)

    • LOG_DAEMON - system daemons without separate facility value

    • LOG_FTP - FTP daemon

    • LOG_KERN - kernel messages

    • LOG_INSTALL - installer subsystem (Mac OS X); falls back to LOG_USER

    • LOG_LAUNCHD - launchd - general bootstrap daemon (Mac OS X); falls back to LOG_DAEMON

    • LOG_LFMT - logalert facility; falls back to LOG_USER

    • LOG_LOCAL0 through LOG_LOCAL7 - reserved for local use

    • LOG_LPR - line printer subsystem

    • LOG_MAIL - mail subsystem

    • LOG_NETINFO - NetInfo subsystem (Mac OS X); falls back to LOG_DAEMON

    • LOG_NEWS - USENET news subsystem

    • LOG_NTP - NTP subsystem (FreeBSD, NetBSD); falls back to LOG_DAEMON

    • LOG_RAS - Remote Access Service (VPN / PPP) (Mac OS X); falls back to LOG_AUTH

    • LOG_REMOTEAUTH - remote authentication/authorization (Mac OS X); falls back to LOG_AUTH

    • LOG_SECURITY - security subsystems (firewalling, etc.) (FreeBSD); falls back to LOG_AUTH

    • LOG_SYSLOG - messages generated internally by syslogd

    • LOG_USER (default) - generic user-level messages

    • LOG_UUCP - UUCP subsystem

    Levels

    • LOG_EMERG - system is unusable

    • LOG_ALERT - action must be taken immediately

    • LOG_CRIT - critical conditions

    • LOG_ERR - error conditions

    • LOG_WARNING - warning conditions

    • LOG_NOTICE - normal, but significant, condition

    • LOG_INFO - informational message

    • LOG_DEBUG - debug-level message
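    The level constants combine with LOG_MASK() and LOG_UPTO() to build the argument for setlogmask(). The sketch below prints the masks rather than assuming their numeric values, which come from the system's syslog.h; the variable names are made up for illustration.

    ```perl
    use strict;
    use warnings;
    use Sys::Syslog qw(:standard :macros);

    # A mask with a single level, and a mask covering every level from
    # LOG_EMERG up to and including LOG_ERR.
    my $only_err = LOG_MASK(LOG_ERR);
    my $upto_err = LOG_UPTO(LOG_ERR);

    printf "LOG_MASK(LOG_ERR) = 0x%02x\n", $only_err;
    printf "LOG_UPTO(LOG_ERR) = 0x%02x\n", $upto_err;

    # LOG_ERR is inside the "up to" mask, the less severe LOG_WARNING is not.
    my $has_err     = ( $upto_err & LOG_MASK(LOG_ERR) )     ? 1 : 0;
    my $has_warning = ( $upto_err & LOG_MASK(LOG_WARNING) ) ? 1 : 0;
    print "includes LOG_ERR: $has_err, includes LOG_WARNING: $has_warning\n";

    # Install the mask, then restore the previous one.
    my $old_mask = setlogmask($upto_err);
    setlogmask($old_mask);
    ```

    Restoring the old mask at the end keeps the sketch from changing what the rest of a program would log.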

    DIAGNOSTICS

    • Invalid argument passed to setlogsock

      (F) You gave setlogsock() an invalid value for $sock_type .

    • eventlog passed to setlogsock, but no Win32 API available

      (W) You asked setlogsock() to use the Win32 event logger but the operating system running the program isn't Win32 or does not provide Win32 compatible facilities.

    • no connection to syslog available

      (F) syslog() failed to connect to the specified socket.

    • stream passed to setlogsock, but %s is not writable

      (W) You asked setlogsock() to use a stream socket, but the given path is not writable.

    • stream passed to setlogsock, but could not find any device

      (W) You asked setlogsock() to use a stream socket, but didn't provide a path, and Sys::Syslog was unable to find an appropriate one.

    • tcp passed to setlogsock, but tcp service unavailable

      (W) You asked setlogsock() to use a TCP socket, but the service is not available on the system.

    • syslog: expecting argument %s

      (F) You forgot to give syslog() the indicated argument.

    • syslog: invalid level/facility: %s

      (F) You specified an invalid level or facility.

    • syslog: too many levels given: %s

      (F) You specified too many levels.

    • syslog: too many facilities given: %s

      (F) You specified too many facilities.

    • syslog: level must be given

      (F) You forgot to specify a level.

    • udp passed to setlogsock, but udp service unavailable

      (W) You asked setlogsock() to use a UDP socket, but the service is not available on the system.

    • unix passed to setlogsock, but path not available

      (W) You asked setlogsock() to use a UNIX socket, but Sys::Syslog was unable to find an appropriate device.

    HISTORY

    Sys::Syslog is a core module, part of the standard Perl distribution since 1990. At that time, modules as we know them didn't exist; the Perl library was a collection of .pl files, and the one for sending syslog messages was simply lib/syslog.pl, included with Perl 3.0. It was converted to a module with Perl 5.0, but had a version number only starting with Perl 5.6. Here is a small table with the matching Perl and Sys::Syslog versions.

        Sys::Syslog     Perl
        -----------     ----
           undef        5.0.0 ~ 5.5.4
           0.01         5.6.*
           0.03         5.8.0
           0.04         5.8.1, 5.8.2, 5.8.3
           0.05         5.8.4, 5.8.5, 5.8.6
           0.06         5.8.7
           0.13         5.8.8
           0.22         5.10.0
           0.27         5.8.9, 5.10.1 ~ 5.14.2
           0.29         5.16.0, 5.16.1

    SEE ALSO

    Manual Pages

    syslog(3)

    SUSv3 issue 6, IEEE Std 1003.1, 2004 edition, http://www.opengroup.org/onlinepubs/000095399/basedefs/syslog.h.html

    GNU C Library documentation on syslog, http://www.gnu.org/software/libc/manual/html_node/Syslog.html

    Solaris 10 documentation on syslog, http://docs.sun.com/app/docs/doc/816-5168/syslog-3c?a=view

    Mac OS X documentation on syslog, http://developer.apple.com/documentation/Darwin/Reference/ManPages/man3/syslog.3.html

    IRIX 6.5 documentation on syslog, http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=0650&db=man&fname=3c+syslog

    AIX 5L 5.3 documentation on syslog, http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.basetechref/doc/basetrf2/syslog.htm

    HP-UX 11i documentation on syslog, http://docs.hp.com/en/B2355-60130/syslog.3C.html

    Tru64 5.1 documentation on syslog, http://h30097.www3.hp.com/docs/base_doc/DOCUMENTATION/V51_HTML/MAN/MAN3/0193____.HTM

    Stratus VOS 15.1, http://stratadoc.stratus.com/vos/15.1.1/r502-01/wwhelp/wwhimpl/js/html/wwhelp.htm?context=r502-01&file=ch5r502-01bi.html

    RFCs

    RFC 3164 - The BSD syslog Protocol, http://www.faqs.org/rfcs/rfc3164.html -- Please note that this is an informational RFC, and therefore does not specify a standard of any kind.

    RFC 3195 - Reliable Delivery for syslog, http://www.faqs.org/rfcs/rfc3195.html

    Articles

    Syslogging with Perl, http://lexington.pm.org/meetings/022001.html

    Event Log

    Windows Event Log, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wes/wes/windows_event_log.asp

    AUTHORS & ACKNOWLEDGEMENTS

    Tom Christiansen <tchrist (at) perl.com> and Larry Wall <larry (at) wall.org>.

    UNIX domain sockets added by Sean Robinson <robinson_s (at) sc.maricopa.edu> with support from Tim Bunce <Tim.Bunce (at) ig.co.uk> and the perl5-porters mailing list.

    Dependency on syslog.ph replaced with XS code by Tom Hughes <tom (at) compton.nu>.

    Code for constant() s regenerated by Nicholas Clark <nick (at) ccl4.org>.

    Failover to different communication modes by Nick Williams <Nick.Williams (at) morganstanley.com>.

    Extracted from core distribution for publishing on the CPAN by Sébastien Aperghis-Tramoni <sebastien (at) aperghis.net>.

    XS code for using native C functions borrowed from Unix::Syslog, written by Marcus Harnisch <marcus.harnisch (at) gmx.net>.

    Yves Orton suggested and helped for making Sys::Syslog use the native event logger under Win32 systems.

    Jerry D. Hedden and Reini Urban provided greatly appreciated help to debug and polish Sys::Syslog under Cygwin.

    BUGS

    Please report any bugs or feature requests to bug-sys-syslog (at) rt.cpan.org, or through the web interface at http://rt.cpan.org/Public/Dist/Display.html?Name=Sys-Syslog. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

    SUPPORT

    You can find documentation for this module with the perldoc command.

        perldoc Sys::Syslog

    You can also look for information at:

    COPYRIGHT

    Copyright (C) 1990-2012 by Larry Wall and others.

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Search::Dict

    NAME

    Search::Dict - look - search for key in dictionary file

    SYNOPSIS

        use Search::Dict;
        look *FILEHANDLE, $key, $dict, $fold;

        use Search::Dict;
        look *FILEHANDLE, $key, $params;

    DESCRIPTION

    Sets the file position in FILEHANDLE to the start of the first line that is greater than or equal (stringwise) to $key. Returns the new file position, or -1 if an error occurs.

    The flags specify dictionary order and case folding:

    If $dict is true, search by dictionary order (ignore anything but word characters and whitespace). The default is to honour all characters.

    If $fold is true, ignore case. The default is to honour case.

    If there are only three arguments and the third argument is a hash reference, the keys of that hash can be dict, fold, and comp or xfrm (see below), and their corresponding values will be used as the parameters.

    If a comparison subroutine (comp) is defined, it must return less than zero, zero, or greater than zero, if the first comparand is less than, equal to, or greater than the second comparand.

    If a transformation subroutine (xfrm) is defined, its value is used to transform the lines read from the filehandle before their comparison.
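    As a minimal sketch of the basic call, the following writes a small sorted word list to a temporary file and positions the handle with look. The word list, the key 'c', and the use of File::Temp are assumptions made for illustration; the file has to be sorted already for the search to be meaningful.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Search::Dict;

# Build a small, already-sorted dictionary file (illustrative data).
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "$_\n" for sort qw(apple banana cherry damson elder);
close $out or die "close failed: $!";

open my $in, '<', $path or die "open failed: $!";

# Position the handle at the first line that is ge 'c'.
my $pos = look($in, 'c');
die "look failed" if $pos < 0;

# The next read returns that line.
my $first = <$in>;
chomp $first;
print "offset $pos: $first\n";
close $in;
```

    Here $first is "cherry", since "apple" and "banana" both sort before "c". A true $fold argument would make the comparison case-insensitive.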

     
    Scalar::Util

    NAME

    Scalar::Util - A selection of general-utility scalar subroutines

    SYNOPSIS

        use Scalar::Util qw(blessed dualvar isdual readonly refaddr reftype
                            tainted weaken isweak isvstring looks_like_number
                            set_prototype);
        # and other useful utils appearing below

    DESCRIPTION

    Scalar::Util contains a selection of subroutines that people have said would be nice to have in the perl core, but whose usage would not be high enough to warrant a keyword, and which are too small to justify individual extensions.

    By default Scalar::Util does not export any subroutines. The subroutines defined are

    • blessed EXPR

      If EXPR evaluates to a blessed reference the name of the package that it is blessed into is returned. Otherwise undef is returned.

          $scalar = "foo";
          $class = blessed $scalar;    # undef

          $ref = [];
          $class = blessed $ref;       # undef

          $obj = bless [], "Foo";
          $class = blessed $obj;       # "Foo"
    • dualvar NUM, STRING

      Returns a scalar that has the value NUM in a numeric context and the value STRING in a string context.

          $foo = dualvar 10, "Hello";
          $num = $foo + 2;             # 12
          $str = $foo . " world";      # Hello world
    • isdual EXPR

      If EXPR is a scalar that is a dualvar, the result is true.

          $foo = dualvar 86, "Nix";
          $dual = isdual($foo);        # true

      Note that a scalar can be made to have both string and numeric content through numeric operations:

          $foo = "10";
          $dual = isdual($foo);        # false
          $bar = $foo + 0;
          $dual = isdual($foo);        # true

      Note that although $! appears to be a dual-valued variable, it is actually implemented using a tied scalar:

          $! = 1;
          print("$!\n");               # "Operation not permitted"
          $dual = isdual($!);          # false

      You can capture its numeric and string content using:

          $err = dualvar $!, $!;
          $dual = isdual($err);        # true
    • isvstring EXPR

      If EXPR is a scalar which was coded as a vstring the result is true.

          $vs = v49.46.48;
          $fmt = isvstring($vs) ? "%vd" : "%s";   # true
          printf($fmt,$vs);
    • looks_like_number EXPR

      Returns true if perl thinks EXPR is a number. See looks_like_number in perlapi.

    • openhandle FH

      Returns FH if FH may be used as a filehandle and is open, or FH is a tied handle. Otherwise undef is returned.

          $fh = openhandle(*STDIN);        # \*STDIN
          $fh = openhandle(\*STDIN);       # \*STDIN
          $fh = openhandle(*NOTOPEN);      # undef
          $fh = openhandle("scalar");      # undef
    • readonly SCALAR

      Returns true if SCALAR is readonly.

          sub foo { readonly($_[0]) }

          $readonly = foo($bar);       # false
          $readonly = foo(0);          # true
    • refaddr EXPR

      If EXPR evaluates to a reference the internal memory address of the referenced value is returned. Otherwise undef is returned.

          $addr = refaddr "string";    # undef
          $addr = refaddr \$var;       # eg 12345678
          $addr = refaddr [];          # eg 23456784

          $obj = bless {}, "Foo";
          $addr = refaddr $obj;        # eg 88123488
    • reftype EXPR

      If EXPR evaluates to a reference the type of the variable referenced is returned. Otherwise undef is returned.

          $type = reftype "string";    # undef
          $type = reftype \$var;       # SCALAR
          $type = reftype [];          # ARRAY

          $obj = bless {}, "Foo";
          $type = reftype $obj;        # HASH
    • set_prototype CODEREF, PROTOTYPE

      Sets the prototype of the given function, or deletes it if PROTOTYPE is undef. Returns the CODEREF.

          set_prototype \&foo, '$$';
    • tainted EXPR

      Returns true if the result of EXPR is tainted.

          $taint = tainted("constant");    # false
          $taint = tainted($ENV{PWD});     # true if running under -T
    • weaken REF

      REF will be turned into a weak reference. This means that it will not hold a reference count on the object it references. Also when the reference count on that object reaches zero, REF will be set to undef.

      This is useful when you want to keep copies of references but do not want to prevent the object being DESTROY-ed at its usual time.

          {
              my $var;
              $ref = \$var;
              weaken($ref);            # Make $ref a weak reference
          }
          # $ref is now undef

      Note that if you take a copy of a scalar with a weakened reference, the copy will be a strong reference.

          my $var;
          my $foo = \$var;
          weaken($foo);                # Make $foo a weak reference
          my $bar = $foo;              # $bar is now a strong reference

      This may be less obvious in other situations, such as grep(), for instance when grepping through a list of weakened references to objects that may have been destroyed already:

          @object = grep { defined } @object;

      This will indeed remove all references to destroyed objects, but the remaining references to objects will be strong, causing the remaining objects to never be destroyed because there is now always a strong reference to them in the @object array.

    • isweak EXPR

      If EXPR is a scalar which is a weak reference the result is true.

          $ref = \$foo;
          $weak = isweak($ref);        # false
          weaken($ref);
          $weak = isweak($ref);        # true

      NOTE: Copying a weak reference creates a normal, strong, reference.

          $copy = $ref;
          $weak = isweak($copy);       # false
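
    The grep() pitfall described under weaken can be sketched as follows. The array names and the hash-based "objects" are assumptions made for illustration; the point is only that the references surviving the grep have to be weakened again if the list is meant to stay weak.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# @owners holds the only strong references (illustrative data).
my @owners = map { +{ id => $_ } } 1 .. 3;

# @cache holds weak copies, so it does not keep the objects alive.
my @cache = @owners;
weaken($_) for @cache;

pop @owners;    # object 3 is destroyed; its slot in @cache becomes undef

# grep copies the surviving references, so the copies are strong again...
@cache = grep { defined } @cache;
my $strong_after_grep = !isweak($cache[0]);

# ...and have to be weakened once more to restore the original behaviour.
weaken($_) for @cache;
my $weak_again = isweak($cache[0]);

@owners = ();   # the last strong references go away
my $alive = grep { defined } @cache;   # counts entries still defined
```

    After the second weaken pass, clearing @owners leaves nothing alive in @cache, which is the behaviour the weak list was meant to have in the first place.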

    DIAGNOSTICS

    Module use may give one of the following errors during import.

    • Weak references are not implemented in the version of perl

      The version of perl that you are using does not implement weak references. To use isweak or weaken you will need to use a newer release of perl.

    • Vstrings are not implemented in the version of perl

      The version of perl that you are using does not implement Vstrings. To use isvstring you will need to use a newer release of perl.

    • NAME is only available with the XS version of Scalar::Util

      Scalar::Util contains both perl and C implementations of many of its functions so that those without access to a C compiler may still use it. However some of the functions are only available when a C compiler was available to compile the XS version of the extension.

      At present that list is: weaken, isweak, dualvar, isvstring, set_prototype

    KNOWN BUGS

    There is a bug in perl5.6.0 with UVs that are >= 1<<31. This will show up as tests 8 and 9 of dualvar.t failing.

    SEE ALSO

    List::Util

    COPYRIGHT

    Copyright (c) 1997-2007 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Except weaken and isweak which are

    Copyright (c) 1999 Tuomas J. Lukka <lukka@iki.fi>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as perl itself.

     
    Pod::Checker

    NAME

    Pod::Checker, podchecker() - check pod documents for syntax errors

    SYNOPSIS

        use Pod::Checker;

        $num_errors = podchecker($filepath, $outputpath, %options);

        my $checker = Pod::Checker->new(%options);
        $checker->parse_from_file($filepath, \*STDERR);

    OPTIONS/ARGUMENTS

    $filepath is the input POD to read and $outputpath is where to write POD syntax error messages. Either argument may be a scalar indicating a file-path, or else a reference to an open filehandle. If unspecified, the input-file defaults to \*STDIN, and the output-file defaults to \*STDERR.

    podchecker()

    This function can take a hash of options:

    • -warnings => val

      Turn warnings on/off. val is usually 1 for on, but higher values trigger additional warnings. See Warnings.

    DESCRIPTION

    podchecker will perform syntax checking of Perl5 POD format documentation.

    Curious/ambitious users are welcome to propose additional features they wish to see in Pod::Checker and podchecker and verify that the checks are consistent with perlpod.

    The following checks are currently performed:

    • Unknown '=xxxx' commands, unknown 'X<...>' interior-sequences, and unterminated interior sequences.

    • Check for proper balancing of =begin and =end. The contents of such a block are generally ignored, i.e. no syntax checks are performed.

    • Check for proper nesting and balancing of =over, =item and =back.

    • Check for identical nested interior-sequences (e.g. L<...L<...>...>).

    • Check for malformed or non-existing entities E<...>.

    • Check for correct syntax of hyperlinks L<...>. See perlpod for details.

    • Check for unresolved document-internal links. This check may also reveal misspelled links that seem to be internal links but should be links to something else.

    DIAGNOSTICS

    Errors

    • empty =headn

      A heading (=head1 or =head2) without any text? That ain't no heading!

    • =over on line N without closing =back

      The =over command does not have a corresponding =back before the next heading (=head1 or =head2) or the end of the file.

    • =item without previous =over
    • =back without previous =over

      An =item or =back command has been found outside an =over/=back block.

    • No argument for =begin

      A =begin command was found that is not followed by the formatter specification.

    • =end without =begin

      A standalone =end command was found.

    • Nested =begin's

      There were at least two consecutive =begin commands without the corresponding =end. Only one =begin may be active at a time.

    • =for without formatter specification

      There is no specification of the formatter after the =for command.

    • Apparent command =foo not preceded by blank line

      A command which has ended up in the middle of a paragraph or other command, such as

          =item one
          =item two <-- bad

    • unresolved internal link NAME

      The given link to NAME does not have a matching node in the current POD. This also happens when a single word node name is not enclosed in "".

    • Unknown command "CMD"

      An invalid POD command has been found. Valid are =head1, =head2, =head3, =head4, =over, =item, =back, =begin, =end, =for, =pod, =cut.

    • Unknown interior-sequence "SEQ"

      An invalid markup command has been encountered. Valid are: B<>, C<>, E<>, F<>, I<>, L<>, S<>, X<>, Z<>.

    • nested commands CMD<...CMD<...>...>

      Two nested identical markup commands have been found. Generally this does not make sense.

    • garbled entity STRING

      The STRING found cannot be interpreted as a character entity.

    • Entity number out of range

      An entity specified by number (dec, hex, oct) is out of range (1-255).

    • malformed link L<>

      The link found cannot be parsed because it does not conform to the syntax described in perlpod.

    • nonempty Z<>

      The Z<> sequence is supposed to be empty.

    • empty X<>

      The index entry specified contains nothing but whitespace.

    • Spurious text after =pod / =cut

      The commands =pod and =cut do not take any arguments.

    • Spurious =cut command

      A =cut command was found without a preceding POD paragraph.

    • Spurious =pod command

      A =pod command was found after a preceding POD paragraph.

    • Spurious character(s) after =back

      The =back command does not take any arguments.

    Warnings

    These may not necessarily cause trouble, but indicate mediocre style.

    • multiple occurrence of link target name

      The POD file has some =item and/or =head commands that have the same text. Potential hyperlinks to such a text cannot be unique then. This warning is printed only with warning level greater than one.

    • line containing nothing but whitespace in paragraph

      There is some whitespace on a seemingly empty line. POD is very sensitive to such things, so this is flagged. vi users can switch on the list option to avoid this problem.

    • previous =item has no contents

      There is a list =item right above the flagged line that has no text contents. You probably want to delete empty items.

    • preceding non-item paragraph(s)

      A list introduced by =over starts with a text or verbatim paragraph, but continues with =items. Move the non-item paragraph out of the =over/=back block.

    • =item type mismatch (one vs. two)

      A list started with e.g. a bullet-like =item and continued with a numbered one. This is obviously inconsistent. For most translators the type of the first =item determines the type of the list.

    • N unescaped <> in paragraph

      Angle brackets not written as E<lt> and E<gt> can potentially cause errors as they could be misinterpreted as markup commands. This is only printed when the -warnings level is greater than 1.

    • Unknown entity

      A character entity was found that does not belong to the standard ISO set or the POD specials verbar and sol.

    • No items in =over

      The list opened with =over does not contain any items.

    • No argument for =item

      =item without any parameters is deprecated. It should either be followed by * to indicate an unordered list, by a number (optionally followed by a dot) to indicate an ordered (numbered) list or simple text for a definition list.

    • empty section in previous paragraph

      The previous section (introduced by a =head command) does not contain any text. This usually indicates that something is missing. Note: A =head1 followed immediately by =head2 does not trigger this warning.

    • Verbatim paragraph in NAME section

      The NAME section (=head1 NAME ) should consist of a single paragraph with the script/module name, followed by a dash `-' and a very short description of what the thing is good for.

    • =headn without preceding higher level

      For example, if there is a =head2 in the POD file prior to a =head1.

    Hyperlinks

    There are some warnings with respect to malformed hyperlinks:

    • ignoring leading/trailing whitespace in link

      There is whitespace at the beginning or the end of the contents of L<...>.

    • (section) in '$page' deprecated

      There is a section detected in the page name of L<...>, e.g. L<passwd(2)>. POD hyperlinks may point to POD documents only. Please write C<passwd(2)> instead. Some formatters are able to expand this to appropriate code. For links to (builtin) functions, please say L<perlfunc/mkdir>, without ().

    • alternative text/node '%s' contains non-escaped | or /

      The characters | and / are special in the L<...> context. Although the hyperlink parser does its best to determine which "/" is text and which is a delimiter in case of doubt, one ought to escape these literal characters like this:

          /    E<sol>
          |    E<verbar>

    RETURN VALUE

    podchecker returns the number of POD syntax errors found or -1 if there were no POD commands at all found in the file.
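
    A minimal sketch of interpreting the return value follows. The helper count_pod_errors, the sample POD strings, and the use of File::Temp are assumptions made for illustration; only podchecker itself comes from this module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Pod::Checker;

# Hypothetical helper: write $pod to a temporary file, run podchecker on it,
# and return the reported error count. Messages go to a second temp file.
sub count_pod_errors {
    my ($pod) = @_;

    my ($in, $in_path) = tempfile(UNLINK => 1);
    print {$in} $pod;
    close $in or die "close failed: $!";

    my ($log, $log_path) = tempfile(UNLINK => 1);
    close $log or die "close failed: $!";

    return podchecker($in_path, $log_path);
}

# A well-formed document, and one with a =back that has no opening =over.
my $good = count_pod_errors("=head1 NAME\n\nDemo - a tiny example\n\n=cut\n");
my $bad  = count_pod_errors("=head1 NAME\n\nDemo - a tiny example\n\n=back\n\n=cut\n");

print "good: $good error(s), bad: $bad error(s)\n";
```

    A caller would typically treat 0 as success, a positive count as failure, and -1 (documented above for input without any POD commands) as "nothing to check".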

    EXAMPLES

    See SYNOPSIS

    INTERFACE

    While checking, this module collects document properties, e.g. the nodes for hyperlinks (=headX, =item) and index entries (X<>). POD translators can use this feature to syntax-check and get the nodes in a first pass before actually starting to convert. This is expensive in terms of execution time, but allows for very robust conversions.

    Since PodParser-1.24 the Pod::Checker module uses only the poderror method to print errors and warnings. The summary output (e.g. "Pod syntax OK") has been dropped from the module and has been included in podchecker (the script). This allows users of Pod::Checker to control completely the output behavior. Users of podchecker (the script) get the well-known behavior.

    • Pod::Checker->new( %options )

      Return a reference to a new Pod::Checker object that inherits from Pod::Parser and is used for calling the required methods later. The following options are recognized:

      -warnings => num

          Print warnings if num is true. The higher the value of num, the more warnings are printed. Currently there are only levels 1 and 2.

      -quiet => num

          If num is true, do not print any errors/warnings. This is useful when Pod::Checker is used to munge POD code into plain text from within POD formatters.

    • $checker->poderror( @args )
    • $checker->poderror( {%opts}, @args )

      Internal method for printing errors and warnings. If no options are given, simply prints "@_". The following options are recognized and used to form the output:

      -msg

          A message to print prior to @args.

      -line

          The line number the error occurred in.

      -file

          The file (name) the error occurred in.

      -severity

          The error level, should be 'WARNING' or 'ERROR'.

    • $checker->num_errors()

      Set (if argument specified) and retrieve the number of errors found.

    • $checker->num_warnings()

      Set (if argument specified) and retrieve the number of warnings found.

    • $checker->name()

      Set (if argument specified) and retrieve the canonical name of POD as found in the =head1 NAME section.

    • $checker->node()

      Add (if argument specified) and retrieve the nodes (as defined by =headX and =item) of the current POD. The nodes are returned in the order of their occurrence. They consist of plain text; each run of whitespace is collapsed to a single blank.

    • $checker->idx()

      Add (if argument specified) and retrieve the index entries (as defined by X<>) of the current POD. They consist of plain text; each run of whitespace is collapsed to a single blank.

    • $checker->hyperlink()

      Add (if argument specified) and retrieve the hyperlinks (as defined by L<>) of the current POD. They consist of a 2-item array: line number and Pod::Hyperlink object.
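
    The accessors above might be used in a first pass over a document roughly as follows. This is a sketch under assumptions: the sample POD text and the temporary file are illustrative, and the output handle is \*STDERR as in the SYNOPSIS.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Pod::Checker;

# Illustrative POD document, written to a temporary file.
my ($fh, $pod_path) = tempfile(UNLINK => 1);
print {$fh} "=head1 NAME\n\nDemo::Module - a tiny example\n\n",
            "=head1 DESCRIPTION\n\nSome text.\n\n=cut\n";
close $fh or die "close failed: $!";

# -quiet asks the checker not to print its messages.
my $checker = Pod::Checker->new(-warnings => 2, -quiet => 1);
$checker->parse_from_file($pod_path, \*STDERR);

my $errors   = $checker->num_errors;
my $warnings = $checker->num_warnings;
my $name     = $checker->name;    # taken from the =head1 NAME section
my @nodes    = $checker->node;    # =headX and =item texts, in order

printf "%d error(s), %d warning(s), %d node(s)\n",
    $errors, $warnings, scalar @nodes;
```

    A translator could run this pass first and only start converting when $errors is 0, using @nodes to resolve internal links.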

    AUTHOR

    Please report bugs using http://rt.cpan.org.

    Brad Appleton <bradapp@enteract.com> (initial version), Marek Rouchal <marekr@cpan.org>

    Based on code for Pod::Text::pod2text() written by Tom Christiansen <tchrist@mox.perl.com>

    Pod::Checker is part of the Pod-Checker distribution, and is based on Pod::Parser.

     
    Pod::Escapes

    NAME

    Pod::Escapes -- for resolving Pod E<...> sequences

    SYNOPSIS

        use Pod::Escapes qw(e2char);
        ...la la la, parsing POD, la la la...
        $text = e2char($e_node->label);
        unless(defined $text) {
            print "Unknown E sequence \"", $e_node->label, "\"!";
        }
        ...else print/interpolate $text...

    DESCRIPTION

    This module provides things that are useful in decoding Pod E<...> sequences. Presumably, it should be used only by Pod parsers and/or formatters.

    By default, Pod::Escapes exports none of its symbols, but you can request any of them to be exported. Either request them individually, as with use Pod::Escapes qw(symbolname symbolname2...);, or use Pod::Escapes qw(:ALL); to get all exportable symbols.

    GOODIES

    • e2char($e_content)

      Given a name or number that could appear in a E<name_or_num> sequence, this returns the string that it stands for. For example, e2char('sol'), e2char('47'), e2char('0x2F'), and e2char('057') all return "/", because E<sol>, E<47>, E<0x2f>, and E<057> all mean "/". If the name has no known value (as with a name of "qacute") or is syntactically invalid (as with a name of "1/4"), this returns undef.

    • e2charnum($e_content)

      Given a name or number that could appear in a E<name_or_num> sequence, this returns the number of the Unicode character that this stands for. For example, e2charnum('sol'), e2charnum('47'), e2charnum('0x2F'), and e2charnum('057') all return 47, because E<sol>, E<47>, E<0x2f>, and E<057> all mean "/", whose Unicode number is 47. If the name has no known value (as with a name of "qacute") or is syntactically invalid (as with a name of "1/4"), this returns undef.

    • $Name2character{name}

      Maps from names (as in E<name>) like "eacute" or "sol" to the string that each stands for. Note that this does not include numerics (like "64" or "x981c"). Under old Perl versions (before 5.7) you get a "?" in place of characters whose Unicode value is over 255.

    • $Name2character_number{name}

      Maps from names (as in E<name>) like "eacute" or "sol" to the Unicode value that each stands for. For example, $Name2character_number{'eacute'} is 233, and $Name2character_number{'euro'} is 8364. You get the correct Unicode value, regardless of the version of Perl you're using, which differs from %Name2character's behavior under pre-5.7 Perls.

      Note that this hash does not include numerics (like "64" or "x981c").

    • $Latin1Code_to_fallback{integer}

      For numbers in the range 160 (0x00A0) to 255 (0x00FF), this maps from the character code for a Latin-1 character (like 233 for lowercase e-acute) to the US-ASCII character that best approximates it (like "e"). You may find this useful if you are rendering POD in a format that you think deals well only with US-ASCII characters.

    • $Latin1Char_to_fallback{character}

      Just as above, but maps from characters (like "\xE9", lowercase e-acute) to characters (like "e").

    • $Code2USASCII{integer}

      This maps from US-ASCII codes (like 32) to the corresponding character (like space, for 32). Only characters 32 to 126 are defined. This is meant for use by e2char($x) when it senses that it's running on a non-ASCII platform (where chr(32) doesn't get you a space -- but $Code2USASCII{32} will). It's documented here just in case you might find it useful.
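
    The lookups above can be sketched together as follows. The variable names are illustrative, and the values noted in the comments are the ones given in the descriptions above, assuming an ASCII platform.

```perl
use strict;
use warnings;
use Pod::Escapes qw(e2char e2charnum
                    %Name2character_number %Latin1Code_to_fallback);

my $slash    = e2char('sol');       # "/"
my $same     = e2char('0x2F');      # "/" again, given as a hex number
my $code     = e2charnum('sol');    # 47
my $unknown  = e2char('qacute');    # undef: no such entity name

my $eacute   = $Name2character_number{eacute};    # 233
my $fallback = $Latin1Code_to_fallback{233};      # "e"

print "sol is '$slash' (character number $code)\n";
print "e-acute falls back to '$fallback'\n";
print "qacute is unknown\n" unless defined $unknown;
```

    A formatter that can only emit US-ASCII might call e2charnum first and consult %Latin1Code_to_fallback when the result lies between 160 and 255.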

    CAVEATS

    On Perl versions before 5.7, Unicode characters with a value over 255 (like lambda or emdash) can't be conveyed. This module does work under such early Perl versions, but in the place of each such character, you get a "?". Latin-1 characters (characters 160-255) are unaffected.

    Under EBCDIC platforms, e2char($n) may not always be the same as chr(e2charnum($n)), and ditto for $Name2character{$name} and chr($Name2character_number{$name}).

    SEE ALSO

    perlpod

    perlpodspec

    Text::Unidecode

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2001-2004 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    Portions of the data tables in this module are derived from the entity declarations in the W3C XHTML specification.

    Currently (October 2001), that's these three:

        http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
        http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
        http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent

    AUTHOR

    Sean M. Burke sburke@cpan.org

     
    Pod::Find

    NAME

    Pod::Find - find POD documents in directory trees

    SYNOPSIS

        use Pod::Find qw(pod_find simplify_name pod_where);

        my %pods = pod_find({ -verbose => 1, -inc => 1 });
        foreach(keys %pods) {
            print "found library POD `$pods{$_}' in $_\n";
        }

        print "podname=",simplify_name('a/b/c/mymodule.pod'),"\n";

        $location = pod_where( { -inc => 1 }, "Pod::Find" );

    DESCRIPTION

    Pod::Find provides a set of functions to locate POD files. Note that no function is exported by default to avoid pollution of your namespace, so be sure to specify them in the use statement if you need them:

        use Pod::Find qw(pod_find);

    From this version on the typical SCM (software configuration management) files/directories like RCS, CVS, SCCS, .svn are ignored.

    pod_find( { %opts } , @directories )

    The function pod_find searches for POD documents in a given set of files and/or directories. It returns a hash with the file names as keys and the POD name as value. The POD name is derived from the file name and its position in the directory tree.

    E.g. when searching in $HOME/perl5lib, the file $HOME/perl5lib/MyModule.pm would get the POD name MyModule, whereas $HOME/perl5lib/Myclass/Subclass.pm would be Myclass::Subclass. The name information can be used for POD translators.

    Only text files containing at least one valid POD command are found.

    A warning is printed if more than one POD file with the same POD name is found, e.g. CPAN.pm in different directories. This usually indicates duplicate occurrences of modules in the @INC search path.

    OPTIONS

    The first argument for pod_find may be a hash reference with options. The rest are either directories that are searched recursively or files. The POD names of files are the plain basenames with any Perl-like extension (.pm, .pl, .pod) stripped.

    • -verbose => 1

      Print progress information while scanning.

    • -perl => 1

      Apply Perl-specific heuristics to find the correct PODs. This includes stripping Perl-like extensions, omitting subdirectories that are numeric but do not match the current Perl interpreter's version id, suppressing site_perl as a module hierarchy name etc.

    • -script => 1

      Search for PODs in the current Perl interpreter's installation scriptdir. This is taken from the local Config module.

    • -inc => 1

      Search for PODs in the current Perl interpreter's @INC paths. This automatically considers paths specified in the PERL5LIB environment as this is included in @INC by the Perl interpreter itself.

    simplify_name( $str )

    The function simplify_name is equivalent to basename, but also strips Perl-like extensions (.pm, .pl, .pod) and extensions like .bat, .cmd on Win32 and OS/2, or .com on VMS, respectively.

    pod_where( { %opts }, $pod )

    Returns the location of a pod document given a search directory and a module (e.g. File::Find) or script (e.g. perldoc) name.

    Options:

    • -inc => 1

      Search @INC for the pod and also the scriptdir defined in the Config module.

    • -dirs => [ $dir1, $dir2, ... ]

      Reference to an array of search directories. These are searched in order before looking in @INC (if -inc). Current directory is used if none are specified.

    • -verbose => 1

      List directories as they are searched.

    Returns the full path of the first occurrence of the file. Package names (e.g. 'A::B') are automatically converted to directory names in the selected directory (e.g. on unix 'A::B' is converted to 'A/B'). Additionally, '.pm', '.pl' and '.pod' are appended to the search automatically if required.

    A subdirectory pod/ is also checked if it exists in any of the given search directories. This ensures that e.g. perlfunc is found.

    It is assumed that a supplied module name matches the file name. Pods are not opened to check for the 'NAME' entry.

    A check is made to make sure that the file that is found does contain some pod documentation.

    contains_pod( $file , $verbose )

    Returns true if the supplied filename (not POD module) contains some pod information.

    AUTHOR

    Please report bugs using http://rt.cpan.org.

    Marek Rouchal <marekr@cpan.org>, heavily borrowing code from Nick Ing-Simmons' PodToHtml.

    Tim Jenness <t.jenness@jach.hawaii.edu> provided pod_where and contains_pod.

    Pod::Find is part of the Pod::Parser distribution.

    SEE ALSO

    Pod::Parser, Pod::Checker, perldoc

     
    Pod::Functions

    NAME

    Pod::Functions - Group Perl's functions a la perlfunc.pod

    SYNOPSIS

        use Pod::Functions;

        my @misc_ops = @{ $Kinds{ 'Misc' } };
        my $misc_dsc = $Type_Description{ 'Misc' };

    or

        perl /path/to/lib/Pod/Functions.pm

    This will print a grouped list of Perl's functions, like the "Perl Functions by Category" section in perlfunc.

    DESCRIPTION

    It exports the following variables:

    • %Kinds

      This holds a hash-of-lists. Each list contains the functions in the category the key denotes.

    • %Type

      In this hash each key represents a function and the value is the category. The category can be a comma separated list.

    • %Flavor

      In this hash each key represents a function and the value is a short description of that function.

    • %Type_Description

      In this hash each key represents a category of functions and the value is a short description of that category.

    • @Type_Order

      This list of categories is used to produce the same order as the "Perl Functions by Category" section in perlfunc.
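
    The exported variables can be combined roughly as follows. This is a sketch: the choice of print as the function to look up and the output format are assumptions made for illustration.

```perl
use strict;
use warnings;
use Pod::Functions;

# Functions filed under the 'Misc' category.
my @misc = @{ $Kinds{Misc} };

# A function may belong to several categories, separated by commas.
my @categories = split /,/, $Type{print};
my $summary    = $Flavor{print};

print "print (", join(', ', @categories), "): $summary\n";

# Walk the categories in the order used by perlfunc.
for my $category (@Type_Order) {
    my $count = @{ $Kinds{$category} || [] };
    print "$category: $Type_Description{$category} ($count functions)\n";
}
```

    Iterating over @Type_Order rather than keys %Kinds keeps the listing in the same order as the perlfunc section.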

     
    Pod::Html

    NAME

    Pod::Html - module to convert pod files to HTML

    SYNOPSIS

    1. use Pod::Html;
    2. pod2html([options]);

    DESCRIPTION

    Converts files from pod format (see perlpod) to HTML format. It can automatically generate indexes and cross-references, and it keeps a cache of things it knows how to cross-reference.

    FUNCTIONS

    pod2html

        pod2html("pod2html",
                 "--podpath=lib:ext:pod:vms",
                 "--podroot=/usr/src/perl",
                 "--htmlroot=/perl/nmanual",
                 "--recurse",
                 "--infile=foo.pod",
                 "--outfile=/perl/nmanual/foo.html");

    pod2html takes the following arguments:

    • backlink
      1. --backlink

      Turns every head1 heading into a link back to the top of the page. By default, no backlinks are generated.

    • cachedir
      1. --cachedir=name

      Creates the directory cache in the given directory.

    • css
      1. --css=stylesheet

      Specify the URL of a cascading style sheet. Also disables all HTML/CSS style attributes that are output by default (to avoid conflicts).

    • flush
      1. --flush

      Flushes the directory cache.

    • header
      1. --header
      2. --noheader

      Creates header and footer blocks containing the text of the NAME section. By default, no headers are generated.

    • help
      1. --help

      Displays the usage message.

    • htmldir
      1. --htmldir=name

      Sets the directory to which all cross references in the resulting html file will be relative. Not passing this causes all links to be absolute since this is the value that tells Pod::Html the root of the documentation tree.

      Do not use this and --htmlroot in the same call to pod2html; they are mutually exclusive.

    • htmlroot
      1. --htmlroot=name

      Sets the base URL for the HTML files. When cross-references are made, the HTML root is prepended to the URL.

      Do not use this if relative links are desired: use --htmldir instead.

      Do not pass both this and --htmldir to pod2html; they are mutually exclusive.

    • index
      1. --index
      2. --noindex

      Generate an index at the top of the HTML file. This is the default behaviour.

    • infile
      1. --infile=name

      Specify the pod file to convert. Input is taken from STDIN if no infile is specified.

    • outfile
      1. --outfile=name

      Specify the HTML file to create. Output goes to STDOUT if no outfile is specified.

    • poderrors
      1. --poderrors
      2. --nopoderrors

      Include a "POD ERRORS" section in the outfile if there were any POD errors in the infile. This section is included by default.

    • podpath
      1. --podpath=name:...:name

      Specify which subdirectories of the podroot contain pod files whose HTML converted forms can be linked to in cross references.

    • podroot
      1. --podroot=name

      Specify the base directory for finding library pods. Default is the current working directory.

    • quiet
      1. --quiet
      2. --noquiet

      Suppress warning messages that are mostly harmless. These messages are displayed by default. Note that this is not the same as disabling verbose mode.

    • recurse
      1. --recurse
      2. --norecurse

      Recurse into subdirectories specified in podpath (default behaviour).

    • title
      1. --title=title

      Specify the title of the resulting HTML file.

    • verbose
      1. --verbose
      2. --noverbose

      Display progress messages. By default, they won't be displayed.
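
    As a rough illustration of how the options above fit together, the sketch below converts a POD string to HTML through temporary files. The helper name render_pod_to_html and the file names are hypothetical; only options listed above (--infile, --outfile, --title, --cachedir, --quiet) are passed to pod2html.

    ```perl
    use strict;
    use warnings;
    use Pod::Html;
    use File::Temp qw(tempdir);
    use File::Spec;

    # Hypothetical helper: write POD to a temp file, convert it, return the HTML.
    sub render_pod_to_html {
        my ($pod_text, $title) = @_;
        my $dir  = tempdir(CLEANUP => 1);
        my $pod  = File::Spec->catfile($dir, 'demo.pod');
        my $html = File::Spec->catfile($dir, 'demo.html');

        open my $out, '>', $pod or die "Cannot write $pod: $!";
        print {$out} $pod_text;
        close $out or die "Cannot close $pod: $!";

        # --cachedir keeps the directory cache out of the current directory.
        pod2html("--infile=$pod", "--outfile=$html",
                 "--title=$title", "--cachedir=$dir", "--quiet");

        open my $in, '<', $html or die "Cannot read $html: $!";
        local $/;                      # slurp the whole file
        my $content = <$in>;
        close $in;
        return $content;
    }

    my $page = render_pod_to_html("=head1 NAME\n\ndemo - example page\n", 'Demo');
    print length($page), " bytes of HTML\n";
    ```

    Passing --outfile avoids having to capture STDOUT, and pointing --cachedir at the temporary directory means the cache is cleaned up together with the other files.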

    htmlify

    1. htmlify($heading);

    Converts a pod section specification to a suitable section specification for HTML. Note that we keep spaces and special characters except ", ? (Netscape problem) and the hyphen (writer's problem...).

    anchorify

    1. anchorify(@heading);

    Similar to htmlify(), but turns non-alphanumerics into underscores. Note that anchorify() is not exported by default.

    ENVIRONMENT

    Uses $Config{pod2html} to set up default options.

    AUTHOR

    Marc Green, <marcgreen@cpan.org>.

    Original version by Tom Christiansen, <tchrist@perl.com>.

    SEE ALSO

    perlpod

    COPYRIGHT

    This program is distributed under the Artistic License.

     
    Pod::InputObjects

    NAME

    Pod::InputObjects - objects representing POD input paragraphs, commands, etc.

    SYNOPSIS

    1. use Pod::InputObjects;

    REQUIRES

    perl5.004, Carp

    EXPORTS

    Nothing.

    DESCRIPTION

    This module defines some basic input objects used by Pod::Parser when reading and parsing POD text from an input source. The following objects are defined:

    • package Pod::Paragraph

      An object corresponding to a paragraph of POD input text. It may be a plain paragraph, a verbatim paragraph, or a command paragraph (see perlpod).

    • package Pod::InteriorSequence

      An object corresponding to an interior sequence command from the POD input text (see perlpod).

    • package Pod::ParseTree

      An object corresponding to a tree of parsed POD text. Each "node" in a parse-tree (or ptree) is either a text-string or a reference to a Pod::InteriorSequence object. The nodes appear in the parse-tree in the order in which they were parsed from left-to-right.

    Each of these input objects is described in further detail in the sections which follow.

    Pod::Paragraph

    An object representing a paragraph of POD input text. It has the following methods/attributes:

    Pod::Paragraph->new()

        my $pod_para1 = Pod::Paragraph->new(-text => $text);
        my $pod_para2 = Pod::Paragraph->new(-name => $cmd,
                                            -text => $text);
        my $pod_para3 = new Pod::Paragraph(-text => $text);
        my $pod_para4 = new Pod::Paragraph(-name => $cmd,
                                           -text => $text);
        my $pod_para5 = Pod::Paragraph->new(-name => $cmd,
                                            -text => $text,
                                            -file => $filename,
                                            -line => $line_number);

    This is a class method that constructs a Pod::Paragraph object and returns a reference to the new paragraph object. It takes keyword arguments. The -text keyword indicates the corresponding text of the POD paragraph. The -name keyword indicates the name of the corresponding POD command, such as head1 or item (it should not contain the = prefix); this is needed only if the POD paragraph corresponds to a command paragraph. The -file and -line keywords indicate the filename and line number corresponding to the beginning of the paragraph.

    $pod_para->cmd_name()

    1. my $para_cmd = $pod_para->cmd_name();

    If this paragraph is a command paragraph, then this method will return the name of the command (without any leading = prefix).

    $pod_para->text()

    1. my $para_text = $pod_para->text();

    This method will return the corresponding text of the paragraph.

    $pod_para->raw_text()

    1. my $raw_pod_para = $pod_para->raw_text();

    This method will return the raw text of the POD paragraph, exactly as it appeared in the input.

    $pod_para->cmd_prefix()

    1. my $prefix = $pod_para->cmd_prefix();

    If this paragraph is a command paragraph, then this method will return the prefix used to denote the command (which should be the string "=" or "==").

    $pod_para->cmd_separator()

    1. my $separator = $pod_para->cmd_separator();

    If this paragraph is a command paragraph, then this method will return the text used to separate the command name from the rest of the paragraph (if any).

    $pod_para->parse_tree()

    1. my $ptree = $pod_parser->parse_text( $pod_para->text() );
    2. $pod_para->parse_tree( $ptree );
    3. $ptree = $pod_para->parse_tree();

    This method will get/set the corresponding parse-tree of the paragraph's text.

    $pod_para->file_line()

    1. my ($filename, $line_number) = $pod_para->file_line();
    2. my $position = $pod_para->file_line();

    Returns the current filename and line number for the paragraph object. If called in a list context, it returns a list of two elements: first the filename, then the line number. If called in a scalar context, it returns a string containing the filename, followed by a colon (':'), followed by the line number.
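
    The accessors above can be exercised with a small sketch like the following. It assumes the Pod::Parser distribution (which provides Pod::InputObjects) is installed; the file name demo.pod and the line number are made-up values for illustration.

    ```perl
    use strict;
    use warnings;
    use Pod::InputObjects;

    # Build a command paragraph by hand; a parser would normally do this.
    my $para = Pod::Paragraph->new(
        -name => 'head1',        # command name without the "=" prefix
        -text => "NAME\n\n",
        -file => 'demo.pod',     # illustrative file name
        -line => 12,             # illustrative line number
    );

    print 'command: ', $para->cmd_name, "\n";
    print 'text:    ', $para->text;

    # List context returns (filename, line); scalar context joins them with ":".
    my ($file, $line) = $para->file_line;
    my $where         = $para->file_line;
    print "found at $where\n";
    ```

    Using file_line() in scalar context is a convenient way to build diagnostic messages that point back at the POD source.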

    Pod::InteriorSequence

    An object representing a POD interior sequence command. It has the following methods/attributes:

    Pod::InteriorSequence->new()

        my $pod_seq1 = Pod::InteriorSequence->new(-name => $cmd,
                                                  -ldelim => $delimiter);
        my $pod_seq2 = new Pod::InteriorSequence(-name => $cmd,
                                                 -ldelim => $delimiter);
        my $pod_seq3 = new Pod::InteriorSequence(-name => $cmd,
                                                 -ldelim => $delimiter,
                                                 -file => $filename,
                                                 -line => $line_number);
        my $pod_seq4 = new Pod::InteriorSequence(-name => $cmd, $ptree);
        my $pod_seq5 = new Pod::InteriorSequence($cmd, $ptree);

    This is a class method that constructs a Pod::InteriorSequence object and returns a reference to the new interior sequence object. It should be given two keyword arguments. The -ldelim keyword indicates the corresponding left-delimiter of the interior sequence (e.g. '<'). The -name keyword indicates the name of the corresponding interior sequence command, such as I or B or C. The -file and -line keywords indicate the filename and line number corresponding to the beginning of the interior sequence. If the $ptree argument is given, it must be the last argument, and it must be either a string, or else an array-ref suitable for passing to Pod::ParseTree::new (or it may be a reference to a Pod::ParseTree object).

    $pod_seq->cmd_name()

    1. my $seq_cmd = $pod_seq->cmd_name();

    The name of the interior sequence command.

    $pod_seq->prepend()

    1. $pod_seq->prepend($text);
    2. $pod_seq1->prepend($pod_seq2);

    Prepends the given string or parse-tree or sequence object to the parse-tree of this interior sequence.

    $pod_seq->append()

    1. $pod_seq->append($text);
    2. $pod_seq1->append($pod_seq2);

    Appends the given string or parse-tree or sequence object to the parse-tree of this interior sequence.

    $pod_seq->nested()

    1. $outer_seq = $pod_seq->nested || print "not nested";

    If this interior sequence is nested inside of another interior sequence, then the outer/parent sequence that contains it is returned. Otherwise undef is returned.

    $pod_seq->raw_text()

    1. my $seq_raw_text = $pod_seq->raw_text();

    This method will return the raw text of the POD interior sequence, exactly as it appeared in the input.

    $pod_seq->left_delimiter()

    1. my $ldelim = $pod_seq->left_delimiter();

    The leftmost delimiter beginning the argument text to the interior sequence (should be "<").

    $pod_seq->right_delimiter()

    The rightmost delimiter ending the argument text to the interior sequence (should be ">").

    $pod_seq->parse_tree()

    1. my $ptree = $pod_parser->parse_text($paragraph_text);
    2. $pod_seq->parse_tree( $ptree );
    3. $ptree = $pod_seq->parse_tree();

    This method will get/set the corresponding parse-tree of the interior sequence's text.

    $pod_seq->file_line()

    1. my ($filename, $line_number) = $pod_seq->file_line();
    2. my $position = $pod_seq->file_line();

    Returns the current filename and line number for the interior sequence object. If called in a list context, it returns a list of two elements: first the filename, then the line number. If called in a scalar context, it returns a string containing the filename, followed by a colon (':'), followed by the line number.

    Pod::InteriorSequence::DESTROY()

    This method performs any necessary cleanup for the interior-sequence. If you override this method then it is imperative that you invoke the parent method from within your own method, otherwise interior-sequence storage will not be reclaimed upon destruction!

    Pod::ParseTree

    This object corresponds to a tree of parsed POD text. As POD text is scanned from left to right, it is parsed into an ordered list of text-strings and Pod::InteriorSequence objects (in order of appearance). A Pod::ParseTree object corresponds to this list of strings and sequences. Each interior sequence in the parse-tree may itself contain a parse-tree (since interior sequences may be nested).

    Pod::ParseTree->new()

        my $ptree1 = Pod::ParseTree->new;
        my $ptree2 = new Pod::ParseTree;
        my $ptree3 = Pod::ParseTree->new($array_ref);
        my $ptree4 = new Pod::ParseTree($array_ref);

    This is a class method that constructs a Pod::ParseTree object and returns a reference to the new parse-tree. If a single argument is given, it must be a reference to an array, and is used to initialize the root (top) of the parse tree.

    $ptree->top()

    1. my $top_node = $ptree->top();
    2. $ptree->top( $top_node );
    3. $ptree->top( @children );

    This method gets/sets the top node of the parse-tree. If no arguments are given, it returns the topmost node in the tree (the root), which is also a Pod::ParseTree. If it is given a single argument that is a reference, then the reference is assumed to be a parse-tree and becomes the new top node. Otherwise, if arguments are given, they are treated as the new list of children for the top node.

    $ptree->children()

    This method gets/sets the children of the top node in the parse-tree. If no arguments are given, it returns the list (array) of children (each of which should be either a string or a Pod::InteriorSequence). Otherwise, if arguments are given, they are treated as the new list of children for the top node.

    $ptree->prepend()

    This method prepends the given text or parse-tree to the current parse-tree. If the first item on the parse-tree is text and the argument is also text, then the text is prepended to the first item (not added as a separate string). Otherwise the argument is added as a new string or parse-tree before the current one.

    $ptree->append()

    This method appends the given text or parse-tree to the current parse-tree. If the last item on the parse-tree is text and the argument is also text, then the text is appended to the last item (not added as a separate string). Otherwise the argument is added as a new string or parse-tree after the current one.

    $ptree->raw_text()

    1. my $ptree_raw_text = $ptree->raw_text();

    This method will return the raw text of the POD parse-tree exactly as it appeared in the input.

    Pod::ParseTree::DESTROY()

    This method performs any necessary cleanup for the parse-tree. If you override this method then it is imperative that you invoke the parent method from within your own method, otherwise parse-tree storage will not be reclaimed upon destruction!
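
    The merging behaviour of prepend() and append() described above can be seen in a small hand-built tree. This is only a sketch, assuming the Pod::Parser distribution is installed; in normal use the tree comes from Pod::Parser's parse_text() rather than being assembled manually.

    ```perl
    use strict;
    use warnings;
    use Pod::InputObjects;

    my $ptree = Pod::ParseTree->new;

    # An interior sequence equivalent to B<bold>, built by hand.
    my $bold = Pod::InteriorSequence->new(-name => 'B', -ldelim => '<');
    $bold->append('bold');

    $ptree->append('Some ');     # first child: a text string
    $ptree->append($bold);       # second child: the sequence object
    $ptree->append(' text');     # third child: text after an object is a new node
    $ptree->prepend('>> ');      # text before text is merged into the first node

    my @children = $ptree->children;
    printf "%d children, raw text: %s\n", scalar @children, $ptree->raw_text;
    ```

    Because adjacent text is merged, the number of children reflects the alternation of text and interior sequences, not the number of append() or prepend() calls.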

    SEE ALSO

    Pod::InputObjects is part of the Pod::Parser distribution.

    See Pod::Parser, Pod::Select

    AUTHOR

    Please report bugs using http://rt.cpan.org.

    Brad Appleton <bradapp@enteract.com>

     
    Pod::LaTeX

    NAME

    Pod::LaTeX - Convert Pod data to formatted Latex

    SYNOPSIS

    1. use Pod::LaTeX;
    2. my $parser = Pod::LaTeX->new ( );
    3. $parser->parse_from_filehandle;
    4. $parser->parse_from_file ('file.pod', 'file.tex');

    DESCRIPTION

    Pod::LaTeX is a module to convert documentation in the Pod format into Latex. The pod2latex command uses this module for translation.

    Pod::LaTeX is a derived class from Pod::Select.

    OBJECT METHODS

    The following methods are provided in this module. Methods inherited from Pod::Select are not described in the public interface.

    Data Accessors

    The following methods are provided for accessing instance data. These methods should be used for accessing configuration parameters rather than assuming the object is a hash.

    Default values can be supplied by using these names as keys to a hash of arguments when using the new() constructor.

    • AddPreamble

      Logical to control whether a latex preamble is to be written. If true, a valid latex preamble is written before the pod data is written. This is similar to:

          \documentclass{article}
          \usepackage[T1]{fontenc}
          \usepackage{textcomp}
          \begin{document}

      but will be more complicated if table of contents and indexing are required. Can be used to set or retrieve the current value.

      1. $add = $parser->AddPreamble();
      2. $parser->AddPreamble(1);

      If used in conjunction with AddPostamble a full latex document will be written that could be immediately processed by latex.

      For some pod escapes it may be necessary to include the amsmath package. This is not yet added to the preamble automatically.

    • AddPostamble

      Logical to control whether a standard latex ending is written to the output file after the document has been processed. In its simplest form this is simply:

      1. \end{document}

      but can be more complicated if an index is required. Can be used to set or retrieve the current value.

          $add = $parser->AddPostamble();
          $parser->AddPostamble(1);

      If used in conjunction with AddPreamble a full latex document will be written that could be immediately processed by latex.

    • Head1Level

      The latex sectioning level that should be used to correspond to a pod =head1 directive. This can be used, for example, to turn a =head1 into a latex subsection. This should hold a number corresponding to the required position in an array containing the following elements:

          [0] chapter
          [1] section
          [2] subsection
          [3] subsubsection
          [4] paragraph
          [5] subparagraph

      Can be used to set or retrieve the current value:

      1. $parser->Head1Level(2);
      2. $sect = $parser->Head1Level;

      Setting this number too high can result in sections that may not be reproducible in the expected way. For example, setting this to 4 would imply that =head3 does not have a corresponding latex section (=head1 would correspond to a paragraph).

      A check is made to ensure that the supplied value is an integer in the range 0 to 5.

      Default is for a value of 1 (i.e. a section ).

    • Label

      This is the label that is prefixed to all latex label and index entries to make them unique. In general, pods have similarly titled sections (NAME, DESCRIPTION etc) and a latex label will be multiply defined if more than one pod document is to be included in a single latex file. To overcome this, this label is prefixed to a label whenever a label is required (joined with an underscore) or to an index entry (joined by an exclamation mark which is the normal index separator). For example, \label{text} becomes \label{Label_text} .

      Can be used to set or retrieve the current value:

      1. $label = $parser->Label;
      2. $parser->Label($label);

      This label is only used if UniqueLabels is true. Its value is set automatically from the NAME field if ReplaceNAMEwithSection is true. If this is not the case it must be set manually before starting the parse.

      Default value is undef.

    • LevelNoNum

      Control the point at which latex section numbering is turned off. For example, this can be used to make sure that latex sections are numbered but subsections are not.

      Can be used to set or retrieve the current value:

      1. $lev = $parser->LevelNoNum;
      2. $parser->LevelNoNum(2);

      The argument must be an integer between 0 and 5 and is the same as the number described in Head1Level method description. The number has nothing to do with the pod heading number, only the latex sectioning.

      Default is 2. (i.e. latex subsections are written as subsection* but sections are numbered).

    • MakeIndex

      Controls whether latex commands for creating an index are to be inserted into the preamble and postamble.

      1. $makeindex = $parser->MakeIndex;
      2. $parser->MakeIndex(0);

      Irrelevant if both AddPreamble and AddPostamble are false (or equivalently, UserPreamble and UserPostamble are set).

      Default is for an index to be created.

    • ReplaceNAMEwithSection

      This controls whether the NAME section in the pod is to be translated literally or converted to a slightly modified output where the section name is the pod name rather than "NAME".

      If true, the pod segment

          =head1 NAME
          pod::name - purpose
          =head1 SYNOPSIS

      is converted to the latex

          \section{pod::name\label{pod_name}\index{pod::name}}
          Purpose
          \subsection*{SYNOPSIS\label{pod_name_SYNOPSIS}%
          \index{pod::name!SYNOPSIS}}

      (dependent on the value of Head1Level and LevelNoNum). Note that subsequent head1 directives translate to subsections rather than sections and that the labels and index now include the pod name (dependent on the value of UniqueLabels).

      The Label is set from the pod name regardless of any current value of Label .

      1. $mod = $parser->ReplaceNAMEwithSection;
      2. $parser->ReplaceNAMEwithSection(0);

      Default is to translate the pod literally.

    • StartWithNewPage

      If true, each pod translation will begin with a latex \clearpage .

      1. $parser->StartWithNewPage(1);
      2. $newpage = $parser->StartWithNewPage;

      Default is false.

    • TableOfContents

      If true, a table of contents will be created. Irrelevant if AddPreamble is false or UserPreamble is set.

      1. $toc = $parser->TableOfContents;
      2. $parser->TableOfContents(1);

      Default is false.

    • UniqueLabels

      If true, the translator will attempt to make sure that each latex label or index entry will be uniquely identified by prefixing the contents of Label. This allows multiple documents to be combined without clashing common labels such as DESCRIPTION and SYNOPSIS.

      1. $parser->UniqueLabels(1);
      2. $unq = $parser->UniqueLabels;

      Default is true.

    • UserPreamble

      User supplied latex preamble. Added before the pod translation data.

      If set, the contents will be prepended to the output file before the translated data regardless of the value of AddPreamble . MakeIndex and TableOfContents will also be ignored.

    • UserPostamble

      User supplied latex postamble. Added after the pod translation data.

      If set, the contents will be appended to the output file after the translated data regardless of the value of AddPostamble. MakeIndex will also be ignored.

    NOTES

    Compatible with latex2e only. Cannot be used with latex v2.09 or earlier.

    A subclass of Pod::Select so that specific pod sections can be converted to latex by using the select method.

    Some HTML escapes are missing and many have not been tested.

    SEE ALSO

    Pod::Parser, Pod::Select, pod2latex, Pod::Simple.

    AUTHORS

    Tim Jenness <tjenness@cpan.org>

    Bug fixes and improvements have been received from: Simon Cozens <simon@cozens.net>, Mark A. Hershberger <mah@everybody.org>, Marcel Grunauer <marcel@codewerk.com>, Hugh S Myers <hsmyers@sdragons.com>, Peter J Acklam <jacklam@math.uio.no>, Sudhi Herle <sudhi@herle.net>, Ariel Scolnicov <ariels@compugen.co.il>, Adriano Rodrigues Ferreira <ferreira@triang.com.br>, R. de Vries <r.de.vries@dutchspace.nl> and Dave Mitchell <davem@iabyn.com>.

    COPYRIGHT

    Copyright (C) 2011 Tim Jenness. Copyright (C) 2000-2004 Tim Jenness. All Rights Reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Pod::Man

    NAME

    Pod::Man - Convert POD data to formatted *roff input

    SYNOPSIS

        use Pod::Man;
        my $parser = Pod::Man->new (release => $VERSION, section => 8);

        # Read POD from STDIN and write to STDOUT.
        $parser->parse_file (\*STDIN);

        # Read POD from file.pod and write to file.1.
        $parser->parse_from_file ('file.pod', 'file.1');

    DESCRIPTION

    Pod::Man is a module to convert documentation in the POD format (the preferred language for documenting Perl) into *roff input using the man macro set. The resulting *roff code is suitable for display on a terminal using nroff(1), normally via man(1), or printing using troff(1). It is conventionally invoked using the driver script pod2man, but it can also be used directly.

    As a derived class from Pod::Simple, Pod::Man supports the same methods and interfaces. See Pod::Simple for all the details.

    new() can take options, in the form of key/value pairs that control the behavior of the parser. See below for details.

    If no options are given, Pod::Man uses the name of the input file, with any trailing .pod, .pm, or .pl stripped, as the man page title. It defaults to section 1 (or to section 3 if the file name ends in .pm), to a centered title of "User Contributed Perl Documentation", to a centered footer of the Perl version it is run with, and to a left-hand footer of the modification date of its input (or the current date if given STDIN for input).

    Pod::Man assumes that your *roff formatters have a fixed-width font named CW. If yours is called something else (like CR), use the fixed option to specify it. This generally only matters for troff output for printing. Similarly, you can set the fonts used for bold, italic, and bold italic fixed-width output.

    Besides the obvious pod conversions, Pod::Man also takes care of formatting func(), func(3), and simple variable references like $foo or @bar so you don't have to use code escapes for them; complex expressions like $fred{'stuff'} will still need to be escaped, though. It also translates dashes that aren't used as hyphens into en dashes, makes long dashes--like this--into proper em dashes, fixes "paired quotes," makes C++ look right, puts a little space between double underscores, makes ALLCAPS a teeny bit smaller in troff, and escapes stuff that *roff treats as special so that you don't have to.

    The recognized options to new() are as follows. All options take a single argument.

    • center

      Sets the centered page header to use instead of "User Contributed Perl Documentation".

    • errors

      How to report errors. die says to throw an exception on any POD formatting error. stderr says to report errors on standard error, but not to throw an exception. pod says to include a POD ERRORS section in the resulting documentation summarizing the errors. none ignores POD errors entirely, as much as possible.

      The default is pod.

    • date

      Sets the left-hand footer. By default, the modification date of the input file will be used, or the current date if stat() can't find that file (the case if the input is from STDIN), and the date will be formatted as YYYY-MM-DD.

    • fixed

      The fixed-width font to use for verbatim text and code. Defaults to CW. Some systems may want CR instead. Only matters for troff output.

    • fixedbold

      Bold version of the fixed-width font. Defaults to CB. Only matters for troff output.

    • fixeditalic

      Italic version of the fixed-width font (actually, something of a misnomer, since most fixed-width fonts only have an oblique version, not an italic version). Defaults to CI. Only matters for troff output.

    • fixedbolditalic

      Bold italic (probably actually oblique) version of the fixed-width font. Pod::Man doesn't assume you have this, and defaults to CB. Some systems (such as Solaris) have this font available as CX. Only matters for troff output.

    • name

      Set the name of the manual page. Without this option, the manual name is set to the uppercased base name of the file being converted unless the manual section is 3, in which case the path is parsed to see if it is a Perl module path. If it is, a path like .../lib/Pod/Man.pm is converted into a name like Pod::Man . This option, if given, overrides any automatic determination of the name.

    • nourls

      Normally, L<> formatting codes with both a URL and anchor text are formatted to show both the anchor text and the URL. In other words:

      1. L<foo|http://example.com/>

      is formatted as:

      1. foo <http://example.com/>

      This option, if set to a true value, suppresses the URL when anchor text is given, so this example would be formatted as just foo . This can produce less cluttered output in cases where the URLs are not particularly important.

    • quotes

      Sets the quote marks used to surround C<> text. If the value is a single character, it is used as both the left and right quote; if it is two characters, the first character is used as the left quote and the second as the right quote; and if it is four characters, the first two are used as the left quote and the second two as the right quote.

      This may also be set to the special value none , in which case no quote marks are added around C<> text (but the font is still changed for troff output).

    • release

      Set the centered footer. By default, this is the version of Perl you run Pod::Man under. Note that some system an macro sets assume that the centered footer will be a modification date and will prepend something like "Last modified: "; if this is the case, you may want to set release to the last modified date and date to the version number.

    • section

      Set the section for the .TH macro. The standard section numbering convention is to use 1 for user commands, 2 for system calls, 3 for functions, 4 for devices, 5 for file formats, 6 for games, 7 for miscellaneous information, and 8 for administrator commands. There is a lot of variation here, however; some systems (like Solaris) use 4 for file formats, 5 for miscellaneous information, and 7 for devices. Still others use 1m instead of 8, or some mix of both. About the only section numbers that are reliably consistent are 1, 2, and 3.

      By default, section 1 will be used unless the file ends in .pm in which case section 3 will be selected.

    • stderr

      Send error messages about invalid POD to standard error instead of appending a POD ERRORS section to the generated *roff output. This is equivalent to setting errors to stderr if errors is not already set. It is supported for backward compatibility.

    • utf8

      By default, Pod::Man produces the most conservative possible *roff output to try to ensure that it will work with as many different *roff implementations as possible. Many *roff implementations cannot handle non-ASCII characters, so this means all non-ASCII characters are converted either to a *roff escape sequence that tries to create a properly accented character (at least for troff output) or to X .

      If this option is set, Pod::Man will instead output UTF-8. If your *roff implementation can handle it, this is the best output format to use and avoids corruption of documents containing non-ASCII characters. However, be warned that *roff source with literal UTF-8 characters is not supported by many implementations and may even result in segfaults and other bad behavior.

      Be aware that, when using this option, the input encoding of your POD source must be properly declared unless it is US-ASCII or Latin-1. POD input without an =encoding command will be assumed to be in Latin-1, and if it's actually in UTF-8, the output will be double-encoded. See perlpod(1) for more information on the =encoding command.

    The standard Pod::Simple method parse_file() takes one argument naming the POD file to read from. By default, the output is sent to STDOUT, but this can be changed with the output_fd() method.

    The standard Pod::Simple method parse_from_file() takes up to two arguments, the first being the input file to read POD from and the second being the file to write the formatted output to.

    You can also call parse_lines() to parse an array of lines or parse_string_document() to parse a document already in memory. To put the output into a string instead of a file handle, call the output_string() method. See Pod::Simple for the specific details.
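    As a minimal sketch of the in-memory route described above, the following formats a small POD document held in a string and captures the *roff output in a scalar. The sample document and the name value EXAMPLE are made up for illustration; name and section are ordinary Pod::Man constructor options.

```perl
use strict;
use warnings;
use Pod::Man;

# Hypothetical sample document, assembled line by line.
my $pod = join "\n",
    '=head1 NAME', '',
    'example - a sample page', '',
    '=head1 DESCRIPTION', '',
    'A short paragraph.', '';

# "name" and "section" feed the .TH macro of the generated page.
my $formatter = Pod::Man->new(name => 'EXAMPLE', section => 3);

# Collect the *roff output in a scalar instead of a file handle.
my $roff = '';
$formatter->output_string(\$roff);
$formatter->parse_string_document($pod);

# Show only the title macro line rather than the whole preamble.
my ($title_line) = $roff =~ /^(\.TH\s.*)$/m;
print defined $title_line ? "$title_line\n" : "no .TH line found\n";
```

    Calling output_string() before parsing is what redirects the output; without it, parse_string_document() would write to STDOUT.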

    DIAGNOSTICS

    • roff font should be 1 or 2 chars, not "%s"

      (F) You specified a *roff font (using fixed, fixedbold, etc.) that wasn't either one or two characters. Pod::Man doesn't support *roff fonts longer than two characters, although some *roff extensions do (the canonical versions of nroff and troff don't either).

    • Invalid errors setting "%s"

      (F) The errors parameter to the constructor was set to an unknown value.

    • Invalid quote specification "%s"

      (F) The quote specification given (the quotes option to the constructor) was invalid. A quote specification must be one, two, or four characters long.

    • POD document had syntax errors

      (F) The POD document being formatted had syntax errors and the errors option was set to die.
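    The last diagnostic can be trapped with eval. This is a sketch only: the input is a made-up document containing an unknown directive, which Pod::Simple reports as a syntax error, and the exact wording of the exception text beyond "syntax errors" is not relied upon.

```perl
use strict;
use warnings;
use Pod::Man;

# Hypothetical input: "=bogus" is not a POD command.
my $bad_pod = "=head1 NAME\n\nexample - sample\n\n=bogus directive\n\n";

# errors => 'die' asks Pod::Man to throw instead of adding a POD ERRORS section.
my $formatter = Pod::Man->new(errors => 'die', name => 'EXAMPLE', section => 1);
my $roff = '';
$formatter->output_string(\$roff);

# Trap the fatal diagnostic so the caller can decide what to do.
my $ok = eval { $formatter->parse_string_document($bad_pod); 1 };
my $error = $ok ? '' : $@;
print "formatting failed: $error" unless $ok;
```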

    BUGS

    Encoding handling assumes that PerlIO is available and does not work properly if it isn't. The utf8 option is therefore not supported unless Perl is built with PerlIO support.

    There is currently no way to turn off the guesswork that tries to format unmarked text appropriately, and sometimes it isn't wanted (particularly when using POD to document something other than Perl). Most of the work toward fixing this has now been done, however, and all that's still needed is a user interface.

    The NAME section should be recognized specially and index entries emitted for everything in that section. This would have to be deferred until the next section, since extraneous things in NAME tend to confuse various man page processors. Currently, no index entries are emitted for anything in NAME.

    Pod::Man doesn't handle font names longer than two characters. Neither do most troff implementations, but GNU troff does as an extension. It would be nice to support them as an option for those who want to use them.

    The preamble added to each output file is rather verbose, and most of it is only necessary in the presence of non-ASCII characters. Ideally, those definitions would only be output if needed, perhaps on the fly as the characters are used.

    Pod::Man is excessively slow.

    CAVEATS

    If Pod::Man is given the utf8 option, the encoding of its output file handle will be forced to UTF-8 if possible, overriding any existing encoding. This will be done even if the file handle is not created by Pod::Man and was passed in from outside. This maintains consistency regardless of PERL_UNICODE and other settings.

    The handling of hyphens and em dashes is somewhat fragile, and one may get the wrong one under some circumstances. This should only matter for troff output.

    When and whether to use small caps is somewhat tricky, and Pod::Man doesn't necessarily get it right.

    Converting neutral double quotes to properly matched double quotes doesn't work unless there are no formatting codes between the quote marks. This only matters for troff output.

    AUTHOR

    Russ Allbery <rra@stanford.edu>, based very heavily on the original pod2man by Tom Christiansen <tchrist@mox.perl.com>. The modifications to work with Pod::Simple instead of Pod::Parser were originally contributed by Sean Burke (but I've since hacked them beyond recognition and all bugs are mine).

    COPYRIGHT AND LICENSE

    Copyright 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2012, 2013 Russ Allbery <rra@stanford.edu>.

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    Pod::Simple, perlpod(1), pod2man(1), nroff(1), troff(1), man(1), man(7)

    Ossanna, Joseph F., and Brian W. Kernighan. "Troff User's Manual," Computing Science Technical Report No. 54, AT&T Bell Laboratories. This is the best documentation of standard nroff and troff. At the time of this writing, it's available at http://www.cs.bell-labs.com/cm/cs/cstr.html.

    The man page documenting the man macro set may be man(5) instead of man(7) on your system. Also, please see pod2man(1) for extensive documentation on writing manual pages if you've not done it before and aren't familiar with the conventions.

    The current version of this module is always available from its web site at http://www.eyrie.org/~eagle/software/podlators/. It is also part of the Perl core distribution as of 5.6.0.

     
    Pod::ParseLink

    Perl 5 version 18.2 documentation

    NAME

    Pod::ParseLink - Parse an L<> formatting code in POD text

    SYNOPSIS

        use Pod::ParseLink;
        my ($text, $inferred, $name, $section, $type) = parselink ($link);

    DESCRIPTION

    This module only provides a single function, parselink(), which takes the text of an L<> formatting code and parses it. It returns the anchor text for the link (if any was given), the anchor text possibly inferred from the name and section, the name or URL, the section if any, and the type of link. The type will be one of url, pod, or man, indicating a URL, a link to a POD page, or a link to a Unix manual page.

    Parsing is implemented per perlpodspec. For backward compatibility, links where there is no section and name contains spaces, or links where the entirety of the link (except for the anchor text if given) is enclosed in double-quotes are interpreted as links to a section (L</section>).

    The inferred anchor text is implemented per perlpodspec:

        L<name>         =>  L<name|name>
        L</section>     =>  L<"section"|/section>
        L<name/section> =>  L<"section" in name|name/section>

    The name may contain embedded E<> and Z<> formatting codes, and the section, anchor text, and inferred anchor text may contain any formatting codes. Any double quotes around the section are removed as part of the parsing, as is any leading or trailing whitespace.
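    The five return values can be illustrated with a short sketch. The helper describe_link() and the three sample links are made up for this example; parselink() itself is exported by default.

```perl
use strict;
use warnings;
use Pod::ParseLink;

# describe_link() is a helper invented for this example. It labels the
# five values returned by parselink() so they are easier to inspect.
sub describe_link {
    my ($link) = @_;
    my %field;
    @field{qw(text inferred name section type)} = parselink($link);
    return \%field;
}

my $pod = describe_link('perlpod/"Formatting Codes"');
my $man = describe_link('the crontab format|crontab(5)');
my $url = describe_link('http://www.perl.org/');

for my $link ($pod, $man, $url) {
    printf "%-3s  name=%s  inferred=%s\n",
        $link->{type}, $link->{name}, $link->{inferred};
}
```

    Note that the text field is undefined when the link carries no explicit anchor text, so callers that print it should test for definedness first.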

    If the text of the L<> escape is entirely enclosed in double quotes, it's interpreted as a link to a section for backward compatibility.

    No attempt is made to resolve formatting codes. This must be done after calling parselink(), since E<> formatting codes can be used to escape characters that would otherwise be significant to the parser, and resolving them before parsing would result in an incorrect parse of a formatting code like:

        L<verticalE<verbar>barE<sol>slash>

    which should be interpreted as a link to the vertical|bar/slash POD page and not as a link to the slash section of the bar POD page with an anchor text of vertical. Note that not only the anchor text will need to have formatting codes expanded, but so will the target of the link (to deal with E<> and Z<> formatting codes), and special handling of the section may be necessary depending on whether the translator wants to consider markup in sections to be significant when resolving links. See perlpodspec for more information.
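    The parse-first, resolve-later order can be sketched as follows. The %escape table is a deliberately tiny stand-in that knows only the two escapes used here; a real translator would apply its full formatting-code expansion instead.

```perl
use strict;
use warnings;
use Pod::ParseLink;

# Parse first: the escapes keep "|" and "/" hidden from the parser.
my ($text, $inferred, $name, $section, $type)
    = parselink('verticalE<verbar>barE<sol>slash');

# Resolve afterwards. Only two escapes are handled in this sketch;
# anything else is left untouched.
my %escape = (verbar => '|', sol => '/');
(my $resolved = $name)
    =~ s/E<(\w+)>/exists $escape{$1} ? $escape{$1} : "E<$1>"/ge;

print "$type link to $resolved\n";
```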

    SEE ALSO

    Pod::Parser

    The current version of this module is always available from its web site at http://www.eyrie.org/~eagle/software/podlators/.

    AUTHOR

    Russ Allbery <rra@stanford.edu>.

    COPYRIGHT AND LICENSE

    Copyright 2001, 2008, 2009 Russ Allbery <rra@stanford.edu>.

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

     
    Pod::ParseUtils

    Perl 5 version 18.2 documentation

    NAME

    Pod::ParseUtils - helpers for POD parsing and conversion

    SYNOPSIS

        use Pod::ParseUtils;
        my $list = new Pod::List;
        my $link = Pod::Hyperlink->new('Pod::Parser');

    DESCRIPTION

    Pod::ParseUtils contains a few object-oriented helper packages for POD parsing and processing (i.e. in POD formatters and translators).

    Pod::List

    Pod::List can be used to hold information about POD lists (written as =over ... =item ... =back) for further processing. The following methods are available:

    • Pod::List->new()

      Create a new list object. Properties may be specified through a hash reference like this:

          my $list = Pod::List->new({ -start => $., -indent => 4 });

      See the individual methods/properties for details.

    • $list->file()

      Without argument, retrieves the file name the list is in. This must have been set before by either specifying -file in the new() method or by calling the file() method with a scalar argument.

    • $list->start()

      Without argument, retrieves the line number where the list started. This must have been set before by either specifying -start in the new() method or by calling the start() method with a scalar argument.

    • $list->indent()

      Without argument, retrieves the indent level of the list as specified in =over n. This must have been set before by either specifying -indent in the new() method or by calling the indent() method with a scalar argument.

    • $list->type()

      Without argument, retrieves the list type, which can be an arbitrary value, e.g. OL, UL, ... when thinking the HTML way. This must have been set before by either specifying -type in the new() method or by calling the type() method with a scalar argument.

    • $list->rx()

      Without argument, retrieves a regular expression for simplifying the individual item strings once the list type has been determined. For example, when converting to HTML, one might strip the leading number in an ordered list, as <OL> already prints numbers itself. This must have been set before by either specifying -rx in the new() method or by calling the rx() method with a scalar argument.

    • $list->item()

      Without argument, retrieves the array of the items in this list. The items may be represented by any scalar. If an argument has been given, it is pushed on the list of items.

    • $list->parent()

      Without argument, retrieves information about the parent holding this list, which is represented as an arbitrary scalar. This must have been set before by either specifying -parent in the new() method or by calling the parent() method with a scalar argument.

    • $list->tag()

      Without argument, retrieves information about the list tag, which can be any scalar. This must have been set before by either specifying -tag in the new() method or by calling the tag() method with a scalar argument.
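    A short sketch of the accessors above, using only the setter form of each method. It assumes Pod::ParseUtils is installed (it ships with the Pod::Parser distribution, which on recent Perl releases may have to be installed from CPAN); the file name, line number, and item strings are made up.

```perl
use strict;
use warnings;
use Pod::ParseUtils;

# Describe an ordered list found at a hypothetical location.
my $list = Pod::List->new();
$list->file('example.pod');
$list->start(12);
$list->indent(4);
$list->type('OL');

# item() with an argument appends; without one it returns all items.
$list->item('1. first step');
$list->item('2. second step');

# rx() stores a pattern for simplifying item strings; here it strips
# the leading number, as an HTML <OL> would number the items itself.
$list->rx(qr/^\d+\.\s*/);
my $strip = $list->rx();
my @labels = map { (my $label = $_) =~ s/$strip//; $label } $list->item();

printf "%s list at %s line %d: %s\n",
    $list->type(), $list->file(), $list->start(), join(', ', @labels);
```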

    Pod::Hyperlink

    Pod::Hyperlink is a class for manipulation of POD hyperlinks. Usage:

        my $link = Pod::Hyperlink->new('alternative text|page/"section in page"');

    The Pod::Hyperlink class is mainly designed to parse the contents of the L<...> sequence, providing a simple interface for accessing the different parts of a POD hyperlink for further processing. It can also be used to construct hyperlinks.

    • Pod::Hyperlink->new()

      The new() method can either be passed a set of key/value pairs or a single scalar value, namely the contents of a L<...> sequence. An object of the class Pod::Hyperlink is returned. A return value of undef indicates a failure; the error message is stored in $@.

    • $link->parse($string)

      This method can be used to (re)parse a (new) hyperlink, i.e. the contents of a L<...> sequence. The result is stored in the current object. Warnings are stored in the warnings property. For example, links like L<open(2)> are deprecated, as they do not point to Perl documents. L<DBI::foo(3p)> is wrong as well; the manpage section can simply be dropped.

    • $link->markup($string)

      Set/retrieve the textual value of the link. This string contains special markers P<> and Q<> that should be expanded by the translator's interior sequence expansion engine to the formatter-specific code to highlight/activate the hyperlink. The details have to be implemented in the translator.

    • $link->text()

      This method returns the textual representation of the hyperlink as above, but without markers (read only). Depending on the link type this is one of the following alternatives (the + and * denote the portions of the text that are marked up):

          +perl+                    L<perl>
          *$|* in +perlvar+         L<perlvar/$|>
          *OPTIONS* in +perldoc+    L<perldoc/"OPTIONS">
          *DESCRIPTION*             L<"DESCRIPTION">

    • $link->warning()

      After parsing, this method returns any warnings encountered during the parsing process.

    • $link->file()
    • $link->line()

      Simple slots for storing the file and the line in which the link was encountered. They have to be filled in manually.

    • $link->page()

      This method sets or returns the POD page this link points to.

    • $link->node()

      As above, but the destination node text of the link.

    • $link->alttext()

      Sets or returns an alternative text specified in the link.

    • $link->type()

      The node type, either section or item. As an unofficial type, there is also hyperlink, derived from e.g. L<http://perl.com>

    • $link->link()

      Returns the link as contents of L<>. Reciprocal to parse().
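    The accessors above can be exercised on the link from the usage line. This is a sketch that assumes Pod::ParseUtils is installed (on recent Perl releases the Pod::Parser distribution may have to come from CPAN); the second link is taken from the text() table.

```perl
use strict;
use warnings;
use Pod::ParseUtils;

# new() returns undef on failure and leaves the reason in $@.
my $link = Pod::Hyperlink->new('alternative text|page/"section in page"')
    or die "cannot parse link: $@";

printf "page=%s node=%s alttext=%s\n",
    $link->page(), $link->node(), $link->alttext();

# Without alternative text, text() is built from the node and the page.
my $plain = Pod::Hyperlink->new('perldoc/"OPTIONS"')
    or die "cannot parse link: $@";
print $plain->text(), "\n";
```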

    Pod::Cache

    Pod::Cache holds information about a set of POD documents, especially the nodes for hyperlinks. The following methods are available:

    • Pod::Cache->new()

      Create a new cache object. This object can hold an arbitrary number of POD documents of class Pod::Cache::Item.

    • $cache->item()

      Add a new item to the cache. Without arguments, this method returns a list of all cache elements.

    • $cache->find_page($name)

      Look for a POD document named $name in the cache. Returns the reference to the corresponding Pod::Cache::Item object or undef if not found.

    Pod::Cache::Item

    Pod::Cache::Item holds information about individual POD documents, which can be grouped in a Pod::Cache object. It is intended to hold information about the hyperlink nodes of POD documents. The following methods are available:

    • Pod::Cache::Item->new()

      Create a new object.

    • $cacheitem->page()

      Set/retrieve the POD document name (e.g. "Pod::Parser").

    • $cacheitem->description()

      Set/retrieve the POD short description as found in the =head1 NAME section.

    • $cacheitem->path()

      Set/retrieve the POD file storage path.

    • $cacheitem->file()

      Set/retrieve the POD file name.

    • $cacheitem->nodes()

      Add a node (or a list of nodes) to the document's node list. Note that the order is kept, i.e. start with the first node and end with the last. If no argument is given, the current list of nodes is returned in the same order the nodes have been added. A node can be any scalar, but usually is a pair of node string and unique id for the find_node method to work correctly.

    • $cacheitem->find_node($name)

      Look for a node or index entry named $name in the object. Returns the unique id of the node (i.e. the second element of the array stored in the node array) or undef if not found.

    • $cacheitem->idx()

      Add an index entry (or a list of them) to the document's index list. Note that the order is kept, i.e. start with the first node and end with the last. If no argument is given, the current list of index entries is returned in the same order the entries have been added. An index entry can be any scalar, but usually is a pair of string and unique id.
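    A sketch of recording nodes as pairs of node string and unique id, which is the shape find_node() expects. It assumes Pod::ParseUtils is installed; the ids sec1 and sec2 are arbitrary values chosen for the example.

```perl
use strict;
use warnings;
use Pod::ParseUtils;

my $item = Pod::Cache::Item->new();
$item->page('Pod::Parser');
$item->description('base class for creating POD filters and translators');

# Each node is an array reference: [ node string, unique id ].
$item->nodes(
    [ 'DESCRIPTION',     'sec1' ],
    [ 'PARSING OPTIONS', 'sec2' ],
);

# find_node() returns the id of the matching node, or undef.
my $found   = $item->find_node('PARSING OPTIONS');
my $missing = $item->find_node('NO SUCH NODE');

printf "%s: PARSING OPTIONS has id %s\n", $item->page(), $found;
print "unknown nodes yield undef\n" unless defined $missing;
```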

    AUTHOR

    Please report bugs using http://rt.cpan.org.

    Marek Rouchal <marekr@cpan.org>, borrowing a lot of things from pod2man and pod2roff as well as other POD processing tools by Tom Christiansen, Brad Appleton and Russ Allbery.

    Pod::ParseUtils is part of the Pod::Parser distribution.

    SEE ALSO

    pod2man, pod2roff, Pod::Parser, Pod::Checker, pod2html

     
    Pod::Parser

    Perl 5 version 18.2 documentation

    NAME

    Pod::Parser - base class for creating POD filters and translators

    SYNOPSIS

        use Pod::Parser;

        package MyParser;
        @ISA = qw(Pod::Parser);

        sub command {
            my ($parser, $command, $paragraph, $line_num) = @_;
            ## Interpret the command and its text; sample actions might be:
            if ($command eq 'head1') { ... }
            elsif ($command eq 'head2') { ... }
            ## ... other commands and their actions
            my $out_fh = $parser->output_handle();
            my $expansion = $parser->interpolate($paragraph, $line_num);
            print $out_fh $expansion;
        }

        sub verbatim {
            my ($parser, $paragraph, $line_num) = @_;
            ## Format verbatim paragraph; sample actions might be:
            my $out_fh = $parser->output_handle();
            print $out_fh $paragraph;
        }

        sub textblock {
            my ($parser, $paragraph, $line_num) = @_;
            ## Translate/Format this block of text; sample actions might be:
            my $out_fh = $parser->output_handle();
            my $expansion = $parser->interpolate($paragraph, $line_num);
            print $out_fh $expansion;
        }

        sub interior_sequence {
            my ($parser, $seq_command, $seq_argument) = @_;
            ## Expand an interior sequence; sample actions might be:
            return "*$seq_argument*" if ($seq_command eq 'B');
            return "`$seq_argument'" if ($seq_command eq 'C');
            return "_${seq_argument}_" if ($seq_command eq 'I');
            ## ... other sequence commands and their resulting text
        }

        package main;

        ## Create a parser object and have it parse file whose name was
        ## given on the command-line (use STDIN if no files were given).
        $parser = new MyParser();
        $parser->parse_from_filehandle(\*STDIN) if (@ARGV == 0);
        for (@ARGV) { $parser->parse_from_file($_); }

    REQUIRES

    perl5.005, Pod::InputObjects, Exporter, Symbol, Carp

    EXPORTS

    Nothing.

    DESCRIPTION

    Pod::Parser is a base class for creating POD filters and translators. It handles most of the effort involved with parsing the POD sections from an input stream, leaving subclasses free to be concerned only with performing the actual translation of text.

    Pod::Parser parses PODs, and makes method calls to handle the various components of the POD. Subclasses of Pod::Parser override these methods to translate the POD into whatever output format they desire.

    Note: This module is considered legacy; modern Perl releases (5.18 and higher) are going to remove Pod::Parser from core and use Pod::Simple for all things POD.

    QUICK OVERVIEW

    To create a POD filter for translating POD documentation into some other format, you create a subclass of Pod::Parser which typically overrides just the base class implementation for the following methods:

    • command()

    • verbatim()

    • textblock()

    • interior_sequence()

    You may also want to override the begin_input() and end_input() methods for your subclass (to perform any needed per-file and/or per-document initialization or cleanup).

    If you need to perform any preprocessing of input before it is parsed you may want to override one or more of preprocess_line() and/or preprocess_paragraph().

    Sometimes it may be necessary to make more than one pass over the input files. If this is the case you have several options. You can make the first pass using Pod::Parser and override your methods to store the intermediate results in memory somewhere for the end_pod() method to process. You could use Pod::Parser for several passes with an appropriate state variable to control the operation for each pass. If your input source can't be reset to start at the beginning, you can store it in some other structure as a string or an array and have that structure implement a getline() method (which is all that parse_from_filehandle() uses to read input).

    Feel free to add any member data fields you need to keep track of things like current font, indentation, horizontal or vertical position, or whatever else you like. Be sure to read PRIVATE METHODS AND DATA to avoid name collisions.

    For the most part, the Pod::Parser base class should be able to do most of the input parsing for you and leave you free to worry about how to interpret the commands and translate the result.

    Note that this quick overview describes only the simplest, most straightforward use of Pod::Parser: stream-based parsing. It is also possible to use the Pod::Parser::parse_text function to do more sophisticated tree-based parsing. See TREE-BASED PARSING.

    PARSING OPTIONS

    A parse-option is simply a named option of Pod::Parser with a value that corresponds to a certain specified behavior. These various behaviors of Pod::Parser may be enabled/disabled by setting or unsetting one or more parse-options using the parseopts() method. The set of currently accepted parse-options is as follows:

    • -want_nonPODs (default: unset)

      Normally (by default) Pod::Parser will only provide access to the POD sections of the input. Input paragraphs that are not part of the POD-format documentation are not made available to the caller (not even using preprocess_paragraph()). Setting this option to a non-empty, non-zero value will allow preprocess_paragraph() to see non-POD sections of the input as well as POD sections. The cutting() method can be used to determine if the corresponding paragraph is a POD paragraph, or some other input paragraph.

    • -process_cut_cmd (default: unset)

      Normally (by default) Pod::Parser handles the =cut POD directive by itself and does not pass it on to the caller for processing. Setting this option to a non-empty, non-zero value will cause Pod::Parser to pass the =cut directive to the caller just like any other POD command (and hence it may be processed by the command() method).

      Pod::Parser will still interpret the =cut directive to mean that "cutting mode" has been (re)entered, but the caller will get a chance to capture the actual =cut paragraph itself for whatever purpose it desires.

    • -warnings (default: unset)

      Normally (by default) Pod::Parser recognizes a bare minimum of pod syntax errors and warnings and issues diagnostic messages for errors, but not for warnings. (Use Pod::Checker to do more thorough checking of POD syntax.) Setting this option to a non-empty, non-zero value will cause Pod::Parser to issue diagnostics for the few warnings it recognizes as well as the errors.

    Please see parseopts() for a complete description of the interface for the setting and unsetting of parse-options.
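    As an illustration of one parse-option, the sketch below records which commands reach command() with and without -process_cut_cmd. The CommandLogger class, its SEEN field, and the commands_seen() helper are all names invented for this example, and Pod::Parser is assumed to be installed (on recent Perl releases it may have to come from CPAN).

```perl
use strict;
use warnings;

package CommandLogger;
use parent 'Pod::Parser';

# Record each command name instead of producing any output.
sub command {
    my ($parser, $cmd) = @_;
    push @{ $parser->{SEEN} }, $cmd;
    return;
}
sub verbatim  { return }
sub textblock { return }

package main;

# Parse a fixed sample document with the given parse-options and
# return the list of command names that command() received.
sub commands_seen {
    my (%opts) = @_;
    my $pod = "=head1 NAME\n\nText.\n\n=cut\n\nnot pod\n";
    my $parser = CommandLogger->new(SEEN => []);
    $parser->parseopts(%opts) if %opts;
    open my $in, '<', \$pod or die "cannot open in-memory input: $!";
    $parser->parse_from_filehandle($in);
    close $in;
    return @{ $parser->{SEEN} };
}

my @default  = commands_seen();
my @with_cut = commands_seen(-process_cut_cmd => 1);
print "default:               @default\n";
print "with -process_cut_cmd: @with_cut\n";
```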

    RECOMMENDED SUBROUTINE/METHOD OVERRIDES

    Pod::Parser provides several methods which most subclasses will probably want to override. These methods are as follows:

    command()

        $parser->command($cmd,$text,$line_num,$pod_para);

    This method should be overridden by subclasses to take the appropriate action when a POD command paragraph (denoted by a line beginning with "=") is encountered. When such a POD directive is seen in the input, this method is called and is passed:

    • $cmd

      the name of the command for this POD paragraph

    • $text

      the paragraph text for the given POD paragraph command.

    • $line_num

      the line-number of the beginning of the paragraph

    • $pod_para

      a reference to a Pod::Paragraph object which contains further information about the paragraph command (see Pod::InputObjects for details).

    Note that this method is called for =pod paragraphs.

    The base class implementation of this method simply treats the raw POD command as a normal block of paragraph text (invoking the textblock() method with the command paragraph).

    verbatim()

        $parser->verbatim($text,$line_num,$pod_para);

    This method may be overridden by subclasses to take the appropriate action when a block of verbatim text is encountered. It is passed the following parameters:

    • $text

      the block of text for the verbatim paragraph

    • $line_num

      the line-number of the beginning of the paragraph

    • $pod_para

      a reference to a Pod::Paragraph object which contains further information about the paragraph (see Pod::InputObjects for details).

    The base class implementation of this method simply prints the textblock (unmodified) to the output filehandle.

    textblock()

        $parser->textblock($text,$line_num,$pod_para);

    This method may be overridden by subclasses to take the appropriate action when a normal block of POD text is encountered (although the base class method will usually do what you want). It is passed the following parameters:

    • $text

      the block of text for the POD paragraph

    • $line_num

      the line-number of the beginning of the paragraph

    • $pod_para

      a reference to a Pod::Paragraph object which contains further information about the paragraph (see Pod::InputObjects for details).

    In order to process interior sequences, subclass implementations of this method will probably want to invoke either interpolate() or parse_text(), passing it the text block $text and the corresponding line number in $line_num, and then perform any desired processing upon the returned result.

    The base class implementation of this method simply prints the text block as it occurred in the input stream.

    interior_sequence()

        $parser->interior_sequence($seq_cmd,$seq_arg,$pod_seq);

    This method should be overridden by subclasses to take the appropriate action when an interior sequence is encountered. An interior sequence is an embedded command within a block of text which appears as a command name (usually a single uppercase character) followed immediately by a string of text which is enclosed in angle brackets. This method is passed the sequence command $seq_cmd and the corresponding text $seq_arg. It is invoked by the interpolate() method for each interior sequence that occurs in the string that it is passed. It should return the desired text string to be used in place of the interior sequence. The $pod_seq argument is a reference to a Pod::InteriorSequence object which contains further information about the interior sequence. Please see Pod::InputObjects for details if you need to access this additional information.

    Subclass implementations of this method may wish to invoke the nested() method of $pod_seq to see if it is nested inside some other interior-sequence (and if so, which kind).

    The base class implementation of the interior_sequence() method simply returns the raw text of the interior sequence (as it occurred in the input) to the caller.
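    The four overrides can be tried end to end on a string. This is a sketch under a few assumptions: Pod::Parser is installed, the class name StarFormatter and the helper format_pod() are invented for the example, and in-memory file handles stand in for real files.

```perl
use strict;
use warnings;

package StarFormatter;
use parent 'Pod::Parser';

sub command {
    my ($parser, $cmd, $text, $line_num, $pod_para) = @_;
    $text =~ s/\s+\z//;    # the paragraph text ends in newlines
    my $out_fh = $parser->output_handle();
    print {$out_fh} "[$cmd] ", $parser->interpolate($text, $line_num), "\n";
    return;
}

sub verbatim {
    my ($parser, $text, $line_num, $pod_para) = @_;
    my $out_fh = $parser->output_handle();
    print {$out_fh} $text;
    return;
}

sub textblock {
    my ($parser, $text, $line_num, $pod_para) = @_;
    my $out_fh = $parser->output_handle();
    print {$out_fh} $parser->interpolate($text, $line_num);
    return;
}

# Called by interpolate() once per interior sequence.
sub interior_sequence {
    my ($parser, $seq_cmd, $seq_arg, $pod_seq) = @_;
    return "*$seq_arg*"   if $seq_cmd eq 'B';
    return "_${seq_arg}_" if $seq_cmd eq 'I';
    return $seq_arg;
}

package main;

# Run the parser over a string and return what it printed.
sub format_pod {
    my ($pod) = @_;
    my $result = '';
    open my $in,  '<', \$pod    or die "cannot open in-memory input: $!";
    open my $out, '>', \$result or die "cannot open in-memory output: $!";
    StarFormatter->new()->parse_from_filehandle($in, $out);
    close $out;
    close $in;
    return $result;
}

my $formatted = format_pod("=head1 NAME\n\nSome B<bold> text.\n");
print $formatted;
```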

    OPTIONAL SUBROUTINE/METHOD OVERRIDES

    Pod::Parser provides several methods which subclasses may want to override to perform any special pre/post-processing. These methods do not have to be overridden, but it may be useful for subclasses to take advantage of them.

    new()

        my $parser = Pod::Parser->new();

    This is the constructor for Pod::Parser and its subclasses. You do not need to override this method! It is capable of constructing subclass objects as well as base class objects, provided you use any of the following constructor invocation styles:

        my $parser1 = MyParser->new();
        my $parser2 = new MyParser();
        my $parser3 = $parser2->new();

    where MyParser is some subclass of Pod::Parser.

    Using the syntax MyParser::new() to invoke the constructor is not recommended, but if you insist on being able to do this, then the subclass will need to override the new() constructor method. If you do override the constructor, you must be sure to invoke the initialize() method of the newly blessed object.

    Using any of the above invocations, the first argument to the constructor is always the corresponding package name (or object reference). No other arguments are required, but if desired, an associative array (or hash-table) may be passed to the new() constructor, as in:

        my $parser1 = MyParser->new( MYDATA => $value1, MOREDATA => $value2 );
        my $parser2 = new MyParser( -myflag => 1 );

    All arguments passed to the new() constructor will be treated as key/value pairs in a hash-table. The newly constructed object will be initialized by copying the contents of the given hash-table (which may have been empty). The new() constructor for this class and all of its subclasses returns a blessed reference to the initialized object (hash-table).

    initialize()

        $parser->initialize();

    This method performs any necessary object initialization. It takes no arguments (other than the object instance of course, which is typically copied to a local variable named $self). If subclasses override this method then they must be sure to invoke $self->SUPER::initialize().

    begin_pod()

        $parser->begin_pod();

    This method is invoked at the beginning of processing for each POD document that is encountered in the input. Subclasses should override this method to perform any per-document initialization.

    begin_input()

        $parser->begin_input();

    This method is invoked by parse_from_filehandle() immediately before processing input from a filehandle. The base class implementation does nothing; however, subclasses may override it to perform any per-file initializations.

    Note that if multiple files are parsed for a single POD document (perhaps the result of some future =include directive) this method is invoked for every file that is parsed. If you wish to perform certain initializations once per document, then you should use begin_pod().

    end_input()

        $parser->end_input();

    This method is invoked by parse_from_filehandle() immediately after processing input from a filehandle. The base class implementation does nothing; however, subclasses may override it to perform any per-file cleanup actions.

    Please note that if multiple files are parsed for a single POD document (perhaps the result of some kind of =include directive) this method is invoked for every file that is parsed. If you wish to perform certain cleanup actions once per document, then you should use end_pod().

    end_pod()

        $parser->end_pod();

    This method is invoked at the end of processing for each POD document that is encountered in the input. Subclasses should override this method to perform any per-document finalization.
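    The order in which the four per-document and per-file hooks fire for a single input can be observed with a small tracer. HookTracer and its CALLS field are names invented for this sketch, and Pod::Parser is assumed to be installed.

```perl
use strict;
use warnings;

package HookTracer;
use parent 'Pod::Parser';

# Each hook appends its own name to the CALLS field.
sub begin_pod   { my ($self) = @_; push @{ $self->{CALLS} }, 'begin_pod';   return }
sub begin_input { my ($self) = @_; push @{ $self->{CALLS} }, 'begin_input'; return }
sub end_input   { my ($self) = @_; push @{ $self->{CALLS} }, 'end_input';   return }
sub end_pod     { my ($self) = @_; push @{ $self->{CALLS} }, 'end_pod';     return }

# Content callbacks are silenced; only the hooks matter here.
sub command   { return }
sub verbatim  { return }
sub textblock { return }

package main;

my $pod = "=head1 NAME\n\nSome text.\n";
open my $in, '<', \$pod or die "cannot open in-memory input: $!";

my $tracer = HookTracer->new(CALLS => []);
$tracer->parse_from_filehandle($in);
close $in;

my @calls = @{ $tracer->{CALLS} };
print join(' -> ', @calls), "\n";
```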

    preprocess_line()

        $textline = $parser->preprocess_line($text, $line_num);

    This method should be overridden by subclasses that wish to perform any kind of preprocessing for each line of input (before it has been determined whether or not it is part of a POD paragraph). The parameter $text is the input line; and the parameter $line_num is the line number of the corresponding text line.

    The value returned should correspond to the new text to use in its place. If the empty string or an undefined value is returned then no further processing will be performed for this line.

    Please note that the preprocess_line() method is invoked before the preprocess_paragraph() method. After all (possibly preprocessed) lines in a paragraph have been assembled together and it has been determined that the paragraph is part of the POD documentation from one of the selected sections, then preprocess_paragraph() is invoked.

    The base class implementation of this method returns the given text.

    preprocess_paragraph()

        $textblock = $parser->preprocess_paragraph($text, $line_num);

    This method should be overridden by subclasses that wish to perform any kind of preprocessing for each block (paragraph) of POD documentation that appears in the input stream. The parameter $text is the POD paragraph from the input file; and the parameter $line_num is the line number for the beginning of the corresponding paragraph.

    The value returned should correspond to the new text to use in its place. If the empty string or an undefined value is returned, then the given $text is ignored (not processed).

    This method is invoked after gathering up all the lines in a paragraph and after determining the cutting state of the paragraph, but before trying to further parse or interpret them. After preprocess_paragraph() returns, the current cutting state (which is returned by $self->cutting() ) is examined. If it evaluates to true then input text (including the given $text ) is cut (not processed) until the next POD directive is encountered.

    Please note that the preprocess_line() method is invoked before the preprocess_paragraph() method. After all (possibly preprocessed) lines in a paragraph have been assembled together and either it has been determined that the paragraph is part of the POD documentation from one of the selected sections or the -want_nonPODs option is true, then preprocess_paragraph() is invoked.

    The base class implementation of this method returns the given text.

    METHODS FOR PARSING AND PROCESSING

    Pod::Parser provides several methods to process input text. These methods typically won't need to be overridden (and in some cases they can't be overridden), but subclasses may want to invoke them to exploit their functionality.

    parse_text()

        $ptree1 = $parser->parse_text($text, $line_num);
        $ptree2 = $parser->parse_text({%opts}, $text, $line_num);
        $ptree3 = $parser->parse_text(\%opts, $text, $line_num);

    This method is useful if you need to perform your own interpolation of interior sequences and can't rely upon interpolate to expand them in simple bottom-up order.

    The parameter $text is a string or block of text to be parsed for interior sequences; and the parameter $line_num is the line number corresponding to the beginning of $text .

    parse_text() will parse the given text into a parse-tree of "nodes". Each "node" in the parse tree is either a text-string or a Pod::InteriorSequence. The result returned is a parse-tree of type Pod::ParseTree. Please see Pod::InputObjects for more information about Pod::InteriorSequence and Pod::ParseTree.

    If desired, an optional hash-ref may be specified as the first argument to customize certain aspects of the parse-tree that is created and returned. The recognized option keywords are:

    • -expand_seq => code-ref|method-name

      Normally, the parse-tree returned by parse_text() will contain an unexpanded Pod::InteriorSequence object for each interior-sequence encountered. Specifying -expand_seq tells parse_text() to "expand" every interior-sequence it sees by invoking the referenced function (or named method of the parser object) and using the return value as the expanded result.

      If a subroutine reference was given, it is invoked as:

          &$code_ref( $parser, $sequence )

      and if a method-name was given, it is invoked as:

          $parser->method_name( $sequence )

      where $parser is a reference to the parser object, and $sequence is a reference to the interior-sequence object. [NOTE: If the interior_sequence() method is specified, then it is invoked according to the interface specified in interior_sequence()].

    • -expand_text => code-ref|method-name

      Normally, the parse-tree returned by parse_text() will contain a text-string for each contiguous sequence of characters outside of an interior-sequence. Specifying -expand_text tells parse_text() to "preprocess" every such text-string it sees by invoking the referenced function (or named method of the parser object) and using the return value as the preprocessed (or "expanded") result. [Note that if the result is an interior-sequence, then it will not be expanded as specified by the -expand_seq option; Any such recursive expansion needs to be handled by the specified callback routine.]

      If a subroutine reference was given, it is invoked as:

          &$code_ref( $parser, $text, $ptree_node )

      and if a method-name was given, it is invoked as:

          $parser->method_name( $text, $ptree_node )

      where $parser is a reference to the parser object, $text is the text-string encountered, and $ptree_node is a reference to the current node in the parse-tree (usually an interior-sequence object or else the top-level node of the parse-tree).

    • -expand_ptree => code-ref|method-name

      Rather than returning a Pod::ParseTree , pass the parse-tree as an argument to the referenced subroutine (or named method of the parser object) and return the result instead of the parse-tree object.

      If a subroutine reference was given, it is invoked as:

          &$code_ref( $parser, $ptree )

      and if a method-name was given, it is invoked as:

          $parser->method_name( $ptree )

      where $parser is a reference to the parser object, and $ptree is a reference to the parse-tree object.

    interpolate()

        $textblock = $parser->interpolate($text, $line_num);

    This method translates all text (including any embedded interior sequences) in the given text string $text and returns the interpolated result. The parameter $line_num is the line number corresponding to the beginning of $text .

    interpolate() merely invokes a private method to recursively expand nested interior sequences in bottom-up order (innermost sequences are expanded first). If there is a need to expand nested sequences in some alternate order, use parse_text instead.
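
    As a rough illustration of what bottom-up order means, the following sketch expands nested sequences innermost-first using a plain regular expression. It is not how Pod::Parser is implemented; the %wrap table and the expand_one() and expand_bottom_up() names are hypothetical, and the pattern assumes the text contains no literal angle brackets outside of sequences.

```perl
use strict;
use warnings;

# Hypothetical expansion rules for a few common sequence commands.
my %wrap = (B => ['*', '*'], I => ['_', '_'], C => ['`', '`']);

# Expand a single sequence whose inner text is already fully expanded.
sub expand_one {
    my ($cmd, $inner) = @_;
    my $w = $wrap{$cmd} or return $inner;   # unknown command: keep inner text
    return $w->[0] . $inner . $w->[1];
}

# Repeatedly rewrite sequences that contain no nested "<" or ">", so the
# innermost sequence is always expanded before the one enclosing it.
sub expand_bottom_up {
    my ($text) = @_;
    1 while $text =~ s/([A-Z])<([^<>]*)>/expand_one($1, $2)/e;
    return $text;
}

# prints: *bold and _italic_ text*
print expand_bottom_up("B<bold and I<italic> text>"), "\n";
```

    In the example, I<italic> is rewritten on the first pass and the enclosing B<...> on the second, which is the ordering interpolate() is described as using. Any other ordering would call for parse_text() instead.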

    parse_from_filehandle()

        $parser->parse_from_filehandle($in_fh,$out_fh);

    This method takes an input filehandle (which is assumed to already be opened for reading) and reads the entire input stream looking for blocks (paragraphs) of POD documentation to be processed. If no first argument is given the default input filehandle STDIN is used.

    The $in_fh parameter may be any object that provides a getline() method to retrieve a single line of input text (hence, an appropriate wrapper object could be used to parse PODs from a single string or an array of strings).
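
    A wrapper of the kind mentioned above might look like the following minimal sketch. The LineSource class is hypothetical and not part of Pod::Parser; the only thing it relies on from the text is the stated contract that the input object provides a getline() method returning one line at a time.

```perl
package LineSource;    # hypothetical helper, not part of Pod::Parser
use strict;
use warnings;

# Wrap one or more strings behind a getline() method.
sub new {
    my ($class, @chunks) = @_;
    # Split every chunk into lines, keeping the trailing newlines.
    my @lines = map { /(.*\n|.+)/g } @chunks;
    return bless { lines => \@lines }, $class;
}

# Return the next line, or undef once the input is exhausted.
sub getline {
    my ($self) = @_;
    return shift @{ $self->{lines} };
}

package main;

my $src = LineSource->new("=head1 NAME\n\nfoo - bar\n");
while (defined(my $line = $src->getline)) {
    print $line;
}
```

    Under the contract described above, an object like $src could be handed to parse_from_filehandle() as $in_fh in place of a real filehandle.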

    Using $in_fh->getline() , input is read line-by-line and assembled into paragraphs or "blocks" (which are separated by lines containing nothing but whitespace). For each block of POD documentation encountered it will invoke a method to parse the given paragraph.

    If a second argument is given then it should correspond to a filehandle where output should be sent (otherwise the default output filehandle is STDOUT if no output filehandle is currently in use).

    NOTE: For performance reasons, this method caches the input stream at the top of the stack in a local variable. Any attempt by clients to change the stack contents while this method is executing will not affect the input stream used by the current invocation of this method.

    This method does not usually need to be overridden by subclasses.

    parse_from_file()

        $parser->parse_from_file($filename,$outfile);

    This method takes a filename and does the following:

    • opens the input file for reading and the output file for writing (creating the appropriate filehandles)

    • invokes the parse_from_filehandle() method passing it the corresponding input and output filehandles.

    • closes the input and output files.

    If the special input filename "-" or "<&STDIN" is given then the STDIN filehandle is used for input (and no open or close is performed). If no input filename is specified then "-" is implied. Filehandle references, or objects that support the regular IO operations (like <$fh> or $fh->getline) are also accepted; the handles must already be opened.

    If a second argument is given then it should be the name of the desired output file. If the special output filename "-" or ">&STDOUT" is given then the STDOUT filehandle is used for output (and no open or close is performed). If the special output filename ">&STDERR" is given then the STDERR filehandle is used for output (and no open or close is performed). If no output filehandle is currently in use and no output filename is specified, then "-" is implied. Alternatively, filehandle references or objects that support the regular IO operations (like print, e.g. IO::String) are also accepted; the object must already be opened.

    This method does not usually need to be overridden by subclasses.

    ACCESSOR METHODS

    Clients of Pod::Parser should use the following methods to access instance data fields:

    errorsub()

        $parser->errorsub("method_name");
        $parser->errorsub(\&warn_user);
        $parser->errorsub(sub { print STDERR @_ });

    Specifies the method or subroutine to use when printing error messages about POD syntax. The supplied method/subroutine must return TRUE upon successful printing of the message. If undef is given, then carp (from the Carp module) is used to issue error messages (this is the default behavior).

        my $errorsub = $parser->errorsub();
        my $errmsg = "This is an error message!\n";
        (ref $errorsub) and &{$errorsub}($errmsg)
            or (defined $errorsub) and $parser->$errorsub($errmsg)
                or carp($errmsg);

    Returns a method name, or else a reference to the user-supplied subroutine used to print error messages. Returns undef if carp (from the Carp module) is used to issue error messages (this is the default behavior).
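
    The three-way dispatch in the snippet above can also be written as an explicit chain of tests, which some readers may find easier to follow. The sketch below is self-contained: the Reporter class, its remember() method and the report() function are hypothetical stand-ins, not part of Pod::Parser. Note one difference from the and/or form: here a handler that returns false does not fall through to the next alternative.

```perl
use strict;
use warnings;
use Carp;

package Reporter;   # hypothetical stand-in for a parser object
sub new      { return bless { log => [] }, shift }
sub remember { my ($self, $msg) = @_; push @{ $self->{log} }, $msg; return 1 }

package main;

# Dispatch an error message: code-ref first, then method name,
# then carp() as the fallback when no errorsub is set.
sub report {
    my ($obj, $errorsub, $errmsg) = @_;
    if (ref $errorsub)     { return $errorsub->($errmsg) }
    if (defined $errorsub) { return $obj->$errorsub($errmsg) }
    carp($errmsg);
    return 1;
}

my $obj = Reporter->new;
my @seen;
report($obj, sub { push @seen, @_; 1 }, "via code-ref\n");
report($obj, 'remember', "via method name\n");
```

    Both handlers return a true value, in line with the requirement that the supplied method or subroutine return TRUE after printing the message.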

    cutting()

        $boolean = $parser->cutting();

    Returns the current cutting state: a boolean-valued scalar which evaluates to true if text from the input file is currently being "cut" (meaning it is not considered part of the POD document).

        $parser->cutting($boolean);

    Sets the current cutting state to the given value and returns the result.

    parseopts()

    When invoked with no additional arguments, parseopts returns a hashtable of all the current parsing options.

        ## See if we are parsing non-POD sections as well as POD ones
        my %opts = $parser->parseopts();
        $opts{'-want_nonPODs'} and print "-want_nonPODs\n";

    When invoked using a single string, parseopts treats the string as the name of a parse-option and returns its corresponding value if it exists (returns undef if it doesn't).

        ## Did we ask to see '=cut' paragraphs?
        my $want_cut = $parser->parseopts('-process_cut_cmd');
        $want_cut and print "-process_cut_cmd\n";

    When invoked with multiple arguments, parseopts treats them as key/value pairs and the specified parse-option names are set to the given values. Any unspecified parse-options are unaffected.

        ## Set them back to the default
        $parser->parseopts(-warnings => 0);

    When passed a single hash-ref, parseopts uses that hash to completely reset the existing parse-options; all previous parse-option values are lost.

        ## Reset all options to default
        $parser->parseopts( { } );

    See PARSING OPTIONS for more information on the name and meaning of each parse-option currently recognized.
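
    To make the four calling conventions concrete, here is a minimal sketch of an accessor that distinguishes them by argument count and type. The OptHolder class is hypothetical and exists only for illustration; it mirrors the behavior described above but is not the Pod::Parser source.

```perl
package OptHolder;   # hypothetical class, for illustration only
use strict;
use warnings;

sub new { return bless { opts => {} }, shift }

sub parseopts {
    my $self = shift;
    return %{ $self->{opts} } if @_ == 0;      # no args: return all options
    if (@_ == 1) {
        my $arg = shift;
        if (ref $arg eq 'HASH') {              # hash-ref: replace everything
            $self->{opts} = { %$arg };
            return %{ $self->{opts} };
        }
        return $self->{opts}{$arg};            # string: look up one option
    }
    my %new = @_;                              # pairs: merge into current set
    @{ $self->{opts} }{ keys %new } = values %new;
    return %{ $self->{opts} };
}

package main;

my $p = OptHolder->new;
$p->parseopts('-want_nonPODs' => 1, '-warnings' => 0);
print "want_nonPODs is on\n" if $p->parseopts('-want_nonPODs');
$p->parseopts({});    # back to an empty option set
```

    The key/value form leaves unspecified options alone, while the hash-ref form discards them, which is the distinction drawn in the text above.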

    output_file()

        $fname = $parser->output_file();

    Returns the name of the output file being written.

    output_handle()

        $fhandle = $parser->output_handle();

    Returns the output filehandle object.

    input_file()

        $fname = $parser->input_file();

    Returns the name of the input file being read.

    input_handle()

        $fhandle = $parser->input_handle();

    Returns the current input filehandle object.

    PRIVATE METHODS AND DATA

    Pod::Parser makes use of several internal methods and data fields which clients should not need to see or use. For the sake of avoiding name collisions for client data and methods, these methods and fields are briefly discussed here. Determined hackers may obtain further information about them by reading the Pod::Parser source code.

    Private data fields are stored in the hash-object whose reference is returned by the new() constructor for this class. The names of all private methods and data-fields used by Pod::Parser begin with a prefix of "_" and match the regular expression /^_\w+$/ .

    TREE-BASED PARSING

    If straightforward stream-based parsing won't meet your needs (as is likely the case for tasks such as translating PODs into structured markup languages like HTML and XML) then you may need to take the tree-based approach. Rather than doing everything in one pass and calling the interpolate() method to expand sequences into text, it may be desirable to instead create a parse-tree using the parse_text() method to return a tree-like structure which may contain an ordered list of children (each of which may be a text-string, or a similar tree-like structure).

    Pay special attention to METHODS FOR PARSING AND PROCESSING and to the objects described in Pod::InputObjects. The former describes the gory details and parameters for how to customize and extend the parsing behavior of Pod::Parser. Pod::InputObjects provides several objects that may all be used interchangeably as parse-trees. The most obvious one is the Pod::ParseTree object. It defines the basic interface and functionality that all things trying to be a POD parse-tree should do. A Pod::ParseTree is defined such that each "node" may be a text-string, or a reference to another parse-tree. Each Pod::Paragraph object and each Pod::InteriorSequence object also supports the basic parse-tree interface.

    The parse_text() method takes a given paragraph of text, and returns a parse-tree that contains one or more children, each of which may be a text-string, or an InteriorSequence object. There are also callback-options that may be passed to parse_text() to customize the way it expands or transforms interior-sequences, as well as the returned result. These callbacks can be used to create a parse-tree with custom-made objects (which may or may not support the parse-tree interface, depending on how you choose to do it).

    If you wish to turn an entire POD document into a parse-tree, that process is fairly straightforward. The parse_text() method is the key to doing this successfully. Every paragraph-callback (i.e. the polymorphic methods for command(), verbatim(), and textblock() paragraphs) takes a Pod::Paragraph object as an argument. Each paragraph object has a parse_tree() method that can be used to get or set a corresponding parse-tree. So for each of those paragraph-callback methods, simply call parse_text() with the options you desire, and then use the returned parse-tree to assign to the given paragraph object.

    That gives you a parse-tree for each paragraph - so now all you need is an ordered list of paragraphs. You can maintain that yourself as a data element in the object/hash. The most straightforward way would be simply to use an array-ref, with the desired set of custom "options" for each invocation of parse_text. Let's assume the desired option-set is given by the hash %options . Then we might do something like the following:

        package MyPodParserTree;
        use Pod::Parser;
        our @ISA = qw( Pod::Parser );
        ...
        sub begin_pod {
            my $self = shift;
            $self->{'-paragraphs'} = []; ## initialize paragraph list
        }
        sub command {
            my ($parser, $command, $paragraph, $line_num, $pod_para) = @_;
            my $ptree = $parser->parse_text({%options}, $paragraph, ...);
            $pod_para->parse_tree( $ptree );
            push @{ $parser->{'-paragraphs'} }, $pod_para;
        }
        sub verbatim {
            my ($parser, $paragraph, $line_num, $pod_para) = @_;
            ## verbatim text has no interior sequences, so store it unparsed
            push @{ $parser->{'-paragraphs'} }, $pod_para;
        }
        sub textblock {
            my ($parser, $paragraph, $line_num, $pod_para) = @_;
            my $ptree = $parser->parse_text({%options}, $paragraph, ...);
            $pod_para->parse_tree( $ptree );
            push @{ $parser->{'-paragraphs'} }, $pod_para;
        }
        ...
        package main;
        ...
        my $parser = new MyPodParserTree(...);
        $parser->parse_from_file(...);
        my $paragraphs_ref = $parser->{'-paragraphs'};

    Of course, in this module-author's humble opinion, I'd be more inclined to use the existing Pod::ParseTree object than a simple array. That way everything in it, paragraphs and sequences, all respond to the same core interface for all parse-tree nodes. The result would look something like:

        package MyPodParserTree2;
        ...
        sub begin_pod {
            my $self = shift;
            $self->{'-ptree'} = new Pod::ParseTree; ## initialize parse-tree
        }
        sub parse_tree {
            ## convenience method to get/set the parse-tree for the entire POD
            (@_ > 1) and $_[0]->{'-ptree'} = $_[1];
            return $_[0]->{'-ptree'};
        }
        sub command {
            my ($parser, $command, $paragraph, $line_num, $pod_para) = @_;
            my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...);
            $pod_para->parse_tree( $ptree );
            $parser->parse_tree()->append( $pod_para );
        }
        sub verbatim {
            my ($parser, $paragraph, $line_num, $pod_para) = @_;
            $parser->parse_tree()->append( $pod_para );
        }
        sub textblock {
            my ($parser, $paragraph, $line_num, $pod_para) = @_;
            my $ptree = $parser->parse_text({<<options>>}, $paragraph, ...);
            $pod_para->parse_tree( $ptree );
            $parser->parse_tree()->append( $pod_para );
        }
        ...
        package main;
        ...
        my $parser = new MyPodParserTree2(...);
        $parser->parse_from_file(...);
        my $ptree = $parser->parse_tree;
        ...

    Now you have the entire POD document as one great big parse-tree. You can even use the -expand_seq option to parse_text to insert whole different kinds of objects. Just don't expect Pod::Parser to know what to do with them after that. That will need to be in your code. Or, alternatively, you can insert any object you like so long as it conforms to the Pod::ParseTree interface.

    One could use this to create subclasses of Pod::Paragraphs and Pod::InteriorSequences for specific commands (or to create your own custom node-types in the parse-tree) and add some kind of emit() method to each custom node/subclass object in the tree. Then all you'd need to do is recursively walk the tree in the desired order, processing the children (most likely from left to right) by formatting them if they are text-strings, or by calling their emit() method if they are objects/references.
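
    The recursive walk described above might be sketched as follows. Everything here is hypothetical: BoldNode stands in for a custom node class, render() is the walker, and escape() is one possible way of formatting plain text-strings (here, for HTML-like output). None of these names come from Pod::Parser or Pod::InputObjects.

```perl
use strict;
use warnings;

package BoldNode;   # hypothetical custom node type with an emit() method
sub new      { my ($class, @kids) = @_; return bless { kids => [@kids] }, $class }
sub children { return @{ $_[0]{kids} } }
sub emit     { my ($self) = @_; return '<b>' . main::render($self->children) . '</b>' }

package main;

# Format a plain text-string for the output format.
sub escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Walk the children left to right: plain strings are formatted directly,
# objects are asked to emit() themselves (which recurses into their children).
sub render {
    my @nodes = @_;
    return join '', map { ref($_) ? $_->emit : escape($_) } @nodes;
}

# prints: a &lt; b and <b>bold &amp; <b>nested</b></b>
print render('a < b and ', BoldNode->new('bold & ', BoldNode->new('nested'))), "\n";
```

    Because each node decides how to emit itself, supporting a new command would only require a new node class rather than changes to the walker.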

    CAVEATS

    Please note that POD has the notion of "paragraphs": a paragraph starts after a blank (read: empty) line, with the single exception of the start of the file, which also starts a paragraph. In particular, a command (e.g. =head1 ) must be preceded by a blank line; __END__ is not a blank line.

    SEE ALSO

    Pod::InputObjects, Pod::Select

    Pod::InputObjects defines POD input objects corresponding to command paragraphs, parse-trees, and interior-sequences.

    Pod::Select is a subclass of Pod::Parser which provides the ability to selectively include and/or exclude sections of a POD document from being translated based upon the current heading, subheading, subsubheading, etc.

    AUTHOR

    Please report bugs using http://rt.cpan.org.

    Brad Appleton <bradapp@enteract.com>

    Based on code for Pod::Text written by Tom Christiansen <tchrist@mox.perl.com>

    LICENSE

    Pod-Parser is free software; you can redistribute it and/or modify it under the terms of the Artistic License distributed with Perl version 5.000 or (at your option) any later version. Please refer to the Artistic License that came with your Perl distribution for more details. If your version of Perl was not distributed under the terms of the Artistic License, then you may distribute PodParser under the same terms as Perl itself.

     
    Pod::Perldoc

    Perl 5 version 18.2 documentation

    NAME

    Pod::Perldoc - Look up Perl documentation in Pod format.

    SYNOPSIS

        use Pod::Perldoc ();
        Pod::Perldoc->run();

    DESCRIPTION

    The guts of the perldoc utility.

    SEE ALSO

    perldoc

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002-2007 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    Pod::PlainText

    Perl 5 version 18.2 documentation

    NAME

    Pod::PlainText - Convert POD data to formatted ASCII text

    SYNOPSIS

        use Pod::PlainText;
        my $parser = Pod::PlainText->new (sentence => 0, width => 78);

        # Read POD from STDIN and write to STDOUT.
        $parser->parse_from_filehandle;

        # Read POD from file.pod and write to file.txt.
        $parser->parse_from_file ('file.pod', 'file.txt');

    DESCRIPTION

    Pod::PlainText is a module that can convert documentation in the POD format (the preferred language for documenting Perl) into formatted ASCII. It uses no special formatting controls or codes whatsoever, and its output is therefore suitable for nearly any device.

    As a derived class from Pod::Parser, Pod::PlainText supports the same methods and interfaces. See Pod::Parser for all the details; briefly, one creates a new parser with Pod::PlainText->new() and then calls either parse_from_filehandle() or parse_from_file().

    new() can take options, in the form of key/value pairs, that control the behavior of the parser. The currently recognized options are:

    • alt

      If set to a true value, selects an alternate output format that, among other things, uses a different heading style and marks =item entries with a colon in the left margin. Defaults to false.

    • indent

      The number of spaces to indent regular text, and the default indentation for =over blocks. Defaults to 4.

    • loose

      If set to a true value, a blank line is printed after a =headN heading. If set to false (the default), no blank line is printed after =headN . This is the default because it's the expected formatting for manual pages; if you're formatting arbitrary text documents, setting this to true may result in more pleasing output.

    • sentence

      If set to a true value, Pod::PlainText will assume that each sentence ends in two spaces, and will try to preserve that spacing. If set to false, all consecutive whitespace in non-verbatim paragraphs is compressed into a single space. Defaults to true.

    • width

      The column at which to wrap text on the right-hand side. Defaults to 76.

    The standard Pod::Parser method parse_from_filehandle() takes up to two arguments, the first being the file handle to read POD from and the second being the file handle to write the formatted output to. The first defaults to STDIN if not given, and the second defaults to STDOUT. The method parse_from_file() is almost identical, except that its two arguments are the input and output disk files instead. See Pod::Parser for the specific details.

    DIAGNOSTICS

    • Bizarre space in item

      (W) Something has gone wrong in internal =item processing. This message indicates a bug in Pod::PlainText; you should never see it.

    • Can't open %s for reading: %s

      (F) Pod::PlainText was invoked via the compatibility mode pod2text() interface and the input file it was given could not be opened.

    • Unknown escape: %s

      (W) The POD source contained an E<> escape that Pod::PlainText didn't know about.

    • Unknown sequence: %s

      (W) The POD source contained a non-standard internal sequence (something of the form X<> ) that Pod::PlainText didn't know about.

    • Unmatched =back

      (W) Pod::PlainText encountered a =back command that didn't correspond to an =over command.

    RESTRICTIONS

    Embedded Ctrl-As (octal 001) in the input will be mapped to spaces on output, due to an internal implementation detail.

    NOTES

    This is a replacement for an earlier Pod::Text module written by Tom Christiansen. It has a revamped interface, since it now uses Pod::Parser, but an interface roughly compatible with the old Pod::Text::pod2text() function is still available. Please change to the new calling convention, though.

    The original Pod::Text contained code to do formatting via termcap sequences, although it wasn't turned on by default and it was problematic to get it to work at all. This rewrite doesn't even try to do that, but a subclass of it does. Look for Pod::Text::Termcap.

    SEE ALSO

    Pod::PlainText is part of the Pod::Parser distribution.

    Pod::Parser, Pod::Text::Termcap, pod2text(1)

    AUTHOR

    Please report bugs using http://rt.cpan.org.

    Russ Allbery <rra@stanford.edu>, based very heavily on the original Pod::Text by Tom Christiansen <tchrist@mox.perl.com> and its conversion to Pod::Parser by Brad Appleton <bradapp@enteract.com>.

     
    Pod::Select

    Perl 5 version 18.2 documentation

    NAME

    Pod::Select, podselect() - extract selected sections of POD from input

    SYNOPSIS

        use Pod::Select;

        ## Select all the POD sections for each file in @filelist
        ## and print the result on standard output.
        podselect(@filelist);

        ## Same as above, but write to tmp.out
        podselect({-output => "tmp.out"}, @filelist);

        ## Select from the given filelist, only those POD sections that are
        ## within a 1st level section named any of: NAME, SYNOPSIS, OPTIONS.
        podselect({-sections => ["NAME|SYNOPSIS", "OPTIONS"]}, @filelist);

        ## Select the "DESCRIPTION" section of the PODs from STDIN and write
        ## the result to STDERR.
        podselect({-output => ">&STDERR", -sections => ["DESCRIPTION"]}, \*STDIN);

    or

        use Pod::Select;

        ## Create a parser object for selecting POD sections from the input
        my $parser = Pod::Select->new();

        ## Select all the POD sections from STDIN
        ## and print the result to tmp.out.
        $parser->parse_from_file("<&STDIN", "tmp.out");

        ## Select from the given filelist, only those POD sections that are
        ## within a 1st level section named any of: NAME, SYNOPSIS, OPTIONS.
        $parser->select("NAME|SYNOPSIS", "OPTIONS");
        for (@filelist) { $parser->parse_from_file($_); }

        ## Select the "DESCRIPTION" and "SEE ALSO" sections of the PODs from
        ## STDIN and write the result to STDERR.
        $parser->select("DESCRIPTION");
        $parser->add_selection("SEE ALSO");
        $parser->parse_from_filehandle(\*STDIN, \*STDERR);

    REQUIRES

    perl5.005, Pod::Parser, Exporter, Carp

    EXPORTS

    podselect()

    DESCRIPTION

    podselect() is a function which will extract specified sections of pod documentation from an input stream. This ability is provided by the Pod::Select module, which is a subclass of Pod::Parser. Pod::Select provides a method named select() to specify the set of POD sections to select for processing/printing. podselect() merely creates a Pod::Select object and then invokes select() followed by parse_from_file().

    SECTION SPECIFICATIONS

    podselect() and Pod::Select::select() may be given one or more "section specifications" to restrict the text processed to only the desired set of sections and their corresponding subsections. A section specification is a string containing one or more Perl-style regular expressions separated by forward slashes ("/"). If you need to use a forward slash literally within a section title you can escape it with a backslash ("\/").

    The formal syntax of a section specification is:

    • head1-title-regex/head2-title-regex/...

    Any omitted or empty regular expressions will default to ".*". Please note that each regular expression given is implicitly anchored by adding "^" and "$" to the beginning and end. Also, if a given regular expression starts with a "!" character, then the expression is negated (so !foo would match anything except foo ).

    Some example section specifications follow.

    • Match the NAME and SYNOPSIS sections and all of their subsections:

      NAME|SYNOPSIS

    • Match only the Question and Answer subsections of the DESCRIPTION section:

      DESCRIPTION/Question|Answer

    • Match the Comments subsection of all sections:

      /Comments

    • Match all subsections of DESCRIPTION except for Comments :

      DESCRIPTION/!Comments

    • Match the DESCRIPTION section but do not match any of its subsections:

      DESCRIPTION/!.+

    • Match all top level sections but none of their subsections:

      /!.+
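
    The rules above (slash-separated parts, implicit anchoring, "!" negation, and ".*" for omitted parts) can be illustrated with a small self-contained sketch. The compile_spec() and spec_matches() functions are hypothetical and are not the Pod::Select implementation; a missing heading at some level is treated here as the empty string, which is an assumption of this sketch.

```perl
use strict;
use warnings;

# Compile "head1-regex/head2-regex/..." into a list of [negated, qr//] pairs.
sub compile_spec {
    my ($spec) = @_;
    # Split on "/" unless it is escaped as "\/"; keep empty fields.
    my @parts = split m{(?<!\\)/}, $spec, -1;
    return map {
        my $re = $_;
        $re =~ s{\\/}{/}g;                      # unescape literal slashes
        my $negated = ($re =~ s/^!//) ? 1 : 0;  # leading "!" negates the part
        $re = '.*' if $re eq '';                # omitted regex matches anything
        [ $negated, qr/^(?:$re)$/ ];            # implicit ^ and $ anchors
    } @parts;
}

# True if each heading level satisfies the corresponding part of the spec.
sub spec_matches {
    my ($spec, @headings) = @_;
    my @rules = compile_spec($spec);
    for my $i (0 .. $#rules) {
        my ($negated, $re) = @{ $rules[$i] };
        my $title = defined $headings[$i] ? $headings[$i] : '';
        my $hit   = ($title =~ $re) ? 1 : 0;
        return 0 if $hit == $negated;
    }
    return 1;
}

print spec_matches('DESCRIPTION/!Comments', 'DESCRIPTION', 'Notes')
    ? "selected\n" : "skipped\n";
```

    With this sketch, 'DESCRIPTION/!.+' accepts text directly under DESCRIPTION (no subsection heading) and rejects anything inside one of its subsections, matching the example in the list above.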

    OBJECT METHODS

    The following methods are provided in this module. Each one takes a reference to the object itself as an implicit first parameter.

    curr_headings()

        ($head1, $head2, $head3, ...) = $parser->curr_headings();
        $head1 = $parser->curr_headings(1);

    This method returns a list of the currently active section headings and subheadings in the document being parsed. The list of headings returned corresponds to the most recently parsed paragraph of the input.

    If an argument is given, it must correspond to the desired section heading number, in which case only the specified section heading is returned. If there is no current section heading at the specified level, then undef is returned.

    select()

        $parser->select($section_spec1,$section_spec2,...);

    This method is used to select the particular sections and subsections of POD documentation that are to be printed and/or processed. The existing set of selected sections is replaced with the given set of sections. See add_selection() for adding to the current set of selected sections.

    Each of the $section_spec arguments should be a section specification as described in SECTION SPECIFICATIONS. The section specifications are parsed by this method and the resulting regular expressions are stored in the invoking object.

    If no $section_spec arguments are given, then the existing set of selected sections is cleared out (which means all sections will be processed).

    This method should not normally be overridden by subclasses.

    add_selection()

        $parser->add_selection($section_spec1,$section_spec2,...);

    This method is used to add to the currently selected sections and subsections of POD documentation that are to be printed and/or processed. See select() for replacing the currently selected sections.

    Each of the $section_spec arguments should be a section specification as described in SECTION SPECIFICATIONS. The section specifications are parsed by this method and the resulting regular expressions are stored in the invoking object.

    This method should not normally be overridden by subclasses.

    clear_selections()

        $parser->clear_selections();

    This method takes no arguments; it has exactly the same effect as invoking select() with no arguments.

    match_section()

        $boolean = $parser->match_section($heading1,$heading2,...);

    Returns a value of true if the given section and subsection heading titles match any of the currently selected section specifications in effect from prior calls to select() and add_selection() (or if there are no explicitly selected/deselected sections).

    The arguments $heading1 , $heading2 , etc. are the heading titles of the corresponding sections, subsections, etc. to try and match. If $headingN is omitted then it defaults to the current corresponding section heading title in the input.

    This method should not normally be overridden by subclasses.

    is_selected()

        $boolean = $parser->is_selected($paragraph);

    This method is used to determine if the block of text given in $paragraph falls within the currently selected set of POD sections and subsections to be printed or processed. This method is also responsible for keeping track of the current input section and subsections. It is assumed that $paragraph is the most recently read (but not yet processed) input paragraph.

    The value returned will be true if the $paragraph and the rest of the text in the same section as $paragraph should be selected (included) for processing; otherwise a false value is returned.

    EXPORTED FUNCTIONS

    The following functions are exported by this module. Please note that these are functions (not methods) and therefore do not take an implicit first argument.

    podselect()

        podselect(\%options,@filelist);

    podselect will print the raw (untranslated) POD paragraphs of all POD sections in the given input files specified by @filelist according to the given options.

    If any argument to podselect is a reference to a hash (associative array) then the values with the following keys are processed as follows:

    • -output

      A string corresponding to the desired output file (or ">&STDOUT" or ">&STDERR"). The default is to use standard output.

    • -sections

      A reference to an array of sections specifications (as described in SECTION SPECIFICATIONS) which indicate the desired set of POD sections and subsections to be selected from input. If no section specifications are given, then all sections of the PODs are used.

    All other arguments should correspond to the names of input files containing POD sections. A file name of "-" or "<&STDIN" will be interpreted to mean standard input (which is the default if no filenames are given).

    PRIVATE METHODS AND DATA

    Pod::Select makes use of a number of internal methods and data fields which clients should not need to see or use. For the sake of avoiding name collisions with client data and methods, these methods and fields are briefly discussed here. Determined hackers may obtain further information about them by reading the Pod::Select source code.

    Private data fields are stored in the hash-object whose reference is returned by the new() constructor for this class. The names of all private methods and data-fields used by Pod::Select begin with a prefix of "_" and match the regular expression /^_\w+$/ .

    SEE ALSO

    Pod::Parser

    AUTHOR

    Please report bugs using http://rt.cpan.org.

    Brad Appleton <bradapp@enteract.com>

    Based on code for pod2text written by Tom Christiansen <tchrist@mox.perl.com>

    Pod::Select is part of the Pod::Parser distribution.

     

    Pod::Simple

    SYNOPSIS

    1. TODO

    DESCRIPTION

    Pod::Simple is a Perl library for parsing text in the Pod ("plain old documentation") markup language that is typically used for writing documentation for Perl and for Perl modules. The Pod format is explained in perlpod; the most common formatter is called perldoc .

    Be sure to read ENCODING if your Pod contains non-ASCII characters.

    Pod formatters can use Pod::Simple to parse Pod documents and render them into plain text, HTML, or any number of other formats. Typically, such formatters will be subclasses of Pod::Simple, and so they will inherit its methods, like parse_file .

    If you're reading this document just because you have a Pod-processing subclass that you want to use, this document (plus the documentation for the subclass) is probably all you need to read.

    If you're reading this document because you want to write a formatter subclass, continue reading it and then read Pod::Simple::Subclassing, and then possibly even read perlpodspec (some of which is for parser-writers, but much of which is notes to formatter-writers).

    MAIN METHODS

    • $parser = SomeClass->new();

      This returns a new parser object, where SomeClass is a subclass of Pod::Simple.

    • $parser->output_fh( *OUT );

      This sets the filehandle that $parser 's output will be written to. You can pass *STDOUT , otherwise you should probably do something like this:

          my $outfile = "output.txt";
          # Three-argument open avoids surprises with unusual file names.
          open TXTOUT, '>', $outfile or die "Can't write to $outfile: $!";
          $parser->output_fh(*TXTOUT);

      ...before you call one of the $parser->parse_whatever methods.

    • $parser->output_string( \$somestring );

      This sets the string that $parser 's output will be sent to, instead of any filehandle.

    • $parser->parse_file( $some_filename );
    • $parser->parse_file( *INPUT_FH );

      This reads the Pod content of the file (or filehandle) that you specify, and processes it with that $parser object, according to however $parser 's class works, and according to whatever parser options you have set up for this $parser object.

    • $parser->parse_string_document( $all_content );

      This works just like parse_file except that it reads the Pod content not from a file, but from a string that you have already in memory.

    • $parser->parse_lines( ...@lines..., undef );

      This processes the lines in @lines (where each list item must be a defined value, and must contain exactly one line of content -- so no items like "foo\nbar" are allowed). The final undef is used to indicate the end of document being parsed.

      The other parse_whatever methods are meant to be called only once per $parser object; but parse_lines can be called as many times per $parser object as you want, as long as the last call (and only the last call) ends with an undef value.

    • $parser->content_seen

      This returns true only if there has been any real content seen for this document. Returns false in cases where the document contains content, but does not make use of any Pod markup.

    • SomeClass->filter( $filename );
    • SomeClass->filter( *INPUT_FH );
    • SomeClass->filter( \$document_content );

      This is a shortcut method for creating a new parser object, setting the output handle to STDOUT, and then processing the specified file (or filehandle, or in-memory document). This is handy for one-liners like this:

          perl -MPod::Simple::Text -e "Pod::Simple::Text->filter('thingy.pod')"
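    The main methods above can be combined as in the following minimal sketch. It assumes Pod::Simple::Text (the subclass used in the one-liner) as the formatter; the sample document and the variable names are made up for illustration. The same content is parsed twice, once from a single string and once line by line, and the output is captured in a string rather than written to a filehandle.

```perl
use strict;
use warnings;
use Pod::Simple::Text;

# Illustrative input: one item per line, no embedded newlines.
my @lines = ('=head1 NAME', '', 'Example - a tiny document');

# Parse the whole document from one in-memory string.
my $from_string   = '';
my $string_parser = Pod::Simple::Text->new;
$string_parser->output_string(\$from_string);
$string_parser->parse_string_document(join "\n", @lines, '');

# Feed the same content in two batches; only the last call ends with undef.
my $from_lines  = '';
my $line_parser = Pod::Simple::Text->new;
$line_parser->output_string(\$from_lines);
$line_parser->parse_lines(@lines[0 .. 1]);
$line_parser->parse_lines($lines[2], undef);

print $from_string;
print "Pod content was seen\n" if $string_parser->content_seen;
```

    Each parser object is used for exactly one document here, which matches the note above that the parse_whatever methods are meant to be called once per object.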

    SECONDARY METHODS

    Some of these methods might be of interest to general users, as well as of interest to formatter-writers.

    Note that the general pattern here is that the accessor-methods read the attribute's value with $value = $parser->attribute and set the attribute's value with $parser->attribute(newvalue). For each accessor, I typically only mention one syntax or another, based on which I think you are actually most likely to use.

    • $parser->parse_characters( SOMEVALUE )

      The Pod parser normally expects to read octets and to convert those octets to characters based on the =encoding declaration in the Pod source. Set this option to a true value to indicate that the Pod source is already a Perl character stream. This tells the parser to ignore any =encoding command and to skip all the code paths involving decoding octets.

    • $parser->no_whining( SOMEVALUE )

      If you set this attribute to a true value, you will suppress the parser's complaints about irregularities in the Pod coding. By default, this attribute's value is false, meaning that irregularities will be reported.

      Note that turning this attribute to true won't suppress one or two kinds of complaints about rarely occurring unrecoverable errors.

    • $parser->no_errata_section( SOMEVALUE )

      If you set this attribute to a true value, you will stop the parser from generating a "POD ERRORS" section at the end of the document. By default, this attribute's value is false, meaning that an errata section will be generated, as necessary.

    • $parser->complain_stderr( SOMEVALUE )

      If you set this attribute to a true value, it will send reports of parsing errors to STDERR. By default, this attribute's value is false, meaning that no output is sent to STDERR.

      Setting complain_stderr also sets no_errata_section .

    • $parser->source_filename

      This returns the filename that this parser object was set to read from.

    • $parser->doc_has_started

      This returns true if $parser has read from a source, and has seen Pod content in it.

    • $parser->source_dead

      This returns true if $parser has read from a source, and come to the end of that source.

    • $parser->strip_verbatim_indent( SOMEVALUE )

      The perlpod spec for a Verbatim paragraph is "It should be reproduced exactly...", which means that the whitespace you've used to indent your verbatim blocks will be preserved in the output. This can be annoying for outputs such as HTML, where that whitespace will remain in front of every line. It's an unfortunate case where syntax is turned into semantics.

      If the POD you're parsing adheres to a consistent indentation policy, you can have such indentation stripped from the beginning of every line of your verbatim blocks. This method tells Pod::Simple what to strip. For two-space indents, you'd use:

          $parser->strip_verbatim_indent('  ');

      For tab indents, you'd use a tab character:

          $parser->strip_verbatim_indent("\t");

      If the POD is inconsistent about the indentation of verbatim blocks, but you have figured out a heuristic to determine how much a particular verbatim block is indented, you can pass a code reference instead. The code reference will be executed with one argument, an array reference of all the lines in the verbatim block, and should return the value to be stripped from each line. For example, if you decide that you're fine to use the first line of the verbatim block to set the standard for indentation of the rest of the block, you can look at the first line and return the appropriate value, like so:

          $parser->strip_verbatim_indent(sub {
              my $lines = shift;
              # Keep only the leading whitespace of the first line.
              (my $indent = $lines->[0]) =~ s/\S.*//;
              return $indent;
          });

      If you'd rather treat each line individually, you can do that, too, by just transforming them in-place in the code reference and returning undef. Say that you don't want any lines indented. You can do something like this:

          $parser->strip_verbatim_indent(sub {
              my $lines = shift;
              # Strip all leading whitespace from every line, in place.
              s/^\s+// for @{ $lines };
              return undef;
          });
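    To see what the first-line heuristic returns without running a parser, the callback can be exercised on its own. The following is a standalone simulation, not Pod::Simple's internal code: the sample lines are invented, and the final stripping step only approximates what the parser would do with the returned value.

```perl
use strict;
use warnings;

# The first-line heuristic as a standalone callback.
my $first_line_indent = sub {
    my $lines = shift;
    (my $indent = $lines->[0]) =~ s/\S.*//;   # keep only leading whitespace
    return $indent;
};

# Invented verbatim block: four-space base indent, one line nested deeper.
my @block  = ('    my $x = 1;', '      print $x;', '    exit;');
my $indent = $first_line_indent->(\@block);

# Simulated stripping step (assumption: the parser removes the returned
# string from the start of each line).
my @stripped = map { (my $copy = $_) =~ s/^\Q$indent\E//; $copy } @block;

print "$_\n" for @stripped;
```

    The same code reference could then be handed to $parser->strip_verbatim_indent($first_line_indent). Note that relative indentation inside the block survives, because only the first line's indent is removed from each line.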

    TERTIARY METHODS

    • $parser->abandon_output_fh()

      Cancel output to the file handle. Any POD read by the $parser is not affected.

    • $parser->abandon_output_string()

      Cancel output to the output string. Any POD read by the $parser is not affected.

    • $parser->accept_code( @codes )

      Alias for accept_codes.

    • $parser->accept_codes( @codes )

      Allows $parser to accept a list of Formatting Codes in perlpod. This can be used to implement user-defined codes.

    • $parser->accept_directive_as_data( @directives )

      Allows $parser to accept a list of directives for data paragraphs. A directive is the label of a Command Paragraph in perlpod. A data paragraph is one delimited by =begin/=for/=end directives. This can be used to implement user-defined directives.

    • $parser->accept_directive_as_processed( @directives )

      Allows $parser to accept a list of directives for processed paragraphs. A directive is the label of a Command Paragraph in perlpod. A processed paragraph is also known as Ordinary Paragraph in perlpod. This can be used to implement user-defined directives.

    • $parser->accept_directive_as_verbatim( @directives )

      Allows $parser to accept a list of directives for Verbatim Paragraph in perlpod. A directive is the label of a Command Paragraph in perlpod. This can be used to implement user-defined directives.

    • $parser->accept_target( @targets )

      Alias for accept_targets.

    • $parser->accept_target_as_text( @targets )

      Alias for accept_targets_as_text.

    • $parser->accept_targets( @targets )

      Accepts targets for =begin/=for/=end sections of the POD.

    • $parser->accept_targets_as_text( @targets )

      Accepts targets for =begin/=for/=end sections that should be parsed as POD. For details, see About Data Paragraphs in perlpodspec.

    • $parser->any_errata_seen()

      Used to check if any errata was seen.

      Example:

          die "too many errors\n" if $parser->any_errata_seen();

    • $parser->detected_encoding()

      Return the encoding corresponding to =encoding , but only if the encoding was recognized and handled.

    • $parser->encoding()

      Return encoding of the document, even if the encoding is not correctly handled.

    • $parser->parse_from_file( $source, $to )

      Parses from $source file to $to file. Similar to parse_from_file in Pod::Parser.

    • $parser->scream( @error_messages )

      Log an error that can't be ignored.

    • $parser->unaccept_code( @codes )

      Alias for unaccept_codes.

    • $parser->unaccept_codes( @codes )

      Removes @codes as valid codes for the parse.

    • $parser->unaccept_directive( @directives )

      Alias for unaccept_directives.

    • $parser->unaccept_directives( @directives )

      Removes @directives as valid directives for the parse.

    • $parser->unaccept_target( @targets )

      Alias for unaccept_targets.

    • $parser->unaccept_targets( @targets )

      Removes @targets as valid targets for the parse.

    • $parser->version_report()

      Returns a string describing the version.

    • $parser->whine( @error_messages )

      Log an error unless $parser->no_whining( TRUE ); .

    ENCODING

    The Pod::Simple parser expects to read octets. The parser will decode the octets into Perl's internal character string representation using the value of the =encoding declaration in the POD source.

    If the POD source does not include an =encoding declaration, the parser will attempt to guess the encoding (selecting one of UTF-8 or Latin-1) by examining the first non-ASCII bytes and applying the heuristic described in perlpodspec.

    If you set the parse_characters option to a true value the parser will expect characters rather than octets; will ignore any =encoding ; and will make no attempt to decode the input.

    CAVEATS

    This is just a beta release -- there are a good number of things still left to do. Notably, support for EBCDIC platforms is still half-done, and untested.

    SEE ALSO

    Pod::Simple::Subclassing

    perlpod

    perlpodspec

    Pod::Escapes

    perldoc

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org

    Documentation has been contributed by:

    • Gabor Szabo szabgab@gmail.com
    • Shawn H Corey SHCOREY at cpan.org
     

    Pod::Text

    NAME

    Pod::Text - Convert POD data to formatted ASCII text

    SYNOPSIS

        use Pod::Text;
        my $parser = Pod::Text->new (sentence => 0, width => 78);

        # Read POD from STDIN and write to STDOUT.
        $parser->parse_from_filehandle;

        # Read POD from file.pod and write to file.txt.
        $parser->parse_from_file ('file.pod', 'file.txt');

    DESCRIPTION

    Pod::Text is a module that can convert documentation in the POD format (the preferred language for documenting Perl) into formatted ASCII. It uses no special formatting controls or codes whatsoever, and its output is therefore suitable for nearly any device.

    As a derived class from Pod::Simple, Pod::Text supports the same methods and interfaces. See Pod::Simple for all the details; briefly, one creates a new parser with Pod::Text->new() and then normally calls parse_file().

    new() can take options, in the form of key/value pairs, that control the behavior of the parser. The currently recognized options are:

    • alt

      If set to a true value, selects an alternate output format that, among other things, uses a different heading style and marks =item entries with a colon in the left margin. Defaults to false.

    • code

      If set to a true value, the non-POD parts of the input file will be included in the output. Useful for viewing code documented with POD blocks with the POD rendered and the code left intact.

    • errors

      How to report errors. die says to throw an exception on any POD formatting error. stderr says to report errors on standard error, but not to throw an exception. pod says to include a POD ERRORS section in the resulting documentation summarizing the errors. none ignores POD errors entirely, as much as possible.

      The default is pod .

    • indent

      The number of spaces to indent regular text, and the default indentation for =over blocks. Defaults to 4.

    • loose

      If set to a true value, a blank line is printed after a =head1 heading. If set to false (the default), no blank line is printed after =head1 , although one is still printed after =head2 . This is the default because it's the expected formatting for manual pages; if you're formatting arbitrary text documents, setting this to true may result in more pleasing output.

    • margin

      The width of the left margin in spaces. Defaults to 0. This is the margin for all text, including headings, not the amount by which regular text is indented; for the latter, see the indent option. To set the right margin, see the width option.

    • nourls

      Normally, L<> formatting codes with both a URL and anchor text are formatted to show both the anchor text and the URL. In other words:

          L<foo|http://example.com/>

      is formatted as:

          foo <http://example.com/>

      This option, if set to a true value, suppresses the URL when anchor text is given, so this example would be formatted as just foo . This can produce less cluttered output in cases where the URLs are not particularly important.

    • quotes

      Sets the quote marks used to surround C<> text. If the value is a single character, it is used as both the left and right quote; if it is two characters, the first character is used as the left quote and the second as the right quote; and if it is four characters, the first two are used as the left quote and the second two as the right quote.

      This may also be set to the special value none , in which case no quote marks are added around C<> text.

    • sentence

      If set to a true value, Pod::Text will assume that each sentence ends in two spaces, and will try to preserve that spacing. If set to false, all consecutive whitespace in non-verbatim paragraphs is compressed into a single space. Defaults to true.

    • stderr

      Send error messages about invalid POD to standard error instead of appending a POD ERRORS section to the generated output. This is equivalent to setting errors to stderr if errors is not already set. It is supported for backward compatibility.

    • utf8

      By default, Pod::Text uses the same output encoding as the input encoding of the POD source (provided that Perl was built with PerlIO; otherwise, it doesn't encode its output). If this option is given, the output encoding is forced to UTF-8.

      Be aware that, when using this option, the input encoding of your POD source must be properly declared unless it is US-ASCII or Latin-1. POD input without an =encoding command will be assumed to be in Latin-1, and if it's actually in UTF-8, the output will be double-encoded. See perlpod(1) for more information on the =encoding command.

    • width

      The column at which to wrap text on the right-hand side. Defaults to 76.

    The standard Pod::Simple method parse_file() takes one argument, the file or file handle to read from, and writes output to standard output unless that has been changed with the output_fh() method. See Pod::Simple for the specific details and for other alternative interfaces.
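    As a minimal sketch of how the constructor options and the inherited Pod::Simple interface fit together, the following formats a small in-memory document and captures the result in a string. The sample POD text and the particular option values are chosen only for illustration.

```perl
use strict;
use warnings;
use Pod::Text;

# Illustrative input containing a link with both anchor text and a URL.
my $pod = "=head1 NAME\n\nexample - see L<the site|http://example.com/>\n";

# indent, width and nourls are options described in the list above.
my $parser = Pod::Text->new(indent => 2, width => 60, nourls => 1);

# Capture the formatted text instead of writing to standard output.
my $text = '';
$parser->output_string(\$text);
$parser->parse_string_document($pod);

print $text;
```

    With nourls set, only the anchor text of the link should appear in the output; dropping that option would show the URL as well.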

    DIAGNOSTICS

    • Bizarre space in item
    • Item called without tag

      (W) Something has gone wrong in internal =item processing. These messages indicate a bug in Pod::Text; you should never see them.

    • Can't open %s for reading: %s

      (F) Pod::Text was invoked via the compatibility mode pod2text() interface and the input file it was given could not be opened.

    • Invalid errors setting "%s"

      (F) The errors parameter to the constructor was set to an unknown value.

    • Invalid quote specification "%s"

      (F) The quote specification given (the quotes option to the constructor) was invalid. A quote specification must be one, two, or four characters long.

    • POD document had syntax errors

      (F) The POD document being formatted had syntax errors and the errors option was set to die.

    BUGS

    Encoding handling assumes that PerlIO is available and does not work properly if it isn't. The utf8 option is therefore not supported unless Perl is built with PerlIO support.

    CAVEATS

    If Pod::Text is given the utf8 option, the encoding of its output file handle will be forced to UTF-8 if possible, overriding any existing encoding. This will be done even if the file handle is not created by Pod::Text and was passed in from outside. This maintains consistency regardless of PERL_UNICODE and other settings.

    If the utf8 option is not given, the encoding of its output file handle will be forced to the detected encoding of the input POD, which preserves whatever the input text is. This ensures backward compatibility with earlier, pre-Unicode versions of this module, without large numbers of Perl warnings.

    This is not ideal, but it seems to be the best compromise. If it doesn't work for you, please let me know the details of how it broke.

    NOTES

    This is a replacement for an earlier Pod::Text module written by Tom Christiansen. It has a revamped interface, since it now uses Pod::Simple, but an interface roughly compatible with the old Pod::Text::pod2text() function is still available. Please change to the new calling convention, though.

    The original Pod::Text contained code to do formatting via termcap sequences, although it wasn't turned on by default and it was problematic to get it to work at all. This rewrite doesn't even try to do that, but a subclass of it does. Look for Pod::Text::Termcap.

    SEE ALSO

    Pod::Simple, Pod::Text::Termcap, perlpod(1), pod2text(1)

    The current version of this module is always available from its web site at http://www.eyrie.org/~eagle/software/podlators/. It is also part of the Perl core distribution as of 5.6.0.

    AUTHOR

    Russ Allbery <rra@stanford.edu>, based very heavily on the original Pod::Text by Tom Christiansen <tchrist@mox.perl.com> and its conversion to Pod::Parser by Brad Appleton <bradapp@enteract.com>. Sean Burke's initial conversion of Pod::Man to use Pod::Simple provided much-needed guidance on how to use Pod::Simple.

    COPYRIGHT AND LICENSE

    Copyright 1999, 2000, 2001, 2002, 2004, 2006, 2008, 2009, 2012, 2013 Russ Allbery <rra@stanford.edu>.

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

     

    Pod::Usage

    NAME

    Pod::Usage, pod2usage() - print a usage message from embedded pod documentation

    SYNOPSIS

        use Pod::Usage;

        my $message_text  = "This text precedes the usage message.";
        my $exit_status   = 2;          ## The exit status to use
        my $verbose_level = 0;          ## The verbose level to use
        my $filehandle    = \*STDERR;   ## The filehandle to write to

        pod2usage($message_text);

        pod2usage($exit_status);

        pod2usage( { -message => $message_text ,
                     -exitval => $exit_status  ,
                     -verbose => $verbose_level,
                     -output  => $filehandle } );

        pod2usage(   -msg     => $message_text ,
                     -exitval => $exit_status  ,
                     -verbose => $verbose_level,
                     -output  => $filehandle );

        pod2usage(   -verbose   => 2,
                     -noperldoc => 1 );

    ARGUMENTS

    pod2usage should be given either a single argument, or a list of arguments corresponding to an associative array (a "hash"). When a single argument is given, it should correspond to exactly one of the following:

    • A string containing the text of a message to print before printing the usage message

    • A numeric value corresponding to the desired exit status

    • A reference to a hash

    If more than one argument is given then the entire argument list is assumed to be a hash. If a hash is supplied (either as a reference or as a list) it should contain one or more elements with the following keys:

    • -message
    • -msg

      The text of a message to print immediately prior to printing the program's usage message.

    • -exitval

      The desired exit status to pass to the exit() function. This should be an integer, or else the string "NOEXIT" to indicate that control should simply be returned without terminating the invoking process.

    • -verbose

      The desired level of "verboseness" to use when printing the usage message. If the corresponding value is 0, then only the "SYNOPSIS" section of the pod documentation is printed. If the corresponding value is 1, then the "SYNOPSIS" section, along with any section entitled "OPTIONS", "ARGUMENTS", or "OPTIONS AND ARGUMENTS" is printed. If the corresponding value is 2 or more then the entire manpage is printed.

      The special verbosity level 99 requires that the -sections parameter also be specified; the listed sections are then extracted (see Pod::Select) and printed.

    • -sections

      A string representing a selection list for sections to be printed when -verbose is set to 99, e.g. "NAME|SYNOPSIS|DESCRIPTION|VERSION" .

      Alternatively, an array reference of section specifications can be used:

          pod2usage(-verbose => 99,
                    -sections => [ qw(fred fred/subsection) ] );

    • -output

      A reference to a filehandle, or the pathname of a file to which the usage message should be written. The default is \*STDERR unless the exit value is less than 2 (in which case the default is \*STDOUT ).

    • -input

      A reference to a filehandle, or the pathname of a file from which the invoking script's pod documentation should be read. It defaults to the file indicated by $0 ($PROGRAM_NAME for users of English.pm).

      If you are calling pod2usage() from a module and want to display that module's POD, you can use this:

          use Pod::Find qw(pod_where);
          pod2usage( -input => pod_where({-inc => 1}, __PACKAGE__) );

    • -pathlist

      A list of directory paths. If the input file does not exist, then it will be searched for in the given directory list (in the order the directories appear in the list). It defaults to the list of directories implied by $ENV{PATH} . The list may be specified either by a reference to an array, or by a string of directory paths which use the same path separator as $ENV{PATH} on your system (e.g., : for Unix, ; for MSWin32 and DOS).

    • -noperldoc

      By default, Pod::Usage will call perldoc when -verbose >= 2 is specified. This does not work well e.g. if the script was packed with PAR. The -noperldoc option suppresses the external call to perldoc and uses the simple text formatter (Pod::Text) to output the POD.

    Formatting base class

    The default text formatter depends on the Perl version (Pod::Text or Pod::PlainText for Perl versions < 5.005_58). The base class for Pod::Usage can be defined by pre-setting $Pod::Usage::Formatter before loading Pod::Usage, e.g.:

        BEGIN { $Pod::Usage::Formatter = 'Pod::Text::Termcap'; }
        use Pod::Usage qw(pod2usage);

    Pass-through options

    The following options are passed through to the underlying text formatter. See the manual pages of these modules for more information.

        alt code indent loose margin quotes sentence stderr utf8 width

    DESCRIPTION

    pod2usage will print a usage message for the invoking script (using its embedded pod documentation) and then exit the script with the desired exit status. The usage message printed may have any one of three levels of "verboseness": If the verbose level is 0, then only a synopsis is printed. If the verbose level is 1, then the synopsis is printed along with a description (if present) of the command line options and arguments. If the verbose level is 2, then the entire manual page is printed.

    Unless they are explicitly specified, the default values for the exit status, verbose level, and output stream to use are determined as follows:

    • If neither the exit status nor the verbose level is specified, then the default is to use an exit status of 2 with a verbose level of 0.

    • If an exit status is specified but the verbose level is not, then the verbose level will default to 1 if the exit status is less than 2 and will default to 0 otherwise.

    • If an exit status is not specified but verbose level is given, then the exit status will default to 2 if the verbose level is 0 and will default to 1 otherwise.

    • If the exit status used is less than 2, then output is printed on STDOUT. Otherwise output is printed on STDERR.
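
The four rules above can be sketched as a small function. This is only a model of the documented behaviour, not the code Pod::Usage itself runs; the name usage_defaults and its exitval/verbose keys are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the documented defaulting rules to whatever
# the caller did or did not specify, and report the resulting stream.
sub usage_defaults {
    my (%opt) = @_;
    my ($exit, $verbose) = @opt{qw(exitval verbose)};

    if (!defined $exit && !defined $verbose) {
        ($exit, $verbose) = (2, 0);          # neither given
    }
    elsif (!defined $verbose) {
        $verbose = $exit < 2 ? 1 : 0;        # only the exit status given
    }
    elsif (!defined $exit) {
        $exit = $verbose == 0 ? 2 : 1;       # only the verbose level given
    }
    my $stream = $exit < 2 ? 'STDOUT' : 'STDERR';
    return ($exit, $verbose, $stream);
}

for my $case ([], [exitval => 1], [verbose => 2]) {
    my ($exit, $verbose, $stream) = usage_defaults(@$case);
    print "exit=$exit verbose=$verbose stream=$stream\n";
}
```

Note that the output stream depends only on the final exit status, so a caller who asks for the full manual page (verbose level 2) without giving an exit status gets it on STDOUT.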

    Although the above may seem a bit confusing at first, it generally does "the right thing" in most situations. This determination of the default values to use is based upon the following typical Unix conventions:

    • An exit status of 0 implies "success". For example, diff(1) exits with a status of 0 if the two files have the same contents.

    • An exit status of 1 implies possibly abnormal, but non-defective, program termination. For example, grep(1) exits with a status of 1 if it did not find a matching line for the given regular expression.

    • An exit status of 2 or more implies a fatal error. For example, ls(1) exits with a status of 2 if you specify an illegal (unknown) option on the command line.

    • Usage messages issued as a result of bad command-line syntax should go to STDERR. However, usage messages issued due to an explicit request to print usage (like specifying -help on the command line) should go to STDOUT, just in case the user wants to pipe the output to a pager (such as more(1)).

    • If program usage has been explicitly requested by the user, it is often desirable to exit with a status of 1 (as opposed to 0) after issuing the user-requested usage message. It is also desirable to give a more verbose description of program usage in this case.

    pod2usage doesn't force the above conventions upon you, but it will use them by default if you don't expressly tell it to do otherwise. The ability of pod2usage() to accept a single number or a string makes it convenient to use as an innocent looking error message handling function:

    1. use Pod::Usage;
    2. use Getopt::Long;
    3. ## Parse options
    4. GetOptions("help", "man", "flag1") || pod2usage(2);
    5. pod2usage(1) if ($opt_help);
    6. pod2usage(-verbose => 2) if ($opt_man);
    7. ## Check for too many filenames
    8. pod2usage("$0: Too many files given.\n") if (@ARGV > 1);

    Some users, however, may feel that the above "economy of expression" is neither particularly readable nor consistent, and may instead choose to do something more like the following:

    1. use Pod::Usage;
    2. use Getopt::Long;
    3. ## Parse options
    4. GetOptions("help", "man", "flag1") || pod2usage(-verbose => 0);
    5. pod2usage(-verbose => 1) if ($opt_help);
    6. pod2usage(-verbose => 2) if ($opt_man);
    7. ## Check for too many filenames
    8. pod2usage(-verbose => 2, -message => "$0: Too many files given.\n")
    9. if (@ARGV > 1);

    As with all things in Perl, there's more than one way to do it, and pod2usage() adheres to this philosophy. If you are interested in seeing a number of different ways to invoke pod2usage (although by no means exhaustive), please refer to EXAMPLES.

    EXAMPLES

    Each of the following invocations of pod2usage() will print just the "SYNOPSIS" section to STDERR and will exit with a status of 2:

    1. pod2usage();
    2. pod2usage(2);
    3. pod2usage(-verbose => 0);
    4. pod2usage(-exitval => 2);
    5. pod2usage({-exitval => 2, -output => \*STDERR});
    6. pod2usage({-verbose => 0, -output => \*STDERR});
    7. pod2usage(-exitval => 2, -verbose => 0);
    8. pod2usage(-exitval => 2, -verbose => 0, -output => \*STDERR);

    Each of the following invocations of pod2usage() will print a message of "Syntax error." (followed by a newline) to STDERR, immediately followed by just the "SYNOPSIS" section (also printed to STDERR), and will exit with a status of 2:

    1. pod2usage("Syntax error.");
    2. pod2usage(-message => "Syntax error.", -verbose => 0);
    3. pod2usage(-msg => "Syntax error.", -exitval => 2);
    4. pod2usage({-msg => "Syntax error.", -exitval => 2, -output => \*STDERR});
    5. pod2usage({-msg => "Syntax error.", -verbose => 0, -output => \*STDERR});
    6. pod2usage(-msg => "Syntax error.", -exitval => 2, -verbose => 0);
    7. pod2usage(-message => "Syntax error.",
    8. -exitval => 2,
    9. -verbose => 0,
    10. -output => \*STDERR);

    Each of the following invocations of pod2usage() will print the "SYNOPSIS" section and any "OPTIONS" and/or "ARGUMENTS" sections to STDOUT and will exit with a status of 1:

    1. pod2usage(1);
    2. pod2usage(-verbose => 1);
    3. pod2usage(-exitval => 1);
    4. pod2usage({-exitval => 1, -output => \*STDOUT});
    5. pod2usage({-verbose => 1, -output => \*STDOUT});
    6. pod2usage(-exitval => 1, -verbose => 1);
    7. pod2usage(-exitval => 1, -verbose => 1, -output => \*STDOUT);

    Each of the following invocations of pod2usage() will print the entire manual page to STDOUT and will exit with a status of 1:

    1. pod2usage(-verbose => 2);
    2. pod2usage({-verbose => 2, -output => \*STDOUT});
    3. pod2usage(-exitval => 1, -verbose => 2);
    4. pod2usage({-exitval => 1, -verbose => 2, -output => \*STDOUT});

    Recommended Use

    Most scripts should print some type of usage message to STDERR when a command line syntax error is detected. They should also provide an option (usually -H or -help) to print a (possibly more verbose) usage message to STDOUT. Some scripts may even wish to go so far as to provide a means of printing their complete documentation to STDOUT (perhaps by allowing a -man option). The following complete example uses Pod::Usage in combination with Getopt::Long to do all of these things:

    1. use Getopt::Long;
    2. use Pod::Usage;
    3. my $man = 0;
    4. my $help = 0;
    5. ## Parse options and print usage if there is a syntax error,
    6. ## or if usage was explicitly requested.
    7. GetOptions('help|?' => \$help, man => \$man) or pod2usage(2);
    8. pod2usage(1) if $help;
    9. pod2usage(-verbose => 2) if $man;
    10. ## If no arguments were given, then allow STDIN to be used only
    11. ## if it's not connected to a terminal (otherwise print usage)
    12. pod2usage("$0: No files given.") if ((@ARGV == 0) && (-t STDIN));
    13. __END__
    14. =head1 NAME
    15. sample - Using Getopt::Long and Pod::Usage
    16. =head1 SYNOPSIS
    17. sample [options] [file ...]
    18. Options:
    19. -help brief help message
    20. -man full documentation
    21. =head1 OPTIONS
    22. =over 8
    23. =item B<-help>
    24. Prints a brief help message and exits.
    25. =item B<-man>
    26. Prints the manual page and exits.
    27. =back
    28. =head1 DESCRIPTION
    29. B<This program> will read the given input file(s) and do something
    30. useful with the contents thereof.
    31. =cut

    CAVEATS

    By default, pod2usage() will use $0 as the path to the pod input file. Unfortunately, not all systems on which Perl runs will set $0 properly (although if $0 isn't found, pod2usage() will search $ENV{PATH} or else the list specified by the -pathlist option). If this is the case for your system, you may need to explicitly specify the path to the pod docs for the invoking script using something similar to the following:

    1. pod2usage(-exitval => 2, -input => "/path/to/your/pod/docs");

    In the pathological case that a script is called via a relative path and the script itself changes the current working directory (see chdir) before calling pod2usage, Pod::Usage will fail even on robust platforms. Don't do that. Or use FindBin to locate the script:

    1. use FindBin;
    2. pod2usage(-input => $FindBin::Bin . "/" . $FindBin::Script);

    AUTHOR

    Please report bugs using http://rt.cpan.org.

    Marek Rouchal <marekr@cpan.org>

    Brad Appleton <bradapp@enteract.com>

    Based on code for Pod::Text::pod2text() written by Tom Christiansen <tchrist@mox.perl.com>

    ACKNOWLEDGMENTS

    Steven McDougall <swmcd@world.std.com> for his help and patience with re-writing this manpage.

    SEE ALSO

    Pod::Usage is now a standalone distribution.

    Pod::Parser, Pod::Perldoc, Getopt::Long, Pod::Find, FindBin, Pod::Text, Pod::PlainText, Pod::Text::Termcap

     
    Pod::Text::Color

    NAME

    Pod::Text::Color - Convert POD data to formatted color ASCII text

    SYNOPSIS

    1. use Pod::Text::Color;
    2. my $parser = Pod::Text::Color->new (sentence => 0, width => 78);
    3. # Read POD from STDIN and write to STDOUT.
    4. $parser->parse_from_filehandle;
    5. # Read POD from file.pod and write to file.txt.
    6. $parser->parse_from_file ('file.pod', 'file.txt');

    DESCRIPTION

    Pod::Text::Color is a simple subclass of Pod::Text that highlights output text using ANSI color escape sequences. Apart from the color, it in all ways functions like Pod::Text. See Pod::Text for details and available options.

    Term::ANSIColor is used to get colors and therefore must be installed to use this module.
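
For readers unfamiliar with Term::ANSIColor, the following minimal sketch shows the kind of escape sequence it produces. The choice of the bold attribute and the sample text are illustrative only and do not describe which attributes Pod::Text::Color applies to which elements.

```perl
use strict;
use warnings;
use Term::ANSIColor qw(colored);

# Wrap a heading in the ANSI "bold" attribute (attribute chosen for illustration).
my $heading = colored('NAME', 'bold');

# Show the escape characters in a printable form instead of sending them
# to the terminal.
(my $visible = $heading) =~ s/\e/\\e/g;
print "$visible\n";
```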

    BUGS

    This is just a basic proof of concept. It should be seriously expanded to support configurable coloration via options passed to the constructor, and pod2text should be taught about those.

    SEE ALSO

    Pod::Text, Pod::Simple

    The current version of this module is always available from its web site at http://www.eyrie.org/~eagle/software/podlators/. It is also part of the Perl core distribution as of 5.6.0.

    AUTHOR

    Russ Allbery <rra@stanford.edu>.

    COPYRIGHT AND LICENSE

    Copyright 1999, 2001, 2004, 2006, 2008, 2009 Russ Allbery <rra@stanford.edu>.

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

     
    Pod::Text::Overstrike

    NAME

    Pod::Text::Overstrike - Convert POD data to formatted overstrike text

    SYNOPSIS

    1. use Pod::Text::Overstrike;
    2. my $parser = Pod::Text::Overstrike->new (sentence => 0, width => 78);
    3. # Read POD from STDIN and write to STDOUT.
    4. $parser->parse_from_filehandle;
    5. # Read POD from file.pod and write to file.txt.
    6. $parser->parse_from_file ('file.pod', 'file.txt');

    DESCRIPTION

    Pod::Text::Overstrike is a simple subclass of Pod::Text that highlights output text using overstrike sequences, in a manner similar to nroff. Characters in bold text are overstruck (character, backspace, character) and characters in underlined text are converted to overstruck underscores (underscore, backspace, character). This format was originally designed for hard-copy terminals and/or line printers, yet is readable on soft-copy (CRT) terminals.
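
The two encodings described above can be sketched as follows. The function names are hypothetical and this is an illustration of the format, not the module's own implementation.

```perl
use strict;
use warnings;

# Bold: each character is printed, backspaced over, and printed again.
sub overstrike_bold {
    my ($text) = @_;
    return join '', map { $_ . "\b" . $_ } split //, $text;
}

# Underline: an underscore is printed, backspaced over, and then replaced
# by the character itself.
sub overstrike_underline {
    my ($text) = @_;
    return join '', map { "_" . "\b" . $_ } split //, $text;
}

# Show the backspaces as "^H" so the result is readable on any terminal.
for my $encoded (overstrike_bold('ab'), overstrike_underline('ab')) {
    (my $visible = $encoded) =~ s/\x08/^H/g;
    print "$visible\n";
}
```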

    Overstruck text is best viewed by page-at-a-time programs that take advantage of the terminal's stand-out and underline capabilities, such as the less program on Unix.

    Apart from the overstrike, it in all ways functions like Pod::Text. See Pod::Text for details and available options.

    BUGS

    Currently, the outermost formatting instruction wins, so for example underlined text inside a region of bold text is displayed as simply bold. There may be some better approach possible.

    SEE ALSO

    Pod::Text, Pod::Simple

    The current version of this module is always available from its web site at http://www.eyrie.org/~eagle/software/podlators/. It is also part of the Perl core distribution as of 5.6.0.

    AUTHOR

    Joe Smith <Joe.Smith@inwap.com>, using the framework created by Russ Allbery <rra@stanford.edu>.

    COPYRIGHT AND LICENSE

    Copyright 2000 by Joe Smith <Joe.Smith@inwap.com>. Copyright 2001, 2004, 2008 by Russ Allbery <rra@stanford.edu>.

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

     
    Pod::Text::Termcap

    NAME

    Pod::Text::Termcap - Convert POD data to ASCII text with format escapes

    SYNOPSIS

    1. use Pod::Text::Termcap;
    2. my $parser = Pod::Text::Termcap->new (sentence => 0, width => 78);
    3. # Read POD from STDIN and write to STDOUT.
    4. $parser->parse_from_filehandle;
    5. # Read POD from file.pod and write to file.txt.
    6. $parser->parse_from_file ('file.pod', 'file.txt');

    DESCRIPTION

    Pod::Text::Termcap is a simple subclass of Pod::Text that highlights output text using the correct termcap escape sequences for the current terminal. Apart from the format codes, it in all ways functions like Pod::Text. See Pod::Text for details and available options.

    NOTES

    This module uses Term::Cap to retrieve the formatting escape sequences for the current terminal, and falls back on the ECMA-48 escape sequences (the same in this regard as ANSI X3.64 and ISO 6429, and the escape codes also used by DEC VT100 terminals) if the bold, underline, and reset codes aren't set in the termcap information.
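
The fallback can be pictured with the sketch below. The capability names md, us and me are the conventional termcap names for bold, start-underline and reset, assumed here for illustration; the $caps hash stands in for whatever a termcap lookup returned, and format_codes is a hypothetical helper rather than part of the module.

```perl
use strict;
use warnings;

# ECMA-48 sequences for bold, underline, and reset.
my %ECMA48 = (md => "\e[1m", us => "\e[4m", me => "\e[m");

# Hypothetical helper: prefer the terminal's own codes and fall back on the
# ECMA-48 ones for any capability that is missing or empty.
sub format_codes {
    my ($caps) = @_;
    return map { ($_ => $caps->{$_} || $ECMA48{$_}) } qw(md us me);
}

# A made-up terminal entry that only defines its own bold sequence.
my %codes = format_codes({ md => "\e[7m" });
for my $cap (sort keys %codes) {
    (my $visible = $codes{$cap}) =~ s/\e/\\e/g;
    print "$cap => $visible\n";
}
```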

    SEE ALSO

    Pod::Text, Pod::Simple, Term::Cap

    The current version of this module is always available from its web site at http://www.eyrie.org/~eagle/software/podlators/. It is also part of the Perl core distribution as of 5.6.0.

    AUTHOR

    Russ Allbery <rra@stanford.edu>.

    COPYRIGHT AND LICENSE

    Copyright 1999, 2001, 2002, 2004, 2006, 2008, 2009 Russ Allbery <rra@stanford.edu>.

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

     
    Pod::Simple::Checker

    NAME

    Pod::Simple::Checker -- check the Pod syntax of a document

    SYNOPSIS

    1. perl -MPod::Simple::Checker -e \
    2. "exit Pod::Simple::Checker->filter(shift)->any_errata_seen" \
    3. thingy.pod

    DESCRIPTION

    This class is for checking the syntactic validity of Pod. It works by basically acting like a simple-minded version of Pod::Simple::Text that formats only the "Pod Errors" section (if Pod::Simple even generates one for the given document).

    This is a subclass of Pod::Simple and inherits all its methods.
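
The checker can also be driven from Perl code instead of the command line. The sketch below uses only methods inherited from Pod::Simple (output_string, parse_string_document, any_errata_seen); the helper name pod_has_errors and the sample documents are hypothetical.

```perl
use strict;
use warnings;
use Pod::Simple::Checker;

# Hypothetical helper: returns 1 when Pod::Simple reported problems, else 0.
sub pod_has_errors {
    my ($pod) = @_;
    my $checker = Pod::Simple::Checker->new;
    $checker->output_string(\my $report);    # capture the report instead of printing it
    $checker->parse_string_document($pod);
    return $checker->any_errata_seen ? 1 : 0;
}

my $good = "=head1 NAME\n\nsample - a well-formed document\n\n=cut\n";
my $bad  = "=head1 NAME\n\nsample - a stray directive follows\n\n=back\n\n=cut\n";

print "good document: ", pod_has_errors($good), "\n";
print "bad document:  ", pod_has_errors($bad),  "\n";
```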

    SEE ALSO

    Pod::Simple, Pod::Simple::Text, Pod::Checker

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::Debug

    NAME

    Pod::Simple::Debug -- put Pod::Simple into trace/debug mode

    SYNOPSIS

    1. use Pod::Simple::Debug (5); # or some integer

    Or:

    1. my $debug_level;
    2. use Pod::Simple::Debug (\$debug_level, 0);
    3. ...some stuff that uses Pod::Simple to do stuff, but which
    4. you don't want debug output from...
    5. $debug_level = 4;
    6. ...some stuff that uses Pod::Simple to do stuff, but which
    7. you DO want debug output from...
    8. $debug_level = 0;

    DESCRIPTION

    This is an internal module for controlling the debug level (a.k.a. trace level) of Pod::Simple. This is of interest only to Pod::Simple developers.

    CAVEATS

    Note that you should load this module before loading Pod::Simple (or any Pod::Simple-based class). If you try loading Pod::Simple::Debug after &Pod::Simple::DEBUG is already defined, Pod::Simple::Debug will throw a fatal error to the effect that "it's too late to call Pod::Simple::Debug".

    Note that the use Pod::Simple::Debug (\$x, somenum) mode will make Pod::Simple (et al) run rather slower, since &Pod::Simple::DEBUG won't be a constant sub anymore, and so Pod::Simple (et al) won't compile with constant-folding.
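
The general pattern behind both caveats can be sketched as follows. My::Tracer is a hypothetical package used only for illustration; it mimics a module that defines a constant DEBUG sub unless one already exists, which is why anything that wants to set the level has to run first.

```perl
use strict;
use warnings;

package My::Tracer;    # hypothetical package, for illustration only

# Define DEBUG as a constant sub unless something defined it earlier.
# Code compiled after this point sees whichever definition won.
BEGIN { *DEBUG = sub () { 0 } unless defined &DEBUG }

sub trace {
    my ($msg) = @_;
    # With a constant DEBUG, Perl can resolve this test at compile time;
    # a DEBUG sub that reads a variable would be called on every use.
    return DEBUG ? "trace: $msg" : '';
}

package main;

print "DEBUG is ", My::Tracer::DEBUG(), "\n";
print "trace returned '", My::Tracer::trace('hello'), "'\n";
```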

    GUTS

    Doing this:

    1. use Pod::Simple::Debug (5); # or some integer

    is basically equivalent to:

    1. BEGIN { sub Pod::Simple::DEBUG () {5} } # or some integer
    2. use Pod::Simple ();

    And this:

    1. use Pod::Simple::Debug (\$debug_level,0); # or some integer

    is basically equivalent to this:

    1. my $debug_level;
    2. BEGIN { $debug_level = 0 }
    3. BEGIN { sub Pod::Simple::DEBUG () { $debug_level } }
    4. use Pod::Simple ();

    SEE ALSO

    Pod::Simple

    The article "Constants in Perl", in The Perl Journal issue 21. See http://interglacial.com/tpj/21/

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::DumpAsText

    NAME

    Pod::Simple::DumpAsText -- dump Pod-parsing events as text

    SYNOPSIS

    1. perl -MPod::Simple::DumpAsText -e \
    2. "exit Pod::Simple::DumpAsText->filter(shift)->any_errata_seen" \
    3. thingy.pod

    DESCRIPTION

    This class is for dumping, as text, the events gotten from parsing a Pod document. This class is of interest to people writing Pod formatters based on Pod::Simple. It is useful for seeing exactly what events you get out of some Pod that you feed in.

    This is a subclass of Pod::Simple and inherits all its methods.
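
A minimal sketch of using the class from Perl, relying only on methods inherited from Pod::Simple; the sample Pod string is made up for the example.

```perl
use strict;
use warnings;
use Pod::Simple::DumpAsText;

my $dumper = Pod::Simple::DumpAsText->new;
$dumper->output_string(\my $events);    # collect the dump in a scalar
$dumper->parse_string_document("=head1 NAME\n\nsample - a tiny document\n");

# Element starts are dumped with a "++" prefix and element ends with "--".
print $events;
```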

    SEE ALSO

    Pod::Simple::DumpAsXML

    Pod::Simple

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::DumpAsXML

    NAME

    Pod::Simple::DumpAsXML -- turn Pod into XML

    SYNOPSIS

    1. perl -MPod::Simple::DumpAsXML -e \
    2. "exit Pod::Simple::DumpAsXML->filter(shift)->any_errata_seen" \
    3. thingy.pod

    DESCRIPTION

    Pod::Simple::DumpAsXML is a subclass of Pod::Simple that parses Pod and turns it into indented and wrapped XML. This class is of interest to people writing Pod formatters based on Pod::Simple.

    Pod::Simple::DumpAsXML inherits methods from Pod::Simple.

    SEE ALSO

    Pod::Simple::XMLOutStream is rather like this class. Pod::Simple::XMLOutStream's output is space-padded in a way that's better for sending to an XML processor (that is, it has no ignorable whitespace). But Pod::Simple::DumpAsXML's output is much more human-readable, being (more-or-less) one token per line, with line-wrapping.
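
To see the difference for yourself, the sketch below runs the same made-up Pod string through both classes. The helper name render_with is hypothetical; both classes inherit the methods it calls from Pod::Simple.

```perl
use strict;
use warnings;
use Pod::Simple::DumpAsXML;
use Pod::Simple::XMLOutStream;

# Hypothetical helper: parse $pod with the given Pod::Simple subclass and
# return whatever that class wrote.
sub render_with {
    my ($class, $pod) = @_;
    my $parser = $class->new;
    $parser->output_string(\my $out);
    $parser->parse_string_document($pod);
    return $out;
}

my $pod    = "=head1 NAME\n\nsample - a tiny document\n";
my $dump   = render_with('Pod::Simple::DumpAsXML',    $pod);
my $stream = render_with('Pod::Simple::XMLOutStream', $pod);

print "--- DumpAsXML ---\n$dump\n";
print "--- XMLOutStream ---\n$stream\n";
```

In the DumpAsXML output each element and text chunk sits on its own indented line, whereas in the XMLOutStream output the text follows its opening tag directly.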

    Pod::Simple::DumpAsText is rather like this class, except that it doesn't dump with XML syntax. Try them and see which one you like best!

    Pod::Simple, Pod::Simple::DumpAsXML

    The older libraries Pod::PXML, Pod::XML, Pod::SAX

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::HTML

    NAME

    Pod::Simple::HTML - convert Pod to HTML

    SYNOPSIS

    1. perl -MPod::Simple::HTML -e Pod::Simple::HTML::go thingy.pod

    DESCRIPTION

    This class is for making an HTML rendering of a Pod document.

    This is a subclass of Pod::Simple::PullParser and inherits all its methods (and options).

    Note that if you want to do a batch conversion of a lot of Pod documents to HTML, you should see the module Pod::Simple::HTMLBatch.

    CALLING FROM THE COMMAND LINE

    TODO

    1. perl -MPod::Simple::HTML -e Pod::Simple::HTML::go Thing.pod Thing.html

    CALLING FROM PERL

    Minimal code

    1. use Pod::Simple::HTML;
    2. my $p = Pod::Simple::HTML->new;
    3. $p->output_string(\my $html);
    4. $p->parse_file('path/to/Module/Name.pm');
    5. open my $out, '>', 'out.html' or die "Cannot open 'out.html': $!\n";
    6. print $out $html;

    More detailed example

    1. use Pod::Simple::HTML;

    Set the content type:

    1. $Pod::Simple::HTML::Content_decl = q{<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" >};
    2. my $p = Pod::Simple::HTML->new;

    Include a single javascript source:

    1. $p->html_javascript('http://abc.com/a.js');

    Or insert multiple javascript sources in the header (or, for that matter, include anything else, though this is not recommended):

    1. $p->html_javascript('
    2. <script type="text/javascript" src="http://abc.com/b.js"></script>
    3. <script type="text/javascript" src="http://abc.com/c.js"></script>');

    Include a single css source in the header:

    1. $p->html_css('/style.css');

    or insert multiple css sources:

    1. $p->html_css('
    2. <link rel="stylesheet" type="text/css" title="pod_stylesheet" href="http://remote.server.com/jquery.css">
    3. <link rel="stylesheet" type="text/css" title="pod_stylesheet" href="/style.css">');

    Tell the parser where the output should go. In this case it will be placed in the $html variable:

    1. my $html;
    2. $p->output_string(\$html);

    Parse and process a file with pod in it:

    1. $p->parse_file('path/to/Module/Name.pm');

    METHODS

    TODO all (most?) accessorized methods

    The following variables need to be set before the call to the ->new constructor.

    Set the string that is included before the opening <html> tag:

    1. $Pod::Simple::HTML::Doctype_decl = qq{<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    2. "http://www.w3.org/TR/html4/loose.dtd">\n};

    Set the content-type in the HTML head: (defaults to ISO-8859-1)

    1. $Pod::Simple::HTML::Content_decl = q{<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" >};

    Set the value that will be embedded in the opening tags generated for F and C codes and for verbatim text. F maps to <em>, C maps to <code>, and verbatim text maps to <pre>. (Computerese defaults to "".)

    1. $Pod::Simple::HTML::Computerese = ' class="some_class_name"';

    html_css

    html_javascript

    title_prefix

    title_postfix

    html_header_before_title

    This includes everything before the <title> opening tag including the Document type and including the opening <title> tag. The following call will set it to be a simple HTML file:

    1. $p->html_header_before_title('<html><head><title>');

    html_h_level

    Normally =head1 will become <h1>, =head2 will become <h2>, etc. The html_h_level method shifts these levels by setting the heading level used for =head1:

    1. $p->html_h_level(3);

    This makes =head1 become <h3>, =head2 become <h4>, and so on.
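
A small end-to-end sketch of the html_h_level setting, parsing a made-up Pod string and listing the heading tags that appear in the generated HTML:

```perl
use strict;
use warnings;
use Pod::Simple::HTML;

my $p = Pod::Simple::HTML->new;
$p->html_h_level(3);                 # =head1 starts at <h3>
$p->output_string(\my $html);
$p->parse_string_document(
    "=head1 NAME\n\nsample - a tiny document\n\n=head2 Details\n\nMore text.\n"
);

# Collect the heading tags (h1..h6) that were emitted, in document order.
my @headings = $html =~ /<(h\d)\b/g;
print "@headings\n";
```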

    index

    Set it to some true value if you want to have an index (in reality a table of contents) to be added at the top of the generated HTML.

    1. $p->index(1);

    html_header_after_title

    Includes the closing </title> tag and the rest of the head, up to and including the opening <body> tag:

    1. $p->html_header_after_title('</title>...</head><body id="my_id">');

    html_footer

    The very end of the document:

    1. $p->html_footer( qq[\n<!-- end doc -->\n\n</body></html>\n] );

    SUBCLASSING

    Any of the methods described above can be used, but further customization requires overriding some methods in a subclass:

    package My::Pod;
    use strict;
    use warnings;
    use feature 'say';    # say() is used below
    use base 'Pod::Simple::HTML';

    # do_pod_link needs to return a URL string such as:
    #   http://some.other.com/page.html
    #   #anchor_in_the_same_file
    #   /internal/ref.html
    sub do_pod_link {
        # My::Pod object and Pod::Simple::PullParserStartToken object
        my ($self, $link) = @_;

        say $link->tagname;          # will be L for links
        say $link->attr('to');       # the link target
        say $link->attr('type');     # will be 'pod' always
        say $link->attr('section');

        # Links local to our web site
        if ($link->tagname eq 'L' and $link->attr('type') eq 'pod') {
            my $to = $link->attr('to');
            if ($to =~ /^Padre::/) {
                $to =~ s{::}{/}g;
                return "/docs/Padre/$to.html";
            }
        }

        # all other links are generated by the parent class
        my $ret = $self->SUPER::do_pod_link($link);
        return $ret;
    }

    1;

    Meanwhile in script.pl:

    1. use My::Pod;
    2. my $p = My::Pod->new;
    3. my $html;
    4. $p->output_string(\$html);
    5. $p->parse_file('path/to/Module/Name.pm');
    6. open my $out, '>', 'out.html' or die;
    7. print $out $html;

    TODO

    maybe override do_beginning do_end

    SEE ALSO

    Pod::Simple, Pod::Simple::HTMLBatch

    TODO: a corpus of sample Pod input and HTML output? Or common idioms?

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002-2004 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    ACKNOWLEDGEMENTS

    Thanks to Hurricane Electric for permission to use its Linux man pages online site for man page links.

    Thanks to search.cpan.org for permission to use the site for Perl module links.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::HTMLBatch

    NAME

    Pod::Simple::HTMLBatch - convert several Pod files to several HTML files

    SYNOPSIS

      perl -MPod::Simple::HTMLBatch -e 'Pod::Simple::HTMLBatch::go' in out

    DESCRIPTION

    This module is used for running batch conversions of many Pod documents into HTML.

    This class is NOT a subclass of Pod::Simple::HTML (nor of bad old Pod::Html) -- although it uses Pod::Simple::HTML for doing the conversion of each document.

    The normal use of this class is like so:

      use Pod::Simple::HTMLBatch;
      my $batchconv = Pod::Simple::HTMLBatch->new;
      $batchconv->some_option( some_value );
      $batchconv->some_other_option( some_other_value );
      $batchconv->batch_convert( \@search_dirs, $output_dir );
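
    As a rough end-to-end sketch of the pattern above, the following script converts one small Pod file. The scratch directories, the file name Hello.pod, and its contents are assumptions made up for illustration; only new, verbose, contents_file, and batch_convert come from this documentation.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use Pod::Simple::HTMLBatch;

# Hypothetical input and output: two scratch directories, removed at exit.
my $in_dir  = tempdir( CLEANUP => 1 );
my $out_dir = tempdir( CLEANUP => 1 );

# Write one small Pod document for the batch converter to find.
my $pod_file = File::Spec->catfile( $in_dir, 'Hello.pod' );
open my $fh, '>', $pod_file or die "Cannot write $pod_file: $!";
print {$fh} "=head1 NAME\n\nHello - a demo document\n\n",
            "=head1 DESCRIPTION\n\nJust some text.\n";
close $fh or die "Cannot close $pod_file: $!";

my $batchconv = Pod::Simple::HTMLBatch->new;
$batchconv->verbose(0);                     # print no progress notes
$batchconv->contents_file('index.html');    # the documented default, set explicitly
$batchconv->batch_convert( [$in_dir], $out_dir );

my $contents_page = File::Spec->catfile( $out_dir, 'index.html' );
print "Contents page written\n" if -e $contents_page;
```

    Passing the input directory inside an arrayref avoids any surprise if a path happens to contain the platform's path separator.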

    FROM THE COMMAND LINE

    Note that this class also provides (but does not export) the function Pod::Simple::HTMLBatch::go. This is basically just a shortcut for Pod::Simple::HTMLBatch->batch_convert(@ARGV). It's meant to be handy for calling from the command line.

    However, the shortcut requires that you specify exactly two command-line arguments, indirs and outdir.

    Example:

      % mkdir out_html
      % perl -MPod::Simple::HTMLBatch -e Pod::Simple::HTMLBatch::go @INC out_html
          (to convert the pod from Perl's @INC into HTML
           files under the directory ./out_html)

    (Note that the command line there contains a literal atsign-I-N-C. This is handled as a special case by batch_convert, in order to save you having to enter the odd-looking "" as the first command-line parameter when you mean "just use whatever's in @INC".)

    Example:

      % mkdir ../seekrut
      % chmod og-rx ../seekrut
      % perl -MPod::Simple::HTMLBatch -e Pod::Simple::HTMLBatch::go . ../seekrut
          (to convert the pod under the current dir into HTML
           files under the directory ../seekrut)

    Example:

      % perl -MPod::Simple::HTMLBatch -e Pod::Simple::HTMLBatch::go happydocs .
          (to convert all pod from happydocs into the current directory)

    MAIN METHODS

    • $batchconv = Pod::Simple::HTMLBatch->new;

      This TODO

    • $batchconv->batch_convert( indirs, outdir );

      this TODO

    • $batchconv->batch_convert( undef , ...);
    • $batchconv->batch_convert( q{@INC}, ...);

      These two values for indirs specify that the normal Perl @INC directories are to be used as the input.

    • $batchconv->batch_convert( \@dirs , ...);

      This specifies that the input directories are the items in the arrayref \@dirs .

    • $batchconv->batch_convert( "somedir" , ...);

      This specifies that the directory "somedir" is the input. (This can be an absolute or relative path; it doesn't matter.)

      A common value you might want would be just "." for the current directory:

        $batchconv->batch_convert( "." , ...);

    • $batchconv->batch_convert( 'somedir:someother:also' , ...);

      This specifies that you want the dirs "somedir", "someother", and "also" scanned, just as if you'd passed the arrayref [qw( somedir someother also)]. Note that a ":"-separator is normal under Unix, but under MSWin, you'll need 'somedir;someother;also' instead, since the pathsep on MSWin is ";" instead of ":". (And that is because ":" often comes up in paths, like "c:/perl/lib".)

      (The exact separator character to use is taken from $Config::Config{'path_sep'}, via the Config module.)

    • $batchconv->batch_convert( ... , undef );

      This specifies that you want the HTML output to go into the current directory.

      (Note that a missing or undefined value means a different thing in the first slot than in the second. That's so that batch_convert() with no arguments (or undef arguments) means "go from @INC, into the current directory".)

    • $batchconv->batch_convert( ... , 'somedir' );

      This specifies that you want the HTML output to go into the directory 'somedir'. (This can be an absolute or relative path, it doesn't matter.)
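
    The separator-delimited form of indirs described above depends on the platform's path separator. The snippet below is only an illustration of that rule using the Config module; it is not the module's own code, and the directory names are the placeholder ones from the text.

```perl
use strict;
use warnings;
use Config;

# Build a directory list with this platform's separator
# (":" under Unix, ";" under MSWin), then split it back into its parts.
my $sep      = $Config{path_sep};
my $dir_spec = join $sep, qw(somedir someother also);
my @dirs     = split /\Q$sep\E/, $dir_spec;

print "$_\n" for @dirs;
```

    Building the string with $Config{path_sep} rather than a hard-coded ":" keeps a script portable; passing an arrayref sidesteps the question entirely.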

    Note that you can also call batch_convert as a class method, like so:

      Pod::Simple::HTMLBatch->batch_convert( ... );

    That is just short for this:

      Pod::Simple::HTMLBatch-> new-> batch_convert(...);

    That is, it runs a conversion with default options, for whatever inputdirs and output dir you specify.

    ACCESSOR METHODS

    The following are all accessor methods -- that is, they don't do anything on their own, but just alter the contents of the conversion object, which comprises the options for this particular batch conversion.

    We show the "put" form of the accessors below (i.e., the syntax you use for setting the accessor to a specific value). But you can also call each method with no parameters to get its current value. For example, $self->contents_file() returns the current value of the contents_file attribute.

    • $batchconv->verbose( nonnegative_integer );

      This controls how verbose to be during batch conversion, as far as notes to STDOUT (or whatever is select'd) about how the conversion is going. If 0, no progress information is printed. If 1 (the default value), some progress information is printed. Higher values print more information.

    • $batchconv->index( true-or-false );

      This controls whether or not each HTML page is liable to have a little table of contents at the top (which we call an "index" for historical reasons). This is true by default.

    • $batchconv->contents_file( filename );

      If set, should be the name of a file (in the output directory) to write the HTML index to. The default value is "index.html". If you set this to a false value, no contents file will be written.

    • $batchconv->contents_page_start( HTML_string );

      This specifies what string should be put at the beginning of the contents page. The default is a string more or less like this:

        <html>
        <head><title>Perl Documentation</title></head>
        <body class='contentspage'>
        <h1>Perl Documentation</h1>

    • $batchconv->contents_page_end( HTML_string );

      This specifies what string should be put at the end of the contents page. The default is a string more or less like this:

        <p class='contentsfooty'>Generated by
        Pod::Simple::HTMLBatch v3.01 under Perl v5.008
        <br >At Fri May 14 22:26:42 2004 GMT,
        which is Fri May 14 14:26:42 2004 local time.</p>

    • $batchconv->add_css( $url );

      TODO

    • $batchconv->add_javascript( $url );

      TODO

    • $batchconv->css_flurry( true-or-false );

      If true (the default value), we autogenerate some CSS files in the output directory, and set our HTML files to use those. TODO: continue

    • $batchconv->javascript_flurry( true-or-false );

      If true (the default value), we autogenerate a JavaScript file in the output directory, and set our HTML files to use it. Currently, the JavaScript is used only to get the browser to remember what stylesheet it prefers. TODO: continue

    • $batchconv->no_contents_links( true-or-false );

      TODO

    • $batchconv->html_render_class( classname );

      This sets what class is used for rendering the files. The default is "Pod::Simple::HTML". If you set it to something else, it should probably be a subclass of Pod::Simple::HTML, and you should require or use that class so that it's loaded before Pod::Simple::HTMLBatch tries loading it.

    • $batchconv->search_class( classname );

      This sets what class is used for searching for the files. The default is "Pod::Simple::Search". If you set it to something else, it should probably be a subclass of Pod::Simple::Search, and you should require or use that class so that it's loaded before Pod::Simple::HTMLBatch tries loading it.
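
    The "put" and "get" forms of the accessors can be tried without running a conversion. This is a small sketch; the value 'contents.html' is an arbitrary example name, not a default.

```perl
use strict;
use warnings;
use Pod::Simple::HTMLBatch;

my $batchconv = Pod::Simple::HTMLBatch->new;

# "Put" form: each call stores an option in the conversion object.
$batchconv->verbose(0);
$batchconv->index(0);
$batchconv->contents_file('contents.html');   # arbitrary example name

# "Get" form: calling with no parameters returns the current value.
my $verbosity = $batchconv->verbose;
my $contents  = $batchconv->contents_file;

printf "verbose=%d contents_file=%s\n", $verbosity, $contents;
```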

    NOTES ON CUSTOMIZATION

    TODO

      call add_css($someurl) to add stylesheet as alternate
      call add_css($someurl,1) to add as primary stylesheet
      call add_javascript

      subclass Pod::Simple::HTML and set $batchconv->html_render_class to
        that classname
      and maybe override
        $page->batch_mode_page_object_init($self, $module, $infile, $outfile, $depth)
      or maybe override
        $batchconv->batch_mode_page_object_init($page, $module, $infile, $outfile, $depth)
      subclass Pod::Simple::Search and set $batchconv->search_class to
        that classname
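
    A minimal sketch of the first few notes above might look like the following. The class name My::Batch::Renderer and the stylesheet URL are hypothetical, and the subclass overrides nothing yet; it only shows where the customization hooks in.

```perl
use strict;
use warnings;
use Pod::Simple::HTMLBatch;

{
    # Hypothetical renderer subclass. A real one would override methods
    # of Pod::Simple::HTML here.
    package My::Batch::Renderer;
    use base qw(Pod::Simple::HTML);
}

my $batchconv = Pod::Simple::HTMLBatch->new;

# Render each page with the subclass instead of Pod::Simple::HTML.
$batchconv->html_render_class('My::Batch::Renderer');

# Add a stylesheet as the primary one (second argument true), per the notes.
$batchconv->add_css( 'http://example.com/site.css', 1 );   # hypothetical URL

my $render_class = $batchconv->html_render_class;
print "Rendering with $render_class\n";
```

    In real use the subclass would normally live in its own file and be loaded with use or require before batch_convert is called, as the html_render_class entry advises.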

    ASK ME!

    If you want to do some kind of big pod-to-HTML conversion with some particular kind of option that you don't see how to achieve using this module, email me (sburke@cpan.org) and I'll probably have a good idea how to do it. For reasons of concision and energetic laziness, some methods and options in this module (and the dozen modules it depends on) are undocumented; but one of those undocumented bits might be just what you're looking for.

    SEE ALSO

    Pod::Simple, Pod::Simple::HTMLBatch, perlpod, perlpodspec

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     

    Pod::Simple::LinkSection

    NAME

    Pod::Simple::LinkSection -- represent "section" attributes of L codes

    SYNOPSIS

      # a long story

    DESCRIPTION

    This class is not of interest to general users.

    Pod::Simple uses this class for representing the value of the "section" attribute of "L" start-element events. Most applications can just use the normal stringification of objects of this class; they stringify to just the text content of the section, such as "foo" for L<Stuff/foo> , and "bar" for L<Stuff/bI<ar>> .

    However, anyone particularly interested in getting the full value of the treelet can just traverse the content of the treelet, @$treelet_object. To wit:

      % perl -MData::Dumper -e
         "use base qw(Pod::Simple::Methody);
          sub start_L { print Dumper($_[1]{'section'} ) }
          __PACKAGE__->new->parse_string_document('=head1 L<Foo/bI<ar>baz>>')
         "
      Output:
      $VAR1 = bless( [
                       '',
                       {},
                       'b',
                       bless( [
                                'I',
                                {},
                                'ar'
                              ], 'Pod::Simple::LinkSection' ),
                       'baz'
                     ], 'Pod::Simple::LinkSection' );

    But stringify it and you get just the text content:

      % perl -MData::Dumper -e
         "use base qw(Pod::Simple::Methody);
          sub start_L { print Dumper( '' . $_[1]{'section'} ) }
          __PACKAGE__->new->parse_string_document('=head1 L<Foo/bI<ar>baz>>')
         "
      Output:
      $VAR1 = 'barbaz';
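
    The same experiment can be written as a standalone script, which avoids shell quoting. This is only a sketch: the package name Demo::LinkDump and the @sections array are made up for illustration.

```perl
use strict;
use warnings;

my @sections;    # filled in by start_L below

{
    # Hypothetical demo class: records the "section" attribute of each L<> code.
    package Demo::LinkDump;
    use base qw(Pod::Simple::Methody);

    sub start_L {
        my ( $self, $attrs ) = @_;
        my $section = $attrs->{section};
        push @sections, {
            class => ref $section,    # the object's class
            text  => "$section",      # normal stringification
        };
    }
}

my $parser = Demo::LinkDump->new;
$parser->parse_string_document("=head1 L<Foo/bI<ar>baz>\n");

print "$_->{class}: $_->{text}\n" for @sections;
```

    Interpolating the object in a string is all that is needed to get the plain text; the nested structure only matters when the formatting codes inside the section name are of interest.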

    SEE ALSO

    Pod::Simple

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2004 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     

    Pod::Simple::Methody

    NAME

    Pod::Simple::Methody -- turn Pod::Simple events into method calls

    SYNOPSIS

      require 5;
      use strict;
      package SomePodFormatter;
      use base qw(Pod::Simple::Methody);

      sub handle_text {
        my($self, $text) = @_;
        ...
      }

      sub start_head1 {
        my($self, $attrs) = @_;
        ...
      }
      sub end_head1 {
        my($self) = @_;
        ...
      }

    ...and start_/end_ methods for whatever other events you want to catch.

    DESCRIPTION

    This class is of interest to people writing Pod formatters based on Pod::Simple.

    This class (which is very small -- read the source) overrides Pod::Simple's _handle_element_start, _handle_text, and _handle_element_end methods so that parser events are turned into method calls. (Otherwise, this is a subclass of Pod::Simple and inherits all its methods.)

    You can use this class as the base class for a Pod formatter/processor.

    METHOD CALLING

    When Pod::Simple sees a "=head1 Hi there", for example, it basically does this:

      $parser->_handle_element_start( "head1", \%attributes );
      $parser->_handle_text( "Hi there" );
      $parser->_handle_element_end( "head1" );

    But if you subclass Pod::Simple::Methody, it will instead do this when it sees a "=head1 Hi there":

      $parser->start_head1( \%attributes ) if $parser->can('start_head1');
      $parser->handle_text( "Hi there" )   if $parser->can('handle_text');
      $parser->end_head1()                 if $parser->can('end_head1');

    If Pod::Simple sends an event where the element name has a dash, period, or colon, the corresponding method name will have an underscore in its place. For example, "foo.bar:baz" becomes start_foo_bar_baz and end_foo_bar_baz.

    See the source for Pod::Simple::Text for an example of using this class.
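
    For a smaller example than Pod::Simple::Text, here is a sketch of a Methody subclass that collects the text of every =head1. The class name Demo::HeadingLister and the demo_* hash keys are assumptions invented for this sketch; the last lines illustrate the naming rule with a plain tr, which is not necessarily how the module implements it.

```perl
use strict;
use warnings;

{
    # Hypothetical formatter: gathers the text of each =head1 heading.
    package Demo::HeadingLister;
    use base qw(Pod::Simple::Methody);

    # The demo_* keys are this sketch's own bookkeeping in the parser object.
    sub start_head1 {
        my ($self) = @_;
        $self->{demo_in_head1} = 1;
        push @{ $self->{demo_headings} }, '';
    }

    sub handle_text {
        my ( $self, $text ) = @_;
        $self->{demo_headings}[-1] .= $text if $self->{demo_in_head1};
    }

    sub end_head1 {
        my ($self) = @_;
        $self->{demo_in_head1} = 0;
    }
}

my $parser = Demo::HeadingLister->new;
$parser->parse_string_document(
    "=head1 NAME\n\nDemo - a test\n\n=head1 Hi B<there>\n\nBody text.\n");

my @headings = @{ $parser->{demo_headings} || [] };
print "$_\n" for @headings;

# Illustration of the naming rule: dash, period, and colon become underscores.
( my $method = 'start_' . 'foo.bar:baz' ) =~ tr/-.:/_/;
print "$method\n";
```

    Text inside formatting codes such as B<...> arrives as separate handle_text calls, which is why the handler appends to the current heading instead of assigning to it.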

    SEE ALSO

    Pod::Simple, Pod::Simple::Subclassing

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     

    Pod::Simple::PullParser

    NAME

    Pod::Simple::PullParser -- a pull-parser interface to parsing Pod

    SYNOPSIS

      my $parser = SomePodProcessor->new;
      $parser->set_source( "whatever.pod" );
      $parser->run;

    Or:

      my $parser = SomePodProcessor->new;
      $parser->set_source( $some_filehandle_object );
      $parser->run;

    Or:

      my $parser = SomePodProcessor->new;
      $parser->set_source( \$document_source );
      $parser->run;

    Or:

      my $parser = SomePodProcessor->new;
      $parser->set_source( \@document_lines );
      $parser->run;

    And elsewhere:

      require 5;
      package SomePodProcessor;
      use strict;
      use base qw(Pod::Simple::PullParser);

      sub run {
        my $self = shift;
       Token:
        while(my $token = $self->get_token) {
          # ...process each token...
        }
      }

    DESCRIPTION

    This class is for using Pod::Simple to build a Pod processor -- but one that uses an interface based on a stream of token objects, instead of based on events.

    This is a subclass of Pod::Simple and inherits all its methods.

    A subclass of Pod::Simple::PullParser should define a run method that calls $token = $parser->get_token to pull tokens.

    See the source for Pod::Simple::RTF for an example of a formatter that uses Pod::Simple::PullParser.

    METHODS

    • my $token = $parser->get_token

      This returns the next token object (which will be of a subclass of Pod::Simple::PullParserToken), or undef if the parser-stream has hit the end of the document.

    • $parser->unget_token( $token )
    • $parser->unget_token( $token1, $token2, ... )

      This restores the token object(s) to the front of the parser stream.

    The source has to be set before you can parse anything. The lowest-level way is to call set_source :

    • $parser->set_source( $filename )
    • $parser->set_source( $filehandle_object )
    • $parser->set_source( \$document_source )
    • $parser->set_source( \@document_lines )

    Or you can call these methods, which Pod::Simple::PullParser has defined to work just like Pod::Simple's same-named methods:

    • $parser->parse_file(...)
    • $parser->parse_string_document(...)
    • $parser->filter(...)
    • $parser->parse_from_file(...)

    For those to work, the Pod-processing subclass of Pod::Simple::PullParser has to have defined a $parser->run method -- so it is advised that all Pod::Simple::PullParser subclasses do so. See the Synopsis above, or the source for Pod::Simple::RTF.
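
    As a minimal sketch of the pull interface described above, the loop below reads tokens straight from a Pod::Simple::PullParser object, with no subclass and no run method. The Pod source is made up, and the is_start, is_text, and is_end tests are methods of Pod::Simple::PullParserToken.

```perl
use strict;
use warnings;
use Pod::Simple::PullParser;

my $pod = "=head1 NAME\n\nDemo - B<bold> text\n";   # made-up sample document

my $parser = Pod::Simple::PullParser->new;
$parser->set_source( \$pod );    # a scalar ref means "parse this string"

my @seen;
while ( my $token = $parser->get_token ) {
    if    ( $token->is_start ) { push @seen, 'start:' . $token->tagname }
    elsif ( $token->is_text )  { push @seen, 'text:' . $token->text }
    elsif ( $token->is_end )   { push @seen, 'end:' . $token->tagname }
}

print "$_\n" for @seen;
```

    A formatter class would put the same loop inside its run method so that parse_file and the other inherited entry points work.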

    Authors of formatter subclasses might find these methods useful to call on a parser object that you haven't started pulling tokens from yet:

    • my $title_string = $parser->get_title

      This tries to get the title string out of $parser, by getting some tokens, and scanning them for the title, and then ungetting them so that you can process the token-stream from the beginning.

      For example, suppose you have a document that starts out:

        =head1 NAME

        Hoo::Boy::Wowza -- Stuff B<wow> yeah!

      $parser->get_title on that document will return "Hoo::Boy::Wowza -- Stuff wow yeah!". If the document starts with:

        =head1 Name

        Hoo::Boy::W00t -- Stuff B<w00t> yeah!

      Then you'll need to pass the nocase option in order to recognize "Name":

        $parser->get_title(nocase => 1);

      In cases where get_title can't find the title, it will return empty-string ("").

    • my $title_string = $parser->get_short_title

      This is just like get_title, except that it returns just the modulename, if the title seems to be of the form "SomeModuleName -- description".

      For example, suppose you have a document that starts out:

        =head1 NAME

        Hoo::Boy::Wowza -- Stuff B<wow> yeah!

      then $parser->get_short_title on that document will return "Hoo::Boy::Wowza".

      But if the document starts out:

        =head1 NAME

        Hooboy, stuff B<wow> yeah!

      then $parser->get_short_title on that document will return "Hooboy, stuff wow yeah!". If the document starts with:

        =head1 Name

        Hoo::Boy::W00t -- Stuff B<w00t> yeah!

      Then you'll need to pass the nocase option in order to recognize "Name":

        $parser->get_short_title(nocase => 1);

      If the title can't be found, then get_short_title returns empty-string ("").

    • $author_name = $parser->get_author

      This works like get_title except that it returns the contents of the "=head1 AUTHOR\n\nParagraph...\n" section, assuming that that section isn't terribly long. To recognize a "=head1 Author\n\nParagraph\n" section, pass the nocase option:

        $parser->get_author(nocase => 1);

      (This method tolerates "AUTHORS" instead of "AUTHOR" too.)

    • $description_name = $parser->get_description

      This works like get_title except that it returns the contents of the "=head1 DESCRIPTION\n\nParagraph...\n" section, assuming that that section isn't terribly long. To recognize a "=head1 Description\n\nParagraph\n" section, pass the nocase option:

        $parser->get_description(nocase => 1);

    • $version_block = $parser->get_version

      This works like get_title except that it returns the contents of the "=head1 VERSION\n\n[BIG BLOCK]\n" block. Note that this does NOT return the module's $VERSION !! To recognize a "=head1 Version\n\n[BIG BLOCK]\n" section, pass the nocase option:

        $parser->get_version(nocase => 1);
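
    The title helpers above can be exercised on an in-memory document. This sketch reuses the sample NAME sections from the text; the variable names are arbitrary.

```perl
use strict;
use warnings;
use Pod::Simple::PullParser;

# The sample document from the get_title description.
my $pod = "=head1 NAME\n\nHoo::Boy::Wowza -- Stuff B<wow> yeah!\n";

my $parser = Pod::Simple::PullParser->new;
$parser->set_source( \$pod );

# Both calls put their tokens back, so they can be made one after the other.
my $title       = $parser->get_title;
my $short_title = $parser->get_short_title;

print "title:       $title\n";
print "short title: $short_title\n";

# A mixed-case "Name" heading needs the nocase option.
my $pod_mixed = "=head1 Name\n\nHoo::Boy::W00t -- Stuff B<w00t> yeah!\n";
my $parser_mixed = Pod::Simple::PullParser->new;
$parser_mixed->set_source( \$pod_mixed );
my $title_mixed = $parser_mixed->get_title( nocase => 1 );
print "nocase title: $title_mixed\n";
```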

    NOTE

    You don't actually have to define a run method. If you're writing a Pod-formatter class, you should define a run just so that users can call parse_file etc, but you don't have to.

    And if you're not writing a formatter class, but are instead just writing a program that does something simple with a Pod::Simple::PullParser object (and not an object of a subclass), then there's no reason to bother subclassing to add a run method.

    SEE ALSO

    Pod::Simple

    Pod::Simple::PullParserToken -- and its subclasses Pod::Simple::PullParserStartToken, Pod::Simple::PullParserTextToken, and Pod::Simple::PullParserEndToken.

    HTML::TokeParser, which inspired this.

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     

    Pod::Simple::PullParserEndToken

    NAME

    Pod::Simple::PullParserEndToken -- end-tokens from Pod::Simple::PullParser

    SYNOPSIS

    (See Pod::Simple::PullParser)

    DESCRIPTION

    When you do $parser->get_token on a Pod::Simple::PullParser, you might get an object of this class.

    This is a subclass of Pod::Simple::PullParserToken and inherits all its methods, and adds these methods:

    • $token->tagname

      This returns the tagname for this end-token object. For example, parsing a "=head1 ..." line will give you a start-token with the tagname of "head1", token(s) for its content, and then an end-token with the tagname of "head1".

    • $token->tagname(somestring)

      This changes the tagname for this end-token object. You probably won't need to do this.

    • $token->tag(...)

      A shortcut for $token->tagname(...)

    • $token->is_tag(somestring) or $token->is_tagname(somestring)

      These are shortcuts for $token->tag() eq somestring

    You're unlikely to ever need to construct an object of this class for yourself, but if you want to, call Pod::Simple::PullParserEndToken->new( tagname )

    SEE ALSO

    Pod::Simple::PullParserToken, Pod::Simple, Pod::Simple::Subclassing

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     

    Pod::Simple::PullParserStartToken

    NAME

    Pod::Simple::PullParserStartToken -- start-tokens from Pod::Simple::PullParser

    SYNOPSIS

    (See Pod::Simple::PullParser)

    DESCRIPTION

    When you do $parser->get_token on a Pod::Simple::PullParser object, you might get an object of this class.

    This is a subclass of Pod::Simple::PullParserToken and inherits all its methods, and adds these methods:

    • $token->tagname

      This returns the tagname for this start-token object. For example, parsing a "=head1 ..." line will give you a start-token with the tagname of "head1", token(s) for its content, and then an end-token with the tagname of "head1".

    • $token->tagname(somestring)

      This changes the tagname for this start-token object. You probably won't need to do this.

    • $token->tag(...)

      A shortcut for $token->tagname(...)

    • $token->is_tag(somestring) or $token->is_tagname(somestring)

      These are shortcuts for $token->tag() eq somestring

    • $token->attr(attrname)

      This returns the value of the attrname attribute for this start-token object, or undef.

      For example, parsing a L<Foo/"Bar"> link will produce a start-token with a "to" attribute with the value "Foo", a "type" attribute with the value "pod", and a "section" attribute with the value "Bar".

    • $token->attr(attrname, newvalue)

      This sets the attrname attribute for this start-token object to newvalue. You probably won't need to do this.

    • $token->attr_hash

      This returns the hashref that is the attribute set for this start-token. This is useful if (for example) you want to ask what all the attributes are -- you can just do keys %{$token->attr_hash}

    You're unlikely to ever need to construct an object of this class for yourself, but if you want to, call Pod::Simple::PullParserStartToken->new( tagname, attrhash )
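
    To see the attribute methods at work, this sketch pulls tokens until it reaches the start-token of an L code and then reads the attributes named above. The sample Pod text is made up; the values are stringified because some of them are objects.

```perl
use strict;
use warnings;
use Pod::Simple::PullParser;

my $pod = qq{=pod\n\nSee L<Foo/"Bar"> for details.\n};   # made-up sample

my $parser = Pod::Simple::PullParser->new;
$parser->set_source( \$pod );

my %link;          # stringified attribute values of the first L<> code
my @attr_names;    # every attribute name that start-token carries

while ( my $token = $parser->get_token ) {
    next unless $token->is_start and $token->is_tag('L');
    %link       = map { $_ => '' . $token->attr($_) } qw(to type section);
    @attr_names = sort keys %{ $token->attr_hash };
    last;
}

print "$_ = $link{$_}\n" for sort keys %link;
print "all attributes: @attr_names\n";
```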

    SEE ALSO

    Pod::Simple::PullParserToken, Pod::Simple, Pod::Simple::Subclassing

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     

    Pod::Simple::PullParserTextToken

    NAME

    Pod::Simple::PullParserTextToken -- text-tokens from Pod::Simple::PullParser

    SYNOPSIS

    (See Pod::Simple::PullParser)

    DESCRIPTION

    When you do $parser->get_token on a Pod::Simple::PullParser, you might get an object of this class.

    This is a subclass of Pod::Simple::PullParserToken and inherits all its methods, and adds these methods:

    • $token->text

      This returns the text that this token holds. For example, parsing C<foo> will return a C start-token, a text-token, and a C end-token. And if you want to get the "foo" out of the text-token, call $token->text

    • $token->text(somestring)

      This changes the string that this token holds. You probably won't need to do this.

    • $token->text_r()

      This returns a scalar reference to the string that this token holds. This can be useful if you don't want to memory-copy the potentially large text value (well, as large as a paragraph or a verbatim block) as calling $token->text would do.

      Or, if you want to alter the value, you can even do things like this:

        for ( ${ $token->text_r } ) {  # Aliases it with $_ !!

          s/ The / the /g; # just for example

          if( 'A' eq chr(65) ) {  # (if in an ASCII world)
            tr/\xA0/ /;
            tr/\xAD//d;
          }

          # ...or however you want to alter the value...
        }

    You're unlikely to ever need to construct an object of this class for yourself, but if you want to, call Pod::Simple::PullParserTextToken->new( text )
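As a rough illustration of the text and text_r methods described above, the following sketch pulls tokens from a small in-memory document and edits each text-token in place. The sample Pod string and the foo/bar substitution are made up for this example; set_source and get_token come from Pod::Simple::PullParser.

```perl
use strict;
use warnings;
use Pod::Simple::PullParser;

# Hypothetical sample document, used only for this illustration.
my $pod = "=pod\n\nSay C<foo> now.\n";

my $parser = Pod::Simple::PullParser->new;
$parser->set_source( \$pod );    # a scalar ref is read as in-memory Pod

my @texts;
while ( my $token = $parser->get_token ) {
    next unless $token->is_text;

    # Edit the held string through the reference; no copy is made.
    ${ $token->text_r } =~ s/foo/bar/;

    push @texts, $token->text;   # read the (possibly altered) text back
}

print join( '|', @texts ), "\n";
```

Going through text_r is mainly worthwhile for large verbatim blocks; for short strings, calling $token->text and then $token->text($new) should work just as well.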

    SEE ALSO

    Pod::Simple::PullParserToken, Pod::Simple, Pod::Simple::Subclassing

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::PullParserToken

    Perl 5 version 18.2 documentation

    NAME

    Pod::Simple::PullParserToken -- tokens from Pod::Simple::PullParser

    SYNOPSIS

    Given a $parser that's an object of class Pod::Simple::PullParser (or a subclass)...

      while(my $token = $parser->get_token) {
        $DEBUG and print "Token: ", $token->dump, "\n";
        if($token->is_start) {
          # ...access $token->tagname, $token->attr, etc...
        } elsif($token->is_text) {
          # ...access $token->text, $token->text_r, etc...
        } elsif($token->is_end) {
          # ...access $token->tagname...
        }
      }

    (Also see Pod::Simple::PullParser)

    DESCRIPTION

    When you do $parser->get_token on a Pod::Simple::PullParser, you should get an object of a subclass of Pod::Simple::PullParserToken.

    Subclasses will add methods, and will also inherit these methods:

    • $token->type

      This returns the type of the token. This will be either the string "start", the string "text", or the string "end".

      Once you know what the type of an object is, you then know what subclass it belongs to, and therefore what methods it supports.

      Yes, you could probably do the same thing with code like $token->isa('Pod::Simple::PullParserEndToken'), but that's not so pretty as using just $token->type, or even the following shortcuts:

    • $token->is_start

      This is a shortcut for $token->type() eq "start"

    • $token->is_text

      This is a shortcut for $token->type() eq "text"

    • $token->is_end

      This is a shortcut for $token->type() eq "end"

    • $token->dump

      This returns a handy stringified value of this object. This is useful for debugging, as in:

        while(my $token = $parser->get_token) {
          $DEBUG and print "Token: ", $token->dump, "\n";
          ...
        }
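The type method and its shortcuts can be exercised with a short loop like the one below. This is only a sketch: the sample Pod string is invented, and tagname is a method of the start-token subclass rather than of this base class.

```perl
use strict;
use warnings;
use Pod::Simple::PullParser;

# Hypothetical sample document, used only for this illustration.
my $pod = "=head1 NAME\n\nSome B<bold> text.\n";

my $parser = Pod::Simple::PullParser->new;
$parser->set_source( \$pod );

my %count;     # tokens seen, keyed by type ("start", "text" or "end")
my @opened;    # element names, in the order their start-tokens arrived
while ( my $token = $parser->get_token ) {
    $count{ $token->type }++;
    push @opened, $token->tagname if $token->is_start;
}

print "$_: $count{$_}\n" for sort keys %count;
print "opened: @opened\n";
```

Every element that is opened is also closed, so the "start" and "end" counts should match for a complete document.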

    SEE ALSO

    My subclasses: Pod::Simple::PullParserStartToken, Pod::Simple::PullParserTextToken, and Pod::Simple::PullParserEndToken.

    Pod::Simple::PullParser and Pod::Simple

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::RTF

    Perl 5 version 18.2 documentation

    NAME

    Pod::Simple::RTF -- format Pod as RTF

    SYNOPSIS

      perl -MPod::Simple::RTF -e \
       "exit Pod::Simple::RTF->filter(shift)->any_errata_seen" \
       thingy.pod > thingy.rtf

    DESCRIPTION

    This class is a formatter that takes Pod and renders it as RTF, good for viewing/printing in MSWord, WordPad/write.exe, TextEdit, etc.

    This is a subclass of Pod::Simple and inherits all its methods.

    FORMAT CONTROL ATTRIBUTES

    You can set these attributes on the parser object before you call parse_file (or a similar method) on it:

    • $parser->head1_halfpoint_size( halfpoint_integer );
    • $parser->head2_halfpoint_size( halfpoint_integer );
    • $parser->head3_halfpoint_size( halfpoint_integer );
    • $parser->head4_halfpoint_size( halfpoint_integer );

      These methods set the size (in half-points, like 52 for 26-point) that these heading levels will appear as.

    • $parser->codeblock_halfpoint_size( halfpoint_integer );

      This method sets the size (in half-points, like 21 for 10.5-point) that codeblocks ("verbatim sections") will appear as.

    • $parser->header_halfpoint_size( halfpoint_integer );

      This method sets the size (in half-points, like 15 for 7.5-point) that the header on each page will appear in. The header is usually just "modulename p. pagenumber".

    • $parser->normal_halfpoint_size( halfpoint_integer );

      This method sets the size (in half-points, like 26 for 13-point) that normal paragraphic text will appear in.

    • $parser->no_proofing_exemptions( true_or_false );

      Set this value to true if you don't want the formatter to try putting a hidden code on all Perl symbols (as best as it can notice them) that labels them as being not in English, and so not worth spellchecking.

    • $parser->doc_lang( microsoft_decimal_language_code )

      This sets the language code to tag this document as being in. By default, it is currently the value of the environment variable RTFDEFLANG, or if that's not set, then the value 1033 (for US English).

      Setting this appropriately is useful if you want to use the RTF to spellcheck, and/or if you want it to hyphenate right.

      Here are some notable values:

        1033  US English
        2057  UK English
        3081  Australia English
        4105  Canada English
        1034  Spain Spanish
        2058  Mexico Spanish
        1031  Germany German
        1036  France French
        3084  Canada French
        1035  Finnish
        1044  Norwegian (Bokmal)
        2068  Norwegian (Nynorsk)

    If you are particularly interested in customizing this module's output even more, see the source and/or write to me.
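A minimal sketch of setting the format control attributes before parsing might look like this. The sizes and the sample Pod string are arbitrary values chosen for the example; output_string and parse_string_document are inherited from Pod::Simple.

```perl
use strict;
use warnings;
use Pod::Simple::RTF;

my $parser = Pod::Simple::RTF->new;

# All sizes are in half-points, so 48 means 24-point type.
$parser->head1_halfpoint_size(48);
$parser->normal_halfpoint_size(22);
$parser->codeblock_halfpoint_size(18);
$parser->doc_lang(2057);               # UK English, from the list above
$parser->no_proofing_exemptions(1);    # do not tag Perl symbols as non-English

# Collect the RTF in a string instead of printing it.
$parser->output_string( \my $rtf );
$parser->parse_string_document("=head1 NAME\n\nDemo - a tiny sample\n");

print length($rtf), " bytes of RTF generated\n";
```

The attributes have to be set before the parse call, since the formatter reads them while it emits the document.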

    SEE ALSO

    Pod::Simple, RTF::Writer, RTF::Cookbook, RTF::Document, RTF::Generator

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::Search

    Perl 5 version 18.2 documentation

    NAME

    Pod::Simple::Search - find POD documents in directory trees

    SYNOPSIS

      use Pod::Simple::Search;
      my $name2path = Pod::Simple::Search->new->limit_glob('LWP::*')->survey;
      print "Looky see what I found: ",
        join(' ', sort keys %$name2path), "\n";

      print "LWPUA docs = ",
        Pod::Simple::Search->new->find('LWP::UserAgent') || "?",
        "\n";

    DESCRIPTION

    Pod::Simple::Search is a class that you use for running searches for Pod files. An object of this class has several attributes (mostly options for controlling search options), and some methods for searching based on those attributes.

    The way to use this class is to make a new object of this class, set any options, and then call one of the search methods (probably survey or find). The sections below discuss the syntax for doing all that.

    CONSTRUCTOR

    This class provides one constructor, called new. It takes no parameters:

      use Pod::Simple::Search;
      my $search = Pod::Simple::Search->new;

    ACCESSORS

    This class defines several methods for setting (and, occasionally, reading) the contents of an object. With two exceptions (discussed at the end of this section), these attributes are just for controlling the way searches are carried out.

    Note that each of these returns $self when you call it as $self->whatever(value). That's so that you can chain together set-attribute calls like this:

      my $name2path =
        Pod::Simple::Search->new
        -> inc(0) -> verbose(1) -> callback(\&blab)
        ->survey(@there);

    ...which works exactly as if you'd done this:

      my $search = Pod::Simple::Search->new;
      $search->inc(0);
      $search->verbose(1);
      $search->callback(\&blab);
      my $name2path = $search->survey(@there);

    • $search->inc( true-or-false );

      This attribute, if set to a true value, means that searches should implicitly add perl's @INC paths. This automatically covers paths specified in the PERL5LIB environment variable, since the Perl interpreter itself prepends them to @INC. This attribute's default value is TRUE. If you want to search only specific directories, set $self->inc(0) before calling $self->survey or $self->find.

    • $search->verbose( nonnegative-number );

      This attribute, if set to a nonzero positive value, will make searches output (via warn) notes about what they're doing as they do it. This option may be useful for debugging a pod-related module. This attribute's default value is zero, meaning that no warn messages are produced. (Setting verbose to 1 turns on some messages, and setting it to 2 turns on even more messages, i.e., makes the following search(es) even more verbose than 1 would make them.)

    • $search->limit_glob( some-glob-string );

      This option means that you want to limit the results just to items whose podnames match the given glob/wildcard expression. For example, you might limit your search to just "LWP::*", to search only for modules whose names start with "LWP::" (but not including the module "LWP" itself); or you might limit your search to "LW*" to see only modules whose (full) names begin with "LW"; or you might search for "*Find*" to search for all modules with "Find" somewhere in their full name. (You can also use "?" in a glob expression; so "DB?" will match "DBI" and "DBD".)

    • $search->callback( \&some_routine );

      This attribute means that every time this search sees a matching Pod file, it should call this callback routine. The routine is called with two parameters: the current file's filespec, and its pod name. (For example: ("/etc/perljunk/File/Crunk.pm", "File::Crunk") would be in @_.)

      The callback routine's return value is not used for anything.

      This attribute's default value is false, meaning that no callback is called.

    • $search->laborious( true-or-false );

      Unless you set this attribute to a true value, Pod::Simple::Search will apply Perl-specific heuristics to find the correct module PODs quickly. This attribute's default value is false. You won't normally need to set this to true.

      Specifically: Turning on this option will disable the heuristics for seeing only files with Perl-like extensions, omitting subdirectories that are numeric but do not match the current Perl interpreter's version ID, suppressing site_perl as a module hierarchy name, etc.

    • $search->shadows( true-or-false );

      Unless you set this attribute to a true value, Pod::Simple::Search will consider only the first file of a given modulename as it looks through the specified directories; that is, with this option off, if Pod::Simple::Search has seen a somepathdir/Foo/Bar.pm already in this search, then it won't bother looking at a somelaterpathdir/Foo/Bar.pm later on in that search, because that file is merely a "shadow". But if you turn on $self->shadows(1), then these "shadow" files are inspected too, and are noted in the path2name return hash.

      This attribute's default value is false; and normally you won't need to turn it on.

    • $search->limit_re( some-regexp );

      Setting this attribute (to a value that's a regexp) means that you want to limit the results just to items whose podnames match the given regexp. Normally this option is not needed, and the more efficient limit_glob attribute is used instead.

    • $search->dir_prefix( some-string-value );

      Setting this attribute to a string value means that the searches should begin in the specified subdirectory name (like "Pod" or "File::Find", also expressible as "File/Find"). For example, the search option $search->limit_glob("File::Find::R*") is the same as the combination of the search options $search->limit_re("^File::Find::R")->dir_prefix("File::Find").

      Normally you don't need to know about the dir_prefix option, but I include it in case it might prove useful for someone somewhere.

      (Implementationally, searching with limit_glob ends up setting limit_re and usually dir_prefix.)

    • $search->progress( some-progress-object );

      If you set a value for this attribute, the value is expected to be an object (probably of a class that you define) that has a reach method and a done method. This is meant for reporting progress during the search, if you don't want to use a simple callback.

      Normally you don't need to know about the progress option, but I include it in case it might prove useful for someone somewhere.

      While a search is in progress, the progress object's reach and done methods are called like this:

        # Every time a file is being scanned for pod:
        $progress->reach($count, "Scanning $file");   ++$count;

        # And then at the end of the search:
        $progress->done("Noted $count Pod files total");

      Internally, we often set this to an object of class Pod::Simple::Progress. That class is probably undocumented, but you may wish to look at its source.

    • $name2path = $self->name2path;

      This attribute is not a search parameter, but is used to report the result of the survey method, as discussed in the next section.

    • $path2name = $self->path2name;

      This attribute is not a search parameter, but is used to report the result of the survey method, as discussed in the next section.
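To see several of these accessors working together, here is a small sketch that builds a throwaway directory tree and searches it. The module names My::Demo and Other::Thing, and the temporary directory layout, are invented for the example; only inc, limit_glob, callback and survey come from this class.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);
use Pod::Simple::Search;

# Build a throwaway library tree; every name below is hypothetical.
my $lib = tempdir( CLEANUP => 1 );
for my $module ( [qw(My Demo)], [qw(Other Thing)] ) {
    my ( $dir, $base ) = @$module;
    make_path( File::Spec->catdir( $lib, $dir ) );
    my $file = File::Spec->catfile( $lib, $dir, "$base.pm" );
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "package ${dir}::$base;\n1;\n\n",
                "=head1 NAME\n\n${dir}::$base - sample\n\n=cut\n";
    close $fh or die "Cannot close $file: $!";
}

my @seen;    # pod names reported through the callback
my $search = Pod::Simple::Search->new
    ->inc(0)                   # search only the directories passed to survey
    ->limit_glob('My::*')      # keep only pod names under My::
    ->callback( sub { my ( $file, $name ) = @_; push @seen, $name } );

my $name2path = $search->survey($lib);

print "callback saw: @seen\n";
print "survey found: ", join( ' ', sort keys %$name2path ), "\n";
```

Turning inc off keeps the run independent of whatever happens to be installed in @INC, which is usually what you want when searching a private tree.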

    MAIN SEARCH METHODS

    Once you've actually set any options you want (if any), you can go ahead and use the following methods to search for Pod files in particular ways.

    $search->survey( @directories )

    The method survey searches for POD documents in a given set of files and/or directories. This runs the search according to the various options set by the accessors above. (For example, if the inc attribute is on, as it is by default, then the perl @INC directories are implicitly added to the list of directories (if any) that you specify.)

    The return value of survey is two hashes:

    • name2path

      A hash that maps from each pod-name to the filespec (like "Stuff::Thing" => "/whatever/plib/Stuff/Thing.pm")

    • path2name

      A hash that maps from each Pod filespec to its pod-name (like "/whatever/plib/Stuff/Thing.pm" => "Stuff::Thing")

    Besides saving these hashes as the hashref attributes name2path and path2name, calling this function also returns these hashrefs. In list context, the return value of $search->survey is the list (\%name2path, \%path2name). In scalar context, the return value is \%name2path. Or you can just call this in void context.

    Regardless of calling context, calling survey saves its results in its name2path and path2name attributes.

    E.g., when searching in $HOME/perl5lib, the file $HOME/perl5lib/MyModule.pm would get the POD name MyModule, whereas $HOME/perl5lib/Myclass/Subclass.pm would be Myclass::Subclass. The name information can be used for POD translators.

    Only text files containing at least one valid POD command are found.

    In verbose mode, a warning is printed if shadows are found (i.e., more than one POD file with the same POD name is found, e.g. CPAN.pm in different directories). This usually indicates duplicate occurrences of modules in the @INC search path, which is occasionally inadvertent (but is often simply a case of a user's path dir having a more recent version than the system's general path dirs).

    The arguments to this method are a list of directories, which are searched recursively, or files. (Usually you wouldn't specify files, but just dirs.) Or you can just specify an empty list, as in $name2path = $search->survey; with the inc option on, as it is by default, the perl @INC directories are still searched, as described above.

    The POD names of files are the plain basenames with any Perl-like extension (.pm, .pl, .pod) stripped, and path separators replaced by ::'s.

    Calling Pod::Simple::Search->survey(...) is short for Pod::Simple::Search->new->survey(...). That is, a throwaway object with default attribute values is used.
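The two return hashes can be sketched as follows, using a temporary directory so that the example does not depend on what is installed. The module name My::Demo and the directory layout are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);
use Pod::Simple::Search;

# Hypothetical one-module library tree for the demonstration.
my $lib = tempdir( CLEANUP => 1 );
make_path( File::Spec->catdir( $lib, 'My' ) );
my $module = File::Spec->catfile( $lib, 'My', 'Demo.pm' );
open my $fh, '>', $module or die "Cannot write $module: $!";
print {$fh} "package My::Demo;\n1;\n\n=head1 NAME\n\nMy::Demo - sample\n\n=cut\n";
close $fh or die "Cannot close $module: $!";

my $search = Pod::Simple::Search->new->inc(0);

# List context yields both hashrefs.
my ( $name2path, $path2name ) = $search->survey($lib);

my $file = $name2path->{'My::Demo'};
print "My::Demo was found at $file\n";
print "which maps back to $path2name->{$file}\n";

# The same data stays available through the attributes afterwards.
my $same = $search->name2path->{'My::Demo'} eq $file;
print "attribute and return value agree\n" if $same;
```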

    $search->simplify_name( $str )

    The method simplify_name is equivalent to basename, but also strips Perl-like extensions (.pm, .pl, .pod) and extensions like .bat, .cmd on Win32 and OS/2, or .com on VMS, respectively.

    $search->find( $pod )

    $search->find( $pod, @search_dirs )

    Returns the location of a Pod file, given a Pod/module/script name (like "Foo::Bar" or "perlvar" or "perldoc"), and an idea of what files/directories to look in. It searches according to the various options set by the accessors above. (For example, if the inc attribute is on, as it is by default, then the perl @INC directories are implicitly added to the list of directories (if any) that you specify.)

    This returns the full path of the first occurrence of the file. Package names (e.g. 'A::B') are automatically converted to directory names in the selected directory. Additionally, '.pm', '.pl' and '.pod' are automatically appended to the search as required. (So, for example, under Unix, "A::B" is converted to "somedir/A/B.pm", "somedir/A/B.pod", or "somedir/A/B.pl", as appropriate.)

    If no such Pod file is found, this method returns undef.

    If any of the given search directories contains a pod/ subdirectory, then it is searched. (That's how we manage to find perlfunc, for example, which is usually in pod/perlfunc in most Perl dists.)

    The verbose and inc attributes influence the behavior of this search; notably, inc, if true, adds @INC and also $Config::Config{'scriptdir'} to the list of directories to search.

    It is common to simply say $filename = Pod::Simple::Search->new->find("perlvar") so that just the @INC (well, and scriptdir) directories are searched. (This happens because the inc attribute is true by default.)

    Calling Pod::Simple::Search->find(...) is short for Pod::Simple::Search->new->find(...). That is, a throwaway object with default attribute values is used.
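A short sketch of find, again against a temporary directory so the result does not depend on the local Perl installation. The names My::Demo and My::NoSuchModule are hypothetical; contains_pod is the method documented in the next entry.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);
use Pod::Simple::Search;

# Hypothetical one-module library tree for the demonstration.
my $lib = tempdir( CLEANUP => 1 );
make_path( File::Spec->catdir( $lib, 'My' ) );
my $module = File::Spec->catfile( $lib, 'My', 'Demo.pm' );
open my $fh, '>', $module or die "Cannot write $module: $!";
print {$fh} "package My::Demo;\n1;\n\n=head1 NAME\n\nMy::Demo - sample\n\n=cut\n";
close $fh or die "Cannot close $module: $!";

# With inc off, only the directories given to find are examined.
my $search  = Pod::Simple::Search->new->inc(0);
my $found   = $search->find( 'My::Demo',         $lib );
my $missing = $search->find( 'My::NoSuchModule', $lib );

print 'My::Demo: ',         ( defined $found   ? $found   : 'not found' ), "\n";
print 'My::NoSuchModule: ', ( defined $missing ? $missing : 'not found' ), "\n";
print "the located file contains Pod\n" if $search->contains_pod($found);
```

Checking the result with defined, rather than for truth alone, follows the documented behavior of returning undef when nothing is found.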

    $self->contains_pod( $file )

    Returns true if the supplied filename (not POD module) contains some Pod documentation.

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org> with code borrowed from Marek Rouchal's Pod::Find, which in turn heavily borrowed code from Nick Ing-Simmons' PodToHtml.

    But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::SimpleTree

    Perl 5 version 18.2 documentation

    NAME

    Pod::Simple::SimpleTree -- parse Pod into a simple parse tree

    SYNOPSIS

      % cat ptest.pod

      =head1 PIE

      I like B<pie>!

      % perl -MPod::Simple::SimpleTree -MData::Dumper -e \
         "print Dumper(Pod::Simple::SimpleTree->new->parse_file(shift)->root)" \
         ptest.pod

      $VAR1 = [
                'Document',
                { 'start_line' => 1 },
                [
                  'head1',
                  { 'start_line' => 1 },
                  'PIE'
                ],
                [
                  'Para',
                  { 'start_line' => 3 },
                  'I like ',
                  [
                    'B',
                    {},
                    'pie'
                  ],
                  '!'
                ]
              ];

    DESCRIPTION

    This class is of interest to people writing a Pod processor/formatter.

    This class takes Pod and parses it, returning a parse tree made just of arrayrefs, and hashrefs, and strings.

    This is a subclass of Pod::Simple and inherits all its methods.

    This class is inspired by XML::Parser's "Tree" parsing-style, although it doesn't use exactly the same LoL format.

    METHODS

    At the end of the parse, call $parser->root to get the tree's top node.

    Tree Contents

    Every element node in the parse tree is represented by an arrayref of the form: [ elementname, \%attributes, ...subnodes... ]. See the example tree dump in the Synopsis, above.

    Every text node in the tree is represented by a simple (non-ref) string scalar. So you can test ref($node) to see whether you have an element node or just a text node.

    The top node in the tree is [ 'Document', \%attributes, ...subnodes... ]
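Given that node layout, a recursive walk over the tree can be sketched as below. The helper text_of is written for this example and is not part of the module; the sample Pod is the one from the Synopsis.

```perl
use strict;
use warnings;
use Pod::Simple::SimpleTree;

# Return the concatenated text beneath a node. This helper is an
# illustration only; the module itself does not provide it.
sub text_of {
    my ($node) = @_;
    return $node unless ref $node;               # text node: a plain string
    my ( $name, $attrs, @children ) = @$node;    # element node
    return join '', map { text_of($_) } @children;
}

my $parser = Pod::Simple::SimpleTree->new;
$parser->parse_string_document("=head1 PIE\n\nI like B<pie>!\n");
my $root = $parser->root;

# Subnodes start at index 2, after the element name and attribute hash.
for my $child ( @{$root}[ 2 .. $#$root ] ) {
    next unless ref $child;                      # skip any bare text
    printf "%-6s %s\n", $child->[0], text_of($child);
}
```

Because text nodes are plain strings, the single ref test is enough to tell the two kinds of node apart while recursing.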

    SEE ALSO

    Pod::Simple

    perllol

    The Tree subsubsection in XML::Parser

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::Text

    Perl 5 version 18.2 documentation

    NAME

    Pod::Simple::Text -- format Pod as plaintext

    SYNOPSIS

      perl -MPod::Simple::Text -e \
       "exit Pod::Simple::Text->filter(shift)->any_errata_seen" \
       thingy.pod

    DESCRIPTION

    This class is a formatter that takes Pod and renders it as wrapped plaintext.

    Its wrapping is done by Text::Wrap, so you can change $Text::Wrap::columns as you like.

    This is a subclass of Pod::Simple and inherits all its methods.
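As a small sketch of the wrapping behavior, the example below narrows $Text::Wrap::columns before formatting a long paragraph. The 40-column width and the sample text are arbitrary; output_string and parse_string_document are inherited from Pod::Simple.

```perl
use strict;
use warnings;
use Pod::Simple::Text;
use Text::Wrap ();

# Narrow the wrap width before parsing; 40 is an arbitrary choice.
$Text::Wrap::columns = 40;

# A paragraph long enough to need several output lines.
my $pod = "=head1 NAME\n\n" . join( ' ', ('wrap me please') x 10 ) . "\n";

my $parser = Pod::Simple::Text->new;
$parser->output_string( \my $text );    # collect output in a string
$parser->parse_string_document($pod);

print $text;
```

Since $Text::Wrap::columns is a package variable, the setting affects every other user of Text::Wrap in the same program; wrapping the call in a block with local is one way to limit that.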

    SEE ALSO

    Pod::Simple, Pod::Simple::TextContent, Pod::Text

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::TextContent

    Perl 5 version 18.2 documentation

    NAME

    Pod::Simple::TextContent -- get the text content of Pod

    SYNOPSIS

      TODO
      perl -MPod::Simple::TextContent -e \
       "exit Pod::Simple::TextContent->filter(shift)->any_errata_seen" \
       thingy.pod

    DESCRIPTION

    This class parses Pod and dumps just the text content. It is mainly meant for use by the Pod::Simple test suite, but you may find some other use for it.

    This is a subclass of Pod::Simple and inherits all its methods.
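For a quick look at what "just the text content" means, here is a minimal sketch that formats a two-paragraph document into a string. The sample Pod is invented; output_string and parse_string_document are inherited from Pod::Simple.

```perl
use strict;
use warnings;
use Pod::Simple::TextContent;

my $parser = Pod::Simple::TextContent->new;
$parser->output_string( \my $text );    # collect output in a string
$parser->parse_string_document("=head1 PIE\n\nI like B<pie>!\n");

# The formatting code B<...> is dropped; only its text remains.
print $text;
```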

    SEE ALSO

    Pod::Simple, Pod::Simple::Text, Pod::Spell

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::XHTML

    Perl 5 version 18.2 documentation

    NAME

    Pod::Simple::XHTML -- format Pod as validating XHTML

    SYNOPSIS

      use Pod::Simple::XHTML;

      my $parser = Pod::Simple::XHTML->new();

      ...

      $parser->parse_file('path/to/file.pod');

    DESCRIPTION

    This class is a formatter that takes Pod and renders it as validating XHTML.

    This is a subclass of Pod::Simple::Methody and inherits all its methods. The implementation is entirely different from Pod::Simple::HTML, but it largely preserves the same interface.

    Minimal code

      use Pod::Simple::XHTML;
      my $psx = Pod::Simple::XHTML->new;
      $psx->output_string(\my $html);
      $psx->parse_file('path/to/Module/Name.pm');
      open my $out, '>', 'out.html' or die "Cannot open 'out.html': $!\n";
      print $out $html;

    You can also control the character encoding and entities. For example, if you're sure that the POD is properly encoded (using the =encoding command), you can prevent high-bit characters from being encoded as HTML entities and declare the output character set as UTF-8 before parsing, like so:

      $psx->html_charset('UTF-8');
      $psx->html_encode_chars('&<>">');

    METHODS

    Pod::Simple::XHTML offers a number of methods that modify the format of the HTML output. Call these after creating the parser object, but before the call to parse_file:

      my $parser = Pod::Simple::XHTML->new();
      $parser->set_optional_param("value");   # placeholder for any method below
      $parser->parse_file($file);

    perldoc_url_prefix

    In turning Foo::Bar into http://whatever/Foo%3a%3aBar, what to put before the "Foo%3a%3aBar". The default value is "http://search.cpan.org/perldoc?".

    perldoc_url_postfix

    What to put after "Foo%3a%3aBar" in the URL. This option is not set by default.
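The effect of the prefix and postfix on L<> links can be sketched like this. The docs.example.org address and the .html postfix are placeholders invented for the example, not defaults of the module.

```perl
use strict;
use warnings;
use Pod::Simple::XHTML;

my $parser = Pod::Simple::XHTML->new;

# Hypothetical documentation site; substitute your own prefix and postfix.
$parser->perldoc_url_prefix('https://docs.example.org/pod/');
$parser->perldoc_url_postfix('.html');

$parser->output_string( \my $html );
$parser->parse_string_document("=pod\n\nSee L<Foo::Bar> for details.\n");

# Pick out the link target that was generated for Foo::Bar.
my ($href) = $html =~ /href="([^"]*Foo[^"]*)"/;
print "link target: $href\n";
```

How the module name itself is escaped between prefix and postfix may differ between Pod::Simple versions, so the sketch only prints the result rather than assuming an exact form.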

    man_url_prefix

    In turning crontab(5) into http://whatever/man/5/crontab, what to put before the "5/crontab". The default value is "http://man.he.net/man".

    man_url_postfix

    What to put after "5/crontab" in the URL. This option is not set by default.

    title_prefix, title_postfix

    What to put before and after the title in the head. The values should already be &-escaped.

    html_css

        $parser->html_css('path/to/style.css');

    The URL or relative path of a CSS file to include. This option is not set by default.

    html_javascript

    The URL or relative path of a JavaScript file to pull in. This option is not set by default.

    html_doctype

    A document type tag for the file. This option is not set by default.

    html_charset

    The character set to declare in the Content-Type meta tag created by default for html_header_tags. Note that this option will be ignored if the value of html_header_tags is changed. Defaults to "ISO-8859-1".

    html_header_tags

    Additional arbitrary HTML tags for the header of the document. The default value is just a content type header tag:

        <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

    Add additional meta tags here, or blocks of inline CSS or JavaScript (wrapped in the appropriate tags).

    html_encode_chars

    A string containing all characters that should be encoded as HTML entities, specified using the regular expression character class syntax (what you find within brackets in regular expressions). This value will be passed as the second argument to the encode_entities function of HTML::Entities. If HTML::Entities is not installed, then any characters other than &, <, >, ", and ' will be encoded numerically.

    html_h_level

    This is the level of HTML "Hn" element to which a Pod "head1" corresponds. For example, if html_h_level is set to 2, a head1 will produce an H2, a head2 will produce an H3, and so on.
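The effect of html_h_level can be checked with a short sketch like the one below. The Pod text is invented for the example; with the level set to 2, the head1 should come out as an h2 element and the head2 as an h3 element.

```perl
use strict;
use warnings;
use Pod::Simple::XHTML;

# Sample Pod, made up for this illustration.
my $pod = "=pod\n\n=head1 NAME\n\nDemo\n\n=head2 Details\n\nMore text.\n\n=cut\n";

my $parser = Pod::Simple::XHTML->new;
$parser->html_h_level(2);               # head1 -> h2, head2 -> h3, ...
$parser->output_string(\my $html);
$parser->parse_string_document($pod);

print $html;
```

This is useful when the converted page is embedded in a site template that already uses h1 for its own title.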

    default_title

    Set a default title for the page if no title can be determined from the content. The value of this string should already be &-escaped.

    force_title

    Force a title for the page (don't try to determine it from the content). The value of this string should already be &-escaped.

    html_header, html_footer

    Set the HTML output at the beginning and end of each file. The default header includes a title, a doctype tag (if html_doctype is set), a content tag (customized by html_header_tags), a tag for a CSS file (if html_css is set), and a tag for a JavaScript file (if html_javascript is set). The default footer simply closes the html and body tags.

    The options listed above customize parts of the default header, but setting html_header or html_footer completely overrides the built-in header or footer. These may be useful if you want to use template tags instead of literal HTML headers and footers or are integrating converted POD pages in a larger website.

    If you want no headers or footers output in the HTML, set these options to the empty string.
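A sketch of the empty-string case, assuming you want an HTML fragment to paste into a larger page. The Pod text is invented for the example.

```perl
use strict;
use warnings;
use Pod::Simple::XHTML;

# Sample Pod, made up for this illustration.
my $pod = "=pod\n\nHello from Pod.\n\n=cut\n";

my $parser = Pod::Simple::XHTML->new;
$parser->html_header('');    # suppress the built-in header
$parser->html_footer('');    # suppress the built-in footer
$parser->output_string(\my $html);
$parser->parse_string_document($pod);

print $html;                 # only the converted body remains
```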

    index

    Whether to add a table-of-contents at the top of each page (called an index for the sake of tradition).

    anchor_items

    Whether to anchor every definition =item directive. This needs to be enabled if you want to be able to link to specific =item directives, which are output as <dt> elements. Disabled by default.

    backlink

    Whether to turn every =head1 directive into a link pointing to the top of the page (specifically, the opening body tag).

    SUBCLASSING

    If the standard options aren't enough, you may want to subclass Pod::Simple::XHTML. These are the most likely candidates for methods you'll want to override when subclassing.

    handle_text

    This method handles the body of text within any element: it's the body of a paragraph, or everything between a "=begin" tag and the corresponding "=end" tag, or the text within an L entity, etc. You would want to override this if you are adding a custom element type that does more than just display formatted text, for example a way to generate HTML tables from an extended version of POD.

    So, let's say you want to add a custom element called 'foo'. In your subclass's new method, after calling SUPER::new you'd call:

        $new->accept_targets_as_text( 'foo' );

    Then override the start_for method in the subclass to check for when "$flags->{'target'}" is equal to 'foo' and set a flag that marks that you're in a foo block (maybe "$self->{'in_foo'} = 1"). Then override the handle_text method to check for the flag, and pass $text to your custom subroutine to construct the HTML output for 'foo' elements, something like:

        sub handle_text {
            my ($self, $text) = @_;
            if ($self->{'in_foo'}) {
                $self->{'scratch'} .= build_foo_html($text);
                return;
            }
            $self->SUPER::handle_text($text);
        }
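Put together, the steps above might look like the following sketch. The class name My::FooXHTML, the in_foo flag, and the body of build_foo_html are all hypothetical; the sketch also assumes that the parent class provides start_for and end_for to delegate to, and it clears the flag in end_for, a step the text above does not spell out.

```perl
use strict;
use warnings;

{
    package My::FooXHTML;    # hypothetical subclass name
    use parent 'Pod::Simple::XHTML';

    sub new {
        my $class = shift;
        my $new   = $class->SUPER::new(@_);
        $new->accept_targets_as_text('foo');    # parse "=begin foo" regions
        return $new;
    }

    sub start_for {
        my ($self, $flags) = @_;
        $self->{in_foo} = 1 if ($flags->{target} || '') eq 'foo';
        return $self->SUPER::start_for($flags);
    }

    sub end_for {
        my ($self, @rest) = @_;
        $self->{in_foo} = 0;                    # leaving the region
        return $self->SUPER::end_for(@rest);
    }

    # Hypothetical renderer: upper-case the text and wrap it in a span.
    sub build_foo_html {
        my ($self, $text) = @_;
        return '<span class="foo">' . $self->encode_entities(uc $text) . '</span>';
    }

    sub handle_text {
        my ($self, $text) = @_;
        if ($self->{in_foo}) {
            $self->{scratch} .= $self->build_foo_html($text);
            return;
        }
        return $self->SUPER::handle_text($text);
    }
}

# Sample Pod, made up for this illustration.
my $pod = join "\n\n",
    '=pod', '=begin foo', 'hello world', '=end foo', 'Plain paragraph.', '=cut', '';

my $parser = My::FooXHTML->new;
$parser->html_header('');
$parser->html_footer('');
$parser->output_string(\my $html);
$parser->parse_string_document($pod);

print $html;
```

Only text inside the foo region goes through build_foo_html; everything else falls through to the inherited handle_text.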

    handle_code

    This method handles the body of text that is marked up to be code. You might for instance override this to plug in a syntax highlighter. The base implementation just escapes the text.

    The callback methods start_code and end_code emit the code tags before and after handle_code is invoked, so you might want to override these together with handle_code if this wrapping isn't suitable.

    Note that the code might be broken into multiple segments if there are nested formatting codes inside a C<...> sequence. In between the calls to handle_code other markup tags might have been emitted in that case. The same is true for verbatim sections if the codes_in_verbatim option is turned on.
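As a rough illustration of overriding handle_code, the sketch below wraps a few Perl keywords in a span. It is a toy stand-in for a real syntax highlighter: the class name, the keyword list, and the "kw" CSS class are assumptions made for the example, and it relies on the object's encode_entities method for escaping.

```perl
use strict;
use warnings;

{
    package My::HighlightXHTML;    # hypothetical subclass name
    use parent 'Pod::Simple::XHTML';

    sub handle_code {
        my ($self, $text) = @_;
        my $html = $self->encode_entities($text);    # escape first
        # Toy highlighter: mark a handful of keywords.
        $html =~ s{\b(my|our|sub|return)\b}{<span class="kw">$1</span>}g;
        $self->{scratch} .= $html;
        return;
    }
}

# Sample Pod, made up for this illustration.
my $pod = join "\n\n", '=pod', 'Declare it with C<my $x = 1> first.', '=cut', '';

my $parser = My::HighlightXHTML->new;
$parser->html_header('');
$parser->html_footer('');
$parser->output_string(\my $html);
$parser->parse_string_document($pod);

print $html;
```

Because handle_code may be called once per segment, a highlighter that needs the whole snippet at once would have to buffer the segments and flush them in end_code instead.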

    accept_targets_as_html

    This method behaves like accept_targets_as_text, but also marks the region as one whose content should be emitted literally, without HTML entity escaping or wrapping in a div element.

    resolve_pod_page_link

        my $url = $pod->resolve_pod_page_link('Net::Ping', 'INSTALL');
        my $url = $pod->resolve_pod_page_link('perlpodspec');
        my $url = $pod->resolve_pod_page_link(undef, 'SYNOPSIS');

    Resolves a POD link target (typically a module or POD file name) and section name to a URL. The resulting link will be returned for the above examples as:

        http://search.cpan.org/perldoc?Net::Ping#INSTALL
        http://search.cpan.org/perldoc?perlpodspec
        #SYNOPSIS

    Note that when there is only a section argument the URL will simply be a link to a section in the current document.
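The method can also be called directly, which is a convenient way to see how perldoc_url_prefix feeds into the result. In this sketch the prefix is a hypothetical documentation server, not a real site.

```perl
use strict;
use warnings;
use Pod::Simple::XHTML;

my $pod = Pod::Simple::XHTML->new;
$pod->perldoc_url_prefix('https://docs.example.org/pod/');    # hypothetical server

my $module_url  = $pod->resolve_pod_page_link('Net::Ping', 'INSTALL');
my $section_url = $pod->resolve_pod_page_link(undef, 'SYNOPSIS');

print "$module_url\n";     # prefix, then module name, then "#INSTALL"
print "$section_url\n";    # section-only links stay within the current page
```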

    resolve_man_page_link

        my $url = $pod->resolve_man_page_link('crontab(5)', 'EXAMPLE CRON FILE');
        my $url = $pod->resolve_man_page_link('crontab');

    Resolves a man page link target and numeric section to a URL. The resulting link will be returned for the above examples as:

        http://man.he.net/man5/crontab
        http://man.he.net/man1/crontab

    Note that the first argument is required. The section number will be parsed from it, and if it's missing will default to 1. The second argument is currently ignored, as man.he.net does not currently include linkable IDs or anchor names in its pages. Subclass to link to a different man page HTTP server.
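Subclassing to point at a different server might look like this sketch. The class name and the man.example.org URL layout are invented for the illustration, and the parsing of the "name(section)" form is this example's own, not a copy of the module's internals.

```perl
use strict;
use warnings;

{
    package My::ManLinks;    # hypothetical subclass name
    use parent 'Pod::Simple::XHTML';

    sub resolve_man_page_link {
        my ($self, $to, $section) = @_;
        return undef unless defined $to;

        # Split "crontab(5)" into page name and section number.
        my ($page, $num) = "$to" =~ /^([^(]+)(?:\((\d+)\))?$/;
        return undef unless defined $page;
        $num ||= 1;          # default to section 1

        # Hypothetical URL layout of the target server.
        return "https://man.example.org/$num/$page.html";
    }
}

my $pod = My::ManLinks->new;
my $with_section    = $pod->resolve_man_page_link('crontab(5)');
my $default_section = $pod->resolve_man_page_link('crontab');

print "$with_section\n$default_section\n";
```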

    idify

        my $id = $pod->idify($text);
        my $hash = $pod->idify($text, 1);

    This method turns an arbitrary string into a valid XHTML ID attribute value. The rules enforced, following http://webdesign.about.com/od/htmltags/a/aa031707.htm, are:

    • The id must start with a letter (a-z or A-Z)

    • All subsequent characters can be letters, numbers (0-9), hyphens (-), underscores (_), colons (:), and periods (.).

    • The final character can't be a hyphen, colon, or period. URLs ending with these characters, while allowed by XHTML, can be awkward to extract from plain text.

    • Each id must be unique within the document.

    In addition, the returned value will be unique within the context of the Pod::Simple::XHTML object unless a true value is passed as the second argument. ID attributes should always be unique within a single XHTML document, but pass the true value if you are creating not an ID but a URL hash to point to an ID (i.e., if you need to put the "#foo" in <a href="#foo">foo</a>).
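The difference between the two calling styles can be seen in a small sketch such as this one; the heading text is arbitrary.

```perl
use strict;
use warnings;
use Pod::Simple::XHTML;

my $pod = Pod::Simple::XHTML->new;

my $first  = $pod->idify('SEE ALSO');       # new ID; spaces become hyphens
my $second = $pod->idify('SEE ALSO');       # same text again: made unique
my $anchor = $pod->idify('SEE ALSO', 1);    # URL hash: no uniqueness bookkeeping

print "first:  $first\n";
print "second: $second\n";
print "anchor: #$anchor\n";
```

The first and third values should be identical, which is what lets a link built with the second argument find the element that received the ID.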

    batch_mode_page_object_init

        $pod->batch_mode_page_object_init($batchconvobj, $module, $infile, $outfile, $depth);

    Called by Pod::Simple::HTMLBatch so that the class has a chance to initialize the converter. Internally it sets the batch_mode property to true and sets batch_mode_current_level(), but Pod::Simple::XHTML does not currently use those features. Subclasses might, though.

    SEE ALSO

    Pod::Simple, Pod::Simple::Text, Pod::Spell

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2003-2005 Allison Randal.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    ACKNOWLEDGEMENTS

    Thanks to Hurricane Electric for permission to use its Linux man pages online site for man page links.

    Thanks to search.cpan.org for permission to use the site for Perl module links.

    AUTHOR

    Pod::Simple::XHTML was created by Allison Randal <allison@perl.org>.

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Simple::XMLOutStream

    NAME

    Pod::Simple::XMLOutStream -- turn Pod into XML

    SYNOPSIS

        perl -MPod::Simple::XMLOutStream -e \
          "exit Pod::Simple::XMLOutStream->filter(shift)->any_errata_seen" \
          thingy.pod

    DESCRIPTION

    Pod::Simple::XMLOutStream is a subclass of Pod::Simple that parses Pod and turns it into XML.

    Pod::Simple::XMLOutStream inherits methods from Pod::Simple.

    SEE ALSO

    Pod::Simple::DumpAsXML is rather like this class; see its documentation for a discussion of the differences.

    Pod::Simple, Pod::Simple::DumpAsXML, Pod::SAX

    Pod::Simple::Subclassing

    The older (and possibly obsolete) libraries Pod::PXML, Pod::XML

    ABOUT EXTENDING POD

    TODO: An example or two of =extend, then point to Pod::Simple::Subclassing

    ASK ME!

    If you actually want to use Pod as a format that you want to render to XML (particularly if to an XML instance with more elements than normal Pod has), please email me (sburke@cpan.org) and I'll probably have some recommendations.

    For reasons of concision and energetic laziness, some methods and options in this module (and the dozen modules it depends on) are undocumented; but one of those undocumented bits might be just what you're looking for.

    SEE ALSO

    Pod::Simple, Pod::Simple::Text, Pod::Spell

    SUPPORT

    Questions or discussion about POD and Pod::Simple should be sent to the pod-people@perl.org mail list. Send an empty email to pod-people-subscribe@perl.org to subscribe.

    This module is managed in an open GitHub repository, https://github.com/theory/pod-simple/. Feel free to fork and contribute, or to clone git://github.com/theory/pod-simple.git and send patches!

    Patches against Pod::Simple are welcome. Please send bug reports to <bug-pod-simple@rt.cpan.org>.

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002-2004 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Pod::Simple was created by Sean M. Burke <sburke@cpan.org>. But don't bother him, he's retired.

    Pod::Simple is maintained by:

    • Allison Randal allison@perl.org
    • Hans Dieter Pearcey hdp@cpan.org
    • David E. Wheeler dwheeler@cpan.org
     
    Pod::Perldoc::BaseTo

    NAME

    Pod::Perldoc::BaseTo - Base for Pod::Perldoc formatters

    SYNOPSIS

        package Pod::Perldoc::ToMyFormat;
        use base qw( Pod::Perldoc::BaseTo );
        ...

    DESCRIPTION

    This package is meant as a base of Pod::Perldoc formatters, like Pod::Perldoc::ToText, Pod::Perldoc::ToMan, etc.

    It provides default implementations for the methods

        is_pageable
        write_with_binmode
        output_extension
        _perldoc_elem

    The concrete formatter must implement

        new
        parse_from_file

    SEE ALSO

    perldoc

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002-2007 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    Pod::Perldoc::GetOptsOO

    NAME

    Pod::Perldoc::GetOptsOO - Customized option parser for Pod::Perldoc

    SYNOPSIS

        use Pod::Perldoc::GetOptsOO ();
        Pod::Perldoc::GetOptsOO::getopts( $obj, \@args, $truth )
          or die "wrong usage";

    DESCRIPTION

    Implements a customized option parser used for Pod::Perldoc.

    Rather like Getopt::Std's getopts:

    • Call Pod::Perldoc::GetOptsOO::getopts($object, \@ARGV, $truth)
    • Given -n, if there's an opt_n_with, it'll call $object->opt_n_with( ARGUMENT ) (e.g., "-n foo" => $object->opt_n_with('foo'). Ditto "-nfoo")
    • Otherwise (given -n), if there's an opt_n, we'll call it as $object->opt_n($truth) (Truth defaults to 1)
    • Otherwise we try calling $object->handle_unknown_option('n') (and we increment the error count by the return value of it)
    • If there's no handle_unknown_option, then we just warn, and then increment the error counter

    The return value of Pod::Perldoc::GetOptsOO::getopts is true if there were no errors; otherwise it's false.
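The dispatch rules above can be tried out with a small option-holder class. In this sketch the class My::Options and its two option methods are hypothetical; only getopts itself comes from the module.

```perl
use strict;
use warnings;
use Pod::Perldoc::GetOptsOO ();

{
    package My::Options;    # hypothetical option-holder class

    sub new { return bless { verbose => 0, name => undef }, shift }

    # "-v" takes no argument: called with the truth value.
    sub opt_v { my ($self, $truth) = @_; $self->{verbose} = $truth; return }

    # "-n foo" or "-nfoo": called with the argument.
    sub opt_n_with { my ($self, $value) = @_; $self->{name} = $value; return }
}

my $opts = My::Options->new;
my @args = ('-v', '-n', 'foo', 'file.pod');

Pod::Perldoc::GetOptsOO::getopts($opts, \@args, 1)
    or die "wrong usage\n";

# Recognized switches are removed from @args; the rest is left in place.
print "verbose=$opts->{verbose} name=$opts->{name} rest=@args\n";
```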

    SEE ALSO

    Pod::Perldoc

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002-2007 Sean M. Burke.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    Pod::Perldoc::ToChecker

    NAME

    Pod::Perldoc::ToChecker - let Perldoc check Pod for errors

    SYNOPSIS

        % perldoc -o checker SomeFile.pod
        No Pod errors in SomeFile.pod
        (or an error report)

    DESCRIPTION

    This is a "plug-in" class that allows Perldoc to use Pod::Simple::Checker as a "formatter" class (or if that is not available, then Pod::Checker), to check for errors in a given Pod file.

    This is actually a Pod::Simple::Checker (or Pod::Checker) subclass, and inherits all its options.

    SEE ALSO

    Pod::Simple::Checker, Pod::Simple, Pod::Checker, Pod::Perldoc

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    Pod::Perldoc::ToMan

    NAME

    Pod::Perldoc::ToMan - let Perldoc render Pod as man pages

    SYNOPSIS

        perldoc -o man Some::Modulename

    DESCRIPTION

    This is a "plug-in" class that allows Perldoc to use Pod::Man and groff for reading Pod pages.

    The following options are supported: center, date, fixed, fixedbold, fixeditalic, fixedbolditalic, quotes, release, section

    (Those options are explained in Pod::Man.)

    For example:

        perldoc -o man -w center:Pod Some::Modulename

    CAVEAT

    This module may change to use a different pod-to-nroff formatter class in the future, and this may change what options are supported.

    SEE ALSO

    Pod::Man, Pod::Perldoc, Pod::Perldoc::ToNroff

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2011 brian d foy. All rights reserved.

    Copyright (c) 2002,3,4 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    Pod::Perldoc::ToNroff

    NAME

    Pod::Perldoc::ToNroff - let Perldoc convert Pod to nroff

    SYNOPSIS

        perldoc -o nroff -d something.3 Some::Modulename

    DESCRIPTION

    This is a "plug-in" class that allows Perldoc to use Pod::Man as a formatter class.

    The following options are supported: center, date, fixed, fixedbold, fixeditalic, fixedbolditalic, quotes, release, section

    Those options are explained in Pod::Man.

    For example:

        perldoc -o nroff -w center:Pod -d something.3 Some::Modulename

    CAVEAT

    This module may change to use a different pod-to-nroff formatter class in the future, and this may change what options are supported.

    SEE ALSO

    Pod::Man, Pod::Perldoc, Pod::Perldoc::ToMan

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    Pod::Perldoc::ToPod

    NAME

    Pod::Perldoc::ToPod - let Perldoc render Pod as ... Pod!

    SYNOPSIS

        perldoc -opod Some::Modulename

    (That's currently the same as the following:)

        perldoc -u Some::Modulename

    DESCRIPTION

    This is a "plug-in" class that allows Perldoc to display Pod source as itself! Pretty Zen, huh?

    Currently this class works by just filtering out the non-Pod stuff from a given input file.

    SEE ALSO

    Pod::Perldoc

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    Pod::Perldoc::ToRtf

    NAME

    Pod::Perldoc::ToRtf - let Perldoc render Pod as RTF

    SYNOPSIS

        perldoc -o rtf Some::Modulename

    DESCRIPTION

    This is a "plug-in" class that allows Perldoc to use Pod::Simple::RTF as a formatter class.

    This is actually a Pod::Simple::RTF subclass, and inherits all its options.

    You have to have Pod::Simple::RTF installed (from the Pod::Simple dist), or this module won't work.

    If Perldoc is running under MSWin and uses this class as a formatter, the output will be opened with write.exe or whatever program is specified in the environment variable RTFREADER. For example, to specify that RTF files should be opened the same as they are when you double-click them, you would do set RTFREADER=start.exe in your autoexec.bat.

    Handy tip: put set PERLDOC=-ortf in your autoexec.bat and that will set this class as the default formatter to run when you do perldoc whatever.

    SEE ALSO

    Pod::Simple::RTF, Pod::Simple, Pod::Perldoc

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    Pod::Perldoc::ToText

    NAME

    Pod::Perldoc::ToText - let Perldoc render Pod as plaintext

    SYNOPSIS

        perldoc -o text Some::Modulename

    DESCRIPTION

    This is a "plug-in" class that allows Perldoc to use Pod::Text as a formatter class.

    It supports the following options, which are explained in Pod::Text: alt, indent, loose, quotes, sentence, width

    For example:

        perldoc -o text -w indent:5 Some::Modulename

    CAVEAT

    This module may change to use a different text formatter class in the future, and this may change what options are supported.

    SEE ALSO

    Pod::Text, Pod::Perldoc

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    Pod::Perldoc::ToTk

    NAME

    Pod::Perldoc::ToTk - let Perldoc use Tk::Pod to render Pod

    SYNOPSIS

        perldoc -o tk Some::Modulename &

    DESCRIPTION

    This is a "plug-in" class that allows Perldoc to use Tk::Pod as a formatter class.

    You have to have installed Tk::Pod first, or this class won't load.

    SEE ALSO

    Tk::Pod, Pod::Perldoc

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>; Adriano R. Ferreira <ferreira@cpan.org>; Sean M. Burke <sburke@cpan.org>; significant portions copied from tkpod in the Tk::Pod dist, by Nick Ing-Simmons, Slaven Rezic, et al.

     
    Pod::Perldoc::ToXml

    NAME

    Pod::Perldoc::ToXml - let Perldoc render Pod as XML

    SYNOPSIS

        perldoc -o xml -d out.xml Some::Modulename

    DESCRIPTION

    This is a "plug-in" class that allows Perldoc to use Pod::Simple::XMLOutStream as a formatter class.

    This is actually a Pod::Simple::XMLOutStream subclass, and inherits all its options.

    You have to have installed Pod::Simple::XMLOutStream (from the Pod::Simple dist), or this class won't work.

    SEE ALSO

    Pod::Simple::XMLOutStream, Pod::Simple, Pod::Perldoc

    COPYRIGHT AND DISCLAIMERS

    Copyright (c) 2002 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Current maintainer: Mark Allen <mallen@cpan.org>

    Past contributions from: brian d foy <bdfoy@cpan.org>, Adriano R. Ferreira <ferreira@cpan.org>, Sean M. Burke <sburke@cpan.org>

     
    PerlIO::encoding

    NAME

    PerlIO::encoding - encoding layer

    SYNOPSIS

        use PerlIO::encoding;
        open($f, "<:encoding(foo)", "infoo");
        open($f, ">:encoding(bar)", "outbar");
        use Encode qw(:fallbacks);
        $PerlIO::encoding::fallback = FB_PERLQQ;

    DESCRIPTION

    This PerlIO layer opens a filehandle with a transparent encoding filter.

    On input, it converts the bytes expected to be in the specified character set and encoding to Perl string data (Unicode and Perl's internal Unicode encoding, UTF-8). On output, it converts Perl string data into the specified character set and encoding.

    When the layer is pushed, the current value of $PerlIO::encoding::fallback is saved and used as the CHECK argument when calling the Encode methods encode() and decode().
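A round trip through the layer shows the conversion in both directions. This sketch writes to and reads from an in-memory scalar instead of a real file so that it is self-contained; the choice of UTF-8 and the sample text are assumptions of the example, and $PerlIO::encoding::fallback is left at its default.

```perl
use strict;
use warnings;
use PerlIO::encoding;

my $text  = "caf\x{e9}\n";    # 5 characters, one of them non-ASCII
my $bytes = '';

# Output: Perl string data is encoded into UTF-8 bytes.
open my $out, '>:encoding(UTF-8)', \$bytes or die "Cannot open: $!";
print {$out} $text;
close $out or die "Cannot close: $!";

# Input: the bytes are decoded back into Perl string data.
open my $in, '<:encoding(UTF-8)', \$bytes or die "Cannot open: $!";
my $decoded = <$in>;
close $in;

printf "%d characters became %d bytes\n", length($text), length($bytes);
```

The accented character takes two bytes in UTF-8, so the byte count is one higher than the character count.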

    SEE ALSO

    open, Encode, binmode, perluniintro

     
    PerlIO::scalar

    NAME

    PerlIO::scalar - in-memory IO, scalar IO

    SYNOPSIS

        my $scalar = '';
        ...
        open my $fh, "<", \$scalar or die;
        open my $fh, ">", \$scalar or die;
        open my $fh, ">>", \$scalar or die;

    or

        my $scalar = '';
        ...
        open my $fh, "<:scalar", \$scalar or die;
        open my $fh, ">:scalar", \$scalar or die;
        open my $fh, ">>:scalar", \$scalar or die;

    DESCRIPTION

    A filehandle is opened but the file operations are performed "in-memory" on a scalar variable. All the normal file operations can be performed on the handle. The scalar is considered a stream of bytes. Currently fileno($fh) returns -1.
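A short sketch of the idea: the same scalar is first written through one handle and then read back through another. The variable names and the sample lines are made up for the example.

```perl
use strict;
use warnings;

my $buffer = '';

# Write to the scalar as if it were a file.
open my $out, '>', \$buffer or die "Cannot open in-memory file: $!";
print  {$out} "line 1\n";
printf {$out} "line %d\n", 2;
close $out or die "Cannot close in-memory file: $!";

# Read the same scalar back, line by line.
open my $in, '<', \$buffer or die "Cannot open in-memory file: $!";
my @lines = <$in>;
close $in;

print "buffer holds ", length($buffer), " bytes in ", scalar(@lines), " lines\n";
```

This is handy for capturing output from code that insists on printing to a filehandle, or for feeding test data to code that insists on reading from one.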

    IMPLEMENTATION NOTE

    PerlIO::scalar only exists to use XSLoader to load C code that provides support for treating a scalar as an "in memory" file. One does not need to explicitly use PerlIO::scalar.

     
    PerlIO::via

    NAME

    PerlIO::via - Helper class for PerlIO layers implemented in perl

    SYNOPSIS

        use PerlIO::via::Layer;
        open($fh,"<:via(Layer)",...);
        use Some::Other::Package;
        open($fh,">:via(Some::Other::Package)",...);

    DESCRIPTION

    The PerlIO::via module allows you to develop PerlIO layers in Perl, without having to go into the nitty gritty of programming C with XS as the interface to Perl.

    One example module, PerlIO::via::QuotedPrint, is included with Perl 5.8.0, and more example modules are available from CPAN, such as PerlIO::via::StripHTML and PerlIO::via::Base64. The PerlIO::via::StripHTML module, for instance, allows you to say:

        use PerlIO::via::StripHTML;
        open( my $fh, "<:via(StripHTML)", "index.html" );
        my @line = <$fh>;

    to obtain the text of an HTML file in an array with all the HTML tags automagically removed.

    Please note that if the layer is created in the PerlIO::via:: namespace, it does not have to be fully qualified. The PerlIO::via module will prefix the PerlIO::via:: namespace if the specified module name does not exist as a fully qualified module name.

    EXPECTED METHODS

    To create a Perl module that implements a PerlIO layer in Perl (as opposed to in C using XS as the interface to Perl), you need to supply some of the following subroutines. It is recommended to create these Perl modules in the PerlIO::via:: namespace, so that they can easily be located on CPAN and use the default namespace feature of the PerlIO::via module itself.

    Please note that this is an area of recent development in Perl and that the interface described here is therefore still subject to change (and hopefully will have better documentation and more examples).

    In the method descriptions below, $fh will be a reference to a glob which can be treated as a Perl filehandle. It refers to the layer below. $fh is not passed if the layer is at the bottom of the stack; for this reason, and to maintain some level of "compatibility" with TIEHANDLE classes, it is passed last.

    • $class->PUSHED([$mode,[$fh]])

      Should return an object or the class, or -1 on failure. (Compare TIEHANDLE.) The arguments are an optional mode string ("r", "w", "w+", ...) and a filehandle for the PerlIO layer below. Mandatory.

      When the layer is pushed as part of an open call, PUSHED will be called before the actual open occurs, whether that be via OPEN, SYSOPEN, FDOPEN or by letting a lower layer do the open.

    • $obj->POPPED([$fh])

      Optional - called when the layer is about to be removed.

    • $obj->UTF8($belowFlag,[$fh])

      Optional - if present it will be called immediately after PUSHED has returned. It should return a true value if the layer expects data to be UTF-8 encoded. If it returns true, the result is as if the caller had done

          ":via(YourClass):utf8"

      If not present or if it returns false, then the stream is left with the UTF-8 flag clear. The $belowFlag argument will be true if there is a layer below and that layer was expecting UTF-8.

    • $obj->OPEN($path,$mode,[$fh])

      Optional - if not present a lower layer does the open. If present, called for normal opens after the layer is pushed. This function is subject to change as there is no easy way to get a lower layer to do the open and then regain control.

    • $obj->BINMODE([$fh])

      Optional - if not present the layer is popped on binmode($fh) or when :raw is pushed. If present it should return 0 on success, -1 on error, or undef to pop the layer.

    • $obj->FDOPEN($fd,[$fh])

      Optional - if not present a lower layer does the open. If present, called after the layer is pushed for opens which pass a numeric file descriptor. This function is subject to change as there is no easy way to get a lower layer to do the open and then regain control.

    • $obj->SYSOPEN($path,$imode,$perm,[$fh])

      Optional - if not present a lower layer does the open. If present, called after the layer is pushed for sysopen style opens which pass a numeric mode and permissions. This function is subject to change as there is no easy way to get a lower layer to do the open and then regain control.

    • $obj->FILENO($fh)

      Returns a numeric value for a Unix-like file descriptor. Returns -1 if there isn't one. Optional. Default is fileno($fh).

    • $obj->READ($buffer,$len,$fh)

      Returns the number of octets placed in $buffer (must be less than or equal to $len). Optional. Default is to use FILL instead.

    • $obj->WRITE($buffer,$fh)

      Returns the number of octets from $buffer that have been successfully written.

    • $obj->FILL($fh)

      Should return a string to be placed in the buffer. Optional. If not provided, must provide READ or reject handles open for reading in PUSHED.

    • $obj->CLOSE($fh)

      Should return 0 on success, -1 on error. Optional.

    • $obj->SEEK($posn,$whence,$fh)

      Should return 0 on success, -1 on error. Optional. Default is to fail, but that is likely to be changed in future.

    • $obj->TELL($fh)

      Returns file position. Optional. Default to be determined.

    • $obj->UNREAD($buffer,$fh)

      Returns the number of octets from $buffer that have been successfully saved to be returned on future FILL/READ calls. Optional. Default is to push data into a temporary layer above this one.

    • $obj->FLUSH($fh)

      Flush any buffered write data. May possibly be called on readable handles too. Should return 0 on success, -1 on error.

    • $obj->SETLINEBUF($fh)

      Optional. No return.

    • $obj->CLEARERR($fh)

      Optional. No return.

    • $obj->ERROR($fh)

      Optional. Returns error state. Default is no error until a mechanism to signal error (die?) is worked out.

    • $obj->EOF($fh)

      Optional. Returns end-of-file state. Default is a function of the return value of FILL or READ.
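
    To make the method list above more concrete, here is a minimal sketch of a layer that supplies only PUSHED, FILL and WRITE. The PerlIO::via::Upcase package is hypothetical and exists only for this illustration; it is defined inline rather than loaded from a file, and a temporary file stands in for real data.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical layer: upper-cases data on its way in and out.
package PerlIO::via::Upcase;

sub PUSHED {
    my ($class, $mode, $fh) = @_;
    my $state = '';                  # no state is needed; any blessed ref will do
    return bless \$state, $class;
}

sub FILL {
    my ($obj, $fh) = @_;
    my $line = <$fh>;                # read from the layer below
    return defined($line) ? uc($line) : undef;
}

sub WRITE {
    my ($obj, $buf, $fh) = @_;
    print {$fh} uc($buf) or return -1;   # hand transformed data to the layer below
    return length($buf);
}

package main;

my (undef, $path) = tempfile(UNLINK => 1);

# Write through the layer.
open(my $out, '>:via(Upcase)', $path) or die "Cannot open for writing: $!";
print {$out} "hello, layer\n";
close($out) or die "Cannot close: $!";

# Read the file back without the layer to see what was stored.
open(my $raw, '<', $path) or die "Cannot reopen: $!";
my $stored = <$raw>;
close($raw);

print $stored;
```

    Since WRITE passes its data straight to the layer below, this sketch needs no FLUSH method; a layer that buffers output, as the Hex example further down does, has to provide one.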

    EXAMPLES

    Check the PerlIO::via:: namespace on CPAN for examples of PerlIO layers implemented in Perl. To give you an idea how simple the implementation of a PerlIO layer can look, a simple example is included here.

    Example - a Hexadecimal Handle

    Given the following module, PerlIO::via::Hex :

        package PerlIO::via::Hex;

        sub PUSHED
        {
            my ($class,$mode,$fh) = @_;
            # When writing we buffer the data
            my $buf = '';
            return bless \$buf,$class;
        }

        sub FILL
        {
            my ($obj,$fh) = @_;
            my $line = <$fh>;
            return (defined $line) ? pack("H*", $line) : undef;
        }

        sub WRITE
        {
            my ($obj,$buf,$fh) = @_;
            $$obj .= unpack("H*", $buf);
            return length($buf);
        }

        sub FLUSH
        {
            my ($obj,$fh) = @_;
            print $fh $$obj or return -1;
            $$obj = '';
            return 0;
        }

        1;

    The following code opens up an output handle that will convert any output to a hexadecimal dump of the output bytes. For example, "A" will be converted to "41" on ASCII-based machines; on EBCDIC platforms, "A" will become "c1".

        use PerlIO::via::Hex;
        open(my $fh, ">:via(Hex)", "foo.hex");

    The following code will read the hexdump in and convert it on the fly back into bytes:

        open(my $fh, "<:via(Hex)", "foo.hex");
     

    PerlIO::via::QuotedPrint

    NAME

    PerlIO::via::QuotedPrint - PerlIO layer for quoted-printable strings

    SYNOPSIS

        use PerlIO::via::QuotedPrint;

        open( my $in, '<:via(QuotedPrint)', 'file.qp' )
          or die "Can't open file.qp for reading: $!\n";

        open( my $out, '>:via(QuotedPrint)', 'file.qp' )
          or die "Can't open file.qp for writing: $!\n";

    VERSION

    This documentation describes version 0.07.

    DESCRIPTION

    This module implements a PerlIO layer that works on files encoded in the quoted-printable format. It will decode from quoted-printable while reading from a handle, and it will encode as quoted-printable while writing to a handle.
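
    The round trip described above can be sketched as follows. The sample text and the use of a temporary file are assumptions made for the illustration; the example writes through the layer, inspects the raw encoded file, and then reads it back through the layer.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use PerlIO::via::QuotedPrint;

my (undef, $path) = tempfile(UNLINK => 1);
my $text = "caf\xe9 = coffee\n";    # one non-ASCII byte and a literal '='

# Encode while writing.
open(my $out, '>:via(QuotedPrint)', $path) or die "Cannot write $path: $!";
print {$out} $text;
close($out) or die "Cannot close $path: $!";

# Look at what actually landed on disk.
open(my $raw, '<', $path) or die "Cannot read $path: $!";
my $encoded = do { local $/; <$raw> };
close($raw);

# Decode while reading.
open(my $in, '<:via(QuotedPrint)', $path) or die "Cannot read $path: $!";
my $decoded = do { local $/; <$in> };
close($in);

print "on disk: $encoded";
print "round trip ", ($decoded eq $text ? "ok" : "FAILED"), "\n";
```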

    REQUIRED MODULES

        MIME::QuotedPrint (any)

    SEE ALSO

    PerlIO::via, MIME::QuotedPrint, PerlIO::via::Base64, PerlIO::via::MD5, PerlIO::via::StripHTML, PerlIO::via::Rotate.

    ACKNOWLEDGEMENTS

    Based on an example that was initially added to MIME::QuotedPrint.pm for the 5.8.0 distribution of Perl.

    COPYRIGHT

    Copyright (c) 2002, 2003, 2004, 2012 Elizabeth Mattijsen. All rights reserved. This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    Parse::CPAN::Meta

    NAME

    Parse::CPAN::Meta - Parse META.yml and META.json CPAN metadata files

    SYNOPSIS

        #############################################
        # In your file

        ---
        name: My-Distribution
        version: 1.23
        resources:
          homepage: "http://example.com/dist/My-Distribution"

        #############################################
        # In your program

        use Parse::CPAN::Meta;

        my $distmeta = Parse::CPAN::Meta->load_file('META.yml');

        # Reading properties
        my $name     = $distmeta->{name};
        my $version  = $distmeta->{version};
        my $homepage = $distmeta->{resources}{homepage};

    DESCRIPTION

    Parse::CPAN::Meta is a parser for META.json and META.yml files, using JSON::PP and/or CPAN::Meta::YAML.

    Parse::CPAN::Meta provides three methods: load_file, load_json_string, and load_yaml_string. These will read and deserialize CPAN metafiles, and are described below in detail.

    Parse::CPAN::Meta provides a legacy API of only two functions, based on the YAML functions of the same name. Wherever possible, identical calling semantics are used. These may only be used with YAML sources.

    All error reporting is done with exceptions (die'ing).

    Note that META files are expected to be in UTF-8 encoding only. When converting string data, it must first be decoded from UTF-8.

    METHODS

    load_file

        my $metadata_structure = Parse::CPAN::Meta->load_file('META.json');

        my $metadata_structure = Parse::CPAN::Meta->load_file('META.yml');

    This method will read the named file and deserialize it to a data structure, determining whether it should be JSON or YAML based on the filename. On Perl 5.8.1 or later, the file will be read using the ":utf8" IO layer.

    load_yaml_string

        my $metadata_structure = Parse::CPAN::Meta->load_yaml_string($yaml_string);

    This method deserializes the given string of YAML and returns the first document in it. (CPAN metadata files should always have only one document.) If the source was UTF-8 encoded, the string must be decoded before calling load_yaml_string.

    load_json_string

        my $metadata_structure = Parse::CPAN::Meta->load_json_string($json_string);

    This method deserializes the given string of JSON and returns the result. If the source was UTF-8 encoded, the string must be decoded before calling load_json_string.
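
    As an illustration of the two string methods, the sketch below feeds the same metadata to each of them, once as YAML and once as JSON, and reads the same properties from both results. The metadata is the sample from the SYNOPSIS; the heredoc and variable names are made up for this example.

```perl
use strict;
use warnings;
use Parse::CPAN::Meta;

my $yaml = <<'END_YAML';
---
name: My-Distribution
version: 1.23
resources:
  homepage: "http://example.com/dist/My-Distribution"
END_YAML

my $json = <<'END_JSON';
{
  "name": "My-Distribution",
  "version": "1.23",
  "resources": { "homepage": "http://example.com/dist/My-Distribution" }
}
END_JSON

# Both methods die on malformed input, so wrap them in eval if that matters.
my $from_yaml = Parse::CPAN::Meta->load_yaml_string($yaml);
my $from_json = Parse::CPAN::Meta->load_json_string($json);

for my $meta ($from_yaml, $from_json) {
    print "$meta->{name} $meta->{version} $meta->{resources}{homepage}\n";
}
```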

    yaml_backend

        my $backend = Parse::CPAN::Meta->yaml_backend;

    Returns the module name of the YAML serializer. See ENVIRONMENT for details.

    json_backend

        my $backend = Parse::CPAN::Meta->json_backend;

    Returns the module name of the JSON serializer. This will either be JSON::PP or JSON. Even if PERL_JSON_BACKEND is set, this will return JSON as further delegation is handled by the JSON module. See ENVIRONMENT for details.

    FUNCTIONS

    For maintenance clarity, no functions are exported. These functions are available for backwards compatibility only and are best avoided in favor of load_file .

    Load

        my @yaml = Parse::CPAN::Meta::Load( $string );

    Parses a string containing a valid YAML stream into a list of Perl data structures.

    LoadFile

        my @yaml = Parse::CPAN::Meta::LoadFile( 'META.yml' );

    Reads the YAML stream from a file instead of a string.

    ENVIRONMENT

    PERL_JSON_BACKEND

    By default, JSON::PP will be used for deserializing JSON data. If the PERL_JSON_BACKEND environment variable exists, is true and is not "JSON::PP", then the JSON module (version 2.5 or greater) will be loaded and used to interpret PERL_JSON_BACKEND. If JSON is not installed or is too old, an exception will be thrown.

    PERL_YAML_BACKEND

    By default, CPAN::Meta::YAML will be used for deserializing YAML data. If the PERL_YAML_BACKEND environment variable is defined, then it is interpreted as a module to use for deserialization. The given module must be installed, must load correctly and must implement the Load() function or an exception will be thrown.

    SUPPORT

    Bugs should be reported via the CPAN bug tracker at

    http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Parse-CPAN-Meta

    AUTHOR

    Adam Kennedy <adamk@cpan.org>

    COPYRIGHT

    Copyright 2006 - 2010 Adam Kennedy.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    The full text of the license can be found in the LICENSE file included with this module.

     

    Params::Check

    NAME

    Params::Check - A generic input parsing/checking mechanism.

    SYNOPSIS

        use Params::Check qw[check allow last_error];

        sub fill_personal_info {
            my %hash = @_;
            my $x;

            my $tmpl = {
                firstname   => { required   => 1, defined => 1 },
                lastname    => { required   => 1, store => \$x },
                gender      => { required   => 1,
                                 allow      => [qr/M/i, qr/F/i],
                               },
                married     => { allow      => [0,1] },
                age         => { default    => 21,
                                 allow      => qr/^\d+$/,
                               },
                phone       => { allow => [ sub { return 1 if $_[0] =~ /$valid_re/ },
                                            '1-800-PERL' ]
                               },
                id_list     => { default        => [],
                                 strict_type    => 1
                               },
                employer    => { default => 'NSA', no_override => 1 },
            };

            ### check() returns a hashref of parsed args on success ###
            my $parsed_args = check( $tmpl, \%hash, $VERBOSE )
                                or die q[Could not parse arguments!];

            ... other code here ...
        }

        my $ok = allow( $colour, [qw|blue green yellow|] );

        my $error = Params::Check::last_error();

    DESCRIPTION

    Params::Check is a generic input parsing/checking mechanism.

    It allows you to validate input via a template. The only requirement is that the arguments must be named.

    Params::Check can do the following things for you:

    • Convert all keys to lowercase

    • Check if all required arguments have been provided

    • Set arguments that have not been provided to the default

    • Weed out arguments that are not supported and warn about them to the user

    • Validate the arguments given by the user based on strings, regexes, lists or even subroutines

    • Enforce type integrity if required

    Most of Params::Check's power comes from its template, which we'll discuss below:

    Template

    As you can see in the synopsis, based on your template, the arguments provided will be validated.

    The template can take a different set of rules per key that is used.

    The following rules are available:

    • default

      This is the default value if none was provided by the user. This is also the type strict_type will look at when checking type integrity (see below).

    • required

      A boolean flag that indicates if this argument was a required argument. If marked as required and not provided, check() will fail.

    • strict_type

      This does a ref() check on the argument provided. The ref of the argument must be the same as the ref of the default value for this check to pass.

      This is very useful if you insist on taking an array reference as argument for example.

    • defined

      If this template key is true, enforces that if this key is provided by user input, its value is defined. This just means that the user is not allowed to pass undef as a value for this key and is equivalent to: allow => sub { defined $_[0] && OTHER TESTS }

    • no_override

      This allows you to specify constants in your template, i.e., keys that are not allowed to be altered by the user. It pretty much allows you to keep all your configurable data in one place: the Params::Check template.

    • store

      This allows you to pass a reference to a scalar, in which the data will be stored:

          my $x;
          my $args = check( { foo => { default => 1, store => \$x } }, $input );

      This is basically shorthand for saying:

          my $args = check( { foo => { default => 1 } }, $input );
          my $x    = $args->{foo};

      You can alter the global variable $Params::Check::NO_DUPLICATES to control whether the store'd key will still be present in your result set. See the Global Variables section below.

    • allow

      A set of criteria used to validate a particular piece of data if it has to adhere to particular rules.

      See the allow() function for details.
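
    The rules above can be seen working together in the following minimal sketch. The template is a trimmed version of the one in the SYNOPSIS, and the input values are invented for the illustration. Verbosity is switched off explicitly so that the failure is reported through last_error() rather than as a warning.

```perl
use strict;
use warnings;
use Params::Check qw[check last_error];

my $surname;    # filled in through the 'store' rule

my $tmpl = {
    firstname => { required => 1, defined => 1 },
    lastname  => { required => 1, store => \$surname },
    age       => { default  => 21, allow => qr/^\d+$/ },
    id_list   => { default  => [], strict_type => 1 },
    employer  => { default  => 'NSA', no_override => 1 },
};

# Keys are lower-cased by default, so 'FirstName' matches 'firstname'.
my $ok = check( $tmpl, { FirstName => 'Ada', lastname => 'Lovelace' }, 0 );

print "age defaulted to $ok->{age}, surname stored as $surname\n";

# A missing required key makes check() fail; the reason is kept for last_error().
my $bad = check( $tmpl, { firstname => 'Ada' }, 0 );
my $why = last_error();

print "second call failed: $why" unless $bad;
```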

    Functions

    check( \%tmpl, \%args, [$verbose] );

    This function is not exported by default, so you'll have to ask for it via:

        use Params::Check qw[check];

    or use its fully qualified name instead.

    check takes a list of arguments, as follows:

    • Template

      This is a hash reference which contains a template as explained in the SYNOPSIS and Template section.

    • Arguments

      This is a reference to a hash of named arguments which need checking.

    • Verbose

      A boolean to indicate whether check should be verbose and warn about what went wrong in a check or not.

      You can enable this program wide by setting the package variable $Params::Check::VERBOSE to a true value. For details, see the section on Global Variables below.

    check will return false when it fails, or a hashref with lowercase keys of parsed arguments when it succeeds.

    So a typical call to check would look like this:

        my $parsed = check( \%template, \%arguments, $VERBOSE )
                        or warn q[Arguments could not be parsed!];

    A lot of the behaviour of check() can be altered by setting package variables. See the section on Global Variables for details on this.

    allow( $test_me, \@criteria );

    The function that handles the allow key in the template is also available for independent use.

    The function takes as first argument a key to test against, and as second argument any form of criteria that are also allowed by the allow key in the template.

    You can use the following types of values for allow:

    • string

      The provided argument MUST be equal to the string for the validation to pass.

    • regexp

      The provided argument MUST match the regular expression for the validation to pass.

    • subroutine

      The provided subroutine MUST return true in order for the validation to pass and the argument accepted.

      (This is particularly useful for more complicated data).

    • array ref

      The provided argument MUST equal one of the elements of the array ref for the validation to pass. An array ref can hold all the above values.

    It returns true if the key matched the criteria, or false otherwise.
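
    The four kinds of criteria can be tried out directly with allow(), as in this small sketch. The test values are arbitrary. Note that the result is evaluated in boolean context here, which is the safest way to use the return value.

```perl
use strict;
use warnings;
use Params::Check qw[allow];

# Each case: label, value to test, criteria to test it against.
my @cases = (
    [ 'string', 'blue',  'blue' ],
    [ 'regexp', '42',    qr/^\d+$/ ],
    [ 'sub',    7,       sub { $_[0] % 2 } ],            # odd numbers only
    [ 'list',   'green', [ qw|blue green yellow| ] ],
    [ 'mixed',  'x-1',   [ 'none', qr/^x-\d+$/ ] ],      # array ref holding other criteria
    [ 'reject', 'red',   [ qw|blue green yellow| ] ],
);

my %verdict;
for my $case (@cases) {
    my ($label, $value, $criteria) = @$case;
    $verdict{$label} = allow( $value, $criteria ) ? 'pass' : 'fail';
    printf "%-8s %s\n", $label, $verdict{$label};
}
```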

    last_error()

    Returns a string containing all warnings and errors reported during the last time check was called.

    This is useful if you want to report them some other way than carp'ing when the verbose flag is on.

    It is exported upon request.

    Global Variables

    The behaviour of Params::Check can be altered by changing the following global variables:

    $Params::Check::VERBOSE

    This controls whether Params::Check will issue warnings and explanations as to why certain things may have failed. If you set it to 0, Params::Check will not output any warnings.

    The default is 1 when warnings are enabled, 0 otherwise;

    $Params::Check::STRICT_TYPE

    This works like the strict_type option you can pass to check, which will turn on strict_type globally for all calls to check.

    The default is 0;

    $Params::Check::ALLOW_UNKNOWN

    If you set this flag, unknown options will still be present in the return value, rather than filtered out. This is useful if your subroutine is only interested in a few arguments, and wants to pass the rest on blindly to perhaps another subroutine.

    The default is 0;

    $Params::Check::STRIP_LEADING_DASHES

    If you set this flag, all keys passed in the following manner:

        function( -key => 'val' );

    will have their leading dashes stripped.
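
    As a sketch of how these globals are typically used, the following localises two of them inside a wrapper so that the changed behaviour stays confined to a single call. The parse_loosely helper and the argument names are hypothetical.

```perl
use strict;
use warnings;
use Params::Check qw[check];

my $tmpl = { name => { required => 1 } };

# Hypothetical helper: accepts -dashed keys and passes unknown keys through.
sub parse_loosely {
    my %args = @_;

    # 'local' restores the previous values when this sub returns.
    local $Params::Check::ALLOW_UNKNOWN        = 1;
    local $Params::Check::STRIP_LEADING_DASHES = 1;

    return check( $tmpl, \%args, 0 );
}

my $loose  = parse_loosely( -name => 'Ada', colour => 'blue' );

# With the defaults, '-name' is not recognised, so the required key is missing.
my $strict = check( $tmpl, { -name => 'Ada', colour => 'blue' }, 0 );

print "loose:  ", join( ', ', map { "$_=$loose->{$_}" } sort keys %$loose ), "\n";
print "strict: ", ( $strict ? 'parsed' : 'rejected' ), "\n";
```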

    $Params::Check::NO_DUPLICATES

    If set to true, all keys in the template that are marked as to be stored in a scalar, will also be removed from the result set.

    Default is false, meaning that when you use store as a template key, check will put it both in the scalar you supplied, as well as in the hashref it returns.

    $Params::Check::PRESERVE_CASE

    If set to true, Params::Check will no longer convert all keys from the user input to lowercase, but instead expect them to be in the case the template provided. This is useful when you want to use similar keys with different casing in your templates.

    Understand that this removes the case-insensitivity feature of this module.

    Default is 0;

    $Params::Check::ONLY_ALLOW_DEFINED

    If set to true, Params::Check will require all values passed to be defined. If you wish to enable this on a 'per key' basis, use the template option defined instead.

    Default is 0;

    $Params::Check::SANITY_CHECK_TEMPLATE

    If set to true, Params::Check will sanity check templates, validating for errors and unknown keys. Although very useful for debugging, this can be somewhat slow in hot-code and large loops.

    To disable this check, set this variable to false.

    Default is 1;

    $Params::Check::WARNINGS_FATAL

    If set to true, Params::Check will croak when an error during template validation occurs, rather than return false.

    Default is 0;

    $Params::Check::CALLER_DEPTH

    This global modifies the argument given to caller() by Params::Check::check() and is useful if you have a custom wrapper function around Params::Check::check(). The value must be an integer, indicating the number of wrapper functions inserted between the real function call and Params::Check::check().

    Example wrapper function, using a custom stacktrace:

        sub check {
            my ($template, $args_in) = @_;

            local $Params::Check::WARNINGS_FATAL = 1;
            local $Params::Check::CALLER_DEPTH = $Params::Check::CALLER_DEPTH + 1;
            my $args_out = Params::Check::check($template, $args_in);

            my_stacktrace(Params::Check::last_error) unless $args_out;

            return $args_out;
        }

    Default is 0;

    Acknowledgements

    Thanks to Richard Soderberg for his performance improvements.

    BUG REPORTS

    Please report bugs or other issues to <bug-params-check@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     

    Package::Constants

    NAME

    Package::Constants - List all constants declared in a package

    SYNOPSIS

        use Package::Constants;

        ### list the names of all constants in a given package;
        @const = Package::Constants->list( __PACKAGE__ );
        @const = Package::Constants->list( 'main' );

        ### enable debugging output
        $Package::Constants::DEBUG = 1;

    DESCRIPTION

    Package::Constants lists all the constants defined in a certain package. This can be useful for, among others, setting up an autogenerated @EXPORT/@EXPORT_OK for a Constants.pm file.

    CLASS METHODS

    @const = Package::Constants->list( PACKAGE_NAME );

    Lists the names of all the constants defined in the provided package.

    GLOBAL VARIABLES

    $Package::Constants::DEBUG

    When set to true, prints out debug information to STDERR about the package it is inspecting. Helps to identify issues when the results are not as you expect.

    Defaults to false.

    BUG REPORTS

    Please report bugs or other issues to <bug-package-constants@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     

    Object::Accessor

    NAME

    Object::Accessor - interface to create per object accessors

    SYNOPSIS

        ### using the object
        $obj = Object::Accessor->new;        # create object
        $obj = Object::Accessor->new(@list); # create object with accessors
        $obj = Object::Accessor->new(\%h);   # create object with accessors
                                             # and their allow handlers

        $bool   = $obj->mk_accessors('foo'); # create accessors
        $bool   = $obj->mk_accessors(        # create accessors with input
                   {foo => ALLOW_HANDLER} ); # validation

        $bool   = $obj->mk_aliases(          # create an alias to an existing
                    alias_name => 'method'); # method name

        $clone  = $obj->mk_clone;            # create a clone of original
                                             # object without data
        $bool   = $obj->mk_flush;            # clean out all data

        @list   = $obj->ls_accessors;        # retrieves a list of all
                                             # accessors for this object

        $bar    = $obj->foo('bar');          # set 'foo' to 'bar'
        $bar    = $obj->foo();               # retrieve 'bar' again

        $sub    = $obj->can('foo');          # retrieve coderef for
                                             # 'foo' accessor
        $bar    = $sub->('bar');             # set 'foo' via coderef
        $bar    = $sub->();                  # retrieve 'bar' by coderef

        ### using the object as base class
        package My::Class;
        use base 'Object::Accessor';

        $obj    = My::Class->new;               # create base object
        $bool   = $obj->mk_accessors('foo');    # create accessors, etc...

        ### make all attempted access to non-existent accessors fatal
        ### (defaults to false)
        $Object::Accessor::FATAL = 1;

        ### enable debugging
        $Object::Accessor::DEBUG = 1;

        ### advanced usage -- callbacks
        {   my $obj = Object::Accessor->new('foo');
            $obj->register_callback( sub { ... } );

            $obj->foo( 1 ); # these calls invoke the callback you registered
            $obj->foo();    # which allows you to change the get/set
                            # behaviour and what is returned to the caller.
        }

        ### advanced usage -- lvalue attributes
        {   my $obj = Object::Accessor::Lvalue->new('foo');
            print $obj->foo = 1;            # will print 1
        }

        ### advanced usage -- scoped attribute values
        {   my $obj = Object::Accessor->new('foo');

            $obj->foo( 1 );
            print $obj->foo;                # will print 1

            ### bind the scope of the value of attribute 'foo'
            ### to the scope of '$x' -- when $x goes out of
            ### scope, 'foo's previous value will be restored
            {   $obj->foo( 2 => \my $x );
                print $obj->foo, ' ', $x;   # will print '2 2'
            }
            print $obj->foo;                # will print 1
        }

    DESCRIPTION

    Object::Accessor provides an interface to create per object accessors (as opposed to per Class accessors, as, for example, Class::Accessor provides).

    You can choose to either subclass this module, and thus using its accessors on your own module, or to store an Object::Accessor object inside your own object, and access the accessors from there. See the SYNOPSIS for examples.

    METHODS

    $object = Object::Accessor->new( [ARGS] );

    Creates a new (and empty) Object::Accessor object. This method is inheritable.

    Any arguments given to new are passed straight to mk_accessors.

    If you want to be able to assign to your accessors as if they were lvalues, you should create your object in the Object::Accessor::Lvalue namespace instead. See the section on LVALUE ACCESSORS below.

    $bool = $object->mk_accessors( @ACCESSORS | \%ACCESSOR_MAP );

    Creates a list of accessors for this object (and NOT for other ones in the same class!). Will not clobber existing data, so if an accessor already exists, requesting to create again is effectively a no-op.

    When providing a hashref as argument, rather than a normal list, you can specify a list of key/value pairs of accessors and their respective input validators. The validators can be anything that Params::Check's allow function accepts. Please see its manpage for details.

    For example:

        $object->mk_accessors( {
            foo     => qr/^\d+$/,       # digits only
            bar     => [0,1],           # booleans
            zot     => \&my_sub         # a custom verification sub
        } );

    Returns true on success, false on failure.
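
    A minimal sketch of validated accessors in use follows, assuming Object::Accessor is available on your system. The accessor names are invented for the example. A rejected value produces a warning, which the sketch collects in an array instead of letting it reach STDERR.

```perl
use strict;
use warnings;
use Object::Accessor;

my $obj = Object::Accessor->new;
$obj->mk_accessors( {
    count => qr/^\d+$/,       # digits only
    flag  => [ 0, 1 ],        # booleans
} );

$obj->count(42);              # accepted by the validator
$obj->flag(1);                # accepted by the validator

# Collect the warning issued for a rejected value.
my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    $obj->count('lots');      # rejected: not digits
}

my $count = $obj->count;      # the earlier value is still in place
print "count is $count after ", scalar(@warnings), " rejected assignment(s)\n";
```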

    Calling an accessor that does not exist on an object returns undef by default, but you can make this a fatal error by setting the global variable $FATAL to true. See the section on GLOBAL VARIABLES for details.

    Note that you can bind the values of attributes to a scope. This allows you to temporarily change the value of an attribute, and have its original value restored at the end of its bound variable's scope.

    For example, in this snippet of code, the attribute foo will temporarily be set to 2, until the end of the scope of $x, at which point the original value of 1 will be restored.

        my $obj = Object::Accessor->new;

        $obj->mk_accessors('foo');
        $obj->foo( 1 );
        print $obj->foo;                # will print 1

        ### bind the scope of the value of attribute 'foo'
        ### to the scope of '$x' -- when $x goes out of
        ### scope, 'foo's previous value will be restored
        {   $obj->foo( 2 => \my $x );
            print $obj->foo, ' ', $x;   # will print '2 2'
        }
        print $obj->foo;                # will print 1

    Note that all accessors are read/write for everyone. See the TODO section for details.

    @list = $self->ls_accessors;

    Returns a list of accessors that are supported by the current object. The corresponding coderefs can be retrieved by passing this list one by one to the can method.

    $ref = $self->ls_allow(KEY)

    Returns the allow handler for the given key, which can be used with Params::Check's allow() handler. If there was no allow handler specified, an allow handler that always returns true will be returned.

    $bool = $self->mk_aliases( alias => method, [alias2 => method2, ...] );

    Creates an alias for a given method name. For all intents and purposes, these two accessors are now identical for this object. This is akin to doing the following on the symbol table level:

        *alias = *method

    This allows you to do the following:

        $self->mk_accessors('foo');
        $self->mk_aliases( bar => 'foo' );

        $self->bar( 42 );
        print $self->foo; # will print 42

    $clone = $self->mk_clone;

    Makes a clone of the current object, which will have the exact same accessors as the current object, but without the data stored in them.

    $bool = $self->mk_flush;

    Flushes all the data from the current object; all accessors will be set back to their default state of undef.

    Returns true on success and false on failure.

    $bool = $self->mk_verify;

    Checks if all values in the current object are in accordance with their own allow handler. Specifically useful to check if an empty initialised object has been filled with values satisfying their own allow criteria.

    $bool = $self->register_callback( sub { ... } );

    This method allows you to register a callback, that is invoked every time an accessor is called. This allows you to munge input data, access external data stores, etc.

    You are free to return whatever you wish. On a set call, the data is even stored in the object.

    Below is an example of the use of a callback.

    $object->some_method( "some_value" );

    my $callback = sub {
        my $self = shift;   # the object
        my $meth = shift;   # "some_method"
        my $val  = shift;   # ["some_value"]
                            # could be undef -- check 'exists';
                            # if scalar @$val is empty, it was a 'get'

        # your code here

        return $new_val;    # the value you want to be set/returned
    };

    To access the values stored in the object, circumventing the callback structure, you should use the ___get and ___set methods documented further down.

    $bool = $self->can( METHOD_NAME )

    This method overrides UNIVERSAL::can in order to provide coderefs to accessors which are loaded on demand. It will behave just like UNIVERSAL::can where it can -- returning a class method if it exists, or a closure pointing to a valid accessor of this particular object.

    You can use it as follows:

    $sub = $object->can('some_accessor');   # retrieve the coderef
    $sub->('foo');                          # 'some_accessor' now set
                                            # to 'foo' for $object
    $foo = $sub->();                        # retrieve the contents
                                            # of 'some_accessor'

    See the SYNOPSIS for more examples.

    $val = $self->___get( METHOD_NAME );

    Method to directly access the value of the given accessor in the object. It circumvents all calls to allow checks, callbacks, etc.

    Use only if you Know What You Are Doing ! General usage for this functionality would be in your own custom callbacks.

    $bool = $self->___set( METHOD_NAME => VALUE );

    Method to directly set the value of the given accessor in the object. It circumvents all calls to allow checks, callbacks, etc.

    Use only if you Know What You Are Doing ! General usage for this functionality would be in your own custom callbacks.

    $bool = $self->___alias( ALIAS => METHOD );

    Method to directly alias one accessor to another for this object. It circumvents all sanity checks, etc.

    Use only if you Know What You Are Doing !

    LVALUE ACCESSORS

    Object::Accessor supports lvalue attributes as well. To enable these, you should create your objects in the designated namespace, Object::Accessor::Lvalue . For example:

    my $obj = Object::Accessor::Lvalue->new('foo');
    $obj->foo += 1;
    print $obj->foo;

    will actually print 1 and work as expected. Since this is an optional feature that is not desirable in all cases, we require you to explicitly use the Object::Accessor::Lvalue class.

    Doing the same on the standard Object::Accessor class would generate the following code & errors:

    my $obj = Object::Accessor->new('foo');
    $obj->foo += 1;

    Can't modify non-lvalue subroutine call

    Note that lvalue support on AUTOLOAD routines is a perl 5.8.x feature. See perldoc perl58delta for details.

    CAVEATS

    • Allow handlers

      Due to the nature of lvalue subs, we never get access to the value you are assigning, so we cannot check it against your allow handler. Allow handlers are therefore unsupported under lvalue conditions.

      See perldoc perlsub for details.

    • Callbacks

      Due to the nature of lvalue subs, we never get access to the value you are assigning, so we cannot provide this value to your callback. Furthermore, we cannot distinguish between a get and a set call. Callbacks are therefore unsupported under lvalue conditions.

      See perldoc perlsub for details.

    GLOBAL VARIABLES

    $Object::Accessor::FATAL

    Set this variable to true to make all attempted access to non-existent accessors be fatal. This defaults to false .

    $Object::Accessor::DEBUG

    Set this variable to enable debugging output. This defaults to false .

    TODO

    Create read-only accessors

    Currently all accessors are read/write for everyone. Perhaps a future release should make it possible to have read-only accessors as well.

    CAVEATS

    If you use codereferences for your allow handlers, you will not be able to freeze the data structures using Storable .

    Due to a bug in storable (until at least version 2.15), qr// compiled regexes also don't de-serialize properly. Although this bug has been reported, you should be aware of this issue when serializing your objects.

    You can track the bug here:

    http://rt.cpan.org/Ticket/Display.html?id=1827

    BUG REPORTS

    Please report bugs or other issues to <bug-object-accessor@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     

    Net::Cmd

    NAME

    Net::Cmd - Network Command class (as used by FTP, SMTP etc)

    SYNOPSIS

    use Net::Cmd;
    @ISA = qw(Net::Cmd);

    DESCRIPTION

    Net::Cmd is a collection of methods that can be inherited by a sub class of IO::Handle . These methods implement the functionality required for a command based protocol, for example FTP and SMTP.

    USER METHODS

    These methods provide a user interface to the Net::Cmd object.

    • debug ( VALUE )

      Set the level of debug information for this object. If VALUE is not given then the current state is returned. Otherwise the state is changed to VALUE and the previous state returned.

      Different packages may implement different levels of debug but a non-zero value results in copies of all commands and responses also being sent to STDERR.

      If VALUE is undef then the debug level will be set to the default debug level for the class.

      This method can also be called as a static method to set/get the default debug level for a given class.

    • message ()

      Returns the text message returned from the last command

    • code ()

      Returns the 3-digit code from the last command. If a command is pending then the value 0 is returned

    • ok ()

      Returns non-zero if the last code value was greater than zero and less than 400. This holds true for most command servers. Servers where this does not hold may override this method.

    • status ()

      Returns the most significant digit of the current status code. If a command is pending then CMD_PENDING is returned.

    • datasend ( DATA )

      Send data to the remote server, converting LF to CRLF. Any line starting with a '.' will be prefixed with another '.'. DATA may be an array or a reference to an array.

    • dataend ()

      End the sending of data to the remote server. This is done by ensuring that the data already sent ends with CRLF then sending '.CRLF' to end the transmission. Once this data has been sent dataend calls response and returns true if response returns CMD_OK.
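The transformations described for datasend and dataend can be sketched in plain Perl. This is an illustration of the wire convention only, not Net::Cmd's own code; the helper names dot_stuff and end_of_data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Net::Cmd): apply the two conversions
# described for datasend to a block of text.
sub dot_stuff {
    my ($text) = @_;
    $text =~ s/\r?\n/\r\n/g;    # normalise every line ending to CRLF
    $text =~ s/^\./../mg;       # prefix a leading '.' with another '.'
    return $text;
}

# Hypothetical helper: the terminator described for dataend, preceded by
# CRLF when the data sent so far does not already end with one.
sub end_of_data {
    my ($sent) = @_;
    my $tail = $sent =~ /\r\n\z/ ? '' : "\r\n";
    return $tail . ".\r\n";
}

my $wire = dot_stuff("Hello\n.hidden line\nBye");
print $wire, end_of_data($wire);
```

The receiving side reverses the doubling, which is what read_until_dot is described as doing.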

    CLASS METHODS

    These methods are not intended to be called by the user, but used or over-ridden by a sub-class of Net::Cmd

    • debug_print ( DIR, TEXT )

      Print debugging information. DIR denotes the direction: a true value means the data is being sent to the server. Calls debug_text before printing to STDERR.

    • debug_text ( TEXT )

      This method is called to print debugging information. TEXT is the text being sent. The method should return the text to be printed

      This is primarily meant for the use of modules such as FTP where passwords are sent, but we do not want to display them in the debugging information.

    • command ( CMD [, ARGS, ... ])

      Send a command to the command server. All arguments are first joined with a space character and CRLF is appended; this string is then sent to the command server.

      Returns undef upon failure

    • unsupported ()

      Sets the status code to 580 and the response text to 'Unsupported command'. Returns zero.

    • response ()

      Obtain a response from the server. Upon success the most significant digit of the status code is returned. Upon failure, timeout etc., undef is returned.

    • parse_response ( TEXT )

      This method is called by response as a method with one argument. It should return an array of 2 values: the 3-digit status code, and a flag which is true when this is part of a multi-line response and this line is not the last.

    • getline ()

      Retrieve one line, delimited by CRLF, from the remote server. Returns undef upon failure.

      NOTE: If you do use this method for any reason, please remember to add some debug_print calls into your method.

    • ungetline ( TEXT )

      Unget a line of text from the server.

    • rawdatasend ( DATA )

      Send data to the remote server without performing any conversions. DATA is a scalar.

    • read_until_dot ()

      Read data from the remote server until a line consisting of a single '.'. Any lines starting with '..' will have one of the '.'s removed.

      Returns a reference to a list containing the lines, or undef upon failure.

    • tied_fh ()

      Returns a filehandle tied to the Net::Cmd object. After issuing a command, you may read from this filehandle using read() or <>. The filehandle will return EOF when the final dot is encountered. Similarly, you may write to the filehandle in order to send data to the server after issuing a command that expects data to be written.

      See the Net::POP3 and Net::SMTP modules for examples of this.

    EXPORTS

    Net::Cmd exports six subroutines. Five of these, CMD_INFO, CMD_OK, CMD_MORE, CMD_REJECT and CMD_ERROR, correspond to possible results of response and status. The sixth is CMD_PENDING.
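As a rough sketch of how the exported constants relate to a reply code, the following maps the most significant digit of a 3-digit code to a constant and applies the rule described for ok. The helpers classify and looks_ok, and the label strings, are hypothetical and are not part of Net::Cmd.

```perl
use strict;
use warnings;
use Net::Cmd;    # exports the six CMD_* subroutines by default

# Label for each exported constant; the label text is illustrative only.
my %class_name = (
    CMD_PENDING() => 'pending',
    CMD_INFO()    => 'info',
    CMD_OK()      => 'ok',
    CMD_MORE()    => 'more',
    CMD_REJECT()  => 'reject',
    CMD_ERROR()   => 'error',
);

# Hypothetical helper: classify a 3-digit code by its most significant
# digit, which is the value status() is described as returning.
sub classify {
    my ($code) = @_;
    my $digit = int( $code / 100 );
    return $class_name{$digit} // 'unknown';
}

# The rule described for ok(): greater than zero and less than 400.
sub looks_ok {
    my ($code) = @_;
    return ( $code > 0 && $code < 400 ) ? 1 : 0;
}

for my $code ( 220, 354, 550 ) {
    printf "%d: %s, ok=%s\n", $code, classify($code),
        looks_ok($code) ? 'yes' : 'no';
}
```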

    AUTHOR

    Graham Barr <gbarr@pobox.com>

    COPYRIGHT

    Copyright (c) 1995-2006 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    Net::Config

    NAME

    Net::Config - Local configuration data for libnet

    SYNOPSIS

    use Net::Config qw(%NetConfig);

    DESCRIPTION

    Net::Config holds configuration data for the modules in the libnet distribution. During installation you will be asked for these values.

    The configuration data is held globally in a file in the perl installation tree, but a user may override any of these values by providing their own. This can be done by having a .libnetrc file in their home directory. This file should return a reference to a HASH containing the keys described below. For example

    # .libnetrc
    {
        nntp_hosts => [ "my_preferred_host" ],
        ph_hosts   => [ "my_ph_server" ],
    }
    __END__

    METHODS

    Net::Config defines the following methods. They are methods as they are invoked as class methods. This is because Net::Config inherits from Net::LocalCfg so you can override these methods if you want.

    • requires_firewall HOST

      Attempts to determine if a given host is outside your firewall. Possible return values are:

        -1  Cannot lookup hostname
         0  Host is inside firewall (or there is no ftp_firewall entry)
         1  Host is outside the firewall

      This is done by using hostname lookup and the local_netmask entry in the configuration data.
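The netmask comparison behind this check can be illustrated as follows. This is a minimal sketch and not Net::Config's implementation: it assumes the hostname has already been resolved to a dotted-quad address, and the helpers ip_to_int and in_netmask are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: convert "a.b.c.d" to a 32-bit integer,
# or return undef if the string is not a dotted-quad address.
sub ip_to_int {
    my ($ip) = @_;
    my @octets = $ip =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/
        or return undef;
    return undef if grep { $_ > 255 } @octets;
    return unpack 'N', pack 'C4', @octets;
}

# Hypothetical helper: true if $ip lies inside a netmask string in the
# "134.99.4.0/24" form used by the local_netmask entry.
sub in_netmask {
    my ( $ip, $netmask ) = @_;
    my ( $net, $bits ) = split m{/}, $netmask;
    my $ip_int  = ip_to_int($ip);
    my $net_int = ip_to_int($net);
    return 0 unless defined $ip_int && defined $net_int;
    my $mask = $bits == 0 ? 0 : ( 0xFFFFFFFF << ( 32 - $bits ) ) & 0xFFFFFFFF;
    return ( ( $ip_int & $mask ) == ( $net_int & $mask ) ) ? 1 : 0;
}

for my $ip ( '134.99.4.17', '134.99.5.17' ) {
    printf "%s is %s the netmask\n", $ip,
        in_netmask( $ip, '134.99.4.0/24' ) ? 'inside' : 'outside';
}
```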

    NetConfig VALUES

    • nntp_hosts
    • snpp_hosts
    • pop3_hosts
    • smtp_hosts
    • ph_hosts
    • daytime_hosts
    • time_hosts

      Each is a reference to an array of hostnames (in order of preference), which should be used for the given protocol

    • inet_domain

      Your internet domain name

    • ftp_firewall

      If you have an FTP proxy firewall (NOT an HTTP or SOCKS firewall) then this value should be set to the firewall hostname. If your firewall does not listen to port 21, then this value should be set to "hostname:port" (e.g. "hostname:99").

    • ftp_firewall_type

      There are many different ftp firewall products available. But unfortunately there is no standard for how to traverse a firewall. The list below shows the sequence of commands that Net::FTP will use

        user         Username for remote host
        pass         Password for remote host
        fwuser       Username for firewall
        fwpass       Password for firewall
        remote.host  The hostname of the remote ftp server

      0
        There is no firewall

      1
        USER user@remote.host
        PASS pass

      2
        USER fwuser
        PASS fwpass
        USER user@remote.host
        PASS pass

      3
        USER fwuser
        PASS fwpass
        SITE remote.site
        USER user
        PASS pass

      4
        USER fwuser
        PASS fwpass
        OPEN remote.site
        USER user
        PASS pass

      5
        USER user@fwuser@remote.site
        PASS pass@fwpass

      6
        USER fwuser@remote.site
        PASS fwpass
        USER user
        PASS pass

      7
        USER user@remote.host
        PASS pass
        AUTH fwuser
        RESP fwpass

    • ftp_ext_passive
    • ftp_int_passive

      FTP servers can work in passive or active mode. In active mode, when you want to transfer data you have to tell the server the address and port to connect to. In passive mode, the server provides the address and port and you establish the connection.

      With some firewalls active mode does not work as the server cannot connect to your machine (because you are behind a firewall) and the firewall does not re-write the command. In this case you should set ftp_ext_passive to a true value.

      Some servers are configured to only work in passive mode. If you have one of these you can force Net::FTP to always transfer in passive mode, when not going via a firewall, by setting ftp_int_passive to a true value.

    • local_netmask

      A reference to a list of netmask strings in the form "134.99.4.0/24" . These are used by the requires_firewall function to determine if a given host is inside or outside your firewall.
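The command sequences listed under ftp_firewall_type lend themselves to a lookup table. The sketch below transcribes that list into templates and fills in the placeholders; it is not how Net::FTP is implemented, the helper firewall_commands and the {placeholder} syntax are hypothetical, and it assumes that remote.site and remote.host in the list refer to the same remote hostname.

```perl
use strict;
use warnings;

# One template list per firewall type, transcribed from the list above.
my %template = (
    0 => [],
    1 => [ 'USER {user}@{host}', 'PASS {pass}' ],
    2 => [ 'USER {fwuser}', 'PASS {fwpass}', 'USER {user}@{host}', 'PASS {pass}' ],
    3 => [ 'USER {fwuser}', 'PASS {fwpass}', 'SITE {host}', 'USER {user}', 'PASS {pass}' ],
    4 => [ 'USER {fwuser}', 'PASS {fwpass}', 'OPEN {host}', 'USER {user}', 'PASS {pass}' ],
    5 => [ 'USER {user}@{fwuser}@{host}', 'PASS {pass}@{fwpass}' ],
    6 => [ 'USER {fwuser}@{host}', 'PASS {fwpass}', 'USER {user}', 'PASS {pass}' ],
    7 => [ 'USER {user}@{host}', 'PASS {pass}', 'AUTH {fwuser}', 'RESP {fwpass}' ],
);

# Hypothetical helper: return the commands for a firewall type with the
# placeholders replaced by the supplied values (empty list if unknown type).
sub firewall_commands {
    my ( $type, %value ) = @_;
    my $steps = $template{$type} or return;
    return map {
        my $line = $_;
        $line =~ s/\{(\w+)\}/$value{$1}/g;
        $line;
    } @$steps;
}

print "$_\n" for firewall_commands(
    3,
    user   => 'alice',
    pass   => 'secret',
    fwuser => 'gateway',
    fwpass => 'gwsecret',
    host   => 'ftp.example.com',
);
```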

    The following entries are used during installation and testing of the libnet package:

    • test_hosts

      If true then make test may attempt to connect to hosts given in the configuration.

    • test_exists

      If true then Configure will check that each hostname given exists.

     

    Net::Domain

    NAME

    Net::Domain - Attempt to evaluate the current host's internet name and domain

    SYNOPSIS

    use Net::Domain qw(hostname hostfqdn hostdomain domainname);

    DESCRIPTION

    Using various methods attempt to find the Fully Qualified Domain Name (FQDN) of the current host. From this determine the host-name and the host-domain.

    Each of the functions will return undef if the FQDN cannot be determined.

    • hostfqdn ()

      Identify and return the FQDN of the current host.

    • domainname ()

      An alias for hostfqdn ().

    • hostname ()

      Returns the smallest part of the FQDN which can be used to identify the host.

    • hostdomain ()

      Returns the remainder of the FQDN after the hostname has been removed.
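A short usage sketch of the functions above. Because each may return undef when the FQDN cannot be determined, the example guards every value before printing; the '(undetermined)' label is just an illustrative placeholder.

```perl
use strict;
use warnings;
use Net::Domain qw(hostname hostfqdn hostdomain);

# Collect each value in scalar context so an undef result stays in place.
my %name = (
    hostname   => scalar hostname(),
    hostfqdn   => scalar hostfqdn(),
    hostdomain => scalar hostdomain(),
);

for my $key ( sort keys %name ) {
    printf "%-10s %s\n", $key,
        defined $name{$key} ? $name{$key} : '(undetermined)';
}
```

What is printed depends entirely on how the local machine is configured.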

    AUTHOR

    Graham Barr <gbarr@pobox.com>. Adapted from Sys::Hostname by David Sundstrom <sunds@asictest.sc.ti.com>

    COPYRIGHT

    Copyright (c) 1995-1998 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    Net::FTP

    NAME

    Net::FTP - FTP Client class

    SYNOPSIS

    use Net::FTP;

    $ftp = Net::FTP->new("some.host.name", Debug => 0)
      or die "Cannot connect to some.host.name: $@";

    $ftp->login("anonymous",'-anonymous@')
      or die "Cannot login ", $ftp->message;

    $ftp->cwd("/pub")
      or die "Cannot change working directory ", $ftp->message;

    $ftp->get("that.file")
      or die "get failed ", $ftp->message;

    $ftp->quit;

    DESCRIPTION

    Net::FTP is a class implementing a simple FTP client in Perl as described in RFC959. It provides wrappers for a subset of the RFC959 commands.

    OVERVIEW

    FTP stands for File Transfer Protocol. It is a way of transferring files between networked machines. The protocol defines a client (whose commands are provided by this module) and a server (not implemented in this module). Communication is always initiated by the client, and the server responds with a message and a status code (and sometimes with data).

    The FTP protocol allows files to be sent to or fetched from the server. Each transfer involves a local file (on the client) and a remote file (on the server). In this module, the same file name will be used for both local and remote if only one is specified. This means that transferring remote file /path/to/file will try to put that file in /path/to/file locally, unless you specify a local file name.

    The protocol also defines several standard translations which the file can undergo during transfer. These are ASCII, EBCDIC, binary, and byte. ASCII is the default type, and indicates that the sender of files will translate the ends of lines to a standard representation which the receiver will then translate back into their local representation. EBCDIC indicates the file being transferred is in EBCDIC format. Binary (also known as image) format sends the data as a contiguous bit stream. Byte format transfers the data as bytes, the values of which remain the same regardless of differences in byte size between the two machines (in theory - in practice you should only use this if you really know what you're doing).

    CONSTRUCTOR

    • new ([ HOST ] [, OPTIONS ])

      This is the constructor for a new Net::FTP object. HOST is the name of the remote host to which an FTP connection is required.

      HOST is optional. If HOST is not given then it may instead be passed as the Host option described below.

      OPTIONS are passed in a hash like fashion, using key and value pairs. Possible options are:

      Host - FTP host to connect to. It may be a single scalar, as defined for the PeerAddr option in IO::Socket::INET, or a reference to an array with hosts to try in turn. The host method will return the value which was used to connect to the host.

      Firewall - The name of a machine which acts as an FTP firewall. This can be overridden by an environment variable FTP_FIREWALL . If specified, and the given host cannot be directly connected to, then the connection is made to the firewall machine and the string @hostname is appended to the login identifier. This kind of setup is also referred to as an ftp proxy.

      FirewallType - The type of firewall running on the machine indicated by Firewall. This can be overridden by an environment variable FTP_FIREWALL_TYPE . For a list of permissible types, see the description of ftp_firewall_type in Net::Config.

      BlockSize - This is the block size that Net::FTP will use when doing transfers. (defaults to 10240)

      Port - The port number to connect to on the remote machine for the FTP connection

      Timeout - Set a timeout value (defaults to 120)

      Debug - debug level (see the debug method in Net::Cmd)

      Passive - If set to a non-zero value then all data transfers will be done using passive mode. If set to zero then data transfers will be done using active mode. If the machine is connected to the Internet directly, both passive and active mode should work equally well. Behind most firewall and NAT configurations passive mode has a better chance of working. However, in some rare firewall configurations, active mode actually works when passive mode doesn't. Some really old FTP servers might not implement passive transfers. If not specified, then the transfer mode is set by the environment variable FTP_PASSIVE or if that one is not set by the settings done by the libnetcfg utility. If none of these apply then passive mode is used.

      Hash - If given a reference to a file handle (e.g., \*STDERR ), print hash marks (#) on that filehandle every 1024 bytes. This simply invokes the hash() method for you, so that hash marks are displayed for all transfers. You can, of course, call hash() explicitly whenever you'd like.

      LocalAddr - Local address to use for all socket connections, this argument will be passed to IO::Socket::INET

      If the constructor fails undef will be returned and an error message will be in $@

    METHODS

    Unless otherwise stated all methods return either a true or false value, with true meaning that the operation was a success. When a method states that it returns a value, failure will be returned as undef or an empty list.

    • login ([LOGIN [,PASSWORD [, ACCOUNT] ] ])

      Log into the remote FTP server with the given login information. If no arguments are given then Net::FTP uses the Net::Netrc package to look up the login information for the connected host. If no information is found then a login of anonymous is used. If no password is given and the login is anonymous then anonymous@ will be used for password.

      If the connection is via a firewall then the authorize method will be called with no arguments.

    • authorize ( [AUTH [, RESP]])

      This is a protocol used by some firewall ftp proxies. It is used to authorise the user to send data out. If both arguments are not specified then authorize uses Net::Netrc to do a lookup.

    • site (ARGS)

      Send a SITE command to the remote server and wait for a response.

      Returns most significant digit of the response code.

    • ascii

      Transfer file in ASCII. CRLF translation will be done if required

    • binary

      Transfer file in binary mode. No transformation will be done.

      Hint: If both server and client machines use the same line ending for text files, then it will be faster to transfer all files in binary mode.

    • rename ( OLDNAME, NEWNAME )

      Rename a file on the remote FTP server from OLDNAME to NEWNAME . This is done by sending the RNFR and RNTO commands.

    • delete ( FILENAME )

      Send a request to the server to delete FILENAME .

    • cwd ( [ DIR ] )

      Attempt to change directory to the directory given in DIR. If DIR is "..", the FTP CDUP command is used to attempt to move up one directory. If no directory is given then an attempt is made to change the directory to the root directory.

    • cdup ()

      Change directory to the parent of the current directory.

    • pwd ()

      Returns the full pathname of the current directory.

    • restart ( WHERE )

      Set the byte offset at which to begin the next data transfer. Net::FTP simply records this value and uses it during the next data transfer. For this reason this method will not return an error, but setting it may cause a subsequent data transfer to fail.

    • rmdir ( DIR [, RECURSE ])

      Remove the directory with the name DIR . If RECURSE is true then rmdir will attempt to delete everything inside the directory.

    • mkdir ( DIR [, RECURSE ])

      Create a new directory with the name DIR . If RECURSE is true then mkdir will attempt to create all the directories in the given path.

      Returns the full pathname to the new directory.

    • alloc ( SIZE [, RECORD_SIZE] )

      The alloc command allows you to give the ftp server a hint about the size of the file about to be transferred using the ALLO ftp command. Some storage systems use this to make intelligent decisions about how to store the file. The SIZE argument represents the size of the file in bytes. The RECORD_SIZE argument indicates a maximum record or page size for files sent with a record or page structure.

      The size of the file will be determined, and sent to the server automatically for normal files so that this method need only be called if you are transferring data from a socket, named pipe, or other stream not associated with a normal file.

    • ls ( [ DIR ] )

      Get a directory listing of DIR , or the current directory.

      In an array context, returns a list of lines returned from the server. In a scalar context, returns a reference to a list.

    • dir ( [ DIR ] )

      Get a directory listing of DIR , or the current directory in long format.

      In an array context, returns a list of lines returned from the server. In a scalar context, returns a reference to a list.

    • get ( REMOTE_FILE [, LOCAL_FILE [, WHERE]] )

      Get REMOTE_FILE from the server and store locally. LOCAL_FILE may be a filename or a filehandle. If not specified, the file will be stored in the current directory with the same leafname as the remote file.

      If WHERE is given then the first WHERE bytes of the file will not be transferred, and the remaining bytes will be appended to the local file if it already exists.

      Returns LOCAL_FILE , or the generated local file name if LOCAL_FILE is not given. If an error was encountered undef is returned.

    • put ( LOCAL_FILE [, REMOTE_FILE ] )

      Put a file on the remote server. LOCAL_FILE may be a name or a filehandle. If LOCAL_FILE is a filehandle then REMOTE_FILE must be specified. If REMOTE_FILE is not specified then the file will be stored in the current directory with the same leafname as LOCAL_FILE .

      Returns REMOTE_FILE , or the generated remote filename if REMOTE_FILE is not given.

      NOTE: If for some reason the transfer does not complete and an error is returned then the contents that had been transferred will not be removed automatically.

    • put_unique ( LOCAL_FILE [, REMOTE_FILE ] )

      Same as put but uses the STOU command.

      Returns the name of the file on the server.

    • append ( LOCAL_FILE [, REMOTE_FILE ] )

      Same as put but appends to the file on the remote server.

      Returns REMOTE_FILE , or the generated remote filename if REMOTE_FILE is not given.

    • unique_name ()

      Returns the name of the last file stored on the server using the STOU command.

    • mdtm ( FILE )

      Returns the modification time of the given file

    • size ( FILE )

      Returns the size in bytes for the given file as stored on the remote server.

      NOTE: The size reported is the size of the stored file on the remote server. If the file is subsequently transferred from the server in ASCII mode and the remote server and local machine have different ideas about "End Of Line" then the size of file on the local machine after transfer may be different.

    • supported ( CMD )

      Returns TRUE if the remote server supports the given command.

    • hash ( [FILEHANDLE_GLOB_REF],[ BYTES_PER_HASH_MARK] )

      Called without parameters, or with the first argument false, hash marks are suppressed. If the first argument is true but not a reference to a file handle glob, then \*STDERR is used. The second argument is the number of bytes per hash mark printed, and defaults to 1024. In all cases the return value is a reference to an array of two: the filehandle glob reference and the bytes per hash mark.

    • feature ( NAME )

      Determine if the server supports the specified feature. The return value is a list of lines the server responded with to describe the options that it supports for the given feature. If the feature is unsupported then the empty list is returned.

      if ($ftp->feature( 'MDTM' )) {
          # Do something
      }

      if (grep { /\bTLS\b/ } $ftp->feature('AUTH')) {
          # Server supports TLS
      }

    The following methods can return different results depending on how they are called. If the user explicitly calls either of the pasv or port methods then these methods will return a true or false value. If the user does not call either of these methods then the result will be a reference to a Net::FTP::dataconn based object.

    • nlst ( [ DIR ] )

      Send an NLST command to the server, with an optional parameter.

    • list ( [ DIR ] )

      Same as nlst but using the LIST command

    • retr ( FILE )

      Begin the retrieval of a file called FILE from the remote server.

    • stor ( FILE )

      Tell the server that you wish to store a file. FILE is the name of the new file that should be created.

    • stou ( FILE )

      Same as stor but using the STOU command. The name of the unique file which was created on the server will be available via the unique_name method after the data connection has been closed.

    • appe ( FILE )

      Tell the server that we want to append some data to the end of a file called FILE . If this file does not exist then create it.

    If for some reason you want to have complete control over the data connection, including generating it and calling the response method when required, then you can use these methods to do so.

    However calling these methods only affects the use of the methods above that can return a data connection. They have no effect on methods get , put , put_unique and those that do not require data connections.

    • port ( [ PORT ] )

      Send a PORT command to the server. If PORT is specified then it is sent to the server. If not, then a listen socket is created and the correct information sent to the server.

    • pasv ()

      Tell the server to go into passive mode. Returns the text that represents the port on which the server is listening; this text is in a suitable form to send to another ftp server using the port method.

    The following methods can be used to transfer files between two remote servers, providing that these two servers can connect directly to each other.

    • pasv_xfer ( SRC_FILE, DEST_SERVER [, DEST_FILE ] )

      This method will do a file transfer between two remote ftp servers. If DEST_FILE is omitted then the leaf name of SRC_FILE will be used.

    • pasv_xfer_unique ( SRC_FILE, DEST_SERVER [, DEST_FILE ] )

      Like pasv_xfer but the file is stored on the remote server using the STOU command.

    • pasv_wait ( NON_PASV_SERVER )

      This method can be used to wait for a transfer to complete between a passive server and a non-passive server. The method should be called on the passive server with the Net::FTP object for the non-passive server passed as an argument.

    • abort ()

      Abort the current data transfer.

    • quit ()

      Send the QUIT command to the remote FTP server and close the socket connection.

    Methods for the adventurous

    Net::FTP inherits from Net::Cmd so methods defined in Net::Cmd may be used to send commands to the remote FTP server.

    • quot (CMD [,ARGS])

      Send a command, that Net::FTP does not directly support, to the remote server and wait for a response.

      Returns most significant digit of the response code.

      WARNING This call should only be used on commands that do not require data connections. Misuse of this method can hang the connection.

    THE dataconn CLASS

    Some of the methods defined in Net::FTP return an object which will be derived from this class. The dataconn class itself is derived from the IO::Socket::INET class, so any normal IO operations can be performed. However, the following methods are defined in the dataconn class, and IO should be performed using these.

    • read ( BUFFER, SIZE [, TIMEOUT ] )

      Read SIZE bytes of data from the server and place it into BUFFER, also performing any <CRLF> translation necessary. TIMEOUT is optional; if not given, the timeout value from the command connection will be used.

      Returns the number of bytes read before any <CRLF> translation.

    • write ( BUFFER, SIZE [, TIMEOUT ] )

      Write SIZE bytes of data from BUFFER to the server, also performing any <CRLF> translation necessary. TIMEOUT is optional; if not given, the timeout value from the command connection will be used.

      Returns the number of bytes written before any <CRLF> translation.

    • bytes_read ()

      Returns the number of bytes read so far.

    • abort ()

      Abort the current data transfer.

    • close ()

      Close the data connection and get a response from the FTP server. Returns true if the connection was closed successfully and the first digit of the response from the server was a '2'.

    UNIMPLEMENTED

    The following RFC959 commands have not been implemented:

    • SMNT

      Mount a different file system structure without changing login or accounting information.

    • HELP

      Ask the server for "helpful information" (that's what the RFC says) on the commands it accepts.

    • MODE

      Specifies transfer mode (stream, block or compressed) for file to be transferred.

    • SYST

      Request remote server system identification.

    • STAT

      Request remote server status.

    • STRU

      Specifies file structure for file to be transferred.

    • REIN

      Reinitialize the connection, flushing all I/O and account information.

    REPORTING BUGS

    When reporting bugs/problems please include as much information as possible. It may be difficult for me to reproduce the problem as almost every setup is different.

    A small script which reproduces the problem will probably be of help. It would also be useful if this script was run with the extra option Debug => 1 passed to the constructor, and the output sent with the bug report. If you cannot include a small script then please include a Debug trace from a run of your program which does reproduce the problem.

    AUTHOR

    Graham Barr <gbarr@pobox.com>

    SEE ALSO

    Net::Netrc Net::Cmd

    ftp(1), ftpd(8), RFC 959 http://www.cis.ohio-state.edu/htbin/rfc/rfc959.html

    USE EXAMPLES

    For an example of the use of Net::FTP see

    CREDITS

    Henry Gabryjelski <henryg@WPI.EDU> - for the suggestion of creating directories recursively.

    Nathan Torkington <gnat@frii.com> - for some input on the documentation.

    Roderick Schertler <roderick@gate.net> - for various inputs

    COPYRIGHT

    Copyright (c) 1995-2004 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Net::NNTP

    NAME

    Net::NNTP - NNTP Client class

    SYNOPSIS

      use Net::NNTP;

      $nntp = Net::NNTP->new("some.host.name");
      $nntp->quit;

    DESCRIPTION

    Net::NNTP is a class implementing a simple NNTP client in Perl as described in RFC977. Net::NNTP inherits its communication methods from Net::Cmd.

    CONSTRUCTOR

    • new ( [ HOST ] [, OPTIONS ])

      This is the constructor for a new Net::NNTP object. HOST is the name of the remote host to which an NNTP connection is required. If not given then it may be passed as the Host option described below. If no host is passed then two environment variables are checked, first NNTPSERVER then NEWSHOST, then Net::Config is checked, and if a host is not found then news is used.

      OPTIONS are passed in a hash like fashion, using key and value pairs. Possible options are:

      Host - NNTP host to connect to. It may be a single scalar, as defined for the PeerAddr option in IO::Socket::INET, or a reference to an array with hosts to try in turn. The host method will return the value which was used to connect to the host.

      Timeout - Maximum time, in seconds, to wait for a response from the NNTP server. A value of zero will cause all IO operations to block. (default: 120)

      Debug - Enable the printing of debugging information to STDERR

      Reader - If the remote server is INN then initially the connection will be to innd. By default Net::NNTP will issue a MODE READER command so that the remote server becomes nnrpd. If the Reader option is given with a value of zero, then this command will not be sent and the connection will be left talking to innd.

    METHODS

    Unless otherwise stated all methods return either a true or false value, with true meaning that the operation was a success. When a method states that it returns a value, failure will be returned as undef or an empty list.

    • article ( [ MSGID|MSGNUM ], [FH] )

      Retrieve the header, a blank line, then the body (text) of the specified article.

      If FH is specified then it is expected to be a valid filehandle and the result will be printed to it; on success a true value will be returned. If FH is not specified then the return value, on success, will be a reference to an array containing the requested article, with each entry in the array containing one line of the article.

      If no arguments are passed then the current article in the currently selected newsgroup is fetched.

      MSGNUM is a numeric id of an article in the current newsgroup, and will change the current article pointer. MSGID is the message id of an article as shown in that article's header. It is anticipated that the client will obtain the MSGID from a list provided by the newnews command, from references contained within another article, or from the message-id provided in the response to some other commands.

      If there is an error then undef will be returned.

    • body ( [ MSGID|MSGNUM ], [FH] )

      Like article but only fetches the body of the article.

    • head ( [ MSGID|MSGNUM ], [FH] )

      Like article but only fetches the headers for the article.

    • articlefh ( [ MSGID|MSGNUM ] )
    • bodyfh ( [ MSGID|MSGNUM ] )
    • headfh ( [ MSGID|MSGNUM ] )

      These are similar to article(), body() and head(), but rather than returning the requested data directly, they return a tied filehandle from which to read the article.

    • nntpstat ( [ MSGID|MSGNUM ] )

      The nntpstat command is similar to the article command except that no text is returned. When selecting by message number within a group, the nntpstat command serves to set the "current article pointer" without sending text.

      Using the nntpstat command to select by message-id is valid but of questionable value, since a selection by message-id does not alter the "current article pointer".

      Returns the message-id of the "current article".

    • group ( [ GROUP ] )

      Set and/or get the current group. If GROUP is not given then information is returned on the current group.

      In a scalar context it returns the group name.

      In a list context the return value is a list containing the number of articles in the group, the number of the first article, the number of the last article and the group name.

    • ihave ( MSGID [, MESSAGE ])

      The ihave command informs the server that the client has an article whose id is MSGID. If the server desires a copy of that article, and MESSAGE has been given, then it will be sent.

      Returns true if the server desires the article and MESSAGE, if specified, was successfully sent.

      If MESSAGE is not specified then the message must be sent using the datasend and dataend methods from Net::Cmd.

      MESSAGE can be either an array of lines or a reference to an array.

    • last ()

      Set the "current article pointer" to the previous article in the current newsgroup.

      Returns the message-id of the article.

    • date ()

      Returns the date on the remote server. This date will be in a UNIX time format (seconds since 1970).

    • postok ()

      postok will return true if the server's initial response indicated that it will allow posting.

    • authinfo ( USER, PASS )

      Authenticates to the server (using AUTHINFO USER / AUTHINFO PASS) using the supplied username and password. Please note that the password is sent in clear text to the server. This command should not be used with valuable passwords unless the connection to the server is somehow protected.

    • list ()

      Obtain information about all the active newsgroups. The result is a reference to a hash where the key is a group name and each value is a reference to an array. The elements in this array are: the last article number in the group, the first article number in the group and any information flags about the group.

    • newgroups ( SINCE [, DISTRIBUTIONS ])

      SINCE is a time value and DISTRIBUTIONS is either a distribution pattern or a reference to a list of distribution patterns. The result is the same as list, but the groups returned will be limited to those created after SINCE and, if specified, in one of the distribution areas in DISTRIBUTIONS.

    • newnews ( SINCE [, GROUPS [, DISTRIBUTIONS ]])

      SINCE is a time value. GROUPS is either a group pattern or a reference to a list of group patterns. DISTRIBUTIONS is either a distribution pattern or a reference to a list of distribution patterns.

      Returns a reference to a list which contains the message-ids of all news posted after SINCE that are in groups which match GROUPS and a distribution which matches DISTRIBUTIONS.

    • next ()

      Set the "current article pointer" to the next article in the current newsgroup.

      Returns the message-id of the article.

    • post ( [ MESSAGE ] )

      Post a new article to the news server. If MESSAGE is specified and posting is allowed then the message will be sent.

      If MESSAGE is not specified then the message must be sent using the datasend and dataend methods from Net::Cmd.

      MESSAGE can be either an array of lines or a reference to an array.

      The message, either sent via datasend or as the MESSAGE parameter, must be in the format as described by RFC822 and must contain From:, Newsgroups: and Subject: headers.

    • postfh ()

      Post a new article to the news server using a tied filehandle. If posting is allowed, this method will return a tied filehandle to which you can print() the contents of the article to be posted. You must explicitly close() the filehandle when you are finished posting the article, and the return value from the close() call will indicate whether the message was successfully posted.

    • slave ()

      Tell the remote server that I am not a user client, but probably another news server.

    • quit ()

      Quit the remote server and close the socket connection.
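
    To make the shape of the value returned by list more concrete, the sketch below walks a hash of the documented form (group name mapped to an array of last article number, first article number and flags). The data is a made-up stand-in rather than output from a real server, and summarize_groups is a hypothetical helper, not a Net::NNTP method.

```perl
use strict;
use warnings;

# Made-up sample shaped like the hash that list() is documented to return:
# group name => [ last article number, first article number, flags ].
my $groups = {
    'comp.lang.perl.misc' => [ 4210, 3977, 'y' ],
    'local.announce'      => [   58,    1, 'm' ],
};

# summarize_groups is a hypothetical helper, not a Net::NNTP method.
sub summarize_groups {
    my ($list) = @_;
    my @lines;
    for my $name (sort keys %$list) {
        # Note the order: the last article number comes before the first.
        my ($last, $first, $flags) = @{ $list->{$name} };
        push @lines, "$name: articles $first to $last, flags '$flags'";
    }
    return @lines;
}

print "$_\n" for summarize_groups($groups);
```

    With a live connection the same helper could presumably be fed the result of $nntp->list directly, after checking that it is defined.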

    Extension methods

    These methods use commands that are not part of the RFC977 documentation. Some servers may not support all of them.

    • newsgroups ( [ PATTERN ] )

      Returns a reference to a hash where the keys are all the group names which match PATTERN, or all of the groups if no pattern is specified, and each value contains the description text for the group.

    • distributions ()

      Returns a reference to a hash where the keys are all the possible distribution names and the values are the distribution descriptions.

    • subscriptions ()

      Returns a reference to a list which contains a list of groups which are recommended for a new user to subscribe to.

    • overview_fmt ()

      Returns a reference to an array which contains the names of the fields returned by xover.

    • active_times ()

      Returns a reference to a hash where the keys are the group names and each value is a reference to an array containing the time the group was created and an identifier, possibly an email address, of the creator.

    • active ( [ PATTERN ] )

      Similar to list but only active groups that match the pattern are returned. PATTERN can be a group pattern.

    • xgtitle ( PATTERN )

      Returns a reference to a hash where the keys are all the group names which match PATTERN and each value is the description text for the group.

    • xhdr ( HEADER, MESSAGE-SPEC )

      Obtain the header field HEADER for all the messages specified.

      The return value will be a reference to a hash where the keys are the message numbers and each value contains the text of the requested header for that message.

    • xover ( MESSAGE-SPEC )

      The return value will be a reference to a hash where the keys are the message numbers and each value contains a reference to an array which contains the overview fields for that message.

      The names of the fields can be obtained by calling overview_fmt.

    • xpath ( MESSAGE-ID )

      Returns the path name to the file on the server which contains the specified message.

    • xpat ( HEADER, PATTERN, MESSAGE-SPEC)

      The result is the same as xhdr except that it will be restricted to headers where the text of the header matches PATTERN.

    • xrover

      The XROVER command returns reference information for the article(s) specified.

      Returns a reference to a hash where the keys are the message numbers and the values are the References: lines from the articles.

    • listgroup ( [ GROUP ] )

      Returns a reference to a list of all the active messages in GROUP, or the current group if GROUP is not specified.

    • reader

      Tell the server that you are a reader and not another server.

      This is required by some servers. For example if you are connecting to an INN server and you have transfer permission your connection will be connected to the transfer daemon, not the NNTP daemon. Issuing this command will cause the transfer daemon to hand over control to the NNTP daemon.

      Some servers do not understand this command, but issuing it and ignoring the response is harmless.
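
    The results of overview_fmt and xover are meant to be read together: one gives the field names, the other gives the field values per message. A minimal sketch of pairing them up follows. The field names and values are made-up sample data (a real server decides the actual names), and name_overview_fields is a hypothetical helper, not a Net::NNTP method.

```perl
use strict;
use warnings;

# name_overview_fields is a hypothetical helper for illustration.
# $fmt  : array reference of field names, as overview_fmt is documented to return
# $over : hash reference of message number => array reference of fields, as from xover
sub name_overview_fields {
    my ($fmt, $over) = @_;
    my %named;
    for my $num (keys %$over) {
        my %rec;
        @rec{@$fmt} = @{ $over->{$num} };    # hash slice pairs names with values
        $named{$num} = \%rec;
    }
    return \%named;
}

# Made-up sample data standing in for a server's responses.
my $fmt  = [ 'Subject:', 'From:', 'Date:' ];
my $over = { 1001 => [ 'Hello', 'a@example.com', '1 Jan 2014' ] };

my $named = name_overview_fields($fmt, $over);
print "1001 subject: $named->{1001}{'Subject:'}\n";
```

    With a connected object the two arguments would presumably come from $nntp->overview_fmt and $nntp->xover with a MESSAGE-SPEC such as a reference to a two-element list.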

    UNSUPPORTED

    The following NNTP commands are unsupported by the package, and there are no plans to support them.

      AUTHINFO GENERIC
      XTHREAD
      XSEARCH
      XINDEX

    DEFINITIONS

    • MESSAGE-SPEC

      MESSAGE-SPEC is either a single message-id, a single message number, or a reference to a list of two message numbers.

      If MESSAGE-SPEC is a reference to a list of two message numbers and the second number in a range is less than or equal to the first then the range represents all messages in the group after the first message number.

      NOTE: For compatibility with earlier versions of Net::NNTP only, a message spec can be passed as a list of two numbers. This is deprecated and a reference to the list should now be passed.

    • PATTERN

      The NNTP protocol uses the WILDMAT format for patterns. The WILDMAT format was first developed by Rich Salz based on the format used in the UNIX "find" command to articulate file names. It was developed to provide a uniform mechanism for matching patterns in the same manner that the UNIX shell matches filenames.

      Patterns are implicitly anchored at the beginning and end of each string when testing for a match.

      There are five pattern matching operations other than a strict one-to-one match between the pattern and the source to be checked for a match.

      The first is an asterisk * to match any sequence of zero or more characters.

      The second is a question mark ? to match any single character. The third specifies a specific set of characters.

      The set is specified as a list of characters, or as a range of characters where the beginning and end of the range are separated by a minus (or dash) character, or as any combination of lists and ranges. The dash can also be included in the set as a character if it is the beginning or end of the set. This set is enclosed in square brackets. The close square bracket ] may be used in a set if it is the first character in the set.

      The fourth operation is the same as the logical not of the third operation and is specified the same way as the third with the addition of a caret character ^ at the beginning of the test string just inside the open square bracket.

      The final operation uses the backslash character to invalidate the special meaning of an open square bracket [, the asterisk, backslash or the question mark. Two backslashes in sequence will result in the evaluation of the backslash as a character with no special meaning.

      • Examples
      • [^]-]

        matches any single character other than a close square bracket or a minus sign/dash.

      • *bdc

        matches any string that ends with the string "bdc" including the string "bdc" (without quotes).

      • [0-9a-zA-Z]

        matches any single printable alphanumeric ASCII character.

      • a??d

        matches any four character string which begins with a and ends with d.
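
    The five operations described above can be sketched as a translation from a WILDMAT pattern to an anchored Perl regular expression. This is only an illustration of the rules as stated here. wildmat_to_regex is a hypothetical helper, not part of Net::NNTP, and a real server's matcher may treat edge cases differently.

```perl
use strict;
use warnings;

# wildmat_to_regex is a hypothetical helper for illustration only.
sub wildmat_to_regex {
    my ($pat) = @_;
    my @c  = split //, $pat;
    my $re = '';
    my $i  = 0;
    while ($i < @c) {
        my $ch = $c[$i++];
        if ($ch eq '*') {
            $re .= '.*';                 # any sequence of zero or more characters
        }
        elsif ($ch eq '?') {
            $re .= '.';                  # any single character
        }
        elsif ($ch eq '\\') {
            # a backslash removes the special meaning of the next character
            $re .= quotemeta($i < @c ? $c[$i++] : '\\');
        }
        elsif ($ch eq '[') {
            my $set = '[';
            if ($i < @c && $c[$i] eq '^') { $set .= '^';   $i++ }    # negated set
            if ($i < @c && $c[$i] eq ']') { $set .= '\\]'; $i++ }    # leading ] is a member
            while ($i < @c && $c[$i] ne ']') {
                my $m = $c[$i++];
                # keep '-' so ranges still work; escape what a Perl class treats specially
                $set .= ($m eq '\\' || $m eq '^' || $m eq '[') ? "\\$m" : $m;
            }
            $i++;                        # skip the closing ]
            $re .= $set . ']';
        }
        else {
            $re .= quotemeta($ch);       # everything else matches itself
        }
    }
    return qr/\A$re\z/s;                 # patterns are implicitly anchored at both ends
}

for my $pat ('*bdc', 'a??d', '[^]-]') {
    my $re = wildmat_to_regex($pat);
    for my $str ('abdc', 'abcd', 'x') {
        printf "%-6s %-5s %s\n", $pat, $str, ($str =~ $re ? 'match' : 'no match');
    }
}
```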

    SEE ALSO

    Net::Cmd

    AUTHOR

    Graham Barr <gbarr@pobox.com>

    COPYRIGHT

    Copyright (c) 1995-1997 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Net::Netrc

    NAME

    Net::Netrc - OO interface to user's netrc file

    SYNOPSIS

      use Net::Netrc;

      $mach = Net::Netrc->lookup('some.machine');
      $login = $mach->login;
      ($login, $password, $account) = $mach->lpa;

    DESCRIPTION

    Net::Netrc is a class implementing a simple interface to the .netrc file used by the ftp program.

    Net::Netrc also implements security checks just like the ftp program. These checks are, first, that the .netrc file must be owned by the user and, second, that the permissions should be such that only the owner has read and write access. If these conditions are not met then a warning is output and the .netrc file is not read.

    THE .netrc FILE

    The .netrc file contains login and initialization information used by the auto-login process. It resides in the user's home directory. The following tokens are recognized; they may be separated by spaces, tabs, or new-lines:

    • machine name

      Identify a remote machine name. The auto-login process searches the .netrc file for a machine token that matches the remote machine specified. Once a match is made, the subsequent .netrc tokens are processed, stopping when the end of file is reached or another machine or a default token is encountered.

    • default

      This is the same as machine name except that default matches any name. There can be only one default token, and it must be after all machine tokens. This is normally used as:

          default login anonymous password user@site

      thereby giving the user automatic anonymous login to machines not specified in .netrc.

    • login name

      Identify a user on the remote machine. If this token is present, the auto-login process will initiate a login using the specified name.

    • password string

      Supply a password. If this token is present, the auto-login process will supply the specified string if the remote server requires a password as part of the login process.

    • account string

      Supply an additional account password. If this token is present, the auto-login process will supply the specified string if the remote server requires an additional account password.

    • macdef name

      Define a macro. Net::Netrc only parses this field to be compatible with ftp.
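
    As an illustration of the token rules listed above, the sketch below splits a made-up .netrc text on whitespace and groups the tokens into entries. parse_netrc is a hypothetical helper and deliberately simplified: it is not how Net::Netrc itself parses the file, and it ignores macdef bodies and quoted values.

```perl
use strict;
use warnings;

# parse_netrc is a hypothetical helper for illustration; Net::Netrc has its
# own parser. This sketch ignores macdef bodies and quoted values.
sub parse_netrc {
    my ($text) = @_;
    my @tokens = split ' ', $text;    # spaces, tabs and new-lines all separate tokens
    my (@entries, $current);
    while (@tokens) {
        my $tok = shift @tokens;
        if ($tok eq 'machine') {
            $current = { machine => shift @tokens };
            push @entries, $current;
        }
        elsif ($tok eq 'default') {
            $current = { default => 1 };    # matches any machine name
            push @entries, $current;
        }
        elsif ($current && $tok =~ /\A(?:login|password|account)\z/) {
            $current->{$tok} = shift @tokens;    # the next token is the value
        }
    }
    return @entries;
}

# Made-up sample file contents.
my @entries = parse_netrc(<<'EOT');
machine ftp.example.com
    login alice
    password s3cret
default login anonymous password user@site
EOT

for my $e (@entries) {
    my $name = $e->{default} ? '(default)' : $e->{machine};
    print "$name => $e->{login}\n";
}
```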

    CONSTRUCTOR

    The constructor for a Net::Netrc object is not called new, as it does not really create a new object. Instead it is called lookup, as this is essentially what it does.

    • lookup ( MACHINE [, LOGIN ])

      Look up and return a reference to the entry for MACHINE. If LOGIN is given then the entry returned will have the given login. If LOGIN is not given then the first entry in the .netrc file for MACHINE will be returned.

      If a matching entry cannot be found, and a default entry exists, then a reference to the default entry is returned.

      If there is no matching entry found and there is no default defined, or no .netrc file is found, then undef is returned.
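
    The selection rules just described can be sketched over an in-memory list of entries. This follows the prose above rather than the module's source, so behaviour in edge cases may differ from the real lookup. pick_entry and the sample entries are hypothetical.

```perl
use strict;
use warnings;

# Made-up entries in file order; a machine of undef stands for the "default" token.
my @entries = (
    { machine => 'ftp.example.com', login => 'alice',     password => 'a-pass' },
    { machine => 'ftp.example.com', login => 'bob',       password => 'b-pass' },
    { machine => undef,             login => 'anonymous', password => 'user@site' },
);

# pick_entry is a hypothetical stand-in for the selection rules described above.
sub pick_entry {
    my ($entries, $machine, $login) = @_;
    for my $e (@$entries) {
        next unless defined $e->{machine} && $e->{machine} eq $machine;
        return $e if !defined $login;                                 # first entry for MACHINE
        return $e if defined $e->{login} && $e->{login} eq $login;    # entry with the given login
    }
    my ($default) = grep { !defined $_->{machine} } @$entries;
    return $default;    # undef when no default entry exists
}

print pick_entry(\@entries, 'ftp.example.com', 'bob')->{login}, "\n";
print pick_entry(\@entries, 'other.example.org')->{login}, "\n";
```

    The first call selects the entry whose login is bob; the second finds no entry for the machine and falls back to the default entry.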

    METHODS

    • login ()

      Return the login id for the netrc entry.

    • password ()

      Return the password for the netrc entry.

    • account ()

      Return the account information for the netrc entry.

    • lpa ()

      Return a list of login, password and account information for the netrc entry.

    AUTHOR

    Graham Barr <gbarr@pobox.com>

    SEE ALSO

    Net::Netrc Net::Cmd

    COPYRIGHT

    Copyright (c) 1995-1998 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Net::POP3

    NAME

    Net::POP3 - Post Office Protocol 3 Client class (RFC1939)

    SYNOPSIS

      use Net::POP3;

      # Constructors
      $pop = Net::POP3->new('pop3host');
      $pop = Net::POP3->new('pop3host', Timeout => 60);

      if ($pop->login($username, $password) > 0) {
          my $msgnums = $pop->list; # hashref of msgnum => size
          foreach my $msgnum (keys %$msgnums) {
              my $msg = $pop->get($msgnum);
              print @$msg;
              $pop->delete($msgnum);
          }
      }

      $pop->quit;

    DESCRIPTION

    This module implements a client interface to the POP3 protocol, enabling a perl5 application to talk to POP3 servers. This documentation assumes that you are familiar with the POP3 protocol described in RFC1939.

    A new Net::POP3 object must be created with the new method. Once this has been done, all POP3 commands are accessed via method calls on the object.

    CONSTRUCTOR

    • new ( [ HOST ] [, OPTIONS ] )

      This is the constructor for a new Net::POP3 object. HOST is the name of the remote host to which a POP3 connection is required.

      HOST is optional. If HOST is not given then it may instead be passed as the Host option described below. If neither is given then the POP3_Hosts specified in Net::Config will be used.

      OPTIONS are passed in a hash like fashion, using key and value pairs. Possible options are:

      Host - POP3 host to connect to. It may be a single scalar, as defined for the PeerAddr option in IO::Socket::INET, or a reference to an array with hosts to try in turn. The host method will return the value which was used to connect to the host.

      ResvPort - If given then the socket for the Net::POP3 object will be bound to the local port given using bind when the socket is created.

      Timeout - Maximum time, in seconds, to wait for a response from the POP3 server (default: 120)

      Debug - Enable debugging information

    METHODS

    Unless otherwise stated all methods return either a true or false value, with true meaning that the operation was a success. When a method states that it returns a value, failure will be returned as undef or an empty list.

    • auth ( USERNAME, PASSWORD )

      Attempt SASL authentication.

    • user ( USER )

      Send the USER command.

    • pass ( PASS )

      Send the PASS command. Returns the number of messages in the mailbox.

    • login ( [ USER [, PASS ]] )

      Send both the USER and PASS commands. If PASS is not given then Net::POP3 uses Net::Netrc to look up the password using the host and username. If the username is not specified then the current user name will be used.

      Returns the number of messages in the mailbox. However, if there are no messages on the server the string "0E0" will be returned. This will give a true value in a boolean context, but zero in a numeric context.

      If there was an error authenticating the user then undef will be returned.

    • apop ( [ USER [, PASS ]] )

      Authenticate with the server identifying as USER with password PASS. Similar to login, but the password is not sent in clear text.

      To use this method you must have the Digest::MD5 or the MD5 module installed, otherwise this method will return undef.

    • banner ()

      Return the server's connection banner.

    • capa ()

      Return a reference to a hash of the capabilities of the server. APOP is added as a pseudo capability. Note that I've been unable to find a list of the standard capability values, and some appear to be multi-word and some are not. We make an attempt at intelligently parsing them, but it may not be correct.

    • capabilities ()

      Just like capa, but only uses a cache from the last time we asked the server, so as to avoid asking more than once.

    • top ( MSGNUM [, NUMLINES ] )

      Get the header and the first NUMLINES of the body for the message MSGNUM. Returns a reference to an array which contains the lines of text read from the server.

    • list ( [ MSGNUM ] )

      If called with an argument, list returns the size of the message in octets.

      If called without arguments, a reference to a hash is returned. The keys will be the MSGNUMs of all undeleted messages and the values will be their size in octets.

    • get ( MSGNUM [, FH ] )

      Get the message MSGNUM from the remote mailbox. If FH is not given then get returns a reference to an array which contains the lines of text read from the server. If FH is given then the lines returned from the server are printed to the filehandle FH.

    • getfh ( MSGNUM )

      As per get(), but returns a tied filehandle. Reading from this filehandle returns the requested message. The filehandle will return EOF at the end of the message and should not be reused.

    • last ()

      Returns the highest MSGNUM of all the messages accessed.

    • popstat ()

      Returns a list of two elements. These are the number of undeleted messages and the size of the mbox in octets.

    • ping ( USER )

      Returns a list of two elements. These are the number of new messages and the total number of messages for USER.

    • uidl ( [ MSGNUM ] )

      Returns a unique identifier for MSGNUM if given. If MSGNUM is not given uidl returns a reference to a hash where the keys are the message numbers and the values are the unique identifiers.

    • delete ( MSGNUM )

      Mark message MSGNUM to be deleted from the remote mailbox. All messages that are marked to be deleted will be removed from the remote mailbox when the server connection is closed.

    • reset ()

      Reset the status of the remote POP3 server. This includes resetting the status of all messages to not be deleted.

    • quit ()

      Quit and close the connection to the remote POP3 server. Any messages marked as deleted will be deleted from the remote mailbox.
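
    The return convention of login described above (undef on error, the string "0E0" for an empty mailbox, otherwise a message count) can be handled as in the following sketch. No server is contacted: the three values are stand-ins for what login might return, and describe_login is a hypothetical helper.

```perl
use strict;
use warnings;

# describe_login is a hypothetical helper; pass it whatever login() returned.
sub describe_login {
    my ($ret) = @_;
    return 'authentication failed' unless defined $ret;    # undef signals an error
    return 'mailbox is empty'      if $ret == 0;           # "0E0" is numerically zero
    return "$ret message(s) waiting";
}

# Stand-in values for the three documented outcomes of login().
print describe_login($_), "\n" for (undef, "0E0", 3);

# "0E0" is still a true value, so a plain boolean test sees a successful login.
print "0E0 counts as true\n" if "0E0";
```

    Testing the result with defined rather than with a numeric comparison keeps the error case separate from the empty-mailbox case.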

    NOTES

    If a Net::POP3 object goes out of scope before the quit method is called then the reset method will be called before the connection is closed. This means that any messages marked to be deleted will not be.

    SEE ALSO

    Net::Netrc, Net::Cmd

    AUTHOR

    Graham Barr <gbarr@pobox.com>

    COPYRIGHT

    Copyright (c) 1995-2003 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Net::Ping

    NAME

    Net::Ping - check a remote host for reachability

    SYNOPSIS

      use Net::Ping;

      $p = Net::Ping->new();
      print "$host is alive.\n" if $p->ping($host);
      $p->close();

      $p = Net::Ping->new("icmp");
      $p->bind($my_addr); # Specify source interface of pings
      foreach $host (@host_array)
      {
          print "$host is ";
          print "NOT " unless $p->ping($host, 2);
          print "reachable.\n";
          sleep(1);
      }
      $p->close();

      $p = Net::Ping->new("tcp", 2);
      # Try connecting to the www port instead of the echo port
      $p->port_number(scalar(getservbyname("http", "tcp")));
      while ($stop_time > time())
      {
          print "$host not reachable ", scalar(localtime()), "\n"
              unless $p->ping($host);
          sleep(300);
      }
      undef($p);

      # Like tcp protocol, but with many hosts
      # (scalar() makes getservbyname return just the port number)
      $p = Net::Ping->new("syn");
      $p->port_number(scalar(getservbyname("http", "tcp")));
      foreach $host (@host_array) {
          $p->ping($host);
      }
      while (($host,$rtt,$ip) = $p->ack) {
          print "HOST: $host [$ip] ACKed in $rtt seconds.\n";
      }

      # High precision syntax (requires Time::HiRes)
      $p = Net::Ping->new();
      $p->hires();
      ($ret, $duration, $ip) = $p->ping($host, 5.5);
      printf("$host [ip: $ip] is alive (packet return time: %.2f ms)\n", 1000 * $duration)
          if $ret;
      $p->close();

      # For backward compatibility
      print "$host is alive.\n" if pingecho($host);

    DESCRIPTION

    This module contains methods to test the reachability of remote hosts on a network. A ping object is first created with optional parameters, a variable number of hosts may then be pinged multiple times, and finally the connection is closed.

    You may choose one of six different protocols to use for the ping. The "tcp" protocol is the default. Note that a live remote host may still fail to be pingable by one or more of these protocols. For example, www.microsoft.com is generally alive but not "icmp" pingable.

    With the "tcp" protocol the ping() method attempts to establish a connection to the remote host's echo port. If the connection is successfully established, the remote host is considered reachable. No data is actually echoed. This protocol does not require any special privileges but has higher overhead than the "udp" and "icmp" protocols.
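
    The idea behind the "tcp" protocol (a host counts as reachable if a TCP connection can be established) can be sketched with IO::Socket::INET alone. This is an illustration of the logic described above, not Net::Ping's implementation. tcp_reachable is a hypothetical helper, and the example connects to a listener it creates on the loopback address so that it does not depend on any remote host.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# tcp_reachable is a hypothetical helper: true if a TCP connection to
# $host:$port can be established within $timeout seconds.
sub tcp_reachable {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,
    );
    return 0 unless $sock;
    close $sock;    # no data is exchanged; the connection itself is the test
    return 1;
}

# A local listener on an ephemeral port stands in for the remote service.
my $listener = IO::Socket::INET->new(
    Listen    => 1,
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
) or die "cannot create listener: $!";

my $port = $listener->sockport;
print "127.0.0.1 port $port is ",
    (tcp_reachable('127.0.0.1', $port, 2) ? "reachable" : "NOT reachable"), "\n";
```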

    Specifying the "udp" protocol causes the ping() method to send a udp packet to the remote host's echo port. If the echoed packet is received from the remote host and the received packet contains the same data as the packet that was sent, the remote host is considered reachable. This protocol does not require any special privileges. It should be borne in mind that, for a udp ping, a host will be reported as unreachable if it is not running the appropriate echo service. For Unix-like systems see inetd(8) for more information.

    If the "icmp" protocol is specified, the ping() method sends an icmp echo message to the remote host, which is what the UNIX ping program does. If the echoed message is received from the remote host and the echoed information is correct, the remote host is considered reachable. Specifying the "icmp" protocol requires that the program be run as root or that the program be setuid to root.

    If the "external" protocol is specified, the ping() method attempts to use the Net::Ping::External module to ping the remote host. Net::Ping::External interfaces with your system's default ping utility to perform the ping, and generally produces relatively accurate results. If Net::Ping::External is not installed on your system, specifying the "external" protocol will result in an error.

    If the "syn" protocol is specified, the ping() method will only send a TCP SYN packet to the remote host and then return immediately. If the SYN packet was sent successfully, it returns a true value; otherwise it returns false. NOTE: Unlike the other protocols, the return value does NOT determine whether the remote host is alive, since the full TCP three-way handshake may not have completed yet. The remote host is only considered reachable if a TCP ACK is received from it within the timeout specified. To begin waiting for the ACK packets, use the ack() method as explained below. Use the "syn" protocol instead of the "tcp" protocol to determine reachability of multiple destinations simultaneously by sending parallel TCP SYN packets. It will not block while testing each remote host. demo/fping is provided in this distribution as an example of the "syn" protocol. This protocol does not require any special privileges.

    Functions

    • Net::Ping->new([$proto [, $def_timeout [, $bytes [, $device [, $tos [, $ttl ]]]]]]);

      Create a new ping object. All of the parameters are optional. $proto specifies the protocol to use when doing a ping. The current choices are "tcp", "udp", "icmp", "stream", "syn", or "external". The default is "tcp".

      If a default timeout ($def_timeout) in seconds is provided, it is used when a timeout is not given to the ping() method (below). The timeout must be greater than 0 and the default, if not specified, is 5 seconds.

      If the number of data bytes ($bytes) is given, that many data bytes are included in the ping packet sent to the remote host. The number of data bytes is ignored if the protocol is "tcp". The minimum (and default) number of data bytes is 1 if the protocol is "udp" and 0 otherwise. The maximum number of data bytes that can be specified is 1024.

      If $device is given, this device is used to bind the source endpoint before sending the ping packet. I believe this only works with superuser privileges and with udp and icmp protocols at this time.

      If $tos is given, this ToS is configured into the socket.

      For icmp, $ttl can be specified to set the TTL of the outgoing packet.

    • $p->ping($host [, $timeout]);

      Ping the remote host and wait for a response. $host can be either the hostname or the IP number of the remote host. The optional timeout must be greater than 0 seconds and defaults to whatever was specified when the ping object was created. Returns a success flag. If the hostname cannot be found or there is a problem with the IP number, the success flag returned will be undef. Otherwise, the success flag will be 1 if the host is reachable and 0 if it is not. For most practical purposes, undef and 0 can be treated as the same case. In array context, the elapsed time and the string form of the IP address the host resolved to are also returned. The elapsed time value will be a float, as returned by the Time::HiRes::time() function, if hires() has been previously called; otherwise it is returned as an integer.

    • $p->source_verify( { 0 | 1 } );

      Allows source endpoint verification to be enabled or disabled. This is useful for remote destinations with multiple interfaces, where the response may not originate from the same endpoint that the original packet was sent to. This only affects udp and icmp protocol pings.

      This is enabled by default.

    • $p->service_check( { 0 | 1 } );

      Set whether or not the connect behavior should enforce remote service availability as well as reachability. Normally, if the remote server reported ECONNREFUSED, it must have been reachable because of the status packet that it reported. With this option enabled, the full three-way tcp handshake must have been established successfully before the host is reported as reachable. NOTE: It still does nothing more than connect and disconnect. It does not speak any protocol (e.g., HTTP or FTP) to ensure the remote server is sane in any way. The remote server CPU could be grinding to a halt and unresponsive to any clients connecting, but if the kernel throws the ACK packet, it is considered alive anyway. To really determine if the server is responding well would be application specific and is beyond the scope of Net::Ping. For the udp protocol, enabling this option demands that the remote server reply with the same udp data that it was sent, as defined by the udp echo service.

      This affects the "udp", "tcp", and "syn" protocols.

      This is disabled by default.

    • $p->tcp_service_check( { 0 | 1 } );

      Deprecated method, but does the same as service_check() method.

    • $p->hires( { 0 | 1 } );

      Causes this module to use Time::HiRes module, allowing milliseconds to be returned by subsequent calls to ping().

      This is disabled by default.

    • $p->bind($local_addr);

      Sets the source address from which pings will be sent. This must be the address of one of the interfaces on the local host. $local_addr may be specified as a hostname or as a text IP address such as "192.168.1.1".

      If the protocol is set to "tcp", this method may be called any number of times, and each call to the ping() method (below) will use the most recent $local_addr. If the protocol is "icmp" or "udp", then bind() must be called at most once per object, and (if it is called at all) must be called before the first call to ping() for that object.

    • $p->open($host);

      When you are using the "stream" protocol, this call pre-opens the tcp socket. It's only necessary to do this if you want to provide a different timeout when creating the connection, or remove the overhead of establishing the connection from the first ping. If you don't call open(), the connection is automatically opened the first time ping() is called. This call simply does nothing if you are using any protocol other than stream.

    • $p->ack( [ $host ] );

      When using the "syn" protocol, use this method to determine the reachability of the remote host. This method is meant to be called up to as many times as ping() was called. Each call returns the host (as passed to ping()) that came back with the TCP ACK. The order in which the hosts are returned may not necessarily be the same order in which they were SYN queued using the ping() method. If the timeout is reached before the TCP ACK is received, or if the remote host is not listening on the port attempted, then the TCP connection will not be established and ack() will return undef. In list context, the host, the ack time, and the dotted ip string will be returned instead of just the host. If the optional $host argument is specified, the return value will be pertaining to that host only. This call simply does nothing if you are using any protocol other than syn.

    • $p->nack( $failed_ack_host );

      Returns the reason that host $failed_ack_host did not receive a valid ACK. Useful for finding out why ack( $failed_ack_host ) returned a false value.

    • $p->close();

      Close the network connection for this ping object. The network connection is also closed by "undef $p". The network connection is automatically closed if the ping object goes out of scope (e.g. $p is local to a subroutine and you leave the subroutine).

    • $p->port_number([$port_number])

      When called with a port number, the port number used to ping is set to $port_number rather than using the echo port. It also has the effect of calling $p->service_check(1) causing a ping to return a successful response only if that specific port is accessible. This function returns the value of the port that ping() will connect to.

    • pingecho($host [, $timeout]);

      To provide backward compatibility with the previous version of Net::Ping, a pingecho() subroutine is available with the same functionality as before. pingecho() uses the tcp protocol. The return values and parameters are the same as described for the ping() method. This subroutine is obsolete and may be removed in a future version of Net::Ping.
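    The ping() return values and the port_number() method described above can be tried without touching the network. The following is a minimal sketch, assuming the loopback interface is available: it opens a throwaway listener on 127.0.0.1 and tcp-pings that port. The helper names ping_status() and ping_local_port() are hypothetical and not part of Net::Ping.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Net::Ping;

# Hypothetical helper: turn the tri-state ping() result into a label.
sub ping_status {
    my ($ret) = @_;
    return 'unresolved' unless defined $ret;    # undef: hostname or IP problem
    return $ret ? 'reachable' : 'unreachable';  # 1 or 0
}

# Hypothetical helper: tcp-ping one port on the loopback interface.
sub ping_local_port {
    my ($port) = @_;
    my $p = Net::Ping->new('tcp', 2);   # tcp protocol, 2 second default timeout
    $p->port_number($port);             # also enables service_check
    my $ret = $p->ping('127.0.0.1');
    $p->close;
    return ping_status($ret);
}

# A throwaway listener on an OS-assigned port stands in for a real service.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 5,
) or die "cannot listen: $!";
my $port = $listener->sockport;

print "port $port while listening: ", ping_local_port($port), "\n";
close $listener;
print "port $port after close: ", ping_local_port($port), "\n";
```

    Because port_number() turns on service_check, the second ping would normally be reported as unreachable once nothing is listening on the port, although the exact result depends on the local system.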

    NOTES

    There will be less network overhead (and some efficiency in your program) if you specify either the udp or the icmp protocol. The tcp protocol will generate 2.5 times or more traffic for each ping than either udp or icmp. If many hosts are pinged frequently, you may wish to implement a small wait (e.g. 25ms or more) between each ping to avoid flooding your network with packets.

    The icmp protocol requires that the program be run as root or that it be setuid to root. The other protocols do not require special privileges, but not all network devices implement tcp or udp echo.

    Local hosts should normally respond to pings within milliseconds. However, on a very congested network it may take up to 3 seconds or longer to receive an echo packet from the remote host. If the timeout is set too low under these conditions, it will appear that the remote host is not reachable (which is almost the truth).

    Reachability doesn't necessarily mean that the remote host is actually functioning beyond its ability to echo packets. tcp is slightly better at indicating the health of a system than icmp because it uses more of the networking stack to respond.

    Because of a lack of anything better, this module uses its own routines to pack and unpack ICMP packets. It would be better for a separate module to be written which understands all of the different kinds of ICMP packets.

    INSTALL

    The latest source tree is available via cvs:

    cvs -z3 -q -d :pserver:anonymous@cvs.roobik.com.:/usr/local/cvsroot/freeware checkout Net-Ping
    cd Net-Ping

    The tarball can be created as follows:

    perl Makefile.PL ; make ; make dist

    The latest Net::Ping release can be found at CPAN:

    $CPAN/modules/by-module/Net/

    1) Extract the tarball

    gtar -zxvf Net-Ping-xxxx.tar.gz
    cd Net-Ping-xxxx

    2) Build:

    make realclean
    perl Makefile.PL
    make
    make test

    3) Install

    make install

    Or install it RPM Style:

    rpm -ta SOURCES/Net-Ping-xxxx.tar.gz
    rpm -ih RPMS/noarch/perl-Net-Ping-xxxx.rpm

    BUGS

    For a list of known issues, visit:

    https://rt.cpan.org/NoAuth/Bugs.html?Dist=Net-Ping

    To report a new bug, visit:

    https://rt.cpan.org/NoAuth/ReportBug.html?Queue=Net-Ping

    AUTHORS

    Current maintainer:
        bbb@cpan.org (Rob Brown)

    External protocol:
        colinm@cpan.org (Colin McMillen)

    Stream protocol:
        bronson@trestle.com (Scott Bronson)

    Original pingecho():
        karrer@bernina.ethz.ch (Andreas Karrer)
        pmarquess@bfsec.bt.co.uk (Paul Marquess)

    Original Net::Ping author:
        mose@ns.ccsn.edu (Russell Mosemann)

    COPYRIGHT

    Copyright (c) 2002-2003, Rob Brown. All rights reserved.

    Copyright (c) 2001, Colin McMillen. All rights reserved.

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

     
    Net::SMTP

    NAME

    Net::SMTP - Simple Mail Transfer Protocol Client

    SYNOPSIS

    use Net::SMTP;

    # Constructors
    $smtp = Net::SMTP->new('mailhost');
    $smtp = Net::SMTP->new('mailhost', Timeout => 60);

    DESCRIPTION

    This module implements a client interface to the SMTP and ESMTP protocol, enabling a perl5 application to talk to SMTP servers. This documentation assumes that you are familiar with the concepts of the SMTP protocol described in RFC821.

    A new Net::SMTP object must be created with the new method. Once this has been done, all SMTP commands are accessed through this object.

    The Net::SMTP class is a subclass of Net::Cmd and IO::Socket::INET.

    EXAMPLES

    This example prints the mail domain name of the SMTP server known as mailhost:

    #!/usr/local/bin/perl -w

    use Net::SMTP;

    $smtp = Net::SMTP->new('mailhost');
    print $smtp->domain,"\n";
    $smtp->quit;

    This example sends a small message to the postmaster at the SMTP server known as mailhost:

    #!/usr/local/bin/perl -w

    use Net::SMTP;

    $smtp = Net::SMTP->new('mailhost');

    $smtp->mail($ENV{USER});
    $smtp->to('postmaster');

    $smtp->data();
    $smtp->datasend("To: postmaster\n");
    $smtp->datasend("\n");
    $smtp->datasend("A simple test message\n");
    $smtp->dataend();

    $smtp->quit;
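
    The data() method documented under METHODS also accepts the whole message as a list or a reference to a list, so the message above could be assembled first and sent in one call. The sketch below only builds the list; build_message() is a hypothetical helper, and no connection is made.

```perl
use strict;
use warnings;

# Hypothetical helper: return the message as a list reference,
# one header or body line per element.
sub build_message {
    my (%arg) = @_;
    return [
        "To: $arg{to}\n",
        "Subject: $arg{subject}\n",
        "\n",                                        # blank line ends the headers
        (map { $_ . "\n" } split /\n/, $arg{body}),  # body, one element per line
    ];
}

my $lines = build_message(
    to      => 'postmaster',
    subject => 'test',
    body    => "A simple test message\nSecond line",
);
print @$lines;

# With a connected $smtp object (not created here), the whole message
# could then be handed over with:  $smtp->data($lines);
```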

    CONSTRUCTOR

    • new ( [ HOST ] [, OPTIONS ] )

      This is the constructor for a new Net::SMTP object. HOST is the name of the remote host to which an SMTP connection is required.

      HOST is optional. If HOST is not given then it may instead be passed as the Host option described below. If neither is given then the SMTP_Hosts specified in Net::Config will be used.

      OPTIONS are passed in a hash like fashion, using key and value pairs. Possible options are:

      Hello - SMTP requires that you identify yourself. This option specifies a string to pass as your mail domain. If not given localhost.localdomain will be used.

      Host - SMTP host to connect to. It may be a single scalar, as defined for the PeerAddr option in IO::Socket::INET, or a reference to an array with hosts to try in turn. The host method will return the value which was used to connect to the host.

      LocalAddr and LocalPort - These parameters are passed directly to IO::Socket to allow binding the socket to a local port.

      Timeout - Maximum time, in seconds, to wait for a response from the SMTP server (default: 120)

      ExactAddresses - If true then all ADDRESS arguments must be as defined by addr-spec in RFC2822. If not given, or false, then Net::SMTP will attempt to extract the address from the value passed.

      Debug - Enable debugging information

      Example:

      $smtp = Net::SMTP->new('mailhost',
          Hello   => 'my.mail.domain',
          Timeout => 30,
          Debug   => 1,
      );

      # the same
      $smtp = Net::SMTP->new(
          Host    => 'mailhost',
          Hello   => 'my.mail.domain',
          Timeout => 30,
          Debug   => 1,
      );

      # Connect to the default server from Net::Config
      $smtp = Net::SMTP->new(
          Hello   => 'my.mail.domain',
          Timeout => 30,
      );

    METHODS

    Unless otherwise stated all methods return either a true or false value, with true meaning that the operation was a success. When a method states that it returns a value, failure will be returned as undef or an empty list.

    • banner ()

      Returns the banner message which the server replied with when the initial connection was made.

    • domain ()

      Returns the domain that the remote SMTP server identified itself as during connection.

    • hello ( DOMAIN )

      Tell the remote server the mail domain which you are in using the EHLO command (or HELO if EHLO fails). Since this method is invoked automatically when the Net::SMTP object is constructed the user should normally not have to call it manually.

    • host ()

      Returns the value used by the constructor, and passed to IO::Socket::INET, to connect to the host.

    • etrn ( DOMAIN )

      Request a queue run for the DOMAIN given.

    • auth ( USERNAME, PASSWORD )

      Attempt SASL authentication.

    • mail ( ADDRESS [, OPTIONS] )
    • send ( ADDRESS )
    • send_or_mail ( ADDRESS )
    • send_and_mail ( ADDRESS )

      Send the appropriate command to the server MAIL, SEND, SOML or SAML. ADDRESS is the address of the sender. This initiates the sending of a message. The method recipient should be called for each address that the message is to be sent to.

      The mail method can accept some additional ESMTP OPTIONS, which are passed in hash-like fashion, using key and value pairs. Possible options are:

      Size        => <bytes>
      Return      => "FULL" | "HDRS"
      Bits        => "7" | "8" | "binary"
      Transaction => <ADDRESS>
      Envelope    => <ENVID>     # xtext-encodes its argument
      ENVID       => <ENVID>     # similar to Envelope, but expects argument encoded
      XVERP       => 1
      AUTH        => <submitter> # encoded address according to RFC 2554

      The Return and Envelope parameters are used for DSN (Delivery Status Notification).

      The submitter address in AUTH option is expected to be in a format as required by RFC 2554, in an RFC2821-quoted form and xtext-encoded, or <> .

    • reset ()

      Reset the status of the server. This may be called after a message has been initiated, but before any data has been sent, to cancel the sending of the message.

    • recipient ( ADDRESS [, ADDRESS, [...]] [, OPTIONS ] )

      Notify the server that the current message should be sent to all of the addresses given. Each address is sent as a separate command to the server. Should the sending of any address result in a failure then the process is aborted and a false value is returned. It is up to the user to call reset if they so desire.

      The recipient method can also pass additional case-sensitive OPTIONS as an anonymous hash using key and value pairs. Possible options are:

      Notify  => ['NEVER'] or ['SUCCESS','FAILURE','DELAY'] (see below)
      ORcpt   => <ORCPT>
      SkipBad => 1 (to ignore bad addresses)

      If SkipBad is true the recipient will not return an error when a bad address is encountered and it will return an array of addresses that did succeed.

      $smtp->recipient($recipient1,$recipient2); # Good
      $smtp->recipient($recipient1,$recipient2, { SkipBad => 1 }); # Good
      $smtp->recipient($recipient1,$recipient2, { Notify => ['FAILURE','DELAY'], SkipBad => 1 }); # Good
      @goodrecips=$smtp->recipient(@recipients, { Notify => ['FAILURE'], SkipBad => 1 }); # Good
      $smtp->recipient("$recipient,$recipient2"); # BAD

      Notify is used to request Delivery Status Notifications (DSNs), but your SMTP/ESMTP service may not respect this request depending upon its version and your site's SMTP configuration.

      Leaving out the Notify option usually defaults an SMTP service to its default behavior equivalent to ['FAILURE'] notifications only, but again this may be dependent upon your site's SMTP configuration.

      The NEVER keyword must appear by itself if used within the Notify option and "requests that a DSN not be returned to the sender under any conditions."

      {Notify => ['NEVER']}

      $smtp->recipient(@recipients, { Notify => ['NEVER'], SkipBad => 1 }); # Good

      You may use any combination of these three values 'SUCCESS','FAILURE','DELAY' in the anonymous array reference as defined by RFC3461 (see http://rfc.net/rfc3461.html for more information. Note: quotations in this topic from same.).

      A Notify parameter of 'SUCCESS' or 'FAILURE' "requests that a DSN be issued on successful delivery or delivery failure, respectively."

      A Notify parameter of 'DELAY' "indicates the sender's willingness to receive delayed DSNs. Delayed DSNs may be issued if delivery of a message has been delayed for an unusual amount of time (as determined by the Message Transfer Agent (MTA) at which the message is delayed), but the final delivery status (whether successful or failure) cannot be determined. The absence of the DELAY keyword in a NOTIFY parameter requests that a "delayed" DSN NOT be issued under any conditions."

      {Notify => ['SUCCESS','FAILURE','DELAY']}

      $smtp->recipient(@recipients, { Notify => ['FAILURE','DELAY'], SkipBad => 1 }); # Good

      ORcpt is also part of the SMTP DSN extension according to RFC3461. It is used to pass along the original recipient that the mail was first sent to. The machine that generates a DSN will use this address to inform the sender, because the sender cannot know whether recipients get rewritten by mail servers. It is expected to be in a format as required by RFC3461, xtext-encoded.

    • to ( ADDRESS [, ADDRESS [...]] )
    • cc ( ADDRESS [, ADDRESS [...]] )
    • bcc ( ADDRESS [, ADDRESS [...]] )

      Synonyms for recipient .

    • data ( [ DATA ] )

      Initiate the sending of the data from the current message.

      DATA may be a list or a reference to a list. If specified, the contents of DATA and a termination string ".\r\n" are sent to the server, and the result will be true if the data was accepted.

      If DATA is not specified then the result will indicate that the server wishes the data to be sent. The data must then be sent using the datasend and dataend methods described in Net::Cmd.

    • expand ( ADDRESS )

      Request the server to expand the given address. Returns an array which contains the text read from the server.

    • verify ( ADDRESS )

      Verify that ADDRESS is a legitimate mailing address.

      Most sites usually disable this feature in their SMTP service configuration. Use "Debug => 1" option under new() to see if disabled.

    • help ( [ $subject ] )

      Request help text from the server. Returns the text or undef upon failure.

    • quit ()

      Send the QUIT command to the remote SMTP server and close the socket connection.
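
    Several of the options above (ORcpt for recipient, ENVID and AUTH for mail) expect an xtext-encoded value. As a minimal sketch of that encoding as RFC 3461 defines it: every byte outside the printable range "!" to "~", plus "+" and "=", is written as "+" followed by two uppercase hex digits. The helper name xtext_encode() is hypothetical, and the input is assumed to be a byte string. The "rfc822;" address-type prefix is part of the ORCPT syntax in RFC 3461; that Net::SMTP passes the ORcpt value through unchanged is an assumption here.

```perl
use strict;
use warnings;

# Hypothetical helper: xtext-encode a byte string per RFC 3461.
# Bytes outside "!".."~", and the characters "+" and "=", become +XX.
sub xtext_encode {
    my ($s) = @_;
    $s =~ s/([^\x21-\x7E]|[+=])/sprintf('+%02X', ord $1)/ge;
    return $s;
}

my $orcpt = 'rfc822;' . xtext_encode('funny user@example.com');
print "$orcpt\n";

# With a connected $smtp object (not created here), this might be used as:
#   $smtp->recipient($address, { ORcpt => $orcpt });
```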

    ADDRESSES

    Net::SMTP attempts to DWIM with addresses that are passed. For example, an application might extract the From: line from an email and pass that to mail(). While this may work, it is not recommended. The application should really use a module like Mail::Address to extract the mail address and pass that.

    If ExactAddresses is passed to the constructor, then addresses should be a valid rfc2821-quoted address, although Net::SMTP will also accept the address surrounded by angle brackets.

    funny user@domain      WRONG
    "funny user"@domain    RIGHT, recommended
    <"funny user"@domain>  OK

    SEE ALSO

    Net::Cmd

    AUTHOR

    Graham Barr <gbarr@pobox.com>

    COPYRIGHT

    Copyright (c) 1995-2004 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Net::Time

    NAME

    Net::Time - time and daytime network client interface

    SYNOPSIS

    use Net::Time qw(inet_time inet_daytime);

    print inet_time(); # use default host from Net::Config
    print inet_time('localhost');
    print inet_time('localhost', 'tcp');

    print inet_daytime(); # use default host from Net::Config
    print inet_daytime('localhost');
    print inet_daytime('localhost', 'tcp');

    DESCRIPTION

    Net::Time provides subroutines that obtain the time on a remote machine.

    • inet_time ( [HOST [, PROTOCOL [, TIMEOUT]]])

      Obtain the time on HOST , or some default host if HOST is not given or not defined, using the protocol as defined in RFC868. The optional argument PROTOCOL should define the protocol to use, either tcp or udp . The result will be a time value in the same units as returned by time() or undef upon failure.

    • inet_daytime ( [HOST [, PROTOCOL [, TIMEOUT]]])

      Obtain the time on HOST , or some default host if HOST is not given or not defined, using the protocol as defined in RFC867. The optional argument PROTOCOL should define the protocol to use, either tcp or udp . The result will be an ASCII string or undef upon failure.
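
    RFC868 servers reply with a 32-bit big-endian count of seconds since 1900-01-01, whereas time() counts from 1970-01-01. The following sketch shows the conversion this implies. It illustrates the wire format only and is not necessarily how Net::Time is implemented; rfc868_to_epoch() and RFC868_OFFSET are hypothetical names.

```perl
use strict;
use warnings;

# Seconds between 1900-01-01 (RFC 868 epoch) and 1970-01-01 (Unix epoch).
use constant RFC868_OFFSET => 2_208_988_800;

# Hypothetical helper: decode a 4-byte RFC 868 reply into Unix time.
# Returns undef if the reply is missing or not exactly 4 bytes long.
sub rfc868_to_epoch {
    my ($reply) = @_;
    return undef unless defined $reply && length($reply) == 4;
    return unpack('N', $reply) - RFC868_OFFSET;
}

# Simulated server reply: one day after the Unix epoch.
my $reply = pack 'N', RFC868_OFFSET + 86_400;
print scalar gmtime(rfc868_to_epoch($reply)), "\n";
```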

    AUTHOR

    Graham Barr <gbarr@pobox.com>

    COPYRIGHT

    Copyright (c) 1995-2004 Graham Barr. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Net::hostent

    NAME

    Net::hostent - by-name interface to Perl's built-in gethost*() functions

    SYNOPSIS

    use Net::hostent;

    DESCRIPTION

    This module's default exports override the core gethostbyname() and gethostbyaddr() functions, replacing them with versions that return "Net::hostent" objects. This object has methods that return the similarly named structure field name from C's hostent structure from netdb.h; namely name, aliases, addrtype, length, and addr_list. The aliases and addr_list methods return array references, the rest scalars. The addr method is equivalent to the zeroth element in the addr_list array reference.

    You may also import all the structure fields directly into your namespace as regular variables using the :FIELDS import tag. (Note that this still overrides your core functions.) Access these fields as variables named with a preceding h_ . Thus, $host_obj->name() corresponds to $h_name if you import the fields. Array references are available as regular array variables, so for example @{ $host_obj->aliases() } would be simply @h_aliases.

    The gethost() function is a simple front-end that forwards a numeric argument to gethostbyaddr() by way of Socket::inet_aton, and the rest to gethostbyname().
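
    The addresses returned by addr and addr_list are packed binary strings, not dotted text, which is why Socket's inet_aton and inet_ntoa appear around them. The following minimal sketch shows the round trip without any host lookup; dotted_addrs() is a hypothetical helper, and the array reference it takes stands in for what addr_list returns.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical helper: render packed addresses (as found in addr_list)
# in dotted-quad form.
sub dotted_addrs {
    my ($addr_list) = @_;            # array reference of packed addresses
    return map { inet_ntoa($_) } @$addr_list;
}

# inet_aton() on a numeric string packs it without a DNS lookup.
my @packed = map { inet_aton($_) } qw(127.0.0.1 10.0.0.1);
printf "each packed address is %d bytes long\n", length($packed[0]);
print join(', ', dotted_addrs(\@packed)), "\n";
```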

    To access this functionality without the core overrides, pass the use statement an empty import list, and then access the functions with their fully qualified names. On the other hand, the built-ins are still available via the CORE:: pseudo-package.

    EXAMPLES

    use Net::hostent;
    use Socket;

    @ARGV = ('netscape.com') unless @ARGV;

    for $host ( @ARGV ) {

        unless ($h = gethost($host)) {
            warn "$0: no such host: $host\n";
            next;
        }

        printf "\n%s is %s%s\n",
            $host,
            lc($h->name) eq lc($host) ? "" : "*really* ",
            $h->name;

        print "\taliases are ", join(", ", @{$h->aliases}), "\n"
            if @{$h->aliases};

        if ( @{$h->addr_list} > 1 ) {
            my $i;
            for $addr ( @{$h->addr_list} ) {
                printf "\taddr #%d is [%s]\n", $i++, inet_ntoa($addr);
            }
        } else {
            printf "\taddress is [%s]\n", inet_ntoa($h->addr);
        }

        if ($h = gethostbyaddr($h->addr)) {
            if (lc($h->name) ne lc($host)) {
                printf "\tThat addr reverses to host %s!\n", $h->name;
                $host = $h->name;
                redo;
            }
        }
    }

    NOTE

    While this class is currently implemented using the Class::Struct module to build a struct-like class, you shouldn't rely upon this.

    AUTHOR

    Tom Christiansen

     
    Net::netent

    NAME

    Net::netent - by-name interface to Perl's built-in getnet*() functions

    SYNOPSIS

    use Net::netent qw(:FIELDS);
    getnetbyname("loopback") or die "bad net";
    printf "%s is %08X\n", $n_name, $n_net;

    use Net::netent;

    $n = getnetbyname("loopback") or die "bad net";
    { # there's gotta be a better way, eh?
        @bytes = unpack("C4", pack("N", $n->net));
        shift @bytes while @bytes && $bytes[0] == 0;
    }
    printf "%s is %08X [%d.%d.%d.%d]\n", $n->name, $n->net, @bytes;

    DESCRIPTION

    This module's default exports override the core getnetbyname() and getnetbyaddr() functions, replacing them with versions that return "Net::netent" objects. This object has methods that return the similarly named structure field name from C's netent structure from netdb.h; namely name, aliases, addrtype, and net. The aliases method returns an array reference, the rest scalars.

    You may also import all the structure fields directly into your namespace as regular variables using the :FIELDS import tag. (Note that this still overrides your core functions.) Access these fields as variables named with a preceding n_ . Thus, $net_obj->name() corresponds to $n_name if you import the fields. Array references are available as regular array variables, so for example @{ $net_obj->aliases() } would be simply @n_aliases.

    The getnet() function is a simple front-end that forwards a numeric argument to getnetbyaddr(), and the rest to getnetbyname().

    To access this functionality without the core overrides, pass the use statement an empty import list, and then access the functions with their fully qualified names. On the other hand, the built-ins are still available via the CORE:: pseudo-package.

    EXAMPLES

    The getnet() functions do this in the Perl core:

    sv_setiv(sv, (I32)nent->n_net);

    The gethost() functions do this in the Perl core:

    sv_setpvn(sv, hent->h_addr, len);

    That means that the address comes back in binary for the host functions, and as a regular perl integer for the net ones. This seems a bug, but here's how to deal with it:

    use strict;
    use Socket;
    use Net::netent;

    @ARGV = ('loopback') unless @ARGV;

    my($n, $net);

    for $net ( @ARGV ) {

        unless ($n = getnetbyname($net)) {
            warn "$0: no such net: $net\n";
            next;
        }

        printf "\n%s is %s%s\n",
            $net,
            lc($n->name) eq lc($net) ? "" : "*really* ",
            $n->name;

        print "\taliases are ", join(", ", @{$n->aliases}), "\n"
            if @{$n->aliases};

        # this is stupid; first, why is this not in binary?
        # second, why am i going through these convolutions
        # to make it look right
        {
            my @a = unpack("C4", pack("N", $n->net));
            shift @a while @a && $a[0] == 0;
            printf "\taddr is %s [%d.%d.%d.%d]\n", $n->net, @a;
        }

        if ($n = getnetbyaddr($n->net)) {
            if (lc($n->name) ne lc($net)) {
                printf "\tThat addr reverses to net %s!\n", $n->name;
                $net = $n->name;
                redo;
            }
        }
    }
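
    The pack/unpack step used above can be isolated into a small function. This is a minimal sketch of the same idea, assuming the network number is a plain integer as the net method returns it; net_to_dotted() is a hypothetical name. Unlike the loop above, it keeps at least one byte so that a network number of 0 still prints as "0".

```perl
use strict;
use warnings;

# Hypothetical helper: render an integer network number in dotted form,
# dropping leading zero bytes but always keeping at least one byte.
sub net_to_dotted {
    my ($net) = @_;
    my @bytes = unpack 'C4', pack 'N', $net;
    shift @bytes while @bytes > 1 && $bytes[0] == 0;
    return join '.', @bytes;
}

print net_to_dotted(127), "\n";        # prints 127 (the loopback network)
print net_to_dotted(0xC0A801), "\n";   # prints 192.168.1
```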

    NOTE

    While this class is currently implemented using the Class::Struct module to build a struct-like class, you shouldn't rely upon this.

    AUTHOR

    Tom Christiansen

     
    Net::protoent

    NAME

    Net::protoent - by-name interface to Perl's built-in getproto*() functions

    SYNOPSIS

    use Net::protoent;
    $p = getprotobyname(shift || 'tcp') || die "no proto";
    printf "proto for %s is %d, aliases are %s\n",
        $p->name, $p->proto, "@{$p->aliases}";

    use Net::protoent qw(:FIELDS);
    getprotobyname(shift || 'tcp') || die "no proto";
    print "proto for $p_name is $p_proto, aliases are @p_aliases\n";

    DESCRIPTION

    This module's default exports override the core getprotoent(), getprotobyname(), and getprotobynumber() functions, replacing them with versions that return "Net::protoent" objects. This object has methods that return the similarly named structure field name from C's protoent structure from netdb.h; namely name, aliases, and proto. The aliases method returns an array reference, the rest scalars.

    You may also import all the structure fields directly into your namespace as regular variables using the :FIELDS import tag. (Note that this still overrides your core functions.) Access these fields as variables named with a preceding p_ . Thus, $proto_obj->name() corresponds to $p_name if you import the fields. Array references are available as regular array variables, so for example @{ $proto_obj->aliases() } would be simply @p_aliases.

    The getproto() function is a simple front-end that forwards a numeric argument to getprotobynumber(), and the rest to getprotobyname().

    To access this functionality without the core overrides, pass the use statement an empty import list, and then access the functions by their fully qualified names. Conversely, when the overrides are in effect, the built-ins are still available via the CORE:: pseudo-package.
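    As a minimal sketch of the empty-import-list usage, the object interface can be reached through its fully qualified name while the unqualified call remains the core built-in. It assumes the system's protocols database has an entry for tcp, and tolerates its absence:

```perl
use strict;
use warnings;
use Net::protoent ();   # empty import list: core functions are not overridden

# Object interface, reached through the fully qualified name.
my $p = Net::protoent::getprotobyname('tcp');
if ($p) {
    printf "object: %s is protocol %d\n", $p->name, $p->proto;
}
else {
    warn "no entry for tcp in the protocols database\n";
}

# The unqualified call is still the core built-in, which returns a list.
my ($name, $aliases, $number) = getprotobyname('tcp');
print "core list: $name is protocol $number\n" if defined $name;
```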

    NOTE

    While this class is currently implemented using the Class::Struct module to build a struct-like class, you shouldn't rely upon this.

    AUTHOR

    Tom Christiansen

     
    Net::servent

    NAME

    Net::servent - by-name interface to Perl's built-in getserv*() functions

    SYNOPSIS

        use Net::servent;
        $s = getservbyname(shift || 'ftp') || die "no service";
        printf "port for %s is %s, aliases are %s\n",
            $s->name, $s->port, "@{$s->aliases}";

        use Net::servent qw(:FIELDS);
        getservbyname(shift || 'ftp') || die "no service";
        print "port for $s_name is $s_port, aliases are @s_aliases\n";

    DESCRIPTION

    This module's default exports override the core getservent(), getservbyname(), and getservbyport() functions, replacing them with versions that return "Net::servent" objects. getservbyname() and getservbyport() take a default second argument of "tcp". These objects have methods that return the similarly named fields of C's servent structure from netdb.h; namely name, aliases, port, and proto. The aliases method returns an array reference, the rest scalars.

    You may also import all the structure fields directly into your namespace as regular variables using the :FIELDS import tag. (Note that this still overrides your core functions.) Access these fields as variables named with a preceding s_ . Thus, $serv_obj->name() corresponds to $s_name if you import the fields. Array references are available as regular array variables, so for example @{ $serv_obj->aliases()} would be simply @s_aliases.

    The getserv() function is a simple front-end that forwards a numeric argument to getservbyport(), and the rest to getservbyname().
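    The dispatch rule can be sketched in a few lines. The helper lookup_for() below is hypothetical and is not part of Net::servent; it only names the lookup that a getserv()-style front-end would choose for a given first argument:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Net::servent: an all-digits argument
# is treated as a port number, anything else as a service name.
sub lookup_for {
    my ($service) = @_;
    return $service =~ /^\d+$/ ? 'getservbyport' : 'getservbyname';
}

for my $arg (qw(21 ftp 8080 smtp)) {
    printf "%-5s -> %s\n", $arg, lookup_for($arg);
}
```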

    To access this functionality without the core overrides, pass the use statement an empty import list, and then access the functions by their fully qualified names. Conversely, when the overrides are in effect, the built-ins are still available via the CORE:: pseudo-package.

    EXAMPLES

        use Net::servent qw(:FIELDS);

        while (@ARGV) {
            # "name/proto" on the command line; proto defaults to tcp
            my ($service, $proto) = ((split m!/!, shift), 'tcp');
            my $valet = getserv($service, $proto);
            unless ($valet) {
                warn "$0: No service: $service/$proto\n";
                next;
            }
            printf "service $service/$proto is port %d\n", $valet->port;
            print "aliases are @s_aliases\n" if @s_aliases;
        }

    NOTE

    While this class is currently implemented using the Class::Struct module to build a struct-like class, you shouldn't rely upon this.

    AUTHOR

    Tom Christiansen

     
    Module::Build

    NAME

    Module::Build - Build and install Perl modules

    SYNOPSIS

    Standard process for building & installing modules:

        perl Build.PL
        ./Build
        ./Build test
        ./Build install

    Or, if you're on a platform (like DOS or Windows) that doesn't require the "./" notation, you can do this:

        perl Build.PL
        Build
        Build test
        Build install

    DESCRIPTION

    Module::Build is a system for building, testing, and installing Perl modules. It is meant to be an alternative to ExtUtils::MakeMaker . Developers may alter the behavior of the module through subclassing in a much more straightforward way than with MakeMaker . It also does not require a make on your system - most of the Module::Build code is pure-perl and written in a very cross-platform way. In fact, you don't even need a shell, so even platforms like MacOS (traditional) can use it fairly easily. Its only prerequisites are modules that are included with perl 5.6.0, and it works fine on perl 5.005 if you can install a few additional modules.

    See MOTIVATIONS for more comparisons between ExtUtils::MakeMaker and Module::Build .

    To install Module::Build , and any other module that uses Module::Build for its installation process, do the following:

        perl Build.PL       # 'Build.PL' script creates the 'Build' script
        ./Build             # Need ./ to ensure we're using this "Build" script
        ./Build test        # and not another one that happens to be in the PATH
        ./Build install

    This illustrates initial configuration and the running of three 'actions'. In this case the actions run are 'build' (the default action), 'test', and 'install'. Other actions defined so far include:

        build                          manifest
        clean                          manifest_skip
        code                           manpages
        config_data                    pardist
        diff                           ppd
        dist                           ppmdist
        distcheck                      prereq_data
        distclean                      prereq_report
        distdir                        pure_install
        distinstall                    realclean
        distmeta                       retest
        distsign                       skipcheck
        disttest                       test
        docs                           testall
        fakeinstall                    testcover
        help                           testdb
        html                           testpod
        install                        testpodcoverage
        installdeps                    versioninstall

    You can run the 'help' action for a complete list of actions.

    GUIDE TO DOCUMENTATION

    The documentation for Module::Build is broken up into sections:

    • General Usage (Module::Build)

      This is the document you are currently reading. It describes basic usage and background information. Its main purpose is to assist the user who wants to learn how to invoke and control Module::Build scripts at the command line.

    • Authoring Reference (Module::Build::Authoring)

      This document describes the structure and organization of Module::Build , and the relevant concepts needed by authors who are writing Build.PL scripts for a distribution or controlling Module::Build processes programmatically.

    • API Reference (Module::Build::API)

      This is a reference to the Module::Build API.

    • Cookbook (Module::Build::Cookbook)

      This document demonstrates how to accomplish many common tasks. It covers general command line usage and authoring of Build.PL scripts. Includes working examples.

    ACTIONS

    There are some general principles at work here. First, each task when building a module is called an "action". These actions are listed above; they correspond to the building, testing, installing, packaging, etc., tasks.

    Second, arguments are processed in a very systematic way. Arguments are always key=value pairs. They may be specified at perl Build.PL time (e.g. perl Build.PL destdir=/my/secret/place), in which case their values last for the lifetime of the Build script. They may also be specified when executing a particular action (e.g. Build test verbose=1), in which case their values last only for the lifetime of that command. Per-action command line parameters take precedence over parameters specified at perl Build.PL time.
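    The precedence rule can be pictured as a hash merge in which the per-action values are applied last. This is only an illustrative sketch, not Module::Build's own argument handling; the helper parse_pairs() and the sample values are assumptions:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a list of key=value strings into a hash.
sub parse_pairs {
    my %args;
    for my $pair (@_) {
        my ($key, $value) = split /=/, $pair, 2;
        $args{$key} = $value;
    }
    return %args;
}

my %build_pl_time = parse_pairs('destdir=/my/secret/place', 'verbose=0');
my %per_action    = parse_pairs('verbose=1');

# Later entries in a hash list win, so per-action values override.
my %effective = (%build_pl_time, %per_action);

print "$_=$effective{$_}\n" for sort keys %effective;
```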

    The build process also relies heavily on the Config.pm module. If the user wishes to override any of the values in Config.pm , she may specify them like so:

        perl Build.PL --config cc=gcc --config ld=gcc

    The following build actions are provided by default.

    • build

      [version 0.01]

      If you run the Build script without any arguments, it runs the build action, which in turn runs the code and docs actions.

      This is analogous to the MakeMaker make all target.

    • clean

      [version 0.01]

      This action will clean up any files that the build process may have created, including the blib/ directory (but not including the _build/ directory and the Build script itself).

    • code

      [version 0.20]

      This action builds your code base.

      By default it just creates a blib/ directory and copies any .pm and .pod files from your lib/ directory into the blib/ directory. It also compiles any .xs files from lib/ and places them in blib/ . Of course, you need a working C compiler (probably the same one that built perl itself) for the compilation to work properly.

      The code action also runs any .PL files in your lib/ directory. Typically these create other files, named the same but without the .PL ending. For example, a file lib/Foo/Bar.pm.PL could create the file lib/Foo/Bar.pm. The .PL files are processed first, so any .pm files (or other kinds that we deal with) will get copied correctly.

    • config_data

      [version 0.26]

      ...

    • diff

      [version 0.14]

      This action will compare the files about to be installed with their installed counterparts. For .pm and .pod files, a diff will be shown (this currently requires a 'diff' program to be in your PATH). For other files like compiled binary files, we simply report whether they differ.

      A flags parameter may be passed to the action, which will be passed to the 'diff' program. Consult your 'diff' documentation for the parameters it will accept - a good one is -u :

          ./Build diff flags=-u

    • dist

      [version 0.02]

      This action is helpful for module authors who want to package up their module for source distribution through a medium like CPAN. It will create a tarball of the files listed in MANIFEST and compress the tarball using GZIP compression.

      By default, this action will use the Archive::Tar module. However, you can force it to use binary "tar" and "gzip" executables by supplying an explicit tar (and optional gzip ) parameter:

          ./Build dist --tar C:\path\to\tar.exe --gzip C:\path\to\zip.exe

    • distcheck

      [version 0.05]

      Reports which files are in the build directory but not in the MANIFEST file, and vice versa. (See manifest for details.)

    • distclean

      [version 0.05]

      Performs the 'realclean' action and then the 'distcheck' action.

    • distdir

      [version 0.05]

      Creates a "distribution directory" named $dist_name-$dist_version (if that directory already exists, it will be removed first), then copies all the files listed in the MANIFEST file to that directory. This directory is what the distribution tarball is created from.

    • distinstall

      [version 0.37]

      Performs the 'distdir' action, then switches into that directory and runs a perl Build.PL , followed by the 'build' and 'install' actions in that directory. Use PERL_MB_OPT or .modulebuildrc to set options that should be applied during subprocesses.

    • distmeta

      [version 0.21]

      Creates the META.yml file that describes the distribution.

      META.yml is a file containing various bits of metadata about the distribution. The metadata includes the distribution name, version, abstract, prerequisites, license, and various other data about the distribution. This file is created as META.yml in a simplified YAML format.

      The META.yml file must also be listed in MANIFEST - if it's not, a warning will be issued.

      The current version of the META.yml specification can be found on CPAN as CPAN::Meta::Spec.

    • distsign

      [version 0.16]

      Uses Module::Signature to create a SIGNATURE file for your distribution, and adds the SIGNATURE file to the distribution's MANIFEST.

    • disttest

      [version 0.05]

      Performs the 'distdir' action, then switches into that directory and runs a perl Build.PL , followed by the 'build' and 'test' actions in that directory. Use PERL_MB_OPT or .modulebuildrc to set options that should be applied during subprocesses.

    • docs

      [version 0.20]

      This will generate documentation (e.g. Unix man pages and HTML documents) for any installable items under blib/ that contain POD. If there are no bindoc or libdoc installation targets defined (as will be the case on systems that don't support Unix manpages) no action is taken for manpages. If there are no binhtml or libhtml installation targets defined no action is taken for HTML documents.

    • fakeinstall

      [version 0.02]

      This is just like the install action, but it won't actually do anything, it will just report what it would have done if you had actually run the install action.

    • help

      [version 0.03]

      This action will simply print out a message that is meant to help you use the build process. It will show you a list of available build actions too.

      With an optional argument specifying an action name (e.g. Build help test ), the 'help' action will show you any POD documentation it can find for that action.

    • html

      [version 0.26]

      This will generate HTML documentation for any binary or library files under blib/ that contain POD. The HTML documentation will only be installed if the install paths can be determined from values in Config.pm . You can also supply or override install paths on the command line by specifying install_path values for the binhtml and/or libhtml installation targets.

    • install

      [version 0.01]

      This action will use ExtUtils::Install to install the files from blib/ into the system. See INSTALL PATHS for details about how Module::Build determines where to install things, and how to influence this process.

      If you want the installation process to look around in @INC for other versions of the stuff you're installing and try to delete it, you can use the uninst parameter, which tells ExtUtils::Install to do so:

          ./Build install uninst=1

      This can be a good idea, as it helps prevent multiple versions of a module from being present on your system, which can be a confusing situation indeed.

    • installdeps

      [version 0.36]

      This action will use the cpan_client parameter as a command to install missing prerequisites. You will be prompted whether to install optional dependencies.

      The cpan_client option defaults to 'cpan' but can be set as an option or in .modulebuildrc. It must be a shell command that takes a list of modules to install as arguments (e.g. 'cpanp -i' for CPANPLUS). If the program part is a relative path (e.g. 'cpan' or 'cpanp'), it will be located relative to the perl program that executed Build.PL.

          /opt/perl/5.8.9/bin/perl Build.PL
          ./Build installdeps --cpan_client 'cpanp -i'
          # installs to 5.8.9

    • manifest

      [version 0.05]

      This is an action intended for use by module authors, not people installing modules. It will bring the MANIFEST up to date with the files currently present in the distribution. You may use a MANIFEST.SKIP file to exclude certain files or directories from inclusion in the MANIFEST. MANIFEST.SKIP should contain a bunch of regular expressions, one per line. If a file in the distribution directory matches any of the regular expressions, it won't be included in the MANIFEST.

      The following is a reasonable MANIFEST.SKIP starting point; you can add your own entries to it:

          ^_build
          ^Build$
          ^blib
          ~$
          \.bak$
          ^MANIFEST\.SKIP$
          CVS

      See the distcheck and skipcheck actions if you want to find out what the manifest action would do, without actually doing anything.
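      To illustrate how such patterns exclude files, here is a small sketch that applies the starting-point expressions above to a made-up file list. The file names are hypothetical, and the loop only mimics the matching rule described here (any match means the file is skipped); it is not the code the manifest action runs:

```perl
use strict;
use warnings;

# Patterns from the starting-point MANIFEST.SKIP, one regex per line.
my @skip = map { qr/$_/ }
    ('^_build', '^Build$', '^blib', '~$', '\.bak$', '^MANIFEST\.SKIP$', 'CVS');

# Hypothetical distribution contents, relative to the distribution root.
my @files = qw(
    Build Build.PL lib/Foo.pm lib/Foo.pm~ _build/notes
    blib/lib/Foo.pm t/basic.t old.bak MANIFEST.SKIP
);

my @kept;
for my $file (@files) {
    next if grep { $file =~ $_ } @skip;   # any matching pattern excludes it
    push @kept, $file;
}

print "$_\n" for @kept;
```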

    • manifest_skip

      [version 0.3608]

      This is an action intended for use by module authors, not people installing modules. It will generate a boilerplate MANIFEST.SKIP file if one does not already exist.

    • manpages

      [version 0.28]

      This will generate man pages for any binary or library files under blib/ that contain POD. The man pages will only be installed if the install paths can be determined from values in Config.pm . You can also supply or override install paths by specifying their values on the command line with the bindoc and libdoc installation targets.

    • pardist

      [version 0.2806]

      Generates a PAR binary distribution for use with PAR or PAR::Dist.

      It requires that the PAR::Dist module (version 0.17 and up) is installed on your system.

    • ppd

      [version 0.20]

      Build a PPD file for your distribution.

      This action takes an optional argument codebase which is used in the generated PPD file to specify the (usually relative) URL of the distribution. By default, this value is the distribution name without any path information.

      Example:

          ./Build ppd --codebase "MSWin32-x86-multi-thread/Module-Build-0.21.tar.gz"

    • ppmdist

      [version 0.23]

      Generates a PPM binary distribution and a PPD description file. This action also invokes the ppd action, so it can accept the same codebase argument described under that action.

      This uses the same mechanism as the dist action to tar & zip its output, so you can supply tar and/or gzip parameters to affect the result.

    • prereq_data

      [version 0.32]

      This action prints out a Perl data structure of all prerequisites and the versions required. The output can be loaded again using eval(). This can be useful for external tools that wish to query a Build script for prerequisites.

    • prereq_report

      [version 0.28]

      This action prints out a list of all prerequisites, the versions required, and the versions actually installed. This can be useful for reviewing the configuration of your system prior to a build, or when compiling data to send for a bug report.

    • pure_install

      [version 0.28]

      This action is identical to the install action. In the future, though, when install starts writing to the file $(INSTALLARCHLIB)/perllocal.pod, pure_install won't, and that will be the only difference between them.

    • realclean

      [version 0.01]

      This action is just like the clean action, but also removes the _build directory and the Build script. If you run the realclean action, you are essentially starting over, so you will have to re-create the Build script again.

    • retest

      [version 0.2806]

      This is just like the test action, but doesn't actually build the distribution first, and doesn't add blib/ to the load path, and therefore will test against a previously installed version of the distribution. This can be used to verify that a certain installed distribution still works, or to see whether newer versions of a distribution still pass the old regression tests, and so on.

    • skipcheck

      [version 0.05]

      Reports which files are skipped due to the entries in the MANIFEST.SKIP file. (See manifest for details.)

    • test

      [version 0.01]

      This will use Test::Harness or TAP::Harness to run any regression tests and report their results. Tests can be defined in the standard places: a file called test.pl in the top-level directory, or several files ending with .t in a t/ directory.

      If you want tests to be 'verbose', i.e. show details of test execution rather than just summary information, pass the argument verbose=1 .

      If you want to run tests under the perl debugger, pass the argument debugger=1 .

      If you want to have Module::Build find test files with different file name extensions, pass the test_file_exts argument with an array of extensions, such as [qw( .t .s .z )] .

      If you want tests to be run by TAP::Harness , rather than Test::Harness , pass the argument tap_harness_args as an array reference of arguments to pass to the TAP::Harness constructor.

      In addition, if a file called visual.pl exists in the top-level directory, this file will be executed as a Perl script and its output will be shown to the user. This is a good place to put speed tests or other tests that don't use the Test::Harness format for output.

      To override the choice of tests to run, you may pass a test_files argument whose value is a whitespace-separated list of test scripts to run. This is especially useful in development, when you only want to run a single test to see whether you've squashed a certain bug yet:

          ./Build test --test_files t/something_failing.t

      You may also pass several test_files arguments separately:

          ./Build test --test_files t/one.t --test_files t/two.t

      or use a glob()-style pattern:

          ./Build test --test_files 't/01-*.t'

    • testall

      [version 0.2807]

      [Note: the 'testall' action and the code snippets below are currently in alpha stage, see http://www.nntp.perl.org/group/perl.module.build/2007/03/msg584.html ]

      Runs the test action plus each of the test$type actions defined by the keys of the test_types parameter.

      Currently, you need to define the ACTION_test$type method yourself and enumerate them in the test_types parameter.

          my $mb = Module::Build->subclass(
            code => q(
              sub ACTION_testspecial { shift->generic_test(type => 'special'); }
              sub ACTION_testauthor  { shift->generic_test(type => 'author'); }
            )
          )->new(
            ...
            test_types  => {
              special => '.st',
              author  => ['.at', '.pt' ],
            },
            ...

    • testcover

      [version 0.26]

      Runs the test action using Devel::Cover , generating a code-coverage report showing which parts of the code were actually exercised during the tests.

      To pass options to Devel::Cover , set the $DEVEL_COVER_OPTIONS environment variable:

          DEVEL_COVER_OPTIONS=-ignore,Build ./Build testcover

    • testdb

      [version 0.05]

      This is a synonym for the 'test' action with the debugger=1 argument.

    • testpod

      [version 0.25]

      This checks all the files described in the docs action and produces Test::Harness -style output. If you are a module author, this is useful to run before creating a new release.

    • testpodcoverage

      [version 0.28]

      This checks the pod coverage of the distribution and produces Test::Harness -style output. If you are a module author, this is useful to run before creating a new release.

    • versioninstall

      [version 0.16]

      ** Note: since only.pm is so new, and since we just recently added support for it here too, this feature is to be considered experimental. **

      If you have the only.pm module installed on your system, you can use this action to install a module into the version-specific library trees. This means that you can have several versions of the same module installed and use a specific one like this:

          use only MyModule => 0.55;

      To override the default installation libraries in only::config , specify the versionlib parameter when you run the Build.PL script:

          perl Build.PL --versionlib /my/version/place/

      To override which version the module is installed as, specify the version parameter when you run the Build.PL script:

          perl Build.PL --version 0.50

      See the only.pm documentation for more information on version-specific installs.

    OPTIONS

    Command Line Options

    The following options can be used during any invocation of Build.PL or the Build script, during any action. For information on other options specific to an action, see the documentation for the respective action.

    NOTE: There is some preliminary support for options to use the more familiar long option style. Most options can be preceded with the -- long option prefix, and the underscores changed to dashes (e.g. --use-rcfile ). Additionally, the argument to boolean options is optional, and boolean options can be negated by prefixing them with no or no- (e.g. --noverbose or --no-verbose ).
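    The long-option conventions in the note above can be sketched as a small normalizer. The helper normalize_option() and its list of boolean options are illustrative assumptions, not Module::Build's own option parser:

```perl
use strict;
use warnings;

# Boolean options named in this section; the list is illustrative only.
my %is_boolean = map { $_ => 1 } qw(quiet verbose use_rcfile debug);

# Hypothetical helper: map a long-style boolean option to (name, value).
# Returns an empty list for anything it does not recognize.
sub normalize_option {
    my ($option) = @_;
    (my $name = $option) =~ s/^--//;
    $name =~ tr/-/_/;                        # dashes back to underscores
    return ($name, 1) if $is_boolean{$name};
    if ($name =~ /^no_?(.+)$/ && $is_boolean{$1}) {
        return ($1, 0);                      # negated boolean
    }
    return;
}

for my $opt (qw(--use-rcfile --verbose --noverbose --no-verbose)) {
    my ($name, $value) = normalize_option($opt);
    print "$opt -> $name=$value\n";
}
```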

    • quiet

      Suppress informative messages on output.

    • verbose

      Display extra information about the Build on output. verbose will turn off quiet.

    • cpan_client

      Sets the cpan_client command for use with the installdeps action. See installdeps for more details.

    • use_rcfile

      Load the ~/.modulebuildrc option file. This option can be set to false to prevent the custom resource file from being loaded.

    • allow_mb_mismatch

      Suppresses the check upon startup that the version of Module::Build we're now running under is the same version that was initially invoked when building the distribution (i.e. when the Build.PL script was first run). As of 0.3601, a mismatch results in a warning instead of a fatal error, so this option effectively just suppresses the warning.

    • debug

      Prints Module::Build debugging information to STDOUT, such as a trace of executed build actions.

    Default Options File (.modulebuildrc)

    [version 0.28]

    When Module::Build starts up, it will look first for a file, $ENV{HOME}/.modulebuildrc. If it's not found there, it will look for a .modulebuildrc file in the directories referred to by the environment variables HOMEDRIVE + HOMEDIR , USERPROFILE , APPDATA , WINDIR , SYS$LOGIN . If the file exists, the options specified there will be used as defaults, as if they were typed on the command line. The defaults can be overridden by specifying new values on the command line.

    The action name must come at the beginning of the line, followed by any amount of whitespace and then the options. Options are given the same as they would be on the command line. They can be separated by any amount of whitespace, including newlines, as long as there is whitespace at the beginning of each continued line. Anything following a hash mark (# ) is considered a comment, and is stripped before parsing. If more than one line begins with the same action name, those lines are merged into one set of options.

    Besides the regular actions, there are two special pseudo-actions: the key * (asterisk) denotes any global options that should be applied to all actions, and the key 'Build_PL' specifies options to be applied when you invoke perl Build.PL .

        *           verbose=1   # global options
        diff        flags=-u
        install     --install_base /home/ken
                    --install_path html=/home/ken/docs/html
        installdeps --cpan_client 'cpanp -i'

    If you wish to locate your resource file in a different location, you can set the environment variable MODULEBUILDRC to the complete absolute path of the file containing your options.
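    The file format described in this section can be sketched as a short parser. This is an illustration of the stated rules only (action name at the start of a line, indented continuation lines, # comments, merging of repeated actions), not the parser Module::Build uses; the word splitting via the core Text::ParseWords module is an assumption:

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

my $rc = <<'END';
*           verbose=1   # global options
diff        flags=-u
install     --install_base /home/ken
            --install_path html=/home/ken/docs/html
installdeps --cpan_client 'cpanp -i'
END

my (%options, $current);
for my $line (split /\n/, $rc) {
    $line =~ s/#.*//;                  # strip comments
    $line =~ s/\s+\z//;                # strip trailing whitespace
    next unless length $line;          # skip blank lines
    if ($line =~ /^(\S+)\s*(.*)/) {    # action name at the start of the line
        $current = $1;
        push @{ $options{$current} }, grep { defined } shellwords($2);
    }
    elsif (defined $current) {         # indented continuation line
        push @{ $options{$current} }, grep { defined } shellwords($line);
    }
}

for my $action (sort keys %options) {
    print "$action: ", join(' | ', @{ $options{$action} }), "\n";
}
```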

    Environment variables

    • MODULEBUILDRC

      [version 0.28]

      Specifies an alternate location for a default options file as described above.

    • PERL_MB_OPT

      [version 0.36]

      Command line options that are applied to Build.PL or any Build action. The string is split as the shell would (e.g. whitespace) and the result is prepended to any actual command-line arguments.
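      A rough sketch of that behavior, using the core Text::ParseWords module as a stand-in for shell-like splitting. The variable names and sample values are hypothetical, and Module::Build's own splitting may differ in detail:

```perl
use strict;
use warnings;
use Text::ParseWords qw(shellwords);

# Hypothetical values standing in for $ENV{PERL_MB_OPT} and @ARGV.
my $perl_mb_opt  = q{--install_base '/home/ken/my perl' --verbose};
my @command_line = ('--debug');

# Split roughly as a shell would, then put the result in front of the
# actual command-line arguments.
my @effective = (shellwords($perl_mb_opt), @command_line);

print "[$_]\n" for @effective;
```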

    INSTALL PATHS

    [version 0.19]

    When you invoke Module::Build's build action, it needs to figure out where to install things. The nutshell version of how this works is that default installation locations are determined from Config.pm, and they may be overridden by using the install_path parameter. An install_base parameter lets you specify an alternative installation root like /home/foo, and a destdir lets you specify a temporary installation directory like /tmp/install in case you want to create bundled-up installable packages.

    Natively, Module::Build provides default installation locations for the following types of installable items:

    • lib

      Usually pure-Perl module files ending in .pm.

    • arch

      "Architecture-dependent" module files, usually produced by compiling XS, Inline, or similar code.

    • script

      Programs written in pure Perl. In order to improve reuse, try to make these as small as possible - put the code into modules whenever possible.

    • bin

      "Architecture-dependent" executable programs, i.e. compiled C code or something. Pretty rare to see this in a perl distribution, but it happens.

    • bindoc

      Documentation for the stuff in script and bin . Usually generated from the POD in those files. Under Unix, these are manual pages belonging to the 'man1' category.

    • libdoc

      Documentation for the stuff in lib and arch . This is usually generated from the POD in .pm files. Under Unix, these are manual pages belonging to the 'man3' category.

    • binhtml

      This is the same as bindoc above, but applies to HTML documents.

    • libhtml

      This is the same as libdoc above, but applies to HTML documents.

    Four other parameters let you control various aspects of how installation paths are determined:

    • installdirs

      The default destinations for these installable things come from entries in your system's Config.pm . You can select from three different sets of default locations by setting the installdirs parameter as follows:

                                    'installdirs' set to:
                             core          site                vendor

                        uses the following defaults from Config.pm:

          lib     => installprivlib  installsitelib      installvendorlib
          arch    => installarchlib  installsitearch     installvendorarch
          script  => installscript   installsitescript   installvendorscript
          bin     => installbin      installsitebin      installvendorbin
          bindoc  => installman1dir  installsiteman1dir  installvendorman1dir
          libdoc  => installman3dir  installsiteman3dir  installvendorman3dir
          binhtml => installhtml1dir installsitehtml1dir installvendorhtml1dir [*]
          libhtml => installhtml3dir installsitehtml3dir installvendorhtml3dir [*]

          * Under some OS (eg. MSWin32) the destination for HTML documents is
            determined by the Config.pm entry installhtmldir.

      The default value of installdirs is "site". If you're creating vendor distributions of module packages, you may want to do something like this:

          perl Build.PL --installdirs vendor

      or

          ./Build install --installdirs vendor

      If you're installing an updated version of a module that was included with perl itself (i.e. a "core module"), then you may set installdirs to "core" to overwrite the module in its present location.

      (Note that the 'script' line is different from MakeMaker - unfortunately there's no such thing as "installsitescript" or "installvendorscript" entry in Config.pm , so we use the "installsitebin" and "installvendorbin" entries to at least get the general location right. In the future, if Config.pm adds some more appropriate entries, we'll start using those.)
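      To see what these Config.pm entries hold on a given system, a short script along these lines can help. It only reads %Config for the lib row of the table; the vendor entry may be empty on perls built without vendor directories:

```perl
use strict;
use warnings;
use Config;

# Config.pm keys for the 'lib' row, one per installdirs setting.
my %lib_key = (
    core   => 'installprivlib',
    site   => 'installsitelib',
    vendor => 'installvendorlib',
);

for my $set (qw(core site vendor)) {
    my $dir = $Config{ $lib_key{$set} };
    $dir = '(not configured)' unless defined $dir && length $dir;
    printf "%-6s lib => %s\n", $set, $dir;
}
```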

    • install_path

      Once the defaults have been set, you can override them.

      On the command line, that would look like this:

          perl Build.PL --install_path lib=/foo/lib --install_path arch=/foo/lib/arch

      or this:

          ./Build install --install_path lib=/foo/lib --install_path arch=/foo/lib/arch

    • install_base

      You can also set the whole bunch of installation paths by supplying the install_base parameter to point to a directory on your system. For instance, if you set install_base to "/home/ken" on a Linux system, you'll install as follows:

          lib     => /home/ken/lib/perl5
          arch    => /home/ken/lib/perl5/i386-linux
          script  => /home/ken/bin
          bin     => /home/ken/bin
          bindoc  => /home/ken/man/man1
          libdoc  => /home/ken/man/man3
          binhtml => /home/ken/html
          libhtml => /home/ken/html

      Note that this is different from how MakeMaker's PREFIX parameter works. install_base just gives you a default layout under the directory you specify, which may have little to do with the installdirs=site layout.

      The exact layout under the directory you specify may vary by system - we try to do the "sensible" thing on each platform.

    • destdir

      If you want to install everything into a temporary directory first (for instance, if you want to create a directory tree that a package manager like rpm or dpkg could create a package from), you can use the destdir parameter:

      1. perl Build.PL --destdir /tmp/foo

      or

      1. ./Build install --destdir /tmp/foo

      This will effectively install to "/tmp/foo/$sitelib", "/tmp/foo/$sitearch", and the like, except that it will use File::Spec to make the pathnames work correctly on whatever platform you're installing on.

    • prefix

      Provided for compatibility with ExtUtils::MakeMaker's PREFIX argument. prefix should be used when you want Module::Build to install your modules, documentation, and scripts in the same place as ExtUtils::MakeMaker's PREFIX mechanism.

      The following are equivalent.

      1. perl Build.PL --prefix /tmp/foo
      2. perl Makefile.PL PREFIX=/tmp/foo

      Because of the complex nature of the prefixification logic, the behavior of PREFIX in MakeMaker has changed subtly over time. Module::Build's --prefix logic is equivalent to the PREFIX logic found in ExtUtils::MakeMaker 6.30.

      The maintainers of MakeMaker do understand the troubles with the PREFIX mechanism, and added INSTALL_BASE support in version 6.31 of MakeMaker, which was released in 2006.

      If you don't need to retain compatibility with old versions (pre-6.31) of ExtUtils::MakeMaker, or are starting a fresh Perl installation, we recommend you use install_base instead (and INSTALL_BASE in ExtUtils::MakeMaker). See Installing in the same location as ExtUtils::MakeMaker in Module::Build::Cookbook for further information.
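
    The destdir entry above says that install paths are re-rooted under a staging directory with the help of File::Spec. The following is only a rough sketch of that idea, not Module::Build's actual code: stage_path is a hypothetical helper, the example paths are made up rather than read from Config.pm, and volume handling (such as drive letters) is deliberately ignored.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: re-root an install path under a staging directory.
# Illustration only; it ignores volumes (e.g. drive letters).
sub stage_path {
    my ($destdir, $path) = @_;
    return File::Spec->catdir($destdir, $path);
}

# Example values are assumptions, not read from Config.pm.
my %install_to = (
    lib  => '/usr/local/lib/perl5/site_perl',
    arch => '/usr/local/lib/perl5/site_perl/x86_64-linux',
);

# Show where each install type would land under the staging directory.
for my $type (sort keys %install_to) {
    printf "%-4s => %s\n", $type, stage_path('/tmp/foo', $install_to{$type});
}
```

    A package manager could then archive everything found under the staging directory.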

    MOTIVATIONS

    There are several reasons I wanted to start over, and not just fix what I didn't like about MakeMaker:

    • I don't like the core idea of MakeMaker, namely that make should be involved in the build process. Here are my reasons:

      • When a person is installing a Perl module, what can you assume about their environment? Can you assume they have make? No, but you can assume they have some version of Perl.

      • When a person is writing a Perl module for intended distribution, can you assume that they know how to build a Makefile, so they can customize their build process? No, but you can assume they know Perl, and could customize that way.

      For years, these things have been a barrier to people getting the build/install process to do what they want.

    • There are several architectural decisions in MakeMaker that make it very difficult to customize its behavior. For instance, when using MakeMaker you do use ExtUtils::MakeMaker, but the object created in WriteMakefile() is actually blessed into a package name that's created on the fly, so you can't simply subclass ExtUtils::MakeMaker. There is a workaround MY package that lets you override certain MakeMaker methods, but only certain explicitly preselected (by MakeMaker) methods can be overridden. Also, the method of customization is very crude: you have to modify a string containing the Makefile text for the particular target. Since these strings aren't documented, and can't be documented (they take on different values depending on the platform, version of perl, version of MakeMaker, etc.), you have no guarantee that your modifications will work on someone else's machine or after an upgrade of MakeMaker or perl.

    • It is risky to make major changes to MakeMaker, since it does so many things, is so important, and generally works. Module::Build is an entirely separate package so that I can work on it all I want, without worrying about backward compatibility with MakeMaker.

    • Finally, Perl is said to be a language for system administration. Could it really be the case that Perl isn't up to the task of building and installing software? Even if that software is a bunch of .pm files that just need to be copied from one place to another? My sense was that we could design a system to accomplish this in a flexible, extensible, and friendly manner. Or die trying.

    TO DO

    The current method of relying on time stamps to determine whether a derived file is out of date isn't likely to scale well, since it requires tracing all dependencies backward, it runs into problems on NFS, and it's just generally flimsy. It would be better to use an MD5 signature or the like, if available. See cons for an example.

    - append to perllocal.pod
    - add a 'plugin' functionality

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    Development questions, bug reports, and patches should be sent to the Module-Build mailing list at <module-build@perl.org>.

    Bug reports are also welcome at <http://rt.cpan.org/NoAuth/Bugs.html?Dist=Module-Build>.

    The latest development version is available from the Git repository at <https://github.com/Perl-Toolchain-Gang/Module-Build>

    COPYRIGHT

    Copyright (c) 2001-2006 Ken Williams. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    perl(1), Module::Build::Cookbook, Module::Build::Authoring, Module::Build::API, ExtUtils::MakeMaker

    META.yml Specification: CPAN::Meta::Spec

    http://www.dsmit.com/cons/

    http://search.cpan.org/dist/PerlBuildSystem/

     

    Module::CoreList

    NAME

    Module::CoreList - what modules shipped with versions of perl

    SYNOPSIS

        use Module::CoreList;
        print $Module::CoreList::version{5.00503}{CPAN}; # prints 1.48
        print Module::CoreList->first_release('File::Spec'); # prints 5.00405
        print Module::CoreList->first_release_by_date('File::Spec'); # prints 5.005
        print Module::CoreList->first_release('File::Spec', 0.82); # prints 5.006001
        if (Module::CoreList::is_core('File::Spec')) {
            print "File::Spec is a core module\n";
        }
        print join ', ', Module::CoreList->find_modules(qr/Data/);
        # prints 'Data::Dumper'
        print join ', ', Module::CoreList->find_modules(qr/test::h.*::.*s/i, 5.008008);
        # prints 'Test::Harness::Assert, Test::Harness::Straps'
        print join ", ", @{ $Module::CoreList::families{5.005} };
        # prints "5.005, 5.00503, 5.00504"

    DESCRIPTION

    Module::CoreList provides information on which core and dual-life modules shipped with each version of perl.

    It provides a number of mechanisms for querying this information.

    There is a utility called corelist provided with this module which is a convenient way of querying from the command-line.

    There is a functional programming API available for programmers to query information.

    Programmers may also query the contained hash structures to find relevant information.

    FUNCTIONS API

    These are the functions that are available. They may be called either as functions or as class methods:

        Module::CoreList::first_release('File::Spec'); # as a function
        Module::CoreList->first_release('File::Spec'); # class method

    • first_release( MODULE )

      Behaviour since version 2.11

      Requires a MODULE name as an argument. Returns the perl version in which that module first appeared in core, with perls ordered by version number. If the module is not in core, returns undef (in scalar context) or an empty list (in list context).

    • first_release_by_date( MODULE )

      Requires a MODULE name as an argument. Returns the perl version in which that module first appeared in core, with perls ordered by release date. If the module is not in core, returns undef (in scalar context) or an empty list (in list context).

    • find_modules( REGEX, [ LIST OF PERLS ] )

      Takes a regex as an argument and returns a list of modules that match it. If only a regex is provided, the search covers all modules in all perl versions. Optionally you may provide a list of perl versions to limit the search.

    • find_version( PERL_VERSION )

      Takes a perl version as an argument. Returns that perl version if it exists or undef otherwise.

    • is_core( MODULE, [ MODULE_VERSION, [ PERL_VERSION ] ] )

      Available in version 2.99 and above.

      Returns true if MODULE was bundled with the specified version of Perl. You can optionally specify a minimum version of the module, and can also specify a version of Perl. If a version of Perl isn't specified, is_core() will use the version of Perl that is running (i.e. $^V).

      If you want to specify the version of Perl but don't care about the version of the module, pass undef as the module version (the second argument).

    • is_deprecated( MODULE, PERL_VERSION )

      Available in version 2.22 and above.

      Returns true if MODULE is marked as deprecated in PERL_VERSION. If PERL_VERSION is omitted, it defaults to the current version of Perl.

    • deprecated_in( MODULE )

      Available in version 2.77 and above.

      Returns the first PERL_VERSION where the MODULE was marked as deprecated. Returns undef if the MODULE has not been marked as deprecated.

    • removed_from( MODULE )

      Available in version 2.32 and above

      Takes a module name as an argument, returns the first perl version where that module was removed from core. Returns undef if the given module was never in core or remains in core.

    • removed_from_by_date( MODULE )

      Available in version 2.32 and above

      Takes a module name as an argument, returns the first perl version by release date where that module was removed from core. Returns undef if the given module was never in core or remains in core.

    • changes_between( PERL_VERSION, PERL_VERSION )

      Available in version 2.66 and above.

      Given two perl versions, this returns a list of pairs describing the changes in core module content between them. The list is suitable for storing in a hash. The keys are library names and the values are hashrefs. Each hashref has an entry for one or both of left and right , giving the versions of the library in each of the left and right perl distributions.

      For example, it might return these data (among others) for the difference between 5.008000 and 5.008001:

          'Pod::ParseLink' => { left => '1.05', right => '1.06' },
          'Pod::ParseUtils' => { left => '0.22', right => '0.3' },
          'Pod::Perldoc' => { right => '3.10' },
          'Pod::Perldoc::BaseTo' => { right => undef },

      This shows us two libraries being updated and two being added, one of which has an undefined version in the right-hand side version.
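
    As a rough illustration of the functions described above, the sketch below calls first_release both for a real module and for a made-up one, and calls is_core with undef as the module version so that only the perl version is checked. It assumes a Module::CoreList recent enough to provide is_core (2.99 or later); the name No::Such::Module::Hopefully is an invented placeholder that is assumed not to exist.

```perl
use strict;
use warnings;
use Module::CoreList;

# First perl release (ordered by version number) that bundled File::Spec.
my $first = Module::CoreList->first_release('File::Spec');

# Was File::Spec bundled with perl 5.14? undef is passed as the module
# version because only the perl version matters here.
my $in_514 = Module::CoreList::is_core('File::Spec', undef, 5.014) ? 'yes' : 'no';

# An invented module name (assumed not to exist) yields undef in scalar context.
my $missing = Module::CoreList->first_release('No::Such::Module::Hopefully');

print "File::Spec first shipped with perl $first\n";
print "File::Spec in core for 5.14: $in_514\n";
print "Unknown module: ", (defined $missing ? $missing : 'not in core'), "\n";
```

    Checking the result with defined, as done for $missing, avoids warnings when a module was never part of core.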

    DATA STRUCTURES

    These are the hash data structures that are available:

    • %Module::CoreList::version

      A hash of hashes that is keyed on perl version as indicated in $]. The second level hash is module => version pairs.

      Note, it is possible for the version of a module to be unspecified, whereby the value is undef, so use exists $version{$foo}{$bar} if that's what you're testing for.

      Starting with 2.10, the special module name Unicode refers to the version of the Unicode Character Database bundled with Perl.

    • %Module::CoreList::delta

      Available in version 3.00 and above.

      %Module::CoreList::version is implemented via Module::CoreList::TieHashDelta using this hash of delta changes.

      It is a hash of hashes that is keyed on perl version. Each keyed hash will have the following keys:

          delta_from - a previous perl version that the changes are based on
          changed - a hash of module/versions that have changed
          removed - a hash of modules that have been removed

    • %Module::CoreList::released

      Keyed on perl version this contains ISO formatted versions of the release dates, as gleaned from perlhist.

    • %Module::CoreList::families

      New, in 1.96, a hash that clusters known perl releases by their major versions.

    • %Module::CoreList::deprecated

      A hash of hashes keyed on perl version and on module name. If a module is defined it indicates that that module is deprecated in that perl version and is scheduled for removal from core at some future point.

    • %Module::CoreList::upstream

      A hash that contains information on where patches should be directed for each core module.

      UPSTREAM indicates where patches should go. undef implies that this hasn't been discussed for the module at hand. blead indicates that the copy of the module in the blead sources is to be considered canonical, cpan means that the module on CPAN is to be patched first. first-come means that blead can be patched freely if it is in sync with the latest release on CPAN.

    • %Module::CoreList::bug_tracker

      A hash that contains information on the appropriate bug tracker for each core module.

      BUGS is an email or url to post bug reports. For modules with UPSTREAM => 'blead', use perl5-porters@perl.org. rt.cpan.org appears to automatically provide a URL for CPAN modules; any value given here overrides the default: http://rt.cpan.org/Public/Dist/Display.html?Name=$ModuleName
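
    The hashes above can also be read directly. The following sketch looks up one module in %Module::CoreList::version using exists, as the note on undefined versions recommends, and then reads the matching entry from %Module::CoreList::released. The choice of perl 5.008008 and Data::Dumper is just an example.

```perl
use strict;
use warnings;
use Module::CoreList;

my ($perl, $module) = ('5.008008', 'Data::Dumper');

# exists distinguishes "bundled, version unknown" (value undef)
# from "not bundled at all".
if (exists $Module::CoreList::version{$perl}{$module}) {
    my $v = $Module::CoreList::version{$perl}{$module};
    print "$module shipped with perl $perl, version ",
          (defined $v ? $v : 'unspecified'), "\n";
}

# Release date of that perl, stored as an ISO-style date string.
my $released = $Module::CoreList::released{$perl};
print "perl $perl was released on $released\n" if defined $released;
```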

    CAVEATS

    Module::CoreList currently covers the 5.000, 5.001, 5.002, 5.003_07, 5.004, 5.004_05, 5.005, 5.005_03, 5.005_04, 5.6.0, 5.6.1, 5.6.2, 5.7.3, 5.8.0, 5.8.1, 5.8.2, 5.8.3, 5.8.4, 5.8.5, 5.8.6, 5.8.7, 5.8.8, 5.8.9, 5.9.0, 5.9.1, 5.9.2, 5.9.3, 5.9.4, 5.9.5, 5.10.0, 5.10.1, 5.11.0, 5.11.1, 5.11.2, 5.11.3, 5.11.4, 5.11.5, 5.12.0, 5.12.1, 5.12.2, 5.12.3, 5.12.4, 5.12.5, 5.13.0, 5.13.1, 5.13.2, 5.13.3, 5.13.4, 5.13.5, 5.13.6, 5.13.7, 5.13.8, 5.13.9, 5.13.10, 5.13.11, 5.14.0, 5.14.1, 5.14.2, 5.14.3, 5.14.4, 5.15.0, 5.15.1, 5.15.2, 5.15.3, 5.15.4, 5.15.5, 5.15.6, 5.15.7, 5.15.8, 5.15.9, 5.16.0, 5.16.1, 5.16.2, 5.16.3, 5.17.0, 5.17.1, 5.17.2, 5.17.3, 5.17.4, 5.17.5, 5.17.6, 5.17.7, 5.17.8, 5.17.9, 5.17.10, 5.17.11, 5.18.0, 5.19.0, 5.19.1, 5.19.2, 5.19.3, 5.19.4, 5.19.5, 5.19.6 and 5.19.7 releases of perl.

    HISTORY

    Moved to Changes file.

    AUTHOR

    Richard Clamp <richardc@unixbeard.net>

    Currently maintained by the perl 5 porters <perl5-porters@perl.org>.

    LICENSE

    Copyright (C) 2002-2009 Richard Clamp. All Rights Reserved.

    This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    corelist, Module::Info, perl, http://perlpunks.de/corelist

     

    Module::Load

    NAME

    Module::Load - runtime require of both modules and files

    SYNOPSIS

        use Module::Load;
        my $module = 'Data::Dumper';
        load Data::Dumper; # loads that module
        load 'Data::Dumper'; # ditto
        load $module; # tritto
        my $script = 'some/script.pl';
        load $script;
        load 'some/script.pl'; # use quotes because of punctuation
        load thing; # try 'thing.pm' first, then 'thing'
        load CGI, ':standard'; # like 'use CGI qw[:standard]'

    DESCRIPTION

    load eliminates the need to know whether you are trying to require either a file or a module.

    If you consult perldoc -f require you will see that require will behave differently when given a bareword or a string.

    In the case of a string, require assumes you want to load a file. But in the case of a bareword, it assumes you mean a module.

    This creates tedious overhead when you are trying to dynamically require modules at runtime, since you will need to change the module notation (Acme::Comment) to a file notation fitting the particular platform you are on.

    load eliminates the need for this overhead and will just DWYM.

    Rules

    load has the following rules to decide what it thinks you want:

    • If the argument has any characters in it other than those matching \w, : or ', it must be a file

    • If the argument matches only [\w:'], it must be a module

    • If the argument matches only \w, it could either be a module or a file. We will try to find file.pm first in @INC and if that fails, we will try to find file in @INC. If both fail, we die with the respective error messages.
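
    The three rules above can be sketched as a small classifier. This is only an illustration of the rules as stated, not Module::Load's own code; classify_load_target is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper that applies the three rules in order of specificity.
sub classify_load_target {
    my ($arg) = @_;
    return 'either' if $arg =~ /\A\w+\z/;       # only \w: module or file
    return 'module' if $arg =~ /\A[\w:']+\z/;   # only \w, : or ': a module
    return 'file';                              # anything else: a file
}

# Print the classification for a few sample arguments.
for my $arg ('thing', 'Data::Dumper', 'some/script.pl') {
    printf "%-15s => %s\n", $arg, classify_load_target($arg);
}
```

    The \w-only test has to come first, because every string that matches it also matches the broader [\w:'] class.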

    Caveats

    Because of a bug in perl (#19213), at least in version 5.6.1, we have to hardcode the path separator for a require on Win32 to be /, as on Unix, rather than the Win32 \. Otherwise perl will not read its own %INC accurately, will double-load files if they are required again, or in the worst case, will core dump.

    Module::Load cannot do implicit imports, only explicit imports. In other words, you always have to specify explicitly what you wish to import from a module, even if the functions are in that module's @EXPORT.

    ACKNOWLEDGEMENTS

    Thanks to Jonas B. Nielsen for making explicit imports work.

    BUG REPORTS

    Please report bugs or other issues to <bug-module-load@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     

    Module::Loaded

    NAME

    Module::Loaded - mark modules as loaded or unloaded

    SYNOPSIS

        use Module::Loaded;
        $bool = mark_as_loaded('Foo'); # Foo.pm is now marked as loaded
        $loc = is_loaded('Foo'); # location of Foo.pm set to the
                                 # loader's location
        eval "require Foo"; # is now a no-op
        $bool = mark_as_unloaded('Foo'); # Foo.pm no longer marked as loaded
        eval "require Foo"; # Will try to find Foo.pm in @INC

    DESCRIPTION

    When testing applications, you often need to provide functionality in your test environment that would usually be provided by external modules. Rather than munging %INC by hand to mark these external modules as loaded, so that perl does not attempt to load them, you can use this module, which offers a very simple way to mark modules as loaded and/or unloaded.

    FUNCTIONS

    $bool = mark_as_loaded( PACKAGE );

    Marks the package as loaded to perl. PACKAGE can be a bareword or string.

    If the module is already loaded, mark_as_loaded will carp about this and tell you from where the PACKAGE has been loaded already.

    $bool = mark_as_unloaded( PACKAGE );

    Marks the package as unloaded to perl, which is the exact opposite of mark_as_loaded . PACKAGE can be a bareword or string.

    If the module is already unloaded, mark_as_unloaded will carp about this and tell you the PACKAGE has been unloaded already.

    $loc = is_loaded( PACKAGE );

    is_loaded tells you if PACKAGE has been marked as loaded yet. PACKAGE can be a bareword or string.

    It returns false if PACKAGE has not been loaded yet. On success, it returns the location from which PACKAGE is said to have been loaded.
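
    The three functions can be exercised together as in the sketch below. My::Fake::Backend is an invented package name standing in for an external module that a test wants to stub out; no file of that name is assumed to exist on disk.

```perl
use strict;
use warnings;
use Module::Loaded;

# Pretend the (invented) module is already loaded.
mark_as_loaded('My::Fake::Backend');

# is_loaded now reports where the module was "loaded" from.
my $loc = is_loaded('My::Fake::Backend');

# require succeeds without searching @INC, because %INC has an entry.
my $required = eval { require My::Fake::Backend; 1 } ? 1 : 0;

# Undo the marking; is_loaded then returns false.
mark_as_unloaded('My::Fake::Backend');
my $still_loaded = is_loaded('My::Fake::Backend') ? 1 : 0;

print "marked as loaded from: $loc\n";
print "require succeeded: $required\n";
print "still loaded after unmarking: $still_loaded\n";
```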

    BUG REPORTS

    Please report bugs or other issues to <bug-module-loaded@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     

    Module::Pluggable

    NAME

    Module::Pluggable - automatically give your module the ability to have plugins

    SYNOPSIS

    Simple use Module::Pluggable -

        package MyClass;
        use Module::Pluggable;

    and then later ...

        use MyClass;
        my $mc = MyClass->new();
        # returns the names of all plugins installed under MyClass::Plugin::*
        my @plugins = $mc->plugins();

    EXAMPLE

    Why would you want to do this? Say you have something that wants to pass an object to a number of different plugins in turn. For example you may want to extract meta-data from every email you get sent and do something with it. Plugins make sense here because then you can keep adding new meta data parsers and all the logic and docs for each one will be self contained and new handlers are easy to add without changing the core code. For that, you might do something like ...

        package Email::Examiner;
        use strict;
        use Email::Simple;
        use Module::Pluggable require => 1;
        sub handle_email {
            my $self = shift;
            my $email = shift;
            foreach my $plugin ($self->plugins) {
                $plugin->examine($email);
            }
            return 1;
        }

    ... and all the plugins will get a chance in turn to look at it.

    This can be trivially extended so that once one plugin has saved the email somewhere, no other plugin tries to do the same. Simply have the examine method return 1 if it has saved the email. You might also want to be paranoid and check that the plugin has an examine method.

        foreach my $plugin ($self->plugins) {
            next unless $plugin->can('examine');
            last if $plugin->examine($email);
        }

    And so on. The sky's the limit.
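
    To try the dispatch loop without installing any plugins, the following self-contained sketch replaces the plugins method that Module::Pluggable would normally export with a hand-written stand-in. All package names here (My::Plugin::Logger, My::Plugin::Archiver, My::Plugin::Broken, My::Examiner) are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical plugins, defined inline so the sketch is self-contained.
package My::Plugin::Logger;
sub examine { my ($class, $email) = @_; return 0 }   # looks, does not save

package My::Plugin::Archiver;
sub examine { my ($class, $email) = @_; return 1 }   # pretends to save

package My::Plugin::Broken;                          # has no examine method

package My::Examiner;

# Stand-in for the plugins() method that Module::Pluggable would export.
sub plugins {
    return qw(My::Plugin::Broken My::Plugin::Logger My::Plugin::Archiver);
}

# Offer the email to each plugin in turn and stop at the first one that saves it.
sub handle_email {
    my ($self, $email) = @_;
    my $handler;
    foreach my $plugin ($self->plugins) {
        next unless $plugin->can('examine');   # skip plugins lacking the method
        if ($plugin->examine($email)) {        # true means "I saved it"
            $handler = $plugin;
            last;
        }
    }
    return $handler;
}

package main;

my $saved_by = My::Examiner->handle_email('raw message text');
print "saved by: ", (defined $saved_by ? $saved_by : 'nobody'), "\n";
```

    Returning the name of the plugin that saved the email, rather than a plain 1, is a design choice of this sketch that makes the outcome easy to inspect.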

    DESCRIPTION

    Provides a simple but, hopefully, extensible way of having 'plugins' for your module. Obviously this isn't going to be the be all and end all of solutions but it works for me.

    Essentially all it does is export a method into your namespace that looks through a search path for .pm files and turns those into class names.

    Optionally it instantiates those classes for you.

    ADVANCED USAGE

    Alternatively, if you don't want to use 'plugins' as the method ...

        package MyClass;
        use Module::Pluggable sub_name => 'foo';

    and then later ...

        my @plugins = $mc->foo();

    Or if you want to look in another namespace

        package MyClass;
        use Module::Pluggable search_path => ['Acme::MyClass::Plugin', 'MyClass::Extend'];

    or directory

        use Module::Pluggable search_dirs => ['mylibs/Foo'];

    Or if you want to instantiate each plugin rather than just return the name

        package MyClass;
        use Module::Pluggable instantiate => 'new';

    and then

        # whatever is passed to 'plugins' will be passed
        # to 'new' for each plugin
        my @plugins = $mc->plugins(@options);

    Alternatively you can just require the module without instantiating it

        package MyClass;
        use Module::Pluggable require => 1;

    Since requiring automatically searches inner packages, which may not be desirable, you can turn this off

        package MyClass;
        use Module::Pluggable require => 1, inner => 0;

    You can limit the plugins loaded using the except option, either as a string, array ref or regex

        package MyClass;
        use Module::Pluggable except => 'MyClass::Plugin::Foo';

    or

        package MyClass;
        use Module::Pluggable except => ['MyClass::Plugin::Foo', 'MyClass::Plugin::Bar'];

    or

        package MyClass;
        use Module::Pluggable except => qr/^MyClass::Plugin::(Foo|Bar)$/;

    The only option works the same way, but loads only the plugins that match.

    Remember you can use the module more than once

        package MyClass;
        use Module::Pluggable search_path => 'MyClass::Filters', sub_name => 'filters';
        use Module::Pluggable search_path => 'MyClass::Plugins', sub_name => 'plugins';

    and then later ...

        my @filters = $self->filters;
        my @plugins = $self->plugins;

    PLUGIN SEARCHING

    Every time you call 'plugins' the whole search path is walked again. This allows for dynamically loading plugins even at run time. However, this can get expensive, so if you don't expect to add new plugins at run time you could cache the result:

        package Foo;
        use strict;
        use Module::Pluggable sub_name => '_plugins';
        our @PLUGINS;
        sub plugins {
            # search only on the first call, then reuse the cached list
            @PLUGINS = shift->_plugins unless @PLUGINS;
            return @PLUGINS;
        }
        1;

    INNER PACKAGES

    If you have, for example, a file lib/Something/Plugin/Foo.pm that contains package definitions for both Something::Plugin::Foo and Something::Plugin::Bar, then as long as you have either the require or the instantiate option set we'll also find Something::Plugin::Bar. Nifty!

    OPTIONS

    You can pass a hash of options when importing this module.

    The options can be ...

    sub_name

    The name of the subroutine to create in your namespace.

    By default this is 'plugins'

    search_path

    An array ref of namespaces to look in.

    search_dirs

    An array ref of directories to look in before @INC.

    instantiate

    Call this method on the class. In general this will probably be 'new' but it can be whatever you want. Whatever arguments are passed to 'plugins' will be passed to the method.

    The default is 'undef', i.e. just return the class name.

    require

    Just require the class, don't instantiate (overrides 'instantiate').

    inner

    If set to 0 will not search inner packages. If set to 1 will override require.

    only

    Takes a string, array ref or regex describing the names of the only plugins to return. Whilst this may seem perverse ... well, it is. But it also makes sense. Trust me.

    except

    Similar to only it takes a description of plugins to exclude from returning. This is slightly less perverse.

    package

    This is for use by extension modules which build on Module::Pluggable : passing a package option allows you to place the plugin method in a different package other than your own.

    file_regex

    By default Module::Pluggable only looks for .pm files.

    By supplying a new file_regex you can change this behaviour, e.g.

        file_regex => qr/\.plugin$/

    include_editor_junk

    By default Module::Pluggable ignores files that look like they were left behind by editors. Currently this means files ending in ~, files with the extensions .swp or .swo, and files beginning with .#.

    Setting include_editor_junk changes Module::Pluggable so it does not ignore any files it finds.

    follow_symlinks

    Whether, when searching directories, to follow symlinks.

    Defaults to 1, i.e. do follow symlinks.

    min_depth, max_depth

    These options set what 'depth' of plugin is allowed.

    For example, MyClass::Plugin::Foo has a depth of 3 and MyClass::Plugin::Foo::Bar has a depth of 4, so to get only the former (i.e. MyClass::Plugin::Foo) do

        package MyClass;
        use Module::Pluggable max_depth => 3;

    and to get only the latter (i.e. MyClass::Plugin::Foo::Bar)

        package MyClass;
        use Module::Pluggable min_depth => 4;
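
    The depth numbers in this section are consistent with counting the ::-separated components of the plugin name. The sketch below applies that reading; it is an assumption drawn from the two examples, not Module::Pluggable's implementation, and plugin_depth and within_depth are hypothetical helpers.

```perl
use strict;
use warnings;

# Hypothetical helper: depth is the number of ::-separated name components.
sub plugin_depth {
    my ($name) = @_;
    my @parts = split /::/, $name;
    return scalar @parts;
}

# Hypothetical helper: apply optional min_depth / max_depth limits.
sub within_depth {
    my ($name, %opt) = @_;
    my $depth = plugin_depth($name);
    return 0 if defined $opt{min_depth} && $depth < $opt{min_depth};
    return 0 if defined $opt{max_depth} && $depth > $opt{max_depth};
    return 1;
}

my @found   = qw(MyClass::Plugin::Foo MyClass::Plugin::Foo::Bar);
my @shallow = grep { within_depth($_, max_depth => 3) } @found;
my @deep    = grep { within_depth($_, min_depth => 4) } @found;

print "max_depth => 3 keeps: @shallow\n";
print "min_depth => 4 keeps: @deep\n";
```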

    TRIGGERS

    Various triggers can also be passed in to the options.

    If any of these triggers return 0 then the plugin will not be returned.

    before_require <plugin>

    Gets passed the plugin name.

    If 0 is returned then this plugin will not be required either.

    on_require_error <plugin> <err>

    Gets called when there's an error on requiring the plugin.

    Gets passed the plugin name and the error.

    The default on_require_error handler is to carp the error and return 0.

    on_instantiate_error <plugin> <err>

    Gets called when there's an error on instantiating the plugin.

    Gets passed the plugin name and the error.

    The default on_instantiate_error handler is to carp the error and return 0.

    after_require <plugin>

    Gets passed the plugin name.

    If 0 is returned then this plugin will be required but not returned as a plugin.

    METHODS

    search_path

    The method search_path is exported into your namespace as well. You can call it at any time to change or replace the search path.

        $self->search_path( add => "New::Path" ); # add
        $self->search_path( new => "New::Path" ); # replace

    BEHAVIOUR UNDER TEST ENVIRONMENT

    In order to make testing reliable we exclude anything not from blib if blib.pm is in %INC.

    However if the module being tested used another module that itself used Module::Pluggable then the second module would fail. This was fixed by checking to see if the caller had (^|/)blib/ in their filename.

    There's an argument that this is the wrong behaviour and that modules should explicitly trigger this behaviour but that particular code has been around for 7 years now and I'm reluctant to change the default behaviour.

    You can now (as of version 4.1) force Module::Pluggable to look outside blib in a test environment by doing either

        require Module::Pluggable;
        $Module::Pluggable::FORCE_SEARCH_ALL_PATHS = 1;
        import Module::Pluggable;

    or

        use Module::Pluggable force_search_all_paths => 1;

    FUTURE PLANS

    This does everything I need and I can't really think of any other features I want to add. Famous last words, of course.

    I recently tried to fix it to find inner packages and to make it 'just work' with PAR, but there are still some issues.

    However suggestions (and patches) are welcome.

    DEVELOPMENT

    The master repo for this module is at

    https://github.com/simonwistow/Module-Pluggable

    AUTHOR

    Simon Wistow <simon@thegestalt.org>

    COPYING

    Copyright, 2006 Simon Wistow

    Distributed under the same terms as Perl itself.

    BUGS

    None known.

    SEE ALSO

    File::Spec, File::Find, File::Basename, Class::Factory::Util, Module::Pluggable::Ordered

     

    Module::Pluggable::Object

    NAME

    Module::Pluggable::Object - automatically give your module the ability to have plugins

    SYNOPSIS

    Simple use Module::Pluggable -

        package MyClass;
        use Module::Pluggable::Object;
        my $finder = Module::Pluggable::Object->new(%opts);
        print "My plugins are: ".join(", ", $finder->plugins)."\n";

    DESCRIPTION

    Provides a simple but, hopefully, extensible way of having 'plugins' for your module. Obviously this isn't going to be the be all and end all of solutions but it works for me.

    Essentially all it does is export a method into your namespace that looks through a search path for .pm files and turns those into class names.

    Optionally it instantiates those classes for you.

    This object is wrapped by Module::Pluggable. If you want to do something odd or add non-general special features, you're probably best to wrap this and produce your own subclass.
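
    As a rough illustration of the "turn .pm files into class names" step described above, the sketch below maps a path relative to a search directory onto a package name. The helper name path_to_class and the sample path are hypothetical, and this is not the module's actual implementation.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: convert a path relative to a search directory
# (e.g. "MyClass/Plugin/Foo.pm") into a class name ("MyClass::Plugin::Foo").
sub path_to_class {
    my ($relative_path) = @_;
    my @parts = File::Spec->splitdir($relative_path);
    $parts[-1] =~ s/\.pm\z//;    # drop the file extension
    return join '::', @parts;
}

my $class = path_to_class('MyClass/Plugin/Foo.pm');
print "$class\n";
```

    A real finder would also have to walk the search path and decide which files count as plugins; that part is left out here.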

    OPTIONS

    See the Module::Pluggable docs.

    AUTHOR

    Simon Wistow <simon@thegestalt.org>

    COPYING

    Copyright, 2006 Simon Wistow

    Distributed under the same terms as Perl itself.

    BUGS

    None known.

    SEE ALSO

    Module::Pluggable

     

    Module::Load::Conditional

    NAME

    Module::Load::Conditional - Looking up module information / loading at runtime

    SYNOPSIS

        use Module::Load::Conditional qw[can_load check_install requires];

        my $use_list = {
            CPANPLUS     => 0.05,
            LWP          => 5.60,
            'Test::More' => undef,
        };

        print can_load( modules => $use_list )
            ? 'all modules loaded successfully'
            : 'failed to load required modules';

        my $rv = check_install( module => 'LWP', version => 5.60 )
            or print 'LWP is not installed!';

        print 'LWP up to date' if $rv->{uptodate};
        print "LWP version is $rv->{version}\n";
        print "LWP is installed as file $rv->{file}\n";

        print "LWP requires the following modules to be installed:\n";
        print join "\n", requires('LWP');

        ### allow M::L::C to peek in your %INC rather than just
        ### scanning @INC
        $Module::Load::Conditional::CHECK_INC_HASH = 1;

        ### reset the 'can_load' cache
        undef $Module::Load::Conditional::CACHE;

        ### don't have Module::Load::Conditional issue warnings --
        ### 0 is also the default (see Global Variables)
        $Module::Load::Conditional::VERBOSE = 0;

        ### The last error that happened during a call to 'can_load'
        my $err = $Module::Load::Conditional::ERROR;

    DESCRIPTION

    Module::Load::Conditional provides simple ways to query and possibly load any of the modules you have installed on your system during runtime.

    It is able to load multiple modules at once or none at all if one of them was not able to load. It also takes care of any error checking and so forth.

    Methods

    $href = check_install( module => NAME [, version => VERSION, verbose => BOOL ] );

    check_install allows you to verify if a certain module is installed or not. You may call it with the following arguments:

    • module

      The name of the module you wish to verify -- this is a required key

    • version

      The version this module needs to be -- this is optional

    • verbose

      Whether or not to be verbose about what it is doing -- it will default to $Module::Load::Conditional::VERBOSE

    It will return undef if it was not able to find where the module was installed, or a hash reference with the following keys if it was able to find the file:

    • file

      Full path to the file that contains the module

    • dir

      Directory, or more exactly the @INC entry, where the module was loaded from.

    • version

      The version number of the installed module - this will be undef if the module had no (or unparsable) version number, or if the variable $Module::Load::Conditional::FIND_VERSION was set to false. (See the Global Variables section below for details.)

    • uptodate

      A boolean value indicating whether or not the module was found to be at least the version you specified. If you did not specify a version, uptodate will always be true if the module was found. If no parsable version was found in the module, uptodate will also be true, since check_install had no way to verify clearly.

      See also $Module::Load::Conditional::DEPRECATED, which affects the outcome of this value.
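
    The return value described above can be inspected as in the following minimal sketch. It assumes File::Spec is installed (it ships with perl) and uses a made-up module name, No::Such::Module::Hopefully, to show the undef case.

```perl
use strict;
use warnings;
use Module::Load::Conditional qw[check_install];

# File::Spec ships with perl, so it should always be found.
my $found = check_install( module => 'File::Spec' );
if ($found) {
    printf "file:     %s\n", $found->{file};
    printf "dir:      %s\n", $found->{dir}     // '(unknown)';
    printf "version:  %s\n", $found->{version} // '(none)';
    printf "uptodate: %s\n", $found->{uptodate} ? 'yes' : 'no';
}

# A made-up module name: check_install returns undef when nothing is found.
my $missing = check_install( module => 'No::Such::Module::Hopefully' );
print "No::Such::Module::Hopefully is not installed\n" unless $missing;
```

    Because no version was requested, uptodate should be true whenever the module is found.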

    $bool = can_load( modules => { NAME => VERSION [,NAME => VERSION] }, [verbose => BOOL, nocache => BOOL] )

    can_load will take a list of modules, optionally with version numbers and determine if it is able to load them. If it can load *ALL* of them, it will. If one or more are unloadable, none will be loaded.

    This is particularly useful if you have More Than One Way (tm) to solve a problem in a program, and only wish to continue down a path if all modules could be loaded, and not load them if they couldn't.

    This function uses the load function from Module::Load under the hood.

    can_load takes the following arguments:

    • modules

      This is a hashref of module/version pairs. The version indicates the minimum version to load. If no version is provided, any version is assumed to be good enough.

    • verbose

      This controls whether warnings should be printed if a module failed to load. The default is to use the value of $Module::Load::Conditional::VERBOSE.

    • nocache

      can_load keeps its results in a cache, so it will not load the same module twice, nor will it attempt to load a module that has already failed to load before. By default, can_load will check its cache, but you can override that by setting nocache to true.
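
    The "More Than One Way" use case mentioned above might look like the following sketch: try sets of modules in order of preference and keep the first set that loads completely. The first set deliberately contains a made-up module name so that the fallback is exercised; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Module::Load::Conditional qw[can_load];

# Candidate sets in order of preference. The first entry uses a
# made-up module name so that the fallback path is exercised.
my @candidates = (
    { 'No::Such::Module::Hopefully' => 0, 'File::Spec'     => 0 },
    { 'File::Spec'                  => 0, 'File::Basename' => 0 },
);

my $chosen;
for my $set (@candidates) {
    # can_load is all-or-nothing: it is true only if every module loads.
    if ( can_load( modules => $set ) ) {
        $chosen = $set;
        last;
    }
}

print $chosen
    ? 'using: ' . join( ', ', sort keys %$chosen ) . "\n"
    : "no usable set of modules\n";
```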

    @list = requires( MODULE );

    requires can tell you what other modules a particular module requires. This is particularly useful when you're intending to write a module for public release and are listing its prerequisites.

    requires takes but one argument: the name of a module. It will then first check if it can actually load this module, and return undef if it can't. Otherwise, it will return a list of modules and pragmas that would have been loaded on the module's behalf.

    Note: The list that requires returns originates from your current perl and your current install.

    Global Variables

    The behaviour of Module::Load::Conditional can be altered by changing the following global variables:

    $Module::Load::Conditional::VERBOSE

    This controls whether Module::Load::Conditional will issue warnings and explanations as to why certain things may have failed. If you set it to 0, Module::Load::Conditional will not output any warnings. The default is 0.

    $Module::Load::Conditional::FIND_VERSION

    This controls whether Module::Load::Conditional will try to parse (and eval) the version from the module you're trying to load.

    If you don't wish to do this, set this variable to false. Understand then that version comparisons are not possible, and Module::Load::Conditional cannot tell you what module version you have installed. This may be desirable from a security or performance point of view. Note that $FIND_VERSION code runs safely under taint mode.

    The default is 1.

    $Module::Load::Conditional::CHECK_INC_HASH

    This controls whether Module::Load::Conditional checks your %INC hash to see if a module is available. By default, only @INC is scanned to see if a module is physically on your filesystem, or available via an @INC-hook. Setting this variable to true will trust any entries in %INC and return them for you.

    The default is 0.

    $Module::Load::Conditional::CACHE

    This holds the cache of the can_load function. If you explicitly want to remove the current cache, you can set this variable to undef.

    $Module::Load::Conditional::ERROR

    This holds a string of the last error that happened during a call to can_load. It is useful to inspect this when can_load returns undef.

    $Module::Load::Conditional::DEPRECATED

    This controls whether Module::Load::Conditional checks if a dual-life core module has been deprecated. If this is set to true, check_install will return false for uptodate if a dual-life module is found to be loaded from $Config{privlibexp}.

    The default is 0.
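
    A small sketch of how some of these variables can be used together: silence warnings, provoke a failing can_load call with a made-up module name, read the error string, and then clear the cache. The module name is hypothetical and chosen so that the call fails.

```perl
use strict;
use warnings;
use Module::Load::Conditional qw[can_load];

# Keep the module quiet; 'local' restores the previous value when the
# enclosing scope (here, the file) ends.
local $Module::Load::Conditional::VERBOSE = 0;

# 'No::Such::Module::Hopefully' is a made-up name that should not exist.
my $ok = can_load( modules => { 'No::Such::Module::Hopefully' => 0 } );

# After a failed call, $ERROR holds a description of what went wrong.
my $err = $Module::Load::Conditional::ERROR;
print "can_load failed: $err\n" unless $ok;

# Forget earlier results so that a later call re-checks from scratch.
undef $Module::Load::Conditional::CACHE;
```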

    See Also

    Module::Load

    BUG REPORTS

    Please report bugs or other issues to <bug-module-load-conditional@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     

    Module::Build::Base

    NAME

    Module::Build::Base - Default methods for Module::Build

    SYNOPSIS

    Please see the Module::Build documentation.

    DESCRIPTION

    The Module::Build::Base module defines the core functionality of Module::Build. Its methods may be overridden by any of the platform-dependent modules in the Module::Build::Platform:: namespace, but the intention here is to make this base module as platform-neutral as possible. Nicely enough, Perl has several core tools available in the File:: namespace for doing this, so the task isn't very difficult.

    Please see the Module::Build documentation for more details.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    COPYRIGHT

    Copyright (c) 2001-2006 Ken Williams. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    perl(1), Module::Build(3)

     

    Module::Build::Compat

    NAME

    Module::Build::Compat - Compatibility with ExtUtils::MakeMaker

    SYNOPSIS

        # In a Build.PL:
        use Module::Build;
        my $build = Module::Build->new
          ( module_name        => 'Foo::Bar',
            license            => 'perl',
            create_makefile_pl => 'traditional' );
        ...

    DESCRIPTION

    Because ExtUtils::MakeMaker has been the standard way to distribute modules for a long time, many tools (CPAN.pm, or your system administrator) may expect to find a working Makefile.PL in every distribution they download from CPAN. If you want to throw them a bone, you can use Module::Build::Compat to automatically generate a Makefile.PL for you, in one of several different styles.

    Module::Build::Compat also provides some code that helps out the Makefile.PL at runtime.

    METHODS

    • create_makefile_pl($style, $build)

      Creates a Makefile.PL in the current directory in one of several styles, based on the supplied Module::Build object $build. This is typically controlled by passing the desired style as the create_makefile_pl parameter to Module::Build's new() method; the Makefile.PL will then be automatically created during the distdir action.

      The currently supported styles are:

      • traditional

        A Makefile.PL will be created in the "traditional" style, i.e. it will use ExtUtils::MakeMaker and won't rely on Module::Build at all. In order to create the Makefile.PL, we'll include the requires and build_requires dependencies as the PREREQ_PM parameter.

        You don't want to use this style if during the perl Build.PL stage you ask the user questions, or do some auto-sensing about the user's environment, or if you subclass Module::Build to do some customization, because the vanilla Makefile.PL won't do any of that.

      • small

        A small Makefile.PL will be created that passes all functionality through to the Build.PL script in the same directory. The user must already have Module::Build installed in order to use this, or else they'll get a module-not-found error.

      • passthrough (DEPRECATED)

        This is just like the small option above, but if Module::Build is not already installed on the user's system, the script will offer to use CPAN.pm to download it and install it before continuing with the build.

        This option has been deprecated and may be removed in a future version of Module::Build. Modern CPAN.pm and CPANPLUS will recognize the configure_requires metadata property and install Module::Build before running Build.PL if Module::Build is listed, and Module::Build now adds itself to configure_requires by default.

        Perl 5.10.1 includes configure_requires support. In the future, when configure_requires support is deemed sufficiently widespread, the passthrough style will be removed.

    • run_build_pl(args => \@ARGV)

      This method runs the Build.PL script, passing it any arguments the user may have supplied to the perl Makefile.PL command. Because ExtUtils::MakeMaker and Module::Build accept different arguments, this method also performs some translation between the two.

      run_build_pl() accepts the following named parameters:

      • args

        The args parameter specifies the parameters that would usually appear on the command line of the perl Makefile.PL command - typically you'll just pass a reference to @ARGV.

      • script

        This is the filename of the script to run - it defaults to Build.PL.

    • write_makefile()

      This method writes a 'dummy' Makefile that will pass all commands through to the corresponding Module::Build actions.

      write_makefile() accepts the following named parameters:

      • makefile

        The name of the file to write - defaults to the string Makefile.

    SCENARIOS

    So, some common scenarios are:

    1. Just include a Build.PL script (without a Makefile.PL script), and give installation directions in a README or INSTALL document explaining how to install the module. In particular, explain that the user must install Module::Build before installing your module.

       Note that if you do this, you may make things easier for yourself, but harder for people with older versions of CPAN or CPANPLUS on their system, because those tools generally only understand the Makefile.PL/ExtUtils::MakeMaker way of doing things.

    2. Include a Build.PL script and a "traditional" Makefile.PL, created either manually or with create_makefile_pl(). Users won't ever have to install Module::Build if they use the Makefile.PL, but they won't get to take advantage of Module::Build's extra features either.

       For good measure, of course, test both the Makefile.PL and the Build.PL before shipping.

    3. Include a Build.PL script and a "pass-through" Makefile.PL built using Module::Build::Compat. This will mean that people can continue to use the "old" installation commands, and they may never notice that it's actually doing something else behind the scenes. It will also mean that your installation process is compatible with older versions of tools like CPAN and CPANPLUS.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    COPYRIGHT

    Copyright (c) 2001-2006 Ken Williams. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    Module::Build(3), ExtUtils::MakeMaker(3)

     

    Module::Build::ConfigData

    NAME

    Module::Build::ConfigData - Configuration for Module::Build

    SYNOPSIS

        use Module::Build::ConfigData;
        $value = Module::Build::ConfigData->config('foo');
        $value = Module::Build::ConfigData->feature('bar');

        @names = Module::Build::ConfigData->config_names;
        @names = Module::Build::ConfigData->feature_names;

        Module::Build::ConfigData->set_config(foo => $new_value);
        Module::Build::ConfigData->set_feature(bar => $new_value);
        Module::Build::ConfigData->write;  # Save changes

    DESCRIPTION

    This module holds the configuration data for the Module::Build module. It also provides a programmatic interface for getting or setting that configuration data. Note that in order to actually make changes, you'll have to have write access to the Module::Build::ConfigData module, and you should attempt to understand the repercussions of your actions.

    METHODS

    • config($name)

      Given a string argument, returns the value of the configuration item by that name, or undef if no such item exists.

    • feature($name)

      Given a string argument, returns the value of the feature by that name, or undef if no such feature exists.

    • set_config($name, $value)

      Sets the configuration item with the given name to the given value. The value may be any Perl scalar that will serialize correctly using Data::Dumper. This includes references, objects (usually), and complex data structures. It probably does not include transient things like filehandles or sockets.

    • set_feature($name, $value)

      Sets the feature with the given name to the given boolean value. The value will be converted to 0 or 1 automatically.

    • config_names()

      Returns a list of all the names of config items currently defined in Module::Build::ConfigData, or in scalar context the number of items.

    • feature_names()

      Returns a list of all the names of features currently defined in Module::Build::ConfigData, or in scalar context the number of features.

    • auto_feature_names()

      Returns a list of all the names of features whose availability is dynamically determined, or in scalar context the number of such features. Does not include such features that have later been set to a fixed value.

    • write()

      Commits any changes from set_config() and set_feature() to disk. Requires write access to the Module::Build::ConfigData module.
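
    To illustrate what "serialize correctly using Data::Dumper" means for set_config() values, here is a minimal round-trip sketch. It only demonstrates the dump-and-re-read idea with a sample value; it does not use Module::Build::ConfigData itself and makes no claim about how that module stores its data on disk.

```perl
use strict;
use warnings;
use Data::Dumper;

# A sample value made of nested references, which Data::Dumper handles well.
my $value = { paths => [ '/opt/foo', '/opt/bar' ], enabled => 1 };

# Dumper() emits Perl source of the form '$VAR1 = ...;'. Evaluating that
# source again rebuilds an equivalent structure in $VAR1.
our $VAR1;
my $source = Dumper($value);
eval $source;
die "could not re-read dumped value: $@" if $@;

print "restored paths: @{ $VAR1->{paths} }\n";
```

    A filehandle or socket would not survive this kind of round trip in any useful form, which is why such values are a poor fit for configuration items.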

    AUTHOR

    Module::Build::ConfigData was automatically created using Module::Build. Module::Build was written by Ken Williams, but he holds no authorship claim or copyright claim to the contents of Module::Build::ConfigData.

     

    Module::Build::Cookbook

    NAME

    Module::Build::Cookbook - Examples of Module::Build Usage

    DESCRIPTION

    Module::Build isn't conceptually very complicated, but examples are always helpful. The following recipes should help developers and/or installers put together the pieces from the other parts of the documentation.

    BASIC RECIPES

    Installing modules that use Module::Build

    In most cases, you can just issue the following commands:

        perl Build.PL
        ./Build
        ./Build test
        ./Build install

    There's nothing complicated here - first you're running a script called Build.PL, then you're running a (newly-generated) script called Build and passing it various arguments.

    The exact commands may vary a bit depending on how you invoke perl scripts on your system. For instance, if you have multiple versions of perl installed, you can install to one particular perl's library directories like so:

        /usr/bin/perl5.8.1 Build.PL
        ./Build
        ./Build test
        ./Build install

    If you're on Windows where the current directory is always searched first for scripts, you'll probably do something like this:

        perl Build.PL
        Build
        Build test
        Build install

    On the old Mac OS (version 9 or lower) using MacPerl, you can double-click on the Build.PL script to create the Build script, then double-click on the Build script to run its build, test, and install actions.

    The Build script knows what perl was used to run Build.PL, so you don't need to re-invoke the Build script with the complete perl path each time. If you invoke it with the wrong perl path, you'll get a warning or a fatal error.

    Modifying Config.pm values

    Module::Build relies heavily on various values from perl's Config.pm to do its work. For example, default installation paths are given by installsitelib and installvendorman3dir and friends, C linker & compiler settings are given by ld, lddlflags, cc, ccflags, and so on. If you're pretty sure you know what you're doing, you can tell Module::Build to pretend there are different values in Config.pm than what's really there, by passing arguments for the --config parameter on the command line:

        perl Build.PL --config cc=gcc --config ld=gcc

    Inside the Build.PL script the same thing can be accomplished by passing values for the config parameter to new():

        my $build = Module::Build->new
          (
            ...
            config => { cc => 'gcc', ld => 'gcc' },
            ...
          );

    In custom build code, the same thing can be accomplished by calling the config() method (see config in Module::Build):

        $build->config( cc => 'gcc' );     # Set
        $build->config( ld => 'gcc' );     # Set
        ...
        my $linker = $build->config('ld'); # Get
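
    Before overriding anything, it can help to see what your perl actually reports for these keys. The following sketch uses only the core Config module and prints a few of the values named above; it does not touch Module::Build.

```perl
use strict;
use warnings;
use Config;

# Print the values perl was built with for a few of the keys named above.
for my $key (qw( cc ccflags ld lddlflags installsitelib )) {
    my $value = defined $Config{$key} ? $Config{$key} : '(undefined)';
    print "$key = $value\n";
}
```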

    Installing modules using the programmatic interface

    If you need to build, test, and/or install modules from within some other perl code (as opposed to having the user type installation commands at the shell), you can use the programmatic interface. Create a Module::Build object (or an object of a custom Module::Build subclass) and then invoke its dispatch() method to run various actions.

        my $build = Module::Build->new
          (
            module_name => 'Foo::Bar',
            license     => 'perl',
            requires    => { 'Some::Module' => '1.23' },
          );
        $build->dispatch('build');
        $build->dispatch('test', verbose => 1);
        $build->dispatch('install');

    The first argument to dispatch() is the name of the action, and any following arguments are named parameters.

    This is the interface we use to test Module::Build itself in the regression tests.

    Installing to a temporary directory

    To create packages for package managers like RedHat's rpm or Debian's deb, you may need to install to a temporary directory first and then create the package from that temporary installation. To do this, specify the destdir parameter to the install action:

        ./Build install --destdir /tmp/my-package-1.003

    This essentially just prepends all the installation paths with the /tmp/my-package-1.003 directory.
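
    The prepending can be pictured with the following sketch, which assumes a Unix-like system. The two final installation directories are hypothetical examples, and the code only illustrates the path arithmetic; it is not how Module::Build itself is implemented.

```perl
use strict;
use warnings;
use File::Spec;

my $destdir = '/tmp/my-package-1.003';

# Hypothetical final installation directories, for illustration only.
my @install_dirs = ( '/usr/local/lib/perl5', '/usr/local/bin' );

# Staged location = destdir prepended to the final location.
my @staged = map { File::Spec->catdir( $destdir, $_ ) } @install_dirs;
print "$_\n" for @staged;
```

    A packaging tool can then archive everything under the staging directory and unpack it relative to / on the target machine.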

    Installing to a non-standard directory

    To install to a non-standard directory (for example, if you don't have permission to install in the system-wide directories), you can use the install_base or prefix parameters:

        ./Build install --install_base /foo/bar

    See INSTALL PATHS in Module::Build for a much more complete discussion of how installation paths are determined.

    Installing in the same location as ExtUtils::MakeMaker

    With the introduction of --prefix in Module::Build 0.28 and INSTALL_BASE in ExtUtils::MakeMaker 6.31, it's easy to get them both to install to the same locations.

    First, ensure you have at least version 0.28 of Module::Build installed and 6.31 of ExtUtils::MakeMaker. Prior versions have differing (and in some cases quite strange) installation behaviors.

    The following installation flags are equivalent between ExtUtils::MakeMaker and Module::Build.

        MakeMaker             Module::Build
        PREFIX=...            --prefix ...
        INSTALL_BASE=...      --install_base ...
        DESTDIR=...           --destdir ...
        LIB=...               --install_path lib=...
        INSTALLDIRS=...       --installdirs ...
        INSTALLDIRS=perl      --installdirs core
        UNINST=...            --uninst ...
        INC=...               --extra_compiler_flags ...
        POLLUTE=1             --extra_compiler_flags -DPERL_POLLUTE

    For example, if you are currently installing MakeMaker modules with this command:

        perl Makefile.PL PREFIX=~
        make test
        make install UNINST=1

    You can install into the same location with Module::Build using this:

        perl Build.PL --prefix ~
        ./Build test
        ./Build install --uninst 1

    prefix vs install_base

    The behavior of prefix is complicated and depends on how your Perl is configured. The resulting installation locations will vary from machine to machine and even different installations of Perl on the same machine. Because of this, it's difficult to document where prefix will place your modules.

    In contrast, install_base has predictable, easy to explain installation locations. Now that Module::Build and MakeMaker both have install_base there is little reason to use prefix other than to preserve your existing installation locations. If you are starting a fresh Perl installation we encourage you to use install_base. If you have an existing installation installed via prefix, consider moving it to an installation structure matching install_base and using that instead.

    Running a single test file

    Module::Build supports running a single test, which enables you to track down errors more quickly. Use the following format:

        ./Build test --test_files t/mytest.t

    In addition, you may want to run the test in verbose mode to get more informative output:

        ./Build test --test_files t/mytest.t --verbose 1

    I run this so frequently that I define the following shell alias:

        alias t './Build test --verbose 1 --test_files'

    So then I can just execute t t/mytest.t to run a single test.

    ADVANCED RECIPES

    Making a CPAN.pm-compatible distribution

    New versions of CPAN.pm understand how to use a Build.PL script, but old versions don't. If authors want to help users who have old versions, some form of Makefile.PL should be supplied. The easiest way to accomplish this is to use the create_makefile_pl parameter to Module::Build->new() in the Build.PL script, which can create various flavors of Makefile.PL during the dist action.

    As a best practice, we recommend using the "traditional" style of Makefile.PL unless your distribution has needs that can't be accomplished that way.

    The Module::Build::Compat module, which is part of Module::Build's distribution, is responsible for creating these Makefile.PLs. Please see Module::Build::Compat for the details.

    Changing the order of the build process

    The build_elements property specifies the steps Module::Build will take when building a distribution. To change the build order, change the order of the entries in that property:

        # Process pod files first
        my @e = @{$build->build_elements};
        my ($i) = grep {$e[$_] eq 'pod'} 0..$#e;
        unshift @e, splice @e, $i, 1;

    Currently, build_elements has the following default value:

        [qw( PL support pm xs pod script )]

    Do take care when altering this property, since there may be non-obvious (and non-documented!) ordering dependencies in the Module::Build code.
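
    The reordering idiom used above can be tried in isolation on a plain array holding the default value, as in this sketch. It runs without Module::Build; how the reordered list is handed back to the build object is deliberately not shown here.

```perl
use strict;
use warnings;

# Start from the documented default order of build elements.
my @e = qw( PL support pm xs pod script );

# Find the index of 'pod', remove it with splice, and put it at the front.
my ($i) = grep { $e[$_] eq 'pod' } 0 .. $#e;
unshift @e, splice @e, $i, 1;

print "@e\n";    # pod PL support pm xs script
```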

    Adding new file types to the build process

    Sometimes you might have extra types of files that you want to install alongside the standard types like .pm and .pod files. For instance, you might have a Bar.dat file containing some data related to the Foo::Bar module and you'd like for it to end up as Foo/Bar.dat somewhere in perl's @INC path so Foo::Bar can access it easily at runtime. The following code from a sample Build.PL file demonstrates how to accomplish this:

        use Module::Build;
        my $build = Module::Build->new
          (
            module_name => 'Foo::Bar',
            ...other stuff here...
          );
        $build->add_build_element('dat');
        $build->create_build_script;

    This will find all .dat files in the lib/ directory, copy them to the blib/lib/ directory during the build action, and install them during the install action.

    If your extra files aren't located in the lib/ directory in your distribution, you can explicitly say where they are, just as you'd do with .pm or .pod files:

        use Module::Build;
        my $build = Module::Build->new
          (
            module_name => 'Foo::Bar',
            dat_files   => {'some/dir/Bar.dat' => 'lib/Foo/Bar.dat'},
            ...other stuff here...
          );
        $build->add_build_element('dat');
        $build->create_build_script;

    If your extra files actually need to be created on the user's machine, or if they need some other kind of special processing, you'll probably want to subclass Module::Build and create a special method to process them, named process_${kind}_files():

        use Module::Build;
        my $class = Module::Build->subclass(code => <<'EOF');
          sub process_dat_files {
            my $self = shift;
            ... locate and process *.dat files,
            ... and create something in blib/lib/
          }
        EOF
        my $build = $class->new
          (
            module_name => 'Foo::Bar',
            ...other stuff here...
          );
        $build->add_build_element('dat');
        $build->create_build_script;

    If your extra files don't go in lib/ but in some other place, see Adding new elements to the install process for how to actually get them installed.

    Please note that these examples use some capabilities of Module::Build that first appeared in version 0.26. Before that it could still be done, but the simple cases took a bit more work.

    Adding new elements to the install process

    By default, Module::Build creates seven subdirectories of the blib directory during the build process: lib, arch, bin, script, bindoc, libdoc, and html (some of these may be missing or empty if there's nothing to go in them). Anything copied to these directories during the build will eventually be installed during the install action (see INSTALL PATHS in Module::Build).

    If you need to create a new custom type of installable element, e.g. conf, then you need to tell Module::Build where things in blib/conf/ should be installed. To do this, use the install_path parameter to the new() method:

        my $build = Module::Build->new
          (
            ...other stuff here...
            install_path => { conf => $installation_path }
          );

    Or you can call the install_path() method later:

        $build->install_path(conf => $installation_path);

    The user may also specify the path on the command line:

        perl Build.PL --install_path conf=/foo/path/etc

    The important part, though, is that somehow the install path needs to be set, or else nothing in the blib/conf/ directory will get installed, and a runtime error during the install action will result.

    See also Adding new file types to the build process for how to create the stuff in blib/conf/ in the first place.

    EXAMPLES ON CPAN

    Several distributions on CPAN are making good use of various features of Module::Build. They can serve as real-world examples for others.

    SVN-Notify-Mirror

    http://search.cpan.org/~jpeacock/SVN-Notify-Mirror/

    John Peacock, author of the SVN-Notify-Mirror distribution, says:

    1. Using auto_features, I check to see whether two optional modules are available - SVN::Notify::Config and Net::SSH;
    2. If the S::N::Config module is loaded, I automatically generate test files for it during Build (using the PL_files property).
    3. If the ssh_feature is available, I ask if the user wishes to perform the ssh tests (since it requires a little preliminary setup);
    4. Only if the user has ssh_feature and answers yes to the testing, do I generate a test file.

    I'm sure I could not have handled this complexity with EU::MM, but it was very easy to do with M::B.

    Modifying an action

    Sometimes you might need to have an action, say ./Build install, do something unusual. For instance, you might need to change the ownership of a file or do something else peculiar to your application.

    You can subclass Module::Build on the fly using the subclass() method and override the methods that perform the actions. You may need to read through Module::Build::Authoring and Module::Build::API to find the methods you want to override. All "action" methods are implemented by a method called "ACTION_" followed by the action's name, so here's an example of how it would work for the install action:

        # Build.PL
        use Module::Build;
        my $class = Module::Build->subclass(
            class => "Module::Build::Custom",
            code => <<'SUBCLASS' );
        sub ACTION_install {
            my $self = shift;
            # YOUR CODE HERE
            $self->SUPER::ACTION_install;
        }
        SUBCLASS
        $class->new(
            module_name => 'Your::Module',
            # rest of the usual Module::Build parameters
        )->create_build_script;
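    The naming rule and the call back into the parent class can be seen in isolation with a toy model. This is only a sketch: Toy::Builder and Toy::Builder::Custom are hypothetical stand-ins written for this example, not Module::Build code, and the dispatch method shown is an assumption about the general shape of the mechanism rather than the real implementation.

```perl
use strict;
use warnings;

# Toy::Builder is a hypothetical stand-in for a builder class.
package Toy::Builder;

sub new { my $class = shift; return bless { log => [] }, $class }

# Map an action name such as "install" onto a method named "ACTION_install".
sub dispatch {
    my ($self, $action) = @_;
    my $method = "ACTION_$action";
    die "No action '$action' defined\n" unless $self->can($method);
    return $self->$method();
}

sub ACTION_install {
    my $self = shift;
    push @{ $self->{log} }, 'install';
    return;
}

# The subclass overrides the action, then hands control back to the parent.
package Toy::Builder::Custom;
our @ISA = ('Toy::Builder');

sub ACTION_install {
    my $self = shift;
    push @{ $self->{log} }, 'fix ownership';    # custom step runs first
    return $self->SUPER::ACTION_install();
}

package main;

my $builder = Toy::Builder::Custom->new;
$builder->dispatch('install');
print join(', ', @{ $builder->{log} }), "\n";
```

    Calling SUPER::ACTION_install last keeps the standard behaviour intact, so the override only adds work around it.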

    Adding an action

    You can add a new ./Build action simply by writing the method for it in your subclass. Use depends_on to declare that another action must have been run before your action.

    For example, let's say you wanted to be able to write ./Build commit to test your code and commit it to Subversion.

        # Build.PL
        use Module::Build;
        my $class = Module::Build->subclass(
            class => "Module::Build::Custom",
            code => <<'SUBCLASS' );
        sub ACTION_commit {
            my $self = shift;
            $self->depends_on("test");
            $self->do_system(qw(svn commit));
        }
        SUBCLASS

    Bundling Module::Build

    Note: This section probably needs an update as the technology improves (see contrib/bundle.pl in the distribution).

    Suppose you want to use some new-ish features of Module::Build, e.g. newer than the version of Module::Build your users are likely to already have installed on their systems. The first thing you should do is set configure_requires to your minimum version of Module::Build. See Module::Build::Authoring.

    But not every build system honors configure_requires yet. Here's how you can ship a copy of Module::Build, but still use a newer installed version to take advantage of any bug fixes and upgrades.

    First, install Module::Build into Your-Project/inc/Module-Build. CPAN will not index anything in the inc directory so this copy will not show up in CPAN searches.

        cd Module-Build
        perl Build.PL --install_base /path/to/Your-Project/inc/Module-Build
        ./Build test
        ./Build install

    You should now have all the Module::Build .pm files in Your-Project/inc/Module-Build/lib/perl5.

    Next, add this to the top of your Build.PL.

        my $Bundled_MB = 0.30;  # or whatever version it was.

        # Find out what version of Module::Build is installed or fail quietly.
        # This should be cross-platform.
        my $Installed_MB =
            `$^X -e "eval q{require Module::Build; print Module::Build->VERSION} or exit 1"`;

        # some operating systems put a newline at the end of every print.
        chomp $Installed_MB;

        $Installed_MB = 0 if $?;

        # Use our bundled copy of Module::Build if it's newer than the installed.
        unshift @INC, "inc/Module-Build/lib/perl5" if $Bundled_MB > $Installed_MB;

        require Module::Build;

    And write the rest of your Build.PL normally. Module::Build will remember your change to @INC and use it when you run ./Build.
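    The snippet above compares the two versions with a plain numeric ">". If either version could carry an underscore (a developer release), that string is not a plain number and the comparison would warn under "use warnings". One possible alternative, sketched here under that assumption, is to compare with the core version module. The helper name bundled_is_newer is hypothetical and not part of Module::Build.

```perl
use strict;
use warnings;
use version;

# Hypothetical helper: true when the bundled copy should be preferred.
# $installed may be 0, undef, or an empty string when nothing is installed.
sub bundled_is_newer {
    my ($bundled, $installed) = @_;
    return 1 unless defined $installed && length $installed;
    return version->parse($bundled) > version->parse($installed) ? 1 : 0;
}

print bundled_is_newer('0.40', '0.30')   ? "use bundled\n" : "use installed\n";
print bundled_is_newer('0.30', '0.4003') ? "use bundled\n" : "use installed\n";
```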

    In the future, we hope to provide a more automated solution for this scenario; see inc/latest.pm in the Module::Build distribution for one indication of the direction we're moving.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    COPYRIGHT

    Copyright (c) 2001-2008 Ken Williams. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    perl(1), Module::Build(3), Module::Build::Authoring(3), Module::Build::API(3)

     
    Module::Build::ModuleInfo

    Perl 5 version 18.2 documentation

    NAME

    Module::Build::ModuleInfo - DEPRECATED

    DESCRIPTION

    This module has been extracted into a separate distribution and renamed Module::Metadata. This module is kept as a subclass wrapper for compatibility.

    SEE ALSO

    perl(1), Module::Build, Module::Metadata

     
    Module::Build::Notes

    NAME

    Module::Build::Notes - Create persistent distribution configuration modules

    DESCRIPTION

    This module is used internally by Module::Build to create persistent configuration files that can be installed with a distribution. See Module::Build::ConfigData for an example.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    COPYRIGHT

    Copyright (c) 2001-2006 Ken Williams. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    perl(1), Module::Build(3)

     
    Module::Build::PPMMaker

    NAME

    Module::Build::PPMMaker - Perl Package Manager file creation

    SYNOPSIS

        On the command line, builds a .ppd file:
        ./Build ppd

    DESCRIPTION

    This package contains the code that builds .ppd "Perl Package Description" files, in support of ActiveState's "Perl Package Manager". Details are here: http://aspn.activestate.com/ASPN/Downloads/ActivePerl/PPM/

    AUTHOR

    Dave Rolsky <autarch@urth.org>, Ken Williams <kwilliams@cpan.org>

    COPYRIGHT

    Copyright (c) 2001-2006 Ken Williams. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    perl(1), Module::Build(3)

     
    Module::Build::Version

    NAME

    Module::Build::Version - DEPRECATED

    DESCRIPTION

    Module::Build now lists version as a configure_requires dependency and no longer installs a copy.

    Module::Build::YAML

    NAME

    Module::Build::YAML - DEPRECATED

    DESCRIPTION

    This module was originally an inline copy of YAML::Tiny. It has been deprecated in favor of using CPAN::Meta::YAML directly. This module is kept as a subclass wrapper for compatibility.

    Module::Build::Platform::Amiga

    NAME

    Module::Build::Platform::Amiga - Builder class for Amiga platforms

    DESCRIPTION

    The sole purpose of this module is to inherit from Module::Build::Base. Please see Module::Build for the docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::Default

    NAME

    Module::Build::Platform::Default - Stub class for unknown platforms

    DESCRIPTION

    The sole purpose of this module is to inherit from Module::Build::Base. Please see Module::Build for the docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::EBCDIC

    NAME

    Module::Build::Platform::EBCDIC - Builder class for EBCDIC platforms

    DESCRIPTION

    The sole purpose of this module is to inherit from Module::Build::Base. Please see Module::Build for the docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::MPEiX

    NAME

    Module::Build::Platform::MPEiX - Builder class for MPEiX platforms

    DESCRIPTION

    The sole purpose of this module is to inherit from Module::Build::Base. Please see Module::Build for the docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::MacOS

    NAME

    Module::Build::Platform::MacOS - Builder class for MacOS platforms

    DESCRIPTION

    The sole purpose of this module is to inherit from Module::Build::Base and override a few methods. Please see Module::Build for the docs.

    Overridden Methods

    • new()

      MacPerl doesn't define $Config{sitelib} or $Config{sitearch} for some reason, but $Config{installsitelib} and $Config{installsitearch} are there. So we copy the install variables to the other location.

    • make_executable()

      On MacOS we set the file type and creator to MacPerl so it will run with a double-click.

    • dispatch()

      Because there's no easy way to say "./Build test" on MacOS, if dispatch is called with no arguments and no @ARGV a dialog box will pop up asking what action to take and any extra arguments.

      Default action is "test".

    • ACTION_realclean()

      Need to unlock the Build program before deleting.

    AUTHOR

    Michael G Schwern <schwern@pobox.com>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::RiscOS

    NAME

    Module::Build::Platform::RiscOS - Builder class for RiscOS platforms

    DESCRIPTION

    The sole purpose of this module is to inherit from Module::Build::Base. Please see Module::Build for the docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::Unix

    NAME

    Module::Build::Platform::Unix - Builder class for Unix platforms

    DESCRIPTION

    The sole purpose of this module is to inherit from Module::Build::Base. Please see Module::Build for the docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::VMS

    NAME

    Module::Build::Platform::VMS - Builder class for VMS platforms

    DESCRIPTION

    This module inherits from Module::Build::Base and alters a few minor details of its functionality. Please see Module::Build for the general docs.

    Overridden Methods

    • _set_defaults

      Change $self->{build_script} to 'Build.com' so @Build works.

    • cull_args

      '@Build foo' on VMS will not preserve the case of 'foo'. Rather than forcing people to write '@Build "foo"' we'll dispatch case-insensitively.

    • manpage_separator

      Use '__' instead of '::'.

    • prefixify

      Prefixify taking into account VMS' filepath syntax.

    • _quote_args

      Command-line arguments (but not the command itself) must be quoted to ensure case preservation.

    • have_forkpipe

      There is no native fork(), so some constructs depending on it are not available.

    • _backticks

      Override to ensure that we quote the arguments but not the command.

    • find_command

      Locate an executable program.

    • _maybe_command (override)

      Follows VMS naming conventions for executable files. If the name passed in doesn't exactly match an executable file, appends .Exe (or equivalent) to check for executable image, and .Com to check for DCL procedure. If this fails, checks directories in DCL$PATH and finally Sys$System: for an executable file having the name specified, with or without the .Exe-equivalent suffix.

    • do_system

      Override to ensure that we quote the arguments but not the command.

    • oneliner

      Override to ensure that we do not quote the command.

    • _infer_xs_spec

      Inherit the standard version but tweak the library file name to be something Dynaloader can find.

    • rscan_dir

      Inherit the standard version but remove dots at end of name. If the extended character set is in effect, do not remove dots from filenames with Unix path delimiters.

    • dist_dir

      Inherit the standard version but replace embedded dots with underscores because a dot is the directory delimiter on VMS.

    • man3page_name

      Inherit the standard version but chop the extra manpage delimiter off the front if there is one. The VMS version of splitdir('[.foo]') returns '', 'foo'.

    • expand_test_dir

      Inherit the standard version but relativize the paths as the native glob() doesn't do that for us.

    • _detildefy

      The home-grown glob() does not currently handle tildes, so provide limited support here. Expect only UNIX format file specifications for now.

    • find_perl_interpreter

      On VMS, $^X returns the fully qualified absolute path including version number. It's logically impossible to improve on it for getting the perl we're currently running, and attempting to manipulate it is usually lossy.

    • localize_file_path

      Convert the file path to the local syntax

    • localize_dir_path

      Convert the directory path to the local syntax

    • ACTION_clean

      The home-grown glob() expands a bit too aggressively when given a bare name, so default in a zero-length extension.

    AUTHOR

    Michael G Schwern <schwern@pobox.com>, Ken Williams <kwilliams@cpan.org>, Craig A. Berry <craigberry@mac.com>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::VOS

    NAME

    Module::Build::Platform::VOS - Builder class for VOS platforms

    DESCRIPTION

    The sole purpose of this module is to inherit from Module::Build::Base. Please see Module::Build for the docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::Windows

    NAME

    Module::Build::Platform::Windows - Builder class for Windows platforms

    DESCRIPTION

    The sole purpose of this module is to inherit from Module::Build::Base and override a few methods. Please see Module::Build for the docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>, Randy W. Sims <RandyS@ThePierianSpring.org>

    SEE ALSO

    perl(1), Module::Build(3)

     
    Module::Build::Platform::aix

    NAME

    Module::Build::Platform::aix - Builder class for AIX platform

    DESCRIPTION

    This module provides some routines very specific to the AIX platform.

    Please see Module::Build for the general docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::cygwin

    NAME

    Module::Build::Platform::cygwin - Builder class for Cygwin platform

    DESCRIPTION

    This module provides some routines very specific to the cygwin platform.

    Please see Module::Build for the general docs.

    AUTHOR

    Initial stub by Yitzchak Scott-Thoennes <sthoenna@efn.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::darwin

    NAME

    Module::Build::Platform::darwin - Builder class for Mac OS X platform

    DESCRIPTION

    This module provides some routines very specific to the Mac OS X platform.

    Please see Module::Build for the general docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Module::Build::Platform::os2

    NAME

    Module::Build::Platform::os2 - Builder class for OS/2 platform

    DESCRIPTION

    This module provides some routines very specific to the OS/2 platform.

    Please see Module::Build for the general docs.

    AUTHOR

    Ken Williams <kwilliams@cpan.org>

    SEE ALSO

    perl(1), Module::Build(3), ExtUtils::MakeMaker(3)

     
    Memoize::AnyDBM_File

    NAME

    Memoize::AnyDBM_File - glue to provide EXISTS for AnyDBM_File for Storable use

    DESCRIPTION

    See Memoize.

    Memoize::Expire

    NAME

    Memoize::Expire - Plug-in module for automatic expiration of memoized values

    SYNOPSIS

        use Memoize;
        use Memoize::Expire;
        tie my %cache => 'Memoize::Expire',
                         LIFETIME => $lifetime,    # In seconds
                         NUM_USES => $n_uses;
        memoize 'function', SCALAR_CACHE => [HASH => \%cache ];

    DESCRIPTION

    Memoize::Expire is a plug-in module for Memoize. It allows the cached values for memoized functions to expire automatically. This manual assumes you are already familiar with the Memoize module. If not, you should study that manual carefully first, paying particular attention to the HASH feature.

    Memoize::Expire is a layer of software that you can insert in between Memoize itself and whatever underlying package implements the cache. The layer presents a hash variable whose values expire whenever they get too old, have been used too often, or both. You tell Memoize to use this forgetful hash as its cache instead of the default, which is an ordinary hash.

    To specify a real-time timeout, supply the LIFETIME option with a numeric value. Cached data will expire after this many seconds, and will be looked up afresh when it expires. When a data item is looked up afresh, its lifetime is reset.

    If you specify NUM_USES with an argument of n, then each cached data item will be discarded and looked up afresh after the nth time you access it. When a data item is looked up afresh, its number of uses is reset.

    If you specify both arguments, data will be discarded from the cache when either expiration condition holds.
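    As a concrete illustration of the two options, here is a small sketch that counts how often the underlying function really runs. The function slow_square and the counter $calls are made up for the example, and the values 60 and 2 are arbitrary.

```perl
use strict;
use warnings;
use Memoize;
use Memoize::Expire;

my $calls = 0;    # counts real (uncached) computations

sub slow_square {
    my ($n) = @_;
    $calls++;
    return $n * $n;
}

# Each cached value may be used twice or for 60 seconds, whichever ends first.
tie my %cache => 'Memoize::Expire',
    LIFETIME => 60,
    NUM_USES => 2;

memoize 'slow_square', SCALAR_CACHE => [ HASH => \%cache ];

# Scalar-context calls, so that SCALAR_CACHE is the cache consulted.
my $first  = slow_square(4);
my $second = slow_square(4);
my $third  = slow_square(4);

print "results: $first $second $third; computed $calls time(s)\n";
```

    The third call exceeds the two permitted uses, so the value is looked up afresh and the counter shows more computations than a plain memoized function would need.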

    Memoize::Expire uses a real hash internally to store the cached data. You can use the HASH option to Memoize::Expire to supply a tied hash in place of the ordinary hash that Memoize::Expire will normally use. You can use this feature to add Memoize::Expire as a layer in between a persistent disk hash and Memoize. If you do this, you get a persistent disk cache whose entries expire automatically. For example:

        # Memoize
        #    |
        #    Memoize::Expire   enforces data expiration policy
        #    |
        #    DB_File           implements persistence of data in a disk file
        #    |
        #    Disk file

        use Memoize;
        use Memoize::Expire;
        use DB_File;

        # Set up persistence
        tie my %disk_cache => 'DB_File', $filename, O_CREAT|O_RDWR, 0666;

        # Set up expiration policy, supplying persistent hash as a target
        tie my %cache => 'Memoize::Expire',
                         LIFETIME => $lifetime,    # In seconds
                         NUM_USES => $n_uses,
                         HASH => \%disk_cache;

        # Set up memoization, supplying expiring persistent hash for cache
        memoize 'function', SCALAR_CACHE => [ HASH => \%cache ];

    INTERFACE

    There is nothing special about Memoize::Expire. It is just an example. If you don't like the policy that it implements, you are free to write your own expiration policy module that implements whatever policy you desire. Here is how to do that. Let us suppose that your module will be named MyExpirePolicy.

    Short summary: You need to create a package that defines four methods, plus an optional fifth:

    • TIEHASH

      Construct and return cache object.

    • EXISTS

      Given a function argument, is the corresponding function value in the cache, and if so, is it fresh enough to use?

    • FETCH

      Given a function argument, look up the corresponding function value in the cache and return it.

    • STORE

      Given a function argument and the corresponding function value, store them into the cache.

    • CLEAR

      (Optional.) Flush the cache completely.

    The user who wants the memoization cache to be expired according to your policy will say so by writing

        tie my %cache => 'MyExpirePolicy', args...;
        memoize 'function', SCALAR_CACHE => [HASH => \%cache];

    This will invoke MyExpirePolicy->TIEHASH(args). MyExpirePolicy::TIEHASH should do whatever is appropriate to set up the cache, and it should return the cache object to the caller.

    For example, MyExpirePolicy::TIEHASH might create an object that contains a regular Perl hash (which it will use to store the cached values) and some extra information about the arguments and how old the data is and things like that. Let us call this object `C'.

    When Memoize needs to check to see if an entry is in the cache already, it will invoke C->EXISTS(key). key is the normalized function argument. MyExpirePolicy::EXISTS should return 0 if the key is not in the cache, or if it has expired, and 1 if an unexpired value is in the cache. It should not return undef, because there is a bug in some versions of Perl that will cause a spurious FETCH if the EXISTS method returns undef.

    If your EXISTS function returns true, Memoize will try to fetch the cached value by invoking C->FETCH(key). MyExpirePolicy::FETCH should return the cached value. Otherwise, Memoize will call the memoized function to compute the appropriate value, and will store it into the cache by calling C->STORE(key, value).
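    The order of these calls can be observed directly with a tied hash that records every method invoked on it. The following is only a sketch: Trace::Cache is a hypothetical class written for this illustration, and the lookup function imitates the sequence described above by hand rather than calling Memoize itself.

```perl
use strict;
use warnings;

# Trace::Cache is a hypothetical class that records which tie methods run.
package Trace::Cache;

our @LOG;

sub TIEHASH { my $class = shift; return bless { data => {} }, $class }

sub EXISTS {
    my ($self, $key) = @_;
    push @LOG, 'EXISTS';
    return exists $self->{data}{$key} ? 1 : 0;    # 1 or 0, never undef
}

sub FETCH {
    my ($self, $key) = @_;
    push @LOG, 'FETCH';
    return $self->{data}{$key};
}

sub STORE {
    my ($self, $key, $value) = @_;
    push @LOG, 'STORE';
    $self->{data}{$key} = $value;
    return;
}

package main;

tie my %cache => 'Trace::Cache';

# The lookup sequence the text describes, written out by hand.
sub lookup {
    my ($key, $compute) = @_;
    if (exists $cache{$key}) {        # EXISTS
        my $hit = $cache{$key};       # FETCH
        return $hit;
    }
    my $value = $compute->($key);
    $cache{$key} = $value;            # STORE
    return $value;
}

lookup(3, sub { $_[0] * 2 });    # cache miss
lookup(3, sub { $_[0] * 2 });    # cache hit
print join(' ', @Trace::Cache::LOG), "\n";
```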

    Here is a very brief example of a policy module that expires each cache item after ten seconds.

        package Memoize::TenSecondExpire;

        sub TIEHASH {
            my ($package, %args) = @_;
            my $cache = $args{HASH} || {};
            bless $cache => $package;
        }

        sub EXISTS {
            my ($cache, $key) = @_;
            if (exists $cache->{$key} &&
                $cache->{$key}{EXPIRE_TIME} > time) {
                return 1;
            } else {
                return 0;  # Do NOT return `undef' here.
            }
        }

        sub FETCH {
            my ($cache, $key) = @_;
            return $cache->{$key}{VALUE};
        }

        sub STORE {
            my ($cache, $key, $newvalue) = @_;
            $cache->{$key}{VALUE} = $newvalue;
            $cache->{$key}{EXPIRE_TIME} = time + 10;
        }

    To use this expiration policy, the user would say

        use Memoize;
        tie my %cache10sec => 'Memoize::TenSecondExpire';
        memoize 'function', SCALAR_CACHE => [HASH => \%cache10sec];

    Memoize would then call function whenever a cached value was entirely absent or was older than ten seconds.
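    A variation on the same idea, offered only as a sketch, makes the lifetime configurable and adds the optional CLEAR method. The package name My::LifetimeExpire, its LIFETIME option, and the function lookup_price are hypothetical and exist only for this example. Note that it stores a nested hash reference per key, so a plain DBM file passed as HASH would likely need a serializing layer in between.

```perl
use strict;
use warnings;
use Memoize;

# My::LifetimeExpire is a hypothetical policy module; LIFETIME is an option
# invented for this sketch (HASH follows the convention in the text).
package My::LifetimeExpire;

sub TIEHASH {
    my ($package, %args) = @_;
    my $self = {
        C        => $args{HASH}     || {},    # underlying storage
        LIFETIME => $args{LIFETIME} || 10,    # seconds
    };
    return bless $self => $package;
}

sub EXISTS {
    my ($self, $key) = @_;
    my $entry = $self->{C}{$key};
    return ($entry && $entry->{EXPIRE_TIME} > time) ? 1 : 0;    # never undef
}

sub FETCH {
    my ($self, $key) = @_;
    return $self->{C}{$key}{VALUE};
}

sub STORE {
    my ($self, $key, $value) = @_;
    $self->{C}{$key} = {
        VALUE       => $value,
        EXPIRE_TIME => time + $self->{LIFETIME},
    };
    return;
}

# Optional: flush every cached entry.
sub CLEAR {
    my ($self) = @_;
    %{ $self->{C} } = ();
    return;
}

package main;

my $calls = 0;
sub lookup_price { my ($item) = @_; $calls++; return length($item) * 10 }

tie my %cache => 'My::LifetimeExpire', LIFETIME => 30;
memoize 'lookup_price', SCALAR_CACHE => [ HASH => \%cache ];

my $p1 = lookup_price('apple');    # computed
my $p2 = lookup_price('apple');    # served from the cache
print "price $p1, computed $calls time(s)\n";
```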

    You should always support a HASH argument to TIEHASH that ties the underlying cache so that the user can specify that the cache is also persistent or that it has some other interesting semantics. The example above demonstrates how to do this, as does Memoize::Expire.

    Another sample module, Memoize::Saves, is available in a separate distribution on CPAN. It implements a policy that allows you to specify that certain function values would always be looked up afresh. See the documentation for details.

    ALTERNATIVES

    Brent Powers has a Memoize::ExpireLRU module that was designed to work with Memoize and provides expiration of least-recently-used data. The cache is held at a fixed number of entries, and when new data comes in, the least-recently used data is expired. See http://search.cpan.org/search?mode=module&query=ExpireLRU.

    Joshua Chamas's Tie::Cache module may be useful as an expiration manager. (If you try this, let me know how it works out.)

    If you develop any useful expiration managers that you think should be distributed with Memoize, please let me know.

    CAVEATS

    This module is experimental, and may contain bugs. Please report bugs to the address below.

    Number-of-uses is stored as a 16-bit unsigned integer, so can't exceed 65535.

    Because of clock granularity, expiration times may occur up to one second sooner than you expect. For example, suppose you store a value with a lifetime of ten seconds, and you store it at 12:00:00.998 on a certain day. Memoize will look at the clock and see 12:00:00. Then 9.01 seconds later, at 12:00:10.008 you try to read it back. Memoize will look at the clock and see 12:00:10 and conclude that the value has expired. This will probably not occur if you have Time::HiRes installed.
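    The arithmetic behind this caveat can be checked with a few lines of Perl. This is a sketch of the reasoning only, not of the module's internals: times are written as seconds after 12:00:00, and the comparison used for expiry is an assumption chosen to match the description above.

```perl
use strict;
use warnings;
use POSIX qw(floor);

my $lifetime  = 10;        # seconds
my $stored_at = 0.998;     # true time of the store, seconds after 12:00:00
my $read_at   = 10.008;    # true time of the later read

# A whole-second clock sees only the integer part of each instant.
my $expire_at = floor($stored_at) + $lifetime;
my $expired   = floor($read_at) >= $expire_at ? 1 : 0;
my $true_age  = $read_at - $stored_at;

printf "expires at second %d, expired=%d, true age %.2f s\n",
    $expire_at, $expired, $true_age;
```

    The value is treated as expired even though its true age is still short of the ten-second lifetime, which is the effect the paragraph above describes.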

    AUTHOR

    Mark-Jason Dominus (mjd-perl-memoize+@plover.com)

    Mike Cariaso provided valuable insight into the best way to solve this problem.

    SEE ALSO

    perl(1)

    The Memoize man page.

    http://www.plover.com/~mjd/perl/Memoize/ (for news and updates)

    I maintain a mailing list on which I occasionally announce new versions of Memoize. The list is for announcements only, not discussion. To join, send an empty message to mjd-perl-memoize-request@Plover.com.

     
    Memoize::ExpireFile

    NAME

    Memoize::ExpireFile - test for Memoize expiration semantics

    DESCRIPTION

    See Memoize::Expire.

    Memoize::ExpireTest

    NAME

    Memoize::ExpireTest - test for Memoize expiration semantics

    DESCRIPTION

    This module is just for testing expiration semantics. It's not a very good example of how to write an expiration module.

    If you are looking for an example, I recommend that you look at the simple example in the Memoize::Expire documentation, or at the code for Memoize::Expire itself.

    If you have questions, I will be happy to answer them if you send them to mjd-perl-memoize+@plover.com.

    Memoize::NDBM_File

    NAME

    Memoize::NDBM_File - glue to provide EXISTS for NDBM_File for Storable use

    DESCRIPTION

    See Memoize.

    Memoize::SDBM_File

    NAME

    Memoize::SDBM_File - glue to provide EXISTS for SDBM_File for Storable use

    DESCRIPTION

    See Memoize.

    Memoize::Storable

    NAME

    Memoize::Storable - store Memoized data in Storable database

    DESCRIPTION

    See Memoize.

    Math::BigFloat

    NAME

    Math::BigFloat - Arbitrary size floating point math package

    SYNOPSIS

    1. use Math::BigFloat;
    2. # Number creation
    3. my $x = Math::BigFloat->new($str); # defaults to 0
    4. my $y = $x->copy(); # make a true copy
    5. my $nan = Math::BigFloat->bnan(); # create a NotANumber
    6. my $zero = Math::BigFloat->bzero(); # create a +0
    7. my $inf = Math::BigFloat->binf(); # create a +inf
    8. my $inf = Math::BigFloat->binf('-'); # create a -inf
    9. my $one = Math::BigFloat->bone(); # create a +1
    10. my $mone = Math::BigFloat->bone('-'); # create a -1
    11. my $pi = Math::BigFloat->bpi(100); # PI to 100 digits
    12. # the following examples compute their result to 100 digits accuracy:
    13. my $cos = Math::BigFloat->new(1)->bcos(100); # cosine(1)
    14. my $sin = Math::BigFloat->new(1)->bsin(100); # sine(1)
    15. my $atan = Math::BigFloat->new(1)->batan(100); # arctangent(1)
    16. my $atan2 = Math::BigFloat->new( 1 )->batan2( 1 ,100); # batan(1)
    17. my $atan2 = Math::BigFloat->new( 1 )->batan2( 8 ,100); # batan(1/8)
    18. my $atan2 = Math::BigFloat->new( -2 )->batan2( 1 ,100); # batan(-2)
    19. # Testing
    20. $x->is_zero(); # true if arg is +0
    21. $x->is_nan(); # true if arg is NaN
    22. $x->is_one(); # true if arg is +1
    23. $x->is_one('-'); # true if arg is -1
    24. $x->is_odd(); # true if odd, false for even
    25. $x->is_even(); # true if even, false for odd
    26. $x->is_pos(); # true if > 0
    27. $x->is_neg(); # true if < 0
    28. $x->is_inf(sign); # true if +inf, or -inf (default is '+')
    29. $x->bcmp($y); # compare numbers (undef,<0,=0,>0)
    30. $x->bacmp($y); # compare absolutely (undef,<0,=0,>0)
    31. $x->sign(); # return the sign, either +,- or NaN
    32. $x->digit($n); # return the nth digit, counting from right
    33. $x->digit(-$n); # return the nth digit, counting from left
    34. # The following all modify their first argument. If you want to
    35. # preserve $x, use $z = $x->copy()->bXXX($y); See under CAVEATS for
    36. # why this is necessary when mixing $a = $b assignments with non-overloaded math.
    37. # set
    38. $x->bzero(); # set $x to 0
    39. $x->bnan(); # set $x to NaN
    40. $x->bone(); # set $x to +1
    41. $x->bone('-'); # set $x to -1
    42. $x->binf(); # set $x to inf
    43. $x->binf('-'); # set $x to -inf
    44. $x->bneg(); # negation
    45. $x->babs(); # absolute value
    46. $x->bnorm(); # normalize (no-op)
    47. $x->bnot(); # two's complement (bit wise not)
    48. $x->binc(); # increment x by 1
    49. $x->bdec(); # decrement x by 1
    50. $x->badd($y); # addition (add $y to $x)
    51. $x->bsub($y); # subtraction (subtract $y from $x)
    52. $x->bmul($y); # multiplication (multiply $x by $y)
    53. $x->bdiv($y); # divide, set $x to quotient
    54. # return (quo,rem) or quo if scalar
    55. $x->bmod($y); # modulus ($x % $y)
    56. $x->bpow($y); # power of arguments ($x ** $y)
    57. $x->bmodpow($exp,$mod); # modular exponentiation (($x ** $exp) % $mod)
    58. $x->blsft($y, $n); # left shift by $y places in base $n
    59. $x->brsft($y, $n); # right shift by $y places in base $n
    60. # returns (quo,rem) or quo if in scalar context
    61. $x->blog(); # logarithm of $x to base e (Euler's number)
    62. $x->blog($base); # logarithm of $x to base $base (e.g. 2)
    63. $x->bexp(); # calculate e ** $x where e is Euler's number
    64. $x->band($y); # bit-wise and
    65. $x->bior($y); # bit-wise inclusive or
    66. $x->bxor($y); # bit-wise exclusive or
    67. $x->bnot(); # bit-wise not (two's complement)
    68. $x->bsqrt(); # calculate square-root
    69. $x->broot($y); # $y'th root of $x (e.g. $y == 3 => cubic root)
    70. $x->bfac(); # factorial of $x (1*2*3*4*..$x)
    71. $x->bround($N); # accuracy: preserve $N digits
    72. $x->bfround($N); # precision: round to the $Nth digit
    73. $x->bfloor(); # return integer less than or equal to $x
    74. $x->bceil(); # return integer greater than or equal to $x
    75. # The following do not modify their arguments:
    76. bgcd(@values); # greatest common divisor
    77. blcm(@values); # lowest common multiple
    78. $x->bstr(); # return string
    79. $x->bsstr(); # return string in scientific notation
    80. $x->as_int(); # return $x as BigInt
    81. $x->exponent(); # return exponent as BigInt
    82. $x->mantissa(); # return mantissa as BigInt
    83. $x->parts(); # return (mantissa,exponent) as BigInt
    84. $x->length(); # number of digits (w/o sign and '.')
    85. ($l,$f) = $x->length(); # number of digits, and length of fraction
    86. $x->precision(); # return P of $x (or global, if P of $x undef)
    87. $x->precision($n); # set P of $x to $n
    88. $x->accuracy(); # return A of $x (or global, if A of $x undef)
    89. $x->accuracy($n); # set A of $x to $n
    90. # these get/set the appropriate global value for all BigFloat objects
    91. Math::BigFloat->precision(); # Precision
    92. Math::BigFloat->accuracy(); # Accuracy
    93. Math::BigFloat->round_mode(); # rounding mode

    DESCRIPTION

    All operators (including basic math operations) are overloaded if you declare your big floating point numbers as

    1. $i = Math::BigFloat->new('12_3.456_789_123_456_789E-2');

    Operations with overloaded operators preserve the arguments, which is exactly what you expect.

    Canonical notation

    Input to these routines is either a BigFloat object or a string in one of the following four forms:

    • /^[+-]\d+$/

    • /^[+-]\d+\.\d*$/

    • /^[+-]\d+E[+-]?\d+$/

    • /^[+-]\d*\.\d+E[+-]?\d+$/

    all with optional leading and trailing zeros and/or spaces. Additionally, numbers are allowed to have an underscore between any two digits.

    Empty strings, as well as other illegal numbers, result in 'NaN'.

    bnorm() on a BigFloat object is now effectively a no-op, since the numbers are always stored in normalized form. On a string, it creates a BigFloat object.

    Output

    Output values are BigFloat objects (normalized), except for bstr() and bsstr().

    The string output always has leading and trailing zeros stripped and drops a plus sign. bstr() always gives you the form with a decimal point, while bsstr() (s for scientific) gives you scientific notation.

    Input            bstr()        bsstr()
    '-0'             '0'           '0E1'
    ' -123 123 123'  '-123123123'  '-123123123E0'
    '00.0123'        '0.0123'      '123E-4'
    '123.45E-2'      '1.2345'      '12345E-4'
    '10E+3'          '10000'       '1E4'
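
    The rows of this table can be reproduced with a short loop. This is a minimal sketch: the hash %plain is a name chosen for the example, and because the exact spelling of the exponent marker printed by bsstr() may differ between versions, the sketch prints that column rather than assuming it.

```perl
use strict;
use warnings;
use Math::BigFloat;

# Inputs taken from the table above.
my @inputs = ('-0', '00.0123', '123.45E-2', '10E+3');

my %plain;
for my $in (@inputs) {
    my $x = Math::BigFloat->new($in);
    $plain{$in} = $x->bstr();              # plain decimal form
    printf "%-12s bstr=%-8s bsstr=%s\n", "'$in'", $plain{$in}, $x->bsstr();
}
```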

    Some routines (is_odd(), is_even(), is_zero(), is_one(), is_nan()) return true or false, while others (bcmp(), bacmp()) return either undef, <0, 0 or >0 and are suited for sort.

    Actual math is done by using the class defined with with => Class; (which defaults to BigInts) to represent the mantissa and exponent.

    The sign /^[+-]$/ is stored separately. The string 'NaN' is used to represent the result when input arguments are not numbers, as well as the result of dividing by zero.

    mantissa(), exponent() and parts()

    mantissa() and exponent() return the said parts of the BigFloat as BigInts such that:

    1. $m = $x->mantissa();
    2. $e = $x->exponent();
    3. $y = $m * ( 10 ** $e );
    4. print "ok\n" if $x == $y;

    ($m,$e) = $x->parts(); is just a shortcut giving you both of them.

    A zero is represented and returned as 0E1, not 0E0 (after Knuth).

    Currently the mantissa is reduced as much as possible, favouring higher exponents over lower ones (e.g. returning 1e7 instead of 10e6 or 10000000e0). This might change in the future, so do not depend on it.
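
    The relationship between a number and its parts can be checked as follows. This is a sketch under one deliberate assumption: the value is rebuilt with Math::BigFloat arithmetic, so that a negative exponent yields a fraction instead of an integer result. The variable names are illustrative.

```perl
use strict;
use warnings;
use Math::BigFloat;

my $x = Math::BigFloat->new('123.45E-2');   # 1.2345

my ($m, $e) = $x->parts();                  # mantissa and exponent

# Rebuild the value as mantissa * 10 ** exponent using BigFloat objects.
my $scale   = Math::BigFloat->new(10)->bpow(Math::BigFloat->new($e->bstr()));
my $rebuilt = $scale->copy()->bmul(Math::BigFloat->new($m->bstr()));

print "mantissa=$m exponent=$e rebuilt=$rebuilt\n";
print "ok\n" if $rebuilt == $x;
```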

    Accuracy vs. Precision

    See also: Rounding.

    Math::BigFloat supports both precision (rounding to a certain place before or after the dot) and accuracy (rounding to a certain number of digits). For full documentation, examples and tips on these topics please see the large section about rounding in Math::BigInt.

    Since things like sqrt(2) or 1 / 3 must be presented with limited accuracy lest an operation consume all resources, each operation produces no more than the requested number of digits.

    If there is no global precision or accuracy set, and the operation in question was not called with a requested precision or accuracy, and the input $x has no accuracy or precision set, then a fallback parameter will be used. For historical reasons, it is called div_scale and can be accessed via:

    1. $d = Math::BigFloat->div_scale(); # query
    2. Math::BigFloat->div_scale($n); # set to $n digits

    The default value for div_scale is 40.
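
    A quick way to see the fallback in action is to divide 1 by 3 with no accuracy or precision set anywhere. The sketch below assumes an untouched global configuration; the variable names are illustrative.

```perl
use strict;
use warnings;
use Math::BigFloat;

my $default = Math::BigFloat->div_scale();      # query the fallback

# Scalar assignment puts bdiv() in scalar context, so only the
# quotient is returned.
my $third  = Math::BigFloat->new(1)->bdiv(3);
my $digits = () = $third->bstr() =~ /3/g;       # count the 3s in the result

print "div_scale=$default digits=$digits\n";
```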

    In case the result of one operation has more digits than specified, it is rounded. The rounding mode taken is either the default mode, or the one supplied to the operation after the scale:

    1. $x = Math::BigFloat->new(2);
    2. Math::BigFloat->accuracy(5); # 5 digits max
    3. $y = $x->copy()->bdiv(3); # will give 0.66667
    4. $y = $x->copy()->bdiv(3,6); # will give 0.666667
    5. $y = $x->copy()->bdiv(3,6,undef,'odd'); # will give 0.666667
    6. Math::BigFloat->round_mode('zero');
    7. $y = $x->copy()->bdiv(3,6); # will also give 0.666667

    Note that Math::BigFloat->accuracy() and Math::BigFloat->precision() set the global variables, and thus any newly created number will be subject to the global rounding immediately. This means that in the examples above, the 3 as argument to bdiv() will also get an accuracy of 5.

    It is less confusing to either calculate the result fully, and afterwards round it explicitly, or use the additional parameters to the math functions like so:

    1. use Math::BigFloat;
    2. $x = Math::BigFloat->new(2);
    3. $y = $x->copy()->bdiv(3);
    4. print $y->bround(5),"\n"; # will give 0.66667
    5. or
    6. use Math::BigFloat;
    7. $x = Math::BigFloat->new(2);
    8. $y = $x->copy()->bdiv(3,5); # will give 0.66667
    9. print "$y\n";

    Rounding

    • ffround ( +$scale )

      Rounds to the $scale'th place left from the '.', counting from the dot. The first digit is numbered 1.

    • ffround ( -$scale )

      Rounds to the $scale'th place right from the '.', counting from the dot.

    • ffround ( 0 )

      Rounds to an integer.

    • fround ( +$scale )

      Preserves accuracy to $scale digits from the left (aka significant digits) and pads the rest with zeros. If the number is between 1 and -1, the significant digits count from the first non-zero after the '.'

    • fround ( -$scale ) and fround ( 0 )

      These are effectively no-ops.

    All rounding functions take as a second parameter a rounding mode from one of the following: 'even', 'odd', '+inf', '-inf', 'zero', 'trunc' or 'common'.

    The default rounding mode is 'even'. By using Math::BigFloat->round_mode($round_mode); you can get and set the default mode for subsequent rounding. The usage of $Math::BigFloat::round_mode is no longer supported. The second parameter to the round functions then overrides the default temporarily.
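
    The effect of the scale argument and of the rounding mode can be sketched with bfround() and bround(), the method names listed in the SYNOPSIS. The sample values are chosen for illustration only.

```perl
use strict;
use warnings;
use Math::BigFloat;

my $x = Math::BigFloat->new('123.456');

my $two_places = $x->copy()->bfround(-2);   # second digit right of the dot
my $four_sig   = $x->copy()->bround(4);     # four significant digits

# 2.5 is an exact tie, so the rounding mode decides the result.
my $tie  = Math::BigFloat->new('2.5');
my $even = $tie->copy()->bround(1, 'even');
my $odd  = $tie->copy()->bround(1, 'odd');

print "$two_places $four_sig $even $odd\n";
```

    Each call works on a copy() because the rounding methods modify the object they are called on.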

    The as_number() function returns a BigInt from a Math::BigFloat. It uses 'trunc' as rounding mode to make it equivalent to:

    1. $x = 2.5;
    2. $y = int($x) + 2;

    You can override this by passing the desired rounding mode as parameter to as_number():

    1. $x = Math::BigFloat->new(2.5);
    2. $y = $x->as_number('odd'); # $y = 3

    METHODS

    Math::BigFloat supports all methods that Math::BigInt supports, except it calculates non-integer results when possible. Please see Math::BigInt for a full description of each method. Below are just the most important differences:

    accuracy()

    1. $x->accuracy(5); # local for $x
    2. CLASS->accuracy(5); # global for all members of CLASS
    3. # Note: This also applies to new()!
    4. $A = $x->accuracy(); # read out accuracy that affects $x
    5. $A = CLASS->accuracy(); # read out global accuracy

    Set or get the global or local accuracy, aka how many significant digits the results have. If you set a global accuracy, then this also applies to new()!

    Warning! The accuracy sticks, e.g. once you created a number under the influence of CLASS->accuracy($A) , all results from math operations with that number will also be rounded.

    In most cases, you should probably round the results explicitly using one of round(), bround() or bfround() (all documented in Math::BigInt), or by passing the desired accuracy to the math operation as an additional parameter:

    1. my $x = Math::BigInt->new(30000);
    2. my $y = Math::BigInt->new(7);
    3. print scalar $x->copy()->bdiv($y, 2); # print 4300
    4. print scalar $x->copy()->bdiv($y)->bround(2); # print 4300

    precision()

    1. $x->precision(-2); # local for $x, round at the second
    2. # digit right of the dot
    3. $x->precision(2); # ditto, round at the second digit left
    4. # of the dot
    5. CLASS->precision(5); # Global for all members of CLASS
    6. # This also applies to new()!
    7. CLASS->precision(-5); # ditto
    8. $P = CLASS->precision(); # read out global precision
    9. $P = $x->precision(); # read out precision that affects $x

    Note: You probably want to use accuracy() instead. With accuracy() you set the number of digits each result should have; with precision() you set the place at which to round!
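
    The difference is easiest to see by applying both settings to copies of the same number. This is a small sketch with an arbitrary sample value; setting either property on an object rounds that object immediately.

```perl
use strict;
use warnings;
use Math::BigFloat;

my $x = Math::BigFloat->new('1234.5678');

my $by_accuracy = $x->copy();
$by_accuracy->accuracy(3);        # keep three significant digits

my $by_precision = $x->copy();
$by_precision->precision(-3);     # round at the third digit after the dot

print "accuracy(3):   $by_accuracy\n";
print "precision(-3): $by_precision\n";
```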

    bexp()

    1. $x->bexp($accuracy); # calculate e ** X

    Calculates the expression e ** $x where e is Euler's number.

    This method was added in v1.82 of Math::BigInt (April 2007).

    bnok()

    1. $x->bnok($y); # x over y (binomial coefficient n over k)

    Calculates the binomial coefficient n over k, also called the "choose" function. The result is equivalent to:

    ( n )      n!
    | - |  = -------
    ( k )    k!(n-k)!

    This method was added in v1.84 of Math::BigInt (April 2007).
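
    The formula can be checked against bnok() directly. The sketch below uses Math::BigInt objects, which provide the same method, and an arbitrary choice of n = 10 and k = 3; the variable names are illustrative.

```perl
use strict;
use warnings;
use Math::BigInt;

my ($n, $k) = (10, 3);

# Binomial coefficient computed by the library.
my $choose = Math::BigInt->new($n)->bnok($k);

# The same value from the factorial formula n! / (k! * (n-k)!).
my $denominator = Math::BigInt->new($k)->bfac()
                      ->bmul(Math::BigInt->new($n - $k)->bfac());
my $by_formula  = Math::BigInt->new($n)->bfac()->bdiv($denominator);

print "bnok: $choose, formula: $by_formula\n";
```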

    bpi()

    1. print Math::BigFloat->bpi(100), "\n";

    Calculate PI to N digits (including the 3 before the dot). The result is rounded according to the current rounding mode, which defaults to "even".

    This method was added in v1.87 of Math::BigInt (June 2007).
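
    As a brief illustration, the same call with different digit counts returns correspondingly longer results. The variable names are chosen for the example.

```perl
use strict;
use warnings;
use Math::BigFloat;

my $pi10 = Math::BigFloat->bpi(10);   # 10 digits, counting the leading 3
my $pi30 = Math::BigFloat->bpi(30);   # 30 digits

print "$pi10\n$pi30\n";
```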

    bcos()

    1. my $x = Math::BigFloat->new(1);
    2. print $x->bcos(100), "\n";

    Calculate the cosine of $x, modifying $x in place.

    This method was added in v1.87 of Math::BigInt (June 2007).

    bsin()

    1. my $x = Math::BigFloat->new(1);
    2. print $x->bsin(100), "\n";

    Calculate the sine of $x, modifying $x in place.

    This method was added in v1.87 of Math::BigInt (June 2007).

    batan2()

    1. my $y = Math::BigFloat->new(2);
    2. my $x = Math::BigFloat->new(3);
    3. print $y->batan2($x), "\n";

    Calculate the arctangent of $y divided by $x, modifying $y in place. See also batan().

    This method was added in v1.87 of Math::BigInt (June 2007).

    batan()

    1. my $x = Math::BigFloat->new(1);
    2. print $x->batan(100), "\n";

    Calculate the arctangent of $x, modifying $x in place. See also batan2().

    This method was added in v1.87 of Math::BigInt (June 2007).

    bmuladd()

    1. $x->bmuladd($y,$z);

    Multiply $x by $y, and then add $z to the result.

    This method was added in v1.87 of Math::BigInt (June 2007).
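
    A minimal sketch with arbitrary sample values shows that bmuladd(), like the other b-methods, changes the object it is called on:

```perl
use strict;
use warnings;
use Math::BigFloat;

my $x      = Math::BigFloat->new('1.5');
my $before = $x->copy();          # keep the original value for comparison

$x->bmuladd(2, '0.25');           # $x = 1.5 * 2 + 0.25

print "before: $before, after: $x\n";
```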

    Autocreating constants

    After use Math::BigFloat ':constant' all the floating point constants in the given scope are converted to Math::BigFloat. This conversion happens at compile time.

    In particular

    1. perl -MMath::BigFloat=:constant -e 'print 2E-100,"\n"'

    prints the value of 2E-100. Note that without conversion of constants the expression 2E-100 will be calculated as a normal floating point number.

    Please note that ':constant' does not affect integer constants, nor binary nor hexadecimal constants. Use bignum or Math::BigInt to get this to work.
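
    The same effect can be seen in a script. In this sketch the literals 0.1, 0.2 and 0.3 are all converted at compile time, so the comparison is carried out on Math::BigFloat objects rather than on native floating point numbers; the variable names are illustrative.

```perl
use strict;
use warnings;
use Math::BigFloat ':constant';

my $sum   = 0.1 + 0.2;              # both literals are Math::BigFloat objects
my $class = ref $sum;               # class of the result
my $exact = $sum == 0.3 ? 1 : 0;    # compared as decimal numbers

print "$class $sum $exact\n";
```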

    Math library

    Math with the numbers is done (by default) by a module called Math::BigInt::Calc. This is equivalent to saying:

    1. use Math::BigFloat lib => 'Calc';

    You can change this by using:

    1. use Math::BigFloat lib => 'GMP';

    Note: General purpose packages should not be explicit about the library to use; let the script author decide which is best.

    Note: The keyword 'lib' will warn when the requested library could not be loaded. To suppress the warning use 'try' instead:

    1. use Math::BigFloat try => 'GMP';

    If your script works with huge numbers and Calc is too slow for them, you can also force the loading of one of these libraries; if none of them can be used, the code will die:

    1. use Math::BigFloat only => 'GMP,Pari';

    The following would first try to find Math::BigInt::Foo, then Math::BigInt::Bar, and when this also fails, revert to Math::BigInt::Calc:

    1. use Math::BigFloat lib => 'Foo,Math::BigInt::Bar';

    See the respective low-level library documentation for further details.
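
    To find out which backend was actually loaded, the configuration hash returned by Math::BigInt->config() can be inspected. This sketch assumes that the hash has a 'lib' entry naming the low-level library, and it uses 'try' so that a missing GMP module does not produce a warning.

```perl
use strict;
use warnings;
use Math::BigFloat try => 'GMP';    # quietly falls back if GMP is missing

# Math::BigFloat passes the request on to Math::BigInt, so ask there.
my $lib = Math::BigInt->config()->{lib};

print "backend in use: $lib\n";
```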

    Please note that Math::BigFloat does not use the denoted library itself, but it merely passes the lib argument to Math::BigInt. So, instead of the need to do:

    1. use Math::BigInt lib => 'GMP';
    2. use Math::BigFloat;

    you can roll it all into one line:

    1. use Math::BigFloat lib => 'GMP';

    It is also possible to just require Math::BigFloat:

    1. require Math::BigFloat;

    This will automatically load the necessary things (like BigInt) when they are needed.

    See Math::BigInt for more details than you ever wanted to know about using a different low-level library.

    Using Math::BigInt::Lite

    For backwards compatibility reasons it is still possible to request a different storage class for use with Math::BigFloat:

    1. use Math::BigFloat with => 'Math::BigInt::Lite';

    However, this request is ignored, as the current code now uses the low-level math library for directly storing the number parts.

    EXPORTS

    Math::BigFloat exports nothing by default, but can export the bpi() method:

    1. use Math::BigFloat qw/bpi/;
    2. print bpi(10), "\n";

    BUGS

    Please see the file BUGS in the CPAN distribution Math::BigInt for known bugs.

    CAVEATS

    Do not try to be clever and insert operations between switching libraries:

    1. require Math::BigFloat;
    2. my $matter = Math::BigFloat->bone() + 4; # load BigInt and Calc
    3. Math::BigFloat->import( lib => 'Pari' ); # load Pari, too
    4. my $anti_matter = Math::BigFloat->bone()+4; # now use Pari

    This will create objects with numbers stored in two different backend libraries, and VERY BAD THINGS will happen when you use these together:

    1. my $flash_and_bang = $matter + $anti_matter; # Don't do this!

    • stringify, bstr()

      Both stringify and bstr() now drop the leading '+'. The old code would return '+1.23', the new returns '1.23'. See the documentation in Math::BigInt for reasoning and details.

    • bdiv

      The following will probably not print what you expect:

      1. print $c->bdiv(123.456),"\n";

      It prints both quotient and remainder since print works in list context. Also, bdiv() will modify $c, so be careful. You probably want to use

      1. print $c / 123.456,"\n";
      2. print scalar $c->bdiv(123.456),"\n"; # or if you want to modify $c

      instead.

    • brsft

      The following will probably not print what you expect:

      1. my $c = Math::BigFloat->new('3.14159');
      2. print $c->brsft(3,10),"\n"; # prints 0.00314153.1415

      It prints both quotient and remainder, since print calls brsft() in list context. Also, $c->brsft() will modify $c, so be careful. You probably want to use

      1. print scalar $c->copy()->brsft(3,10),"\n";
      2. # or if you really want to modify $c
      3. print scalar $c->brsft(3,10),"\n";

      instead.

    • Modifying and =

      Beware of:

      1. $x = Math::BigFloat->new(5);
      2. $y = $x;

      It will not do what you think, i.e. make a copy of $x. Instead it just makes a second reference to the same object and stores it in $y. Thus anything that modifies $x will modify $y (except overloaded math operators), and vice versa. See Math::BigInt for details and how to avoid that.

    • bpow

      bpow() now modifies the first argument, unlike the old code which left it alone and only returned the result. This is to be consistent with badd() etc. The first will modify $x, the second one won't:

      1. print bpow($x,$i),"\n"; # modify $x
      2. print $x->bpow($i),"\n"; # ditto
      3. print $x ** $i,"\n"; # leave $x alone

    • precision() vs. accuracy()

      A common pitfall is to use precision() when you want to round a result to a certain number of digits:

      1. use Math::BigFloat;
      2. Math::BigFloat->precision(4); # does not do what you
      3. # think it does
      4. my $x = Math::BigFloat->new(12345); # rounds $x to "12000"!
      5. print "$x\n"; # print "12000"
      6. my $y = Math::BigFloat->new(3); # rounds $y to "0"!
      7. print "$y\n"; # print "0"
      8. $z = $x / $y; # 12000 / 0 => NaN!
      9. print "$z\n";
      10. print $z->precision(),"\n"; # 4

      Replacing precision() with accuracy() is probably not what you want, either:

      1. use Math::BigFloat;
      2. Math::BigFloat->accuracy(4); # enables global rounding:
      3. my $x = Math::BigFloat->new(123456); # rounded immediately
      4. # to "123500"
      5. print "$x\n"; # print "123500"
      6. my $y = Math::BigFloat->new(3); # rounded to "3"
      7. print "$y\n"; # print "3"
      8. print $z = $x->copy()->bdiv($y),"\n"; # 41170
      9. print $z->accuracy(),"\n"; # 4

      What you want to use instead is:

      1. use Math::BigFloat;
      2. my $x = Math::BigFloat->new(123456); # no rounding
      3. print "$x\n"; # print "123456"
      4. my $y = Math::BigFloat->new(3); # no rounding
      5. print "$y\n"; # print "3"
      6. print $z = $x->copy()->bdiv($y,4),"\n"; # 41150
      7. print $z->accuracy(),"\n"; # undef

      In addition to computing what you expected, the last example also does not "taint" the result with an accuracy or precision setting, which would influence any further operation.
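
    Two of the caveats above, plain assignment versus copy() and bpow() versus the ** operator, can be seen together in a short sketch. The variable names are illustrative.

```perl
use strict;
use warnings;
use Math::BigFloat;

my $x     = Math::BigFloat->new(5);
my $alias = $x;             # second reference to the same object
my $copy  = $x->copy();     # independent object

$x->badd(1);                # modifies the shared object: $x and $alias are 6

my $power    = $x ** 2;     # overloaded operator leaves $x alone
my $after_op = $x->copy();  # snapshot taken after the operator: still 6

$x->bpow(2);                # method call modifies $x, and with it $alias

print "alias=$alias copy=$copy power=$power after_op=$after_op\n";
```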

    SEE ALSO

    Math::BigInt, Math::BigRat and Math::Big as well as Math::BigInt::Pari and Math::BigInt::GMP.

    The pragmas bignum, bigint and bigrat might also be of interest because they solve the autoupgrading/downgrading issue, at least partly.

    The package at http://search.cpan.org/~tels/Math-BigInt contains more documentation including a full version history, testcases, empty subclass files and benchmarks.

    LICENSE

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

    AUTHORS

    Mark Biggar, overloaded interface by Ilya Zakharevich. Completely rewritten by Tels http://bloodgate.com in 2001 - 2006, and still at it in 2007.

     
    Math::BigInt

    NAME

    Math::BigInt - Arbitrary size integer/float math package

    SYNOPSIS

    1. use Math::BigInt;
    2. # or make it faster with huge numbers: install (optional)
    3. # Math::BigInt::GMP and always use (it will fall back to
    4. # pure Perl if the GMP library is not installed):
    5. # (See also the MATH LIBRARY section!)
    6. # will warn if Math::BigInt::GMP cannot be found
    7. use Math::BigInt lib => 'GMP';
    8. # to suppress the warning use this:
    9. # use Math::BigInt try => 'GMP';
    10. # dies if GMP cannot be loaded:
    11. # use Math::BigInt only => 'GMP';
    12. my $str = '1234567890';
    13. my @values = (64,74,18);
    14. my $n = 1; my $sign = '-';
    15. # Number creation
    16. my $x = Math::BigInt->new($str); # defaults to 0
    17. my $y = $x->copy(); # make a true copy
    18. my $nan = Math::BigInt->bnan(); # create a NotANumber
    19. my $zero = Math::BigInt->bzero(); # create a +0
    20. my $inf = Math::BigInt->binf(); # create a +inf
    21. my $inf = Math::BigInt->binf('-'); # create a -inf
    22. my $one = Math::BigInt->bone(); # create a +1
    23. my $mone = Math::BigInt->bone('-'); # create a -1
    24. my $pi = Math::BigInt->bpi(); # returns '3'
    25. # see Math::BigFloat::bpi()
    26. $h = Math::BigInt->new('0x123'); # from hexadecimal
    27. $b = Math::BigInt->new('0b101'); # from binary
    28. $o = Math::BigInt->from_oct('0101'); # from octal
    29. # Testing (don't modify their arguments)
    30. # (return true if the condition is met, otherwise false)
    31. $x->is_zero(); # if $x is +0
    32. $x->is_nan(); # if $x is NaN
    33. $x->is_one(); # if $x is +1
    34. $x->is_one('-'); # if $x is -1
    35. $x->is_odd(); # if $x is odd
    36. $x->is_even(); # if $x is even
    37. $x->is_pos(); # if $x > 0
    38. $x->is_neg(); # if $x < 0
    39. $x->is_inf($sign); # if $x is +inf, or -inf (sign is default '+')
    40. $x->is_int(); # if $x is an integer (not a float)
    41. # comparing and digit/sign extraction
    42. $x->bcmp($y); # compare numbers (undef,<0,=0,>0)
    43. $x->bacmp($y); # compare absolutely (undef,<0,=0,>0)
    44. $x->sign(); # return the sign, either +,- or NaN
    45. $x->digit($n); # return the nth digit, counting from right
    46. $x->digit(-$n); # return the nth digit, counting from left
    47. # The following all modify their first argument. If you want to
    48. # preserve $x, use $z = $x->copy()->bXXX($y); See under CAVEATS for
    49. # why this is necessary when mixing $a = $b assignments with
    50. # non-overloaded math.
    51. $x->bzero(); # set $x to 0
    52. $x->bnan(); # set $x to NaN
    53. $x->bone(); # set $x to +1
    54. $x->bone('-'); # set $x to -1
    55. $x->binf(); # set $x to inf
    56. $x->binf('-'); # set $x to -inf
    57. $x->bneg(); # negation
    58. $x->babs(); # absolute value
    59. $x->bsgn(); # sign function (-1, 0, 1, or NaN)
    60. $x->bnorm(); # normalize (no-op in BigInt)
    61. $x->bnot(); # two's complement (bit wise not)
    62. $x->binc(); # increment $x by 1
    63. $x->bdec(); # decrement $x by 1
    64. $x->badd($y); # addition (add $y to $x)
    65. $x->bsub($y); # subtraction (subtract $y from $x)
    66. $x->bmul($y); # multiplication (multiply $x by $y)
    67. $x->bdiv($y); # divide, set $x to quotient
    68. # return (quo,rem) or quo if scalar
    69. $x->bmuladd($y,$z); # $x = $x * $y + $z
    70. $x->bmod($y); # modulus (x % y)
    71. $x->bmodpow($y,$mod); # modular exponentiation (($x ** $y) % $mod)
    72. $x->bmodinv($mod); # modular multiplicative inverse
    73. $x->bpow($y); # power of arguments (x ** y)
    74. $x->blsft($y); # left shift in base 2
    75. $x->brsft($y); # right shift in base 2
    76. # returns (quo,rem) or quo if in
    77. # scalar context
    78. $x->blsft($y,$n); # left shift by $y places in base $n
    79. $x->brsft($y,$n); # right shift by $y places in base $n
    80. # returns (quo,rem) or quo if in
    81. # scalar context
    82. $x->band($y); # bitwise and
    83. $x->bior($y); # bitwise inclusive or
    84. $x->bxor($y); # bitwise exclusive or
    85. $x->bnot(); # bitwise not (two's complement)
    86. $x->bsqrt(); # calculate square-root
    87. $x->broot($y); # $y'th root of $x (e.g. $y == 3 => cubic root)
    88. $x->bfac(); # factorial of $x (1*2*3*4*..$x)
    89. $x->bnok($y); # x over y (binomial coefficient n over k)
    90. $x->blog(); # logarithm of $x to base e (Euler's number)
    91. $x->blog($base); # logarithm of $x to base $base (e.g. 2)
    92. $x->bexp(); # calculate e ** $x where e is Euler's number
    93. $x->round($A,$P,$mode); # round to accuracy or precision using
    94. # mode $mode
    95. $x->bround($n); # accuracy: preserve $n digits
    96. $x->bfround($n); # $n > 0: round $nth digits,
    97. # $n < 0: round to the $nth digit after the
    98. # dot, no-op for BigInts
    99. # The following do not modify their arguments in BigInt (are no-ops),
    100. # but do so in BigFloat:
    101. $x->bfloor(); # return integer less than or equal to $x
    102. $x->bceil(); # return integer greater than or equal to $x
    103. # The following do not modify their arguments:
    104. # greatest common divisor (no OO style)
    105. my $gcd = Math::BigInt::bgcd(@values);
    106. # lowest common multiple (no OO style)
    107. my $lcm = Math::BigInt::blcm(@values);
    108. $x->length(); # return number of digits in number
    109. ($xl,$f) = $x->length(); # length of number and length of fraction
    110. # part, latter is always 0 digits long
    111. # for BigInts
    112. $x->exponent(); # return exponent as BigInt
    113. $x->mantissa(); # return (signed) mantissa as BigInt
    114. $x->parts(); # return (mantissa,exponent) as BigInt
    115. $x->copy(); # make a true copy of $x (unlike $y = $x;)
    116. $x->as_int(); # return as BigInt (in BigInt: same as copy())
    117. $x->numify(); # return as scalar (might overflow!)
    118. # conversion to string (do not modify their argument)
    119. $x->bstr(); # normalized string (e.g. '3')
    120. $x->bsstr(); # norm. string in scientific notation (e.g. '3E0')
    121. $x->as_hex(); # as signed hexadecimal string with prefixed 0x
    122. $x->as_bin(); # as signed binary string with prefixed 0b
    123. $x->as_oct(); # as signed octal string with prefixed 0
    124. # precision and accuracy (see section about rounding for more)
    125. $x->precision(); # return P of $x (or global, if P of $x undef)
    126. $x->precision($n); # set P of $x to $n
    127. $x->accuracy(); # return A of $x (or global, if A of $x undef)
    128. $x->accuracy($n); # set A $x to $n
    129. # Global methods
    130. Math::BigInt->precision(); # get/set global P for all BigInt objects
    131. Math::BigInt->accuracy(); # get/set global A for all BigInt objects
    132. Math::BigInt->round_mode(); # get/set global round mode, one of
    133. # 'even', 'odd', '+inf', '-inf', 'zero',
    134. # 'trunc' or 'common'
    135. Math::BigInt->config(); # return hash containing configuration

    DESCRIPTION

    All operators (including basic math operations) are overloaded if you declare your big integers as

        $i = Math::BigInt->new('123_456_789_123_456_789');

    Operations with overloaded operators preserve their arguments, which is exactly what you expect.

    • Input

      Input values to these routines may be any string that looks like a number and results in an integer, including hexadecimal and binary numbers.

      Scalars holding numbers may also be passed, but note that such numbers may already have lost precision due to the conversion to float. Quote your input if you want BigInt to see all the digits:

          $x = Math::BigInt->new(12345678901234567890); # bad
          $x = Math::BigInt->new('12345678901234567890'); # good

      You can include one underscore between any two digits.

      Values in scientific notation are accepted as long as they evaluate to an integer, so 1.01E2 or even 1000E-2 are fine. Non-integer values result in NaN.

      Hexadecimal (prefixed with "0x") and binary numbers (prefixed with "0b") are accepted, too. Please note that octal numbers are not recognized by new(), so the following will print "123":

      1. perl -MMath::BigInt -le 'print Math::BigInt->new("0123")'

      To convert an octal number, use from_oct():

      1. perl -MMath::BigInt -le 'print Math::BigInt->from_oct("0123")'

      Currently, Math::BigInt::new() defaults to 0, while Math::BigInt::new('') results in 'NaN'. This might change in the future, so always use the following explicit forms to get a zero or NaN:

      1. $zero = Math::BigInt->bzero();
      2. $nan = Math::BigInt->bnan();

      bnorm() on a BigInt object is now effectively a no-op, since the numbers are always stored in normalized form. If passed a string, bnorm() creates a BigInt object from the input.

    • Output

      Output values are BigInt objects (normalized), except for the methods which return a string (see SYNOPSIS).

      Some routines (is_odd() , is_even() , is_zero() , is_one() , is_nan() , etc.) return true or false, while others (bcmp() , bacmp() ) return either undef (if NaN is involved), <0, 0 or >0 and are suited for sort.
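    The input rules above can be tried out with a short script. This is a minimal sketch that assumes no upgrade class is configured, so non-integer input ends up as NaN; the variable names are only illustrative.

```perl
use strict;
use warnings;
use Math::BigInt;

# Quote large literals so that Perl does not convert them to a float first.
my $big = Math::BigInt->new('12345678901234567890');

my $hex = Math::BigInt->new('0x1f');        # hexadecimal prefix
my $bin = Math::BigInt->new('0b101');       # binary prefix
my $sep = Math::BigInt->new('1_000_000');   # single underscores between digits
my $sci = Math::BigInt->new('1.01E2');      # scientific notation, integer value
my $bad = Math::BigInt->new('1.5');         # non-integer input

# The objects stringify through overloading.
print join(' ', $big, $hex, $bin, $sep, $sci), "\n";
print "1.5 became NaN\n" if $bad->is_nan();
```

    Testing with is_nan() is more robust than comparing the string output, because NaN never compares equal to anything.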

    METHODS

    Each of the methods below (except config(), accuracy() and precision()) accepts three additional parameters. These arguments $A , $P and $R are accuracy , precision and round_mode . Please see the section about ACCURACY and PRECISION for more information.

    config()

    1. use Data::Dumper;
    2. print Dumper ( Math::BigInt->config() );
    3. print Math::BigInt->config()->{lib},"\n";

    Returns a hash containing the configuration, e.g. the version number, lib loaded etc. The following hash keys are currently filled in with the appropriate information.

        key           Description
                      Example
        ============================================================
        lib           Name of the low-level math library
                      Math::BigInt::Calc
        lib_version   Version of low-level math library (see 'lib')
                      0.30
        class         The class name of config() you just called
                      Math::BigInt
        upgrade       To which class math operations might be upgraded
                      Math::BigFloat
        downgrade     To which class math operations might be downgraded
                      undef
        precision     Global precision
                      undef
        accuracy      Global accuracy
                      undef
        round_mode    Global round mode
                      even
        version       version number of the class you used
                      1.61
        div_scale     Fallback accuracy for div
                      40
        trap_nan      If true, traps creation of NaN via croak()
                      1
        trap_inf      If true, traps creation of +inf/-inf via croak()
                      1

    The following values can be set by passing config() a reference to a hash:

    1. trap_inf trap_nan
    2. upgrade downgrade precision accuracy round_mode div_scale

    Example:

    1. $new_cfg = Math::BigInt->config(
    2. { trap_inf => 1, precision => 5 }
    3. );
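    Reading individual settings from the returned hash reference might look like the following sketch. The selection of keys is arbitrary, and the values printed depend on the installed version and math library.

```perl
use strict;
use warnings;
use Math::BigInt;

# config() without arguments returns a reference to a hash of settings.
my $cfg = Math::BigInt->config();

for my $key (qw(class lib lib_version round_mode div_scale)) {
    # Unset entries (such as a global accuracy) are undef.
    my $value = defined $cfg->{$key} ? $cfg->{$key} : 'undef';
    print "$key = $value\n";
}
```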

    accuracy()

    1. $x->accuracy(5); # local for $x
    2. CLASS->accuracy(5); # global for all members of CLASS
    3. # Note: This also applies to new()!
    4. $A = $x->accuracy(); # read out accuracy that affects $x
    5. $A = CLASS->accuracy(); # read out global accuracy

    Set or get the global or local accuracy, aka how many significant digits the results have. If you set a global accuracy, then this also applies to new()!

    Warning! The accuracy sticks, i.e. once you have created a number under the influence of CLASS->accuracy($A) , all results from math operations with that number will also be rounded.

    In most cases, you should probably round the results explicitly using one of round(), bround() or bfround() or by passing the desired accuracy to the math operation as additional parameter:

    1. my $x = Math::BigInt->new(30000);
    2. my $y = Math::BigInt->new(7);
    3. print scalar $x->copy()->bdiv($y, 2); # print 4300
    4. print scalar $x->copy()->bdiv($y)->bround(2); # print 4300

    Please see the section about ACCURACY and PRECISION for further details.

    Value must be greater than zero. Pass an undef value to disable it:

    1. $x->accuracy(undef);
    2. Math::BigInt->accuracy(undef);

    Returns the current accuracy. For $x->accuracy() it will return either the local accuracy, or if not defined, the global. This means the return value represents the accuracy that will be in effect for $x:

    1. $y = Math::BigInt->new(1234567); # unrounded
    2. print Math::BigInt->accuracy(4),"\n"; # set 4, print 4
    3. $x = Math::BigInt->new(123456); # $x will be automatic-
    4. # ally rounded!
    5. print "$x $y\n"; # '123500 1234567'
    6. print $x->accuracy(),"\n"; # will be 4
    7. print $y->accuracy(),"\n"; # also 4, since global is 4
    8. print Math::BigInt->accuracy(5),"\n"; # set to 5, print 5
    9. print $x->accuracy(),"\n"; # still 4
    10. print $y->accuracy(),"\n"; # 5, since global is 5

    Note: Works also for subclasses like Math::BigFloat. Each class has its own globals separated from Math::BigInt, but it is possible to subclass Math::BigInt and make the globals of the subclass aliases to the ones from Math::BigInt.
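    The difference between a local accuracy and an accuracy passed to a single call can be sketched as follows. The example deliberately leaves the global accuracy untouched; the values follow the rounding rules described in this section with the default round mode.

```perl
use strict;
use warnings;
use Math::BigInt;

# Local accuracy: $x is rounded at once and keeps A = 3 afterwards.
my $x = Math::BigInt->new(123456);
$x->accuracy(3);
print "$x\n";                   # 123000
print $x->accuracy(), "\n";     # 3

# Results of operations on $x inherit its accuracy.
my $sum = $x + 1;
print "$sum\n";                 # 123000 (123001 rounded to 3 digits)

# Accuracy handed to one operation or one constructor call only.
my $quo = Math::BigInt->new(30000)->bdiv(7, 2);   # scalar context: quotient
my $new = Math::BigInt->new(1234, 2);
print "$quo $new\n";            # 4300 1200
```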

    precision()

    1. $x->precision(-2); # local for $x, round at the second
    2. # digit right of the dot
    3. $x->precision(2); # ditto, round at the second digit left
    4. # of the dot
    5. CLASS->precision(5); # Global for all members of CLASS
    6. # This also applies to new()!
    7. CLASS->precision(-5); # ditto
    8. $P = CLASS->precision(); # read out global precision
    9. $P = $x->precision(); # read out precision that affects $x

    Note: You probably want to use accuracy() instead. With accuracy() you set the number of digits each result should have, with precision() you set the place where to round!

    precision() sets or gets the global or local precision, aka at which digit before or after the dot to round all results. A set global precision also applies to all newly created numbers!

    In Math::BigInt, passing a negative precision has no effect since no numbers have digits after the dot. In Math::BigFloat, it will round all results to P digits after the dot.

    Please see the section about ACCURACY and PRECISION for further details.

    Pass an undef value to disable it:

    1. $x->precision(undef);
    2. Math::BigInt->precision(undef);

    Returns the current precision. For $x->precision() it will return either the local precision of $x, or if not defined, the global. This means the return value represents the precision that will be in effect for $x:

    1. $y = Math::BigInt->new(1234567); # unrounded
    2. print Math::BigInt->precision(4),"\n"; # set 4, print 4
    3. $x = Math::BigInt->new(123456); # will be automatically rounded
    4. print $x; # print "120000"!

    Note: Works also for subclasses like Math::BigFloat. Each class has its own globals separated from Math::BigInt, but it is possible to subclass Math::BigInt and make the globals of the subclass aliases to the ones from Math::BigInt.

    brsft()

    1. $x->brsft($y,$n);

    Shifts $x right by $y in base $n. The default base is 2; bases 2 and 10 are the most commonly used, but others work, too.

    Right shifting usually amounts to dividing $x by $n ** $y and truncating the result:

    1. $x = Math::BigInt->new(10);
    2. $x->brsft(1); # same as $x >> 1: 5
    3. $x = Math::BigInt->new(1234);
    4. $x->brsft(2,10); # result 12

    There is one exception, and that is base 2 with negative $x:

    1. $x = Math::BigInt->new(-5);
    2. print $x->brsft(1);

    This will print -3, not -2 (as it would if you divide -5 by 2 and truncate the result).
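    The shifting rules can be checked with a few calls. This sketch also uses blsft(), the left-shift counterpart described further below; all values are taken from the rules stated above.

```perl
use strict;
use warnings;
use Math::BigInt;

my $dec  = Math::BigInt->new(1234)->brsft(2, 10);  # 1234 / 10**2, truncated
my $bin  = Math::BigInt->new(10)->brsft(1);        # same as 10 >> 1
my $back = Math::BigInt->new(12)->blsft(2, 10);    # 12 * 10**2
my $neg  = Math::BigInt->new(-5)->brsft(1);        # the base-2 exception

print "$dec $bin $back $neg\n";   # 12 5 1200 -3
```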

    new()

    1. $x = Math::BigInt->new($str,$A,$P,$R);

    Creates a new BigInt object from a scalar or another BigInt object. The input is accepted as decimal, hex (with leading '0x') or binary (with leading '0b').

    See Input for more info on accepted input formats.

    from_oct()

    1. $x = Math::BigInt->from_oct("0775"); # input is octal

    Interpret the input as an octal string and return the corresponding value. A "0" (zero) prefix is optional. A single underscore character may be placed right after the prefix, if present, or between any two digits. If the input is invalid, a NaN is returned.

    from_hex()

    1. $x = Math::BigInt->from_hex("0xcafe"); # input is hexadecimal

    Interpret input as a hexadecimal string. A "0x" or "x" prefix is optional. A single underscore character may be placed right after the prefix, if present, or between any two digits. If the input is invalid, a NaN is returned.

    from_bin()

    1. $x = Math::BigInt->from_bin("0b10011"); # input is binary

    Interpret the input as a binary string. A "0b" or "b" prefix is optional. A single underscore character may be placed right after the prefix, if present, or between any two digits. If the input is invalid, a NaN is returned.
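    A short sketch of the three base-specific constructors, reusing the inputs from the synopsis lines above. The invalid binary string is an illustrative input chosen to show the NaN result.

```perl
use strict;
use warnings;
use Math::BigInt;

my $oct = Math::BigInt->from_oct('0775');      # 7*64 + 7*8 + 5
my $hex = Math::BigInt->from_hex('0xcafe');
my $bin = Math::BigInt->from_bin('0b10011');   # 16 + 2 + 1
my $bad = Math::BigInt->from_bin('0b102');     # '2' is not a binary digit

print "$oct $hex $bin\n";   # 509 51966 19
print "invalid input gives NaN\n" if $bad->is_nan();
```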

    bnan()

    1. $x = Math::BigInt->bnan();

    Creates a new BigInt object representing NaN (Not A Number). If used on an object, it will set it to NaN:

    1. $x->bnan();

    bzero()

    1. $x = Math::BigInt->bzero();

    Creates a new BigInt object representing zero. If used on an object, it will set it to zero:

    1. $x->bzero();

    binf()

    1. $x = Math::BigInt->binf($sign);

    Creates a new BigInt object representing infinity. The optional argument is either '-' or '+', indicating whether you want infinity or minus infinity. If used on an object, it will set it to infinity:

    1. $x->binf();
    2. $x->binf('-');

    bone()

        $x = Math::BigInt->bone($sign);

    Creates a new BigInt object representing one. The optional argument is either '-' or '+', indicating whether you want one or minus one. If used on an object, it will set it to one:

    1. $x->bone(); # +1
    2. $x->bone('-'); # -1

    is_one()/is_zero()/is_nan()/is_inf()

    1. $x->is_zero(); # true if arg is +0
    2. $x->is_nan(); # true if arg is NaN
    3. $x->is_one(); # true if arg is +1
    4. $x->is_one('-'); # true if arg is -1
    5. $x->is_inf(); # true if +inf
    6. $x->is_inf('-'); # true if -inf (sign is default '+')

    These methods all test the BigInt for being one specific value and return true or false depending on the input. These are faster than doing something like:

    1. if ($x == 0)
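    The special-value constructors and their matching tests fit together as in this sketch (variable names are illustrative):

```perl
use strict;
use warnings;
use Math::BigInt;

my $zero      = Math::BigInt->bzero();
my $minus_one = Math::BigInt->bone('-');
my $neg_inf   = Math::BigInt->binf('-');
my $nan       = Math::BigInt->bnan();

print "zero\n"      if $zero->is_zero();
print "minus one\n" if $minus_one->is_one('-');
print "-inf\n"      if $neg_inf->is_inf('-');
print "NaN\n"       if $nan->is_nan();

# Called on an existing object, the same methods overwrite its value.
my $x = Math::BigInt->new(42);
$x->bone();
print "$x\n";   # 1
```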

    is_pos()/is_neg()/is_positive()/is_negative()

    1. $x->is_pos(); # true if > 0
    2. $x->is_neg(); # true if < 0

    The methods return true if the argument is positive or negative, respectively. NaN is neither positive nor negative, while +inf counts as positive, and -inf is negative. A zero is neither positive nor negative.

    These methods are only testing the sign, and not the value.

    is_positive() and is_negative() are aliases to is_pos() and is_neg() , respectively. is_positive() and is_negative() were introduced in v1.36, while is_pos() and is_neg() were only introduced in v1.68.

    is_odd()/is_even()/is_int()

    1. $x->is_odd(); # true if odd, false for even
    2. $x->is_even(); # true if even, false for odd
    3. $x->is_int(); # true if $x is an integer

    These methods return true when the argument satisfies the condition. NaN , +inf and -inf are not integers and are neither odd nor even.

    In BigInt, all numbers except NaN , +inf and -inf are integers.

    bcmp()

    1. $x->bcmp($y);

    Compares $x with $y and takes the sign into account. Returns -1, 0, 1 or undef.

    bacmp()

    1. $x->bacmp($y);

    Compares $x with $y while ignoring their sign. Returns -1, 0, 1 or undef.
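    Both comparison methods can serve as sort callbacks, as in this sketch. It assumes the list contains no NaN, since the undef return value would not give a usable ordering.

```perl
use strict;
use warnings;
use Math::BigInt;

my @nums = map { Math::BigInt->new($_) } qw(10 -200 3 -1);

# bcmp() takes the sign into account, bacmp() compares magnitudes only.
my @by_value     = sort { $a->bcmp($b) }  @nums;
my @by_magnitude = sort { $a->bacmp($b) } @nums;

print "@by_value\n";       # -200 -1 3 10
print "@by_magnitude\n";   # -1 3 10 -200
```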

    sign()

    1. $x->sign();

    Return the sign of $x, meaning either + , - , -inf , +inf or NaN.

    If you want $x to have a certain sign, use one of the following methods:

    1. $x->babs(); # '+'
    2. $x->babs()->bneg(); # '-'
    3. $x->bnan(); # 'NaN'
    4. $x->binf(); # '+inf'
    5. $x->binf('-'); # '-inf'

    digit()

    1. $x->digit($n); # return the nth digit, counting from right

    If $n is negative, returns the digit counting from left.

    bneg()

    1. $x->bneg();

    Negate the number, e.g. change the sign between '+' and '-', or between '+inf' and '-inf', respectively. Does nothing for NaN or zero.

    babs()

    1. $x->babs();

    Set the number to its absolute value, e.g. change the sign from '-' to '+' and from '-inf' to '+inf', respectively. Does nothing for NaN or positive numbers.

    bsgn()

    1. $x->bsgn();

    Signum function. Set the number to -1, 0, or 1, depending on whether the number is negative, zero, or positive, respectively. Does not modify NaNs.
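    The sign-related methods and digit() can be sketched together. Because bneg(), babs() and bsgn() modify their object, the example works on copies; digit() is assumed to count from zero, so digit(0) is the rightmost digit.

```perl
use strict;
use warnings;
use Math::BigInt;

my $x = Math::BigInt->new(-1234);

my $sign = $x->sign();            # '-'
my $abs  = $x->copy()->babs();    # 1234
my $neg  = $x->copy()->bneg();    # 1234
my $sgn  = $x->copy()->bsgn();    # -1

my $last  = $x->digit(0);         # rightmost digit: 4
my $first = $x->digit(-1);        # leftmost digit: 1

print "$sign $abs $neg $sgn $last $first\n";
print "$x\n";                     # still -1234, only the copies changed
```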

    bnorm()

    1. $x->bnorm(); # normalize (no-op)

    bnot()

    1. $x->bnot();

    Two's complement (bitwise not). This is equivalent to

    1. $x->binc()->bneg();

    but faster.

    binc()

    1. $x->binc(); # increment x by 1

    bdec()

    1. $x->bdec(); # decrement x by 1

    badd()

    1. $x->badd($y); # addition (add $y to $x)

    bsub()

    1. $x->bsub($y); # subtraction (subtract $y from $x)

    bmul()

    1. $x->bmul($y); # multiplication (multiply $x by $y)

    bmuladd()

    1. $x->bmuladd($y,$z);

    Multiply $x by $y, and then add $z to the result.

    This method was added in v1.87 of Math::BigInt (June 2007).

    bdiv()

    1. $x->bdiv($y); # divide, set $x to quotient
    2. # return (quo,rem) or quo if scalar
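    Since the arithmetic methods modify the object they are called on, a common pattern is to work on a copy. The following sketch uses small positive operands so that the results are easy to verify by hand; it also uses bmod(), which is described next.

```perl
use strict;
use warnings;
use Math::BigInt;

my $x = Math::BigInt->new(17);

my $sum  = $x->copy()->badd(5);         # 17 + 5
my $diff = $x->copy()->bsub(5);         # 17 - 5
my $prod = $x->copy()->bmul(5);         # 17 * 5
my $fma  = $x->copy()->bmuladd(5, 3);   # 17 * 5 + 3
my $mod  = $x->copy()->bmod(5);         # 17 % 5

# In list context bdiv() returns quotient and remainder.
my ($quo, $rem) = $x->copy()->bdiv(5);

print "$sum $diff $prod $fma $mod $quo $rem\n";   # 22 12 85 88 2 3 2

# Without copy() the object itself changes.
$x->binc();
print "$x\n";   # 18
```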

    bmod()

    1. $x->bmod($y); # modulus (x % y)

    bmodinv()

    1. $x->bmodinv($mod); # modular multiplicative inverse

    Returns the multiplicative inverse of $x modulo $mod . If

    1. $y = $x -> copy() -> bmodinv($mod)

    then $y is the number closest to zero, and with the same sign as $mod , satisfying

    1. ($x * $y) % $mod = 1 % $mod

    If $x and $mod are non-zero, they must be relatively prime, i.e., bgcd($x, $mod)==1 . 'NaN' is returned when no modular multiplicative inverse exists.

    bmodpow()

    1. $num->bmodpow($exp,$mod); # modular exponentiation
    2. # ($num**$exp % $mod)

    Returns the value of $num taken to the power $exp in the modulus $mod using binary exponentiation. bmodpow is far superior to writing

    1. $num ** $exp % $mod

    because it is much faster - it reduces internal variables into the modulus whenever possible, so it operates on smaller numbers.

    bmodpow also supports negative exponents.

    1. bmodpow($num, -1, $mod)

    is exactly equivalent to

    1. bmodinv($num, $mod)
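    A small worked example of both modular methods. The operands are illustrative: 3 * 5 = 15 leaves remainder 1 modulo 7, so 5 is the inverse of 3, while 2 has no inverse modulo 4 because the two share the factor 2.

```perl
use strict;
use warnings;
use Math::BigInt;

my $inv  = Math::BigInt->new(3)->bmodinv(7);        # 3 * 5 % 7 == 1
my $none = Math::BigInt->new(2)->bmodinv(4);        # gcd(2, 4) != 1
my $pow  = Math::BigInt->new(4)->bmodpow(13, 497);  # 4**13 % 497
my $neg  = Math::BigInt->new(3)->bmodpow(-1, 7);    # same as bmodinv(7)

print "$inv $pow $neg\n";   # 5 445 5
print "2 has no inverse modulo 4\n" if $none->is_nan();
```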

    bpow()

    1. $x->bpow($y); # power of arguments (x ** y)

    blog()

    1. $x->blog($base, $accuracy); # logarithm of x to the base $base

    If $base is not defined, Euler's number (e) is used:

    1. print $x->blog(undef, 100); # log(x) to 100 digits

    bexp()

    1. $x->bexp($accuracy); # calculate e ** X

    Calculates the expression e ** $x where e is Euler's number.

    This method was added in v1.82 of Math::BigInt (April 2007).

    See also blog().

    bnok()

    1. $x->bnok($y); # x over y (binomial coefficient n over k)

    Calculates the binomial coefficient n over k, also called the "choose" function. The result is equivalent to:

        ( n )      n!
        | - | = --------
        ( k )   k!(n-k)!

    This method was added in v1.84 of Math::BigInt (April 2007).
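    The formula can be cross-checked against bfac(), which is described further below. The values 10 and 3 are arbitrary.

```perl
use strict;
use warnings;
use Math::BigInt;

my $n = Math::BigInt->new(10);
my $k = Math::BigInt->new(3);

# Direct computation of "10 choose 3".
my $choose = $n->copy()->bnok($k);

# The same value from the factorial formula n! / (k! * (n-k)!).
my $by_hand = $n->copy()->bfac()
            / ( $k->copy()->bfac() * ($n - $k)->bfac() );

print "$choose $by_hand\n";   # 120 120
```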

    bpi()

    1. print Math::BigInt->bpi(100), "\n"; # 3

    Returns PI truncated to an integer, with the argument being ignored. This means under BigInt this always returns 3 .

    If upgrading is in effect, returns PI, rounded to N digits with the current rounding mode:

        use Math::BigFloat;
        use Math::BigInt upgrade => 'Math::BigFloat';
        print Math::BigInt->bpi(3), "\n"; # 3.14
        print Math::BigInt->bpi(100), "\n"; # 3.1415....

    This method was added in v1.87 of Math::BigInt (June 2007).

    bcos()

    1. my $x = Math::BigInt->new(1);
    2. print $x->bcos(100), "\n";

    Calculate the cosine of $x, modifying $x in place.

    In BigInt, unless upgrading is in effect, the result is truncated to an integer.

    This method was added in v1.87 of Math::BigInt (June 2007).

    bsin()

    1. my $x = Math::BigInt->new(1);
    2. print $x->bsin(100), "\n";

    Calculate the sine of $x, modifying $x in place.

    In BigInt, unless upgrading is in effect, the result is truncated to an integer.

    This method was added in v1.87 of Math::BigInt (June 2007).

    batan2()

    1. my $x = Math::BigInt->new(1);
    2. my $y = Math::BigInt->new(1);
    3. print $y->batan2($x), "\n";

    Calculate the arctangent of $y divided by $x , modifying $y in place.

    In BigInt, unless upgrading is in effect, the result is truncated to an integer.

    This method was added in v1.87 of Math::BigInt (June 2007).

    batan()

    1. my $x = Math::BigFloat->new(0.5);
    2. print $x->batan(100), "\n";

    Calculate the arctangent of $x, modifying $x in place.

    In BigInt, unless upgrading is in effect, the result is truncated to an integer.

    This method was added in v1.87 of Math::BigInt (June 2007).

    blsft()

    1. $x->blsft($y); # left shift in base 2
    2. $x->blsft($y,$n); # left shift, in base $n (like 10)

    brsft()

    1. $x->brsft($y); # right shift in base 2
    2. $x->brsft($y,$n); # right shift, in base $n (like 10)

    band()

    1. $x->band($y); # bitwise and

    bior()

    1. $x->bior($y); # bitwise inclusive or

    bxor()

    1. $x->bxor($y); # bitwise exclusive or

    bnot()

    1. $x->bnot(); # bitwise not (two's complement)
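    The bitwise methods behave like Perl's own operators, but without a size limit. A minimal sketch with two small operands whose bit patterns are easy to follow:

```perl
use strict;
use warnings;
use Math::BigInt;

my $x = Math::BigInt->new(12);   # 0b1100
my $y = Math::BigInt->new(10);   # 0b1010

my $and = $x->copy()->band($y);  # 0b1000
my $or  = $x->copy()->bior($y);  # 0b1110
my $xor = $x->copy()->bxor($y);  # 0b0110
my $not = $x->copy()->bnot();    # -(12 + 1)

print "$and $or $xor $not\n";    # 8 14 6 -13
print $and->as_bin(), "\n";      # 0b1000
```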

    bsqrt()

    1. $x->bsqrt(); # calculate square-root

    broot()

    1. $x->broot($N);

    Calculates the N'th root of $x .

    bfac()

    1. $x->bfac(); # factorial of $x (1*2*3*4*..$x)
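    A sketch of bpow(), bsqrt() and bfac() with results that exceed what a native integer can hold. In BigInt the square root is an integer, so the example also checks that it brackets the true root.

```perl
use strict;
use warnings;
use Math::BigInt;

my $pow = Math::BigInt->new(2)->bpow(100);   # 2 ** 100
my $fac = Math::BigInt->new(20)->bfac();     # 20!

my $x    = Math::BigInt->new(17);
my $root = $x->copy()->bsqrt();              # integer square root

print "$pow\n";    # 1267650600228229401496703205376
print "$fac\n";    # 2432902008176640000
print "$root\n";   # 4

# root**2 <= x < (root+1)**2
print "bracket holds\n"
    if $root * $root <= $x && ($root + 1) * ($root + 1) > $x;
```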

    round()

    1. $x->round($A,$P,$round_mode);

    Round $x to accuracy $A or precision $P using the round mode $round_mode .

    bround()

    1. $x->bround($N); # accuracy: preserve $N digits

    bfround()

    1. $x->bfround($N);

    If N is > 0, rounds to the Nth digit from the left. If N < 0, rounds to the Nth digit after the dot. Since BigInts are integers, the case N < 0 is a no-op for them.

    Examples:

        Input            N     Result
        ===================================================
        123456.123456    3     123500
        123456.123456    2     123450
        123456.123456    -2    123456.12
        123456.123456    -3    123456.123
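    The difference between rounding by accuracy and rounding by a negative precision can be sketched as follows. The example uses Math::BigFloat for the fractional input and shows that a negative N leaves a BigInt unchanged.

```perl
use strict;
use warnings;
use Math::BigInt;
use Math::BigFloat;

my $i = Math::BigInt->new(123456);
my $f = Math::BigFloat->new('123456.123456');

my $i_acc  = $i->copy()->bround(3);     # keep 3 significant digits
my $i_prec = $i->copy()->bfround(-2);   # no-op for a BigInt
my $f_p2   = $f->copy()->bfround(-2);   # 2 digits after the dot
my $f_p3   = $f->copy()->bfround(-3);   # 3 digits after the dot

print "$i_acc $i_prec\n";   # 123000 123456
print "$f_p2 $f_p3\n";      # 123456.12 123456.123
```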

    bfloor()

    1. $x->bfloor();

    Set $x to the largest integer less than or equal to $x. This is a no-op in BigInt, but does change $x in BigFloat.

    bceil()

    1. $x->bceil();

    Set $x to the smallest integer greater than or equal to $x. This is a no-op in BigInt, but does change $x in BigFloat.

    bgcd()

    1. bgcd(@values); # greatest common divisor (no OO style)

    blcm()

    1. blcm(@values); # lowest common multiple (no OO style)

    length()

    1. $x->length();
    2. ($xl,$fl) = $x->length();

    Returns the number of digits in the decimal representation of the number. In list context, returns the length of the integer and fraction part. For BigInts, the length of the fraction part will always be 0.

    exponent()

    1. $x->exponent();

    Return the exponent of $x as BigInt.

    mantissa()

    1. $x->mantissa();

    Return the signed mantissa of $x as BigInt.

    parts()

    1. $x->parts(); # return (mantissa,exponent) as BigInt
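    For a BigInt, the exponent is the number of trailing zeros and the mantissa is what remains once they are removed. A sketch with an arbitrary value:

```perl
use strict;
use warnings;
use Math::BigInt;

my $x = Math::BigInt->new(120000);

my $digits                = $x->length();   # scalar context
my ($int_len, $frac_len)  = $x->length();   # list context
my ($mantissa, $exponent) = $x->parts();

print "$digits $int_len $frac_len\n";   # 6 6 0
print "$mantissa $exponent\n";          # 12 4
```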

    copy()

    1. $x->copy(); # make a true copy of $x (unlike $y = $x;)

    as_int()/as_number()

    1. $x->as_int();

    Returns $x as a BigInt (truncated towards zero). In BigInt this is the same as copy() .

    as_number() is an alias to this method. as_number was introduced in v1.22, while as_int() was only introduced in v1.68.

    bstr()

    1. $x->bstr();

    Returns a normalized string representation of $x .

    bsstr()

    1. $x->bsstr(); # normalized string in scientific notation

    as_hex()

    1. $x->as_hex(); # as signed hexadecimal string with prefixed 0x

    as_bin()

    1. $x->as_bin(); # as signed binary string with prefixed 0b

    as_oct()

    1. $x->as_oct(); # as signed octal string with prefixed 0
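    The string conversions side by side, as a sketch. The output of bsstr() is printed without a stated result because the exact formatting of the exponent may differ between versions.

```perl
use strict;
use warnings;
use Math::BigInt;

my $x = Math::BigInt->new(255);

print $x->bstr(),   "\n";   # 255
print $x->as_hex(), "\n";   # 0xff
print $x->as_bin(), "\n";   # 0b11111111
print $x->as_oct(), "\n";   # 0377
print $x->bsstr(),  "\n";   # scientific notation

# The sign is placed in front of the prefix.
my $neg_hex = Math::BigInt->new(-255)->as_hex();
print "$neg_hex\n";         # -0xff
```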

    numify()

    1. print $x->numify();

    This returns a normal Perl scalar from $x. It is used automatically whenever a scalar is needed, for instance in array index operations.

    This loses precision; to avoid this, use as_int() instead.

    modify()

        $x->modify('bpow');

    This method returns 0 if the object can be modified with the given operation, or 1 if not.

    This is used for instance by Math::BigInt::Constant.

    upgrade()/downgrade()

    Set/get the class for downgrade/upgrade operations. This is used, for instance, by bignum. The defaults are '', thus the following operation will create a BigInt, not a BigFloat:

    1. my $i = Math::BigInt->new(123);
    2. my $f = Math::BigFloat->new('123.1');
    3. print $i + $f,"\n"; # print 246

    div_scale()

    Set/get the number of digits for the default precision in divide operations.

    round_mode()

    Set/get the current round mode.

    ACCURACY and PRECISION

    Since version v1.33, Math::BigInt and Math::BigFloat have full support for accuracy and precision based rounding, both automatically after every operation, as well as manually.

    This section describes the accuracy/precision handling in Math::Big* as it used to be and as it is now, complete with an explanation of all terms and abbreviations.

    Not yet implemented things (but with correct description) are marked with '!', things that need to be answered are marked with '?'.

    The next paragraphs give a short description of the terms used here (because these may differ from terms used by other people or documentation).

    During the rest of this document, the shortcuts A (for accuracy), P (for precision), F (fallback) and R (rounding mode) will be used.

    Precision P

    A fixed number of digits before (positive) or after (negative) the decimal point. For example, 123.45 has a precision of -2. 0 means an integer like 123 (or 120). A precision of 1 means that one digit to the left of the decimal point is zero, so 123 with P = 1 becomes 120. Note that numbers with zeros before the decimal point may have different precisions, because 1200 can have P = 0, 1 or 2 (depending on what the initial value was). It could also have P < 0, when the digits after the decimal point are zero.

    The string output (of floating point numbers) will be padded with zeros:

        Initial value   P       A       Result          String
        ------------------------------------------------------------
        1234.01         3               1000            1000
        1234            2               1200            1200
        1234.5          1               1230            1230
        1234.001        -1              1234            1234.0
        1234.01         0               1234            1234
        1234.01         -2              1234.01         1234.01
        1234.01         -5              1234.01         1234.01000

    For BigInts, no padding occurs.

    Accuracy A

    Number of significant digits. Leading zeros are not counted. A number may have an accuracy greater than its count of non-zero digits when it contains embedded or trailing zeros. For example, 123.456 has A of 6, 10203 has 5, 123.0506 has 7, 123.450000 has 9 and 0.000123 has 3.

    The string output (of floating point numbers) will be padded with zeros:

        Initial value   P       A       Result          String
        ------------------------------------------------------------
        1234.01                 3       1230            1230
        1234.01                 6       1234.01         1234.01
        1234.1                  8       1234.1          1234.1000

    For BigInts, no padding occurs.

    Fallback F

    When both A and P are undefined, this is used as a fallback accuracy when dividing numbers.

    Rounding mode R

    When rounding a number, different 'styles' or 'kinds' of rounding are possible. (Note that random rounding, as in Math::Round, is not implemented.)

    • 'trunc'

      truncation invariably removes all digits following the rounding place, replacing them with zeros. Thus, 987.65 rounded to tens (P=1) becomes 980, and rounded to the fourth sigdig becomes 987.6 (A=4). 123.456 rounded to the second place after the decimal point (P=-2) becomes 123.45.

      All other implemented styles of rounding attempt to round to the "nearest digit." If the digit D immediately to the right of the rounding place (skipping the decimal point) is greater than 5, the number is incremented at the rounding place (possibly causing a cascade of incrementation): e.g. when rounding to units, 0.9 rounds to 1, and -19.9 rounds to -20. If D < 5, the number is similarly truncated at the rounding place: e.g. when rounding to units, 0.4 rounds to 0, and -19.4 rounds to -19.

      However the results of other styles of rounding differ if the digit immediately to the right of the rounding place (skipping the decimal point) is 5 and if there are no digits, or no digits other than 0, after that 5. In such cases:

    • 'even'

      rounds the digit at the rounding place to 0, 2, 4, 6, or 8 if it is not already. E.g., when rounding to the first sigdig, 0.45 becomes 0.4, -0.55 becomes -0.6, but 0.4501 becomes 0.5.

    • 'odd'

      rounds the digit at the rounding place to 1, 3, 5, 7, or 9 if it is not already. E.g., when rounding to the first sigdig, 0.45 becomes 0.5, -0.55 becomes -0.5, but 0.5501 becomes 0.6.

    • '+inf'

      round to plus infinity, i.e. always round up. E.g., when rounding to the first sigdig, 0.45 becomes 0.5, -0.55 becomes -0.5, and 0.4501 also becomes 0.5.

    • '-inf'

      round to minus infinity, i.e. always round down. E.g., when rounding to the first sigdig, 0.45 becomes 0.4, -0.55 becomes -0.6, but 0.4501 becomes 0.5.

    • 'zero'

      round to zero, i.e. positive numbers down, negative ones up. E.g., when rounding to the first sigdig, 0.45 becomes 0.4, -0.55 becomes -0.5, but 0.4501 becomes 0.5.

    • 'common'

      round up if the digit immediately to the right of the rounding place is 5 or greater, otherwise round down. E.g., 0.15 becomes 0.2 and 0.149 becomes 0.1.
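    The modes can be compared by rounding the same integers to one significant digit. This sketch assumes that bround() accepts the round mode as its second argument, like the other methods that take A, P and R. The value 25 is an exact tie, where the modes differ, while 26 is rounded up by every mode except 'trunc'.

```perl
use strict;
use warnings;
use Math::BigInt;

my @modes = ('even', 'odd', '+inf', '-inf', 'zero', 'trunc', 'common');
my (%tie, %above);

for my $mode (@modes) {
    # Round to one significant digit with an explicit round mode.
    $tie{$mode}   = Math::BigInt->new(25)->bround(1, $mode)->bstr();
    $above{$mode} = Math::BigInt->new(26)->bround(1, $mode)->bstr();
    printf "%-6s 25 -> %s  26 -> %s\n", $mode, $tie{$mode}, $above{$mode};
}
```

    With these inputs, 'even', '-inf', 'zero' and 'trunc' turn 25 into 20, while 'odd', '+inf' and 'common' turn it into 30.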

    The handling of A & P in MBI/MBF (the old core code shipped with Perl versions <= 5.7.2) is like this:

    • Precision
        * ffround($p) is able to round to $p number of digits after the decimal
          point
        * otherwise P is unused
    • Accuracy (significant digits)
        * fround($a) rounds to $a significant digits
        * only fdiv() and fsqrt() take A as (optional) parameter
          + other operations simply create the same number (fneg etc), or more (fmul)
            of digits
          + rounding/truncating is only done when explicitly calling one of fround
            or ffround, and never for BigInt (not implemented)
        * fsqrt() simply hands its accuracy argument over to fdiv.
        * the documentation and the comment in the code indicate two different ways
          on how fdiv() determines the maximum number of digits it should calculate,
          and the actual code does yet another thing
          POD:
            max($Math::BigFloat::div_scale,length(dividend)+length(divisor))
          Comment:
            result has at most max(scale, length(dividend), length(divisor)) digits
          Actual code:
            scale = max(scale, length(dividend)-1,length(divisor)-1);
            scale += length(divisor) - length(dividend);
          So for lx = 3, ly = 9, scale = 10, scale will actually be 16 (10+9-3).
          Actually, the 'difference' added to the scale is calculated from the
          number of "significant digits" in dividend and divisor, which is derived
          by looking at the length of the mantissa. Which is wrong, since it includes
          the + sign (oops) and actually gets 2 for '+100' and 4 for '+101'. Oops
          again. Thus 124/3 with div_scale=1 will get you '41.3' based on the strange
          assumption that 124 has 3 significant digits, while 120/7 will get you
          '17', not '17.1' since 120 is thought to have 2 significant digits.
          The rounding after the division then uses the remainder and $y to determine
          whether it must round up or down.
        ? I have no idea which is the right way. That's why I used a slightly more
        ? simple scheme and tweaked the few failing testcases to match it.

    This is how it works now:

    • Setting/Accessing
        * You can set the A global via Math::BigInt->accuracy() or
          Math::BigFloat->accuracy() or whatever class you are using.
        * You can also set P globally by using Math::SomeClass->precision()
          likewise.
        * Globals are classwide, and not inherited by subclasses.
        * To undefine A, use Math::SomeClass->accuracy(undef);
        * To undefine P, use Math::SomeClass->precision(undef);
        * Setting Math::SomeClass->accuracy() automatically clears
          Math::SomeClass->precision(), and vice versa.
        * To be valid, A must be > 0, P can have any value.
        * If P is negative, this means round to the P'th place to the right of the
          decimal point; positive values mean to the left of the decimal point.
          P of 0 means round to integer.
        * To find out the current global A, use Math::SomeClass->accuracy()
        * To find out the current global P, use Math::SomeClass->precision()
        * Use $x->accuracy() or $x->precision() for the local setting of $x.
        * Please note that $x->accuracy() and $x->precision() return the
          global A or P, if one is defined, when $x's own A or P is not set.
    • Creating numbers
        * When you create a number, you can give the desired A or P via:
            $x = Math::BigInt->new($number,$A,$P);
        * Only one of A or P can be defined, otherwise the result is NaN
        * If no A or P is given ($x = Math::BigInt->new($number) form), then the
          globals (if set) will be used. Thus changing the global defaults later on
          will not change the A or P of previously created numbers (i.e., A and P of
          $x will be what was in effect when $x was created)
        * If given undef for A and P, NO rounding will occur, and the globals will
          NOT be used. This is used by subclasses to create numbers without
          suffering rounding in the parent. Thus a subclass is able to have its own
          globals enforced upon creation of a number by using
          $x = Math::BigInt->new($number,undef,undef):
            use Math::BigInt::SomeSubclass;
            use Math::BigInt;
            Math::BigInt->accuracy(2);
            Math::BigInt::SomeSubclass->accuracy(3);
            $x = Math::BigInt::SomeSubclass->new(1234);
          $x is now 1230, and not 1200. A subclass might choose to implement
          this otherwise, e.g. falling back to the parent's A and P.
    • Usage
        * If A or P are enabled/defined, they are used to round the result of each
          operation according to the rules below
        * Negative P is ignored in Math::BigInt, since BigInts never have digits
          after the decimal point
        * Math::BigFloat uses Math::BigInt internally, but setting A or P inside
          Math::BigInt as globals does not tamper with the parts of a BigFloat.
          A flag is used to mark all Math::BigFloat numbers as 'never round'.
    • Precedence
      1. * It only makes sense that a number has only one of A or P at a time.
      2. If you set either A or P on one object, or globally, the other one will
      3. be automatically cleared.
      4. * If two objects are involved in an operation, and one of them has A in
      5. effect, and the other P, this results in an error (NaN).
      6. * A takes precedence over P (Hint: A comes before P).
      7. If neither of them is defined, nothing is used, i.e. the result will have
      8. as many digits as it can (with an exception for fdiv/fsqrt) and will not
      9. be rounded.
      10. * There is another setting for fdiv() (and thus for fsqrt()). If neither of
      11. A or P is defined, fdiv() will use a fallback (F) of $div_scale digits.
      12. If either the dividend's or the divisor's mantissa has more digits than
      13. the value of F, the higher value will be used instead of F.
      14. This is to limit the digits (A) of the result (just consider what would
      15. happen with unlimited A and P in the case of 1/3 :-)
      16. * fdiv will calculate (at least) 4 more digits than required (determined by
      17. A, P or F), and, if F is not used, round the result
      18. (this will still fail in the case of a result like 0.12345000000001 with A
      19. or P of 5, but this can not be helped - or can it?)
      20. * Thus you can have the math done by one Math::Big* class in two modes:
      21. + never round (this is the default):
      22. This is done by setting A and P to undef. No math operation
      23. will round the result, with fdiv() and fsqrt() as exceptions to guard
      24. against overflows. You must explicitly call bround(), bfround() or
      25. round() (the latter with parameters).
      26. Note: Once you have rounded a number, the settings will 'stick' on it
      27. and 'infect' all other numbers engaged in math operations with it, since
      28. local settings have the highest precedence. So, to get SaferRound[tm],
      29. use a copy() before rounding like this:
      30. $x = Math::BigFloat->new(12.34);
      31. $y = Math::BigFloat->new(98.76);
      32. $z = $x * $y; # 1218.6984
      33. print $x->copy()->fround(3); # 12.3 (but A is now 3!)
      34. $z = $x * $y; # still 1218.6984, without
      35. # copy would have been 1210!
      36. + round after each op:
      37. After each single operation (except for testing like is_zero()), the
      38. method round() is called and the result is rounded appropriately. By
      39. setting proper values for A and P, you can have all-the-same-A or
      40. all-the-same-P modes. For example, Math::Currency might set A to undef,
      41. and P to -2, globally.
      42. ?Maybe an extra option that forbids local A & P settings would be in order,
      43. ?so that intermediate rounding does not 'poison' further math?
    • Overriding globals
      1. * you will be able to give A, P and R as an argument to all the calculation
      2. routines; the second parameter is A, the third one is P, and the fourth is
      3. R (shift right by one for binary operations like badd). P is used only if
      4. the first parameter (A) is undefined. These three parameters override the
      5. globals in the order detailed as follows, i.e. the first defined value
      6. wins:
      7. (local: per object, global: global default, parameter: argument to sub)
      8. + parameter A
      9. + parameter P
      10. + local A (if defined on both of the operands: smaller one is taken)
      11. + local P (if defined on both of the operands: bigger one is taken)
      12. + global A
      13. + global P
      14. + global F
      15. * fsqrt() will hand its arguments to fdiv(), as it used to, only now for two
      16. arguments (A and P) instead of one
    • Local settings
      1. * You can set A or P locally by using $x->accuracy() or
      2. $x->precision()
      3. and thus force different A and P for different objects/numbers.
      4. * Setting A or P this way immediately rounds $x to the new value.
      5. * $x->accuracy() clears $x->precision(), and vice versa.
    • Rounding
      1. * the rounding routines will use the respective global or local settings.
      2. fround()/bround() is for accuracy rounding, while ffround()/bfround()
      3. is for precision
      4. * the two rounding functions take as the second parameter one of the
      5. following rounding modes (R):
      6. 'even', 'odd', '+inf', '-inf', 'zero', 'trunc', 'common'
      7. * you can set/get the global R by using Math::SomeClass->round_mode()
      8. or by setting $Math::SomeClass::round_mode
      9. * after each operation, $result->round() is called, and the result is
      10. rounded if A or P were set (either locally, globally or as a
      11. parameter to the operation)
      12. * to manually round a number, call $x->round($A,$P,$round_mode);
      13. this will round the number by using the appropriate rounding function
      14. and then normalize it.
      15. * rounding modifies the local settings of the number:
      16. $x = Math::BigFloat->new(123.456);
      17. $x->accuracy(5);
      18. $x->bround(4);
      19. Here 4 takes precedence over 5, so 123.5 is the result and $x->accuracy()
      20. will be 4 from now on.
    • Default values
      1. * R: 'even'
      2. * F: 40
      3. * A: undef
      4. * P: undef
    • Remarks
      1. * The defaults are set up so that the new code gives the same results as
      2. the old code (except in a few cases on fdiv):
      3. + Both A and P are undefined and thus will not be used for rounding
      4. after each operation.
      5. + round() is thus a no-op, unless given extra parameters A and P
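
    The difference between accuracy (A) and precision (P), and the way a rounding setting sticks to the rounded object, can be seen in a few lines. This is a minimal sketch that only uses bround() for A and bfround() for P as described above; the variable names are illustrative.

    ```perl
    use strict;
    use warnings;
    use Math::BigFloat;

    my $x = Math::BigFloat->new('123.4567');

    # Round copies, so that $x itself keeps neither A nor P.
    my $by_accuracy  = $x->copy()->bround(4);     # A = 4: four significant digits
    my $by_precision = $x->copy()->bfround(-2);   # P = -2: two digits after the point

    print "$by_accuracy\n";     # 123.5
    print "$by_precision\n";    # 123.46

    # The setting sticks to the rounded object, not to the original.
    print $by_accuracy->accuracy(), "\n";                        # 4
    print defined $x->accuracy() ? "A set" : "A undef", "\n";    # A undef
    ```

    Rounding a copy rather than $x keeps later operations on $x free of any local A or P.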

    Infinity and Not a Number

    While BigInt has extensive handling of inf and NaN, certain quirks remain.

    • oct()/hex()

      These perl routines currently (as of Perl v.5.8.6) cannot handle passed inf.

      1. te@linux:~> perl -wle 'print 2 ** 3333'
      2. inf
      3. te@linux:~> perl -wle 'print 2 ** 3333 == 2 ** 3333'
      4. 1
      5. te@linux:~> perl -wle 'print oct(2 ** 3333)'
      6. 0
      7. te@linux:~> perl -wle 'print hex(2 ** 3333)'
      8. Illegal hexadecimal digit 'i' ignored at -e line 1.
      9. 0

      The same problems occur if you pass them Math::BigInt->binf() objects. Since overloading these routines is not possible, this cannot be fixed from BigInt.

    • ==, !=, <, >, <=, >= with NaNs

      BigInt's bcmp() routine currently returns undef to signal that a NaN was involved in a comparison. However, the overload code turns that into either 1 or '' and thus operations like NaN != NaN might return wrong values.

    • log(-inf)

      log(-inf) is highly weird. Since log(-x)=pi*i+log(x), then log(-inf)=pi*i+inf. However, since the imaginary part is finite, the real infinity "overshadows" it, so the number might as well just be infinity. However, the result is a complex number, and since BigInt/BigFloat can only have real numbers as results, the result is NaN.

    • exp(), cos(), sin(), atan2()

      These all might have problems handling infinity right.
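
    Because the overloaded comparison operators cannot report an undefined result, it is usually safer to test for NaN and infinity explicitly. The following sketch uses only is_nan(), is_inf() and bcmp(); the variable names are illustrative.

    ```perl
    use strict;
    use warnings;
    use Math::BigInt;

    my $nan  = Math::BigInt->bnan();
    my $inf  = Math::BigInt->binf();       # +inf
    my $ninf = Math::BigInt->binf('-');    # -inf

    # Explicit predicates do not go through the overloaded comparison operators.
    print "nan is NaN\n"   if $nan->is_nan();
    print "inf is +inf\n"  if $inf->is_inf('+');
    print "ninf is -inf\n" if $ninf->is_inf('-');

    # bcmp() itself reports an undefined ordering when a NaN is involved.
    my $order = $nan->bcmp(Math::BigInt->new(1));
    print "comparison with NaN is undefined\n" unless defined $order;
    ```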

    INTERNALS

    The actual numbers are stored as unsigned big integers (with separate sign).

    You should neither care about nor depend on the internal representation; it might change without notice. Use ONLY method calls like $x->sign() instead of relying on the internal representation.

    MATH LIBRARY

    Math with the numbers is done (by default) by a module called Math::BigInt::Calc . This is equivalent to saying:

    1. use Math::BigInt try => 'Calc';

    You can change this backend library by using:

    1. use Math::BigInt try => 'GMP';

    Note: General purpose packages should not be explicit about the library to use; let the script author decide which is best.

    If your script works with huge numbers and Calc is too slow for them, you can also force the loading of one of these libraries; if none of them can be used, the code will die:

    1. use Math::BigInt only => 'GMP,Pari';

    The following would first try to find Math::BigInt::Foo, then Math::BigInt::Bar, and when this also fails, revert to Math::BigInt::Calc:

    1. use Math::BigInt try => 'Foo,Math::BigInt::Bar';

    The library that is loaded last will be used. Note that this can be overwritten at any time by loading a different library, and numbers constructed with different libraries cannot be used in math operations together.
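
    To find out which backend was actually loaded, the class can be asked for its configuration. This is a small sketch; it assumes that config() returns a hash reference with the keys lib and lib_version, and the output depends on which libraries are installed.

    ```perl
    use strict;
    use warnings;

    # "try" falls back to Math::BigInt::Calc silently if GMP is not installed.
    use Math::BigInt try => 'GMP';

    my $config = Math::BigInt->config();
    print "backend: $config->{lib} $config->{lib_version}\n";
    ```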

    What library to use?

    Note: General purpose packages should not be explicit about the library to use; let the script author decide which is best.

    Math::BigInt::GMP and Math::BigInt::Pari are much faster than Calc in cases involving big numbers. However, they are slower when dealing with very small numbers (less than about 20 digits) and when converting very large numbers to decimal (for instance for printing, rounding, or calculating their length in decimal).

    So please select carefully what library you want to use.

    Different low-level libraries use different formats to store the numbers. However, you should NOT depend on the number having a specific format internally.

    See the respective math library module documentation for further details.

    SIGN

    The sign is either '+', '-', 'NaN', '+inf' or '-inf'.

    A sign of 'NaN' is used to represent the result when input arguments are not numbers, or as the result of 0/0. '+inf' and '-inf' represent plus and minus infinity, respectively. You will get '+inf' when dividing a positive number by 0, and '-inf' when dividing any negative number by 0.
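
    The five possible signs can be observed directly with sign(). The sketch below is illustrative; it divides copies, because bdiv() modifies its invocant, and uses scalar assignment so that only the quotient is returned.

    ```perl
    use strict;
    use warnings;
    use Math::BigInt;

    my $pos  = Math::BigInt->new(8);
    my $neg  = Math::BigInt->new(-8);
    my $zero = Math::BigInt->new(0);

    my $pos_by_zero  = $pos->copy()->bdiv(0);     # positive / 0
    my $neg_by_zero  = $neg->copy()->bdiv(0);     # negative / 0
    my $zero_by_zero = $zero->copy()->bdiv(0);    # 0 / 0
    my $not_a_number = Math::BigInt->new('abc');  # input is not a number

    my @signs = map { $_->sign() }
        ($pos, $neg, $not_a_number, $pos_by_zero, $neg_by_zero, $zero_by_zero);
    print join(' ', @signs), "\n";    # + - NaN +inf -inf NaN
    ```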

    mantissa(), exponent() and parts()

    mantissa() and exponent() return the said parts of the BigInt such that:

    1. $m = $x->mantissa();
    2. $e = $x->exponent();
    3. $y = $m * ( 10 ** $e );
    4. print "ok\n" if $x == $y;

    ($m,$e) = $x->parts() is just a shortcut that gives you both of them in one go. Both the returned mantissa and exponent have a sign.

    Currently, for BigInts $e is always 0, except +inf and -inf, where it is +inf ; and for NaN, where it is NaN ; and for $x == 0 , where it is 1 (to be compatible with Math::BigFloat's internal representation of a zero as 0E1 ).

    $m is currently just a copy of the original number. The relation between $e and $m will stay always the same, though their real values might change.

    EXAMPLES

    1. use Math::BigInt;
    2. sub bint { Math::BigInt->new(shift); }
    3. $x = Math::BigInt->bstr("1234"); # string "1234"
    4. $x = "$x"; # same as bstr()
    5. $x = Math::BigInt->bneg("1234"); # BigInt "-1234"
    6. $x = Math::BigInt->babs("-12345"); # BigInt "12345"
    7. $x = Math::BigInt->bnorm("-0.00"); # BigInt "0"
    8. $x = bint(1) + bint(2); # BigInt "3"
    9. $x = bint(1) + "2"; # ditto (auto-BigIntify of "2")
    10. $x = bint(1); # BigInt "1"
    11. $x = $x + 5 / 2; # BigInt "3"
    12. $x = $x ** 3; # BigInt "27"
    13. $x *= 2; # BigInt "54"
    14. $x = Math::BigInt->new(0); # BigInt "0"
    15. $x--; # BigInt "-1"
    16. $x = Math::BigInt->badd(4,5); # BigInt "9"
    17. print $x->bsstr(); # 9e+0

    Examples for rounding:

    1. use Math::BigFloat;
    2. use Test;
    3. $x = Math::BigFloat->new(123.4567);
    4. $y = Math::BigFloat->new(123.456789);
    5. Math::BigFloat->accuracy(4); # no more A than 4
    6. ok ($x->copy()->fround(),123.4); # even rounding
    7. print $x->copy()->fround(),"\n"; # 123.4
    8. Math::BigFloat->round_mode('odd'); # round to odd
    9. print $x->copy()->fround(),"\n"; # 123.5
    10. Math::BigFloat->accuracy(5); # no more A than 5
    11. Math::BigFloat->round_mode('odd'); # round to odd
    12. print $x->copy()->fround(),"\n"; # 123.46
    13. $y = $x->copy()->fround(4); # A = 4: 123.4
    14. print "$y, ",$y->accuracy(),"\n"; # 123.4, 4
    15. Math::BigFloat->accuracy(undef); # A not important now
    16. Math::BigFloat->precision(2); # P important
    17. print $x->copy()->bnorm(),"\n"; # 123.46
    18. print $x->copy()->fround(),"\n"; # 123.46

    Examples for converting:

    1. my $x = Math::BigInt->new('0b1'.'01' x 123);
    2. print "bin: ",$x->as_bin()," hex:",$x->as_hex()," dec: ",$x,"\n";

    Autocreating constants

    After use Math::BigInt ':constant' all the integer decimal, hexadecimal and binary constants in the given scope are converted to Math::BigInt . This conversion happens at compile time.

    In particular,

    1. perl -MMath::BigInt=:constant -e 'print 2**100,"\n"'

    prints the integer value of 2**100 . Note that without conversion of constants the expression 2**100 will be calculated as perl scalar.

    Please note that strings and floating point constants are not affected, so that

    1. use Math::BigInt qw/:constant/;
    2. $x = 1234567890123456789012345678901234567890
    3. + 123456789123456789;
    4. $y = '1234567890123456789012345678901234567890'
    5. + '123456789123456789';

    do not work. You need an explicit Math::BigInt->new() around one of the operands. You should also quote large constants to protect against loss of precision:

    1. use Math::BigInt;
    2. $x = Math::BigInt->new('1234567889123456789123456789123456789');

    Without the quotes Perl would convert the large number to a floating point constant at compile time and then hand the result to BigInt, which results in a truncated result or a NaN.

    This also applies to integers that look like floating point constants:

    1. use Math::BigInt ':constant';
    2. print ref(123e2),"\n";
    3. print ref(123.2e2),"\n";

    will print nothing but newlines. Use either bignum or Math::BigFloat to get this to work.
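
    As a sketch of the advice above, the following passes the large constant as a quoted string, so that every digit reaches Math::BigInt, and lets the overloaded + convert the second (string) operand. The variable names are illustrative.

    ```perl
    use strict;
    use warnings;
    use Math::BigInt;

    # Quoted: the digits reach Math::BigInt as a string, so none are lost.
    my $big = Math::BigInt->new('1234567890123456789012345678901234567890');

    # One operand is already a BigInt, so the string operand is converted too.
    my $sum = $big + '123456789123456789';

    print ref($sum), "\n";    # Math::BigInt
    print "$sum\n";           # 1234567890123456789012469135690358024679
    ```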

    PERFORMANCE

    Using the form $x += $y; etc. over $x = $x + $y is faster, since a copy of $x must be made in the second case. For long numbers, the copy can eat up to 20% of the work (in the case of addition/subtraction, less for multiplication/division). If $y is very small compared to $x, the form $x += $y is MUCH faster than $x = $x + $y since making the copy of $x takes more time than the actual addition.

    With a technique called copy-on-write (COW), the cost of copying with overload could be minimized or even completely avoided. A test implementation of COW did show performance gains for overloaded math, but introduced a performance loss due to a constant overhead for all other operations. So Math::BigInt currently does not use COW.

    The rewritten version of this module (vs. v0.01) is slower on certain operations, like new() , bstr() and numify() . The reason is that it now does more work and handles many more cases. The time spent in these operations is usually gained back in the other math operations, so that code on average should get (much) faster. If it doesn't, please contact the author.

    Some operations may be slower for small numbers, but are significantly faster for big numbers. Other operations are now constant (O(1), like bneg() , babs() etc), instead of O(N) and thus nearly always take much less time. These optimizations were done on purpose.

    If you find the Calc module too slow, try to install any of the replacement modules and see if they help you.

    Alternative math libraries

    You can use an alternative library to drive Math::BigInt. See the section MATH LIBRARY for more information.

    For more benchmark results see http://bloodgate.com/perl/benchmarks.html.

    SUBCLASSING

    Subclassing Math::BigInt

    The basic design of Math::BigInt allows simple subclasses with very little work, as long as a few simple rules are followed:

    • The public API must remain consistent, i.e. if a sub-class is overloading addition, the sub-class must use the same name, in this case badd(). The reason for this is that Math::BigInt is optimized to call the object methods directly.

    • The private object hash keys like $x->{sign} may not be changed, but additional keys can be added, like $x->{_custom} .

    • Accessor functions are available for all existing object hash keys and should be used instead of directly accessing the internal hash keys. The reason for this is that Math::BigInt itself has a pluggable interface which permits it to support different storage methods.

    More complex sub-classes may have to replicate more of the internal logic of Math::BigInt if they need to change more basic behaviors. A subclass that merely needs to change the output only needs to overload bstr() .

    All other object methods and overloaded functions can be directly inherited from the parent class.

    At the very minimum, any subclass will need to provide its own new() and can store additional hash keys in the object. There are also some package globals that must be defined, e.g.:

    1. # Globals
    2. $accuracy = undef;
    3. $precision = -2; # round to 2 decimal places
    4. $round_mode = 'even';
    5. $div_scale = 40;

    Additionally, you might want to provide the following two globals to allow auto-upgrading and auto-downgrading to work correctly:

    1. $upgrade = undef;
    2. $downgrade = undef;

    This allows Math::BigInt to correctly retrieve package globals from the subclass, like $SubClass::precision . See t/Math/BigInt/Subclass.pm or t/Math/BigFloat/SubClass.pm for completely functional subclass examples.

    Don't forget to

    1. use overload;

    in your subclass to automatically inherit the overloading from the parent. If you like, you can change part of the overloading, look at Math::String for an example.
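
    Putting these rules together, a subclass that only changes the output might look like the sketch below. The package name My::GroupedInt and the digit-grouping format are hypothetical; the subclass relies on the inherited new() and overloading, defines the package globals listed above, and overrides nothing but bstr().

    ```perl
    package My::GroupedInt;    # hypothetical subclass name

    use strict;
    use warnings;
    use Math::BigInt;
    use overload;              # inherit the parent's overloading

    our @ISA = ('Math::BigInt');

    # Package globals that the parent class looks up in the subclass.
    our $accuracy   = undef;
    our $precision  = undef;
    our $round_mode = 'even';
    our $div_scale  = 40;
    our $upgrade    = undef;
    our $downgrade  = undef;

    # Only the output changes: group the digits in threes.
    sub bstr {
        my $self = shift;
        my $str  = $self->SUPER::bstr(@_);
        return $str if $self->is_nan() || $self->is_inf();
        1 while $str =~ s/^(-?\d+)(\d{3})/$1,$2/;
        return $str;
    }

    package main;

    my $n       = My::GroupedInt->new('1234567');
    my $product = $n * 1000;    # inherited overloaded multiplication

    print ref($product), ": $product\n";    # My::GroupedInt: 1,234,567,000
    ```

    Since a changed bstr() also changes what string comparisons and other classes see when they stringify the object, such a format is best kept to subclasses used purely for display.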

    UPGRADING

    When used like this:

    1. use Math::BigInt upgrade => 'Foo::Bar';

    certain operations will 'upgrade' their calculation and thus the result to the class Foo::Bar. Usually this is used in conjunction with Math::BigFloat:

    1. use Math::BigInt upgrade => 'Math::BigFloat';

    As a shortcut, you can use the module bignum :

    1. use bignum;

    Also good for one-liners:

    1. perl -Mbignum -le 'print 2 ** 255'

    This makes it possible to mix arguments of different classes (as in 2.5 + 2) as well as preserve accuracy (as in sqrt(3)).

    Beware: This feature is not fully implemented yet.

    Auto-upgrade

    The following methods upgrade themselves unconditionally; that is if upgrade is in effect, they will always hand up their work:

    • bsqrt()
    • div()
    • blog()
    • bexp()

    Beware: This list is not complete.

    All other methods upgrade themselves only when one (or all) of their arguments are of the class mentioned in $upgrade. (This might change in later versions to a more sophisticated scheme.)

    EXPORTS

    Math::BigInt exports nothing by default, but can export the following methods:

    1. bgcd
    2. blcm

    CAVEATS

    Some things might not work as you expect them. Below is documented what is known to be troublesome:

    • bstr(), bsstr() and 'cmp'

      Both bstr() and bsstr() as well as automated stringify via overload now drop the leading '+'. The old code would return '+3', the new returns '3'. This is to be consistent with Perl and to make cmp (especially with overloading) work as you expect. It also solves problems with Test.pm , because its ok() uses 'eq' internally.

      Mark Biggar said, when asked whether to drop the '+' altogether or to make only cmp work:

      1. I agree (with the first alternative), don't add the '+' on positive
      2. numbers. It's not as important anymore with the new internal
      3. form for numbers. It made doing things like abs and neg easier,
      4. but those have to be done differently now anyway.

      So, the following examples will now work all as expected:

      1. use Test;
      2. BEGIN { plan tests => 1 }
      3. use Math::BigInt;
      4. my $x = new Math::BigInt 3*3;
      5. my $y = new Math::BigInt 3*3;
      6. ok ($x,3*3);
      7. print "$x eq 9" if $x eq $y;
      8. print "$x eq 9" if $x eq '9';
      9. print "$x eq 9" if $x eq 3*3;

      Additionally, the following still works:

      1. print "$x == 9" if $x == $y;
      2. print "$x == 9" if $x == 9;
      3. print "$x == 9" if $x == 3*3;

      There is now a bsstr() method to get the string in scientific notation aka 1e+2 instead of 100 . Be advised that overloaded 'eq' always uses bstr() for comparison, but Perl will represent some numbers as 100 and others as 1e+308. If in doubt, convert both arguments to Math::BigInt before comparing them as strings:

      1. use Test;
      2. BEGIN { plan tests => 3 }
      3. use Math::BigInt;
      4. $x = Math::BigInt->new('1e56'); $y = 1e56;
      5. ok ($x,$y); # will fail
      6. ok ($x->bsstr(),$y); # okay
      7. $y = Math::BigInt->new($y);
      8. ok ($x,$y); # okay

      Alternatively, simply use <=> for comparisons; this will always get it right. There is not yet a way to get a number automatically represented as a string that matches exactly the way Perl represents it.

      See also the section about Infinity and Not a Number for problems in comparing NaNs.

    • int()

      int() will return (at least for Perl v5.7.1 and up) another BigInt, not a Perl scalar:

      1. $x = Math::BigInt->new(123);
      2. $y = int($x); # BigInt 123
      3. $x = Math::BigFloat->new(123.45);
      4. $y = int($x); # BigInt 123

      In all Perl versions you can use as_number() or as_int for the same effect:

      1. $x = Math::BigFloat->new(123.45);
      2. $y = $x->as_number(); # BigInt 123
      3. $y = $x->as_int(); # ditto

      This also works for other subclasses, like Math::String.

      If you want a real Perl scalar, use numify() :

      1. $y = $x->numify(); # 123 as scalar

      This is seldom necessary, though, because this is done automatically, like when you access an array:

      1. $z = $array[$x]; # does work automatically
    • length

      The following will probably not do what you expect:

      1. $c = Math::BigInt->new(123);
      2. print $c->length(),"\n"; # prints 30

      It prints both the number of digits in the number and in the fraction part since print calls length() in list context. Use something like:

      1. print scalar $c->length(),"\n"; # prints 3
    • bdiv

      The following will probably not do what you expect:

      1. print $c->bdiv(10000),"\n";

      It prints both quotient and remainder since print calls bdiv() in list context. Also, bdiv() will modify $c, so be careful. You probably want to use

      1. print $c / 10000,"\n";
      2. print scalar $c->bdiv(10000),"\n"; # or if you want to modify $c

      instead.

      The quotient is always the greatest integer less than or equal to the real-valued quotient of the two operands, and the remainder (when it is non-zero) always has the same sign as the second operand; so, for example,

      1. 1 / 4 => ( 0, 1)
      2. 1 / -4 => (-1,-3)
      3. -3 / 4 => (-1, 1)
      4. -3 / -4 => ( 0,-3)
      5. -11 / 2 => (-6,1)
      6. 11 /-2 => (-6,-1)

      As a consequence, the behavior of the operator % agrees with the behavior of Perl's built-in % operator (as documented in the perlop manpage), and the equation

      1. $x == ($x / $y) * $y + ($x % $y)

      holds true for any $x and $y, which justifies calling the two return values of bdiv() the quotient and remainder. The only exception to this rule is when $y == 0 and $x is negative; then the remainder will also be negative. See below under "infinity handling" for the reasoning behind this.

      Perl's 'use integer;' changes the behaviour of % and / for scalars, but will not change BigInt's way to do things. This is because under 'use integer' Perl will do what the underlying C thinks is right and this is different for each system. If you need BigInt to behave exactly like Perl's 'use integer', bug the author to implement it ;)
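
      The quotient and remainder rule can be checked with a short loop. This is a sketch that assumes a Math::BigInt release whose bdiv() implements the floored division described here; rather than hard-coding results, it tests the equation $x == ($x / $y) * $y + ($x % $y) for each pair.

      ```perl
      use strict;
      use warnings;
      use Math::BigInt;

      my $mismatches = 0;
      for my $pair ([-11, 2], [11, -2], [1, -4], [-3, 4]) {
          my ($n, $d) = @$pair;

          # List context: bdiv() returns (quotient, remainder). It also modifies
          # its invocant, hence the fresh object for each call.
          my ($q, $r) = Math::BigInt->new($n)->bdiv($d);

          my $ok = ($q * $d + $r == $n);
          $mismatches++ unless $ok;
          print "$n / $d => ($q, $r) ", ($ok ? 'ok' : 'MISMATCH'), "\n";
      }
      ```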

    • infinity handling

      Here are some examples that explain the reasons why certain results occur while handling infinity:

      The following table shows the result of the division and the remainder, so that the equation above holds true. Some "ordinary" cases are strewn in to show more clearly the reasoning:

      1. A / B = C, R so that C * B + R = A
      2. =========================================================
      3. 5 / 8 = 0, 5 0 * 8 + 5 = 5
      4. 0 / 8 = 0, 0 0 * 8 + 0 = 0
      5. 0 / inf = 0, 0 0 * inf + 0 = 0
      6. 0 /-inf = 0, 0 0 * -inf + 0 = 0
      7. 5 / inf = 0, 5 0 * inf + 5 = 5
      8. 5 /-inf = 0, 5 0 * -inf + 5 = 5
      9. -5/ inf = 0, -5 0 * inf + -5 = -5
      10. -5/-inf = 0, -5 0 * -inf + -5 = -5
      11. inf/ 5 = inf, 0 inf * 5 + 0 = inf
      12. -inf/ 5 = -inf, 0 -inf * 5 + 0 = -inf
      13. inf/ -5 = -inf, 0 -inf * -5 + 0 = inf
      14. -inf/ -5 = inf, 0 inf * -5 + 0 = -inf
      15. 5/ 5 = 1, 0 1 * 5 + 0 = 5
      16. -5/ -5 = 1, 0 1 * -5 + 0 = -5
      17. inf/ inf = 1, 0 1 * inf + 0 = inf
      18. -inf/-inf = 1, 0 1 * -inf + 0 = -inf
      19. inf/-inf = -1, 0 -1 * -inf + 0 = inf
      20. -inf/ inf = -1, 0 -1 * inf + 0 = -inf
      21. 8/ 0 = inf, 8 inf * 0 + 8 = 8
      22. inf/ 0 = inf, inf inf * 0 + inf = inf
      23. 0/ 0 = NaN

      These cases below violate the "remainder has the sign of the second of the two arguments", since they wouldn't match up otherwise.

      1. A / B = C, R so that C * B + R = A
      2. ========================================================
      3. -inf/ 0 = -inf, -inf -inf * 0 + -inf = -inf
      4. -8/ 0 = -inf, -8 -inf * 0 + -8 = -8
    • Modifying and =

      Beware of:

      1. $x = Math::BigFloat->new(5);
      2. $y = $x;

      It will not do what you think, i.e. make a copy of $x. Instead it just makes a second reference to the same object and stores it in $y. Thus anything that modifies $x (except overloaded operators) will modify $y, and vice versa. Or in other words, = is only safe if you modify your BigInts only via overloaded math. As soon as you use a method call it breaks:

      1. $x->bmul(2);
      2. print "$x, $y\n"; # prints '10, 10'

      If you want a true copy of $x, use:

      1. $y = $x->copy();

      You can also chain the calls like this; this will first make a copy and then multiply it by 2:

      1. $y = $x->copy()->bmul(2);

      See also the documentation for overload.pm regarding = .
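
      The difference between a plain assignment, a copy() and overloaded math can be shown side by side. This is a minimal sketch with illustrative variable names.

      ```perl
      use strict;
      use warnings;
      use Math::BigInt;

      my $x     = Math::BigInt->new(5);
      my $alias = $x;            # second reference to the same object
      my $copy  = $x->copy();    # independent object

      $x->bmul(2);               # method call: modifies the shared object in place
      print "x=$x alias=$alias copy=$copy\n";    # x=10 alias=10 copy=5

      $x = $x + 1;               # overloaded operator: $x now refers to a new object
      print "x=$x alias=$alias\n";               # x=11 alias=10
      ```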

    • bpow

      bpow() (and the rounding functions) now modifies the first argument and returns it, unlike the old code which left it alone and only returned the result. This is to be consistent with badd() etc. The first three will modify $x, the last one won't:

      1. print bpow($x,$i),"\n"; # modify $x
      2. print $x->bpow($i),"\n"; # ditto
      3. print $x **= $i,"\n"; # the same
      4. print $x ** $i,"\n"; # leave $x alone

      The form $x **= $y is faster than $x = $x ** $y; , though.

    • Overloading -$x

      The following:

      1. $x = -$x;

      is slower than

      1. $x->bneg();

      since overload calls sub($x,0,1); instead of neg($x) . The first variant needs to preserve $x since it does not know that it later will get overwritten. This makes a copy of $x and takes O(N), but $x->bneg() is O(1).

    • Mixing different object types

      In Perl you will get a floating point value if you do one of the following:

      1. $float = 5.0 + 2;
      2. $float = 2 + 5.0;
      3. $float = 5 / 2;

      With overloaded math, only the first two variants will result in a BigFloat:

      1. use Math::BigInt;
      2. use Math::BigFloat;
      3. $mbf = Math::BigFloat->new(5);
      4. $mbi2 = Math::BigInt->new(5);
      5. $mbi = Math::BigInt->new(2);
      6. # what actually gets called:
      7. $float = $mbf + $mbi; # $mbf->badd()
      8. $float = $mbf / $mbi; # $mbf->bdiv()
      9. $integer = $mbi + $mbf; # $mbi->badd()
      10. $integer = $mbi2 / $mbi; # $mbi2->bdiv()
      11. $integer = $mbi2 / $mbf; # $mbi2->bdiv()

      This is because math with overloaded operators follows the first (dominating) operand: that operand's operation is called and returns the result. So, Math::BigInt::bdiv() will always return a Math::BigInt, regardless of whether the result should be a Math::BigFloat or the second operand is one.

      To get a Math::BigFloat you either need to call the operation manually, or make sure the operands are already of the proper type or cast to that type via Math::BigFloat->new():

      1. $float = Math::BigFloat->new($mbi2) / $mbi; # = 2.5

      Beware of simply "casting" the entire expression; this would only convert the already computed result:

      1. $float = Math::BigFloat->new($mbi2 / $mbi); # = 2.0 thus wrong!

      Beware also of the order of more complicated expressions like:

      1. $integer = ($mbi2 + $mbi) / $mbf; # int / float => int
      2. $integer = $mbi2 / Math::BigFloat->new($mbi); # ditto

      If in doubt, break the expression into simpler terms, or cast all operands to the desired resulting type.

      Scalar values are a bit different, since:

      1. $float = 2 + $mbf;
      2. $float = $mbf + 2;

      will both result in the proper type due to the way the overloaded math works.

      This section also applies to other overloaded math packages, like Math::String.

      One solution to your problem might be auto-upgrading (see UPGRADING). See the pragmas bignum, bigint and bigrat for an easy way to do this.
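
      The effect of the dominating operand can be seen by dividing the same two integers three ways. The sketch below assumes that no upgrading is in effect; the variable names are illustrative.

      ```perl
      use strict;
      use warnings;
      use Math::BigInt;
      use Math::BigFloat;

      my $five = Math::BigInt->new(5);
      my $two  = Math::BigInt->new(2);

      my $int_result   = $five / $two;                        # BigInt division
      my $float_result = Math::BigFloat->new($five) / $two;   # BigFloat division
      my $too_late     = Math::BigFloat->new($five / $two);   # converts the integer result

      for my $result ($int_result, $float_result, $too_late) {
          print ref($result), ": $result\n";
      }
      # Math::BigInt: 2
      # Math::BigFloat: 2.5
      # Math::BigFloat: 2
      ```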

    • bsqrt()

      bsqrt() works well only if the result is an integer, e.g. the square root of 144 is 12, but the square root of 12 is returned as 3, regardless of rounding mode. The reason is that the result is always truncated to an integer.

      If you want a better approximation of the square root, then use:

      1. $x = Math::BigFloat->new(12);
      2. Math::BigFloat->precision(0);
      3. Math::BigFloat->round_mode('even');
      4. print $x->copy->bsqrt(),"\n"; # 4
      5. Math::BigFloat->precision(2);
      6. print $x->bsqrt(),"\n"; # 3.46
      7. print $x->bsqrt(3),"\n"; # 3.464
    • brsft()

      For negative numbers in base see also brsft.

    LICENSE

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    Math::BigFloat, Math::BigRat and Math::Big as well as Math::BigInt::Pari and Math::BigInt::GMP.

    The pragmas bignum, bigint and bigrat also might be of interest because they solve the autoupgrading/downgrading issue, at least partly.

    The package at http://search.cpan.org/search?mode=module&query=Math%3A%3ABigInt contains more documentation including a full version history, testcases, empty subclass files and benchmarks.

    AUTHORS

    Original code by Mark Biggar, overloaded interface by Ilya Zakharevich. Completely rewritten by Tels http://bloodgate.com in late 2000, 2001 - 2006 and still at it in 2007.

    Many people contributed in one or more ways to the final beast, see the file CREDITS for an (incomplete) list. If you miss your name, please drop me a mail. Thank you!

     
    Math::BigRat

    NAME

    Math::BigRat - Arbitrary big rational numbers

    SYNOPSIS

    1. use Math::BigRat;
    2. my $x = Math::BigRat->new('3/7'); $x += '5/9';
    3. print $x->bstr(),"\n";
    4. print $x ** 2,"\n";
    5. my $y = Math::BigRat->new('inf');
    6. print "$y ", ($y->is_inf ? 'is' : 'is not') , " infinity\n";
    7. my $z = Math::BigRat->new(144); $z->bsqrt();

    DESCRIPTION

    Math::BigRat complements Math::BigInt and Math::BigFloat by providing support for arbitrary big rational numbers.

    MATH LIBRARY

    You can change the underlying module that does the low-level math operations by using:

    1. use Math::BigRat try => 'GMP';

    Note: This needs Math::BigInt::GMP installed.

    The following would first try to find Math::BigInt::Foo, then Math::BigInt::Bar, and when this also fails, revert to Math::BigInt::Calc:

    1. use Math::BigRat try => 'Foo,Math::BigInt::Bar';

    If you want to get warned when the fallback occurs, replace "try" with "lib":

    1. use Math::BigRat lib => 'Foo,Math::BigInt::Bar';

    If you want the code to die instead, replace "try" with "only":

    1. use Math::BigRat only => 'Foo,Math::BigInt::Bar';
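    As a minimal sketch of the fallback behaviour described above, the following loads Math::BigRat with try => 'GMP' and then asks config() which low-level library was actually selected. The variable name $lib is only illustrative, and the printed value depends on what is installed on your system.

```perl
use strict;
use warnings;

# 'try' falls back silently to the default library if
# Math::BigInt::GMP is not installed.
use Math::BigRat try => 'GMP';

# config() returns a hash reference; the 'lib' key names the
# low-level math library that was actually loaded.
my $lib = Math::BigRat->config()->{lib};
print "low-level math library: $lib\n";
```

    Checking config() this way is useful in test suites that want to report which backend their timings refer to.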

    METHODS

    Any methods not listed here are derived from Math::BigFloat (or Math::BigInt), so make sure you check these two modules for further information.

    new()

    1. $x = Math::BigRat->new('1/3');

    Create a new Math::BigRat object. Input can come in various forms:

        $x = Math::BigRat->new(123);                            # scalars
        $x = Math::BigRat->new('inf');                          # infinity
        $x = Math::BigRat->new('123.3');                        # float
        $x = Math::BigRat->new('1/3');                          # simple string
        $x = Math::BigRat->new('1 / 3');                        # spaced
        $x = Math::BigRat->new('1 / 0.1');                      # w/ floats
        $x = Math::BigRat->new(Math::BigInt->new(3));           # BigInt
        $x = Math::BigRat->new(Math::BigFloat->new('3.1'));     # BigFloat
        $x = Math::BigRat->new(Math::BigInt::Lite->new('2'));   # BigLite
        # You can also give the numerator and denominator as separate objects:
        $x = Math::BigRat->new(
                Math::BigInt->new(-123),
                Math::BigInt->new(7),
        );                                                      # => -123/7
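    The input forms above all end up as a normalized numerator/denominator pair. A small sketch, with illustrative variable names, of how three of them stringify:

```perl
use strict;
use warnings;
use Math::BigInt;
use Math::BigRat;

my $frac    = Math::BigRat->new('6/8');    # reduced to lowest terms
my $decimal = Math::BigRat->new('123.3');  # stored exactly, not as a binary float
my $pair    = Math::BigRat->new(
    Math::BigInt->new(-123),               # numerator
    Math::BigInt->new(7),                  # denominator
);

print "$frac $decimal $pair\n";
```

    Passing decimal input as a string (rather than a Perl float) avoids any rounding before Math::BigRat sees the value.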

    numerator()

    1. $n = $x->numerator();

    Returns a copy of the numerator (the part above the line) as signed BigInt.

    denominator()

    1. $d = $x->denominator();

    Returns a copy of the denominator (the part under the line) as positive BigInt.

    parts()

    1. ($n,$d) = $x->parts();

    Return a list consisting of (signed) numerator and (unsigned) denominator as BigInts.
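    A short sketch of numerator(), denominator() and parts() on a negative fraction. It assumes only the methods documented above; the variable names are illustrative.

```perl
use strict;
use warnings;
use Math::BigRat;

my $x = Math::BigRat->new('-6/8');      # normalized to -3/4

# parts() returns both components in one call.
my ($n, $d) = $x->parts();
print "numerator $n, denominator $d\n";

# The sign always travels with the numerator.
print "same numerator\n" if $x->numerator() == $n;
```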

    numify()

    1. my $y = $x->numify();

    Returns the object as a scalar. This will lose some data if the object cannot be represented by a normal Perl scalar (integer or float), so use as_int() or as_float() instead.

    This routine is automatically used whenever a scalar is required:

    1. my $x = Math::BigRat->new('3/1');
    2. @array = (0,1,2,3);
    3. $y = $array[$x]; # set $y to 3

    as_int()/as_number()

    1. $x = Math::BigRat->new('13/7');
    2. print $x->as_int(),"\n"; # '1'

    Returns a copy of the object as BigInt, truncated to an integer.

    as_number() is an alias for as_int() .

    as_float()

        $x = Math::BigRat->new('13/7');
        print $x->as_float(),"\n";              # '1.857142857...' (40 digits by default)
        $x = Math::BigRat->new('2/3');
        print $x->as_float(5),"\n";             # '0.66667'

    Returns a copy of the object as BigFloat, preserving the accuracy as wanted, or the default of 40 digits.

    This method was added in v0.22 of Math::BigRat (April 2008).
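    To illustrate the accuracy argument, here is a hedged sketch that converts 2/3 once with five digits and once with the default accuracy. The variable names are illustrative.

```perl
use strict;
use warnings;
use Math::BigRat;

my $x = Math::BigRat->new('2/3');

my $short = $x->as_float(5);    # five significant digits
my $long  = $x->as_float();     # default accuracy

print "short: $short\n";
print "long:  $long\n";
print "class: ", ref($long), "\n";
```

    The original $x is left untouched, since as_float() returns a copy.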

    as_hex()

    1. $x = Math::BigRat->new('13');
    2. print $x->as_hex(),"\n"; # '0xd'

    Returns the BigRat as hexadecimal string. Works only for integers.

    as_bin()

        $x = Math::BigRat->new('13');
        print $x->as_bin(),"\n";                # '0b1101'

    Returns the BigRat as binary string. Works only for integers.

    as_oct()

    1. $x = Math::BigRat->new('13');
    2. print $x->as_oct(),"\n"; # '015'

    Returns the BigRat as octal string. Works only for integers.

    from_hex()/from_bin()/from_oct()

    1. my $h = Math::BigRat->from_hex('0x10');
    2. my $b = Math::BigRat->from_bin('0b10000000');
    3. my $o = Math::BigRat->from_oct('020');

    Create a BigRat from a hexadecimal, binary or octal number in string form.
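    The constructors and the as_hex()/as_bin()/as_oct() methods can be combined into a round trip. A minimal sketch, with illustrative variable names:

```perl
use strict;
use warnings;
use Math::BigRat;

my $from_hex = Math::BigRat->from_hex('0x10');
my $from_bin = Math::BigRat->from_bin('0b10000000');
my $from_oct = Math::BigRat->from_oct('020');

print "$from_hex $from_bin $from_oct\n";

# Convert an integer-valued BigRat back to each string form.
my $thirteen = Math::BigRat->new('13');
print join(' ', $thirteen->as_hex(), $thirteen->as_bin(), $thirteen->as_oct()), "\n";
```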

    length()

    1. $len = $x->length();

    Return the length of $x in digits for integer values.

    digit()

        print Math::BigRat->new('123/1')->digit(1);     # 2
        print Math::BigRat->new('123/1')->digit(-1);    # 1

    Return the Nth digit of $x when $x is an integer value. Digits are counted from the right, starting at zero; negative values of N count from the left.

    bnorm()

    1. $x->bnorm();

    Reduce the number to the shortest form. This routine is called automatically whenever it is needed.

    bfac()

    1. $x->bfac();

    Calculates the factorial of $x. For instance:

    1. print Math::BigRat->new('3/1')->bfac(),"\n"; # 1*2*3
    2. print Math::BigRat->new('5/1')->bfac(),"\n"; # 1*2*3*4*5

    Works currently only for integers.

    bround()/round()/bfround()

    Are not yet implemented.

    bmod()

    1. use Math::BigRat;
    2. my $x = Math::BigRat->new('7/4');
    3. my $y = Math::BigRat->new('4/3');
    4. print $x->bmod($y);

    Set $x to the remainder of the division of $x by $y.
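    Because bmod() changes $x in place, it is often convenient to work on a copy. A minimal sketch for positive operands, where 7/4 = 1 * 4/3 + 5/12 (variable names are illustrative):

```perl
use strict;
use warnings;
use Math::BigRat;

my $x = Math::BigRat->new('7/4');
my $y = Math::BigRat->new('4/3');

# copy() keeps $x intact; bmod() modifies its invocant.
my $r = $x->copy()->bmod($y);

print "$x mod $y = $r\n";
```

    Note that the BUGS section lists bmod() as only partially implemented, so results for negative operands or special values should be checked separately.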

    bneg()

    1. $x->bneg();

    Used to negate the object in-place.

    is_one()

    1. print "$x is 1\n" if $x->is_one();

    Return true if $x is exactly one, otherwise false.

    is_zero()

    1. print "$x is 0\n" if $x->is_zero();

    Return true if $x is exactly zero, otherwise false.

    is_pos()/is_positive()

    1. print "$x is >= 0\n" if $x->is_positive();

    Return true if $x is positive (greater than or equal to zero), otherwise false. Please note that '+inf' is also positive, while 'NaN' and '-inf' aren't.

    is_positive() is an alias for is_pos() .

    is_neg()/is_negative()

    1. print "$x is < 0\n" if $x->is_negative();

    Return true if $x is negative (smaller than zero), otherwise false. Please note that '-inf' is also negative, while 'NaN' and '+inf' aren't.

    is_negative() is an alias for is_neg() .

    is_int()

    1. print "$x is an integer\n" if $x->is_int();

    Return true if $x has a denominator of 1 (i.e. no fractional part), otherwise false. Please note that '-inf', 'inf' and 'NaN' aren't integers.

    is_odd()

    1. print "$x is odd\n" if $x->is_odd();

    Return true if $x is odd, otherwise false.

    is_even()

    1. print "$x is even\n" if $x->is_even();

    Return true if $x is even, otherwise false.

    bceil()

    1. $x->bceil();

    Set $x to the smallest integer value that is greater than or equal to $x (rounding toward +inf). Integer values are left unchanged.

    bfloor()

    1. $x->bfloor();

    Set $x to the largest integer value that is less than or equal to $x (rounding toward -inf). Integer values are left unchanged.
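    The difference between bfloor(), bceil() and as_int() only shows up for negative non-integers. A small sketch, assuming the semantics documented in this section (variable names are illustrative):

```perl
use strict;
use warnings;
use Math::BigRat;

my $x = Math::BigRat->new('-13/7');     # about -1.857

# bfloor() and bceil() modify their invocant, so work on copies.
my $floor = $x->copy()->bfloor();       # rounds toward -inf
my $ceil  = $x->copy()->bceil();        # rounds toward +inf
my $trunc = $x->as_int();               # truncates toward zero, returns a BigInt

print "floor $floor, ceil $ceil, as_int $trunc\n";
```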

    bsqrt()

    1. $x->bsqrt();

    Calculate the square root of $x.

    broot()

    1. $x->broot($n);

    Calculate the N'th root of $x.

    badd()/bmul()/bsub()/bdiv()/bdec()/binc()

    Please see the documentation in Math::BigInt.

    copy()

    1. my $z = $x->copy();

    Makes a deep copy of the object.

    Please see the documentation in Math::BigInt for further details.

    bstr()/bsstr()

        my $x = Math::BigRat->new('4/8');
        print $x->bstr(),"\n";                  # prints 1/2
        print $x->bsstr(),"\n";                 # prints 1/2

    Return a string representing this object.

    bacmp()/bcmp()

    Used to compare numbers.

    Please see the documentation in Math::BigInt for further details.

    blsft()/brsft()

    Used to shift numbers left/right.

    Please see the documentation in Math::BigInt for further details.

    bpow()

    1. $x->bpow($y);

    Compute $x ** $y.

    Please see the documentation in Math::BigInt for further details.

    bexp()

    1. $x->bexp($accuracy); # calculate e ** X

    Calculates two integers A and B so that A/B is equal to e ** $x , where e is Euler's number.

    This method was added in v0.20 of Math::BigRat (May 2007).

    See also blog().

    bnok()

    1. $x->bnok($y); # x over y (binomial coefficient n over k)

    Calculates the binomial coefficient n over k, also called the "choose" function. The result is equivalent to:

        ( n )      n!
        | - |  = -------
        ( k )    k!(n-k)!

    This method was added in v0.20 of Math::BigRat (May 2007).
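    As a sanity check of the formula above, this sketch computes 5 over 2 both with bnok() and with bfac(). The variable names are illustrative; both routes should agree for small non-negative integers.

```perl
use strict;
use warnings;
use Math::BigRat;

my $n = Math::BigRat->new(5);
my $k = Math::BigRat->new(2);

# bnok() modifies its invocant, so operate on a copy.
my $choose = $n->copy()->bnok($k);

# n! / (k! * (n-k)!) spelled out with bfac().
my $by_formula = $n->copy()->bfac()
               / ( $k->copy()->bfac() * ($n - $k)->bfac() );

print "5 over 2: $choose (by formula: $by_formula)\n";
```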

    config()

    1. use Data::Dumper;
    2. print Dumper ( Math::BigRat->config() );
    3. print Math::BigRat->config()->{lib},"\n";

    Returns a hash containing the configuration, e.g. the version number, lib loaded etc. The following hash keys are currently filled in with the appropriate information.

        key             RO/RW   Description
                                Example
        ============================================================
        lib             RO      Name of the Math library
                                Math::BigInt::Calc
        lib_version     RO      Version of 'lib'
                                0.30
        class           RO      The class of config you just called
                                Math::BigRat
        version         RO      version number of the class you used
                                0.10
        upgrade         RW      To which class numbers are upgraded
                                undef
        downgrade       RW      To which class numbers are downgraded
                                undef
        precision       RW      Global precision
                                undef
        accuracy        RW      Global accuracy
                                undef
        round_mode      RW      Global round mode
                                even
        div_scale       RW      Fallback accuracy for div
                                40
        trap_nan        RW      Trap creation of NaN (undef = no)
                                undef
        trap_inf        RW      Trap creation of +inf/-inf (undef = no)
                                undef

    By passing a reference to a hash you may set the configuration values. This works only for values that are marked RW above; anything else is read-only.

    objectify()

    This is an internal routine that turns scalars into objects.

    BUGS

    Some things are not yet implemented, or only implemented half-way:

    • inf handling (partial)
    • NaN handling (partial)
    • rounding (not implemented except for bceil/bfloor)
    • $x ** $y where $y is not an integer
    • bmod(), blog(), bmodinv() and bmodpow() (partial)

    LICENSE

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    Math::BigFloat and Math::Big as well as Math::BigInt::Pari and Math::BigInt::GMP.

    See http://search.cpan.org/search?dist=bignum for a way to use Math::BigRat.

    The package at http://search.cpan.org/search?dist=Math%3A%3ABigRat may contain more documentation and examples as well as testcases.

    AUTHORS

    (C) by Tels http://bloodgate.com/ 2001 - 2009.

    Currently maintained by Jonathan "Duke" Leto <jonathan@leto.net> http://leto.net

     
    Math::Complex

    NAME

    Math::Complex - complex numbers and associated mathematical functions

    SYNOPSIS

    1. use Math::Complex;
    2. $z = Math::Complex->make(5, 6);
    3. $t = 4 - 3*i + $z;
    4. $j = cplxe(1, 2*pi/3);

    DESCRIPTION

    This package lets you create and manipulate complex numbers. By default, Perl limits itself to real numbers, but an extra use statement brings full complex support, along with a full set of mathematical functions typically associated with and/or extended to complex numbers.

    If you wonder what complex numbers are, they were invented to be able to solve the following equation:

    1. x*x = -1

    and by definition, the solution is noted i (engineers use j instead since i usually denotes an intensity, but the name does not matter). The number i is a pure imaginary number.

    Arithmetic with pure imaginary numbers works just as you would expect with real numbers; you just have to remember that

    1. i*i = -1

    so you have:

    1. 5i + 7i = i * (5 + 7) = 12i
    2. 4i - 3i = i * (4 - 3) = i
    3. 4i * 2i = -8
    4. 6i / 2i = 3
    5. 1 / i = -i

    Complex numbers are numbers that have both a real part and an imaginary part, and are usually noted:

    1. a + bi

    where a is the real part and b is the imaginary part. The arithmetic with complex numbers is straightforward. You have to keep track of the real and the imaginary parts, but otherwise the rules used for real numbers just apply:

    1. (4 + 3i) + (5 - 2i) = (4 + 5) + i(3 - 2) = 9 + i
    2. (2 + i) * (4 - i) = 2*4 + 4i -2i -i*i = 8 + 2i + 1 = 9 + 2i
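    The two worked examples above can be reproduced with the overloaded operators. A minimal sketch using cplx(), Re() and Im() from this module (variable names are illustrative):

```perl
use strict;
use warnings;
use Math::Complex;

my $sum     = cplx(4, 3) + cplx(5, -2);    # (4 + 3i) + (5 - 2i)
my $product = cplx(2, 1) * cplx(4, -1);    # (2 + i) * (4 - i)

print "sum = $sum, product = $product\n";
printf "product has real part %g and imaginary part %g\n",
    Re($product), Im($product);
```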

    A graphical representation of complex numbers is possible in a plane (also called the complex plane, but it's really a 2D plane). The number

    1. z = a + bi

    is the point whose coordinates are (a, b). Actually, it would be the vector originating from (0, 0) to (a, b). It follows that the addition of two complex numbers is a vectorial addition.

    Since there is a bijection between a point in the 2D plane and a complex number (i.e. the mapping is unique and reciprocal), a complex number can also be uniquely identified with polar coordinates:

    1. [rho, theta]

    where rho is the distance to the origin, and theta the angle between the vector and the x axis. There is a notation for this using the exponential form, which is:

    1. rho * exp(i * theta)

    where i is the famous imaginary number introduced above. Conversion between this form and the cartesian form a + bi is immediate:

    1. a = rho * cos(theta)
    2. b = rho * sin(theta)

    which is also expressed by this formula:

    1. z = rho * exp(i * theta) = rho * (cos theta + i * sin theta)

    In other words, it's the projection of the vector onto the x and y axes. Mathematicians call rho the norm or modulus and theta the argument of the complex number. The norm of z is marked here as abs(z).
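    The conversion between the two forms can be checked numerically. This sketch builds the same number once from polar and once from cartesian components; the variable names are illustrative and the comparison is only approximate because of floating point rounding.

```perl
use strict;
use warnings;
use Math::Complex;

# Polar form: rho = 2, theta = pi/3.
my $z = cplxe(2, pi/3);
my ($re, $im) = (Re($z), Im($z));   # a = rho*cos(theta), b = rho*sin(theta)
printf "a = %.6f, b = %.6f\n", $re, $im;

# The same number in cartesian form: 1 + sqrt(3)*i.
my $w = cplx(1, sqrt(3));
my ($rho, $theta) = (abs($w), arg($w));
printf "rho = %.6f, theta = %.6f\n", $rho, $theta;
```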

    The polar notation (also known as the trigonometric representation) is much more handy for performing multiplications and divisions of complex numbers, whilst the cartesian notation is better suited for additions and subtractions. Real numbers are on the x axis, and therefore y or theta is zero or pi.

    All the common operations that can be performed on a real number have been defined to work on complex numbers as well, and are merely extensions of the operations defined on real numbers. This means they keep their natural meaning when there is no imaginary part, provided the number is within their definition set.

    For instance, the sqrt routine which computes the square root of its argument is only defined for non-negative real numbers and yields a non-negative real number (it is an application from R+ to R+). If we allow it to return a complex number, then it can be extended to negative real numbers to become an application from R to C (the set of complex numbers):

    1. sqrt(x) = x >= 0 ? sqrt(x) : sqrt(-x)*i

    It can also be extended to be an application from C to C, whilst its restriction to R behaves as defined above by using the following definition:

    1. sqrt(z = [r,t]) = sqrt(r) * exp(i * t/2)

    Indeed, a negative real number can be noted [x,pi] (the modulus x is always non-negative, so [x,pi] is really -x , a negative number) and the above definition states that

    1. sqrt([x,pi]) = sqrt(x) * exp(i*pi/2) = [sqrt(x),pi/2] = sqrt(x)*i

    which is exactly what we had defined for negative real numbers above. The sqrt returns only one of the solutions: if you want both, use the root function.
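    A brief sketch of the extended sqrt and of root() returning both square roots of a negative real number. Variable names are illustrative; the roots are floating point approximations.

```perl
use strict;
use warnings;
use Math::Complex;

# With Math::Complex loaded, sqrt() accepts negative reals.
my $principal = sqrt(-16);
print "sqrt(-16) = $principal\n";

# root(z, 2) returns the list of both square roots.
my @both = root(-16, 2);
print "both roots: @both\n";
```

    Squaring either element of @both gives -16 up to rounding error.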

    All the common mathematical functions defined on real numbers that are extended to complex numbers share that same property of working as usual when the imaginary part is zero (otherwise, it would not be called an extension, would it?).

    A new operation possible on a complex number that is the identity for real numbers is called the conjugate, and is noted with a horizontal bar above the number, or ~z here.

    1. z = a + bi
    2. ~z = a - bi

    Simple... Now look:

    1. z * ~z = (a + bi) * (a - bi) = a*a + b*b

    We saw that the norm of z was noted abs(z) and was defined as the distance to the origin, also known as:

    1. rho = abs(z) = sqrt(a*a + b*b)

    so

    1. z * ~z = abs(z) ** 2

    If z is a pure real number (i.e. b == 0 ), then the above yields:

    1. a * a = abs(a) ** 2

    which is true (abs has the regular meaning for real number, i.e. stands for the absolute value). This example explains why the norm of z is noted abs(z): it extends the abs function to complex numbers, yet is the regular abs we know when the complex number actually has no imaginary part... This justifies a posteriori our use of the abs notation for the norm.
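    The identity z * ~z = abs(z) ** 2 is easy to confirm with the overloaded ~ and abs. A minimal sketch with z = 3 + 4i (variable names are illustrative):

```perl
use strict;
use warnings;
use Math::Complex;

my $z    = cplx(3, 4);
my $conj = ~$z;             # 3 - 4i
my $prod = $z * $conj;      # a*a + b*b, with no imaginary part
my $norm = abs($z);         # distance to the origin

print "~z = $conj, z * ~z = $prod, abs(z) ** 2 = ", $norm ** 2, "\n";
```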

    OPERATIONS

    Given the following notations:

    1. z1 = a + bi = r1 * exp(i * t1)
    2. z2 = c + di = r2 * exp(i * t2)
    3. z = <any complex or real number>

    the following (overloaded) operations are supported on complex numbers:

        z1 + z2 = (a + c) + i(b + d)
        z1 - z2 = (a - c) + i(b - d)
        z1 * z2 = (r1 * r2) * exp(i * (t1 + t2))
        z1 / z2 = (r1 / r2) * exp(i * (t1 - t2))
        z1 ** z2 = exp(z2 * log z1)
        ~z1 = a - bi
        abs(z1) = r1 = sqrt(a*a + b*b)
        sqrt(z1) = sqrt(r1) * exp(i * t1/2)
        exp(z1) = exp(a) * exp(i * b)
        log(z1) = log(r1) + i*t1
        sin(z1) = 1/2i (exp(i * z1) - exp(-i * z1))
        cos(z1) = 1/2 (exp(i * z1) + exp(-i * z1))
        atan2(y, x) = atan(y / x) # Minding the right quadrant, note the order.

    The definition used for complex arguments of atan2() is

    1. -i log((x + iy)/sqrt(x*x+y*y))

    Note that atan2(0, 0) is not well-defined.
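    A few of the identities above can be spot-checked numerically. The following sketch picks two arbitrary values and measures how far each identity is off; the variable names are illustrative and the differences should be at the level of floating point rounding.

```perl
use strict;
use warnings;
use Math::Complex;

my $z1 = cplx(1, 2);
my $z2 = cplx(3, -1);

# The modulus of a product is the product of the moduli.
my $modulus_gap = abs(abs($z1 * $z2) - abs($z1) * abs($z2));

# z1 ** z2 is defined as exp(z2 * log z1).
my $power_gap = abs($z1 ** $z2 - exp($z2 * log($z1)));

# sin and cos keep their usual relationship for complex arguments.
my $trig_gap = abs(sin($z1) ** 2 + cos($z1) ** 2 - 1);

printf "gaps: %.3g %.3g %.3g\n", $modulus_gap, $power_gap, $trig_gap;
```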

    The following extra operations are supported on both real and complex numbers:

    1. Re(z) = a
    2. Im(z) = b
    3. arg(z) = t
    4. abs(z) = r
    5. cbrt(z) = z ** (1/3)
    6. log10(z) = log(z) / log(10)
    7. logn(z, n) = log(z) / log(n)
    8. tan(z) = sin(z) / cos(z)
    9. csc(z) = 1 / sin(z)
    10. sec(z) = 1 / cos(z)
    11. cot(z) = 1 / tan(z)
    12. asin(z) = -i * log(i*z + sqrt(1-z*z))
    13. acos(z) = -i * log(z + i*sqrt(1-z*z))
    14. atan(z) = i/2 * log((i+z) / (i-z))
    15. acsc(z) = asin(1 / z)
    16. asec(z) = acos(1 / z)
    17. acot(z) = atan(1 / z) = -i/2 * log((i+z) / (z-i))
    18. sinh(z) = 1/2 (exp(z) - exp(-z))
    19. cosh(z) = 1/2 (exp(z) + exp(-z))
    20. tanh(z) = sinh(z) / cosh(z) = (exp(z) - exp(-z)) / (exp(z) + exp(-z))
    21. csch(z) = 1 / sinh(z)
    22. sech(z) = 1 / cosh(z)
    23. coth(z) = 1 / tanh(z)
    24. asinh(z) = log(z + sqrt(z*z+1))
    25. acosh(z) = log(z + sqrt(z*z-1))
    26. atanh(z) = 1/2 * log((1+z) / (1-z))
    27. acsch(z) = asinh(1 / z)
    28. asech(z) = acosh(1 / z)
    29. acoth(z) = atanh(1 / z) = 1/2 * log((1+z) / (z-1))

    abs, arg, log, csc, cot, acsc, acot, csch, coth, acsch, acoth have aliases rho, theta, ln, cosec, cotan, acosec, acotan, cosech, cotanh, acosech, acotanh, respectively. Re, Im, arg, abs, rho, and theta can also be used as mutators. The cbrt returns only one of the solutions: if you want all three, use the root function.

    The root function is available to compute all the n roots of some complex, where n is a strictly positive integer. There are exactly n such roots, returned as a list. Getting the number mathematicians call j such that:

    1. 1 + j + j*j = 0;

    is a simple matter of writing:

        $j = (root(1, 3))[1];

    The kth root for z = [r,t] is given by:

    1. (root(z, n))[k] = r**(1/n) * exp(i * (t + 2*k*pi)/n)

    You can return the kth root directly by root(z, n, k) , indexing starting from zero and ending at n - 1.
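    The following sketch checks that the second cube root of unity satisfies 1 + j + j*j = 0, and that the three-argument form of root() returns the same value as indexing the list. Variable names are illustrative; the residual is a small floating point error rather than an exact zero.

```perl
use strict;
use warnings;
use Math::Complex;

my $j    = (root(1, 3))[1];     # index into the list of all three roots
my $same = root(1, 3, 1);       # ask for root number 1 directly

my $residual = abs(1 + $j + $j * $j);
printf "j = %s, |1 + j + j*j| = %.3g\n", $j, $residual;
```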

    The spaceship numeric comparison operator, <=>, is also defined. To ensure that its restriction to real numbers conforms to what you would expect, the comparison is run on the real part of the complex number first, and imaginary parts are compared only when the real parts match.
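    Because <=> is overloaded, complex numbers can be handed to sort directly. A minimal sketch showing the ordering rule (real part first, imaginary part as tie-breaker); the variable names are illustrative.

```perl
use strict;
use warnings;
use Math::Complex;

my @unsorted = (cplx(2, -1), cplx(1, 5), cplx(1, -3));
my @sorted   = sort { $a <=> $b } @unsorted;

# Show each number as "real/imaginary" to make the ordering explicit.
my $order = join(',', map { Re($_) . '/' . Im($_) } @sorted);
print "$order\n";
```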

    CREATION

    To create a complex number, use either:

    1. $z = Math::Complex->make(3, 4);
    2. $z = cplx(3, 4);

    if you know the cartesian form of the number, or

    1. $z = 3 + 4*i;

    if you like. To create a number using the polar form, use either:

    1. $z = Math::Complex->emake(5, pi/3);
    2. $x = cplxe(5, pi/3);

    instead. The first argument is the modulus, the second is the angle (in radians, the full circle is 2*pi). (Mnemonic: e is used as a notation for complex numbers in the polar form).

    It is possible to write:

    1. $x = cplxe(-3, pi/4);

    but that will be silently converted into [3,-3pi/4], since the modulus must be non-negative (it represents the distance to the origin in the complex plane).

    It is also possible to have a complex number as either argument of make, emake, cplx, and cplxe: the appropriate component of the argument will be used.

    1. $z1 = cplx(-2, 1);
    2. $z2 = cplx($z1, 4);

    The new, make, emake, cplx, and cplxe will also understand a single (string) argument of the forms

    1. 2-3i
    2. -3i
    3. [2,3]
    4. [2,-3pi/4]
    5. [2]

    in which case the appropriate cartesian and exponential components will be parsed from the string and used to create new complex numbers. The imaginary component and the theta, respectively, will default to zero.

    The new, make, emake, cplx, and cplxe will also understand the case of no arguments: this means plain zero or (0, 0).
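    A short sketch of three creation cases described in this section: a string argument, a negative modulus, and no arguments at all. Variable names are illustrative.

```perl
use strict;
use warnings;
use Math::Complex;

# Cartesian components parsed from a string.
my $from_string = Math::Complex->make('2-3i');

# A negative modulus is converted: [-3, pi/4] becomes [3, -3pi/4].
my $flipped = cplxe(-3, pi/4);

# No arguments means plain zero.
my $zero = cplx();

printf "%s has modulus %g and argument %.4f\n",
    $flipped, abs($flipped), arg($flipped);
print "from string: $from_string\n";
```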

    DISPLAYING

    When printed, a complex number is usually shown under its cartesian style a+bi, but there are legitimate cases where the polar style [r,t] is more appropriate. The process of converting the complex number into a string that can be displayed is known as stringification.

    By calling the class method Math::Complex::display_format and supplying either "polar" or "cartesian" as an argument, you override the default display style, which is "cartesian" . Not supplying any argument returns the current settings.

    This default can be overridden on a per-number basis by calling the display_format method instead. As before, not supplying any argument returns the current display style for this number. Otherwise whatever you specify will be the new display style for this particular number.

    For instance:

    1. use Math::Complex;
    2. Math::Complex::display_format('polar');
    3. $j = (root(1, 3))[1];
    4. print "j = $j\n"; # Prints "j = [1,2pi/3]"
    5. $j->display_format('cartesian');
    6. print "j = $j\n"; # Prints "j = -0.5+0.866025403784439i"

    The polar style attempts to emphasize arguments like k*pi/n (where n is a positive integer and k an integer within [-9, +9]), this is called polar pretty-printing.

    For the reverse of stringifying, see the make and emake .

    CHANGED IN PERL 5.6

    The display_format class method and the corresponding display_format object method can now be called using a parameter hash instead of just one parameter.

    The old display format style, which can have values "cartesian" or "polar" , can be changed using the "style" parameter.

    1. $j->display_format(style => "polar");

    The one parameter calling convention also still works.

    1. $j->display_format("polar");

    There are two new display parameters.

    The first one is "format", which is a sprintf()-style format string to be used for both numeric parts of the complex number(s). The default is somewhat system-dependent but most often it corresponds to "%.15g". You can revert to the default by setting the format to undef.

        # the $j from the above example
        $j->display_format('format' => '%.5f');
        print "j = $j\n"; # Prints "j = -0.50000+0.86603i"
        $j->display_format('format' => undef);
        print "j = $j\n"; # Prints "j = -0.5+0.866025403784439i"

    Notice that this affects also the return values of the display_format methods: in list context the whole parameter hash will be returned, as opposed to only the style parameter value. This is a potential incompatibility with earlier versions if you have been calling the display_format method in list context.

    The second new display parameter is "polar_pretty_print" , which can be set to true or false, the default being true. See the previous section for what this means.

    USAGE

    Thanks to overloading, the handling of arithmetics with complex numbers is simple and almost transparent.

    Here are some examples:

    1. use Math::Complex;
    2. $j = cplxe(1, 2*pi/3); # $j ** 3 == 1
    3. print "j = $j, j**3 = ", $j ** 3, "\n";
    4. print "1 + j + j**2 = ", 1 + $j + $j**2, "\n";
    5. $z = -16 + 0*i; # Force it to be a complex
    6. print "sqrt($z) = ", sqrt($z), "\n";
    7. $k = exp(i * 2*pi/3);
    8. print "$j - $k = ", $j - $k, "\n";
    9. $z->Re(3); # Re, Im, arg, abs,
    10. $j->arg(2); # (the last two aka rho, theta)
    11. # can be used also as mutators.

    CONSTANTS

    PI

    The constant pi and some handy multiples of it (pi2, pi4, and pip2 (pi/2) and pip4 (pi/4)) are also available if separately exported:

    1. use Math::Complex ':pi';
    2. $third_of_circle = pi2 / 3;

    Inf

    The floating point infinity can be exported as a subroutine Inf():

    1. use Math::Complex qw(Inf sinh);
    2. my $AlsoInf = Inf() + 42;
    3. my $AnotherInf = sinh(1e42);
    4. print "$AlsoInf is $AnotherInf\n" if $AlsoInf == $AnotherInf;

    Note that the stringified form of infinity varies between platforms: it can be for example any of

    1. inf
    2. infinity
    3. INF
    4. 1.#INF

    or it can be something else.

    Also note that in some platforms trying to use the infinity in arithmetic operations may result in Perl crashing because using an infinity causes SIGFPE or its moral equivalent to be sent. The way to ignore this is

    1. local $SIG{FPE} = sub { };

    ERRORS DUE TO DIVISION BY ZERO OR LOGARITHM OF ZERO

    The division (/) and the following functions

    1. log ln log10 logn
    2. tan sec csc cot
    3. atan asec acsc acot
    4. tanh sech csch coth
    5. atanh asech acsch acoth

    cannot be computed for all arguments because that would mean dividing by zero or taking logarithm of zero. These situations cause fatal runtime errors looking like this

    1. cot(0): Division by zero.
    2. (Because in the definition of cot(0), the divisor sin(0) is 0)
    3. Died at ...

    or

    1. atanh(-1): Logarithm of zero.
    2. Died at...

    For csc, cot, asec, acsc, acot, csch, coth, asech, acsch, and the logarithmic functions, the argument cannot be 0 (zero). For atanh and acoth, the argument cannot be 1 (one) or -1 (minus one). For atan and acot, the argument cannot be i (the imaginary unit) or -i (the negative imaginary unit). For tan, sec, tanh, the argument cannot be pi/2 + k * pi, where k is any integer. atan2(0, 0) is undefined, and if complex arguments are used for atan2(), a division by zero will happen if z1**2+z2**2 == 0.

    Note that because we are operating on approximations of real numbers, these errors can happen when merely `too close' to the singularities listed above.
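    Since these errors are fatal, code that may hit a singularity can wrap the call in eval. A minimal sketch, with illustrative variable names, that triggers both kinds of message shown above and keeps running:

```perl
use strict;
use warnings;
use Math::Complex;

# cot(0) divides by sin(0), which is zero.
my $cot = eval { cot(0) };
my $cot_error = $@;
print "cot(0) failed: $cot_error" unless defined $cot;

# atanh(-1) needs the logarithm of zero.
my $atanh = eval { atanh(-1) };
my $atanh_error = $@;
print "atanh(-1) failed: $atanh_error" unless defined $atanh;
```

    Matching on the message text, as the test of $@ would do, is fragile; checking whether the eval returned a defined value is usually enough.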

    ERRORS DUE TO INDIGESTIBLE ARGUMENTS

    The make and emake accept both real and complex arguments. When they cannot recognize the arguments they will die with error messages like the following

        Math::Complex::make: Cannot take real part of ...
        Math::Complex::make: Cannot take imaginary part of ...
        Math::Complex::emake: Cannot take rho of ...
        Math::Complex::emake: Cannot take theta of ...

    BUGS

    Saying use Math::Complex; exports many mathematical routines in the caller environment and even overrides some (sqrt, log, atan2). This is construed as a feature by the Authors, actually... ;-)

    All routines expect to be given real or complex numbers. Don't attempt to use BigFloat, since Perl has currently no rule to disambiguate a '+' operation (for instance) between two overloaded entities.

    In Cray UNICOS there is some strange numerical instability that results in root(), cos(), sin(), cosh(), sinh(), losing accuracy fast. Beware. The bug may be in UNICOS math libs, in UNICOS C compiler, in Math::Complex. Whatever it is, it does not manifest itself anywhere else where Perl runs.

    SEE ALSO

    Math::Trig

    AUTHORS

    Daniel S. Lewart <lewart!at!uiuc.edu>, Jarkko Hietaniemi <jhi!at!iki.fi>, Raphael Manfredi <Raphael_Manfredi!at!pobox.com>, Zefram <zefram@fysh.org>

    LICENSE

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Math::Trig

    NAME

    Math::Trig - trigonometric functions

    SYNOPSIS

    1. use Math::Trig;
    2. $x = tan(0.9);
    3. $y = acos(3.7);
    4. $z = asin(2.4);
    5. $halfpi = pi/2;
    6. $rad = deg2rad(120);
    7. # Import constants pi2, pip2, pip4 (2*pi, pi/2, pi/4).
    8. use Math::Trig ':pi';
    9. # Import the conversions between cartesian/spherical/cylindrical.
    10. use Math::Trig ':radial';
    11. # Import the great circle formulas.
    12. use Math::Trig ':great_circle';

    DESCRIPTION

    Math::Trig defines many trigonometric functions not defined by the core Perl which defines only the sin() and cos(). The constant pi is also defined as are a few convenience functions for angle conversions, and great circle formulas for spherical movement.

    TRIGONOMETRIC FUNCTIONS

    The tangent

    • tan

    The cofunctions of the sine, cosine, and tangent (cosec/csc and cotan/cot are aliases)

    csc, cosec, sec, cot, cotan

    The arcus (also known as the inverse) functions of the sine, cosine, and tangent

    asin, acos, atan

    The principal value of the arc tangent of y/x. Note that atan2(0, 0) is not well-defined.

    atan2(y, x)

    The arcus cofunctions of the sine, cosine, and tangent (acosec/acsc and acotan/acot are aliases)

    acsc, acosec, asec, acot, acotan

    The hyperbolic sine, cosine, and tangent

    sinh, cosh, tanh

    The cofunctions of the hyperbolic sine, cosine, and tangent (cosech/csch and cotanh/coth are aliases)

    csch, cosech, sech, coth, cotanh

    The area (also known as the inverse) functions of the hyperbolic sine, cosine, and tangent

    asinh, acosh, atanh

    The area cofunctions of the hyperbolic sine, cosine, and tangent (acsch/acosech and acoth/acotanh are aliases)

    acsch, acosech, asech, acoth, acotanh

    The trigonometric constant pi and some handy multiples of it are also defined.

    pi, pi2, pi4, pip2, pip4
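    A small sketch using a few of the functions listed above on a 45 degree angle. The variable names are illustrative and the results are floating point approximations.

```perl
use strict;
use warnings;
use Math::Trig;

my $angle   = deg2rad(45);          # pi/4 radians
my $tangent = tan($angle);          # close to 1
my $secant  = sec($angle);          # close to sqrt(2)
my $back    = rad2deg(atan(1));     # close to 45

printf "tan = %.6f, sec = %.6f, atan(1) = %.6f degrees\n",
    $tangent, $secant, $back;
```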

    ERRORS DUE TO DIVISION BY ZERO

    The following functions

    1. acoth
    2. acsc
    3. acsch
    4. asec
    5. asech
    6. atanh
    7. cot
    8. coth
    9. csc
    10. csch
    11. sec
    12. sech
    13. tan
    14. tanh

    cannot be computed for all arguments because that would mean dividing by zero or taking logarithm of zero. These situations cause fatal runtime errors looking like this

    1. cot(0): Division by zero.
    2. (Because in the definition of cot(0), the divisor sin(0) is 0)
    3. Died at ...

    or

    1. atanh(-1): Logarithm of zero.
    2. Died at...

    For the csc , cot , asec , acsc , acot , csch , coth , asech , acsch , the argument cannot be 0 (zero). For the atanh , acoth , the argument cannot be 1 (one). For the atanh , acoth , the argument cannot be -1 (minus one). For the tan , sec , tanh , sech , the argument cannot be pi/2 + k * pi, where k is any integer.

    Note that atan2(0, 0) is not well-defined.

    SIMPLE (REAL) ARGUMENTS, COMPLEX RESULTS

    Please note that some of the trigonometric functions can break out from the real axis into the complex plane. For example asin(2) has no definition for plain real numbers but it has definition for complex numbers.

    In Perl terms this means that supplying the usual Perl numbers (also known as scalars, please see perldata) as input for the trigonometric functions might produce results that are no longer simple real numbers: instead they are complex numbers.

    Math::Trig handles this by using the Math::Complex package, which knows how to handle complex numbers; please see Math::Complex for more information. In practice you need not worry about getting complex numbers as results because Math::Complex takes care of details such as how to display complex numbers. For example:

    1. print asin(2), "\n";

    should produce something like this (take or leave few last decimals):

    1. 1.5707963267949-1.31695789692482i

    That is, a complex number with the real part of approximately 1.571 and the imaginary part of approximately -1.317 .
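
If code needs to know whether a result stayed on the real axis, one option is to check whether the value is a Math::Complex object. This is a small sketch under that assumption; the helper name describe is hypothetical.

```perl
use strict;
use warnings;
use Math::Trig;

# Report whether a result is a plain Perl number or a Math::Complex
# object. describe is an illustrative helper, not part of Math::Trig.
sub describe {
    my ($value) = @_;
    return ref($value) eq 'Math::Complex'
        ? sprintf('complex: re=%.4f im=%.4f', $value->Re, $value->Im)
        : sprintf('real: %.4f', $value);
}

print describe(asin(0.5)), "\n";   # argument within [-1, 1]
print describe(asin(2)),   "\n";   # argument outside [-1, 1]
```

The Re and Im methods come from Math::Complex and return the real and imaginary parts of the object.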

    PLANE ANGLE CONVERSIONS

    (Plane, 2-dimensional) angles may be converted with the following functions.

    • deg2rad
      $radians = deg2rad($degrees);
    • grad2rad
      $radians = grad2rad($gradians);
    • rad2deg
      $degrees = rad2deg($radians);
    • grad2deg
      $degrees = grad2deg($gradians);
    • deg2grad
      $gradians = deg2grad($degrees);
    • rad2grad
      $gradians = rad2grad($radians);

    The full circle is 2 pi radians or 360 degrees or 400 gradians. The result is by default wrapped to be inside the [0, {2pi,360,400}[ circle. If you don't want this, supply a true second argument:

    $zillions_of_radians = deg2rad($zillions_of_degrees, 1);
    $negative_degrees = rad2deg($negative_radians, 1);

    You can also do the wrapping explicitly by rad2rad(), deg2deg(), and grad2grad().

    • rad2rad
      $radians_wrapped_by_2pi = rad2rad($radians);
    • deg2deg
      $degrees_wrapped_by_360 = deg2deg($degrees);
    • grad2grad
      $gradians_wrapped_by_400 = grad2grad($gradians);
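
The effect of the optional second argument can be seen with an angle larger than one full turn. This is a short sketch using 450 degrees (one turn plus a quarter turn) as an arbitrary sample value.

```perl
use strict;
use warnings;
use Math::Trig;

my $wrapped   = deg2rad(450);      # wrapped into a single turn
my $unwrapped = deg2rad(450, 1);   # true second argument: no wrapping

printf "wrapped:   %.6f (pi/2   is %.6f)\n", $wrapped,   pi / 2;
printf "unwrapped: %.6f (5*pi/2 is %.6f)\n", $unwrapped, 5 * pi / 2;
printf "450 degrees in gradians, wrapped: %.1f\n", deg2grad(450);
```

Comparisons of the results should allow for a small floating-point tolerance rather than test for exact equality.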

    RADIAL COORDINATE CONVERSIONS

    Radial coordinate systems are the spherical and the cylindrical systems, explained shortly in more detail.

    You can import radial coordinate conversion functions by using the :radial tag:

    use Math::Trig ':radial';
    ($rho, $theta, $z) = cartesian_to_cylindrical($x, $y, $z);
    ($rho, $theta, $phi) = cartesian_to_spherical($x, $y, $z);
    ($x, $y, $z) = cylindrical_to_cartesian($rho, $theta, $z);
    ($rho_s, $theta, $phi) = cylindrical_to_spherical($rho_c, $theta, $z);
    ($x, $y, $z) = spherical_to_cartesian($rho, $theta, $phi);
    ($rho_c, $theta, $z) = spherical_to_cylindrical($rho_s, $theta, $phi);

    All angles are in radians.

    COORDINATE SYSTEMS

    Cartesian coordinates are the usual rectangular (x, y, z)-coordinates.

    Spherical coordinates, (rho, theta, phi), are three-dimensional coordinates which define a point in three-dimensional space. They are based on a sphere surface. The radius of the sphere is rho, also known as the radial coordinate. The angle in the xy-plane (around the z-axis) is theta, also known as the azimuthal coordinate. The angle from the z-axis is phi, also known as the polar coordinate. The North Pole is therefore 0, 0, rho, and the Gulf of Guinea (think of the missing big chunk of Africa) 0, pi/2, rho. In geographical terms phi is latitude (northward positive, southward negative) and theta is longitude (eastward positive, westward negative).

    BEWARE: some texts define theta and phi the other way round, some texts define the phi to start from the horizontal plane, some texts use r in place of rho.

    Cylindrical coordinates, (rho, theta, z), are three-dimensional coordinates which define a point in three-dimensional space. They are based on a cylinder surface. The radius of the cylinder is rho, also known as the radial coordinate. The angle in the xy-plane (around the z-axis) is theta, also known as the azimuthal coordinate. The third coordinate is the z, pointing up from the theta-plane.

    3-D ANGLE CONVERSIONS

    Conversions to and from spherical and cylindrical coordinates are available. Please notice that the conversions are not necessarily reversible because of the equalities like pi angles being equal to -pi angles.

    • cartesian_to_cylindrical
      ($rho, $theta, $z) = cartesian_to_cylindrical($x, $y, $z);
    • cartesian_to_spherical
      ($rho, $theta, $phi) = cartesian_to_spherical($x, $y, $z);
    • cylindrical_to_cartesian
      ($x, $y, $z) = cylindrical_to_cartesian($rho, $theta, $z);
    • cylindrical_to_spherical
      ($rho_s, $theta, $phi) = cylindrical_to_spherical($rho_c, $theta, $z);

      Notice that when $z is not 0, $rho_s is not equal to $rho_c.

    • spherical_to_cartesian
      ($x, $y, $z) = spherical_to_cartesian($rho, $theta, $phi);
    • spherical_to_cylindrical
      ($rho_c, $theta, $z) = spherical_to_cylindrical($rho_s, $theta, $phi);

      Notice that when $z is not 0, $rho_c is not equal to $rho_s.
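
As an illustration of these conversions, the sketch below converts the arbitrary sample point (1, 1, 1) to both radial systems and back again. It also shows the two different meanings of rho: the spherical radius is sqrt(3), while the cylindrical radius is sqrt(2).

```perl
use strict;
use warnings;
use Math::Trig ':radial';

my ($x, $y, $z) = (1, 1, 1);   # arbitrary sample point

my ($rho_s, $theta, $phi)  = cartesian_to_spherical($x, $y, $z);
my ($rho_c, $theta_c, $zc) = cartesian_to_cylindrical($x, $y, $z);
my @back = spherical_to_cartesian($rho_s, $theta, $phi);

printf "spherical:   rho=%.4f theta=%.4f phi=%.4f\n", $rho_s, $theta, $phi;
printf "cylindrical: rho=%.4f theta=%.4f z=%.4f\n",   $rho_c, $theta_c, $zc;
printf "round trip:  x=%.4f y=%.4f z=%.4f\n", @back;
```

The round trip reproduces the starting point only up to floating-point error, and for angles at the edge of the range it may return an equivalent rather than identical angle.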

    GREAT CIRCLE DISTANCES AND DIRECTIONS

    A great circle is the intersection of a sphere with a plane through the center of the sphere: the shortest path between two (non-antipodal) points on the spherical surface runs along the great circle connecting those two points.

    great_circle_distance

    You can compute spherical distances, called great circle distances, by importing the great_circle_distance() function:

    use Math::Trig 'great_circle_distance';
    $distance = great_circle_distance($theta0, $phi0, $theta1, $phi1 [, $rho]);

    The great circle distance is the shortest distance between two points on a sphere. The distance is in $rho units. The $rho argument is optional and defaults to 1 (the unit sphere), in which case the distance is in radians.

    If you think geographically, the theta are longitudes: zero at the Greenwich meridian, eastward positive, westward negative. The phi correspond to latitudes, but NOTE: this formula thinks in mathematics, not geographically: the phi zero is at the North Pole, not at the Equator on the west coast of Africa (Gulf of Guinea). You need to subtract your geographical latitudes from pi/2 (also known as 90 degrees).

    $distance = great_circle_distance($lon0, pi/2 - $lat0,
                                      $lon1, pi/2 - $lat1, $rho);

    great_circle_direction

    The direction you must follow the great circle (also known as bearing) can be computed by the great_circle_direction() function:

    use Math::Trig 'great_circle_direction';
    $direction = great_circle_direction($theta0, $phi0, $theta1, $phi1);

    great_circle_bearing

    The alias great_circle_bearing for great_circle_direction is also available.

    use Math::Trig 'great_circle_bearing';
    $direction = great_circle_bearing($theta0, $phi0, $theta1, $phi1);

    The result of great_circle_direction is in radians, zero indicating straight north, pi or -pi straight south, pi/2 straight east, and -pi/2 straight west.

    great_circle_destination

    You can inversely compute the destination if you know the starting point, direction, and distance:

    use Math::Trig 'great_circle_destination';
    # $diro is the original direction,
    # for example from great_circle_bearing().
    # $distance is the angular distance in radians,
    # for example from great_circle_distance().
    # $thetad and $phid are the destination coordinates,
    # $dird is the final direction at the destination.
    ($thetad, $phid, $dird) =
        great_circle_destination($theta, $phi, $diro, $distance);

    or the midpoint if you know the end points:

    great_circle_midpoint

    use Math::Trig 'great_circle_midpoint';
    ($thetam, $phim) =
        great_circle_midpoint($theta0, $phi0, $theta1, $phi1);

    The great_circle_midpoint() is just a special case of

    great_circle_waypoint

    use Math::Trig 'great_circle_waypoint';
    ($thetai, $phii) =
        great_circle_waypoint($theta0, $phi0, $theta1, $phi1, $way);

    Where the $way is a value from zero ($theta0, $phi0) to one ($theta1, $phi1). Note that antipodal points (where their distance is pi radians) do not have waypoints between them (they would have an "equator" between them), and therefore undef is returned for antipodal points. If the points are the same, so that the distance is zero and all waypoints are identical, the first point (either point) is returned.

    The thetas, phis, direction, and distance in the above are all in radians.

    You can import all the great circle formulas by

    use Math::Trig ':great_circle';

    Notice that the resulting directions might be somewhat surprising if you are looking at a flat worldmap: in such map projections the great circles quite often do not look like the shortest routes -- but for example the shortest possible routes from Europe or North America to Asia do often cross the polar regions. (The common Mercator projection does not show great circles as straight lines: straight lines in the Mercator projection are lines of constant bearing.)
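
The relationship between the midpoint and waypoint functions can be checked numerically. The sketch below uses two arbitrary sample points (20 N 10 E and 50 N 60 E) and assumes that a waypoint at fraction f lies at f times the total distance from the start.

```perl
use strict;
use warnings;
use Math::Trig qw(:great_circle deg2rad);

# Points are (theta, phi): longitude, and angle from the North Pole.
my @from = (deg2rad(10), deg2rad(90 - 20));   # 20 N, 10 E
my @to   = (deg2rad(60), deg2rad(90 - 50));   # 50 N, 60 E

my @mid     = great_circle_midpoint(@from, @to);
my @halfway = great_circle_waypoint(@from, @to, 0.5);
my @quarter = great_circle_waypoint(@from, @to, 0.25);

my $total = great_circle_distance(@from, @to);
my $part  = great_circle_distance(@from, @quarter);

printf "midpoint: theta=%.6f phi=%.6f\n", @mid;
printf "halfway:  theta=%.6f phi=%.6f\n", @halfway;
printf "quarter leg / total distance = %.4f\n", $part / $total;
```

Both results are returned in the same (theta, phi) convention as the inputs, so they can be passed straight back into the other great circle functions.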

    EXAMPLES

    To calculate the distance between London (51.3N 0.5W) and Tokyo (35.7N 139.8E) in kilometers:

    use Math::Trig qw(great_circle_distance deg2rad);
    # Notice the 90 - latitude: phi zero is at the North Pole.
    sub NESW { deg2rad($_[0]), deg2rad(90 - $_[1]) }
    my @L = NESW( -0.5, 51.3);
    my @T = NESW(139.8, 35.7);
    my $km = great_circle_distance(@L, @T, 6378); # About 9600 km.

    The direction you would have to go from London to Tokyo (in radians, straight north being zero, straight east being pi/2).

    use Math::Trig qw(great_circle_direction);
    my $rad = great_circle_direction(@L, @T); # About 0.547 or 0.174 pi.

    The midpoint between London and Tokyo being

    use Math::Trig qw(great_circle_midpoint);
    my @M = great_circle_midpoint(@L, @T);

    or about 69 N 89 E, in the frozen wastes of Siberia.
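
The three London and Tokyo snippets can be combined into one script. The sketch below also converts the midpoint back to geographical degrees; that conversion (latitude is 90 degrees minus phi) is the reverse of the NESW helper and is an addition for illustration.

```perl
use strict;
use warnings;
use Math::Trig qw(great_circle_distance great_circle_direction
                  great_circle_midpoint deg2rad rad2deg);

# Convert (longitude, latitude) in degrees to (theta, phi) in radians.
sub NESW { deg2rad($_[0]), deg2rad(90 - $_[1]) }

my @L = NESW( -0.5, 51.3);   # London
my @T = NESW(139.8, 35.7);   # Tokyo

my $km  = great_circle_distance(@L, @T, 6378);
my $dir = great_circle_direction(@L, @T);
my ($theta_m, $phi_m) = great_circle_midpoint(@L, @T);

# Back to geographical degrees: latitude is 90 minus the polar angle.
my $mid_lon = rad2deg($theta_m);
my $mid_lat = 90 - rad2deg($phi_m);

printf "distance:  %.0f km\n", $km;
printf "direction: %.3f rad\n", $dir;
printf "midpoint:  %.1f N %.1f E\n", $mid_lat, $mid_lon;
```

The radius 6378 is the kilometre figure used in the example above; a different Earth radius changes the distance proportionally.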

    NOTE: you cannot get from A to B like this:

    Dist = great_circle_distance(A, B)
    Dir = great_circle_direction(A, B)
    C = great_circle_destination(A, Dir, Dist)

    and expect C to be B, because the bearing constantly changes when going from A to B (except in some special case like the meridians or the circles of latitudes) and in great_circle_destination() one gives a constant bearing to follow.

    CAVEAT FOR GREAT CIRCLE FORMULAS

    The answers may be off by a few tenths of a percent because of the irregular (slightly aspherical) form of the Earth. The errors are at worst about 0.55%, but generally below 0.3%.

    Real-valued asin and acos

    For small inputs asin() and acos() may return complex numbers even when real numbers would be enough and correct; this happens because of floating-point inaccuracies. You can see these inaccuracies, for example, by trying these:

    print cos(1e-6)**2+sin(1e-6)**2 - 1,"\n";
    printf "%.20f\n", cos(1e-6)**2+sin(1e-6)**2;

    which will print something like this

    -1.11022302462516e-16
    0.99999999999999988898

    even though the expected results are of course exactly zero and one. The formulas used to compute asin() and acos() are quite sensitive to this, and therefore they might accidentally slip into the complex plane even when they should not. To counter this there are two interfaces that are guaranteed to return a real-valued output.

    • asin_real
      use Math::Trig qw(asin_real);
      $real_angle = asin_real($input_sin);

      Return a real-valued arc sine if the input is in the range [-1, 1], endpoints included. For inputs greater than one, pi/2 is returned. For inputs less than minus one, -pi/2 is returned.

    • acos_real
      use Math::Trig qw(acos_real);
      $real_angle = acos_real($input_cos);

      Return a real-valued arc cosine if the input is in the range [-1, 1], endpoints included. For inputs greater than one, zero is returned. For inputs less than minus one, pi is returned.
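
The clamping behaviour described above can be demonstrated with an input that overshoots 1 by a rounding-sized amount. This is a minimal sketch; the value 1 + 1e-15 is just a stand-in for accumulated floating-point error.

```perl
use strict;
use warnings;
use Math::Trig qw(asin_real acos_real pi);

# A value that should be exactly 1 but overshoots slightly.
my $almost_one = 1 + 1e-15;

printf "asin_real(1 + 1e-15) = %.6f\n", asin_real($almost_one);
printf "acos_real(1 + 1e-15) = %.6f\n", acos_real($almost_one);
printf "acos_real(-2)        = %.6f\n", acos_real(-2);
printf "asin_real(0.5)       = %.6f\n", asin_real(0.5);
```

Note that clamping also hides genuinely out-of-range inputs such as -2, so these functions are best used where a small overshoot is the only expected problem.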

    BUGS

    Saying use Math::Trig; exports many mathematical routines into the caller's environment and even overrides some (sin, cos). This is construed as a feature by the Authors, actually... ;-)

    The code is not optimized for speed, especially because we use Math::Complex and thus go quite near complex numbers while doing the computations even when the arguments are not. This, however, cannot be completely avoided if we want things like asin(2) to give an answer instead of giving a fatal runtime error.

    Do not attempt navigation using these formulas.

    SEE ALSO

    Math::Complex

    AUTHORS

    Jarkko Hietaniemi <jhi!at!iki.fi>, Raphael Manfredi <Raphael_Manfredi!at!pobox.com>, Zefram <zefram@fysh.org>

    LICENSE

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Math::BigInt::Calc

    NAME

    Math::BigInt::Calc - Pure Perl module to support Math::BigInt

    SYNOPSIS

    This library provides support for big integer calculations. It is not intended to be used by other modules. Other modules which support the same API (see below) can also be used to support Math::BigInt, like Math::BigInt::GMP and Math::BigInt::Pari.

    DESCRIPTION

    In this library, the numbers are represented in base B = 10**N, where N is the largest possible value that does not cause overflow in the intermediate computations. The base B elements are stored in an array, with the least significant element stored in array element zero. There are no leading zero elements, except a single zero element when the number is zero.

    For instance, if B = 10000, the number 1234567890 is represented internally as [7890, 3456, 12].
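
The storage layout can be illustrated with a few lines of Perl. This is only a sketch of the idea, not the library's own code; the function name to_elements is hypothetical, and the real library chooses N itself rather than taking it as a parameter.

```perl
use strict;
use warnings;

# Split a normalized decimal string into base 10**$n elements,
# least significant element first. to_elements is an illustrative
# helper, not part of Math::BigInt::Calc.
sub to_elements {
    my ($str, $n) = @_;
    my @elements;
    while (length($str) > $n) {
        # Four-argument substr removes and returns the last $n digits.
        push @elements, substr($str, -$n, $n, '');
    }
    push @elements, $str;
    return map { $_ + 0 } @elements;   # numify: "0012" becomes 12
}

print join(', ', to_elements('1234567890', 4)), "\n";   # 7890, 3456, 12
```

Keeping the least significant element at index zero means that carries during addition propagate towards higher array indices.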

    THE Math::BigInt API

    In order to allow for multiple big integer libraries, Math::BigInt was rewritten to use a plug-in library for core math routines. Any module which conforms to the API can be used by Math::BigInt by using this in your program:

    use Math::BigInt lib => 'libname';

    'libname' is either the long name, like 'Math::BigInt::Pari', or only the short version, like 'Pari'.
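
To confirm which library was actually loaded, the configuration of Math::BigInt can be inspected. The sketch below selects the pure Perl Calc library by its short name and assumes that Math::BigInt->config() returns a hash reference with a lib entry.

```perl
use strict;
use warnings;
use Math::BigInt lib => 'Calc';

my $x = Math::BigInt->new('123456789012345678901234567890');

# Assumption: config() returns a hash reference describing the setup.
print "backend: ", Math::BigInt->config()->{lib}, "\n";
print "x * 2 = ", $x->copy->bmul(2), "\n";
```

The calling code is the same whichever library is selected; only the speed of the underlying operations differs.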

    General Notes

    A library only needs to deal with unsigned big integers. Testing of input parameter validity is done by the caller, so there is no need to worry about underflow (e.g., in _sub() and _dec() ) nor about division by zero (e.g., in _div() ) or similar cases.

    For some methods, the first parameter can be modified. That includes the possibility that you return a reference to a completely different object instead, although keeping the reference and just changing its contents is preferred over creating and returning a different reference.

    Return values are always objects, strings, Perl scalars, or true/false for comparison routines.

    API version 1

    The following methods must be defined in order to support the use by Math::BigInt v1.70 or later.

    API version

    • api_version()

      Return API version as a Perl scalar, 1 for Math::BigInt v1.70, 2 for Math::BigInt v1.83.

    Constructors

    • _new(STR)

      Convert a string representing an unsigned decimal number to an object representing the same number. The input is normalized, i.e., it matches ^(0|[1-9]\d*)$.

    • _zero()

      Return an object representing the number zero.

    • _one()

      Return an object representing the number one.

    • _two()

      Return an object representing the number two.

    • _ten()

      Return an object representing the number ten.

    • _from_bin(STR)

      Return an object given a string representing a binary number. The input has a '0b' prefix and matches the regular expression ^0[bB](0|1[01]*)$ .

    • _from_oct(STR)

      Return an object given a string representing an octal number. The input has a '0' prefix and matches the regular expression ^0[1-7]*$ .

    • _from_hex(STR)

      Return an object given a string representing a hexadecimal number. The input has a '0x' prefix and matches the regular expression ^0x(0|[1-9a-fA-F][\da-fA-F]*)$ .

    Mathematical functions

    Each of these methods may modify the first input argument, except _gcd(), which shall not modify any input argument, and _sub(), which may modify the second input argument.

    • _add(OBJ1, OBJ2)

      Returns the result of adding OBJ2 to OBJ1.

    • _mul(OBJ1, OBJ2)

      Returns the result of multiplying OBJ2 and OBJ1.

    • _div(OBJ1, OBJ2)

      Returns the result of dividing OBJ1 by OBJ2 and truncating the result to an integer.

    • _sub(OBJ1, OBJ2, FLAG)
    • _sub(OBJ1, OBJ2)

      Returns the result of subtracting OBJ2 from OBJ1. If FLAG is false or omitted, OBJ1 might be modified. If FLAG is true, OBJ2 might be modified.

    • _dec(OBJ)

      Decrement OBJ by one.

    • _inc(OBJ)

      Increment OBJ by one.

    • _mod(OBJ1, OBJ2)

      Return OBJ1 modulo OBJ2, i.e., the remainder after dividing OBJ1 by OBJ2.

    • _sqrt(OBJ)

      Return the square root of the object, truncated to integer.

    • _root(OBJ, N)

      Return Nth root of the object, truncated to int. N is >= 3.

    • _fac(OBJ)

      Return factorial of object (1*2*3*4*...).

    • _pow(OBJ1, OBJ2)

      Return OBJ1 to the power of OBJ2. By convention, 0**0 = 1.

    • _modinv(OBJ1, OBJ2)

      Return modular multiplicative inverse, i.e., return OBJ3 so that

      (OBJ3 * OBJ1) % OBJ2 = 1 % OBJ2

      The result is returned as two arguments. If the modular multiplicative inverse does not exist, both arguments are undefined. Otherwise, the arguments are a number (object) and its sign ("+" or "-").

      The output value, with its sign, must either be a positive value in the range 1,2,...,OBJ2-1 or the same value minus OBJ2. For instance, if the input arguments are objects representing the numbers 7 and 5, the method must either return an object representing the number 3 and a "+" sign, since (3*7) % 5 = 1 % 5, or an object representing the number 2 and a "-" sign, since (-2*7) % 5 = 1 % 5.

    • _modpow(OBJ1, OBJ2, OBJ3)

      Return modular exponentiation, (OBJ1 ** OBJ2) % OBJ3.

    • _rsft(OBJ, N, B)

      Shift object N digits right in base B and return the resulting object. This is equivalent to performing integer division by B**N and discarding the remainder, except that it might be much faster, depending on how the number is represented internally.

      For instance, if the object $obj represents the hexadecimal number 0xabcde, then _rsft($obj, 2, 16) returns an object representing the number 0xabc. The "remainder", 0xde, is discarded and not returned.

    • _lsft(OBJ, N, B)

      Shift the object N digits left in base B. This is equivalent to multiplying by B**N, except that it might be much faster, depending on how the number is represented internally.

    • _log_int(OBJ, B)

      Return the integer log of OBJ to base B. This method has two output arguments, an OBJECT and a STATUS. The STATUS is a Perl scalar; it is 1 if the returned OBJECT is the exact result, 0 if the result was truncated to give the returned OBJECT, and undef if it is unknown whether the returned OBJECT is the exact result.

    • _gcd(OBJ1, OBJ2)

      Return the greatest common divisor of OBJ1 and OBJ2.
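
Since the library methods are not meant to be called directly, the worked examples above are easiest to reproduce through the public Math::BigInt methods that rely on them. The sketch below uses bmodinv, bmodpow, brsft and bdiv; the modular exponentiation operands (4, 13, 497) are arbitrary sample values.

```perl
use strict;
use warnings;
use Math::BigInt;

# Modular inverse: find $inv such that ($inv * 7) % 5 == 1.
my $inv = Math::BigInt->new(7)->bmodinv(5);

# Modular exponentiation: (4 ** 13) % 497.
my $pow = Math::BigInt->new(4)->bmodpow(13, 497);

# Shifting right by 2 digits in base 16 versus dividing by 16**2.
my $shifted = Math::BigInt->new('0xabcde')->brsft(2, 16);
my $divided = Math::BigInt->new('0xabcde')->bdiv(16**2);

print "inverse of 7 mod 5: $inv\n";
print "4**13 mod 497:      $pow\n";
print "shifted: ", $shifted->as_hex, ", divided: ", $divided->as_hex, "\n";
```

The shift and the division give the same quotient; which one is faster depends on how the library stores its numbers.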

    Bitwise operators

    Each of these methods may modify the first input argument.

    • _and(OBJ1, OBJ2)

      Return bitwise and. If necessary, the smaller number is padded with leading zeros.

    • _or(OBJ1, OBJ2)

      Return bitwise or. If necessary, the smaller number is padded with leading zeros.

    • _xor(OBJ1, OBJ2)

      Return bitwise exclusive or. If necessary, the smaller number is padded with leading zeros.

    Boolean operators

    • _is_zero(OBJ)

      Returns a true value if OBJ is zero, and false value otherwise.

    • _is_one(OBJ)

      Returns a true value if OBJ is one, and false value otherwise.

    • _is_two(OBJ)

      Returns a true value if OBJ is two, and false value otherwise.

    • _is_ten(OBJ)

      Returns a true value if OBJ is ten, and false value otherwise.

    • _is_even(OBJ)

      Return a true value if OBJ is an even integer, and a false value otherwise.

    • _is_odd(OBJ)

      Return a true value if OBJ is an odd integer, and a false value otherwise.

    • _acmp(OBJ1, OBJ2)

      Compare OBJ1 and OBJ2 and return -1, 0, or 1, if OBJ1 is less than, equal to, or larger than OBJ2, respectively.

    String conversion

    • _str(OBJ)

      Return a string representing the object. The returned string should have no leading zeros, i.e., it should match ^(0|[1-9]\d*)$ .

    • _as_bin(OBJ)

      Return the binary string representation of the number. The string must have a '0b' prefix.

    • _as_oct(OBJ)

      Return the octal string representation of the number. The string must have a '0' prefix.

      Note: This method was required from Math::BigInt version 1.78, but the required API version number was not incremented, so there are older libraries that support API version 1, but do not support _as_oct() .

    • _as_hex(OBJ)

      Return the hexadecimal string representation of the number. The string must have a '0x' prefix.

    Numeric conversion

    • _num(OBJ)

      Given an object, return a Perl scalar number (int/float) representing this number.

    Miscellaneous

    • _copy(OBJ)

      Return a true copy of the object.

    • _len(OBJ)

      Returns the number of the decimal digits in the number. The output is a Perl scalar.

    • _zeros(OBJ)

      Return the number of trailing decimal zeros. The output is a Perl scalar.

    • _digit(OBJ, N)

      Return the Nth digit as a Perl scalar. N is a Perl scalar, where zero refers to the rightmost (least significant) digit, and negative values count from the left (most significant digit). If $obj represents the number 123, then _digit($obj, 0) is 3 and _digit($obj, -1) is 1.

    • _check(OBJ)

      Return a true value if the object is OK, and a false value otherwise. This is a check routine to test the internal state of the object for corruption.
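
The digit numbering and the meaning of a true copy can be tried out through the corresponding public Math::BigInt methods (length, digit, copy). This is a small sketch using the number 123 from the description above.

```perl
use strict;
use warnings;
use Math::BigInt;

my $x = Math::BigInt->new(123);

# length() returns a list in list context, so force scalar context.
my $len   = scalar $x->length;
my $last  = $x->digit(0);    # rightmost (least significant) digit
my $first = $x->digit(-1);   # leftmost (most significant) digit

# A true copy is independent of the original.
my $y = $x->copy;
$y->binc;

print "length $len, last digit $last, first digit $first\n";
print "original $x, incremented copy $y\n";
```

Without the copy, binc would have modified $x in place, which is the same in-place convention the library methods follow.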

    API version 2

    The following methods are required for an API version of 2 or greater.

    Constructors

    • _1ex(N)

      Return an object representing the number 10**N where N >= 0 is a Perl scalar.

    Mathematical functions

    • _nok(OBJ1, OBJ2)

      Return the binomial coefficient OBJ1 over OBJ2.

    Miscellaneous

    • _alen(OBJ)

      Return the approximate number of decimal digits of the object. The output is one Perl scalar. This estimate must be greater than or equal to what _len() returns.

    API optional methods

    The following methods are optional, and can be defined if the underlying lib has a fast way to do them. If undefined, Math::BigInt will use pure Perl (hence slow) fallback routines to emulate these:

    Signed bitwise operators.

    Each of these methods may modify the first input argument.

    • _signed_or(OBJ1, OBJ2, SIGN1, SIGN2)

      Return the signed bitwise or.

    • _signed_and(OBJ1, OBJ2, SIGN1, SIGN2)

      Return the signed bitwise and.

    • _signed_xor(OBJ1, OBJ2, SIGN1, SIGN2)

      Return the signed bitwise exclusive or.

    WRAP YOUR OWN

    If you want to port your own favourite C library for big numbers to the Math::BigInt interface, you can take any of the already existing modules as a rough guideline. You should really wrap up the latest BigInt and BigFloat test suites with your module, and replace in them any occurrence of the following:

    use Math::BigInt;

    by this:

    use Math::BigInt lib => 'yourlib';

    This way you ensure that your library really works 100% within Math::BigInt.

    LICENSE

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

    AUTHORS

    • Original math code by Mark Biggar, rewritten by Tels http://bloodgate.com/ in late 2000.

    • Separated from BigInt and shaped API with the help of John Peacock.

    • Fixed, sped up, streamlined and enhanced by Tels 2001 - 2007.

    • API documentation corrected and extended by Peter John Acklam, <pjacklam@online.no>

    SEE ALSO

    Math::BigInt, Math::BigFloat, Math::BigInt::GMP, Math::BigInt::FastCalc and Math::BigInt::Pari.

     
    Math::BigInt::CalcEmu

    NAME

    Math::BigInt::CalcEmu - Emulate low-level math with BigInt code

    SYNOPSIS

    use Math::BigInt::CalcEmu;

    DESCRIPTION

    Contains routines that emulate low-level math functions in BigInt, e.g. optional routines the low-level math package does not provide on its own.

    Will be loaded on demand and called automatically by BigInt.

    Stuff here is really low-priority to optimize, since it is far better to implement the operation in the low-level math library directly, possibly even using a call to the native lib.

    METHODS

    __emu_bxor

    __emu_band

    __emu_bior

    LICENSE

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

    AUTHORS

    (c) Tels http://bloodgate.com 2003, 2004 - based on BigInt code by Tels from 2001-2003.

    SEE ALSO

    Math::BigInt, Math::BigFloat, Math::BigInt::GMP and Math::BigInt::Pari.

     
    Math::BigInt::FastCalc

    NAME

    Math::BigInt::FastCalc - Math::BigInt::Calc with some XS for more speed

    SYNOPSIS

    Provides support for big integer calculations. Not intended to be used by other modules. Other modules which provide the same functions can also be used to support Math::BigInt, like Math::BigInt::GMP or Math::BigInt::Pari.

    DESCRIPTION

    In order to allow for multiple big integer libraries, Math::BigInt was rewritten to use library modules for core math routines. Any module which follows the same API as this can be used instead by using the following:

    use Math::BigInt lib => 'libname';

    'libname' is either the long name ('Math::BigInt::Pari'), or only the short version like 'Pari'. To use this library:

    use Math::BigInt lib => 'FastCalc';

    Note that from Math::BigInt v1.76 onwards, FastCalc will be loaded automatically, if possible.

    STORAGE

    FastCalc works exactly like Calc, in that it stores the numbers in decimal form, chopped into parts.

    METHODS

    The following functions are now implemented in FastCalc.xs:

    _is_odd _is_even _is_one _is_zero
    _is_two _is_ten
    _zero _one _two _ten
    _acmp _len
    _inc _dec
    __strip_zeros _copy

    LICENSE

    This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

    AUTHORS

    Original math code by Mark Biggar, rewritten by Tels http://bloodgate.com/ in late 2000. Separated from BigInt and shaped API with the help of John Peacock.

    Fixed, sped-up and enhanced by Tels http://bloodgate.com 2001-2003. Further streamlining (api_version 1 etc.) by Tels 2004-2007.

    Bug-fixing by Peter John Acklam <pjacklam@online.no> 2010-2011.

    SEE ALSO

    Math::BigInt, Math::BigFloat, Math::BigInt::GMP, Math::BigInt::FastCalc and Math::BigInt::Pari.

     
    MIME::Base64

    NAME

    MIME::Base64 - Encoding and decoding of base64 strings

    SYNOPSIS

    use MIME::Base64;
    $encoded = encode_base64('Aladdin:open sesame');
    $decoded = decode_base64($encoded);

    DESCRIPTION

    This module provides functions to encode and decode strings into and from the base64 encoding specified in RFC 2045 - MIME (Multipurpose Internet Mail Extensions). The base64 encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable. A 65-character subset ([A-Za-z0-9+/=]) of US-ASCII is used, enabling 6 bits to be represented per printable character.

    The following primary functions are provided:

    • encode_base64( $bytes )
    • encode_base64( $bytes, $eol );

      Encode data by calling the encode_base64() function. The first argument is the byte string to encode. The second argument is the line-ending sequence to use. It is optional and defaults to "\n". The returned encoded string is broken into lines of no more than 76 characters each and it will end with $eol unless it is empty. Pass an empty string as second argument if you do not want the encoded string to be broken into lines.

      The function will croak with "Wide character in subroutine entry" if $bytes contains characters with code above 255. The base64 encoding is only defined for single-byte characters. Use the Encode module to select the byte encoding you want.

    • decode_base64( $str )

      Decode a base64 string by calling the decode_base64() function. This function takes a single argument which is the string to decode and returns the decoded data.

      Any character not part of the 65-character base64 subset is silently ignored. Characters occurring after a '=' padding character are never decoded.
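
The effect of the $eol argument and the tolerant decoding can be seen in a short script. This is a minimal sketch using the string from the synopsis; the whitespace scattered through the input to decode_base64() is there only to show that such characters are skipped.

```perl
use strict;
use warnings;
use MIME::Base64;

my $encoded = encode_base64('Aladdin:open sesame');      # ends with "\n"
my $oneline = encode_base64('Aladdin:open sesame', '');  # no line ending

# Characters outside the base64 alphabet are skipped when decoding.
my $decoded = decode_base64("QWxh ZGRp\r\nbjpvcGVu IHNlc2FtZQ==");

print $encoded;
print "$oneline\n";
print "$decoded\n";
```

Passing an empty $eol is the usual choice when the encoded value goes into a header or another single-line field.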

    If you prefer not to import these routines into your namespace, you can call them as:

    use MIME::Base64 ();
    $encoded = MIME::Base64::encode($decoded);
    $decoded = MIME::Base64::decode($encoded);

    Additional functions not exported by default:

    • encode_base64url( $bytes )
    • decode_base64url( $str )

      Encode and decode according to the base64 scheme for "URL applications" [1]. This is a variant of the base64 encoding which does not use padding, does not break the string into multiple lines, and uses the characters "-" and "_" instead of "+" and "/" to avoid using reserved URL characters.

    • encoded_base64_length( $bytes )
    • encoded_base64_length( $bytes, $eol )

      Returns the length that the encoded string would have without actually encoding it. This will return the same value as length(encode_base64($bytes)), but should be more efficient.

    • decoded_base64_length( $str )

      Returns the length that the decoded string would have without actually decoding it. This will return the same value as length(decode_base64($str)), but should be more efficient.
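
The functions that are not exported by default have to be requested by name. The sketch below compares the standard and URL variants for a two-byte input chosen so that both "+" and "/" appear, and checks the length functions against the real encoded and decoded strings.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64
                    encode_base64url decode_base64url
                    encoded_base64_length decoded_base64_length);

my $bytes = "\xfb\xff";
my $std = encode_base64($bytes, '');   # standard alphabet, padded
my $url = encode_base64url($bytes);    # URL alphabet, no padding
print "standard: $std, url: $url\n";

my $data = 'x' x 100;                  # arbitrary sample data
my $enc  = encode_base64($data);
printf "encoded length: predicted %d, actual %d\n",
    encoded_base64_length($data), length $enc;
printf "decoded length: predicted %d, actual %d\n",
    decoded_base64_length($enc), length decode_base64($enc);
```

The length functions are mainly useful for sizing buffers or headers before the data itself is converted.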

    EXAMPLES

    If you want to encode a large file, you should encode it in chunks that are a multiple of 57 bytes. This ensures that the base64 lines line up and that you do not end up with padding in the middle. 57 bytes of data fills one complete base64 line (76 == 57*4/3):

    use MIME::Base64 qw(encode_base64);
    open(my $fh, '<', "/var/log/wtmp") or die "$!";
    binmode($fh);   # read raw bytes
    my $buf;
    while (read($fh, $buf, 60*57)) {
        print encode_base64($buf);
    }

    or if you know you have enough memory

    use MIME::Base64 qw(encode_base64);
    local($/) = undef; # slurp
    print encode_base64(<STDIN>);

    The same approach as a command line:

    perl -MMIME::Base64 -0777 -ne 'print encode_base64($_)' <file

    Decoding does not need slurp mode if every line contains a multiple of four base64 chars:

    perl -MMIME::Base64 -ne 'print decode_base64($_)' <file

    Perl v5.8 and better allow extended Unicode characters in strings. Such strings cannot be encoded directly, as the base64 encoding is only defined for single-byte characters. The solution is to use the Encode module to select the byte encoding you want. For example:

    use MIME::Base64 qw(encode_base64);
    use Encode qw(encode);
    $encoded = encode_base64(encode("UTF-8", "\x{FFFF}\n"));
    print $encoded;
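
To see both the failure and the fix side by side, the croak can be trapped with eval. This is a sketch with an arbitrary sample string containing one character above 255; the variable names are illustrative.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);
use Encode qw(encode decode);

my $text = "caf\x{e9} \x{263A}";   # contains a character above 255

# Encoding the character string directly croaks.
my $ok    = eval { encode_base64($text); 1 };
my $error = $ok ? '' : $@;
print "direct encoding failed: $error" unless $ok;

# Encoding to UTF-8 bytes first works, and can be reversed.
my $encoded   = encode_base64(encode('UTF-8', $text), '');
my $roundtrip = decode('UTF-8', decode_base64($encoded));
print "round trip ok\n" if $roundtrip eq $text;
```

The receiver has to know which byte encoding was used, since base64 itself carries no character set information.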

    COPYRIGHT

    Copyright 1995-1999, 2001-2004, 2010 Gisle Aas.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Distantly based on LWP::Base64 written by Martijn Koster <m.koster@nexor.co.uk> and Joerg Reichelt <j.reichelt@nexor.co.uk> and code posted to comp.lang.perl <3pd2lp$6gf@wsinti07.win.tue.nl> by Hans Mulder <hansm@wsinti07.win.tue.nl>

    The XS implementation uses code from metamail. Copyright 1991 Bell Communications Research, Inc. (Bellcore)

    SEE ALSO

    MIME::QuotedPrint

    [1] http://en.wikipedia.org/wiki/Base64#URL_applications

     
    MIME::QuotedPrint

    NAME

    MIME::QuotedPrint - Encoding and decoding of quoted-printable strings

    SYNOPSIS

    use MIME::QuotedPrint;
    $encoded = encode_qp($decoded);
    $decoded = decode_qp($encoded);

    DESCRIPTION

    This module provides functions to encode and decode strings into and from the quoted-printable encoding specified in RFC 2045 - MIME (Multipurpose Internet Mail Extensions). The quoted-printable encoding is intended to represent data that largely consists of bytes that correspond to printable characters in the ASCII character set. Each non-printable character (as defined by English Americans) is represented by a triplet consisting of the character "=" followed by two hexadecimal digits.

    The following functions are provided:

    • encode_qp( $str)
    • encode_qp( $str, $eol)
    • encode_qp( $str, $eol, $binmode )

      This function returns an encoded version of the string ($str) given as argument.

      The second argument ($eol) is the line-ending sequence to use. It is optional and defaults to "\n". Every occurrence of "\n" is replaced with this string, and it is also used for additional "soft line breaks" to ensure that no line ends up longer than 76 characters. Pass it as "\015\012" to produce data suitable for external consumption. The string "\r\n" produces the same result on many platforms, but not all.

      The third argument ($binmode) will select binary mode if passed as a TRUE value. In binary mode "\n" will be encoded in the same way as any other non-printable character. This ensures that a decoder will end up with exactly the same string whatever line ending sequence it uses. In general it is preferable to use the base64 encoding for binary data; see MIME::Base64.

      An $eol of "" (the empty string) is special. In this case, no "soft line breaks" are introduced and binary mode is effectively enabled so that any "\n" in the original data is encoded as well.

    • decode_qp( $str )

      This function returns the plain text version of the string given as argument. The lines of the result are "\n" terminated, even if the $str argument contains "\r\n" terminated lines.
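
    The effect of the optional arguments is easiest to see side by side. The following sketch encodes one short line four ways; the sample text is arbitrary, and the comments give only the outputs for the two plain-text cases, where the result is a single line.

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(encode_qp decode_qp);

my $text = "caf\xE9 = coffee\n";    # one Latin-1 byte and a literal "="

my $default   = encode_qp($text);                # "caf=E9 =3D coffee\n"
my $crlf      = encode_qp($text, "\015\012");    # same text, ending in CRLF
my $binary    = encode_qp($text, "\n", 1);       # "\n" itself is encoded as =0A
my $no_breaks = encode_qp($text, "");            # empty $eol: no line breaks emitted

# Every variant decodes back to the original string.
for my $qp ($default, $crlf, $binary, $no_breaks) {
    print decode_qp($qp) eq $text ? "round trip ok\n" : "round trip FAILED\n";
}
```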

    If you prefer not to import these routines into your namespace, you can call them as:

    use MIME::QuotedPrint ();
    $encoded = MIME::QuotedPrint::encode($decoded);
    $decoded = MIME::QuotedPrint::decode($encoded);

    Perl v5.8 and better allow extended Unicode characters in strings. Such strings cannot be encoded directly, as the quoted-printable encoding is only defined for single-byte characters. The solution is to use the Encode module to select the byte encoding you want. For example:

    use MIME::QuotedPrint qw(encode_qp);
    use Encode qw(encode);
    $encoded = encode_qp(encode("UTF-8", "\x{FFFF}\n"));
    print $encoded;

    COPYRIGHT

    Copyright 1995-1997,2002-2004 Gisle Aas.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    MIME::Base64

     
    Log::Message

    NAME

    Log::Message - A generic message storing mechanism

    SYNOPSIS

    use Log::Message private => 0, config => '/our/cf_file';

    my $log = Log::Message->new( private => 1,
                                 level   => 'log',
                                 config  => '/my/cf_file',
                               );

    $log->store('this is my first message');

    $log->store( message => 'message #2',
                 tag     => 'MY_TAG',
                 level   => 'carp',
                 extra   => ['this is an argument to the handler'],
               );

    my @last_five_items = $log->retrieve(5);

    my @items = $log->retrieve( tag     => qr/my_tag/i,
                                message => qr/\d/,
                                remove  => 1,
                              );

    @items = $log->final( level => qr/carp/, amount => 2 );

    my $first_error = $log->first();

    # croak with the last error on the stack
    $log->final->croak;

    # empty the stack
    $log->flush();

    DESCRIPTION

    Log::Message is a generic message storage mechanism. It allows you to store messages on a stack -- either shared or private -- and assign meta-data to them. Some meta-data will automatically be added for you, like a timestamp and a stack trace, but some can be filled in by the user, like a tag by which to identify or group a message, and a level at which to handle the message (for example, log it, or die with it).

    Log::Message also provides a powerful way of searching through items by regexes on messages, tags and level.

    Hierarchy

    There are 4 modules of interest when dealing with the Log::Message::* modules:

    • Log::Message

      Log::Message provides a few methods to manipulate the stack it keeps. It has the option of keeping either a private or a public stack. More on this below.

    • Log::Message::Item

      These are individual message items, which are objects that contain the user message as well as the meta-data described above. See the Log::Message::Item manpage to see how to extract this meta-data and how to work with the Item objects. You should never need to create your own Item objects, but knowing about their methods and accessors is important if you want to write your own handlers. (See below)

    • Log::Message::Handlers

      These are a collection of handlers that will be called for a level that is used on a Log::Message::Item object. For example, if a message is logged with the 'carp' level, the 'carp' handler from Log::Message::Handlers will be called. See the Log::Message::Handlers manpage for more explanation about how handlers work, which ones are available and how to create your own.

    • Log::Message::Config

      Each Log::Message object requires a configuration that fills in defaults for any arguments the user did not specify (for example, which tag will be set if none was provided). Log::Message::Config handles the creation of these configurations.

      Configuration can be specified in 4 ways:

      • As a configuration file when you use Log::Message

      • As arguments when you use Log::Message

      • As a configuration file when you create a new Log::Message object. (The config will then only apply to that object if you marked it as private)

      • As arguments when you create a new Log::Message object.

        You should never need to use the Log::Message::Config module yourself, as this is transparently done by Log::Message, but its manpage does provide an explanation of how you can create a config file.

    Options

    When using Log::Message, or creating a new Log::Message object, you can supply various options to alter its behaviour. Of course, there are sensible defaults should you choose to omit these options.

    Below is an explanation of all the options and how they work.

    • config

      The path to a configuration file to be read. See the manpage of Log::Message::Config for the required format.

      These options will be overridden by any explicit arguments passed.

    • private

      Whether to create, by default, private or shared objects. If you choose to create shared objects, all Log::Message objects will use the same stack.

      This means that even though every module may make its own $log object they will still be sharing the same error stack on which they are putting errors and from which they are retrieving.

      This can be useful in big projects.

      If you choose to create a private object, then the stack will of course be private to this object, but it will still fall back to the shared config should no private config or overriding arguments be provided.

    • verbose

      Log::Message makes use of another module to validate its arguments, which is called Params::Check, which is a lightweight, yet powerful input checker and parser. (See the Params::Check manpage for details).

      The verbose setting controls whether this module generates warnings if something improper is passed as input, or merely silently returns undef, at which point Log::Message will generate a warning.

      It's best to just leave this at its default value, which is '1'.

    • tag

      The tag to add to messages if none was provided. If neither your config nor any specific arguments supply a tag, then Log::Message will set it to 'NONE'.

      Tags are useful for searching on or grouping by. For example, you could tag all the messages you want to go to the user as 'USER ERROR' and all those that are only debug information with 'DEBUG'.

      At the end of your program, you could then print all the ones tagged 'USER ERROR' to STDOUT, and those marked 'DEBUG' to a log file.

    • level

      level describes what action to take when a message is logged. Just like tag, Log::Message will provide a default (which is 'log') if neither your config file nor any explicit arguments are given to override it.

      See the Log::Message::Handlers manpage to see what handlers are available by default and what they do, as well as to how to add your own handlers.

    • remove

      This indicates whether or not to automatically remove the messages from the stack when you've retrieved them. The default setting provided by Log::Message is '0': do not remove.

    • chrono

      This indicates whether messages should always be fetched in chronological order or not. This simply means that you can choose whether, when retrieving items, the item most recently added should be returned first, or the one that was added longest ago.

      The default is to return the newest ones first.

    Methods

    new

    This creates a new Log::Message object. The parameters it takes are described in the Options section above. You can pass these options to new like this:

    my $log = Log::Message->new( %options );

    as well as at use time, like this:

    use Log::Message option1 => value, option2 => value;

    There are but 3 rules to keep in mind:

    • Provided arguments take precedence over a configuration file.

    • Arguments to new take precedence over options provided at use time

    • An object marked private will always have an empty stack to begin with
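
    The first two rules amount to a layered merge in which later sources override earlier ones. The sketch below models that idea in plain Perl. It is not Log::Message's own code: the hash names and the values in %config_file, %use_args and %new_args are hypothetical, a single config file is assumed, and only the default values are taken from the Options section.

```perl
use strict;
use warnings;

# Defaults as described in the Options section.
my %defaults = (tag => 'NONE', level => 'log', remove => 0);

# Hypothetical settings from three sources, for illustration only.
my %config_file = (tag => 'FROM_FILE', level => 'carp');   # as if read from a config file
my %use_args    = (level => 'cluck');                      # options given at use time
my %new_args    = (remove => 1);                           # options given to new()

# In a list assignment to a hash, later pairs replace earlier ones,
# so the order below encodes the precedence rules.
my %effective = (%defaults, %config_file, %use_args, %new_args);

printf "%s = %s\n", $_, $effective{$_} for sort keys %effective;
```

    Here the tag comes from the config file, the level from the use-time arguments, and remove from the arguments to new.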

    store

    This will create a new Item object and store it on the stack.

    Possible arguments you can give to it are:

    • message

      This is the only argument that is required. If no other arguments are given, you may even leave off the message key. The argument will then automatically be assumed to be the message.

    • tag

      The tag to add to this message. If not provided, Log::Message will look in your configuration for one.

    • level

      The level at which this message should be handled. If not provided, Log::Message will look in your configuration for one.

    • extra

      This is an array ref with arguments passed to the handler for this message, when it is called from store().

      The handler will receive them as a normal list.

    store() will return true upon success and undef upon failure, as well as issue a warning as to why it failed.

    retrieve

    This will retrieve all message items matching the criteria specified from the stack.

    Here are the criteria you can discriminate on:

    • tag

      A regex to which the tag must adhere. For example qr/\w/.

    • level

      A regex to which the level must adhere.

    • message

      A regex to which the message must adhere.

    • amount

      Maximum number of items to return.

    • chrono

      Return in chronological order, or not?

    • remove

      Remove items from the stack upon retrieval?

    In scalar context it will return the first item matching your criteria and in list context, it will return all of them.

    If an error occurs while retrieving, a warning will be issued and undef will be returned.

    first

    This is a shortcut for retrieving the first item(s) stored on the stack. It will default to only retrieving one if called with no arguments, and will always return results in chronological order.

    If you only supply one argument, it is assumed to be the amount you wish returned.

    Furthermore, it can take the same arguments as retrieve can.

    last

    This is a shortcut for retrieving the last item(s) stored on the stack. It will default to only retrieving one if called with no arguments, and will always return results in reverse chronological order.

    If you only supply one argument, it is assumed to be the amount you wish returned.

    Furthermore, it can take the same arguments as retrieve can.

    flush

    This removes all items from the stack and returns them to the caller.

    SEE ALSO

    Log::Message::Item, Log::Message::Handlers, Log::Message::Config

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    Acknowledgements

    Thanks to Ann Barcomb for her suggestions.

    COPYRIGHT

    This module is copyright (c) 2002 Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    Log::Message::Config

    NAME

    Log::Message::Config - Configuration options for Log::Message

    SYNOPSIS

    # This module is implicitly used by Log::Message to create a config
    # which it uses to log messages.
    # For the options you can pass, see the C<Log::Message new()> method.

    # Below is a sample of a config file you could use

    # comments are denoted by a single '#'
    # use a shared stack, or have a private instance?
    # if none provided, set to '0',
    private = 1

    # do not be verbose
    verbose = 0

    # default tag to set on new items
    # if none provided, set to 'NONE'
    tag = SOME TAG

    # default level to handle items
    # if none provided, set to 'log'
    level = carp

    # extra files to include
    # if none provided, no files are auto included
    include = mylib.pl
    include = ../my/other/lib.pl

    # automatically delete items
    # when you retrieve them from the stack?
    # if none provided, set to '0'
    remove = 1

    # retrieve errors in chronological order, or not?
    # if none provided, set to '1'
    chrono = 0

    DESCRIPTION

    Log::Message::Config provides a standardized config object for Log::Message objects.

    It can read options either as Perl arguments or from a config file. See the Log::Message manpage for more information about what arguments are valid, and see the Synopsis for an example config file you can use.

    SEE ALSO

    Log::Message, Log::Message::Item, Log::Message::Handlers

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    Acknowledgements

    Thanks to Ann Barcomb for her suggestions.

    COPYRIGHT

    This module is copyright (c) 2002 Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    Log::Message::Handlers

    NAME

    Log::Message::Handlers - Message handlers for Log::Message

    SYNOPSIS

    # Implicitly used by Log::Message to serve as handlers for
    # Log::Message::Item objects

    # Create your own file with a package called
    # Log::Message::Handlers to add to the existing ones, or to even
    # overwrite them

    $item->carp;

    $item->trace;

    DESCRIPTION

    Log::Message::Handlers provides handlers for Log::Message::Item objects. The handler corresponding to the level (see Log::Message::Item manpage for an explanation about levels) will be called automatically upon storing the error.

    Handlers may also be called explicitly on a Log::Message::Item object if one so desires (see the Log::Message manpage on how to retrieve the Item objects).

    Default Handlers

    log

    Will simply log the error on the stack, and do nothing special.

    carp

    Will carp (see the Carp manpage) with the error, and add the timestamp of when it occurred.

    croak

    Will croak (see the Carp manpage) with the error, and add the timestamp of when it occurred.

    cluck

    Will cluck (see the Carp manpage) with the error, and add the timestamp of when it occurred.

    confess

    Will confess (see the Carp manpage) with the error, and add the timestamp of when it occurred.

    die

    Will simply die with the error message of the item.

    warn

    Will simply warn with the error message of the item.

    trace

    Will provide a traceback of this error item back to the first one that occurred, clucking with every item as it comes across it.

    Custom Handlers

    If you wish to provide your own handlers, you can simply do the following:

    • Create a file that holds a package by the name of Log::Message::Handlers

    • Create subroutines with the same name as the levels you wish to handle in the Log::Message module (see the Log::Message manpage for explanation on levels)

    • Require that file in your program, or add it in your configuration (see the Log::Message::Config manpage for explanation on how to use a config file)

    And that is it, the handler will now be available to handle messages for you.

    The arguments a handler may receive are those specified by the extra key, when storing the message. See the Log::Message manpage for details on the arguments.
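
    The steps above might look like the following file. This is a minimal sketch under stated assumptions: the level name 'notify' is hypothetical, and the handler is assumed to be invoked with the Item object as its first argument, followed by the list supplied through the extra key. Only the tag and message accessors of the item are used.

```perl
package Log::Message::Handlers;

use strict;
use warnings;

# Hypothetical handler for a level named 'notify'.
# Assumption: called as notify($item, @extra), where $item is the
# Log::Message::Item and @extra is the list given via the 'extra' key.
sub notify {
    my ($item, @extra) = @_;

    my $line = sprintf "[%s] %s", $item->tag, $item->message;
    $line .= " (" . join(", ", @extra) . ")" if @extra;

    print STDERR "$line\n";
    return $line;
}

1;
```

    Once such a file is required by your program or listed in your configuration, storing a message with level => 'notify' would be expected to route it to this subroutine.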

    SEE ALSO

    Log::Message, Log::Message::Item, Log::Message::Config

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    Acknowledgements

    Thanks to Ann Barcomb for her suggestions.

    COPYRIGHT

    This module is copyright (c) 2002 Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    Log::Message::Item

    NAME

    Log::Message::Item - Message objects for Log::Message

    SYNOPSIS

    # Implicitly used by Log::Message to create Log::Message::Item objects

    print "this is the message's id: ", $item->id;

    print "this is the message stored: ", $item->message;

    print "this is when it happened: ", $item->when;

    print "the message was tagged: ", $item->tag;

    print "this was the severity level: ", $item->level;

    $item->remove; # delete the item from the stack it was on

    # Besides these methods, you can also call the handlers on
    # the object specifically.
    # See the Log::Message::Handlers manpage for documentation on what
    # handlers are available by default and how to add your own

    DESCRIPTION

    Log::Message::Item is a class that generates generic Log items. These items are stored on a Log::Message stack, so see the Log::Message manpage for details on how to retrieve them.

    You should probably not create new items by yourself, but use the storing mechanism provided by Log::Message.

    However, the accessors and handlers are of interest if you want to do fine tuning of how your messages are handled.

    The accessors and methods are described below, the handlers are documented in the Log::Message::Handlers manpage.

    Methods and Accessors

    remove

    Calling remove will remove the object from the stack it was on, so it will not show up any more in subsequent fetches of messages.

    You can still call accessors and handlers on it however, to handle it as you will.

    id

    Returns the internal ID of the item. This may be useful for comparing since the ID is incremented each time a new item is created. Therefore, an item with ID 4 must have been logged before an item with ID 9.

    when

    Returns the timestamp of when the message was logged.

    message

    The actual message that was stored.

    level

    The severity type of this message, as well as the name of the handler that was called upon storing it.

    tag

    Returns the identification tag that was put on the message.

    shortmess

    Returns the equivalent of a Carp::shortmess for this item. See the Carp manpage for details.

    longmess

    Returns the equivalent of a Carp::longmess for this item, which is essentially a stack trace. See the Carp manpage for details.

    parent

    Returns a reference to the Log::Message object that stored this item. This is useful if you want to have access to the full stack in a handler.

    SEE ALSO

    Log::Message, Log::Message::Handlers, Log::Message::Config

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    Acknowledgements

    Thanks to Ann Barcomb for her suggestions.

    COPYRIGHT

    This module is copyright (c) 2002 Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    Log::Message::Simple

    NAME

    Log::Message::Simple - Simplified interface to Log::Message

    SYNOPSIS

    use Log::Message::Simple qw[msg error debug
                                carp croak cluck confess];

    use Log::Message::Simple qw[:STD :CARP];

    ### standard reporting functionality
    msg( "Connecting to database", $verbose );
    error( "Database connection failed: $@", $verbose );
    debug( "Connection arguments were: $args", $debug );

    ### standard carp functionality
    carp( "Wrong arguments passed: @_" );
    croak( "Fatal: wrong arguments passed: @_" );
    cluck( "Wrong arguments passed -- including stacktrace: @_" );
    confess("Fatal: wrong arguments passed -- including stacktrace: @_" );

    ### retrieve individual message
    my @stack = Log::Message::Simple->stack;
    my @stack = Log::Message::Simple->flush;

    ### retrieve the entire stack in printable form
    my $msgs = Log::Message::Simple->stack_as_string;
    my $trace = Log::Message::Simple->stack_as_string(1);

    ### redirect output
    local $Log::Message::Simple::MSG_FH = \*STDERR;
    local $Log::Message::Simple::ERROR_FH = \*STDERR;
    local $Log::Message::Simple::DEBUG_FH = \*STDERR;

    ### force a stacktrace on error
    local $Log::Message::Simple::STACKTRACE_ON_ERROR = 1;

    DESCRIPTION

    This module provides standardized logging facilities using the Log::Message module.

    FUNCTIONS

    msg("message string" [,VERBOSE])

    Records a message on the stack, and prints it to STDOUT (or actually $MSG_FH, see the GLOBAL VARIABLES section below), if the VERBOSE option is true. The VERBOSE option defaults to false.

    Exported by default, or using the :STD tag.

    debug("message string" [,VERBOSE])

    Records a debug message on the stack, and prints it to STDOUT (or actually $DEBUG_FH, see the GLOBAL VARIABLES section below), if the VERBOSE option is true. The VERBOSE option defaults to false.

    Exported by default, or using the :STD tag.

    error("error string" [,VERBOSE])

    Records an error on the stack, and prints it to STDERR (or actually $ERROR_FH, see the GLOBAL VARIABLES section below), if the VERBOSE option is true. The VERBOSE option defaults to true.

    Exported by default, or using the :STD tag.

    carp();

    Provides functionality equal to Carp::carp() while still logging to the stack.

    Exported by using the :CARP tag.

    croak();

    Provides functionality equal to Carp::croak() while still logging to the stack.

    Exported by using the :CARP tag.

    confess();

    Provides functionality equal to Carp::confess() while still logging to the stack.

    Exported by using the :CARP tag.

    cluck();

    Provides functionality equal to Carp::cluck() while still logging to the stack.

    Exported by using the :CARP tag.

    CLASS METHODS

    Log::Message::Simple->stack()

    Retrieves all the items on the stack. Since Log::Message::Simple is implemented using Log::Message, consult its manpage for the function retrieve to see what is returned and how to use the items.

    Log::Message::Simple->stack_as_string([TRACE])

    Returns the whole stack as a printable string. If the TRACE option is true all items are returned with Carp::longmess output, rather than just the message. TRACE defaults to false.

    Log::Message::Simple->flush()

    Removes all the items from the stack and returns them. Since Log::Message::Simple is implemented using Log::Message, consult its manpage for the function retrieve to see what is returned and how to use the items.

    GLOBAL VARIABLES

    • $ERROR_FH

      This is the filehandle to which all messages sent to error() are printed. It defaults to *STDERR.

    • $MSG_FH

      This is the filehandle to which all messages sent to msg() are printed. It defaults to *STDOUT.

    • $DEBUG_FH

      This is the filehandle to which all messages sent to debug() are printed. It defaults to *STDOUT.

    • $STACKTRACE_ON_ERROR

      If this option is set to true, every call to error() will generate a stacktrace using Carp::shortmess(). Defaults to false.

     
    Locale::Country

    NAME

    Locale::Country - standard codes for country identification

    SYNOPSIS

    use Locale::Country;

    $country = code2country('jp' [,CODESET]); # $country gets 'Japan'
    $code = country2code('Norway' [,CODESET]); # $code gets 'no'

    @codes = all_country_codes( [CODESET]);
    @names = all_country_names();

    # semi-private routines
    Locale::Country::alias_code('uk' => 'gb');
    Locale::Country::rename_country('gb' => 'Great Britain');

    DESCRIPTION

    The Locale::Country module provides access to several code sets that can be used for identifying countries, such as those defined in ISO 3166-1.

    Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default ISO 3166-1 two-letter codes will be used.

    SUPPORTED CODE SETS

    There are several different code sets you can use for identifying countries. A code set may be specified using either a name, or a constant that is automatically exported by this module.

    For example, the following two calls are equivalent:

    $country = code2country('jp','alpha-2');
    $country = code2country('jp',LOCALE_CODE_ALPHA_2);

    The codesets currently supported are:

    • alpha-2, LOCALE_CODE_ALPHA_2

      This is the set of two-letter (lowercase) codes from ISO 3166-1, such as 'tv' for Tuvalu.

      This is the default code set.

    • alpha-3, LOCALE_CODE_ALPHA_3

      This is the set of three-letter (lowercase) codes from ISO 3166-1, such as 'brb' for Barbados. These codes are actually defined and maintained by the U.N. Statistics division.

    • numeric, LOCALE_CODE_NUMERIC

      This is the set of three-digit numeric codes from ISO 3166-1, such as 064 for Bhutan. These codes are actually defined and maintained by the U.N. Statistics division.

      If a 2-digit code is entered, it is converted to 3 digits by prepending a 0.

    • fips-10, LOCALE_CODE_FIPS

      The FIPS 10 data are two-letter (uppercase) codes assigned by the National Geospatial-Intelligence Agency.

    • dom, LOCALE_CODE_DOM

      The IANA is responsible for delegating management of the top level country domains. The country domains are the two-letter (lowercase) codes from ISO 3166 with a few other additions.

    ROUTINES

    • code2country ( CODE [,CODESET] )
    • country2code ( NAME [,CODESET] )
    • country_code2code ( CODE ,CODESET ,CODESET2 )
    • all_country_codes ( [CODESET] )
    • all_country_names ( [CODESET] )
    • Locale::Country::rename_country ( CODE ,NEW_NAME [,CODESET] )
    • Locale::Country::add_country ( CODE ,NAME [,CODESET] )
    • Locale::Country::delete_country ( CODE [,CODESET] )
    • Locale::Country::add_country_alias ( NAME ,NEW_NAME )
    • Locale::Country::delete_country_alias ( NAME )
    • Locale::Country::rename_country_code ( CODE ,NEW_CODE [,CODESET] )
    • Locale::Country::add_country_code_alias ( CODE ,NEW_CODE [,CODESET] )
    • Locale::Country::delete_country_code_alias ( CODE [,CODESET] )

      These routines are all documented in the Locale::Codes::API man page.

    • alias_code ( ALIAS, CODE [,CODESET] )

      Version 2.07 included 2 functions for modifying the internal data: rename_country and alias_code. Both of these could be used only to modify the internal data for country codes.

      As of 3.10, the internal data for all types of codes can be modified.

      The alias_code function is preserved for backwards compatibility, but the following two are identical:

      alias_code(ALIAS,CODE [,CODESET]);
      rename_country_code(CODE,ALIAS [,CODESET]);

      and the latter should be used for consistency.

      The alias_code function is deprecated and will be removed at some point in the future.

      Note: this function was previously called _alias_code, but the leading underscore has been dropped. The old name was supported for all 2.X releases, but has been dropped as of 3.00.

    SEE ALSO

    • Locale::Codes

      The Locale-Codes distribution.

    • Locale::Codes::API

      The list of functions supported by this module.

    • Locale::SubCountry

      ISO codes for country sub-divisions (states, counties, provinces, etc), as defined in ISO 3166-2. This module is not part of the Locale-Codes distribution, but is available from CPAN in CPAN/modules/by-module/Locale/

    • http://www.iso.org/iso/country_codes

      Official home page for the ISO 3166 maintenance agency.

      Unfortunately, they do not make the actual ISO available for free, so I cannot check the alpha-3 and numerical codes here.

    • http://www.iso.org/iso/list-en1-semic-3.txt
    • http://www.iso.org/iso/home/standards/country_codes/iso-3166-1_decoding_table.htm

      The source of ISO 3166-1 two-letter codes used by this module.

    • http://unstats.un.org/unsd/methods/m49/m49alpha.htm

      The source of the official ISO 3166-1 three-letter codes and three-digit codes.

      For some reason, this table is incomplete! Several countries are missing from it, and I cannot find them anywhere on the UN site. I get as much of the data from here as I can.

    • http://earth-info.nga.mil/gns/html/digraphs.htm

      The official list of the FIPS 10 codes.

    • http://www.iana.org/domains/

      Official source of the top-level domain names.

    • https://www.cia.gov/library/publications/the-world-factbook/appendix/print_appendix-d.html

      The World Factbook maintained by the CIA is a potential source of the data. Unfortunately, it adds/preserves non-standard codes, so it is no longer used as a source of data.

    • http://www.statoids.com/wab.html

      Another unofficial source of data. Currently, it is not used to get data, but the notes and explanatory material were very useful for understanding discrepancies between the sources.

    AUTHOR

    See Locale::Codes for full author history.

    Currently maintained by Sullivan Beck (sbeck@cpan.org).

    COPYRIGHT

        Copyright (c) 1997-2001 Canon Research Centre Europe (CRE).
        Copyright (c) 2001-2010 Neil Bowers
        Copyright (c) 2010-2013 Sullivan Beck

    This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Locale::Currency

    NAME

    Locale::Currency - standard codes for currency identification

    SYNOPSIS

        use Locale::Currency;
        $curr = code2currency('usd');     # $curr gets 'US Dollar'
        $code = currency2code('Euro');    # $code gets 'eur'
        @codes = all_currency_codes();
        @names = all_currency_names();

    DESCRIPTION

    The Locale::Currency module provides access to standard codes used for identifying currencies and funds, such as those defined in ISO 4217.

    Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default ISO 4217 three-letter codes will be used.

    SUPPORTED CODE SETS

    There are several different code sets you can use for identifying currencies. A code set may be specified using either a name, or a constant that is automatically exported by this module.

    For example, the two are equivalent:

        $curr = code2currency('usd','alpha');
        $curr = code2currency('usd',LOCALE_CURR_ALPHA);

    The codesets currently supported are:

    • alpha, LOCALE_CURR_ALPHA

      This is a set of three-letter (uppercase) codes from ISO 4217 such as EUR for Euro.

      Two of the codes specified by the standard (XTS which is reserved for testing purposes and XXX which is for transactions where no currency is involved) are omitted.

      This is the default code set.

    • num, LOCALE_CURR_NUMERIC

      This is the set of three-digit numeric codes from ISO 4217.

    ROUTINES

    • code2currency ( CODE [,CODESET] )
    • currency2code ( NAME [,CODESET] )
    • currency_code2code ( CODE ,CODESET ,CODESET2 )
    • all_currency_codes ( [CODESET] )
    • all_currency_names ( [CODESET] )
    • Locale::Currency::rename_currency ( CODE ,NEW_NAME [,CODESET] )
    • Locale::Currency::add_currency ( CODE ,NAME [,CODESET] )
    • Locale::Currency::delete_currency ( CODE [,CODESET] )
    • Locale::Currency::add_currency_alias ( NAME ,NEW_NAME )
    • Locale::Currency::delete_currency_alias ( NAME )
    • Locale::Currency::rename_currency_code ( CODE ,NEW_CODE [,CODESET] )
    • Locale::Currency::add_currency_code_alias ( CODE ,NEW_CODE [,CODESET] )
    • Locale::Currency::delete_currency_code_alias ( CODE [,CODESET] )

      These routines are all documented in the Locale::Codes::API man page.

    SEE ALSO

    AUTHOR

    See Locale::Codes for full author history.

    Currently maintained by Sullivan Beck (sbeck@cpan.org).

    COPYRIGHT

        Copyright (c) 1997-2001 Canon Research Centre Europe (CRE).
        Copyright (c) 2001 Michael Hennecke
        Copyright (c) 2001-2010 Neil Bowers
        Copyright (c) 2010-2013 Sullivan Beck

    This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Locale::Language

    NAME

    Locale::Language - standard codes for language identification

    SYNOPSIS

        use Locale::Language;
        $lang = code2language('en');        # $lang gets 'English'
        $code = language2code('French');    # $code gets 'fr'
        @codes = all_language_codes();
        @names = all_language_names();

    DESCRIPTION

    The Locale::Language module provides access to standard codes used for identifying languages, such as those defined in ISO 639.

    Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default ISO 639 two-letter codes will be used.

    SUPPORTED CODE SETS

    There are several different code sets you can use for identifying languages. A code set may be specified using either a name, or a constant that is automatically exported by this module.

    For example, the two are equivalent:

        $lang = code2language('en','alpha-2');
        $lang = code2language('en',LOCALE_LANG_ALPHA_2);

    The codesets currently supported are:

    • alpha-2, LOCALE_LANG_ALPHA_2

      This is the set of two-letter (lowercase) codes from ISO 639-1, such as 'he' for Hebrew. It also includes additions to this set included in the IANA language registry.

      This is the default code set.

    • alpha-3, LOCALE_LANG_ALPHA_3

      This is the set of three-letter (lowercase) bibliographic codes from ISO 639-2 and 639-5, such as 'heb' for Hebrew. It also includes additions to this set included in the IANA language registry.

    • term, LOCALE_LANG_TERM

      This is the set of three-letter (lowercase) terminologic codes from ISO 639.

    ROUTINES

    • code2language ( CODE [,CODESET] )
    • language2code ( NAME [,CODESET] )
    • language_code2code ( CODE ,CODESET ,CODESET2 )
    • all_language_codes ( [CODESET] )
    • all_language_names ( [CODESET] )
    • Locale::Language::rename_language ( CODE ,NEW_NAME [,CODESET] )
    • Locale::Language::add_language ( CODE ,NAME [,CODESET] )
    • Locale::Language::delete_language ( CODE [,CODESET] )
    • Locale::Language::add_language_alias ( NAME ,NEW_NAME )
    • Locale::Language::delete_language_alias ( NAME )
    • Locale::Language::rename_language_code ( CODE ,NEW_CODE [,CODESET] )
    • Locale::Language::add_language_code_alias ( CODE ,NEW_CODE [,CODESET] )
    • Locale::Language::delete_language_code_alias ( CODE [,CODESET] )

      These routines are all documented in the Locale::Codes::API man page.

    SEE ALSO

    AUTHOR

    See Locale::Codes for full author history.

    Currently maintained by Sullivan Beck (sbeck@cpan.org).

    COPYRIGHT

        Copyright (c) 1997-2001 Canon Research Centre Europe (CRE).
        Copyright (c) 2001-2010 Neil Bowers
        Copyright (c) 2010-2013 Sullivan Beck

    This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Locale::Maketext

    NAME

    Locale::Maketext - framework for localization

    SYNOPSIS

        package MyProgram;
        use strict;
        use MyProgram::L10N;
        # ...which inherits from Locale::Maketext
        my $lh = MyProgram::L10N->get_handle() || die "What language?";
        ...
        # And then any messages your program emits, like:
        warn $lh->maketext( "Can't open file [_1]: [_2]\n", $f, $! );
        ...

    DESCRIPTION

    It is a common feature of applications (whether run directly, or via the Web) for them to be "localized" -- i.e., for them to present an English interface to an English-speaker, a German interface to a German-speaker, and so on for all languages they are programmed with. Locale::Maketext is a framework for software localization; it provides you with the tools for organizing and accessing the bits of text and text-processing code that you need for producing localized applications.

    In order to make sense of Maketext and how all its components fit together, you should probably go read Locale::Maketext::TPJ13, and then read the following documentation.

    You may also want to read over the source for File::Findgrep and its constituent modules -- they are a complete (if small) example application that uses Maketext.

    QUICK OVERVIEW

    The basic design of Locale::Maketext is object-oriented, and Locale::Maketext is an abstract base class, from which you derive a "project class". The project class (with a name like "TkBocciBall::Localize", which you then use in your module) is in turn the base class for all the "language classes" for your project (with names "TkBocciBall::Localize::it", "TkBocciBall::Localize::en", "TkBocciBall::Localize::fr", etc.).

    A language class is a class containing a lexicon of phrases as class data, and possibly also some methods that are of use in interpreting phrases in the lexicon, or otherwise dealing with text in that language.

    An object belonging to a language class is called a "language handle"; it's typically a flyweight object.

    The normal course of action is to call:

        use TkBocciBall::Localize;    # the localization project class
        $lh = TkBocciBall::Localize->get_handle();
        # Depending on the user's locale, etc., this will
        # make a language handle from among the classes available,
        # and any defaults that you declare.
        die "Couldn't make a language handle??" unless $lh;

    From then on, you use the maketext method to access entries in whatever lexicon(s) belong to the language handle you got. So, this:

        print $lh->maketext("You won!"), "\n";

    ...emits the right text for this language. If the object in $lh belongs to class "TkBocciBall::Localize::fr" and %TkBocciBall::Localize::fr::Lexicon contains ("You won!" => "Tu as gagné!") , then the above code happily tells the user "Tu as gagné!".
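
    As a minimal sketch of the arrangement just described, the following defines the project class and two language classes inline in a single file (a real project would normally keep each class in its own .pm file). The lexicon contents and the explicit 'fr' argument are assumptions made for illustration only.

```perl
use strict;
use warnings;
use utf8;    # the French lexicon entry below contains a non-ASCII character

# Project class: derives from Locale::Maketext.
{
    package TkBocciBall::Localize;
    use parent 'Locale::Maketext';
}

# Language classes: derive from the project class and carry a %Lexicon.
{
    package TkBocciBall::Localize::en;
    use parent -norequire, 'TkBocciBall::Localize';
    our %Lexicon = ( 'You won!' => 'You won!' );
}
{
    package TkBocciBall::Localize::fr;
    use parent -norequire, 'TkBocciBall::Localize';
    our %Lexicon = ( 'You won!' => 'Tu as gagné!' );
}

# Ask explicitly for French in this sketch; with no arguments,
# get_handle would consult the environment instead.
my $lh = TkBocciBall::Localize->get_handle('fr')
    or die "Couldn't make a language handle??";

my $text = $lh->maketext('You won!');

binmode STDOUT, ':encoding(UTF-8)';
print "$text\n";
```

    Because the language classes already exist in memory, get_handle has no need to load anything from disk; this is convenient for small experiments and tests.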

    METHODS

    Locale::Maketext offers a variety of methods, which fall into three categories:

    • Methods to do with constructing language handles.

    • maketext and other methods to do with accessing %Lexicon data for a given language handle.

    • Methods that you may find it handy to use, from routines of yours that you put in %Lexicon entries.

    These are covered in the following sections.

    Construction Methods

    These are to do with constructing a language handle:

    • $lh = YourProjClass->get_handle( ...langtags... ) || die "lg-handle?";

      This tries loading classes based on the language-tags you give (like ("en-US", "sk", "kon", "es-MX", "ja", "i-klingon")), and for the first class that succeeds, returns YourProjClass::language->new().

      If it runs thru the entire given list of language-tags, and finds no classes for those exact terms, it then tries "superordinate" language classes. So if no "en-US" class (i.e., YourProjClass::en_us) was found, nor classes for anything else in that list, we then try its superordinate, "en" (i.e., YourProjClass::en), and so on thru the other language-tags in the given list: "es". (The other language-tags in our example list happen to have no superordinates.)

      If none of those language-tags leads to loadable classes, we then try classes derived from YourProjClass->fallback_languages() and then if nothing comes of that, we use classes named by YourProjClass->fallback_language_classes(). Then in the (probably quite unlikely) event that that fails, we just return undef.

    • $lh = YourProjClass->get_handle() || die "lg-handle?";

      When get_handle is called with an empty parameter list, magic happens:

      If get_handle senses that it's running in a program that was invoked as a CGI, then it tries to get language-tags out of the environment variable "HTTP_ACCEPT_LANGUAGE", and it pretends that those were the languages passed as parameters to get_handle .

      Otherwise (i.e., if not a CGI), this tries various OS-specific ways to get the language-tags for the current locale/language, and then pretends that those were the value(s) passed to get_handle .

      Currently this OS-specific stuff consists of looking in the environment variables "LANG" and "LANGUAGE"; and on MSWin machines (where those variables are typically unused), this also tries using the module Win32::Locale to get a language-tag for whatever language/locale is currently selected in the "Regional Settings" (or "International"?) Control Panel. I welcome further suggestions for making this do the Right Thing under other operating systems that support localization.

      If you're using localization in an application that keeps a configuration file, you might consider something like this in your project class:

          sub get_handle_via_config {
              my $class = $_[0];
              my $chosen_language = $Config_settings{'language'};
              my $lh;
              if($chosen_language) {
                  $lh = $class->get_handle($chosen_language)
                      || die "No language handle for \"$chosen_language\""
                           . " or the like";
              } else {
                  # Config file missing, maybe?
                  $lh = $class->get_handle()
                      || die "Can't get a language handle";
              }
              return $lh;
          }
    • $lh = YourProjClass::langname->new();

      This constructs a language handle. You usually don't call this directly, but instead let get_handle find a language class to use and to then call ->new on.

    • $lh->init();

      This is called by ->new to initialize newly-constructed language handles. If you define an init method in your class, remember that it's usually considered a good idea to call $lh->SUPER::init in it (presumably at the beginning), so that all classes get a chance to initialize a new object however they see fit.

    • YourProjClass->fallback_languages()

      get_handle appends the return value of this to the end of whatever list of languages you pass get_handle . Unless you override this method, your project class will inherit Locale::Maketext's fallback_languages , which currently returns ('i-default', 'en', 'en-US') . ("i-default" is defined in RFC 2277).

      This method (by having it return the name of a language-tag that has an existing language class) can be used for making sure that get_handle will always manage to construct a language handle (assuming your language classes are in an appropriate @INC directory). Or you can use the next method:

    • YourProjClass->fallback_language_classes()

      get_handle appends the return value of this to the end of the list of classes it will try using. Unless you override this method, your project class will inherit Locale::Maketext's fallback_language_classes , which currently returns an empty list, () . By setting this to some value (namely, the name of a loadable language class), you can be sure that get_handle will always manage to construct a language handle.
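
    The effect of overriding fallback_languages can be sketched as follows. The class names MyApp::L10N and MyApp::L10N::de, and the choice of German as the last resort, are hypothetical; the classes are defined inline only to keep the sketch self-contained.

```perl
use strict;
use warnings;

{
    package MyApp::L10N;               # hypothetical project class
    use parent 'Locale::Maketext';

    # Try German last if nothing the caller asked for can be loaded.
    sub fallback_languages { return ('de') }
}
{
    package MyApp::L10N::de;           # the only language class in this sketch
    use parent -norequire, 'MyApp::L10N';
    our %Lexicon = ( 'Hello' => 'Hallo' );
}

# No MyApp::L10N::ja class exists, so the handle comes from the fallback.
my $lh = MyApp::L10N->get_handle('ja')
    or die "Can't get a language handle";

print ref($lh), "\n";
print $lh->maketext('Hello'), "\n";
```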

    The "maketext" Method

    This is the most important method in Locale::Maketext:

        $text = $lh->maketext(key, ...parameters for this phrase...);

    This looks in the %Lexicon of the language handle $lh and all its superclasses, looking for an entry whose key is the string key. Assuming such an entry is found, various things then happen, depending on the value found:

    If the value is a scalarref, the scalar is dereferenced and returned (and any parameters are ignored).

    If the value is a coderef, we return &$value($lh, ...parameters...).

    If the value is a string that doesn't look like it's in Bracket Notation, we return it (after replacing it with a scalarref, in its %Lexicon).

    If the value does look like it's in Bracket Notation, then we compile it into a sub, replace the string in the %Lexicon with the new coderef, and then we return &$new_sub($lh, ...parameters...).

    Bracket Notation is discussed in a later section. Note that trying to compile a Bracket Notation string can throw an exception if the string is not syntactically valid (say, by not balancing brackets right).

    Also, calling &$coderef($lh, ...parameters...) can throw any sort of exception (if, say, code in that sub tries to divide by zero). But a very common exception occurs when you have Bracket Notation text that says to call a method "foo", but there is no such method. (E.g., "You have [quatn,_1,ball]." will throw an exception on trying to call $lh->quatn($_[1],'ball') -- you presumably meant "quant".) maketext catches these exceptions, but only to make the error message more readable, at which point it rethrows the exception.

    An exception may be thrown if key is not found in any of $lh's %Lexicon hashes. What happens if a key is not found, is discussed in a later section, "Controlling Lookup Failure".
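
    The kinds of lexicon value described above can be exercised with a small sketch like this one. The Demo::L10N classes and the lexicon entries are hypothetical, and the classes are defined inline so that the example stands alone.

```perl
use strict;
use warnings;

{
    package Demo::L10N;                # hypothetical project class
    use parent 'Locale::Maketext';
}
{
    package Demo::L10N::en;
    use parent -norequire, 'Demo::L10N';

    my $verbatim = 'Press [Enter] to continue';
    our %Lexicon = (
        'plain'   => 'Hello, world',   # plain string, no brackets
        'bracket' => 'Hello, [_1]',    # Bracket Notation, compiled on first use
        'literal' => \$verbatim,       # scalarref: dereferenced, never parsed
        'code'    => sub {             # coderef: called with the handle first
            my ($lh, @params) = @_;
            return join ' + ', @params;
        },
    );
}

my $lh = Demo::L10N->get_handle('en') or die "No language handle";

my @results = (
    $lh->maketext('plain'),
    $lh->maketext('bracket', 'Ann'),
    $lh->maketext('literal', 'ignored'),   # parameters are ignored here
    $lh->maketext('code', 1, 2),
);
print "$_\n" for @results;
```

    The scalarref form is a handy way to store text containing literal square brackets without having to escape them.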

    Note that you might find it useful in some cases to override the maketext method with an "after method", if you want to translate encodings, or even scripts:

        package YrProj::zh_cn;         # Chinese with PRC-style glyphs
        use base ('YrProj::zh_tw');    # Taiwan-style
        sub maketext {
            my $self = shift(@_);
            # Call the inherited method; $self->maketext would recurse forever.
            my $value = $self->SUPER::maketext(@_);
            return Chineeze::taiwan2mainland($value);
        }

    Or you may want to override it with something that traps any exceptions, if that's critical to your program:

        sub maketext {
            my($lh, @stuff) = @_;
            my $out;
            eval { $out = $lh->SUPER::maketext(@stuff) };
            return $out unless $@;
            # ...otherwise deal with the exception...
        }

    Other than those two situations, I don't imagine that it's useful to override the maketext method. (If you run into a situation where it is useful, I'd be interested in hearing about it.)
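
    As a sketch of the exception-trapping override, the following project class catches the error raised by a misspelled method name in Bracket Notation. The Trap::L10N class names and the policy of returning a marked placeholder are assumptions for illustration; a real program might log the error or fall back to another language instead.

```perl
use strict;
use warnings;

{
    package Trap::L10N;                # hypothetical project class
    use parent 'Locale::Maketext';

    sub maketext {
        my ($lh, @stuff) = @_;
        my $out = eval { $lh->SUPER::maketext(@stuff) };
        return $out unless $@;
        # Assumed policy for this sketch: show a marked placeholder.
        return "[untranslatable: $stuff[0]]";
    }
}
{
    package Trap::L10N::en;
    use parent -norequire, 'Trap::L10N';
    our %Lexicon = (
        'good' => 'You have [quant,_1,ball].',
        'typo' => 'You have [quatn,_1,ball].',   # "quatn" is not a method
    );
}

my $lh = Trap::L10N->get_handle('en') or die "No language handle";

my $good = $lh->maketext('good', 3);   # normal result
my $typo = $lh->maketext('typo', 3);   # exception trapped by the override

print "$good\n$typo\n";
```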

    • $lh->fail_with or $lh->fail_with(PARAM)
    • $lh->failure_handler_auto

      These two methods are discussed in the section "Controlling Lookup Failure".

    Utility Methods

    These are methods that you may find it handy to use, generally from %Lexicon routines of yours (whether expressed as Bracket Notation or not).

    • $language->quant($number, $singular)
    • $language->quant($number, $singular, $plural)
    • $language->quant($number, $singular, $plural, $negative)

      This is generally meant to be called from inside Bracket Notation (which is discussed later), as in

          "Your search matched [quant,_1,document]!"

      It's for quantifying a noun (i.e., saying how much of it there is, while giving the correct form of it). The behavior of this method is handy for English and a few other Western European languages, and you should override it for languages where it's not suitable. You can feel free to read the source, but the current implementation is basically as this pseudocode describes:

          if $number is 0 and there's a $negative,
              return $negative;
          elsif $number is 1,
              return "1 $singular";
          elsif there's a $plural,
              return "$number $plural";
          else
              return "$number " . $singular . "s";
          #
          # ...except that we actually call numf to
          # stringify $number before returning it.

      So for English (with Bracket Notation) "...[quant,_1,file]..." is fine (for 0 it returns "0 files", for 1 it returns "1 file", and for more it returns "2 files", etc.)

      But for "directory", you'd want "[quant,_1,directory,directories]" so that our elementary quant method doesn't think that the plural of "directory" is "directorys". And you might find that the output may sound better if you specify a negative form, as in:

          "[quant,_1,file,files,No files] matched your query.\n"

      Remember to keep in mind verb agreement (or adjectives too, in other languages), as in:

          "[quant,_1,document] were matched.\n"

      Because if _1 is one, you get "1 document were matched". An acceptable hack here is to do something like this:

          "[quant,_1,document was, documents were] matched.\n"
    • $language->numf($number)

      This returns the given number formatted nicely according to this language's conventions. Maketext's default method is mostly to just take the normal string form of the number (applying sprintf "%G" for only very large numbers), and then to add commas as necessary. (Except that we apply tr/,./.,/ if $language->{'numf_comma'} is true; that's a bit of a hack that's useful for languages that express two million as "2.000.000" and not as "2,000,000").

      If you want anything fancier, consider overriding this with something that uses Number::Format, or does something else entirely.

      Note that numf is called by quant for stringifying all quantifying numbers.

    • $language->numerate($number, $singular, $plural, $negative)

      This returns the given noun form which is appropriate for the quantity $number according to this language's conventions. numerate is used internally by quant to quantify nouns. Use it directly -- usually from bracket notation -- to avoid quant 's implicit call to numf and output of a numeric quantity.

    • $language->sprintf($format, @items)

      This is just a wrapper around Perl's normal sprintf function. It's provided so that you can use "sprintf" in Bracket Notation:

          "Couldn't access datanode [sprintf,%s=~[%s~],_1,_2]!\n"

      returning, for the parameters "Stuff" and "thangamabob"...

          Couldn't access datanode Stuff=[thangamabob]!
    • $language->language_tag()

      Currently this just takes the last bit of ref($language), turns underscores to dashes, and returns it. So if $language is an object of class Hee::HOO::Haw::en_us, $language->language_tag() returns "en-us". (Yes, the usual representation for that language tag is "en-US", but case is never considered meaningful in language-tag comparison.)

      You may override this as you like; Maketext doesn't use it for anything.

    • $language->encoding()

      Currently this isn't used for anything, but it's provided (with a default value of (ref($language) && $language->{'encoding'}) or "iso-8859-1") as a sort of suggestion that it may be useful/necessary to associate encodings with your language handles (whether on a per-class or even per-handle basis).
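
    The utility methods above can be tried out together in a short sketch. The Util::L10N classes and lexicon keys are hypothetical; the phrases are adapted from the examples in this section.

```perl
use strict;
use warnings;

{
    package Util::L10N;                # hypothetical project class
    use parent 'Locale::Maketext';
}
{
    package Util::L10N::en_us;
    use parent -norequire, 'Util::L10N';
    our %Lexicon = (
        'matched' => '[quant,_1,file,files,No files] matched your query.',
        'dirs'    => 'Scanned [quant,_1,directory,directories].',
        'verb'    => '[_1] [numerate,_1,document was,documents were] matched.',
        'node'    => 'Node [sprintf,%s=~[%s~],_1,_2]',
    );
}

my $lh = Util::L10N->get_handle('en-US') or die "No language handle";

# quant with singular, plural and "negative" (zero) forms
my @matched = map { $lh->maketext('matched', $_) } (0, 1, 2);

my $dirs  = $lh->maketext('dirs', 3);        # explicit plural form
my $one   = $lh->maketext('verb', 1);        # numerate: noun form only
my $many  = $lh->maketext('verb', 4);
my $node  = $lh->maketext('node', 'Stuff', 'thangamabob');
my $big   = $lh->numf(2000000);              # commas added
my $tag   = $lh->language_tag;               # derived from the class name

print "$_\n" for @matched, $dirs, $one, $many, $node, $big, $tag;
```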

    Language Handle Attributes and Internals

    A language handle is a flyweight object -- i.e., it doesn't (necessarily) carry any data of interest, other than just being a member of whatever class it belongs to.

    A language handle is implemented as a blessed hash. Subclasses of yours can store whatever data you want in the hash. Currently the only hash entry used by any crucial Maketext method is "fail", so feel free to use anything else as you like.

    Remember: Don't be afraid to read the Maketext source if there's any point on which this documentation is unclear. This documentation is vastly longer than the module source itself.

    LANGUAGE CLASS HIERARCHIES

    These are Locale::Maketext's assumptions about the class hierarchy formed by all your language classes:

    • You must have a project base class, which you load, and which you then use as the first argument in the call to YourProjClass->get_handle(...). It should derive (whether directly or indirectly) from Locale::Maketext. It doesn't matter how you name this class, although assuming this is the localization component of your Super Mega Program, good names for your project class might be SuperMegaProgram::Localization, SuperMegaProgram::L10N, SuperMegaProgram::I18N, SuperMegaProgram::International, or even SuperMegaProgram::Languages or SuperMegaProgram::Messages.

    • Language classes are what YourProjClass->get_handle will try to load. It will look for them by taking each language-tag (skipping it if it doesn't look like a language-tag or locale-tag!), turning it to all lowercase, turning dashes to underscores, and appending it to YourProjClass . "::". So this:

          $lh = YourProjClass->get_handle(
              'en-US', 'fr', 'kon', 'i-klingon', 'i-klingon-romanized'
          );

      will try loading the classes YourProjClass::en_us (note lowercase!), YourProjClass::fr, YourProjClass::kon, YourProjClass::i_klingon and YourProjClass::i_klingon_romanized. (And it'll stop at the first one that actually loads.)

    • I assume that each language class derives (directly or indirectly) from your project class, and also defines its @ISA, its %Lexicon, or both. But I anticipate no dire consequences if these assumptions do not hold.

    • Language classes may derive from other language classes (although they should have "use Thatclassname" or "use base qw(...classes...)"). They may derive from the project class. They may derive from some other class altogether. Or, via multiple inheritance, they may derive from any mixture of these.

    • I foresee no problems with having multiple inheritance in your hierarchy of language classes. (As usual, however, Perl will complain bitterly if you have a cycle in the hierarchy: i.e., if any class is its own ancestor.)
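
    The mapping from language-tags to class names, and the retreat to a superordinate tag, can be seen in a sketch like the following. The Hier::L10N classes are hypothetical and are defined inline so that nothing needs to be loaded from disk.

```perl
use strict;
use warnings;

{
    package Hier::L10N;                # hypothetical project class
    use parent 'Locale::Maketext';
}
{
    package Hier::L10N::en;
    use parent -norequire, 'Hier::L10N';
    our %Lexicon = ( 'Hi' => 'Hi' );
}
{
    package Hier::L10N::pt_br;         # note: lowercase, underscore for dash
    use parent -norequire, 'Hier::L10N';
    our %Lexicon = ( 'Hi' => 'Oi' );
}

# "pt-BR" is lowercased and its dash becomes an underscore.
my $exact = Hier::L10N->get_handle('pt-BR') or die "No handle for pt-BR";

# There is no Hier::L10N::en_us, so the superordinate "en" is used.
my $super = Hier::L10N->get_handle('en-US') or die "No handle for en-US";

print ref($exact), "\n", ref($super), "\n";
```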

    ENTRIES IN EACH LEXICON

    A typical %Lexicon entry is meant to signify a phrase, taking some number (0 or more) of parameters. An entry is meant to be accessed via a string key in $lh->maketext(key, ...parameters...), which should return a string that is generally meant to be used for "output" to the user -- regardless of whether this actually means printing to STDOUT, writing to a file, or putting into a GUI widget.

    While the key must be a string value (since that's a basic restriction that Perl places on hash keys), the value in the lexicon can currently be of several types: a defined scalar, scalarref, or coderef. The use of these is explained above, in the section 'The "maketext" Method', and Bracket Notation for strings is discussed in the next section.

    While you can use arbitrary unique IDs for lexicon keys (like "_min_larger_max_error"), it is often useful if an entry's key is itself a valid value, like this example error message:

        "Minimum ([_1]) is larger than maximum ([_2])!\n",

    Compare this code that uses an arbitrary ID...

        die $lh->maketext( "_min_larger_max_error", $min, $max )
            if $min > $max;

    ...to this code that uses a key-as-value:

        die $lh->maketext(
            "Minimum ([_1]) is larger than maximum ([_2])!\n",
            $min, $max
        ) if $min > $max;

    The second is, in short, more readable. In particular, it's obvious that the number of parameters you're feeding to that phrase (two) is the number of parameters that it wants to be fed. (Since you see _1 and a _2 being used in the key there.)

    Also, once a project is otherwise complete and you start to localize it, you can scrape together all the various keys you use, and pass them to a translator; and then the translator's work will go faster if what he's presented is this:

        "Minimum ([_1]) is larger than maximum ([_2])!\n",
            => "", # fill in something here, Jacques!

    rather than this more cryptic mess:

        "_min_larger_max_error"
            => "", # fill in something here, Jacques

    I think that using keys that are themselves valid lexicon values makes the completed lexicon entries more readable:

        "Minimum ([_1]) is larger than maximum ([_2])!\n",
            => "Le minimum ([_1]) est plus grand que le maximum ([_2])!\n",

    Also, having valid values as keys becomes very useful if you set up an _AUTO lexicon. _AUTO lexicons are discussed in a later section.

    I almost always use keys that are themselves valid lexicon values. One notable exception is when the value is quite long. For example, to get the screenful of data that a command-line program might return when given an unknown switch, I often just use a brief, self-explanatory key such as "_USAGE_MESSAGE". At that point I then go and immediately define that lexicon entry in the ProjectClass::L10N::en lexicon (since English is always my "project language"):

        '_USAGE_MESSAGE' => <<'EOSTUFF',
        ...long long message...
        EOSTUFF

    and then I can use it as:

        getopt('oDI', \%opts) or die $lh->maketext('_USAGE_MESSAGE');

    Incidentally, note that each class's %Lexicon inherits-and-extends the lexicons in its superclasses. This is not because these are special hashes per se, but because you access them via the maketext method, which looks for entries across all the %Lexicon hashes in a language class and all its ancestor classes. (This is because the idea of "class data" isn't directly implemented in Perl, but is instead left to individual class-systems to implement as they see fit.)
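
    A minimal sketch of this inherit-and-extend behavior follows. The Inh::L10N classes and their entries are hypothetical: a British English class derives from a general English class and overrides a single entry.

```perl
use strict;
use warnings;

{
    package Inh::L10N;                 # hypothetical project class
    use parent 'Locale::Maketext';
}
{
    package Inh::L10N::en;
    use parent -norequire, 'Inh::L10N';
    our %Lexicon = (
        'Pick a color' => 'Pick a color',
        'Quit'         => 'Quit',
    );
}
{
    package Inh::L10N::en_gb;          # derives from the "en" language class
    use parent -norequire, 'Inh::L10N::en';
    our %Lexicon = (
        'Pick a color' => 'Pick a colour',   # overrides the inherited entry
    );
}

my $lh = Inh::L10N->get_handle('en-GB') or die "No language handle";

my $own       = $lh->maketext('Pick a color');   # found in en_gb's %Lexicon
my $inherited = $lh->maketext('Quit');           # found in en's %Lexicon

print "$own\n$inherited\n";
```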

    Note that you may have things stored in a lexicon besides just phrases for output: for example, if your program takes input from the keyboard, asking a "(Y/N)" question, you probably need to know what the equivalent of "Y[es]/N[o]" is in whatever language. You probably also need to know what the equivalents of the answers "y" and "n" are. You can store that information in the lexicon (say, under the keys "~answer_y" and "~answer_n", and the long forms as "~answer_yes" and "~answer_no", where "~" is just an ad-hoc character meant to indicate to programmers/translators that these are not phrases for output).

    Or instead of storing this in the language class's lexicon, you can (and, in some cases, really should) represent the same bit of knowledge as code in a method in the language class. (That leaves a tidy distinction between the lexicon as the things we know how to say, and the rest of the things in the language class as things that we know how to do.) Consider this example of a processor for responses to French "oui/non" questions:

        sub y_or_n {
            return undef unless defined $_[1] and length $_[1];
            my $answer = lc $_[1];    # smash case
            return 1 if $answer eq 'o' or $answer eq 'oui';
            return 0 if $answer eq 'n' or $answer eq 'non';
            return undef;
        }

    ...which you'd then call in a construct like this:

        my $response;
        until(defined $response) {
            print $lh->maketext("Open the pod bay door (y/n)? ");
            $response = $lh->y_or_n( get_input_from_keyboard_somehow() );
        }
        if($response) { $pod_bay_door->open() }
        else          { $pod_bay_door->leave_closed() }

    Other data worth storing in a lexicon might be things like filenames for language-targeted resources:

        ...
        "_main_splash_png"
            => "/styles/en_us/main_splash.png",
        "_main_splash_imagemap"
            => "/styles/en_us/main_splash.incl",
        "_general_graphics_path"
            => "/styles/en_us/",
        "_alert_sound"
            => "/styles/en_us/hey_there.wav",
        "_forward_icon"
            => "left_arrow.png",
        "_backward_icon"
            => "right_arrow.png",
        # In some other languages, left equals
        # BACKwards, and right is FOREwards.
        ...

    You might want to do the same thing for expressing key bindings or the like (since hardwiring "q" as the binding for the function that quits a screen/menu/program is useful only if your language happens to associate "q" with "quit"!)

    BRACKET NOTATION

    Bracket Notation is a crucial feature of Locale::Maketext. I mean Bracket Notation to provide a replacement for the use of sprintf formatting. Everything you do with Bracket Notation could be done with a sub block, but bracket notation is meant to be much more concise.

    Bracket Notation is like a miniature "template" system (in the sense of Text::Template, not in the sense of C++ templates), where normal text is passed thru basically as is, but text in special regions is specially interpreted. In Bracket Notation, you use square brackets ("[...]"), not curly braces ("{...}"), to note sections that are specially interpreted.

    For example, here all the areas that are taken literally are underlined with a "^", and all the in-bracket special regions are underlined with an X:

        "Minimum ([_1]) is larger than maximum ([_2])!\n",
         ^^^^^^^^^ XX ^^^^^^^^^^^^^^^^^^^^^^^^^^ XX ^^^^

    When that string is compiled from bracket notation into a real Perl sub, it's basically turned into:

        sub {
            my $lh = $_[0];
            my @params = @_;
            return join '',
                "Minimum (",
                ...some code here...
                ") is larger than maximum (",
                ...some code here...
                ")!\n",
        }
        # to be called by $lh->maketext(KEY, params...)

    In other words, text outside bracket groups is turned into string literals. Text in brackets is rather more complex, and currently follows these rules:

    • Bracket groups that are empty, or which consist only of whitespace, are ignored. (Examples: "[]", "[ ]", or a [ and a ] with returns and/or tabs and/or spaces between them.)

      Otherwise, each group is taken to be a comma-separated group of items, and each item is interpreted as follows:

    • An item that is "_digits" or "_-digits" is interpreted as $_[value]. I.e., "_1" becomes $_[1], and "_-3" is interpreted as $_[-3] (in which case @_ should have at least three elements in it). Note that $_[0] is the language handle, and is typically not named directly.

    • An item "_*" is interpreted to mean "all of @_ except $_[0]". I.e., @_[1..$#_] . Note that this is an empty list in the case of calls like $lh->maketext(key) where there are no parameters (except $_[0], the language handle).

    • Otherwise, each item is interpreted as a string literal.

    The group as a whole is interpreted as follows:

    • If the first item in a bracket group looks like a method name, then that group is interpreted like this:

          $lh->that_method_name(
              ...rest of items in this group...
          ),

    • If the first item in a bracket group is "*", it's taken as shorthand for the so commonly called "quant" method. Similarly, if the first item in a bracket group is "#", it's taken to be shorthand for "numf".

    • If the first item in a bracket group is the empty-string, or "_*" or "_digits" or "_-digits", then that group is interpreted as just the interpolation of all its items:

          join('',
              ...rest of items in this group...
          ),

      Examples: "[_1]" and "[,_1]", which are synonymous; and "[,ID-(,_4,-,_2,)] ", which compiles as join "", "ID-(", $_[4], "-", $_[2], ")" .

    • Otherwise this bracket group is invalid. For example, in the group "[!@#,whatever]", the first item "!@#" is neither the empty-string, "_number", "_-number", "_*", nor a valid method name; and so Locale::Maketext will throw an exception if you try compiling an expression containing this bracket group.
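
    The rules above can be tried out with a small script. This is only a sketch: the class names BracketDemo::L10N and BracketDemo::L10N::en are made up for the illustration, and the English class is an _AUTO lexicon (described later in this document) so that each key doubles as its own value.

```perl
use strict;
use warnings;
use Locale::Maketext;

# Hypothetical project base class (name invented for this sketch).
{
    package BracketDemo::L10N;
    our @ISA = ('Locale::Maketext');
}

# Hypothetical English class; _AUTO lets every key act as its own value.
{
    package BracketDemo::L10N::en;
    our @ISA     = ('BracketDemo::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

my $lh = BracketDemo::L10N->get_handle('en') or die "No language handle";

# "_1" and "_2" are replaced by the first and second parameters.
my $min = $lh->maketext("Minimum ([_1]) is larger than maximum ([_2])!\n", 9, 4);

# A method name as the first item: this calls $lh->quant(3, "file").
my $many = $lh->maketext('[quant,_1,file] found', 3);

# "*" is shorthand for "quant".
my $one = $lh->maketext('[*,_1,file] found', 1);

# An empty first item means "just join the remaining items".
my $id = $lh->maketext('[,ID-(,_4,-,_2,)]', 'a', 'b', 'c', 'd');

print $min;
print "$many\n";   # 3 files found
print "$one\n";    # 1 file found
print "$id\n";     # ID-(d-b)
```

    The default quant method forms the plural by appending "s", which is why a single headword ("file") is enough here.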

    Note, incidentally, that items in each group are comma-separated, not /\s*,\s*/ -separated. That is, you might expect that this bracket group:

        "Hoohah [foo, _1 , bar ,baz]!"

    would compile to this:

        sub {
            my $lh = $_[0];
            return join '',
                "Hoohah ",
                $lh->foo( $_[1], "bar", "baz"),
                "!",
        }

    But it actually compiles as this:

        sub {
            my $lh = $_[0];
            return join '',
                "Hoohah ",
                $lh->foo(" _1 ", " bar ", "baz"), # note the <space> in " bar "
                "!",
        }
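
    To see the effect of that stray whitespace, here is a small sketch. The classes SpaceDemo::L10N and SpaceDemo::L10N::en and the method foo are hypothetical; foo simply joins its arguments with "|" so that each item is visible.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package SpaceDemo::L10N;   # hypothetical base class
    our @ISA = ('Locale::Maketext');

    # Hypothetical method for the demo: shows each item it receives.
    sub foo {
        my ($lh, @items) = @_;
        return join '|', @items;
    }
}

{
    package SpaceDemo::L10N::en;   # hypothetical language class
    our @ISA     = ('SpaceDemo::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

my $lh = SpaceDemo::L10N->get_handle('en') or die "No language handle";

# Spaces around the commas become part of the items, so " _1 " is a
# plain string literal rather than a reference to the first parameter.
my $spaced = $lh->maketext('Hoohah [foo, _1 , bar ,baz]!', 'X');

# Without the spaces, "_1" is recognized and replaced by 'X'.
my $tight = $lh->maketext('Hoohah [foo,_1,bar,baz]!', 'X');

print "$spaced\n";
print "$tight\n";   # Hoohah X|bar|baz!
```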

    In the notation discussed so far, the characters "[" and "]" are given special meaning, for opening and closing bracket groups, and "," has a special meaning inside bracket groups, where it separates items in the group. This raises the question of how you'd express a literal "[" or "]" in a Bracket Notation string, and how you'd express a literal comma inside a bracket group. For this purpose I've adopted "~" (tilde) as an escape character: "~[" means a literal '[' character anywhere in Bracket Notation (i.e., regardless of whether you're in a bracket group or not), and ditto for "~]" meaning a literal ']', and "~," meaning a literal comma. (Altho "," means a literal comma outside of bracket groups -- it's only inside bracket groups that commas are special.)

    And on the off chance you need a literal tilde in a bracket expression, you get it with "~~".

    Currently, an unescaped "~" before a character other than a bracket or a comma is taken to mean just a "~" and that character. I.e., "~X" means the same as "~~X" -- i.e., one literal tilde, and then one literal "X". However, by using "~X", you are assuming that no future version of Maketext will use "~X" as a magic escape sequence. In practice this is not a great problem, since first off you can just write "~~X" and not worry about it; second off, I doubt I'll add lots of new magic characters to bracket notation; and third off, you aren't likely to want literal "~" characters in your messages anyway, since it's not a character with wide use in natural language text.
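
    A short sketch of the escapes in use, assuming a hypothetical TildeDemo::L10N project with an _AUTO English lexicon (both names are invented for the example):

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package TildeDemo::L10N;       # hypothetical base class
    our @ISA = ('Locale::Maketext');
}

{
    package TildeDemo::L10N::en;   # hypothetical language class
    our @ISA     = ('TildeDemo::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

my $lh = TildeDemo::L10N->get_handle('en') or die "No language handle";

# "~[" and "~]" give literal brackets; "[_1]" is still a bracket group.
my $brackets = $lh->maketext('Press ~[Enter~] to continue, [_1].', 'Ann');

# "~~" gives one literal tilde.
my $tilde = $lh->maketext('Home is ~~/[_1]', 'notes');

# Inside a bracket group, "~," is a literal comma, not an item separator.
my $comma = $lh->maketext('Sorted by [,last~, first]');

print "$brackets\n";   # Press [Enter] to continue, Ann.
print "$tilde\n";      # Home is ~/notes
print "$comma\n";      # Sorted by last, first
```

    Single-quoted Perl strings are used for the keys so that nothing in them is interpolated by Perl before Maketext sees it.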

    Brackets must be balanced -- every openbracket must have one matching closebracket, and vice versa. So these are all invalid:

        "I ate [quant,_1,rhubarb pie."
        "I ate [quant,_1,rhubarb pie[."
        "I ate quant,_1,rhubarb pie]."
        "I ate quant,_1,rhubarb pie[."

    Currently, bracket groups do not nest. That is, you cannot say:

        "Foo [bar,baz,[quux,quuux]]\n";

    If you need a notation that's that powerful, use normal Perl:

        %Lexicon = (
            ...
            "some_key" => sub {
                my $lh = $_[0];
                join '',
                    "Foo ",
                    $lh->bar('baz', $lh->quux('quuux')),
                    "\n",
            },
            ...
        );

    Or write the "bar" method so you don't need to pass it the output from calling quux.

    I do not anticipate that you will need (or particularly want) to nest bracket groups, but you are welcome to email me with convincing (real-life) arguments to the contrary.

    AUTO LEXICONS

    If maketext goes to look in an individual %Lexicon for an entry for key (where key does not start with an underscore), and sees none, but does see an entry of "_AUTO" => some_true_value, then we actually define $Lexicon{key} = key right then and there, and then use that value as if it had been there all along. This happens before we even look in any superclass %Lexicons!

    (This is meant to be somewhat like the AUTOLOAD mechanism in Perl's function call system -- or, looked at another way, like the AutoLoader module.)

    I can picture all sorts of circumstances where you just do not want lookup to be able to fail (since failing normally means that maketext throws a die, although see the next section for greater control over that). But here's one circumstance where _AUTO lexicons are meant to be especially useful:

    As you're writing an application, you decide as you go what messages you need to emit. Normally you'd go to write this:

        if(-e $filename) {
            go_process_file($filename)
        } else {
            print qq{Couldn't find file "$filename"!\n};
        }

    but since you anticipate localizing this, you write:

        use ThisProject::I18N;
        my $lh = ThisProject::I18N->get_handle();
        # For the moment, assume that things are set up so
        # that we load class ThisProject::I18N::en
        # and that that's the class that $lh belongs to.
        ...
        if(-e $filename) {
            go_process_file($filename)
        } else {
            print $lh->maketext(
                qq{Couldn't find file "[_1]"!\n}, $filename
            );
        }

    Now, right after you've just written the above lines, you'd normally have to go open the file ThisProject/I18N/en.pm, and immediately add an entry:

        "Couldn't find file \"[_1]\"!\n"
            => "Couldn't find file \"[_1]\"!\n",

    But I consider that somewhat of a distraction from the work of getting the main code working -- to say nothing of the fact that I often have to play with the program a few times before I can decide exactly what wording I want in the messages (which in this case would require me to go changing three lines of code: the call to maketext with that key, and then the two lines in ThisProject/I18N/en.pm).

    However, if you set "_AUTO => 1" in the %Lexicon in ThisProject/I18N/en.pm (assuming that English (en) is the language that all your programmers will be using for this project's internal message keys), then you don't ever have to go adding lines like this

        "Couldn't find file \"[_1]\"!\n"
            => "Couldn't find file \"[_1]\"!\n",

    to ThisProject/I18N/en.pm, because if _AUTO is true there, then just looking for an entry with the key "Couldn't find file \"[_1]\"!\n" in that lexicon will cause it to be added, with that value!

    Note that the reason that keys that start with "_" are immune to _AUTO isn't anything generally magical about the underscore character -- I just wanted a way to have most lexicon keys be autoable, except for possibly a few, and I arbitrarily decided to use a leading underscore as a signal to distinguish those few.
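
    The following sketch shows an _AUTO lexicon filling itself in, and a key with a leading underscore being refused. The class names AutoDemo::L10N and AutoDemo::L10N::en are hypothetical, and the file name is an arbitrary placeholder.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package AutoDemo::L10N;       # hypothetical base class
    our @ISA = ('Locale::Maketext');
}

{
    package AutoDemo::L10N::en;   # hypothetical language class
    our @ISA     = ('AutoDemo::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

my $lh  = AutoDemo::L10N->get_handle('en') or die "No language handle";
my $key = q{Couldn't find file "[_1]"!};

# The key is not in the lexicon yet.
my $before = exists $AutoDemo::L10N::en::Lexicon{$key} ? 1 : 0;

# Looking it up creates the entry on the spot and uses it.
my $text = $lh->maketext($key, 'report.txt');

my $after = exists $AutoDemo::L10N::en::Lexicon{$key} ? 1 : 0;

# Keys starting with "_" are never auto-made, so this lookup fails
# and maketext throws an exception, which we capture here.
my $ok  = eval { $lh->maketext('_not_autoable'); 1 };
my $err = $ok ? '' : $@;

print "$text\n";
print "in lexicon before: $before, after: $after\n";
print "underscore key failed\n" if $err;
```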

    READONLY LEXICONS

    If your lexicon is a tied hash, the simple act of caching the compiled value can be fatal.

    For example, a GDBM_File GDBM_READER tied hash will die with something like:

        gdbm store returned -1, errno 2, key "..." at ...

    All you need to do is turn on caching outside of the lexicon hash itself like so:

        sub init {
            my ($lh) = @_;
            ...
            $lh->{'use_external_lex_cache'} = 1;
            ...
        }

    And then, instead of storing the compiled value in the lexicon hash, it will store it in $lh->{'_external_lex_cache'}.
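
    As a rough illustration, the sketch below uses an ordinary (untied) _AUTO lexicon and checks that, with the external cache switched on, the lexicon hash itself is left untouched. The class names CacheDemo::L10N and CacheDemo::L10N::en are hypothetical; a real read-only setup would tie %Lexicon to something like GDBM_File instead.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package CacheDemo::L10N;       # hypothetical base class
    our @ISA = ('Locale::Maketext');

    sub init {
        my ($lh) = @_;
        $lh->SUPER::init();
        # Keep compiled values in the handle, not in %Lexicon.
        $lh->{'use_external_lex_cache'} = 1;
        return;
    }
}

{
    package CacheDemo::L10N::en;   # hypothetical language class
    our @ISA     = ('CacheDemo::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

my $lh  = CacheDemo::L10N->get_handle('en') or die "No language handle";
my $key = '[quant,_1,file] found';

my $text = $lh->maketext($key, 2);

# The compiled form lives in the handle; the lexicon was not written to.
my $in_lexicon = exists $CacheDemo::L10N::en::Lexicon{$key}  ? 1 : 0;
my $in_cache   = exists $lh->{'_external_lex_cache'}{$key}   ? 1 : 0;

print "$text\n";   # 2 files found
print "stored in lexicon: $in_lexicon, stored in handle cache: $in_cache\n";
```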

    CONTROLLING LOOKUP FAILURE

    If you call $lh->maketext(key, ...parameters...), and there's no entry key in $lh's class's %Lexicon, nor in the superclass %Lexicon hash, and if we can't auto-make key (because either it starts with a "_", or because none of its lexicons have _AUTO => 1), then we have failed to find a normal way to maketext key. What then happens in these failure conditions depends on the $lh object's "fail" attribute.

    If the language handle has no "fail" attribute, maketext will simply throw an exception (i.e., it calls die, mentioning the key whose lookup failed, and naming the line number where the calling $lh->maketext(key,...) was).

    If the language handle has a "fail" attribute whose value is a coderef, then $lh->maketext(key,...params...) gives up and calls:

        return $that_subref->($lh, $key, @params);

    Otherwise, the "fail" attribute's value should be a string denoting a method name, so that $lh->maketext(key,...params...) can give up with:

        return $lh->$that_method_name($key, @params);

    The "fail" attribute can be accessed with the fail_with method:

        # Set to a coderef:
        $lh->fail_with( \&failure_handler );
        # Set to a method name:
        $lh->fail_with( 'failure_method' );
        # Set to nothing (i.e., so failure throws a plain exception)
        $lh->fail_with( undef );
        # Get the current value
        $handler = $lh->fail_with();

    Now, as to what you may want to do with these handlers: Maybe you'd want to log what key failed for what class, and then die. Maybe you don't like die and instead you want to send the error message to STDOUT (or wherever) and then merely exit().

    Or maybe you don't want to die at all! Maybe you could use a handler like this:

        # Make all lookups fall back onto an English value,
        # but only after we log it for later fingerpointing.
        my $lh_backup = ThisProject->get_handle('en');
        open(LEX_FAIL_LOG, ">>wherever/lex.log") || die "GNAARGH $!";
        sub lex_fail {
            # The handler receives the handle, the key, and then the
            # parameters as a list, so collect them into @params.
            my($failing_lh, $key, @params) = @_;
            print LEX_FAIL_LOG scalar(localtime), "\t",
                ref($failing_lh), "\t", $key, "\n";
            return $lh_backup->maketext($key,@params);
        }
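
    Here is a self-contained variation on that idea, offered as a sketch rather than a recommended design. The classes FailDemo::L10N, FailDemo::L10N::en and FailDemo::L10N::fr are hypothetical, and an in-memory array (@failed) stands in for the log file so the example has no side effects.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package FailDemo::L10N;       # hypothetical base class
    our @ISA = ('Locale::Maketext');
}

{
    package FailDemo::L10N::en;   # English: an _AUTO lexicon
    our @ISA     = ('FailDemo::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

{
    package FailDemo::L10N::fr;   # French: only one phrase translated
    our @ISA     = ('FailDemo::L10N');
    our %Lexicon = ( 'Hello, [_1]!' => 'Bonjour, [_1] !' );
}

my $en = FailDemo::L10N->get_handle('en') or die "No English handle";
my $fr = FailDemo::L10N->get_handle('fr') or die "No French handle";

# Record each failed lookup, then fall back to the English handle.
my @failed;
$fr->fail_with( sub {
    my ($failing_lh, $key, @params) = @_;
    push @failed, ref($failing_lh) . "\t" . $key;
    return $en->maketext($key, @params);
} );

my $known   = $fr->maketext('Hello, [_1]!',   'Ann');  # found in French
my $unknown = $fr->maketext('Goodbye, [_1]!', 'Ann');  # falls back

print "$known\n";     # Bonjour, Ann !
print "$unknown\n";   # Goodbye, Ann!
print "failed lookups: ", scalar(@failed), "\n";
```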

    Some users have expressed that they think this whole mechanism of having a "fail" attribute at all seems a rather pointless complication. But I want Locale::Maketext to be usable for software projects of any scale and type; and different software projects have different ideas of what the right thing is to do in failure conditions. I could simply say that failure always throws an exception, and that if you want to be careful, you'll just have to wrap every call to $lh->maketext in an eval { }. However, I want programmers to reserve the right (via the "fail" attribute) to treat lookup failure as something other than an exception of the same level of severity as a config file being unreadable, or some essential resource being inaccessible.

    One possibly useful value for the "fail" attribute is the method name "failure_handler_auto". This is a method defined in the class Locale::Maketext itself. You set it with:

        $lh->fail_with('failure_handler_auto');

    Then when you call $lh->maketext(key, ...parameters...) and there's no key in any of those lexicons, maketext gives up with

        return $lh->failure_handler_auto($key, @params);

    But failure_handler_auto, instead of dying or anything, compiles $key, caching it in

        $lh->{'failure_lex'}{$key} = $compiled

    and then calls the compiled value, and returns that. (I.e., if $key looks like bracket notation, $compiled is a sub, and we return &{$compiled}(@params); but if $key is just a plain string, we just return that.)

    The effect of using "failure_handler_auto" is like an AUTO lexicon, except that 1) it compiles $key even if it starts with "_", and 2) you have a record in the new hashref $lh->{'failure_lex'} of all the keys that have failed for this object. This should avoid your program dying -- as long as your keys aren't actually invalid as bracket code, and as long as they don't try calling methods that don't exist.

    "failure_handler_auto" may not be exactly what you want, but I hope it at least shows you that maketext failure can be mitigated in any number of very flexible ways. If you can formalize exactly what you want, you should be able to express that as a failure handler. You can even make it default for every object of a given class, by setting it in that class's init:
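
    A minimal sketch of that behavior, assuming hypothetical classes AutoFailDemo::L10N and AutoFailDemo::L10N::en whose lexicon is deliberately not an _AUTO lexicon:

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package AutoFailDemo::L10N;       # hypothetical base class
    our @ISA = ('Locale::Maketext');
}

{
    package AutoFailDemo::L10N::en;   # hypothetical class, no _AUTO
    our @ISA     = ('AutoFailDemo::L10N');
    our %Lexicon = ( 'Hello' => 'Hello' );
}

my $lh = AutoFailDemo::L10N->get_handle('en') or die "No language handle";
$lh->fail_with('failure_handler_auto');

# Neither key is in the lexicon, so both go through the failure handler,
# which compiles the key itself and uses the result.
my $plain      = $lh->maketext('You have [quant,_1,message]', 2);
my $underscore = $lh->maketext('_odd key for [_1]', 'x');

# Every key that failed is recorded on the handle.
my @failed_keys = sort keys %{ $lh->{'failure_lex'} };

print "$plain\n";        # You have 2 messages
print "$underscore\n";   # _odd key for x
print "failed: $_\n" for @failed_keys;
```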

        sub init {
            my $lh = $_[0]; # a newborn handle
            $lh->SUPER::init();
            $lh->fail_with('my_clever_failure_handler');
            return;
        }
        sub my_clever_failure_handler {
            ...your clever things here...
        }

    HOW TO USE MAKETEXT

    Here is a brief checklist on how to use Maketext to localize applications:

    • Decide what system you'll use for lexicon keys. If you insist, you can use opaque IDs (if you're nostalgic for catgets), but I have better suggestions in the section "Entries in Each Lexicon", above. Assuming you opt for meaningful keys that double as values (like "Minimum ([_1]) is larger than maximum ([_2])!\n"), you'll have to settle on what language those should be in. For the sake of argument, I'll call this English, specifically American English, "en-US".

    • Create a class for your localization project. This is the name of the class that you'll use in the idiom:

          use Projname::L10N;
          my $lh = Projname::L10N->get_handle(...) || die "Language?";

      Assuming you call your class Projname::L10N, create a class consisting minimally of:

          package Projname::L10N;
          use base qw(Locale::Maketext);
          ...any methods you might want all your languages to share...
          # And, assuming you want the base class to be an _AUTO lexicon,
          # as is discussed a few sections up:
          1;

    • Create a class for the language your internal keys are in. Name the class after the language-tag for that language, in lowercase, with dashes changed to underscores. Assuming your project's first language is US English, you should call this Projname::L10N::en_us. It should consist minimally of:

          package Projname::L10N::en_us;
          use base qw(Projname::L10N);
          %Lexicon = (
              '_AUTO' => 1,
          );
          1;

      (For the rest of this section, I'll assume that this "first language class" of Projname::L10N::en_us has an _AUTO lexicon.)

    • Go and write your program. Everywhere in your program where you would say:

          print "Foobar $thing stuff\n";

      instead do it thru maketext, using no variable interpolation in the key:

          print $lh->maketext("Foobar [_1] stuff\n", $thing);

      If you get tired of constantly saying print $lh->maketext, consider making a functional wrapper for it, like so:

          use Projname::L10N;
          use vars qw($lh);
          $lh = Projname::L10N->get_handle(...) || die "Language?";
          sub pmt (@) { print( $lh->maketext(@_)) }
          # "pmt" is short for "Print MakeText"
          $Carp::Verbose = 1;
          # so if maketext fails, we see what made the call to pmt
      Besides whole phrases meant for output, anything language-dependent should be put into the class Projname::L10N::en_us, whether as methods, or as lexicon entries -- this is discussed in the section "Entries in Each Lexicon", above.

    • Once the program is otherwise done, and once its localization for the first language works right (via the data and methods in Projname::L10N::en_us), you can get together the data for translation. If your first language lexicon isn't an _AUTO lexicon, then you already have all the messages explicitly in the lexicon (or else you'd be getting exceptions thrown when you call $lh->maketext to get messages that aren't in there). But if you were (advisedly) lazy and are using an _AUTO lexicon, then you've got to make a list of all the phrases that you've so far been letting _AUTO generate for you. There are very many ways to assemble such a list. The most straightforward is to simply grep the source for every occurrence of "maketext" (or calls to wrappers around it, like the above pmt function), and to log the following phrase.

    • You may at this point want to consider whether your base class (Projname::L10N), from which all lexicons inherit (Projname::L10N::en, Projname::L10N::es, etc.), should be an _AUTO lexicon. It may be true that in theory, all needed messages will be in each language class; but in the presumably unlikely or "impossible" case of lookup failure, you should consider whether your program should throw an exception, emit text in English (or whatever your project's first language is), or use some more complex solution as described in the section "Controlling Lookup Failure", above.

    • Submit all messages/phrases/etc. to translators.

      (You may, in fact, want to start with localizing to one other language at first, if you're not sure that you've properly abstracted the language-dependent parts of your code.)

      Translators may request clarification of the situation in which a particular phrase is found. For example, in English we are entirely happy saying "n files found", regardless of whether we mean "I looked for files, and found n of them" or the rather distinct situation of "I looked for something else (like lines in files), and along the way I saw n files." This may involve rethinking things that you thought quite clear: should "Edit" on a toolbar be a noun ("editing") or a verb ("to edit")? Is there already a conventionalized way to express that menu option, separate from the target language's normal word for "to edit"?

      In all cases where the very common phenomenon of quantification (saying "N files", for any value of N) is involved, each translator should make clear what dependencies the number causes in the sentence. In many cases, dependency is limited to words adjacent to the number, in places where you might expect them ("I found the-?PLURAL N empty-?PLURAL directory-?PLURAL"), but in some cases there are unexpected dependencies ("I found-?PLURAL ..."!) as well as long-distance dependencies ("The N directory-?PLURAL could not be deleted-?PLURAL"!).

      Remind the translators to consider the case where N is 0: "0 files found" isn't exactly natural-sounding in any language, but it may be unacceptable in many -- or it may condition special kinds of agreement (similar to English "I didN'T find ANY files").

      Remember to ask your translators about numeral formatting in their language, so that you can override the numf method as appropriate. Typical variables in number formatting are: what to use as a decimal point (comma? period?); what to use as a thousands separator (space? nonbreaking space? comma? period? small middot? prime? apostrophe?); and even whether the so-called "thousands separator" is actually for every third digit -- I've heard reports of two hundred thousand being expressible as "2,00,000" for some Indian (Subcontinental) languages, besides the less surprising "200 000", "200.000", "200,000", and "200'000". Also, using a set of numeral glyphs other than the usual ASCII "0"-"9" might be appreciated, as via tr/0-9/\x{0966}-\x{096F}/ for getting digits in Devanagari script (for Hindi, Konkani, others).

      The basic quant method that Locale::Maketext provides should be good for many languages. For some languages, it might be useful to modify it (or its constituent numerate method) to take a plural form in the two-argument call to quant (as in "[quant,_1,files]") if it's all-around easier to infer the singular form from the plural, than to infer the plural form from the singular.

      But for other languages (as is discussed at length in Locale::Maketext::TPJ13), simple quant/numf is not enough. For the particularly problematic Slavic languages, what you may need is a method which you provide with the number, the citation form of the noun to quantify, and the case and gender that the sentence's syntax projects onto that noun slot. The method would then be responsible for determining what grammatical number that numeral projects onto its noun phrase, and what case and gender it may override the normal case and gender with; and then it would look up the noun in a lexicon providing all needed inflected forms.

    • You may also wish to discuss with the translators the question of how to relate different subforms of the same language tag, considering how this interacts with get_handle's treatment of these. For example, if a user accepts interfaces in "en, fr", and you have interfaces available in "en-US" and "fr", what should they get? You may wish to resolve this by establishing that "en" and "en-US" are effectively synonymous, by having one class zero-derive from the other.

      For some languages this issue may never come up (Danish is rarely expressed as "da-DK", but instead is just "da"). And for other languages, the whole concept of a "generic" form may verge on being uselessly vague, particularly for interfaces involving voice media in forms of Arabic or Chinese.

    • Once you've localized your program/site/etc. for all desired languages, be sure to show the result (whether live, or via screenshots) to the translators. Once they approve, make every effort to have it then checked by at least one other speaker of that language. This holds true even when (or especially when) the translation is done by one of your own programmers. Some kinds of systems may be harder to find testers for than others, depending on the amount of domain-specific jargon and concepts involved -- it's easier to find people who can tell you whether they approve of your translation for "delete this message" in an email-via-Web interface, than to find people who can give you an informed opinion on your translation for "attribute value" in an XML query tool's interface.
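
    As one small illustration of the numeral-formatting point in the checklist above: the default numf method swaps "," and "." when the handle's numf_comma attribute is true, so a language class can ask for "2.000.000" style output in its init. The sketch below assumes hypothetical classes NumDemo::L10N, NumDemo::L10N::en and NumDemo::L10N::de; anything more elaborate (different digit grouping, other numeral glyphs) would need a real numf override.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package NumDemo::L10N;       # hypothetical base class
    our @ISA = ('Locale::Maketext');
}

{
    package NumDemo::L10N::en;   # hypothetical English class
    our @ISA     = ('NumDemo::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

{
    package NumDemo::L10N::de;   # hypothetical German class
    our @ISA     = ('NumDemo::L10N');
    our %Lexicon = ( _AUTO => 1 );

    sub init {
        my ($lh) = @_;
        $lh->SUPER::init();
        # Use "." for thousands and "," for decimals.
        $lh->{'numf_comma'} = 1;
        return;
    }
}

my $en = NumDemo::L10N->get_handle('en') or die "No English handle";
my $de = NumDemo::L10N->get_handle('de') or die "No German handle";

# "[numf,_1]" calls the numf method; "[#,_1]" is the shorthand form.
my $en_num = $en->maketext('[numf,_1]', 2000000);
my $de_num = $de->maketext('[#,_1]',    2000000);

print "$en_num\n";   # 2,000,000
print "$de_num\n";   # 2.000.000
```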

    SEE ALSO

    I recommend reading all of these:

    Locale::Maketext::TPJ13 -- my The Perl Journal article about Maketext. It explains many important concepts underlying Locale::Maketext's design, and gives some insight into why Maketext is better than the plain old approach of having message catalogs that are just databases of sprintf formats.

    File::Findgrep is a sample application/module that uses Locale::Maketext to localize its messages. For a larger internationalized system, see also Apache::MP3.

    I18N::LangTags.

    Win32::Locale.

    RFC 3066, Tags for the Identification of Languages, as at http://sunsite.dk/RFC/rfc/rfc3066.html

    RFC 2277, IETF Policy on Character Sets and Languages is at http://sunsite.dk/RFC/rfc/rfc2277.html -- much of it is just things of interest to protocol designers, but it explains some basic concepts, like the distinction between locales and language-tags.

    The manual for GNU gettext. The gettext dist is available in ftp://prep.ai.mit.edu/pub/gnu/ -- get a recent gettext tarball and look in its "doc/" directory, there's an easily browsable HTML version in there. The gettext documentation asks lots of questions worth thinking about, even if some of their answers are sometimes wonky, particularly where they start talking about pluralization.

    The Locale/Maketext.pm source. Observe that the module is much shorter than its documentation!

    COPYRIGHT AND DISCLAIMER

    Copyright (c) 1999-2004 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Sean M. Burke sburke@cpan.org

     
    Locale::Script

    NAME

    Locale::Script - standard codes for script identification

    SYNOPSIS

        use Locale::Script;
        $script  = code2script('phnx');                 # 'Phoenician'
        $code    = script2code('Phoenician');           # 'Phnx'
        $code    = script2code('Phoenician',
                               LOCALE_SCRIPT_NUMERIC);  # 115
        @codes   = all_script_codes();
        @scripts = all_script_names();

    DESCRIPTION

    The Locale::Script module provides access to standard codes used for identifying scripts, such as those defined in ISO 15924.

    Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default ISO 15924 four-letter codes will be used.

    SUPPORTED CODE SETS

    There are several different code sets you can use for identifying scripts. A code set may be specified using either a name, or a constant that is automatically exported by this module.

    For example, the following two calls are equivalent:

        $script = code2script('phnx','alpha');
        $script = code2script('phnx',LOCALE_SCRIPT_ALPHA);

    The codesets currently supported are:

    • alpha, LOCALE_SCRIPT_ALPHA

      This is a set of four-letter (capitalized) codes from ISO 15924 such as 'Phnx' for Phoenician. It also includes additions to this set included in the IANA language registry.

      The Zxxx, Zyyy, and Zzzz codes are not used.

      This is the default code set.

    • num, LOCALE_SCRIPT_NUMERIC

      This is a set of three-digit numeric codes from ISO 15924 such as 115 for Phoenician.

    ROUTINES

    • code2script ( CODE [,CODESET] )
    • script2code ( NAME [,CODESET] )
    • script_code2code ( CODE ,CODESET ,CODESET2 )
    • all_script_codes ( [CODESET] )
    • all_script_names ( [CODESET] )
    • Locale::Script::rename_script ( CODE ,NEW_NAME [,CODESET] )
    • Locale::Script::add_script ( CODE ,NAME [,CODESET] )
    • Locale::Script::delete_script ( CODE [,CODESET] )
    • Locale::Script::add_script_alias ( NAME ,NEW_NAME )
    • Locale::Script::delete_script_alias ( NAME )
    • Locale::Script::rename_script_code ( CODE ,NEW_CODE [,CODESET] )
    • Locale::Script::add_script_code_alias ( CODE ,NEW_CODE [,CODESET] )
    • Locale::Script::delete_script_code_alias ( CODE [,CODESET] )

      These routines are all documented in the Locale::Codes::API man page.

    SEE ALSO

    AUTHOR

    See Locale::Codes for full author history.

    Currently maintained by Sullivan Beck (sbeck@cpan.org).

    COPYRIGHT

    Copyright (c) 1997-2001 Canon Research Centre Europe (CRE).
    Copyright (c) 2001-2010 Neil Bowers
    Copyright (c) 2010-2013 Sullivan Beck

    This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Locale::Maketext::Guts

    NAME

    Locale::Maketext::Guts - Deprecated module to load Locale::Maketext utf8 code

    SYNOPSIS

        # Do this instead please
        use Locale::Maketext;

    DESCRIPTION

    Previously Locale::Maketext::GutsLoader performed some magic to load Locale::Maketext when utf8 was unavailable. The subs this module provided were merged back into Locale::Maketext.

     
    Locale::Maketext::GutsLoader

    NAME

    Locale::Maketext::GutsLoader - Deprecated module to load Locale::Maketext utf8 code

    SYNOPSIS

        # Do this instead please
        use Locale::Maketext;

    DESCRIPTION

    Previously Locale::Maketext::Guts performed some magic to load Locale::Maketext when utf8 was unavailable. The subs this module provided were merged back into Locale::Maketext.

     
    Locale::Maketext::Simple

    NAME

    Locale::Maketext::Simple - Simple interface to Locale::Maketext::Lexicon

    VERSION

    This document describes version 0.18 of Locale::Maketext::Simple, released September 8, 2006.

    SYNOPSIS

    Minimal setup (looks for auto/Foo/*.po and auto/Foo/*.mo):

        package Foo;
        use Locale::Maketext::Simple; # exports 'loc'
        loc_lang('fr'); # set language to French
        sub hello {
            print loc("Hello, [_1]!", "World");
        }

    More sophisticated example:

        package Foo::Bar;
        use Locale::Maketext::Simple (
            Class    => 'Foo',       # search in auto/Foo/
            Style    => 'gettext',   # %1 instead of [_1]
            Export   => 'maketext',  # maketext() instead of loc()
            Subclass => 'L10N',      # Foo::L10N instead of Foo::I18N
            Decode   => 1,           # decode entries to unicode-strings
            Encoding => 'locale',    # but encode lexicons in current locale
                                     # (needs Locale::Maketext::Lexicon 0.36)
        );
        sub japh {
            print maketext("Just another %1 hacker", "Perl");
        }

    DESCRIPTION

    This module is a simple wrapper around Locale::Maketext::Lexicon, designed to alleviate the need of creating Language Classes for module authors.

    The language used is chosen from the loc_lang call. If a lookup is not possible, the i-default language will be used. If the lookup is not in the i-default language, then the key will be returned.

    If Locale::Maketext::Lexicon is not present, it implements a minimal localization function by simply interpolating [_1] with the first argument, [_2] with the second, etc. Interpolated functions like [quant,_1] are treated as [_1], with the sole exception of [tense,_1,X], which appends ing to _1 when X is present, or ed to _1 otherwise.
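    The positional interpolation described above can be pictured with a small standalone sketch. This is not the module's actual code: interpolate() is a hypothetical helper written for illustration only, and it handles just the plain [_N] placeholders, not [quant,...] or [tense,...].

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Locale::Maketext::Simple):
# replace each [_N] with the N-th argument, counting from 1.
sub interpolate {
    my ($str, @args) = @_;
    $str =~ s{\[_(\d+)\]}{
        my $i = $1 - 1;
        ($i >= 0 && defined $args[$i]) ? $args[$i] : ''
    }ge;
    return $str;
}

print interpolate("Hello, [_1]!", "World"), "\n";
print interpolate("[_2] before [_1]", "first", "second"), "\n";
```

    Placeholders that refer to a missing argument are replaced with an empty string here; that choice is an assumption of the sketch, not documented behaviour of the module.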

    OPTIONS

    All options are passed either via the use statement, or via an explicit import.

    Class

    By default, Locale::Maketext::Simple draws its source from the calling package's auto/ directory; you can override this behaviour by explicitly specifying another package as Class .

    Path

    If your PO and MO files are under a path elsewhere than auto/ , you may specify it using the Path option.

    Style

    By default, this module uses the maketext style of [_1] and [quant,_1] for interpolation. Alternatively, you can specify the gettext style, which uses %1 and %quant(%1) for interpolation.

    This option is case-insensitive.

    Export

    By default, this module exports a single function, loc , into its caller's namespace. You can set it to another name, or set it to an empty string to disable exporting.

    Subclass

    By default, this module creates an ::I18N subclass under the caller's package (or the package specified by Class ), and stores lexicon data in its subclasses. You can assign a name other than I18N via this option.

    Decode

    If set to a true value, source entries will be converted into utf8-strings (available in Perl 5.6.1 or later). This feature needs the Encode or Encode::compat module.

    Encoding

    Specifies an encoding to store lexicon entries, instead of utf8-strings. If set to locale , the encoding from the current locale setting is used. Implies a true value for Decode .

    ACKNOWLEDGMENTS

    Thanks to Jos I. Boumans for suggesting this module to be written.

    Thanks to Chia-Liang Kao for suggesting Path and loc_lang .

    SEE ALSO

    Locale::Maketext, Locale::Maketext::Lexicon

    AUTHORS

    Audrey Tang <cpan@audreyt.org>

    COPYRIGHT

    Copyright 2003, 2004, 2005, 2006 by Audrey Tang <cpan@audreyt.org>.

    This software is released under the MIT license cited below. Additionally, when this software is distributed with Perl Kit, Version 5, you may also redistribute it and/or modify it under the same terms as Perl itself.

    The "MIT" License

    Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

    The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

     

    List::Util

    Perl 5 version 18.2 documentation

    NAME

    List::Util - A selection of general-utility list subroutines

    SYNOPSIS

        use List::Util qw(first max maxstr min minstr reduce shuffle sum);

    DESCRIPTION

    List::Util contains a selection of subroutines that people have said would be nice to have in the perl core, but whose usage would not be high enough to warrant a keyword, and whose size is too small to justify shipping each as an individual extension.

    By default List::Util does not export any subroutines. The subroutines defined are

    • first BLOCK LIST

      Similar to grep in that it evaluates BLOCK setting $_ to each element of LIST in turn. first returns the first element where the result from BLOCK is a true value. If BLOCK never returns true or LIST was empty then undef is returned.

          $foo = first { defined($_) } @list;   # first defined value in @list
          $foo = first { $_ > $value } @list;   # first value in @list which
                                                # is greater than $value

      This function could be implemented using reduce like this:

          $foo = reduce { defined($a) ? $a : wanted($b) ? $b : undef } undef, @list;

      For example, wanted() could be defined(), which would return the first defined value in @list.

    • max LIST

      Returns the entry in the list with the highest numerical value. If the list is empty then undef is returned.

          $foo = max 1..10;        # 10
          $foo = max 3,9,12;       # 12
          $foo = max @bar, @baz;   # whatever

      This function could be implemented using reduce like this:

          $foo = reduce { $a > $b ? $a : $b } 1..10;

    • maxstr LIST

      Similar to max , but treats all the entries in the list as strings and returns the highest string as defined by the gt operator. If the list is empty then undef is returned.

          $foo = maxstr 'A'..'Z';          # 'Z'
          $foo = maxstr "hello","world";   # "world"
          $foo = maxstr @bar, @baz;        # whatever

      This function could be implemented using reduce like this:

          $foo = reduce { $a gt $b ? $a : $b } 'A'..'Z';

    • min LIST

      Similar to max but returns the entry in the list with the lowest numerical value. If the list is empty then undef is returned.

          $foo = min 1..10;        # 1
          $foo = min 3,9,12;       # 3
          $foo = min @bar, @baz;   # whatever

      This function could be implemented using reduce like this:

          $foo = reduce { $a < $b ? $a : $b } 1..10;

    • minstr LIST

      Similar to min , but treats all the entries in the list as strings and returns the lowest string as defined by the lt operator. If the list is empty then undef is returned.

          $foo = minstr 'A'..'Z';          # 'A'
          $foo = minstr "hello","world";   # "hello"
          $foo = minstr @bar, @baz;        # whatever

      This function could be implemented using reduce like this:

          $foo = reduce { $a lt $b ? $a : $b } 'A'..'Z';

    • reduce BLOCK LIST

      Reduces LIST by calling BLOCK, in a scalar context, multiple times, setting $a and $b each time. The first call will be with $a and $b set to the first two elements of the list; subsequent calls will be done by setting $a to the result of the previous call and $b to the next element in the list.

      Returns the result of the last call to BLOCK. If LIST is empty then undef is returned. If LIST only contains one element then that element is returned and BLOCK is not executed.

          $foo = reduce { $a < $b ? $a : $b } 1..10;         # min
          $foo = reduce { $a lt $b ? $a : $b } 'aa'..'zz';   # minstr
          $foo = reduce { $a + $b } 1 .. 10;                 # sum
          $foo = reduce { $a . $b } @bar;                    # concat

      If your algorithm requires that reduce produce an identity value, then make sure that you always pass that identity value as the first argument to prevent undef being returned.

          $foo = reduce { $a + $b } 0, @values;   # sum with 0 identity value

    • shuffle LIST

      Returns the elements of LIST in a random order

          @cards = shuffle 0..51;   # 0..51 in a random order

    • sum LIST

      Returns the sum of all the elements in LIST. If LIST is empty then undef is returned.

          $foo = sum 1..10;        # 55
          $foo = sum 3,9,12;       # 24
          $foo = sum @bar, @baz;   # whatever

      This function could be implemented using reduce like this:

          $foo = reduce { $a + $b } 1..10;

      If your algorithm requires that sum produce an identity of 0, then make sure that you always pass 0 as the first argument to prevent undef being returned.

          $foo = sum 0, @values;

    • sum0 LIST

      Similar to sum , except this returns 0 when given an empty list, rather than undef.
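    As a brief illustration of the empty-list behaviour described in the entries above, the following sketch contrasts first, reduce with an explicit identity value, sum and sum0. The variable names and sample values are made up for this example, and sum0 assumes a List::Util recent enough to provide it.

```perl
use strict;
use warnings;
use List::Util qw(first reduce sum sum0);

my @values = (3, 9, 12);
my @empty  = ();

# first: returns the first element for which the block is true.
my $first_big = first { $_ > 5 } @values;

# reduce: $a is the running result, $b the next element.
# Passing the identity (1 for multiplication) first guards against undef.
my $product       = reduce { $a * $b } 1, @values;
my $empty_product = reduce { $a * $b } 1, @empty;

# sum returns undef for an empty list; sum0 returns 0.
my $maybe_undef = sum @empty;
my $zero        = sum0 @empty;

print "first > 5:     $first_big\n";
print "product:       $product\n";
print "empty product: $empty_product\n";
print "sum is ", (defined $maybe_undef ? $maybe_undef : 'undef'), ", sum0 is $zero\n";
```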

    KNOWN BUGS

    With perl versions prior to 5.005 there are some cases where reduce will return an incorrect result. This will show up as test 7 of reduce.t failing.

    SUGGESTED ADDITIONS

    The following are additions that have been requested, but I have been reluctant to add due to them being very simple to implement in perl

        # One argument is true
        sub any { $_ && return 1 for @_; 0 }

        # All arguments are true
        sub all { $_ || return 0 for @_; 1 }

        # All arguments are false
        sub none { $_ && return 0 for @_; 1 }

        # One argument is false
        sub notall { $_ || return 1 for @_; 0 }

        # How many elements are true
        sub true { scalar grep { $_ } @_ }

        # How many elements are false
        sub false { scalar grep { !$_ } @_ }

    SEE ALSO

    Scalar::Util, List::MoreUtils

    COPYRIGHT

    Copyright (c) 1997-2007 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    List::Util::XS

    Perl 5 version 18.2 documentation

    NAME

    List::Util::XS - Indicate if List::Util was compiled with a C compiler

    SYNOPSIS

        use List::Util::XS 1.20;

    DESCRIPTION

    List::Util::XS can be used as a dependency to ensure List::Util was installed using a C compiler and that the XS version is installed.

    During installation $List::Util::XS::VERSION will be set to undef if the XS was not compiled.

    Starting with release 1.23_03, Scalar-List-Util always uses the XS implementation, but for backwards compatibility we still ship the List::Util::XS module, which just loads List::Util.

    SEE ALSO

    Scalar::Util, List::Util, List::MoreUtils

    COPYRIGHT

    Copyright (c) 2008 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    IPC::Cmd

    Perl 5 version 18.2 documentation

    NAME

    IPC::Cmd - finding and running system commands made easy

    SYNOPSIS

        use IPC::Cmd qw[can_run run run_forked];

        my $full_path = can_run('wget') or warn 'wget is not installed!';

        ### commands can be arrayrefs or strings ###
        my $cmd = "$full_path -b theregister.co.uk";
        $cmd    = [$full_path, '-b', 'theregister.co.uk'];

        ### in scalar context ###
        my $buffer;
        if( scalar run( command => $cmd,
                        verbose => 0,
                        buffer  => \$buffer,
                        timeout => 20 )
        ) {
            print "fetched webpage successfully: $buffer\n";
        }

        ### in list context ###
        my( $success, $error_message, $full_buf, $stdout_buf, $stderr_buf ) =
                run( command => $cmd, verbose => 0 );

        if( $success ) {
            print "this is what the command printed:\n";
            print join "", @$full_buf;
        }

        ### check for features
        print "IPC::Open3 available: " . IPC::Cmd->can_use_ipc_open3;
        print "IPC::Run available: "   . IPC::Cmd->can_use_ipc_run;
        print "Can capture buffer: "   . IPC::Cmd->can_capture_buffer;

        ### don't have IPC::Cmd be verbose, ie don't print to stdout or
        ### stderr when running commands -- default is '0'
        $IPC::Cmd::VERBOSE = 0;

    DESCRIPTION

    IPC::Cmd allows you to run commands platform independently, interactively if desired, but have them still work.

    The can_run function can tell you if a certain binary is installed and if so where, whereas the run function can actually execute any of the commands you give it and give you a clear return value, as well as adhere to your verbosity settings.

    CLASS METHODS

    $ipc_run_version = IPC::Cmd->can_use_ipc_run( [VERBOSE] )

    Utility function that tells you if IPC::Run is available. If the verbose flag is passed, it will print diagnostic messages if IPC::Run can not be found or loaded.

    $ipc_open3_version = IPC::Cmd->can_use_ipc_open3( [VERBOSE] )

    Utility function that tells you if IPC::Open3 is available. If the verbose flag is passed, it will print diagnostic messages if IPC::Open3 can not be found or loaded.

    $bool = IPC::Cmd->can_capture_buffer

    Utility function that tells you if IPC::Cmd is capable of capturing buffers in its current configuration.

    $bool = IPC::Cmd->can_use_run_forked

    Utility function that tells you if IPC::Cmd is capable of providing run_forked on the current platform.

    FUNCTIONS

    $path = can_run( PROGRAM );

    can_run takes only one argument: the name of a binary you wish to locate. can_run works much like the unix binary which or the bash command type, both of which scan through your path looking for the requested binary.

    Unlike which and type , this function is platform independent and will also work on, for example, Win32.

    If called in a scalar context it will return the full path to the binary you asked for if it was found, or undef if it was not.

    If called in a list context and the global variable $INSTANCES is a true value, it will return a list of the full paths to all instances of the binary found in PATH, or an empty list if none was found.
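    The two calling contexts can be sketched as follows. The program name no-such-program-xyzzy is invented to show the failure case, and whether perl itself is found depends on your PATH, so no particular output is assumed.

```perl
use strict;
use warnings;
use IPC::Cmd qw(can_run);

# Scalar context: full path of the first match, or undef.
my $perl_path = can_run('perl');
print defined $perl_path ? "perl found at $perl_path\n" : "perl not in PATH\n";

# A made-up program name, used only to show the "not found" result.
my $missing = can_run('no-such-program-xyzzy');
print "missing program gives undef\n" unless defined $missing;

# List context with $IPC::Cmd::INSTANCES set: every match found in PATH.
my @all_perls = do {
    local $IPC::Cmd::INSTANCES = 1;
    can_run('perl');
};
print "  $_\n" for @all_perls;
```

    Using local keeps the change to $IPC::Cmd::INSTANCES confined to the enclosing block, so other callers of can_run are not affected.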

    $ok | ($ok, $err, $full_buf, $stdout_buff, $stderr_buff) = run( command => COMMAND, [verbose => BOOL, buffer => \$SCALAR, timeout => DIGIT] );

    run takes 4 arguments:

    • command

      This is the command to execute. It may be either a string or an array reference. This is a required argument.

      See Caveats for remarks on how commands are parsed and their limitations.

    • verbose

      This controls whether all output of a command should also be printed to STDOUT/STDERR or should only be trapped in buffers (NOTE: buffers require IPC::Run to be installed, or your system able to work with IPC::Open3).

      It will default to the global setting of $IPC::Cmd::VERBOSE , which by default is 0.

    • buffer

      This will hold all the output of a command. It needs to be a reference to a scalar. Note that this will hold both the STDOUT and STDERR messages, and you have no way of telling which is which. If you require this distinction, run the run command in list context and inspect the individual buffers.

      Of course, this requires that the underlying call supports buffers. See the note on buffers above.

    • timeout

      Sets the maximum time the command is allowed to run before aborting, using the built-in alarm() call. If the timeout is triggered, the errorcode in the return value will be set to an object of the IPC::Cmd::TimeOut class. See the error message section below for details.

      Defaults to 0 , meaning no timeout is set.

    run will return a simple true or false when called in scalar context. In list context, you will be returned a list of the following items:

    • success

      A simple boolean indicating if the command executed without errors or not.

    • error message

      If the first element of the return value (success ) was 0, then some error occurred. This second element is the error message the command you requested exited with, if available. This is generally a pretty printed value of $? or $@ . See perldoc perlvar for details on what they can contain. If the error was a timeout, the error message will be prefixed with the string IPC::Cmd::TimeOut , the timeout class.

    • full_buffer

      This is an array reference containing all the output the command generated. Note that buffers are only available if you have IPC::Run installed, or if your system is able to work with IPC::Open3 (see below). Otherwise, this element will be undef.

    • out_buffer

      This is an array reference containing all the output the command sent to STDOUT. The notes from full_buffer apply.

    • error_buffer

      This is an array reference containing all the output the command sent to STDERR. The notes from full_buffer apply.

    See the HOW IT WORKS section below to see how IPC::Cmd decides what modules or function calls to use when issuing a command.
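    A minimal sketch of calling run in list context and inspecting the separate buffers is shown below. The command is only a stand-in: it re-invokes the running perl (via $^X) so the example does not depend on any external program, and the 30-second timeout is an arbitrary choice.

```perl
use strict;
use warnings;
use IPC::Cmd qw(run);

# Stand-in command: the current perl interpreter printing a number.
my $cmd = [ $^X, '-e', 'print 42' ];

my ($ok, $err, $full_buf, $stdout_buf, $stderr_buf) =
    run( command => $cmd, verbose => 0, timeout => 30 );

if ($ok) {
    # Buffers are array references, or undef when capturing is unsupported.
    my $stdout = $stdout_buf ? join('', @$stdout_buf) : '';
    print "child printed: $stdout\n";
}
else {
    warn "command failed: $err\n";
}
```

    Passing the command as an array reference sidesteps the whitespace-splitting issues discussed under Caveats.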

    $hashref = run_forked( COMMAND, { child_stdin => SCALAR, timeout => DIGIT, stdout_handler => CODEREF, stderr_handler => CODEREF} );

    run_forked is used to execute some program or a coderef, optionally feed it with some input, and get its return code and output (stdout and stderr in separate buffers). In addition, it allows you to terminate the program if it takes too long to finish.

    The important and distinguishing feature of run_forked is its execution timeout. This seems a simple task at first, but the program you are spawning might spawn children itself (which in turn could do the same, and so on), which makes it considerably harder.

    run_forked is designed to survive and successfully terminate almost any long running task, even a fork bomb, provided your system has the resources to survive during the given timeout.

    This is achieved by creating a separate watchdog process which spawns the specified program in a separate process session and supervises it: it optionally feeds it with input, stores its exit code, stdout and stderr, and terminates it if it runs longer than specified.

    Invocation requires the command to be executed or a coderef and optionally a hashref of options:

    • timeout

      Specify in seconds how long to run the command before it is killed with SIGKILL (9), which effectively terminates it and all of its children (direct or indirect).

    • child_stdin

      Specify some text that will be passed into the STDIN of the executed program.

    • stdout_handler

      Coderef of a subroutine to call when a portion of data is received on STDOUT from the executing program.

    • stderr_handler

      Coderef of a subroutine to call when a portion of data is received on STDERR from the executing program.

    • discard_output

      Discards the buffering of the standard output and standard errors for return by run_forked(). With this option you have to use the std*_handlers to read what the command outputs. Useful for commands that send a lot of output.

    • terminate_on_parent_sudden_death

      Enable this option if you wish all spawned processes to be killed if the initially spawned process (the parent) is killed or dies without waiting for child processes.

    run_forked will return a HASHREF with the following keys:

    • exit_code

      The exit code of the executed program.

    • timeout

      The number of seconds the program ran for before being terminated, or 0 if no timeout occurred.

    • stdout

      Holds the standard output of the executed command (or empty string if there was no STDOUT output or if discard_output was used; it's always defined!)

    • stderr

      Holds the standard error of the executed command (or empty string if there was no STDERR output or if discard_output was used; it's always defined!)

    • merged

      Holds the standard output and error of the executed command merged into one stream (or empty string if there was no output at all or if discard_output was used; it's always defined!)

    • err_msg

      Holds some explanation in the case of an error.
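    The options and result keys above fit together roughly as in the following sketch. The command string is a stand-in that re-invokes the running perl, and it assumes the interpreter path contains no spaces or shell metacharacters; the 10-second timeout is arbitrary.

```perl
use strict;
use warnings;
use IPC::Cmd qw(run_forked);

my $result;
if ( IPC::Cmd->can_use_run_forked ) {
    # Stand-in command; assumes the path in $^X needs no shell quoting.
    $result = run_forked( qq{$^X -e "print 42"}, { timeout => 10 } );

    if ( $result->{timeout} ) {
        warn "killed after $result->{timeout} seconds\n";
    }
    elsif ( $result->{exit_code} != 0 ) {
        my $msg = defined $result->{err_msg} ? $result->{err_msg} : '';
        warn "exit code $result->{exit_code}: $msg\n";
    }
    else {
        print "stdout: $result->{stdout}\n";
    }
}
else {
    print "run_forked is not available on this platform\n";
}
```

    Checking can_use_run_forked first keeps the script usable on platforms where run_forked cannot be provided.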

    $q = QUOTE

    Returns the character used for quoting strings on this platform. This is usually a ' (single quote) on most systems, but some systems use different quotes. For example, Win32 uses " (double quote).

    You can use it as follows:

        use IPC::Cmd qw[run QUOTE];
        my $cmd = q[echo ] . QUOTE . q[foo bar] . QUOTE;

    This makes sure that foo bar is treated as a string, rather than two separate arguments to the echo function.


    HOW IT WORKS

    run will try to execute your command using the following logic:

    • If you have IPC::Run installed, and the variable $IPC::Cmd::USE_IPC_RUN is set to true (See the Global Variables section) use that to execute the command. You will have the full output available in buffers, interactive commands are sure to work and you are guaranteed to have your verbosity settings honored cleanly.

    • Otherwise, if the variable $IPC::Cmd::USE_IPC_OPEN3 is set to true (See the Global Variables section), try to execute the command using IPC::Open3. Buffers will be available on all platforms, interactive commands will still execute cleanly, and also your verbosity settings will be adhered to nicely;

    • Otherwise, if you have the verbose argument set to true, we fall back to a simple system() call. We cannot capture any buffers, but interactive commands will still work.

    • Otherwise we will try and temporarily redirect STDERR and STDOUT, do a system() call with your command and then re-open STDERR and STDOUT. This is the method of last resort and will still allow you to execute your commands cleanly. However, no buffers will be available.

    Global Variables

    The behaviour of IPC::Cmd can be altered by changing the following global variables:

    $IPC::Cmd::VERBOSE

    This controls whether IPC::Cmd will print any output from the commands to the screen or not. The default is 0.

    $IPC::Cmd::USE_IPC_RUN

    This variable controls whether IPC::Cmd will try to use IPC::Run when available and suitable.

    $IPC::Cmd::USE_IPC_OPEN3

    This variable controls whether IPC::Cmd will try to use IPC::Open3 when available and suitable. Defaults to true.

    $IPC::Cmd::WARN

    This variable controls whether run-time warnings should be issued, like the failure to load an IPC::* module you explicitly requested.

    Defaults to true. Turn this off at your own risk.

    $IPC::Cmd::INSTANCES

    This variable controls whether can_run will return all instances of the binary it finds in the PATH when called in a list context.

    Defaults to false, set to true to enable the described behaviour.

    $IPC::Cmd::ALLOW_NULL_ARGS

    This variable controls whether run will remove any empty/null arguments it finds in command arguments.

    Defaults to false, so it will remove null arguments. Set to true to allow them.

    Caveats

    • Whitespace and IPC::Open3 / system()

      When using IPC::Open3 or system, if you provide a string as the command argument, it is assumed to be appropriately escaped. You can use the QUOTE constant to use as a portable quote character (see above). However, if you provide an array reference, special rules apply:

      If your command contains special characters (< > | &), it will be internally stringified before executing the command, so that these special characters retain their special meaning instead of being escaped and passed as arguments.

      However, if the command contained arguments that contained whitespace, stringifying the command would lose the significance of the whitespace. Therefore, IPC::Cmd will quote any arguments containing whitespace in your command if the command is passed as an arrayref and contains special characters.

    • Whitespace and IPC::Run

      When using IPC::Run , if you provide a string as the command argument, the string will be split on whitespace to determine the individual elements of your command. Although this will usually just Do What You Mean, it may break if you have files or commands with whitespace in them.

      If you do not wish this to happen, you should provide an array reference, where all parts of your command are already separated out. Note however, if there are extra or spurious whitespaces in these parts, the parser or underlying code may not interpret it correctly, and cause an error.

      Example: The following code

          gzip -cdf foo.tar.gz | tar -xf -

      should either be passed as

          "gzip -cdf foo.tar.gz | tar -xf -"

      or as

          ['gzip', '-cdf', 'foo.tar.gz', '|', 'tar', '-xf', '-']

      But take care not to pass it as, for example,

          ['gzip -cdf foo.tar.gz', '|', 'tar -xf -']

      since this will lead to issues as described above.

    • IO Redirect

      Currently it is too complicated to parse your command for IO redirections. For capturing STDOUT or STDERR there is a work around however, since you can just inspect your buffers for the contents.

    • Interleaving STDOUT/STDERR

      Neither IPC::Run nor IPC::Open3 can interleave STDOUT and STDERR. For short bursts of output from a program, e.g. this sample,

          for ( 1..4 ) {
              $_ % 2 ? print STDOUT $_ : print STDERR $_;
          }

      IPC::[Run|Open3] will first read all of STDOUT, then all of STDERR, meaning the output looks like '13' on STDOUT and '24' on STDERR, instead of

          1
          2
          3
          4

      This has been recorded in rt.cpan.org as bug #37532: Unable to interleave STDOUT and STDERR.

    See Also

    IPC::Run, IPC::Open3

    ACKNOWLEDGEMENTS

    Thanks to James Mastros and Martijn van der Streek for their help in getting IPC::Open3 to behave nicely.

    Thanks to Petya Kohts for the run_forked code.

    BUG REPORTS

    Please report bugs or other issues to <bug-ipc-cmd@rt.cpan.org>.

    AUTHOR

    Original author: Jos Boumans <kane@cpan.org>. Current maintainer: Chris Williams <bingos@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     

    IPC::Msg

    Perl 5 version 18.2 documentation

    NAME

    IPC::Msg - SysV Msg IPC object class

    SYNOPSIS

        use IPC::SysV qw(IPC_PRIVATE S_IRUSR S_IWUSR);
        use IPC::Msg;

        $msg = IPC::Msg->new(IPC_PRIVATE, S_IRUSR | S_IWUSR);
        $msg->snd($msgtype, $msgdata);
        $msg->rcv($buf, 256);
        $ds = $msg->stat;
        $msg->remove;

    DESCRIPTION

    A class providing an object based interface to SysV IPC message queues.

    METHODS

    • new ( KEY , FLAGS )

      Creates a new message queue associated with KEY . A new queue is created if

      • KEY is equal to IPC_PRIVATE

      • KEY does not already have a message queue associated with it, and FLAGS & IPC_CREAT is true.

      On creation of a new message queue FLAGS is used to set the permissions. Be careful not to set any flags that the Sys V IPC implementation does not allow: in some systems setting execute bits makes the operations fail.

    • id

      Returns the system message queue identifier.

    • rcv ( BUF, LEN [, TYPE [, FLAGS ]] )

      Read a message from the queue. Returns the type of the message read. See msgrcv. The BUF becomes tainted.

    • remove

      Remove and destroy the message queue from the system.

    • set ( STAT )
    • set ( NAME => VALUE [, NAME => VALUE ...] )

      set will set the following values of the stat structure associated with the message queue.

          uid
          gid
          mode (only the permission bits)
          qbytes

      set accepts either a stat object, as returned by the stat method, or a list of name-value pairs.

    • snd ( TYPE, MSG [, FLAGS ] )

      Place a message on the queue with the data from MSG and with type TYPE . See msgsnd.

    • stat

      Returns an object of type IPC::Msg::stat which is a sub-class of Class::Struct. It provides the following fields. For a description of these fields see your system documentation.

          uid
          gid
          cuid
          cgid
          mode
          qnum
          qbytes
          lspid
          lrpid
          stime
          rtime
          ctime
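    The methods above can be combined into a small round trip, sketched here under the assumption that SysV message queues are available on your system. The message type 1, the payload and the 256-byte receive length are arbitrary choices for the example.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_NOWAIT S_IRUSR S_IWUSR);
use IPC::Msg;

# A private queue readable and writable only by the current user.
my $queue = IPC::Msg->new(IPC_PRIVATE, S_IRUSR | S_IWUSR)
    or die "cannot create message queue: $!";

# Message type 1 is an arbitrary positive number chosen for this sketch.
$queue->snd(1, "hello", IPC_NOWAIT) or die "snd failed: $!";

my $buf;
my $type    = $queue->rcv($buf, 256, 1, IPC_NOWAIT);   # returns the message type
my $pending = $queue->stat->qnum;                      # messages still queued

# Queues persist after the process exits unless removed explicitly.
$queue->remove;

print "received type $type: $buf ($pending left in queue)\n";
```

    IPC_NOWAIT is used so that the sketch fails immediately instead of blocking if the queue is full or empty.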

    SEE ALSO

    IPC::SysV, Class::Struct

    AUTHORS

    Graham Barr <gbarr@pobox.com>, Marcus Holland-Moritz <mhx@cpan.org>

    COPYRIGHT

    Version 2.x, Copyright (C) 2007-2010, Marcus Holland-Moritz.

    Version 1.x, Copyright (c) 1997, Graham Barr.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    IPC::Open2

    Perl 5 version 18.2 documentation

    NAME

    IPC::Open2 - open a process for both reading and writing using open2()

    SYNOPSIS

        use IPC::Open2;

        $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'some cmd and args');
        # or without using the shell
        $pid = open2(\*CHLD_OUT, \*CHLD_IN, 'some', 'cmd', 'and', 'args');

        # or with handle autovivification
        my($chld_out, $chld_in);
        $pid = open2($chld_out, $chld_in, 'some cmd and args');
        # or without using the shell
        $pid = open2($chld_out, $chld_in, 'some', 'cmd', 'and', 'args');

        waitpid( $pid, 0 );
        my $child_exit_status = $? >> 8;

    DESCRIPTION

    The open2() function runs the given $cmd and connects $chld_out for reading and $chld_in for writing. It's what you think should work when you try

        $pid = open(HANDLE, "|cmd args|");

    The write filehandle will have autoflush turned on.

    If $chld_out is a string (that is, a bareword filehandle rather than a glob or a reference) and it begins with >&, then the child will send output directly to that file handle. If $chld_in is a string that begins with <& , then $chld_in will be closed in the parent, and the child will read from it directly. In both cases, there will be a dup(2) instead of a pipe(2) made.

    If either reader or writer is the null string, this will be replaced by an autogenerated filehandle. If so, you must pass a valid lvalue in the parameter slot so it can be overwritten in the caller, or an exception will be raised.

    open2() returns the process ID of the child process. It doesn't return on failure: it just raises an exception matching /^open2:/ . However, exec failures in the child are not detected. You'll have to trap SIGPIPE yourself.

    open2() does not wait for and reap the child process after it exits. Except for short programs where it's acceptable to let the operating system take care of this, you need to do this yourself. This is normally as simple as calling waitpid $pid, 0 when you're done with the process. Failing to do this can result in an accumulation of defunct or "zombie" processes. See waitpid for more information.

    This whole affair is quite dangerous, as you may block forever. It assumes it's going to talk to something like bc, both writing to it and reading from it. This is presumably safe because you "know" that commands like bc will read a line at a time and output a line at a time. Programs like sort that read their entire input stream first, however, are quite apt to cause deadlock.

    The big problem with this approach is that if you don't have control over source code being run in the child process, you can't control what it does with pipe buffering. Thus you can't just open a pipe to cat -v and continually read and write a line from it.

    The IO::Pty and Expect modules from CPAN can help with this, as they provide a real tty (well, a pseudo-tty, actually), which gets you back to line buffering in the invoked command again.
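
    The line-at-a-time dialogue described above can be sketched as follows. This is a minimal illustration, not a general recipe: the child command is an assumption (the running perl, $^X, used as a filter that upper-cases each line and flushes its output after every line), and the loop is only safe because that child is known to answer once per input line.

```perl
use strict;
use warnings;
use IPC::Open2;

# Assumption for illustration: the child is the running perl ($^X) acting as
# a line-at-a-time filter that upper-cases its input and flushes every line.
my @cmd = ($^X, '-ne', 'BEGIN { $| = 1 } print uc');

my ($chld_out, $chld_in);
my $pid = open2($chld_out, $chld_in, @cmd);

my @replies;
for my $word (qw(alpha beta)) {
    print {$chld_in} "$word\n";   # the write handle has autoflush turned on
    my $reply = <$chld_out>;      # safe only because the child answers per line
    chomp $reply;
    push @replies, $reply;
}

close $chld_in;                   # EOF on its STDIN lets the child finish
waitpid($pid, 0);                 # reap the child to avoid a zombie
my $child_exit_status = $? >> 8;
```

    With a child that buffers its output or reads all of its input first (sort, for example), the read inside the loop would block forever.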

    WARNING

    The order of arguments differs from that of open3().

    SEE ALSO

    See IPC::Open3 for an alternative that handles STDERR as well. This function is really just a wrapper around open3().

     
    IPC::Open3

    NAME

    IPC::Open3 - open a process for reading, writing, and error handling using open3()

    SYNOPSIS

    use IPC::Open3;

    $pid = open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR,
                 'some cmd and args', 'optarg', ...);

    my($wtr, $rdr, $err);
    use Symbol 'gensym'; $err = gensym;
    $pid = open3($wtr, $rdr, $err,
                 'some cmd and args', 'optarg', ...);

    waitpid( $pid, 0 );
    my $child_exit_status = $? >> 8;

    DESCRIPTION

    Extremely similar to open2(), open3() spawns the given $cmd and connects CHLD_OUT for reading from the child, CHLD_IN for writing to the child, and CHLD_ERR for errors. If CHLD_ERR is false, or the same file descriptor as CHLD_OUT, then STDOUT and STDERR of the child are on the same filehandle (this means that an autovivified lexical cannot be used for the STDERR filehandle, see SYNOPSIS). The CHLD_IN will have autoflush turned on.

    If CHLD_IN begins with <& , then CHLD_IN will be closed in the parent, and the child will read from it directly. If CHLD_OUT or CHLD_ERR begins with >&, then the child will send output directly to that filehandle. In both cases, there will be a dup(2) instead of a pipe(2) made.

    If either reader or writer is the null string, this will be replaced by an autogenerated filehandle. If so, you must pass a valid lvalue in the parameter slot so it can be overwritten in the caller, or an exception will be raised.

    The filehandles may also be integers, in which case they are understood as file descriptors.

    open3() returns the process ID of the child process. It doesn't return on failure: it just raises an exception matching /^open3:/ . However, exec failures in the child (such as no such file or permission denied), are just reported to CHLD_ERR, as it is not possible to trap them.

    If the child process dies for any reason, the next write to CHLD_IN is likely to generate a SIGPIPE in the parent, which is fatal by default. So you may wish to handle this signal.

    Note if you specify - as the command, in an analogous fashion to open(FOO, "-|") the child process will just be the forked Perl process rather than an external command. This feature isn't yet supported on Win32 platforms.

    open3() does not wait for and reap the child process after it exits. Except for short programs where it's acceptable to let the operating system take care of this, you need to do this yourself. This is normally as simple as calling waitpid $pid, 0 when you're done with the process. Failing to do this can result in an accumulation of defunct or "zombie" processes. See waitpid for more information.

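    If you read from both the child's STDOUT and its STDERR, you'll have problems with blocking: waiting on one handle while the child is trying to write to the other can stall both processes. You'll want to use select() or IO::Select to find out which handle is ready, which in turn means you'd best use sysread() instead of readline(), since buffered reads do not mix well with select().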

    This is very dangerous, as you may block forever. It assumes it's going to talk to something like bc, both writing to it and reading from it. This is presumably safe because you "know" that commands like bc will read a line at a time and output a line at a time. Programs like sort that read their entire input stream first, however, are quite apt to cause deadlock.

    The big problem with this approach is that if you don't have control over source code being run in the child process, you can't control what it does with pipe buffering. Thus you can't just open a pipe to cat -v and continually read and write a line from it.
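
    Reading the child's STDOUT and STDERR without blocking on either can be sketched with IO::Select and sysread(). This is an illustrative sketch under stated assumptions: the child command (the running perl, $^X, printing one line to each stream) and the %captured hash are made up for the example.

```perl
use strict;
use warnings;
use IPC::Open3;
use IO::Select;
use Symbol 'gensym';

# Assumption for illustration: a child that writes one line to each stream.
my @cmd = ($^X, '-e', 'print STDOUT "to stdout\n"; print STDERR "to stderr\n";');

my ($wtr, $rdr);
my $err = gensym;                 # the STDERR handle cannot be autovivified
my $pid = open3($wtr, $rdr, $err, @cmd);
close $wtr;                       # nothing to send to the child

my $sel      = IO::Select->new($rdr, $err);
my %captured = (out => '', err => '');

while ($sel->count) {
    for my $fh ($sel->can_read) {
        my $n = sysread($fh, my $buf, 4096);
        if (!$n) {                # 0 means EOF, undef means error
            $sel->remove($fh);
            next;
        }
        my $key = fileno($fh) == fileno($rdr) ? 'out' : 'err';
        $captured{$key} .= $buf;
    }
}

waitpid($pid, 0);                 # reap the child
my $child_exit_status = $? >> 8;
```

    The loop ends once both handles have reported end-of-file, so neither stream can fill its pipe while the parent waits on the other.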

    SEE ALSO

    • IPC::Open2

      Like Open3 but without STDERR capture.

    • IPC::Run

      This is a CPAN module that has better error handling and more facilities than Open3.

    WARNING

    The order of arguments differs from that of open2().

     
    IPC::Semaphore

    NAME

    IPC::Semaphore - SysV Semaphore IPC object class

    SYNOPSIS

    use IPC::SysV qw(IPC_PRIVATE S_IRUSR S_IWUSR IPC_CREAT);
    use IPC::Semaphore;

    $sem = IPC::Semaphore->new(IPC_PRIVATE, 10, S_IRUSR | S_IWUSR | IPC_CREAT);

    $sem->setall( (0) x 10);

    @sem = $sem->getall;
    $ncnt = $sem->getncnt(0);
    $zcnt = $sem->getzcnt(0);

    $ds = $sem->stat;

    $sem->remove;

    DESCRIPTION

    A class providing an object based interface to SysV IPC semaphores.

    METHODS

    • new ( KEY , NSEMS , FLAGS )

      Create a new semaphore set associated with KEY. NSEMS is the number of semaphores in the set. A new set is created if either of the following is true:

      • KEY is equal to IPC_PRIVATE

      • KEY does not already have a semaphore identifier associated with it, and FLAGS & IPC_CREAT is true.

      On creation of a new semaphore set FLAGS is used to set the permissions. Be careful not to set any flags that the Sys V IPC implementation does not allow: in some systems setting execute bits makes the operations fail.

    • getall

      Returns the values of the semaphore set as an array.

    • getncnt ( SEM )

      Returns the number of processes waiting for the semaphore SEM to become greater than its current value.

    • getpid ( SEM )

      Returns the process id of the last process that performed an operation on the semaphore SEM .

    • getval ( SEM )

      Returns the current value of the semaphore SEM .

    • getzcnt ( SEM )

      Returns the number of processes waiting for the semaphore SEM to become zero.

    • id

      Returns the system identifier for the semaphore set.

    • op ( OPLIST )

      OPLIST is a list of operations to pass to semop. OPLIST is a concatenation of smaller lists, each of which has three values. The first is the semaphore number, the second is the operation, and the last is a flags value. See semop for more details. For example:

      $sem->op(
          0, -1, IPC_NOWAIT,
          1,  1, IPC_NOWAIT
      );

    • remove

      Remove and destroy the semaphore set from the system.

    • set ( STAT )
    • set ( NAME => VALUE [, NAME => VALUE ...] )

      set will set the following values of the stat structure associated with the semaphore set.

      uid
      gid
      mode (only the permission bits)

      set accepts either a stat object, as returned by the stat method, or a list of name-value pairs.

    • setall ( VALUES )

      Sets all values in the semaphore set to those given on the VALUES list. VALUES must contain the correct number of values.

    • setval ( N , VALUE )

      Set the Nth value in the semaphore set to VALUE.

    • stat

      Returns an object of type IPC::Semaphore::stat which is a sub-class of Class::Struct . It provides the following fields. For a description of these fields see your system documentation.

      uid
      gid
      cuid
      cgid
      mode
      ctime
      otime
      nsems
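
    The methods above can be combined as in the following sketch, which applies the two-operation OPLIST from the op example to a private two-semaphore set. The initial values (1 and 0) are an assumption chosen so that neither operation would have to wait; the example also assumes a system where SysV IPC is available.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_NOWAIT S_IRUSR S_IWUSR);
use IPC::Semaphore;

# A private set of two semaphores, readable and writable by the owner only.
my $sem = IPC::Semaphore->new(IPC_PRIVATE, 2, S_IRUSR | S_IWUSR | IPC_CREAT)
    or die "cannot create semaphore set: $!";

$sem->setall(1, 0);               # assumed starting values for the example

# Decrement semaphore 0 and increment semaphore 1 in one semop call.
$sem->op(
    0, -1, IPC_NOWAIT,
    1,  1, IPC_NOWAIT,
) or die "semop failed: $!";

my @values = $sem->getall;        # now (0, 1)

$sem->remove;                     # always remove sets you created
```

    Because IPC_NOWAIT is set, op fails immediately instead of blocking if semaphore 0 is already zero.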

    SEE ALSO

    IPC::SysV, Class::Struct, semget, semctl, semop

    AUTHORS

    Graham Barr <gbarr@pobox.com>, Marcus Holland-Moritz <mhx@cpan.org>

    COPYRIGHT

    Version 2.x, Copyright (C) 2007-2010, Marcus Holland-Moritz.

    Version 1.x, Copyright (c) 1997, Graham Barr.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IPC::SharedMem

    NAME

    IPC::SharedMem - SysV Shared Memory IPC object class

    SYNOPSIS

    use IPC::SysV qw(IPC_PRIVATE S_IRWXU);
    use IPC::SharedMem;

    $shm = IPC::SharedMem->new(IPC_PRIVATE, 8, S_IRWXU);

    $shm->write(pack("S", 4711), 2, 2);

    $data = $shm->read(0, 2);

    $ds = $shm->stat;

    $shm->remove;

    DESCRIPTION

    A class providing an object based interface to SysV IPC shared memory.

    METHODS

    • new ( KEY , SIZE , FLAGS )

      Creates a new shared memory segment associated with KEY. A new segment is created if either of the following is true:

      • KEY is equal to IPC_PRIVATE

      • KEY does not already have a shared memory segment associated with it, and FLAGS & IPC_CREAT is true.

      On creation of a new shared memory segment FLAGS is used to set the permissions. Be careful not to set any flags that the Sys V IPC implementation does not allow: in some systems setting execute bits makes the operations fail.

    • id

      Returns the shared memory identifier.

    • read ( POS, SIZE )

      Read SIZE bytes from the shared memory segment at POS . Returns the string read, or undef if there was an error. The return value becomes tainted. See shmread.

    • write ( STRING, POS, SIZE )

      Write SIZE bytes to the shared memory segment at POS . Returns true if successful, or false if there is an error. See shmwrite.

    • remove

      Remove the shared memory segment from the system or mark it as removed as long as any processes are still attached to it.

    • is_removed

      Returns true if the shared memory segment has been removed or marked for removal.

    • stat

      Returns an object of type IPC::SharedMem::stat which is a sub-class of Class::Struct. It provides the following fields. For a description of these fields see your system documentation.

      uid
      gid
      cuid
      cgid
      mode
      segsz
      lpid
      cpid
      nattach
      atime
      dtime
      ctime

    • attach ( [FLAG] )

      Permanently attach to the shared memory segment. When an IPC::SharedMem object is attached, it will use memread and memwrite instead of shmread and shmwrite for accessing the shared memory segment. Returns true if successful, or false on error. See shmat.

    • detach

      Detach from the shared memory segment that was previously attached. Returns true if successful, or false on error. See shmdt.

    • addr

      Returns the address of the shared memory that has been attached to in a format suitable for use with pack('P'). Returns undef if the shared memory has not been attached.
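
    A round trip through write and read can be sketched as follows. The segment size, offset and packed value follow the SYNOPSIS; reading back from the same offset that was written is a choice made for this example, and it assumes a system where SysV shared memory is available.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE S_IRWXU);
use IPC::SharedMem;

# An 8-byte private segment, accessible to the owner only.
my $shm = IPC::SharedMem->new(IPC_PRIVATE, 8, S_IRWXU)
    or die "cannot create shared memory segment: $!";

# Store an unsigned short at byte offset 2 ...
$shm->write(pack("S", 4711), 2, 2) or die "write failed: $!";

# ... and read the same two bytes back.
my $raw = $shm->read(2, 2);
defined $raw or die "read failed: $!";
my ($value) = unpack("S", $raw);  # 4711

$shm->remove;                     # release the segment
```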

    SEE ALSO

    IPC::SysV, Class::Struct

    AUTHORS

    Marcus Holland-Moritz <mhx@cpan.org>

    COPYRIGHT

    Version 2.x, Copyright (C) 2007-2010, Marcus Holland-Moritz.

    Version 1.x, Copyright (c) 1997, Graham Barr.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IPC::SysV

    NAME

    IPC::SysV - System V IPC constants and system calls

    SYNOPSIS

    use IPC::SysV qw(IPC_STAT IPC_PRIVATE);

    DESCRIPTION

    IPC::SysV defines and conditionally exports all the constants defined in your system include files which are needed by the SysV IPC calls. Common ones include

    IPC_CREAT IPC_EXCL IPC_NOWAIT IPC_PRIVATE IPC_RMID IPC_SET IPC_STAT
    GETVAL SETVAL GETPID GETNCNT GETZCNT GETALL SETALL
    SEM_A SEM_R SEM_UNDO
    SHM_RDONLY SHM_RND SHMLBA

    and auxiliary ones

    S_IRUSR S_IWUSR S_IRWXU
    S_IRGRP S_IWGRP S_IRWXG
    S_IROTH S_IWOTH S_IRWXO

    but your system might have more.

    • ftok( PATH )
    • ftok( PATH, ID )

      Return a key based on PATH and ID, which can be used as a key for msgget, semget and shmget. See ftok.

      If ID is omitted, it defaults to 1 . If a single character is given for ID, the numeric value of that character is used.

    • shmat( ID, ADDR, FLAG )

      Attach the shared memory segment identified by ID to the address space of the calling process. See shmat.

      ADDR should be undef unless you really know what you're doing.

    • shmdt( ADDR )

      Detach the shared memory segment located at the address specified by ADDR from the address space of the calling process. See shmdt.

    • memread( ADDR, VAR, POS, SIZE )

      Reads SIZE bytes from a memory segment at ADDR starting at position POS. VAR must be a variable that will hold the data read. Returns true if successful, or false if there is an error. memread() taints the variable.

    • memwrite( ADDR, STRING, POS, SIZE )

      Writes SIZE bytes from STRING to a memory segment at ADDR starting at position POS. If STRING is too long, only SIZE bytes are used; if STRING is too short, nulls are written to fill out SIZE bytes. Returns true if successful, or false if there is an error.
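
    The attach, write, read and detach cycle described above can be sketched as follows. It is a minimal illustration that assumes SysV shared memory is available; the segment size and the string written are made up for the example, and the segment is created with Perl's built-in shmget.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR
                 shmat shmdt memread memwrite);

# Create a private 16-byte segment with the built-in shmget.
my $id = shmget(IPC_PRIVATE, 16, S_IRUSR | S_IWUSR);
defined $id or die "shmget failed: $!";

# Let the system choose the address (ADDR is undef).
my $addr = shmat($id, undef, 0);
defined $addr or die "shmat failed: $!";

# "hi" is shorter than SIZE, so it is padded with nulls to 4 bytes.
memwrite($addr, "hi", 0, 4) or die "memwrite failed: $!";

my $buf = '';
memread($addr, $buf, 0, 4) or die "memread failed: $!";

shmdt($addr);                     # detach from our address space
shmctl($id, IPC_RMID, 0);         # remove the segment
```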

    SEE ALSO

    IPC::Msg, IPC::Semaphore, IPC::SharedMem, ftok, shmat, shmdt

    AUTHORS

    Graham Barr <gbarr@pobox.com>, Jarkko Hietaniemi <jhi@iki.fi>, Marcus Holland-Moritz <mhx@cpan.org>

    COPYRIGHT

    Version 2.x, Copyright (C) 2007-2010, Marcus Holland-Moritz.

    Version 1.x, Copyright (c) 1997, Graham Barr.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Dir

    NAME

    IO::Dir - supply object methods for directory handles

    SYNOPSIS

    use IO::Dir;

    $d = IO::Dir->new(".");
    if (defined $d) {
        while (defined($_ = $d->read)) { something($_); }
        $d->rewind;
        while (defined($_ = $d->read)) { something_else($_); }
        undef $d;
    }

    tie %dir, 'IO::Dir', ".";
    foreach (keys %dir) {
        print $_, " " , $dir{$_}->size,"\n";
    }

    DESCRIPTION

    The IO::Dir package provides two interfaces to perl's directory reading routines.

    The first interface is an object approach. IO::Dir provides an object constructor and methods, which are just wrappers around perl's built in directory reading routines.

    • new ( [ DIRNAME ] )

      new is the constructor for IO::Dir objects. It accepts one optional argument which, if given, new will pass to open.

    The following methods are wrappers for the directory related functions built into perl (the trailing 'dir' has been removed from the names). See perlfunc for details of these functions.

    • open ( DIRNAME )
    • read ()
    • seek ( POS )
    • tell ()
    • rewind ()
    • close ()

    IO::Dir also provides an interface to reading directories via a tied hash. The tied hash extends the interface beyond just the directory reading routines by the use of lstat, from the File::stat package, unlink, rmdir and utime.

    • tie %hash, 'IO::Dir', DIRNAME [, OPTIONS ]

    The keys of the hash are the names of the entries in the directory. Reading a value from the hash returns the result of calling File::stat::lstat on that entry. Deleting an element from the hash deletes the corresponding file or subdirectory, provided that DIR_UNLINK is included in the OPTIONS.

    Assigning to an entry in the hash will cause the time stamps of the file to be modified. If the file does not exist then it will be created. Assigning a single integer to a hash element will cause both the access and modification times to be changed to that value. Alternatively a reference to an array of two values can be passed. The first array element will be used to set the access time and the second element will be used to set the modification time.
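
    The tied-hash behaviour described above can be sketched as follows. The temporary directory (from File::Temp), the file name and the timestamp are assumptions made for the example so that nothing outside a scratch directory is touched.

```perl
use strict;
use warnings;
use IO::Dir qw(DIR_UNLINK);
use File::Temp qw(tempdir);

my $scratch = tempdir(CLEANUP => 1);      # scratch directory for the demo

# DIR_UNLINK allows delete() on the hash to remove directory entries.
tie(my %dir, 'IO::Dir', $scratch, DIR_UNLINK)
    or die "cannot tie $scratch: $!";

# Assigning creates the file if needed and sets both time stamps.
my $stamp = 1_000_000_000;
$dir{'notes.txt'} = $stamp;

# Reading a value yields a File::stat object for the entry.
my $mtime = $dir{'notes.txt'}->mtime;

# An array reference sets access and modification times separately.
$dir{'notes.txt'} = [ $stamp, $stamp + 60 ];
my $new_mtime = $dir{'notes.txt'}->mtime;

# Deleting the element unlinks the file because DIR_UNLINK was given.
delete $dir{'notes.txt'};
my $still_there = -e "$scratch/notes.txt" ? 1 : 0;

untie %dir;
```

    Without DIR_UNLINK in the options, the delete would leave the file in place.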

    SEE ALSO

    File::stat

    AUTHOR

    Graham Barr. Currently maintained by the Perl Porters. Please report all bugs to <perlbug@perl.org>.

    COPYRIGHT

    Copyright (c) 1997-2003 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::File

    NAME

    IO::File - supply object methods for filehandles

    SYNOPSIS

    use IO::File;

    $fh = IO::File->new();
    if ($fh->open("< file")) {
        print <$fh>;
        $fh->close;
    }

    $fh = IO::File->new("> file");
    if (defined $fh) {
        print $fh "bar\n";
        $fh->close;
    }

    $fh = IO::File->new("file", "r");
    if (defined $fh) {
        print <$fh>;
        undef $fh;       # automatically closes the file
    }

    $fh = IO::File->new("file", O_WRONLY|O_APPEND);
    if (defined $fh) {
        print $fh "corge\n";

        $pos = $fh->getpos;
        $fh->setpos($pos);

        undef $fh;       # automatically closes the file
    }

    autoflush STDOUT 1;

    DESCRIPTION

    IO::File inherits from IO::Handle and IO::Seekable . It extends these classes with methods that are specific to file handles.

    CONSTRUCTOR

    • new ( FILENAME [,MODE [,PERMS]] )

      Creates an IO::File . If it receives any parameters, they are passed to the method open; if the open fails, the object is destroyed. Otherwise, it is returned to the caller.

    • new_tmpfile

      Creates an IO::File opened for read/write on a newly created temporary file. On systems where this is possible, the temporary file is anonymous (i.e. it is unlinked after creation, but held open). If the temporary file cannot be created or opened, the IO::File object is destroyed. Otherwise, it is returned to the caller.

    METHODS

    • open( FILENAME [,MODE [,PERMS]] )
    • open( FILENAME, IOLAYERS )

      open accepts one, two or three parameters. With one parameter, it is just a front end for the built-in open function. With two or three parameters, the first parameter is a filename that may include whitespace or other special characters, and the second parameter is the open mode, optionally followed by a file permission value.

      If IO::File::open receives a Perl mode string (">", "+<", etc.) or an ANSI C fopen() mode string ("w", "r+", etc.), it uses the basic Perl open operator (but protects any special characters).

      If IO::File::open is given a numeric mode, it passes that mode and the optional permissions value to the Perl sysopen operator. The permissions default to 0666.

      If IO::File::open is given a mode that includes the : character, it passes all three arguments to the three-argument open operator.

      For convenience, IO::File exports the O_XXX constants from the Fcntl module, if this module is available.

    • binmode( [LAYER] )

      binmode sets binmode on the underlying IO object, as documented in perldoc -f binmode .

      binmode accepts one optional parameter, which is the layer to be passed on to the binmode call.
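
    The three kinds of MODE argument that open distinguishes can be sketched as follows. The scratch file (created with File::Temp) and its contents are assumptions made for the example.

```perl
use strict;
use warnings;
use IO::File;                     # also exports the O_XXX constants from Fcntl
use File::Temp qw(tempdir);

my $path = tempdir(CLEANUP => 1) . "/demo.txt";

# ANSI C fopen() mode string: handled by Perl's open.
my $fh = IO::File->new($path, "w") or die "cannot create $path: $!";
$fh->print("first\n");
$fh->close;

# Numeric mode: handled by sysopen (permissions would default to 0666).
$fh = IO::File->new($path, O_WRONLY | O_APPEND) or die "cannot append: $!";
$fh->print("second\n");
$fh->close;

# Mode containing ':': passed on to the three-argument open with its layers.
$fh = IO::File->new($path, "<:raw") or die "cannot read $path: $!";
my @lines = $fh->getlines;
$fh->close;
```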

    NOTE

    Some operating systems may perform IO::File::new() or IO::File::open() on a directory without errors. This behavior is not portable and should not be relied upon. Use opendir() and readdir(), or IO::Dir, instead.

    SEE ALSO

    perlfunc, I/O Operators in perlop, IO::Handle, IO::Seekable, IO::Dir

    HISTORY

    Derived from FileHandle.pm by Graham Barr <gbarr@pobox.com>.

     
    IO::Handle

    NAME

    IO::Handle - supply object methods for I/O handles

    SYNOPSIS

    use IO::Handle;

    $io = IO::Handle->new();
    if ($io->fdopen(fileno(STDIN),"r")) {
        print $io->getline;
        $io->close;
    }

    $io = IO::Handle->new();
    if ($io->fdopen(fileno(STDOUT),"w")) {
        $io->print("Some text\n");
    }

    # setvbuf is not available by default on Perls 5.8.0 and later.
    use IO::Handle '_IOLBF';
    $io->setvbuf($buffer_var, _IOLBF, 1024);

    undef $io;       # automatically closes the file if it's open

    autoflush STDOUT 1;

    DESCRIPTION

    IO::Handle is the base class for all other IO handle classes. It is not intended that objects of IO::Handle would be created directly, but instead IO::Handle is inherited from by several other classes in the IO hierarchy.

    If you are reading this documentation, looking for a replacement for the FileHandle package, then I suggest you read the documentation for IO::File too.

    CONSTRUCTOR

    • new ()

      Creates a new IO::Handle object.

    • new_from_fd ( FD, MODE )

      Creates an IO::Handle like new does. It requires two parameters, which are passed to the method fdopen ; if the fdopen fails, the object is destroyed. Otherwise, it is returned to the caller.

    METHODS

    See perlfunc for complete descriptions of each of the following supported IO::Handle methods, which are just front ends for the corresponding built-in functions:

    $io->close
    $io->eof
    $io->fcntl( FUNCTION, SCALAR )
    $io->fileno
    $io->format_write( [FORMAT_NAME] )
    $io->getc
    $io->ioctl( FUNCTION, SCALAR )
    $io->read ( BUF, LEN, [OFFSET] )
    $io->print ( ARGS )
    $io->printf ( FMT, [ARGS] )
    $io->say ( ARGS )
    $io->stat
    $io->sysread ( BUF, LEN, [OFFSET] )
    $io->syswrite ( BUF, [LEN, [OFFSET]] )
    $io->truncate ( LEN )

    See perlvar for complete descriptions of each of the following supported IO::Handle methods. Each of them returns the previous value of the attribute and takes an optional single argument that, when given, sets the value. If no argument is given, the value is left unchanged (except for $io->autoflush, which turns autoflush ON when called without an argument).

    $io->autoflush ( [BOOL] )                         $|
    $io->format_page_number( [NUM] )                  $%
    $io->format_lines_per_page( [NUM] )               $=
    $io->format_lines_left( [NUM] )                   $-
    $io->format_name( [STR] )                         $~
    $io->format_top_name( [STR] )                     $^
    $io->input_line_number( [NUM] )                   $.
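
    The "returns the previous value" convention, and the special case of autoflush, can be sketched as follows. The in-memory filehandle is an assumption made for the example; any output handle behaves the same way.

```perl
use strict;
use warnings;
use IO::Handle;

# An in-memory output handle, used here only as a convenient target.
open(my $io, '>', \my $buffer) or die "cannot open in-memory handle: $!";

# No argument: autoflush is turned ON, and the old setting is returned.
my $before = $io->autoflush;      # 0, autoflush was off

# With an argument: the value is set, and the previous one is returned.
my $during = $io->autoflush(0);   # 1, autoflush had just been turned on

close $io;
```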

    The following methods are not supported on a per-filehandle basis.

    IO::Handle->format_line_break_characters( [STR] ) $:
    IO::Handle->format_formfeed( [STR] )              $^L
    IO::Handle->output_field_separator( [STR] )       $,
    IO::Handle->output_record_separator( [STR] )      $\
    IO::Handle->input_record_separator( [STR] )       $/

    Furthermore, for doing normal I/O you might need these:

    • $io->fdopen ( FD, MODE )

      fdopen is like an ordinary open except that its first parameter is not a filename but rather a file handle name, an IO::Handle object, or a file descriptor number. (For the documentation of the open method, see IO::File.)

    • $io->opened

      Returns true if the object is currently a valid file descriptor, false otherwise.

    • $io->getline

      This works like <$io> described in I/O Operators in perlop except that it's more readable and can be safely called in a list context but still returns just one line. If used as the conditional within a while or C-style for loop, however, you will need to emulate the functionality of <$io> with defined($_ = $io->getline).

    • $io->getlines

      This works like <$io> when called in a list context to read all the remaining lines in a file, except that it's more readable. It will also croak() if accidentally called in a scalar context.

    • $io->ungetc ( ORD )

      Pushes a character with the given ordinal value back onto the given handle's input stream. Only one character of pushback per handle is guaranteed.

    • $io->write ( BUF, LEN [, OFFSET ] )

      This write is somewhat like write found in C, in that it is the opposite of read. The wrapper for the perl write function is called format_write . However, whilst the C write function returns the number of bytes written, this write function simply returns true if successful (like print). A more C-like write is syswrite (see above).

    • $io->error

      Returns a true value if the given handle has experienced any errors since it was opened or since the last call to clearerr , or if the handle is invalid. It only returns false for a valid handle with no outstanding errors.

    • $io->clearerr

      Clear the given handle's error indicator. Returns -1 if the handle is invalid, 0 otherwise.

    • $io->sync

      sync synchronizes a file's in-memory state with that on the physical medium. sync does not operate at the perlio api level, but operates on the file descriptor (similar to sysread, sysseek and systell). This means that any data held at the perlio api level will not be synchronized. To synchronize data that is buffered at the perlio api level you must use the flush method. sync is not implemented on all platforms. Returns "0 but true" on success, undef on error, undef for an invalid handle. See fsync(3c).

    • $io->flush

      flush causes perl to flush any buffered data at the perlio api level. Any unread data in the buffer will be discarded, and any unwritten data will be written to the underlying file descriptor. Returns "0 but true" on success, undef on error.

    • $io->printflush ( ARGS )

      Turns on autoflush, print ARGS and then restores the autoflush status of the IO::Handle object. Returns the return value from print.

    • $io->blocking ( [ BOOL ] )

      If called with an argument blocking will turn on non-blocking IO if BOOL is false, and turn it off if BOOL is true.

      blocking will return the value of the previous setting, or the current setting if BOOL is not given.

      If an error occurs blocking will return undef and $! will be set.
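
    The difference between getline and getlines can be sketched as follows. The pipe and the four lines written to it are assumptions made for the example; any readable handle would do.

```perl
use strict;
use warnings;
use IO::Handle;

pipe(my $reader, my $writer) or die "pipe failed: $!";

$writer->autoflush(1);
$writer->print("one\ntwo\nthree\nfour\n");
$writer->close;                   # the reader will see EOF after four lines

my $first  = $reader->getline;    # scalar context: one line
my @second = ($reader->getline);  # list context: still exactly one line
my @rest   = $reader->getlines;   # list context: all remaining lines

# In a loop condition, getline needs an explicit defined() test:
#   while (defined(my $line = $reader->getline)) { ... }

$reader->close;
```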

    If the C functions setbuf() and/or setvbuf() are available, then IO::Handle::setbuf and IO::Handle::setvbuf set the buffering policy for an IO::Handle. The calling sequences for the Perl functions are the same as their C counterparts--including the constants _IOFBF , _IOLBF , and _IONBF for setvbuf()--except that the buffer parameter specifies a scalar variable to use as a buffer. You should only change the buffer before any I/O, or immediately after calling flush.

    WARNING: The IO::Handle::setvbuf() is not available by default on Perls 5.8.0 and later because setvbuf() is rather specific to using the stdio library, while Perl prefers the new perlio subsystem instead.

    WARNING: A variable used as a buffer by setbuf or setvbuf must not be modified in any way until the IO::Handle is closed or setbuf or setvbuf is called again, or memory corruption may result! Remember that the order of global destruction is undefined, so even if your buffer variable remains in scope until program termination, it may be undefined before the file IO::Handle is closed. Note that you need to import the constants _IOFBF , _IOLBF , and _IONBF explicitly. Like C, setbuf returns nothing. setvbuf returns "0 but true", on success, undef on failure.

    Lastly, there is a special method for working under -T and setuid/gid scripts:

    • $io->untaint

      Marks the object as taint-clean, and as such data read from it will also be considered taint-clean. Note that this is a very trusting action to take, and appropriate consideration for the data source and potential vulnerability should be kept in mind. Returns 0 on success, -1 if setting the taint-clean flag failed (e.g. because of an invalid handle).

    NOTE

    An IO::Handle object is a reference to a symbol/GLOB reference (see the Symbol package). Some modules that inherit from IO::Handle may want to keep object related variables in the hash table part of the GLOB. In an attempt to prevent modules trampling on each other I propose that any such module should prefix its variables with its own name separated by _'s. For example the IO::Socket module keeps a timeout variable in 'io_socket_timeout'.

    SEE ALSO

    perlfunc, I/O Operators in perlop, IO::File

    BUGS

    For backwards compatibility, all filehandles resemble objects of class IO::Handle, or of classes derived from that class, but they are not actually such objects. This means you can't derive your own class from IO::Handle and inherit those methods.

    HISTORY

    Derived from FileHandle.pm by Graham Barr <gbarr@pobox.com>

     
    IO::Pipe

    NAME

    IO::Pipe - supply object methods for pipes

    SYNOPSIS

    use IO::Pipe;

    $pipe = IO::Pipe->new();

    if($pid = fork()) { # Parent
        $pipe->reader();

        while(<$pipe>) {
            ...
        }
    }
    elsif(defined $pid) { # Child
        $pipe->writer();

        print $pipe ...
    }

    or

    $pipe = IO::Pipe->new();

    $pipe->reader(qw(ls -l));

    while(<$pipe>) {
        ...
    }

    DESCRIPTION

    IO::Pipe provides an interface to creating pipes between processes.

    CONSTRUCTOR

    • new ( [READER, WRITER] )

      Creates an IO::Pipe , which is a reference to a newly created symbol (see the Symbol package). IO::Pipe::new optionally takes two arguments, which should be objects blessed into IO::Handle , or a subclass thereof. These two objects will be used for the system call to pipe. If no arguments are given then method handles is called on the new IO::Pipe object.

      These two handles are held in the array part of the GLOB until either reader or writer is called.

    METHODS

    • reader ([ARGS])

      The object is re-blessed into a sub-class of IO::Handle, and becomes a handle at the reading end of the pipe. If ARGS are given then fork is called and ARGS are passed to exec.

    • writer ([ARGS])

      The object is re-blessed into a sub-class of IO::Handle, and becomes a handle at the writing end of the pipe. If ARGS are given then fork is called and ARGS are passed to exec.

    • handles ()

      This method is called during construction by IO::Pipe::new on the newly created IO::Pipe object. It returns an array of two objects blessed into IO::Pipe::End, or a subclass thereof.
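
    As a minimal sketch of the reader and writer methods working together, the helper below forks a child that writes a few lines and collects them in the parent. The function name lines_from_child is made up for this example, and POSIX::_exit is used in the child only so that it leaves without running the parent's cleanup code; neither is part of IO::Pipe.

```perl
use strict;
use warnings;
use IO::Pipe;
use POSIX ();

# Hypothetical helper: send @lines through a pipe from a forked child
# and return them as read by the parent.
sub lines_from_child {
    my @lines = @_;

    my $pipe = IO::Pipe->new();
    my $pid  = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid) {
        # Parent: become the reading end and drain the pipe.
        $pipe->reader();
        my @got = <$pipe>;
        $pipe->close;
        waitpid($pid, 0);
        chomp @got;
        return @got;
    }

    # Child: become the writing end, send the data and leave.
    $pipe->writer();
    print {$pipe} "$_\n" for @lines;
    $pipe->close;
    POSIX::_exit(0);
}

print join(", ", lines_from_child("alpha", "beta")), "\n";
```

    The pipe must be created before the fork so that both processes share it; each side then calls reader or writer to pick its end.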

    SEE ALSO

    IO::Handle

    AUTHOR

    Graham Barr. Currently maintained by the Perl Porters. Please report all bugs to <perlbug@perl.org>.

    COPYRIGHT

    Copyright (c) 1996-8 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    IO::Poll

    NAME

    IO::Poll - Object interface to system poll call

    SYNOPSIS

        use IO::Poll qw(POLLRDNORM POLLWRNORM POLLIN POLLOUT POLLHUP);

        $poll = IO::Poll->new();

        $poll->mask($input_handle => POLLIN);
        $poll->mask($output_handle => POLLOUT);

        $poll->poll($timeout);

        $ev = $poll->events($input_handle);

    DESCRIPTION

    IO::Poll is a simple interface to the system level poll routine.

    METHODS

    • mask ( IO [, EVENT_MASK ] )

      If EVENT_MASK is given, then, if EVENT_MASK is non-zero, IO is added to the list of file descriptors and the next call to poll will check for any event specified in EVENT_MASK. If EVENT_MASK is zero then IO will be removed from the list of file descriptors.

      If EVENT_MASK is not given then the return value will be the current event mask value for IO.

    • poll ( [ TIMEOUT ] )

      Call the system level poll routine. If TIMEOUT is not specified then the call will block. Returns the number of handles which had events happen, or -1 on error.

    • events ( IO )

      Returns the event mask which represents the events that happened on IO during the last call to poll.

    • remove ( IO )

      Remove IO from the list of file descriptors for the next poll.

    • handles( [ EVENT_MASK ] )

      Returns a list of handles. If EVENT_MASK is not given then a list of all known handles will be returned. If EVENT_MASK is given then a list of handles will be returned which had one of the events specified by EVENT_MASK happen during the last call to poll.
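
    The methods above can be combined as in the following sketch, which registers a set of handles for POLLIN, polls once, and returns those that became readable. The helper name readable_handles is an assumption made for this example.

```perl
use strict;
use warnings;
use IO::Poll qw(POLLIN);

# Hypothetical helper: return the handles from @handles that have
# input pending within $timeout seconds.
sub readable_handles {
    my ($timeout, @handles) = @_;

    my $poll = IO::Poll->new();
    $poll->mask($_ => POLLIN) for @handles;

    # poll returns the number of handles with events, or -1 on error.
    my $count = $poll->poll($timeout);
    return () if !defined $count || $count < 1;

    return $poll->handles(POLLIN);
}
```

    A caller might use it as `my @ready = readable_handles(0.5, $sock_a, $sock_b);` and then read from each returned handle.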

    SEE ALSO

    poll(2), IO::Handle, IO::Select

    AUTHOR

    Graham Barr. Currently maintained by the Perl Porters. Please report all bugs to <perlbug@perl.org>.

    COPYRIGHT

    Copyright (c) 1997-8 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    IO::Seekable

    NAME

    IO::Seekable - supply seek based methods for I/O objects

    SYNOPSIS

        use IO::Seekable;
        package IO::Something;
        @ISA = qw(IO::Seekable);

    DESCRIPTION

    IO::Seekable does not have a constructor of its own as it is intended to be inherited by other IO::Handle based objects. It provides methods which allow seeking of the file descriptors.

    • $io->getpos

      Returns an opaque value that represents the current position of the IO::File, or undef if this is not possible (e.g. an unseekable stream such as a terminal, pipe or socket). If the fgetpos() function is available in your C library it is used to implement getpos; otherwise perl emulates getpos using C's ftell() function.

    • $io->setpos

      Uses the value of a previous getpos call to return to a previously visited position. Returns "0 but true" on success, undef on failure.

    See perlfunc for complete descriptions of each of the following supported IO::Seekable methods, which are just front ends for the corresponding built-in functions:

    • $io->seek ( POS, WHENCE )

      Seek the IO::File to position POS, relative to WHENCE:

      • WHENCE=0 (SEEK_SET)

        POS is absolute position. (Seek relative to the start of the file)

      • WHENCE=1 (SEEK_CUR)

        POS is an offset from the current position. (Seek relative to current)

      • WHENCE=2 (SEEK_END)

        POS is an offset from the end of the file. (Seek relative to end)

      The SEEK_* constants can be imported from the Fcntl module if you don't wish to use the numbers 0 1 or 2 in your code.

      Returns 1 upon success, 0 otherwise.

    • $io->sysseek( POS, WHENCE )

      Similar to $io->seek, but sets the IO::File's position using the system call lseek(2) directly, so it will confuse most perl IO operators except sysread and syswrite (see perlfunc for full details).

      Returns the new position, or undef on failure. A position of zero is returned as the string "0 but true".

    • $io->tell

      Returns the IO::File's current position, or -1 on error.
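
    A short sketch of how these methods fit together, using an IO::File handle (which inherits from IO::Seekable). The helper read_tail is a name invented for this example: it remembers the current position with getpos, seeks relative to the end of the file, reads, and then restores the position with setpos.

```perl
use strict;
use warnings;
use IO::File;
use Fcntl qw(SEEK_SET SEEK_END);

# Hypothetical helper: return the last $nbytes bytes of a seekable
# handle without disturbing the caller's current position.
sub read_tail {
    my ($fh, $nbytes) = @_;

    my $saved = $fh->getpos;                  # opaque position value
    $fh->seek(-$nbytes, SEEK_END) or return;  # offset from end of file

    my $tail;
    $fh->read($tail, $nbytes);

    $fh->setpos($saved) if defined $saved;    # go back where we were
    return $tail;
}

my $fh = IO::File->new_tmpfile or die "cannot create temp file: $!";
print {$fh} "hello world";
$fh->seek(0, SEEK_SET);
print read_tail($fh, 5), " (position is now ", $fh->tell, ")\n";
```

    Importing SEEK_SET and SEEK_END from Fcntl avoids the bare numbers 0 and 2 in the calls to seek.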

    SEE ALSO

    perlfunc, I/O Operators in perlop, IO::Handle, IO::File

    HISTORY

    Derived from FileHandle.pm by Graham Barr <gbarr@pobox.com>

     

    IO::Select

    NAME

    IO::Select - OO interface to the select system call

    SYNOPSIS

        use IO::Select;

        $s = IO::Select->new();

        $s->add(\*STDIN);
        $s->add($some_handle);

        @ready = $s->can_read($timeout);

        @ready = IO::Select->new(@handles)->can_read(0);

    DESCRIPTION

    The IO::Select package implements an object approach to the system select function call. It allows the user to see which IO handles (see IO::Handle) are ready for reading or writing, or have an exception pending.

    CONSTRUCTOR

    • new ( [ HANDLES ] )

      The constructor creates a new object and optionally initialises it with a set of handles.

    METHODS

    • add ( HANDLES )

      Add the list of handles to the IO::Select object. It is these values that will be returned when an event occurs. IO::Select keeps these values in a cache which is indexed by the fileno of the handle, so if more than one handle with the same fileno is specified then only the last one is cached.

      Each handle can be an IO::Handle object, an integer or an array reference where the first element is an IO::Handle or an integer.

    • remove ( HANDLES )

      Remove all the given handles from the object. This method also works by the fileno of the handles, so the exact handles that were added need not be passed, just handles that have an equivalent fileno.

    • exists ( HANDLE )

      Returns a true value (actually the handle itself) if it is present. Returns undef otherwise.

    • handles

      Return an array of all registered handles.

    • can_read ( [ TIMEOUT ] )

      Return an array of handles that are ready for reading. TIMEOUT is the maximum amount of time to wait before returning an empty list, in seconds, possibly fractional. If TIMEOUT is not given and any handles are registered then the call will block.

    • can_write ( [ TIMEOUT ] )

      Same as can_read except check for handles that can be written to.

    • has_exception ( [ TIMEOUT ] )

      Same as can_read except check for handles that have an exception condition, for example pending out-of-band data.

    • count ()

      Returns the number of handles that the object will check for when one of the can_ methods is called or the object is passed to the select static method.

    • bits()

      Return the bit string suitable as argument to the core select() call.

    • select ( READ, WRITE, EXCEPTION [, TIMEOUT ] )

      select is a static method; that is, you call it with the package name, like new. READ, WRITE and EXCEPTION are either undef or IO::Select objects. TIMEOUT is optional and has the same effect as for the core select call.

      The result will be an array of 3 elements, each a reference to an array which will hold the handles that are ready for reading, writing and have exceptions respectively. Upon error an empty list is returned.
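
    The static select method can be sketched like this. The helper ready_sets is a hypothetical name; it builds one IO::Select object for readers and one for writers, calls IO::Select->select, and maps an empty result onto two empty lists so the caller always receives array references.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: given array refs of handles to watch for
# reading and writing, return two array refs of the handles that
# are ready within $timeout seconds.
sub ready_sets {
    my ($readers, $writers, $timeout) = @_;

    my $read_set  = IO::Select->new(@$readers);
    my $write_set = IO::Select->new(@$writers);

    # No exception set is watched here, so undef is passed for it.
    my @sets = IO::Select->select($read_set, $write_set, undef, $timeout);
    return ([], []) unless @sets;

    my ($can_read, $can_write) = @sets;
    return ($can_read, $can_write);
}
```

    For a single kind of event, calling can_read or can_write on one IO::Select object is simpler; the static method is useful when several kinds must be checked in one call.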

    EXAMPLE

    Here is a short example which shows how IO::Select could be used to write a server which communicates with several sockets while also listening for more connections on a listen socket.

        use IO::Select;
        use IO::Socket;

        $lsn = IO::Socket::INET->new(Listen => 1, LocalPort => 8080);
        $sel = IO::Select->new( $lsn );

        while(@ready = $sel->can_read) {
            foreach $fh (@ready) {
                if($fh == $lsn) {
                    # Create a new socket
                    $new = $lsn->accept;
                    $sel->add($new);
                }
                else {
                    # Process socket

                    # Maybe we have finished with the socket
                    $sel->remove($fh);
                    $fh->close;
                }
            }
        }

    AUTHOR

    Graham Barr. Currently maintained by the Perl Porters. Please report all bugs to <perlbug@perl.org>.

    COPYRIGHT

    Copyright (c) 1997-8 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    IO::Socket

    NAME

    IO::Socket - Object interface to socket communications

    SYNOPSIS

        use IO::Socket;

    DESCRIPTION

    IO::Socket provides an object interface to creating and using sockets. It is built upon the IO::Handle interface and inherits all the methods defined by IO::Handle.

    IO::Socket only defines methods for those operations which are common to all types of socket. Operations which are specific to a socket in a particular domain have methods defined in subclasses of IO::Socket.

    IO::Socket will export all functions (and constants) defined by Socket.

    CONSTRUCTOR

    • new ( [ARGS] )

      Creates an IO::Socket, which is a reference to a newly created symbol (see the Symbol package). new optionally takes arguments, given as key-value pairs. new only looks for one key, Domain, which tells new which domain the socket will be in. All other arguments will be passed to the configuration method of the package for that domain; see below.

          NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE

      As of VERSION 1.18 all IO::Socket objects have autoflush turned on by default. This was not the case with earlier releases.

          NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE

    METHODS

    See perlfunc for complete descriptions of each of the following supported IO::Socket methods, which are just front ends for the corresponding built-in functions:

        socket
        socketpair
        bind
        listen
        accept
        send
        recv
        peername (getpeername)
        sockname (getsockname)
        shutdown

    Some methods take slightly different arguments to those defined in perlfunc in an attempt to make the interface more flexible. These are:

    • accept([PKG])

      Perform the system call accept on the socket and return a new object. The new object will be created in the same class as the listen socket, unless PKG is specified. This object can be used to communicate with the client that was trying to connect.

      In a scalar context the new socket is returned, or undef upon failure. In a list context a two-element array is returned containing the new socket and the peer address; the list will be empty upon failure.

      The timeout of the listening socket (see timeout below) can be specified as zero to effect a "poll", but you shouldn't do that because a new IO::Select object will be created behind the scenes just to do the single poll. This is horrendously inefficient. Instead, use a true select() with a zero timeout on the handle, or non-blocking IO.

    • socketpair(DOMAIN, TYPE, PROTOCOL)

      Call socketpair and return a list of two sockets created, or an empty list on failure.

    Additional methods that are provided are:

    • atmark

      True if the socket is currently positioned at the urgent data mark, false otherwise.

          use IO::Socket;

          my $sock = IO::Socket::INET->new('some_server');
          $sock->read($data, 1024) until $sock->atmark;

      Note: this is a reasonably new addition to the family of socket functions, so not all systems may support it yet. If it is unsupported by the system, an attempt to use this method will abort the program.

      The atmark() functionality is also exportable as the sockatmark() function:

          use IO::Socket 'sockatmark';

      This allows for a more traditional use of sockatmark() as a procedural socket function. If your system does not support sockatmark(), the use declaration will fail at compile time.

    • connected

      If the socket is in a connected state the peer address is returned. If the socket is not in a connected state then undef will be returned.

    • protocol

      Returns the number of the protocol being used on the socket, if known. If the protocol is unknown, as with an AF_UNIX socket, zero is returned.

    • sockdomain

      Returns the number of the socket domain type. For example, for an AF_INET socket the value of &AF_INET will be returned.

    • sockopt(OPT [, VAL])

      Unified method to both set and get options in the SOL_SOCKET level. If called with one argument then getsockopt is called, otherwise setsockopt is called.

    • getsockopt(LEVEL, OPT)

      Get option associated with the socket. Other levels than SOL_SOCKET may be specified here.

    • setsockopt(LEVEL, OPT, VAL)

      Set option associated with the socket. Other levels than SOL_SOCKET may be specified here.

    • socktype

      Returns the number of the socket type. For example, for a SOCK_STREAM socket the value of &SOCK_STREAM will be returned.

    • timeout([VAL])

      Set or get the timeout value (in seconds) associated with this socket. If called without any arguments then the current setting is returned. If called with an argument the current setting is changed and the previous value returned.
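
    Several of the methods above can be tried without a network by using socketpair. The following is only a sketch: the helper name echo_over_socketpair is invented here, and it assumes a system where AF_UNIX stream socket pairs are available.

```perl
use strict;
use warnings;
use IO::Socket;   # also exports the Socket constants used below

# Hypothetical helper: send one line across a connected pair of
# sockets and report what arrived plus the socket type.
sub echo_over_socketpair {
    my ($message) = @_;

    my ($left, $right) =
        IO::Socket->socketpair(AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";

    $left->timeout(5);                    # set; returns the old value
    $left->print("$message\n");
    $left->flush;                         # harmless if autoflush is on

    my $line = $right->getline;           # inherited from IO::Handle
    chomp $line;

    my $type = $right->sockopt(SO_TYPE);  # one argument: getsockopt
    $_->close for $left, $right;

    return ($line, $type);
}

my ($line, $type) = echo_over_socketpair("ping");
print "received '$line' on a socket of type $type\n";
```

    Because sockopt works at the SOL_SOCKET level, options at other levels need getsockopt or setsockopt with an explicit LEVEL.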

    LIMITATIONS

    On some systems, for an IO::Socket object created with new_from_fd(), or created with accept() from such an object, the protocol(), sockdomain() and socktype() methods may return undef.

    SEE ALSO

    Socket, IO::Handle, IO::Socket::INET, IO::Socket::UNIX

    AUTHOR

    Graham Barr. atmark() by Lincoln Stein. Currently maintained by the Perl Porters. Please report all bugs to <perlbug@perl.org>.

    COPYRIGHT

    Copyright (c) 1997-8 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    The atmark() implementation: Copyright 2001, Lincoln Stein <lstein@cshl.org>. This module is distributed under the same terms as Perl itself. Feel free to use, modify and redistribute it as long as you retain the correct attribution.

     

    IO::Zlib

    NAME

    IO::Zlib - IO:: style interface to Compress::Zlib

    SYNOPSIS

    With any version of Perl 5 you can use the basic OO interface:

        use IO::Zlib;

        $fh = IO::Zlib->new();
        if ($fh->open("file.gz", "rb")) {
            print <$fh>;
            $fh->close;
        }

        $fh = IO::Zlib->new("file.gz", "wb9");
        if (defined $fh) {
            print $fh "bar\n";
            $fh->close;
        }

        $fh = IO::Zlib->new("file.gz", "rb");
        if (defined $fh) {
            print <$fh>;
            undef $fh;       # automatically closes the file
        }

    With Perl 5.004 or later you can also use the TIEHANDLE interface to access compressed files just like ordinary files:

        use IO::Zlib;

        tie *FILE, 'IO::Zlib', "file.gz", "wb";
        print FILE "line 1\nline2\n";

        tie *FILE, 'IO::Zlib', "file.gz", "rb";
        while (<FILE>) { print "LINE: ", $_ };

    DESCRIPTION

    IO::Zlib provides an IO:: style interface to Compress::Zlib and hence to gzip/zlib compressed files. It provides many of the same methods as the IO::Handle interface.

    Starting from IO::Zlib version 1.02, IO::Zlib can also use an external gzip command. The default behaviour is to try to use an external gzip if no Compress::Zlib can be loaded, unless explicitly disabled by

        use IO::Zlib qw(:gzip_external 0);

    If explicitly enabled by

        use IO::Zlib qw(:gzip_external 1);

    then the external gzip is used instead of Compress::Zlib.

    CONSTRUCTOR

    • new ( [ARGS] )

      Creates an IO::Zlib object. If it receives any parameters, they are passed to the method open; if the open fails, the object is destroyed. Otherwise, it is returned to the caller.

    OBJECT METHODS

    • open ( FILENAME, MODE )

      open takes two arguments. The first is the name of the file to open and the second is the open mode. The mode can be anything acceptable to Compress::Zlib and by extension anything acceptable to zlib (that basically means POSIX fopen() style mode strings plus an optional number to indicate the compression level).

    • opened

      Returns true if the object currently refers to an open file.

    • close

      Close the file associated with the object and disassociate the file from the handle. Done automatically on destroy.

    • getc

      Return the next character from the file, or undef if none remain.

    • getline

      Return the next line from the file, or undef on end of file. Can safely be called in an array context. Currently ignores $/ ($INPUT_RECORD_SEPARATOR or $RS when English is in use) and treats lines as delimited by "\n".

    • getlines

      Get all remaining lines from the file. It will croak() if accidentally called in a scalar context.

    • print ( ARGS... )

      Print ARGS to the file.

    • read ( BUF, NBYTES, [OFFSET] )

      Read some bytes from the file. Returns the number of bytes actually read, 0 on end-of-file, undef on error.

    • eof

      Returns true if the handle is currently positioned at end of file.

    • seek ( OFFSET, WHENCE )

      Seek to a given position in the stream. Not yet supported.

    • tell

      Return the current position in the stream, as a numeric offset. Not yet supported.

    • setpos ( POS )

      Set the current position, using the opaque value returned by getpos(). Not yet supported.

    • getpos ( POS )

      Return the current position in the stream, as an opaque object. Not yet supported.
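
    The object methods can be exercised with a small round trip, sketched below. The helper gzip_roundtrip is a hypothetical name; it writes lines to a temporary .gz file created with File::Temp and reads them back with getlines.

```perl
use strict;
use warnings;
use IO::Zlib;
use File::Temp qw(tempfile);

# Hypothetical helper: write @lines to a gzip file, read them back,
# and return what was read.
sub gzip_roundtrip {
    my @lines = @_;

    # Reserve a temporary file name; it is removed at program exit.
    my ($tmp_fh, $path) = tempfile(SUFFIX => '.gz', UNLINK => 1);
    close $tmp_fh;

    my $out = IO::Zlib->new($path, "wb9")
        or die "cannot open $path for writing";
    print {$out} "$_\n" for @lines;
    $out->close;

    my $in = IO::Zlib->new($path, "rb")
        or die "cannot open $path for reading";
    my @got = $in->getlines;      # must be called in list context
    $in->close;

    chomp @got;
    return @got;
}

print join(", ", gzip_roundtrip("alpha", "beta")), "\n";
```

    The mode "wb9" asks for the highest compression level; plain "wb" leaves the choice to the library.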

    USING THE EXTERNAL GZIP

    If the external gzip is used, the following opens are used:

        open(FH, "gzip -dc $filename |")  # for read opens
        open(FH, " | gzip > $filename")   # for write opens

    You can modify the 'commands', for example to hardwire an absolute path:

        use IO::Zlib ':gzip_read_open'  => '/some/where/gunzip -c %s |';
        use IO::Zlib ':gzip_write_open' => '| /some/where/gzip.exe > %s';

    The %s is expanded to be the filename (sprintf is used, so be careful to escape any other % signs). The 'commands' are checked for sanity: they must contain the %s, the read open must end with the pipe sign, and the write open must begin with the pipe sign.

    CLASS METHODS

    • has_Compress_Zlib

      Returns true if Compress::Zlib is available. Note that this does not mean that Compress::Zlib is being used: see gzip_external and gzip_used.

    • gzip_external

      Undef if an external gzip can be used if Compress::Zlib is not available (see has_Compress_Zlib), true if an external gzip is explicitly used, false if an external gzip must not be used. See gzip_used.

    • gzip_used

      True if an external gzip is being used, false if not.

    • gzip_read_open

      Return the 'command' being used for opening a file for reading using an external gzip.

    • gzip_write_open

      Return the 'command' being used for opening a file for writing using an external gzip.

    DIAGNOSTICS

    • IO::Zlib::getlines: must be called in list context

      If you want to read lines, you must read in list context.

    • IO::Zlib::gzopen_external: mode '...' is illegal

      Use only modes 'rb' or 'wb' or /wb[1-9]/.

    • IO::Zlib::import: '...' is illegal

      The known import symbols are :gzip_external, :gzip_read_open, and :gzip_write_open. Anything else is not recognized.

    • IO::Zlib::import: ':gzip_external' requires an argument

      The :gzip_external requires one boolean argument.

    • IO::Zlib::import: 'gzip_read_open' requires an argument

      The :gzip_read_open requires one string argument.

    • IO::Zlib::import: 'gzip_read' '...' is illegal

      The :gzip_read_open argument must end with the pipe sign (|) and have the %s for the filename. See USING THE EXTERNAL GZIP.

    • IO::Zlib::import: 'gzip_write_open' requires an argument

      The :gzip_write_open requires one string argument.

    • IO::Zlib::import: 'gzip_write_open' '...' is illegal

      The :gzip_write_open argument must begin with the pipe sign (|) and have the %s for the filename. An output redirect (>) is also often a good idea, depending on your operating system shell syntax. See USING THE EXTERNAL GZIP.

    • IO::Zlib::import: no Compress::Zlib and no external gzip

      Given that we failed to load Compress::Zlib and that the use of an external gzip was disabled, IO::Zlib has little chance of working.

    • IO::Zlib::open: needs a filename

      No filename, no open.

    • IO::Zlib::READ: NBYTES must be specified

      We must know how much to read.

    • IO::Zlib::WRITE: too long LENGTH

      The LENGTH must be less than or equal to the buffer size.

    SEE ALSO

    perlfunc, I/O Operators in perlop, IO::Handle, Compress::Zlib

    HISTORY

    Created by Tom Hughes <tom@compton.nu>.

    Support for external gzip added by Jarkko Hietaniemi <jhi@iki.fi>.

    COPYRIGHT

    Copyright (c) 1998-2004 Tom Hughes <tom@compton.nu>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    IO::Uncompress::AnyInflate

    NAME

    IO::Uncompress::AnyInflate - Uncompress zlib-based (zip, gzip) file/buffer

    SYNOPSIS

        use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;

        my $status = anyinflate $input => $output [,OPTS]
            or die "anyinflate failed: $AnyInflateError\n";

        my $z = new IO::Uncompress::AnyInflate $input [OPTS]
            or die "anyinflate failed: $AnyInflateError\n";

        $status = $z->read($buffer)
        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $line = $z->getline()
        $char = $z->getc()
        $char = $z->ungetc()
        $char = $z->opened()

        $status = $z->inflateSync()

        $data = $z->trailingData()
        $status = $z->nextStream()
        $data = $z->getHeaderInfo()
        $z->tell()
        $z->seek($position, $whence)
        $z->binmode()
        $z->fileno()
        $z->eof()
        $z->close()

        $AnyInflateError ;

        # IO::File mode

        <$z>
        read($z, $buffer);
        read($z, $buffer, $length);
        read($z, $buffer, $length, $offset);
        tell($z)
        seek($z, $position, $whence)
        binmode($z)
        fileno($z)
        eof($z)
        close($z)

    DESCRIPTION

    This module provides a Perl interface that allows the reading of files/buffers that have been compressed in a number of formats that use the zlib compression library.

    The formats supported are

    • RFC 1950
    • RFC 1951 (optionally)
    • gzip (RFC 1952)
    • zip

    The module will auto-detect which, if any, of the supported compression formats is being used.

    Functional Interface

    A top-level function, anyinflate, is provided to carry out "one-shot" uncompression between buffers and/or files. For finer control over the uncompression process, see the OO Interface section.

        use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;

        anyinflate $input_filename_or_reference => $output_filename_or_reference [,OPTS]
            or die "anyinflate failed: $AnyInflateError\n";

    The functional interface needs Perl 5.005 or better.

    anyinflate $input => $output [, OPTS]

    anyinflate expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference.

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference, is used to define the source of the compressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference.

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is uncompressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">" anyinflate will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the uncompressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the uncompressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the uncompressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the uncompressed data will be stored in $$output_filename_or_reference .

    • An Array Reference

      If $output_filename_or_reference is an array reference, the uncompressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">" anyinflate will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.

    Notes

    When $input_filename_or_reference maps to multiple compressed files/buffers and $output_filename_or_reference is a single file/buffer, after uncompression $output_filename_or_reference will contain a concatenation of all the uncompressed data from each of the input files/buffers.

    Optional Parameters

    Unless specified below, the optional parameters for anyinflate, OPTS, are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to anyinflate that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once anyinflate has completed.

      This parameter defaults to 0.

    • BinModeOut => 0|1

      When writing to a file or filehandle, set binmode before writing to the file.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all uncompressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any uncompressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any uncompressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any uncompressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all uncompressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any uncompressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all uncompressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any uncompressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any uncompressed data is output.

      Defaults to 0.

    • MultiStream => 0|1

      If the input file/buffer contains multiple compressed data streams, this option will uncompress the whole lot as a single data stream.

      Defaults to 0.

    • TrailingData => $scalar

      Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete.

      This option can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

      If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

      If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

      Don't bother using trailingData if the input is a filename.

      If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option.
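
    The effect of Append and MultiStream on a buffer can be sketched as follows. This is a minimal illustration, not part of the module: the sample text and variable names are made up, and IO::Compress::Gzip is used only to build compressed input in memory.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError);

# Build two gzip streams in memory and concatenate them (illustrative data).
my $two_streams = '';
for my $text ("first\n", "second\n") {
    my $plain = $text;      # work on a copy so a plain reference can be passed
    my $member;
    gzip(\$plain => \$member)
        or die "gzip failed: $GzipError\n";
    $two_streams .= $member;
}

# Defaults: only the first stream is uncompressed, and the output
# buffer is cleared before anything is written to it.
my $input   = $two_streams;
my $default = "old contents\n";
anyinflate(\$input => \$default)
    or die "anyinflate failed: $AnyInflateError\n";

# MultiStream => 1 uncompresses both streams as a single data stream;
# Append => 1 keeps whatever the output buffer already holds.
$input = $two_streams;
my $combined = "old contents\n";
anyinflate(\$input => \$combined, MultiStream => 1, Append => 1)
    or die "anyinflate failed: $AnyInflateError\n";

print "default:  $default";
print "combined: $combined";
```

    A fresh copy of the compressed data is passed to each call so that the example does not depend on whether the input buffer is left untouched.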

    Examples

    To read the contents of the file file1.txt.Compressed and write the uncompressed data to the file file1.txt.

        use strict ;
        use warnings ;
        use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;

        my $input = "file1.txt.Compressed";
        my $output = "file1.txt";
        anyinflate $input => $output
            or die "anyinflate failed: $AnyInflateError\n";

    To read from an existing Perl filehandle, $input, and write the uncompressed data to a buffer, $buffer.

        use strict ;
        use warnings ;
        use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;
        use IO::File ;

        my $input = new IO::File "<file1.txt.Compressed"
            or die "Cannot open 'file1.txt.Compressed': $!\n" ;
        my $buffer ;
        anyinflate $input => \$buffer
            or die "anyinflate failed: $AnyInflateError\n";

    To uncompress all files in the directory "/my/home" that match "*.txt.Compressed" and store the uncompressed data in the same directory:

        use strict ;
        use warnings ;
        use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;

        anyinflate '</my/home/*.txt.Compressed>' => '</my/home/#1.txt>'
            or die "anyinflate failed: $AnyInflateError\n";

    If you want to uncompress each file one at a time, this will do the trick:

        use strict ;
        use warnings ;
        use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;

        for my $input ( glob "/my/home/*.txt.Compressed" )
        {
            my $output = $input;
            # Strip the literal ".Compressed" suffix to get the output name.
            $output =~ s/\.Compressed$// ;
            anyinflate $input => $output
                or die "Error uncompressing '$input': $AnyInflateError\n";
        }

    OO Interface

    Constructor

    The format of the constructor for IO::Uncompress::AnyInflate is shown below

        my $z = new IO::Uncompress::AnyInflate $input [OPTS]
            or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

    Returns an IO::Uncompress::AnyInflate object on success and undef on failure. The variable $AnyInflateError will contain an error message on failure.

    If you are running Perl 5.005 or better, the object, $z, returned from IO::Uncompress::AnyInflate can be used exactly like an IO::File filehandle. This means that all normal input file operations can be carried out with $z. For example, to read a line from a compressed file/buffer you can use either of these forms:

        $line = $z->getline();
        $line = <$z>;

    The mandatory parameter $input is used to determine the source of the compressed data. This parameter can take one of three forms.

    • A filename

      If the $input parameter is a scalar, it is assumed to be a filename. This file will be opened for reading and the compressed data will be read from it.

    • A filehandle

      If the $input parameter is a filehandle, the compressed data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input is a scalar reference, the compressed data will be read from $$input .
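
    As a minimal sketch of the scalar-reference form, the following reads compressed data from an in-memory buffer one line at a time. The sample text and variable names are illustrative, and IO::Compress::Gzip is used only to create the compressed buffer.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

# Create a compressed buffer to read from (illustrative data).
my $plain = "alpha\nbeta\ngamma\n";
my $compressed;
gzip(\$plain => \$compressed)
    or die "gzip failed: $GzipError\n";

# A scalar reference tells the constructor to read from $compressed itself,
# rather than treating its contents as a filename.
my $z = IO::Uncompress::AnyInflate->new(\$compressed)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

my @lines;
while (defined(my $line = $z->getline())) {
    push @lines, $line;
}
$z->close();

print scalar(@lines), " lines read\n";
```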

    Constructor Options

    The option names defined below are case insensitive and can be optionally prefixed by a '-'. So all of the following are valid

        -AutoClose
        -autoclose
        AUTOCLOSE
        autoclose

    OPTS is a combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $input parameter is a filehandle. If specified, and the value is true, it will result in the file being closed once either the close method is called or the IO::Uncompress::AnyInflate object is destroyed.

      This parameter defaults to 0.

    • MultiStream => 0|1

      Allows multiple concatenated compressed streams to be treated as a single compressed stream. Decompression will stop once either the end of the file/buffer is reached, an error is encountered (premature eof, corrupt compressed data) or the end of a stream is not immediately followed by the start of another stream.

      This parameter defaults to 0.

    • Prime => $string

      This option will uncompress the contents of $string before processing the input file/buffer.

      This option can be useful when the compressed data is embedded in another file/data structure and it is not possible to work out where the compressed data begins without having to read the first few bytes. If this is the case, the uncompression can be primed with these bytes using this option.

    • Transparent => 0|1

      If this option is set and the input file/buffer is not compressed data, the module will allow reading of it anyway.

      In addition, if the input file/buffer does contain compressed data and there is non-compressed data immediately following it, setting this option will make this module treat the whole file/buffer as a single data stream.

      This option defaults to 1.

    • BlockSize => $num

      When reading the compressed input data, IO::Uncompress::AnyInflate will read it in blocks of $num bytes.

      This option defaults to 4096.

    • InputLength => $size

      When present this option will limit the number of compressed bytes read from the input file/buffer to $size . This option can be used in the situation where there is useful data directly after the compressed data stream and you know beforehand the exact length of the compressed data stream.

      This option is mostly used when reading from a filehandle, in which case the file pointer will be left pointing to the first byte directly after the compressed data stream.

      This option defaults to off.

    • Append => 0|1

      This option controls what the read method does with uncompressed data.

      If set to 1, all uncompressed data will be appended to the output parameter of the read method.

      If set to 0, the contents of the output parameter of the read method will be overwritten by the uncompressed data.

      Defaults to 0.

    • Strict => 0|1

      This option controls whether the extra checks defined below are used when carrying out the decompression. When Strict is on, the extra tests are carried out, when Strict is off they are not.

      The default for this option is off.

      If the input is an RFC 1950 data stream, the following will be checked:

      1. The ADLER32 checksum field must be present.

      2. The value of the ADLER32 field read must match the adler32 value of the uncompressed data actually contained in the file.

      If the input is a gzip (RFC 1952) data stream, the following will be checked:

      1. If the FHCRC bit is set in the gzip FLG header byte, the CRC16 bytes in the header must match the crc16 value of the gzip header actually read.

      2. If the gzip header contains a name field (FNAME) it must consist solely of ISO 8859-1 characters.

      3. If the gzip header contains a comment field (FCOMMENT) it must consist solely of ISO 8859-1 characters plus line-feed.

      4. If the gzip FEXTRA header field is present it must conform to the sub-field structure as defined in RFC 1952.

      5. The CRC32 and ISIZE trailer fields must be present.

      6. The value of the CRC32 field read must match the crc32 value of the uncompressed data actually contained in the gzip file.

      7. The value of the ISIZE field read must match the length of the uncompressed data actually read from the file.

    • RawInflate => 0|1

      When auto-detecting the compressed format, try to test for raw-deflate (RFC 1951) content using the IO::Uncompress::RawInflate module.

      This is not the default behaviour because RFC 1951 content can only be detected by attempting to uncompress it. This process is error prone and can result in false positives.

      Defaults to 0.

    • ParseExtra => 0|1

      If the gzip FEXTRA header field is present and this option is set, it will force the module to check that it conforms to the sub-field structure as defined in RFC 1952.

      If Strict is on, this option is enabled automatically.

      Defaults to 0.

    Examples

    TODO
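
    A minimal sketch of two of the constructor options above, InputLength and Transparent, using in-memory buffers. The data and variable names are illustrative, and IO::Compress::Gzip is used only to produce compressed input.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

my $plain = "hello world\n";
my $compressed;
gzip(\$plain => \$compressed)
    or die "gzip failed: $GzipError\n";

my $chunk;

# InputLength: the compressed stream is followed by unrelated bytes, and
# its exact length is known in advance, so reading is limited to it.
my $container = $compressed . "unrelated trailing bytes";
my $z = IO::Uncompress::AnyInflate->new(\$container,
                                        InputLength => length($compressed))
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";
my $from_container = '';
$from_container .= $chunk while $z->read($chunk) > 0;
$z->close();

# Transparent (on by default): input that is not compressed is passed through.
my $text = "plain, uncompressed text\n";
my $t = IO::Uncompress::AnyInflate->new(\$text)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";
my $passed_through = '';
$passed_through .= $chunk while $t->read($chunk) > 0;
$t->close();

# With Transparent => 0 the same input is rejected and the constructor
# returns undef.
my $rejected = IO::Uncompress::AnyInflate->new(\$text, Transparent => 0);
print "rejected: $AnyInflateError\n" unless defined $rejected;
```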

    Methods

    read

    Usage is

        $status = $z->read($buffer)

    Reads a block of compressed data (the size of the compressed block is determined by the BlockSize option in the constructor), uncompresses it and writes any uncompressed data into $buffer. If the Append parameter is set in the constructor, the uncompressed data will be appended to the $buffer parameter. Otherwise $buffer will be overwritten.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.
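
    The return values suggest a simple read loop, sketched below. The sample data, the BlockSize value of 512 and the variable names are illustrative assumptions; IO::Compress::Gzip is used only to build the input.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

# Sample input: a few kilobytes of text, gzipped in memory.
my $plain = join '', map { "line $_\n" } 1 .. 1000;
my $compressed;
gzip(\$plain => \$compressed)
    or die "gzip failed: $GzipError\n";

# A small BlockSize is chosen so that several calls to read are needed.
my $z = IO::Uncompress::AnyInflate->new(\$compressed, BlockSize => 512)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

my $output = '';
my ($chunk, $status);

# read returns > 0 while data is available, 0 at eof, < 0 on error.
# Append is off, so $chunk is overwritten on every call.
while (($status = $z->read($chunk)) > 0) {
    $output .= $chunk;
}
die "read failed: $AnyInflateError\n" if $status < 0;
$z->close();

print length($output), " bytes uncompressed\n";
```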

    read

    Usage is

        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $status = read($z, $buffer, $length)
        $status = read($z, $buffer, $length, $offset)

    Attempt to read $length bytes of uncompressed data into $buffer.

    The main difference between this form of the read method and the previous one is that this one will attempt to return exactly $length bytes. It will return fewer only if end-of-file or an IO error is encountered.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.

    getline

    Usage is

        $line = $z->getline()
        $line = <$z>

    Reads a single line.

    This method fully supports the use of the variable $/ (or $INPUT_RECORD_SEPARATOR or $RS when English is in use) to determine what constitutes an end of line. Paragraph mode, record mode and file slurp mode are all supported.

    getc

    Usage is

        $char = $z->getc()

    Read a single character.

    ungetc

    Usage is

        $char = $z->ungetc($string)

    inflateSync

    Usage is

        $status = $z->inflateSync()

    TODO

    getHeaderInfo

    Usage is

        $hdr = $z->getHeaderInfo();
        @hdrs = $z->getHeaderInfo();

    This method returns either a hash reference (in scalar context) or a list of hash references (in array context) that contains information about each of the header fields in the compressed data stream(s).
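
    A minimal sketch of getHeaderInfo in scalar context, for a gzip stream. The Name key and the Name option of IO::Compress::Gzip are not described in this section; they are taken from the gzip modules' own documentation, so treat them as assumptions here. The sample file name is made up.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

# Create a gzip stream whose header records an original file name.
my $plain = "some text\n";
my $compressed;
gzip(\$plain => \$compressed, Name => "notes.txt")
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyInflate->new(\$compressed)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

# Scalar context: a single hash reference describing the header.
my $hdr  = $z->getHeaderInfo();
my $name = $hdr->{Name};
print "original name: $name\n" if defined $name;

$z->close();
```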

    tell

    Usage is

        $z->tell()
        tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

        $z->eof();
        eof($z);

    Returns true if the end of the compressed input stream has been reached.

    seek

        $z->seek($position, $whence);
        seek($z, $position, $whence);

    Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the input file/buffer. It is a fatal error to attempt to seek backward.

    Note that the implementation of seek in this module does not provide true random access to a compressed file/buffer. It works by uncompressing data from the current offset in the file/buffer until it reaches the uncompressed offset specified in the parameters to seek. For very small files this may be acceptable behaviour. For large files it may cause an unacceptable delay.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.
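
    The forward-only behaviour can be sketched as follows. The sample data and variable names are illustrative; IO::Compress::Gzip is used only to build the input, and the SEEK_SET constant is imported from Fcntl.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

my $plain = "0123456789abcdefghij";
my $compressed;
gzip(\$plain => \$compressed)
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyInflate->new(\$compressed)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

# Seek forward to uncompressed offset 10; the skipped data is
# uncompressed and thrown away.
$z->seek(10, SEEK_SET)
    or die "seek failed\n";
my $offset = $z->tell();

# Read the next five uncompressed bytes.
my $piece;
$z->read($piece, 5);

# Seeking backward is a fatal error, so it is trapped with eval here.
my $went_back = eval { $z->seek(0, SEEK_SET); 1 };

print "offset $offset, read '$piece'\n";
print "backward seek refused\n" unless $went_back;
$z->close();
```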

    binmode

    Usage is

        $z->binmode
        binmode $z ;

    This is a noop provided for completeness.

    opened

        $z->opened()

    Returns true if the object currently refers to an opened file/buffer.

    autoflush

        my $prev = $z->autoflush()
        my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

        $z->input_line_number()
        $z->input_line_number(EXPR)

    Returns the current uncompressed line number. If EXPR is present it has the effect of setting the line number. Note that setting the line number does not change the current position within the file/buffer being read.

    The contents of $/ are used to determine what constitutes a line terminator.

    fileno

        $z->fileno()
        fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

        $z->close() ;
        close $z ;

    Closes the input file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Uncompress::AnyInflate object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Uncompress::AnyInflate object was created, and the object is associated with a file, the underlying file will also be closed.

    nextStream

    Usage is

        my $status = $z->nextStream();

    Skips to the next compressed data stream in the input file/buffer. If a new compressed data stream is found, the eof marker will be cleared and $. will be reset to 0.

    Returns 1 if a new stream was found, 0 if none was found, and -1 if an error was encountered.
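
    A minimal sketch of walking through concatenated streams one at a time with nextStream, as an alternative to MultiStream. The two-stream input and the variable names are illustrative; IO::Compress::Gzip is used only to build the input.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

# Two gzip streams, one after the other, in a single buffer.
my $two_streams = '';
for my $text ("first\n", "second\n") {
    my $plain = $text;
    my $member;
    gzip(\$plain => \$member)
        or die "gzip failed: $GzipError\n";
    $two_streams .= $member;
}

# MultiStream is left off, so each stream is read separately.
my $z = IO::Uncompress::AnyInflate->new(\$two_streams)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

my @streams;
my $status;
for ($status = 1; $status > 0; $status = $z->nextStream()) {
    my $content = '';
    my ($chunk, $got);
    while (($got = $z->read($chunk)) > 0) {
        $content .= $chunk;
    }
    die "read failed: $AnyInflateError\n" if $got < 0;
    push @streams, $content;
}
die "nextStream failed: $AnyInflateError\n" if $status < 0;
$z->close();

print scalar(@streams), " streams found\n";
```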

    trailingData

    Usage is

        my $data = $z->trailingData();

    Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete. It only makes sense to call this method once the end of the compressed data stream has been encountered.

    This method can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

    If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

    If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

    Don't bother using trailingData if the input is a filename.

    If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option in the constructor.
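
    For buffer input, the behaviour described above can be sketched like this. The trailer text and variable names are illustrative, and IO::Compress::Gzip is used only to build the compressed part of the buffer.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyInflate qw($AnyInflateError);

my $plain = "compressed part\n";
my $compressed;
gzip(\$plain => \$compressed)
    or die "gzip failed: $GzipError\n";

# A buffer holding a compressed stream followed by other data.
my $input = $compressed . "TRAILER";

my $z = IO::Uncompress::AnyInflate->new(\$input)
    or die "IO::Uncompress::AnyInflate failed: $AnyInflateError\n";

# Read to the end of the compressed stream first.
my $output = '';
my ($chunk, $status);
while (($status = $z->read($chunk)) > 0) {
    $output .= $chunk;
}
die "read failed: $AnyInflateError\n" if $status < 0;

# Now everything after the compressed stream is available.
my $trailing = $z->trailingData();
$z->close();

print "trailing data: $trailing\n";
```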

    Importing

    No symbolic constants are required by IO::Uncompress::AnyInflate at present.

    • :all

      Imports anyinflate and $AnyInflateError. Same as doing this:

          use IO::Uncompress::AnyInflate qw(anyinflate $AnyInflateError) ;

    EXAMPLES

    Working with Net::FTP

    See IO::Compress::FAQ

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Uncompress::AnyUncompress

    NAME

    IO::Uncompress::AnyUncompress - Uncompress gzip, zip, bzip2 or lzop file/buffer

    SYNOPSIS

        use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

        my $status = anyuncompress $input => $output [,OPTS]
            or die "anyuncompress failed: $AnyUncompressError\n";

        my $z = new IO::Uncompress::AnyUncompress $input [OPTS]
            or die "anyuncompress failed: $AnyUncompressError\n";

        $status = $z->read($buffer)
        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $line = $z->getline()
        $char = $z->getc()
        $char = $z->ungetc()
        $char = $z->opened()
        $data = $z->trailingData()
        $status = $z->nextStream()
        $data = $z->getHeaderInfo()
        $z->tell()
        $z->seek($position, $whence)
        $z->binmode()
        $z->fileno()
        $z->eof()
        $z->close()

        $AnyUncompressError ;

        # IO::File mode
        <$z>
        read($z, $buffer);
        read($z, $buffer, $length);
        read($z, $buffer, $length, $offset);
        tell($z)
        seek($z, $position, $whence)
        binmode($z)
        fileno($z)
        eof($z)
        close($z)

    DESCRIPTION

    This module provides a Perl interface that allows the reading of files/buffers that have been compressed with a variety of compression libraries.

    The formats supported are:

    • RFC 1950
    • RFC 1951 (optionally)
    • gzip (RFC 1952)
    • zip
    • bzip2
    • lzop
    • lzf
    • lzma
    • xz

    The module will auto-detect which, if any, of the supported compression formats is being used.

    Functional Interface

    A top-level function, anyuncompress , is provided to carry out "one-shot" uncompression between buffers and/or files. For finer control over the uncompression process, see the OO Interface section.

        use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

        anyuncompress $input_filename_or_reference => $output_filename_or_reference [,OPTS]
            or die "anyuncompress failed: $AnyUncompressError\n";

    The functional interface needs Perl 5.005 or better.

    anyuncompress $input => $output [, OPTS]

    anyuncompress expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference .

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference , is used to define the source of the compressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference .

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is uncompressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">" anyuncompress will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the uncompressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the uncompressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the uncompressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the uncompressed data will be stored in $$output_filename_or_reference .

    • An Array Reference

      If $output_filename_or_reference is an array reference, the uncompressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">" anyuncompress will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.

    Notes

    When $input_filename_or_reference maps to multiple compressed files/buffers and $output_filename_or_reference is a single file/buffer, after uncompression $output_filename_or_reference will contain a concatenation of all the uncompressed data from each of the input files/buffers.
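
    A minimal sketch of this concatenation, using an array reference of filenames as input and a single buffer as output. The temporary directory, file names and sample text are illustrative; File::Temp and IO::Compress::Gzip are used only to create the input files.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError);

# Create two small gzip files in a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my @files;
for my $n (1, 2) {
    my $file = "$dir/part$n.gz";
    my $text = "part $n\n";
    gzip(\$text => $file)
        or die "gzip failed: $GzipError\n";
    push @files, $file;
}

# Several input files, one output buffer: the uncompressed data from
# each file is concatenated in the order the files are listed.
my $joined;
anyuncompress(\@files => \$joined)
    or die "anyuncompress failed: $AnyUncompressError\n";

print $joined;
```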

    Optional Parameters

    Unless specified below, the optional parameters for anyuncompress , OPTS , are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to anyuncompress that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once anyuncompress has completed.

      This parameter defaults to 0.

    • BinModeOut => 0|1

      When writing to a file or filehandle, set binmode before writing to the file.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all uncompressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any uncompressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any uncompressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any uncompressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all uncompressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any uncompressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all uncompressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any uncompressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any uncompressed data is output.

      Defaults to 0.

    • MultiStream => 0|1

      If the input file/buffer contains multiple compressed data streams, this option will uncompress the whole lot as a single data stream.

      Defaults to 0.

    • TrailingData => $scalar

      Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete.

      This option can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

      If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

      If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

      Don't bother using trailingData if the input is a filename.

      If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option.

    Examples

    To read the contents of the file file1.txt.Compressed and write the uncompressed data to the file file1.txt.

        use strict ;
        use warnings ;
        use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

        my $input = "file1.txt.Compressed";
        my $output = "file1.txt";
        anyuncompress $input => $output
            or die "anyuncompress failed: $AnyUncompressError\n";

    To read from an existing Perl filehandle, $input, and write the uncompressed data to a buffer, $buffer.

        use strict ;
        use warnings ;
        use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;
        use IO::File ;

        my $input = new IO::File "<file1.txt.Compressed"
            or die "Cannot open 'file1.txt.Compressed': $!\n" ;
        my $buffer ;
        anyuncompress $input => \$buffer
            or die "anyuncompress failed: $AnyUncompressError\n";

    To uncompress all files in the directory "/my/home" that match "*.txt.Compressed" and store the uncompressed data in the same directory:

        use strict ;
        use warnings ;
        use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

        anyuncompress '</my/home/*.txt.Compressed>' => '</my/home/#1.txt>'
            or die "anyuncompress failed: $AnyUncompressError\n";

    If you want to uncompress each file one at a time, this will do the trick:

        use strict ;
        use warnings ;
        use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

        for my $input ( glob "/my/home/*.txt.Compressed" )
        {
            my $output = $input;
            # Strip the literal ".Compressed" suffix to get the output name.
            $output =~ s/\.Compressed$// ;
            anyuncompress $input => $output
                or die "Error uncompressing '$input': $AnyUncompressError\n";
        }

    OO Interface

    Constructor

    The format of the constructor for IO::Uncompress::AnyUncompress is shown below

        my $z = new IO::Uncompress::AnyUncompress $input [OPTS]
            or die "IO::Uncompress::AnyUncompress failed: $AnyUncompressError\n";

    Returns an IO::Uncompress::AnyUncompress object on success and undef on failure. The variable $AnyUncompressError will contain an error message on failure.

    If you are running Perl 5.005 or better, the object, $z, returned from IO::Uncompress::AnyUncompress can be used exactly like an IO::File filehandle. This means that all normal input file operations can be carried out with $z. For example, to read a line from a compressed file/buffer you can use either of these forms:

        $line = $z->getline();
        $line = <$z>;

    The mandatory parameter $input is used to determine the source of the compressed data. This parameter can take one of three forms.

    • A filename

      If the $input parameter is a scalar, it is assumed to be a filename. This file will be opened for reading and the compressed data will be read from it.

    • A filehandle

      If the $input parameter is a filehandle, the compressed data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input is a scalar reference, the compressed data will be read from $$input .

    Constructor Options

    The option names defined below are case insensitive and can be optionally prefixed by a '-'. So all of the following are valid

        -AutoClose
        -autoclose
        AUTOCLOSE
        autoclose

    OPTS is a combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $input parameter is a filehandle. If specified, and the value is true, it will result in the file being closed once either the close method is called or the IO::Uncompress::AnyUncompress object is destroyed.

      This parameter defaults to 0.

    • MultiStream => 0|1

      Allows multiple concatenated compressed streams to be treated as a single compressed stream. Decompression will stop once either the end of the file/buffer is reached, an error is encountered (premature eof, corrupt compressed data) or the end of a stream is not immediately followed by the start of another stream.

      This parameter defaults to 0.

    • Prime => $string

      This option will uncompress the contents of $string before processing the input file/buffer.

      This option can be useful when the compressed data is embedded in another file/data structure and it is not possible to work out where the compressed data begins without having to read the first few bytes. If this is the case, the uncompression can be primed with these bytes using this option.

    • Transparent => 0|1

      If this option is set and the input file/buffer is not compressed data, the module will allow reading of it anyway.

      In addition, if the input file/buffer does contain compressed data and there is non-compressed data immediately following it, setting this option will make this module treat the whole file/buffer as a single data stream.

      This option defaults to 1.

    • BlockSize => $num

      When reading the compressed input data, IO::Uncompress::AnyUncompress will read it in blocks of $num bytes.

      This option defaults to 4096.

    • InputLength => $size

      When present this option will limit the number of compressed bytes read from the input file/buffer to $size . This option can be used in the situation where there is useful data directly after the compressed data stream and you know beforehand the exact length of the compressed data stream.

      This option is mostly used when reading from a filehandle, in which case the file pointer will be left pointing to the first byte directly after the compressed data stream.

      This option defaults to off.

    • Append => 0|1

      This option controls what the read method does with uncompressed data.

      If set to 1, all uncompressed data will be appended to the output parameter of the read method.

      If set to 0, the contents of the output parameter of the read method will be overwritten by the uncompressed data.

      Defaults to 0.

    • Strict => 0|1

      This option controls whether the extra checks defined below are used when carrying out the decompression. When Strict is on, the extra tests are carried out; when Strict is off, they are not.

      The default for this option is off.

    • RawInflate => 0|1

      When auto-detecting the compressed format, try to test for raw-deflate (RFC 1951) content using the IO::Uncompress::RawInflate module.

      This is not the default behaviour because RFC 1951 content can only be detected by attempting to uncompress it. This process is error prone and can result in false positives.

      Defaults to 0.

    • UnLzma => 0|1

      When auto-detecting the compressed format, try to test for lzma_alone content using the IO::Uncompress::UnLzma module.

      This is not the default behaviour because lzma_alone content can only be detected by attempting to uncompress it. This process is error prone and can result in false positives.

      Defaults to 0.
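
    The constructor options above are passed as key/value pairs after the $input parameter. The following is a minimal sketch of the MultiStream option, assuming IO::Compress::Gzip is installed and used only to build sample input in memory; the variable names are illustrative.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Build two independent gzip streams in memory and join them end to end.
# (IO::Compress::Gzip is used here only to create sample input.)
my $text1 = "hello, ";
my $text2 = "world\n";
my ($first, $second);
gzip \$text1 => \$first  or die "gzip failed: $GzipError\n";
gzip \$text2 => \$second or die "gzip failed: $GzipError\n";
my $compressed = $first . $second;

# MultiStream => 1 asks the reader to carry on into the next stream
# instead of stopping at the end of the first one.
my $z = IO::Uncompress::AnyUncompress->new(\$compressed, MultiStream => 1)
    or die "AnyUncompress failed: $AnyUncompressError\n";

my $output = '';
my ($chunk, $status);
while (($status = $z->read($chunk)) > 0) {
    $output .= $chunk;    # Append is off, so each read overwrites $chunk
}
die "read failed: $AnyUncompressError\n" if $status < 0;
$z->close();

print $output;
```

    Because Append is left at its default of 0, the loop collects each block itself rather than relying on read to accumulate the data.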

    Examples

    TODO

    Methods

    read

    Usage is

    1. $status = $z->read($buffer)

    Reads a block of compressed data (the size of the compressed block is determined by the BlockSize option in the constructor), uncompresses it and writes any uncompressed data into $buffer. If the Append parameter is set in the constructor, the uncompressed data will be appended to the $buffer parameter. Otherwise $buffer will be overwritten.

    Returns the number of uncompressed bytes written to $buffer, zero if eof or a negative number on error.

    read

    Usage is

    1. $status = $z->read($buffer, $length)
    2. $status = $z->read($buffer, $length, $offset)
    3. $status = read($z, $buffer, $length)
    4. $status = read($z, $buffer, $length, $offset)

    Attempt to read $length bytes of uncompressed data into $buffer.

    The main difference between this form of the read method and the previous one is that this one will attempt to return exactly $length bytes. It returns fewer bytes only if end-of-file or an I/O error is encountered.

    Returns the number of uncompressed bytes written to $buffer, zero if eof or a negative number on error.
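
    The fixed-length form of read suits record-oriented data. The sketch below assumes sample input made of 8-byte records, compressed in memory with IO::Compress::Gzip purely for illustration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Sample data: one hundred records of exactly 8 bytes each ("rec0001\n").
my $plain = join '', map { sprintf "rec%04d\n", $_ } 1 .. 100;
my $compressed;
gzip \$plain => \$compressed or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
    or die "AnyUncompress failed: $AnyUncompressError\n";

# Ask for exactly 8 uncompressed bytes per call.
my @records;
my ($record, $status);
while (($status = $z->read($record, 8)) > 0) {
    push @records, $record;
}
die "read failed: $AnyUncompressError\n" if $status < 0;
$z->close();

print "read ", scalar(@records), " records\n";
```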

    getline

    Usage is

    1. $line = $z->getline()
    2. $line = <$z>

    Reads a single line.

    This method fully supports the use of the variable $/ (or $INPUT_RECORD_SEPARATOR or $RS when English is in use) to determine what constitutes an end of line. Paragraph mode, record mode and file slurp mode are all supported.
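
    A line-by-line loop over compressed input might look like the following sketch. It leaves $/ at its default (newline) and assumes sample data compressed in memory with IO::Compress::Gzip.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Sample input, compressed in memory for the purpose of the example.
my $plain = "alpha\nbeta\ngamma\n";
my $compressed;
gzip \$plain => \$compressed or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
    or die "AnyUncompress failed: $AnyUncompressError\n";

# getline returns undef once the uncompressed data is exhausted.
my @lines;
while (defined(my $line = $z->getline())) {
    chomp $line;
    push @lines, $line;
}
$z->close();

print scalar(@lines), " lines: @lines\n";
```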

    getc

    Usage is

    1. $char = $z->getc()

    Read a single character.

    ungetc

    Usage is

    1. $char = $z->ungetc($string)

    getHeaderInfo

    Usage is

    1. $hdr = $z->getHeaderInfo();
    2. @hdrs = $z->getHeaderInfo();

    This method returns either a hash reference (in scalar context) or a list of hash references (in list context) that contains information about each of the header fields in the compressed data stream(s).
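
    The set of header fields depends on the compressed format, so the sketch below makes no assumption about key names and simply lists whatever printable scalar fields are present. The gzip sample input is an assumption made for illustration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

my $plain = "some data\n";
my $compressed;
gzip \$plain => \$compressed or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
    or die "AnyUncompress failed: $AnyUncompressError\n";

# Scalar context: a single hash reference describing the header.
my $hdr = $z->getHeaderInfo();

for my $field (sort keys %$hdr) {
    my $value = $hdr->{$field};
    next if !defined $value;              # field not present in this stream
    next if ref $value;                   # skip nested structures in this sketch
    next if $value =~ /[^\x20-\x7e]/;     # skip raw binary values
    printf "%-20s %s\n", $field, $value;
}
$z->close();
```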

    tell

    Usage is

    1. $z->tell()
    2. tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

    1. $z->eof();
    2. eof($z);

    Returns true if the end of the compressed input stream has been reached.

    seek

    1. $z->seek($position, $whence);
    2. seek($z, $position, $whence);

    Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the input file/buffer. It is a fatal error to attempt to seek backward.

    Note that the implementation of seek in this module does not provide true random access to a compressed file/buffer. It works by uncompressing data from the current offset in the file/buffer until it reaches the uncompressed offset specified in the parameters to seek. For very small files this may be acceptable behaviour. For large files it may cause an unacceptable delay.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.
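
    A forward seek followed by a read can be sketched as follows. The sample data (the lowercase alphabet) and the use of the Fcntl constants are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Sample data: the 26 lowercase letters, compressed in memory.
my $plain = join '', 'a' .. 'z';
my $compressed;
gzip \$plain => \$compressed or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::AnyUncompress->new(\$compressed)
    or die "AnyUncompress failed: $AnyUncompressError\n";

# Skip forward to uncompressed offset 10; the skipped data is
# uncompressed and discarded behind the scenes.
$z->seek(10, SEEK_SET)
    or die "seek failed\n";

# Read the next three uncompressed bytes.
my $buf;
my $got = $z->read($buf, 3);
die "read failed: $AnyUncompressError\n" if $got < 0;

my $pos = $z->tell();    # uncompressed offset after the read
print "read '$buf', now at offset $pos\n";
$z->close();
```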

    binmode

    Usage is

    1. $z->binmode
    2. binmode $z ;

    This is a noop provided for completeness.

    opened

    1. $z->opened()

    Returns true if the object currently refers to an open file/buffer.

    autoflush

    1. my $prev = $z->autoflush()
    2. my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

    1. $z->input_line_number()
    2. $z->input_line_number(EXPR)

    Returns the current uncompressed line number. If EXPR is present it has the effect of setting the line number. Note that setting the line number does not change the current position within the file/buffer being read.

    The contents of $/ are used to determine what constitutes a line terminator.

    fileno

    1. $z->fileno()
    2. fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

    1. $z->close() ;
    2. close $z ;

    Closes the input file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Uncompress::AnyUncompress object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Uncompress::AnyUncompress object was created, and the object is associated with a file, the underlying file will also be closed.

    nextStream

    Usage is

    1. my $status = $z->nextStream();

    Skips to the next compressed data stream in the input file/buffer. If a new compressed data stream is found, the eof marker will be cleared and $. will be reset to 0.

    Returns 1 if a new stream was found, 0 if none was found, and -1 if an error was encountered.

    trailingData

    Usage is

    1. my $data = $z->trailingData();

    Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete. It only makes sense to call this method once the end of the compressed data stream has been encountered.

    This method can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

    If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

    If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

    Don't bother using trailingData if the input is a filename.

    If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option in the constructor.
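
    For a buffer input, the behaviour described above can be sketched as follows. The sample payload and the marker string "TRAILER" are assumptions made for illustration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# A compressed stream followed directly by some uncompressed bytes.
my $plain = "payload\n";
my $compressed;
gzip \$plain => \$compressed or die "gzip failed: $GzipError\n";
my $blob = $compressed . "TRAILER";

my $z = IO::Uncompress::AnyUncompress->new(\$blob)
    or die "AnyUncompress failed: $AnyUncompressError\n";

# Read to the end of the compressed stream first; trailingData is
# only meaningful once the end of the stream has been seen.
my $payload = '';
my ($chunk, $status);
while (($status = $z->read($chunk)) > 0) {
    $payload .= $chunk;
}
die "read failed: $AnyUncompressError\n" if $status < 0;

# Everything in the buffer after the compressed stream.
my $trailer = $z->trailingData();
$z->close();

print "payload: $payload";
print "trailer: $trailer\n";
```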

    Importing

    No symbolic constants are required by IO::Uncompress::AnyUncompress at present.

    • :all

      Imports anyuncompress and $AnyUncompressError. Same as doing this:

      1. use IO::Uncompress::AnyUncompress qw(anyuncompress $AnyUncompressError) ;

    EXAMPLES

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Uncompress::Base

    NAME

    IO::Uncompress::Base - Base Class for IO::Uncompress modules

    SYNOPSIS

    1. use IO::Uncompress::Base ;

    DESCRIPTION

    This module is not intended for direct use in application code. Its sole purpose is to be sub-classed by IO::Uncompress modules.

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Uncompress::Bunzip2

    NAME

    IO::Uncompress::Bunzip2 - Read bzip2 files/buffers

    SYNOPSIS

        use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error) ;

        my $status = bunzip2 $input => $output [,OPTS]
            or die "bunzip2 failed: $Bunzip2Error\n";

        my $z = new IO::Uncompress::Bunzip2 $input [OPTS]
            or die "bunzip2 failed: $Bunzip2Error\n";

        $status = $z->read($buffer)
        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $line = $z->getline()
        $char = $z->getc()
        $char = $z->ungetc()
        $char = $z->opened()

        $data = $z->trailingData()
        $status = $z->nextStream()
        $data = $z->getHeaderInfo()
        $z->tell()
        $z->seek($position, $whence)
        $z->binmode()
        $z->fileno()
        $z->eof()
        $z->close()

        $Bunzip2Error ;

        # IO::File mode

        <$z>
        read($z, $buffer);
        read($z, $buffer, $length);
        read($z, $buffer, $length, $offset);
        tell($z)
        seek($z, $position, $whence)
        binmode($z)
        fileno($z)
        eof($z)
        close($z)

    DESCRIPTION

    This module provides a Perl interface that allows the reading of bzip2 files/buffers.

    For writing bzip2 files/buffers, see the companion module IO::Compress::Bzip2.

    Functional Interface

    A top-level function, bunzip2, is provided to carry out "one-shot" uncompression between buffers and/or files. For finer control over the uncompression process, see the OO Interface section.

        use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error) ;

        bunzip2 $input => $output [,OPTS]
            or die "bunzip2 failed: $Bunzip2Error\n";

    The functional interface needs Perl 5.005 or better.

    bunzip2 $input => $output [, OPTS]

    bunzip2 expects at least two parameters, $input and $output.

    The $input parameter

    The parameter, $input, is used to define the source of the compressed data.

    It can take one of the following forms:

    • A filename

      If the $input parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input is a scalar reference, the input data will be read from $$input.

    • An array reference

      If $input is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is uncompressed.

    • An Input FileGlob string

      If $input is a string that is delimited by the characters "<" and ">" bunzip2 will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      If the fileglob does not match any files ...

      See File::GlobMapper for more details.

    If the $input parameter is any other type, undef will be returned.

    The $output parameter

    The parameter $output is used to control the destination of the uncompressed data. This parameter can take one of these forms.

    • A filename

      If the $output parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the uncompressed data will be written to it.

    • A filehandle

      If the $output parameter is a filehandle, the uncompressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output is a scalar reference, the uncompressed data will be stored in $$output.

    • An Array Reference

      If $output is an array reference, the uncompressed data will be pushed onto the array.

    • An Output FileGlob

      If $output is a string that is delimited by the characters "<" and ">" bunzip2 will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output is a fileglob string, $input must also be a fileglob string. Anything else is an error.

    If the $output parameter is any other type, undef will be returned.

    Notes

    When $input maps to multiple compressed files/buffers and $output is a single file/buffer, after uncompression $output will contain a concatenation of all the uncompressed data from each of the input files/buffers.
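
    Two of the $input/$output combinations described above can be sketched as follows: a scalar reference to a scalar reference, and a filename to a scalar reference. The sketch assumes IO::Compress::Bzip2 is available to create the sample data and uses File::Temp for a scratch directory; the file name is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Bzip2     qw(bzip2   $Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

my $original = "The quick brown fox\n" x 3;

# Compress into an in-memory buffer so the first case needs no files.
my $compressed;
bzip2 \$original => \$compressed
    or die "bzip2 failed: $Bzip2Error\n";

# Case 1: scalar reference in, scalar reference out.
my $from_buffer;
bunzip2 \$compressed => \$from_buffer
    or die "bunzip2 failed: $Bunzip2Error\n";

# Case 2: filename in, scalar reference out.
my $dir  = tempdir(CLEANUP => 1);        # removed when the program exits
my $path = "$dir/sample.txt.bz2";        # hypothetical file name
bzip2 \$original => $path
    or die "bzip2 failed: $Bzip2Error\n";

my $from_file;
bunzip2 $path => \$from_file
    or die "bunzip2 failed: $Bunzip2Error\n";

print "buffer round trip ok\n" if $from_buffer eq $original;
print "file round trip ok\n"   if $from_file   eq $original;
```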

    Optional Parameters

    Unless specified below, the optional parameters for bunzip2, OPTS, are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to bunzip2 that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once bunzip2 has completed.

      This parameter defaults to 0.

    • BinModeOut => 0|1

      When writing to a file or filehandle, set binmode before writing to the file.

      Defaults to 0.

    • Append => 0|1

      TODO

    • MultiStream => 0|1

      If the input file/buffer contains multiple compressed data streams, this option will uncompress the whole lot as a single data stream.

      Defaults to 0.

    • TrailingData => $scalar

      Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete.

      This option can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

      If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

      If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

      Don't bother using trailingData if the input is a filename.

      If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option.

    Examples

    To read the contents of the file file1.txt.bz2 and write the uncompressed data to the file file1.txt:

        use strict ;
        use warnings ;
        use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error) ;

        my $input = "file1.txt.bz2";
        my $output = "file1.txt";
        bunzip2 $input => $output
            or die "bunzip2 failed: $Bunzip2Error\n";

    To read from an existing Perl filehandle, $input, and write the uncompressed data to a buffer, $buffer:

        use strict ;
        use warnings ;
        use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error) ;
        use IO::File ;

        my $input = new IO::File "<file1.txt.bz2"
            or die "Cannot open 'file1.txt.bz2': $!\n" ;
        my $buffer ;
        bunzip2 $input => \$buffer
            or die "bunzip2 failed: $Bunzip2Error\n";

    To uncompress all files in the directory "/my/home" that match "*.txt.bz2" and store the uncompressed data in the same directory:

        use strict ;
        use warnings ;
        use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error) ;

        bunzip2 '</my/home/*.txt.bz2>' => '</my/home/#1.txt>'
            or die "bunzip2 failed: $Bunzip2Error\n";

    If you want to uncompress each file one at a time, this will do the trick:

        use strict ;
        use warnings ;
        use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error) ;

        for my $input ( glob "/my/home/*.txt.bz2" )
        {
            my $output = $input;
            $output =~ s/\.bz2$// ;    # strip only the trailing ".bz2" suffix
            bunzip2 $input => $output
                or die "Error uncompressing '$input': $Bunzip2Error\n";
        }

    OO Interface

    Constructor

    The format of the constructor for IO::Uncompress::Bunzip2 is shown below

        my $z = new IO::Uncompress::Bunzip2 $input [OPTS]
            or die "IO::Uncompress::Bunzip2 failed: $Bunzip2Error\n";

    Returns an IO::Uncompress::Bunzip2 object on success and undef on failure. The variable $Bunzip2Error will contain an error message on failure.

    If you are running Perl 5.005 or better, the object, $z, returned from IO::Uncompress::Bunzip2 can be used exactly like an IO::File filehandle. This means that all normal input file operations can be carried out with $z. For example, to read a line from a compressed file/buffer you can use either of these forms:

    1. $line = $z->getline();
    2. $line = <$z>;

    The mandatory parameter $input is used to determine the source of the compressed data. This parameter can take one of three forms.

    • A filename

      If the $input parameter is a scalar, it is assumed to be a filename. This file will be opened for reading and the compressed data will be read from it.

    • A filehandle

      If the $input parameter is a filehandle, the compressed data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input is a scalar reference, the compressed data will be read from $$input.
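
    The scalar-reference form of the constructor, combined with filehandle-style reading, can be sketched as follows. IO::Compress::Bzip2 is assumed to be available and is used only to create the sample input.

```perl
use strict;
use warnings;
use IO::Compress::Bzip2     qw(bzip2 $Bzip2Error);
use IO::Uncompress::Bunzip2 qw($Bunzip2Error);

# Sample input, compressed in memory for the purpose of the example.
my $plain = "line one\nline two\nline three\n";
my $compressed;
bzip2 \$plain => \$compressed
    or die "bzip2 failed: $Bzip2Error\n";

# A scalar reference as $input: the compressed data is read from $compressed.
my $z = IO::Uncompress::Bunzip2->new(\$compressed)
    or die "IO::Uncompress::Bunzip2 failed: $Bunzip2Error\n";

# The object can be read like an ordinary input filehandle.
my $count = 0;
while (defined(my $line = <$z>)) {
    $count++;
    print "$count: $line";
}
$z->close();
```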

    Constructor Options

    The option names defined below are case insensitive and can be optionally prefixed by a '-'. So all of the following are valid

    1. -AutoClose
    2. -autoclose
    3. AUTOCLOSE
    4. autoclose

    OPTS is a combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $input parameter is a filehandle. If specified, and the value is true, it will result in the file being closed once either the close method is called or the IO::Uncompress::Bunzip2 object is destroyed.

      This parameter defaults to 0.

    • MultiStream => 0|1

      Allows multiple concatenated compressed streams to be treated as a single compressed stream. Decompression will stop once either the end of the file/buffer is reached, an error is encountered (premature eof, corrupt compressed data) or the end of a stream is not immediately followed by the start of another stream.

      This parameter defaults to 0.

    • Prime => $string

      This option will uncompress the contents of $string before processing the input file/buffer.

      This option can be useful when the compressed data is embedded in another file/data structure and it is not possible to work out where the compressed data begins without having to read the first few bytes. If this is the case, the uncompression can be primed with these bytes using this option.

    • Transparent => 0|1

      If this option is set and the input file/buffer is not compressed data, the module will allow reading of it anyway.

      In addition, if the input file/buffer does contain compressed data and there is non-compressed data immediately following it, setting this option will make this module treat the whole file/buffer as a single data stream.

      This option defaults to 1.

    • BlockSize => $num

      When reading the compressed input data, IO::Uncompress::Bunzip2 will read it in blocks of $num bytes.

      This option defaults to 4096.

    • InputLength => $size

      When present, this option will limit the number of compressed bytes read from the input file/buffer to $size. This option can be used in the situation where there is useful data directly after the compressed data stream and you know beforehand the exact length of the compressed data stream.

      This option is mostly used when reading from a filehandle, in which case the file pointer will be left pointing to the first byte directly after the compressed data stream.

      This option defaults to off.

    • Append => 0|1

      This option controls what the read method does with uncompressed data.

      If set to 1, all uncompressed data will be appended to the output parameter of the read method.

      If set to 0, the contents of the output parameter of the read method will be overwritten by the uncompressed data.

      Defaults to 0.

    • Strict => 0|1

      This option is a no-op.

    • Small => 0|1

      When non-zero, this option will make bzip2 use a decompression algorithm that uses less memory at the expense of increasing the amount of time taken for decompression.

      Default is 0.
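
    As a minimal sketch of how these options are passed, the following combines Append and Small. The sample data is an assumption, built in memory with IO::Compress::Bzip2.

```perl
use strict;
use warnings;
use IO::Compress::Bzip2     qw(bzip2 $Bzip2Error);
use IO::Uncompress::Bunzip2 qw($Bunzip2Error);

my $original = join '', map { "row $_\n" } 1 .. 500;
my $compressed;
bzip2 \$original => \$compressed
    or die "bzip2 failed: $Bzip2Error\n";

# Append => 1: each read adds to $out instead of overwriting it.
# Small  => 1: trade decompression speed for lower memory use.
my $z = IO::Uncompress::Bunzip2->new(\$compressed, Append => 1, Small => 1)
    or die "IO::Uncompress::Bunzip2 failed: $Bunzip2Error\n";

my $out = '';
my $status;
1 while ($status = $z->read($out)) > 0;
die "read failed: $Bunzip2Error\n" if $status < 0;
$z->close();

print length($out), " bytes uncompressed\n";
```

    With Append enabled the read loop needs no temporary buffer, because the method accumulates the data in $out itself.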

    Examples

    TODO

    Methods

    read

    Usage is

    1. $status = $z->read($buffer)

    Reads a block of compressed data (the size of the compressed block is determined by the BlockSize option in the constructor), uncompresses it and writes any uncompressed data into $buffer. If the Append parameter is set in the constructor, the uncompressed data will be appended to the $buffer parameter. Otherwise $buffer will be overwritten.

    Returns the number of uncompressed bytes written to $buffer, zero if eof or a negative number on error.

    read

    Usage is

    1. $status = $z->read($buffer, $length)
    2. $status = $z->read($buffer, $length, $offset)
    3. $status = read($z, $buffer, $length)
    4. $status = read($z, $buffer, $length, $offset)

    Attempt to read $length bytes of uncompressed data into $buffer.

    The main difference between this form of the read method and the previous one is that this one will attempt to return exactly $length bytes. It returns fewer bytes only if end-of-file or an I/O error is encountered.

    Returns the number of uncompressed bytes written to $buffer, zero if eof or a negative number on error.

    getline

    Usage is

    1. $line = $z->getline()
    2. $line = <$z>

    Reads a single line.

    This method fully supports the use of the variable $/ (or $INPUT_RECORD_SEPARATOR or $RS when English is in use) to determine what constitutes an end of line. Paragraph mode, record mode and file slurp mode are all supported.

    getc

    Usage is

    1. $char = $z->getc()

    Read a single character.

    ungetc

    Usage is

    1. $char = $z->ungetc($string)

    getHeaderInfo

    Usage is

    1. $hdr = $z->getHeaderInfo();
    2. @hdrs = $z->getHeaderInfo();

    This method returns either a hash reference (in scalar context) or a list of hash references (in list context) that contains information about each of the header fields in the compressed data stream(s).

    tell

    Usage is

    1. $z->tell()
    2. tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

    1. $z->eof();
    2. eof($z);

    Returns true if the end of the compressed input stream has been reached.

    seek

    1. $z->seek($position, $whence);
    2. seek($z, $position, $whence);

    Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the input file/buffer. It is a fatal error to attempt to seek backward.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.

    binmode

    Usage is

    1. $z->binmode
    2. binmode $z ;

    This is a noop provided for completeness.

    opened

    1. $z->opened()

    Returns true if the object currently refers to an open file/buffer.

    autoflush

    1. my $prev = $z->autoflush()
    2. my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

    1. $z->input_line_number()
    2. $z->input_line_number(EXPR)

    Returns the current uncompressed line number. If EXPR is present it has the effect of setting the line number. Note that setting the line number does not change the current position within the file/buffer being read.

    The contents of $/ are used to determine what constitutes a line terminator.

    fileno

    1. $z->fileno()
    2. fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

    1. $z->close() ;
    2. close $z ;

    Closes the input file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Uncompress::Bunzip2 object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Uncompress::Bunzip2 object was created, and the object is associated with a file, the underlying file will also be closed.

    nextStream

    Usage is

    1. my $status = $z->nextStream();

    Skips to the next compressed data stream in the input file/buffer. If a new compressed data stream is found, the eof marker will be cleared and $. will be reset to 0.

    Returns 1 if a new stream was found, 0 if none was found, and -1 if an error was encountered.
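
    Walking through several concatenated streams one at a time can be sketched as follows. The three sample streams are assumptions, built in memory with IO::Compress::Bzip2.

```perl
use strict;
use warnings;
use IO::Compress::Bzip2     qw(bzip2 $Bzip2Error);
use IO::Uncompress::Bunzip2 qw($Bunzip2Error);

# Build an input buffer holding three bzip2 streams back to back.
my @parts = ("first stream\n", "second stream\n", "third stream\n");
my $input = '';
for my $part (@parts) {
    my $bz;
    bzip2 \$part => \$bz or die "bzip2 failed: $Bzip2Error\n";
    $input .= $bz;
}

# MultiStream is left off so that each stream is handled separately.
my $z = IO::Uncompress::Bunzip2->new(\$input)
    or die "IO::Uncompress::Bunzip2 failed: $Bunzip2Error\n";

my @streams;
my ($chunk, $status);
for ($status = 1; $status > 0; $status = $z->nextStream()) {
    # Drain the current stream.
    my $content = '';
    while (($status = $z->read($chunk)) > 0) {
        $content .= $chunk;
    }
    last if $status < 0;
    push @streams, $content;
}
die "error while reading: $Bunzip2Error\n" if $status < 0;
$z->close();

print scalar(@streams), " streams found\n";
```

    The outer loop stops when nextStream returns 0 (no further stream) or -1 (error), matching the return values listed above.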

    trailingData

    Usage is

    1. my $data = $z->trailingData();

    Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete. It only makes sense to call this method once the end of the compressed data stream has been encountered.

    This method can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

    If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

    If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

    Don't bother using trailingData if the input is a filename.

    If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option in the constructor.

    Importing

    No symbolic constants are required by IO::Uncompress::Bunzip2 at present.

    • :all

      Imports bunzip2 and $Bunzip2Error. Same as doing this:

      1. use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error) ;

    EXAMPLES

    Working with Net::FTP

    See IO::Uncompress::Bunzip2::FAQ

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    Compress::Zlib::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    The primary site for the bzip2 program is http://www.bzip.org.

    See the module Compress::Bzip2

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2008 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Uncompress::Gunzip

    NAME

    IO::Uncompress::Gunzip - Read RFC 1952 files/buffers

    SYNOPSIS

        use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

        my $status = gunzip $input => $output [,OPTS]
            or die "gunzip failed: $GunzipError\n";

        my $z = new IO::Uncompress::Gunzip $input [OPTS]
            or die "gunzip failed: $GunzipError\n";

        $status = $z->read($buffer)
        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $line = $z->getline()
        $char = $z->getc()
        $char = $z->ungetc()
        $char = $z->opened()

        $status = $z->inflateSync()

        $data = $z->trailingData()
        $status = $z->nextStream()
        $data = $z->getHeaderInfo()
        $z->tell()
        $z->seek($position, $whence)
        $z->binmode()
        $z->fileno()
        $z->eof()
        $z->close()

        $GunzipError ;

        # IO::File mode

        <$z>
        read($z, $buffer);
        read($z, $buffer, $length);
        read($z, $buffer, $length, $offset);
        tell($z)
        seek($z, $position, $whence)
        binmode($z)
        fileno($z)
        eof($z)
        close($z)

    DESCRIPTION

    This module provides a Perl interface that allows the reading of files/buffers that conform to RFC 1952.

    For writing RFC 1952 files/buffers, see the companion module IO::Compress::Gzip.

    Functional Interface

    A top-level function, gunzip , is provided to carry out "one-shot" uncompression between buffers and/or files. For finer control over the uncompression process, see the OO Interface section.

        use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

        gunzip $input_filename_or_reference => $output_filename_or_reference [,OPTS]
            or die "gunzip failed: $GunzipError\n";

    The functional interface needs Perl 5.005 or better.

    gunzip $input => $output [, OPTS]

    gunzip expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference .

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference , is used to define the source of the compressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference .

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is uncompressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">", gunzip will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.
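    As a minimal sketch of the scalar-reference form, the snippet below first builds a small gzip buffer in memory with the companion module IO::Compress::Gzip and then passes a reference to that buffer as the input. The variable names and the sample text are illustrative assumptions, not part of the module.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip   $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Illustrative sample data; any string would do.
my $original = "hello, world\n";

# Build an RFC 1952 buffer in memory to act as the compressed input.
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

# A scalar reference as input: the data is read from $$input.
my $uncompressed;
gunzip \$compressed => \$uncompressed
    or die "gunzip failed: $GunzipError\n";

print $uncompressed;
```

    The same call shape applies to the other input forms; only the first argument changes.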

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the uncompressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the uncompressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the uncompressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the uncompressed data will be stored in $$output_filename_or_reference .

    • An Array Reference

      If $output_filename_or_reference is an array reference, the uncompressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">", gunzip will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.
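    The filename form of the output can be sketched as follows. The example writes into a throwaway directory created with File::Temp; the directory, the file name and the sample text are assumptions made for illustration only.

```perl
use strict;
use warnings;
use File::Temp             qw(tempdir);
use IO::Compress::Gzip     qw(gzip   $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Illustrative sample data, compressed in memory first.
my $original = "line one\nline two\n";
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

# Hypothetical output location inside a temporary directory.
my $dir      = tempdir(CLEANUP => 1);
my $out_file = "$dir/file1.txt";

# A simple scalar as output is treated as a filename and opened for writing.
gunzip \$compressed => $out_file
    or die "gunzip failed: $GunzipError\n";

# Read the file back to see what was written.
open my $fh, '<', $out_file or die "Cannot open '$out_file': $!\n";
my $written = do { local $/; <$fh> };
close $fh;
```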

    Notes

    When $input_filename_or_reference maps to multiple compressed files/buffers and $output_filename_or_reference is a single file/buffer, after uncompression $output_filename_or_reference will contain a concatenation of all the uncompressed data from each of the input files/buffers.

    Optional Parameters

    Unless specified below, the optional parameters for gunzip , OPTS , are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to gunzip that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once gunzip has completed.

      This parameter defaults to 0.

    • BinModeOut => 0|1

      When writing to a file or filehandle, set binmode before writing to the file.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all uncompressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any uncompressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any uncompressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any uncompressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all uncompressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any uncompressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all uncompressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any uncompressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any uncompressed data is output.

      Defaults to 0.

    • MultiStream => 0|1

      If the input file/buffer contains multiple compressed data streams, this option will uncompress the whole lot as a single data stream.

      Defaults to 0.

    • TrailingData => $scalar

      Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete.

      This option can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

      If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

      If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

      Don't bother using trailingData if the input is a filename.

      If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option.
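    The effect of MultiStream and Append on buffer output can be sketched like this. The two sample streams and all variable names are assumptions chosen for the illustration; the input is simply two gzip streams joined end to end.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip   $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

my $first  = "first stream\n";
my $second = "second stream\n";

my ($gz1, $gz2);
gzip \$first  => \$gz1 or die "gzip failed: $GzipError\n";
gzip \$second => \$gz2 or die "gzip failed: $GzipError\n";

# Two complete gzip streams back to back in one buffer.
my $both = $gz1 . $gz2;

# Default (MultiStream => 0): only the first stream is uncompressed.
my $single;
gunzip \$both => \$single
    or die "gunzip failed: $GunzipError\n";

# MultiStream => 1: all streams are uncompressed as one data stream.
my $all;
gunzip \$both => \$all, MultiStream => 1
    or die "gunzip failed: $GunzipError\n";

# Append => 1: existing buffer contents are kept and added to.
my $log = "header\n";
gunzip \$gz1 => \$log, Append => 1
    or die "gunzip failed: $GunzipError\n";
```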

    Examples

    To read the contents of the file file1.txt.gz and write the uncompressed data to the file file1.txt:

        use strict ;
        use warnings ;
        use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

        my $input = "file1.txt.gz";
        my $output = "file1.txt";
        gunzip $input => $output
            or die "gunzip failed: $GunzipError\n";

    To read from an existing Perl filehandle, $input, and write the uncompressed data to a buffer, $buffer:

        use strict ;
        use warnings ;
        use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;
        use IO::File ;

        my $input = new IO::File "<file1.txt.gz"
            or die "Cannot open 'file1.txt.gz': $!\n" ;
        my $buffer ;
        gunzip $input => \$buffer
            or die "gunzip failed: $GunzipError\n";

    To uncompress all files in the directory "/my/home" that match "*.txt.gz" and store the uncompressed data in the same directory:

        use strict ;
        use warnings ;
        use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

        gunzip '</my/home/*.txt.gz>' => '</my/home/#1.txt>'
            or die "gunzip failed: $GunzipError\n";

    and if you want to uncompress each file one at a time, this will do the trick:

        use strict ;
        use warnings ;
        use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

        for my $input ( glob "/my/home/*.txt.gz" )
        {
            my $output = $input;
            # Remove only a literal ".gz" suffix at the end of the name.
            $output =~ s/\.gz$// ;
            gunzip $input => $output
                or die "Error uncompressing '$input': $GunzipError\n";
        }

    OO Interface

    Constructor

    The format of the constructor for IO::Uncompress::Gunzip is shown below

        my $z = new IO::Uncompress::Gunzip $input [OPTS]
            or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

    Returns an IO::Uncompress::Gunzip object on success and undef on failure. The variable $GunzipError will contain an error message on failure.

    If you are running Perl 5.005 or better the object, $z , returned from IO::Uncompress::Gunzip can be used exactly like an IO::File filehandle. This means that all normal input file operations can be carried out with $z . For example, to read a line from a compressed file/buffer you can use either of these forms

        $line = $z->getline();
        $line = <$z>;

    The mandatory parameter $input is used to determine the source of the compressed data. This parameter can take one of three forms.

    • A filename

      If the $input parameter is a scalar, it is assumed to be a filename. This file will be opened for reading and the compressed data will be read from it.

    • A filehandle

      If the $input parameter is a filehandle, the compressed data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input is a scalar reference, the compressed data will be read from $$input .
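    A minimal sketch of the OO interface with a scalar reference as input follows. The sample text and variable names are assumptions for illustration; the method-call form of new is used here and is equivalent to the indirect form shown above.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Illustrative three-line input, compressed in memory.
my $text = "alpha\nbeta\ngamma\n";
my $compressed;
gzip \$text => \$compressed
    or die "gzip failed: $GzipError\n";

# Construct the object from a scalar reference.
my $z = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

# Read the uncompressed data one line at a time.
my @lines;
while (defined(my $line = $z->getline())) {
    chomp $line;
    push @lines, $line;
}
$z->close();
```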

    Constructor Options

    The option names defined below are case insensitive and can be optionally prefixed by a '-'. So all of the following are valid

        -AutoClose
        -autoclose
        AUTOCLOSE
        autoclose

    OPTS is a combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $input parameter is a filehandle. If specified, and the value is true, it will result in the file being closed once either the close method is called or the IO::Uncompress::Gunzip object is destroyed.

      This parameter defaults to 0.

    • MultiStream => 0|1

      Allows multiple concatenated compressed streams to be treated as a single compressed stream. Decompression will stop once either the end of the file/buffer is reached, an error is encountered (premature eof, corrupt compressed data) or the end of a stream is not immediately followed by the start of another stream.

      This parameter defaults to 0.

    • Prime => $string

      This option will uncompress the contents of $string before processing the input file/buffer.

      This option can be useful when the compressed data is embedded in another file/data structure and it is not possible to work out where the compressed data begins without having to read the first few bytes. If this is the case, the uncompression can be primed with these bytes using this option.

    • Transparent => 0|1

      If this option is set and the input file/buffer is not compressed data, the module will allow reading of it anyway.

      In addition, if the input file/buffer does contain compressed data and there is non-compressed data immediately following it, setting this option will make this module treat the whole file/buffer as a single data stream.

      This option defaults to 1.

    • BlockSize => $num

      When reading the compressed input data, IO::Uncompress::Gunzip will read it in blocks of $num bytes.

      This option defaults to 4096.

    • InputLength => $size

      When present this option will limit the number of compressed bytes read from the input file/buffer to $size . This option can be used in the situation where there is useful data directly after the compressed data stream and you know beforehand the exact length of the compressed data stream.

      This option is mostly used when reading from a filehandle, in which case the file pointer will be left pointing to the first byte directly after the compressed data stream.

      This option defaults to off.

    • Append => 0|1

      This option controls what the read method does with uncompressed data.

      If set to 1, all uncompressed data will be appended to the output parameter of the read method.

      If set to 0, the contents of the output parameter of the read method will be overwritten by the uncompressed data.

      Defaults to 0.

    • Strict => 0|1

      This option controls whether the extra checks defined below are used when carrying out the decompression. When Strict is on, the extra tests are carried out, when Strict is off they are not.

      The default for this option is off.

      1. If the FHCRC bit is set in the gzip FLG header byte, the CRC16 bytes in the header must match the crc16 value of the gzip header actually read.

      2. If the gzip header contains a name field (FNAME), it must consist solely of ISO 8859-1 characters.

      3. If the gzip header contains a comment field (FCOMMENT), it must consist solely of ISO 8859-1 characters plus line-feed.

      4. If the gzip FEXTRA header field is present, it must conform to the sub-field structure as defined in RFC 1952.

      5. The CRC32 and ISIZE trailer fields must be present.

      6. The value of the CRC32 field read must match the crc32 value of the uncompressed data actually contained in the gzip file.

      7. The value of the ISIZE field read must match the length of the uncompressed data actually read from the file.

    • ParseExtra => 0|1

      If the gzip FEXTRA header field is present and this option is set, it will force the module to check that it conforms to the sub-field structure as defined in RFC 1952.

      If Strict is on, this option is enabled automatically.

      Defaults to 0.
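    The Transparent option described above can be sketched with the functional interface, which accepts the same option. The input here is deliberately not gzip data; the sample string and variable names are assumptions for illustration.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Plain text that is not in RFC 1952 format.
my $plain = "this is not gzip data\n";

# Transparent defaults to 1, so the data is passed through unchanged.
my $copy;
gunzip \$plain => \$copy
    or die "gunzip failed: $GunzipError\n";

# With Transparent => 0 the same input is rejected and gunzip returns false.
my $rejected;
my $ok = gunzip \$plain => \$rejected, Transparent => 0;
print "rejected: $GunzipError\n" unless $ok;
```

    Turning Transparent off is a reasonable choice when the caller needs to be sure the input really was compressed.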

    Examples

    TODO

    Methods

    read

    Usage is

        $status = $z->read($buffer)

    Reads a block of compressed data (the size of the compressed block is determined by the BlockSize option in the constructor), uncompresses it and writes any uncompressed data into $buffer. If the Append parameter is set in the constructor, the uncompressed data will be appended to the $buffer parameter. Otherwise $buffer will be overwritten.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.
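    A typical read loop built on these return values might look like the sketch below. It assumes the Append constructor option so that each call adds to the same buffer; the sample data and variable names are illustrative only.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Illustrative input: 10,000 bytes, compressed in memory.
my $original = "x" x 10_000;
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

# Append => 1 makes read() add to $data instead of overwriting it.
my $z = IO::Uncompress::Gunzip->new(\$compressed, Append => 1)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

my $data = '';
my $status;

# read() returns a positive byte count, 0 at eof, or a negative number on error.
1 while ($status = $z->read($data)) > 0;
die "read failed: $GunzipError\n" if $status < 0;

$z->close();
```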

    read

    Usage is

        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $status = read($z, $buffer, $length)
        $status = read($z, $buffer, $length, $offset)

    Attempt to read $length bytes of uncompressed data into $buffer.

    The main difference between this form of the read method and the previous one is that this one will attempt to return exactly $length bytes. It will return fewer only if end-of-file or an I/O error is encountered.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.
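    The fixed-length form can be sketched as follows, reading the uncompressed data in chunks of 8 bytes. The chunk size, sample data and variable names are assumptions for illustration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Illustrative input: 30 bytes, compressed in memory.
my $original = "0123456789" x 3;
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

my @chunks;
my ($chunk, $status);

# Each call overwrites $chunk with up to 8 uncompressed bytes.
while (($status = $z->read($chunk, 8)) > 0) {
    push @chunks, $chunk;
}
die "read failed: $GunzipError\n" if $status < 0;

$z->close();
```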

    getline

    Usage is

        $line = $z->getline()
        $line = <$z>

    Reads a single line.

    This method fully supports the use of the variable $/ (or $INPUT_RECORD_SEPARATOR or $RS when English is in use) to determine what constitutes an end of line. Paragraph mode, record mode and file slurp mode are all supported.
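    As a sketch of paragraph mode, the example below sets $/ to the empty string for the duration of the loop, so that getline returns one paragraph per call. The sample text and variable names are assumptions for illustration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Two paragraphs separated by a blank line.
my $text = "para one line 1\npara one line 2\n\npara two\n";
my $compressed;
gzip \$text => \$compressed
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

my @paragraphs;
{
    # An empty $/ selects paragraph mode; 'local' restores it afterwards.
    local $/ = "";
    while (defined(my $para = $z->getline())) {
        push @paragraphs, $para;
    }
}
$z->close();

print scalar(@paragraphs), " paragraphs read\n";
```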

    getc

    Usage is

        $char = $z->getc()

    Read a single character.

    ungetc

    Usage is

        $char = $z->ungetc($string)

    inflateSync

    Usage is

        $status = $z->inflateSync()

    TODO

    getHeaderInfo

    Usage is

        $hdr = $z->getHeaderInfo();
        @hdrs = $z->getHeaderInfo();

    This method returns either a hash reference (in scalar context) or a list of hash references (in array context) that contains information about each of the header fields in the compressed data stream(s).

    • Name

      The contents of the Name header field, if present. If no name is present, the value will be undef. Note this is different from a zero length name, which will return an empty string.

    • Comment

      The contents of the Comment header field, if present. If no comment is present, the value will be undef. Note this is different from a zero length comment, which will return an empty string.
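    The Name and Comment fields can be inspected as in the sketch below. It relies on the Name and Comment options of the companion module IO::Compress::Gzip to write the header fields; the file name, comment text and variable names are assumptions for illustration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Write a gzip stream whose header carries a name and a comment.
my $text = "some payload\n";
my $compressed;
gzip \$text => \$compressed,
        Name    => "notes.txt",
        Comment => "illustrative comment"
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

# In scalar context getHeaderInfo returns a hash reference.
my $hdr     = $z->getHeaderInfo();
my $name    = $hdr->{Name};
my $comment = $hdr->{Comment};

$z->close();
```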

    tell

    Usage is

        $z->tell()
        tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

        $z->eof();
        eof($z);

    Returns true if the end of the compressed input stream has been reached.

    seek

        $z->seek($position, $whence);
        seek($z, $position, $whence);

    Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the input file/buffer. It is a fatal error to attempt to seek backward.

    Note that the implementation of seek in this module does not provide true random access to a compressed file/buffer. It works by uncompressing data from the current offset in the file/buffer until it reaches the uncompressed offset specified in the parameters to seek. For very small files this may be acceptable behaviour. For large files it may cause an unacceptable delay.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.
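    A forward seek followed by a short read can be sketched as follows. The SEEK_SET constant is taken from Fcntl; the sample data, offsets and variable names are assumptions for illustration.

```perl
use strict;
use warnings;
use Fcntl                  qw(SEEK_SET);
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Illustrative input: the lowercase alphabet, compressed in memory.
my $original = join '', 'a' .. 'z';
my $compressed;
gzip \$original => \$compressed
    or die "gzip failed: $GzipError\n";

my $z = IO::Uncompress::Gunzip->new(\$compressed)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

# Skip forward to uncompressed offset 10; seeking backward is not allowed.
$z->seek(10, SEEK_SET)
    or die "seek failed\n";
my $pos = $z->tell();

# Read the next five uncompressed bytes from the new position.
my $buf;
my $got = $z->read($buf, 5);
die "read failed: $GunzipError\n" if $got < 0;

$z->close();
```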

    binmode

    Usage is

        $z->binmode
        binmode $z ;

    This is a noop provided for completeness.

    opened

        $z->opened()

    Returns true if the object currently refers to an opened file/buffer.

    autoflush

        my $prev = $z->autoflush()
        my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

        $z->input_line_number()
        $z->input_line_number(EXPR)

    Returns the current uncompressed line number. If EXPR is present it has the effect of setting the line number. Note that setting the line number does not change the current position within the file/buffer being read.

    The contents of $/ are used to determine what constitutes a line terminator.

    fileno

        $z->fileno()
        fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

        $z->close() ;
        close $z ;

    Closes the input file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Uncompress::Gunzip object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Uncompress::Gunzip object was created, and the object is associated with a file, the underlying file will also be closed.

    nextStream

    Usage is

        my $status = $z->nextStream();

    Skips to the next compressed data stream in the input file/buffer. If a new compressed data stream is found, the eof marker will be cleared and $. will be reset to 0.

    Returns 1 if a new stream was found, 0 if none was found, and -1 if an error was encountered.
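    One way to walk every stream in a multi-stream buffer with nextStream is sketched below, collecting the contents of each stream separately. The two sample streams and all variable names are assumptions for illustration.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

my $first  = "first stream\n";
my $second = "second stream\n";

my ($gz1, $gz2);
gzip \$first  => \$gz1 or die "gzip failed: $GzipError\n";
gzip \$second => \$gz2 or die "gzip failed: $GzipError\n";

# Two complete gzip streams joined end to end.
my $both = $gz1 . $gz2;

my $z = IO::Uncompress::Gunzip->new(\$both)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

my @streams;
my $status;
do {
    # Drain the current stream until read() reports eof (0) or an error (<0).
    my $content = '';
    my $buffer;
    while (($status = $z->read($buffer)) > 0) {
        $content .= $buffer;
    }
    die "read failed: $GunzipError\n" if $status < 0;
    push @streams, $content;

    # nextStream() returns 1 when another stream follows, 0 when none does.
} while (($status = $z->nextStream()) == 1);
die "nextStream failed: $GunzipError\n" if $status < 0;

$z->close();
```

    Compare this with the MultiStream option, which joins the streams into one instead of keeping them apart.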

    trailingData

    Usage is

        my $data = $z->trailingData();

    Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete. It only makes sense to call this method once the end of the compressed data stream has been encountered.

    This method can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

    If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

    If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

    Don't bother using trailingData if the input is a filename.

    If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option in the constructor.
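    For buffer input, the behaviour described above can be sketched as follows. The marker string appended after the compressed stream is an assumption for illustration, and Transparent is switched off as a precaution so that the bytes after the stream are not treated as part of the data.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

my $text = "payload\n";
my $compressed;
gzip \$text => \$compressed
    or die "gzip failed: $GzipError\n";

# Compressed stream followed by uncompressed bytes (illustrative marker).
my $blob = $compressed . "TRAILER";

my $z = IO::Uncompress::Gunzip->new(\$blob, Transparent => 0)
    or die "IO::Uncompress::Gunzip failed: $GunzipError\n";

# Read to the end of the compressed stream first.
my $data = '';
my $buffer;
$data .= $buffer while $z->read($buffer) > 0;

# Only now does it make sense to ask for the trailing data.
my $trailing = $z->trailingData();

$z->close();
```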

    Importing

    No symbolic constants are required by IO::Uncompress::Gunzip at present.

    • :all

      Imports gunzip and $GunzipError. Same as doing this:

          use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

    EXAMPLES

    Working with Net::FTP

    See IO::Compress::FAQ

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Uncompress::Inflate

    NAME

    IO::Uncompress::Inflate - Read RFC 1950 files/buffers

    SYNOPSIS

        use IO::Uncompress::Inflate qw(inflate $InflateError) ;

        my $status = inflate $input => $output [,OPTS]
            or die "inflate failed: $InflateError\n";

        my $z = new IO::Uncompress::Inflate $input [OPTS]
            or die "inflate failed: $InflateError\n";

        $status = $z->read($buffer)
        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $line = $z->getline()
        $char = $z->getc()
        $char = $z->ungetc()
        $char = $z->opened()
        $status = $z->inflateSync()
        $data = $z->trailingData()
        $status = $z->nextStream()
        $data = $z->getHeaderInfo()
        $z->tell()
        $z->seek($position, $whence)
        $z->binmode()
        $z->fileno()
        $z->eof()
        $z->close()

        $InflateError ;

        # IO::File mode

        <$z>
        read($z, $buffer);
        read($z, $buffer, $length);
        read($z, $buffer, $length, $offset);
        tell($z)
        seek($z, $position, $whence)
        binmode($z)
        fileno($z)
        eof($z)
        close($z)

    DESCRIPTION

    This module provides a Perl interface that allows the reading of files/buffers that conform to RFC 1950.

    For writing RFC 1950 files/buffers, see the companion module IO::Compress::Deflate.

    Functional Interface

    A top-level function, inflate , is provided to carry out "one-shot" uncompression between buffers and/or files. For finer control over the uncompression process, see the OO Interface section.

        use IO::Uncompress::Inflate qw(inflate $InflateError) ;

        inflate $input_filename_or_reference => $output_filename_or_reference [,OPTS]
            or die "inflate failed: $InflateError\n";

    The functional interface needs Perl 5.005 or better.

    inflate $input => $output [, OPTS]

    inflate expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference .

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference , is used to define the source of the compressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference .

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is uncompressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">", inflate will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.
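    As a minimal sketch of the scalar-reference form, the snippet below builds an RFC 1950 buffer in memory with the companion module IO::Compress::Deflate and passes a reference to it as the input. The variable names and sample text are illustrative assumptions.

```perl
use strict;
use warnings;
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw(inflate $InflateError);

# Illustrative sample data; any string would do.
my $original = "hello, zlib\n";

# Build an RFC 1950 buffer in memory to act as the compressed input.
my $compressed;
deflate \$original => \$compressed
    or die "deflate failed: $DeflateError\n";

# A scalar reference as input: the data is read from $$input.
my $uncompressed;
inflate \$compressed => \$uncompressed
    or die "inflate failed: $InflateError\n";

print $uncompressed;
```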

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the uncompressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the uncompressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the uncompressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the uncompressed data will be stored in $$output_filename_or_reference .

    • An Array Reference

      If $output_filename_or_reference is an array reference, the uncompressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">", inflate will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.

    Notes

    When $input_filename_or_reference maps to multiple compressed files/buffers and $output_filename_or_reference is a single file/buffer, after uncompression $output_filename_or_reference will contain a concatenation of all the uncompressed data from each of the input files/buffers.

    Optional Parameters

    Unless specified below, the optional parameters for inflate , OPTS , are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to inflate that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once inflate has completed.

      This parameter defaults to 0.

    • BinModeOut => 0|1

      When writing to a file or filehandle, set binmode before writing to the file.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all uncompressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any uncompressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any uncompressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any uncompressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all uncompressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any uncompressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all uncompressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any uncompressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any uncompressed data is output.

      Defaults to 0.

    • MultiStream => 0|1

      If the input file/buffer contains multiple compressed data streams, this option will uncompress the whole lot as a single data stream.

      Defaults to 0.

    • TrailingData => $scalar

      Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete.

      This option can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

      If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

      If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

      Don't bother using trailingData if the input is a filename.

      If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option.

    Examples

    To read the contents of the file file1.txt.1950 and write the uncompressed data to the file file1.txt:

        use strict;
        use warnings;
        use IO::Uncompress::Inflate qw(inflate $InflateError);

        my $input  = "file1.txt.1950";
        my $output = "file1.txt";
        inflate $input => $output
            or die "inflate failed: $InflateError\n";

    To read from an existing Perl filehandle, $input, and write the uncompressed data to a buffer, $buffer:

        use strict;
        use warnings;
        use IO::Uncompress::Inflate qw(inflate $InflateError);
        use IO::File;

        my $input = new IO::File "<file1.txt.1950"
            or die "Cannot open 'file1.txt.1950': $!\n";
        my $buffer;
        inflate $input => \$buffer
            or die "inflate failed: $InflateError\n";

    To uncompress all files in the directory "/my/home" that match "*.txt.1950" and store the uncompressed data in the same directory

        use strict;
        use warnings;
        use IO::Uncompress::Inflate qw(inflate $InflateError);

        inflate '</my/home/*.txt.1950>' => '</my/home/#1.txt>'
            or die "inflate failed: $InflateError\n";

    and if you want to uncompress each file one at a time, this will do the trick

        use strict;
        use warnings;
        use IO::Uncompress::Inflate qw(inflate $InflateError);

        for my $input ( glob "/my/home/*.txt.1950" )
        {
            my $output = $input;
            $output =~ s/\.1950$//;   # strip the trailing ".1950" suffix
            inflate $input => $output
                or die "Error uncompressing '$input': $InflateError\n";
        }

    OO Interface

    Constructor

    The format of the constructor for IO::Uncompress::Inflate is shown below

        my $z = new IO::Uncompress::Inflate $input [OPTS]
            or die "IO::Uncompress::Inflate failed: $InflateError\n";

    Returns an IO::Uncompress::Inflate object on success and undef on failure. The variable $InflateError will contain an error message on failure.

    If you are running Perl 5.005 or better the object, $z , returned from IO::Uncompress::Inflate can be used exactly like an IO::File filehandle. This means that all normal input file operations can be carried out with $z . For example, to read a line from a compressed file/buffer you can use either of these forms

        $line = $z->getline();
        $line = <$z>;

    The mandatory parameter $input is used to determine the source of the compressed data. This parameter can take one of three forms.

    • A filename

      If the $input parameter is a scalar, it is assumed to be a filename. This file will be opened for reading and the compressed data will be read from it.

    • A filehandle

      If the $input parameter is a filehandle, the compressed data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input is a scalar reference, the compressed data will be read from $$input .
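
    A minimal sketch of the scalar reference form: the compressed data lives in a buffer and is read back line by line through the object. The sample text and variable names are illustrative assumptions; IO::Compress::Deflate is used only to create the input buffer.

```perl
use strict;
use warnings;
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw($InflateError);

# Prepare an in-memory RFC 1950 buffer (illustrative data only).
my $text = "line one\nline two\nline three\n";
my $compressed;
deflate \$text => \$compressed or die "deflate failed: $DeflateError\n";

# Passing a scalar reference makes the constructor read from $compressed.
my $z = IO::Uncompress::Inflate->new(\$compressed)
    or die "IO::Uncompress::Inflate failed: $InflateError\n";

# Read the uncompressed data one line at a time.
my @lines;
while (defined(my $line = $z->getline())) {
    push @lines, $line;
}
$z->close();

print scalar(@lines), " lines read\n";
```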

    Constructor Options

    The option names defined below are case insensitive and can be optionally prefixed by a '-'. So all of the following are valid

        -AutoClose
        -autoclose
        AUTOCLOSE
        autoclose

    OPTS is a combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $input parameter is a filehandle. If specified, and the value is true, it will result in the file being closed once either the close method is called or the IO::Uncompress::Inflate object is destroyed.

      This parameter defaults to 0.

    • MultiStream => 0|1

      Allows multiple concatenated compressed streams to be treated as a single compressed stream. Decompression will stop once either the end of the file/buffer is reached, an error is encountered (premature eof, corrupt compressed data) or the end of a stream is not immediately followed by the start of another stream.

      This parameter defaults to 0.

    • Prime => $string

      This option will uncompress the contents of $string before processing the input file/buffer.

      This option can be useful when the compressed data is embedded in another file/data structure and it is not possible to work out where the compressed data begins without having to read the first few bytes. If this is the case, the uncompression can be primed with these bytes using this option.

    • Transparent => 0|1

      If this option is set and the input file/buffer is not compressed data, the module will allow reading of it anyway.

      In addition, if the input file/buffer does contain compressed data and there is non-compressed data immediately following it, setting this option will make this module treat the whole file/buffer as a single data stream.

      This option defaults to 1.

    • BlockSize => $num

      When reading the compressed input data, IO::Uncompress::Inflate will read it in blocks of $num bytes.

      This option defaults to 4096.

    • InputLength => $size

      When present this option will limit the number of compressed bytes read from the input file/buffer to $size . This option can be used in the situation where there is useful data directly after the compressed data stream and you know beforehand the exact length of the compressed data stream.

      This option is mostly used when reading from a filehandle, in which case the file pointer will be left pointing to the first byte directly after the compressed data stream.

      This option defaults to off.

    • Append => 0|1

      This option controls what the read method does with uncompressed data.

      If set to 1, all uncompressed data will be appended to the output parameter of the read method.

      If set to 0, the contents of the output parameter of the read method will be overwritten by the uncompressed data.

      Defaults to 0.
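
      With Append enabled, a read loop can accumulate the whole uncompressed stream in one scalar. The following is a sketch under that assumption; the sample data and variable names are illustrative, and IO::Compress::Deflate is used only to create the input.

```perl
use strict;
use warnings;
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw($InflateError);

# Prepare some compressed input (illustrative data only).
my $text = join '', map { "record $_\n" } 1 .. 500;
my $compressed;
deflate \$text => \$compressed or die "deflate failed: $DeflateError\n";

# Append => 1 makes each read add to $output instead of overwriting it.
my $z = IO::Uncompress::Inflate->new(\$compressed, Append => 1)
    or die "IO::Uncompress::Inflate failed: $InflateError\n";

my $output = '';
my $status;
while (($status = $z->read($output)) > 0) {
    # nothing to do here: $output grows on every successful read
}
die "read failed: $InflateError\n" if $status < 0;
$z->close();

print length($output), " bytes uncompressed\n";
```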

    • Strict => 0|1

      This option controls whether the extra checks defined below are used when carrying out the decompression. When Strict is on, the extra tests are carried out, when Strict is off they are not.

      The default for this option is off.

      1. The ADLER32 checksum field must be present.

      2. The value of the ADLER32 field read must match the adler32 value of the uncompressed data actually contained in the file.

    Examples

    TODO

    Methods

    read

    Usage is

        $status = $z->read($buffer)

    Reads a block of compressed data (the size of the compressed block is determined by the BlockSize option in the constructor), uncompresses it and writes any uncompressed data into $buffer. If the Append parameter is set in the constructor, the uncompressed data will be appended to the $buffer parameter. Otherwise $buffer will be overwritten.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.

    read

    Usage is

        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $status = read($z, $buffer, $length)
        $status = read($z, $buffer, $length, $offset)

    Attempt to read $length bytes of uncompressed data into $buffer.

    The main difference between this form of the read method and the previous one is that this one will attempt to return exactly $length bytes. It will return fewer bytes only if end-of-file or an IO error is encountered.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.
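
    A short sketch of the fixed-length form of read, pulling the uncompressed data out in chunks of at most four bytes. The sample string, chunk size and variable names are assumptions for the illustration; IO::Compress::Deflate is used only to create the input.

```perl
use strict;
use warnings;
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw($InflateError);

# Prepare ten bytes of compressed input (illustrative data only).
my $text = "abcdefghij";
my $compressed;
deflate \$text => \$compressed or die "deflate failed: $DeflateError\n";

my $z = IO::Uncompress::Inflate->new(\$compressed)
    or die "IO::Uncompress::Inflate failed: $InflateError\n";

# Ask for 4 bytes at a time; the final read returns whatever is left.
my @chunks;
my $buffer;
my $status;
while (($status = $z->read($buffer, 4)) > 0) {
    push @chunks, $buffer;
}
die "read failed: $InflateError\n" if $status < 0;
$z->close();

print join('|', @chunks), "\n";
```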

    getline

    Usage is

        $line = $z->getline()
        $line = <$z>

    Reads a single line.

    This method fully supports the use of the variable $/ (or $INPUT_RECORD_SEPARATOR or $RS when English is in use) to determine what constitutes an end of line. Paragraph mode, record mode and file slurp mode are all supported.
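
    To illustrate how $/ affects getline, the sketch below sets the separator to ";" so that each call returns one semicolon-terminated record. The data and variable names are illustrative assumptions; IO::Compress::Deflate is used only to create the input.

```perl
use strict;
use warnings;
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw($InflateError);

# Prepare compressed input whose records are separated by ";".
my $text = "alpha;beta;gamma";
my $compressed;
deflate \$text => \$compressed or die "deflate failed: $DeflateError\n";

my $z = IO::Uncompress::Inflate->new(\$compressed)
    or die "IO::Uncompress::Inflate failed: $InflateError\n";

my @records;
{
    # Limit the change to $/ to this block.
    local $/ = ";";
    while (defined(my $record = $z->getline())) {
        push @records, $record;
    }
}
$z->close();

print scalar(@records), " records read\n";
```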

    getc

    Usage is

        $char = $z->getc()

    Read a single character.

    ungetc

    Usage is

        $char = $z->ungetc($string)

    inflateSync

    Usage is

        $status = $z->inflateSync()

    TODO

    getHeaderInfo

    Usage is

        $hdr = $z->getHeaderInfo();
        @hdrs = $z->getHeaderInfo();

    This method returns either a hash reference (in scalar context) or a list of hash references (in array context) that contains information about each of the header fields in the compressed data stream(s).

    tell

    Usage is

        $z->tell()
        tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

        $z->eof();
        eof($z);

    Returns true if the end of the compressed input stream has been reached.

    seek

        $z->seek($position, $whence);
        seek($z, $position, $whence);

    Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the input file/buffer. It is a fatal error to attempt to seek backward.

    Note that the implementation of seek in this module does not provide true random access to a compressed file/buffer. It works by uncompressing data from the current offset in the file/buffer until it reaches the uncompressed offset specified in the parameters to seek. For very small files this may be acceptable behaviour. For large files it may cause an unacceptable delay.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.
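
    A small sketch of a forward seek followed by tell and read. The sample string, offsets and variable names are assumptions for the illustration; the SEEK_SET constant is taken from the Fcntl module, and IO::Compress::Deflate is used only to create the input.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw($InflateError);

# Prepare compressed input (illustrative data only).
my $text = "hello world";
my $compressed;
deflate \$text => \$compressed or die "deflate failed: $DeflateError\n";

my $z = IO::Uncompress::Inflate->new(\$compressed)
    or die "IO::Uncompress::Inflate failed: $InflateError\n";

# Skip forward over the first six uncompressed bytes ("hello ").
$z->seek(6, SEEK_SET) or die "seek failed\n";
my $position = $z->tell();

# Read the next five uncompressed bytes from the new offset.
my $rest;
$z->read($rest, 5);
$z->close();

print "offset $position: $rest\n";
```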

    binmode

    Usage is

        $z->binmode
        binmode $z ;

    This is a no-op provided for completeness.

    opened

        $z->opened()

    Returns true if the object currently refers to an opened file/buffer.

    autoflush

        my $prev = $z->autoflush()
        my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

        $z->input_line_number()
        $z->input_line_number(EXPR)

    Returns the current uncompressed line number. If EXPR is present it has the effect of setting the line number. Note that setting the line number does not change the current position within the file/buffer being read.

    The contents of $/ are used to determine what constitutes a line terminator.

    fileno

        $z->fileno()
        fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

        $z->close() ;
        close $z ;

    Closes the input file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Uncompress::Inflate object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Uncompress::Inflate object was created, and the object is associated with a file, the underlying file will also be closed.

    nextStream

    Usage is

        my $status = $z->nextStream();

    Skips to the next compressed data stream in the input file/buffer. If a new compressed data stream is found, the eof marker will be cleared and $. will be reset to 0.

    Returns 1 if a new stream was found, 0 if none was found, and -1 if an error was encountered.
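
    The return values above suggest a loop of the following shape for walking through several concatenated streams one at a time. This is a sketch only: the data and variable names are illustrative, and IO::Compress::Deflate is used only to build the concatenated input.

```perl
use strict;
use warnings;
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw($InflateError);

# Build a buffer holding two compressed streams back to back.
my @plain = ("first\n", "second\n");
my $both  = '';
for my $item (@plain) {
    my $compressed;
    deflate \$item => \$compressed or die "deflate failed: $DeflateError\n";
    $both .= $compressed;
}

my $z = IO::Uncompress::Inflate->new(\$both)
    or die "IO::Uncompress::Inflate failed: $InflateError\n";

# Read each stream to its end, then move on to the next one.
my @streams;
my $status;
for ($status = 1; $status > 0; $status = $z->nextStream()) {
    my $data = '';
    my $buffer;
    while (($status = $z->read($buffer)) > 0) {
        $data .= $buffer;
    }
    last if $status < 0;
    push @streams, $data;
}
die "error reading streams: $InflateError\n" if $status < 0;
$z->close();

print scalar(@streams), " streams found\n";
```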

    trailingData

    Usage is

        my $data = $z->trailingData();

    Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete. It only makes sense to call this method once the end of the compressed data stream has been encountered.

    This method can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

    If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

    If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

    Don't bother using trailingData if the input is a filename.

    If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option in the constructor.
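
    A sketch of the buffer case: some uncompressed bytes are placed directly after a compressed stream and recovered with trailingData once the stream has been read to its end. The "TRAILER" marker, sample data and variable names are assumptions for the illustration; IO::Compress::Deflate is used only to create the input.

```perl
use strict;
use warnings;
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw($InflateError);

# Compressed stream followed directly by some uncompressed bytes.
my $text = "payload";
my $compressed;
deflate \$text => \$compressed or die "deflate failed: $DeflateError\n";
my $input = $compressed . "TRAILER";

my $z = IO::Uncompress::Inflate->new(\$input)
    or die "IO::Uncompress::Inflate failed: $InflateError\n";

# Read to the end of the compressed stream first.
my $data = '';
my $buffer;
my $status;
while (($status = $z->read($buffer)) > 0) {
    $data .= $buffer;
}
die "read failed: $InflateError\n" if $status < 0;

# Now collect whatever follows the compressed stream in the buffer.
my $trailer = $z->trailingData();
$z->close();

print "data: $data, trailer: $trailer\n";
```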

    Importing

    No symbolic constants are required by IO::Uncompress::Inflate at present.

    • :all

      Imports inflate and $InflateError. Same as doing this

          use IO::Uncompress::Inflate qw(inflate $InflateError);

    EXAMPLES

    Working with Net::FTP

    See IO::Compress::FAQ

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    IO::Uncompress::RawInflate

    NAME

    IO::Uncompress::RawInflate - Read RFC 1951 files/buffers

    SYNOPSIS

        use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError) ;

        my $status = rawinflate $input => $output [,OPTS]
            or die "rawinflate failed: $RawInflateError\n";

        my $z = new IO::Uncompress::RawInflate $input [OPTS]
            or die "rawinflate failed: $RawInflateError\n";

        $status = $z->read($buffer)
        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $line = $z->getline()
        $char = $z->getc()
        $char = $z->ungetc()
        $char = $z->opened()
        $status = $z->inflateSync()
        $data = $z->trailingData()
        $status = $z->nextStream()
        $data = $z->getHeaderInfo()
        $z->tell()
        $z->seek($position, $whence)
        $z->binmode()
        $z->fileno()
        $z->eof()
        $z->close()

        $RawInflateError ;

        # IO::File mode
        <$z>
        read($z, $buffer);
        read($z, $buffer, $length);
        read($z, $buffer, $length, $offset);
        tell($z)
        seek($z, $position, $whence)
        binmode($z)
        fileno($z)
        eof($z)
        close($z)

    DESCRIPTION

    This module provides a Perl interface that allows the reading of files/buffers that conform to RFC 1951.

    For writing RFC 1951 files/buffers, see the companion module IO::Compress::RawDeflate.

    Functional Interface

    A top-level function, rawinflate , is provided to carry out "one-shot" uncompression between buffers and/or files. For finer control over the uncompression process, see the OO Interface section.

        use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError) ;

        rawinflate $input_filename_or_reference => $output_filename_or_reference [,OPTS]
            or die "rawinflate failed: $RawInflateError\n";

    The functional interface needs Perl 5.005 or better.

    rawinflate $input => $output [, OPTS]

    rawinflate expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference .

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference , is used to define the source of the compressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference .

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is uncompressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">" rawinflate will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the uncompressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the uncompressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the uncompressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the uncompressed data will be stored in $$output_filename_or_reference .

    • An Array Reference

      If $output_filename_or_reference is an array reference, the uncompressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">" rawinflate will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.
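
    As a minimal sketch of the scalar reference forms of both parameters, the following round-trips a string through an RFC 1951 buffer. The sample text and variable names are illustrative assumptions; the companion module IO::Compress::RawDeflate is used only to create the compressed input.

```perl
use strict;
use warnings;
use IO::Compress::RawDeflate   qw(rawdeflate $RawDeflateError);
use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

# Create an RFC 1951 buffer to act as the input (illustrative data only).
my $plain = "some text to round-trip\n";
my $compressed;
rawdeflate \$plain => \$compressed
    or die "rawdeflate failed: $RawDeflateError\n";

# Scalar references on both sides: read from one buffer, write to another.
my $restored;
rawinflate \$compressed => \$restored
    or die "rawinflate failed: $RawInflateError\n";

print $restored;
```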

    Notes

    When $input_filename_or_reference maps to multiple compressed files/buffers and $output_filename_or_reference is a single file/buffer, after uncompression $output_filename_or_reference will contain a concatenation of all the uncompressed data from each of the input files/buffers.
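
    The concatenation described in this note can be sketched with an array reference of filenames as input and a single buffer as output. The temporary directory, file names and sample data are assumptions made for the illustration; IO::Compress::RawDeflate is used only to create the compressed files.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use IO::Compress::RawDeflate   qw(rawdeflate $RawDeflateError);
use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

# Write two small RFC 1951 files into a scratch directory.
my $dir   = tempdir(CLEANUP => 1);
my @parts = ("part one\n", "part two\n");
my @files;
for my $i (0 .. $#parts) {
    my $file = File::Spec->catfile($dir, "part$i.1951");
    rawdeflate \$parts[$i] => $file
        or die "rawdeflate failed: $RawDeflateError\n";
    push @files, $file;
}

# Several input files, one output buffer: the results are concatenated.
my $joined;
rawinflate \@files => \$joined
    or die "rawinflate failed: $RawInflateError\n";

print $joined;
```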

    Optional Parameters

    Unless specified below, the optional parameters for rawinflate , OPTS , are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to rawinflate that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once rawinflate has completed.

      This parameter defaults to 0.

    • BinModeOut => 0|1

      When writing to a file or filehandle, set binmode before writing to the file.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all uncompressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any uncompressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any uncompressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any uncompressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all uncompressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any uncompressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all uncompressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any uncompressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any uncompressed data is output.

      Defaults to 0.

    • MultiStream => 0|1

      This option is a no-op.

    • TrailingData => $scalar

      Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete.

      This option can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

      If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

      If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

      Don't bother using trailingData if the input is a filename.

      If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option.

    Examples

    To read the contents of the file file1.txt.1951 and write the uncompressed data to the file file1.txt:

        use strict;
        use warnings;
        use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

        my $input  = "file1.txt.1951";
        my $output = "file1.txt";
        rawinflate $input => $output
            or die "rawinflate failed: $RawInflateError\n";

    To read from an existing Perl filehandle, $input, and write the uncompressed data to a buffer, $buffer:

        use strict;
        use warnings;
        use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);
        use IO::File;

        my $input = new IO::File "<file1.txt.1951"
            or die "Cannot open 'file1.txt.1951': $!\n";
        my $buffer;
        rawinflate $input => \$buffer
            or die "rawinflate failed: $RawInflateError\n";

    To uncompress all files in the directory "/my/home" that match "*.txt.1951" and store the uncompressed data in the same directory

        use strict;
        use warnings;
        use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

        rawinflate '</my/home/*.txt.1951>' => '</my/home/#1.txt>'
            or die "rawinflate failed: $RawInflateError\n";

    and if you want to uncompress each file one at a time, this will do the trick

        use strict;
        use warnings;
        use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

        for my $input ( glob "/my/home/*.txt.1951" )
        {
            my $output = $input;
            $output =~ s/\.1951$//;   # strip the trailing ".1951" suffix
            rawinflate $input => $output
                or die "Error uncompressing '$input': $RawInflateError\n";
        }

    OO Interface

    Constructor

    The format of the constructor for IO::Uncompress::RawInflate is shown below

        my $z = new IO::Uncompress::RawInflate $input [OPTS]
            or die "IO::Uncompress::RawInflate failed: $RawInflateError\n";

    Returns an IO::Uncompress::RawInflate object on success and undef on failure. The variable $RawInflateError will contain an error message on failure.

    If you are running Perl 5.005 or better the object, $z , returned from IO::Uncompress::RawInflate can be used exactly like an IO::File filehandle. This means that all normal input file operations can be carried out with $z . For example, to read a line from a compressed file/buffer you can use either of these forms

        $line = $z->getline();
        $line = <$z>;

    The mandatory parameter $input is used to determine the source of the compressed data. This parameter can take one of three forms.

    • A filename

      If the $input parameter is a scalar, it is assumed to be a filename. This file will be opened for reading and the compressed data will be read from it.

    • A filehandle

      If the $input parameter is a filehandle, the compressed data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input is a scalar reference, the compressed data will be read from $$input .

    Constructor Options

    The option names defined below are case insensitive and can be optionally prefixed by a '-'. So all of the following are valid

        -AutoClose
        -autoclose
        AUTOCLOSE
        autoclose

    OPTS is a combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $input parameter is a filehandle. If specified, and the value is true, it will result in the file being closed once either the close method is called or the IO::Uncompress::RawInflate object is destroyed.

      This parameter defaults to 0.

    • MultiStream => 0|1

      Allows multiple concatenated compressed streams to be treated as a single compressed stream. Decompression will stop once either the end of the file/buffer is reached, an error is encountered (premature eof, corrupt compressed data) or the end of a stream is not immediately followed by the start of another stream.

      This parameter defaults to 0.

    • Prime => $string

      This option will uncompress the contents of $string before processing the input file/buffer.

      This option can be useful when the compressed data is embedded in another file/data structure and it is not possible to work out where the compressed data begins without having to read the first few bytes. If this is the case, the uncompression can be primed with these bytes using this option.

    • Transparent => 0|1

      If this option is set and the input file/buffer is not compressed data, the module will allow reading of it anyway.

      In addition, if the input file/buffer does contain compressed data and there is non-compressed data immediately following it, setting this option will make this module treat the whole file/buffer as a single data stream.

      This option defaults to 1.

    • BlockSize => $num

      When reading the compressed input data, IO::Uncompress::RawInflate will read it in blocks of $num bytes.

      This option defaults to 4096.

    • InputLength => $size

      When present this option will limit the number of compressed bytes read from the input file/buffer to $size . This option can be used in the situation where there is useful data directly after the compressed data stream and you know beforehand the exact length of the compressed data stream.

      This option is mostly used when reading from a filehandle, in which case the file pointer will be left pointing to the first byte directly after the compressed data stream.

      This option defaults to off.

    • Append => 0|1

      This option controls what the read method does with uncompressed data.

      If set to 1, all uncompressed data will be appended to the output parameter of the read method.

      If set to 0, the contents of the output parameter of the read method will be overwritten by the uncompressed data.

      Defaults to 0.

    • Strict => 0|1

      This option is a no-op.

    Examples

    TODO

    Methods

    read

    Usage is

        $status = $z->read($buffer)

    Reads a block of compressed data (the size of the compressed block is determined by the BlockSize option in the constructor), uncompresses it and writes any uncompressed data into $buffer. If the Append parameter is set in the constructor, the uncompressed data will be appended to the $buffer parameter. Otherwise $buffer will be overwritten.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.

    read

    Usage is

        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $status = read($z, $buffer, $length)
        $status = read($z, $buffer, $length, $offset)

    Attempt to read $length bytes of uncompressed data into $buffer.

    The main difference between this form of the read method and the previous one is that this one will attempt to return exactly $length bytes. It will return fewer bytes only if end-of-file or an IO error is encountered.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.

    getline

    Usage is

        $line = $z->getline()
        $line = <$z>

    Reads a single line.

    This method fully supports the use of the variable $/ (or $INPUT_RECORD_SEPARATOR or $RS when English is in use) to determine what constitutes an end of line. Paragraph mode, record mode and file slurp mode are all supported.

    getc

    Usage is

        $char = $z->getc()

    Reads a single character.

    ungetc

    Usage is

        $char = $z->ungetc($string)

    inflateSync

    Usage is

        $status = $z->inflateSync()

    TODO

    getHeaderInfo

    Usage is

        $hdr = $z->getHeaderInfo();
        @hdrs = $z->getHeaderInfo();

    This method returns either a hash reference (in scalar context) or a list of hash references (in array context) that contains information about each of the header fields in the compressed data stream(s).

    tell

    Usage is

        $z->tell()
        tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

        $z->eof();
        eof($z);

    Returns true if the end of the compressed input stream has been reached.

    seek

        $z->seek($position, $whence);
        seek($z, $position, $whence);

    Provides a subset of the seek functionality, with the restriction that it is only legal to seek forward in the input file/buffer. It is a fatal error to attempt to seek backward.

    Note that the implementation of seek in this module does not provide true random access to a compressed file/buffer. It works by uncompressing data from the current offset in the file/buffer until it reaches the uncompressed offset specified in the parameters to seek. For very small files this may be acceptable behaviour. For large files it may cause an unacceptable delay.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.
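    A forward seek can be sketched as follows, assuming fixed-width records so that the target offset is easy to compute. The record layout and variable names are made up for the example; SEEK_SET and SEEK_CUR come from the standard Fcntl module.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use IO::Compress::RawDeflate   qw(rawdeflate $RawDeflateError);
use IO::Uncompress::RawInflate qw($RawInflateError);

# 200 records of 5 bytes each: "0000\n" .. "0199\n".
my $text = join '', map { sprintf "%04d\n", $_ } 0 .. 199;
my $compressed;
rawdeflate \$text => \$compressed
    or die "rawdeflate failed: $RawDeflateError\n";

my $z = IO::Uncompress::RawInflate->new(\$compressed)
    or die "RawInflate failed: $RawInflateError\n";

# Skip the first 100 records. The module uncompresses and discards
# the intervening 500 bytes; it does not jump in the compressed data.
my $ok  = $z->seek(500, SEEK_SET);
my $pos = $z->tell();

# Read the record at that offset.
my $record;
my $got = $z->read($record, 5);

# A relative forward seek skips two more records.
$z->seek(10, SEEK_CUR);
my $pos2 = $z->tell();
$z->close();

print "offset after first seek: $pos, after second seek: $pos2\n";
```

    Because seeking backward is a fatal error, code that needs to revisit earlier data would have to reopen the stream and seek forward again.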

    binmode

    Usage is

        $z->binmode
        binmode $z ;

    This is a no-op provided for completeness.

    opened

        $z->opened()

    Returns true if the object currently refers to an open file/buffer.

    autoflush

        my $prev = $z->autoflush()
        my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

        $z->input_line_number()
        $z->input_line_number(EXPR)

    Returns the current uncompressed line number. If EXPR is present it has the effect of setting the line number. Note that setting the line number does not change the current position within the file/buffer being read.

    The contents of $/ are used to determine what constitutes a line terminator.

    fileno

        $z->fileno()
        fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

        $z->close() ;
        close $z ;

    Closes the input file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Uncompress::RawInflate object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Uncompress::RawInflate object was created, and the object is associated with a file, the underlying file will also be closed.

    nextStream

    Usage is

        my $status = $z->nextStream();

    Skips to the next compressed data stream in the input file/buffer. If a new compressed data stream is found, the eof marker will be cleared and $. will be reset to 0.

    Returns 1 if a new stream was found, 0 if none was found, and -1 if an error was encountered.

    trailingData

    Usage is

        my $data = $z->trailingData();

    Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete. It only makes sense to call this method once the end of the compressed data stream has been encountered.

    This method can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

    If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

    If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

    Don't bother using trailingData if the input is a filename.

    If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option in the constructor.
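    The buffer case described above might look like the following sketch. The container layout (a raw-deflate stream followed directly by plain text) and the variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use IO::Compress::RawDeflate   qw(rawdeflate $RawDeflateError);
use IO::Uncompress::RawInflate qw($RawInflateError);

# A compressed payload followed directly by uncompressed data.
my $payload = "compressed part\n";
my $compressed;
rawdeflate \$payload => \$compressed
    or die "rawdeflate failed: $RawDeflateError\n";
my $container = $compressed . "PLAIN TRAILER";

my $z = IO::Uncompress::RawInflate->new(\$container, Append => 1)
    or die "RawInflate failed: $RawInflateError\n";

# Read to the end of the compressed stream first; trailingData
# is only meaningful after that point.
my $out = '';
my $status;
1 while ($status = $z->read($out)) > 0;
die "read error: $RawInflateError\n" if $status < 0;

my $trailer = $z->trailingData();
$z->close();

print "payload: $out";
print "trailer: $trailer\n";
```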

    Importing

    No symbolic constants are required by IO::Uncompress::RawInflate at present.

    • :all

      Imports rawinflate and $RawInflateError. This is the same as:

          use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError) ;

    EXAMPLES

    Working with Net::FTP

    See IO::Compress::FAQ

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Uncompress::Unzip

    NAME

    IO::Uncompress::Unzip - Read zip files/buffers

    SYNOPSIS

        use IO::Uncompress::Unzip qw(unzip $UnzipError) ;

        my $status = unzip $input => $output [,OPTS]
            or die "unzip failed: $UnzipError\n";

        my $z = new IO::Uncompress::Unzip $input [OPTS]
            or die "unzip failed: $UnzipError\n";

        $status = $z->read($buffer)
        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $line = $z->getline()
        $char = $z->getc()
        $char = $z->ungetc()
        $char = $z->opened()

        $status = $z->inflateSync()

        $data = $z->trailingData()
        $status = $z->nextStream()
        $data = $z->getHeaderInfo()
        $z->tell()
        $z->seek($position, $whence)
        $z->binmode()
        $z->fileno()
        $z->eof()
        $z->close()

        $UnzipError ;

        # IO::File mode

        <$z>
        read($z, $buffer);
        read($z, $buffer, $length);
        read($z, $buffer, $length, $offset);
        tell($z)
        seek($z, $position, $whence)
        binmode($z)
        fileno($z)
        eof($z)
        close($z)
    DESCRIPTION

    This module provides a Perl interface that allows the reading of zip files/buffers.

    For writing zip files/buffers, see the companion module IO::Compress::Zip.

    Functional Interface

    A top-level function, unzip , is provided to carry out "one-shot" uncompression between buffers and/or files. For finer control over the uncompression process, see the OO Interface section.

        use IO::Uncompress::Unzip qw(unzip $UnzipError) ;

        unzip $input_filename_or_reference => $output_filename_or_reference [,OPTS]
            or die "unzip failed: $UnzipError\n";

    The functional interface needs Perl 5.005 or better.

    unzip $input => $output [, OPTS]

    unzip expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference .

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference , is used to define the source of the compressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference .

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is uncompressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">" unzip will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the uncompressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the uncompressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the uncompressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the uncompressed data will be stored in $$output_filename_or_reference .

    • An Array Reference

      If $output_filename_or_reference is an array reference, the uncompressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">" unzip will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.

    Notes

    When $input_filename_or_reference maps to multiple compressed files/buffers and $output_filename_or_reference is a single file/buffer, after uncompression $output_filename_or_reference will contain a concatenation of all the uncompressed data from each of the input files/buffers.

    Optional Parameters

    Unless specified below, the optional parameters for unzip , OPTS , are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to unzip that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once unzip has completed.

      This parameter defaults to 0.

    • BinModeOut => 0|1

      When writing to a file or filehandle, set binmode before writing to the file.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all uncompressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any uncompressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any uncompressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any uncompressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all uncompressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any uncompressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all uncompressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any uncompressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any uncompressed data is output.

      Defaults to 0.

    • MultiStream => 0|1

      If the input file/buffer contains multiple compressed data streams, this option will uncompress the whole lot as a single data stream.

      Defaults to 0.

    • TrailingData => $scalar

      Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete.

      This option can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

      If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

      If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

      Don't bother using trailingData if the input is a filename.

      If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option.
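    The effect of the MultiStream and Append options on the functional interface can be sketched as below. The example assumes the companion module IO::Compress::Zip is available to build a two-member archive in memory; the member names and contents are invented for illustration.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw($ZipError);
use IO::Uncompress::Unzip qw(unzip $UnzipError);

# Build a two-member zip archive in memory.
my $zipdata;
my $zip = IO::Compress::Zip->new(\$zipdata, Name => "first.txt")
    or die "zip failed: $ZipError\n";
$zip->print("first\n");
$zip->newStream(Name => "second.txt")
    or die "newStream failed: $ZipError\n";
$zip->print("second\n");
$zip->close();

# Default behaviour: only the first member is uncompressed.
my $one;
unzip \$zipdata => \$one
    or die "unzip failed: $UnzipError\n";

# MultiStream => 1: every member is uncompressed and concatenated.
my $all;
unzip \$zipdata => \$all, MultiStream => 1
    or die "unzip failed: $UnzipError\n";

# Append => 1: existing buffer contents are kept.
my $log = "header\n";
unzip \$zipdata => \$log, Append => 1
    or die "unzip failed: $UnzipError\n";

print "default:\n$one", "multistream:\n$all", "append:\n$log";
```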

    Examples

    If you have a zip file, file1.zip, that contains only a single member, you can read it and write the uncompressed data to the file file1.txt like this:

        use strict ;
        use warnings ;
        use IO::Uncompress::Unzip qw(unzip $UnzipError) ;

        my $input = "file1.zip";
        my $output = "file1.txt";
        unzip $input => $output
            or die "unzip failed: $UnzipError\n";

    If you have a zip file that contains multiple members and want to read a specific member from the file, say "data1", use the Name option:

        use strict ;
        use warnings ;
        use IO::Uncompress::Unzip qw(unzip $UnzipError) ;

        my $input = "file1.zip";
        my $output = "file1.txt";
        unzip $input => $output, Name => "data1"
            or die "unzip failed: $UnzipError\n";

    Alternatively, if you want to read the "data1" member into memory, use a scalar reference for the output parameter:

        use strict ;
        use warnings ;
        use IO::Uncompress::Unzip qw(unzip $UnzipError) ;

        my $input = "file1.zip";
        my $output ;
        unzip $input => \$output, Name => "data1"
            or die "unzip failed: $UnzipError\n";
        # $output now contains the uncompressed data

    To read from an existing Perl filehandle, $input, and write the uncompressed data to a buffer, $buffer:

        use strict ;
        use warnings ;
        use IO::Uncompress::Unzip qw(unzip $UnzipError) ;
        use IO::File ;

        my $input = new IO::File "<file1.zip"
            or die "Cannot open 'file1.zip': $!\n" ;
        my $buffer ;
        unzip $input => \$buffer
            or die "unzip failed: $UnzipError\n";
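    The examples above need an existing file1.zip. For experimenting without one, a self-contained variant can build the archive in memory first. This is only a sketch: it assumes IO::Compress::Zip is installed, and the member names "data1" and "data2" and their contents are placeholders.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw($ZipError);
use IO::Uncompress::Unzip qw(unzip $UnzipError);

# Create an in-memory archive with two members.
my $zipdata;
my $zip = IO::Compress::Zip->new(\$zipdata, Name => "data1")
    or die "zip failed: $ZipError\n";
$zip->print("contents of data1\n");
$zip->newStream(Name => "data2")
    or die "newStream failed: $ZipError\n";
$zip->print("contents of data2\n");
$zip->close();

# Pick the second member by name; scalar references are used
# for both the input and the output.
my $output;
unzip \$zipdata => \$output, Name => "data2"
    or die "unzip failed: $UnzipError\n";

print $output;
```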

    OO Interface

    Constructor

    The format of the constructor for IO::Uncompress::Unzip is shown below

        my $z = new IO::Uncompress::Unzip $input [OPTS]
            or die "IO::Uncompress::Unzip failed: $UnzipError\n";

    Returns an IO::Uncompress::Unzip object on success and undef on failure. The variable $UnzipError will contain an error message on failure.

    If you are running Perl 5.005 or better the object, $z , returned from IO::Uncompress::Unzip can be used exactly like an IO::File filehandle. This means that all normal input file operations can be carried out with $z . For example, to read a line from a compressed file/buffer you can use either of these forms

        $line = $z->getline();
        $line = <$z>;

    The mandatory parameter $input is used to determine the source of the compressed data. This parameter can take one of three forms.

    • A filename

      If the $input parameter is a scalar, it is assumed to be a filename. This file will be opened for reading and the compressed data will be read from it.

    • A filehandle

      If the $input parameter is a filehandle, the compressed data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input is a scalar reference, the compressed data will be read from $$input .

    Constructor Options

    The option names defined below are case insensitive and can be optionally prefixed by a '-'. So all of the following are valid

        -AutoClose
        -autoclose
        AUTOCLOSE
        autoclose

    OPTS is a combination of the following options:

    • Name => "membername"

      Open "membername" from the zip file for reading.

    • AutoClose => 0|1

      This option is only valid when the $input parameter is a filehandle. If specified, and the value is true, it will result in the file being closed once either the close method is called or the IO::Uncompress::Unzip object is destroyed.

      This parameter defaults to 0.

    • MultiStream => 0|1

      Treats the complete zip file/buffer as a single compressed data stream. When reading in multi-stream mode each member of the zip file/buffer will be uncompressed in turn until the end of the file/buffer is encountered.

      This parameter defaults to 0.

    • Prime => $string

      This option will uncompress the contents of $string before processing the input file/buffer.

      This option can be useful when the compressed data is embedded in another file/data structure and it is not possible to work out where the compressed data begins without having to read the first few bytes. If this is the case, the uncompression can be primed with these bytes using this option.

    • Transparent => 0|1

      If this option is set and the input file/buffer is not compressed data, the module will allow reading of it anyway.

      In addition, if the input file/buffer does contain compressed data and there is non-compressed data immediately following it, setting this option will make this module treat the whole file/buffer as a single data stream.

      This option defaults to 1.

    • BlockSize => $num

      When reading the compressed input data, IO::Uncompress::Unzip will read it in blocks of $num bytes.

      This option defaults to 4096.

    • InputLength => $size

      When present this option will limit the number of compressed bytes read from the input file/buffer to $size . This option can be used in the situation where there is useful data directly after the compressed data stream and you know beforehand the exact length of the compressed data stream.

      This option is mostly used when reading from a filehandle, in which case the file pointer will be left pointing to the first byte directly after the compressed data stream.

      This option defaults to off.

    • Append => 0|1

      This option controls what the read method does with uncompressed data.

      If set to 1, all uncompressed data will be appended to the output parameter of the read method.

      If set to 0, the contents of the output parameter of the read method will be overwritten by the uncompressed data.

      Defaults to 0.

    • Strict => 0|1

      This option controls whether the extra checks defined below are used when carrying out the decompression. When Strict is on, the extra tests are carried out, when Strict is off they are not.

      The default for this option is off.
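    Several of the constructor options above can be combined in one call. The sketch below assumes an in-memory archive built with IO::Compress::Zip; the member names are invented, and the '-' prefix on BlockSize is used only to show that prefixed option names are accepted.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw($ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# Build a small two-member archive in memory.
my $zipdata;
my $zip = IO::Compress::Zip->new(\$zipdata, Name => "notes.txt")
    or die "zip failed: $ZipError\n";
$zip->print("line one\nline two\n");
$zip->newStream(Name => "other.txt")
    or die "newStream failed: $ZipError\n";
$zip->print("not wanted\n");
$zip->close();

# Name selects the member, Append makes read() accumulate data,
# and -BlockSize sets how much compressed input is read at a time.
my $z = IO::Uncompress::Unzip->new(
    \$zipdata,
    Name       => "notes.txt",
    Append     => 1,
    -BlockSize => 16 * 1024,
) or die "IO::Uncompress::Unzip failed: $UnzipError\n";

my $content = '';
my $status;
1 while ($status = $z->read($content)) > 0;
die "read error: $UnzipError\n" if $status < 0;
$z->close();

print $content;
```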

    Examples

    TODO

    Methods

    read

    Usage is

        $status = $z->read($buffer)

    Reads a block of compressed data (the size of the compressed block is determined by the BlockSize option in the constructor), uncompresses it and writes any uncompressed data into $buffer. If the Append option is set in the constructor, the uncompressed data will be appended to $buffer. Otherwise $buffer will be overwritten.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.

    read

    Usage is

        $status = $z->read($buffer, $length)
        $status = $z->read($buffer, $length, $offset)
        $status = read($z, $buffer, $length)
        $status = read($z, $buffer, $length, $offset)

    Attempts to read $length bytes of uncompressed data into $buffer.

    The main difference between this form of the read method and the previous one is that this one will attempt to return exactly $length bytes. It returns fewer bytes only if end-of-file or an I/O error is encountered.

    Returns the number of uncompressed bytes written to $buffer , zero if eof or a negative number on error.

    getline

    Usage is

        $line = $z->getline()
        $line = <$z>

    Reads a single line.

    This method fully supports the use of the variable $/ (or $INPUT_RECORD_SEPARATOR or $RS when English is in use) to determine what constitutes an end of line. Paragraph mode, record mode and file slurp mode are all supported.

    getc

    Usage is

        $char = $z->getc()

    Reads a single character.

    ungetc

    Usage is

        $char = $z->ungetc($string)

    inflateSync

    Usage is

        $status = $z->inflateSync()

    TODO

    getHeaderInfo

    Usage is

        $hdr = $z->getHeaderInfo();
        @hdrs = $z->getHeaderInfo();

    This method returns either a hash reference (in scalar context) or a list of hash references (in array context) that contains information about each of the header fields in the compressed data stream(s).
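    A minimal sketch of inspecting the header in scalar context follows. Only the Name field is relied on here, since it is the one used elsewhere in this document; the remaining keys are simply listed rather than assumed. The archive is built in memory with IO::Compress::Zip and the member name is a placeholder.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# One-member archive built in memory.
my $data = "some report text\n";
my $zipdata;
zip \$data => \$zipdata, Name => "report.txt"
    or die "zip failed: $ZipError\n";

my $z = IO::Uncompress::Unzip->new(\$zipdata)
    or die "IO::Uncompress::Unzip failed: $UnzipError\n";

# Scalar context: a hash reference describing the current member.
my $hdr = $z->getHeaderInfo();
print "Member name: $hdr->{Name}\n";
print "Header fields: ", join(", ", sort keys %$hdr), "\n";
$z->close();
```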

    tell

    Usage is

        $z->tell()
        tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

        $z->eof();
        eof($z);

    Returns true if the end of the compressed input stream has been reached.

    seek

        $z->seek($position, $whence);
        seek($z, $position, $whence);

    Provides a subset of the seek functionality, with the restriction that it is only legal to seek forward in the input file/buffer. It is a fatal error to attempt to seek backward.

    Note that the implementation of seek in this module does not provide true random access to a compressed file/buffer. It works by uncompressing data from the current offset in the file/buffer until it reaches the uncompressed offset specified in the parameters to seek. For very small files this may be acceptable behaviour. For large files it may cause an unacceptable delay.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.

    binmode

    Usage is

        $z->binmode
        binmode $z ;

    This is a no-op provided for completeness.

    opened

        $z->opened()

    Returns true if the object currently refers to an open file/buffer.

    autoflush

        my $prev = $z->autoflush()
        my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

        $z->input_line_number()
        $z->input_line_number(EXPR)

    Returns the current uncompressed line number. If EXPR is present it has the effect of setting the line number. Note that setting the line number does not change the current position within the file/buffer being read.

    The contents of $/ are used to determine what constitutes a line terminator.

    fileno

        $z->fileno()
        fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

        $z->close() ;
        close $z ;

    Closes the input file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Uncompress::Unzip object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Uncompress::Unzip object was created, and the object is associated with a file, the underlying file will also be closed.

    nextStream

    Usage is

        my $status = $z->nextStream();

    Skips to the next compressed data stream in the input file/buffer. If a new compressed data stream is found, the eof marker will be cleared and $. will be reset to 0.

    Returns 1 if a new stream was found, 0 if none was found, and -1 if an error was encountered.

    trailingData

    Usage is

        my $data = $z->trailingData();

    Returns the data, if any, that is present immediately after the compressed data stream once uncompression is complete. It only makes sense to call this method once the end of the compressed data stream has been encountered.

    This method can be used when there is useful information immediately following the compressed data stream, and you don't know the length of the compressed data stream.

    If the input is a buffer, trailingData will return everything from the end of the compressed data stream to the end of the buffer.

    If the input is a filehandle, trailingData will return the data that is left in the filehandle input buffer once the end of the compressed data stream has been reached. You can then use the filehandle to read the rest of the input file.

    Don't bother using trailingData if the input is a filename.

    If you know the length of the compressed data stream before you start uncompressing, you can avoid having to use trailingData by setting the InputLength option in the constructor.

    Importing

    No symbolic constants are required by IO::Uncompress::Unzip at present.

    • :all

      Imports unzip and $UnzipError. This is the same as:

          use IO::Uncompress::Unzip qw(unzip $UnzipError) ;

    EXAMPLES

    Working with Net::FTP

    See IO::Compress::FAQ

    Walking through a zip file

    The code below can be used to traverse a zip file, one compressed data stream at a time.

        use IO::Uncompress::Unzip qw($UnzipError);

        my $zipfile = "somefile.zip";
        my $u = new IO::Uncompress::Unzip $zipfile
            or die "Cannot open $zipfile: $UnzipError";

        my $status;
        for ($status = 1; $status > 0; $status = $u->nextStream())
        {
            my $name = $u->getHeaderInfo()->{Name};
            warn "Processing member $name\n" ;

            my $buff;
            while (($status = $u->read($buff)) > 0) {
                # Do something here
            }

            last if $status < 0;
        }

        die "Error processing $zipfile: $!\n"
            if $status < 0 ;

    Each individual compressed data stream is read until the logical end-of-file is reached. Then nextStream is called. This will skip to the start of the next compressed data stream and clear the end-of-file flag.

    It is also worth noting that nextStream can be called at any time -- you don't have to wait until you have exhausted a compressed data stream before skipping to the next one.
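    The same traversal can be tried without a zip file on disk. The following sketch builds a two-member archive in memory with IO::Compress::Zip (an assumption of this example, as are the member names) and collects each member's contents in a hash keyed by name.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw($ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# Build an in-memory archive with two members.
my $zipdata;
my $zip = IO::Compress::Zip->new(\$zipdata, Name => "a.txt")
    or die "zip failed: $ZipError\n";
$zip->print("aaa\n");
$zip->newStream(Name => "b.txt")
    or die "newStream failed: $ZipError\n";
$zip->print("bbb\n");
$zip->close();

my $u = IO::Uncompress::Unzip->new(\$zipdata)
    or die "Cannot open archive: $UnzipError\n";

# Walk the archive one member at a time.
my %members;
my $status;
for ($status = 1; $status > 0; $status = $u->nextStream()) {
    my $name    = $u->getHeaderInfo()->{Name};
    my $content = '';
    my $buff;
    while (($status = $u->read($buff)) > 0) {
        $content .= $buff;
    }
    last if $status < 0;
    $members{$name} = $content;
}
die "Error processing archive: $UnzipError\n" if $status < 0;
$u->close();

print "$_: $members{$_}" for sort keys %members;
```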

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Socket::INET

    NAME

    IO::Socket::INET - Object interface for AF_INET domain sockets

    SYNOPSIS

        use IO::Socket::INET;

    DESCRIPTION

    IO::Socket::INET provides an object interface to creating and using sockets in the AF_INET domain. It is built upon the IO::Socket interface and inherits all the methods defined by IO::Socket.

    CONSTRUCTOR

    • new ( [ARGS] )

      Creates an IO::Socket::INET object, which is a reference to a newly created symbol (see the Symbol package). new optionally takes arguments, given as key-value pairs.

      In addition to the key-value pairs accepted by IO::Socket, IO::Socket::INET provides the following:

          PeerAddr    Remote host address              <hostname>[:<port>]
          PeerHost    Synonym for PeerAddr
          PeerPort    Remote port or service           <service>[(<no>)] | <no>
          LocalAddr   Local host bind address          hostname[:port]
          LocalHost   Synonym for LocalAddr
          LocalPort   Local host bind port             <service>[(<no>)] | <no>
          Proto       Protocol name (or number)        "tcp" | "udp" | ...
          Type        Socket type                      SOCK_STREAM | SOCK_DGRAM | ...
          Listen      Queue size for listen
          ReuseAddr   Set SO_REUSEADDR before binding
          Reuse       Set SO_REUSEADDR before binding (deprecated, prefer ReuseAddr)
          ReusePort   Set SO_REUSEPORT before binding
          Broadcast   Set SO_BROADCAST before binding
          Timeout     Timeout value for various operations
          MultiHomed  Try all addresses for multi-homed hosts
          Blocking    Determine if connection will be blocking mode

      If Listen is defined then a listen socket is created, else if the socket type, which is derived from the protocol, is SOCK_STREAM then connect() is called.

      Although it is not illegal, the use of MultiHomed on a socket which is in non-blocking mode is of little use. This is because the first connect will never fail with a timeout as the connect call will not block.

      The PeerAddr can be a hostname or an IP address in the "xx.xx.xx.xx" form. The PeerPort can be a number or a symbolic service name. The service name may be followed by a number in parentheses, which is used if the service is not known by the system. The PeerPort specification can also be embedded in the PeerAddr by preceding it with a ":".

      If Proto is not given and you specify a symbolic PeerPort port, then the constructor will try to derive Proto from the service name. As a last resort Proto "tcp" is assumed. The Type parameter will be deduced from Proto if not specified.

      If the constructor is only passed a single argument, it is assumed to be a PeerAddr specification.

      If Blocking is set to 0, the connection will be in nonblocking mode. If not specified it defaults to 1 (blocking mode).

      Examples:

          $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.org',
                                        PeerPort => 'http(80)',
                                        Proto    => 'tcp');

          $sock = IO::Socket::INET->new(PeerAddr => 'localhost:smtp(25)');

          $sock = IO::Socket::INET->new(Listen    => 5,
                                        LocalAddr => 'localhost',
                                        LocalPort => 9000,
                                        Proto     => 'tcp');

          $sock = IO::Socket::INET->new('127.0.0.1:25');

          # inet_ntoa and INADDR_BROADCAST come from the Socket module
          $sock = IO::Socket::INET->new(
                                  PeerPort  => 9999,
                                  PeerAddr  => inet_ntoa(INADDR_BROADCAST),
                                  Proto     => 'udp',
                                  LocalAddr => 'localhost',
                                  Broadcast => 1 )
                              or die "Can't bind : $@\n";

      NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE

      As of VERSION 1.18 all IO::Socket objects have autoflush turned on by default. This was not the case with earlier releases.

      NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE

    METHODS

    • sockaddr ()

      Return the address part of the sockaddr structure for the socket

    • sockport ()

      Return the port number that the socket is using on the local host

    • sockhost ()

      Return the address part of the sockaddr structure for the socket in a text form xx.xx.xx.xx

    • peeraddr ()

      Return the address part of the sockaddr structure for the socket on the peer host

    • peerport ()

      Return the port number for the socket on the peer host.

    • peerhost ()

      Return the address part of the sockaddr structure for the socket on the peer host in a text form xx.xx.xx.xx
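    The constructor arguments and the accessor methods above can be exercised together on the loopback interface. The following is a minimal sketch, not taken from the module's own documentation: it assumes that passing LocalPort => 0 lets the operating system pick a free port, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listening socket on the loopback interface. LocalPort => 0 asks the
# operating system to choose a free port (an assumption of this sketch).
my $server = IO::Socket::INET->new(
    Listen    => 5,
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    ReuseAddr => 1,
) or die "Cannot create listening socket: $@\n";

my $port = $server->sockport();

# The client uses the single-argument form, which is treated as PeerAddr.
my $client = IO::Socket::INET->new("127.0.0.1:$port")
    or die "Cannot connect to 127.0.0.1:$port: $@\n";

# accept() is inherited from IO::Socket.
my $conn = $server->accept()
    or die "accept failed: $!\n";

# Record what each end reports before closing anything.
my ($peer_host, $peer_port) = ($client->peerhost(), $client->peerport());
my $server_host             = $server->sockhost();

printf "server listens on   %s:%d\n", $server_host, $port;
printf "client connected to %s:%d\n", $peer_host, $peer_port;
printf "server sees peer    %s:%d\n", $conn->peerhost(), $conn->peerport();

close $_ for $conn, $client, $server;
```

    The port reported by peerport() on the client should match the port reported by sockport() on the listening socket, since both describe the same endpoint.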

    SEE ALSO

    Socket, IO::Socket

    AUTHOR

    Graham Barr. Currently maintained by the Perl Porters. Please report all bugs to <perlbug@perl.org>.

    COPYRIGHT

    Copyright (c) 1996-8 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Socket::UNIX

    NAME

    IO::Socket::UNIX - Object interface for AF_UNIX domain sockets

    SYNOPSIS

        use IO::Socket::UNIX;

    DESCRIPTION

    IO::Socket::UNIX provides an object interface to creating and using sockets in the AF_UNIX domain. It is built upon the IO::Socket interface and inherits all the methods defined by IO::Socket.

    CONSTRUCTOR

    • new ( [ARGS] )

      Creates an IO::Socket::UNIX object, which is a reference to a newly created symbol (see the Symbol package). new optionally takes arguments, given as key-value pairs.

      In addition to the key-value pairs accepted by IO::Socket, IO::Socket::UNIX provides the following:

          Type    Type of socket (e.g. SOCK_STREAM or SOCK_DGRAM)
          Local   Path to local fifo
          Peer    Path to peer fifo
          Listen  Create a listen socket

      If the constructor is only passed a single argument, it is assumed to be a Peer specification.

      NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE

      As of VERSION 1.18 all IO::Socket objects have autoflush turned on by default. This was not the case with earlier releases.

      NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE NOTE

    METHODS

    • hostpath()

      Returns the pathname to the fifo at the local end

    • peerpath()

      Returns the pathname to the fifo at the peer end
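    A short sketch of how the constructor arguments and the two path methods fit together is given here. It assumes a Unix-like system, and the socket path inside a temporary directory is a hypothetical name chosen for illustration.

```perl
use strict;
use warnings;
use IO::Socket::UNIX;
use Socket qw(SOCK_STREAM);
use File::Temp qw(tempdir);

# Hypothetical socket path inside a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.sock";

# Bind to the path and listen for connections.
my $server = IO::Socket::UNIX->new(
    Type   => SOCK_STREAM,
    Local  => $path,
    Listen => 1,
) or die "Cannot listen on $path: $!\n";

# Single-argument form: the argument is treated as the Peer path.
my $client = IO::Socket::UNIX->new($path)
    or die "Cannot connect to $path: $!\n";

my $server_path = $server->hostpath();   # path bound at the local end
my $client_peer = $client->peerpath();   # path of the peer end

print "server bound to  $server_path\n";
print "client's peer is $client_peer\n";

close $_ for $client, $server;
```

    Both methods should report the same path in this arrangement, because the client's peer is the socket the server bound.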

    SEE ALSO

    Socket, IO::Socket

    AUTHOR

    Graham Barr. Currently maintained by the Perl Porters. Please report all bugs to <perlbug@perl.org>.

    COPYRIGHT

    Copyright (c) 1996-8 Graham Barr <gbarr@pobox.com>. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Compress::Base

    NAME

    IO::Compress::Base - Base Class for IO::Compress modules

    SYNOPSIS

        use IO::Compress::Base ;

    DESCRIPTION

    This module is not intended for direct use in application code. Its sole purpose is to be sub-classed by IO::Compress modules.

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Compress::Bzip2

    NAME

    IO::Compress::Bzip2 - Write bzip2 files/buffers

    SYNOPSIS

        use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error) ;

        my $status = bzip2 $input => $output [,OPTS]
            or die "bzip2 failed: $Bzip2Error\n";

        my $z = new IO::Compress::Bzip2 $output [,OPTS]
            or die "bzip2 failed: $Bzip2Error\n";

        $z->print($string);
        $z->printf($format, $string);
        $z->write($string);
        $z->syswrite($string [, $length, $offset]);
        $z->flush();
        $z->tell();
        $z->eof();
        $z->seek($position, $whence);
        $z->binmode();
        $z->fileno();
        $z->opened();
        $z->autoflush();
        $z->input_line_number();
        $z->newStream( [OPTS] );
        $z->close() ;

        $Bzip2Error ;

        # IO::File mode

        print $z $string;
        printf $z $format, $string;
        tell $z
        eof $z
        seek $z, $position, $whence
        binmode $z
        fileno $z
        close $z ;

    DESCRIPTION

    This module provides a Perl interface that allows writing bzip2 compressed data to files or buffers.

    For reading bzip2 files/buffers, see the companion module IO::Uncompress::Bunzip2.

    Functional Interface

    A top-level function, bzip2, is provided to carry out "one-shot" compression between buffers and/or files. For finer control over the compression process, see the OO Interface section.

        use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error) ;

        bzip2 $input => $output [,OPTS]
            or die "bzip2 failed: $Bzip2Error\n";

    The functional interface needs Perl 5.005 or better.

    bzip2 $input => $output [, OPTS]

    bzip2 expects at least two parameters, $input and $output.

    The $input parameter

    The parameter, $input, is used to define the source of the uncompressed data.

    It can take one of the following forms:

    • A filename

      If the $input parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input is a scalar reference, the input data will be read from $$input.

    • An array reference

      If $input is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is compressed.

    • An Input FileGlob string

      If $input is a string that is delimited by the characters "<" and ">", bzip2 will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      If the fileglob does not match any files ...

      See File::GlobMapper for more details.

    If the $input parameter is any other type, undef will be returned.

    The $output parameter

    The parameter $output is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output is a scalar reference, the compressed data will be stored in $$output.

    • An Array Reference

      If $output is an array reference, the compressed data will be pushed onto the array.

    • An Output FileGlob

      If $output is a string that is delimited by the characters "<" and ">", bzip2 will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output is a fileglob string, $input must also be a fileglob string. Anything else is an error.

    If the $output parameter is any other type, undef will be returned.

    Notes

    When $input maps to multiple files/buffers and $output is a single file/buffer the input files/buffers will be stored in $output as a concatenated series of compressed data streams.
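    The many-to-one case described in this note can be sketched as follows. The file names are hypothetical and are created in a temporary directory so the example is self-contained. Reading the result back relies on the MultiStream option of the companion module IO::Uncompress::Bunzip2, which is not described in this section.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Bzip2     qw(bzip2 $Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

# Two small input files in a temporary directory (hypothetical names).
my $dir   = tempdir(CLEANUP => 1);
my @files = ("$dir/a.txt", "$dir/b.txt");
my %text  = ($files[0] => "alpha\n", $files[1] => "beta\n");
for my $file (@files) {
    open my $fh, '>', $file or die "Cannot write '$file': $!\n";
    print {$fh} $text{$file};
    close $fh or die "Cannot close '$file': $!\n";
}

# Array reference in, scalar reference out: the compressed data for
# every input file ends up in the single buffer $compressed.
my $compressed;
bzip2 \@files => \$compressed
    or die "bzip2 failed: $Bzip2Error\n";

# MultiStream => 1 asks bunzip2 to read every compressed stream in the
# buffer rather than stopping after the first one.
my $restored;
bunzip2 \$compressed => \$restored, MultiStream => 1
    or die "bunzip2 failed: $Bunzip2Error\n";

print $restored;   # contents of a.txt followed by contents of b.txt
```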

    Optional Parameters

    Unless specified below, the optional parameters for bzip2, OPTS, are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to bzip2 that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once bzip2 has completed.

      This parameter defaults to 0.

    • BinModeIn => 0|1

      When reading from a file or filehandle, set binmode before reading.

      Defaults to 0.

    • Append => 0|1

      TODO

    Examples

    To read the contents of the file file1.txt and write the compressed data to the file file1.txt.bz2:

        use strict ;
        use warnings ;
        use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error) ;

        my $input = "file1.txt";
        bzip2 $input => "$input.bz2"
            or die "bzip2 failed: $Bzip2Error\n";

    To read from an existing Perl filehandle, $input, and write the compressed data to a buffer, $buffer:

        use strict ;
        use warnings ;
        use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error) ;
        use IO::File ;

        my $input = new IO::File "<file1.txt"
            or die "Cannot open 'file1.txt': $!\n" ;
        my $buffer ;
        bzip2 $input => \$buffer
            or die "bzip2 failed: $Bzip2Error\n";

    To compress all files in the directory "/my/home" that match "*.txt" and store the compressed data in the same directory:

        use strict ;
        use warnings ;
        use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error) ;

        bzip2 '</my/home/*.txt>' => '<*.bz2>'
            or die "bzip2 failed: $Bzip2Error\n";

    To compress each file one at a time instead, loop over the matching files:

        use strict ;
        use warnings ;
        use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error) ;

        for my $input ( glob "/my/home/*.txt" )
        {
            my $output = "$input.bz2" ;
            bzip2 $input => $output
                or die "Error compressing '$input': $Bzip2Error\n";
        }

    OO Interface

    Constructor

    The format of the constructor for IO::Compress::Bzip2 is shown below:

        my $z = new IO::Compress::Bzip2 $output [,OPTS]
            or die "IO::Compress::Bzip2 failed: $Bzip2Error\n";

    It returns an IO::Compress::Bzip2 object on success and undef on failure. The variable $Bzip2Error will contain an error message on failure.

    If you are running Perl 5.005 or better, the object, $z, returned from IO::Compress::Bzip2 can be used exactly like an IO::File filehandle. This means that all normal output file operations can be carried out with $z. For example, to write to a compressed file/buffer you can use either of these forms:

        $z->print("hello world\n");
        print $z "hello world\n";

    The mandatory parameter $output is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output is a scalar reference, the compressed data will be stored in $$output.

    If the $output parameter is any other type, IO::Compress::Bzip2::new will return undef.

    Constructor Options

    OPTS is any combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $output parameter is a filehandle. If specified, and the value is true, it will result in the $output being closed once either the close method is called or the IO::Compress::Bzip2 object is destroyed.

      This parameter defaults to 0.

    • Append => 0|1

      Opens $output in append mode.

      The behaviour of this option is dependent on the type of $output.

      • A Buffer

        If $output is a buffer and Append is enabled, all compressed data will be appended to the end of $output. Otherwise $output will be cleared before any data is written to it.

      • A Filename

        If $output is a filename and Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any compressed data is written to it.

      • A Filehandle

        If $output is a filehandle and Append is enabled, the file pointer will be positioned to the end of the file via a call to seek before any compressed data is written to it. Otherwise the file pointer will not be moved.

      This parameter defaults to 0.

    • BlockSize100K => number

      Specify the number of 100K blocks bzip2 uses during compression.

      Valid values are from 1 to 9, where 9 is best compression.

      The default is 1.

    • WorkFactor => number

      Specifies how much effort bzip2 should take before resorting to a slower fallback compression algorithm.

      Valid values range from 0 to 250, where 0 means use the default value 30.

      The default is 0.

    • Strict => 0|1

      This is a placeholder option.

    Examples

    TODO
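    As one possible illustration of the constructor options, the sketch below writes to a buffer that already holds data, using Append and BlockSize100K. The "HEADER:" prefix is a hypothetical stand-in for existing buffer contents, and the read-back step uses the companion module IO::Uncompress::Bunzip2.

```perl
use strict;
use warnings;
use IO::Compress::Bzip2     qw($Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

# The buffer already holds data (a hypothetical plain-text prefix).
my $prefix = "HEADER:";
my $buffer = $prefix;

# Append => 1 keeps the existing contents of $buffer;
# BlockSize100K => 9 selects the largest block size.
my $z = IO::Compress::Bzip2->new(\$buffer, Append => 1, BlockSize100K => 9)
    or die "IO::Compress::Bzip2 failed: $Bzip2Error\n";

$z->print("hello world\n");
$z->close()
    or die "close failed: $Bzip2Error\n";

# Skip the prefix, then decompress what was appended after it.
my $compressed = substr($buffer, length($prefix));
my $restored;
bunzip2 \$compressed => \$restored
    or die "bunzip2 failed: $Bunzip2Error\n";

print $restored;
```

    Without Append, the constructor would clear $buffer first and the prefix would be lost.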

    Methods

    print

    Usage is

        $z->print($data)
        print $z $data

    Compresses and outputs the contents of the $data parameter. This has the same behaviour as the print built-in.

    Returns true if successful.

    printf

    Usage is

        $z->printf($format, $data)
        printf $z $format, $data

    Compresses and outputs the contents of the $data parameter.

    Returns true if successful.

    syswrite

    Usage is

        $z->syswrite($data)
        $z->syswrite($data, $length)
        $z->syswrite($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    write

    Usage is

        $z->write($data)
        $z->write($data, $length)
        $z->write($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    flush

    Usage is

        $z->flush;

    Flushes any pending compressed data to the output file/buffer.

    TODO

    Returns true on success.

    tell

    Usage is

        $z->tell()
        tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

        $z->eof();
        eof($z);

    Returns true if the close method has been called.

    seek

        $z->seek($position, $whence);
        seek($z, $position, $whence);

    Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the output file/buffer. It is a fatal error to attempt to seek backward.

    Empty parts of the file/buffer will have NULL (0x00) bytes written to them.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.
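    The forward-only behaviour can be observed with a small buffer. This is a sketch under the assumption that the SEEK_SET constant is imported from Fcntl; the data is read back with the companion module IO::Uncompress::Bunzip2 so the filled-in gap is visible.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use IO::Compress::Bzip2     qw($Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

my $compressed;
my $z = IO::Compress::Bzip2->new(\$compressed)
    or die "IO::Compress::Bzip2 failed: $Bzip2Error\n";

$z->print("abc");           # three uncompressed bytes written so far
$z->seek(5, SEEK_SET)       # seek forward past the end of the data
    or die "seek failed: $Bzip2Error\n";
my $offset = $z->tell();    # uncompressed offset after the seek
$z->print("xyz");
$z->close()
    or die "close failed: $Bzip2Error\n";

my $restored;
bunzip2 \$compressed => \$restored
    or die "bunzip2 failed: $Bunzip2Error\n";

# Print the bytes as hex so the NULL padding is visible.
printf "offset after seek: %d\n", $offset;
printf "bytes: %s\n", unpack('H*', $restored);
```

    The two bytes between "abc" and "xyz" come back as 0x00, matching the description above.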

    binmode

    Usage is

        $z->binmode
        binmode $z ;

    This is a no-op provided for completeness.

    opened

        $z->opened()

    Returns true if the object currently refers to an opened file/buffer.

    autoflush

        my $prev = $z->autoflush()
        my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

        $z->input_line_number()
        $z->input_line_number(EXPR)

    This method always returns undef when compressing.

    fileno

        $z->fileno()
        fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called, fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

        $z->close() ;
        close $z ;

    Flushes any pending compressed data and then closes the output file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Compress::Bzip2 object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Compress::Bzip2 object was created, and the object is associated with a file, the underlying file will also be closed.

    newStream([OPTS])

    Usage is

        $z->newStream( [OPTS] )

    Closes the current compressed data stream and starts a new one.

    OPTS consists of any of the options that are available when creating the $z object.

    See the Constructor Options section for more details.
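    A minimal sketch of newStream follows. It writes two compressed data streams into one buffer and reads both back with the MultiStream option of the companion module IO::Uncompress::Bunzip2. The return value of newStream is not stated in this section, so the sketch does not test it.

```perl
use strict;
use warnings;
use IO::Compress::Bzip2     qw($Bzip2Error);
use IO::Uncompress::Bunzip2 qw(bunzip2 $Bunzip2Error);

my $compressed;
my $z = IO::Compress::Bzip2->new(\$compressed)
    or die "IO::Compress::Bzip2 failed: $Bzip2Error\n";

$z->print("first stream\n");
$z->newStream();               # end the first stream, begin a second
$z->print("second stream\n");
$z->close()
    or die "close failed: $Bzip2Error\n";

# MultiStream => 1 uncompresses every stream in the buffer as if it
# were a single data stream.
my $restored;
bunzip2 \$compressed => \$restored, MultiStream => 1
    or die "bunzip2 failed: $Bunzip2Error\n";

print $restored;
```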

    Importing

    No symbolic constants are required by IO::Compress::Bzip2 at present.

    • :all

      Imports bzip2 and $Bzip2Error. Same as doing this:

          use IO::Compress::Bzip2 qw(bzip2 $Bzip2Error) ;

    EXAMPLES

    Apache::GZip Revisited

    See IO::Compress::Bzip2::FAQ

    Working with Net::FTP

    See IO::Compress::Bzip2::FAQ

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Uncompress::Bunzip2, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    Compress::Zlib::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    The primary site for the bzip2 program is http://www.bzip.org.

    See the module Compress::Bzip2

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2008 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Compress::Deflate

    NAME

    IO::Compress::Deflate - Write RFC 1950 files/buffers

    SYNOPSIS

        use IO::Compress::Deflate qw(deflate $DeflateError) ;

        my $status = deflate $input => $output [,OPTS]
            or die "deflate failed: $DeflateError\n";

        my $z = new IO::Compress::Deflate $output [,OPTS]
            or die "deflate failed: $DeflateError\n";

        $z->print($string);
        $z->printf($format, $string);
        $z->write($string);
        $z->syswrite($string [, $length, $offset]);
        $z->flush();
        $z->tell();
        $z->eof();
        $z->seek($position, $whence);
        $z->binmode();
        $z->fileno();
        $z->opened();
        $z->autoflush();
        $z->input_line_number();
        $z->newStream( [OPTS] );
        $z->deflateParams();
        $z->close() ;

        $DeflateError ;

        # IO::File mode

        print $z $string;
        printf $z $format, $string;
        tell $z
        eof $z
        seek $z, $position, $whence
        binmode $z
        fileno $z
        close $z ;

    DESCRIPTION

    This module provides a Perl interface that allows writing compressed data to files or buffers as defined in RFC 1950.

    For reading RFC 1950 files/buffers, see the companion module IO::Uncompress::Inflate.

    Functional Interface

    A top-level function, deflate, is provided to carry out "one-shot" compression between buffers and/or files. For finer control over the compression process, see the OO Interface section.

        use IO::Compress::Deflate qw(deflate $DeflateError) ;

        deflate $input_filename_or_reference => $output_filename_or_reference [,OPTS]
            or die "deflate failed: $DeflateError\n";

    The functional interface needs Perl 5.005 or better.

    deflate $input => $output [, OPTS]

    deflate expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference.

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference, is used to define the source of the uncompressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference.

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is compressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">", deflate will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the compressed data will be stored in $$output_filename_or_reference.

    • An Array Reference

      If $output_filename_or_reference is an array reference, the compressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">", deflate will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.

    Notes

    When $input_filename_or_reference maps to multiple files/buffers and $output_filename_or_reference is a single file/buffer the input files/buffers will be stored in $output_filename_or_reference as a concatenated series of compressed data streams.

    Optional Parameters

    Unless specified below, the optional parameters for deflate, OPTS, are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to deflate that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once deflate has completed.

      This parameter defaults to 0.

    • BinModeIn => 0|1

      When reading from a file or filehandle, set binmode before reading.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all compressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any compressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any compressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any compressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all compressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any compressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all compressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any compressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any compressed data is output.

      Defaults to 0.
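    The buffer case of Append can be sketched as below. The variable names are illustrative, and the read-back step assumes the MultiStream option of the companion module IO::Uncompress::Inflate, which is not described in this section.

```perl
use strict;
use warnings;
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw(inflate $InflateError);

my ($first, $second) = ("one\n", "two\n");
my $buffer;

# Without Append, the output buffer is cleared before writing.
deflate \$first => \$buffer
    or die "deflate failed: $DeflateError\n";
my $length_after_first = length $buffer;

# With Append => 1, the second compressed stream is added after the first.
deflate \$second => \$buffer, Append => 1
    or die "deflate failed: $DeflateError\n";

# MultiStream => 1 reads both RFC 1950 streams from the buffer.
my $restored;
inflate \$buffer => \$restored, MultiStream => 1
    or die "inflate failed: $InflateError\n";

print $restored;
```

    The buffer grows with the second call instead of being overwritten, so it ends up holding two independent compressed data streams.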

    Examples

    To read the contents of the file file1.txt and write the compressed data to the file file1.txt.1950:

        use strict ;
        use warnings ;
        use IO::Compress::Deflate qw(deflate $DeflateError) ;

        my $input = "file1.txt";
        deflate $input => "$input.1950"
            or die "deflate failed: $DeflateError\n";

    To read from an existing Perl filehandle, $input, and write the compressed data to a buffer, $buffer:

        use strict ;
        use warnings ;
        use IO::Compress::Deflate qw(deflate $DeflateError) ;
        use IO::File ;

        my $input = new IO::File "<file1.txt"
            or die "Cannot open 'file1.txt': $!\n" ;
        my $buffer ;
        deflate $input => \$buffer
            or die "deflate failed: $DeflateError\n";

    To compress all files in the directory "/my/home" that match "*.txt" and store the compressed data in the same directory:

        use strict ;
        use warnings ;
        use IO::Compress::Deflate qw(deflate $DeflateError) ;

        deflate '</my/home/*.txt>' => '<*.1950>'
            or die "deflate failed: $DeflateError\n";

    If you want to compress each file one at a time, this will do the trick:

        use strict ;
        use warnings ;
        use IO::Compress::Deflate qw(deflate $DeflateError) ;

        for my $input ( glob "/my/home/*.txt" )
        {
            my $output = "$input.1950" ;
            deflate $input => $output
                or die "Error compressing '$input': $DeflateError\n";
        }
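    The effect of the Append option on a buffer can be sketched as follows. This is an illustration only: the variable names are made up for the example, and it assumes the companion module IO::Uncompress::Inflate is available to read the data back. Two calls to deflate leave two concatenated RFC 1950 streams in $buffer, so MultiStream is used when uncompressing.

```perl
use strict ;
use warnings ;
use IO::Compress::Deflate qw(deflate $DeflateError) ;
use IO::Uncompress::Inflate qw(inflate $InflateError) ;

my $first  = "first line\n" ;
my $second = "second line\n" ;
my $buffer = '' ;

# Without Append the output buffer is cleared before any data is written.
deflate \$first => \$buffer
    or die "deflate failed: $DeflateError\n";
my $length_after_first = length $buffer ;

# With Append the second compressed stream is added after the first.
deflate \$second => \$buffer, Append => 1
    or die "deflate failed: $DeflateError\n";

# MultiStream asks inflate to read every stream held in the buffer.
my $plain ;
inflate \$buffer => \$plain, MultiStream => 1
    or die "inflate failed: $InflateError\n";
```

    After this runs, $plain should hold both lines in the order they were compressed.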

    OO Interface

    Constructor

    The format of the constructor for IO::Compress::Deflate is shown below:

        my $z = new IO::Compress::Deflate $output [,OPTS]
            or die "IO::Compress::Deflate failed: $DeflateError\n";

    It returns an IO::Compress::Deflate object on success and undef on failure. The variable $DeflateError will contain an error message on failure.

    If you are running Perl 5.005 or better, the object $z returned from IO::Compress::Deflate can be used exactly like an IO::File filehandle. This means that all normal output file operations can be carried out with $z. For example, to write to a compressed file/buffer you can use either of these forms:

        $z->print("hello world\n");
        print $z "hello world\n";

    The mandatory parameter $output is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output is a scalar reference, the compressed data will be stored in $$output .

    If the $output parameter is any other type, IO::Compress::Deflate::new will return undef.
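    As a minimal sketch of the scalar-reference form of $output, the following writes to an in-memory buffer and reads the result back. It uses the arrow form of the constructor call, which is equivalent to the indirect-object form shown above, and assumes IO::Uncompress::Inflate is available for the round trip. The variable names are illustrative.

```perl
use strict ;
use warnings ;
use IO::Compress::Deflate qw($DeflateError) ;
use IO::Uncompress::Inflate qw(inflate $InflateError) ;

my $compressed ;
my $z = IO::Compress::Deflate->new(\$compressed)
    or die "IO::Compress::Deflate failed: $DeflateError\n";

# Both the method form and the filehandle form write to the same stream.
$z->print("hello world\n") ;
print $z "hello again\n" ;

# Closing flushes any pending compressed data into $compressed.
$z->close() ;

my $plain ;
inflate \$compressed => \$plain
    or die "inflate failed: $InflateError\n";
```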

    Constructor Options

    OPTS is any combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $output parameter is a filehandle. If specified, and the value is true, it will result in the $output being closed once either the close method is called or the IO::Compress::Deflate object is destroyed.

      This parameter defaults to 0.

    • Append => 0|1

      Opens $output in append mode.

      The behaviour of this option is dependent on the type of $output .

      • A Buffer

        If $output is a buffer and Append is enabled, all compressed data will be appended to the end of $output. Otherwise $output will be cleared before any data is written to it.

      • A Filename

        If $output is a filename and Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any compressed data is written to it.

      • A Filehandle

        If $output is a filehandle and Append is enabled, the file pointer will be positioned to the end of the file via a call to seek before any compressed data is written to it. Otherwise the file pointer will not be moved.

      This parameter defaults to 0.

    • Merge => 0|1

      This option is used to compress input data and append it to an existing compressed data stream in $output . The end result is a single compressed data stream stored in $output .

      It is a fatal error to attempt to use this option when $output is not an RFC 1950 data stream.

      There are a number of other limitations with the Merge option:

      1. This module needs to have been built with zlib 1.2.1 or better to work. A fatal error will be thrown if Merge is used with an older version of zlib.

      2. If $output is a file or a filehandle, it must be seekable.

      This parameter defaults to 0.

    • -Level

      Defines the compression level used by zlib. The value should either be a number between 0 and 9 (0 means no compression and 9 is maximum compression), or one of the symbolic constants defined below.

        Z_NO_COMPRESSION
        Z_BEST_SPEED
        Z_BEST_COMPRESSION
        Z_DEFAULT_COMPRESSION

      The default is Z_DEFAULT_COMPRESSION.

      Note that these constants are not imported by IO::Compress::Deflate by default. Any one of the following will import them:

        use IO::Compress::Deflate qw(:level);
        use IO::Compress::Deflate qw(:constants);
        use IO::Compress::Deflate qw(:all);

    • -Strategy

      Defines the strategy used to tune the compression. Use one of the symbolic constants defined below.

        Z_FILTERED
        Z_HUFFMAN_ONLY
        Z_RLE
        Z_FIXED
        Z_DEFAULT_STRATEGY

      The default is Z_DEFAULT_STRATEGY.

    • Strict => 0|1

      This is a placeholder option.

    Examples

    TODO
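    One possible illustration of the Level option is sketched below. It compresses the same repetitive data at two levels and compares the sizes. The data and variable names are invented for the example, and the constants are called with explicit parentheses so that they parse safely inside an argument list.

```perl
use strict ;
use warnings ;
use IO::Compress::Deflate qw(deflate $DeflateError :level) ;

my $data = "abc" x 1000 ;
my ($stored, $smallest) ;

# Level 0: the data is stored without compression.
deflate \$data => \$stored, Level => Z_NO_COMPRESSION()
    or die "deflate failed: $DeflateError\n";

# Level 9: maximum compression.
deflate \$data => \$smallest, Level => Z_BEST_COMPRESSION()
    or die "deflate failed: $DeflateError\n";

printf "input %d bytes, stored %d bytes, best %d bytes\n",
    length $data, length $stored, length $smallest ;
```

    With no compression the output is slightly larger than the input, because the stream framing is still added.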

    Methods

    print

    Usage is

        $z->print($data)
        print $z $data

    Compresses and outputs the contents of the $data parameter. This has the same behaviour as the print built-in.

    Returns true if successful.

    printf

    Usage is

        $z->printf($format, $data)
        printf $z $format, $data

    Compresses and outputs the contents of the $data parameter.

    Returns true if successful.

    syswrite

    Usage is

        $z->syswrite($data)
        $z->syswrite($data, $length)
        $z->syswrite($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    write

    Usage is

        $z->write($data)
        $z->write($data, $length)
        $z->write($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    flush

    Usage is

        $z->flush;
        $z->flush($flush_type);

    Flushes any pending compressed data to the output file/buffer.

    This method takes an optional parameter, $flush_type, that controls how the flushing will be carried out. By default the $flush_type used is Z_FINISH. Other valid values for $flush_type are Z_NO_FLUSH, Z_SYNC_FLUSH, Z_FULL_FLUSH and Z_BLOCK. It is strongly recommended that you only set the $flush_type parameter if you fully understand the implications of what it does. Overuse of flush can seriously degrade the level of compression achieved. See the zlib documentation for details.

    Returns true on success.
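    The following sketch shows one way flush might be used with an explicit $flush_type. It writes to an in-memory buffer so that the effect is visible as a change in the buffer length. The variable names are illustrative, and IO::Uncompress::Inflate is assumed to be available to check the result.

```perl
use strict ;
use warnings ;
use IO::Compress::Deflate qw($DeflateError :flush) ;
use IO::Uncompress::Inflate qw(inflate $InflateError) ;

my $compressed = '' ;
my $z = IO::Compress::Deflate->new(\$compressed)
    or die "IO::Compress::Deflate failed: $DeflateError\n";

$z->print("first chunk\n") ;
my $before = length $compressed ;

# Push everything written so far out to the buffer without ending the stream.
$z->flush(Z_SYNC_FLUSH())
    or die "flush failed: $DeflateError\n";
my $after = length $compressed ;

# The stream is still open, so more data can follow.
$z->print("second chunk\n") ;
$z->close() ;

my $plain ;
inflate \$compressed => \$plain
    or die "inflate failed: $InflateError\n";
```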

    tell

    Usage is

        $z->tell()
        tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

        $z->eof();
        eof($z);

    Returns true if the close method has been called.

    seek

        $z->seek($position, $whence);
        seek($z, $position, $whence);

    Provides a subset of the seek functionality, with the restriction that it is only legal to seek forward in the output file/buffer. It is a fatal error to attempt to seek backward.

    Empty parts of the file/buffer will have NULL (0x00) bytes written to them.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.
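    A forward seek can be sketched like this. The example takes SEEK_SET from the standard Fcntl module and reads the data back with IO::Uncompress::Inflate; both choices, and the variable names, are assumptions made for the illustration.

```perl
use strict ;
use warnings ;
use Fcntl qw(SEEK_SET) ;
use IO::Compress::Deflate qw($DeflateError) ;
use IO::Uncompress::Inflate qw(inflate $InflateError) ;

my $compressed ;
my $z = IO::Compress::Deflate->new(\$compressed)
    or die "IO::Compress::Deflate failed: $DeflateError\n";

$z->print("abc") ;

# Seek forward to uncompressed offset 10; the gap is filled with 0x00 bytes.
$z->seek(10, SEEK_SET) ;
$z->print("xyz") ;

# tell reports the uncompressed offset.
my $offset = $z->tell() ;
$z->close() ;

my $plain ;
inflate \$compressed => \$plain
    or die "inflate failed: $InflateError\n";
```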

    binmode

    Usage is

        $z->binmode
        binmode $z ;

    This is a no-op provided for completeness.

    opened

        $z->opened()

    Returns true if the object currently refers to an open file/buffer.

    autoflush

        my $prev = $z->autoflush()
        my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

        $z->input_line_number()
        $z->input_line_number(EXPR)

    This method always returns undef when compressing.

    fileno

        $z->fileno()
        fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

        $z->close() ;
        close $z ;

    Flushes any pending compressed data and then closes the output file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Compress::Deflate object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Compress::Deflate object was created, and the object is associated with a file, the underlying file will also be closed.

    newStream([OPTS])

    Usage is

        $z->newStream( [OPTS] )

    Closes the current compressed data stream and starts a new one.

    OPTS consists of any of the options that are available when creating the $z object.

    See the Constructor Options section for more details.
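    As a rough sketch of newStream, the following writes two separate compressed data streams into one buffer. It calls newStream without any options, and uses the MultiStream option of IO::Uncompress::Inflate to read both streams back. The variable names are illustrative.

```perl
use strict ;
use warnings ;
use IO::Compress::Deflate qw($DeflateError) ;
use IO::Uncompress::Inflate qw(inflate $InflateError) ;

my $compressed ;
my $z = IO::Compress::Deflate->new(\$compressed)
    or die "IO::Compress::Deflate failed: $DeflateError\n";

$z->print("stream one\n") ;

# Finish the first stream and begin a second one in the same buffer.
$z->newStream() ;
$z->print("stream two\n") ;
$z->close() ;

# Read back every stream held in the buffer.
my $plain ;
inflate \$compressed => \$plain, MultiStream => 1
    or die "inflate failed: $InflateError\n";
```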

    deflateParams

    Usage is

        $z->deflateParams

    TODO

    Importing

    A number of symbolic constants are required by some methods in IO::Compress::Deflate . None are imported by default.

    • :all

      Imports deflate, $DeflateError and all symbolic constants that can be used by IO::Compress::Deflate. Same as doing this:

        use IO::Compress::Deflate qw(deflate $DeflateError :constants) ;

    • :constants

      Import all symbolic constants. Same as doing this:

        use IO::Compress::Deflate qw(:flush :level :strategy) ;

    • :flush

      These symbolic constants are used by the flush method.

        Z_NO_FLUSH
        Z_PARTIAL_FLUSH
        Z_SYNC_FLUSH
        Z_FULL_FLUSH
        Z_FINISH
        Z_BLOCK

    • :level

      These symbolic constants are used by the Level option in the constructor.

        Z_NO_COMPRESSION
        Z_BEST_SPEED
        Z_BEST_COMPRESSION
        Z_DEFAULT_COMPRESSION

    • :strategy

      These symbolic constants are used by the Strategy option in the constructor.

        Z_FILTERED
        Z_HUFFMAN_ONLY
        Z_RLE
        Z_FIXED
        Z_DEFAULT_STRATEGY

    EXAMPLES

    Apache::GZip Revisited

    See IO::Compress::FAQ

    Working with Net::FTP

    See IO::Compress::FAQ

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Compress::Gzip

    NAME

    IO::Compress::Gzip - Write RFC 1952 files/buffers

    SYNOPSIS

        use IO::Compress::Gzip qw(gzip $GzipError) ;

        my $status = gzip $input => $output [,OPTS]
            or die "gzip failed: $GzipError\n";

        my $z = new IO::Compress::Gzip $output [,OPTS]
            or die "gzip failed: $GzipError\n";

        $z->print($string);
        $z->printf($format, $string);
        $z->write($string);
        $z->syswrite($string [, $length, $offset]);
        $z->flush();
        $z->tell();
        $z->eof();
        $z->seek($position, $whence);
        $z->binmode();
        $z->fileno();
        $z->opened();
        $z->autoflush();
        $z->input_line_number();
        $z->newStream( [OPTS] );
        $z->deflateParams();
        $z->close() ;

        $GzipError ;

        # IO::File mode

        print $z $string;
        printf $z $format, $string;
        tell $z
        eof $z
        seek $z, $position, $whence
        binmode $z
        fileno $z
        close $z ;

    DESCRIPTION

    This module provides a Perl interface that allows writing compressed data to files or buffers as defined in RFC 1952.

    All the gzip headers defined in RFC 1952 can be created using this module.

    For reading RFC 1952 files/buffers, see the companion module IO::Uncompress::Gunzip.

    Functional Interface

    A top-level function, gzip , is provided to carry out "one-shot" compression between buffers and/or files. For finer control over the compression process, see the OO Interface section.

        use IO::Compress::Gzip qw(gzip $GzipError) ;

        gzip $input_filename_or_reference => $output_filename_or_reference [,OPTS]
            or die "gzip failed: $GzipError\n";

    The functional interface needs Perl 5.005 or better.

    gzip $input => $output [, OPTS]

    gzip expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference .

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference , is used to define the source of the uncompressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference .

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is compressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">", gzip will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.

    In addition, if $input_filename_or_reference is a simple filename, the default values for the Name and Time options will be sourced from that file.

    If you do not want to use these defaults they can be overridden by explicitly setting the Name and Time options or by setting the Minimal parameter.

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the compressed data will be stored in $$output_filename_or_reference .

    • An Array Reference

      If $output_filename_or_reference is an array reference, the compressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">", gzip will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.

    Notes

    When $input_filename_or_reference maps to multiple files/buffers and $output_filename_or_reference is a single file/buffer the input files/buffers will be stored in $output_filename_or_reference as a concatenated series of compressed data streams.

    Optional Parameters

    Unless specified below, the optional parameters for gzip , OPTS , are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to gzip that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once gzip has completed.

      This parameter defaults to 0.

    • BinModeIn => 0|1

      When reading from a file or filehandle, set binmode before reading.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all compressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any compressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any compressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any compressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all compressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any compressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all compressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any compressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any compressed data is output.

      Defaults to 0.

    Examples

    To read the contents of the file file1.txt and write the compressed data to the file file1.txt.gz:

        use strict ;
        use warnings ;
        use IO::Compress::Gzip qw(gzip $GzipError) ;

        my $input = "file1.txt";
        gzip $input => "$input.gz"
            or die "gzip failed: $GzipError\n";

    To read from an existing Perl filehandle, $input, and write the compressed data to a buffer, $buffer:

        use strict ;
        use warnings ;
        use IO::Compress::Gzip qw(gzip $GzipError) ;
        use IO::File ;

        my $input = new IO::File "<file1.txt"
            or die "Cannot open 'file1.txt': $!\n" ;
        my $buffer ;
        gzip $input => \$buffer
            or die "gzip failed: $GzipError\n";

    To compress all files in the directory "/my/home" that match "*.txt" and store the compressed data in the same directory:

        use strict ;
        use warnings ;
        use IO::Compress::Gzip qw(gzip $GzipError) ;

        gzip '</my/home/*.txt>' => '<*.gz>'
            or die "gzip failed: $GzipError\n";

    If you want to compress each file one at a time, this will do the trick:

        use strict ;
        use warnings ;
        use IO::Compress::Gzip qw(gzip $GzipError) ;

        for my $input ( glob "/my/home/*.txt" )
        {
            my $output = "$input.gz" ;
            gzip $input => $output
                or die "Error compressing '$input': $GzipError\n";
        }

    OO Interface

    Constructor

    The format of the constructor for IO::Compress::Gzip is shown below:

        my $z = new IO::Compress::Gzip $output [,OPTS]
            or die "IO::Compress::Gzip failed: $GzipError\n";

    It returns an IO::Compress::Gzip object on success and undef on failure. The variable $GzipError will contain an error message on failure.

    If you are running Perl 5.005 or better, the object $z returned from IO::Compress::Gzip can be used exactly like an IO::File filehandle. This means that all normal output file operations can be carried out with $z. For example, to write to a compressed file/buffer you can use either of these forms:

        $z->print("hello world\n");
        print $z "hello world\n";

    The mandatory parameter $output is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output is a scalar reference, the compressed data will be stored in $$output .

    If the $output parameter is any other type, IO::Compress::Gzip::new will return undef.

    Constructor Options

    OPTS is any combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $output parameter is a filehandle. If specified, and the value is true, it will result in the $output being closed once either the close method is called or the IO::Compress::Gzip object is destroyed.

      This parameter defaults to 0.

    • Append => 0|1

      Opens $output in append mode.

      The behaviour of this option is dependent on the type of $output .

      • A Buffer

        If $output is a buffer and Append is enabled, all compressed data will be appended to the end of $output. Otherwise $output will be cleared before any data is written to it.

      • A Filename

        If $output is a filename and Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any compressed data is written to it.

      • A Filehandle

        If $output is a filehandle and Append is enabled, the file pointer will be positioned to the end of the file via a call to seek before any compressed data is written to it. Otherwise the file pointer will not be moved.

      This parameter defaults to 0.

    • Merge => 0|1

      This option is used to compress input data and append it to an existing compressed data stream in $output . The end result is a single compressed data stream stored in $output .

      It is a fatal error to attempt to use this option when $output is not an RFC 1952 data stream.

      There are a number of other limitations with the Merge option:

      1. This module needs to have been built with zlib 1.2.1 or better to work. A fatal error will be thrown if Merge is used with an older version of zlib.

      2. If $output is a file or a filehandle, it must be seekable.

      This parameter defaults to 0.

    • -Level

      Defines the compression level used by zlib. The value should either be a number between 0 and 9 (0 means no compression and 9 is maximum compression), or one of the symbolic constants defined below.

        Z_NO_COMPRESSION
        Z_BEST_SPEED
        Z_BEST_COMPRESSION
        Z_DEFAULT_COMPRESSION

      The default is Z_DEFAULT_COMPRESSION.

      Note that these constants are not imported by IO::Compress::Gzip by default. Any one of the following will import them:

        use IO::Compress::Gzip qw(:level);
        use IO::Compress::Gzip qw(:constants);
        use IO::Compress::Gzip qw(:all);

    • -Strategy

      Defines the strategy used to tune the compression. Use one of the symbolic constants defined below.

        Z_FILTERED
        Z_HUFFMAN_ONLY
        Z_RLE
        Z_FIXED
        Z_DEFAULT_STRATEGY

      The default is Z_DEFAULT_STRATEGY.

    • Minimal => 0|1

      If specified, this option will force the creation of the smallest possible compliant gzip header (which is exactly 10 bytes long) as defined in RFC 1952.

      See the section titled "Compliance" in RFC 1952 for a definition of the values used for the fields in the gzip header.

      All other parameters that control the content of the gzip header will be ignored if this parameter is set to 1.

      This parameter defaults to 0.

    • Comment => $comment

      Stores the contents of $comment in the COMMENT field in the gzip header. By default, no comment field is written to the gzip file.

      If the -Strict option is enabled, the comment can only consist of ISO 8859-1 characters plus line feed.

      If the -Strict option is disabled, the comment field can contain any character except NULL. If any null characters are present, the field will be truncated at the first NULL.

    • Name => $string

      Stores the contents of $string in the gzip NAME header field. If Name is not specified, no gzip NAME field will be created.

      If the -Strict option is enabled, $string can only consist of ISO 8859-1 characters.

      If -Strict is disabled, then $string can contain any character except NULL. If any null characters are present, the field will be truncated at the first NULL.

    • Time => $number

      Sets the MTIME field in the gzip header to $number.

      This field defaults to the time the IO::Compress::Gzip object was created if this option is not specified.

    • TextFlag => 0|1

      This parameter controls the setting of the FLG.FTEXT bit in the gzip header. It is used to signal that the data stored in the gzip file/buffer is probably text.

      The default is 0.

    • HeaderCRC => 0|1

      When true this parameter will set the FLG.FHCRC bit to 1 in the gzip header and set the CRC16 header field to the CRC of the complete gzip header except the CRC16 field itself.

      Note that gzip files created with the HeaderCRC flag set to 1 cannot be read by most, if not all, of the standard gunzip utilities, most notably gzip version 1.2.4. You should therefore avoid using this option if you want to maximize the portability of your gzip files.

      This parameter defaults to 0.

    • OS_Code => $value

      Stores $value in the gzip OS header field. A number between 0 and 255 is valid.

      If not specified, this parameter defaults to the OS code of the Operating System this module was built on. The value 3 is used as a catch-all for all Unix variants and unknown Operating Systems.

    • ExtraField => $data

      This parameter allows additional metadata to be stored in the ExtraField in the gzip header. An RFC 1952 compliant ExtraField consists of zero or more subfields. Each subfield consists of a two byte header followed by the subfield data.

      The list of subfields can be supplied in any of the following formats:

        -ExtraField => [$id1, $data1,
                        $id2, $data2,
                        ...
                       ]

        -ExtraField => [ [$id1 => $data1],
                         [$id2 => $data2],
                         ...
                       ]

        -ExtraField => { $id1 => $data1,
                         $id2 => $data2,
                         ...
                       }

      Where $id1, $id2 are two-byte subfield IDs. The second byte of the ID cannot be 0, unless the Strict option has been disabled.

      If you use the hash syntax, you have no control over the order in which the ExtraSubFields are stored, and you cannot have SubFields with duplicate IDs.

      Alternatively the list of subfields can be supplied as a scalar, thus:

        -ExtraField => $rawdata

      If you use the raw format, and the Strict option is enabled, IO::Compress::Gzip will check that $rawdata consists of zero or more conformant sub-fields. When Strict is disabled, $rawdata can consist of any arbitrary byte stream.

      The maximum size of the Extra Field is 65535 bytes.

    • ExtraFlags => $value

      Sets the XFL byte in the gzip header to $value .

      If this option is not present, the value stored in XFL field will be determined by the setting of the Level option.

      If Level => Z_BEST_SPEED has been specified then XFL is set to 2. If Level => Z_BEST_COMPRESSION has been specified then XFL is set to 4. Otherwise XFL is set to 0.

    • Strict => 0|1

      Strict will optionally police the values supplied with other options to ensure they are compliant with RFC1952.

      This option is enabled by default.

      If Strict is enabled the following behaviour will be policed:

      • The value supplied with the Name option can only contain ISO 8859-1 characters.

      • The value supplied with the Comment option can only contain ISO 8859-1 characters plus line-feed.

      • The values supplied with the -Name and -Comment options cannot contain multiple embedded nulls.

      • If an ExtraField option is specified and it is a simple scalar, it must conform to the sub-field structure as defined in RFC 1952.

      • If an ExtraField option is specified the second byte of the ID will be checked in each subfield to ensure that it does not contain the reserved value 0x00.

      When Strict is disabled the following behaviour will be policed:

      • The value supplied with -Name option can contain any character except NULL.

      • The value supplied with -Comment option can contain any character except NULL.

      • The values supplied with the -Name and -Comment options can contain multiple embedded nulls. The string written to the gzip header will consist of the characters up to, but not including, the first embedded NULL.

      • If an ExtraField option is specified and it is a simple scalar, the structure will not be checked. The only error is if the length is too big.

      • The ID header in an ExtraField sub-field can consist of any two bytes.

    Examples

    TODO
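    As one possible illustration of the header options described above, the sketch below sets Name, Comment and Time, then reads them back using getHeaderInfo from the companion module IO::Uncompress::Gunzip. It also creates a second stream with Minimal enabled to show the 10 byte header. The data, file name and variable names are invented for the example.

```perl
use strict ;
use warnings ;
use IO::Compress::Gzip qw(gzip $GzipError) ;
use IO::Uncompress::Gunzip qw($GunzipError) ;

my $data = "hello world\n" ;
my ($full, $minimal) ;

# Store a name, a comment and a fixed modification time in the gzip header.
gzip \$data => \$full,
     Name => "greeting.txt", Comment => "demo header", Time => 1000000000
    or die "gzip failed: $GzipError\n";

# Minimal ignores the other header options and writes the smallest header.
gzip \$data => \$minimal, Minimal => 1
    or die "gzip failed: $GzipError\n";

my $gunzip = IO::Uncompress::Gunzip->new(\$full)
    or die "gunzip failed: $GunzipError\n";
my $header = $gunzip->getHeaderInfo() ;

my $gunzip_min = IO::Uncompress::Gunzip->new(\$minimal)
    or die "gunzip failed: $GunzipError\n";
my $min_header = $gunzip_min->getHeaderInfo() ;

print "Name: $header->{Name}\n" ;
print "Comment: $header->{Comment}\n" ;
print "Minimal header length: $min_header->{HeaderLength}\n" ;
```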

    Methods

    print

    Usage is

        $z->print($data)
        print $z $data

    Compresses and outputs the contents of the $data parameter. This has the same behaviour as the print built-in.

    Returns true if successful.

    printf

    Usage is

        $z->printf($format, $data)
        printf $z $format, $data

    Compresses and outputs the contents of the $data parameter.

    Returns true if successful.

    syswrite

    Usage is

        $z->syswrite($data)
        $z->syswrite($data, $length)
        $z->syswrite($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    write

    Usage is

        $z->write($data)
        $z->write($data, $length)
        $z->write($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    flush

    Usage is

    1. $z->flush;
    2. $z->flush($flush_type);

    Flushes any pending compressed data to the output file/buffer.

    This method takes an optional parameter, $flush_type , that controls how the flushing will be carried out. By default the $flush_type used is Z_FINISH . Other valid values for $flush_type are Z_NO_FLUSH , Z_SYNC_FLUSH , Z_FULL_FLUSH and Z_BLOCK . It is strongly recommended that you only set the flush_type parameter if you fully understand the implications of what it does - overuse of flush can seriously degrade the level of compression achieved. See the zlib documentation for details.

    Returns true on success.

    tell

    Usage is

    1. $z->tell()
    2. tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

    1. $z->eof();
    2. eof($z);

    Returns true if the close method has been called.

    seek

    1. $z->seek($position, $whence);
    2. seek($z, $position, $whence);

    Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the output file/buffer. It is a fatal error to attempt to seek backward.

    Empty parts of the file/buffer will have NULL (0x00) bytes written to them.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.
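A short sketch of a forward seek, assuming a scalar buffer as the output: after five bytes have been written, seeking to uncompressed offset 10 pads the gap with NULL bytes. The variable names are illustrative.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use IO::Compress::Gzip     qw($GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

my $compressed;
my $z = IO::Compress::Gzip->new(\$compressed)
    or die "IO::Compress::Gzip failed: $GzipError\n";

$z->print("hello");         # uncompressed offset is now 5
$z->seek(10, SEEK_SET);     # forward seek: offsets 5..9 are filled with 0x00
my $offset = $z->tell();    # uncompressed offset after the seek
$z->print("world");
$z->close();

# Uncompress to inspect the padding that the seek produced.
my $plain;
gunzip \$compressed => \$plain
    or die "gunzip failed: $GunzipError\n";
```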

    binmode

    Usage is

    1. $z->binmode
    2. binmode $z ;

    This is a no-op provided for completeness.

    opened

    1. $z->opened()

    Returns true if the object currently refers to an open file/buffer.

    autoflush

    1. my $prev = $z->autoflush()
    2. my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

    1. $z->input_line_number()
    2. $z->input_line_number(EXPR)

    This method always returns undef when compressing.

    fileno

    1. $z->fileno()
    2. fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

    1. $z->close() ;
    2. close $z ;

    Flushes any pending compressed data and then closes the output file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Compress::Gzip object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Compress::Gzip object was created, and the object is associated with a file, the underlying file will also be closed.

    newStream([OPTS])

    Usage is

    1. $z->newStream( [OPTS] )

    Closes the current compressed data stream and starts a new one.

    OPTS consists of any of the options that are available when creating the $z object.

    See the Constructor Options section for more details.
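As a sketch of how newStream might be used, the following writes two gzip streams back to back into one buffer, giving each its own Name header, and then reads both with the MultiStream option of gunzip. The names part1.txt and part2.txt are illustrative.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw($GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

my $compressed;
my $z = IO::Compress::Gzip->new(\$compressed, Name => 'part1.txt')
    or die "IO::Compress::Gzip failed: $GzipError\n";
$z->print("first stream\n");

# Finish the first compressed stream and start a second one
# with its own header options.
$z->newStream(Name => 'part2.txt');
$z->print("second stream\n");
$z->close();

# MultiStream tells gunzip to keep reading past the end of the first stream.
my $plain;
gunzip \$compressed => \$plain, MultiStream => 1
    or die "gunzip failed: $GunzipError\n";
```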

    deflateParams

    Usage is

    1. $z->deflateParams

    TODO

    Importing

    A number of symbolic constants are required by some methods in IO::Compress::Gzip . None are imported by default.

    • :all

      Imports gzip , $GzipError and all symbolic constants that can be used by IO::Compress::Gzip . Same as doing this

      1. use IO::Compress::Gzip qw(gzip $GzipError :constants) ;
    • :constants

      Import all symbolic constants. Same as doing this

      1. use IO::Compress::Gzip qw(:flush :level :strategy) ;
    • :flush

      These symbolic constants are used by the flush method.

      1. Z_NO_FLUSH
      2. Z_PARTIAL_FLUSH
      3. Z_SYNC_FLUSH
      4. Z_FULL_FLUSH
      5. Z_FINISH
      6. Z_BLOCK
    • :level

      These symbolic constants are used by the Level option in the constructor.

      1. Z_NO_COMPRESSION
      2. Z_BEST_SPEED
      3. Z_BEST_COMPRESSION
      4. Z_DEFAULT_COMPRESSION
    • :strategy

      These symbolic constants are used by the Strategy option in the constructor.

      1. Z_FILTERED
      2. Z_HUFFMAN_ONLY
      3. Z_RLE
      4. Z_FIXED
      5. Z_DEFAULT_STRATEGY

    EXAMPLES

    Apache::GZip Revisited

    See IO::Compress::FAQ

    Working with Net::FTP

    See IO::Compress::FAQ

    SEE ALSO

    Compress::Zlib, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Compress::RawDeflate

    NAME

    IO::Compress::RawDeflate - Write RFC 1951 files/buffers

    SYNOPSIS

    use IO::Compress::RawDeflate qw(rawdeflate $RawDeflateError) ;

    my $status = rawdeflate $input => $output [,OPTS]
        or die "rawdeflate failed: $RawDeflateError\n";

    my $z = new IO::Compress::RawDeflate $output [,OPTS]
        or die "rawdeflate failed: $RawDeflateError\n";

    $z->print($string);
    $z->printf($format, $string);
    $z->write($string);
    $z->syswrite($string [, $length, $offset]);
    $z->flush();
    $z->tell();
    $z->eof();
    $z->seek($position, $whence);
    $z->binmode();
    $z->fileno();
    $z->opened();
    $z->autoflush();
    $z->input_line_number();
    $z->newStream( [OPTS] );

    $z->deflateParams();

    $z->close() ;

    $RawDeflateError ;

    # IO::File mode

    print $z $string;
    printf $z $format, $string;
    tell $z
    eof $z
    seek $z, $position, $whence
    binmode $z
    fileno $z
    close $z ;

    DESCRIPTION

    This module provides a Perl interface for writing compressed data to files or buffers in the format defined in RFC 1951.

    Note that RFC 1951 data is not a good choice of compression format to use in isolation, especially if you want to auto-detect it.

    For reading RFC 1951 files/buffers, see the companion module IO::Uncompress::RawInflate.

    Functional Interface

    A top-level function, rawdeflate , is provided to carry out "one-shot" compression between buffers and/or files. For finer control over the compression process, see the OO Interface section.

    use IO::Compress::RawDeflate qw(rawdeflate $RawDeflateError) ;

    rawdeflate $input_filename_or_reference => $output_filename_or_reference [,OPTS]
        or die "rawdeflate failed: $RawDeflateError\n";

    The functional interface needs Perl 5.005 or better.

    rawdeflate $input => $output [, OPTS]

    rawdeflate expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference .

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference , is used to define the source of the uncompressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference .

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is compressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">" rawdeflate will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.
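To illustrate the scalar-reference form, here is a minimal sketch that compresses one in-memory buffer into another and then reverses the operation with the companion module. The data and variable names are illustrative.

```perl
use strict;
use warnings;
use IO::Compress::RawDeflate   qw(rawdeflate $RawDeflateError);
use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

my $data = "some data to compress\n" x 10;

# Input and output are both scalar references, so no files are touched.
my $deflated;
rawdeflate \$data => \$deflated
    or die "rawdeflate failed: $RawDeflateError\n";

# Reverse the operation to confirm the round trip.
my $inflated;
rawinflate \$deflated => \$inflated
    or die "rawinflate failed: $RawInflateError\n";
```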

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the compressed data will be stored in $$output_filename_or_reference .

    • An Array Reference

      If $output_filename_or_reference is an array reference, the compressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">" rawdeflate will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.

    Notes

    When $input_filename_or_reference maps to multiple files/buffers and $output_filename_or_reference is a single file/buffer the input files/buffers will be stored in $output_filename_or_reference as a concatenated series of compressed data streams.
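The many-to-one case can be sketched as follows, assuming two small temporary files created with File::Temp as the inputs and a single scalar buffer as the output. The file contents are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Compress::RawDeflate qw(rawdeflate $RawDeflateError);

# Create two small temporary input files; File::Temp picks the names.
my @files;
for my $content ("alpha\n", "beta\n") {
    my ($fh, $filename) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "Cannot close '$filename': $!\n";
    push @files, $filename;
}

# An array reference of filenames as input, one buffer as output:
# each file is compressed in turn and the results are concatenated.
my $combined;
rawdeflate \@files => \$combined
    or die "rawdeflate failed: $RawDeflateError\n";
```

Because RFC 1951 data carries no header, a reader has no marker for where one stream ends and the next begins; a framed format such as gzip is usually easier to work with when several streams share one output.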

    Optional Parameters

    Unless specified below, the optional parameters for rawdeflate , OPTS , are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to rawdeflate that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once rawdeflate has completed.

      This parameter defaults to 0.

    • BinModeIn => 0|1

      When reading from a file or filehandle, set binmode before reading.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all compressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any compressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any compressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any compressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all compressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any compressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all compressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any compressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any compressed data is output.

      Defaults to 0.
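A small sketch of Append with a buffer as the output, assuming the buffer already holds six bytes of unrelated data. The "HEADER" prefix and variable names are illustrative.

```perl
use strict;
use warnings;
use IO::Compress::RawDeflate   qw(rawdeflate $RawDeflateError);
use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

my $payload = "payload data\n";
my $buffer  = "HEADER";    # existing content that must be preserved

# With Append enabled the compressed data is added after the existing bytes.
rawdeflate \$payload => \$buffer, Append => 1
    or die "rawdeflate failed: $RawDeflateError\n";

# Everything after the six-byte prefix is the compressed stream.
my $stream = substr($buffer, 6);
my $plain;
rawinflate \$stream => \$plain
    or die "rawinflate failed: $RawInflateError\n";
```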

    Examples

    To read the contents of the file file1.txt and write the compressed data to the file file1.txt.1951 .

    use strict ;
    use warnings ;
    use IO::Compress::RawDeflate qw(rawdeflate $RawDeflateError) ;

    my $input = "file1.txt";
    rawdeflate $input => "$input.1951"
        or die "rawdeflate failed: $RawDeflateError\n";

    To read from an existing Perl filehandle, $input , and write the compressed data to a buffer, $buffer .

    use strict ;
    use warnings ;
    use IO::Compress::RawDeflate qw(rawdeflate $RawDeflateError) ;
    use IO::File ;

    my $input = new IO::File "<file1.txt"
        or die "Cannot open 'file1.txt': $!\n" ;
    my $buffer ;
    rawdeflate $input => \$buffer
        or die "rawdeflate failed: $RawDeflateError\n";

    To compress all files in the directory "/my/home" that match "*.txt" and store the compressed data in the same directory

    use strict ;
    use warnings ;
    use IO::Compress::RawDeflate qw(rawdeflate $RawDeflateError) ;

    rawdeflate '</my/home/*.txt>' => '<*.1951>'
        or die "rawdeflate failed: $RawDeflateError\n";

    and if you want to compress each file one at a time, this will do the trick

    use strict ;
    use warnings ;
    use IO::Compress::RawDeflate qw(rawdeflate $RawDeflateError) ;

    for my $input ( glob "/my/home/*.txt" )
    {
        my $output = "$input.1951" ;
        rawdeflate $input => $output
            or die "Error compressing '$input': $RawDeflateError\n";
    }

    OO Interface

    Constructor

    The format of the constructor for IO::Compress::RawDeflate is shown below

    my $z = new IO::Compress::RawDeflate $output [,OPTS]
        or die "IO::Compress::RawDeflate failed: $RawDeflateError\n";

    It returns an IO::Compress::RawDeflate object on success and undef on failure. The variable $RawDeflateError will contain an error message on failure.

    If you are running Perl 5.005 or better the object, $z , returned from IO::Compress::RawDeflate can be used exactly like an IO::File filehandle. This means that all normal output file operations can be carried out with $z . For example, to write to a compressed file/buffer you can use either of these forms

    1. $z->print("hello world\n");
    2. print $z "hello world\n";

    The mandatory parameter $output is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output is a scalar reference, the compressed data will be stored in $$output .

    If the $output parameter is any other type, IO::Compress::RawDeflate::new will return undef.
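A minimal sketch of the constructor with a scalar reference as $output, using both the method form and the filehandle form of print. The variable names are illustrative, and the round trip relies on the companion module IO::Uncompress::RawInflate.

```perl
use strict;
use warnings;
use IO::Compress::RawDeflate   qw($RawDeflateError);
use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

my $compressed;
my $z = IO::Compress::RawDeflate->new(\$compressed)
    or die "IO::Compress::RawDeflate failed: $RawDeflateError\n";

$z->print("hello ");      # method form
print $z "world\n";       # filehandle form
$z->close();

# Uncompress to confirm that both writes reached the stream.
my $plain;
rawinflate \$compressed => \$plain
    or die "rawinflate failed: $RawInflateError\n";
```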

    Constructor Options

    OPTS is any combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $output parameter is a filehandle. If specified, and the value is true, it will result in the $output being closed once either the close method is called or the IO::Compress::RawDeflate object is destroyed.

      This parameter defaults to 0.

    • Append => 0|1

      Opens $output in append mode.

      The behaviour of this option is dependent on the type of $output .

      • A Buffer

        If $output is a buffer and Append is enabled, all compressed data will be appended to the end of $output . Otherwise $output will be cleared before any data is written to it.

      • A Filename

        If $output is a filename and Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any compressed data is written to it.

      • A Filehandle

        If $output is a filehandle and Append is enabled, the file pointer will be positioned to the end of the file via a call to seek before any compressed data is written to it. Otherwise the file pointer will not be moved.

      This parameter defaults to 0.

    • Merge => 0|1

      This option is used to compress input data and append it to an existing compressed data stream in $output . The end result is a single compressed data stream stored in $output .

      It is a fatal error to attempt to use this option when $output is not an RFC 1951 data stream.

      There are a number of other limitations with the Merge option:

      1

      This module needs to have been built with zlib 1.2.1 or better to work. A fatal error will be thrown if Merge is used with an older version of zlib.

      2

      If $output is a file or a filehandle, it must be seekable.

      This parameter defaults to 0.

    • -Level

      Defines the compression level used by zlib. The value should either be a number between 0 and 9 (0 means no compression and 9 is maximum compression), or one of the symbolic constants defined below.

      1. Z_NO_COMPRESSION
      2. Z_BEST_SPEED
      3. Z_BEST_COMPRESSION
      4. Z_DEFAULT_COMPRESSION

      The default is Z_DEFAULT_COMPRESSION.

      Note, these constants are not imported by IO::Compress::RawDeflate by default.

      1. use IO::Compress::RawDeflate qw(:level);
      2. use IO::Compress::RawDeflate qw(:constants);
      3. use IO::Compress::RawDeflate qw(:all);
    • -Strategy

      Defines the strategy used to tune the compression. Use one of the symbolic constants defined below.

      1. Z_FILTERED
      2. Z_HUFFMAN_ONLY
      3. Z_RLE
      4. Z_FIXED
      5. Z_DEFAULT_STRATEGY

      The default is Z_DEFAULT_STRATEGY.

    • Strict => 0|1

      This is a placeholder option.

    Examples

    TODO
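The source leaves this Examples section unfilled. As a sketch of how the Level and Strategy options might be passed to the constructor, the following compresses the same data at two levels and keeps both results for comparison. The helper deflate_with is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;
use IO::Compress::RawDeflate qw($RawDeflateError :level :strategy);

my $data = "abc" x 1000;    # highly repetitive, so it compresses well

# Hypothetical helper: compress $input into a fresh buffer using
# the given constructor options and return the compressed bytes.
sub deflate_with {
    my ($input, %opts) = @_;
    my $out;
    my $z = IO::Compress::RawDeflate->new(\$out, %opts)
        or die "IO::Compress::RawDeflate failed: $RawDeflateError\n";
    $z->print($input);
    $z->close();
    return $out;
}

my $stored = deflate_with($data, Level => Z_NO_COMPRESSION);
my $best   = deflate_with($data, Level    => Z_BEST_COMPRESSION,
                                 Strategy => Z_DEFAULT_STRATEGY);
```

With Z_NO_COMPRESSION the output is slightly larger than the input because of block framing, while Z_BEST_COMPRESSION shrinks this repetitive input considerably.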

    Methods

    print

    Usage is

    1. $z->print($data)
    2. print $z $data

    Compresses and outputs the contents of the $data parameter. This has the same behaviour as the print built-in.

    Returns true if successful.

    printf

    Usage is

    1. $z->printf($format, $data)
    2. printf $z $format, $data

    Compresses and outputs the contents of the $data parameter.

    Returns true if successful.

    syswrite

    Usage is

    1. $z->syswrite($data)
    2. $z->syswrite($data, $length)
    3. $z->syswrite($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    write

    Usage is

    1. $z->write($data)
    2. $z->write($data, $length)
    3. $z->write($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    flush

    Usage is

    1. $z->flush;
    2. $z->flush($flush_type);

    Flushes any pending compressed data to the output file/buffer.

    This method takes an optional parameter, $flush_type , that controls how the flushing will be carried out. By default the $flush_type used is Z_FINISH . Other valid values for $flush_type are Z_NO_FLUSH , Z_SYNC_FLUSH , Z_FULL_FLUSH and Z_BLOCK . It is strongly recommended that you only set the flush_type parameter if you fully understand the implications of what it does - overuse of flush can seriously degrade the level of compression achieved. See the zlib documentation for details.

    Returns true on success.
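A sketch of an explicit flush, assuming the :flush import tag and a scalar buffer as the output. Z_SYNC_FLUSH pushes the pending compressed data out without ending the stream, so writing can continue afterwards. The variable names are illustrative.

```perl
use strict;
use warnings;
use IO::Compress::RawDeflate   qw($RawDeflateError :flush);
use IO::Uncompress::RawInflate qw(rawinflate $RawInflateError);

my $compressed;
my $z = IO::Compress::RawDeflate->new(\$compressed)
    or die "IO::Compress::RawDeflate failed: $RawDeflateError\n";

$z->print("first record\n");
$z->flush(Z_SYNC_FLUSH);    # make the data written so far available
my $bytes_after_flush = length $compressed;

$z->print("second record\n");
$z->close();                # finishes the stream

my $plain;
rawinflate \$compressed => \$plain
    or die "rawinflate failed: $RawInflateError\n";
```

Flushing after every small write defeats much of the compression, so a pattern like this is best reserved for points where a reader must be able to see the data written so far.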

    tell

    Usage is

    1. $z->tell()
    2. tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

    1. $z->eof();
    2. eof($z);

    Returns true if the close method has been called.
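The following sketch shows tell and eof together, assuming a scalar buffer as the output: tell counts uncompressed bytes, and eof only becomes true once close has been called. The variable names are illustrative.

```perl
use strict;
use warnings;
use IO::Compress::RawDeflate qw($RawDeflateError);

my $compressed;
my $z = IO::Compress::RawDeflate->new(\$compressed)
    or die "IO::Compress::RawDeflate failed: $RawDeflateError\n";

$z->print("hello");
my $offset     = $z->tell();    # uncompressed bytes written so far
my $eof_before = $z->eof();     # false: the object is still open

$z->close();
my $eof_after  = $z->eof();     # true: close has been called
```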

    seek

    1. $z->seek($position, $whence);
    2. seek($z, $position, $whence);

    Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the output file/buffer. It is a fatal error to attempt to seek backward.

    Empty parts of the file/buffer will have NULL (0x00) bytes written to them.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.

    binmode

    Usage is

    1. $z->binmode
    2. binmode $z ;

    This is a no-op provided for completeness.

    opened

    1. $z->opened()

    Returns true if the object currently refers to an open file/buffer.

    autoflush

    1. my $prev = $z->autoflush()
    2. my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

    1. $z->input_line_number()
    2. $z->input_line_number(EXPR)

    This method always returns undef when compressing.

    fileno

    1. $z->fileno()
    2. fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

    1. $z->close() ;
    2. close $z ;

    Flushes any pending compressed data and then closes the output file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Compress::RawDeflate object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Compress::RawDeflate object was created, and the object is associated with a file, the underlying file will also be closed.

    newStream([OPTS])

    Usage is

    1. $z->newStream( [OPTS] )

    Closes the current compressed data stream and starts a new one.

    OPTS consists of any of the options that are available when creating the $z object.

    See the Constructor Options section for more details.

    deflateParams

    Usage is

    1. $z->deflateParams

    TODO

    Importing

    A number of symbolic constants are required by some methods in IO::Compress::RawDeflate . None are imported by default.

    • :all

      Imports rawdeflate , $RawDeflateError and all symbolic constants that can be used by IO::Compress::RawDeflate . Same as doing this

      1. use IO::Compress::RawDeflate qw(rawdeflate $RawDeflateError :constants) ;
    • :constants

      Import all symbolic constants. Same as doing this

      1. use IO::Compress::RawDeflate qw(:flush :level :strategy) ;
    • :flush

      These symbolic constants are used by the flush method.

      1. Z_NO_FLUSH
      2. Z_PARTIAL_FLUSH
      3. Z_SYNC_FLUSH
      4. Z_FULL_FLUSH
      5. Z_FINISH
      6. Z_BLOCK
    • :level

      These symbolic constants are used by the Level option in the constructor.

      1. Z_NO_COMPRESSION
      2. Z_BEST_SPEED
      3. Z_BEST_COMPRESSION
      4. Z_DEFAULT_COMPRESSION
    • :strategy

      These symbolic constants are used by the Strategy option in the constructor.

      1. Z_FILTERED
      2. Z_HUFFMAN_ONLY
      3. Z_RLE
      4. Z_FIXED
      5. Z_DEFAULT_STRATEGY

    EXAMPLES

    Apache::GZip Revisited

    See IO::Compress::FAQ

    Working with Net::FTP

    See IO::Compress::FAQ

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    IO::Compress::Zip

    NAME

    IO::Compress::Zip - Write zip files/buffers

    SYNOPSIS

    use IO::Compress::Zip qw(zip $ZipError) ;

    my $status = zip $input => $output [,OPTS]
        or die "zip failed: $ZipError\n";

    my $z = new IO::Compress::Zip $output [,OPTS]
        or die "zip failed: $ZipError\n";

    $z->print($string);
    $z->printf($format, $string);
    $z->write($string);
    $z->syswrite($string [, $length, $offset]);
    $z->flush();
    $z->tell();
    $z->eof();
    $z->seek($position, $whence);
    $z->binmode();
    $z->fileno();
    $z->opened();
    $z->autoflush();
    $z->input_line_number();
    $z->newStream( [OPTS] );

    $z->deflateParams();

    $z->close() ;

    $ZipError ;

    # IO::File mode

    print $z $string;
    printf $z $format, $string;
    tell $z
    eof $z
    seek $z, $position, $whence
    binmode $z
    fileno $z
    close $z ;

    DESCRIPTION

    This module provides a Perl interface for writing zip compressed data to files or buffers.

    The primary purpose of this module is to provide streaming write access to zip files and buffers. It is not a general-purpose file archiver. If that is what you want, check out Archive::Zip .

    At present four compression methods are supported by IO::Compress::Zip, namely Store (no compression at all), Deflate, Bzip2 and LZMA.

    Note that to create Bzip2 content, the module IO::Compress::Bzip2 must be installed.

    Note that to create LZMA content, the module IO::Compress::Lzma must be installed.

    For reading zip files/buffers, see the companion module IO::Uncompress::Unzip.

    Functional Interface

    A top-level function, zip , is provided to carry out "one-shot" compression between buffers and/or files. For finer control over the compression process, see the OO Interface section.

    use IO::Compress::Zip qw(zip $ZipError) ;

    zip $input_filename_or_reference => $output_filename_or_reference [,OPTS]
        or die "zip failed: $ZipError\n";

    The functional interface needs Perl 5.005 or better.

    zip $input => $output [, OPTS]

    zip expects at least two parameters, $input_filename_or_reference and $output_filename_or_reference .

    The $input_filename_or_reference parameter

    The parameter, $input_filename_or_reference , is used to define the source of the uncompressed data.

    It can take one of the following forms:

    • A filename

      If the $input_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for reading and the input data will be read from it.

    • A filehandle

      If the $input_filename_or_reference parameter is a filehandle, the input data will be read from it. The string '-' can be used as an alias for standard input.

    • A scalar reference

      If $input_filename_or_reference is a scalar reference, the input data will be read from $$input_filename_or_reference .

    • An array reference

      If $input_filename_or_reference is an array reference, each element in the array must be a filename.

      The input data will be read from each file in turn.

      The complete array will be walked to ensure that it only contains valid filenames before any data is compressed.

    • An Input FileGlob string

      If $input_filename_or_reference is a string that is delimited by the characters "<" and ">" zip will assume that it is an input fileglob string. The input is the list of files that match the fileglob.

      See File::GlobMapper for more details.

    If the $input_filename_or_reference parameter is any other type, undef will be returned.

    In addition, if $input_filename_or_reference is a simple filename, the default values for the Name , Time , TextFlag , ExtAttr , exUnixN and exTime options will be sourced from that file.

    If you do not want to use these defaults they can be overridden by explicitly setting the Name , Time , TextFlag , ExtAttr , exUnixN and exTime options or by setting the Minimal parameter.
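As a sketch of overriding one of these defaults, the following zips a temporary file but stores it under an explicit member name instead of the name taken from the file, then reads the name back with IO::Uncompress::Unzip. The name renamed.txt and the file contents are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Compress::Zip     qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# A temporary input file; File::Temp picks its (unhelpful) name.
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} "zip me\n";
close $fh or die "Cannot close '$filename': $!\n";

# Without the Name option the member name would come from $filename.
my $archive;
zip $filename => \$archive, Name => 'renamed.txt'
    or die "zip failed: $ZipError\n";

# Read back the header of the first member to see the stored name.
my $u = IO::Uncompress::Unzip->new(\$archive)
    or die "unzip failed: $UnzipError\n";
my $stored_name = $u->getHeaderInfo()->{Name};
$u->close();
```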

    The $output_filename_or_reference parameter

    The parameter $output_filename_or_reference is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output_filename_or_reference parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output_filename_or_reference parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output_filename_or_reference is a scalar reference, the compressed data will be stored in $$output_filename_or_reference.

    • An Array Reference

      If $output_filename_or_reference is an array reference, the compressed data will be pushed onto the array.

    • An Output FileGlob

      If $output_filename_or_reference is a string that is delimited by the characters "<" and ">", zip will assume that it is an output fileglob string. The output is the list of files that match the fileglob.

      When $output_filename_or_reference is a fileglob string, $input_filename_or_reference must also be a fileglob string. Anything else is an error.

      See File::GlobMapper for more details.

    If the $output_filename_or_reference parameter is any other type, undef will be returned.
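    The scalar-reference forms of both parameters can be combined to compress entirely in memory. The following is a minimal sketch, not taken from the module's own examples: the member name hello.txt is a made-up value, and IO::Uncompress::Unzip is used only to check the round trip.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw(zip $ZipError);
use IO::Uncompress::Unzip qw(unzip $UnzipError);

my $input = "hello world\n" x 3;   # uncompressed data held in memory
my $archive;                       # will receive the zip archive

# Scalar references on both sides: read from $input, write to $archive.
# Name is set explicitly (hypothetical value) because a buffer has no
# filename to take a default from.
zip \$input => \$archive, Name => 'hello.txt'
    or die "zip failed: $ZipError\n";

# Read the single member back to confirm the round trip.
my $output;
unzip \$archive => \$output
    or die "unzip failed: $UnzipError\n";

print $output eq $input ? "round trip ok\n" : "mismatch\n";
```

    Because the input is a buffer rather than a filename, none of the file-derived defaults (Name, Time and so on) apply, which is why Name is passed by hand here.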

    Notes

    When $input_filename_or_reference maps to multiple files/buffers and $output_filename_or_reference is a single file/buffer, the input files/buffers will each be stored in $output_filename_or_reference as a distinct entry.
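    The many-inputs-to-one-output case can be sketched as follows. This is an illustration under stated assumptions: the two input files are created in a scratch directory just for the example, their names are hypothetical, and the archive is read back with IO::Uncompress::Unzip purely to list the entries.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp     qw(tempdir);
use IO::Compress::Zip     qw(zip $ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

# Create two small input files in a scratch directory (hypothetical names).
my $dir   = tempdir(CLEANUP => 1);
my @files = map { "$dir/$_" } qw(alpha.txt beta.txt);
for my $file (@files) {
    open my $fh, '>', $file or die "Cannot create '$file': $!\n";
    print {$fh} "contents of ", basename($file), "\n";
    close $fh or die "Cannot close '$file': $!\n";
}

# Array reference in, single buffer out: one archive entry per input file.
# FilterName strips the scratch directory from each stored name.
my $archive;
zip \@files => \$archive, FilterName => sub { $_ = basename($_) }
    or die "zip failed: $ZipError\n";

# Walk the archive and collect the member names.
my @names;
my $u = IO::Uncompress::Unzip->new(\$archive)
    or die "Cannot read archive: $UnzipError\n";
for (my $status = 1; $status > 0; $status = $u->nextStream()) {
    push @names, $u->getHeaderInfo()->{Name};
    my $chunk;
    1 while ($status = $u->read($chunk)) > 0;   # skip over the member data
    last if $status < 0;
}
print "members: @names\n";
```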

    Optional Parameters

    Unless specified below, the optional parameters for zip, OPTS, are the same as those used with the OO interface defined in the Constructor Options section below.

    • AutoClose => 0|1

      This option applies to any input or output data streams to zip that are filehandles.

      If AutoClose is specified, and the value is true, it will result in all input and/or output filehandles being closed once zip has completed.

      This parameter defaults to 0.

    • BinModeIn => 0|1

      When reading from a file or filehandle, set binmode before reading.

      Defaults to 0.

    • Append => 0|1

      The behaviour of this option is dependent on the type of output data stream.

      • A Buffer

        If Append is enabled, all compressed data will be appended to the end of the output buffer. Otherwise the output buffer will be cleared before any compressed data is written to it.

      • A Filename

        If Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any compressed data is written to it.

      • A Filehandle

        If Append is enabled, the filehandle will be positioned to the end of the file via a call to seek before any compressed data is written to it. Otherwise the file pointer will not be moved.

      When Append is specified, and set to true, it will append all compressed data to the output data stream.

      So when the output is a filehandle it will carry out a seek to the eof before writing any compressed data. If the output is a filename, it will be opened for appending. If the output is a buffer, all compressed data will be appended to the existing buffer.

      Conversely when Append is not specified, or it is present and is set to false, it will operate as follows.

      When the output is a filename, it will truncate the contents of the file before writing any compressed data. If the output is a filehandle its position will not be changed. If the output is a buffer, it will be wiped before any compressed data is output.

      Defaults to 0.
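    The buffer case of Append can be illustrated with a short sketch. The marker string EXISTING and the member name payload.txt are invented for the example; the behaviour shown is the one described above for output buffers.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError);

my $data = "payload\n";

# Without Append the output buffer is cleared before any data is written.
my $fresh = "EXISTING";
zip \$data => \$fresh, Name => 'payload.txt'
    or die "zip failed: $ZipError\n";

# With Append the existing contents are kept and the archive follows them.
my $appended = "EXISTING";
zip \$data => \$appended, Name => 'payload.txt', Append => 1
    or die "zip failed: $ZipError\n";

printf "fresh buffer begins with (hex):  %s\n", unpack 'H8', $fresh;
printf "appended buffer begins with:     %s\n", substr $appended, 0, 8;
```

    Note that a buffer with leading non-zip data is not necessarily readable by every unzip tool, so Append on a buffer is mostly useful when the caller manages the framing itself.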

    Examples

    To read the contents of the file file1.txt and write the compressed data to the file file1.txt.zip:

        use strict ;
        use warnings ;
        use IO::Compress::Zip qw(zip $ZipError) ;
        my $input = "file1.txt";
        zip $input => "$input.zip"
            or die "zip failed: $ZipError\n";

    To read from an existing Perl filehandle, $input, and write the compressed data to a buffer, $buffer:

        use strict ;
        use warnings ;
        use IO::Compress::Zip qw(zip $ZipError) ;
        use IO::File ;
        my $input = IO::File->new("<file1.txt")
            or die "Cannot open 'file1.txt': $!\n" ;
        my $buffer ;
        zip $input => \$buffer
            or die "zip failed: $ZipError\n";

    To create a zip file, output.zip, that contains the compressed contents of the files alpha.txt and beta.txt:

        use strict ;
        use warnings ;
        use IO::Compress::Zip qw(zip $ZipError) ;
        zip [ 'alpha.txt', 'beta.txt' ] => 'output.zip'
            or die "zip failed: $ZipError\n";

    Alternatively, rather than having to explicitly name each of the files that you want to compress, you could use a fileglob to select all the txt files in the current directory, as follows:

        use strict ;
        use warnings ;
        use IO::Compress::Zip qw(zip $ZipError) ;
        my @files = <*.txt>;
        zip \@files => 'output.zip'
            or die "zip failed: $ZipError\n";

    or more succinctly:

        zip [ <*.txt> ] => 'output.zip'
            or die "zip failed: $ZipError\n";

    OO Interface

    Constructor

    The format of the constructor for IO::Compress::Zip is shown below:

        my $z = new IO::Compress::Zip $output [,OPTS]
            or die "IO::Compress::Zip failed: $ZipError\n";

    It returns an IO::Compress::Zip object on success and undef on failure. The variable $ZipError will contain an error message on failure.

    If you are running Perl 5.005 or better, the object $z returned from IO::Compress::Zip can be used exactly like an IO::File filehandle. This means that all normal output file operations can be carried out with $z. For example, to write to a compressed file/buffer you can use either of these forms:

        $z->print("hello world\n");
        print $z "hello world\n";

    The mandatory parameter $output is used to control the destination of the compressed data. This parameter can take one of these forms.

    • A filename

      If the $output parameter is a simple scalar, it is assumed to be a filename. This file will be opened for writing and the compressed data will be written to it.

    • A filehandle

      If the $output parameter is a filehandle, the compressed data will be written to it. The string '-' can be used as an alias for standard output.

    • A scalar reference

      If $output is a scalar reference, the compressed data will be stored in $$output.

    If the $output parameter is any other type, IO::Compress::Zip::new will return undef.
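    As a minimal sketch of the OO interface, the code below writes to a scalar buffer using both the method form and the filehandle form of print. The member name greeting.txt is a hypothetical value, and IO::Uncompress::Unzip is used only to show what ended up in the archive.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw($ZipError);
use IO::Uncompress::Unzip qw(unzip $UnzipError);

my $archive;   # scalar buffer that receives the compressed data

# A scalar reference as $output; Name labels the single archive member.
my $z = IO::Compress::Zip->new(\$archive, Name => 'greeting.txt')
    or die "IO::Compress::Zip failed: $ZipError\n";

$z->print("hello world\n");      # method form
print {$z} "goodbye world\n";    # filehandle form
$z->close()
    or die "close failed: $ZipError\n";

# Decompress again to check what was stored.
my $text;
unzip \$archive => \$text
    or die "unzip failed: $UnzipError\n";
print $text;
```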

    Constructor Options

    OPTS is any combination of the following options:

    • AutoClose => 0|1

      This option is only valid when the $output parameter is a filehandle. If specified, and the value is true, it will result in the $output being closed once either the close method is called or the IO::Compress::Zip object is destroyed.

      This parameter defaults to 0.

    • Append => 0|1

      Opens $output in append mode.

      The behaviour of this option is dependent on the type of $output.

      • A Buffer

        If $output is a buffer and Append is enabled, all compressed data will be appended to the end of $output. Otherwise $output will be cleared before any data is written to it.

      • A Filename

        If $output is a filename and Append is enabled, the file will be opened in append mode. Otherwise the contents of the file, if any, will be truncated before any compressed data is written to it.

      • A Filehandle

        If $output is a filehandle and Append is enabled, the file pointer will be positioned to the end of the file via a call to seek before any compressed data is written to it. Otherwise the file pointer will not be moved.

      This parameter defaults to 0.

    • Name => $string

      Stores the contents of $string in the zip filename header field.

      If Name is not specified and the $input parameter is a filename, the value of $input will be used for the zip filename header field.

      If Name is not specified and the $input parameter is not a filename, no zip filename field will be created.

      Note that both the CanonicalName and FilterName options can modify the value used for the zip filename header field.

    • CanonicalName => 0|1

      This option controls whether the filename field in the zip header is normalized into Unix format before being written to the zip file.

      It is recommended that you enable this option unless you really need to create a non-standard Zip file.

      This is what APPNOTE.TXT has to say on what should be stored in the zip filename header field:

          The name of the file, with optional relative path.
          The path stored should not contain a drive or
          device letter, or a leading slash. All slashes
          should be forward slashes '/' as opposed to
          backwards slashes '\' for compatibility with Amiga
          and UNIX file systems etc.

      This option defaults to false.

    • FilterName => sub { ... }

      This option allows the filename field in the zip header to be modified before it is written to the zip file.

      This option takes a parameter that must be a reference to a sub. On entry to the sub the $_ variable will contain the name to be filtered. If no filename is available $_ will contain an empty string.

      The value of $_ when the sub returns will be stored in the filename header field.

      Note that if CanonicalName is enabled, a normalized filename will be passed to the sub.

      If you use FilterName to modify the filename, it is your responsibility to keep the filename in Unix format.

      Although this option can be used with the OO interface, it is of most use with the one-shot interface. For example, the code below shows how FilterName can be used to remove the path component from a series of filenames before they are stored in $zipfile.

          sub compressTxtFiles
          {
              my $zipfile = shift ;
              my $dir = shift ;
              # \Q...\E stops regex metacharacters in $dir being interpreted
              zip [ <$dir/*.txt> ] => $zipfile,
                  FilterName => sub { s[^\Q$dir\E/][] } ;
          }

    • Time => $number

      Sets the last modified time field in the zip header to $number.

      This field defaults to the time the IO::Compress::Zip object was created if this option is not specified and the $input parameter is not a filename.

    • ExtAttr => $attr

      This option controls the "external file attributes" field in the central header of the zip file. This is a 4 byte field.

      If you are running a Unix derivative this value defaults to

          0100644 << 16

      This should allow read/write access to any files that are extracted from the zip file/buffer.

      For all other systems it defaults to 0.

    • exTime => [$atime, $mtime, $ctime]

      This option expects an array reference with exactly three elements: $atime, $mtime and $ctime. These correspond to the last access time, last modification time and creation time respectively.

      It uses these values to set the extended timestamp field (ID is "UT") in the local zip header using the three values, $atime, $mtime, $ctime. In addition it sets the extended timestamp field in the central zip header using $mtime.

      If any of the three values is undef, that time value will not be used. So, for example, to set only the $mtime you would use this:

          exTime => [undef, $mtime, undef]

      If the Minimal option is set to true, this option will be ignored.

      By default no extended time field is created.

    • exUnix2 => [$uid, $gid]

      This option expects an array reference with exactly two elements: $uid and $gid. These values correspond to the numeric User ID (UID) and Group ID (GID) of the owner of the files respectively.

      When the exUnix2 option is present it will trigger the creation of a Unix2 extra field (ID is "Ux") in the local zip header. This will be populated with $uid and $gid. An empty Unix2 extra field will also be created in the central zip header.

      Note - The UID & GID are stored as 16-bit integers in the "Ux" field. Use exUnixN if your UID or GID are 32-bit.

      If the Minimal option is set to true, this option will be ignored.

      By default no Unix2 extra field is created.

    • exUnixN => [$uid, $gid]

      This option expects an array reference with exactly two elements: $uid and $gid. These values correspond to the numeric User ID (UID) and Group ID (GID) of the owner of the files respectively.

      When the exUnixN option is present it will trigger the creation of a UnixN extra field (ID is "ux") in both the local and central zip headers. This will be populated with $uid and $gid. The UID & GID are stored as 32-bit integers.

      If the Minimal option is set to true, this option will be ignored.

      By default no UnixN extra field is created.

    • Comment => $comment

      Stores the contents of $comment in the Central File Header of the zip file.

      By default, no comment field is written to the zip file.

    • ZipComment => $comment

      Stores the contents of $comment in the End of Central Directory record of the zip file.

      By default, no comment field is written to the zip file.

    • Method => $method

      Controls which compression method is used. At present four compression methods are supported, namely Store (no compression at all), Deflate, Bzip2 and Lzma.

      The symbols, ZIP_CM_STORE, ZIP_CM_DEFLATE, ZIP_CM_BZIP2 and ZIP_CM_LZMA are used to select the compression method.

      These constants are not imported by IO::Compress::Zip by default. Any one of the following will import them:

          use IO::Compress::Zip qw(:zip_method);
          use IO::Compress::Zip qw(:constants);
          use IO::Compress::Zip qw(:all);

      Note that to create Bzip2 content, the module IO::Compress::Bzip2 must be installed. A fatal error will be thrown if you attempt to create Bzip2 content when IO::Compress::Bzip2 is not available.

      Note that to create Lzma content, the module IO::Compress::Lzma must be installed. A fatal error will be thrown if you attempt to create Lzma content when IO::Compress::Lzma is not available.

      The default method is ZIP_CM_DEFLATE.

    • Stream => 0|1

      This option controls whether the zip file/buffer output is created in streaming mode.

      Note that when outputting to a file with streaming mode disabled (Stream is 0), the output file must be seekable.

      The default is 1.

    • Zip64 => 0|1

      Create a Zip64 zip file/buffer. This option is used if you want to store files larger than 4 Gig or store more than 64K files in a single zip archive.

      Zip64 will be automatically set, as needed, if working with the one-shot interface when the input is either a filename or a scalar reference.

      If you intend to manipulate the Zip64 zip files created with this module using an external zip/unzip, make sure that it supports Zip64.

      In particular, if you are using Info-Zip you need to have zip version 3.x or better to update a Zip64 archive and unzip version 6.x to read a zip64 archive.

      The default is 0.

    • TextFlag => 0|1

      This parameter controls the setting of a bit in the zip central header. It is used to signal that the data stored in the zip file/buffer is probably text.

      In one-shot mode this flag will be set to true if the Perl -T operator thinks the file contains text.

      The default is 0.

    • ExtraFieldLocal => $data
    • ExtraFieldCentral => $data

      The ExtraFieldLocal option is used to store additional metadata in the local header for the zip file/buffer. The ExtraFieldCentral does the same for the matching central header.

      An extra field consists of zero or more subfields. Each subfield consists of a two byte header followed by the subfield data.

      The list of subfields can be supplied in any of the following formats:

          ExtraFieldLocal => [$id1, $data1,
                              $id2, $data2,
                              ...
                             ]

          ExtraFieldLocal => [ [$id1 => $data1],
                               [$id2 => $data2],
                               ...
                             ]

          ExtraFieldLocal => { $id1 => $data1,
                               $id2 => $data2,
                               ...
                             }

      Where $id1, $id2 are two byte subfield IDs.

      If you use the hash syntax, you have no control over the order in which the ExtraSubFields are stored, plus you cannot have SubFields with duplicate ID.

      Alternatively the list of subfields can be supplied as a scalar, thus

          ExtraFieldLocal => $rawdata

      In this case IO::Compress::Zip will check that $rawdata consists of zero or more conformant sub-fields.

      The Extended Time field (ID "UT"), set using the exTime option, and the Unix2 extra field (ID "Ux"), set using the exUnix2 option, are examples of extra fields.

      If the Minimal option is set to true, this option will be ignored.

      The maximum size of an extra field is 65535 bytes.

    • Minimal => 1|0

      If specified, this option will disable the creation of all extra fields in the zip local and central headers. So the exTime, exUnix2, exUnixN, ExtraFieldLocal and ExtraFieldCentral options will be ignored.

      This parameter defaults to 0.

    • BlockSize100K => number

      Specify the number of 100K blocks bzip2 uses during compression.

      Valid values are from 1 to 9, where 9 is best compression.

      This option is only valid if the Method is ZIP_CM_BZIP2. It is ignored otherwise.

      The default is 1.

    • WorkFactor => number

      Specifies how much effort bzip2 should take before resorting to a slower fallback compression algorithm.

      Valid values range from 0 to 250, where 0 means use the default value 30.

      This option is only valid if the Method is ZIP_CM_BZIP2. It is ignored otherwise.

      The default is 0.

    • Preset => number

      Used to choose the LZMA compression preset.

      Valid values are 0-9 and LZMA_PRESET_DEFAULT.

      0 is the fastest compression with the lowest memory usage and the lowest compression.

      9 is the slowest compression with the highest memory usage but with the best compression.

      This option is only valid if the Method is ZIP_CM_LZMA. It is ignored otherwise.

      Defaults to LZMA_PRESET_DEFAULT (6).

    • Extreme => 0|1

      Makes LZMA compression a lot slower, but gives a small compression gain.

      This option is only valid if the Method is ZIP_CM_LZMA. It is ignored otherwise.

      Defaults to 0.

    • -Level

      Defines the compression level used by zlib. The value should either be a number between 0 and 9 (0 means no compression and 9 is maximum compression), or one of the symbolic constants defined below.

          Z_NO_COMPRESSION
          Z_BEST_SPEED
          Z_BEST_COMPRESSION
          Z_DEFAULT_COMPRESSION

      The default is Z_DEFAULT_COMPRESSION.

      Note, these constants are not imported by IO::Compress::Zip by default. Any one of the following will import them:

          use IO::Compress::Zip qw(:level);
          use IO::Compress::Zip qw(:constants);
          use IO::Compress::Zip qw(:all);

    • -Strategy

      Defines the strategy used to tune the compression. Use one of the symbolic constants defined below.

          Z_FILTERED
          Z_HUFFMAN_ONLY
          Z_RLE
          Z_FIXED
          Z_DEFAULT_STRATEGY

      The default is Z_DEFAULT_STRATEGY.

    • Strict => 0|1

      This is a placeholder option.
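    To show how the Method and Level options interact with the import tags, here is a small sketch that stores the same data twice, once uncompressed and once deflated at the highest level. The input data and the member name data.txt are made up for the illustration; the one-shot interface is used for brevity, but the same options apply to the constructor.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw(zip $ZipError :zip_method :level);

my $data = "abcdefghij" x 1000;   # highly repetitive input (10000 bytes)
my ($stored, $deflated);

# Store: no compression at all, so the payload appears verbatim
# inside the archive, surrounded by the zip headers.
zip \$data => \$stored, Name => 'data.txt', Method => ZIP_CM_STORE
    or die "zip failed: $ZipError\n";

# Deflate (the default method) at maximum compression.
zip \$data => \$deflated, Name => 'data.txt',
    Method => ZIP_CM_DEFLATE, Level => Z_BEST_COMPRESSION
    or die "zip failed: $ZipError\n";

printf "input: %d bytes, stored: %d bytes, deflated: %d bytes\n",
    length $data, length $stored, length $deflated;
```

    The stored archive is slightly larger than the input because of the header overhead, while the deflated one is much smaller for data this repetitive.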

    Examples

    TODO

    Methods

    print

    Usage is

        $z->print($data)
        print $z $data

    Compresses and outputs the contents of the $data parameter. This has the same behaviour as the print built-in.

    Returns true if successful.

    printf

    Usage is

        $z->printf($format, $data)
        printf $z $format, $data

    Compresses and outputs the contents of the $data parameter.

    Returns true if successful.

    syswrite

    Usage is

        $z->syswrite($data)
        $z->syswrite($data, $length)
        $z->syswrite($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    write

    Usage is

        $z->write($data)
        $z->write($data, $length)
        $z->write($data, $length, $offset)

    Compresses and outputs the contents of the $data parameter.

    Returns the number of uncompressed bytes written, or undef if unsuccessful.

    flush

    Usage is

        $z->flush;
        $z->flush($flush_type);

    Flushes any pending compressed data to the output file/buffer.

    This method takes an optional parameter, $flush_type, that controls how the flushing will be carried out. By default the $flush_type used is Z_FINISH. Other valid values for $flush_type are Z_NO_FLUSH, Z_SYNC_FLUSH, Z_FULL_FLUSH and Z_BLOCK. It is strongly recommended that you only set the $flush_type parameter if you fully understand the implications of what it does - overuse of flush can seriously degrade the level of compression achieved. See the zlib documentation for details.

    Returns true on success.

    tell

    Usage is

        $z->tell()
        tell $z

    Returns the uncompressed file offset.

    eof

    Usage is

        $z->eof();
        eof($z);

    Returns true if the close method has been called.

    seek

        $z->seek($position, $whence);
        seek($z, $position, $whence);

    Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the output file/buffer. It is a fatal error to attempt to seek backward.

    Empty parts of the file/buffer will have NULL (0x00) bytes written to them.

    The $whence parameter takes one of the usual values, namely SEEK_SET, SEEK_CUR or SEEK_END.

    Returns 1 on success, 0 on failure.

    binmode

    Usage is

        $z->binmode
        binmode $z ;

    This is a no-op provided for completeness.

    opened

        $z->opened()

    Returns true if the object currently refers to an open file/buffer.

    autoflush

        my $prev = $z->autoflush()
        my $prev = $z->autoflush(EXPR)

    If the $z object is associated with a file or a filehandle, this method returns the current autoflush setting for the underlying filehandle. If EXPR is present, and is non-zero, it will enable flushing after every write/print operation.

    If $z is associated with a buffer, this method has no effect and always returns undef.

    Note that the special variable $| cannot be used to set or retrieve the autoflush setting.

    input_line_number

        $z->input_line_number()
        $z->input_line_number(EXPR)

    This method always returns undef when compressing.

    fileno

        $z->fileno()
        fileno($z)

    If the $z object is associated with a file or a filehandle, fileno will return the underlying file descriptor. Once the close method is called fileno will return undef.

    If the $z object is associated with a buffer, this method will return undef.

    close

        $z->close() ;
        close $z ;

    Flushes any pending compressed data and then closes the output file/buffer.

    For most versions of Perl this method will be automatically invoked if the IO::Compress::Zip object is destroyed (either explicitly or by the variable with the reference to the object going out of scope). The exceptions are Perl versions 5.005 through 5.00504 and 5.8.0. In these cases, the close method will be called automatically, but not until global destruction of all live objects when the program is terminating.

    Therefore, if you want your scripts to be able to run on all versions of Perl, you should call close explicitly and not rely on automatic closing.

    Returns true on success, otherwise 0.

    If the AutoClose option has been enabled when the IO::Compress::Zip object was created, and the object is associated with a file, the underlying file will also be closed.
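    Several of the methods above can be exercised together in a short sketch. The member name log.txt is hypothetical; the byte counts in the comments follow from the strings being written, and tell reports the uncompressed offset as described above.

```perl
use strict;
use warnings;
use IO::Compress::Zip qw($ZipError);

my $archive;
my $z = IO::Compress::Zip->new(\$archive, Name => 'log.txt')
    or die "IO::Compress::Zip failed: $ZipError\n";

$z->print("hello\n");            # 6 bytes of uncompressed data
$z->printf("%s\n", "world");     # 6 more bytes
$z->write("!!\n");               # 3 more bytes

my $offset     = $z->tell();     # uncompressed bytes written so far
my $eof_before = $z->eof();      # false: close has not been called yet

$z->close() or die "close failed: $ZipError\n";
my $eof_after  = $z->eof();      # true once close has been called

print "uncompressed offset before close: $offset\n";
```

    Calling close explicitly, rather than relying on the object going out of scope, follows the portability advice given for the close method.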

    newStream([OPTS])

    Usage is

        $z->newStream( [OPTS] )

    Closes the current compressed data stream and starts a new one.

    OPTS consists of any of the options that are available when creating the $z object.

    See the Constructor Options section for more details.
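    One common use of newStream is to write several members into one archive through the OO interface. The sketch below assumes two made-up member names, first.txt and second.txt, and reads the result back with IO::Uncompress::Unzip to show that each stream became a separate entry.

```perl
use strict;
use warnings;
use IO::Compress::Zip     qw($ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $archive;
my $z = IO::Compress::Zip->new(\$archive, Name => 'first.txt')
    or die "IO::Compress::Zip failed: $ZipError\n";
$z->print("first member\n");

# Finish 'first.txt' and begin a second member in the same archive.
$z->newStream(Name => 'second.txt')
    or die "newStream failed: $ZipError\n";
$z->print("second member\n");
$z->close() or die "close failed: $ZipError\n";

# Read the archive back, one member at a time.
my %content;
my $u = IO::Uncompress::Unzip->new(\$archive)
    or die "Cannot read archive: $UnzipError\n";
for (my $status = 1; $status > 0; $status = $u->nextStream()) {
    my $name = $u->getHeaderInfo()->{Name};
    $content{$name} = '';
    my $chunk;
    while (($status = $u->read($chunk)) > 0) {
        $content{$name} .= $chunk;   # read() overwrites $chunk each call
    }
    last if $status < 0;
}
print "$_: $content{$_}" for sort keys %content;
```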

    deflateParams

    Usage is

        $z->deflateParams

    TODO

    Importing

    A number of symbolic constants are required by some methods in IO::Compress::Zip. None are imported by default.

    • :all

      Imports zip, $ZipError and all symbolic constants that can be used by IO::Compress::Zip. Same as doing this

          use IO::Compress::Zip qw(zip $ZipError :constants) ;

    • :constants

      Import all symbolic constants. Same as doing this

          use IO::Compress::Zip qw(:flush :level :strategy :zip_method) ;

    • :flush

      These symbolic constants are used by the flush method.

          Z_NO_FLUSH
          Z_PARTIAL_FLUSH
          Z_SYNC_FLUSH
          Z_FULL_FLUSH
          Z_FINISH
          Z_BLOCK

    • :level

      These symbolic constants are used by the Level option in the constructor.

          Z_NO_COMPRESSION
          Z_BEST_SPEED
          Z_BEST_COMPRESSION
          Z_DEFAULT_COMPRESSION

    • :strategy

      These symbolic constants are used by the Strategy option in the constructor.

          Z_FILTERED
          Z_HUFFMAN_ONLY
          Z_RLE
          Z_FIXED
          Z_DEFAULT_STRATEGY

    • :zip_method

      These symbolic constants are used by the Method option in the constructor.

          ZIP_CM_STORE
          ZIP_CM_DEFLATE
          ZIP_CM_BZIP2
          ZIP_CM_LZMA

    EXAMPLES

    Apache::GZip Revisited

    See IO::Compress::FAQ

    Working with Net::FTP

    See IO::Compress::FAQ

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    I18N::Collate

    NAME

    I18N::Collate - compare 8-bit scalar data according to the current locale

    SYNOPSIS

        use I18N::Collate;
        setlocale(LC_COLLATE, 'locale-of-your-choice');
        $s1 = I18N::Collate->new("scalar_data_1");
        $s2 = I18N::Collate->new("scalar_data_2");

    DESCRIPTION

        ***

        WARNING: starting from the Perl version 5.003_06
        the I18N::Collate interface for comparing 8-bit scalar data
        according to the current locale

            HAS BEEN DEPRECATED

        That is, please do not use it anymore for any new applications
        and please migrate the old applications away from it because its
        functionality was integrated into the Perl core language in the
        release 5.003_06.

        See the perllocale manual page for further information.

        ***

    This module provides you with objects that will collate according to your national character set, provided that the POSIX setlocale() function is supported on your system.

    You can compare $s1 and $s2 above with

        $s1 le $s2

    To extract the data itself, you'll need a dereference: $$s1

    This module uses POSIX::setlocale(). The basic collation conversion is done by strxfrm() which terminates at NUL characters being a decent C routine. collate_xfrm() handles embedded NUL characters gracefully.
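    Since the deprecation notice points to the core locale support described in perllocale, a minimal sketch of the replacement may help. It assumes the standard locale pragma and POSIX::setlocale; the "C" locale is chosen only because it is always available, and the word list is made up.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_COLLATE);

# "C" is used here only because it exists on every system; substitute
# the locale of your choice.
setlocale(LC_COLLATE, "C")
    or die "Cannot set LC_COLLATE\n";

my @words = qw(banana Apple cherry);
my @sorted;
{
    use locale;    # within this block, cmp follows LC_COLLATE
    @sorted = sort { $a cmp $b } @words;
}
print "@sorted\n";
```

    With a national locale in place of "C", the same comparison would follow that locale's collation rules, with no I18N::Collate objects involved.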

    The available locales depend on your operating system; try whether locale -a shows them or man pages for "locale" or "nlsinfo" or the direct approach ls /usr/lib/nls/loc or ls /usr/lib/nls or ls /usr/lib/locale . Not all the locales that your vendor supports are necessarily installed: please consult your operating system's documentation and possibly your local system administration. The locale names are probably something like xx_XX.(ISO)?8859-N or xx_XX.(ISO)?8859N, for example fr_CH.ISO8859-1 is the Swiss (CH) variant of French (fr), ISO Latin (8859) 1 (-1) which is the Western European character set.

     
    I18N::LangTags

    NAME

    I18N::LangTags - functions for dealing with RFC3066-style language tags

    SYNOPSIS

        use I18N::LangTags();

    ...or specify whichever of those functions you want to import, like so:

        use I18N::LangTags qw(implicate_supers similarity_language_tag);

    All the exportable functions are listed below -- you're free to import only some, or none at all. By default, none are imported. If you say:

        use I18N::LangTags qw(:ALL)

    ...then all are exported. (This saves you from having to use something less obvious like use I18N::LangTags qw(/./).)

    If you don't import any of these functions, assume a &I18N::LangTags:: in front of all the function names in the following examples.

    DESCRIPTION

    Language tags are a formalism, described in RFC 3066 (obsoleting 1766), for declaring what language form (language and possibly dialect) a given chunk of information is in.

    This library provides functions for common tasks involving language tags as they are needed in a variety of protocols and applications.

    Please see the "See Also" references for a thorough explanation of how to correctly use language tags.

    • the function is_language_tag($lang1)

      Returns true iff $lang1 is a formally valid language tag.

          is_language_tag("fr") is TRUE
          is_language_tag("x-jicarilla") is FALSE
              (Subtags can be 8 chars long at most -- 'jicarilla' is 9)
          is_language_tag("sgn-US") is TRUE
              (That's American Sign Language)
          is_language_tag("i-Klikitat") is TRUE
              (True without regard to the fact no one has actually
               registered Klikitat -- it's a formally valid tag)
          is_language_tag("fr-patois") is TRUE
              (Formally valid -- altho descriptively weak!)
          is_language_tag("Spanish") is FALSE
          is_language_tag("french-patois") is FALSE
              (No good -- first subtag has to match
               /^([xXiI]|[a-zA-Z]{2,3})$/ -- see RFC3066)
          is_language_tag("x-borg-prot2532") is TRUE
              (Yes, subtags can contain digits, as of RFC3066)

    • the function extract_language_tags($whatever)

      Returns a list of whatever looks like formally valid language tags in $whatever. Not very smart, so don't get too creative with what you want to feed it.

          extract_language_tags("fr, fr-ca, i-mingo")
              returns: ('fr', 'fr-ca', 'i-mingo')
          extract_language_tags("It's like this: I'm in fr -- French!")
              returns: ('It', 'in', 'fr')
              (So don't just feed it any old thing.)

      The output is untainted. If you don't know what tainting is, don't worry about it.

    • the function same_language_tag($lang1, $lang2)

      Returns true iff $lang1 and $lang2 are acceptable variant tags representing the same language-form.

          same_language_tag('x-kadara', 'i-kadara') is TRUE
              (The x/i- alternation doesn't matter)
          same_language_tag('X-KADARA', 'i-kadara') is TRUE
              (...and neither does case)
          same_language_tag('en', 'en-US') is FALSE
              (all-English is not the SAME as US English)
          same_language_tag('x-kadara', 'x-kadar') is FALSE
              (these are totally unrelated tags)
          same_language_tag('no-bok', 'nb') is TRUE
              (no-bok is a legacy tag for nb (Norwegian Bokmal))

      same_language_tag works by just seeing whether encode_language_tag($lang1) is the same as encode_language_tag($lang2) .

      (Yes, I know this function is named a bit oddly. Call it historic reasons.)

    • the function similarity_language_tag($lang1, $lang2)

      Returns an integer representing the degree of similarity between tags $lang1 and $lang2 (the order of which does not matter), where similarity is the number of common elements on the left, without regard to case and to x/i- alternation.

          similarity_language_tag('fr', 'fr-ca') is 1
              (one element in common)
          similarity_language_tag('fr-ca', 'fr-FR') is 1
              (one element in common)
          similarity_language_tag('fr-CA-joual',
                                  'fr-CA-PEI') is 2
          similarity_language_tag('fr-CA-joual', 'fr-CA') is 2
              (two elements in common)
          similarity_language_tag('x-kadara', 'i-kadara') is 1
              (x/i- doesn't matter)
          similarity_language_tag('en', 'x-kadar') is 0
          similarity_language_tag('x-kadara', 'x-kadar') is 0
              (unrelated tags -- no similarity)
          similarity_language_tag('i-cree-syllabic',
                                  'i-cherokee-syllabic') is 0
              (no leftmost elements in common!)
    • the function is_dialect_of($lang1, $lang2)

      Returns true iff language tag $lang1 represents a subform of language tag $lang2.

      Get the order right! It doesn't work the other way around!

          is_dialect_of('en-US', 'en') is TRUE
              (American English IS a dialect of all-English)
          is_dialect_of('fr-CA-joual', 'fr-CA') is TRUE
          is_dialect_of('fr-CA-joual', 'fr') is TRUE
              (Joual is a dialect of (a dialect of) French)
          is_dialect_of('en', 'en-US') is FALSE
              (all-English is NOT a dialect of American English)
          is_dialect_of('fr', 'en-CA') is FALSE
          is_dialect_of('en', 'en' ) is TRUE
          is_dialect_of('en-US', 'en-US') is TRUE
              (Note: these are degenerate cases)
          is_dialect_of('i-mingo-tom', 'x-Mingo') is TRUE
              (the x/i thing doesn't matter, nor does case)
          is_dialect_of('nn', 'no') is TRUE
              (because 'nn' (New Norse) is aliased to 'no-nyn',
               as a special legacy case, and 'no-nyn' is a
               subform of 'no' (Norwegian))
    • the function super_languages($lang1)

      Returns a list of language tags that are superordinate tags to $lang1 -- it gets this by removing subtags from the end of $lang1 until nothing (or just "i" or "x") is left.

          super_languages("fr-CA-joual") is ("fr-CA", "fr")
          super_languages("en-AU") is ("en")
          super_languages("en") is empty-list, ()
          super_languages("i-cherokee") is empty-list, ()
              ...not ("i"), which would be illegal as well as pointless.

      If $lang1 is not a valid language tag, returns empty-list in a list context, undef in a scalar context.

      A notable and rather unavoidable problem with this method: "x-mingo-tom" has an "x" because the whole tag isn't an IANA-registered tag -- but super_languages('x-mingo-tom') is ('x-mingo') -- which isn't really right, since 'i-mingo' is registered. But this module has no way of knowing that. (But note that same_language_tag('x-mingo', 'i-mingo') is TRUE.)

      More importantly, you assume at your peril that superordinates of $lang1 are mutually intelligible with $lang1. Consider this carefully.

    • the function locale2language_tag($locale_identifier)

      This takes a locale name (like "en", "en_US", or "en_US.ISO8859-1") and maps it to a language tag. If it's not mappable (as with, notably, "C" and "POSIX"), this returns empty-list in a list context, or undef in a scalar context.

          locale2language_tag("en") is "en"
          locale2language_tag("en_US") is "en-US"
          locale2language_tag("en_US.ISO8859-1") is "en-US"
          locale2language_tag("C") is undef or ()
          locale2language_tag("POSIX") is undef or ()

      I'm not totally sure that locale names map satisfactorily to language tags. Think REAL hard about how you use this. YOU HAVE BEEN WARNED.

      The output is untainted. If you don't know what tainting is, don't worry about it.

    • the function encode_language_tag($lang1)

      This function, if given a language tag, returns an encoding of it such that:

      * tags representing different languages never get the same encoding.

      * tags representing the same language always get the same encoding.

      * an encoding of a formally valid language tag always is a string value that is defined, has length, and is true if considered as a boolean.

      Note that the encoding itself is not a formally valid language tag. Note also that you cannot, currently, go from an encoding back to a language tag that it's an encoding of.

      Note also that you must consider the encoded value as atomic; i.e., you should not consider it as anything but an opaque, unanalysable string value. (The internals of the encoding method may change in future versions, as the language tagging standard changes over time.)

      encode_language_tag returns undef if given anything other than a formally valid language tag.

      encode_language_tag exists because different language tags may represent the same language. This is normally handled with same_language_tag, but consider this situation:

      You have a data file that expresses greetings in different languages. Its format is "[language tag]=[how to say 'Hello']", like:

          en-US=Hiho
          fr=Bonjour
          i-mingo=Hau'

      And suppose you write a program that reads that file and then runs as a daemon, answering client requests that specify a language tag and then expect the string that says how to greet in that language. So an interaction looks like:

          greeting-client asks:    fr
          greeting-server answers: Bonjour

      So far so good. But suppose the way you're implementing this is:

          my %greetings;
          die unless open(IN, "<in.dat");
          while(<IN>) {
              chomp;
              next unless /^([^=]+)=(.+)/s;
              my($lang, $expr) = ($1, $2);
              $greetings{$lang} = $expr;
          }
          close(IN);

      at which point %greetings has the contents:

          "en-US"   => "Hiho"
          "fr"      => "Bonjour"
          "i-mingo" => "Hau'"

      And suppose then that you answer client requests for language $wanted by just looking up $greetings{$wanted}.

      If the client asks for "fr", that will look up successfully in %greetings, to the value "Bonjour". And if the client asks for "i-mingo", that will look up successfully in %greetings, to the value "Hau'".

      But if the client asks for "i-Mingo" or "x-mingo", or "Fr", then the lookup in %greetings fails. That's the Wrong Thing.

      You could instead do lookups on $wanted with:

          use I18N::LangTags qw(same_language_tag);
          my $response = '';
          foreach my $l2 (keys %greetings) {
              if(same_language_tag($wanted, $l2)) {
                  $response = $greetings{$l2};
                  last;
              }
          }

      But that's rather inefficient. A better way to do it is to start your program with:

          use I18N::LangTags qw(encode_language_tag);
          my %greetings;
          die unless open(IN, "<in.dat");
          while(<IN>) {
              chomp;
              next unless /^([^=]+)=(.+)/s;
              my($lang, $expr) = ($1, $2);
              $greetings{
                  encode_language_tag($lang)
              } = $expr;
          }
          close(IN);

      and then answer client requests for language $wanted by looking up

          $greetings{encode_language_tag($wanted)}

      And that does the Right Thing.

    • the function alternate_language_tags($lang1)

      This function, if given a language tag, returns all language tags that are alternate forms of this language tag. (I.e., tags which refer to the same language.) This is meant to handle legacy tags caused by the minor changes in language tag standards over the years; and the x-/i- alternation is also dealt with.

      Note that this function does not try to equate new (and never-used, and unusable) ISO639-2 three-letter tags to old (and still in use) ISO639-1 two-letter equivalents -- like "ara" -> "ar" -- because "ara" has never been in use as an Internet language tag, and RFC 3066 stipulates that it never should be, since a shorter tag ("ar") exists.

      Examples:

          alternate_language_tags('no-bok') is ('nb')
          alternate_language_tags('nb') is ('no-bok')
          alternate_language_tags('he') is ('iw')
          alternate_language_tags('iw') is ('he')
          alternate_language_tags('i-hakka') is ('zh-hakka', 'x-hakka')
          alternate_language_tags('zh-hakka') is ('i-hakka', 'x-hakka')
          alternate_language_tags('en') is ()
          alternate_language_tags('x-mingo-tom') is ('i-mingo-tom')
          alternate_language_tags('x-klikitat') is ('i-klikitat')
          alternate_language_tags('i-klikitat') is ('x-klikitat')

      This function returns empty-list if given anything other than a formally valid language tag.

    • the function @langs = panic_languages(@accept_languages)

      This function takes a list of 0 or more language tags that constitute a given user's Accept-Language list, and returns a list of tags for other (non-super) languages that are probably acceptable to the user, to be used if all else fails.

      For example, if a user accepts only 'ca' (Catalan) and 'es' (Spanish), and the documents/interfaces you have available are just in German, Italian, and Chinese, then the user will most likely want the Italian one (and not the Chinese or German one!), instead of getting nothing. So panic_languages('ca', 'es') returns a list containing 'it' (Italian).

      English ('en') is always in the return list, but whether it's at the very end or not depends on the input languages. This function works by consulting an internal table that stipulates what common languages are "close" to each other.

      A useful construct you might consider using is:

          # super_languages() takes one tag at a time, so map over the list
          @fallbacks = map { super_languages($_) } @accept_languages;
          push @fallbacks, panic_languages(
              @accept_languages, @fallbacks,
          );
    • the function implicate_supers( ...languages... )

      This takes a list of strings (which are presumed to be language-tags; strings that aren't, are ignored); and after each one, this function inserts super-ordinate forms that don't already appear in the list. The original list, plus these insertions, is returned.

      In other words, it takes this:

          pt-br de-DE en-US fr pt-br-janeiro

      and returns this:

          pt-br pt de-DE de en-US en fr pt-br-janeiro

      This function is most useful in the idiom

          implicate_supers( I18N::LangTags::Detect::detect() );

      (See I18N::LangTags::Detect.)

    • the function implicate_supers_strictly( ...languages... )

      This works like implicate_supers except that the implicated forms are added to the end of the return list.

      In other words, implicate_supers_strictly takes a list of strings (which are presumed to be language-tags; strings that aren't, are ignored) and after the whole given list, it inserts the super-ordinate forms of all given tags, minus any tags that already appear in the input list.

      In other words, it takes this:

          pt-br de-DE en-US fr pt-br-janeiro

      and returns this:

          pt-br de-DE en-US fr pt-br-janeiro pt de en

      The reason this function has "_strictly" in its name is that when you're processing an Accept-Language list according to the RFCs, if you interpret the RFCs quite strictly, then you would use implicate_supers_strictly, but for normal use (i.e., common-sense use, as far as I'm concerned) you'd use implicate_supers.
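
The two orderings can be compared side by side in a short script. This is only a sketch: pick_language and the list of available languages are hypothetical, written for this illustration, and are not part of I18N::LangTags.

```perl
use strict;
use warnings;
use I18N::LangTags qw(implicate_supers implicate_supers_strictly panic_languages);

my @accept = qw(pt-br de-DE en-US fr pt-br-janeiro);

# Super-ordinate forms inserted right after each tag...
my @loose  = implicate_supers(@accept);
# ...or appended after the whole list.
my @strict = implicate_supers_strictly(@accept);

print "loose:  @loose\n";
print "strict: @strict\n";

# Hypothetical helper (not part of I18N::LangTags): return the first
# available language that matches the user's tags, their super-ordinate
# forms, or, failing that, a panic fallback.
sub pick_language {
    my ($available, @wanted) = @_;
    my %have = map { lc($_) => $_ } @$available;
    for my $tag (implicate_supers(@wanted), panic_languages(@wanted)) {
        return $have{ lc $tag } if exists $have{ lc $tag };
    }
    return;
}

# Documents exist only in German, Italian, and Chinese;
# the user accepts Catalan and Spanish.
my $choice = pick_language([qw(de it zh)], qw(ca es));
print "chosen: ", (defined $choice ? $choice : 'none'), "\n";
```

Lowercasing both sides of the comparison in pick_language is a design choice of this sketch; comparing encode_language_tag values instead would also catch x-/i- and legacy variants.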

    ABOUT LOWERCASING

    I've considered making all the above functions that output language tags return all those tags strictly in lowercase. Having all your language tags in lowercase does make some things easier. But you might as well just lowercase as you like, or call encode_language_tag($lang1) where appropriate.
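
Both options can be sketched in a few lines. The tag lists here are made up for illustration; the second half uses encode_language_tag only as an opaque comparison key, as recommended above.

```perl
use strict;
use warnings;
use I18N::LangTags qw(encode_language_tag);

# Option 1: lowercase the tags yourself.
my @tags    = ('fr-CA', 'EN-us', 'i-Mingo');
my @lowered = map { lc } @tags;
print "@lowered\n";

# Option 2: treat the encoding as an opaque key, here to drop tags
# that denote a language already seen (case and x-/i- variants).
my %seen;
my @distinct = grep { !$seen{ encode_language_tag($_) }++ }
               qw(en-US EN-us x-mingo i-Mingo fr);
print "@distinct\n";
```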

    ABOUT UNICODE PLAINTEXT LANGUAGE TAGS

    In some future version of I18N::LangTags, I plan to include support for RFC2482-style language tags -- which are basically just normal language tags with their ASCII characters shifted into Plane 14.

    SEE ALSO

    * I18N::LangTags::List

    * RFC 3066, http://www.ietf.org/rfc/rfc3066.txt, "Tags for the Identification of Languages". (Obsoletes RFC 1766)

    * RFC 2277, http://www.ietf.org/rfc/rfc2277.txt, "IETF Policy on Character Sets and Languages".

    * RFC 2231, http://www.ietf.org/rfc/rfc2231.txt, "MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations".

    * RFC 2482, http://www.ietf.org/rfc/rfc2482.txt, "Language Tagging in Unicode Plain Text".

    * Locale::Codes, in http://www.perl.com/CPAN/modules/by-module/Locale/

    * ISO 639-2, "Codes for the representation of names of languages", including two-letter and three-letter codes, http://www.loc.gov/standards/iso639-2/php/code_list.php

    * The IANA list of registered languages (hopefully up-to-date), http://www.iana.org/assignments/language-tags

    COPYRIGHT

    Copyright (c) 1998+ Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    The programs and documentation in this dist are distributed in the hope that they will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Sean M. Burke sburke@cpan.org

     
    I18N::Langinfo

    NAME

    I18N::Langinfo - query locale information

    SYNOPSIS

        use I18N::Langinfo;

    DESCRIPTION

    The langinfo() function queries various locale information that can be used to localize output and user interfaces. It requires one numeric argument that identifies the locale constant to query; if no argument is supplied, $_ is used. The numeric constants appropriate to be used as arguments are exportable from I18N::Langinfo.

    The following example will import the langinfo() function itself and three constants to be used as arguments to langinfo(): a constant for the abbreviated first day of the week (the numbering starts from Sunday = 1) and two more constants for the affirmative and negative answers for a yes/no question in the current locale.

        use I18N::Langinfo qw(langinfo ABDAY_1 YESSTR NOSTR);
        my ($abday_1, $yesstr, $nostr) = map { langinfo($_) } (ABDAY_1, YESSTR, NOSTR);
        print "$abday_1? [$yesstr/$nostr] ";

    In other words, in the "C" (or English) locale the above will probably print something like:

        Sun? [yes/no]

    but under a French locale

        dim? [oui/non]

    The usually available constants are

        ABDAY_1 ABDAY_2 ABDAY_3 ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7
        ABMON_1 ABMON_2 ABMON_3 ABMON_4 ABMON_5 ABMON_6
        ABMON_7 ABMON_8 ABMON_9 ABMON_10 ABMON_11 ABMON_12
        DAY_1 DAY_2 DAY_3 DAY_4 DAY_5 DAY_6 DAY_7
        MON_1 MON_2 MON_3 MON_4 MON_5 MON_6
        MON_7 MON_8 MON_9 MON_10 MON_11 MON_12

    for abbreviated and full length days of the week and months of the year,

        D_T_FMT D_FMT T_FMT

    for the date-time, date, and time formats used by the strftime() function (see POSIX)

        AM_STR PM_STR T_FMT_AMPM

    for the locales for which it makes sense to have ante meridiem and post meridiem time formats,

        CODESET CRNCYSTR RADIXCHAR

    for the character code set being used (such as "ISO8859-1", "cp850", "koi8-r", "sjis", "utf8", etc.), for the currency string, for the radix character used between the integer and the fractional part of decimal numbers (yes, this is redundant with POSIX::localeconv())

        YESSTR YESEXPR NOSTR NOEXPR

    for the affirmative and negative responses and expressions, and

        ERA ERA_D_FMT ERA_D_T_FMT ERA_T_FMT

    for the Japanese Emperor eras (naturally only defined under Japanese locales).

    See your langinfo(3) for more information about the available constants. (Often this means having to look directly at the langinfo.h C header file.)
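
As an illustration of how the format constants combine with strftime(), here is a minimal sketch. It assumes that D_FMT and T_FMT are available on the platform (which is not guaranteed, as noted below), and it pins the time locale to "C" only so that the result is predictable; the fixed date is arbitrary.

```perl
use strict;
use warnings;
use POSIX qw(strftime setlocale LC_TIME);
use I18N::Langinfo qw(langinfo D_FMT T_FMT);

# Pin the time locale for this sketch; pass "" instead of "C"
# to follow the user's environment.
setlocale(LC_TIME, "C");

my $date_format = langinfo(D_FMT);   # a strftime() pattern for dates
my $time_format = langinfo(T_FMT);   # ...and one for times

# A fixed moment, 25 December 2013, 12:00:00,
# given as (sec, min, hour, mday, mon, year - 1900).
my @when = (0, 0, 12, 25, 11, 113);

my $date = strftime($date_format, @when);
my $time = strftime($time_format, @when);
print "date: $date\n";
print "time: $time\n";
```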

    Note that unfortunately none of the above constants are guaranteed to be available on a particular platform. To be on the safe side you can wrap the import in an eval like this:

        eval {
            require I18N::Langinfo;
            I18N::Langinfo->import(qw(langinfo CODESET));
            $codeset = langinfo(CODESET()); # note the ()
        };
        if ($@) { ... failed ... }

    EXPORT

    By default only the langinfo() function is exported.

    SEE ALSO

    perllocale, localeconv in POSIX, setlocale in POSIX, nl_langinfo(3).

    The langinfo() function is just a wrapper for the C nl_langinfo() interface.

    AUTHOR

    Jarkko Hietaniemi, <jhi@hut.fi>

    COPYRIGHT AND LICENSE

    Copyright 2001 by Jarkko Hietaniemi

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    I18N::LangTags::Detect

    NAME

    I18N::LangTags::Detect - detect the user's language preferences

    SYNOPSIS

        use I18N::LangTags::Detect;
        my @user_wants = I18N::LangTags::Detect::detect();

    DESCRIPTION

    It is a common problem to want to detect what language(s) the user would prefer output in.

    FUNCTIONS

    This module defines one public function, I18N::LangTags::Detect::detect(). This function is not exported (nor is it even exportable), and it takes no parameters.

    In scalar context, the function returns the most preferred language tag (or undef if no preference was seen).

    In list context (which is usually what you want), the function returns a (possibly empty) list of language tags representing (best first) what languages the user apparently would accept output in. You will probably want to pass the output of this through I18N::LangTags::implicate_supers_strictly(...) or I18N::LangTags::implicate_supers(...), like so:

        my @languages =
            I18N::LangTags::implicate_supers_strictly(
                I18N::LangTags::Detect::detect()
            );

    ENVIRONMENT

    This module looks for several environment variables, including REQUEST_METHOD, HTTP_ACCEPT_LANGUAGE, LANGUAGE, LC_ALL, LC_MESSAGES, and LANG.

    It will also use the Win32::Locale module, if it's installed.
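
To see the effect of these variables without a web server, a CGI request can be simulated by setting them locally. This is a sketch under assumptions: the Accept-Language value is made up, and the q-value ordering shown reflects how the module is understood to parse that header rather than anything stated above.

```perl
use strict;
use warnings;
use I18N::LangTags qw(implicate_supers);
use I18N::LangTags::Detect;

# Pretend to be a CGI script; "local" restores %ENV when the block ends.
my @wanted = do {
    local $ENV{REQUEST_METHOD}       = 'GET';
    local $ENV{HTTP_ACCEPT_LANGUAGE} = 'fr-CA;q=0.8, en;q=0.5, de';
    I18N::LangTags::Detect::detect();
};

# Add super-ordinate forms so that plain "fr" content can satisfy "fr-ca".
my @candidates = implicate_supers(@wanted);

print "detected:   @wanted\n";
print "candidates: @candidates\n";
```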

    SEE ALSO

    I18N::LangTags, Win32::Locale, Locale::Maketext.

    (This module's core code started out as a routine in Locale::Maketext; but I moved it here once I realized it was more generally useful.)

    COPYRIGHT

    Copyright (c) 1998-2004 Sean M. Burke. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    The programs and documentation in this dist are distributed in the hope that they will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.

    AUTHOR

    Sean M. Burke sburke@cpan.org

     
    I18N::LangTags::List

    NAME

    I18N::LangTags::List -- tags and names for human languages

    SYNOPSIS

        use I18N::LangTags::List;
        print "Parlez-vous... ", join(', ',
            I18N::LangTags::List::name('elx') || 'unknown_language',
            I18N::LangTags::List::name('ar-Kw') || 'unknown_language',
            I18N::LangTags::List::name('en') || 'unknown_language',
            I18N::LangTags::List::name('en-CA') || 'unknown_language',
        ), "?\n";

    prints:

        Parlez-vous... Elamite, Kuwait Arabic, English, Canadian English?

    DESCRIPTION

    This module provides a function I18N::LangTags::List::name( langtag ) that takes a language tag (see I18N::LangTags) and returns the best attempt at an English name for it, or undef if it can't make sense of the tag.

    The function I18N::LangTags::List::name(...) is not exported.

    This module also provides a function I18N::LangTags::List::is_decent( langtag ) that returns true iff the language tag is syntactically valid and is for general use (like "fr" or "fr-ca", below). That is, it returns false for tags that are syntactically invalid and for tags, like "aus", that are listed in brackets below. This function is not exported.

    The map of tags-to-names that it uses is accessible as %I18N::LangTags::List::Name, and it's the same as the list that follows in this documentation, which should be useful to you even if you don't use this module.
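
The two functions and the name map can be exercised together as follows. This is a minimal sketch; the sample tags are chosen for illustration, and "not_a_tag" is deliberately invalid.

```perl
use strict;
use warnings;
use I18N::LangTags::List;

my %report;
for my $tag (qw(fr fr-ca elx aus not_a_tag)) {
    my $name   = I18N::LangTags::List::name($tag);       # undef if unknown
    my $decent = I18N::LangTags::List::is_decent($tag);  # false for bracketed or invalid tags
    $report{$tag} = [ $name, $decent ];
    printf "%-10s %-22s decent: %s\n",
        $tag,
        defined $name ? $name : '(unknown)',
        $decent ? 'yes' : 'no';
}

# The underlying map is an ordinary package hash.
my $known = scalar keys %I18N::LangTags::List::Name;
print "tags with names: $known\n";
```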

    ABOUT LANGUAGE TAGS

    Internet language tags, as defined in RFC 3066, are a formalism for denoting human languages. The two-letter ISO 639-1 language codes are well known (as "en" for English), as are their forms when qualified by a country code ("en-US"). Less well-known are the arbitrary-length non-ISO codes (like "i-mingo"), and the recently (in 2001) introduced three-letter ISO-639-2 codes.

    Remember these important facts:

    • Language tags are not locale IDs. A locale ID is written with a "_" instead of a "-", (almost?) always matches m/^\w\w_\w\w\b/, and means something different than a language tag. A language tag denotes a language. A locale ID denotes a language as used in a particular place, in combination with non-linguistic location-specific information such as what currency is used there. Locales also often denote character set information, as in "en_US.ISO8859-1".

    • Language tags are not for computer languages.

    • "Dialect" is not a useful term, since there is no objective criterion for establishing when two language-forms are dialects of each other, or are separate languages.

    • Language tags are not case-sensitive. en-US, en-us, En-Us, etc., are all the same tag, and denote the same language.

    • Not every language tag really refers to a single language. Some language tags refer to conditions: i-default (system-message text in English plus maybe other languages), und (undetermined language). Others (notably lots of the three-letter codes) are bibliographic tags that classify whole groups of languages, as with cus "Cushitic (Other)" (i.e., a language that has been classed as Cushitic, but which has no more specific code) or the even less linguistically coherent sai for "South American Indian (Other)". Though useful in bibliography, SUCH TAGS ARE NOT FOR GENERAL USE. For further guidance, email me.

    • Language tags are not country codes. In fact, they are often distinct codes, as with language tag ja for Japanese, and ISO 3166 country code .jp for Japan.
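
The first and fourth points can be checked with functions from I18N::LangTags. This is a small sketch; the locale names are examples only.

```perl
use strict;
use warnings;
use I18N::LangTags qw(locale2language_tag same_language_tag);

# A locale ID is not a language tag, but many can be mapped to one.
my %tag_for;
for my $locale ('en_US.ISO8859-1', 'fr_CA', 'C') {
    my $tag = locale2language_tag($locale);   # undef when not mappable
    $tag_for{$locale} = $tag;
    print "$locale -> ", (defined $tag ? $tag : '(no language tag)'), "\n";
}

# Case makes no difference to what a tag denotes.
my $same = same_language_tag('en-US', 'En-us');
print "en-US vs En-us: ", ($same ? 'same' : 'different'), "\n";
```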

    LIST OF LANGUAGES

    The first part of each item is the language tag, between {...}. It is followed by an English name for the language or language-group. Language tags that I judge to be not for general use are bracketed.

    This list is in alphabetical order by English name of the language.

    • {ab} : Abkhazian

      eq Abkhaz

    • {ace} : Achinese
    • {ach} : Acoli
    • {ada} : Adangme
    • {ady} : Adyghe

      eq Adygei

    • {aa} : Afar
    • {afh} : Afrihili

      (Artificial)

    • {af} : Afrikaans
    • [{afa} : Afro-Asiatic (Other)]
    • {ak} : Akan

      (Formerly "aka".)

    • {akk} : Akkadian

      (Historical)

    • {sq} : Albanian
    • {ale} : Aleut
    • [{alg} : Algonquian languages]

      NOT Algonquin!

    • [{tut} : Altaic (Other)]
    • {am} : Amharic

      NOT Aramaic!

    • {i-ami} : Ami

      eq Amis. eq 'Amis. eq Pangca.

    • [{apa} : Apache languages]
    • {ar} : Arabic

      Many forms are mutually unintelligible in spoken media. Notable forms: {ar-ae} UAE Arabic; {ar-bh} Bahrain Arabic; {ar-dz} Algerian Arabic; {ar-eg} Egyptian Arabic; {ar-iq} Iraqi Arabic; {ar-jo} Jordanian Arabic; {ar-kw} Kuwait Arabic; {ar-lb} Lebanese Arabic; {ar-ly} Libyan Arabic; {ar-ma} Moroccan Arabic; {ar-om} Omani Arabic; {ar-qa} Qatari Arabic; {ar-sa} Saudi Arabic; {ar-sy} Syrian Arabic; {ar-tn} Tunisian Arabic; {ar-ye} Yemen Arabic.

    • {arc} : Aramaic

      NOT Amharic! NOT Samaritan Aramaic!

    • {arp} : Arapaho
    • {arn} : Araucanian
    • {arw} : Arawak
    • {hy} : Armenian
    • {an} : Aragonese
    • [{art} : Artificial (Other)]
    • {ast} : Asturian

      eq Bable.

    • {as} : Assamese
    • [{ath} : Athapascan languages]

      eq Athabaskan. eq Athapaskan. eq Athabascan.

    • [{aus} : Australian languages]
    • [{map} : Austronesian (Other)]
    • {av} : Avaric

      (Formerly "ava".)

    • {ae} : Avestan

      eq Zend

    • {awa} : Awadhi
    • {ay} : Aymara
    • {az} : Azerbaijani

      eq Azeri

      Notable forms: {az-Arab} Azerbaijani in Arabic script; {az-Cyrl} Azerbaijani in Cyrillic script; {az-Latn} Azerbaijani in Latin script.

    • {ban} : Balinese
    • [{bat} : Baltic (Other)]
    • {bal} : Baluchi
    • {bm} : Bambara

      (Formerly "bam".)

    • [{bai} : Bamileke languages]
    • {bad} : Banda
    • [{bnt} : Bantu (Other)]
    • {bas} : Basa
    • {ba} : Bashkir
    • {eu} : Basque
    • {btk} : Batak (Indonesia)
    • {bej} : Beja
    • {be} : Belarusian

      eq Belarussian. eq Byelarussian. eq Belorussian. eq Byelorussian. eq White Russian. eq White Ruthenian. NOT Ruthenian!

    • {bem} : Bemba
    • {bn} : Bengali

      eq Bangla.

    • [{ber} : Berber (Other)]
    • {bho} : Bhojpuri
    • {bh} : Bihari
    • {bik} : Bikol
    • {bin} : Bini
    • {bi} : Bislama

      eq Bichelamar.

    • {bs} : Bosnian
    • {bra} : Braj
    • {br} : Breton
    • {bug} : Buginese
    • {bg} : Bulgarian
    • {i-bnn} : Bunun
    • {bua} : Buriat
    • {my} : Burmese
    • {cad} : Caddo
    • {car} : Carib
    • {ca} : Catalan

      eq Catalán. eq Catalonian.

    • [{cau} : Caucasian (Other)]
    • {ceb} : Cebuano
    • [{cel} : Celtic (Other)]

      Notable forms: {cel-gaulish} Gaulish (Historical)

    • [{cai} : Central American Indian (Other)]
    • {chg} : Chagatai

      (Historical?)

    • [{cmc} : Chamic languages]
    • {ch} : Chamorro
    • {ce} : Chechen
    • {chr} : Cherokee

      eq Tsalagi

    • {chy} : Cheyenne
    • {chb} : Chibcha

      (Historical) NOT Chibchan (which is a language family).

    • {ny} : Chichewa

      eq Nyanja. eq Chinyanja.

    • {zh} : Chinese

      Many forms are mutually unintelligible in spoken media. Notable forms: {zh-Hans} Chinese, in simplified script; {zh-Hant} Chinese, in traditional script; {zh-tw} Taiwan Chinese; {zh-cn} PRC Chinese; {zh-sg} Singapore Chinese; {zh-mo} Macau Chinese; {zh-hk} Hong Kong Chinese; {zh-guoyu} Mandarin [Putonghua/Guoyu]; {zh-hakka} Hakka [formerly "i-hakka"]; {zh-min} Hokkien; {zh-min-nan} Southern Hokkien; {zh-wuu} Shanghainese; {zh-xiang} Hunanese; {zh-gan} Gan; {zh-yue} Cantonese.

    • {chn} : Chinook Jargon

      eq Chinook Wawa.

    • {chp} : Chipewyan
    • {cho} : Choctaw
    • {cu} : Church Slavic

      eq Old Church Slavonic.

    • {chk} : Chuukese

      eq Trukese. eq Chuuk. eq Truk. eq Ruk.

    • {cv} : Chuvash
    • {cop} : Coptic
    • {kw} : Cornish
    • {co} : Corsican

      eq Corse.

    • {cr} : Cree

      NOT Creek! (Formerly "cre".)

    • {mus} : Creek

      NOT Cree!

    • [{cpe} : English-based Creoles and pidgins (Other)]
    • [{cpf} : French-based Creoles and pidgins (Other)]
    • [{cpp} : Portuguese-based Creoles and pidgins (Other)]
    • [{crp} : Creoles and pidgins (Other)]
    • {hr} : Croatian

      eq Croat.

    • [{cus} : Cushitic (Other)]
    • {cs} : Czech
    • {dak} : Dakota

      eq Nakota. eq Lakota.

    • {da} : Danish
    • {dar} : Dargwa
    • {day} : Dayak
    • {i-default} : Default (Fallthru) Language

      Defined in RFC 2277, this is for tagging text (which must include English text, and might/should include text in other appropriate languages) that is emitted in a context where language-negotiation wasn't possible -- in SMTP mail failure messages, for example.

    • {del} : Delaware
    • {din} : Dinka
    • {dv} : Divehi

      eq Maldivian. (Formerly "div".)

    • {doi} : Dogri

      NOT Dogrib!

    • {dgr} : Dogrib

      NOT Dogri!

    • [{dra} : Dravidian (Other)]
    • {dua} : Duala
    • {nl} : Dutch

      eq Netherlander. Notable forms: {nl-nl} Netherlands Dutch; {nl-be} Belgian Dutch.

    • {dum} : Middle Dutch (ca.1050-1350)

      (Historical)

    • {dyu} : Dyula
    • {dz} : Dzongkha
    • {efi} : Efik
    • {egy} : Ancient Egyptian

      (Historical)

    • {eka} : Ekajuk
    • {elx} : Elamite

      (Historical)

    • {en} : English

      Notable forms: {en-au} Australian English; {en-bz} Belize English; {en-ca} Canadian English; {en-gb} UK English; {en-ie} Irish English; {en-jm} Jamaican English; {en-nz} New Zealand English; {en-ph} Philippine English; {en-tt} Trinidad English; {en-us} US English; {en-za} South African English; {en-zw} Zimbabwe English.

    • {enm} : Middle English (1100-1500)

      (Historical)

    • {ang} : Old English (ca.450-1100)

      eq Anglo-Saxon. (Historical)

    • {i-enochian} : Enochian (Artificial)
    • {myv} : Erzya
    • {eo} : Esperanto

      (Artificial)

    • {et} : Estonian
    • {ee} : Ewe

      (Formerly "ewe".)

    • {ewo} : Ewondo
    • {fan} : Fang
    • {fat} : Fanti
    • {fo} : Faroese
    • {fj} : Fijian
    • {fi} : Finnish
    • [{fiu} : Finno-Ugrian (Other)]

      eq Finno-Ugric. NOT Ugaritic!

    • {fon} : Fon
    • {fr} : French

      Notable forms: {fr-fr} France French; {fr-be} Belgian French; {fr-ca} Canadian French; {fr-ch} Swiss French; {fr-lu} Luxembourg French; {fr-mc} Monaco French.

    • {frm} : Middle French (ca.1400-1600)

      (Historical)

    • {fro} : Old French (842-ca.1400)

      (Historical)

    • {fy} : Frisian
    • {fur} : Friulian
    • {ff} : Fulah

      (Formerly "ful".)

    • {gaa} : Ga
    • {gd} : Scots Gaelic

      NOT Scots!

    • {gl} : Gallegan

      eq Galician

    • {lg} : Ganda

      (Formerly "lug".)

    • {gay} : Gayo
    • {gba} : Gbaya
    • {gez} : Geez

      eq Ge'ez

    • {ka} : Georgian
    • {de} : German

      Notable forms: {de-at} Austrian German; {de-be} Belgian German; {de-ch} Swiss German; {de-de} Germany German; {de-li} Liechtenstein German; {de-lu} Luxembourg German.

    • {gmh} : Middle High German (ca.1050-1500)

      (Historical)

    • {goh} : Old High German (ca.750-1050)

      (Historical)

    • [{gem} : Germanic (Other)]
    • {gil} : Gilbertese
    • {gon} : Gondi
    • {gor} : Gorontalo
    • {got} : Gothic

      (Historical)

    • {grb} : Grebo
    • {grc} : Ancient Greek

      (Historical) (Until 15th century or so.)

    • {el} : Modern Greek

      (Since 15th century or so.)

    • {gn} : Guarani

      Guaraní

    • {gu} : Gujarati
    • {gwi} : Gwich'in

      eq Gwichin

    • {hai} : Haida
    • {ht} : Haitian

      eq Haitian Creole

    • {ha} : Hausa
    • {haw} : Hawaiian

      Hawai'ian

    • {he} : Hebrew

      (Formerly "iw".)

    • {hz} : Herero
    • {hil} : Hiligaynon
    • {him} : Himachali
    • {hi} : Hindi
    • {ho} : Hiri Motu
    • {hit} : Hittite

      (Historical)

    • {hmn} : Hmong
    • {hu} : Hungarian
    • {hup} : Hupa
    • {iba} : Iban
    • {is} : Icelandic
    • {io} : Ido

      (Artificial)

    • {ig} : Igbo

      (Formerly "ibo".)

    • {ijo} : Ijo
    • {ilo} : Iloko
    • [{inc} : Indic (Other)]
    • [{ine} : Indo-European (Other)]
    • {id} : Indonesian

      (Formerly "in".)

    • {inh} : Ingush
    • {ia} : Interlingua (International Auxiliary Language Association)

      (Artificial) NOT Interlingue!

    • {ie} : Interlingue

      (Artificial) NOT Interlingua!

    • {iu} : Inuktitut

      A subform of "Eskimo".

    • {ik} : Inupiaq

      A subform of "Eskimo".

    • [{ira} : Iranian (Other)]
    • {ga} : Irish
    • {mga} : Middle Irish (900-1200)

      (Historical)

    • {sga} : Old Irish (to 900)

      (Historical)

    • [{iro} : Iroquoian languages]
    • {it} : Italian

      Notable forms: {it-it} Italy Italian; {it-ch} Swiss Italian.

    • {ja} : Japanese

      (NOT "jp"!)

    • {jv} : Javanese

      (Formerly "jw" because of a typo.)

    • {jrb} : Judeo-Arabic
    • {jpr} : Judeo-Persian
    • {kbd} : Kabardian
    • {kab} : Kabyle
    • {kac} : Kachin
    • {kl} : Kalaallisut

      eq Greenlandic "Eskimo"

    • {xal} : Kalmyk
    • {kam} : Kamba
    • {kn} : Kannada

      eq Kanarese. NOT Canadian!

    • {kr} : Kanuri

      (Formerly "kau".)

    • {krc} : Karachay-Balkar
    • {kaa} : Kara-Kalpak
    • {kar} : Karen
    • {ks} : Kashmiri
    • {csb} : Kashubian

      eq Kashub

    • {kaw} : Kawi
    • {kk} : Kazakh
    • {kha} : Khasi
    • {km} : Khmer

      eq Cambodian. eq Kampuchean.

    • [{khi} : Khoisan (Other)]
    • {kho} : Khotanese
    • {ki} : Kikuyu

      eq Gikuyu.

    • {kmb} : Kimbundu
    • {rw} : Kinyarwanda
    • {ky} : Kirghiz
    • {i-klingon} : Klingon
    • {kv} : Komi
    • {kg} : Kongo

      (Formerly "kon".)

    • {kok} : Konkani
    • {ko} : Korean
    • {kos} : Kosraean
    • {kpe} : Kpelle
    • {kro} : Kru
    • {kj} : Kuanyama
    • {kum} : Kumyk
    • {ku} : Kurdish
    • {kru} : Kurukh
    • {kut} : Kutenai
    • {lad} : Ladino

      eq Judeo-Spanish. NOT Ladin (a minority language in Italy).

    • {lah} : Lahnda

      NOT Lamba!

    • {lam} : Lamba

      NOT Lahnda!

    • {lo} : Lao

      eq Laotian.

    • {la} : Latin

      (Historical) NOT Ladin! NOT Ladino!

    • {lv} : Latvian

      eq Lettish.

    • {lb} : Letzeburgesch

      eq Luxemburgian, eq Luxemburger. (Formerly "i-lux".)

    • {lez} : Lezghian
    • {li} : Limburgish

      eq Limburger, eq Limburgan. NOT Letzeburgesch!

    • {ln} : Lingala
    • {lt} : Lithuanian
    • {nds} : Low German

      eq Low Saxon.

    • {art-lojban} : Lojban (Artificial)
    • {loz} : Lozi
    • {lu} : Luba-Katanga

      (Formerly "lub".)

    • {lua} : Luba-Lulua
    • {lui} : Luiseno

      eq Luiseño.

    • {lun} : Lunda
    • {luo} : Luo (Kenya and Tanzania)
    • {lus} : Lushai
    • {mk} : Macedonian

      eq the modern Slavic language spoken in what was Yugoslavia. NOT the form of Greek spoken in Greek Macedonia!

    • {mad} : Madurese
    • {mag} : Magahi
    • {mai} : Maithili
    • {mak} : Makasar
    • {mg} : Malagasy
    • {ms} : Malay

      NOT Malayalam!

    • {ml} : Malayalam

      NOT Malay!

    • {mt} : Maltese
    • {mnc} : Manchu
    • {mdr} : Mandar

      NOT Mandarin!

    • {man} : Mandingo
    • {mni} : Manipuri

      eq Meithei.

    • [{mno} : Manobo languages]
    • {gv} : Manx
    • {mi} : Maori

      NOT Mari!

    • {mr} : Marathi
    • {chm} : Mari

      NOT Maori!

    • {mh} : Marshall

      eq Marshallese.

    • {mwr} : Marwari
    • {mas} : Masai
    • [{myn} : Mayan languages]
    • {men} : Mende
    • {mic} : Micmac
    • {min} : Minangkabau
    • {i-mingo} : Mingo

      eq the Iroquoian language West Virginia Seneca. NOT New York Seneca!

    • [{mis} : Miscellaneous languages]

      Don't use this.

    • {moh} : Mohawk
    • {mdf} : Moksha
    • {mo} : Moldavian

      eq Moldovan.

    • [{mkh} : Mon-Khmer (Other)]
    • {lol} : Mongo
    • {mn} : Mongolian

      eq Mongol.

    • {mos} : Mossi
    • [{mul} : Multiple languages]

      Not for normal use.

    • [{mun} : Munda languages]
    • {nah} : Nahuatl
    • {nap} : Neapolitan
    • {na} : Nauru
    • {nv} : Navajo

      eq Navaho. (Formerly "i-navajo".)

    • {nd} : North Ndebele
    • {nr} : South Ndebele
    • {ng} : Ndonga
    • {ne} : Nepali

      eq Nepalese. Notable forms: {ne-np} Nepal Nepali; {ne-in} India Nepali.

    • {new} : Newari
    • {nia} : Nias
    • [{nic} : Niger-Kordofanian (Other)]
    • [{ssa} : Nilo-Saharan (Other)]
    • {niu} : Niuean
    • {nog} : Nogai
    • {non} : Old Norse

      (Historical)

    • [{nai} : North American Indian]

      Do not use this.

    • {no} : Norwegian

      Note the two following forms:

    • {nb} : Norwegian Bokmal

      eq Bokmål. (A form of Norwegian.) (Formerly "no-bok".)

    • {nn} : Norwegian Nynorsk

      (A form of Norwegian.) (Formerly "no-nyn".)

    • [{nub} : Nubian languages]
    • {nym} : Nyamwezi
    • {nyn} : Nyankole
    • {nyo} : Nyoro
    • {nzi} : Nzima
    • {oc} : Occitan (post 1500)

      eq Provençal, eq Provencal

    • {oj} : Ojibwa

      eq Ojibwe. (Formerly "oji".)

    • {or} : Oriya
    • {om} : Oromo
    • {osa} : Osage
    • {os} : Ossetian; Ossetic
    • [{oto} : Otomian languages]

      Group of languages collectively called "Otomí".

    • {pal} : Pahlavi

      eq Pahlevi

    • {i-pwn} : Paiwan

      eq Pariwan

    • {pau} : Palauan
    • {pi} : Pali

      (Historical?)

    • {pam} : Pampanga
    • {pag} : Pangasinan
    • {pa} : Panjabi

      eq Punjabi

    • {pap} : Papiamento

      eq Papiamentu.

    • [{paa} : Papuan (Other)]
    • {fa} : Persian

      eq Farsi. eq Iranian.

    • {peo} : Old Persian (ca.600-400 B.C.)
    • [{phi} : Philippine (Other)]
    • {phn} : Phoenician

      (Historical)

    • {pon} : Pohnpeian

      NOT Pompeiian!

    • {pl} : Polish
    • {pt} : Portuguese

      eq Portugese. Notable forms: {pt-pt} Portugal Portuguese; {pt-br} Brazilian Portuguese.

    • [{pra} : Prakrit languages]
    • {pro} : Old Provencal (to 1500)

      eq Old Provençal. (Historical.)

    • {ps} : Pushto

      eq Pashto. eq Pushtu.

    • {qu} : Quechua

      eq Quecha.

    • {rm} : Raeto-Romance

      eq Romansh.

    • {raj} : Rajasthani
    • {rap} : Rapanui
    • {rar} : Rarotongan
    • [{qaa - qtz} : Reserved for local use.]
    • [{roa} : Romance (Other)]

      NOT Romanian! NOT Romany! NOT Romansh!

    • {ro} : Romanian

      eq Rumanian. NOT Romany!

    • {rom} : Romany

      eq Rom. NOT Romanian!

    • {rn} : Rundi
    • {ru} : Russian

      NOT White Russian! NOT Rusyn!

    • [{sal} : Salishan languages]

      Large language group.

    • {sam} : Samaritan Aramaic

      NOT Aramaic!

    • {se} : Northern Sami

      eq Lappish. eq Lapp. eq (Northern) Saami.

    • {sma} : Southern Sami
    • {smn} : Inari Sami
    • {smj} : Lule Sami
    • {sms} : Skolt Sami
    • [{smi} : Sami languages (Other)]
    • {sm} : Samoan
    • {sad} : Sandawe
    • {sg} : Sango
    • {sa} : Sanskrit

      (Historical)

    • {sat} : Santali
    • {sc} : Sardinian

      eq Sard.

    • {sas} : Sasak
    • {sco} : Scots

      NOT Scots Gaelic!

    • {sel} : Selkup
    • [{sem} : Semitic (Other)]
    • {sr} : Serbian

      eq Serb. NOT Sorbian.

      Notable forms: {sr-Cyrl} : Serbian in Cyrillic script; {sr-Latn} : Serbian in Latin script.

    • {srr} : Serer
    • {shn} : Shan
    • {sn} : Shona
    • {sid} : Sidamo
    • {sgn-...} : Sign Languages

      Always use with a subtag. Notable forms: {sgn-gb} British Sign Language (BSL); {sgn-ie} Irish Sign Language (ISL); {sgn-ni} Nicaraguan Sign Language (ISN); {sgn-us} American Sign Language (ASL).

      (And so on with other country codes as the subtag.)

    • {bla} : Siksika

      eq Blackfoot. eq Pikanii.

    • {sd} : Sindhi
    • {si} : Sinhalese

      eq Sinhala.

    • [{sit} : Sino-Tibetan (Other)]
    • [{sio} : Siouan languages]
    • {den} : Slave (Athapascan)

      ("Slavey" is a subform.)

    • [{sla} : Slavic (Other)]
    • {sk} : Slovak

      eq Slovakian.

    • {sl} : Slovenian

      eq Slovene.

    • {sog} : Sogdian
    • {so} : Somali
    • {son} : Songhai
    • {snk} : Soninke
    • {wen} : Sorbian languages

      eq Wendish. eq Sorb. eq Lusatian. eq Wend. NOT Venda! NOT Serbian!

    • {nso} : Northern Sotho
    • {st} : Southern Sotho

      eq Sutu. eq Sesotho.

    • [{sai} : South American Indian (Other)]
    • {es} : Spanish

      Notable forms: {es-ar} Argentine Spanish; {es-bo} Bolivian Spanish; {es-cl} Chilean Spanish; {es-co} Colombian Spanish; {es-do} Dominican Spanish; {es-ec} Ecuadorian Spanish; {es-es} Spain Spanish; {es-gt} Guatemalan Spanish; {es-hn} Honduran Spanish; {es-mx} Mexican Spanish; {es-pa} Panamanian Spanish; {es-pe} Peruvian Spanish; {es-pr} Puerto Rican Spanish; {es-py} Paraguay Spanish; {es-sv} Salvadoran Spanish; {es-us} US Spanish; {es-uy} Uruguayan Spanish; {es-ve} Venezuelan Spanish.

    • {suk} : Sukuma
    • {sux} : Sumerian

      (Historical)

    • {su} : Sundanese
    • {sus} : Susu
    • {sw} : Swahili

      eq Kiswahili

    • {ss} : Swati
    • {sv} : Swedish

      Notable forms: {sv-se} Sweden Swedish; {sv-fi} Finland Swedish.

    • {syr} : Syriac
    • {tl} : Tagalog
    • {ty} : Tahitian
    • [{tai} : Tai (Other)]

      NOT Thai!

    • {tg} : Tajik
    • {tmh} : Tamashek
    • {ta} : Tamil
    • {i-tao} : Tao

      eq Yami.

    • {tt} : Tatar
    • {i-tay} : Tayal

      eq Atayal. eq Atayan.

    • {te} : Telugu
    • {ter} : Tereno
    • {tet} : Tetum
    • {th} : Thai

      NOT Tai!

    • {bo} : Tibetan
    • {tig} : Tigre
    • {ti} : Tigrinya
    • {tem} : Timne

      eq Themne. eq Timene.

    • {tiv} : Tiv
    • {tli} : Tlingit
    • {tpi} : Tok Pisin
    • {tkl} : Tokelau
    • {tog} : Tonga (Nyasa)

      NOT Tsonga!

    • {to} : Tonga (Tonga Islands)

      (Pronounced "Tong-a", not "Tong-ga")

      NOT Tsonga!

    • {tsi} : Tsimshian

      eq Sm'algyax

    • {ts} : Tsonga

      NOT Tonga!

    • {i-tsu} : Tsou
    • {tn} : Tswana

      Same as Setswana.

    • {tum} : Tumbuka
    • [{tup} : Tupi languages]
    • {tr} : Turkish

      (Typically in Roman script)

    • {ota} : Ottoman Turkish (1500-1928)

      (Typically in Arabic script) (Historical)

    • {crh} : Crimean Turkish

      eq Crimean Tatar

    • {tk} : Turkmen

      eq Turkmeni.

    • {tvl} : Tuvalu
    • {tyv} : Tuvinian

      eq Tuvan. eq Tuvin.

    • {tw} : Twi
    • {udm} : Udmurt
    • {uga} : Ugaritic

      NOT Ugric!

    • {ug} : Uighur
    • {uk} : Ukrainian
    • {umb} : Umbundu
    • {und} : Undetermined

      Not a tag for normal use.

    • {ur} : Urdu
    • {uz} : Uzbek

      eq Özbek

      Notable forms: {uz-Cyrl} Uzbek in Cyrillic script; {uz-Latn} Uzbek in Latin script.

    • {vai} : Vai
    • {ve} : Venda

      NOT Wendish! NOT Wend! NOT Avestan! (Formerly "ven".)

    • {vi} : Vietnamese

      eq Viet.

    • {vo} : Volapuk

      eq Volapük. (Artificial)

    • {vot} : Votic

      eq Votian. eq Vod.

    • [{wak} : Wakashan languages]
    • {wa} : Walloon
    • {wal} : Walamo

      eq Wolaytta.

    • {war} : Waray

      Presumably the Philippine language Waray-Waray (Samareño), not the smaller Philippine language Waray Sorsogon, nor the extinct Australian language Waray.

    • {was} : Washo

      eq Washoe

    • {cy} : Welsh
    • {wo} : Wolof
    • {x-...} : Unregistered (Semi-Private Use)

      "x-" is a prefix for language tags that are not registered with ISO or IANA. Example, x-double-dutch

    • {xh} : Xhosa
    • {sah} : Yakut
    • {yao} : Yao

      (The Yao in Malawi?)

    • {yap} : Yapese

      eq Yap

    • {ii} : Sichuan Yi
    • {yi} : Yiddish

      Formerly "ji". Usually in Hebrew script.

      Notable forms: {yi-latn} Yiddish in Latin script

    • {yo} : Yoruba
    • [{ypk} : Yupik languages]

      Several "Eskimo" languages.

    • {znd} : Zande
    • [{zap} : Zapotec]

      (A group of languages.)

    • {zen} : Zenaga

      NOT Zend.

    • {za} : Zhuang
    • {zu} : Zulu
    • {zun} : Zuni

      eq Zuñi
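
    The entries above can also be looked up from code. The following is a minimal sketch, assuming the module's name() function is called with a lowercase tag from the list; the variable names are only illustrative.

```perl
use strict;
use warnings;
use I18N::LangTags::List;

# Look up the English names of two tags that appear in the list above.
my $elamite = I18N::LangTags::List::name('elx');
my $french  = I18N::LangTags::List::name('fr');

print "elx: $elamite\n";
print "fr:  $french\n";
```

    A tag that the module does not know yields no name, so real code would normally supply a fallback string.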

    SEE ALSO

    I18N::LangTags and its "See Also" section.

    COPYRIGHT AND DISCLAIMER

    Copyright (c) 2001+ Sean M. Burke. All rights reserved.

    You can redistribute and/or modify this document under the same terms as Perl itself.

    This document is provided in the hope that it will be useful, but without any warranty; without even the implied warranty of accuracy, authoritativeness, completeness, merchantability, or fitness for a particular purpose.

    Email any corrections or questions to me.

    AUTHOR

    Sean M. Burke, sburke@cpan.org

     
    Hash::Util

    Perl 5 version 18.2 documentation

    NAME

    Hash::Util - A selection of general-utility hash subroutines

    SYNOPSIS

        # Restricted hashes
        use Hash::Util qw(
            fieldhash fieldhashes
            all_keys
            lock_keys unlock_keys
            lock_value unlock_value
            lock_hash unlock_hash
            lock_keys_plus
            hash_locked hash_unlocked
            hashref_locked hashref_unlocked
            hidden_keys legal_keys
            lock_ref_keys unlock_ref_keys
            lock_ref_value unlock_ref_value
            lock_hashref unlock_hashref
            lock_ref_keys_plus
            hidden_ref_keys legal_ref_keys
            hash_seed hash_value hv_store
            bucket_stats bucket_info bucket_array
            lock_hash_recurse unlock_hash_recurse
            hash_traversal_mask
        );

        %hash = (foo => 42, bar => 23);

        # Ways to restrict a hash
        lock_keys(%hash);
        lock_keys(%hash, @keyset);
        lock_keys_plus(%hash, @additional_keys);

        # Ways to inspect the properties of a restricted hash
        my @legal = legal_keys(%hash);
        my @hidden = hidden_keys(%hash);
        my $ref = all_keys(%hash, @keys, @hidden);
        my $is_locked = hash_locked(%hash);

        # Remove restrictions on the hash
        unlock_keys(%hash);

        # Lock individual values in a hash
        lock_value  (%hash, 'foo');
        unlock_value(%hash, 'foo');

        # Ways to change the restrictions on both keys and values
        lock_hash  (%hash);
        unlock_hash(%hash);

        my $hashes_are_randomised = hash_seed() != 0;

        my $int_hash_value = hash_value( 'string' );

        my $mask = hash_traversal_mask(%hash);

        hash_traversal_mask(%hash, 1234);

    DESCRIPTION

    Hash::Util and Hash::Util::FieldHash contain special functions for manipulating hashes that don't really warrant a keyword.

    Hash::Util contains a set of functions that support restricted hashes. These are described in this document. Hash::Util::FieldHash contains an (unrelated) set of functions that support the use of hashes in inside-out classes, described in Hash::Util::FieldHash.

    By default Hash::Util does not export anything.

    Restricted hashes

    Perl 5.8.0 introduced the ability to restrict a hash to a certain set of keys. No keys outside of this set can be added. It also introduced the ability to lock an individual key so it cannot be deleted and the ability to ensure that an individual value cannot be changed.

    This is intended to largely replace the deprecated pseudo-hashes.

    • lock_keys
    • unlock_keys
          lock_keys(%hash);
          lock_keys(%hash, @keys);

      Restricts the given %hash's set of keys to @keys. If @keys is not given it restricts it to its current keyset. No more keys can be added. delete() and exists() will still work, but will not alter the set of allowed keys.

      Note: the current implementation prevents the hash from being bless()ed while it is in a locked state. Any attempt to do so will raise an exception. Of course you can still bless() the hash before you call lock_keys() so this shouldn't be a problem.

          unlock_keys(%hash);

      Removes the restriction on the %hash's keyset.

      Note that if any of the values of the hash have been locked they will not be unlocked after this sub executes.

      Both routines return a reference to the hash operated on.
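
      The behaviour described above can be sketched as follows. This is a minimal illustration; the hash %config and its keys are made up for the example.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys unlock_keys);

my %config = (host => 'localhost', port => 8080);

# Restrict %config to its current key set: host and port.
lock_keys(%config);

# Values of existing keys can still be changed.
$config{port} = 9090;

# Storing under a key outside the allowed set raises an exception.
my $added_while_locked = eval { $config{user} = 'guest'; 1 } ? 1 : 0;

# delete() still works, and the key stays legal, so it can be stored again.
delete $config{port};
$config{port} = 8081;

# Once the restriction is lifted, new keys are accepted again.
unlock_keys(%config);
$config{user} = 'guest';

print "added while locked: $added_while_locked\n";
print join(', ', map { "$_=$config{$_}" } sort keys %config), "\n";
```

      Wrapping the risky store in eval keeps the exception from ending the program, which is convenient when the key name comes from outside.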

    • lock_keys_plus
          lock_keys_plus(%hash, @additional_keys);

      Similar to lock_keys(), with the difference being that the optional key list specifies keys that may or may not be already in the hash. Essentially this is an easier way to say

          lock_keys(%hash, @additional_keys, keys %hash);

      Returns a reference to %hash.
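
      A short sketch of the difference from lock_keys(), using a made-up %user hash: the existing key stays legal and two further keys are allowed without being present yet.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys_plus);

my %user = (name => 'alice');

# 'name' stays legal; 'email' and 'phone' become legal although absent.
lock_keys_plus(%user, qw(email phone));

# Storing under one of the additional keys is allowed.
$user{email} = 'alice@example.org';

# Any other key is still rejected.
my $fax_stored = eval { $user{fax} = '555'; 1 } ? 1 : 0;

print "email stored: ", (exists $user{email} ? 1 : 0), "\n";
print "fax stored: $fax_stored\n";
```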

    • lock_value
    • unlock_value
          lock_value  (%hash, $key);
          unlock_value(%hash, $key);

      Locks and unlocks the value for an individual key of a hash. The value of a locked key cannot be changed.

      Unless %hash has already been locked the key/value could be deleted regardless of this setting.

      Returns a reference to the %hash.
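
      The sketch below locks the key set first, so that the locked value can be neither changed nor deleted. The hash %settings is hypothetical and exists only for illustration.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys lock_value unlock_value);

my %settings = (mode => 'strict', level => 3);

# Lock the key set first so that 'mode' cannot be deleted either.
lock_keys(%settings);
lock_value(%settings, 'mode');

# Neither changing nor deleting the locked entry succeeds.
my $changed = eval { $settings{mode} = 'lax'; 1 } ? 1 : 0;
my $deleted = eval { delete $settings{mode}; 1 } ? 1 : 0;

# Values of other keys stay writable.
$settings{level} = 4;

# After unlock_value() the entry can be assigned again.
unlock_value(%settings, 'mode');
$settings{mode} = 'lax';

print "changed while locked: $changed, deleted while locked: $deleted\n";
```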

    • lock_hash
    • unlock_hash
          lock_hash(%hash);

      lock_hash() locks an entire hash, making all keys and values read-only. No value can be changed, no keys can be added or deleted.

          unlock_hash(%hash);

      unlock_hash() does the opposite of lock_hash(). All keys and values are made writable. All values can be changed and keys can be added and deleted.

      Returns a reference to the %hash.
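
      As an illustration, the following sketch records which operations are rejected while a hash is locked. The hash %limits and the list @rejected are assumptions of the example.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_hash unlock_hash);

my %limits = (max_users => 100, max_files => 10);
lock_hash(%limits);

# Each of these operations is attempted inside eval and noted if it fails.
my @rejected;
push @rejected, 'modify' unless eval { $limits{max_users} = 200; 1 };
push @rejected, 'add'    unless eval { $limits{extra} = 1; 1 };
push @rejected, 'delete' unless eval { delete $limits{max_files}; 1 };

# After unlock_hash() the same operations go through.
unlock_hash(%limits);
$limits{max_users} = 200;
$limits{extra}     = 1;
delete $limits{max_files};

print "rejected while locked: @rejected\n";
```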

    • lock_hash_recurse
    • unlock_hash_recurse
          lock_hash_recurse(%hash);

      lock_hash_recurse() locks an entire hash and any hashes it references recursively, making all keys and values read-only. No value can be changed, no keys can be added or deleted.

      This function only recurses into hashes that are referenced by another hash. Thus a Hash of Hashes (HoH) will all be restricted, but a Hash of Arrays of Hashes (HoAoH) will only have the top hash restricted.

          unlock_hash_recurse(%hash);

      unlock_hash_recurse() does the opposite of lock_hash_recurse(). All keys and values are made writable. All values can be changed and keys can be added and deleted. Identical recursion restrictions apply as to lock_hash_recurse().

      Returns a reference to the %hash.
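
      The recursion rule can be sketched with a structure that contains both cases. The data in %tree is invented for the example.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_hash_recurse);

my %tree = (
    db    => { host => 'localhost' },     # hash referenced by a hash
    hosts => [ { name => 'alpha' } ],     # hash reached only through an array
);

lock_hash_recurse(%tree);

# The nested hash under 'db' is locked along with the top-level hash.
my $inner_locked = eval { $tree{db}{port} = 5432; 1 } ? 0 : 1;

# The hash inside the array is not reached by the recursion.
my $deep_locked = eval { $tree{hosts}[0]{ip} = '10.0.0.1'; 1 } ? 0 : 1;

print "inner hash locked: $inner_locked\n";
print "hash inside array locked: $deep_locked\n";
```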

    • hashref_locked
    • hash_locked
          hashref_locked(\%hash) and print "Hash is locked!\n";
          hash_locked(%hash) and print "Hash is locked!\n";

      Returns true if the hash and its keys are locked.

    • hashref_unlocked
    • hash_unlocked
          hashref_unlocked(\%hash) and print "Hash is unlocked!\n";
          hash_unlocked(%hash) and print "Hash is unlocked!\n";

      Returns true if the hash and its keys are unlocked.

    • legal_keys
          my @keys = legal_keys(%hash);

      Returns the list of the keys that are legal in a restricted hash. In the case of an unrestricted hash this is identical to calling keys(%hash).

    • hidden_keys
          my @keys = hidden_keys(%hash);

      Returns the list of the keys that are legal in a restricted hash but do not have a value associated to them. Thus if 'foo' is a "hidden" key of the %hash it will return false for both defined and exists tests.

      In the case of an unrestricted hash this will return an empty list.

      NOTE this is an experimental feature that is heavily dependent on the current implementation of restricted hashes. Should the implementation change, this routine may become meaningless, in which case it will return an empty list.
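
      A minimal sketch of legal versus hidden keys, assuming a hash %h that is locked to three keys of which only one holds a value:

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys legal_keys hidden_keys);

my %h = (foo => 1);

# Allow three keys, of which only 'foo' currently holds a value.
lock_keys(%h, qw(foo bar baz));

# An explicit comparison block keeps sort from treating the function
# name as its comparison routine.
my @legal  = sort { $a cmp $b } legal_keys(%h);
my @hidden = sort { $a cmp $b } hidden_keys(%h);

# A hidden key is legal but fails an exists test.
my $bar_exists = exists $h{bar} ? 1 : 0;

print "legal:  @legal\n";
print "hidden: @hidden\n";
print "exists \$h{bar}: $bar_exists\n";
```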

    • all_keys
          all_keys(%hash, @keys, @hidden);

      Populates the array @keys with all the keys that would pass an exists test, and populates @hidden with the remaining legal keys that have not been utilized.

      Returns a reference to the hash.

      In the case of an unrestricted hash this will be equivalent to

          $ref = do {
              @keys = keys %hash;
              @hidden = ();
              \%hash
          };

      NOTE this is an experimental feature that is heavily dependent on the current implementation of restricted hashes. Should the implementation change this routine may become meaningless in which case it will behave identically to how it would behave on an unrestricted hash.
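
      For a restricted hash, a call might look like the sketch below. The array names @present and @unused are chosen for the example only.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys all_keys);

my %h = (foo => 1, bar => 2);
lock_keys(%h, qw(foo bar baz));

# Both arrays are filled by the call; the hash itself is returned by reference.
my (@present, @unused);
my $ref = all_keys(%h, @present, @unused);

print "present: @{[ sort @present ]}\n";   # keys that pass an exists test
print "unused:  @unused\n";                # legal keys without a value
```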

    • hash_seed
          my $hash_seed = hash_seed();

      hash_seed() returns the seed bytes used to randomise hash ordering.

      Note that the hash seed is sensitive information: by knowing it one can craft a denial-of-service attack against Perl code, even remotely, see Algorithmic Complexity Attacks in perlsec for more information. Do not disclose the hash seed to people who don't need to know it. See also PERL_HASH_SEED_DEBUG in perlrun.

      Prior to Perl 5.17.6 this function returned a UV. It now returns a string, which may be of nearly any size as determined by the hash function your Perl has been built with. Possible sizes include, but are not limited to, 4 bytes (for most hash algorithms) and 16 bytes (for siphash).

    • hash_value
          my $hash_value = hash_value($string);

      hash_value() returns the current perl's internal hash value for a given string.

      Returns a 32 bit integer representing the hash value of the string passed in. This value is only reliable for the lifetime of the process. It may be different depending on invocation, environment variables, perl version, architectures, and build options.

      Note that the hash value of a given string is sensitive information: by knowing it one can deduce the hash seed which in turn can allow one to craft a denial-of-service attack against Perl code, even remotely, see Algorithmic Complexity Attacks in perlsec for more information. Do not disclose the hash value of a string to people who don't need to know it. See also PERL_HASH_SEED_DEBUG in perlrun.

    • bucket_info

      Return a set of basic information about a hash.

          my ($keys, $buckets, $used, @length_counts) = bucket_info($hashref);

      Fields are as follows:

          0: Number of keys in the hash
          1: Number of buckets in the hash
          2: Number of used buckets in the hash
          rest : list of counts, Kth element is the number of buckets
                 with K keys in it.

      See also bucket_stats() and bucket_array().

    • bucket_stats

      Returns a list of statistics about a hash.

          my ($keys, $buckets, $used, $score, $utilization_ratio, $collision_pct,
              $mean, $stddev, @length_counts) = bucket_stats($hashref);

      Fields are as follows:

          0: Number of keys in the hash
          1: Number of buckets in the hash
          2: Number of used buckets in the hash
          3: Hash Quality Score
          4: Percent of buckets used
          5: Percent of keys which are in collision
          6: Average bucket length
          7: Standard Deviation of bucket lengths.
          rest : list of counts, Kth element is the number of buckets
                 with K keys in it.

      See also bucket_info() and bucket_array().

      Note that the Hash Quality Score would be 1 for an ideal hash. Numbers close to and below 1 indicate good hashing, and numbers significantly above 1 indicate a poor score. In practice it should be around 0.95 to 1.05. It is defined as:

          $score = sum( $count[$length] * ($length * ($length + 1) / 2) )
                   /
                   ( ( $keys / (2 * $buckets) ) *
                     ( $keys + ( 2 * $buckets ) - 1 ) )

      The formula is from the Red Dragon book (reformulated to use the data available) and is documented at http://www.strchr.com/hash_functions
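
      The fields returned by bucket_info() are enough to recompute such a score by hand. The sketch below does so under one stated assumption: the first factor of the divisor is read as $keys / (2 * $buckets). The hash contents and variable names are illustrative.

```perl
use strict;
use warnings;
use Hash::Util qw(bucket_info);

my %h = map { ("key$_" => 1) } 1 .. 1000;
my ($keys, $buckets, $used, @length_counts) = bucket_info(\%h);

# Recover totals from the per-length counts: the Kth element is the
# number of buckets holding exactly K keys.
my ($key_total, $bucket_total, $weighted) = (0, 0, 0);
for my $length (0 .. $#length_counts) {
    my $count = $length_counts[$length];
    $bucket_total += $count;
    $key_total    += $count * $length;
    $weighted     += $count * ($length * ($length + 1) / 2);
}

# Quality score, with the divisor read as (keys / (2 * buckets)) * (...).
# That grouping is an assumption of this sketch.
my $score = $weighted
          / ( ($keys / (2 * $buckets)) * ($keys + 2 * $buckets - 1) );

printf "keys=%d buckets=%d used=%d score=%.3f\n",
    $keys, $buckets, $used, $score;
```

      The two totals double as a sanity check: the counts should add up to the number of buckets, and the counts weighted by length should add up to the number of keys.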

    • bucket_array
          my $array = bucket_array(\%hash);

      Returns a packed representation of the bucket array associated with a hash. Each element of the array is either an integer K, in which case it represents K empty buckets, or a reference to another array which contains the keys that are in that bucket.

      Note that the information returned by bucket_array is sensitive information: by knowing it one can directly attack perl's hash function, which in turn may allow one to craft a denial-of-service attack against Perl code, even remotely. See Algorithmic Complexity Attacks in perlsec for more information. Do not disclose the output of this function to people who don't need to know it. See also PERL_HASH_SEED_DEBUG in perlrun. This function is provided strictly for debugging and diagnostic purposes; it is hard to imagine a reason why it would be used in production code.
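
      For debugging, the packed representation can be unpacked as in the sketch below. The counters are names chosen for the example.

```perl
use strict;
use warnings;
use Hash::Util qw(bucket_array);

my %h = map { ("key$_" => 1) } 1 .. 50;
my $array = bucket_array(\%h);

my ($empty_buckets, $used_buckets, $key_count) = (0, 0, 0);
for my $element (@$array) {
    if (ref $element) {
        # An array reference is one bucket; its elements are the keys in it.
        $used_buckets++;
        $key_count += @$element;
    }
    else {
        # A plain integer K stands for a run of K empty buckets.
        $empty_buckets += $element;
    }
}

print "used=$used_buckets empty=$empty_buckets keys=$key_count\n";
```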

    • hv_store
          my $sv = 0;
          hv_store(%hash, $key, $sv) or die "Failed to alias!";
          $hash{$key} = 1;
          print $sv; # prints 1

      Stores an alias to a variable in a hash instead of copying the value.

    • hash_traversal_mask

      As of Perl 5.18 every hash has its own hash traversal order, and this order changes every time a new element is inserted into the hash. This functionality is provided by maintaining an unsigned integer mask (U32) which is xor'ed with the actual bucket id during a traversal of the hash buckets using keys(), values() or each().

      You can use this subroutine to get and set the traversal mask for a specific hash. Setting the mask ensures that a given hash will produce the same key order. Note that this does not guarantee that two hashes will produce the same key order for the same hash seed and traversal mask: items that collide into one bucket may have different orders regardless of this setting.

    Operating on references to hashes.

    Most subroutines documented in this module have equivalent versions that operate on references to hashes instead of native hashes. The following is a list of these subs. They are identical except in name and in that instead of taking a %hash they take a $hashref, and additionally are not prototyped.

    • lock_ref_keys
    • unlock_ref_keys
    • lock_ref_keys_plus
    • lock_ref_value
    • unlock_ref_value
    • lock_hashref
    • unlock_hashref
    • lock_hashref_recurse
    • unlock_hashref_recurse
    • hashref_locked
    • hashref_unlocked
    • legal_ref_keys
    • hidden_ref_keys

    CAVEATS

    Note that the trapping of the restricted operations is not atomic: for example

        eval { %hash = (illegal_key => 1) }

    leaves the %hash empty rather than with its original contents.

    BUGS

    The interface exposed by this module is very close to the current implementation of restricted hashes. Over time it is expected that this behavior will be extended and the interface abstracted further.

    AUTHOR

    Michael G Schwern <schwern@pobox.com> on top of code by Nick Ing-Simmons and Jeffrey Friedl.

    hv_store() is from Array::RefElem, Copyright 2000 Gisle Aas.

    Additional code by Yves Orton.

    SEE ALSO

    Scalar::Util, List::Util and Algorithmic Complexity Attacks in perlsec.

    Hash::Util::FieldHash.

     
    Hash::Util::FieldHash

    Perl 5 version 18.2 documentation

    NAME

    Hash::Util::FieldHash - Support for Inside-Out Classes

    SYNOPSIS

        ### Create fieldhashes
        use Hash::Util qw(fieldhash fieldhashes);

        # Create a single field hash
        fieldhash my %foo;

        # Create three at once...
        fieldhashes \ my(%foo, %bar, %baz);
        # ...or any number
        fieldhashes @hashrefs;

        ### Create an idhash and register it for garbage collection
        use Hash::Util::FieldHash qw(idhash register);
        idhash my %name;
        my $object = \ do { my $o };
        # register the idhash for garbage collection with $object
        register($object, \ %name);
        # the following entry will be deleted when $object goes out of scope
        $name{$object} = 'John Doe';

        ### Register an ordinary hash for garbage collection
        use Hash::Util::FieldHash qw(id register);
        my %name;
        my $object = \ do { my $o };
        # register the hash %name for garbage collection of $object's id
        register $object, \ %name;
        # the following entry will be deleted when $object goes out of scope
        $name{id $object} = 'John Doe';

    FUNCTIONS

    Hash::Util::FieldHash offers a number of functions in support of The Inside-out Technique of class construction.

    • id
          id($obj)

      Returns the reference address of a reference $obj. If $obj is not a reference, returns $obj.

      This function is a stand-in replacement for Scalar::Util::refaddr, that is, it returns the reference address of its argument as a numeric value. The only difference is that refaddr() returns undef when given a non-reference while id() returns its argument unchanged.

      id() also uses a caching technique that makes it faster when the id of an object is requested often, but slower if it is needed only once or twice.
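
      The difference between id() and refaddr() can be sketched as follows; the variable names are illustrative.

```perl
use strict;
use warnings;
use Hash::Util::FieldHash qw(id);
use Scalar::Util qw(refaddr);

my $obj = {};

# For a reference, both functions give the reference address.
my $same_for_refs = id($obj) == refaddr($obj) ? 1 : 0;

# For a non-reference, id() returns its argument and refaddr() returns undef.
my $id_of_string   = id('plain');
my $addr_of_string = refaddr('plain');

print "same for references: $same_for_refs\n";
print "id of a string: $id_of_string\n";
print "refaddr of a string is ",
    (defined $addr_of_string ? 'defined' : 'undef'), "\n";
```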

    • id_2obj
          $obj = id_2obj($id)

      If $id is the id of a registered object (see register), returns the object, otherwise an undefined value. For registered objects this is the inverse function of id().

    • register
          register($obj)
          register($obj, @hashrefs)

      In the first form, registers an object to work with for the function id_2obj(). In the second form, it additionally marks the given hashrefs down for garbage collection. This means that when the object goes out of scope, any entries in the given hashes under the key of id($obj) will be deleted from the hashes.

      It is a fatal error to register a non-reference $obj. Any non-hashrefs among the following arguments are silently ignored.

      It is not an error to register the same object multiple times with varying sets of hashrefs. Any hashrefs that are not registered yet will be added, others ignored.

      Registry also implies thread support. When a new thread is created, all references are replaced with new ones, including all objects. If a hash uses the reference address of an object as a key, that connection would be broken. With a registered object, its id will be updated in all hashes registered with it.
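
      A minimal sketch of registration with an ordinary hash, following the pattern of the SYNOPSIS. The counters are added only to make the effect visible.

```perl
use strict;
use warnings;
use Hash::Util::FieldHash qw(id id_2obj register);

my %name;                          # an ordinary hash
my ($round_trip, $count_inside);
{
    my $object = \ do { my $o };
    register($object, \%name);

    # Key the entry by the object's id, not by the reference itself.
    $name{ id($object) } = 'John Doe';

    # For a registered object, id_2obj() maps the id back to the object.
    $round_trip   = id_2obj(id($object)) == $object ? 1 : 0;
    $count_inside = keys %name;
}

# $object has gone out of scope, so its entry is removed from %name.
my $count_after = keys %name;

print "round trip: $round_trip, entries inside: $count_inside, ",
    "entries after: $count_after\n";
```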

    • idhash
          idhash my %hash

      Makes an idhash from the argument, which must be a hash.

      An idhash works like a normal hash, except that it stringifies a reference used as a key differently. A reference is stringified as if the id() function had been invoked on it, that is, its reference address in decimal is used as the key.
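
      The effect on keys can be seen by storing the same reference in an idhash and in a plain hash. The names %seen and %plain are made up for the sketch.

```perl
use strict;
use warnings;
use Hash::Util::FieldHash qw(idhash id);

idhash my %seen;
my %plain;

my $obj = [];
$seen{$obj}  = 1;
$plain{$obj} = 1;

# In the idhash the key is the decimal reference address; in the plain
# hash it is the usual stringified reference.
my ($id_key)    = keys %seen;
my ($plain_key) = keys %plain;

print "idhash key:     $id_key\n";
print "plain hash key: $plain_key\n";
```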

    • idhashes
          idhashes \ my(%hash, %gnash, %trash)
          idhashes \ @hashrefs

      Creates many idhashes from its hashref arguments. Returns those arguments that could be converted or their number in scalar context.

    • fieldhash
          fieldhash %hash;

      Creates a single fieldhash. The argument must be a hash. Returns a reference to the given hash if successful, otherwise nothing.

      A fieldhash is, in short, an idhash with auto-registry. When an object (or, indeed, any reference) is used as a fieldhash key, the fieldhash is automatically registered for garbage collection with the object, as if register $obj, \ %fieldhash had been called.

    • fieldhashes
          fieldhashes @hashrefs;

      Creates any number of field hashes. Arguments must be hash references. Returns the converted hashrefs in list context, their number in scalar context.
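
    The functions listed above can be tried together in a short script. What follows is a minimal sketch rather than part of the module's own documentation; the class name Demo and all variable names are made up for illustration.

```perl
use strict;
use warnings;
use Hash::Util::FieldHash qw(id id_2obj register idhash);
use Scalar::Util qw(refaddr);

my $obj = bless {}, 'Demo';            # Demo is a hypothetical class name

# id() yields the reference address in decimal, like refaddr()
my $same_as_refaddr = id($obj) == refaddr($obj);

# id_2obj() only knows about registered objects
my $before = id_2obj( id($obj) );      # undef: not registered yet
register($obj);
my $after  = id_2obj( id($obj) );      # now the object itself

# An idhash passes reference keys through id() automatically
idhash my %color;
$color{$obj} = 'green';
my ($key) = keys %color;               # the decimal id of $obj

# Registering an object with a hash makes its entry expire with the object
my %size;
{
    my $tmp = bless [], 'Demo';
    register( $tmp, \%size );
    $size{ id $tmp } = 42;
}
my $left = keys %size;                 # 0 once $tmp has gone out of scope
```

    Note that %size is an ordinary hash here: register() only takes care of deleting the entry, so the key still has to be computed with id() by hand.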

    DESCRIPTION

    A word on terminology: I shall use the term field for a scalar piece of data that a class associates with an object. Other terms that have been used for this concept are "object variable", "(object) property", "(object) attribute" and more. Especially "attribute" has some currency among Perl programmers, but that clashes with the attributes pragma. The term "field" also has some currency in this sense and doesn't seem to conflict with other Perl terminology.

    In Perl, an object is a blessed reference. The standard way of associating data with an object is to store the data inside the object's body, that is, the piece of data pointed to by the reference.

    In consequence, if two or more classes want to access an object they must agree on the type of reference and also on the organization of data within the object body. Failure to agree on the type results in immediate death when the wrong method tries to access an object. Failure to agree on data organization may lead to one class trampling over the data of another.

    This object model leads to a tight coupling between subclasses. If one class wants to inherit from another (and both classes access object data), the classes must agree about implementation details. Inheritance can only be used among classes that are maintained together, in a single source or not.

    In particular, it is not possible to write general-purpose classes in this technique, classes that can advertise themselves as "Put me on your @ISA list and use my methods". If the other class has different ideas about how the object body is used, there is trouble.

    For reference, Name_hash in Example 1 shows the standard implementation of a simple class Name in the well-known hash-based way. It also demonstrates the predictable failure to construct a common subclass NamedFile of Name and the class IO::File (whose objects must be globrefs).

    Thus, techniques are of interest that store object data not in the object body but some other place.

    The Inside-out Technique

    With inside-out classes, each class declares a (typically lexical) hash for each field it wants to use. The reference address of an object is used as the hash key. By definition, the reference address is unique to each object so this guarantees a place for each field that is private to the class and unique to each object. See Name_id in Example 1 for a simple example.

    In comparison to the standard implementation where the object is a hash and the fields correspond to hash keys, here the fields correspond to hashes, and the object determines the hash key. Thus the hashes appear to be turned inside out.

    The body of an object is never examined by an inside-out class, only its reference address is used. This allows for the body of an actual object to be anything at all while the object methods of the class still work as designed. This is a key feature of inside-out classes.

    Problems of Inside-out

    Inside-out classes give us freedom of inheritance, but as usual there is a price.

    Most obviously, there is the necessity of retrieving the reference address of an object for each data access. It's a minor inconvenience, but it does clutter the code.

    More important (and less obvious) is the necessity of garbage collection. When a normal object dies, anything stored in the object body is garbage-collected by perl. With inside-out objects, Perl knows nothing about the data stored in field hashes by a class, but these must be deleted when the object goes out of scope. Thus the class must provide a DESTROY method to take care of that.
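
    The need for a destructor can be made visible with two otherwise identical inside-out classes. This is a minimal sketch; the class names Leaky and Tidy and the count() helper are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

{
    package Leaky;                     # hypothetical class without a destructor
    use Scalar::Util qw(refaddr);
    my %name;
    sub new {
        my ($class, $n) = @_;
        my $obj = bless \ my $o, $class;
        $name{ refaddr $obj } = $n;
        return $obj;
    }
    sub count { scalar keys %name }    # number of stored field entries
}
{
    package Tidy;                      # the same class plus a DESTROY method
    use Scalar::Util qw(refaddr);
    my %name;
    sub new {
        my ($class, $n) = @_;
        my $obj = bless \ my $o, $class;
        $name{ refaddr $obj } = $n;
        return $obj;
    }
    sub count { scalar keys %name }
    sub DESTROY {
        my $self = shift;
        delete $name{ refaddr $self };
    }
}

{ my $tmp = Leaky->new('a') }          # the object dies, its entry stays behind
{ my $tmp = Tidy->new('b') }           # DESTROY removes the entry

my $leaky_left = Leaky->count;         # 1
my $tidy_left  = Tidy->count;          # 0
```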

    In the presence of multiple classes it can be non-trivial to make sure that every relevant destructor is called for every object. Perl calls the first one it finds on the inheritance tree (if any) and that's it.

    A related issue is thread-safety. When a new thread is created, the Perl interpreter is cloned, which implies that all reference addresses in use will be replaced with new ones. Thus, if a class tries to access a field of a cloned object its (cloned) data will still be stored under the now invalid reference address of the original in the parent thread. A general CLONE method must be provided to re-establish the association.

    Solutions

    Hash::Util::FieldHash addresses these issues on several levels.

    The id() function is provided in addition to the existing Scalar::Util::refaddr() . Besides its short name it can be a little faster under some circumstances (and a bit slower under others). Benchmark if it matters. The working of id() also allows the use of the class name as a generic object as described further down.

    The id() function is incorporated in id hashes in the sense that it is called automatically on every key that is used with the hash. No explicit call is necessary.

    The problems of garbage collection and thread safety are both addressed by the function register() . It registers an object together with any number of hashes. Registry means that when the object dies, an entry in any of the hashes under the reference address of this object will be deleted. This guarantees garbage collection in these hashes. It also means that on thread cloning the object's entries in registered hashes will be replaced with updated entries whose key is the cloned object's reference address. Thus the object-data association becomes thread-safe.

    Object registry is best done when the object is initialized for use with a class. That way, garbage collection and thread safety are established for every object and every field that is initialized.

    Finally, field hashes incorporate all these functions in one package. Besides automatically calling the id() function on every object used as a key, the object is registered with the field hash on first use. Classes based on field hashes are fully garbage-collected and thread safe without further measures.
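
    The automatic registry of field hashes can be observed directly. The sketch below uses plain anonymous array references as "objects"; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Hash::Util::FieldHash qw(fieldhash);

fieldhash my %name;

my $kept = [];                         # any reference can serve as a key
$name{$kept} = 'kept';

my $inside;
{
    my $temp = [];
    $name{$temp} = 'temporary';
    $inside = keys %name;              # 2: both entries exist here
}

# $temp has gone out of scope and its entry has been collected
my @values = values %name;             # ('kept')
```

    No call to id() or register() appears, and no DESTROY method is involved.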

    More Problems

    Another problem that occurs with inside-out classes is serialization. Since the object data is not in its usual place, standard routines like Storable::freeze() , Storable::thaw() and Data::Dumper::Dumper() can't deal with it on their own. Both Data::Dumper and Storable provide the necessary hooks to make things work, but the functions or methods used by the hooks must be provided by each inside-out class.

    A general solution to the serialization problem would require another level of registry, one that associates classes and fields. So far, the functions of Hash::Util::FieldHash are unaware of any classes, which I consider a feature. Therefore Hash::Util::FieldHash doesn't address the serialization problems.
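
    As an illustration of what a class has to provide itself, here is a minimal sketch of an inside-out class that supplies Storable's STORABLE_freeze and STORABLE_thaw hooks. The class Point, its field hashes, and the comma-separated format are assumptions made for this example; a real class would pick a format suited to its data.

```perl
use strict;
use warnings;

{
    package Point;                     # hypothetical example class
    use Hash::Util::FieldHash qw(fieldhashes);
    fieldhashes \ my (%px, %py);

    sub new {
        my ($class, $x, $y) = @_;
        my $obj = bless \ my $o, $class;
        ($px{$obj}, $py{$obj}) = ($x, $y);
        return $obj;
    }
    sub coords { my $self = shift; ($px{$self}, $py{$self}) }

    # Storable calls this instead of looking into the (empty) object body
    sub STORABLE_freeze {
        my ($self, $cloning) = @_;
        return join ',', $px{$self}, $py{$self};
    }
    # Storable hands over a fresh blessed object plus the frozen string
    sub STORABLE_thaw {
        my ($self, $cloning, $serialized) = @_;
        ($px{$self}, $py{$self}) = split /,/, $serialized;
        return;
    }
}

use Storable qw(freeze thaw);
my $copy = thaw( freeze( Point->new(3, 4) ) );
my @coords = $copy->coords;            # (3, 4)
```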

    The Generic Object

    Classes based on the id() function (and hence classes based on idhash() and fieldhash() ) show a peculiar behavior in that the class name can be used like an object. Specifically, methods that set or read data associated with an object continue to work as class methods, just as if the class name were an object, distinct from all other objects, with its own data. This object may be called the generic object of the class.

    This works because field hashes respond to keys that are not references like a normal hash would and use the string offered as the hash key. Thus, if a method is called as a class method, the field hash is presented with the class name instead of an object and blithely uses it as a key. Since the keys of real objects are decimal numbers, there is no conflict and the slot in the field hash can be used like any other. The id() function behaves correspondingly with respect to non-reference arguments.

    Two possible uses (besides ignoring the property) come to mind. A singleton class could be implemented using the generic object. If necessary, an init() method could die or ignore calls with actual objects (references), so only the generic object will ever exist.

    Another use of the generic object would be as a template. It is a convenient place to store class-specific defaults for various fields to be used in actual object initialization.

    Usually, the feature can be entirely ignored. Calling object methods as class methods normally leads to an error and isn't used routinely anywhere. It may be a problem that this error isn't indicated by a class with a generic object.
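
    The template use of the generic object might look like the following minimal sketch. The class Greeting and its methods are hypothetical; the fallback logic in text() is one possible design, not something the module prescribes.

```perl
use strict;
use warnings;

{
    package Greeting;                  # hypothetical example class
    use Hash::Util::FieldHash qw(fieldhash);
    fieldhash my %text;

    sub new      { my $class = shift; bless \ my $o, $class }
    sub set_text { my ($self, $t) = @_; $text{$self} = $t; $self }

    # Fall back to the generic object's value if the object has none
    sub text {
        my $self = shift;
        return exists $text{$self}
            ? $text{$self}
            : $text{ ref($self) || $self };
    }
}

Greeting->set_text('hello');                   # stored under the key "Greeting"
my $plain  = Greeting->new;                    # relies on the class-wide default
my $custom = Greeting->new->set_text('hi');    # carries its own value

my @texts = ( Greeting->text, $plain->text, $custom->text );
# ('hello', 'hello', 'hi')
```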

    How to use Field Hashes

    Traditionally, the definition of an inside-out class contains a bare block inside which a number of lexical hashes are declared and the basic accessor methods defined, usually through Scalar::Util::refaddr . Further methods may be defined outside this block. There has to be a DESTROY method and, for thread support, a CLONE method.

    When field hashes are used, the basic structure remains the same. Each lexical hash will be made a field hash. The call to refaddr can be omitted from the accessor methods. DESTROY and CLONE methods are not necessary.

    If you have an existing inside-out class, simply making all hashes field hashes with no other change should make no difference. Through the calls to refaddr or equivalent, the field hashes never get to see a reference and work like normal hashes. Your DESTROY (and CLONE) methods are still needed.

    To make the field hashes kick in, it is easiest to redefine refaddr as

        sub refaddr { shift }

    instead of importing it from Scalar::Util . It should now be possible to disable DESTROY and CLONE. Note that while it isn't disabled, DESTROY will be called before the garbage collection of field hashes, so it will be invoked with a functional object and will continue to function.

    It is not desirable to import the functions fieldhash and/or fieldhashes into every class that is going to use them. They are only used once to set up the class. When the class is up and running, these functions serve no more purpose.

    If there are only a few field hashes to declare, it is simplest to

        use Hash::Util::FieldHash;

    early and call the functions qualified:

        Hash::Util::FieldHash::fieldhash my %foo;

    Otherwise, import the functions into a convenient package like HUF or, more generally, Aux

        {
            package Aux;
            use Hash::Util::FieldHash ':all';
        }

    and call

        Aux::fieldhash my %foo;

    as needed.

    Garbage-Collected Hashes

    Garbage collection in a field hash means that entries will "spontaneously" disappear when the object that created them disappears. That must be borne in mind, especially when looping over a field hash. If anything you do inside the loop could cause an object to go out of scope, a random key may be deleted from the hash you are looping over. That can throw the loop iterator, so it's best to cache a consistent snapshot of the keys and/or values and loop over that. You will still have to check that a cached entry still exists when you get to it.
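
    The snapshot approach can be sketched as follows. The data is made up for illustration, and the pop inside the loop merely stands in for "anything that could cause an object to go out of scope".

```perl
use strict;
use warnings;
use Hash::Util::FieldHash qw(fieldhash);

fieldhash my %label;
my @objects = ( [], [], [] );
$label{ $objects[$_] } = "item $_" for 0 .. 2;

# Take a consistent snapshot of the keys before looping
my @snapshot = sort { $label{$a} cmp $label{$b} } keys %label;

my @seen;
for my $key (@snapshot) {
    pop @objects if @objects == 3;     # "item 2" vanishes during the first pass
    next unless exists $label{$key};   # skip entries collected meanwhile
    push @seen, $label{$key};
}
# @seen is ('item 0', 'item 1')
```

    The snapshot keys are plain strings, so testing them with exists does not register anything with the field hash.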

    Garbage collection can be confusing when keys are created in a field hash from normal scalars as well as references. Once a reference is used with a field hash, the entry will be collected, even if it was later overwritten with a plain scalar key (every positive integer is a candidate). This is true even if the original entry was deleted in the meantime. In fact, deletion from a field hash, and also a test for existence constitute use in this sense and create a liability to delete the entry when the reference goes out of scope. If you happen to create an entry with an identical key from a string or integer, that will be collected instead. Thus, mixed use of references and plain scalars as field hash keys is not entirely supported.

    EXAMPLES

    The examples show a very simple class that implements a name, consisting of a first and last name (no middle initial). The name class has four methods:

    • init()

      An object method that initializes the first and last name to its two arguments. If called as a class method, init() creates an object in the given class and initializes that.

    • first()

      Retrieve the first name

    • last()

      Retrieve the last name

    • name()

      Retrieve the full name, the first and last name joined by a blank.

    The examples show this class implemented with different levels of support by Hash::Util::FieldHash . All supported combinations are shown. The difference between implementations is often quite small. The implementations are:

    • Name_hash

      A conventional (not inside-out) implementation where an object is a hash that stores the field values, without support by Hash::Util::FieldHash . This implementation doesn't allow arbitrary inheritance.

    • Name_id

      Inside-out implementation based on the id() function. It needs a DESTROY method. For thread support a CLONE method (not shown) would also be needed. Instead of Hash::Util::FieldHash::id() the function Scalar::Util::refaddr could be used with very little functional difference. This is the basic pattern of an inside-out class.

    • Name_idhash

      Idhash-based inside-out implementation. Like Name_id it needs a DESTROY method and would need CLONE for thread support.

    • Name_id_reg

      Inside-out implementation based on the id() function with explicit object registry. No destructor is needed and objects are thread safe.

    • Name_idhash_reg

      Idhash-based inside-out implementation with explicit object registry. No destructor is needed and objects are thread safe.

    • Name_fieldhash

      FieldHash-based inside-out implementation. Object registry happens automatically. No destructor is needed and objects are thread safe.

    These examples are realized in the code below, which could be copied to a file Example.pm.

    Example 1

        use strict; use warnings;

        {
            package Name_hash; # standard implementation: the object is a hash
            sub init {
                my $obj = shift;
                my ($first, $last) = @_;
                # create an object if called as class method
                $obj = bless {}, $obj unless ref $obj;
                $obj->{ first} = $first;
                $obj->{ last} = $last;
                $obj;
            }
            sub first { shift()->{ first} }
            sub last { shift()->{ last} }
            sub name {
                my $n = shift;
                join ' ' => $n->first, $n->last;
            }
        }

        {
            package Name_id;
            use Hash::Util::FieldHash qw(id);
            my (%first, %last);
            sub init {
                my $obj = shift;
                my ($first, $last) = @_;
                # create an object if called as class method
                $obj = bless \ my $o, $obj unless ref $obj;
                $first{ id $obj} = $first;
                $last{ id $obj} = $last;
                $obj;
            }
            sub first { $first{ id shift()} }
            sub last { $last{ id shift()} }
            sub name {
                my $n = shift;
                join ' ' => $n->first, $n->last;
            }
            sub DESTROY {
                my $id = id shift;
                delete $first{ $id};
                delete $last{ $id};
            }
        }

        {
            package Name_idhash;
            use Hash::Util::FieldHash;
            Hash::Util::FieldHash::idhashes( \ my (%first, %last) );
            sub init {
                my $obj = shift;
                my ($first, $last) = @_;
                # create an object if called as class method
                $obj = bless \ my $o, $obj unless ref $obj;
                $first{ $obj} = $first;
                $last{ $obj} = $last;
                $obj;
            }
            sub first { $first{ shift()} }
            sub last { $last{ shift()} }
            sub name {
                my $n = shift;
                join ' ' => $n->first, $n->last;
            }
            sub DESTROY {
                my $n = shift;
                delete $first{ $n};
                delete $last{ $n};
            }
        }

        {
            package Name_id_reg;
            use Hash::Util::FieldHash qw(id register);
            my (%first, %last);
            sub init {
                my $obj = shift;
                my ($first, $last) = @_;
                # create an object if called as class method
                $obj = bless \ my $o, $obj unless ref $obj;
                register( $obj, \ (%first, %last) );
                $first{ id $obj} = $first;
                $last{ id $obj} = $last;
                $obj;
            }
            sub first { $first{ id shift()} }
            sub last { $last{ id shift()} }
            sub name {
                my $n = shift;
                join ' ' => $n->first, $n->last;
            }
        }

        {
            package Name_idhash_reg;
            use Hash::Util::FieldHash qw(register);
            Hash::Util::FieldHash::idhashes \ my (%first, %last);
            sub init {
                my $obj = shift;
                my ($first, $last) = @_;
                # create an object if called as class method
                $obj = bless \ my $o, $obj unless ref $obj;
                register( $obj, \ (%first, %last) );
                $first{ $obj} = $first;
                $last{ $obj} = $last;
                $obj;
            }
            sub first { $first{ shift()} }
            sub last { $last{ shift()} }
            sub name {
                my $n = shift;
                join ' ' => $n->first, $n->last;
            }
        }

        {
            package Name_fieldhash;
            use Hash::Util::FieldHash;
            Hash::Util::FieldHash::fieldhashes \ my (%first, %last);
            sub init {
                my $obj = shift;
                my ($first, $last) = @_;
                # create an object if called as class method
                $obj = bless \ my $o, $obj unless ref $obj;
                $first{ $obj} = $first;
                $last{ $obj} = $last;
                $obj;
            }
            sub first { $first{ shift()} }
            sub last { $last{ shift()} }
            sub name {
                my $n = shift;
                join ' ' => $n->first, $n->last;
            }
        }

        1;

    To exercise the various implementations the script below can be used.

    It sets up a class Name that is a mirror of one of the implementation classes Name_hash , Name_id , ..., Name_fieldhash . That determines which implementation is run.

    The script first verifies the function of the Name class.

    In the second step, the free inheritability of the implementation (or lack thereof) is demonstrated. For this purpose it constructs a class called NamedFile which is a common subclass of Name and the standard class IO::File. This puts inheritability to the test because objects of IO::File must be globrefs. Objects of NamedFile should behave like a file opened for reading and also support the name() method. This class juncture works with the exception of the Name_hash implementation, where object initialization fails because of the incompatibility of object bodies.

    Example 2

        use strict; use warnings; $| = 1;

        use Example;

        {
            package Name;
            use base 'Name_id'; # define here which implementation to run
        }

        # Verify that the base package works
        my $n = Name->init(qw(Albert Einstein));
        print $n->name, "\n";
        print "\n";

        # Create a named file handle (See definition below)
        my $nf = NamedFile->init(qw(/tmp/x Filomena File));
        # use as a file handle...
        for ( 1 .. 3 ) {
            my $l = <$nf>;
            print "line $_: $l";
        }
        # ...and as a Name object
        print "...brought to you by ", $nf->name, "\n";
        exit;

        # Definition of NamedFile
        package NamedFile;
        use base 'Name';
        use base 'IO::File';

        sub init {
            my $obj = shift;
            my ($file, $first, $last) = @_;
            $obj = $obj->IO::File::new() unless ref $obj;
            $obj->open($file) or die "Can't read '$file': $!";
            $obj->Name::init($first, $last);
        }
        __END__

    GUTS

    To make Hash::Util::FieldHash work, there were two changes to perl itself. PERL_MAGIC_uvar was made available for hashes, and weak references now call uvar get magic after a weakref has been cleared. The first feature is used to make field hashes intercept their keys upon access. The second one triggers garbage collection.

    The PERL_MAGIC_uvar interface for hashes

    PERL_MAGIC_uvar get magic is called from hv_fetch_common and hv_delete_common through the function hv_magic_uvar_xkey , which defines the interface. The call happens for hashes with "uvar" magic if the ufuncs structure has equal values in the uf_val and uf_set fields. Hashes are unaffected if (and as long as) these fields hold different values.

    Upon the call, the mg_obj field will hold the hash key to be accessed. Upon return, the SV* value in mg_obj will be used in place of the original key in the hash access. The integer index value in the first parameter will be the action value from hv_fetch_common , or -1 if the call is from hv_delete_common .

    This is a template for a function suitable for the uf_val field in a ufuncs structure for this call. The uf_set and uf_index fields are irrelevant.

        IV watch_key(pTHX_ IV action, SV* field) {
            MAGIC* mg = mg_find(field, PERL_MAGIC_uvar);
            SV* keysv = mg->mg_obj;
            /* Do whatever you need to. If you decide to
               supply a different key newkey, return it like this
            */
            sv_2mortal(newkey);
            mg->mg_obj = newkey;
            return 0;
        }

    Weakrefs call uvar magic

    When a weak reference is stored in an SV that has "uvar" magic, set magic is called after the reference has gone stale. This hook can be used to trigger further garbage-collection activities associated with the referenced object.

    How field hashes work

    The three features of field hashes (key replacement, thread support, and garbage collection) are supported by a data structure called the object registry. This is a private hash where every object is stored. An "object" in this sense is any reference (blessed or unblessed) that has been used as a field hash key.

    The object registry keeps track of references that have been used as field hash keys. The keys are generated from the reference address like in a field hash (though the registry isn't a field hash). Each value is a weak copy of the original reference, stored in an SV that is itself magical (PERL_MAGIC_uvar again). The magical structure holds a list (another hash, really) of field hashes that the reference has been used with. When the weakref becomes stale, the magic is activated and uses the list to delete the reference from all field hashes it has been used with. After that, the entry is removed from the object registry itself. Implicitly, that frees the magic structure and the storage it has been using.

    Whenever a reference is used as a field hash key, the object registry is checked and a new entry is made if necessary. The field hash is then added to the list of fields this reference has used.

    The object registry is also used to repair a field hash after thread cloning. Here, the entire object registry is processed. For every reference found there, the field hashes it has used are visited and the entry is updated.

    Internal function Hash::Util::FieldHash::_fieldhash

        # test if %hash is a field hash
        my $result = _fieldhash \ %hash, 0;
        # make %hash a field hash
        my $result = _fieldhash \ %hash, 1;

    _fieldhash is the internal function used to create field hashes. It takes two arguments, a hashref and a mode. If the mode is boolean false, the hash is not changed but only tested to see whether it is a field hash. If the hash isn't a field hash, the return value is boolean false. If it is, the return value indicates the mode of the field hash. When called with a boolean true mode, it turns the given hash into a field hash of this mode, returning the mode of the created field hash. _fieldhash does not erase the given hash.

    Currently there is only one type of field hash, and only the boolean value of the mode makes a difference, but that may change.

    AUTHOR

    Anno Siegel (ANNO) wrote the xs code and the changes in perl proper

    Jerry Hedden (JDHEDDEN) made it faster

    COPYRIGHT AND LICENSE

    Copyright (C) 2006-2007 by (Anno Siegel)

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.7 or, at your option, any later version of Perl 5 you may have available.

     
    Getopt::Long

    NAME

    Getopt::Long - Extended processing of command line options

    SYNOPSIS

        use Getopt::Long;
        my $data = "file.dat";
        my $length = 24;
        my $verbose;
        GetOptions ("length=i" => \$length,    # numeric
                    "file=s"   => \$data,      # string
                    "verbose"  => \$verbose)   # flag
        or die("Error in command line arguments\n");

    DESCRIPTION

    The Getopt::Long module implements an extended getopt function called GetOptions(). It parses the command line from @ARGV , recognizing and removing specified options and their possible values.

    This function adheres to the POSIX syntax for command line options, with GNU extensions. In general, this means that options have long names instead of single letters, and are introduced with a double dash "--". Support for bundling of command line options, as was the case with the more traditional single-letter approach, is provided but not enabled by default.

    Command Line Options, an Introduction

    Command line operated programs traditionally take their arguments from the command line, for example filenames or other information that the program needs to know. Besides arguments, these programs often take command line options as well. Options are not necessary for the program to work, hence the name 'option', but are used to modify its default behaviour. For example, a program could do its job quietly, but with a suitable option it could provide verbose information about what it did.

    Command line options come in several flavours. Historically, they are preceded by a single dash - , and consist of a single letter.

        -l -a -c

    Usually, these single-character options can be bundled:

        -lac

    Options can have values; the value is placed after the option character, sometimes with whitespace in between, sometimes not:

        -s 24 -s24

    Due to the very cryptic nature of these options, another style was developed that used long names. So instead of a cryptic -l one could use the more descriptive --long. To distinguish between a bundle of single-character options and a long one, two dashes are used to precede the option name. Early implementations of long options used a plus + instead. Also, option values could be specified either like

        --size=24

    or

        --size 24

    The + form is now obsolete and strongly deprecated.

    Getting Started with Getopt::Long

    Getopt::Long is the Perl5 successor of newgetopt.pl . This was the first Perl module that provided support for handling the new style of command line options, in particular long option names, hence the Perl5 name Getopt::Long. This module also supports single-character options and bundling.

    To use Getopt::Long from a Perl program, you must include the following line in your Perl program:

        use Getopt::Long;

    This will load the core of the Getopt::Long module and prepare your program for using it. Most of the actual Getopt::Long code is not loaded until you really call one of its functions.

    In the default configuration, option names may be abbreviated to uniqueness, case does not matter, and a single dash is sufficient, even for long option names. Also, options may be placed between non-option arguments. See Configuring Getopt::Long for more details on how to configure Getopt::Long.

    Simple options

    The most simple options are the ones that take no values. Their mere presence on the command line enables the option. Popular examples are:

        --all --verbose --quiet --debug

    Handling simple options is straightforward:

        my $verbose = ''; # option variable with default value (false)
        my $all = ''; # option variable with default value (false)
        GetOptions ('verbose' => \$verbose, 'all' => \$all);

    The call to GetOptions() parses the command line arguments that are present in @ARGV and sets the option variable to the value 1 if the option did occur on the command line. Otherwise, the option variable is not touched. Setting the option value to true is often called enabling the option.

    The option name as specified to the GetOptions() function is called the option specification. Later we'll see that this specification can contain more than just the option name. The reference to the variable is called the option destination.

    GetOptions() will return a true value if the command line could be processed successfully. Otherwise, it will write error messages using die() and warn(), and return a false result.
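
    The behaviour described above can be checked without a real command line by assigning to @ARGV first. This is a minimal sketch; the argument values and the unknown option --bogus are made up for illustration.

```perl
use strict;
use warnings;
use Getopt::Long;

# Simulate a command line so the sketch is self-contained
@ARGV = ('--verbose', 'input.txt');

my $verbose = '';
my $all     = '';
my $ok = GetOptions('verbose' => \$verbose, 'all' => \$all);
# $ok is true, $verbose is 1, $all is still '', @ARGV holds ('input.txt')
my @rest = @ARGV;

# An unknown option produces a warning and a false return value
@ARGV = ('--bogus');
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };
my $failed = !GetOptions('verbose' => \$verbose);
```

    Capturing the warning through $SIG{__WARN__} is only done here to keep the output quiet; a program would normally let the message reach the user.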

    A little bit less simple options

    Getopt::Long supports two useful variants of simple options: negatable options and incremental options.

    A negatable option is specified with an exclamation mark ! after the option name:

        my $verbose = ''; # option variable with default value (false)
        GetOptions ('verbose!' => \$verbose);

    Now, using --verbose on the command line will enable $verbose , as expected. But it is also allowed to use --noverbose , which will disable $verbose by setting its value to 0 . Using a suitable default value, the program can find out whether $verbose is false by default, or disabled by using --noverbose .

    An incremental option is specified with a plus + after the option name:

        my $verbose = ''; # option variable with default value (false)
        GetOptions ('verbose+' => \$verbose);

    Using --verbose on the command line will increment the value of $verbose . This way the program can keep track of how many times the option occurred on the command line. For example, each occurrence of --verbose could increase the verbosity level of the program.
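
    Both variants can be combined in one call. The following sketch assumes a made-up --color option next to --verbose and simulates the command line through @ARGV.

```perl
use strict;
use warnings;
use Getopt::Long;

# Simulated command line: --verbose twice, and color switched off
@ARGV = qw(--verbose --verbose --nocolor);

my $verbose = 0;    # incremental: counts occurrences
my $color   = 1;    # negatable: enabled unless --nocolor is given

GetOptions('verbose+' => \$verbose, 'color!' => \$color)
    or die("Error in command line arguments\n");

# $verbose is now 2, $color is now 0
```

    Starting $color at 1 is what lets the program tell "disabled on the command line" (0) apart from a default.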

    Mixing command line options with other arguments

    Usually programs take command line options as well as other arguments, for example, file names. It is good practice to always specify the options first, and the other arguments last. Getopt::Long will, however, allow the options and arguments to be mixed and 'filter out' all the options before passing the rest of the arguments to the program. To stop Getopt::Long from processing further arguments, insert a double dash -- on the command line:

        --size 24 -- --all

    In this example, --all will not be treated as an option, but passed to the program unharmed, in @ARGV .
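
    A minimal sketch of this filtering, assuming the default configuration and a made-up file name input.txt placed in front of the options:

```perl
use strict;
use warnings;
use Getopt::Long;

# A file name first, then an option, then the terminator and more text
@ARGV = qw(input.txt --size 24 -- --all);

my ($size, $all) = (0, 0);
GetOptions('size=i' => \$size, 'all' => \$all)
    or die("Error in command line arguments\n");

# $size is 24, $all is still 0
# @ARGV now holds ('input.txt', '--all'); the -- itself has been removed
my @rest = @ARGV;
```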

    Options with values

    For options that take values it must be specified whether the option value is required or not, and what kind of value the option expects.

    Three kinds of values are supported: integer numbers, floating point numbers, and strings.

    If the option value is required, Getopt::Long will take the command line argument that follows the option and assign this to the option variable. If, however, the option value is specified as optional, this will only be done if that value does not look like a valid command line option itself.

        my $tag = ''; # option variable with default value
        GetOptions ('tag=s' => \$tag);

    In the option specification, the option name is followed by an equals sign = and the letter s. The equals sign indicates that this option requires a value. The letter s indicates that this value is an arbitrary string. Other possible value types are i for integer values, and f for floating point values. Using a colon : instead of the equals sign indicates that the option value is optional. In this case, if no suitable value is supplied, string valued options get an empty string '' assigned, while numeric options are set to 0 .
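
    The value types and the difference between = and : can be seen side by side in a short sketch. The option names size, ratio, and level are made up for this example.

```perl
use strict;
use warnings;
use Getopt::Long;

# --level is given without a value; the next word looks like an option
@ARGV = qw(--tag release --ratio 0.5 --level --size 24);

my ($tag, $size, $ratio, $level) = ('', 0, 0, undef);
GetOptions(
    'tag=s'   => \$tag,      # required string value
    'size=i'  => \$size,     # required integer value
    'ratio=f' => \$ratio,    # required floating point value
    'level:i' => \$level,    # optional integer value
) or die("Error in command line arguments\n");

# $tag is 'release', $ratio is 0.5, $size is 24
# $level is 0: no suitable value followed, so the numeric default applies
```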

    Options with multiple values

    Options sometimes take several values. For example, a program could use multiple directories to search for library files:

        --library lib/stdlib --library lib/extlib

    To accomplish this behaviour, simply specify an array reference as the destination for the option:

        GetOptions ("library=s" => \@libfiles);

    Alternatively, you can specify that the option can have multiple values by adding a "@", and pass a scalar reference as the destination:

        GetOptions ("library=s@" => \$libfiles);

    Used with the example above, @libfiles (or @$libfiles ) would contain two strings upon completion: "lib/stdlib" and "lib/extlib" , in that order. It is also possible to specify that only integer or floating point numbers are acceptable values.

    Often it is useful to allow comma-separated lists of values as well as multiple occurrences of the options. This is easy using Perl's split() and join() operators:

        GetOptions ("library=s" => \@libfiles);
        @libfiles = split(/,/,join(',',@libfiles));

    Of course, it is important to choose the right separator string for each purpose.
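A runnable sketch of the split/join idiom follows. The argument list is made up for the example and mixes a comma-separated value with a repeated option.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my @libfiles;
my @args = ('--library', 'lib/stdlib,lib/extlib', '--library', 'lib/local');

GetOptionsFromArray(\@args, 'library=s' => \@libfiles)
    or die "error in command line arguments\n";

# Flatten "a,b" entries and separate occurrences into one list.
@libfiles = split(/,/, join(',', @libfiles));

print "$_\n" for @libfiles;
```

This only works if a comma can never be part of a legitimate value; for path lists a different separator may be a better choice.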

    Warning: What follows is an experimental feature.

    Options can take multiple values at once, for example

        --coordinates 52.2 16.4 --rgbcolor 255 255 149

    This can be accomplished by adding a repeat specifier to the option specification. Repeat specifiers are very similar to the {...} repeat specifiers that can be used with regular expression patterns. For example, the above command line would be handled as follows:

        GetOptions('coordinates=f{2}' => \@coor, 'rgbcolor=i{3}' => \@color);

    The destination for the option must be an array or array reference.

    It is also possible to specify the minimal and maximal number of arguments an option takes. foo=s{2,4} indicates an option that takes at least two and at most 4 arguments. foo=s{1,} indicates one or more values; foo:s{,} indicates zero or more option values.
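As a minimal sketch, the command line shown earlier can be parsed from an array. Since repeat specifiers are marked experimental, treat this as an illustration of the documented behaviour, not a guarantee.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my (@coor, @color);
my @args = ('--coordinates', '52.2', '16.4',
            '--rgbcolor', '255', '255', '149');

# {2} and {3} make each occurrence of the option consume that many values.
GetOptionsFromArray(\@args,
    'coordinates=f{2}' => \@coor,
    'rgbcolor=i{3}'    => \@color,
) or die "error in command line arguments\n";

print "coordinates: @coor\n";
print "rgbcolor:    @color\n";
```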

    Options with hash values

    If the option destination is a reference to a hash, the option will take, as value, strings of the form key=value. The value will be stored with the specified key in the hash.

        GetOptions ("define=s" => \%defines);

    Alternatively you can use:

        GetOptions ("define=s%" => \$defines);

    When used with command line options:

        --define os=linux --define vendor=redhat

    the hash %defines (or %$defines ) will contain two keys, "os" with value "linux" and "vendor" with value "redhat" . It is also possible to specify that only integer or floating point numbers are acceptable values. The keys are always taken to be strings.
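The following sketch parses the example command line into a hash. The second option, limit, is a hypothetical addition that shows a hash whose values must be integers.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my (%defines, %limits);
my @args = ('--define', 'os=linux', '--define', 'vendor=redhat',
            '--limit', 'jobs=4');

GetOptionsFromArray(\@args,
    'define=s' => \%defines,   # string values
    'limit=i'  => \%limits,    # integer values, keys are still strings
) or die "error in command line arguments\n";

print "$_=$defines{$_}\n" for sort keys %defines;
print "jobs limit: $limits{jobs}\n";
```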

    User-defined subroutines to handle options

    Ultimate control over what should be done when (actually: each time) an option is encountered on the command line can be achieved by designating a reference to a subroutine (or an anonymous subroutine) as the option destination. When GetOptions() encounters the option, it will call the subroutine with two or three arguments. The first argument is the name of the option. (Actually, it is an object that stringifies to the name of the option.) For a scalar or array destination, the second argument is the value to be stored. For a hash destination, the second argument is the key to the hash, and the third argument the value to be stored. It is up to the subroutine to store the value, or do whatever it thinks is appropriate.

    A trivial application of this mechanism is to implement options that are related to each other. For example:

        my $verbose = '';   # option variable with default value (false)
        GetOptions ('verbose' => \$verbose,
                    'quiet'   => sub { $verbose = 0 });

    Here --verbose and --quiet control the same variable $verbose , but with opposite values.

    If the subroutine needs to signal an error, it should call die() with the desired error message as its argument. GetOptions() will catch the die(), issue the error message, and record that an error result must be returned upon completion.

    If the text of the error message starts with an exclamation mark ! it is interpreted specially by GetOptions(). There is currently one special command implemented: die("!FINISH") will cause GetOptions() to stop processing options, as if it encountered a double dash -- .
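Both mechanisms can be tried out in one sketch. The options level and stop, and the range check, are invented for the example; the warning is captured with $SIG{__WARN__} only so that it can be printed afterwards.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my @levels;
my $message = '';
local $SIG{__WARN__} = sub { $message .= $_[0] };

my @args = ('--level', '3', '--level', '99', '--stop', '--level', '5');

my $ok = GetOptionsFromArray(\@args,
    'level=i' => sub {
        my ($name, $value) = @_;
        # An ordinary die() is reported and makes the call return false.
        die "$name must be between 0 and 9\n" if $value > 9;
        push @levels, $value;
    },
    # "!FINISH" stops option processing, like a double dash would.
    'stop' => sub { die "!FINISH" },
);

print "accepted levels: @levels\n";
print "result: ", ($ok ? "success" : "failure"), "\n";
print "left unprocessed: @args\n";
print "reported: $message";
```

The arguments after --stop are left in @args untouched, and the overall result is false because of the rejected value 99.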

    In version 2.37 the first argument to the callback function was changed from string to object. This was done to make room for extensions and more detailed control. The object stringifies to the option name so this change should not introduce compatibility problems.

    Here is an example of how to access the option name and value from within a subroutine:

        GetOptions ('opt=i' => \&handler);
        sub handler {
            my ($opt_name, $opt_value) = @_;
            print("Option name is $opt_name and value is $opt_value\n");
        }

    Options with multiple names

    Often it is user friendly to supply alternate mnemonic names for options. For example --height could be an alternate name for --length . Alternate names can be included in the option specification, separated by vertical bar | characters. To implement the above example:

        GetOptions ('length|height=f' => \$length);

    The first name is called the primary name, the other names are called aliases. When using a hash to store options, the key will always be the primary name.

    Multiple alternate names are possible.

    Case and abbreviations

    Without additional configuration, GetOptions() will ignore the case of option names, and allow the options to be abbreviated to uniqueness.

        GetOptions ('length|height=f' => \$length, "head" => \$head);

    This call will allow --l and --L for the length option, but requires at least --hea and --hei for the head and height options.
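A short sketch of these defaults, using the same option specification with a made-up argument list:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my ($length, $head) = (0, 0);

# --L is a case-insensitive, unique abbreviation of --length;
# --hea is the shortest prefix that still identifies --head.
my @args = ('--L', '2.5', '--hea');

GetOptionsFromArray(\@args,
    'length|height=f' => \$length,
    'head'            => \$head,
) or die "error in command line arguments\n";

print "length=$length head=$head\n";
```

Abbreviations are convenient interactively, but scripts that call the program are better off spelling option names in full, since adding a new option later can make an abbreviation ambiguous.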

    Summary of Option Specifications

    Each option specifier consists of two parts: the name specification and the argument specification.

    The name specification contains the name of the option, optionally followed by a list of alternative names separated by vertical bar characters.

        length            option name is "length"
        length|size|l     name is "length", aliases are "size" and "l"

    The argument specification is optional. If omitted, the option is considered boolean: a value of 1 will be assigned when the option is used on the command line.

    The argument specification can be

    • !

      The option does not take an argument and may be negated by prefixing it with "no" or "no-". E.g. "foo!" will allow --foo (a value of 1 will be assigned) as well as --nofoo and --no-foo (a value of 0 will be assigned). If the option has aliases, this applies to the aliases as well.

      Using negation on a single letter option when bundling is in effect is pointless and will result in a warning.

    • +

      The option does not take an argument and will be incremented by 1 every time it appears on the command line. E.g. "more+" , when used with --more --more --more, will increment the value three times, resulting in a value of 3 (provided it was 0 or undefined at first).

      The + specifier is ignored if the option destination is not a scalar.

    • = type [ desttype ] [ repeat ]

      The option requires an argument of the given type. Supported types are:

      • s

        String. An arbitrary sequence of characters. It is valid for the argument to start with - or -- .

      • i

        Integer. An optional leading plus or minus sign, followed by a sequence of digits.

      • o

        Extended integer, Perl style. This can be either an optional leading plus or minus sign, followed by a sequence of digits, or an octal string (a zero, optionally followed by '0', '1', .. '7'), or a hexadecimal string (0x followed by '0' .. '9', 'a' .. 'f', case insensitive), or a binary string (0b followed by a series of '0' and '1').

      • f

        Real number. For example 3.14 , -6.23E24 and so on.

      The desttype can be @ or % to specify that the option is list or hash valued. This is only needed when the destination for the option value is not otherwise specified. It should be omitted when not needed.

      The repeat specifies the number of values this option takes per occurrence on the command line. It has the format { [ min ] [ , [ max ] ] }.

      min denotes the minimal number of arguments. It defaults to 1 for options with = and to 0 for options with : , see below. Note that min overrules the = / : semantics.

      max denotes the maximum number of arguments. It must be at least min. If max is omitted, but the comma is not, there is no upper bound to the number of argument values taken.

    • : type [ desttype ]

      Like = , but designates the argument as optional. If omitted, an empty string will be assigned to string valued options, and the value zero to numeric options.

      Note that if a string argument starts with - or -- , it will be considered an option in itself.

    • : number [ desttype ]

      Like :i , but if the value is omitted, the number will be assigned.

    • : + [ desttype ]

      Like :i , but if the value is omitted, the current value for the option will be incremented.
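The specifiers that do not need a value can be compared side by side. This is a sketch with invented option names (colour, more, jobs, debug), not a recommended interface.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my ($colour, $more, $jobs, $debug) = (1, 0, 1, 0);
my @args = ('--no-colour', '--more', '--more', '--jobs', '--debug', '--debug');

GetOptionsFromArray(\@args,
    'colour!' => \$colour,   # negatable: --colour, --nocolour, --no-colour
    'more+'   => \$more,     # incremented on every occurrence
    'jobs:4'  => \$jobs,     # optional integer, 4 when the value is omitted
    'debug:+' => \$debug,    # optional integer, incremented when omitted
) or die "error in command line arguments\n";

print "colour=$colour more=$more jobs=$jobs debug=$debug\n";
```

With these arguments the variables end up as colour=0, more=2, jobs=4 and debug=2.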

    Advanced Possibilities

    Object oriented interface

    Getopt::Long can be used in an object oriented way as well:

        use Getopt::Long;
        $p = Getopt::Long::Parser->new;
        $p->configure(...configuration options...);
        if ($p->getoptions(...options descriptions...)) ...
        if ($p->getoptionsfromarray( \@array, ...options descriptions...)) ...

    Configuration options can be passed to the constructor:

        $p = new Getopt::Long::Parser
                 config => [...configuration options...];
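A concrete version of the outline above might look like the following sketch. The configuration option no_ignore_case is chosen so that -v and -V can be told apart; the option names themselves are illustrative.

```perl
use strict;
use warnings;
use Getopt::Long;

# A parser object carries its own configuration.
my $p = Getopt::Long::Parser->new(config => ['no_ignore_case']);

my ($verbose, $version) = (0, 0);
my @args = ('-v', '-V', '-v');

$p->getoptionsfromarray(\@args,
    'v+' => \$verbose,   # lower case: verbosity counter
    'V'  => \$version,   # upper case: version flag
) or die "error in command line arguments\n";

print "verbose=$verbose version=$version\n";
```

Keeping the configuration inside the object is mainly useful when different parts of a program parse different argument lists with different settings.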

    Thread Safety

    Getopt::Long is thread safe when using ithreads as of Perl 5.8. It is not thread safe when using the older (experimental and now obsolete) threads implementation that was added to Perl 5.005.

    Documentation and help texts

    Getopt::Long encourages the use of Pod::Usage to produce help messages. For example:

        use Getopt::Long;
        use Pod::Usage;

        my $man = 0;
        my $help = 0;

        GetOptions('help|?' => \$help, man => \$man) or pod2usage(2);
        pod2usage(1) if $help;
        pod2usage(-exitval => 0, -verbose => 2) if $man;

        __END__

        =head1 NAME

        sample - Using Getopt::Long and Pod::Usage

        =head1 SYNOPSIS

        sample [options] [file ...]

         Options:
           -help            brief help message
           -man             full documentation

        =head1 OPTIONS

        =over 8

        =item B<-help>

        Print a brief help message and exits.

        =item B<-man>

        Prints the manual page and exits.

        =back

        =head1 DESCRIPTION

        B<This program> will read the given input file(s) and do something
        useful with the contents thereof.

        =cut

    See Pod::Usage for details.

    Parsing options from an arbitrary array

    By default, GetOptions parses the options that are present in the global array @ARGV . A special entry GetOptionsFromArray can be used to parse options from an arbitrary array.

        use Getopt::Long qw(GetOptionsFromArray);
        $ret = GetOptionsFromArray(\@myopts, ...);

    When used like this, options and their possible values are removed from @myopts; the global @ARGV is not touched at all.

    The following two calls behave identically:

        $ret = GetOptions( ... );
        $ret = GetOptionsFromArray(\@ARGV, ... );

    This also means that a first argument hash reference now becomes the second argument:

        $ret = GetOptions(\%opts, ... );
        $ret = GetOptionsFromArray(\@ARGV, \%opts, ... );

    Parsing options from an arbitrary string

    A special entry GetOptionsFromString can be used to parse options from an arbitrary string.

        use Getopt::Long qw(GetOptionsFromString);
        $ret = GetOptionsFromString($string, ...);

    The contents of the string are split into arguments using a call to Text::ParseWords::shellwords . As with GetOptionsFromArray , the global @ARGV is not touched.

    It is possible that, upon completion, not all arguments in the string have been processed. GetOptionsFromString will, when called in list context, return both the return status and an array reference to any remaining arguments:

        ($ret, $args) = GetOptionsFromString($string, ... );

    If any arguments remain, and GetOptionsFromString was not called in list context, a message will be given and GetOptionsFromString will return failure.

    As with GetOptionsFromArray, a first argument hash reference now becomes the second argument.
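A small sketch of the list-context form follows. The input string is made up; it contains a quoted argument to show that splitting follows shell-like quoting rules.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromString);

my $size = 0;
my $line = '--size 24 "my file.txt" other';

# In list context the leftover arguments are returned, not reported.
my ($ret, $rest) = GetOptionsFromString($line, 'size=i' => \$size);

die "error in option string\n" unless $ret;

print "size=$size\n";
print "remaining: <$_>\n" for @$rest;
```

The quoted words stay together as a single remaining argument, so @$rest holds two elements here.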

    Storing options values in a hash

    Sometimes, for example when there are a lot of options, having a separate variable for each of them can be cumbersome. GetOptions() supports, as an alternative mechanism, storing options values in a hash.

    To obtain this, a reference to a hash must be passed as the first argument to GetOptions(). For each option that is specified on the command line, the option value will be stored in the hash with the option name as key. Options that are not actually used on the command line will not be put in the hash; in other words, exists($h{option}) (or defined()) can be used to test if an option was used. The drawback is that warnings will be issued if the program runs with warnings enabled and uses $h{option} without testing with exists() or defined() first.

        my %h = ();
        GetOptions (\%h, 'length=i');       # will store in $h{length}

    For options that take list or hash values, it is necessary to indicate this by appending an @ or % sign after the type:

        GetOptions (\%h, 'colours=s@');     # will push to @{$h{colours}}

    To make things more complicated, the hash may contain references to the actual destinations, for example:

        my $len = 0;
        my %h = ('length' => \$len);
        GetOptions (\%h, 'length=i');       # will store in $len

    This example is fully equivalent with:

        my $len = 0;
        GetOptions ('length=i' => \$len);   # will store in $len

    Any mixture is possible. For example, the most frequently used options could be stored in variables while all other options get stored in the hash:

        my $verbose = 0;                    # frequently referred
        my $debug = 0;                      # frequently referred
        my %h = ('verbose' => \$verbose, 'debug' => \$debug);
        GetOptions (\%h, 'verbose', 'debug', 'filter', 'size=i');
        if ( $verbose ) { ... }
        if ( exists $h{filter} ) { ... option 'filter' was specified ... }
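The mixed style can be exercised without a real command line. In this sketch the argument list is invented, and a list-valued option (colours) is added to show the @ suffix that hash storage requires.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my $verbose = 0;
my %h = (verbose => \$verbose);   # verbose goes to its own variable

my @args = ('--verbose', '--size', '24',
            '--colours', 'red', '--colours', 'blue');

GetOptionsFromArray(\@args, \%h,
    'verbose', 'filter', 'size=i', 'colours=s@',
) or die "error in command line arguments\n";

print "verbose is on\n"          if $verbose;
print "size is $h{size}\n"       if exists $h{size};
print "colours: @{$h{colours}}\n" if exists $h{colours};
print "filter was not given\n"   unless exists $h{filter};
```

Testing with exists before use avoids "uninitialized value" warnings for options that never appeared.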

    Bundling

    With bundling it is possible to set several single-character options at once. For example if a , v and x are all valid options,

        -vax

    would set all three.

    Getopt::Long supports two levels of bundling. To enable bundling, a call to Getopt::Long::Configure is required.

    The first level of bundling can be enabled with:

        Getopt::Long::Configure ("bundling");

    Configured this way, single-character options can be bundled but long options must always start with a double dash -- to avoid ambiguity. For example, when vax , a , v and x are all valid options,

        -vax

    would set a, v and x, but

        --vax

    would set vax .

    The second level of bundling lifts this restriction. It can be enabled with:

        Getopt::Long::Configure ("bundling_override");

    Now, -vax would set the option vax .

    When any level of bundling is enabled, option values may be inserted in the bundle. For example:

        -h24w80

    is equivalent to

        -h 24 -w 80

    When configured for bundling, single-character options are matched case sensitive while long options are matched case insensitive. To have the single-character options matched case insensitive as well, use:

        Getopt::Long::Configure ("bundling", "ignorecase_always");

    It goes without saying that bundling can be quite confusing.
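To reduce the confusion somewhat, here is a sketch of first-level bundling that combines flag bundles with values embedded in a bundle. The single-letter options are the ones used in the examples above; the variable names are assumptions.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

Getopt::Long::Configure('bundling');

my ($all, $verbose, $extract, $height, $width) = (0, 0, 0, 0, 0);
my @args = ('-vax', '-h24w80');

GetOptionsFromArray(\@args,
    'a'   => \$all,
    'v'   => \$verbose,
    'x'   => \$extract,
    'h=i' => \$height,   # value may follow directly inside the bundle
    'w=i' => \$width,
) or die "error in command line arguments\n";

print "a=$all v=$verbose x=$extract h=$height w=$width\n";
```

Note that Configure changes the module-wide settings, so later calls in the same program are affected as well.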

    The lonesome dash

    Normally, a lone dash - on the command line will not be considered an option. Option processing will terminate (unless "permute" is configured) and the dash will be left in @ARGV .

    It is possible to get special treatment for a lone dash. This can be achieved by adding an option specification with an empty name, for example:

        GetOptions ('' => \$stdio);

    A lone dash on the command line will now be a legal option, and using it will set variable $stdio .
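A sketch of the empty-name specification, with an invented argument list in which the dash stands for "read standard input":

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my $stdio = 0;
my @args  = ('-', 'file.txt');

# The empty option name matches a lone dash.
GetOptionsFromArray(\@args, '' => \$stdio)
    or die "error in command line arguments\n";

print "read from standard input\n" if $stdio;
print "remaining arguments: @args\n";
```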

    Argument callback

    A special option 'name' <> can be used to designate a subroutine to handle non-option arguments. When GetOptions() encounters an argument that does not look like an option, it will immediately call this subroutine and pass it one parameter: the argument name. Well, actually it is an object that stringifies to the argument name.

    For example:

        my $width = 80;
        sub process { ... }
        GetOptions ('width=i' => \$width, '<>' => \&process);

    When applied to the following command line:

        arg1 --width=72 arg2 --width=60 arg3

    This will call process("arg1") while $width is 80 , process("arg2") while $width is 72 , and process("arg3") while $width is 60 .

    This feature requires configuration option permute, see section Configuring Getopt::Long.
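The sequence described above can be reproduced with a callback that records what it sees. This sketch relies on the default configuration, in which permute is enabled; the @seen array is an addition for the example.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

my $width = 80;
my @seen;

my @args = ('arg1', '--width=72', 'arg2', '--width=60', 'arg3');

GetOptionsFromArray(\@args,
    'width=i' => \$width,
    # Called for every non-option argument, in command line order.
    '<>' => sub {
        my ($arg) = @_;
        push @seen, "$arg at width $width";
    },
) or die "error in command line arguments\n";

print "$_\n" for @seen;
```

Because every non-option argument is handed to the callback, @args is empty afterwards.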

    Configuring Getopt::Long

    Getopt::Long can be configured by calling subroutine Getopt::Long::Configure(). This subroutine takes a list of quoted strings, each specifying a configuration option to be enabled, e.g. ignore_case , or disabled, e.g. no_ignore_case . Case does not matter. Multiple calls to Configure() are possible.

    Alternatively, as of version 2.24, the configuration options may be passed together with the use statement:

        use Getopt::Long qw(:config no_ignore_case bundling);

    The following options are available:

    • default

      This option causes all configuration options to be reset to their default values.

    • posix_default

      This option causes all configuration options to be reset to their default values as if the environment variable POSIXLY_CORRECT had been set.

    • auto_abbrev

      Allow option names to be abbreviated to uniqueness. Default is enabled unless environment variable POSIXLY_CORRECT has been set, in which case auto_abbrev is disabled.

    • getopt_compat

      Allow + to start options. Default is enabled unless environment variable POSIXLY_CORRECT has been set, in which case getopt_compat is disabled.

    • gnu_compat

      gnu_compat controls whether --opt= is allowed, and what it should do. Without gnu_compat , --opt= gives an error. With gnu_compat , --opt= will give option opt and empty value. This is the way GNU getopt_long() does it.

    • gnu_getopt

      This is a short way of setting gnu_compat bundling permute no_getopt_compat . With gnu_getopt , command line handling should be fully compatible with GNU getopt_long().

    • require_order

      Whether command line arguments are allowed to be mixed with options. Default is disabled unless environment variable POSIXLY_CORRECT has been set, in which case require_order is enabled.

      See also permute , which is the opposite of require_order .

    • permute

      Whether command line arguments are allowed to be mixed with options. Default is enabled unless environment variable POSIXLY_CORRECT has been set, in which case permute is disabled. Note that permute is the opposite of require_order .

      If permute is enabled, this means that

          --foo arg1 --bar arg2 arg3

      is equivalent to

          --foo --bar arg1 arg2 arg3

      If an argument callback routine is specified, @ARGV will always be empty upon successful return of GetOptions() since all options have been processed. The only exception is when -- is used:

          --foo arg1 --bar arg2 -- arg3

      This will call the callback routine for arg1 and arg2, and then terminate GetOptions() leaving "arg3" in @ARGV .

      If require_order is enabled, options processing terminates when the first non-option is encountered.

          --foo arg1 --bar arg2 arg3

      is equivalent to

          --foo -- arg1 --bar arg2 arg3

      If pass_through is also enabled, options processing will terminate at the first unrecognized option, or non-option, whichever comes first.

    • bundling (default: disabled)

      Enabling this option will allow single-character options to be bundled. To distinguish bundles from long option names, long options must be introduced with -- and bundles with - .

      Note that, if you have options a , l and all , and auto_abbrev enabled, possible arguments and option settings are:

          using argument               sets option(s)
          ------------------------------------------
          -a, --a                      a
          -l, --l                      l
          -al, -la, -ala, -all,...     a, l
          --al, --all                  all

      The surprising part is that --a sets option a (due to auto completion), not all .

      Note: disabling bundling also disables bundling_override .

    • bundling_override (default: disabled)

      If bundling_override is enabled, bundling is enabled as with bundling but now long option names override option bundles.

      Note: disabling bundling_override also disables bundling .

      Note: Using option bundling can easily lead to unexpected results, especially when mixing long options and bundles. Caveat emptor.

    • ignore_case (default: enabled)

      If enabled, case is ignored when matching option names. If, however, bundling is enabled as well, single character options will be treated case-sensitive.

      With ignore_case , option specifications for options that only differ in case, e.g., "foo" and "Foo" , will be flagged as duplicates.

      Note: disabling ignore_case also disables ignore_case_always .

    • ignore_case_always (default: disabled)

      When bundling is in effect, case is ignored on single-character options also.

      Note: disabling ignore_case_always also disables ignore_case .

    • auto_version (default: disabled)

      Automatically provide support for the --version option if the application did not specify a handler for this option itself.

      Getopt::Long will provide a standard version message that includes the program name, its version (if $main::VERSION is defined), and the versions of Getopt::Long and Perl. The message will be written to standard output and processing will terminate.

      auto_version will be enabled if the calling program explicitly specified a version number higher than 2.32 in the use or require statement.

    • auto_help (default: disabled)

      Automatically provide support for the --help and -? options if the application did not specify a handler for this option itself.

      Getopt::Long will provide a help message using module Pod::Usage. The message, derived from the SYNOPSIS POD section, will be written to standard output and processing will terminate.

      auto_help will be enabled if the calling program explicitly specified a version number higher than 2.32 in the use or require statement.

    • pass_through (default: disabled)

      Options that are unknown, ambiguous or supplied with an invalid option value are passed through in @ARGV instead of being flagged as errors. This makes it possible to write wrapper scripts that process only part of the user supplied command line arguments, and pass the remaining options to some other program.

      If require_order is enabled, options processing will terminate at the first unrecognized option, or non-option, whichever comes first. However, if permute is enabled instead, results can become confusing.

      Note that the options terminator (default -- ), if present, will also be passed through in @ARGV .

    • prefix

      The string that starts options. If a constant string is not sufficient, see prefix_pattern .

    • prefix_pattern

      A Perl pattern that identifies the strings that introduce options. Default is --|-|\+ unless environment variable POSIXLY_CORRECT has been set, in which case it is --|-.

    • long_prefix_pattern

      A Perl pattern that allows the disambiguation of long and short prefixes. Default is -- .

      Typically you only need to set this if you are using nonstandard prefixes and want some or all of them to have the same semantics as '--' does under normal circumstances.

      For example, setting prefix_pattern to --|-|\+|\/ and long_prefix_pattern to --|\/ would add Win32 style argument handling.

    • debug (default: disabled)

      Enable debugging output.
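As an illustration of one of these options, the following sketch shows how pass_through lets a wrapper pick out its own option and leave everything else for another program. The option names and arguments are invented.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

Getopt::Long::Configure('pass_through');

my $verbose = 0;
my @args = ('--verbose', '--colour=red', 'file.txt', '--', '-x');

# Only --verbose is known here; the rest must survive untouched.
GetOptionsFromArray(\@args, 'verbose' => \$verbose)
    or die "error in command line arguments\n";

print "verbose=$verbose\n";
print "passed through: @args\n";
```

The unknown option, the file name, the options terminator and everything after it all remain in @args, in their original order.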

    Exportable Methods

    • VersionMessage

      This subroutine provides a standard version message. Its argument can be:

      • A string containing the text of a message to print before printing the standard message.

      • A numeric value corresponding to the desired exit status.

      • A reference to a hash.

      If more than one argument is given then the entire argument list is assumed to be a hash. If a hash is supplied (either as a reference or as a list) it should contain one or more elements with the following keys:

      • -message
      • -msg

        The text of a message to print immediately prior to printing the program's usage message.

      • -exitval

        The desired exit status to pass to the exit() function. This should be an integer, or else the string "NOEXIT" to indicate that control should simply be returned without terminating the invoking process.

      • -output

        A reference to a filehandle, or the pathname of a file to which the usage message should be written. The default is \*STDERR unless the exit value is less than 2 (in which case the default is \*STDOUT ).

      You cannot tie this routine directly to an option, e.g.:

          GetOptions("version" => \&VersionMessage);

      Use this instead:

          GetOptions("version" => sub { VersionMessage() });

    • HelpMessage

      This subroutine produces a standard help message, derived from the program's POD section SYNOPSIS using Pod::Usage. It takes the same arguments as VersionMessage(). In particular, you cannot tie it directly to an option, e.g.:

          GetOptions("help" => \&HelpMessage);

      Use this instead:

          GetOptions("help" => sub { HelpMessage() });

    Return values and Errors

    Configuration errors and errors in the option definitions are signalled using die() and will terminate the calling program unless the call to Getopt::Long::GetOptions() was embedded in eval { ... } , or die() was trapped using $SIG{__DIE__} .

    GetOptions returns true to indicate success. It returns false when the function detected one or more errors during option parsing. These errors are signalled using warn() and can be trapped with $SIG{__WARN__} .
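The two error channels can be observed separately. This is a sketch with a deliberately bad argument list and a deliberately invalid specification ('size=z'); the exact wording of the messages is not relied upon.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# 1. Parse errors: reported with warn(), result is false.
my @warnings;
my $size = 0;
my $ok;
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    my @args = ('--size', 'big', '--bogus');
    $ok = GetOptionsFromArray(\@args, 'size=i' => \$size);
}
print "parse ", ($ok ? "succeeded" : "failed"), "\n";
print "  $_" for @warnings;

# 2. Errors in the option definitions: reported with die().
my @none;
my $died = !eval { GetOptionsFromArray(\@none, 'size=z' => \$size); 1 };
print "bad specification was fatal\n" if $died;
```

Trapping the warnings is optional; most programs simply let them reach standard error and print a usage message when the result is false.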

    Legacy

    The earliest development of newgetopt.pl started in 1990, with Perl version 4. As a result, its development, and the development of Getopt::Long, has gone through several stages. Since backward compatibility has always been extremely important, the current version of Getopt::Long still supports a lot of constructs that nowadays are no longer necessary or otherwise unwanted. This section describes briefly some of these 'features'.

    Default destinations

    When no destination is specified for an option, GetOptions will store the resultant value in a global variable named opt_XXX, where XXX is the primary name of this option. When a program executes under use strict (recommended), these variables must be pre-declared with our() or use vars.

        our $opt_length = 0;
        GetOptions ('length=i');    # will store in $opt_length

    To yield a usable Perl variable, characters that are not part of the syntax for variables are translated to underscores. For example, --fpp-struct-return will set the variable $opt_fpp_struct_return . Note that this variable resides in the namespace of the calling program, not necessarily main . For example:

        GetOptions ("size=i", "sizes=i@");

    with command line "-size 10 -sizes 24 -sizes 48" will perform the equivalent of the assignments

        $opt_size = 10;
        @opt_sizes = (24, 48);

    Alternative option starters

    A string of alternative option starter characters may be passed as the first argument (or the first argument after a leading hash reference argument).

        my $len = 0;
        GetOptions ('/', 'length=i' => \$len);

    Now the command line may look like:

        /length 24 -- arg

    Note that to terminate options processing still requires a double dash -- .

    GetOptions() will not interpret a leading "<>" as option starters if the next argument is a reference. To force "<" and ">" as option starters, use "><" . Confusing? Well, using a starter argument is strongly deprecated anyway.

    Configuration variables

    Previous versions of Getopt::Long used variables for the purpose of configuring. Although manipulating these variables still works, it is strongly encouraged to use the Configure routine that was introduced in version 2.17. Besides, it is much easier.

    Tips and Techniques

    Pushing multiple values in a hash option

    Sometimes you want to combine the best of hashes and arrays. For example, consider the command line:

        --list add=first --list add=second --list add=third

    Here each successive --list add=... option should push its value onto the array referenced by $list->{add}, so that the result is equivalent to

        $list->{add} = [qw(first second third)];

    This can be accomplished with a destination routine:

        GetOptions('list=s%' =>
                   sub { push(@{$list->{$_[1]}}, $_[2]) });

    Troubleshooting

    GetOptions does not return a false result when an option is not supplied

    That's why they're called 'options'.

    GetOptions does not split the command line correctly

    The command line is not split by GetOptions, but by the command line interpreter (CLI). On Unix, this is the shell. On Windows, it is COMMAND.COM or CMD.EXE. Other operating systems have other CLIs.

    It is important to know that these CLIs may behave differently when the command line contains special characters, in particular quotes or backslashes. For example, with Unix shells you can use single quotes (') and double quotes (") to group words together. The following alternatives are equivalent on Unix:

        "two words"
        'two words'
        two\ words

    In case of doubt, insert the following statement in front of your Perl program:

        print STDERR (join("|",@ARGV),"\n");

    to verify how your CLI passes the arguments to the program.

    Undefined subroutine &main::GetOptions called

    Are you running Windows, and did you write

        use GetOpt::Long;

    (note the capital 'O')?

    How do I put a "-?" option into a Getopt::Long?

    You can only obtain this using an alias, and Getopt::Long of at least version 2.13.

        use Getopt::Long;
        GetOptions ("help|?");    # -help and -? will both set $opt_help

    Other characters that can't appear in Perl identifiers are also supported as aliases with Getopt::Long of at least version 2.39.

    As of version 2.32 Getopt::Long provides auto-help, a quick and easy way to add the options --help and -? to your program, and handle them.

    See auto_help in section Configuring Getopt::Long.

    AUTHOR

    Johan Vromans <jvromans@squirrel.nl>

    COPYRIGHT AND DISCLAIMER

    This program is Copyright 1990,2010 by Johan Vromans. This program is free software; you can redistribute it and/or modify it under the terms of the Perl Artistic License or the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

    If you do not have a copy of the GNU General Public License write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

     
    Getopt::Std

    NAME

    getopt, getopts - Process single-character switches with switch clustering

    SYNOPSIS

        use Getopt::Std;

        getopt('oDI');            # -o, -D & -I take arg. Sets $opt_* as a side effect.
        getopt('oDI', \%opts);    # -o, -D & -I take arg. Values in %opts
        getopts('oif:');          # -o & -i are boolean flags, -f takes an argument
                                  # Sets $opt_* as a side effect.
        getopts('oif:', \%opts);  # options as above. Values in %opts

    DESCRIPTION

    The getopt() function processes single-character switches with switch clustering. Pass one argument which is a string containing all switches that take an argument. For each switch found, sets $opt_x (where x is the switch name) to the value of the argument if an argument is expected, or 1 otherwise. Switches which take an argument don't care whether there is a space between the switch and the argument.

    The getopts() function is similar, but you should pass to it the list of all switches to be recognized. If unspecified switches are found on the command-line, the user will be warned that an unknown option was given. The getopts() function returns true unless an invalid option was found.

    Note that, if your code is running under the recommended use strict 'vars' pragma, you will need to declare these package variables with "our":

        our($opt_x, $opt_y);

    For those of you who don't like additional global variables being created, getopt() and getopts() will also accept a hash reference as an optional second argument. Each hash key is a switch name x, and its value is the argument given to that switch, or 1 if the switch takes no argument.

    To allow programs to process arguments that look like switches, but aren't, both functions will stop processing switches when they see the argument -- . The -- will be removed from @ARGV.
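    The hash-reference form and the handling of -- can be sketched as follows. The helper parse_switches and the sample arguments are hypothetical; local @ARGV only stands in for a real command line.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper: run getopts() on a list of arguments.
sub parse_switches {
    local @ARGV = @_;                    # getopts() consumes switches from @ARGV
    my %opts;
    my $ok = getopts('oif:', \%opts);    # -o and -i are flags; -f takes a value
    return ($ok, \%opts, [@ARGV]);       # whatever is left was not a switch
}

# "-oi" is a cluster of two flags; everything after "--" is left alone.
my ($ok, $opts, $rest) =
    parse_switches('-oi', '-f', 'out.txt', '--', '-x', 'file');

print join(',', map { "$_=$opts->{$_}" } sort keys %$opts), "\n";
print "remaining: @$rest\n";
```

    Because the values land in %opts, no $opt_* package variables need to be declared.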

    --help and --version

    If - is not a recognized switch letter, getopts() supports arguments --help and --version . If main::HELP_MESSAGE() and/or main::VERSION_MESSAGE() are defined, they are called; the arguments are the output file handle, the name of option-processing package, its version, and the switches string. If the subroutines are not defined, an attempt is made to generate intelligent messages; for best results, define $main::VERSION.

    If embedded documentation (in pod format, see perlpod) is detected in the script, --help will also show how to access the documentation.

    Note that due to excessive paranoia, if $Getopt::Std::STANDARD_HELP_VERSION isn't true (the default is false), then the messages are printed on STDERR, and the processing continues after the messages are printed. This being the opposite of the standard-conforming behaviour, it is strongly recommended to set $Getopt::Std::STANDARD_HELP_VERSION to true.

    One can change the output file handle of the messages by setting $Getopt::Std::OUTPUT_HELP_VERSION. One can print the messages of --help (without the Usage: line) and --version by calling functions help_mess() and version_mess() with the switches string as an argument.

     
    Filter::Simple

    NAME

    Filter::Simple - Simplified source filtering

    SYNOPSIS

        # in MyFilter.pm:

        package MyFilter;

        use Filter::Simple;

        FILTER { ... };

        # or just:
        #
        # use Filter::Simple sub { ... };

        # in user's code:

        use MyFilter;

        # this code is filtered

        no MyFilter;

        # this code is not

    DESCRIPTION

    The Problem

    Source filtering is an immensely powerful feature of recent versions of Perl. It allows one to extend the language itself (e.g. the Switch module), to simplify the language (e.g. Language::Pythonesque), or to completely recast the language (e.g. Lingua::Romana::Perligata). Effectively, it allows one to use the full power of Perl as its own, recursively applied, macro language.

    The excellent Filter::Util::Call module (by Paul Marquess) provides a usable Perl interface to source filtering, but it is often too powerful and not nearly as simple as it could be.

    To use the module it is necessary to do the following:

    1. Download, build, and install the Filter::Util::Call module. (If you have Perl 5.7.1 or later, this is already done for you.)

    2. Set up a module that does a use Filter::Util::Call.

    3. Within that module, create an import subroutine.

    4. Within the import subroutine do a call to filter_add, passing it a subroutine reference.

    5. Within the subroutine reference, call filter_read or filter_read_exact to "prime" $_ with source code data from the source file that will use your module. Check the status value returned to see if any source code was actually read in.

    6. Process the contents of $_ to change the source code in the desired manner.

    7. Return the status value.

    8. If the act of unimporting your module (via a no) should cause source code filtering to cease, create an unimport subroutine, and have it call filter_del. Make sure that the call to filter_read or filter_read_exact in step 5 will not accidentally read past the no. Effectively this limits source code filters to line-by-line operation, unless the import subroutine does some fancy pre-pre-parsing of the source code it's filtering.

    For example, here is a minimal source code filter in a module named BANG.pm. It simply converts every occurrence of the sequence BANG\s+BANG to the sequence die 'BANG' if $BANG in any piece of code following a use BANG; statement (until the next no BANG; statement, if any):

        package BANG;

        use Filter::Util::Call ;

        sub import {
            my ($class) = @_;    # "BANG", the package named in "use BANG;"
            filter_add( sub {
                my ($status, $no_seen, $data);
                # Collect source lines until EOF or a "no BANG;" line
                while ($status = filter_read()) {
                    if (/^\s*no\s+$class\s*;\s*?$/) {
                        $no_seen=1;
                        last;
                    }
                    $data .= $_;
                    $_ = "";
                }
                $_ = $data;
                s/BANG\s+BANG/die 'BANG' if \$BANG/g
                    unless $status < 0;
                # Put the terminator back so that unimport still runs
                $_ .= "no $class;\n" if $no_seen;
                return 1;
            })
        }

        sub unimport {
            filter_del();
        }

        1 ;

    This level of sophistication puts filtering out of the reach of many programmers.

    A Solution

    The Filter::Simple module provides a simplified interface to Filter::Util::Call; one that is sufficient for most common cases.

    Instead of the above process, with Filter::Simple the task of setting up a source code filter is reduced to:

    1. Download and install the Filter::Simple module. (If you have Perl 5.7.1 or later, this is already done for you.)

    2. Set up a module that does a use Filter::Simple and then calls FILTER { ... }.

    3. Within the anonymous subroutine or block that is passed to FILTER, process the contents of $_ to change the source code in the desired manner.

    In other words, the previous example would become:

        package BANG;
        use Filter::Simple;

        FILTER {
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        };

        1 ;

    Note that the source code is passed as a single string, so any regex that uses ^ or $ to detect line boundaries will need the /m flag.
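    The effect of the /m flag can be seen on an ordinary multi-line string. This is only an illustration: $source is a hypothetical stand-in for the text a filter would receive in $_, and no filter is installed.

```perl
use strict;
use warnings;

# Stand-in for the source text a filter receives: several lines in one string.
my $source = "foo();\n# TODO one\nbar();\n# TODO two\n";

# Without /m, ^ matches only at the very start of the string,
# so nothing is replaced here.
(my $without_m = $source) =~ s/^# TODO/# DONE/g;

# With /m, ^ matches at the start of every line.
(my $with_m = $source) =~ s/^# TODO/# DONE/mg;

print "unchanged without /m\n" if $without_m eq $source;
print $with_m;
```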

    Disabling or changing <no> behaviour

    By default, the installed filter only filters up to a line consisting of one of the three standard source "terminators":

        no ModuleName;  # optional comment

    or:

        __END__

    or:

        __DATA__

    but this can be altered by passing a second argument to use Filter::Simple or FILTER (just remember: there's no comma after the initial block when you use FILTER ).

    That second argument may be either a qr'd regular expression (which is then used to match the terminator line), or a defined false value (which indicates that no terminator line should be looked for), or a reference to a hash (in which case the terminator is the value associated with the key 'terminator').

    For example, to cause the previous filter to filter only up to a line of the form:

        GNAB esu;

    you would write:

        package BANG;
        use Filter::Simple;

        FILTER {
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        }
        qr/^\s*GNAB\s+esu\s*;\s*?$/;

    or:

        FILTER {
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        }
        { terminator => qr/^\s*GNAB\s+esu\s*;\s*?$/ };

    and to prevent the filter's being turned off in any way:

        package BANG;
        use Filter::Simple;

        FILTER {
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        }
        "";    # or: 0

    or:

        FILTER {
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        }
        { terminator => "" };

    Note that, no matter what you set the terminator pattern to, the actual terminator itself must be contained on a single source line.
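    To check what a custom terminator pattern accepts, it can be tried against individual lines before it is handed to FILTER. The sketch below assumes, as the single-line requirement suggests, that the pattern is tested against one source line at a time; the helper is_terminator is hypothetical and is not part of Filter::Simple.

```perl
use strict;
use warnings;

# The terminator pattern used in the GNAB example.
my $terminator = qr/^\s*GNAB\s+esu\s*;\s*?$/;

# Hypothetical helper: does a single source line count as the terminator?
sub is_terminator {
    my ($line) = @_;
    return $line =~ $terminator ? 1 : 0;
}

for my $line ("GNAB esu;\n", "  GNAB   esu ;\n", "GNAB esu; # comment\n") {
    chomp(my $shown = $line);
    printf "%-22s %s\n", $shown,
        is_terminator($line) ? 'terminator' : 'ordinary line';
}
```

    Unlike the default no ModuleName; terminator, this particular pattern does not allow a trailing comment, so the third line is treated as ordinary source.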

    All-in-one interface

    Separating the loading of Filter::Simple:

        use Filter::Simple;

    from the setting up of the filtering:

        FILTER { ... };

    is useful because it allows other code (typically parser support code or caching variables) to be defined before the filter is invoked. However, there is often no need for such a separation.

    In those cases, it is easier to just append the filtering subroutine and any terminator specification directly to the use statement that loads Filter::Simple, like so:

        use Filter::Simple sub {
            s/BANG\s+BANG/die 'BANG' if \$BANG/g;
        };

    This is exactly the same as:

        use Filter::Simple;
        BEGIN {
            Filter::Simple::FILTER {
                s/BANG\s+BANG/die 'BANG' if \$BANG/g;
            };
        }

    except that the FILTER subroutine is not exported by Filter::Simple.

    Filtering only specific components of source code

    One of the problems with a filter like:

        use Filter::Simple;

        FILTER { s/BANG\s+BANG/die 'BANG' if \$BANG/g };

    is that it indiscriminately applies the specified transformation to the entire text of your source program. So something like:

        warn "BANG BANG, YOU'RE DEAD";
        BANG BANG;

    will become:

        warn "die 'BANG' if $BANG, YOU'RE DEAD";
        die 'BANG' if $BANG;

    It is very common when filtering source to only want to apply the filter to the non-character-string parts of the code, or alternatively to only the character strings.
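    The indiscriminate behaviour is easy to reproduce on a plain string. In this sketch $source is a hypothetical piece of program text and the substitution is the one used by the BANG filter; no filter is actually installed.

```perl
use strict;
use warnings;

# Hypothetical source text; the first line has BANG BANG inside a string literal.
my $source = <<'END_SOURCE';
warn "BANG BANG, YOU'RE DEAD";
BANG BANG;
END_SOURCE

# The same substitution the BANG filter applies to $_.
(my $filtered = $source) =~ s/BANG\s+BANG/die 'BANG' if \$BANG/g;

# Both the string literal and the real statement have been rewritten.
print $filtered;
```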

    Filter::Simple supports this type of filtering by automatically exporting the FILTER_ONLY subroutine.

    FILTER_ONLY takes a sequence of specifiers that install separate (and possibly multiple) filters that act on only parts of the source code. For example:

        use Filter::Simple;

        FILTER_ONLY
            code      => sub { s/BANG\s+BANG/die 'BANG' if \$BANG/g },
            quotelike => sub { s/BANG\s+BANG/CHITTY CHITTY/g };

    The "code" subroutine will only be used to filter parts of the source code that are not quotelikes, POD, or __DATA__ . The quotelike subroutine only filters Perl quotelikes (including here documents).

    The full list of alternatives is:

    • "code"

      Filters only those sections of the source code that are not quotelikes, POD, or __DATA__ .

    • "code_no_comments"

      Filters only those sections of the source code that are not quotelikes, POD, comments, or __DATA__ .

    • "executable"

      Filters only those sections of the source code that are not POD or __DATA__ .

    • "executable_no_comments"

      Filters only those sections of the source code that are not POD, comments, or __DATA__ .

    • "quotelike"

      Filters only Perl quotelikes (as interpreted by &Text::Balanced::extract_quotelike ).

    • "string"

      Filters only the string literal parts of a Perl quotelike (i.e. the contents of a string literal, either half of a tr///, the second half of an s///).

    • "regex"

      Filters only the pattern literal parts of a Perl quotelike (i.e. the contents of a qr// or an m//, the first half of an s///).

    • "all"

      Filters everything. Identical in effect to FILTER .

    Except for FILTER_ONLY code => sub {...} , each of the component filters is called repeatedly, once for each component found in the source code.

    Note that you can also apply two or more of the same type of filter in a single FILTER_ONLY . For example, here's a simple macro-preprocessor that is only applied within regexes, with a final debugging pass that prints the resulting source code:

        use Regexp::Common;
        FILTER_ONLY
            regex => sub { s/!\[/[^/g },
            regex => sub { s/%d/$RE{num}{int}/g },
            regex => sub { s/%f/$RE{num}{real}/g },
            all   => sub { print if $::DEBUG };

    Filtering only the code parts of source code

    Most source code ceases to be grammatically correct when it is broken up into the pieces between string literals and regexes. So the 'code' and 'code_no_comments' component filters behave slightly differently from the other partial filters described in the previous section.

    Rather than calling the specified processor on each individual piece of code (i.e. on the bits between quotelikes), the 'code...' partial filters operate on the entire source code, but with the quotelike bits (and, in the case of 'code_no_comments' , the comments) "blanked out".

    That is, a 'code...' filter replaces each quoted string, quotelike, regex, POD, and __DATA__ section with a placeholder. The delimiters of this placeholder are the contents of the $; variable at the time the filter is applied (normally "\034" ). The remaining four bytes are a unique identifier for the component being replaced.

    This approach makes it comparatively easy to write code preprocessors without worrying about the form or contents of strings, regexes, etc.

    For convenience, during a 'code...' filtering operation, Filter::Simple provides a package variable ($Filter::Simple::placeholder ) that contains a pre-compiled regex that matches any placeholder...and captures the identifier within the placeholder. Placeholders can be moved and re-ordered within the source code as needed.

    In addition, a second package variable (@Filter::Simple::components ) contains a list of the various pieces of $_ , as they were originally split up to allow placeholders to be inserted.

    Once the filtering has been applied, the original strings, regexes, POD, etc. are re-inserted into the code, by replacing each placeholder with the corresponding original component (from @components). Note that this means that the @components variable must be treated with extreme care within the filter. The @components array stores the "back-translations" of each placeholder inserted into $_, as well as the interstitial source code between placeholders. If the placeholder back-translations are altered in @components, they will be similarly changed when the placeholders are removed from $_ after the filter is complete.

    For example, the following filter detects concatenated pairs of strings/quotelikes and reverses the order in which they are concatenated:

        package DemoRevCat;
        use Filter::Simple;

        FILTER_ONLY code => sub {
            my $ph = $Filter::Simple::placeholder;
            s{ ($ph) \s* [.] \s* ($ph) }{ $2.$1 }gx
        };

    Thus, the following code:

        use DemoRevCat;

        my $str = "abc" . q(def);

        print "$str\n";

    would become:

        my $str = q(def)."abc";

        print "$str\n";

    and hence print:

        defabc

    Using Filter::Simple with an explicit import subroutine

    Filter::Simple generates a special import subroutine for your module (see How it works) which would normally replace any import subroutine you might have explicitly declared.

    However, Filter::Simple is smart enough to notice your existing import and Do The Right Thing with it. That is, if you explicitly define an import subroutine in a package that's using Filter::Simple, that import subroutine will still be invoked immediately after any filter you install.

    The only thing you have to remember is that the import subroutine must be declared before the filter is installed. If you use FILTER to install the filter:

        package Filter::TurnItUpTo11;

        use Filter::Simple;

        FILTER { s/(\w+)/\U$1/ };

    that will almost never be a problem, but if you install a filtering subroutine by passing it directly to the use Filter::Simple statement:

        package Filter::TurnItUpTo11;

        use Filter::Simple sub{ s/(\w+)/\U$1/ };

    then you must make sure that your import subroutine appears before that use statement.

    Using Filter::Simple and Exporter together

    Likewise, Filter::Simple is also smart enough to Do The Right Thing if you use Exporter:

        package Switch;
        use base 'Exporter';
        use Filter::Simple;

        @EXPORT    = qw(switch case);
        @EXPORT_OK = qw(given when);

        FILTER { $_ = magic_Perl_filter($_) };

    Immediately after the filter has been applied to the source, Filter::Simple will pass control to Exporter, so it can do its magic too.

    Of course, here too, Filter::Simple has to know you're using Exporter before it applies the filter. That's almost never a problem, but if you're nervous about it, you can guarantee that things will work correctly by ensuring that your use base Exporter always precedes your use Filter::Simple .

    How it works

    The Filter::Simple module exports into the package that calls FILTER (or uses it directly) -- such as package "BANG" in the above example -- two automagically constructed subroutines -- import and unimport -- which take care of all the nasty details.

    In addition, the generated import subroutine passes its own argument list to the filtering subroutine, so the BANG.pm filter could easily be made parametric:

        package BANG;

        use Filter::Simple;

        FILTER {
            my ($die_msg, $var_name) = @_;
            s/BANG\s+BANG/die '$die_msg' if \${$var_name}/g;
        };

        # and in some user code:

        use BANG "BOOM", "BAM";  # "BANG BANG" becomes: die 'BOOM' if $BAM

    The specified filtering subroutine is called every time a use BANG is encountered, and passed all the source code following that call, up to either the next no BANG; (or whatever terminator you've set) or the end of the source file, whichever occurs first. By default, any no BANG; call must appear by itself on a separate line, or it is ignored.

    AUTHOR

    Damian Conway

    CONTACT

    Filter::Simple is now maintained by the Perl5-Porters. Please submit bugs via the perlbug tool that comes with your perl. For usage instructions, read perldoc perlbug or possibly man perlbug. For mostly anything else, please contact <perl5-porters@perl.org>.

    Maintainer of the CPAN release is Steffen Mueller <smueller@cpan.org>. Contact him with technical difficulties with respect to the packaging of the CPAN module.

    Praise of the module, flowers, and presents still go to the author, Damian Conway <damian@conway.org>.

    COPYRIGHT AND LICENSE

        Copyright (c) 2000-2008, Damian Conway. All Rights Reserved.
        This module is free software. It may be used, redistributed
        and/or modified under the same terms as Perl itself.
     
    Filter::Util::Call

    NAME

    Filter::Util::Call - Perl Source Filter Utility Module

    SYNOPSIS

        use Filter::Util::Call ;

    DESCRIPTION

    This module provides you with the framework to write Source Filters in Perl.

    An alternate interface to Filter::Util::Call is now available. See Filter::Simple for more details.

    A Perl Source Filter is implemented as a Perl module. The structure of the module can take one of two broadly similar formats. To distinguish between them, the first will be referred to as method filter and the second as closure filter.

    Here is a skeleton for the method filter:

        package MyFilter ;

        use Filter::Util::Call ;

        sub import
        {
            my($type, @arguments) = @_ ;
            filter_add([]) ;
        }

        sub filter
        {
            my($self) = @_ ;
            my($status) ;

            $status = filter_read() ;
            $status ;
        }

        1 ;

    and this is the equivalent skeleton for the closure filter:

        package MyFilter ;

        use Filter::Util::Call ;

        sub import
        {
            my($type, @arguments) = @_ ;

            filter_add(
                sub
                {
                    my($status) ;
                    $status = filter_read() ;
                    $status ;
                } )
        }

        1 ;

    To make use of either of the two filter modules above, place the line below in a Perl source file.

        use MyFilter;

    In fact, the skeleton modules shown above are fully functional Source Filters, albeit fairly useless ones. All they do is filter the source stream without modifying it at all.

    As you can see both modules have a broadly similar structure. They both make use of the Filter::Util::Call module and both have an import method. The difference between them is that the method filter requires a filter method, whereas the closure filter gets the equivalent of a filter method with the anonymous sub passed to filter_add.

    To make proper use of the closure filter shown above you need to have a good understanding of the concept of a closure. See perlref for more details on the mechanics of closures.

    use Filter::Util::Call

    The following functions are exported by Filter::Util::Call :

        filter_add()
        filter_read()
        filter_read_exact()
        filter_del()

    import()

    The import method is used to create an instance of the filter. It is called indirectly by Perl when it encounters the use MyFilter line in a source file (See import for more details on import).

    It will always have at least one parameter automatically passed by Perl - this corresponds to the name of the package. In the example above it will be "MyFilter" .

    Apart from the first parameter, import can accept an optional list of parameters. These can be used to pass parameters to the filter. For example:

        use MyFilter qw(a b c) ;

    will result in the @_ array having the following values:

        $_[0] => "MyFilter"
        $_[1] => "a"
        $_[2] => "b"
        $_[3] => "c"

    Before terminating, the import function must explicitly install the filter by calling filter_add .

    filter_add()

    The function, filter_add , actually installs the filter. It takes one parameter which should be a reference. The kind of reference used will dictate which of the two filter types will be used.

    If a CODE reference is used then a closure filter will be assumed.

    If a CODE reference is not used, a method filter will be assumed. In a method filter, the reference can be used to store context information. The reference will be blessed into the package by filter_add .

    See the filters at the end of this document for examples of using context information with both method filters and closure filters.

    filter() and anonymous sub

    The filter method used with a method filter and the anonymous sub used with a closure filter are where the main processing for the filter is done.

    The big difference between the two types of filter is that the method filter uses the object passed to the method to store any context data, whereas the closure filter uses the lexical variables that are maintained by the closure.
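    The closure half of that distinction can be illustrated in plain Perl, without installing a filter. In this sketch make_subst is a hypothetical helper: the anonymous sub it returns keeps $from, $to and a running count alive in lexical variables, which is the same mechanism a closure filter relies on for its context.

```perl
use strict;
use warnings;

# Hypothetical helper: build a closure that remembers $from, $to and a counter.
sub make_subst {
    my ($from, $to) = @_;
    my $count = 0;                        # context kept alive by the closure
    return sub {
        my ($line) = @_;
        # s///g returns the number of substitutions made (false if none)
        $count += ($line =~ s/\Q$from\E/$to/g) || 0;
        return ($line, $count);
    };
}

my $subst = make_subst('Joe', 'Jim');
my ($line, $total) = $subst->("Where is Joe?\n");
print $line;
print "substitutions so far: $total\n";
```

    A method filter would hold the same data in the object passed to filter instead of in lexical variables.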

    Note that the single parameter passed to the method filter, $self , is the same reference that was passed to filter_add blessed into the filter's package. See the example filters later on for details of using $self .

    Here is a list of the common features of the anonymous sub and the filter() method.

    • $_

      Although $_ doesn't actually appear explicitly in the sample filters above, it is implicitly used in a number of places.

      Firstly, when either filter or the anonymous sub are called, a local copy of $_ will automatically be created. It will always contain the empty string at this point.

      Next, both filter_read and filter_read_exact will append any source data that is read to the end of $_ .

      Finally, when filter or the anonymous sub are finished processing, they are expected to return the filtered source using $_ .

      This implicit use of $_ greatly simplifies the filter.

    • $status

      The status values returned by the user's filter method or anonymous sub, and by the filter_read and filter_read_exact functions, all take the same set of values, namely:

          < 0  Error
          = 0  EOF
          > 0  OK

    • filter_read and filter_read_exact

      These functions are used by the filter to obtain either a line or block from the next filter in the chain or the actual source file if there aren't any other filters.

      The function filter_read takes two forms:

          $status = filter_read() ;
          $status = filter_read($size) ;

      The first form is used to request a line, the second requests a block.

      In line mode, filter_read will append the next source line to the end of the $_ scalar.

      In block mode, filter_read will append a block of data which is <= $size to the end of the $_ scalar. It is important to emphasise that filter_read will not necessarily read a block which is precisely $size bytes.

      If you need to be able to read a block which has an exact size, you can use the function filter_read_exact . It works identically to filter_read in block mode, except it will try to read a block which is exactly $size bytes in length. The only circumstances when it will not return a block which is $size bytes long is on EOF or error.

      It is very important to check the value of $status after every call to filter_read or filter_read_exact .

    • filter_del

      The function, filter_del , is used to disable the current filter. It does not affect the running of the filter. All it does is tell Perl not to call filter any more.

      See Example 4: Using filter_del for details.

    EXAMPLES

    Here are a few examples which illustrate the key concepts - as such most of them are of little practical use.

    The examples sub-directory has copies of all these filters implemented both as method filters and as closure filters.

    Example 1: A simple filter.

    Below is a method filter which is hard-wired to replace all occurrences of the string "Joe" with "Jim". Not particularly useful, but it is the first example and I wanted to keep it simple.

        package Joe2Jim ;

        use Filter::Util::Call ;

        sub import
        {
            my($type) = @_ ;

            filter_add(bless []) ;
        }

        sub filter
        {
            my($self) = @_ ;
            my($status) ;

            s/Joe/Jim/g
                if ($status = filter_read()) > 0 ;
            $status ;
        }

        1 ;

    Here is an example of using the filter:

        use Joe2Jim ;
        print "Where is Joe?\n" ;

    And this is what the script above will print:

        Where is Jim?

    Example 2: Using the context

    The previous example was not particularly useful. To make it more general purpose we will make use of the context data and allow any arbitrary from and to strings to be used. This time we will use a closure filter. To reflect its enhanced role, the filter is called Subst .

        package Subst ;

        use Filter::Util::Call ;
        use Carp ;

        sub import
        {
            croak("usage: use Subst qw(from to)")
                unless @_ == 3 ;
            my ($self, $from, $to) = @_ ;
            filter_add(
                sub
                {
                    my ($status) ;
                    s/$from/$to/
                        if ($status = filter_read()) > 0 ;
                    $status ;
                })
        }

        1 ;

    and is used like this:

        use Subst qw(Joe Jim) ;
        print "Where is Joe?\n" ;

    Example 3: Using the context within the filter

    Here is a filter which is a variation of the Joe2Jim filter. As well as replacing all occurrences of "Joe" with "Jim", it keeps a count of the number of substitutions made in the context object.

    Once EOF is detected ($status is zero) the filter will insert an extra line into the source stream. When this extra line is executed it will print a count of the number of substitutions actually made. Note that $status is set to 1 in this case.

        package Count ;

        use Filter::Util::Call ;

        sub filter
        {
            my ($self) = @_ ;
            my ($status) ;

            if (($status = filter_read()) > 0 ) {
                s/Joe/Jim/g ;
                ++ $$self ;
            }
            elsif ($$self >= 0) { # EOF
                $_ = "print q[Made ${$self} substitutions\n]" ;
                $status = 1 ;
                $$self = -1 ;
            }

            $status ;
        }

        sub import
        {
            my ($self) = @_ ;
            my ($count) = 0 ;

            filter_add(\$count) ;
        }

        1 ;

    Here is a script which uses it:

        use Count ;
        print "Hello Joe\n" ;
        print "Where is Joe\n" ;

    Outputs:

        Hello Jim
        Where is Jim
        Made 2 substitutions

    Example 4: Using filter_del

    Another variation on a theme. This time we will modify the Subst filter to allow a starting and stopping pattern to be specified as well as the from and to patterns. If you know the vi editor, it is the equivalent of this command:

        :/start/,/stop/s/from/to/

    When used as a filter we want to invoke it like this:

        use NewSubst qw(start stop from to) ;

    Here is the module.

        package NewSubst ;

        use Filter::Util::Call ;
        use Carp ;

        sub import
        {
            my ($self, $start, $stop, $from, $to) = @_ ;
            my ($found) = 0 ;
            croak("usage: use NewSubst qw(start stop from to)")
                unless @_ == 5 ;

            filter_add(
                sub
                {
                    my ($status) ;

                    if (($status = filter_read()) > 0) {

                        $found = 1
                            if $found == 0 and /$start/ ;

                        if ($found) {
                            s/$from/$to/ ;
                            filter_del() if /$stop/ ;
                        }

                    }
                    $status ;
                } )
        }

        1 ;
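    The start/stop logic of NewSubst can be tried out on a list of lines without installing a filter. The sketch below is a stand-alone simulation, not part of the module: subst_range is a hypothetical helper, and the $done flag plays the role that filter_del plays in the real filter.

```perl
use strict;
use warnings;

# Hypothetical stand-alone simulation of the NewSubst logic.
sub subst_range {
    my ($start, $stop, $from, $to, @lines) = @_;
    my ($found, $done) = (0, 0);          # $done stands in for filter_del()
    for (@lines) {
        next if $done;                    # a deleted filter is not called again
        $found = 1 if !$found and /$start/;
        if ($found) {
            s/$from/$to/;
            $done = 1 if /$stop/;         # stop after the line matching $stop
        }
    }
    return @lines;
}

print subst_range('start', 'stop', 'from', 'to',
                  "from A\n", "start\n", "from B\n",
                  "stop from C\n", "from D\n");
```

    Lines before the start pattern and after the stop pattern pass through untouched, which mirrors the vi range command shown earlier.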

    Filter::Simple

    If you intend to use the Filter::Util::Call functionality, I would strongly recommend that you check out Damian Conway's excellent Filter::Simple module. Damian's module provides a much cleaner interface than Filter::Util::Call. Although it doesn't allow the fine control that Filter::Util::Call does, it should be adequate for the majority of applications. It's available at

        http://search.cpan.org/dist/Filter-Simple/

    AUTHOR

    Paul Marquess

    DATE

    26th January 1996

     
    File::Basename

    NAME

    File::Basename - Parse file paths into directory, filename and suffix.

    SYNOPSIS

        use File::Basename;

        ($name,$path,$suffix) = fileparse($fullname,@suffixlist);
        $name = fileparse($fullname,@suffixlist);

        $basename = basename($fullname,@suffixlist);
        $dirname  = dirname($fullname);

    DESCRIPTION

    These routines allow you to parse file paths into their directory, filename and suffix.

    NOTE: dirname() and basename() emulate the behaviours, and quirks, of the shell and C functions of the same name. See each function's documentation for details. If your concern is just parsing paths it is safer to use File::Spec's splitpath() and splitdir() methods.

    It is guaranteed that

        # Where $path_separator is / for Unix, \ for Windows, etc...
        dirname($path) . $path_separator . basename($path);

    is equivalent to the original path for all systems but VMS.

    • fileparse
          my($filename, $directories, $suffix) = fileparse($path);
          my($filename, $directories, $suffix) = fileparse($path, @suffixes);
          my $filename                         = fileparse($path, @suffixes);

      The fileparse() routine divides a file path into its $directories, $filename and (optionally) the filename $suffix.

      $directories contains everything up to and including the last directory separator in the $path including the volume (if applicable). The remainder of the $path is the $filename.

          # On Unix returns ("baz", "/foo/bar/", "")
          fileparse("/foo/bar/baz");

          # On Windows returns ("baz", 'C:\foo\bar\', "")
          fileparse('C:\foo\bar\baz');

          # On Unix returns ("", "/foo/bar/baz/", "")
          fileparse("/foo/bar/baz/");

      If @suffixes are given each element is a pattern (either a string or a qr//) matched against the end of the $filename. The matching portion is removed and becomes the $suffix.

          # On Unix returns ("baz", "/foo/bar/", ".txt")
          fileparse("/foo/bar/baz.txt", qr/\.[^.]*/);

      If type is non-Unix (see fileparse_set_fstype) then the pattern matching for suffix removal is performed case-insensitively, since those systems are not case-sensitive when opening existing files.

      You are guaranteed that $directories . $filename . $suffix will denote the same location as the original $path.

    • basename
          my $filename = basename($path);
          my $filename = basename($path, @suffixes);

      This function is provided for compatibility with the Unix shell command basename(1). It does NOT always return the file name portion of a path as you might expect. To be safe, if you want the file name portion of a path use fileparse().

      basename() returns the last level of a filepath even if the last level is clearly a directory. In effect, it is acting like pop() for paths. This differs from fileparse()'s behaviour.

          # Both return "bar"
          basename("/foo/bar");
          basename("/foo/bar/");

      @suffixes work as in fileparse() except all regex metacharacters are quoted.

          # These two function calls are equivalent.
          my $filename = basename("/foo/bar/baz.txt", ".txt");
          my $filename = fileparse("/foo/bar/baz.txt", qr/\Q.txt\E/);

      Also note that in order to be compatible with the shell command, basename() does not strip off a suffix if it is identical to the remaining characters in the filename.

    • dirname

      This function is provided for compatibility with the Unix shell command dirname(1) and has inherited some of its quirks. In spite of its name it does NOT always return the directory name as you might expect. To be safe, if you want the directory name of a path use fileparse().

      Only on VMS (where there is no ambiguity between the file and directory portions of a path) and AmigaOS (possibly due to an implementation quirk in this module) does dirname() work like fileparse($path), returning just the $directories.

          # On VMS and AmigaOS
          my $directories = dirname($path);

      When using Unix or MSDOS syntax this emulates the dirname(1) shell function which is subtly different from how fileparse() works. It returns all but the last level of a file path even if the last level is clearly a directory. In effect, it is not returning the directory portion but simply the path one level up acting like chop() for file paths.

      Also unlike fileparse(), dirname() does not include a trailing slash on its returned path.

          # returns /foo/bar. fileparse() would return /foo/bar/
          dirname("/foo/bar/baz");

          # also returns /foo/bar despite the fact that baz is clearly a
          # directory. fileparse() would return /foo/bar/baz/
          dirname("/foo/bar/baz/");

          # returns '.'. fileparse() would return 'foo/'
          dirname("foo/");

      Under VMS, if there is no directory information in the $path, then the current default device and directory is used.

    • fileparse_set_fstype
          my $type = fileparse_set_fstype();
          my $previous_type = fileparse_set_fstype($type);

      Normally File::Basename will assume a file path type native to your current operating system (i.e. /foo/bar style on Unix, \foo\bar on Windows, etc...). With this function you can override that assumption.

      Valid $types are "MacOS", "VMS", "AmigaOS", "OS2", "RISCOS", "MSWin32", "DOS" (also "MSDOS" for backwards bug compatibility), "Epoc" and "Unix" (all case-insensitive). If an unrecognized $type is given "Unix" will be assumed.

      If you've selected VMS syntax, and the file specification you pass to one of these routines contains a "/", they assume you are using Unix emulation and apply the Unix syntax rules instead, for that function call only.
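
    The differences between the four routines above are easiest to see side by side. The following is a minimal sketch, not part of the original documentation: it forces Unix rules with fileparse_set_fstype() so the results do not depend on the host system, then switches to "MSWin32" for one call. The variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse basename dirname fileparse_set_fstype);

# Force Unix rules so the results do not depend on the host OS.
my $previous_type = fileparse_set_fstype('Unix');

# fileparse() splits at the last separator and peels off a matching suffix.
my ($name, $dirs, $suffix) = fileparse('/foo/bar/baz.txt', qr/\.[^.]*/);

# basename() and dirname() disregard a trailing slash; fileparse() does not.
my $last   = basename('/foo/bar/baz/');
my $parent = dirname('/foo/bar/baz/');
my ($fp_name, $fp_dirs) = fileparse('/foo/bar/baz/');

# Under Windows rules the path is split on backslashes and keeps its volume.
fileparse_set_fstype('MSWin32');
my ($win_name, $win_dirs) = fileparse('C:\foo\bar\baz');

# Restore whatever path type was active before.
fileparse_set_fstype($previous_type);

print "fileparse: name=$name dirs=$dirs suffix=$suffix\n";
print "basename=$last dirname=$parent\n";
print "fileparse with trailing slash: name='$fp_name' dirs=$fp_dirs\n";
print "MSWin32: name=$win_name dirs=$win_dirs\n";
```

    Saving and restoring the previous type keeps the override from leaking into other code that relies on the native path rules.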

    SEE ALSO

    dirname(1), basename(1), File::Spec

     
    File::CheckTree

    NAME

    File::CheckTree - run many filetest checks on a tree

    SYNOPSIS

        use File::CheckTree;

        $num_warnings = validate( q{
            /vmunix                 -e || die
            /boot                   -e || die
            /bin                    cd
                csh                 -ex
                csh                 !-ug
                sh                  -ex
                sh                  !-ug
            /usr                    -d || warn "What happened to $file?\n"
        });

    DESCRIPTION

    The validate() routine takes a single multiline string consisting of directives, each containing a filename plus a file test to try on it. (The file test may also be a "cd", causing subsequent relative filenames to be interpreted relative to that directory.) After the file test you may put || die to make it a fatal error if the file test fails. The default is || warn. The file test may optionally have a "!" prepended to test for the opposite condition. If you do a cd and then list some relative filenames, you may want to indent them slightly for readability. If you supply your own die() or warn() message, you can use $file to interpolate the filename.

    Filetests may be bunched: "-rwx" tests for all of -r, -w, and -x. Only the first failed test of the bunch will produce a warning.

    The routine returns the number of warnings issued.

    AUTHOR

    File::CheckTree was derived from lib/validate.pl which was written by Larry Wall. Revised by Paul Grassie <grassie@perl.com> in 2002.

    HISTORY

    File::CheckTree used to not display fatal error messages. It used to count only those warnings produced by a generic || warn (and not those in which the user supplied the message). In addition, the validate() routine would leave the user program in whatever directory was last entered through the use of "cd" directives. These bugs were fixed during the development of perl 5.8. The first fixed version of File::CheckTree was 4.2.

     
    File::Compare

    NAME

    File::Compare - Compare files or filehandles

    SYNOPSIS

        use File::Compare;

        if (compare("file1","file2") == 0) {
            print "They're equal\n";
        }

    DESCRIPTION

    The File::Compare::compare function compares the contents of two sources, each of which can be a file or a file handle. It is exported from File::Compare by default.

    File::Compare::cmp is a synonym for File::Compare::compare. It is exported from File::Compare only by request.

    File::Compare::compare_text does a line by line comparison of the two files. It stops as soon as a difference is detected. compare_text() accepts an optional third argument: This must be a CODE reference to a line comparison function, which returns 0 when both lines are considered equal. For example:

        compare_text($file1, $file2)

    is basically equivalent to

        compare_text($file1, $file2, sub {$_[0] ne $_[1]} )

    RETURN

    File::Compare::compare and its sibling functions return 0 if the files are equal, 1 if the files are unequal, or -1 if an error was encountered.
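
    A short sketch of the three return values and of a custom line comparator follows. It is an illustration rather than part of the original documentation: the temporary files and the case-insensitive comparator are assumptions chosen for the example, and File::Temp is used only to keep it self-contained.

```perl
use strict;
use warnings;
use File::Compare qw(compare compare_text);
use File::Temp qw(tempfile);

# Write two small files that differ only in letter case.
my ($fh_a, $file_a) = tempfile(UNLINK => 1);
my ($fh_b, $file_b) = tempfile(UNLINK => 1);
print {$fh_a} "Hello\nWorld\n";
print {$fh_b} "hello\nworld\n";
close $fh_a or die "close failed: $!";
close $fh_b or die "close failed: $!";

# Byte-for-byte comparison: 0 = equal, 1 = unequal, -1 = error.
my $exact = compare($file_a, $file_b);

# The comparator must return false when two lines count as equal,
# so "ne" on the lower-cased lines makes the comparison ignore case.
my $ignoring_case = compare_text($file_a, $file_b,
    sub { lc($_[0]) ne lc($_[1]) });

# A file that cannot be opened is reported as an error.
my $missing = compare($file_a, "$file_a.does-not-exist");

print "exact=$exact ignoring_case=$ignoring_case missing=$missing\n";
```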

    AUTHOR

    File::Compare was written by Nick Ing-Simmons. Its original documentation was written by Chip Salzenberg.

     
    File::Copy

    NAME

    File::Copy - Copy files or filehandles

    SYNOPSIS

        use File::Copy;

        copy("file1","file2") or die "Copy failed: $!";
        copy("Copy.pm",\*STDOUT);
        move("/dev1/fileA","/dev2/fileB");

        use File::Copy "cp";
        use FileHandle;    # needed for FileHandle->new below

        $n = FileHandle->new("/a/file","r");
        cp($n,"x");

    DESCRIPTION

    The File::Copy module provides two basic functions, copy and move, which are useful for getting the contents of a file from one place to another.

    • copy

      The copy function takes two parameters: a file to copy from and a file to copy to. Either argument may be a string, a FileHandle reference or a FileHandle glob. Obviously, if the first argument is a filehandle of some sort, it will be read from, and if it is a file name it will be opened for reading. Likewise, the second argument will be written to (and created if need be). Trying to copy a file on top of itself is an error.

      If the destination (second argument) already exists and is a directory, and the source (first argument) is not a filehandle, then the source file will be copied into the directory specified by the destination, using the same base name as the source file. It's a failure to have a filehandle as the source when the destination is a directory.

      Note that passing in files as handles instead of names may lead to loss of information on some operating systems; it is recommended that you use file names whenever possible. Files are opened in binary mode where applicable. To get a consistent behaviour when copying from a filehandle to a file, use binmode on the filehandle.

      An optional third parameter can be used to specify the buffer size used for copying. This is the number of bytes from the first file that will be held in memory at any given time before being written to the second file. The default buffer size depends upon the file, but will generally be the whole file (up to 2MB), or 1k for filehandles that do not reference files (e.g. sockets).

      You may use the syntax use File::Copy "cp" to get at the cp alias for this function. The syntax is exactly the same. The behavior is nearly the same as well: as of version 2.15, cp will preserve the source file's permission bits like the shell utility cp(1) would do, while copy uses the default permissions for the target file (which may depend on the process' umask, file ownership, inherited ACLs, etc.). If an error occurs in setting permissions, cp will return 0, regardless of whether the file was successfully copied.

    • move

      The move function also takes two parameters: the current name and the intended name of the file to be moved. If the destination already exists and is a directory, and the source is not a directory, then the source file will be renamed into the directory specified by the destination.

      If possible, move() will simply rename the file. Otherwise, it copies the file to the new location and deletes the original. If an error occurs during this copy-and-delete process, you may be left with a (possibly partial) copy of the file under the destination name.

      You may use the mv alias for this function in the same way that you may use the cp alias for copy.

    • syscopy

      File::Copy also provides the syscopy routine, which copies the file specified in the first parameter to the file specified in the second parameter, preserving OS-specific attributes and file structure. For Unix systems, this is equivalent to the simple copy routine, which doesn't preserve OS-specific attributes. For VMS systems, this calls the rmscopy routine (see below). For OS/2 systems, this calls the syscopy XSUB directly. For Win32 systems, this calls Win32::CopyFile.

      Special behaviour if syscopy is defined (OS/2, VMS and Win32):

      If neither argument to copy is a file handle, then copy will perform a "system copy" of the input file to a new output file, in order to preserve file attributes, indexed file structure, etc. The buffer size parameter is ignored. If either argument to copy is a handle to an opened file, then data is copied using Perl operators, and no effort is made to preserve file attributes or record structure.

      The system copy routine may also be called directly under VMS and OS/2 as File::Copy::syscopy (or under VMS as File::Copy::rmscopy, which is the routine that does the actual work for syscopy).

    • rmscopy($from,$to[,$date_flag])

      The first and second arguments may be strings, typeglobs, typeglob references, or objects inheriting from IO::Handle; they are used in all cases to obtain the filespec of the input and output files, respectively. The name and type of the input file are used as defaults for the output file, if necessary.

      A new version of the output file is always created, which inherits the structure and RMS attributes of the input file, except for owner and protections (and possibly timestamps; see below). All data from the input file is copied to the output file; if either of the first two parameters to rmscopy is a file handle, its position is unchanged. (Note that this means a file handle pointing to the output file will be associated with an old version of that file after rmscopy returns, not the newly created version.)

      The third parameter is an integer flag, which tells rmscopy how to handle timestamps. If it is < 0, none of the input file's timestamps are propagated to the output file. If it is > 0, then it is interpreted as a bitmask: if bit 0 (the LSB) is set, then timestamps other than the revision date are propagated; if bit 1 is set, the revision date is propagated. If the third parameter to rmscopy is 0, then it behaves much like the DCL COPY command: if the name or type of the output file was explicitly specified, then no timestamps are propagated, but if they were taken implicitly from the input filespec, then all timestamps other than the revision date are propagated. If this parameter is not supplied, it defaults to 0.

      Like copy, rmscopy returns 1 on success. If an error occurs, it sets $!, deletes the output file, and returns 0.

    RETURN

    All functions return 1 on success, 0 on failure. $! will be set if an error was encountered.
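
    The sketch below exercises the behaviours described above for copy and move: file to file, file into an existing directory, and a rename. It is an illustrative example only; the file names and the use of File::Temp and File::Spec are assumptions made to keep it self-contained.

```perl
use strict;
use warnings;
use File::Copy qw(copy move);
use File::Spec;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my $source = File::Spec->catfile($dir, 'source.txt');
my $subdir = File::Spec->catdir($dir, 'archive');
mkdir $subdir or die "mkdir failed: $!";

open my $out, '>', $source or die "Cannot write $source: $!";
print {$out} "some data\n";
close $out or die "close failed: $!";

# File to file: the destination is created if need be.
my $duplicate = File::Spec->catfile($dir, 'duplicate.txt');
copy($source, $duplicate) or die "Copy failed: $!";

# File to existing directory: the copy keeps the source's base name.
copy($source, $subdir) or die "Copy failed: $!";
my $in_subdir = File::Spec->catfile($subdir, 'source.txt');

# move() renames when it can and falls back to copy-and-delete otherwise.
my $renamed = File::Spec->catfile($dir, 'renamed.txt');
move($duplicate, $renamed) or die "Move failed: $!";

print "copied into directory: $in_subdir\n";
print "moved to: $renamed\n";
```

    Checking the return value with "or die" on every call matters here, since a failure is signalled only by the 0 return and $!.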

    AUTHOR

    File::Copy was written by Aaron Sherman <ajs@ajs.com> in 1995, and updated by Charles Bailey <bailey@newman.upenn.edu> in 1996.

     
    File::DosGlob

    NAME

    File::DosGlob - DOS like globbing and then some

    SYNOPSIS

        require 5.004;

        # override CORE::glob in current package
        use File::DosGlob 'glob';

        # override CORE::glob in ALL packages (use with extreme caution!)
        use File::DosGlob 'GLOBAL_glob';

        @perlfiles = glob "..\\pe?l/*.p?";
        print <..\\pe?l/*.p?>;

        # from the command line (overrides only in main::)
        > perl -MFile::DosGlob=glob -e "print <../pe*/*p?>"

    DESCRIPTION

    A module that implements DOS-like globbing with a few enhancements. It is largely compatible with perlglob.exe (the M$ setargv.obj version) in all but one respect--it understands wildcards in directory components.

    For example, <..\\l*b\\file/*glob.p?> will work as expected (in that it will find something like '..\lib\File/DosGlob.pm' alright). Note that all path components are case-insensitive, and that backslashes and forward slashes are both accepted, and preserved. You may have to double the backslashes if you are putting them in literally, due to double-quotish parsing of the pattern by perl.

    Spaces in the argument delimit distinct patterns, so glob('*.exe *.dll') globs all filenames that end in .exe or .dll. If you want to put in literal spaces in the glob pattern, you can escape them with either double quotes, or backslashes, e.g. glob('c:/"Program Files"/*/*.dll'), or glob('c:/Program\ Files/*/*.dll'). The argument is tokenized using Text::ParseWords::parse_line(), so see Text::ParseWords for details of the quoting rules used.

    Extending it to csh patterns is left as an exercise to the reader.
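
    As a rough illustration of the space-delimited patterns and case-insensitive matching described above, the sketch below creates a few empty files in a temporary directory and globs them. The file names are made up for the example, and the working directory is changed only so the patterns can stay short.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::DosGlob 'glob';      # override glob in this package only
use File::Temp qw(tempdir);

my $original_dir = getcwd();
my $dir = tempdir(CLEANUP => 1);
chdir $dir or die "chdir failed: $!";

for my $name ('tool.exe', 'helper.dll', 'notes.txt') {
    open my $fh, '>', $name or die "Cannot create $name: $!";
    close $fh or die "close failed: $!";
}

# A space separates two independent patterns.
my @binaries = glob('*.exe *.dll');
@binaries = sort @binaries;

# Path components are matched case-insensitively.
my @upper = glob('*.EXE');

# Go back before the temporary directory is cleaned up.
chdir $original_dir or die "chdir failed: $!";

print "binaries: @binaries\n";
print "upper-case pattern matched: @upper\n";
```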

    EXPORTS (by request only)

    glob()

    BUGS

    Should probably be built into the core, and needs to stop pandering to DOS habits. Needs a dose of optimizium too.

    AUTHOR

    Gurusamy Sarathy <gsar@activestate.com>

    HISTORY

    • Support for globally overriding glob() (GSAR 3-JUN-98)

    • Scalar context, independent iterator context fixes (GSAR 15-SEP-97)

    • A few dir-vs-file optimizations result in glob importation being 10 times faster than using perlglob.exe, and using perlglob.bat is only twice as slow as perlglob.exe (GSAR 28-MAY-97)

    • Several cleanups prompted by lack of compatible perlglob.exe under Borland (GSAR 27-MAY-97)

    • Initial version (GSAR 20-FEB-97)

    SEE ALSO

    perl

    perlglob.bat

    Text::ParseWords

     
    File::Fetch

    NAME

    File::Fetch - A generic file fetching mechanism

    SYNOPSIS

        use File::Fetch;

        ### build a File::Fetch object ###
        my $ff = File::Fetch->new(uri => 'http://some.where.com/dir/a.txt');

        ### fetch the uri to cwd() ###
        my $where = $ff->fetch() or die $ff->error;

        ### fetch the uri to /tmp ###
        my $where = $ff->fetch( to => '/tmp' );

        ### parsed bits from the uri ###
        $ff->uri;
        $ff->scheme;
        $ff->host;
        $ff->path;
        $ff->file;

    DESCRIPTION

    File::Fetch is a generic file fetching mechanism.

    It allows you to fetch any file pointed to by an ftp, http, file, or rsync uri by a number of different means.

    See the HOW IT WORKS section further down for details.

    ACCESSORS

    A File::Fetch object has the following accessors

    • $ff->uri

      The uri you passed to the constructor

    • $ff->scheme

      The scheme from the uri (like 'file', 'http', etc)

    • $ff->host

      The hostname in the uri. Will be empty if host was originally 'localhost' for a 'file://' url.

    • $ff->vol

      On operating systems with the concept of a volume the second element of a file:// is considered to be the volume specification for the file. Thus on Win32 this routine returns the volume, on other operating systems this returns nothing.

      On Windows this value may be empty if the uri is to a network share, in which case the 'share' property will be defined. Additionally, volume specifications that use '|' as ':' will be converted on read to use ':'.

      On VMS, which has a volume concept, this field will be empty because VMS file specifications are converted to absolute UNIX format and the volume information is transparently included.

    • $ff->share

      On systems with the concept of a network share (currently only Windows) returns the sharename from a file://// url. On other operating systems returns empty.

    • $ff->path

      The path from the uri, will be at least a single '/'.

    • $ff->file

      The name of the remote file. For the local file name, the result of $ff->output_file will be used.

    • $ff->file_default

      The name of the default local file, that $ff->output_file falls back to if it would otherwise return no filename. For example when fetching a URI like http://www.abc.net.au/ the contents retrieved may be from a remote file called 'index.html'. The default value of this attribute is literally 'file_default'.

    • $ff->output_file

      The name of the output file. This is the same as $ff->file, but any query parameters are stripped off. For example:

          http://example.com/index.html?x=y

      would make the output file be index.html rather than index.html?x=y.
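
    The accessors can be tried without any network access, because building the object only parses the URI. The following sketch is illustrative: the URI is the example one used above, and the variable names are not part of the module.

```perl
use strict;
use warnings;
use File::Fetch;

# Building the object only parses the URI; nothing is downloaded yet.
my $ff = File::Fetch->new(uri => 'http://example.com/index.html?x=y')
    or die "Could not parse the URI";

my $scheme      = $ff->scheme;        # scheme part of the URI
my $host        = $ff->host;          # host part of the URI
my $remote_file = $ff->file;          # remote name, query string included
my $local_file  = $ff->output_file;   # query string stripped

print "$scheme://$host serves $remote_file, saved locally as $local_file\n";
```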

    METHODS

    $ff = File::Fetch->new( uri => 'http://some.where.com/dir/file.txt' );

    Parses the uri and creates a corresponding File::Fetch::Item object that is ready to be fetched, and returns it.

    Returns false on failure.

    $where = $ff->fetch( [to => /my/output/dir/ | \$scalar] )

    Fetches the file you requested and returns the full path to the file.

    By default it writes to cwd(), but you can override that by specifying the to argument:

        ### file fetch to /tmp, full path to the file in $where
        $where = $ff->fetch( to => '/tmp' );

        ### file slurped into $scalar, full path to the file in $where
        ### file is downloaded to a temp directory and cleaned up at exit time
        $where = $ff->fetch( to => \$scalar );

    Returns the full path to the downloaded file on success, and false on failure.

    $ff->error([BOOL])

    Returns the last encountered error as a string. Pass it a true value to get the Carp::longmess() output instead.

    HOW IT WORKS

    File::Fetch is able to fetch a variety of uris, by using several external programs and modules.

    Below is a mapping of what utilities will be used in what order for what schemes, if available:

        file    => LWP, lftp, file
        http    => LWP, HTTP::Lite, wget, curl, lftp, fetch, lynx, iosock
        ftp     => LWP, Net::FTP, wget, curl, lftp, fetch, ncftp, ftp
        rsync   => rsync

    If you'd like to disable the use of one or more of these utilities and/or modules, see the $BLACKLIST variable further down.

    If a utility or module isn't available, it will be marked in a cache (see the $METHOD_FAIL variable further down), so it will not be tried again. The fetch method will only fail when all options are exhausted, and it was not able to retrieve the file.

    The fetch utility is available on FreeBSD. NetBSD and Dragonfly BSD may also have it from pkgsrc. We only check for fetch on those three platforms.

    iosock is a very limited IO::Socket::INET based mechanism for retrieving http schemed urls. It doesn't follow redirects for instance.

    A special note about fetching files from an ftp uri:

    By default, all ftp connections are done in passive mode. To change that, see the $FTP_PASSIVE variable further down.

    Furthermore, ftp uris only support anonymous connections, so no named user/password pair can be passed along.

    /bin/ftp is blacklisted by default; see the $BLACKLIST variable further down.

    GLOBAL VARIABLES

    The behaviour of File::Fetch can be altered by changing the following global variables:

    $File::Fetch::FROM_EMAIL

    This is the email address that will be sent as your anonymous ftp password.

    Default is File-Fetch@example.com.

    $File::Fetch::USER_AGENT

    This is the useragent as LWP will report it.

    Default is File::Fetch/$VERSION.

    $File::Fetch::FTP_PASSIVE

    This variable controls whether the environment variable FTP_PASSIVE and any passive switches to commandline tools will be set to true.

    Default value is 1.

    Note: When $FTP_PASSIVE is true, ncftp will not be used to fetch files, since passive mode can only be set interactively for this binary.

    $File::Fetch::TIMEOUT

    When set, controls the network timeout (counted in seconds).

    Default value is 0.

    $File::Fetch::WARN

    This variable controls whether errors encountered internally by File::Fetch should be carp'd or not.

    Set to false to silence warnings. Inspect the output of the error() method manually to see what went wrong.

    Defaults to true.

    $File::Fetch::DEBUG

    This enables debugging output when calling commandline utilities to fetch files. This also enables Carp::longmess errors, instead of the regular carp errors.

    Good for tracking down why things don't work with your particular setup.

    Default is 0.

    $File::Fetch::BLACKLIST

    This is an array ref holding blacklisted modules/utilities for fetching files with.

    To disallow the use of, for example, LWP and Net::FTP, you could set $File::Fetch::BLACKLIST to:

        $File::Fetch::BLACKLIST = [qw|lwp netftp|]

    The default blacklist is [qw|ftp|], as /bin/ftp is rather unreliable.

    See the note on MAPPING below.

    $File::Fetch::METHOD_FAIL

    This is a hashref registering what modules/utilities were known to fail for fetching files (mostly because they weren't installed).

    You can reset this cache by assigning an empty hashref to it, or individually remove keys.

    See the note on MAPPING below.
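
    One way to combine these variables is to set them with local inside a helper, so that the changes last only for a single fetch. The sketch below is an assumption-laden illustration: fetch_quietly is a hypothetical helper name, the blacklist entries and the 30-second timeout are arbitrary choices, and the lower-case method names are those listed under MAPPING.

```perl
use strict;
use warnings;
use File::Fetch;

# Hypothetical helper: fetch $uri into $dir with temporary settings.
sub fetch_quietly {
    my ($uri, $dir) = @_;

    # local() restores the previous values when the sub returns.
    local $File::Fetch::BLACKLIST   = [qw(ftp lwp netftp)];  # methods to skip
    local $File::Fetch::WARN        = 0;    # no carp; inspect error() instead
    local $File::Fetch::TIMEOUT     = 30;   # network timeout in seconds
    local $File::Fetch::METHOD_FAIL = {};   # forget earlier failures

    my $ff = File::Fetch->new(uri => $uri) or return;
    my $where = $ff->fetch(to => $dir);

    # In list context also hand back the last error message.
    return wantarray ? ($where, $ff->error) : $where;
}
```

    A caller might then write my ($where, $error) = fetch_quietly($uri, '/tmp'); and report $error only when $where is false.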

    MAPPING

    Here's a quick mapping for the utilities/modules, and their names for the $BLACKLIST, $METHOD_FAIL and other internal functions.

        LWP         => lwp
        HTTP::Lite  => httplite
        HTTP::Tiny  => httptiny
        Net::FTP    => netftp
        wget        => wget
        lynx        => lynx
        ncftp       => ncftp
        ftp         => ftp
        curl        => curl
        rsync       => rsync
        lftp        => lftp
        fetch       => fetch
        IO::Socket  => iosock

    FREQUENTLY ASKED QUESTIONS

    So how do I use a proxy with File::Fetch?

    File::Fetch currently only supports proxies with LWP::UserAgent. You will need to set your environment variables accordingly. For example, to use an ftp proxy:

        $ENV{ftp_proxy} = 'foo.com';

    Refer to the LWP::UserAgent manpage for more details.

    I used 'lynx' to fetch a file, but its contents are all wrong!

    lynx can only fetch remote files by dumping their contents to STDOUT, which we in turn capture. If that content is a 'custom' error file (like, say, a 404 handler), you will get that content instead.

    Sadly, lynx doesn't support any options to return a different exit code on non-200 OK status, giving us no way to tell the difference between a 'successful' fetch and a custom error page.

    Therefore, we recommend using lynx only as a last resort. This is also why it sits near the back of our list of methods to try.

    Files I'm trying to fetch have reserved characters or non-ASCII characters in them. What do I do?

    File::Fetch is relatively smart about things. When trying to write a file to disk, it removes the query parameters (see the output_file method for details) from the file name before creating it. In most cases this suffices.

    If you have any other characters you need to escape, please install the URI::Escape module from CPAN, and pre-encode your URI before passing it to File::Fetch. You can read about the details of URIs and URI encoding here:

        http://www.faqs.org/rfcs/rfc2396.html

    TODO

    • Implement $PREFER_BIN

      To indicate that commandline tools should be preferred over modules

    BUG REPORTS

    Please report bugs or other issues to <bug-file-fetch@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    File::Find

    NAME

    File::Find - Traverse a directory tree.

    SYNOPSIS

        use File::Find;
        find(\&wanted, @directories_to_search);
        sub wanted { ... }

        use File::Find;
        finddepth(\&wanted, @directories_to_search);
        sub wanted { ... }

        use File::Find;
        find({ wanted => \&process, follow => 1 }, '.');

    DESCRIPTION

    These are functions for searching through directory trees and doing work on each file found, similar to the Unix find command. File::Find exports two functions, find and finddepth. They work similarly but have subtle differences.

    • find
          find(\&wanted,  @directories);
          find(\%options, @directories);

      find() does a depth-first search over the given @directories in the order they are given. For each file or directory found, it calls the &wanted subroutine. (See below for details on how to use the &wanted function). Additionally, for each directory found, it will chdir() into that directory and continue the search, invoking the &wanted function on each file or subdirectory in the directory.

    • finddepth
          finddepth(\&wanted,  @directories);
          finddepth(\%options, @directories);

      finddepth() works just like find() except that it invokes the &wanted function for a directory after invoking it for the directory's contents. It does a postorder traversal instead of a preorder traversal, working from the bottom of the directory tree up where find() works from the top of the tree down.
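
    The difference in visiting order can be seen on a tiny tree. The sketch below is an illustration under a few assumptions: the tree is created in a temporary directory, and the paths are collected from $File::Find::name, the module's variable holding the full path of the entry currently being visited.

```perl
use strict;
use warnings;
use File::Find qw(find finddepth);
use File::Spec;
use File::Temp qw(tempdir);

# Build a tiny tree: <root>/sub/file.txt
my $root = tempdir(CLEANUP => 1);
my $sub  = File::Spec->catdir($root, 'sub');
mkdir $sub or die "mkdir failed: $!";
my $file = File::Spec->catfile($sub, 'file.txt');
open my $fh, '>', $file or die "Cannot create $file: $!";
close $fh or die "close failed: $!";

# Record the order in which each function calls the wanted subroutine.
my (@top_down, @bottom_up);
find(     sub { push @top_down,  $File::Find::name }, $root);
finddepth(sub { push @bottom_up, $File::Find::name }, $root);

print "find:      $_\n" for @top_down;
print "finddepth: $_\n" for @bottom_up;
```

    find() reports each directory before its contents, while finddepth() reports the contents first, which is the order needed when, for example, removing a tree.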

    %options

    The first argument to find() is either a code reference to your &wanted function, or a hash reference describing the operations to be performed for each file. The code reference is described in The wanted function below.

    Here are the possible keys for the hash:

    • wanted

      The value should be a code reference. This code reference is described in The wanted function below. The &wanted subroutine is mandatory.

    • bydepth

      Reports the name of a directory only AFTER all its entries have been reported. Entry point finddepth() is a shortcut for specifying { bydepth => 1 } in the first argument of find().

    • preprocess

      The value should be a code reference. This code reference is used to preprocess the current directory. The name of the currently processed directory is in $File::Find::dir. Your preprocessing function is called after readdir(), but before the loop that calls the wanted() function. It is called with a list of strings (actually file/directory names) and is expected to return a list of strings. The code can be used to sort the file/directory names alphabetically, numerically, or to filter out directory entries based on their name alone. When follow or follow_fast are in effect, preprocess is a no-op.

    • postprocess

      The value should be a code reference. It is invoked just before leaving the currently processed directory. It is called in void context with no arguments. The name of the current directory is in $File::Find::dir. This hook is handy for summarizing a directory, such as calculating its disk usage. When follow or follow_fast are in effect, postprocess is a no-op.

    • follow

      Causes symbolic links to be followed. Since directory trees with symbolic links (followed) may contain files more than once and may even have cycles, a hash has to be built up with an entry for each file. This might be expensive both in space and time for a large directory tree. See follow_fast and follow_skip below. If either follow or follow_fast is in effect:

      • It is guaranteed that an lstat has been called before the user's wanted() function is called. This enables fast file checks involving _. Note that this guarantee no longer holds if follow or follow_fast are not set.

      • There is a variable $File::Find::fullname which holds the absolute pathname of the file with all symbolic links resolved. If the link is a dangling symbolic link, then fullname will be set to undef.

      This is a no-op on Win32.

    • follow_fast

      This is similar to follow except that it may report some files more than once. It does detect cycles, however. Since only symbolic links have to be hashed, this is much cheaper both in space and time. If processing a file more than once (by the user's wanted() function) is worse than just taking time, the option follow should be used.

      This is also a no-op on Win32.

    • follow_skip

      follow_skip==1, which is the default, causes all files which are neither directories nor symbolic links to be ignored if they are about to be processed a second time. If a directory or a symbolic link is about to be processed a second time, File::Find dies.

      follow_skip==0 causes File::Find to die if any file is about to be processed a second time.

      follow_skip==2 causes File::Find to ignore any duplicate files and directories but to proceed normally otherwise.

    • dangling_symlinks

      If true and a code reference, will be called with the symbolic link name and the directory it lives in as arguments. Otherwise, if true and warnings are on, warning "symbolic_link_name is a dangling symbolic link\n" will be issued. If false, the dangling symbolic link will be silently ignored.

    • no_chdir

      Does not chdir() to each directory as it recurses. The wanted() function will need to be aware of this, of course. In this case, $_ will be the same as $File::Find::name .

    • untaint

      If find is used in taint-mode (-T command line switch or if EUID != UID or if EGID != GID) then internally directory names have to be untainted before they can be chdir'ed to. Therefore they are checked against a regular expression untaint_pattern. Note that all names passed to the user's wanted() function are still tainted. If this option is used while not in taint-mode, untaint is a no-op.

    • untaint_pattern

      See above. This should be set using the qr quoting operator. The default is set to qr|^([-+@\w./]+)$|. Note that the parentheses are vital.

    • untaint_skip

      If set, a directory which fails the untaint_pattern is skipped, including all its sub-directories. The default is to 'die' in such a case.
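As an illustration of the hash form, the following sketch combines three of the keys described above: preprocess, wanted and postprocess. The file names and the temporary directory are assumptions made up for the example.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Temp qw(tempdir);

my $root = tempdir(CLEANUP => 1);
for my $name (qw(c.txt a.txt b.txt)) {
    open my $fh, '>', "$root/$name" or die "Cannot create $name: $!";
    print {$fh} "x" x 10;
    close $fh or die "Cannot close $name: $!";
}

my (@seen, %bytes, @left);

find({
    # preprocess receives the raw readdir() list and returns the list to visit.
    preprocess  => sub { sort @_ },
    # wanted is mandatory; here it records plain files and their sizes.
    wanted      => sub {
        return unless -f;
        push @seen, $_;
        $bytes{$File::Find::dir} += -s _;
    },
    # postprocess runs just before File::Find leaves a directory.
    postprocess => sub { push @left, $File::Find::dir },
}, $root);

print "visited: @seen\n";
print "$_: $bytes{$_} bytes\n" for sort keys %bytes;
```

Sorting in preprocess makes the visiting order reproducible, which readdir() alone does not promise.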

    The wanted function

    The wanted() function does whatever verifications you want on each file and directory. Note that despite its name, the wanted() function is a generic callback function, and does not tell File::Find if a file is "wanted" or not. In fact, its return value is ignored.

    The wanted function takes no arguments but rather does its work through a collection of variables.

    • $File::Find::dir is the current directory name,
    • $_ is the current filename within that directory
    • $File::Find::name is the complete pathname to the file.

    The above variables have all been localized and may be changed without affecting data outside of the wanted function.

    For example, when examining the file /some/path/foo.ext you will have:

        $File::Find::dir  = /some/path/
        $_                = foo.ext
        $File::Find::name = /some/path/foo.ext

    You are chdir()'d to $File::Find::dir when the function is called, unless no_chdir was specified. Note that when changing to directories is in effect the root directory (/) is a somewhat special case inasmuch as the concatenation of $File::Find::dir , '/' and $_ is not literally equal to $File::Find::name . The table below summarizes all variants:

                          $File::Find::name  $File::Find::dir  $_
        default           /                  /                 .
        no_chdir=>0       /etc               /                 etc
                          /etc/x             /etc              x

        no_chdir=>1       /                  /                 /
                          /etc               /                 /etc
                          /etc/x             /etc              /etc/x

    When follow or follow_fast are in effect, there is also a $File::Find::fullname . The function may set $File::Find::prune to prune the tree unless bydepth was specified. Unless follow or follow_fast is specified, for compatibility reasons (find.pl, find2perl) there are in addition the following globals available: $File::Find::topdir , $File::Find::topdev , $File::Find::topino , $File::Find::topmode and $File::Find::topnlink .
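A minimal sketch of pruning, assuming a made-up tree with a directory named skip that should not be entered. It also shows $_ and $File::Find::name being used together inside the callback.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Temp qw(tempdir);
use File::Path qw(make_path);

my $root = tempdir(CLEANUP => 1);
make_path("$root/keep", "$root/skip/deep");

my @visited;
find(sub {
    # $_ is the entry name relative to $File::Find::dir (the current
    # working directory); $File::Find::name is the complete pathname.
    if (-d && $_ eq 'skip') {
        $File::Find::prune = 1;   # do not descend into this directory
        return;
    }
    push @visited, $File::Find::name;
}, $root);

print "$_\n" for sort @visited;
```

Because pruning relies on seeing a directory before its contents, the same callback would have no pruning effect under finddepth() or with bydepth set.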

    This library is useful for the find2perl tool, which when fed,

        find2perl / -name .nfs\* -mtime +7 \
            -exec rm -f {} \; -o -fstype nfs -prune

    produces something like:

        sub wanted {
            /^\.nfs.*\z/s &&
            (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &&
            int(-M _) > 7 &&
            unlink($_)
            ||
            ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &&
            $dev < 0 &&
            ($File::Find::prune = 1);
        }

    Notice the _ in the above int(-M _) : the _ is a magical filehandle that caches the information from the preceding stat(), lstat(), or filetest.
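The caching behaviour of _ can be tried outside File::Find as well. This sketch assumes a temporary file created with File::Temp and performs one lstat() followed by several file tests that reuse its result.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "hello";
close $fh or die "Cannot close $path: $!";

# One real system call...
my @st = lstat($path) or die "Cannot lstat $path: $!";

# ...and the tests below reuse its cached result through _.
my $is_file = -f _ ? 1 : 0;
my $size    = -s _;
my $age     = -M _;   # days since last modification, relative to script start

printf "%s: plain file=%d, size=%d bytes\n", $path, $is_file, $size;
```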

    Here's another interesting wanted function. It will find all symbolic links that don't resolve:

        sub wanted {
            -l && !-e && print "bogus link: $File::Find::name\n";
        }

    Note that you may mix directories and (non-directory) files in the list of directories to be searched by the wanted() function.

        find(\&wanted, "./foo", "./bar", "./baz/epsilon");

    In the example above, no file in ./baz/ other than ./baz/epsilon will be evaluated by wanted() .

    See also the script pfind on CPAN for a nice application of this module.

    WARNINGS

    If you run your program with the -w switch, or if you use the warnings pragma, File::Find will report warnings for several weird situations. You can disable these warnings by putting the statement

        no warnings 'File::Find';

    in the appropriate scope. See perllexwarn for more info about lexical warnings.

    CAVEAT

    • $dont_use_nlink

      You can set the variable $File::Find::dont_use_nlink to 1, if you want to force File::Find to always stat directories. This was used for file systems that do not have an nlink count matching the number of sub-directories. Examples are ISO-9660 (CD-ROM), AFS, HPFS (OS/2 file system), FAT (DOS file system) and a couple of others.

      You shouldn't need to set this variable, since File::Find should now detect such file systems on-the-fly and switch itself to using stat. This works even for parts of your file system, like a mounted CD-ROM.

      If you do set $File::Find::dont_use_nlink to 1, you will notice slow-downs.

    • symlinks

      Be aware that the option to follow symbolic links can be dangerous. Depending on the structure of the directory tree (including symbolic links to directories) you might traverse a given (physical) directory more than once (only if follow_fast is in effect). Furthermore, deleting or changing files in a symbolically linked directory might cause very unpleasant surprises, since you delete or change files in an unknown directory.

    BUGS AND CAVEATS

    Despite the name of the finddepth() function, both find() and finddepth() perform a depth-first search of the directory hierarchy.

    HISTORY

    File::Find used to produce incorrect results if called recursively. During the development of perl 5.8 this bug was fixed. The first fixed version of File::Find was 1.01.

    SEE ALSO

    find, find2perl.

     
    File::Glob

    NAME

    File::Glob - Perl extension for BSD glob routine

    SYNOPSIS

        use File::Glob ':bsd_glob';

        @list = bsd_glob('*.[ch]');
        $homedir = bsd_glob('~gnat', GLOB_TILDE | GLOB_ERR);

        if (GLOB_ERROR) {
            # an error occurred reading $homedir
        }

        ## override the core glob (CORE::glob() does this automatically
        ## by default anyway, since v5.6.0)
        use File::Glob ':globally';
        my @sources = <*.{c,h,y}>;

        ## override the core glob, forcing case sensitivity
        use File::Glob qw(:globally :case);
        my @sources = <*.{c,h,y}>;

        ## override the core glob forcing case insensitivity
        use File::Glob qw(:globally :nocase);
        my @sources = <*.{c,h,y}>;

        ## glob on all files in home directory
        use File::Glob ':globally';
        my @sources = <~gnat/*>;

    DESCRIPTION

    The glob angle-bracket operator <> is a pathname generator that implements the rules for file name pattern matching used by Unix-like shells such as the Bourne shell or C shell.

    File::Glob::bsd_glob() implements the FreeBSD glob(3) routine, which is a superset of the POSIX glob() (described in IEEE Std 1003.2 "POSIX.2"). bsd_glob() takes a mandatory pattern argument, and an optional flags argument, and returns a list of filenames matching the pattern, with interpretation of the pattern modified by the flags variable.

    Since v5.6.0, Perl's CORE::glob() is implemented in terms of bsd_glob(). Note that they don't share the same prototype--CORE::glob() only accepts a single argument. Due to historical reasons, CORE::glob() will also split its argument on whitespace, treating it as multiple patterns, whereas bsd_glob() considers them as one pattern. But see :bsd_glob under EXPORTS, below.
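The whitespace difference is easy to observe. The sketch below assumes three made-up files in a temporary directory, one of which has a space in its name, and passes the same string to the built-in glob() and to bsd_glob().

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Glob ();          # import nothing, so the built-in glob stays as it is
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $orig = getcwd();
chdir $dir or die "Cannot chdir to $dir: $!";

for my $name ('a.txt', 'b.txt', 'a.txt b.txt') {
    open my $fh, '>', $name or die "Cannot create $name: $!";
    close $fh or die "Cannot close $name: $!";
}

# The built-in splits on whitespace: two patterns, 'a.txt' and 'b.txt'.
my @core = glob('a.txt b.txt');

# bsd_glob() sees a single pattern that contains a space.
my @bsd = File::Glob::bsd_glob('a.txt b.txt');

chdir $orig or die "Cannot chdir back to $orig: $!";

printf "glob():     %d result(s)\n", scalar @core;
printf "bsd_glob(): %d result(s)\n", scalar @bsd;
```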

    META CHARACTERS

        \       Quote the next metacharacter
        []      Character class
        {}      Multiple pattern
        *       Match any string of characters
        ?       Match any single character
        ~       User name home directory

    The metanotation a{b,c,d}e is a shorthand for abe ace ade. Left to right order is preserved, with results of matches being sorted separately at a low level to preserve this order. As a special case {, }, and {} are passed undisturbed.
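A short sketch of brace expansion, relying on the default flags of bsd_glob(). No files need to exist for this pattern, because it contains none of the wildcard characters.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# With the default flags, braces are expanded left to right. The pattern
# has no "*", "?" or "[", so the names come back whether or not such
# files exist.
my @names = bsd_glob('a{b,c,d}e');
print "@names\n";

# A lone {} is one of the special cases that is passed through.
my @braces = bsd_glob('{}');
print "@braces\n";
```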

    EXPORTS

    See also the POSIX FLAGS below, which can be exported individually.

    :bsd_glob

    The :bsd_glob export tag exports bsd_glob() and the constants listed below. It also overrides glob() in the calling package with one that behaves like bsd_glob() with regard to spaces (the space is treated as part of a file name), but supports iteration in scalar context; i.e., it preserves the core function's feature of returning the next item each time it is called.

    :glob

    The :glob tag, now discouraged, is the old version of :bsd_glob . It exports the same constants and functions, but its glob() override does not support iteration; it returns the last file name in scalar context. That means this will loop forever:

        use File::Glob ':glob';
        while (my $file = <* copy.txt>) {
            ...
        }
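For contrast, here is a sketch of the same kind of loop under :bsd_glob, where iteration in scalar context is supported. The file names and the temporary directory are assumptions for the example.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Glob ':bsd_glob';   # replaces glob() in this package
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $orig = getcwd();
chdir $dir or die "Cannot chdir to $dir: $!";

for my $name ('one copy.txt', 'two copy.txt', 'other.txt') {
    open my $fh, '>', $name or die "Cannot create $name: $!";
    close $fh or die "Cannot close $name: $!";
}

# The space is part of the pattern, and each call in scalar context
# hands back the next match until the list is exhausted.
my @found;
while (my $file = glob('* copy.txt')) {
    push @found, $file;
}

chdir $orig or die "Cannot chdir back to $orig: $!";
print "$_\n" for @found;
```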

    bsd_glob

    This function, which is included in the two export tags listed above, takes one or two arguments. The first is the glob pattern. The second is a set of flags ORed together. The available flags are listed below under POSIX FLAGS. If the second argument is omitted, GLOB_CSH (or GLOB_CSH|GLOB_NOCASE on VMS and DOSish systems) is used by default.

    :nocase and :case

    These two export tags globally modify the default flags that bsd_glob() and, except on VMS, Perl's built-in glob operator use. GLOB_NOCASE is turned on or off, respectively.

    csh_glob

    The csh_glob() function can also be exported, but you should not use it directly unless you really know what you are doing. It splits the pattern into words and feeds each one to bsd_glob(). Perl's own glob() function uses this internally.

    POSIX FLAGS

    The POSIX defined flags for bsd_glob() are:

    • GLOB_ERR

      Force bsd_glob() to return an error when it encounters a directory it cannot open or read. Ordinarily bsd_glob() continues to find matches.

    • GLOB_LIMIT

      Make bsd_glob() return an error (GLOB_NOSPACE) when the pattern expands to a size bigger than the system constant ARG_MAX (usually found in limits.h). If your system does not define this constant, bsd_glob() uses sysconf(_SC_ARG_MAX) or _POSIX_ARG_MAX where available (in that order). You can inspect these values using the standard POSIX extension.

    • GLOB_MARK

      Each pathname that is a directory that matches the pattern has a slash appended.

    • GLOB_NOCASE

      By default, file names are assumed to be case sensitive; this flag makes bsd_glob() treat case differences as not significant.

    • GLOB_NOCHECK

      If the pattern does not match any pathname, then bsd_glob() returns a list consisting of only the pattern. If GLOB_QUOTE is set, its effect is present in the pattern returned.

    • GLOB_NOSORT

      By default, the pathnames are sorted in ascending ASCII order; this flag prevents that sorting (speeding up bsd_glob()).

    The FreeBSD extensions to the POSIX standard are the following flags:

    • GLOB_BRACE

      Pre-process the string to expand {pat,pat,...} strings like csh(1). The pattern '{}' is left unexpanded for historical reasons (and csh(1) does the same thing to ease typing of find(1) patterns).

    • GLOB_NOMAGIC

      Same as GLOB_NOCHECK but it only returns the pattern if it does not contain any of the special characters "*", "?" or "[". NOMAGIC is provided to simplify implementing the historic csh(1) globbing behaviour and should probably not be used anywhere else.

    • GLOB_QUOTE

      Use the backslash ('\') character for quoting: every occurrence of a backslash followed by a character in the pattern is replaced by that character, avoiding any special interpretation of the character. (But see below for exceptions on DOSISH systems).

    • GLOB_TILDE

      Expand patterns that start with '~' to user name home directories.

    • GLOB_CSH

      For convenience, GLOB_CSH is a synonym for GLOB_BRACE | GLOB_NOMAGIC | GLOB_QUOTE | GLOB_TILDE | GLOB_ALPHASORT .

    The POSIX provided GLOB_APPEND , GLOB_DOOFFS , and the FreeBSD extensions GLOB_ALTDIRFUNC , and GLOB_MAGCHAR flags have not been implemented in the Perl version because they involve more complex interaction with the underlying C structures.

    The following flag has been added in the Perl implementation for csh compatibility:

    • GLOB_ALPHASORT

      If GLOB_NOSORT is not in effect, sort filenames in alphabetical order (case does not matter) rather than in ASCII order.
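Two of the flags above, GLOB_MARK and GLOB_NOCHECK, are sketched below. The temporary directory and its contents are made up for the example; note that passing a flags argument replaces the default flags rather than adding to them.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob GLOB_MARK GLOB_NOCHECK);
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/sub" or die "Cannot mkdir $dir/sub: $!";
open my $fh, '>', "$dir/file.txt" or die "Cannot create file.txt: $!";
close $fh or die "Cannot close file.txt: $!";

# GLOB_MARK: matching directories come back with a trailing slash.
my @marked = bsd_glob("$dir/*", GLOB_MARK);

# GLOB_NOCHECK: when nothing matches, the pattern itself is returned.
my @fallback = bsd_glob("$dir/*.nomatch", GLOB_NOCHECK);

print "marked:   $_\n" for @marked;
print "fallback: $_\n" for @fallback;
```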

    DIAGNOSTICS

    bsd_glob() returns a list of matching paths, possibly zero length. If an error occurred, &File::Glob::GLOB_ERROR will be non-zero and $! will be set. &File::Glob::GLOB_ERROR is guaranteed to be zero if no error occurred, or one of the following values otherwise:

    • GLOB_NOSPACE

      An attempt to allocate memory failed.

    • GLOB_ABEND

      The glob was stopped because an error was encountered.

    In the case where bsd_glob() has found some matching paths, but is interrupted by an error, it will return a list of filenames and set &File::Glob::GLOB_ERROR.

    Note that bsd_glob() deviates from POSIX and FreeBSD glob(3) behaviour by not considering ENOENT and ENOTDIR as errors - bsd_glob() will continue processing despite those errors, unless the GLOB_ERR flag is set.

    Be aware that all filenames returned from File::Glob are tainted.
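A sketch of checking for errors after a call, assuming a temporary directory that is readable so that the error branch is not expected to run. The wording of the messages is invented for the example.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob GLOB_ERR GLOB_ERROR GLOB_NOSPACE GLOB_ABEND);
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/readme.txt" or die "Cannot create readme.txt: $!";
close $fh or die "Cannot close readme.txt: $!";

my @matches = bsd_glob("$dir/*", GLOB_ERR);
my $status  = GLOB_ERROR();   # read the status right after the call

if ($status) {
    my $why = $status == GLOB_NOSPACE() ? 'memory allocation failed'
            : $status == GLOB_ABEND()   ? 'stopped by an error'
            :                             'unrecognised status';
    warn "bsd_glob failed ($why): $!\n";
}
else {
    printf "%d match(es), no error\n", scalar @matches;
}
```

Even when the status is non-zero, @matches may hold the paths found before the error, so it is worth deciding explicitly whether a partial list is acceptable.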

    NOTES

    • If you want to use multiple patterns, e.g. bsd_glob("a* b*") , you should probably throw them in a set as in bsd_glob("{a*,b*}") . This is because the argument to bsd_glob() isn't subjected to parsing by the C shell. Remember that you can use a backslash to escape things.

    • On DOSISH systems, backslash is a valid directory separator character. In this case, use of backslash as a quoting character (via GLOB_QUOTE) interferes with the use of backslash as a directory separator. The best (simplest, most portable) solution is to use forward slashes for directory separators, and backslashes for quoting. However, this does not match "normal practice" on these systems. As a concession to user expectation, therefore, backslashes (under GLOB_QUOTE) only quote the glob metacharacters '[', ']', '{', '}', '-', '~', and backslash itself. All other backslashes are passed through unchanged.

    • Win32 users should use the real slash. If you really want to use backslashes, consider using Sarathy's File::DosGlob, which comes with the standard Perl distribution.

    SEE ALSO

    glob, glob(3)

    AUTHOR

    The Perl interface was written by Nathan Torkington <gnat@frii.com>, and is released under the artistic license. Further modifications were made by Greg Bacon <gbacon@cs.uah.edu>, Gurusamy Sarathy <gsar@activestate.com>, and Thomas Wegner <wegner_thomas@yahoo.com>. The C glob code has the following copyright:

        Copyright (c) 1989, 1993 The Regents of the University of California.
        All rights reserved.

        This code is derived from software contributed to Berkeley by
        Guido van Rossum.

        Redistribution and use in source and binary forms, with or without
        modification, are permitted provided that the following conditions
        are met:

        1. Redistributions of source code must retain the above copyright
           notice, this list of conditions and the following disclaimer.
        2. Redistributions in binary form must reproduce the above copyright
           notice, this list of conditions and the following disclaimer in the
           documentation and/or other materials provided with the distribution.
        3. Neither the name of the University nor the names of its contributors
           may be used to endorse or promote products derived from this software
           without specific prior written permission.

        THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS "AS IS" AND
        ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
        IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
        ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
        FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
        DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
        OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
        HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
        LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
        OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
        SUCH DAMAGE.
     
    File::GlobMapper

    NAME

    File::GlobMapper - Extend File Glob to Allow Input and Output Files

    SYNOPSIS

        use File::GlobMapper qw( globmap );

        my $aref = globmap $input => $output
            or die $File::GlobMapper::Error ;

        my $gm = new File::GlobMapper $input => $output
            or die $File::GlobMapper::Error ;

    DESCRIPTION

    This module needs Perl 5.005 or better.

    This module takes the existing File::Glob module as a starting point and extends it to allow new filenames to be derived from the files matched by File::Glob.

    This can be useful when carrying out batch operations on multiple files that have both an input filename and an output filename, where the output filename can be derived from the input filename. Examples of operations where this can be useful include file renaming, file copying and file compression.

    Behind The Scenes

    To help explain what File::GlobMapper does, consider what code you would write if you wanted to rename all files in the current directory that ended in .tar.gz to .tgz. So say these files are in the current directory

        alpha.tar.gz
        beta.tar.gz
        gamma.tar.gz

    and they need to be renamed to this

        alpha.tgz
        beta.tgz
        gamma.tgz

    Below is a possible implementation of a script to carry out the rename (error cases have been omitted)

        foreach my $old ( glob "*.tar.gz" )
        {
            my $new = $old;
            $new =~ s#(.*)\.tar\.gz$#$1.tgz# ;

            rename $old => $new
                or die "Cannot rename '$old' to '$new': $!\n";
        }

    Notice that a file glob pattern *.tar.gz was used to match the .tar.gz files, then a fairly similar regular expression was used in the substitute to allow the new filename to be created.

    Given that the file glob is just a cut-down regular expression and that it has already done a lot of the hard work in pattern matching the filenames, wouldn't it be handy to be able to use the patterns in the fileglob to drive the new filename?

    Well, that's exactly what File::GlobMapper does.

    Here is the same snippet of code rewritten using globmap

        for my $pair ( @{ globmap('<*.tar.gz>' => '<#1.tgz>') } )
        {
            my ($from, $to) = @$pair;
            rename $from => $to
                or die "Cannot rename '$from' to '$to': $!\n";
        }

    So how does it work?

    Behind the scenes the globmap function does a combination of a file glob to match existing filenames followed by a substitute to create the new filenames.

    Notice how both parameters to globmap are strings that are delimited by <>. This is done to make them look more like file globs - it is just syntactic sugar, but it can be handy when you want the strings to be visually distinctive. The enclosing <> are optional, so you don't have to use them - in fact the first thing globmap will do is remove these delimiters if they are present.

    The first parameter to globmap , *.tar.gz, is an Input File Glob. Once the enclosing "< ... >" is removed, this is passed (more or less) unchanged to File::Glob to carry out a file match.

    Next the fileglob *.tar.gz is transformed behind the scenes into a full Perl regular expression, with the additional step of wrapping each transformed wildcard metacharacter sequence in parentheses.

    In this case the input fileglob *.tar.gz will be transformed into this Perl regular expression

        ([^/]*)\.tar\.gz

    Wrapping with parentheses allows the wildcard parts of the Input File Glob to be referenced by the second parameter to globmap, #1.tgz, the Output File Glob. This parameter operates just like the replacement part of a substitute command. The difference is that the #1 syntax is used to reference sub-patterns matched in the input fileglob, rather than the $1 syntax that is used with Perl regular expressions. In this case #1 is used to refer to the text matched by the * in the Input File Glob. This makes it easier to use this module where the parameters to globmap are typed at the command line.

    The final step involves passing each filename matched by the *.tar.gz file glob through the derived Perl regular expression in turn and expanding the output fileglob using it.
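The steps described above can be imitated by hand. This is only an illustrative sketch and not the module's own code: the list of matched names is hard-coded, and the variable names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical input: the names a '*.tar.gz' file glob might have matched.
my @matched = qw(alpha.tar.gz beta.tar.gz gamma.tar.gz);

# The input fileglob rewritten as a regular expression, with the
# wildcard wrapped in capturing parentheses and the match anchored.
my $input_re = qr{^([^/]*)\.tar\.gz$};

my @pairs;
for my $from (@matched) {
    my @wild = $from =~ $input_re or next;   # $wild[0] plays the role of #1

    # Expand the output fileglob: each #N becomes the Nth captured piece.
    (my $to = '#1.tgz') =~ s/#(\d)/$wild[$1 - 1]/ge;

    push @pairs, [ $from => $to ];
}

print "$_->[0] => $_->[1]\n" for @pairs;
```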

    The end result of all this is a list of pairs of filenames. By default globmap returns a reference to that list. In this example the referenced data structure will look like this

        [ ['alpha.tar.gz' => 'alpha.tgz'],
          ['beta.tar.gz'  => 'beta.tgz' ],
          ['gamma.tar.gz' => 'gamma.tgz']
        ]

    Each pair is an array reference with two elements - namely the from filename, that File::Glob has matched, and a to filename that is derived from the from filename.
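A runnable sketch of the .tar.gz example, assuming three empty files created in a temporary directory so that nothing outside it is touched. It only computes the pairs and prints them; no file is renamed.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::GlobMapper qw(globmap);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $orig = getcwd();
chdir $dir or die "Cannot chdir to $dir: $!";

for my $name (qw(alpha.tar.gz beta.tar.gz gamma.tar.gz)) {
    open my $fh, '>', $name or die "Cannot create $name: $!";
    close $fh or die "Cannot close $name: $!";
}

# The result is a reference to an array of [from, to] pairs.
my $pairs = globmap('<*.tar.gz>', '<#1.tgz>')
    or die $File::GlobMapper::Error;

chdir $orig or die "Cannot chdir back to $orig: $!";

print "$_->[0] => $_->[1]\n" for @$pairs;
```

Printing the pairs first is a cheap dry run before handing them to rename or move.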

    Limitations

    File::GlobMapper has been kept simple deliberately, so it isn't intended to solve all filename mapping operations. Under the hood File::Glob (or for older versions of Perl, File::BSDGlob) is used to match the files, so you will never have the flexibility of full Perl regular expressions.

    Input File Glob

    The syntax for an Input FileGlob is identical to File::Glob , except for the following

    1. No nested {}

    2. Whitespace does not delimit fileglobs.

    3. Parentheses can be used to capture parts of the input filename.

    4. If an Input glob matches the same file more than once, only the first will be used.

    The syntax

    • ~
    • ~user
    • .

      Matches a literal '.'. Equivalent to the Perl regular expression

          \.

    • *

      Matches zero or more characters, except '/'. Equivalent to the Perl regular expression

          [^/]*

    • ?

      Matches zero or one character, except '/'. Equivalent to the Perl regular expression

          [^/]?

    • \

      Backslash is used, as usual, to escape the next character.

    • []

      Character class.

    • {,}

      Alternation

    • ()

      Capturing parentheses that work just like Perl's

    Any other character is taken literally.

    Output File Glob

    The Output File Glob is a normal string, with 2 glob-like features.

    The first is the '*' metacharacter. This will be replaced by the complete filename matched by the input file glob. So

    1. *.c *.Z

    The second is

    Output FileGlobs take the

    • "*"

      The "*" character will be replaced with the complete input filename.

    • #1

      Patterns of the form /#\d/ will be replaced with the

    Returned Data

    EXAMPLES

    A Rename script

    Below is a simple "rename" script that uses globmap to determine the source and destination filenames.

        use File::GlobMapper qw(globmap) ;
        use File::Copy;

        die "rename: Usage rename 'from' 'to'\n"
            unless @ARGV == 2 ;

        my $fromGlob = shift @ARGV;
        my $toGlob   = shift @ARGV;

        my $pairs = globmap($fromGlob, $toGlob)
            or die $File::GlobMapper::Error;

        for my $pair (@$pairs)
        {
            my ($from, $to) = @$pair;
            move $from => $to ;
        }

    Here is an example that renames all c files to cpp.

        $ rename '*.c' '#1.cpp'

    A few example globmaps

    Below are a few examples of globmaps

    To copy all your .c files to a backup directory

        '</my/home/*.c>' '</my/backup/#1.c>'

    If you want to compress all

        '</my/home/*.[ch]>' '<*.gz>'

    To uncompress

        '</my/home/*.[ch].gz>' '</my/home/#1.#2>'

    SEE ALSO

    File::Glob

    AUTHOR

    The File::GlobMapper module was written by Paul Marquess, pmqs@cpan.org.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005 Paul Marquess. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    File::Path

    NAME

    File::Path - Create or remove directory trees

    VERSION

    This document describes version 2.09 of File::Path, released 2013-01-17.

    SYNOPSIS

        use File::Path qw(make_path remove_tree);

        make_path('foo/bar/baz', '/zug/zwang');
        make_path('foo/bar/baz', '/zug/zwang', {
            verbose => 1,
            mode => 0711,
        });

        remove_tree('foo/bar/baz', '/zug/zwang');
        remove_tree('foo/bar/baz', '/zug/zwang', {
            verbose => 1,
            error => \my $err_list,
        });

        # legacy (interface promoted before v2.00)
        mkpath('/foo/bar/baz');
        mkpath('/foo/bar/baz', 1, 0711);
        mkpath(['/foo/bar/baz', 'blurfl/quux'], 1, 0711);
        rmtree('foo/bar/baz', 1, 1);
        rmtree(['foo/bar/baz', 'blurfl/quux'], 1, 1);

        # legacy (interface promoted before v2.06)
        mkpath('foo/bar/baz', '/zug/zwang', { verbose => 1, mode => 0711 });
        rmtree('foo/bar/baz', '/zug/zwang', { verbose => 1, mode => 0711 });

    DESCRIPTION

    This module provides a convenient way to create directories of arbitrary depth and to delete an entire directory subtree from the filesystem.

    The following functions are provided:

    • make_path( $dir1, $dir2, .... )
    • make_path( $dir1, $dir2, ...., \%opts )

      The make_path function creates the given directories if they do not already exist, much like the Unix command mkdir -p.

      The function accepts a list of directories to be created. Its behaviour may be tuned by an optional hashref appearing as the last parameter on the call.

      The function returns the list of directories actually created during the call; in scalar context the number of directories created.

      The following keys are recognised in the option hash:

      • mode => $num

        The numeric permissions mode to apply to each created directory (defaults to 0777), to be modified by the current umask. If the directory already exists (and thus does not need to be created), the permissions will not be modified.

        mask is recognised as an alias for this parameter.

      • verbose => $bool

        If present, will cause make_path to print the name of each directory as it is created. By default nothing is printed.

      • error => \$err

        If present, it should be a reference to a scalar. This scalar will be made to reference an array, which will be used to store any errors that are encountered. See the ERROR HANDLING section for more information.

        If this parameter is not used, certain error conditions may raise a fatal error that will cause the program to halt, unless trapped in an eval block.

      • owner => $owner
      • user => $owner
      • uid => $owner

        If present, will cause any created directory to be owned by $owner. If the value is numeric, it will be interpreted as a uid, otherwise a username is assumed. An error will be issued if the username cannot be mapped to a uid, or the uid does not exist, or the process lacks the privileges to change ownership.

        Ownership of directories that already exist will not be changed.

        user and uid are aliases of owner.

      • group => $group

        If present, will cause any created directory to be owned by the group $group. If the value is numeric, it will be interpreted as a gid, otherwise a group name is assumed. An error will be issued if the group name cannot be mapped to a gid, or the gid does not exist, or the process lacks the privileges to change group ownership.

        Group ownership of directories that already exist will not be changed.

            make_path '/var/tmp/webcache', {owner=>'nobody', group=>'nogroup'};

    • mkpath( $dir )
    • mkpath( $dir, $verbose, $mode )
    • mkpath( [$dir1, $dir2,...], $verbose, $mode )
    • mkpath( $dir1, $dir2,..., \%opt )

      The mkpath() function provides the legacy interface of make_path() with a different interpretation of the arguments passed. The behaviour and return value of the function are otherwise identical to make_path().

    • remove_tree( $dir1, $dir2, .... )
    • remove_tree( $dir1, $dir2, ...., \%opts )

      The remove_tree function deletes the given directories and any files and subdirectories they might contain, much like the Unix command rm -r or del /s on Windows.

      The function accepts a list of directories to be removed. Its behaviour may be tuned by an optional hashref appearing as the last parameter on the call.

      The function returns the number of files successfully deleted.

      The following keys are recognised in the option hash:

      • verbose => $bool

        If present, will cause remove_tree to print the name of each file as it is unlinked. By default nothing is printed.

      • safe => $bool

        When set to a true value, will cause remove_tree to skip any file that the process lacks the privileges to delete (such as delete privileges on VMS). In other words, the code will make no attempt to alter file permissions. Thus, if the process is interrupted, no filesystem object will be left in a more permissive mode.

      • keep_root => $bool

        When set to a true value, will cause all files and subdirectories to be removed, except the initially specified directories. This comes in handy when cleaning out an application's scratch directory.

        remove_tree( '/tmp', {keep_root => 1} );

      • result => \$res

        If present, it should be a reference to a scalar. This scalar will be made to reference an array, which will be used to store all files and directories unlinked during the call. If nothing is unlinked, the array will be empty.

        remove_tree( '/tmp', {result => \my $list} );
        print "unlinked $_\n" for @$list;

        This is a useful alternative to the verbose key.

      • error => \$err

        If present, it should be a reference to a scalar. This scalar will be made to reference an array, which will be used to store any errors that are encountered. See the ERROR HANDLING section for more information.

        Removing things is a much more dangerous proposition than creating things. As such, there are certain conditions that remove_tree may encounter that are so dangerous that the only sane action left is to kill the program.

        Use error to trap all that is reasonable (problems with permissions and the like), and let it die if things get out of hand. This is the safest course of action.
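
      The keep_root, result and error options can be combined in a single call. The following is a minimal sketch under stated assumptions: the 'cache/old/data.txt' layout is invented for the example, and the tree lives in a scratch directory created with File::Temp rather than in a real application directory.

```perl
use strict;
use warnings;
use File::Path qw(make_path remove_tree);
use File::Temp qw(tempdir);
use File::Spec;

# Scratch directory standing in for an application's work area.
my $scratch = tempdir( CLEANUP => 1 );

# Populate it with a nested directory and one file (hypothetical names).
my $sub  = File::Spec->catdir( $scratch, 'cache', 'old' );
my $file = File::Spec->catfile( $sub, 'data.txt' );
make_path($sub);
open my $out, '>', $file or die "cannot create $file: $!";
print {$out} "stale\n";
close $out or die "cannot close $file: $!";

# Empty the directory but keep the directory itself, collecting the
# names of everything removed and any errors encountered.
remove_tree( $scratch, {
    keep_root => 1,
    result    => \my $removed,
    error     => \my $errors,
} );

print "removed $_\n" for @$removed;
print "errors: ", scalar @$errors, "\n";
```

      Because error is supplied, ordinary problems are collected in @$errors instead of being reported on STDERR; the fatal conditions described above still terminate the program.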

    • rmtree( $dir )
    • rmtree( $dir, $verbose, $safe )
    • rmtree( [$dir1, $dir2,...], $verbose, $safe )
    • rmtree( $dir1, $dir2,..., \%opt )

      The rmtree() function provides the legacy interface of remove_tree() with a different interpretation of the arguments passed. The behaviour and return value of the function are otherwise identical to remove_tree().

    ERROR HANDLING

    • NOTE:

      The following error handling mechanism is considered experimental and is subject to change pending feedback from users.

    If make_path or remove_tree encounters an error, a diagnostic message will be printed to STDERR via carp (for non-fatal errors), or via croak (for fatal errors).

    If this behaviour is not desirable, the error attribute may be used to hold a reference to a variable, which will be used to store the diagnostics. The variable is made a reference to an array of hash references. Each hash contains a single key/value pair where the key is the name of the file, and the value is the error message (including the contents of $! when appropriate). If a general error is encountered, the key will be the empty string.

    An example usage looks like:

    remove_tree( 'foo/bar', 'bar/rat', {error => \my $err} );
    if (@$err) {
        for my $diag (@$err) {
            my ($file, $message) = %$diag;
            if ($file eq '') {
                print "general error: $message\n";
            }
            else {
                print "problem unlinking $file: $message\n";
            }
        }
    }
    else {
        print "No error encountered\n";
    }

    Note that if no errors are encountered, $err will reference an empty array. This means that $err will always end up TRUE, so you need to test @$err to determine if errors occurred.
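
    The same mechanism applies to make_path. The sketch below provokes a failure in a controlled way by asking for a directory underneath a plain file; the names 'plain.txt' and 'subdir' are assumptions made for the illustration, and the exact wording of the messages depends on the operating system.

```perl
use strict;
use warnings;
use File::Path qw(make_path);
use File::Temp qw(tempdir);
use File::Spec;

my $base = tempdir( CLEANUP => 1 );

# Create a plain file, then ask make_path for a directory below it.
# This cannot succeed, so it reliably produces a diagnostic.
my $plain = File::Spec->catfile( $base, 'plain.txt' );
open my $fh, '>', $plain or die "cannot create $plain: $!";
close $fh or die "cannot close $plain: $!";

my $impossible = File::Spec->catdir( $plain, 'subdir' );
my @created = make_path( $impossible, { error => \my $err } );

# Each diagnostic is a single-pair hash: path => message.
for my $diag (@$err) {
    my ($path, $message) = %$diag;
    print "could not create $path: $message\n";
}
print "created: ", scalar @created, "\n";
```

    Testing @$err rather than $err is what distinguishes success from failure here, since $err is always set to an array reference.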

    NOTES

    File::Path blindly exports mkpath and rmtree into the current namespace. These days, this is considered bad style, but to change it now would break too much code. Nonetheless, you are invited to specify what it is you are expecting to use:

    use File::Path 'rmtree';

    The routines make_path and remove_tree are not exported by default. You must specify which ones you want to use.

    use File::Path 'remove_tree';

    Note that a side-effect of the above is that mkpath and rmtree are no longer exported at all. This is due to the way the Exporter module works. If you are migrating a codebase to use the new interface, you will have to list everything explicitly. But that's just good practice anyway.

    use File::Path qw(remove_tree rmtree);

    API CHANGES

    The API was changed in the 2.0 branch. For a time, mkpath and rmtree tried, unsuccessfully, to deal with the two different calling mechanisms. This approach was considered a failure.

    The new semantics are now only available with make_path and remove_tree . The old semantics are only available through mkpath and rmtree . Users are strongly encouraged to upgrade to at least 2.08 in order to avoid surprises.

    SECURITY CONSIDERATIONS

    There were race conditions in the 1.x implementations of File::Path's rmtree function (although sometimes patched depending on the OS distribution or platform). The 2.0 version contains code to avoid the problem mentioned in CVE-2002-0435.

    See the following pages for more information:

    1. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=286905
    2. http://www.nntp.perl.org/group/perl.perl5.porters/2005/01/msg97623.html
    3. http://www.debian.org/security/2005/dsa-696

    Additionally, unless the safe parameter is set (or the third parameter in the traditional interface is TRUE), should a remove_tree be interrupted, files that were originally in read-only mode may now have their permissions set to a read-write (or "delete OK") mode.

    DIAGNOSTICS

    FATAL errors will cause the program to halt (croak), since the problem is so severe that it would be dangerous to continue. (This can always be trapped with eval, but it's not a good idea. Under the circumstances, dying is the best thing to do.)

    SEVERE errors may be trapped using the modern interface. If they are not trapped, or the old interface is used, such an error will cause the program to halt.

    All other errors may be trapped using the modern interface; otherwise they will be reported via carp. Program execution will not be halted.

    • mkdir [path]: [errmsg] (SEVERE)

      make_path was unable to create the path. Probably some sort of permissions error at the point of departure, or insufficient resources (such as free inodes on Unix).

    • No root path(s) specified

      make_path was not given any paths to create. This message is only emitted if the routine is called with the traditional interface. The modern interface will remain silent if given nothing to do.

    • No such file or directory

      On Windows, if make_path gives you this warning, it may mean that you have exceeded your filesystem's maximum path length.

    • cannot fetch initial working directory: [errmsg]

      remove_tree attempted to determine the initial directory by calling Cwd::getcwd , but the call failed for some reason. No attempt will be made to delete anything.

    • cannot stat initial working directory: [errmsg]

      remove_tree attempted to stat the initial directory (after having successfully obtained its name via getcwd ), however, the call failed for some reason. No attempt will be made to delete anything.

    • cannot chdir to [dir]: [errmsg]

      remove_tree attempted to set the working directory in order to begin deleting the objects therein, but was unsuccessful. This is usually a permissions issue. The routine will continue to delete other things, but this directory will be left intact.

    • directory [dir] changed before chdir, expected dev=[n] ino=[n], actual dev=[n] ino=[n], aborting. (FATAL)

      remove_tree recorded the device and inode of a directory, and then moved into it. It then performed a stat on the current directory and detected that the device and inode were no longer the same. As this is at the heart of the race condition problem, the program will die at this point.

    • cannot make directory [dir] read+writeable: [errmsg]

      remove_tree attempted to change the permissions on the current directory to ensure that subsequent unlinkings would not run into problems, but was unable to do so. The permissions remain as they were, and the program will carry on, doing the best it can.

    • cannot read [dir]: [errmsg]

      remove_tree tried to read the contents of the directory in order to acquire the names of the directory entries to be unlinked, but was unsuccessful. This is usually a permissions issue. The program will continue, but the files in this directory will remain after the call.

    • cannot reset chmod [dir]: [errmsg]

      remove_tree , after having deleted everything in a directory, attempted to restore its permissions to the original state but failed. The directory may wind up being left behind.

    • cannot remove [dir] when cwd is [dir]

      The current working directory of the program is /some/path/to/here and you are attempting to remove an ancestor, such as /some/path. The directory tree is left untouched.

      The solution is to chdir out of the child directory to a place outside the directory tree to be removed.

    • cannot chdir to [parent-dir] from [child-dir]: [errmsg], aborting. (FATAL)

      remove_tree, after having deleted everything and restored the permissions of a directory, was unable to chdir back to the parent. The program halts to prevent a race condition from occurring.

    • cannot stat prior working directory [dir]: [errmsg], aborting. (FATAL)

      remove_tree was unable to stat the parent directory after having returned from the child. Since there is no way of knowing if we returned to where we think we should be (by comparing device and inode), the only way out is to croak.

    • previous directory [parent-dir] changed before entering [child-dir], expected dev=[n] ino=[n], actual dev=[n] ino=[n], aborting. (FATAL)

      When remove_tree returned from deleting files in a child directory, a check revealed that the parent directory it returned to wasn't the one it started out from. This is considered a sign of malicious activity.

    • cannot make directory [dir] writeable: [errmsg]

      Just before removing a directory (after having successfully removed everything it contained), remove_tree attempted to set the permissions on the directory to ensure it could be removed and failed. Program execution continues, but the directory may possibly not be deleted.

    • cannot remove directory [dir]: [errmsg]

      remove_tree attempted to remove a directory, but failed. This may be because some objects that could not be removed remain in the directory, or because of a permissions issue. The directory will be left behind.

    • cannot restore permissions of [dir] to [0nnn]: [errmsg]

      After having failed to remove a directory, remove_tree was unable to restore its permissions from a permissive state back to a possibly more restrictive setting. (Permissions given in octal).

    • cannot make file [file] writeable: [errmsg]

      remove_tree attempted to force the permissions of a file to ensure it could be deleted, but failed to do so. It will, however, still attempt to unlink the file.

    • cannot unlink file [file]: [errmsg]

      remove_tree failed to remove a file. Probably a permissions issue.

    • cannot restore permissions of [file] to [0nnn]: [errmsg]

      After having failed to remove a file, remove_tree was also unable to restore the permissions on the file to a possibly less permissive setting. (Permissions given in octal).

    • unable to map [owner] to a uid, ownership not changed

      make_path was instructed to give the ownership of created directories to the symbolic name [owner], but getpwnam did not return the corresponding numeric uid. The directory will be created, but ownership will not be changed.

    • unable to map [group] to a gid, group ownership not changed

      make_path was instructed to give the group ownership of created directories to the symbolic name [group], but getgrnam did not return the corresponding numeric gid. The directory will be created, but group ownership will not be changed.

    SEE ALSO

    • File::Remove

      Allows files and directories to be moved to the Trashcan/Recycle Bin (where they may later be restored if necessary) if the operating system supports such functionality. This feature may one day be made available directly in File::Path .

    • File::Find::Rule

      When removing directory trees, if you want to examine each file to decide whether to delete it (and possibly leave large swathes alone), File::Find::Rule offers a convenient and flexible approach to examining directory trees.

    BUGS

    Please report all bugs on the RT queue:

    http://rt.cpan.org/NoAuth/Bugs.html?Dist=File-Path

    You can also send pull requests to the Github repository:

    https://github.com/dland/File-Path

    ACKNOWLEDGEMENTS

    Paul Szabo identified the race condition originally, and Brendan O'Dea wrote an implementation for Debian that addressed the problem. That code was used as a basis for the current code. Their efforts are greatly appreciated.

    Gisle Aas made a number of improvements to the documentation for 2.07, and his advice and assistance are also greatly appreciated.

    AUTHORS

    Tim Bunce and Charles Bailey. Currently maintained by David Landgren <david@landgren.net>.

    COPYRIGHT

    This module is copyright (C) Charles Bailey, Tim Bunce and David Landgren 1995-2013. All rights reserved.

    LICENSE

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    File::Spec

    NAME

    File::Spec - portably perform operations on file names

    SYNOPSIS

    use File::Spec;
    $x = File::Spec->catfile('a', 'b', 'c');

    which returns 'a/b/c' under Unix. Or:

    use File::Spec::Functions;
    $x = catfile('a', 'b', 'c');

    DESCRIPTION

    This module is designed to support operations commonly performed on file specifications (usually called "file names", but not to be confused with the contents of a file, or Perl's file handles), such as concatenating several directory and file names into a single path, or determining whether a path is rooted. It is based on code directly taken from MakeMaker 5.17, code written by Andreas König, Andy Dougherty, Charles Bailey, Ilya Zakharevich, Paul Schinder, and others.

    Since these functions are different for most operating systems, each set of OS specific routines is available in a separate module, including:

    1. File::Spec::Unix
    2. File::Spec::Mac
    3. File::Spec::OS2
    4. File::Spec::Win32
    5. File::Spec::VMS

    The module appropriate for the current OS is automatically loaded by File::Spec. Since some modules (like VMS) make use of facilities available only under that OS, it may not be possible to load all modules under all operating systems.

    Since File::Spec is object oriented, subroutines should not be called directly, as in:

    File::Spec::catfile('a','b');

    but rather as class methods:

    File::Spec->catfile('a','b');

    For simple uses, File::Spec::Functions provides convenient functional forms of these methods.

    METHODS

    • canonpath

      No physical check on the filesystem, but a logical cleanup of a path.

      $cpath = File::Spec->canonpath( $path );

      Note that this does *not* collapse x/../y sections into y. This is by design. If /foo on your system is a symlink to /bar/baz, then /foo/../quux is actually /bar/quux, not /quux as a naive ../-removal would give you. If you want to do this kind of processing, you probably want Cwd 's realpath() function to actually traverse the filesystem cleaning up paths like this.
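
      The logical cleanup can be illustrated with a short sketch. It calls File::Spec::Unix directly so that the results do not depend on the platform running the example; the sample paths are arbitrary.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# Redundant separators, "." components and the trailing slash are removed.
my $clean = File::Spec::Unix->canonpath('/foo//bar/./baz/');

# A ".." component is kept, because resolving it correctly would require
# knowing whether the preceding component is a symlink.
my $kept = File::Spec::Unix->canonpath('foo/../quux');

print "$clean\n";   # prints /foo/bar/baz
print "$kept\n";    # prints foo/../quux
```

      No filesystem access takes place in either call, which is why neither path needs to exist.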

    • catdir

      Concatenate two or more directory names to form a complete path ending with a directory. But remove the trailing slash from the resulting string, because it doesn't look good, isn't necessary and confuses OS/2. Of course, if this is the root directory, don't cut off the trailing slash :-)

      $path = File::Spec->catdir( @directories );

    • catfile

      Concatenate one or more directory names and a filename to form a complete path ending with a filename.

      $path = File::Spec->catfile( @directories, $filename );

    • curdir

      Returns a string representation of the current directory.

      $curdir = File::Spec->curdir();

    • devnull

      Returns a string representation of the null device.

      $devnull = File::Spec->devnull();

    • rootdir

      Returns a string representation of the root directory.

      $rootdir = File::Spec->rootdir();

    • tmpdir

      Returns a string representation of the first writable directory from a list of possible temporary directories. Returns the current directory if no writable temporary directories are found. The list of directories checked depends on the platform; e.g. File::Spec::Unix checks $ENV{TMPDIR} (unless taint is on) and /tmp.

      $tmpdir = File::Spec->tmpdir();

    • updir

      Returns a string representation of the parent directory.

      $updir = File::Spec->updir();

    • no_upwards

      Given a list of file names, strip out those that refer to a parent directory. (Does not strip symlinks, only '.', '..', and equivalents.)

      @paths = File::Spec->no_upwards( @paths );

    • case_tolerant

      Returns a true or false value indicating, respectively, that alphabetic case is not or is significant when comparing file specifications. Cygwin and Win32 accept an optional drive argument.

      $is_case_tolerant = File::Spec->case_tolerant();

    • file_name_is_absolute

      Takes as its argument a path, and returns true if it is an absolute path.

      $is_absolute = File::Spec->file_name_is_absolute( $path );

      This does not consult the local filesystem on Unix, Win32, OS/2, or Mac OS (Classic). It does consult the working environment for VMS (see file_name_is_absolute in File::Spec::VMS).

    • path

      Takes no argument. Returns the environment variable PATH (or the local platform's equivalent) as a list.

      @PATH = File::Spec->path();

    • join

      join is the same as catfile.

    • splitpath

      Splits a path into volume, directory, and filename portions. On systems with no concept of volume, returns '' for volume.

      ($volume,$directories,$file) =
          File::Spec->splitpath( $path );
      ($volume,$directories,$file) =
          File::Spec->splitpath( $path, $no_file );

      For systems with no syntax differentiating filenames from directories, assumes that the last component is a filename unless $no_file is true or a trailing separator or /. or /.. is present. On Unix, this means that $no_file true makes this return ( '', $path, '' ).

      The directory portion may or may not be returned with a trailing '/'.

      The results can be passed to catpath() to get back a path equivalent to (usually identical to) the original path.

    • splitdir

      The opposite of catdir.

      @dirs = File::Spec->splitdir( $directories );

      $directories must be only the directory portion of the path on systems that have the concept of a volume or that have path syntax that differentiates files from directories.

      Unlike just splitting the directories on the separator, empty directory names ('' ) can be returned, because these are significant on some OSes.

    • catpath()

      Takes volume, directory and file portions and returns an entire path. Under Unix, $volume is ignored, and directory and file are concatenated. A '/' is inserted if need be. On other OSes, $volume is significant.

      $full_path = File::Spec->catpath( $volume, $directory, $file );
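
      A round trip through splitpath, splitdir, catdir and catpath can be sketched as follows. The example uses File::Spec::Unix explicitly so the behaviour is the same on every platform; the sample path is arbitrary.

```perl
use strict;
use warnings;
use File::Spec::Unix;

my $spec = 'File::Spec::Unix';
my $path = '/usr/local/lib/perl.pm';

# Break the path into its three portions. On Unix the volume is ''.
my ($volume, $directories, $file) = $spec->splitpath($path);

# Break the directory portion into components. The leading and trailing
# empty strings come from the leading and trailing separators.
my @dirs = $spec->splitdir($directories);

# Put everything back together again.
my $rebuilt = $spec->catpath( $volume, $spec->catdir(@dirs), $file );

print "directories: $directories\n";                 # prints directories: /usr/local/lib/
print "file:        $file\n";                        # prints file:        perl.pm
print "components:  ", join( '|', @dirs ), "\n";     # prints components:  |usr|local|lib|
print "rebuilt:     $rebuilt\n";                     # prints rebuilt:     /usr/local/lib/perl.pm
```

      The empty components are the reason splitdir is preferable to a plain split on the separator when the result is going to be fed back into catdir.
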
    • abs2rel

      Takes a destination path and an optional base path and returns a relative path from the base path to the destination path:

      $rel_path = File::Spec->abs2rel( $path );
      $rel_path = File::Spec->abs2rel( $path, $base );

      If $base is not present or '', then Cwd::cwd() is used. If $base is relative, then it is converted to absolute form using rel2abs(). This means that it is taken to be relative to Cwd::cwd().

      On systems with the concept of volume, if $path and $base appear to be on two different volumes, we will not attempt to resolve the two paths, and we will instead simply return $path . Note that previous versions of this module ignored the volume of $base , which resulted in garbage results part of the time.

      On systems that have a grammar that indicates filenames, this ignores the $base filename as well. Otherwise all path components are assumed to be directories.

      If $path is relative, it is converted to absolute form using rel2abs(). This means that it is taken to be relative to Cwd::cwd().

      No checks against the filesystem are made. On VMS, there is interaction with the working environment, as logicals and macros are expanded.

      Based on code written by Shigio Yamaguchi.

    • rel2abs()

      Converts a relative path to an absolute path.

      $abs_path = File::Spec->rel2abs( $path );
      $abs_path = File::Spec->rel2abs( $path, $base );

      If $base is not present or '', then Cwd::cwd() is used. If $base is relative, then it is converted to absolute form using rel2abs(). This means that it is taken to be relative to Cwd::cwd().

      On systems with the concept of volume, if $path and $base appear to be on two different volumes, we will not attempt to resolve the two paths, and we will instead simply return $path . Note that previous versions of this module ignored the volume of $base , which resulted in garbage results part of the time.

      On systems that have a grammar that indicates filenames, this ignores the $base filename as well. Otherwise all path components are assumed to be directories.

      If $path is absolute, it is cleaned up and returned using canonpath.

      No checks against the filesystem are made. On VMS, there is interaction with the working environment, as logicals and macros are expanded.

      Based on code written by Shigio Yamaguchi.
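
      The relationship between abs2rel and rel2abs can be sketched with explicit base paths, so that the current working directory plays no part. File::Spec::Unix is called directly to keep the results platform-independent; the sample paths are arbitrary and need not exist.

```perl
use strict;
use warnings;
use File::Spec::Unix;

my $spec = 'File::Spec::Unix';

# Destination below the base: the result simply descends.
my $down = $spec->abs2rel( '/usr/local/lib', '/usr' );

# Destination beside the base: the result climbs out first.
my $side = $spec->abs2rel( '/usr/share', '/usr/local' );

# rel2abs reverses the first conversion when given the same base.
my $abs = $spec->rel2abs( $down, '/usr' );

print "$down\n";   # prints local/lib
print "$side\n";   # prints ../share
print "$abs\n";    # prints /usr/local/lib
```

      Both methods work purely on the strings they are given, which is why symlinks in the real directory tree are not taken into account.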

    For further information, please see File::Spec::Unix, File::Spec::Mac, File::Spec::OS2, File::Spec::Win32, or File::Spec::VMS.

    SEE ALSO

    File::Spec::Unix, File::Spec::Mac, File::Spec::OS2, File::Spec::Win32, File::Spec::VMS, File::Spec::Functions, ExtUtils::MakeMaker

    AUTHOR

    Currently maintained by Ken Williams <KWILLIAMS@cpan.org> .

    The vast majority of the code was written by Kenneth Albanowski <kjahds@kjahds.com> , Andy Dougherty <doughera@lafayette.edu> , Andreas König <A.Koenig@franz.ww.TU-Berlin.DE> , Tim Bunce <Tim.Bunce@ig.co.uk> . VMS support by Charles Bailey <bailey@newman.upenn.edu> . OS/2 support by Ilya Zakharevich <ilya@math.ohio-state.edu> . Mac support by Paul Schinder <schinder@pobox.com> , and Thomas Wegner <wegner_thomas@yahoo.com> . abs2rel() and rel2abs() written by Shigio Yamaguchi <shigio@tamacom.com> , modified by Barrie Slaymaker <barries@slaysys.com> . splitpath(), splitdir(), catpath() and catdir() by Barrie Slaymaker.

    COPYRIGHT

    Copyright (c) 2004-2013 by the Perl 5 Porters. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    File::Temp

    NAME

    File::Temp - return name and handle of a temporary file safely

    SYNOPSIS

    use File::Temp qw/ tempfile tempdir /;
    $fh = tempfile();
    ($fh, $filename) = tempfile();
    ($fh, $filename) = tempfile( $template, DIR => $dir);
    ($fh, $filename) = tempfile( $template, SUFFIX => '.dat');
    ($fh, $filename) = tempfile( $template, TMPDIR => 1 );
    binmode( $fh, ":utf8" );
    $dir = tempdir( CLEANUP => 1 );
    ($fh, $filename) = tempfile( DIR => $dir );

    Object interface:

    require File::Temp;
    use File::Temp ();
    use File::Temp qw/ :seekable /;
    $fh = File::Temp->new();
    $fname = $fh->filename;
    $fh = File::Temp->new(TEMPLATE => $template);
    $fname = $fh->filename;
    $tmp = File::Temp->new( UNLINK => 0, SUFFIX => '.dat' );
    print $tmp "Some data\n";
    print "Filename is $tmp\n";
    $tmp->seek( 0, SEEK_END );

    The following interfaces are provided for compatibility with existing APIs. They should not be used in new code.

    MkTemp family:

    use File::Temp qw/ :mktemp /;
    ($fh, $file) = mkstemp( "tmpfileXXXXX" );
    ($fh, $file) = mkstemps( "tmpfileXXXXXX", $suffix);
    $tmpdir = mkdtemp( $template );
    $unopened_file = mktemp( $template );

    POSIX functions:

    use File::Temp qw/ :POSIX /;
    $file = tmpnam();
    $fh = tmpfile();
    ($fh, $file) = tmpnam();

    Compatibility functions:

    $unopened_file = File::Temp::tempnam( $dir, $pfx );

    DESCRIPTION

    File::Temp can be used to create and open temporary files in a safe way. There is both a function interface and an object-oriented interface. The File::Temp constructor or the tempfile() function can be used to return the name and the open filehandle of a temporary file. The tempdir() function can be used to create a temporary directory.

    The security aspect of temporary file creation is emphasized such that a filehandle and filename are returned together. This helps guarantee that a race condition can not occur where the temporary file is created by another process between checking for the existence of the file and its opening. Additional security levels are provided to check, for example, that the sticky bit is set on world writable directories. See safe_level for more information.

    For compatibility with popular C library functions, Perl implementations of the mkstemp() family of functions are provided. These are, mkstemp(), mkstemps(), mkdtemp() and mktemp().

    Additionally, implementations of the standard POSIX tmpnam() and tmpfile() functions are provided if required.

    Implementations of mktemp(), tmpnam(), and tempnam() are provided, but should be used with caution since they return only a filename that was valid when the function was called, so they cannot guarantee that the file will not exist by the time the caller opens the filename.

    Filehandles returned by these functions support the seekable methods.

    OBJECT-ORIENTED INTERFACE

    This is the primary interface for interacting with File::Temp . Using the OO interface a temporary file can be created when the object is constructed and the file can be removed when the object is no longer required.

    Note that there is no method to obtain the filehandle from the File::Temp object. The object itself acts as a filehandle. The object isa IO::Handle and isa IO::Seekable so all those methods are available.

    Also, the object is configured such that it stringifies to the name of the temporary file and so can be compared to a filename directly. It numifies to the refaddr the same as other handles and so can be compared to other handles with == .

    $fh eq $filename   # as a string
    $fh != \*STDOUT    # as a number
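
    The dual nature of the object (filehandle and filename) can be sketched as follows. The '.dat' suffix and the variable names are arbitrary choices for the illustration.

```perl
use strict;
use warnings;
use File::Temp ();

my $tmp = File::Temp->new( SUFFIX => '.dat' );

# The object is itself the filehandle, so it can be printed to directly
# and IO::Handle methods such as flush are available on it.
print {$tmp} "Some data\n";
$tmp->flush;

# In string context the object yields the filename.
my $same_name = ( "$tmp" eq $tmp->filename ) ? 1 : 0;

# In numeric context it compares like any other handle.
my $same_handle = ( $tmp == $tmp )      ? 1 : 0;
my $is_stdout   = ( $tmp == \*STDOUT )  ? 1 : 0;

print "stringifies to its filename: $same_name\n";
print "equal to itself:             $same_handle\n";
print "equal to STDOUT:             $is_stdout\n";
```

    The file is removed when $tmp goes out of scope, because UNLINK defaults to true for objects.
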
    • new

      Create a temporary file object.

      my $tmp = File::Temp->new();

      By default the object is constructed as if tempfile was called without options, but with the additional behaviour that the temporary file is removed by the object destructor if UNLINK is set to true (the default).

      Supported arguments are the same as for tempfile : UNLINK (defaulting to true), DIR, EXLOCK and SUFFIX. Additionally, the filename template is specified using the TEMPLATE option. The OPEN option is not supported (the file is always opened).

      $tmp = File::Temp->new( TEMPLATE => 'tempXXXXX',
                              DIR => 'mydir',
                              SUFFIX => '.dat');

      Arguments are case insensitive.

      Can call croak() if an error occurs.

    • newdir

      Create a temporary directory using an object oriented interface.

      $dir = File::Temp->newdir();

      By default the directory is deleted when the object goes out of scope.

      Supports the same options as the tempdir function. Note that directories created with this method default to CLEANUP => 1.

      $dir = File::Temp->newdir( $template, %options );

      A template may be specified either with a leading template or with a TEMPLATE argument.

    • filename

      Return the name of the temporary file associated with this object (if the object was created using the "new" constructor).

      $filename = $tmp->filename;

      This method is called automatically when the object is used as a string.

    • dirname

      Return the name of the temporary directory associated with this object (if the object was created using the "newdir" constructor).

      $dirname = $tmpdir->dirname;

      This method is called automatically when the object is used in string context.

    • unlink_on_destroy

      Control whether the file is unlinked when the object goes out of scope. The file is removed if this value is true and $KEEP_ALL is not.

      $fh->unlink_on_destroy( 1 );

      Default is for the file to be removed.

    • DESTROY

      When the object goes out of scope, the destructor is called. This destructor will attempt to unlink the file (using unlink1) if the constructor was called with UNLINK set to 1 (the default state if UNLINK is not specified).

      No error is given if the unlink fails.

      If the object has been passed to a child process during a fork, the file will be deleted when the object goes out of scope in the parent.

      For a temporary directory object the directory will be removed unless the CLEANUP argument was used in the constructor (and set to false) or unlink_on_destroy was modified after creation. Note that if a temp directory is your current directory, it cannot be removed - a warning will be given in this case. chdir() out of the directory before letting the object go out of scope.

      If the global variable $KEEP_ALL is true, the file or directory will not be removed.
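
    The effect of the destructor can be observed with a short sketch. The variable names are illustrative, and the outcome described assumes that $KEEP_ALL has not been set and that the platform allows the files to be unlinked at that point.

```perl
use strict;
use warnings;
use File::Temp ();

my ( $auto_name, $kept_name, $dir_name );
{
    my $auto = File::Temp->new();                # UNLINK defaults to true
    my $kept = File::Temp->new( UNLINK => 0 );   # survives the destructor
    my $dir  = File::Temp->newdir();             # CLEANUP defaults to 1

    # Remember the names so they can be checked after the objects are gone.
    ( $auto_name, $kept_name, $dir_name )
        = ( $auto->filename, $kept->filename, $dir->dirname );
}   # all three objects go out of scope here

my $auto_survived = -e $auto_name ? 1 : 0;
my $kept_survived = -e $kept_name ? 1 : 0;
my $dir_survived  = -d $dir_name  ? 1 : 0;

print "default file still exists:   $auto_survived\n";
print "UNLINK => 0 file exists:     $kept_survived\n";
print "directory still exists:      $dir_survived\n";

# A file created with UNLINK => 0 has to be cleaned up by hand.
unlink $kept_name or warn "could not remove $kept_name: $!";
```

    Only the file created with UNLINK => 0 is expected to survive the end of the block; the other file and the directory are removed by their destructors.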

    FUNCTIONS

    This section describes the recommended interface for generating temporary files and directories.

    • tempfile

      This is the basic function to generate temporary files. The behaviour of the file can be changed using various options:

      $fh = tempfile();
      ($fh, $filename) = tempfile();

      Create a temporary file in the directory specified for temporary files, as specified by the tmpdir() function in File::Spec.

      ($fh, $filename) = tempfile($template);

      Create a temporary file in the current directory using the supplied template. Trailing `X' characters are replaced with random letters to generate the filename. At least four `X' characters must be present at the end of the template.

      ($fh, $filename) = tempfile($template, SUFFIX => $suffix);

      Same as previously, except that a suffix is added to the template after the `X' translation. Useful for ensuring that a temporary filename has a particular extension when needed by other applications. But see the WARNING at the end.

      ($fh, $filename) = tempfile($template, DIR => $dir);

      Translates the template as before except that a directory name is specified.

      ($fh, $filename) = tempfile($template, TMPDIR => 1);

      Equivalent to specifying a DIR of "File::Spec->tmpdir", writing the file into the same temporary directory as would be used if no template was specified at all.

          ($fh, $filename) = tempfile($template, UNLINK => 1);

      Return the filename and filehandle as before except that the file is automatically removed when the program exits (dependent on $KEEP_ALL). Default is for the file to be removed if a file handle is requested and to be kept if the filename is requested. In a scalar context (where no filename is returned) the file is always deleted either (depending on the operating system) on exit or when it is closed (unless $KEEP_ALL is true when the temp file is created).

      Use the object-oriented interface if fine-grained control of when a file is removed is required.

      If the template is not specified, a template is always automatically generated. This temporary file is placed in tmpdir() (File::Spec) unless a directory is specified explicitly with the DIR option.

          $fh = tempfile( DIR => $dir );

      If called in scalar context, only the filehandle is returned and the file will automatically be deleted when closed on operating systems that support this (see the description of tmpfile() elsewhere in this document). This is the preferred mode of operation, because if you only have a filehandle, you can never create a race condition by fumbling with the filename. On systems that cannot unlink an open file or cannot mark a file as temporary when it is opened (for example, Windows NT uses the O_TEMPORARY flag), the file is marked for deletion when the program ends (equivalent to setting UNLINK to 1). The UNLINK flag is ignored if present.

          (undef, $filename) = tempfile($template, OPEN => 0);

      This will return the filename based on the template but will not open this file. Cannot be used in conjunction with UNLINK set to true. The default is to always open the file to protect against possible race conditions, so a warning is issued if OPEN is false and warnings are turned on. Consider using the tmpnam() and mktemp() functions described elsewhere in this document if opening the file is not required.

      If the operating system supports it (for example BSD derived systems), the filehandle will be opened with O_EXLOCK (open with exclusive file lock). This can sometimes cause problems if the intention is to pass the filename to another system that expects to take an exclusive lock itself (such as DBD::SQLite) whilst ensuring that the tempfile is not reused. In this situation the "EXLOCK" option can be passed to tempfile. By default EXLOCK will be true (this retains compatibility with earlier releases).

          ($fh, $filename) = tempfile($template, EXLOCK => 0);

      Options can be combined as required.

      Will croak() if there is an error.
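
      As an illustration of combining options, the following sketch requests a templated name with a suffix in the system temporary directory and asks for removal at program exit. The template 'reportXXXXXX' and the '.txt' suffix are arbitrary choices for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Template, location, suffix and removal policy in one call.
my ( $fh, $filename ) = tempfile(
    'reportXXXXXX',
    TMPDIR => 1,        # place the file in File::Spec->tmpdir
    SUFFIX => '.txt',   # appended after the X's are replaced
    UNLINK => 1,        # remove the file when the program exits
);

print {$fh} "hello\n";
close $fh or die "close failed: $!";

print "wrote $filename\n";
```

      Because removal is deferred to program exit here, the file is still present after the close; use the object-oriented interface if it must disappear earlier.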

    • tempdir

      This is the recommended interface for creation of temporary directories. By default the directory will not be removed on exit (that is, it won't be temporary; this behaviour can not be changed because of issues with backwards compatibility). To enable removal either use the CLEANUP option which will trigger removal on program exit, or consider using the "newdir" method in the object interface which will allow the directory to be cleaned up when the object goes out of scope.

      The behaviour of the function depends on the arguments:

          $tempdir = tempdir();

      Create a directory in tmpdir() (see File::Spec).

          $tempdir = tempdir( $template );

      Create a directory from the supplied template. This template is similar to that described for tempfile(). `X' characters at the end of the template are replaced with random letters to construct the directory name. At least four `X' characters must be in the template.

          $tempdir = tempdir ( DIR => $dir );

      Specifies the directory to use for the temporary directory. The temporary directory name is derived from an internal template.

          $tempdir = tempdir ( $template, DIR => $dir );

      Prepend the supplied directory name to the template. The template should not include parent directory specifications itself. Any parent directory specifications are removed from the template before prepending the supplied directory.

          $tempdir = tempdir ( $template, TMPDIR => 1 );

      Using the supplied template, create the temporary directory in a standard location for temporary files. Equivalent to doing

          $tempdir = tempdir ( $template, DIR => File::Spec->tmpdir);

      but shorter. Parent directory specifications are stripped from the template itself. The TMPDIR option is ignored if DIR is set explicitly. Additionally, TMPDIR is implied if neither a template nor a directory is supplied.

          $tempdir = tempdir( $template, CLEANUP => 1);

      Create a temporary directory using the supplied template, but attempt to remove it (and all files inside it) when the program exits. Note that an attempt will be made to remove all files from the directory even if they were not created by this module (otherwise why ask to clean it up?). The directory removal is made with the rmtree() function from the File::Path module. Of course, if the template is not specified, the temporary directory will be created in tmpdir() and will also be removed at program exit.

      Will croak() if there is an error.
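
      A short sketch of a scratch directory that is removed at program exit. The template 'workXXXXXX' and the file name 'scratch.txt' are illustrative assumptions, not names used by the module.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Without CLEANUP => 1 the directory would be left behind on exit.
my $dir = tempdir( 'workXXXXXX', TMPDIR => 1, CLEANUP => 1 );

# Files created inside the directory are removed along with it.
my $path = File::Spec->catfile( $dir, 'scratch.txt' );
open my $out, '>', $path or die "Cannot write $path: $!";
print {$out} "temporary data\n";
close $out or die "close failed: $!";

print "working in $dir\n";
```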

    MKTEMP FUNCTIONS

    The following functions are Perl implementations of the mktemp() family of temp file generation system calls.

    • mkstemp

      Given a template, returns a filehandle to the temporary file and the name of the file.

          ($fh, $name) = mkstemp( $template );

      In scalar context, just the filehandle is returned.

      The template may be any filename with some number of X's appended to it, for example /tmp/temp.XXXX. The trailing X's are replaced with unique alphanumeric combinations.

      Will croak() if there is an error.

    • mkstemps

      Similar to mkstemp(), except that an extra argument can be supplied with a suffix to be appended to the template.

          ($fh, $name) = mkstemps( $template, $suffix );

      For example a template of testXXXXXX and suffix of .dat would generate a file similar to testhGji_w.dat.

      Returns just the filehandle alone when called in scalar context.

      Will croak() if there is an error.

    • mkdtemp

      Create a directory from a template. The template must end in X's that are replaced by the routine.

          $tmpdir_name = mkdtemp($template);

      Returns the name of the temporary directory created.

      Directory must be removed by the caller.

      Will croak() if there is an error.

    • mktemp

      Returns a valid temporary filename but does not guarantee that the file will not be opened by someone else.

          $unopened_file = mktemp($template);

      Template is the same as that required by mkstemp().

      Will croak() if there is an error.
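
      The following sketch combines mkdtemp() and mkstemps(). Since neither registers anything for automatic removal, the example cleans up explicitly. The 'demoXXXXXX' and 'testXXXXXX' templates are illustrative.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(mkdtemp mkstemps);

# mkdtemp() returns the directory name; removal is the caller's job.
my $dir = mkdtemp( File::Spec->catdir( File::Spec->tmpdir, 'demoXXXXXX' ) );

# mkstemps() opens the file and appends the suffix after the X's.
my ( $fh, $name ) =
    mkstemps( File::Spec->catfile( $dir, 'testXXXXXX' ), '.dat' );
print {$fh} "payload\n";
close $fh or die "close failed: $!";

my $existed   = -e $name;
my $suffix_ok = $name =~ /\.dat\z/;

# Explicit cleanup, innermost first.
unlink $name or die "Cannot remove $name: $!";
rmdir $dir   or die "Cannot remove $dir: $!";
my $cleaned = !-d $dir;
```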

    POSIX FUNCTIONS

    This section describes the re-implementation of the tmpnam() and tmpfile() functions described in POSIX, using mkstemp() from this module.

    Unlike the POSIX implementations, the directory used for the temporary file is not specified in a system include file (P_tmpdir) but simply depends on the choice of tmpdir() returned by File::Spec. On some implementations this location can be set using the TMPDIR environment variable, which may not be secure. If this is a problem, simply use mkstemp() and specify a template.

    • tmpnam

      When called in scalar context, returns the full name (including path) of a temporary file (uses mktemp()). The only check is that the file does not already exist, but there is no guarantee that that condition will continue to apply.

          $file = tmpnam();

      When called in list context, a filehandle to the open file and a filename are returned. This is achieved by calling mkstemp() after constructing a suitable template.

          ($fh, $file) = tmpnam();

      If possible, this form should be used to prevent possible race conditions.

      See tmpdir in File::Spec for information on the choice of temporary directory for a particular operating system.

      Will croak() if there is an error.

    • tmpfile

      Returns the filehandle of a temporary file.

          $fh = tmpfile();

      The file is removed when the filehandle is closed or when the program exits. No access to the filename is provided.

      If the temporary file can not be created undef is returned. Currently this command will probably not work when the temporary directory is on an NFS file system.

      Will croak() if there is an error.

    ADDITIONAL FUNCTIONS

    These functions are provided for backwards compatibility with common tempfile generation C library functions.

    They are not exported and must be addressed using the full package name.

    • tempnam

      Return the name of a temporary file in the specified directory using a prefix. The file is guaranteed not to exist at the time the function was called, but such guarantees are good for one clock tick only. Always use the proper form of sysopen with O_CREAT | O_EXCL if you must open such a filename.

          $filename = File::Temp::tempnam( $dir, $prefix );

      Equivalent to running mktemp() with $dir/$prefixXXXXXXXX (using unix file convention as an example).

      Because this function uses mktemp(), it can suffer from race conditions.

      Will croak() if there is an error.
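
      If a name from tempnam() really must be opened, the advice above about O_CREAT | O_EXCL could look like the sketch below. The prefix 'demo' and the 0600 permissions are assumptions made for the example.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_RDWR);
use File::Spec;
use File::Temp ();

# tempnam() is not exported, so the full package name is required.
my $filename = File::Temp::tempnam( File::Spec->tmpdir, 'demo' );

# O_EXCL makes the open fail if someone created the file in the meantime.
sysopen( my $fh, $filename, O_CREAT | O_EXCL | O_RDWR, 0600 )
    or die "Could not create $filename exclusively: $!";
print {$fh} "data\n";
close $fh or die "close failed: $!";

my $created = -e $filename;
unlink $filename or die "Cannot remove $filename: $!";
```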

    UTILITY FUNCTIONS

    Useful functions for dealing with the filehandle and filename.

    • unlink0

      Given an open filehandle and the associated filename, make a safe unlink. This is achieved by first checking that the filename and filehandle initially point to the same file and that the number of links to the file is 1 (all fields returned by stat() are compared). Then the filename is unlinked and the filehandle checked once again to verify that the number of links on that file is now 0. This is the closest you can come to making sure that the filename unlinked was the same as the file whose descriptor you hold.

          unlink0($fh, $path)
              or die "Error unlinking file $path safely";

      Returns false on error but will croak() if there is a security anomaly. The filehandle is not closed since on some occasions this is not required.

      On some platforms, for example Windows NT, it is not possible to unlink an open file (the file must be closed first). On those platforms, the actual unlinking is deferred until the program ends and good status is returned. A check is still performed to make sure that the filehandle and filename are pointing to the same thing (but not at the time the end block is executed since the deferred removal may not have access to the filehandle).

      Additionally, on Windows NT not all the fields returned by stat() can be compared. For example, the dev and rdev fields seem to be different. Also, it seems that the size of the file returned by stat() does not always agree, with stat(FH) being more accurate than stat(filename), presumably because of caching issues even when using autoflush (this is usually overcome by waiting a while after writing to the tempfile before attempting to unlink0 it).

      Finally, on NFS file systems the link count of the file handle does not always go to zero immediately after unlinking. Currently, this command is expected to fail on NFS disks.

      This function is disabled if the global variable $KEEP_ALL is true and an unlink on open file is supported. If the unlink is to be deferred to the END block, the file is still registered for removal.

      This function should not be called if you are using the object oriented interface since it will interfere with the object destructor deleting the file.

    • cmpstat

      Compare stat of filehandle with stat of provided filename. This can be used to check that the filename and filehandle initially point to the same file and that the number of links to the file is 1 (all fields returned by stat() are compared).

          cmpstat($fh, $path)
              or die "Error comparing handle with file";

      Returns false if the stat information differs or if the link count is greater than 1. Calls croak if there is a security anomaly.

      On certain platforms, for example Windows, not all the fields returned by stat() can be compared. For example, the dev and rdev fields seem to be different in Windows. Also, it seems that the size of the file returned by stat() does not always agree, with stat(FH) being more accurate than stat(filename), presumably because of caching issues even when using autoflush (this is usually overcome by waiting a while after writing to the tempfile before attempting to unlink0 it).

      Not exported by default.

    • unlink1

      Similar to unlink0 except after file comparison using cmpstat, the filehandle is closed prior to attempting to unlink the file. This allows the file to be removed without using an END block, but does mean that the post-unlink comparison of the filehandle state provided by unlink0 is not available.

          unlink1($fh, $path)
              or die "Error closing and unlinking file";

      Usually called from the object destructor when using the OO interface.

      Not exported by default.

      This function is disabled if the global variable $KEEP_ALL is true.

      Can call croak() if there is a security anomaly during the stat() comparison.

    • cleanup

      Calling this function will cause any temp files or temp directories that are registered for removal to be removed. This happens automatically when the process exits but can be triggered manually if the caller is sure that none of the temp files are required. This method can be registered as an Apache callback.

      Note that if a temp directory is your current directory, it cannot be removed. chdir() out of the directory first before calling cleanup(). (For the cleanup at program exit when the CLEANUP flag is set, this happens automatically.)

      On OSes where temp files are automatically removed when the temp file is closed, calling this function will have no effect other than to remove temporary directories (which may include temporary files).

          File::Temp::cleanup();

      Not exported by default.
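
      A sketch of triggering cleanup() by hand once the temporary data is no longer needed, rather than waiting for the process to exit. The file name 'work.txt' is illustrative.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Register a directory for removal.
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'work.txt' );
open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} "intermediate results\n";
close $out or die "close failed: $!";

# The current directory is not inside $dir, so it can be removed now.
File::Temp::cleanup();

my $still_there = -d $dir;
print "directory ", ( $still_there ? "still exists" : "was removed" ), "\n";
```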

    PACKAGE VARIABLES

    These functions control the global state of the package.

    • safe_level

      Controls the lengths to which the module will go to check the safety of the temporary file or directory before proceeding. Options are:

      • STANDARD

        Do the basic security measures to ensure the directory exists and is writable, that temporary files are opened only if they do not already exist, and that possible race conditions are avoided. Finally the unlink0 function is used to remove files safely.

      • MEDIUM

        In addition to the STANDARD security, the output directory is checked to make sure that it is owned either by root or the user running the program. If the directory is writable by group or by other, it is then checked to make sure that the sticky bit is set.

        Will not work on platforms that do not support the -k test for sticky bit.

      • HIGH

        In addition to the MEDIUM security checks, also check for the possibility of ``chown() giveaway'' using the POSIX sysconf() function. If this is a possibility, each directory in the path is checked in turn for safeness, recursively walking back to the root directory.

        For platforms that do not support the POSIX _PC_CHOWN_RESTRICTED symbol (for example, Windows NT) it is assumed that ``chown() giveaway'' is possible and the recursive test is performed.

      The level can be changed as follows:

          File::Temp->safe_level( File::Temp::HIGH );

      The level constants are not exported by the module.

      Currently, you must be running at least perl v5.6.0 in order to run with MEDIUM or HIGH security. This is simply because the safety tests use functions from Fcntl that are not available in older versions of perl. The problem is that the version number for Fcntl is the same in perl 5.6.0 and in 5.005_03 even though they are different versions.

      On systems that do not support the HIGH or MEDIUM safety levels (for example Win NT or OS/2) any attempt to change the level will be ignored. The decision to ignore rather than raise an exception allows portable programs to be written with high security in mind for the systems that can support this without those programs failing on systems where the extra tests are irrelevant.

      If you really need to see whether the change has been accepted, simply examine the return value of safe_level.

          $newlevel = File::Temp->safe_level( File::Temp::HIGH );
          die "Could not change to high security"
              if $newlevel != File::Temp::HIGH;

    • TopSystemUID

      This is the highest UID on the current system that refers to a root UID. This is used to make sure that the temporary directory is owned by a system UID (root, bin, sys etc) rather than simply by root.

      This is required since on many unix systems /tmp is not owned by root.

      Default is to assume that any UID less than or equal to 10 is a root UID.

          File::Temp->top_system_uid(10);
          my $topid = File::Temp->top_system_uid;

      This value can be adjusted to reduce security checking if required. The value is only relevant when safe_level is set to MEDIUM or higher.

    • $KEEP_ALL

      Controls whether temporary files and directories should be retained regardless of any instructions in the program to remove them automatically. This is useful for debugging but should not be used in production code.

          $File::Temp::KEEP_ALL = 1;

      Default is for files to be removed as requested by the caller.

      In some cases, files will only be retained if this variable is true when the file is created. This means that you can not create a temporary file, set this variable and expect the temp file to still be around when the program exits.

    • $DEBUG

      Controls whether debugging messages should be enabled.

          $File::Temp::DEBUG = 1;

      Default is for debugging mode to be disabled.
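
      The timing caveat for $KEEP_ALL can be illustrated as follows. This is a debugging sketch only; it assumes the object-oriented constructor File::Temp->new, and sets the variable before the file is created, as the text above requires.

```perl
use strict;
use warnings;
use File::Temp ();

$File::Temp::KEEP_ALL = 1;      # must be true when the file is created
my $tmp  = File::Temp->new;
my $name = $tmp->filename;
print {$tmp} "state worth inspecting\n";
undef $tmp;                     # destructor runs, but the file is retained
$File::Temp::KEEP_ALL = 0;      # restore normal behaviour

my $survived = -e $name;
print "kept for inspection: $name\n" if $survived;

# Remove it by hand once it has been examined.
unlink $name or warn "Could not remove $name: $!";
```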

    WARNING

    For maximum security, endeavour always to avoid ever looking at, touching, or even imputing the existence of the filename. You do not know that that filename is connected to the same file as the handle you have, and attempts to check this can only trigger more race conditions. It's far more secure to use the filehandle alone and dispense with the filename altogether.

    If you need to pass the handle to something that expects a filename then on a unix system you can use "/dev/fd/" . fileno($fh) for arbitrary programs. Perl code that uses the 2-argument version of open can be passed "+<=&" . fileno($fh). Otherwise you will need to pass the filename. You will have to clear the close-on-exec bit on that file descriptor before passing it to another process.

        use Fcntl qw/F_SETFD F_GETFD/;
        fcntl($tmpfh, F_SETFD, 0)
            or die "Can't clear close-on-exec flag on temp fh: $!\n";

    Temporary files and NFS

    Some problems are associated with using temporary files that reside on NFS file systems and it is recommended that a local filesystem is used whenever possible. Some of the security tests will most probably fail when the temp file is not local. Additionally, be aware that the performance of I/O operations over NFS will not be as good as for a local disk.

    Forking

    In some cases files created by File::Temp are removed from within an END block. Since END blocks are triggered when a child process exits (unless POSIX::_exit() is used by the child) File::Temp takes care to only remove those temp files created by a particular process ID. This means that a child will not attempt to remove temp files created by the parent process.

    If you are forking many processes in parallel that are all creating temporary files, you may need to reset the random number seed using srand(EXPR) in each child. Otherwise all the children will attempt to walk through the same set of random file names and may well give up if they exceed the number of retry attempts.
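
    A sketch of reseeding in each child, assuming a platform with a real fork(). The number of children and what each one writes are arbitrary choices for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my @pids;
for my $n ( 1 .. 3 ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        srand();    # fresh seed so siblings do not share a name sequence
        my ( $fh, $name ) = tempfile( UNLINK => 1 );
        print {$fh} "child $n\n";
        close $fh or die "close failed: $!";
        exit 0;     # END blocks run; only this child's files are removed
    }
    push @pids, $pid;
}

# Collect the children and count any that did not exit cleanly.
my $failures = 0;
for my $pid (@pids) {
    waitpid $pid, 0;
    $failures++ if $? != 0;
}
print "children that failed: $failures\n";
```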

    Directory removal

    Note that if you have chdir'ed into the temporary directory and it is subsequently cleaned up (either in the END block or as part of object destruction), then you will get a warning from File::Path::rmtree().

    Taint mode

    If you need to run code under taint mode, updating to the latest File::Spec is highly recommended.

    BINMODE

    The file returned by File::Temp will have been opened in binary mode if such a mode is available. If that is not correct, use the binmode() function to change the mode of the filehandle.

    Note that you can modify the encoding of a file opened by File::Temp also by using binmode().
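
    For example, an encoding layer can be applied to the handle before writing text. This is a minimal sketch; the choice of UTF-8 and the sample string are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ( $fh, $filename ) = tempfile( UNLINK => 1 );

# Switch the handle from raw bytes to UTF-8 encoded text.
binmode( $fh, ':encoding(UTF-8)' ) or die "binmode failed: $!";
print {$fh} "caf\x{e9}\n";
close $fh or die "close failed: $!";

# Read the line back through the same encoding layer.
open my $in, '<:encoding(UTF-8)', $filename
    or die "Cannot read $filename: $!";
my $line = <$in>;
close $in or die "close failed: $!";
```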

    HISTORY

    File::Temp began life in May 1999 as an XS interface to the system mkstemp() function. In March 2000, the OpenBSD mkstemp() code was translated to Perl for total control of the code's security checking, to ensure the presence of the function regardless of operating system and to help with portability. The module was shipped as a standard part of perl from v5.6.1.

    SEE ALSO

    tmpnam in POSIX, tmpfile in POSIX, File::Spec, File::Path

    See IO::File and File::MkTemp, Apache::TempFile for different implementations of temporary file handling.

    See File::Tempdir for an alternative object-oriented wrapper for the tempdir function.

    AUTHOR

    Tim Jenness <tjenness@cpan.org>

    Copyright (C) 2007-2010 Tim Jenness. Copyright (C) 1999-2007 Tim Jenness and the UK Particle Physics and Astronomy Research Council. All Rights Reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Original Perl implementation loosely based on the OpenBSD C code for mkstemp(). Thanks to Tom Christiansen for suggesting that this module should be written and providing ideas for code improvements and security enhancements.

     

    File::stat

    NAME

    File::stat - by-name interface to Perl's built-in stat() functions

    SYNOPSIS

        use File::stat;
        $st = stat($file) or die "No $file: $!";
        if ( ($st->mode & 0111) && ($st->nlink > 1) ) {
            print "$file is executable with lotsa links\n";
        }
        if ( -x $st ) {
            print "$file is executable\n";
        }

        use Fcntl "S_IRUSR";
        if ( $st->cando(S_IRUSR, 1) ) {
            print "My effective uid can read $file\n";
        }

        use File::stat qw(:FIELDS);
        stat($file) or die "No $file: $!";
        if ( ($st_mode & 0111) && ($st_nlink > 1) ) {
            print "$file is executable with lotsa links\n";
        }

    DESCRIPTION

    This module's default exports override the core stat() and lstat() functions, replacing them with versions that return "File::stat" objects. This object has methods that return the similarly named structure fields from the stat(2) function; namely, dev, ino, mode, nlink, uid, gid, rdev, size, atime, mtime, ctime, blksize, and blocks.

    As of version 1.02 (provided with perl 5.12) the object provides "-X" overloading, so you can call filetest operators (-f, -x, and so on) on it. It also provides a ->cando method, called like

        $st->cando( ACCESS, EFFECTIVE )

    where ACCESS is one of S_IRUSR, S_IWUSR or S_IXUSR from the Fcntl module, and EFFECTIVE indicates whether to use effective (true) or real (false) ids. The method interprets the mode, uid and gid fields, and returns whether or not the current process would be allowed the specified access.

    If you don't want to use the objects, you may import the ->cando method into your namespace as a regular function called stat_cando. This takes an arrayref containing the return values of stat or lstat as its first argument, and interprets it for you.

    You may also import all the structure fields directly into your namespace as regular variables using the :FIELDS import tag. (Note that this still overrides your stat() and lstat() functions.) Access these fields as variables named with st_ prepended to their method names. Thus, $stat_obj->dev() corresponds to $st_dev if you import the fields.

    To access this functionality without the core overrides, pass use an empty import list, and then access the functions by their fully qualified names. On the other hand, even with the overrides in place, the built-ins are still available via the CORE:: pseudo-package.
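
    A sketch of using the module without the overrides, together with stat_cando on a plain stat list. It borrows File::Temp only to have a file of known size to examine; that choice and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use Fcntl qw(S_IRUSR);
use File::stat ();              # empty import list: no overrides installed
use File::Temp qw(tempfile);

# Create a small file to examine.
my ( $fh, $file ) = tempfile( UNLINK => 1 );
print {$fh} "hello\n";
close $fh or die "close failed: $!";

# Object interface, reached through the fully qualified name.
my $st = File::stat::stat($file) or die "No $file: $!";
printf "%s is %d bytes\n", $file, $st->size;

# stat() here is still the core builtin and returns a plain list.
my @fields = stat($file) or die "No $file: $!";

# stat_cando takes an arrayref of those values as its first argument.
my $can_read = File::stat::stat_cando( \@fields, S_IRUSR, 1 );
print "effective uid can read it\n" if $can_read;
```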

    BUGS

    As of Perl 5.8.0, after using this module you cannot use the implicit $_ or the special filehandle _ with stat() or lstat(); trying to do so leads to strange errors. The workaround is for $_ to be explicit

        my $stat_obj = stat $_;

    and for _ to explicitly populate the object using the unexported and undocumented populate() function with CORE::stat():

        my $stat_obj = File::stat::populate(CORE::stat(_));

    ERRORS

    • -%s is not implemented on a File::stat object

      The filetest operators -t, -T and -B are not implemented, as they require more information than just a stat buffer.

    WARNINGS

    These can all be disabled with

        no warnings "File::stat";

    • File::stat ignores use filetest 'access'

      You have tried to use one of the -rwxRWX filetests with use filetest 'access' in effect. File::stat will ignore the pragma, and just use the information in the mode member as usual.

    • File::stat ignores VMS ACLs

      VMS systems have a permissions structure that cannot be completely represented in a stat buffer, and unlike on other systems the builtin filetest operators respect this. The File::stat overloads, however, do not, since the information required is not available.

    NOTE

    While this class is currently implemented using the Class::Struct module to build a struct-like class, you shouldn't rely upon this.

    AUTHOR

    Tom Christiansen

     

    File::Spec::Cygwin

    NAME

    File::Spec::Cygwin - methods for Cygwin file specs

    SYNOPSIS

        require File::Spec::Cygwin; # Done internally by File::Spec if needed

    DESCRIPTION

    See File::Spec and File::Spec::Unix. This package overrides the implementation of these methods, not the semantics.

    This module is still in beta. Cygwin-knowledgeable folks are invited to offer patches and suggestions.

    • canonpath

      Any \ (backslashes) are converted to / (forward slashes), and then File::Spec::Unix canonpath() is called on the result.

    • file_name_is_absolute

      Returns true if the file name begins with drive_letter:; otherwise File::Spec::Unix file_name_is_absolute() is called.

    • tmpdir (override)

      Returns a string representation of the first existing directory from the following list:

      1. $ENV{TMPDIR}
      2. /tmp
      3. $ENV{'TMP'}
      4. $ENV{'TEMP'}
      5. C:/temp

      Since Perl 5.8.0, if running under taint mode, and if the environment variables are tainted, they are not used.

    • case_tolerant

      Overrides the Unix implementation. Cygwin case-tolerance depends on managed mount settings and, as with MsWin32, on whether GetVolumeInformation() reports $ouFsFlags == FS_CASE_SENSITIVE, which indicates the case significance when comparing file specifications. Default: 1
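
      The canonpath and file_name_is_absolute overrides can be tried on any platform by calling the class directly, as in this sketch. The sample paths are invented for the example.

```perl
use strict;
use warnings;
use File::Spec::Cygwin;

# Backslashes become forward slashes, then the Unix cleanup is applied.
my $clean = File::Spec::Cygwin->canonpath('foo\\bar\\.\\baz.txt');

# A leading drive letter counts as absolute.
my $absolute = File::Spec::Cygwin->file_name_is_absolute('C:/temp');

# Anything else falls through to the Unix rule (leading slash).
my $relative = File::Spec::Cygwin->file_name_is_absolute('temp/file');

print "canonical: $clean\n";
print "C:/temp is ",   ( $absolute ? "absolute" : "relative" ), "\n";
print "temp/file is ", ( $relative ? "absolute" : "relative" ), "\n";
```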

    COPYRIGHT

    Copyright (c) 2004,2007 by the Perl 5 Porters. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    File::Spec::Epoc

    NAME

    File::Spec::Epoc - methods for Epoc file specs

    SYNOPSIS

        require File::Spec::Epoc; # Done internally by File::Spec if needed

    DESCRIPTION

    See File::Spec::Unix for documentation of the methods provided there. This package overrides the implementation of these methods, not the semantics.

    This package is still a work in progress ;-)

    • canonpath()

      No physical check on the filesystem, but a logical cleanup of a path. On UNIX, eliminates successive slashes and successive "/.".

    AUTHOR

    o.flebbe@gmx.de

    COPYRIGHT

    Copyright (c) 2004 by the Perl 5 Porters. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    See File::Spec and File::Spec::Unix. This package overrides the implementation of these methods, not the semantics.

     

    File::Spec::Functions

    NAME

    File::Spec::Functions - portably perform operations on file names

    SYNOPSIS

        use File::Spec::Functions;
        $x = catfile('a','b');

    DESCRIPTION

    This module exports convenience functions for all of the class methods provided by File::Spec.

    For a reference of available functions, please consult File::Spec::Unix, which contains the entire set, and which is inherited by the modules for other platforms. For further information, please see File::Spec::Mac, File::Spec::OS2, File::Spec::Win32, or File::Spec::VMS.

    Exports

    The following functions are exported by default.

    1. canonpath
    2. catdir
    3. catfile
    4. curdir
    5. rootdir
    6. updir
    7. no_upwards
    8. file_name_is_absolute
    9. path

    The following functions are exported only by request.

    1. devnull
    2. tmpdir
    3. splitpath
    4. splitdir
    5. catpath
    6. abs2rel
    7. rel2abs
    8. case_tolerant

    All the functions may be imported using the :ALL tag.
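
    A sketch of importing everything with the :ALL tag and mixing default exports (catfile) with on-request ones (tmpdir, splitpath, abs2rel). The 'logs' and 'app.log' components are illustrative.

```perl
use strict;
use warnings;
use File::Spec::Functions qw(:ALL);

# catfile is exported by default; tmpdir only on request or via :ALL.
my $path = catfile( tmpdir(), 'logs', 'app.log' );

# Split the path into volume, directory part and file name.
my ( $volume, $dirs, $file ) = splitpath($path);

# Express the path relative to the temporary directory.
my $relative = abs2rel( $path, tmpdir() );

print "full path: $path\n";
print "file name: $file\n";
print "relative:  $relative\n";
```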

    COPYRIGHT

    Copyright (c) 2004 by the Perl 5 Porters. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    File::Spec, File::Spec::Unix, File::Spec::Mac, File::Spec::OS2, File::Spec::Win32, File::Spec::VMS, ExtUtils::MakeMaker

     

    File::Spec::Mac

    NAME

    File::Spec::Mac - File::Spec for Mac OS (Classic)

    SYNOPSIS

        require File::Spec::Mac; # Done internally by File::Spec if needed

    DESCRIPTION

    Methods for manipulating file specifications.

    METHODS

    • canonpath

      On Mac OS, there's nothing to be done. Returns what it's given.

    • catdir()

      Concatenate two or more directory names to form a path separated by colons (":") ending with a directory. Resulting paths are relative by default, but can be forced to be absolute (but avoid this, see below). Automatically puts a trailing ":" on the end of the complete path, because that's what's done in MacPerl's environment and helps to distinguish a file path from a directory path.

      IMPORTANT NOTE: Beginning with version 1.3 of this module, the resulting path is relative by default and not absolute. This decision was made due to portability reasons. Since File::Spec->catdir() returns relative paths on all other operating systems, it will now also follow this convention on Mac OS. Note that this may break some existing scripts.

      The intended purpose of this routine is to concatenate directory names. But because of the nature of Macintosh paths, some additional possibilities are allowed to make using this routine give reasonable results for some common situations. In other words, you are also allowed to concatenate paths instead of directory names (strictly speaking, a string like ":a" is a path, but not a name, since it contains a punctuation character ":").

      So, besides calls like

          catdir("a") = ":a:"
          catdir("a","b") = ":a:b:"
          catdir() = "" (special case)

      calls like the following

          catdir(":a:") = ":a:"
          catdir(":a","b") = ":a:b:"
          catdir(":a:","b") = ":a:b:"
          catdir(":a:",":b:") = ":a:b:"
          catdir(":") = ":"

      are allowed.

      Here are the rules that are used in catdir(); note that we try to be as compatible as possible with Unix:

      1.

      The resulting path is relative by default, i.e. the resulting path will have a leading colon.

      2.

      A trailing colon is added automatically to the resulting path, to denote a directory.

      3.

      Generally, each argument has one leading ":" and one trailing ":" removed (if any). They are then joined together by a ":". Special treatment applies for arguments denoting updir paths like "::lib:", see (4), or arguments consisting solely of colons ("colon paths"), see (5).

      4.

      When an updir path like ":::lib::" is passed as argument, the number of directories to climb up is handled correctly, not removing leading or trailing colons when necessary. E.g.

          catdir(":::a","::b","c") = ":::a::b:c:"
          catdir(":::a::","::b","c") = ":::a:::b:c:"

      5.

      Adding a colon ":" or empty string "" to a path at any position doesn't alter the path, i.e. these arguments are ignored. (When a "" is passed as the first argument, it has a special meaning, see (6).) This way, a colon ":" is handled like a "." (curdir) on Unix, while an empty string "" is generally ignored (see Unix->canonpath()). Likewise, a "::" is handled like a ".." (updir), and a ":::" is handled like a "../.." etc. E.g.

          catdir("a",":",":","b") = ":a:b:"
          catdir("a",":","::",":b") = ":a::b:"

      6.

      If the first argument is an empty string "" or is a volume name, i.e. matches the pattern /^[^:]+:/, the resulting path is absolute.

      7.

      Passing an empty string "" as the first argument to catdir() is like passing File::Spec->rootdir() as the first argument, i.e.

          catdir("","a","b")          is the same as
          catdir(rootdir(),"a","b").

      This is true on Unix, where catdir("","a","b") yields "/a/b" and rootdir() is "/". Note that rootdir() on Mac OS is the startup volume, which is the closest in concept to Unix' "/". This should help to run existing scripts originally written for Unix.

      8.

      For absolute paths, some cleanup is done, to ensure that the volume name isn't immediately followed by updirs. This is invalid, because this would go beyond "root". Generally, these cases are handled like their Unix counterparts:

          Unix:
              Unix->catdir("","")                 = "/"
              Unix->catdir("",".")                = "/"
              Unix->catdir("","..")               = "/"        # can't go beyond root
              Unix->catdir("",".","..","..","a")  = "/a"
          Mac:
              Mac->catdir("","")                  = rootdir()  # (e.g. "HD:")
              Mac->catdir("",":")                 = rootdir()
              Mac->catdir("","::")                = rootdir()  # can't go beyond root
              Mac->catdir("",":","::","::","a")   = rootdir() . "a:"  # (e.g. "HD:a:")

      However, this approach is limited to the first arguments following "root" (again, see Unix->canonpath()). If there are more arguments that move up the directory tree, an invalid path going beyond root can be created.

      As you've seen, you can force catdir() to create an absolute path by passing either an empty string or a path that begins with a volume name as the first argument. However, you are strongly encouraged not to do so, since this is done only for backward compatibility. Newer versions of File::Spec come with a method called catpath() (see below), that is designed to offer a portable solution for the creation of absolute paths. It takes volume, directory and file portions and returns an entire path. While catdir() is still suitable for the concatenation of directory names, you are encouraged to use catpath() to concatenate volume names and directory paths. E.g.

          $dir      = File::Spec->catdir("tmp","sources");
          $abs_path = File::Spec->catpath("MacintoshHD:", $dir,"");

      yields

          "MacintoshHD:tmp:sources:".
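
      The recommended pattern above can be tried on any platform by calling File::Spec::Mac directly. This is a minimal sketch; it assumes the module only does string manipulation for these calls, which holds as long as the first argument to catdir() is not the empty string (that case needs rootdir() and therefore Mac::Files).

```perl
use strict;
use warnings;
use File::Spec::Mac;   # loaded explicitly so the Mac OS rules apply on any host

# Build the relative directory path first, then attach the volume with catpath().
my $dir      = File::Spec::Mac->catdir('tmp', 'sources');
my $abs_path = File::Spec::Mac->catpath('MacintoshHD:', $dir, '');

print "$dir\n";        # per the rules above: ":tmp:sources:"
print "$abs_path\n";   # per the example above: "MacintoshHD:tmp:sources:"
```

      Keeping the volume out of catdir() means the same two calls work unchanged when File::Spec resolves to another platform's implementation.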
    • catfile

      Concatenate one or more directory names and a filename to form a complete path ending with a filename. Resulting paths are relative by default, but can be forced to be absolute (but avoid this).

      IMPORTANT NOTE: Beginning with version 1.3 of this module, the resulting path is relative by default and not absolute. This decision was made due to portability reasons. Since File::Spec->catfile() returns relative paths on all other operating systems, it will now also follow this convention on Mac OS. Note that this may break some existing scripts.

      The last argument is always considered to be the file portion. Since catfile() uses catdir() (see above) for the concatenation of the directory portions (if any), the following with regard to relative and absolute paths is true:

          catfile("") = ""
          catfile("file") = "file"

      but

          catfile("","") = rootdir() # (e.g. "HD:")
          catfile("","file") = rootdir() . file # (e.g. "HD:file")
          catfile("HD:","file") = "HD:file"

      This means that catdir() is called only when there are two or more arguments, as one might expect.

      Note that the leading ":" is removed from the filename, so that

          catfile("a","b","file") = ":a:b:file"    and
          catfile("a","b",":file") = ":a:b:file"

      give the same answer.

      To concatenate volume names, directory paths and filenames, you are encouraged to use catpath() (see below).

    • curdir

      Returns a string representing the current directory. On Mac OS, this is ":".

    • devnull

      Returns a string representing the null device. On Mac OS, this is "Dev:Null".

    • rootdir

      Returns a string representing the root directory. Under MacPerl, returns the name of the startup volume, since that's the closest in concept, although other volumes aren't rooted there. The name has a trailing ":", because that's the correct specification for a volume name on Mac OS.

      If Mac::Files could not be loaded, the empty string is returned.

    • tmpdir

      Returns the contents of $ENV{TMPDIR} if that directory exists, or the current working directory otherwise. Under MacPerl, $ENV{TMPDIR} will contain a path like "MacintoshHD:Temporary Items:", which is a hidden directory on your startup volume.

    • updir

      Returns a string representing the parent directory. On Mac OS, this is "::".

    • file_name_is_absolute

      Takes a path as argument and returns true if it is an absolute path. If the path has a leading ":", it's a relative path. Otherwise, it's an absolute path, unless the path doesn't contain any colons, i.e. it's a name like "a". In this particular case, the path is considered to be relative (i.e. it is considered to be a filename). Use ":" in the appropriate place in the path if you want to distinguish unambiguously. As a special case, the filename '' is always considered to be absolute. Note that as of version 1.2 of File::Spec::Mac, this no longer consults the local filesystem.

      E.g.

          File::Spec->file_name_is_absolute("a");             # false (relative)
          File::Spec->file_name_is_absolute(":a:b:");         # false (relative)
          File::Spec->file_name_is_absolute("MacintoshHD:");  # true (absolute)
          File::Spec->file_name_is_absolute("");              # true (absolute)

    • path

      Returns the null list for the MacPerl application, since the concept is usually meaningless under Mac OS. But if you're using the MacPerl tool under MPW, it gives back $ENV{Commands} suitably split, as is done in :lib:ExtUtils:MM_Mac.pm.

    • splitpath

          ($volume,$directories,$file) = File::Spec->splitpath( $path );
          ($volume,$directories,$file) = File::Spec->splitpath( $path, $no_file );

      Splits a path into volume, directory, and filename portions.

      On Mac OS, assumes that the last part of the path is a filename unless $no_file is true or a trailing separator ":" is present.

      The volume portion is always returned with a trailing ":". The directory portion is always returned with a leading ":" (to denote a relative path) and a trailing ":" (to denote a directory). The file portion is always returned without a leading ":". Empty portions are returned as the empty string ''.

      The results can be passed to catpath() to get back a path equivalent to (usually identical to) the original path.

    • splitdir

      The opposite of catdir().

          @dirs = File::Spec->splitdir( $directories );

      $directories should be only the directory portion of the path on systems that have the concept of a volume or that have path syntax that differentiates files from directories. Consider using splitpath() otherwise.

      Unlike just splitting the directories on the separator, empty directory names ("") can be returned. Since catdir() on Mac OS always appends a trailing colon to distinguish a directory path from a file path, a single trailing colon will be ignored, i.e. there's no empty directory name after it.

      Hence, on Mac OS, both

          File::Spec->splitdir( ":a:b::c:" );    and
          File::Spec->splitdir( ":a:b::c" );

      yield:

          ( "a", "b", "::", "c")

      while

          File::Spec->splitdir( ":a:b::c::" );

      yields:

          ( "a", "b", "::", "c", "::")

    • catpath

          $path = File::Spec->catpath($volume,$directory,$file);

      Takes volume, directory and file portions and returns an entire path. On Mac OS, $volume, $directory and $file are concatenated. A ':' is inserted if need be. You may pass an empty string for each portion. If all portions are empty, the empty string is returned. If $volume is empty, the result will be a relative path, beginning with a ':'. If $volume and $directory are empty, a leading ":" (if any) is removed from $file and the remainder is returned. If $file is empty, the resulting path will have a trailing ':'.
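
      To illustrate how splitpath() and catpath() fit together under the Mac OS rules, here is a small sketch. The path is made up for illustration, and File::Spec::Mac is called directly on the assumption that these two methods only manipulate strings.

```perl
use strict;
use warnings;
use File::Spec::Mac;

my $path = 'MacintoshHD:tmp:sources:notes.txt';   # hypothetical path

# Split into the three portions, then reassemble them.
my ($volume, $directories, $file) = File::Spec::Mac->splitpath($path);
my $rebuilt = File::Spec::Mac->catpath($volume, $directories, $file);

print "volume:      $volume\n";
print "directories: $directories\n";
print "file:        $file\n";
print "rebuilt:     $rebuilt\n";
```

      The rebuilt path is documented to be equivalent to the original, and usually identical to it.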

    • abs2rel

      Takes a destination path and an optional base path and returns a relative path from the base path to the destination path:

          $rel_path = File::Spec->abs2rel( $path ) ;
          $rel_path = File::Spec->abs2rel( $path, $base ) ;

      Note that both paths are assumed to have a notation that distinguishes a directory path (with trailing ':') from a file path (without trailing ':').

      If $base is not present or '', then the current working directory is used. If $base is relative, then it is converted to absolute form using rel2abs(). This means that it is taken to be relative to the current working directory.

      If $path and $base appear to be on two different volumes, we will not attempt to resolve the two paths, and we will instead simply return $path. Note that previous versions of this module ignored the volume of $base, which resulted in garbage results part of the time.

      If $base doesn't have a trailing colon, the last element of $base is assumed to be a filename. This filename is ignored. Otherwise all path components are assumed to be directories.

      If $path is relative, it is converted to absolute form using rel2abs(). This means that it is taken to be relative to the current working directory.

      Based on code written by Shigio Yamaguchi.

    • rel2abs

      Converts a relative path to an absolute path:

          $abs_path = File::Spec->rel2abs( $path ) ;
          $abs_path = File::Spec->rel2abs( $path, $base ) ;

      Note that both paths are assumed to have a notation that distinguishes a directory path (with trailing ':') from a file path (without trailing ':').

      If $base is not present or '', then $base is set to the current working directory. If $base is relative, then it is converted to absolute form using rel2abs(). This means that it is taken to be relative to the current working directory.

      If $base doesn't have a trailing colon, the last element of $base is assumed to be a filename. This filename is ignored. Otherwise all path components are assumed to be directories.

      If $path is already absolute, it is returned and $base is ignored.

      Based on code written by Shigio Yamaguchi.

    AUTHORS

    See the authors list in File::Spec. Mac OS support by Paul Schinder <schinder@pobox.com> and Thomas Wegner <wegner_thomas@yahoo.com>.

    COPYRIGHT

    Copyright (c) 2004 by the Perl 5 Porters. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    See File::Spec and File::Spec::Unix. This package overrides the implementation of these methods, not the semantics.

     
    File::Spec::OS2

    NAME

    File::Spec::OS2 - methods for OS/2 file specs

    SYNOPSIS

        require File::Spec::OS2; # Done internally by File::Spec if needed

    DESCRIPTION

    See File::Spec and File::Spec::Unix. This package overrides the implementation of these methods, not the semantics.

    Amongst the changes made for OS/2 are...

    • tmpdir

      Modifies the list of places temp directory information is looked for.

      1. $ENV{TMPDIR}
      2. $ENV{TEMP}
      3. $ENV{TMP}
      4. /tmp
      5. /

    • splitpath

      Volumes can be drive letters or UNC sharenames (\\server\share).

    COPYRIGHT

    Copyright (c) 2004 by the Perl 5 Porters. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    File::Spec::Unix

    NAME

    File::Spec::Unix - File::Spec for Unix, base for other File::Spec modules

    SYNOPSIS

        require File::Spec::Unix; # Done automatically by File::Spec

    DESCRIPTION

    Methods for manipulating file specifications. Other File::Spec modules, such as File::Spec::Mac, inherit from File::Spec::Unix and override specific methods.

    METHODS

    • canonpath()

      No physical check on the filesystem, but a logical cleanup of a path. On UNIX eliminates successive slashes and successive "/.".

          $cpath = File::Spec->canonpath( $path ) ;

      Note that this does *not* collapse x/../y sections into y. This is by design. If /foo on your system is a symlink to /bar/baz, then /foo/../quux is actually /bar/quux, not /quux as a naive ../-removal would give you. If you want to do this kind of processing, you probably want Cwd's realpath() function to actually traverse the filesystem cleaning up paths like this.
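
      The difference between logical cleanup and filesystem resolution can be seen in a short sketch. The paths are illustrative, and File::Spec::Unix is called directly so the result does not depend on the host platform.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# Purely textual cleanup: repeated slashes and "/." segments are dropped ...
my $clean = File::Spec::Unix->canonpath('/foo//bar/./baz/');
# ... but ".." segments are kept, because resolving them needs the filesystem.
my $kept  = File::Spec::Unix->canonpath('/foo/../quux');

print "$clean\n";   # /foo/bar/baz
print "$kept\n";    # /foo/../quux
```

      Resolving the second path for real would mean following any symlink at /foo, which is the job of Cwd's realpath() rather than canonpath().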

    • catdir()

      Concatenate two or more directory names to form a complete path ending with a directory. But remove the trailing slash from the resulting string, because it doesn't look good, isn't necessary and confuses OS2. Of course, if this is the root directory, don't cut off the trailing slash :-)

    • catfile

      Concatenate one or more directory names and a filename to form a complete path ending with a filename.
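
      A minimal sketch of catdir() and catfile() under the Unix rules follows; the directory and file names are invented for illustration.

```perl
use strict;
use warnings;
use File::Spec::Unix;   # called directly so the output is the same on every host

my $dir  = File::Spec::Unix->catdir('usr', 'local', 'lib/');   # trailing slash is removed
my $root = File::Spec::Unix->catdir('', '');                   # the root keeps its slash
my $file = File::Spec::Unix->catfile('usr', 'local', 'notes.txt');

print "$dir\n";    # usr/local/lib
print "$root\n";   # /
print "$file\n";   # usr/local/notes.txt
```

      In portable code the same calls would normally go through File::Spec, which picks the platform class automatically.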

    • curdir

      Returns a string representation of the current directory. "." on UNIX.

    • devnull

      Returns a string representation of the null device. "/dev/null" on UNIX.

    • rootdir

      Returns a string representation of the root directory. "/" on UNIX.

    • tmpdir

      Returns a string representation of the first writable directory from the following list or the current directory if none from the list are writable:

      1. $ENV{TMPDIR}
      2. /tmp

      If running under taint mode, and if $ENV{TMPDIR} is tainted, it is not used.
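
      As a small usage sketch, tmpdir() is typically combined with catfile() to name a scratch file. The file name below is hypothetical and nothing is created on disk.

```perl
use strict;
use warnings;
use File::Spec;

# Ask File::Spec for the temporary directory chosen by the rules above.
my $tmp = File::Spec->tmpdir;

# Build a scratch file name inside it; $$ is the current process ID.
my $scratch = File::Spec->catfile($tmp, "example-$$.tmp");   # hypothetical name

print "temporary directory: $tmp\n";
print "scratch file name:   $scratch\n";
```

      For files that must actually be created safely, the core File::Temp module is usually a better fit than a hand-built name.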

    • updir

      Returns a string representation of the parent directory. ".." on UNIX.

    • no_upwards

      Given a list of file names, strip out those that refer to a parent directory. (Does not strip symlinks, only '.', '..', and equivalents.)
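
      A typical use is filtering the result of readdir(). The sketch below uses a hard-coded list standing in for directory entries, so the names are assumptions rather than real files.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# Entries as readdir() might return them, including the two special names.
my @entries = ('.', '..', 'README', '.hidden', 'lib');

# Only "." and ".." are removed; other dot-files are left alone.
my @real = File::Spec::Unix->no_upwards(@entries);

print "@real\n";   # README .hidden lib
```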

    • case_tolerant

      Returns a true or false value indicating, respectively, that alphabetic case is not or is significant when comparing file specifications.

    • file_name_is_absolute

      Takes as argument a path and returns true if it is an absolute path.

      This does not consult the local filesystem on Unix, Win32, OS/2 or Mac OS (Classic). It does consult the working environment for VMS (see file_name_is_absolute in File::Spec::VMS).

    • path

      Takes no argument, returns the environment variable PATH as an array.

    • join

      join is the same as catfile.

    • splitpath

          ($volume,$directories,$file) = File::Spec->splitpath( $path );
          ($volume,$directories,$file) = File::Spec->splitpath( $path, $no_file );

      Splits a path into volume, directory, and filename portions. On systems with no concept of volume, returns '' for volume.

      For systems with no syntax differentiating filenames from directories, assumes that the last component of the path is a file unless $no_file is true or a trailing separator or /. or /.. is present. On Unix this means that $no_file true makes this return ( '', $path, '' ).

      The directory portion may or may not be returned with a trailing '/'.

      The results can be passed to catpath() to get back a path equivalent to (usually identical to) the original path.
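
      The effect of $no_file and the round trip through catpath() can be sketched as follows. The path is illustrative, and File::Spec::Unix is used directly so the Unix rules apply regardless of the host.

```perl
use strict;
use warnings;
use File::Spec::Unix;

my $path = '/usr/local/lib/perl.pm';   # hypothetical path

# Default: the last component is treated as the file portion.
my ($volume, $dirs, $file) = File::Spec::Unix->splitpath($path);

# With $no_file true, the whole string is treated as directories.
my (undef, $only_dirs, $no_file_part) = File::Spec::Unix->splitpath('/usr/local/lib', 1);

# Passing the pieces back to catpath() rebuilds the path.
my $rebuilt = File::Spec::Unix->catpath($volume, $dirs, $file);

print "file:      $file\n";
print "only dirs: $only_dirs\n";
print "rebuilt:   $rebuilt\n";
```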

    • splitdir

      The opposite of catdir().

          @dirs = File::Spec->splitdir( $directories );

      $directories must be only the directory portion of the path on systems that have the concept of a volume or that have path syntax that differentiates files from directories.

      Unlike just splitting the directories on the separator, empty directory names ('') can be returned, because these are significant on some OSs.

      On Unix,

          File::Spec->splitdir( "/a/b//c/" );

      Yields:

          ( '', 'a', 'b', '', 'c', '' )

    • catpath()

      Takes volume, directory and file portions and returns an entire path. Under Unix, $volume is ignored, and directory and file are concatenated. A '/' is inserted if needed (though if the directory portion doesn't start with '/' it is not added). On other OSs, $volume is significant.
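
      A short sketch of splitdir() and catpath() under the Unix rules is given here; the volume string passed to catpath() is a deliberately meaningless placeholder to show that it is ignored.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# Empty names are preserved, so leading, doubled and trailing slashes stay visible.
my @dirs = File::Spec::Unix->splitdir('/a/b//c/');

# The volume argument is ignored on Unix; a '/' is inserted only between the parts.
my $relative = File::Spec::Unix->catpath('ignored-volume', 'a/b', 'c.txt');
my $absolute = File::Spec::Unix->catpath('', '/a/b/', 'c.txt');

print scalar(@dirs), " components\n";   # 6 components
print "$relative\n";                    # a/b/c.txt
print "$absolute\n";                    # /a/b/c.txt
```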

    • abs2rel

      Takes a destination path and an optional base path and returns a relative path from the base path to the destination path:

          $rel_path = File::Spec->abs2rel( $path ) ;
          $rel_path = File::Spec->abs2rel( $path, $base ) ;

      If $base is not present or '', then cwd() is used. If $base is relative, then it is converted to absolute form using rel2abs(). This means that it is taken to be relative to cwd().

      On systems that have a grammar that indicates filenames, this ignores the $base filename. Otherwise all path components are assumed to be directories.

      If $path is relative, it is converted to absolute form using rel2abs(). This means that it is taken to be relative to cwd().

      No checks against the filesystem are made, so the result may not be correct if $base contains symbolic links. (Apply Cwd::abs_path() beforehand if that is a concern.) On VMS, there is interaction with the working environment, as logicals and macros are expanded.

      Based on code written by Shigio Yamaguchi.
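
      For illustration, here is a sketch with both arguments absolute, so the current working directory is never consulted. The paths are invented for the example.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# The destination is below the base: only the remaining components are returned.
my $below = File::Spec::Unix->abs2rel('/a/b/c', '/a');

# The destination is in a sibling directory: the result climbs up first.
my $sibling = File::Spec::Unix->abs2rel('/usr/local/lib/perl', '/usr/local/bin');

print "$below\n";     # b/c
print "$sibling\n";   # ../lib/perl
```

      Because no filesystem check is made, the second result is only meaningful if /usr/local/bin is not a symbolic link.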

    • rel2abs()

      Converts a relative path to an absolute path.

          $abs_path = File::Spec->rel2abs( $path ) ;
          $abs_path = File::Spec->rel2abs( $path, $base ) ;

      If $base is not present or '', then cwd() is used. If $base is relative, then it is converted to absolute form using rel2abs(). This means that it is taken to be relative to cwd().

      On systems that have a grammar that indicates filenames, this ignores the $base filename. Otherwise all path components are assumed to be directories.

      If $path is absolute, it is cleaned up and returned using canonpath().

      No checks against the filesystem are made. On VMS, there is interaction with the working environment, as logicals and macros are expanded.

      Based on code written by Shigio Yamaguchi.
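
      The two cases described above can be sketched with an explicit absolute base, which keeps the current working directory out of the picture. The paths are hypothetical.

```perl
use strict;
use warnings;
use File::Spec::Unix;

# A relative path is appended to the base.
my $joined = File::Spec::Unix->rel2abs('lib/Foo.pm', '/home/user/project');

# An absolute path ignores the base and is only cleaned up by canonpath().
my $cleaned = File::Spec::Unix->rel2abs('/etc//passwd', '/home/user/project');

print "$joined\n";    # /home/user/project/lib/Foo.pm
print "$cleaned\n";   # /etc/passwd
```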

    COPYRIGHT

    Copyright (c) 2004 by the Perl 5 Porters. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Please submit bug reports and patches to perlbug@perl.org.

    SEE ALSO

    File::Spec

     
    File::Spec::VMS

    NAME

    File::Spec::VMS - methods for VMS file specs

    SYNOPSIS

        require File::Spec::VMS; # Done internally by File::Spec if needed

    DESCRIPTION

    See File::Spec::Unix for documentation of the methods provided there. This package overrides the implementation of these methods, not the semantics.

    The default behavior is to allow either VMS or Unix syntax on input and to return VMS syntax on output unless Unix syntax has been explicitly requested via the DECC$FILENAME_UNIX_REPORT CRTL feature.

    • canonpath (override)

      Removes redundant portions of file specifications and returns results in native syntax unless Unix filename reporting has been enabled.

    • catdir (override)

      Concatenates a list of file specifications, and returns the result as a native directory specification unless the Unix filename reporting feature has been enabled. No check is made for "impossible" cases (e.g. elements other than the first being absolute filespecs).

    • catfile (override)

      Concatenates a list of directory specifications with a filename specification to build a path.

    • curdir (override)

      Returns a string representation of the current directory: '[]' or '.'

    • devnull (override)

      Returns a string representation of the null device: '_NLA0:' or '/dev/null'

    • rootdir (override)

      Returns a string representation of the root directory: 'SYS$DISK:[000000]' or '/'

    • tmpdir (override)

      Returns a string representation of the first writable directory from the following list or '' if none are writable:

      1. /tmp if DECC$FILENAME_UNIX_REPORT is enabled.
      2. sys$scratch:
      3. $ENV{TMPDIR}

      Since perl 5.8.0, if running under taint mode, and if $ENV{TMPDIR} is tainted, it is not used.

    • updir (override)

      Returns a string representation of the parent directory: '[-]' or '..'

    • case_tolerant (override)

      VMS file specification syntax is case-tolerant.

    • path (override)

      Translates the logical name DCL$PATH as a searchlist, rather than trying to split the string value of $ENV{'PATH'}.

    • file_name_is_absolute (override)

      Checks for VMS directory spec as well as Unix separators.

    • splitpath (override)

          ($volume,$directories,$file) = File::Spec->splitpath( $path );
          ($volume,$directories,$file) = File::Spec->splitpath( $path, $no_file );

      Passing a true value for $no_file indicates that the path being split only contains directory components, even on systems where you can usually (when not supporting a foreign syntax) tell the difference between directories and files at a glance.

    • splitdir (override)

      Split a directory specification into the components.

    • catpath (override)

      Construct a complete filespec.

    • abs2rel (override)

      Attempt to convert an absolute file specification to a relative specification.

    • rel2abs (override)

      Return an absolute file specification from a relative one.

    COPYRIGHT

    Copyright (c) 2004 by the Perl 5 Porters. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    See File::Spec and File::Spec::Unix. This package overrides the implementation of these methods, not the semantics.

    An explanation of VMS file specs can be found at http://h71000.www7.hp.com/doc/731FINAL/4506/4506pro_014.html#apps_locating_naming_files.

     
    File::Spec::Win32

    NAME

    File::Spec::Win32 - methods for Win32 file specs

    SYNOPSIS

        require File::Spec::Win32; # Done internally by File::Spec if needed

    DESCRIPTION

    See File::Spec::Unix for documentation of the methods provided there. This package overrides the implementation of these methods, not the semantics.

    • devnull

      Returns a string representation of the null device.

    • tmpdir

      Returns a string representation of the first existing directory from the following list:

      1. $ENV{TMPDIR}
      2. $ENV{TEMP}
      3. $ENV{TMP}
      4. SYS:/temp
      5. C:\system\temp
      6. C:/temp
      7. /tmp
      8. /

      SYS:/temp is preferred on Novell NetWare and C:\system\temp on Symbian (File::Spec::Win32 is also used for those platforms).

      Since Perl 5.8.0, if running under taint mode, and if the environment variables are tainted, they are not used.

    • case_tolerant

      MSWin32 case-tolerance depends on GetVolumeInformation() $ouFsFlags == FS_CASE_SENSITIVE, indicating the case significance when comparing file specifications. Since XP, FS_CASE_SENSITIVE is effectively disabled for the NT subsystem. Default: 1. See http://cygwin.com/ml/cygwin/2007-07/msg00891.html

    • file_name_is_absolute

      As of right now, this returns 2 if the path is absolute with a volume, 1 if it's absolute with no volume, 0 otherwise.
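
      The three return values can be observed with a short sketch. File::Spec::Win32 is loaded directly, on the assumption that this method only inspects the string, and the paths are invented. Since the text above says "as of right now", callers would be wise to rely only on truthiness.

```perl
use strict;
use warnings;
use File::Spec::Win32;

# Map a few sample paths to the value the method reports for them.
my %kind;
for my $path ('C:\\temp\\x.txt', '\\temp\\x.txt', 'temp\\x.txt') {
    $kind{$path} = File::Spec::Win32->file_name_is_absolute($path);
    printf "%-16s => %d\n", $path, $kind{$path};
}
```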

    • catfile

      Concatenate one or more directory names and a filename to form a complete path ending with a filename.

    • canonpath

      No physical check on the filesystem, but a logical cleanup of a path. On UNIX this eliminates successive slashes and successive "/.". On Win32 it makes

          dir1\dir2\dir3\..\..\dir4 -> dir1\dir4 and even
          dir1\dir2\dir3\...\dir4   -> dir1\dir4

    • splitpath

          ($volume,$directories,$file) = File::Spec->splitpath( $path );
          ($volume,$directories,$file) = File::Spec->splitpath( $path, $no_file );

      Splits a path into volume, directory, and filename portions. Assumes that the last component of the path is a file unless the path ends in '\\', '\\.', '\\..' or $no_file is true. On Win32 this means that $no_file true makes this return ( $volume, $path, '' ).

      Separators accepted are \ and /.

      Volumes can be drive letters or UNC sharenames (\\server\share).

      The results can be passed to catpath to get back a path equivalent to (usually identical to) the original path.
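
      As a sketch of the two kinds of volume mentioned above, the following splits a drive-letter path and a UNC path. The server, share and file names are hypothetical, and File::Spec::Win32 is called directly so the example does not depend on running under Windows.

```perl
use strict;
use warnings;
use File::Spec::Win32;

# Drive letter as the volume.
my ($drive, $dirs, $file) = File::Spec::Win32->splitpath('C:\\dir\\sub\\file.txt');

# UNC share name (\\server\share) as the volume.
my ($share, $share_dirs, $share_file) =
    File::Spec::Win32->splitpath('\\\\server\\share\\dir\\file.txt');

print "volume: $drive  dirs: $dirs  file: $file\n";
print "volume: $share  dirs: $share_dirs  file: $share_file\n";
```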

    • splitdir

      The opposite of catdir().

          @dirs = File::Spec->splitdir( $directories );

      $directories must be only the directory portion of the path on systems that have the concept of a volume or that have path syntax that differentiates files from directories.

      Unlike just splitting the directories on the separator, empty leading and trailing directory entries can be returned, because these are significant on some OSs. So,

          File::Spec->splitdir( "/a/b//c/" );

      Yields:

          ( '', 'a', 'b', '', 'c', '' )

    • catpath

      Takes volume, directory and file portions and returns an entire path. Under Unix, $volume is ignored, and this is just like catfile(). On other OSs, $volume becomes significant.

    Note For File::Spec::Win32 Maintainers

    Novell NetWare inherits its File::Spec behaviour from File::Spec::Win32.

    COPYRIGHT

    Copyright (c) 2004,2007 by the Perl 5 Porters. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    See File::Spec and File::Spec::Unix. This package overrides the implementation of these methods, not the semantics.

     
    ExtUtils::CBuilder

    NAME

    ExtUtils::CBuilder - Compile and link C code for Perl modules

    SYNOPSIS

        use ExtUtils::CBuilder;
        my $b = ExtUtils::CBuilder->new(%options);
        $obj_file = $b->compile(source => 'MyModule.c');
        $lib_file = $b->link(objects => $obj_file);

    DESCRIPTION

    This module can build the C portions of Perl modules by invoking the appropriate compilers and linkers in a cross-platform manner. It was motivated by the Module::Build project, but may be useful for other purposes as well. However, it is not intended as a general cross-platform interface to all your C building needs. That would have been a much more ambitious goal!

    METHODS

    • new

      Returns a new ExtUtils::CBuilder object. A config parameter lets you override Config.pm settings for all operations performed by the object, as in the following example:

          # Use a different compiler than Config.pm says
          my $b = ExtUtils::CBuilder->new( config => { ld => 'gcc' } );

      A quiet parameter tells CBuilder not to print its system() commands before executing them:

          # Be quieter than normal
          my $b = ExtUtils::CBuilder->new( quiet => 1 );

    • have_compiler

      Returns true if the current system has a working C compiler and linker, false otherwise. To determine this, we actually compile and link a sample C library. The sample will be compiled in the system tempdir or, if that fails for some reason, in the current directory.

    • have_cplusplus

      Just like have_compiler but for C++ instead of C.
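
      A common pattern is to probe for a compiler before attempting any build step. The following is a minimal sketch using only the constructor option and method described above; the messages are illustrative.

```perl
use strict;
use warnings;
use ExtUtils::CBuilder;

# quiet => 1 suppresses the echo of the commands run during the probe.
my $builder = ExtUtils::CBuilder->new(quiet => 1);

# The probe compiles and links a small sample, so it may take a moment.
my $has_cc = $builder->have_compiler ? 1 : 0;

print $has_cc
    ? "A working C compiler and linker were found.\n"
    : "No working C compiler found; C parts would have to be skipped.\n";
```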

    • compile

      Compiles a C source file and produces an object file. The name of the object file is returned. The source file is specified in a source parameter, which is required; the other parameters listed below are optional.

      • object_file

        Specifies the name of the output file to create. Otherwise the object_file() method will be consulted, passing it the name of the source file.

      • include_dirs

        Specifies any additional directories in which to search for header files. May be given as a string indicating a single directory, or as a list reference indicating multiple directories.

      • extra_compiler_flags

        Specifies any additional arguments to pass to the compiler. Should be given as a list reference containing the arguments individually, or if this is not possible, as a string containing all the arguments together.

      • C++

        Specifies that the source file is a C++ source file and sets appropriate compiler flags.

      The operation of this method is also affected by the archlibexp, cccdlflags, ccflags, optimize, and cc entries in Config.pm.

    • link

      Invokes the linker to produce a library file from object files. In scalar context, the name of the library file is returned. In list context, the library file and any temporary files created are returned. A required objects parameter contains the name of the object files to process, either in a string (for one object file) or list reference (for one or more files). The following parameters are optional:

      • lib_file

        Specifies the name of the output library file to create. Otherwise the lib_file() method will be consulted, passing it the name of the first entry in objects.

      • module_name

        Specifies the name of the Perl module that will be created by linking. On platforms that need to do prelinking (Win32, OS/2, etc.) this is a required parameter.

      • extra_linker_flags

        Any additional flags you wish to pass to the linker.

      On platforms where need_prelink() returns true, prelink() will be called automatically.

      The operation of this method is also affected by the lddlflags, shrpenv, and ld entries in Config.pm.

    • link_executable

      Invokes the linker to produce an executable file from object files. In scalar context, the name of the executable file is returned. In list context, the executable file and any temporary files created are returned. A required objects parameter contains the name of the object files to process, either in a string (for one object file) or list reference (for one or more files). The optional parameters are the same as for link, with the exception of:

      • exe_file

        Specifies the name of the output executable file to create. Otherwise the exe_file() method will be consulted, passing it the name of the first entry in objects.

    • object_file
      my $object_file = $b->object_file($source_file);

      Converts the name of a C source file to the most natural name of an output object file to create from it. For instance, on Unix the source file foo.c would result in the object file foo.o.

    • lib_file
      my $lib_file = $b->lib_file($object_file);

      Converts the name of an object file to the most natural name of an output library file to create from it. For instance, on Mac OS X the object file foo.o would result in the library file foo.bundle.

    • exe_file
      my $exe_file = $b->exe_file($object_file);

      Converts the name of an object file to the most natural name of an executable file to create from it. For instance, on Mac OS X the object file foo.o would result in the executable file foo, and on Windows it would result in foo.exe.

    • prelink

      On certain platforms like Win32, OS/2, VMS, and AIX, it is necessary to perform some actions before invoking the linker. The ExtUtils::Mksymlists module does this, writing files used by the linker during the creation of shared libraries for dynamic extensions. The names of any files written will be returned as a list.

      Several parameters correspond to ExtUtils::Mksymlists::Mksymlists() options, as follows:

      Mksymlists() | prelink()         | type
      -------------|-------------------|-------------------
      NAME         | dl_name           | string (required)
      DLBASE       | dl_base           | string
      FILE         | dl_file           | string
      DL_VARS      | dl_vars           | array reference
      DL_FUNCS     | dl_funcs          | hash reference
      FUNCLIST     | dl_func_list      | array reference
      IMPORTS      | dl_imports        | hash reference
      VERSION      | dl_version        | string

      Please see the documentation for ExtUtils::Mksymlists for the details of what these parameters do.

    • need_prelink

      Returns true on platforms where prelink() should be called during linking, and false otherwise.

    • extra_link_args_after_prelink

      Returns a list of extra arguments to give to the link command. The arguments are the same as for prelink(), with the addition of an array reference to the results of prelink(), stored under the key prelink_res.
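
    As a rough illustration of how the methods above fit together, the sketch below compiles and links a throwaway C file, but only when have_compiler reports success. The file name demo.c, the module name Demo and the boot_Demo symbol are placeholders invented for this example. boot_Demo is used on the assumption that platforms which pre-link expect a boot symbol named after the module.

```perl
use strict;
use warnings;
use ExtUtils::CBuilder;
use File::Spec;
use File::Temp qw(tempdir);

# quiet => 1 suppresses the echo of each system() command.
my $cb = ExtUtils::CBuilder->new( quiet => 1 );
my ( $obj_file, $lib_file );

if ( $cb->have_compiler ) {
    # Work in a scratch directory that is removed when the program exits.
    my $dir = tempdir( CLEANUP => 1 );
    my $src = File::Spec->catfile( $dir, 'demo.c' );

    # Hypothetical one-line C source, written just for this example.
    open my $fh, '>', $src or die "Cannot write $src: $!";
    print {$fh} "int boot_Demo(void) { return 1; }\n";
    close $fh or die "Cannot close $src: $!";

    # compile() returns the object file name; link() in scalar
    # context returns the library file name.
    $obj_file = $cb->compile( source => $src );
    $lib_file = $cb->link(
        objects     => $obj_file,
        module_name => 'Demo',    # required on pre-linking platforms
    );

    print "object:  $obj_file\n";
    print "library: $lib_file\n";
}
else {
    print "No working C compiler found; skipping the build.\n";
}
```

    Guarding the build with have_compiler keeps the script usable on machines without a toolchain, at the cost of one extra trial compilation.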

    TO DO

    Currently this has only been tested on Unix and doesn't contain any of the Windows-specific code from the Module::Build project. I'll do that next.

    HISTORY

    This module is an outgrowth of the Module::Build project, to which there have been many contributors. Notably, Randy W. Sims submitted lots of code to support 3 compilers on Windows and helped with various other platform-specific issues. Ilya Zakharevich has contributed fixes for OS/2; John E. Malmberg and Peter Prymmer have done likewise for VMS.

    SUPPORT

    ExtUtils::CBuilder is maintained as part of the Perl 5 core. Please submit any bug reports via the perlbug tool included with Perl 5. Bug reports will be included in the Perl 5 ticket system at http://rt.perl.org.

    The Perl 5 source code is available at <http://perl5.git.perl.org/perl.git> and ExtUtils-CBuilder may be found in the dist/ExtUtils-CBuilder directory of the repository.

    AUTHOR

    Ken Williams, kwilliams@cpan.org

    Additional contributions by The Perl 5 Porters.

    COPYRIGHT

    Copyright (c) 2003-2005 Ken Williams. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    perl(1), Module::Build(3)

     
    ExtUtils::Command

    NAME

    ExtUtils::Command - utilities to replace common UNIX commands in Makefiles etc.

    SYNOPSIS

    perl -MExtUtils::Command -e cat files... > destination
    perl -MExtUtils::Command -e mv source... destination
    perl -MExtUtils::Command -e cp source... destination
    perl -MExtUtils::Command -e touch files...
    perl -MExtUtils::Command -e rm_f files...
    perl -MExtUtils::Command -e rm_rf directories...
    perl -MExtUtils::Command -e mkpath directories...
    perl -MExtUtils::Command -e eqtime source destination
    perl -MExtUtils::Command -e test_f file
    perl -MExtUtils::Command -e test_d directory
    perl -MExtUtils::Command -e chmod mode files...
    ...

    DESCRIPTION

    The module is used to replace common UNIX commands. In all cases the functions work from @ARGV rather than taking arguments. This makes them easier to deal with in Makefiles. Call them like this:

    perl -MExtUtils::Command -e some_command some files to work on

    and NOT like this:

    perl -MExtUtils::Command -e 'some_command qw(some files to work on)'

    For that use Shell::Command.

    Filenames with * and ? will be glob expanded.

    FUNCTIONS

    • cat
      cat file ...

      Concatenates all files mentioned on command line to STDOUT.

    • eqtime
      eqtime source destination

      Sets modified time of destination to that of source.

    • rm_rf
      rm_rf files or directories ...

      Removes files and directories recursively (even if read-only).

    • rm_f
      rm_f file ...

      Removes files (even if read-only).

    • touch
      touch file ...

      Makes files exist, with the current timestamp.

    • mv
      mv source_file destination_file
      mv source_file source_file destination_dir

      Moves source to destination. Multiple sources are allowed if destination is an existing directory.

      Returns true if all moves succeeded, false otherwise.

    • cp
      cp source_file destination_file
      cp source_file source_file destination_dir

      Copies sources to the destination. Multiple sources are allowed if destination is an existing directory.

      Returns true if all copies succeeded, false otherwise.

    • chmod
      chmod mode files ...

      Sets UNIX-like permissions 'mode' on all the files, e.g. 0666.

    • mkpath
      mkpath directory ...

      Creates directories, including any parent directories.

    • test_f
      test_f file

      Tests if a file exists. Exits with 0 if it does, 1 if it does not (i.e. the shell's idea of true and false).

    • test_d
      test_d directory

      Tests if a directory exists. Exits with 0 if it does, 1 if it does not (i.e. the shell's idea of true and false).

    • dos2unix
      dos2unix files or dirs ...

      Converts DOS and OS/2 linefeeds to Unix style recursively.
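
    Because every function reads its arguments from @ARGV, calling one from inside a Perl program means setting @ARGV first. The sketch below shows one way to do that with a small helper. call_with_argv and the file names are hypothetical and exist only for this example. For ordinary in-program use, Shell::Command is the better fit.

```perl
use strict;
use warnings;
use ExtUtils::Command ();
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: run one ExtUtils::Command function with a
# temporary @ARGV, restoring the real @ARGV afterwards.
sub call_with_argv {
    my ( $function, @args ) = @_;
    local @ARGV = @args;
    return $function->();
}

my $root  = tempdir( CLEANUP => 1 );
my $dir   = File::Spec->catdir( $root, 'build', 'lib' );
my $src   = File::Spec->catfile( $dir, 'source.txt' );
my $copy  = File::Spec->catfile( $dir, 'copy.txt' );
my $moved = File::Spec->catfile( $dir, 'moved.txt' );

# mkpath creates the directory and any missing parents.
call_with_argv( \&ExtUtils::Command::mkpath, $dir );

# touch makes the file exist with the current timestamp.
call_with_argv( \&ExtUtils::Command::touch, $src );

# cp and mv return true if every copy or move succeeded.
my $copied = call_with_argv( \&ExtUtils::Command::cp, $src,  $copy );
my $renamed = call_with_argv( \&ExtUtils::Command::mv, $copy, $moved );

print "copied\n"  if $copied;
print "renamed\n" if $renamed;
```

    test_f and test_d are left out on purpose: they call exit, which would end the calling program.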

    SEE ALSO

    Shell::Command, which provides these same functions but takes arguments normally.

    AUTHOR

    Nick Ing-Simmons ni-s@cpan.org

    Maintained by Michael G Schwern schwern@pobox.com within the ExtUtils-MakeMaker package and, as a separate CPAN package, by Randy Kobes r.kobes@uwinnipeg.ca .

     
    ExtUtils::Constant

    NAME

    ExtUtils::Constant - generate XS code to import C header constants

    SYNOPSIS

    use ExtUtils::Constant qw (WriteConstants);
    WriteConstants(
        NAME => 'Foo',
        NAMES => [qw(FOO BAR BAZ)],
    );
    # Generates wrapper code to make the values of the constants FOO BAR BAZ
    # available to perl

    DESCRIPTION

    ExtUtils::Constant facilitates generating C and XS wrapper code to allow perl modules to AUTOLOAD constants defined in C library header files. It is principally used by the h2xs utility, on which this code is based. It doesn't contain the routines to scan header files to extract these constants.

    USAGE

    Generally one only needs to call the WriteConstants function, and then

    #include "const-c.inc"

    in the C section of Foo.xs and

    INCLUDE: const-xs.inc

    in the XS section of Foo.xs.

    For greater flexibility use constant_types(), C_constant and XS_constant, with which WriteConstants is implemented.

    Currently this module understands the following types. h2xs may only know a subset. The sizes of the numeric types are chosen by the Configure script at compile time.

    • IV

      signed integer, at least 32 bits.

    • UV

      unsigned integer, the same size as IV

    • NV

      floating point type, probably double , possibly long double

    • PV

      NUL terminated string, length will be determined with strlen

    • PVN

      A fixed length thing, given as a [pointer, length] pair. If you know the length of a string at compile time you may use this instead of PV

    • SV

      A mortal SV.

    • YES

      Truth. (PL_sv_yes) The value is not needed (and ignored).

    • NO

      Defined Falsehood. (PL_sv_no) The value is not needed (and ignored).

    • UNDEF

      undef. The value of the macro is not needed.

    FUNCTIONS

    • constant_types

      A function returning a single scalar with #define definitions for the constants used internally between the generated C and XS functions.

    • XS_constant PACKAGE, TYPES, XS_SUBNAME, C_SUBNAME

      A function to generate the XS code to implement the perl subroutine PACKAGE::constant used by PACKAGE::AUTOLOAD to load constants. This XS code is a wrapper around a C subroutine usually generated by C_constant, and usually named constant.

      TYPES should be given either as a comma separated list of types that the C subroutine constant will generate or as a reference to a hash. It should be the same list of types as C_constant was given. [Otherwise XS_constant and C_constant may have different ideas about the number of parameters passed to the C function constant.]

      You can call the perl visible subroutine something other than constant if you give the parameter XS_SUBNAME. The C subroutine it calls defaults to the name of the perl visible subroutine, unless you give the parameter C_SUBNAME.

    • autoload PACKAGE, VERSION, AUTOLOADER

      A function to generate the AUTOLOAD subroutine for the module PACKAGE. VERSION is the perl version the code should be backwards compatible with. It defaults to the version of perl running the subroutine. If AUTOLOADER is true, the AUTOLOAD subroutine falls back on AutoLoader::AUTOLOAD for all names that the constant() routine doesn't recognise.

    • WriteMakefileSnippet

      WriteMakefileSnippet ATTRIBUTE => VALUE [, ...]

      A function to generate perl code for Makefile.PL that will regenerate the constant subroutines. Parameters are named as passed to WriteConstants, with the addition of INDENT to specify the number of leading spaces (default 2).

      Currently only INDENT, NAME, DEFAULT_TYPE, NAMES, C_FILE and XS_FILE are recognised.

    • WriteConstants ATTRIBUTE => VALUE [, ...]

      Writes a file of C code and a file of XS code which you should #include and INCLUDE in the C and XS sections respectively of your module's XS code. You probably want to do this in your Makefile.PL, so that you can easily edit the list of constants without touching the rest of your module. The attributes supported are:

      • NAME

        Name of the module. This must be specified.

      • DEFAULT_TYPE

        The default type for the constants. If not specified IV is assumed.

      • BREAKOUT_AT

        The names of the constants are grouped by length. A child subroutine is generated for each group that contains this number of names or more.

      • NAMES

        An array of constants' names, either scalars containing names, or hashrefs as detailed in C_constant.

      • PROXYSUBS

        If true, uses proxy subs. See ExtUtils::Constant::ProxySubs.

      • C_FH

        A filehandle to write the C code to. If not given, then C_FILE is opened for writing.

      • C_FILE

        The name of the file to write containing the C code. The default is const-c.inc. The - in the name ensures that the file can't be mistaken for anything related to a legitimate perl package name, and not naming the file .c avoids having to override Makefile.PL's .xs to .c rules.

      • XS_FH

        A filehandle to write the XS code to. If not given, then XS_FILE is opened for writing.

      • XS_FILE

        The name of the file to write containing the XS code. The default is const-xs.inc.

      • XS_SUBNAME

        The perl visible name of the XS subroutine generated which will return the constants. The default is constant.

      • C_SUBNAME

        The name of the C subroutine generated which will return the constants. The default is XS_SUBNAME. Child subroutines have _ and the name length appended, so constants with 10 character names would be in constant_10 with the default XS_SUBNAME.
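
    To make the attributes above concrete, here is a minimal sketch that writes both generated files into a scratch directory instead of the current one. The module name Foo and the constant names are the placeholders from the SYNOPSIS. Using a temporary directory is an assumption made so the example leaves nothing behind, whereas a real Makefile.PL would normally accept the default file names.

```perl
use strict;
use warnings;
use ExtUtils::Constant qw(WriteConstants);
use File::Spec;
use File::Temp qw(tempdir);

# Scratch directory, removed automatically at program exit.
my $dir     = tempdir( CLEANUP => 1 );
my $c_file  = File::Spec->catfile( $dir, 'const-c.inc' );
my $xs_file = File::Spec->catfile( $dir, 'const-xs.inc' );

WriteConstants(
    NAME         => 'Foo',                  # module name (required)
    DEFAULT_TYPE => 'IV',                   # the documented default, spelled out
    NAMES        => [qw(FOO BAR BAZ)],      # constants to expose
    C_FILE       => $c_file,                # C helper code goes here
    XS_FILE      => $xs_file,               # XS wrapper code goes here
);

# Report what was written.
for my $file ( $c_file, $xs_file ) {
    printf "%s: %d bytes\n", $file, -s $file;
}
```

    Nothing here scans a header file: the list in NAMES has to be supplied by hand or by another tool such as h2xs.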

    AUTHOR

    Nicholas Clark <nick@ccl4.org> based on the code in h2xs by Larry Wall and others

     
    ExtUtils::Embed

    NAME

    ExtUtils::Embed - Utilities for embedding Perl in C/C++ applications

    SYNOPSIS

    perl -MExtUtils::Embed -e xsinit
    perl -MExtUtils::Embed -e ccopts
    perl -MExtUtils::Embed -e ldopts

    DESCRIPTION

    ExtUtils::Embed provides utility functions for embedding a Perl interpreter and extensions in your C/C++ applications. Typically, an application Makefile will invoke ExtUtils::Embed functions while building your application.

    @EXPORT

    ExtUtils::Embed exports the following functions:

    xsinit(), ldopts(), ccopts(), perl_inc(), ccflags(), ccdlflags(), xsi_header(), xsi_protos(), xsi_body()

    FUNCTIONS

    • xsinit()

      Generate C/C++ code for the XS initializer function.

      When invoked as `perl -MExtUtils::Embed -e xsinit --` the following options are recognized:

      -o <output filename> (Defaults to perlxsi.c)

      -o STDOUT will print to STDOUT.

      -std (Write code for extensions that are linked with the current Perl.)

      Any additional arguments are expected to be names of modules to generate code for.

      When invoked with parameters the following are accepted and optional:

      xsinit($filename,$std,[@modules])

      Where,

      $filename is equivalent to the -o option.

      $std is boolean, equivalent to the -std option.

      [@modules] is an array ref, same as additional arguments mentioned above.

    • Examples
      perl -MExtUtils::Embed -e xsinit -- -o xsinit.c Socket

      This will generate code with an xs_init function that glues the perl Socket::bootstrap function to the C boot_Socket function and writes it to a file named xsinit.c.

      Note that DynaLoader is a special case where it must call boot_DynaLoader directly.

      perl -MExtUtils::Embed -e xsinit

      This will generate code for linking with DynaLoader and each static extension found in $Config{static_ext}. The code is written to the default file name perlxsi.c.

      perl -MExtUtils::Embed -e xsinit -- -o xsinit.c -std DBI DBD::Oracle

      Here, code is written for all the currently linked extensions along with code for DBI and DBD::Oracle.

      If you have a working DynaLoader then there is rarely any need to statically link in any other extensions.

    • ldopts()

      Output arguments for linking the Perl library and extensions to your application.

      When invoked as `perl -MExtUtils::Embed -e ldopts --` the following options are recognized:

      -std

      Output arguments for linking the Perl library and any extensions linked with the current Perl.

      -I <path1:path2>

      Search path for ModuleName.a archives. Default path is @INC. Library archives are expected to be found as /some/path/auto/ModuleName/ModuleName.a. For example, when looking for Socket.a relative to a search path, we should find auto/Socket/Socket.a.

      When looking for DBD::Oracle relative to a search path, we should find auto/DBD/Oracle/Oracle.a.

      Keep in mind that you can always supply /my/own/path/ModuleName.a as an additional linker argument.

      -- <list of linker args>

      Additional linker arguments to be considered.

      Any additional arguments found before the -- token are expected to be names of modules to generate code for.

      When invoked with parameters the following are accepted and optional:

      ldopts($std,[@modules],[@link_args],$path)

      Where:

      $std is boolean, equivalent to the -std option.

      [@modules] is equivalent to additional arguments found before the -- token.

      [@link_args] is equivalent to arguments found after the -- token.

      $path is equivalent to the -I option.

      In addition, when ldopts is called with parameters, it will return the argument string rather than print it to STDOUT.

    • Examples
      perl -MExtUtils::Embed -e ldopts

      This will print arguments for linking with libperl and extensions found in $Config{static_ext}. This includes libraries found in $Config{libs} and the first ModuleName.a library for each extension that is found by searching @INC or the path specified by the -I option. In addition, when ModuleName.a is found, additional linker arguments are picked up from the extralibs.ld file in the same directory.

      perl -MExtUtils::Embed -e ldopts -- -std Socket

      This will do the same as the above example, along with printing additional arguments for linking with the Socket extension.

      perl -MExtUtils::Embed -e ldopts -- -std Msql -- -L/usr/msql/lib -lmsql

      Any arguments after the second '--' token are additional linker arguments that will be examined for potential conflict. If there is no conflict, the additional arguments will be part of the output.

    • perl_inc()

      For including perl header files this function simply prints:

      -I$Config{archlibexp}/CORE

      So, rather than having to say:

      perl -MConfig -e 'print "-I$Config{archlibexp}/CORE"'

      Just say:

      perl -MExtUtils::Embed -e perl_inc

    • ccflags(), ccdlflags()

      These functions simply print $Config{ccflags} and $Config{ccdlflags}.

    • ccopts()

      This function combines perl_inc(), ccflags() and ccdlflags() into one.

    • xsi_header()

      This function simply returns a string defining the same EXTERN_C macro as perlmain.c along with #including perl.h and EXTERN.h.

    • xsi_protos(@modules)

      This function returns a string of boot_$ModuleName prototypes for each @modules.

    • xsi_body(@modules)

      This function returns a string of calls to newXS() that glue the module bootstrap function to boot_ModuleName for each @modules.

      xsinit() uses the xsi_* functions to generate most of its code.
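
    The xsi_* functions return strings rather than printing, so they can be inspected directly from Perl. The short sketch below prints the three pieces for a single module. Socket is used only because it appears in the examples above, and the output is the raw building material rather than a complete initializer file, which xsinit() assembles.

```perl
use strict;
use warnings;
use ExtUtils::Embed;

# Module list chosen for illustration only.
my @modules = ('Socket');

my $header = xsi_header();             # EXTERN_C macro and #include lines
my $protos = xsi_protos(@modules);     # boot_* prototypes, one per module
my $body   = xsi_body(@modules);       # newXS() calls, one per module

print "/* header */\n$header\n";
print "/* prototypes */\n$protos\n";
print "/* body */\n$body\n";
```

    Looking at these fragments is a convenient way to check which boot function names a given module list will produce before running xsinit() itself.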

    EXAMPLES

    For examples on how to use ExtUtils::Embed for building C/C++ applications with embedded perl, see perlembed.

    SEE ALSO

    perlembed

    AUTHOR

    Doug MacEachern <dougm@osf.org>

    Based on ideas from Tim Bunce <Tim.Bunce@ig.co.uk> and minimod.pl by Andreas Koenig <k@anna.in-berlin.de> and Tim Bunce.

     
    ExtUtils::Install

    NAME

    ExtUtils::Install - install files from here to there

    SYNOPSIS

    use ExtUtils::Install;
    install({ 'blib/lib' => 'some/install/dir' } );
    uninstall($packlist);
    pm_to_blib({ 'lib/Foo/Bar.pm' => 'blib/lib/Foo/Bar.pm' });

    VERSION

    1.59

    DESCRIPTION

    Handles the installing and uninstalling of perl modules, scripts, man pages, etc...

    Both install() and uninstall() are specific to the way ExtUtils::MakeMaker handles the installation and deinstallation of perl modules. They are not designed as general purpose tools.

    On some operating systems such as Win32, installation may not be possible until after a reboot has occurred. This can have varying consequences: removing an old DLL does not impact programs using the new one, but if a new DLL cannot be installed properly until reboot then anything depending on it must wait. The package variable

    $ExtUtils::Install::MUST_REBOOT

    is used to store this status.

    If this variable is true then such an operation has occurred and anything depending on this module cannot proceed until a reboot has occurred.

    If this value is defined but false then such an operation has occurred, but should not impact later operations.

    Functions

    # deprecated forms
    install(\%from_to);
    install(\%from_to, $verbose, $dry_run, $uninstall_shadows,
            $skip, $always_copy, \%result);

    # recommended form as of 1.47
    install([
        from_to => \%from_to,
        verbose => 1,
        dry_run => 0,
        uninstall_shadows => 1,
        skip => undef,
        always_copy => 1,
        result => \%install_results,
    ]);

    Copies each directory tree of %from_to to its corresponding value preserving timestamps and permissions.

    There are two keys with a special meaning in the hash: "read" and "write". These contain packlist files. After the copying is done, install() will write the list of target files to $from_to{write}. If $from_to{read} is given the contents of this file will be merged into the written file. The read and the written file may be identical, but on AFS it is quite likely that people are installing to a different directory than the one where the files later appear.

    If $verbose is true, install() will print out each file removed. Default is false. This is "make install VERBINST=1". $verbose values up to 5 show increasingly detailed diagnostic output.

    If $dry_run is true it will only print what it was going to do without actually doing it. Default is false.

    If $uninstall_shadows is true any differing versions throughout @INC will be uninstalled. This is "make install UNINST=1".

    As of 1.37_02 install() supports the use of a list of patterns to filter out files that shouldn't be installed. If $skip is omitted or undefined then install will try to read the list from INSTALL.SKIP in the CWD. This file is a list of regular expressions and is just like the MANIFEST.SKIP file used by ExtUtils::Manifest.

    A default site INSTALL.SKIP may be provided by setting the environment variable EU_INSTALL_SITE_SKIPFILE. It will only be used when there isn't a distribution specific INSTALL.SKIP. If the environment variable EU_INSTALL_IGNORE_SKIP is true then no install file filtering will be performed.

    If $skip is undefined then the skip file will be autodetected and used if it is found. If $skip is a reference to an array, the array is assumed to contain the list of patterns. If $skip is a true non-reference value, it is assumed to be the name of the file holding the list of patterns. Any other value of $skip is taken to mean that no install filtering should occur.

    Changes As of Version 1.47

    As of version 1.47 the following additions were made to the install interface. Note that the new argument style and use of the %result hash is recommended.

    The $always_copy parameter, when true, causes files to be updated regardless of whether they have changed. If it is defined but false then copies are made only if the files have changed. If it is undefined then the value of the environment variable EU_INSTALL_ALWAYS_COPY is used as the default.

    The %result hash will be populated with the various keys/subhashes reflecting the install. Currently these keys and their structure are:

    install           => { $target      => $source },
    install_fail      => { $target      => $source },
    install_unchanged => { $target      => $source },
    install_filtered  => { $source      => $pattern },
    uninstall         => { $uninstalled => $source },
    uninstall_fail    => { $uninstalled => $source },

    where $source is the filespec of the file being installed. $target is where it is being installed to, and $uninstalled is any shadow file that is in @INC or $ENV{PERL5LIB} or other standard locations, and $pattern is the pattern that caused a source file to be skipped. In future more keys will be added, such as to show created directories, however this requires changes in other modules and must therefore wait.

    These keys will be populated before any exceptions are thrown should there be an error.

    Note that all updates of the %result are additive, the hash will not be cleared before use, thus allowing status results of many installs to be easily aggregated.

    NEW ARGUMENT STYLE

    If there is only one argument and it is a reference to an array then the array is assumed to contain a list of key-value pairs specifying the options. In this case the option "from_to" is mandatory. This style means that you don't have to supply a cryptic list of arguments and can use a self-documenting argument list that is easier to understand.

    This is now the recommended interface to install().

    RETURN

    If all actions were successful install will return a hashref of the results as described above for the $result parameter. If any action is a failure then install will die, therefore it is recommended to pass in the $result parameter instead of using the return value. If the result parameter is provided then the returned hashref will be the passed in hashref.
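
    The recommended calling style, the skip list and the %result hash can be seen together in the following sketch. It installs from one scratch directory to another, so the directory layout, the file names Demo.pm and Demo.pm.bak, and the skip pattern are all assumptions made for the example. The call is wrapped in eval because install() dies on failure.

```perl
use strict;
use warnings;
use ExtUtils::Install qw(install);
use File::Spec;
use File::Temp qw(tempdir);

# Two scratch directories stand in for blib/lib and the install target.
my $src_dir = tempdir( CLEANUP => 1 );
my $dst_dir = tempdir( CLEANUP => 1 );

# Create one file to install and one that the skip list should filter.
for my $name ( 'Demo.pm', 'Demo.pm.bak' ) {
    my $path = File::Spec->catfile( $src_dir, $name );
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} "1;\n";
    close $fh or die "Cannot close $path: $!";
}

my %result;
my $ok = eval {
    install([
        from_to           => { $src_dir => $dst_dir },
        verbose           => 0,
        dry_run           => 0,
        uninstall_shadows => 0,
        skip              => ['\.bak$'],    # array ref: list of patterns
        always_copy       => 1,
        result            => \%result,      # filled in even if install dies
    ]);
    1;
};
warn "install failed: $@" unless $ok;

# Summarise what happened, one line per populated result key.
for my $key ( sort keys %result ) {
    printf "%-18s %d\n", $key, scalar keys %{ $result{$key} };
}

print "A reboot is required before continuing.\n"
    if $ExtUtils::Install::MUST_REBOOT;
```

    Because updates to %result are additive, the same hash can be passed to several install() calls to collect one combined report.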

    install_default();
    install_default($fullext);

    Calls install() with arguments to copy a module from blib/ to the default site installation location.

    $fullext is the name of the module converted to a directory (i.e. Foo::Bar would be Foo/Bar). If $fullext is not specified, it will attempt to read it from @ARGV.

    This is primarily useful for install scripts.

    NOTE This function is not really useful because of the hard-coded install location with no way to control site vs core vs vendor directories and the strange way in which the module name is given. Consider its use discouraged.

    uninstall($packlist_file);
    uninstall($packlist_file, $verbose, $dont_execute);

    Removes the files listed in a $packlist_file.

    If $verbose is true, will print out each file removed. Default is false.

    If $dont_execute is true it will only print what it was going to do without actually doing it. Default is false.

    pm_to_blib(\%from_to, $autosplit_dir);
    pm_to_blib(\%from_to, $autosplit_dir, $filter_cmd);

    Copies each key of %from_to to its corresponding value efficiently. Filenames with the extension .pm are autosplit into the $autosplit_dir. Any destination directories are created.

    $filter_cmd is an optional shell command to run each .pm file through prior to splitting and copying. Input is the contents of the module, output the new module contents.

    You can have an environment variable PERL_INSTALL_ROOT set which will be prepended as a directory to each installed file (and directory).

    ENVIRONMENT

    • PERL_INSTALL_ROOT

      Will be prepended to each install path.

    • EU_INSTALL_IGNORE_SKIP

      Will prevent the automatic use of INSTALL.SKIP as the install skip file.

    • EU_INSTALL_SITE_SKIPFILE

      If there is no INSTALL.SKIP file in the make directory then this value can be used to provide a default.

    • EU_INSTALL_ALWAYS_COPY

      If this environment variable is true then normal install processes will always overwrite older identical files during the install process.

      Note that the alias EU_ALWAYS_COPY will be supported if EU_INSTALL_ALWAYS_COPY is not defined until at least the 1.50 release. Please ensure you use the correct EU_INSTALL_ALWAYS_COPY.

    AUTHOR

    Original author lost in the mists of time. Probably the same as MakeMaker.

    Production release currently maintained by demerphq yves at cpan.org , extensive changes by Michael G. Schwern.

    Send bug reports via http://rt.cpan.org/. Please send your generated Makefile along with your report.

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    See http://www.perl.com/perl/misc/Artistic.html

     
    ExtUtils::Installed

    NAME

    ExtUtils::Installed - Inventory management of installed modules

    SYNOPSIS

    use ExtUtils::Installed;
    my ($inst) = ExtUtils::Installed->new( skip_cwd => 1 );
    my (@modules) = $inst->modules();
    my (@missing) = $inst->validate("DBI");
    my $all_files = $inst->files("DBI");
    my $files_below_usr_local = $inst->files("DBI", "all", "/usr/local");
    my $all_dirs = $inst->directories("DBI");
    my $dirs_below_usr_local = $inst->directory_tree("DBI", "prog");
    my $packlist = $inst->packlist("DBI");

    DESCRIPTION

    ExtUtils::Installed provides a standard way to find out what core and module files have been installed. It uses the information stored in .packlist files created during installation to provide this information. In addition it provides facilities to classify the installed files and to extract directory information from the .packlist files.

    USAGE

    The new() function searches for all the installed .packlists on the system and stores their contents. The .packlists can be queried with the functions described below. Where it searches by default is determined by the settings found in %Config::Config and by the value of the PERL5LIB environment variable.

    METHODS

    Unless specified otherwise, all methods can be called as class methods or as object methods. If called as class methods, the "default" object will be used and, if necessary, created using the current process's %Config and @INC. See the 'default' option to new() for details.

    • new()

      This takes optional named parameters. Without parameters, this searches for all the installed .packlists on the system using information from %Config::Config and the default module search paths @INC. The packlists are read using the ExtUtils::Packlist module.

      If the named parameter skip_cwd is true, the current directory . will be stripped from @INC before searching for .packlists. This keeps ExtUtils::Installed from finding modules installed in other perls that happen to be located below the current directory.

      If the named parameter config_override is specified, it should be a reference to a hash which contains all information usually found in %Config::Config. For example, you can obtain the configuration information for a separate perl installation and pass that in.

          my $yoda_cfg  = get_fake_config('yoda');
          my $yoda_inst =
              ExtUtils::Installed->new(config_override=>$yoda_cfg);

      Similarly, the parameter inc_override may be a reference to an array which is used in place of the default module search paths from @INC.

          use Config;
          my @dirs = split(/\Q$Config{path_sep}\E/, $ENV{PERL5LIB});
          my $p5libs = ExtUtils::Installed->new(inc_override=>\@dirs);

      Note: You probably do not want to use these options alone; almost always you will want to set both together.

      The parameter extra_libs can be used to specify additional paths to search for installed modules. For instance:

          my $installed =
              ExtUtils::Installed->new(extra_libs=>["/my/lib/path"]);

      This should only be necessary if /my/lib/path is not in PERL5LIB.

      Finally there is the 'default', and the related 'default_get' and 'default_set' options. These options control the "default" object which is provided by the class interface to the methods. Setting default_get to true tells the constructor to return the default object if it is defined. Setting default_set to true tells the constructor to make the default object the constructed object. Setting the default option is like setting both to true. This is used primarily internally and probably isn't interesting to any real user.

    • modules()

      This returns a list of the names of all the installed modules. The perl 'core' is given the special name 'Perl'.

    • files()

      This takes one mandatory parameter, the name of a module. It returns a list of all the filenames from the package. To obtain a list of core perl files, use the module name 'Perl'. Additional parameters are allowed. The first is one of the strings "prog", "doc" or "all", to select either just program files, just manual files or all files. The remaining parameters are a list of directories. The filenames returned will be restricted to those under the specified directories.

    • directories()

      This takes one mandatory parameter, the name of a module. It returns a list of all the directories from the package. Additional parameters are allowed. The first is one of the strings "prog", "doc" or "all", to select either just program directories, just manual directories or all directories. The remaining parameters are a list of directories. The directories returned will be restricted to those under the specified directories. This method returns only the leaf directories that contain files from the specified module.

    • directory_tree()

      This is identical in operation to directories(), except that it includes all the intermediate directories back up to the specified directories.

    • validate()

      This takes one mandatory parameter, the name of a module. It checks that all the files listed in the module's .packlist actually exist, and returns a list of any missing files. If an optional second argument which evaluates to true is given, any missing files will be removed from the .packlist.

    • packlist()

      This returns the ExtUtils::Packlist object for the specified module.

    • version()

      This returns the version number for the specified module.
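
    The methods above can be combined into a small inventory report. The following is a minimal sketch, assuming a perl whose installed modules recorded .packlist files; the inventory() helper and the report layout are illustrative and not part of the module.

```perl
use strict;
use warnings;
use ExtUtils::Installed;

# Hypothetical helper: build one report row per module that has a .packlist.
sub inventory {
    my $inst = ExtUtils::Installed->new( skip_cwd => 1 );
    my @rows;
    for my $module ( sort $inst->modules ) {
        my $version = $inst->version($module);
        my @missing = $inst->validate($module);          # report only, nothing is removed
        my @progs   = $inst->files( $module, 'prog' );   # program files only
        push @rows, {
            module  => $module,
            version => ( defined $version && length $version ? $version : 'unknown' ),
            missing => scalar @missing,
            progs   => scalar @progs,
        };
    }
    return \@rows;
}

printf "%-30s %-12s %d program file(s), %d missing\n",
    @{$_}{qw(module version progs missing)}
    for @{ inventory() };
```

    Calling validate() with a single argument only reports missing files; the .packlist itself is left untouched.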

    EXAMPLE

    See the example in ExtUtils::Packlist.

    AUTHOR

    Alan Burlison <Alan.Burlison@uk.sun.com>

     
    ExtUtils::Liblist

    Perl 5 version 18.2 documentation

    NAME

    ExtUtils::Liblist - determine libraries to use and how to use them

    SYNOPSIS

        require ExtUtils::Liblist;
        $MM->ext($potential_libs, $verbose, $need_names);
        # Usually you can get away with:
        ExtUtils::Liblist->ext($potential_libs, $verbose, $need_names)

    DESCRIPTION

    This utility takes a list of libraries in the form -llib1 -llib2 -llib3 and returns lines suitable for inclusion in an extension Makefile. Extra library paths may be included with the form -L/another/path; this will affect the searches for all subsequent libraries.

    It returns an array of four or five scalar values: EXTRALIBS, BSLOADLIBS, LDLOADLIBS, LD_RUN_PATH, and, optionally, a reference to the array of the filenames of actual libraries. Some of these don't mean anything unless on Unix. See the details about those platform specifics below. The list of the filenames is returned only if $need_names argument is true.
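
    As a rough illustration of the return values, the class-method form from the SYNOPSIS can be called directly. This is a sketch only: the choice of -lm as the library is an assumption, and the strings printed depend entirely on the platform and on whether the library is found (a warning is printed if it is not).

```perl
use strict;
use warnings;
use ExtUtils::Liblist;

# Ask about the math library (an arbitrary choice for illustration).
# The third argument is true, so a fifth element with the file names is returned.
my ( $extralibs, $bsloadlibs, $ldloadlibs, $ld_run_path, $names ) =
    ExtUtils::Liblist->ext( '-lm', 0, 1 );

print "EXTRALIBS:   $extralibs\n";
print "BSLOADLIBS:  $bsloadlibs\n";
print "LDLOADLIBS:  $ldloadlibs\n";
print "LD_RUN_PATH: $ld_run_path\n";
print "Files found: @$names\n";
```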

    Dependent libraries can be linked in one of three ways:

    • For static extensions

      by the ld command when the perl binary is linked with the extension library. See EXTRALIBS below.

    • For dynamic extensions at build/link time

      by the ld command when the shared object is built/linked. See LDLOADLIBS below.

    • For dynamic extensions at load time

      by the DynaLoader when the shared object is loaded. See BSLOADLIBS below.

    EXTRALIBS

    List of libraries that need to be linked with when linking a perl binary which includes this extension. Only those libraries that actually exist are included. These are written to a file and used when linking perl.

    LDLOADLIBS and LD_RUN_PATH

    List of those libraries which can or must be linked into the shared library when created using ld. These may be static or dynamic libraries. LD_RUN_PATH is a colon separated list of the directories in LDLOADLIBS. It is passed as an environment variable to the process that links the shared library.

    BSLOADLIBS

    List of those libraries that are needed but can be linked in dynamically at run time on this platform. SunOS/Solaris does not need this because ld records the information (from LDLOADLIBS) into the object file. This list is used to create a .bs (bootstrap) file.

    PORTABILITY

    This module deals with a lot of system dependencies and has quite a few architecture-specific ifs in the code.

    VMS implementation

    The version of ext() which is executed under VMS differs from the Unix-OS/2 version in several respects:

    • Input library and path specifications are accepted with or without the -l and -L prefixes used by Unix linkers. If neither prefix is present, a token is considered a directory to search if it is in fact a directory, and a library to search for otherwise. Authors who wish their extensions to be portable to Unix or OS/2 should use the Unix prefixes, since the Unix-OS/2 version of ext() requires them.

    • Wherever possible, shareable images are preferred to object libraries, and object libraries to plain object files. In accordance with VMS naming conventions, ext() looks for files named <lib>shr and <lib>rtl; it also looks for <lib>lib and lib<lib> to accommodate Unix conventions used in some ported software.

    • For each library that is found, an appropriate directive for a linker options file is generated. The return values are space-separated strings of these directives, rather than elements used on the linker command line.

    • LDLOADLIBS contains both the libraries found based on $potential_libs and the CRTLs, if any, specified in Config.pm. EXTRALIBS contains just those libraries found based on $potential_libs. BSLOADLIBS and LD_RUN_PATH are always empty.

    In addition, an attempt is made to recognize several common Unix library names, and filter them out or convert them to their VMS equivalents, as appropriate.

    In general, the VMS version of ext() should properly handle input from extensions originally designed for a Unix or VMS environment. If you encounter problems, or discover cases where the search could be improved, please let us know.

    Win32 implementation

    The version of ext() which is executed under Win32 differs from the Unix-OS/2 version in several respects:

    • If $potential_libs is empty, the return value will be empty. Otherwise, the libraries specified by $Config{perllibs} (see Config.pm) will be appended to the list of $potential_libs. The libraries will be searched for in the directories specified in $potential_libs, $Config{libpth}, and in $Config{installarchlib}/CORE. For each library that is found, a space-separated list of fully qualified library pathnames is generated.

    • Input library and path specifications are accepted with or without the -l and -L prefixes used by Unix linkers.

      An entry of the form -La:\foo specifies the a:\foo directory to look for the libraries that follow.

      An entry of the form -lfoo specifies the library foo, which may be spelled differently depending on what kind of compiler you are using. If you are using GCC, it gets translated to libfoo.a, but for other win32 compilers, it becomes foo.lib. If no files are found by those translated names, one more attempt is made to find them using either foo.a or libfoo.lib, depending on whether GCC or some other win32 compiler is being used, respectively.

      If neither the -L nor the -l prefix is present in an entry, the entry is considered a directory to search if it is in fact a directory, and a library to search for otherwise. The $Config{lib_ext} suffix will be appended to any entries that are not directories and don't already have the suffix.

      Note that the -L and -l prefixes are not required, but authors who wish their extensions to be portable to Unix or OS/2 should use the prefixes, since the Unix-OS/2 version of ext() requires them.

    • Entries cannot be plain object files, as many Win32 compilers will not handle object files in the place of libraries.

    • Entries in $potential_libs beginning with a colon and followed by alphanumeric characters are treated as flags. Unknown flags will be ignored.

      An entry that matches /:nodefault/i disables the appending of default libraries found in $Config{perllibs} (this should be only needed very rarely).

      An entry that matches /:nosearch/i disables all searching for the libraries specified after it. Translation of -Lfoo and -lfoo still happens as appropriate (depending on compiler being used, as reflected by $Config{cc}), but the entries are not verified to be valid files or directories.

      An entry that matches /:search/i reenables searching for the libraries specified after it. You can put it at the end to enable searching for default libraries specified by $Config{perllibs}.

    • The libraries specified may be a mixture of static libraries and import libraries (to link with DLLs). Since both kinds are used pretty transparently on the Win32 platform, we do not attempt to distinguish between them.

    • LDLOADLIBS and EXTRALIBS are always identical under Win32, and BSLOADLIBS and LD_RUN_PATH are always empty (this may change in future).

    • You must make sure that any paths and path components are properly surrounded with double-quotes if they contain spaces. For example, $potential_libs could be (literally):

          "-Lc:\Program Files\vc\lib" msvcrt.lib "la test\foo bar.lib"

      Note how the first and last entries are protected by quotes in order to protect the spaces.

    • Since this module is most often used only indirectly from extension Makefile.PL files, here is an example Makefile.PL entry to add a library to the build process for an extension:

          LIBS => ['-lgl']

      When using GCC, that entry specifies that MakeMaker should first look for libgl.a (followed by gl.a) in all the locations specified by $Config{libpth}.

      When using a compiler other than GCC, the above entry will search for gl.lib (followed by libgl.lib).

      If the library happens to be in a location not in $Config{libpth}, you need:

          LIBS => ['-Lc:\gllibs -lgl']

      Here is a less often used example:

          LIBS => ['-lgl', ':nosearch -Ld:\mesalibs -lmesa -luser32']

      This specifies a search for library gl as before. If that search fails to find the library, it looks at the next item in the list. The :nosearch flag will prevent searching for the libraries that follow, so it simply returns the value as -Ld:\mesalibs -lmesa -luser32, since GCC can use that value as is with its linker.

      When using the Visual C compiler, the second item is returned as -libpath:d:\mesalibs mesa.lib user32.lib.

      When using the Borland compiler, the second item is returned as -Ld:\mesalibs mesa.lib user32.lib, and MakeMaker takes care of moving the -Ld:\mesalibs to the correct place in the linker command line.

    SEE ALSO

    ExtUtils::MakeMaker

     
    ExtUtils::MM

    Perl 5 version 18.2 documentation

    NAME

    ExtUtils::MM - OS adjusted ExtUtils::MakeMaker subclass

    SYNOPSIS

        require ExtUtils::MM;
        my $mm = MM->new(...);

    DESCRIPTION

    FOR INTERNAL USE ONLY

    ExtUtils::MM is a subclass of ExtUtils::MakeMaker which automatically chooses the appropriate OS-specific subclass for you (i.e. ExtUtils::MM_Unix, etc.).

    It also provides a convenient alias via the MM class (I didn't want MakeMaker modules outside of ExtUtils/).

    This class might turn out to be a temporary solution, but MM won't go away.

     
    ExtUtils::MM_AIX

    Perl 5 version 18.2 documentation

    NAME

    ExtUtils::MM_AIX - AIX specific subclass of ExtUtils::MM_Unix

    SYNOPSIS

        Don't use this module directly.
        Use ExtUtils::MM and let it choose.

    DESCRIPTION

    This is a subclass of ExtUtils::MM_Unix which contains functionality for AIX.

    Unless otherwise stated, it works just like ExtUtils::MM_Unix.

    Overridden methods

    dlsyms

    Define DL_FUNCS and DL_VARS and write the *.exp files.

    AUTHOR

    Michael G Schwern <schwern@pobox.com> with code from ExtUtils::MM_Unix

    SEE ALSO

    ExtUtils::MakeMaker

     
    ExtUtils::MM_Any

    Perl 5 version 18.2 documentation

    NAME

    ExtUtils::MM_Any - Platform-agnostic MM methods

    SYNOPSIS

        FOR INTERNAL USE ONLY!

        package ExtUtils::MM_SomeOS;

        # Temporarily, you have to subclass both. Put MM_Any first.
        require ExtUtils::MM_Any;
        require ExtUtils::MM_Unix;
        @ISA = qw(ExtUtils::MM_Any ExtUtils::MM_Unix);

    DESCRIPTION

    FOR INTERNAL USE ONLY!

    ExtUtils::MM_Any is a superclass for the ExtUtils::MM_* set of modules. It contains methods which are either inherently cross-platform or are written in a cross-platform manner.

    Subclass off of ExtUtils::MM_Any and ExtUtils::MM_Unix. This is a temporary solution.

    THIS MAY BE TEMPORARY!

    METHODS

    Any methods marked Abstract must be implemented by subclasses.

    Cross-platform helper methods

    These are methods which help writing cross-platform code.

    os_flavor Abstract

        my @os_flavor = $mm->os_flavor;

    @os_flavor is the style of operating system this is, usually corresponding to the MM_*.pm file we're using.

    The first element of @os_flavor is the major family (i.e. Unix, Windows, VMS, OS/2, etc.) and the rest are subfamilies.

    Some examples:

        Cygwin98       ('Unix',  'Cygwin', 'Cygwin9x')
        Windows        ('Win32')
        Win98          ('Win32', 'Win9x')
        Linux          ('Unix',  'Linux')
        MacOS X        ('Unix',  'Darwin', 'MacOS', 'MacOS X')
        OS/2           ('OS/2')

    This is used to write code for styles of operating system. See os_flavor_is() for use.

    os_flavor_is

        my $is_this_flavor = $mm->os_flavor_is($this_flavor);
        my $is_this_flavor = $mm->os_flavor_is(@one_of_these_flavors);

    Checks to see if the current operating system is one of the given flavors.

    This is useful for code like:

        if( $mm->os_flavor_is('Unix') ) {
            $out = `foo 2>&1`;
        }
        else {
            $out = `foo`;
        }
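
    For quick experiments the two flavor methods can also be tried without building a full MakeMaker object. The sketch below assumes they can be called as class methods on the MM alias that ExtUtils::MM sets up; that calling style is a convenience assumed here, not something this page documents.

```perl
use strict;
use warnings;
require ExtUtils::MM;

# Called as class methods (an assumption; normally you would use an $mm object).
my @flavor = MM->os_flavor;
print "os_flavor: @flavor\n";

if ( MM->os_flavor_is('Unix') ) {
    print "Unix-style shell redirection such as 2>&1 should be available\n";
}
elsif ( MM->os_flavor_is( 'Win32', 'VMS' ) ) {
    print "Non-Unix platform; avoid Unix-only shell syntax\n";
}
```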

    can_load_xs

        my $can_load_xs = $self->can_load_xs;

    Returns true if we have the ability to load XS.

    This is important because miniperl, used to build XS modules in the core, cannot load XS.

    split_command

        my @cmds = $MM->split_command($cmd, @args);

    Most operating systems have a maximum command length they can execute at once. Large modules can easily generate commands well past that limit. It's necessary to split long commands up into a series of shorter commands.

    split_command will return a series of @cmds, each processing part of the args. Collectively they will process all the arguments. Each individual line in @cmds will not be longer than $self->max_exec_len, taking macro expansion into account.

    $cmd should include any switches and repeated initial arguments.

    If no @args are given, no @cmds will be returned.

    Pairs of arguments will always be preserved in a single command. This is a heuristic for things like pm_to_blib and pod2man which work on pairs of arguments. It makes things like this safe:

        $self->split_command($cmd, %pod2man);
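
    The pair-preserving idea can be illustrated with a small stand-alone routine. This is a simplified sketch, not MakeMaker's implementation: split_pairs() and its explicit $max_len parameter are hypothetical, and unlike the real method it makes no allowance for macro expansion.

```perl
use strict;
use warnings;

# Hypothetical helper: pack argument pairs onto command lines no longer than
# $max_len, never separating the two members of a pair.
sub split_pairs {
    my ( $cmd, $max_len, @args ) = @_;
    my @cmds;
    while (@args) {
        my $line = $cmd;
        while (@args) {
            my @pair = grep { defined } @args[ 0, 1 ];
            my $next = join ' ', $line, @pair;
            # Stop when the next pair would overflow, but always take at
            # least one pair per command so the loop makes progress.
            last if length($next) > $max_len && $line ne $cmd;
            $line = $next;
            splice @args, 0, 2;
        }
        push @cmds, $line;
    }
    return @cmds;
}

print "$_\n" for split_pairs( 'pod2man', 30, qw(a.pm a.3 b.pm b.3 c.pm c.3) );
```

    With a limit of 30 characters the six arguments end up on two command lines, and each .pm file stays next to its man page name.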

    echo

        my @commands = $MM->echo($text);
        my @commands = $MM->echo($text, $file);
        my @commands = $MM->echo($text, $file, \%opts);

    Generates a set of @commands which print the $text to a $file.

    If $file is not given, output goes to STDOUT.

    If $opts{append} is true the $file will be appended to rather than overwritten. Default is to overwrite.

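    If $opts{allow_variables} is true, make variables of the form $(...) will not be escaped. Other $ will. Default is to escape all $.

    Example of use:

        # join the commands into one string; a bare map in scalar context
        # would only return the number of commands
        my $make = join '', map "\t$_\n", $MM->echo($text, $file);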

    wraplist

        my $args = $mm->wraplist(@list);

    Takes an array of items and turns them into a well-formatted list of arguments. In most cases this is simply something like:

        FOO \
        BAR \
        BAZ
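
    A continuation-style list like the one above can be produced with a single join. The function below is a stand-alone sketch with a hypothetical name, shown only to make the output format concrete; the separator a given platform's wraplist() actually uses may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: join items with a backslash-newline continuation
# followed by a tab, so each item sits on its own line.
sub wraplist_sketch {
    my @items = @_;
    return join " \\\n\t", @items;
}

print 'OBJECTS = ', wraplist_sketch(qw(FOO BAR BAZ)), "\n";
```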

    maketext_filter

        my $filter_make_text = $mm->maketext_filter($make_text);

    The text of the Makefile is run through this method before writing to disk. It allows systems a chance to make portability fixes to the Makefile.

    By default it does nothing.

    This method is protected and not intended to be called outside of MakeMaker.

    cd Abstract

        my $subdir_cmd = $MM->cd($subdir, @cmds);

    This will generate a make fragment which runs the @cmds in the given $subdir. It is the rough cross-platform equivalent of:

        cd $subdir && $cmd

    Currently $subdir can only go down one level. "foo" is fine. "foo/bar" is not. "../foo" is right out.

    The resulting $subdir_cmd has no leading tab nor trailing newline. This makes it easier to embed in a make string. For example:

        my $make = sprintf <<'CODE', $subdir_cmd;
        foo :
            $(ECHO) what
            %s
            $(ECHO) mouche
        CODE

    oneliner Abstract

        my $oneliner = $MM->oneliner($perl_code);
        my $oneliner = $MM->oneliner($perl_code, \@switches);

    This will generate a perl one-liner safe for the particular platform you're on based on the given $perl_code and @switches (a -e is assumed) suitable for using in a make target. It will use the proper shell quoting and escapes.

    $(PERLRUN) will be used as perl.

    Any newlines in $perl_code will be escaped. Leading and trailing newlines will be stripped. Makes this idiom much easier:

        my $code = $MM->oneliner(<<'CODE', [...switches...]);
        some code here
        another line here
        CODE

    Usage might be something like:

        # an echo emulation
        $oneliner = $MM->oneliner('print "Foo\n"');
        $make = "$oneliner > somefile";

    All dollar signs must be doubled in the $perl_code if you expect them to be interpreted normally; otherwise they will be treated as make macros. Also remember to quote make macros, or their values might be used as barewords. For example:

        # Assign the value of the $(VERSION_FROM) make macro to $vf.
        $oneliner = $MM->oneliner('$$vf = "$(VERSION_FROM)"');

    It's currently very simple and may be expanded sometime in the future to include more flexible code and switches.

    quote_literal Abstract

        my $safe_text = $MM->quote_literal($text);
        my $safe_text = $MM->quote_literal($text, \%options);

    This will quote $text so it is interpreted literally in the shell.

    For example, on Unix this would escape any single-quotes in $text and put single-quotes around the whole thing.

    If $options{allow_variables} is true it will leave '$(FOO)' make variables untouched. If false they will be escaped like any other $. Defaults to true.
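
    The Unix behaviour described above can be sketched in a few lines. quote_literal_sketch() is a hypothetical stand-alone helper written for illustration, not the module's code, and it ignores the allow_variables option entirely.

```perl
use strict;
use warnings;

# Hypothetical helper: quote $text for a POSIX shell. Each embedded single
# quote closes the quoted string, adds an escaped quote, and reopens it.
sub quote_literal_sketch {
    my ($text) = @_;
    $text =~ s/'/'\\''/g;
    return "'$text'";
}

print quote_literal_sketch("it's here"), "\n";
```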

    escape_dollarsigns

        my $escaped_text = $MM->escape_dollarsigns($text);

    Escapes stray $ so they are not interpreted as make variables.

    It lets $(...) through.

    escape_all_dollarsigns

        my $escaped_text = $MM->escape_all_dollarsigns($text);

    Escapes all $ so they are not interpreted as make variables.
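
    The difference between the two escaping methods is easiest to see side by side. The two functions below are stand-alone sketches with hypothetical names that mimic the behaviour described here; they assume that escaping means doubling the dollar sign, as the oneliner documentation suggests.

```perl
use strict;
use warnings;

# Hypothetical helper: double every $ that does not start a $(...) make variable.
sub escape_stray_dollars {
    my ($text) = @_;
    $text =~ s/\$(?!\()/\$\$/g;
    return $text;
}

# Hypothetical helper: double every $, including those starting $(...).
sub escape_every_dollar {
    my ($text) = @_;
    $text =~ s/\$/\$\$/g;
    return $text;
}

my $text = 'echo $HOME $(PERL)';
print escape_stray_dollars($text), "\n";
print escape_every_dollar($text),  "\n";
```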

    escape_newlines Abstract

        my $escaped_text = $MM->escape_newlines($text);

    Shell escapes newlines in $text.

    max_exec_len Abstract

        my $max_exec_len = $MM->max_exec_len;

    Calculates the maximum command size the OS can exec. Effectively, this is the max size of a shell command line.

    make

        my $make = $MM->make;

    Returns the make variant we're generating the Makefile for. This attempts to do some normalization on the information from %Config or the user.

    Targets

    These are methods which produce make targets.

    all_target

    Generate the default target 'all'.

    blibdirs_target

        my $make_frag = $mm->blibdirs_target;

    Creates the blibdirs target which creates all the directories we use in blib/.

    The blibdirs.ts target is deprecated. Depend on blibdirs instead.

    clean (o)

    Defines the clean target.

    clean_subdirs_target

        my $make_frag = $MM->clean_subdirs_target;

    Returns the clean_subdirs target. This is used by the clean target to call clean on any subdirectories which contain Makefiles.

    dir_target

        my $make_frag = $mm->dir_target(@directories);

    Generates targets to create the specified directories and set their permissions to PERM_DIR.

    Because depending on a directory to just ensure it exists doesn't work too well (the modified time changes too often), dir_target() creates a .exists file in the created directory. It is this you should depend on. For portability purposes you should use the $(DIRFILESEP) macro rather than a '/' to separate the directory from the file.

        yourdirectory$(DIRFILESEP).exists

    distdir

    Defines the scratch directory target that will hold the distribution before tar-ing (or shar-ing).

    dist_test

    Defines a target that produces the distribution in the scratch directory, and runs 'perl Makefile.PL; make; make test' in that subdirectory.

    dynamic (o)

    Defines the dynamic target.

    makemakerdflt_target

        my $make_frag = $mm->makemakerdflt_target

    Returns a make fragment with the makemakerdflt target specified. This target is the first target in the Makefile, is the default target and simply points off to 'all' just in case any make variant gets confused or something gets snuck in before the real 'all' target.

    manifypods_target

        my $manifypods_target = $self->manifypods_target;

    Generates the manifypods target. This target generates man pages from all POD files in MAN1PODS and MAN3PODS.

    metafile_target

        my $target = $mm->metafile_target;

    Generate the metafile target.

    Writes the file META.yml, containing YAML-encoded metadata about the module, in the distdir. The format follows Module::Build's as closely as possible.

    metafile_data

        my @metadata_pairs = $mm->metafile_data(\%meta_add, \%meta_merge);

    Returns the data which MakeMaker turns into the META.yml file.

    Values of %meta_add will overwrite any existing metadata in those keys. %meta_merge will be merged with them.

    metafile_file

        my $meta_yml = $mm->metafile_file(@metadata_pairs);

    Turns the @metadata_pairs into YAML.

    This method does not implement a complete YAML dumper. It is limited to dumping a hash whose values are strings, undefs, or nested hashes and arrays of strings. No quoting/escaping is done.

    distmeta_target

        my $make_frag = $mm->distmeta_target;

    Generates the distmeta target to add META.yml to the MANIFEST in the distdir.

    mymeta

        my $mymeta = $mm->mymeta;

    Generate MYMETA information as a hash either from an existing META.yml or from internal data.

    write_mymeta

        $self->write_mymeta( $mymeta );

    Write MYMETA information to MYMETA.yml.

    This will probably be refactored into a more generic YAML dumping method.

    realclean (o)

    Defines the realclean target.

    realclean_subdirs_target

        my $make_frag = $MM->realclean_subdirs_target;

    Returns the realclean_subdirs target. This is used by the realclean target to call realclean on any subdirectories which contain Makefiles.

    signature_target

        my $target = $mm->signature_target;

    Generate the signature target.

    Writes the file SIGNATURE with "cpansign -s".

    distsignature_target

        my $make_frag = $mm->distsignature_target;

    Generates the distsignature target to add SIGNATURE to the MANIFEST in the distdir.

    special_targets

        my $make_frag = $mm->special_targets

    Returns a make fragment containing any targets which have special meaning to make. For example, .SUFFIXES and .PHONY.

    Init methods

    Methods which help initialize the MakeMaker object and macros.

    init_ABSTRACT

        $mm->init_ABSTRACT

    init_INST

        $mm->init_INST;

    Called by init_main. Sets up all INST_* variables except those related to XS code. Those are handled in init_xs.

    init_INSTALL

        $mm->init_INSTALL;

    Called by init_main. Sets up all INSTALL_* variables (except INSTALLDIRS) and *PREFIX.

    init_INSTALL_from_PREFIX

        $mm->init_INSTALL_from_PREFIX;

    init_from_INSTALL_BASE

        $mm->init_from_INSTALL_BASE

    init_VERSION Abstract

        $mm->init_VERSION

    Initialize macros representing versions of MakeMaker and other tools.

    MAKEMAKER: path to the MakeMaker module.

    MM_VERSION: ExtUtils::MakeMaker Version

    MM_REVISION: ExtUtils::MakeMaker version control revision (for backwards compat)

    VERSION: version of your module

    VERSION_MACRO: which macro represents the version (usually 'VERSION')

    VERSION_SYM: like version but safe for use as an RCS revision number

    DEFINE_VERSION: -D line to set the module version when compiling

    XS_VERSION: version in your .xs file. Defaults to $(VERSION)

    XS_VERSION_MACRO: which macro represents the XS version.

    XS_DEFINE_VERSION: -D line to set the xs version when compiling.

    Called by init_main.

    init_tools

        $MM->init_tools();

    Initializes the simple macro definitions used by tools_other() and places them in the $MM object. These use conservative cross platform versions and should be overridden with platform specific versions for performance.

    Defines at least these macros.

        Macro           Description

        NOOP            Do nothing
        NOECHO          Tell make not to display the command itself
        SHELL           Program used to run shell commands
        ECHO            Print text adding a newline on the end
        RM_F            Remove a file
        RM_RF           Remove a directory
        TOUCH           Update a file's timestamp
        TEST_F          Test for a file's existence
        CP              Copy a file
        MV              Move a file
        CHMOD           Change permissions on a file
        FALSE           Exit with non-zero
        TRUE            Exit with zero
        UMASK_NULL      Nullify umask
        DEV_NULL        Suppress all command output

    init_others

        $MM->init_others();

    Initializes the macro definitions having to do with compiling and linking used by tools_other() and places them in the $MM object.

    If there is no description, it's the same as the parameter to WriteMakefile() documented in ExtUtils::MakeMaker.

    tools_other

        my $make_frag = $MM->tools_other;

    Returns a make fragment containing definitions for the macros init_others() initializes.

    init_DIRFILESEP Abstract

        $MM->init_DIRFILESEP;
        my $dirfilesep = $MM->{DIRFILESEP};

    Initializes the DIRFILESEP macro, which is the separator between the directory and filename in a filepath, i.e. / on Unix, \ on Win32 and nothing on VMS.

    For example:

        # instead of $(INST_ARCHAUTODIR)/extralibs.ld
        $(INST_ARCHAUTODIR)$(DIRFILESEP)extralibs.ld

    Something of a hack but it prevents a lot of code duplication between MM_* variants.

    Do not use this as a separator between directories. Some operating systems use different separators between subdirectories than between directories and filenames (for example: VOLUME:[dir1.dir2]file on VMS).

    init_linker Abstract

        $mm->init_linker;

    Initialize macros which have to do with linking.

    PERL_ARCHIVE: path to libperl.a equivalent to be linked to dynamic extensions.

    PERL_ARCHIVE_AFTER: path to a library which should be put on the linker command line after the external libraries to be linked to dynamic extensions. This may be needed if the linker is one-pass, and Perl includes some overrides for C RTL functions, such as malloc().

    EXPORT_LIST: name of a file that is passed to linker to define symbols to be exported.

    Some OSes do not need these, in which case leave them blank.

    init_platform

    $mm->init_platform

    Initialize any macros which are for platform specific use only.

    A typical one is the version number of your OS specific module (e.g. MM_Unix_VERSION or MM_VMS_VERSION).

    init_MAKE

    $mm->init_MAKE

    Initialize MAKE from either a MAKE environment variable or $Config{make}.
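
    The lookup order described above can be sketched as follows. The pick_make helper is a hypothetical stand-in, not the actual init_MAKE code, and the value 'gmake' is only a simulated override.

```perl
use strict;
use warnings;
use Config;

# Hypothetical stand-in for the lookup init_MAKE is described as doing:
# prefer a MAKE environment variable, otherwise use $Config{make}.
sub pick_make {
    return $ENV{MAKE} || $Config{make};
}

{
    local $ENV{MAKE} = 'gmake';    # simulate a user override
    print "with override: ", pick_make(), "\n";
}
```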

    Tools

    A grab bag of methods to generate specific macros and commands.

    manifypods

    Defines targets and routines to translate the pods into manpages and put them into the INST_* directories.

    POD2MAN_macro

    my $pod2man_macro = $self->POD2MAN_macro

    Returns a definition for the POD2MAN macro. This is a program which emulates the pod2man utility. You can add more switches to the command by simply appending them on the macro.

    Typical usage:

    $(POD2MAN) --section=3 --perm_rw=$(PERM_RW) podfile1 man_page1 ...

    test_via_harness

    my $command = $mm->test_via_harness($perl, $tests);

    Returns a $command line which runs the given set of $tests with Test::Harness and the given $perl.

    Used on the t/*.t files.

    test_via_script

    my $command = $mm->test_via_script($perl, $script);

    Returns a $command line which just runs a single test without Test::Harness. No checks are done on the results; they're just printed.

    Used for test.pl, since they don't always follow Test::Harness formatting.

    tool_autosplit

    Defines a simple perl call that runs autosplit. May be deprecated by pm_to_blib soon.

    arch_check

    my $arch_ok = $mm->arch_check(
        $INC{"Config.pm"},
        File::Spec->catfile($Config{archlibexp}, "Config.pm")
    );

    A sanity check that what Perl thinks the architecture is and what Config thinks the architecture is are the same. If they're not it will return false and show a diagnostic message.

    When building Perl it will always return true, as nothing is installed yet.

    The interface is a bit odd because this is the result of a quick refactoring. Don't rely on it.

    File::Spec wrappers

    ExtUtils::MM_Any is a subclass of File::Spec. The methods noted here override File::Spec.

    catfile

    File::Spec <= 0.83 has a bug where the file part of catfile is not canonicalized. This override fixes that bug.

    Misc

    Methods I can't really figure out where they should go yet.

    find_tests

    my $test = $mm->find_tests;

    Returns a string suitable for feeding to the shell to return all tests in t/*.t.

    extra_clean_files

    my @files_to_clean = $MM->extra_clean_files;

    Returns a list of OS specific files to be removed in the clean target in addition to the usual set.

    installvars

    my @installvars = $mm->installvars;

    A list of all the INSTALL* variables without the INSTALL prefix. Useful for iteration or building related variable sets.
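
    As an illustration of "building related variable sets", the sketch below derives macro names from such a list. The four names are an assumed sample only; the list returned by installvars() is longer.

```perl
use strict;
use warnings;

# Assumed sample of names in the style installvars() is described as
# returning; the real list is longer.
my @installvars = qw(PRIVLIB SITELIB BIN MAN1DIR);

# Build the related macro names by prefixing each entry.
my @install_macros = map { "INSTALL$_" }     @installvars;
my @dest_macros    = map { "DESTINSTALL$_" } @installvars;

print "$_\n" for @install_macros, @dest_macros;
```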

    libscan

    my $wanted = $self->libscan($path);

    Takes a path to a file or dir and returns an empty string if we don't want to include this file in the library. Otherwise it returns the $path unchanged.

    Mainly used to exclude version control administrative directories from installation.
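
    A filter with this contract might look like the sketch below. The wanted_path helper and the %VCS_DIR list are assumptions made for illustration; the directory names actually excluded by libscan() are not listed here.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical filter in the spirit of libscan(): the directory names
# below are an assumed sample of version control directories.
my %VCS_DIR = map { $_ => 1 } qw(RCS CVS SCCS .svn .git);

sub wanted_path {
    my ($path) = @_;
    my (undef, $dirs, $file) = File::Spec->splitpath($path);
    for my $part (File::Spec->splitdir($dirs), $file) {
        return '' if $VCS_DIR{$part};
    }
    return $path;    # unchanged when nothing matched
}

for my $path ('lib/Foo/Bar.pm', 'lib/CVS/Entries') {
    my $kept = wanted_path($path);
    print "$path => ", ($kept eq '' ? '(skipped)' : $kept), "\n";
}
```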

    platform_constants

    my $make_frag = $mm->platform_constants

    Returns a make fragment defining all the macros initialized in init_platform() rather than put them in constants().

    AUTHOR

    Michael G Schwern <schwern@pobox.com> and the denizens of makemaker@perl.org with code from ExtUtils::MM_Unix and ExtUtils::MM_Win32.

     
    ExtUtils::MM_BeOS

    NAME

    ExtUtils::MM_BeOS - methods to override UN*X behaviour in ExtUtils::MakeMaker

    SYNOPSIS

    use ExtUtils::MM_BeOS; # Done internally by ExtUtils::MakeMaker if needed

    DESCRIPTION

    See ExtUtils::MM_Unix for documentation of the methods provided there. This package overrides the implementation of these methods, not the semantics.

    • os_flavor

      BeOS is BeOS.

    • init_linker

      libperl.a equivalent to be linked to dynamic extensions.


     
    ExtUtils::MM_Cygwin

    NAME

    ExtUtils::MM_Cygwin - methods to override UN*X behaviour in ExtUtils::MakeMaker

    SYNOPSIS

    use ExtUtils::MM_Cygwin; # Done internally by ExtUtils::MakeMaker if needed

    DESCRIPTION

    See ExtUtils::MM_Unix for documentation of the methods provided there.

    • os_flavor

      We're Unix and Cygwin.

    • cflags

      if configured for dynamic loading, triggers #define EXT in EXTERN.h

    • replace_manpage_separator

      replaces strings '::' with '.' in MAN*POD man page names

    • init_linker

      points to libperl.a

    • maybe_command

      If our path begins with /cygdrive/ then we use ExtUtils::MM_Win32 to determine if it may be a command. Otherwise we use the tests from ExtUtils::MM_Unix.

    • dynamic_lib

      Uses the default to produce the *.dll files. For new archdir DLLs, however, it uses the same rebase address if the old one exists.

    • all_target

      Build man pages, too

     
    ExtUtils::MM_DOS

    NAME

    ExtUtils::MM_DOS - DOS specific subclass of ExtUtils::MM_Unix

    SYNOPSIS

    Don't use this module directly.
    Use ExtUtils::MM and let it choose.

    DESCRIPTION

    This is a subclass of ExtUtils::MM_Unix which contains functionality for DOS.

    Unless otherwise stated, it works just like ExtUtils::MM_Unix

    Overridden methods

    • os_flavor
    • replace_manpage_separator

      Generates Foo__Bar.3 style man page names

    AUTHOR

    Michael G Schwern <schwern@pobox.com> with code from ExtUtils::MM_Unix

    SEE ALSO

    ExtUtils::MM_Unix, ExtUtils::MakeMaker

     
    ExtUtils::MM_Darwin

    NAME

    ExtUtils::MM_Darwin - special behaviors for OS X

    SYNOPSIS

    For internal MakeMaker use only

    DESCRIPTION

    See ExtUtils::MM_Unix or ExtUtils::MM_Any for documentation on the methods overridden here.

    Overridden Methods

    init_dist

    Turn off Apple tar's tendency to copy resource forks as "._foo" files.

     
    ExtUtils::MM_MacOS

    NAME

    ExtUtils::MM_MacOS - once produced Makefiles for MacOS Classic

    SYNOPSIS

    # MM_MacOS no longer contains any code. This is just a stub.

    DESCRIPTION

    Once upon a time, MakeMaker could produce an approximation of a correct Makefile on MacOS Classic (MacPerl). Due to a lack of maintainers, this fell out of sync with the rest of MakeMaker and hadn't worked in years. Since there's little chance of it being repaired, MacOS Classic is fading away, and the code was icky to begin with, the code has been deleted to make maintenance easier.

    Those interested in writing modules for MacPerl should use Module::Build which works better than MakeMaker ever did.

    Anyone interested in resurrecting this file should pull the old version from the MakeMaker CVS repository and contact makemaker@perl.org, but we really encourage you to work on Module::Build instead.

     
    ExtUtils::MM_NW5

    NAME

    ExtUtils::MM_NW5 - methods to override UN*X behaviour in ExtUtils::MakeMaker

    SYNOPSIS

    use ExtUtils::MM_NW5; # Done internally by ExtUtils::MakeMaker if needed

    DESCRIPTION

    See ExtUtils::MM_Unix for documentation of the methods provided there. This package overrides the implementation of these methods, not the semantics.

    • os_flavor

      We're Netware in addition to being Windows.

    • init_platform

      Add Netware macros.

      LIBPTH, BASE_IMPORT, NLM_VERSION, MPKTOOL, TOOLPATH, BOOT_SYMBOL, NLM_SHORT_NAME, INCLUDE, PATH, MM_NW5_REVISION

    • platform_constants

      Add Netware macros initialized above to the Makefile.

    • const_cccmd
    • static_lib
    • dynamic_lib

      Defines how to produce the *.so (or equivalent) files.

     
    ExtUtils::MM_OS2

    NAME

    ExtUtils::MM_OS2 - methods to override UN*X behaviour in ExtUtils::MakeMaker

    SYNOPSIS

    use ExtUtils::MM_OS2; # Done internally by ExtUtils::MakeMaker if needed

    DESCRIPTION

    See ExtUtils::MM_Unix for documentation of the methods provided there. This package overrides the implementation of these methods, not the semantics.

    METHODS

    • init_dist

      Define TO_UNIX to convert OS2 linefeeds to Unix style.

    • init_linker
    • os_flavor

      OS/2 is OS/2

     
    ExtUtils::MM_QNX

    NAME

    ExtUtils::MM_QNX - QNX specific subclass of ExtUtils::MM_Unix

    SYNOPSIS

    Don't use this module directly.
    Use ExtUtils::MM and let it choose.

    DESCRIPTION

    This is a subclass of ExtUtils::MM_Unix which contains functionality for QNX.

    Unless otherwise stated it works just like ExtUtils::MM_Unix

    Overridden methods

    extra_clean_files

    Add .err files corresponding to each .c file.

    AUTHOR

    Michael G Schwern <schwern@pobox.com> with code from ExtUtils::MM_Unix

    SEE ALSO

    ExtUtils::MakeMaker

     
    ExtUtils::MM_UWIN

    NAME

    ExtUtils::MM_UWIN - U/WIN specific subclass of ExtUtils::MM_Unix

    SYNOPSIS

    Don't use this module directly.
    Use ExtUtils::MM and let it choose.

    DESCRIPTION

    This is a subclass of ExtUtils::MM_Unix which contains functionality for the AT&T U/WIN UNIX on Windows environment.

    Unless otherwise stated it works just like ExtUtils::MM_Unix

    Overridden methods

    • os_flavor

      In addition to being Unix, we're U/WIN.

    • replace_manpage_separator

    AUTHOR

    Michael G Schwern <schwern@pobox.com> with code from ExtUtils::MM_Unix

    SEE ALSO

    ExtUtils::MM_Win32, ExtUtils::MakeMaker

     
    ExtUtils::MM_Unix

    NAME

    ExtUtils::MM_Unix - methods used by ExtUtils::MakeMaker

    SYNOPSIS

    require ExtUtils::MM_Unix;

    DESCRIPTION

    The methods provided by this package are designed to be used in conjunction with ExtUtils::MakeMaker. When MakeMaker writes a Makefile, it creates one or more objects that inherit their methods from a package MM. MM itself doesn't provide any methods, but it ISA ExtUtils::MM_Unix class. The inheritance tree of MM lets operating-system-specific packages take responsibility for all the methods provided by MM_Unix. We are trying to reduce the number of necessary overrides by defining rather primitive operations within ExtUtils::MM_Unix.

    If you are going to write a platform specific MM package, please try to limit the necessary overrides to primitive methods, and if it is not possible to do so, let's work out how to achieve that gain.

    If you are overriding any of these methods in your Makefile.PL (in the MY class), please report that to the makemaker mailing list. We are trying to minimize the necessary method overrides and switch to data driven Makefile.PLs wherever possible. In the long run fewer methods will be overridable via the MY class.

    METHODS

    The following description of methods is still under development. Please refer to the code for sections that are not suitably documented and complain loudly to the makemaker@perl.org mailing list. Better yet, provide a patch.

    Not all of the methods below are overridable in a Makefile.PL. Overridable methods are marked as (o). All methods are overridable by a platform specific MM_*.pm file.

    Cross-platform methods are being moved into MM_Any. If you can't find something that used to be in here, look in MM_Any.

    Methods

    • os_flavor

      Simply says that we're Unix.

    • c_o (o)

      Defines the suffix rules to compile different flavors of C files to object files.

    • cflags (o)

      Does very much the same as the cflags script in the perl distribution. It doesn't return the whole compiler command line, but initializes all of its parts. The const_cccmd method then actually returns the definition of the CCCMD macro which uses these parts.

    • const_cccmd (o)

      Returns the full compiler call for C programs and stores the definition in CONST_CCCMD.

    • const_config (o)

      Defines a couple of constants in the Makefile that are imported from %Config.

    • const_loadlibs (o)

      Defines EXTRALIBS, LDLOADLIBS, BSLOADLIBS, LD_RUN_PATH. See ExtUtils::Liblist for details.

    • constants (o)
      my $make_frag = $mm->constants;

      Prints out macros for lots of constants.

    • depend (o)

      Same as macro for the depend attribute.

    • init_DEST
      $mm->init_DEST

      Defines the DESTDIR and DEST* variables paralleling the INSTALL*.

    • init_dist
      $mm->init_dist;

      Defines a lot of macros for distribution support.

      macro         description                     default

      TAR           tar command to use              tar
      TARFLAGS      flags to pass to TAR            cvf

      ZIP           zip command to use              zip
      ZIPFLAGS      flags to pass to ZIP            -r

      COMPRESS      compression command to          gzip --best
                    use for tarfiles
      SUFFIX        suffix to put on                .gz
                    compressed files

      SHAR          shar command to use             shar

      PREOP         extra commands to run before
                    making the archive
      POSTOP        extra commands to run after
                    making the archive

      TO_UNIX       a command to convert linefeeds
                    to Unix style in your archive

      CI            command to checkin your         ci -u
                    sources to version control
      RCS_LABEL     command to label your sources   rcs -Nv$(VERSION_SYM): -q
                    just after CI is run

      DIST_CP       $how argument to manicopy()     best
                    when the distdir is created

      DIST_DEFAULT  default target to use to        tardist
                    create a distribution

      DISTVNAME     name of the resulting archive   $(DISTNAME)-$(VERSION)
                    (minus suffixes)

    • dist (o)
      my $dist_macros = $mm->dist(%overrides);

      Generates a make fragment defining all the macros initialized in init_dist.

      %overrides can be used to override any of the above.

    • dist_basics (o)

      Defines the targets distclean, distcheck, skipcheck, manifest, veryclean.

    • dist_ci (o)

      Defines a check in target for RCS.

    • dist_core (o)
      my $dist_make_fragment = $MM->dist_core;

      Puts the targets necessary for 'make dist' together into one make fragment.

    • dist_target
      my $make_frag = $MM->dist_target;

      Returns the 'dist' target to make an archive for distribution. This target simply checks to make sure the Makefile is up-to-date and depends on $(DIST_DEFAULT).

    • tardist_target
      my $make_frag = $MM->tardist_target;

      Returns the 'tardist' target which is simply so 'make tardist' works. The real work is done by the dynamically named tardistfile_target() method; tardist should have that as a dependency.

    • zipdist_target
      my $make_frag = $MM->zipdist_target;

      Returns the 'zipdist' target which is simply so 'make zipdist' works. The real work is done by the dynamically named zipdistfile_target() method; zipdist should have that as a dependency.

    • tarfile_target
      my $make_frag = $MM->tarfile_target;

      The name of this target is the name of the tarball generated by tardist. This target does the actual work of turning the distdir into a tarball.

    • zipfile_target
      my $make_frag = $MM->zipfile_target;

      The name of this target is the name of the zip file generated by zipdist. This target does the actual work of turning the distdir into a zip file.

    • uutardist_target
      my $make_frag = $MM->uutardist_target;

      Converts the tarfile into a uuencoded file

    • shdist_target
      my $make_frag = $MM->shdist_target;

      Converts the distdir into a shell archive.

    • dlsyms (o)

      Used by some OSes to define DL_FUNCS and DL_VARS and write the *.exp files.

      Normally just returns an empty string.

    • dynamic_bs (o)

      Defines targets for bootstrap files.

    • dynamic_lib (o)

      Defines how to produce the *.so (or equivalent) files.

    • exescan

      Deprecated method. Use libscan instead.

    • extliblist

      Called by init_others, and calls the ext() method of ExtUtils::Liblist. See ExtUtils::Liblist for details.

    • find_perl

      Finds the executables PERL and FULLPERL

    • fixin
      $mm->fixin(@files);

      Inserts the sharpbang or equivalent magic number to a set of @files.

    • force (o)

      Writes an empty FORCE: target.

    • guess_name

      Guess the name of this package by examining the working directory's name. MakeMaker calls this only if the developer has not supplied a NAME attribute.
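
      The kind of guess described above could be sketched like this. The name_from_dir helper and its rules (strip a trailing version number, turn dashes into ::) are assumptions for illustration; the rules MakeMaker actually applies may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a distribution-style directory name into a
# package name. The exact rules MakeMaker applies may differ.
sub name_from_dir {
    my ($dir) = @_;
    (my $name = $dir) =~ s{[-_][\d.]+\z}{};   # drop a trailing version
    $name =~ s{-}{::}g;                        # Foo-Bar => Foo::Bar
    return $name;
}

print name_from_dir('Foo-Bar-1.23'), "\n";
print name_from_dir('Foo-Bar'), "\n";
```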

    • has_link_code

      Returns true if C, XS, MYEXTLIB or similar objects exist within this object that need a compiler. Does not descend into subdirectories as needs_linking() does.

    • init_dirscan

      Scans the directory structure and initializes DIR, XS, XS_FILES, C, C_FILES, O_FILES, H, H_FILES, PL_FILES, EXE_FILES.

      Called by init_main.

    • init_MANPODS

      Determines if man pages should be generated and initializes MAN1PODS and MAN3PODS as appropriate.

    • init_MAN1PODS

      Initializes MAN1PODS from the list of EXE_FILES.

    • init_MAN3PODS

      Initializes MAN3PODS from the list of PM files.

    • init_PM

      Initializes PMLIBDIRS and PM from PMLIBDIRS.

    • init_DIRFILESEP

      Using / for Unix. Called by init_main.

    • init_main

      Initializes AR, AR_STATIC_ARGS, BASEEXT, CONFIG, DISTNAME, DLBASE, EXE_EXT, FULLEXT, FULLPERL, FULLPERLRUN, FULLPERLRUNINST, INST_*, INSTALL*, INSTALLDIRS, LIB_EXT, LIBPERL_A, MAP_TARGET, NAME, OBJ_EXT, PARENT_NAME, PERL, PERL_ARCHLIB, PERL_INC, PERL_LIB, PERL_SRC, PERLRUN, PERLRUNINST, PREFIX, VERSION, VERSION_SYM, XS_VERSION.

    • init_tools

      Initializes tools to use their common (and faster) Unix commands.

    • init_linker

      Unix has no need of special linker flags.

    • init_PERL
      $mm->init_PERL;

      Called by init_main. Sets up ABSPERL, PERL, FULLPERL and all the *PERLRUN* permutations.

      PERL is allowed to be miniperl
      FULLPERL must be a complete perl

      ABSPERL is PERL converted to an absolute path

      *PERLRUN contains everything necessary to run perl, find its
           libraries, etc...

      *PERLRUNINST is *PERLRUN + everything necessary to find the
           modules being built.

    • init_platform
    • platform_constants

      Add MM_Unix_VERSION.

    • init_PERM
      $mm->init_PERM

      Called by init_main. Initializes PERM_*

    • init_xs
      $mm->init_xs

      Sets up macros having to do with XS code. Currently just INST_STATIC, INST_DYNAMIC and INST_BOOT.

    • install (o)

      Defines the install target.

    • installbin (o)

      Defines targets to make and to install EXE_FILES.

    • linkext (o)

      Defines the linkext target which in turn defines the LINKTYPE.

    • lsdir

      Takes as arguments a directory name and a regular expression. Returns all entries in the directory that match the regular expression.
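
      A stand-alone sketch of the behaviour described above follows. The list_matching helper is a hypothetical name, not the lsdir implementation, and the file names are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-alone version of the behaviour described above:
# list the entries of $dir whose names match $regex.
sub list_matching {
    my ($dir, $regex) = @_;
    opendir(my $dh, $dir) or return;
    my @entries = grep { /$regex/ } readdir $dh;
    closedir $dh;
    return sort @entries;
}

# Demonstrate on a scratch directory.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.pm b.pm c.txt)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    close $fh;
}
print join(' ', list_matching($dir, qr/\.pm\z/)), "\n";
```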

    • macro (o)

      Simple subroutine to insert the macros defined by the macro attribute into the Makefile.

    • makeaperl (o)

      Called by staticmake. Defines how to write the Makefile to produce a static new perl.

      By default the Makefile produced includes all the static extensions in the perl library. (Purified versions of library files, e.g., DynaLoader_pure_p1_c0_032.a are automatically ignored to avoid link errors.)

    • makefile (o)

      Defines how to rewrite the Makefile.

    • maybe_command

      Returns true if the argument is likely to be a command.

    • needs_linking (o)

      Does this module need linking? Looks into subdirectory objects (see also has_link_code())

    • parse_abstract

      Parses a file and returns what it thinks is the ABSTRACT.

    • parse_version
      my $version = MM->parse_version($file);

      Parse a $file and return what $VERSION is set to by the first assignment. It will return the string "undef" if it can't figure out what $VERSION is. $VERSION should be for all to see, so our $VERSION or plain $VERSION are okay, but my $VERSION is not.

      A package Foo VERSION declaration is also checked for. The first version declaration found is used, but this may change as it differs from how Perl does it.

      parse_version() will try to "use version" before checking for $VERSION so the following will work.

      $VERSION = qv(1.2.3);
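
      A minimal way to try parse_version() is sketched below. It writes a throwaway module to a temporary file; the package name Foo and the version number are made up for the demonstration.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker;             # loads the MM class
use File::Temp qw(tempfile);

# Write a tiny module whose $VERSION is visible to parse_version().
# The package name Foo and the version number are made up for the demo.
my ($fh, $file) = tempfile(SUFFIX => '.pm', UNLINK => 1);
print {$fh} "package Foo;\nour \$VERSION = '1.23';\n1;\n";
close $fh;

my $version = MM->parse_version($file);
print "parsed version: $version\n";
```
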
    • pasthru (o)

      Defines the string that is passed to recursive make calls in subdirectories.

    • perl_script

      Takes one argument, a file name, and returns the file name, if the argument is likely to be a perl script. On MM_Unix this is true for any ordinary, readable file.

    • perldepend (o)

      Defines the dependency from all *.h files that come with the perl distribution.

    • pm_to_blib

      Defines target that copies all files in the hash PM to their destination and autosplits them. See DESCRIPTION in ExtUtils::Install

    • post_constants (o)

      Returns an empty string by default. Dedicated to overrides from within Makefile.PL after all constants have been defined.

    • post_initialize (o)

      Returns an empty string by default. Used in Makefile.PLs to add some chunk of text to the Makefile after the object is initialized.

    • postamble (o)

      Returns an empty string. Can be used in Makefile.PLs to write some text to the Makefile at the end.

    • ppd

      Defines target that creates a PPD (Perl Package Description) file for a binary distribution.

    • prefixify
      $MM->prefixify($var, $prefix, $new_prefix, $default);

      Using either $MM->{uc $var} || $Config{lc $var}, it will attempt to replace its $prefix with a $new_prefix.

      Should the $prefix fail to match AND a PREFIX was given as an argument to WriteMakefile() it will set it to the $new_prefix + $default. This is for systems whose file layouts don't neatly fit into our ideas of prefixes.

      This is for heuristics which attempt to create directory structures that mirror those of the installed perl.

      For example:

      $MM->prefixify('installman1dir', '/usr', '/home/foo', 'man/man1');

      This will attempt to remove '/usr' from the front of the $MM->{INSTALLMAN1DIR} path (initializing it to $Config{installman1dir} if necessary) and replace it with '/home/foo'. If this fails it will simply use '/home/foo/man/man1'.
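
      The replace-or-fall-back rule can be sketched on plain strings as follows. The swap_prefix helper is hypothetical; the real method also consults the MM object and %Config, and the sample paths are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-alone version of the rule described above; the real
# method also consults the MM object and %Config.
sub swap_prefix {
    my ($path, $prefix, $new_prefix, $default) = @_;
    # Replace the old prefix when the path starts with it ...
    return $path if $path =~ s{^\Q$prefix\E(?=/|\z)}{$new_prefix};
    # ... otherwise fall back to the new prefix plus the default.
    return "$new_prefix/$default";
}

print swap_prefix('/usr/share/man/man1', '/usr', '/home/foo', 'man/man1'), "\n";
print swap_prefix('/opt/perl/man/man1',  '/usr', '/home/foo', 'man/man1'), "\n";
```

      The lookahead keeps '/usr' from matching a path such as '/usrlocal', which is a design choice of this sketch rather than documented behaviour.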

    • processPL (o)

      Defines targets to run *.PL files.

    • quote_paren

      Backslashes parentheses () in command line arguments. Doesn't handle recursive Makefile $(...) constructs, but handles simple ones.

    • replace_manpage_separator
      my $man_name = $MM->replace_manpage_separator($file_path);

      Takes the name of a package, which may be a nested package, in the form 'Foo/Bar.pm' and replaces the slash with :: or something else safe for a man page file name. Returns the replacement.
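
      The substitution can be illustrated with a small helper that takes the separator as a parameter. The manpage_name function is a hypothetical name for this sketch; the three separators shown are the ones this documentation mentions for Unix, DOS and Cygwin.

```perl
use strict;
use warnings;

# Hypothetical helper with the separator as a parameter, so the
# platform variants described in this documentation can be compared.
sub manpage_name {
    my ($file_path, $separator) = @_;
    (my $name = $file_path) =~ s{/+}{$separator}g;
    return $name;
}

print manpage_name('Foo/Bar.pm', '::'), "\n";   # Unix style
print manpage_name('Foo/Bar.pm', '__'), "\n";   # DOS style (Foo__Bar)
print manpage_name('Foo/Bar.pm', '.'),  "\n";   # Cygwin style
```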

    • cd
    • oneliner
    • quote_literal
    • escape_newlines
    • max_exec_len

      Using POSIX::ARG_MAX. Otherwise falling back to 4096.
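
      The fallback described above can be sketched as follows. The guess_max_exec_len helper is a hypothetical name and not the method itself.

```perl
use strict;
use warnings;

# Sketch of the fallback described above: ask POSIX for ARG_MAX and
# use 4096 when it is unavailable.
sub guess_max_exec_len {
    my $arg_max = eval { require POSIX; POSIX::ARG_MAX() };
    return $arg_max || 4096;
}

print "max command length: ", guess_max_exec_len(), "\n";
```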

    • static (o)

      Defines the static target.

    • static_lib (o)

      Defines how to produce the *.a (or equivalent) files.

    • staticmake (o)

      Calls makeaperl.

    • subdir_x (o)

      Helper subroutine for subdirs

    • subdirs (o)

      Defines targets to process subdirectories.

    • test (o)

      Defines the test targets.

    • test_via_harness (override)

      For some reason which I forget, Unix machines like to have PERL_DL_NONLAZY set for tests.

    • test_via_script (override)

      Again, the PERL_DL_NONLAZY thing.

    • tool_xsubpp (o)

      Determines typemaps, xsubpp version, prototype behaviour.

    • all_target

      Build man pages, too

    • top_targets (o)

      Defines the targets all, subdirs, config, and O_FILES

    • writedoc

      Obsolete, deprecated method. Not used since Version 5.21.

    • xs_c (o)

      Defines the suffix rules to compile XS files to C.

    • xs_cpp (o)

      Defines the suffix rules to compile XS files to C++.

    • xs_o (o)

      Defines suffix rules to go from XS to object files directly. This is only intended for broken make implementations.

    SEE ALSO

    ExtUtils::MakeMaker

     
    ExtUtils::MM_VMS

    NAME

    ExtUtils::MM_VMS - methods to override UN*X behaviour in ExtUtils::MakeMaker

    SYNOPSIS

    Do not use this directly.
    Instead, use ExtUtils::MM and it will figure out which MM_*
    class to use for you.

    DESCRIPTION

    See ExtUtils::MM_Unix for documentation of the methods provided there. This package overrides the implementation of these methods, not the semantics.

    Methods always loaded

    • wraplist

      Converts a list into a string wrapped at approximately 80 columns.
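
      The wrapping idea can be sketched as below. The wrap_words helper, the backslash continuation and the exact column arithmetic are assumptions for illustration and are not taken from the wraplist source.

```perl
use strict;
use warnings;

# Hypothetical re-implementation of the idea: join words with spaces
# and start a continuation line once roughly 80 columns are used.
sub wrap_words {
    my @words = @_;
    my ($text, $width) = ('', 0);
    for my $word (@words) {
        if ($width > 0 && $width + length($word) + 1 > 80) {
            $text .= " \\\n\t";    # assumed Makefile-style continuation
            $width = 0;
        }
        elsif ($width > 0) {
            $text .= ' ';
            $width++;
        }
        $text .= $word;
        $width += length $word;
    }
    return $text;
}

print wrap_words(map { "file$_.obj" } 1 .. 20), "\n";
```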

    Methods

    Those methods which override default MM_Unix methods are marked "(override)", while methods unique to MM_VMS are marked "(specific)". For overridden methods, documentation is limited to an explanation of why this method overrides the MM_Unix method; see the ExtUtils::MM_Unix documentation for more details.

    • guess_name (override)

      Try to determine name of extension being built. We begin with the name of the current directory. Since VMS filenames are case-insensitive, however, we look for a .pm file whose name matches that of the current directory (presumably the 'main' .pm file for this extension), and try to find a package statement from which to obtain the Mixed::Case package name.

    • find_perl (override)

      Use VMS file specification syntax and CLI commands to find and invoke Perl images.

    • _fixin_replace_shebang (override)

      Helper routine for MM->fixin(), overridden because there's no such thing as an actual shebang line that will be interpreted by the shell, so we just prepend $Config{startperl} and preserve the shebang line argument for any switches it may contain.

    • maybe_command (override)

      Follows VMS naming conventions for executable files. If the name passed in doesn't exactly match an executable file, appends .Exe (or equivalent) to check for executable image, and .Com to check for DCL procedure. If this fails, checks directories in DCL$PATH and finally Sys$System: for an executable file having the name specified, with or without the .Exe-equivalent suffix.

    • pasthru (override)

      VMS has $(MMSQUALIFIERS) which is a listing of all the original command line options. This is used in every invocation of make in the VMS Makefile so PASTHRU should not be necessary. Using PASTHRU tends to blow commands past the 256 character limit.

    • pm_to_blib (override)

      VMS wants a dot in every file so we can't have one called 'pm_to_blib', it becomes 'pm_to_blib.' and MMS/K isn't smart enough to know that when you have a target called 'pm_to_blib' it should look for 'pm_to_blib.'.

      So in VMS it's pm_to_blib.ts.

    • perl_script (override)

      If name passed in doesn't specify a readable file, appends .com or .pl and tries again, since it's customary to have file types on all files under VMS.

    • replace_manpage_separator

      Use as separator a character which is legal in a VMS-syntax file name.

    • init_DEST

      (override) Because of the difficulty concatenating VMS filepaths we must pre-expand the DEST* variables.

    • init_DIRFILESEP

      No separator between a directory path and a filename on VMS.

    • init_main (override)
    • init_tools (override)

      Provide VMS-specific forms of various utility commands.

      Sets DEV_NULL to nothing because I don't know how to do it on VMS.

      Changes EQUALIZE_TIMESTAMP to set revision date of target file to one second later than source file, since MMK interprets precisely equal revision dates for a source and target file as a sign that the target needs to be updated.

    • init_platform (override)

      Add PERL_VMS, MM_VMS_REVISION and MM_VMS_VERSION.

      MM_VMS_REVISION is for backwards compatibility before MM_VMS had a $VERSION.

    • platform_constants
    • init_VERSION (override)

      Override the *DEFINE_VERSION macros with VMS semantics. Translate the MAKEMAKER filepath to VMS style.

    • constants (override)

      Fixes up numerous file and directory macros to ensure VMS syntax regardless of input syntax. Also makes lists of files comma-separated.

    • special_targets

      Clear the default .SUFFIXES and put in our own list.

    • cflags (override)

      Bypass shell script and produce qualifiers for CC directly (but warn user if a shell script for this extension exists). Fold multiple /Defines into one, since some C compilers pay attention to only one instance of this qualifier on the command line.

    • const_cccmd (override)

      Adds directives to point C preprocessor to the right place when handling #include <sys/foo.h> directives. Also constructs CC command line a bit differently than MM_Unix method.

    • tools_other (override)

      Throw in some dubious extra macros for Makefile args.

      Also keep around the old $(SAY) macro in case somebody's using it.

    • init_dist (override)

      VMSish defaults for some values.

      macro         description                     default
      ZIPFLAGS      flags to pass to ZIP            -Vu
      COMPRESS      compression command to          gzip
                    use for tarfiles
      SUFFIX        suffix to put on                -gz
                    compressed files
      SHAR          shar command to use             vms_share
      DIST_DEFAULT  default target to use to        tardist
                    create a distribution
      DISTVNAME     Use VERSION_SYM instead of      $(DISTNAME)-$(VERSION_SYM)
                    VERSION for the name

    • c_o (override)

      Use VMS syntax on command line. In particular, $(DEFINE) and $(PERL_INC) have been pulled into $(CCCMD). Also use MM[SK] macros.

    • xs_c (override)

      Use MM[SK] macros.

    • xs_o (override)

      Use MM[SK] macros, and VMS command line for C compiler.

    • dlsyms (override)

      Create VMS linker options files specifying universal symbols for this extension's shareable image, and listing other shareable images or libraries to which it should be linked.

    • dynamic_lib (override)

      Use VMS Link command.

    • static_lib (override)

      Use VMS commands to manipulate object library.

    • extra_clean_files

      Clean up some OS specific files. Plus the temp file used to shorten a lot of commands. And the name mangler database.

    • zipfile_target
    • tarfile_target
    • shdist_target

      Syntax for invoking shar, tar and zip differs from that for Unix.

    • install (override)

      Work around DCL's 255 character limit several times, and use VMS-style command line quoting in a few cases.

    • perldepend (override)

      Use VMS-style syntax for files; it's cheaper to just do it directly here than to have the MM_Unix method call catfile repeatedly. Also, if we have to rebuild Config.pm, use MM[SK] to do it.

    • makeaperl (override)

      Undertake to build a new set of Perl images using VMS commands. Since VMS does dynamic loading, it's not necessary to statically link each extension into the Perl image, so this isn't the normal build path. Consequently, it hasn't really been tested, and may well be incomplete.

    • maketext_filter (override)

      Ensure that colons marking targets are preceded by space, in order to distinguish the target delimiter from a colon appearing as part of a filespec.

    • prefixify (override)

      Prefixifying on VMS is simple. Each should simply be:

          perl_root:[some.dir]

      which can just be converted to:

          volume:[your.prefix.some.dir]

      otherwise you get the default layout.

      In effect, your search prefix is ignored and $Config{vms_prefix} is used instead.

    • cd
    • oneliner
    • echo

      perl trips up on "<foo>" thinking it's an input redirect. So we use the native Write command instead. Besides, it's faster.

    • quote_literal
    • escape_dollarsigns

      Quote, don't escape.

    • escape_all_dollarsigns

      Quote, don't escape.

    • escape_newlines
    • max_exec_len

      256 characters.

    • init_linker
    • catdir (override)
    • catfile (override)

      Eliminate the macros in the output to the MMS/MMK file.

      (File::Spec::VMS used to do this for us, but it's being removed)

    • eliminate_macros

      Expands MM[KS]/Make macros in a text string, using the contents of identically named elements of %$self , and returns the result as a file specification in Unix syntax.

      NOTE: This is the canonical version of the method. The version in File::Spec::VMS is deprecated.

    • fixpath

          my $path = $mm->fixpath($path);
          my $path = $mm->fixpath($path, $is_dir);

      Catchall routine to clean up problem MM[SK]/Make macros. Expands macros in any directory specification, in order to avoid juxtaposing two VMS-syntax directories when MM[SK] is run. Also expands expressions which are all macro, so that we can tell how long the expansion is, and avoid overrunning DCL's command buffer when MM[KS] is running.

      fixpath() checks to see whether the result matches the name of a directory in the current default directory and returns a directory or file specification accordingly. $is_dir can be set to true to force fixpath() to consider the path to be a directory or false to force it to be a file.

      NOTE: This is the canonical version of the method. The version in File::Spec::VMS is deprecated.

    • os_flavor

      VMS is VMS.

    AUTHOR

    Original author Charles Bailey bailey@newman.upenn.edu

    Maintained by Michael G Schwern schwern@pobox.com

    See ExtUtils::MakeMaker for patching and contact information.

     
    ExtUtils::MM_VOS

    NAME

    ExtUtils::MM_VOS - VOS specific subclass of ExtUtils::MM_Unix

    SYNOPSIS

        Don't use this module directly.
        Use ExtUtils::MM and let it choose.

    DESCRIPTION

    This is a subclass of ExtUtils::MM_Unix which contains functionality for VOS.

    Unless otherwise stated, it works just like ExtUtils::MM_Unix.

    Overridden methods

    extra_clean_files

    Clean up VOS core files.

    AUTHOR

    Michael G Schwern <schwern@pobox.com> with code from ExtUtils::MM_Unix

    SEE ALSO

    ExtUtils::MakeMaker

     
    ExtUtils::MM_Win32

    NAME

    ExtUtils::MM_Win32 - methods to override UN*X behaviour in ExtUtils::MakeMaker

    SYNOPSIS

        use ExtUtils::MM_Win32; # Done internally by ExtUtils::MakeMaker if needed

    DESCRIPTION

    See ExtUtils::MM_Unix for documentation of the methods provided there. This package overrides the implementation of these methods, not the semantics.

    Overridden methods

    • dlsyms
    • replace_manpage_separator

      Replaces the path separator with a dot (.).

    • maybe_command

      Since Windows has nothing as simple as an executable bit, we check the file extension.

      The PATHEXT env variable will be used to get a list of extensions that might indicate a command, otherwise .com, .exe, .bat and .cmd will be used by default.

    • init_DIRFILESEP

      Using \ for Windows.

    • init_tools

      Override some of the slower, portable commands with Windows specific ones.

    • init_others

      Override the default link and compile tools.

      LDLOADLIBS's default is changed to $Config{libs}.

      Adjustments are made for Borland's quirks needing -L to come first.

    • init_platform

      Add MM_Win32_VERSION.

    • platform_constants
    • constants

      Add MAXLINELENGTH for dmake before all the constants are output.

    • special_targets

      Add .USESHELL target for dmake.

    • static_lib

      Changes how to run the linker.

      The rest is duplicate code from MM_Unix. Should move the linker code to its own method.

    • dynamic_lib

      Complicated stuff for Win32 that I don't understand. :(

    • extra_clean_files

      Clean out some extra dll.{base,exp} files which might be generated by gcc. Otherwise, take out all *.pdb files.

    • init_linker
    • perl_script

      Checks for the perl program under several common perl extensions.

    • xs_o

      This target is stubbed out. Not sure why.

    • pasthru

      All we send is -nologo to nmake to prevent it from printing its damned banner.

    • arch_check (override)

      Normalize all arguments for consistency of comparison.

    • oneliner

      These are based on what command.com does on Win98. They may be wrong for other Windows shells, I don't know.

    • cd

      dmake can handle Unix style cd'ing but nmake (at least 1.5) cannot. It wants:

          cd dir1\dir2
          command
          another_command
          cd ..\..

    • max_exec_len

      nmake 1.50 limits command length to 2048 characters.

    • os_flavor

      Windows is Win32.

    • cflags

      Defines the PERLDLL symbol if we are configured for static building since all code destined for the perl5xx.dll must be compiled with the PERLDLL symbol defined.

     
    ExtUtils::MM_Win95

    NAME

    ExtUtils::MM_Win95 - method to customize MakeMaker for Win9X

    SYNOPSIS

        You should not be using this module directly.

    DESCRIPTION

    This is a subclass of ExtUtils::MM_Win32 containing changes necessary to get MakeMaker playing nice with command.com and other Win9Xisms.

    Overridden methods

    Most of these make up for limitations in the Win9x/nmake command shell, mostly its lack of &&.

    • xs_c

      The && problem.

    • xs_cpp

      The && problem.

    • xs_o

      The && problem.

    • max_exec_len

      Win98 chokes on things like Encode if we set the max length to nmake's max of 2K. So we go for a more conservative value of 1K.

    • os_flavor

      Win95, Win98 and WinME are collectively Win9x and Win32.

    AUTHOR

    Code originally inside MM_Win32. Original author unknown.

    Currently maintained by Michael G Schwern schwern@pobox.com .

    Send patches and ideas to makemaker@perl.org .

    See http://www.makemaker.org.

     
    ExtUtils::MY

    NAME

    ExtUtils::MY - ExtUtils::MakeMaker subclass for customization

    SYNOPSIS

        # in your Makefile.PL
        sub MY::whatever {
            ...
        }

    DESCRIPTION

    FOR INTERNAL USE ONLY

    ExtUtils::MY is a subclass of ExtUtils::MM. It's provided in your Makefile.PL for you to add and override MakeMaker functionality.

    It also provides a convenient alias via the MY class.

    ExtUtils::MY might turn out to be a temporary solution, but MY won't go away.
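
As a rough illustration of overriding through the MY class, the sketch below replaces MakeMaker's postamble section, whose text is appended to the generated Makefile. The "hello" target is invented for the example. In a real Makefile.PL the sub would sit next to the WriteMakefile() call; here it is called directly so the fragment can be inspected on its own.

```perl
use strict;
use warnings;

# Override MakeMaker's postamble section; the returned text is appended
# to the generated Makefile. The "hello" target is a made-up example.
sub MY::postamble {
    return "hello:\n"
         . "\t\$(NOECHO) \$(ECHO) hello from a custom target\n";
}

# Show the Makefile fragment this override contributes.
print MY::postamble();
```

Recipe lines in a Makefile must start with a tab, which is why the second line of the returned string begins with "\t".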

     
    ExtUtils::MakeMaker

    NAME

    ExtUtils::MakeMaker - Create a module Makefile

    SYNOPSIS

        use ExtUtils::MakeMaker;
        WriteMakefile(
            NAME         => "Foo::Bar",
            VERSION_FROM => "lib/Foo/Bar.pm",
        );

    DESCRIPTION

    This utility is designed to write a Makefile for an extension module from a Makefile.PL. It is based on the Makefile.SH model provided by Andy Dougherty and the perl5-porters.

    It splits the task of generating the Makefile into several subroutines that can be individually overridden. Each subroutine returns the text it wishes to have written to the Makefile.

    As there are various Make programs with incompatible syntax, which use operating system shells, again with incompatible syntax, it is important for users of this module to know which flavour of Make a Makefile has been written for so they'll use the correct one and won't have to face the possibly bewildering errors resulting from using the wrong one.

    On POSIX systems, that program will likely be GNU Make; on Microsoft Windows, it will be either Microsoft NMake or DMake. Note that this module does not support generating Makefiles for GNU Make on Windows. See the section on the MAKE parameter for details.

    MakeMaker is object oriented. Each directory below the current directory that contains a Makefile.PL is treated as a separate object. This makes it possible to write an unlimited number of Makefiles with a single invocation of WriteMakefile().

    How To Write A Makefile.PL

    See ExtUtils::MakeMaker::Tutorial.

    The long answer is the rest of the manpage :-)

    Default Makefile Behaviour

    The generated Makefile enables the user of the extension to invoke

        perl Makefile.PL    # optionally "perl Makefile.PL verbose"
        make
        make test           # optionally set TEST_VERBOSE=1
        make install        # See below

    The Makefile to be produced may be altered by adding arguments of the form KEY=VALUE . E.g.

        perl Makefile.PL INSTALL_BASE=~

    Other interesting targets in the generated Makefile are

        make config     # to check if the Makefile is up-to-date
        make clean      # delete local temp files (Makefile gets renamed)
        make realclean  # delete derived files (including ./blib)
        make ci         # check in all the files in the MANIFEST file
        make dist       # see below the Distribution Support section

    make test

    MakeMaker checks for the existence of a file named test.pl in the current directory, and if it exists it executes the script with the proper set of perl -I options.

    MakeMaker also checks for any files matching glob("t/*.t"). It will execute all matching files in alphabetical order via the Test::Harness module with the -I switches set correctly.

    If you'd like to see the raw output of your tests, set the TEST_VERBOSE variable to true.

        make test TEST_VERBOSE=1
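
To make the discovery rule concrete, here is a minimal sketch of how the t/*.t scripts could be listed in the order described above. The helper name find_test_files is hypothetical and is not part of MakeMaker; the generated Makefile does this work itself before handing the files to Test::Harness.

```perl
use strict;
use warnings;

# Hypothetical helper: list the test scripts under $dir/t in alphabetical
# order, matching the glob("t/*.t") rule described in the text.
# Note: glob() splits its pattern on whitespace, so $dir must not contain spaces.
sub find_test_files {
    my ($dir) = @_;
    my @files = glob("$dir/t/*.t");
    return sort @files;
}

# Print the test scripts of the distribution in the current directory.
print "$_\n" for find_test_files('.');
```

Because the order is purely alphabetical, distributions commonly use numeric prefixes such as 00-load.t and 10-basic.t to control the sequence.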

    make testdb

    A useful variation of the above is the target testdb . It runs the test under the Perl debugger (see perldebug). If the file test.pl exists in the current directory, it is used for the test.

    If you want to debug some other testfile, set the TEST_FILE variable thusly:

        make testdb TEST_FILE=t/mytest.t

    By default the debugger is called using -d option to perl. If you want to specify some other option, set the TESTDB_SW variable:

        make testdb TESTDB_SW=-Dx

    make install

    make alone puts all relevant files into directories that are named by the macros INST_LIB, INST_ARCHLIB, INST_SCRIPT, INST_MAN1DIR and INST_MAN3DIR. All these default to something below ./blib if you are not building below the perl source directory. If you are building below the perl source, INST_LIB and INST_ARCHLIB default to ../../lib, and INST_SCRIPT is not defined.

    The install target of the generated Makefile copies the files found below each of the INST_* directories to their INSTALL* counterparts. Which counterparts are chosen depends on the setting of INSTALLDIRS according to the following table:

                                      INSTALLDIRS set to
                        perl            site                vendor

                        PERLPREFIX      SITEPREFIX          VENDORPREFIX
        INST_ARCHLIB    INSTALLARCHLIB  INSTALLSITEARCH     INSTALLVENDORARCH
        INST_LIB        INSTALLPRIVLIB  INSTALLSITELIB      INSTALLVENDORLIB
        INST_BIN        INSTALLBIN      INSTALLSITEBIN      INSTALLVENDORBIN
        INST_SCRIPT     INSTALLSCRIPT   INSTALLSITESCRIPT   INSTALLVENDORSCRIPT
        INST_MAN1DIR    INSTALLMAN1DIR  INSTALLSITEMAN1DIR  INSTALLVENDORMAN1DIR
        INST_MAN3DIR    INSTALLMAN3DIR  INSTALLSITEMAN3DIR  INSTALLVENDORMAN3DIR

    The INSTALL... macros in turn default to their %Config ($Config{installprivlib}, $Config{installarchlib}, etc.) counterparts.

    You can check the values of these variables on your system with

        perl '-V:install.*'

    And to check the sequence in which the library directories are searched by perl, run

        perl -le 'print join $/, @INC'

    Sometimes older versions of the module you're installing live in other directories in @INC. Because Perl loads the first version of a module it finds, not the newest, you might accidentally get one of these older versions even after installing a brand new version. To delete all other versions of the module you're installing (not simply older ones) set the UNINST variable.

        make install UNINST=1
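
Before reaching for UNINST, it can help to see whether a module is actually shadowed. The following is a small sketch, assuming a hypothetical helper named module_copies, that walks a list of directories (normally @INC) and reports every copy of a module in search order. The first entry is the one perl would load.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return every file under the given directories that
# provides the named module, in search order. Code references that may
# appear in @INC are skipped.
sub module_copies {
    my ($module, @dirs) = @_;
    (my $rel = "$module.pm") =~ s{::}{/}g;
    return grep { -f }
           map  { File::Spec->catfile($_, split m{/}, $rel) }
           grep { !ref } @dirs;
}

my @found = module_copies('File::Spec', @INC);
print "perl would load: $found[0]\n" if @found;
print "shadowed copy:   $_\n" for @found[1 .. $#found];
```

If more than one path is printed, the later ones are the copies that make install UNINST=1 is meant to clear out.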

    INSTALL_BASE

    INSTALL_BASE can be passed into Makefile.PL to change where your module will be installed. INSTALL_BASE is more like what everyone else calls "prefix" than PREFIX is.

    To have everything installed in your home directory, do the following.

        # Unix users, INSTALL_BASE=~ works fine
        perl Makefile.PL INSTALL_BASE=/path/to/your/home/dir

    Like PREFIX, it sets several INSTALL* attributes at once. Unlike PREFIX it is easy to predict where the module will end up. The installation pattern looks like this:

        INSTALLARCHLIB     INSTALL_BASE/lib/perl5/$Config{archname}
        INSTALLPRIVLIB     INSTALL_BASE/lib/perl5
        INSTALLBIN         INSTALL_BASE/bin
        INSTALLSCRIPT      INSTALL_BASE/bin
        INSTALLMAN1DIR     INSTALL_BASE/man/man1
        INSTALLMAN3DIR     INSTALL_BASE/man/man3

    INSTALL_BASE in MakeMaker and --install_base in Module::Build (as of 0.28) install to the same location. If you want MakeMaker and Module::Build to install to the same location simply set INSTALL_BASE and --install_base to the same location.

    INSTALL_BASE was added in 6.31.
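
The INSTALL_BASE layout can be written out as a short sketch. The helper install_base_layout is hypothetical and only mirrors the table above; it is not MakeMaker's own code, and /home/me is a placeholder path.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper mirroring the INSTALL_BASE table; returns a list of
# attribute => directory pairs for the given base directory.
sub install_base_layout {
    my ($base) = @_;
    return (
        INSTALLARCHLIB => "$base/lib/perl5/$Config{archname}",
        INSTALLPRIVLIB => "$base/lib/perl5",
        INSTALLBIN     => "$base/bin",
        INSTALLSCRIPT  => "$base/bin",
        INSTALLMAN1DIR => "$base/man/man1",
        INSTALLMAN3DIR => "$base/man/man3",
    );
}

# Print where each kind of file would land under a placeholder base.
my %layout = install_base_layout('/home/me');
printf "%-15s %s\n", $_, $layout{$_} for sort keys %layout;
```

Modules installed this way are not on the default search path, so a script that needs them would typically add INSTALL_BASE/lib/perl5 to @INC, for example through the PERL5LIB environment variable or use lib.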

    PREFIX and LIB attribute

    PREFIX and LIB can be used to set several INSTALL* attributes in one go. Here's an example for installing into your home directory.

        # Unix users, PREFIX=~ works fine
        perl Makefile.PL PREFIX=/path/to/your/home/dir

    This will install all files in the module under your home directory, with man pages and libraries going into an appropriate place (usually ~/man and ~/lib). How the exact location is determined is complicated and depends on how your Perl was configured. INSTALL_BASE works more like what other build systems call "prefix" than PREFIX and we recommend you use that instead.

    Another way to specify many INSTALL directories with a single parameter is LIB.

        perl Makefile.PL LIB=~/lib

    This will install the module's architecture-independent files into ~/lib, the architecture-dependent files into ~/lib/$archname.

    Note that in both cases the tilde expansion is done by MakeMaker, not by perl or by make.

    Conflicts between parameters LIB, PREFIX and the various INSTALL* arguments are resolved so that:

    • setting LIB overrides any setting of INSTALLPRIVLIB, INSTALLARCHLIB, INSTALLSITELIB, INSTALLSITEARCH (and they are not affected by PREFIX);

    • without LIB, setting PREFIX replaces the initial $Config{prefix} part of those INSTALL* arguments, even if the latter are explicitly set (but are set to still start with $Config{prefix} ).
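
The second rule amounts to swapping a leading path prefix. The sketch below shows only that core idea, using a hypothetical helper named replace_prefix; MakeMaker's actual prefix handling has more cases (such as falling back to a default location) that are not modelled here.

```perl
use strict;
use warnings;

# Hypothetical helper: swap a leading $old prefix for $new, but only when
# $old matches whole leading path components. Paths that do not start
# with $old are returned unchanged.
sub replace_prefix {
    my ($path, $old, $new) = @_;
    $path =~ s{\A\Q$old\E(?=/|\z)}{$new};
    return $path;
}

# A path below the old prefix is moved; an unrelated path is left alone.
print replace_prefix('/usr/lib/perl5/site_perl', '/usr', '/home/me'), "\n";
print replace_prefix('/opt/perl/lib', '/usr', '/home/me'), "\n";
```

The lookahead keeps a prefix such as /usr from matching /usrlocal, which is the kind of edge case that makes the real behaviour hard to predict and is one reason the text recommends INSTALL_BASE.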

    If the user has superuser privileges, and is not working on AFS or relatives, then the defaults for INSTALLPRIVLIB, INSTALLARCHLIB, INSTALLSCRIPT, etc. will be appropriate, and this incantation will be the best:

        perl Makefile.PL
        make
        make test
        make install

    make install by default writes some documentation of what has been done into the file $(INSTALLARCHLIB)/perllocal.pod. This feature can be bypassed by calling make pure_install.

    AFS users

    AFS users will have to specify the installation directories, as these have most probably changed since perl itself was installed. They will have to do this by calling

        perl Makefile.PL INSTALLSITELIB=/afs/here/today \
            INSTALLSCRIPT=/afs/there/now INSTALLMAN3DIR=/afs/for/manpages
        make

    Be careful to repeat this procedure every time you recompile an extension, unless you are sure the AFS installation directories are still valid.

    Static Linking of a new Perl Binary

    An extension that is built with the above steps is ready to use on systems supporting dynamic loading. On systems that do not support dynamic loading, any newly created extension has to be linked together with the available resources. MakeMaker supports the linking process by creating appropriate targets in the Makefile whenever an extension is built. You can invoke the corresponding section of the makefile with

        make perl

    That produces a new perl binary in the current directory with all extensions linked in that can be found in INST_ARCHLIB, SITELIBEXP, and PERL_ARCHLIB. To do that, MakeMaker writes a new Makefile; on UNIX, this is called Makefile.aperl (the name may be system dependent). If you want to force the creation of a new perl, it is recommended that you delete this Makefile.aperl, so that the directories are searched for linkable libraries again.

    The binary can be installed into the directory where perl normally resides on your machine with

        make inst_perl

    To produce a perl binary with a different name than perl , either say

        perl Makefile.PL MAP_TARGET=myperl
        make myperl
        make inst_perl

    or say

        perl Makefile.PL
        make myperl MAP_TARGET=myperl
        make inst_perl MAP_TARGET=myperl

    In any case you will be prompted with the correct invocation of the inst_perl target that installs the new binary into INSTALLBIN.

    make inst_perl by default writes some documentation of what has been done into the file $(INSTALLARCHLIB)/perllocal.pod. This can be bypassed by calling make pure_inst_perl.

    Warning: the inst_perl: target will most probably overwrite your existing perl binary. Use with care!

    Sometimes you might want to build a statically linked perl although your system supports dynamic loading. In this case you may explicitly set the linktype with the invocation of the Makefile.PL or make:

        perl Makefile.PL LINKTYPE=static    # recommended

    or

        make LINKTYPE=static                # works on most systems

    Determination of Perl Library and Installation Locations

    MakeMaker needs to know, or to guess, where certain things are located. Especially INST_LIB and INST_ARCHLIB (where to put the files during the make(1) run), PERL_LIB and PERL_ARCHLIB (where to read existing modules from), and PERL_INC (header files and libperl*.*).

    Extensions may be built either using the contents of the perl source directory tree or from the installed perl library. The recommended way is to build extensions after you have run 'make install' on perl itself. You can do that in any directory on your hard disk that is not below the perl source tree. The support for extensions below the ext directory of the perl distribution is only good for the standard extensions that come with perl.

    If an extension is being built below the ext/ directory of the perl source then MakeMaker will set PERL_SRC automatically (e.g., ../..). If PERL_SRC is defined and the extension is recognized as a standard extension, then other variables default to the following:

        PERL_INC     = PERL_SRC
        PERL_LIB     = PERL_SRC/lib
        PERL_ARCHLIB = PERL_SRC/lib
        INST_LIB     = PERL_LIB
        INST_ARCHLIB = PERL_ARCHLIB

    If an extension is being built away from the perl source then MakeMaker will leave PERL_SRC undefined and default to using the installed copy of the perl library. The other variables default to the following:

        PERL_INC     = $archlibexp/CORE
        PERL_LIB     = $privlibexp
        PERL_ARCHLIB = $archlibexp
        INST_LIB     = ./blib/lib
        INST_ARCHLIB = ./blib/arch

    If perl has not yet been installed then PERL_SRC can be defined on the command line as shown in the previous section.

    Which architecture dependent directory?

    If you don't want to keep the defaults for the INSTALL* macros, MakeMaker helps you to minimize the typing needed: the usual relationship between INSTALLPRIVLIB and INSTALLARCHLIB is determined by Configure at perl compilation time. MakeMaker supports the user who sets INSTALLPRIVLIB. If INSTALLPRIVLIB is set, but INSTALLARCHLIB not, then MakeMaker defaults the latter to be the same subdirectory of INSTALLPRIVLIB as Configure decided for the counterparts in %Config, otherwise it defaults to INSTALLPRIVLIB. The same relationship holds for INSTALLSITELIB and INSTALLSITEARCH.

    MakeMaker gives you much more freedom than needed to configure internal variables and get different results. It is worth mentioning that make(1) also lets you configure most of the variables that are used in the Makefile. But in the majority of situations this will not be necessary, and should only be done if the author of a package recommends it (or you know what you're doing).

    Using Attributes and Parameters

    The following attributes may be specified as arguments to WriteMakefile() or as NAME=VALUE pairs on the command line.

    • ABSTRACT

      One line description of the module. Will be included in PPD file.

    • ABSTRACT_FROM

      Name of the file that contains the package description. MakeMaker looks for a line in the POD matching /^($package\s-\s)(.*)/. This is typically the first line in the "=head1 NAME" section. $2 becomes the abstract.

    • AUTHOR

      Array of strings containing name (and email address) of package author(s). Is used in CPAN Meta files (META.yml or META.json) and PPD (Perl Package Description) files for PPM (Perl Package Manager).

    • BINARY_LOCATION

      Used when creating PPD files for binary packages. It can be set to a full or relative path or URL to the binary archive for a particular architecture. For example:

          perl Makefile.PL BINARY_LOCATION=x86/Agent.tar.gz

      builds a PPD package that references a binary of the Agent package, located in the x86 directory relative to the PPD itself.

    • BUILD_REQUIRES

      A hash of modules that are needed to build your module but not run it.

      This will go into the build_requires field of your CPAN Meta file (META.yml or META.json).

      The format is the same as PREREQ_PM.

    • C

      Ref to array of *.c file names. Initialised from a directory scan and the values portion of the XS attribute hash. This is not currently used by MakeMaker but may be handy in Makefile.PLs.

    • CCFLAGS

      String that will be included in the compiler call command line between the arguments INC and OPTIMIZE.

    • CONFIG

      Arrayref. E.g. [qw(archname manext)] defines ARCHNAME & MANEXT from config.sh. MakeMaker will add to CONFIG the following values anyway: ar cc cccdlflags ccdlflags dlext dlsrc ld lddlflags ldflags libc lib_ext obj_ext ranlib sitelibexp sitearchexp so

    • CONFIGURE

      CODE reference. The subroutine should return a hash reference. The hash may contain further attributes, e.g. {LIBS => ...}, that have to be determined by some evaluation method.

    • CONFIGURE_REQUIRES

      A hash of modules that are required to run Makefile.PL itself, but not to run your distribution.

      This will go into the configure_requires field of your CPAN Meta file (META.yml or META.json).

      Defaults to { "ExtUtils::MakeMaker" => 0 }.

      The format is the same as PREREQ_PM.

    • DEFINE

      A string of preprocessor definitions, something like "-DHAVE_UNISTD_H".

    • DESTDIR

      This is the root directory into which the code will be installed. It prepends itself to the normal prefix. For example, if your code would normally go into /usr/local/lib/perl you could set DESTDIR=~/tmp/ and installation would go into ~/tmp/usr/local/lib/perl.

      This is primarily of use for people who repackage Perl modules.

      NOTE: Due to the nature of make, it is important that you put the trailing slash on your DESTDIR. ~/tmp/ not ~/tmp.

    • DIR

      Ref to array of subdirectories containing Makefile.PLs e.g. ['sdbm'] in ext/SDBM_File

    • DISTNAME

      A safe filename for the package.

      Defaults to NAME below but with :: replaced with -.

      For example, Foo::Bar becomes Foo-Bar.

    • DISTVNAME

      Your name for distributing the package with the version number included. This is used by 'make dist' to name the resulting archive file.

      Defaults to DISTNAME-VERSION.

      For example, version 1.04 of Foo::Bar becomes Foo-Bar-1.04.

      On some OS's where . has special meaning VERSION_SYM may be used in place of VERSION.

    • DL_FUNCS

      Hashref of symbol names for routines to be made available as universal symbols. Each key/value pair consists of the package name and an array of routine names in that package. Used only under AIX, OS/2, VMS and Win32 at present. The routine names supplied will be expanded in the same way as XSUB names are expanded by the XS() macro. Defaults to

          {"$(NAME)" => ["boot_$(NAME)" ] }

      e.g.

          {"RPC" => [qw( boot_rpcb rpcb_gettime getnetconfigent )],
           "NetconfigPtr" => [ 'DESTROY'] }

      Please see the ExtUtils::Mksymlists documentation for more information about the DL_FUNCS, DL_VARS and FUNCLIST attributes.

    • DL_VARS

      Array of symbol names for variables to be made available as universal symbols. Used only under AIX, OS/2, VMS and Win32 at present. Defaults to []. (e.g. [ qw(Foo_version Foo_numstreams Foo_tree ) ])

    • EXCLUDE_EXT

      Array of extension names to exclude when doing a static build. This is ignored if INCLUDE_EXT is present. Consult INCLUDE_EXT for more details. (e.g. [ qw( Socket POSIX ) ] )

      This attribute may be most useful when specified as a string on the command line: perl Makefile.PL EXCLUDE_EXT='Socket Safe'

    • EXE_FILES

      Ref to array of executable files. The files will be copied to the INST_SCRIPT directory. Make realclean will delete them from there again.

      If your executables start with something like #!perl or #!/usr/bin/perl MakeMaker will change this to the path of the perl 'Makefile.PL' was invoked with so the programs will be sure to run properly even if perl is not in /usr/bin/perl.

    • FIRST_MAKEFILE

      The name of the Makefile to be produced. This is used for the second Makefile that will be produced for the MAP_TARGET.

      Defaults to 'Makefile' or 'Descrip.MMS' on VMS.

      (Note: we couldn't use MAKEFILE because dmake uses this for something else).

    • FULLPERL

      Perl binary able to run this extension, load XS modules, etc...

    • FULLPERLRUN

      Like PERLRUN, except it uses FULLPERL.

    • FULLPERLRUNINST

      Like PERLRUNINST, except it uses FULLPERL.

    • FUNCLIST

      This provides an alternate means to specify function names to be exported from the extension. Its value is a reference to an array of function names to be exported by the extension. These names are passed through unaltered to the linker options file.

    • H

      Ref to array of *.h file names. Similar to C.

    • IMPORTS

      This attribute is used to specify names to be imported into the extension. Takes a hash ref.

      It is only used on OS/2 and Win32.

    • INC

      Include file directories, e.g. "-I/usr/5include -I/path/to/inc".

    • INCLUDE_EXT

      Array of extension names to be included when doing a static build. MakeMaker will normally build with all of the installed extensions when doing a static build, and that is usually the desired behavior. If INCLUDE_EXT is present then MakeMaker will build only with those extensions which are explicitly mentioned. (e.g. [ qw( Socket POSIX ) ])

      It is not necessary to mention DynaLoader or the current extension when filling in INCLUDE_EXT. If the INCLUDE_EXT is mentioned but is empty then only DynaLoader and the current extension will be included in the build.

      This attribute may be most useful when specified as a string on the command line: perl Makefile.PL INCLUDE_EXT='POSIX Socket Devel::Peek'

    • INSTALLARCHLIB

      Used by 'make install', which copies files from INST_ARCHLIB to this directory if INSTALLDIRS is set to perl.

    • INSTALLBIN

      Directory to install binary files (e.g. tkperl) into if INSTALLDIRS=perl.

    • INSTALLDIRS

      Determines which of the sets of installation directories to choose: perl, site or vendor. Defaults to site.

    • INSTALLMAN1DIR
    • INSTALLMAN3DIR

      These directories get the man pages at 'make install' time if INSTALLDIRS=perl. Defaults to $Config{installman*dir}.

      If set to 'none', no man pages will be installed.

    • INSTALLPRIVLIB

      Used by 'make install', which copies files from INST_LIB to this directory if INSTALLDIRS is set to perl.

      Defaults to $Config{installprivlib}.

    • INSTALLSCRIPT

      Used by 'make install' which copies files from INST_SCRIPT to this directory if INSTALLDIRS=perl.

    • INSTALLSITEARCH

      Used by 'make install', which copies files from INST_ARCHLIB to this directory if INSTALLDIRS is set to site (default).

    • INSTALLSITEBIN

      Used by 'make install', which copies files from INST_BIN to this directory if INSTALLDIRS is set to site (default).

    • INSTALLSITELIB

      Used by 'make install', which copies files from INST_LIB to this directory if INSTALLDIRS is set to site (default).

    • INSTALLSITEMAN1DIR
    • INSTALLSITEMAN3DIR

      These directories get the man pages at 'make install' time if INSTALLDIRS=site (default). Defaults to $(SITEPREFIX)/man/man$(MAN*EXT).

      If set to 'none', no man pages will be installed.

    • INSTALLSITESCRIPT

      Used by 'make install' which copies files from INST_SCRIPT to this directory if INSTALLDIRS is set to site (default).

    • INSTALLVENDORARCH

      Used by 'make install', which copies files from INST_ARCHLIB to this directory if INSTALLDIRS is set to vendor.

    • INSTALLVENDORBIN

      Used by 'make install', which copies files from INST_BIN to this directory if INSTALLDIRS is set to vendor.

    • INSTALLVENDORLIB

      Used by 'make install', which copies files from INST_LIB to this directory if INSTALLDIRS is set to vendor.

    • INSTALLVENDORMAN1DIR
    • INSTALLVENDORMAN3DIR

      These directories get the man pages at 'make install' time if INSTALLDIRS=vendor. Defaults to $(VENDORPREFIX)/man/man$(MAN*EXT).

      If set to 'none', no man pages will be installed.

    • INSTALLVENDORSCRIPT

      Used by 'make install' which copies files from INST_SCRIPT to this directory if INSTALLDIRS is set to vendor.

    • INST_ARCHLIB

      Same as INST_LIB for architecture dependent files.

    • INST_BIN

      Directory to put real binary files during 'make'. These will be copied to INSTALLBIN during 'make install'

    • INST_LIB

      Directory where we put library files of this extension while building it.

    • INST_MAN1DIR

      Directory to hold the man pages at 'make' time

    • INST_MAN3DIR

      Directory to hold the man pages at 'make' time

    • INST_SCRIPT

      Directory where executable files should be installed during 'make'. Defaults to "./blib/script", just to have a dummy location during testing. make install will copy the files in INST_SCRIPT to INSTALLSCRIPT.

    • LD

      Program to be used to link libraries for dynamic loading.

      Defaults to $Config{ld}.

    • LDDLFLAGS

      Any special flags that might need to be passed to ld to create a shared library suitable for dynamic loading. It is up to the makefile to use it. (See lddlflags in Config)

      Defaults to $Config{lddlflags}.

    • LDFROM

      Defaults to "$(OBJECT)" and is used in the ld command to specify what files to link/load from (also see dynamic_lib below for how to specify ld flags)

    • LIB

      LIB should only be set at perl Makefile.PL time but is allowed as a MakeMaker argument. It has the effect of setting both INSTALLPRIVLIB and INSTALLSITELIB to that value regardless of any explicit setting of those arguments (or of PREFIX). INSTALLARCHLIB and INSTALLSITEARCH are set to the corresponding architecture subdirectory.

    • LIBPERL_A

      The filename of the perl library that will be used together with this extension. Defaults to libperl.a.

    • LIBS

      An anonymous array of alternative library specifications to be searched for (in order) until at least one library is found. E.g.

          'LIBS' => ["-lgdbm", "-ldbm -lfoo", "-L/path -ldbm.nfs"]

      Note that each element of the array contains a complete set of arguments for the ld command. So do not specify

          'LIBS' => ["-ltcl", "-ltk", "-lX11"]

      See ODBM_File/Makefile.PL for an example, where an array is needed. If you specify a scalar as in

          'LIBS' => "-ltcl -ltk -lX11"

      MakeMaker will turn it into an array with one element.

    • LICENSE

      The licensing terms of your distribution. Generally it's "perl" for the same license as Perl itself.

      See Module::Build::API for the list of options.

      Defaults to "unknown".

    • LINKTYPE

      'static' or 'dynamic' (default unless usedl=undef in config.sh). Should only be used to force static linking (also see linkext below).

    • MAKE

      Variant of make you intend to run the generated Makefile with. This parameter lets Makefile.PL know what make quirks to account for when generating the Makefile.

      MakeMaker also honors the MAKE environment variable. This parameter takes precedence.

      Currently the only significant values are 'dmake' and 'nmake' for Windows users, instructing MakeMaker to generate a Makefile in the flavour of DMake ("Dennis Vadura's Make") or Microsoft NMake respectively.

      Defaults to $Config{make}, which may go looking for a Make program in your environment.

      How are you supposed to know what flavour of Make a Makefile has been generated for if you didn't specify a value explicitly? Search the generated Makefile for the definition of the MAKE variable, which is used to recursively invoke the Make utility. That will tell you what Make you're supposed to invoke the Makefile with.

    • MAKEAPERL

      Boolean which tells MakeMaker that it should include the rules to make a perl. This is handled automatically as a switch by MakeMaker. The user normally does not need it.

    • MAKEFILE_OLD

      When 'make clean' or similar is run, the $(FIRST_MAKEFILE) will be backed up at this location.

      Defaults to $(FIRST_MAKEFILE).old or $(FIRST_MAKEFILE)_old on VMS.

    • MAN1PODS

      Hashref of pod-containing files. MakeMaker will default this to all EXE_FILES files that include POD directives. The files listed here will be converted to man pages and installed as was requested at Configure time.

      This hash should map POD files (or scripts containing POD) to the man file names under the blib/man1/ directory, as in the following example:

          MAN1PODS => {
              'doc/command.pod'   => 'blib/man1/command.1',
              'scripts/script.pl' => 'blib/man1/script.1',
          }

    • MAN3PODS

      Hashref that assigns to *.pm and *.pod files the files into which the manpages are to be written. MakeMaker parses all *.pod and *.pm files for POD directives. Files that contain POD will be the default keys of the MAN3PODS hashref. These will then be converted to man pages during make and will be installed during make install.

      Example similar to MAN1PODS.

    • MAP_TARGET

      If it is intended that a new perl binary be produced, this variable may hold a name for that binary. Defaults to perl.

    • META_ADD
    • META_MERGE

      A hashref of items to add to the CPAN Meta file (META.yml or META.json).

      They differ in how they behave if they have the same key as the default metadata. META_ADD will override the default value with its own. META_MERGE will merge its value with the default.

      Unless you want to override the defaults, prefer META_MERGE so as to get the advantage of any future defaults.
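
      As a rough illustration of the advice above, a Makefile.PL might use META_MERGE to add a repository link while leaving the generated defaults alone. This is only a sketch: the distribution name Foo::Bar, its file layout and the URL are hypothetical, and the resources layout assumes the older (version 1.4) meta-spec form in which repository is a plain URL string.

      ```perl
      use strict;
      use warnings;
      use ExtUtils::MakeMaker;

      WriteMakefile(
          NAME         => 'Foo::Bar',            # hypothetical distribution
          VERSION_FROM => 'lib/Foo/Bar.pm',      # hypothetical file layout
          META_MERGE   => {
              # Merged into the default metadata rather than replacing it.
              resources => {
                  repository => 'https://example.com/foo-bar.git',   # placeholder URL
              },
          },
      );
      ```

      Passing the same hash as META_ADD instead would, per the description above, override any default value stored under the same key.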

    • MIN_PERL_VERSION

      The minimum required version of Perl for this distribution.

      Either the 5.006001 or the 5.6.1 format is acceptable.

    • MYEXTLIB

      If the extension links to a library that it builds, set this to the name of the library (see SDBM_File)

    • NAME

      The package representing the distribution. For example, Test::More or ExtUtils::MakeMaker. It will be used to derive information about the distribution such as the DISTNAME, installation locations within the Perl library and where XS files will be looked for by default (see XS).

      NAME must be a valid Perl package name and it must have an associated .pm file. For example, Foo::Bar is a valid NAME and there must exist Foo/Bar.pm. Any XS code should be in Bar.xs unless stated otherwise.

      Your distribution must have a NAME.

    • NEEDS_LINKING

      MakeMaker will figure out if an extension contains linkable code anywhere down the directory tree, and will set this variable accordingly, but you can speed it up a very little bit if you define this boolean variable yourself.

    • NOECHO

      Command so make does not print the literal commands it's running.

      By setting it to an empty string you can generate a Makefile that prints all commands. Mainly used in debugging MakeMaker itself.

      Defaults to @.

    • NORECURS

      Boolean. Attribute to inhibit descending into subdirectories.

    • NO_META

      When true, suppresses the generation and addition to the MANIFEST of the META.yml and META.json module meta-data files during 'make distdir'.

      Defaults to false.

    • NO_MYMETA

      When true, suppresses the generation of MYMETA.yml and MYMETA.json module meta-data files during 'perl Makefile.PL'.

      Defaults to false.

    • NO_VC

      In general, any generated Makefile checks for the current version of MakeMaker and the version the Makefile was built under. If NO_VC is set, the version check is neglected. Do not write this into your Makefile.PL, use it interactively instead.

    • OBJECT

      List of object files, defaults to '$(BASEEXT)$(OBJ_EXT)', but can be a long string containing all object files, e.g. "tkpBind.o tkpButton.o tkpCanvas.o"

      (Where BASEEXT is the last component of NAME, and OBJ_EXT is $Config{obj_ext}.)

    • OPTIMIZE

      Defaults to -O. Set it to -g to turn debugging on. The flag is passed to subdirectory makes.

    • PERL

      Perl binary for tasks that can be done by miniperl.

    • PERL_CORE

      Set only when MakeMaker is building the extensions of the Perl core distribution.

    • PERLMAINCC

      The call to the program that is able to compile perlmain.c. Defaults to $(CC).

    • PERL_ARCHLIB

      Same as for PERL_LIB, but for architecture dependent files.

      Used only when MakeMaker is building the extensions of the Perl core distribution (because normally $(PERL_ARCHLIB) is automatically in @INC, and adding it would get in the way of PERL5LIB).

    • PERL_LIB

      Directory containing the Perl library to use.

      Used only when MakeMaker is building the extensions of the Perl core distribution (because normally $(PERL_LIB) is automatically in @INC, and adding it would get in the way of PERL5LIB).

    • PERL_MALLOC_OK

      Defaults to 0. Should be set to TRUE if the extension can work with the memory allocation routines substituted by the Perl malloc() subsystem. This should be applicable to most extensions with the exception of those

      • with bugs in memory allocations which are caught by Perl's malloc();

      • which interact with the memory allocator in other ways than via malloc(), realloc(), free(), calloc(), sbrk() and brk();

      • which rely on special alignment which is not provided by Perl's malloc().

      NOTE. Neglecting to set this flag in any one of the loaded extensions nullifies many advantages of Perl's malloc(), such as better usage of system resources, error detection, memory usage reporting, catchable failure of memory allocations, etc.

    • PERLPREFIX

      Directory under which core modules are to be installed.

      Defaults to $Config{installprefixexp}, falling back to $Config{installprefix}, $Config{prefixexp} or $Config{prefix} should $Config{installprefixexp} not exist.

      Overridden by PREFIX.

    • PERLRUN

      Use this instead of $(PERL) when you wish to run perl. It will set up extra necessary flags for you.

    • PERLRUNINST

      Use this instead of $(PERL) when you wish to run perl to work with modules. It will add things like -I$(INST_ARCH) and other necessary flags so perl can see the modules you're about to install.

    • PERL_SRC

      Directory containing the Perl source code (use of this should be avoided, it may be undefined)

    • PERM_DIR

      Desired permission for directories. Defaults to 755.

    • PERM_RW

      Desired permission for read/writable files. Defaults to 644.

    • PERM_RWX

      Desired permission for executable files. Defaults to 755.

    • PL_FILES

      MakeMaker can run programs to generate files for you at build time. By default any file named *.PL (except Makefile.PL and Build.PL) in the top level directory will be assumed to be a Perl program and run passing its own basename in as an argument. For example...

          perl foo.PL foo

      This behavior can be overridden by supplying your own set of files to search. PL_FILES accepts a hash ref, the key being the file to run and the value being passed in as the first argument when the PL file is run.

          PL_FILES => {'bin/foobar.PL' => 'bin/foobar'}

      Would run bin/foobar.PL like this:

          perl bin/foobar.PL bin/foobar

      If multiple files from one program are desired an array ref can be used.

          PL_FILES => {'bin/foobar.PL' => [qw(bin/foobar1 bin/foobar2)]}

      In this case the program will be run multiple times using each target file.

          perl bin/foobar.PL bin/foobar1
          perl bin/foobar.PL bin/foobar2

      PL files are normally run after pm_to_blib and include INST_LIB and INST_ARCH in their @INC, so the just built modules can be accessed... unless the PL file is making a module (or anything else in PM) in which case it is run before pm_to_blib and does not include INST_LIB and INST_ARCH in its @INC. This apparently odd behavior is there for backwards compatibility (and it's somewhat DWIM).
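
      To make the calling convention concrete, here is a minimal sketch of what a generator such as bin/foobar.PL could look like. The file name, the generate() helper and the generated script's contents are all hypothetical; the only behaviour taken from the text above is that the target file name arrives as the first command-line argument.

      ```perl
      # bin/foobar.PL (hypothetical): writes the file named by its first argument.
      use strict;
      use warnings;

      # Write a tiny Perl script to $target and return the path that was written.
      sub generate {
          my ($target) = @_;
          open my $out, '>', $target or die "Cannot write $target: $!";
          print {$out} "#!/usr/bin/perl\n";
          print {$out} "print \"Hello from foobar\\n\";\n";
          close $out or die "Cannot close $target: $!";
          return $target;
      }

      # MakeMaker runs "perl bin/foobar.PL bin/foobar", so the target is $ARGV[0].
      generate($ARGV[0]) if @ARGV;
      ```

      With the array-ref form of PL_FILES the same program would simply be run once per target, each time with a different $ARGV[0].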

    • PM

      Hashref of .pm files and *.pl files to be installed, e.g.

          {'name_of_file.pm' => '$(INST_LIBDIR)/install_as.pm'}

      By default this will include *.pm and *.pl and the files found in the PMLIBDIRS directories. Defining PM in the Makefile.PL will override PMLIBDIRS.

    • PMLIBDIRS

      Ref to array of subdirectories containing library files. Defaults to [ 'lib', $(BASEEXT) ]. The directories will be scanned and any files they contain will be installed in the corresponding location in the library. A libscan() method can be used to alter the behaviour. Defining PM in the Makefile.PL will override PMLIBDIRS.

      (Where BASEEXT is the last component of NAME.)

    • PM_FILTER

      A filter program, in the traditional Unix sense (input from stdin, output to stdout) that is passed on each .pm file during the build (in the pm_to_blib() phase). It is empty by default, meaning no filtering is done.

      Great care is necessary when defining the command if quoting needs to be done. For instance, you would need to say:

          {'PM_FILTER' => 'grep -v \\"^\\#\\"'}

      to remove all the leading comments on the fly during the build. The extra \\ are necessary, unfortunately, because this variable is interpolated within the context of a Perl program built on the command line, and double quotes are what is used with the -e switch to build that command line. The # is escaped for the Makefile, since what is going to be generated will then be:

          PM_FILTER = grep -v \"^\#\"

      Without the \\ before the #, we'd have the start of a Makefile comment, and the macro would be incorrectly defined.

    • POLLUTE

      Release 5.005 grandfathered old global symbol names by providing preprocessor macros for extension source compatibility. As of release 5.6, these preprocessor definitions are not available by default. The POLLUTE flag specifies that the old names should still be defined:

          perl Makefile.PL POLLUTE=1

      Please inform the module author if this is necessary to successfully install a module under 5.6 or later.

    • PPM_INSTALL_EXEC

      Name of the executable used to run PPM_INSTALL_SCRIPT below. (e.g. perl)

    • PPM_INSTALL_SCRIPT

      Name of the script that gets executed by the Perl Package Manager after the installation of a package.

    • PREFIX

      This overrides all the default install locations. Man pages, libraries, scripts, etc... MakeMaker will try to make an educated guess about where to place things under the new PREFIX based on your Config defaults. Failing that, it will fall back to a structure which should be sensible for your platform.

      If you specify LIB or any INSTALL* variables they will not be affected by the PREFIX.

    • PREREQ_FATAL

      Bool. If this parameter is true, failing to have the required modules (or the right versions thereof) will be fatal. perl Makefile.PL will die instead of simply informing the user of the missing dependencies.

      It is extremely rare to have to use PREREQ_FATAL. Its use by module authors is strongly discouraged, and it should never be used lightly.

      Module installation tools have ways of resolving unmet dependencies but to do that they need a Makefile. Using PREREQ_FATAL breaks this. That's bad.

      Assuming you have good test coverage, your tests should fail with missing dependencies informing the user more strongly that something is wrong. You can write a t/00compile.t test which will simply check that your code compiles and stop "make test" prematurely if it doesn't. See BAIL_OUT in Test::More for more details.

    • PREREQ_PM

      A hash of modules that are needed to run your module. The keys are the module names, e.g. Test::More, and the minimum version is the value. If the required version number is 0 any version will do.

      This will go into the requires field of your CPAN Meta file (META.yml or META.json).

          PREREQ_PM => {
              # Require Test::More at least 0.47
              "Test::More" => "0.47",

              # Require any version of Acme::Buffy
              "Acme::Buffy" => 0,
          }

    • PREREQ_PRINT

      Bool. If this parameter is true, the prerequisites will be printed to stdout and MakeMaker will exit. The output format is an evalable hash ref.

          $PREREQ_PM = {
              'A::B' => Vers1,
              'C::D' => Vers2,
              ...
          };

      If a distribution defines a minimal required perl version, this is added to the output as an additional line of the form:

          $MIN_PERL_VERSION = '5.008001';

      If BUILD_REQUIRES is not empty, it will be dumped as $BUILD_REQUIRES hashref.

    • PRINT_PREREQ

      RedHatism for PREREQ_PRINT. The output format is different, though:

          perl(A::B)>=Vers1 perl(C::D)>=Vers2 ...

      A minimal required perl version, if present, will look like this:

          perl(perl)>=5.008001

    • SITEPREFIX

      Like PERLPREFIX, but only for the site install locations.

      Defaults to $Config{siteprefixexp}. Perls prior to 5.6.0 didn't have an explicit siteprefix in the Config. In those cases $Config{installprefix} will be used.

      Overridable by PREFIX

    • SIGN

      When true, perform the generation and addition to the MANIFEST of the SIGNATURE file in the distdir during 'make distdir', via 'cpansign -s'.

      Note that you need to install the Module::Signature module to perform this operation.

      Defaults to false.

    • SKIP

      Arrayref of Makefile section names to skip (that is, not write), e.g. [qw(name1 name2)]. Caution! Do not use the SKIP attribute for the negligible speedup. It may seriously damage the resulting Makefile. Only use it if you really need it.

    • TEST_REQUIRES

      A hash of modules that are needed to test your module but not run or build it.

      This will go into the test_requires field of your CPAN Meta file (META.yml or META.json).

      The format is the same as PREREQ_PM.
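
      For orientation, the prerequisite-related attributes might be combined in a Makefile.PL roughly as follows. This is a sketch only: Foo::Bar, its file layout and the particular modules and version numbers listed are placeholders chosen for illustration.

      ```perl
      use strict;
      use warnings;
      use ExtUtils::MakeMaker;

      WriteMakefile(
          NAME             => 'Foo::Bar',          # hypothetical distribution
          VERSION_FROM     => 'lib/Foo/Bar.pm',    # hypothetical file layout
          MIN_PERL_VERSION => '5.008001',

          # Needed at run time; 0 means any version will do.
          PREREQ_PM => {
              'List::Util' => 0,
          },

          # Needed only to run the test suite.
          TEST_REQUIRES => {
              'Test::More' => '0.47',
          },
      );
      ```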

    • TYPEMAPS

      Ref to array of typemap file names. Use this when the typemaps are in some directory other than the current directory or when they are not named typemap. The last typemap in the list takes precedence. A typemap in the current directory has highest precedence, even if it isn't listed in TYPEMAPS. The default system typemap has lowest precedence.

    • VENDORPREFIX

      Like PERLPREFIX, but only for the vendor install locations.

      Defaults to $Config{vendorprefixexp}.

      Overridable by PREFIX

    • VERBINST

      If true, make install will be verbose

    • VERSION

      Your version number for distributing the package. This defaults to 0.1.

    • VERSION_FROM

      Instead of specifying the VERSION in the Makefile.PL you can let MakeMaker parse a file to determine the version number. The parsing routine requires that the file named by VERSION_FROM contains one single line to compute the version number. The first line in the file that contains something like a $VERSION assignment or package Name VERSION will be used. The following lines will be parsed o.k.:

          # Good
          package Foo::Bar 1.23;                      # 1.23
          $VERSION = '1.00';                          # 1.00
          *VERSION = \'1.01';                         # 1.01
          ($VERSION) = q$Revision$ =~ /(\d+)/g;       # The digits in $Revision$
          $FOO::VERSION = '1.10';                     # 1.10
          *FOO::VERSION = \'1.11';                    # 1.11

      but these will fail:

          # Bad
          my $VERSION = '1.01';
          local $VERSION = '1.02';
          local $FOO::VERSION = '1.30';

      "Version strings" are incompatible and should not be used.

          # Bad
          $VERSION = 1.2.3;
          $VERSION = v1.2.3;

      version objects are fine. As of MakeMaker 6.35 version.pm will be automatically loaded, but you must declare the dependency on version.pm. For compatibility with older MakeMaker you should load version.pm on the same line as $VERSION is declared.

          # All on one line
          use version; our $VERSION = qv(1.2.3);

      (Putting my or local on the preceding line will work o.k.)

      The file named in VERSION_FROM is not added as a dependency to Makefile. This is not really correct, but it would be a major pain during development to have to rewrite the Makefile for any smallish change in that file. If you want to make sure that the Makefile contains the correct VERSION macro after any change of the file, you would have to do something like

          depend => { Makefile => '$(VERSION_FROM)' }

      See attribute depend below.
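
      One way to check what MakeMaker would extract from a file is to call the parse_version method on the MM class, which is available once ExtUtils::MakeMaker has been loaded. The sketch below writes a throwaway module to a temporary file and parses it; the package name Foo::Bar and the version number are made up for the example.

      ```perl
      use strict;
      use warnings;
      use ExtUtils::MakeMaker;          # loading MakeMaker also sets up the MM class
      use File::Temp qw(tempfile);

      # Write a throwaway module file containing a single $VERSION line.
      my ($fh, $file) = tempfile(SUFFIX => '.pm', UNLINK => 1);
      print {$fh} "package Foo::Bar;\n";
      print {$fh} "our \$VERSION = '1.23';\n";
      print {$fh} "1;\n";
      close $fh or die "Cannot close $file: $!";

      # Ask MakeMaker which version it finds in that file.
      my $version = MM->parse_version($file);
      print "Parsed version: $version\n";
      ```

      Running the same call against your own module before a release is a quick way to confirm that the $VERSION line is in one of the forms listed as parseable above.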

    • VERSION_SYM

      A sanitized VERSION with . replaced by _. For places where . has special meaning (some filesystems, RCS labels, etc...)

    • XS

      Hashref of .xs files. MakeMaker will default this. e.g.

          {'name_of_file.xs' => 'name_of_file.c'}

      The .c files will automatically be included in the list of files deleted by a make clean.

    • XSOPT

      String of options to pass to xsubpp. This might include -C++ or -extern. Do not include typemaps here; the TYPEMAPS parameter exists for that purpose.

    • XSPROTOARG

      May be set to an empty string, which is identical to -prototypes, or -noprototypes. See the xsubpp documentation for details. MakeMaker defaults to the empty string.

    • XS_VERSION

      Your version number for the .xs file of this package. This defaults to the value of the VERSION attribute.

    Additional lowercase attributes

    These can be used to pass parameters to the methods which implement that part of the Makefile. Parameters are specified as a hash ref but are passed to the method as a hash.

    • clean

          {FILES => "*.xyz foo"}

    • depend

          {ANY_TARGET => ANY_DEPENDENCY, ...}

      (ANY_TARGET must not be given a double-colon rule by MakeMaker.)

    • dist

          {TARFLAGS => 'cvfF', COMPRESS => 'gzip', SUFFIX => '.gz',
           SHAR => 'shar -m', DIST_CP => 'ln', ZIP => '/bin/zip',
           ZIPFLAGS => '-rl', DIST_DEFAULT => 'private tardist' }

      If you specify COMPRESS, then SUFFIX should also be altered, as it is needed to tell make the target file of the compression. Setting DIST_CP to ln can be useful, if you need to preserve the timestamps on your files. DIST_CP can take the values 'cp', which copies the file, 'ln', which links the file, and 'best' which copies symbolic links and links the rest. Default is 'best'.

    • dynamic_lib

          {ARMAYBE => 'ar', OTHERLDFLAGS => '...', INST_DYNAMIC_DEP => '...'}

    • linkext

          {LINKTYPE => 'static', 'dynamic' or ''}

      NB: Extensions that have nothing but *.pm files had to say

          {LINKTYPE => ''}

      with Pre-5.0 MakeMakers. Since version 5.00 of MakeMaker such a line can be deleted safely. MakeMaker recognizes when there's nothing to be linked.

    • macro

          {ANY_MACRO => ANY_VALUE, ...}

    • postamble

      Anything put here will be passed to MY::postamble() if you have one.

    • realclean

          {FILES => '$(INST_ARCHAUTODIR)/*.xyz'}

    • test

          {TESTS => 't/*.t'}

    • tool_autosplit

          {MAXLEN => 8}

    Overriding MakeMaker Methods

    If you cannot achieve the desired Makefile behaviour by specifying attributes you may define private subroutines in the Makefile.PL. Each subroutine returns the text it wishes to have written to the Makefile. To override a section of the Makefile you can either say:

        sub MY::c_o { "new literal text" }

    or you can edit the default by saying something like:

        package MY; # so that "SUPER" works right
        sub c_o {
            my $inherited = shift->SUPER::c_o(@_);
            $inherited =~ s/old text/new text/;
            $inherited;
        }

    If you are running experiments with embedding perl as a library into other applications, you might find MakeMaker is not sufficient. You'd better have a look at ExtUtils::Embed which is a collection of utilities for embedding.

    If you still need a different solution, try to develop another subroutine that fits your needs and submit the diffs to makemaker@perl.org

    For a complete description of all MakeMaker methods see ExtUtils::MM_Unix.

    Here is a simple example of how to add a new target to the generated Makefile:

        sub MY::postamble {
            return <<'MAKE_FRAG';
        $(MYEXTLIB): sdbm/Makefile
        	cd sdbm && $(MAKE) all

        MAKE_FRAG
        }

    The End Of Cargo Cult Programming

    WriteMakefile() now does some basic sanity checks on its parameters to protect against typos and malformatted values. This means some things which happened to work in the past will now throw warnings and possibly produce internal errors.

    Some of the most common mistakes:

    • MAN3PODS => ' '

      This is commonly used to suppress the creation of man pages. MAN3PODS takes a hash ref not a string, but the above worked by accident in old versions of MakeMaker.

      The correct code is MAN3PODS => { } .

    Hintsfile support

    MakeMaker.pm uses the architecture-specific information from Config.pm. In addition it evaluates architecture-specific hints files in a hints/ directory. The hints files are expected to be named like their counterparts in PERL_SRC/hints, but with a .pl file name extension (e.g. next_3_2.pl). They are simply evaled by MakeMaker within the WriteMakefile() subroutine, and can be used to execute commands as well as to include special variables. The rules for choosing a hintsfile are the same as in Configure.

    The hintsfile is eval()ed immediately after the arguments given to WriteMakefile are stuffed into a hash reference $self but before this reference becomes blessed. So if you want to do the equivalent to override or create an attribute you would say something like

        $self->{LIBS} = ['-ldbm -lucb -lc'];

    Distribution Support

    For authors of extensions MakeMaker provides several Makefile targets. Most of the support comes from the ExtUtils::Manifest module, where additional documentation can be found.

    • make distcheck

      reports which files are below the build directory but not in the MANIFEST file and vice versa. (See ExtUtils::Manifest::fullcheck() for details)

    • make skipcheck

      reports which files are skipped due to the entries in the MANIFEST.SKIP file (See ExtUtils::Manifest::skipcheck() for details)

    • make distclean

      does a realclean first and then the distcheck. Note that this is not needed to build a new distribution as long as you are sure that the MANIFEST file is ok.

    • make manifest

      rewrites the MANIFEST file, adding all remaining files found (See ExtUtils::Manifest::mkmanifest() for details)

    • make distdir

      Copies all the files that are in the MANIFEST file to a newly created directory with the name $(DISTNAME)-$(VERSION). If that directory exists, it will be removed first.

      Additionally, it will create META.yml and META.json module meta-data files in the distdir and add them to the distdir's MANIFEST. You can shut this behavior off with the NO_META flag.

    • make disttest

      Makes a distdir first, and runs a perl Makefile.PL, a make, and a make test in that directory.

    • make tardist

      First does a distdir. Then a command $(PREOP) which defaults to a null command, followed by $(TO_UNIX), which defaults to a null command under UNIX, and will convert files in distribution directory to UNIX format otherwise. Next it runs tar on that directory into a tarfile and deletes the directory. Finishes with a command $(POSTOP) which defaults to a null command.

    • make dist

      Defaults to $(DIST_DEFAULT) which in turn defaults to tardist.

    • make uutardist

      Runs a tardist first and uuencodes the tarfile.

    • make shdist

      First does a distdir. Then a command $(PREOP) which defaults to a null command. Next it runs shar on that directory into a sharfile and deletes the intermediate directory again. Finishes with a command $(POSTOP) which defaults to a null command. Note: For shdist to work properly a shar program that can handle directories is mandatory.

    • make zipdist

      First does a distdir. Then a command $(PREOP) which defaults to a null command. Runs $(ZIP) $(ZIPFLAGS) on that directory into a zipfile. Then deletes that directory. Finishes with a command $(POSTOP) which defaults to a null command.

    • make ci

      Does a $(CI) and a $(RCS_LABEL) on all files in the MANIFEST file.

    Customization of the dist targets can be done by specifying a hash reference to the dist attribute of the WriteMakefile call. The following parameters are recognized:

        CI           ('ci -u')
        COMPRESS     ('gzip --best')
        POSTOP       ('@ :')
        PREOP        ('@ :')
        TO_UNIX      (depends on the system)
        RCS_LABEL    ('rcs -q -Nv$(VERSION_SYM):')
        SHAR         ('shar')
        SUFFIX       ('.gz')
        TAR          ('tar')
        TARFLAGS     ('cvf')
        ZIP          ('zip')
        ZIPFLAGS     ('-r')

    An example:

        WriteMakefile(
            ...other options...
            dist => {
                COMPRESS => "bzip2",
                SUFFIX   => ".bz2"
            }
        );

    Module Meta-Data (META and MYMETA)

    Long plaguing users of MakeMaker based modules has been the problem of getting basic information about the module out of the sources without running the Makefile.PL and doing a bunch of messy heuristics on the resulting Makefile. Over the years, it has become standard to keep this information in one or more CPAN Meta files distributed with each distribution.

    The original format of CPAN Meta files was YAML and the corresponding file was called META.yml. In 2010, version 2 of the CPAN::Meta::Spec was released, which mandates JSON format for the metadata in order to overcome certain compatibility issues between YAML serializers and to avoid breaking older clients unable to handle a new version of the spec. The CPAN::Meta library is now standard for accessing old and new-style Meta files.

    If CPAN::Meta is installed, MakeMaker will automatically generate META.json and META.yml files for you and add them to your MANIFEST as part of the 'distdir' target (and thus the 'dist' target). This is intended to seamlessly and rapidly populate CPAN with module meta-data. If you wish to shut this feature off, set the NO_META WriteMakefile() flag to true.

    At the 2008 QA Hackathon in Oslo, Perl module toolchain maintainers agreed to use the CPAN Meta format to communicate post-configuration requirements between toolchain components. These files, MYMETA.json and MYMETA.yml, are generated when Makefile.PL generates a Makefile (if CPAN::Meta is installed). Clients like CPAN or CPANPLUS will read these files to see what prerequisites must be fulfilled before building or testing the distribution. If you wish to shut this feature off, set the NO_MYMETA WriteMakefile() flag to true.
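
    As a rough illustration of the two flags described above, the following sketch calls WriteMakefile() with both NO_META and NO_MYMETA set. The distribution name Demo::Dist, the version number, and the temporary working directory are assumptions made only so the sketch is self-contained; a real Makefile.PL would simply call WriteMakefile() in the distribution directory.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use ExtUtils::MakeMaker;

# Work in a throwaway directory so that no real distribution is touched.
my $orig_dir = getcwd();
my $work_dir = tempdir(CLEANUP => 1);
chdir $work_dir or die "Cannot chdir to $work_dir: $!";

# Demo::Dist and 0.01 are hypothetical values used for illustration.
WriteMakefile(
    NAME      => 'Demo::Dist',
    VERSION   => '0.01',
    NO_META   => 1,    # no META.json / META.yml in the distdir target
    NO_MYMETA => 1,    # no MYMETA.json / MYMETA.yml written now
);

# Record what was produced before leaving the temporary directory.
my $have_makefile = -e 'Makefile' ? 1 : 0;
my $have_mymeta   = (-e 'MYMETA.json' || -e 'MYMETA.yml') ? 1 : 0;

chdir $orig_dir or die "Cannot chdir back to $orig_dir: $!";
```

    With both flags set, a Makefile should still be written, but no MYMETA files should appear next to it.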

    Disabling an extension

    If some events detected in Makefile.PL imply that there is no way to create the Module, but this is a normal state of things, then you can create a Makefile which does nothing, but succeeds on all the "usual" build targets. To do so, use

        use ExtUtils::MakeMaker qw(WriteEmptyMakefile);
        WriteEmptyMakefile();

    instead of WriteMakefile().

    This may be useful if other modules expect this module to be built OK, as opposed to work OK (say, this system-dependent module builds in a subdirectory of some other distribution, or is listed as a dependency in a CPAN::Bundle, but the functionality is supported by different means on the current architecture).

    Other Handy Functions

    • prompt
        my $value = prompt($message);
        my $value = prompt($message, $default);

      The prompt() function provides an easy way to request user input used to write a makefile. It displays the $message as a prompt for input. If a $default is provided it will be used as a default. The function returns the $value selected by the user.

      If prompt() detects that it is not running interactively and there is nothing on STDIN or if the PERL_MM_USE_DEFAULT environment variable is set to true, the $default will be used without prompting. This prevents automated processes from blocking on user input.

      If no $default is provided an empty string will be used instead.
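
      The non-interactive behaviour can be sketched as follows. The question text and the default of "y" are illustrative assumptions; PERL_MM_USE_DEFAULT is set locally so that the call returns the default without waiting on STDIN.

```perl
use strict;
use warnings;
use ExtUtils::MakeMaker qw(prompt);

my $answer;
{
    # Force prompt() to accept the default, as an automated build would.
    local $ENV{PERL_MM_USE_DEFAULT} = 1;

    # The message and the default value are hypothetical examples.
    $answer = prompt('Install the optional scripts?', 'y');
}

# Turn the textual answer into a flag a Makefile.PL could act on.
my $install_scripts = $answer =~ /^y/i ? 1 : 0;
```

      A Makefile.PL could then use $install_scripts to decide, for instance, whether to list extra programs in EXE_FILES.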

    ENVIRONMENT

    • PERL_MM_OPT

      Command line options used by MakeMaker->new(), and thus by WriteMakefile(). The string is split on whitespace, and the result is processed before any actual command line arguments are processed.

    • PERL_MM_USE_DEFAULT

      If set to a true value then MakeMaker's prompt function will always return the default without waiting for user input.

    • PERL_CORE

      Same as the PERL_CORE parameter. The parameter overrides this.

    SEE ALSO

    Module::Build is a pure-Perl alternative to MakeMaker which does not rely on make or any other external utility. It is easier to extend to suit your needs.

    Module::Install is a wrapper around MakeMaker which adds features not normally available.

    ExtUtils::ModuleMaker and Module::Starter are both modules to help you setup your distribution.

    CPAN::Meta and CPAN::Meta::Spec explain CPAN Meta files in detail.

    AUTHORS

    Andy Dougherty doughera@lafayette.edu, Andreas König andreas.koenig@mind.de, Tim Bunce timb@cpan.org. VMS support by Charles Bailey bailey@newman.upenn.edu. OS/2 support by Ilya Zakharevich ilya@math.ohio-state.edu.

    Currently maintained by Michael G Schwern schwern@pobox.com

    Send patches and ideas to makemaker@perl.org .

    Send bug reports via http://rt.cpan.org/. Please send your generated Makefile along with your report.

    For more up-to-date information, see http://www.makemaker.org.

    Repository available at https://github.com/Perl-Toolchain-Gang/ExtUtils-MakeMaker.

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    See http://www.perl.com/perl/misc/Artistic.html

     
    ExtUtils::Manifest

    NAME

    ExtUtils::Manifest - utilities to write and check a MANIFEST file

    SYNOPSIS

        use ExtUtils::Manifest qw(...funcs to import...);
        mkmanifest();
        my @missing_files = manicheck;
        my @skipped = skipcheck;
        my @extra_files = filecheck;
        my($missing, $extra) = fullcheck;
        my $found = manifind();
        my $manifest = maniread();
        manicopy($read,$target);
        maniadd({$file => $comment, ...});

    DESCRIPTION

    Functions

    ExtUtils::Manifest exports no functions by default. The following are exported on request:

    • mkmanifest
          mkmanifest();

      Writes all files in and below the current directory to your MANIFEST. It works similarly to the Unix command

          find . > MANIFEST

      All files that match any regular expression in a file MANIFEST.SKIP (if it exists) are ignored.

      Any existing MANIFEST file will be saved as MANIFEST.bak.

    • manifind
          my $found = manifind();

      returns a hash reference. The keys of the hash are the files found below the current directory.

    • manicheck
          my @missing_files = manicheck();

      checks if all the files within a MANIFEST in the current directory really do exist. If MANIFEST and the tree below the current directory are in sync it silently returns an empty list. Otherwise it returns a list of files which are listed in the MANIFEST but missing from the directory, and by default also outputs these names to STDERR.

    • filecheck
          my @extra_files = filecheck();

      finds files below the current directory that are not mentioned in the MANIFEST file. An optional file MANIFEST.SKIP will be consulted. Any file matching a regular expression in such a file will not be reported as missing in the MANIFEST file. The list of any extraneous files found is returned, and by default also reported to STDERR.

    • fullcheck
          my($missing, $extra) = fullcheck();

      does both a manicheck() and a filecheck(), returning them as two array refs.

    • skipcheck
          my @skipped = skipcheck();

      lists all the files that are skipped due to your MANIFEST.SKIP file.

    • maniread
          my $manifest = maniread();
          my $manifest = maniread($manifest_file);

      reads a named MANIFEST file (defaults to MANIFEST in the current directory) and returns a HASH reference with files being the keys and comments being the values of the HASH. Blank lines and lines which start with # in the MANIFEST file are discarded.

    • maniskip
          my $skipchk = maniskip();
          my $skipchk = maniskip($manifest_skip_file);
          if ($skipchk->($file)) { .. }

      reads a named MANIFEST.SKIP file (defaults to MANIFEST.SKIP in the current directory) and returns a CODE reference that tests whether a given filename should be skipped.

    • manicopy
          manicopy(\%src, $dest_dir);
          manicopy(\%src, $dest_dir, $how);

      Copies the files that are the keys in %src to the $dest_dir. %src is typically returned by the maniread() function.

          manicopy( maniread(), $dest_dir );

      This function is useful for producing a directory tree identical to the intended distribution tree.

      $how can be used to specify a different method of "copying". Valid values are cp, which actually copies the files; ln, which creates hard links; and best, which mostly links the files but copies any symbolic link, so that the resulting tree contains no symbolic links. cp is the default.

    • maniadd
          maniadd({ $file => $comment, ...});

      Adds an entry to an existing MANIFEST unless it's already there.

      $file will be normalized (ie. Unixified). UNIMPLEMENTED
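
    The checking functions above can be tried out with a small sketch like the one below. It builds a throwaway distribution tree in a temporary directory; the file names, the MANIFEST contents, and the write_file() helper are all assumptions made for illustration.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use ExtUtils::Manifest qw(maniread manicheck filecheck);

# Hypothetical helper for this sketch: create a file with the given text.
sub write_file {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!";
}

my $orig_dir = getcwd();
my $work_dir = tempdir(CLEANUP => 1);
chdir $work_dir or die "Cannot chdir to $work_dir: $!";

# lib/Foo.pm is listed and present, README is listed but absent,
# and extra.txt is present but not listed.
mkdir 'lib' or die "Cannot create lib: $!";
write_file('lib/Foo.pm', "package Foo;\n1;\n");
write_file('extra.txt',  "not listed in MANIFEST\n");
write_file('MANIFEST',   "MANIFEST\nlib/Foo.pm  the module\nREADME\n");

# Suppress the diagnostics normally sent to STDERR.
$ExtUtils::Manifest::Quiet = 1;

my $manifest = maniread();     # file name => comment
my @missing  = manicheck();    # listed in MANIFEST, not on disk
my @extra    = filecheck();    # on disk, not listed in MANIFEST

chdir $orig_dir or die "Cannot chdir back to $orig_dir: $!";
```

    In this layout @missing should hold README and @extra should hold extra.txt, while $manifest->{'lib/Foo.pm'} carries the comment from the MANIFEST line.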

    MANIFEST

    A list of files in the distribution, one file per line. The MANIFEST always uses Unix filepath conventions even if you're not on Unix. This means foo/bar style not foo\bar.

    Anything between white space and an end of line within a MANIFEST file is considered to be a comment. Any line beginning with # is also a comment. Beginning with ExtUtils::Manifest 1.52, a filename may contain whitespace characters if it is enclosed in single quotes; single quotes or backslashes in that filename must be backslash-escaped.

        # this is a comment
        some/file
        some/other/file            comment about some/file
        'some/third file'          comment

    MANIFEST.SKIP

    The file MANIFEST.SKIP may contain regular expressions of files that should be ignored by mkmanifest() and filecheck(). The regular expressions should appear one on each line. Blank lines and lines which start with # are skipped. Use \# if you need a regular expression to start with a # .

    For example:

        # Version control files and dirs.
        \bRCS\b
        \bCVS\b
        ,v$
        \B\.svn\b

        # Makemaker generated files and dirs.
        ^MANIFEST\.
        ^Makefile$
        ^blib/
        ^MakeMaker-\d

        # Temp, old and emacs backup files.
        ~$
        \.old$
        ^#.*#$
        ^\.#

    If no MANIFEST.SKIP file is found, a default set of skips will be used, similar to the example above. If you want nothing skipped, simply make an empty MANIFEST.SKIP file.

    In one's own MANIFEST.SKIP file, certain directives can be used to include the contents of other MANIFEST.SKIP files. At present two such directives are recognized.

    • #!include_default

      This inserts the contents of the default MANIFEST.SKIP file

    • #!include /Path/to/another/manifest.skip

      This inserts the contents of the specified external file

    The included contents will be inserted into the MANIFEST.SKIP file in between #!start included /path/to/manifest.skip and #!end included /path/to/manifest.skip markers. The original MANIFEST.SKIP is saved as MANIFEST.SKIP.bak.
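
    To see how skip patterns are applied, the sketch below writes a small skip file and asks maniskip() about a few paths. The patterns are taken from the example above; the temporary file and the sample path names are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use ExtUtils::Manifest qw(maniskip);

# Write a small skip file; blank lines and # comments are ignored.
my ($fh, $skip_file) = tempfile(UNLINK => 1);
print {$fh} <<'END_SKIP';
# Build products
^blib/
^Makefile$

# Editor backups
~$
END_SKIP
close $fh or die "Cannot close $skip_file: $!";

# maniskip() returns a code reference that tests one file name.
my $skip = maniskip($skip_file);

# Hypothetical paths, as they would appear relative to the dist root.
my %decision = map { $_ => ($skip->($_) ? 'skip' : 'keep') }
    qw(blib/lib/Foo.pm Makefile Makefile.PL lib/Foo.pm lib/Foo.pm~);

print "$_: $decision{$_}\n" for sort keys %decision;
```

    Note that ^Makefile$ is anchored at both ends, so Makefile.PL is kept while the generated Makefile is skipped.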

    EXPORT_OK

    &mkmanifest, &manicheck, &filecheck, &fullcheck, &maniread, and &manicopy are exportable.

    GLOBAL VARIABLES

    $ExtUtils::Manifest::MANIFEST defaults to MANIFEST. Changing it results in both a different MANIFEST and a different MANIFEST.SKIP file. This is useful if you want to maintain different distributions for different audiences (say a user version and a developer version including RCS).

    $ExtUtils::Manifest::Quiet defaults to 0. If set to a true value, all functions act silently.

    $ExtUtils::Manifest::Debug defaults to 0. If set to a true value, or if PERL_MM_MANIFEST_DEBUG is true, debugging output will be produced.

    DIAGNOSTICS

    All diagnostic output is sent to STDERR.

    • Not in MANIFEST: file

      is reported if a file is found which is not in MANIFEST.

    • Skipping file

      is reported if a file is skipped due to an entry in MANIFEST.SKIP.

    • No such file: file

      is reported if a file mentioned in a MANIFEST file does not exist.

    • MANIFEST: $!

      is reported if MANIFEST could not be opened.

    • Added to MANIFEST: file

      is reported by mkmanifest() if $Verbose is set and a file is added to MANIFEST. $Verbose is set to 1 by default.

    ENVIRONMENT

    • PERL_MM_MANIFEST_DEBUG

      Turns on debugging

    SEE ALSO

    ExtUtils::MakeMaker which has handy targets for most of the functionality.

    AUTHOR

    Andreas Koenig andreas.koenig@anima.de

    Maintained by Michael G Schwern schwern@pobox.com within the ExtUtils-MakeMaker package and, as a separate CPAN package, by Randy Kobes r.kobes@uwinnipeg.ca .

     
    ExtUtils::Miniperl

    NAME

    ExtUtils::Miniperl, writemain - write the C code for perlmain.c

    SYNOPSIS

    use ExtUtils::Miniperl;

    writemain(@directories);

    DESCRIPTION

    This whole module is written when perl itself is built from a script called minimod.PL. In case you want to patch it, please patch minimod.PL in the perl distribution instead.

    writemain() takes an argument list of directories containing archive libraries that relate to perl modules and should be linked into a new perl binary. It writes to STDOUT a corresponding perlmain.c file that is a plain C file containing all the bootstrap code to make the modules associated with the libraries available from within perl.

    The typical usage is from within a Makefile generated by ExtUtils::MakeMaker. So under normal circumstances you won't have to deal with this module directly.

    SEE ALSO

    ExtUtils::MakeMaker

     
    ExtUtils::Mkbootstrap

    NAME

    ExtUtils::Mkbootstrap - make a bootstrap file for use by DynaLoader

    SYNOPSIS

    Mkbootstrap

    DESCRIPTION

    Mkbootstrap typically gets called from an extension Makefile.

    There is no *.bs file supplied with the extension. Instead, there may be a *_BS file which has code for the special cases, like posix for berkeley db on the NeXT.

    This file will get parsed, and produce a maybe empty @DynaLoader::dl_resolve_using array for the current architecture. That will be extended by $BSLOADLIBS, which was computed by ExtUtils::Liblist::ext(). If this array still is empty, we do nothing, else we write a .bs file with an @DynaLoader::dl_resolve_using array.

    The *_BS file can put some code into the generated *.bs file by placing it in $bscode. This is a handy 'escape' mechanism that may prove useful in complex situations.

    If @DynaLoader::dl_resolve_using contains -L* or -l* entries then Mkbootstrap will automatically add a dl_findfile() call to the generated *.bs file.

     
    ExtUtils::Mksymlists

    NAME

    ExtUtils::Mksymlists - write linker options files for dynamic extension

    SYNOPSIS

        use ExtUtils::Mksymlists;
        Mksymlists( NAME     => $name,
                    DL_VARS  => [ $var1, $var2, $var3 ],
                    DL_FUNCS => { $pkg1 => [ $func1, $func2 ],
                                  $pkg2 => [ $func3 ] } );

    DESCRIPTION

    ExtUtils::Mksymlists produces files used by the linker under some OSs during the creation of shared libraries for dynamic extensions. It is normally called from a MakeMaker-generated Makefile when the extension is built. The linker option file is generated by calling the function Mksymlists, which is exported by default from ExtUtils::Mksymlists. It takes one argument, a list of key-value pairs, in which the following keys are recognized:

    • DLBASE

      This item specifies the name by which the linker knows the extension, which may be different from the name of the extension itself (for instance, some linkers add an '_' to the name of the extension). If it is not specified, it is derived from the NAME attribute. It is presently used only by OS2 and Win32.

    • DL_FUNCS

      This is identical to the DL_FUNCS attribute available via MakeMaker, from which it is usually taken. Its value is a reference to an associative array, in which each key is the name of a package, and each value is a reference to an array of function names which should be exported by the extension. For instance, one might say DL_FUNCS => { Homer::Iliad => [ qw(trojans greeks) ], Homer::Odyssey => [ qw(travellers family suitors) ] }. The function names should be identical to those in the XSUB code; Mksymlists will alter the names written to the linker option file to match the changes made by xsubpp. In addition, if none of the functions in a list begin with the string boot_, Mksymlists will add a bootstrap function for that package, just as xsubpp does. (If a boot_<pkg> function is present in the list, it is passed through unchanged.) If DL_FUNCS is not specified, it defaults to the bootstrap function for the extension specified in NAME.

    • DL_VARS

      This is identical to the DL_VARS attribute available via MakeMaker, and, like DL_FUNCS, it is usually specified via MakeMaker. Its value is a reference to an array of variable names which should be exported by the extension.

    • FILE

      This key can be used to specify the name of the linker option file (minus the OS-specific extension), if for some reason you do not want to use the default value, which is the last word of the NAME attribute (e.g. for Tk::Canvas, FILE defaults to Canvas).

    • FUNCLIST

      This provides an alternate means to specify function names to be exported from the extension. Its value is a reference to an array of function names to be exported by the extension. These names are passed through unaltered to the linker options file. Specifying a value for the FUNCLIST attribute suppresses automatic generation of the bootstrap function for the package. To still create the bootstrap name you have to specify the package name in the DL_FUNCS hash:

          Mksymlists( NAME     => $name,
                      FUNCLIST => [ $func1, $func2 ],
                      DL_FUNCS => { $pkg => [] } );

    • IMPORTS

      This attribute is used to specify names to be imported into the extension. It is currently only used by OS/2 and Win32.

    • NAME

      This gives the name of the extension (e.g. Tk::Canvas) for which the linker option file will be produced.

    When calling Mksymlists, one should always specify the NAME attribute. In most cases, this is all that's necessary. In the case of unusual extensions, however, the other attributes can be used to provide additional information to the linker.

    AUTHOR

    Charles Bailey <bailey@newman.upenn.edu>

    REVISION

    Last revised 14-Feb-1996, for Perl 5.002.

     
    ExtUtils::Packlist

    NAME

    ExtUtils::Packlist - manage .packlist files

    SYNOPSIS

        use ExtUtils::Packlist;
        my ($pl) = ExtUtils::Packlist->new('.packlist');
        $pl->read('/an/old/.packlist');
        my @missing_files = $pl->validate();
        $pl->write('/a/new/.packlist');

        $pl->{'/some/file/name'}++;
        # or
        $pl->{'/some/other/file/name'} = { type => 'file',
                                           from => '/some/file' };

    DESCRIPTION

    ExtUtils::Packlist provides a standard way to manage .packlist files. Functions are provided to read and write .packlist files. The original .packlist format is a simple list of absolute pathnames, one per line. In addition, this package supports an extended format, where as well as a filename each line may contain a list of attributes in the form of a space separated list of key=value pairs. This is used by the installperl script to differentiate between files and links, for example.

    USAGE

    The hash reference returned by the new() function can be used to examine and modify the contents of the .packlist. Items may be added/deleted from the .packlist by modifying the hash. If the value associated with a hash key is a scalar, the entry written to the .packlist by any subsequent write() will be a simple filename. If the value is a hash, the entry written will be the filename followed by the key=value pairs from the hash. Reading back the .packlist will recreate the original entries.

    FUNCTIONS

    • new()

      This takes an optional parameter, the name of a .packlist. If the file exists, it will be opened and the contents of the file will be read. The new() method returns a reference to a hash. This hash holds an entry for each line in the .packlist. In the case of old-style .packlists, the value associated with each key is undef. In the case of new-style .packlists, the value associated with each key is a hash containing the key=value pairs following the filename in the .packlist.

    • read()

      This takes an optional parameter, the name of the .packlist to be read. If no file is specified, the .packlist specified to new() will be read. If the .packlist does not exist, Carp::croak will be called.

    • write()

      This takes an optional parameter, the name of the .packlist to be written. If no file is specified, the .packlist specified to new() will be overwritten.

    • validate()

      This checks that every file listed in the .packlist actually exists. If an argument which evaluates to true is given, any missing files will be removed from the internal hash. The return value is a list of the missing files, which will be empty if they all exist.

    • packlist_file()

      This returns the name of the associated .packlist file
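
    The methods above can be combined into a short round trip, sketched below. The temporary directory and the file names present.txt and absent.txt are assumptions for illustration; one entry is written in the simple format and one in the extended key=value format.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use ExtUtils::Packlist;

my $dir     = tempdir(CLEANUP => 1);
my $present = File::Spec->catfile($dir, 'present.txt');
my $absent  = File::Spec->catfile($dir, 'absent.txt');
my $plfile  = File::Spec->catfile($dir, '.packlist');

# Only one of the two files recorded below actually exists on disk.
open my $fh, '>', $present or die "Cannot write $present: $!";
close $fh or die "Cannot close $present: $!";

# The file does not exist yet, so new() starts with an empty list.
my $pl = ExtUtils::Packlist->new($plfile);
$pl->{$present}++;                     # simple entry: file name only
$pl->{$absent} = { type => 'file' };   # extended entry: key=value pairs
$pl->write();

# Read the list back, inspect an attribute, then drop missing entries.
my $copy    = ExtUtils::Packlist->new($plfile);
my $type    = $copy->{$absent}{type};
my @missing = $copy->validate(1);
my @left    = sort keys %$copy;
```

    Passing a true value to validate() removes the missing entry from the hash, so only the existing file should remain in @left.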

    EXAMPLE

    Here's modrm, a little utility to cleanly remove an installed module.

        #!/usr/local/bin/perl -w

        use strict;
        use IO::Dir;
        use ExtUtils::Packlist;
        use ExtUtils::Installed;

        sub emptydir($) {
            my ($dir) = @_;
            my $dh = IO::Dir->new($dir) || return(0);
            my @count = $dh->read();
            $dh->close();
            return(@count == 2 ? 1 : 0);
        }

        # Find all the installed packages
        print("Finding all installed modules...\n");
        my $installed = ExtUtils::Installed->new();

        foreach my $module (grep(!/^Perl$/, $installed->modules())) {
            my $version = $installed->version($module) || "???";
            print("Found module $module Version $version\n");
            print("Do you want to delete $module? [n] ");
            my $r = <STDIN>; chomp($r);
            if ($r && $r =~ /^y/i) {
                # Remove all the files
                foreach my $file (sort($installed->files($module))) {
                    print("rm $file\n");
                    unlink($file);
                }
                my $pf = $installed->packlist($module)->packlist_file();
                print("rm $pf\n");
                unlink($pf);
                foreach my $dir (sort($installed->directory_tree($module))) {
                    if (emptydir($dir)) {
                        print("rmdir $dir\n");
                        rmdir($dir);
                    }
                }
            }
        }

    AUTHOR

    Alan Burlison <Alan.Burlison@uk.sun.com>

     
    ExtUtils::ParseXS

    NAME

    ExtUtils::ParseXS - converts Perl XS code into C code

    SYNOPSIS

        use ExtUtils::ParseXS qw(process_file);

        process_file( filename => 'foo.xs' );

        process_file( filename     => 'foo.xs',
                      output       => 'bar.c',
                      'C++'        => 1,
                      typemap      => 'path/to/typemap',
                      hiertype     => 1,
                      except       => 1,
                      prototypes   => 1,
                      versioncheck => 1,
                      linenumbers  => 1,
                      optimize     => 1,
                    );

    DESCRIPTION

    ExtUtils::ParseXS will compile XS code into C code by embedding the constructs necessary to let C functions manipulate Perl values and creates the glue necessary to let Perl access those functions. The compiler uses typemaps to determine how to map C function parameters and variables to Perl values.

    The compiler will search for typemap files called typemap. It will use the following search path to find default typemaps, with the rightmost typemap taking precedence.

        ../../../typemap:../../typemap:../typemap:typemap

    EXPORT

    None by default. process_file() may be exported upon request.

    FUNCTIONS

    • process_file()

      This function processes an XS file and sends output to a C file. Named parameters control how the processing is done. The following parameters are accepted:

      • C++

        Adds extern "C" to the C code. Default is false.

      • hiertype

        Retains :: in type names so that C++ hierarchical types can be mapped. Default is false.

      • except

        Adds exception handling stubs to the C code. Default is false.

      • typemap

        Indicates that a user-supplied typemap should take precedence over the default typemaps. A single typemap may be specified as a string, or multiple typemaps can be specified in an array reference, with the last typemap having the highest precedence.

      • prototypes

        Generates prototype code for all xsubs. Default is false.

      • versioncheck

        Makes sure at run time that the object file (derived from the .xs file) and the .pm files have the same version number. Default is true.

      • linenumbers

        Adds #line directives to the C output so error messages will look like they came from the original XS file. Default is true.

      • optimize

        Enables certain optimizations. The only optimization that is currently affected is the use of targets by the output C code (see perlguts). Not optimizing may significantly slow down the generated code, but this is the way xsubpp of 5.005 and earlier operated. Default is to optimize.

      • inout

        Enable recognition of IN, OUT_LIST and INOUT_LIST declarations. Default is true.

      • argtypes

        Enable recognition of ANSI-like descriptions of function signature. Default is true.

      • s

        Maintainer note: I have no clue what this does. Strips function prefixes?

    • errors()

      This function returns the number of [a certain kind of] errors encountered during processing of the XS file.
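
    A minimal use of process_file() might look like the sketch below. The module name Demo, the add() XSUB, and the temporary directory are assumptions for illustration; the sketch also assumes the standard typemap shipped with Perl can be found, since the int arguments rely on it.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use ExtUtils::ParseXS qw(process_file);

my $dir     = tempdir(CLEANUP => 1);
my $xs_file = File::Spec->catfile($dir, 'Demo.xs');
my $c_file  = File::Spec->catfile($dir, 'Demo.c');

# A tiny XS file with one function; Demo and add() are hypothetical.
open my $xs, '>', $xs_file or die "Cannot write $xs_file: $!";
print {$xs} <<'END_XS';
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

MODULE = Demo    PACKAGE = Demo

int
add(a, b)
    int a
    int b
  CODE:
    RETVAL = a + b;
  OUTPUT:
    RETVAL
END_XS
close $xs or die "Cannot close $xs_file: $!";

# Translate the XS source into C, without Perl prototypes for the xsubs.
process_file(
    filename   => $xs_file,
    output     => $c_file,
    prototypes => 0,
);

my $c_written = -e $c_file ? 1 : 0;
```

    The generated Demo.c would still need to be compiled and linked against perl, which is normally left to the Makefile produced by ExtUtils::MakeMaker.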

    AUTHOR

    Based on xsubpp code, written by Larry Wall.

    Maintained by:

    • Ken Williams, <ken@mathforum.org>

    • David Golden, <dagolden@cpan.org>

    • James Keenan, <jkeenan@cpan.org>

    • Steffen Mueller, <smueller@cpan.org>

    COPYRIGHT

    Copyright 2002-2012 by Ken Williams, David Golden and other contributors. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Based on the ExtUtils::xsubpp code by Larry Wall and the Perl 5 Porters, which was released under the same license terms.

    SEE ALSO

    perl, ExtUtils::xsubpp, ExtUtils::MakeMaker, perlxs, perlxstut.

     
    ExtUtils::testlib

    NAME

    ExtUtils::testlib - add blib/* directories to @INC

    SYNOPSIS

        use ExtUtils::testlib;

    DESCRIPTION

    After an extension has been built and before it is installed it may be desirable to test it bypassing make test. By adding

        use ExtUtils::testlib;

    to a test program the intermediate directories used by make are added to @INC.
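
    The effect on @INC can be inspected with a short sketch such as the following, which simply lists the search-path entries that mention blib after the module has been loaded. The variable name @blib_dirs is an assumption for illustration.

```perl
use strict;
use warnings;
use ExtUtils::testlib;

# Collect the @INC entries that point into a blib directory.
my @blib_dirs = grep { /\bblib\b/ } @INC;

print "$_\n" for @blib_dirs;
```

    The directories are added whether or not they exist yet, so the list is only useful once the extension has actually been built in the current directory.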

     
    ExtUtils::MakeMaker::Config

    NAME

    ExtUtils::MakeMaker::Config - Wrapper around Config.pm

    SYNOPSIS

        use ExtUtils::MakeMaker::Config;
        print $Config{installbin};    # or whatever

    DESCRIPTION

    FOR INTERNAL USE ONLY

    A very thin wrapper around Config.pm so MakeMaker is easier to test.

     
    ExtUtils::MakeMaker::FAQ

    NAME

    ExtUtils::MakeMaker::FAQ - Frequently Asked Questions About MakeMaker

    DESCRIPTION

    FAQs, tricks and tips for ExtUtils::MakeMaker.

    Module Installation

    • How do I install a module into my home directory?

      If you're not the Perl administrator you probably don't have permission to install a module to its default location. Then you should install it for your own use into your home directory like so:

          # Non-unix folks, replace ~ with /path/to/your/home/dir
          perl Makefile.PL INSTALL_BASE=~

      This will put modules into ~/lib/perl5, man pages into ~/man and programs into ~/bin.

      To ensure your Perl programs can see these newly installed modules, set your PERL5LIB environment variable to ~/lib/perl5 or tell each of your programs to look in that directory with the following:

          use lib "$ENV{HOME}/lib/perl5";

      or if $ENV{HOME} isn't set and you don't want to set it for some reason, do it the long way.

          use lib "/path/to/your/home/dir/lib/perl5";

    • How do I get MakeMaker and Module::Build to install to the same place?

      Module::Build, as of 0.28, supports two ways to install to the same location as MakeMaker.

      We highly recommend the install_base method. It's the simplest and most closely approximates the expected behavior of an installation prefix.

      1) Use INSTALL_BASE / --install_base

      MakeMaker (as of 6.31) and Module::Build (as of 0.28) both can install to the same locations using the "install_base" concept. See INSTALL_BASE in ExtUtils::MakeMaker for details. To get MM and MB to install to the same location simply set INSTALL_BASE in MM and --install_base in MB to the same location.

          perl Makefile.PL INSTALL_BASE=/whatever
          perl Build.PL --install_base /whatever

      This works most like other languages' behavior when you specify a prefix. We recommend this method.

      2) Use PREFIX / --prefix

      Module::Build 0.28 added support for --prefix which works like MakeMaker's PREFIX.

          perl Makefile.PL PREFIX=/whatever
          perl Build.PL --prefix /whatever

      We highly discourage this method. It should only be used if you know what you're doing and specifically need the PREFIX behavior. The PREFIX algorithm is complicated and focused on matching the system installation.

    • How do I keep from installing man pages?

      Recent versions of MakeMaker will only install man pages on Unix-like operating systems.

      For an individual module:

          perl Makefile.PL INSTALLMAN1DIR=none INSTALLMAN3DIR=none

      If you want to suppress man page installation for all modules you have to reconfigure Perl and tell it 'none' when it asks where to install man pages.

    • How do I use a module without installing it?

      Two ways. One is to build the module normally...

          perl Makefile.PL
          make
          make test

      ...and then set the PERL5LIB environment variable to point at the blib/lib and blib/arch directories.

      The other is to install the module in a temporary location.

          perl Makefile.PL INSTALL_BASE=~/tmp
          make
          make test
          make install

      And then set PERL5LIB to ~/tmp/lib/perl5. This works well when you have multiple modules to work with. It also ensures that the module goes through its full installation process which may modify it.
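To see what setting PERL5LIB actually does, the following minimal sketch starts a child perl with PERL5LIB pointing at a scratch directory and checks that the directory shows up in the child's @INC. The helper name inc_with_perl5lib() and the temporary directory are illustrative assumptions; in practice the directory would be blib/lib, blib/arch or ~/tmp/lib/perl5. The list-form pipe open is assumed to run on a Unix-like system.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the @INC list seen by a child perl started with PERL5LIB set
# to $dir.  inc_with_perl5lib() is a hypothetical helper for this sketch.
sub inc_with_perl5lib {
    my ($dir) = @_;
    local $ENV{PERL5LIB} = $dir;

    # List-form pipe open avoids shell quoting; $^X is the running perl.
    open my $fh, '-|', $^X, '-le', 'print for @INC'
        or die "Cannot run $^X: $!";
    chomp( my @inc = <$fh> );
    close $fh;
    return @inc;
}

# The scratch directory stands in for blib/lib or ~/tmp/lib/perl5.
my $dir   = tempdir( CLEANUP => 1 );
my @inc   = inc_with_perl5lib($dir);
my $found = grep { $_ eq $dir } @inc;

print $found ? "$dir is in \@INC\n" : "$dir is NOT in \@INC\n";
```

Inside a single script, use lib with the same directories has a comparable effect without touching the environment.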

    • PREFIX vs INSTALL_BASE from Module::Build::Cookbook

      The behavior of PREFIX is complicated and depends closely on how your Perl is configured. The resulting installation locations will vary from machine to machine and even different installations of Perl on the same machine. Because of this, it's difficult to document where prefix will place your modules.

      In contrast, INSTALL_BASE has predictable, easy to explain installation locations. Now that Module::Build and MakeMaker both have INSTALL_BASE there is little reason to use PREFIX other than to preserve your existing installation locations. If you are starting a fresh Perl installation we encourage you to use INSTALL_BASE. If you have an existing installation installed via PREFIX, consider moving it to an installation structure matching INSTALL_BASE and using that instead.

    Common errors and problems

    • "No rule to make target `/usr/lib/perl5/CORE/config.h', needed by `Makefile'"

      Just what it says, you're missing that file. MakeMaker uses it to determine if perl has been rebuilt since the Makefile was made. It's a bit of a bug that it halts installation.

      Some operating systems don't ship the CORE directory with their base perl install. To solve the problem, you likely need to install a perl development package such as perl-devel (CentOS, Fedora and other Redhat systems) or perl (Ubuntu and other Debian systems).

    Philosophy and History

    • Why not just use <insert other build config tool here>?

      Why did MakeMaker reinvent the build configuration wheel? Why not just use autoconf or automake or ppm or Ant or ...

      There are many reasons, but the major one is cross-platform compatibility.

      Perl is one of the most ported pieces of software ever. It works on operating systems I've never even heard of (see perlport for details). It needs a build tool that can work on all those platforms and with any wacky C compilers and linkers they might have.

      No such build tool exists. Even make itself has wildly different dialects. So we have to build our own.

    • What is Module::Build and how does it relate to MakeMaker?

      Module::Build is a project by Ken Williams to supplant MakeMaker. Its primary advantages are:

      • pure perl. no make, no shell commands
      • easier to customize
      • cleaner internals
      • less cruft

      Module::Build is the official heir apparent to MakeMaker and we encourage people to work on M::B rather than spending time adding features to MakeMaker.

    Module Writing

    • How do I keep my $VERSION up to date without resetting it manually?

      Often you want to manually set the $VERSION in the main module distribution because this is the version that everybody sees on CPAN and maybe you want to customize it a bit. But for all the other modules in your dist, $VERSION is really just bookkeeping and all that's important is it goes up every time the module is changed. Doing this by hand is a pain and you often forget.

      The simplest way to do it automatically is to use your version control system's revision number (you are using version control, right?).

      In CVS, RCS and SVN you use $Revision$ (see the documentation of your version control system for details). Every time the file is checked in the $Revision$ will be updated, updating your $VERSION.

      SVN uses a simple integer for $Revision$ so you can adapt it for your $VERSION like so:

      1. ($VERSION) = q$Revision$ =~ /(\d+)/;

      In CVS and RCS version 1.9 is followed by 1.10. Since CPAN compares version numbers numerically we use a sprintf() to convert 1.9 to 1.009 and 1.10 to 1.010 which compare properly.

      1. $VERSION = sprintf "%d.%03d", q$Revision$ =~ /(\d+)\.(\d+)/g;

      If branches are involved (e.g. $Revision: 1.5.3.4$) it's a little more complicated.

      1. # must be all on one line or MakeMaker will get confused.
      2. $VERSION = do { my @r = (q$Revision$ =~ /\d+/g); sprintf "%d."."%03d" x $#r, @r };

      In SVN, $Revision$ should be the same for every file in the project so they would all have the same $VERSION. CVS and RCS have a different $Revision$ per file so each file will have a different $VERSION. Distributed version control systems, such as SVK, may have a different $Revision$ based on who checks out the file, leading to a different $VERSION on each machine! Finally, some distributed version control systems, such as darcs, have no concept of revision number at all.
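The sprintf() conversion for CVS and RCS revisions can be tried outside a checked-in file with a small sketch like the one below. The helper name version_from_revision() is hypothetical, and the keyword strings are hard-coded as they would look after the version control system expands them.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an expanded $Revision$ keyword into a
# version string, using the same sprintf format as the branch-aware
# one-liner above.
sub version_from_revision {
    my ($keyword) = @_;
    my @r = $keyword =~ /\d+/g;    # all numeric components, in order
    return sprintf "%d." . "%03d" x $#r, @r;
}

print version_from_revision('$Revision: 1.9 $'),     "\n";   # 1.009
print version_from_revision('$Revision: 1.10 $'),    "\n";   # 1.010
print version_from_revision('$Revision: 1.5.3.4 $'), "\n";   # 1.005003004
```

Because every component after the first is zero-padded to three digits, 1.009 sorts below 1.010 when compared numerically, which is the property CPAN needs.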

    • What's this META.yml thing and how did it get in my MANIFEST?!

      META.yml is a module meta-data file pioneered by Module::Build and automatically generated as part of the 'distdir' target (and thus 'dist'). See Module Meta-Data in ExtUtils::MakeMaker.

      To shut off its generation, pass the NO_META flag to WriteMakefile() .

    • How do I delete everything not in my MANIFEST?

      Some folks are surprised that make distclean does not delete everything not listed in their MANIFEST (thus making a clean distribution) but only tells them what they need to delete. This is done because it is considered too dangerous. While developing your module you might write a new file, not add it to the MANIFEST, then run a distclean and be sad because your new work was deleted.

      If you really want to do this, you can use ExtUtils::Manifest::maniread() to read the MANIFEST and File::Find to delete the files. But you have to be careful. Here's a script to do that. Use at your own risk. Have fun blowing holes in your foot.

      #!/usr/bin/perl -w
      use strict;
      use File::Spec;
      use File::Find;
      use ExtUtils::Manifest qw(maniread);

      # Canonicalize the MANIFEST paths so they compare equal to
      # the canonicalized paths File::Find reports below.
      my %manifest = map  {( File::Spec->canonpath($_) => 1 )}
                     keys %{ maniread() };

      if( !keys %manifest ) {
          print "No files found in MANIFEST. Stopping.\n";
          exit;
      }

      find({
            wanted   => sub {
                my $path = File::Spec->canonpath($_);
                return unless -f $path;
                return if exists $manifest{ $path };
                print "unlink $path\n";
                unlink $path;
            },
            no_chdir => 1
           },
           "."
      );

    • Which tar should I use on Windows?

      We recommend ptar from Archive::Tar, version 1.66 or newer, with the '-C' option.

    • Which zip should I use on Windows for '[nd]make zipdist'?

      We recommend InfoZIP: http://www.info-zip.org/Zip.html

    XS

    • How do I prevent "object version X.XX does not match bootstrap parameter Y.YY" errors?

      XS code is very sensitive to the module version number and will complain if the version number in your Perl module doesn't match. If you change your module's version # without rerunning Makefile.PL the old version number will remain in the Makefile, causing the XS code to be built with the wrong number.

      To avoid this, you can force the Makefile to be rebuilt whenever you change the module containing the version number by adding this to your WriteMakefile() arguments.

      1. depend => { '$(FIRST_MAKEFILE)' => '$(VERSION_FROM)' }
    • How do I make two or more XS files coexist in the same directory?

      Sometimes you need to have two or more XS files in the same package. One way to go is to put them into separate directories, but sometimes this is not the most suitable solution. The following technique allows you to put two (or more) XS files in the same directory.

      Let's assume that we have a package Cool::Foo, which includes Cool::Foo and Cool::Bar modules each having a separate XS file. First we use the following Makefile.PL:

      1. use ExtUtils::MakeMaker;
      2. WriteMakefile(
      3. NAME => 'Cool::Foo',
      4. VERSION_FROM => 'Foo.pm',
      5. OBJECT => q/$(O_FILES)/,
      6. # ... other attrs ...
      7. );

      Notice the OBJECT attribute. MakeMaker generates the following variables in Makefile:

      # Handy lists of source code files:
      XS_FILES= Bar.xs \
              Foo.xs
      C_FILES = Bar.c \
              Foo.c
      O_FILES = Bar.o \
              Foo.o

      Therefore we can use the O_FILES variable to tell MakeMaker to link these objects into the shared library.

      That's pretty much it. Now write Foo.pm and Foo.xs, Bar.pm and Bar.xs, where Foo.pm bootstraps the shared library and Bar.pm simply loads Foo.pm.

      The only issue left is how to bootstrap Bar.xs. This is done from Foo.xs:

      1. MODULE = Cool::Foo PACKAGE = Cool::Foo
      2. BOOT:
      3. # boot the second XS file
      4. boot_Cool__Bar(aTHX_ cv);

      If you have more than two files, this is the place where you should boot extra XS files from.

      The following four files sum up all the details discussed so far.

      Foo.pm:
      -------
      package Cool::Foo;

      require DynaLoader;

      our @ISA = qw(DynaLoader);
      our $VERSION = '0.01';
      bootstrap Cool::Foo $VERSION;

      1;

      Bar.pm:
      -------
      package Cool::Bar;

      use Cool::Foo; # bootstraps Bar.xs

      1;

      Foo.xs:
      -------
      #include "EXTERN.h"
      #include "perl.h"
      #include "XSUB.h"

      MODULE = Cool::Foo  PACKAGE = Cool::Foo

      BOOT:
      # boot the second XS file
      boot_Cool__Bar(aTHX_ cv);

      MODULE = Cool::Foo  PACKAGE = Cool::Foo  PREFIX = cool_foo_

      void
      cool_foo_perl_rules()

          CODE:
          fprintf(stderr, "Cool::Foo says: Perl Rules\n");

      Bar.xs:
      -------
      #include "EXTERN.h"
      #include "perl.h"
      #include "XSUB.h"

      MODULE = Cool::Bar  PACKAGE = Cool::Bar  PREFIX = cool_bar_

      void
      cool_bar_perl_rules()

          CODE:
          fprintf(stderr, "Cool::Bar says: Perl Rules\n");

      And of course a very basic test:

      1. t/cool.t:
      2. --------
      3. use Test;
      4. BEGIN { plan tests => 1 };
      5. use Cool::Foo;
      6. use Cool::Bar;
      7. Cool::Foo::perl_rules();
      8. Cool::Bar::perl_rules();
      9. ok 1;

      This tip has been brought to you by Nick Ing-Simmons and Stas Bekman.

    PATCHING

    If you have a question you'd like to see added to the FAQ (whether or not you have the answer) please send it to makemaker@perl.org.

    AUTHOR

    The denizens of makemaker@perl.org.

    SEE ALSO

    ExtUtils::MakeMaker

     

    ExtUtils::MakeMaker::Tutorial

    NAME

    ExtUtils::MakeMaker::Tutorial - Writing a module with MakeMaker

    SYNOPSIS

    1. use ExtUtils::MakeMaker;
    2. WriteMakefile(
    3. NAME => 'Your::Module',
    4. VERSION_FROM => 'lib/Your/Module.pm'
    5. );

    DESCRIPTION

    This is a short tutorial on writing a simple module with MakeMaker. It's really not that hard.

    The Mantra

    MakeMaker modules are installed using this simple mantra

    1. perl Makefile.PL
    2. make
    3. make test
    4. make install

    There are lots more commands and options, but the above will do it.

    The Layout

    The basic files in a module look something like this.

    1. Makefile.PL
    2. MANIFEST
    3. lib/Your/Module.pm

    That's all that's strictly necessary. There are additional files you might want:

    1. lib/Your/Other/Module.pm
    2. t/some_test.t
    3. t/some_other_test.t
    4. Changes
    5. README
    6. INSTALL
    7. MANIFEST.SKIP
    8. bin/some_program
    • Makefile.PL

      When you run Makefile.PL, it makes a Makefile. That's the whole point of MakeMaker. The Makefile.PL is a simple program which loads ExtUtils::MakeMaker and runs the WriteMakefile() function to generate a Makefile.

      Here's an example of what you need for a simple module:

      1. use ExtUtils::MakeMaker;
      2. WriteMakefile(
      3. NAME => 'Your::Module',
      4. VERSION_FROM => 'lib/Your/Module.pm'
      5. );

      NAME is the top-level namespace of your module. VERSION_FROM is the file which contains the $VERSION variable for the entire distribution. Typically this is the same as your top-level module.
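To check which version MakeMaker would pick up through VERSION_FROM, one option is the parse_version() method of the MM class, which ExtUtils::MakeMaker loads. The sketch below writes a throwaway module to a temporary file and asks MM for its version; the module name and the version number are placeholders.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use ExtUtils::MakeMaker;    # also loads the MM class

# Write a throwaway module, standing in for lib/Your/Module.pm.
my ( $fh, $file ) = tempfile( SUFFIX => '.pm', UNLINK => 1 );
print {$fh} "package Your::Module;\n",
            "our \$VERSION = '1.23';\n",
            "1;\n";
close $fh or die "Cannot close $file: $!";

# parse_version() scans the file for the first $VERSION assignment.
my $version = MM->parse_version($file);
print "VERSION_FROM would yield: $version\n";
```

The same check can be run against a real module from the command line by passing its path instead of the temporary file.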

    • MANIFEST

      A simple listing of all the files in your distribution.

      1. Makefile.PL
      2. MANIFEST
      3. lib/Your/Module.pm

      File paths in a MANIFEST always use Unix conventions (ie. /) even if you're not on Unix.

      You can write this by hand or generate it with 'make manifest'.

      See ExtUtils::Manifest for more details.

    • lib/

      This is the directory where the .pm and .pod files you wish to have installed go. They are laid out according to namespace. So Foo::Bar is lib/Foo/Bar.pm.
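The namespace-to-path rule is mechanical, as the following minimal sketch shows. The helper name module_to_path() is made up for illustration, and the forward slashes follow the Unix-style layout used in this tutorial.

```perl
use strict;
use warnings;

# Hypothetical helper: map a module name to its file under lib/.
sub module_to_path {
    my ($module) = @_;
    ( my $path = $module ) =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar
    return "lib/$path.pm";
}

print module_to_path('Foo::Bar'),            "\n";   # lib/Foo/Bar.pm
print module_to_path('Your::Other::Module'), "\n";   # lib/Your/Other/Module.pm
```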

    • t/

      Tests for your modules go here. Each test filename ends with a .t, for example t/foo.t. 'make test' will run these tests. The directory is flat: you cannot, for example, have t/foo/bar.t run by 'make test'.

      Tests are run from the top level of your distribution. So inside a test you would refer to ./lib to enter the lib directory, for example.

    • Changes

      A log of changes you've made to this module. The layout is free-form. Here's an example:

      1. 1.01 Fri Apr 11 00:21:25 PDT 2003
      2. - thing() does some stuff now
      3. - fixed the wiggy bug in withit()
      4. 1.00 Mon Apr 7 00:57:15 PDT 2003
      5. - "Rain of Frogs" now supported
    • README

      A short description of your module, what it does, why someone would use it and its limitations. CPAN automatically pulls your README file out of the archive and makes it available to CPAN users. It is the first thing they will read to decide if your module is right for them.

    • INSTALL

      Instructions on how to install your module along with any dependencies. Suggested information to include here:

      1. any extra modules required for use
      2. the minimum version of Perl required
      3. whether it only works on certain operating systems
    • MANIFEST.SKIP

      A file full of regular expressions to exclude when using 'make manifest' to generate the MANIFEST. These regular expressions are checked against each file path found in the distribution (so you're matching against "t/foo.t" not "foo.t").

      Here's a sample:

      1. ~$ # ignore emacs and vim backup files
      2. .bak$ # ignore manual backups
      3. \# # ignore CVS old revision files and emacs temp files

      Since # can be used for comments, # must be escaped.

      MakeMaker comes with a default MANIFEST.SKIP to avoid things like version control directories and backup files. Specifying your own will override this default.
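The matching rule (each regular expression is tried against the whole file path) can be illustrated with the sample patterns above. This is only a sketch of the idea, not how ExtUtils::Manifest parses MANIFEST.SKIP; the helper is_skipped() and the file names are assumptions.

```perl
use strict;
use warnings;

# The sample patterns from above, with their comments removed.
my @skip = map { qr/$_/ } ( '~$', '.bak$', '\#' );

# Hypothetical helper: true if any pattern matches the full path.
sub is_skipped {
    my ($path) = @_;
    return scalar grep { $path =~ $_ } @skip;
}

for my $path ( 't/foo.t', 't/foo.t~', 'lib/Your/Module.pm.bak', '#scratch#' ) {
    printf "%-24s %s\n", $path, is_skipped($path) ? 'skipped' : 'kept';
}
```

Note that the patterns are unanchored at the start, so they match anywhere in the path unless you add ^ yourself.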

    • bin/

    SEE ALSO

    perlmodstyle gives stylistic help writing a module.

    perlnewmod gives more information about how to write a module.

    There are modules to help you through the process of writing a module: ExtUtils::ModuleMaker, Module::Install, PAR

     

    ExtUtils::Constant::Base

    NAME

    ExtUtils::Constant::Base - base class for ExtUtils::Constant objects

    SYNOPSIS

    1. require ExtUtils::Constant::Base;
    2. @ISA = 'ExtUtils::Constant::Base';

    DESCRIPTION

    ExtUtils::Constant::Base provides a base implementation of methods to generate C code to give fast constant value lookup by named string. Currently it's mostly used by ExtUtils::Constant::XS, which generates the lookup code for the constant() subroutine found in many XS modules.

    USAGE

    ExtUtils::Constant::Base exports no subroutines. The following methods are available

    • header

      A method returning a scalar containing definitions needed, typically for a C header file.

    • memEQ_clause args_hashref

      A method to return a suitable C if statement to check whether name is equal to the C variable name . If checked_at is defined, then it is used to avoid memEQ for short names, or to generate a comment to highlight the position of the character in the switch statement.

      If checked_at is a reference to a scalar, then instead it gives the characters pre-checked at the beginning (and the number of chars by which the C variable name has been advanced). These need to be chopped from the front of name.

    • dump_names arg_hashref, ITEM...

      An internal function to generate the embedded perl code that will regenerate the constant subroutines. default_type, types and ITEMs are the same as for C_constant. indent is treated as number of spaces to indent by. If declare_types is true a $types is always declared in the perl code generated, if defined and false never declared, and if undefined $types is only declared if the values in types as passed in cannot be inferred from default_types and the ITEMs.

    • assign arg_hashref, VALUE...

      A method to return a suitable assignment clause. If type is aggregate (e.g. PVN expects both pointer and length) then there should be multiple VALUEs for the components. pre and post, if defined, give snippets of C code to precede and follow the assignment. pre will be at the start of a block, so variables may be defined in it.

    • return_clause arg_hashref, ITEM

      A method to return a suitable #ifdef clause. ITEM is a hashref (as passed to C_constant and match_clause). indent is the number of spaces to indent, defaulting to 6.

    • switch_clause arg_hashref, NAMELEN, ITEMHASH, ITEM...

      An internal method to generate a suitable switch clause, called by C_constant. ITEMs are in the hash ref format as given in the description of C_constant, and must all have names of the same length, given by NAMELEN. ITEMHASH is a reference to a hash, keyed by name, values being the hashrefs in the ITEM list. (No parameters are modified, and there can be keys in the ITEMHASH that are not in the list of ITEMs without causing problems - the hash is passed in to save generating it afresh for each call).

    • params WHAT

      An "internal" method, subject to change, currently called to allow an overriding class to cache information that will then be passed into all the *param* calls. (Yes, having to read the source to make sense of this is considered a known bug). WHAT should be a hashref of types the constant function will return. In ExtUtils::Constant::XS this method is used to return a hashref keyed IV NV PV SV to show which combination of pointers will be needed in the C argument list generated by C_constant_other_params_definition and C_constant_other_params.

    • dogfood arg_hashref, ITEM...

      An internal function to generate the embedded perl code that will regenerate the constant subroutines. Parameters are the same as for C_constant.

      Currently the base class does nothing and returns an empty string.

    • normalise_items args, default_type, seen_types, seen_items, ITEM...

      Convert the items to a normalised form. For 8 bit and Unicode values converts the item to an array of 1 or 2 items, both 8 bit and UTF-8 encoded.

    • C_constant arg_hashref, ITEM...

      A function that returns a list of C subroutine definitions that return the value and type of constants when passed the name by the XS wrapper. ITEM... gives a list of constant names. Each can either be a string, which is taken as a C macro name, or a reference to a hash with the following keys

      • name

        The name of the constant, as seen by the perl code.

      • type

        The type of the constant (IV, NV etc)

      • value

        A C expression for the value of the constant, or a list of C expressions if the type is aggregate. This defaults to the name if not given.

      • macro

        The C pre-processor macro to use in the #ifdef. This defaults to the name, and is mainly used if value is an enum. If a reference to an array is passed then the first element is used in place of the #ifdef line, and the second element in place of the #endif. This allows pre-processor constructions such as

        1. #if defined (foo)
        2. #if !defined (bar)
        3. ...
        4. #endif
        5. #endif

        to be used to determine if a constant is to be defined.

        A "macro" 1 signals that the constant is always defined, so the #if /#endif test is omitted.

      • default

        Default value to use (instead of croaking with "your vendor has not defined...") to return if the macro isn't defined. Specify a reference to an array with type followed by value(s).

      • pre

        C code to use before the assignment of the value of the constant. This allows you to use temporary variables to extract a value from part of a struct and return this as value. This C code is placed at the start of a block, so you can declare variables in it.

      • post

        C code to place between the assignment of value (to a temporary) and the return from the function. This allows you to clear up anything in pre. Rarely needed.

      • def_pre
      • def_post

        Equivalents of pre and post for the default value.

      • utf8

        Generated internally. Is zero or undefined if name is 7 bit ASCII, "no" if the name is 8 bit (and so should only match if SvUTF8() is false), "yes" if the name is utf8 encoded.

        The internals automatically clone any name with characters 128-255 but none 256+ (ie one that could be either in bytes or utf8) into a second entry which is utf8 encoded.

      • weight

        Optional sorting weight for names, to determine the order of linear testing when multiple names fall in the same case of a switch clause. Higher comes earlier, undefined defaults to zero.

      In the argument hashref, package is the name of the package, and is only used in comments inside the generated C code. subname defaults to constant if undefined.

      default_type is the type returned by ITEMs that don't specify their type. It defaults to the value of default_type(). types should be given either as a comma separated list of types that the C subroutine subname will generate or as a reference to a hash. default_type will be added to the list if not present, as will any types given in the list of ITEMs. The resultant list should be the same list of types that XS_constant is given. [Otherwise XS_constant and C_constant may differ in the number of parameters to the constant function. indent is currently unused and ignored. In future it may be used to pass in information used to change the C indentation style used.] The best way to maintain consistency is to pass in a hash reference and let this function update it.

      breakout governs when child functions of subname are generated. If there are breakout or more ITEMs with the same length of name, then the code to switch between them is placed into a function named subname_len, for example constant_5 for names 5 characters long. The default breakout is 3. A single ITEM is always inlined.

    BUGS

    Not everything is documented yet.

    Probably others.

    AUTHOR

    Nicholas Clark <nick@ccl4.org> based on the code in h2xs by Larry Wall and others

     

    ExtUtils::Constant::Utils

    NAME

    ExtUtils::Constant::Utils - helper functions for ExtUtils::Constant

    SYNOPSIS

    1. use ExtUtils::Constant::Utils qw (C_stringify);
    2. $C_code = C_stringify $stuff;

    DESCRIPTION

    ExtUtils::Constant::Utils packages up utility subroutines used by ExtUtils::Constant, ExtUtils::Constant::Base and derived classes. All its functions are explicitly exportable.

    USAGE

    • C_stringify NAME

      A function which returns a 7 bit ASCII correctly \ escaped version of the string passed suitable for C's "" or ''. It will die if passed Unicode characters.

    • perl_stringify NAME

      A function which returns a 7 bit ASCII correctly \ escaped version of the string passed suitable for a perl "" string.
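As a rough illustration of the two functions, the sketch below escapes a string containing double quotes and a newline so that it can be pasted between quotes in generated C or Perl source. The sample string is arbitrary.

```perl
use strict;
use warnings;
use ExtUtils::Constant::Utils qw(C_stringify perl_stringify);

# A string with characters that need escaping inside a literal.
my $raw = qq{say "hi"\n};

# Escape for use between C double quotes.
my $c_text = C_stringify($raw);
print qq{C literal:    "$c_text"\n};

# Escape for use between Perl double quotes.
my $perl_text = perl_stringify($raw);
print qq{Perl literal: "$perl_text"\n};
```

Both functions return only the escaped body, so the surrounding quote characters are added by the caller.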

    AUTHOR

    Nicholas Clark <nick@ccl4.org> based on the code in h2xs by Larry Wall and others

     

    ExtUtils::Constant::XS

    NAME

    ExtUtils::Constant::XS - generate C code for XS modules' constants.

    SYNOPSIS

    1. require ExtUtils::Constant::XS;

    DESCRIPTION

    ExtUtils::Constant::XS overrides ExtUtils::Constant::Base to generate C code for XS modules' constants.

    BUGS

    Nothing is documented.

    Probably others.

    AUTHOR

    Nicholas Clark <nick@ccl4.org> based on the code in h2xs by Larry Wall and others

     

    ExtUtils::Command::MM

    NAME

    ExtUtils::Command::MM - Commands for the MM's to use in Makefiles

    SYNOPSIS

    1. perl "-MExtUtils::Command::MM" -e "function" "--" arguments...

    DESCRIPTION

    FOR INTERNAL USE ONLY! The interface is not stable.

    ExtUtils::Command::MM encapsulates code which would otherwise have to be done with large "one" liners.

    Any $(FOO) used in the examples are make variables, not Perl.

    • test_harness
      1. test_harness($verbose, @test_libs);

      Runs the tests on @ARGV via Test::Harness passing through the $verbose flag. Any @test_libs will be unshifted onto the test's @INC.

      @test_libs are run in alphabetical order.

    • pod2man
      1. pod2man( '--option=value',
      2. $podfile1 => $manpage1,
      3. $podfile2 => $manpage2,
      4. ...
      5. );
      6. # or args on @ARGV

      pod2man() is a function performing most of the duties of the pod2man program. Its arguments are exactly the same as pod2man as of 5.8.0 with the addition of:

      1. --perm_rw octal permission to set the resulting manpage to

      And the removal of:

      1. --verbose/-v
      2. --help/-h

      If no arguments are given to pod2man it will read from @ARGV.

      If Pod::Man is unavailable, this function will warn and return undef.

    • warn_if_old_packlist
      1. perl "-MExtUtils::Command::MM" -e warn_if_old_packlist <somefile>

      Displays a warning that an old packlist file was found. Reads the filename from @ARGV.

    • perllocal_install
      1. perl "-MExtUtils::Command::MM" -e perllocal_install
      2. <type> <module name> <key> <value> ...
      3. # VMS only, key|value pairs come on STDIN
      4. perl "-MExtUtils::Command::MM" -e perllocal_install
      5. <type> <module name> < <key>|<value> ...

      Prints a fragment of POD suitable for appending to perllocal.pod. Arguments are read from @ARGV.

      'type' is the type of what you're installing. Usually 'Module'.

      'module name' is simply the name of your module. (Foo::Bar)

      Key/value pairs are extra information about the module. Fields include:

      installed into   which directory your module was put into
      LINKTYPE         dynamic or static linking
      VERSION          module version number
      EXE_FILES        any executables installed, in a space-separated
                       list

    • uninstall
      1. perl "-MExtUtils::Command::MM" -e uninstall <packlist>

      A wrapper around ExtUtils::Install::uninstall(). Warns that uninstallation is deprecated and doesn't actually perform the uninstallation.

     

    ExtUtils::CBuilder::Platform::Windows

    NAME

    ExtUtils::CBuilder::Platform::Windows - Builder class for Windows platforms

    DESCRIPTION

    This module implements the Windows-specific parts of ExtUtils::CBuilder. Most of the Windows-specific stuff has to do with compiling and linking C code. Currently we support the 3 compilers perl itself supports: MSVC, BCC, and GCC.

    This module inherits from ExtUtils::CBuilder::Base , so any functionality not implemented here will be implemented there. The interfaces are defined by the ExtUtils::CBuilder documentation.

    AUTHOR

    Ken Williams <ken@mathforum.org>

    Most of the code here was written by Randy W. Sims <RandyS@ThePierianSpring.org>.

    SEE ALSO

    perl(1), ExtUtils::CBuilder(3), ExtUtils::MakeMaker(3)

     

    Exporter::Heavy

    NAME

    Exporter::Heavy - Exporter guts

    SYNOPSIS

    (internal use only)

    DESCRIPTION

    No user-serviceable parts inside.

     

    Encode::Alias

    NAME

    Encode::Alias - alias definitions to encodings

    SYNOPSIS

    1. use Encode;
    2. use Encode::Alias;
    3. define_alias( "newName" => ENCODING);
    4. define_alias( qr/.../ => ENCODING);
    5. define_alias( sub { return ENCODING if ...; } );

    DESCRIPTION

    Allows newName to be used as an alias for ENCODING. ENCODING may be either the name of an encoding or an encoding object (as described in Encode).

    Currently the first argument to define_alias() can be specified in the following ways:

    • As a simple string.
    • As a qr// compiled regular expression, e.g.:
      1. define_alias( qr/^iso8859-(\d+)$/i => '"iso-8859-$1"' );

      In this case, if ENCODING is not a reference, it is eval-ed in order to allow $1 etc. to be substituted. The example is one way to alias names as used in X11 fonts to the MIME names for the iso-8859-* family. Note the double quotes inside the single quotes.

      (You don't have to do this yourself, because this example is predefined.)

      If you are using a regex here, you have to use the quotes as shown or it won't work. Also note that regex handling is tricky even for the experienced. Use this feature with caution.

    • As a code reference, e.g.:
          define_alias( sub { shift =~ /^iso8859-(\d+)$/i ? "iso-8859-$1" : undef } );

      The same effect as the example above in a different way. The coderef takes the alias name as an argument and returns a canonical name on success or undef if not. Note the second argument is ignored if provided. Use this with even more caution than the regex version.
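    The string and regex forms described above can be tried out with a short script. This is a minimal sketch, not part of the original documentation: the alias names my-latin1 and my8859-N are made up for illustration, and the script assumes a stock Encode installation in which iso-8859-2 is available through Encode::Byte.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);
use Encode::Alias;    # exports define_alias by default

# Plain string alias ('my-latin1' is a hypothetical name).
define_alias( 'my-latin1' => 'iso-8859-1' );

# Regex alias ('my8859-N' is hypothetical). Note the double quotes inside
# the single quotes, so that $1 is substituted when the alias is resolved.
define_alias( qr/^my8859-(\d+)$/i => '"iso-8859-$1"' );

# find_encoding() returns an encoding object; ->name is its canonical name.
my $plain = find_encoding('my-latin1')->name;
my $regex = find_encoding('my8859-2')->name;

print "my-latin1 resolves to $plain\n";
print "my8859-2 resolves to $regex\n";
```

    Looking up the object with find_encoding() and asking for its name is a convenient way to check what an alias actually resolves to before relying on it in encode() or decode().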

    Changes in code reference aliasing

    As of Encode 1.87, the older form

        define_alias( sub { return /^iso8859-(\d+)$/i ? "iso-8859-$1" : undef } );

    no longer works.

    Encode up to 1.86 internally used "local $_" to implement this older form. But consider the code below:

        use Encode;
        $_ = "eeeee";
        while (/(e)/g) {
            my $utf = decode('aliased-encoding-name', $1);
            print "position:", pos, "\n";
        }

    With Encode 1.86 and earlier, this fails because of "local $_".

    Alias overloading

    You can override predefined aliases by simply applying define_alias(). The new alias is always evaluated first, and when necessary, define_alias() flushes the internal cache to make the new definition available.

        # redirect SHIFT_JIS to MS/IBM Code Page 932, which is a
        # superset of SHIFT_JIS
        define_alias( qr/shift.*jis$/i => '"cp932"' );
        define_alias( qr/sjis$/i => '"cp932"' );

    If you want to zap all predefined aliases, you can use

        Encode::Alias->undef_aliases;

    to do so. And

        Encode::Alias->init_aliases;

    gets the factory settings back.

    Note that define_alias() will not be able to override the canonical name of encodings. Encodings are first looked up by canonical name before potential aliases are tried.
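    The lookup order described above (canonical names first, aliases second) can be observed directly. The following is a small sketch for illustration only; it deliberately tries to redirect the built-in canonical name iso-8859-1, which is not something you would do in real code.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);
use Encode::Alias;    # exports define_alias by default

# Attempt to redirect a canonical name to another encoding.
define_alias( 'iso-8859-1' => 'utf8' );

# The canonical name is found before any alias is consulted,
# so the alias defined above has no effect on this lookup.
my $name = find_encoding('iso-8859-1')->name;

print "iso-8859-1 still resolves to $name\n";
```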

    SEE ALSO

    Encode, Encode::Supported

     

    Encode::Byte

    NAME

    Encode::Byte - Single Byte Encodings

    SYNOPSIS

        use Encode qw/encode decode/;
        $greek = encode("iso-8859-7", $utf8); # loads Encode::Byte implicitly
        $utf8 = decode("iso-8859-7", $greek); # ditto

    ABSTRACT

    This module implements various single byte encodings. For most cases it uses \x80-\xff (upper half) to map non-ASCII characters. Encodings supported are as follows.

        Canonical              Alias            Description
        --------------------------------------------------------------------
        # ISO 8859 series
        (iso-8859-1 is built in)
        iso-8859-2             latin2           [ISO]
        iso-8859-3             latin3           [ISO]
        iso-8859-4             latin4           [ISO]
        iso-8859-5                              [ISO]
        iso-8859-6                              [ISO]
        iso-8859-7                              [ISO]
        iso-8859-8                              [ISO]
        iso-8859-9             latin5           [ISO]
        iso-8859-10            latin6           [ISO]
        iso-8859-11
        (iso-8859-12 is nonexistent)
        iso-8859-13            latin7           [ISO]
        iso-8859-14            latin8           [ISO]
        iso-8859-15            latin9           [ISO]
        iso-8859-16            latin10          [ISO]

        # Cyrillic
        koi8-f
        koi8-r                 cp878            [RFC1489]
        koi8-u                                  [RFC2319]

        # Vietnamese
        viscii

        # all cp* are also available as ibm-*, ms-*, and windows-*
        # also see http://msdn.microsoft.com/en-us/library/aa752010%28VS.85%29.aspx
        cp424
        cp437
        cp737
        cp775
        cp850
        cp852
        cp855
        cp856
        cp857
        cp860
        cp861
        cp862
        cp863
        cp864
        cp865
        cp866
        cp869
        cp874
        cp1006
        cp1250                 WinLatin2
        cp1251                 WinCyrillic
        cp1252                 WinLatin1
        cp1253                 WinGreek
        cp1254                 WinTurkish
        cp1255                 WinHebrew
        cp1256                 WinArabic
        cp1257                 WinBaltic
        cp1258                 WinVietnamese

        # Macintosh
        # Also see http://developer.apple.com/technotes/tn/tn1150.html
        MacArabic
        MacCentralEurRoman
        MacCroatian
        MacCyrillic
        MacFarsi
        MacGreek
        MacHebrew
        MacIcelandic
        MacRoman
        MacRomanian
        MacRumanian
        MacSami
        MacThai
        MacTurkish
        MacUkrainian

        # More vendor encodings
        AdobeStandardEncoding
        nextstep
        hp-roman8

    DESCRIPTION

    To find how to use this module in detail, see Encode.
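    As a concrete illustration of the "upper half" mapping mentioned in the abstract, the sketch below encodes one Greek letter with iso-8859-7 and inspects the resulting byte. The variable names are chosen for this example only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $alpha  = "\x{03B1}";                      # GREEK SMALL LETTER ALPHA
my $octets = encode("iso-8859-7", $alpha);    # loads Encode::Byte implicitly
my $back   = decode("iso-8859-7", $octets);

# A single-byte encoding yields exactly one octet per character.
printf "%d byte(s), value 0x%02X\n", length($octets), ord($octets);
```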

    SEE ALSO

    Encode

     

    Encode::CJKConstants

    NAME

    Encode::CJKConstants.pm -- Internally used by Encode::??::ISO_2022_*


    Encode::CN

    NAME

    Encode::CN - China-based Chinese Encodings

    SYNOPSIS

        use Encode qw/encode decode/;
        $euc_cn = encode("euc-cn", $utf8); # loads Encode::CN implicitly
        $utf8 = decode("euc-cn", $euc_cn); # ditto

    DESCRIPTION

    This module implements China-based Chinese charset encodings. Encodings supported are as follows.

        Canonical       Alias                           Description
        --------------------------------------------------------------------
        euc-cn          /\beuc.*cn$/i                   EUC (Extended Unix Character)
                        /\bcn.*euc$/i
                        /\bGB[-_ ]?2312(?:\D.*$|$)/i    (see below)
        gb2312-raw                                      The raw (low-bit) GB2312 character map
        gb12345-raw                                     Traditional chinese counterpart to
                                                        GB2312 (raw)
        iso-ir-165                                      GB2312 + GB6345 + GB8565 + additions
        MacChineseSimp                                  GB2312 + Apple Additions
        cp936                                           Code Page 936, also known as GBK
                                                        (Extended GuoBiao)
        hz                                              7-bit escaped GB2312 encoding
        --------------------------------------------------------------------

    To find how to use this module in detail, see Encode.

    NOTES

    Due to size concerns, GB 18030 (an extension to GBK) is distributed separately on CPAN, under the name Encode::HanExtra. That module also contains extra Taiwan-based encodings.

    BUGS

    When you see charset=gb2312 in mail and on web pages, it really means the euc-cn encoding. To accommodate that, gb2312 is aliased to euc-cn. Use gb2312-raw when you really mean the raw character map.
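    The aliasing of gb2312 to euc-cn can be checked with Encode::resolve_alias(). The snippet below is a sketch for illustration; the sample character and variable names are not from the original documentation.

```perl
use strict;
use warnings;
use Encode qw(encode);

# resolve_alias() returns the canonical name behind a name or alias.
my $canonical = Encode::resolve_alias("gb2312");

# Encoding through the alias therefore produces EUC-CN octets.
my $zhong = "\x{4E2D}";                           # CJK ideograph "middle"
my $hex   = unpack("H*", encode("gb2312", $zhong));

print "gb2312 resolves to $canonical; U+4E2D encodes as $hex\n";
```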

    The ASCII region (0x00-0x7f) is preserved for all encodings, even though this conflicts with mappings by the Unicode Consortium.

    SEE ALSO

    Encode

     

    Encode::Config

    NAME

    Encode::Config -- internally used by Encode


    Encode::EBCDIC

    NAME

    Encode::EBCDIC - EBCDIC Encodings

    SYNOPSIS

        use Encode qw/encode decode/;
        $posix_bc = encode("posix-bc", $utf8); # loads Encode::EBCDIC implicitly
        $utf8 = decode("posix-bc", $posix_bc); # ditto

    ABSTRACT

    This module implements various EBCDIC-Based encodings. Encodings supported are as follows.

        Canonical   Alias   Description
        --------------------------------------------------------------------
        cp37
        cp500
        cp875
        cp1026
        cp1047
        posix-bc

    DESCRIPTION

    To find how to use this module in detail, see Encode.
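    For a quick feel of how these encodings differ from ASCII, the sketch below encodes a single letter with cp37 and shows the resulting octet. It is an illustration only; the variable names are arbitrary.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $octets = encode("cp37", "A");       # loads Encode::EBCDIC implicitly
my $hex    = unpack("H*", $octets);     # EBCDIC places 'A' at 0xC1, not 0x41
my $text   = decode("cp37", $octets);

print "'A' in cp37 is 0x$hex and decodes back to '$text'\n";
```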

    SEE ALSO

    Encode, perlebcdic

     

    Encode::Encoder

    NAME

    Encode::Encoder -- Object Oriented Encoder

    SYNOPSIS

        use Encode::Encoder;
        # Encode::encode("ISO-8859-1", $data);
        Encode::Encoder->new($data)->iso_8859_1; # OOP way
        # shortcut
        use Encode::Encoder qw(encoder);
        encoder($data)->iso_8859_1;
        # you can stack them!
        encoder($data)->iso_8859_1->base64; # provided base64() is defined
        # you can use it as a decoder as well
        encoder($base64)->bytes('base64')->latin1;
        # stringified
        print encoder($data)->utf8->latin1; # prints the string in latin1
        # numified
        encoder("\x{abcd}\x{ef}g")->utf8 == 6; # true. bytes::length($data)

    ABSTRACT

    Encode::Encoder allows you to use Encode in an object-oriented style. This is not only more intuitive than a functional approach, but also handier when you want to stack encodings. Suppose you want your UTF-8 string converted to Latin1 then Base64: you can simply say

        my $base64 = encoder($utf8)->latin1->base64;

    instead of

        my $latin1 = encode("latin1", $utf8);
        my $base64 = encode_base64($latin1);

    or the lazier and more convoluted

        my $base64 = encode_base64(encode("latin1", $utf8));

    Description

    Here is how to use this module.

    • There are at least two instance variables stored in a hash reference, {data} and {encoding}.

    • When you call a method that is not defined, the method name is taken as the name of an encoding, and the instance data is encoded with that encoding. If successful, the instance encoding is set accordingly.

    • You can retrieve the result via ->data but usually you don't have to because the stringify operator ("") is overridden to do exactly that.
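    The three points above can be seen together in a short script. This is a sketch under simple assumptions: the input is a plain (non-UTF-8-flagged) Latin-1 string, and the variable names are chosen for this example.

```perl
use strict;
use warnings;
use Encode::Encoder qw(encoder);

# "caf\x{e9}" is a 4-character string ending in e-acute.
# There is no utf8() method, so the name is looked up as an encoding.
my $e = encoder("caf\x{e9}")->utf8;

my $bytes_hex = unpack("H*", "$e");    # stringify returns the instance data
my $size      = 0 + $e;                # numify returns the number of bytes
my $encoding  = $e->encoding;          # set by the autoloaded call above

print "data=$bytes_hex size=$size encoding=$encoding\n";
```

    Because the object stringifies to its data, it can usually be passed straight to print or interpolated without calling ->data explicitly.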

    Predefined Methods

    This module predefines the methods below:

    • $e = Encode::Encoder->new([$data, $encoding]);

      returns an encoder object. Its data is initialized with $data if present, and its encoding is set to $encoding if present.

      When $encoding is omitted, it defaults to utf8 if $data is already in utf8 or "" (empty string) otherwise.

    • encoder()

      is an alias of Encode::Encoder->new(). This one is exported on demand.

    • $e->data([$data])

      When $data is present, sets the instance data to $data and returns the object itself. Otherwise, the current instance data is returned.

    • $e->encoding([$encoding])

      When $encoding is present, sets the instance encoding to $encoding and returns the object itself. Otherwise, the current instance encoding is returned.

    • $e->bytes([$encoding])

      decodes instance data from $encoding, or the instance encoding if omitted. If the conversion is successful, the instance encoding will be set to "".

      The name bytes was deliberately picked to avoid namespace tainting -- this module may be used as a base class so method names that appear in Encode::Encoding are avoided.

    Example: base64 transcoder

    This module is designed to work with Encode::Encoding. To make the Base64 transcoder example above really work, you could write a module like this:

        package Encode::Base64;
        use base 'Encode::Encoding';
        __PACKAGE__->Define('base64');
        use MIME::Base64;
        sub encode{
            my ($obj, $data) = @_;
            return encode_base64($data);
        }
        sub decode{
            my ($obj, $data) = @_;
            return decode_base64($data);
        }
        1;
        __END__

    And your caller module would be something like this:

        use Encode::Encoder;
        use Encode::Base64;
        # now you can really do the following
        encoder($data)->iso_8859_1->base64;
        encoder($base64)->bytes('base64')->latin1;

    Operator Overloading

    This module overloads two operators, stringify ("") and numify (0+).

    Stringify dumps the data inside the object.

    Numify returns the number of bytes in the instance data.

    They come in handy when you want to print or find the size of data.

    SEE ALSO

    Encode, Encode::Encoding

     

    Encode::Encoding

    NAME

    Encode::Encoding - Encode Implementation Base Class

    SYNOPSIS

        package Encode::MyEncoding;
        use base qw(Encode::Encoding);
        __PACKAGE__->Define(qw(myCanonical myAlias));

    DESCRIPTION

    As mentioned in Encode, encodings are (in the current implementation at least) defined as objects. The mapping of encoding name to object is via the %Encode::Encoding hash. Though you can directly manipulate this hash, it is strongly encouraged to use this base class module and add encode() and decode() methods.

    Methods you should implement

    You are strongly encouraged to implement methods below, at least either encode() or decode().

    • ->encode($string [,$check])

      MUST return the octet sequence representing $string.

      • If $check is true, it SHOULD modify $string in place to remove the converted part (i.e. the whole string unless there is an error). If perlio_ok() is true, SHOULD becomes MUST.

      • If an error occurs, it SHOULD return the octet sequence for the fragment of string that has been converted and modify $string in-place to remove the converted part leaving it starting with the problem fragment. If perlio_ok() is true, SHOULD becomes MUST.

      • If $check is false then encode MUST make a "best effort" to convert the string - for example, by using a replacement character.

    • ->decode($octets [,$check])

      MUST return the string that $octets represents.

      • If $check is true, it SHOULD modify $octets in place to remove the converted part (i.e. the whole sequence unless there is an error). If perlio_ok() is true, SHOULD becomes MUST.

      • If an error occurs, it SHOULD return the fragment of string that has been converted and modify $octets in-place to remove the converted part leaving it starting with the problem fragment. If perlio_ok() is true, SHOULD becomes MUST.

      • If $check is false then decode should make a "best effort" to convert the string - for example by using Unicode's "\x{FFFD}" as a replacement character.

    If you want your encoding to work with the encoding pragma, you should also implement the method below.

    • ->cat_decode($destination, $octets, $offset, $terminator [,$check])

      MUST decode $octets starting at $offset and concatenate the result to $destination. Decoding will terminate when $terminator (a string) appears in the output. $offset will be modified to the last $octets position at the end of decoding. Returns true if $terminator appears in the output, else returns false.

    Other methods defined in Encode::Encoding

    You do not need to override the methods shown below unless your encoding requires it.

    • ->name

      Predefined As:

          sub name { return shift->{'Name'} }

      MUST return the string representing the canonical name of the encoding.

    • ->mime_name

      Predefined As:

          sub mime_name{
              require Encode::MIME::Name;
              return Encode::MIME::Name::get_mime_name(shift->name);
          }

      MUST return the string representing the IANA charset name of the encoding.

    • ->renew

      Predefined As:

          sub renew {
              my $self = shift;
              my $clone = bless { %$self } => ref($self);
              $clone->{renewed}++;
              return $clone;
          }

      This method reconstructs the encoding object if necessary. If you need to store the state during encoding, this is where you clone your object.

      PerlIO ALWAYS calls this method to make sure it has its own private encoding object.

    • ->renewed

      Predefined As:

          sub renewed { $_[0]->{renewed} || 0 }

      Tells whether the object is renewed (and how many times). Some modules emit a "Use of uninitialized value in null operation" warning unless the value is numeric, so return 0 for false.

    • ->perlio_ok()

      Predefined As:

          sub perlio_ok {
              eval{ require PerlIO::encoding };
              return $@ ? 0 : 1;
          }

      If your encoding does not support PerlIO for some reason, just define:

          sub perlio_ok { 0 }

    • ->needs_lines()

      Predefined As:

          sub needs_lines { 0 };

      If your encoding can work with PerlIO but needs line buffering, you MUST define this method so it returns true. 7bit ISO-2022 encodings are one example that needs this. When this method is missing, false is assumed.
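    The inherited methods listed above can be called on any encoding object returned by Encode::find_encoding(). The sketch below is for illustration; it assumes the stock Encode distribution, where iso-2022-jp is provided by Encode::JP, and the variable names are arbitrary.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

my $latin1 = find_encoding("latin1");    # an alias of iso-8859-1
my $name   = $latin1->name;              # canonical name
my $mime   = $latin1->mime_name;         # IANA charset name

# 7bit ISO-2022 encodings need line buffering under PerlIO.
my $jis         = find_encoding("iso-2022-jp");
my $needs_lines = $jis->needs_lines ? 1 : 0;

# perlio_ok depends on whether PerlIO::encoding can be loaded here.
my $perlio_ok = $latin1->perlio_ok ? 1 : 0;

print "name=$name mime_name=$mime\n";
print "iso-2022-jp needs_lines=$needs_lines, latin1 perlio_ok=$perlio_ok\n";
```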

    Example: Encode::ROT13

        package Encode::ROT13;
        use strict;
        use base qw(Encode::Encoding);
        __PACKAGE__->Define('rot13');
        sub encode($$;$){
            my ($obj, $str, $chk) = @_;
            $str =~ tr/A-Za-z/N-ZA-Mn-za-m/;
            $_[1] = '' if $chk; # this is what in-place edit means
            return $str;
        }
        # Jr pna or ynml yvxr guvf;
        *decode = \&encode;
        1;
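    To see the ROT13 class in action, the sketch below defines it inline (instead of in its own .pm file) and then uses it through the ordinary Encode::encode() and Encode::decode() functions. The inline layout and the sample string are assumptions made for this example.

```perl
package Encode::ROT13;
use strict;
use warnings;
use base qw(Encode::Encoding);
__PACKAGE__->Define('rot13');    # registers the name with Encode

sub encode {
    my ($obj, $str, $chk) = @_;
    $str =~ tr/A-Za-z/N-ZA-Mn-za-m/;
    $_[1] = '' if $chk;          # consume the source when $chk is true
    return $str;
}

# ROT13 is its own inverse, so decoding is the same operation.
{ no warnings 'once'; *decode = \&encode; }

package main;
use Encode qw(encode decode);

my $scrambled = encode('rot13', 'Hello, World');
my $restored  = decode('rot13', $scrambled);

print "$scrambled\n$restored\n";
```

    Once Define() has run, the new name behaves like any other encoding name, so callers do not need to know that the class exists.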

    Why the heck is the Encode API different?

    It should be noted that the $check behaviour is different from the outer public API. The logic is that the "unchecked" case is useful when the encoding is part of a stream which may be reporting errors (e.g. STDERR). In such cases, it is desirable to get everything through somehow without causing additional errors which obscure the original one. Also, the encoding is best placed to know what the correct replacement character is, so if that is the desired behaviour then letting low level code do it is the most efficient.

    By contrast, if $check is true, the scheme above allows the encoding to do as much as it can and tell the layer above how much that was. What is lacking at present is a mechanism to report what went wrong. The most likely interface will be an additional method call to the object, or perhaps (to avoid forcing per-stream objects on otherwise stateless encodings) an additional parameter.

    It is also highly desirable that encoding classes inherit from Encode::Encoding as a base class. This allows that class to define additional behaviour for all encoding objects.

        package Encode::MyEncoding;
        use base qw(Encode::Encoding);
        __PACKAGE__->Define(qw(myCanonical myAlias));

    Calling Define this way creates an object with bless {Name => ...}, $class and calls define_encoding. Such classes inherit their name method from Encode::Encoding.

    Compiled Encodings

    For the sake of speed and efficiency, most of the encodings are now supported via a compiled form: XS modules generated from UCM files. Encode provides the enc2xs tool to achieve that. Please see enc2xs for more details.

    SEE ALSO

    perlmod, enc2xs

     

    Encode::GSM0338

    NAME

    Encode::GSM0338 -- ETSI GSM 03.38 Encoding

    SYNOPSIS

        use Encode qw/encode decode/;
        $gsm0338 = encode("gsm0338", $utf8); # loads Encode::GSM0338 implicitly
        $utf8 = decode("gsm0338", $gsm0338); # ditto

    DESCRIPTION

    GSM0338 is for GSM handsets. Though it shares alphanumerics with ASCII, control character ranges and other parts are mapped very differently, mainly to store Greek characters. There are also escape sequences (starting with 0x1B) to cover e.g. the Euro sign.

    This was once handled by Encode::Byte but because of all those unusual specifications, Encode 2.20 has relocated the support to this module.
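    The two unusual features mentioned above, the remapped low code points and the 0x1B escape sequences, can be observed by encoding two characters. This is an illustrative sketch; the byte values in the comments follow the GSM 03.38 default alphabet and its extension table, and the variable names are arbitrary.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# '@' does not keep its ASCII value; GSM 03.38 assigns it 0x00.
my $at_hex = unpack("H*", encode("gsm0338", '@'));

# The Euro sign is reached through an escape sequence: 0x1B 0x65.
my $euro_hex = unpack("H*", encode("gsm0338", "\x{20AC}"));
my $back     = decode("gsm0338", "\x1B\x65");

print "at=$at_hex euro=$euro_hex\n";
```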

    NOTES

    Unlike most other encodings, the following always croaks on error for any $chk that evaluates to true.

        $gsm0338 = encode("gsm0338", $utf8, $chk);
        $utf8 = decode("gsm0338", $gsm0338, $chk);

    So if you want to check the validity of the encoding, surround the expression with an eval {} block as follows:

        eval {
            $utf8 = decode("gsm0338", $gsm0338, $chk);
        };
        if ($@){
            # handle exception here
        }

    BUGS

    ETSI GSM 03.38 Encoding itself.

    Mapping \x00 to '@' causes too much pain everywhere.

    Its use of \x1b (escape) is also very questionable.

    Because of those two, the code paging approach used in ucm-based Encoding SOMETIMES fails so this module was written.

    SEE ALSO

    Encode

     

    Encode::Guess

    NAME

    Encode::Guess -- Guesses encoding from data

    SYNOPSIS

        # if you are sure $data won't contain anything bogus
        use Encode;
        use Encode::Guess qw/euc-jp shiftjis 7bit-jis/;
        my $utf8 = decode("Guess", $data);
        my $data = encode("Guess", $utf8); # this doesn't work!
        # more elaborate way
        use Encode::Guess;
        my $enc = guess_encoding($data, qw/euc-jp shiftjis 7bit-jis/);
        ref($enc) or die "Can't guess: $enc"; # trap error this way
        $utf8 = $enc->decode($data);
        # or
        $utf8 = decode($enc->name, $data);

    ABSTRACT

    Encode::Guess enables you to guess the encoding of a given piece of data, or at least tries to.

    DESCRIPTION

    By default, it checks only ascii, utf8 and UTF-16/32 with BOM.

        use Encode::Guess; # ascii/utf8/BOMed UTF

    To use it more practically, you have to give the names of the encodings to check (called suspects below). The names of suspects can be either canonical names or aliases.

    CAVEAT: Unlike UTF-(16|32), BOM in utf8 is NOT AUTOMATICALLY STRIPPED.

        # tries all major Japanese Encodings as well
        use Encode::Guess qw/euc-jp shiftjis 7bit-jis/;

    If the $Encode::Guess::NoUTFAutoGuess variable is set to a true value, no heuristics will be applied to UTF8/16/32, and the result will be limited to the suspects and ascii.

    • Encode::Guess->set_suspects

      You can also change the internal suspects list via the set_suspects method.

          use Encode::Guess;
          Encode::Guess->set_suspects(qw/euc-jp shiftjis 7bit-jis/);

    • Encode::Guess->add_suspects

      Or you can use the add_suspects method. The difference is that set_suspects flushes the current suspects list while add_suspects adds to it.

          use Encode::Guess;
          Encode::Guess->add_suspects(qw/euc-jp shiftjis 7bit-jis/);
          # now the suspects are euc-jp,shiftjis,7bit-jis, AND
          # euc-kr,euc-cn, and big5-eten
          Encode::Guess->add_suspects(qw/euc-kr euc-cn big5-eten/);

    • Encode::decode("Guess" ...)

      When you are content with the suspects list, you can now

          my $utf8 = Encode::decode("Guess", $data);

    • Encode::Guess->guess($data)

      But Encode::decode("Guess", ...) will croak if:

      • Two or more suspects remain

      • No suspects left

      So you should instead try this:

          my $decoder = Encode::Guess->guess($data);

      On success, $decoder is an object that is documented in Encode::Encoding. So you can now do this:

          my $utf8 = $decoder->decode($data);

      On failure, $decoder contains an error message instead, so the whole thing would be as follows:

          my $decoder = Encode::Guess->guess($data);
          die $decoder unless ref($decoder);
          my $utf8 = $decoder->decode($data);

    • guess_encoding($data [, list of suspects])

      You can also try the guess_encoding function, which is exported by default. It takes the $data to check and, optionally, a list of suspects. The optional suspect list does not affect the internal suspects list.

          my $decoder = guess_encoding($data, qw/euc-jp euc-kr euc-cn/);
          die $decoder unless ref($decoder);
          my $utf8 = $decoder->decode($data);
          # check only ascii, utf8 and UTF-(16|32) with BOM
          $decoder = guess_encoding($data);
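    The success and failure paths of guess_encoding() can be exercised with a few hand-made byte strings. The following is a minimal sketch; the sample data and variable names are made up for this example, and only the default suspects plus latin1 are involved.

```perl
use strict;
use warnings;
use Encode::Guess;    # exports guess_encoding by default

my $utf8_bytes   = "caf\xC3\xA9";    # "cafe" with e-acute, as UTF-8 octets
my $latin1_bytes = "caf\xE9";        # the same text as ISO-8859-1 octets

# Success: a reference (an encoding object) is returned.
my $ok      = guess_encoding($utf8_bytes);
my $ok_name = ref($ok) ? $ok->name : "error: $ok";

# Failure: neither ascii nor utf8 fits, so a plain error string is returned.
my $bad          = guess_encoding($latin1_bytes);
my $bad_is_error = ref($bad) ? 0 : 1;

# Naming one extra suspect lets the same data be identified.
my $hinted      = guess_encoding($latin1_bytes, 'latin1');
my $hinted_name = ref($hinted) ? $hinted->name : "error: $hinted";

print "$ok_name / ", ($bad_is_error ? "no guess" : $bad->name), " / $hinted_name\n";
```

    Testing the return value with ref() before calling methods on it keeps the error string from being used as an object by accident.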

    CAVEATS

    • Because of the algorithm used, the ISO-8859 series and other single-byte encodings do not work well unless a single ISO-8859 encoding is the only suspect (besides ascii and utf8).

          use Encode::Guess;
          # perhaps ok
          my $decoder = guess_encoding($data, 'latin1');
          # definitely NOT ok
          $decoder = guess_encoding($data, qw/latin1 greek/);

      The reason is that Encode::Guess guesses the encoding by trial and error. It first splits $data into lines and tries to decode each line with each suspect. It keeps going until all but one encoding is eliminated from the suspects list. The ISO-8859 series is just too successful in most cases (because it fills almost all code points in \x00-\xff).
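    The ambiguity described above is easy to reproduce: a byte such as 0xE9 is valid in both iso-8859-1 and iso-8859-7, so neither suspect is ever eliminated. The snippet below is a sketch with made-up sample data.

```perl
use strict;
use warnings;
use Encode::Guess;

# 0xE9 is a legal character in both latin1 and greek (iso-8859-7).
my $result  = guess_encoding("caf\xE9", qw/latin1 greek/);
my $message = ref($result) ? "guessed " . $result->name : $result;

print "$message\n";
```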

    • Do not mix national standard encodings and the corresponding vendor encodings.

          # a very bad idea
          my $decoder
              = guess_encoding($data, qw/shiftjis MacJapanese cp932/);

      The reason is that vendor encoding is usually a superset of national standard so it becomes too ambiguous for most cases.

    • On the other hand, mixing various national standard encodings automagically works unless $data is too short to allow for guessing.

          # This is ok if $data is long enough
          my $decoder =
              guess_encoding($data, qw/euc-cn
                                       euc-jp shiftjis 7bit-jis
                                       euc-kr
                                       big5-eten/);

    • DO NOT PUT TOO MANY SUSPECTS! Don't you try something like this!

          my $decoder = guess_encoding($data,
                                       Encode->encodings(":all"));

    It is, after all, just a guess. You should always be explicit when it comes to encodings. But there are some environments, especially Japanese ones, where guessing the encoding is a must. Use this module with care.

    TO DO

    Encode::Guess does not work on EBCDIC platforms.

    SEE ALSO

    Encode, Encode::Encoding

     

    Encode::JP

    NAME

    Encode::JP - Japanese Encodings

    SYNOPSIS

        use Encode qw/encode decode/;
        $euc_jp = encode("euc-jp", $utf8); # loads Encode::JP implicitly
        $utf8 = decode("euc-jp", $euc_jp); # ditto

    ABSTRACT

    This module implements Japanese charset encodings. Encodings supported are as follows.

        Canonical       Alias               Description
        --------------------------------------------------------------------
        euc-jp          /\beuc.*jp$/i       EUC (Extended Unix Character)
                        /\bjp.*euc/i
                        /\bujis$/i
        shiftjis        /\bshift.*jis$/i    Shift JIS (aka MS Kanji)
                        /\bsjis$/i
        7bit-jis        /\bjis$/i           7bit JIS
        iso-2022-jp                         ISO-2022-JP                  [RFC1468]
                                            = 7bit JIS with all Halfwidth Kana
                                              converted to Fullwidth
        iso-2022-jp-1                       ISO-2022-JP-1                [RFC2237]
                                            = ISO-2022-JP with JIS X 0212-1990
                                              support. See below
        MacJapanese                         Shift JIS + Apple vendor mappings
        cp932           /\bwindows-31j$/i   Code Page 932
                                            = Shift JIS + MS/IBM vendor mappings
        jis0201-raw                         JIS0201, raw format
        jis0208-raw                         JIS0208, raw format
        jis0212-raw                         JIS0212, raw format
        --------------------------------------------------------------------

    DESCRIPTION

    To find out how to use this module in detail, see Encode.
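    As a small illustration of how the same character comes out under two of the encodings listed above, the sketch below encodes one hiragana letter as euc-jp and as shiftjis. The sample character and variable names are chosen for this example.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $hiragana_a = "\x{3042}";    # HIRAGANA LETTER A

my $euc_hex  = unpack("H*", encode("euc-jp",   $hiragana_a));
my $sjis_hex = unpack("H*", encode("shiftjis", $hiragana_a));

# Decoding with the matching encoding restores the original character.
my $back = decode("euc-jp", encode("euc-jp", $hiragana_a));

print "euc-jp=$euc_hex shiftjis=$sjis_hex\n";
```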

    Note on ISO-2022-JP(-1)?

    ISO-2022-JP-1 (RFC2237) is a superset of ISO-2022-JP (RFC1468) which adds support for JIS X 0212-1990. That means you can use the same code to decode to utf8 but not vice versa.

        $utf8 = decode('iso-2022-jp-1', $stream);

    and

        $utf8 = decode('iso-2022-jp', $stream);

    yield the same result but

        $with_0212 = encode('iso-2022-jp-1', $utf8);

    is now different from

        $without_0212 = encode('iso-2022-jp', $utf8 );

    In the latter case, characters that map to 0212 are first converted to U+3013 (0xA2AE in EUC-JP; a white square also known as 'Tofu' or 'geta mark') then fed to the decoding engine. U+FFFD is not used, in order to preserve text layout as much as possible.

    BUGS

    The ASCII region (0x00-0x7f) is preserved for all encodings, even though this conflicts with mappings by the Unicode Consortium.

    SEE ALSO

    Encode

     

    Encode::KR

    NAME

    Encode::KR - Korean Encodings

    SYNOPSIS

        use Encode qw/encode decode/;
        $euc_kr = encode("euc-kr", $utf8); # loads Encode::KR implicitly
        $utf8 = decode("euc-kr", $euc_kr); # ditto

    DESCRIPTION

    This module implements Korean charset encodings. Encodings supported are as follows.

        Canonical       Alias                     Description
        --------------------------------------------------------------------
        euc-kr          /\beuc.*kr$/i             EUC (Extended Unix Character)
                        /\bkr.*euc$/i
        ksc5601-raw                               Korean standard code set (as is)
        cp949           /(?:x-)?uhc$/i
                        /(?:x-)?windows-949$/i
                        /\bks_c_5601-1987$/i
                                                  Code Page 949 (EUC-KR + 8,822
                                                  additional Hangul syllables)
        MacKorean                                 EUC-KR + Apple Vendor Mappings
        johab           JOHAB                     A supplementary encoding defined in
                                                  Annex 3 of KS X 1001:1998
        iso-2022-kr                               iso-2022-kr                  [RFC1557]
        --------------------------------------------------------------------

    To find how to use this module in detail, see Encode.

    BUGS

    When you see charset=ks_c_5601-1987 in mail and on web pages, it really means the "cp949" encoding. To accommodate that, the following aliases are set:

        qr/(?:x-)?uhc$/i => '"cp949"'
        qr/(?:x-)?windows-949$/i => '"cp949"'
        qr/ks_c_5601-1987$/i => '"cp949"'
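    The effect of these aliases can be checked with Encode::resolve_alias(), which returns the canonical name behind a name or alias. This is a short sketch for illustration; the variable names are arbitrary.

```perl
use strict;
use warnings;
use Encode;

# Each of these names should resolve to the same canonical encoding.
my @aliases  = qw(ks_c_5601-1987 uhc windows-949);
my @resolved = map { Encode::resolve_alias($_) } @aliases;

print "$aliases[$_] resolves to $resolved[$_]\n" for 0 .. $#aliases;
```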

    The ASCII region (0x00-0x7f) is preserved for all encodings, even though this conflicts with mappings by the Unicode Consortium.

    SEE ALSO

    Encode

     

    Encode::Symbol

    NAME

    Encode::Symbol - Symbol Encodings

    SYNOPSIS

        use Encode qw/encode decode/;
        $symbol = encode("symbol", $utf8); # loads Encode::Symbol implicitly
        $utf8 = decode("symbol", $symbol); # ditto

    ABSTRACT

    This module implements symbol and dingbats encodings. Encodings supported are as follows.

        Canonical      Alias      Description
        --------------------------------------------------------------------
        symbol
        dingbats
        AdobeZDingbat
        AdobeSymbol
        MacDingbats

    DESCRIPTION

    To find out how to use this module in detail, see Encode.
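
    As a small illustration, the sketch below round-trips one character through the "symbol" encoding. It assumes that GREEK SMALL LETTER ALPHA is part of the symbol repertoire, as it is in the Adobe Symbol character set.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode decode);

    my $alpha  = "\x{3b1}";                  # GREEK SMALL LETTER ALPHA
    my $octets = encode("symbol", $alpha);   # byte string in the symbol charset
    my $back   = decode("symbol", $octets);  # back to a Perl character string

    printf "alpha is stored as 0x%s\n", unpack("H*", $octets);
    print "round trip ", ($back eq $alpha ? "ok" : "failed"), "\n";
    ```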

    SEE ALSO

    Encode

     
    Encode::TW

    NAME

    Encode::TW - Taiwan-based Chinese Encodings

    SYNOPSIS

        use Encode qw/encode decode/;
        $big5 = encode("big5", $utf8); # loads Encode::TW implicitly
        $utf8 = decode("big5", $big5); # ditto

    DESCRIPTION

    This module implements traditional Chinese charset encodings as used in Taiwan and Hong Kong. Encodings supported are as follows.

        Canonical        Alias                     Description
        --------------------------------------------------------------------
        big5-eten        /\bbig-?5$/i              Big5 encoding (with ETen extensions)
                         /\bbig5-?et(en)?$/i
                         /\btca-?big5$/i
        big5-hkscs       /\bbig5-?hk(scs)?$/i      Big5 + Cantonese characters in Hong Kong
                         /\bhk(scs)?-?big5$/i
        MacChineseTrad                             Big5 + Apple Vendor Mappings
        cp950                                      Code Page 950
                                                   = Big5 + Microsoft vendor mappings
        --------------------------------------------------------------------

    To find out how to use this module in detail, see Encode.

    NOTES

    Due to size concerns, EUC-TW (Extended Unix Character), CCCII (Chinese Character Code for Information Interchange), BIG5PLUS (CMEX's Big5+) and BIG5EXT (CMEX's Big5e) are distributed separately on CPAN, under the name Encode::HanExtra. That module also contains extra China-based encodings.

    BUGS

    Since the original big5 encoding (1984) is not supported anywhere (glibc and DOS-based systems use big5 to mean big5-eten; Microsoft uses big5 to mean cp950), a conscious decision was made to alias big5 to big5-eten, which is the de facto superset of the original big5.
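
    To see which canonical encoding a given label ends up with, you can ask find_encoding() for the encoding object and print its name. This is a minimal sketch; the labels are sample inputs chosen to match the alias patterns in the table above.

    ```perl
    use strict;
    use warnings;
    use Encode qw(find_encoding);

    # Map each label to the canonical name of the encoding it resolves to.
    my %name = map { $_ => find_encoding($_)->name } qw(big5 Big-5 big5-hk cp950);

    printf "%-8s => %s\n", $_, $name{$_} for sort keys %name;
    ```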

    The CNS11643 encoding files are not complete. For common CNS11643 manipulation, please use EUC-TW in Encode::HanExtra, which contains planes 1-7.

    The ASCII region (0x00-0x7f) is preserved for all encodings, even though this conflicts with mappings by the Unicode Consortium.

    SEE ALSO

    Encode

     
    Encode::Unicode

    NAME

    Encode::Unicode -- Various Unicode Transformation Formats

    SYNOPSIS

        use Encode qw/encode decode/;
        $ucs2 = encode("UCS-2BE", $utf8);
        $utf8 = decode("UCS-2BE", $ucs2);

    ABSTRACT

    This module implements all Character Encoding Schemes of Unicode that are officially documented by the Unicode Consortium (except, of course, for UTF-8, which is a native format in perl).

    • http://www.unicode.org/glossary/ says:

      Character Encoding Scheme: A character encoding form plus byte serialization. There are seven character encoding schemes in Unicode: UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32 (UCS-4), UTF-32BE (UCS-4BE) and UTF-32LE (UCS-4LE), and UTF-7.

      Since UTF-7 is a 7-bit (re)encoded version of UTF-16BE, it is not part of Unicode's Character Encoding Schemes. It is implemented separately in Encode::Unicode::UTF7; see that module for details.

    • Quick Reference
                          Decodes from ord(N)           Encodes chr(N) to...
                 octet/char BOM S.P d800-dfff  ord > 0xffff     \x{1abcd} ==
          ---------------+-----------------+------------------------------
          UCS-2BE       2   N   N  is bogus                  Not Available
          UCS-2LE       2   N   N     bogus                  Not Available
          UTF-16      2/4   Y   Y  is   S.P           S.P            BE/LE
          UTF-16BE    2/4   N   Y       S.P           S.P    0xd82a,0xdfcd
          UTF-16LE    2/4   N   Y       S.P           S.P    0x2ad8,0xcddf
          UTF-32        4   Y   -  is bogus         As is            BE/LE
          UTF-32BE      4   N   -     bogus         As is       0x0001abcd
          UTF-32LE      4   N   -     bogus         As is       0xcdab0100
          UTF-8       1-4   -   -     bogus   >= 4 octets   \xf0\x9a\xaf\x8d
          ---------------+-----------------+------------------------------
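
    The right-hand column of the quick reference can be reproduced with a few lines of Perl. The sketch below encodes the sample character \x{1abcd} with each explicitly named encoding and prints the resulting octets in hexadecimal; the variable names are illustrative only.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode);

    my $char = "\x{1abcd}";   # the sample code point used in the table

    my %octets;
    for my $enc (qw(UTF-16BE UTF-16LE UTF-32BE UTF-32LE UTF-8)) {
        # unpack("H*", ...) shows the encoded octets as a hex string
        $octets{$enc} = unpack("H*", encode($enc, $char));
        printf "%-9s %s\n", $enc, $octets{$enc};
    }
    ```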

    Size, Endianness, and BOM

    You can categorize these CES by 3 criteria: size of each character, endianness, and Byte Order Mark.

    by size

    UCS-2 is a fixed-length encoding with each character taking 16 bits. It does not support surrogate pairs. When a surrogate pair is encountered during decode(), its place is filled with \x{FFFD} if CHECK is 0, or the routine croaks if CHECK is 1. When a character whose ord value is larger than 0xFFFF is encountered, its place is filled with \x{FFFD} if CHECK is 0, or the routine croaks if CHECK is 1.

    UTF-16 is almost the same as UCS-2 but it supports surrogate pairs. When it encounters a high surrogate (0xD800-0xDBFF), it fetches the following low surrogate (0xDC00-0xDFFF) and desurrogates them to form a character. Bogus surrogates result in death. When \x{10000} or above is encountered during encode(), it ensurrogates them and pushes the surrogate pair to the output stream.

    UTF-32 (UCS-4) is a fixed-length encoding with each character taking 32 bits. Since it is 32-bit, there is no need for surrogate pairs.
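
    The difference between the three sizes shows up as soon as a character above 0xFFFF is encoded. The following sketch encodes \x{10000} with each big-endian variant, and then repeats the UCS-2 case with CHECK set to Encode::FB_CROAK (which has the value 1) inside an eval so the croak can be observed.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode);

    my $astral = "\x{10000}";   # first code point beyond the 16-bit range

    my $ucs2  = unpack("H*", encode("UCS-2BE",  $astral));   # CHECK defaults to 0
    my $utf16 = unpack("H*", encode("UTF-16BE", $astral));   # surrogate pair
    my $utf32 = unpack("H*", encode("UTF-32BE", $astral));   # stored as is

    print "UCS-2BE  $ucs2\nUTF-16BE $utf16\nUTF-32BE $utf32\n";

    # With CHECK set, the UCS-2 encoder croaks instead of substituting U+FFFD.
    my $copy    = $astral;      # encode() may modify its argument when CHECK is set
    my $result  = eval { encode("UCS-2BE", $copy, Encode::FB_CROAK()) };
    my $croaked = defined $result ? 0 : 1;
    print "UCS-2BE with FB_CROAK croaked\n" if $croaked;
    ```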

    by endianness

    The first (and now failed) goal of Unicode was to map all character repertoires into a fixed-length integer so that programmers are happy. Since each character is either a short or long in C, you have to pay attention to the endianness of each platform when you pass data to one another.

    Anything marked as BE is Big Endian (or network byte order) and LE is Little Endian (aka VAX byte order). For anything not marked either BE or LE, a character called Byte Order Mark (BOM) indicating the endianness is prepended to the string.

    CAVEAT: Though a BOM in UTF-8 (\xEF\xBB\xBF) is valid, it is meaningless, and as of this writing the Encode suite just leaves it as is (\x{FeFF}).

    • BOM as integer when fetched in network byte order
                      16         32 bits/char
          -------------------------
          BE      0xFeFF 0x0000FeFF
          LE      0xFFFe 0xFFFe0000
          -------------------------

    This module handles the BOM as follows.

    • When BE or LE is explicitly stated as the name of encoding, BOM is simply treated as a normal character (ZERO WIDTH NO-BREAK SPACE).

    • When BE or LE is omitted during decode(), it checks if BOM is at the beginning of the string; if one is found, the endianness is set to what the BOM says. If no BOM is found, the routine dies.

    • When BE or LE is omitted during encode(), it returns a BE-encoded string with BOM prepended. So when you want to encode a whole text file, make sure you encode() the whole text at once, not line by line; otherwise each line, not just the file, will have a BOM prepended.

    • UCS-2 is an exception. Unlike others, this is an alias of UCS-2BE. UCS-2 is already registered by IANA and others that way.
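
    The BOM rules above can be observed directly. This is a minimal sketch using UTF-16 only; it deliberately leaves out the case of decoding BOM-less input with a bare "UTF-16" name.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode decode);

    # No BE/LE in the name: encode() prepends a BOM and writes big-endian.
    my $with_bom = unpack("H*", encode("UTF-16", "A"));

    # Explicit BE: no BOM is written.
    my $plain_be = unpack("H*", encode("UTF-16BE", "A"));

    # Encoding piece by piece puts a BOM in front of every piece.
    my $pieces = unpack("H*", encode("UTF-16", "a") . encode("UTF-16", "b"));

    # decode() with no BE/LE consumes the BOM and uses it to pick the byte order.
    my $from_le = decode("UTF-16", "\xff\xfe\x41\x00");

    # With explicit BE the two leading octets become an ordinary character.
    my $kept = decode("UTF-16BE", "\xfe\xff\x00\x41");

    print "UTF-16   $with_bom\nUTF-16BE $plain_be\npieces   $pieces\n";
    printf "decoded lengths: %d and %d\n", length($from_le), length($kept);
    ```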

    Surrogate Pairs

    To say the least, surrogate pairs were the biggest mistake of the Unicode Consortium. But according to the late Douglas Adams in The Hitchhiker's Guide to the Galaxy Trilogy, "In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move." Their mistake was not of this magnitude so let's forgive them.

    (I don't dare make any comparison with Unicode Consortium and the Vogons here ;) Or, comparing Encode to Babel Fish is completely appropriate -- if you can only stick this into your ear :)

    Surrogate pairs were born when the Unicode Consortium finally admitted that 16 bits were not big enough to hold all the world's character repertoires. But they already made UCS-2 16-bit. What do we do?

    Back then, the range 0xD800-0xDFFF was not allocated. Let's split that range in half and use the first half to represent the upper half of a character and the second half to represent the lower half of a character. That way, you can represent 1024 * 1024 = 1048576 more characters. Now we can store character ranges up to \x{10ffff} even with 16-bit encodings. This pair of half-characters is now called a surrogate pair and UTF-16 is the name of the encoding that embraces them.

    Here is a formula to ensurrogate a Unicode character \x{10000} and above:

        $hi = int(($uni - 0x10000) / 0x400) + 0xD800;   # int() because / is not integer division in Perl
        $lo = ($uni - 0x10000) % 0x400 + 0xDC00;

    And to desurrogate:

        $uni = 0x10000 + ($hi - 0xD800) * 0x400 + ($lo - 0xDC00);
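
    As a worked example, the two formulas can be wrapped in a pair of helper subs and checked against the sample character from the quick reference. The sub names ensurrogate() and desurrogate() are illustrative and are not part of the Encode API; shifts and masks are used here in place of division and modulus, which is equivalent for these ranges.

    ```perl
    use strict;
    use warnings;

    # Split a code point at or above 0x10000 into a (high, low) surrogate pair.
    sub ensurrogate {
        my ($uni) = @_;
        my $offset = $uni - 0x10000;
        return (0xD800 + ($offset >> 10), 0xDC00 + ($offset & 0x3FF));
    }

    # Combine a (high, low) surrogate pair back into one code point.
    sub desurrogate {
        my ($hi, $lo) = @_;
        return 0x10000 + (($hi - 0xD800) << 10) + ($lo - 0xDC00);
    }

    my ($hi, $lo) = ensurrogate(0x1ABCD);
    printf "U+1ABCD -> 0x%04x,0x%04x -> U+%04X\n", $hi, $lo, desurrogate($hi, $lo);
    ```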

    Note this move has made \x{D800}-\x{DFFF} into a forbidden zone but perl does not prohibit the use of characters within this range. To perl, every one of \x{0000_0000} up to \x{ffff_ffff} (*) is a character.

    (*) or \x{ffff_ffff_ffff_ffff} if your perl is compiled with 64-bit integer support!

    Error Checking

    Unlike most encodings, which accept various ways to handle errors, Unicode encodings simply croak.

        % perl -MEncode -e'$_ = "\xfe\xff\xd8\xd9\xda\xdb\0\n"' \
               -e'Encode::from_to($_, "utf16","shift_jis", 0); print'
        UTF-16:Malformed LO surrogate d8d9 at /path/to/Encode.pm line 184.
        % perl -MEncode -e'$a = "BOM missing"' \
               -e' Encode::from_to($a, "utf16", "shift_jis", 0); print'
        UTF-16:Unrecognised BOM 424f at /path/to/Encode.pm line 184.

    Unlike other encodings where mappings are not one-to-one against Unicode, UTFs are supposed to map 100% against one another. So Encode is more strict on UTFs.

    Consider that the "division by zero" of Encode :)

    SEE ALSO

    Encode, Encode::Unicode::UTF7, http://www.unicode.org/glossary/, http://www.unicode.org/unicode/faq/utf_bom.html

    RFC 2781 http://www.ietf.org/rfc/rfc2781.txt,

    The whole Unicode standard http://www.unicode.org/unicode/uni2book/u2.html

    Ch. 15, pp. 403 of Programming Perl (3rd Edition) by Larry Wall, Tom Christiansen, Jon Orwant; O'Reilly & Associates; ISBN 0-596-00027-8

     
    Encode::Unicode::UTF7

    NAME

    Encode::Unicode::UTF7 -- UTF-7 encoding

    SYNOPSIS

        use Encode qw/encode decode/;
        $utf7 = encode("UTF-7", $utf8);
        $utf8 = decode("UTF-7", $utf7);

    ABSTRACT

    This module implements the UTF-7 encoding documented in RFC 2152. UTF-7, as its name suggests, is a 7-bit re-encoded version of UTF-16BE. It was designed to be MTA-safe and was expected to become a standard way to exchange Unicode text via mail. But with the advent of UTF-8 and 8-bit compliant MTAs, UTF-7 is hardly ever used.

    For that reason, UTF-7 was not supported by Encode until version 1.95. But since Unicode::String, a module by Gisle Aas which adds Unicode support to non-utf8-savvy perls, did support UTF-7, UTF-7 support was added so that Encode can supersede Unicode::String 100%.

    In Practice

    When you want to encode Unicode for mail and web pages, however, do not use UTF-7 unless you are sure your recipients and readers can handle it. Very few MUAs and WWW browsers support it these days (only Mozilla seems to). For general cases, use UTF-8 for the message body and MIME-Header for the header instead.
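
    If you do need UTF-7, for example to read legacy data, a round trip looks like the sketch below. The exact octets produced may differ between encoder implementations, so the example only checks that the output is 7-bit clean and decodes back to the original text.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode decode);

    my $text = "Hi Mom \x{263A}!";          # contains WHITE SMILING FACE
    my $utf7 = encode("UTF-7", $text);      # octets, all below 0x80
    my $back = decode("UTF-7", $utf7);      # back to a character string

    print "UTF-7 form: $utf7\n";
    print "round trip ", ($back eq $text ? "ok" : "failed"), "\n";
    ```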

    SEE ALSO

    Encode, Encode::Unicode, Unicode::String

    RFC 2152 http://www.ietf.org/rfc/rfc2152.txt

     
    Encode::MIME::Header

    NAME

    Encode::MIME::Header -- MIME 'B' and 'Q' header encoding

    SYNOPSIS

        use Encode qw/encode decode/;
        $utf8   = decode('MIME-Header', $header);
        $header = encode('MIME-Header', $utf8);

    ABSTRACT

    This module implements RFC 2047 MIME header encoding. There are 3 variant encoding names: MIME-Header, MIME-B and MIME-Q. The differences are described below.

                        decode()             encode()
        ----------------------------------------------
        MIME-Header     Both B and Q         =?UTF-8?B?....?=
        MIME-B          B only; Q croaks     =?UTF-8?B?....?=
        MIME-Q          Q only; B croaks     =?UTF-8?Q?....?=

    DESCRIPTION

    When you decode(=?encoding?X?ENCODED WORD?=), ENCODED WORD is extracted and decoded for X encoding (B for Base64, Q for Quoted-Printable). Then the decoded chunk is fed to decode(encoding). So long as encoding is supported by Encode, any source encoding is fine.

    When you encode, it just encodes the UTF-8 string with X encoding and then wraps it as =?UTF-8?X?....?=. The parts that RFC 2047 forbids to encode are left as is, and long lines are folded within 76 bytes per line.
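
    The behaviour described above can be tried out with a short non-ASCII subject line. This is a minimal sketch; the sample text is arbitrary, and only the encoded-word prefix and the round trip are checked, since line folding depends on the input length.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode decode);

    my $subject = "\x{3053}\x{3093}\x{306b}\x{3061}\x{306f}";   # Japanese "konnichiwa"

    my $b_form = encode('MIME-Header', $subject);   # Base64 encoded-word
    my $q_form = encode('MIME-Q',      $subject);   # Quoted-Printable encoded-word

    print "$b_form\n$q_form\n";

    # decode('MIME-Header', ...) accepts both the B and the Q form.
    my $from_b = decode('MIME-Header', $b_form);
    my $from_q = decode('MIME-Header', $q_form);
    print "round trip ",
        ($from_b eq $subject && $from_q eq $subject ? "ok" : "failed"), "\n";
    ```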

    BUGS

    It would be nice to support encoding to non-UTF8, such as =?ISO-2022-JP? and =?ISO-8859-1?, but that makes the implementation too complicated. These days major mail agents all support =?UTF-8? so I think it is just good enough.

    Due to popular demand, 'MIME-Header-ISO_2022_JP' was introduced by Makamaka. There are still too many MUAs, especially cellular phone handsets, which do not grok UTF-8.

    SEE ALSO

    Encode

    RFC 2047, http://www.faqs.org/rfcs/rfc2047.html and many other locations.

     
    Encode::MIME::Name

    NAME

    Encode::MIME::Name -- internally used by Encode

    SEE ALSO

    I18N::Charset

    Encode::KR::2022_KR

    NAME

    Encode::KR::2022_KR -- internally used by Encode::KR

    Encode::JP::H2Z

    NAME

    Encode::JP::H2Z -- internally used by Encode::JP::2022_JP*

    Encode::JP::JIS7

    NAME

    Encode::JP::JIS7 -- internally used by Encode::JP

    Encode::CN::HZ

    NAME

    Encode::CN::HZ -- internally used by Encode::CN

    Digest::MD5

    NAME

    Digest::MD5 - Perl interface to the MD5 Algorithm

    SYNOPSIS

        # Functional style
        use Digest::MD5 qw(md5 md5_hex md5_base64);

        $digest = md5($data);
        $digest = md5_hex($data);
        $digest = md5_base64($data);

        # OO style
        use Digest::MD5;

        $ctx = Digest::MD5->new;

        $ctx->add($data);
        $ctx->addfile($file_handle);

        $digest = $ctx->digest;
        $digest = $ctx->hexdigest;
        $digest = $ctx->b64digest;

    DESCRIPTION

    The Digest::MD5 module allows you to use the RSA Data Security Inc. MD5 Message Digest algorithm from within Perl programs. The algorithm takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input.

    Note that the MD5 algorithm is not as strong as it used to be. Since 2005 it has been easy to generate different messages that produce the same MD5 digest. It still seems hard to generate messages that produce a given digest, but it is probably wise to move to stronger algorithms for applications that depend on the digest to uniquely identify a message.

    The Digest::MD5 module provides a procedural interface for simple use, as well as an object oriented interface that can handle messages of arbitrary length and which can read files directly.

    FUNCTIONS

    The following functions are provided by the Digest::MD5 module. None of these functions are exported by default.

    • md5($data,...)

      This function will concatenate all arguments, calculate the MD5 digest of this "message", and return it in binary form. The returned string will be 16 bytes long.

      The result of md5("a", "b", "c") will be exactly the same as the result of md5("abc").

    • md5_hex($data,...)

      Same as md5(), but will return the digest in hexadecimal form. The length of the returned string will be 32 and it will only contain characters from this set: '0'..'9' and 'a'..'f'.

    • md5_base64($data,...)

      Same as md5(), but will return the digest as a base64 encoded string. The length of the returned string will be 22 and it will only contain characters from this set: 'A'..'Z', 'a'..'z', '0'..'9', '+' and '/'.

      Note that the base64 encoded string returned is not padded to be a multiple of 4 bytes long. If you want interoperability with other base64 encoded md5 digests you might want to append the redundant string "==" to the result.
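
    The following sketch illustrates two of the points above: multiple arguments are concatenated before hashing, and appending "==" to the md5_base64() result yields the same string that a padding Base64 encoder produces. MIME::Base64 is used here only as a reference encoder; the sample data is arbitrary.

    ```perl
    use strict;
    use warnings;
    use Digest::MD5 qw(md5 md5_base64);
    use MIME::Base64 qw(encode_base64);

    my $data = "foobarbaz";

    # Several arguments are concatenated before hashing.
    my $same = md5("foo", "bar", "baz") eq md5($data);

    # md5_base64() returns 22 characters; "==" brings it to a multiple of 4.
    my $unpadded = md5_base64($data);
    my $padded   = $unpadded . "==";

    # encode_base64() with an empty line ending produces the padded form directly.
    my $reference = encode_base64(md5($data), "");

    print "concatenation matches\n" if $same;
    print "$padded\n";
    print "padding matches MIME::Base64\n" if $padded eq $reference;
    ```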

    METHODS

    The object oriented interface to Digest::MD5 is described in this section. After a Digest::MD5 object has been created, you will add data to it and finally ask for the digest in a suitable format. A single object can be used to calculate multiple digests.

    The following methods are provided:

    • $md5 = Digest::MD5->new

      The constructor returns a new Digest::MD5 object which encapsulates the state of the MD5 message-digest algorithm.

      If called as an instance method (i.e. $md5->new) it will just reset the state of the object to the state of a newly created object. No new object is created in this case.

    • $md5->reset

      This is just an alias for $md5->new.

    • $md5->clone

      This is a copy of the $md5 object. It is useful when you do not want to destroy the digest's state, but need an intermediate value of the digest, e.g. when calculating digests iteratively on a continuous data stream. Example:

          my $md5 = Digest::MD5->new;
          while (<>) {
              $md5->add($_);
              print "Line $.: ", $md5->clone->hexdigest, "\n";
          }

    • $md5->add($data,...)

      The $data provided as argument are appended to the message we calculate the digest for. The return value is the $md5 object itself.

      All these lines will have the same effect on the state of the $md5 object:

          $md5->add("a"); $md5->add("b"); $md5->add("c");
          $md5->add("a")->add("b")->add("c");
          $md5->add("a", "b", "c");
          $md5->add("abc");

    • $md5->addfile($io_handle)

      The $io_handle will be read until EOF and its content appended to the message we calculate the digest for. The return value is the $md5 object itself.

      The addfile() method will croak() if it fails reading data for some reason. If it croaks it is unpredictable what the state of the $md5 object will be in. The addfile() method might have been able to read the file partially before it failed. It is probably wise to discard or reset the $md5 object if this occurs.

      In most cases you want to make sure that the $io_handle is in binmode before you pass it as argument to the addfile() method.

    • $md5->add_bits($data, $nbits)
    • $md5->add_bits($bitstring)

      Since the MD5 algorithm is byte oriented you can only add bits in multiples of 8, so you probably want to just use add() instead. The add_bits() method is provided for compatibility with other digest implementations. See Digest for a description of the arguments that add_bits() takes.

    • $md5->digest

      Return the binary digest for the message. The returned string will be 16 bytes long.

      Note that the digest operation is effectively a destructive, read-once operation. Once it has been performed, the Digest::MD5 object is automatically reset and can be used to calculate another digest value. Call $md5->clone->digest if you want to calculate the digest without resetting the digest state.

    • $md5->hexdigest

      Same as $md5->digest, but will return the digest in hexadecimal form. The length of the returned string will be 32 and it will only contain characters from this set: '0'..'9' and 'a'..'f'.

    • $md5->b64digest

      Same as $md5->digest, but will return the digest as a base64 encoded string. The length of the returned string will be 22 and it will only contain characters from this set: 'A'..'Z', 'a'..'z', '0'..'9', '+' and '/'.

      The base64 encoded string returned is not padded to be a multiple of 4 bytes long. If you want interoperability with other base64 encoded md5 digests you might want to append the string "==" to the result.
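
    The read-once behaviour of the digest methods, and the way clone() avoids it, can be seen in a few lines. This is a minimal sketch with arbitrary variable names.

    ```perl
    use strict;
    use warnings;
    use Digest::MD5 qw(md5_hex);

    my $md5 = Digest::MD5->new;
    $md5->add("abc");

    my $peek   = $md5->clone->hexdigest;   # digest of a copy, state untouched
    my $first  = $md5->hexdigest;          # reads the digest and resets the object
    my $second = $md5->hexdigest;          # now the digest of the empty message

    print "peek   $peek\nfirst  $first\nsecond $second\n";
    ```

    Here $peek and $first are both the digest of "abc", while $second equals md5_hex("") because the object was reset by the previous call.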

    EXAMPLES

    The simplest way to use this library is to import the md5_hex() function (or one of its cousins):

        use Digest::MD5 qw(md5_hex);
        print "Digest is ", md5_hex("foobarbaz"), "\n";

    The above example would print out the message:

        Digest is 6df23dc03f9b54cc38a0fc1483df6e21

    The same checksum can also be calculated in OO style:

        use Digest::MD5;

        $md5 = Digest::MD5->new;
        $md5->add('foo', 'bar');
        $md5->add('baz');
        $digest = $md5->hexdigest;

        print "Digest is $digest\n";

    With OO style, you can break the message arbitrarily. This means that we no longer need space for the whole message in memory, i.e. we can handle messages of any size.

    This is useful when calculating a checksum for files:

        use Digest::MD5;

        my $filename = shift || "/etc/passwd";
        open (my $fh, '<', $filename) or die "Can't open '$filename': $!";
        binmode($fh);

        $md5 = Digest::MD5->new;
        while (<$fh>) {
            $md5->add($_);
        }
        close($fh);
        print $md5->b64digest, " $filename\n";

    Or we can use the addfile method for more efficient reading of the file:

        use Digest::MD5;

        my $filename = shift || "/etc/passwd";
        open (my $fh, '<', $filename) or die "Can't open '$filename': $!";
        binmode ($fh);

        print Digest::MD5->new->addfile($fh)->hexdigest, " $filename\n";

    Perl 5.8 supports Unicode characters in strings. Since the MD5 algorithm is only defined for strings of bytes, it cannot be used on strings that contain characters with an ordinal number above 255. The MD5 functions and methods will croak if you try to feed them such input data:

        use Digest::MD5 qw(md5_hex);

        my $str = "abc\x{300}";
        print md5_hex($str), "\n";  # croaks
        # Wide character in subroutine entry

    What you can do is calculate the MD5 checksum of the UTF-8 representation of such strings. This is achieved by filtering the string through the encode_utf8() function:

        use Digest::MD5 qw(md5_hex);
        use Encode qw(encode_utf8);

        my $str = "abc\x{300}";
        print md5_hex(encode_utf8($str)), "\n";
        # 8c2d46911f3f5a326455f0ed7a8ed3b3

    SEE ALSO

    Digest, Digest::MD2, Digest::SHA, Digest::HMAC

    md5sum(1)

    RFC 1321

    http://en.wikipedia.org/wiki/MD5

    The paper "How to Break MD5 and Other Hash Functions" by Xiaoyun Wang and Hongbo Yu.

    COPYRIGHT

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

        Copyright 1998-2003 Gisle Aas.
        Copyright 1995-1996 Neil Winton.
        Copyright 1991-1992 RSA Data Security, Inc.

    The MD5 algorithm is defined in RFC 1321. This implementation is derived from the reference C code in RFC 1321 which is covered by the following copyright statement:

    • Copyright (C) 1991-2, RSA Data Security, Inc. Created 1991. All rights reserved.

      License to copy and use this software is granted provided that it is identified as the "RSA Data Security, Inc. MD5 Message-Digest Algorithm" in all material mentioning or referencing this software or this function.

      License is also granted to make and use derivative works provided that such works are identified as "derived from the RSA Data Security, Inc. MD5 Message-Digest Algorithm" in all material mentioning or referencing the derived work.

      RSA Data Security, Inc. makes no representations concerning either the merchantability of this software or the suitability of this software for any particular purpose. It is provided "as is" without express or implied warranty of any kind.

      These notices must be retained in any copies of any part of this documentation and/or software.

    This copyright does not prohibit distribution of any version of Perl containing this extension under the terms of the GNU or Artistic licenses.

    AUTHORS

    The original MD5 interface was written by Neil Winton (N.Winton@axion.bt.co.uk).

    The Digest::MD5 module is written by Gisle Aas <gisle@ActiveState.com>.

     
    Digest::SHA

    NAME

    Digest::SHA - Perl extension for SHA-1/224/256/384/512

    SYNOPSIS

    In programs:

        # Functional interface
        use Digest::SHA qw(sha1 sha1_hex sha1_base64 ...);

        $digest = sha1($data);
        $digest = sha1_hex($data);
        $digest = sha1_base64($data);
        $digest = sha256($data);
        $digest = sha384_hex($data);
        $digest = sha512_base64($data);

        # Object-oriented
        use Digest::SHA;

        $sha = Digest::SHA->new($alg);

        $sha->add($data);             # feed data into stream
        $sha->addfile(*F);
        $sha->addfile($filename);
        $sha->add_bits($bits);
        $sha->add_bits($data, $nbits);

        $sha_copy = $sha->clone;      # if needed, make copy of
        $sha->dump($file);            #     current digest state,
        $sha->load($file);            #     or save it on disk

        $digest = $sha->digest;       # compute digest
        $digest = $sha->hexdigest;
        $digest = $sha->b64digest;

    From the command line:

        $ shasum files
        $ shasum --help

    SYNOPSIS (HMAC-SHA)

        # Functional interface only
        use Digest::SHA qw(hmac_sha1 hmac_sha1_hex ...);

        $digest = hmac_sha1($data, $key);
        $digest = hmac_sha224_hex($data, $key);
        $digest = hmac_sha256_base64($data, $key);

    ABSTRACT

    Digest::SHA is a complete implementation of the NIST Secure Hash Standard. It gives Perl programmers a convenient way to calculate SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256 message digests. The module can handle all types of input, including partial-byte data.

    DESCRIPTION

    Digest::SHA is written in C for speed. If your platform lacks a C compiler, you can install the functionally equivalent (but much slower) Digest::SHA::PurePerl module.

    The programming interface is easy to use: it's the same one found in CPAN's Digest module. So, if your applications currently use Digest::MD5 and you'd prefer the stronger security of SHA, it's a simple matter to convert them.

    The interface provides two ways to calculate digests: all-at-once, or in stages. To illustrate, the following short program computes the SHA-256 digest of "hello world" using each approach:

        use Digest::SHA qw(sha256_hex);

        $data = "hello world";
        @frags = split(//, $data);

        # all-at-once (Functional style)
        $digest1 = sha256_hex($data);

        # in-stages (OOP style)
        $state = Digest::SHA->new(256);
        for (@frags) { $state->add($_) }
        $digest2 = $state->hexdigest;

        print $digest1 eq $digest2 ?
            "whew!\n" : "oops!\n";

    To calculate the digest of an n-bit message where n is not a multiple of 8, use the add_bits() method. For example, consider the 446-bit message consisting of the bit-string "110" repeated 148 times, followed by "11". Here's how to display its SHA-1 digest:

        use Digest::SHA;

        $bits = "110" x 148 . "11";
        $sha = Digest::SHA->new(1)->add_bits($bits);
        print $sha->hexdigest, "\n";

    Note that for larger bit-strings, it's more efficient to use the two-argument version add_bits($data, $nbits), where $data is in the customary packed binary format used for Perl strings.
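
    To make the two calling conventions concrete, the sketch below feeds the same 446-bit message to add_bits() in both forms and compares the digests. It assumes pack("B*", ...) as the way to build the packed form; the trailing zero bits it adds to fill the last byte are ignored because of the explicit bit count.

    ```perl
    use strict;
    use warnings;
    use Digest::SHA;

    my $bits  = "110" x 148 . "11";          # 446-bit message as a string of 0s and 1s
    my $nbits = length $bits;

    # One-argument form: a string of "0" and "1" characters.
    my $from_string = Digest::SHA->new(1)->add_bits($bits)->hexdigest;

    # Two-argument form: packed binary data plus an explicit bit count.
    my $packed      = pack("B*", $bits);     # pads the last byte with zero bits
    my $from_packed = Digest::SHA->new(1)->add_bits($packed, $nbits)->hexdigest;

    print "$from_string\n";
    print "both forms agree\n" if $from_string eq $from_packed;
    ```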

    The module also lets you save intermediate SHA states to disk, or display them on standard output. The dump() method generates portable, human-readable text describing the current state of computation. You can subsequently retrieve the file with load() to resume where the calculation left off.

    To see what a state description looks like, just run the following:

        use Digest::SHA;

        Digest::SHA->new->add("Shaw" x 1962)->dump;

    As an added convenience, the Digest::SHA module offers routines to calculate keyed hashes using the HMAC-SHA-1/224/256/384/512 algorithms. These services exist in functional form only, and mimic the style and behavior of the sha(), sha_hex(), and sha_base64() functions.

        # Test vector from draft-ietf-ipsec-ciph-sha-256-01.txt

        use Digest::SHA qw(hmac_sha256_hex);
        print hmac_sha256_hex("Hi There", chr(0x0b) x 32), "\n";

    UNICODE AND SIDE EFFECTS

    Perl supports Unicode strings as of version 5.6. Such strings may contain wide characters, namely, characters whose ordinal values are greater than 255. This can cause problems for digest algorithms such as SHA that are specified to operate on sequences of bytes.

    The rule by which Digest::SHA handles a Unicode string is easy to state, but potentially confusing to grasp: the string is interpreted as a sequence of byte values, where each byte value is equal to the ordinal value (viz. code point) of its corresponding Unicode character. That way, the Unicode string 'abc' has exactly the same digest value as the ordinary string 'abc'.

    Since a wide character does not fit into a byte, the Digest::SHA routines croak if they encounter one. Whereas if a Unicode string contains no wide characters, the module accepts it quite happily. The following code illustrates the two cases:

        use Digest::SHA qw(sha1_hex);

        $str1 = pack('U*', (0..255));
        print sha1_hex($str1);        # ok
        $str2 = pack('U*', (0..256));
        print sha1_hex($str2);        # croaks

    Be aware that the digest routines silently convert UTF-8 input into its equivalent byte sequence in the native encoding (cf. utf8::downgrade). This side effect influences only the way Perl stores the data internally, but otherwise leaves the actual value of the data intact.
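
    When a string may contain wide characters, one common approach is to hash an explicit byte encoding of it instead of the character string itself. The sketch below uses encode_utf8() from the Encode module for this; the choice of UTF-8 is an assumption of the example, and whatever encoding you pick must be used consistently by everyone comparing digests.

    ```perl
    use strict;
    use warnings;
    use Digest::SHA qw(sha256_hex);
    use Encode qw(encode_utf8);

    my $wide = "abc\x{300}";   # contains a character above 255

    # Hashing the character string directly croaks ...
    my $direct = eval { sha256_hex($wide) };
    my $error  = $@;

    # ... so hash its UTF-8 octets instead.
    my $digest = sha256_hex(encode_utf8($wide));

    print "direct call failed: $error" if !defined $direct;
    print "$digest\n";
    ```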

    NIST STATEMENT ON SHA-1

    NIST acknowledges that the work of Prof. Xiaoyun Wang constitutes a practical collision attack on SHA-1. Therefore, NIST encourages the rapid adoption of the SHA-2 hash functions (e.g. SHA-256) for applications requiring strong collision resistance, such as digital signatures.

    ref. http://csrc.nist.gov/groups/ST/hash/statement.html

    PADDING OF BASE64 DIGESTS

    By convention, CPAN Digest modules do not pad their Base64 output. Problems can occur when feeding such digests to other software that expects properly padded Base64 encodings.

    For the time being, any necessary padding must be done by the user. Fortunately, this is a simple operation: if the length of a Base64-encoded digest isn't a multiple of 4, simply append "=" characters to the end of the digest until it is:

        while (length($b64_digest) % 4) {
            $b64_digest .= '=';
        }

    To illustrate, sha256_base64("abc") is computed to be

        ungWv48Bz+pBQUDeXa4iI7ADYaOWF3qctBD/YfIAFa0

    which has a length of 43. So, the properly padded version is

        ungWv48Bz+pBQUDeXa4iI7ADYaOWF3qctBD/YfIAFa0=
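
    The padding rule can be wrapped in a small helper and cross-checked against MIME::Base64, which pads its output. This is only a sketch: pad_b64() is a hypothetical helper name and is not provided by Digest::SHA.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256 sha256_base64);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: append "=" until the length is a multiple of 4.
sub pad_b64 {
    my ($b64) = @_;
    $b64 .= '=' while length($b64) % 4;
    return $b64;
}

my $unpadded = sha256_base64("abc");             # unpadded, as CPAN Digest modules return it
my $padded   = pad_b64($unpadded);
my $expected = encode_base64(sha256("abc"), ""); # padded encoding of the raw digest

print length($unpadded), " -> ", length($padded), " characters\n";
print $padded eq $expected ? "matches MIME::Base64\n" : "mismatch\n";
```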

    EXPORT

    None by default.

    EXPORTABLE FUNCTIONS

    Provided your C compiler supports a 64-bit type (e.g. the long long of C99, or __int64 used by Microsoft C/C++), all of these functions will be available for use. Otherwise, you won't be able to perform the SHA-384 and SHA-512 transforms, both of which require 64-bit operations.

    Functional style

    • sha1($data, ...)
    • sha224($data, ...)
    • sha256($data, ...)
    • sha384($data, ...)
    • sha512($data, ...)
    • sha512224($data, ...)
    • sha512256($data, ...)

      Logically joins the arguments into a single string, and returns its SHA-1/224/256/384/512 digest encoded as a binary string.

    • sha1_hex($data, ...)
    • sha224_hex($data, ...)
    • sha256_hex($data, ...)
    • sha384_hex($data, ...)
    • sha512_hex($data, ...)
    • sha512224_hex($data, ...)
    • sha512256_hex($data, ...)

      Logically joins the arguments into a single string, and returns its SHA-1/224/256/384/512 digest encoded as a hexadecimal string.

    • sha1_base64($data, ...)
    • sha224_base64($data, ...)
    • sha256_base64($data, ...)
    • sha384_base64($data, ...)
    • sha512_base64($data, ...)
    • sha512224_base64($data, ...)
    • sha512256_base64($data, ...)

      Logically joins the arguments into a single string, and returns its SHA-1/224/256/384/512 digest encoded as a Base64 string.

      It's important to note that the resulting string does not contain the padding characters typical of Base64 encodings. This omission is deliberate, and is done to maintain compatibility with the family of CPAN Digest modules. See PADDING OF BASE64 DIGESTS for details.
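
    As a brief illustration of the functional style, the sketch below checks that several arguments are digested as if they were one joined string, and compares the sizes of the three encodings. The sample data is arbitrary.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256 sha256_hex sha256_base64);

# Multiple arguments are logically joined before hashing.
my $joined = sha256_hex("a", "b", "c");
my $single = sha256_hex("abc");
print "same digest\n" if $joined eq $single;

# The same 256-bit digest in its three encodings.
my $raw = sha256("abc");
my $b64 = sha256_base64("abc");
printf "%d bytes, %d hex digits, %d Base64 characters\n",
    length($raw), length($single), length($b64);
```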

    OOP style

    • new($alg)

      Returns a new Digest::SHA object. Allowed values for $alg are 1, 224, 256, 384, 512, 512224, or 512256. It's also possible to use common string representations of the algorithm (e.g. "sha256", "SHA-384"). If the argument is missing, SHA-1 will be used by default.

      Invoking new as an instance method will not create a new object; instead, it will simply reset the object to the initial state associated with $alg. If the argument is missing, the object will continue using the same algorithm that was selected at creation.

    • reset($alg)

      This method has exactly the same effect as new($alg). In fact, reset is just an alias for new.

    • hashsize

      Returns the number of digest bits for this object. The values are 160, 224, 256, 384, 512, 224, and 256 for SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224 and SHA-512/256, respectively.

    • algorithm

      Returns the digest algorithm for this object. The values are 1, 224, 256, 384, 512, 512224, and 512256 for SHA-1, SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256, respectively.

    • clone

      Returns a duplicate copy of the object.
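
      The methods described so far (new, reset, hashsize, algorithm, and clone) can be exercised together. The following is a small sketch with arbitrary sample data, not an excerpt from the module's own documentation.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha224_hex);

# String forms of the algorithm name are accepted.
my $sha = Digest::SHA->new("SHA-224");
printf "algorithm=%s hashsize=%s\n", $sha->algorithm, $sha->hashsize;

# add() returns the object, so clone() can be chained onto it.
my $copy = $sha->add("abc")->clone;

# reset() with an argument switches the existing object to another algorithm.
$sha->reset(256);
printf "after reset: algorithm=%s hashsize=%s\n", $sha->algorithm, $sha->hashsize;

# The clone keeps its own algorithm and its own state.
my $copy_alg = $copy->algorithm;
my $copy_hex = $copy->hexdigest;
print "clone unaffected by reset\n" if $copy_hex eq sha224_hex("abc");
```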

    • add($data, ...)

      Logically joins the arguments into a single string, and uses it to update the current digest state. In other words, the following statements have the same effect:

          $sha->add("a"); $sha->add("b"); $sha->add("c");
          $sha->add("a")->add("b")->add("c");
          $sha->add("a", "b", "c");
          $sha->add("abc");

      The return value is the updated object itself.

    • add_bits($data, $nbits)
    • add_bits($bits)

      Updates the current digest state by appending bits to it. The return value is the updated object itself.

      The first form causes the most-significant $nbits of $data to be appended to the stream. The $data argument is in the customary binary format used for Perl strings.

      The second form takes an ASCII string of "0" and "1" characters as its argument. It's equivalent to

          $sha->add_bits(pack("B*", $bits), length($bits));

      So, the following two statements do the same thing:

          $sha->add_bits("111100001010");
          $sha->add_bits("\xF0\xA0", 12);

    • addfile(*FILE)

      Reads from FILE until EOF, and appends that data to the current state. The return value is the updated object itself.

    • addfile($filename [, $mode])

      Reads the contents of $filename, and appends that data to the current state. The return value is the updated object itself.

      By default, $filename is simply opened and read; no special modes or I/O disciplines are used. To change this, set the optional $mode argument to one of the following values:

          "b"    read file in binary mode
          "p"    use portable mode
          "0"    use BITS mode

      The "p" mode ensures that the digest value of $filename will be the same when computed on different operating systems. It accomplishes this by internally translating all newlines in text files to UNIX format before calculating the digest. Binary files are read in raw mode with no translation whatsoever.

      The BITS mode ("0") interprets the contents of $filename as a logical stream of bits, where each ASCII '0' or '1' character represents a 0 or 1 bit, respectively. All other characters are ignored. This provides a convenient way to calculate the digest values of partial-byte data by using files, rather than having to write programs using the add_bits method.
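
      To make the relationship between BITS mode and add_bits concrete, the sketch below feeds the same 12 bits to the digest in three ways. The temporary file and its contents are assumptions made for this example.

```perl
use strict;
use warnings;
use Digest::SHA;
use File::Temp qw(tempfile);

# Write the bit string to a scratch file; characters other than
# '0' and '1' (here spaces and newlines) are ignored in BITS mode.
my ($fh, $bits_file) = tempfile(UNLINK => 1);
print {$fh} "1111 0000\n1010\n";
close $fh;

my $from_file = Digest::SHA->new(1)->addfile($bits_file, "0")->hexdigest;
my $from_bits = Digest::SHA->new(1)->add_bits("111100001010")->hexdigest;
my $from_data = Digest::SHA->new(1)->add_bits("\xF0\xA0", 12)->hexdigest;

print "all three agree\n"
    if $from_file eq $from_bits && $from_bits eq $from_data;
```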

    • dump($filename)

      Provides persistent storage of intermediate SHA states by writing a portable, human-readable representation of the current state to $filename. If the argument is missing, or equal to the empty string, the state information will be written to STDOUT.

    • load($filename)

      Returns a Digest::SHA object representing the intermediate SHA state that was previously dumped to $filename. If called as a class method, a new object is created; if called as an instance method, the object is reset to the state contained in $filename. If the argument is missing, or equal to the empty string, the state information will be read from STDIN.

    • digest

      Returns the digest encoded as a binary string.

      Note that the digest method is a read-once operation. Once it has been performed, the Digest::SHA object is automatically reset in preparation for calculating another digest value. Call $sha->clone->digest if it's necessary to preserve the original digest state.

    • hexdigest

      Returns the digest encoded as a hexadecimal string.

      Like digest, this method is a read-once operation. Call $sha->clone->hexdigest if it's necessary to preserve the original digest state.

      This method is inherited if Digest::base is installed on your system. Otherwise, a functionally equivalent substitute is used.

    • b64digest

      Returns the digest encoded as a Base64 string.

      Like digest, this method is a read-once operation. Call $sha->clone->b64digest if it's necessary to preserve the original digest state.

      This method is inherited if Digest::base is installed on your system. Otherwise, a functionally equivalent substitute is used.

      It's important to note that the resulting string does not contain the padding characters typical of Base64 encodings. This omission is deliberate, and is done to maintain compatibility with the family of CPAN Digest modules. See PADDING OF BASE64 DIGESTS for details.
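
    The read-once behaviour of digest, hexdigest, and b64digest is easy to overlook. The sketch below, using arbitrary sample data, shows how cloning preserves the running state while a direct call resets it.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

my $sha = Digest::SHA->new(1);
$sha->add("abc");

# Peek at the digest so far without disturbing the running state.
my $peek = $sha->clone->hexdigest;

# Reading the digest directly returns the same value, then resets the object.
my $final = $sha->hexdigest;

# The object now holds a fresh state: this is the digest of the empty string.
my $after = $sha->hexdigest;

print "peek and final agree\n"     if $peek eq $final;
print "object was reset\n"         if $after eq sha1_hex("");
```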

    HMAC-SHA-1/224/256/384/512

    • hmac_sha1($data, $key)
    • hmac_sha224($data, $key)
    • hmac_sha256($data, $key)
    • hmac_sha384($data, $key)
    • hmac_sha512($data, $key)
    • hmac_sha512224($data, $key)
    • hmac_sha512256($data, $key)

      Returns the HMAC-SHA-1/224/256/384/512 digest of $data/$key, with the result encoded as a binary string. Multiple $data arguments are allowed, provided that $key is the last argument in the list.

    • hmac_sha1_hex($data, $key)
    • hmac_sha224_hex($data, $key)
    • hmac_sha256_hex($data, $key)
    • hmac_sha384_hex($data, $key)
    • hmac_sha512_hex($data, $key)
    • hmac_sha512224_hex($data, $key)
    • hmac_sha512256_hex($data, $key)

      Returns the HMAC-SHA-1/224/256/384/512 digest of $data/$key, with the result encoded as a hexadecimal string. Multiple $data arguments are allowed, provided that $key is the last argument in the list.

    • hmac_sha1_base64($data, $key)
    • hmac_sha224_base64($data, $key)
    • hmac_sha256_base64($data, $key)
    • hmac_sha384_base64($data, $key)
    • hmac_sha512_base64($data, $key)
    • hmac_sha512224_base64($data, $key)
    • hmac_sha512256_base64($data, $key)

      Returns the HMAC-SHA-1/224/256/384/512 digest of $data/$key, with the result encoded as a Base64 string. Multiple $data arguments are allowed, provided that $key is the last argument in the list.

      It's important to note that the resulting string does not contain the padding characters typical of Base64 encodings. This omission is deliberate, and is done to maintain compatibility with the family of CPAN Digest modules. See PADDING OF BASE64 DIGESTS for details.
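
    The argument order of the HMAC functions (data first, key last) can be checked with a short sketch. The key and message below are placeholder values chosen for illustration.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256 hmac_sha256_hex);

my $key = "an example key";    # placeholder value, key is always the LAST argument

# Several data arguments are joined, exactly as with the plain sha*() functions.
my $one_piece  = hmac_sha256_hex("Hi There", $key);
my $two_pieces = hmac_sha256_hex("Hi ", "There", $key);
print "same MAC\n" if $one_piece eq $two_pieces;

# Swapping data and key produces a different MAC.
my $swapped = hmac_sha256_hex($key, "Hi There");
print "argument order matters\n" if $swapped ne $one_piece;

# The binary form is 32 bytes for HMAC-SHA-256.
my $raw = hmac_sha256("Hi There", $key);
print length($raw), " bytes\n";
```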

    SEE ALSO

    Digest, Digest::SHA::PurePerl

    The Secure Hash Standard (Draft FIPS PUB 180-4) can be found at:

    http://csrc.nist.gov/publications/drafts/fips180-4/Draft-FIPS180-4_Feb2011.pdf

    The Keyed-Hash Message Authentication Code (HMAC):

    http://csrc.nist.gov/publications/fips/fips198/fips-198a.pdf

    AUTHOR

        Mark Shelor <mshelor@cpan.org>

    ACKNOWLEDGMENTS

    The author is particularly grateful to

        Gisle Aas
        Sean Burke
        Chris Carey
        Alexandr Ciornii
        Jim Doble
        Thomas Drugeon
        Julius Duque
        Jeffrey Friedl
        Robert Gilmour
        Brian Gladman
        Adam Kennedy
        Andy Lester
        Alex Muntada
        Steve Peters
        Chris Skiscim
        Martin Thurn
        Gunnar Wolf
        Adam Woodbury

    "who by trained skill rescued life from such great billows and such thick darkness and moored it in so perfect a calm and in so brilliant a light" - Lucretius

    COPYRIGHT AND LICENSE

    Copyright (C) 2003-2013 Mark Shelor

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    perlartistic

     

    Digest::base

    NAME

    Digest::base - Digest base class

    SYNOPSIS

        package Digest::Foo;
        use base 'Digest::base';

    DESCRIPTION

    The Digest::base class provides implementations of the methods addfile and add_bits in terms of add, and of the methods hexdigest and b64digest in terms of digest.

    Digest implementations might want to inherit from this class to get these implementations of the alternative add and digest methods. A minimal subclass needs to implement the following methods by itself:

        new
        clone
        add
        digest

    The arguments and expected behaviour of these methods are described in Digest.
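
    A minimal subclass might look like the sketch below. Digest::XorToy is a hypothetical toy class invented for this example (it XORs all input bytes into one byte and is not a real digest algorithm); it only shows which four methods a subclass supplies and which ones it then inherits.

```perl
use strict;
use warnings;

package Digest::XorToy;    # hypothetical class name, for illustration only
use base 'Digest::base';

# The state is a single integer holding the XOR of all bytes seen so far.
sub new    { my $class = shift; my $x = 0; return bless \$x, ref($class) || $class }
sub clone  { my $self = shift; my $copy = $$self; return bless \$copy, ref $self }
sub add    { my $self = shift; $$self ^= $_ for map { unpack "C*", $_ } @_; return $self }
sub digest { my $self = shift; my $d = chr $$self; $$self = 0; return $d }  # read-once

package main;

my $toy = Digest::XorToy->new;
$toy->add("a", "b");                   # 0x61 ^ 0x62 == 0x03

# hexdigest and b64digest are inherited from Digest::base.
my $hex = $toy->clone->hexdigest;      # clone first, because digest resets the state
my $b64 = $toy->b64digest;
print "hex=$hex b64=$b64\n";
```

    Because digest is a read-once operation in the Digest interface, the toy class resets its state there as well.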

    SEE ALSO

    Digest

     

    Digest::file

    NAME

    Digest::file - Calculate digests of files

    SYNOPSIS

        # Poor man's "md5sum" command
        use Digest::file qw(digest_file_hex);
        for (@ARGV) {
            print digest_file_hex($_, "MD5"), " $_\n";
        }

    DESCRIPTION

    This module provides three convenience functions for calculating the digests of files:

    • digest_file( $file, $algorithm, [$arg,...] )

      This function will calculate and return the binary digest of the bytes of the given file. The function will croak if it fails to open or read the file.

      The $algorithm is a string like "MD2", "MD5", "SHA-1", "SHA-512". Additional arguments are passed to the constructor for the implementation of the given algorithm.

    • digest_file_hex( $file, $algorithm, [$arg,...] )

      Same as digest_file(), but returns the digest in hex form.

    • digest_file_base64( $file, $algorithm, [$arg,...] )

      Same as digest_file(), but returns the digest as a Base64-encoded string.
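
    As a quick check of these functions, the sketch below writes a small scratch file and compares the file digest with the digest of the same bytes computed in memory. The temporary file and the choice of SHA-256 are assumptions made for the example.

```perl
use strict;
use warnings;
use Digest::file qw(digest_file digest_file_hex digest_file_base64);
use Digest::SHA qw(sha256_hex);
use File::Temp qw(tempfile);

# Create a scratch file containing the three bytes "abc".
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;
print {$fh} "abc";
close $fh;

my $file_hex = digest_file_hex($path, "SHA-256");
my $mem_hex  = sha256_hex("abc");
print $file_hex eq $mem_hex ? "file and in-memory digests agree\n" : "mismatch\n";

# The binary and Base64 forms describe the same digest.
my $raw = digest_file($path, "SHA-256");
my $b64 = digest_file_base64($path, "SHA-256");
printf "%d raw bytes, Base64: %s\n", length($raw), $b64;
```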

    SEE ALSO

    Digest

     

    Devel::InnerPackage

    NAME

    Devel::InnerPackage - find all the inner packages of a package

    SYNOPSIS

        use Foo::Bar;
        use Devel::InnerPackage qw(list_packages);
        my @inner_packages = list_packages('Foo::Bar');

    DESCRIPTION

    Given a file like this

        package Foo::Bar;
        sub foo {}
        package Foo::Bar::Quux;
        sub quux {}
        package Foo::Bar::Quirka;
        sub quirka {}
        1;

    then

        list_packages('Foo::Bar');

    will return

        Foo::Bar::Quux
        Foo::Bar::Quirka

    METHODS

    list_packages <package name>

    Return a list of all inner packages of that package.

    AUTHOR

    Simon Wistow <simon@thegestalt.org>

    COPYING

    Copyright, 2005 Simon Wistow

    Distributed under the same terms as Perl itself.

    BUGS

    None known.

     

    Devel::PPPort

    NAME

    Devel::PPPort - Perl/Pollution/Portability

    SYNOPSIS

        Devel::PPPort::WriteFile(); # defaults to ./ppport.h
        Devel::PPPort::WriteFile('someheader.h');

    DESCRIPTION

    Perl's API has changed over time, gaining new features, new functions, increasing its flexibility, and reducing the impact on the C namespace environment (reduced pollution). The header file written by this module, typically ppport.h, attempts to bring some of the newer Perl API features to older versions of Perl, so that you can worry less about keeping track of old releases, but users can still reap the benefit.

    Devel::PPPort contains a single function, called WriteFile . Its only purpose is to write the ppport.h C header file. This file contains a series of macros and, if explicitly requested, functions that allow XS modules to be built using older versions of Perl. Currently, Perl versions from 5.003 to 5.11.5 are supported.

    This module is used by h2xs to write the file ppport.h.

    Why use ppport.h?

    You should use ppport.h in modern code so that your code will work with the widest range of Perl interpreters possible, without significant additional work.

    You should also try to make older code fully use ppport.h, because the reduced pollution of newer Perl versions is important. It's so important that the old polluting ways of original Perl modules will not be supported very far into the future, and your module will almost certainly break! By adapting to it now, you'll gain compatibility and a sense of having done the electronic ecology some good.

    How to use ppport.h

    Don't direct the users of your module to download Devel::PPPort. They are most probably not XS writers. Also, don't make ppport.h optional. Rather, just take the most recent copy of ppport.h that you can find (e.g. by generating it with the latest Devel::PPPort release from CPAN), copy it into your project, adjust your project to use it, and distribute the header along with your module.

    Running ppport.h

    But ppport.h is more than just a C header. It's also a Perl script that can check your source code. It will suggest hints and portability notes, and can even make suggestions on how to change your code. You can run it like any other Perl program:

        perl ppport.h [options] [files]

    It also has embedded documentation, so you can use

        perldoc ppport.h

    to find out more about how to use it.

    FUNCTIONS

    WriteFile

    WriteFile takes one optional argument. When called with one argument, it expects to be passed a filename. When called with no arguments, it defaults to the filename ppport.h.

    The function returns a true value if the file was written successfully. Otherwise it returns a false value.
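
    Since WriteFile reports success through its return value, a caller can check it directly. The following sketch writes the header into a temporary directory; the directory and the error message are assumptions made for illustration.

```perl
use strict;
use warnings;
use Devel::PPPort;
use File::Spec;
use File::Temp qw(tempdir);

# Write the header somewhere disposable rather than into the current directory.
my $dir    = tempdir(CLEANUP => 1);
my $header = File::Spec->catfile($dir, 'ppport.h');

# WriteFile returns a true value on success and a false value otherwise.
Devel::PPPort::WriteFile($header)
    or die "could not write $header\n";

my $size = -s $header;
print "wrote $header ($size bytes)\n";
```

    In a real distribution the generated ppport.h would be copied next to the XS sources and shipped with the module.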

    COMPATIBILITY

    ppport.h supports Perl versions from 5.003 to 5.11.5 in threaded and non-threaded configurations.

    Provided Perl compatibility API

    The header file written by this module, typically ppport.h, provides access to the following elements of the Perl API that are not available in older Perl releases:

    1. _aMY_CXT
    2. _pMY_CXT
    3. aMY_CXT
    4. aMY_CXT_
    5. aTHX
    6. aTHX_
    7. aTHXR
    8. aTHXR_
    9. AvFILLp
    10. boolSV
    11. call_argv
    12. call_method
    13. call_pv
    14. call_sv
    15. ckWARN
    16. CopFILE
    17. CopFILE_set
    18. CopFILEAV
    19. CopFILEGV
    20. CopFILEGV_set
    21. CopFILESV
    22. CopSTASH
    23. CopSTASH_eq
    24. CopSTASH_set
    25. CopSTASHPV
    26. CopSTASHPV_set
    27. CopyD
    28. CPERLscope
    29. dAX
    30. dAXMARK
    31. DEFSV
    32. DEFSV_set
    33. dITEMS
    34. dMY_CXT
    35. dMY_CXT_SV
    36. dNOOP
    37. dTHR
    38. dTHX
    39. dTHXa
    40. dTHXoa
    41. dTHXR
    42. dUNDERBAR
    43. dVAR
    44. dXCPT
    45. dXSTARG
    46. END_EXTERN_C
    47. ERRSV
    48. eval_pv
    49. eval_sv
    50. EXTERN_C
    51. G_METHOD
    52. get_av
    53. get_cv
    54. get_cvn_flags
    55. get_cvs
    56. get_hv
    57. get_sv
    58. grok_bin
    59. grok_hex
    60. grok_number
    61. grok_numeric_radix
    62. GROK_NUMERIC_RADIX
    63. grok_oct
    64. gv_fetchpvn_flags
    65. gv_fetchpvs
    66. gv_fetchsv
    67. gv_stashpvn
    68. gv_stashpvs
    69. GvSVn
    70. hv_fetchs
    71. hv_stores
    72. HvNAME_get
    73. HvNAMELEN_get
    74. IN_LOCALE
    75. IN_LOCALE_COMPILETIME
    76. IN_LOCALE_RUNTIME
    77. IN_PERL_COMPILETIME
    78. INT2PTR
    79. IS_NUMBER_GREATER_THAN_UV_MAX
    80. IS_NUMBER_IN_UV
    81. IS_NUMBER_INFINITY
    82. IS_NUMBER_NAN
    83. IS_NUMBER_NEG
    84. IS_NUMBER_NOT_INT
    85. isALNUMC
    86. isASCII
    87. isBLANK
    88. isCNTRL
    89. isGRAPH
    90. isGV_with_GP
    91. isPRINT
    92. isPSXSPC
    93. isPUNCT
    94. isXDIGIT
    95. IVdf
    96. IVSIZE
    97. IVTYPE
    98. load_module
    99. memEQ
    100. memEQs
    101. memNE
    102. memNEs
    103. MoveD
    104. mPUSHi
    105. mPUSHn
    106. mPUSHp
    107. mPUSHs
    108. mPUSHu
    109. mXPUSHi
    110. mXPUSHn
    111. mXPUSHp
    112. mXPUSHs
    113. mXPUSHu
    114. MY_CXT
    115. MY_CXT_CLONE
    116. MY_CXT_INIT
    117. my_snprintf
    118. my_sprintf
    119. my_strlcat
    120. my_strlcpy
    121. newCONSTSUB
    122. newRV_inc
    123. newRV_noinc
    124. newSV_type
    125. newSVpvn
    126. newSVpvn_flags
    127. newSVpvn_share
    128. newSVpvn_utf8
    129. newSVpvs
    130. newSVpvs_flags
    131. newSVpvs_share
    132. newSVuv
    133. Newx
    134. Newxc
    135. Newxz
    136. NOOP
    137. NUM2PTR
    138. NVef
    139. NVff
    140. NVgf
    141. NVTYPE
    142. packWARN
    143. PERL_ABS
    144. PERL_BCDVERSION
    145. PERL_GCC_BRACE_GROUPS_FORBIDDEN
    146. PERL_HASH
    147. PERL_INT_MAX
    148. PERL_INT_MIN
    149. PERL_LONG_MAX
    150. PERL_LONG_MIN
    151. PERL_MAGIC_arylen
    152. PERL_MAGIC_backref
    153. PERL_MAGIC_bm
    154. PERL_MAGIC_collxfrm
    155. PERL_MAGIC_dbfile
    156. PERL_MAGIC_dbline
    157. PERL_MAGIC_defelem
    158. PERL_MAGIC_env
    159. PERL_MAGIC_envelem
    160. PERL_MAGIC_ext
    161. PERL_MAGIC_fm
    162. PERL_MAGIC_glob
    163. PERL_MAGIC_isa
    164. PERL_MAGIC_isaelem
    165. PERL_MAGIC_mutex
    166. PERL_MAGIC_nkeys
    167. PERL_MAGIC_overload
    168. PERL_MAGIC_overload_elem
    169. PERL_MAGIC_overload_table
    170. PERL_MAGIC_pos
    171. PERL_MAGIC_qr
    172. PERL_MAGIC_regdata
    173. PERL_MAGIC_regdatum
    174. PERL_MAGIC_regex_global
    175. PERL_MAGIC_shared
    176. PERL_MAGIC_shared_scalar
    177. PERL_MAGIC_sig
    178. PERL_MAGIC_sigelem
    179. PERL_MAGIC_substr
    180. PERL_MAGIC_sv
    181. PERL_MAGIC_taint
    182. PERL_MAGIC_tied
    183. PERL_MAGIC_tiedelem
    184. PERL_MAGIC_tiedscalar
    185. PERL_MAGIC_utf8
    186. PERL_MAGIC_uvar
    187. PERL_MAGIC_uvar_elem
    188. PERL_MAGIC_vec
    189. PERL_MAGIC_vstring
    190. PERL_PV_ESCAPE_ALL
    191. PERL_PV_ESCAPE_FIRSTCHAR
    192. PERL_PV_ESCAPE_NOBACKSLASH
    193. PERL_PV_ESCAPE_NOCLEAR
    194. PERL_PV_ESCAPE_QUOTE
    195. PERL_PV_ESCAPE_RE
    196. PERL_PV_ESCAPE_UNI
    197. PERL_PV_ESCAPE_UNI_DETECT
    198. PERL_PV_PRETTY_DUMP
    199. PERL_PV_PRETTY_ELLIPSES
    200. PERL_PV_PRETTY_LTGT
    201. PERL_PV_PRETTY_NOCLEAR
    202. PERL_PV_PRETTY_QUOTE
    203. PERL_PV_PRETTY_REGPROP
    204. PERL_QUAD_MAX
    205. PERL_QUAD_MIN
    206. PERL_REVISION
    207. PERL_SCAN_ALLOW_UNDERSCORES
    208. PERL_SCAN_DISALLOW_PREFIX
    209. PERL_SCAN_GREATER_THAN_UV_MAX
    210. PERL_SCAN_SILENT_ILLDIGIT
    211. PERL_SHORT_MAX
    212. PERL_SHORT_MIN
    213. PERL_SIGNALS_UNSAFE_FLAG
    214. PERL_SUBVERSION
    215. PERL_UCHAR_MAX
    216. PERL_UCHAR_MIN
    217. PERL_UINT_MAX
    218. PERL_UINT_MIN
    219. PERL_ULONG_MAX
    220. PERL_ULONG_MIN
    221. PERL_UNUSED_ARG
    222. PERL_UNUSED_CONTEXT
    223. PERL_UNUSED_DECL
    224. PERL_UNUSED_VAR
    225. PERL_UQUAD_MAX
    226. PERL_UQUAD_MIN
    227. PERL_USE_GCC_BRACE_GROUPS
    228. PERL_USHORT_MAX
    229. PERL_USHORT_MIN
    230. PERL_VERSION
    231. Perl_warner
    232. Perl_warner_nocontext
    233. PERLIO_FUNCS_CAST
    234. PERLIO_FUNCS_DECL
    235. PL_bufend
    236. PL_bufptr
    237. PL_compiling
    238. PL_copline
    239. PL_curcop
    240. PL_curstash
    241. PL_DBsignal
    242. PL_DBsingle
    243. PL_DBsub
    244. PL_DBtrace
    245. PL_debstash
    246. PL_defgv
    247. PL_diehook
    248. PL_dirty
    249. PL_dowarn
    250. PL_errgv
    251. PL_error_count
    252. PL_expect
    253. PL_hexdigit
    254. PL_hints
    255. PL_in_my
    256. PL_in_my_stash
    257. PL_laststatval
    258. PL_lex_state
    259. PL_lex_stuff
    260. PL_linestr
    261. PL_na
    262. PL_no_modify
    263. PL_parser
    264. PL_perl_destruct_level
    265. PL_perldb
    266. PL_ppaddr
    267. PL_rsfp
    268. PL_rsfp_filters
    269. PL_signals
    270. PL_stack_base
    271. PL_stack_sp
    272. PL_statcache
    273. PL_stdingv
    274. PL_Sv
    275. PL_sv_arenaroot
    276. PL_sv_no
    277. PL_sv_undef
    278. PL_sv_yes
    279. PL_tainted
    280. PL_tainting
    281. PL_tokenbuf
    282. pMY_CXT
    283. pMY_CXT_
    284. Poison
    285. PoisonFree
    286. PoisonNew
    287. PoisonWith
    288. pTHX
    289. pTHX_
    290. PTR2IV
    291. PTR2nat
    292. PTR2NV
    293. PTR2ul
    294. PTR2UV
    295. PTRV
    296. PUSHmortal
    297. PUSHu
    298. pv_display
    299. pv_escape
    300. pv_pretty
    301. SAVE_DEFSV
    302. START_EXTERN_C
    303. START_MY_CXT
    304. STMT_END
    305. STMT_START
    306. STR_WITH_LEN
    307. sv_2pv_flags
    308. sv_2pv_nolen
    309. sv_2pvbyte
    310. sv_2pvbyte_nolen
    311. sv_2uv
    312. sv_catpv_mg
    313. sv_catpvf_mg
    314. sv_catpvf_mg_nocontext
    315. sv_catpvn_mg
    316. sv_catpvn_nomg
    317. sv_catpvs
    318. sv_catsv_mg
    319. sv_catsv_nomg
    320. SV_CONST_RETURN
    321. SV_COW_DROP_PV
    322. SV_COW_SHARED_HASH_KEYS
    323. SV_GMAGIC
    324. SV_HAS_TRAILING_NUL
    325. SV_IMMEDIATE_UNREF
    326. sv_magic_portable
    327. SV_MUTABLE_RETURN
    328. SV_NOSTEAL
    329. sv_pvn_force_flags
    330. sv_pvn_nomg
    331. sv_setiv_mg
    332. sv_setnv_mg
    333. sv_setpv_mg
    334. sv_setpvf_mg
    335. sv_setpvf_mg_nocontext
    336. sv_setpvn_mg
    337. sv_setpvs
    338. sv_setsv_mg
    339. sv_setsv_nomg
    340. sv_setuv
    341. sv_setuv_mg
    342. SV_SMAGIC
    343. sv_usepvn_mg
    344. SV_UTF8_NO_ENCODING
    345. sv_uv
    346. sv_vcatpvf
    347. sv_vcatpvf_mg
    348. sv_vsetpvf
    349. sv_vsetpvf_mg
    350. SVf
    351. SVf_UTF8
    352. SVfARG
    353. SvGETMAGIC
    354. SvIV_nomg
    355. SvMAGIC_set
    356. SvPV_const
    357. SvPV_flags
    358. SvPV_flags_const
    359. SvPV_flags_const_nolen
    360. SvPV_flags_mutable
    361. SvPV_force
    362. SvPV_force_flags
    363. SvPV_force_flags_mutable
    364. SvPV_force_flags_nolen
    365. SvPV_force_mutable
    366. SvPV_force_nolen
    367. SvPV_force_nomg
    368. SvPV_force_nomg_nolen
    369. SvPV_mutable
    370. SvPV_nolen
    371. SvPV_nolen_const
    372. SvPV_nomg
    373. SvPV_nomg_const
    374. SvPV_nomg_const_nolen
    375. SvPV_renew
    376. SvPVbyte
    377. SvPVX_const
    378. SvPVX_mutable
    379. SvREFCNT_inc
    380. SvREFCNT_inc_NN
    381. SvREFCNT_inc_simple
    382. SvREFCNT_inc_simple_NN
    383. SvREFCNT_inc_simple_void
    384. SvREFCNT_inc_simple_void_NN
    385. SvREFCNT_inc_void
    386. SvREFCNT_inc_void_NN
    387. SvRV_set
    388. SvSHARED_HASH
    389. SvSTASH_set
    390. SvUOK
    391. SvUV
    392. SvUV_nomg
    393. SvUV_set
    394. SvUVx
    395. SvUVX
    396. SvUVXx
    397. SvVSTRING_mg
    398. UNDERBAR
    399. UTF8_MAXBYTES
    400. UVof
    401. UVSIZE
    402. UVTYPE
    403. UVuf
    404. UVXf
    405. UVxf
    406. vload_module
    407. vnewSVpvf
    408. WARN_ALL
    409. WARN_AMBIGUOUS
    410. WARN_ASSERTIONS
    411. WARN_BAREWORD
    412. WARN_CLOSED
    413. WARN_CLOSURE
    414. WARN_DEBUGGING
    415. WARN_DEPRECATED
    416. WARN_DIGIT
    417. WARN_EXEC
    418. WARN_EXITING
    419. WARN_GLOB
    420. WARN_INPLACE
    421. WARN_INTERNAL
    422. WARN_IO
    423. WARN_LAYER
    424. WARN_MALLOC
    425. WARN_MISC
    426. WARN_NEWLINE
    427. WARN_NUMERIC
    428. WARN_ONCE
    429. WARN_OVERFLOW
    430. WARN_PACK
    431. WARN_PARENTHESIS
    432. WARN_PIPE
    433. WARN_PORTABLE
    434. WARN_PRECEDENCE
    435. WARN_PRINTF
    436. WARN_PROTOTYPE
    437. WARN_QW
    438. WARN_RECURSION
    439. WARN_REDEFINE
    440. WARN_REGEXP
    441. WARN_RESERVED
    442. WARN_SEMICOLON
    443. WARN_SEVERE
    444. WARN_SIGNAL
    445. WARN_SUBSTR
    446. WARN_SYNTAX
    447. WARN_TAINT
    448. WARN_THREADS
    449. WARN_UNINITIALIZED
    450. WARN_UNOPENED
    451. WARN_UNPACK
    452. WARN_UNTIE
    453. WARN_UTF8
    454. WARN_VOID
    455. warner
    456. XCPT_CATCH
    457. XCPT_RETHROW
    458. XCPT_TRY_END
    459. XCPT_TRY_START
    460. XPUSHmortal
    461. XPUSHu
    462. XSprePUSH
    463. XSPROTO
    464. XSRETURN
    465. XSRETURN_UV
    466. XST_mUV
    467. ZeroD

    Perl API not supported by ppport.h

    A large part of the API is still not supported by ppport.h, either because it doesn't make sense to back-port that part of the API, or simply because it hasn't been implemented yet. Patches welcome!

    Here's a list of the currently unsupported API, and also the version of Perl below which it is unsupported:

    • perl 5.14.0
      1. BhkDISABLE
      2. BhkENABLE
      3. BhkENTRY_set
      4. MULTICALL
      5. PERL_SYS_TERM
      6. POP_MULTICALL
      7. PUSH_MULTICALL
      8. XopDISABLE
      9. XopENABLE
      10. XopENTRY
      11. XopENTRY_set
      12. cophh_new_empty
      13. my_lstat
      14. my_stat
      15. ref
      16. stashpv_hvname_match
    • perl 5.13.10
      1. foldEQ_utf8_flags
      2. is_utf8_xidcont
      3. is_utf8_xidfirst
    • perl 5.13.8
      1. foldEQ_latin1
      2. mg_findext
      3. parse_arithexpr
      4. parse_fullexpr
      5. parse_listexpr
      6. parse_termexpr
      7. sv_unmagicext
    • perl 5.13.7
      1. HvENAME
      2. OP_CLASS
      3. SvPV_nomg_nolen
      4. XopFLAGS
      5. amagic_deref_call
      6. bytes_cmp_utf8
      7. cop_hints_2hv
      8. cop_hints_fetch_pv
      9. cop_hints_fetch_pvn
      10. cop_hints_fetch_pvs
      11. cop_hints_fetch_sv
      12. cophh_2hv
      13. cophh_copy
      14. cophh_delete_pv
      15. cophh_delete_pvn
      16. cophh_delete_pvs
      17. cophh_delete_sv
      18. cophh_fetch_pv
      19. cophh_fetch_pvn
      20. cophh_fetch_pvs
      21. cophh_fetch_sv
      22. cophh_free
      23. cophh_store_pv
      24. cophh_store_pvn
      25. cophh_store_pvs
      26. cophh_store_sv
      27. custom_op_register
      28. custom_op_xop
      29. newFOROP
      30. newWHILEOP
      31. op_lvalue
      32. op_scope
      33. parse_barestmt
      34. parse_block
      35. parse_label
    • perl 5.13.6
      1. LINKLIST
      2. SvTRUE_nomg
      3. ck_entersub_args_list
      4. ck_entersub_args_proto
      5. ck_entersub_args_proto_or_list
      6. cv_get_call_checker
      7. cv_set_call_checker
      8. isWORDCHAR
      9. lex_stuff_pv
      10. mg_free_type
      11. newSVpv_share
      12. op_append_elem
      13. op_append_list
      14. op_contextualize
      15. op_linklist
      16. op_prepend_elem
      17. parse_stmtseq
      18. rv2cv_op_cv
      19. savesharedpvs
      20. savesharedsvpv
      21. sv_2bool_flags
      22. sv_catpv_flags
      23. sv_catpv_nomg
      24. sv_catpvs_flags
      25. sv_catpvs_mg
      26. sv_catpvs_nomg
      27. sv_cmp_flags
      28. sv_cmp_locale_flags
      29. sv_collxfrm_flags
      30. sv_eq_flags
      31. sv_setpvs_mg
      32. sv_setref_pvs
    • perl 5.13.5
      1. PL_rpeepp
      2. caller_cx
      3. isOCTAL
      4. lex_stuff_pvs
      5. parse_fullstmt
    • perl 5.13.4
      1. XS_APIVERSION_BOOTCHECK
    • perl 5.13.3
      1. blockhook_register
      2. croak_no_modify
    • perl 5.13.2
      1. SvNV_nomg
      2. find_rundefsv
      3. foldEQ
      4. foldEQ_locale
      5. foldEQ_utf8
      6. hv_fill
      7. sv_dec_nomg
      8. sv_inc_nomg
    • perl 5.13.1
      1. croak_sv
      2. die_sv
      3. mess_sv
      4. sv_2nv_flags
      5. warn_sv
    • perl 5.11.5
      1. sv_pos_u2b_flags
    • perl 5.11.4
      1. prescan_version
    • perl 5.11.2
      1. PL_keyword_plugin
      2. lex_bufutf8
      3. lex_discard_to
      4. lex_grow_linestr
      5. lex_next_chunk
      6. lex_peek_unichar
      7. lex_read_space
      8. lex_read_to
      9. lex_read_unichar
      10. lex_stuff_pvn
      11. lex_stuff_sv
      12. lex_unstuff
      13. pad_findmy
    • perl 5.11.1
      1. ck_warner
      2. ck_warner_d
      3. is_utf8_perl_space
      4. is_utf8_perl_word
      5. is_utf8_posix_digit
    • perl 5.11.0
      1. Gv_AMupdate
      2. PL_opfreehook
      3. SvOOK_offset
      4. av_iter_p
      5. fetch_cop_label
      6. gv_add_by_type
      7. gv_fetchmethod_flags
      8. is_ascii_string
      9. pregfree2
      10. save_adelete
      11. save_aelem_flags
      12. save_hdelete
      13. save_helem_flags
      14. sv_utf8_upgrade_flags_grow
    • perl 5.10.1
      1. HeUTF8
      2. croak_xs_usage
      3. mro_get_from_name
      4. mro_get_private_data
      5. mro_register
      6. mro_set_mro
      7. mro_set_private_data
      8. save_hints
      9. save_padsv_and_mortalize
      10. save_pushi32ptr
      11. save_pushptr
      12. save_pushptrptr
      13. sv_insert_flags
    • perl 5.10.0
      1. hv_common
      2. hv_common_key_len
      3. sv_destroyable
      4. sys_init
      5. sys_init3
      6. sys_term
    • perl 5.9.5
      1. PL_parser
      2. Perl_signbit
      3. SvRX
      4. SvRXOK
      5. av_create_and_push
      6. av_create_and_unshift_one
      7. gv_fetchfile_flags
      8. lex_start
      9. mro_get_linear_isa
      10. mro_method_changed_in
      11. my_dirfd
      12. pregcomp
      13. ptr_table_clear
      14. ptr_table_fetch
      15. ptr_table_free
      16. ptr_table_new
      17. ptr_table_split
      18. ptr_table_store
      19. re_compile
      20. re_intuit_start
      21. reg_named_buff_all
      22. reg_named_buff_exists
      23. reg_named_buff_fetch
      24. reg_named_buff_firstkey
      25. reg_named_buff_nextkey
      26. reg_named_buff_scalar
      27. regfree_internal
      28. savesharedpvn
      29. scan_vstring
      30. upg_version
    • perl 5.9.4
      1. PerlIO_context_layers
      2. gv_name_set
      3. hv_copy_hints_hv
      4. my_vsnprintf
      5. newXS_flags
      6. regclass_swash
      7. sv_does
      8. sv_usepvn_flags
    • perl 5.9.3
      1. av_arylen_p
      2. ckwarn
      3. ckwarn_d
      4. csighandler
      5. dMULTICALL
      6. doref
      7. gv_const_sv
      8. hv_eiter_p
      9. hv_eiter_set
      10. hv_name_set
      11. hv_placeholders_get
      12. hv_placeholders_p
      13. hv_placeholders_set
      14. hv_riter_p
      15. hv_riter_set
      16. is_utf8_string_loclen
      17. newGIVENOP
      18. newSVhek
      19. newWHENOP
      20. savepvs
      21. sortsv_flags
      22. vverify
    • perl 5.9.2
      1. SvPVbyte_force
      2. find_rundefsvoffset
      3. op_refcnt_lock
      4. op_refcnt_unlock
      5. savesvpv
      6. vnormal
    • perl 5.9.1
      1. hv_clear_placeholders
      2. hv_scalar
      3. scan_version
      4. sv_2iv_flags
      5. sv_2uv_flags
    • perl 5.9.0
      1. new_version
      2. save_set_svflags
      3. vcmp
      4. vnumify
      5. vstringify
    • perl 5.8.3
      1. SvIsCOW
      2. SvIsCOW_shared_hash
    • perl 5.8.1
      1. SvVOK
      2. doing_taint
      3. find_runcv
      4. is_utf8_string_loc
      5. packlist
      6. save_bool
      7. savestack_grow_cnt
      8. seed
      9. sv_cat_decode
      10. sv_compile_2op
      11. sv_setpviv
      12. sv_setpviv_mg
      13. unpackstring
    • perl 5.8.0
      1. hv_iternext_flags
      2. hv_store_flags
      3. is_utf8_idcont
      4. nothreadhook
    • perl 5.7.3
      1. OP_DESC
      2. OP_NAME
      3. PL_peepp
      4. PerlIO_clearerr
      5. PerlIO_close
      6. PerlIO_eof
      7. PerlIO_error
      8. PerlIO_fileno
      9. PerlIO_fill
      10. PerlIO_flush
      11. PerlIO_get_base
      12. PerlIO_get_bufsiz
      13. PerlIO_get_cnt
      14. PerlIO_get_ptr
      15. PerlIO_read
      16. PerlIO_seek
      17. PerlIO_set_cnt
      18. PerlIO_set_ptrcnt
      19. PerlIO_setlinebuf
      20. PerlIO_stderr
      21. PerlIO_stdin
      22. PerlIO_stdout
      23. PerlIO_tell
      24. PerlIO_unread
      25. PerlIO_write
      26. SvLOCK
      27. SvSHARE
      28. SvUNLOCK
      29. atfork_lock
      30. atfork_unlock
      31. custom_op_desc
      32. custom_op_name
      33. deb
      34. debstack
      35. debstackptrs
      36. gv_fetchmeth_autoload
      37. ibcmp_utf8
      38. my_fork
      39. my_socketpair
      40. pack_cat
      41. perl_destruct
      42. pv_uni_display
      43. save_shared_pvref
      44. savesharedpv
      45. sortsv
      46. sv_copypv
      47. sv_magicext
      48. sv_nolocking
      49. sv_nosharing
      50. sv_recode_to_utf8
      51. sv_uni_display
      52. to_uni_fold
      53. to_uni_lower
      54. to_uni_title
      55. to_uni_upper
      56. to_utf8_case
      57. to_utf8_fold
      58. to_utf8_lower
      59. to_utf8_title
      60. to_utf8_upper
      61. unpack_str
      62. uvchr_to_utf8_flags
      63. uvuni_to_utf8_flags
      64. vdeb
    • perl 5.7.2
      1. calloc
      2. getcwd_sv
      3. init_tm
      4. malloc
      5. mfree
      6. mini_mktime
      7. my_atof2
      8. my_strftime
      9. op_null
      10. realloc
      11. sv_catpvn_flags
      12. sv_catsv_flags
      13. sv_setsv_flags
      14. sv_utf8_upgrade_flags
      15. sv_utf8_upgrade_nomg
      16. swash_fetch
    • perl 5.7.1
      1. POPpbytex
      2. bytes_from_utf8
      3. despatch_signals
      4. do_openn
      5. gv_handler
      6. is_lvalue_sub
      7. my_popen_list
      8. save_mortalizesv
      9. scan_num
      10. sv_force_normal_flags
      11. sv_setref_uv
      12. sv_unref_flags
      13. sv_utf8_upgrade
      14. utf8_length
      15. utf8_to_uvchr
      16. utf8_to_uvuni
      17. utf8n_to_uvuni
      18. uvuni_to_utf8
    • perl 5.6.1
      1. SvGAMAGIC
      2. apply_attrs_string
      3. bytes_to_utf8
      4. gv_efullname4
      5. gv_fullname4
      6. is_utf8_string
      7. save_generic_pvref
      8. utf16_to_utf8
      9. utf16_to_utf8_reversed
      10. utf8_to_bytes
    • perl 5.6.0
      1. PERL_SYS_INIT3
      2. SvIOK_UV
      3. SvIOK_notUV
      4. SvIOK_only_UV
      5. SvPOK_only_UTF8
      6. SvPVbyte_nolen
      7. SvPVbytex
      8. SvPVbytex_force
      9. SvPVutf8
      10. SvPVutf8_force
      11. SvPVutf8_nolen
      12. SvPVutf8x
      13. SvPVutf8x_force
      14. SvUOK
      15. SvUTF8
      16. SvUTF8_off
      17. SvUTF8_on
      18. av_delete
      19. av_exists
      20. call_atexit
      21. cast_i32
      22. cast_iv
      23. cast_ulong
      24. cast_uv
      25. do_gv_dump
      26. do_gvgv_dump
      27. do_hv_dump
      28. do_magic_dump
      29. do_op_dump
      30. do_open9
      31. do_pmop_dump
      32. do_sv_dump
      33. dump_all
      34. dump_eval
      35. dump_form
      36. dump_indent
      37. dump_packsubs
      38. dump_sub
      39. dump_vindent
      40. get_context
      41. get_ppaddr
      42. gv_dump
      43. init_i18nl10n
      44. init_i18nl14n
      45. is_uni_alnum
      46. is_uni_alnum_lc
      47. is_uni_alpha
      48. is_uni_alpha_lc
      49. is_uni_ascii
      50. is_uni_ascii_lc
      51. is_uni_cntrl
      52. is_uni_cntrl_lc
      53. is_uni_digit
      54. is_uni_digit_lc
      55. is_uni_graph
      56. is_uni_graph_lc
      57. is_uni_idfirst
      58. is_uni_idfirst_lc
      59. is_uni_lower
      60. is_uni_lower_lc
      61. is_uni_print
      62. is_uni_print_lc
      63. is_uni_punct
      64. is_uni_punct_lc
      65. is_uni_space
      66. is_uni_space_lc
      67. is_uni_upper
      68. is_uni_upper_lc
      69. is_uni_xdigit
      70. is_uni_xdigit_lc
      71. is_utf8_alnum
      72. is_utf8_alpha
      73. is_utf8_ascii
      74. is_utf8_char
      75. is_utf8_cntrl
      76. is_utf8_digit
      77. is_utf8_graph
      78. is_utf8_idfirst
      79. is_utf8_lower
      80. is_utf8_mark
      81. is_utf8_print
      82. is_utf8_punct
      83. is_utf8_space
      84. is_utf8_upper
      85. is_utf8_xdigit
      86. magic_dump
      87. mess
      88. my_atof
      89. my_fflush_all
      90. newANONATTRSUB
      91. newATTRSUB
      92. newXS
      93. newXSproto
      94. new_collate
      95. new_ctype
      96. new_numeric
      97. op_dump
      98. perl_parse
      99. pmop_dump
      100. re_intuit_string
      101. reginitcolors
      102. require_pv
      103. safesyscalloc
      104. safesysfree
      105. safesysmalloc
      106. safesysrealloc
      107. save_I8
      108. save_alloc
      109. save_destructor
      110. save_destructor_x
      111. save_re_context
      112. save_vptr
      113. scan_bin
      114. set_context
      115. set_numeric_local
      116. set_numeric_radix
      117. set_numeric_standard
      118. str_to_version
      119. sv_2pvutf8
      120. sv_2pvutf8_nolen
      121. sv_force_normal
      122. sv_len_utf8
      123. sv_pos_b2u
      124. sv_pos_u2b
      125. sv_pv
      126. sv_pvbyte
      127. sv_pvbyten
      128. sv_pvbyten_force
      129. sv_pvutf8
      130. sv_pvutf8n
      131. sv_pvutf8n_force
      132. sv_rvweaken
      133. sv_utf8_decode
      134. sv_utf8_downgrade
      135. sv_utf8_encode
      136. swash_init
      137. tmps_grow
      138. to_uni_lower_lc
      139. to_uni_title_lc
      140. to_uni_upper_lc
      141. utf8_distance
      142. utf8_hop
      143. vcroak
      144. vform
      145. vmess
      146. vwarn
      147. vwarner
    • perl 5.005_03
      1. POPpx
      2. get_vtbl
      3. save_generic_svref
    • perl 5.005
      1. PL_modglobal
      2. cx_dump
      3. debop
      4. debprofdump
      5. fbm_compile
      6. fbm_instr
      7. get_op_descs
      8. get_op_names
      9. init_stacks
      10. mg_length
      11. mg_size
      12. newHVhv
      13. new_stackinfo
      14. regdump
      15. regexec_flags
      16. regnext
      17. runops_debug
      18. runops_standard
      19. save_iv
      20. save_op
      21. screaminstr
      22. sv_iv
      23. sv_nv
      24. sv_peek
      25. sv_pvn
      26. sv_pvn_nomg
      27. sv_true
    • perl 5.004_05
      1. do_binmode
      2. save_aelem
      3. save_helem
    • perl 5.004
      1. GIMME_V
      2. G_VOID
      3. HEf_SVKEY
      4. HeHASH
      5. HeKEY
      6. HeKLEN
      7. HePV
      8. HeSVKEY
      9. HeSVKEY_force
      10. HeSVKEY_set
      11. HeVAL
      12. SvSetMagicSV
      13. SvSetMagicSV_nosteal
      14. SvSetSV_nosteal
      15. SvTAINTED
      16. SvTAINTED_off
      17. SvTAINTED_on
      18. block_gimme
      19. call_list
      20. cv_const_sv
      21. delimcpy
      22. do_open
      23. form
      24. gv_autoload4
      25. gv_efullname3
      26. gv_fetchmethod_autoload
      27. gv_fullname3
      28. hv_delayfree_ent
      29. hv_delete_ent
      30. hv_exists_ent
      31. hv_fetch_ent
      32. hv_free_ent
      33. hv_iterkeysv
      34. hv_ksplit
      35. hv_store_ent
      36. ibcmp_locale
      37. my_failure_exit
      38. my_memcmp
      39. my_pclose
      40. my_popen
      41. newSVpvf
      42. rsignal
      43. rsignal_state
      44. save_I16
      45. save_gp
      46. share_hek
      47. start_subparse
      48. sv_catpvf
      49. sv_catpvf_mg
      50. sv_cmp_locale
      51. sv_derived_from
      52. sv_gets
      53. sv_magic_portable
      54. sv_setpvf
      55. sv_setpvf_mg
      56. sv_taint
      57. sv_tainted
      58. sv_untaint
      59. sv_vcatpvf
      60. sv_vcatpvf_mg
      61. sv_vcatpvfn
      62. sv_vsetpvf
      63. sv_vsetpvf_mg
      64. sv_vsetpvfn
      65. unsharepvn
      66. vnewSVpvf
      67. warner

    BUGS

    If you find any bugs, if Devel::PPPort doesn't seem to build on your system, or if any of its tests fail, please use the CPAN Request Tracker at http://rt.cpan.org/ to create a ticket for the module.

    AUTHORS

    • Version 1.x of Devel::PPPort was written by Kenneth Albanowski.

    • Version 2.x was ported to the Perl core by Paul Marquess.

    • Version 3.x was ported back to CPAN by Marcus Holland-Moritz.

    COPYRIGHT

    Version 3.x, Copyright (C) 2004-2010, Marcus Holland-Moritz.

    Version 2.x, Copyright (C) 2001, Paul Marquess.

    Version 1.x, Copyright (C) 1999, Kenneth Albanowski.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    See h2xs, ppport.h.

     
    Perl 5 version 18.2 documentation

    Devel::Peek

    NAME

    Devel::Peek - A data debugging tool for the XS programmer

    SYNOPSIS

        use Devel::Peek;
        Dump( $a );
        Dump( $a, 5 );
        DumpArray( 5, $a, $b, ... );
        mstat "Point 5";

        use Devel::Peek ':opd=st';

    DESCRIPTION

    Devel::Peek contains functions which allow raw Perl datatypes to be manipulated from a Perl script. This is used by those who do XS programming to check that the data they are sending from C to Perl looks as they think it should look. The trick, then, is to know what the raw datatype is supposed to look like when it gets to Perl. This document offers some tips and hints to describe good and bad raw data.

    It is very possible that this document will fall far short of being useful to the casual reader. The reader is expected to understand the material in the first few sections of perlguts.

    Devel::Peek supplies a Dump() function which can dump a raw Perl datatype, and an mstat("marker") function to report on memory usage (if perl is compiled with the corresponding option). The function DeadCode() provides statistics on the data "frozen" into inactive CV. Devel::Peek also supplies SvREFCNT(), SvREFCNT_inc(), and SvREFCNT_dec(), which can query, increment, and decrement reference counts on SVs. This document will take a passive, and safe, approach to data debugging and for that it will describe only the Dump() function.

    The function DumpArray() allows dumping of multiple values (useful when you need to analyze returns of functions).

    The global variable $Devel::Peek::pv_limit can be set to limit the number of characters printed in various string values. Setting it to 0 means no limit.

    If the use Devel::Peek directive has a :opd=FLAGS argument, this switches on debugging of opcode dispatch. FLAGS should be a combination of s, t, and P (see the -D flags in perlrun). :opd is a shortcut for :opd=st.

    Runtime debugging

    CvGV($cv) returns one of the globs associated with a subroutine reference $cv.

    debug_flags() returns a string representation of $^D (similar to what is allowed for the -D flag). When called with a numeric argument, it sets $^D to the corresponding value. When called with an argument of the form "flags-flags", it sets on/off the bits of $^D corresponding to the letters before/after the -. (The returned value is for $^D before the modification.)

    runops_debug() returns true if the current opcode dispatcher is the debugging one. When called with an argument, switches to debugging or non-debugging dispatcher depending on the argument (active for newly-entered subs/etc only). (The returned value is for the dispatcher before the modification.)

    Memory footprint debugging

    When perl is compiled with support for memory footprint debugging (the default with Perl's malloc()), Devel::Peek provides access to this API.

    Use the mstat() function to emit a memory state statistic to the terminal. For more information on the format of the output of mstat(), see "Using $ENV{PERL_DEBUG_MSTATS}" in perldebguts.

    Three additional functions allow access to this statistic from Perl. First, use mstats_fillhash(%hash) to get the information contained in the output of mstat() into %hash. The fields of this hash are

        minbucket nbuckets sbrk_good sbrk_slack sbrked_remains sbrks start_slack
        topbucket topbucket_ev topbucket_odd total total_chain total_sbrk totfree

    Two additional fields, free and used, contain array references which provide a per-bucket count of free and used chunks. Two other fields, mem_size and available_size, contain array references which provide the information about the allocated size and usable size of chunks in each bucket. Again, see "Using $ENV{PERL_DEBUG_MSTATS}" in perldebguts for details.

    Keep in mind that only the first several "odd-numbered" buckets are used, so the information on size of the "odd-numbered" buckets which are not used is probably meaningless.

    The information in

        mem_size available_size minbucket nbuckets

    is the property of a particular build of perl, and does not depend on the current process. If you do not provide the optional argument to the functions mstats_fillhash(), fill_mstats(), mstats2hash(), then the information in the fields mem_size and available_size is not updated.

    fill_mstats($buf) is a much cheaper call (both speedwise and memory-wise) which collects the statistic into $buf in machine-readable form. At a later moment you may need to call mstats2hash($buf, %hash) to use this information to fill %hash.

    All three APIs, fill_mstats($buf), mstats_fillhash(%hash), and mstats2hash($buf, %hash), are designed to allocate no memory if used the second time on the same $buf and/or %hash.

    So, if you want to collect memory info in a loop, you may call

        $#buf = 999;
        fill_mstats($_) for @buf;
        mstats_fillhash(%report, 1);        # Static info too

        foreach (@buf) {
          # Do something...
          fill_mstats $_;                   # Collect statistic
        }
        foreach (@buf) {
          mstats2hash($_, %report);         # Preserve static info
          # Do something with %report
        }

    EXAMPLES

    The following examples don't attempt to show everything as that would be a monumental task, and, frankly, we don't want this manpage to be an internals document for Perl. The examples do demonstrate some basics of the raw Perl datatypes, and should suffice to get most determined people on their way. There are no guidewires or safety nets, nor blazed trails, so be prepared to travel alone from this point on and, if at all possible, don't fall into the quicksand (it's bad for business).

    Oh, one final bit of advice: take perlguts with you. When you return we expect to see it well-thumbed.

    A simple scalar string

    Let's begin by looking at a simple scalar which is holding a string.

        use Devel::Peek;
        $a = 42; $a = "hello";
        Dump $a;

    The output:

        SV = PVIV(0xbc288) at 0xbe9a8
          REFCNT = 1
          FLAGS = (POK,pPOK)
          IV = 42
          PV = 0xb2048 "hello"\0
          CUR = 5
          LEN = 8

    This says $a is an SV, a scalar. The scalar type is a PVIV, which is capable of holding an integer (IV) and/or a string (PV) value. The scalar's head is allocated at address 0xbe9a8, while the body is at 0xbc288. Its reference count is 1. It has the POK flag set, meaning its current PV field is valid. Because POK is set we look at the PV item to see what is in the scalar. The \0 at the end indicates that this PV is properly NUL-terminated. Note that the IV field still contains its old numeric value, but because FLAGS doesn't have IOK set, we must ignore the IV item. CUR indicates the number of characters in the PV. LEN indicates the number of bytes allocated for the PV (at least one more than CUR, because LEN includes an extra byte for the end-of-string marker, then usually rounded up to some efficient allocation unit).
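
    The reading above can also be checked from a script. What follows is a minimal sketch, not part of Devel::Peek: dump_to_string() is a hypothetical helper, and it assumes that Dump() writes its report to the STDERR file descriptor, so it points STDERR at a temporary file for the duration of the call and reads the report back.

    ```perl
    use strict;
    use warnings;
    use Devel::Peek ();
    use File::Temp ();

    # Hypothetical helper (not provided by Devel::Peek): return Dump()'s
    # report for one value as a string. Assumes the report goes to STDERR.
    sub dump_to_string {
        my $tmp = File::Temp->new;
        open(my $saved, '>&', \*STDERR)   or die "cannot save STDERR: $!";
        open(STDERR, '>', $tmp->filename) or die "cannot redirect STDERR: $!";
        Devel::Peek::Dump($_[0]);         # $_[0] aliases the caller's scalar
        open(STDERR, '>&', $saved)        or die "cannot restore STDERR: $!";
        open(my $in, '<', $tmp->filename) or die "cannot read report: $!";
        local $/;                         # slurp the whole report
        my $report = <$in>;
        return $report;
    }

    my $value = 42;
    $value = "hello";
    my $report = dump_to_string($value);

    print "PV slot is valid\n"            if $report =~ /\bPOK\b/;
    print "IV slot must be ignored\n" unless $report =~ /\bIOK\b/;
    ```

    Matching on whole flag names (rather than on addresses) keeps the check independent of where the SV happens to be allocated.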

    A simple scalar number

    If the scalar contains a number the raw SV will be leaner.

        use Devel::Peek;
        $a = 42;
        Dump $a;

    The output:

        SV = IV(0xbc818) at 0xbe9a8
          REFCNT = 1
          FLAGS = (IOK,pIOK)
          IV = 42

    This says $a is an SV, a scalar. The scalar is an IV, a number. Its reference count is 1. It has the IOK flag set, meaning it is currently being evaluated as a number. Because IOK is set we look at the IV item to see what is in the scalar.

    A simple scalar with an extra reference

    If the scalar from the previous example had an extra reference:

        use Devel::Peek;
        $a = 42;
        $b = \$a;
        Dump $a;

    The output:

        SV = IV(0xbe860) at 0xbe9a8
          REFCNT = 2
          FLAGS = (IOK,pIOK)
          IV = 42

    Notice that this example differs from the previous example only in its reference count. Compare this to the next example, where we dump $b instead of $a.
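
    The change in reference count can be observed without reading the dump by eye. The sketch below is illustrative only: refcnt_from_dump() is a hypothetical helper that assumes Dump() reports on STDERR and that the first REFCNT line in the report belongs to the dumped scalar itself. It compares the count before and after taking a reference rather than relying on an absolute value.

    ```perl
    use strict;
    use warnings;
    use Devel::Peek ();
    use File::Temp ();

    # Hypothetical helper: extract the top-level REFCNT from Dump()'s report.
    sub refcnt_from_dump {
        my $tmp = File::Temp->new;
        open(my $saved, '>&', \*STDERR)   or die "cannot save STDERR: $!";
        open(STDERR, '>', $tmp->filename) or die "cannot redirect STDERR: $!";
        Devel::Peek::Dump($_[0]);         # dump the caller's scalar, not a copy
        open(STDERR, '>&', $saved)        or die "cannot restore STDERR: $!";
        open(my $in, '<', $tmp->filename) or die "cannot read report: $!";
        local $/;
        my $report = <$in>;
        my ($count) = $report =~ /REFCNT = (\d+)/;   # first match: the SV itself
        return $count;
    }

    my $value  = 42;
    my $before = refcnt_from_dump($value);
    my $ref    = \$value;                 # one extra reference
    my $after  = refcnt_from_dump($value);

    printf "REFCNT went from %d to %d\n", $before, $after;
    ```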

    A reference to a simple scalar

    This shows what a reference looks like when it references a simple scalar.

        use Devel::Peek;
        $a = 42;
        $b = \$a;
        Dump $b;

    The output:

        SV = IV(0xf041c) at 0xbe9a0
          REFCNT = 1
          FLAGS = (ROK)
          RV = 0xbab08
          SV = IV(0xbe860) at 0xbe9a8
            REFCNT = 2
            FLAGS = (IOK,pIOK)
            IV = 42

    Starting from the top, this says $b is an SV. The scalar is an IV, which is capable of holding an integer or reference value. It has the ROK flag set, meaning it is a reference (rather than an integer or string). Notice that Dump follows the reference and shows us what $b was referencing. We see the same $a that we found in the previous example.

    Note that the value of RV coincides with the numbers we see when we stringify $b. The addresses inside IV() are addresses of X*** structures which hold the current state of an SV. This address may change during the lifetime of an SV.

    A reference to an array

    This shows what a reference to an array looks like.

        use Devel::Peek;
        $a = [42];
        Dump $a;

    The output:

        SV = IV(0xc85998) at 0xc859a8
          REFCNT = 1
          FLAGS = (ROK)
          RV = 0xc70de8
          SV = PVAV(0xc71e10) at 0xc70de8
            REFCNT = 1
            FLAGS = ()
            ARRAY = 0xc7e820
            FILL = 0
            MAX = 0
            ARYLEN = 0x0
            FLAGS = (REAL)
            Elt No. 0
            SV = IV(0xc70f88) at 0xc70f98
              REFCNT = 1
              FLAGS = (IOK,pIOK)
              IV = 42

    This says $a is a reference (ROK), which points to another SV which is a PVAV, an array. The array has one element, element zero, which is another SV. The field FILL above indicates the last element in the array, similar to $#$a.

    If $a pointed to an array of two elements then we would see the following.

        use Devel::Peek 'Dump';
        $a = [42,24];
        Dump $a;

    The output:

        SV = IV(0x158c998) at 0x158c9a8
          REFCNT = 1
          FLAGS = (ROK)
          RV = 0x1577de8
          SV = PVAV(0x1578e10) at 0x1577de8
            REFCNT = 1
            FLAGS = ()
            ARRAY = 0x1585820
            FILL = 1
            MAX = 1
            ARYLEN = 0x0
            FLAGS = (REAL)
            Elt No. 0
            SV = IV(0x1577f88) at 0x1577f98
              REFCNT = 1
              FLAGS = (IOK,pIOK)
              IV = 42
            Elt No. 1
            SV = IV(0x158be88) at 0x158be98
              REFCNT = 1
              FLAGS = (IOK,pIOK)
              IV = 24

    Note that Dump will not report all the elements in the array, only the first several (depending on how deep it already went into the report tree).

    A reference to a hash

    The following shows the raw form of a reference to a hash.

        use Devel::Peek;
        $a = {hello=>42};
        Dump $a;

    The output:

        SV = IV(0x8177858) at 0x816a618
          REFCNT = 1
          FLAGS = (ROK)
          RV = 0x814fc10
          SV = PVHV(0x8167768) at 0x814fc10
            REFCNT = 1
            FLAGS = (SHAREKEYS)
            ARRAY = 0x816c5b8 (0:7, 1:1)
            hash quality = 100.0%
            KEYS = 1
            FILL = 1
            MAX = 7
            RITER = -1
            EITER = 0x0
            Elt "hello" HASH = 0xc8fd181b
            SV = IV(0x816c030) at 0x814fcf4
              REFCNT = 1
              FLAGS = (IOK,pIOK)
              IV = 42

    This shows $a is a reference pointing to an SV. That SV is a PVHV, a hash. Fields RITER and EITER are used by each.

    The "quality" of a hash is defined as the total number of comparisons needed to access every element once, relative to the expected number needed for a random hash. The value can go over 100%.

    The total number of comparisons is equal to the sum of the squares of the number of entries in each bucket. For a random hash of n keys into k buckets, the expected value is:

        n + n(n-1)/2k
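
    The two quantities can be worked through by hand for the one-key hash dumped above. The following sketch only restates the arithmetic of this section; the function names are made up for illustration, and it does not claim to reproduce how perl itself arrives at the printed percentage. The bucket list is read off the ARRAY line, (0:7, 1:1), taken here to mean seven empty buckets and one bucket holding one entry.

    ```perl
    use strict;
    use warnings;

    # Expected comparisons for n keys in k buckets: n + n(n-1)/2k
    sub expected_comparisons {
        my ($n, $k) = @_;
        return $n + $n * ($n - 1) / (2 * $k);
    }

    # Actual comparisons: the sum of the squares of the per-bucket entry counts
    sub actual_comparisons {
        my $sum = 0;
        $sum += $_ ** 2 for @_;
        return $sum;
    }

    my @buckets  = (1, (0) x 7);          # one entry, eight buckets
    my $actual   = actual_comparisons(@buckets);
    my $expected = expected_comparisons(1, scalar @buckets);

    printf "actual = %d, expected = %.1f\n", $actual, $expected;
    ```

    For this hash both numbers come out equal, which is consistent with the 100.0% shown in the dump.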

    Dumping a large array or hash

    The Dump() function, by default, dumps up to 4 elements from a toplevel array or hash. This number can be increased by supplying a second argument to the function.

        use Devel::Peek;
        $a = [10,11,12,13,14];
        Dump $a;

    Notice that Dump() prints only elements 10 through 13 in the above code. The following code will print all of the elements.

        use Devel::Peek 'Dump';
        $a = [10,11,12,13,14];
        Dump $a, 5;
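
    The effect of the second argument can be measured by counting the "Elt No." lines in the report. This is a sketch under stated assumptions: elements_dumped() is a hypothetical helper, and it assumes that Dump() writes to STDERR and labels every array element it visits with an "Elt No." line, as in the outputs shown earlier.

    ```perl
    use strict;
    use warnings;
    use Devel::Peek ();
    use File::Temp ();

    # Hypothetical helper: how many array elements does Dump() report?
    sub elements_dumped {
        my ($ref, $limit) = @_;
        my $tmp = File::Temp->new;
        open(my $saved, '>&', \*STDERR)   or die "cannot save STDERR: $!";
        open(STDERR, '>', $tmp->filename) or die "cannot redirect STDERR: $!";
        if (defined $limit) { Devel::Peek::Dump($ref, $limit) }
        else                { Devel::Peek::Dump($ref) }
        open(STDERR, '>&', $saved)        or die "cannot restore STDERR: $!";
        open(my $in, '<', $tmp->filename) or die "cannot read report: $!";
        local $/;
        my $report = <$in>;
        my $count = () = $report =~ /Elt No\. \d+/g;   # count the matches
        return $count;
    }

    my $list = [10, 11, 12, 13, 14];
    my $default = elements_dumped($list);      # default limit
    my $all     = elements_dumped($list, 5);   # explicit limit of 5

    print "default limit: $default elements, limit 5: $all elements\n";
    ```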

    A reference to an SV which holds a C pointer

    This is what you really need to know as an XS programmer, of course. When an XSUB returns a pointer to a C structure that pointer is stored in an SV and a reference to that SV is placed on the XSUB stack. So the output from an XSUB which uses something like the T_PTROBJ map might look something like this:

        SV = IV(0xf381c) at 0xc859a8
          REFCNT = 1
          FLAGS = (ROK)
          RV = 0xb8ad8
          SV = PVMG(0xbb3c8) at 0xc859a0
            REFCNT = 1
            FLAGS = (OBJECT,IOK,pIOK)
            IV = 729160
            NV = 0
            PV = 0
            STASH = 0xc1d10 "CookBookB::Opaque"

    This shows that we have an SV which is a reference, which points at another SV. In this case that second SV is a PVMG, a blessed scalar. Because it is blessed it has the OBJECT flag set. Note that an SV which holds a C pointer also has the IOK flag set. The STASH is set to the package name which this SV was blessed into.

    The output from an XSUB which uses something like the T_PTRREF map, which doesn't bless the object, might look something like this:

        SV = IV(0xf381c) at 0xc859a8
          REFCNT = 1
          FLAGS = (ROK)
          RV = 0xb8ad8
          SV = PVMG(0xbb3c8) at 0xc859a0
            REFCNT = 1
            FLAGS = (IOK,pIOK)
            IV = 729160
            NV = 0
            PV = 0

    A reference to a subroutine

    Looks like this:

        SV = IV(0x24d2dd8) at 0x24d2de8
          REFCNT = 1
          FLAGS = (TEMP,ROK)
          RV = 0x24e79d8
          SV = PVCV(0x24e5798) at 0x24e79d8
            REFCNT = 2
            FLAGS = ()
            COMP_STASH = 0x22c9c50 "main"
            START = 0x22eed60 ===> 0
            ROOT = 0x22ee490
            GVGV::GV = 0x22de9d8 "MY" :: "top_targets"
            FILE = "(eval 5)"
            DEPTH = 0
            FLAGS = 0x0
            OUTSIDE_SEQ = 93
            PADLIST = 0x22e9ed8
            PADNAME = 0x22e9ec0(0x22eed00) PAD = 0x22e9ea8(0x22eecd0)
            OUTSIDE = 0x22c9fb0 (MAIN)

    This shows that

    • the subroutine is not an XSUB (since START and ROOT are non-zero, and XSUB is not listed, and is thus null);

    • it was compiled in the package main;

    • under the name MY::top_targets;

    • inside a 5th eval in the program;

    • it is not currently executed (see DEPTH);

    • it has no prototype (PROTOTYPE field is missing).

    EXPORTS

    Dump, mstat, DeadCode, DumpArray, DumpWithOP and DumpProg, fill_mstats, mstats_fillhash, mstats2hash by default. Additionally available: SvREFCNT, SvREFCNT_inc and SvREFCNT_dec.

    BUGS

    Readers have been known to skip important parts of perlguts, causing much frustration for all.

    AUTHOR

    Ilya Zakharevich ilya@math.ohio-state.edu

    Copyright (c) 1995-98 Ilya Zakharevich. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Author of this software makes no claim whatsoever about suitability, reliability, edability, editability or usability of this product, and should not be kept liable for any damage resulting from the use of it. If you can use it, you are in luck, if not, I should not be kept responsible. Keep a handy copy of your backup tape at hand.

    SEE ALSO

    perlguts, and perlguts, again.

     
    Perl 5 version 18.2 documentation

    Devel::SelfStubber

    NAME

    Devel::SelfStubber - generate stubs for a SelfLoading module

    SYNOPSIS

    To generate just the stubs:

        use Devel::SelfStubber;
        Devel::SelfStubber->stub('MODULENAME','MY_LIB_DIR');

    or to generate the whole module with stubs inserted correctly

        use Devel::SelfStubber;
        $Devel::SelfStubber::JUST_STUBS=0;
        Devel::SelfStubber->stub('MODULENAME','MY_LIB_DIR');

    MODULENAME is the Perl module name, e.g. Devel::SelfStubber, NOT 'Devel/SelfStubber' or 'Devel/SelfStubber.pm'.

    MY_LIB_DIR defaults to '.' if not present.

    DESCRIPTION

    Devel::SelfStubber prints the stubs you need to put in the module before the __DATA__ token (or you can get it to print the entire module with stubs correctly placed). The stubs ensure that if a method is called, it will get loaded. They are needed specifically for inherited autoloaded methods.

    This is best explained using the following example:

    Assume four classes, A,B,C & D.

    A is the root class, B is a subclass of A, C is a subclass of B, and D is another subclass of A.

                A
               / \
              B   D
             /
            C

    If D calls an autoloaded method 'foo' which is defined in class A, then the method is loaded into class A, then executed. If C then calls method 'foo', and that method was reimplemented in class B, but set to be autoloaded, then the lookup mechanism never gets to the AUTOLOAD mechanism in B because it first finds the method already loaded in A, and so erroneously uses that. If the method foo had been stubbed in B, then the lookup mechanism would have found the stub, and correctly loaded and used the sub from B.

    So, for classes and subclasses to have inheritance correctly work with autoloading, you need to ensure stubs are loaded.
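
    The effect of a stub on method lookup can be reproduced without SelfLoader. The sketch below is a simplified stand-in, not what Devel::SelfStubber generates: the class names Root, Middle and Leaf are hypothetical (they play the roles of A, B and C above), and a hand-written AUTOLOAD installs the method instead of loading it from after a __DATA__ token. The forward declaration "sub foo;" in Middle is the stub.

    ```perl
    use strict;
    use warnings;

    package Root {
        sub foo { return 'Root::foo' }      # already loaded in the root class
    }

    package Middle {
        our @ISA = ('Root');
        our $AUTOLOAD;

        sub foo;                            # the stub: declared, not yet defined

        sub AUTOLOAD {
            (my $method = $AUTOLOAD) =~ s/.*:://;
            return if $method eq 'DESTROY';
            die "Middle cannot autoload $method" unless $method eq 'foo';
            no strict 'refs';
            # Stand-in for loading the real body on demand
            *{"Middle::$method"} = sub { return 'Middle::foo' };
            goto &{"Middle::$method"};
        }
    }

    package Leaf {
        our @ISA = ('Middle');
    }

    package main;

    my $result = Leaf->foo;                 # lookup stops at the stub in Middle
    print "$result\n";
    ```

    If the "sub foo;" line is removed, the lookup no longer finds anything in Middle and falls through to the version already loaded in Root, which is the erroneous behaviour described above.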

    The SelfLoader can load stubs automatically at module initialization with the statement 'SelfLoader->load_stubs()';, but you may wish to avoid having the stub loading overhead associated with your initialization (though note that the SelfLoader::load_stubs method will be called sooner or later - at latest when the first sub is being autoloaded). In this case, you can put the sub stubs before the __DATA__ token. This can be done manually, but this module allows automatic generation of the stubs.

    By default it just prints the stubs, but you can set the global $Devel::SelfStubber::JUST_STUBS to 0 and it will print out the entire module with the stubs positioned correctly.

    At the very least, this is useful to see what the SelfLoader thinks are stubs - in order to ensure future versions of the SelfStubber remain in step with the SelfLoader, the SelfStubber actually uses the SelfLoader to determine which stubs are needed.

     
    Perl 5 version 18.2 documentation

    Data::Dumper

    NAME

    Data::Dumper - stringified perl data structures, suitable for both printing and eval

    SYNOPSIS

        use Data::Dumper;

        # simple procedural interface
        print Dumper($foo, $bar);

        # extended usage with names
        print Data::Dumper->Dump([$foo, $bar], [qw(foo *ary)]);

        # configuration variables
        {
          local $Data::Dumper::Purity = 1;
          eval Data::Dumper->Dump([$foo, $bar], [qw(foo *ary)]);
        }

        # OO usage
        $d = Data::Dumper->new([$foo, $bar], [qw(foo *ary)]);
           ...
        print $d->Dump;
           ...
        $d->Purity(1)->Terse(1)->Deepcopy(1);
        eval $d->Dump;

    DESCRIPTION

    Given a list of scalars or reference variables, writes out their contents in perl syntax. The references can also be objects. The content of each variable is output in a single Perl statement. Handles self-referential structures correctly.

    The return value can be evaled to get back an identical copy of the original reference structure. (Please do consider the security implications of eval'ing code from untrusted sources!)

    Any references that are the same as one of those passed in will be named $VARn (where n is a numeric suffix), and other duplicate references to substructures within $VARn will be appropriately labeled using arrow notation. You can specify names for individual values to be dumped if you use the Dump() method, or you can change the default $VAR prefix to something else. See $Data::Dumper::Varname and $Data::Dumper::Terse below.

    The default output of self-referential structures can be evaled, but the nested references to $VARn will be undefined, since a recursive structure cannot be constructed using one Perl statement. You should set the Purity flag to 1 to get additional statements that will correctly fill in these references. Moreover, if the output is evaled when strictures are in effect, you need to ensure that any variables it accesses are previously declared.
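
    The effect of the Purity flag on a self-referential structure can be sketched as follows. The variable names ($node, $copy) are illustrative only, and the example evals text it has just produced itself, which sidesteps the security concern mentioned above.

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    # A structure that refers back to itself.
    my $node = { name => 'root' };
    $node->{self} = $node;

    my $text;
    {
        # Purity adds the extra assignment statements that rebuild
        # the nested reference after the main statement.
        local $Data::Dumper::Purity = 1;
        $text = Data::Dumper->Dump([$node], ['copy']);
    }

    # The dump assigns to $copy, so declare it before evaling under strict.
    our $copy;
    eval $text;
    die $@ if $@;

    print "self-reference restored\n" if $copy->{self} == $copy;
    ```

    Without Purity the eval would still succeed, but $copy->{self} would come back as an empty hash rather than a reference to $copy.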

    In the extended usage form, the references to be dumped can be given user-specified names. If a name begins with a *, the output will describe the dereferenced type of the supplied reference for hashes, arrays, and coderefs. Output of names will be avoided where possible if the Terse flag is set.

    In many cases, methods that are used to set the internal state of the object will return the object itself, so method calls can be conveniently chained together.

    Several styles of output are possible, all controlled by setting the Indent flag. See Configuration Variables or Methods below for details.

    Methods

    • PACKAGE->new(ARRAYREF [, ARRAYREF])

      Returns a newly created Data::Dumper object. The first argument is an anonymous array of values to be dumped. The optional second argument is an anonymous array of names for the values. The names need not have a leading $ sign, and must consist of alphanumeric characters. You can begin a name with a * to specify that the dereferenced type must be dumped instead of the reference itself, for ARRAY and HASH references.

      The prefix specified by $Data::Dumper::Varname will be used with a numeric suffix if the name for a value is undefined.

      Data::Dumper will catalog all references encountered while dumping the values. Cross-references (in the form of names of substructures in perl syntax) will be inserted at all possible points, preserving any structural interdependencies in the original set of values. Structure traversal is depth-first, and proceeds in order from the first supplied value to the last.

    • $OBJ->Dump or PACKAGE->Dump(ARRAYREF [, ARRAYREF])

      Returns the stringified form of the values stored in the object (preserving the order in which they were supplied to new), subject to the configuration options below. In a list context, it returns a list of strings corresponding to the supplied values.

      The second form, for convenience, simply calls the new method on its arguments before dumping the object immediately.

    • $OBJ->Seen([HASHREF])

      Queries or adds to the internal table of already encountered references. You must use Reset to explicitly clear the table if needed. Such references are not dumped; instead, their names are inserted wherever they are encountered subsequently. This is useful especially for properly dumping subroutine references.

      Expects an anonymous hash of name => value pairs. Same rules apply for names as in new. If no argument is supplied, will return the "seen" list of name => value pairs, in a list context. Otherwise, returns the object itself.

    • $OBJ->Values([ARRAYREF])

      Queries or replaces the internal array of values that will be dumped. When called without arguments, returns the values as a list. When called with a reference to an array of replacement values, returns the object itself. When called with any other type of argument, dies.

    • $OBJ->Names([ARRAYREF])

      Queries or replaces the internal array of user supplied names for the values that will be dumped. When called without arguments, returns the names. When called with an array of replacement names, returns the object itself. If the number of replacement names exceeds the number of values to be named, the excess names will not be used. If the number of replacement names falls short of the number of values to be named, the list of replacement names will be exhausted and remaining values will not be renamed. When called with any other type of argument, dies.

    • $OBJ->Reset

      Clears the internal table of "seen" references and returns the object itself.
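
    The methods above can be combined as in the following sketch. The data and the names (*list, *map, ref) are made up for illustration; only the methods described in this section are used.

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    my @list = (1, 2, 3);
    my %map  = (one => 1);

    # Names with a leading * dump the dereferenced array or hash.
    my $dumper = Data::Dumper->new([\@list, \%map], ['*list', '*map']);
    $dumper->Indent(0);    # setters return the object, so they can be chained

    # In list context Dump returns one string per supplied value.
    my @parts = $dumper->Dump;
    print "$_\n" for @parts;

    # Clear the "seen" table, replace the values and names, and dump again.
    my $second = $dumper->Reset->Values([\'text'])->Names(['ref'])->Dump;
    print "$second\n";
    ```

    Reusing one object this way keeps its configuration (here Indent) across dumps, which is the main reason to prefer it over repeated calls to the class-method form of Dump.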

    Functions

    • Dumper(LIST)

      Returns the stringified form of the values in the list, subject to the configuration options below. The values will be named $VARn in the output, where n is a numeric suffix. Will return a list of strings in a list context.

    Configuration Variables or Methods

    Several configuration variables can be used to control the kind of output generated when using the procedural interface. These variables are usually localized in a block so that other parts of the code are not affected by the change.

    These variables determine the default state of the object created by calling the new method, but cannot be used to alter the state of the object thereafter. The equivalent method names should be used instead to query or set the internal state of the object.

    The method forms return the object itself when called with arguments, so that they can be chained together nicely.

    • $Data::Dumper::Indent or $OBJ->Indent([NEWVAL])

      Controls the style of indentation. It can be set to 0, 1, 2 or 3. Style 0 spews output without any newlines, indentation, or spaces between list items. It is the most compact format possible that can still be called valid perl. Style 1 outputs a readable form with newlines but no fancy indentation (each level in the structure is simply indented by a fixed amount of whitespace). Style 2 (the default) outputs a very readable form which takes into account the length of hash keys (so the hash value lines up). Style 3 is like style 2, but also annotates the elements of arrays with their index (but the comment is on its own line, so array output consumes twice the number of lines).

    • $Data::Dumper::Purity or $OBJ->Purity([NEWVAL])

      Controls the degree to which the output can be evaled to recreate the supplied reference structures. Setting it to 1 will output additional perl statements that will correctly recreate nested references. The default is 0.

    • $Data::Dumper::Pad or $OBJ->Pad([NEWVAL])

      Specifies the string that will be prefixed to every line of the output. Empty string by default.

    • $Data::Dumper::Varname or $OBJ->Varname([NEWVAL])

      Contains the prefix to use for tagging variable names in the output. The default is "VAR".

    • $Data::Dumper::Useqq or $OBJ->Useqq([NEWVAL])

      When set, enables the use of double quotes for representing string values. Whitespace other than space will be represented as [\n\t\r], "unsafe" characters will be backslashed, and unprintable characters will be output as quoted octal integers. Since setting this variable imposes a performance penalty, the default is 0. Dump() will run slower if this flag is set, since the fast XSUB implementation doesn't support it yet.

    • $Data::Dumper::Terse or $OBJ->Terse([NEWVAL])

      When set, Data::Dumper will emit single, non-self-referential values as atoms/terms rather than statements. This means that the $VARn names will be avoided where possible, but be advised that such output may not always be parseable by eval.

    • $Data::Dumper::Freezer or $OBJ->Freezer([NEWVAL])

      Can be set to a method name, or to an empty string to disable the feature. Data::Dumper will invoke that method via the object before attempting to stringify it. This method can alter the contents of the object (if, for instance, it contains data allocated from C), and even rebless it in a different package. The client is responsible for making sure the specified method can be called via the object, and that the object ends up containing only perl data types after the method has been called. Defaults to an empty string.

      If an object does not support the method specified (determined using UNIVERSAL::can()) then the call will be skipped. If the method dies a warning will be generated.

    • $Data::Dumper::Toaster or $OBJ->Toaster([NEWVAL])

      Can be set to a method name, or to an empty string to disable the feature. Data::Dumper will emit a method call for any objects that are to be dumped using the syntax bless(DATA, CLASS)->METHOD(). Note that this means that the method specified will have to perform any modifications required on the object (like creating new state within it, and/or reblessing it in a different package) and then return it. The client is responsible for making sure the method can be called via the object, and that it returns a valid object. Defaults to an empty string.

    • $Data::Dumper::Deepcopy or $OBJ->Deepcopy([NEWVAL])

      Can be set to a boolean value to enable deep copies of structures. Cross-referencing will then only be done when absolutely essential (i.e., to break reference cycles). Default is 0.

    • $Data::Dumper::Quotekeys or $OBJ->Quotekeys([NEWVAL])

      Can be set to a boolean value to control whether hash keys are quoted. A defined false value will avoid quoting hash keys when it looks like a simple string. Default is 1, which will always enclose hash keys in quotes.

    • $Data::Dumper::Bless or $OBJ->Bless([NEWVAL])

      Can be set to a string that specifies an alternative to the bless builtin operator used to create objects. A function with the specified name should exist, and should accept the same arguments as the builtin. Default is bless.

    • $Data::Dumper::Pair or $OBJ->Pair([NEWVAL])

      Can be set to a string that specifies the separator between hash keys and values. To dump nested hash, array and scalar values to JavaScript, use: $Data::Dumper::Pair = ' : ';. Implementing bless in JavaScript is left as an exercise for the reader.

      Default is: ' => '.

    • $Data::Dumper::Maxdepth or $OBJ->Maxdepth([NEWVAL])

      Can be set to a positive integer that specifies the depth beyond which we don't venture into a structure. Has no effect when $Data::Dumper::Purity is set. (Useful in the debugger, when we often don't want to see more than enough.) Default is 0, which means there is no maximum depth.

    • $Data::Dumper::Useperl or $OBJ->Useperl([NEWVAL])

      Can be set to a boolean value which controls whether the pure Perl implementation of Data::Dumper is used. The Data::Dumper module is a dual implementation, with almost all functionality written in both pure Perl and also in XS ('C'). Since the XS version is much faster, it will always be used if possible. This option lets you override the default behavior, usually for testing purposes only. Default is 0, which means the XS implementation will be used if possible.

    • $Data::Dumper::Sortkeys or $OBJ->Sortkeys([NEWVAL])

      Can be set to a boolean value to control whether hash keys are dumped in sorted order. A true value will cause the keys of all hashes to be dumped in Perl's default sort order. Can also be set to a subroutine reference which will be called for each hash that is dumped. In this case Data::Dumper will call the subroutine once for each hash, passing it the reference of the hash. The purpose of the subroutine is to return a reference to an array of the keys that will be dumped, in the order that they should be dumped. Using this feature, you can control both the order of the keys, and which keys are actually used. In other words, this subroutine acts as a filter by which you can exclude certain keys from being dumped. Default is 0, which means that hash keys are not sorted.

    • $Data::Dumper::Deparse or $OBJ->Deparse([NEWVAL])

      Can be set to a boolean value to control whether code references are turned into perl source code. If set to a true value, B::Deparse will be used to get the source of the code reference. Using this option will force using the Perl implementation of the dumper, since the fast XSUB implementation doesn't support it.

      Caution: use this option only if you know that your coderefs will be properly reconstructed by B::Deparse.

    • $Data::Dumper::Sparseseen or $OBJ->Sparseseen([NEWVAL])

      By default, Data::Dumper builds up the "seen" hash of scalars that it has encountered during serialization. This is very expensive. This seen hash is necessary to support and even just detect circular references. It is exposed to the user via the Seen() call both for writing and reading.

      If you, as a user, do not need explicit access to the "seen" hash, then you can set the Sparseseen option to allow Data::Dumper to eschew building the "seen" hash for scalars that are known not to possess more than one reference. This speeds up serialization considerably if you use the XS implementation.

      Note: If you turn on Sparseseen, then you must not rely on the content of the seen hash since its contents will be an implementation detail!
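
    As a rough illustration of the two ways to configure a dump, the sketch below applies the same three settings first through localized package variables and then through the equivalent methods. The $config data is invented for the example.

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $config = { port => 8080, host => 'localhost', tags => ['a', 'b'] };

    my $compact;
    {
        # Localize the package variables so other code keeps the defaults.
        local $Data::Dumper::Terse    = 1;   # no "$VAR1 = " prefix
        local $Data::Dumper::Indent   = 0;   # everything on one line
        local $Data::Dumper::Sortkeys = 1;   # keys in sorted order
        $compact = Dumper($config);
    }
    print "$compact\n";

    # The same settings applied to a single object through the method forms.
    my $same = Data::Dumper->new([$config])
                           ->Terse(1)->Indent(0)->Sortkeys(1)->Dump;
    print "method form gives the same text\n" if $same eq $compact;
    ```

    The variables only supply defaults at the moment an object is created, so the method forms are the ones to use when a long-lived Data::Dumper object needs different settings from the rest of the program.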

    Exports

    • Dumper

    EXAMPLES

    Run these code snippets to get a quick feel for the behavior of this module. When you are through with these examples, you may want to add or change the various configuration variables described above, to see their behavior. (See the testsuite in the Data::Dumper distribution for more examples.)

        use Data::Dumper;

        package Foo;
        sub new {bless {'a' => 1, 'b' => sub { return "foo" }}, $_[0]};

        package Fuz;                       # a weird REF-REF-SCALAR object
        sub new {bless \($_ = \ 'fu\'z'), $_[0]};

        package main;
        $foo = Foo->new;
        $fuz = Fuz->new;
        $boo = [ 1, [], "abcd", \*foo,
                 {1 => 'a', 023 => 'b', 0x45 => 'c'},
                 \\"p\q\'r", $foo, $fuz];

        ########
        # simple usage
        ########

        $bar = eval(Dumper($boo));
        print($@) if $@;
        print Dumper($boo), Dumper($bar);  # pretty print (no array indices)

        $Data::Dumper::Terse = 1;          # don't output names where feasible
        $Data::Dumper::Indent = 0;         # turn off all pretty print
        print Dumper($boo), "\n";

        $Data::Dumper::Indent = 1;         # mild pretty print
        print Dumper($boo);

        $Data::Dumper::Indent = 3;         # pretty print with array indices
        print Dumper($boo);

        $Data::Dumper::Useqq = 1;          # print strings in double quotes
        print Dumper($boo);

        $Data::Dumper::Pair = " : ";       # specify hash key/value separator
        print Dumper($boo);

        ########
        # recursive structures
        ########

        @c = ('c');
        $c = \@c;
        $b = {};
        $a = [1, $b, $c];
        $b->{a} = $a;
        $b->{b} = $a->[1];
        $b->{c} = $a->[2];
        print Data::Dumper->Dump([$a,$b,$c], [qw(a b c)]);

        $Data::Dumper::Purity = 1;         # fill in the holes for eval
        print Data::Dumper->Dump([$a, $b], [qw(*a b)]); # print as @a
        print Data::Dumper->Dump([$b, $a], [qw(*b a)]); # print as %b

        $Data::Dumper::Deepcopy = 1;       # avoid cross-refs
        print Data::Dumper->Dump([$b, $a], [qw(*b a)]);

        $Data::Dumper::Purity = 0;         # avoid cross-refs
        print Data::Dumper->Dump([$b, $a], [qw(*b a)]);

        ########
        # deep structures
        ########

        $a = "pearl";
        $b = [ $a ];
        $c = { 'b' => $b };
        $d = [ $c ];
        $e = { 'd' => $d };
        $f = { 'e' => $e };
        print Data::Dumper->Dump([$f], [qw(f)]);

        $Data::Dumper::Maxdepth = 3;       # no deeper than 3 refs down
        print Data::Dumper->Dump([$f], [qw(f)]);

        ########
        # object-oriented usage
        ########

        $d = Data::Dumper->new([$a,$b], [qw(a b)]);
        $d->Seen({'*c' => $c});            # stash a ref without printing it
        $d->Indent(3);
        print $d->Dump;
        $d->Reset->Purity(0);              # empty the seen cache
        print join "----\n", $d->Dump;

        ########
        # persistence
        ########

        package Foo;
        sub new { bless { state => 'awake' }, shift }
        sub Freeze {
            my $s = shift;
            print STDERR "preparing to sleep\n";
            $s->{state} = 'asleep';
            return bless $s, 'Foo::ZZZ';
        }

        package Foo::ZZZ;
        sub Thaw {
            my $s = shift;
            print STDERR "waking up\n";
            $s->{state} = 'awake';
            return bless $s, 'Foo';
        }

        package main;
        use Data::Dumper;
        $a = Foo->new;
        $b = Data::Dumper->new([$a], ['c']);
        $b->Freezer('Freeze');
        $b->Toaster('Thaw');
        $c = $b->Dump;
        print $c;
        $d = eval $c;
        print Data::Dumper->Dump([$d], ['d']);

        ########
        # symbol substitution (useful for recreating CODE refs)
        ########

        sub foo { print "foo speaking\n" }
        *other = \&foo;
        $bar = [ \&other ];
        $d = Data::Dumper->new([\&other,$bar],['*other','bar']);
        $d->Seen({ '*foo' => \&foo });
        print $d->Dump;

        ########
        # sorting and filtering hash keys
        ########

        $Data::Dumper::Sortkeys = \&my_filter;
        my $foo = { map { (ord, "$_$_$_") } 'I'..'Q' };
        my $bar = { %$foo };
        my $baz = { reverse %$foo };
        print Dumper [ $foo, $bar, $baz ];

        sub my_filter {
            my ($hash) = @_;
            # return an array ref containing the hash keys to dump
            # in the order that you want them to be dumped
            return [
              # Sort the keys of %$foo in reverse numeric order
                $hash eq $foo ? (sort {$b <=> $a} keys %$hash) :
              # Only dump the odd number keys of %$bar
                $hash eq $bar ? (grep {$_ % 2} keys %$hash) :
              # Sort keys in default order for all other hashes
                (sort keys %$hash)
            ];
        }

    BUGS

    Due to limitations of Perl subroutine call semantics, you cannot pass an array or hash. Prepend it with a \ to pass its reference instead. This will be remedied in time, now that Perl has subroutine prototypes. For now, you need to use the extended usage form, and prepend the name with a * to output it as a hash or array.
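
    A minimal sketch of the workaround described above, using an invented %colour hash: pass a reference to Dumper, or use the extended form with a * name to have the value written out as a hash.

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    my %colour = (red => '#f00');

    # Dumper(%colour) would receive the flattened key/value list,
    # so pass a reference instead.
    my $as_ref = Dumper(\%colour);

    # The extended form with a * name writes the value out as a hash.
    my $as_hash = Data::Dumper->Dump([\%colour], ['*colour']);

    print $as_ref, $as_hash;
    ```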

    Data::Dumper cheats with CODE references. If a code reference is encountered in the structure being processed (and if you haven't set the Deparse flag), an anonymous subroutine that contains the string '"DUMMY"' will be inserted in its place, and a warning will be printed if Purity is set. You can eval the result, but bear in mind that the anonymous sub that gets created is just a placeholder. Someday, perl will have a switch to cache-on-demand the string representation of a compiled piece of code, I hope. If you have prior knowledge of all the code refs that your data structures are likely to have, you can use the Seen method to pre-seed the internal reference table and make the dumped output point to them, instead. See EXAMPLES above.
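
    When the Deparse flag is acceptable, the placeholder can be avoided. The following sketch assumes B::Deparse reconstructs this particular sub faithfully, which holds for simple code like this but is not guaranteed in general; the $handlers structure is illustrative.

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $handlers = { double => sub { return 2 * $_[0] } };

    # Ask Data::Dumper to emit the source of code references via B::Deparse
    # instead of the "DUMMY" placeholder sub.
    local $Data::Dumper::Deparse = 1;
    my $text = Dumper($handlers);
    print $text;

    # The dump assigns to $VAR1, so declare it before evaling under strict.
    our $VAR1;
    eval $text;
    die $@ if $@;
    print "restored sub returns ", $VAR1->{double}->(21), "\n";
    ```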

    The Useqq and Deparse flags make Dump() run slower, since the XSUB implementation does not support them.

    SCALAR objects have the weirdest looking bless workaround.

    Pure Perl version of Data::Dumper escapes UTF-8 strings correctly only in Perl 5.8.0 and later.

    NOTE

    Starting from Perl 5.8.1 different runs of Perl will have different ordering of hash keys. The change was done for greater security, see Algorithmic Complexity Attacks in perlsec. This means that different runs of Perl will have different Data::Dumper outputs if the data contains hashes. If you need to have identical Data::Dumper outputs from different runs of Perl, use the environment variable PERL_HASH_SEED, see PERL_HASH_SEED in perlrun. Using this restores the old (platform-specific) ordering: an even prettier solution might be to use the Sortkeys filter of Data::Dumper.
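
    The Sortkeys approach mentioned above might look like the sketch below, which builds two hashes with the same content in different insertion orders and checks that their dumps are identical. The hash contents are arbitrary.

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    my @words = qw(apple banana cherry);

    # Same content, inserted in opposite orders.
    my (%first, %second);
    $first{$_}  = length $_ for @words;
    $second{$_} = length $_ for reverse @words;

    # With Sortkeys set, the output no longer depends on internal hash order.
    local $Data::Dumper::Sortkeys = 1;
    my $dump_first  = Dumper(\%first);
    my $dump_second = Dumper(\%second);

    print "dumps are identical\n" if $dump_first eq $dump_second;
    ```

    Unlike PERL_HASH_SEED, this affects only the dumped text and leaves the interpreter's hash randomization in place.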

    AUTHOR

    Gurusamy Sarathy gsar@activestate.com

    Copyright (c) 1996-98 Gurusamy Sarathy. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    VERSION

    Version 2.145 (March 15 2013)

    SEE ALSO

    perl(1)

     
    DBM_Filter::compress

    NAME

    DBM_Filter::compress - filter for DBM_Filter

    SYNOPSIS

        use SDBM_File; # or DB_File, or GDBM_File, or NDBM_File, or ODBM_File
        use DBM_Filter ;

        $db = tie %hash, ...
        $db->Filter_Push('compress');

    DESCRIPTION

    This DBM filter will compress all data before it is written to the database and uncompress it on reading.

    A fatal error will be thrown if the Compress::Zlib module is not available.
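
    A complete version of the synopsis could look like the following sketch. It assumes SDBM_File as the DBM implementation and writes to a temporary directory; the file name and the stored value are invented for the example.

    ```perl
    use strict;
    use warnings;
    use Fcntl;
    use File::Temp qw(tempdir);
    use SDBM_File;
    use DBM_Filter;

    my $file = tempdir(CLEANUP => 1) . '/compressed';

    my %hash;
    my $db = tie %hash, 'SDBM_File', $file, O_RDWR|O_CREAT, 0666
        or die "Cannot tie $file: $!";
    $db->Filter_Push('compress');

    my $original = 'hello ' x 20;
    $hash{greeting} = $original;       # compressed before it is stored
    my $fetched = $hash{greeting};     # uncompressed when it is read back

    undef $db;
    untie %hash;

    print "round trip ok\n" if $fetched eq $original;
    ```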

    SEE ALSO

    DBM_Filter, perldbmfilter, Compress::Zlib

    AUTHOR

    Paul Marquess pmqs@cpan.org

     
    DBM_Filter::encode

    NAME

    DBM_Filter::encode - filter for DBM_Filter

    SYNOPSIS

        use SDBM_File; # or DB_File, or GDBM_File, or NDBM_File, or ODBM_File
        use DBM_Filter ;

        $db = tie %hash, ...
        $db->Filter_Push('encode' => 'iso-8859-16');

    DESCRIPTION

    This DBM filter allows you to choose the character encoding in which data will be stored in the DBM file. The usage is

        $db->Filter_Push('encode' => ENCODING);

    where "ENCODING" must be a valid encoding name that the Encode module recognises.

    A fatal error will be thrown if:

    1. The Encode module is not available.

    2. The encoding requested is not supported by the Encode module.
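
    To make the effect of the filter visible, the sketch below stores a string through the filter and then re-opens the same file without it to inspect the raw bytes. It assumes SDBM_File, a temporary file, and the iso-8859-15 encoding (chosen here because it maps the euro sign to a single byte); all names are illustrative.

    ```perl
    use strict;
    use warnings;
    use Fcntl;
    use File::Temp qw(tempdir);
    use SDBM_File;
    use DBM_Filter;

    my $file = tempdir(CLEANUP => 1) . '/prices';

    my %hash;
    my $db = tie %hash, 'SDBM_File', $file, O_RDWR|O_CREAT, 0666
        or die "Cannot tie $file: $!";
    $db->Filter_Push('encode' => 'iso-8859-15');

    $hash{price} = "5 \x{20AC}";       # "5" followed by a euro sign
    my $decoded = $hash{price};        # decoded back to characters on fetch
    undef $db;
    untie %hash;

    # Re-open without the filter to see what was actually stored.
    tie my %raw, 'SDBM_File', $file, O_RDONLY, 0666
        or die "Cannot tie $file: $!";
    my $bytes = $raw{price};
    untie %raw;

    printf "stored %d bytes, last byte is 0x%02X\n",
        length($bytes), ord substr($bytes, -1);
    ```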

    SEE ALSO

    DBM_Filter, perldbmfilter, Encode

    AUTHOR

    Paul Marquess pmqs@cpan.org

     
    DBM_Filter::int32

    NAME

    DBM_Filter::int32 - filter for DBM_Filter

    SYNOPSIS

        use SDBM_File; # or DB_File, or GDBM_File, or NDBM_File, or ODBM_File
        use DBM_Filter ;

        $db = tie %hash, ...
        $db->Filter_Push('int32');

    DESCRIPTION

    This DBM filter is used when interoperating with a C/C++ application that uses a C int as the key, the value, or both in the DBM file.
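
    The following sketch shows what a C program would find in such a file. It assumes SDBM_File and a temporary file, and it assumes that the filter stores integers in the native layout produced by pack with the "i" template, which is how the filter shipped with this release behaves.

    ```perl
    use strict;
    use warnings;
    use Fcntl;
    use File::Temp qw(tempdir);
    use SDBM_File;
    use DBM_Filter;

    my $file = tempdir(CLEANUP => 1) . '/counts';

    my %hash;
    my $db = tie %hash, 'SDBM_File', $file, O_RDWR|O_CREAT, 0666
        or die "Cannot tie $file: $!";
    $db->Filter_Push('int32');

    $hash{7} = 42;                 # key and value are both stored as C ints
    my $value = $hash{7};          # converted back to a Perl number on fetch
    undef $db;
    untie %hash;

    # Re-open without the filter: the key must now be given in packed form.
    tie my %raw, 'SDBM_File', $file, O_RDONLY, 0666
        or die "Cannot tie $file: $!";
    my $raw_value = $raw{ pack 'i', 7 };
    untie %raw;

    print "value $value is stored in ", length($raw_value), " bytes\n";
    ```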

    SEE ALSO

    DBM_Filter, perldbmfilter

    AUTHOR

    Paul Marquess pmqs@cpan.org

     
    DBM_Filter::null

    NAME

    DBM_Filter::null - filter for DBM_Filter

    SYNOPSIS

        use SDBM_File; # or DB_File, or GDBM_File, or NDBM_File, or ODBM_File
        use DBM_Filter ;

        $db = tie %hash, ...
        $db->Filter_Push('null');

    DESCRIPTION

    This filter ensures that all data written to the DBM file is null terminated. This is useful when you have a perl script that needs to interoperate with a DBM file that a C program also uses. A fairly common issue is for the C application to include the terminating null in a string when it writes to the DBM file. This filter will ensure that all data written to the DBM file can be read by the C application.
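
    A small sketch of the behaviour, assuming SDBM_File and a temporary file: the Perl side works with ordinary strings, while the stored key and value each carry a trailing null byte, as a C application would expect.

    ```perl
    use strict;
    use warnings;
    use Fcntl;
    use File::Temp qw(tempdir);
    use SDBM_File;
    use DBM_Filter;

    my $file = tempdir(CLEANUP => 1) . '/shared';

    my %hash;
    my $db = tie %hash, 'SDBM_File', $file, O_RDWR|O_CREAT, 0666
        or die "Cannot tie $file: $!";
    $db->Filter_Push('null');

    $hash{name} = 'perl';          # a null byte is appended on store
    my $value = $hash{name};       # and stripped again on fetch
    undef $db;
    untie %hash;

    # Re-open without the filter to look at the raw record.
    tie my %raw, 'SDBM_File', $file, O_RDONLY, 0666
        or die "Cannot tie $file: $!";
    my $raw_value = $raw{"name\0"};
    untie %raw;

    print "Perl sees ", length($value), " bytes, the file holds ",
        length($raw_value), "\n";
    ```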

    SEE ALSO

    DBM_Filter, perldbmfilter

    AUTHOR

    Paul Marquess pmqs@cpan.org

     
    DBM_Filter::utf8

    NAME

    DBM_Filter::utf8 - filter for DBM_Filter

    SYNOPSIS

        use SDBM_File; # or DB_File, or GDBM_File, or NDBM_File, or ODBM_File
        use DBM_Filter ;

        $db = tie %hash, ...
        $db->Filter_Push('utf8');

    DESCRIPTION

    This Filter will ensure that all data written to the DBM will be encoded in UTF-8.

    This module uses the Encode module.
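
    As an illustration, the sketch below stores a string containing a non-ASCII character and then reads the raw record without the filter. SDBM_File, the temporary file, and the sample value are assumptions made for the example.

    ```perl
    use strict;
    use warnings;
    use Fcntl;
    use File::Temp qw(tempdir);
    use SDBM_File;
    use DBM_Filter;

    my $file = tempdir(CLEANUP => 1) . '/cities';

    my %hash;
    my $db = tie %hash, 'SDBM_File', $file, O_RDWR|O_CREAT, 0666
        or die "Cannot tie $file: $!";
    $db->Filter_Push('utf8');

    my $city = "Z\x{FC}rich";      # u with diaeresis, a single character
    $hash{city} = $city;           # encoded to UTF-8 bytes on store
    my $fetched = $hash{city};     # decoded back to characters on fetch
    undef $db;
    untie %hash;

    # Re-open without the filter to count the bytes actually stored.
    tie my %raw, 'SDBM_File', $file, O_RDONLY, 0666
        or die "Cannot tie $file: $!";
    my $bytes = $raw{city};
    untie %raw;

    print length($fetched), " characters, ", length($bytes), " bytes\n";
    ```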

    SEE ALSO

    DBM_Filter, perldbmfilter, Encode

    AUTHOR

    Paul Marquess pmqs@cpan.org

     
    Config::Extensions

    SYNOPSIS

        use Config::Extensions '%Extensions';
        if ($Extensions{'PerlIO::via'}) {
            # This perl has PerlIO::via built
        }

    DESCRIPTION

    The Config::Extensions module provides a hash %Extensions containing all the core extensions that were enabled for this perl. The hash is keyed by extension name, with each entry having one of 3 possible values:

    • dynamic

      The extension is dynamically linked

    • nonxs

      The extension is pure perl, so doesn't need linking to the perl executable

    • static

      The extension is statically linked to the perl binary

    As all values evaluate to true, a simple if test is good enough to determine whether an extension is present.

    All the data used to generate the %Extensions hash is already present in the Config module, but not in such a convenient format to quickly reference.
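
    For example, a script might report how a few extensions were built. The extension names queried below are arbitrary, and No::Such::Extension is a deliberately nonexistent name used to show the missing case.

    ```perl
    use strict;
    use warnings;
    use Config::Extensions '%Extensions';

    # Look up a few names; an absent key means the extension was not built.
    my %report;
    for my $name (qw(Data::Dumper PerlIO::via No::Such::Extension)) {
        $report{$name} = exists $Extensions{$name}
            ? $Extensions{$name}
            : 'not built';
        printf "%-20s %s\n", $name, $report{$name};
    }
    ```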

    AUTHOR

    Nicholas Clark <nick@ccl4.org>

     
    Compress::Zlib

    NAME

    Compress::Zlib - Interface to zlib compression library

    SYNOPSIS

        use Compress::Zlib ;

        ($d, $status) = deflateInit( [OPT] ) ;
        $status = $d->deflate($input, $output) ;
        $status = $d->flush([$flush_type]) ;
        $d->deflateParams(OPTS) ;
        $d->deflateTune(OPTS) ;
        $d->dict_adler() ;
        $d->crc32() ;
        $d->adler32() ;
        $d->total_in() ;
        $d->total_out() ;
        $d->msg() ;
        $d->get_Strategy();
        $d->get_Level();
        $d->get_BufSize();

        ($i, $status) = inflateInit( [OPT] ) ;
        $status = $i->inflate($input, $output [, $eof]) ;
        $status = $i->inflateSync($input) ;
        $i->dict_adler() ;
        $i->crc32() ;
        $i->adler32() ;
        $i->total_in() ;
        $i->total_out() ;
        $i->msg() ;
        $i->get_BufSize();

        $dest = compress($source) ;
        $dest = uncompress($source) ;

        $gz = gzopen($filename or filehandle, $mode) ;
        $bytesread = $gz->gzread($buffer [,$size]) ;
        $bytesread = $gz->gzreadline($line) ;
        $byteswritten = $gz->gzwrite($buffer) ;
        $status = $gz->gzflush($flush) ;
        $offset = $gz->gztell() ;
        $status = $gz->gzseek($offset, $whence) ;
        $status = $gz->gzclose() ;
        $status = $gz->gzeof() ;
        $status = $gz->gzsetparams($level, $strategy) ;
        $errstring = $gz->gzerror() ;
        $gzerrno

        $dest = Compress::Zlib::memGzip($buffer) ;
        $dest = Compress::Zlib::memGunzip($buffer) ;

        $crc = adler32($buffer [,$crc]) ;
        $crc = crc32($buffer [,$crc]) ;

        $crc = adler32_combine($adler1, $adler2, $len2) ;
        $crc = crc32_combine($crc1, $crc2, $len2) ;

        my $version = Compress::Raw::Zlib::zlib_version();

    DESCRIPTION

    The Compress::Zlib module provides a Perl interface to the zlib compression library (see AUTHOR for details about where to get zlib).

    The Compress::Zlib module can be split into two general areas of functionality, namely a simple read/write interface to gzip files and a low-level in-memory compression/decompression interface.

    Each of these areas will be discussed in the following sections.

    Notes for users of Compress::Zlib version 1

    The main change in Compress::Zlib version 2.x is that it does not now interface directly to the zlib library. Instead it uses the IO::Compress::Gzip and IO::Uncompress::Gunzip modules for reading/writing gzip files, and the Compress::Raw::Zlib module for some low-level zlib access.

    The interface provided by version 2 of this module should be 100% backward compatible with version 1. If you find a difference in the expected behaviour please contact the author (see AUTHOR). See GZIP INTERFACE for the changes made to the gzopen interface.

    With the creation of the IO::Compress and IO::Uncompress modules no new features are planned for Compress::Zlib - the new modules do everything that Compress::Zlib does and then some. Development on Compress::Zlib will be limited to bug fixes only.

    If you are writing new code, your first port of call should be one of the new IO::Compress or IO::Uncompress modules.

    GZIP INTERFACE

    A number of functions are supplied in zlib for reading and writing gzip files that conform to RFC 1952. This module provides an interface to most of them.

    If you have previously used Compress::Zlib 1.x, the following enhancements/changes have been made to the gzopen interface:

    1. If you want to open either STDIN or STDOUT with gzopen, you can now optionally use the special filename "-" as a synonym for \*STDIN and \*STDOUT.

    2. In Compress::Zlib version 1.x, gzopen used the zlib library to open the underlying file. This made things especially tricky when a Perl filehandle was passed to gzopen. Behind the scenes the numeric C file descriptor had to be extracted from the Perl filehandle and this passed to the zlib library.

       Apart from being non-portable to some operating systems, this made it difficult to use gzopen in situations where you wanted to extract/create a gzip data stream that is embedded in a larger file, without having to resort to opening and closing the file multiple times.

       It also made it impossible to pass a perl filehandle that wasn't associated with a real filesystem file, like, say, an IO::String.

       In Compress::Zlib version 2.x, the gzopen interface has been completely rewritten to use the IO::Compress::Gzip for writing gzip files and IO::Uncompress::Gunzip for reading gzip files. None of the limitations mentioned above apply.

    3. Addition of gzseek to provide a restricted seek interface.

    4. Added gztell.

    A more complete and flexible interface for reading/writing gzip files/buffers is included with the module IO-Compress-Zlib. See IO::Compress::Gzip and IO::Uncompress::Gunzip for more details.
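
    For comparison, a buffer-to-buffer round trip with those newer modules might look like the sketch below. It uses their one-shot gzip and gunzip functions; the sample text is arbitrary.

    ```perl
    use strict;
    use warnings;
    use IO::Compress::Gzip     qw(gzip   $GzipError);
    use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

    my $text = "some repetitive text\n" x 100;
    my ($gzipped, $restored);

    # Scalar references select in-memory buffers instead of files.
    gzip \$text => \$gzipped
        or die "gzip failed: $GzipError";
    gunzip \$gzipped => \$restored
        or die "gunzip failed: $GunzipError";

    printf "%d bytes compressed to %d\n", length $text, length $gzipped;
    ```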

    • $gz = gzopen($filename, $mode)
    • $gz = gzopen($filehandle, $mode)

      This function opens either the gzip file $filename for reading or writing or attaches to the opened filehandle, $filehandle. It returns an object on success and undef on failure.

      When writing a gzip file this interface will always create the smallest possible gzip header (exactly 10 bytes). If you want greater control over what gets stored in the gzip header (like the original filename or a comment) use IO::Compress::Gzip instead. Similarly if you want to read the contents of the gzip header use IO::Uncompress::Gunzip.

      The second parameter, $mode, is used to specify whether the file is opened for reading or writing and to optionally specify a compression level and compression strategy when writing. The format of the $mode parameter is similar to the mode parameter to the 'C' function fopen, so "rb" is used to open for reading, "wb" for writing and "ab" for appending (writing at the end of the file).

      To specify a compression level when writing, append a digit between 0 and 9 to the mode string -- 0 means no compression and 9 means maximum compression. If no compression level is specified Z_DEFAULT_COMPRESSION is used.

      To specify the compression strategy when writing, append 'f' for filtered data, 'h' for Huffman only compression, or 'R' for run-length encoding. If no strategy is specified Z_DEFAULT_STRATEGY is used.

      So, for example, "wb9" means open for writing with the maximum compression using the default strategy and "wb4R" means open for writing with compression level 4 and run-length encoding.

      Refer to the zlib documentation for the exact format of the $mode parameter.

    • $bytesread = $gz->gzread($buffer [, $size]) ;

      Reads $size bytes from the compressed file into $buffer . If $size is not specified, it will default to 4096. If the scalar $buffer is not large enough, it will be extended automatically.

      Returns the number of bytes actually read. On EOF it returns 0 and in the case of an error, -1.

    • $bytesread = $gz->gzreadline($line) ;

      Reads the next line from the compressed file into $line .

      Returns the number of bytes actually read. On EOF it returns 0 and in the case of an error, -1.

      It is legal to intermix calls to gzread and gzreadline .

      To maintain backward compatibility with version 1.x of this module gzreadline ignores the $/ variable - it always uses the string "\n" as the line delimiter.

      If you want to read a gzip file a line at a time and have it respect the $/ variable (or $INPUT_RECORD_SEPARATOR , or $RS when English is in use) see IO::Uncompress::Gunzip.

    • $byteswritten = $gz->gzwrite($buffer) ;

      Writes the contents of $buffer to the compressed file. Returns the number of bytes actually written, or 0 on error.

    • $status = $gz->gzflush($flush_type) ;

      Flushes all pending output into the compressed file.

      This method takes an optional parameter, $flush_type , that controls how the flushing will be carried out. By default the $flush_type used is Z_FINISH . Other valid values for $flush_type are Z_NO_FLUSH , Z_SYNC_FLUSH , Z_FULL_FLUSH and Z_BLOCK . It is strongly recommended that you only set the flush_type parameter if you fully understand the implications of what it does - overuse of flush can seriously degrade the level of compression achieved. See the zlib documentation for details.

      Returns 0 on success.

    • $offset = $gz->gztell() ;

      Returns the uncompressed file offset.

    • $status = $gz->gzseek($offset, $whence) ;

      Provides a sub-set of the seek functionality, with the restriction that it is only legal to seek forward in the compressed file. It is a fatal error to attempt to seek backward.

      When opened for writing, empty parts of the file will have NULL (0x00) bytes written to them.

      The $whence parameter should be one of SEEK_SET, SEEK_CUR or SEEK_END.

      Returns 1 on success, 0 on failure.

    • $gz->gzclose

      Closes the compressed file. Any pending data is flushed to the file before it is closed.

      Returns 0 on success.

    • $gz->gzsetparams($level, $strategy)

      Change settings for the deflate stream $gz .

      The list of the valid options is shown below. Options not specified will remain unchanged.

      Note: This method is only available if you are running zlib 1.0.6 or better.

      • $level

        Defines the compression level. Valid values are 0 through 9, Z_NO_COMPRESSION , Z_BEST_SPEED , Z_BEST_COMPRESSION , and Z_DEFAULT_COMPRESSION .

      • $strategy

        Defines the strategy used to tune the compression. The valid values are Z_DEFAULT_STRATEGY , Z_FILTERED and Z_HUFFMAN_ONLY .

    • $gz->gzerror

      Returns the zlib error message or number for the last operation associated with $gz . The return value will be the zlib error number when used in a numeric context and the zlib error message when used in a string context. The zlib error number constants, shown below, are available for use.

          Z_OK
          Z_STREAM_END
          Z_ERRNO
          Z_STREAM_ERROR
          Z_DATA_ERROR
          Z_MEM_ERROR
          Z_BUF_ERROR

    • $gzerrno

      The $gzerrno scalar holds the error code associated with the most recent gzip routine. Note that unlike gzerror() , the error is not associated with a particular file.

      As with gzerror() it returns an error number in numeric context and an error message in string context. Unlike gzerror() though, the error message will correspond to the zlib message when the error is associated with zlib itself, or the UNIX error message when it is not (i.e. zlib returned Z_ERRNO).

      As there is an overlap between the error numbers used by zlib and UNIX, $gzerrno should only be used to check for the presence of an error in numeric context. Use gzerror() to check for specific zlib errors. The gzcat example below shows how the variable can be used safely.

    Examples

    Here is an example script which uses the interface. It implements a gzcat function.

        use strict ;
        use warnings ;
        use Compress::Zlib ;

        # use stdin if no files supplied
        @ARGV = '-' unless @ARGV ;

        foreach my $file (@ARGV) {
            my $buffer ;

            my $gz = gzopen($file, "rb")
                or die "Cannot open $file: $gzerrno\n" ;

            # gzread returns 0 at EOF and -1 on error
            print $buffer while $gz->gzread($buffer) > 0 ;

            # report both the error message and the error number
            die "Error reading from $file: $gzerrno (" . ($gzerrno+0) . ")\n"
                if $gzerrno != Z_STREAM_END ;

            $gz->gzclose() ;
        }

    Below is a script which makes use of gzreadline . It implements a very simple grep like script.

        use strict ;
        use warnings ;
        use Compress::Zlib ;

        die "Usage: gzgrep pattern [file...]\n"
            unless @ARGV >= 1;

        my $pattern = shift ;

        # use stdin if no files supplied
        @ARGV = '-' unless @ARGV ;

        foreach my $file (@ARGV) {
            my $gz = gzopen($file, "rb")
                or die "Cannot open $file: $gzerrno\n" ;

            while ($gz->gzreadline($_) > 0) {
                print if /$pattern/ ;
            }

            die "Error reading from $file: $gzerrno\n"
                if $gzerrno != Z_STREAM_END ;

            $gz->gzclose() ;
        }

    This script, gzstream, does the opposite of the gzcat script above. It reads from standard input and writes a gzip data stream to standard output.

        use strict ;
        use warnings ;
        use Compress::Zlib ;

        binmode STDOUT; # gzopen only sets it on the fd

        my $gz = gzopen(\*STDOUT, "wb")
            or die "Cannot open stdout: $gzerrno\n" ;

        while (<>) {
            $gz->gzwrite($_)
                or die "error writing: $gzerrno\n" ;
        }

        $gz->gzclose ;
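
    The mode string and the restricted seek interface described above can be combined in a short round trip. The following is only a sketch: the scratch file comes from File::Temp and the sample data is made up for illustration, while gzopen, gzwrite, gzseek, gztell, gzread and gzclose are the methods documented in this section. SEEK_SET is imported from Fcntl here, on the assumption that Compress::Zlib does not export it.

```perl
use strict ;
use warnings ;
use Compress::Zlib ;
use Fcntl qw(:seek) ;            # SEEK_SET, SEEK_CUR, SEEK_END
use File::Temp qw(tempfile) ;

# Hypothetical scratch file; any writable path would do.
my ($fh, $filename) = tempfile(SUFFIX => '.gz', UNLINK => 1) ;
close $fh ;

# "wb9": open for writing with the maximum compression level.
my $gz = gzopen($filename, "wb9")
    or die "Cannot open $filename for writing: $gzerrno\n" ;
$gz->gzwrite("0123456789")
    or die "Error writing to $filename: $gzerrno\n" ;
$gz->gzclose() ;

# "rb": open the same file for reading.
$gz = gzopen($filename, "rb")
    or die "Cannot open $filename for reading: $gzerrno\n" ;

# Skip the first five uncompressed bytes; only forward seeks are legal.
$gz->gzseek(5, SEEK_SET)
    or die "Cannot seek in $filename: $gzerrno\n" ;
my $offset = $gz->gztell() ;

# Collect whatever remains after the seek position.
my $tail = '' ;
my $buffer ;
$tail .= $buffer while $gz->gzread($buffer) > 0 ;
$gz->gzclose() ;

print "offset=$offset tail=$tail\n" ;
```

    Because gzseek works on uncompressed offsets, the data before the target position still has to be decompressed and discarded, so a long forward seek is not free.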

    Compress::Zlib::memGzip

    This function is used to create an in-memory gzip file with the minimum possible gzip header (exactly 10 bytes).

        $dest = Compress::Zlib::memGzip($buffer)
            or die "Cannot compress: $gzerrno\n";

    If successful, it returns the in-memory gzip file. Otherwise it returns undef and the $gzerrno variable will store the zlib error code.

    The $buffer parameter can either be a scalar or a scalar reference.

    See IO::Compress::Gzip for an alternative way to carry out in-memory gzip compression.

    Compress::Zlib::memGunzip

    This function is used to uncompress an in-memory gzip file.

        $dest = Compress::Zlib::memGunzip($buffer)
            or die "Cannot uncompress: $gzerrno\n";

    If successful, it returns the uncompressed gzip file. Otherwise it returns undef and the $gzerrno variable will store the zlib error code.

    The $buffer parameter can either be a scalar or a scalar reference. The contents of the $buffer parameter are destroyed after calling this function.

    If $buffer consists of multiple concatenated gzip data streams only the first will be uncompressed. Use gunzip with the MultiStream option in the IO::Uncompress::Gunzip module if you need to deal with concatenated data streams.

    See IO::Uncompress::Gunzip for an alternative way to carry out in-memory gzip uncompression.
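
    As a rough illustration of the two in-memory functions working together, the sketch below gzips a string and restores it. The sample text and variable names are invented for the example. Since the text above says the contents of $buffer are destroyed by memGunzip, a copy of the gzip data is passed in.

```perl
use strict ;
use warnings ;
use Compress::Zlib ;

# Made-up sample data for the round trip.
my $original = "The quick brown fox jumps over the lazy dog.\n" x 20 ;

my $gzipped = Compress::Zlib::memGzip($original)
    or die "Cannot compress: $gzerrno\n" ;

# memGunzip may clobber its argument, so hand it a copy.
my $scratch  = $gzipped ;
my $restored = Compress::Zlib::memGunzip($scratch)
    or die "Cannot uncompress: $gzerrno\n" ;

printf "%d bytes in, %d bytes gzipped, round trip %s\n",
    length($original), length($gzipped),
    $restored eq $original ? "ok" : "FAILED" ;
```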

    COMPRESS/UNCOMPRESS

    Two functions are provided to perform in-memory compression/uncompression of RFC 1950 data streams. They are called compress and uncompress .

    • $dest = compress($source [, $level] ) ;

      Compresses $source . If successful it returns the compressed data. Otherwise it returns undef.

      The source buffer, $source , can either be a scalar or a scalar reference.

      The $level parameter defines the compression level. Valid values are 0 through 9, Z_NO_COMPRESSION , Z_BEST_SPEED , Z_BEST_COMPRESSION , and Z_DEFAULT_COMPRESSION . If $level is not specified Z_DEFAULT_COMPRESSION will be used.

    • $dest = uncompress($source) ;

      Uncompresses $source . If successful it returns the uncompressed data. Otherwise it returns undef.

      The source buffer can either be a scalar or a scalar reference.

    Please note: the two functions defined above are not compatible with the Unix commands of the same name.

    See IO::Compress::Deflate and IO::Uncompress::Inflate included with this distribution for an alternative interface for reading/writing RFC 1950 files/buffers.
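
    A minimal sketch of compress and uncompress used as a pair, assuming some repetitive sample data (invented here) so that the size reduction is visible. Both functions return undef on failure, so the results are checked with defined rather than by truth value.

```perl
use strict ;
use warnings ;
use Compress::Zlib ;

# Made-up, highly repetitive input.
my $source = "abcabcabc" x 100 ;

# The optional second argument selects the compression level.
my $packed = compress($source, Z_BEST_COMPRESSION) ;
defined $packed
    or die "compress failed\n" ;

my $unpacked = uncompress($packed) ;
defined $unpacked
    or die "uncompress failed\n" ;

printf "%d bytes compressed to %d bytes\n",
    length($source), length($packed) ;
```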

    Deflate Interface

    This section defines an interface that allows in-memory compression using the deflate interface provided by zlib.

    Here is a definition of the interface available:

    ($d, $status) = deflateInit( [OPT] )

    Initialises a deflation stream.

    It combines the features of the zlib functions deflateInit , deflateInit2 and deflateSetDictionary .

    If successful, it will return the initialised deflation stream, $d and $status of Z_OK in a list context. In scalar context it returns the deflation stream, $d , only.

    If not successful, the returned deflation stream ($d ) will be undef and $status will hold the exact zlib error code.

    The function optionally takes a number of named options specified as -Name=>value pairs. This allows individual options to be tailored without having to specify them all in the parameter list.

    For backward compatibility, it is also possible to pass the options as a single optional parameter: a reference to a hash containing the name=>value pairs.

    Here is a list of the valid options:

    • -Level

      Defines the compression level. Valid values are 0 through 9, Z_NO_COMPRESSION , Z_BEST_SPEED , Z_BEST_COMPRESSION , and Z_DEFAULT_COMPRESSION .

      The default is Z_DEFAULT_COMPRESSION.

    • -Method

      Defines the compression method. The only valid value at present (and the default) is Z_DEFLATED.

    • -WindowBits

      To create an RFC 1950 data stream, set WindowBits to a positive number.

      To create an RFC 1951 data stream, set WindowBits to -MAX_WBITS .

      For a full definition of the meaning and valid values for WindowBits refer to the zlib documentation for deflateInit2.

      Defaults to MAX_WBITS.

    • -MemLevel

      For a definition of the meaning and valid values for MemLevel refer to the zlib documentation for deflateInit2.

      Defaults to MAX_MEM_LEVEL.

    • -Strategy

      Defines the strategy used to tune the compression. The valid values are Z_DEFAULT_STRATEGY , Z_FILTERED and Z_HUFFMAN_ONLY .

      The default is Z_DEFAULT_STRATEGY.

    • -Dictionary

      When a dictionary is specified Compress::Zlib will automatically call deflateSetDictionary directly after calling deflateInit. The Adler32 value for the dictionary can be obtained by calling the method $d->dict_adler().

      The default is no dictionary.

    • -Bufsize

      Sets the initial size for the deflation buffer. If the buffer has to be reallocated to increase the size, it will grow in increments of Bufsize .

      The default is 4096.

    Here is an example of using the deflateInit optional parameter list to override the default buffer size and compression level. All other options will take their default values.

        deflateInit( -Bufsize => 300,
                     -Level   => Z_BEST_SPEED ) ;
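
    The -WindowBits option is the one that selects between an RFC 1950 and an RFC 1951 data stream. The sketch below compresses the same (made-up) text both ways and inflates the raw stream again. The helper deflate_all is a hypothetical name introduced only for this example; it is not part of Compress::Zlib.

```perl
use strict ;
use warnings ;
use Compress::Zlib ;

# Hypothetical helper: deflate $data in one go with the given options.
sub deflate_all {
    my ($data, @opts) = @_ ;

    my ($d, $status) = deflateInit(@opts) ;
    $status == Z_OK
        or die "deflateInit failed: $status\n" ;

    my ($out, $dstatus) = $d->deflate($data) ;
    $dstatus == Z_OK
        or die "deflate failed: $dstatus\n" ;

    my ($rest, $fstatus) = $d->flush() ;
    $fstatus == Z_OK
        or die "flush failed: $fstatus\n" ;

    return $out . $rest ;
}

# Made-up sample data.
my $text = "hello hello hello hello\n" x 10 ;

my $zlib = deflate_all($text) ;                             # RFC 1950 (default)
my $raw  = deflate_all($text, -WindowBits => -MAX_WBITS) ;  # RFC 1951

# The inflation stream must be told that it is reading raw deflate data.
my ($i, $status) = inflateInit(-WindowBits => -MAX_WBITS) ;
$status == Z_OK
    or die "inflateInit failed: $status\n" ;

my $input = $raw ;      # inflate modifies its buffer, so work on a copy
my ($plain, $istatus) = $i->inflate(\$input) ;

printf "zlib stream: %d bytes, raw deflate stream: %d bytes\n",
    length($zlib), length($raw) ;
```

    The raw stream is the shorter of the two because it carries no zlib header or trailing checksum, which also means the reader gets no integrity check from the stream itself.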

    ($out, $status) = $d->deflate($buffer)

    Deflates the contents of $buffer . The buffer can either be a scalar or a scalar reference. When finished, $buffer will be completely processed (assuming there were no errors). If the deflation was successful it returns the deflated output, $out , and a status value, $status , of Z_OK .

    On error, $out will be undef and $status will contain the zlib error code.

    In a scalar context deflate will return $out only.

    As with the deflate function in zlib, it is not necessarily the case that any output will be produced by this method. So don't rely on the fact that $out is empty for an error test.

    ($out, $status) = $d->flush()

    ($out, $status) = $d->flush($flush_type)

    Typically used to finish the deflation. Any pending output will be returned via $out . $status will have a value Z_OK if successful.

    In a scalar context flush will return $out only.

    Note that flushing can seriously degrade the compression ratio, so it should only be used to terminate a compression stream (using Z_FINISH) or when you want to create a full flush point (using Z_FULL_FLUSH).

    By default the flush_type used is Z_FINISH . Other valid values for flush_type are Z_NO_FLUSH , Z_PARTIAL_FLUSH , Z_SYNC_FLUSH and Z_FULL_FLUSH . It is strongly recommended that you only set the flush_type parameter if you fully understand the implications of what it does. See the zlib documentation for details.

    $status = $d->deflateParams([OPT])

    Change settings for the deflate stream $d .

    The list of the valid options is shown below. Options not specified will remain unchanged.

    • -Level

      Defines the compression level. Valid values are 0 through 9, Z_NO_COMPRESSION , Z_BEST_SPEED , Z_BEST_COMPRESSION , and Z_DEFAULT_COMPRESSION .

    • -Strategy

      Defines the strategy used to tune the compression. The valid values are Z_DEFAULT_STRATEGY , Z_FILTERED and Z_HUFFMAN_ONLY .

    $d->dict_adler()

    Returns the adler32 value for the dictionary.

    $d->msg()

    Returns the last error message generated by zlib.

    $d->total_in()

    Returns the total number of uncompressed bytes input to deflate.

    $d->total_out()

    Returns the total number of compressed bytes output from deflate.

    Example

    Here is a trivial example of using deflate . It simply reads standard input, deflates it and writes it to standard output.

        use strict ;
        use warnings ;
        use Compress::Zlib ;

        binmode STDIN;
        binmode STDOUT;

        my $x = deflateInit()
            or die "Cannot create a deflation stream\n" ;

        my ($output, $status) ;
        while (<>)
        {
            ($output, $status) = $x->deflate($_) ;

            $status == Z_OK
                or die "deflation failed\n" ;

            print $output ;
        }

        ($output, $status) = $x->flush() ;

        $status == Z_OK
            or die "deflation failed\n" ;

        print $output ;

    Inflate Interface

    This section defines the interface that allows in-memory uncompression using the inflate interface provided by zlib.

    Here is a definition of the interface:

    ($i, $status) = inflateInit()

    Initialises an inflation stream.

    In a list context it returns the inflation stream, $i , and the zlib status code in $status . In a scalar context it returns the inflation stream only.

    If successful, $i will hold the inflation stream and $status will be Z_OK .

    If not successful, $i will be undef and $status will hold the zlib error code.

    The function optionally takes a number of named options specified as -Name=>value pairs. This allows individual options to be tailored without having to specify them all in the parameter list.

    For backward compatibility, it is also possible to pass the options as a single optional parameter: a reference to a hash containing the name=>value pairs.

    Here is a list of the valid options:

    • -WindowBits

      To uncompress an RFC 1950 data stream, set WindowBits to a positive number.

      To uncompress an RFC 1951 data stream, set WindowBits to -MAX_WBITS .

      For a full definition of the meaning and valid values for WindowBits refer to the zlib documentation for inflateInit2.

      Defaults to MAX_WBITS.

    • -Bufsize

      Sets the initial size for the inflation buffer. If the buffer has to be reallocated to increase the size, it will grow in increments of Bufsize .

      Default is 4096.

    • -Dictionary

      The default is no dictionary.

    Here is an example of using the inflateInit optional parameter to override the default buffer size.

        inflateInit( -Bufsize => 300 ) ;

    ($out, $status) = $i->inflate($buffer)

    Inflates the complete contents of $buffer . The buffer can either be a scalar or a scalar reference.

    Returns Z_OK if successful and Z_STREAM_END if the end of the compressed data has been successfully reached. If not successful, $out will be undef and $status will hold the zlib error code.

    The $buffer parameter is modified by inflate. On completion it will contain what remains of the input buffer after inflation. This means that $buffer will be an empty string when the return status is Z_OK. When the return status is Z_STREAM_END the $buffer parameter will contain what (if anything) was stored in the input buffer after the deflated data stream.

    This feature is useful when processing a file format that encapsulates a compressed data stream (e.g. gzip, zip).

    $status = $i->inflateSync($buffer)

    Scans $buffer until it reaches either a full flush point or the end of the buffer.

    If a full flush point is found, Z_OK is returned and $buffer will have all data up to the flush point removed. This can then be passed to the inflate method.

    Any other return code means that a flush point was not found. If more data is available, inflateSync can be called repeatedly with more compressed data until the flush point is found.

    $i->dict_adler()

    Returns the adler32 value for the dictionary.

    $i->msg()

    Returns the last error message generated by zlib.

    $i->total_in()

    Returns the total number of compressed bytes input to inflate.

    $i->total_out()

    Returns the total number of uncompressed bytes output from inflate.

    Example

    Here is an example of using inflate .

        use strict ;
        use warnings ;
        use Compress::Zlib ;

        my $x = inflateInit()
            or die "Cannot create an inflation stream\n" ;

        my $input = '' ;
        binmode STDIN;
        binmode STDOUT;

        my ($output, $status) ;
        while (read(STDIN, $input, 4096))
        {
            ($output, $status) = $x->inflate(\$input) ;

            print $output
                if $status == Z_OK or $status == Z_STREAM_END ;

            last if $status != Z_OK ;
        }

        die "inflation failed\n"
            unless $status == Z_STREAM_END ;
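
    The deflate and inflate interfaces can also be exercised together entirely in memory. The sketch below uses made-up sample text, compresses it with deflateInit, restores it with inflateInit, and reads the byte counters described above. It is meant as an illustration of how the status codes and counters line up, not as a recommended way to structure a program.

```perl
use strict ;
use warnings ;
use Compress::Zlib ;

# Made-up sample data.
my $text = "to be or not to be, that is the question\n" x 25 ;

my ($d, $dstatus) = deflateInit(-Level => Z_BEST_COMPRESSION) ;
$dstatus == Z_OK
    or die "Cannot create a deflation stream: $dstatus\n" ;

# deflate may return little or no output; flush returns the rest.
my ($head, $s1) = $d->deflate($text) ;
$s1 == Z_OK or die "deflation failed: $s1\n" ;
my ($tail, $s2) = $d->flush() ;
$s2 == Z_OK or die "flush failed: $s2\n" ;
my $compressed = $head . $tail ;

my ($i, $istatus) = inflateInit() ;
$istatus == Z_OK
    or die "Cannot create an inflation stream: $istatus\n" ;

# inflate modifies its buffer, so work on a copy of the compressed data.
my $input = $compressed ;
my ($plain, $s3) = $i->inflate(\$input) ;
$s3 == Z_STREAM_END
    or die "inflation failed: $s3\n" ;

printf "deflate: %d bytes in, %d bytes out\n",
    $d->total_in(), $d->total_out() ;
printf "inflate: %d bytes in, %d bytes out\n",
    $i->total_in(), $i->total_out() ;
```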

    CHECKSUM FUNCTIONS

    Two functions are provided by zlib to calculate checksums. For the Perl interface, the order of the two parameters in both functions has been reversed. This allows both running checksums and one off calculations to be done.

        $crc = adler32($buffer [,$crc]) ;
        $crc = crc32($buffer [,$crc]) ;

    The buffer parameters can either be a scalar or a scalar reference.

    If the $crc parameter is undef, the crc value will be reset.

    If you have built this module with zlib 1.2.3 or better, two more CRC-related functions are available.

        $adler = adler32_combine($adler1, $adler2, $len2) ;
        $crc = crc32_combine($crc1, $crc2, $len2) ;

    These functions allow checksums to be merged.
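
    The reversed parameter order is what makes a running checksum convenient: the previous value is simply passed back in as the optional second argument. The sketch below, using invented chunk data, feeds three chunks through crc32 and adler32 and compares the result with a one-off calculation over the joined string.

```perl
use strict ;
use warnings ;
use Compress::Zlib ;

# Made-up data, split into chunks as it might arrive from a stream.
my @chunks = ("first chunk, ", "second chunk, ", "third chunk") ;
my $whole  = join '', @chunks ;

# An undef starting value resets the checksum.
my ($crc, $adler) ;
for my $chunk (@chunks) {
    $crc   = crc32($chunk, $crc) ;
    $adler = adler32($chunk, $adler) ;
}

printf "crc32:   running %08x, one-off %08x\n", $crc,   crc32($whole) ;
printf "adler32: running %08x, one-off %08x\n", $adler, adler32($whole) ;
```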

    Misc

    my $version = Compress::Zlib::zlib_version();

    Returns the version of the zlib library.

    CONSTANTS

    All the zlib constants are automatically imported when you make use of Compress::Zlib.

    SEE ALSO

    IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 1995-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Perl 5 version 18.2 documentation

    Compress::Raw::Bzip2

    NAME

    Compress::Raw::Bzip2 - Low-Level Interface to bzip2 compression library

    SYNOPSIS

        use Compress::Raw::Bzip2 ;

        my ($bz, $status) = new Compress::Raw::Bzip2 [OPTS]
            or die "Cannot create bzip2 object: $bzerno\n";

        $status = $bz->bzdeflate($input, $output);
        $status = $bz->bzflush($output);
        $status = $bz->bzclose($output);

        my ($bz, $status) = new Compress::Raw::Bunzip2 [OPTS]
            or die "Cannot create bunzip2 object: $bzerno\n";

        $status = $bz->bzinflate($input, $output);

        my $version = Compress::Raw::Bzip2::bzlibversion();

    DESCRIPTION

    Compress::Raw::Bzip2 provides an interface to the in-memory compression/uncompression functions from the bzip2 compression library.

    Although the primary purpose of Compress::Raw::Bzip2 is to be used by the IO::Compress::Bzip2 and IO::Uncompress::Bunzip2 modules, it can be used on its own for simple compression/uncompression tasks.

    Compression

    ($z, $status) = new Compress::Raw::Bzip2 $appendOutput, $blockSize100k, $workfactor;

    Creates a new compression object.

    If successful, it will return the initialised compression object, $z, and a $status of BZ_OK in a list context. In scalar context it returns the compression object, $z, only.

    If not successful, the returned compression object, $z, will be undef and $status will hold a bzip2 error code.

    Below is a list of the valid options:

    • $appendOutput

      Controls whether the compressed data is appended to the output buffer in the bzdeflate , bzflush and bzclose methods.

      Defaults to 1.

    • $blockSize100k

      To quote the bzip2 documentation

          blockSize100k specifies the block size to be used for compression. It
          should be a value between 1 and 9 inclusive, and the actual block size
          used is 100000 x this figure. 9 gives the best compression but takes
          most memory.

      Defaults to 1.

    • $workfactor

      To quote the bzip2 documentation

          This parameter controls how the compression phase behaves when
          presented with worst case, highly repetitive, input data. If
          compression runs into difficulties caused by repetitive data, the
          library switches from the standard sorting algorithm to a fallback
          algorithm. The fallback is slower than the standard algorithm by
          perhaps a factor of three, but always behaves reasonably, no matter how
          bad the input.

          Lower values of workFactor reduce the amount of effort the standard
          algorithm will expend before resorting to the fallback. You should set
          this parameter carefully; too low, and many inputs will be handled by
          the fallback algorithm and so compress rather slowly, too high, and
          your average-to-worst case compression times can become very large. The
          default value of 30 gives reasonable behaviour over a wide range of
          circumstances.

          Allowable values range from 0 to 250 inclusive. 0 is a special case,
          equivalent to using the default value of 30.

      Defaults to 0.

    $status = $bz->bzdeflate($input, $output);

    Reads the contents of $input , compresses it and writes the compressed data to $output .

    Returns BZ_RUN_OK on success and a bzip2 error code on failure.

    If appendOutput is enabled in the constructor for the bzip2 object, the compressed data will be appended to $output . If not enabled, $output will be truncated before the compressed data is written to it.

    $status = $bz->bzflush($output);

    Flushes any pending compressed data to $output .

    Returns BZ_RUN_OK on success and a bzip2 error code on failure.

    $status = $bz->bzclose($output);

    Terminates the compressed data stream and flushes any pending compressed data to $output .

    Returns BZ_STREAM_END on success and a bzip2 error code on failure.

    Example
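
    A minimal sketch of the compression methods described above, assuming made-up input text. The constructor arguments are passed positionally in the order given in the signature ($appendOutput, $blockSize100k, $workfactor), using the documented default values, and each status is compared against the return code stated for that method.

```perl
use strict ;
use warnings ;
use Compress::Raw::Bzip2 ;

# Made-up, repetitive input.
my $text = "bzip2 likes repetitive input. " x 50 ;

# appendOutput = 1, blockSize100k = 1, workfactor = 0
my ($bz, $status) = Compress::Raw::Bzip2->new(1, 1, 0) ;
defined $bz
    or die "Cannot create bzip2 object: $status\n" ;

my $compressed = '' ;

# With appendOutput enabled, each call appends to $compressed.
$status = $bz->bzdeflate($text, $compressed) ;
$status == BZ_RUN_OK
    or die "bzdeflate failed: $status\n" ;

# bzclose terminates the stream and flushes anything still pending.
$status = $bz->bzclose($compressed) ;
$status == BZ_STREAM_END
    or die "bzclose failed: $status\n" ;

printf "%d bytes compressed to %d bytes\n",
    length($text), length($compressed) ;
```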

    Uncompression

    ($z, $status) = new Compress::Raw::Bunzip2 $appendOutput, $consumeInput, $small, $verbosity, $limitOutput;

    If successful, it will return the initialised uncompression object, $z, and a $status of BZ_OK in a list context. In scalar context it returns the uncompression object, $z, only.

    If not successful, the returned uncompression object, $z, will be undef and $status will hold a bzip2 error code.

    Below is a list of the valid options:

    • $appendOutput

      Controls whether the uncompressed data is appended to the output buffer in the bzinflate method.

      Defaults to 1.

    • $consumeInput
    • $small

      To quote the bzip2 documentation

      1. If small is nonzero, the library will use an alternative decompression
      2. algorithm which uses less memory but at the cost of decompressing more
      3. slowly (roughly speaking, half the speed, but the maximum memory
      4. requirement drops to around 2300k).

      Defaults to 0.

    • $limitOutput

      The LimitOutput option changes the behavior of the $i->bzinflate method so that the amount of memory used by the output buffer can be limited.

      When LimitOutput is used the size of the output buffer used will be either 16k or the amount of memory already allocated to $output, whichever is larger. Predicting the output size available is tricky, so don't rely on getting an exact output buffer size.

      When LimitOutput is not specified $i->bzinflate will use as much memory as it takes to write all the uncompressed data it creates by uncompressing the input buffer.

      If LimitOutput is enabled, the ConsumeInput option will also be enabled.

      This option defaults to false.

    • $verbosity

      This parameter is ignored.

      Defaults to 0.

    $status = $z->bzinflate($input, $output);

    Uncompresses $input and writes the uncompressed data to $output .

    Returns BZ_OK if the uncompression was successful but the end of the compressed data stream has not been reached. Returns BZ_STREAM_END if the uncompression was successful and the end of the compressed data stream has been reached.

    If consumeInput is enabled in the constructor for the bunzip2 object, $input will have all compressed data removed from it after uncompression. On BZ_OK return this will mean that $input will be an empty string; when BZ_STREAM_END $input will either be an empty string or will contain whatever data immediately followed the compressed data stream.

    If appendOutput is enabled in the constructor for the bunzip2 object, the uncompressed data will be appended to $output . If not enabled, $output will be truncated before the uncompressed data is written to it.
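
    The effect of consumeInput is easiest to see when something follows the compressed stream. In the sketch below the input text and the "TRAILER" marker are invented for illustration; the constructor arguments follow the positional order given in the signature above. After bzinflate returns BZ_STREAM_END, $input should hold only the bytes that came after the bzip2 data.

```perl
use strict ;
use warnings ;
use Compress::Raw::Bzip2 ;

# Made-up input, compressed first so there is something to uncompress.
my $text = "some data worth keeping\n" x 40 ;

my ($bz, $status) = Compress::Raw::Bzip2->new(1, 1, 0) ;
defined $bz
    or die "Cannot create bzip2 object: $status\n" ;

my $input = '' ;
$bz->bzdeflate($text, $input) == BZ_RUN_OK
    or die "bzdeflate failed\n" ;
$bz->bzclose($input) == BZ_STREAM_END
    or die "bzclose failed\n" ;

# Hypothetical extra bytes that follow the compressed stream.
$input .= "TRAILER" ;

# appendOutput = 1, consumeInput = 1, small = 0, verbosity = 0, limitOutput = 0
my ($bunzip, $ustatus) = Compress::Raw::Bunzip2->new(1, 1, 0, 0, 0) ;
defined $bunzip
    or die "Cannot create bunzip2 object: $ustatus\n" ;

my $output = '' ;
$ustatus = $bunzip->bzinflate($input, $output) ;
$ustatus == BZ_STREAM_END
    or die "bzinflate failed: $ustatus\n" ;

print "left in input buffer: $input\n" ;
```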

    Misc

    my $version = Compress::Raw::Bzip2::bzlibversion();

    Returns the version of the underlying bzip2 library.

    Constants

    The following bzip2 constants are exported by this module

        BZ_RUN
        BZ_FLUSH
        BZ_FINISH

        BZ_OK
        BZ_RUN_OK
        BZ_FLUSH_OK
        BZ_FINISH_OK
        BZ_STREAM_END
        BZ_SEQUENCE_ERROR
        BZ_PARAM_ERROR
        BZ_MEM_ERROR
        BZ_DATA_ERROR
        BZ_DATA_ERROR_MAGIC
        BZ_IO_ERROR
        BZ_UNEXPECTED_EOF
        BZ_OUTBUFF_FULL
        BZ_CONFIG_ERROR

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    The primary site for the bzip2 program is http://www.bzip.org.

    See the module Compress::Bzip2

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    Perl 5 version 18.2 documentation

    Compress::Raw::Zlib

    NAME

    Compress::Raw::Zlib - Low-Level Interface to zlib compression library

    SYNOPSIS

        use Compress::Raw::Zlib ;

        ($d, $status) = new Compress::Raw::Zlib::Deflate( [OPT] ) ;
        $status = $d->deflate($input, $output) ;
        $status = $d->flush($output [, $flush_type]) ;
        $d->deflateReset() ;
        $d->deflateParams(OPTS) ;
        $d->deflateTune(OPTS) ;
        $d->dict_adler() ;
        $d->crc32() ;
        $d->adler32() ;
        $d->total_in() ;
        $d->total_out() ;
        $d->msg() ;
        $d->get_Strategy();
        $d->get_Level();
        $d->get_BufSize();

        ($i, $status) = new Compress::Raw::Zlib::Inflate( [OPT] ) ;
        $status = $i->inflate($input, $output [, $eof]) ;
        $status = $i->inflateSync($input) ;
        $i->inflateReset() ;
        $i->dict_adler() ;
        $i->crc32() ;
        $i->adler32() ;
        $i->total_in() ;
        $i->total_out() ;
        $i->msg() ;
        $i->get_BufSize();

        $crc = adler32($buffer [,$crc]) ;
        $crc = crc32($buffer [,$crc]) ;

        $adler = adler32_combine($adler1, $adler2, $len2) ;
        $crc = crc32_combine($crc1, $crc2, $len2) ;

        my $version = Compress::Raw::Zlib::zlib_version();
        my $flags = Compress::Raw::Zlib::zlibCompileFlags();

    DESCRIPTION

    The Compress::Raw::Zlib module provides a Perl interface to the zlib compression library (see SEE ALSO for details about where to get zlib).

    Compress::Raw::Zlib::Deflate

    This section defines an interface that allows in-memory compression using the deflate interface provided by zlib.

    Here is a definition of the interface available:

    ($d, $status) = new Compress::Raw::Zlib::Deflate( [OPT] )

    Initialises a deflation object.

    If you are familiar with the zlib library, it combines the features of the zlib functions deflateInit , deflateInit2 and deflateSetDictionary .

    If successful, it will return the initialised deflation object, $d and a $status of Z_OK in a list context. In scalar context it returns the deflation object, $d , only.

    If not successful, the returned deflation object, $d , will be undef and $status will hold a zlib error code.

    The function optionally takes a number of named options specified as Name => value pairs. This allows individual options to be tailored without having to specify them all in the parameter list.

    For backward compatibility, it is also possible to pass the parameters as a reference to a hash containing the name=>value pairs.

    Below is a list of the valid options:

    • -Level

      Defines the compression level. Valid values are 0 through 9, Z_NO_COMPRESSION , Z_BEST_SPEED , Z_BEST_COMPRESSION , and Z_DEFAULT_COMPRESSION .

      The default is Z_DEFAULT_COMPRESSION .

    • -Method

      Defines the compression method. The only valid value at present (and the default) is Z_DEFLATED .

    • -WindowBits

      To compress an RFC 1950 data stream, set WindowBits to a positive number between 8 and 15.

      To compress an RFC 1951 data stream, set WindowBits to -MAX_WBITS .

      To compress an RFC 1952 data stream (i.e. gzip), set WindowBits to WANT_GZIP .

      For a definition of the meaning and valid values for WindowBits refer to the zlib documentation for deflateInit2.

      Defaults to MAX_WBITS .

    • -MemLevel

      For a definition of the meaning and valid values for MemLevel refer to the zlib documentation for deflateInit2.

      Defaults to MAX_MEM_LEVEL.

    • -Strategy

      Defines the strategy used to tune the compression. The valid values are Z_DEFAULT_STRATEGY , Z_FILTERED , Z_RLE , Z_FIXED and Z_HUFFMAN_ONLY .

      The default is Z_DEFAULT_STRATEGY .

    • -Dictionary

      When a dictionary is specified Compress::Raw::Zlib will automatically call deflateSetDictionary directly after calling deflateInit . The Adler32 value for the dictionary can be obtained by calling the method $d->dict_adler() .

      The default is no dictionary.

    • -Bufsize

      Sets the initial size for the output buffer used by the $d->deflate and $d->flush methods. If the buffer has to be reallocated to increase the size, it will grow in increments of Bufsize .

      The default buffer size is 4096.

    • -AppendOutput

      This option controls how data is written to the output buffer by the $d->deflate and $d->flush methods.

      If the AppendOutput option is set to false, the output buffers in the $d->deflate and $d->flush methods will be truncated before compressed data is written to them.

      If the option is set to true, compressed data will be appended to the output buffer in the $d->deflate and $d->flush methods.

      This option defaults to false.

    • -CRC32

      If set to true, a crc32 checksum of the uncompressed data will be calculated. Use the $d->crc32 method to retrieve this value.

      This option defaults to false.

    • -ADLER32

      If set to true, an adler32 checksum of the uncompressed data will be calculated. Use the $d->adler32 method to retrieve this value.

      This option defaults to false.

    Here is an example of using the Compress::Raw::Zlib::Deflate optional parameter list to override the default buffer size and compression level. All other options will take their default values.

    my $d = new Compress::Raw::Zlib::Deflate ( -Bufsize => 300,
                                               -Level   => Z_BEST_SPEED ) ;

    $status = $d->deflate($input, $output)

    Deflates the contents of $input and writes the compressed data to $output .

    The $input and $output parameters can be either scalars or scalar references.

    When finished, $input will be completely processed (assuming there were no errors). If the deflation was successful it writes the deflated data to $output and returns a status value of Z_OK .

    On error, it returns a zlib error code.

    If the AppendOutput option is set to true in the constructor for the $d object, the compressed data will be appended to $output . If it is false, $output will be truncated before any compressed data is written to it.

    Note: This method will not necessarily write compressed data to $output every time it is called. So do not assume that there has been an error if the contents of $output is empty on returning from this method. As long as the return code from the method is Z_OK , the deflate has succeeded.

    $status = $d->flush($output [, $flush_type])

    Typically used to finish the deflation. Any pending output will be written to $output .

    Returns Z_OK if successful.

    Note that flushing can seriously degrade the compression ratio, so it should only be used to terminate a compression (using Z_FINISH ) or when you want to create a full flush point (using Z_FULL_FLUSH ).

    By default the flush_type used is Z_FINISH . Other valid values for flush_type are Z_NO_FLUSH , Z_PARTIAL_FLUSH , Z_SYNC_FLUSH and Z_FULL_FLUSH . It is strongly recommended that you only set the flush_type parameter if you fully understand the implications of what it does. See the zlib documentation for details.

    If the AppendOutput option is set to true in the constructor for the $d object, the compressed data will be appended to $output . If it is false, $output will be truncated before any compressed data is written to it.
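    The deflate and flush calls described above can be combined into a short in-memory sketch. It assumes AppendOutput is enabled so that both calls accumulate into one buffer; the sample text and variable names are illustrative only.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib;

# AppendOutput => 1 lets deflate and flush append to the same buffer.
my ($d, $status) = Compress::Raw::Zlib::Deflate->new(-AppendOutput => 1);
die "Cannot create a deflation stream\n" unless defined $d;

my $text       = "hello world\n" x 100;   # sample data, purely illustrative
my $compressed = '';

# deflate may hold data internally, so $compressed can still be short here.
$status = $d->deflate($text, $compressed);
die "deflation failed\n" unless $status == Z_OK;

# flush defaults to Z_FINISH and completes the compressed stream.
$status = $d->flush($compressed);
die "flush failed\n" unless $status == Z_OK;

printf "%d bytes in, %d bytes out\n", $d->total_in(), $d->total_out();
```

    Checking the status after each call, rather than the length of the output buffer, follows the note above that deflate does not necessarily write data on every call.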

    $status = $d->deflateReset()

    This method will reset the deflation object $d . It can be used when you are compressing multiple data streams and want to use the same object to compress each of them. It should only be used once the previous data stream has been flushed successfully, i.e. a call to $d->flush($output, Z_FINISH) has returned Z_OK .

    Returns Z_OK if successful.
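    As a rough sketch of reusing one deflation object for several independent streams, the loop below finishes each stream with flush before calling deflateReset. The input strings and the @compressed array are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib;

my $d = Compress::Raw::Zlib::Deflate->new(-AppendOutput => 1)
    or die "Cannot create a deflation stream\n";

my @texts = ("first stream", "second stream");   # illustrative inputs
my @compressed;

for my $text (@texts) {
    my $out = '';
    $d->deflate($text, $out) == Z_OK or die "deflation failed\n";
    # Finish the current stream before resetting the object.
    $d->flush($out) == Z_OK or die "flush failed\n";
    push @compressed, $out;
    $d->deflateReset() == Z_OK or die "reset failed\n";
}

printf "compressed %d separate streams\n", scalar @compressed;
```

    Each element of @compressed is then a complete stream that can be inflated on its own.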

    $status = $d->deflateParams([OPT])

    Change settings for the deflate object $d .

    The list of the valid options is shown below. Options not specified will remain unchanged.

    • -Level

      Defines the compression level. Valid values are 0 through 9, Z_NO_COMPRESSION , Z_BEST_SPEED , Z_BEST_COMPRESSION , and Z_DEFAULT_COMPRESSION .

    • -Strategy

      Defines the strategy used to tune the compression. The valid values are Z_DEFAULT_STRATEGY , Z_FILTERED and Z_HUFFMAN_ONLY .

    • -BufSize

      Sets the initial size for the output buffer used by the $d->deflate and $d->flush methods. If the buffer has to be reallocated to increase the size, it will grow in increments of Bufsize .

    $status = $d->deflateTune($good_length, $max_lazy, $nice_length, $max_chain)

    Tune the internal settings for the deflate object $d . This option is only available if you are running zlib 1.2.2.3 or better.

    Refer to the documentation in zlib.h for instructions on how to use deflateTune .

    $d->dict_adler()

    Returns the adler32 value for the dictionary.

    $d->crc32()

    Returns the crc32 value for the uncompressed data to date.

    If the CRC32 option is not enabled in the constructor for this object, this method will always return 0.

    $d->adler32()

    Returns the adler32 value for the uncompressed data to date.

    $d->msg()

    Returns the last error message generated by zlib.

    $d->total_in()

    Returns the total number of uncompressed bytes input to deflate.

    $d->total_out()

    Returns the total number of compressed bytes output from deflate.

    $d->get_Strategy()

    Returns the deflation strategy currently used. Valid values are Z_DEFAULT_STRATEGY , Z_FILTERED and Z_HUFFMAN_ONLY .

    $d->get_Level()

    Returns the compression level being used.

    $d->get_BufSize()

    Returns the buffer size used to carry out the compression.

    Example

    Here is a trivial example of using deflate . It simply reads standard input, deflates it and writes it to standard output.

    use strict ;
    use warnings ;
    use Compress::Raw::Zlib ;

    binmode STDIN;
    binmode STDOUT;
    my $x = new Compress::Raw::Zlib::Deflate
        or die "Cannot create a deflation stream\n" ;

    my ($output, $status) ;
    while (<>)
    {
        $status = $x->deflate($_, $output) ;
        $status == Z_OK
            or die "deflation failed\n" ;
        print $output ;
    }

    $status = $x->flush($output) ;
    $status == Z_OK
        or die "deflation failed\n" ;
    print $output ;

    Compress::Raw::Zlib::Inflate

    This section defines an interface that allows in-memory uncompression using the inflate interface provided by zlib.

    Here is a definition of the interface:

    ($i, $status) = new Compress::Raw::Zlib::Inflate( [OPT] )

    Initialises an inflation object.

    In a list context it returns the inflation object, $i , and the zlib status code ($status ). In a scalar context it returns the inflation object only.

    If successful, $i will hold the inflation object and $status will be Z_OK .

    If not successful, $i will be undef and $status will hold the zlib error code.

    The function optionally takes a number of named options specified as -Name => value pairs. This allows individual options to be tailored without having to specify them all in the parameter list.

    For backward compatibility, it is also possible to pass the parameters as a reference to a hash containing the name=>value pairs.

    Here is a list of the valid options:

    • -WindowBits

      To uncompress an RFC 1950 data stream, set WindowBits to a positive number between 8 and 15.

      To uncompress an RFC 1951 data stream, set WindowBits to -MAX_WBITS .

      To uncompress an RFC 1952 data stream (i.e. gzip), set WindowBits to WANT_GZIP .

      To auto-detect and uncompress an RFC 1950 or RFC 1952 data stream (i.e. gzip), set WindowBits to WANT_GZIP_OR_ZLIB .

      For a full definition of the meaning and valid values for WindowBits refer to the zlib documentation for inflateInit2.

      Defaults to MAX_WBITS .

    • -Bufsize

      Sets the initial size for the output buffer used by the $i->inflate method. If the output buffer in this method has to be reallocated to increase the size, it will grow in increments of Bufsize .

      Default is 4096.

    • -Dictionary

      The default is no dictionary.

    • -AppendOutput

      This option controls how data is written to the output buffer by the $i->inflate method.

      If the option is set to false, the output buffer in the $i->inflate method will be truncated before uncompressed data is written to it.

      If the option is set to true, uncompressed data will be appended to the output buffer by the $i->inflate method.

      This option defaults to false.

    • -CRC32

      If set to true, a crc32 checksum of the uncompressed data will be calculated. Use the $i->crc32 method to retrieve this value.

      This option defaults to false.

    • -ADLER32

      If set to true, an adler32 checksum of the uncompressed data will be calculated. Use the $i->adler32 method to retrieve this value.

      This option defaults to false.

    • -ConsumeInput

      If set to true, this option will remove compressed data from the input buffer of the $i->inflate method as the inflate progresses.

      This option can be useful when you are processing compressed data that is embedded in another file/buffer. In this case the data that immediately follows the compressed stream will be left in the input buffer.

      This option defaults to true.

    • -LimitOutput

      The LimitOutput option changes the behavior of the $i->inflate method so that the amount of memory used by the output buffer can be limited.

      When LimitOutput is used the size of the output buffer used will either be the value of the Bufsize option or the amount of memory already allocated to $output , whichever is larger. Predicting the output size available is tricky, so don't rely on getting an exact output buffer size.

      When LimitOutput is not specified $i->inflate will use as much memory as it takes to write all the uncompressed data it creates by uncompressing the input buffer.

      If LimitOutput is enabled, the ConsumeInput option will also be enabled.

      This option defaults to false.

      See The LimitOutput option for a discussion on why LimitOutput is needed and how to use it.

    Here is an example of using an optional parameter to override the default buffer size.

    my ($i, $status) = new Compress::Raw::Zlib::Inflate( -Bufsize => 300 ) ;
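    The WindowBits values listed above can be tried with a small round trip. This sketch writes a gzip (RFC 1952) stream with WANT_GZIP and reads it back with WANT_GZIP_OR_ZLIB; the sample text and variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib;

my $text = "some sample text";   # illustrative input

# Produce a gzip stream in memory.
my $d = Compress::Raw::Zlib::Deflate->new(-WindowBits   => WANT_GZIP,
                                          -AppendOutput => 1)
    or die "Cannot create a deflation stream\n";
my $gz = '';
$d->deflate($text, $gz) == Z_OK or die "deflation failed\n";
$d->flush($gz)          == Z_OK or die "flush failed\n";

# WANT_GZIP_OR_ZLIB lets inflate detect the container format itself.
my $i = Compress::Raw::Zlib::Inflate->new(-WindowBits => WANT_GZIP_OR_ZLIB)
    or die "Cannot create an inflation stream\n";
my $plain  = '';
my $status = $i->inflate($gz, $plain);
die "inflation failed\n" unless $status == Z_STREAM_END;

print "$plain\n";
```

    Because ConsumeInput defaults to true, $gz is emptied by the inflate call; copy it first if the compressed bytes are still needed.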

    $status = $i->inflate($input, $output [,$eof])

    Inflates the complete contents of $input and writes the uncompressed data to $output . The $input and $output parameters can either be scalars or scalar references.

    Returns Z_OK if successful and Z_STREAM_END if the end of the compressed data has been successfully reached.

    If not successful $status will hold the zlib error code.

    If the ConsumeInput option has been set to true when the Compress::Raw::Zlib::Inflate object is created, the $input parameter is modified by inflate . On completion it will contain what remains of the input buffer after inflation. In practice, this means that when the return status is Z_OK the $input parameter will contain an empty string, and when the return status is Z_STREAM_END the $input parameter will contain what (if anything) was stored in the input buffer after the deflated data stream.

    This feature is useful when processing a file format that encapsulates a compressed data stream (e.g. gzip, zip) and there is useful data immediately after the deflation stream.
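    A minimal sketch of this behaviour, assuming the default ConsumeInput setting: a compressed stream is followed by a hypothetical "TRAILER" marker, and after inflate returns Z_STREAM_END only the marker is left in the input buffer.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib;

my $payload = "compressed part";   # illustrative data

# Build a buffer holding a compressed stream plus trailing bytes.
my $d = Compress::Raw::Zlib::Deflate->new(-AppendOutput => 1)
    or die "Cannot create a deflation stream\n";
my $buffer = '';
$d->deflate($payload, $buffer) == Z_OK or die "deflation failed\n";
$d->flush($buffer)             == Z_OK or die "flush failed\n";
$buffer .= "TRAILER";              # hypothetical data after the stream

# ConsumeInput defaults to true, so inflate removes what it has used.
my $i = Compress::Raw::Zlib::Inflate->new()
    or die "Cannot create an inflation stream\n";
my $out    = '';
my $status = $i->inflate($buffer, $out);
die "inflation failed\n" unless $status == Z_STREAM_END;

print "payload:   $out\n";
print "left over: $buffer\n";
```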

    If the AppendOutput option is set to true in the constructor for this object, the uncompressed data will be appended to $output . If it is false, $output will be truncated before any uncompressed data is written to it.

    The $eof parameter needs a bit of explanation.

    Prior to version 1.2.0, zlib assumed that there was at least one trailing byte immediately after the compressed data stream when it was carrying out decompression. This normally isn't a problem because the majority of zlib applications guarantee that there will be data directly after the compressed data stream. For example, both gzip (RFC 1952) and zip define trailing data that follows the compressed data stream.

    The $eof parameter only needs to be used if all of the following conditions apply

    1

    You are either using a copy of zlib that is older than version 1.2.0 or you want your application code to be able to run with as many different versions of zlib as possible.

    2

    You have set the WindowBits parameter to -MAX_WBITS in the constructor for this object, i.e. you are uncompressing a raw deflated data stream (RFC 1951).

    3

    There is no data immediately after the compressed data stream.

    If all of these are the case, then you need to set the $eof parameter to true on the final call (and only the final call) to $i->inflate .

    If you have built this module with zlib >= 1.2.0, the $eof parameter is ignored. You can still set it if you want, but it won't be used behind the scenes.

    $status = $i->inflateSync($input)

    This method can be used to attempt to recover good data from a compressed data stream that is partially corrupt. It scans $input until it reaches either a full flush point or the end of the buffer.

    If a full flush point is found, Z_OK is returned and $input will have all data up to the flush point removed. This data can then be passed to the $i->inflate method to be uncompressed.

    Any other return code means that a flush point was not found. If more data is available, inflateSync can be called repeatedly with more compressed data until the flush point is found.

    Note that full flush points are not present by default in compressed data streams. They must have been added explicitly when the data stream was created by calling the $d->flush method of Compress::Raw::Zlib::Deflate with Z_FULL_FLUSH .

    $status = $i->inflateReset()

    This method will reset the inflation object $i . It can be used when you are uncompressing multiple data streams and want to use the same object to uncompress each of them.

    Returns Z_OK if successful.

    $i->dict_adler()

    Returns the adler32 value for the dictionary.

    $i->crc32()

    Returns the crc32 value for the uncompressed data to date.

    If the CRC32 option is not enabled in the constructor for this object, this method will always return 0.

    $i->adler32()

    Returns the adler32 value for the uncompressed data to date.

    If the ADLER32 option is not enabled in the constructor for this object, this method will always return 0.

    $i->msg()

    Returns the last error message generated by zlib.

    $i->total_in()

    Returns the total number of compressed bytes input to inflate.

    $i->total_out()

    Returns the total number of uncompressed bytes output from inflate.

    $i->get_BufSize()

    Returns the buffer size used to carry out the decompression.

    Examples

    Here is an example of using inflate .

    use strict ;
    use warnings ;
    use Compress::Raw::Zlib;

    my $x = new Compress::Raw::Zlib::Inflate()
        or die "Cannot create an inflation stream\n" ;

    my $input = '' ;
    binmode STDIN;
    binmode STDOUT;

    my ($output, $status) ;
    while (read(STDIN, $input, 4096))
    {
        $status = $x->inflate($input, $output) ;
        print $output ;
        last if $status != Z_OK ;
    }

    die "inflation failed\n"
        unless $status == Z_STREAM_END ;

    The next example shows how to use the LimitOutput option. Notice the use of two nested loops in this case. The outer loop reads the data from the input source, STDIN, and the inner loop repeatedly calls inflate until $input is exhausted, we get an error, or the end of the stream is reached. One point worth remembering is that using the LimitOutput option also sets ConsumeInput, which makes the code below much simpler.

    use strict ;
    use warnings ;
    use Compress::Raw::Zlib;

    my $x = new Compress::Raw::Zlib::Inflate(LimitOutput => 1)
        or die "Cannot create an inflation stream\n" ;

    my $input = '' ;
    binmode STDIN;
    binmode STDOUT;

    my ($output, $status) ;

    OUTER:
    while (read(STDIN, $input, 4096))
    {
        do
        {
            $status = $x->inflate($input, $output) ;
            print $output ;
            last OUTER
                unless $status == Z_OK || $status == Z_BUF_ERROR ;
        }
        while ($status == Z_OK && length $input);
    }

    die "inflation failed\n"
        unless $status == Z_STREAM_END ;

    CHECKSUM FUNCTIONS

    Two functions are provided by zlib to calculate checksums. For the Perl interface, the order of the two parameters in both functions has been reversed. This allows both running checksums and one off calculations to be done.

    $crc = adler32($buffer [,$crc]) ;
    $crc = crc32($buffer [,$crc]) ;

    The buffer parameters can either be a scalar or a scalar reference.

    If the $crc parameter is undef, the crc value will be reset.

    If you have built this module with zlib 1.2.3 or better, two more CRC-related functions are available.

    $adler = adler32_combine($adler1, $adler2, $len2);
    $crc = crc32_combine($crc1, $crc2, $len2);

    These functions allow checksums to be merged.
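    The sketch below compares a one-off crc32 with a running checksum built from two pieces, and with the merged result from crc32_combine. It assumes the module was built against zlib 1.2.3 or better; crc32_combine is called by its fully qualified name because this example does not assume it is exported by default. The strings are illustrative.

```perl
use strict;
use warnings;
use Compress::Raw::Zlib;

my ($part1, $part2) = ("hello ", "world");   # illustrative data

# One-off checksum over the whole buffer.
my $whole = crc32($part1 . $part2);

# Running checksum: pass the previous value as the second argument.
my $running = crc32($part1);
$running = crc32($part2, $running);

# Merge two independently calculated checksums.
my $combined = Compress::Raw::Zlib::crc32_combine(
    crc32($part1), crc32($part2), length $part2);

printf "whole=%08x running=%08x combined=%08x\n",
    $whole, $running, $combined;
```

    All three values should be identical; the combine form is useful when the pieces were checksummed separately, for example in different processes.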

    Misc

    my $version = Compress::Raw::Zlib::zlib_version();

    Returns the version of the zlib library.

    my $flags = Compress::Raw::Zlib::zlibCompileFlags();

    Returns the flags indicating compile-time options that were used to build the zlib library. See the zlib documentation for a description of the flags returned by zlibCompileFlags .

    Note that when the zlib sources are built along with this module the sprintf flags (bits 24, 25 and 26) should be ignored.

    If you are using zlib 1.2.0 or older, zlibCompileFlags will return 0.

    The LimitOutput option.

    By default $i->inflate($input, $output) will uncompress all data in $input and write all of the uncompressed data it has generated to $output . This makes the interface to inflate much simpler - if the method has uncompressed $input successfully all compressed data in $input will have been dealt with. So if you are reading from an input source and uncompressing as you go the code will look something like this

    use strict ;
    use warnings ;
    use Compress::Raw::Zlib;

    my $x = new Compress::Raw::Zlib::Inflate()
        or die "Cannot create an inflation stream\n" ;

    my $input = '' ;

    my ($output, $status) ;
    while (read(STDIN, $input, 4096))
    {
        $status = $x->inflate($input, $output) ;
        print $output ;
        last if $status != Z_OK ;
    }

    die "inflation failed\n"
        unless $status == Z_STREAM_END ;

    The points to note are

    • The main processing loop in the code handles reading of compressed data from STDIN.

    • The status code returned from inflate will only trigger termination of the main processing loop if it isn't Z_OK . When LimitOutput has not been used, any status other than Z_OK means that either the end of the compressed data stream has been reached or there has been an error in uncompression.

    • After the call to inflate all of the compressed data in $input will have been processed. This means the subsequent call to read can overwrite its contents without any problem.

    For most use-cases the behavior described above is acceptable (this module and its predecessor, Compress::Zlib , have used it for over 10 years without an issue), but in a few very specific use-cases the amount of memory required for $output can be prohibitively large. For example, if the compressed data stream contains the same pattern repeated thousands of times, a relatively small compressed data stream can uncompress into hundreds of megabytes. Remember inflate will keep allocating memory until all the uncompressed data has been written to the output buffer - the size of $output is unbounded.

    The LimitOutput option is designed to help with this use-case.

    The main difference in your code when using LimitOutput is having to deal with cases where the $input parameter still contains some compressed data that inflate hasn't processed yet. The status code returned from inflate will be Z_OK if uncompression took place and Z_BUF_ERROR if the output buffer is full.

    Below is typical code that shows how to use LimitOutput .

    use strict ;
    use warnings ;
    use Compress::Raw::Zlib;

    my $x = new Compress::Raw::Zlib::Inflate(LimitOutput => 1)
        or die "Cannot create an inflation stream\n" ;

    my $input = '' ;
    binmode STDIN;
    binmode STDOUT;

    my ($output, $status) ;

    OUTER:
    while (read(STDIN, $input, 4096))
    {
        do
        {
            $status = $x->inflate($input, $output) ;
            print $output ;
            last OUTER
                unless $status == Z_OK || $status == Z_BUF_ERROR ;
        }
        while ($status == Z_OK && length $input);
    }

    die "inflation failed\n"
        unless $status == Z_STREAM_END ;

    Points to note this time:

    • There are now two nested loops in the code: the outer loop for reading the compressed data from STDIN, as before; and the inner loop to carry out the uncompression.

    • There are two exit points from the inner uncompression loop.

      Firstly when inflate has returned a status other than Z_OK or Z_BUF_ERROR . This means that either the end of the compressed data stream has been reached (Z_STREAM_END ) or there is an error in the compressed data. In either of these cases there is no point in continuing with reading the compressed data, so both loops are terminated.

      The second exit point tests if there is any data left in the input buffer, $input - remember that the ConsumeInput option is automatically enabled when LimitOutput is used. When the input buffer has been exhausted, the outer loop can run again and overwrite a now empty $input .

    ACCESSING ZIP FILES

    Although it is possible (with some effort on your part) to use this module to access .zip files, there are other perl modules available that will do all the hard work for you. Check out Archive::Zip , Archive::Zip::SimpleZip , IO::Compress::Zip and IO::Uncompress::Unzip .

    FAQ

    Compatibility with Unix compress/uncompress.

    This module is not compatible with Unix compress .

    If you have the uncompress program available, you can use this to read compressed files

    open F, "uncompress -c $filename |";
    while (<F>)
    {
        ...

    Alternatively, if you have the gunzip program available, you can use this to read compressed files

    open F, "gunzip -c $filename |";
    while (<F>)
    {
        ...

    and this to write compressed files, if you have the compress program available

    open F, "| compress -c >$filename";
    print F "data";
    ...
    close F ;

    Accessing .tar.Z files

    See previous FAQ item.

    If the Archive::Tar module is installed and either the uncompress or gunzip programs are available, you can use one of these workarounds to read .tar.Z files.

    Firstly with uncompress

    use strict;
    use warnings;
    use Archive::Tar;

    open F, "uncompress -c $filename |";
    my $tar = Archive::Tar->new(*F);
    ...

    and this with gunzip

    use strict;
    use warnings;
    use Archive::Tar;

    open F, "gunzip -c $filename |";
    my $tar = Archive::Tar->new(*F);
    ...

    Similarly, if the compress program is available, you can use this to write a .tar.Z file

    use strict;
    use warnings;
    use Archive::Tar;
    use IO::File;

    my $fh = new IO::File "| compress -c >$filename";
    my $tar = Archive::Tar->new();
    ...
    $tar->write($fh);
    $fh->close ;

    Zlib Library Version Support

    By default Compress::Raw::Zlib will build with a private copy of version 1.2.5 of the zlib library. (See the README file for details of how to override this behaviour)

    If you decide to use a different version of the zlib library, you need to be aware of the following issues

    • First off, you must have zlib 1.0.5 or better.

    • You need to have zlib 1.2.1 or better if you want to use the -Merge option with IO::Compress::Gzip , IO::Compress::Deflate and IO::Compress::RawDeflate .

    CONSTANTS

    All the zlib constants are automatically imported when you make use of Compress::Raw::Zlib.

    SEE ALSO

    Compress::Zlib, IO::Compress::Gzip, IO::Uncompress::Gunzip, IO::Compress::Deflate, IO::Uncompress::Inflate, IO::Compress::RawDeflate, IO::Uncompress::RawInflate, IO::Compress::Bzip2, IO::Uncompress::Bunzip2, IO::Compress::Lzma, IO::Uncompress::UnLzma, IO::Compress::Xz, IO::Uncompress::UnXz, IO::Compress::Lzop, IO::Uncompress::UnLzop, IO::Compress::Lzf, IO::Uncompress::UnLzf, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress

    IO::Compress::FAQ

    File::GlobMapper, Archive::Zip, Archive::Tar, IO::Zlib

    For RFC 1950, 1951 and 1952 see http://www.faqs.org/rfcs/rfc1950.html, http://www.faqs.org/rfcs/rfc1951.html and http://www.faqs.org/rfcs/rfc1952.html

    The zlib compression library was written by Jean-loup Gailly gzip@prep.ai.mit.edu and Mark Adler madler@alumni.caltech.edu.

    The primary site for the zlib compression library is http://www.zlib.org.

    The primary site for gzip is http://www.gzip.org.

    AUTHOR

    This module was written by Paul Marquess, pmqs@cpan.org.

    MODIFICATION HISTORY

    See the Changes file.

    COPYRIGHT AND LICENSE

    Copyright (c) 2005-2013 Paul Marquess. All rights reserved.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     

    Class::Struct

    NAME

    Class::Struct - declare struct-like datatypes as Perl classes

    SYNOPSIS

    use Class::Struct;

    # declare struct, based on array:
    struct( CLASS_NAME => [ ELEMENT_NAME => ELEMENT_TYPE, ... ]);

    # declare struct, based on hash:
    struct( CLASS_NAME => { ELEMENT_NAME => ELEMENT_TYPE, ... });

    package CLASS_NAME;
    use Class::Struct;

    # declare struct, based on array, implicit class name:
    struct( ELEMENT_NAME => ELEMENT_TYPE, ... );

    # declare struct at compile time:
    use Class::Struct CLASS_NAME => [ ELEMENT_NAME => ELEMENT_TYPE, ... ];
    use Class::Struct CLASS_NAME => { ELEMENT_NAME => ELEMENT_TYPE, ... };

    # declare struct at compile time, based on array, implicit class name:
    package CLASS_NAME;
    use Class::Struct ELEMENT_NAME => ELEMENT_TYPE, ... ;

    package Myobj;
    use Class::Struct;

    # declare struct with four types of elements:
    struct( s => '$', a => '@', h => '%', c => 'My_Other_Class' );

    $obj = new Myobj;                       # constructor

    # scalar type accessor:
    $element_value = $obj->s;               # element value
    $obj->s('new value');                   # assign to element

    # array type accessor:
    $ary_ref = $obj->a;                     # reference to whole array
    $ary_element_value = $obj->a(2);        # array element value
    $obj->a(2, 'new value');                # assign to array element

    # hash type accessor:
    $hash_ref = $obj->h;                    # reference to whole hash
    $hash_element_value = $obj->h('x');     # hash element value
    $obj->h('x', 'new value');              # assign to hash element

    # class type accessor:
    $element_value = $obj->c;               # object reference
    $obj->c->method(...);                   # call method of object
    $obj->c(new My_Other_Class);            # assign a new object

    DESCRIPTION

    Class::Struct exports a single function, struct . Given a list of element names and types, and optionally a class name, struct creates a Perl 5 class that implements a "struct-like" data structure.

    The new class is given a constructor method, new , for creating struct objects.

    Each element in the struct data has an accessor method, which is used to assign to the element and to fetch its value. The default accessor can be overridden by declaring a sub of the same name in the package. (See Example 2.)

    Each element's type can be scalar, array, hash, or class.

    The struct() function

    The struct function has three forms of parameter-list.

    struct( CLASS_NAME => [ ELEMENT_LIST ]);
    struct( CLASS_NAME => { ELEMENT_LIST });
    struct( ELEMENT_LIST );

    The first and second forms explicitly identify the name of the class being created. The third form assumes the current package name as the class name.

    An object of a class created by the first and third forms is based on an array, whereas an object of a class created by the second form is based on a hash. The array-based forms will be somewhat faster and smaller; the hash-based forms are more flexible.

    The class created by struct must not be a subclass of another class other than UNIVERSAL.

    It can, however, be used as a superclass for other classes. To facilitate this, the generated constructor method uses a two-argument blessing. Furthermore, if the class is hash-based, the key of each element is prefixed with the class name (see Perl Cookbook, Recipe 13.12).
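
A short sketch of using a generated class as a superclass may help here. The names Animal, Dog and speak are hypothetical. The example relies only on the two-argument blessing and the class-name key prefix described above.

```perl
use strict;
use warnings;
use Class::Struct;

# Hash-based struct used as a base class (names are illustrative).
struct( Animal => { name => '$' } );

package Dog;
our @ISA = ('Animal');    # inherit the generated new() and name()

sub speak {
    my $self = shift;
    return $self->name . ' says woof';
}

package main;

my $dog = Dog->new( name => 'Rex' );

print ref($dog), "\n";                     # Dog, not Animal
print $dog->speak, "\n";                   # Rex says woof
print join( ',', sort keys %$dog ), "\n";  # Animal::name
```

Because the key carries the defining class's name, a subclass that stores its own name entry in the same hash would not collide with the inherited element.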

    A function named new must not be explicitly defined in a class created by struct.

    The ELEMENT_LIST has the form

        NAME => TYPE, ...

    Each name-type pair declares one element of the struct. Each element name will be defined as an accessor method unless a method by that name is explicitly defined; in the latter case, a warning is issued if the warning flag (-w) is set.

    Class Creation at Compile Time

    Class::Struct can create your class at compile time. The main reason for doing this is so that your class acts like every other class in Perl. Creating your class at compile time makes the order of events similar to using any other class (or Perl module).

    There is no significant speed difference between compile-time and run-time class creation; there is just a new, more standard order of events.
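
The sketch below uses the compile-time form, in which the class name and element list are passed on the use line. The class name Point and the variable $ready_at_compile_time are illustrative assumptions. The BEGIN block only demonstrates that the constructor already exists while the rest of the file is still being compiled.

```perl
use strict;
use warnings;

# Compile-time form: the class exists as soon as this line is compiled.
use Class::Struct Point => { x => '$', y => '$' };

# A BEGIN block also runs at compile time, after the use line above.
our $ready_at_compile_time;
BEGIN { $ready_at_compile_time = Point->can('new') ? 1 : 0 }

my $p = Point->new( x => 3, y => 4 );
print "ready at compile time: $ready_at_compile_time\n";
print "point: ", $p->x, ",", $p->y, "\n";
```

With a run-time struct( Point => ... ) call instead, the same BEGIN block would run before the class had been created.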

    Element Types and Accessor Methods

    The four element types -- scalar, array, hash, and class -- are represented by strings -- '$', '@', '%', and a class name -- optionally preceded by a '*'.

    The accessor method provided by struct for an element depends on the declared type of the element.

    • Scalar ('$' or '*$' )

      The element is a scalar, and by default is initialized to undef (but see Initializing with new).

      The accessor's argument, if any, is assigned to the element.

      If the element type is '$' , the value of the element (after assignment) is returned. If the element type is '*$' , a reference to the element is returned.

    • Array ('@' or '*@' )

      The element is an array, initialized by default to () .

      With no argument, the accessor returns a reference to the element's whole array (whether or not the element was specified as '@' or '*@' ).

      With one or two arguments, the first argument is an index specifying one element of the array; the second argument, if present, is assigned to the array element. If the element type is '@' , the accessor returns the array element value. If the element type is '*@' , a reference to the array element is returned.

      As a special case, when the accessor is called with an array reference as the sole argument, this causes an assignment of the whole array element. The object reference is returned.

    • Hash ('%' or '*%' )

      The element is a hash, initialized by default to () .

      With no argument, the accessor returns a reference to the element's whole hash (whether or not the element was specified as '%' or '*%' ).

      With one or two arguments, the first argument is a key specifying one element of the hash; the second argument, if present, is assigned to the hash element. If the element type is '%' , the accessor returns the hash element value. If the element type is '*%' , a reference to the hash element is returned.

      As a special case, when the accessor is called with a hash reference as the sole argument, this causes an assignment of the whole hash element. The object reference is returned.

    • Class ('Class_Name' or '*Class_Name' )

      The element's value must be a reference blessed to the named class or to one of its subclasses. The element is not initialized by default.

      The accessor's argument, if any, is assigned to the element. The accessor will croak if this is not an appropriate object reference.

      If the element type does not start with a '*' , the accessor returns the element value (after assignment). If the element type starts with a '*' , a reference to the element itself is returned.
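
The accessor rules above can be sketched in one small script. The classes Inner and Demo and all element names are hypothetical. The example exercises the scalar and '*$' accessors, whole-array and whole-hash assignment, and the class-type check.

```perl
use strict;
use warnings;
use Class::Struct;

# Illustrative classes; none of these names come from Class::Struct itself.
struct( Inner => { v => '$' } );
struct( Demo  => {
    plain  => '$',       # accessor returns the value
    by_ref => '*$',      # accessor returns a reference to the element
    list   => '@',
    table  => '%',
    inner  => 'Inner',   # must hold an Inner object (or a subclass)
} );

my $d = Demo->new;

$d->plain(10);                    # assign a scalar element
$d->by_ref(1);                    # assign, then ...
my $ref = $d->by_ref;             # ... fetch a reference to the element
$$ref = 99;                       # write through the reference

$d->list( [ 'a', 'b', 'c' ] );    # sole array ref replaces the whole array
$d->list( 1, 'B' );               # assign a single array element
$d->table( { k => 'v' } );        # sole hash ref replaces the whole hash

# The class-type accessor croaks when given something that is not an Inner.
my $rejected = !eval { $d->inner('not an object'); 1 };
$d->inner( Inner->new( v => 5 ) );

print "plain=", $d->plain, " by_ref=", ${ $d->by_ref }, "\n";
print "list=@{ $d->list } k=", $d->table('k'), "\n";
print "inner v=", $d->inner->v, " wrong class rejected=", ( $rejected ? 'yes' : 'no' ), "\n";
```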

    Initializing with new

    struct always creates a constructor called new . That constructor may take a list of initializers for the various elements of the new struct.

    Each initializer is a pair of values: element name => value. The initializer value for a scalar element is just a scalar value. The initializer for an array element is an array reference. The initializer for a hash is a hash reference.

    The initializer for a class element is an object of the corresponding class, or of one of its subclasses, or a reference to a hash containing named arguments to be passed to the element's constructor.

    See Example 3 below for an example of initialization.

    EXAMPLES

    • Example 1

      Giving a struct element a class type that is also a struct is how structs are nested. Here, Timeval represents a time (seconds and microseconds), and Rusage has two elements, each of which is of type Timeval .

          use Class::Struct;

          struct( Rusage => {
              ru_utime => 'Timeval',  # user time used
              ru_stime => 'Timeval',  # system time used
          });

          struct( Timeval => [
              tv_secs  => '$',        # seconds
              tv_usecs => '$',        # microseconds
          ]);

          # create an object:
          my $t = Rusage->new(ru_utime=>Timeval->new(), ru_stime=>Timeval->new());

          # $t->ru_utime and $t->ru_stime are objects of type Timeval.
          # set $t->ru_utime to 100.0 sec and $t->ru_stime to 5.0 sec.
          $t->ru_utime->tv_secs(100);
          $t->ru_utime->tv_usecs(0);
          $t->ru_stime->tv_secs(5);
          $t->ru_stime->tv_usecs(0);
    • Example 2

      An accessor function can be redefined in order to provide additional checking of values, etc. Here, we want the count element always to be nonnegative, so we redefine the count accessor accordingly.

          package MyObj;
          use Class::Struct;

          # declare the struct
          struct ( 'MyObj', { count => '$', stuff => '%' } );

          # override the default accessor method for 'count'
          sub count {
              my $self = shift;
              if ( @_ ) {
                  die 'count must be nonnegative' if $_[0] < 0;
                  $self->{'MyObj::count'} = shift;
                  warn "Too many args to count" if @_;
              }
              return $self->{'MyObj::count'};
          }

          package main;
          $x = new MyObj;
          print "\$x->count(5) = ", $x->count(5), "\n";
                                  # prints '$x->count(5) = 5'
          print "\$x->count = ", $x->count, "\n";
                                  # prints '$x->count = 5'
          print "\$x->count(-5) = ", $x->count(-5), "\n";
                                  # dies due to negative argument!
    • Example 3

      The constructor of a generated class can be passed a list of element=>value pairs, with which to initialize the struct. If no initializer is specified for a particular element, its default initialization is performed instead. Initializers for non-existent elements are silently ignored.

      Note that the initializer for a nested class may be specified as an object of that class, or as a reference to a hash of initializers that are passed on to the nested struct's constructor.

          use Class::Struct;

          struct Breed =>
          {
              name  => '$',
              cross => '$',
          };

          struct Cat =>
          [
              name     => '$',
              kittens  => '@',
              markings => '%',
              breed    => 'Breed',
          ];

          my $cat = Cat->new( name     => 'Socks',
                              kittens  => ['Monica', 'Kenneth'],
                              markings => { socks=>1, blaze=>"white" },
                              breed    => Breed->new(name=>'short-hair', cross=>1),
                              # or: breed => {name=>'short-hair', cross=>1},
                            );

          print "Once a cat called ", $cat->name, "\n";
          print "(which was a ", $cat->breed->name, ")\n";
          print "had two kittens: ", join(' and ', @{$cat->kittens}), "\n";

    Author and Modification History

    Modified by Damian Conway, 2001-09-10, v0.62.

        Modified implicit construction of nested objects.
        Now will also take an object ref instead of requiring a hash ref.
        Also default initializes nested object attributes to undef, rather
        than calling object constructor without args.
        Original over-helpfulness was fraught with problems:
            * the class's constructor might not be called 'new'
            * the class might not have a hash-like-arguments constructor
            * the class might not have a no-argument constructor
            * "recursive" data structures didn't work well:
                  package Person;
                  struct { mother => 'Person', father => 'Person'};

    Modified by Casey West, 2000-11-08, v0.59.

        Added the ability for compile time class creation.

    Modified by Damian Conway, 1999-03-05, v0.58.

        Added handling of hash-like arg list to class ctor.
        Changed to two-argument blessing in ctor to support
        derivation from created classes.
        Added classname prefixes to keys in hash-based classes
        (refer to "Perl Cookbook", Recipe 13.12 for rationale).
        Corrected behaviour of accessors for '*@' and '*%' struct
        elements. Package now implements documented behaviour when
        returning a reference to an entire hash or array element.
        Previously these were returned as a reference to a reference
        to the element.

    Renamed to Class::Struct and modified by Jim Miner, 1997-04-02.

        members() function removed.
        Documentation corrected and extended.
        Use of struct() in a subclass prohibited.
        User definition of accessor allowed.
        Treatment of '*' in element types corrected.
        Treatment of classes as element types corrected.
        Class name to struct() made optional.
        Diagnostic checks added.

    Originally Class::Template by Dean Roehrich.

        # Template.pm --- struct/member template builder
        # 12mar95
        # Dean Roehrich
        #
        # changes/bugs fixed since 28nov94 version:
        #  - podified
        # changes/bugs fixed since 21nov94 version:
        #  - Fixed examples.
        # changes/bugs fixed since 02sep94 version:
        #  - Moved to Class::Template.
        # changes/bugs fixed since 20feb94 version:
        #  - Updated to be a more proper module.
        #  - Added "use strict".
        #  - Bug in build_methods, was using @var when @$var needed.
        #  - Now using my() rather than local().
        #
        # Uses perl5 classes to create nested data types.
        # This is offered as one implementation of Tom Christiansen's "structs.pl"
        # idea.
     
    CPANPLUS::Backend

    NAME

    CPANPLUS::Backend - programmer's interface to CPANPLUS

    SYNOPSIS

        my $cb      = CPANPLUS::Backend->new;
        my $conf    = $cb->configure_object;

        my $author  = $cb->author_tree('KANE');
        my $mod     = $cb->module_tree('Some::Module');
        my $mod     = $cb->parse_module( module => 'Some::Module' );

        my @objs    = $cb->search(  type    => TYPE,
                                    allow   => [...] );

        $cb->flush('all');
        $cb->reload_indices;
        $cb->local_mirror;

    DESCRIPTION

    This module provides the programmer's interface to the CPANPLUS libraries.

    ENVIRONMENT

    When CPANPLUS::Backend is loaded, which is necessary for just about every CPANPLUS operation, the environment variable PERL5_CPANPLUS_IS_RUNNING is set to the current process id.

    Additionally, the environment variable PERL5_CPANPLUS_IS_VERSION will be set to the version of CPANPLUS::Backend .

    This information might be useful somehow to spawned processes.
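
As a sketch of how a spawned process (for example a Makefile.PL) might use these variables, the helper below reads them from %ENV. The function name cpanplus_parent and the 'unknown' fallback are assumptions made for illustration; only the two environment variable names come from the text above.

```perl
use strict;
use warnings;

# Hypothetical helper: reports whether this process was spawned by CPANPLUS.
# Returns a hash ref with the parent's pid and version, or nothing.
sub cpanplus_parent {
    my $pid = $ENV{PERL5_CPANPLUS_IS_RUNNING} or return;
    return {
        pid     => $pid,
        version => $ENV{PERL5_CPANPLUS_IS_VERSION} // 'unknown',
    };
}

if ( my $info = cpanplus_parent() ) {
    print "Running under CPANPLUS $info->{version} (pid $info->{pid})\n";
}
else {
    print "Not running under CPANPLUS\n";
}
```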

    METHODS

    $cb = CPANPLUS::Backend->new( [CONFIGURE_OBJ] )

    This method returns a new CPANPLUS::Backend object. This also initialises the config corresponding to this object. You have two choices in this:

    • Provide a valid CPANPLUS::Configure object

      This will be used verbatim.

    • No arguments

      Your default config will be loaded and used.

    new will return a CPANPLUS::Backend object on success and die on failure.

    $href = $cb->module_tree( [@modules_names_list] )

    Returns a reference to the CPANPLUS module tree.

    If you give it any arguments, they will be treated as module names and module_tree will try to look up these module names and return the corresponding module objects instead.

    See CPANPLUS::Module for the operations you can perform on a module object.

    $href = $cb->author_tree( [@author_names_list] )

    Returns a reference to the CPANPLUS author tree.

    If you give it any arguments, they will be treated as author names and author_tree will try to look up these author names and return the corresponding author objects instead.

    See CPANPLUS::Module::Author for the operations you can perform on an author object.

    $conf = $cb->configure_object;

    Returns a copy of the CPANPLUS::Configure object.

    See CPANPLUS::Configure for operations you can perform on a configure object.

    $su = $cb->selfupdate_object;

    Returns a copy of the CPANPLUS::Selfupdate object.

    See the CPANPLUS::Selfupdate manpage for the operations you can perform on the selfupdate object.

    @mods = $cb->search( type => TYPE, allow => AREF, [data => AREF, verbose => BOOL] )

    search enables you to search for either module or author objects, based on their data. The type you can specify is any of the accessors specified in CPANPLUS::Module::Author or CPANPLUS::Module . search will determine by the type you specified whether to search by author object or module object.

    You have to specify an array reference of regular expressions or strings to match against. The rules used for this array ref are the same as in Params::Check , so read that manpage for details.

    The search is an "or" search, meaning that if any of the criteria match, the search is considered to be successful.

    You can specify the result of a previous search as data to limit the new search to these module or author objects, rather than the entire module or author tree. This is how you do "and" searches.
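
The matching rule can be modelled in plain Perl without calling CPANPLUS at all. In this sketch, match_any is a hypothetical helper and the module names are sample data; the real allow rules follow Params::Check and are richer than the string and regex tests shown here. With a real backend object the equivalent calls would pass allow and data to $cb->search as in the signature above.

```perl
use strict;
use warnings;

# Plain-Perl model of the matching rule; this does not call CPANPLUS.
# match_any() is a hypothetical helper, not part of the CPANPLUS API.
sub match_any {
    my ( $allow, $data ) = @_;
    return grep {
        my $name = $_;
        # keep $name when at least one criterion matches it
        grep { ref($_) eq 'Regexp' ? $name =~ $_ : $name eq $_ } @$allow;
    } @$data;
}

my @tree = qw(Text::Wrap Text::Tabs File::Spec File::Temp Text::Balanced);

# "or": a name is kept when any criterion matches.
my @either = match_any( [ qr/^Text::/, 'File::Temp' ], \@tree );

# "and": feed the first result back in as the data for a second search.
my @first = match_any( [ qr/^Text::/ ], \@tree );
my @both  = match_any( [ qr/b/i ], \@first );

print "either: @either\n";
print "both:   @both\n";
```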

    Returns a list of module or author objects on success and false on failure.

    See CPANPLUS::Module for the operations you can perform on a module object. See CPANPLUS::Module::Author for the operations you can perform on an author object.

    $backend_rv = $cb->fetch( modules => \@mods )

    Fetches a list of modules. @mods can be a list of distribution names, module names or module objects--basically anything that parse_module can understand.

    See the equivalent method in CPANPLUS::Module for details on other options you can pass.

    Since this is a multi-module method call, the return value is implemented as a CPANPLUS::Backend::RV object. Please consult that module's documentation on how to interpret the return value.

    $backend_rv = $cb->extract( modules => \@mods )

    Extracts a list of modules. @mods can be a list of distribution names, module names or module objects--basically anything that parse_module can understand.

    See the equivalent method in CPANPLUS::Module for details on other options you can pass.

    Since this is a multi-module method call, the return value is implemented as a CPANPLUS::Backend::RV object. Please consult that module's documentation on how to interpret the return value.

    $backend_rv = $cb->install( modules => \@mods )

    Installs a list of modules. @mods can be a list of distribution names, module names or module objects--basically anything that parse_module can understand.

    See the equivalent method in CPANPLUS::Module for details on other options you can pass.

    Since this is a multi-module method call, the return value is implemented as a CPANPLUS::Backend::RV object. Please consult that module's documentation on how to interpret the return value.

    $backend_rv = $cb->readme( modules => \@mods )

    Fetches the readme for a list of modules. @mods can be a list of distribution names, module names or module objects--basically anything that parse_module can understand.

    See the equivalent method in CPANPLUS::Module for details on other options you can pass.

    Since this is a multi-module method call, the return value is implemented as a CPANPLUS::Backend::RV object. Please consult that module's documentation on how to interpret the return value.

    $backend_rv = $cb->files( modules => \@mods )

    Returns a list of files used by these modules if they are installed. @mods can be a list of distribution names, module names or module objects--basically anything that parse_module can understand.

    See the equivalent method in CPANPLUS::Module for details on other options you can pass.

    Since this is a multi-module method call, the return value is implemented as a CPANPLUS::Backend::RV object. Please consult that module's documentation on how to interpret the return value.

    $backend_rv = $cb->distributions( modules => \@mods )

    Returns a list of module objects representing all releases for this module on success. @mods can be a list of distribution names, module names or module objects, basically anything that parse_module can understand.

    See the equivalent method in CPANPLUS::Module for details on other options you can pass.

    Since this is a multi-module method call, the return value is implemented as a CPANPLUS::Backend::RV object. Please consult that module's documentation on how to interpret the return value.

    $mod_obj = $cb->parse_module( module => $modname|$distname|$modobj|URI|PATH )

    parse_module tries to find a CPANPLUS::Module object that matches your query. Here's a list of examples you could give to parse_module:

    These items would all come up with a CPANPLUS::Module object for Text::Bastardize. The ones marked explicitly as being version 1.06 would give back a CPANPLUS::Module object of that version, even if the version on CPAN is currently higher.

    The last three are examples of PATH resolution. In the first, we supply an absolute path to the unwrapped distribution. In the second the distribution is relative to the current working directory. In the third, we will use the current working directory.

    If parse_module is unable to actually find the module you are looking for in its module tree, but you supplied it with an author, module and version part in a distribution name or URI, it will create a fake CPANPLUS::Module object for you, that you can use just like the real thing.

    See CPANPLUS::Module for the operations you can perform on a module object.

    If even this fancy guessing doesn't enable parse_module to create a fake module object for you to use, it will warn about an error and return false.

    $bool = $cb->reload_indices( [update_source => BOOL, verbose => BOOL] );

    This method reloads the source files.

    If update_source is set to true, this will fetch new source files from your CPAN mirror. Otherwise, reload_indices will do its usual cache checking and only update them if they are out of date.

    By default, update_source will be false.

    The verbose setting defaults to what you have specified in your config file.

    Returns true on success and false on failure.

    $bool = $cb->flush(CACHE_NAME)

    This method allows flushing of caches. There are several things which can be flushed:

    • methods

      The return status of methods which have been attempted, such as different ways of fetching files. It is recommended that automatic flushing be used instead.

    • hosts

      The return status of URIs which have been attempted, such as different hosts for fetching files. It is recommended that automatic flushing be used instead.

    • modules

      Information about modules such as prerequisites and whether installation succeeded, failed, or was not attempted.

    • lib

      This resets PERL5LIB, which is changed to ensure that while installing modules they are in our @INC.

    • load

      This resets the cache of modules we've attempted to load, but failed. This enables you to load them again after a failed load, if they somehow have become available.

    • all

      Flush all of the aforementioned caches.

    Returns true on success and false on failure.

    @mods = $cb->installed()

    Returns a list of module objects of all your installed modules. If an error occurs, it will return false.

    See CPANPLUS::Module for the operations you can perform on a module object.

    $bool = $cb->local_mirror([path => '/dir/to/save/to', index_files => BOOL, force => BOOL, verbose => BOOL] )

    Creates a local mirror of CPAN, containing only the most recent sources, in a location you specify. If you set this location equal to a custom host in your CPANPLUS::Config, you can install from your local mirror.

    It takes the following arguments:

    • path

      The location in which to create the local mirror.

    • index_files

      Enable/disable fetching of index files. You can disable fetching of the index files if you don't plan to use the local mirror as your primary site, or if you'd like up-to-date index files to be fetched from elsewhere.

      Defaults to true.

    • force

      Forces refetching of packages, even if they are there already.

      Defaults to whatever setting you have in your CPANPLUS::Config .

    • verbose

      Prints more messages about what it's doing.

      Defaults to whatever setting you have in your CPANPLUS::Config .

    Returns true on success and false on error.

    $file = $cb->autobundle([path => OUTPUT_PATH, force => BOOL, verbose => BOOL])

    Writes out a snapshot of your current installation in CPAN bundle style. This can then be used to install the same modules for a different perl installation or on a different machine by issuing the following commands:

        ### using the default shell:
        CPAN Terminal> i file://path/to/Snapshot_XXYY.pm

        ### using the API
        $modobj = $cb->parse_module( module => 'file://path/to/Snapshot_XXYY.pm' );
        $modobj->install;

    It will, by default, write to an 'autobundle' directory under your CPANPLUS home directory, but you can override that by supplying a path argument.

    It will return the location of the output file on success and false on failure.

    $bool = $cb->save_state

    Explicit command to save memory state to disk. This can be used to save information to disk about where a module was extracted, the result of make test , etc. This will then be re-loaded into memory when a new session starts.

    The capability of saving state to disk depends on the source engine being used (See CPANPLUS::Config for the option to choose your source engine). The default storage engine supports this option.

    Most users will not need this command, but it can be handy for automated systems, such as when setting up CPAN smoke testers.

    The method will return true if it managed to save the state to disk, or false if it did not.

    CUSTOM MODULE SOURCES

    Besides the sources as provided by the general CPAN mirrors, it's possible to add your own sources list to your CPANPLUS index.

    The methodology behind this works much like Debian's apt-sources .

    The methods below show you how to make use of this functionality. Also note that most of these methods are available through the default shell plugin command /cs, making them available as shortcuts through the shell and via the command line.

    %files = $cb->list_custom_sources

    Returns a mapping of registered custom sources and their local indices as follows:

        /full/path/to/local/index => http://remote/source

    Note that any file starting with a # is ignored.

    $local_index = $cb->add_custom_source( uri => URI, [verbose => BOOL] );

    Adds a URI to your own sources list and mirrors its index. See the documentation on $cb->update_custom_source on how this is done.

    Returns the full path to the local index on success, or false on failure.

    Note that when adding a new URI, the change to the in-memory tree is not saved until you rebuild or save the tree to disk again. You can do this using the $cb->reload_indices method.

    $local_index = $cb->remove_custom_source( uri => URI, [verbose => BOOL] );

    Removes a URI from your own sources list and removes its index.

    To find out what URIs you have as part of your own sources list, use the $cb->list_custom_sources method.

    Returns the full path to the deleted local index file on success, or false on failure.

    $bool = $cb->update_custom_source( [remote => URI] );

    Updates the indexes for all your custom sources. It does this by fetching a file called packages.txt in the root of the custom source's URI. If you provide the remote argument, it will only update the index for that specific URI.

    Here's an example of how custom sources would resolve into index files:

        file:///path/to/sources       => file:///path/to/sources/packages.txt
        http://example.com/sources    => http://example.com/sources/packages.txt
        ftp://example.com/sources     => ftp://example.com/sources/packages.txt
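
The mapping above amounts to appending packages.txt to the source root. The sketch below reproduces it in plain Perl. The function name index_uri is hypothetical, and stripping trailing slashes is an assumption made here rather than documented CPANPLUS behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper that mirrors the mapping shown above.
# Assumption: trailing slashes on the source root are dropped first.
sub index_uri {
    my ($source) = @_;
    ( my $root = $source ) =~ s{/+\z}{};
    return "$root/packages.txt";
}

for my $source (
    'file:///path/to/sources',
    'http://example.com/sources',
    'ftp://example.com/sources',
) {
    printf "%-28s => %s\n", $source, index_uri($source);
}
```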

    The file packages.txt simply holds a list of packages that can be found under the root of the URI. This file can be automatically generated for you when the remote source is a file:// URI. For http://, ftp://, and similar, the administrator of that repository should run the method $cb->write_custom_source_index on the repository to allow remote users to index it.

    For details, see the $cb->write_custom_source_index method below.

    All packages that are added via this mechanism will be attributed to the author with CPANID LOCAL . You can use this id to search for all added packages.

    $file = $cb->write_custom_source_index( path => /path/to/package/root, [to => /path/to/index/file, verbose => BOOL] );

    Writes the index for a custom repository root. Most users will not have to worry about this, but administrators of a repository will need to make sure their indexes are up to date.

    The index will be written to a file called packages.txt in your repository root, which you can specify with the path argument. You can override this location by specifying the to argument, but in normal operation, that should not be required.

    Once the index file is written, users can then add the URI pointing to the repository to their custom list of sources and start using it right away. See the $cb->add_custom_source method for user details.

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

    SEE ALSO

    CPANPLUS::Configure, CPANPLUS::Module, CPANPLUS::Module::Author, CPANPLUS::Selfupdate

     
    CPANPLUS::Config

    NAME

    CPANPLUS::Config - configuration defaults and heuristics for CPANPLUS

    SYNOPSIS

        ### conf object via CPANPLUS::Backend;
        $cb   = CPANPLUS::Backend->new;
        $conf = $cb->configure_object;

        ### or as a standalone object
        $conf = CPANPLUS::Configure->new;

        ### values in 'conf' section
        $verbose = $conf->get_conf( 'verbose' );
        $conf->set_conf( verbose => 1 );

        ### values in 'program' section
        $editor = $conf->get_program( 'editor' );
        $conf->set_program( editor => '/bin/vi' );

    DESCRIPTION

    This module contains defaults and heuristics for configuration information for CPANPLUS. To change any of these values, please see the documentation in CPANPLUS::Configure .

    Below you'll find a list of configuration types and keys, and their meaning.

    CONFIGURATION

    Section 'conf'

    • hosts

      An array ref containing hosts entries to be queried for packages.

      An example entry would look like this:

          {   'scheme' => 'ftp',
              'path'   => '/pub/CPAN/',
              'host'   => 'ftp.cpan.org'
          },
    • allow_build_interactivity

      Boolean flag to indicate whether 'perl Makefile.PL' and similar are run interactively or not. Defaults to 'true'.

    • allow_unknown_prereqs

      Boolean flag to indicate that unresolvable prereqs are acceptable. If true, only warnings will be issued (the behaviour before 0.9114) when a module is unresolvable from any of our sources (CPAN and/or custom_sources). If false, an unresolvable prereq will fail during the prepare stage of distribution installation. Defaults to true.

    • base

      The directory CPANPLUS keeps all its build and state information in. Defaults to ~/.cpanplus. If File::HomeDir is available, that will be used to work out your HOME directory. This may be overridden by setting the PERL5_CPANPLUS_HOME environment variable; see CPANPLUS::Config::HomeEnv for more details.

    • buildflags

      Any flags to be passed to 'perl Build.PL'. See perldoc Module::Build for details. Defaults to an empty string.

    • cpantest

      Boolean flag to indicate whether or not to mail test results of module installations to http://testers.cpan.org. Defaults to 'false'.

    • cpantest_mx

      String holding an explicit mailserver to use when sending out emails for http://testers.cpan.org. An empty string will use your system settings. Defaults to an empty string.

    • debug

      Boolean flag to enable or disable extensive debugging information. Defaults to 'false'.

    • dist_type

      Default distribution type to use when building packages. See cpan2dist or CPANPLUS::Dist for details. An empty string will not use any package building software. Defaults to an empty string.

    • email

      Email address to use for anonymous ftp access and as from address when sending emails. Defaults to an example.com address.

    • enable_custom_sources

      Boolean flag indicating whether custom sources should be enabled or not. See the CUSTOM MODULE SOURCES in CPANPLUS::Backend for details on how to use them.

      Defaults to true

    • extractdir

      String containing the directory where fetched archives should be extracted. An empty string will use a directory under your base directory. Defaults to an empty string.

    • fetchdir

      String containing the directory where fetched archives should be stored. An empty string will use a directory under your base directory. Defaults to an empty string.

    • flush

      Boolean indicating whether build failures, cache dirs etc should be flushed after every operation or not. Defaults to 'true'.

    • force

      Boolean indicating whether files should be forcefully overwritten if they exist, modules should be installed when they fail tests, etc. Defaults to 'false'.

    • histfile

      A string containing the history filename of the CPANPLUS readline instance.

    • lib

      An array ref holding directories to be added to @INC when CPANPLUS starts up. Defaults to an empty array reference.

    • makeflags

      A string holding flags that will be passed to the make program when invoked. Defaults to an empty string.

    • makemakerflags

      A string holding flags that will be passed to perl Makefile.PL when invoked. Defaults to an empty string.

    • md5

      A boolean indicating whether or not sha256 checks should be done when an archive is fetched. Defaults to 'true' if you have Digest::SHA installed, 'false' otherwise.

    • no_update

      A boolean indicating whether or not CPANPLUS' source files should be updated. Defaults to 'false'.

    • passive

      A boolean indicating whether or not to use passive ftp connections. Defaults to 'true'.

    • prefer_bin

      A boolean indicating whether or not to prefer command line programs over perl modules. Defaults to 'false' unless you do not have Compress::Zlib installed (as that would mean we could not extract .tar.gz files).

    • prefer_makefile

      A boolean indicating whether or not to prefer a Makefile.PL over a Build.PL file if both are present. Defaults to 'true', unless the perl version is at least 5.10.1 or appropriate versions of Module::Build and CPANPLUS::Dist::Build are available.

    • prereqs

      A digit indicating what to do when a package you are installing has a prerequisite. Options are:

        0   Do not install
        1   Install
        2   Ask
        3   Ignore (dangerous, install will probably fail!)

      The default is to ask.

    • shell

      A string holding the shell class you wish to start up when starting CPANPLUS in interactive mode.

      Defaults to CPANPLUS::Shell::Default, the default CPANPLUS shell.

    • show_startup_tip

      A boolean indicating whether or not to show start up tips in the interactive shell. Defaults to 'true'.

    • signature

      A boolean indicating whether or not to check signatures if packages are signed. Defaults to 'true' if you have gpg or Crypt::OpenPGP installed, 'false' otherwise.

    • skiptest

      A boolean indicating whether or not to skip tests when installing modules. Defaults to 'false'.

    • storable

      A boolean indicating whether or not to use Storable to write compiled source file information to disk. This makes for faster startup and look up times, but takes extra diskspace. Defaults to 'true' if you have Storable installed and 'false' if you don't.

    • timeout

      Digit indicating the time before a fetch request times out (in seconds). Defaults to 300.

    • verbose

      A boolean indicating whether or not CPANPLUS runs in verbose mode. Defaults to 'true' if you have the environment variable PERL5_CPANPLUS_VERBOSE set to true, 'false' otherwise.

      It is recommended you run with verbose enabled, but it is disabled for historical reasons.

    • write_install_log

      A boolean indicating whether or not to write install logs after installing a module using the interactive shell. Defaults to 'true'.

    • source_engine

      Class to use as the source engine, which is generally a subclass of CPANPLUS::Internals::Source. Defaults to CPANPLUS::Internals::Source::Memory.

    • cpantest_reporter_args

      A hashref of key => value pairs that are passed to the constructor of Test::Reporter. If you want to enable TLS, for example, you would set it to:

        {   transport       => 'Net::SMTP::TLS',
            transport_args  => [ User => 'Joe', Password => '123' ],
        }

    Section 'program'

    • editor

      A string holding the path to your editor of choice. Defaults to your $ENV{EDITOR}, $ENV{VISUAL}, 'vi' or 'pico' programs, in that order.

    • make

      A string holding the path to your make binary. Looks for the make program used to build perl or failing that, a make in your path.

    • pager

      A string holding the path to your pager of choice. Defaults to your $ENV{PAGER}, 'less' or 'more' programs, in that order.

    • shell

      A string holding the path to your login shell of choice. Defaults to your $ENV{SHELL} setting, or $ENV{COMSPEC} on Windows.

    • sudo

      A string holding the path to your sudo binary if your install path requires super user permissions. Looks for sudo in your path, or remains empty if you do not require super user permissions to install.

    • perlwrapper

      DEPRECATED

      A string holding the path to the cpanp-run-perl utility bundled with CPANPLUS, which is used to enable autoflushing in spawned processes.

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

    SEE ALSO

    CPANPLUS::Backend, CPANPLUS::Configure::Setup, CPANPLUS::Configure

     
    CPANPLUS::Configure

    NAME

    CPANPLUS::Configure - configuration for CPANPLUS

    SYNOPSIS

      $conf    = CPANPLUS::Configure->new( );
      $bool    = $conf->can_save;
      $bool    = $conf->save( $where );
      @opts    = $conf->options( $type );
      $make    = $conf->get_program('make');
      $verbose = $conf->set_conf( verbose => 1 );

    DESCRIPTION

    This module deals with all the configuration issues for CPANPLUS. Users can use objects created by this module to alter the behaviour of CPANPLUS.

    Please refer to the CPANPLUS::Backend documentation on how to obtain a CPANPLUS::Configure object.

    METHODS

    $Configure = CPANPLUS::Configure->new( load_configs => BOOL )

    This method returns a new object. Normal users will never need to invoke the new method, but instead retrieve the desired object via a method call on a CPANPLUS::Backend object.

    • load_configs

      Controls whether or not additional user configurations are to be loaded. Defaults to true.

    $bool = $Configure->init( [rescan => BOOL])

    Initializes the configure object with config files other than just the default 'CPANPLUS::Config'.

    Called from new() to load user/system configurations.

    If the rescan option is provided, your disk will be examined again to see if there are new config files that could be read. Defaults to false.

    Returns true on success, false on failure.

    can_save( [$config_location] )

    Check if we can save the configuration to the specified file. If no file is provided, defaults to your personal config.

    Returns true if the file can be saved, false otherwise.

    $file = $conf->save( [$package_name] )

    Saves the configuration to the package name you provided. If this package is not CPANPLUS::Config::System, it will be saved in your .cpanplus directory; otherwise an attempt is made to save it in the system-wide directory.

    If no argument is provided, it will default to your personal config.

    Returns the full path to the file if the config was saved, false otherwise.

    options( type => TYPE )

    Returns a list of all valid config options for a specific type (for example conf or program), or false if the type does not exist.

    ACCESSORS

    Accessors that start with a _ are marked private -- regular users should never need to use these.

    See the CPANPLUS::Config documentation for what items can be set and retrieved.

    get_SOMETHING( ITEM, [ITEM, ITEM, ... ] );

    The get_* style accessors merely retrieve one or more desired config options.

    set_SOMETHING( ITEM => VAL, [ITEM => VAL, ITEM => VAL, ... ] );

    The set_* style accessors set the current value for one or more config options and will return true upon success, false on failure.

    add_SOMETHING( ITEM => VAL, [ITEM => VAL, ITEM => VAL, ... ] );

    The add_* style accessor adds a new key to a config key.

    Currently, the following accessors exist:

    • set|get_conf

      Simple configuration directives like verbosity and favourite shell.

    • set|get_program

      Location of helper programs.

    • _set|_get_build

      Locations of where to put what files for CPANPLUS.

    • _set|_get_source

      Locations and names of source files locally.

    • _set|_get_mirror

      Locations and names of source files remotely.

    • _set|_get_fetch

      Special settings pertaining to the fetching of files.

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

    SEE ALSO

    CPANPLUS::Backend, CPANPLUS::Configure::Setup, CPANPLUS::Config

     
    CPANPLUS::Dist

    NAME

    CPANPLUS::Dist - base class for plugins

    SYNOPSIS

      my $dist = CPANPLUS::Dist::YOUR_DIST_TYPE_HERE->new(
                      module => $modobj,
                  );

    DESCRIPTION

    CPANPLUS::Dist is a base class for CPANPLUS::Dist::MM and CPANPLUS::Dist::Build. Developers of other CPANPLUS::Dist::* plugins should look at CPANPLUS::Dist::Base.

    ACCESSORS

    • parent()

      Returns the CPANPLUS::Module object that parented this object.

    • status()

      Returns the Object::Accessor object that keeps the status for this module.

    STATUS ACCESSORS

    All accessors can be accessed as follows: $deb->status->ACCESSOR

    • created()

      Boolean indicating whether the dist was created successfully. Explicitly set to 0 when failed, so a value of undef may be interpreted as "not yet attempted".

    • installed()

      Boolean indicating whether the dist was installed successfully. Explicitly set to 0 when failed, so a value of undef may be interpreted as "not yet attempted".

    • uninstalled()

      Boolean indicating whether the dist was uninstalled successfully. Explicitly set to 0 when failed, so a value of undef may be interpreted as "not yet attempted".

    • dist()

      The location of the final distribution. This may be a file or directory, depending on how your distribution plug in of choice works. This will be set upon a successful create.

    $dist = CPANPLUS::Dist::YOUR_DIST_TYPE_HERE->new( module => MODOBJ );

    Create a new CPANPLUS::Dist::YOUR_DIST_TYPE_HERE object based on the provided MODOBJ.

    *** DEPRECATED *** The optional argument format is used to indicate what type of dist you would like to create (like CPANPLUS::Dist::MM or CPANPLUS::Dist::Build and so on).

    CPANPLUS::Dist->new is exclusively meant as a method to be inherited by CPANPLUS::Dist::MM|Build.

    Returns a CPANPLUS::Dist::YOUR_DIST_TYPE_HERE object on success and false on failure.

    @dists = CPANPLUS::Dist->dist_types;

    Returns a list of the CPANPLUS::Dist::* classes available.

    $bool = CPANPLUS::Dist->rescan_dist_types;

    Rescans @INC for available dist types. Useful if you've installed new CPANPLUS::Dist::* classes and want to make them available to the current process.

    $bool = CPANPLUS::Dist->has_dist_type( $type )

    Returns true if distribution type $type is loaded/supported.

    $bool = $dist->prereq_satisfied( modobj => $modobj, version => $version_spec )

    Returns true if this prereq is satisfied. Returns false if it's not. Also issues an error if it seems "unsatisfiable," i.e. if it can't be found on CPAN or the latest CPAN version doesn't satisfy it.
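
    The core of such a check is a version comparison. The sketch below covers only the simple "installed version is at least the required version" case, using the core version module. The helper name meets_minimum_version is hypothetical, and the real method additionally consults CPAN, which is not modelled here.

```perl
use strict;
use warnings;
use version;    # core module for parsing and comparing version numbers

# Hypothetical helper, not part of CPANPLUS: true if $installed is at
# least $required. An undefined $installed means "not installed".
sub meets_minimum_version {
    my ( $installed, $required ) = @_;
    return 0 unless defined $installed;
    my $cmp = version->parse($installed) <=> version->parse($required);
    return $cmp >= 0 ? 1 : 0;
}

for my $case ( [ '1.52', '1.65' ], [ '1.65', '1.52' ], [ '0.01', '0' ] ) {
    my ( $have, $want ) = @$case;
    printf "installed %s, required %s: %s\n", $have, $want,
        meets_minimum_version( $have, $want ) ? 'satisfied' : 'not satisfied';
}
```

    A requirement of 0, as in { strict => 0 }, is met by any installed version.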

    $configure_requires = $dist->find_configure_requires( [file => /path/to/META.yml] )

    Reads the configure_requires for this distribution from the META.yml or META.json file in the root directory and returns a hashref with module names and versions required.
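
    As an illustration of the kind of data involved, the following sketch extracts configure-time requirements from META.json text laid out according to version 2 of the CPAN::Meta spec. The helper name configure_requires_from_meta_json and the sample metadata are assumptions made for this example; this is not how CPANPLUS itself is implemented.

```perl
use strict;
use warnings;
use JSON::PP ();    # core since Perl 5.14

# Hypothetical helper, not part of CPANPLUS: return a hashref of
# module => version taken from prereqs/configure/requires.
sub configure_requires_from_meta_json {
    my ($json_text) = @_;
    my $meta = JSON::PP->new->decode($json_text);
    my $reqs = $meta->{prereqs}{configure}{requires};
    return +{ %{ $reqs || {} } };    # copy; empty hashref if absent
}

# Sample metadata invented for this example.
my $meta_json = <<'JSON';
{
  "name": "Foo-Bar",
  "version": "1.0",
  "prereqs": {
    "configure": { "requires": { "Module::Build": "0.36" } },
    "runtime":   { "requires": { "Carp": "0" } }
  }
}
JSON

my $required = configure_requires_from_meta_json($meta_json);
print "$_ => $required->{$_}\n" for sort keys %$required;
```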

    $bool = $dist->_resolve_prereqs( ... )

    Makes sure prerequisites are resolved.

      format          The dist class to use to make the prereqs
                      (ie. CPANPLUS::Dist::MM)

      prereqs         Hash of the prerequisite modules and their versions

      target          What to do with the prereqs.
                          create  => Just build them
                          install => Install them
                          ignore  => Ignore them

      prereq_build    If true, always build the prereqs even if already
                      resolved

      verbose         Be verbose

      force           Force the prereq to be built, even if already resolved
     
    CPANPLUS::Error

    NAME

    CPANPLUS::Error - error handling for CPANPLUS

    SYNOPSIS

      use CPANPLUS::Error qw[cp_msg cp_error];

    DESCRIPTION

    This module provides the error handling code for the CPANPLUS libraries, and is mainly intended for internal use.

    FUNCTIONS

    cp_msg("message string" [,VERBOSE])

    Records a message on the stack, and prints it to STDOUT (or actually $MSG_FH, see the GLOBAL VARIABLES section below), if the VERBOSE option is true. The VERBOSE option defaults to false.

    msg()

    An alias for cp_msg.

    cp_error("error string" [,VERBOSE])

    Records an error on the stack, and prints it to STDERR (or actually $ERROR_FH, see the GLOBAL VARIABLES section below), if the VERBOSE option is true. The VERBOSE option defaults to true.

    error()

    An alias for cp_error.

    CLASS METHODS

    CPANPLUS::Error->stack()

    Retrieves all the items on the stack. Since CPANPLUS::Error is implemented using Log::Message, consult its manpage for the function retrieve to see what is returned and how to use the items.

    CPANPLUS::Error->stack_as_string([TRACE])

    Returns the whole stack as a printable string. If the TRACE option is true all items are returned with Carp::longmess output, rather than just the message. TRACE defaults to false.

    CPANPLUS::Error->flush()

    Removes all the items from the stack and returns them. Since CPANPLUS::Error is implemented using Log::Message, consult its manpage for the function retrieve to see what is returned and how to use the items.

    GLOBAL VARIABLES

    • $ERROR_FH

      This is the filehandle to which all messages sent to error() are printed. This defaults to *STDERR.

    • $MSG_FH

      This is the filehandle to which all messages sent to msg() are printed. This defaults to *STDOUT.

     
    CPANPLUS::Internals

    NAME

    CPANPLUS::Internals - CPANPLUS internals

    SYNOPSIS

      my $internals = CPANPLUS::Internals->_init( _conf => $conf );
      my $backend   = CPANPLUS::Internals->_retrieve_id( $ID );

    DESCRIPTION

    This module is the guts of CPANPLUS -- it inherits from all other modules in the CPANPLUS::Internals::* namespace, thus defying normal rules of OO programming -- but if you're reading this, you already know what's going on ;)

    Please read the CPANPLUS::Backend documentation for the normal API.

    ACCESSORS

    • _conf

      Get/set the configure object

    • _id

      Get/set the id

    METHODS

    $internals = CPANPLUS::Internals->_init( _conf => CONFIG_OBJ )

    _init creates a new CPANPLUS::Internals object.

    You have to pass it a valid CPANPLUS::Configure object.

    Returns the object on success, or dies on failure.

    $bool = $internals->_flush( list => \@caches )

    Flushes the designated caches from the CPANPLUS object.

    Returns true on success, false if one or more caches could not be flushed.

    $bool = $internals->_register_callback( name => CALLBACK_NAME, code => CODEREF );

    Registers a callback for later use by the internal libraries.

    Here is a list of the currently used callbacks:

    • install_prerequisite

      Is called when the user wants to be asked about what to do with prerequisites. Should return a boolean indicating true to install the prerequisite and false to skip it.

    • send_test_report

      Is called when the user should be prompted about whether to send the test report. Should return a boolean indicating true to send the test report and false to skip it.

    • munge_test_report

      Is called when the test report message has been composed, giving the user a chance to programmatically alter it. Should return the (munged) message to be sent.

    • edit_test_report

      Is called when the user should be prompted to edit test reports about to be sent out by Test::Reporter. Should return a boolean indicating true to edit the test report in an editor and false to skip it.

    • proceed_on_test_failure

      Is called when 'make test' or 'Build test' fails. Should return a boolean indicating whether the install should continue even if the test failed.

    • munge_dist_metafile

      Is called when the CPANPLUS::Dist::* metafile is created, like control for CPANPLUS::Dist::Deb, giving the user a chance to programmatically alter it. Should return the (munged) text to be written to the metafile.

    $bool = $internals->_add_to_includepath( directories => \@dirs )

    Adds a list of directories to the include path. This means they get added to @INC as well as $ENV{PERL5LIB}.

    Returns true on success, false on failure.
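
    The effect described for _add_to_includepath can be sketched in plain Perl as follows. The helper add_to_includepath and the example directories are hypothetical; whether CPANPLUS prepends or appends, and how it treats duplicates, are assumptions of this sketch rather than documented behaviour.

```perl
use strict;
use warnings;
use Config;    # provides $Config{path_sep} (':' on Unix, ';' on Windows)

# Hypothetical helper, not the CPANPLUS implementation: make each
# directory visible to this process (@INC) and to child perls
# ($ENV{PERL5LIB}), skipping entries that are already present.
sub add_to_includepath {
    my (@dirs) = @_;
    my $sep = $Config{path_sep};

    # Walk the list backwards so the first directory ends up first.
    for my $dir ( reverse @dirs ) {
        unshift @INC, $dir unless grep { $_ eq $dir } @INC;

        my @lib = split /\Q$sep\E/, ( $ENV{PERL5LIB} // '' );
        $ENV{PERL5LIB} = join $sep, $dir, @lib
            unless grep { $_ eq $dir } @lib;
    }
    return 1;
}

add_to_includepath( '/opt/example/lib', '/opt/example/arch' );
print "First \@INC entry: $INC[0]\n";
print "PERL5LIB: $ENV{PERL5LIB}\n";
```

    Changing @INC affects only the running interpreter, while PERL5LIB is inherited by any perl process spawned afterwards, which is why both are updated.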

    $bool = $internals->_add_to_path( directories => \@dirs )

    Adds a list of directories to the PATH, but only if they actually contain anything.

    Returns true on success, false on failure.

    $id = CPANPLUS::Internals->_last_id

    Return the id of the last object stored.

    $id = CPANPLUS::Internals->_store_id( $internals )

    Store this object; return its id.

    $obj = CPANPLUS::Internals->_retrieve_id( $ID )

    Retrieve an object based on its ID -- return false on error.

    CPANPLUS::Internals->_remove_id( $ID )

    Remove the object marked by $ID from storage.

    @objs = CPANPLUS::Internals->_return_all_objects

    Return all stored objects.
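
    The id-based methods above amount to a small object registry. The sketch below shows that pattern in isolation. Local::Registry and its method names (without the leading underscore) are hypothetical, and sequential integer ids are an assumption; CPANPLUS::Internals keeps its own private bookkeeping.

```perl
use strict;
use warnings;

{
    # Hypothetical class for illustration only.
    package Local::Registry;

    my %objects;        # id => stored object
    my $last_id = 0;    # id handed out most recently

    # Store an object and return the id it was filed under.
    sub store_id {
        my ( $class, $obj ) = @_;
        $objects{ ++$last_id } = $obj;
        return $last_id;
    }

    sub last_id { return $last_id }

    # Return the stored object, or false (undef) for an unknown id.
    sub retrieve_id {
        my ( $class, $id ) = @_;
        return unless defined $id && exists $objects{$id};
        return $objects{$id};
    }

    sub remove_id {
        my ( $class, $id ) = @_;
        return delete $objects{$id};
    }

    sub return_all_objects {
        return map { $objects{$_} } sort { $a <=> $b } keys %objects;
    }
}

my $id  = Local::Registry->store_id( { name => 'first backend' } );
my $obj = Local::Registry->retrieve_id($id);
print "Stored under id $id: $obj->{name}\n";

Local::Registry->remove_id($id);
print "Gone after removal\n" unless Local::Registry->retrieve_id($id);
```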

     
    CPANPLUS::Module

    NAME

    CPANPLUS::Module - CPAN module objects for CPANPLUS

    SYNOPSIS

      ### get a module object from the CPANPLUS::Backend object
      my $mod = $cb->module_tree('Some::Module');

      ### accessors
      $mod->version;
      $mod->package;

      ### methods
      $mod->fetch;
      $mod->extract;
      $mod->install;

    DESCRIPTION

    CPANPLUS::Module creates objects from the information in the source files. These objects can then be queried and acted upon, for example by fetching or installing them.

    These objects should only be created internally. For fake objects, there's the CPANPLUS::Module::Fake class. To obtain a module object consult the CPANPLUS::Backend documentation.

    CLASS METHODS

    accessors ()

    Returns a list of all accessor methods to the object

    ACCESSORS

    An object of this class has the following accessors:

    • name

      Name of the module.

    • module

      Name of the module.

    • version

      Version of the module. Defaults to '0.0' if none was provided.

    • path

      Extended path on the mirror.

    • comment

      Any comment about the module -- largely unused.

    • package

      The name of the package.

    • description

      Description of the module -- only registered modules have this.

    • dslip

      The five character dslip string, that represents meta-data of the module -- again, only registered modules have this.

    • status

      The CPANPLUS::Module::Status object associated with this object (see below).

    • author

      The CPANPLUS::Module::Author object associated with this object.

    • parent

      The CPANPLUS::Internals object that spawned this module object.

    STATUS ACCESSORS

    CPANPLUS caches a lot of results from method calls and saves data it collected along the road for later reuse.

    CPANPLUS uses this internally, but it is also available for the end user. You can get a status object by calling:

      $modobj->status

    You can then query the object as follows:

    • installer_type

      The installer type used for this distribution. Will be one of 'makemaker' or 'build'. This determines whether CPANPLUS::Dist::MM or CPANPLUS::Dist::Build will be used to build this distribution.

    • dist_cpan

      The dist object used to do the CPAN-side of the installation. Either a CPANPLUS::Dist::MM or CPANPLUS::Dist::Build object.

    • dist

      The custom dist object used to do the operating-system-specific side of the installation, if you've chosen to use this. For example, if you've chosen to install using the ports format, this may be a CPANPLUS::Dist::Ports object.

      Undefined if you didn't specify a separate format to install through.

    • prereqs | requires

      A hashref of prereqs this distribution was found to have. Will look something like this:

        { Carp => 0.01, strict => 0 }

      Might be undefined if the distribution didn't have any prerequisites.

    • configure_requires

      Like prereqs, but these are necessary to be installed before the build process can even begin.

    • signature

      Flag indicating, if a signature check was done, whether it was OK or not.

    • extract

      The directory this distribution was extracted to.

    • fetch

      The location this distribution was fetched to.

    • readme

      The text of this distribution's README file.

    • uninstall

      Flag indicating if an uninstall call was done successfully.

    • created

      Flag indicating if the create call to your dist object was done successfully.

    • installed

      Flag indicating if the install call to your dist object was done successfully.

    • checksums

      The location of this distribution's CHECKSUMS file.

    • checksum_ok

      Flag indicating if the checksums check was done successfully.

    • checksum_value

      The checksum value this distribution is expected to have.

    METHODS

    $self = CPANPLUS::Module->new( OPTIONS )

    This method returns a CPANPLUS::Module object. Normal users should never call this method directly, but instead use the CPANPLUS::Backend to obtain module objects.

    This example illustrates a new() call with all required arguments:

      CPANPLUS::Module->new(
          module  => 'Foo',
          path    => 'authors/id/A/AA/AAA',
          package => 'Foo-1.0.tgz',
          author  => $author_object,
          _id     => INTERNALS_OBJECT_ID,
      );

    Every accessor is also a valid option to pass to new.

    Returns a module object on success and false on failure.

    $mod->package_name( [$package_string] )

    Returns the name of the package a module is in. For Acme::Bleach that might be Acme-Bleach.

    $mod->package_version( [$package_string] )

    Returns the version of the package a module is in. For a module in the package Acme-Bleach-1.1.tar.gz this would be 1.1.

    $mod->package_extension( [$package_string] )

    Returns the suffix added by the compression method of a package a certain module is in. For a module in Acme-Bleach-1.1.tar.gz, this would be tar.gz.
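
    To make the relationship between the three package_* methods concrete, here is a minimal sketch that splits a package file name into the same three parts. The helper split_package, its regular expression and the list of recognised extensions are assumptions made for this example; CPANPLUS may parse file names differently.

```perl
use strict;
use warnings;

# Hypothetical helper, not the CPANPLUS implementation: split a package
# file name into distribution name, version and archive extension.
# Returns an empty list if the name does not look like a package.
sub split_package {
    my ($file) = @_;
    my ( $name, $version, $ext ) =
        $file =~ /^(.+?)-v?([\d._]+?)\.(tar\.gz|tar\.bz2|tgz|zip)$/
        or return;
    return ( name => $name, version => $version, extension => $ext );
}

my %parts = split_package('Acme-Bleach-1.1.tar.gz');
print "$_: $parts{$_}\n" for qw(name version extension);
```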

    $mod->package_is_perl_core

    Returns a boolean indicating whether the package a particular module is in is actually a core perl distribution.

    $mod->module_is_supplied_with_perl_core( [version => $]] )

    Returns a boolean indicating whether ANY VERSION of this module was supplied with the current running perl's core package.

    $mod->is_bundle

    Returns a boolean indicating whether the module you are looking at is actually a bundle. Bundles are identified as modules whose name starts with Bundle::.

    $mod->is_autobundle;

    Returns a boolean indicating whether the module you are looking at is actually an autobundle as generated by $cb->autobundle.

    $mod->is_third_party

    Returns a boolean indicating whether the package is a known third-party module (i.e. it's not provided by the standard Perl distribution and is not available on the CPAN, but on a third party software provider). See Module::ThirdParty for more details.

    $mod->third_party_information

    Returns a reference to a hash with more information about a third-party module. See the documentation about module_information() in Module::ThirdParty for more details.

    $clone = $self->clone

    Clones the current module object for tinkering with. It will have a clean CPANPLUS::Module::Status object, as well as a fake CPANPLUS::Module::Author object.

    $where = $self->fetch

    Fetches the module from a CPAN mirror. Look at CPANPLUS::Internals::Fetch::_fetch() for details on the options you can pass.

    $path = $self->extract

    Extracts the fetched module. Look at CPANPLUS::Internals::Extract::_extract() for details on the options you can pass.

    $type = $self->get_installer_type([prefer_makefile => BOOL])

    Gets the installer type for this module. This may either be build or makemaker. If Module::Build is unavailable or no installer type is available, it will fall back to makemaker. If both are available, it will pick the one indicated by your config, or by the prefer_makefile option you can pass to this function.

    Returns the installer type on success, and false on error.

    $dist = $self->dist([target => 'prepare|create', format => DISTRIBUTION_TYPE, args => {key => val}]);

    Create a distribution object, ready to be installed. Distribution type defaults to your config settings.

    The optional args hashref is passed on to the specific distribution types' create method after being dereferenced.

    Returns a distribution object on success, false on failure.

    See CPANPLUS::Dist for details.

    $bool = $mod->prepare( )

    Convenience method around install() that prepares a module without actually building it. This is equivalent to invoking install with target set to prepare.

    Returns true on success, false on failure.

    $bool = $mod->create( )

    Convenience method around install() that creates a module. This is equivalent to invoking install with target set to create.

    Returns true on success, false on failure.

    $bool = $mod->test( )

    Convenience wrapper around install() that tests a module without installing it. This is equivalent to invoking install() with target set to create and skiptest set to 0.

    Returns true on success, false on failure.

    $bool = $self->install([ target => 'init|prepare|create|install', format => FORMAT_TYPE, extractdir => DIRECTORY, fetchdir => DIRECTORY, prefer_bin => BOOL, force => BOOL, verbose => BOOL, ..... ]);

    Installs the current module. This includes fetching it and extracting it, if this hasn't been done yet, as well as creating a distribution object for it.

    This means you can pass it more arguments than described above, which will be passed on to the relevant methods as they are called.

    See CPANPLUS::Internals::Fetch, CPANPLUS::Internals::Extract and CPANPLUS::Dist for details.

    Returns true on success, false on failure.

    @list = $self->bundle_modules()

    Returns a list of module objects the Bundle specifies.

    This requires you to have extracted the bundle already, using the extract() method.

    Returns false on error.

    $text = $self->readme

    Fetches the readme belonging to this module and stores it under $obj->status->readme. Returns the readme as a string on success and returns false on failure.

    $version = $self->installed_version()

    Returns the currently installed version of this module, if any.

    $where = $self->installed_file()

    Returns the location of the currently installed file of this module, if any.

    $dir = $self->installed_dir()

    Returns the directory (or more accurately, the @INC handle) from which this module was loaded, if any.

    $bool = $self->is_uptodate([version => VERSION_NUMBER])

    Returns a boolean indicating whether this module is up to date.

    $href = $self->details()

    Returns a hashref with key/value pairs offering more information about a particular module. For example, for Time::HiRes it might look like this:

      Author                  Jarkko Hietaniemi (jhi@iki.fi)
      Description             High resolution time, sleep, and alarm
      Development Stage       Released
      Installed File          /usr/local/perl/lib/Time/Hires.pm
      Interface Style         plain Functions, no references used
      Language Used           C and perl, a C compiler will be needed
      Package                 Time-HiRes-1.65.tar.gz
      Public License          Unknown
      Support Level           Developer
      Version Installed       1.52
      Version on CPAN         1.65

    @list = $self->contains()

    Returns a list of module objects that represent the modules also present in the package of this module.

    For example, for Archive::Tar this might return:

      Archive::Tar
      Archive::Tar::Constant
      Archive::Tar::File

    @list_of_hrefs = $self->fetch_report()

    This function queries the CPAN testers database at http://testers.cpan.org/ for test results of specified module objects, module names or distributions.

    Look at CPANPLUS::Internals::Report::_query_report() for details on the options you can pass and the return value to expect.

    $bool = $self->uninstall([type => [all|man|prog]])

    This function uninstalls the specified module object.

    You can uninstall two types of files: man pages or program (prog) files. Alternatively, you can specify all to uninstall both (which is the default).

    Returns true on success and false on failure.

    Do note that this does an uninstall via the so-called .packlist, so if you used a module installer like, say, ports or apt, you should not use this, but use your package manager instead.

    @modobj = $self->distributions()

    Returns a list of module objects representing all releases for this module on success, false on failure.

    @list = $self->files ()

    Returns a list of files used by this module, if it is installed.

    @list = $self->directory_tree ()

    Returns a list of directories used by this module.

    @list = $self->packlist ()

    Returns the ExtUtils::Packlist object for this module.

    @list = $self->validate ()

    Returns a list of files that are missing for this module, but are present in the .packlist file.
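
    The same kind of check can be tried directly with the core ExtUtils::Packlist module, which offers its own validate method. The sketch below builds a throwaway .packlist in a temporary directory (the file names are invented for the example) rather than touching a real installation, and it does not claim to mirror how CPANPLUS calls the module internally.

```perl
use strict;
use warnings;
use ExtUtils::Packlist;
use File::Spec;
use File::Temp qw(tempdir);

# Work in a scratch directory that is removed when the program exits.
my $dir     = tempdir( CLEANUP => 1 );
my $present = File::Spec->catfile( $dir, 'Present.pm' );
my $missing = File::Spec->catfile( $dir, 'Missing.pm' );

# Only one of the two listed files is actually created on disk.
open my $fh, '>', $present or die "Cannot create $present: $!";
close $fh;

# Write a packlist naming both files.
my $packlist_file = File::Spec->catfile( $dir, '.packlist' );
my $pl = ExtUtils::Packlist->new;
$pl->{$present} = undef;
$pl->{$missing} = undef;
$pl->write($packlist_file);

# Read it back and ask which listed files do not exist.
my $reread        = ExtUtils::Packlist->new($packlist_file);
my @missing_files = $reread->validate;
print "Missing: $_\n" for @missing_files;
```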

    $bool = $self->add_to_includepath;

    Adds the current module's path to @INC and $PERL5LIB. This allows you to add the module from its build dir to your path.

    It also adds the current module's bin and/or script paths to the PATH.

    You can reset $PATH, @INC and $PERL5LIB to their original state when you started the program, by calling:

      $self->parent->flush('lib');

    $path = $self->best_path_to_module_build();

    OBSOLETE

    If a newer version of Module::Build is found in your path, it will return this special path. If the newest version of Module::Build is found in your regular @INC, the method will return false. This indicates you do not need to add a special directory to your @INC.

    Note that this is only relevant if you're building your own CPANPLUS::Dist::* plugin -- the built-in dist types already have this taken care of.

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    CPANPLUS::Selfupdate

    NAME

    CPANPLUS::Selfupdate - self-updating for CPANPLUS

    SYNOPSIS

        $su     = $cb->selfupdate_object;
        @feats  = $su->list_features;
        @feats  = $su->list_enabled_features;
        @mods   = map { $su->modules_for_feature( $_ ) } @feats;
        @mods   = $su->list_core_dependencies;
        @mods   = $su->list_core_modules;
        for ( @mods ) {
            print $_->name . " should be version " . $_->version_required;
            print "Installed version is not uptodate!"
                unless $_->is_installed_version_sufficient;
        }
        $ok     = $su->selfupdate( update => 'all', latest => 0 );

    METHODS

    $self = CPANPLUS::Selfupdate->new( $backend_object );

    Sets up a new selfupdate object. Called automatically when a new backend object is created.

    @cat = $self->list_categories

    Returns a list of categories that the selfupdate method accepts.

    See selfupdate for details.

    %list = $self->list_modules_to_update( update => "core|dependencies|enabled_features|features|all", [latest => BOOL] )

    Lists which modules selfupdate would upgrade. You can update the core (CPANPLUS itself), the core dependencies, all features you have currently turned on, all available features, or everything.

    The latest option determines whether it should update to the latest version on CPAN, or if the minimal required version for CPANPLUS is good enough.

    Returns a hash of feature names and lists of module objects to be upgraded based on the category you provided. For example:

        %list = $self->list_modules_to_update( update => 'core' );

    Would return:

        ( core => [ $module_object_for_cpanplus ] );

    $bool = $self->selfupdate( update => "core|dependencies|enabled_features|features|all", [latest => BOOL, force => BOOL] )

    Updates CPANPLUS itself. You can update the core (CPANPLUS itself), the core dependencies, all features you have currently turned on, all available features, or everything.

    The latest option determines whether it should update to the latest version on CPAN, or if the minimal required version for CPANPLUS is good enough.

    Returns true on success, false on error.

    @features = $self->list_features

    Returns a list of features that are supported by CPANPLUS.

    @features = $self->list_enabled_features

    Returns a list of features that are enabled in your current CPANPLUS installation.

    @mods = $self->modules_for_feature( FEATURE [,AS_HASH] )

    Returns a list of CPANPLUS::Selfupdate::Module objects which represent the modules required to support this feature.

    For a list of features, call the list_features method.

    If the AS_HASH argument is provided, no module objects are returned, but a hashref where the keys are names of the modules, and values are their minimum versions.

    @mods = $self->list_core_dependencies( [AS_HASH] )

    Returns a list of CPANPLUS::Selfupdate::Module objects which represent the modules that comprise the core dependencies of CPANPLUS.

    If the AS_HASH argument is provided, no module objects are returned, but a hashref where the keys are names of the modules, and values are their minimum versions.

    @mods = $self->list_core_modules( [AS_HASH] )

    Returns a list of CPANPLUS::Selfupdate::Module objects which represent the modules that comprise the core of CPANPLUS.

    If the AS_HASH argument is provided, no module objects are returned, but a hashref where the keys are names of the modules, and values are their minimum versions.
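A hashref of module names and minimum versions, as described for AS_HASH, can be checked against what is installed with the core Module::Load::Conditional module. The sketch below uses a hand-written hashref as a stand-in for the real return value; both entries are illustrative and not CPANPLUS's actual dependency list.

```perl
use strict;
use warnings;
use Module::Load::Conditional qw(check_install);

# Stand-in for the hashref an AS_HASH call returns: module name => minimum
# version. These entries are made up for the example.
my $required = {
    'File::Spec'                   => '0.82',
    'Acme::Surely::Not::Installed' => '1.00',
};

my %state;
for my $name ( sort keys %$required ) {
    my $min = $required->{$name};

    # check_install() returns undef when the module cannot be found in @INC.
    my $info = check_install( module => $name, version => $min );

    $state{$name} = !$info            ? 'not installed'
                  : $info->{uptodate} ? "ok (have $info->{version})"
                  :                     "too old (have $info->{version}, need $min)";
    print "$name: $state{$name}\n";
}
```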

    CPANPLUS::Selfupdate::Module

    CPANPLUS::Selfupdate::Module extends CPANPLUS::Module objects by providing accessors to aid in selfupdating CPANPLUS.

    These objects are returned by all methods of CPANPLUS::Selfupdate that return module objects.

    $version = $mod->version_required

    Returns the version of this module required for CPANPLUS.

    $bool = $mod->is_installed_version_sufficient

    Returns true if the installed version of this module is sufficient for CPANPLUS, or false if it is not.

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    CPANPLUS::Shell

    NAME

    CPANPLUS::Shell - base class for CPANPLUS shells

    SYNOPSIS

        use CPANPLUS::Shell;                # load the shell indicated by your
                                            # config -- defaults to
                                            # CPANPLUS::Shell::Default
        use CPANPLUS::Shell qw[Classic];    # load CPANPLUS::Shell::Classic
        my $ui   = CPANPLUS::Shell->new();
        my $name = $ui->which;              # find out what shell you loaded
        $ui->shell;                         # run the ui shell

    DESCRIPTION

    This module is the generic loader (and base class) for all CPANPLUS shells. Through this module you can load any installed CPANPLUS shell.

    Just about all the functionality is provided by the shell that you have loaded, and not by this class (which merely functions as a generic loading class), so please consult the documentation of your shell of choice.

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

    SEE ALSO

    CPANPLUS::Shell::Default, CPANPLUS::Shell::Classic, cpanp

     
    CPANPLUS::Shell::Classic

    NAME

    CPANPLUS::Shell::Classic - CPAN.pm emulation for CPANPLUS

    DESCRIPTION

    The Classic shell is designed to provide the feel of the CPAN.pm shell using CPANPLUS underneath.

    For detailed documentation, refer to CPAN.

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

    SEE ALSO

    CPANPLUS::Configure, CPANPLUS::Module, CPANPLUS::Module::Author, CPAN

     
    CPANPLUS::Shell::Default

    NAME

    CPANPLUS::Shell::Default - the default CPANPLUS shell

    SYNOPSIS

        ### loading the shell:
        $ cpanp                     # run 'cpanp' from the command line
        $ perl -MCPANPLUS -eshell   # load the shell from the command line
        use CPANPLUS::Shell qw[Default];        # load this shell via the API
                                                # always done via CPANPLUS::Shell
        my $ui = CPANPLUS::Shell->new;
        $ui->shell;                             # run the shell
        $ui->dispatch_on_input( input => 'x');  # update the source using the
                                                # dispatch method
        ### when in the shell:
        ### Note that all commands can also take options.
        ### Look at their underlying CPANPLUS::Backend methods to see
        ### what options those are.
        cpanp> h                    # show help messages
        cpanp> ?                    # show help messages
        cpanp> m Acme               # find acme modules, allows regexes
        cpanp> a KANE               # find modules by kane, allows regexes
        cpanp> f Acme::Foo          # get a list of all releases of Acme::Foo
        cpanp> i Acme::Foo          # install Acme::Foo
        cpanp> i Acme-Foo-1.3       # install version 1.3 of Acme::Foo
        cpanp> i <URI>              # install from URI, like ftp://foo.com/X.tgz
        cpanp> i <DIR>              # install from an absolute or relative directory
        cpanp> i 1 3..5             # install search results 1, 3, 4 and 5
        cpanp> i *                  # install all search results
        cpanp> a KANE; i *;         # find modules by kane, install all results
        cpanp> t Acme::Foo          # test Acme::Foo, without installing it
        cpanp> u Acme::Foo          # uninstall Acme::Foo
        cpanp> d Acme::Foo          # download Acme::Foo
        cpanp> z Acme::Foo          # download & extract Acme::Foo, then open a
                                    # shell in the extraction directory
        cpanp> c Acme::Foo          # get a list of test results for Acme::Foo
        cpanp> l Acme::Foo          # view details about the Acme::Foo package
        cpanp> r Acme::Foo          # view Acme::Foo's README file
        cpanp> o                    # get a list of all installed modules that
                                    # are out of date
        cpanp> o 1..3               # list uptodateness from a previous search
        cpanp> s conf               # show config settings
        cpanp> s conf md5 1         # enable md5 checks
        cpanp> s program            # show program settings
        cpanp> s edit               # edit config file
        cpanp> s reconfigure        # go through initial configuration again
        cpanp> s selfupdate         # update your CPANPLUS install
        cpanp> s save               # save config to disk
        cpanp> s mirrors            # show currently selected mirrors
        cpanp> ! [PERL CODE]        # execute the following perl code
        cpanp> b                    # create an autobundle for this computer's
                                    # perl installation
        cpanp> x                    # reload index files (purges cache)
        cpanp> x --update_source    # reload index files, get fresh source files
        cpanp> p [FILE]             # print error stack (to a file)
        cpanp> v                    # show the banner
        cpanp> w                    # show last search results again
        cpanp> q                    # quit the shell
        cpanp> e                    # exit the shell and reload
        cpanp> /plugins             # list available plugins
        cpanp> /? PLUGIN            # list help text of <PLUGIN>
        ### common options:
        cpanp> i ... --skiptest     # skip tests
        cpanp> i ... --force        # force all operations
        cpanp> i ... --verbose      # run in verbose mode
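The selection syntax in "i 1 3..5" (search results 1, 3, 4 and 5) can be illustrated with a few lines of plain Perl. The helper expand_selection below is hypothetical and written for this example; it is not the shell's own parser.

```perl
use strict;
use warnings;

# Expand a selection such as "1 3..5" into the individual result numbers.
# expand_selection() is a hypothetical helper, not part of CPANPLUS.
sub expand_selection {
    my ($input) = @_;
    my @ids;
    for my $token ( split ' ', $input ) {
        if ( $token =~ /^(\d+)\.\.(\d+)$/ ) {
            push @ids, $1 .. $2;    # inclusive range, e.g. 3..5
        }
        elsif ( $token =~ /^\d+$/ ) {
            push @ids, $token;      # single result number
        }
        else {
            die "Unrecognised selection '$token'\n";
        }
    }
    return @ids;
}

print join( ', ', expand_selection('1 3..5') ), "\n";    # prints "1, 3, 4, 5"
```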

    DESCRIPTION

    This module provides the default user interface to CPANPLUS. You can start it via the cpanp binary, or as detailed in the SYNOPSIS.

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

    SEE ALSO

    CPANPLUS::Shell::Classic, CPANPLUS::Shell, cpanp

     
    CPANPLUS::Shell::Default::Plugins::CustomSource

    NAME

    CPANPLUS::Shell::Default::Plugins::CustomSource - add custom sources to CPANPLUS

    SYNOPSIS

        ### elaborate help text
        CPAN Terminal> /? cs
        ### add a new custom source
        CPAN Terminal> /cs --add file:///path/to/releases
        ### list all your custom sources
        CPAN Terminal> /cs --list
        ### display the contents of a custom source by URI or ID
        CPAN Terminal> /cs --contents file:///path/to/releases
        CPAN Terminal> /cs --contents 1
        ### Update a custom source by URI or ID
        CPAN Terminal> /cs --update file:///path/to/releases
        CPAN Terminal> /cs --update 1
        ### Remove a custom source by URI or ID
        CPAN Terminal> /cs --remove file:///path/to/releases
        CPAN Terminal> /cs --remove 1
        ### Write an index file for a custom source, to share
        ### with 3rd parties or remote users
        CPAN Terminal> /cs --write file:///path/to/releases
        ### Make sure to save your sources when adding/removing
        ### sources, so your changes are reflected in the cache:
        CPAN Terminal> x

    DESCRIPTION

    This is a CPANPLUS::Shell::Default plugin that can add custom sources to your CPANPLUS installation. This is a wrapper around the custom module sources code as outlined in CUSTOM MODULE SOURCES in CPANPLUS::Backend.

    This allows you to extend your index of available modules beyond what's available on CPAN with your own local distributions, or ones offered by third parties.

     
    CPANPLUS::Shell::Default::Plugins::Remote

    NAME

    CPANPLUS::Shell::Default::Plugins::Remote - connect to a remote CPANPLUS

    SYNOPSIS

        CPAN Terminal> /connect localhost 1337 --user=foo --pass=bar
        ...
        CPAN Terminal@localhost> /disconnect

    DESCRIPTION

    This is a CPANPLUS::Shell::Default plugin that allows you to connect to a machine running an instance of CPANPLUS::Daemon, allowing remote usage of the CPANPLUS Shell.

    A sample session, updating all modules on a remote machine, might look like this:

        CPAN Terminal> /connect --user=my_user --pass=secret localhost 1337
        Connection accepted
        Successfully connected to 'localhost' on port '1337'
        Note that no output will appear until a command has completed
        -- this may take a while
        CPAN Terminal@localhost> o; i *
        [....]
        CPAN Terminal@localhost> /disconnect
        CPAN Terminal>

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

    SEE ALSO

    CPANPLUS::Shell::Default, CPANPLUS::Shell, cpanp

     
    CPANPLUS::Shell::Default::Plugins::Source

    NAME

    CPANPLUS::Shell::Default::Plugins::Source - read in CPANPLUS commands

    SYNOPSIS

        CPAN Terminal> /source /tmp/list_of_commands /tmp/more_commands

    DESCRIPTION

    This is a CPANPLUS::Shell::Default plugin that works just like your Unix shell's source(1) command; it reads in a file that has commands in it, and then executes them.

    A sample file might look like this:

        # first, update all the source files
        x --update_source
        # find all of my modules that are on the CPAN
        # test them, and store the error log
        a ^KANE$
        t *
        p /home/kane/cpan-autotest/log
        # and inform us we're good to go
        ! print "Autotest complete, log stored; please enter your commands!"

    Note that empty lines and lines starting with a '#' are skipped during execution.
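The skipping rule can be sketched in a few lines of plain Perl. The helper read_commands is hypothetical and only shows the filtering; it does not dispatch anything to the shell and is not the plugin's implementation.

```perl
use strict;
use warnings;

# Return the commands found in a command file's text, skipping empty lines
# and lines starting with '#'. read_commands() is a hypothetical helper.
sub read_commands {
    my ($text) = @_;
    my @commands;
    for my $line ( split /\n/, $text ) {
        next unless $line =~ /\S/;    # empty or whitespace-only line
        next if $line =~ /^#/;        # comment line
        push @commands, $line;
    }
    return @commands;
}

my $file_contents = <<'EOF';
# first, update all the source files
x --update_source

# test my modules and store the error log
a ^KANE$
t *
p /home/kane/cpan-autotest/log
EOF

print "would run: $_\n" for read_commands($file_contents);
```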

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

    SEE ALSO

    CPANPLUS::Shell::Default, CPANPLUS::Shell, cpanp

     
    CPANPLUS::Module::Author

    NAME

    CPANPLUS::Module::Author - CPAN author object for CPANPLUS

    SYNOPSIS

        my $author = CPANPLUS::Module::Author->new(
                        author  => 'Jack Ashton',
                        cpanid  => 'JACKASH',
                        _id     => INTERNALS_OBJECT_ID,
                    );
        $author->cpanid;
        $author->author;
        $author->email;
        @dists  = $author->distributions;
        @mods   = $author->modules;
        @accessors = CPANPLUS::Module::Author->accessors;

    DESCRIPTION

    CPANPLUS::Module::Author creates objects from the information in the source files. These objects can then be queried.

    These objects should only be created internally. For fake objects, there's the CPANPLUS::Module::Author::Fake class.

    ACCESSORS

    An object of this class has the following accessors:

    • author

      Name of the author.

    • cpanid

      The CPAN id of the author.

    • email

      The email address of the author, which defaults to '' if not provided.

    • parent

      The CPANPLUS::Internals::Object that spawned this module object.

    METHODS

    $auth = CPANPLUS::Module::Author->new( author => AUTHOR_NAME, cpanid => CPAN_ID, _id => INTERNALS_ID [, email => AUTHOR_EMAIL] )

    This method returns a CPANPLUS::Module::Author object, based on the given parameters.

    Returns false on failure.

    @mod_objs = $auth->modules()

    Return a list of module objects this author has released.

    @dists = $auth->distributions()

    Returns a list of module objects representing all the distributions this author has released.

    CLASS METHODS

    accessors ()

    Returns a list of all accessor methods of the object.

     
    CPANPLUS::Module::Checksums

    NAME

    CPANPLUS::Module::Checksums - checking the checksum of a distribution

    SYNOPSIS

        $file = $modobj->checksums;
        $bool = $modobj->_validate_checksum;

    DESCRIPTION

    This is a class that provides functions for checking the checksum of a distribution. It should not be loaded directly, but used through the interface provided by CPANPLUS::Module.

    METHODS

    $mod->checksums

    Fetches the checksums file for this module object. For the options it can take, see CPANPLUS::Module::fetch().

    Returns the location of the checksums file on success and false on error.

    The location of the checksums file is also stored as

        $mod->status->checksums
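As a general illustration of checking a distribution against a published checksum, the sketch below compares a file's digest with an expected value using the core Digest::SHA module. The helper checksum_ok, the choice of SHA-256 and the stand-in archive contents are all assumptions for the example; this is not how _validate_checksum is implemented.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);
use File::Temp qw(tempfile);

# Compare a file's SHA-256 digest with an expected hex string.
# checksum_ok() is a hypothetical helper, not a CPANPLUS method.
sub checksum_ok {
    my ( $path, $expected ) = @_;
    my $actual = Digest::SHA->new(256)->addfile( $path, 'b' )->hexdigest;
    return lc($actual) eq lc($expected) ? 1 : 0;
}

# Stand-in for a downloaded archive and its published digest.
my $payload = 'pretend this is Acme-Foo-1.3.tar.gz';
my ( $fh, $archive ) = tempfile( UNLINK => 1 );
binmode $fh;
print {$fh} $payload;
close $fh or die "close failed: $!";
my $published = sha256_hex($payload);

print checksum_ok( $archive, $published )
    ? "checksum ok\n"
    : "checksum MISMATCH\n";
```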
     
    CPANPLUS::Module::Fake

    NAME

    CPANPLUS::Module::Fake - fake module object for internal use

    SYNOPSIS

        my $obj = CPANPLUS::Module::Fake->new(
                    module  => 'Foo',
                    path    => 'ftp/path/to/foo',
                    author  => CPANPLUS::Module::Author::Fake->new,
                    package => 'fake-1.1.tgz',
                    _id     => $cpan->_id,
                );

    DESCRIPTION

    A class for creating fake module objects, for shortcut use internally by CPANPLUS.

    Inherits from CPANPLUS::Module.

    METHODS

    new( module => $mod, path => $path, package => $pkg, [_id => DIGIT] )

    Creates a dummy module object from the above parameters. It can take more options (the same as CPANPLUS::Module->new), but the above are required.

     
    CPANPLUS::Module::Author::Fake

    NAME

    CPANPLUS::Module::Author::Fake - dummy author object for CPANPLUS

    SYNOPSIS

        my $auth = CPANPLUS::Module::Author::Fake->new(
                        author  => 'Foo Bar',
                        email   => 'luser@foo.com',
                        cpanid  => 'FOO',
                        _id     => $cpan->id,
                    );

    DESCRIPTION

    A class for creating fake author objects, for shortcut use internally by CPANPLUS.

    Inherits from CPANPLUS::Module::Author.

    METHODS

    new( _id => DIGIT )

    Creates a dummy author object. It can take the same options as CPANPLUS::Module::Author->new, but will fill in default ones if none are provided. Only the _id key is required.

     
    CPANPLUS::Internals::Extract

    NAME

    CPANPLUS::Internals::Extract - internals for archive extraction

    SYNOPSIS

        ### for source files ###
        $self->_gunzip( file => 'foo.gz', output => 'blah.txt' );
        ### for modules/packages ###
        $dir = $self->_extract( module      => $modobj,
                                extractdir  => '/some/where' );

    DESCRIPTION

    CPANPLUS::Internals::Extract extracts compressed files for CPANPLUS. It can do this either with a pure Perl solution (preferred), using Archive::Tar and Compress::Zlib, or with binaries such as gzip and tar.

    The flow looks like this:

        $cb->_extract
            Delegate to Archive::Extract

    METHODS

    $dir = _extract( module => $modobj, [perl => '/path/to/perl', extractdir => '/path/to/extract/to', prefer_bin => BOOL, verbose => BOOL, force => BOOL] )

    _extract will take a module object and extract it to extractdir if provided, or the default location which is obtained from your config.

    The file name is obtained by looking at $modobj->status->fetch and will be parsed to see if it's a tar or zip archive.

    If it's a zip archive, __unzip will be called, otherwise __untar will be called. In the unlikely event the file is of neither format, an error will be thrown.

    _extract takes the following options:

    • module

      A CPANPLUS::Module object. This is required.

    • extractdir

      The directory to extract the archive to. By default this looks something like: /CPANPLUS_BASE/PERL_VERSION/BUILD/MODULE_NAME

    • prefer_bin

      A flag indicating whether you prefer a binary solution, such as unzip and tar, over a pure Perl solution (Archive::Zip and Archive::Tar respectively).

    • perl

      The path to the perl executable to use for any perl calls. Also used to determine the build version directory for extraction.

    • verbose

      Specifies whether to be verbose or not. Defaults to your corresponding config entry.

    • force

      Specifies whether to force the extraction or not. Defaults to your corresponding config entry.

    All other options are passed on verbatim to __unzip or __untar.

    Returns the directory the file was extracted to on success and false on failure.
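The choice between zip and tar handling described for _extract comes down to inspecting the file name. A minimal sketch of that decision follows; archive_type is a hypothetical helper and the list of suffixes is an assumption, not the set CPANPLUS or Archive::Extract actually recognises.

```perl
use strict;
use warnings;

# Guess the archive type from a file name. archive_type() is a hypothetical
# helper; the suffix list is an assumption made for this sketch.
sub archive_type {
    my ($file) = @_;
    return 'zip' if $file =~ /\.zip$/i;
    return 'tar' if $file =~ /\.(?:tar|tgz|tar\.gz|tar\.bz2|tbz)$/i;
    return;    # neither format: the caller should report an error
}

for my $file (qw(Acme-Foo-1.3.tar.gz Acme-Foo-1.3.zip Acme-Foo-1.3.rar)) {
    my $type = archive_type($file);
    print "$file => ", ( defined $type ? $type : 'unsupported' ), "\n";
}
```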

     
    CPANPLUS::Internals::Fetch

    NAME

    CPANPLUS::Internals::Fetch - internals for fetching files

    SYNOPSIS

        my $output = $cb->_fetch(
                        module      => $modobj,
                        fetchdir    => '/path/to/save/to',
                        verbose     => BOOL,
                        force       => BOOL,
                    );
        $cb->_add_fail_host( host => 'foo.com' );
        $cb->_host_ok(       host => 'foo.com' );

    DESCRIPTION

    CPANPLUS::Internals::Fetch fetches files from either ftp, http, file or rsync mirrors.

    This is the rough flow:

        $cb->_fetch
            Delegate to File::Fetch;

    METHODS

    $path = _fetch( module => $modobj, [fetchdir => '/path/to/save/to', fetch_from => 'scheme://path/to/fetch/from', verbose => BOOL, force => BOOL, prefer_bin => BOOL, ttl => $seconds] )

    _fetch will fetch files based on the information in a module object. You always need a module object. If you want a fake module object for a one-off fetch, look at CPANPLUS::Module::Fake .

    fetchdir is the place to save the file to. Usually this information comes from your configuration, but you can override it expressly if needed.

    fetch_from lets you specify a URI to get this file from. If you do not specify one, your list of configured hosts will be probed for the file.

    force forces a new download, even if the file already exists.

    verbose simply indicates whether or not to print extra messages.

    prefer_bin indicates whether you prefer the use of commandline programs over perl modules. Defaults to your corresponding config setting.

    ttl (in seconds) indicates how long a cached copy is valid for. If the fetch time of the local copy is within the ttl, the cached copy is returned. Otherwise, the file is refetched.

    _fetch figures out, based on the host list, what scheme to use and from there delegates to File::Fetch to do the actual fetching.

    Returns the path of the output file on success, false on failure.

    Note that you can set a blacklist on certain methods in the config. Simply add the identifying name of the method (e.g. lwp) to the blacklist:

        $conf->_set_fetch( blacklist => ['lwp'] );

    The LWP function will then be skipped by File::Fetch.
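The ttl check described for _fetch can be pictured as a comparison between a file's age and the allowed lifetime. In the sketch below, cache_is_fresh is a hypothetical helper, and treating the file's modification time as its fetch time is an assumption; CPANPLUS may track fetch time differently.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# True when $file exists and was last modified less than $ttl seconds ago.
# cache_is_fresh() is a hypothetical helper written for this example.
sub cache_is_fresh {
    my ( $file, $ttl ) = @_;
    my $mtime = ( stat $file )[9];
    return 0 unless defined $mtime;    # no local copy at all
    return ( time - $mtime ) < $ttl ? 1 : 0;
}

my ( $fh, $cached ) = tempfile( UNLINK => 1 );
close $fh;

# Pretend the copy was fetched two hours ago.
my $two_hours_ago = time - 2 * 60 * 60;
utime $two_hours_ago, $two_hours_ago, $cached or die "utime failed: $!";

# With a one-day ttl the copy is reused; with a one-hour ttl it is refetched.
print cache_is_fresh( $cached, 24 * 60 * 60 ) ? "reuse\n" : "refetch\n";
print cache_is_fresh( $cached, 60 * 60 )      ? "reuse\n" : "refetch\n";
```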

    _add_fail_host( host => $host_hashref )

    Mark a particular host as bad. This makes CPANPLUS::Internals::Fetch skip it in fetches until this cache is flushed.

    _host_ok( host => $host_hashref )

    Query the cache to see if this host is ok, or if it has been flagged as bad.

    Returns true if the host is ok, false otherwise.
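The behaviour of _add_fail_host and _host_ok amounts to a small cache of hosts flagged as bad. The sketch below models that with a hash; the HostCache package and its methods are hypothetical, and it keys on plain host names for brevity even though the real methods take a host hashref.

```perl
use strict;
use warnings;

# A minimal in-memory "bad host" cache. The package and its methods are
# hypothetical; they mirror the described behaviour, not CPANPLUS code.
package HostCache;

sub new { return bless { failed => {} }, shift }

sub add_fail_host {
    my ( $self, $host ) = @_;
    $self->{failed}{$host} = 1;
    return 1;
}

sub host_ok {
    my ( $self, $host ) = @_;
    return $self->{failed}{$host} ? 0 : 1;
}

sub flush {
    my ($self) = @_;
    %{ $self->{failed} } = ();
    return 1;
}

package main;

my $cache = HostCache->new;
$cache->add_fail_host('foo.com');

# Only hosts that have not been flagged are tried.
my @mirrors = grep { $cache->host_ok($_) } qw(foo.com bar.org);
print "will try: @mirrors\n";    # prints "will try: bar.org"
```

Flushing the cache makes a previously flagged host eligible again.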

     
    CPANPLUS::Internals::Report

    NAME

    CPANPLUS::Internals::Report - internals for sending test reports

    SYNOPSIS

        ### enable test reporting
        $cb->configure_object->set_conf( cpantest => 1 );
        ### set custom mx host, shouldn't normally be needed
        $cb->configure_object->set_conf( cpantest_mx => 'smtp.example.com' );

    DESCRIPTION

    This module provides all the functionality to send test reports to http://testers.cpan.org using the Test::Reporter module.

    All methods will be called automatically if you have CPANPLUS configured to enable test reporting (see the SYNOPSIS).

    METHODS

    $bool = $cb->_have_query_report_modules

    This function checks if all the required modules are here for querying reports. It returns true and loads them if they are, or returns false otherwise.

    $bool = $cb->_have_send_report_modules

    This function checks if all the required modules are here for sending reports. It returns true and loads them if they are, or returns false otherwise.

    @list = $cb->_query_report( module => $modobj, [all_versions => BOOL, verbose => BOOL] )

    This function queries the CPAN testers database at http://testers.cpan.org/ for test results of specified module objects, module names or distributions.

    The optional argument all_versions controls whether all versions of a given distribution should be grabbed. It defaults to false (fetching only reports for the current version).

    Returns a list of the following data structures (for CPANPLUS version 0.042) on success, or false on failure. The contents of each data structure depend on what http://testers.cpan.org returns, but generally look like this:

        {
            'grade'    => 'PASS',
            'dist'     => 'CPANPLUS-0.042',
            'platform' => 'i686-pld-linux-thread-multi',
            'details'  => 'http://nntp.x.perl.org/group/perl.cpan.testers/98316'
            ...
        },
        {
            'grade'    => 'PASS',
            'dist'     => 'CPANPLUS-0.042',
            'platform' => 'i686-linux-thread-multi',
            'details'  => 'http://nntp.x.perl.org/group/perl.cpan.testers/99416'
            ...
        },
        {
            'grade'    => 'FAIL',
            'dist'     => 'CPANPLUS-0.042',
            'platform' => 'cygwin-multi-64int',
            'details'  => 'http://nntp.x.perl.org/group/perl.cpan.testers/99371'
            ...
        },
        {
            'grade'    => 'FAIL',
            'dist'     => 'CPANPLUS-0.042',
            'platform' => 'i586-linux',
            'details'  => 'http://nntp.x.perl.org/group/perl.cpan.testers/99396'
            ...
        },

    The status of the test can be one of the following: UNKNOWN, PASS, FAIL or NA (not applicable).
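A list of report hashes in this shape is easy to summarise. The sketch below works on a hand-written stand-in for the return value of _query_report, using the grades and platforms from the sample above and leaving out the elided keys; it does not query the testers database.

```perl
use strict;
use warnings;

# Stand-in for the list _query_report returns; values are copied from the
# sample above, with the elided keys left out.
my @reports = (
    { grade => 'PASS', dist => 'CPANPLUS-0.042', platform => 'i686-pld-linux-thread-multi' },
    { grade => 'PASS', dist => 'CPANPLUS-0.042', platform => 'i686-linux-thread-multi' },
    { grade => 'FAIL', dist => 'CPANPLUS-0.042', platform => 'cygwin-multi-64int' },
    { grade => 'FAIL', dist => 'CPANPLUS-0.042', platform => 'i586-linux' },
);

# Count reports per grade and collect the platforms that failed.
my %count;
$count{ $_->{grade} }++ for @reports;
my @failing = map { $_->{platform} } grep { $_->{grade} eq 'FAIL' } @reports;

printf "%-7s %d\n", $_, $count{$_} for sort keys %count;
print "failing on: @failing\n";
```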

    $bool = $cb->_send_report( module => $modobj, buffer => $make_output, failed => BOOL, [save => BOOL, address => $email_to, verbose => BOOL, force => BOOL]);

    This function sends a testers report to cpan-testers@perl.org for a particular distribution. It returns true on success, and false on failure.

    It takes the following options:

    • module

      The module object of this particular distribution

    • buffer

      The output buffer from the 'make/make test' process

    • failed

      Boolean indicating if the 'make/make test' went wrong

    • save

      Boolean indicating if the report should be saved locally instead of mailed out. If provided, this function will return the location the report was saved to, rather than a simple boolean 'TRUE'.

      Defaults to false.

    • address

      The email address to mail the report to. You should never need to override this, but it might be useful for debugging purposes.

      Defaults to cpan-testers@perl.org .

    • verbose

      Boolean indicating whether or not to be verbose.

      Defaults to your configuration settings.

    • force

      Boolean indicating whether to force sending the report, even if the maximum number of reports for failures has already been reached, or if you may already have sent this report before.

      Defaults to your configuration settings.

     
    CPANPLUS::Internals::Search

    NAME

    CPANPLUS::Internals::Search - internals for searching for modules

    SYNOPSIS

        my $aref = $cpan->_search_module_tree(
            type  => 'package',
            allow => [qr/DBI/],
        );

        my $aref = $cpan->_search_author_tree(
            type    => 'cpanid',
            data    => \@old_results,
            verbose => 1,
            allow   => [qw|KANE AUTRIJUS|],
        );

        my $aref = $cpan->_all_installed( );

    DESCRIPTION

    The functions in this module are designed to find module objects based on certain criteria and return them.

    METHODS

    _search_module_tree( type => TYPE, allow => \@regexes, [data => \@previous_results ] )

    Searches the moduletree for module objects matching the criteria you specify. Returns an array ref of module objects on success, and false on failure.

    It takes the following arguments:

    • type

      This can be any of the accessors for the CPANPLUS::Module objects. This is a required argument.

    • allow

      A set of rules, or more precisely, a list of regexes (via qr// or plain strings), that the type must adhere to. You can specify as many as you like, and they will be treated as an OR search. For an AND search, see the data argument.

      This is a required argument.

    • data

      An arrayref of previous search results. This is the way to do an AND search -- _search_module_tree will only search the module objects specified in data if provided, rather than the moduletree itself.
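
The OR and AND behaviour described above can be sketched without CPANPLUS itself. Everything here is an assumption made for illustration: plain hashes stand in for module objects, search_tree is a hypothetical helper, and plain strings are treated as exact matches.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for module objects: hash keys play the role of accessors.
my @tree = (
    { package => 'DBI-1.50.tar.gz',        author => 'TIMB' },
    { package => 'DBD-SQLite-1.14.tar.gz', author => 'MSERGEANT' },
    { package => 'CPANPLUS-0.042.tar.gz',  author => 'KANE' },
);

# A rule is either a qr// object or a plain string (assumed to mean exact match).
sub matches {
    my ( $value, $rule ) = @_;
    return ref $rule eq 'Regexp' ? scalar( $value =~ $rule ) : $value eq $rule;
}

# Keep every item whose $type field satisfies at least one rule (OR search).
# When 'data' is given, only those earlier results are searched (AND search).
sub search_tree {
    my %args = @_;
    my $pool = $args{data} || \@tree;
    my @hits;
    for my $item (@$pool) {
        my $value = $item->{ $args{type} };
        push @hits, $item if grep { matches( $value, $_ ) } @{ $args{allow} };
    }
    return \@hits;
}

my $first  = search_tree( type => 'package', allow => [ qr/DBI/, qr/DBD/ ] );
my $second = search_tree( type => 'author', allow => ['TIMB'], data => $first );

print scalar(@$first), " package match(es), ", scalar(@$second), " after narrowing by author\n";
```

Feeding the first result back in through data is what turns two OR searches into one AND search.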

    _search_author_tree( type => TYPE, allow => \@regexes, [data => \@previous_results ] )

    Searches the authortree for author objects matching the criteria you specify. Returns an array ref of author objects on success, and false on failure.

    It takes the following arguments:

    • type

      This can be any of the accessors for the CPANPLUS::Module::Author objects. This is a required argument.

    • allow

      A set of rules, or more precisely, a list of regexes (via qr// or plain strings), that the type must adhere to. You can specify as many as you like, and they will be treated as an OR search. For an AND search, see the data argument.

      This is a required argument.

    • data

      An arrayref of previous search results. This is the way to do an AND search -- _search_author_tree will only search the author objects specified in data if provided, rather than the authortree itself.

    _all_installed()

    This function returns an array ref of module objects of modules that are installed on this system.

     
    CPANPLUS::Internals::Source

    NAME

    CPANPLUS::Internals::Source - internals for updating source files

    SYNOPSIS

        ### lazy load author/module trees ###
        $cb->_author_tree;
        $cb->_module_tree;

    DESCRIPTION

    CPANPLUS::Internals::Source controls the updating of source files and the parsing of them into usable module/author trees to be used by CPANPLUS.

    Functions exist to check if source files are still good to use as well as update them, and then parse them.

    The flow looks like this:

        $cb->_author_tree || $cb->_module_tree
            $cb->_check_trees
                $cb->__check_uptodate
                    $cb->_update_source
                $cb->__update_custom_module_sources
                    $cb->__update_custom_module_source
            $cb->_build_trees
                ### engine methods
                {   $cb->_init_trees;
                    $cb->_standard_trees_completed
                    $cb->_custom_trees_completed
                }
                $cb->__create_author_tree
                    ### engine methods
                    { $cb->_add_author_object }
                $cb->__create_module_tree
                    $cb->__create_dslip_tree
                    ### engine methods
                    { $cb->_add_module_object }
                $cb->__create_custom_module_entries
        $cb->_dslip_defs

    METHODS

    $cb->_build_trees( uptodate => BOOL, [use_stored => BOOL, path => $path, verbose => BOOL] )

    This method rebuilds the author- and module-trees from source.

    It takes the following arguments:

    • uptodate

      Indicates whether any on-disk caches are still OK to use.

    • path

      The absolute path to the directory holding the source files.

    • verbose

      A boolean flag indicating whether or not to be verbose.

    • use_stored

      A boolean flag indicating whether or not it is ok to use previously stored trees. Defaults to true.

    Returns a boolean indicating success.

    $cb->_check_trees( [update_source => BOOL, path => PATH, verbose => BOOL] )

    Retrieve source files and return a boolean indicating whether or not the source files are up to date.

    Takes several arguments:

    • update_source

      A flag to force re-fetching of the source files, even if they are still up to date.

    • path

      The absolute path to the directory holding the source files.

    • verbose

      A boolean flag indicating whether or not to be verbose.

    Will get information from the config file by default.

    $cb->__check_uptodate( file => $file, name => $name, [update_source => BOOL, verbose => BOOL] )

    __check_uptodate checks if a given source file is still up-to-date and if not, or when update_source is true, will re-fetch the source file.

    Takes the following arguments:

    • file

      The source file to check.

    • name

      The internal shortcut name for the source file (used for config lookups).

    • update_source

      Flag to force updating of the source files regardless of whether they are up to date.

    • verbose

      Boolean to indicate whether to be verbose or not.

    Returns a boolean value indicating whether the current files are up to date or not.
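
This document does not say how freshness is decided. The sketch below assumes an age-based rule (modification time compared against a time-to-live); is_uptodate and its ttl argument are hypothetical and only illustrate how a forced refresh overrides the check.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: a file counts as up to date when it exists and was
# modified less than 'ttl' seconds ago, unless a refresh is forced.
sub is_uptodate {
    my %args = @_;
    return 0 if $args{update_source};          # forced re-fetch wins
    my $mtime = ( stat $args{file} )[9];
    return 0 unless defined $mtime;            # missing file is never fresh
    return ( time() - $mtime ) < $args{ttl} ? 1 : 0;
}

my ( $fh, $file ) = tempfile( UNLINK => 1 );

print is_uptodate( file => $file, ttl => 3600 )
    ? "fresh\n" : "stale\n";
print is_uptodate( file => $file, ttl => 3600, update_source => 1 )
    ? "fresh\n" : "stale\n";
```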

    $cb->_update_source( name => $name, [path => $path, verbose => BOOL] )

    This method does the actual fetching of source files.

    It takes the following arguments:

    • name

      The internal shortcut name for the source file (used for config lookups).

    • path

      The full path to the directory in which to write the files.

    • verbose

      Boolean to indicate whether to be verbose or not.

    Returns a boolean to indicate success.

    $cb->__create_author_tree([path => $path, uptodate => BOOL, verbose => BOOL])

    This method opens a source file and parses its contents into a searchable author-tree, or restores a file-cached version of a previous parse if the sources are up to date and the file-cache exists.

    It takes the following arguments:

    • uptodate

      A flag indicating whether the file-cache is up-to-date or not.

    • path

      The absolute path to the directory holding the source files.

    • verbose

      A boolean flag indicating whether or not to be verbose.

    Will get information from the config file by default.

    Returns a tree on success, false on failure.

    $cb->_create_mod_tree([path => $path, uptodate => BOOL, verbose => BOOL])

    This method opens a source file and parses its contents into a searchable module-tree, or restores a file-cached version of a previous parse if the sources are up to date and the file-cache exists.

    It takes the following arguments:

    • uptodate

      A flag indicating whether the file-cache is up-to-date or not.

    • path

      The absolute path to the directory holding the source files.

    • verbose

      A boolean flag indicating whether or not to be verbose.

    Will get information from the config file by default.

    Returns a tree on success, false on failure.

    $cb->__create_dslip_tree([path => $path, uptodate => BOOL, verbose => BOOL])

    This method opens a source file and parses its contents into a searchable dslip-tree, or restores a file-cached version of a previous parse if the sources are up to date and the file-cache exists.

    It takes the following arguments:

    • uptodate

      A flag indicating whether the file-cache is up-to-date or not.

    • path

      The absolute path to the directory holding the source files.

    • verbose

      A boolean flag indicating whether or not to be verbose.

    Will get information from the config file by default.

    Returns a tree on success, false on failure.

    $cb->_dslip_defs ()

    This function returns the definition structure (ARRAYREF) of the dslip tree.

    $file = $cb->_add_custom_module_source( uri => URI, [verbose => BOOL] );

    Adds a custom source index and updates it based on the provided URI.

    Returns the full path to the index file on success or false on failure.

    $index = $cb->__custom_module_source_index_file( uri => $uri );

    Returns the full path to the encoded index file for $uri, as used by all custom module source routines.

    $file = $cb->_remove_custom_module_source( uri => URI, [verbose => BOOL] );

    Removes a custom index file based on the URI provided.

    Returns the full path to the index file on success or false on failure.

    %files = $cb->__list_custom_module_sources

    This method scans the 'custom-sources' directory in your base directory for additional sources to include in your module tree.

    Returns a list of key value pairs as follows:

        /full/path/to/source/file%3Fencoded => http://decoded/mirror/path
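
The key/value pairs above pair an encoded file name with its decoded URI. A minimal sketch of such a mapping, assuming plain percent-encoding (the exact encoding CPANPLUS uses is not stated here, and both helper names are hypothetical):

```perl
use strict;
use warnings;

# Percent-encode everything outside a conservative safe set so that a URI
# can serve as a single file name (hypothetical helper).
sub uri_to_filename {
    my $uri = shift;
    ( my $name = $uri ) =~ s/([^A-Za-z0-9_.~-])/sprintf '%%%02X', ord $1/ge;
    return $name;
}

# Reverse the encoding to recover the original URI (hypothetical helper).
sub filename_to_uri {
    my $name = shift;
    ( my $uri = $name ) =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
    return $uri;
}

my $uri  = 'http://example.com/a?b';    # made-up mirror address
my $name = uri_to_filename($uri);

print "$name => ", filename_to_uri($name), "\n";
```

Encoding the whole URI keeps path separators out of the file name, so each custom source maps to exactly one file in the directory.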

    $bool = $cb->__update_custom_module_sources( [verbose => BOOL] );

    Attempts to update all the index files to your custom module sources.

    If the index is missing, and it's a file:// uri, it will generate a new local index for you.

    Returns true on success, false on failure.

    $ok = $cb->__update_custom_module_source

    Attempts to update all the index files to your custom module sources.

    If the index is missing, and it's a file:// uri, it will generate a new local index for you.

    Returns true on success, false on failure.

    $bool = $cb->__write_custom_module_index( path => /path/to/packages, [to => /path/to/index/file, verbose => BOOL] )

    Scans the path you provided for packages and writes an index with all the available packages to $path/packages.txt. If you'd like the index to be written to a different file, provide the to argument.

    Returns true on success and false on failure.

    $bool = $cb->__create_custom_module_entries( [verbose => BOOL] )

    Creates entries in the module tree based upon the files as returned by __list_custom_module_sources.

    Returns true on success, false on failure.

     
    CPANPLUS::Internals::Utils

    NAME

    CPANPLUS::Internals::Utils - convenience functions for CPANPLUS

    SYNOPSIS

        my $bool = $cb->_mkdir( dir => 'blah' );
        my $bool = $cb->_chdir( dir => 'blah' );
        my $bool = $cb->_rmdir( dir => 'blah' );

        my $bool = $cb->_move( from => '/some/file', to => '/other/file' );
        my $bool = $cb->_move( from => '/some/dir', to => '/other/dir' );

        my $cont = $cb->_get_file_contents( file => '/path/to/file' );

        my $version = $cb->_perl_version( perl => $^X );

    DESCRIPTION

    CPANPLUS::Internals::Utils holds a few convenience functions for CPANPLUS libraries.

    METHODS

    $cb->_mkdir( dir => '/some/dir' )

    _mkdir creates a full path to a directory.

    Returns true on success, false on failure.

    $cb->_chdir( dir => '/some/dir' )

    _chdir changes the current working directory to the given dir.

    Returns true on success, false on failure.

    $cb->_rmdir( dir => '/some/dir' );

    Removes a directory completely, even if it is non-empty.

    Returns true on success, false on failure.

    $cb->_perl_version ( perl => 'some/perl/binary' );

    _perl_version returns the version of a certain perl binary. It does this by actually running a command.

    Returns the perl version on success and false on failure.
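
Asking a perl binary for its version by running it can be sketched as follows. The command line used here is an assumption (the command CPANPLUS actually runs is not shown in this document), and perl_version is a hypothetical stand-in.

```perl
use strict;
use warnings;

# Run the given perl binary and let it print its own version
# (hypothetical helper; the one-liner is an assumption).
sub perl_version {
    my %args = @_;
    my $perl = $args{perl} or return;

    # List-form pipe open avoids going through a shell.
    open my $pipe, '-|', $perl, '-e', 'printf q{%vd}, $^V' or return;
    my $version = <$pipe>;
    close $pipe or return;    # false if the child exited with an error

    return $version;
}

my $version = perl_version( perl => $^X );
print "running perl $version\n" if $version;
```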

    $cb->_version_to_number( version => $version );

    Returns a proper module version, or '0.0' if none was available.

    $cb->_whoami

    Returns the name of the subroutine you're currently in.

    _get_file_contents( file => $file );

    Returns the contents of a file.

    $cb->_move( from => $file|$dir, to => $target );

    Moves a file or directory to the target.

    Returns true on success, false on failure.

    $cb->_copy( from => $file|$dir, to => $target );

    Copies a file or directory to the target.

    Returns true on success, false on failure.

    $cb->_mode_plus_w( file => '/path/to/file' );

    Sets the +w bit for the file.

    Returns true on success, false on failure.
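
Setting the +w bit amounts to OR-ing the owner-write bit into the file's existing mode. A sketch under that assumption, with mode_plus_w as a hypothetical stand-in for the method:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Add the owner-write bit (0200) to a file's current permissions
# (hypothetical helper mirroring the description above).
sub mode_plus_w {
    my %args = @_;
    my $file = $args{file};
    my $mode = ( stat $file )[2];
    return 0 unless defined $mode;
    return chmod( ( $mode & 07777 ) | 0200, $file ) ? 1 : 0;
}

my ( $fh, $file ) = tempfile( UNLINK => 1 );
chmod 0400, $file;                  # start out read-only
my $ok = mode_plus_w( file => $file );

printf "ok=%d mode=%04o\n", $ok, ( stat $file )[2] & 07777;
```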

    $uri = $cb->_host_to_uri( scheme => SCHEME, host => HOST, path => PATH );

    Turns a CPANPLUS::Config style host entry into a URI string.

    Returns the URI on success, and false on failure.

    $cb->_vcmp( VERSION, VERSION );

    Normalizes the versions passed and does a '<=>' on them, returning the result.
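
A sketch of "normalize, then compare" using the core version module. The normalization rules below (missing versions become '0.0', underscores are dropped) are assumptions for illustration, and vcmp is a hypothetical stand-in for the method.

```perl
use strict;
use warnings;
use version;

# Turn a raw version string into a version object (assumed rules).
sub normalize_version {
    my $v = shift;
    $v = '0.0' unless defined $v && length $v;    # nothing available
    $v =~ s/_//g;                                 # drop developer-release underscores
    return version->parse($v);
}

# Returns -1, 0 or 1, like the built-in <=> operator.
sub vcmp {
    my ( $x, $y ) = @_;
    return normalize_version($x) <=> normalize_version($y);
}

printf "%s vs %s: %d\n", '0.042', '0.050', vcmp( '0.042', '0.050' );
```

Because the result follows <=> semantics, such a function can be used directly as a sort comparator.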

    $cb->_home_dir

    Returns the user's home directory, or the current working directory if it could not be found.

    $path = $cb->_safe_path( path => $path );

    Returns a path that's safe to use on Win32 and VMS.

    Only cleans up the path on Win32 if the path exists.

    On VMS, it encodes dots to _ using VMS::Filespec::vmsify.

    ($pkg, $version, $ext) = $cb->_split_package_string( package => PACKAGE_STRING );

    Splits the name of a CPAN package string up into its package, version and extension parts.

    For example, Foo-Bar-1.2.tar.gz would return the following parts:

        Package:    Foo-Bar
        Version:    1.2
        Extension:  tar.gz
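
The split shown above can be approximated with a single regex. This is a sketch only: the pattern and the list of recognised extensions are assumptions, and split_package_string is a hypothetical stand-in for the method.

```perl
use strict;
use warnings;

# Split 'Name-Version.ext' into its three parts (hypothetical helper).
sub split_package_string {
    my %args   = @_;
    my $string = $args{package};
    return unless defined $string;

    my ( $pkg, $version, $ext ) = $string =~ m{
        ^ (.+) -                               # package name, up to the dash before the version
        ( v? \d [\d._]* )                      # version
        \. ( tar\.gz | tar\.bz2 | tgz | zip )  # archive extension (assumed list)
        $
    }x or return;

    return ( $pkg, $version, $ext );
}

my ( $pkg, $version, $ext ) = split_package_string( package => 'Foo-Bar-1.2.tar.gz' );
print "Package: $pkg\nVersion: $version\nExtension: $ext\n";
```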
     
    CPANPLUS::Internals::Source::Memory

    NAME

    CPANPLUS::Internals::Source::Memory - In memory implementation

    $cb->__memory_retrieve_source(name => $name, [path => $path, uptodate => BOOL, verbose => BOOL])

    This method retrieves a tree stored in Storable format, identified by $name.

    It takes the following arguments:

    • name

      The internal name for the source file to retrieve.

    • uptodate

      A flag indicating whether the file-cache is up-to-date or not.

    • path

      The absolute path to the directory holding the source files.

    • verbose

      A boolean flag indicating whether or not to be verbose.

    Will get information from the config file by default.

    Returns a tree on success, false on failure.

    $cb->__memory_save_source([verbose => BOOL, path => $path])

    This method saves all the parsed trees in storabled format if Storable is available.

    It takes the following arguments:

    • path

      The absolute path to the directory holding the source files.

    • verbose

      A boolean flag indicating whether or not to be verbose.

    Will get information from the config file by default.

    Returns true on success, false on failure.
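
Saving and restoring a tree with Storable follows a simple pattern. The file name and tree contents below are made up for illustration; only the Storable calls themselves are real.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempdir);
use File::Spec;

my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'authortree.stored' );    # hypothetical file name

# A tiny stand-in for a parsed author tree.
my %author_tree = (
    KANE => { author => 'Jos Boumans', email => 'kane@cpan.org' },
);

# nstore writes in network byte order, so the cache survives a move
# between machines with different architectures.
nstore( \%author_tree, $file ) or die "could not store tree: $!";

my $restored = retrieve($file);
print $restored->{KANE}{author}, "\n";
```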

     
    CPANPLUS::Internals::Source::SQLite

    NAME

    CPANPLUS::Internals::Source::SQLite - SQLite implementation

    CPANPLUS::Dist::Autobundle

    NAME

    CPANPLUS::Dist::Autobundle - distribution class for installation snapshots

    SYNOPSIS

        $modobj = $cb->parse_module( module => 'file://path/to/Snapshot_XXYY.pm' );
        $modobj->install;

    DESCRIPTION

    CPANPLUS::Dist::Autobundle is a distribution class for installing installation snapshots as created by CPANPLUS' autobundle command.

    All modules as mentioned in the snapshot will be installed on your system.

     
    CPANPLUS::Dist::Base

    NAME

    CPANPLUS::Dist::Base - Base class for custom distribution classes

    SYNOPSIS

        package CPANPLUS::Dist::MY_IMPLEMENTATION;

        use base 'CPANPLUS::Dist::Base';

        sub prepare {
            my $dist = shift;

            ### do the 'standard' things
            $dist->SUPER::prepare( @_ ) or return;

            ### do MY_IMPLEMENTATION specific things
            ...

            ### don't forget to set the status!
            return $dist->status->prepared( $SUCCESS ? 1 : 0 );
        }

    DESCRIPTION

    CPANPLUS::Dist::Base functions as a base class for all custom distribution implementations. It does all the mundane work CPANPLUS would have done without a custom distribution, so you can override just the parts you need to make your own implementation work.

    FLOW

    Below is a brief outline of when and in which order methods in this class are called:

        $Class->format_available;   # can we use this class on this system?

        $dist->init;                # set up custom accessors, etc
        $dist->prepare;             # find/write meta information
        $dist->create;              # write the distribution file
        $dist->install;             # install the distribution file

        $dist->uninstall;           # remove the distribution (OPTIONAL)

    METHODS

    @subs = $Class->methods

    Returns a list of methods that this class implements that you can override.

    $bool = $Class->format_available

    This method is called when someone requests a module to be installed via the superclass. This gives you the opportunity to check if all the needed requirements to build and install this distribution have been met.

    For example, you might need a command line program, or a certain perl module installed to do your job. Now is the time to check.

    Simply return true if the request can proceed and false if it can not.

    The CPANPLUS::Dist::Base implementation always returns true.

    $bool = $dist->init

    This method is called just after the new dist object is set up and before the prepare method is called. This is the time to set up the object so it can be used with your class.

    For example, you might want to add extra accessors to the status object, which you might do as follows:

        $dist->status->mk_accessors( qw[my_implementation_accessor] );

    The status object is implemented as an instance of the Object::Accessor class. Please refer to its documentation for details.

    Return true if the initialization was successful, and false if it was not.

    The CPANPLUS::Dist::Base implementation does not alter your object and always returns true.

    $bool = $dist->prepare

    This runs the preparation step of your distribution. This step is meant to set up the environment so the create step can create the actual distribution (file). A prepare call in the standard ExtUtils::MakeMaker distribution would, for example, run perl Makefile.PL to find the dependencies for a distribution. For a Debian distribution, this is where you would write all the metafiles required for the dpkg-* tools.

    The CPANPLUS::Dist::Base implementation simply calls the underlying distribution class (typically CPANPLUS::Dist::MM or CPANPLUS::Dist::Build).

    Sets $dist->status->prepared to the return value of this function. If you override this method, you should make sure to set this value.
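
The override pattern (call the parent, do your own work, then record the result in the status object) can be tried out without CPANPLUS. All three packages below are hypothetical stand-ins written for this sketch; a real subclass would inherit from CPANPLUS::Dist::Base instead.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the status object: a single plain accessor.
package My::Status;
sub new      { return bless {}, shift }
sub prepared { my $s = shift; $s->{prepared} = shift if @_; return $s->{prepared} }

# Hypothetical stand-in for the base class.
package My::Base;
sub new     { return bless { status => My::Status->new }, shift }
sub status  { return $_[0]{status} }
sub prepare { my $dist = shift; return $dist->status->prepared(1) }

# The custom implementation overrides prepare.
package My::Dist;
our @ISA = ('My::Base');

sub prepare {
    my $dist = shift;

    # Do the 'standard' things first; give up if they fail.
    $dist->SUPER::prepare(@_) or return;

    my $ok = 1;    # implementation-specific work would go here

    # Record the outcome so later steps can check it.
    return $dist->status->prepared( $ok ? 1 : 0 );
}

package main;

my $dist = My::Dist->new;
print "prepared\n" if $dist->prepare;
```

Returning the value of the status setter keeps the method's return value and the recorded status in sync.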

    $bool = $dist->create

    This runs the creation step of your distribution. This step is meant to follow up on the prepare call, which set up your environment so the create step can create the actual distribution (file). A create call in the standard ExtUtils::MakeMaker distribution would, for example, run make and make test to build and test a distribution. For a Debian distribution, this is where you would create the actual .deb file using dpkg.

    The CPANPLUS::Dist::Base implementation simply calls the underlying distribution class (typically CPANPLUS::Dist::MM or CPANPLUS::Dist::Build).

    Sets $dist->status->dist to the location of the created distribution. If you override this method, you should make sure to set this value.

    Sets $dist->status->created to the return value of this function. If you override this method, you should make sure to set this value.

    $bool = $dist->install

    This runs the install step of your distribution. This step is meant to follow up on the create call, which prepared a distribution (file) to install. An install call in the standard ExtUtils::MakeMaker distribution would, for example, run make install to copy the distribution files to their final destination. For a Debian distribution, this is where you would run dpkg --install on the created .deb file.

    The CPANPLUS::Dist::Base implementation simply calls the underlying distribution class (typically CPANPLUS::Dist::MM or CPANPLUS::Dist::Build).

    Sets $dist->status->installed to the return value of this function. If you override this method, you should make sure to set this value.

    $bool = $dist->uninstall

    This runs the uninstall step of your distribution. This step is meant to remove the distribution from the file system. An uninstall call in the standard ExtUtils::MakeMaker distribution would, for example, run make uninstall to remove the distribution files from the file system. For a Debian distribution, this is where you would run dpkg --remove PACKAGE.

    The CPANPLUS::Dist::Base implementation simply calls the underlying distribution class (typically CPANPLUS::Dist::MM or CPANPLUS::Dist::Build).

    Sets $dist->status->uninstalled to the return value of this function. If you override this method, you should make sure to set this value.

     
    CPANPLUS::Dist::Build

    NAME

    CPANPLUS::Dist::Build - CPANPLUS plugin to install packages that use Build.PL

    SYNOPSIS

        my $build = CPANPLUS::Dist->new(
            format => 'CPANPLUS::Dist::Build',
            module => $modobj,
        );

        $build->prepare;    # runs Build.PL
        $build->create;     # runs build && build test
        $build->install;    # runs build install

    DESCRIPTION

    CPANPLUS::Dist::Build is a distribution class for Module::Build related modules. Using this package, you can create, install and uninstall perl modules. It inherits from CPANPLUS::Dist.

    Normal users won't have to worry about the interface to this module, as it functions transparently as a plug-in to CPANPLUS and will just Do The Right Thing when it's loaded.

    ACCESSORS

    • parent()

      Returns the CPANPLUS::Module object that parented this object.

    • status()

      Returns the Object::Accessor object that keeps the status for this module.

    STATUS ACCESSORS

    All accessors can be accessed as follows: $build->status->ACCESSOR

    • build_pl ()

      Location of the Build file. Set to 0 explicitly if something went wrong.

    • build ()

      BOOL indicating if the Build command was successful.

    • test ()

      BOOL indicating if the Build test command was successful.

    • prepared ()

      BOOL indicating if the prepare call exited successfully. This gets set after perl Build.PL.

    • distdir ()

      Full path to the directory in which the prepare call took place, set after a call to prepare.

    • created ()

      BOOL indicating if the create call exited successfully. This gets set after Build and Build test.

    • installed ()

      BOOL indicating if the module was installed. This gets set after Build install exits successfully.

    • uninstalled ()

      BOOL indicating if the module was uninstalled properly.

    • _create_args ()

      Storage of the arguments passed to create for this object. Used for recursive calls when satisfying prerequisites.

    • _install_args ()

      Storage of the arguments passed to install for this object. Used for recursive calls when satisfying prerequisites.

    METHODS

    $bool = CPANPLUS::Dist::Build->format_available();

    Returns a boolean indicating whether or not you can use this package to create and install modules in your environment.

    $bool = $dist->init();

    Sets up the CPANPLUS::Dist::Build object for use. Effectively creates all the needed status accessors.

    Called automatically whenever you create a new CPANPLUS::Dist object.

    $bool = $dist->prepare([perl => '/path/to/perl', buildflags => 'EXTRA=FLAGS', force => BOOL, verbose => BOOL])

    prepare prepares a distribution, running Build.PL and establishing any prerequisites this distribution has.

    The variable PERL5_CPANPLUS_IS_EXECUTING will be set to the full path of the Build.PL that is being executed. This enables any code inside the Build.PL to know that it is being installed via CPANPLUS.
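
Inside a Build.PL, the variable can be read like any other environment variable. A minimal sketch: cpanplus_build_script is a hypothetical helper, and the path assigned below is made up to simulate what CPANPLUS would set.

```perl
use strict;
use warnings;

# Returns the path of the build script CPANPLUS is executing, or undef
# when the script was started some other way (hypothetical helper).
sub cpanplus_build_script {
    return $ENV{PERL5_CPANPLUS_IS_EXECUTING};
}

{
    # Simulate the environment CPANPLUS would provide; the path is made up.
    local $ENV{PERL5_CPANPLUS_IS_EXECUTING} = '/tmp/Foo-Bar-1.2/Build.PL';

    if ( my $script = cpanplus_build_script() ) {
        print "being installed via CPANPLUS, running $script\n";
    }
}
```

Using local keeps the simulated value confined to the block, so the rest of the program sees the real environment again.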

    After a successful prepare you may call create to create the distribution, followed by install to actually install it.

    Returns true on success and false on failure.

    $dist->create([perl => '/path/to/perl', buildflags => 'EXTRA=FLAGS', prereq_target => TARGET, force => BOOL, verbose => BOOL, skiptest => BOOL])

    create preps a distribution for installation. This means it will run Build and Build test. This will also satisfy any prerequisites the module may have.

    If you set skiptest to true, it will skip the Build test stage. If you set force to true, it will go over all the stages of the Build process again, ignoring any previously cached results. It will also ignore a bad return value from Build test and still allow the operation to return true.

    Returns true on success and false on failure.

    You may then call $dist->install on the object to actually install it.

    $dist->install([verbose => BOOL, perl => /path/to/perl])

    Actually installs the created dist.

    Returns true on success and false on failure.

    AUTHOR

    Originally by Jos Boumans <kane@cpan.org>. Brought to working condition by Ken Williams <kwilliams@cpan.org>.

    Other hackery by, and currently maintained by, Chris BinGOs Williams (no relation) <bingos@cpan.org>.

    LICENSE

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001, 2002, 2003, 2004, 2005 Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    CPANPLUS::Dist::MM

    NAME

    CPANPLUS::Dist::MM - distribution class for MakeMaker related modules

    SYNOPSIS

        $mm = CPANPLUS::Dist::MM->new( module => $modobj );
        $mm->create;        # runs make && make test
        $mm->install;       # runs make install

    DESCRIPTION

    CPANPLUS::Dist::MM is a distribution class for MakeMaker related modules. Using this package, you can create, install and uninstall perl modules. It inherits from CPANPLUS::Dist.

    ACCESSORS

    • parent()

      Returns the CPANPLUS::Module object that parented this object.

    • status()

      Returns the Object::Accessor object that keeps the status for this module.

    STATUS ACCESSORS

    All accessors can be accessed as follows: $mm->status->ACCESSOR

    • makefile ()

      Location of the Makefile (or Build file). Set to 0 explicitly if something went wrong.

    • make ()

      BOOL indicating if the make (or Build) command was successful.

    • test ()

      BOOL indicating if the make test (or Build test) command was successful.

    • prepared ()

      BOOL indicating if the prepare call exited successfully. This gets set after perl Makefile.PL.

    • distdir ()

      Full path to the directory in which the prepare call took place, set after a call to prepare.

    • created ()

      BOOL indicating if the create call exited successfully. This gets set after make and make test.

    • installed ()

      BOOL indicating if the module was installed. This gets set after make install (or Build install) exits successfully.

    • uninstalled ()

      BOOL indicating if the module was uninstalled properly.

    • _create_args ()

      Storage of the arguments passed to create for this object. Used for recursive calls when satisfying prerequisites.

    • _install_args ()

      Storage of the arguments passed to install for this object. Used for recursive calls when satisfying prerequisites.

    METHODS

    $bool = $dist->format_available();

    Returns a boolean indicating whether or not you can use this package to create and install modules in your environment.

    $bool = $dist->init();

    Sets up the CPANPLUS::Dist::MM object for use. Effectively creates all the needed status accessors.

    Called automatically whenever you create a new CPANPLUS::Dist object.

    $bool = $dist->prepare([perl => '/path/to/perl', makemakerflags => 'EXTRA=FLAGS', force => BOOL, verbose => BOOL])

    prepare preps a distribution for installation. This means it will run perl Makefile.PL and determine what prerequisites this distribution declared.

    If you set force to true, it will go over all the stages of the prepare process again, ignoring any previously cached results.

    When running perl Makefile.PL, the environment variable PERL5_CPANPLUS_IS_EXECUTING will be set to the full path of the Makefile.PL that is being executed. This enables any code inside the Makefile.PL to know that it is being installed via CPANPLUS.

    Returns true on success and false on failure.

    You may then call $dist->create on the object to create the installable files.
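
    The PERL5_CPANPLUS_IS_EXECUTING check described above could be sketched as follows from inside a Makefile.PL. The helper name running_under_cpanplus is hypothetical and only illustrates reading the variable; it is not part of CPANPLUS.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CPANPLUS): returns the path stored in
# PERL5_CPANPLUS_IS_EXECUTING, or undef when the variable is unset or empty.
sub running_under_cpanplus {
    my $path = $ENV{PERL5_CPANPLUS_IS_EXECUTING};
    return unless defined $path && length $path;
    return $path;
}

my $makefile_pl = running_under_cpanplus();
if ( defined $makefile_pl ) {
    print "Running under CPANPLUS: $makefile_pl\n";
}
else {
    print "Not running under CPANPLUS\n";
}
```

    A Makefile.PL might use such a check to skip interactive prompts when an installer is driving the build.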

    $href = $dist->_find_prereqs( file => '/path/to/Makefile', [verbose => BOOL])

    Parses a Makefile for PREREQ_PM entries and distills from them any prerequisites mentioned in the Makefile.

    Returns a hash with module-version pairs on success and false on failure.
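
    As a rough illustration of this kind of parsing, the sketch below pulls module-version pairs out of a PREREQ_PM comment line. It assumes the generated Makefile carries a line of the form "#     PREREQ_PM => { Module=>q[VERSION], ... }"; the function name find_prereqs_sketch is hypothetical, and unlike the method above it always returns a hashref.

```perl
use strict;
use warnings;

# Sketch only: the line format matched here is an assumption about what
# ExtUtils::MakeMaker writes into a Makefile, and find_prereqs_sketch is
# not the CPANPLUS implementation.
sub find_prereqs_sketch {
    my ($makefile_text) = @_;
    my %prereqs;

    for my $line ( split /\n/, $makefile_text ) {
        # Only the PREREQ_PM comment line is of interest.
        next unless $line =~ /^\#\s+PREREQ_PM\s*=>\s*\{(.*)\}/;
        my $body = $1;

        # Each entry is assumed to look like Module::Name=>q[VERSION].
        while ( $body =~ /([\w:]+)\s*=>\s*q\[(.*?)\]/g ) {
            $prereqs{$1} = $2;
        }
        last;
    }
    return \%prereqs;
}

my $makefile = <<'END';
# MakeMaker Parameters:
#     NAME => q[Foo::Bar]
#     PREREQ_PM => { Test::More=>q[0.47], File::Spec=>q[0] }
END

my $prereqs = find_prereqs_sketch($makefile);
print "$_ => $prereqs->{$_}\n" for sort keys %$prereqs;
```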

    $bool = $dist->create([perl => '/path/to/perl', make => '/path/to/make', makeflags => 'EXTRA=FLAGS', prereq_target => TARGET, skiptest => BOOL, force => BOOL, verbose => BOOL])

    create creates the files necessary for installation. This means it will run make and make test. This will also scan for and attempt to satisfy any prerequisites the module may have.

    If you set skiptest to true, it will skip the make test stage. If you set force to true, it will go over all the stages of the make process again, ignoring any previously cached results. It will also ignore a bad return value from make test and still allow the operation to return true.

    Returns true on success and false on failure.

    You may then call $dist->install on the object to actually install it.

    $bool = $dist->install([make => '/path/to/make', makemakerflags => 'EXTRA=FLAGS', force => BOOL, verbose => BOOL])

    install runs the following command: make install

    Returns true on success, false on failure.

    $bool = $dist->write_makefile_pl([force => BOOL, verbose => BOOL])

    This routine can write a Makefile.PL from the information in a module object. It is used to write a Makefile.PL when the original author forgot it (!!).

    Returns 1 on success and false on failure.

    The file gets written to the directory the module's been extracted to.

     
    CPANPLUS::Dist::Sample

    NAME

    CPANPLUS::Dist::Sample -- Sample code to create your own Dist::* plugin

    Description.

    This document is obsolete. Please read the documentation and code in CPANPLUS::Dist::Base.

    CPANPLUS::Dist::Build::Constants

    NAME

    CPANPLUS::Dist::Build::Constants - Constants for CPANPLUS::Dist::Build

    SYNOPSIS

        use CPANPLUS::Dist::Build::Constants;

    DESCRIPTION

    CPANPLUS::Dist::Build::Constants provides some constants required by CPANPLUS::Dist::Build.

    AUTHOR

    Originally by Jos Boumans <kane@cpan.org>. Brought to working condition and currently maintained by Ken Williams <kwilliams@cpan.org>.

    LICENSE

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001, 2002, 2003, 2004, 2005 Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    CPANPLUS::Backend::RV

    NAME

    CPANPLUS::Backend::RV - return value objects

    SYNOPSIS

        ### create a CPANPLUS::Backend::RV object
        $backend_rv = CPANPLUS::Backend::RV->new(
                          ok       => $boolean,
                          args     => $args,
                          rv       => $return_value,
                          function => $calling_function );

        ### if you have a CPANPLUS::Backend::RV object
        $passed_args = $backend_rv->args;      # args passed to function
        $ok          = $backend_rv->ok;        # boolean indication overall
                                               # result of the call
        $function    = $backend_rv->function;  # name of the calling
                                               # function
        $rv          = $backend_rv->rv;        # the actual return value
                                               # of the calling function

    DESCRIPTION

    This module provides return value objects for multi-module calls to CPANPLUS::Backend. In boolean context, an object evaluates to the status of the overall result (i.e., the same value the ok method would return).
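
    The boolean-context behaviour can be pictured with Perl's overload pragma. The package My::RV below is a hypothetical stand-in written for illustration; it is not the CPANPLUS::Backend::RV source, and how that module implements the behaviour is not stated here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a return value object.
package My::RV;

use overload
    bool     => sub { $_[0]->ok },    # boolean context delegates to ok()
    fallback => 1;

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

# Simple read-only accessors for the stored fields.
sub ok   { $_[0]{ok} }
sub rv   { $_[0]{rv} }
sub args { $_[0]{args} }

package main;

my $result = My::RV->new(
    ok   => 0,
    args => { modules => [ 'Foo', 'Bar' ] },
    rv   => { Foo => 1, Bar => 0 },
);

# The object itself is false because the overall result failed, while the
# per-module details remain available through rv().
my $summary = $result ? "all calls succeeded" : "at least one call failed";
print "$summary\n";
```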

    METHODS

    new( ok => BOOL, args => DATA, rv => DATA, [function => $method_name] )

    Creates a new CPANPLUS::Backend::RV object from the data provided. This method should only be called by CPANPLUS::Backend functions. The accessors may be used by users inspecting an RV object.

    All the argument names can be used as accessors later to retrieve the data.

    Arguments:

    • ok

      Boolean indicating overall success

    • args

      The arguments provided to the function that returned this rv object. Useful to inspect later to see what was actually passed to the function in case of an error.

    • rv

      An arbitrary data structure that has the detailed return values of each of your multi-module calls.

    • function

      The name of the function that created this rv object. Can be explicitly passed. If not, new() will try to deduce the name from caller() information.

    BUG REPORTS

    Please report bugs or other issues to <bug-cpanplus@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    The CPAN++ interface (of which this module is a part) is copyright (c) 2001 - 2007, Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    CPAN::Debug

    NAME

    CPAN::Debug - internal debugging for CPAN.pm

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    CPAN::Distroprefs

    NAME

    CPAN::Distroprefs -- read and match distroprefs

    SYNOPSIS

        use CPAN::Distroprefs;
        my %info = (... distribution/environment info ...);
        my $finder = CPAN::Distroprefs->find($prefs_dir, \%ext_map);
        while (my $result = $finder->next) {
            die $result->as_string if $result->is_fatal;
            warn($result->as_string), next if $result->is_warning;
            for my $pref (@{ $result->prefs }) {
                if ($pref->matches(\%info)) {
                    return $pref;
                }
            }
        }

    DESCRIPTION

    This module encapsulates reading Distroprefs and matching them against CPAN distributions.

    INTERFACE

        my $finder = CPAN::Distroprefs->find($dir, \%ext_map);
        while (my $result = $finder->next) { ... }

    Build an iterator which finds distroprefs files in the given directory.

    \%ext_map is a hashref whose keys are file extensions and whose values are the modules used to load matching files:

        {
            'yml' => 'YAML::Syck',
            'dd'  => 'Data::Dumper',
            ...
        }

    Each time $finder->next is called, the iterator returns one of two possible values:

    • a CPAN::Distroprefs::Result object
    • undef, indicating that no prefs files remain to be found
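
    The calling convention of such an iterator (one result per call, then undef) can be sketched with a closure. Everything here is hypothetical: make_finder and the plain hashrefs stand in for find() and the CPAN::Distroprefs::Result objects, which are not reproduced.

```perl
use strict;
use warnings;

# Hypothetical iterator factory: yields one result hashref per call and
# undef once the list is exhausted.
sub make_finder {
    my @pending = @_;
    return sub {
        return unless @pending;    # no prefs files remain
        return shift @pending;
    };
}

my $next = make_finder(
    { type => 'success', file => 'Foo.yml', prefs => ['pref-1'] },
    { type => 'warning', file => 'Bar.yml', msg   => 'unreadable' },
);

# Drain the iterator; the loop ends on the first undef.
my @seen;
while ( my $result = $next->() ) {
    if ( $result->{type} eq 'success' ) {
        push @seen, "loaded $result->{file}";
    }
    else {
        push @seen, "$result->{type}: $result->{file} ($result->{msg})";
    }
}
print "$_\n" for @seen;
```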

    RESULTS

    The iterator returned by find() yields CPAN::Distroprefs::Result objects to indicate success or failure when reading a prefs file.

    Common

    All results share some common attributes:

    type

    success, warning, or fatal

    file

    the file from which these prefs were read, or to which this error refers (relative filename)

    ext

    the file's extension, which determines how to load it

    dir

    the directory the file was read from

    abs

    the absolute path to the file

    Errors

    Error results (warning and fatal) contain:

    msg

    the error message (usually either $! or a YAML error)

    Successes

    Success results contain:

    prefs

    an arrayref of CPAN::Distroprefs::Pref objects

    PREFS

    CPAN::Distroprefs::Pref objects represent individual distroprefs documents. They are constructed automatically as part of success results from find().

    data

    the pref information as a hashref, suitable for e.g. passing to Kwalify

    match_attributes

    returns a list of the valid match attributes (see the Distroprefs section in CPAN)

    currently: env perl perlconfig distribution module

    has_any_match

    true if this pref has a 'match' attribute at all

    has_valid_subkeys

    true if this pref has a 'match' attribute and at least one valid match attribute

    matches

        if ($pref->matches(\%arg)) { ... }

    true if this pref matches the passed-in hashref, which must have a value for each of the match_attributes (above)

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    CPAN::FirstTime

    NAME

    CPAN::FirstTime - Utility for CPAN::Config file Initialization

    SYNOPSIS

    CPAN::FirstTime::init()

    DESCRIPTION

    The init routine asks a few questions and writes a CPAN/Config.pm or CPAN/MyConfig.pm file (depending on what it is currently using).

    The questions and explanations for all config variables are collected below.

    • auto_commit

      Normally CPAN.pm keeps config variables in memory and changes need to be saved in a separate 'o conf commit' command to make them permanent between sessions. If you set the 'auto_commit' option to true, changes to a config variable are always automatically committed to disk.

      Always commit changes to config variables to disk?

    • build_cache

      CPAN.pm can limit the size of the disk area for keeping the build directories with all the intermediate files.

      Cache size for build directory (in MB)?

    • build_dir

      Directory where the build process takes place?

    • build_dir_reuse

      Until version 1.88 CPAN.pm never trusted the contents of the build_dir directory between sessions. Since 1.88_58 CPAN.pm has a YAML-based mechanism that makes it possible to share the contents of the build_dir/ directory between different sessions with the same version of perl. People who prefer to test things several days before installing will like this feature because it saves a lot of time.

      If you say yes to the following question, CPAN will try to store enough information about the build process so that it can pick up in future sessions at the same state of affairs as it left a previous session.

      Store and re-use state information about distributions between CPAN.pm sessions?

    • build_requires_install_policy

      When a module declares another one as a 'build_requires' prerequisite this means that the other module is only needed for building or testing the module but need not be installed permanently. In this case you may wish to install that other module nonetheless or just keep it in the 'build_dir' directory to have it available only temporarily. Installing saves time on future installations but makes the perl installation bigger.

      You can choose if you want to always install (yes), never install (no) or be always asked. In the latter case you can set the default answer for the question to yes (ask/yes) or no (ask/no).

      Policy on installing 'build_requires' modules (yes, no, ask/yes, ask/no)?

    • cache_metadata

      To considerably speed up the initial CPAN shell startup, it is possible to use Storable to create a cache of metadata. If Storable is not available, the normal index mechanism will be used.

      Note: this mechanism is not used when use_sqlite is on and SQLite is running.

      Cache metadata (yes/no)?

    • check_sigs

      CPAN packages can be digitally signed by authors and thus verified with the security provided by strong cryptography. The exact mechanism is defined in the Module::Signature module. While this is generally considered a good thing, it is not always convenient to the end user to install modules that are signed incorrectly or where the key of the author is not available or where some prerequisite for Module::Signature has a bug and so on.

      With the check_sigs parameter you can turn signature checking on and off. The default is off for now because the whole tool chain for the functionality is not yet considered mature by some. The author of CPAN.pm would recommend setting it to true most of the time and turning it off only if it turns out to be annoying.

      Note that if you do not have Module::Signature installed, no signature checks will be performed at all.

      Always try to check and verify signatures if a SIGNATURE file is in the package and Module::Signature is installed (yes/no)?

    • colorize_output

      When you have Term::ANSIColor installed, you can turn on colorized output to have some visual differences between normal CPAN.pm output, warnings, debugging output, and the output of the modules being installed. Set your favorite colors after some experimenting with the Term::ANSIColor module.

      Do you want to turn on colored output?

    • colorize_print

      Color for normal output?

    • colorize_warn

      Color for warnings?

    • colorize_debug

      Color for debugging messages?

    • commandnumber_in_prompt

      The prompt of the cpan shell can contain the current command number for easier tracking of the session or be a plain string.

      Do you want the command number in the prompt (yes/no)?

    • connect_to_internet_ok

      If you have never defined your own urllist in your configuration then CPAN.pm will be hesitant to use the built in default sites for downloading. It will ask you once per session if a connection to the internet is OK and only if you say yes, it will try to connect. But to avoid this question, you can choose your favorite download sites once and get away with it. Or, if you have no favorite download sites answer yes to the following question.

      If no urllist has been chosen yet, would you prefer CPAN.pm to connect to the built-in default sites without asking? (yes/no)?

    • ftp_passive

      Shall we always set the FTP_PASSIVE environment variable when dealing with ftp download (yes/no)?

    • ftpstats_period

      Statistics about downloads are truncated by size and period simultaneously.

      How many days shall we keep statistics about downloads?

    • ftpstats_size

      Statistics about downloads are truncated by size and period simultaneously.

      How many items shall we keep in the statistics about downloads?

    • getcwd

      CPAN.pm changes the current working directory often and needs to determine its own current working directory. Per default it uses Cwd::cwd but if this doesn't work on your system for some reason, alternatives can be configured according to the following table:

          cwd          Cwd::cwd
          getcwd       Cwd::getcwd
          fastcwd      Cwd::fastcwd
          backtickcwd  external command cwd

      Preferred method for determining the current working directory?

    • halt_on_failure

      Normally, CPAN.pm continues processing the full list of targets and dependencies, even if one of them fails. However, you can specify that CPAN should halt after the first failure.

      Do you want to halt on failure (yes/no)?

    • histfile

      If you have one of the readline packages (Term::ReadLine::Perl, Term::ReadLine::Gnu, possibly others) installed, the interactive CPAN shell will have history support. The next two questions deal with the filename of the history file and with its size. If you do not want to set this variable, please answer the following question with SPACE ENTER.

      File to save your history?

    • histsize

      Number of lines to save?

    • inactivity_timeout

      Sometimes you may wish to leave the processes run by CPAN alone without caring about them. Because the Makefile.PL or the Build.PL sometimes contains questions you're expected to answer, you can set a timer that will kill a 'perl Makefile.PL' process after the specified time in seconds.

      If you set this value to 0, these processes will wait forever. This is the default and recommended setting.

      Timeout for inactivity during {Makefile,Build}.PL?

    • index_expire

      The CPAN indexes are usually rebuilt once or twice per hour, but the typical CPAN mirror mirrors only once or twice per day. Depending on the quality of your mirror and your desire to be on the bleeding edge, you may want to set the following value to more or less than one day (which is the default). It determines after how many days CPAN.pm downloads new indexes.

      Let the index expire after how many days?

    • inhibit_startup_message

      When the CPAN shell is started it normally displays a greeting message that contains the running version and the status of readline support.

      Do you want to turn this message off?

    • keep_source_where

      Unless you are accessing the CPAN on your filesystem via a file: URL, CPAN.pm needs to keep the source files it downloads somewhere. Please supply a directory where the downloaded files are to be kept.

      Download target directory?

    • load_module_verbosity

      When CPAN.pm loads a module it needs for some optional feature, it usually reports about module name and version. Choose 'v' to get this message, 'none' to suppress it.

      Verbosity level for loading modules (none or v)?

    • makepl_arg

      Every Makefile.PL is run by perl in a separate process. Likewise we run 'make' and 'make install' in separate processes. If you have any parameters (e.g. PREFIX, UNINST or the like) you want to pass to the calls, please specify them here.

      If you don't understand this question, just press ENTER.

      Typical frequently used settings:

          PREFIX=~/perl    # non-root users (please see manual for more hints)

      Parameters for the 'perl Makefile.PL' command?

    • make_arg

      Parameters for the 'make' command? Typical frequently used setting:

          -j3              # dual processor system (on GNU make)

      Your choice:

    • make_install_arg

      Parameters for the 'make install' command? Typical frequently used setting:

          UNINST=1         # to always uninstall potentially conflicting files
                           # (but do NOT use with local::lib or INSTALL_BASE)

      Your choice:

    • make_install_make_command

      Do you want to use a different make command for 'make install'? Cautious people will probably prefer:

          su root -c make
      or
          sudo make
      or
          /path1/to/sudo -u admin_account /path2/to/make

      or some such. Your choice:

    • mbuildpl_arg

      A Build.PL is run by perl in a separate process. Likewise we run './Build' and './Build install' in separate processes. If you have any parameters you want to pass to the calls, please specify them here.

      Typical frequently used settings:

          --install_base /home/xxx             # different installation directory

      Parameters for the 'perl Build.PL' command?

    • mbuild_arg

      Parameters for the './Build' command? Setting might be:

          --extra_linker_flags -L/usr/foo/lib  # non-standard library location

      Your choice:

    • mbuild_install_arg

      Parameters for the './Build install' command? Typical frequently used setting:

          --uninst 1       # uninstall conflicting files
                           # (but do NOT use with local::lib or INSTALL_BASE)

      Your choice:

    • mbuild_install_build_command

      Do you want to use a different command for './Build install'? Sudo users will probably prefer:

          su root -c ./Build
      or
          sudo ./Build
      or
          /path1/to/sudo -u admin_account ./Build

      or some such. Your choice:

    • pager

      What is your favorite pager program?

    • prefer_installer

      When you have Module::Build installed and a module comes with both a Makefile.PL and a Build.PL, which shall have precedence?

      The two main standard installer modules are the old and well-established ExtUtils::MakeMaker (EUMM for short), which uses the Makefile.PL, and the next-generation installer Module::Build (MB), which works with the Build.PL (and often comes with a Makefile.PL too). If a module comes with only one of the two, we will use that one, but if both are supplied then a decision must be made between EUMM and MB. See also http://rt.cpan.org/Ticket/Display.html?id=29235 for a discussion about the right default.

      Or, as a third option you can choose RAND which will make a random decision (something regular CPAN testers will enjoy).

      In case you can choose between running a Makefile.PL or a Build.PL, which installer would you prefer (EUMM or MB or RAND)?

    • prefs_dir

      CPAN.pm can store customized build environments based on regular expressions for distribution names. These are YAML files where the default options for CPAN.pm and the environment can be overridden and dialog sequences can be stored that can later be executed by an Expect.pm object. The CPAN.pm distribution comes with some prefab YAML files that cover sample distributions that can be used as blueprints to store your own prefs. Please check out the distroprefs/ directory of the CPAN.pm distribution to get a quick start into the prefs system.

      Directory where to store default options/environment/dialogs for building modules that need some customization?

    • prerequisites_policy

      The CPAN module can detect when a module which you are trying to build depends on prerequisites. If this happens, it can build the prerequisites for you automatically ('follow'), ask you for confirmation ('ask'), or just ignore them ('ignore'). Choosing 'follow' also sets PERL_AUTOINSTALL and PERL_EXTUTILS_AUTOINSTALL for "--defaultdeps" if not already set.

      Please set your policy to one of the three values.

      Policy on building prerequisites (follow, ask or ignore)?

    • randomize_urllist

      CPAN.pm can introduce some randomness when using hosts for download that are configured in the urllist parameter. Enter a numeric value between 0 and 1 to indicate how often you want to let CPAN.pm try a random host from the urllist. A value of one specifies to always use a random host as the first try. A value of zero means no randomness at all. Anything in between specifies how often, on average, a random host should be tried first.

      Randomize parameter

    • scan_cache

      By default, each time the CPAN module is started, cache scanning is performed to keep the cache size in sync ('atstart'). Alternatively, scanning and cleanup can happen when CPAN exits ('atexit'). To prevent any cache cleanup, answer 'never'.

      Perform cache scanning ('atstart', 'atexit' or 'never')?

    • shell

      What is your favorite shell?

    • show_unparsable_versions

      During the 'r' command CPAN.pm finds modules without version number. When the command finishes, it prints a report about this. If you want this report to be very verbose, say yes to the following variable.

      Show all individual modules that have no $VERSION?

    • show_upload_date

      The 'd' and the 'm' command normally only show you information they have in their in-memory database and thus will never connect to the internet. If you set the 'show_upload_date' variable to true, 'm' and 'd' will additionally show you the upload date of the module or distribution. Per default this feature is off because it may require a net connection to get at the upload date.

      Always try to show upload date with 'd' and 'm' command (yes/no)?

    • show_zero_versions

      During the 'r' command CPAN.pm finds modules with a version number of zero. When the command finishes, it prints a report about this. If you want this report to be very verbose, say yes to the following variable.

      Show all individual modules that have a $VERSION of zero?

    • tar_verbosity

      When CPAN.pm uses the tar command, which switch for the verbosity shall be used? Choose 'none' for quiet operation, 'v' for file name listing, 'vv' for full listing.

      Tar command verbosity level (none or v or vv)?

    • term_is_latin

      The next option deals with the charset (a.k.a. character set) your terminal supports. In general, CPAN is English-speaking territory, so the charset does not matter much, but some CPAN authors have names that are outside the ASCII range. If your terminal supports UTF-8, you should say no to the next question. If it expects ISO-8859-1 (also known as LATIN1) then you should say yes. If it supports neither, your answer does not matter because you will not be able to read the names of some authors anyway. If you answer no, names will be output in UTF-8.

      Your terminal expects ISO-8859-1 (yes/no)?

    • term_ornaments

      When using Term::ReadLine, you can turn ornaments on so that your input stands out against the output from CPAN.pm.

      Do you want to turn ornaments on?

    • test_report

      The goal of the CPAN Testers project (http://testers.cpan.org/) is to test as many CPAN packages as possible on as many platforms as possible. This provides valuable feedback to module authors and potential users to identify bugs or platform compatibility issues and improves the overall quality and value of CPAN.

      One way you can contribute is to send test results for each module that you install. If you install the CPAN::Reporter module, you have the option to automatically generate and deliver test reports to CPAN Testers whenever you run tests on a CPAN package.

      See the CPAN::Reporter documentation for additional details and configuration settings. If your firewall blocks outgoing traffic, you may need to configure CPAN::Reporter before sending reports.

      Generate test reports if CPAN::Reporter is installed (yes/no)?

    • perl5lib_verbosity

      When CPAN.pm extends @INC via PERL5LIB, it prints a list of directories added (or a summary of how many directories are added). Choose 'v' to get this message, 'none' to suppress it.

      Verbosity level for PERL5LIB changes (none or v)?

    • prefer_external_tar

      By default, all untar operations are done with the Perl module Archive::Tar. Setting this variable to true makes CPAN.pm use the external tar command if available. On Unix this is usually preferred, because these systems have a reliable and fast gnutar implementation.

      Use the external tar program instead of Archive::Tar?

    • trust_test_report_history

      When a distribution has already been tested by CPAN::Reporter on this machine, CPAN can skip the test phase and just rely on the test report history instead.

      Note that this will not apply to distributions that failed tests because of missing dependencies. Also, tests can be run regardless of the history using "force".

      Do you want to rely on the test report history (yes/no)?

    • use_sqlite

      CPAN::SQLite is a layer between the index files that are downloaded from the CPAN and CPAN.pm that speeds up metadata queries and reduces memory consumption of CPAN.pm considerably.

      Use CPAN::SQLite if available? (yes/no)?

    • version_timeout

      This timeout prevents CPAN from hanging when trying to parse a pathologically coded $VERSION from a module.

      The default is 15 seconds. If you set this value to 0, no timeout will occur, but this is not recommended.

      Timeout for parsing module versions?

    • yaml_load_code

      Both YAML.pm and YAML::Syck are capable of deserialising code. As this requires a string eval, which might be a security risk, you can use this option to enable or disable the deserialisation of code via CPAN::DeferredCode. (Note: This does not work under perl 5.6)

      Do you want to enable code deserialisation (yes/no)?

    • yaml_module

      At the time of this writing (2009-03) there are three YAML implementations working: YAML, YAML::Syck, and YAML::XS. The latter two are faster but need a C compiler installed on your system. There may be more alternative YAML conforming modules. When I tried two other players, YAML::Tiny and YAML::Perl, they seemed not powerful enough to work with CPAN.pm. This may have changed in the meantime.

      Which YAML implementation would you prefer?
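
    For orientation, a few of the variables above might end up in a configuration hash like the one below, written in the style of a CPAN/MyConfig.pm file. The keys are config variables from this list; the values are arbitrary example choices, not defaults or recommendations, and the final print is only there to show how a setting is read back.

```perl
use strict;
use warnings;

# Illustrative fragment only: the values are example answers to the
# questions above.
$CPAN::Config = {
    'auto_commit'          => q[0],
    'build_cache'          => q[100],
    'halt_on_failure'      => q[0],
    'index_expire'         => q[1],
    'prerequisites_policy' => q[follow],
    'scan_cache'           => q[atstart],
    'tar_verbosity'        => q[none],
    'yaml_module'          => q[YAML],
};

print "prerequisites_policy: $CPAN::Config->{prerequisites_policy}\n";
```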

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    CPAN::HandleConfig

    NAME

    CPAN::HandleConfig - internal configuration handling for CPAN.pm

    CLASS->safe_quote ITEM

    Quotes an item to become safe against spaces in shell interpolation. An item is enclosed in double quotes if:

    • the item contains spaces in the middle
    • the item does not start with a quote

    This happens to avoid shell interpolation problems when whitespace is present in directory names.

    This method uses commands_quote to determine the correct quote. If commands_quote is a space, no quoting will take place.

    • if it starts and ends with the same quote character: leave it as it is

    • if it contains no whitespace: leave it as it is

    • if it contains whitespace, then

      • if it contains quotes: better leave it as it is

      • else: quote it with the correct quote type for the box we're on
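
    The decision rules above can be written down as a small function. This is a sketch, not the CPAN::HandleConfig code: the name safe_quote_sketch and the explicit $quote argument are assumptions (the real method is described as consulting commands_quote), and any whitespace is treated as a trigger rather than only spaces in the middle.

```perl
use strict;
use warnings;

# Sketch of the quoting rules; the function name and the $quote argument
# are hypothetical.
sub safe_quote_sketch {
    my ( $item, $quote ) = @_;
    $quote = q{"} unless defined $quote;

    return $item if $quote eq ' ';                # a space disables quoting
    return $item if $item =~ /^(["']).*\1\z/s;    # already quoted: leave it
    return $item unless $item =~ /\s/;            # no whitespace: leave it
    return $item if $item =~ /["']/;              # embedded quotes: leave it
    return $quote . $item . $quote;               # otherwise wrap in quotes
}

my @quoted = map { safe_quote_sketch($_) }
    '/usr/bin/make', '/opt/my tools/make', q{"/opt/my tools/make"};
print "$_\n" for @quoted;
```

    Only the second item changes here: it contains whitespace and no quotes, so it is wrapped in double quotes.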

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    CPAN::Kwalify

    NAME

    CPAN::Kwalify - Interface between CPAN.pm and Kwalify.pm

    SYNOPSIS

        use CPAN::Kwalify;
        validate($schema_name, $data, $file, $doc);

    DESCRIPTION

    • _validate($schema_name, $data, $file, $doc)

      $schema_name is the name of a supported schema. Currently only distroprefs is supported. $data is the data to be validated. $file is the absolute path to the file the data are coming from. $doc is the index of the document within $file that is to be validated. The last two arguments are only there for better error reporting.

      Relies on being called from within CPAN.pm.

      Dies if something fails. Does not return anything useful.

    • yaml($schema_name)

      Returns the YAML text of that schema. Dies if something fails.

    AUTHOR

    Andreas Koenig <andk@cpan.org>

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    See http://www.perl.com/perl/misc/Artistic.html

     
    CPAN::Nox

    NAME

    CPAN::Nox - Wrapper around CPAN.pm without using any XS module

    SYNOPSIS

    Interactive mode:

        perl -MCPAN::Nox -e shell;

    DESCRIPTION

    This package has the same functionality as CPAN.pm, but tries to prevent the usage of compiled extensions during its own execution. Its primary purpose is a rescue in case you upgraded perl and broke binary compatibility somehow.

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    CPAN

     
    CPAN::Queue

    NAME

    CPAN::Queue - internal queue support for CPAN.pm

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    CPAN::Tarzip

    NAME

    CPAN::Tarzip - internal handling of tar archives for CPAN.pm

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    CPAN::Version

    NAME

    CPAN::Version - utility functions to compare CPAN versions

    SYNOPSIS

        use CPAN::Version;

        CPAN::Version->vgt("1.1","1.1.1");    # 1 bc. 1.1 > 1.001001
        CPAN::Version->vlt("1.1","1.1");      # 0 bc. 1.1 not < 1.1
        CPAN::Version->vcmp("1.1","1.1.1");   # 1 bc. first is larger
        CPAN::Version->vcmp("1.1.1","1.1");   # -1 bc. first is smaller
        CPAN::Version->readable(v1.2.3);      # "v1.2.3"
        CPAN::Version->vstring("v1.2.3");     # v1.2.3
        CPAN::Version->float2vv(1.002003);    # "v1.2.3"

    DESCRIPTION

    This module mediates between some version that perl sees in a package and the version that is published by the CPAN indexer.

    It's only written as a helper module for both CPAN.pm and CPANPLUS.pm.

    As it stands it predates version.pm but has the same goal: make version strings visible and comparable.
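
    As a small illustration of what "comparable" means in practice, the sketch below uses vcmp() as a sort comparator. The version list is made up for the example; the point is that decimal versions compare numerically, so "1.10" sorts before "1.2".

```perl
use strict;
use warnings;
use CPAN::Version;

# Made-up decimal version strings, sorted with vcmp() as the comparator.
my @versions = qw(1.9 1.10 1.2);
my @sorted   = sort { CPAN::Version->vcmp($a, $b) } @versions;
print "@sorted\n";

# A decimal version against a dotted one, as in the synopsis.
print "1.1 is newer than 1.1.1\n" if CPAN::Version->vgt("1.1", "1.1.1");
```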

    LICENSE

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

     
    CGI::Apache

    NAME

    CGI::Apache - Backward compatibility module for CGI.pm

    SYNOPSIS

    Do not use this module. It is deprecated.

    ABSTRACT

    DESCRIPTION

    AUTHOR INFORMATION

    BUGS

    SEE ALSO

     
    CGI::Carp

    NAME

    CGI::Carp - CGI routines for writing to the HTTPD (or other) error log

    SYNOPSIS

        use CGI::Carp;

        croak "We're outta here!";
        confess "It was my fault: $!";
        carp "It was your fault!";
        warn "I'm confused";
        die "I'm dying.\n";

        use CGI::Carp qw(cluck);
        cluck "I wouldn't do that if I were you";

        use CGI::Carp qw(fatalsToBrowser);
        die "Fatal error messages are now sent to browser";

    DESCRIPTION

    CGI scripts have a nasty habit of leaving warning messages in the error logs that are neither time stamped nor fully identified. Tracking down the script that caused the error is a pain. This fixes that. Replace the usual

        use Carp;

    with

        use CGI::Carp;

    The standard warn(), die(), croak(), confess() and carp() calls will be replaced with functions that write time-stamped messages to the HTTP server error log.

    For example:

        [Fri Nov 17 21:40:43 1995] test.pl: I'm confused at test.pl line 3.
        [Fri Nov 17 21:40:43 1995] test.pl: Got an error message: Permission denied.
        [Fri Nov 17 21:40:43 1995] test.pl: I'm dying.
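
    The log format shown above can be approximated with core Perl alone. The following is a minimal sketch, not the module's implementation: stamp_message() is a hypothetical helper, and it assumes the program name is simply the basename of $0.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: prefix a message with a timestamp and the program name.
sub stamp_message {
    my ($message) = @_;
    my $when = scalar localtime;    # e.g. "Fri Nov 17 21:40:43 1995"
    my $prog = basename($0);        # assumed source of the program name
    return "[$when] $prog: $message";
}

print STDERR stamp_message("I'm confused"), "\n";
```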

    REDIRECTING ERROR MESSAGES

    By default, error messages are sent to STDERR. Most HTTPD servers direct STDERR to the server's error log. Some applications may wish to keep private error logs, distinct from the server's error log, or they may wish to direct error messages to STDOUT so that the browser will receive them.

    The carpout() function is provided for this purpose. Since carpout() is not exported by default, you must import it explicitly by saying

        use CGI::Carp qw(carpout);

    The carpout() function requires one argument, a reference to an open filehandle for writing errors. It should be called in a BEGIN block at the top of the CGI application so that compiler errors will be caught. Example:

        BEGIN {
            use CGI::Carp qw(carpout);
            open(LOG, ">>/usr/local/cgi-logs/mycgi-log") or
                die("Unable to open mycgi-log: $!\n");
            carpout(LOG);
        }

    carpout() does not handle file locking on the log for you at this point. Also, note that carpout() does not work with in-memory file handles, although a patch would be welcome to address that.

    The real STDERR is not closed -- it is moved to CGI::Carp::SAVEERR. Some servers, when dealing with CGI scripts, close their connection to the browser when the script closes STDOUT and STDERR. CGI::Carp::SAVEERR is there to prevent this from happening prematurely.

    You can pass filehandles to carpout() in a variety of ways. The "correct" way according to Tom Christiansen is to pass a reference to a filehandle GLOB:

        carpout(\*LOG);

    This looks weird to mere mortals however, so the following syntaxes are accepted as well:

        carpout(LOG);
        carpout(main::LOG);
        carpout(main'LOG);
        carpout(\LOG);
        carpout(\'main::LOG');

        ... and so on

    FileHandle and other objects work as well.

    Use of carpout() is not great for performance, so it is recommended for debugging purposes or for moderate-use applications. A future version of this module may delay redirecting STDERR until one of the CGI::Carp methods is called to prevent the performance hit.

    MAKING PERL ERRORS APPEAR IN THE BROWSER WINDOW

    If you want to send fatal (die, confess) errors to the browser, import the special "fatalsToBrowser" subroutine:

        use CGI::Carp qw(fatalsToBrowser);
        die "Bad error here";

    Fatal errors will now be echoed to the browser as well as to the log. CGI::Carp arranges to send a minimal HTTP header to the browser so that even errors that occur in the early compile phase will be seen. Nonfatal errors will still be directed to the log file only (unless redirected with carpout).

    Note that fatalsToBrowser may not work well with mod_perl version 2.0 and higher.

    Changing the default message

    By default, the software error message is followed by a note to contact the Webmaster by e-mail with the time and date of the error. If this message is not to your liking, you can change it using the set_message() routine. This is not imported by default; you should import it on the use() line:

        use CGI::Carp qw(fatalsToBrowser set_message);
        set_message("It's not a bug, it's a feature!");

    You may also pass in a code reference in order to create a custom error message. At run time, your code will be called with the text of the error message that caused the script to die. Example:

        use CGI::Carp qw(fatalsToBrowser set_message);
        BEGIN {
            sub handle_errors {
                my $msg = shift;
                print "<h1>Oh gosh</h1>";
                print "<p>Got an error: $msg</p>";
            }
            set_message(\&handle_errors);
        }

    In order to correctly intercept compile-time errors, you should call set_message() from within a BEGIN{} block.

    DOING MORE THAN PRINTING A MESSAGE IN THE EVENT OF PERL ERRORS

    If fatalsToBrowser in conjunction with set_message does not provide you with all of the functionality you need, you can go one step further by specifying a function to be executed any time a script calls "die", has a syntax error, or dies unexpectedly at runtime with a line like "undef->explode();".

        use CGI::Carp qw(set_die_handler);
        BEGIN {
            sub handle_errors {
                my $msg = shift;
                print "content-type: text/html\n\n";
                print "<h1>Oh gosh</h1>";
                print "<p>Got an error: $msg</p>";

                # proceed to send an email to a system administrator,
                # write a detailed message to the browser and/or a log,
                # etc....
            }
            set_die_handler(\&handle_errors);
        }

    Notice that if you use set_die_handler(), you must handle sending HTML headers to the browser yourself if you are printing a message.

    If you use set_die_handler(), you will most likely interfere with the behavior of fatalsToBrowser, so use one or the other, not both.

    Using set_die_handler() sets $SIG{__DIE__} (as does fatalsToBrowser), and there is only one $SIG{__DIE__}. This means that if you are attempting to set $SIG{__DIE__} yourself, you may interfere with this module's functionality, or this module may interfere with your module's functionality.
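
    Because there is only one die-handler slot per process, code that installs its own handler can save the previous one and call it in turn. The sketch below shows that general Perl technique with plain handlers; it does not use CGI::Carp, and whether chaining is appropriate alongside this module is an assumption you would need to verify for your setup.

```perl
use strict;
use warnings;

my @seen;

# A handler installed earlier, for example by another module (hypothetical).
$SIG{__DIE__} = sub { push @seen, "first: $_[0]" };

# Install a second handler that remembers and calls the previous one.
my $previous = $SIG{__DIE__};
$SIG{__DIE__} = sub {
    my ($msg) = @_;
    push @seen, "second: $msg";
    $previous->($msg) if ref($previous) eq 'CODE';
};

eval { die "boom\n" };    # die handlers run even inside eval
print @seen;
```

    Note that the handlers fire although the die happens inside an eval, which is the same effect that the next section works around.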

    SUPPRESSING PERL ERRORS APPEARING IN THE BROWSER WINDOW

    A problem sometimes encountered when using fatalsToBrowser is when a die() is done inside an eval body or expression. Even though the fatalsToBrowser support takes precautions to avoid this, you still may get the error message printed to STDOUT. This may have some undesirable effects when the purpose of doing the eval is to determine which of several algorithms is to be used.

    By setting $CGI::Carp::TO_BROWSER to 0 you can suppress printing the die messages without all of the complexity of using set_die_handler. You can localize this effect to inside eval bodies if this is desirable. For example:

        eval {
            local $CGI::Carp::TO_BROWSER = 0;
            die "Fatal error messages not sent to browser";
        };
        # $@ will contain error message

    MAKING WARNINGS APPEAR AS HTML COMMENTS

    It is also possible to make non-fatal errors appear as HTML comments embedded in the output of your program. To enable this feature, import the "warningsToBrowser" subroutine. Since sending warnings to the browser before the HTTP headers have been sent would cause an error, any warnings are stored in an internal buffer until you call the warningsToBrowser() subroutine with a true argument:

        use CGI::Carp qw(fatalsToBrowser warningsToBrowser);
        use CGI qw(:standard);
        print header();
        warningsToBrowser(1);

    You may also give a false argument to warningsToBrowser() to prevent warnings from being sent to the browser while you are printing some content where HTML comments are not allowed:

        warningsToBrowser(0);    # disable warnings
        print "<script type=\"text/javascript\"><!--\n";
        print_some_javascript_code();
        print "//--></script>\n";
        warningsToBrowser(1);    # re-enable warnings

    Note: In this respect warningsToBrowser() differs fundamentally from fatalsToBrowser(), which you should never call yourself!

    OVERRIDING THE NAME OF THE PROGRAM

    CGI::Carp includes the name of the program that generated the error or warning in the messages written to the log and the browser window. Sometimes, Perl can get confused about what the actual name of the executed program was. In these cases, you can override the program name that CGI::Carp will use for all messages.

    The quick way to do that is to tell CGI::Carp the name of the program in its use statement. You can do that by adding "name=cgi_carp_log_name" to your "use" statement. For example:

        use CGI::Carp qw(name=cgi_carp_log_name);

    If you want to change the program name partway through the program, you can use the set_progname() function instead. It is not exported by default; you must import it explicitly by saying

        use CGI::Carp qw(set_progname);

    Once you've done that, you can change the logged name of the program at any time by calling

        set_progname(new_program_name);

    You can set the program back to the default by calling

        set_progname(undef);

    Note that this override doesn't happen until after the program has compiled, so any compile-time errors will still show up with the non-overridden program name.

    CHANGE LOG

    3.51 Added $CGI::Carp::TO_BROWSER

    1.29 Patch from Peter Whaite to fix the unfixable problem of CGI::Carp not behaving correctly in an eval() context.

    1.05 carpout() added and minor corrections by Marc Hedlund <hedlund@best.com> on 11/26/95.

    1.06 fatalsToBrowser() no longer aborts for fatal errors within eval() statements.

    1.08 set_message() added and carpout() expanded to allow for FileHandle objects.

    1.09 set_message() now allows users to pass a code REFERENCE for really custom error messages. croak and carp are now exported by default. Thanks to Gunther Birznieks for the patches.

    1.10 Patch from Chris Dean (ctdean@cogit.com) to allow module to run correctly under mod_perl.

    1.11 Changed order of &gt; and &lt; escapes.

    1.12 Changed die() on line 217 to CORE::die to avoid -w warning.

    1.13 Added cluck() to make the module orthogonal with Carp. More mod_perl related fixes.

    1.20 Patch from Ilmari Karonen (perl@itz.pp.sci.fi): Added warningsToBrowser(). Replaced <CODE> tags with <PRE> in fatalsToBrowser() output.

    1.23 ineval() now checks both $^S and inspects the message for the "eval" pattern (hack alert!) in order to accommodate various combinations of Perl and mod_perl.

    1.24 Patch from Scott Gifford (sgifford@suspectclass.com): Add support for overriding program name.

    1.26 Replaced CORE::GLOBAL::die with the evil $SIG{__DIE__} because the former isn't working in some people's hands. There is no such thing as reliable exception handling in Perl.

    1.27 Replaced tell STDOUT with bytes=tell STDOUT.

    AUTHORS

    Copyright 1995-2002, Lincoln D. Stein. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    SEE ALSO

    Carp, CGI::Base, CGI::BasePlus, CGI::Request, CGI::MiniSvr, CGI::Form, CGI::Response.

     
    CGI::Cookie

    NAME

    CGI::Cookie - Interface to HTTP Cookies

    SYNOPSIS

        use CGI qw/:standard/;
        use CGI::Cookie;

        # Create new cookies and send them
        $cookie1 = CGI::Cookie->new(-name=>'ID',-value=>123456);
        $cookie2 = CGI::Cookie->new(-name=>'preferences',
                                    -value=>{ font => 'Helvetica',
                                              size => 12 }
                                   );
        print header(-cookie=>[$cookie1,$cookie2]);

        # fetch existing cookies
        %cookies = CGI::Cookie->fetch;
        $id = $cookies{'ID'}->value;

        # create cookies returned from an external source
        %cookies = CGI::Cookie->parse($ENV{COOKIE});

    DESCRIPTION

    CGI::Cookie is an interface to HTTP/1.1 cookies, an innovation that allows Web servers to store persistent information on the browser's side of the connection. Although CGI::Cookie is intended to be used in conjunction with CGI.pm (and is in fact used by it internally), you can use this module independently.

    For full information on cookies see

        http://tools.ietf.org/html/rfc2109
        http://tools.ietf.org/html/rfc2965
        http://tools.ietf.org/html/draft-ietf-httpstate-cookie

    USING CGI::Cookie

    CGI::Cookie is object oriented. Each cookie object has a name and a value. The name is any scalar value. The value is any scalar or array value (associative arrays are also allowed). Cookies also have several optional attributes, including:

    • 1. expiration date

      The expiration date tells the browser how long to hang on to the cookie. If the cookie specifies an expiration date in the future, the browser will store the cookie information in a disk file and return it to the server every time the user reconnects (until the expiration date is reached). If the cookie specifies an expiration date in the past, the browser will remove the cookie from the disk file. If the expiration date is not specified, the cookie will persist only until the user quits the browser.

    • 2. domain

      This is a partial or complete domain name for which the cookie is valid. The browser will return the cookie to any host that matches the partial domain name. For example, if you specify a domain name of ".capricorn.com", then the browser will return the cookie to Web servers running on any of the machines "www.capricorn.com", "ftp.capricorn.com", "feckless.capricorn.com", etc. Domain names must contain at least two periods to prevent attempts to match on top level domains like ".edu". If no domain is specified, then the browser will only return the cookie to servers on the host the cookie originated from.

    • 3. path

      If you provide a cookie path attribute, the browser will check it against your script's URL before returning the cookie. For example, if you specify the path "/cgi-bin", then the cookie will be returned to each of the scripts "/cgi-bin/tally.pl", "/cgi-bin/order.pl", and "/cgi-bin/customer_service/complain.pl", but not to the script "/cgi-private/site_admin.pl". By default, the path is set to "/", so that all scripts at your site will receive the cookie.

    • 4. secure flag

      If the "secure" attribute is set, the cookie will only be sent to your script if the CGI request is occurring on a secure channel, such as SSL.

    • 5. httponly flag

      If the "httponly" attribute is set, the cookie will only be accessible through HTTP Requests. This cookie will be inaccessible via JavaScript (to prevent XSS attacks).

      This feature is only supported by recent browsers like Internet Explorer 6 Service Pack 1, Firefox 3.0 and Opera 9.5 (and later of course).

      See these URLs for more information:

          http://msdn.microsoft.com/en-us/library/ms533046.aspx
          http://www.owasp.org/index.php/HTTPOnly#Browsers_Supporting_HTTPOnly
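
    The domain and path rules described in the list above are applied by the browser, not by this module, but they are easy to picture in code. The sketch below is a deliberately simplified model: domain_matches() and path_matches() are hypothetical helpers, and real browsers apply additional checks (a plain prefix test, for instance, would also accept "/cgi-binary").

```perl
use strict;
use warnings;

# Hypothetical helper: a leading dot means "any host ending in this suffix".
sub domain_matches {
    my ($host, $cookie_domain) = @_;
    if ($cookie_domain =~ /^\./) {
        return $host =~ /\Q$cookie_domain\E\z/i ? 1 : 0;
    }
    return lc($host) eq lc($cookie_domain) ? 1 : 0;
}

# Hypothetical helper: the cookie path must be a prefix of the request path.
sub path_matches {
    my ($request_path, $cookie_path) = @_;
    return index($request_path, $cookie_path) == 0 ? 1 : 0;
}

for my $host (qw(www.capricorn.com ftp.capricorn.com www.example.org)) {
    printf "%-20s %s\n", $host, domain_matches($host, '.capricorn.com') ? 'sent' : 'not sent';
}
for my $path (qw(/cgi-bin/tally.pl /cgi-private/site_admin.pl)) {
    printf "%-30s %s\n", $path, path_matches($path, '/cgi-bin') ? 'sent' : 'not sent';
}
```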

    Creating New Cookies

        my $c = CGI::Cookie->new(-name    => 'foo',
                                 -value   => 'bar',
                                 -expires => '+3M',
                                 -domain  => '.capricorn.com',
                                 -path    => '/cgi-bin/database',
                                 -secure  => 1
                                );

    Create cookies from scratch with the new method. The -name and -value parameters are required. The name must be a scalar value. The value can be a scalar, an array reference, or a hash reference. (At some point in the future cookies will support one of the Perl object serialization protocols for full generality).

    -expires accepts any of the relative or absolute date formats recognized by CGI.pm, for example "+3M" for three months in the future. See CGI.pm's documentation for details.
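
    To give a feel for how a relative format such as "+3M" can become an absolute time, here is a rough sketch. It is not CGI.pm's code: relative_to_epoch() is a hypothetical helper, and the unit table (including a 30-day month and a 365-day year) is an assumption made for the illustration; consult CGI.pm's documentation for the authoritative list of formats.

```perl
use strict;
use warnings;

# Assumed unit table: seconds, minutes, hours, days, months (30 days), years (365 days).
my %SECONDS = (
    's' => 1,
    'm' => 60,
    'h' => 60 * 60,
    'd' => 24 * 60 * 60,
    'M' => 30 * 24 * 60 * 60,
    'y' => 365 * 24 * 60 * 60,
);

# Hypothetical helper: turn a relative spec into epoch seconds.
sub relative_to_epoch {
    my ($spec, $now) = @_;
    $now = time unless defined $now;
    return $now if lc($spec) eq 'now';
    if ($spec =~ /^([+-]?\d+)([smhdMy])$/) {
        return $now + $1 * $SECONDS{$2};
    }
    return;    # not a relative format; a caller would treat it as an absolute date
}

print scalar(gmtime(relative_to_epoch('+3M'))), "\n";
```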

    -max-age accepts the same data formats as -expires, but sets a relative value instead of an absolute like -expires. This is intended to be more secure since a clock could be changed to fake an absolute time. In practice, as of 2011, -max-age still does not enjoy the widespread support that -expires has. You can set both, and browsers that support -max-age should ignore the Expires header. The drawback to this approach is the bit of bandwidth for sending an extra header on each cookie.

    -domain points to a domain name or to a fully qualified host name. If not specified, the cookie will be returned only to the Web server that created it.

    -path points to a partial URL on the current server. The cookie will be returned to all URLs beginning with the specified path. If not specified, it defaults to '/', which returns the cookie to all pages at your site.

    -secure if set to a true value instructs the browser to return the cookie only when a cryptographic protocol is in use.

    -httponly if set to a true value, the cookie will not be accessible via JavaScript.

    For compatibility with Apache::Cookie, you may optionally pass in a mod_perl request object as the first argument to new() . It will simply be ignored:

        my $c = CGI::Cookie->new($r,
                                 -name  => 'foo',
                                 -value => ['bar','baz']);

    Sending the Cookie to the Browser

    The simplest way to send a cookie to the browser is by calling the bake() method:

        $c->bake;

    This will print the Set-Cookie HTTP header to STDOUT using CGI.pm. CGI.pm will be loaded for this purpose if it is not already. Otherwise CGI.pm is not required or used by this module.

    Under mod_perl, pass in an Apache request object:

        $c->bake($r);

    If you want to set the cookie yourself, you can send it to the browser from within a CGI script by creating one or more Set-Cookie: fields in the HTTP header. Here is a typical sequence:

        my $c = CGI::Cookie->new(-name    => 'foo',
                                 -value   => ['bar','baz'],
                                 -expires => '+3M');

        print "Set-Cookie: $c\n";
        print "Content-Type: text/html\n\n";

    To send more than one cookie, create several Set-Cookie: fields.

    If you are using CGI.pm, you send cookies by providing a -cookie argument to the header() method:

        print header(-cookie=>$c);

    Mod_perl users can set cookies using the request object's headers_out() method:

        $r->headers_out->set('Set-Cookie' => $c);

    Internally, Cookie overloads the "" operator to call its as_string() method when incorporated into the HTTP header. as_string() turns the Cookie's internal representation into an RFC-compliant text representation. You may call as_string() yourself if you prefer:

        print "Set-Cookie: ",$c->as_string,"\n";

    Recovering Previous Cookies

        %cookies = CGI::Cookie->fetch;

    fetch returns an associative array consisting of all cookies returned by the browser. The keys of the array are the cookie names. You can iterate through the cookies this way:

        %cookies = CGI::Cookie->fetch;
        for (keys %cookies) {
            do_something($cookies{$_});
        }

    In a scalar context, fetch() returns a hash reference, which may be more efficient if you are manipulating multiple cookies.

    CGI.pm uses the URL escaping methods to save and restore reserved characters in its cookies. If you are trying to retrieve a cookie set by a foreign server, this escaping method may trip you up. Use raw_fetch() instead, which has the same semantics as fetch(), but performs no unescaping.

    You may also retrieve cookies that were stored in some external form using the parse() class method:

        $COOKIES = `cat /usr/tmp/Cookie_stash`;
        %cookies = CGI::Cookie->parse($COOKIES);

    If you are in a mod_perl environment, you can save some overhead by passing the request object to fetch() like this:

        CGI::Cookie->fetch($r);

    If the value passed to parse() is undefined, an empty array will be returned in list context, and an empty hashref will be returned in scalar context.
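
    The difference between fetching with and without unescaping can be illustrated with a small stand-alone sketch. Both helpers below are hypothetical and much simpler than the module (single-valued cookies only, plain strings instead of cookie objects); they only show why a foreign cookie containing "%" sequences may come back altered unless the raw form is used.

```perl
use strict;
use warnings;

# Hypothetical helper: decode %XX escapes.
sub url_unescape {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Hypothetical helper: split "name=value; name2=value2" into a hash.
# With a true second argument, values are returned exactly as sent.
sub parse_cookie_header {
    my ($header, $raw) = @_;
    my %cookies;
    return %cookies unless defined $header;
    for my $pair (split /\s*;\s*/, $header) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        unless ($raw) {
            $_ = url_unescape($_) for $name, $value;
        }
        $cookies{$name} = $value;
    }
    return %cookies;
}

my $header  = 'ID=123456; greeting=hello%20world';
my %decoded = parse_cookie_header($header);
my %as_sent = parse_cookie_header($header, 1);
print "decoded: $decoded{greeting}\n";
print "raw:     $as_sent{greeting}\n";
```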

    Manipulating Cookies

    Cookie objects have a series of accessor methods to get and set cookie attributes. Each accessor has a similar syntax. Called without arguments, the accessor returns the current value of the attribute. Called with an argument, the accessor changes the attribute and returns its new value.

    • name()

      Get or set the cookie's name. Example:

          $name = $c->name;
          $new_name = $c->name('fred');

    • value()

      Get or set the cookie's value. Example:

          $value = $c->value;
          @new_value = $c->value(['a','b','c','d']);

      value() is context sensitive. In a list context it will return the current value of the cookie as an array. In a scalar context it will return the first value of a multivalued cookie.

    • domain()

      Get or set the cookie's domain.

    • path()

      Get or set the cookie's path.

    • expires()

      Get or set the cookie's expiration time.
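
    The context sensitivity of value() is a standard Perl idiom built on wantarray. The class below is a hypothetical stand-in written for this illustration, not CGI::Cookie itself; it shows how one accessor can return every value in list context and only the first in scalar context.

```perl
use strict;
use warnings;

package Sketch::Cookie;    # hypothetical class for illustration only

sub new {
    my ($class, @values) = @_;
    return bless { value => [@values] }, $class;
}

# List context: all values. Scalar context: only the first value.
sub value {
    my ($self, $new) = @_;
    if (defined $new) {
        $self->{value} = ref($new) eq 'ARRAY' ? [@$new] : [$new];
    }
    return wantarray ? @{ $self->{value} } : $self->{value}[0];
}

package main;

my $c     = Sketch::Cookie->new('a', 'b', 'c');
my @all   = $c->value;    # list context
my $first = $c->value;    # scalar context
print "all: @all, first: $first\n";
```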

    AUTHOR INFORMATION

    Copyright 1997-1998, Lincoln D. Stein. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Address bug reports and comments to: lstein@cshl.org

    BUGS

    This section intentionally left blank.

    SEE ALSO

    CGI::Carp, CGI

    RFC 2109, RFC 2965

     
    CGI::Fast

    NAME

    CGI::Fast - CGI Interface for Fast CGI

    SYNOPSIS

        use CGI::Fast qw(:standard);
        $COUNTER = 0;
        while (new CGI::Fast) {
            print header;
            print start_html("Fast CGI Rocks");
            print
                h1("Fast CGI Rocks"),
                "Invocation number ",b($COUNTER++),
                " PID ",b($$),".",
                hr;
            print end_html;
        }

    DESCRIPTION

    CGI::Fast is a subclass of the CGI object created by CGI.pm. It is specialized to work well with the FCGI module, which greatly speeds up CGI scripts by turning them into persistently running server processes. Scripts that perform time-consuming initialization processes, such as loading large modules or opening persistent database connections, will see large performance improvements.

    OTHER PIECES OF THE PUZZLE

    In order to use CGI::Fast you'll need the FCGI module. See http://www.cpan.org/ for details.

    WRITING FASTCGI PERL SCRIPTS

    FastCGI scripts are persistent: one or more copies of the script are started up when the server initializes, and stay around until the server exits or they die a natural death. After performing whatever one-time initialization it needs, the script enters a loop waiting for incoming connections, processing the request, and waiting some more.

    A typical FastCGI script will look like this:

        #!/usr/bin/perl
        use CGI::Fast;
        &do_some_initialization();
        while ($q = new CGI::Fast) {
            &process_request($q);
        }

    Each time there's a new request, CGI::Fast returns a CGI object to your loop. The rest of the time your script waits in the call to new(). When the server requests that your script be terminated, new() will return undef. You can of course exit earlier if you choose. A new version of the script will be respawned to take its place (this may be necessary in order to avoid Perl memory leaks in long-running scripts).
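
    The life cycle described here (initialize once, then serve requests until the accept call returns undef) can be tried out without a FastCGI server. In the sketch below, next_request() is a hypothetical stand-in for the call to new(), fed from a fixed list; it only demonstrates that state set up before the loop survives across requests.

```perl
use strict;
use warnings;

my @pending     = ('GET /a', 'GET /b', 'GET /c');    # stand-in for incoming requests
my $initialized = 0;
my $served      = 0;

sub do_some_initialization { $initialized++ }        # expensive setup, done once
sub next_request           { return shift @pending } # hypothetical; undef ends the loop
sub process_request {
    my ($request) = @_;
    $served++;
    print "[$$] request #$served: $request\n";
}

do_some_initialization();
while (defined(my $request = next_request())) {
    process_request($request);
}
print "initialized $initialized time(s), served $served request(s)\n";
```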

    CGI.pm's default CGI object mode also works. Just modify the loop this way:

        while (new CGI::Fast) {
            &process_request;
        }

    Calls to header(), start_form(), etc. will all operate on the current request.

    INSTALLING FASTCGI SCRIPTS

    See the FastCGI developer's kit documentation for full details. On the Apache server, the following line must be added to srm.conf:

        AddType application/x-httpd-fcgi .fcgi

    FastCGI scripts must end in the extension .fcgi. For each script you install, you must add something like the following to srm.conf:

        FastCgiServer /usr/etc/httpd/fcgi-bin/file_upload.fcgi -processes 2

    This instructs Apache to launch two copies of file_upload.fcgi at startup time.

    USING FASTCGI SCRIPTS AS CGI SCRIPTS

    Any script that works correctly as a FastCGI script will also work correctly when installed as a vanilla CGI script. However it will not see any performance benefit.

    EXTERNAL FASTCGI SERVER INVOCATION

    FastCGI supports a TCP/IP transport mechanism which allows FastCGI scripts to run external to the webserver, perhaps on a remote machine. To configure the webserver to connect to an external FastCGI server, you would add the following to your srm.conf:

        FastCgiExternalServer /usr/etc/httpd/fcgi-bin/file_upload.fcgi -host sputnik:8888

    Two environment variables affect how the CGI::Fast object is created, allowing CGI::Fast to be used as an external FastCGI server. (See FCGI documentation for FCGI::OpenSocket for more information.)

    • FCGI_SOCKET_PATH

      The address (TCP/IP) or path (UNIX Domain) of the socket to which the external FastCGI script binds and on which it listens for incoming connections from the web server.

    • FCGI_LISTEN_QUEUE

      Maximum length of the queue of pending connections.

    For example:

        #!/usr/local/bin/perl    # must be a FastCGI version of perl!
        use CGI::Fast;
        &do_some_initialization();
        $ENV{FCGI_SOCKET_PATH} = "sputnik:8888";
        $ENV{FCGI_LISTEN_QUEUE} = 100;
        while ($q = new CGI::Fast) {
            &process_request($q);
        }

    CAVEATS

    I haven't tested this very much.

    AUTHOR INFORMATION

    Copyright 1996-1998, Lincoln D. Stein. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Address bug reports and comments to: lstein@cshl.org

    BUGS

    This section intentionally left blank.

    SEE ALSO

    CGI::Carp, CGI

     
    CGI::Pretty

    NAME

    CGI::Pretty - module to produce nicely formatted HTML code

    SYNOPSIS

        use CGI::Pretty qw( :html3 );

        # Print a table with a single data element
        print table( TR( td( "foo" ) ) );

    DESCRIPTION

    CGI::Pretty is a module that derives from CGI. Its sole function is to allow users of CGI to output nicely formatted HTML code.

    When using the CGI module, the following code:

        print table( TR( td( "foo" ) ) );

    produces the following output:

        <TABLE><TR><TD>foo</TD></TR></TABLE>

    If a user were to create a table consisting of many rows and many columns, the resultant HTML code would be quite difficult to read since it has no carriage returns or indentation.

    CGI::Pretty fixes this problem. It adds carriage returns and indentation to the HTML code so that one can easily read it. The same statement:

        print table( TR( td( "foo" ) ) );

    now produces the following output:

        <TABLE>
           <TR>
              <TD>foo</TD>
           </TR>
        </TABLE>

    Recommendation for when to use CGI::Pretty

    CGI::Pretty is far slower than using CGI.pm directly. A benchmark showed that it could be about 10 times slower. Adding newlines and spaces may alter the rendered appearance of HTML. The extra newlines and spaces also make the output larger, so pages take longer to download.

    With all those considerations, it is recommended that CGI::Pretty be used primarily for debugging.

    Tags that won't be formatted

    The following tags are not formatted: <a>, <pre>, <code>, <script>, <textarea>, and <td>. If these tags were formatted, the user would see the extra indentation on the web browser causing the page to look different than what would be expected. If you wish to add more tags to the list of tags that are not to be touched, push them onto the @AS_IS array:

        push @CGI::Pretty::AS_IS,qw(XMP);

    Customizing the Indenting

    If you wish to have your own personal style of indenting, you can change the $INDENT variable:

        $CGI::Pretty::INDENT = "\t\t";

    would cause the indents to be two tabs.

    Similarly, if you wish to have more space between lines, you may change the $LINEBREAK variable:

        $CGI::Pretty::LINEBREAK = "\n\n";

    would create two carriage returns between lines.

    If you decide you want to use the regular CGI indenting, you can easily do the following:

        $CGI::Pretty::INDENT = $CGI::Pretty::LINEBREAK = "";
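
    To picture how an indent string, a line-break string, and the as-is list interact, here is a minimal stand-alone sketch. It does not load CGI::Pretty and is not how that module is implemented; the names pretty(), flat() and %AS_IS, and the nested-array tree format, are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for @CGI::Pretty::AS_IS: tags whose content is
# left on one line.
my %AS_IS = map { $_ => 1 } qw(a pre code script textarea td);

# A node is either a plain string or an array ref: [ tag_name, @children ].

# Render a node with no added whitespace at all.
sub flat {
    my ($node) = @_;
    return $node unless ref $node;
    my ($tag, @kids) = @$node;
    return "<$tag>" . join('', map { flat($_) } @kids) . "</$tag>";
}

# Render a node with one $indent per nesting level and $linebreak after
# each line, except inside as-is tags.
sub pretty {
    my ($node, $indent, $linebreak, $depth) = @_;
    $depth ||= 0;
    my $pad = $indent x $depth;
    return $pad . flat($node) . $linebreak
        if !ref($node) || $AS_IS{ $node->[0] };
    my ($tag, @kids) = @$node;
    return "$pad<$tag>$linebreak"
         . join('', map { pretty($_, $indent, $linebreak, $depth + 1) } @kids)
         . "$pad</$tag>$linebreak";
}

my $tree = [ table => [ tr => [ td => 'foo' ] ] ];
print pretty($tree, "\t", "\n");   # indented, one tag per line
print pretty($tree, '', ''), "\n"; # empty strings give the compact form
```

    Passing empty strings for both values reproduces the unformatted output, which mirrors the effect of setting $INDENT and $LINEBREAK to "" as described above.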

    AUTHOR

    Brian Paulsen <Brian@ThePaulsens.com>, with minor modifications by Lincoln Stein <lstein@cshl.org> for incorporation into the CGI.pm distribution.

    Copyright 1999, Brian Paulsen. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Bug reports and comments to Brian@ThePaulsens.com. You can also write to lstein@cshl.org, but this code looks pretty hairy to me and I'm not sure I understand it!

    SEE ALSO

    CGI

     

    CGI::Push

    NAME

    CGI::Push - Simple Interface to Server Push

    SYNOPSIS

        use CGI::Push qw(:standard);

        do_push(-next_page=>\&next_page,
                -last_page=>\&last_page,
                -delay=>0.5);

        sub next_page {
            my($q,$counter) = @_;
            return undef if $counter >= 10;
            return start_html('Test'),
                   h1('Visible'),"\n",
                   "This page has been called ", strong($counter)," times",
                   end_html();
        }

        sub last_page {
            my($q,$counter) = @_;
            return start_html('Done'),
                   h1('Finished'),
                   strong($counter - 1),' iterations.',
                   end_html;
        }

    DESCRIPTION

    CGI::Push is a subclass of the CGI object created by CGI.pm. It is specialized for server push operations, which allow you to create animated pages whose content changes at regular intervals.

    You provide CGI::Push with a pointer to a subroutine that will draw one page. Every time your subroutine is called, it generates a new page. The contents of the page will be transmitted to the browser in such a way that it will replace what was there beforehand. The technique will work with HTML pages as well as with graphics files, allowing you to create animated GIFs.

    Only Netscape Navigator supports server push. Internet Explorer browsers do not.

    USING CGI::Push

    CGI::Push adds one new method to the standard CGI suite, do_push(). When you call this method, you pass it a reference to a subroutine that is responsible for drawing each new page, an interval delay, and an optional subroutine for drawing the last page. Other optional parameters include most of those recognized by the CGI header() method.

    You may call do_push() in the object oriented manner or not, as you prefer:

        use CGI::Push;
        $q = new CGI::Push;
        $q->do_push(-next_page=>\&draw_a_page);

            -or-

        use CGI::Push qw(:standard);
        do_push(-next_page=>\&draw_a_page);

    Parameters are as follows:

    • -next_page
          do_push(-next_page=>\&my_draw_routine);

      This required parameter is a reference to the subroutine responsible for drawing each new page. The subroutine should expect two parameters: the CGI object and a counter indicating the number of times the subroutine has been called. It should return the contents of the page as an array of one or more items to print. It can return a false value (or an empty array) in order to abort the redrawing loop and print out the final page (if any).

          sub my_draw_routine {
              my($q,$counter) = @_;
              return undef if $counter > 100;
              return start_html('testing'),
                     h1('testing'),
                     "This page called $counter times";
          }

      You are of course free to create and use global variables within your draw routine in order to achieve special effects.

    • -last_page

      This optional parameter points to a reference to the subroutine responsible for drawing the last page of the series. It is called after the -next_page routine returns a false value. The subroutine itself should have exactly the same calling conventions as the -next_page routine.

    • -type

      This optional parameter indicates the content type of each page. It defaults to "text/html". Normally the module assumes that each page is of a homogeneous MIME type. However if you provide either of the magic values "heterogeneous" or "dynamic" (the latter provided for the convenience of those who hate long parameter names), you can specify the MIME type -- and other header fields -- on a per-page basis. See "heterogeneous pages" for more details.

    • -delay

      This indicates the delay, in seconds, between frames. Smaller delays refresh the page faster. Fractional values are allowed.

      If not specified, -delay will default to 1 second.

    • -cookie, -target, -expires, -nph

      These have the same meaning as the like-named parameters in CGI::header().

      If not specified, -nph will default to 1 (as needed for many servers, see below).
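
    The calling convention for the -next_page and -last_page routines can be tried out without a browser. The driver below is a hypothetical sketch, not CGI::Push's implementation: run_pages() is an invented name, the -query argument is a stand-in for the CGI object, and pages are collected in a list instead of being sent as a multipart response.

```perl
use strict;
use warnings;

# Hypothetical driver that mimics the documented contract: call the
# -next_page routine with (object, counter) until it returns a false
# value or an empty list, then call the optional -last_page routine.
sub run_pages {
    my (%args) = @_;
    my $next = $args{-next_page} or die "-next_page is required\n";
    my $last = $args{-last_page};
    my $q    = $args{-query};            # stand-in for the CGI object
    my $counter = 0;
    my @sent;
    while (1) {
        my @page = $next->($q, $counter++);
        last unless @page && defined $page[0];
        push @sent, join '', @page;      # a page is a list of items to print
    }
    push @sent, join '', $last->($q, $counter) if $last;
    return @sent;
}

my @pages = run_pages(
    -next_page => sub {
        my ($q, $counter) = @_;
        return undef if $counter >= 3;
        return "This page has been called ", $counter, " times";
    },
    -last_page => sub {
        my ($q, $counter) = @_;
        return "Finished: ", $counter - 1, " iterations.";
    },
);
print "$_\n" for @pages;
```

    The sketch increments its counter after every call, including the call that ends the loop, which is why the last-page routine subtracts one, as the SYNOPSIS does.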

    Heterogeneous Pages

    Ordinarily all pages displayed by CGI::Push share a common MIME type. However by providing a value of "heterogeneous" or "dynamic" in the do_push() -type parameter, you can specify the MIME type of each page on a case-by-case basis.

    If you use this option, you will be responsible for producing the HTTP header for each page. Simply modify your draw routine to look like this:

        sub my_draw_routine {
            my($q,$counter) = @_;
            return header('text/html'), # note we're producing the header here
                   start_html('testing'),
                   h1('testing'),
                   "This page called $counter times";
        }

    You can add any header fields that you like, but some (cookies and status fields included) may not be interpreted by the browser. One interesting effect is to display a series of pages, then, after the last page, to redirect the browser to a new URL. Because redirect() does not work, the easiest way is with a -refresh header field, as shown below:

        sub my_draw_routine {
            my($q,$counter) = @_;
            return undef if $counter > 10;
            return header('text/html'), # note we're producing the header here
                   start_html('testing'),
                   h1('testing'),
                   "This page called $counter times";
        }

        sub my_last_page {
            return header(-refresh=>'5; URL=http://somewhere.else/finished.html',
                          -type=>'text/html'),
                   start_html('Moved'),
                   h1('This is the last page'),
                   'Goodbye!',
                   hr,
                   end_html;
        }

    Changing the Page Delay on the Fly

    If you would like to control the delay between pages on a page-by-page basis, call push_delay() from within your draw routine. push_delay() takes a single numeric argument representing the number of seconds you wish to delay after the current page is displayed and before displaying the next one. The delay may be fractional. Without parameters, push_delay() just returns the current delay.

    INSTALLING CGI::Push SCRIPTS

    Server push scripts must be installed as no-parsed-header (NPH) scripts in order to work correctly on many servers. On Unix systems, this is most often accomplished by prefixing the script's name with "nph-". Recognition of NPH scripts happens automatically with WebSTAR and Microsoft IIS. Users of other servers should see their documentation for help.

    From version 1.3b2 onward, the Apache web server does not need server push scripts to be installed as NPH scripts: the -nph parameter to do_push() may be set to a false value to disable the extra headers needed by an NPH script.

    AUTHOR INFORMATION

    Copyright 1995-1998, Lincoln D. Stein. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Address bug reports and comments to: lstein@cshl.org

    BUGS

    This section intentionally left blank.

    SEE ALSO

    CGI::Carp, CGI

     

    CGI::Switch

    NAME

    CGI::Switch - Backward compatibility module for defunct CGI::Switch

    SYNOPSIS

    Do not use this module. It is deprecated.

    ABSTRACT

    DESCRIPTION

    AUTHOR INFORMATION

    BUGS

    SEE ALSO

     

    CGI::Util

    NAME

    CGI::Util - Internal utilities used by CGI module

    SYNOPSIS

    none

    DESCRIPTION

    no public subroutines

    AUTHOR INFORMATION

    Copyright 1995-1998, Lincoln D. Stein. All rights reserved.

    This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

    Address bug reports and comments to: lstein@cshl.org. When sending bug reports, please provide the version of CGI.pm, the version of Perl, the name and version of your Web server, and the name and version of the operating system you are using. If the problem is even remotely browser dependent, please provide information about the affected browsers as well.

    SEE ALSO

    CGI

     

    B::Concise

    NAME

    B::Concise - Walk Perl syntax tree, printing concise info about ops

    SYNOPSIS

        perl -MO=Concise[,OPTIONS] foo.pl

        use B::Concise qw(set_style add_callback);

    DESCRIPTION

    This compiler backend prints the internal OPs of a Perl program's syntax tree in one of several space-efficient text formats suitable for debugging the inner workings of perl or other compiler backends. It can print OPs in the order they appear in the OP tree, in the order they will execute, or in a text approximation to their tree structure, and the format of the information displayed is customizable. Its function is similar to that of perl's -Dx debugging flag or the B::Terse module, but it is more sophisticated and flexible.

    EXAMPLE

    Here are two outputs (or 'renderings'), using the -exec and -basic (i.e. default) formatting conventions on the same code snippet.

        % perl -MO=Concise,-exec -e '$a = $b + 42'
        1  <0> enter
        2  <;> nextstate(main 1 -e:1) v
        3  <#> gvsv[*b] s
        4  <$> const[IV 42] s
     *  5  <2> add[t3] sK/2
        6  <#> gvsv[*a] s
        7  <2> sassign vKS/2
        8  <@> leave[1 ref] vKP/REFC

    In this -exec rendering, each opcode is executed in the order shown. The add opcode, marked with '*', is discussed in more detail.

    The 1st column is the op's sequence number, starting at 1, and is displayed in base 36 by default. Here they're purely linear; the sequences are very helpful when looking at code with loops and branches.

    The symbol between angle brackets indicates the op's type. For example, <2> is a BINOP, <@> is a LISTOP, and <#> is a PADOP, which is used in threaded perls (see OP class abbreviations).

    The opname, as in 'add[t1]', may be followed by op-specific information in parentheses or brackets (ex '[t1]').

    The op-flags (ex 'sK/2') are described in (OP flags abbreviations).

        % perl -MO=Concise -e '$a = $b + 42'
        8  <@> leave[1 ref] vKP/REFC ->(end)
        1     <0> enter ->2
        2     <;> nextstate(main 1 -e:1) v ->3
        7     <2> sassign vKS/2 ->8
     *  5        <2> add[t1] sK/2 ->6
        -           <1> ex-rv2sv sK/1 ->4
        3              <$> gvsv(*b) s ->4
        4           <$> const(IV 42) s ->5
        -        <1> ex-rv2sv sKRM*/1 ->7
        6           <$> gvsv(*a) s ->7

    The default rendering is top-down, so they're not in execution order. This form reflects the way the stack is used to parse and evaluate expressions; the add operates on the two terms below it in the tree.

    Nullops appear as ex-opname, where opname is an op that has been optimized away by perl. They're displayed with a sequence number of '-' because they are not executed (they don't appear in the previous example); they're printed here because they reflect the parse.

    The arrow points to the sequence number of the next op; they're not displayed in -exec mode, for obvious reasons.

    Note that because this rendering was done on a non-threaded perl, the PADOPs in the previous examples are now SVOPs, and some (but not all) of the square brackets have been replaced by round ones. This is a subtle feature to provide some visual distinction between renderings on threaded and un-threaded perls.

    OPTIONS

    Arguments that don't start with a hyphen are taken to be the names of subroutines or formats to render; if no such functions are specified, the main body of the program (outside any subroutines, and not including use'd or require'd files) is rendered. Passing BEGIN, UNITCHECK, CHECK, INIT, or END will cause all of the corresponding special blocks to be printed. Arguments must follow options.

    Options affect how things are rendered (i.e. printed). They're presented here by their visual effect, 1st being strongest. They're grouped according to how they interrelate; within each group the options are mutually exclusive (unless otherwise stated).

    Options for Opcode Ordering

    These options control the 'vertical display' of opcodes. The display 'order' is also called 'mode' elsewhere in this document.

    • -basic

      Print OPs in the order they appear in the OP tree (a preorder traversal, starting at the root). The indentation of each OP shows its level in the tree, and the '->' at the end of the line indicates the next opcode in execution order. This mode is the default, so the flag is included simply for completeness.

    • -exec

      Print OPs in the order they would normally execute (for the majority of constructs this is a postorder traversal of the tree, ending at the root). In most cases the OP that usually follows a given OP will appear directly below it; alternate paths are shown by indentation. In cases like loops when control jumps out of a linear path, a 'goto' line is generated.

    • -tree

      Print OPs in a text approximation of a tree, with the root of the tree at the left and 'left-to-right' order of children transformed into 'top-to-bottom'. Because this mode grows both to the right and down, it isn't suitable for large programs (unless you have a very wide terminal).

    Options for Line-Style

    These options select the line-style (or just style) used to render each opcode, and dictate what info is actually printed into each line.

    • -concise

      Use the author's favorite set of formatting conventions. This is the default, of course.

    • -terse

      Use formatting conventions that emulate the output of B::Terse. The basic mode is almost indistinguishable from the real B::Terse, and the exec mode looks very similar, but is in a more logical order and lacks curly brackets. B::Terse doesn't have a tree mode, so the tree mode is only vaguely reminiscent of B::Terse.

    • -linenoise

      Use formatting conventions in which the name of each OP, rather than being written out in full, is represented by a one- or two-character abbreviation. This is mainly a joke.

    • -debug

      Use formatting conventions reminiscent of B::Debug; these aren't very concise at all.

    • -env

      Use formatting conventions read from the environment variables B_CONCISE_FORMAT, B_CONCISE_GOTO_FORMAT, and B_CONCISE_TREE_FORMAT.

    Options for tree-specific formatting

    • -compact

      Use a tree format in which the minimum amount of space is used for the lines connecting nodes (one character in most cases). This squeezes out a few precious columns of screen real estate.

    • -loose

      Use a tree format that uses longer edges to separate OP nodes. This format tends to look better than the compact one, especially in ASCII, and is the default.

    • -vt

      Use tree connecting characters drawn from the VT100 line-drawing set. This looks better if your terminal supports it.

    • -ascii

      Draw the tree with standard ASCII characters like + and |. These don't look as clean as the VT100 characters, but they'll work with almost any terminal (or the horizontal scrolling mode of less(1)) and are suitable for text documentation or email. This is the default.

    These are pairwise exclusive, i.e. compact or loose, vt or ascii.

    Options controlling sequence numbering

    • -basen

      Print OP sequence numbers in base n. If n is greater than 10, the eleventh digit (the one with value 10) will be 'a', and so on. If n is greater than 36, the thirty-seventh digit (value 36) will be 'A', and so on until 62. Values greater than 62 are not currently supported. The default is 36.

    • -bigendian

      Print sequence numbers with the most significant digit first. This is the usual convention for Arabic numerals, and the default.

    • -littleendian

      Print sequence numbers with the least significant digit first. This is obviously mutually exclusive with bigendian.
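
    The numbering scheme can be sketched in a few lines. This is an illustration under stated assumptions, not B::Concise's own routine: seq_label() and @DIGITS are invented names, and the digit alphabet (0-9, then a-z, then A-Z) is inferred from the description above.

```perl
use strict;
use warnings;

# Assumed digit alphabet: 0-9, then a-z, then A-Z (62 symbols in all).
my @DIGITS = (0 .. 9, 'a' .. 'z', 'A' .. 'Z');

# Convert a sequence number to a label in the given base. A true third
# argument emits the least significant digit first, as -littleendian does.
sub seq_label {
    my ($num, $base, $little_endian) = @_;
    die "base must be between 2 and 62\n" if $base < 2 || $base > 62;
    my @out;
    do {
        unshift @out, $DIGITS[ $num % $base ];
        $num = int($num / $base);
    } while ($num > 0);
    @out = reverse @out if $little_endian;
    return join '', @out;
}

print seq_label(9,  36), "\n";      # 9
print seq_label(10, 36), "\n";      # a
print seq_label(20, 36), "\n";      # k
print seq_label(36, 36), "\n";      # 10
print seq_label(36, 36, 1), "\n";   # 01
```

    Under this scheme the twentieth op in a rendering is labelled 'k' in the default base 36.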

    Other options

    • -src

      With this option, the rendering of each statement (starting with the nextstate OP) will be preceded by the 1st line of source code that generates it. For example:

          1  <0> enter
          # 1: my $i;
          2  <;> nextstate(main 1 junk.pl:1) v:{
          3  <0> padsv[$i:1,10] vM/LVINTRO
          # 3: for $i (0..9) {
          4  <;> nextstate(main 3 junk.pl:3) v:{
          5  <0> pushmark s
          6  <$> const[IV 0] s
          7  <$> const[IV 9] s
          8  <{> enteriter(next->j last->m redo->9)[$i:1,10] lKS
          k  <0> iter s
          l  <|> and(other->9) vK/1
          # 4:     print "line ";
          9      <;> nextstate(main 2 junk.pl:4) v
          a      <0> pushmark s
          b      <$> const[PV "line "] s
          c      <@> print vK
          # 5:     print "$i\n";
          ...

    • -stash="somepackage"

      With this, "somepackage" will be required, then the stash is inspected, and each function is rendered.

    The following options are pairwise exclusive.

    • -main

      Include the main program in the output, even if subroutines were also specified. This rendering is normally suppressed when a subroutine name or reference is given.

    • -nomain

      This restores the default behavior after you've changed it with '-main' (it's not normally needed). If no subroutine name/ref is given, main is rendered, regardless of this flag.

    • -nobanner

      Renderings usually include a banner line identifying the function name or stringified subref. This suppresses the printing of the banner.

      TBC: Remove the stringified coderef; while it provides a 'cookie' for each function rendered, the cookies used should be 1,2,3.. not a random hex-address. It also complicates string comparison of two different trees.

    • -banner

      restores default banner behavior.

    • -banneris => subref

      TBC: a hookpoint (and an option to set it) for a user-supplied function to produce a banner appropriate for the user's needs. It's not ideal, because the rendering-state variables, which are a natural candidate for use in concise.t, are unavailable to the user.

    Option Stickiness

    If you invoke Concise more than once in a program, you should know that the options are 'sticky'. This means that the options you provide in the first call will be remembered for the 2nd call, unless you re-specify or change them.

    ABBREVIATIONS

    The concise style uses symbols to convey maximum info with minimal clutter (like hex addresses). With just a little practice, you can start to see the flowers, not just the branches, in the trees.

    OP class abbreviations

    These symbols appear before the op-name, and indicate the B:: namespace that represents the ops in your Perl code.

        0      OP (aka BASEOP)  An OP with no children
        1      UNOP             An OP with one child
        2      BINOP            An OP with two children
        |      LOGOP            A control branch OP
        @      LISTOP           An OP that could have lots of children
        /      PMOP             An OP with a regular expression
        $      SVOP             An OP with an SV
        "      PVOP             An OP with a string
        {      LOOP             An OP that holds pointers for a loop
        ;      COP              An OP that marks the start of a statement
        #      PADOP            An OP with a GV on the pad

    OP flags abbreviations

    OP flags are either public or private. The public flags alter the behavior of each opcode in consistent ways, and are represented by 0 or more single characters.

        v      OPf_WANT_VOID    Want nothing (void context)
        s      OPf_WANT_SCALAR  Want single value (scalar context)
        l      OPf_WANT_LIST    Want list of any length (list context)
                                Want is unknown
        K      OPf_KIDS         There is a firstborn child.
        P      OPf_PARENS       This operator was parenthesized.
                                (Or block needs explicit scope entry.)
        R      OPf_REF          Certified reference.
                                (Return container, not containee).
        M      OPf_MOD          Will modify (lvalue).
        S      OPf_STACKED      Some arg is arriving on the stack.
        *      OPf_SPECIAL      Do something weird for this op (see op.h)

    Private flags, if any are set for an opcode, are displayed after a '/'

        8  <@> leave[1 ref] vKP/REFC ->(end)
        7     <2> sassign vKS/2 ->8

    They're opcode specific, and occur less often than the public ones, so they're represented by short mnemonics instead of single-chars; see op.h for gory details, or try this quick 2-liner:

        $> perl -MB::Concise -de 1
        DB<1> |x \%B::Concise::priv

    FORMATTING SPECIFICATIONS

    For each line-style ('concise', 'terse', 'linenoise', etc.) there are 3 format-specs which control how OPs are rendered.

    The first is the 'default' format, which is used in both basic and exec modes to print all opcodes. The 2nd, goto-format, is used in exec mode when branches are encountered. They're not real opcodes, and are inserted to look like a closing curly brace. The tree-format is tree specific.

    When a line is rendered, the correct format-spec is copied and scanned for the following items; data is substituted in, and other manipulations like basic indenting are done, for each opcode rendered.

    There are 3 kinds of items that may be populated: special patterns, #vars, and literal text, which is copied verbatim. (Yes, it's a set of s///g steps.)

    Special Patterns

    These items are the primitives used to perform indenting, and to select text from amongst alternatives.

    • (x(exec_text;basic_text)x)

      Generates exec_text in exec mode, or basic_text in basic mode.

    • (*(text)*)

      Generates one copy of text for each indentation level.

    • (*(text1;text2)*)

      Generates one fewer copies of text1 than the indentation level, followed by one copy of text2 if the indentation level is more than 0.

    • (?(text1#varText2)?)

      If the value of var is true (not empty or zero), generates the value of var surrounded by text1 and Text2, otherwise nothing.

    • ~

      Any number of tildes and surrounding whitespace will be collapsed to a single space.

    # Variables

    These #vars represent opcode properties that you may want as part of your rendering. The '#' is intended as a private sigil; a #var's value is interpolated into the style-line, much like "read $this".

    These vars take 3 forms:

    • #var

      A property named 'var' is assumed to exist for the opcodes, and is interpolated into the rendering.

    • #varN

      Generates the value of var, left justified to fill N spaces. Note that this means while you can have properties 'foo' and 'foo2', you cannot render 'foo2', but you could with 'foo2a'. You would be wise not to rely on this behavior going forward ;-)

    • #Var

      This ucfirst form of #var generates a tag-value form of itself for display; it converts '#Var' into a 'Var => #var' style, which is then handled as described above. (Imp-note: #Vars cannot be used for conditional-fills, because the => #var transform is done after the check for #Var's value).
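
    Since a format-spec is filled in by a series of s///g steps, the patterns and #vars above can be tried out with a small stand-alone expander. The following is a simplified sketch, not the code B::Concise uses: expand_format(), its argument order, and the sample format string are all invented for illustration, and the #Var form is left out.

```perl
use strict;
use warnings;

# Hypothetical, simplified expander for one format-spec.
#   $fmt   - the format string
#   $mode  - 'exec' or 'basic'
#   $level - indentation level of the op being rendered
#   $vars  - hash ref of #var values for that op
sub expand_format {
    my ($fmt, $mode, $level, $vars) = @_;
    my $text = $fmt;

    # (x(exec_text;basic_text)x): pick one alternative by mode
    $text =~ s/\(x\((.*?);(.*?)\)x\)/$mode eq 'exec' ? $1 : $2/egs;

    # (*(text)*): one copy of text per indentation level
    $text =~ s/\(\*\(([^;]*?)\)\*\)/$1 x $level/egs;

    # (*(text1;text2)*): level-1 copies of text1, then text2 if level > 0
    $text =~ s/\(\*\((.*?);(.*?)\)\*\)/$level > 0 ? ($1 x ($level - 1)) . $2 : ''/egs;

    # (?(text1#varText2)?): emitted only when var has a true value
    $text =~ s/\(\?\(([^#]*?)#(\w+)([^#]*?)\)\?\)/$vars->{$2} ? $1 . $vars->{$2} . $3 : ''/egs;

    # #varN: value of var, left justified in N columns
    $text =~ s/#([a-zA-Z]+)(\d+)/sprintf("%-$2s", $vars->{$1})/eg;

    # #var: plain interpolation
    $text =~ s/#([a-zA-Z]+)/$vars->{$1}/eg;

    # runs of tildes plus surrounding whitespace collapse to one space
    $text =~ s/[ \t]*~+[ \t]*/ /g;

    return $text;
}

my $fmt = '(*(  )*)#seq3<#name>(?( [#arg])?)~(x(;->#next)x)';
my %op  = (seq => '5', name => 'add', arg => 't1', next => '6');
print expand_format($fmt, 'basic', 2, \%op), "\n";   # "    5  <add> [t1] ->6"
```

    Because the #varN step runs before the plain #var step, a trailing digit is always read as a width, which is the limitation noted for properties such as 'foo2'.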

    The following variables are 'defined' by B::Concise; when they are used in a style, their respective values are plugged into the rendering of each opcode.

    Only some of these are used by the standard styles, the others are provided for you to delve into optree mechanics, should you wish to add a new style (see add_style below) that uses them. You can also add new ones using add_callback.

    • #addr

      The address of the OP, in hexadecimal.

    • #arg

      The OP-specific information of the OP (such as the SV for an SVOP, the non-local exit pointers for a LOOP, etc.) enclosed in parentheses.

    • #class

      The B-determined class of the OP, in all caps.

    • #classsym

      A single symbol abbreviating the class of the OP.

    • #coplabel

      The label of the statement or block the OP is the start of, if any.

    • #exname

      The name of the OP, or 'ex-foo' if the OP is a null that used to be a foo.

    • #extarg

      The target of the OP, or nothing for a nulled OP.

    • #firstaddr

      The address of the OP's first child, in hexadecimal.

    • #flags

      The OP's flags, abbreviated as a series of symbols.

    • #flagval

      The numeric value of the OP's flags.

    • #hints

      The COP's hint flags, rendered with abbreviated names if possible. An empty string if this is not a COP. Here are the symbols used:

          $      strict refs
          &      strict subs
          *      strict vars
          x$     explicit use/no strict refs
          x&     explicit use/no strict subs
          x*     explicit use/no strict vars
          i      integers
          l      locale
          b      bytes
          {      block scope
          %      localise %^H
          <      open in
          >      open out
          I      overload int
          F      overload float
          B      overload binary
          S      overload string
          R      overload re
          T      taint
          E      eval
          X      filetest access
          U      utf-8

    • #hintsval

      The numeric value of the COP's hint flags, or an empty string if this is not a COP.

    • #hyphseq

      The sequence number of the OP, or a hyphen if it doesn't have one.

    • #label

      'NEXT', 'LAST', or 'REDO' if the OP is a target of one of those in exec mode, or empty otherwise.

    • #lastaddr

      The address of the OP's last child, in hexadecimal.

    • #name

      The OP's name.

    • #NAME

      The OP's name, in all caps.

    • #next

      The sequence number of the OP's next OP.

    • #nextaddr

      The address of the OP's next OP, in hexadecimal.

    • #noise

      A one- or two-character abbreviation for the OP's name.

    • #private

      The OP's private flags, rendered with abbreviated names if possible.

    • #privval

      The numeric value of the OP's private flags.

    • #seq

      The sequence number of the OP. Note that this is a sequence number generated by B::Concise.

    • #seqnum

      5.8.x and earlier only. 5.9 and later do not provide this.

      The real sequence number of the OP, as a regular number and not adjusted to be relative to the start of the real program. (This will generally be a fairly large number because all of B::Concise is compiled before your program is).

    • #opt

      Whether or not the op has been optimised by the peephole optimiser.

      Only available in 5.9 and later.

    • #sibaddr

      The address of the OP's next youngest sibling, in hexadecimal.

    • #svaddr

      The address of the OP's SV, if it has an SV, in hexadecimal.

    • #svclass

      The class of the OP's SV, if it has one, in all caps (e.g., 'IV').

    • #svval

      The value of the OP's SV, if it has one, in a short human-readable format.

    • #targ

      The numeric value of the OP's targ.

    • #targarg

      The name of the variable the OP's targ refers to, if any, otherwise the letter t followed by the OP's targ in decimal.

    • #targarglife

      Same as #targarg, but followed by the COP sequence numbers that delimit the variable's lifetime (or 'end' for a variable in an open scope).

    • #typenum

      The numeric value of the OP's type, in decimal.

    One-Liner Command tips

    • perl -MO=Concise,bar foo.pl

      Renders only bar() from foo.pl. To see main, drop the ',bar'. To see both, add ',-main'

    • perl -MDigest::MD5=md5 -MO=Concise,md5 -e1

      Identifies md5 as an XS function. The export is needed so that B::Concise can find it in main.

    • perl -MPOSIX -MO=Concise,_POSIX_ARG_MAX -e1

      Identifies _POSIX_ARG_MAX as a constant sub, optimized to an IV. Although POSIX isn't entirely consistent across platforms, this is likely to be present in virtually all of them.

    • perl -MPOSIX -MO=Concise,a -e 'print _POSIX_SAVED_IDS'

      This renders a print statement, which includes a call to the function. It's identical to rendering a file with a use call and that single statement, except for the filename which appears in the nextstate ops.

    • perl -MPOSIX -MO=Concise,a -e 'sub a{_POSIX_SAVED_IDS}'

      This is very similar to the previous example; only the first two ops differ. This subroutine rendering is more representative, insofar as a single main program will have many subs.

    • perl -MB::Concise -e 'B::Concise::compile("-exec","-src", \%B::Concise::)->()'

      This renders all functions in the B::Concise package with the source lines. It eschews the O framework so that the stashref can be passed directly to B::Concise::compile(). See -stash option for a more convenient way to render a package.

    Using B::Concise outside of the O framework

    The common (and original) usage of B::Concise was for command-line renderings of simple code, as given in EXAMPLE. But you can also use B::Concise from your code, and call compile() directly, and repeatedly. By doing so, you can avoid the compile-time only operation of O.pm, and even use the debugger to step through B::Concise::compile() itself.

    Once you're doing this, you may alter Concise output by adding new rendering styles, and optionally by adding callback routines that populate any new variables those styles reference.

    Example: Altering Concise Renderings

        use B::Concise qw(add_style add_callback);
        add_style($yourStyleName => $defaultfmt, $gotofmt, $treefmt);
        add_callback(sub {
            # called once per rendered op; add or change entries in %$h
            my ($h, $op, $format, $level, $stylename) = @_;
            $h->{variable} = some_func($op);
        });
        $walker = B::Concise::compile(@options, @subnames, @subrefs);
        $walker->();

    set_style()

    set_style accepts 3 arguments, and updates the three format-specs comprising a line-style (basic-exec, goto, tree). It has one minor drawback though; it doesn't register the style under a new name. This can become an issue if you render more than once and switch styles. Thus you may prefer to use add_style() and/or set_style_standard() instead.

    set_style_standard($name)

    This puts one of the standard line-styles (terse, concise, linenoise, debug, env) into effect. It also accepts style names previously defined with add_style().

    add_style()

    This subroutine accepts a new style name and three style arguments as above, and creates, registers, and selects the newly named style. It is an error to re-add a style; call set_style_standard() to switch between several styles.

    add_callback()

    If your newly minted styles refer to any new #variables, you'll need to define a callback subroutine that will populate (or modify) those variables. They are then available for use in the style you've chosen.

    The callbacks are called for each opcode visited by Concise, in the same order as they are added. Each subroutine is passed five parameters.

    1. A hashref, containing the variable names and values which are populated into the report-line for the op
    2. the op, as a B::OP object
    3. a reference to the format string
    4. the formatting (indent) level
    5. the selected stylename

    To define your own variables, simply add them to the hash, or change existing values if you need to. The level and format are passed in as references to scalars, but it is unlikely that they will need to be changed or even used.
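
    The pieces above (a new style, a new #variable, and a callback that fills it) can be sketched together as follows. This is a minimal illustration, not the module's canonical usage: the style name opnames, the variable opclass and the sub demo are all invented for the example. The variable name is kept to lowercase letters, and the new style is selected by passing its name as an option to compile().

```perl
use strict;
use warnings;
use B ();
use B::Concise qw(add_style add_callback walk_output);

# "opnames" (style) and "opclass" (variable) are hypothetical names.
# One format each for basic/exec lines, goto lines and tree nodes.
my $fmt = "#name <#opclass>\n";
add_style(opnames => $fmt, "goto #seq\n", $fmt);

# Called for every rendered op; stores the op's B class (e.g. BINOP)
# under the key that the #opclass variable in the style refers to.
add_callback(sub {
    my ($h, $op, $format, $level, $stylename) = @_;
    $h->{opclass} = B::class($op);
});

sub demo { my $n = shift; return $n + 1 }

walk_output(\my $buf);    # capture the rendering in a string
my $walker = B::Concise::compile('-opnames', '-nobanner', \&demo);
$walker->();
print $buf;
```

    Each output line then carries the op name followed by its class in angle brackets. Callbacks run for every style, so a callback that only adds a key is harmless when another style is selected.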

    Running B::Concise::compile()

    compile accepts options as described above in OPTIONS, and arguments, which are either coderefs, or subroutine names.

    It constructs and returns a $treewalker coderef, which when invoked, traverses, or walks, and renders the optrees of the given arguments to STDOUT. You can reuse this, and can change the rendering style used each time; thereafter the coderef renders in the new style.

    walk_output lets you change the print destination from STDOUT to another open filehandle, or into a string passed as a ref (unless you've built perl with -Uuseperlio).

        my $walker = B::Concise::compile('-terse','aFuncName', \&aSubRef); # 1
        walk_output(\my $buf);
        $walker->();                        # 1 renders -terse
        set_style_standard('concise');      # 2
        $walker->();                        # 2 renders -concise
        $walker->(@new);                    # 3 renders whatever
        print "3 different renderings: terse, concise, and @new: $buf\n";

    When $walker is called, it traverses the subroutines supplied when it was created, and renders them using the current style. You can change the style afterwards in several different ways:

    1. call compile, altering style or mode/order
    2. call set_style_standard
    3. call $walker, passing @new options

    Passing new options to the $walker is the easiest way to change amongst any pre-defined styles (the ones you add are automatically recognized as options), and is the only way to alter rendering order without calling compile again. Note however that rendering state is still shared amongst multiple $walker objects, so they must still be used in a coordinated manner.
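
    As a runnable sketch of the third approach, the following builds one walker and renders the same sub twice, first in basic order and then in execution order, sending each rendering to its own string via walk_output(). The sub demo and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use B::Concise qw(walk_output);

sub demo { my $n = shift; return $n * 2 }

# Build the walker once; it remembers which subs to render.
my $walker = B::Concise::compile('-basic', \&demo);

walk_output(\my $basic);    # first rendering goes into $basic
$walker->();

walk_output(\my $exec);     # second rendering goes into $exec
$walker->('-exec');         # same sub, new order passed to the walker

print "basic order:\n$basic\nexec order:\n$exec";
```

    Because rendering state is shared, the -exec order stays in effect for later calls to any walker until another order option is passed.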

    B::Concise::reset_sequence()

    This function (not exported) lets you reset the sequence numbers (note that they're numbered arbitrarily, their goal being to be human readable). Its purpose is mostly to support testing, i.e. to compare the concise output from two identical anonymous subroutines (but different instances). Without the reset, B::Concise, seeing that they're separate optrees, generates different sequence numbers in the output.

    Errors

    Errors in rendering (non-existent function-name, non-existent coderef) are written to STDOUT, or to wherever you've redirected output via walk_output().

    Errors using the various *style* calls, and bad args to walk_output(), result in die(). Use an eval if you wish to catch these errors and continue processing.
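
    A minimal sketch of trapping such errors with eval follows. The style names nosuchstyle and mystyle are hypothetical, and the exact wording of the error messages is not assumed; only the fact that the calls die is relied upon.

```perl
use strict;
use warnings;
use B::Concise qw(add_style set_style_standard);

# Selecting a style that was never defined dies.
my $ok = eval { set_style_standard('nosuchstyle'); 1 };
print "caught: $@" unless $ok;

# Registering the same style name twice dies on the second call.
my $fmt = "#name\n";
add_style(mystyle => $fmt, $fmt, $fmt);
my $again = eval { add_style(mystyle => $fmt, $fmt, $fmt); 1 };
print "caught: $@" unless $again;

# Return to a known style before rendering anything else.
set_style_standard('concise');
```

    Re-selecting a standard style after a failed call is a cautious habit, since the rendering state is shared by every walker in the program.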

    AUTHOR

    Stephen McCamant, <smcc@CSUA.Berkeley.EDU>.

     
    B::Debug

    NAME

    B::Debug - Walk Perl syntax tree, printing debug info about ops

    SYNOPSIS

        perl -MO=Debug foo.pl
        perl -MO=Debug,-exec foo.pl

    DESCRIPTION

    See ext/B/README and the newer B::Concise, B::Terse.

    OPTIONS

    With option -exec, walks tree in execute order, otherwise in basic order.

    AUTHOR

    Malcolm Beattie, mbeattie@sable.ox.ac.uk
    Reini Urban, rurban@cpan.org

    LICENSE

    Copyright (c) 1996, 1997 Malcolm Beattie
    Copyright (c) 2008, 2010 Reini Urban

    This program is free software; you can redistribute it and/or modify it under the terms of either:

    a) the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version, or

    b) the "Artistic License" which comes with this kit.

    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See either the GNU General Public License or the Artistic License for more details.

    You should have received a copy of the Artistic License with this kit, in the file named "Artistic". If not, you can get one from the Perl distribution. You should also have received a copy of the GNU General Public License, in the file named "Copying". If not, you can get one from the Perl distribution or else write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
     
    B::Deparse

    NAME

    B::Deparse - Perl compiler backend to produce perl code

    SYNOPSIS

    perl -MO=Deparse[,-d][,-fFILE][,-p][,-q][,-l] [,-sLETTERS][,-xLEVEL] prog.pl

    DESCRIPTION

    B::Deparse is a backend module for the Perl compiler that generates perl source code, based on the internal compiled structure that perl itself creates after parsing a program. The output of B::Deparse won't be exactly the same as the original source, since perl doesn't keep track of comments or whitespace, and there isn't a one-to-one correspondence between perl's syntactical constructions and their compiled form, but it will often be close. When you use the -p option, the output also includes parentheses even when they are not required by precedence, which can make it easy to see if perl is parsing your expressions the way you intended.

    While B::Deparse goes to some lengths to try to figure out what your original program was doing, some parts of the language can still trip it up; it still fails even on some parts of Perl's own test suite. If you encounter a failure other than the most common ones described in the BUGS section below, you can help contribute to B::Deparse's ongoing development by submitting a bug report with a small example.

    OPTIONS

    As with all compiler backend options, these must follow directly after the '-MO=Deparse', separated by a comma but not any white space.

    • -d

      Output data values (when they appear as constants) using Data::Dumper. Without this option, B::Deparse will use some simple routines of its own for the same purpose. Currently, Data::Dumper is better for some kinds of data (such as complex structures with sharing and self-reference) while the built-in routines are better for others (such as odd floating-point values).

    • -fFILE

      Normally, B::Deparse deparses the main code of a program, and all the subs defined in the same file. To include subs defined in other files, pass the -f option with the filename. You can pass the -f option several times, to include more than one secondary file. (Most of the time you don't want to use it at all.) You can also use this option to include subs which are defined in the scope of a #line directive with two parameters.

    • -l

      Add '#line' declarations to the output based on the line and file locations of the original code.

    • -p

      Print extra parentheses. Without this option, B::Deparse includes parentheses in its output only when they are needed, based on the structure of your program. With -p, it uses parentheses (almost) whenever they would be legal. This can be useful if you are used to LISP, or if you want to see how perl parses your input. If you say

          if ($var & 0x7f == 65) {print "Gimme an A!"}
          print ($which ? $a : $b), "\n";
          $name = $ENV{USER} or "Bob";

      B::Deparse,-p will print

          if (($var & 0)) {
              print('Gimme an A!')
          };
          (print(($which ? $a : $b)), '???');
          (($name = $ENV{'USER'}) or '???')

      which probably isn't what you intended (the '???' is a sign that perl optimized away a constant value).

    • -P

      Disable prototype checking. With this option, all function calls are deparsed as if no prototype was defined for them. In other words,

          perl -MO=Deparse,-P -e 'sub foo (\@) { 1 } foo @x'

      will print

          sub foo (\@) {
              1;
          }
          &foo(\@x);

      making clear how the parameters are actually passed to foo.

    • -q

      Expand double-quoted strings into the corresponding combinations of concatenation, uc, ucfirst, lc, lcfirst, quotemeta, and join. For instance, print

          print "Hello, $world, @ladies, \u$gentlemen\E, \u\L$me!";

      as

          print 'Hello, ' . $world . ', ' . join($", @ladies) . ', '
              . ucfirst($gentlemen) . ', ' . ucfirst(lc $me . '!');

      Note that the expanded form represents the way perl handles such constructions internally -- this option actually turns off the reverse translation that B::Deparse usually does. On the other hand, note that $x = "$y" is not the same as $x = $y: the former makes the value of $y into a string before doing the assignment.

    • -sLETTERS

      Tweak the style of B::Deparse's output. The letters should follow directly after the 's', with no space or punctuation. The following options are available:

      • C

        Cuddle elsif, else, and continue blocks. For example, print

            if (...) {
                ...
            } else {
                ...
            }

        instead of

            if (...) {
                ...
            }
            else {
                ...
            }

        The default is not to cuddle.

      • iNUMBER

        Indent lines by multiples of NUMBER columns. The default is 4 columns.

      • T

        Use tabs for each 8 columns of indent. The default is to use only spaces. For instance, if the style options are -si4T, a line that's indented 3 times will be preceded by one tab and four spaces; if the options were -si8T, the same line would be preceded by three tabs.

      • vSTRING.

        Print STRING for the value of a constant that can't be determined because it was optimized away (mnemonic: this happens when a constant is used in void context). The end of the string is marked by a period. The string should be a valid perl expression, generally a constant. Note that unless it's a number, it probably needs to be quoted, and on a command line quotes need to be protected from the shell. Some conventional values include 0, 1, 42, '', 'foo', and 'Useless use of constant omitted' (which may need to be -sv"'Useless use of constant omitted'." or something similar depending on your shell). The default is '???'. If you're using B::Deparse on a module or other file that's require'd, you shouldn't use a value that evaluates to false, since the customary true constant at the end of a module will be in void context when the file is compiled as a main program.

    • -xLEVEL

      Expand conventional syntax constructions into equivalent ones that expose their internal operation. LEVEL should be a digit, with higher values meaning more expansion. As with -q, this actually involves turning off special cases in B::Deparse's normal operations.

      If LEVEL is at least 3, for loops will be translated into equivalent while loops with continue blocks; for instance

          for ($i = 0; $i < 10; ++$i) {
              print $i;
          }

      turns into

          $i = 0;
          while ($i < 10) {
              print $i;
          } continue {
              ++$i
          }

      Note that in a few cases this translation can't be perfectly carried back into the source code -- if the loop's initializer declares a my variable, for instance, it won't have the correct scope outside of the loop.

      If LEVEL is at least 5, use declarations will be translated into BEGIN blocks containing calls to require and import; for instance,

          use strict 'refs';

      turns into

          sub BEGIN {
              require strict;
              do {
                  'strict'->import('refs')
              };
          }

      If LEVEL is at least 7, if statements will be translated into equivalent expressions using &&, ?: and do {}; for instance

          print 'hi' if $nice;
          if ($nice) {
              print 'hi';
          }
          if ($nice) {
              print 'hi';
          } else {
              print 'bye';
          }

      turns into

          $nice and print 'hi';
          $nice and do { print 'hi' };
          $nice ? do { print 'hi' } : do { print 'bye' };

      Long sequences of elsifs will turn into nested ternary operators, which B::Deparse doesn't know how to indent nicely.
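
    The effect of an option such as -p can also be checked from inside a program, using the module interface. The sketch below deparses one anonymous sub twice, with and without -p; the sub and the variable names are made up for the example.

```perl
use strict;
use warnings;
use B::Deparse;

# Multiplication binds tighter than addition; -p makes that visible.
my $code = sub { return $_[0] + $_[1] * 2 };

my $plain  = B::Deparse->new->coderef2text($code);
my $parens = B::Deparse->new('-p')->coderef2text($code);

print "default:\n$plain\n\nwith -p:\n$parens\n";
```

    In the -p rendering the product is wrapped in its own parentheses, which shows how perl grouped the expression.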

    USING B::Deparse AS A MODULE

    Synopsis

        use B::Deparse;
        $deparse = B::Deparse->new("-p", "-sC");
        $body = $deparse->coderef2text(\&func);
        eval "sub func $body"; # the inverse operation

    Description

    B::Deparse can also be used on a sub-by-sub basis from other perl programs.

    new

        $deparse = B::Deparse->new(OPTIONS)

    Create an object to store the state of a deparsing operation and any options. The options are the same as those that can be given on the command line (see OPTIONS); options that are separated by commas after -MO=Deparse should be given as separate strings.
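
    As a small sketch of passing options as separate strings, the following is the module-level counterpart of -MO=Deparse,-sC,-si2 on the command line. The anonymous sub is invented for the example.

```perl
use strict;
use warnings;
use B::Deparse;

my $code = sub {
    if ($_[0]) { print "yes\n" } else { print "no\n" }
};

# One string per option: cuddled else (-sC) and 2-column indent (-si2).
my $deparse = B::Deparse->new('-sC', '-si2');
my $text    = $deparse->coderef2text($code);

print $text, "\n";
```

    With -sC the else appears on the same line as the closing brace of the if block.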

    ambient_pragmas

        $deparse->ambient_pragmas(strict => 'all', '$[' => $[);

    The compilation of a subroutine can be affected by a few compiler directives, pragmas. These are:

    • use strict;

    • use warnings;

    • Assigning to the special variable $[

    • use integer;

    • use bytes;

    • use utf8;

    • use re;

    Ordinarily, if you use B::Deparse on a subroutine which has been compiled in the presence of one or more of these pragmas, the output will include statements to turn on the appropriate directives. So if you then compile the code returned by coderef2text, it will behave the same way as the subroutine which you deparsed.

    However, you may know that you intend to use the results in a particular context, where some pragmas are already in scope. In this case, you use the ambient_pragmas method to describe the assumptions you wish to make.

    Not all of the options currently have any useful effect. See BUGS for more details.

    The parameters it accepts are:

    • strict

      Takes a string, possibly containing several values separated by whitespace. The special values "all" and "none" mean what you'd expect.

          $deparse->ambient_pragmas(strict => 'subs refs');

    • $[

      Takes a number, the value of the array base $[. Cannot be non-zero on Perl 5.15.3 or later.

    • bytes
    • utf8
    • integer

      If the value is true, then the appropriate pragma is assumed to be in the ambient scope, otherwise not.

    • re

      Takes a string, possibly containing a whitespace-separated list of values. The values "all" and "none" are special. It's also permissible to pass an array reference here.

          $deparse->ambient_pragmas(re => 'eval');

    • warnings

      Takes a string, possibly containing a whitespace-separated list of values. The values "all" and "none" are special, again. It's also permissible to pass an array reference here.

          $deparse->ambient_pragmas(warnings => [qw[void io]]);

      If one of the values is the string "FATAL", then all the warnings in that list will be considered fatal, just as with the warnings pragma itself. Should you need to specify that some warnings are fatal, and others are merely enabled, you can pass the warnings parameter twice:

          $deparse->ambient_pragmas(
              warnings => 'all',
              warnings => [FATAL => qw/void io/],
          );

      See perllexwarn for more information about lexical warnings.

    • hint_bits
    • warning_bits

      These two parameters are used to specify the ambient pragmas in the format used by the special variables $^H and ${^WARNING_BITS}.

      They exist principally so that you can write code like:

          { my ($hint_bits, $warning_bits);
            BEGIN {($hint_bits, $warning_bits) = ($^H, ${^WARNING_BITS})}
            $deparse->ambient_pragmas (
                hint_bits    => $hint_bits,
                warning_bits => $warning_bits,
                '$['         => 0 + $[
            ); }

      which specifies that the ambient pragmas are exactly those which are in scope at the point of calling.

    • %^H

      This parameter is used to specify the ambient pragmas which are stored in the special hash %^H.
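
    To illustrate what ambient_pragmas changes, here is a minimal sketch that deparses the same sub twice: once with the default assumptions, and once after declaring that strict is already in scope. Only strict is enabled in the example, so that it is the only pragma involved; the sub demo is an assumption of the example.

```perl
use strict;
use B::Deparse;

sub demo { my $n = shift; return $n + 1 }

# Default: no pragmas are assumed to be in scope, so the deparsed
# text switches strict on itself.
my $plain = B::Deparse->new->coderef2text(\&demo);

# Declare that strict is already in effect wherever the text is used.
my $deparse = B::Deparse->new;
$deparse->ambient_pragmas(strict => 'all');
my $assumed = $deparse->coderef2text(\&demo);

print "default:\n$plain\n\nwith ambient strict:\n$assumed\n";
```

    The second rendering is only safe to compile in a scope where strict really is enabled, since the text no longer requests it.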

    coderef2text

        $body = $deparse->coderef2text(\&func)
        $body = $deparse->coderef2text(sub ($$) { ... })

    Return source code for the body of a subroutine (a block, optionally preceded by a prototype in parens), given a reference to the sub. Because a subroutine can have no names, or more than one name, this method doesn't return a complete subroutine definition -- if you want to eval the result, you should prepend "sub subname ", or "sub " for an anonymous function constructor. Unless the sub was defined in the main:: package, the code will include a package declaration.
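
    The round trip described above can be sketched as follows: deparse a named sub, prepend "sub " to get an anonymous constructor, and compile the result. The sub add_one and the variable names are assumptions for the example.

```perl
use strict;
use warnings;
use B::Deparse;

sub add_one { my $n = shift; return $n + 1 }

my $deparse = B::Deparse->new;
my $body    = $deparse->coderef2text(\&add_one);   # a block, no name

# "sub " plus the block yields an anonymous sub when evaluated.
my $copy = eval "sub $body" or die "could not recompile: $@";

print $copy->(41), "\n";
```

    The recompiled sub is a new code reference that behaves like the original for this simple case; subs that close over lexical variables need more care, as noted under BUGS.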

    BUGS

    • The only pragmas to be completely supported are: use warnings, use strict, use bytes, use integer and use feature. ($[, which behaves like a pragma, is also supported.)

      Excepting those listed above, we're currently unable to guarantee that B::Deparse will produce a pragma at the correct point in the program. (Specifically, pragmas at the beginning of a block often appear right before the start of the block instead.) Since the effects of pragmas are often lexically scoped, this can mean that the pragma holds sway over a different portion of the program than in the input file.

    • In fact, the above is a specific instance of a more general problem: we can't guarantee to produce BEGIN blocks or use declarations in exactly the right place. So if you use a module which affects compilation (such as by over-riding keywords, overloading constants or whatever) then the output code might not work as intended.

      This is the most serious outstanding problem, and will require some help from the Perl core to fix.

    • Some constants don't print correctly either with or without -d. For instance, neither B::Deparse nor Data::Dumper knows how to print dual-valued scalars correctly, as in:

          use constant E2BIG => ($!=7); $y = E2BIG; print $y, 0+$y;
          use constant H => { "#" => 1 }; H->{"#"};

    • An input file that uses source filtering probably won't be deparsed into runnable code, because it will still include the use declaration for the source filtering module, even though the code that is produced is already ordinary Perl which shouldn't be filtered again.

    • Optimised away statements are rendered as '???'. This includes statements that have a compile-time side-effect, such as the obscure

          my $x if 0;

      which is not, consequently, deparsed correctly.

          foreach my $i (@_) { 0 }
          =>
          foreach my $i (@_) { '???' }

    • Lexical (my) variables declared in scopes external to a subroutine appear in coderef2text output text as package variables. This is a tricky problem, as perl has no native facility for referring to a lexical variable defined within a different scope, although PadWalker is a good start.

    • There are probably many more bugs on non-ASCII platforms (EBCDIC).

    • Lexical my subroutines are not deparsed properly at the moment. They are emitted as pure declarations, without their body; and the declaration may appear in the wrong place (before any lexicals the body closes over, or before the use feature declaration that permits use of this feature).

      We expect to resolve this before the lexical-subroutine feature is no longer considered experimental.

    • Lexical state subroutines are not deparsed at all at the moment.

      We expect to resolve this before the lexical-subroutine feature is no longer considered experimental.

    AUTHOR

    Stephen McCamant <smcc@CSUA.Berkeley.EDU>, based on an earlier version by Malcolm Beattie <mbeattie@sable.ox.ac.uk>, with contributions from Gisle Aas, James Duncan, Albert Dvornik, Robin Houston, Dave Mitchell, Hugo van der Sanden, Gurusamy Sarathy, Nick Ing-Simmons, and Rafael Garcia-Suarez.

     
    B::Lint

    NAME

    B::Lint - Perl lint

    SYNOPSIS

    perl -MO=Lint[,OPTIONS] foo.pl

    DESCRIPTION

    The B::Lint module is equivalent to an extended version of the -w option of perl. It is named after the program lint which carries out a similar process for C programs.

    OPTIONS AND LINT CHECKS

    Option words are separated by commas (not whitespace) and follow the usual conventions of compiler backend options. Following any options (indicated by a leading -) come lint check arguments. Each such argument (apart from the special all and none options) is a word representing one possible lint check (turning on that check) or is no-foo (turning off that check). Before processing the check arguments, a standard list of checks is turned on. Later options override earlier ones. Available options are:

    • magic-diamond

      Produces a warning whenever the magic <> readline is used. Internally it uses perl's two-argument open which itself treats filenames with special characters specially. This could allow interestingly named files to have unexpected effects when reading.

          % touch 'rm *|'
          % perl -pe 1

      The above creates a file named rm *|. When perl opens it with <> it actually executes the shell program rm * . This makes <> dangerous to use carelessly.

    • context

      Produces a warning whenever an array is used in an implicit scalar context. For example, both of the lines

          $foo = length(@bar);
          $foo = @bar;

      will elicit a warning. Using an explicit scalar() silences the warning. For example,

          $foo = scalar(@bar);

    • implicit-read and implicit-write

      These options produce a warning whenever an operation implicitly reads or (respectively) writes to one of Perl's special variables. For example, implicit-read will warn about this:

          /foo/;

      and implicit-write will warn about this:

          s/foo/bar/;

      Both implicit-read and implicit-write warn about this:

          for (@a) { ... }

    • bare-subs

      This option warns whenever a bareword is implicitly quoted, but is also the name of a subroutine in the current package. Typical mistakes that it will trap are:

          use constant foo => 'bar';
          @a = ( foo => 1 );
          $b{foo} = 2;

      Neither of these will do what a naive user would expect.

    • dollar-underscore

      This option warns whenever $_ is used either explicitly anywhere or as the implicit argument of a print statement.

    • private-names

      This option warns on each use of any variable, subroutine or method name that lives in a non-current package but begins with an underscore ("_"). Warnings aren't issued for the special case of the single character name "_" by itself (e.g. $_ and @_ ).

    • undefined-subs

      This option warns whenever an undefined subroutine is invoked. This option will only catch explicitly invoked subroutines such as foo() and not indirect invocations such as &$subref() or $obj->meth() . Note that some programs or modules delay definition of subs until runtime by means of the AUTOLOAD mechanism.

    • regexp-variables

      This option warns whenever one of the regexp variables $` , $& or $' is used. Any occurrence of any of these variables in your program can slow your whole program down. See perlre for details.

    • all

      Turn all warnings on.

    • none

      Turn all warnings off.

    NON LINT-CHECK OPTIONS

    • -u Package

      Normally, Lint only checks the main code of the program together with all subs defined in package main. The -u option lets you include other package names whose subs are then checked by Lint.

    EXTENDING LINT

    Lint can be extended with plugins. Lint uses Module::Pluggable to find available plugins. Plugins are expected, but not required, to inform Lint of which checks they are adding.

    The B::Lint->register_plugin( MyPlugin => \@new_checks ) method adds the list of @new_checks to the list of valid checks. If your module wasn't loaded by Module::Pluggable then your class name is added to the list of plugins.

    You must create a match( \%checks ) method in your plugin class or one of its parents. It will be called on every op as a regular method call with a hash ref of checks as its parameter.

    The class methods B::Lint->file and B::Lint->line return the current filename and line number.

        package Sample;
        use B::Lint;
        B::Lint->register_plugin( Sample => [ 'good_taste' ] );

        sub match {
            # called as a method on every op, with the hash ref of checks
            my ( $op, $checks_href ) = @_;
            if ( $checks_href->{good_taste} ) {
                ...
            }
        }

    TODO

    • while(<FH>) stomps $_
    • strict oo
    • unchecked system calls
    • more tests, validate against older perls

    BUGS

    This is only a very preliminary version.

    AUTHOR

    Malcolm Beattie, mbeattie@sable.ox.ac.uk.

    ACKNOWLEDGEMENTS

    Sebastien Aperghis-Tramoni - bug fixes

     
    B::Showlex

    NAME

    B::Showlex - Show lexical variables used in functions or files

    SYNOPSIS

        perl -MO=Showlex[,-OPTIONS][,SUBROUTINE] foo.pl

    DESCRIPTION

    When a comma-separated list of subroutine names is given as options, Showlex prints the lexical variables used in those subroutines. Otherwise, it prints the file-scope lexicals in the file.

    EXAMPLES

    Traditional form:

        $ perl -MO=Showlex -e 'my ($i,$j,$k)=(1,"foo")'
        Pad of lexical names for comppadlist has 4 entries
        0: SPECIAL #1 &PL_sv_undef
        1: PVNV (0x9db0fb0) $i
        2: PVNV (0x9db0f38) $j
        3: PVNV (0x9db0f50) $k
        Pad of lexical values for comppadlist has 5 entries
        0: SPECIAL #1 &PL_sv_undef
        1: NULL (0x9da4234)
        2: NULL (0x9db0f2c)
        3: NULL (0x9db0f44)
        4: NULL (0x9da4264)
        -e syntax OK

    New-style form:

        $ perl -MO=Showlex,-newlex -e 'my ($i,$j,$k)=(1,"foo")'
        main Pad has 4 entries
        0: SPECIAL #1 &PL_sv_undef
        1: PVNV (0xa0c4fb8) "$i" = NULL (0xa0b8234)
        2: PVNV (0xa0c4f40) "$j" = NULL (0xa0c4f34)
        3: PVNV (0xa0c4f58) "$k" = NULL (0xa0c4f4c)
        -e syntax OK

    New form, no specials, outside O framework:

        $ perl -MB::Showlex -e \
            'my ($i,$j,$k)=(1,"foo"); B::Showlex::compile(-newlex,-nosp)->()'
        main Pad has 4 entries
        1: PVNV (0x998ffb0) "$i" = IV (0x9983234) 1
        2: PVNV (0x998ff68) "$j" = PV (0x998ff5c) "foo"
        3: PVNV (0x998ff80) "$k" = NULL (0x998ff74)

    Note that this example shows the values of the lexicals, whereas the other examples did not (as they're compile-time only).

    OPTIONS

    The -newlex option produces a more readable name => value format, and is shown in the second example above.

    The -nosp option eliminates reporting of SPECIALs, such as 0: SPECIAL #1 &PL_sv_undef above. Reporting of SPECIALs can sometimes overwhelm your declared lexicals.

    SEE ALSO

    B::Showlex can also be used outside of the O framework, as in the third example. See B::Concise for a fuller explanation of reasons.

    TODO

    Some of the reported info, such as hex addresses, is not particularly valuable. Other information would be more useful for the typical programmer, such as line numbers, pad-slot reuses, etc. Given this, -newlex isn't a particularly good flag name.

    AUTHOR

    Malcolm Beattie, mbeattie@sable.ox.ac.uk

     
    B::Terse

    NAME

    B::Terse - Walk Perl syntax tree, printing terse info about ops

    SYNOPSIS

        perl -MO=Terse[,OPTIONS] foo.pl

    DESCRIPTION

    This module prints the contents of the parse tree, but without as much information as B::Debug. For comparison, print "Hello, world." produced 96 lines of output from B::Debug, but only 6 from B::Terse.

    This module is useful for people who are writing their own back end, or who are learning about the Perl internals. It's not useful to the average programmer.

    This version of B::Terse is really just a wrapper that calls B::Concise with the -terse option. It is provided for compatibility with old scripts (and habits) but using B::Concise directly is now recommended instead.

    For compatibility with the old B::Terse, this module also adds a method named terse to B::OP and B::SV objects. The B::SV method is largely compatible with the old one, though authors of new software might be advised to choose a more user-friendly output format. The B::OP terse method, however, doesn't work well. Since B::Terse was first written, much more information in OPs has migrated to the scratchpad datastructure, but the terse interface doesn't have any way of getting to the correct pad. As a kludge, the new version will always use the pad for the main program, but for OPs in subroutines this will give the wrong answer or crash.

    AUTHOR

    The original version of B::Terse was written by Malcolm Beattie, <mbeattie@sable.ox.ac.uk>. This wrapper was written by Stephen McCamant, <smcc@MIT.EDU>.

     
    B::Xref

    NAME

    B::Xref - Generates cross reference reports for Perl programs

    SYNOPSIS

    perl -MO=Xref[,OPTIONS] foo.pl

    DESCRIPTION

    The B::Xref module is used to generate a cross reference listing of all definitions and uses of variables, subroutines and formats in a Perl program. It is implemented as a backend for the Perl compiler.

    The report generated is in the following format:

        File filename1
          Subroutine subname1
            Package package1
              object1        line numbers
              object2        line numbers
              ...
            Package package2
            ...

    Each File section reports on a single file. Each Subroutine section reports on a single subroutine apart from the special cases "(definitions)" and "(main)". These report, respectively, on subroutine definitions found by the initial symbol table walk and on the main part of the program or module external to all subroutines.

    The report is then grouped by the Package of each variable, subroutine or format, with the special case "(lexicals)" meaning lexical variables. Each object name (implicitly qualified by its containing Package) includes its type character(s) at the beginning where possible. Lexical variables are easier to track and even include dereferencing information where possible.

    The line numbers are a comma separated list of line numbers (some preceded by code letters) where that object is used in some way. Simple uses aren't preceded by a code letter. Introductions (such as where a lexical is first defined with my) are indicated with the letter "i". Subroutine and method calls are indicated by the character "&". Subroutine definitions are indicated by "s" and format definitions by "f".

    For instance, here's part of the report from the pod2man program that comes with Perl:

        Subroutine clear_noremap
          Package (lexical)
            $ready_to_print   i1069, 1079
          Package main
            $&                1086
            $.                1086
            $0                1086
            $1                1087
            $2                1085, 1085
            $3                1085, 1085
            $ARGV             1086
            %HTML_Escapes     1085, 1085

    This shows the variables used in the subroutine clear_noremap. The variable $ready_to_print is a my() (lexical) variable, introduced (first declared with my()) on line 1069, and used on line 1079. The variable $& from the main package is used on 1086, and so on.

    A line number may be prefixed by a single code character:

    • i

      Lexical variable introduced (declared with my()) for the first time.

    • &

      Subroutine or method call.

    • s

      Subroutine defined.

    • f

      Format defined.
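
    The line-number codes can be decoded mechanically. The sketch below is illustrative only: decode_lines and %kind are hypothetical names, and the mapping uses the codes i, &, s and f given in the description of the report format.

```perl
use strict;
use warnings;

# Map each prefix to a description; the empty prefix means a simple use.
my %kind = (
    ''  => 'use',
    'i' => 'lexical introduced',
    '&' => 'subroutine or method call',
    's' => 'subroutine defined',
    'f' => 'format defined',
);

# Split a "line numbers" field such as "i1069, 1079" into [kind, line] pairs.
sub decode_lines {
    my ($field) = @_;
    my @out;
    for my $entry (split /\s*,\s*/, $field) {
        # Leading non-digits are the code, the rest is the line number.
        my ($code, $line) = $entry =~ /^(\D*)(\d+)$/ or next;
        push @out, [ $kind{$code} // "unknown ($code)", $line ];
    }
    return @out;
}

printf "%-26s line %d\n", @$_ for decode_lines('i1069, 1079');
```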

    The most useful option the cross referencer has is to save the report to a separate file. For instance, to save the report on myperlprogram to the file report:

        $ perl -MO=Xref,-oreport myperlprogram

    OPTIONS

    Option words are separated by commas (not whitespace) and follow the usual conventions of compiler backend options.

    • -oFILENAME

      Directs output to FILENAME instead of standard output.

    • -r

      Raw output. Instead of producing a human-readable report, outputs a line in machine-readable form for each definition/use of a variable/sub/format.

    • -d

      Don't output the "(definitions)" sections.

    • -D[tO]

      (Internal) debug options, probably only useful if -r included. The t option prints the object on the top of the stack as it's being tracked. The O option prints each operator as it's being processed in the execution order of the program.
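
    The options can also be combined when the backend is driven from another Perl program. This is a rough sketch, assuming a Unix-like system where a list-form pipe open is available; the temporary script and its contents are made up for illustration, and the raw line layout is simply printed rather than parsed because it is not specified here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny throwaway program to analyse (contents are illustrative).
my ($fh, $script) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} "my \$count = 1;\nsub bump { \$count++ }\nbump();\n";
close $fh or die "Cannot close $script: $!";

# Run the backend with the perl that is running this script ($^X).
# -r asks for raw output, -d drops the "(definitions)" sections.
open my $pipe, '-|', $^X, '-MO=Xref,-r,-d', $script
    or die "Cannot run $^X: $!";
my @report = <$pipe>;
close $pipe;

print @report;
```

    The "syntax OK" message that the compiler prints goes to standard error, so it does not end up in @report.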

    BUGS

    Non-lexical variables are quite difficult to track through a program. Sometimes the type of a non-lexical variable's use is impossible to determine. Introductions of non-lexical non-scalars don't seem to be reported properly.

    AUTHOR

    Malcolm Beattie, mbeattie@sable.ox.ac.uk.

     
    B::Lint::Debug

    NAME

    B::Lint::Debug - Adds debugging stringification to B::

    DESCRIPTION

    This module injects stringification to a B::OP*/B::SPECIAL. This should not be loaded unless you're debugging.

    Attribute::Handlers

    NAME

    Attribute::Handlers - Simpler definition of attribute handlers

    VERSION

    This document describes version 0.93 of Attribute::Handlers, released July 20, 2011.

    SYNOPSIS

        package MyClass;
        require 5.006;
        use Attribute::Handlers;
        no warnings 'redefine';

        sub Good : ATTR(SCALAR) {
            my ($package, $symbol, $referent, $attr, $data) = @_;

            # Invoked for any scalar variable with a :Good attribute,
            # provided the variable was declared in MyClass (or
            # a derived class) or typed to MyClass.

            # Do whatever to $referent here (executed in CHECK phase).
            ...
        }

        sub Bad : ATTR(SCALAR) {
            # Invoked for any scalar variable with a :Bad attribute,
            # provided the variable was declared in MyClass (or
            # a derived class) or typed to MyClass.
            ...
        }

        sub Good : ATTR(ARRAY) {
            # Invoked for any array variable with a :Good attribute,
            # provided the variable was declared in MyClass (or
            # a derived class) or typed to MyClass.
            ...
        }

        sub Good : ATTR(HASH) {
            # Invoked for any hash variable with a :Good attribute,
            # provided the variable was declared in MyClass (or
            # a derived class) or typed to MyClass.
            ...
        }

        sub Ugly : ATTR(CODE) {
            # Invoked for any subroutine declared in MyClass (or a
            # derived class) with an :Ugly attribute.
            ...
        }

        sub Omni : ATTR {
            # Invoked for any scalar, array, hash, or subroutine
            # with an :Omni attribute, provided the variable or
            # subroutine was declared in MyClass (or a derived class)
            # or the variable was typed to MyClass.
            # Use ref($_[2]) to determine what kind of referent it was.
            ...
        }

        use Attribute::Handlers autotie => { Cycle => Tie::Cycle };

        my $next : Cycle(['A'..'Z']);

    DESCRIPTION

    This module, when inherited by a package, allows that package's class to define attribute handler subroutines for specific attributes. Variables and subroutines subsequently defined in that package, or in packages derived from that package, may be given attributes with the same names as the attribute handler subroutines, which will then be called in one of the compilation phases (i.e. in a BEGIN, CHECK, INIT, or END block). (UNITCHECK blocks don't correspond to a global compilation phase, so they can't be specified here.)

    To create a handler, define it as a subroutine with the same name as the desired attribute, and declare the subroutine itself with the attribute :ATTR. For example:

        package LoudDecl;
        use Attribute::Handlers;

        sub Loud :ATTR {
            my ($package, $symbol, $referent, $attr, $data, $phase,
                $filename, $linenum) = @_;
            print STDERR
                ref($referent), " ",
                *{$symbol}{NAME}, " ",
                "($referent) ", "was just declared ",
                "and ascribed the ${attr} attribute ",
                "with data ($data)\n",
                "in phase $phase\n",
                "in file $filename at line $linenum\n";
        }

    This creates a handler for the attribute :Loud in the class LoudDecl. Thereafter, any subroutine declared with a :Loud attribute in the class LoudDecl:

        package LoudDecl;

        sub foo: Loud {...}

    causes the above handler to be invoked, and passed:

    • [0]

      the name of the package into which it was declared;

    • [1]

      a reference to the symbol table entry (typeglob) containing the subroutine;

    • [2]

      a reference to the subroutine;

    • [3]

      the name of the attribute;

    • [4]

      any data associated with that attribute;

    • [5]

      the name of the phase in which the handler is being invoked;

    • [6]

      the filename in which the handler is being invoked;

    • [7]

      the line number in this file.
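
    As a rough illustration of this argument list, the sketch below records what a handler receives for one attributed subroutine. The package LoudDemo and the @calls array are hypothetical names chosen for the example, and the code assumes it is run as an ordinary script so that the default CHECK phase applies.

```perl
package LoudDemo;    # hypothetical package name for this sketch
use strict;
use warnings;
use Attribute::Handlers;

our @calls;          # records what the handler was given

sub Loud :ATTR(CODE) {
    my ($package, $symbol, $referent, $attr, $data, $phase,
        $filename, $linenum) = @_;
    push @calls, {
        package => $package,
        name    => *{$symbol}{NAME},   # $symbol is a typeglob reference
        type    => ref $referent,      # 'CODE' for a subroutine
        attr    => $attr,
        phase   => $phase,
    };
}

sub foo :Loud { return 'foo' }

# By the time this run-time code executes, the handler has already run.
printf "%s::%s (%s) got :%s in %s\n",
    @{ $calls[0] }{qw(package name type attr phase)};
```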

    Likewise, declaring any variables with the :Loud attribute within the package:

        package LoudDecl;

        my $foo :Loud;
        my @foo :Loud;
        my %foo :Loud;

    will cause the handler to be called with a similar argument list (except, of course, that $_[2] will be a reference to the variable).

    The package name argument will typically be the name of the class into which the subroutine was declared, but it may also be the name of a derived class (since handlers are inherited).

    If a lexical variable is given an attribute, there is no symbol table to which it belongs, so the symbol table argument ($_[1]) is set to the string 'LEXICAL' in that case. Likewise, ascribing an attribute to an anonymous subroutine results in a symbol table argument of 'ANON'.

    The data argument passes in the value (if any) associated with the attribute. For example, if &foo had been declared:

        sub foo :Loud("turn it up to 11, man!") {...}

    then a reference to an array containing the string "turn it up to 11, man!" would be passed as the data argument ($_[4]).

    Attribute::Handlers makes strenuous efforts to convert the data argument ($_[4]) to a usable form before passing it to the handler (but see Non-interpretive attribute handlers). If those efforts succeed, the interpreted data is passed in an array reference; if they fail, the raw data is passed as a string. For example, all of these:

        sub foo :Loud(till=>ears=>are=>bleeding) {...}
        sub foo :Loud(qw/till ears are bleeding/) {...}
        sub foo :Loud(qw/till, ears, are, bleeding/) {...}
        sub foo :Loud(till,ears,are,bleeding) {...}

    cause it to pass ['till','ears','are','bleeding'] as the handler's data argument. While:

        sub foo :Loud(['till','ears','are','bleeding']) {...}

    causes it to pass [ ['till','ears','are','bleeding'] ]; the array reference specified in the data is passed inside the standard array reference that indicates successful interpretation.

    However, if the data can't be parsed as valid Perl, then it is passed as an uninterpreted string. For example:

        sub foo :Loud(my,ears,are,bleeding) {...}
        sub foo :Loud(qw/my ears are bleeding) {...}

    cause the strings 'my,ears,are,bleeding' and 'qw/my ears are bleeding' respectively to be passed as the data argument.

    If no value is associated with the attribute, undef is passed.
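
    The three outcomes described above (interpreted data, raw string, and undef) can be observed side by side. This is a small sketch under the assumption that it runs as an ordinary script; the package DataDemo, the %data_for hash and the three subroutine names are hypothetical.

```perl
package DataDemo;    # hypothetical package name
use strict;
use warnings;
use Attribute::Handlers;

our %data_for;       # sub name => data argument the handler received

sub Loud :ATTR(CODE) {
    my ($package, $symbol, $referent, $attr, $data) = @_;
    $data_for{ *{$symbol}{NAME} } = $data;
}

sub parsed   :Loud(till, ears, are, bleeding) { }   # valid Perl list
sub unparsed :Loud(qw/my ears are bleeding)   { }   # unterminated qw//
sub bare     :Loud                            { }   # no data at all

for my $name (qw(parsed unparsed bare)) {
    my $data  = $data_for{$name};
    my $shown = !defined $data       ? 'undef'
              : ref $data eq 'ARRAY' ? "[@$data]"
              :                        "'$data'";
    print "$name: $shown\n";
}
```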

    Typed lexicals

    Regardless of the package in which it is declared, if a lexical variable is ascribed an attribute, the handler that is invoked is the one belonging to the package to which it is typed. For example, the following declarations:

        package OtherClass;

        my LoudDecl $loudobj : Loud;
        my LoudDecl @loudobjs : Loud;
        my LoudDecl %loudobjex : Loud;

    cause the LoudDecl::Loud handler to be invoked (even if OtherClass also defines a handler for :Loud attributes).

    Type-specific attribute handlers

    If an attribute handler is declared and the :ATTR specifier is given the name of a built-in type (SCALAR, ARRAY, HASH, or CODE), the handler is only applied to declarations of that type. For example, the following definition:

        package LoudDecl;

        sub RealLoud :ATTR(SCALAR) { print "Yeeeeow!" }

    creates an attribute handler that applies only to scalars:

        package Painful;
        use base LoudDecl;

        my $metal : RealLoud;          # invokes &LoudDecl::RealLoud
        my @metal : RealLoud;          # error: unknown attribute
        my %metal : RealLoud;          # error: unknown attribute
        sub metal : RealLoud {...}     # error: unknown attribute

    You can, of course, declare separate handlers for these types as well (but you'll need to specify no warnings 'redefine' to do it quietly):

        package LoudDecl;
        use Attribute::Handlers;
        use Carp;                      # provides croak
        no warnings 'redefine';

        sub RealLoud :ATTR(SCALAR) { print "Yeeeeow!" }
        sub RealLoud :ATTR(ARRAY) { print "Urrrrrrrrrr!" }
        sub RealLoud :ATTR(HASH) { print "Arrrrrgggghhhhhh!" }
        sub RealLoud :ATTR(CODE) { croak "Real loud sub torpedoed" }

    You can also explicitly indicate that a single handler is meant to be used for all types of referents like so:

        package LoudDecl;
        use Attribute::Handlers;

        sub SeriousLoud :ATTR(ANY) { warn "Hearing loss imminent" }

    (I.e. ATTR(ANY) is a synonym for :ATTR.)

    Non-interpretive attribute handlers

    Occasionally the strenuous efforts Attribute::Handlers makes to convert the data argument ($_[4] ) to a usable form before passing it to the handler get in the way.

    You can turn off that eagerness-to-help by declaring an attribute handler with the keyword RAWDATA. For example:

        sub Raw : ATTR(RAWDATA) {...}
        sub Nekkid : ATTR(SCALAR,RAWDATA) {...}
        sub Au::Naturale : ATTR(RAWDATA,ANY) {...}

    Attribute::Handlers then makes absolutely no attempt to interpret the data and simply passes it to the handler as a string:

        my $power : Raw(1..100); # handler receives "1..100"

    Phase-specific attribute handlers

    By default, attribute handlers are called at the end of the compilation phase (in a CHECK block). This seems to be optimal in most cases because most things that can be defined are defined by that point but nothing has been executed.

    However, it is possible to set up attribute handlers that are called at other points in the program's compilation or execution, by explicitly stating the phase (or phases) in which you wish the attribute handler to be called. For example:

        sub Early :ATTR(SCALAR,BEGIN) {...}
        sub Normal :ATTR(SCALAR,CHECK) {...}
        sub Late :ATTR(SCALAR,INIT) {...}
        sub Final :ATTR(SCALAR,END) {...}
        sub Bookends :ATTR(SCALAR,BEGIN,END) {...}

    As the last example indicates, a handler may be set up to be (re)called in two or more phases. The phase name is passed to the handler as argument $_[5].

    Note that attribute handlers that are scheduled for the BEGIN phase are handled as soon as the attribute is detected (i.e. before any subsequently defined BEGIN blocks are executed).

    Attributes as tie interfaces

    Attributes make an excellent and intuitive interface through which to tie variables. For example:

        use Attribute::Handlers;
        use Tie::Cycle;

        sub UNIVERSAL::Cycle : ATTR(SCALAR) {
            my ($package, $symbol, $referent, $attr, $data, $phase) = @_;
            $data = [ $data ] unless ref $data eq 'ARRAY';
            tie $$referent, 'Tie::Cycle', $data;
        }

        # and thereafter...

        package main;

        my $next : Cycle('A'..'Z'); # $next is now a tied variable

        while (<>) {
            print $next;
        }

    Note that, because the Cycle attribute receives its arguments in the $data variable, if the attribute is given a list of arguments, $data will consist of a single array reference; otherwise, it will consist of the single argument directly. Since Tie::Cycle requires its cycling values to be passed as an array reference, this means that we need to wrap non-array-reference arguments in an array constructor:

        $data = [ $data ] unless ref $data eq 'ARRAY';

    Typically, however, things are the other way around: the tieable class expects its arguments as a flattened list, so the attribute looks like:

        sub UNIVERSAL::Cycle : ATTR(SCALAR) {
            my ($package, $symbol, $referent, $attr, $data, $phase) = @_;
            my @data = ref $data eq 'ARRAY' ? @$data : $data;
            tie $$referent, 'Tie::Whatever', @data;
        }
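
    Because Tie::Cycle is not always installed, the flattened-list pattern can be tried with a small tie class defined inline. The following is a sketch only: My::Counter and the Counter attribute are hypothetical, and the handler is installed in package main rather than UNIVERSAL to keep the example contained.

```perl
use strict;
use warnings;
use Attribute::Handlers;

# Hypothetical tie class: a scalar that counts up from a start value
# each time it is read.
package My::Counter {
    sub TIESCALAR {
        my ($class, $start) = @_;
        return bless { n => $start // 0 }, $class;
    }
    sub FETCH { my ($self) = @_; return $self->{n}++ }
    sub STORE { my ($self, $value) = @_; $self->{n} = $value }
}

package main;

sub Counter :ATTR(SCALAR) {
    my ($package, $symbol, $referent, $attr, $data) = @_;
    # Flatten interpreted data (an array ref) into a plain argument list.
    my @args = ref $data eq 'ARRAY' ? @$data
             : defined $data        ? ($data)
             :                        ();
    tie $$referent, 'My::Counter', @args;
}

my $next :Counter(5);        # $next is now a tied variable

my @seen;
push @seen, $next for 1 .. 3;    # each read calls FETCH
print "@seen\n";
```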

    This software pattern is so widely applicable that Attribute::Handlers provides a way to automate it: specifying 'autotie' in the use Attribute::Handlers statement. So the cycling example could also be written:

        use Attribute::Handlers autotie => { Cycle => 'Tie::Cycle' };

        # and thereafter...

        package main;

        my $next : Cycle(['A'..'Z']); # $next is now a tied variable

        while (<>) {
            print $next;
        }

    Note that we now have to pass the cycling values as an array reference, since the autotie mechanism passes tie a list of arguments as a list (as in the Tie::Whatever example), not as an array reference (as in the original Tie::Cycle example at the start of this section).

    The argument after 'autotie' is a reference to a hash in which each key is the name of an attribute to be created, and each value is the class to which variables ascribed that attribute should be tied.

    Note that there is no longer any need to import the Tie::Cycle module -- Attribute::Handlers takes care of that automagically. You can even pass arguments to the module's import subroutine, by appending them to the class name. For example:

        use Attribute::Handlers
            autotie => { Dir => 'Tie::Dir qw(DIR_UNLINK)' };

    If the attribute name is unqualified, the attribute is installed in the current package. Otherwise it is installed in the qualifier's package:

        package Here;
        use Attribute::Handlers autotie => {
            Other::Good => Tie::SecureHash,     # tie attr installed in Other::
            Bad => Tie::Taxes,                  # tie attr installed in Here::
            UNIVERSAL::Ugly => Software::Patent # tie attr installed everywhere
        };

    Autoties are most commonly used in the module to which they actually tie, and need to export their attributes to any module that calls them. To facilitate this, Attribute::Handlers recognizes a special "pseudo-class" -- __CALLER__, which may be specified as the qualifier of an attribute:

        package Tie::Me::Kangaroo::Down::Sport;

        use Attribute::Handlers autotie =>
            { '__CALLER__::Roo' => __PACKAGE__ };

    This causes Attribute::Handlers to define the Roo attribute in the package that imports the Tie::Me::Kangaroo::Down::Sport module.

    Note that it is important to quote the __CALLER__::Roo identifier because a bug in perl 5.8 will refuse to parse it and cause an unknown error.

    Passing the tied object to tie

    Occasionally it is important to pass a reference to the object being tied to the TIESCALAR, TIEHASH, etc. that ties it.

    The autotie mechanism supports this too. The following code:

        use Attribute::Handlers autotie => { Selfish => Tie::Selfish };
        my $var : Selfish(@args);

    has the same effect as:

        tie my $var, 'Tie::Selfish', @args;

    But when "autotieref" is used instead of "autotie":

        use Attribute::Handlers autotieref => { Selfish => Tie::Selfish };
        my $var : Selfish(@args);

    the effect is to pass the tie call an extra reference to the variable being tied:

        tie my $var, 'Tie::Selfish', \$var, @args;

    EXAMPLES

    If the class shown in SYNOPSIS were placed in the MyClass.pm module, then the following code:

        package main;
        use MyClass;

        my MyClass $slr :Good :Bad(1**1-1) :Omni(-vorous);

        package SomeOtherClass;
        use base MyClass;

        sub tent { 'acle' }

        sub fn :Ugly(sister) :Omni('po',tent()) {...}
        my @arr :Good :Omni(s/cie/nt/);
        my %hsh :Good(q/bye) :Omni(q/bus/);

    would cause the following handlers to be invoked:

        # my MyClass $slr :Good :Bad(1**1-1) :Omni(-vorous);

        MyClass::Good:ATTR(SCALAR)( 'MyClass',          # class
                                    'LEXICAL',          # no typeglob
                                    \$slr,              # referent
                                    'Good',             # attr name
                                    undef,              # no attr data
                                    'CHECK',            # compiler phase
                                  );

        MyClass::Bad:ATTR(SCALAR)( 'MyClass',           # class
                                   'LEXICAL',           # no typeglob
                                   \$slr,               # referent
                                   'Bad',               # attr name
                                   0,                   # eval'd attr data
                                   'CHECK',             # compiler phase
                                 );

        MyClass::Omni:ATTR(SCALAR)( 'MyClass',          # class
                                    'LEXICAL',          # no typeglob
                                    \$slr,              # referent
                                    'Omni',             # attr name
                                    '-vorous',          # eval'd attr data
                                    'CHECK',            # compiler phase
                                  );

        # sub fn :Ugly(sister) :Omni('po',tent()) {...}

        MyClass::Ugly:ATTR(CODE)( 'SomeOtherClass',     # class
                                  \*SomeOtherClass::fn, # typeglob
                                  \&SomeOtherClass::fn, # referent
                                  'Ugly',               # attr name
                                  'sister',             # eval'd attr data
                                  'CHECK',              # compiler phase
                                );

        MyClass::Omni:ATTR(CODE)( 'SomeOtherClass',     # class
                                  \*SomeOtherClass::fn, # typeglob
                                  \&SomeOtherClass::fn, # referent
                                  'Omni',               # attr name
                                  ['po','acle'],        # eval'd attr data
                                  'CHECK',              # compiler phase
                                );

        # my @arr :Good :Omni(s/cie/nt/);

        MyClass::Good:ATTR(ARRAY)( 'SomeOtherClass',    # class
                                   'LEXICAL',           # no typeglob
                                   \@arr,               # referent
                                   'Good',              # attr name
                                   undef,               # no attr data
                                   'CHECK',             # compiler phase
                                 );

        MyClass::Omni:ATTR(ARRAY)( 'SomeOtherClass',    # class
                                   'LEXICAL',           # no typeglob
                                   \@arr,               # referent
                                   'Omni',              # attr name
                                   "",                  # eval'd attr data
                                   'CHECK',             # compiler phase
                                 );

        # my %hsh :Good(q/bye) :Omni(q/bus/);

        MyClass::Good:ATTR(HASH)( 'SomeOtherClass',     # class
                                  'LEXICAL',            # no typeglob
                                  \%hsh,                # referent
                                  'Good',               # attr name
                                  'q/bye',              # raw attr data
                                  'CHECK',              # compiler phase
                                );

        MyClass::Omni:ATTR(HASH)( 'SomeOtherClass',     # class
                                  'LEXICAL',            # no typeglob
                                  \%hsh,                # referent
                                  'Omni',               # attr name
                                  'bus',                # eval'd attr data
                                  'CHECK',              # compiler phase
                                );

    Installing handlers into UNIVERSAL makes them...err...universal. For example:

        package Descriptions;
        use Attribute::Handlers;

        my %name;
        sub name { return $name{$_[2]}||*{$_[1]}{NAME} }

        sub UNIVERSAL::Name :ATTR {
            $name{$_[2]} = $_[4];
        }

        sub UNIVERSAL::Purpose :ATTR {
            print STDERR "Purpose of ", &name, " is $_[4]\n";
        }

        sub UNIVERSAL::Unit :ATTR {
            print STDERR &name, " measured in $_[4]\n";
        }

    This lets you write:

        use Descriptions;

        my $capacity : Name(capacity)
                     : Purpose(to store max storage capacity for files)
                     : Unit(Gb);

        package Other;

        sub foo : Purpose(to foo all data before barring it) { }

        # etc.

    UTILITY FUNCTIONS

    This module offers a single utility function, findsym().

    • findsym

          my $symbol = Attribute::Handlers::findsym($package, $referent);

      The function looks in the symbol table of $package for the typeglob for $referent, which is a reference to a variable or subroutine (SCALAR, ARRAY, HASH, or CODE). If it finds the typeglob, it returns it. Otherwise, it returns undef. Note that findsym memoizes the typeglobs it has previously successfully found, so subsequent calls with the same arguments should be much faster.
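
    A short sketch of findsym() in use follows. The package Demo and its subroutine greet are hypothetical, and the code only tests the "not found" result for truth rather than relying on its exact value.

```perl
use strict;
use warnings;
use Attribute::Handlers;

package Demo { sub greet { 'hi' } }    # hypothetical package and sub

package main;

# Look up the typeglob that holds \&Demo::greet.
my $glob = Attribute::Handlers::findsym('Demo', \&Demo::greet);
print 'found: ', *{$glob}{NAME}, "\n" if $glob;

# An anonymous sub lives in no symbol table, so nothing is found.
my $none = Attribute::Handlers::findsym('Demo', sub { 1 });
print "anonymous sub: not found\n" unless $none;
```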

    DIAGNOSTICS

    • Bad attribute type: ATTR(%s)

      An attribute handler was specified with an :ATTR(ref_type), but the type of referent it was defined to handle wasn't one of the five permitted: SCALAR, ARRAY, HASH, CODE, or ANY.

    • Attribute handler %s doesn't handle %s attributes

      A handler for attributes of the specified name was defined, but not for the specified type of declaration. Typically encountered when trying to apply a VAR attribute handler to a subroutine, or a SCALAR attribute handler to some other type of variable.

    • Declaration of %s attribute in package %s may clash with future reserved word

      A handler for an attribute with an all-lowercase name was declared. An attribute with an all-lowercase name might have a meaning to Perl itself some day, even though most don't yet. Use a mixed-case attribute name instead.

    • Can't have two ATTR specifiers on one subroutine

      You just can't, okay? Instead, put all the specifications together with commas between them in a single ATTR(specification).

    • Can't autotie a %s

      You can only declare autoties for types "SCALAR", "ARRAY", and "HASH". They're the only things (apart from typeglobs -- which are not declarable) that Perl can tie.

    • Internal error: %s symbol went missing

      Something is rotten in the state of the program. An attributed subroutine ceased to exist between the point it was declared and the point at which its attribute handler(s) would have been called.

    • Won't be able to apply END handler

      You have defined an END handler for an attribute that is being applied to a lexical variable. Since the variable may not be available during END, the handler will not be applied.

    AUTHOR

    Damian Conway (damian@conway.org). The maintainer of this module is now Rafael Garcia-Suarez (rgarciasuarez@gmail.com).

    Maintainer of the CPAN release is Steffen Mueller (smueller@cpan.org). Contact him with technical difficulties with respect to the packaging of the CPAN module.

    BUGS

    There are undoubtedly serious bugs lurking somewhere in code this funky :-) Bug reports and other feedback are most welcome.

    COPYRIGHT AND LICENSE

        Copyright (c) 2001-2009, Damian Conway. All Rights Reserved.
        This module is free software. It may be used, redistributed
        and/or modified under the same terms as Perl itself.
     
    Archive::Extract

    NAME

    Archive::Extract - A generic archive extracting mechanism

    SYNOPSIS

        use Archive::Extract;

        ### build an Archive::Extract object ###
        my $ae = Archive::Extract->new( archive => 'foo.tgz' );

        ### extract to cwd() ###
        my $ok = $ae->extract;

        ### extract to /tmp ###
        my $ok = $ae->extract( to => '/tmp' );

        ### what if something went wrong?
        my $ok = $ae->extract or die $ae->error;

        ### files from the archive ###
        my $files = $ae->files;

        ### dir that was extracted to ###
        my $outdir = $ae->extract_path;

        ### quick check methods ###
        $ae->is_tar;    # is it a .tar file?
        $ae->is_tgz;    # is it a .tar.gz or .tgz file?
        $ae->is_gz;     # is it a .gz file?
        $ae->is_zip;    # is it a .zip file?
        $ae->is_bz2;    # is it a .bz2 file?
        $ae->is_tbz;    # is it a .tar.bz2 or .tbz file?
        $ae->is_lzma;   # is it a .lzma file?
        $ae->is_xz;     # is it a .xz file?
        $ae->is_txz;    # is it a .tar.xz or .txz file?

        ### absolute path to the archive you provided ###
        $ae->archive;

        ### commandline tools, if found ###
        $ae->bin_tar;       # path to /bin/tar, if found
        $ae->bin_gzip;      # path to /bin/gzip, if found
        $ae->bin_unzip;     # path to /bin/unzip, if found
        $ae->bin_bunzip2;   # path to /bin/bunzip2, if found
        $ae->bin_unlzma;    # path to /bin/unlzma, if found
        $ae->bin_unxz;      # path to /bin/unxz, if found

    DESCRIPTION

    Archive::Extract is a generic archive extraction mechanism.

    It allows you to extract any archive file of the type .tar, .tar.gz, .gz, .Z, .tar.bz2, .tbz, .bz2, .zip, .xz, .txz, .tar.xz or .lzma without having to worry about how it does so, or to use a different interface for each type. It does this using either Perl modules or commandline tools on your system.

    See the HOW IT WORKS section further down for details.

    METHODS

    $ae = Archive::Extract->new(archive => '/path/to/archive',[type => TYPE])

    Creates a new Archive::Extract object based on the archive file you passed it. Automatically determines the type of archive based on the extension, but you can override that by explicitly providing the type argument.

    Valid values for type are:

    • tar

      Standard tar files, as produced by, for example, /bin/tar. Corresponds to a .tar suffix.

    • tgz

      Gzip compressed tar files, as produced by, for example /bin/tar -z. Corresponds to a .tgz or .tar.gz suffix.

    • gz

      Gzip compressed file, as produced by, for example /bin/gzip. Corresponds to a .gz suffix.

    • Z

      Lempel-Ziv compressed file, as produced by, for example /bin/compress. Corresponds to a .Z suffix.

    • zip

      Zip compressed file, as produced by, for example /bin/zip. Corresponds to a .zip, .jar or .par suffix.

    • bz2

      Bzip2 compressed file, as produced by, for example, /bin/bzip2. Corresponds to a .bz2 suffix.

    • tbz

      Bzip2 compressed tar file, as produced by, for example /bin/tar -j. Corresponds to a .tbz or .tar.bz2 suffix.

    • lzma

      Lzma compressed file, as produced by /bin/lzma. Corresponds to a .lzma suffix.

    • xz

      Xz compressed file, as produced by /bin/xz. Corresponds to a .xz suffix.

    • txz

      Xz compressed tar file, as produced by, for example, /bin/tar -J. Corresponds to a .txz or .tar.xz suffix.

    Returns an Archive::Extract object on success, or false on failure.

    $ae->extract( [to => '/output/path'] )

    Extracts the archive represented by the Archive::Extract object to the path of your choice as specified by the to argument. Defaults to cwd().

    A .gz file never holds a directory, only a single file. If the to argument is an existing directory, the file is extracted there with its .gz suffix stripped. If the archive type is gz and the to argument is not an existing directory, the to argument is taken to be a filename. If you did not specify a to argument, the output file is the name of the archive file with its .gz suffix stripped, in the current working directory.
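
    The naming rule for .gz archives can be restated as a few lines of code. The helper below, gz_output_path, is hypothetical and written only to mirror the rule as described; it is not part of the Archive::Extract interface and makes no claim about how the module implements it.

```perl
use strict;
use warnings;
use Cwd qw(cwd);
use File::Basename qw(basename);
use File::Spec;

# Hypothetical helper mirroring the rule described above.
sub gz_output_path {
    my ($archive, $to) = @_;
    (my $name = basename($archive)) =~ s/\.gz\z//;    # strip the suffix

    return File::Spec->catfile(cwd(), $name) unless defined $to;
    return File::Spec->catfile($to, $name)   if -d $to;
    return $to;                                       # treated as a filename
}

# An existing directory: the stripped name is placed inside it.
print gz_output_path('/data/notes.txt.gz', File::Spec->tmpdir), "\n";

# Not an existing directory: the argument itself is the output filename.
print gz_output_path('/data/notes.txt.gz', 'no-such-dir/plain.txt'), "\n";
```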

    extract will try a pure perl solution first, and then fall back to commandline tools if they are available. See the GLOBAL VARIABLES section below on how to alter this behaviour.

    It will return true on success, and false on failure.

    On success, it will also set the following attributes in the object:

    • $ae->extract_path

      This is the directory that the files were extracted to.

    • $ae->files

      This is an array ref with the paths of all the files in the archive, relative to the to argument you specified. To get the full path to an extracted file, you would use:

          File::Spec->catfile( $to, $ae->files->[0] );

      Note that all files from a tar archive will be in unix format, as per the tar specification.

    ACCESSORS

    $ae->error([BOOL])

    Returns the last encountered error as string. Pass it a true value to get the Carp::longmess() output instead.

    $ae->extract_path

    This is the directory the archive got extracted to. See extract() for details.

    $ae->files

    This is an array ref holding all the paths from the archive. See extract() for details.

    $ae->archive

    This is the full path to the archive file represented by this Archive::Extract object.

    $ae->type

    This is the type of archive represented by this Archive::Extract object. See accessors below for an easier way to use this. See the new() method for details.

    $ae->types

    Returns a list of all known types for Archive::Extract 's new method.

    $ae->is_tgz

    Returns true if the file is of type .tar.gz. See the new() method for details.

    $ae->is_tar

    Returns true if the file is of type .tar. See the new() method for details.

    $ae->is_gz

    Returns true if the file is of type .gz. See the new() method for details.

    $ae->is_Z

    Returns true if the file is of type .Z. See the new() method for details.

    $ae->is_zip

    Returns true if the file is of type .zip. See the new() method for details.

    $ae->is_lzma

    Returns true if the file is of type .lzma. See the new() method for details.

    $ae->is_xz

    Returns true if the file is of type .xz . See the new() method for details.

    $ae->bin_tar

    Returns the full path to your tar binary, if found.

    $ae->bin_gzip

    Returns the full path to your gzip binary, if found.

    $ae->bin_unzip

    Returns the full path to your unzip binary, if found.

    $ae->bin_unlzma

    Returns the full path to your unlzma binary, if found.

    $ae->bin_unxz

    Returns the full path to your unxz binary, if found.

    $bool = $ae->have_old_bunzip2

    Older versions of /bin/bunzip2, from before the bunzip2 1.0 release, require all archive names to end in .bz2 or it will not extract them. This method checks if you have a recent version of bunzip2 that allows any extension, or an older one that doesn't.

    debug( MESSAGE )

    This method outputs MESSAGE to the default filehandle if $DEBUG is true. It's a small method, but it's here if you'd like to subclass it so you can do something else with any debugging output.

    HOW IT WORKS

    Archive::Extract tries first to determine what type of archive you are passing it, by inspecting its suffix. It does not do this by using Mime magic, or something related. See CAVEATS below.

    Once it has determined the file type, it knows which extraction methods it can use on the archive. It will try a perl solution first, then fall back to a commandline tool if that fails. If that also fails, it will return false, indicating it was unable to extract the archive. See the section on GLOBAL VARIABLES to see how to alter this order.

    CAVEATS

    File Extensions

    Archive::Extract relies on the extension of the archive to determine what type it is, and therefore which extractor methods can be used. If your archives do not have any of the extensions described in the new() method, you will have to specify the type explicitly, or Archive::Extract will not be able to extract the archive for you.

    Supporting Very Large Files

    Archive::Extract can use either pure perl modules or command line programs under the hood. Some of the pure perl modules (like Archive::Tar and Compress::unLZMA) take the entire contents of the archive into memory, which may not be feasible on your system. Consider setting the global variable $Archive::Extract::PREFER_BIN to 1 , which will prefer the use of command line programs and won't consume so much memory.

    See the GLOBAL VARIABLES section below for details.

    Bunzip2 support of arbitrary extensions.

    Older versions of /bin/bunzip2 do not support arbitrary file extensions and insist on a .bz2 suffix. Although we do our best to guard against this, if you experience a bunzip2 error, it may be related to this. For details, please see the have_old_bunzip2 method.

    GLOBAL VARIABLES

    $Archive::Extract::DEBUG

    Set this variable to true to have all calls to command line tools be printed out, including all their output. This also enables Carp::longmess errors, instead of the regular carp errors.

    Good for tracking down why things don't work with your particular setup.

    Defaults to false .

    $Archive::Extract::WARN

    This variable controls whether errors encountered internally by Archive::Extract should be carp 'd or not.

    Set to false to silence warnings. Inspect the output of the error() method manually to see what went wrong.

    Defaults to true .

    $Archive::Extract::PREFER_BIN

    This variable controls whether Archive::Extract should prefer the use of perl modules or commandline tools to extract archives.

    Set to true to have Archive::Extract prefer commandline tools.

    Defaults to false .
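    The globals can be combined with local so that a change only applies to one call. The following is a sketch under assumptions: extract_large is a hypothetical helper name, and Archive::Extract is loaded at run time in case it is not installed.

```perl
use strict;
use warnings;

# Hypothetical helper: extract a large archive while preferring
# command line tools, so the archive is not read into memory.
sub extract_large {
    my ( $archive, $to ) = @_;

    require Archive::Extract;
    no warnings 'once';    # the globals are only mentioned once here

    # local() restores the previous values when the sub returns.
    local $Archive::Extract::PREFER_BIN = 1;
    local $Archive::Extract::WARN       = 0;    # inspect error() instead

    my $ae = Archive::Extract->new( archive => $archive ) or return;
    return $ae->extract( to => $to ) ? $ae->extract_path : undef;
}
```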

    TODO / CAVEATS

    • Mime magic support

      Maybe this module should use something like File::Type to determine the type, rather than blindly trust the suffix.

    • Thread safety

      Currently, Archive::Extract does a chdir to the extraction dir before extraction, and a chdir back again after. This is not necessarily thread safe. See rt.cpan.org bug #45671 for details.

    BUG REPORTS

    Please report bugs or other issues to <bug-archive-extract@rt.cpan.org>.

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    COPYRIGHT

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.

     
    Archive::Tar

    NAME

    Archive::Tar - module for manipulations of tar archives

    SYNOPSIS

        use Archive::Tar;
        my $tar = Archive::Tar->new;
        $tar->read('origin.tgz');
        $tar->extract();
        $tar->add_files('file/foo.pl', 'docs/README');
        $tar->add_data('file/baz.txt', 'This is the contents now');
        $tar->rename('oldname', 'new/file/name');
        $tar->chown('/', 'root');
        $tar->chown('/', 'root:root');
        $tar->chmod('/tmp', '1777');
        $tar->write('files.tar');                 # plain tar
        $tar->write('files.tgz', COMPRESS_GZIP);  # gzip compressed
        $tar->write('files.tbz', COMPRESS_BZIP);  # bzip2 compressed

    DESCRIPTION

    Archive::Tar provides an object oriented mechanism for handling tar files. It provides class methods for quick and easy file handling while also allowing for the creation of tar file objects for custom manipulation. If you have the IO::Zlib module installed, Archive::Tar will also support compressed or gzipped tar files.

    An object of class Archive::Tar represents a .tar(.gz) archive full of files and things.

    Object Methods

    Archive::Tar->new( [$file, $compressed] )

    Returns a new Tar object. If given any arguments, new() calls the read() method automatically, passing on the arguments provided to the read() method.

    If new() is invoked with arguments and the read() method fails for any reason, new() returns undef.

    $tar->read ( $filename|$handle, [$compressed, {opt => 'val'}] )

    Read the given tar file into memory. The first argument can either be the name of a file or a reference to an already open filehandle (or an IO::Zlib object if it's compressed).

    The read will replace any previous content in $tar !

    The second argument may be considered optional, but remains for backwards compatibility. Archive::Tar now looks at the file magic to determine what class should be used to open the file and will transparently Do The Right Thing.

    Archive::Tar will warn and simply return if you pass it a compressed file and the module needed to read it (IO::Zlib for gzip, IO::Uncompress::Bunzip2 for bzip2) is not available.

    Note that you cannot currently pass a gzip compressed filehandle that was not opened with IO::Zlib, a bzip2 compressed filehandle that was not opened with IO::Uncompress::Bunzip2, or a string containing the full archive information (either compressed or uncompressed). These are worthwhile features, but they are not currently implemented. See the TODO section.

    The third argument can be a hash reference with options. Note that all options are case-sensitive.

    • limit

      Do not read more than limit files. This is useful if you have very big archives, and are only interested in the first few files.

    • filter

      Can be set to a regular expression. Only files with names that match the expression will be read.

    • md5

      Set to 1 and the md5sum of files will be returned (instead of file data):

          my $iter = Archive::Tar->iter( $file, 1, {md5 => 1} );
          while( my $f = $iter->() ) {
              print $f->data . "\t" . $f->full_path . $/;
          }

    • extract

      If set to true, immediately extract entries when reading them. This gives you the same memory savings as the extract_archive function. Note however that entries will not be read into memory, but written straight to disk. This means no Archive::Tar::File objects are created for you to inspect.

    All files are stored internally as Archive::Tar::File objects. Please consult the Archive::Tar::File documentation for details.

    Returns the number of files read in scalar context, and a list of Archive::Tar::File objects in list context.
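    The options can be tried out on a small archive. The sketch below first writes a throwaway tar file to a temporary directory (the file names are invented for the example), then reads it back once with filter and once with limit.

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Spec;
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'demo.tar' );

# Build a small archive to read back.
my $out = Archive::Tar->new;
$out->add_data( 'lib/Foo.pm', "package Foo; 1;\n" );
$out->add_data( 'lib/Bar.pm', "package Bar; 1;\n" );
$out->add_data( 'README',     "demo\n" );
$out->write($file) or die $out->error;

# List context: the Archive::Tar::File objects that passed the filter.
my $in         = Archive::Tar->new;
my @entries    = $in->read( $file, undef, { filter => qr/\.pm$/ } );
my @read_names = map { $_->full_path } @entries;

# Scalar context: the number of files read, capped by limit.
my $count = $in->read( $file, undef, { limit => 1 } );
```

    Note that the second read() replaces the content loaded by the first one.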

    $tar->contains_file( $filename )

    Check if the archive contains a certain file. It will return true if the file is in the archive, false otherwise.

    Note however, that this function does an exact match using eq on the full path. So it cannot compensate for case-insensitive filesystems or compare two paths to see if they would point to the same underlying file.
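    A short sketch of what the exact match means in practice, using an in-memory archive with one invented entry:

```perl
use strict;
use warnings;
use Archive::Tar;

my $tar = Archive::Tar->new;
$tar->add_data( 'docs/README', "hello\n" );

# Only the byte-for-byte identical path matches.
my $exact      = $tar->contains_file('docs/README')   ? 1 : 0;
my $wrong_case = $tar->contains_file('docs/readme')   ? 1 : 0;
my $dot_slash  = $tar->contains_file('./docs/README') ? 1 : 0;

print "exact=$exact wrong_case=$wrong_case dot_slash=$dot_slash\n";
```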

    $tar->extract( [@filenames] )

    Write files whose names are equivalent to any of the names in @filenames to disk, creating subdirectories as necessary. This might not work too well under VMS. Under MacPerl, the file's modification time will be converted to the MacOS zero of time, and appropriate conversions will be done to the path. However, the length of each element of the path is not inspected to see whether it's longer than MacOS currently allows (32 characters).

    If extract is called without a list of file names, the entire contents of the archive are extracted.

    Returns a list of filenames extracted.

    $tar->extract_file( $file, [$extract_path] )

    Write the entry whose name is equivalent to the given file name to disk. Optionally takes a second parameter, which is the full native path (including filename) the entry will be written to.

    For example:

        $tar->extract_file( 'name/in/archive', 'name/i/want/to/give/it' );
        $tar->extract_file( $at_file_object, 'name/i/want/to/give/it' );

    Returns true on success, false on failure.

    $tar->list_files( [\@properties] )

    Returns a list of the names of all the files in the archive.

    If list_files() is passed an array reference as its first argument it returns a list of hash references containing the requested properties of each file. The following list of properties is supported: name, size, mtime (last modified date), mode, uid, gid, linkname, uname, gname, devmajor, devminor, prefix.

    Passing an array reference containing only one element, 'name', is special cased to return a list of names rather than a list of hash references, making it equivalent to calling list_files without arguments.
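    The two return shapes can be compared side by side. This is a minimal sketch on an in-memory archive; the entry names and contents are made up.

```perl
use strict;
use warnings;
use Archive::Tar;

my $tar = Archive::Tar->new;
$tar->add_data( 'a.txt', 'hello' );
$tar->add_data( 'b.txt', 'hello world' );

# Without arguments: a plain list of names.
my @names = $tar->list_files;

# With an array reference: one hash reference per file.
my @info = $tar->list_files( [ 'name', 'size' ] );

printf "%-8s %d bytes\n", $_->{name}, $_->{size} for @info;
```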

    $tar->get_files( [@filenames] )

    Returns the Archive::Tar::File objects matching the filenames provided. If no filename list was passed, all Archive::Tar::File objects in the current Tar object are returned.

    Please refer to the Archive::Tar::File documentation on how to handle these objects.

    $tar->get_content( $file )

    Return the content of the named file.

    $tar->replace_content( $file, $content )

    Make the string $content be the content for the file named $file.

    $tar->rename( $file, $new_name )

    Rename the file in the in-memory archive to $new_name.

    Note that you must specify a Unix path for $new_name, since per tar standard, all files in the archive must be Unix paths.

    Returns true on success and false on failure.

    $tar->chmod( $file, $mode )

    Change mode of $file to $mode.

    Returns true on success and false on failure.

    $tar->chown( $file, $uname [, $gname] )

    Change the owner of $file to $uname and $gname.

    Returns true on success and false on failure.

    $tar->remove (@filenamelist)

    Removes any entries with names matching any of the given filenames from the in-memory archive. Returns a list of Archive::Tar::File objects that remain.
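    The in-memory manipulation methods described above can be chained on a single object. The following sketch uses invented entry names and nothing is written to disk.

```perl
use strict;
use warnings;
use Archive::Tar;

my $tar = Archive::Tar->new;
$tar->add_data( 'README',  "draft\n" );
$tar->add_data( 'tmp.log', "noise\n" );

# Each of these returns true on success and false on failure.
$tar->replace_content( 'README', "final\n" ) or die $tar->error;
$tar->rename( 'README', 'docs/README.txt' )  or die $tar->error;
$tar->chmod( 'docs/README.txt', '0644' )     or die $tar->error;

# remove() returns the Archive::Tar::File objects that remain.
my @remaining = $tar->remove('tmp.log');

my @left    = $tar->list_files;
my $content = $tar->get_content('docs/README.txt');
```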

    $tar->clear

    clear clears the current in-memory archive. This effectively gives you a 'blank' object, ready to be filled again. Note that clear only has effect on the object, not the underlying tarfile.

    $tar->write ( [$file, $compressed, $prefix] )

    Write the in-memory archive to disk. The first argument can either be the name of a file or a reference to an already open filehandle (a GLOB reference).

    The second argument is used to indicate compression. You can either compress using gzip or bzip2 . If you pass a digit, it's assumed to be the gzip compression level (between 1 and 9), but the use of constants is preferred:

        # write a gzip compressed file
        $tar->write( 'out.tgz', COMPRESS_GZIP );
        # write a bzip compressed file
        $tar->write( 'out.tbz', COMPRESS_BZIP );

    Note that when you pass in a filehandle, the compression argument is ignored, as all files are printed verbatim to your filehandle. If you wish to enable compression with filehandles, use an IO::Zlib or IO::Compress::Bzip2 filehandle instead.

    The third argument is an optional prefix. All files will be tucked away in the directory you specify as prefix. So if you have files 'a' and 'b' in your archive, and you specify 'foo' as prefix, they will be written to the archive as 'foo/a' and 'foo/b'.

    If no arguments are given, write returns the entire formatted archive as a string, which could be useful if you'd like to stuff the archive into a socket or a pipe to gzip or something.
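    Both the prefix argument and the string form can be seen in a short sketch. The archive is written to a temporary directory, the entry names are invented, and the string form assumes that perlio or IO::String support is available.

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Spec;
use File::Temp qw(tempdir);

my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'prefixed.tar' );

my $tar = Archive::Tar->new;
$tar->add_data( 'a', "first\n" );
$tar->add_data( 'b', "second\n" );

# No compression (undef); every entry is tucked away under 'foo'.
$tar->write( $file, undef, 'foo' ) or die $tar->error;
my @stored = Archive::Tar->list_archive($file);

# Without arguments, write() returns the formatted archive as a string.
my $blob = $tar->write;
```

    The prefix only affects what is written out: the entries held by $tar itself keep their original names.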

    $tar->add_files( @filenamelist )

    Takes a list of filenames and adds them to the in-memory archive.

    The path to the file is automatically converted to a Unix like equivalent for use in the archive, and, if on MacOS, the file's modification time is converted from the MacOS epoch to the Unix epoch. So tar archives created on MacOS with Archive::Tar can be read both with tar on Unix and applications like suntar or Stuffit Expander on MacOS.

    Be aware that the file's type/creator and resource fork will be lost, which is usually what you want in cross-platform archives.

    Instead of a filename, you can also pass it an existing Archive::Tar::File object from, for example, another archive. The object will be cloned, and effectively be a copy of the original, not an alias.

    Returns a list of Archive::Tar::File objects that were just added.

    $tar->add_data ( $filename, $data, [$opthashref] )

    Takes a filename, a scalar full of data and optionally a reference to a hash with specific options.

    Will add a file to the in-memory archive, with name $filename and content $data . Specific properties can be set using $opthashref . The following list of properties is supported: name, size, mtime (last modified date), mode, uid, gid, linkname, uname, gname, devmajor, devminor, prefix, type. (On MacOS, the file's path and modification times are converted to Unix equivalents.)

    Valid values for the file type are the following constants defined by Archive::Tar::Constant:

    • FILE

      Regular file.

    • HARDLINK
    • SYMLINK

      Hard and symbolic ("soft") links; linkname should specify target.

    • CHARDEV
    • BLOCKDEV

      Character and block devices. devmajor and devminor should specify the major and minor device numbers.

    • DIR

      Directory.

    • FIFO

      FIFO (named pipe).

    • SOCKET

      Socket.

    Returns the Archive::Tar::File object that was just added, or undef on failure.
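    As an illustration of the options hash, the sketch below adds an executable script and a symbolic link pointing at it. The paths are invented, and the type constant comes from Archive::Tar::Constant as described above.

```perl
use strict;
use warnings;
use Archive::Tar;
use Archive::Tar::Constant;    # exports FILE, SYMLINK, DIR and friends

my $tar = Archive::Tar->new;

# A regular file with an explicit mode.
my $script = $tar->add_data( 'bin/run.sh', "#!/bin/sh\necho hi\n",
    { mode => 0755 } )
    or die $tar->error;

# A symbolic link: no data, linkname names the target.
my $link = $tar->add_data( 'bin/run', '',
    { type => SYMLINK, linkname => 'run.sh' } )
    or die $tar->error;

printf "%s -> %s\n", $link->full_path, $link->linkname;
```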

    $tar->error( [$BOOL] )

    Returns the current errorstring (usually, the last error reported). If a true value was specified, it will give the Carp::longmess equivalent of the error, in effect giving you a stacktrace.

    For backwards compatibility, this error is also available as $Archive::Tar::error although it is much recommended you use the method call instead.

    $tar->setcwd( $cwd );

    Archive::Tar needs to know the current directory, and it will run Cwd::cwd() every time it extracts a relative entry from the tarfile and saves it in the file system. (As of version 1.30, however, Archive::Tar will use the speed optimization described below automatically, so it's only relevant if you're using extract_file() ).

    Since Archive::Tar doesn't change the current directory internally while it is extracting the items in a tarball, all calls to Cwd::cwd() can be avoided if we can guarantee that the current directory doesn't get changed externally.

    To use this performance boost, set the current directory via

        use Cwd;
        $tar->setcwd( cwd() );

    once before calling a function like extract_file and Archive::Tar will use the current directory setting from then on and won't call Cwd::cwd() internally.

    To switch back to the default behaviour, use

        $tar->setcwd( undef );

    and Archive::Tar will call Cwd::cwd() internally again.

    If you're using Archive::Tar 's extract() method, setcwd() will be called for you.
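    Put together, a loop over extract_file() might look like the sketch below. It extracts three invented entries into a temporary directory and assumes nothing else changes the current directory while the loop runs.

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd;
use File::Temp qw(tempdir);

my $tar = Archive::Tar->new;
$tar->add_data( "f$_.txt", "content $_\n" ) for 1 .. 3;

my $orig = cwd();
my $dir  = tempdir( CLEANUP => 1 );
chdir $dir or die "chdir $dir: $!";

# Look up the current directory once, instead of once per entry.
$tar->setcwd( cwd() );
for my $name ( $tar->list_files ) {
    $tar->extract_file($name) or die $tar->error;
}
$tar->setcwd(undef);    # back to the default behaviour

my @extracted = grep { -f } map { "f$_.txt" } 1 .. 3;
chdir $orig or die "chdir $orig: $!";
```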

    Class Methods

    Archive::Tar->create_archive($file, $compressed, @filelist)

    Creates a tar file from the list of files provided. The first argument can either be the name of the tar file to create or a reference to an open file handle (e.g. a GLOB reference).

    The second argument is used to indicate compression. You can either compress using gzip or bzip2 . If you pass a digit, it's assumed to be the gzip compression level (between 1 and 9), but the use of constants is preferred:

        # write a gzip compressed file
        Archive::Tar->create_archive( 'out.tgz', COMPRESS_GZIP, @filelist );
        # write a bzip compressed file
        Archive::Tar->create_archive( 'out.tbz', COMPRESS_BZIP, @filelist );

    Note that when you pass in a filehandle, the compression argument is ignored, as all files are printed verbatim to your filehandle. If you wish to enable compression with filehandles, use an IO::Zlib or IO::Compress::Bzip2 filehandle instead.

    The remaining arguments list the files to be included in the tar file. These files should all exist: any file that doesn't exist or can't be read is silently ignored.

    If the archive creation fails for any reason, create_archive will return false. Please use the error method to find the cause of the failure.

    Note that this method does not write on the fly as it were; it still reads all the files into memory before writing out the archive. Consult the FAQ below if this is a problem.
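    A complete round trip could look like the following sketch. It creates two throwaway input files in a temporary directory, archives them with gzip compression (which assumes IO::Zlib is installed) and lists the result again.

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd;
use File::Temp qw(tempdir);

my $orig = cwd();
my $dir  = tempdir( CLEANUP => 1 );
chdir $dir or die "chdir $dir: $!";

# Two small input files, named relative to the current directory.
for my $name (qw(a.txt b.txt)) {
    open my $fh, '>', $name or die "open $name: $!";
    print {$fh} "contents of $name\n";
    close $fh or die "close $name: $!";
}

# create_archive returns false on failure; error() explains why.
Archive::Tar->create_archive( 'out.tgz', COMPRESS_GZIP, 'a.txt', 'b.txt' )
    or die Archive::Tar->error;

my @listed = Archive::Tar->list_archive('out.tgz');
chdir $orig or die "chdir $orig: $!";
```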

    Archive::Tar->iter( $filename, [ $compressed, {opt => $val} ] )

    Returns an iterator function that reads the tar file without loading it all in memory. Each time the function is called it will return the next file in the tarball. The files are returned as Archive::Tar::File objects. The iterator function returns the empty list once it has exhausted the files contained.

    The third argument can be a hash reference with options, which are identical to the options accepted by read().

    Example usage:

        my $next = Archive::Tar->iter( "example.tar.gz", 1, {filter => qr/\.pm$/} );
        while( my $f = $next->() ) {
            print $f->name, "\n";
            $f->extract or warn "Extraction failed";
            # ....
        }

    Archive::Tar->list_archive($file, $compressed, [\@properties])

    Returns a list of the names of all the files in the archive. The first argument can either be the name of the tar file to list or a reference to an open file handle (e.g. a GLOB reference).

    If list_archive() is passed an array reference as its third argument it returns a list of hash references containing the requested properties of each file. The following list of properties is supported: full_path, name, size, mtime (last modified date), mode, uid, gid, linkname, uname, gname, devmajor, devminor, prefix, type.

    See Archive::Tar::File for details about supported properties.

    Passing an array reference containing only one element, 'name', is special cased to return a list of names rather than a list of hash references.

    Archive::Tar->extract_archive($file, $compressed)

    Extracts the contents of the tar file. The first argument can either be the name of the tar file to extract or a reference to an open file handle (e.g. a GLOB reference). All relative paths in the tar file will be created underneath the current working directory.

    extract_archive will return a list of files it extracted. If the archive extraction fails for any reason, extract_archive will return false. Please use the error method to find the cause of the failure.
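    Because relative paths end up below the current working directory, it is usually worth changing into the target directory first. A minimal sketch, using a small archive built on the spot with an invented entry name:

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd;
use File::Spec;
use File::Temp qw(tempdir);

# Build a small archive to extract.
my $work = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $work, 'demo.tar' );
my $tar  = Archive::Tar->new;
$tar->add_data( 'docs/README', "hello\n" );
$tar->write($file) or die $tar->error;

my $orig = cwd();
my $out  = tempdir( CLEANUP => 1 );
chdir $out or die "chdir $out: $!";

# Entries are written straight to disk, below the current directory.
my $ok    = Archive::Tar->extract_archive($file);
my $error = $ok ? '' : Archive::Tar->error;
my $found = -f File::Spec->catfile( 'docs', 'README' ) ? 1 : 0;

chdir $orig or die "chdir $orig: $!";
die "Extraction failed: $error\n" unless $ok;
```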

    $bool = Archive::Tar->has_io_string

    Returns true if we currently have IO::String support loaded.

    Either IO::String or perlio support is needed to support writing stringified archives. Currently, perlio is the preferred method, if available.

    See the GLOBAL VARIABLES section to see how to change this preference.

    $bool = Archive::Tar->has_perlio

    Returns true if we currently have perlio support loaded.

    This requires perl-5.8 or higher, compiled with perlio

    Either IO::String or perlio support is needed to support writing stringified archives. Currently, perlio is the preferred method, if available.

    See the GLOBAL VARIABLES section to see how to change this preference.

    $bool = Archive::Tar->has_zlib_support

    Returns true if Archive::Tar can extract zlib compressed archives

    $bool = Archive::Tar->has_bzip2_support

    Returns true if Archive::Tar can extract bzip2 compressed archives

    Archive::Tar->can_handle_compressed_files

    A simple checking routine, which will return true if Archive::Tar is able to uncompress compressed archives on the fly with IO::Zlib and IO::Compress::Bzip2, or false if either of them is not installed.

    You can use this as a shortcut to determine whether Archive::Tar will do what you think before passing compressed archives to its read method.
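    The individual checks can also drive the choice of output format. In this sketch the preference order (bzip2, then gzip, then plain tar) and the output name are arbitrary choices made for the example.

```perl
use strict;
use warnings;
use Archive::Tar;

# Pick a compression setting this installation can actually handle.
my ( $compress, $suffix ) =
      Archive::Tar->has_bzip2_support ? ( COMPRESS_BZIP, 'tbz' )
    : Archive::Tar->has_zlib_support  ? ( COMPRESS_GZIP, 'tgz' )
    :                                   ( 0,             'tar' );

my $outfile = "backup.$suffix";    # hypothetical output name
print "Would write $outfile\n";
```

    $compress can then be passed as the second argument to write() or create_archive().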

    GLOBAL VARIABLES

    $Archive::Tar::FOLLOW_SYMLINK

    Set this variable to 1 to make Archive::Tar effectively make a copy of the file when extracting. Default is 0 , which means the symlink stays intact. Of course, you will have to pack the file linked to as well.

    This option is checked when you write out the tarfile using write or create_archive .

    This works just like /bin/tar's -h option.

    $Archive::Tar::CHOWN

    By default, Archive::Tar will try to chown your files if it is able to. In some cases, this may not be desired. In that case, set this variable to 0 to disable chown-ing, even if it were possible.

    The default is 1 .

    $Archive::Tar::CHMOD

    By default, Archive::Tar will try to chmod your files to whatever mode was specified for the particular file in the archive. In some cases, this may not be desired. In that case, set this variable to 0 to disable chmod-ing.

    The default is 1 .

    $Archive::Tar::SAME_PERMISSIONS

    When $Archive::Tar::CHMOD is enabled, this setting controls whether the permissions on files from the archive are used without modification or if they are filtered by removing any setid bits and applying the current umask.

    The default is 1 for the root user and 0 for normal users.

    $Archive::Tar::DO_NOT_USE_PREFIX

    By default, Archive::Tar will try to put paths that are over 100 characters in the prefix field of your tar header, as defined per POSIX-standard. However, some (older) tar programs do not implement this spec. To retain compatibility with these older or non-POSIX compliant versions, you can set the $DO_NOT_USE_PREFIX variable to a true value, and Archive::Tar will use an alternate way of dealing with paths over 100 characters by using the GNU Extended Header feature.

    Note that clients who do not support the GNU Extended Header feature will not be able to read these archives. Such clients include tars on Solaris , Irix and AIX .

    The default is 0 .

    $Archive::Tar::DEBUG

    Set this variable to 1 to always get the Carp::longmess output of the warnings, instead of the regular carp . This is the same message you would get by doing:

        $tar->error(1);

    Defaults to 0 .

    $Archive::Tar::WARN

    Set this variable to 0 if you do not want any warnings printed. Personally I recommend against doing this, but people asked for the option. Also, be advised that this is of course not threadsafe.

    Defaults to 1 .

    $Archive::Tar::error

    Holds the last reported error. Kept for historical reasons, but its use is very much discouraged. Use the error() method instead:

        warn $tar->error unless $tar->extract;

    Note that in older versions of this module, the error() method would return an effectively global value even when called as an instance method as above. This has since been fixed, and multiple instances of Archive::Tar now have separate error strings.
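    The warning and error globals are often combined: silence the warning with a localized $Archive::Tar::WARN and inspect error() yourself. A minimal sketch, which provokes a failure by reading a file that does not exist:

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Spec;
use File::Temp qw(tempdir);

my $dir     = tempdir( CLEANUP => 1 );
my $missing = File::Spec->catfile( $dir, 'no-such-file.tar' );

my $tar = Archive::Tar->new;
my $message;
{
    # Silence the carp()ed warning inside this block only.
    local $Archive::Tar::WARN = 0;
    $message = $tar->error unless $tar->read($missing);
}

print "read failed: $message\n" if defined $message;
```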

    $Archive::Tar::INSECURE_EXTRACT_MODE

    This variable indicates whether Archive::Tar should allow files to be extracted outside their current working directory.

    Allowing this could have security implications, as a malicious tar archive could alter or replace any file the extracting user has permissions to. Therefore, the default is to not allow insecure extractions.

    If you trust the archive, or have other reasons to allow the archive to write files outside your current working directory, set this variable to true .

    Note that this is a backwards incompatible change from version 1.36 and before.
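    The default behaviour can be observed with an entry that tries to climb out of the current directory. This sketch works in a throwaway sandbox directory, uses an invented entry name, and assumes a version of Archive::Tar later than 1.36.

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd;
use File::Spec;
use File::Temp qw(tempdir);

my $tar = Archive::Tar->new;
$tar->add_data( '../escape.txt', "should not be written\n" );

my $orig = cwd();
my $base = tempdir( CLEANUP => 1 );
my $sub  = File::Spec->catdir( $base, 'sandbox' );
mkdir $sub or die "mkdir $sub: $!";
chdir $sub or die "chdir $sub: $!";

my ( $ok, $why );
{
    local $Archive::Tar::WARN = 0;    # keep the refusal quiet
    $ok  = $tar->extract_file('../escape.txt');
    $why = $tar->error;
}

# With the default settings nothing is written above the sandbox.
my $escaped = -e File::Spec->catfile( $base, 'escape.txt' ) ? 1 : 0;
chdir $orig or die "chdir $orig: $!";
```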

    $Archive::Tar::HAS_PERLIO

    This variable holds a boolean indicating if we currently have perlio support loaded. This will be enabled for any perl greater than 5.8 compiled with perlio .

    If you feel strongly about disabling it, set this variable to false . Note that you will then need IO::String installed to support writing stringified archives.

    Don't change this variable unless you really know what you're doing.

    $Archive::Tar::HAS_IO_STRING

    This variable holds a boolean indicating if we currently have IO::String support loaded. This will be enabled for any perl that has a loadable IO::String module.

    If you feel strongly about disabling it, set this variable to false . Note that you will then need perlio support from your perl to be able to write stringified archives.

    Don't change this variable unless you really know what you're doing.

    $Archive::Tar::ZERO_PAD_NUMBERS

    This variable holds a boolean indicating if we will create zero padded numbers for size , mtime and checksum . The default is 0 , indicating that we will create space padded numbers. Added for compatibility with busybox implementations.

    FAQ

    • What's the minimum perl version required to run Archive::Tar?

      You will need perl version 5.005_03 or newer.

    • Isn't Archive::Tar slow?

      Yes it is. It's pure perl, so it's a lot slower than your /bin/tar. However, it's very portable. If speed is an issue, consider using /bin/tar instead.

    • Isn't Archive::Tar heavier on memory than /bin/tar?

      Yes it is, see previous answer. Since Compress::Zlib, and therefore IO::Zlib, doesn't support seek on its filehandles, there is little choice but to read the archive into memory. This is ok if you want to do in-memory manipulation of the archive.

      If you just want to extract, use the extract_archive class method instead. It will optimize and write to disk immediately.

      Another option is to use the iter class method to iterate over the files in the tarball without reading them all in memory at once.

    • Can you lazy-load data instead?

      In some cases, yes. You can use the iter class method to iterate over the files in the tarball without reading them all in memory at once.

    • How much memory will an X kb tar file need?

      Probably more than X kb, since it will all be read into memory. If this is a problem, and you don't need to do in memory manipulation of the archive, consider using the iter class method, or /bin/tar instead.

    • What do you do with unsupported filetypes in an archive?

      Unix has a few filetypes that aren't supported on other platforms, like Win32 . If we encounter a hardlink or symlink we'll just try to make a copy of the original file, rather than throwing an error.

      This does require you to read the entire archive into memory first, since otherwise we wouldn't know what data to fill the copy with. (This means that you cannot use the class methods, including iter, on archives that have incompatible filetypes and still expect things to work.)

      For other filetypes, like chardevs and blockdevs we'll warn that the extraction of this particular item didn't work.

    • I'm using WinZip, or some other non-POSIX client, and files are not being extracted properly!

      By default, Archive::Tar is in a completely POSIX-compatible mode, which uses the POSIX-specification of tar to store files. For paths greater than 100 characters, this is done using the POSIX header prefix . Non-POSIX-compatible clients may not support this part of the specification, and may only support the GNU Extended Header functionality. To facilitate those clients, you can set the $Archive::Tar::DO_NOT_USE_PREFIX variable to true . See the GLOBAL VARIABLES section for details on this variable.

      Note that GNU tar earlier than version 1.14 does not cope well with the POSIX header prefix . If you use such a version, consider setting the $Archive::Tar::DO_NOT_USE_PREFIX variable to true .

    • How do I extract only files that have property X from an archive?

      Sometimes, you might not wish to extract a complete archive, just the files that are relevant to you, based on some criteria.

      You can do this by filtering a list of Archive::Tar::File objects based on your criteria. For example, to extract only files that have the string foo in their path, you would use:

          $tar->extract(
              grep { $_->full_path =~ /foo/ } $tar->get_files
          );

      This way, you can filter on any attribute of the files in the archive. Consult the Archive::Tar::File documentation on how to use these objects.

    • How do I access .tar.Z files?

      The Archive::Tar module can optionally use Compress::Zlib (via the IO::Zlib module) to access tar files that have been compressed with gzip. Unfortunately, tar files compressed with the Unix compress utility cannot be read by Compress::Zlib and so cannot be directly accessed by Archive::Tar.

      If the uncompress or gunzip programs are available, you can use one of these workarounds to read .tar.Z files from Archive::Tar.

      Firstly with uncompress

          use Archive::Tar;
          open F, "uncompress -c $filename |";
          my $tar = Archive::Tar->new(*F);
          ...

      and this with gunzip

          use Archive::Tar;
          open F, "gunzip -c $filename |";
          my $tar = Archive::Tar->new(*F);
          ...

      Similarly, if the compress program is available, you can use this to write a .tar.Z file

          use Archive::Tar;
          use IO::File;
          my $fh = new IO::File "| compress -c >$filename";
          my $tar = Archive::Tar->new();
          ...
          $tar->write($fh);
          $fh->close;

    • How do I handle Unicode strings?

      Archive::Tar uses byte semantics for any files it reads from or writes to disk. This is not a problem if you only deal with files and never look at their content or work solely with byte strings. But if you use Unicode strings with character semantics, some additional steps need to be taken.

      For example, if you add a Unicode string like

          # Problem
          $tar->add_data('file.txt', "Euro: \x{20AC}");

      then there will be a problem later when the tarfile gets written out to disk via $tar->write():

          Wide character in print at .../Archive/Tar.pm line 1014.

      The data was added as a Unicode string and when writing it out to disk, the :utf8 line discipline wasn't set by Archive::Tar , so Perl tried to convert the string to ISO-8859 and failed. The written file now contains garbage.

      For this reason, Unicode strings need to be converted to UTF-8-encoded bytestrings before they are handed off to add_data() :

          use Encode;
          my $data = "Accented character: \x{20AC}";
          $data = encode('utf8', $data);
          $tar->add_data('file.txt', $data);

      The opposite problem occurs if you extract a UTF8-encoded file from a tarball. Using get_content() on the Archive::Tar::File object will return its content as a bytestring, not as a Unicode string.

      If you want it to be a Unicode string (because you want character semantics with operations like regular expression matching), you need to decode the UTF8-encoded content and have Perl convert it into a Unicode string:

          use Encode;
          my $data = $tar->get_content();
          # Make it a Unicode string
          $data = decode('utf8', $data);

      There is no easy way to provide this functionality in Archive::Tar , because a tarball can contain many files, each of which could be encoded in a different way.
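Both directions can be seen together in a short round trip. This is only a sketch under the assumption that the content is UTF-8; the file name is hypothetical, and the archive is kept in memory rather than written out.

```perl
use strict;
use warnings;
use Archive::Tar;
use Encode qw(encode decode);

# A character string containing one wide character (the euro sign).
my $text = "Euro: \x{20AC}";

# Encode to a UTF-8 byte string before handing it to add_data().
my $tar = Archive::Tar->new;
$tar->add_data( 'file.txt', encode( 'UTF-8', $text ) );

# get_content() returns bytes; decode them to get characters back.
my $bytes = $tar->get_content('file.txt');
my $chars = decode( 'UTF-8', $bytes );

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
```

The byte string is longer than the character string because the euro sign occupies three bytes in UTF-8.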

    CAVEATS

    The AIX tar does not fill all unused space in the tar archive with 0x00. This sometimes leads to warning messages from Archive::Tar .

        Invalid header block at offset nnn

    A fix for that problem is scheduled to be released in the following levels of AIX, all of which should be coming out in the 4th quarter of 2009:

        AIX 5.3 TL7 SP10
        AIX 5.3 TL8 SP8
        AIX 5.3 TL9 SP5
        AIX 5.3 TL10 SP2
        AIX 6.1 TL0 SP11
        AIX 6.1 TL1 SP7
        AIX 6.1 TL2 SP6
        AIX 6.1 TL3 SP3

    The IBM APAR number for this problem is IZ50240 (Reported component ID: 5765G0300 / AIX 5.3). It is possible to get an ifix for that problem. If you need an ifix please contact your local IBM AIX support.

    TODO

    • Check if passed in handles are open for read/write

      Currently I don't know of any portable pure perl way to do this. Suggestions welcome.

    • Allow archives to be passed in as string

      Currently, we only allow opened filehandles or filenames, but not strings. The internals would need some reworking to facilitate stringified archives.

    • Facilitate processing an opened filehandle of a compressed archive

      Currently, we only support this if the filehandle is an IO::Zlib object. Environments, like apache, will present you with an opened filehandle to an uploaded file, which might be a compressed archive.

    SEE ALSO

    AUTHOR

    This module by Jos Boumans <kane@cpan.org>.

    Please report bugs to <bug-archive-tar@rt.cpan.org>.

    ACKNOWLEDGEMENTS

    Thanks to Sean Burke, Chris Nandor, Chip Salzenberg, Tim Heaney, Gisle Aas, Rainer Tammer and especially Andrew Savige for their help and suggestions.

    COPYRIGHT

    This module is copyright (c) 2002 - 2009 Jos Boumans <kane@cpan.org>. All rights reserved.

    This library is free software; you may redistribute and/or modify it under the same terms as Perl itself.


    Archive::Tar::File

    NAME

    Archive::Tar::File - a subclass for in-memory extracted file from Archive::Tar

    SYNOPSIS

        my @items = $tar->get_files;
        print $_->name, ' ', $_->size, "\n" for @items;
        print $object->get_content;
        $object->replace_content('new content');
        $object->rename( 'new/full/path/to/file.c' );

    DESCRIPTION

    Archive::Tar::File provides a neat little object layer for in-memory extracted files. It's mostly used internally in Archive::Tar to tidy up the code, but there's no reason users shouldn't use this API as well.

    Accessors

    A lot of the methods in this package are accessors to the various fields in the tar header:

    • name

      The file's name

    • mode

      The file's mode

    • uid

      The user id owning the file

    • gid

      The group id owning the file

    • size

      File size in bytes

    • mtime

      Modification time. Adjusted to mac-time on MacOS if required

    • chksum

      Checksum field for the tar header

    • type

      File type -- numeric, but comparable to exported constants -- see Archive::Tar's documentation

    • linkname

      If the file is a symlink, the file it's pointing to

    • magic

      Tar magic string -- not useful for most users

    • version

      Tar version string -- not useful for most users

    • uname

      The user name that owns the file

    • gname

      The group name that owns the file

    • devmajor

      Device major number in case of a special file

    • devminor

      Device minor number in case of a special file

    • prefix

      Directory to prefix to the extraction path, if any

    • raw

      Raw tar header -- not useful for most users

    Methods

    Archive::Tar::File->new( file => $path )

    Returns a new Archive::Tar::File object from an existing file.

    Returns undef on failure.

    Archive::Tar::File->new( data => $path, $data, $opt )

    Returns a new Archive::Tar::File object from data.

    $path defines the file name (which need not exist), $data the file contents, and $opt is a reference to a hash of attributes which may be used to override the default attributes (fields in the tar header), which are described above in the Accessors section.

    Returns undef on failure.
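As an illustration of the data constructor and the accessors described above, here is a minimal sketch. The path, content, and the overridden mode and uname values are hypothetical.

```perl
use strict;
use warnings;
use Archive::Tar::File;

# Create an entry from in-memory data; the third argument overrides
# selected header fields (here: mode and uname, both made-up values).
my $file = Archive::Tar::File->new(
    data => 'notes/hello.txt',
    "Hello, world\n",
    { mode => 0600, uname => 'alice' },
);
die "could not create entry\n" unless $file;

# The accessors read the corresponding tar header fields.
printf "%s: %d bytes, mode %04o, owner %s\n",
    $file->full_path, $file->size, $file->mode, $file->uname;
```

Fields that are not overridden keep their defaults, such as the current time for mtime.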

    Archive::Tar::File->new( chunk => $chunk )

    Returns a new Archive::Tar::File object from a raw 512-byte tar archive chunk.

    Returns undef on failure.

    $bool = $file->extract( [ $alternative_name ] )

    Extract this object, optionally to an alternative name.

    See Archive::Tar->extract_file for details.

    Returns true on success and false on failure.

    $path = $file->full_path

    Returns the full path from the tar header; this is basically a concatenation of the prefix and name fields.

    $bool = $file->validate

    Done by Archive::Tar internally when reading the tar file: validate the header against the checksum to ensure the integrity of the tar file.

    Returns true on success, false on failure.

    $bool = $file->has_content

    Returns a boolean to indicate whether the current object has content. Some special files like directories and so on never will have any content. This method is mainly to make sure you don't get warnings for using uninitialized values when looking at an object's content.

    $content = $file->get_content

    Returns the current content for the in-memory file

    $cref = $file->get_content_by_ref

    Returns the current content for the in-memory file as a scalar reference. Normal users won't need this, but it will save memory if you are dealing with very large data files in your tar archive, since it will pass the contents by reference, rather than make a copy of it first.

    $bool = $file->replace_content( $content )

    Replace the current content of the file with the new content. This only affects the in-memory archive, not the on-disk version until you write it.

    Returns true on success, false on failure.
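The content methods above can be combined as in the following sketch, which assumes a hypothetical in-memory entry named config.ini.

```perl
use strict;
use warnings;
use Archive::Tar::File;

# Hypothetical entry created from in-memory data.
my $file = Archive::Tar::File->new( data => 'config.ini', "debug=0\n" );

# Guard with has_content before reading, to avoid undef warnings
# on entries (such as directories) that carry no data.
my $before = $file->has_content ? $file->get_content : '';

# Replace the in-memory content; nothing on disk changes.
$file->replace_content("debug=1\n") or die "replace failed\n";

# For large data, read through a reference to avoid copying the content.
my $ref = $file->get_content_by_ref;
print "before: $before", "after:  $$ref";
```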

    $bool = $file->rename( $new_name )

    Rename the current file to $new_name.

    Note that you must specify a Unix path for $new_name, since per tar standard, all files in the archive must be Unix paths.

    Returns true on success and false on failure.

    $bool = $file->chmod( $mode )

    Change mode of $file to $mode. The mode can be a string or a number which is interpreted as octal whether or not a leading 0 is given.

    Returns true on success and false on failure.

    $bool = $file->chown( $user [, $group])

    Change owner of $file to $user. If a $group is given, the group is changed as well. You can also pass a single parameter with a colon separating the user and group, as in 'root:wheel'.

    Returns true on success and false on failure.
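A short sketch of rename, chmod and chown on one entry follows. The paths and the owner names are hypothetical, and only the in-memory header fields are changed.

```perl
use strict;
use warnings;
use Archive::Tar::File;

# Hypothetical entry; nothing here touches the file system.
my $file = Archive::Tar::File->new( data => 'old/name.c', "int x;\n" );

$file->rename('new/full/path/to/file.c') or die "rename failed\n";
$file->chmod('755')                      or die "chmod failed\n";   # read as octal
$file->chown('root:wheel')               or die "chown failed\n";   # user:group form

printf "%s %04o %s:%s\n",
    $file->full_path, $file->mode, $file->uname, $file->gname;
```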

    Convenience methods

    To quickly check the type of an Archive::Tar::File object, you can use the following methods:

    • $file->is_file

      Returns true if the file is of type file

    • $file->is_dir

      Returns true if the file is of type dir

    • $file->is_hardlink

      Returns true if the file is of type hardlink

    • $file->is_symlink

      Returns true if the file is of type symlink

    • $file->is_chardev

      Returns true if the file is of type chardev

    • $file->is_blockdev

      Returns true if the file is of type blockdev

    • $file->is_fifo

      Returns true if the file is of type fifo

    • $file->is_socket

      Returns true if the file is of type socket

    • $file->is_longlink

      Returns true if the file is of type LongLink . Should not happen after a successful read.

    • $file->is_label

      Returns true if the file is of type Label . Should not happen after a successful read.

    • $file->is_unknown

      Returns true if the file type is unknown
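The type checks above might be used to classify entries, as in this sketch. It assumes the type constants (such as SYMLINK) are imported from Archive::Tar::Constant, and the two entries are made up for illustration.

```perl
use strict;
use warnings;
use Archive::Tar::Constant;    # assumed source of FILE, DIR, SYMLINK, ...
use Archive::Tar::File;

# Two hypothetical in-memory entries: a plain file and a symlink to it.
my @entries = (
    Archive::Tar::File->new( data => 'README', "read me\n" ),
    Archive::Tar::File->new(
        data => 'latest', '',
        { type => SYMLINK, linkname => 'README' },
    ),
);

# Classify each entry with the convenience methods.
my @kinds;
for my $entry (@entries) {
    my $kind = $entry->is_file    ? 'file'
             : $entry->is_symlink ? 'symlink'
             : $entry->is_dir     ? 'dir'
             :                      'other';
    push @kinds, $kind;
    printf "%-8s %s\n", $kind, $entry->full_path;
}
```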

     

    App::Prove

    NAME

    App::Prove - Implements the prove command.

    VERSION

    Version 3.26

    DESCRIPTION

    Test::Harness provides a command, prove , which runs a TAP based test suite and prints a report. The prove command is a minimal wrapper around an instance of this module.

    SYNOPSIS

        use App::Prove;
        my $app = App::Prove->new;
        $app->process_args(@ARGV);
        $app->run;

    METHODS

    Class Methods

    new

    Create a new App::Prove . Optionally a hash ref of attribute initializers may be passed.

    state_class

    Getter/setter for the name of the class used for maintaining state. This class should either subclass from App::Prove::State or provide an identical interface.

    state_manager

    Getter/setter for the instance of the state_class .

    add_rc_file

        $prove->add_rc_file('myproj/.proverc');

    Called before process_args to prepend the contents of an rc file to the options.

    process_args

        $prove->process_args(@args);

    Processes the command-line arguments. Attributes will be set appropriately. Any filenames may be found in the argv attribute.

    Dies on invalid arguments.
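To see how switches end up in attributes, the arguments can be processed without running anything. This is a minimal sketch; the test file name is hypothetical and is never opened, and --norc is passed so that no .proverc file influences the result.

```perl
use strict;
use warnings;
use App::Prove;

my $app = App::Prove->new;

# Parse switches only; run() is not called, so no tests are executed.
$app->process_args( '--norc', '-v', '-j', '3', 't/basic.t' );

print "verbose: ", $app->verbose, "\n";
print "jobs:    ", $app->jobs,    "\n";
print "tests:   @{ $app->argv }\n";

# Attributes may still be changed before run() is called.
$app->jobs(1);
```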

    run

    Perform whatever actions the command line args specified. The prove command line tool consists of the following code:

        use App::Prove;
        my $app = App::Prove->new;
        $app->process_args(@ARGV);
        exit( $app->run ? 0 : 1 ); # if you need the exit code

    require_harness

    Load a harness replacement class.

        $prove->require_harness($for => $class_name);

    print_version

    Display the version numbers of the loaded TAP::Harness and the current Perl.

    Attributes

    After command line parsing the following attributes reflect the values of the corresponding command line switches. They may be altered before calling run .

    • archive
    • argv
    • backwards
    • blib
    • color
    • directives
    • dry
    • exec
    • extensions
    • failures
    • comments
    • formatter
    • harness
    • ignore_exit
    • includes
    • jobs
    • lib
    • merge
    • modules
    • parse
    • plugins
    • quiet
    • really_quiet
    • recurse
    • rules
    • show_count
    • show_help
    • show_man
    • show_version
    • shuffle
    • state
    • state_class
    • taint_fail
    • taint_warn
    • test_args
    • timer
    • verbose
    • warnings_fail
    • warnings_warn
    • tapversion
    • trap

    PLUGINS

    App::Prove provides support for 3rd-party plugins. These are currently loaded at run-time, after arguments have been parsed (so you cannot change the way arguments are processed, sorry), typically with the -Pplugin switch, e.g.:

        prove -PMyPlugin

    This will search for a module named App::Prove::Plugin::MyPlugin , or failing that, MyPlugin . If the plugin can't be found, prove will complain and exit.

    You can pass an argument to your plugin by appending an = after the plugin name, e.g. -PMyPlugin=foo . You can pass multiple arguments using commas:

        prove -PMyPlugin=foo,bar,baz

    These are passed in to your plugin's load() class method (if it has one), along with a reference to the App::Prove object that is invoking your plugin:

        sub load {
            my ($class, $p) = @_;
            my @args = @{ $p->{args} };
            # @args will contain ( 'foo', 'bar', 'baz' )
            $p->{app_prove}->do_something;
            ...
        }

    Note that the user's arguments are also passed to your plugin's import() function as a list, eg:

        sub import {
            my ($class, @args) = @_;
            # @args will contain ( 'foo', 'bar', 'baz' )
            ...
        }

    This is for backwards compatibility, and may be deprecated in the future.

    Sample Plugin

    Here's a sample plugin, for your reference:

        package App::Prove::Plugin::Foo;

        # Sample plugin, try running with:
        # prove -PFoo=bar -r -j3
        # prove -PFoo -Q
        # prove -PFoo=bar,My::Formatter

        use strict;
        use warnings;

        sub load {
            my ($class, $p) = @_;
            my @args = @{ $p->{args} };
            my $app = $p->{app_prove};

            print "loading plugin: $class, args: ", join(', ', @args ), "\n";

            # turn on verbosity
            $app->verbose( 1 );

            # set the formatter?
            $app->formatter( $args[1] ) if @args > 1;

            # print some of App::Prove's state:
            for my $attr (qw( jobs quiet really_quiet recurse verbose )) {
                my $val = $app->$attr;
                $val = 'undef' unless defined( $val );
                print "$attr: $val\n";
            }

            return 1;
        }

        1;

    SEE ALSO

    prove, TAP::Harness

     

    App::Prove::State

    NAME

    App::Prove::State - State storage for the prove command.

    VERSION

    Version 3.26

    DESCRIPTION

    The prove command supports a --state option that instructs it to store persistent state across runs. This module implements that state and the operations that may be performed on it.

    SYNOPSIS

        # Re-run failed tests
        $ prove --state=failed,save -rbv

    METHODS

    Class Methods

    new

    Accepts a hashref with the following key/value pairs:

    • store

      The filename of the data store holding the data that App::Prove::State reads.

    • extensions (optional)

      The test name extensions. Defaults to .t.

    • result_class (optional)

      The name of the result_class . Defaults to App::Prove::State::Result .

    result_class

    Getter/setter for the name of the class used for tracking test results. This class should either subclass from App::Prove::State::Result or provide an identical interface.

    extensions

    Get or set the list of extensions that files must have in order to be considered tests. Defaults to ['.t'].

    results

    Get the results of the last test run. Returns a result_class() instance.

    commit

    Save the test results. Should be called after all tests have run.

    Instance Methods

    apply_switch

        $self->apply_switch('failed,save');

    Apply a list of switch options to the state, updating the internal object state as a result. Nothing is returned.

    Diagnostics: - "Illegal state option: %s"

    • last

      Run in the same order as last time

    • failed

      Run only the failed tests from last time

    • passed

      Run only the passed tests from last time

    • all

      Run all tests in normal order

    • hot

      Run the tests that most recently failed first

    • todo

      Run the tests ordered by number of todos.

    • slow

      Run the tests in slowest to fastest order.

    • fast

      Run the tests in fastest to slowest order.

    • new

      Run the tests in newest to oldest order.

    • old

      Run the tests in oldest to newest order.

    • save

      Save the state on exit.
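The switch handling and its diagnostic can be exercised directly on a state object. This is a sketch only: the store path points into a temporary directory and is never written, because the save switch is deliberately left out, and 'bogus' is a made-up option used to trigger the error.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);
use App::Prove::State;

# Hypothetical store location inside a throwaway directory.
my $store = File::Spec->catfile( tempdir( CLEANUP => 1 ), '.prove' );
my $state = App::Prove::State->new( { store => $store } );

# Several switches may be combined in one comma-separated string.
$state->apply_switch('failed,slow');

# An unknown switch raises the "Illegal state option" diagnostic.
my $error = eval { $state->apply_switch('bogus'); 1 } ? '' : $@;
print "rejected: $error";
```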

    get_tests

    Given a list of args get the names of tests that should run

    observe_test

    Store the results of a test.

    save

    Write the state to a file.

    load

    Load the state from a file

     

    App::Prove::State::Result

    NAME

    App::Prove::State::Result - Individual test suite results.

    VERSION

    Version 3.26

    DESCRIPTION

    The prove command supports a --state option that instructs it to store persistent state across runs. This module encapsulates the results for a single test suite run.

    SYNOPSIS

        # Re-run failed tests
        $ prove --state=failed,save -rbv

    METHODS

    Class Methods

    new

        my $result = App::Prove::State::Result->new({
            generation => $generation,
            tests => \%tests,
        });

    Returns a new App::Prove::State::Result instance.

    state_version

    Returns the current version of state storage.

    test_class

    Returns the name of the class used for tracking individual tests. This class should either subclass from App::Prove::State::Result::Test or provide an identical interface.

    generation

    Getter/setter for the "generation" of the test suite run. The first generation is 1 (one) and subsequent generations are 2, 3, etc.

    last_run_time

    Getter/setter for the time of the test suite run.

    tests

    Returns the tests for a given generation. This is a hashref or a hash, depending on context called. The keys to the hash are the individual test names and the value is a hashref with various interesting values. Each k/v pair might resemble something like this:

        't/foo.t' => {
            elapsed => '0.0428488254547119',
            gen => '7',
            last_pass_time => '1219328376.07815',
            last_result => '0',
            last_run_time => '1219328376.07815',
            last_todo => '0',
            mtime => '1191708862',
            seq => '192',
            total_passes => '6',
        }

    test

        my $test = $result->test('t/customer/create.t');

    Returns an individual App::Prove::State::Result::Test instance for the given test name (usually the filename). Will return a new App::Prove::State::Result::Test instance if the name is not found.

    test_names

    Returns a list of test names, sorted by run order.

    remove

        $result->remove($test_name); # remove the test
        my $test = $result->test($test_name); # fatal error

    Removes a given test from results. This is a no-op if the test name is not found.

    num_tests

    Returns the number of tests for a given test suite result.
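The methods above can be tried on an empty result set, as in this sketch. The test name is hypothetical and no test program is run.

```perl
use strict;
use warnings;
use App::Prove::State::Result;

# Start with an empty first-generation result set.
my $result = App::Prove::State::Result->new(
    { generation => 1, tests => {} },
);

# Asking for an unknown test name creates and registers a new entry.
my $name = 't/customer/create.t';
my $test = $result->test($name);
my $count_after_add = $result->num_tests;
print $test->name, " registered; tests: $count_after_add\n";

# Removing it shrinks the set again.
$result->remove($name);
my $count_after_remove = $result->num_tests;
print "after remove: $count_after_remove\n";
```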

    raw

    Returns a hashref of raw results, suitable for serialization by YAML.

     

    App::Prove::State::Result::Test

    NAME

    App::Prove::State::Result::Test - Individual test results.

    VERSION

    Version 3.26

    DESCRIPTION

    The prove command supports a --state option that instructs it to store persistent state across runs. This module encapsulates the results for a single test.

    SYNOPSIS

        # Re-run failed tests
        $ prove --state=failed,save -rbv

    METHODS

    Class Methods

    new

    Instance Methods

    name

    The name of the test. Usually a filename.

    elapsed

    The total elapsed time the test took to run, in seconds.

    generation

    The number for the "generation" of the test run. The first generation is 1 (one) and subsequent generations are 2, 3, etc.

    last_pass_time

    The last time the test program passed, in seconds from the epoch.

    Returns undef if the program has never passed.

    last_fail_time

    The last time the test suite failed, in seconds from the epoch.

    Returns undef if the program has never failed.

    mtime

    Returns the mtime of the test, in seconds from the epoch.

    raw

    Returns a hashref of raw test data, suitable for serialization by YAML.

    result

    Currently, whether or not the test suite passed with no 'problems' (such as TODO passed).

    run_time

    The total time it took for the test to run, in seconds. If Time::HiRes is available, it will have finer granularity.

    num_todo

    The number of tests with TODO directives.

    sequence

    The order in which this test was run for the given test suite result.

    total_passes

    The number of times the test has passed.

    total_failures

    The number of times the test has failed.

    parser

    The underlying parser object. This is useful if you need the full information for the test program.